Abstract
This chapter is concerned with the analysis of the probability distributions of two-time-scale Markov chains. We aim to approximate the solution of the forward equation by sequences of functions so that the desired accuracy is reached. As alluded to in Chapter 1, we devote our attention to nonstationary Markov chains with time-varying generators.
1 Introduction
This chapter is concerned with the analysis of the probability distributions of two-time-scale Markov chains. We aim to approximate the solution of the forward equation by sequences of functions so that the desired accuracy is reached. As alluded to in Chapter 1, we devote our attention to nonstationary Markov chains with time-varying generators. A key feature here is time-scale separation. By introducing a small parameter ε > 0, the generator and hence the corresponding Markov chain have “two times,” a usual running time t and a fast time t ∕ ε. The main approach we use is the method of matched asymptotic expansions from singular perturbation theory. We first construct a sequence of functions that approximates the solution of the forward equation well when t is large enough (outside the initial layer of O(ε)). Following the terminology of singular perturbation theory, this part of the approximation is called the outer expansion. We demonstrate that it is a good approximation as long as t is not in a neighborhood of 0 of the order O(ε). Nevertheless, this sequence of functions does not satisfy the given initial condition, and the approximation breaks down when t ≤ O(ε). To circumvent these difficulties, we construct another sequence of functions by magnifying the asymptotic behavior of the solution near 0 using the stretched fast time \(\tau = t/\varepsilon \). Following the traditional terminology of singular perturbation theory, we call this sequence of functions initial-layer corrections (or sometimes, boundary-layer corrections). It effectively yields corrections to the outer expansions and ensures that the approximation is good in a neighborhood of 0 of order O(ε). By combining the outer expansions and the initial-layer corrections, we obtain a sequence of matched asymptotic expansions. The entire process is constructive. Our aims in this chapter include:
-
Construct the outer expansions and the initial-layer corrections. This construction is often referred to as formal expansions.
-
Justify the sequence of approximations obtained by deriving the desired error bounds. To achieve this, we show that (i) the outer solutions are sufficiently smooth, (ii) the initial-layer terms all decay exponentially fast, and (iii) the error is of the desired order. Thus not only is convergence of the asymptotic expansions proved, but also the error bound is obtained.
-
Demonstrate that the error bounds hold uniformly. We would like to mention that in the usual singular perturbation theory, for example, in treating a linear system of differential equations, it is required that the system matrix be stable (i.e., all eigenvalues have negative real parts). In our setup, even for a homogeneous Markov chain, the generator (the system matrix in the equation) has an eigenvalue 0, so is not invertible. Thus, the stability requirement is violated. Nevertheless, using Markov properties, we are still able to obtain the desired asymptotic expansions.
Before proceeding further, we present a lemma. Let \(Q(t) \in {\mathbb{R}}^{m\times m}\) be a generator, and let α(t) be a finite-state Markov chain with state space \(\mathcal{M} =\{ 1,\ldots,m\}\) and generator Q(t). Denote by
$$p(t) = (P(\alpha (t) = 1),\ldots,P(\alpha (t) = m)) \in {\mathbb{R}}^{1\times m}$$
the row vector of the probability distribution of the underlying chain at time t. Then in view of Theorem 2.5, p( ⋅) is a solution of the forward equation
$$\frac{dp(t)} {dt} = p(t)Q(t),\quad p(0) = {p}^{0}\ \mbox{ with }\sum\limits_{i=1}^{m}{p}_{i}^{0} = 1,$$(4.1)
where \({p}^{0} = ({p}_{1}^{0},\ldots,{p}_{m}^{0})\) and \({p}_{i}^{0}\) denotes the ith component of \({p}^{0}\). Therefore, studying the probability distribution is equivalent to examining the solution of (4.1). Note that the forward equation is linear, so the solution is unique. As a result, the following lemma is immediate. This lemma will prove useful in subsequent study.
Lemma 4.1.
The solution p(t) of (4.1) satisfies the conditions
$$0 \leq {p}_{i}(t) \leq 1\ \mbox{ for }i = 1,\ldots,m,\ \mbox{ and }\sum\limits_{i=1}^{m}{p}_{i}(t) = 1.$$(4.2)
Remark 4.2.
For the reader whose interests are mainly in differential equations, we point out that the initial condition \(\sum_{i=1}^{m}{p}_{i}^{0} = 1\) in (4.1) is not restrictive, since if \({p}^{0} = 0\), then p(t) = 0 is the only solution to (4.1). If \({p}_{i}^{0} > 0\) for some i, one may divide both sides of (4.1) by \(\sum_{i=1}^{m}{p}_{i}^{0}\ (> 0)\) and consider \(\widetilde{p}(t) = p(t)/\sum_{i=1}^{m}{p}_{i}^{0}\) in lieu of p(t).
To achieve our goal, we first treat a simple case, namely, the case that the generator is weakly irreducible. Once this is established, we proceed to the more complex case that the generator has several weakly irreducible classes, the inclusion of absorbing states, and the inclusion of transient states.
The rest of the chapter is arranged as follows. Section 4.2 begins with the study of the situation in which the generator is weakly irreducible. Although it is a simple case, it outlines the main ideas behind the construction of asymptotic expansions. This section begins with the construction of formal expansions, proves the needed regularity, and ascertains the error estimates. Section 4.3 develops asymptotic expansions of the underlying probability distribution for the chains with recurrent states. As will be seen in the analysis to follow, extreme care must be taken to handle two-time-scale Markov chains with fast and slow components. One of the key issues is the selection of appropriate initial conditions to make the series a “matched” asymptotic expansion; here the separable form of our asymptotic expansion appears to be advantageous compared with the two-time-scale expansions. For easy reference, a subsection is also provided as a user’s guide.
Using the methods of matched asymptotic expansion, Section 4.4 extends the results to include absorbing states. It demonstrates that similar techniques can be used. We also demonstrate that the techniques and methods of Section 4.3 are rather general and can be applied to a wide variety of cases. Section 4.5 continues the study of problems involving transient states. By treating chains having recurrent states, chains including absorbing states, and chains including transient states, we are able to characterize the probability distributions of the underlying singularly perturbed chains of general cases with finite-state spaces, and hence provide comprehensive pictures through these “canonical” models.
While Sections 4.3–4.5 cover most practical concerns of interest for the finite-state-space cases, the rest of the chapter makes several remarks on Markov chains with countable-state spaces and two-time-scale diffusions. In Section 4.6.1, we extend the results to processes with countable-state spaces in which \(\widetilde{Q}(t)\) is a block-diagonal matrix with infinitely many blocks each of which is finite-dimensional. Then Section 4.6.2 treats the problem in which \(\widetilde{Q}(t)\) itself is an infinite-dimensional matrix. In this case, further conditions are necessary. As in the finite-dimensional counterpart, sufficient conditions that ensure the validity of the asymptotic expansions are provided. The essential ingredients include Fredholm-alternative-like conditions and the notion of weak irreducibility. Finally, we mention related results of singularly perturbed diffusions in Section 4.7. Additional notes and remarks are given in Section 4.8.
2 Irreducible Case
We begin with the case concerning weakly irreducible generators. Let \(Q(t) \in {\mathbb{R}}^{m\times m}\) be a generator, ε > 0 be a small parameter, and suppose that αε(t) is a finite-state Markov chain with state space \(\mathcal{M} =\{ 1,\ldots,m\}\) generated by \({Q}^{\varepsilon }(t) = Q(t)/\varepsilon \). The row vector \({p}^{\varepsilon }(t) = (P({\alpha }^{\varepsilon }(t) = 1),\ldots,P({\alpha }^{\varepsilon }(t) = m)) \in {\mathbb{R}}^{1\times m}\) denotes the probability distribution of the underlying chain at time t. Then by virtue of Theorem 2.5, p ε( ⋅) is a solution of the forward equation
$$\varepsilon \frac{d{p}^{\varepsilon }(t)} {dt} = {p}^{\varepsilon }(t)Q(t),\quad {p}^{\varepsilon }(0) = {p}^{0},$$(4.3)
where \({p}^{0} = ({p}_{1}^{0},\ldots,{p}_{m}^{0})\) and \({p}_{i}^{0}\) denotes the ith component of \({p}^{0}\). Therefore, studying the probability distribution is equivalent to examining the solution of (4.3). Now, Lemma 4.1 continues to hold for the solution p ε(t).
As discussed in Chapters 1 and 3, the equation in (4.3) arises from various applications involving a rapidly fluctuating Markov chain governed by the generator Q(t) ∕ ε. As ε gets smaller and smaller, the Markov chain fluctuates more and more rapidly. Normally, the fast-changing process αε( ⋅) in an actual system is difficult to analyze. The desired limit properties, however, provide us with an alternative. We can replace the actual process by its “average” in the system under consideration. This approach has significant practical value. A fundamental question common to numerous applications involving two-time-scale Markov chains is to understand the asymptotic properties of p ε( ⋅), namely, the limit behavior as ε → 0. If Q(t) = Q, a constant matrix, and if Q is irreducible (see Definition 2.7), then for each t > 0, p ε(t) → ν, the familiar stationary distribution. For the time-varying counterpart, it is reasonable to expect that the corresponding distribution will converge to a probability distribution that mimics the main features of the distribution of stationary chains, meanwhile preserving the time-varying nature of the nonstationary system. A candidate bearing such characteristics is the quasi-stationary distribution ν(t). Recall that ν(t) is said to be a quasi-stationary distribution (see Definition 2.8) if ν(t) = (ν1(t), …, ν m (t)) ≥ 0 and it satisfies the equations
$$\nu (t)Q(t) = 0\ \mbox{ and }\sum\limits_{i=1}^{m}{\nu }_{i}(t) = 1.$$(4.4)
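For readers who want to compute ν(t) numerically at a fixed t: replacing one (redundant) equation of the singular system ν(t)Q(t) = 0 by the normalization \(\sum_{i}{\nu }_{i}(t) = 1\) yields a nonsingular linear system, a device used again later in this section. A minimal sketch, assuming a hypothetical 3-state generator:

```python
import numpy as np

def quasi_stationary(Q):
    """Solve nu Q = 0 with sum(nu) = 1 by replacing one (redundant)
    equation of the singular system with the normalization constraint."""
    m = Q.shape[0]
    A = Q.T.copy()          # columns of nu Q = 0 become rows of A nu' = 0
    A[-1, :] = 1.0          # replace the last equation by sum_i nu_i = 1
    b = np.zeros(m)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# A hypothetical weakly irreducible 3-state generator (rows sum to zero).
Q = np.array([[-2.0,  1.0,  1.0],
              [ 1.0, -3.0,  2.0],
              [ 2.0,  1.0, -3.0]])
nu = quasi_stationary(Q)
print(nu)   # a probability vector satisfying nu @ Q = 0
```

The replaced equation is redundant because the columns of Q sum to the zero vector when weighted by any solution of ν Q = 0, exactly as argued in Section 4.2.2.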
If Q(t) ≡ Q, a constant matrix, then an analytic solution of (4.3) is obtainable, since the fundamental matrix solution (see Hale [79]) takes the simple form exp(Qt); the limit behavior of p ε(t) is derivable through the solution p 0exp(Qt ∕ ε). For time-dependent Q(t), although the fundamental matrix solution still exists, it does not have a simple form. The complex integral representation is not very informative in the asymptotic study of p ε(t), except in the case m = 2. In this case, αε( ⋅) is a two-state Markov chain and the constraint \({p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) = 1\) reduces the current problem to a scalar one. Therefore, a closed-form solution is possible. However, such a technique cannot be generalized to m > 2. Let 0 < T < ∞ be a finite real number. We divide the interval [0, T] into two parts. One part is for t very close to 0 (in the range of an ε-layer), and the other is for t bounded away from 0. The behavior of p ε( ⋅) differs significantly in these two regions. Such a division leads us to the method of matched asymptotic expansions. Not only do we prove the convergence of p ε(t) as ε → 0, but we also obtain an asymptotic series. The procedure involves constructing the regular part (outer expansion) for t away from 0 and the initial-layer corrections for small t, and then matching these expansions by a proper choice of initial conditions.
In what follows, in addition to obtaining the zeroth-order approximation, i.e., the convergence of p ε( ⋅) to its quasi-stationary distribution, we derive higher-order approximations and error bounds. A consequence of the findings is that the convergence of the probability distribution and related occupation measures of the corresponding Markov chain takes place in an appropriate sense. The asymptotic properties of a suitably scaled occupation time and the corresponding central limit theorem for αε( ⋅) (based on the expansion) will be studied in Chapter 5.
2.1 Asymptotic Expansions
To proceed, we make the following assumptions.
-
(A4.1)
Given 0 < T < ∞, for each t ∈ [0, T], Q(t) is weakly irreducible, that is, the system of equations
$$\begin{array}{ll} &f(t)Q(t) = 0, \\ &\sum\limits_{i=1}^{m}{f}_{ i}(t) = 1\end{array}$$(4.5) has a unique nonnegative solution.
-
(A4.2)
For some n, Q( ⋅) is (n + 1)-times continuously differentiable on [0, T], and \(({d}^{n+1}/d{t}^{n+1})Q(\cdot )\) is Lipschitz on [0, T].
Remark 4.3.
Condition (A4.2) requires that the matrix Q(t) be sufficiently smooth. This is necessary for obtaining the desired asymptotic expansion. To validate the asymptotic expansion, we need to estimate the remainder term. Thus for the nth-order approximation, we need the (n + 1)st-order smoothness.
To proceed, we first state a lemma. Its proof is in Lemma A.2 in the appendix.
Lemma 4.4.
Consider the matrix differential equation
$$\frac{dP(s)} {ds} = P(s)A,\quad P(0) = I,$$(4.6)
where \(P(s) \in {\mathbb{R}}^{m\times m}\). Suppose \(A \in {\mathbb{R}}^{m\times m}\) is a generator of a (homogeneous or stationary) finite-state Markov chain and is weakly irreducible. Then \(P(s) \rightarrow \overline{P}\) as s →∞ and
$$\vert P(s) -\overline{P}\vert \leq K\exp (-\kappa s)\ \mbox{ for some }\kappa > 0,$$
where \(\overline{P} = \mathrm{1}\mathrm{l}({\overline{\nu }}_{1},\cdots \,,{\overline{\nu }}_{m}) \in {\mathbb{R}}^{m\times m},\) and \(({\overline{\nu }}_{1}\), …, \({\overline{\nu }}_{m})\) is the quasi-stationary distribution of the Markov process with generator A.
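Lemma 4.4 lends itself to a quick numerical check: P(s) = exp(As) solves (4.6), and its deviation from \(\overline{P} = \mathrm{1}\mathrm{l}\overline{\nu }\) should shrink exponentially in s. A sketch, assuming a hypothetical weakly irreducible generator A and using SciPy's matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical weakly irreducible generator A (rows sum to zero).
A = np.array([[-2.0,  1.0,  1.0],
              [ 1.0, -3.0,  2.0],
              [ 2.0,  1.0, -3.0]])
m = A.shape[0]

# Quasi-stationary distribution: solve nu A = 0, sum(nu) = 1.
M = A.T.copy()
M[-1, :] = 1.0
nu = np.linalg.solve(M, np.eye(m)[-1])
P_bar = np.ones((m, 1)) @ nu[None, :]    # P_bar = 1l nu

# P(s) = exp(As) solves dP/ds = P A with P(0) = I, and P(s) -> P_bar.
errs = [np.abs(expm(A * s) - P_bar).max() for s in (1.0, 2.0, 4.0)]
print(errs)   # decreasing, roughly like exp(-kappa * s)
```

The decreasing sequence of errors illustrates that, despite the zero eigenvalue of A, the convergence to the rank-one limit is exponentially fast.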
Recall that \(\mathrm{1}\mathrm{l} = (1,\ldots,1)^{\prime} \in {\mathbb{R}}^{m\times 1}\) and \(({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m}) \in {\mathbb{R}}^{1\times m}.\) Thus \(\mathrm{1}\mathrm{l}({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m})\) is the usual matrix product. Recall that an m ×m matrix P(s) is said to be a solution of (4.6) if each row of P(s) satisfies the equation. In the lemma above, if A is a constant matrix that is irreducible, then \(({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m})\) becomes the familiar stationary distribution. In general, A could be time-dependent, e.g., A = A(t). As shown in Lemma A.4, by assuming the existence of the solution ν(t) to (4.5), it follows that ν(t) ≥ 0; that is, the nonnegativity assumption is redundant. We seek asymptotic expansions of the form
$${p}^{\varepsilon }(t) = {\Phi }_{n}^{\varepsilon }(t) + {\Psi }_{n}^{\varepsilon }(t) + {e}_{n}^{\varepsilon }(t),$$(4.7)
where \({e}_{n}^{\varepsilon }(t)\) is the remainder,
$${\Phi }_{n}^{\varepsilon }(t) = \sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{i}(t),$$(4.8)
and
$${\Psi }_{n}^{\varepsilon }(t) = \sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{i}\left ({ t \over \varepsilon } \right ),$$(4.9)
with the functions φ i ( ⋅) and ψ i ( ⋅) to be determined in the sequel. We now state the main result of this section.
Theorem 4.5.
Suppose that (A4.1) and (A4.2) are satisfied. Denote the unique solution of (4.3) by p ε (⋅). Then two sequences of functions φ i (⋅) and ψ i (⋅), 0 ≤ i ≤ n, can be constructed such that
-
φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T];
-
for eachi, there is a κ0 > 0 such that
$$\vert {\psi }_{i}\left ({ t \over \varepsilon } \right )\vert \leq K\exp \left (-\frac{{\kappa }_{0}t} {\varepsilon } \right );$$ -
the following estimate holds:
$$ \sup\limits_{t\in [0,T]}{\biggl |{p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )\biggr |} \leq K{\varepsilon }^{n+1}.$$(4.10)
Remark 4.6.
The method described in what follows gives an explicit construction of the functions φi(⋅) and ψi(⋅) for i ≤ n. Thus the proof to be presented is constructive. Our plan is first to obtain these sequences, and then validate properties (a) and (b) above and derive an error bound in (c) by showing that the remainder
$${e}_{n}^{\varepsilon }(t) = {p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )$$(4.11)
is of order O(εn+1) uniformly in t.
It will be seen from the subsequent development that φ0(t) is equal to the quasi-stationary distribution, that is, φ0(t) = ν(t). In particular, if n = 0 in the above theorem, we have the following result.
Corollary 4.7.
Suppose that Q(⋅) satisfies (A4.1), is continuously differentiable on [0,T], and that (d∕dt)Q(⋅) is Lipschitz on [0,T]. Then for all t > 0,
$$\lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon }(t) = \nu (t),$$
i.e., p ε (⋅) converges to the quasi-stationary distribution.
Remark 4.8.
The theorem establishes the convergence of pε(⋅) to φ0(⋅), as well as the rate of convergence. In addition to the zeroth-order approximation, we have the first-order approximation, the second-order approximation, and so on. In fact, the difference pε(⋅) − φ0(⋅) is characterized by the initial-layer term ψ0(⋅) and the associated error bound.
If the initial condition is chosen to be exactly \({p}^{0} = {\varphi }_{0}(0)\), then in the expansion, the zeroth-order initial-layer term \({\psi }_{0}(\cdot )\) will vanish. This cannot be expected in general, however. Even if \({\psi }_{0}(\cdot ) = 0\), the remaining initial-layer terms \({\psi }_{i}(\cdot )\), for i ≥ 1, will still be present.
To proceed, we define an operator \({\mathcal{L}}^{\varepsilon }\) by
$${\mathcal{L}}^{\varepsilon }f(t) = \varepsilon \frac{df(t)} {dt} - f(t)Q(t)$$(4.12)
for any smooth row-vector-valued function f( ⋅).
for any smooth row-vector-valued function f( ⋅). Then \({\mathcal{L}}^{\varepsilon }f = 0\) iff f is a solution to the differential equation in (4.3). The proof of Theorem 4.5 is divided into the following steps.
-
1.
Construct the asymptotic series, i.e., find φ i ( ⋅) and ψ i ( ⋅), for i ≤ n. For the purpose of evaluating the remainder, we need to calculate two extra terms φ n + 1( ⋅) and ψ n + 1( ⋅). This will become clear when we carry out the error analysis.
-
2.
Obtain the regularity of φ i ( ⋅) and ψ i ( ⋅) by proving that φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T] and that ψ i ( ⋅) decays exponentially fast.
-
3.
Carry out the error analysis and justify that the remainder has the desired property.
2.2 Outer Expansion
We begin with the construction of Φ n ε( ⋅) in the asymptotic expansion. We call it the outer expansion or the regular part of the expansion. Consider the differential equation
$${\mathcal{L}}^{\varepsilon }{\Phi }_{n}^{\varepsilon }(t) = 0,$$
where \({\mathcal{L}}^{\varepsilon }\) is given by (4.12).
Substituting the outer expansion \({\Phi }_{n}^{\varepsilon }(\cdot )\) into this equation and equating the coefficients of εk, for \(k = 0,1,\ldots,n + 1\), we obtain
$$\begin{array}{ll} &{\varphi }_{0}(t)Q(t) = 0, \\ &{\varphi }_{k}(t)Q(t) = \frac{d{\varphi }_{k-1}(t)} {dt},\quad k = 1,\ldots,n + 1.\end{array}$$(4.13)
Remark 4.9.
First, one has to make sure that the equations above have solutions; that is, a consistency condition needs to be verified. For each t ∈ [0,T], denote the null space of Q(t) by N(Q(t)). Note that the weak irreducibility of Q(t) implies that
$$\mathrm{rank}(Q(t)) = m - 1,$$
thus
$$\dim (N(Q(t))) = 1.$$
It is easily seen that N(Q(t)) is spanned by the vector 1 l . By virtue of the Fredholm alternative (see Corollary A.38), the second equation in (4.13) has a solution only if its right-hand side, namely, (d∕dt)φ0(t), is orthogonal to N(Q(t)). Since N(Q(t)) is spanned by 1 l,
$${\varphi }_{0}(t)\mathrm{1}\mathrm{l} = 1$$
and
$$\frac{d{\varphi }_{0}(t)} {dt} \mathrm{1}\mathrm{l} = \frac{d} {dt}\left ({\varphi }_{0}(t)\mathrm{1}\mathrm{l}\right ) = 0,$$
the orthogonality is easily verified.
the orthogonality is easily verified. Similar arguments hold for the rest of the equations. The consistency in fact is rather crucial. Without such a condition, one would not be able to solve the equations in (4.13). This point will be made again when we deal with weak and strong interaction models in Section 4.3.
Recall that the components of p ε( ⋅) are probabilities (see (4.2)). In what follows, we show that all these φ i ( ⋅) can be determined by (4.13) and (4.2).
Note that rank\((Q(t)) = m - 1\). Thus Q(t) is singular, and each equation in (4.13) is not uniquely solvable. For example, the first equation in (4.13) cannot be solved uniquely. Nevertheless, this equation together with the constraint \(\sum_{i=1}^{m}{\varphi }_{0}^{i}(t) = 1\) leads to a unique solution, namely, the quasi-stationary distribution.
In fact, a direct consequence of (A4.1) and (A4.2) is that the weak irreducibility of Q(t) is uniform in the following sense: for any t ∈ [0, T], if any column of Q(t) is replaced by \(\mathrm{1}\mathrm{l} \in {\mathbb{R}}^{m\times 1}\), the resulting determinant Δ(t) satisfies | Δ(t) | > 0, since (4.5) has only one solution and \(\sum_{j=1}^{m}{q}_{ij}(t) = 0\) for each i = 1, …, m. Moreover, in view of the continuity of Q(t) and the compactness of [0, T], there is a number c > 0 such that | Δ(t) | ≥ c > 0 on [0, T]. We may replace any one of the first m equations of the system φ0(t)Q(t) = 0 by the equation \(\sum_{i=1}^{m}{\varphi }_{0}^{i}(t) = 1\); the determinant Δ(t) of the resulting coefficient matrix then satisfies | Δ(t) | ≥ c > 0 for all t ∈ [0, T]. To illustrate, we may suppose without loss of generality that the mth equation is the one replaced. Then we have
The determinant of the coefficient matrix in (4.14) is
and satisfies | Δ(t) | ≥ c > 0. Now by Cramer’s rule, for each 1 ≤ i ≤ m,
that is, the ith column of Δ(t) in (4.15) is replaced by \((0,\ldots,0,1)^{\prime} \in {\mathbb{R}}^{m\times 1}\). By the assumption of Q( ⋅), it is plain that φ0( ⋅) is (n + 1)-times continuously differentiable on [0, T].
The foregoing method can be used to solve the other equations in (4.13) analogously. Owing to the smoothness of φ0( ⋅), (d ∕ dt)φ0(t) exists, and we can proceed to obtain φ1( ⋅). Repeating the procedure above and continuing inductively, for each k ≥ 1, we have
$${\varphi }_{k}(t)Q(t) = \frac{d{\varphi }_{k-1}(t)} {dt},\quad \sum\limits_{i=1}^{m}{\varphi }_{k}^{i}(t) = 0.$$(4.16)
Note that \({\varphi }_{k-1}^{j}(\cdot )\) has been found, so \((d/dt){\varphi }_{k-1}^{j}(t)\) is a known function. After a suitable replacement of one of the first m equations by the last equation in (4.16), the determinant Δ(t) of the resulting coefficient matrix satisfies | Δ(t) | ≥ c > 0. We obtain for each 1 ≤ i ≤ m,
Hence φ k ( ⋅) is \((n + 1 - k)\)-times continuously differentiable on [0, T]. Thus we have constructed a sequence of functions φ k (t) that are \((n + 1 - k)\)-times continuously differentiable on [0, T] for \(k = 0,1,\ldots,n + 1\).
Remark 4.10.
The method used above is convenient for computational purposes. An alternative way of obtaining the sequence φk(t) is as follows. For example, to solve
$${\varphi }_{0}(t)Q(t) = 0,\quad \sum\limits_{i=1}^{m}{\varphi }_{0}^{i}(t) = 1,$$
define \({Q}_{c}(t) = (\mathrm{1}\mathrm{l}\vdots Q(t)) \in {\mathbb{R}}^{m\times (m+1)}\). Then the equation above can be written as
$${\varphi }_{0}(t){Q}_{c}(t) = (1,0,\ldots,0) \in {\mathbb{R}}^{1\times (m+1)}.$$
Note that Qc(t)Q′c(t) has full rank m owing to weak irreducibility. Thus the solution of the equation is
$${\varphi }_{0}(t) = (1,0,\ldots,0){Q^{\prime}}_{c}(t){({Q}_{c}(t){Q^{\prime}}_{c}(t))}^{-1}.$$
We can obtain all other φk(t) for \(k = 1,\ldots,n + 1\), similarly.
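The formula in Remark 4.10 is easy to exercise numerically. The sketch below, with a hypothetical 3-state generator frozen at a fixed t, forms the augmented matrix \({Q}_{c} = (\mathrm{1}\mathrm{l}\vdots Q)\) and recovers φ0; the matrix \({Q}_{c}{Q^{\prime}}_{c}\) is invertible by weak irreducibility:

```python
import numpy as np

# Hypothetical weakly irreducible generator (rows sum to zero).
Q = np.array([[-2.0,  1.0,  1.0],
              [ 1.0, -3.0,  2.0],
              [ 2.0,  1.0, -3.0]])
m = Q.shape[0]

# Q_c = (1l : Q) has full row rank m under weak irreducibility, so the
# consistent system phi0 @ Qc = (1, 0, ..., 0) has the closed-form solution
# phi0 = (1, 0, ..., 0) Qc' (Qc Qc')^{-1}.
Qc = np.hstack([np.ones((m, 1)), Q])      # m x (m+1)
rhs = np.zeros(m + 1)
rhs[0] = 1.0
phi0 = rhs @ Qc.T @ np.linalg.inv(Qc @ Qc.T)
print(phi0)   # the quasi-stationary distribution of Q
```

Both this route and the equation-replacement route of the preceding paragraphs yield the same (unique) quasi-stationary distribution; this one avoids choosing which equation to replace.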
The regular part Φ n ε( ⋅) is a good approximation to p ε( ⋅) when t is bounded away from 0. When t approaches 0, an initial layer (or a boundary layer) develops and the approximation breaks down. To accommodate this situation, an initial-layer correction, i.e., a sequence of functions ψ k (t ∕ ε) for \(k = 0,1,\ldots,n + 1\) needs to be constructed.
2.3 Initial-Layer Correction
This section is on the construction of the initial-layer terms. The presentation consists of two parts. We obtain the sequence {ψ k ( ⋅)} in the first subsection, and derive the exponential decay property in the second subsection.
Construction of ψ k (⋅). Following the usual practice in singular perturbation theory, define the stretched (or rescaled) time variable by
$$\tau = \frac{t} {\varepsilon }.$$
Note that τ → ∞ as ε → 0 for any given t > 0.
Consider the differential equation
$$\varepsilon \frac{d{\Psi }_{n}^{\varepsilon }(t)} {dt} = {\Psi }_{n}^{\varepsilon }(t)Q(t).$$
Using the stretched time variable τ (and, with a slight abuse of notation, writing the layer terms as functions of τ), we arrive at
$$\frac{d{\Psi }_{n}^{\varepsilon }(\tau )} {d\tau } = {\Psi }_{n}^{\varepsilon }(\tau )Q(\varepsilon \tau ).$$
Owing to the smoothness of Q( ⋅), a truncated Taylor expansion about τ = 0 leads to
where
for some 0 < ξ < t. In view of (A4.2),
Drop the term R n + 1(t) and use the first n + 2 terms to get
Similar to the previous section, equating coefficients of εk, for \(k = 0,1,\ldots,n + 1\), we have
$$\begin{array}{ll} &\frac{d{\psi }_{0}(\tau )} {d\tau } = {\psi }_{0}(\tau )Q(0), \\ &\frac{d{\psi }_{k}(\tau )} {d\tau } = {\psi }_{k}(\tau )Q(0) + {r}_{k}(\tau ),\quad k = 1,\ldots,n + 1,\end{array}$$(4.18)
where r k (τ) is a function having the form
$${r}_{k}(\tau ) = \sum\limits_{i=1}^{k}\frac{{\tau }^{i}} {i!}\,{\psi }_{k-i}(\tau )\frac{{d}^{i}Q(0)} {d{t}^{i}}.$$(4.19)
These equations together with appropriate initial conditions allow us to determine the ψ k ( ⋅)’s. For constructing φ k ( ⋅), a number of algebraic equations are solved, whereas when determining ψ k , one has to solve a number of differential equations instead. Two points are worth mentioning in connection with (4.18). First the time-varying differential equation is replaced by one with constant coefficients; the solution thus can be written explicitly. The second point is on the selection of the initial conditions for ψ k ( ⋅), with \(k = 0,1,\ldots,n + 1\). We choose the initial conditions so that the initial data of the asymptotic expansion will “match” that of the differential equation (4.3). To be more specific,
Corresponding to ε0, solving
$$\frac{d{\psi }_{0}(\tau )} {d\tau } = {\psi }_{0}(\tau )Q(0),\quad {\psi }_{0}(0) = {p}^{0} - {\varphi }_{0}(0),$$
where p 0 is the initial data given in (4.3), one has
$${\psi }_{0}(\tau ) = ({p}^{0} - {\varphi }_{0}(0))\exp (Q(0)\tau ).$$
Continuing in this fashion, for \(k = 1,\ldots,n + 1\), we obtain
$$\frac{d{\psi }_{k}(\tau )} {d\tau } = {\psi }_{k}(\tau )Q(0) + {r}_{k}(\tau ),\quad {\psi }_{k}(0) = -{\varphi }_{k}(0).$$
In the equations above, we purposely separated Q(0) from the term r k (τ). As a result, the equations are linear systems with a constant matrix Q(0) and time-varying forcing terms. This is useful for our subsequent investigation.
For k = 1, 2, …, the solutions are given by
$${\psi }_{k}(\tau ) = {\psi }_{k}(0)\exp (Q(0)\tau ) +\int\limits_{0}^{\tau }{r}_{k}(s)\exp (Q(0)(\tau - s))\,ds.$$(4.21)
The construction of ψ k ( ⋅) for \(k = 0,1,\ldots,n + 1\), and hence the construction of the asymptotic series is complete.
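The completed construction can be exercised numerically. The sketch below assumes a hypothetical 3-state generator frozen at t = 0 and an initial law p 0 concentrated at the first state; it computes the zeroth-order layer term ψ0(τ) = (p 0 − φ0(0))exp(Q(0)τ) and checks that it decays in τ while remaining orthogonal to 1l:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical weakly irreducible generator, frozen at t = 0.
Q0 = np.array([[-2.0,  1.0,  1.0],
               [ 1.0, -3.0,  2.0],
               [ 2.0,  1.0, -3.0]])
m = Q0.shape[0]

# phi0(0): quasi-stationary distribution of Q(0).
M = Q0.T.copy()
M[-1, :] = 1.0
phi0 = np.linalg.solve(M, np.eye(m)[-1])

p0 = np.array([1.0, 0.0, 0.0])                  # initial distribution
psi0 = lambda tau: (p0 - phi0) @ expm(Q0 * tau)  # zeroth-order layer term

decay = [np.abs(psi0(tau)).max() for tau in (0.0, 1.0, 3.0, 6.0)]
print(decay)   # an exponentially decaying initial-layer correction
```

The orthogonality psi0(τ)·1l = 0 is preserved for all τ because the rows of exp(Q(0)τ) sum to one, which is the mechanism behind the exponential decay proved in the next subsection.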
2.4 Exponential Decay of ψ k ( ⋅)
This subsection concerns the exponential decay of ψ k ( ⋅). At first glance, the problem seems troublesome since Q(0) has a zero eigenvalue. Nevertheless, a probabilistic argument helps us to derive the desired property. Two key points in the proof below are the utilization of orthogonality and the repeated application of the approximation of exp(Q(0)τ) in Lemma 4.4.
By virtue of Assumption (A4.1), the finite-state Markov chain generated by Q(0) is weakly irreducible. Identifying Q(0) with the matrix A in Lemma 4.4 yields that
$$\vert \exp (Q(0)\tau ) -\overline{P}\vert \leq K\exp (-\widetilde{\kappa }\tau )\ \mbox{ for some }\widetilde{\kappa } > 0,$$
where \(\overline{P} = \mathrm{1}\mathrm{l}\overline{\nu }\), and \(\overline{\nu } = ({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m})\) is the quasi-stationary distribution corresponding to the constant matrix Q(0).
Proposition 4.11.
Under the conditions of Theorem 4.5, for each 0 ≤ k ≤ n + 1, there exist a nonnegative real polynomial c 2k (τ) of degree 2k and a positive number κ 0,0 > 0 such that
$$\left \vert {\psi }_{k}(\tau )\right \vert \leq {c}_{2k}(\tau )\exp (-{\kappa }_{0,0}\tau ).$$
Proof: First of all, note that
$$\sum\limits_{i=1}^{m}{p}_{i}^{0} = 1\ \mbox{ and }\sum\limits_{i=1}^{m}{\varphi }_{0}^{i}(0) = 1.$$
It follows that
$${\psi }_{0}(0)\mathrm{1}\mathrm{l} = ({p}^{0} - {\varphi }_{0}(0))\mathrm{1}\mathrm{l} = 0.$$
That is, ψ0(0) is orthogonal to 1 l. Consequently, \({\psi }_{0}(0)\overline{P} = 0\) and by virtue of Lemma 4.4 (with A = Q(0)), for some \({\kappa }_{0,0} :=\widetilde{ \kappa } > 0\),
$$\vert {\psi }_{0}(\tau )\vert = \vert {\psi }_{0}(0)(\exp (Q(0)\tau ) -\overline{P})\vert \leq K\exp (-{\kappa }_{0,0}\tau ).$$
Note that
$$Q(t)\mathrm{1}\mathrm{l} = 0.$$
Differentiating this equation repeatedly leads to
$$\frac{{d}^{i}Q(t)} {d{t}^{i}} \mathrm{1}\mathrm{l} = 0,\quad i = 1,2,\ldots.$$
Hence, it follows that
$${r}_{k}(\tau )\mathrm{1}\mathrm{l} = 0$$
for each 0 ≤ k ≤ n + 1. Owing to Lemma 4.4 and (4.21),
$$\vert {\psi }_{1}(\tau )\vert \leq {c}_{2}(\tau )\exp (-{\kappa }_{0,0}\tau )$$
for some nonnegative polynomial c 2(τ) of degree 2.
Note that r k (s) is orthogonal to \(\overline{P}\). By induction, for any k with \(k = 1,\ldots,n + 1\),
$$\vert {\psi }_{k}(\tau )\vert \leq {c}_{2k}(\tau )\exp (-{\kappa }_{0,0}\tau ),$$
where c 2k (τ) is a nonnegative polynomial of degree 2k. This completes the proof of the proposition. □
Since n is a finite integer, the growth of c 2k (τ) for 0 ≤ k ≤ n + 1 is much slower than exponential. Thus the following corollary is in force.
Corollary 4.12.
For each 0 ≤ k ≤ n + 1, with κ 0,0 given in Proposition 4.11,
$$\left \vert {\psi }_{k}(\tau )\right \vert \leq K\exp \left (-\frac{{\kappa }_{0,0}\tau } {2}\right ).$$
2.5 Asymptotic Validation
Recall that \({\mathcal{L}}^{\varepsilon }f = \varepsilon (d/dt)f - fQ\). Then we have the following lemma.
Lemma 4.13.
Suppose that for some 0 ≤ k ≤ n + 1, the function v ε (⋅) satisfies
$$ \sup\limits_{t\in [0,T]}\left \vert {\mathcal{L}}^{\varepsilon }{v}^{\varepsilon }(t)\right \vert = O\left ({\varepsilon }^{k+1}\right )\ \mbox{ and }\ \left \vert {v}^{\varepsilon }(0)\right \vert = O\left ({\varepsilon }^{k}\right ).$$
Then
$$ \sup\limits_{t\in [0,T]}\left \vert {v}^{\varepsilon }(t)\right \vert = O\left ({\varepsilon }^{k}\right ).$$
Proof: Let ηε( ⋅) be a function satisfying \( \sup\limits_{t\in [0,T]}\vert {\eta }^{\varepsilon }(t)\vert = O\left ({\varepsilon }^{k+1}\right )\). Consider the differential equation
$$\varepsilon \frac{d{v}^{\varepsilon }(t)} {dt} = {v}^{\varepsilon }(t)Q(t) + {\eta }^{\varepsilon }(t),\quad {v}^{\varepsilon }(0) = O\left ({\varepsilon }^{k}\right ).$$(4.24)
Then the solution of (4.24) is given by
$${v}^{\varepsilon }(t) = {v}^{\varepsilon }(0){X}^{\varepsilon }(t,0) + \frac{1} {\varepsilon }\int\limits_{0}^{t}{\eta }^{\varepsilon }(s){X}^{\varepsilon }(t,s)\,ds,$$
where X ε(t, s) is a principal matrix solution. Recall that (see Hale [79, p. 80]) a fundamental matrix solution of the differential equation is an invertible matrix each row of which is a solution of the equation; a principal matrix solution is a fundamental matrix solution with initial value the identity matrix. In view of Lemma 4.1,
$$\vert {X}^{\varepsilon }(t,s)\vert \leq K\ \mbox{ for all }0 \leq s \leq t \leq T.$$
Therefore, we have the inequalities
$$ \sup\limits_{t\in [0,T]}\vert {v}^{\varepsilon }(t)\vert \leq K\vert {v}^{\varepsilon }(0)\vert + \frac{KT} {\varepsilon }\, \sup\limits_{t\in [0,T]}\vert {\eta }^{\varepsilon }(t)\vert = O\left ({\varepsilon }^{k}\right ).$$
The proof of the lemma is thus complete. □
Recall that the vector-valued “error” or remainder e n ε(t) is defined by
$${e}_{n}^{\varepsilon }(t) = {p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right ),$$
where p ε( ⋅) is the solution of (4.3), and φ i ( ⋅) and ψ i ( ⋅) are constructed in (4.13) and (4.18). It remains to show that \({e}_{n}^{\varepsilon }(t) = O\left ({\varepsilon }^{n+1}\right )\). To do so, we utilize Lemma 4.13 as a bridge. It should be pointed out, however, that to get the correct order for the remainder, a trick involving “backing up one step” is needed. The details follow.
Proposition 4.14.
Assume (A4.1) and (A4.2). Then for each k = 0,…,n,
$$ \sup\limits_{t\in [0,T]}\left \vert {e}_{k}^{\varepsilon }(t)\right \vert = O\left ({\varepsilon }^{k+1}\right ).$$
Proof: We begin with
We will use the exponential decay property of ψ i (τ) given in Corollary 4.12. Clearly, e 1 ε(0) = 0, and hence the condition of Lemma 4.13 on the initial data is satisfied. By virtue of the exponential decay property of ψ i ( ⋅) in conjunction with the defining equations of φ i ( ⋅) and ψ i ( ⋅),
For the term involving ψ0(t ∕ ε), using a Taylor expansion on Q(t) yields that for some ξ ∈ (0, t)
Owing to the exponential decay property of ψ i ( ⋅), the fact that φ1( ⋅) is n-times continuously differentiable on [0, T], and the above estimate, we have
Moreover, for any \(k = 0,1,2\ldots,n + 1\), it is easy to see that
This implies \({\mathcal{L}}^{\varepsilon }{e}_{1}^{\varepsilon }(t) = O({\varepsilon }^{2})\) uniformly in t. Thus, e 1 ε(t) = O(ε) by virtue of Lemma 4.13 and the bound is uniform in t ∈ [0, T].
We now go back one step to show that the zeroth-order approximation also possesses the correct error estimate, that is, e 0 ε(t) = O(ε). Note that the desired order seems to be difficult to obtain directly, and as a result the back-tracking is employed.
Note that
$${e}_{0}^{\varepsilon }(t) = {e}_{1}^{\varepsilon }(t) + \varepsilon {\varphi }_{1}(t) + \varepsilon {\psi }_{1}\left ({ t \over \varepsilon } \right ).$$
However, the smoothness of φ1( ⋅) and the exponential decay of ψ1( ⋅) imply that
$$\varepsilon \left \vert {\varphi }_{1}(t) + {\psi }_{1}\left ({ t \over \varepsilon } \right )\right \vert \leq K\varepsilon.$$
Thus e 0 ε(t) = O(ε) uniformly in t.
Proceeding analogously, we obtain
Note that the term in the fifth line above is
Using (4.19), we represent r i (t) in terms of (d i ∕ dt i)Q(0), etc. For the term involving ψ0(t ∕ ε), using a truncated Taylor expansion up to order (n + 1) for Q(t), by virtue of the Lipschitz continuity of \(({d}^{n+1}/d{t}^{n+1})Q(\cdot )\), there is a ξ ∈ (0, t) such that
For all the other terms involving ψ i (t ∕ ε), for \(i = 1,\ldots,n + 1\) in (4.30), we proceed as in the calculation of \({\mathcal{L}}^{\varepsilon }{e}_{1}^{\varepsilon }\). As a result, the last two terms in (4.30) are bounded by
which in turn leads to the bound
in accordance with (4.27). Moreover, it is clear that \({e}_{n+1}^{\varepsilon }(0) = 0\). In view of the fact that φ n + 1( ⋅) is continuously differentiable on [0, T] and Q( ⋅) is (n + 1)-times continuously differentiable on [0, T], by virtue of Lemma 4.13, we infer that \({e}_{n+1}^{\varepsilon }(t) = O({\varepsilon }^{n+1})\) uniformly in t. Since
$${e}_{n}^{\varepsilon }(t) = {e}_{n+1}^{\varepsilon }(t) + {\varepsilon }^{n+1}{\varphi }_{n+1}(t) + {\varepsilon }^{n+1}{\psi }_{n+1}\left ({ t \over \varepsilon } \right ),$$
it must be that \({e}_{n}^{\varepsilon }(t) = O({\varepsilon }^{n+1})\). The proof of Proposition 4.14 is complete, and so is the proof of Theorem 4.5. □
Remark 4.15.
In the estimate given above, we actually obtained
This observation will be useful when we consider the unbounded interval [0,∞).
The findings reported are very useful for further study of the limit behavior of the corresponding Markov chain problems of central limit type, which will be discussed in the next chapter. In many applications, a system is governed by a Markov chain, which consists of both slow and fast motions. An immediate question is this: Can we still develop an asymptotic series expansion? This question will be dealt with in Section 4.3.
Suppose that in lieu of (A4.2), we assume that Q( ⋅) is piecewise (n + 1)-times continuously differentiable on [0, T], and \(({d}^{n+1}/d{t}^{n+1})Q(\cdot )\) is piecewise Lipschitz, that is, there is a partition of [0, T], namely,
such that Q( ⋅) is (n + 1)-times continuously differentiable and \(({d}^{n+1}/d{t}^{n+1})\) Q( ⋅) is Lipschitz on each subinterval [t i , t i + 1). Then the result obtained still holds. In this case, in addition to the initial layers, one also has a finite number of inner-boundary layers. In each interval \([{t}_{i},{t}_{i+1} - \eta ]\) for η > 0, the expansion is similar to that presented in Theorem 4.5.
2.6 Examples
As a further illustration, we consider two examples in this section. The first example is concerned with a stationary Markov chain, i.e., Q(t) = Q is a constant matrix. The second example deals with an analytically solvable case for a two-state Markov chain with nonstationary transition probabilities. Although they are simple, these examples give us insight into the asymptotic behavior of the underlying systems.
Example 4.16.
Let αε(t) be an m-state Markov chain with a constant generator Q(t) = Q that is irreducible. This is an analytically solvable case, with
$${p}^{\varepsilon }(t) = {p}^{0}\exp \left (\frac{Qt} {\varepsilon }\right ).$$
Using the technique of asymptotic expansion, we obtain
where
Note that \({p}^{0}\overline{P} = {\varphi }_{0}\), and hence
Moreover,
In this case, \({\varphi }_{0}(t) \equiv {\varphi }_{0}\), a constant vector, which is the equilibrium distribution of Q; the series terminates. Moreover, the solution consists of two terms, one of them the equilibrium distribution (the zeroth-order approximation) and the other the zeroth-order initial-layer correction term. Since φ0 is the quasi-stationary distribution,
Hence the analytic solution and the asymptotic expansion coincide.
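This coincidence is easy to check numerically. The following sketch is our illustration, not part of the original text; the rates λ = 2, μ = 3, the value ε = 0.05, and the initial law are arbitrary choices. For a two-state constant generator, the closed form ν + (p⁰ − ν)exp(−(λ + μ)t∕ε), that is, the quasi-stationary distribution plus the zeroth-order initial-layer correction, is compared against a direct integration of the forward equation:

```python
import numpy as np

# Two-state constant generator Q; lam, mu, eps are illustrative choices.
lam, mu, eps = 2.0, 3.0, 0.05
Q = np.array([[-lam, lam], [mu, -mu]])
nu = np.array([mu, lam]) / (lam + mu)   # quasi-stationary distribution: nu Q = 0, nu 1l = 1
p0 = np.array([1.0, 0.0])               # initial distribution

# Closed form: p(t) = nu + (p0 - nu) exp(-(lam + mu) t / eps).  This works
# because (p0 - nu) has zero row sum, hence is an eigenvector of Q for -(lam + mu).
def p_exact(t):
    return nu + (p0 - nu) * np.exp(-(lam + mu) * t / eps)

# Independently integrate the forward equation dp/dt = p Q / eps by RK4.
def rk4(f, y, t0, t1, n):
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = f(t, y); k2 = f(t + h/2, y + h/2*k1)
        k3 = f(t + h/2, y + h/2*k2); k4 = f(t + h, y + h*k3)
        y = y + h/6*(k1 + 2*k2 + 2*k3 + k4)
        t += h
    return y

f = lambda t, p: p @ Q / eps
p_num = rk4(f, p0, 0.0, 0.2, 4000)
err = np.max(np.abs(p_num - p_exact(0.2)))
```

Away from the initial layer of thickness O(ε), the numerical solution is already indistinguishable from ν.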
In particular, let Q be a two-dimensional matrix, i.e.,
Then setting
we have
Therefore,
Example 4.17.
Consider a two-state Markov chain with generator
where λ(t) ≥ 0, μ(t) ≥ 0 and λ(t) + μ(t) > 0 for each t ∈ [0,T]. Therefore Q(⋅) is weakly irreducible. For the following discussion, assume Q(⋅) to be sufficiently smooth. Although it is time-varying, a closed-form solution is obtainable. Since \({p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) = 1\) for each t, (4.3) can be solved explicitly and the solution is given by
Following the approach in the previous sections, we construct the first few terms of the asymptotic expansion. By considering (4.13) together with (4.2), a system of the form
is obtained. The solution of the system yields that
To find φ1( ⋅), consider
where \(\dot{\lambda } = (d/dt)\lambda \) and \(\dot{\mu } = (d/dt)\mu \). Solving this system of equations gives us
To get the inner expansion, consider the differential equation
with \(\tau = t/\varepsilon \). We obtain
where
Similarly ψ1( ⋅) can be obtained from (4.21) with the exponential matrix given above.
It is interesting to note that either λ(t) or μ(t) can be equal to 0 for some t as long as λ(t) + μ(t) > 0. For example, if we take μ( ⋅) to be the repair rate of a machine in a manufacturing model, then μ(t) = 0 corresponds to the repair workers taking breaks or waiting for parts on order to arrive. The minors of Q(t) are λ(t), − λ(t), μ(t), and − μ(t). As long as not all of them are zero at the same time, the weak irreducibility condition will be met.
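As a numerical companion to this example, consider the following sketch (our own illustration; λ(t) = 1 + t, μ(t) = 2, and ε = 0.01 are arbitrary choices satisfying λ(t) + μ(t) > 0). Integrating the forward equation shows that, outside the initial layer, p^ε(t) stays within O(ε) of the quasi-stationary distribution ν(t) = (μ(t), λ(t))∕(λ(t) + μ(t)):

```python
import numpy as np

# Illustrative time-varying rates with lam(t) + mu(t) > 0 on [0, T].
lam = lambda t: 1.0 + t
mu = lambda t: 2.0
eps = 0.01

def Q(t):  # weakly irreducible two-state generator
    return np.array([[-lam(t), lam(t)], [mu(t), -mu(t)]])

def nu(t):  # quasi-stationary distribution, nu(t) Q(t) = 0, nu(t) 1l = 1
    s = lam(t) + mu(t)
    return np.array([mu(t), lam(t)]) / s

# Integrate the forward equation dp/dt = p Q(t)/eps by RK4 up to t = 0.5.
p = np.array([1.0, 0.0])
t, h, n = 0.0, 1e-4, 5000
for _ in range(n):
    f = lambda s, y: y @ Q(s) / eps
    k1 = f(t, p); k2 = f(t + h/2, p + h/2*k1)
    k3 = f(t + h/2, p + h/2*k2); k4 = f(t + h, p + h*k3)
    p = p + h/6*(k1 + 2*k2 + 2*k3 + k4)
    t += h

gap = np.max(np.abs(p - nu(0.5)))  # distance to phi_0(0.5) = nu(0.5)
```

The gap is of order ε, reflecting the neglected first-order term εφ₁(t).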
2.7 Two-Time-Scale Expansion
The asymptotic expansion derived in the preceding sections is separable in the sense that it is the sum of a regular part and initial-layer corrections. Naturally, one is interested in the relationship between such an expansion and the so-called two-time-scale expansion (see, for example, Smith [199]). To answer this question, we first obtain the two-time-scale asymptotic expansion for the forward equation (4.3), then explore the relationship between the two expansions, and conclude with a comparison of the two methods.
Two-Time-Scale Expansion. Following the literature on asymptotic expansion (e.g., Kevorkian and Cole [108, 109] and Smith [199] among others), consider two scales t and \(\tau = t/\varepsilon \), both as “times.” One of them is in a normal time scale and the other is a stretched one. We seek asymptotic expansions of the form
where \(\{{y}_{i}(t,\tau )\}_{i=0}^{n}\) is a sequence of two-time-scale functions. Treating t and τ as independent variables, one has
Formally substituting (4.32) into (4.3) and equating coefficients of \({\varepsilon }^{i}\) results in
The initial conditions are
Holding t constant and solving the first equation in (4.34) (with the first equation in (4.35) as the initial condition) yields
By virtue of (A4.4), (∂ ∕ ∂t)y 0(t, τ) exists and
As a result, (∂ ∕ ∂t)y 0(t, τ) is orthogonal to 1l. We continue the procedure recursively. It can be verified that for 1 ≤ i ≤ n,
Furthermore, for i = 1, …, n, (∂ ∕ ∂t)y i (t, τ) exists and is continuous; it is also orthogonal to 1l. It should be emphasized that in the equations above, t is viewed as being “frozen,” and as a consequence, Q(t) is a “constant” matrix.
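The role of the frozen time t can be seen numerically. In the sketch below (our own illustration; the rates λ(t) = 1 + t and μ(t) = 2 are arbitrary choices), for each fixed t the solution of ∂y∕∂τ = yQ(t) in the stretched time τ converges to the quasi-stationary distribution ν(t), consistent with the discussion above:

```python
import numpy as np

# Freeze t, treat Q(t) as a constant generator, and integrate in tau.
def Q(t):
    return np.array([[-(1.0 + t), 1.0 + t], [2.0, -2.0]])

def nu(t):  # quasi-stationary distribution of the frozen generator Q(t)
    return np.array([2.0, 1.0 + t]) / (3.0 + t)

def limit_in_tau(t, y0, tau_end=20.0, n=4000):
    """RK4 for dy/dtau = y Q(t) with t held constant."""
    y, h = np.array(y0, dtype=float), tau_end / n
    for _ in range(n):
        k1 = y @ Q(t); k2 = (y + h/2*k1) @ Q(t)
        k3 = (y + h/2*k2) @ Q(t); k4 = (y + h*k3) @ Q(t)
        y = y + h/6*(k1 + 2*k2 + 2*k3 + k4)
    return y

# For several frozen values of t, measure the distance to nu(t) at large tau.
gaps = [np.max(np.abs(limit_in_tau(t, [1.0, 0.0]) - nu(t))) for t in (0.0, 0.3, 0.7)]
```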
Parallel to the previous development, one can show that for all 1 ≤ i ≤ n,
Compared with the separable expansions presented before, note the t-dependence of K( ⋅) and κ0( ⋅) above. Furthermore, the validity of the asymptotic series can be established. We summarize this as the following theorem.
Theorem 4.18.
Under the conditions of Theorem 4.5 , a sequence of functions \(\{{y}_{i}(t,\tau )\}_{i=0}^{n}\) can be constructed so that
Example 4.19.
We return to Example 4.16 . It is readily verified that the zeroth-order two-time-scale expansion coincides with that of the analytic solution, in fact, with
Relationship between the Two Methods. Now we have two different asymptotic expansions. Do they in some sense produce similar asymptotic results? Note that each term in y i (t, τ) contains the regular part φ i (t) as well as the initial-layer corrections. Examining the zeroth-order approximation leads to
via the same argument employed in the proof of Lemma 4.4. The limit matrix has identical rows and is given by \(\overline{P}(t) = \mathrm{1}\mathrm{l}\nu (t)\). In fact, owing to \({p}^{0}\mathrm{1}\mathrm{l} =\sum\limits_{i=1}^{m}{p}_{i}^{0} = 1\), we have
where \(\widetilde{{y}}_{0}(t,\tau )\) decays exponentially fast as τ → ∞ for t < τ.
In view of (4.38), the two methods produce the same limit as τ → ∞, namely, the quasi-stationary distribution. To explore further, we study a special case (a two-state Markov chain) so as to keep the notation simple. Consider the two-state Markov chain model Example 4.17. In view of (4.38) and the formulas in Example 4.17, we have
Owing to (4.37), direct calculation yields that
It can be verified that the second term on the right-hand side of the equal sign above decays exponentially fast, while the first term yields φ1(t) plus another term tending to 0 exponentially fast as τ → ∞. Using the result of Example 4.17 yields
Thus, it follows that
where
Similarly, we can obtain
where \(\widetilde{{y}}_{i}(t,\tau )\) decay exponentially fast as τ → ∞ for all t < τ. This establishes the connection between these two different expansions.
Comparison and Additional Remark. A moment of reflection reveals that:
-
The conditions required to obtain the asymptotic expansions are the same.
-
Except for the actual forms, there is no significant difference between these two methods.
-
No matter which method is employed, in one way or another the results for stationary Markov chains are used. In the separable expansion, this is accomplished by using Q(0), and in the two-time-scale expansion, this is carried out by holding t constant and hence treating Q(t) as a constant matrix.
-
Although the two-time-scale expansion admits a seemingly more general form, the separable expansion is more transparent as far as the quasi-stationary distribution is concerned.
-
When a more complex problem, for example the case of weak and strong interactions, is encountered, the separable expansion becomes more advantageous.
-
To study asymptotic normality, etc., in the sequel, the separable expansion will prove to be more convenient than the two-time-scale expansion.
-
In view of the items mentioned above, we choose to use the separable form of the expansion throughout.
3 Markov Chains with Multiple Weakly Irreducible Classes
This section presents the asymptotic expansions of two-time-scale Markov chains with slow and fast components subject to weak and strong interactions. We assume that all the states of the Markov chain are recurrent. In contrast to Section 4.2, the states belong to multiple weakly irreducible classes. As was mentioned in the introductory chapter, such time-scale separation stems from various applications in production planning, queueing networks, random fatigue, system reliability, competing risk theory, control and optimization of large-scale dynamical systems, and related fields. The underlying models, in which some components change very rapidly whereas others vary relatively slowly, are more complex than those of Section 4.2. The weak and strong interactions of the systems are modeled by assuming the generator of the underlying Markov chain to be of the form
where \(\widetilde{Q}(t)\) governs the rapidly changing part and \(\widehat{Q}(t)\) describes the slowly changing components. They have the appropriate forms to be mentioned in the sequel.
This section extends the results in Section 4.2 to incorporate the cases in which the generator \(\widetilde{Q}(t)\) is not irreducible. Our study focuses on the forward equation, similar to (4.3); now the forward equation takes the form
such that
To illustrate, we present a simple example below.
Example 4.20.
Consider a two-machine flowshop with machines that are subject to breakdown and repair. The production capacity of the machines is described by a finite-state Markov chain. If the machine is up, then it can produce parts with production rate u(t); its production rate is zero if the machine is under repair. For simplicity, suppose each of the machines is either in operating condition (denoted by 1) or under repair (denoted by 0). Then the capacity of the workshop becomes a four-state Markov chain with state space {(1,1),(0,1),(1,0),(0,0)}. Suppose that the first machine breaks down much more often than the second one. To reflect this situation, consider a Markov chain αε(⋅) generated by Qε(t) in (4.39), with \(\widetilde{Q}(\cdot )\) and \(\widehat{Q}(\cdot )\) given by
and
where λi(⋅) and μi(⋅) are the rates of repair and breakdown, respectively. The matrices \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\) are themselves generators of Markov chains. Note that
is a block-diagonal matrix, representing the fast motion, and \(\widehat{Q}(t)\) governs the slow components. In order to obtain any meaningful results for controlling and optimizing the performance of the underlying systems, the foremost task is to determine the asymptotic behavior (as ε → 0) of the probability distribution of the underlying chain.
In this example, a first glance reveals that \(\widetilde{Q}(t)\) is reducible, hence the results in Section 4.2 are not applicable. However, closer scrutiny indicates that \(\widetilde{Q}(t)\) consists of two irreducible submatrices. One expects that the asymptotic expansions may still be established. Our main objective is to develop asymptotic expansions of such systems and their variants. The corresponding procedure is, however, much more involved compared with the irreducible cases.
Examining (4.39), it is seen that the asymptotic properties of the underlying Markov chains largely depend on the structure of the matrix \(\widetilde{Q}(t)\). In accordance with the classification of states, we may consider three different cases: the chains with recurrent states only, the inclusion of absorbing states, and the inclusion of transient states. We treat the recurrent-state cases in this section, and then extend the results to notationally more involved cases including absorbing states and transient states in the following two sections.
Suppose αε( ⋅) is a finite-state Markov chain with generator given by (4.39), where both \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\) are generators of appropriate Markov chains. In view of the results in Section 4.2, it is intuitively clear that the structure of the generator \(\widetilde{Q}(t)\) governs the fast-changing part of the Markov chain. As mentioned in the previous section, our study of the finite-state-space cases is naturally divided into the recurrent cases, the inclusion of absorbing states, and the inclusion of transient states of the generator \(\widetilde{Q}(t)\). In accordance with classical results (see Chung [31] and Karlin and Taylor [105, 106]), one can always decompose the states of a finite-state Markov chain into recurrent (including absorbing) and transient classes. Inspired by Seneta’s approach to nonnegative matrices (see Seneta [189]), we aim to put the matrix \(\widetilde{Q}(t)\) into some sort of “canonical” form so that a systematic study can be carried out. In a finite-state Markov chain, not all states are transient, and it has at least one recurrent state. Similar to the argument of Iosifescu [95, p. 94] (see also Goodman [75], Karlin and McGregor [104], Keilson [107] among others), if there are no transient states, then after suitable permutations and rearrangements (i.e., by appropriately relabeling the states), \(\widetilde{Q}(t)\) can be put into the block-diagonal form
where \(\widetilde{{Q}}^{k}(t) \in {\mathbb{R}}^{{m}_{k}\times {m}_{k}}\) are weakly irreducible, for k = 1, 2, …, l, and \(\sum\limits_{k=1}^{l}{m}_{k} = m\). Here and hereinafter, \(\widetilde{{Q}}^{k}(t)\) (a superscript without parentheses) denotes the kth block matrix in \(\widetilde{Q}(t)\). The rest of this section deals with the generator Q ε(t) given by (4.39) with \(\widetilde{Q}(t)\) taking the form (4.41). Note that an example of the recurrent case is that of the irreducible (or weakly irreducible) generators treated in Section 4.2.
Let \({\mathcal{M}}_{k} =\{ {s}_{k1},\ldots,{s}_{k{m}_{k}}\}\) for k = 1, …, l denote the states corresponding to \(\widetilde{{Q}}^{k}(t)\) and let \(\mathcal{M}\) denote the state space of the underlying chains given by
Since \(\widetilde{{Q}}^{k}(t) = {(\widetilde{{q}}_{ij}^{k}(t))}_{{m}_{k}\times {m}_{k}}\) and \(\widehat{Q}(t) = {(\widehat{{q}}_{ij}(t))}_{m\times m}\) are generators, for k = 1, 2, …, l, we have
The slow and fast components are coupled through weak and strong interactions in the sense that the underlying Markov chain fluctuates rapidly within a single group \({\mathcal{M}}_{k}\) and jumps less frequently between groups \({\mathcal{M}}_{k}\) and \({\mathcal{M}}_{j}\) for k≠j. The states in \({\mathcal{M}}_{k},\) k = 1, …, l, are not isolated or independent of each other. More precisely, if we consider the states in \({\mathcal{M}}_{k}\) as a single “state,” then these “states” are coupled through the matrix \(\widehat{Q}(t)\), and transitions from \({\mathcal{M}}_{k}\) to \({\mathcal{M}}_{j}\), k ≠ j are possible. In fact, \(\widehat{Q}(\cdot )\), together with the quasi-stationary distributions of \(\widetilde{{Q}}^{k}(t)\), determines the transition rates among states in \({\mathcal{M}}_{k}\), for k = 1, …, l.
Consider the forward equation (4.40). Our goal here is to develop an asymptotic series for the solution p ε( ⋅) of (4.40). Working with the interval [0, T] for some T < ∞, we will need the following conditions:
-
For each t ∈ [0, T] and k = 1, 2, …, l, \(\widetilde{{Q}}^{k}(t)\) is weakly irreducible.
-
For some positive integer n, \(\widetilde{Q}(\cdot )\) and \(\widehat{Q}(\cdot )\) are ( n + 1)-times and n-times continuously differentiable on [0, T], respectively. Moreover, \(({d}^{n+1}/d{t}^{n+1})\widetilde{Q}(\cdot )\) and \(({d}^{n}/d{t}^{n})\widehat{Q}(\cdot )\) are Lipschitz on [0, T].
Compared with the irreducible models in Section 4.2, the main difficulty in this chapter lies in the interactions among different blocks. In constructing the expansion in Section 4.2, for i = 1, …, n, the two sets of functions {φi( ⋅)} and {ψ i ( ⋅)} are obtained independently except for the initial conditions \({\psi }_{i}(0) = -{\varphi }_{i}(0)\). For Markov chains with weak and strong interactions, φi ( ⋅) and ψ i ( ⋅) are highly intertwined. The essence is to find φi ( ⋅) and ψ i ( ⋅) jointly and recursively. In the process of construction, one of the crucial and delicate points is to select the “right” initial conditions. This is done by demanding that ψi (τ) decay to 0 as τ → ∞. For future use, we define a differential operator \({\mathcal{L}}^{\varepsilon }\) on \({\mathbb{R}}^{1\times m}\)-valued functions by
Then it follows that \({\mathcal{L}}^{\varepsilon }f = 0\) iff f is a solution to the differential equation in (4.40). We are now in a position to derive the asymptotic expansion.
3.1 Asymptotic Expansions
As in Section 4.2, we seek expansions of the form
For the purpose of estimating the remainder (or error), the terms φ n + 1 ( ⋅) and ψ n + 1 ( ⋅) are needed. Set \({\mathcal{L}}^{\varepsilon }{y}_{n+1}^{\varepsilon }(t) = 0\). Parallel to the approach in Section 4.2, equating like powers of εi (for \(i = 0,1,\ldots,n + 1\)) leads to the equations for the regular part:
As discussed in Section 4.2, the approximation above is good for t away from 0. When t is sufficiently close to 0, an initial layer of thickness ε develops. Thus for the singular part of the expansion we enlarge the picture near 0 using the stretched variable τ defined by \(\tau = t/\varepsilon \). Identifying the initial-layer terms in \({\mathcal{L}}^{\varepsilon }{y}_{n+1}^{\varepsilon } = 0\), we obtain
By means of the Taylor expansion, we have
where
for some 0 ≤ ξ ≤ t and 0 ≤ ζ ≤ t. Note that in view of (A4.4),
Equating coefficients of \({\varepsilon }^{i}\), for \(i = 0,1,\ldots,n + 1\), and using the Taylor expansion above, we obtain
In view of the essence of matched asymptotic expansion, we have necessarily at t = 0 that
This equation implies
for i ≥ 1. Moreover, note that \({p}^{\varepsilon }(t)\mathrm{1}\mathrm{l} = 1\) for all t ∈ [0, T]. Sending ε → 0 in the asymptotic expansion, one necessarily has the following conditions: For all t ∈ [0, T],
Our task now is to determine the functions φ i ( ⋅) and ψ i( ⋅).
Determining φ 0 (⋅) and ψ 0 (⋅). Write \(v = ({v}^{1},\ldots,{v}^{l})\) for a vector \(v \in {\mathbb{R}}^{1\times m}\), where \({v}^{k}\) denotes the subvector corresponding to the kth block of the partition. Meanwhile, a superscript with parentheses denotes a sequence. Thus \({v}^{(n),k}\) denotes the kth subblock of the partitioned vector of the sequence \({v}^{(n)}\).
Let us start with the first equation in (4.44). In view of (4.47), we have
Note that the system above depends only on the generator \(\widetilde{Q}(t)\). However, by itself, the system is not uniquely solvable. Since for each t ∈ [0, T] and k = 1, …, l, \(\widetilde{{Q}}^{k}(t)\) is weakly irreducible, it follows that \(\mathrm{rank}(\widetilde{{Q}}^{k}(t)) = {m}_{k} - 1\) and \(\mathrm{rank}(\widetilde{Q}(t)) = m - l\) . Therefore, to get a unique solution, we need to supply l auxiliary equations. Where can we find these equations? Upon dividing the system (4.48) into l subsystems, one can apply the Fredholm alternative (see Lemma A.37 and Corollary A.38) and use the orthogonality condition to choose l additional equations to replace l equations in the system represented by the first equation in (4.48).
Since for each k, \(\widetilde{{Q}}^{k}(t)\) is weakly irreducible, there exists a unique quasi-stationary distribution νk(t). Therefore any solution to the equation
can be written as the product of νk(t) and a scalar “multiplier,” say \({{\vartheta}}_{0}^{k}(t)\). It follows from the second equation in (4.48) that \(\sum\limits_{k=1}^{l}{{\vartheta}}_{0}^{k}(t) = 1\). These \({{\vartheta}}_{0}^{k}(t)\)’s can be interpreted as the probabilities of the “grouped states” (or “aggregated states”) \({\mathcal{M}}_{k}\).
As will be seen in the sequel, \({{\vartheta}}_{0}^{k}(t)\) becomes an important spinoff in the process of construction. Effort will subsequently be devoted to finding the unique solution \(({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{0}^{l}(t))\). Let \(\mathrm{1}{\mathrm{l}}_{{m}_{k}} = (1,\ldots,1)^{\prime} \in {\mathbb{R}}^{{m}_{k}\times 1}\).
Lemma 4.21
. Under (A4.3) and (A4.4) , for each k = 1,…,l, the solution of the equation
can be uniquely expressed as \({\varphi }_{0}^{k}(t) = {\nu }^{k}(t){{\vartheta}}_{0}^{k}(t)\) , where ν k (t) is the quasi-stationary distribution corresponding to \(\widetilde{{Q}}^{k}(t)\) . Moreover, φ 0 k (t) is (n + 1)-times continuously differentiable on [0,T], provided that \({{\vartheta}}_{0}^{k}(\cdot )\) is (n + 1)-times continuously differentiable.
Proof: For each k, let us regard \({{\vartheta}}_{0}^{k}(\cdot )\) as a known function temporarily. For t ∈ [0, T], let \(\widetilde{{Q}}_{c}^{k}(t) = (\mathrm{1}{\mathrm{l}}_{{m}_{k}}\vdots\;\widetilde{{Q}}^{k}(t))\) . Then the solution can be written as
where \({0}_{{m}_{k}} = (0,\ldots,0)^{\prime} \in {\mathbb{R}}^{{m}_{k}\times 1}\) . Moreover, φ 0 ( ⋅) is ( n + 1)-times continuously differentiable. The lemma is thus concluded. □
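The augmented-matrix device \(\widetilde{{Q}}_{c}^{k}(t)\) in the proof can be mimicked numerically. The sketch below is our own illustration; the 3-state irreducible generator is an arbitrary choice. Appending the column 1l to the generator yields a consistent overdetermined linear system whose least-squares solution is the quasi-stationary distribution:

```python
import numpy as np

# An arbitrary irreducible 3-state generator (rows sum to zero).
Q = np.array([[-2.0, 1.0, 1.0],
              [1.0, -3.0, 2.0],
              [2.0, 2.0, -4.0]])
m = Q.shape[0]

# Augment: Qc = (1l : Q). The system nu Qc = (1, 0, ..., 0) encodes
# nu 1l = 1 together with nu Q = 0 and has a unique solution, since
# rank(Qc) = m when Q is irreducible.
Qc = np.hstack([np.ones((m, 1)), Q])              # m x (m+1)
rhs = np.zeros(m + 1); rhs[0] = 1.0
nu, *_ = np.linalg.lstsq(Qc.T, rhs, rcond=None)   # solve Qc' nu' = rhs
```

For this generator the solution works out to ν = (8∕19, 6∕19, 5∕19).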
Remark 4.22
. This lemma indicates that for each k, the subvector φ 0 k (⋅) is a multiple of the quasi-stationary distribution ν k (⋅) for each k = 1,…,l. The multipliers \({{\vartheta}}_{0}^{k}(\cdot )\) are to be determined. Owing to the interactions among different “aggregated states” corresponding to the block matrices, piecing together quasi-stationary distributions does not produce a quasi-stationary distribution for the entire system (i.e., (ν 1 (t),…,ν k (t)) is not a quasi-stationary distribution for the entire system). Therefore, the leading term in the asymptotic expansion is proportional to (or a “multiple” of) the quasi-stationary distributions of the Markov chains generated by \(\widetilde{{Q}}^{k}(t)\) , for k = 1,…,l. The multiplier \({{\vartheta}}_{0}^{k}(t)\) reflects the interactions of the Markov chain among the “aggregated states.” The probabilistic meaning of the leading term φ 0 (⋅) is in the sense of total probability. Intuitively, \({{\vartheta}}_{0}^{k}(t)\) is the corresponding probability of the chain belonging to \({\mathcal{M}}_{k}\) , and φ 0 k (t) is the probability distribution of the chain belonging to \({\mathcal{M}}_{k}\) and the transitions taking place within this group of states.
We proceed to determining \({{\vartheta}}_{0}^{k}(\cdot )\) for k = 1, …, l. Define an m × l matrix
A crucial observation is that \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\) , that is, \(\widetilde{Q}(t)\) and \(\widetilde{\mathrm{1}\mathrm{l}}\) are orthogonal. Thus postmultiplying by \(\widetilde{\mathrm{1}\mathrm{l}}\) leads to
Recall that
Equating the coefficients of ε in the above equation yields
where
Remark 4.23.
Intuitively, \(\overline{Q}(t)\) is the “average” of \(\widehat{Q}(t)\) weighted by the collection of quasi-stationary distributions (ν1(t),…,νl(t)). In fact, (4.50) is merely a requirement that the equations in (4.44) be consistent in the sense of Fredholm. This can be seen as follows. Denote by \(N(\widetilde{Q}(t))\) the null space of the matrix \(\widetilde{Q}(t)\). Since \(\mathrm{rank}(\widetilde{Q}(t)) = m - l\), the dimension of \(N(\widetilde{Q}(t))\) is l. Observe that \(\widetilde{\mathrm{1}\mathrm{l}} = \mathrm{diag}(\mathrm{1}{\mathrm{l}}_{{m}_{1}},\ldots,\mathrm{1}{\mathrm{l}}_{{m}_{l}})\), whose columns
are linearly independent and span the null space of \(\widetilde{Q}(t)\). The equations in (4.44) have solutions only if the right-hand side of each equation is orthogonal to \(\widetilde{\mathrm{1}\mathrm{l}}\). Hence, (4.50) must hold.
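To make the averaging concrete, the following sketch (our own illustration; all rates are arbitrary choices) assembles the averaged generator, computed here as \(\mathrm{diag}({\nu }^{1},\ldots,{\nu }^{l})\widehat{Q}\widetilde{\mathrm{1}\mathrm{l}}\) in accordance with the weighting described in the remark, for a 4-state chain with two 2-state blocks, and checks that the result is again a generator:

```python
import numpy as np

# Two weakly irreducible 2-state blocks (illustrative constant rates).
Q1 = np.array([[-1.0, 1.0], [2.0, -2.0]])   # quasi-stationary nu1 = (2/3, 1/3)
Q2 = np.array([[-3.0, 3.0], [1.0, -1.0]])   # quasi-stationary nu2 = (1/4, 3/4)
nu1 = np.array([2/3, 1/3])
nu2 = np.array([1/4, 3/4])

# A slow-part generator Q_hat coupling the two groups (illustrative choice).
Q_hat = np.array([[-1.0, 0.0, 1.0, 0.0],
                  [0.0, -1.0, 0.0, 1.0],
                  [1.0, 0.0, -1.0, 0.0],
                  [0.0, 1.0, 0.0, -1.0]])

# 1l~ = diag(1l_2, 1l_2), and diag(nu1, nu2) as a 2 x 4 block matrix.
one_tilde = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
nu_diag = np.zeros((2, 4))
nu_diag[0, :2], nu_diag[1, 2:] = nu1, nu2

Q_bar = nu_diag @ Q_hat @ one_tilde   # averaged 2 x 2 generator
```

For these data the average works out to Q̄ = [[−1, 1], [1, −1]]: rows sum to zero and off-diagonal entries are nonnegative, so Q̄ generates the aggregated two-state chain.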
Next we determine the initial value \({{\vartheta}}_{0}(0)\). If the asymptotic expansion of \({p}^{\varepsilon }(\cdot )\) is given by \({y}_{n}^{\varepsilon }(\cdot )\) (see (4.43)), then it is necessary that
We will refer to such a condition as an initial-value consistency condition. Moreover, in view of (4.40) and \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0,\)
Since p ε( ⋅) and \(\widehat{Q}(\cdot )\) are both bounded, it follows that
Therefore, the initial-value consistency condition (4.53) yields
Note that \(({{\vartheta}}_{0}^{1}(0),\ldots,{{\vartheta}}_{0}^{l}(0)) = {\varphi }_{0}(0)\widetilde{\mathrm{1}\mathrm{l}}\). So the initial value for \({{\vartheta}}_{0}(t)\) should be
Using this initial condition and solving (4.50) yields that
where X(t, s) is the principal matrix solution of (4.50) (see Hale [79]). Since the smoothness of X( ⋅, ⋅) depends solely on the smoothness properties of \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\), \(({{\vartheta}}_{0}^{1}(\cdot ),\ldots,{{\vartheta}}_{0}^{l}(\cdot ))\) is (n + 1)-times continuously differentiable on [0, T]. Up to now, we have shown that φ0( ⋅) can be constructed so that it is (n + 1)-times continuously differentiable on [0, T]. Set \({{\vartheta}}_{0}(t) = ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{0}^{l}(t))\). We now summarize the discussion above as follows:
Proposition 4.24
. Assume conditions (A4.3) and (A4.4) . Then for t ∈ [0,T], φ 0 (t) can be obtained uniquely by solving the following system of equations:
such that φ 0 (⋅) is (n + 1)-times continuously differentiable. □
We next consider the initial-layer term ψ0( ⋅). First note that solving (4.45) for each \(i = 0,1,\ldots,n + 1\) leads to
Once again, to match the asymptotic expansion requires that (4.46) hold and hence
Solving the first equation in (4.45) together with the above initial condition, one obtains
Note that in Proposition 4.25 to follow, we still use κ0, 0 as a positive constant, which is generally a different constant from that in Section 4.2.
Proposition 4.25
. Assume conditions (A4.3) and (A4.4) . Then ψ 0 (⋅) can be obtained uniquely by (4.56) . In addition, there is a positive number κ 0,0 such that
Proof: We prove only the exponential decay property, since the rest is obvious. Let \({\nu }^{k}(0)\) be the stationary distribution corresponding to the generator \(\widetilde{{Q}}^{k}(0)\). Define
where
Noting the block-diagonal structure of \(\widetilde{Q}(0)\), we have
It is easy to see that
Owing to the choice of initial condition, \(({p}^{0} - {\varphi }_{0}(0))\) is orthogonal to π, and by virtue of Lemma 4.4, for each k = 1, …, l and some \({\kappa }_{0,k} > 0\),
we have
where \({\kappa }_{0,0} {=\min }_{k\leq l}{\kappa }_{0,k} > 0\). □
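The decay asserted in Proposition 4.25 can be observed directly. In the sketch below (our own illustration; the block-diagonal \(\widetilde{Q}(0)\) and the initial law p⁰ are arbitrary choices), ψ0 is propagated by \(d{\psi }_{0}/d\tau = {\psi }_{0}\widetilde{Q}(0)\) and its norm decays exponentially in τ:

```python
import numpy as np

# Block-diagonal fast generator Q_tilde(0) (illustrative rates).
Q_tilde = np.zeros((4, 4))
Q_tilde[:2, :2] = [[-1.0, 1.0], [2.0, -2.0]]
Q_tilde[2:, 2:] = [[-3.0, 3.0], [1.0, -1.0]]

p0 = np.array([1.0, 0.0, 0.0, 0.0])
# With theta0(0) = p0 1l~ = (1, 0), phi0(0) stacks nu1 * 1 and nu2 * 0,
# where nu1 = (2/3, 1/3) is the quasi-stationary distribution of the first block.
phi0_0 = np.array([2/3, 1/3, 0.0, 0.0])
psi = p0 - phi0_0                      # psi0(0); each block sums to zero

# Integrate d psi / d tau = psi Q_tilde by RK4 and record the sup norm.
h, norms = 0.01, []
for step in range(501):                # tau from 0 to 5
    norms.append(np.max(np.abs(psi)))
    k1 = psi @ Q_tilde; k2 = (psi + h/2*k1) @ Q_tilde
    k3 = (psi + h/2*k2) @ Q_tilde; k4 = (psi + h*k3) @ Q_tilde
    psi = psi + h/6*(k1 + 2*k2 + 2*k3 + k4)

# For these data psi0(tau) = (1/3, -1/3, 0, 0) exp(-3 tau), so the decay rate is 3.
```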
Determining φ i (⋅) and ψ i (⋅) for i ≥ 1. In contrast to the situation encountered in Section 4.2, the sequence {φ i ( ⋅)} cannot be obtained without the involvement of {ψi ( ⋅)}. We thus obtain the sequences pairwise. While the determination of φ0 ( ⋅) and ψ 0 ( ⋅) is similar to that of Section 4.2, the solutions for the rest of the functions show distinct features resulting from the underlying weak and strong interactions. With known
we proceed to solve the second equation in (4.44) together with the constraint \(\sum\limits_{i=1}^{m}{\varphi }_{1}^{i}(t) = 0\) due to (4.47). Partition the vectors φ1(t) and b 0(t) as
In view of the definition of \(\overline{Q}(t)\) in (4.51) and \({\varphi }_{0}^{k}(t) = {\nu }^{k}(t){{\vartheta}}_{0}^{k}(t)\), it follows that \({b}_{0}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\), thus,
Let \({{\vartheta}}_{1}^{k}(t) = {\varphi }_{1}^{k}(t)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\); note that \(\sum\limits_{k=1}^{l}{{\vartheta}}_{1}^{k}(t) = 0\) because \({\varphi }_{1}(t)\mathrm{1}\mathrm{l} = 0\). Then for each k = 1, …, l, the solution to
can be expressed as
where \(\widetilde{{b}}_{0}^{k}(t)\) is a solution to the following equation:
or equivalently,
The procedure for solving this equation is similar to that for φ0( ⋅).
Analogously to the previous treatment, we proceed to determine \({{\vartheta}}_{1}^{k}(t)\) by solving the system of equations
Using the conditions
we have
and
where \(\overline{Q}(t)\) was defined in (4.51).
By equating the coefficients of \({\varepsilon }^{2}\) in (4.60), we obtain a system of linear inhomogeneous equations
with initial conditions
Again, as observed in Remark 4.23, equation (4.61) comes from the consideration in the sense of Fredholm since the functions on the right-hand sides in (4.44) must be orthogonal to \(\widetilde{\mathrm{1}\mathrm{l}}\).
The initial conditions \({{\vartheta}}_{1}^{k}(0)\) for k = 1, …, l have not been completely specified yet. We do this later to ensure the matched asymptotic expansion. Once the \({{\vartheta}}_{1}^{k}(0)\) ’s are given, the solution of the above equation is
Thus if the initial value \({{\vartheta}}_{1}^{k}(0)\) is given, then \({{\vartheta}}_{1}^{k}(\cdot ),\) k = 1, …, l can be found, and so can φ1( ⋅). Moreover, φ1( ⋅) is n-times continuously differentiable on [0, T]. The problem boils down to finding the initial condition of \({{\vartheta}}_{1}(0)\).
So far, with the proviso of specified initial conditions \({{\vartheta}}_{1}^{k}(0)\), for k = 1, …, l, the construction of φ 1 ( ⋅) has been completed, and its smoothness has been established. Compared with the determination of φ 0 ( ⋅), the multipliers \({{\vartheta}}_{1}^{k}(\cdot )\) can no longer be determined using the information about the regular part alone because its initial values have to be determined in conjunction with that of the singular part. This will be seen as follows.
In view of (4.55),
Recall that ψ 1 (0) has not been specified yet.
Similar to Section 4.2, for each t ∈ [0, T], \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\) . Therefore,
for \(i = 1,\ldots,n + 1\), where π is defined in (4.57). This together with ψ0 (0)π = 0 yields
To obtain the desired property, we need only work with the first two terms on the right side of the equal sign of (4.62). Noting the exponential decay property of \({\psi }_{0}(\tau ) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau )\) , we have
that is, the improper integral converges absolutely. Set
Consequently,
Recall that \(\pi = \mathrm{diag}(\mathrm{1}{\mathrm{l}}_{{m}_{1}}{\nu }^{1}(0),\ldots,\mathrm{1}{\mathrm{l}}_{{m}_{l}}{\nu }^{l}(0))\). Partitioning the vector \({\overline{\psi }}_{0}\) as \(({\overline{\psi }}_{0}^{1},\ldots,{\overline{\psi }}_{0}^{l})\), we have, for k = 1, …, l,
Our expansion requires that \({\lim }_{\tau \rightarrow \infty }{\psi }_{1}(\tau ) = 0\). As a result,
which implies, by virtue of (4.66),
for k = 1, …, l. Solving these equations and in view of
we choose
Substituting these into (4.59), we obtain φ1( ⋅). Finally, we use \({\psi }_{1}(0) = -{\varphi }_{1}(0)\). The process of choosing initial conditions for φ1( ⋅) and ψ1( ⋅) is complete. Furthermore,
This procedure can be applied to φi( ⋅) and ψ i ( ⋅) for \(i = 2,\ldots,n + 1\). We proceed recursively to solve for φi ( ⋅) and ψ i ( ⋅) jointly. Using exactly the same methods as the solution for φ 1( ⋅), we define
for each k = 1, …, l and \(i = 2,\ldots,n + 1\). Similar to \(\widetilde{{b}}_{0}^{k}(\cdot )\), we define \(\widetilde{{b}}_{i}^{k}(\cdot )\) and write
Proceeding inductively, suppose that \({{\vartheta}}_{i}^{k}(0)\) has been selected; in view of (4.55), it has been shown that
for some \(0 < {\kappa }_{i,0} < {\kappa }_{i-1,0}\). Solve
to obtain \({\psi }_{i+1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}} = -{\overline{\psi }}_{i}^{k}\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) . Set
Finally, choose \({\psi }_{i+1}(0) = -{\varphi }_{i+1}(0).\) We have thus determined the initial conditions for φ i ( ⋅). Exactly the same arguments as in Proposition 4.25 lead to
Proposition 4.26
. Assume (A4.3) and (A4.4) . Then the following assertions hold:
-
The sequences of row-vector-valued functions φ i ( ⋅) and \({{\vartheta}}_{i}(\cdot )\) fori = 1, 2, …, ncan be obtained by solving the system of algebraic differential equations
$$\begin{array}{ll} &{\varphi }_{i}(t)\widetilde{Q}(t) = \frac{d{\varphi }_{i-1}(t)} {dt} - {\varphi }_{i-1}(t)\widehat{Q}(t), \\ &{{\vartheta}}_{i}^{k}(t) = {\varphi }_{ i}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}, \\ &{ d{{\vartheta}}_{i}(t) \over dt} = {{\vartheta}}_{i}(t)\overline{Q}(t) +\widetilde{ {b}}_{i-1}(t)\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}}.\end{array}$$(4.69) -
Fori = 1, …, n, the initial conditions are selected as follows:
-
Fork = 1, 2, …, l, find \({\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) from the equation
$${\psi }_{i}(0)\pi = -\left(\;\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j-1}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right)\pi := -{\overline{\psi }}_{i-1}\pi.$$ -
Choose
$${{\vartheta}}_{i}^{k}(0) = -{\psi }_{ i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}},\mbox{ for }k = 1,\ldots,l.$$ -
Choose \({\psi }_{i}(0) = -{\varphi }_{i}(0).\)
-
-
There is a positive real number κ 0 with \(0 < {\kappa }_{0} < {\kappa }_{i,0}\) (given in (4.68)) for \(i = 0,1,\ldots,n + 1\) such that
$$\vert {\psi }_{i}(\tau )\vert \leq K\exp (-{\kappa }_{0}\tau ).$$ -
The choice of initial conditions yields that \({{\vartheta}}_{i}^{k}(\cdot )\) is \((n + 1 - i)\) -times continuously differentiable on [0, T] and hence φi( ⋅) is \((n + 1 - i)\) -times continuously differentiable on [0, T]. □
3.2 Analysis of Remainder
The objective here is to carry out the error analysis and validate the asymptotic expansion. Since the details are quite similar to those of Section 4.2, we make no attempt to spell them out. Only the following lemma and proposition are presented.
Lemma 4.27.
Suppose that (A4.3) and (A4.4) are satisfied. Let η ε (⋅) be a function such that
and let \({\mathcal{L}}^{\varepsilon }\) be an operator defined in (4.42) . If f ε (⋅) is a solution to the equation
then f ε (⋅) satisfies
Proof: Note that using \({Q}^{\varepsilon }(t) =\widetilde{ Q}(t)/\varepsilon +\widehat{ Q}(t)\) , the differential equation can be written as
We can then proceed as in the proof of Lemma 4.13. □
Lemma 4.27 together with detailed computation similar to that of Section 4.2 yields the following proposition.
Proposition 4.28.
For each i = 0,1,…,n, define
Under conditions (A4.3) and (A4.4),
3.3 Computational Procedure: User’s Guide
Since the constructions of φ i ( ⋅) and ψ i ( ⋅) are rather involved, and the choice of initial conditions is tricky, we summarize the procedure below. This procedure, which can be used as a user’s guide for developing the asymptotic expansion, comprises two main stages.
Step 1: Initialization: finding φ 0 ( ⋅) and ψ 0 ( ⋅).
Step 2: Iteration: finding φ i ( ⋅) and ψ i ( ⋅) for 1 ≤ i ≤ n.
While i ≤ n, do the following:
-
1.
Find φ i ( ⋅), the solution of (4.69), with temporarily unspecified \({{\vartheta}}_{i}^{k}(0)\) for k = 1, …, l.
-
2.
Obtain ψ i ( ⋅) from (4.55) with temporarily unspecified ψ i (0).
-
3.
Use the equation
$${\psi }_{i}(0)\pi = -\left(\,\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j-1}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right)\pi := -{\overline{\psi }}_{i-1}\pi $$to obtain \({\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}} = -{\overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{m}_{k}}.\)
-
4.
Set \({{\vartheta}}_{i}^{k}(0) = -{\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}} ={ \overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{m}_{k}}\). By now, φ i ( ⋅) has been determined uniquely.
-
5.
Choose \({\psi }_{i}(0) = -{\varphi }_{i}(0)\) . By now, ψ i ( ⋅) has also been determined uniquely.
-
6.
Set \(i = i + 1\).
-
7.
If i > n, stop.
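The initialization stage (Step 1) can be sketched numerically for a hypothetical time-independent case. All matrices below (two weakly irreducible blocks Q1 and Q2 of \(\widetilde{Q}\), a slow generator Qhat, and the initial aggregated probabilities theta0), as well as the helper quasi_stationary, are made up for illustration and are not from the text: compute the quasi-stationary distributions ν k, assemble the aggregated generator \(\overline{Q}\), and propagate the multipliers \({{\vartheta}}_{0}(t)\).

```python
import numpy as np
from scipy.linalg import expm, null_space

# Hypothetical constant generators: two weakly irreducible 2x2 blocks in Qtilde
# and a slow generator Qhat coupling the blocks.
Q1 = np.array([[-1.0, 1.0], [2.0, -2.0]])
Q2 = np.array([[-3.0, 3.0], [1.0, -1.0]])
Qhat = np.array([[-1.0, 0.0, 1.0, 0.0],
                 [0.0, -1.0, 0.0, 1.0],
                 [1.0, 0.0, -1.0, 0.0],
                 [0.0, 1.0, 0.0, -1.0]])

def quasi_stationary(Q):
    # nu Q = 0 with nu 1l = 1: normalized left null vector.
    v = null_space(Q.T)[:, 0]
    return v / v.sum()

nu1, nu2 = quasi_stationary(Q1), quasi_stationary(Q2)

# Qbar = diag(nu^1, nu^2) Qhat 1l~, the aggregated 2x2 generator.
one_tilde = np.block([[np.ones((2, 1)), np.zeros((2, 1))],
                      [np.zeros((2, 1)), np.ones((2, 1))]])
nu_diag = np.block([[nu1, np.zeros(2)], [np.zeros(2), nu2]])
Qbar = nu_diag @ Qhat @ one_tilde

# theta_0(t) = theta_0(0) exp(Qbar t); phi_0(t) = (theta^1 nu^1, theta^2 nu^2).
theta0 = np.array([0.6, 0.4])
theta_t = theta0 @ expm(Qbar * 1.0)
phi0_t = np.concatenate([theta_t[0] * nu1, theta_t[1] * nu2])
```

With these particular matrices the aggregated generator comes out as \(\overline{Q} = \left(\begin{smallmatrix}-1&1\\ 1&-1\end{smallmatrix}\right)\), so the multipliers relax toward (1∕2, 1∕2) at rate 2.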
3.4 Summary of Results
While the previous subsection gives the computational procedure, this subsection presents the main theorem. It establishes the validity of the asymptotic expansion.
Theorem 4.29.
Suppose conditions (A4.3) and (A4.4) are satisfied. Then the asymptotic expansion
can be constructed as in the computational procedure such that
-
φ i ( ⋅) is \((n + 1 - i)\) - times continuously differentiable on [0, T];
-
| ψ i (t) | ≤ Kexp( − κ0 t) for some κ0 > 0;
-
\(\vert {p}^{\varepsilon }(t) - {y}_{n}^{\varepsilon }(t)\vert = O({\varepsilon }^{n+1})\mbox{ uniformly in }t \in [0,T]\).
Remark 4.30.
In general, in view of Proposition 4.11, the error bound is of the form c 2n (t)exp (−κ 0 t), where c 2n (t) is a polynomial of degree 2n. The exponential constant κ 0 typically depends on n. The larger n is, the smaller κ 0 will be to account for the polynomial c 2n (t).
The following result is a corollary to Theorem 4.29 and will be used in Chapters 5 and 7. Denote the jth component of νk(t) by ν j k(t).
Corollary 4.31.
Assume, in addition to the conditions in Theorem 4.29 with n = 0, that \(\widetilde{Q}(t) =\widetilde{ Q}\) and \(\widehat{Q}(t) =\widehat{ Q}\) are time independent. Then there exist positive constants K and κ 0 ( both independent of ε and t ) such that
where \({{\vartheta}}^{k}(t)\) satisfies
with \(({{\vartheta}}^{1}(0),\ldots,{{\vartheta}}^{l}(0)) = (P({\alpha }^{\varepsilon }(0) \in {\mathcal{M}}_{1}),\ldots,P({\alpha }^{\varepsilon }(0) \in {\mathcal{M}}_{l}))\).
Proof: By a slight modification of the analysis of remainder in Section 4.3, we can obtain (4.71) with a constant K independent of ε and t. The second part of the corollary follows from the uniqueness of the solution to the ordinary differential equation (4.71). □
Remark 4.32.
We mention an alternative approach to establishing the asymptotic expansion. In lieu of the constructive procedure presented previously, one may wish to write φi(t) as a sum of solutions of the homogeneous part and the inhomogeneous part. For instance, one may set
where \({v}_{i}(t) \in {\mathbb{R}}^{l}\) and Ui(t) is a particular solution of the inhomogeneous equation. For i ≥ 0, the equation
and \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\) lead to
Substituting (4.72) into the equation above, and noting that \({\nu }^{k}(t)\mathrm{1}{\mathrm{l}}_{{m}_{k}} = 1\) for k = 1,…,l, and that \(\mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\widetilde{\mathrm{1}\mathrm{l}} = {I}_{l}\), the l × l identity matrix, one obtains
One then proceeds to determine vi(0) via the matching condition. The main ideas are similar, and the details are slightly different.
3.5 An Example
Consider Example 4.20 again. Note that the conditions in (A4.3) and (A4.4) require that
and the jump rates λ(t) and μ(t) be smooth enough.
The probability distribution of the state process is given by p ε(t) satisfying
To solve this set of equations, note that
To proceed, define functions a 12(t), a 13(t), a 24(t), and a 34(t) as follows:
Then using the fact that \({p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) + {p}_{3}^{\varepsilon }(t) + {p}_{4}^{\varepsilon }(t) = 1\), we have
Note also that
The solution to this equation is
Consequently, in view of (4.73), it follows that
In this example, the zeroth-order term is given by
where the quasi-stationary distributions are given by
and the multipliers \(({{\vartheta}}_{0}^{1}(t),{{\vartheta}}_{0}^{2}(t))\) are determined by the differential equation
with initial value \(({{\vartheta}}_{0}^{1}(0),{{\vartheta}}_{0}^{2}(0)) = ({p}_{1}^{0} + {p}_{2}^{0},{p}_{3}^{0} + {p}_{4}^{0})\).
The inner expansion term ψ0(τ) is given by
By virtue of Theorem 4.29,
provided that Q ε(t) is continuously differentiable on [0, T]. Noting the exponential decay of ψ 0(t ∕ ε), we further have
In particular, for any t > 0,
Namely, φ 0(t) is the limit distribution of the Markov chain generated by Q ε(t).
4 Inclusion of Absorbing States
While the case of recurrent states was considered in the previous section, this section concerns the asymptotic expansion for a Markov chain generated by Q ε(t) in which \(\widetilde{Q}(t)\) includes components corresponding to absorbing states. By rearrangement, the matrix \(\widetilde{Q}(t)\) takes the form
where \(\widetilde{{Q}}^{k}(t) \in {\mathbb{R}}^{{m}_{k}\times {m}_{k}}\) for k = 1, 2, …, l, \({0}_{{m}_{a}\times {m}_{a}}\) is an m a ×m a zero matrix, and
Let \({\mathcal{M}}_{a} =\{ {s}_{a1},\ldots,{s}_{a{m}_{a}}\}\) denote the set of absorbing states. We may, as in Section 4.3, represent the state space as
Following the development of Section 4.3, suppose that αε ( ⋅) is a Markov chain generated by \({Q}^{\varepsilon }(\cdot ) =\widetilde{ Q}(\cdot )/\varepsilon +\widehat{ Q}(\cdot )\). Compared with Section 4.3, the difference is that now the dominant part in the generator includes absorbing states corresponding to the m a ×m a matrix \({0}_{{m}_{a}\times {m}_{a}}\). As in the previous case, our interest is to obtain an asymptotic expansion of the probability distribution.
Remark 4.33.
The motivation of the current study stems from the formulation of competitive risk theory discussed in Section 3.3. The idea is that within the m states, there are several groups. Some of them are much riskier than the others (in the sense of the frequency of occurrence of the corresponding risks). The different rates (sensitivities) of risks are modeled by the use of a small parameter ε > 0.
Denote by p ε ( ⋅) the solution of (4.40). The objective here is to obtain an asymptotic expansion
Since the techniques employed are essentially the same as in the previous section, it will be most instructive here to highlight the main ideas. Thus, we only note the main steps and omit most of the details.
Assume conditions (A4.3) and (A4.4) for the current matrices \(\widetilde{{Q}}^{k}(t),\) \(\widetilde{Q}(t)\) , and \(\widehat{Q}(t)\) . For t ∈ [0, T], substituting the expansion above into (4.40) and equating coefficients of εi, for \(i = 1,\ldots,n + 1\), yields
and (with the use of the stretched variable \(\tau = t/\varepsilon \))
For each i ≥ 0, we use the following notation for the partitioned vectors:
In the above, φ i a(t) and ψ i a (τ) are vectors in \({\mathbb{R}}^{1\times {m}_{a}}\).
To determine the outer- and the initial-layer expansions, let us start with i = 0. For each t ∈ [0, T], the use of the partitioned vector φ 0(t) leads to
Note that φ0 a(t) does not show up in any of these equations owing to the \({0}_{{m}_{a}\times {m}_{a}}\) matrix in \(\widetilde{Q}(t)\). It will have to be obtained from the equation in (4.75) corresponding to i = 1. Put another way, φ 0 a (t) is determined mainly by the matrix \(\widehat{Q}(t)\).
Similar to Section 4.3, \({\varphi }_{0}^{k}(t) = {{\vartheta}}_{0}^{k}(t){\nu }^{k}(t)\) , where ν k(t) are the quasi-stationary distributions corresponding to the generators \(\widetilde{{Q}}^{k}(t)\) for k = 1, …, l and \({{\vartheta}}_{0}^{k}(t)\) are the corresponding multipliers. Define
where \({I}_{{m}_{a}}\) is an m a ×m a identity matrix. Clearly, \(\widetilde{\mathrm{1}{\mathrm{l}}}_{a}\) is orthogonal to \(\widetilde{Q}(t)\) for each t ∈ [0, T]. As a result, multiplying (4.75) by \(\widetilde{\mathrm{1}{\mathrm{l}}}_{a}\) from the right with i = 1 leads to
where \({{\vartheta}}_{0}(0) = ({{\vartheta}}_{0}^{1}(0),\ldots,{{\vartheta}}_{0}^{l}(0)).\)
The above initial condition is a consequence of the initial-value consistency condition in (4.53). It is readily seen that
where p 0 = (p 0, 1 , …, p 0, l, p 0, a).
We write
Define
Then (4.77) is equivalent to
This is a linear system of differential equations. Therefore it has a unique solution given by
where X(t, 0) is the principal matrix solution of the homogeneous equation. Thus φ0(t) has been found and is (n + 1)-times continuously differentiable.
Remark 4.34.
Note that in φ0(t), the term φ0 a(t) corresponds to the set of absorbing states \({\mathcal{M}}_{a}\). Clearly, these states cannot be aggregated to a single state as in the case of recurrent states. Nevertheless, the function φ0 a(t) tends to be stabilized in a neighborhood of a constant for t large enough. To illustrate, let us consider a stationary case, that is, both \(\widetilde{Q}(t) =\widetilde{ Q}\) and \(\widehat{Q}(t) =\widehat{ Q}\) are independent of t. Partition \(\widehat{Q}\) as blocks of submatrices
where \(\widehat{{Q}}^{22}\) is an ma × ma matrix. Assume that the eigenvalues of \(\widehat{{Q}}^{22}\) have negative real parts. Then, in view of the definition of \(\overline{Q}(t) = \overline{Q}\) in (4.78), it follows that
Using the partition ψ0(τ) = (ψ0 1(τ), …, ψ 0 l (τ), ψ 0 a (τ)), consider the zeroth-order initial-layer term given by
We obtain
Noting that p 0, a = φ0 a (0) and choosing \({\psi }_{0}(0) = {p}^{0} - {\varphi }_{0}(0)\) lead to \({\psi }_{0}^{a}(\tau ) = {0}_{{m}_{a}}.\) Thus
Similar to the result in Section 4.3, the following lemma holds. The proof is analogous to that of Proposition 4.25.
Lemma 4.35.
Define
Then there exist positive constants K and κ0,0 such that
By virtue of the lemma above and the orthogonality \(({p}^{0} - {\varphi }_{0}(0)){\pi }_{a} = 0\), we have
for some K > 0 and κ 0, 0 > 0 given in Lemma 4.35; that is, ψ0 (τ) decays exponentially fast. Therefore, ψ 0 (τ) has the desired property.
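A small numerical sketch (with a hypothetical \(\widetilde{Q}\) of the form (4.74): one weakly irreducible 2×2 block plus a single absorbing state) illustrates the two facts just used: ψ0 a (τ) stays identically zero once ψ0(0) has zero absorbing component, and the remaining components decay exponentially.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical Qtilde of the form (4.74): a weakly irreducible 2x2 block Q1
# plus a single absorbing state (a zero row).
Q1 = np.array([[-2.0, 2.0], [1.0, -1.0]])
Qtilde = np.block([[Q1, np.zeros((2, 1))], [np.zeros((1, 3))]])

# psi_0(0) = p^0 - phi_0(0) has zero absorbing component, so psi_0^a(tau) = 0;
# the recurrent components are orthogonal to 1l and decay like exp(-3 tau) here.
psi0_init = np.array([0.3, -0.3, 0.0])
psi_tau = psi0_init @ expm(Qtilde * 5.0)   # initial-layer term at tau = 5
```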
We continue in this fashion and proceed to determine the next term φ1(t) as well as ψ1(t ∕ ε). Let
It is easy to check that \({b}_{0}^{a}(t) = {0}_{{m}_{a}}\) . The equation \({\varphi }_{1}(t)\widetilde{Q}(t) = {b}_{0}(t)\) then leads to
The solutions of the l inhomogeneous equations in (4.79) above are of the form
where \({{\vartheta}}_{1}^{k}(t)\) for k = 1, …, l are scalar multipliers. Again, φ 1 a (t) cannot be obtained from the equation above; it must come from the contribution of the matrix-valued function \(\widehat{Q}(t)\).
Note that
Using the equation
one obtains
which in turn implies that
where
Let X(t, s) denote the principal matrix solution to the homogeneous differential equation
Then the solution to (4.80) can be represented by X( t, s) as follows:
Note that the initial conditions φ 1 a (0) and \({{\vartheta}}_{1}^{k}(0)\) for k = 1, …, l need to be determined using the initial-layer terms just as in Section 4.3.
Using (4.76) with i = 1, one obtains an equation that has the same form as that of (4.62). That is,
As in Section 4.3, with the use of π a , it can be shown that | ψ1 (τ) | ≤ Kexp( − κ1, 0τ) for some K > 0 and 0 < κ 1, 0 < κ 0, 0 . By requiring that ψ 1 (τ) decay to 0 as τ → ∞, we obtain the equation
where
Owing to (4.81) and the known form of ψ0(τ),
which is a completely known vector. Thus the solution to (4.81) is
To obtain the desired matching property for the inner-outer expansions, choose
In general, for i = 2, …, n, the initial conditions are selected as follows: For k = 1, 2, …, l, find \({\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) from the equation
Choose
for k = 1, …, l.
Proceeding inductively, we then construct all φ i (t) and ψ i (τ). Moreover, we can verify that there exists 0 < κ i, 0 < κ i − 1, 0 < κ 0, 0 such that | ψ i (τ) | ≤ Kexp( − κ i, 0 τ). This indicates that the inclusion of absorbing states is very similar to the case of all recurrent states. In the zeroth-order outer expansion, there is a component φ0 a(t) that “takes care of” the absorbing states. Note, however, that starting from the leading term (zeroth-order approximation), the matching will be determined not only by the multipliers \({{\vartheta}}_{i}(0)\) but also by the vector ψ i (0) associated with the absorbing states. We summarize the results in the following theorem.
Theorem 4.36.
Consider \(\widetilde{Q}(t)\) given by (4.74), and suppose conditions (A4.3) and (A4.4) are satisfied for the matrix-valued functions \(\widetilde{{Q}}^{k}(\cdot )\) for k = 1,…,l and \(\widehat{Q}(\cdot )\). An asymptotic expansion
exists such that
-
φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T];
-
| ψ i (t) | ≤ Kexp( − κ 0 t) for some 0 < κ 0 < κ i, 0;
-
\(\vert {p}^{\varepsilon }(t) - {y}_{n}^{\varepsilon }(t)\vert = O({\varepsilon }^{n+1})\mbox{ uniformly in }t \in [0,T].\)
Finally, we give a simple example to illustrate the result.
Example 4.37.
Let us consider a Markov chain generated by
where
The chain generated by \(\widetilde{Q}\) is not irreducible; it includes an absorbing state. In this example, \(\overline{Q} = \left (\begin{array}{*{10}c} 0& 0\\ 1 &-1\\ \end{array} \right )\). Let p0 = (p1 0,p2 0,p0,a) denote the initial distribution of αε(⋅). Then solving the forward equation (4.40) gives us
where
Computing φ0(t) yields
It is easy to see that for t > 0,
The limit behavior of the underlying Markov chain as ε → 0 is determined by φ0(t) (for t > 0). Moreover, when t is large, the influence from \(\widehat{Q}\) corresponding to the absorbing state (the vector multiplied by exp (−t)) can be ignored because exp (−t) goes to 0 exponentially fast as t →∞.
5 Inclusion of Transient States
If a Markov chain has transient states, then, relabeling the states through suitable permutations, one can decompose the states into several groups of recurrent states, each of which is weakly irreducible, and a group of transient states. Naturally, we consider the generator \(\widetilde{Q}(t)\) in Q ε(t) having the form
such that for each t ∈ [0, T], and each k = 1, …, l, \(\widetilde{{Q}}^{k}(t)\) is a generator with dimension m k ×m k , \(\widetilde{{Q}}_{{_\ast}}(t)\) is an m ∗ × m ∗ matrix, \(\widetilde{{Q}}_{{_\ast}}^{k}(t) \in {\mathbb{R}}^{{m}_{{_\ast}}\times {m}_{k}}\), and
We continue our study of singularly perturbed chains with weak and strong interactions by incorporating the transient states into the model. Let αε( ⋅) be a Markov chain generated by Q ε ( ⋅), with \({Q}^{\varepsilon }(t) \in {\mathbb{R}}^{m\times m}\) given by (4.39) with \(\widetilde{Q}(t)\) given by (4.82). The state space of the underlying Markov chain is given by
where \({\mathcal{M}}_{k} =\{ {s}_{k1},\ldots,{s}_{k{m}_{k}}\}\) correspond to the recurrent states and \({\mathcal{M}}_{{_\ast}} =\{ {s}_{{_\ast}1},\ldots,{s}_{{_\ast}{m}_{{_\ast}}}\}\) to the transient states.
Since \(\widetilde{Q}(t)\) is a generator, for each k = 1, …, l, \(\widetilde{{Q}}^{k}(t)\) is a generator. Thus the matrix \(\widetilde{{Q}}_{{_\ast}}^{k}(t) = (\widetilde{{q}}_{{_\ast},ij}^{k})\) satisfies \(\widetilde{{q}}_{{_\ast},ij}^{k} \geq 0\) for each i = 1, …, m ∗ and j = 1, …, m k , and \(\widetilde{{Q}}_{{_\ast}}(t) = (\widetilde{{q}}_{{_\ast},ij})\) satisfies
Roughly, the block matrix \((\widetilde{{Q}}_{{_\ast}}^{1}(t),\ldots,\widetilde{{Q}}_{{_\ast}}^{l}(t),\widetilde{{Q}}_{{_\ast}}(t))\) is “negatively dominated” by the matrix \(\widetilde{{Q}}_{{_\ast}}(t)\) . Thus it is natural to assume that \(\widetilde{{Q}}_{{_\ast}}(t)\) is a stable matrix (or Hurwitz, i.e., all its eigenvalues have negative real parts). Comparing with the setups of Sections 4.3 and 4.4, the difference in \(\widetilde{Q}(t)\) is the additional matrices \(\widetilde{{Q}}_{{_\ast}}^{k}(t)\) for k = 1, …, l and \(\widetilde{{Q}}_{{_\ast}}(t)\). Note that \(\widetilde{{Q}}_{{_\ast}}^{k}(t)\) are nonsquare matrices, and \(\widetilde{Q}(t)\) no longer has block-diagonal form.
The formulation here is inspired by the work of Phillips and Kokotovic [175] and Delebecque and Quadrat [44]; see also the recent work of Pan and Başar [164], in which the authors treated a time-invariant \(\widetilde{Q}\) matrix of a similar form. Sections 4.3 and 4.4 together with this section essentially cover the generators of finite-state Markov chains of most practical concern. It ought to be pointed out that just as one cannot in general simultaneously diagonalize two matrices, for Markov chains with weak and strong interactions one cannot simultaneously put both \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\) into the forms mentioned above. Although the model to be studied in this section is slightly more complex than the block-diagonal \(\widetilde{Q}(t)\) in (4.41), we demonstrate that an asymptotic expansion of the probability distribution can still be obtained by using the same techniques of the previous sections. Moreover, it can be seen from the expansion that the underlying Markov chain stays in the transient states only with very small probability. In some cases, for example \(\widehat{Q}(t) = 0\), these transient states can be ignored; see Remark 4.40 for more details.
To incorporate the transient states, we need the following conditions. The main addition is the assumption that \(\widetilde{{Q}}_{{_\ast}}(t)\) is stable.
-
For each t ∈ [0, T] and k = 1, …, l, \(\widetilde{Q}(t),\) \(\widehat{Q}(t)\) , and \(\widetilde{{Q}}^{k}(t)\) satisfy (A4.3) and (A4.4).
-
For each t ∈ [0, T], \(\widetilde{{Q}}_{{_\ast}}(t)\) is Hurwitz (i.e., all of its eigenvalues have negative real parts).
-
Remark 4.38.
Condition (A4.6) indicates the inclusion of transient states. Since \(\widetilde{{Q}}_{{_\ast}}(t)\) is Hurwitz, it is nonsingular. Thus the inverse matrix \(\widetilde{{Q}}_{{_\ast}}^{-1}(t)\) exists for each t ∈ [0,T].
Let p ε( ⋅) denote the solution to (4.40) with \(\widetilde{Q}(t)\) specified in (4.82). We seek asymptotic expansions of p ε( ⋅) having the form
The development is very similar to that of Section 4.3, so the details are not repeated verbatim. Instead, only the salient features will be brought out.
Substituting y n ε(t) into the forward equation and equating coefficients of εi for i = 1, …, n lead to the equations
and with the change of time scale \(\tau = t/\varepsilon \),
As far as the expansions are concerned, the equations have exactly the same form as that of Section 4.3. Note, however, that the partitioned vector φ i (t) has the form
where φ i k(t), k = 1, …, l, is an m k row vector and φ i ∗ (t) is an m ∗ row vector. A similar partition holds for the vector ψ i (t). To construct these functions, we begin with i = 0. Writing \({\varphi }_{0}(t)\widetilde{Q}(t) = 0\) in terms of the corresponding partition, we have
Since \(\widetilde{{Q}}_{{_\ast}}(t)\) is stable, it is nonsingular. The last equation above implies \({\varphi }_{0}^{{_\ast}}(t) = {0}_{{m}_{{_\ast}}} = (0,\ldots,0) \in {\mathbb{R}}^{1\times {m}_{{_\ast}}}\). Consequently, as in the previous section, for each k = 1, …, l, the weak irreducibility of \(\widetilde{{Q}}^{k}(t)\) implies that \({\varphi }_{0}^{k}(t) = {{\vartheta}}_{0}^{k}(t){\nu }^{k}(t)\), for some scalar function \({{\vartheta}}_{0}^{k}(t)\). Equivalently,
Comparing the equation above with the corresponding expression of φ0(t) in Section 4.3, the only difference is the addition of the m ∗ -dimensional row vector \({0}_{{m}_{{_\ast}}}\).
Remark 4.39.
Note that the dominant term in the asymptotic expansion is φ0(t), in which the probabilities corresponding to the transient states are 0. Thus, the probability corresponding to αε(t) ∈{ transient states } is negligibly small.
Define
where \({a}_{{m}_{k}}(t) = -\widetilde{{Q}}_{{_\ast}}^{-1}(t)\widetilde{{Q}}_{{_\ast}}^{k}(t)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) for k = 1, …, l, and \({0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}}\) is the zero matrix in \({\mathbb{R}}^{{m}_{{_\ast}}\times {m}_{{_\ast}}}\).
It is readily seen that
In view of (4.83), it follows that
where
We write \(\widehat{Q}(t)\) as follows:
where for each t ∈ [0, T],
Let
Then \(\overline{Q}(t) = \mathrm{diag}({\overline{Q}}_{{_\ast}}(t),{0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}})\). Moreover, the differential equation (4.86) becomes
Remark 4.40.
Note that the submatrix \(\widehat{{Q}}^{12}(t)\) in \(\widehat{Q}(t)\) determines the jump rates of the underlying Markov chain from a recurrent state in \({\mathcal{M}}_{1} \cup \cdots \cup {\mathcal{M}}_{l}\) to a transient state in \({\mathcal{M}}_{{_\ast}}\). If the magnitude of the entries of \(\widehat{{Q}}^{12}(t)\) is small, then the transient state can be safely ignored because the contribution of \(\widehat{{Q}}^{12}(t)\) to \(\overline{Q}(t)\) is small. On the other hand, if \(\widehat{{Q}}^{12}(t)\) is not negligible, then one has to be careful to include the corresponding terms in \(\overline{Q}(t)\).
We now determine the initial value \({{\vartheta}}_{0}^{k}(0)\). In view of the asymptotic expansions y n ε(t) and the initial-value consistency condition in (4.53), it is necessary that for k = 1, …, l,
where p ε(t) = (p ε, 1(t), …, p ε, l(t), p ε, ∗ (t)) is a solution to (4.40). Here p ε, k(t) has dimensions compatible with φ0 k(0) and ψ0 k(0). Similarly, we write the partition of the initial vector as p 0 = (p 0, 1, …, p 0, l, p 0, ∗ ). The next theorem establishes the desired consistency of the initial values. Its proof is placed in Appendix A.4.
Theorem 4.41.
Assume (A4.5) and (A4.6) . Then for k = 1,…,l,
Remark 4.42.
In view of this theorem, the initial value should be given as
Therefore, in view of (4.88), to make sure that the initial condition satisfies the probabilistic interpretation, it is necessary that
In view of the structure of the \(\widetilde{Q}(0)\) matrix, for each k = 1,…,l, all components of the vector \(\widetilde{{Q}}_{{_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) are nonnegative. Note that the solution of the differential equation
is \({p}^{0}\exp (\widetilde{Q}(0)t)\). This implies that all components of \({p}^{0,{_\ast}}\exp (\widetilde{{Q}}_{{_\ast}}(0)t)\) are nonnegative. By virtue of the stability of \(\widetilde{{Q}}_{{_\ast}}(0)\),
Thus all components of \(-{p}^{0,{_\ast}}\widetilde{{Q}}_{{_\ast}}^{-1}(0)\) are nonnegative, and as a result, the inner product
is nonnegative. It follows that for each k = 1,…,l, \({{\vartheta}}_{0}^{k}(0) \geq {p}^{0,k}\mathrm{1}{\mathrm{l}}_{{m}_{k}} \geq 0\). Moreover,
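The nonnegativity of \(-{p}^{0,{_\ast}}\widetilde{{Q}}_{{_\ast}}^{-1}(0)\) can also be seen numerically: for a Hurwitz matrix of the transient-block type, \(-\widetilde{{Q}}_{{_\ast}}^{-1} = {\int }_{0}^{\infty }\exp (\widetilde{{Q}}_{{_\ast}}t)\,dt\) has nonnegative entries, since the integrand is an entrywise nonnegative substochastic semigroup. A sketch with a hypothetical 2 × 2 block:

```python
import numpy as np

# Hypothetical transient block: strictly negative row sums, hence Hurwitz.
Qstar = np.array([[-3.0, 1.0], [0.5, -2.0]])

eigs = np.linalg.eigvals(Qstar)
# For a Hurwitz matrix, -Qstar^{-1} equals the integral of expm(Qstar * t)
# over [0, inf), an entrywise nonnegative matrix.
neg_inv = -np.linalg.inv(Qstar)
```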
Before treating the terms in ψ0( ⋅), let us give an estimate on \(\exp (\widetilde{Q}(0)t)\).
Lemma 4.43.
Set
Then there exist positive constants K and κ 0,0 such that
for τ ≥ 0.
Proof: To prove (4.90), it suffices to show for any m-row vector y 0,
Given \({y}^{0} = ({y}^{0,1},\ldots,{y}^{0,l},{y}^{0,{_\ast}}) \in {\mathbb{R}}^{1\times m}\), let
Then, y(τ) is a solution to
It follows that
and for k = 1, …, l,
For each k = 1, …, l, we have
By virtue of the stability of \(\widetilde{{Q}}_{{_\ast}}(0)\), the last term above is bounded above by K | y 0, ∗ | exp( − κ ∗ τ). Recall that by virtue of Lemma 4.4, for some κ0, k > 0,
Choose κ0, 0 = min(κ ∗ , min k {κ0, k }). The terms in the second and the third lines above are bounded by K | y 0 | exp( − κ0, 0τ). The desired estimate thus follows. □
Next consider the first equation in the initial-layer expansions:
The solution to this equation can be written as
To be able to match the asymptotic expansion, choose
Thus,
By virtue of the choice of φ0(0), it is easy to show that
Therefore, in view of Lemma 4.43, ψ0( ⋅) decays exponentially fast in that for some constants K and κ0, 0 > 0 given in Lemma 4.43,
We have obtained φ0( ⋅) and ψ0( ⋅). To proceed, set
and
Note that b 0(t) is a completely known function.
In view of the second equation in (4.83),
Solving the last equation in (4.91) yields
Putting this back into the first l equations of (4.91) leads to
Again, the right side is a known function. In view of the choice of φ0( ⋅) and (4.86), we have \({b}_{0}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) = 0\). This implies
Therefore, (4.92) has a particular solution \(\widetilde{{b}}_{0}^{k}(t)\) with
As in the previous section, we write the solution of φ1 k(t) as a sum of the homogeneous solution and a solution of the inhomogeneous equation \(\widetilde{{b}}_{0}^{k}(t)\), that is,
In view of
using the equation
we obtain that
The initial value \({{\vartheta}}_{1}(0)\) will be determined in conjunction with the initial value of ψ1( ⋅) next.
Note that in comparison with the differential equation governing \({{\vartheta}}_{1}(t)\) in Section 4.3, the equation (4.93) has an extra term involving the derivative of \(\widetilde{{b}}_{0}^{{_\ast}}(t)\).
To determine ψ1( ⋅), solving the equation in (4.84) with i = 1, we have
Choose the initial values of ψ1(0) and \({{\vartheta}}_{1}^{k}(0)\) as follows:
Write \({\overline{\psi }}_{0} = ({\overline{\psi }}_{0}^{1},\ldots,{\overline{\psi }}_{0}^{l},{\overline{\psi }}_{0}^{{_\ast}})\). Then the definition of π ∗ implies that
Recall that
and
It follows that
Moreover, it can be verified that | ψ1(τ) | ≤ Kexp( − κ1, 0τ) for some 0 < κ1, 0 < κ0, 0.
Remark 4.44.
Note that there is an extra term
involved in the equation determining \({{\vartheta}}_{1}(0)\) in (4.94). This term does not vanish as in Section 4.3 because generally \(((d/dt)\widetilde{Q}(0)){\pi }_{{_\ast}}\neq 0\).
To obtain the desired asymptotic expansion, continue inductively. For each i = 2, …, n, we first obtain the solution of φ i (t) with the “multiplier” given by the solution of the differential equation but with unspecified condition \({{\vartheta}}_{i}(0)\); solve ψ i (t) with the as yet unavailable initial condition \({\psi }_{i}(0) = -{\varphi }_{i}(0)\). Next jointly prove the exponential decay properties of ψ i (τ) and obtain the solution \({{\vartheta}}_{i}(0)\). The equation to determine \({{\vartheta}}_{i}(0)\) with transient states becomes
In this way, we have constructed the asymptotic expansion with transient states. In addition, we can show that φ i ( ⋅) are smooth and ψ i ( ⋅) satisfies | ψ i (τ) | ≤ Kexp( − κ i, 0τ) for some 0 < κ i, 0 < κ i − 1, 0 < κ0, 0. Similarly as in the case with all recurrent states, we establish the following theorem.
Theorem 4.45.
Suppose (A4.5) and (A4.6) hold. Then an asymptotic expansion
can be constructed such that for i = 0,…,n,
-
φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T];
-
| ψ i (t) | ≤ Kexp( − κ0 t) for some K > 0 and 0 < κ0 < κ i, 0;
-
\(\vert {p}^{\varepsilon }(t) - {y}_{n}^{\varepsilon }(t)\vert = O({\varepsilon }^{n+1})\mbox{ uniformly in }t \in [0,T].\)
Example 4.46.
Let \(\widetilde{Q}(t) =\widetilde{ Q},\) a constant matrix such that
In this example,
The last two rows in \(\widetilde{Q}\) represent the jump rates corresponding to the transient states. The matrix \(\widetilde{{Q}}^{1}\) is weakly irreducible and \(\widetilde{{Q}}_{{_\ast}}\) is stable. Solving the forward equation gives us
where
It is easy to see that \({\varphi }_{0}(t) = (1/2,1/2,0,0)\) and
The limit behavior of the underlying Markov chain as ε → 0 is determined by φ0(t) for t > 0. It is clear that the probability of the Markov chain staying at the transient states is very small for small ε.
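A numerical sketch in the spirit of this example (the matrices below are hypothetical, chosen so that \(\widetilde{Q}\) has the form (4.82) with one weakly irreducible recurrent block and two transient states) confirms both Remark 4.39 and the computed limit: even starting entirely in the transient states, \({p}^{0}\exp (\widetilde{Q}\tau )\) approaches (1∕2, 1∕2, 0, 0).

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical Qtilde of the form (4.82): one weakly irreducible 2x2 block,
# two transient states, rows summing to zero, and a Hurwitz transient block Qs.
Q1  = np.array([[-1.0, 1.0], [1.0, -1.0]])
Qs1 = np.array([[1.0, 0.0], [0.0, 1.0]])   # transient-to-recurrent jump rates
Qs  = np.array([[-2.0, 1.0], [1.0, -2.0]])
Qtilde = np.block([[Q1, np.zeros((2, 2))], [Qs1, Qs]])

p0 = np.array([0.0, 0.0, 0.5, 0.5])        # start entirely in the transient states
p_tau = p0 @ expm(Qtilde * 30.0)           # mass drains into the recurrent block
```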
Remark 4.47.
The model discussed in this section has the extra ingredient of transient states as compared with that of Section 4.3. The main feature is embedded in the last few rows of the \(\widetilde{Q}(t)\) matrix. One of the crucial points here is that the matrix \(\widetilde{{Q}}_{{_\ast}}(t)\) in the right corner is Hurwitz. This stability condition guarantees the exponential decay properties of the boundary layers. As far as the regular part (or outer) expansion is concerned, we have that the last subvector φ0 ∗(t) = 0. The determination of the initial conditions \({{\vartheta}}_{i}(0)\) uses the same technique as before, namely, matching the outer terms and inner layers. The procedure involves recursively solving a sequence of algebraic and differential equations. Although the model is seemingly more general, the methods and techniques involved in obtaining the asymptotic expansion and proving the results are essentially the same as in the previous section. Nevertheless, the notation is slightly more complex.
6 Remarks on Countable-State-Space Cases
6.1 Countable-State Spaces: Part I
This section presents an extension of the singularly perturbed Markov chains with fast and slow components and finite-state spaces. In this section, the generator \(\widetilde{Q}(\cdot )\) is a block-diagonal matrix consisting of infinitely many blocks each of which is of finite dimension. The generator Q ε(t) still has the form (4.39). However,
where \(\widetilde{{Q}}^{k}(t) \in {\mathbb{R}}^{{m}_{k}\times {m}_{k}}\) is a generator of an appropriate Markov chain with finite-state space, and \(\widehat{Q}(t)\) is an infinite-dimensional matrix and is a generator of a Markov chain having a countable-state space, that is, \(\widehat{Q}(t) = (\widehat{{q}}_{ij}(t))\) such that
We aim at deriving asymptotic results under the current setting. To do so, assume that the following condition holds:
-
For t ∈ [0, T], \(\widetilde{{Q}}^{k}(t)\), for k = 1, 2, …, are weakly irreducible.
-
Parallel to the development of Section 4.3, the solution of φ i ( ⋅) can be constructed similar to that of Theorem 4.29 as in (4.44) and (4.45). In fact, we obtain φ0( ⋅) from (4.49) and (4.50) with l = ∞; the difference is that now we have an infinite number of equations. Similarly, for all k = 1, 2, … and \(i = 0,1,\ldots,n + 1\), φ i ( ⋅) can be obtained from
The problem is converted to one that involves infinitely many algebraic differential equations. The same technique as presented before still works.
Nevertheless, the boundary layer corrections deserve more attention. Let us start with ψ0( ⋅), which is the solution of the abstract Cauchy problem
To continue our study, one needs the notion of semigroup (see Dunford and Schwartz [52], and Pazy [172]). Recall that for a Banach space \(\mathbb{B}\), a one-parameter family T(t), 0 ≤ t < ∞, of bounded linear operators from \(\mathbb{B}\) into \(\mathbb{B}\) is a semigroup of bounded linear operators on \(\mathbb{B}\) if (i) T(0) = I and (ii) \(T(t + s) = T(t)T(s)\) for every t, s ≥ 0.
Let \({\mathbb{R}}^{\infty }\) be the sequence space with a canonical element \(x = ({x}_{1},{x}_{2},\ldots ) \in {\mathbb{R}}^{\infty }\). Let \(A = ({a}_{ij})\) be a matrix defining an operator \(A : {\mathbb{R}}^{\infty }\mapsto {\mathbb{R}}^{\infty }\), equipped with the l1-norm
(see Hutson and Pym [90, p. 74]). Using the definition of semigroup above, the solution of (4.97) is
where T(τ) is a one-parameter family of semigroups generated by \(\widetilde{Q}(0)\). Moreover, since \(\widetilde{Q}(0)\) is a bounded linear operator, \(\exp (\widetilde{Q}(0)\tau )\) still makes sense. Thus \(T(\tau ){\psi }_{0}(0) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau )\), where
Therefore, the solution has the same form as in the previous section. Under (A4.7), exactly the same argument as in the proof of Lemma 4.4 yields that for each k = 1, 2, …,
and the convergence takes place at an exponential rate, that is,
for some κ k > 0. To obtain a valid asymptotic expansion, an additional assumption is needed, namely, that the κ k , k = 1, 2, …, are uniformly bounded below by a positive constant κ0.
(A4.8) There exists a positive number κ0 = min k {κ k } > 0.
Set
In view of (A4.8)
The exponential decay property of ψ0( ⋅) is thus established. Likewise, it can be proved that all ψ i ( ⋅), \(i = 1,\ldots,n + 1\), satisfy the exponential decay property. From here on, we can proceed as in the previous section to obtain the error estimate and verify the validity of the asymptotic expansion. In short, the following theorem is obtained.
Theorem 4.48.
Suppose conditions (A4.7) and (A4.8) are satisfied. Then the results in Theorem 4.29 hold for the countable-state-space model with \(\widetilde{Q}(\cdot )\) given by (4.95).
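The semigroup characterization used above can be checked in a finite-dimensional toy setting, where T(τ) is simply the matrix exponential \(\exp (\widetilde{Q}(0)\tau )\). A minimal numpy sketch, assuming a hypothetical 3-state weakly irreducible generator (the series-based expm and all data below are illustrative, not from the text):

```python
import numpy as np

def expm(A, terms=40):
    """Matrix exponential by scaling and squaring with a Taylor series;
    adequate for the small matrices in this sketch."""
    nrm = np.abs(A).sum(axis=1).max()
    s = max(0, int(np.ceil(np.log2(nrm))) + 1) if nrm > 0 else 0
    B = A / 2**s
    E, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ B / k
        E = E + term
    for _ in range(s):
        E = E @ E
    return E

# Hypothetical weakly irreducible generator Q~(0) (rows sum to zero).
Q = np.array([[-2.0, 1.0, 1.0],
              [1.0, -3.0, 2.0],
              [0.5, 0.5, -1.0]])

def T(tau):                 # the semigroup generated by Q~(0)
    return expm(Q * tau)

# Semigroup axioms: T(0) = I and T(t + s) = T(t) T(s).
assert np.allclose(T(0.0), np.eye(3))
assert np.allclose(T(0.7), T(0.3) @ T(0.4))

# psi0(tau) = psi0(0) exp(Q~(0) tau) solves d psi0/d tau = psi0 Q~(0);
# check against a centered difference quotient.
psi0 = np.array([0.3, -0.1, -0.2])     # illustrative initial data
h, tau = 1e-5, 0.5
lhs = (psi0 @ T(tau + h) - psi0 @ T(tau - h)) / (2 * h)
assert np.allclose(lhs, psi0 @ T(tau) @ Q, atol=1e-6)
print("semigroup axioms and Cauchy problem verified")
```

In the countable-state case the same formula holds, with the series for \(\exp (\widetilde{Q}(0)\tau )\) converging in the l 1-operator norm because \(\widetilde{Q}(0)\) is bounded.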
6.2 Countable-State Spaces: Part II
The aim of this section is to develop further results on singularly perturbed Markov chains with fast and slow components whose generators are infinite-dimensional matrices, but of a form different from that described in Section 4.6.1. Both the complexity and the difficulty increase, and a number of technical issues arise. One idea suggests itself almost immediately: approximate the underlying system via a Galerkin-type procedure, that is, replace the infinite-dimensional system by finite-dimensional truncations. Unfortunately, this does not work in the setting of this section. We will return to this question at the end of the section.
To proceed, as in the previous sections, the first step invariably involves the solution of algebraic differential equations in the construction of the approximating functions. One of the main ideas used is the Fredholm alternative. There are analogues in the general Banach-space setting for compact operators; nevertheless, infinite-dimensional matrices are in fact more difficult to handle.
Throughout this section, we treat the class of generators with | Q(t) | 1 < ∞ only. We use 1 l to denote the column vector with all components equal to 1. Consider (1 l⋮Q(t)) as an operator for a generator Q(t) of a Markov chain with state space \(\mathcal{M} =\{ 1,2,\ldots \}\). To proceed, we first give the definitions of irreducibility and quasi-stationary distribution. Set Q c (t) : = (1 l⋮Q(t)).
Definition 4.49.
The generator Q(t) is said to be weakly irreducible at t0 ∈ [0,T], for \(w \in {\mathbb{R}}^{\infty }\), if the equation wQc(t0) = 0 has only the zero solution. If Q(t) is weakly irreducible for each t ∈ [0,T], then it is said to be weakly irreducible on [0,T].
Definition 4.50.
A quasi-stationary distribution ν(t) (with respect to Q(t)) is a solution to (2.8) with the finite summation replaced by \(\sum\limits_{i=1}^{\infty }{\nu }_{i}(t) = 1\) that satisfies ν(t) ≥ 0.
As was mentioned before, the Fredholm alternative plays an important role in our study. For infinite-dimensional systems, we state another definition to take this into account.
Definition 4.51.
A generator Q(t) satisfies the F-Property if wQc(t) = b has a unique solution for each \(b \in {\mathbb{R}}^{\infty }.\)
Note that for all weakly irreducible generators of finite dimension (i.e., generators of Markov chains with finite state space), the F-Property above is automatically satisfied.
Note that 1 l ∈ l ∞ (where l ∞ denotes the sequence space equipped with the l ∞ norm) and, for each t ∈ [0, T], \(Q(t) \in {\mathbb{R}}^{\infty }\times {\mathbb{R}}^{\infty }\). Naturally, we use the norm
It is easily seen that
If a generator Q(t) satisfies the F-Property, then it is weakly irreducible. In fact, if Q(t) satisfies the F-Property on [0, T], then yQ c (t) = 0 has a unique solution y = 0.
By the definition of the generator, in particular the q-Property, Q c (t) is a bounded linear operator for each t ∈ [0, T]. If Q c (t) is bijective (i.e., one-to-one and onto), then it has a bounded inverse. This, in turn, implies that Q c (t) exhibits the F-Property. Roughly, the F-Property is a generalization of the conditions in dealing with finite-dimensional spaces. Recall from Section 4.2 that although fQ(t) = b is not solvable uniquely, by adding an equation f1 l = c, the system has a unique solution.
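In finite dimensions the augmentation just recalled can be made concrete: fQ(t) = b alone is singular (Q has eigenvalue 0), but appending f1 l = c restores uniqueness. A numpy sketch with a hypothetical 3-state generator; taking b = 0 and c = 1 recovers the quasi-stationary distribution:

```python
import numpy as np

# Hypothetical weakly irreducible 3-state generator (rows sum to zero).
Q = np.array([[-1.0, 0.5, 0.5],
              [0.2, -0.7, 0.5],
              [0.4, 0.6, -1.0]])
ones = np.ones((3, 1))

# Q alone is singular, so f Q = b is never uniquely solvable.
assert abs(np.linalg.det(Q)) < 1e-12

def solve_augmented(b, c):
    """Solve f (1l | Q) = (c, b); full column rank makes f unique."""
    Qc = np.hstack([ones, Q])              # the operator Qc = (1l | Q)
    rhs = np.concatenate([[c], b])
    f, res, rank, _ = np.linalg.lstsq(Qc.T, rhs, rcond=None)
    assert rank == 3                       # unique solution
    return f

# Quasi-stationary distribution: nu Q = 0 with nu . 1l = 1.
nu = solve_augmented(np.zeros(3), 1.0)
assert np.allclose(nu @ Q, 0, atol=1e-10)
assert np.isclose(nu.sum(), 1.0)
assert (nu >= 0).all()
assert np.allclose(nu, [2 / 9, 4 / 9, 1 / 3])   # exact for this Q
print("quasi-stationary distribution:", nu)
```

The F-Property of the text plays the role of the full-rank condition checked here: it asks that the infinite-dimensional analogue of the augmented operator be uniquely invertible for every right-hand side.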
Owing to the inherent difficulty caused by the infinite dimensionality, the irreducibility and smoothness of Q( ⋅) are not sufficient to guarantee the existence of asymptotic expansions; stronger conditions are needed. In the sequel, for ease of presentation, we consider the model with \(\widetilde{Q}(\cdot )\) irreducible and both \(\widetilde{Q}(\cdot )\) and \(\widehat{Q}(\cdot )\) infinite-dimensional.
For each t, we denote the spectrum of Q(t) by σ(Q(t)). In view of Pazy [172] and Hutson and Pym [90], we have
where σ d (Q(t)), σ c (Q(t)), and σ r (Q(t)) denote the discrete, continuous, and residual spectrum of Q(t), respectively. Standard linear operator theory implies that for a compact operator A, σ r (A) = ∅, and the only possible candidate for σ c (A) is 0. Keeping this in mind, we assume that the following condition holds.
(A4.9)
- (a) The smoothness condition (A4.4) is satisfied.
- (b) The generator \(\widetilde{Q}(t)\) exhibits the F-Property.
- (c) \( \sup\limits_{t\in [0,T]}\vert \widetilde{Q}(t){\vert }_{1} < \infty \) and \( \sup\limits_{t\in [0,T]}\vert \widehat{Q}(t)\vert < \infty \).
- (d) The eigenvalue 0 of \(\widetilde{Q}(t)\) has multiplicity 1, 0 is not an accumulation point of the eigenvalues, and \({\sigma }_{r}(\widetilde{Q}(t)) = \varnothing \).
Remark 4.52.
Item (a) above requires that the smoothness condition be satisfied, and Item (b) requires that the operator \((\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t))\) satisfy a Fredholm-alternative-like condition. Finally, Item (d) indicates that the spectrum of \((\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t))\) is like that of a compact operator. Recall that for a compact linear operator, 0 is in its spectrum, and the only possible accumulation point is 0. Our conditions mimic this situation; they will be used when we prove the exponential decay property of the initial-layer terms.
Theorem 4.53.
Under condition (A4.9) , the results in Theorem 4.29 hold for Markov chains with countable-state space.
Proof: The proof is very similar to its finite-dimensional counterpart; we only point out the differences here.
As far as the regular part is concerned, we get the same equation (4.44). One thing to note is that we can no longer use Cramer’s rule to solve the systems of equations. Without such an explicit representation of the solution, the smoothness of φ i ( ⋅) needs to be proved by examining (4.44) directly. For example,
can be rewritten as
Since \(\widetilde{Q}(t)\) satisfies the F-Property, this equation has a unique solution.
To verify the differentiability, consider also
Examining the difference quotient leads to
Taking the limit as δ → 0 and by virtue of the smoothness of \(\widetilde{Q}(\cdot )\), we have
That is (d ∕ dt)φ0(t) exists and is given by the solution of
Again by the F-Property, there is a unique solution for this equation. Higher-order derivatives of φ0( ⋅) and smoothness of φ i ( ⋅) can be proved in a similar way.
As far as the initial-layer terms are concerned, since \(\widetilde{Q}(0)\) is a bounded linear operator, the semigroup interpretation \(\exp (\widetilde{Q}(0)\tau )\) makes sense. It follows from Theorem 1.4 of Pazy [172, p. 104] that the equation
has a unique solution.
To show that ψ0( ⋅) decays exponentially fast, we use an argument that is analogous to the finite-dimensional counterpart. Roughly, since the multiplicity of the eigenvalue 0 is 1, the subspace generated by the corresponding eigenvector v 0 is one-dimensional. Similar to the situation of Section 4.2, \( \lim\limits_{\tau \rightarrow \infty }\exp (\widetilde{Q}(0)\tau )\) exists and the limit must have identical rows. Denote the limit by \(\overline{P}\). It then follows that
The meaning should be clear: upon “subtracting” the projection onto the subspace generated by v 0, the remainder behaves like exp( − κ0τ). A similar argument works for \(i = 1,\ldots,n + 1\), so the ψ i ( ⋅) decay exponentially fast. □
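The decay mechanism in the proof can be observed numerically in the finite-dimensional analogue: \(\exp (\widetilde{Q}(0)\tau )\) converges to a matrix \(\overline{P}\) with identical rows, at the rate κ0 given by the spectral gap. A sketch with a hypothetical 3-state generator (all constants below are illustrative):

```python
import numpy as np

# Hypothetical weakly irreducible generator (rows sum to zero).
Q = np.array([[-2.0, 1.0, 1.0],
              [1.0, -3.0, 2.0],
              [0.5, 0.5, -1.0]])

# Spectral form of exp(Q tau); this Q has distinct eigenvalues,
# hence is diagonalizable.
lam, V = np.linalg.eig(Q)
Vinv = np.linalg.inv(V)
def expQ(tau):
    return ((V * np.exp(lam * tau)) @ Vinv).real

# kappa0 is the spectral gap: minus the largest real part among the
# nonzero eigenvalues (for this Q, exactly 3 - sqrt(1/2)).
kappa0 = -lam.real[np.abs(lam) > 1e-9].max()
assert np.isclose(kappa0, 3 - np.sqrt(0.5))

# Pbar = lim exp(Q tau) has identical rows, each row being the
# quasi-stationary distribution of Q (here (4/17, 3/17, 10/17)).
Pbar = expQ(50.0)
assert np.allclose(Pbar[0], Pbar[1]) and np.allclose(Pbar[1], Pbar[2])
assert np.allclose(Pbar[0], [4 / 17, 3 / 17, 10 / 17])

# After "subtracting" Pbar, the semigroup decays like exp(-kappa0 tau).
for tau in [1.0, 2.0, 4.0, 8.0]:
    err = np.abs(expQ(tau) - Pbar).max()
    assert err <= 50.0 * np.exp(-kappa0 * tau)
print("kappa0 =", kappa0)
```

In the countable-state case the same picture holds under (A4.9)(d): the multiplicity-1 eigenvalue 0 contributes \(\overline{P}\), and the rest of the spectrum, bounded away from 0, gives the exponential rate.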
6.3 A Remark on Finite-Dimensional Approximation
Concerning the cases in Section 4.6.2, a typical way of dealing with infinite-dimensional Markov chains is to make a finite-dimensional approximation. Let Q(t) = (q ij (t)), t ≥ 0, denote a generator of a Markov chain with countable-state space. We consider an N ×N, N = 1, 2, …, truncation matrix \({Q}_{N}(t) = {({q}_{ij}(t))}_{i,j=1}^{N}\). Then Q N (t) is a subgenerator in the sense that \(\sum\limits_{j=1}^{N}{q}_{ij}(t) \leq 0\), i = 1, 2, …, N.
A first glance seems to indicate that the idea of subgenerator provides a way to treat the problem of approximating an infinite-dimensional generator by finite-dimensional matrices. In fact, Reuter and Ledermann used such an idea to derive the existence and uniqueness of the solution to the forward equation (see Bharucha-Reid [10]). Dealing with singularly perturbed chains with countable-state space, one would be interested in knowing whether a Galerkin-like approximation would work in the sense that an asymptotic expansion of a finite-dimensional system would provide an approximation to the probability distribution. To be more precise, let αε( ⋅) denote the Markov chain generated by Q(t) ∕ ε and let
Consider the following approximation via N-dimensional systems
Using the techniques presented in the previous sections, we can find outer and inner expansions to approximate \({p}^{\varepsilon,N}(t)\). The questions are these: For small ε and large N, can we approximate \({p}^{\varepsilon }(t)\) by \({p}^{\varepsilon,N}(t)\)? Can we approximate \({p}^{\varepsilon,N}(t)\) by \({y}_{n}^{\varepsilon,N}(t)\), where \({y}_{n}^{\varepsilon,N}(t)\) is an expansion of the form (4.43) when subgenerators are used? More importantly, can we use \({y}_{n}^{\varepsilon,N}(t)\) to approximate \({p}^{\varepsilon }(t)\)?
Although \({p}_{i}^{\varepsilon }(t)\) can be approximated by its truncation \({p}_{i}^{\varepsilon,N}(t)\) for large N, and \({p}^{\varepsilon,N}(t)\) can be expanded as \({y}_{n}^{\varepsilon,N}(t)\) for small ε, the approximation of \({p}^{\varepsilon }(t)\) by \({y}_{n}^{\varepsilon,N}(t)\) does not work in general, because the limits as ε → 0 and N → ∞ are not interchangeable. This can be seen by considering the following example.
Let
Then for any N, the truncation matrix \({Q}_{N}\) has only negative eigenvalues. It follows that the solution \({p}^{\varepsilon,N}(t)\) decays exponentially fast, i.e.,
Thus, all terms in the regular part of \({y}_{n}^{\varepsilon,N}\) vanish. It is clear from this example that \({y}_{n}^{\varepsilon,N}(t)\) cannot be used to approximate \({p}^{\varepsilon }(t)\).
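The failure of the interchange of limits can be reproduced with a simple hypothetical generator (the entries of the example displayed above are not shown here, so the following is an illustration of the same phenomenon, not a copy of it): take q ii = −1 and q i,i+1 = 1, so that probability mass drifts toward ever-larger states. Every truncation is then a subgenerator with strictly negative spectrum:

```python
import math
import numpy as np

# Hypothetical generator on {1, 2, ...}: q_ii = -1, q_{i,i+1} = 1,
# all other entries 0; the full rows sum to zero.
N = 5
QN = -np.eye(N) + np.eye(N, k=1)        # N x N truncation (subgenerator)

# The truncation is upper triangular with diagonal -1, so its
# spectrum is {-1}: strictly negative.
assert np.allclose(np.linalg.eigvals(QN), -1.0)

# Starting from state 1, the first row of exp(QN t) is e^{-t} t^j / j!,
# so the mass retained by the truncated system is the Poisson
# probability P(Poisson(t) < N), which tends to 0 as t grows.
t = 30.0
mass = math.exp(-t) * sum(t**k / math.factorial(k) for k in range(N))
assert mass < 1e-8

# The full chain conserves total mass for all t (row sums are zero),
# so p^eps(t) does not decay: the limits eps -> 0 and N -> infinity
# do not commute, and the truncated expansion cannot approximate it.
print(f"retained mass for N={N}, t={t}: {mass:.2e}")
```

The design point is that the truncation turns a conservative generator into a strictly stable one, so the regular part of the truncated expansion is identically zero while the true distribution is not.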
7 Remarks on Singularly Perturbed Diffusions
In this section, we present some related results on singular perturbations of diffusions. If in lieu of a discrete state space, one considers a continuous-state space, then naturally the singularly perturbed Markov chains become singularly perturbed Markov processes. We illustrate the idea of matched asymptotic expansions for singularly perturbed diffusions. In this section, we only summarize the results and refer the reader to Khasminskii and Yin [116] for details of proofs. To proceed, consider the following example.
Example 4.54.
This example discusses a model arising from stochastic control, namely, a controlled singularly perturbed system. As pointed out in Kushner [140] and Kokotovic, Bensoussan, and Blankenship [127], many control problems can be modeled by systems of differential equations, where the state variables can be divided into two coupled groups, consisting of “fast” and “slow” variables. A typical system takes the form
where w1(⋅) and w2(⋅) are independent Brownian motions, fi(⋅) and σi(⋅) for i = 1, 2 are suitable functions, u is the control variable, and ε > 0 is a small parameter. The underlying control problem is to minimize the cost function
where R(⋅) is the running cost function. The small parameter ε > 0 signifies the relative rates of x1 ε and x2 ε. Such singularly perturbed systems have drawn much attention (see Bensoussan [8], Kushner [140], and the references therein). The system is very difficult to analyze directly; the approach of Kushner [140] is to use weak convergence methods to approximate the total system by the reduced system that is obtained using the differential equation for the slow variable, where the fast variable is fixed at its steady-state value as a function of the slow variable. In order to gain further insight, it is crucial to understand the asymptotic behavior of the rapidly changing process x2 ε through the transition density given by the solution of the corresponding Kolmogorov-Fokker-Planck equations.
As demonstrated in the example above, a challenge common to many applications is to study the asymptotic behavior of the following problem. Let ε > 0 be a small parameter, and let X 1 ε( ⋅) and X 2 ε( ⋅) be real-valued diffusion processes satisfying
where the real-valued functions a 1(t, x 1, x 2), a 2(t, x 1, x 2), σ1(t, x 1, x 2), and σ2(t, x 1, x 2) represent the drift and diffusion coefficients, respectively, and w 1( ⋅) and w 2( ⋅) are independent and standard Brownian motions. Define a vector X as X = (X 1, X 2)′. Then X ε( ⋅) = (X 1 ε( ⋅), X 2 ε( ⋅))′ is a diffusion process. This is a model treated in Khasminskii [113], in which a probabilistic approach was employed. It was shown that as ε → 0, the fast component is averaged out and the slow component X 1 ε( ⋅) has a limit X 1 0( ⋅) such that
where
and μ( ⋅) is a limit density of the fast process X 2 ε( ⋅).
To proceed further, it is necessary to investigate the limit properties of the rapidly changing process X 2 ε( ⋅). To do so, consider the transition density of the underlying diffusion process. It is known that it satisfies the forward equation
where
Similar to the discrete-state-space cases, the basic problems to be addressed are these: As ε → 0, does the system display certain asymptotic properties? Is there any equilibrium distribution? If p ε(t, x 1, x 2) → p(t, x 1, x 2) for some function p( ⋅), can one get a handle on the error bound (i.e., a bound on | p ε(t, x 1, x 2) − p(t, x 1, x 2) | )?
To obtain the desired asymptotic expansion in this case, one needs to make sure the quasi-stationary density exists. Note that for diffusions in unbounded domains, the quasi-stationary density may not exist. Loosely, for the existence of the quasi-stationary distribution, it is necessary that the Markov process corresponding to \({\mathcal{L}}_{2}^{{_\ast}}\) be positive recurrent for each fixed t. Certain sufficient conditions for the existence of the quasi-stationary density are provided in Il’in and Khasminskii [93]. An alternative way of handling the problem is to concentrate on a compact manifold, on which the existence of the quasi-stationary density can be established. To illustrate, we choose the second alternative and suppose the following conditions are satisfied.
For each t ∈ [0, T], i, j = 1, 2, and
-
-
for each \({x}_{2} \in \mathbb{R}\), a 1(t, ⋅, x 2), σ1 2(t, ⋅, x 2) and p 0( ⋅, x 2) are periodic with period 1;
-
for each \({x}_{1} \in \mathbb{R}\), a 2(t, x 1, ⋅), σ2 2(t, x 1, ⋅) and p 0(x 1, ⋅) are periodic with period 1.
-
There is an \(n \in {\mathbb{Z}}_{+}\) such that for each i = 1, 2,
the (n + 1)st partial derivatives with respect to t of a i ( ⋅, x 1, x 2) and σ i 2( ⋅, x 1, x 2) are Lipschitz continuous uniformly in x 1, x 2 ∈ [0, 1]. In addition, for each t ∈ [0, T] and each x 1, x 2 ∈ [0, 1], σ i 2(t, x 1, x 2) > 0.
Definition 4.55.
A function μ(⋅) is said to be a quasi-stationary density for the periodic diffusion corresponding to the Kolmogorov-Fokker-Planck operator \({\mathcal{L}}_{2}^{{_\ast}}\) if it is periodic in x1 and x2 with period 1,
and for each fixed t and x1,
To proceed, let \(\mathcal{H}\) be the space of functions that are bounded and continuous and are Hölder continuous in (x 1, x 2) ∈ [0, 1] ×[0, 1] (with Hölder exponent Δ for some 0 < Δ < 1), uniformly with respect to t. For each h 1, \({h}_{2} \in \mathcal{H}\) define \(\langle {h}_{1},{h}_{2}\rangle_{\mathcal{H}}\) as
Under the assumptions mentioned above, two sequences of functions φ i ( ⋅) (periodic in x 1 and x 2) and ψ i ( ⋅) for i = 0, …, n can be found such that
-
\({\varphi }_{i}(\cdot,\cdot,\cdot ) \in {C}^{n+1-i,2(n+1-i),2(n+1-i)}\);
-
ψ i (t ∕ ε, x 1, x 2) decay exponentially fast in that for some c 1 > 0 and c 2 > 0,
$$ \sup\limits_{{x}_{1},{x}_{2}\in [0,1]}\vert {\psi }_{i}\left ( \frac{t} {\varepsilon },{x}_{1},{x}_{2}\right )\vert \leq {c}_{1}\exp \left (-\frac{{c}_{2}t} {\varepsilon } \right );$$ -
define \(\widetilde{{s}}_{n}^{\varepsilon }\) by
$$\widetilde{{s}}_{n}^{\varepsilon }(t,{x}_{ 1},{x}_{2}) =\sum\limits_{i=0}^{n}\left ({\varepsilon }^{i}{\varphi }_{ i}(t,{x}_{1},{x}_{2}) + {\varepsilon }^{i}{\psi }_{ i}\left ( \frac{t} {\varepsilon },{x}_{1},{x}_{2}\right )\right );$$for each \(h \in \mathcal{H}\), the following error bound holds:
$$\begin{array}{ll} \left \vert \langle {p}^{\varepsilon } -\widetilde{ {s}}_{n}^{\varepsilon },h{\rangle}_{\mathcal{H}}\right \vert = O({\varepsilon }^{n+1}).\end{array}$$(4.103)
It is interesting to note that the leading term of the approximation, φ0( ⋅), is approximately the probability density of X 1, namely v 0(t, x 1), multiplied by the conditional density of X 2 given X 1 = x 1 (holding x 1 as a parameter), that is, the quasi-stationary density μ(t, x 1, x 2). The rest of the terms in the regular part of the expansion assume the form
where U i ( ⋅) is a particular solution of an inhomogeneous equation. Note the resemblance of the form to that of the Markov-chain cases studied in this chapter. A detailed proof of the assertion is in Khasminskii and Yin [116]. In fact, more complex systems (allowing interaction of X 1 ε and X 2 ε, the mixed partial derivatives of x 1 and x 2 as well as extension to multidimensional systems) are treated in [116]. In addition, in lieu of \(\langle \cdot,{\cdot \rangle }_{\mathcal{H}}\), convergence under the uniform topology can be considered via the use of stochastic representation of solutions of partial differential equations or energy integration methods (see, for example, the related treatment of singularly perturbed switching diffusion systems in Il’in, Khasminskii, and Yin [94]).
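For a fixed (t, x 1), the quasi-stationary density of Definition 4.55 can be approximated numerically by discretizing \({\mathcal{L}}_{2}^{{_\ast}}\) on the periodic grid and appending the normalization, exactly as in the augmented systems used for Markov chains. A sketch with hypothetical periodic coefficients (the drift and diffusion below are illustrative choices satisfying the periodicity and positivity conditions):

```python
import numpy as np

# Discretize L2* mu = (1/2)(s2 mu)'' - (a2 mu)' on [0, 1] with
# periodic boundary conditions, at a frozen (t, x1).
M = 200
h = 1.0 / M
x2 = np.arange(M) * h
a2 = np.sin(2 * np.pi * x2)                    # hypothetical drift
s2 = 0.5 + 0.1 * np.cos(2 * np.pi * x2) ** 2   # sigma_2^2 > 0

L = np.zeros((M, M))
for j in range(M):
    jm, jp = (j - 1) % M, (j + 1) % M
    # diffusion term: centered second difference of (s2 mu)/2
    L[j, jm] += s2[jm] / (2 * h**2)
    L[j, j] += -s2[j] / h**2
    L[j, jp] += s2[jp] / (2 * h**2)
    # drift term: centered first difference of -(a2 mu)
    L[j, jm] += a2[jm] / (2 * h)
    L[j, jp] += -a2[jp] / (2 * h)

# Solve L mu = 0 with the normalization h * sum(mu) = 1 appended,
# mirroring the augmented system of the Markov-chain case.
A = np.vstack([L, h * np.ones(M)])
rhs = np.concatenate([np.zeros(M), [1.0]])
mu, *_ = np.linalg.lstsq(A, rhs, rcond=None)

assert np.allclose(L @ mu, 0, atol=1e-6)   # quasi-stationary
assert np.isclose(mu.sum() * h, 1.0)       # normalized
assert (mu > 0).all()                      # a genuine density
print("quasi-stationary density computed on", M, "grid points")
```

The discrete operator inherits the conservation property of \({\mathcal{L}}_{2}^{{_\ast}}\) (its columns sum to zero), so the linear system is singular with a one-dimensional null space, and the appended normalization singles out the density, just as the condition f1 l = c does for generators.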
8 Notes
Two-time-scale Markov chains are dealt with in this chapter using purely analytic methods, which are closely connected with singular perturbation methods. The literature on singular perturbations for ordinary differential equations is rather rich. For an extensive list of references on singular perturbation methods for ordinary differential equations and various techniques such as initial-layer corrections, we refer to Vasil’eva and Butuzov [209], Wasow [215, 216], O’Malley [163], and the references therein. The development of singular perturbation methods has been intertwined with advances in technology and progress in various applications. It can be traced back to the beginning of the twentieth century, when Prandtl dealt with fluid motion with small friction (see Prandtl [178]). Nowadays, the averaging principle developed by Krylov, Bogoliubov, and Mitropolskii (see Bogoliubov and Mitropolskii [18]) has become a popular technique, taught in standard graduate applied mathematics courses and employed widely.
General results on singular perturbations can be found in Bensoussan, Lions, and Papanicolaou [7], Bogoliubov and Mitropolskii [18], Eckhaus [54], Erdélyi [58], Il’in [92], Kevorkian and Cole [108, 109], Krylov and Bogoliubov [133], O’Malley [163], Smith [199], Vasil’eva and Butuzov [209, 210], and Wasow [215, 216]; applications to control theory and related fields are in Bensoussan [8], Bielecki and Filar [11], Delebecque and Quadrat [44], Delebecque, Quadrat, and Kokotovic [45], Kokotovic [126], Kokotovic, Bensoussan, and Blankenship [127], Kokotovic and Khalil [128], Kokotovic, Khalil, and O’Reilly [129], Kushner [140], Pan and Başar [164–166], Pervozvanskii and Gaitsgori [174], Phillips and Kokotovic [175], and Yin and Zhang [233], among others; the vast literature on applications to different branches of physics includes Risken [182] and van Kampen [208]; the survey by Hänggi, Talkner, and Borkovec [80] contains hundreds of references concerning applications in physics; related problems via large deviations theory are in Lerman and Schuss [151]; some recent work on singular perturbations in queueing networks, heavy traffic, and related topics is in Harrison and Reiman [81], Knessel and Morrison [125], and the references therein; applications to manufacturing systems are in Sethi and Zhang [192], Soner [202], Zhang [248], and the references cited there; related problems for stochastic differential equations, diffusion approximations, and so on can be found in Day [42], Freidlin and Wentzell [67], Il’in and Khasminskii [93], Khasminskii [111, 112], Kushner [139], Ludwig [152], Matkowsky and Schuss [158], Naeh, Klosek, Matkowsky, and Schuss [160], Papanicolaou [169, 170], Schuss [187, 188], Skorohod [198], Yin [222], Yin and Ramachandran [227], and Zhang [247], among others.
Singularly perturbed Markov processes also appear in the context of random evolution, a generalization of the motion of a particle on a fixed line with a random velocity or a random diffusivity; see, for example, Griego and Hersh [76, 77] and Pinsky [177]; an extensive survey can be found in Hersh [85]. A first-order approximation of the distribution of the Cox process with rapid switching is in Di Masi and Kabanov [48]. Recently, modeling communication systems via two-time-scale Markov chains has gained renewed interest; see Tse, Gallager, and Tsitsiklis [206], and the references therein.
It should be pointed out that there is a distinct feature in the problem we are studying compared with the traditional study of singularly perturbed systems. In contrast to many singularly perturbed ordinary differential equations, the matrix Q(t) in (4.3) is singular, and has an eigenvalue 0. Thus the usual stability condition does not hold. To circumvent this difficulty, we utilize the q-Property of the matrix Q(t), which leads to a probabilistic interpretation. The main emphasis in this chapter is on developing approximations to the solutions of the forward equations. The underlying systems arise from a wide range of applications where a finite-state Markov chain is involved and a fast time scale t ∕ ε is used. Asymptotic series of the probability distribution of the Markov chain have been developed by employing the techniques of matched expansions. An attempt to obtain the asymptotic expansion of (4.3) is initiated in Khasminskii, Yin, and Zhang [119] for time-inhomogeneous Markov chains. The result presented here is a refinement of the aforementioned reference.
Extending the results for irreducible generators, this chapter further discusses two-time-scale Markov chains with weak and strong interactions. The formulations substantially generalize the work of Khasminskii, Yin, and Zhang [120]. Section 4.3, which discusses Markovian models with recurrent states belonging to several ergodic classes, is a refinement of [120].
Previous work on singularly perturbed Markov chains with weak and strong interactions can be found in Delebecque, Quadrat, and Kokotovic [45], Gaitsgori and Pervozvanskii [69], Pervozvanskii and Gaitsgori [174], and Phillips and Kokotovic [175]. The essence is a decomposition and aggregation point of view. Their models are similar to that considered in this chapter. For example, translating the setup into our setting, the authors of [175] assumed that the Markov chain generated by \(\widetilde{Q}/\varepsilon +\widehat{ Q}\) has a single ergodic class for ε sufficiently small. Moreover, for each j = 1, 2, …, l, the subchain has a single ergodic class. Their formulation requires that \(\widetilde{Q}(t) =\widetilde{ Q}\) and \(\widehat{Q}(t) =\widehat{ Q}\), and it requires essentially the irreducibility of \(\widetilde{Q}/\varepsilon +\widehat{ Q}\) for all ε ≤ ε0 for some ε0 > 0 small enough in addition to the irreducibility of \(\widetilde{{Q}}^{j}\) for j = 1, 2, …, l. The problem considered in this chapter is nonstationary; the generators are time-varying. The irreducibility is in the weak sense, and only weak irreducibility of each subgenerator (or block matrix) \(\widetilde{{Q}}^{j}(t)\) for j = 1, 2, …, l is needed. Thus our results generalize the existing theorems to nonstationary cases under weaker assumptions. The condition on \(\widetilde{Q}(t)\) exploits the intrinsic properties of the underlying chains. Furthermore, our results also include Markov chains with countable-state spaces. The formulation and development of Section 4.5 are inspired by that of [175] (see also Pan and Başar [164]). This together with the consideration of chains with recurrent states and the inclusion of absorbing states includes most of practical concerns for the rapidly varying part of the generator. 
Although the forms of the generators with absorbing states and with transient states have more complex structures, the asymptotic expansion of the probability distributions can still be obtained via an approach similar to that for block-diagonal \(\widetilde{Q}(\cdot )\). Applications to manufacturing systems are discussed, for example, in Jiang and Sethi [99] and Sethi and Zhang [192], among others. As a complement to the development in this chapter, the work of Il’in, Khasminskii, and Yin [94] deals with cases in which the underlying Markov processes involve both diffusion and pure jump processes; see also Yin and Yang [229]. Previous work on singular perturbations of stochastic systems can be found in Day [42], Freidlin and Wentzell [67], Khasminskii [111–113], Kushner [139], Ludwig [152], Matkowsky and Schuss [158], Naeh, Klosek, Matkowsky, and Schuss [160], Papanicolaou [169, 170], Schuss [187], Yin and Ramachandran [227], and the references therein. Singular perturbations in connection with optimal control problems are contained in Bensoussan [8], Bielecki and Filar [11], Delebecque and Quadrat [44], Kokotovic [126], Kokotovic, Bensoussan, and Blankenship [127], Kushner [140], Lehoczky, Sethi, Soner, and Taksar [150], Martins and Kushner [156], Pan and Başar [164], Pervozvanskii and Gaitsgori [174], Sethi and Zhang [192], Soner [202], and Yin and Zhang [233], among others. For discrete-time two-time-scale Markov chains, we refer the reader to Yin and Zhang [238] and Yin, Zhang, and Badowski [242], among others.
We note that one of the key points that enables us to solve these problems is the Fredholm alternative. This is even more crucial compared with the situation in Section 4.2 for irreducible generators. In Section 4.2, the consistency conditions are readily verified, whereas in the formulation under weak and strong interactions, the verification needs more work and we have to utilize the consistency to obtain the desired solution.
The discussions on Markov chains with countable-state spaces in this chapter focused on simple situations. For more general cases, see Yin and Zhang [230, 231], in which applications to quasi-birth-death queues were considered; see also Altman, Avrachenkov, and Nunez-Queija [4] for a different approach. The discussions on singularly perturbed diffusion processes dealt with mainly forward equations. For related work on singularly perturbed diffusions, see the papers of Khasminskii and Yin [115, 116] and the references therein; one of the motivations for studying singularly perturbed diffusion comes from wear process modeling (see Rishel [181]). For treatments of averaging principles and related backward equations, we refer the reader to Khasminskii and Yin [117, 118]. For a number of applications on queueing systems, financial engineering, and insurance risk, we refer the reader to Yin, Zhang, and Zhang [232] and references therein.
References
M. Abbad, J.A. Filar, and T.R. Bielecki, Algorithms for singularly perturbed limiting average Markov control problems, IEEE Trans. Automat. Control AC-37 (1992), 1421–1425.
R. Akella and P.R. Kumar, Optimal control of production rate in a failure-prone manufacturing system, IEEE Trans. Automat. Control AC-31 (1986), 116–126.
W.J. Anderson, Continuous-Time Markov Chains: An Application-Oriented Approach, Springer-Verlag, New York, 1991.
E. Altman, K.E. Avrachenkov, and R. Nunez-Queija, Perturbation analysis for denumerable Markov chains with applications to queueing models, Adv. in Appl. Probab., 36 (2004), 839–853.
G. Badowski and G. Yin, Stability of hybrid dynamic systems containing singularly perturbed random processes, IEEE Trans. Automat. Control, 47 (2002), 2021–2031.
G. Barone-Adesi and R. Whaley, Efficient analytic approximation of American option values, J. Finance, 42 (1987), 301–320.
A. Bensoussan, J.L. Lions, and G.C. Papanicolaou, Asymptotic Analysis for Periodic Structures, North-Holland, Amsterdam, 1978.
A. Bensoussan, Perturbation Methods in Optimal Control, J. Wiley, Chichester, 1988.
L.D. Berkovitz, Optimal Control Theory, Springer-Verlag, New York, 1974.
A.T. Bharucha-Reid, Elements of the Theory of Markov Processes and Their Applications, McGraw-Hill, New York, 1960.
T.R. Bielecki and J.A. Filar, Singularly perturbed Markov control problem: Limiting average cost, Ann. Oper. Res. 28 (1991), 153–168.
T.R. Bielecki and P.R. Kumar, Optimality of zero-inventory policies for unreliable manufacturing systems, Oper. Res. 36 (1988), 532–541.
P. Billingsley, Convergence of Probability Measures, J. Wiley, New York, 1968.
T. Björk, Finite dimensional optimal filters for a class of Ito processes with jumping parameters, Stochastics, 4 (1980), 167–183.
W.P. Blair and D.D. Sworder, Feedback control of a class of linear discrete systems with jump parameters and quadratic cost criteria, Int. J. Control, 21 (1986), 833–841.
G.B. Blankenship and G.C. Papanicolaou, Stability and control of stochastic systems with wide band noise, SIAM J. Appl. Math. 34 (1978), 437–476.
H.A.P. Blom and Y. Bar-Shalom, The interacting multiple model algorithm for systems with Markovian switching coefficients, IEEE Trans. Automat. Control, AC-33 (1988), 780–783.
N.N. Bogoliubov and Y.A. Mitropolskii, Asymptotic Methods in the Theory of Nonlinear Oscillations, Gordon and Breach, New York, 1961.
E.K. Boukas and A. Haurie, Manufacturing flow control and preventive maintenance: A stochastic control approach, IEEE Trans. Automat. Control AC-35 (1990), 1024–1031.
P. Brémaud, Point Processes and Queues, Springer-Verlag, New York, 1981.
P.E. Caines and H.-F. Chen, Optimal adaptive LQG control for systems with finite state process parameters, IEEE Trans. Automat. Control, AC-30 (1985), 185–189.
S.L. Campbell, Singular perturbation of autonomous linear systems, II, J. Differential Equations 29 (1978), 362–373.
S.L. Campbell and N.J. Rose, Singular perturbation of autonomous linear systems, SIAM J. Math. Anal. 10 (1979), 542–551.
M. Caramanis and G. Liberopoulos, Perturbation analysis for the design of flexible manufacturing system flow controllers, Oper. Res. 40 (1992), 1107–1125.
M.-F. Chen, From Markov Chains to Non-equilibrium Particle Systems, 2nd ed., World Scientific, Singapore, 2004.
S. Chen, X. Li, and X.Y. Zhou, Stochastic linear quadratic regulators with indefinite control weight costs, SIAM J. Control Optim. 36 (1998), 1685–1702.
C.L. Chiang, An Introduction to Stochastic Processes and Their Applications, Krieger, Huntington, 1980.
T.-S. Chiang and Y. Chow, A limit theorem for a class of inhomogeneous Markov processes, Ann. Probab. 17 (1989), 1483–1502.
P.L. Chow, J.L. Menaldi, and M. Robin, Additive control of stochastic linear systems with finite horizon, SIAM J. Control Optim. 23 (1985), 859–899.
Y.S. Chow and H. Teicher, Probability Theory, Springer-Verlag, New York, 1978.
K.L. Chung, Markov Chains with Stationary Transition Probabilities, 2nd Ed., Springer-Verlag, New York, 1967.
F. Clarke, Optimization and Non-smooth Analysis, Wiley Interscience, New York, 1983.
O.L.V. Costa and F. Dufour, Singular perturbation for the discounted continuous control of piecewise deterministic Markov processes, Appl. Math. Optim., 63 (2011), 357–384.
O.L.V. Costa and F. Dufour, Singularly perturbed discounted Markov control processes in a general state space, SIAM J. Control Optim., 50 (2012), 720–747.
P.J. Courtois, Decomposability: Queueing and Computer System Applications, Academic Press, New York, NY, 1977.
D.R. Cox and H.D. Miller, The Theory of Stochastic Processes, J. Wiley, New York, 1965.
M.G. Crandall, C. Evans, and P.L. Lions, Some properties of viscosity solutions of Hamilton-Jacobi equations, Trans. Amer. Math. Soc. 282 (1984), 487–501.
M.G. Crandall, H. Ishii, and P.L. Lions, User’s guide to viscosity solutions of second order partial differential equations, Bull. Amer. Math. Soc. 27 (1992), 1–67.
M.G. Crandall and P.L. Lions, Viscosity solutions of Hamilton-Jacobi equations, Trans. Amer. Math. Soc. 277 (1983), 1–42.
I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Regional Conf. Ser. Appl. Math., SIAM, Philadelphia, PA, 1992.
M.H.A. Davis, Markov Models and Optimization, Chapman & Hall, London, 1993.
M.V. Day, Boundary local time and small parameter exit problems with characteristic boundaries, SIAM J. Math. Anal. 20 (1989), 222–248.
F. Delebecque, A reduction process for perturbed Markov chains, SIAM J. Appl. Math., 48 (1983), 325–350.
F. Delebecque and J. Quadrat, Optimal control for Markov chains admitting strong and weak interactions, Automatica 17 (1981), 281–296.
F. Delebecque, J. Quadrat, and P. Kokotovic, A unified view of aggregation and coherency in networks and Markov chains, Internat. J. Control 40 (1984), 939–952.
C. Derman, Finite State Markovian Decision Processes, Academic Press, New York, 1970.
G.B. Di Masi and Yu.M. Kabanov, The strong convergence of two-scale stochastic systems and singular perturbations of filtering equations, J. Math. Systems, Estimation Control 3 (1993), 207–224.
G.B. Di Masi and Yu.M. Kabanov, A first order approximation for the convergence of distributions of the Cox processes with fast Markov switchings, Stochastics Stochastics Rep. 54 (1995), 211–219.
J.L. Doob, Stochastic Processes, Wiley Classic Library Edition, Wiley, New York, 1990.
R.L. Dobrushin, Central limit theorem for nonstationary Markov chains, Theory Probab. Appl. 1 (1956), 65–80, 329–383.
E.B. Dynkin, Markov Processes, Springer-Verlag, Berlin, 1965.
N. Dunford and J.T. Schwartz, Linear Operators, Interscience, New York, 1958.
E.B. Dynkin and A.A. Yushkevich, Controlled Markov Processes, Springer-Verlag, New York, 1979.
W. Eckhaus, Asymptotic Analysis of Singular Perturbations, North-Holland, Amsterdam, 1979.
R.J. Elliott, Stochastic Calculus and Applications, Springer-Verlag, New York, 1982.
R.J. Elliott, Smoothing for a finite state Markov process, in Lecture Notes in Control and Inform. Sci., 69, 199–206, Springer-Verlag, New York, 1985.
R.J. Elliott, L. Aggoun, and J. Moore, Hidden Markov Models: Estimation and Control, Springer-Verlag, New York, 1995.
A. Erdélyi, Asymptotic Expansions, Dover, New York, 1956.
S.N. Ethier and T.G. Kurtz, Markov Processes: Characterization and Convergence, J. Wiley, New York, 1986.
W. Feller, An Introduction to Probability Theory and Its Applications, J. Wiley, New York, Vol. I, 1957; Vol. II, 1966.
W.H. Fleming, Functions of Several Variables, Addison-Wesley, Reading, 1965.
W.H. Fleming, Generalized solution in optimal stochastic control, in Proc. URI Conf. on Control, 147–165, Kingston, RI, 1982.
W.H. Fleming and R.W. Rishel, Deterministic and Stochastic Optimal Control, Springer-Verlag, New York, 1975.
W.H. Fleming and H.M. Soner, Controlled Markov Processes and Viscosity Solutions, Springer-Verlag, New York, 1992.
W.H. Fleming, S.P. Sethi, and H.M. Soner, An optimal stochastic production planning problem with randomly fluctuating demand, SIAM J. Control Optim. 25 (1987), 1494–1502.
W.H. Fleming and Q. Zhang, Risk-sensitive production planning of a stochastic manufacturing system, SIAM J. Control Optim., 36 (1998), 1147–1170.
M.I. Friedlin and A.D. Wentzel, Random Perturbations of Dynamical Systems, Springer-Verlag, New York, 1984.
C.W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry, and the Natural Sciences, 2nd Ed., Springer-Verlag, Berlin, 1985.
V.G. Gaitsgori and A.A. Pervozvanskii, Aggregation of states in a Markov chain with weak interactions, Kybernetika 11 (1975), 91–98.
D. Geman and S. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Machine Intelligence 6 (1984), 721–741.
S.B. Gershwin, Manufacturing Systems Engineering, Prentice-Hall, Englewood Cliffs, 1994.
M.K. Ghosh, A. Arapostathis, and S.I. Marcus, Ergodic control of switching diffusions, SIAM J. Control Optim., 35 (1997), 1952–1988.
I.I. Gihman and A.V. Skorohod, Introduction to the Theory of Random Processes, W.B. Saunders, Philadelphia, 1969.
P. Glasserman, Gradient Estimation via Perturbation Analysis, Kluwer, Boston, MA, 1991.
R. Goodman, Introduction to Stochastic Models, Benjamin/Cummings, Menlo Park, CA, 1988.
R.J. Griego and R. Hersh, Random evolutions, Markov chains, and systems of partial differential equations, Proc. Nat. Acad. Sci. U.S.A. 62 (1969), 305–308.
R.J. Griego and R. Hersh, Theory of random evolutions with applications to partial differential equations, Trans. Amer. Math. Soc. 156 (1971), 405–418.
X. Guo and O. Hernàndez-Lerma, Continuous-time Markov Decision Processes: Theory and Applications, Springer, Heidelberg, 2001.
J.K. Hale, Ordinary Differential Equations, R.E. Krieger Publishing Co., 2nd Ed., Malabar, 1980.
P. Hänggi, P. Talkner, and M. Borkovec, Reaction-rate theory: Fifty years after Kramers, Rev. Modern Phys. 62 (1990), 251–341.
J.M. Harrison and M.I. Reiman, Reflected Brownian motion on an orthant, Ann. Probab. 9 (1981), 302–308.
U.G. Haussmann and Q. Zhang, Stochastic adaptive control with small observation noise, Stochastics Stochastics Rep. 32 (1990), 109–144.
U.G. Haussmann and Q. Zhang, Discrete time stochastic adaptive control with small observation noise, Appl. Math. Optim. 25 (1992), 303–330.
Q. He, G. Yin, and Q. Zhang, Large Deviations for Two-Time-Scale Systems Driven by Nonhomogeneous Markov Chains and LQ Control Problems, SIAM J. Control Optim., 49, (2011), 1737–1765.
R. Hersh, Random evolutions: A survey of results and problems, Rocky Mountain J. Math. 4 (1974), 443–477.
F.S. Hillier and G.J. Lieberman, Introduction to Operations Research, McGraw-Hill, New York, 1989.
Y.C. Ho and X.R. Cao, Perturbation Analysis of Discrete Event Dynamic Systems, Kluwer, Boston, MA, 1991.
A. Hoyland and M. Rausand, System Reliability Theory: Models and Statistical Methods, J. Wiley, New York, 1994.
Z. Hou and Q. Guo, Homogeneous Denumerable Markov Processes, Springer-Verlag, Berlin, 1980.
V. Hutson and J.S. Pym, Applications of Functional Analysis and Operator Theory, Academic Press, London, 1980.
N. Ikeda and S. Watanabe, Stochastic Differential Equations and Diffusion Processes, North-Holland, Amsterdam, 1981.
A.M. Il’in, Matching of Asymptotic Expansions of Solutions of Boundary Value Problems, Trans. Math. Monographs, Vol. 102, Amer. Math. Soc., Providence, 1992.
A.M. Il’in and R.Z. Khasminskii, Asymptotic behavior of solutions of parabolic equations and ergodic properties of nonhomogeneous diffusion processes, Math. Sbornik. 60 (1963), 366–392.
A.M. Il’in, R.Z. Khasminskii, and G. Yin, Singularly perturbed switching diffusions: Rapid switchings and fast diffusions, J. Optim. Theory Appl. 102 (1999), 555–591.
M. Iosifescu, Finite Markov Processes and Their Applications, Wiley, Chichester, 1980.
H. Ishii, Uniqueness of unbounded viscosity solutions of Hamilton-Jacobi equations, Indiana Univ. Math. J. 33 (1984), 721–748.
Y. Ji and H.J. Chizeck, Controllability, stabilizability, and continuous-time Markovian jump linear quadratic control, IEEE Trans. Automatic Control, 35 (1990), 777–788.
Y. Ji and H.J. Chizeck, Jump linear quadratic Gaussian control in continuous time, IEEE Trans. Automat. Control AC-37 (1992), 1884–1892.
J. Jiang and S.P. Sethi, A state aggregation approach to manufacturing systems having machines states with weak and strong interactions, Oper. Res. 39 (1991), 970–978.
Yu. Kabanov and S. Pergamenshchikov, Two-scale Stochastic Systems: Asymptotic Analysis and Control, Springer, New York, NY, 2003.
I.Ia. Kac and N.N. Krasovskii, On the stability of systems with random parameters, J. Appl. Math. Mech., 24 (1960), 1225–1246.
G. Kallianpur, Stochastic Filtering Theory, Springer-Verlag, New York, 1980.
D. Kannan, An Introduction to Stochastic Processes, North-Holland, New York, 1980.
S. Karlin and J. McGregor, The classification of birth and death processes, Trans. Amer. Math. Soc. 85 (1957), 489–546.
S. Karlin and H.M. Taylor, A First Course in Stochastic Processes, 2nd Ed., Academic Press, New York, 1975.
S. Karlin and H.M. Taylor, A Second Course in Stochastic Processes, Academic Press, New York, 1981.
J. Keilson, Green’s Function Methods in Probability Theory, Griffin, London, 1965.
J. Kevorkian and J.D. Cole, Perturbation Methods in Applied Mathematics, Springer-Verlag, New York, 1981.
J. Kevorkian and J.D. Cole, Multiple Scale and Singular Perturbation Methods, Springer-Verlag, New York, 1996.
H. Kesten and G.C. Papanicolaou, A limit theorem for stochastic acceleration, Comm. Math. Phys. 78 (1980), 19–63.
R.Z. Khasminskii, On diffusion processes with a small parameter, Izv. Akad. Nauk U.S.S.R. Ser. Mat. 27 (1963), 1281–1300.
R.Z. Khasminskii, On stochastic processes defined by differential equations with a small parameter, Theory Probab. Appl. 11 (1966), 211–228.
R.Z. Khasminskii, On an averaging principle for Ito stochastic differential equations, Kybernetika 4 (1968), 260-279.
R.Z. Khasminskii, Stochastic Stability of Differential Equations, 2nd Ed., Springer, New York, 2012.
R.Z. Khasminskii and G. Yin, Asymptotic series for singularly perturbed Kolmogorov-Fokker-Planck equations, SIAM J. Appl. Math. 56 (1996), 1766–1793.
R.Z. Khasminskii and G. Yin, On transition densities of singularly perturbed diffusions with fast and slow components, SIAM J. Appl. Math. 56 (1996), 1794–1819.
R.Z. Khasminskii and G. Yin, On averaging principles: An asymptotic expansion approach, SIAM J. Math. Anal., 35 (2004), 1534–1560.
R.Z. Khasminskii and G. Yin, Limit behavior of two-time-scale diffusions revisited, J. Differential Eqs., 212 (2005) 85–113.
R.Z. Khasminskii, G. Yin, and Q. Zhang, Asymptotic expansions of singularly perturbed systems involving rapidly fluctuating Markov chains, SIAM J. Appl. Math. 56 (1996), 277–293.
R.Z. Khasminskii, G. Yin, and Q. Zhang, Constructing asymptotic series for probability distribution of Markov chains with weak and strong interactions, Quart. Appl. Math. LV (1997), 177–200.
J.G. Kimemia and S.B. Gershwin, An algorithm for the computer control production in flexible manufacturing systems, IIE Trans. 15 (1983), 353–362.
J.F.C. Kingman, Poisson Processes, Oxford Univ. Press, Oxford, 1993.
S. Kirkpatrick, C. Gebatt, and M. Vecchi, Optimization by simulated annealing, Science 220 (1983), 671–680.
C. Knessel, On finite capacity processor-shared queues, SIAM J. Appl. Math. 50 (1990), 264–287.
C. Knessel and J.A. Morrison, Heavy traffic analysis of a data handling system with multiple sources, SIAM J. Appl. Math. 51 (1991), 187–213.
P.V. Kokotovic, Application of singular perturbation techniques to control problems, SIAM Rev. 26 (1984), 501–550.
P.V. Kokotovic, A. Bensoussan, and G. Blankenship (Eds.), Singular Perturbations and Asymptotic Analysis in Control Systems, Lecture Notes in Control and Inform. Sci. 90, Springer-Verlag, Berlin, 1987.
P.V. Kokotovic and H.K. Khalil (Eds.), Singular Perturbations in Systems and Control, IEEE Press, New York, 1986.
P.V. Kokotovic, H.K. Khalil, and J. O’Reilly, Singular Perturbation Methods in Control, Academic Press, London, 1986.
V. Korolykuk and A. Swishchuk, Evolution of Systems in Random Media, CRC Press, Boca Raton, 1995.
V.S. Korolyuk and N. Limnios, Diffusion approximation with equilibrium of evolutionary systems switched by semi-Markov processes, translation in Ukrainian Math. J. 57 (2005), 1466–1476.
V.S. Korolyuk and N. Limnios, Stochastic systems in merging phase space, World Sci., Hackensack, NJ, 2005.
N.M. Krylov and N.N. Bogoliubov, Introduction to Nonlinear Mechanics, Princeton Univ. Press, Princeton, 1947.
H. Kunita and S. Watanabe, On square integrable martingales, Nagoya Math. J. 30 (1967), 209–245.
T.G. Kurtz, A limit theorem for perturbed operator semigroups with applications to random evolutions, J. Functional Anal. 12 (1973), 55–67.
T.G. Kurtz, Approximation of Population Processes, SIAM, Philadelphia, PA, 1981.
T.G. Kurtz, Averaging for martingale problems and stochastic approximation, in Proc. US-French Workshop on Appl. Stochastic Anal., Lecture Notes in Control and Inform. Sci., 177, I. Karatzas and D. Ocone (Eds.), 186–209, Springer-Verlag, New York, 1991.
H.J. Kushner, Probability Methods for Approximation in Stochastic Control and for Elliptic Equations, Academic Press, New York, 1977.
H.J. Kushner, Approximation and Weak Convergence Methods for Random Processes, with Applications to Stochastic Systems Theory, MIT Press, Cambridge, MA, 1984.
H.J. Kushner, Weak Convergence Methods and Singularly Perturbed Stochastic Control and Filtering Problems, Birkhäuser, Boston, 1990.
H.J. Kushner and P.G. Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time, Springer-Verlag, New York, 1992.
H.J. Kushner and W. Runggaldier, Nearly optimal state feedback controls for stochastic systems with wideband noise disturbances, SIAM J Control Optim. 25 (1987), 289–315.
H.J. Kushner and F.J. Vázquez-Abad, Stochastic approximation algorithms for systems over an infinite horizon, SIAM J. Control Optim. 34 (1996), 712–756.
H.J. Kushner and G. Yin, Asymptotic properties of distributed and communicating stochastic approximation algorithms, SIAM J. Control Optim. 25 (1987), 1266–1290.
H.J. Kushner and G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, 2nd Edition, Springer-Verlag, New York, 2003.
X.R. Li, Hybrid estimation techniques, in Control and Dynamic Systems, Vol. 76, C.T. Leondes (Ed.), Academic Press, New York, 1996.
Yu. V. Linnik, On the theory of nonhomogeneous Markov chains, Izv. Akad. Nauk. USSR Ser. Mat. 13 (1949), 65–94.
Y.J. Liu, G. Yin, and X.Y. Zhou, Near-optimal controls of random-switching LQ problems with indefinite control weight costs, Automatica, 41 (2005) 1063–1070.
P. Lochak and C. Meunier, Multiphase Averaging for Classical Systems, Springer-Verlag, New York, 1988.
J. Lehoczky, S.P. Sethi, H.M. Soner, and M. Taksar, An asymptotic analysis of hierarchical control of manufacturing systems under uncertainty, Math. Oper. Res. 16 (1992), 596–608.
G. Lerman and Z. Schuss, Asymptotic theory of large deviations for Markov chains, SIAM J. Appl. Math., 58 (1998), 1862–1877.
D. Ludwig, Persistence of dynamical systems under random perturbations, SIAM Rev. 17 (1975), 605–640.
X. Mao and C. Yuan, Stochastic Differential Equations with Markovian Switching, Imperial College Press, London, UK, 2006.
M. Mariton, Robust jump linear quadratic control: A mode stabilizing solution, IEEE Trans. Automat. Control, AC-30 (1985), 1145–1147.
M. Mariton, Jump Linear Systems in Automatic Control, Marcel Dekker, Inc., New York, 1990.
L.F. Martins and H.J. Kushner, Routing and singular control for queueing networks in heavy traffic, SIAM J. Control Optim. 28 (1990), 1209–1233.
W.A. Massey and W. Whitt, Uniform acceleration expansions for Markov chains with time-varying rates, Ann. Appl. Probab., 8 (1998), 1130–1155.
B.J. Matkowsky and Z. Schuss, The exit problem for randomly perturbed dynamical systems, SIAM J. Appl. Math. 33 (1977), 365–382.
S.P. Meyn and R.L. Tweedie, Markov Chains and Stochastic Stability, Springer-Verlag, London, 1993.
T. Naeh, M.M. Klosek, B.J. Matkowski, and Z. Schuss, A direct approach to the exit problem, SIAM J. Appl. Math. 50 (1990), 595–627.
A.H. Nayfeh, Introduction to Perturbation Techniques, J. Wiley, New York, 1981.
M.F. Neuts, Matrix-Geometric Solutions in Stochastic Models, Johns Hopkins Univ. Press, Baltimore, 1981.
R.E. O’Malley, Jr., Singular Perturbation Methods for Ordinary Differential Equations, Springer-Verlag, New York, 1991.
Z.G. Pan and T. Başar, H ∞-control of Markovian jump linear systems and solutions to associated piecewise-deterministic differential games, in New Trends in Dynamic Games and Applications, G.J. Olsder Ed., 61–94, Birkhäuser, Boston, MA, 1995.
Z.G. Pan and T. Başar, H ∞ control of large scale jump linear systems via averaging and aggregation, in Proc. 34th IEEE Conf. Decision Control, 2574-2579, New Orleans, LA, 1995.
Z.G. Pan and T. Başar, Random evolutionary time-scale decomposition in robust control of jump linear systems, in Proc. 35th IEEE Conf. Decision Control, Kobe, Japan, 1996.
Z.G. Pan and T. Basar, H ∞ control of large-scale jump linear systems via averaging and aggregation, Internat. J. Control, 72 (1999), 866–881.
G.C. Papanicolaou, D. Stroock, and S.R.S. Varadhan, Martingale approach to some limit theorems, in Proc. 1976 Duke Univ. Conf. on Turbulence, Durham, NC, 1976.
G.C. Papanicolaou, Introduction to the asymptotic analysis of stochastic equations, in Lectures in Applied Mathematics, Amer. Math. Soc., Vol. 16, 1977, 109-147.
G.C. Papanicolaou, Asymptotic analysis of stochastic equations, Studies in Probability Theory, M. Rosenblatt (Ed.), Vol. 18, MAA, 1978, 111–179.
E. Pardoux and S. Peng, Adapted solution of backward stochastic equation, Syst. Control Lett., 14 (1990), 55–61.
A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, Springer-Verlag, New York, 1983.
L. Perko, Differential Equations and Dynamical Systems, Springer, 3rd Ed., New York, 2001.
A.A. Pervozvanskii and V.G. Gaitsgori, Theory of Suboptimal Decisions: Decomposition and Aggregation, Kluwer, Dordrecht, 1988.
R.G. Phillips and P.V. Kokotovic, A singular perturbation approach to modelling and control of Markov chains, IEEE Trans. Automat. Control 26 (1981), 1087–1094.
M.A. Pinsky, Differential equations with a small parameter and the central limit theorem for functions defined on a Markov chain, Z. Wahrsch. verw. Gebiete 9 (1968), 101–111.
M.A. Pinsky, Multiplicative operator functionals and their asymptotic properties, in Advances in Probability Vol. 3, P. Ney and S. Port (Eds.), Marcel Dekker, New York, 1974.
L. Prandtl, Über Flüssigkeits – bewegung bei kleiner Reibung, Verhandlungen, in III. Internat. Math. Kongresses, (1905), 484–491.
M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, J. Wiley, New York, 1994.
D. Revuz, Markov Chains, Revised Ed., North-Holland, Amsterdam, 1975.
R. Rishel, Controlled wear process: Modeling optimal control, IEEE Trans. Automat. Control 36 (1991), 1100–1102.
H. Risken, The Fokker-Planck Equation: Methods of Solution and Applications, 2nd Ed., Springer-Verlag, London, 1989.
M. Rosenblatt, Markov Processes: Structure and Asymptotic Behavior, Springer-Verlag, Berlin, 1971.
S. Ross, Introduction to Stochastic Dynamic Programming, Academic Press, New York, 1983.
E. Roxin, The existence of optimal controls, Mich. Math. J. 9 (1962), 109–119.
V.R. Saksena, J. O’Reilly, and P.V. Kokotovic, Singular perturbations and time-scale methods in control theory: Survey 1976-1983, Automatica 20 (1984), 273–293.
Z. Schuss, Singularly perturbation methods in stochastic differential equations of mathematical physics, SIAM Rev. 22 (1980), 119–155.
Z. Schuss, Theory and Applications of Stochastic Differential Equations, J. Wiley, New York, 1980.
E. Seneta, Non-negative Matrices and Markov Chains, Springer-Verlag, New York, 1981.
R. Serfozo, Introduction to Stochastic Networks, Springer, New York, 1999.
S.P. Sethi and G.L. Thompson, Applied Optimal Control: Applications to Management Science, Martinus Nijhoff, Boston, MA, 1981.
S.P. Sethi and Q. Zhang, Hierarchical Decision Making in Stochastic Manufacturing Systems, Birkhäuser, Boston, 1994.
S.P. Sethi and Q. Zhang, Multilevel hierarchical decision making in stochastic marketing-production systems, SIAM J. Control Optim. 33 (1995), 528–553.
O.P. Sharma, Markov Queues, Ellis Horwood, New York, 1990.
H.A. Simon, Models of Discovery and Other Topics in the Methods of Science, D. Reidel Publ. Co., Boston, MA, 1977.
H.A. Simon and A. Ando, Aggregation of variables in dynamic systems, Econometrica 29 (1961), 111–138.
A.V. Skorohod, Studies in the Theory of Random Processes, Dover, New York, 1982.
A.V. Skorohod, Asymptotic Methods of the Theory of Stochastic Differential Equations, Trans. Math. Monographs, Vol. 78, Amer. Math. Soc., Providence, 1989.
D.R. Smith, Singular Perturbation Theory, Cambridge Univ. Press, New York, 1985.
D. Snyder, Random Point Processes, Wiley, New York, 1975.
H.M. Soner, Optimal control with state space constraints II, SIAM J. Control Optim. 24 (1986), 1110–1122.
H.M. Soner, Singular perturbations in manufacturing systems, SIAM J. Control Optim. 31 (1993), 132–146.
D.W. Stroock and S.R.S. Varadhan, Multidimensional Diffusion Processes, Springer-Verlag, Berlin, 1979.
H.M. Taylor and S. Karlin, An Introduction to Stochastic Modeling, Academic Press, Boston, 1994.
W.A. Thompson, Jr., Point Process Models with Applications to Safety and Reliability, Chapman and Hall, New York, 1988.
D.N.C. Tse, R.G. Gallager, and J.N. Tsitsiklis, Statistical multiplexing of multiple time-scale Markov streams, IEEE J. Selected Areas Comm. 13 (1995), 1028–1038.
S.R.S. Varadhan, Large Deviations and Applications, SIAM, Philadelphia, 1984.
N.G. van Kampen, Stochastic Processes in Physics and Chemistry, North-Holland, Amsterdam, 1992.
A.B. Vasil’eava and V.F. Butuzov, Asymptotic Expansions of the Solutions of Singularly Perturbed Equations, Nauka, Moscow, 1973.
A.B. Vasil’eava and V.F. Butuzov, Asymptotic Methods in Singular Perturbations Theory (in Russian), Vysshaya Shkola, Moscow, 1990.
D. Vermes, Optimal control of piecewise deterministic Markov processes, Stochastics, 14 (1985), 165–207.
L.Y. Wang, P.P. Khargonekar, and A. Beydoun, Robust control of hybrid systems: Performance guided strategies, in Hybrid Systems V, P. Antsaklis, W. Kohn, M. Lemmon, A. Nerode, amd S. Sastry Eds., Lecuture Notes in Computer Sci., 1567, 356–389, Berlin, 1999.
Z. Wang and X. Yang, Birth and Death Processes and Markov Chains, Springer-Verlag, Science Press, Beijing, 1992.
J. Warga, Relaxed variational problems, J Math. Anal. Appl. 4 (1962), 111–128.
W. Wasow, Asymptotic Expansions for Ordinary Differential Equations, Interscience, New York, 1965.
W. Wasow, Linear Turning Point Theory, Springer-Verlag, New York, 1985.
A.D. Wentzel, On the asymptotics of eigenvalues of matrices with elements of order exp( − V ij ∕ 2ε2), Dokl. Akad. Nauk SSSR 222 (1972), 263–265.
D.J. White, Markov Decision Processes, Wiley, New York, 1992.
J.H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, New York, 1988.
H. Yan, G. Yin, and S. X. C. Lou, Using stochastic optimization to determine threshold values for control of unreliable manufacturing systems, J. Optim. Theory Appl. 83 (1994), 511–539.
H. Yan and Q. Zhang, A numerical method in optimal production and setup scheduling in stochastic manufacturing systems, IEEE Trans. Automat. Control, 42 (1997), 1452–1455.
G. Yin, Asymptotic properties of an adaptive beam former algorithm, IEEE Trans. Information Theory IT-35 (1989), 859-867.
G. Yin, Asymptotic expansions of option price under regime-switching diffusions with a fast-varying switching process, Asymptotic Anal., 65 (2009), 203–222.
G. Yin and I. Gupta, On a continuous time stochastic approximation problem, Acta Appl. Math. 33 (1993), 3–20.
G. Yin, V. Krishnamurthy, and C. Ion, Regime switching stochastic approximation algorithms with application to adaptive discrete stochastic optimization, SIAM J. Optim., 14 (2004), 1187–1215.
G. Yin and D.T. Nguyen, Asymptotic expansions of backward equations for two-time-scale Markov chains in continuous time, Acta Math. Appl. Sinica, 25 (2009), 457–476.
G. Yin and K.M. Ramachandran, A differential delay equation with wideband noise perturbation, Stochastic Process Appl. 35 (1990), 231–249.
G. Yin, H. Yan, and S.X.C. Lou, On a class of stochastic optimization algorithms with applications to manufacturing models, in Model-Oriented Data Analysis, W.G. Müller, H.P. Wynn and A.A. Zhigljavsky (Eds.), 213–226, Physica-Verlag, Heidelberg, 1993.
G. Yin and H.L. Yang, Two-time-scale jump-diffusion models with Markovian switching regimes, Stochastics Stochastics Rep., 76 (2004), 77–99.
G. Yin and H. Zhang, Two-time-scale markov chains and applications to quasi-birth-death queues, SIAM J. Appl. Math., 65 (2005), 567–586.
G. Yin and H. Zhang, Singularly perturbed markov chains: Limit results and applications, Ann. Appl. Probab., 17 (2007), 207–229.
G. Yin, H. Zhang, and Q. Zhang, Applications of Two-time-scale Markovian Systems, Preprint, 2012.
G. Yin and Q. Zhang, Near optimality of stochastic control in systems with unknown parameter processes, Appl. Math. Optim. 29 (1994), 263–284.
G. Yin and Q. Zhang, Control of dynamic systems under the influence of singularly perturbed Markov chains, J. Math. Anal. Appl., 216 (1997), 343–367.
G. Yin and Q. Zhang (Eds.), Recent Advances in Control and Optimization of Manufacturing Systems, Lecture Notes in Control and Information Sciences (LNCIS) series, Vol. 214, Springer-Verlag, New York, 1996.
G. Yin and Q. Zhang (Eds.), Mathematics of Stochastic Manufacturing Systems, Proc. 1996 AMS-SIAM Summer Seminar in Applied Mathematics, Lectures in Applied Mathematics, Amer. Math. Soc., Providence, RI, 1997.
G. Yin and Q. Zhang, Continuous-Time Markov Chains and Applications: A Singular Perturbation Approach, 1st Ed., Springer-Verlag, New York, 1998.
G. Yin and Q. Zhang, Discrete-time Markov Chains: Two-time-scale Methods and Applications, Springer, New York. 2005.
G. Yin, Q. Zhang, and G. Badowski, Asymptotic properties of a singularly perturbed Markov chain with inclusion of transient states, Ann. Appl. Probab., 10 (2000), 549–572.
G. Yin, Q. Zhang, and G. Badowski, Singularly perturbed Markov chains: Convergence and aggregation, J. Multivariate Anal., 72 (2000), 208–229.
G. Yin, Q. Zhang, and G. Badowski, Occupation measures of singularly perturbed Markov chains with absorbing states, Acta Math. Sinica, 16 (2000), 161–180.
G. Yin, Q. Zhang, and G. Badowski, Discrete-time singularly perturbed Markov chains: Aggregation, occupation measures, and switching diffusion limit, Adv. in Appl. Probab., 35 (2003), 449–476.
G. Yin and X.Y. Zhou, Markowitz’s mean-variance portfolio selection with regime switching: From discrete-time models to their continuous-time limits, IEEE Trans. Automat. Control, 49 (2004), 349–360.
G. Yin and C. Zhu, Hybrid Switching Diffusions: Properties and Applications, Springer, New York, 2010.
K. Yosida, Functional Analysis, 6th Ed., Springer-Verlag, New York, NY, 1980.
J. Yong and X.Y. Zhou, Stochastic Controls: Hamiltonian Systems and HJB Equations, Springer, New York, 1999.
Q. Zhang, An asymptotic analysis of controlled diffusions with rapidly oscillating parameters, Stochastics Stochastics Rep. 42 (1993), 67–92.
Q. Zhang, Risk sensitive production planning of stochastic manufacturing systems: A singular perturbation approach, SIAM J. Control Optim. 33 (1995), 498–527.
Q. Zhang, Finite state Markovian decision processes with weak and strong interactions, Stochastics Stochastics Rep. 59 (1996), 283–304.
Q. Zhang, Nonlinear filtering and control of a switching diffusion with small observation noise, SIAM J. Control Optim., 36 (1998), 1738–1768.
Q. Zhang and G. Yin, Turnpike sets in stochastic manufacturing systems with finite time horizon, Stochastics Stochastics Rep. 51 (1994), 11–40.
Q. Zhang and G. Yin, Central limit theorems for singular perturbations of nonstationary finite state Markov chains, Ann. Appl. Probab. 6 (1996), 650–670.
Q. Zhang and G. Yin, Structural properties of Markov chains with weak and strong interactions, Stochastic Process Appl., 70 (1997), 181–197.
Q. Zhang and G. Yin, On nearly optimal controls of hybrid LQG problems, IEEE Trans. Automat. Control, 44 (1999), 2271–2282.
Q. Zhang and G. Yin, Nearly optimal asset allocation in hybrid stock-investment models, J. Optim. Theory Appl., 121 (2004), 197–222.
Q. Zhang, G. Yin, and E.K. Boukas, Controlled Markov chains with weak and strong interactions: Asymptotic optimality and application in manufacturing, J. Optim. Theory Appl., 94 (1997), 169–194.
Q. Zhang, G. Yin, and R.H. Liu, A near-optimal selling rule for a two-time-scale market model, SIAM J. Multiscale Modeling Simulation, 4 (2005), 172–193.
X.Y. Zhou, Verification theorem within the framework of viscosity solutions, J. Math. Anal. Appl. 177 (1993), 208–225.
X.Y. Zhou and G. Yin, Markowitz mean-variance portfolio selection with regime switching: A continuous-time model, SIAM J. Control Optim., 42 (2003), 1466–1482.
C. Zhu, G. Yin, and Q.S. Song, Stability of random-switching systems of differential equations, Quart. Appl. Math., 67 (2009), 201–220.
© 2013 Springer Science+Business Media, LLC
Yin, G.G., Zhang, Q. (2013). Asymptotic Expansions of Solutions for Forward Equations. In: Continuous-Time Markov Chains and Applications. Stochastic Modelling and Applied Probability, vol 37. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4346-9_4
Print ISBN: 978-1-4614-4345-2
Online ISBN: 978-1-4614-4346-9