
1 Introduction

This chapter is concerned with the analysis of the probability distributions of two-time-scale Markov chains. We aim to approximate the solution of the forward equation by sequences of functions so that any desired accuracy is reached. As alluded to in Chapter 1, we devote our attention to nonstationary Markov chains with time-varying generators. A key feature here is time-scale separation. By introducing a small parameter ε > 0, the generator and hence the corresponding Markov chain have “two times,” a usual running time t and a fast time t ∕ ε. The main approach we use is the method of matched asymptotic expansions from singular perturbation theory. We first construct a sequence of functions that approximate the solution of the forward equation well when t is outside the initial layer of order O(ε). Adopting the terminology of singular perturbation theory, this part of the approximation is called the outer expansion. We demonstrate that it is a good approximation as long as t is not in a neighborhood of 0 of order O(ε). Nevertheless, this sequence of functions does not satisfy the given initial condition, and the approximation breaks down when t is of order O(ε) or smaller. To circumvent these difficulties, we construct another sequence of functions by magnifying the behavior of the solution near 0 using the stretched fast time \(\tau = t/\varepsilon \). Following the traditional terminology in singular perturbation theory, we call this sequence of functions initial-layer corrections (or sometimes, boundary-layer corrections). It yields corrections to the outer expansion and ensures that the approximation is good in an O(ε)-neighborhood of 0. Combining the outer expansion and the initial-layer corrections, we obtain a sequence of matched asymptotic expansions. The entire process is constructive. Our aims in this chapter include:

  • Construct the outer expansions and the initial-layer corrections. This construction is often referred to as formal expansions.

  • Justify the sequence of approximations obtained by deriving the desired error bounds. To achieve this, we show that (i) the outer solutions are sufficiently smooth, (ii) the initial-layer terms all decay exponentially fast, and (iii) the error is of the desired order. Thus not only is convergence of the asymptotic expansions proved, but also the error bound is obtained.

  • Demonstrate that the error bounds hold uniformly. We would like to mention that in the usual singular perturbation theory, for example, in treating a linear system of differential equations, it is required that the system matrix be stable (i.e., all eigenvalues have negative real parts). In our setup, even for a homogeneous Markov chain, the generator (the system matrix in the equation) has an eigenvalue 0 and hence is not invertible. Thus the stability requirement is violated. Nevertheless, using Markov properties, we are still able to obtain the desired asymptotic expansions.

Before proceeding further, we present a lemma. Let \(Q(t) \in {\mathbb{R}}^{m\times m}\) be a generator, and let α(t) be a finite-state Markov chain with state space \(\mathcal{M} =\{ 1,\ldots,m\}\) and generator Q(t). Denote by

$$p(t) = (P(\alpha (t) = 1),\ldots,P(\alpha (t) = m)) \in {\mathbb{R}}^{1\times m}$$

the row vector of the probability distribution of the underlying chain at time t. Then in view of Theorem  2.5, p( ⋅) is a solution of the forward equation

$$\begin{array}{ll} &\frac{dp(t)} {dt} = p(t)Q(t), \\ &p(0) = {p}^{0}\mbox{ such that }{p}_{ i}^{0} \geq 0\mbox{ for each }i,\mbox{ and }\sum\limits_{i=1}^{m}{p}_{ i}^{0} = 1, \end{array}$$
(4.1)

where \({p}^{0} = ({p}_{1}^{0},\ldots,{p}_{m}^{0})\) and \({p}_{i}^{0}\) denotes the ith component of \({p}^{0}\). Therefore, studying the probability distribution is equivalent to examining the solution of (4.1). Note that the forward equation is linear, so the solution is unique. As a result, the following lemma is immediate. This lemma will prove useful in the subsequent study.

Lemma 4.1.

The solution p(t) of (4.1) satisfies the conditions

$$0 \leq {p}_{i}(t) \leq 1\ \mbox{ and }\ \sum\limits_{i=1}^{m}{p}_{ i}(t) = 1.$$
(4.2)
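As a quick numerical companion to Lemma 4.1, the following sketch (assuming NumPy and SciPy are available; the 3-state time-varying generator is an illustrative choice of ours, not one from the text) integrates the forward equation (4.1) and checks that the solution remains a probability vector:

```python
# A minimal sketch of Lemma 4.1: integrate dp/dt = p(t)Q(t) for an
# illustrative time-varying 3-state generator and verify (4.2).
import numpy as np
from scipy.integrate import solve_ivp

def Q(t):
    # Rows sum to zero, off-diagonal entries are nonnegative.
    a, b = 1.0 + 0.5 * np.sin(t), 2.0 + np.cos(t)
    return np.array([[-a,            a,   0.0],
                     [0.5, -(0.5 + b),     b],
                     [1.0,          1.0, -2.0]])

p0 = np.array([1.0, 0.0, 0.0])                # initial distribution
sol = solve_ivp(lambda t, p: p @ Q(t), (0.0, 5.0), p0,
                rtol=1e-10, atol=1e-12, dense_output=True)

for t in np.linspace(0.0, 5.0, 6):
    p = sol.sol(t)
    # Components stay in [0, 1] and sum to 1, as Lemma 4.1 asserts.
    assert p.min() > -1e-8 and abs(p.sum() - 1.0) < 1e-8
```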

Remark 4.2.

For the reader whose interests are mainly in differential equations, we point out that the initial condition \(\sum_{i=1}^{m}{p}_{i}^{0} = 1\) in (4.1) is not restrictive, since if \({p}^{0} = 0\), then p(t) = 0 is the only solution to (4.1). If \({p}_{i}^{0} > 0\) for some i, one may divide both sides of (4.1) by \(\sum_{i=1}^{m}{p}_{i}^{0}\ (> 0)\) and consider \(\widetilde{p}(t) = p(t)/\sum_{i=1}^{m}{p}_{i}^{0}\) in lieu of p(t).

To achieve our goal, we first treat a simple case, namely, that in which the generator is weakly irreducible. Once this is established, we proceed to the more complex cases in which the generator has several weakly irreducible classes, includes absorbing states, or includes transient states.

The rest of the chapter is arranged as follows. Section 4.2 begins with the study of the situation in which the generator is weakly irreducible. Although this is a simple case, it outlines the main ideas behind the construction of asymptotic expansions. The section constructs the formal expansions, proves the needed regularity, and establishes the error estimates. Section 4.3 develops asymptotic expansions of the underlying probability distribution for chains with recurrent states. As will be seen in the analysis to follow, extreme care must be taken to handle two-time-scale Markov chains with fast and slow components. One of the key issues is the selection of appropriate initial conditions to make the series a “matched” asymptotic expansion; here the separable form of our asymptotic expansion appears to be advantageous compared with the two-time-scale expansion. For easy reference, a subsection is also provided as a user’s guide.

Using the method of matched asymptotic expansions, Section 4.4 extends the results to include absorbing states. It demonstrates that similar techniques can be used, and that the techniques and methods of Section 4.3 are rather general and can be applied to a wide variety of cases. Section 4.5 continues the study with problems involving transient states. By treating chains having recurrent states, chains including absorbing states, and chains including transient states, we are able to characterize the probability distributions of the underlying singularly perturbed chains in fairly general cases with finite state spaces, and hence provide a comprehensive picture through these “canonical” models.

While Sections 4.3–4.5 cover most cases of practical interest with finite state spaces, the rest of the chapter makes several remarks on Markov chains with countable state spaces and on two-time-scale diffusions. In Section 4.6.1, we extend the results to processes with countable state spaces in which \(\widetilde{Q}(t)\) is a block-diagonal matrix with infinitely many blocks, each of which is finite-dimensional. Then Section 4.6.2 treats the problem in which \(\widetilde{Q}(t)\) itself is an infinite-dimensional matrix. In this case, further conditions are necessary. As in the finite-dimensional counterpart, sufficient conditions that ensure the validity of the asymptotic expansions are provided. The essential ingredients include Fredholm-alternative-like conditions and the notion of weak irreducibility. Finally, we mention related results on singularly perturbed diffusions in Section 4.7. Additional notes and remarks are given in Section 4.8.

2 Irreducible Case

We begin with the case concerning weakly irreducible generators. Let \(Q(t) \in {\mathbb{R}}^{m\times m}\) be a generator, ε > 0 be a small parameter, and suppose that αε(t) is a finite-state Markov chain with state space \(\mathcal{M} =\{ 1,\ldots,m\}\) generated by \({Q}^{\varepsilon }(t) = Q(t)/\varepsilon \). The row vector \({p}^{\varepsilon }(t) = (P({\alpha }^{\varepsilon }(t) = 1),\ldots,P({\alpha }^{\varepsilon }(t) = m)) \in {\mathbb{R}}^{1\times m}\) denotes the probability distribution of the underlying chain at time t. Then by virtue of Theorem  2.5, p ε( ⋅) is a solution of the forward equation

$$\begin{array}{ll} &\frac{d{p}^{\varepsilon }(t)} {dt} = {p}^{\varepsilon }(t){Q}^{\varepsilon }(t) ={ 1 \over \varepsilon } {p}^{\varepsilon }(t)Q(t), \\ &{p}^{\varepsilon }(0) = {p}^{0}\mbox{ such that }{p}_{ i}^{0} \geq 0\mbox{ for each }i,\mbox{ and }\sum\limits_{i=1}^{m}{p}_{ i}^{0} = 1, \end{array}$$
(4.3)

where \({p}^{0} = ({p}_{1}^{0},\ldots,{p}_{m}^{0})\) and \({p}_{i}^{0}\) denotes the ith component of \({p}^{0}\). Therefore, studying the probability distribution is equivalent to examining the solution of (4.3). Note that Lemma  4.1 continues to hold for the solution p ε(t).

As discussed in Chapters 1 and 3, the equation in (4.3) arises from various applications involving a rapidly fluctuating Markov chain governed by the generator Q(t) ∕ ε. As ε gets smaller, the Markov chain fluctuates more rapidly. Normally, the fast-changing process αε( ⋅) in an actual system is difficult to analyze. The desired limit properties, however, provide us with an alternative: we can replace the actual process by its “average” in the system under consideration. This approach has significant practical value. A fundamental question common to numerous applications involving two-time-scale Markov chains is to understand the asymptotic properties of p ε( ⋅), namely, its limit behavior as ε → 0. If Q(t) = Q, a constant matrix, and if Q is irreducible (see Definition  2.7), then for each t > 0, p ε(t) → ν, the familiar stationary distribution. For the time-varying counterpart, it is reasonable to expect that the corresponding distribution will converge to a probability distribution that mimics the main features of the distribution of stationary chains, while preserving the time-varying nature of the nonstationary system. A candidate bearing such characteristics is the quasi-stationary distribution ν(t). Recall that ν(t) is said to be a quasi-stationary distribution (see Definition  2.8) if \(\nu (t) = ({\nu }_{1}(t),\ldots,{\nu }_{m}(t)) \geq 0\) and it satisfies the equations

$$\nu (t)Q(t) = 0\mbox{ and }\sum\limits_{i=1}^{m}{\nu }_{ i}(t) = 1.$$
(4.4)

If Q(t) ≡ Q, a constant matrix, then an analytic solution of (4.3) is obtainable, since the fundamental matrix solution (see Hale [79]) takes the simple form exp(Qt); the limit behavior of p ε(t) is derivable through the solution p 0exp(Qt ∕ ε). For time-dependent Q(t), although the fundamental matrix solution still exists, it does not have a simple form. The complex integral representation is not very informative in the asymptotic study of p ε(t), except in the case m = 2. In this case, αε( ⋅) is a two-state Markov chain and the constraint \({p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) = 1\) reduces the current problem to a scalar one, so a closed-form solution is possible. However, such a technique cannot be generalized to m > 2. Let 0 < T < ∞ be a finite real number. We divide the interval [0, T] into two parts: one for t very close to 0 (in the range of an ε-layer), and the other for t bounded away from 0. The behavior of p ε( ⋅) differs significantly in these two regions. Such a division leads us to the use of matched asymptotic expansions. Not only do we prove the convergence of p ε(t) as ε → 0, but we also obtain an asymptotic series. The procedure involves constructing the regular part (outer expansion) for t away from 0 as well as the initial-layer corrections for small t, and matching these expansions by a proper choice of initial conditions.

In what follows, in addition to obtaining the zeroth-order approximation, i.e., the convergence of p ε( ⋅) to its quasi-stationary distribution, we derive higher-order approximations and error bounds. A consequence of the findings is that the convergence of the probability distribution and related occupation measures of the corresponding Markov chain takes place in an appropriate sense. The asymptotic properties of a suitably scaled occupation time and the corresponding central limit theorem for αε( ⋅) (based on the expansion) will be studied in Chapter 5.

2.1 Asymptotic Expansions

To proceed, we make the following assumptions.

  1. (A4.1)

    Given 0 < T < ∞, for each t ∈ [0, T], Q(t) is weakly irreducible, that is, the system of equations

    $$\begin{array}{ll} &f(t)Q(t) = 0, \\ &\sum\limits_{i=1}^{m}{f}_{ i}(t) = 1\end{array}$$
    (4.5)

    has a unique nonnegative solution.

  2. (A4.2)

    For some n, Q( ⋅) is (n + 1)-times continuously differentiable on [0, T], and \(({d}^{n+1}/d{t}^{n+1})Q(\cdot )\) is Lipschitz on [0, T].

Remark 4.3.

Condition (A4.2) requires that the matrix Q(t) be sufficiently smooth. This is necessary for obtaining the desired asymptotic expansion. To validate the asymptotic expansion, we need to estimate the remainder term. Thus for the nth-order approximation, we need the (n + 1)st-order smoothness.

To proceed, we first state a lemma. Its proof is in Lemma  A.2 in the appendix.

Lemma 4.4.

Consider the matrix differential equation

$${ dP(s) \over ds} = P(s)A,\ \ P(0) = I,$$
(4.6)

where \(P(s) \in {\mathbb{R}}^{m\times m}\). Suppose \(A \in {\mathbb{R}}^{m\times m}\) is a generator of a (homogeneous or stationary) finite-state Markov chain and is weakly irreducible. Then \(P(s) \rightarrow \overline{P}\) as s →∞ and

$$\left\vert\exp (As) -\overline{P}\right\vert\leq K\exp (-\widetilde{\kappa }s)\quad \mbox{ for some }\widetilde{\kappa } > 0,$$
(4.7)

where \(\overline{P} = \mathrm{1}\mathrm{l}({\overline{\nu }}_{1},\cdots \,,{\overline{\nu }}_{m}) \in {\mathbb{R}}^{m\times m},\) and \(({\overline{\nu }}_{1}\), …, \({\overline{\nu }}_{m})\) is the quasi-stationary distribution of the Markov process with generator A.
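A brief numerical illustration of Lemma 4.4 may be helpful. The sketch below (with an arbitrary weakly irreducible generator A of our own choosing, assuming NumPy and SciPy) computes the quasi-stationary distribution as the normalized left null vector of A and observes the exponential decay in (4.7):

```python
# A sketch of Lemma 4.4: exp(As) approaches the rank-one matrix
# P_bar = 1l (nu_1, ..., nu_m) at an exponential rate.
import numpy as np
from scipy.linalg import expm, null_space

A = np.array([[-1.0,  0.7,  0.3],
              [ 0.4, -0.9,  0.5],
              [ 0.2,  0.6, -0.8]])          # illustrative generator

nu = null_space(A.T)[:, 0]                  # left null vector of A
nu = nu / nu.sum()                          # quasi-stationary distribution
P_bar = np.outer(np.ones(3), nu)            # 1l (nu_1, ..., nu_m)

for s in [1.0, 5.0, 10.0, 20.0]:
    print(s, np.abs(expm(A * s) - P_bar).max())   # ~ K exp(-kappa s)
```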

Recall that \(\mathrm{1}\mathrm{l} = (1,\ldots,1)^{\prime} \in {\mathbb{R}}^{m\times 1}\) and \(({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m}) \in {\mathbb{R}}^{1\times m}.\) Thus \(\mathrm{1}\mathrm{l}({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m})\) is the usual matrix product. Recall also that an m ×m matrix P(s) is said to be a solution of (4.6) if each row of P(s) satisfies the equation. In the lemma above, if A is a constant matrix that is irreducible, then \(({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m})\) becomes the familiar stationary distribution. In general, A could be time-dependent, e.g., A = A(t). As shown in Lemma  A.4, by assuming the existence of the solution ν(t) to (4.5), it follows that ν(t) ≥ 0; that is, the nonnegativity assumption is redundant.

We seek asymptotic expansions of the form

$${p}^{\varepsilon }(t) = {\Phi }_{ n}^{\varepsilon }(t) + {\Psi }_{ n}^{\varepsilon }\left({ t \over \varepsilon }\right) + {e}_{n}^{\varepsilon }(t),$$

where e n ε(t) is the remainder,

$${\Phi }_{n}^{\varepsilon }(t) = {\varphi }_{ 0}(t) + \varepsilon {\varphi }_{1}(t) + \cdots + {\varepsilon }^{n}{\varphi }_{ n}(t),$$
(4.8)

and

$${\Psi }_{n}^{\varepsilon }\left ({ t \over \varepsilon } \right ) = {\psi }_{0}\left ({ t \over \varepsilon } \right ) + \varepsilon {\psi }_{1}\left ({ t \over \varepsilon } \right ) + \cdots + {\varepsilon }^{n}{\psi }_{ n}\left ({ t \over \varepsilon } \right ),$$
(4.9)

with the functions φ i ( ⋅) and ψ i ( ⋅) to be determined in the sequel. We now state the main result of this section.

Theorem 4.5.

Suppose that (A4.1) and (A4.2) are satisfied. Denote the unique solution of (4.3) by p ε (⋅). Then two sequences of functions φ i (⋅) and ψ i (⋅), 0 ≤ i ≤ n, can be constructed such that

  • φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T];

  • for each i, there is a κ0 > 0 such that

    $$\vert {\psi }_{i}\left ({ t \over \varepsilon } \right )\vert \leq K\exp \left (-\frac{{\kappa }_{0}t} {\varepsilon } \right );$$
  • the following estimate holds:

    $$ \sup\limits_{t\in [0,T]}{\biggl |{p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )\biggr |} \leq K{\varepsilon }^{n+1}.$$
    (4.10)

Remark 4.6.

The method described in what follows gives an explicit construction of the functions φi(⋅) and ψi(⋅) for i ≤ n. Thus the proof to be presented is constructive. Our plan is first to obtain these sequences, then to validate the smoothness and decay properties above, and finally to derive the error bound by showing that the remainder

$${\biggl |{p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ( \frac{t} {\varepsilon }\right )\biggr |}$$

is of order \(O({\varepsilon }^{n+1})\) uniformly in t.

It will be seen from the subsequent development that φ0(t) is equal to the quasi-stationary distribution, that is, φ0(t) = ν(t). In particular, if n = 0 in the above theorem, we have the following result.

Corollary 4.7.

Suppose Q(⋅) satisfies (A4.1), is continuously differentiable on [0,T], and (d∕dt)Q(⋅) is Lipschitz on [0,T]. Then for all t > 0,

$$ \lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon }(t) = \nu (t) = {\varphi }_{ 0}(t),$$
(4.11)

i.e., p ε (⋅) converges to the quasi-stationary distribution.

Remark 4.8.

The theorem establishes the convergence of pε(⋅) to φ0(⋅), as well as the rate of convergence. In addition to the zeroth-order approximation, we have the first-order approximation, the second-order approximation, and so on. In fact, the difference pε(⋅) − φ0(⋅) is characterized by the initial-layer term ψ0(⋅) and the associated error bound.

If the initial condition p 0 is chosen to be exactly φ 0 (0), then in the expansion, the zeroth-order initial layer ψ 0 (⋅) vanishes. This cannot be expected in general, however. Even if ψ 0 (⋅) = 0, the remaining initial-layer terms ψ i (⋅), i ≥ 1, will generally still be present.

To proceed, we define an operator \({\mathcal{L}}^{\varepsilon }\) by

$${\mathcal{L}}^{\varepsilon }f = \varepsilon \frac{df} {dt} - fQ,$$
(4.12)

for any smooth row-vector-valued function f( ⋅). Then \({\mathcal{L}}^{\varepsilon }f = 0\) iff f is a solution to the differential equation in (4.3). The proof of Theorem  4.5 is divided into the following steps.

  1. Construct the asymptotic series, i.e., find φ i ( ⋅) and ψ i ( ⋅), for i ≤ n. For the purpose of evaluating the remainder, we need to calculate two extra terms φ n + 1( ⋅) and ψ n + 1( ⋅). This will become clear when we carry out the error analysis.

  2. Obtain the regularity of φ i ( ⋅) and ψ i ( ⋅) by proving that φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T] and that ψ i ( ⋅) decays exponentially fast.

  3. Carry out the error analysis and justify that the remainder has the desired property.

2.2 Outer Expansion

We begin with the construction of Φ n ε( ⋅) in the asymptotic expansion. We call it the outer expansion or the regular part of the expansion. Consider the differential equation

$${\mathcal{L}}^{\varepsilon }{\Phi }_{ n+1}^{\varepsilon } = 0$$

where \({\mathcal{L}}^{\varepsilon }\) is given by (4.12).

By equating the coefficients of \({\varepsilon }^{k}\), for \(k = 0,1,\ldots,n + 1\), we obtain

$$\begin{array}{ll} &{\varepsilon }^{0} :\ \ {\varphi }_{ 0}(t)Q(t) = 0, \\ &{\varepsilon }^{1} :\ \ {\varphi }_{ 1}(t)Q(t) ={ d{\varphi }_{0}(t) \over dt}, \\ &\ \qquad \ \cdots \\ &{\varepsilon }^{k} :\ \ {\varphi }_{ k}(t)Q(t) ={ d{\varphi }_{k-1}(t) \over dt},\ \mbox{ for }k = 1,\ldots,n + 1.\end{array}$$
(4.13)

Remark 4.9.

First, one has to make sure that the equations above have solutions, that is, a consistency condition needs to be verified. For each t ∈ [0,T], denote the null space of Q(t) by N(Q(t)). Note that the weak irreducibility of Q(t) implies that

$$\mbox{ rank}(Q(t)) = m - 1,$$

thus

$$\mbox{ dim}(N(Q(t))) = 1.$$

It is easily seen that N(Q(t)) is spanned by the vector 1l. By virtue of the Fredholm alternative (see Corollary  A.38), the second equation in (4.13) has a solution only if its right-hand side, namely, (d∕dt)φ0(t), is orthogonal to N(Q(t)). Since N(Q(t)) is spanned by 1l,

$${\varphi }_{0}(t)\mathrm{1}\mathrm{l} = 1$$

and

$${ d{\varphi }_{0}(t) \over dt} \mathrm{1}\mathrm{l} ={ d\left ({\varphi }_{0}(t)\mathrm{1}\mathrm{l}\right ) \over dt} = 0,$$

the orthogonality is easily verified. Similar arguments hold for the rest of the equations. The consistency in fact is rather crucial. Without such a condition, one would not be able to solve the equations in (4.13). This point will be made again when we deal with weak and strong interaction models in Section 4.3.

Recall that the components of p ε( ⋅) are probabilities (see (4.2)). In what follows, we show that all these φ i ( ⋅) can be determined by (4.13) and (4.2).

Note that rank\((Q(t)) = m - 1\). Thus Q(t) is singular, and the equations in (4.13) are not uniquely solvable. For example, the first equation in (4.13) cannot be solved uniquely. Nevertheless, this equation together with the constraint \(\sum_{i=1}^{m}{\varphi }_{0}^{i}(t) = 1\) leads to a unique solution, namely, the quasi-stationary distribution.

In fact, a direct consequence of (A4.1) and (A4.2) is that the weak irreducibility of Q(t) is uniform in the following sense: for any t ∈ [0, T], if any column of Q(t) is replaced by \(\mathrm{1}\mathrm{l} \in {\mathbb{R}}^{m\times 1}\), the resulting determinant Δ(t) satisfies | Δ(t) |  > 0, since (4.5) has only one solution and \(\sum_{j=1}^{m}{q}_{ij}(t) = 0\) for each i = 1,…,m. Moreover, in view of the continuity of Q(t) on the compact interval [0, T], there is a number c > 0 such that | Δ(t) | ≥ c > 0 on [0, T]. We can replace any one of the first m equations of the system φ0(t)Q(t) = 0 by the equation \(\sum_{i=1}^{m}{\varphi }_{0}^{i}(t) = 1\); the determinant Δ(t) of the resulting coefficient matrix then satisfies | Δ(t) | ≥ c > 0 for all t ∈ [0, T]. To illustrate, we may suppose without loss of generality that the mth equation is the one replaced. Then we have

$$\begin{array}{ll} &{q}_{11}(t){\varphi }_{0}^{1}(t) + \cdots + {q}_{ m1}(t){\varphi }_{0}^{m}(t) = 0, \\ &{q}_{12}(t){\varphi }_{0}^{1}(t) + \cdots + {q}_{ m2}(t){\varphi }_{0}^{m}(t) = 0, \\ &\quad \ \cdots \\ &{q}_{1,m-1}(t){\varphi }_{0}^{1}(t) + \cdots + {q}_{ m,m-1}(t){\varphi }_{0}^{m}(t) = 0, \\ &{\varphi }_{0}^{1}(t) + \cdots + {\varphi }_{ 0}^{m}(t) = 1.\end{array}$$
(4.14)

The determinant of the coefficient matrix in (4.14) is

$$\begin{array}{ll} &\Delta (t) =\left\vert \begin{array}{*{10}c} {q}_{11}(t) & {q}_{21}(t) &\cdots & {q}_{m1}(t) \\ {q}_{12}(t) & {q}_{22}(t) &\cdots & {q}_{m2}(t) \\ \vdots & \vdots &\cdots & \vdots \\ {q}_{1,m-1}(t)&{q}_{2,m-1}(t)&\cdots &{q}_{m,m-1}(t) \\ 1 & 1 &\cdots & 1 \end{array}\right\vert \end{array}$$
(4.15)

and satisfies | Δ(t) | ≥ c > 0. Now by Cramer’s rule, for each 1 ≤ i ≤ m,

$${\varphi }_{0}^{i}(t) ={ 1 \over \Delta (t)} \left\vert \begin{array}{*{10}c} {q}_{11}(t) &\cdots & 0 &\cdots & {q}_{m1}(t) \\ {q}_{12}(t) &\cdots & 0 &\cdots & {q}_{m2}(t)\\ \vdots &\cdots & \vdots &\cdots & \vdots \\ {q}_{1,m-1}(t)&\cdots & 0 &\cdots &{q}_{m,m-1}(t) \\ 1 &\cdots &\underbrace{{1}}_{i\mathrm{th\;column}} & \cdots & 1 \end{array} \right\vert,$$

that is, the ith column of Δ(t) in (4.15) is replaced by \((0,\ldots,0,1)^{\prime} \in {\mathbb{R}}^{m\times 1}\). By the smoothness assumption on Q( ⋅), it is plain that φ0( ⋅) is (n + 1)-times continuously differentiable on [0, T].
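As a concrete transcription of this recipe (a sketch, with an illustrative two-state generator of our own choosing rather than one from the text), one linear solve per time point suffices:

```python
# A sketch of (4.14): replace the m-th equation of phi_0(t)Q(t) = 0 by
# the normalization sum_i phi_0^i(t) = 1, then solve the linear system.
import numpy as np

def Q(t):
    lam, mu = 1.0 + 0.5 * np.sin(t), 2.0 + np.cos(t)
    return np.array([[-lam, lam], [mu, -mu]])

def phi0(t):
    m = Q(t).shape[0]
    M = Q(t).T.copy()       # row j of M encodes q_{1j} x_1 + ... + q_{mj} x_m = 0
    M[m - 1, :] = 1.0       # replace the m-th equation by x_1 + ... + x_m = 1
    b = np.zeros(m); b[m - 1] = 1.0
    return np.linalg.solve(M, b)

print(phi0(0.0))            # (mu/(lam+mu), lam/(lam+mu)) at t = 0, i.e., (0.75, 0.25)
```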

The foregoing method can be used to solve other equations in (4.13) analogously. Owing to the smoothness of φ0( ⋅), (d ∕ dt0(t) exists, and we can proceed to obtain φ1( ⋅). Repeat the procedure above, and continue inductively. For each k ≥ 1,

$$\begin{array}{ll} &\sum\limits_{i=1}^{m}{\varphi }_{ k}^{i}(t){q}_{ ij}(t) ={ d{\varphi }_{k-1}^{j}(t) \over dt} \mbox{ for }j = 1,\ldots,m, \\ &\sum\limits_{i=1}^{m}{\varphi }_{ k}^{i}(t) = 0.\end{array}$$
(4.16)

Note that φ k − 1 j( ⋅) has been found, so \((d/dt){\varphi }_{k-1}^{j}(t)\) is a known function. After a suitable replacement of one of the first m equations by the last equation in (4.16), the determinant Δ(t) of the resulting coefficient matrix satisfies | Δ(t) | ≥ c > 0. We obtain for each 1 ≤ i ≤ m,

$${\varphi }_{k}^{i}(t) ={ 1 \over \Delta (t)} \left\vert \begin{array}{*{10}c} {q}_{11}(t) &\cdots &{ d{\varphi }_{k-1}^{1}(t) \over dt} &\cdots & {q}_{m1}(t) \\ {q}_{12}(t) &\cdots &{ d{\varphi }_{k-1}^{2}(t) \over dt} &\cdots & {q}_{m2}(t) \\ \vdots &\cdots & \vdots &\cdots & \vdots \\ {q}_{1,m-1}(t)&\cdots &{ d{\varphi }_{k-1}^{m-1}(t) \over dt} &\cdots &{q}_{m,m-1}(t) \\ & & & \\ 1 &\cdots & \underbrace{{0}}_{i\mathrm{th\;column}} & \cdots & 1 \end{array} \right\vert.$$

Hence φ k ( ⋅) is \((n + 1 - k)\)-times continuously differentiable on [0, T]. Thus we have constructed a sequence of functions φ k (t) that are \((n + 1 - k)\)-times continuously differentiable on [0, T] for \(k = 0,1,\ldots,n + 1\).

Remark 4.10.

The method used above is convenient for computational purposes. An alternative way of obtaining the sequence φk(t) is as follows. For example, to solve

$${\varphi }_{0}(t)Q(t) = 0,\ \ \sum\limits_{j=1}^{m}{\varphi }_{ 0}^{j}(t) = 1,$$

define \({Q}_{c}(t) = (\mathrm{1}\mathrm{l}\vdots Q(t)) \in {\mathbb{R}}^{m\times (m+1)}\). Then the equation above can be written as

$${\varphi }_{0}(t){Q}_{c}(t) = (1,0,\ldots,0).$$

Note that Qc(t)Q′c(t) has full rank m owing to weak irreducibility. Thus the solution of the equation is

$${\varphi }_{0}(t) = (1,0,\ldots,0){Q^{\prime}}_{c}(t){[{Q}_{c}(t){Q^{\prime}}_{c}(t)]}^{-1}.$$

We can obtain all other φk(t) for \(k = 1,\ldots,n + 1\), similarly.
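The formula of this remark is equally easy to transcribe. The following sketch (assuming NumPy; the generator is again an illustrative choice of ours) evaluates it directly:

```python
# A sketch of Remark 4.10: phi_0(t) = (1,0,...,0) Q_c'(t) [Q_c(t) Q_c'(t)]^{-1},
# where Q_c(t) = (1l : Q(t)) is the m x (m+1) augmented matrix.
import numpy as np

def phi0_via_Qc(Qt):
    m = Qt.shape[0]
    Qc = np.hstack([np.ones((m, 1)), Qt])    # augment Q(t) with the column 1l
    e1 = np.zeros(m + 1); e1[0] = 1.0        # the row vector (1, 0, ..., 0)
    return e1 @ Qc.T @ np.linalg.inv(Qc @ Qc.T)

Qt = np.array([[-1.0, 1.0], [3.0, -3.0]])
print(phi0_via_Qc(Qt))                       # (0.75, 0.25)
```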

The regular part Φ n ε( ⋅) is a good approximation to p ε( ⋅) when t is bounded away from 0. When t approaches 0, an initial layer (or a boundary layer) develops and the approximation breaks down. To accommodate this situation, an initial-layer correction, i.e., a sequence of functions ψ k (t ∕ ε) for \(k = 0,1,\ldots,n + 1\) needs to be constructed.

2.3 Initial-Layer Correction

This section is on the construction of the initial-layer terms. The presentation consists of two parts. We obtain the sequence {ψ k ( ⋅)} in the first subsection, and derive the exponential decay property in the second subsection.

Construction of ψ k (⋅). Following usual practice in singular perturbation theory, define the stretched (or rescaled) time variable by

$$\tau ={ t \over \varepsilon }.$$
(4.17)

Note that τ → ∞ as ε → 0 for any given t > 0.

Consider the differential equation

$${\mathcal{L}}^{\varepsilon }{\Psi }_{ n+1}^{\varepsilon } =\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\mathcal{L}}^{\varepsilon }{\psi }_{ i} = 0.$$

Using the stretched time variable τ, we arrive at

$${ d{\Psi }_{n+1}^{\varepsilon }(\tau ) \over d\tau } = {\Psi }_{n+1}^{\varepsilon }(\tau )Q(\varepsilon \tau ).$$

Owing to the smoothness of Q( ⋅), a truncated Taylor expansion of Q(ετ) about t = 0 leads to

$$Q(t) = Q(\varepsilon \tau ) =\sum\limits_{i=0}^{n+1}{ {(\varepsilon \tau )}^{i} \over i!} { {d}^{i}Q(0) \over d{t}^{i}} + {R}_{n+1}(\varepsilon \tau ),$$

where

$${R}_{n+1}(t) ={ {t}^{n+1} \over (n + 1)!} \left ({ {d}^{n+1}Q(\xi ) \over d{t}^{n+1}} -{ {d}^{n+1}Q(0) \over d{t}^{n+1}} \right ),$$

for some 0 < ξ < t. In view of (A4.2),

$${R}_{n+1}(t) = O({t}^{n+2})\mbox{ uniformly in }t \in [0,T].$$

Drop the term R n + 1(t) and use the first n + 2 terms to get

$${ d{\Psi }_{n+1}^{\varepsilon }(\tau ) \over d\tau } = {\Psi }_{n+1}^{\varepsilon }(\tau )\left (\sum\limits_{i=0}^{n+1}{ {(\varepsilon \tau )}^{i} \over i!} { {d}^{i}Q(0) \over d{t}^{i}} \right ).$$

Similar to the previous section, equating coefficients of \({\varepsilon }^{k}\) for \(k = 0,1,\ldots,n + 1\), we have

$$\begin{array}{ll} &{\varepsilon }^{0} :\ { d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )Q(0), \\ &{\varepsilon }^{1} :\ { d{\psi }_{1}(\tau ) \over d\tau } = {\psi }_{1}(\tau )Q(0) + \tau {\psi }_{0}(\tau ){ dQ(0) \over dt}, \\ &\ \qquad \ \cdots \\ &{\varepsilon }^{k} :\ { d{\psi }_{k}(\tau ) \over d\tau } = {\psi }_{k}(\tau )Q(0) + {r}_{k}(\tau ), \end{array}$$
(4.18)

where r k (τ) is a function having the form

$$\begin{array}{ll} {r}_{k}(\tau )& ={ {\tau }^{k} \over k!} {\psi }_{0}(\tau ){ {d}^{k}Q(0) \over d{t}^{k}} + \cdots + \tau {\psi }_{k-1}(\tau ){ dQ(0) \over dt} \\ & =\sum\limits_{i=1}^{k}{ {\tau }^{i} \over i!} {\psi }_{k-i}(\tau ){ {d}^{i}Q(0) \over d{t}^{i}}. \end{array}$$
(4.19)

These equations together with appropriate initial conditions allow us to determine the ψ k ( ⋅)’s. For constructing φ k ( ⋅), a number of algebraic equations are solved, whereas in determining ψ k ( ⋅), one has to solve a number of differential equations instead. Two points are worth mentioning in connection with (4.18). First, the time-varying differential equation is replaced by one with constant coefficients; the solution thus can be written explicitly. The second point concerns the selection of the initial conditions for ψ k ( ⋅), with \(k = 0,1,\ldots,n + 1\). We choose the initial conditions so that the initial data of the asymptotic expansion “match” those of the differential equation (4.3). To be more specific,

$$\begin{array}{rl} &{\varphi }_{0}(0) + {\psi }_{0}(0) = {p}^{0},\mbox{ and } \\ &{\varphi }_{k}(0) + {\psi }_{k}(0) = 0\mbox{ for }k = 1,2,\ldots,n + 1.\end{array}$$

Corresponding to ε0, solving

$$\begin{array}{rl} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )Q(0), \\ &{\psi }_{0}(0) = {p}^{0} - {\varphi }_{ 0}(0), \end{array}$$

where p 0 is the initial data given in (4.3), one has

$${\psi }_{0}(\tau ) = ({p}^{0} - {\varphi }_{ 0}(0))\exp \left (Q(0)\tau \right ).$$
(4.20)

Continuing in this fashion, for \(k = 1,\ldots,n + 1\), we obtain

$$\begin{array}{rl} &{ d{\psi }_{k}(\tau ) \over d\tau } = {\psi }_{k}(\tau )Q(0) + {r}_{k}(\tau ), \\ &{\psi }_{k}(0) = -{\varphi }_{k}(0)\end{array}$$

In the equations above, we purposely separated Q(0) from the term r k (τ). As a result, the equations are linear systems with a constant matrix Q(0) and time-varying forcing terms. This is useful for our subsequent investigation.

For \(k = 1,2,\ldots,n + 1\), the solutions are given by

$$\begin{array}{ll} {\psi }_{k}(\tau )& = -{\varphi }_{k}(0)\exp (Q(0)\tau ) \\ &\qquad +{ \int }_{0}^{\tau }{r}_{ k}(s)\exp \left (Q(0)(\tau - s)\right )ds\end{array}$$
(4.21)

The construction of ψ k ( ⋅) for \(k = 0,1,\ldots,n + 1\), and hence the construction of the asymptotic series is complete.
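The construction can also be followed step by step numerically. In the sketch below (a two-state illustration; the values of dQ(0)∕dt and φ1(0) are placeholders of our own choosing, not quantities from the text), ψ0 is evaluated from (4.20) and ψ1 from (4.21) by quadrature:

```python
# A sketch of (4.20)-(4.21): psi_0 in closed form, psi_1 by numerical
# quadrature of the variation-of-constants integral.
import numpy as np
from scipy.linalg import expm

Q0  = np.array([[-1.0, 1.0], [3.0, -3.0]])   # Q(0)
dQ0 = np.array([[-0.5, 0.5], [0.0, 0.0]])    # dQ(0)/dt (assumed placeholder)
p0  = np.array([1.0, 0.0])
phi0_at0 = np.array([0.75, 0.25])            # phi_0(0) for this Q(0)
phi1_at0 = np.array([0.1, -0.1])             # phi_1(0) (assumed placeholder)

def psi0(tau):
    return (p0 - phi0_at0) @ expm(Q0 * tau)

def psi1(tau, n=400):
    # r_1(s) = s psi_0(s) dQ(0)/dt; psi_1 solves (4.18) for k = 1.
    s = np.linspace(0.0, tau, n)
    vals = np.array([si * (psi0(si) @ dQ0) @ expm(Q0 * (tau - si)) for si in s])
    return -phi1_at0 @ expm(Q0 * tau) + np.trapz(vals, s, axis=0)

for tau in [0.0, 1.0, 5.0, 10.0]:
    print(tau, psi0(tau), psi1(tau))         # both decay exponentially in tau
```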

2.4 Exponential Decay of ψ k ( ⋅)

This subsection concerns the exponential decay of ψ k ( ⋅). At first glance, this seems troublesome since Q(0) has a zero eigenvalue. Nevertheless, a probabilistic argument helps us derive the desired property. Two key points in the proof below are the utilization of orthogonality and the repeated application of the approximation of exp(Q(0)τ) in Lemma  4.4.

By virtue of Assumption (A4.1), the finite-state Markov chain generated by Q(0) is weakly irreducible. Identifying Q(0) with the matrix A in Lemma  4.4 yields that

$$\exp (Q(0)\tau ) \rightarrow \overline{P}\mbox{ as }\tau \rightarrow \infty,$$

where \(\overline{P} = \mathrm{1}\mathrm{l}\overline{\nu }\), and \(\overline{\nu } = ({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m})\) is the quasi-stationary distribution corresponding to the constant matrix Q(0).

Proposition 4.11.

Under the conditions of Theorem  4.5 , for each 0 ≤ k ≤ n + 1, there exist a nonnegative real polynomial c 2k (τ) of degree 2k and a positive number κ 0,0 > 0 such that

$$\vert {\psi }_{k}(\tau )\vert \leq {c}_{2k}(\tau )\exp (-{\kappa }_{0,0}\tau ).$$
(4.22)

Proof: First of all, note that

$$\sum\limits_{i=1}^{m}{p}_{ i}^{0} = 1\mbox{ and }\sum\limits_{i=1}^{m}{\varphi }_{ 0}^{i}(0) = 1.$$

It follows that

$$\sum\limits_{i=1}^{m}{\psi }_{ 0}^{i}(0) =\sum\limits_{i=1}^{m}{p}_{ i}^{0} -\sum\limits_{i=1}^{m}{\varphi }_{ 0}^{i}(0) = 0.$$

That is, ψ0(0) is orthogonal to 1l. Consequently, \({\psi }_{0}(0)\overline{P} = 0\) and by virtue of Lemma  4.4 (with A = Q(0)), for some \({\kappa }_{0,0} :=\widetilde{ \kappa } > 0\),

$$\begin{array}{ll} \left \vert {\psi }_{0}(\tau )\right \vert & = \left \vert {\psi }_{0}(0)\exp (Q(0)\tau )\right \vert \\ &\leq \left \vert {\psi }_{0}(0)\overline{P}\right \vert + \left \vert {\psi }_{0}(0)(\exp (Q(0)\tau ) -\overline{P})\right \vert \\ & = \left \vert {\psi }_{0}(0)(\exp (Q(0)\tau ) -\overline{P})\right \vert \leq K\exp (-{\kappa }_{0,0}\tau ).\end{array}$$
(4.23)

Note that

$$Q(t)\mathrm{1}\mathrm{l} = 0\mbox{ for all }t \geq 0.$$

Differentiating this equation repeatedly leads to

$${ {d}^{k}Q(t) \over d{t}^{k}} \mathrm{1}\mathrm{l} ={ {d}^{k}(Q(t)\mathrm{1}\mathrm{l}) \over d{t}^{k}} = 0.$$

Hence, it follows that

$${ {d}^{k}Q(0) \over d{t}^{k}} \mathrm{1}\mathrm{l} = 0\ \mbox{ and }\ { {d}^{k}Q(0) \over d{t}^{k}} \overline{P} = 0,$$

for each 0 ≤ k ≤ n + 1. Owing to Lemma  4.4 and (4.21),

$$\begin{array}{rl} \vert {\psi }_{1}(\tau )\vert \leq &\left\vert {\varphi }_{1}(0)\exp (Q(0)\tau )\right\vert \\ & + \left\vert{\int }_{0}^{\tau }{\psi }_{ 0}(s){ dQ(0) \over dt} \left (\overline{P} + \left (\exp (Q(0)(\tau - s)) -\overline{P}\right )\right )s\,ds\right\vert \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) \\ & +{ \int }_{0}^{\tau }\vert {\psi }_{ 0}(s)\vert \left\vert { dQ(0) \over dt} \left (\exp (Q(0)(\tau - s)) -\overline{P}\right )\right\vert s\,ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) + K{\int }_{0}^{\tau }\exp (-{\kappa }_{ 0,0}s)\exp (-{\kappa }_{0,0}(\tau - s))s\,ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) + K{\tau }^{2}\exp (-{\kappa }_{ 0,0}\tau ) \leq {c}_{2}(\tau )\exp (-{\kappa }_{0,0}\tau ), \end{array}$$

for some nonnegative polynomial c 2(τ) of degree 2.

Note that r k (s) is orthogonal to \(\overline{P}\). By induction, for any k with \(k = 1,\ldots,n + 1\),

$$\begin{array}{ll} &\vert {\psi }_{k } (\tau )\vert \\ \leq &\vert {\varphi }_{k}(0)\exp (Q(0)\tau )\vert +{ \int }_{0}^{\tau }\left\vert {r}_{ k}(s)\left (\exp (Q(0)(\tau - s)) -\overline{P}\right )\right\vert ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) +\sum\limits_{i=1}^{k}{ 1 \over i!} {\int }_{0}^{\tau }{s}^{i}\vert {\psi }_{ k-i}(s)\vert \\ & \quad \times \left\vert { {d}^{i}Q(0) \over d{t}^{i}} \left (\exp (Q(0)(\tau - s)) -\overline{P}\right )\right\vert ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) + K\sum\limits_{i=1}^{2k-1}{ \int }_{0}^{\tau }{s}^{i}\exp (-{\kappa }_{ 0,0}\tau )ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) + K\sum\limits_{i=1}^{2k}{\tau }^{i}\exp (-{\kappa }_{ 0,0}\tau ) \leq {c}_{2k}(\tau )\exp (-{\kappa }_{0,0}\tau ),\end{array}$$

where c 2k (τ) is a nonnegative polynomial of degree 2k. This completes the proof of the proposition. □ 

Since n is a finite integer, the growth of c 2k (τ) for 0 ≤ k ≤ n + 1 is much slower than exponential. Thus the following corollary is in force.

Corollary 4.12.

For each 0 ≤ k ≤ n + 1, with κ 0,0 given in Proposition  4.11,

$$\vert {\psi }_{k}(\tau )\vert \leq K\exp \left (-{\kappa }_{0}\tau \right ),\mbox{ for some }{\kappa }_{0}\mbox{ with }0 < {\kappa }_{0} < {\kappa }_{0,0}.$$

2.5 Asymptotic Validation

Recall that \({\mathcal{L}}^{\varepsilon }f = \varepsilon (d/dt)f - fQ\). Then we have the following lemma.

Lemma 4.13.

Suppose that for some 0 ≤ k ≤ n + 1,

$$ \sup\limits_{t\in [0,T]}\vert {\mathcal{L}}^{\varepsilon }{v}^{\varepsilon }(t)\vert = O\left ({\varepsilon }^{k+1}\right )\ \mbox{ and }\ {v}^{\varepsilon }(0) = 0.$$

Then

$$ \sup\limits_{t\in [0,T]}\vert {v}^{\varepsilon }(t)\vert = O\left ({\varepsilon }^{k}\right ).$$

Proof: Let ηε( ⋅) be a function satisfying \( \sup\limits_{t\in [0,T]}\vert {\eta }^{\varepsilon }(t)\vert = O\left ({\varepsilon }^{k+1}\right )\). Consider the differential equation

$$\begin{array}{ll} &{\mathcal{L}}^{\varepsilon }{v}^{\varepsilon }(t) = {\eta }^{\varepsilon }(t), \\ &{v}^{\varepsilon }(0) = 0.\end{array}$$
(4.24)

Then the solution of (4.24) is given by

$${v}^{\varepsilon }(t) ={ 1 \over \varepsilon } {\int }_{0}^{t}{\eta }^{\varepsilon }(s){X}^{\varepsilon }(t,s)ds,$$

where X ε(t, s) is a principal matrix solution. Recall that (see Hale [79, p. 80]) a fundamental matrix solution of the differential equation is an invertible matrix each row of which is a solution of the equation; a principal matrix solution is a fundamental matrix solution with initial value the identity matrix. In view of Lemma  4.1,

$$\vert {X}^{\varepsilon }(t,s)\vert \leq K\quad \mbox{ for all }t,s \in [0,T].$$

Therefore, we have the inequalities

$$ \sup\limits_{t\in [0,T]}\vert {v}^{\varepsilon }(t)\vert \leq { K \over \varepsilon } \sup\limits_{t\in [0,T]}{ \int }_{0}^{t}\vert {\eta }^{\varepsilon }(s)\vert ds \leq K{\varepsilon }^{k}.$$

The proof of the lemma is thus complete. □ 

Recall that the vector-valued “error” or remainder e n ε(t) is defined by

$${e}_{n}^{\varepsilon }(t) = {p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right ),$$
(4.25)

where p ε( ⋅) is the solution of (4.3), and φ i ( ⋅) and ψ i ( ⋅) are constructed in (4.13) and (4.18). It remains to show that \({e}_{n}^{\varepsilon }(t) = O\left ({\varepsilon }^{n+1}\right )\). To do so, we utilize Lemma  4.13 as a bridge. It should be pointed out, however, that to get the correct order for the remainder, a trick involving “backing up one step” is needed. The details follow.

Proposition 4.14.

Assume (A4.1) and (A4.2). Then for each k = 0,…,n,

$$ \sup\limits_{t\in [0,T]}\vert {e}_{k}^{\varepsilon }(t)\vert = O({\varepsilon }^{k+1}).$$

Proof: We begin with

$${e}_{1}^{\varepsilon }(t) = {p}^{\varepsilon }(t) - {\varphi }_{ 0}(t) - \varepsilon {\varphi }_{1}(t) - {\psi }_{0}\left ( \frac{t} {\varepsilon }\right ) - \varepsilon {\psi }_{1}\left ( \frac{t} {\varepsilon }\right ).$$
(4.26)

We will use the exponential decay property of ψ i (τ) given in Corollary  4.12. Clearly, e 1 ε(0) = 0, and hence the condition of Lemma  4.13 on the initial data is satisfied. By virtue of the exponential decay property of ψ i ( ⋅) in conjunction with the defining equations of φ i ( ⋅) and ψ i ( ⋅),

$$\begin{array}{rl} {\mathcal{L}}^{\varepsilon }{e}_{1}^{\varepsilon }(t)& = -\biggl [\varepsilon { d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)Q(t) + {\varepsilon }^{2}{ d{\varphi }_{1}(t) \over dt} - \varepsilon {\varphi }_{1}(t)Q(t) \\ &\qquad + \varepsilon { d \over dt} {\psi }_{0}\left ( \frac{t} {\varepsilon }\right ) - {\psi }_{0}\left ( \frac{t} {\varepsilon }\right )Q(t) + {\varepsilon }^{2}{ d \over dt} {\psi }_{1}\left ( \frac{t} {\varepsilon }\right ) \\ &\qquad - \varepsilon {\psi }_{1}\left ( \frac{t} {\varepsilon }\right )Q(t)\biggr ] \\ & = -{\varepsilon }^{2}{ d{\varphi }_{1}(t) \over dt} + {\psi }_{0}\left ( \frac{t} {\varepsilon }\right )\biggl [Q(t) - Q(0) - t{ dQ(0) \over dt} \biggr ] \\ &\qquad + \varepsilon {\psi }_{1}\left ( \frac{t} {\varepsilon }\right )[Q(t) - Q(0)]\end{array}$$

For the term involving ψ0(t ∕ ε), using a Taylor expansion on Q(t) yields that for some ξ ∈ (0, t)

$$\left\vert Q(t) - Q(0) - t{ dQ(0) \over dt} \right\vert = \left\vert { 1 \over 2} \left({ {d}^{2}Q(\xi ) \over d{t}^{2}} \right){t}^{2}\right\vert \leq K{t}^{2}.$$

Owing to the exponential decay property of ψ i ( ⋅), the fact that φ1( ⋅) is n-times continuously differentiable on [0, T], and the above estimate, we have

$$\vert {\mathcal{L}}^{\varepsilon }{e}_{ 1}^{\varepsilon }(t)\vert \leq K\left({\varepsilon }^{2} + (\varepsilon t + {t}^{2})\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right)\right).$$

Moreover, for any \(k = 0,1,2,\ldots,n + 1\), it is easy to see that

$${t}^{k}\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right) = {\varepsilon }^{k}\left( \frac{t} {\varepsilon }\right)^{k}\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right) \leq K{\varepsilon }^{k}.$$
(4.27)
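Indeed, with \(\tau = t/\varepsilon \), the left-hand side equals \({\varepsilon }^{k}{\tau }^{k}\exp (-{\kappa }_{0}\tau )\), and for k ≥ 1 an elementary maximization (the function \({\tau }^{k}\exp (-{\kappa }_{0}\tau )\) attains its maximum at τ = k ∕ κ0) gives

$$\sup\limits_{\tau \geq 0}{\tau }^{k}\exp (-{\kappa }_{0}\tau ) ={ \left({ k \over {\kappa }_{0}e} \right)}^{k} < \infty,$$

while for k = 0 the bound is trivial; thus the constant K in (4.27) may be taken as \({(k/({\kappa }_{0}e))}^{k}\).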

This implies \({\mathcal{L}}^{\varepsilon }{e}_{1}^{\varepsilon }(t) = O({\varepsilon }^{2})\) uniformly in t. Thus, e 1 ε(t) = O(ε) by virtue of Lemma  4.13 and the bound is uniform in t ∈ [0, T].

We now go back one step to show that the zeroth-order approximation also possesses the correct error estimate, that is, e 0 ε(t) = O(ε). Note that the desired order seems to be difficult to obtain directly, and as a result the back-tracking is employed.

Note that

$${e}_{1}^{\varepsilon }(t) = {e}_{ 0}^{\varepsilon }(t) - \varepsilon {\varphi }_{ 1}(t) - \varepsilon {\psi }_{1}\left ({ t \over \varepsilon } \right ).$$
(4.28)

However, the smoothness of φ1( ⋅) and the exponential decay of ψ1( ⋅) imply that

$$\varepsilon {\varphi }_{1}(t) + \varepsilon {\psi }_{1}\left ({ t \over \varepsilon } \right ) = O(\varepsilon )\quad \mbox{ uniformly in }t.$$
(4.29)

Thus e 0 ε(t) = O(ε) uniformly in t.

Proceeding analogously, we obtain

$$\begin{array}{ll} &{\mathcal{L}}^{\varepsilon } {e}_{ n+1}^{\varepsilon } \\ & = {\mathcal{L}}^{\varepsilon }\left ({p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\psi }_{ i}\left ( \frac{t} {\varepsilon }\right )\right ) \\ & = -\varepsilon \left (\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{ d{\varphi }_{i}(t) \over dt} +\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{ d \over dt} {\psi }_{i}\left ( \frac{t} {\varepsilon }\right )\right ) \\ &\quad + \left (\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t) +\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\psi }_{ i}\left ( \frac{t} {\varepsilon }\right )\right )Q(t) \\ & = -{\varepsilon }^{n+2}{ d{\varphi }_{n+1}(t) \over dt} + \left [\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)Q(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i+1}{\varphi }_{ i+1}(t)Q(t)\right ] \\ &\quad +\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )Q(t) -\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}\left [{\psi }_{ i}\left ( \frac{t} {\varepsilon }\right )Q(0) + {r}_{i}\left ( \frac{t} {\varepsilon }\right )\right ].\end{array}$$
(4.30)

Note that the term in the fifth line above is

$$\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)Q(t) -\sum\limits_{i=1}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)Q(t) = {\varphi }_{0}(t)Q(t) = 0.$$

Using (4.19), we represent r i (t) in terms of (d i ∕ dt i)Q(0), etc. For the term involving ψ0(t ∕ ε), using a truncated Taylor expansion up to order (n + 1) for Q(t), by virtue of the Lipschitz continuity of \(({d}^{n+1}/d{t}^{n+1})Q(\cdot )\), there is a ξ ∈ (0, t) such that

$$\begin{array}{rl} {\biggl \vert Q(t) -\sum\limits_{i=0}^{n+1}{ {t}^{i} \over i!} { {d}^{i}Q(0) \over d{t}^{i}} \biggr \vert}& ={ 1 \over (n + 1)!} \left \vert {t}^{n+1}{ {d}^{n+1}Q(\xi ) \over d{t}^{n+1}} - {t}^{n+1}{ {d}^{n+1}Q(0) \over d{t}^{n+1}} \right \vert \\ & \leq K{t}^{n+1}\xi \leq K{t}^{n+2}.\end{array}$$

For all the other terms involving ψ i (t ∕ ε), for \(i = 1,\ldots,n + 1\) in (4.30), we proceed as in the calculation of \({\mathcal{L}}^{\varepsilon }{e}_{1}^{\varepsilon }\). As a result, the last two terms in (4.30) are bounded by

$${\psi }_{0}\left ( \frac{t} {\varepsilon }\right )O({t}^{n+2}) + \varepsilon {\psi }_{ 1}\left ( \frac{t} {\varepsilon }\right )O({t}^{n+1}) + \cdots + {\varepsilon }^{n+1}{\psi }_{ n+1}\left ( \frac{t} {\varepsilon }\right )O(t),$$

which in turn leads to the bound

$$K({t}^{n+2} + \varepsilon {t}^{n+1} + \cdots + {\varepsilon }^{n+1}t)\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right) \leq K{\varepsilon }^{n+2},$$

in accordance with (4.27). Moreover, it is clear that \({e}_{n+1}^{\varepsilon }(0) = 0\). In view of the fact that φ n + 1( ⋅) is continuously differentiable on [0, T] and Q( ⋅) is (n + 1)-times continuously differentiable on [0, T], by virtue of Lemma  4.13, we infer that \({e}_{n+1}^{\varepsilon }(t) = O({\varepsilon }^{n+1})\) uniformly in t. Since

$${e}_{n+1}^{\varepsilon }(t) = {e}_{ n}^{\varepsilon }(t) + O({\varepsilon }^{n+1}),$$

it must be that \({e}_{n}^{\varepsilon }(t) = O({\varepsilon }^{n+1})\). The proof of Proposition  4.14 is complete, and so is the proof of Theorem  4.5. □ 

Remark 4.15.

In the estimate given above, we actually obtained

$${\mathcal{L}}^{\varepsilon }{e}_{ k}^{\varepsilon }(t) = O\left({\varepsilon }^{k+1} + (\varepsilon {t}^{k} + \cdots + {\varepsilon }^{k}t)\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right)\right).$$
(4.31)

This observation will be useful when we consider the unbounded interval [0,∞).

The findings reported are very useful for further study of the limit behavior of the corresponding Markov chain problems of central limit type, which will be discussed in the next chapter. In many applications, a system is governed by a Markov chain, which consists of both slow and fast motions. An immediate question is this: Can we still develop an asymptotic series expansion? This question will be dealt with in Section 4.3.

Suppose that in lieu of (A4.2), we assume that Q( ⋅) is piecewise (n + 1)-times continuously differentiable on [0, T], and \(({d}^{n+1}/d{t}^{n+1})Q(\cdot )\) is piecewise Lipschitz, that is, there is a partition of [0, T], namely,

$${t}_{0} = 0 < {t}_{1} < {t}_{2} < \cdots < {t}_{k} = T$$

such that Q( ⋅) is (n + 1)-times continuously differentiable and \(({d}^{n+1}/d{t}^{n+1})\) Q( ⋅) is Lipschitz on each subinterval [t i , t i + 1). Then the result obtained still holds. In this case, in addition to the initial layers, one also has a finite number of inner-boundary layers. In each interval \([{t}_{i},{t}_{i+1} - \eta ]\) for η > 0, the expansion is similar to that presented in Theorem  4.5.

2.6 Examples

As a further illustration, we consider two examples in this section. The first example is concerned with a stationary Markov chain, i.e., Q(t) = Q is a constant matrix. The second example deals with an analytically solvable case for a two-state Markov chain with nonstationary transition probabilities. Although they are simple, these examples give us insight into the asymptotic behavior of the underlying systems.

Example 4.16.

Let αε(t) be an m-state Markov chain with a constant generator Q(t) = Q that is irreducible. This is an analytically solvable case, with

$${p}^{\varepsilon }(t) = {p}^{0}\exp \left ({ Qt \over \varepsilon } \right ).$$

Using the technique of asymptotic expansion, we obtain

$$\begin{array}{rl} &{\varphi }_{0}(t) + {\psi }_{0}\left ({ t \over \varepsilon } \right ) = {\varphi }_{0} + ({p}^{0} - {\varphi }_{ 0})\exp \left ({ Qt \over \varepsilon } \right ), \\ &\mbox{ with }\exp \left ({ Qt \over \varepsilon } \right ) \rightarrow \overline{P},\mbox{ as }\varepsilon \rightarrow 0, \end{array}$$

where

$$\begin{array}{rl} &{\varphi }_{0}(t) = ({\nu }_{1},\ldots,{\nu }_{m})\mbox{ and }\overline{P} = \mathrm{1}\mathrm{l}{\varphi }_{0}\end{array}$$

Note that \({p}^{0}\overline{P} = {\varphi }_{0}\), and hence

$$({p}^{0} - {\varphi }_{ 0})\exp \left ({ Qt \over \varepsilon } \right ) = ({p}^{0} - {\varphi }_{ 0})\left [\exp \left ({ Qt \over \varepsilon } \right ) -\overline{P}\right ].$$

Moreover,

$${\varphi }_{i}(t) \equiv 0,\ \ {\psi }_{i}\left ({ t \over \varepsilon } \right ) \equiv 0\ \mbox{ for }i \geq 1.$$

In this case, \({\varphi }_{0}(t) \equiv {\varphi }_{0}\), a constant vector, which is the equilibrium distribution of Q; the series terminates. Moreover, the solution consists of two terms, one of them the equilibrium distribution (the zeroth-order approximation) and the other the zeroth-order initial-layer correction term. Since φ0 is the quasi-stationary distribution,

$${\varphi }_{0}Q = 0\ \mbox{ and }\ {\varphi }_{0}\exp \left ({ Qt \over \varepsilon } \right ) = {\varphi }_{0}.$$

Hence the analytic solution and the asymptotic expansion coincide.
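Numerically, the coincidence is easy to observe. A sketch (assuming NumPy and SciPy; the two-state Q is an illustrative choice of ours):

```python
# Example 4.16 in code: for a constant irreducible Q, the analytic
# solution p0 exp(Qt/eps) and phi_0 + psi_0(t/eps) agree identically.
import numpy as np
from scipy.linalg import expm, null_space

Q   = np.array([[-1.0, 1.0], [3.0, -3.0]])
eps = 0.05
p0  = np.array([1.0, 0.0])

nu = null_space(Q.T)[:, 0]
nu = nu / nu.sum()                               # stationary distribution phi_0

for t in [0.0, 0.01, 0.1, 1.0]:
    analytic  = p0 @ expm(Q * t / eps)
    expansion = nu + (p0 - nu) @ expm(Q * t / eps)   # phi_0 + psi_0(t/eps)
    print(t, np.abs(analytic - expansion).max())     # zero up to round-off
```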

In particular, let Q be a two-dimensional matrix, i.e.,

$$Q = \left (\begin{array}{*{10}c} -\lambda & \lambda \\ \mu &-\mu \\ \end{array} \right ).$$

Then setting

$${y}_{0}^{\varepsilon }(t) = {\varphi }_{ 0}(t) + {\psi }_{0}(t/\varepsilon ),$$

we have

$$\begin{array}{rl} &{p}_{1}^{\varepsilon }(t) = {y}_{ 0,1}^{\varepsilon }(t) ={ \mu \over \lambda + \mu } + \left ({p}_{1}^{0} -{ \mu \over \lambda + \mu } \right )\exp \left (-\frac{(\lambda + \mu )t} {\varepsilon } \right ), \\ &{p}_{2}^{\varepsilon }(t) = {y}_{ 0,2}^{\varepsilon }(t) ={ \lambda \over \lambda + \mu } + \left ({p}_{2}^{0} -{ \lambda \over \lambda + \mu } \right )\exp \left (-\frac{(\lambda + \mu )t} {\varepsilon } \right )\end{array}$$

Therefore,

$$\begin{array}{rl} &{\varphi }_{0}(t) = \left ({ \mu \over \lambda + \mu },{ \lambda \over \lambda + \mu } \right ), \\ &{\psi }_{0}\left ({ t \over \varepsilon } \right ) = \left (\left ({p}_{1}^{0} -{ \mu \over \lambda + \mu } \right ),\left ({p}_{2}^{0} -{ \lambda \over \lambda + \mu } \right )\right )\exp \left (-{ (\lambda + \mu )t \over \varepsilon } \right ), \\ &{\varphi }_{i}(t) \equiv 0\mbox{ and }{\psi }_{i}\left ({ t \over \varepsilon } \right ) \equiv 0\quad \mbox{ for }i \geq 1.\end{array}$$

Example 4.17.

Consider a two-state Markov chain with generator

$$Q(t) = \left (\begin{array}{*{10}c} -\lambda (t)& \lambda (t)\\ \mu (t) &-\mu (t) \\ \end{array} \right )$$

where λ(t) ≥ 0, μ(t) ≥ 0 and λ(t) + μ(t) > 0 for each t ∈ [0,T]. Therefore Q(⋅) is weakly irreducible. For the following discussion, assume Q(⋅) to be sufficiently smooth. Although it is time-varying, a closed-form solution is obtainable. Since \({p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) = 1\) for each t, (4.3) can be solved explicitly and the solution is given by

$$\begin{array}{rl} &{p}_{1}^{\varepsilon }(t) = {p}_{ 1}^{0}\exp \left (-{ 1 \over \varepsilon } {\int }_{0}^{t}(\lambda (s) + \mu (s))ds\right ) \\ &\qquad \qquad +{ \int }_{0}^{t}{ \mu (u) \over \varepsilon } \exp \left (-{ 1 \over \varepsilon } {\int }_{u}^{t}(\lambda (s) + \mu (s))ds\right )du, \\ &{p}_{2}^{\varepsilon }(t) = {p}_{ 2}^{0}\exp \left (-{ 1 \over \varepsilon } {\int }_{0}^{t}(\lambda (s) + \mu (s))ds\right ) \\ &\qquad \qquad +{ \int }_{0}^{t}{ \lambda (u) \over \varepsilon } \exp \left (-{ 1 \over \varepsilon } {\int }_{u}^{t}(\lambda (s) + \mu (s))ds\right )du\end{array}$$

Following the approach in the previous sections, we construct the first few terms in the asymptotic expansion. By considering (4.13) together with (4.2), a system of the form

$$\begin{array}{rl} &\lambda (t){\varphi }_{0}^{1}(t) - \mu (t){\varphi }_{ 0}^{2}(t) = 0, \\ &{\varphi }_{0}^{1}(t) + {\varphi }_{ 0}^{2}(t) = 1\end{array}$$

is obtained. The solution of the system yields that

$${\varphi }_{0}(t) = \left ({ \mu (t) \over \lambda (t) + \mu (t)},{ \lambda (t) \over \lambda (t) + \mu (t)} \right ).$$

To find φ1( ⋅), consider

$$\begin{array}{rl} &\lambda (t){\varphi }_{1}^{1}(t) - \mu (t){\varphi }_{ 1}^{2}(t) ={ \dot{\lambda }(t)\mu (t) -\dot{ \mu }(t)\lambda (t) \over {(\lambda (t) + \mu (t))}^{2}}, \\ &{\varphi }_{1}^{1}(t) + {\varphi }_{ 1}^{2}(t) = 0, \end{array}$$

where \(\dot{\lambda } = (d/dt)\lambda \) and \(\dot{\mu } = (d/dt)\mu \). Solving this system of equations gives us

$${\varphi }_{1}(t) =\left({ \dot{\lambda }(t)\mu (t) -\dot{ \mu }(t)\lambda (t) \over {(\lambda (t) + \mu (t))}^{3}},{ \lambda (t)\dot{\mu }(t) - \mu (t)\dot{\lambda }(t) \over {(\lambda (t) + \mu (t))}^{3}} \right).$$
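The closed form can be double-checked against a direct solve of the system above. In the sketch below, the rates λ(t) = 1 + 0.5 sin t and μ(t) = 2 + cos t are illustrative choices of ours:

```python
# Check of phi_1 in Example 4.17: closed form vs. the linear system
# lam*x1 - mu*x2 = (dlam*mu - dmu*lam)/(lam+mu)^2,  x1 + x2 = 0.
import numpy as np

lam, dlam = lambda t: 1.0 + 0.5 * np.sin(t), lambda t: 0.5 * np.cos(t)
mu,  dmu  = lambda t: 2.0 + np.cos(t),       lambda t: -np.sin(t)

def phi1_closed(t):
    num = dlam(t) * mu(t) - dmu(t) * lam(t)
    return np.array([num, -num]) / (lam(t) + mu(t)) ** 3

def phi1_solved(t):
    rhs = (dlam(t) * mu(t) - dmu(t) * lam(t)) / (lam(t) + mu(t)) ** 2
    M = np.array([[lam(t), -mu(t)], [1.0, 1.0]])
    return np.linalg.solve(M, np.array([rhs, 0.0]))

for t in [0.0, 1.0, 2.0]:
    print(t, phi1_closed(t), phi1_solved(t))   # the two agree
```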

To get the inner expansion, consider the differential equation

$$\begin{array}{rl} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )Q(0), \\ &{\psi }_{0}(0) = {p}^{0} - {\varphi }_{ 0}(0),\end{array}$$

with \(\tau = t/\varepsilon \). We obtain

$${\psi }_{0}(\tau ) = ({p}^{0} - {\varphi }_{ 0}(0))\exp (Q(0)\tau ),$$

where

$$\begin{array}{rl} &\exp \left (Q(0)\tau \right ) ={ 1 \over \lambda (0) + \mu (0)} \\ &\qquad \qquad \times \left (\begin{array}{cc} \mu (0) + \lambda (0){e}^{-(\lambda (0)+\mu (0))\tau }&\lambda (0) - \lambda (0){e}^{-(\lambda (0)+\mu (0))\tau } \\ \mu (0) - \mu (0){e}^{-(\lambda (0)+\mu (0))\tau }&\lambda (0) + \mu (0){e}^{-(\lambda (0)+\mu (0))\tau }\\ \end{array} \right )\end{array}$$

Similarly ψ1( ⋅) can be obtained from (4.21) with the exponential matrix given above.

It is interesting to note that either λ(t) or μ(t) can be equal to 0 for some t as long as λ(t) + μ(t) > 0. For example, if we take μ( ⋅) to be the repair rate of a machine in a manufacturing model, then μ(t) = 0 corresponds to the repair workers taking breaks or waiting for parts on order to arrive. The minors of Q(t) are λ(t), − λ(t), μ(t), and − μ(t). As long as not all of them are zero at the same time, the weak irreducibility condition will be met.
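To see Theorem 4.5 (with n = 0) and Corollary 4.7 at work on this example, one can integrate (4.3) directly and compare with φ0 + ψ0. A sketch (assuming SciPy; the rates are the illustrative ones used above):

```python
# sup_t |p^eps(t) - phi_0(t) - psi_0(t/eps)| shrinks roughly like O(eps).
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

lam = lambda t: 1.0 + 0.5 * np.sin(t)
mu  = lambda t: 2.0 + np.cos(t)
Q   = lambda t: np.array([[-lam(t), lam(t)], [mu(t), -mu(t)]])

p0   = np.array([1.0, 0.0])
phi0 = lambda t: np.array([mu(t), lam(t)]) / (lam(t) + mu(t))

for eps in [0.1, 0.05, 0.01]:
    sol = solve_ivp(lambda t, p: p @ Q(t) / eps, (0.0, 1.0), p0,
                    method="Radau", rtol=1e-10, atol=1e-12, dense_output=True)
    err = max(np.abs(sol.sol(t) - phi0(t)
                     - (p0 - phi0(0.0)) @ expm(Q(0.0) * t / eps)).max()
              for t in np.linspace(0.0, 1.0, 400))
    print(eps, err)   # error decreases roughly linearly in eps
```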

2.7 Two-Time-Scale Expansion

The asymptotic expansion derived in the preceding sections is separable in the sense that it is the sum of a regular part and initial-layer corrections. Naturally, one is interested in the relationship between such an expansion and the so-called two-time-scale expansion (see, for example, Smith [199]). To answer this question, we first obtain the two-time-scale asymptotic expansion for the forward equation (4.3), then explore the relationship between the two expansions, and conclude with a comparison of the two methods.

Two-Time-Scale Expansion. Following the literature on asymptotic expansion (e.g., Kevorkian and Cole [108, 109] and Smith [199] among others), consider two scales t and \(\tau = t/\varepsilon \), both as “times.” One of them is in a normal time scale and the other is a stretched one. We seek asymptotic expansions of the form

$${y}^{\varepsilon }(t,\tau ) =\sum\limits_{i=0}^{n}{\varepsilon }^{i}{y}_{ i}(t,\tau ),$$
(4.32)

where \(\{{y}_{i}(t,\tau )\}_{i=0}^{n}\) is a sequence of two-time-scale functions. Treating t and τ as independent variables, one has

$${ d \over dt} ={ \partial \over \partial t} +{ 1 \over \varepsilon } { \partial \over \partial \tau }.$$
(4.33)

Formally substituting (4.32) into (4.3) and equating coefficients of like powers of ε results in

$$\begin{array}{ll} &{ \partial {y}_{0}(t,\tau ) \over \partial \tau } = {y}_{0}(t,\tau )Q(t), \\ &{ \partial {y}_{1}(t,\tau ) \over \partial \tau } = {y}_{1}(t,\tau )Q(t) -{ \partial {y}_{0}(t,\tau ) \over \partial t}, \\ &\qquad \cdots \\ &{ \partial {y}_{i}(t,\tau ) \over \partial \tau } = {y}_{i}(t,\tau )Q(t) -{ \partial {y}_{i-1}(t,\tau ) \over \partial t},\ \ 1 \leq i \leq n.\end{array}$$
(4.34)

The initial conditions are

$$\begin{array}{ll} &{y}_{0}(t,0) = {p}^{0}\ \mbox{ and } \\ &{y}_{i}(t,0) = 0,\ \mbox{ for }1 \leq i \leq n.\end{array}$$
(4.35)

Holding t constant and solving the first equation in (4.34) (with the first equation in (4.35) as the initial condition) yields

$${y}_{0}(t,\tau ) = {p}^{0}\exp (Q(t)\tau ).$$
(4.36)

By virtue of (A4.2), (∂ ∕ ∂t)y 0(t, τ) exists and

$${ \partial {y}_{0}(t,\tau ) \over \partial t} = {p}^{0}\exp (Q(t)\tau )\left({ dQ(t) \over dt} \right)\tau.$$

As a result, (∂ ∕ ∂t)y 0(t, τ) is orthogonal to 1l. We continue the procedure recursively. It can be verified that for 1 ≤ i ≤ n,

$${y}_{i}(t,\tau ) = -{\int }_{0}^{\tau }{ \partial {y}_{i-1}(t,s) \over \partial t} \exp (Q(t)(\tau - s))ds.$$
(4.37)

Furthermore, for i = 1,…,n, (∂ ∕ ∂t)y i (t, τ) exists, is continuous, and is orthogonal to 1l. It should be emphasized that in the equations above, t is viewed as being “frozen,” and as a consequence, Q(t) is a “constant” matrix.
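A sketch of this recursion (illustrative rates as before, assuming SciPy; the t-derivative of y0 is approximated by a central difference for simplicity):

```python
# Two-time-scale terms (4.36)-(4.37) with t frozen:
# y_0(t,tau) = p0 exp(Q(t) tau), and y_1 by quadrature against exp(Q(t)(tau-s)).
import numpy as np
from scipy.linalg import expm

lam = lambda t: 1.0 + 0.5 * np.sin(t)
mu  = lambda t: 2.0 + np.cos(t)
Q   = lambda t: np.array([[-lam(t), lam(t)], [mu(t), -mu(t)]])
p0  = np.array([1.0, 0.0])

def y0(t, tau):
    return p0 @ expm(Q(t) * tau)

def dy0_dt(t, tau, h=1e-6):
    return (y0(t + h, tau) - y0(t - h, tau)) / (2.0 * h)   # numerical d/dt

def y1(t, tau, n=300):
    s = np.linspace(0.0, tau, n)
    vals = np.array([dy0_dt(t, si) @ expm(Q(t) * (tau - si)) for si in s])
    return -np.trapz(vals, s, axis=0)

print(y0(1.0, 5.0), y1(1.0, 5.0))
```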

Parallel to the previous development, one can show that for all 1 ≤ i ≤ n,

$$\vert {y}_{i}(t,\tau )\vert \leq K(t)\exp (-{\kappa }_{0}(t)\tau ).$$

Compared with the separable expansions presented before, note the t-dependence of K( ⋅) and κ0( ⋅) above. Furthermore, the asymptotic series can be validated just as before. We summarize this in the following theorem.

Theorem 4.18.

Under the conditions of Theorem  4.5 , a sequence of functions \(\{{y}_{i}(t,\tau )\}_{i=0}^{n}\), with \(\tau = t/\varepsilon \), can be constructed so that

$$ \sup\limits_{t\in [0,T]}{\biggl |{p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{y}_{ i}(t,\tau )\biggr |} = O({\varepsilon }^{n+1}).$$

Example 4.19.

We return to Example  4.16 . It is readily verified that the zeroth-order two-time-scale expansion coincides with that of the analytic solution, in fact, with

$${y}_{0}(t,\tau ) = {p}^{0}\exp \left ({ Qt \over \varepsilon } \right )\mbox{ and }{y}_{i}(t,\tau ) \equiv 0\mbox{ for all }i \geq 1.$$

Relationship between the Two Methods. Now we have two different asymptotic expansions. Do they in some sense produce similar asymptotic results? Note that each term in y i (t, τ) contains the regular part φ i (t) as well as the initial-layer corrections. Examining the zeroth-order approximation leads to

$$\exp (Q(t)\tau ) \rightarrow \overline{P}(t)\mbox{ as }\tau \rightarrow \infty $$

via the same argument employed in the proof of Lemma  4.4. The matrix has identical rows, and is given by \(\overline{P}(t) = \mathrm{1}\mathrm{l}\nu (t)\). In fact, owing to \({p}^{0}\mathrm{1}\mathrm{l} =\sum\limits_{i=1}^{m}{p}_{i}^{0} = 1\), we have

$${y}_{0}(t,\tau ) = \nu (t) + {p}^{0}\left (\exp (Q(t)\tau ) -\overline{P}(t)\right ) = \nu (t) +\widetilde{ {y}}_{ 0}(t,\tau ),$$
(4.38)

where \(\widetilde{{y}}_{0}(t,\tau )\) decays exponentially fast as τ → ∞ for t < τ.
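The limit \(\exp (Q(t)\tau ) \rightarrow \overline{P}(t)\) and the exponential decay of \(\widetilde{{y}}_{0}\) in (4.38) are easy to observe numerically; the sketch below freezes a hypothetical two-state Q(t) and prints the norm of the decaying part.

```python
# Sketch: exp(Q tau) -> 1l nu, and the remainder decays roughly like
# exp(-(lam + mu) tau); the frozen rates are illustrative assumptions.
import numpy as np
from scipy.linalg import expm

lam, mu = 2.0, 1.0
Qt = np.array([[-lam, lam], [mu, -mu]])
nu = np.array([mu, lam]) / (lam + mu)   # quasi-stationary row: nu Qt = 0
P_bar = np.outer(np.ones(2), nu)        # 1l nu: identical rows

p0 = np.array([0.9, 0.1])
for tau in [0.5, 1.0, 2.0, 4.0]:
    y0_tilde = p0 @ (expm(Qt * tau) - P_bar)   # the decaying part of (4.38)
    print(tau, np.linalg.norm(y0_tilde))
```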

In view of (4.38), the two methods produce the same limit as τ → ∞, namely, the quasi-stationary distribution. To explore further, we study a special case (a two-state Markov chain) so as to keep the notation simple. Consider the two-state Markov chain model Example  4.17. In view of (4.38) and the formulas in Example  4.17, we have

$${y}_{0}(t,\tau ) = \nu (t) +\widetilde{ {y}}_{0}(t,\tau ) = {\varphi }_{0}(t) +\widetilde{ {y}}_{0}(t,\tau ).$$

Owing to (4.37), direct calculation yields that

$$\begin{array}{rl} {y}_{1}(t,\tau )& = -{\int }_{0}^{\tau }{ d{\varphi }_{0}(t) \over dt} \exp (Q(t)(\tau - s))ds \\ &\qquad -{\int }_{0}^{\tau }{ \partial \widetilde{{y}}_{0}(t,s) \over \partial t} \exp (Q(t)(\tau - s))ds.\end{array}$$

It can be verified that the second term on the right-hand side of the equal sign above decays exponentially fast, while the first term yields φ1(t) plus another term tending to 0 exponentially fast as τ → ∞. Using the result of Example  4.17 yields

$$\begin{array}{rl} & -{\int }_{0}^{\tau }{ d{\varphi }_{0}(t) \over dt} \exp (Q(t)(\tau - s))ds \\ &\qquad ={ d{\varphi }_{0}(t) \over dt} \left({ 1 -\exp (-(\lambda (t) + \mu (t))\tau ) \over \lambda (t) + \mu (t)} \right)\left (\begin{array}{*{10}c} \lambda (t) &-\lambda (t)\\ -\mu (t) & \mu (t) \\ \end{array} \right ) \\ &\qquad = {\varphi }_{1}(t) -{ d{\varphi }_{0}(t) \over dt} \left({ \exp (-(\lambda (t) + \mu (t))\tau ) \over \lambda (t) + \mu (t)} \right)\left (\begin{array}{*{10}c} \lambda (t) &-\lambda (t)\\ -\mu (t) & \mu (t) \\ \end{array} \right )\end{array}$$

Thus, it follows that

$${y}_{1}(t,\tau ) = {\varphi }_{1}(t) +\widetilde{ {y}}_{1}(t,\tau ),$$

where

$$\begin{array}{rl} \widetilde{{y}}_{1}(t,\tau )& = -{\int }_{0}^{\tau }{ \partial \widetilde{{y}}_{0}(t,s) \over \partial t} \exp (Q(t)(\tau - s))ds \\ &\qquad -{ d{\varphi }_{0}(t) \over dt} \left({ \exp (-(\lambda (t) + \mu (t))\tau ) \over \lambda (t) + \mu (t)} \right)\left (\begin{array}{*{10}c} \lambda (t) &-\lambda (t)\\ -\mu (t) & \mu (t) \end{array} \right ).\end{array}$$

Similarly, we can obtain

$${y}_{i}(t,\tau ) = {\varphi }_{i}(t) +\widetilde{ {y}}_{i}(t,\tau ),\mbox{ for }1 \leq i \leq n,$$

where \(\widetilde{{y}}_{i}(t,\tau )\) decay exponentially fast as τ → ∞ for all t < τ. This establishes the connection between these two different expansions.

Comparison and Additional Remark. A moment of reflection reveals that:

    • The conditions required to obtain the asymptotic expansions are the same.

    • Except for the actual forms, there is no significant difference between these two methods.

    • No matter which method is employed, in one way or another the results for stationary Markov chains are used. In the separable expansion, this is accomplished by using Q(0), and in the two-time-scale expansion, this is carried out by holding t constant and hence treating Q(t) as a constant matrix.

    • Although the two-time-scale expansion admits a seemingly more general form, the separable expansion is more transparent as far as the quasi-stationary distribution is concerned.

    • When a more complex problem, for example the case of weak and strong interactions, is encountered, the separable expansion becomes more advantageous.

    • To study asymptotic normality, etc., in the sequel, the separable expansion will prove to be more convenient than the two-time-scale expansion.

In view of the items mentioned above, we choose to use the separable form of the expansion throughout.

3 Markov Chains with Multiple Weakly Irreducible Classes

This section presents the asymptotic expansions of two-time-scale Markov chains with slow and fast components subject to weak and strong interactions. We assume that all the states of the Markov chain are recurrent. In contrast to Section 4.2, the states belong to multiple weakly irreducible classes. As was mentioned in the introductory chapter, such time-scale separation stems from various applications in production planning, queueing networks, random fatigue, system reliability, competing risk theory, control and optimization of large-scale dynamical systems, and related fields. The underlying models, in which some components change very rapidly whereas others vary relatively slowly, are more complex than those of Section 4.2. The weak and strong interactions of the systems are modeled by assuming the generator of the underlying Markov chain to be of the form

$${Q}^{\varepsilon }(t) = \frac{1} {\varepsilon }\widetilde{Q}(t) +\widehat{ Q}(t),$$
(4.39)

where \(\widetilde{Q}(t)\) governs the rapidly changing part and \(\widehat{Q}(t)\) describes the slowly changing components. They have the appropriate forms to be mentioned in the sequel.

This section extends the results in Section 4.2 to incorporate the cases in which the generator \(\widetilde{Q}(t)\) is not irreducible. Our study focuses on the forward equation, similar to (4.3); now the forward equation takes the form

$$\begin{array}{ll} &{ d{p}^{\varepsilon }(t) \over dt} = {p}^{\varepsilon }(t)\left (\frac{1} {\varepsilon }\widetilde{Q}(t) +\widehat{ Q}(t)\right ),\quad {p}^{\varepsilon }(0) = {p}^{0} \end{array}$$
(4.40)

such that

$${p}_{i}^{0} \geq 0\mbox{ for each }i\mbox{ and }\sum\limits_{i=1}^{m}{p}_{ i}^{0} = 1.$$

To illustrate, we present a simple example below.

Example 4.20.

Consider a two-machine flowshop with machines that are subject to breakdown and repair. The production capacity of the machines is described by a finite-state Markov chain. If the machine is up, then it can produce parts with production rate u(t); its production rate is zero if the machine is under repair. For simplicity, suppose each of the machines is either in operating condition (denoted by 1) or under repair (denoted by 0). Then the capacity of the workshop becomes a four-state Markov chain with state space {(1,1),(0,1),(1,0),(0,0)}. Suppose that the first machine breaks down much more often than the second one. To reflect this situation, consider a Markov chain αε(⋅) generated by Qε(t) in (4.39), with \(\widetilde{Q}(\cdot )\) and \(\widehat{Q}(\cdot )\) given by

$$\widetilde{Q}(t) = \left (\begin{array}{*{10}c} -{\lambda }_{1}(t)& {\lambda }_{1}(t) & 0 & 0 \\ {\mu }_{1}(t) &-{\mu }_{1}(t)& 0 & 0 \\ 0 & 0 &-{\lambda }_{1}(t)& {\lambda }_{1}(t) \\ 0 & 0 & {\mu }_{1}(t) &-{\mu }_{1}(t)\\ \end{array} \right )$$

and

$$\widehat{Q}(t) = \left (\begin{array}{*{10}c} -{\lambda }_{2}(t)& 0 & {\lambda }_{2}(t) & 0 \\ 0 &-{\lambda }_{2}(t)& 0 & {\lambda }_{2}(t) \\ {\mu }_{2}(t) & 0 &-{\mu }_{2}(t)& 0 \\ 0 & {\mu }_{2}(t) & 0 &-{\mu }_{2}(t)\\ \end{array} \right ),$$

where λi(⋅) and μi(⋅) are the rates of breakdown and repair, respectively. The matrices \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\) are themselves generators of Markov chains. Note that

$$\widetilde{Q}(t) = \mathrm{diag}\left (\left (\begin{array}{cc} - {\lambda }_{1}(t)& {\lambda }_{1}(t) \\ {\mu }_{1}(t) & - {\mu }_{1}(t)\\ \end{array} \right ),\left (\begin{array}{cc} - {\lambda }_{1}(t)& {\lambda }_{1}(t) \\ {\mu }_{1}(t) & - {\mu }_{1}(t)\\ \end{array} \right )\right )$$

is a block-diagonal matrix, representing the fast motion, and \(\widehat{Q}(t)\) governs the slow components. In order to obtain any meaningful results for controlling and optimizing the performance of the underlying systems, the foremost task is to determine the asymptotic behavior (as ε → 0) of the probability distribution of the underlying chain.

In this example, a first glance reveals that \(\widetilde{Q}(t)\) is reducible, hence the results in Section 4.2 are not applicable. However, closer scrutiny indicates that \(\widetilde{Q}(t)\) consists of two irreducible submatrices. One expects that the asymptotic expansions may still be established. Our main objective is to develop asymptotic expansions of such systems and their variants. The corresponding procedure is, however, much more involved compared with the irreducible cases.
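Before developing the general theory, it may help to see the stiffness concretely. The sketch below assembles \({Q}^{\varepsilon }(t)\) of (4.39) for this flowshop with constant rates (an assumption made only for brevity) and integrates the forward equation (4.40) with an implicit solver.

```python
# Sketch: forward equation (4.40) for Example 4.20 with constant,
# illustrative rates; Radau handles the stiffness as eps -> 0.
import numpy as np
from scipy.integrate import solve_ivp

l1, m1 = 3.0, 2.0      # fast machine: breakdown/repair rates (illustrative)
l2, m2 = 0.8, 0.5      # slow machine
eps = 1e-2

Qtilde = np.array([[-l1, l1, 0, 0], [m1, -m1, 0, 0],
                   [0, 0, -l1, l1], [0, 0, m1, -m1]], dtype=float)
Qhat = np.array([[-l2, 0, l2, 0], [0, -l2, 0, l2],
                 [m2, 0, -m2, 0], [0, m2, 0, -m2]], dtype=float)

def rhs(t, p):
    # row-vector ODE dp/dt = p (Qtilde/eps + Qhat)
    return p @ (Qtilde / eps + Qhat)

p0 = np.array([1.0, 0.0, 0.0, 0.0])
sol = solve_ivp(rhs, (0.0, 2.0), p0, method="Radau", rtol=1e-8, atol=1e-10)
print(sol.y[:, -1], sol.y[:, -1].sum())   # a probability vector at t = 2
```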

Examining (4.39), it is seen that the asymptotic properties of the underlying Markov chains largely depend on the structure of the matrix \(\widetilde{Q}(t)\). In accordance with the classification of states, we may consider three different cases: the chains with recurrent states only, the inclusion of absorbing states, and the inclusion of transient states. We treat the recurrent-state cases in this section, and then extend the results to notationally more involved cases including absorbing states and transient states in the following two sections.

Suppose αε( ⋅) is a finite-state Markov chain with generator given by (4.39), where both \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\) are generators of appropriate Markov chains. In view of the results in Section 4.2, it is intuitively clear that the structure of the generator \(\widetilde{Q}(t)\) governs the fast-changing part of the Markov chain. As mentioned in the previous section, our study of the finite-state-space cases is naturally divided into the recurrent cases, the inclusion of absorbing states, and the inclusion of transient states of the generator \(\widetilde{Q}(t)\). In accordance with classical results (see Chung [31] and Karlin and Taylor [105, 106]), one can always decompose the states of a finite-state Markov chain into recurrent (including absorbing) and transient classes. Inspired by Seneta’s approach to nonnegative matrices (see Seneta [189]), we aim to put the matrix \(\widetilde{Q}(t)\) into some sort of “canonical” form so that a systematic study can be carried out. In a finite-state Markov chain, not all states are transient, and it has at least one recurrent state. Similar to the argument of Iosifescu [95, p. 94] (see also Goodman [75], Karlin and McGregor [104], Keilson [107] among others), if there are no transient states, then after suitable permutations and rearrangements (i.e., by appropriately relabeling the states), \(\widetilde{Q}(t)\) can be put into the block-diagonal form

$$\begin{array}{ll} \widetilde{Q}(t)& = \left (\begin{array}{*{10}c} \widetilde{{Q}}^{1}(t)& && \\ &\widetilde{{Q}}^{2}(t)&&\\ &&\ddots& \\ & &&\widetilde{{Q}}^{l}(t)\\ \end{array} \right ) \\ & = \mathrm{diag}\left (\widetilde{{Q}}^{1}(t),\ldots,\widetilde{{Q}}^{l}(t)\right ), \end{array}$$
(4.41)

where \(\widetilde{{Q}}^{k}(t) \in {\mathbb{R}}^{{m}_{k}\times {m}_{k}}\) are weakly irreducible, for k = 1, 2, …,  l, and \(\sum\limits_{k=1}^{l}{m}_{k} = m\). Here and hereinafter, \(\widetilde{{Q}}^{k}(t)\) (a superscript without parentheses) denotes the kth block matrix in \(\widetilde{Q}(t)\). The rest of this section deals with the generator \({Q}^{\varepsilon }(t)\) given by (4.39) with \(\widetilde{Q}(t)\) taking the form (4.41). Note that an example of the recurrent case is that of the irreducible (or weakly irreducible) generators treated in Section 4.2.

Let \({\mathcal{M}}_{k} =\{ {s}_{k1},\ldots,{s}_{k{m}_{k}}\}\) for k = 1, …,  l denote the states corresponding to \(\widetilde{{Q}}^{k}(t)\) and let \(\mathcal{M}\) denote the state space of the underlying chains given by

$$\begin{array}{rl} \mathcal{M}& = {\mathcal{M}}_{1} \cup \cdots \cup {\mathcal{M}}_{l} \\ & = \{{s}_{11},\ldots,{s}_{1{m}_{1}},\ldots,{s}_{l1},\ldots,{s}_{l{m}_{l}}\}\end{array}$$

Since \(\widetilde{{Q}}^{k}(t) = {(\widetilde{{q}}_{ij}^{k}(t))}_{{m}_{k}\times {m}_{k}}\) and \(\widehat{Q}(t) = {(\widehat{{q}}_{ij}(t))}_{m\times m}\) are generators, for k = 1, 2, …, l, we have

$$\begin{array}{rl} &\sum\limits_{j=1}^{{m}_{k} }\widetilde{{q}}_{ij}^{k}(t) = 0,\ \mbox{ for }i = 1,\ldots,{m}_{ k},\ \mbox{ and } \\ &\sum\limits_{j=1}^{m}\widehat{{q}}_{ ij}(t) = 0,\ \mbox{ for }i = 1,\ldots,m\end{array}$$

The slow and fast components are coupled through weak and strong interactions in the sense that the underlying Markov chain fluctuates rapidly within a single group \({\mathcal{M}}_{k}\) and jumps less frequently between groups \({\mathcal{M}}_{k}\) and \({\mathcal{M}}_{j}\) for k ≠ j. The states in \({\mathcal{M}}_{k},\) k = 1, …,  l, are not isolated or independent of each other. More precisely, if we consider the states in \({\mathcal{M}}_{k}\) as a single “state,” then these “states” are coupled through the matrix \(\widehat{Q}(t)\), and transitions from \({\mathcal{M}}_{k}\) to \({\mathcal{M}}_{j}\), k ≠ j, are possible. In fact, \(\widehat{Q}(\cdot )\), together with the quasi-stationary distributions of \(\widetilde{{Q}}^{k}(t)\), determines the transition rates among these “aggregated states,” for k = 1, …, l.

Consider the forward equation (4.40). Our goal here is to develop an asymptotic series for the solution p ε( ⋅) of (4.40). Working with the interval [0, T] for some T < ∞, we will need the following conditions:

    • (A4.3) For each t ∈ [0, T] and k = 1, 2, …,  l, \(\widetilde{{Q}}^{k}(t)\) is weakly irreducible.

    • (A4.4) For some positive integer n, \(\widetilde{Q}(\cdot )\) and \(\widehat{Q}(\cdot )\) are ( n + 1)-times and n-times continuously differentiable on [0,  T], respectively. Moreover, \(({d}^{n+1}/d{t}^{n+1})\widetilde{Q}(\cdot )\) and \(({d}^{n}/d{t}^{n})\widehat{Q}(\cdot )\) are Lipschitz on [0, T].

Compared with the irreducible models in Section 4.2, the main difficulty in this chapter lies in the interactions among different blocks. In constructing the expansion in Section 4.2, for i = 1, …,  n, the two sets of functions \(\{{\varphi }_{i}(\cdot )\}\) and \(\{{\psi }_{i}(\cdot )\}\) are obtained independently except for the initial conditions \({\psi }_{i}(0) = -{\varphi }_{i}(0)\). For Markov chains with weak and strong interactions, \({\varphi }_{i}(\cdot )\) and \({\psi }_{i}(\cdot )\) are highly intertwined. The essence is to find \({\varphi }_{i}(\cdot )\) and \({\psi }_{i}(\cdot )\) jointly and recursively. In the process of construction, one of the crucial and delicate points is to select the “right” initial conditions. This is done by demanding that \({\psi }_{i}(\tau )\) decay to 0 as τ → ∞. For future use, we define a differential operator \({\mathcal{L}}^{\varepsilon }\) on \({\mathbb{R}}^{1\times m}\)-valued functions by

$${\mathcal{L}}^{\varepsilon }f = \varepsilon { df \over dt} - f(\widetilde{Q} + \varepsilon \widehat{Q}).$$
(4.42)

Then it follows that \({\mathcal{L}}^{\varepsilon }f = 0\) iff f is a solution to the differential equation in (4.40). We are now in a position to derive the asymptotic expansion.
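For later experimentation, a residual check for the operator (4.42) can be coded in a few lines; a minimal sketch follows, with a central finite difference standing in for df∕dt (the step h is an illustrative assumption).

```python
# Sketch: residual of the operator (4.42) applied to a candidate f,
# where f maps t to a row vector; h is an illustrative assumption.
import numpy as np

def L_eps(f, t, eps, Qtilde, Qhat, h=1e-6):
    """eps f'(t) - f(t) (Qtilde(t) + eps Qhat(t))."""
    df = (f(t + h) - f(t - h)) / (2.0 * h)
    return eps * df - f(t) @ (Qtilde(t) + eps * Qhat(t))
# A candidate expansion solves (4.40) exactly when this residual vanishes;
# for the truncated expansion it should be small, of order eps^{n+1}.
```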

3.1 Asymptotic Expansions

As in Section 4.2, we seek expansions of the form

$${y}_{n}^{\varepsilon }(t) = {\Phi }_{ n}^{\varepsilon }(t) + {\Psi }_{ n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) +\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right ).$$
(4.43)

For the purpose of estimating the remainder (or error), the terms \({\varphi }_{n+1}(\cdot )\) and \({\psi }_{n+1}(\cdot )\) are needed. Set \({\mathcal{L}}^{\varepsilon }{y}_{n+1}^{\varepsilon }(t) = 0\). Parallel to the approach in Section 4.2, equating like powers of \({\varepsilon }^{i}\) (for \(i = 0,1,\ldots,n + 1\)) leads to the equations for the regular part:

$$\begin{array}{ll} &{\varepsilon }^{0} :\ {\varphi }_{ 0}(t)\widetilde{Q}(t) = 0, \\ &{\varepsilon }^{1} :\ {\varphi }_{ 1}(t)\widetilde{Q}(t) ={ d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)\widehat{Q}(t), \\ &\qquad \cdots \\ &{\varepsilon }^{i} :\ {\varphi }_{ i}(t)\widetilde{Q}(t) ={ d{\varphi }_{i-1}(t) \over dt} - {\varphi }_{i-1}(t)\widehat{Q}(t).\end{array}$$
(4.44)

As discussed in Section 4.2, the approximation above is good for t away from 0. When t is sufficiently close to 0, an initial layer of thickness ε develops. Thus for the singular part of the expansion we enlarge the picture near 0 using the stretched variable τ defined by \(\tau = t/\varepsilon \). Identifying the initial-layer terms in \({\mathcal{L}}^{\varepsilon }{y}_{n+1}^{\varepsilon } = 0\), we obtain

$$\begin{array}{rl} &{ d \over d\tau } \left ({\psi }_{0}(\tau ) + \varepsilon {\psi }_{1}(\tau ) + \cdots + {\varepsilon }^{n+1}{\psi }_{ n+1}(\tau )\right ) \\ &\quad = \left ({\psi }_{0}(\tau ) + \varepsilon {\psi }_{1}(\tau ) + \cdots + {\varepsilon }^{n+1}{\psi }_{ n+1}(\tau )\right )\left (\widetilde{Q}(\varepsilon \tau ) + \varepsilon \widehat{Q}(\varepsilon \tau )\right )\end{array}$$

By means of the Taylor expansion, we have

$$\begin{array}{rl} &\widetilde{Q}(\varepsilon \tau ) =\widetilde{ Q}(0) + \varepsilon \tau { d\widetilde{Q}(0) \over dt} + \cdots \\ &\qquad +{ {(\varepsilon \tau )}^{n+1} \over (n + 1)!} { {d}^{n+1}\widetilde{Q}(0) \over d{t}^{n+1}} +\widetilde{ {R}}_{n+1}(\varepsilon \tau ), \\ &\varepsilon \widehat{Q}(\varepsilon \tau ) = \varepsilon \widehat{Q}(0) + {\varepsilon }^{2}\tau { d\widehat{Q}(0) \over dt} + \cdots \\ &\qquad +{ \varepsilon {(\varepsilon \tau )}^{n} \over n!} { {d}^{n}\widehat{Q}(0) \over d{t}^{n}} +\widehat{ {R}}_{n}(\varepsilon \tau ),\end{array}$$

where

$$\begin{array}{rl} &\widetilde{{R}}_{n+1}(t) ={ {t}^{n+1} \over (n + 1)!} \left({ {d}^{n+1}\widetilde{Q}(\xi ) \over d{t}^{n+1}} -{ {d}^{n+1}\widetilde{Q}(0) \over d{t}^{n+1}} \right), \\ &\widehat{{R}}_{n}(t) ={ \varepsilon {t}^{n} \over n!} \left({ {d}^{n}\widehat{Q}(\zeta ) \over d{t}^{n}} -{ {d}^{n}\widehat{Q}(0) \over d{t}^{n}} \right), \end{array}$$

for some 0 ≤ ξ ≤ t and 0 ≤ ζ ≤  t. Note that in view of (A4.4),

$$\widetilde{{R}}_{n+1}(t) = O({t}^{n+2})\mbox{ and }\widehat{{R}}_{ n}(t) = O(\varepsilon {t}^{n+1}).$$

Equating coefficients of like powers of εi, for \(i = 0,1,\ldots,n + 1\) , and using the Taylor expansion above, we obtain

$$\begin{array}{ll} &{\varepsilon }^{0} :\ { d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0), \\ &{\varepsilon }^{1} :\ { d{\psi }_{1}(\tau ) \over d\tau } = {\psi }_{1}(\tau )\widetilde{Q}(0) \\ & + {\psi }_{0}(\tau )\left(\widehat{Q}(0) + \tau { d\widetilde{Q}(0) \over dt} \right),\\ \\ \\ &\qquad \ \cdots \\ &{\varepsilon }^{i} :\ { d{\psi }_{i}(\tau ) \over d\tau } = {\psi }_{i}(\tau )\widetilde{Q}(0) +\sum\limits_{j=0}^{i-1}{\psi }_{ i-j-1}(\tau ) \\ & \times \left({ {\tau }^{j} \over j!} { {d}^{j}\widehat{Q}(0) \over d{t}^{j}} +{ {\tau }^{j+1} \over (j + 1)!} { {d}^{j+1}\widetilde{Q}(0) \over d{t}^{j+1}} \right).\end{array}$$
(4.45)

In view of the matching of the asymptotic expansions, we must have at t = 0 that

$$\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}\left ({\varphi }_{ i}(0) + {\psi }_{i}(0)\right ) = {p}^{0}.$$
(4.46)

This equation implies

$${p}^{0} = {\varphi }_{ 0}(0) + {\psi }_{0}(0)\mbox{ and }{\varphi }_{i}(0) + {\psi }_{i}(0) = 0,$$

for i ≥ 1. Moreover, note that \({p}^{\varepsilon }(t)\mathrm{1}\mathrm{l} = 1\) for all t ∈ [0, T]. Sending ε → 0 in the asymptotic expansion, one necessarily obtains the following conditions: for all t ∈ [0, T],

$${\varphi }_{0}(t)\mathrm{1}\mathrm{l} = 1\mbox{ and }{\varphi }_{i}(t)\mathrm{1}\mathrm{l} = 0,\;i \geq 1.$$
(4.47)

Our task now is to determine the functions φ i ( ⋅) and ψ i( ⋅).

Determining \({\varphi }_{0}(\cdot )\) and \({\psi }_{0}(\cdot )\). Write \(v = ({v}^{1},\ldots,{v}^{l})\) for a vector \(v \in {\mathbb{R}}^{1\times m}\), where \({v}^{k}\) denotes the subvector corresponding to the kth block of the partition. Meanwhile, a subscript indexes a member of a sequence; thus \({v}_{n}^{k}\) denotes the kth subblock of the partitioned vector \({v}_{n}\).

Let us start with the first equation in (4.44). In view of (4.47), we have

$$\begin{array}{ll} &{\varphi }_{0}(t)\widetilde{Q}(t) = 0, \\ &\sum\limits_{i=1}^{m}{\varphi }_{ 0}^{i}(t) = 1.\end{array}$$
(4.48)

Note that the system above depends only on the generator \(\widetilde{Q}(t)\). However, by itself, the system is not uniquely solvable. Since for each t ∈ [0,  T] and k = 1,  , l, \(\widetilde{{Q}}^{k}(t)\) is weakly irreducible, it follows that \(\mathrm{rank}(\widetilde{{Q}}^{k}(t)) = {m}_{k} - 1\) and \(\mathrm{rank}(\widetilde{Q}(t)) = m - l\) . Therefore, to get a unique solution, we need to supply l auxiliary equations. Where can we find these equations? Upon dividing the system (4.48) into l subsystems, one can apply the Fredholm alternative (see Lemma  A.37 and Corollary  A.38) and use the orthogonality condition to choose l additional equations to replace l equations in the system represented by the first equation in (4.48).

Since for each k, \(\widetilde{{Q}}^{k}(t)\) is weakly irreducible, there exists a unique quasi-stationary distribution νk(t). Therefore any solution to the equation

$${\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = 0$$

can be written as the product of νk(t) and a scalar “multiplier,” say \({{\vartheta}}_{0}^{k}(t)\). It follows from the second equation in (4.48) that \(\sum\limits_{k=1}^{l}{{\vartheta}}_{0}^{k}(t) = 1\). These \({{\vartheta}}_{0}^{k}(t)\)’s can be interpreted as the probabilities of the “grouped states” (or “aggregated states”) \({\mathcal{M}}_{k}\).

As will be seen in the sequel, \({{\vartheta}}_{0}^{k}(t)\) becomes an important spinoff in the process of construction. Effort will subsequently be devoted to finding the unique solution \(({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{0}^{l}(t))\). Let \(\mathrm{1}{\mathrm{l}}_{{m}_{k}} = (1,\ldots,1)^{\prime} \in {\mathbb{R}}^{{m}_{k}\times 1}\).

Lemma 4.21.

Under (A4.3) and (A4.4), for each k = 1,…,l, the solution of the equation

$$\begin{array}{ll} &{\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = 0, \\ &{\varphi }_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = {{\vartheta}}_{0}^{k}(t), \end{array}$$
(4.49)

can be uniquely expressed as \({\varphi }_{0}^{k}(t) = {\nu }^{k}(t){{\vartheta}}_{0}^{k}(t)\) , where ν k (t) is the quasi-stationary distribution corresponding to \(\widetilde{{Q}}^{k}(t)\) . Moreover, φ 0 k (t) is (n + 1)-times continuously differentiable on [0,T], provided that \({{\vartheta}}_{0}^{k}(\cdot )\) is (n + 1)-times continuously differentiable.

Proof: For each k, let us regard \({{\vartheta}}_{0}^{k}(\cdot )\) as a known function temporarily. For t ∈ [0, T], let \(\widetilde{{Q}}_{c}^{k}(t) = (\mathrm{1}{\mathrm{l}}_{{m}_{k}}\vdots\;\widetilde{{Q}}^{k}(t))\) . Then the solution can be written as

$${\varphi }_{0}^{k}(t) = ({{\vartheta}}_{ 0}^{k}(t)\vdots{0}_{{ m}_{k}}^{\prime})\widetilde{{Q}}_{c}^{k,{\prime}}(t){\left (\widetilde{{Q}}_{ c}^{k}(t)\widetilde{{Q}}_{ c}^{k,{\prime}}(t)\right )}^{-1},$$

where \({0}_{{m}_{k}} = (0,\ldots,0)^{\prime} \in {\mathbb{R}}^{{m}_{k}\times 1}\). Moreover, \({\varphi }_{0}^{k}(\cdot )\) is (n + 1)-times continuously differentiable whenever \({{\vartheta}}_{0}^{k}(\cdot )\) is. The lemma is thus concluded. □ 

Remark 4.22.

This lemma indicates that for each k = 1,…,l, the subvector \({\varphi }_{0}^{k}(\cdot )\) is a multiple of the quasi-stationary distribution \({\nu }^{k}(\cdot )\). The multipliers \({{\vartheta}}_{0}^{k}(\cdot )\) are to be determined. Owing to the interactions among different “aggregated states” corresponding to the block matrices, piecing together quasi-stationary distributions does not produce a quasi-stationary distribution for the entire system (i.e., \(({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\) is not a quasi-stationary distribution for the entire system). Therefore, the leading term in the asymptotic expansion is proportional to (or a “multiple” of) the quasi-stationary distributions of the Markov chains generated by \(\widetilde{{Q}}^{k}(t)\), for k = 1,…,l. The multiplier \({{\vartheta}}_{0}^{k}(t)\) reflects the interactions of the Markov chain among the “aggregated states.” The probabilistic meaning of the leading term \({\varphi }_{0}(\cdot )\) is in the sense of total probability. Intuitively, \({{\vartheta}}_{0}^{k}(t)\) is the probability of the chain belonging to \({\mathcal{M}}_{k}\), and \({\varphi }_{0}^{k}(t)\) is the probability distribution of the chain belonging to \({\mathcal{M}}_{k}\) and the transitions taking place within this group of states.

We proceed to determine \({{\vartheta}}_{0}^{k}(\cdot )\) for k = 1, …, l. Define an m × l matrix

$$\widetilde{\mathrm{1}\mathrm{l}} = \left (\begin{array}{*{10}c} \mathrm{1}{\mathrm{l}}_{{m}_{1}} & & & \\ & \mathrm{1}{\mathrm{l}}_{{m}_{2}} & &\\ & & \ddots& \\ & & & \mathrm{1}{\mathrm{l}}_{{m}_{l}}\\ \end{array} \right ) = \mathrm{diag}(\mathrm{1}{\mathrm{l}}_{{m}_{1}},\ldots,\mathrm{1}{\mathrm{l}}_{{m}_{l}}).$$

A crucial observation is that \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\) , that is, \(\widetilde{Q}(t)\) and \(\widetilde{\mathrm{1}\mathrm{l}}\) are orthogonal. Thus postmultiplying by \(\widetilde{\mathrm{1}\mathrm{l}}\) leads to

$${\mathcal{L}}^{\varepsilon }\left (\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)\widetilde{\mathrm{1}\mathrm{l}}\right ) = 0.$$

Recall that

$${\varphi }_{0}^{k}(t) = {{\vartheta}}_{ 0}^{k}(t){\nu }^{k}(t)\mbox{ and }{\varphi }_{ 0}^{k}(t)\mathrm{1}\mathrm{l} = {{\vartheta}}_{ 0}^{k}(t).$$

Equating the coefficients of ε in the above equation yields

$${ d \over dt} ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t)) = ({{\vartheta}}_{ 0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t))\overline{Q}(t),$$
(4.50)

where

$$\begin{array}{rl} \overline{Q}(t) =&\left (\begin{array}{*{10}c} {\nu }^{1}(t)& && \\ &{\nu }^{2}(t)&&\\ &&\ddots& \\ & &&{\nu }^{l}(t)\\ \end{array} \right )\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} \\ =&\mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}}.\end{array}$$
(4.51)
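In matrix terms, (4.51) is a short computation. The sketch below forms \(\overline{Q}\) for two blocks of size two, using the (assumed constant) rates of Example 4.20; the result is again a generator.

```python
# Sketch of (4.51): Qbar = diag(nu^1, nu^2) Qhat 1l_tilde for l = 2
# blocks of size 2; all numerical data are illustrative assumptions.
import numpy as np

nu1 = nu2 = np.array([2.0, 3.0]) / 5.0      # block quasi-stationary rows
l2, m2 = 0.8, 0.5
Qhat = np.array([[-l2, 0, l2, 0], [0, -l2, 0, l2],
                 [m2, 0, -m2, 0], [0, m2, 0, -m2]])

diag_nu = np.zeros((2, 4))
diag_nu[0, :2], diag_nu[1, 2:] = nu1, nu2   # 2 x 4 block-diagonal of nus
ones_tilde = np.zeros((4, 2))
ones_tilde[:2, 0] = ones_tilde[2:, 1] = 1.0 # the m x l matrix 1l_tilde

Qbar = diag_nu @ Qhat @ ones_tilde
print(Qbar)   # a 2 x 2 generator: each row sums to zero
```

For these data the product collapses to [[−λ2, λ2], [μ2, −μ2]], which is exactly the averaged generator that reappears in Section 3.5.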

Remark 4.23.

Intuitively, \(\overline{Q}(t)\) is the “average” of \(\widehat{Q}(t)\) weighted by the collection of quasi-stationary distributions \(({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\). In fact, (4.50) is merely a requirement that the equations in (4.44) be consistent in the sense of Fredholm. This can be seen as follows. Denote by \(N(\widetilde{Q}(t))\) the null space of the matrix \(\widetilde{Q}(t)\). Since \(\mathrm{rank}(\widetilde{Q}(t)) = m - l\), the dimension of \(N(\widetilde{Q}(t))\) is l. Observe that the columns of \(\widetilde{\mathrm{1}\mathrm{l}}\), namely \(\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{1}},\ldots,\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{l}}\), where

$$\begin{array}{ll} &\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{1}} = {(\underbrace{{1,\ldots,1}}_{{m}_{1}},\underbrace{{0,\ldots,0}}_{{m}_{2}+\cdots +{m}_{l}})}^{{\prime}}, \\ &\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{2}} = {(\underbrace{{0,\ldots,0}}_{{m}_{1}},\underbrace{{1,\ldots,1}}_{{m}_{2}},\underbrace{{0,\ldots,0}}_{{m}_{3}+\cdots +{m}_{l}})}^{{\prime}}, \\ &\quad \cdots \\ &\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{l}} = {(\underbrace{{0,\ldots,0,}}_{{m}_{1}+\cdots +{m}_{l-1}}\underbrace{{ 1,\ldots,1}}_{{m}_{l}})}^{{\prime}} \end{array}$$
(4.52)

are linearly independent and span the null space of \(\widetilde{Q}(t)\). The equations in (4.44) have solutions only if the right-hand side of each equation is orthogonal to \(\widetilde{\mathrm{1}\mathrm{l}}\). Hence, (4.50) must hold.

Next we determine the initial value \({{\vartheta}}_{0}(0)\). Assuming that the asymptotic expansion of p ε( ⋅) is given by y n ε( ⋅) (see (4.43)), it is necessary that

$${\varphi }_{0}(0)\widetilde{\mathrm{1}\mathrm{l}} =\lim\limits_{\delta \rightarrow 0}\lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon }(\delta )\widetilde{\mathrm{1}\mathrm{l}}.$$
(4.53)

We will refer to such a condition as an initial-value consistency condition. Moreover, in view of (4.40) and \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0,\)

$${p}^{\varepsilon }(\delta )\widetilde{\mathrm{1}\mathrm{l}} = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}} +{ \int }_{0}^{\delta }{p}^{\varepsilon }(s)\widehat{Q}(s)\widetilde{\mathrm{1}\mathrm{l}}\,ds.$$

Since p ε( ⋅) and \(\widehat{Q}(\cdot )\) are both bounded, it follows that

$$ \lim\limits_{\delta \rightarrow 0}\left(\mathop{\lim\sup}\limits_{\varepsilon \rightarrow 0}{ \int }_{0}^{\delta }{p}^{\varepsilon }(s)\widehat{Q}(s)ds\widetilde{\mathrm{1}\mathrm{l}}\right) = 0.$$

Therefore, the initial-value consistency condition (4.53) yields

$${\varphi }_{0}(0)\widetilde{\mathrm{1}\mathrm{l}} = \lim\limits_{\delta \rightarrow 0}\left ( \lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon }(\delta )\widetilde{\mathrm{1}\mathrm{l}}\right ) = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}}.$$

Note that \(({{\vartheta}}_{0}^{1}(0),\ldots,{{\vartheta}}_{0}^{l}(0)) = {\varphi }_{0}(0)\widetilde{\mathrm{1}\mathrm{l}}\). So the initial value for \({{\vartheta}}_{0}(t)\) should be

$$({{\vartheta}}_{0}^{1}(0),\ldots,{{\vartheta}}_{ 0}^{l}(0)) = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}}.$$

Using this initial condition and solving (4.50) yields that

$$({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t)) = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}}X(t,0),$$

where X(t, s) is the principal matrix solution of (4.50) (see Hale [79]). Since the smoothness of X( ⋅,  ⋅) depends solely on the smoothness properties of \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\), \(({{\vartheta}}_{0}^{1}(\cdot ),\ldots,{{\vartheta}}_{0}^{l}(\cdot ))\) is (n + 1)-times continuously differentiable on [0,  T]. Up to now, we have shown that φ0( ⋅) can be constructed so that it is (n + 1)-times continuously differentiable on [0,  T]. Set \({{\vartheta}}_{0}(t) = ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{0}^{l}(t))\). We now summarize the discussion above as follows:

Proposition 4.24.

Assume conditions (A4.3) and (A4.4). Then for t ∈ [0,T], \({\varphi }_{0}(t)\) can be obtained uniquely by solving the following system of equations:

$$\begin{array}{ll} &{\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = 0, \\ &{\varphi }_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = {{\vartheta}}_{0}^{k}(t), \\ &{ d{{\vartheta}}_{0}(t) \over dt} = {{\vartheta}}_{0}(t)\overline{Q}(t), \\ &\mbox{ with }{{\vartheta}}_{0}(0) = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}},\end{array}$$
(4.54)

such that φ 0 (⋅) is (n + 1)-times continuously differentiable. □
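A minimal sketch of Proposition 4.24 follows, under constant-rate assumptions; constancy makes the principal matrix solution of (4.50) a matrix exponential, whereas in the time-varying case X(t,0) must be obtained by integrating (4.50) numerically.

```python
# Sketch of (4.54) with constant, illustrative rates: theta_0 solves
# d theta/dt = theta Qbar, theta(0) = p^0 1l_tilde, and
# phi_0(t) = (nu^1 theta^1(t), nu^2 theta^2(t)).
import numpy as np
from scipy.linalg import expm

l1, m1, l2, m2 = 3.0, 2.0, 0.8, 0.5
nu = np.array([m1, l1]) / (l1 + m1)          # both blocks share nu here
Qbar = np.array([[-l2, l2], [m2, -m2]])      # from (4.51), see above

p0 = np.array([1.0, 0.0, 0.0, 0.0])
theta0 = np.array([p0[:2].sum(), p0[2:].sum()])   # p^0 1l_tilde

def phi0(t):
    # X(t, 0) = expm(Qbar t) because Qbar is constant in this sketch
    theta = theta0 @ expm(Qbar * t)
    return np.concatenate([theta[0] * nu, theta[1] * nu])

print(phi0(1.0), phi0(1.0).sum())   # a distribution; the sum is one
```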

We next consider the initial-layer term ψ0( ⋅). First note that solving (4.45) for each \(i = 0,1,\ldots,n + 1\) leads to

$$\begin{array}{ll} &{\psi }_{0}(\tau ) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau ), \\ &\qquad \cdots \\ &{\psi }_{i}(\tau ) = {\psi }_{i}(0)\exp (\widetilde{Q}(0)\tau ) \\ &\qquad \qquad +\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\tau }{\psi }_{ i-j-1}(s)\left({ {s}^{j} \over j!} { {d}^{j}\widehat{Q}(0) \over d{t}^{j}} + \frac{{s}^{j+1}} {(j + 1)!}{ {d}^{j+1}\widetilde{Q}(0) \over d{t}^{j+1}} \right) \\ & \quad \times \exp (\widetilde{Q}(0)(\tau - s))ds.\end{array}$$
(4.55)

Once again, to match the asymptotic expansion requires that (4.46) hold and hence

$${p}^{0} = {p}^{\varepsilon }(0) = {\varphi }_{ 0}(0) + {\psi }_{0}(0).$$

Solving the first equation in (4.45) together with the above initial condition, one obtains

$${\psi }_{0}(\tau ) = ({p}^{0} - {\varphi }_{ 0}(0))\exp (\widetilde{Q}(0)\tau ).$$
(4.56)

Note that in Proposition  4.25 to follow, we still use \({\kappa }_{0,0}\) as a positive constant, which is generally a different constant from that in Section 4.2.

Proposition 4.25.

Assume conditions (A4.3) and (A4.4). Then \({\psi }_{0}(\cdot )\) can be obtained uniquely by (4.56). In addition, there is a positive number \({\kappa }_{0,0}\) such that

$$\vert {\psi }_{0}(\tau )\vert \leq K\exp (-{\kappa }_{0,0}\tau ),\;\tau \geq 0.$$

Proof: We prove only the exponential decay property, since the rest is obvious. Let \({\nu }^{k}(0)\) be the stationary distribution corresponding to the generator \(\widetilde{{Q}}^{k}(0)\). Define

$$\begin{array}{rl} \pi & =\widetilde{ \mathrm{1}\mathrm{l}}\left (\begin{array}{*{10}c} {\nu }^{1}(0)& && \\ &{\nu }^{2}(0)&&\\ &&\ddots& \\ & &&{\nu }^{l}(0)\\ \end{array} \right ) \\ &\\ \\ \\ & = \left (\begin{array}{*{10}c} \mathrm{1}{\mathrm{l}}_{{m}_{1}}{\nu }^{1}(0)& && \\ &\mathrm{1}{\mathrm{l}}_{{m}_{2}}{\nu }^{2}(0)&&\\ &&\ddots& \\ & &&\mathrm{1}{\mathrm{l}}_{{m}_{l}}{\nu }^{l}(0)\\ \end{array} \right ), \end{array}$$
(4.57)

where

$$\mathrm{1}{\mathrm{l}}_{{m}_{k}}{\nu }^{k}(0) = \left (\begin{array}{*{10}c} {\nu }_{1}^{k}(0)&\cdots &{\nu }_{{ m}_{k}}^{k}(0)\\ & \vdots & \\ {\nu }_{1}^{k}(0)&\cdots &{\nu }_{{m}_{k}}^{k}(0)\\ \end{array} \right ).$$

Noting the block-diagonal structure of \(\widetilde{Q}(0)\), we have

$$\exp (\widetilde{Q}(0)\tau ) = \left (\begin{array}{*{10}c} \exp (\widetilde{{Q}}^{1}(0)\tau )& & & \\ &\exp (\widetilde{{Q}}^{2}(0)\tau )& &\\ & &\ddots& \\ & & &\exp (\widetilde{{Q}}^{l}(0)\tau )\\ \end{array} \right ).$$

It is easy to see that

$$({p}^{0} - {\varphi }_{ 0}(0))\widetilde{\mathrm{1}\mathrm{l}} = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}} - {\varphi }_{ 0}(0)\widetilde{\mathrm{1}\mathrm{l}} = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}} - {{\vartheta}}_{ 0}(0) = 0.$$

Owing to the choice of initial condition, (p 0  − φ 0 (0)) is orthogonal to π, and by virtue of Lemma  4.4, for each k = 1, …,  l and some κ0, k  > 0,

$$\left \vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right \vert \leq K\exp (-{\kappa }_{ 0,k}\tau ),$$

we have

$$\begin{array}{rl} \vert {\psi }_{0}(\tau )\vert & = \vert ({p}^{0} - {\varphi }_{ 0}(0))[\exp (\widetilde{Q}(0)\tau ) - \pi ]\vert \\ & \leq K\sup\limits_{k\leq l}\vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\vert \\ & \leq K\exp (-{\kappa }_{0,0}\tau ), \end{array}$$

where κ 0, 0  = min k ≤ l κ 0, k  > 0. □ 
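Numerically, (4.56) and the decay estimate of Proposition 4.25 can be observed as follows; the sketch reuses the constant-rate flowshop data, an illustrative assumption.

```python
# Sketch of (4.56): psi_0(tau) = (p^0 - phi_0(0)) exp(Qtilde(0) tau),
# with constant, illustrative flowshop matrices.
import numpy as np
from scipy.linalg import expm

l1, m1 = 3.0, 2.0
Qtilde0 = np.array([[-l1, l1, 0, 0], [m1, -m1, 0, 0],
                    [0, 0, -l1, l1], [0, 0, m1, -m1]])
nu = np.array([m1, l1]) / (l1 + m1)

p0 = np.array([1.0, 0.0, 0.0, 0.0])
phi0_at0 = np.concatenate([1.0 * nu, 0.0 * nu])   # theta_0(0) = (1, 0)
psi0_init = p0 - phi0_at0                         # blockwise sums are zero

for tau in [0.5, 1.0, 2.0, 4.0]:
    print(tau, np.linalg.norm(psi0_init @ expm(Qtilde0 * tau)))
# the norms decay like exp(-(l1 + m1) tau): a positive kappa_{0,0} exists
```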

Determining \({\varphi }_{i}(\cdot )\) and \({\psi }_{i}(\cdot )\) for i ≥ 1. In contrast to the situation encountered in Section 4.2, the sequence \(\{{\varphi }_{i}(\cdot )\}\) cannot be obtained without the involvement of \(\{{\psi }_{i}(\cdot )\}\). We thus obtain the sequences pairwise. While the determination of \({\varphi }_{0}(\cdot )\) and \({\psi }_{0}(\cdot )\) is similar to that of Section 4.2, the solutions for the rest of the functions show distinct features resulting from the underlying weak and strong interactions. With the known function

$${b}_{0}(t) ={ d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)\widehat{Q}(t),$$

we proceed to solve the second equation in (4.44) together with the constraint \(\sum\limits_{i=1}^{m}{\varphi }_{1}^{i}(t) = 0\) due to (4.47). Partition the vectors φ1(t) and b 0(t) as

$$\begin{array}{rl} &{\varphi }_{1}(t) = ({\varphi }_{1}^{1}(t),\ldots,{\varphi }_{1}^{l}(t)), \\ &{b}_{0}(t) = ({b}_{0}^{1}(t),\ldots,{b}_{0}^{l}(t))\end{array}$$

In view of the definition of \(\overline{Q}(t)\) in (4.51) and \({\varphi }_{0}^{k}(t) = {\nu }^{k}(t){{\vartheta}}_{0}^{k}(t)\), it follows that \({b}_{0}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\), thus,

$${b}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0,\;k = 1,\ldots,l.$$

Let \({{\vartheta}}_{1}^{k}(t) = {\varphi }_{1}^{k}(t)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\); note that \(\sum\limits_{k=1}^{l}{{\vartheta}}_{1}^{k}(t) = 0\) because φ1(t)1 l = 0. Then for each k = 1, …,  l, the solution to

$$\begin{array}{ll} &{\varphi }_{1}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t), \\ &{\varphi }_{1}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = {{\vartheta}}_{1}^{k}(t), \end{array}$$
(4.58)

can be expressed as

$${\varphi }_{1}^{k}(t) =\widetilde{ {b}}_{ 0}^{k}(t) + {{\vartheta}}_{ 1}^{k}(t){\nu }^{k}(t),$$
(4.59)

where \(\widetilde{{b}}_{0}^{k}(t)\) is a solution to the following equation:

$$\begin{array}{l} \widetilde{{b}}_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t), \\ \widetilde{{b}}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0,\\ \end{array}$$

or equivalently,

$$\widetilde{{b}}_{0}^{k}(t)(\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\vdots\widetilde{{Q}}^{k}(t)) = (0\vdots{b}_{ 0}^{k}(t)).$$

The procedure for solving this equation is similar to that for φ0( ⋅).

Analogously to the previous treatment, we proceed to determine \({{\vartheta}}_{1}^{k}(t)\) by solving the system of equations

$${\mathcal{L}}^{\varepsilon }\bigg{(}\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)\widetilde{\mathrm{1}\mathrm{l}}\bigg{)} = 0.$$
(4.60)

Using the conditions

$$\widetilde{{b}}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0\mbox{ and }{\nu }^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 1,$$

we have

$${\varphi }_{1}(t)\widetilde{\mathrm{1}\mathrm{l}} = ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t))$$

and

$${\varphi }_{1}(t)\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t))\overline{Q}(t) + (\widetilde{{b}}_{ 0}^{1}(t),\ldots,\widetilde{{b}}_{ 0}^{l}(t))\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}},$$

where \(\overline{Q}(t)\) was defined in (4.51).

By equating the coefficients of ε2 in (4.60), we obtain a system of linear inhomogeneous equations

$$\begin{array}{ll} &{ d \over dt} ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t)) = ({{\vartheta}}_{ 1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t))\overline{Q}(t) \\ & + (\widetilde{{b}}_{0}^{1}(t),\ldots,\widetilde{{b}}_{ 0}^{l}(t))\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}},\end{array}$$
(4.61)

with initial conditions

$${{\vartheta}}_{1}^{k}(0),\mbox{ for }k = 1,2,\ldots,l\mbox{ such that }\sum\limits_{k=1}^{l}{{\vartheta}}_{ 1}^{k}(0) = 0.$$

Again, as observed in Remark  4.23, equation (4.61) comes from the consideration in the sense of Fredholm since the functions on the right-hand sides in (4.44) must be orthogonal to \(\widetilde{\mathrm{1}\mathrm{l}}\).

The initial conditions \({{\vartheta}}_{1}^{k}(0)\) for k = 1, …, l have not been completely specified yet. We do this later to ensure the matched asymptotic expansion. Once the \({{\vartheta}}_{1}^{k}(0)\)’s are given, the solution of the above equation is

$$\begin{array}{rl} ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{1}^{l}(t)) =&({{\vartheta}}_{1}^{1}(0),\ldots,{{\vartheta}}_{ 1}^{l}(0))X(t,0) \\ &\ +{ \int }_{0}^{t}(\widetilde{{b}}_{ 0}^{1}(s),\ldots,\widetilde{{b}}_{ 0}^{l}(s))\widehat{Q}(s)\widetilde{\mathrm{1}\mathrm{l}}X(t,s)ds\end{array}$$

Thus if the initial value \({{\vartheta}}_{1}^{k}(0)\) is given, then \({{\vartheta}}_{1}^{k}(\cdot ),\) k = 1, …,  l can be found, and so can φ1( ⋅). Moreover, φ1( ⋅) is n-times continuously differentiable on [0,  T]. The problem boils down to finding the initial condition of \({{\vartheta}}_{1}(0)\).

So far, with the proviso of specified initial conditions \({{\vartheta}}_{1}^{k}(0)\), for k = 1, …, l, the construction of φ 1 ( ⋅) has been completed, and its smoothness has been established. Compared with the determination of φ 0 ( ⋅), the multipliers \({{\vartheta}}_{1}^{k}(\cdot )\) can no longer be determined using the information about the regular part alone, because their initial values have to be determined in conjunction with those of the singular part. This will be seen as follows.

In view of (4.55),

$$\begin{array}{ll} &{\psi }_{1 } (\tau ) = {\psi }_{1}(0)\exp (\widetilde{Q}(0)\tau ) \\ &\qquad +{ \int }_{0}^{\tau }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\widehat{Q}(0)\exp (\widetilde{Q}(0)(\tau - s))ds \\ &\qquad +{ \int }_{0}^{\tau }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s){ d\widetilde{Q}(0) \over dt} \exp (\widetilde{Q}(0)(\tau - s))ds.\end{array}$$
(4.62)

Recall that ψ 1 (0) has not been specified yet.

Similar to Section 4.2, for each t ∈ [0, T], \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\) . Therefore,

$$\left(\frac{{d}^{i}\widetilde{Q}(t)} {d{t}^{i}} \right)\widetilde{\mathrm{1}\mathrm{l}} = 0\quad \mbox{ and }\quad \left(\frac{{d}^{i}\widetilde{Q}(0)} {d{t}^{i}} \right)\pi = 0,$$

for \(i = 1,\ldots,n + 1\), where π is defined in (4.57). This together with ψ0 (0)π = 0 yields

$$\begin{array}{ll} &{\biggl |{\int }_{0}^{\tau }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s){ d\widetilde{Q}(0) \over dt} \exp (\widetilde{Q}(0)(\tau - s))ds\biggr |} \\ &\ \leq {\int }_{0}^{\tau }s{\biggl |{\psi }_{ 0}(0)[\exp (\widetilde{Q}(0)s) - \pi ]\biggr |} \\ &\quad \times {\biggl |{ d\widetilde{Q}(0) \over dt} [\exp (\widetilde{Q}(0)(\tau - s)) - \pi ]\biggr |}ds \\ &\ \leq K{\tau }^{2}\exp (-{\kappa }_{ 0,0}\tau ).\end{array}$$
(4.63)

To obtain the desired property, we need only work with the first two terms on the right side of the equal sign of (4.62). Noting the exponential decay property of \({\psi }_{0}(\tau ) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau )\) , we have

$${\int }_{0}^{\infty }\Big{\vert }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\Big{\vert }ds < \infty,$$

that is, the improper integral converges absolutely. Set

$${ \overline{\psi }}_{0} = \left ({\int }_{0}^{\infty }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\widehat{Q}(0) \in {\mathbb{R}}^{1\times m}.$$
(4.64)

Consequently,

$$\begin{array}{ll} & \lim\limits_{\tau \rightarrow \infty }{\psi }_{1}(0)\exp (\widetilde{Q}(0)\tau ) = {\psi }_{1}(0)\pi \quad \mbox{ and } \\ & \lim\limits_{\tau \rightarrow \infty }{\int }_{0}^{\tau }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\widehat{Q}(0)\exp (\widetilde{Q}(0)(\tau - s))ds \\ &\quad = \left ({\int }_{0}^{\infty }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\widehat{Q}(0)\pi \\ &\quad :={ \overline{\psi }}_{0}\pi.\end{array}$$
(4.65)

Recall that \(\pi = \mathrm{diag}(\mathrm{1}{\mathrm{l}}_{{m}_{1}}{\nu }^{1}(0),\ldots,\mathrm{1}{\mathrm{l}}_{{m}_{l}}{\nu }^{l}(0))\). Partitioning the vector \({\overline{\psi }}_{0}\) as \(({\overline{\psi }}_{0}^{1},\ldots,{\overline{\psi }}_{0}^{l})\) in accordance with the blocks, we have

$$\begin{array}{ll} &{\psi }_{1}(0)\pi = \left (\left ({\psi }_{1}^{1}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{1}}\right ){\nu }^{1}(0),\ldots,\left ({\psi }_{ 1}^{l}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{l}}\right ){\nu }^{l}(0)\right ) \\ &{\overline{\psi }}_{0}\pi = \left (\left ({\overline{\psi }}_{0}^{1}\mathrm{1}{\mathrm{l}}_{{ m}_{1}}\right ){\nu }^{1}(0),\ldots,\left ({\overline{\psi }}_{ 0}^{l}\mathrm{1}{\mathrm{l}}_{{ m}_{l}}\right ){\nu }^{l}(0)\right ).\end{array}$$
(4.66)

Our expansion requires that \(\lim\limits_{\tau \rightarrow \infty }{\psi }_{1}(\tau ) = 0\). As a result,

$${\psi }_{1}(0)\pi +{ \overline{\psi }}_{0}\pi = 0,$$
(4.67)

which implies, by virtue of (4.66),

$${\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = -{\overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$

for k = 1, …,  l. Solving these equations, and in view of

$${{\vartheta}}_{1}^{k}(0) = {\varphi }_{ 1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$

we choose

$${{\vartheta}}_{1}^{k}(0) = -{\psi }_{ 1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\mbox{ for }k = 1,\ldots,l.$$

Substituting these into (4.59), we obtain φ1( ⋅). Finally, we use \({\psi }_{1}(0) = -{\varphi }_{1}(0)\). The process of choosing initial conditions for φ1( ⋅) and ψ1( ⋅) is complete. Furthermore,

$$\vert {\psi }_{1}(\tau )\vert \leq K\exp (-{\kappa }_{1,0}\tau )\quad \mbox{ for some }0 < {\kappa }_{1,0} < {\kappa }_{0,0}.$$
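The selection of \({{\vartheta}}_{1}^{k}(0)\) via (4.64) is straightforward to mimic numerically: truncate the improper integral (legitimate, since the integrand decays exponentially because ψ0(0)π = 0) and sum the blocks of \({\overline{\psi }}_{0}\). All data and the truncation level below are illustrative assumptions.

```python
# Sketch of (4.64) and theta_1^k(0) = psi_bar_0^k 1l_{m_k};
# truncation S and the quadrature grid are illustrative assumptions.
import numpy as np
from scipy.linalg import expm

l1, m1, l2, m2 = 3.0, 2.0, 0.8, 0.5
Qtilde0 = np.array([[-l1, l1, 0, 0], [m1, -m1, 0, 0],
                    [0, 0, -l1, l1], [0, 0, m1, -m1]])
Qhat0 = np.array([[-l2, 0, l2, 0], [0, -l2, 0, l2],
                  [m2, 0, -m2, 0], [0, m2, 0, -m2]])
psi0_init = np.array([0.6, -0.6, 0.0, 0.0])   # p^0 - phi_0(0); see above

S, n = 20.0, 800                               # truncate the improper integral
s = np.linspace(0.0, S, n)
vals = np.array([psi0_init @ expm(Qtilde0 * si) for si in s])
ds = s[1] - s[0]
integral = ((vals[0] + vals[-1]) / 2.0 + vals[1:-1].sum(axis=0)) * ds
psi_bar0 = integral @ Qhat0                    # (4.64)

theta1_init = np.array([psi_bar0[:2].sum(), psi_bar0[2:].sum()])
print(theta1_init)   # the initial data theta_1^k(0) for the ODE (4.61)
```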

This procedure can be applied to φi( ⋅) and ψ i ( ⋅) for \(i = 2,\ldots,n + 1\). We proceed recursively to solve for φi ( ⋅) and ψ i ( ⋅) jointly. Using exactly the same methods as the solution for φ 1( ⋅), we define

$${{\vartheta}}_{i}^{k}(t) = {\varphi }_{ i}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$

for each k = 1, …, l and \(i = 2,\ldots,n + 1\). Similarly to \(\widetilde{{b}}_{0}^{k}(\cdot )\), we define \(\widetilde{{b}}_{i}^{k}(\cdot )\) and write

$$\widetilde{{b}}_{i}(t) = (\widetilde{{b}}_{i}^{1}(t),\ldots,\widetilde{{b}}_{ i}^{l}(t)).$$

Proceeding inductively, suppose that \({{\vartheta}}_{i}^{k}(0)\) is selected and in view of (4.55), it has been shown that

$$\vert {\psi }_{i}(\tau )\vert \leq K\exp (-{\kappa }_{i,0}\tau ),\ \ i \leq n$$
(4.68)

for some 0 < κ i, 0  < κ i − 1, 0 . Solve

$${\psi }_{i+1}(0)\pi = -\left(\;\sum\limits_{j=0}^{i}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right)\pi := -{\overline{\psi }}_{i}\pi $$

to obtain \({\psi }_{i+1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}} = -{\overline{\psi }}_{i}^{k}\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) . Set

$${{\vartheta}}_{i+1}^{k}(0) = -{\psi }_{ i+1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{i}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}},\mbox{ for }k = 1,\ldots,l.$$

Finally choose \({\psi }_{i+1}(0) = -{\varphi }_{i+1}(0).\) We thus have determined the initial conditions for φ i ( ⋅). Exactly the same arguments as in Proposition  4.25 lead to

$$\vert {\psi }_{i+1}(\tau )\vert \leq K\exp (-{\kappa }_{i+1,0}\tau )\mbox{ for some }0 < {\kappa }_{i+1,0} < {\kappa }_{i,0}.$$

Proposition 4.26.

Assume (A4.3) and (A4.4). Then the following assertions hold:

  • The sequences of row-vector-valued functions \({\varphi }_{i}(\cdot )\) and \({{\vartheta}}_{i}(\cdot )\) for i = 1, 2, …,  n can be obtained by solving the system of algebraic differential equations

    $$\begin{array}{ll} &{\varphi }_{i}(t)\widetilde{Q}(t) = \frac{d{\varphi }_{i-1}(t)} {dt} - {\varphi }_{i-1}(t)\widehat{Q}(t), \\ &{{\vartheta}}_{i}^{k}(t) = {\varphi }_{ i}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}, \\ &{ d{{\vartheta}}_{i}(t) \over dt} = {{\vartheta}}_{i}(t)\overline{Q}(t) +\widetilde{ {b}}_{i-1}(t)\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}}.\end{array}$$
    (4.69)
  • For i = 1, …, n, the initial conditions are selected as follows:

    • For k = 1, 2, …, l, find \({\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) from the equation

      $${\psi }_{i}(0)\pi = -\left(\;\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j-1}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right)\pi := -{\overline{\psi }}_{i-1}\pi.$$
    • Choose

      $${{\vartheta}}_{i}^{k}(0) = -{\psi }_{ i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}},\mbox{ for }k = 1,\ldots,l.$$
    • Choose \({\psi }_{i}(0) = -{\varphi }_{i}(0).\)

  • There is a positive real number 0 < κ0 < κ i, 0 (given in (4.68)) for \(i = 0,1,\ldots,n + 1\) such that

    $$\vert {\psi }_{i}(\tau )\vert \leq K\exp (-{\kappa }_{0}\tau ).$$
  • The choice of initial conditions yields that \({{\vartheta}}_{i}^{k}(\cdot )\) is \((n + 1 - i)\) -times continuously differentiable on [0, T] and hence φi( ⋅) is \((n + 1 - i)\) -times continuously differentiable on [0, T]. □ 

3.2 Analysis of Remainder

The objective here is to carry out the error analysis and validate the asymptotic expansion. Since the details are quite similar to those of Section 4.2, we make no attempt to spell them out. Only the following lemma and proposition are presented.

Lemma 4.27.

Suppose that (A4.3) and (A4.4) are satisfied. Let η ε (⋅) be a function such that

$$ \sup\limits_{t\in [0,T]}\vert {\eta }^{\varepsilon }(t)\vert = O({\varepsilon }^{k+1})\mbox{ for }k \leq n$$

and let \({\mathcal{L}}^{\varepsilon }\) be an operator defined in (4.42) . If f ε (⋅) is a solution to the equation

$${\mathcal{L}}^{\varepsilon }{f}^{\varepsilon }(t) = {\eta }^{\varepsilon }(t)\mbox{ with }{f}^{\varepsilon }(0) = 0,$$

then f ε (⋅) satisfies

$$ \sup\limits_{t\in [0,T]}\vert {f}^{\varepsilon }(t)\vert = O({\varepsilon }^{k}).$$

Proof: Note that using \({Q}^{\varepsilon }(t) =\widetilde{ Q}(t)/\varepsilon +\widehat{ Q}(t)\) , the differential equation can be written as

$${ d{f}^{\varepsilon }(t) \over dt} = {f}^{\varepsilon }(t){Q}^{\varepsilon }(t) +{ {\eta }^{\varepsilon }(t) \over \varepsilon }.$$

We can then proceed as in the proof of Lemma  4.13. □ 

Lemma  4.27 together with detailed computation similar to that of Section 4.2 yields the following proposition.

Proposition 4.28.

For each i = 0,1,…,n, define

$${e}_{i}^{\varepsilon }(t) = {p}^{\varepsilon }(t) - {y}_{ i}^{\varepsilon }(t).$$
(4.70)

Under conditions (A4.3) and (A4.4),

$$ \sup\limits_{0\leq t\leq T}\vert {e}_{i}^{\varepsilon }(t)\vert = O({\varepsilon }^{i+1}).$$

3.3 Computational Procedure: User’s Guide

Since the constructions of φ i ( ⋅) and ψ i ( ⋅) are rather involved, and the choice of initial conditions is tricky, we summarize the procedure below. This procedure, which can be used as a user’s guide for developing the asymptotic expansion, comprises two main stages.

Step 1: Initialization: finding φ 0 ( ⋅) and ψ 0 ( ⋅).

    1. Obtain the unique solution \({\varphi }_{0}(\cdot )\) via (4.54).

    2. Obtain the unique solution \({\psi }_{0}(\cdot )\) via (4.55) and the initial condition \({\psi }_{0}(0) = {p}^{0} - {\varphi }_{0}(0)\).

Step 2: Iteration: finding φ i ( ⋅) and ψ i ( ⋅) for 1 ≤ i ≤  n.

While i ≤  n, do the following:

    1. Find \({\varphi }_{i}(\cdot )\), the solution of (4.69), with temporarily unspecified \({{\vartheta}}_{i}^{k}(0)\) for k = 1, …, l.

    2. Obtain \({\psi }_{i}(\cdot )\) from (4.55) with temporarily unspecified \({\psi }_{i}(0)\).

    3. Use the equation

      $${\psi }_{i}(0)\pi = -\left(\,\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j-1}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right)\pi := -{\overline{\psi }}_{i-1}\pi $$

      to obtain \({\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}} = -{\overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{m}_{k}}.\)

    4. Set \({{\vartheta}}_{i}^{k}(0) = -{\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}} ={ \overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{m}_{k}}\). By now, \({\varphi }_{i}(\cdot )\) has been determined uniquely.

    5. Choose \({\psi }_{i}(0) = -{\varphi }_{i}(0)\). By now, \({\psi }_{i}(\cdot )\) has also been determined uniquely.

    6. Set \(i = i + 1\).

    7. If i > n, stop.

3.4 Summary of Results

While the previous subsection gives the computational procedure, this subsection presents the main theorem. It establishes the validity of the asymptotic expansion.

Theorem 4.29.

Suppose conditions (A4.3) and (A4.4) are satisfied. Then the asymptotic expansion

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}\left ({\varepsilon }^{i}{\varphi }_{ i}(t) + {\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )\right )$$

can be constructed as in the computational procedure such that

  • φ i ( ⋅) is \((n + 1 - i)\) - times continuously differentiable on [0, T];

  • \(\vert {\psi }_{i}(t)\vert \leq K\exp (-{\kappa }_{0}t)\) for some \({\kappa }_{0} > 0\);

  • \(\vert {p}^{\varepsilon }(t) - {y}_{n}^{\varepsilon }(t)\vert = O({\varepsilon }^{n+1})\mbox{ uniformly in }t \in [0,T]\).

Remark 4.30.

In general, in view of Proposition  4.11, the error bound is of the form \({c}_{2n}(t)\exp (-{\kappa }_{0}t)\), where \({c}_{2n}(t)\) is a polynomial of degree 2n. The exponential constant \({\kappa }_{0}\) typically depends on n: the larger n is, the smaller \({\kappa }_{0}\) must be taken to accommodate the polynomial \({c}_{2n}(t)\).

The following result is a corollary to Theorem  4.29 and will be used in Chapters 5 and 7. Denote the jth component of νk(t) by ν j k(t).

Corollary 4.31.

Assume, in addition to the conditions in Theorem  4.29 with n = 0, that \(\widetilde{Q}(t) =\widetilde{ Q}\) and \(\widehat{Q}(t) =\widehat{ Q}\) are time independent. Then there exist positive constants K and κ 0 ( both independent of ε and t ) such that

$$\Big{\vert }P({\alpha }^{\varepsilon }(t) = {s}_{ kj}) - {\nu }_{j}^{k}(t){{\vartheta}}^{k}(t)\Big{\vert }\leq K\left(\varepsilon (t + 1) +\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right)\right),$$
(4.71)

where \({{\vartheta}}^{k}(t)\) satisfies

$$\frac{d} {dt}({{\vartheta}}^{1}(t),\ldots,{{\vartheta}}^{l}(t)) = ({{\vartheta}}^{1}(t),\ldots,{{\vartheta}}^{l}(t))\overline{Q},$$

with \(({{\vartheta}}^{1}(0),\ldots,{{\vartheta}}^{l}(0)) = (P({\alpha }^{\varepsilon }(0) \in {\mathcal{M}}_{1}),\ldots,P({\alpha }^{\varepsilon }(0) \in {\mathcal{M}}_{l}))\).

Proof: By a slight modification of the analysis of remainder in Section 4.3, we can obtain (4.71) with a constant K independent of ε and t. The second part of the corollary follows from the uniqueness of the solution of the ordinary differential equation for \(({{\vartheta}}^{1}(t),\ldots,{{\vartheta}}^{l}(t))\). □ 

Remark 4.32.

We mention an alternative approach to establishing the asymptotic expansion. In lieu of the constructive procedure presented previously, one may wish to write φi(t) as a sum of solutions of the homogeneous part and the inhomogeneous part. For instance, one may set

$${\varphi }_{i}(t) = {v}_{i}(t)\mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t)) + {U}_{ i}(t),$$
(4.72)

where \({v}_{i}(t) \in {\mathbb{R}}^{l}\) and Ui(t) is a particular solution of the inhomogeneous equation. For i ≥ 0, the equation

$${\varphi }_{i+1}(t)\widetilde{Q}(t) ={ d{\varphi }_{i}(t) \over dt} - {\varphi }_{i}(t)\widehat{Q}(t)$$

and \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\) lead to

$$0 =\left({ d{\varphi }_{i}(t) \over dt} - {\varphi }_{i}(t)\widehat{Q}(t)\right)\widetilde{\mathrm{1}\mathrm{l}}.$$

Substituting (4.72) into the equation above, and noting that \({\nu }^{k}(t)\mathrm{1}{\mathrm{l}}_{{m}_{k}} = 1\) for k = 1,…,l, and that \(\mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\widetilde{\mathrm{1}\mathrm{l}} = {I}_{l}\), the l × l identity matrix, one obtains

$${ d{v}_{i}(t) \over dt} = {v}_{i}(t)\overline{Q}(t) + {U}_{i}(t)\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} -\left({ d{U}_{i}(t) \over dt} \right)\widetilde{\mathrm{1}\mathrm{l}}.$$

One then proceeds to determine vi(0) via the matching condition. The main ideas are similar, and the details are slightly different.

3.5 An Example

Consider Example  4.20 again. Note that the conditions in (A4.3) and (A4.4) require that

$${\lambda }_{1}(t) + {\mu }_{1}(t) > 0\mbox{ for all }t \in [0,T],$$

and that the jump rates \({\lambda }_{i}(t)\) and \({\mu }_{i}(t)\) be smooth enough.

The probability distribution of the state process is given by p ε(t) satisfying

$$\begin{array}{rl} &\frac{d{p}^{\varepsilon }(t)} {dt} = {p}^{\varepsilon }(t){Q}^{\varepsilon }(t), \\ &{p}^{\varepsilon }(0) = {p}^{0}\ \mbox{ such that} \\ &{p}_{i}^{0} \geq 0\mbox{ and }\sum\limits_{i=1}^{4}{p}_{ i}^{0} = 1. \end{array}$$

To solve this set of equations, note that

$$\begin{array}{l} \frac{d} {dt}({p}_{1}^{\varepsilon }(t) + {p}_{ 2}^{\varepsilon }(t)) = -{\lambda }_{ 2}(t)({p}_{1}^{\varepsilon }(t) + {p}_{ 2}^{\varepsilon }(t)) + {\mu }_{ 2}(t)({p}_{3}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)), \\ \frac{d} {dt}({p}_{1}^{\varepsilon }(t) + {p}_{ 3}^{\varepsilon }(t)) = -\frac{{\lambda }_{1}(t)} {\varepsilon } ({p}_{1}^{\varepsilon }(t) + {p}_{ 3}^{\varepsilon }(t)) + \frac{{\mu }_{1}(t)} {\varepsilon } ({p}_{2}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)), \\ \frac{d} {dt}({p}_{2}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)) = \frac{{\lambda }_{1}(t)} {\varepsilon } ({p}_{1}^{\varepsilon }(t) + {p}_{ 3}^{\varepsilon }(t)) -\frac{{\mu }_{1}(t)} {\varepsilon } ({p}_{2}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)), \\ \frac{d} {dt}({p}_{3}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)) = {\lambda }_{ 2}(t)({p}_{1}^{\varepsilon }(t) + {p}_{ 2}^{\varepsilon }(t)) - {\mu }_{ 2}(t)({p}_{3}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t))\end{array}$$

To proceed, define functions a 12(t), a 13(t), a 24(t), and a 34(t) as follows:

$$\begin{array}{rl} {a}_{12}(t) =&({p}_{1}^{0} + {p}_{ 2}^{0})\exp \left (-{\int }_{0}^{t}({\lambda }_{ 2}(s) + {\mu }_{2}(s))ds\right ) \\ & +{ \int }_{0}^{t}{\mu }_{ 2}(u)\exp \left (-{\int }_{u}^{t}({\lambda }_{ 2}(s) + {\mu }_{2}(s))ds\right )du, \end{array}$$
$$\begin{array}{rl} {a}_{13}(t) =&({p}_{1}^{0} + {p}_{ 3}^{0})\exp \left (-\frac{1} {\varepsilon }{\int }_{0}^{t}({\lambda }_{ 1}(s) + {\mu }_{1}(s))ds\right ) \\ & +{ \int }_{0}^{t}\frac{{\mu }_{1}(u)} {\varepsilon } \exp \left (-\frac{1} {\varepsilon }{\int }_{u}^{t}({\lambda }_{ 1}(s) + {\mu }_{1}(s))ds\right )du,\end{array}$$
$$\begin{array}{rl} {a}_{24}(t) =&({p}_{2}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{1} {\varepsilon }{\int }_{0}^{t}({\lambda }_{ 1}(s) + {\mu }_{1}(s))ds\right ) \\ & +{ \int }_{0}^{t}\frac{{\lambda }_{1}(u)} {\varepsilon } \exp \left (-\frac{1} {\varepsilon }{\int }_{u}^{t}({\lambda }_{ 1}(s) + {\mu }_{1}(s))ds\right )du,\end{array}$$
$$\begin{array}{rl} {a}_{34}(t) =&({p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-{\int }_{0}^{t}({\lambda }_{ 2}(s) + {\mu }_{2}(s))ds\right ) \\ & +{ \int }_{0}^{t}{\lambda }_{ 2}(u)\exp \left (-{\int }_{u}^{t}({\lambda }_{ 2}(s) + {\mu }_{2}(s))ds\right )du\end{array}$$

Then using the fact that \({p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) + {p}_{3}^{\varepsilon }(t) + {p}_{4}^{\varepsilon }(t) = 1\), we have

$$\begin{array}{l} {p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) = {a}_{12}(t), \\ {p}_{1}^{\varepsilon }(t) + {p}_{3}^{\varepsilon }(t) = {a}_{13}(t), \\ {p}_{2}^{\varepsilon }(t) + {p}_{4}^{\varepsilon }(t) = {a}_{24}(t), \\ {p}_{3}^{\varepsilon }(t) + {p}_{4}^{\varepsilon }(t) = {a}_{34}(t).\end{array}$$
(4.73)

Note also that

$$\begin{array}{rl} &\frac{d{p}_{1}^{\varepsilon }(t)} {dt} = -\left (\frac{{\lambda }_{1}(t)} {\varepsilon } + \frac{{\mu }_{1}(t)} {\varepsilon } + {\lambda }_{2}(t) + {\mu }_{2}(t)\right ){p}_{1}^{\varepsilon }(t) \\ &\quad + \frac{{\mu }_{1}(t)} {\varepsilon } {a}_{12}(t) + {\mu }_{2}(t){a}_{13}(t)\end{array}$$

The solution to this equation is

$$\begin{array}{rl} &{p}_{1 }^{\varepsilon }(t) = {p}_{ 1}^{0}\exp \left (-{\int }_{0}^{t}\left (\frac{{\lambda }_{1}(s) + {\mu }_{1}(s)} {\varepsilon } + {\lambda }_{2}(s) + {\mu }_{2}(s)\right )ds\right ) \\ &\qquad +{ \int }_{0}^{t}\left (\frac{{\mu }_{1}(u)} {\varepsilon } {a}_{12}(u) + {\mu }_{2}(u){a}_{13}(u)\right ) \\ &\qquad \quad \times \exp \left (-{\int }_{u}^{t}\left (\frac{{\lambda }_{1}(s) + {\mu }_{1}(s)} {\varepsilon } + {\lambda }_{2}(s) + {\mu }_{2}(s)\right )ds\right )du\end{array}$$

Consequently, in view of (4.73), it follows that

$$\begin{array}{l} {p}_{2}^{\varepsilon }(t) = {a}_{12}(t) - {p}_{1}^{\varepsilon }(t), \\ {p}_{3}^{\varepsilon }(t) = {a}_{13}(t) - {p}_{1}^{\varepsilon }(t), \\ {p}_{4}^{\varepsilon }(t) = {a}_{24}(t) - {p}_{2}^{\varepsilon }(t)\end{array}$$

In this example, the zeroth-order term is given by

$${\varphi }_{0}(t) = ({\nu }^{1}(t){{\vartheta}}_{ 0}^{1}(t),{\nu }^{2}(t){{\vartheta}}_{ 0}^{2}(t)),$$

where the quasi-stationary distributions are given by

$${\nu }^{1}(t) = {\nu }^{2}(t) = \left ( \frac{{\mu }_{1}(t)} {{\lambda }_{1}(t) + {\mu }_{1}(t)}, \frac{{\lambda }_{1}(t)} {{\lambda }_{1}(t) + {\mu }_{1}(t)}\right ),$$

and the multipliers \(({{\vartheta}}_{0}^{1}(t),{{\vartheta}}_{0}^{2}(t))\) are determined by the differential equation

$$\frac{d} {dt}({{\vartheta}}_{0}^{1}(t),{{\vartheta}}_{ 0}^{2}(t)) = ({{\vartheta}}_{ 0}^{1}(t),{{\vartheta}}_{ 0}^{2}(t))\left (\begin{array}{cc} - {\lambda }_{2}(t)& {\lambda }_{2}(t) \\ {\mu }_{2}(t) & - {\mu }_{2}(t) \end{array} \right ),$$

with initial value \(({{\vartheta}}_{0}^{1}(0),{{\vartheta}}_{0}^{2}(0)) = ({p}_{1}^{0} + {p}_{2}^{0},{p}_{3}^{0} + {p}_{4}^{0})\).

The inner-expansion (initial-layer) term ψ0(τ) is determined by

$${ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0),\;{\psi }_{0}(0) = {p}^{0} - {\varphi }_{ 0}(0).$$

By virtue of Theorem  4.29,

$${p}^{\varepsilon }(t) - {\varphi }_{ 0}(t) - {\psi }_{0}\left ({ t \over \varepsilon } \right ) = O(\varepsilon ),$$

provided that Q ε(t) is continuously differentiable on [0, T]. Noting the exponential decay of ψ 0(t ∕ ε), we further have

$${p}^{\varepsilon }(t) = {\varphi }_{ 0}(t) + O\left(\varepsilon +\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right)\right).$$

In particular, for any t > 0,

$$ \lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon }(t) = {\varphi }_{ 0}(t).$$

Namely, φ 0(t) is the limit distribution of the Markov chain generated by Q ε(t).
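
As a numerical illustration of this limit (a sketch under hypothetical rates, assuming NumPy/SciPy; not part of the formal development), one can integrate the forward equation and compare p ε(t) with φ 0(t) = (ν 1(t)ϑ 0 1(t), ν 2(t)ϑ 0 2(t)) for decreasing ε:

```python
import numpy as np
from scipy.integrate import solve_ivp

# hypothetical rates; any smooth, positive, bounded choices will do
lam1 = lambda t: 2.0 + np.sin(t);  mu1 = lambda t: 1.0
lam2 = lambda t: 1.0;              mu2 = lambda t: 1.5 + 0.5 * np.cos(t)

def Q_eps(t, eps):
    # four-state generator: fast 1<->2, 3<->4 (rates /eps); slow 1<->3, 2<->4
    l1, m1, l2, m2 = lam1(t) / eps, mu1(t) / eps, lam2(t), mu2(t)
    return np.array([[-(l1 + l2), l1, l2, 0.0],
                     [m1, -(m1 + l2), 0.0, l2],
                     [m2, 0.0, -(l1 + m2), l1],
                     [0.0, m2, m1, -(m1 + m2)]])

def phi_0(t, p0):
    # outer term: phi_0(t) = (nu(t) theta^1(t), nu(t) theta^2(t))
    nu = np.array([mu1(t), lam1(t)]) / (lam1(t) + mu1(t))
    rhs = lambda s, th: th @ np.array([[-lam2(s), lam2(s)],
                                       [mu2(s), -mu2(s)]])
    th = solve_ivp(rhs, (0.0, t), [p0[0] + p0[1], p0[2] + p0[3]],
                   rtol=1e-10).y[:, -1]
    return np.concatenate([th[0] * nu, th[1] * nu])

p0, t = np.array([1.0, 0.0, 0.0, 0.0]), 1.0
for eps in [0.1, 0.01, 0.001]:
    p = solve_ivp(lambda s, y: y @ Q_eps(s, eps), (0.0, t), p0,
                  method="Radau", rtol=1e-10, atol=1e-12).y[:, -1]
    print(eps, np.abs(p - phi_0(t, p0)).max())   # decreases linearly in eps
```

Off the initial layer, the printed error decreases linearly in ε, consistent with the O(ε) bound above.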

4 Inclusion of Absorbing States

While the case of recurrent states was considered in the previous section, this section concerns the asymptotic expansion when the Markov chain generated by Q ε(t) is such that \(\widetilde{Q}(t)\) includes components corresponding to absorbing states. By rearrangement, the matrix \(\widetilde{Q}(t)\) takes the form

$$\widetilde{Q}(t) = \left (\begin{array}{*{10}c} \widetilde{{Q}}^{1}(t)& & & & \\ &\widetilde{{Q}}^{2}(t)& & &\\ & &\ddots && \\ & & &\widetilde{{Q}}^{l}(t)& \\ & & & &{0}_{{m}_{a}\times {m}_{a}}\\ \end{array} \right ),$$
(4.74)

where \(\widetilde{{Q}}^{k}(t) \in {\mathbb{R}}^{{m}_{k}\times {m}_{k}}\) for k = 1, 2, …, l, \({0}_{{m}_{a}\times {m}_{a}}\) is an m a ×m a zero matrix, and

$${m}_{1} + {m}_{2} + \cdots + {m}_{l} + {m}_{a} = m.$$

Let \({\mathcal{M}}_{a} =\{ {s}_{a1},\ldots,{s}_{a{m}_{a}}\}\) denote the set of absorbing states. We may, as in Section 4.3, represent the state space as

$$\begin{array}{rl} \mathcal{M}& = {\mathcal{M}}_{1} \cup \cdots \cup {\mathcal{M}}_{l} \cup {\mathcal{M}}_{a} \\ & = \{{s}_{11},\ldots,{s}_{1{m}_{1}},\ldots,{s}_{l1},\ldots,{s}_{l{m}_{l}},{s}_{a1},\ldots,{s}_{a{m}_{a}}\}\end{array}$$

Following the development of Section 4.3, suppose that αε ( ⋅) is a Markov chain generated by \({Q}^{\varepsilon }(\cdot ) =\widetilde{ Q}(\cdot )/\varepsilon +\widehat{ Q}(\cdot )\). Compared with Section 4.3, the difference is that now the dominant part in the generator includes absorbing states corresponding to the m a ×m a matrix \({0}_{{m}_{a}\times {m}_{a}}\). As in the previous case, our interest is to obtain an asymptotic expansion of the probability distribution.

Remark 4.33.

The motivation of the current study stems from the formulation of competitive risk theory discussed in Section 3.3. The idea is that within the m states, there are several groups. Some of them are much riskier than the others (in the sense of the frequency of occurrence of the corresponding risks). The different rates (sensitivity) of risks are modeled by the use of a small parameter ε > 0.

Denote by p ε ( ⋅) the solution of (4.40). The objective here is to obtain an asymptotic expansion

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) +\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right ).$$

Since the techniques employed are essentially the same as in the previous section, it will be most instructive here to highlight the main ideas. Thus, we only note the main steps and omit most of the details.

Assume conditions (A4.3) and (A4.4) for the current matrices \(\widetilde{{Q}}^{k}(t),\) \(\widetilde{Q}(t)\) , and \(\widehat{Q}(t)\) . For t ∈ [0, T], substituting the expansion above into (4.40) and equating coefficients of εi, for \(i = 1,\ldots,n + 1\), yields

$$\begin{array}{ll} &{\varphi }_{0}(t)\widetilde{Q}(t) = 0, \\ &{\varphi }_{i}(t)\widetilde{Q}(t) ={ d{\varphi }_{i-1}(t) \over dt} - {\varphi }_{i-1}(t)\widehat{Q}(t),\\ \end{array}$$
(4.75)

and (with the use of the stretched variable \(\tau = t/\varepsilon \))

$$\begin{array}{rl} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0), \\ &{ d{\psi }_{i}(\tau ) \over d\tau } = {\psi }_{i}(\tau )\widetilde{Q}(0) +\sum\limits_{j=0}^{i-1}{\psi }_{ i-j-1}(\tau ) \\ & \times \left({ {\tau }^{j} \over j!} { {d}^{j}\widehat{Q}(0) \over d{t}^{j}} +{ {\tau }^{j+1} \over (j + 1)!} { {d}^{j+1}\widetilde{Q}(0) \over d{t}^{j+1}} \right).\end{array}$$
(4.76)

For each i ≥ 0, we use the following notation for the partitioned vectors:

$$\begin{array}{rl} &{\varphi }_{i}(t) = ({\varphi }_{i}^{1}(t),\ldots,{\varphi }_{ i}^{l}(t),{\varphi }_{ i}^{a}(t)), \\ &{\psi }_{i}(\tau ) = ({\psi }_{i}^{1}(\tau ),\ldots,{\psi }_{ i}^{l}(\tau ),{\psi }_{ i}^{a}(\tau ))\end{array}$$

In the above φ i a(t) and ψ i a (τ) are vectors in \({\mathbb{R}}^{1\times {m}_{a}}\).

To determine the outer- and the initial-layer expansions, let us start with i = 0. For each t ∈ [0, T], the use of the partitioned vector φ 0(t) leads to

$${\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = 0,\mbox{ for }k = 1,\ldots,l.$$

Note that φ0 a(t) does not show up in any of these equations owing to the \({0}_{{m}_{a}\times {m}_{a}}\) matrix in \(\widetilde{Q}(t)\). It will have to be obtained from the equation in (4.75) corresponding to i = 1. Put another way, φ 0 a (t) is determined mainly by the matrix \(\widehat{Q}(t)\).

Similar to Section 4.3, \({\varphi }_{0}^{k}(t) = {{\vartheta}}_{0}^{k}(t){\nu }^{k}(t)\) , where ν k(t) are the quasi-stationary distributions corresponding to the generators \(\widetilde{{Q}}^{k}(t)\) for k = 1, …,  l and \({{\vartheta}}_{0}^{k}(t)\) are the corresponding multipliers. Define

$$\widetilde{\mathrm{1}{\mathrm{l}}}_{a} = \left (\begin{array}{*{10}c} \mathrm{1}{\mathrm{l}}_{{m}_{1}} & & & &\\ & \ddots & && \\ & & & \mathrm{1}{\mathrm{l}}_{{m}_{l}} & \\ & & & & {I}_{{m}_{a}}\\ \end{array} \right ),$$

where \({I}_{{m}_{a}}\) is an m a ×m a identity matrix. Clearly, \(\widetilde{\mathrm{1}{\mathrm{l}}}_{a}\) is orthogonal to \(\widetilde{Q}(t)\) for each t ∈ [0,  T]. As a result, multiplying (4.75) by \(\widetilde{\mathrm{1}{\mathrm{l}}}_{a}\) from the right with i = 1 leads to

$$\begin{array}{ll} &{ d{\varphi }_{0}(t) \over dt} \widetilde{\mathrm{1}{\mathrm{l}}}_{a} = {\varphi }_{0}(t)\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a}, \\ &({{\vartheta}}_{0}(0),{\varphi }_{0}^{a}(0)) = {p}^{0}\widetilde{\mathrm{1}{\mathrm{l}}}_{ a}, \end{array}$$
(4.77)

where \({{\vartheta}}_{0}(0) = ({{\vartheta}}_{0}^{1}(0),\ldots,{{\vartheta}}_{0}^{l}(0)).\)

The above initial condition is a consequence of the initial-value consistency condition in (4.53). It is readily seen that

$$\sum\limits_{k=1}^{l}{{\vartheta}}_{ 0}^{k}(0) = 1 - {\varphi }_{ 0}^{a}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{a}} = 1 - {p}^{0,a}\mathrm{1}{\mathrm{l}}_{{ m}_{a}},$$

where \({p}^{0} = ({p}^{0,1},\ldots,{p}^{0,l},{p}^{0,a})\).

We write

$${\varphi }_{0}(t) = ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t),{\varphi }_{ 0}^{a}(t))\mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t),{I}_{{ m}_{a}}).$$

Define

$$\overline{Q}(t) = \mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t),{I}_{{ m}_{a}})\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a}.$$
(4.78)

Then (4.77) is equivalent to

$$\begin{array}{rl} &{ d \over dt} ({{\vartheta}}_{0}(t),{\varphi }_{0}^{a}(t)) = ({{\vartheta}}_{ 0}(t),{\varphi }_{0}^{a}(t))\overline{Q}(t), \\ &({{\vartheta}}_{0}(0),{\varphi }_{0}^{a}(0)) = {p}^{0}\widetilde{\mathrm{1}{\mathrm{l}}}_{ a}\end{array}$$

This is a linear system of differential equations. Therefore it has a unique solution given by

$$({{\vartheta}}_{0}(t),{\varphi }_{0}^{a}(t)) = {p}^{0}\widetilde{\mathrm{1}{\mathrm{l}}}_{ a}X(t,0),$$

where X( t, 0) is the principal matrix solution of the homogeneous equation. Thus φ0(t) has been found and is ( n + 1)-times continuously differentiable.
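
Numerically, X(t, 0) can be obtained by integrating the matrix ODE directly. The sketch below (assuming NumPy/SciPy, with a hypothetical 3 × 3 generator \(\overline{Q}(t)\) standing in for (4.78) with l = 2 and m a  = 1) illustrates the computation of \(({{\vartheta}}_{0}(t),{\varphi }_{0}^{a}(t))\):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical aggregated generator Qbar(t) (rows sum to zero).
def Qbar(t):
    a = 1.0 + 0.5 * np.sin(t)
    return np.array([[-a, a, 0.0],
                     [0.5, -0.8, 0.3],
                     [0.2, 0.0, -0.2]])

n = 3
# X(t,0) solves dX/dt = X Qbar(t) with X(0,0) = I (principal matrix solution)
rhs = lambda t, x: (x.reshape(n, n) @ Qbar(t)).ravel()
sol = solve_ivp(rhs, (0.0, 2.0), np.eye(n).ravel(), rtol=1e-10)
X = sol.y[:, -1].reshape(n, n)
y0 = np.array([0.3, 0.4, 0.3])     # playing the role of p^0 1l~_a
print(y0 @ X)                      # = (theta_0(2), phi_0^a(2)); sums to 1
```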

Remark 4.34.

Note that in φ0(t), the term φ0 a(t) corresponds to the set of absorbing states \({\mathcal{M}}_{a}\). Clearly, these states cannot be aggregated into a single state as in the case of recurrent states. Nevertheless, the function φ0 a(t) settles near a constant for t large enough. To illustrate, let us consider a stationary case, that is, both \(\widetilde{Q}(t) =\widetilde{ Q}\) and \(\widehat{Q}(t) =\widehat{ Q}\) are independent of t. Partition \(\widehat{Q}\) into blocks of submatrices

$$\widehat{Q} = \left (\begin{array}{cc} \widehat{{Q}}^{11} & \widehat{{Q}}^{12} \\ \widehat{{Q}}^{21} & \widehat{{Q}}^{22}\\ \end{array} \right ),$$

where \(\widehat{{Q}}^{22}\) is an ma × ma matrix. Assume that the eigenvalues of \(\widehat{{Q}}^{22}\) have negative real parts. Then, in view of the definition of \(\overline{Q}(t) = \overline{Q}\) in (4.78), it follows that

$${\varphi }_{0}^{a}(t) \rightarrow \mbox{ a constant as }t \rightarrow \infty.$$

Using the partition ψ0(τ) = (ψ0 1(τ), …, ψ 0 l (τ), ψ 0 a (τ)), consider the zeroth-order initial-layer term given by

$$\begin{array}{rl} { d{\psi }_{0}(\tau ) \over d\tau } & ={ d \over d\tau } ({\psi }_{0}^{1}(\tau ),\ldots,{\psi }_{ 0}^{l}(\tau ),{\psi }_{ 0}^{a}(\tau )) \\ & = {\psi }_{0}(\tau )\widetilde{Q}(0) = ({\psi }_{0}^{1}(\tau )\widetilde{{Q}}^{1}(0),\ldots,{\psi }_{ 0}^{l}(\tau )\widetilde{{Q}}^{l}(0),{0}_{{ m}_{a}}).\end{array}$$

We obtain

$$\begin{array}{rl} &{\psi }_{0}^{k}(\tau ) = {\psi }_{ 0}^{k}(0)\exp (\widetilde{{Q}}^{k}(0)\tau ),\mbox{ for }k = 1,\ldots,l,\mbox{ and } \\ &{\psi }_{0}^{a}(\tau ) = \mbox{ constant.} \end{array}$$

Noting that p 0, a = φ0 a (0) and choosing \({\psi }_{0}(0) = {p}^{0} - {\varphi }_{0}(0)\) lead to \({\psi }_{0}^{a}(\tau ) = {0}_{{m}_{a}}.\) Thus

$${\psi }_{0}(\tau ) = ({\psi }_{0}^{1}(0)\exp (\widetilde{{Q}}^{1}(0)\tau ),\ldots,{\psi }_{ 0}^{l}(0)\exp (\widetilde{{Q}}^{l}(0)\tau ),{0}_{{ m}_{a}}).$$

Similar to the result in Section 4.3, the following lemma holds. The proof is analogous to that of Proposition  4.25.

Lemma 4.35.

Define

$${\pi }_{a} = \mbox{ diag}(\mathrm{1}{\mathrm{l}}_{{m}_{1}}{\nu }^{1}(0),\ldots,\mathrm{1}{\mathrm{l}}_{{ m}_{l}}{\nu }^{l}(0),{I}_{{ m}_{a}}).$$

Then there exist positive constants K and κ0,0 such that

$$\vert \exp (\widetilde{Q}(0)\tau ) - {\pi }_{a}\vert \leq K\exp (-{\kappa }_{0,0}\tau ).$$

By virtue of the lemma above and the orthogonality \(({p}^{0} - {\varphi }_{0}(0)){\pi }_{a} = 0\), we have

$$\begin{array}{rl} \vert {\psi }_{0}(\tau )\vert & = \vert ({p}^{0} - {\varphi }_{ 0}(0))(\exp (\widetilde{Q}(0)\tau ) - {\pi }_{a})\vert \\ &\leq K\exp (-{\kappa }_{0,0}\tau ) \end{array}$$

for some K > 0 and κ 0, 0  > 0 given in Lemma  4.35; that is, ψ0 (τ) decays exponentially fast. Therefore, ψ 0 (τ) has the desired property.
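
A quick numerical check of the exponential decay in Lemma 4.35 (a sketch assuming NumPy/SciPy, with a hypothetical \(\widetilde{Q}(0)\) consisting of one weakly irreducible 2 × 2 block and one absorbing state):

```python
import numpy as np
from scipy.linalg import expm

# Q~(0): one recurrent block and one absorbing state
Qt = np.array([[-1.0, 1.0, 0.0],
               [1.0, -1.0, 0.0],
               [0.0, 0.0, 0.0]])
pi_a = np.array([[0.5, 0.5, 0.0],
                 [0.5, 0.5, 0.0],
                 [0.0, 0.0, 1.0]])   # diag(1l nu^1(0), I_{m_a})
for tau in [1.0, 2.0, 4.0, 8.0]:
    gap = np.abs(expm(Qt * tau) - pi_a).max()
    print(tau, gap)   # decays like exp(-2 tau): the spectral gap of Q~^1(0)
```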

We continue in this fashion and proceed to determine the next term φ1(t) as well as ψ1(t ∕ ε). Let

$$\begin{array}{rl} &{b}_{0}(t) ={ d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)\widehat{Q}(t)\quad \mbox{ with } \\ &{b}_{0}(t) = ({b}_{0}^{1}(t),\ldots,{b}_{ 0}^{l}(t),{b}_{ 0}^{a}(t))\end{array}$$

It is easy to check that \({b}_{0}^{a}(t) = {0}_{{m}_{a}}\) . The equation \({\varphi }_{1}(t)\widetilde{Q}(t) = {b}_{0}(t)\) then leads to

$$\begin{array}{ll} &{\varphi }_{1}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t),\mbox{ for }k = 1,\ldots,l, \\ &{b}_{0}^{a}(t) = {0}_{{ m}_{a}}.\end{array}$$
(4.79)

The solutions of the l inhomogeneous equations in (4.79) above are of the form

$${\varphi }_{1}^{k}(t) = {{\vartheta}}_{ 1}^{k}(t){\nu }^{k}(t) +\widetilde{ {b}}_{ 0}^{k}(t),\ k = 1,\ldots,l,$$

where \({{\vartheta}}_{1}^{k}(t)\) for k = 1, …, l are scalar multipliers. Again, φ 1 a (t) cannot be obtained from the equation above; it must come from the contribution of the matrix-valued function \(\widehat{Q}(t)\).

Note that

$$\widetilde{{b}}_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t)\quad \mbox{ and }\quad \widetilde{{b}}_{ 0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0.$$

Using the equation

$${\varphi }_{2}(t)\widetilde{Q}(t) ={ d{\varphi }_{1}(t) \over dt} - {\varphi }_{1}(t)\widehat{Q}(t),$$

one obtains

$$0 = {\varphi }_{2}(t)\widetilde{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a} ={ d{\varphi }_{1}(t) \over dt} \widetilde{\mathrm{1}{\mathrm{l}}}_{a} - {\varphi }_{1}(t)\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a},$$

which in turn implies that

$$\begin{array}{l} { d \over dt} ({{\vartheta}}_{1}(t),{\varphi }_{1}^{a}(t)) = ({{\vartheta}}_{ 1}(t),{\varphi }_{1}^{a}(t))\overline{Q}(t) \\ + (\widetilde{{b}}_{0}(t),{0}_{{m}_{a}})\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a}, \end{array}$$
(4.80)

where

$${{\vartheta}}_{1}(t) = ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t))\mbox{ and }\widetilde{{b}}_{ 0}(t) = (\widetilde{{b}}_{0}^{1}(t),\ldots,\widetilde{{b}}_{ 0}^{l}(t)).$$

Let X(t,  s) denote the principal matrix solution to the homogeneous differential equation

$$\frac{dy(t)} {dt} = y(t)\overline{Q}(t).$$

Then the solution to (4.80) can be represented by X( t, s) as follows:

$$\begin{array}{rl} ({{\vartheta}}_{1}(t),{\varphi }_{1}^{a}(t))& = ({{\vartheta}}_{1}(0),{\varphi }_{1}^{a}(0))X(t,0) \\ &\qquad +{ \int }_{0}^{t}(\widetilde{{b}}_{ 0}(s),{0}_{{m}_{a}})\widehat{Q}(s)\widetilde{\mathrm{1}{\mathrm{l}}}_{a}X(t,s)ds\end{array}$$

Note that the initial conditions φ 1 a (0) and \({{\vartheta}}_{1}^{k}(0)\) for k = 1, …, l need to be determined using the initial-layer terms just as in Section 4.3.

Using (4.76) with i = 1, one obtains an equation that has the same form as that of (4.62). That is,

$$\begin{array}{rl} {\psi }_{1}(\tau )& = {\psi }_{1}(0)\exp (\widetilde{Q}(0)\tau ) \\ &\quad +{ \int }_{0}^{\tau }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\widehat{Q}(0)\exp (\widetilde{Q}(0)(\tau - s))ds \\ &\quad +{ \int }_{0}^{\tau }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s){ d\widetilde{Q}(0) \over dt} \exp (\widetilde{Q}(0)(\tau - s))ds\end{array}$$

As in Section 4.3, with the use of π a , it can be shown that | ψ1 (τ) | ≤ Kexp( − κ1, 0τ) for some K > 0 and 0 < κ 1, 0  < κ 0, 0 . By requiring that ψ 1 (τ) decay to 0 as τ → ∞, we obtain the equation

$${\psi }_{1}(0){\pi }_{a} = -{\overline{\psi }}_{0}{\pi }_{a},$$
(4.81)

where

$${\overline{\psi }}_{0} ={ \int }_{0}^{\infty }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\widehat{Q}(0).$$

Owing to (4.81) and the known form of ψ0(τ),

$$\begin{array}{rl} {\overline{\psi }}_{0} & = ({\overline{\psi }}_{0}^{1},\ldots,{\overline{\psi }}_{ 0}^{l},{\overline{\psi }}_{ 0}^{a}) \\ & = ({p}^{0,1} - {\varphi }_{ 0}^{1}(0),\ldots,{p}^{0,l} - {\varphi }_{ 0}^{l}(0),{0}_{{ m}_{a}})\left ({\int }_{0}^{\infty }\exp (\widetilde{Q}(0)s)ds\right )\widehat{Q}(0), \end{array}$$

which is a completely known vector. Thus the solution to (4.81) is

$${\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = -{\overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\mbox{ for }k = 1,\ldots,l,\mbox{ and }{\psi }_{1}^{a}(0) = -{\overline{\psi }}_{ 0}^{a}.$$

To obtain the desired matching property for the inner-outer expansions, choose

$$\begin{array}{rl} &{{\vartheta}}_{1}^{k}(0) = -{\psi }_{ 1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\mbox{ for }k = 1,\ldots,l, \\ &{\varphi }_{1}^{a}(0) = -{\psi }_{ 1}^{a}(0) ={ \overline{\psi }}_{ 0}^{a}\end{array}$$

In general, for i = 2, …, n, the initial conditions are selected as follows: For k = 1, 2, …, l, find \({\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) from the equation

$${\psi }_{i}(0){\pi }_{a} = -\left(\;\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j-1}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right){\pi }_{a} := -{\overline{\psi }}_{i-1}{\pi }_{a}.$$

Choose

$${{\vartheta}}_{i}^{k}(0) = -{\psi }_{ i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$

for k = 1, …, l,

$${\varphi }_{i}^{a}(0) = -{\overline{\psi }}_{ i-1}^{a},\quad \mbox{ and }{\psi }_{ i}(0) = -{\varphi }_{i}(0).$$

Proceeding inductively, we then construct all φ i (t) and ψ i (τ). Moreover, we can verify that there exists 0 < κ i, 0  < κ i − 1, 0  < κ 0, 0 such that | ψ i (τ) | ≤  Kexp( − κ i, 0 τ). This indicates that the inclusion of absorbing states is very similar to the case of all recurrent states. In the zeroth-order outer expansion, there is a component φ0 a(t) that “takes care of” the absorbing states. Note, however, that starting from the leading term (zeroth-order approximation), the matching will be determined not only by the multipliers \({{\vartheta}}_{i}(0)\) but also by the vector ψ i (0) associated with the absorbing states. We summarize the results in the following theorem.

Theorem 4.36.

Consider \(\widetilde{Q}(t)\) given by (4.74), and suppose conditions (A4.3) and (A4.4) are satisfied for the matrix-valued functions \(\widetilde{{Q}}^{k}(\cdot )\) for k = 1,…,l and \(\widehat{Q}(\cdot )\). An asymptotic expansion

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}\left ({\varepsilon }^{i}{\varphi }_{ i}(t) + {\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )\right )$$

exists such that

  • φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T];

  •  | ψ i (t) | ≤ Kexp( − κ 0 t) for some K > 0 and 0 < κ 0  < κ i, 0;

  • \(\vert {p}^{\varepsilon }(t) - {y}_{n}^{\varepsilon }(t)\vert = O({\varepsilon }^{n+1})\mbox{ uniformly in }t \in [0,T].\)

Finally, we give a simple example to illustrate the result.

Example 4.37.

Let us consider a Markov chain generated by

$${Q}^{\varepsilon } = \frac{1} {\varepsilon }\widetilde{Q} +\widehat{ Q},$$

where

$$\widetilde{Q} = \left (\begin{array}{*{10}c} -1& 1 &0\\ 1 &-1 &0 \\ 0 & 0 &0\\ \end{array} \right )\mbox{ and }\widehat{Q} = \left (\begin{array}{*{10}c} 0&0& 0\\ 0 &0 & 0 \\ 1&0&-1\\ \end{array} \right ).$$

The chain generated by \(\widetilde{Q}\) is not irreducible; it includes an absorbing state. In this example, \(\overline{Q} = \left (\begin{array}{*{10}c} 0& 0\\ 1 &-1\\ \end{array} \right )\). Let p0 = (p1 0,p2 0,p0,a) denote the initial distribution of αε(⋅). Then solving the forward equation (4.40) gives us

$${p}^{\varepsilon }(t) = ({p}_{ 1}^{\varepsilon }(t),{p}_{ 2}^{\varepsilon }(t),{p}_{ 3}^{\varepsilon }(t)),$$

where

$$\begin{array}{l} {p}_{1}^{\varepsilon }(t) = \frac{{p}_{1}^{0} + {p}_{ 2}^{0} + {p}^{0,a}} {2} \\ \quad \quad -\left (\frac{-{p}_{1}^{0} + {p}_{2}^{0} - {p}^{0,a}} {2} + \frac{{p}^{0,a}} {2 - \varepsilon }\right )\exp \left( -\frac{2t} {\varepsilon } \right) -\left(\frac{(1 - \varepsilon ){p}^{0,a}} {2 - \varepsilon } \right)\exp (-t), \\ {p}_{2}^{\varepsilon }(t) = \frac{{p}_{1}^{0} + {p}_{ 2}^{0} + {p}^{0,a}} {2} \\ \quad \quad + \left (\frac{-{p}_{1}^{0} + {p}_{2}^{0} - {p}^{0,a}} {2} + \frac{{p}^{0,a}} {2 - \varepsilon }\right )\exp \left( -\frac{2t} {\varepsilon } \right) -\left( \frac{{p}^{0,a}} {2 - \varepsilon }\right)\exp (-t), \\ {p}_{3}^{\varepsilon }(t) = {p}^{0,a}\exp (-t)\end{array}$$

Computing φ0(t) yields

$$\begin{array}{rl} {\varphi }_{0}(t)& = \left (\frac{{p}_{1}^{0} + {p}_{2}^{0} + {p}^{0,a}} {2}, \frac{{p}_{1}^{0} + {p}_{2}^{0} + {p}^{0,a}} {2},0\right ) \\ &\qquad + \left (-\frac{{p}^{0,a}} {2},-\frac{{p}^{0,a}} {2},{p}^{0,a}\right )\exp (-t)\end{array}$$

It is easy to see that for t > 0,

$$ \lim\limits_{\varepsilon \rightarrow 0}\vert {p}^{\varepsilon }(t) - {\varphi }_{ 0}(t)\vert = 0.$$

The limit behavior of the underlying Markov chain as ε → 0 is determined by φ0(t) (for t > 0). Moreover, when t is large, the influence from \(\widehat{Q}\) corresponding to the absorbing state (the vector multiplied by exp (−t)) can be ignored because exp (−t) goes to 0 exponentially fast as t →∞.
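
The computations of this example are easy to reproduce numerically. The sketch below (assuming NumPy/SciPy; the initial distribution is an arbitrary choice) evaluates p ε(t) = p 0exp(Q εt) and compares it with the φ 0(t) displayed above:

```python
import numpy as np
from scipy.linalg import expm

Qt = np.array([[-1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 0.0]])
Qh = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, -1.0]])
p0 = np.array([0.2, 0.3, 0.5])             # (p_1^0, p_2^0, p^{0,a})
t = 1.0
s = (p0[0] + p0[1] + p0[2]) / 2.0           # = 1/2
phi0 = np.array([s, s, 0.0]) + p0[2] * np.exp(-t) * np.array([-0.5, -0.5, 1.0])
for eps in [0.1, 0.01, 0.001]:
    p_eps = p0 @ expm((Qt / eps + Qh) * t)  # constant generators: use expm
    print(eps, np.abs(p_eps - phi0).max())  # error shrinks like eps
```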

5 Inclusion of Transient States

If a Markov chain has transient states, then, relabeling the states through suitable permutations, one can decompose the states into several groups of recurrent states, each of which is weakly irreducible, and a group of transient states. Naturally, we consider the generator \(\widetilde{Q}(t)\) in Q ε(t) having the form

$$\widetilde{Q}(t) = \left (\begin{array}{cccc} \widetilde{{Q}}^{1}(t) & & & \\ & \ddots & & \\ & & \widetilde{{Q}}^{l}(t) & \\ \widetilde{{Q}}_{{_\ast}}^{1}(t)&\cdots &\widetilde{{Q}}_{{_\ast}}^{l}(t)&\widetilde{{Q}}_{{_\ast}}(t)\\ \end{array} \right )$$
(4.82)

such that for each t ∈ [0, T], and each k = 1, …,  l, \(\widetilde{{Q}}^{k}(t)\) is a generator with dimension m k ×m k , \(\widetilde{{Q}}_{{_\ast}}(t)\) is an m  ∗  × m  ∗  matrix, \(\widetilde{{Q}}_{{_\ast}}^{k}(t) \in {\mathbb{R}}^{{m}_{{_\ast}}\times {m}_{k}}\), and

$${m}_{1} + {m}_{2} + \cdots + {m}_{l} + {m}_{{_\ast}} = m.$$

We continue our study of singularly perturbed chains with weak and strong interactions by incorporating the transient states into the model. Let αε( ⋅) be a Markov chain generated by Q ε ( ⋅), with \({Q}^{\varepsilon }(t) \in {\mathbb{R}}^{m\times m}\) given by (4.39) with \(\widetilde{Q}(t)\) given by (4.82). The state space of the underlying Markov chain is given by

$$\mathcal{M} = {\mathcal{M}}_{1} \cup \cdots \cup {\mathcal{M}}_{l} \cup {\mathcal{M}}_{{_\ast}}$$

where \({\mathcal{M}}_{k} =\{ {s}_{k1},\ldots,{s}_{k{m}_{k}}\}\) are the states corresponding to the recurrent states and \({\mathcal{M}}_{{_\ast}} =\{ {s}_{{_\ast}1},\ldots,{s}_{{_\ast}{m}_{{_\ast}}}\}\) are those corresponding to the transient states.

Since \(\widetilde{Q}(t)\) is a generator, for each k = 1, …,  l, \(\widetilde{{Q}}^{k}(t)\) is a generator. Thus the matrix \(\widetilde{{Q}}_{{_\ast}}^{k}(t) = (\widetilde{{q}}_{{_\ast},ij}^{k})\) satisfies \(\widetilde{{q}}_{{_\ast},ij}^{k} \geq 0\) for each i = 1, …, m  ∗  and j = 1, …,  m k , and \(\widetilde{{Q}}_{{_\ast}}(t) = (\widetilde{{q}}_{{_\ast},ij})\) satisfies

$$\widetilde{{q}}_{{_\ast},ij}(t) \geq 0\mbox{ for }i\neq j,\widetilde{{q}}_{{_\ast},ii}(t) < 0,\mbox{ and }\widetilde{{q}}_{{_\ast},ii}(t) \leq -\sum\limits_{j\neq i}\widetilde{{q}}_{{_\ast},ij}(t).$$

Roughly, the block matrix \((\widetilde{{Q}}_{{_\ast}}^{1}(t),\ldots,\widetilde{{Q}}_{{_\ast}}^{l}(t),\widetilde{{Q}}_{{_\ast}}(t))\) is “negatively dominated” by the matrix \(\widetilde{{Q}}_{{_\ast}}(t)\) . Thus it is natural to assume that \(\widetilde{{Q}}_{{_\ast}}(t)\) is a stable matrix (or Hurwitz, i.e., all its eigenvalues have negative real parts). Comparing with the setups of Sections 4.3 and 4.4, the difference in \(\widetilde{Q}(t)\) is the additional matrices \(\widetilde{{Q}}_{{_\ast}}^{k}(t)\) for k = 1, …,  l and \(\widetilde{{Q}}_{{_\ast}}(t)\). Note that \(\widetilde{{Q}}_{{_\ast}}^{k}(t)\) are nonsquare matrices, and \(\widetilde{Q}(t)\) no longer has block-diagonal form.

The formulation here is inspired by the work of Phillips and Kokotovic [175] and Delebecque and Quadrat [44]; see also the recent work of Pan and Başar [164], in which the authors treated a time-invariant \(\widetilde{Q}\) matrix of a similar form. Sections 4.3 and 4.4 together with this section essentially cover the generators of finite-state Markov chains of the most practical concern. It ought to be pointed out that, just as one cannot in general simultaneously diagonalize two matrices, for Markov chains with weak and strong interactions one cannot put both \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\) into the forms mentioned above simultaneously. Although the model to be studied in this section is slightly more complex than the block-diagonal \(\widetilde{Q}(t)\) in (4.41), we demonstrate that an asymptotic expansion of the probability distribution can still be obtained by using the same techniques as in the previous sections. Moreover, it can be seen from the expansion that the underlying Markov chain stays in the transient states only with very small probability. In some cases, for example \(\widehat{Q}(t) = 0\), these transient states can be ignored; see Remark  4.40 for more details.

To incorporate the transient states, we need the following conditions. The main addition is the assumption that \(\widetilde{{Q}}_{{_\ast}}(t)\) is stable.

    • For each t ∈ [0,  T] and k = 1, …, l, \(\widetilde{Q}(t),\) \(\widehat{Q}(t)\) , and \(\widetilde{{Q}}^{k}(t)\) satisfy (A4.3) and (A4.4).

    • For each t ∈ [0, T], \(\widetilde{{Q}}_{{_\ast}}(t)\) is Hurwitz (i.e., all of its eigenvalues have negative real parts).

Remark 4.38.

Condition (A4.6) indicates the inclusion of transient states. Since \(\widetilde{{Q}}_{{_\ast}}(t)\) is Hurwitz, it is nonsingular. Thus the inverse matrix \(\widetilde{{Q}}_{{_\ast}}^{-1}(t)\) exists for each t ∈ [0,T].

Let p ε( ⋅) denote the solution to (4.40) with \(\widetilde{Q}(t)\) specified in (4.82). We seek asymptotic expansions of p ε( ⋅) having the form

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) +\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right ).$$

The development is very similar to that of Section 4.3, so no attempt is made to give verbatim details. Instead, only the salient features will be brought out.

Substituting y n ε(t) into the forward equation and equating coefficients of εi for i = 1, …, n lead to the equations

$$\begin{array}{ll} &{\varphi }_{0}(t)\widetilde{Q}(t) = 0, \\ &{\varphi }_{i}(t)\widetilde{Q}(t) ={ d{\varphi }_{i-1}(t) \over dt} - {\varphi }_{i-1}(t)\widehat{Q}(t),\\ \end{array}$$
(4.83)

and with the change of time scale \(\tau = t/\varepsilon \),

$$\begin{array}{rl} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0), \\ &{ d{\psi }_{i}(\tau ) \over d\tau } = {\psi }_{i}(\tau )\widetilde{Q}(0) +\sum\limits_{j=0}^{i-1}{\psi }_{ i-j-1}(\tau ) \\ & \times \left({ {\tau }^{j} \over j!} { {d}^{j}\widehat{Q}(0) \over d{t}^{j}} +{ {\tau }^{j+1} \over (j + 1)!} { {d}^{j+1}\widetilde{Q}(0) \over d{t}^{j+1}} \right).\end{array}$$
(4.84)

As far as the expansions are concerned, the equations have exactly the same form as that of Section 4.3. Note, however, that the partitioned vector φ i (t) has the form

$${\varphi }_{i}(t) = ({\varphi }_{i}^{1}(t),\ldots,{\varphi }_{ i}^{l}(t),{\varphi }_{ i}^{{_\ast}}(t)),\;i = 0,1,\ldots,n,$$

where φ i k(t), k = 1, …, l, is an m k row vector and φ i  ∗ (t) is an m  ∗  row vector. A similar partition holds for the vector ψ i (t). To construct these functions, we begin with i = 0. Writing \({\varphi }_{0}(t)\widetilde{Q}(t) = 0\) in terms of the corresponding partition, we have

$$\begin{array}{rl} &{\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) + {\varphi }_{ 0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{k}(t) = 0,\mbox{ for }k = 1,\ldots,l,\mbox{ and } \\ &{\varphi }_{0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}(t) = 0.\end{array}$$

Since \(\widetilde{{Q}}_{{_\ast}}(t)\) is stable, it is nonsingular. The last equation above implies \({\varphi }_{0}^{{_\ast}}(t) = {0}_{{m}_{{_\ast}}} = (0,\ldots,0) \in {\mathbb{R}}^{1\times {m}_{{_\ast}}}\). Consequently, as in the previous section, for each k = 1, …, l, the weak irreducibility of \(\widetilde{{Q}}^{k}(t)\) implies that \({\varphi }_{0}^{k}(t) = {{\vartheta}}_{0}^{k}(t){\nu }^{k}(t)\), for some scalar function \({{\vartheta}}_{0}^{k}(t)\). Equivalently,

$${\varphi }_{0}(t) = ({{\vartheta}}_{0}^{1}(t){\nu }^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t){\nu }^{l}(t),{0}_{{ m}_{{_\ast}}}).$$

Comparing the equation above with the corresponding expression of φ0(t) in Section 4.3, the only difference is the addition of the m  ∗ -dimensional row vector \({0}_{{m}_{{_\ast}}}\).

Remark 4.39.

Note that the dominant term in the asymptotic expansion is φ0(t), in which the probabilities corresponding to the transient states are 0. Thus, the probability corresponding to αε(t) ∈{ transient states } is negligibly small.

Define

$$\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) = \left (\begin{array}{cccc} \mathrm{1}{\mathrm{l}}_{{m}_{1}} & & & \\ & \ddots & & \\ & & \mathrm{1}{\mathrm{l}}_{{m}_{l}} & \\ {a}_{{m}_{1}}(t)&\cdots &{a}_{{m}_{l}}(t)&{0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}} \end{array} \right )$$
(4.85)

where \({a}_{{m}_{k}}(t) = -\widetilde{{Q}}_{{_\ast}}^{-1}(t)\widetilde{{Q}}_{{_\ast}}^{k}(t)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) for k = 1, …, l, and \({0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}}\) is the zero matrix in \({\mathbb{R}}^{{m}_{{_\ast}}\times {m}_{{_\ast}}}\).

It is readily seen that

$$\widetilde{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) = 0\mbox{ for each }t \in [0,T].$$
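
This orthogonality is easy to verify numerically; the sketch below (assuming NumPy) does so for the constant \(\widetilde{Q}\) of Example 4.46 later in this section, where l = 1 and m  ∗  = 2:

```python
import numpy as np

Qt = np.array([[-1.0, 1.0, 0.0, 0.0],
               [1.0, -1.0, 0.0, 0.0],
               [1.0, 0.0, -2.0, 1.0],
               [0.0, 1.0, 1.0, -2.0]])
Qs, Qs1 = Qt[2:, 2:], Qt[2:, :2]           # Q~_* and Q~_*^1
a1 = -np.linalg.solve(Qs, Qs1 @ np.ones(2))  # a_{m_1} = -Q~_*^{-1} Q~_*^1 1l
ones_star = np.zeros((4, 3))                 # 1l~_* in R^{m x (l + m_*)}
ones_star[:2, 0] = 1.0
ones_star[2:, 0] = a1
print(np.abs(Qt @ ones_star).max())          # = 0, i.e., Q~ 1l~_* = 0
```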

In view of (4.83), it follows that

$$\begin{array}{l} { d \over dt} ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t),{0}_{{ m}_{{_\ast}}}) \\ \quad \quad = ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t),{0}_{{ m}_{{_\ast}}})\overline{Q}(t), \end{array}$$
(4.86)

where

$$\overline{Q}(t) = \mbox{ diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t),{0}_{{ m}_{{_\ast}}\times {m}_{{_\ast}}})\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t).$$

We write \(\widehat{Q}(t)\) as follows:

$$\widehat{Q}(t) = \left (\begin{array}{cc} \widehat{{Q}}^{11}(t)&\widehat{{Q}}^{12}(t) \\ \widehat{{Q}}^{21}(t)&\widehat{{Q}}^{22}(t)\\ \end{array} \right ),$$

where for each t ∈ [0, T],

$$\begin{array}{rl} &\widehat{{Q}}^{11}(t) \in {\mathbb{R}}^{(m-{m}_{{_\ast}})\times (m-{m}_{{_\ast}})},\;\widehat{{Q}}^{12}(t) \in {\mathbb{R}}^{(m-{m}_{{_\ast}})\times {m}_{{_\ast}} }, \\ &\widehat{{Q}}^{21}(t) \in {\mathbb{R}}^{{m}_{{_\ast}}\times (m-{m}_{{_\ast}})},\mbox{ and }\widehat{{Q}}^{22}(t) \in {\mathbb{R}}^{{m}_{{_\ast}}\times {m}_{{_\ast}} }\end{array}$$

Let

$${\overline{Q}}_{{_\ast}}(t) = \mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\left (\widehat{{Q}}^{11}(t)\widetilde{\mathrm{1}\mathrm{l}} +\widehat{ {Q}}^{12}(t)({a}_{{ m}_{1}}(t),\ldots,{a}_{{m}_{l}}(t))\right ).$$

Then \(\overline{Q}(t) = \mathrm{diag}({\overline{Q}}_{{_\ast}}(t),{0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}})\). Moreover, the differential equation (4.86) becomes

$${ d \over dt} ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t)) = ({{\vartheta}}_{ 0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t)){\overline{Q}}_{ {_\ast}}(t).$$

Remark 4.40.

Note that the submatrix \(\widehat{{Q}}^{12}(t)\) in \(\widehat{Q}(t)\) determines the jump rates of the underlying Markov chain from a recurrent state in \({\mathcal{M}}_{1} \cup \cdots \cup {\mathcal{M}}_{l}\) to a transient state in \({\mathcal{M}}_{{_\ast}}\). If the magnitude of the entries of \(\widehat{{Q}}^{12}(t)\) is small, then the transient states can be safely ignored because the contribution of \(\widehat{{Q}}^{12}(t)\) to \(\overline{Q}(t)\) is small. On the other hand, if \(\widehat{{Q}}^{12}(t)\) is not negligible, then one has to be careful to include the corresponding terms in \(\overline{Q}(t)\).

We now determine the initial value \({{\vartheta}}_{0}^{k}(0)\). In view of the asymptotic expansions y n ε(t) and the initial-value consistency condition in (4.53), it is necessary that for k = 1, …, l,

$${{\vartheta}}_{0}^{k}(0) = {\varphi }_{ 0}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} =\lim\limits_{\delta \rightarrow 0}\lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon,k}(\delta )\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$
(4.87)

where p ε(t) = (p ε, 1(t), …, p ε, l(t), p ε, ∗ (t)) is a solution to (4.40). Here p ε, k(t) has dimensions compatible with φ0 k(0) and ψ0 k(0). Similarly, we write the partition of the initial vector as p 0 = (p 0, 1, …, p 0, l, p 0, ∗ ). The next theorem establishes the desired consistency of the initial values. Its proof is placed in Appendix A.4.

Theorem 4.41.

Assume (A4.5) and (A4.6) . Then for k = 1,…,l,

$$ \lim\limits_{\delta \rightarrow 0}\left(\limsup\limits_{\varepsilon \rightarrow 0}\left \vert {p}^{\varepsilon,k}(\delta )\mathrm{1}{\mathrm{l}}_{{ m}_{k}} -\left ({p}^{0,k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} - {p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)\widetilde{{Q}}_{ {_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\right )\right \vert \right) = 0.$$

Remark 4.42.

In view of this theorem, the initial value should be given as

$${{\vartheta}}_{0}^{k}(0) = {p}^{0,k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} - {p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)\widetilde{{Q}}_{ {_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}.$$
(4.88)

Therefore, in view of (4.88), to make sure that the initial condition satisfies the probabilistic interpretation, it is necessary that

$${{\vartheta}}_{0}^{k}(t) \geq 0\mbox{ for }t \in [0,T]\mbox{ and }k = 1,\ldots,l\mbox{ and }\sum\limits_{k=1}^{l}{{\vartheta}}_{ 0}^{k}(0) = 1.$$

In view of the structure of the \(\widetilde{Q}(0)\) matrix, for each k = 1,…,l, all components of the vector \(\widetilde{{Q}}_{{_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) are nonnegative. Note that the solution of the differential equation

$$\begin{array}{rl} &{ dy(t) \over dt} = y(t)\widetilde{Q}(0), \\ &y(0) = {p}^{0} \end{array}$$

is \({p}^{0}\exp (\widetilde{Q}(0)t)\). This implies that all components of \({p}^{0,{_\ast}}\exp (\widetilde{{Q}}_{{_\ast}}(0)t)\) are nonnegative. By virtue of the stability of \(\widetilde{{Q}}_{{_\ast}}(0)\),

$$-\widetilde{{Q}}_{{_\ast}}^{-1}(0) ={ \int }_{0}^{\infty }\exp (\widetilde{{Q}}_{ {_\ast}}(0)t)dt.$$

Thus all components of \(-{p}^{0,{_\ast}}\widetilde{{Q}}_{{_\ast}}^{-1}(0)\) are nonnegative, and as a result, the inner product

$$-{p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)\widetilde{{Q}}_{ {_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}$$

is nonnegative. It follows that for each k = 1,…,l, \({{\vartheta}}_{0}^{k}(0) \geq {p}^{0,k}\mathrm{1}{\mathrm{l}}_{{m}_{k}} \geq 0\). Moreover,

$$\begin{array}{ll} \sum\limits_{k=1}^{l}{{\vartheta}}_{ 0}^{k}(0)& =\sum\limits_{k=1}^{l}{p}^{0,k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} - {p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)\left(\;\sum\limits_{k=1}^{l}\widetilde{{Q}}_{ {_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\right) \\ & = (1 - {p}^{0,{_\ast}}\mathrm{1}{\mathrm{l}}_{{ m}_{{_\ast}}}) - {p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)(-\widetilde{{Q}}_{ {_\ast}}(0)\mathrm{1}{\mathrm{l}}_{{m}_{{_\ast}}}) = 1.\end{array}$$
(4.89)
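
As a concrete numerical check of (4.88) and (4.89) (a sketch assuming NumPy, using the matrices of Example 4.46 below with an arbitrary initial distribution):

```python
import numpy as np

Qs = np.array([[-2.0, 1.0], [1.0, -2.0]])   # Q~_*(0), Hurwitz
Qs1 = np.eye(2)                             # Q~_*^1(0)
ones = np.ones(2)
p0 = np.array([0.1, 0.2, 0.3, 0.4])         # (p^{0,1}, p^{0,*}), arbitrary
theta0 = p0[:2] @ ones - p0[2:] @ np.linalg.solve(Qs, Qs1 @ ones)
print(theta0)   # = 1.0 since l = 1: nonnegative and sums to one, cf. (4.89)
```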

Before treating the terms in ψ0( ⋅), let us give an estimate on \(\exp (\widetilde{Q}(0)t)\).

Lemma 4.43.

Set

$${\pi }_{{_\ast}} = \left (\begin{array}{cccc} \mathrm{1}{\mathrm{l}}_{{m}_{1}}{\nu }^{1}(0) & & & \\ & \ddots & & \\ & & \mathrm{1}{\mathrm{l}}_{{m}_{l}}{\nu }^{l}(0) & \\ {a}_{{m}_{1}}(0){\nu }^{1}(0)&\cdots &{a}_{{m}_{l}}(0){\nu }^{l}(0)&\mathrm{1}{\mathrm{l}}_{{m}_{{_\ast}}}{0}_{{m}_{{_\ast}}}\\ \end{array} \right ).$$

Then there exist positive constants K and κ 0,0 such that

$$\Big{\vert }\exp (\widetilde{Q}(0)\tau ) - {\pi }_{{_\ast}}\Big{\vert }\leq K\exp (-{\kappa }_{0,0}\tau ),$$
(4.90)

for τ ≥ 0.

Proof: To prove (4.90), it suffices to show that for any row vector \({y}^{0} \in {\mathbb{R}}^{1\times m}\),

$$\Big{\vert }{y}^{0}(\exp (\widetilde{Q}(0)\tau ) - {\pi }_{ {_\ast}})\Big{\vert }\leq K\vert {y}^{0}\vert \exp (-{\kappa }_{ 0,0}\tau ).$$

Given \({y}^{0} = ({y}^{0,1},\ldots,{y}^{0,l},{y}^{0,{_\ast}}) \in {\mathbb{R}}^{1\times m}\), let

$$y(\tau ) = ({y}^{1}(\tau ),\ldots,{y}^{l}(\tau ),{y}^{{_\ast}}(\tau )) = {y}^{0}\exp (\widetilde{Q}(0)\tau ).$$

Then, y(τ) is a solution to

$$\frac{dy(\tau )} {d\tau } = y(\tau )\widetilde{Q}(0),\;y(0) = {y}^{0}.$$

It follows that

$${y}^{{_\ast}}(\tau ) = {y}^{0,{_\ast}}\exp (\widetilde{{Q}}_{ {_\ast}}(0)\tau )$$

and for k = 1, …, l,

$${y}^{k}(\tau ) = {y}^{0,k}\exp (\widetilde{{Q}}^{k}(0)\tau ) +{ \int }_{0}^{\tau }{y}^{{_\ast}}(s)\widetilde{{Q}}_{ {_\ast}}^{k}(0)\exp (\widetilde{{Q}}^{k}(0)(\tau - s))ds.$$

For each k = 1, …, l, we have

$$\begin{array}{rl} &{y}^{k } (\tau ) -\left ({y}^{0,k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0) + {y}^{0,{_\ast}}{\int }_{0}^{\infty }\exp (\widetilde{{Q}}_{ {_\ast}}(0)s)ds\widetilde{{Q}}_{{_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right ) \\ & = {y}^{0,k}\left (\exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right ) \\ &\quad + {y}^{0,{_\ast}}{\int }_{0}^{\tau }\exp (\widetilde{{Q}}_{ {_\ast}}(0)s)\widetilde{{Q}}_{{_\ast}}^{k}(0)\left (\exp (\widetilde{{Q}}^{k}(0)(\tau - s)) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right )ds \\ &\quad - {y}^{0,{_\ast}}{\int }_{\tau }^{\infty }\exp (\widetilde{{Q}}_{ {_\ast}}(0)s)\widetilde{{Q}}_{{_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)ds\end{array}$$

By virtue of the stability of \(\widetilde{{Q}}_{{_\ast}}(0)\), the last term above is bounded above by K | y 0, ∗  | exp( − κ ∗ τ) for some κ ∗  > 0. Recall that by virtue of Lemma  4.4, for some κ0, k  > 0,

$$\left \vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right \vert \leq K\exp (-{\kappa }_{ 0,k}\tau ).$$

Choose κ0, 0 = min(κ ∗ , min k {κ 0, k }). The terms in the second and the third lines above are bounded by K | y 0 | exp( − κ0, 0τ). The desired estimate thus follows. □ 
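
The estimate (4.90) can be observed numerically. The sketch below (assuming NumPy/SciPy) uses the \(\widetilde{Q}\) of Example 4.46 (one recurrent block, two transient states); the printed gap decays like exp( − τ), so κ 0, 0 = 1 there:

```python
import numpy as np
from scipy.linalg import expm

Qt = np.array([[-1.0, 1.0, 0.0, 0.0],
               [1.0, -1.0, 0.0, 0.0],
               [1.0, 0.0, -2.0, 1.0],
               [0.0, 1.0, 1.0, -2.0]])
nu1 = np.array([0.5, 0.5])                  # quasi-stationary dist. of Q~^1(0)
a1 = -np.linalg.solve(Qt[2:, 2:], Qt[2:, :2] @ np.ones(2))   # a_{m_1}(0)
pi_star = np.hstack([np.vstack([np.outer(np.ones(2), nu1),
                                np.outer(a1, nu1)]),
                     np.zeros((4, 2))])
for tau in [1.0, 2.0, 4.0, 8.0]:
    print(tau, np.abs(expm(Qt * tau) - pi_star).max())  # ~ K exp(-tau)
```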

Next consider the first equation in the initial-layer expansions:

$${ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0).$$

The solution to this equation can be written as

$${\psi }_{0}(\tau ) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau ).$$

To be able to match the asymptotic expansion, choose

$${\psi }_{0}(0) = {p}^{0} - {\varphi }_{ 0}(0).$$

Thus,

$$\begin{array}{ll} {\psi }_{0}(\tau )& = ({p}^{0} - {\varphi }_{ 0}(0))\exp (\widetilde{Q}(0)\tau ) \\ & = ({p}^{0} - {\varphi }_{ 0}(0))\left (\exp (\widetilde{Q}(0)\tau ) - {\pi }_{{_\ast}}\right ) + ({p}^{0} - {\varphi }_{ 0}(0)){\pi }_{{_\ast}}\end{array}$$

By virtue of the choice of φ0(0), it is easy to show that

$$({p}^{0} - {\varphi }_{ 0}(0)){\pi }_{{_\ast}} = 0.$$

Therefore, in view of Lemma  4.43, ψ0( ⋅) decays exponentially fast in that for some constants K and κ0, 0 > 0 given in Lemma  4.43,

$$\vert {\psi }_{0}(\tau )\vert \leq K\exp (-{\kappa }_{0,0}\tau ),\;\tau \geq 0.$$

We have obtained φ0( ⋅) and ψ0( ⋅). To proceed, set

$${b}_{0}(t) ={ d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)\widehat{Q}(t)$$

and

$${b}_{0}(t) = ({b}_{0}^{1}(t),\ldots,{b}_{ 0}^{l}(t),{b}_{ 0}^{{_\ast}}(t)).$$

Note that b 0(t) is a completely known function.

In view of the second equation in (4.83),

$$\begin{array}{ll} &{\varphi }_{1}^{k}(t)\widetilde{{Q}}^{k}(t) + {\varphi }_{ 1}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{k}(t) = {b}_{ 0}^{k}(t)\mbox{ for }k = 1,\ldots,l, \\ &{\varphi }_{1}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}(t) = {b}_{0}^{{_\ast}}(t).\end{array}$$
(4.91)

Solving the last equation in (4.91) yields

$${\varphi }_{1}^{{_\ast}}(t) = {b}_{ 0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{-1}(t).$$

Putting this back into the first l equations of (4.91) leads to

$${\varphi }_{1}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t) - {b}_{ 0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{-1}(t)\widetilde{{Q}}_{ {_\ast}}^{k}(t).$$
(4.92)

Again, the right side is a known function. In view of the choice of φ0( ⋅) and (4.86), we have \({b}_{0}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) = 0\). This implies

$$\begin{array}{l} {b}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} - {b}_{0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{-1}(t)\widetilde{{Q}}_{ {_\ast}}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} \\ \quad = {b}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} + {b}_{0}^{{_\ast}}(t){a}_{{ m}_{k}}(t) = 0.\end{array}$$

Therefore, (4.92) has a particular solution \(\widetilde{{b}}_{0}^{k}(t)\) with

$$\widetilde{{b}}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0,\mbox{ for }k = 1,\ldots,l.$$

As in the previous section, we write the solution of φ1 k(t) as a sum of the homogeneous solution and a solution of the inhomogeneous equation \(\widetilde{{b}}_{0}^{k}(t)\), that is,

$${\varphi }_{1}^{k}(t) = {{\vartheta}}_{ 1}^{k}(t){\nu }^{k}(t) +\widetilde{ {b}}_{ 0}^{k}(t)\mbox{ for }k = 1,\ldots,l.$$

In view of

$$\begin{array}{rl} &\widetilde{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) = 0\ \mbox{ and} \\ &\widetilde{{b}}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0, \end{array}$$

using the equation

$${\varphi }_{2}(t)\widetilde{Q}(t) = \frac{d{\varphi }_{1}(t)} {dt} - {\varphi }_{1}(t)\widehat{Q}(t),$$

we obtain that

$$\begin{array}{rl} &\frac{d} {dt}({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t),0) \\ & = ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t),0)\overline{Q}(t) +\widetilde{ {b}}_{ 0}(t)\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) \\ &\quad -\left(\frac{d\widetilde{{b}}_{0}^{{_\ast}}(t)} {dt} \right)\left ({a}_{{m}_{1}}(t),\ldots,{a}_{{m}_{l}}(t),{0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}}\right ).\end{array}$$
(4.93)

The initial value \({{\vartheta}}_{1}(0)\) will be determined in conjunction with the initial value of ψ1( ⋅) next.

Note that in comparison with the differential equation governing \({{\vartheta}}_{1}(t)\) in Section 4.3, the equation (4.93) has an extra term involving the derivative of \(\widetilde{{b}}_{0}^{{_\ast}}(t)\).

To determine ψ1( ⋅), solving the equation in (4.84) with i = 1, we have

$$\begin{array}{rl} {\psi }_{1}(\tau ) =&{\psi }_{1}(0)\exp (\widetilde{Q}(0)\tau ) \\ & +{ \int }_{0}^{\tau }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\widehat{Q}(0)\exp (\widetilde{Q}(0)(\tau - s))ds \\ & +{ \int }_{0}^{\tau }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\left(\frac{d\widetilde{Q}(0)} {dt} \right)\exp (\widetilde{Q}(0)(\tau - s))ds\end{array}$$

Choose the initial values of ψ1(0) and \({{\vartheta}}_{1}^{k}(0)\) as follows:

$$\begin{array}{cl} {\psi }_{1}(0) & = -{\varphi }_{1}(0), \\ {{\vartheta}}_{1}^{k}(0) & = -{\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}, \\ {\psi }_{1}(0){\pi }_{{_\ast}}& = -\left ({\int }_{0}^{\infty }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\widehat{Q}(0){\pi }_{{_\ast}} \\ &\qquad \quad -\left ({\int }_{0}^{\infty }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\frac{d\widetilde{Q}(0)} {dt} {\pi }_{{_\ast}} \\ & := -{\overline{\psi }}_{0}{\pi }_{{_\ast}}.\end{array}$$
(4.94)

Write \({\overline{\psi }}_{0} = ({\overline{\psi }}_{0}^{1},\ldots,{\overline{\psi }}_{0}^{l},{\overline{\psi }}_{0}^{{_\ast}})\). Then the definition of π ∗  implies that

$${\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} + {\psi }_{1}^{{_\ast}}(0){a}_{{ m}_{k}}(0) = -({\overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} +{ \overline{\psi }}_{0}^{{_\ast}}{a}_{{ m}_{k}}(0)).$$

Recall that

$${\varphi }_{1}^{{_\ast}}(0) + {\psi }_{ 1}^{{_\ast}}(0) = 0$$

and

$${\varphi }_{1}^{{_\ast}}(t) = {b}_{ 0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{-1}(t).$$

It follows that

$${\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = -({\overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} +{ \overline{\psi }}_{0}^{{_\ast}}{a}_{{ m}_{k}}(0)) + {b}_{0}^{{_\ast}}(0)\widetilde{{Q}}_{ {_\ast}}^{-1}(0){a}_{{ m}_{k}}(0).$$

Moreover, it can be verified that | ψ1(τ) | ≤ Kexp( − κ1, 0τ) for some 0 < κ1, 0 < κ0, 0.

Remark 4.44.

Note that there is an extra term

$$\left ({\int }_{0}^{\infty }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\frac{d\widetilde{Q}(0)} {dt} {\pi }_{{_\ast}}$$

involved in the equation determining \({{\vartheta}}_{1}(0)\) in (4.94). This term does not vanish as in Section 4.3 because generally \(((d/dt)\widetilde{Q}(0)){\pi }_{{_\ast}}\neq 0\).

To obtain the desired asymptotic expansion, continue inductively. For each i = 2, …, n, we first obtain the solution φ i (t) with the “multiplier” given by the solution of the differential equation but with the condition \({{\vartheta}}_{i}(0)\) unspecified; we then solve for ψ i (τ) with the as-yet-unavailable initial condition \({\psi }_{i}(0) = -{\varphi }_{i}(0)\). Next, we jointly prove the exponential decay properties of ψ i (τ) and obtain the solution \({{\vartheta}}_{i}(0)\). The equation to determine \({{\vartheta}}_{i}(0)\) with transient states becomes

$$\begin{array}{l} {\psi }_{i } (0){\pi }_{{_\ast}} \\ \ = -\left(\;\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{\psi }_{ i-j-1}(s)\left(\frac{{s}^{j}} {j!} \frac{{d}^{j}\widehat{Q}(0)} {d{t}^{j}} + \frac{{s}^{j+1}} {(j + 1)!} \frac{{d}^{j+1}\widetilde{Q}(0)} {d{t}^{j+1}} \right)ds\right){\pi }_{{_\ast}}\end{array}$$

In this way, we have constructed the asymptotic expansion with transient states. In addition, we can show that the φ i ( ⋅) are smooth and ψ i ( ⋅) satisfies | ψ i (τ) | ≤ Kexp( − κ i, 0τ) for some 0 < κ i, 0 < κ i − 1, 0 < κ0, 0. As in the case with all recurrent states, we establish the following theorem.

Theorem 4.45.

Suppose (A4.5) and (A4.6) hold. Then an asymptotic expansion

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}\left ({\varepsilon }^{i}{\varphi }_{ i}(t) + {\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )\right )$$

can be constructed such that for i = 0,…,n,

  • φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T];

  •  | ψ i (t) | ≤ Kexp( − κ0 t) for some K > 0 and 0 < κ0 < κ i, 0;

  • \(\vert {p}^{\varepsilon }(t) - {y}_{n}^{\varepsilon }(t)\vert = O({\varepsilon }^{n+1})\mbox{ uniformly in }t \in [0,T].\)

Example 4.46.

Let \(\widetilde{Q}(t) =\widetilde{ Q},\) a constant matrix such that

$$\widetilde{Q} = \left (\begin{array}{*{10}c} -1& 1 & 0 & 0\\ 1 &-1 & 0 & 0 \\ 1 & 0 &-2& 1\\ 0 & 1 & 1 &-2 \\ \end{array} \right )\mbox{ and }\widehat{Q} = 0.$$

In this example,

$$\widetilde{{Q}}^{1} = \left (\begin{array}{*{10}c} -1& 1 \\ 1 &-1\\ \end{array} \right ),\quad \widetilde{{Q}}_{{_\ast}} = \left (\begin{array}{*{10}c} -2& 1\\ 1 &-2\\ \end{array} \right ),\quad \mbox{ and }\ \widetilde{{Q}}_{{_\ast}}^{1} = \left (\begin{array}{*{10}c} 1&0 \\ 0&1\\ \end{array} \right ).$$

The last two rows in \(\widetilde{Q}\) represent the jump rates corresponding to the transient states. The matrix \(\widetilde{{Q}}^{1}\) is weakly irreducible and \(\widetilde{{Q}}_{{_\ast}}\) is stable. Solving the forward equation gives us

$${p}^{\varepsilon }(t) = ({p}_{ 1}^{\varepsilon }(t),{p}_{ 2}^{\varepsilon }(t),{p}_{ 3}^{\varepsilon }(t),{p}_{ 4}^{\varepsilon }(t)),$$

where

$$\begin{array}{rl} &{p}_{1}^{\varepsilon }(t) = \frac{1} {2} + \frac{1} {2}\biggl [(-{p}_{3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{t} {\varepsilon }\right ) \\ & + ({p}_{1}^{0} - {p}_{ 2}^{0} + {p}_{ 3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{2t} {\varepsilon } \right ) \\ & + (-{p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{3t} {\varepsilon } \right )\biggr ], \\ &{p}_{2}^{\varepsilon }(t) = \frac{1} {2} + \frac{1} {2}\biggl [(-{p}_{3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{t} {\varepsilon }\right ) \\ & + (-{p}_{1}^{0} + {p}_{ 2}^{0} - {p}_{ 3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{2t} {\varepsilon } \right ) \\ & + ({p}_{3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{3t} {\varepsilon } \right )\biggr ], \\ &{p}_{3}^{\varepsilon }(t) = \frac{1} {2}\biggl [({p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{t} {\varepsilon }\right ) + ({p}_{3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{3t} {\varepsilon } \right )\biggr ], \\ &{p}_{4}^{\varepsilon }(t) = \frac{1} {2}\biggl [({p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{t} {\varepsilon }\right ) + (-{p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{3t} {\varepsilon } \right )\biggr ]\end{array}$$

It is easy to see that \({\varphi }_{0}(t) = (1/2,1/2,0,0)\) and

$$\vert {p}^{\varepsilon }(t) - {\varphi }_{ 0}(t)\vert \leq K\exp \left (-\frac{t} {\varepsilon }\right ).$$

The limit behavior of the underlying Markov chain as ε → 0 is determined by φ0(t) for t > 0. It is clear that the probability of the Markov chain staying at the transient states is very small for small ε.
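
For reference, the sketch below (assuming NumPy/SciPy; the initial distribution is an arbitrary choice) reproduces this behavior directly from the matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

Qt = np.array([[-1.0, 1.0, 0.0, 0.0],
               [1.0, -1.0, 0.0, 0.0],
               [1.0, 0.0, -2.0, 1.0],
               [0.0, 1.0, 1.0, -2.0]])
p0 = np.array([0.1, 0.2, 0.3, 0.4])
phi0 = np.array([0.5, 0.5, 0.0, 0.0])
t = 0.5
for eps in [0.1, 0.01, 0.001]:
    p_eps = p0 @ expm(Qt * (t / eps))       # Q^eps = Q~/eps since Q^ = 0
    print(eps, np.abs(p_eps - phi0).max())  # ~ K exp(-t/eps)
```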

Remark 4.47.

The model discussed in this section has the extra ingredient of including transient states as compared with that of Section 4.3. The main feature is embedded in the last few rows of the \(\widetilde{Q}(t)\) matrix. One of the crucial points here is that the matrix \(\widetilde{{Q}}_{{_\ast}}(t)\) in the right corner is Hurwitz. This stability condition guarantees the exponential decay properties of the boundary layers. As far as the regular part (or the outer expansion) is concerned, we have that the last subvector φ0 ∗ (t) = 0. The determination of the initial conditions \({{\vartheta}}_{i}(0)\) uses the same technique as before, namely, matching the outer terms and inner layers. The procedure involves recursively solving a sequence of algebraic and differential equations. Although the model is seemingly more general, the methods and techniques involved in obtaining the asymptotic expansion and the proof of the results are essentially the same as in the previous section. The notation is slightly more complex, nevertheless.

6 Remarks on Countable-State-Space Cases

6.1 Countable-State Spaces: Part I

This section presents an extension of the results on singularly perturbed Markov chains with fast and slow components from finite-state spaces to countable-state spaces. In this section, the generator \(\widetilde{Q}(\cdot )\) is a block-diagonal matrix consisting of infinitely many blocks, each of which is of finite dimension. The generator Q ε(t) still has the form (4.39). However,

$$\widetilde{Q}(t) = \left (\begin{array}{*{10}c} \widetilde{{Q}}^{1}(t)& & & & \\ &\widetilde{{Q}}^{2}(t)& & &\\ & &\ddots && \\ & & &\widetilde{{Q}}^{k}(t)&\\ & & &&\ddots\\ \end{array} \right ),$$
(4.95)

where \(\widetilde{{Q}}^{k}(t) \in {\mathbb{R}}^{{m}_{k}\times {m}_{k}}\) is a generator of an appropriate Markov chain with finite-state space, and \(\widehat{Q}(t)\) is an infinite-dimensional matrix and is a generator of a Markov chain having a countable-state space, that is, \(\widehat{Q}(t) = (\widehat{{q}}_{ij}(t))\) such that

$$\widehat{{q}}_{ij}(t) \geq 0\mbox{ for }i\neq j,\mbox{ and }\sum\limits_{j}\widehat{{q}}_{ij}(t) = 0.$$

We aim at deriving asymptotic results under the current setting. To do so, assume that the following condition holds:

    • For t ∈ [0, T], \(\widetilde{{Q}}^{k}(t)\), for k = 1, 2, …, are weakly irreducible.

Parallel to the development of Section 4.3, the solution of φ i ( ⋅) can be constructed similar to that of Theorem  4.29 as in (4.44) and (4.45). In fact, we obtain φ0( ⋅) from (4.49) and (4.50) with l = ∞; the difference is that now we have an infinite number of equations. Similarly, for all k = 1, 2, … and \(i = 0,1,\ldots,n + 1\), φ i ( ⋅) can be obtained from

$$\begin{array}{ll} &{\varphi }_{0}(t)\widetilde{Q}(t) = 0,\mbox{ if }i = 0 \\ &{\varphi }_{i}(t)\widetilde{Q}(t) ={ d{\varphi }_{i-1}(t) \over dt} - {\varphi }_{i-1}(t)\widehat{Q}(t),\mbox{ if }i \geq 1 \\ &{\varphi }_{i}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = {{\vartheta}}_{i}^{k}(t), \\ &{ d{{\vartheta}}_{i}(t) \over dt} = {{\vartheta}}_{i}(t)\overline{Q}(t) +\widetilde{ {b}}_{i-1}(t)\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}}.\end{array}$$
(4.96)

The problem is converted to one that involves infinitely many algebraic differential equations. The same technique as presented before still works.

Nevertheless, the boundary layer corrections deserve more attention. Let us start with ψ0( ⋅), which is the solution of the abstract Cauchy problem

$$\begin{array}{ll} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0), \\ &{\psi }_{0}(0) = {p}^{0} - {\varphi }_{0}(0).\end{array}$$
(4.97)

To continue our study, one needs the notion of a semigroup (see Dunford and Schwartz [52], and Pazy [172]). Recall that for a Banach space \(\mathbb{B}\), a one-parameter family T(t), 0 ≤ t < ∞, of bounded linear operators from \(\mathbb{B}\) into \(\mathbb{B}\) is a semigroup of bounded linear operators on \(\mathbb{B}\) if (i) T(0) = I and (ii) \(T(t + s) = T(t)T(s)\) for every t, s ≥ 0.

Let \({\mathbb{R}}^{\infty }\) be the sequence space with a canonical element \(x = ({x}_{1},{x}_{2},\ldots ) \in {\mathbb{R}}^{\infty }\). Let A = (a ij ) be an operator \(A : {\mathbb{R}}^{\infty }\mapsto {\mathbb{R}}^{\infty }\), equipped with the l 1-norm

$$\vert A{\vert }_{1} =\sup\limits_{j}\sum\limits_{i}\vert {a}_{ij}\vert ;$$

(see Hutson and Pym [90, p. 74]). Using the definition of a semigroup above, the solution of (4.97) is

$${\psi }_{0}(\tau ) = T(\tau ){\psi }_{0}(0),$$

where T(τ) is a one-parameter family of semigroups generated by \(\widetilde{Q}(0)\). Moreover, since \(\widetilde{Q}(0)\) is a bounded linear operator, \(\exp (\widetilde{Q}(0)\tau )\) still makes sense. Thus \(T(\tau ){\psi }_{0}(0) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau )\), where

$$\begin{array}{rl} T(\tau )& =\exp (\widetilde{Q}(0)\tau ) =\sum\limits_{j=0}^{\infty }{ {\left (\widetilde{Q}(0)\tau \right )}^{j} \over j!} \\ & = \mathrm{diag}\left (\exp \left (\widetilde{{Q}}^{1}(0)\tau \right ),\ldots,\exp \left (\widetilde{{Q}}^{k}(0)\tau \right ),\ldots \right )\end{array}$$
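
The block-diagonal structure of T(τ) is readily checked on finite truncations; the following sketch (assuming NumPy/SciPy, with two hypothetical blocks) verifies that the exponential of a block-diagonal generator is the block-diagonal matrix of the blockwise exponentials:

```python
import numpy as np
from scipy.linalg import expm, block_diag

# two hypothetical weakly irreducible blocks standing in for Q~^1(0), Q~^2(0)
blocks = [np.array([[-1.0, 1.0], [2.0, -2.0]]),
          np.array([[-3.0, 3.0], [1.0, -1.0]])]
tau = 1.5
lhs = expm(block_diag(*blocks) * tau)
rhs = block_diag(*[expm(B * tau) for B in blocks])
print(np.abs(lhs - rhs).max())   # agrees to machine precision
```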

Therefore, the solution has the same form as in the previous section. Under (A4.7), exactly the same argument as in the proof of Lemma  4.4 yields that for each k = 1, 2, …,

$$\exp (\widetilde{{Q}}^{k}(0)\tau ) \rightarrow \mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\mbox{ as }\tau \rightarrow \infty $$

and the convergence takes place at an exponential rate, that is,

$$\left \vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right \vert \leq K\exp (-{\kappa }_{ k}\tau ),$$

for some κ k  > 0. In order to obtain a valid asymptotic expansion, a further assumption is needed, namely, that these κ k , for k = 1, 2, …, are uniformly bounded below by a positive constant κ0.

    • There exists a positive number κ0 = min k {κ k } > 0.

Set

$$\widetilde{\mathrm{1}\mathrm{l}} = \mathrm{diag}\left (\mathrm{1}{\mathrm{l}}_{{m}_{1}},\ldots,\mathrm{1}{\mathrm{l}}_{{m}_{k}},\ldots \right )\mbox{ and }\nu (0) = \mathrm{diag}\left ({\nu }^{1}(0),\ldots,{\nu }^{k}(0),\ldots \right ).$$

In view of (A4.8),

$$\begin{array}{ll} \vert \exp (\widetilde{Q}(0)\tau ) -\widetilde{\mathrm{1}\mathrm{l}}\nu (0){\vert }_{1} & \leq \sup\limits_{k}\vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\vert \\ & \leq K\exp (-{\kappa }_{0}\tau ).\end{array}$$
(4.98)

The exponential decay property of ψ0( ⋅) is thus established. Likewise, it can be proved that all ψ i ( ⋅), for \(i = 1,\ldots,n + 1\), satisfy the exponential decay property. From here on, we can proceed as in the previous section to get the error estimate and verify the validity of the asymptotic expansion. In short, the following theorem is obtained.

Theorem 4.48.

Suppose conditions (A4.7) and (A4.8) are satisfied. Then the results in Theorem  4.29 hold for the countable-state-space model with \(\widetilde{Q}(\cdot )\) given by (4.95).

6.2 Countable-State Spaces: Part II

The aim of this section is to develop further results on singularly perturbed Markov chains with fast and slow components whose generators are infinite-dimensional matrices, but in a different form from that described in Section 4.6.1. The complexity as well as the difficulty increase, and a number of technical issues also arise. One idea arises almost immediately: to approximate the underlying system via a Galerkin-type procedure, that is, to approximate the infinite-dimensional system by finite-dimensional truncations. Unfortunately, this does not work in the setting of this section. We will return to this question at the end of this section.

To proceed, as in the previous sections, the first step invariably involves the solution of algebraic differential equations in the construction of the approximating functions. One of the main ideas used is the Fredholm alternative. There are analogues in the general setting of Banach spaces for compact operators. Nevertheless, the infinite-dimensional matrices are in fact more difficult to handle.

Throughout this section, we treat the class of generators with | Q(t) | 1 < ∞ only. We use 1 l to denote the column vector with all components equal to 1. Consider (1 l⋮Q(t)) as an operator for a generator Q(t) of a Markov chain with state space \(\mathcal{M} =\{ 1,2,\ldots \}\). To proceed, we first give the definitions of irreducibility and quasi-stationary distribution. Set Q c (t) : = (1 l⋮Q(t)).

Definition 4.49.

The generator Q(t) is said to be weakly irreducible at t0 ∈ [0,T] if the equation wQc(t0) = 0, \(w \in {\mathbb{R}}^{\infty }\), has only the zero solution. If Q(t) is weakly irreducible for each t ∈ [0,T], then it is said to be weakly irreducible on [0,T].

Definition 4.50.

A quasi-stationary distribution ν(t) (with respect to Q(t)) is a solution to (2.8) with the finite summation replaced by \(\sum\limits_{i=1}^{\infty }{\nu }_{i}(t) = 1\) that satisfies ν(t) ≥ 0.

As was mentioned before, the Fredholm alternative plays an important role in our study. For infinite-dimensional systems, we state another definition to take this into account.

Definition 4.51.

A generator Q(t) satisfies the F-Property if \(w{Q}_{c}(t) = b\) has a unique solution for each \(b \in {\mathbb{R}}^{\infty }\).

Note that for all weakly irreducible generators of finite dimension (i.e., generators for Markov chains with finite-state space), the F-Property above is automatically satisfied.

Since \(\mathrm{1}\mathrm{l} \in {l}^{\infty }\) (where \({l}^{\infty }\) denotes the sequence space equipped with the \({l}^{\infty }\) norm) and, for each t ∈ [0, T], \(Q(t) \in {\mathbb{R}}^{\infty }\times {\mathbb{R}}^{\infty }\), we naturally use the norm

$$\vert (z\vdots A){\vert }_{\infty,1} =\max \left\{ \sup\limits_{j}\vert {z}_{j}\vert,\ \sup\limits_{j}\sum\limits_{i=1}^{\infty }\vert {a}_{ij}\vert \right\}.$$

It is easily seen that

$$\vert {Q}_{c}(t){\vert }_{\infty,1} \leq \max \left\{ 1,\ \sup\limits_{j}\sum\limits_{i=1}^{\infty }\vert {q}_{ij}(t)\vert \right\}.$$

If a generator Q(t) satisfies the F-Property, then it is weakly irreducible. In fact, if Q(t) satisfies the F-Property on [0, T], then \(y{Q}_{c}(t) = 0\) has the unique solution y = 0.

By the definition of the generator, in particular the q-Property, \({Q}_{c}(t)\) is a bounded linear operator for each t ∈ [0, T]. If \({Q}_{c}(t)\) is bijective (i.e., one-to-one and onto), then it has a bounded inverse; this, in turn, implies that \({Q}_{c}(t)\) exhibits the F-Property. Roughly, the F-Property generalizes the conditions used in dealing with finite-dimensional spaces. Recall from Section 4.2 that although fQ(t) = b is not uniquely solvable, adding the equation \(f\mathrm{1}\mathrm{l} = c\) yields a system with a unique solution.
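
The following small numerical sketch (a finite-dimensional stand-in, with a hypothetical 3-state generator chosen only for illustration) makes the point of the preceding paragraph explicit: fQ = b alone is rank-deficient because of the eigenvalue 0, while appending f1l = c restores unique solvability, which is the finite-dimensional prototype of the F-Property for \({Q}_{c}\).

```python
# Sketch: f Q = b is underdetermined (Q has eigenvalue 0), but the
# augmented operator Qc = (1l : Q) has full row rank, so f Qc = (c, b)
# is uniquely solvable.  Q below is a hypothetical generator.
import numpy as np

Q = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.2, 0.8, -1.0]])          # weakly irreducible generator

print(np.linalg.matrix_rank(Q))           # 2: one rank short
Qc = np.hstack([np.ones((3, 1)), Q])      # augmented operator (1l : Q)
print(np.linalg.matrix_rank(Qc))          # 3: full row rank

# Unique row solution f of f Qc = (1, 0, 0, 0); here f is the
# quasi-stationary distribution of Q.
rhs = np.array([1.0, 0.0, 0.0, 0.0])
f, *_ = np.linalg.lstsq(Qc.T, rhs, rcond=None)
print(f, f @ Q, f.sum())                  # f Q ~ 0 and sum(f) = 1
```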

Owing to the inherent difficulty caused by the infinite dimensionality, the irreducibility and smoothness of Q( ⋅) are not sufficient to guarantee the existence of asymptotic expansions; stronger conditions are needed. In what follows, for ease of presentation, we consider the model with \(\widetilde{Q}(\cdot )\) irreducible and both \(\widetilde{Q}(\cdot )\) and \(\widehat{Q}(\cdot )\) infinite-dimensional.

For each t, we denote the spectrum of Q(t) by σ(Q(t)). In view of Pazy [172] and Hutson and Pym [90], we have

$$\sigma (Q(t)) = {\sigma }_{d}(Q(t)) \cup {\sigma }_{c}(Q(t)) \cup {\sigma }_{r}(Q(t)),$$

where \({\sigma }_{d}(Q(t))\), \({\sigma }_{c}(Q(t))\), and \({\sigma }_{r}(Q(t))\) denote the discrete, continuous, and residual spectrum of Q(t), respectively. Standard linear operator theory implies that for a compact operator A, \({\sigma }_{r}(A) = \varnothing \), and the only possible candidate for \({\sigma }_{c}(A)\) is 0. Keeping this in mind, we assume that the following condition holds.

    • (A4.9) The following conditions hold:

      • (a) The smoothness condition (A4.4) is satisfied.

      • (b) The generator \(\widetilde{Q}(t)\) exhibits the F-Property.

      • (c) \( \sup\limits_{t\in [0,T]}\vert \widetilde{Q}(t){\vert }_{1} < \infty \) and \( \sup\limits_{t\in [0,T]}\vert \widehat{Q}(t)\vert < \infty \).

      • (d) The eigenvalue 0 of \(\widetilde{Q}(t)\) has multiplicity 1, and 0 is not an accumulation point of the eigenvalues.

      • (e) \({\sigma }_{r}(\widetilde{Q}(t)) = \varnothing \).

Remark 4.52.

Item (a) above requires that the smoothness condition be satisfied, and Item (b) requires that the operator \((\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t))\) satisfy a Fredholm-alternative-like condition. Finally, Items (d) and (e) indicate that the spectrum of \((\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t))\) is like that of a compact operator. Recall that for a compact linear operator, 0 is in its spectrum, and the only possible accumulation point is 0. Our conditions mimic this situation; they will be used when we prove the exponential decay property of the initial-layer terms.

Theorem 4.53.

Under condition (A4.9), the results in Theorem 4.29 hold for Markov chains with countable-state space.

Proof: The proof is very similar to its finite-dimensional counterpart; we only point out the differences here.

As far as the regular part is concerned, we obtain the same equation (4.44). One thing to note is that we can no longer use Cramer’s rule to solve the systems of equations. Without such an explicit representation of the solution, the smoothness of \({\varphi }_{i}(\cdot )\) needs to be proved by examining (4.44) directly. For example,

$$\begin{array}{rl} &\sum\limits_{i=1}^{\infty }{\varphi }_{ 0,i}(t) = 1, \\ &{\varphi }_{0}(t)\widetilde{Q}(t) = 0,\end{array}$$

can be rewritten as

$${\varphi }_{0}(t)\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t)\right ) = (1,0,\ldots ).$$
(4.99)

Since \(\widetilde{Q}(t)\) satisfies the F-Property, this equation has a unique solution.

To verify the differentiability, consider also

$${\varphi }_{0}(t + \delta )\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )\right ) = (1,0,\ldots ).$$

Examining the difference quotient leads to

$$\begin{array}{rl} 0& ={ {\varphi }_{0}(t + \delta )\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )\right ) - {\varphi }_{0}(t)\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t)\right ) \over \delta } \\ & ={ \left [{\varphi }_{0}(t + \delta ) - {\varphi }_{0}(t)\right ]\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )\right ) \over \delta } \\ & +{ {\varphi }_{0}(t)\left ((\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )) - (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t))\right ) \over \delta } \end{array}$$

Taking the limit as δ → 0 and by virtue of the smoothness of \(\widetilde{Q}(\cdot )\), we have

$$ \lim\limits_{\delta \rightarrow 0}{ \left [{\varphi }_{0}(t + \delta ) - {\varphi }_{0}(t)\right ]\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )\right ) \over \delta } = -{\varphi }_{0}(t)\left(0\vdots{ d\widetilde{Q}(t) \over dt} \right).$$

That is, \((d/dt){\varphi }_{0}(t)\) exists and is given by the solution of

$${ d{\varphi }_{0}(t) \over dt} \left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t)\right ) = -{\varphi }_{0}(t)\left(0\vdots{ d\widetilde{Q}(t) \over dt} \right).$$

Again by the F-Property, this equation has a unique solution. Higher-order derivatives of \({\varphi }_{0}(\cdot )\) and the smoothness of the remaining \({\varphi }_{i}(\cdot )\) can be proved in a similar way.
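
As a numerical check of this differentiation argument (a finite-dimensional sketch with a hypothetical time-dependent 2-state generator; all names here are assumptions), one can solve (4.99) at t and compare the solution of the derivative equation above with a finite-difference quotient:

```python
# Sketch: solve phi_0(t) from phi_0 (1l : Qt) = (1, 0, ..., 0), then get
# d(phi_0)/dt from  d(phi_0)/dt (1l : Qt) = -phi_0 (0 : dQt/dt),
# and compare with a finite-difference quotient.
import numpy as np

def Qt(t):                                 # hypothetical smooth generator
    a = 1.0 + 0.5 * np.sin(t)
    return np.array([[-a, a], [1.0, -1.0]])

def dQt(t):                                # its derivative in t
    da = 0.5 * np.cos(t)
    return np.array([[-da, da], [0.0, 0.0]])

def solve_row(M, rhs):                     # unique row solution of x M = rhs
    x, *_ = np.linalg.lstsq(M.T, rhs, rcond=None)
    return x

def phi0(t):
    return solve_row(np.hstack([np.ones((2, 1)), Qt(t)]),
                     np.array([1.0, 0.0, 0.0]))

t, delta = 0.3, 1e-6
rhs = -phi0(t) @ np.hstack([np.zeros((2, 1)), dQt(t)])
dphi = solve_row(np.hstack([np.ones((2, 1)), Qt(t)]), rhs)
dphi_fd = (phi0(t + delta) - phi0(t)) / delta
print(dphi, dphi_fd)                       # the two estimates agree
```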

As far as the initial-layer terms are concerned, since \(\widetilde{Q}(0)\) is a bounded linear operator, the semigroup interpretation \(\exp (\widetilde{Q}(0)\tau )\) makes sense. It follows from Theorem 1.4 of Pazy [172, p. 104] that the equation

$${ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0),\quad {\psi }_{0}(0) = {p}_{0} - {\varphi }_{0}(0)$$

has a unique solution.

To show that \({\psi }_{0}(\cdot )\) decays exponentially fast, we use an argument analogous to its finite-dimensional counterpart. Roughly, since the multiplicity of the eigenvalue 0 is 1, the subspace generated by the corresponding eigenvector \({v}_{0}\) is one-dimensional. Similar to the situation in Section 4.2, \( \lim\limits_{\tau \rightarrow \infty }\exp (\widetilde{Q}(0)\tau )\) exists, and the limit must have identical rows. Denote the limit by \(\overline{P}\). It then follows that

$$\Big{\vert }\exp (\widetilde{Q}(0)\tau ) -\overline{P}\Big{\vert }\leq K\exp (-{\kappa }_{0}\tau ).$$

The meaning is clear: upon “subtracting” the subspace generated by \({v}_{0}\), the remainder behaves like \(\exp (-{\kappa }_{0}\tau )\). A similar argument works for \(i = 1,\ldots,n + 1\), so the \({\psi }_{i}(\cdot )\) decay exponentially fast. □ 
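
A short numerical sketch of this last step (with a hypothetical 2-state generator and initial data as assumptions) shows why the initial-layer term decays: its initial value \({p}_{0} - {\varphi }_{0}(0)\) has components summing to zero, so the nondecaying direction 1lν is absent.

```python
# Sketch: psi_0(tau) = psi_0(0) exp(Qtilde(0) tau) decays to zero because
# psi_0(0) = p_0 - phi_0(0) sums to zero.  Data here are hypothetical.
import numpy as np
from scipy.linalg import expm

Q0 = np.array([[-1.0, 1.0], [2.0, -2.0]])      # eigenvalues 0 and -3
p0 = np.array([0.9, 0.1])
phi00 = np.array([2.0 / 3.0, 1.0 / 3.0])       # quasi-stationary distribution
psi00 = p0 - phi00                             # components sum to zero
for tau in [1.0, 2.0, 4.0]:
    print(tau, np.abs(psi00 @ expm(Q0 * tau)).max())   # ~ exp(-3 tau) decay
```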

6.3 A Remark on Finite-Dimensional Approximation

Concerning the cases in Section 4.6.2, a typical way of dealing with infinite-dimensional Markov chains is to make a finite-dimensional approximation. Let \(Q(t) = ({q}_{ij}(t))\), t ≥ 0, denote a generator of a Markov chain with countable-state space. We consider an N × N, \(N = 1,2,\ldots\), truncation matrix \({Q}_{N}(t) = {({q}_{ij}(t))}_{i,j=1}^{N}\). Then \({Q}_{N}(t)\) is a subgenerator in the sense that \(\sum\limits_{j=1}^{N}{q}_{ij}(t) \leq 0\) for \(i = 1,2,\ldots,N\).

At first glance, the notion of a subgenerator seems to provide a way to treat the problem of approximating an infinite-dimensional generator by finite-dimensional matrices. In fact, Reuter and Ledermann used such an idea to derive the existence and uniqueness of the solution to the forward equation (see Bharucha-Reid [10]). Dealing with singularly perturbed chains with countable-state space, one would like to know whether a Galerkin-like approximation works, in the sense that an asymptotic expansion of a finite-dimensional system provides an approximation to the probability distribution. To be more precise, let \({\alpha }^{\varepsilon }(\cdot )\) denote the Markov chain generated by Q(t)∕ε and let

$${p}^{\varepsilon }(t) = (P({\alpha }^{\varepsilon }(t) = 1),\ldots,P({\alpha }^{\varepsilon }(t) = k),\ldots ).$$

Consider the following approximation via N-dimensional systems:

$${ d{p}^{\varepsilon,N}(t) \over dt} ={ 1 \over \varepsilon } {p}^{\varepsilon,N}(t){Q}_{ N}(t),\;{p}^{\varepsilon,N}(0) = {p}^{0}.$$
(4.100)

Using the techniques presented in the previous sections, we can find outer and inner expansions to approximate \({p}^{\varepsilon,N}(t)\). The questions are these: For small ε and large N, can we approximate \({p}^{\varepsilon }(t)\) by \({p}^{\varepsilon,N}(t)\)? Can we approximate \({p}^{\varepsilon,N}(t)\) by \({y}_{n}^{\varepsilon,N}(t)\), where \({y}_{n}^{\varepsilon,N}(t)\) is an expansion of the form (4.43) when subgenerators are used? More importantly, can we use \({y}_{n}^{\varepsilon,N}(t)\) to approximate \({p}^{\varepsilon }(t)\)?

Although \({p}_{i}^{\varepsilon }(t)\) can be approximated by its truncation \({p}_{i}^{\varepsilon,N}(t)\) for large N, and \({p}^{\varepsilon,N}(t)\) can be expanded as \({y}_{n}^{\varepsilon,N}(t)\) for small ε, the approximation of \({p}^{\varepsilon }(t)\) by \({y}_{n}^{\varepsilon,N}(t)\) does not work in general, because the limits as ε → 0 and N → ∞ are not interchangeable. This can be seen from the following example.

Let

$$Q(t) = Q = \left (\begin{array}{cccc} - 1& \frac{1} {2} & \frac{1} {{2}^{2}} & \cdots \\ \frac{1} {2} & - 1& \frac{1} {{2}^{2}} & \cdots \\ \frac{1} {{2}^{2}} & \frac{1} {2} & - 1&\cdots \\ \vdots & \vdots & \vdots & \vdots\\ \end{array} \right ).$$

Then for any N, all eigenvalues of the truncation matrix \({Q}_{N}\) have negative real parts. It follows that the solution \({p}^{\varepsilon,N}(t)\) decays exponentially fast, that is,

$$\Big{\vert }{p}^{\varepsilon,N}(t)\Big{\vert }\leq C\exp \left (-\frac{{\kappa }_{0}t} {\varepsilon } \right ).$$

Thus, all terms in the regular part of \({y}_{n}^{\varepsilon,N}\) vanish. It is clear from this example that \({y}_{n}^{\varepsilon,N}(t)\) cannot be used to approximate \({p}^{\varepsilon }(t)\).
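
The failure can be reproduced numerically. The sketch below (illustrative only; the ordering of the off-diagonal entries 1/2, 1/2², … within each row is an assumption and is immaterial for the conclusion) builds the truncation \({Q}_{N}\), confirms that it is a strict subgenerator whose eigenvalues all have negative real parts, and shows the total probability mass of \({p}^{\varepsilon,N}(t)\) draining away as t∕ε grows.

```python
# Sketch of the counterexample: the N x N truncation of Q is a strict
# subgenerator, so the truncated solution p^{eps,N}(t) loses mass
# exponentially fast, and every term of its regular part vanishes.
import numpy as np
from scipy.linalg import expm

def truncation(N):
    Q = np.zeros((N, N))
    for i in range(N):
        off = [j for j in range(N) if j != i]
        for k, j in enumerate(off, start=1):
            Q[i, j] = 2.0 ** (-k)        # off-diagonal entries 1/2, 1/4, ...
        Q[i, i] = -1.0                   # row sums vanish only as N -> infinity
    return Q

eps, N = 0.02, 6
QN = truncation(N)
print("row sums:", QN.sum(axis=1))                       # all < 0
print("max Re(eig):", np.linalg.eigvals(QN).real.max())  # strictly negative
p0 = np.full(N, 1.0 / N)
for t in [0.5, 1.0, 2.0]:
    mass = (p0 @ expm(t * QN / eps)).sum()
    print(f"t = {t}: total mass = {mass:.3f}")           # decays toward zero
```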

7 Remarks on Singularly Perturbed Diffusions

In this section, we present some related results on singular perturbations of diffusions. If, in lieu of a discrete state space, one considers a continuous state space, then the singularly perturbed Markov chains naturally become singularly perturbed Markov processes. We illustrate the idea of matched asymptotic expansions for singularly perturbed diffusions. In this section we only summarize the results and refer the reader to Khasminskii and Yin [116] for detailed proofs. To proceed, consider the following example.

Example 4.54.

This example discusses a model arising from stochastic control, namely, a controlled singularly perturbed system. As pointed out in Kushner [140] and Kokotovic, Bensoussan, and Blankenship [127], many control problems can be modeled by systems of differential equations, where the state variables can be divided into two coupled groups, consisting of “fast” and “slow” variables. A typical system takes the form

$$\begin{array}{rl} &d{x}_{1}^{\varepsilon } = {f}_{ 1}({x}_{1}^{\varepsilon },{x}_{ 2}^{\varepsilon },u)dt + {\sigma }_{ 1}({x}_{1}^{\varepsilon },{x}_{ 2}^{\varepsilon })d{w}_{ 1},\ {x}_{1}^{\varepsilon }(0) = {x}_{ 1}, \\ &d{x}_{2}^{\varepsilon } ={ 1 \over \varepsilon } {f}_{2}({x}_{1}^{\varepsilon },{x}_{ 2}^{\varepsilon },u)dt +{ 1 \over \sqrt{\varepsilon }} {\sigma }_{2}({x}_{1}^{\varepsilon },{x}_{ 2}^{\varepsilon })d{w}_{ 2},\ {x}_{2}^{\varepsilon }(0) = {x}_{ 2}, \end{array}$$

where \({w}_{1}(\cdot )\) and \({w}_{2}(\cdot )\) are independent Brownian motions, \({f}_{i}(\cdot )\) and \({\sigma }_{i}(\cdot )\) for i = 1, 2 are suitable functions, u is the control variable, and ε > 0 is a small parameter. The underlying control problem is to minimize the cost function

$${J}^{\varepsilon }({x}_{ 1},{x}_{2},u) = E{\int }_{0}^{T}R({x}_{ 1}^{\varepsilon }(t),{x}_{ 2}^{\varepsilon }(t),u)dt,$$

where R(⋅) is the running cost function. The small parameter ε > 0 signifies the relative rates of variation of \({x}_{1}^{\varepsilon }\) and \({x}_{2}^{\varepsilon }\). Such singularly perturbed systems have drawn much attention (see Bensoussan [8], Kushner [140], and the references therein). The system is very difficult to analyze directly; the approach of Kushner [140] is to use weak convergence methods to approximate the total system by the reduced system obtained from the differential equation for the slow variable, with the fast variable fixed at its steady-state value as a function of the slow variable. To gain further insight, it is crucial to understand the asymptotic behavior of the rapidly changing process \({x}_{2}^{\varepsilon }\) through the transition density given by the solution of the corresponding Kolmogorov–Fokker–Planck equations.

As demonstrated in the example above, a challenge common to many applications is to study the asymptotic behavior of the following problem. Let ε > 0 be a small parameter, and let \({X}_{1}^{\varepsilon }(\cdot )\) and \({X}_{2}^{\varepsilon }(\cdot )\) be real-valued diffusion processes satisfying

$$\left \{\begin{array}{l} d{X}_{1}^{\varepsilon } = {a}_{ 1}(t,{X}_{1}^{\varepsilon },{X}_{ 2}^{\varepsilon })dt + {\sigma }_{ 1}(t,{X}_{1}^{\varepsilon },{X}_{ 2}^{\varepsilon })d{w}_{ 1}, \\ d{X}_{2}^{\varepsilon } ={ 1 \over \varepsilon } {a}_{2}(t,{X}_{1}^{\varepsilon },{X}_{ 2}^{\varepsilon })dt +{ 1 \over \sqrt{\varepsilon }} {\sigma }_{2}(t,{X}_{1}^{\varepsilon },{X}_{ 2}^{\varepsilon })d{w}_{ 2}, \end{array} \right.$$

where the real-valued functions \({a}_{1}(t,{x}_{1},{x}_{2})\), \({a}_{2}(t,{x}_{1},{x}_{2})\), \({\sigma }_{1}(t,{x}_{1},{x}_{2})\), and \({\sigma }_{2}(t,{x}_{1},{x}_{2})\) represent the drift and diffusion coefficients, respectively, and \({w}_{1}(\cdot )\) and \({w}_{2}(\cdot )\) are independent standard Brownian motions. Define the vector \(X = ({X}_{1},{X}_{2})\). Then \({X}^{\varepsilon }(\cdot ) = ({X}_{1}^{\varepsilon }(\cdot ),{X}_{2}^{\varepsilon }(\cdot ))\) is a diffusion process. This is a model treated in Khasminskii [113], in which a probabilistic approach was employed. It was shown that as ε → 0, the fast component is averaged out and the slow component \({X}_{1}^{\varepsilon }(\cdot )\) has a limit \({X}_{1}^{0}(\cdot )\) such that

$$d{X}_{1}^{0}(t) ={ \overline{a}}_{1}(t,{X}_{1}^{0}(t))dt +{ \overline{\sigma }}_{1}(t,{X}_{1}^{0}(t))d{w}_{1},$$

where

$$\begin{array}{rl} &{\overline{a}}_{1}(t,{x}_{1}) = \int {a}_{1}(t,{x}_{1},{x}_{2})\mu (t,{x}_{1},{x}_{2})d{x}_{2}, \\ &{\overline{\sigma }}_{1}(t,{x}_{1}) = \int {\sigma }_{1}(t,{x}_{1},{x}_{2})\mu (t,{x}_{1},{x}_{2})d{x}_{2}, \end{array}$$

and μ(⋅) is a limit density of the fast process \({X}_{2}^{\varepsilon }(\cdot )\).
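
To visualize the averaging mechanism, the following Euler–Maruyama sketch (with hypothetical coefficients \({a}_{1}\), \({a}_{2}\), \({\sigma }_{1}\), \({\sigma }_{2}\) chosen purely for illustration) simulates the slow–fast pair; the fast component equilibrates on the O(ε) scale while the slow component evolves on the O(1) scale, which is precisely why only averaged coefficients survive in the limit equation.

```python
# Illustrative Euler-Maruyama simulation of a slow-fast diffusion pair.
# Coefficients are hypothetical; dt must be small relative to eps.
import numpy as np

rng = np.random.default_rng(0)
eps, dt, n_steps = 1e-3, 1e-5, 200_000     # simulate up to time 2.0
x1, x2 = 0.0, 1.0
for _ in range(n_steps):
    dw1, dw2 = rng.normal(scale=np.sqrt(dt), size=2)
    # slow component: O(1) drift and diffusion, depending on the fast one
    x1 += -x1 * (1.0 + 0.5 * np.cos(x2)) * dt + 0.3 * dw1
    # fast component: O(1/eps) drift and O(1/sqrt(eps)) diffusion
    x2 += -(x2 / eps) * dt + (1.0 / np.sqrt(eps)) * dw2
print(x1, x2)
# Rerunning with smaller eps changes the law of x1 very little: the fast
# variable is averaged out, as in the limit equation for X_1^0.
```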

To proceed further, it is necessary to investigate the limit properties of the rapidly changing process \({X}_{2}^{\varepsilon }(\cdot )\). To do so, consider the transition density of the underlying diffusion process. It is known that this density satisfies the forward equation

$$\begin{array}{ll} &{ \partial {p}^{\varepsilon } \over \partial t} ={ 1 \over \varepsilon } {\mathcal{L}}_{2}^{{_\ast}}{p}^{\varepsilon } + {\mathcal{L}}_{1}^{{_\ast}}{p}^{\varepsilon }, \\ &{p}^{\varepsilon }(0,{x}_{1},{x}_{2}) = {p}_{0}({x}_{1},{x}_{2})\mbox{ with }{p}_{0}({x}_{1},{x}_{2}) \geq 0\mbox{ and} \\ &\int \int {p}_{0}({x}_{1},{x}_{2})d{x}_{1}d{x}_{2} = 1, \end{array}$$
(4.101)

where

$$\begin{array}{ll} &{\mathcal{L}}_{1}^{{_\ast}}(t,{x}_{1},{x}_{2})\ \cdot ={ 1 \over 2} { {\partial }^{2} \over \partial {x}_{1}^{2}} ({\sigma }_{1}^{2}(t,{x}_{1},{x}_{2})\ \cdot ) -{ \partial \over \partial {x}_{1}} ({a}_{1}(t,{x}_{1},{x}_{2})\ \cdot ), \\ &{\mathcal{L}}_{2}^{{_\ast}}(t,{x}_{1},{x}_{2})\ \cdot ={ 1 \over 2} { {\partial }^{2} \over \partial {x}_{2}^{2}} ({\sigma }_{2}^{2}(t,{x}_{1},{x}_{2})\ \cdot ) -{ \partial \over \partial {x}_{2}} ({a}_{2}(t,{x}_{1},{x}_{2})\ \cdot ).\end{array}$$

Similar to the discrete-state-space cases, the basic problems to be addressed are these: As ε → 0, does the system display certain asymptotic properties? Is there an equilibrium distribution? If \({p}^{\varepsilon }(t,{x}_{1},{x}_{2}) \rightarrow p(t,{x}_{1},{x}_{2})\) for some function p(⋅), can one get a handle on the error bound (i.e., a bound on \(\vert {p}^{\varepsilon }(t,{x}_{1},{x}_{2}) - p(t,{x}_{1},{x}_{2})\vert \))?

To obtain the desired asymptotic expansion in this case, one needs to make sure that the quasi-stationary density exists. Note that for diffusions in unbounded domains, the quasi-stationary density may not exist. Loosely speaking, for the existence of the quasi-stationary distribution, it is necessary that the Markov processes corresponding to \({\mathcal{L}}_{2}^{{_\ast}}\) be positive recurrent for each fixed t. Certain sufficient conditions for the existence of the quasi-stationary density are provided in Il’in and Khasminskii [93]. An alternative way of handling the problem is to concentrate on a compact manifold, in which case the existence of the quasi-stationary density can be established. To illustrate, we choose the second alternative and suppose the following conditions are satisfied.

Suppose that for each t ∈ [0, T]:

    • for each \({x}_{2} \in \mathbb{R}\), \({a}_{1}(t,\cdot,{x}_{2})\), \({\sigma }_{1}^{2}(t,\cdot,{x}_{2})\), and \({p}_{0}(\cdot,{x}_{2})\) are periodic with period 1;

    • for each \({x}_{1} \in \mathbb{R}\), \({a}_{2}(t,{x}_{1},\cdot )\), \({\sigma }_{2}^{2}(t,{x}_{1},\cdot )\), and \({p}_{0}({x}_{1},\cdot )\) are periodic with period 1.

There is an \(n \in {\mathbb{Z}}_{+}\) such that for each i = 1, 2,

$${a}_{i}(\cdot ),\ {\sigma }_{i}^{2}(\cdot ) \in {C}^{n+1,2(n+1),2(n+1)},\mbox{ for all }t \in [0,T],\ {x}_{ 1},{x}_{2} \in [0,1],$$
(4.102)

and the (n + 1)st partial derivatives with respect to t of \({a}_{i}(\cdot,{x}_{1},{x}_{2})\) and \({\sigma }_{i}^{2}(\cdot,{x}_{1},{x}_{2})\) are Lipschitz continuous uniformly in \({x}_{1},{x}_{2} \in [0,1]\). In addition, for each t ∈ [0, T] and each \({x}_{1},{x}_{2} \in [0,1]\), \({\sigma }_{i}^{2}(t,{x}_{1},{x}_{2}) > 0\).

Definition 4.55.

A function μ(⋅) is said to be a quasi-stationary density for the periodic diffusion corresponding to the Kolmogorov–Fokker–Planck operator \({\mathcal{L}}_{2}^{{_\ast}}\) if it is periodic in \({x}_{1}\) and \({x}_{2}\) with period 1,

$$0 \leq \mu (t,{x}_{1},{x}_{2})\mbox{ for each }(t,{x}_{1},{x}_{2}) \in [0,T] \times [0,1] \times [0,1],$$

and for each fixed t and \({x}_{1}\),

$${\int }_{0}^{1}\mu (t,{x}_{ 1},{x}_{2})d{x}_{2} = 1\ \mbox{ and }\ {\mathcal{L}}_{2}^{{_\ast}}\mu (t,{x}_{ 1},{x}_{2}) = 0.$$
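
For a fixed t and \({x}_{1}\), the quasi-stationary density can be computed numerically. The sketch below (with hypothetical periodic coefficients \({a}_{2}\) and \({\sigma }_{2}^{2}\) chosen only for illustration) discretizes \({\mathcal{L}}_{2}^{{_\ast}}\) on a periodic grid over [0, 1) and solves \({\mathcal{L}}_{2}^{{_\ast}}\mu = 0\) together with the normalization \({\int }_{0}^{1}\mu \,d{x}_{2} = 1\) in the least-squares sense.

```python
# Sketch: finite-difference computation of the quasi-stationary density
# mu(t, x1, .) for fixed t, x1.  Coefficients below are hypothetical,
# periodic with period 1, and sigma_2^2 > 0.
import numpy as np

n = 200
h = 1.0 / n
x2 = np.arange(n) * h
a2 = np.sin(2 * np.pi * x2)                    # drift a_2(t, x1, .)
s2 = 1.0 + 0.5 * np.cos(2 * np.pi * x2) ** 2   # sigma_2^2(t, x1, .)

# Discretize L2* mu = 0.5 (s2 mu)'' - (a2 mu)' with periodic wrap-around.
L = np.zeros((n, n))
for j in range(n):
    jm, jp = (j - 1) % n, (j + 1) % n
    L[j, jm] = 0.5 * s2[jm] / h**2 + a2[jm] / (2 * h)
    L[j, j] = -s2[j] / h**2
    L[j, jp] = 0.5 * s2[jp] / h**2 - a2[jp] / (2 * h)

# Append the normalization h * sum(mu) = 1 and solve by least squares.
A = np.vstack([L, h * np.ones((1, n))])
b = np.concatenate([np.zeros(n), [1.0]])
mu, *_ = np.linalg.lstsq(A, b, rcond=None)
print(mu.min() >= 0, h * mu.sum())             # nonnegative, integrates to 1
```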

To proceed, let \(\mathcal{H}\) be the space of functions that are bounded and continuous and are Hölder continuous in \(({x}_{1},{x}_{2}) \in [0,1] \times [0,1]\) (with Hölder exponent Δ for some 0 < Δ < 1), uniformly with respect to t. For each \({h}_{1},{h}_{2} \in \mathcal{H}\), define \(\langle {h}_{1},{h}_{2}{\rangle }_{\mathcal{H}}\) as

$$\langle {h}_{1},{h}_{2}{\rangle }_{\mathcal{H}} ={ \int }_{0}^{T}{ \int }_{0}^{1}{ \int }_{0}^{1}{h}_{1}(t,{x}_{1},{x}_{2}){h}_{2}(t,{x}_{1},{x}_{2})d{x}_{1}d{x}_{2}dt.$$

Under the assumptions mentioned above, two sequences of functions \({\varphi }_{i}(\cdot )\) (periodic in \({x}_{1}\) and \({x}_{2}\)) and \({\psi }_{i}(\cdot )\), \(i = 0,\ldots,n\), can be found such that

  • \({\varphi }_{i}(\cdot,\cdot,\cdot ) \in {C}^{n+1-i,2(n+1-i),2(n+1-i)}\);

  • \({\psi }_{i}(t/\varepsilon,{x}_{1},{x}_{2})\) decay exponentially fast, in that for some \({c}_{1} > 0\) and \({c}_{2} > 0\),

    $$ \sup\limits_{{x}_{1},{x}_{2}\in [0,1]}\left\vert {\psi }_{i}\left ( \frac{t} {\varepsilon },{x}_{1},{x}_{2}\right )\right\vert \leq {c}_{1}\exp \left (-\frac{{c}_{2}t} {\varepsilon } \right );$$
  • define \(\widetilde{{s}}_{n}^{\varepsilon }\) by

    $$\widetilde{{s}}_{n}^{\varepsilon }(t,{x}_{ 1},{x}_{2}) =\sum\limits_{i=0}^{n}\left ({\varepsilon }^{i}{\varphi }_{ i}(t,{x}_{1},{x}_{2}) + {\varepsilon }^{i}{\psi }_{ i}\left ( \frac{t} {\varepsilon },{x}_{1},{x}_{2}\right )\right );$$

    for each \(h \in \mathcal{H}\), the following error bound holds:

    $$\begin{array}{ll} \left \vert \langle {p}^{\varepsilon } -\widetilde{ {s}}_{n}^{\varepsilon },h{\rangle}_{\mathcal{H}}\right \vert = O({\varepsilon }^{n+1}).\end{array}$$
    (4.103)

It is interesting to note that the leading term \({\varphi }_{0}(\cdot )\) of the approximation is approximately the probability density of \({X}_{1}\), namely \({v}_{0}(t,{x}_{1})\), multiplied by the conditional density of \({X}_{2}\) given \({X}_{1} = {x}_{1}\) (i.e., holding \({x}_{1}\) as a parameter), which is the quasi-stationary density \(\mu (t,{x}_{1},{x}_{2})\). The remaining terms in the regular part of the expansion take the form

$$\mu (t,{x}_{1},{x}_{2}){v}_{i}(t,{x}_{1}) + {U}_{i}(t,{x}_{1},{x}_{2}),$$

where \({U}_{i}(\cdot )\) is a particular solution of an inhomogeneous equation. Note the resemblance of this form to that of the Markov-chain cases studied in this chapter. A detailed proof of the assertion is in Khasminskii and Yin [116]. In fact, more complex systems (allowing interaction of \({X}_{1}^{\varepsilon }\) and \({X}_{2}^{\varepsilon }\), mixed partial derivatives in \({x}_{1}\) and \({x}_{2}\), as well as extensions to multidimensional systems) are treated in [116]. In addition, in lieu of \(\langle \cdot,{\cdot \rangle }_{\mathcal{H}}\), convergence under the uniform topology can be considered via the stochastic representation of solutions of partial differential equations or energy integration methods (see, for example, the related treatment of singularly perturbed switching diffusion systems in Il’in, Khasminskii, and Yin [94]).

8 Notes

Two-time-scale Markov chains are dealt with in this chapter using purely analytic methods, which are closely connected with singular perturbation methods. The literature on singular perturbations for ordinary differential equations is rather rich. For an extensive list of references on singular perturbation methods for ordinary differential equations and various techniques such as initial-layer corrections, we refer to Vasil’eva and Butuzov [209], Wasow [215, 216], O’Malley [163], and the references therein. The development of singular perturbation methods has been intertwined with advances in technology and progress in various applications. It can be traced back to the beginning of the twentieth century, when Prandtl dealt with fluid motion with small friction (see Prandtl [178]). Nowadays, the averaging principle developed by Krylov, Bogoliubov, and Mitropolskii (see Bogoliubov and Mitropolskii [18]) has become a popular technique, taught in standard graduate applied mathematics courses and employed widely. General results on singular perturbations can be found in Bensoussan, Lions, and Papanicolaou [7], Bogoliubov and Mitropolskii [18], Eckhaus [54], Erdélyi [58], Il’in [92], Kevorkian and Cole [108, 109], Krylov and Bogoliubov [133], O’Malley [163], Smith [199], Vasil’eva and Butuzov [209, 210], and Wasow [215, 216]; applications to control theory and related fields are in Bensoussan [8], Bielecki and Filar [11], Delebecque and Quadrat [44], Delebecque, Quadrat, and Kokotovic [45], Kokotovic [126], Kokotovic, Bensoussan, and Blankenship [127], Kokotovic and Khalil [128], Kokotovic, Khalil, and O’Reilly [129], Kushner [140], Pan and Başar [164–166], Pervozvanskii and Gaitsgori [174], Phillips and Kokotovic [175], and Yin and Zhang [233], among others; the vast literature on applications to different branches of physics includes Risken [182] and van Kampen [208]; the survey by Hänggi, Talkner, and Borkovec [80] contains hundreds of references concerning applications in physics; related problems via large deviations theory are in Lerman and Schuss [151]; some recent work on singular perturbations in queueing networks, heavy traffic, and related topics is in Harrison and Reiman [81], Knessel and Morrison [125], and the references therein; applications to manufacturing systems are in Sethi and Zhang [192], Soner [202], Zhang [248], and the references cited there; related problems for stochastic differential equations and diffusion approximations can be found in Day [42], Friedlin and Wentzell [67], Il’in and Khasminskii [93], Khasminskii [111, 112], Kushner [139], Ludwig [152], Matkowsky and Schuss [158], Naeh, Klosek, Matkowsky, and Schuss [160], Papanicolaou [169, 170], Schuss [187, 188], Skorohod [198], Yin [222], Yin and Ramachandran [227], and Zhang [247], among others. Singularly perturbed Markov processes also appear in the context of random evolution, a generalization of the motion of a particle on a fixed line with a random velocity or a random diffusivity; see, for example, Griego and Hersh [76, 77] and Pinsky [177]; an extensive survey can be found in Hersh [85]. A first-order approximation of the distribution of the Cox process with rapid switching is in Di Masi and Kabanov [48]. Recently, modeling communication systems via two-time-scale Markov chains has gained renewed interest; see Tse, Gallager, and Tsitsiklis [206], and the references therein.

It should be pointed out that there is a distinct feature in the problem we are studying compared with traditional studies of singularly perturbed systems. In contrast to many singularly perturbed ordinary differential equations, the matrix Q(t) in (4.3) is singular: it has an eigenvalue 0, so the usual stability condition does not hold. To circumvent this difficulty, we utilize the q-Property of the matrix Q(t), which leads to a probabilistic interpretation. The main emphasis in this chapter is on developing approximations to the solutions of the forward equations. The underlying systems arise from a wide range of applications in which a finite-state Markov chain is involved and a fast time scale t∕ε is used. Asymptotic series for the probability distribution of the Markov chain have been developed by employing the techniques of matched expansions. An attempt to obtain the asymptotic expansion of (4.3) was initiated in Khasminskii, Yin, and Zhang [119] for time-inhomogeneous Markov chains. The result presented here is a refinement of the aforementioned reference.

Extending the results for irreducible generators, this chapter further discusses two-time-scale Markov chains with weak and strong interactions. The formulations substantially generalize the work of Khasminskii, Yin, and Zhang [120]. Section 4.3, which discusses Markovian models with recurrent states belonging to several ergodic classes, is a refinement of [120].

Previous work on singularly perturbed Markov chains with weak and strong interactions can be found in Delebecque, Quadrat, and Kokotovic [45], Gaitsgori and Pervozvanskii [69], Pervozvanskii and Gaitsgori [174], and Phillips and Kokotovic [175]. The essence is a decomposition and aggregation point of view. Their models are similar to that considered in this chapter. For example, translating the setup into our setting, the authors of [175] assumed that the Markov chain generated by \(\widetilde{Q}/\varepsilon +\widehat{ Q}\) has a single ergodic class for ε sufficiently small and, moreover, that for each \(j = 1,2,\ldots,l\), the subchain has a single ergodic class. Their formulation requires that \(\widetilde{Q}(t) =\widetilde{ Q}\) and \(\widehat{Q}(t) =\widehat{ Q}\), and it requires essentially the irreducibility of \(\widetilde{Q}/\varepsilon +\widehat{ Q}\) for all ε ≤ ε0 for some ε0 > 0 small enough, in addition to the irreducibility of \(\widetilde{{Q}}^{j}\) for \(j = 1,2,\ldots,l\). The problem considered in this chapter is nonstationary; the generators are time-varying. The irreducibility is in the weak sense, and only weak irreducibility of each subgenerator (or block matrix) \(\widetilde{{Q}}^{j}(t)\) for \(j = 1,2,\ldots,l\) is needed. Thus our results generalize the existing theorems to nonstationary cases under weaker assumptions. The condition on \(\widetilde{Q}(t)\) exploits the intrinsic properties of the underlying chains. Furthermore, our results also include Markov chains with countable-state spaces. The formulation and development of Section 4.5 are inspired by that of [175] (see also Pan and Başar [164]). This, together with the consideration of chains with recurrent states and the inclusion of absorbing states, covers most practical concerns for the rapidly varying part of the generator. Although the forms of the generators with absorbing states and with transient states have more complex structures, the asymptotic expansion of the probability distributions can still be obtained via an approach similar to that for the case of block-diagonal \(\widetilde{Q}(\cdot )\). Applications to manufacturing systems are discussed, for example, in Jiang and Sethi [99] and Sethi and Zhang [192], among others. As a complement to the development in this chapter, the work of Il’in, Khasminskii, and Yin [94] deals with cases in which the underlying Markov processes involve both diffusion and pure jump components; see also Yin and Yang [229]. Previous work on singular perturbations of stochastic systems can be found in Day [42], Friedlin and Wentzell [67], Khasminskii [111–113], Kushner [139], Ludwig [152], Matkowsky and Schuss [158], Naeh, Klosek, Matkowsky, and Schuss [160], Papanicolaou [169, 170], Schuss [187], Yin and Ramachandran [227], and the references therein. Singular perturbations in connection with optimal control problems are treated in Bensoussan [8], Bielecki and Filar [11], Delebecque and Quadrat [44], Kokotovic [126], Kokotovic, Bensoussan, and Blankenship [127], Kushner [140], Lehoczky, Sethi, Soner, and Taksar [150], Martins and Kushner [156], Pan and Başar [164], Pervozvanskii and Gaitsgori [174], Sethi and Zhang [192], Soner [202], and Yin and Zhang [233], among others. For discrete-time two-time-scale Markov chains, we refer the reader to Yin and Zhang [238] and Yin, Zhang, and Badowski [242], among others.

We note that one of the key points enabling us to solve these problems is the Fredholm alternative. It is even more crucial here than in the situation of Section 4.2 for irreducible generators: in Section 4.2 the consistency conditions are readily verified, whereas in the formulation with weak and strong interactions, the verification needs more work, and we have to utilize the consistency conditions to obtain the desired solution.

The discussions on Markov chains with countable-state spaces in this chapter focused on simple situations. For more general cases, see Yin and Zhang [230, 231], in which applications to quasi-birth-death queues were considered; see also Altman, Avrachenkov, and Nunez-Queija [4] for a different approach. The discussion of singularly perturbed diffusion processes dealt mainly with forward equations. For related work on singularly perturbed diffusions, see the papers of Khasminskii and Yin [115, 116] and the references therein; one of the motivations for studying singularly perturbed diffusions comes from wear process modeling (see Rishel [181]). For treatments of averaging principles and related backward equations, we refer the reader to Khasminskii and Yin [117, 118]. For a number of applications to queueing systems, financial engineering, and insurance risk, we refer the reader to Yin, Zhang, and Zhang [232] and references therein.