
1 Introduction

This chapter is concerned with the analysis of the probability distributions of two-time-scale Markov chains. We aim to approximate the solution of the forward equation by sequences of functions so that any desired accuracy is reached. As alluded to in Chapter 1, we devote our attention to nonstationary Markov chains with time-varying generators. A key feature here is time-scale separation. By introducing a small parameter ε > 0, the generator and hence the corresponding Markov chain have “two times,” a usual running time t and a fast time t ∕ ε. The main approach we use is the method of matched asymptotic expansions from singular perturbation theory. We first construct a sequence of functions that approximate the solution of the forward equation well when t is outside the initial layer of order O(ε). Adopting the terminology of singular perturbation theory, this part of the approximation is called the outer expansion. We demonstrate that it is a good approximation as long as t is not in a neighborhood of 0 of order O(ε). Nevertheless, this sequence of functions does not satisfy the given initial condition, and the approximation breaks down when t is of order O(ε) or smaller. To circumvent these difficulties, we construct another sequence of functions by magnifying the behavior of the solution near 0 using the stretched fast time \(\tau = t/\varepsilon \). Following the traditional terminology in singular perturbation theory, we call this sequence of functions initial-layer corrections (or sometimes, boundary-layer corrections). It yields corrections to the outer expansion and ensures that the approximation is good in an O(ε)-neighborhood of 0. Combining the outer expansion and the initial-layer corrections, we obtain a sequence of matched asymptotic expansions. The entire process is constructive. Our aims in this chapter include:

  • Construct the outer expansions and the initial-layer corrections. This construction is often referred to as formal expansions.

  • Justify the sequence of approximations obtained by deriving the desired error bounds. To achieve this, we show that (i) the outer solutions are sufficiently smooth, (ii) the initial-layer terms all decay exponentially fast, and (iii) the error is of the desired order. Thus not only is convergence of the asymptotic expansions proved, but also the error bound is obtained.

  • Demonstrate that the error bounds hold uniformly. We would like to mention that in the usual singular perturbation theory, for example, in treating a linear system of differential equations, it is required that the system matrix be stable (i.e., all eigenvalues have negative real parts). In our setup, even for a homogeneous Markov chain, the generator (the system matrix in the equation) has an eigenvalue 0 and hence is not invertible. Thus the stability requirement is violated. Nevertheless, using Markov properties, we are still able to obtain the desired asymptotic expansions.

Before proceeding further, we present a lemma. Let \(Q(t) \in {\mathbb{R}}^{m\times m}\) be a generator, and let α(t) be a finite-state Markov chain with state space \(\mathcal{M} =\{ 1,\ldots,m\}\) and generator Q(t). Denote by

$$p(t) = (P(\alpha (t) = 1),\ldots,P(\alpha (t) = m)) \in {\mathbb{R}}^{1\times m}$$

the row vector of the probability distribution of the underlying chain at time t. Then in view of Theorem  2.5, p( ⋅) is a solution of the forward equation

$$\begin{array}{ll} &\frac{dp(t)} {dt} = p(t)Q(t), \\ &p(0) = {p}^{0}\mbox{ such that }{p}_{ i}^{0} \geq 0\mbox{ for each }i,\mbox{ and }\sum\limits_{i=1}^{m}{p}_{ i}^{0} = 1, \end{array}$$
(4.1)

where \({p}^{0} = ({p}_{1}^{0},\ldots,{p}_{m}^{0})\) and \({p}_{i}^{0}\) denotes the ith component of \({p}^{0}\). Therefore, studying the probability distribution is equivalent to examining the solution of (4.1). Note that the forward equation is linear, so the solution is unique. As a result, the following lemma is immediate. This lemma will prove useful in the subsequent study.

Lemma 4.1.

The solution p(t) of (4.1) satisfies the conditions

$$0 \leq {p}_{i}(t) \leq 1\ \mbox{ and }\ \sum\limits_{i=1}^{m}{p}_{ i}(t) = 1.$$
(4.2)
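As a quick numerical companion to Lemma 4.1, the following sketch (assuming NumPy and SciPy are available; the 3-state time-varying generator is an illustrative choice of ours, not one from the text) integrates the forward equation (4.1) and checks that the solution remains a probability vector:

```python
# A minimal sketch of Lemma 4.1: integrate dp/dt = p(t)Q(t) for an
# illustrative time-varying 3-state generator and verify (4.2).
import numpy as np
from scipy.integrate import solve_ivp

def Q(t):
    # Rows sum to zero, off-diagonal entries are nonnegative.
    a, b = 1.0 + 0.5 * np.sin(t), 2.0 + np.cos(t)
    return np.array([[-a,            a,   0.0],
                     [0.5, -(0.5 + b),     b],
                     [1.0,          1.0, -2.0]])

p0 = np.array([1.0, 0.0, 0.0])                # initial distribution
sol = solve_ivp(lambda t, p: p @ Q(t), (0.0, 5.0), p0,
                rtol=1e-10, atol=1e-12, dense_output=True)

for t in np.linspace(0.0, 5.0, 6):
    p = sol.sol(t)
    # Components stay in [0, 1] and sum to 1, as Lemma 4.1 asserts.
    assert p.min() > -1e-8 and abs(p.sum() - 1.0) < 1e-8
```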

Remark 4.2.

For the reader whose interests are mainly in differential equations, we point out that the initial condition \(\sum_{i=1}^{m}{p}_{i}^{0} = 1\) in (4.1) is not restrictive, since if \({p}^{0} = 0\), then p(t) = 0 is the only solution to (4.1). If \({p}_{i}^{0} > 0\) for some i, one may divide both sides of (4.1) by \(\sum_{i=1}^{m}{p}_{i}^{0}\ (> 0)\) and consider \(\widetilde{p}(t) = p(t)/\sum_{i=1}^{m}{p}_{i}^{0}\) in lieu of p(t).

To achieve our goal, we first treat a simple case, namely, that in which the generator is weakly irreducible. Once this is established, we proceed to the more complex cases in which the generator has several weakly irreducible classes, includes absorbing states, or includes transient states.

The rest of the chapter is arranged as follows. Section 4.2 begins with the study of the situation in which the generator is weakly irreducible. Although this is a simple case, it outlines the main ideas behind the construction of asymptotic expansions. The section constructs the formal expansions, proves the needed regularity, and establishes the error estimates. Section 4.3 develops asymptotic expansions of the underlying probability distribution for chains with recurrent states. As will be seen in the analysis to follow, extreme care must be taken to handle two-time-scale Markov chains with fast and slow components. One of the key issues is the selection of appropriate initial conditions to make the series a “matched” asymptotic expansion; here the separable form of our asymptotic expansion appears to be advantageous compared with the two-time-scale expansion. For easy reference, a subsection is also provided as a user’s guide.

Using the method of matched asymptotic expansions, Section 4.4 extends the results to include absorbing states. It demonstrates that similar techniques can be used, and that the techniques and methods of Section 4.3 are rather general and can be applied to a wide variety of cases. Section 4.5 continues the study with problems involving transient states. By treating chains having recurrent states, chains including absorbing states, and chains including transient states, we are able to characterize the probability distributions of the underlying singularly perturbed chains in fairly general cases with finite state spaces, and hence provide a comprehensive picture through these “canonical” models.

While Sections 4.3–4.5 cover most cases of practical interest with finite state spaces, the rest of the chapter makes several remarks on Markov chains with countable state spaces and on two-time-scale diffusions. In Section 4.6.1, we extend the results to processes with countable state spaces in which \(\widetilde{Q}(t)\) is a block-diagonal matrix with infinitely many blocks, each of which is finite-dimensional. Then Section 4.6.2 treats the problem in which \(\widetilde{Q}(t)\) itself is an infinite-dimensional matrix. In this case, further conditions are necessary. As in the finite-dimensional counterpart, sufficient conditions that ensure the validity of the asymptotic expansions are provided. The essential ingredients include Fredholm-alternative-like conditions and the notion of weak irreducibility. Finally, we mention related results on singularly perturbed diffusions in Section 4.7. Additional notes and remarks are given in Section 4.8.

2 Irreducible Case

We begin with the case concerning weakly irreducible generators. Let \(Q(t) \in {\mathbb{R}}^{m\times m}\) be a generator, ε > 0 be a small parameter, and suppose that αε(t) is a finite-state Markov chain with state space \(\mathcal{M} =\{ 1,\ldots,m\}\) generated by \({Q}^{\varepsilon }(t) = Q(t)/\varepsilon \). The row vector \({p}^{\varepsilon }(t) = (P({\alpha }^{\varepsilon }(t) = 1),\ldots,P({\alpha }^{\varepsilon }(t) = m)) \in {\mathbb{R}}^{1\times m}\) denotes the probability distribution of the underlying chain at time t. Then by virtue of Theorem  2.5, p ε( ⋅) is a solution of the forward equation

$$\begin{array}{ll} &\frac{d{p}^{\varepsilon }(t)} {dt} = {p}^{\varepsilon }(t){Q}^{\varepsilon }(t) ={ 1 \over \varepsilon } {p}^{\varepsilon }(t)Q(t), \\ &{p}^{\varepsilon }(0) = {p}^{0}\mbox{ such that }{p}_{ i}^{0} \geq 0\mbox{ for each }i,\mbox{ and }\sum\limits_{i=1}^{m}{p}_{ i}^{0} = 1, \end{array}$$
(4.3)

where \({p}^{0} = ({p}_{1}^{0},\ldots,{p}_{m}^{0})\) and \({p}_{i}^{0}\) denotes the ith component of \({p}^{0}\). Therefore, studying the probability distribution is equivalent to examining the solution of (4.3). Note that Lemma  4.1 continues to hold for the solution p ε(t).

As discussed in Chapters 1 and 3, the equation in (4.3) arises from various applications involving a rapidly fluctuating Markov chain governed by the generator Q(t) ∕ ε. As ε gets smaller, the Markov chain fluctuates more rapidly. Normally, the fast-changing process αε( ⋅) in an actual system is difficult to analyze. The desired limit properties, however, provide us with an alternative: we can replace the actual process by its “average” in the system under consideration. This approach has significant practical value. A fundamental question common to numerous applications involving two-time-scale Markov chains is to understand the asymptotic properties of p ε( ⋅), namely, its limit behavior as ε → 0. If Q(t) = Q, a constant matrix, and if Q is irreducible (see Definition  2.7), then for each t > 0, p ε(t) → ν, the familiar stationary distribution. For the time-varying counterpart, it is reasonable to expect that the corresponding distribution will converge to a probability distribution that mimics the main features of the distribution of stationary chains, while preserving the time-varying nature of the nonstationary system. A candidate bearing such characteristics is the quasi-stationary distribution ν(t). Recall that ν(t) is said to be a quasi-stationary distribution (see Definition  2.8) if \(\nu (t) = ({\nu }_{1}(t),\ldots,{\nu }_{m}(t)) \geq 0\) and it satisfies the equations

$$\nu (t)Q(t) = 0\mbox{ and }\sum\limits_{i=1}^{m}{\nu }_{ i}(t) = 1.$$
(4.4)

If Q(t) ≡ Q, a constant matrix, then an analytic solution of (4.3) is obtainable, since the fundamental matrix solution (see Hale [79]) takes the simple form exp(Qt); the limit behavior of p ε(t) is derivable through the solution p 0exp(Qt ∕ ε). For time-dependent Q(t), although the fundamental matrix solution still exists, it does not have a simple form. The complex integral representation is not very informative in the asymptotic study of p ε(t), except in the case m = 2. In this case, αε( ⋅) is a two-state Markov chain and the constraint \({p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) = 1\) reduces the current problem to a scalar one, so a closed-form solution is possible. However, such a technique cannot be generalized to m > 2. Let 0 < T < ∞ be a finite real number. We divide the interval [0, T] into two parts: one for t very close to 0 (in the range of an ε-layer), and the other for t bounded away from 0. The behavior of p ε( ⋅) differs significantly in these two regions. Such a division leads us to the use of matched asymptotic expansions. Not only do we prove the convergence of p ε(t) as ε → 0, but we also obtain an asymptotic series. The procedure involves constructing the regular part (outer expansion) for t away from 0 as well as the initial-layer corrections for small t, and matching these expansions by a proper choice of initial conditions.

In what follows, in addition to obtaining the zeroth-order approximation, i.e., the convergence of p ε( ⋅) to its quasi-stationary distribution, we derive higher-order approximations and error bounds. A consequence of the findings is that the convergence of the probability distribution and related occupation measures of the corresponding Markov chain takes place in an appropriate sense. The asymptotic properties of a suitably scaled occupation time and the corresponding central limit theorem for αε( ⋅) (based on the expansion) will be studied in Chapter 5.

2.1 Asymptotic Expansions

To proceed, we make the following assumptions.

  1. (A4.1)

    Given 0 < T < ∞, for each t ∈ [0, T], Q(t) is weakly irreducible, that is, the system of equations

    $$\begin{array}{ll} &f(t)Q(t) = 0, \\ &\sum\limits_{i=1}^{m}{f}_{ i}(t) = 1\end{array}$$
    (4.5)

    has a unique nonnegative solution.

  2. (A4.2)

    For some n, Q( ⋅) is (n + 1)-times continuously differentiable on [0, T], and \(({d}^{n+1}/d{t}^{n+1})Q(\cdot )\) is Lipschitz on [0, T].

Remark 4.3.

Condition (A4.2) requires that the matrix Q(t) be sufficiently smooth. This is necessary for obtaining the desired asymptotic expansion. To validate the asymptotic expansion, we need to estimate the remainder term. Thus for the nth-order approximation, we need the (n + 1)st-order smoothness.

To proceed, we first state a lemma. Its proof is in Lemma  A.2 in the appendix.

Lemma 4.4.

Consider the matrix differential equation

$${ dP(s) \over ds} = P(s)A,\ \ P(0) = I,$$
(4.6)

where \(P(s) \in {\mathbb{R}}^{m\times m}\). Suppose \(A \in {\mathbb{R}}^{m\times m}\) is a generator of a (homogeneous or stationary) finite-state Markov chain and is weakly irreducible. Then \(P(s) \rightarrow \overline{P}\) as s →∞ and

$$\left\vert\exp (As) -\overline{P}\right\vert\leq K\exp (-\widetilde{\kappa }s)\quad \mbox{ for some }\widetilde{\kappa } > 0,$$
(4.7)

where \(\overline{P} = \mathrm{1}\mathrm{l}({\overline{\nu }}_{1},\cdots \,,{\overline{\nu }}_{m}) \in {\mathbb{R}}^{m\times m},\) and \(({\overline{\nu }}_{1}\), …, \({\overline{\nu }}_{m})\) is the quasi-stationary distribution of the Markov process with generator A.
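A brief numerical illustration of Lemma 4.4 may be helpful. The sketch below (with an arbitrary weakly irreducible generator A of our own choosing, assuming NumPy and SciPy) computes the quasi-stationary distribution as the normalized left null vector of A and observes the exponential decay in (4.7):

```python
# A sketch of Lemma 4.4: exp(As) approaches the rank-one matrix
# P_bar = 1l (nu_1, ..., nu_m) at an exponential rate.
import numpy as np
from scipy.linalg import expm, null_space

A = np.array([[-1.0,  0.7,  0.3],
              [ 0.4, -0.9,  0.5],
              [ 0.2,  0.6, -0.8]])          # illustrative generator

nu = null_space(A.T)[:, 0]                  # left null vector of A
nu = nu / nu.sum()                          # quasi-stationary distribution
P_bar = np.outer(np.ones(3), nu)            # 1l (nu_1, ..., nu_m)

for s in [1.0, 5.0, 10.0, 20.0]:
    print(s, np.abs(expm(A * s) - P_bar).max())   # ~ K exp(-kappa s)
```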

Recall that \(\mathrm{1}\mathrm{l} = (1,\ldots,1)^{\prime} \in {\mathbb{R}}^{m\times 1}\) and \(({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m}) \in {\mathbb{R}}^{1\times m}.\) Thus \(\mathrm{1}\mathrm{l}({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m})\) is the usual matrix product. Recall also that an m ×m matrix P(s) is said to be a solution of (4.6) if each row of P(s) satisfies the equation. In the lemma above, if A is a constant matrix that is irreducible, then \(({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m})\) becomes the familiar stationary distribution. In general, A could be time-dependent, e.g., A = A(t). As shown in Lemma  A.4, by assuming the existence of the solution ν(t) to (4.5), it follows that ν(t) ≥ 0; that is, the nonnegativity assumption is redundant.

We seek asymptotic expansions of the form

$${p}^{\varepsilon }(t) = {\Phi }_{ n}^{\varepsilon }(t) + {\Psi }_{ n}^{\varepsilon }\left({ t \over \varepsilon }\right) + {e}_{n}^{\varepsilon }(t),$$

where e n ε(t) is the remainder,

$${\Phi }_{n}^{\varepsilon }(t) = {\varphi }_{ 0}(t) + \varepsilon {\varphi }_{1}(t) + \cdots + {\varepsilon }^{n}{\varphi }_{ n}(t),$$
(4.8)

and

$${\Psi }_{n}^{\varepsilon }\left ({ t \over \varepsilon } \right ) = {\psi }_{0}\left ({ t \over \varepsilon } \right ) + \varepsilon {\psi }_{1}\left ({ t \over \varepsilon } \right ) + \cdots + {\varepsilon }^{n}{\psi }_{ n}\left ({ t \over \varepsilon } \right ),$$
(4.9)

with the functions φ i ( ⋅) and ψ i ( ⋅) to be determined in the sequel. We now state the main result of this section.

Theorem 4.5.

Suppose that (A4.1) and (A4.2) are satisfied. Denote the unique solution of (4.3) by p ε (⋅). Then two sequences of functions φ i (⋅) and ψ i (⋅), 0 ≤ i ≤ n, can be constructed such that

  • φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T];

  • for each i, there is a κ0 > 0 such that

    $$\vert {\psi }_{i}\left ({ t \over \varepsilon } \right )\vert \leq K\exp \left (-\frac{{\kappa }_{0}t} {\varepsilon } \right );$$
  • the following estimate holds:

    $$ \sup\limits_{t\in [0,T]}{\biggl |{p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )\biggr |} \leq K{\varepsilon }^{n+1}.$$
    (4.10)

Remark 4.6.

The method described in what follows gives an explicit construction of the functions φi(⋅) and ψi(⋅) for i ≤ n. Thus the proof to be presented is constructive. Our plan is first to obtain these sequences, then to validate the smoothness and decay properties above, and finally to derive the error bound by showing that the remainder

$${\biggl |{p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ( \frac{t} {\varepsilon }\right )\biggr |}$$

is of order \(O({\varepsilon }^{n+1})\) uniformly in t.

It will be seen from the subsequent development that φ0(t) is equal to the quasi-stationary distribution, that is, φ0(t) = ν(t). In particular, if n = 0 in the above theorem, we have the following result.

Corollary 4.7.

Suppose Q(⋅) satisfies (A4.1), is continuously differentiable on [0,T], and (d∕dt)Q(⋅) is Lipschitz on [0,T]. Then for all t > 0,

$$ \lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon }(t) = \nu (t) = {\varphi }_{ 0}(t),$$
(4.11)

i.e., p ε (⋅) converges to the quasi-stationary distribution.

Remark 4.8.

The theorem establishes the convergence of pε(⋅) to φ0(⋅), as well as the rate of convergence. In addition to the zeroth-order approximation, we have the first-order approximation, the second-order approximation, and so on. In fact, the difference pε(⋅) − φ0(⋅) is characterized by the initial-layer term ψ0(⋅) and the associated error bound.

If the initial condition p 0 is chosen to be exactly φ 0 (0), then in the expansion, the zeroth-order initial layer ψ 0 (⋅) vanishes. This cannot be expected in general, however. Even if ψ 0 (⋅) = 0, the remaining initial-layer terms ψ i (⋅), i ≥ 1, will generally still be present.

To proceed, we define an operator \({\mathcal{L}}^{\varepsilon }\) by

$${\mathcal{L}}^{\varepsilon }f = \varepsilon \frac{df} {dt} - fQ,$$
(4.12)

for any smooth row-vector-valued function f( ⋅). Then \({\mathcal{L}}^{\varepsilon }f = 0\) iff f is a solution to the differential equation in (4.3). The proof of Theorem  4.5 is divided into the following steps.

  1. Construct the asymptotic series, i.e., find φ i ( ⋅) and ψ i ( ⋅), for i ≤ n. For the purpose of evaluating the remainder, we need to calculate two extra terms φ n + 1( ⋅) and ψ n + 1( ⋅). This will become clear when we carry out the error analysis.

  2. Obtain the regularity of φ i ( ⋅) and ψ i ( ⋅) by proving that φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T] and that ψ i ( ⋅) decays exponentially fast.

  3. Carry out the error analysis and justify that the remainder has the desired property.

2.2 Outer Expansion

We begin with the construction of Φ n ε( ⋅) in the asymptotic expansion. We call it the outer expansion or the regular part of the expansion. Consider the differential equation

$${\mathcal{L}}^{\varepsilon }{\Phi }_{ n+1}^{\varepsilon } = 0$$

where \({\mathcal{L}}^{\varepsilon }\) is given by (4.12).

By equating the coefficients of \({\varepsilon }^{k}\), for \(k = 0,1,\ldots,n + 1\), we obtain

$$\begin{array}{ll} &{\varepsilon }^{0} :\ \ {\varphi }_{ 0}(t)Q(t) = 0, \\ &{\varepsilon }^{1} :\ \ {\varphi }_{ 1}(t)Q(t) ={ d{\varphi }_{0}(t) \over dt}, \\ &\ \qquad \ \cdots \\ &{\varepsilon }^{k} :\ \ {\varphi }_{ k}(t)Q(t) ={ d{\varphi }_{k-1}(t) \over dt},\ \mbox{ for }k = 1,\ldots,n + 1.\end{array}$$
(4.13)

Remark 4.9.

First, one has to make sure that the equations above have solutions, that is, a consistency condition needs to be verified. For each t ∈ [0,T], denote the null space of Q(t) by N(Q(t)). Note that the weak irreducibility of Q(t) implies that

$$\mbox{ rank}(Q(t)) = m - 1,$$

thus

$$\mbox{ dim}(N(Q(t))) = 1.$$

It is easily seen that N(Q(t)) is spanned by the vector 1l. By virtue of the Fredholm alternative (see Corollary  A.38), the second equation in (4.13) has a solution only if its right-hand side, namely, (d∕dt)φ0(t), is orthogonal to N(Q(t)). Since N(Q(t)) is spanned by 1l,

$${\varphi }_{0}(t)\mathrm{1}\mathrm{l} = 1$$

and

$${ d{\varphi }_{0}(t) \over dt} \mathrm{1}\mathrm{l} ={ d\left ({\varphi }_{0}(t)\mathrm{1}\mathrm{l}\right ) \over dt} = 0,$$

the orthogonality is easily verified. Similar arguments hold for the rest of the equations. The consistency in fact is rather crucial. Without such a condition, one would not be able to solve the equations in (4.13). This point will be made again when we deal with weak and strong interaction models in Section 4.3.

Recall that the components of p ε( ⋅) are probabilities (see (4.2)). In what follows, we show that all these φ i ( ⋅) can be determined by (4.13) and (4.2).

Note that rank\((Q(t)) = m - 1\). Thus Q(t) is singular, and the equations in (4.13) are not uniquely solvable. For example, the first equation in (4.13) cannot be solved uniquely. Nevertheless, this equation together with the constraint \(\sum_{i=1}^{m}{\varphi }_{0}^{i}(t) = 1\) leads to a unique solution, namely, the quasi-stationary distribution.

In fact, a direct consequence of (A4.1) and (A4.2) is that the weak irreducibility of Q(t) is uniform in the following sense: for any t ∈ [0, T], if any column of Q(t) is replaced by \(\mathrm{1}\mathrm{l} \in {\mathbb{R}}^{m\times 1}\), the resulting determinant Δ(t) satisfies | Δ(t) |  > 0, since (4.5) has only one solution and \(\sum_{j=1}^{m}{q}_{ij}(t) = 0\) for each i = 1,…,m. Moreover, in view of the continuity of Q(t) on the compact interval [0, T], there is a number c > 0 such that | Δ(t) | ≥ c > 0 on [0, T]. We can replace any one of the first m equations of the system φ0(t)Q(t) = 0 by the equation \(\sum_{i=1}^{m}{\varphi }_{0}^{i}(t) = 1\); the determinant Δ(t) of the resulting coefficient matrix then satisfies | Δ(t) | ≥ c > 0 for all t ∈ [0, T]. To illustrate, we may suppose without loss of generality that the mth equation is the one replaced. Then we have

$$\begin{array}{ll} &{q}_{11}(t){\varphi }_{0}^{1}(t) + \cdots + {q}_{ m1}(t){\varphi }_{0}^{m}(t) = 0, \\ &{q}_{12}(t){\varphi }_{0}^{1}(t) + \cdots + {q}_{ m2}(t){\varphi }_{0}^{m}(t) = 0, \\ &\quad \ \cdots \\ &{q}_{1,m-1}(t){\varphi }_{0}^{1}(t) + \cdots + {q}_{ m,m-1}(t){\varphi }_{0}^{m}(t) = 0, \\ &{\varphi }_{0}^{1}(t) + \cdots + {\varphi }_{ 0}^{m}(t) = 1.\end{array}$$
(4.14)

The determinant of the coefficient matrix in (4.14) is

$$\begin{array}{ll} &\Delta (t) =\left\vert \begin{array}{*{10}c} {q}_{11}(t) & {q}_{21}(t) &\cdots & {q}_{m1}(t) \\ {q}_{12}(t) & {q}_{22}(t) &\cdots & {q}_{m2}(t) \\ \vdots & \vdots &\cdots & \vdots \\ {q}_{1,m-1}(t)&{q}_{2,m-1}(t)&\cdots &{q}_{m,m-1}(t) \\ 1 & 1 &\cdots & 1 \end{array}\right\vert \end{array}$$
(4.15)

and satisfies | Δ(t) | ≥ c > 0. Now by Cramer’s rule, for each 1 ≤ i ≤ m,

$${\varphi }_{0}^{i}(t) ={ 1 \over \Delta (t)} \left\vert \begin{array}{*{10}c} {q}_{11}(t) &\cdots & 0 &\cdots & {q}_{m1}(t) \\ {q}_{12}(t) &\cdots & 0 &\cdots & {q}_{m2}(t)\\ \vdots &\cdots & \vdots &\cdots & \vdots \\ {q}_{1,m-1}(t)&\cdots & 0 &\cdots &{q}_{m,m-1}(t) \\ 1 &\cdots &\underbrace{{1}}_{i\mathrm{th\;column}} & \cdots & 1 \end{array} \right\vert,$$

that is, the ith column of Δ(t) in (4.15) is replaced by \((0,\ldots,0,1)^{\prime} \in {\mathbb{R}}^{m\times 1}\). By the smoothness assumption on Q( ⋅), it is plain that φ0( ⋅) is (n + 1)-times continuously differentiable on [0, T].
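As a concrete transcription of this recipe (a sketch, with an illustrative two-state generator of our own choosing rather than one from the text), one linear solve per time point suffices:

```python
# A sketch of (4.14): replace the m-th equation of phi_0(t)Q(t) = 0 by
# the normalization sum_i phi_0^i(t) = 1, then solve the linear system.
import numpy as np

def Q(t):
    lam, mu = 1.0 + 0.5 * np.sin(t), 2.0 + np.cos(t)
    return np.array([[-lam, lam], [mu, -mu]])

def phi0(t):
    m = Q(t).shape[0]
    M = Q(t).T.copy()       # row j of M encodes q_{1j} x_1 + ... + q_{mj} x_m = 0
    M[m - 1, :] = 1.0       # replace the m-th equation by x_1 + ... + x_m = 1
    b = np.zeros(m); b[m - 1] = 1.0
    return np.linalg.solve(M, b)

print(phi0(0.0))            # (mu/(lam+mu), lam/(lam+mu)) at t = 0, i.e., (0.75, 0.25)
```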

The foregoing method can be used to solve other equations in (4.13) analogously. Owing to the smoothness of φ0( ⋅), (d ∕ dt0(t) exists, and we can proceed to obtain φ1( ⋅). Repeat the procedure above, and continue inductively. For each k ≥ 1,

$$\begin{array}{ll} &\sum\limits_{i=1}^{m}{\varphi }_{ k}^{i}(t){q}_{ ij}(t) ={ d{\varphi }_{k-1}^{j}(t) \over dt} \mbox{ for }j = 1,\ldots,m, \\ &\sum\limits_{i=1}^{m}{\varphi }_{ k}^{i}(t) = 0.\end{array}$$
(4.16)

Note that φ k − 1 j( ⋅) has been found, so \((d/dt){\varphi }_{k-1}^{j}(t)\) is a known function. After a suitable replacement of one of the first m equations by the last equation in (4.16), the determinant Δ(t) of the resulting coefficient matrix satisfies | Δ(t) | ≥ c > 0. We obtain for each 1 ≤ i ≤ m,

$${\varphi }_{k}^{i}(t) ={ 1 \over \Delta (t)} \left\vert \begin{array}{*{10}c} {q}_{11}(t) &\cdots &{ d{\varphi }_{k-1}^{1}(t) \over dt} &\cdots & {q}_{m1}(t) \\ {q}_{12}(t) &\cdots &{ d{\varphi }_{k-1}^{2}(t) \over dt} &\cdots & {q}_{m2}(t) \\ \vdots &\cdots & \vdots &\cdots & \vdots \\ {q}_{1,m-1}(t)&\cdots &{ d{\varphi }_{k-1}^{m-1}(t) \over dt} &\cdots &{q}_{m,m-1}(t) \\ & & & \\ 1 &\cdots & \underbrace{{0}}_{i\mathrm{th\;column}} & \cdots & 1 \end{array} \right\vert.$$

Hence φ k ( ⋅) is \((n + 1 - k)\)-times continuously differentiable on [0, T]. Thus we have constructed a sequence of functions φ k (t) that are \((n + 1 - k)\)-times continuously differentiable on [0, T] for \(k = 0,1,\ldots,n + 1\).

Remark 4.10.

The method used above is convenient for computational purposes. An alternative way of obtaining the sequence φk(t) is as follows. For example, to solve

$${\varphi }_{0}(t)Q(t) = 0,\ \ \sum\limits_{j=1}^{m}{\varphi }_{ 0}^{j}(t) = 1,$$

define \({Q}_{c}(t) = (\mathrm{1}\mathrm{l}\vdots Q(t)) \in {\mathbb{R}}^{m\times (m+1)}\). Then the equation above can be written as

$${\varphi }_{0}(t){Q}_{c}(t) = (1,0,\ldots,0).$$

Note that Qc(t)Q′c(t) has full rank m owing to weak irreducibility. Thus the solution of the equation is

$${\varphi }_{0}(t) = (1,0,\ldots,0){Q^{\prime}}_{c}(t){[{Q}_{c}(t){Q^{\prime}}_{c}(t)]}^{-1}.$$

We can obtain all other φk(t) for \(k = 1,\ldots,n + 1\), similarly.
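The formula of this remark is equally easy to transcribe. The following sketch (assuming NumPy; the generator is again an illustrative choice of ours) evaluates it directly:

```python
# A sketch of Remark 4.10: phi_0(t) = (1,0,...,0) Q_c'(t) [Q_c(t) Q_c'(t)]^{-1},
# where Q_c(t) = (1l : Q(t)) is the m x (m+1) augmented matrix.
import numpy as np

def phi0_via_Qc(Qt):
    m = Qt.shape[0]
    Qc = np.hstack([np.ones((m, 1)), Qt])    # augment Q(t) with the column 1l
    e1 = np.zeros(m + 1); e1[0] = 1.0        # the row vector (1, 0, ..., 0)
    return e1 @ Qc.T @ np.linalg.inv(Qc @ Qc.T)

Qt = np.array([[-1.0, 1.0], [3.0, -3.0]])
print(phi0_via_Qc(Qt))                       # (0.75, 0.25)
```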

The regular part Φ n ε( ⋅) is a good approximation to p ε( ⋅) when t is bounded away from 0. When t approaches 0, an initial layer (or a boundary layer) develops and the approximation breaks down. To accommodate this situation, an initial-layer correction, i.e., a sequence of functions ψ k (t ∕ ε) for \(k = 0,1,\ldots,n + 1\) needs to be constructed.

2.3 Initial-Layer Correction

This section is on the construction of the initial-layer terms. The presentation consists of two parts. We obtain the sequence {ψ k ( ⋅)} in the first subsection, and derive the exponential decay property in the second subsection.

Construction of ψ k (⋅). Following usual practice in singular perturbation theory, define the stretched (or rescaled) time variable by

$$\tau ={ t \over \varepsilon }.$$
(4.17)

Note that τ → ∞ as ε → 0 for any given t > 0.

Consider the differential equation

$${\mathcal{L}}^{\varepsilon }{\Psi }_{ n+1}^{\varepsilon } =\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\mathcal{L}}^{\varepsilon }{\psi }_{ i} = 0.$$

Using the stretched time variable τ, we arrive at

$${ d{\Psi }_{n+1}^{\varepsilon }(\tau ) \over d\tau } = {\Psi }_{n+1}^{\varepsilon }(\tau )Q(\varepsilon \tau ).$$

Owing to the smoothness of Q( ⋅), a truncated Taylor expansion of Q(ετ) about t = 0 leads to

$$Q(t) = Q(\varepsilon \tau ) =\sum\limits_{i=0}^{n+1}{ {(\varepsilon \tau )}^{i} \over i!} { {d}^{i}Q(0) \over d{t}^{i}} + {R}_{n+1}(\varepsilon \tau ),$$

where

$${R}_{n+1}(t) ={ {t}^{n+1} \over (n + 1)!} \left ({ {d}^{n+1}Q(\xi ) \over d{t}^{n+1}} -{ {d}^{n+1}Q(0) \over d{t}^{n+1}} \right ),$$

for some 0 < ξ < t. In view of (A4.2),

$${R}_{n+1}(t) = O({t}^{n+2})\mbox{ uniformly in }t \in [0,T].$$

Drop the term R n + 1(t) and use the first n + 2 terms to get

$${ d{\Psi }_{n+1}^{\varepsilon }(\tau ) \over d\tau } = {\Psi }_{n+1}^{\varepsilon }(\tau )\left (\sum\limits_{i=0}^{n+1}{ {(\varepsilon \tau )}^{i} \over i!} { {d}^{i}Q(0) \over d{t}^{i}} \right ).$$

Similar to the previous section, equating coefficients of \({\varepsilon }^{k}\) for \(k = 0,1,\ldots,n + 1\), we have

$$\begin{array}{ll} &{\varepsilon }^{0} :\ { d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )Q(0), \\ &{\varepsilon }^{1} :\ { d{\psi }_{1}(\tau ) \over d\tau } = {\psi }_{1}(\tau )Q(0) + \tau {\psi }_{0}(\tau ){ dQ(0) \over dt}, \\ &\ \qquad \ \cdots \\ &{\varepsilon }^{k} :\ { d{\psi }_{k}(\tau ) \over d\tau } = {\psi }_{k}(\tau )Q(0) + {r}_{k}(\tau ), \end{array}$$
(4.18)

where r k (τ) is a function having the form

$$\begin{array}{ll} {r}_{k}(\tau )& ={ {\tau }^{k} \over k!} {\psi }_{0}(\tau ){ {d}^{k}Q(0) \over d{t}^{k}} + \cdots + \tau {\psi }_{k-1}(\tau ){ dQ(0) \over dt} \\ & =\sum\limits_{i=1}^{k}{ {\tau }^{i} \over i!} {\psi }_{k-i}(\tau ){ {d}^{i}Q(0) \over d{t}^{i}}. \end{array}$$
(4.19)

These equations together with appropriate initial conditions allow us to determine the ψ k ( ⋅)’s. For constructing φ k ( ⋅), a number of algebraic equations are solved, whereas in determining ψ k ( ⋅), one has to solve a number of differential equations instead. Two points are worth mentioning in connection with (4.18). First, the time-varying differential equation is replaced by one with constant coefficients; the solution thus can be written explicitly. The second point concerns the selection of the initial conditions for ψ k ( ⋅), with \(k = 0,1,\ldots,n + 1\). We choose the initial conditions so that the initial data of the asymptotic expansion “match” those of the differential equation (4.3). To be more specific,

$$\begin{array}{rl} &{\varphi }_{0}(0) + {\psi }_{0}(0) = {p}^{0},\mbox{ and } \\ &{\varphi }_{k}(0) + {\psi }_{k}(0) = 0\mbox{ for }k = 1,2,\ldots,n + 1.\end{array}$$

Corresponding to ε0, solving

$$\begin{array}{rl} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )Q(0), \\ &{\psi }_{0}(0) = {p}^{0} - {\varphi }_{ 0}(0), \end{array}$$

where p 0 is the initial data given in (4.3), one has

$${\psi }_{0}(\tau ) = ({p}^{0} - {\varphi }_{ 0}(0))\exp \left (Q(0)\tau \right ).$$
(4.20)

Continuing in this fashion, for \(k = 1,\ldots,n + 1\), we obtain

$$\begin{array}{rl} &{ d{\psi }_{k}(\tau ) \over d\tau } = {\psi }_{k}(\tau )Q(0) + {r}_{k}(\tau ), \\ &{\psi }_{k}(0) = -{\varphi }_{k}(0)\end{array}$$

In the equations above, we purposely separated Q(0) from the term r k (τ). As a result, the equations are linear systems with a constant matrix Q(0) and time-varying forcing terms. This is useful for our subsequent investigation.

For \(k = 1,2,\ldots,n + 1\), the solutions are given by

$$\begin{array}{ll} {\psi }_{k}(\tau )& = -{\varphi }_{k}(0)\exp (Q(0)\tau ) \\ &\qquad +{ \int }_{0}^{\tau }{r}_{ k}(s)\exp \left (Q(0)(\tau - s)\right )ds\end{array}$$
(4.21)

The construction of ψ k ( ⋅) for \(k = 0,1,\ldots,n + 1\), and hence the construction of the asymptotic series is complete.
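The construction can also be followed step by step numerically. In the sketch below (a two-state illustration; the values of dQ(0)∕dt and φ1(0) are placeholders of our own choosing, not quantities from the text), ψ0 is evaluated from (4.20) and ψ1 from (4.21) by quadrature:

```python
# A sketch of (4.20)-(4.21): psi_0 in closed form, psi_1 by numerical
# quadrature of the variation-of-constants integral.
import numpy as np
from scipy.linalg import expm

Q0  = np.array([[-1.0, 1.0], [3.0, -3.0]])   # Q(0)
dQ0 = np.array([[-0.5, 0.5], [0.0, 0.0]])    # dQ(0)/dt (assumed placeholder)
p0  = np.array([1.0, 0.0])
phi0_at0 = np.array([0.75, 0.25])            # phi_0(0) for this Q(0)
phi1_at0 = np.array([0.1, -0.1])             # phi_1(0) (assumed placeholder)

def psi0(tau):
    return (p0 - phi0_at0) @ expm(Q0 * tau)

def psi1(tau, n=400):
    # r_1(s) = s psi_0(s) dQ(0)/dt; psi_1 solves (4.18) for k = 1.
    s = np.linspace(0.0, tau, n)
    vals = np.array([si * (psi0(si) @ dQ0) @ expm(Q0 * (tau - si)) for si in s])
    return -phi1_at0 @ expm(Q0 * tau) + np.trapz(vals, s, axis=0)

for tau in [0.0, 1.0, 5.0, 10.0]:
    print(tau, psi0(tau), psi1(tau))         # both decay exponentially in tau
```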

2.4 Exponential Decay of ψ k ( ⋅)

This subsection concerns the exponential decay of ψ k ( ⋅). At first glance, this seems troublesome since Q(0) has a zero eigenvalue. Nevertheless, a probabilistic argument helps us derive the desired property. Two key points in the proof below are the utilization of orthogonality and the repeated application of the approximation of exp(Q(0)τ) in Lemma  4.4.

By virtue of Assumption (A4.1), the finite-state Markov chain generated by Q(0) is weakly irreducible. Identifying Q(0) with the matrix A in Lemma  4.4 yields that

$$\exp (Q(0)\tau ) \rightarrow \overline{P}\mbox{ as }\tau \rightarrow \infty,$$

where \(\overline{P} = \mathrm{1}\mathrm{l}\overline{\nu }\), and \(\overline{\nu } = ({\overline{\nu }}_{1},\ldots,{\overline{\nu }}_{m})\) is the quasi-stationary distribution corresponding to the constant matrix Q(0).

Proposition 4.11.

Under the conditions of Theorem  4.5 , for each 0 ≤ k ≤ n + 1, there exist a nonnegative real polynomial c 2k (τ) of degree 2k and a positive number κ 0,0 > 0 such that

$$\vert {\psi }_{k}(\tau )\vert \leq {c}_{2k}(\tau )\exp (-{\kappa }_{0,0}\tau ).$$
(4.22)

Proof: First of all, note that

$$\sum\limits_{i=1}^{m}{p}_{ i}^{0} = 1\mbox{ and }\sum\limits_{i=1}^{m}{\varphi }_{ 0}^{i}(0) = 1.$$

It follows that

$$\sum\limits_{i=1}^{m}{\psi }_{ 0}^{i}(0) =\sum\limits_{i=1}^{m}{p}_{ i}^{0} -\sum\limits_{i=1}^{m}{\varphi }_{ 0}^{i}(0) = 0.$$

That is, ψ0(0) is orthogonal to 1l. Consequently, \({\psi }_{0}(0)\overline{P} = 0\) and by virtue of Lemma  4.4 (with A = Q(0)), for some \({\kappa }_{0,0} :=\widetilde{ \kappa } > 0\),

$$\begin{array}{ll} \left \vert {\psi }_{0}(\tau )\right \vert & = \left \vert {\psi }_{0}(0)\exp (Q(0)\tau )\right \vert \\ &\leq \left \vert {\psi }_{0}(0)\overline{P}\right \vert + \left \vert {\psi }_{0}(0)(\exp (Q(0)\tau ) -\overline{P})\right \vert \\ & = \left \vert {\psi }_{0}(0)(\exp (Q(0)\tau ) -\overline{P})\right \vert \leq K\exp (-{\kappa }_{0,0}\tau ).\end{array}$$
(4.23)

Note that

$$Q(t)\mathrm{1}\mathrm{l} = 0\mbox{ for all }t \geq 0.$$

Differentiating this equation repeatedly leads to

$${ {d}^{k}Q(t) \over d{t}^{k}} \mathrm{1}\mathrm{l} ={ {d}^{k}(Q(t)\mathrm{1}\mathrm{l}) \over d{t}^{k}} = 0.$$

Hence, it follows that

$${ {d}^{k}Q(0) \over d{t}^{k}} \mathrm{1}\mathrm{l} = 0\ \mbox{ and }\ { {d}^{k}Q(0) \over d{t}^{k}} \overline{P} = 0,$$

for each 0 ≤ k ≤ n + 1. Owing to Lemma  4.4 and (4.21),

$$\begin{array}{rl} \vert {\psi }_{1}(\tau )\vert \leq &\left\vert {\varphi }_{1}(0)\exp (Q(0)\tau )\right\vert \\ & + \left\vert{\int }_{0}^{\tau }{\psi }_{ 0}(s){ dQ(0) \over dt} \left (\overline{P} + \left (\exp (Q(0)(\tau - s)) -\overline{P}\right )\right )s\,ds\right\vert \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) \\ & +{ \int }_{0}^{\tau }\vert {\psi }_{ 0}(s)\vert \left\vert { dQ(0) \over dt} \left (\exp (Q(0)(\tau - s)) -\overline{P}\right )\right\vert s\,ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) + K{\int }_{0}^{\tau }\exp (-{\kappa }_{ 0,0}s)\exp (-{\kappa }_{0,0}(\tau - s))s\,ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) + K{\tau }^{2}\exp (-{\kappa }_{ 0,0}\tau ) \leq {c}_{2}(\tau )\exp (-{\kappa }_{0,0}\tau ), \end{array}$$

for some nonnegative polynomial c 2(τ) of degree 2.

Note that r k (s) is orthogonal to \(\overline{P}\). By induction, for any k with \(k = 1,\ldots,n + 1\),

$$\begin{array}{ll} &\vert {\psi }_{k } (\tau )\vert \\ \leq &\vert {\varphi }_{k}(0)\exp (Q(0)\tau )\vert +{ \int }_{0}^{\tau }\left\vert {r}_{ k}(s)\left (\exp (Q(0)(\tau - s)) -\overline{P}\right )\right\vert ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) +\sum\limits_{i=1}^{k}{ 1 \over i!} {\int }_{0}^{\tau }{s}^{i}\vert {\psi }_{ k-i}(s)\vert \\ & \quad \times \left\vert { {d}^{i}Q(0) \over d{t}^{i}} \left (\exp (Q(0)(\tau - s)) -\overline{P}\right )\right\vert ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) + K\sum\limits_{i=1}^{2k-1}{ \int }_{0}^{\tau }{s}^{i}\exp (-{\kappa }_{ 0,0}\tau )ds \\ \leq &K\exp (-{\kappa }_{0,0}\tau ) + K\sum\limits_{i=1}^{2k}{\tau }^{i}\exp (-{\kappa }_{ 0,0}\tau ) \leq {c}_{2k}(\tau )\exp (-{\kappa }_{0,0}\tau ),\end{array}$$

where c 2k (τ) is a nonnegative polynomial of degree 2k. This completes the proof of the proposition. □ 

Since n is a finite integer, the growth of c 2k (τ) for 0 ≤ k ≤ n + 1 is much slower than exponential. Thus the following corollary is in force.

Corollary 4.12.

For each 0 ≤ k ≤ n + 1, with κ 0,0 given in Proposition  4.11,

$$\vert {\psi }_{k}(\tau )\vert \leq K\exp \left (-{\kappa }_{0}\tau \right ),\mbox{ for some }{\kappa }_{0}\mbox{ with }0 < {\kappa }_{0} < {\kappa }_{0,0}.$$

2.5 Asymptotic Validation

Recall that \({\mathcal{L}}^{\varepsilon }f = \varepsilon (d/dt)f - fQ\). Then we have the following lemma.

Lemma 4.13.

Suppose that for some 0 ≤ k ≤ n + 1,

$$ \sup\limits_{t\in [0,T]}\vert {\mathcal{L}}^{\varepsilon }{v}^{\varepsilon }(t)\vert = O\left ({\varepsilon }^{k+1}\right )\ \mbox{ and }\ {v}^{\varepsilon }(0) = 0.$$

Then

$$ \sup\limits_{t\in [0,T]}\vert {v}^{\varepsilon }(t)\vert = O\left ({\varepsilon }^{k}\right ).$$

Proof: Let ηε( ⋅) be a function satisfying \( \sup\limits_{t\in [0,T]}\vert {\eta }^{\varepsilon }(t)\vert = O\left ({\varepsilon }^{k+1}\right )\). Consider the differential equation

$$\begin{array}{ll} &{\mathcal{L}}^{\varepsilon }{v}^{\varepsilon }(t) = {\eta }^{\varepsilon }(t), \\ &{v}^{\varepsilon }(0) = 0.\end{array}$$
(4.24)

Then the solution of (4.24) is given by

$${v}^{\varepsilon }(t) ={ 1 \over \varepsilon } {\int }_{0}^{t}{\eta }^{\varepsilon }(s){X}^{\varepsilon }(t,s)ds,$$

where X ε(t, s) is a principal matrix solution. Recall that (see Hale [79, p. 80]) a fundamental matrix solution of the differential equation is an invertible matrix each row of which is a solution of the equation; a principal matrix solution is a fundamental matrix solution with initial value the identity matrix. In view of Lemma  4.1,

$$\vert {X}^{\varepsilon }(t,s)\vert \leq K\quad \mbox{ for all }t,s \in [0,T].$$

Therefore, we have the inequalities

$$ \sup\limits_{t\in [0,T]}\vert {v}^{\varepsilon }(t)\vert \leq { K \over \varepsilon } \sup\limits_{t\in [0,T]}{ \int }_{0}^{t}\vert {\eta }^{\varepsilon }(s)\vert ds \leq K{\varepsilon }^{k}.$$

The proof of the lemma is thus complete. □ 

Recall that the vector-valued “error” or remainder e n ε(t) is defined by

$${e}_{n}^{\varepsilon }(t) = {p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right ),$$
(4.25)

where p ε( ⋅) is the solution of (4.3), and φ i ( ⋅) and ψ i ( ⋅) are constructed in (4.13) and (4.18). It remains to show that \({e}_{n}^{\varepsilon }(t) = O\left ({\varepsilon }^{n+1}\right )\). To do so, we utilize Lemma  4.13 as a bridge. It should be pointed out, however, that to get the correct order for the remainder, a trick involving “backing up one step” is needed. The details follow.

Proposition 4.14.

Assume (A4.1) and (A4.2). Then for each k = 0,…,n,

$$ \sup\limits_{t\in [0,T]}\vert {e}_{k}^{\varepsilon }(t)\vert = O({\varepsilon }^{k+1}).$$

Proof: We begin with

$${e}_{1}^{\varepsilon }(t) = {p}^{\varepsilon }(t) - {\varphi }_{ 0}(t) - \varepsilon {\varphi }_{1}(t) - {\psi }_{0}\left ( \frac{t} {\varepsilon }\right ) - \varepsilon {\psi }_{1}\left ( \frac{t} {\varepsilon }\right ).$$
(4.26)

We will use the exponential decay property of ψ i (τ) given in Corollary  4.12. Clearly, e 1 ε(0) = 0, and hence the condition of Lemma  4.13 on the initial data is satisfied. By virtue of the exponential decay property of ψ i ( ⋅) in conjunction with the defining equations of φ i ( ⋅) and ψ i ( ⋅),

$$\begin{array}{rl} {\mathcal{L}}^{\varepsilon }{e}_{1}^{\varepsilon }(t)& = -\biggl [\varepsilon { d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)Q(t) + {\varepsilon }^{2}{ d{\varphi }_{1}(t) \over dt} - \varepsilon {\varphi }_{1}(t)Q(t) \\ &\qquad + \varepsilon { d \over dt} {\psi }_{0}\left ( \frac{t} {\varepsilon }\right ) - {\psi }_{0}\left ( \frac{t} {\varepsilon }\right )Q(t) + {\varepsilon }^{2}{ d \over dt} {\psi }_{1}\left ( \frac{t} {\varepsilon }\right ) \\ &\qquad - \varepsilon {\psi }_{1}\left ( \frac{t} {\varepsilon }\right )Q(t)\biggr ] \\ & = -{\varepsilon }^{2}{ d{\varphi }_{1}(t) \over dt} + {\psi }_{0}\left ( \frac{t} {\varepsilon }\right )\biggl [Q(t) - Q(0) - t{ dQ(0) \over dt} \biggr ] \\ &\qquad + \varepsilon {\psi }_{1}\left ( \frac{t} {\varepsilon }\right )[Q(t) - Q(0)]\end{array}$$

For the term involving ψ0(t ∕ ε), using a Taylor expansion on Q(t) yields that for some ξ ∈ (0, t)

$$\left\vert Q(t) - Q(0) - t{ dQ(0) \over dt} \right\vert = \left\vert { 1 \over 2} \left({ {d}^{2}Q(\xi ) \over d{t}^{2}} \right){t}^{2}\right\vert \leq K{t}^{2}.$$

Owing to the exponential decay property of ψ i ( ⋅), the fact that φ1( ⋅) is n-times continuously differentiable on [0, T], and the above estimate, we have

$$\vert {\mathcal{L}}^{\varepsilon }{e}_{ 1}^{\varepsilon }(t)\vert \leq K\left({\varepsilon }^{2} + (\varepsilon t + {t}^{2})\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right)\right).$$

Moreover, for any \(k = 0,1,2,\ldots,n + 1\), it is easy to see that

$${t}^{k}\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right) = {\varepsilon }^{k}\left( \frac{t} {\varepsilon }\right)^{k}\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right) \leq K{\varepsilon }^{k}.$$
(4.27)
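Indeed, with \(\tau = t/\varepsilon \), the left-hand side equals \({\varepsilon }^{k}{\tau }^{k}\exp (-{\kappa }_{0}\tau )\), and for k ≥ 1 an elementary maximization (the function \({\tau }^{k}\exp (-{\kappa }_{0}\tau )\) attains its maximum at τ = k ∕ κ0) gives

$$\sup\limits_{\tau \geq 0}{\tau }^{k}\exp (-{\kappa }_{0}\tau ) ={ \left({ k \over {\kappa }_{0}e} \right)}^{k} < \infty,$$

while for k = 0 the bound is trivial; thus the constant K in (4.27) may be taken as \({(k/({\kappa }_{0}e))}^{k}\).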

This implies \({\mathcal{L}}^{\varepsilon }{e}_{1}^{\varepsilon }(t) = O({\varepsilon }^{2})\) uniformly in t. Thus, e 1 ε(t) = O(ε) by virtue of Lemma  4.13 and the bound is uniform in t ∈ [0, T].

We now go back one step to show that the zeroth-order approximation also possesses the correct error estimate, that is, e 0 ε(t) = O(ε). Note that the desired order seems to be difficult to obtain directly, and as a result the back-tracking is employed.

Note that

$${e}_{1}^{\varepsilon }(t) = {e}_{ 0}^{\varepsilon }(t) - \varepsilon {\varphi }_{ 1}(t) - \varepsilon {\psi }_{1}\left ({ t \over \varepsilon } \right ).$$
(4.28)

However, the smoothness of φ1( ⋅) and the exponential decay of ψ1( ⋅) imply that

$$\varepsilon {\varphi }_{1}(t) + \varepsilon {\psi }_{1}\left ({ t \over \varepsilon } \right ) = O(\varepsilon )\quad \mbox{ uniformly in }t.$$
(4.29)

Thus e 0 ε(t) = O(ε) uniformly in t.

Proceeding analogously, we obtain

$$\begin{array}{ll} &{\mathcal{L}}^{\varepsilon } {e}_{ n+1}^{\varepsilon } \\ & = {\mathcal{L}}^{\varepsilon }\left ({p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t) -\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\psi }_{ i}\left ( \frac{t} {\varepsilon }\right )\right ) \\ & = -\varepsilon \left (\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{ d{\varphi }_{i}(t) \over dt} +\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{ d \over dt} {\psi }_{i}\left ( \frac{t} {\varepsilon }\right )\right ) \\ &\quad + \left (\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t) +\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\psi }_{ i}\left ( \frac{t} {\varepsilon }\right )\right )Q(t) \\ & = -{\varepsilon }^{n+2}{ d{\varphi }_{n+1}(t) \over dt} + \left [\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)Q(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i+1}{\varphi }_{ i+1}(t)Q(t)\right ] \\ &\quad +\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )Q(t) -\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}\left [{\psi }_{ i}\left ( \frac{t} {\varepsilon }\right )Q(0) + {r}_{i}\left ( \frac{t} {\varepsilon }\right )\right ].\end{array}$$
(4.30)

Note that the term in the fifth line above is

$$\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)Q(t) -\sum\limits_{i=1}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)Q(t) = {\varphi }_{0}(t)Q(t) = 0.$$

Using (4.19), we represent r i (t) in terms of (d i ∕ dt i)Q(0), etc. For the term involving ψ0(t ∕ ε), using a truncated Taylor expansion up to order (n + 1) for Q(t), by virtue of the Lipschitz continuity of \(({d}^{n+1}/d{t}^{n+1})Q(\cdot )\), there is a ξ ∈ (0, t) such that

$$\begin{array}{rl} {\biggl \vert Q(t) -\sum\limits_{i=0}^{n+1}{ {t}^{i} \over i!} { {d}^{i}Q(0) \over d{t}^{i}} \biggr \vert}& ={ 1 \over (n + 1)!} \left \vert {t}^{n+1}{ {d}^{n+1}Q(\xi ) \over d{t}^{n+1}} - {t}^{n+1}{ {d}^{n+1}Q(0) \over d{t}^{n+1}} \right \vert \\ & \leq K{t}^{n+1}\xi \leq K{t}^{n+2}.\end{array}$$

For all the other terms involving ψ i (t ∕ ε), for \(i = 1,\ldots,n + 1\) in (4.30), we proceed as in the calculation of \({\mathcal{L}}^{\varepsilon }{e}_{1}^{\varepsilon }\). As a result, the last two terms in (4.30) are bounded by

$${\psi }_{0}\left ( \frac{t} {\varepsilon }\right )O({t}^{n+2}) + \varepsilon {\psi }_{ 1}\left ( \frac{t} {\varepsilon }\right )O({t}^{n+1}) + \cdots + {\varepsilon }^{n+1}{\psi }_{ n+1}\left ( \frac{t} {\varepsilon }\right )O(t),$$

which in turn leads to the bound

$$K({t}^{n+2} + \varepsilon {t}^{n+1} + \cdots + {\varepsilon }^{n+1}t)\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right) \leq K{\varepsilon }^{n+2},$$

in accordance with (4.27). Moreover, it is clear that \({e}_{n+1}^{\varepsilon }(0) = 0\). In view of the fact that φ n + 1( ⋅) is continuously differentiable on [0, T] and Q( ⋅) is (n + 1)-times continuously differentiable on [0, T], by virtue of Lemma  4.13, we infer that \({e}_{n+1}^{\varepsilon }(t) = O({\varepsilon }^{n+1})\) uniformly in t. Since

$${e}_{n+1}^{\varepsilon }(t) = {e}_{ n}^{\varepsilon }(t) + O({\varepsilon }^{n+1}),$$

it must be that \({e}_{n}^{\varepsilon }(t) = O({\varepsilon }^{n+1})\). The proof of Proposition  4.14 is complete, and so is the proof of Theorem  4.5. □ 

Remark 4.15.

In the estimate given above, we actually obtained

$${\mathcal{L}}^{\varepsilon }{e}_{ k}^{\varepsilon }(t) = O\left({\varepsilon }^{k+1} + (\varepsilon {t}^{k} + \cdots + {\varepsilon }^{k}t)\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right)\right).$$
(4.31)

This observation will be useful when we consider the unbounded interval [0,∞).

The findings reported are very useful for further study of the limit behavior of the corresponding Markov chain problems of central limit type, which will be discussed in the next chapter. In many applications, a system is governed by a Markov chain, which consists of both slow and fast motions. An immediate question is this: Can we still develop an asymptotic series expansion? This question will be dealt with in Section 4.3.

Suppose that in lieu of (A4.2), we assume that Q( ⋅) is piecewise (n + 1)-times continuously differentiable on [0, T], and \(({d}^{n+1}/d{t}^{n+1})Q(\cdot )\) is piecewise Lipschitz, that is, there is a partition of [0, T], namely,

$${t}_{0} = 0 < {t}_{1} < {t}_{2} < \cdots < {t}_{k} = T$$

such that Q( ⋅) is (n + 1)-times continuously differentiable and \(({d}^{n+1}/d{t}^{n+1})\) Q( ⋅) is Lipschitz on each subinterval [t i , t i + 1). Then the result obtained still holds. In this case, in addition to the initial layers, one also has a finite number of inner-boundary layers. In each interval \([{t}_{i},{t}_{i+1} - \eta ]\) for η > 0, the expansion is similar to that presented in Theorem  4.5.

2.6 Examples

As a further illustration, we consider two examples in this section. The first example is concerned with a stationary Markov chain, i.e., Q(t) = Q is a constant matrix. The second example deals with an analytically solvable case for a two-state Markov chain with nonstationary transition probabilities. Although they are simple, these examples give us insight into the asymptotic behavior of the underlying systems.

Example 4.16.

Let αε(t) be an m-state Markov chain with a constant generator Q(t) = Q that is irreducible. This is an analytically solvable case, with

$${p}^{\varepsilon }(t) = {p}^{0}\exp \left ({ Qt \over \varepsilon } \right ).$$

Using the technique of asymptotic expansion, we obtain

$$\begin{array}{rl} &{\varphi }_{0}(t) + {\psi }_{0}\left ({ t \over \varepsilon } \right ) = {\varphi }_{0} + ({p}^{0} - {\varphi }_{ 0})\exp \left ({ Qt \over \varepsilon } \right ), \\ &\mbox{ with }\exp \left ({ Qt \over \varepsilon } \right ) \rightarrow \overline{P},\mbox{ as }\varepsilon \rightarrow 0, \end{array}$$

where

$$\begin{array}{rl} &{\varphi }_{0}(t) = ({\nu }_{1},\ldots,{\nu }_{m})\mbox{ and }\overline{P} = \mathrm{1}\mathrm{l}{\varphi }_{0}\end{array}$$

Note that \({p}^{0}\overline{P} = {\varphi }_{0}\), and hence

$$({p}^{0} - {\varphi }_{ 0})\exp \left ({ Qt \over \varepsilon } \right ) = ({p}^{0} - {\varphi }_{ 0})\left [\exp \left ({ Qt \over \varepsilon } \right ) -\overline{P}\right ].$$

Moreover,

$${\varphi }_{i}(t) \equiv 0,\ \ {\psi }_{i}\left ({ t \over \varepsilon } \right ) \equiv 0\ \mbox{ for }i \geq 1.$$

In this case, \({\varphi }_{0}(t) \equiv {\varphi }_{0}\), a constant vector, which is the equilibrium distribution of Q; the series terminates. Moreover, the solution consists of two terms, one of them the equilibrium distribution (the zeroth-order approximation) and the other the zeroth-order initial-layer correction term. Since φ0 is the quasi-stationary distribution,

$${\varphi }_{0}Q = 0\ \mbox{ and }\ {\varphi }_{0}\exp \left ({ Qt \over \varepsilon } \right ) = {\varphi }_{0}.$$

Hence the analytic solution and the asymptotic expansion coincide.
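Numerically, the coincidence is easy to observe. A sketch (assuming NumPy and SciPy; the two-state Q is an illustrative choice of ours):

```python
# Example 4.16 in code: for a constant irreducible Q, the analytic
# solution p0 exp(Qt/eps) and phi_0 + psi_0(t/eps) agree identically.
import numpy as np
from scipy.linalg import expm, null_space

Q   = np.array([[-1.0, 1.0], [3.0, -3.0]])
eps = 0.05
p0  = np.array([1.0, 0.0])

nu = null_space(Q.T)[:, 0]
nu = nu / nu.sum()                               # stationary distribution phi_0

for t in [0.0, 0.01, 0.1, 1.0]:
    analytic  = p0 @ expm(Q * t / eps)
    expansion = nu + (p0 - nu) @ expm(Q * t / eps)   # phi_0 + psi_0(t/eps)
    print(t, np.abs(analytic - expansion).max())     # zero up to round-off
```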

In particular, let Q be a two-dimensional matrix, i.e.,

$$Q = \left (\begin{array}{*{10}c} -\lambda & \lambda \\ \mu &-\mu \\ \end{array} \right ).$$

Then setting

$${y}_{0}^{\varepsilon }(t) = {\varphi }_{ 0}(t) + {\psi }_{0}(t/\varepsilon ),$$

we have

$$\begin{array}{rl} &{p}_{1}^{\varepsilon }(t) = {y}_{ 0,1}^{\varepsilon }(t) ={ \mu \over \lambda + \mu } + \left ({p}_{1}^{0} -{ \mu \over \lambda + \mu } \right )\exp \left (-\frac{(\lambda + \mu )t} {\varepsilon } \right ), \\ &{p}_{2}^{\varepsilon }(t) = {y}_{ 0,2}^{\varepsilon }(t) ={ \lambda \over \lambda + \mu } + \left ({p}_{2}^{0} -{ \lambda \over \lambda + \mu } \right )\exp \left (-\frac{(\lambda + \mu )t} {\varepsilon } \right )\end{array}$$

Therefore,

$$\begin{array}{rl} &{\varphi }_{0}(t) = \left ({ \mu \over \lambda + \mu },{ \lambda \over \lambda + \mu } \right ), \\ &{\psi }_{0}\left ({ t \over \varepsilon } \right ) = \left (\left ({p}_{1}^{0} -{ \mu \over \lambda + \mu } \right ),\left ({p}_{2}^{0} -{ \lambda \over \lambda + \mu } \right )\right )\exp \left (-{ (\lambda + \mu )t \over \varepsilon } \right ), \\ &{\varphi }_{i}(t) \equiv 0\mbox{ and }{\psi }_{i}\left ({ t \over \varepsilon } \right ) \equiv 0\quad \mbox{ for }i \geq 1.\end{array}$$

Example 4.17.

Consider a two-state Markov chain with generator

$$Q(t) = \left (\begin{array}{*{10}c} -\lambda (t)& \lambda (t)\\ \mu (t) &-\mu (t) \\ \end{array} \right )$$

where λ(t) ≥ 0, μ(t) ≥ 0 and λ(t) + μ(t) > 0 for each t ∈ [0,T]. Therefore Q(⋅) is weakly irreducible. For the following discussion, assume Q(⋅) to be sufficiently smooth. Although it is time-varying, a closed-form solution is obtainable. Since \({p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) = 1\) for each t, (4.3) can be solved explicitly and the solution is given by

$$\begin{array}{rl} &{p}_{1}^{\varepsilon }(t) = {p}_{ 1}^{0}\exp \left (-{ 1 \over \varepsilon } {\int }_{0}^{t}(\lambda (s) + \mu (s))ds\right ) \\ &\qquad \qquad +{ \int }_{0}^{t}{ \mu (u) \over \varepsilon } \exp \left (-{ 1 \over \varepsilon } {\int }_{u}^{t}(\lambda (s) + \mu (s))ds\right )du, \\ &{p}_{2}^{\varepsilon }(t) = {p}_{ 2}^{0}\exp \left (-{ 1 \over \varepsilon } {\int }_{0}^{t}(\lambda (s) + \mu (s))ds\right ) \\ &\qquad \qquad +{ \int }_{0}^{t}{ \lambda (u) \over \varepsilon } \exp \left (-{ 1 \over \varepsilon } {\int }_{u}^{t}(\lambda (s) + \mu (s))ds\right )du\end{array}$$

Following the approach in the previous sections, we construct the first few terms in the asymptotic expansion. By considering (4.13) together with (4.2), a system of the form

$$\begin{array}{rl} &\lambda (t){\varphi }_{0}^{1}(t) - \mu (t){\varphi }_{ 0}^{2}(t) = 0, \\ &{\varphi }_{0}^{1}(t) + {\varphi }_{ 0}^{2}(t) = 1\end{array}$$

is obtained. The solution of the system yields that

$${\varphi }_{0}(t) = \left ({ \mu (t) \over \lambda (t) + \mu (t)},{ \lambda (t) \over \lambda (t) + \mu (t)} \right ).$$

To find φ1( ⋅), consider

$$\begin{array}{rl} &\lambda (t){\varphi }_{1}^{1}(t) - \mu (t){\varphi }_{ 1}^{2}(t) ={ \dot{\lambda }(t)\mu (t) -\dot{ \mu }(t)\lambda (t) \over {(\lambda (t) + \mu (t))}^{2}}, \\ &{\varphi }_{1}^{1}(t) + {\varphi }_{ 1}^{2}(t) = 0, \end{array}$$

where \(\dot{\lambda } = (d/dt)\lambda \) and \(\dot{\mu } = (d/dt)\mu \). Solving this system of equations gives us

$${\varphi }_{1}(t) =\left({ \dot{\lambda }(t)\mu (t) -\dot{ \mu }(t)\lambda (t) \over {(\lambda (t) + \mu (t))}^{3}},{ \lambda (t)\dot{\mu }(t) - \mu (t)\dot{\lambda }(t) \over {(\lambda (t) + \mu (t))}^{3}} \right).$$
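The closed form can be double-checked against a direct solve of the system above. In the sketch below, the rates λ(t) = 1 + 0.5 sin t and μ(t) = 2 + cos t are illustrative choices of ours:

```python
# Check of phi_1 in Example 4.17: closed form vs. the linear system
# lam*x1 - mu*x2 = (dlam*mu - dmu*lam)/(lam+mu)^2,  x1 + x2 = 0.
import numpy as np

lam, dlam = lambda t: 1.0 + 0.5 * np.sin(t), lambda t: 0.5 * np.cos(t)
mu,  dmu  = lambda t: 2.0 + np.cos(t),       lambda t: -np.sin(t)

def phi1_closed(t):
    num = dlam(t) * mu(t) - dmu(t) * lam(t)
    return np.array([num, -num]) / (lam(t) + mu(t)) ** 3

def phi1_solved(t):
    rhs = (dlam(t) * mu(t) - dmu(t) * lam(t)) / (lam(t) + mu(t)) ** 2
    M = np.array([[lam(t), -mu(t)], [1.0, 1.0]])
    return np.linalg.solve(M, np.array([rhs, 0.0]))

for t in [0.0, 1.0, 2.0]:
    print(t, phi1_closed(t), phi1_solved(t))   # the two agree
```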

To get the inner expansion, consider the differential equation

$$\begin{array}{rl} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )Q(0), \\ &{\psi }_{0}(0) = {p}^{0} - {\varphi }_{ 0}(0),\end{array}$$

with \(\tau = t/\varepsilon \). We obtain

$${\psi }_{0}(\tau ) = ({p}^{0} - {\varphi }_{ 0}(0))\exp (Q(0)\tau ),$$

where

$$\begin{array}{rl} &\exp \left (Q(0)\tau \right ) ={ 1 \over \lambda (0) + \mu (0)} \\ &\qquad \qquad \times \left (\begin{array}{cc} \mu (0) + \lambda (0){e}^{-(\lambda (0)+\mu (0))\tau }&\lambda (0) - \lambda (0){e}^{-(\lambda (0)+\mu (0))\tau } \\ \mu (0) - \mu (0){e}^{-(\lambda (0)+\mu (0))\tau }&\lambda (0) + \mu (0){e}^{-(\lambda (0)+\mu (0))\tau }\\ \end{array} \right )\end{array}$$

Similarly ψ1( ⋅) can be obtained from (4.21) with the exponential matrix given above.

It is interesting to note that either λ(t) or μ(t) can be equal to 0 for some t as long as λ(t) + μ(t) > 0. For example, if we take μ( ⋅) to be the repair rate of a machine in a manufacturing model, then μ(t) = 0 corresponds to the repair workers taking breaks or waiting for parts on order to arrive. The minors of Q(t) are λ(t), − λ(t), μ(t), and − μ(t). As long as not all of them are zero at the same time, the weak irreducibility condition will be met.
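To see Theorem 4.5 (with n = 0) and Corollary 4.7 at work on this example, one can integrate (4.3) directly and compare with φ0 + ψ0. A sketch (assuming SciPy; the rates are the illustrative ones used above):

```python
# sup_t |p^eps(t) - phi_0(t) - psi_0(t/eps)| shrinks roughly like O(eps).
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

lam = lambda t: 1.0 + 0.5 * np.sin(t)
mu  = lambda t: 2.0 + np.cos(t)
Q   = lambda t: np.array([[-lam(t), lam(t)], [mu(t), -mu(t)]])

p0   = np.array([1.0, 0.0])
phi0 = lambda t: np.array([mu(t), lam(t)]) / (lam(t) + mu(t))

for eps in [0.1, 0.05, 0.01]:
    sol = solve_ivp(lambda t, p: p @ Q(t) / eps, (0.0, 1.0), p0,
                    method="Radau", rtol=1e-10, atol=1e-12, dense_output=True)
    err = max(np.abs(sol.sol(t) - phi0(t)
                     - (p0 - phi0(0.0)) @ expm(Q(0.0) * t / eps)).max()
              for t in np.linspace(0.0, 1.0, 400))
    print(eps, err)   # error decreases roughly linearly in eps
```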

2.7 Two-Time-Scale Expansion

The asymptotic expansion derived in the preceding sections is separable in the sense that it is the sum of a regular part and initial-layer corrections. Naturally, one is interested in the relationship between such an expansion and the so-called two-time-scale expansion (see, for example, Smith [199]). To answer this question, we first obtain the two-time-scale asymptotic expansion for the forward equation (4.3), then explore the relationship between the two expansions, and conclude with a comparison of the two methods.

Two-Time-Scale Expansion. Following the literature on asymptotic expansion (e.g., Kevorkian and Cole [108, 109] and Smith [199] among others), consider two scales t and \(\tau = t/\varepsilon \), both as “times.” One of them is in a normal time scale and the other is a stretched one. We seek asymptotic expansions of the form

$${y}^{\varepsilon }(t,\tau ) =\sum\limits_{i=0}^{n}{\varepsilon }^{i}{y}_{ i}(t,\tau ),$$
(4.32)

where \(\{{y}_{i}(t,\tau )\}_{i=0}^{n}\) is a sequence of two-time-scale functions. Treating t and τ as independent variables, one has

$${ d \over dt} ={ \partial \over \partial t} +{ 1 \over \varepsilon } { \partial \over \partial \tau }.$$
(4.33)

Formally substituting (4.32) into (4.3) and equating coefficients of like powers of ε results in

$$\begin{array}{ll} &{ \partial {y}_{0}(t,\tau ) \over \partial \tau } = {y}_{0}(t,\tau )Q(t), \\ &{ \partial {y}_{1}(t,\tau ) \over \partial \tau } = {y}_{1}(t,\tau )Q(t) -{ \partial {y}_{0}(t,\tau ) \over \partial t}, \\ &\qquad \cdots \\ &{ \partial {y}_{i}(t,\tau ) \over \partial \tau } = {y}_{i}(t,\tau )Q(t) -{ \partial {y}_{i-1}(t,\tau ) \over \partial t},\ \ 1 \leq i \leq n.\end{array}$$
(4.34)

The initial conditions are

$$\begin{array}{ll} &{y}_{0}(t,0) = {p}^{0}\ \mbox{ and } \\ &{y}_{i}(t,0) = 0,\ \mbox{ for }1 \leq i \leq n.\end{array}$$
(4.35)

Holding t constant and solving the first equation in (4.34) (with the first equation in (4.35) as the initial condition) yields

$${y}_{0}(t,\tau ) = {p}^{0}\exp (Q(t)\tau ).$$
(4.36)

By virtue of (A4.2), (∂ ∕ ∂t)y 0(t, τ) exists and

$${ \partial {y}_{0}(t,\tau ) \over \partial t} = {p}^{0}\exp (Q(t)\tau )\left({ dQ(t) \over dt} \right)\tau.$$

As a result, (∂ ∕ ∂t)y 0(t, τ) is orthogonal to 1l. We continue the procedure recursively. It can be verified that for 1 ≤ i ≤ n,

$${y}_{i}(t,\tau ) = -{\int }_{0}^{\tau }{ \partial {y}_{i-1}(t,s) \over \partial t} \exp (Q(t)(\tau - s))ds.$$
(4.37)

Furthermore, for i = 1,…,n, (∂ ∕ ∂t)y i (t, τ) exists, is continuous, and is orthogonal to 1l. It should be emphasized that in the equations above, t is viewed as being “frozen,” and as a consequence, Q(t) is a “constant” matrix.
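A sketch of this recursion (illustrative rates as before, assuming SciPy; the t-derivative of y0 is approximated by a central difference for simplicity):

```python
# Two-time-scale terms (4.36)-(4.37) with t frozen:
# y_0(t,tau) = p0 exp(Q(t) tau), and y_1 by quadrature against exp(Q(t)(tau-s)).
import numpy as np
from scipy.linalg import expm

lam = lambda t: 1.0 + 0.5 * np.sin(t)
mu  = lambda t: 2.0 + np.cos(t)
Q   = lambda t: np.array([[-lam(t), lam(t)], [mu(t), -mu(t)]])
p0  = np.array([1.0, 0.0])

def y0(t, tau):
    return p0 @ expm(Q(t) * tau)

def dy0_dt(t, tau, h=1e-6):
    return (y0(t + h, tau) - y0(t - h, tau)) / (2.0 * h)   # numerical d/dt

def y1(t, tau, n=300):
    s = np.linspace(0.0, tau, n)
    vals = np.array([dy0_dt(t, si) @ expm(Q(t) * (tau - si)) for si in s])
    return -np.trapz(vals, s, axis=0)

print(y0(1.0, 5.0), y1(1.0, 5.0))
```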

Parallel to the previous development, one can show that for all 1 ≤ i ≤ n,

$$\vert {y}_{i}(t,\tau )\vert \leq K(t)\exp (-{\kappa }_{0}(t)\tau ).$$

Compared with the separable expansions presented before, note the t-dependence of K( ⋅) and κ0( ⋅) above. Furthermore, the asymptotic series can be validated just as before. We summarize this in the following theorem.

Theorem 4.18.

Under the conditions of Theorem  4.5 , a sequence of functions \(\{{y}_{i}(t,\tau )\}_{i=0}^{n}\), with \(\tau = t/\varepsilon \), can be constructed so that

$$ \sup\limits_{t\in [0,T]}{\biggl |{p}^{\varepsilon }(t) -\sum\limits_{i=0}^{n}{\varepsilon }^{i}{y}_{ i}(t,\tau )\biggr |} = O({\varepsilon }^{n+1}).$$

Example 4.19.

We return to Example  4.16 . It is readily verified that the zeroth-order two-time-scale expansion coincides with that of the analytic solution, in fact, with

$${y}_{0}(t,\tau ) = {p}^{0}\exp \left ({ Qt \over \varepsilon } \right )\mbox{ and }{y}_{i}(t,\tau ) \equiv 0\mbox{ for all }i \geq 1.$$

Relationship between the Two Methods. Now we have two different asymptotic expansions. Do they in some sense produce similar asymptotic results? Note that each term in y i (t, τ) contains the regular part φ i (t) as well as the initial-layer corrections. Examining the zeroth-order approximation leads to

$$\exp (Q(t)\tau ) \rightarrow \overline{P}(t)\mbox{ as }\tau \rightarrow \infty $$

via the same argument employed in the proof of Lemma  4.4. The matrix has identical rows, and is given by \(\overline{P}(t) = \mathrm{1}\mathrm{l}\nu (t)\). In fact, owing to \({p}^{0}\mathrm{1}\mathrm{l} =\sum\limits_{i=1}^{m}{p}_{i}^{0} = 1\), we have

$${y}_{0}(t,\tau ) = \nu (t) + {p}^{0}\left (\exp (Q(t)\tau ) -\overline{P}(t)\right ) = \nu (t) +\widetilde{ {y}}_{ 0}(t,\tau ),$$
(4.38)

where \(\widetilde{{y}}_{0}(t,\tau )\) decays exponentially fast as τ → ∞ for t < τ.
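The limit \(\exp (Q(t)\tau ) \rightarrow \overline{P}(t)\) and the exponential decay of \(\widetilde{{y}}_{0}\) in (4.38) are easy to observe numerically; the sketch below freezes a hypothetical two-state Q(t) and prints the norm of the decaying part.

```python
# Sketch: exp(Q tau) -> 1l nu, and the remainder decays roughly like
# exp(-(lam + mu) tau); the frozen rates are illustrative assumptions.
import numpy as np
from scipy.linalg import expm

lam, mu = 2.0, 1.0
Qt = np.array([[-lam, lam], [mu, -mu]])
nu = np.array([mu, lam]) / (lam + mu)   # quasi-stationary row: nu Qt = 0
P_bar = np.outer(np.ones(2), nu)        # 1l nu: identical rows

p0 = np.array([0.9, 0.1])
for tau in [0.5, 1.0, 2.0, 4.0]:
    y0_tilde = p0 @ (expm(Qt * tau) - P_bar)   # the decaying part of (4.38)
    print(tau, np.linalg.norm(y0_tilde))
```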

In view of (4.38), the two methods produce the same limit as τ → ∞, namely, the quasi-stationary distribution. To explore further, we study a special case (a two-state Markov chain) so as to keep the notation simple. Consider the two-state Markov chain model Example  4.17. In view of (4.38) and the formulas in Example  4.17, we have

$${y}_{0}(t,\tau ) = \nu (t) +\widetilde{ {y}}_{0}(t,\tau ) = {\varphi }_{0}(t) +\widetilde{ {y}}_{0}(t,\tau ).$$

Owing to (4.37), direct calculation yields that

$$\begin{array}{rl} {y}_{1}(t,\tau )& = -{\int }_{0}^{\tau }{ d{\varphi }_{0}(t) \over dt} \exp (Q(t)(\tau - s))ds \\ &\qquad -{\int }_{0}^{\tau }{ \partial \widetilde{{y}}_{0}(t,s) \over \partial t} \exp (Q(t)(\tau - s))ds.\end{array}$$

It can be verified that the second term on the right-hand side of the equal sign above decays exponentially fast, while the first term yields φ1(t) plus another term tending to 0 exponentially fast as τ → ∞. Using the result of Example  4.17 yields

$$\begin{array}{rl} & -{\int }_{0}^{\tau }{ d{\varphi }_{0}(t) \over dt} \exp (Q(t)(\tau - s))ds \\ &\qquad ={ d{\varphi }_{0}(t) \over dt} \left({ 1 -\exp (-(\lambda (t) + \mu (t))\tau ) \over \lambda (t) + \mu (t)} \right)\left (\begin{array}{*{10}c} \lambda (t) &-\lambda (t)\\ -\mu (t) & \mu (t) \\ \end{array} \right ) \\ &\qquad = {\varphi }_{1}(t) -{ d{\varphi }_{0}(t) \over dt} \left({ \exp (-(\lambda (t) + \mu (t))\tau ) \over \lambda (t) + \mu (t)} \right)\left (\begin{array}{*{10}c} \lambda (t) &-\lambda (t)\\ -\mu (t) & \mu (t) \\ \end{array} \right )\end{array}$$

Thus, it follows that

$${y}_{1}(t,\tau ) = {\varphi }_{1}(t) +\widetilde{ {y}}_{1}(t,\tau ),$$

where

$$\begin{array}{rl} \widetilde{{y}}_{1}(t,\tau )& = -{\int }_{0}^{\tau }{ \partial \widetilde{{y}}_{0}(t,s) \over \partial t} \exp (Q(t)(\tau - s))ds \\ &\qquad -{ d{\varphi }_{0}(t) \over dt} \left({ \exp (-(\lambda (t) + \mu (t))\tau ) \over \lambda (t) + \mu (t)} \right)\left (\begin{array}{*{10}c} \lambda (t) &-\lambda (t)\\ -\mu (t) & \mu (t) \end{array} \right ).\end{array}$$

Similarly, we can obtain

$${y}_{i}(t,\tau ) = {\varphi }_{i}(t) +\widetilde{ {y}}_{i}(t,\tau ),\mbox{ for }1 \leq i \leq n,$$

where \(\widetilde{{y}}_{i}(t,\tau )\) decay exponentially fast as τ → ∞ for all t < τ. This establishes the connection between these two different expansions.

Comparison and Additional Remark. A moment of reflection reveals that:

    • The conditions required to obtain the asymptotic expansions are the same.

    • Except for the actual forms, there is no significant difference between these two methods.

    • No matter which method is employed, in one way or another the results for stationary Markov chains are used. In the separable expansion, this is accomplished by using Q(0), and in the two-time-scale expansion, this is carried out by holding t constant and hence treating Q(t) as a constant matrix.

    • Although the two-time-scale expansion admits a seemingly more general form, the separable expansion is more transparent as far as the quasi-stationary distribution is concerned.

    • When a more complex problem, for example the case of weak and strong interactions, is encountered, the separable expansion becomes more advantageous.

    • To study asymptotic normality, etc., in the sequel, the separable expansion will prove to be more convenient than the two-time-scale expansion.

In view of the items mentioned above, we choose to use the separable form of the expansion throughout.

3 Markov Chains with Multiple Weakly Irreducible Classes

This section presents the asymptotic expansions of two-time-scale Markov chains with slow and fast components subject to weak and strong interactions. We assume that all the states of the Markov chain are recurrent. In contrast to Section 4.2, the states belong to multiple weakly irreducible classes. As was mentioned in the introductory chapter, such time-scale separation stems from various applications in production planning, queueing networks, random fatigue, system reliability, competing risk theory, control and optimization of large-scale dynamical systems, and related fields. The underlying models, in which some components change very rapidly whereas others vary relatively slowly, are more complex than those of Section 4.2. The weak and strong interactions of the systems are modeled by assuming the generator of the underlying Markov chain to be of the form

$${Q}^{\varepsilon }(t) = \frac{1} {\varepsilon }\widetilde{Q}(t) +\widehat{ Q}(t),$$
(4.39)

where \(\widetilde{Q}(t)\) governs the rapidly changing part and \(\widehat{Q}(t)\) describes the slowly changing components. They have the appropriate forms to be mentioned in the sequel.

This section extends the results in Section 4.2 to incorporate the cases in which the generator \(\widetilde{Q}(t)\) is not irreducible. Our study focuses on the forward equation, similar to (4.3); now the forward equation takes the form

$$\begin{array}{ll} &{ d{p}^{\varepsilon }(t) \over dt} = {p}^{\varepsilon }(t)\left (\frac{1} {\varepsilon }\widetilde{Q}(t) +\widehat{ Q}(t)\right ),\quad {p}^{\varepsilon }(0) = {p}^{0} \end{array}$$
(4.40)

such that

$${p}_{i}^{0} \geq 0\mbox{ for each }i\mbox{ and }\sum\limits_{i=1}^{m}{p}_{ i}^{0} = 1.$$

To illustrate, we present a simple example below.

Example 4.20.

Consider a two-machine flowshop with machines that are subject to breakdown and repair. The production capacity of the machines is described by a finite-state Markov chain. If the machine is up, then it can produce parts with production rate u(t); its production rate is zero if the machine is under repair. For simplicity, suppose each of the machines is either in operating condition (denoted by 1) or under repair (denoted by 0). Then the capacity of the workshop becomes a four-state Markov chain with state space {(1,1),(0,1),(1,0),(0,0)}. Suppose that the first machine breaks down much more often than the second one. To reflect this situation, consider a Markov chain αε(⋅) generated by Qε(t) in (4.39), with \(\widetilde{Q}(\cdot )\) and \(\widehat{Q}(\cdot )\) given by

$$\widetilde{Q}(t) = \left (\begin{array}{*{10}c} -{\lambda }_{1}(t)& {\lambda }_{1}(t) & 0 & 0 \\ {\mu }_{1}(t) &-{\mu }_{1}(t)& 0 & 0 \\ 0 & 0 &-{\lambda }_{1}(t)& {\lambda }_{1}(t) \\ 0 & 0 & {\mu }_{1}(t) &-{\mu }_{1}(t)\\ \end{array} \right )$$

and

$$\widehat{Q}(t) = \left (\begin{array}{*{10}c} -{\lambda }_{2}(t)& 0 & {\lambda }_{2}(t) & 0 \\ 0 &-{\lambda }_{2}(t)& 0 & {\lambda }_{2}(t) \\ {\mu }_{2}(t) & 0 &-{\mu }_{2}(t)& 0 \\ 0 & {\mu }_{2}(t) & 0 &-{\mu }_{2}(t)\\ \end{array} \right ),$$

where λi(⋅) and μi(⋅) are the rates of breakdown and repair, respectively. The matrices \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\) are themselves generators of Markov chains. Note that

$$\widetilde{Q}(t) = \mathrm{diag}\left (\left (\begin{array}{cc} - {\lambda }_{1}(t)& {\lambda }_{1}(t) \\ {\mu }_{1}(t) & - {\mu }_{1}(t)\\ \end{array} \right ),\left (\begin{array}{cc} - {\lambda }_{1}(t)& {\lambda }_{1}(t) \\ {\mu }_{1}(t) & - {\mu }_{1}(t)\\ \end{array} \right )\right )$$

is a block-diagonal matrix, representing the fast motion, and \(\widehat{Q}(t)\) governs the slow components. In order to obtain any meaningful results for controlling and optimizing the performance of the underlying systems, the foremost task is to determine the asymptotic behavior (as ε → 0) of the probability distribution of the underlying chain.

In this example, a first glance reveals that \(\widetilde{Q}(t)\) is reducible, hence the results in Section 4.2 are not applicable. However, closer scrutiny indicates that \(\widetilde{Q}(t)\) consists of two irreducible submatrices. One expects that the asymptotic expansions may still be established. Our main objective is to develop asymptotic expansions of such systems and their variants. The corresponding procedure is, however, much more involved compared with the irreducible cases.
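Before developing the general theory, it may help to see the stiffness concretely. The sketch below assembles \({Q}^{\varepsilon }(t)\) of (4.39) for this flowshop with constant rates (an assumption made only for brevity) and integrates the forward equation (4.40) with an implicit solver.

```python
# Sketch: forward equation (4.40) for Example 4.20 with constant,
# illustrative rates; Radau handles the stiffness as eps -> 0.
import numpy as np
from scipy.integrate import solve_ivp

l1, m1 = 3.0, 2.0      # fast machine: breakdown/repair rates (illustrative)
l2, m2 = 0.8, 0.5      # slow machine
eps = 1e-2

Qtilde = np.array([[-l1, l1, 0, 0], [m1, -m1, 0, 0],
                   [0, 0, -l1, l1], [0, 0, m1, -m1]], dtype=float)
Qhat = np.array([[-l2, 0, l2, 0], [0, -l2, 0, l2],
                 [m2, 0, -m2, 0], [0, m2, 0, -m2]], dtype=float)

def rhs(t, p):
    # row-vector ODE dp/dt = p (Qtilde/eps + Qhat)
    return p @ (Qtilde / eps + Qhat)

p0 = np.array([1.0, 0.0, 0.0, 0.0])
sol = solve_ivp(rhs, (0.0, 2.0), p0, method="Radau", rtol=1e-8, atol=1e-10)
print(sol.y[:, -1], sol.y[:, -1].sum())   # a probability vector at t = 2
```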

Examining (4.39), it is seen that the asymptotic properties of the underlying Markov chains largely depend on the structure of the matrix \(\widetilde{Q}(t)\). In accordance with the classification of states, we may consider three different cases: the chains with recurrent states only, the inclusion of absorbing states, and the inclusion of transient states. We treat the recurrent-state cases in this section, and then extend the results to notationally more involved cases including absorbing states and transient states in the following two sections.

Suppose αε( ⋅) is a finite-state Markov chain with generator given by (4.39), where both \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\) are generators of appropriate Markov chains. In view of the results in Section 4.2, it is intuitively clear that the structure of the generator \(\widetilde{Q}(t)\) governs the fast-changing part of the Markov chain. As mentioned in the previous section, our study of the finite-state-space cases is naturally divided into the recurrent cases, the inclusion of absorbing states, and the inclusion of transient states of the generator \(\widetilde{Q}(t)\). In accordance with classical results (see Chung [31] and Karlin and Taylor [105, 106]), one can always decompose the states of a finite-state Markov chain into recurrent (including absorbing) and transient classes. Inspired by Seneta’s approach to nonnegative matrices (see Seneta [189]), we aim to put the matrix \(\widetilde{Q}(t)\) into some sort of “canonical” form so that a systematic study can be carried out. In a finite-state Markov chain, not all states are transient, and it has at least one recurrent state. Similar to the argument of Iosifescu [95, p. 94] (see also Goodman [75], Karlin and McGregor [104], Keilson [107] among others), if there are no transient states, then after suitable permutations and rearrangements (i.e., by appropriately relabeling the states), \(\widetilde{Q}(t)\) can be put into the block-diagonal form

$$\begin{array}{ll} \widetilde{Q}(t)& = \left (\begin{array}{*{10}c} \widetilde{{Q}}^{1}(t)& && \\ &\widetilde{{Q}}^{2}(t)&&\\ &&\ddots& \\ & &&\widetilde{{Q}}^{l}(t)\\ \end{array} \right ) \\ & = \mathrm{diag}\left (\widetilde{{Q}}^{1}(t),\ldots,\widetilde{{Q}}^{l}(t)\right ), \end{array}$$
(4.41)

where \(\widetilde{{Q}}^{k}(t) \in {\mathbb{R}}^{{m}_{k}\times {m}_{k}}\) are weakly irreducible, for k = 1, 2, …,  l, and \(\sum\limits_{k=1}^{l}{m}_{k} = m\). Here and hereinafter, \(\widetilde{{Q}}^{k}(t)\) (a superscript without parentheses) denotes the kth block matrix in \(\widetilde{Q}(t)\). The rest of this section deals with the generator \({Q}^{\varepsilon }(t)\) given by (4.39) with \(\widetilde{Q}(t)\) taking the form (4.41). Note that an example of the recurrent case is that of the irreducible (or weakly irreducible) generators treated in Section 4.2.

Let \({\mathcal{M}}_{k} =\{ {s}_{k1},\ldots,{s}_{k{m}_{k}}\}\) for k = 1, …,  l denote the states corresponding to \(\widetilde{{Q}}^{k}(t)\) and let \(\mathcal{M}\) denote the state space of the underlying chains given by

$$\begin{array}{rl} \mathcal{M}& = {\mathcal{M}}_{1} \cup \cdots \cup {\mathcal{M}}_{l} \\ & = \{{s}_{11},\ldots,{s}_{1{m}_{1}},\ldots,{s}_{l1},\ldots,{s}_{l{m}_{l}}\}\end{array}$$

Since \(\widetilde{{Q}}^{k}(t) = {(\widetilde{{q}}_{ij}^{k}(t))}_{{m}_{k}\times {m}_{k}}\) and \(\widehat{Q}(t) = {(\widehat{{q}}_{ij}(t))}_{m\times m}\) are generators, for k = 1, 2, …, l, we have

$$\begin{array}{rl} &\sum\limits_{j=1}^{{m}_{k} }\widetilde{{q}}_{ij}^{k}(t) = 0,\ \mbox{ for }i = 1,\ldots,{m}_{ k},\ \mbox{ and } \\ &\sum\limits_{j=1}^{m}\widehat{{q}}_{ ij}(t) = 0,\ \mbox{ for }i = 1,\ldots,m\end{array}$$

The slow and fast components are coupled through weak and strong interactions in the sense that the underlying Markov chain fluctuates rapidly within a single group \({\mathcal{M}}_{k}\) and jumps less frequently between groups \({\mathcal{M}}_{k}\) and \({\mathcal{M}}_{j}\) for k ≠ j. The states in \({\mathcal{M}}_{k},\) k = 1, …,  l, are not isolated or independent of each other. More precisely, if we consider the states in \({\mathcal{M}}_{k}\) as a single “state,” then these “states” are coupled through the matrix \(\widehat{Q}(t)\), and transitions from \({\mathcal{M}}_{k}\) to \({\mathcal{M}}_{j}\), k ≠ j, are possible. In fact, \(\widehat{Q}(\cdot )\), together with the quasi-stationary distributions of \(\widetilde{{Q}}^{k}(t)\), determines the transition rates among these “aggregated states,” for k = 1, …, l.

Consider the forward equation (4.40). Our goal here is to develop an asymptotic series for the solution p ε( ⋅) of (4.40). Working with the interval [0, T] for some T < ∞, we will need the following conditions:

    • (A4.3) For each t ∈ [0, T] and k = 1, 2, …,  l, \(\widetilde{{Q}}^{k}(t)\) is weakly irreducible.

    • (A4.4) For some positive integer n, \(\widetilde{Q}(\cdot )\) and \(\widehat{Q}(\cdot )\) are ( n + 1)-times and n-times continuously differentiable on [0,  T], respectively. Moreover, \(({d}^{n+1}/d{t}^{n+1})\widetilde{Q}(\cdot )\) and \(({d}^{n}/d{t}^{n})\widehat{Q}(\cdot )\) are Lipschitz on [0, T].

Compared with the irreducible models in Section 4.2, the main difficulty in this chapter lies in the interactions among different blocks. In constructing the expansion in Section 4.2, for i = 1, …,  n, the two sets of functions \(\{{\varphi }_{i}(\cdot )\}\) and \(\{{\psi }_{i}(\cdot )\}\) are obtained independently except for the initial conditions \({\psi }_{i}(0) = -{\varphi }_{i}(0)\). For Markov chains with weak and strong interactions, \({\varphi }_{i}(\cdot )\) and \({\psi }_{i}(\cdot )\) are highly intertwined. The essence is to find \({\varphi }_{i}(\cdot )\) and \({\psi }_{i}(\cdot )\) jointly and recursively. In the process of construction, one of the crucial and delicate points is to select the “right” initial conditions. This is done by demanding that \({\psi }_{i}(\tau )\) decay to 0 as τ → ∞. For future use, we define a differential operator \({\mathcal{L}}^{\varepsilon }\) on \({\mathbb{R}}^{1\times m}\)-valued functions by

$${\mathcal{L}}^{\varepsilon }f = \varepsilon { df \over dt} - f(\widetilde{Q} + \varepsilon \widehat{Q}).$$
(4.42)

Then it follows that \({\mathcal{L}}^{\varepsilon }f = 0\) iff f is a solution to the differential equation in (4.40). We are now in a position to derive the asymptotic expansion.
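For later experimentation, a residual check for the operator (4.42) can be coded in a few lines; a minimal sketch follows, with a central finite difference standing in for df∕dt (the step h is an illustrative assumption).

```python
# Sketch: residual of the operator (4.42) applied to a candidate f,
# where f maps t to a row vector; h is an illustrative assumption.
import numpy as np

def L_eps(f, t, eps, Qtilde, Qhat, h=1e-6):
    """eps f'(t) - f(t) (Qtilde(t) + eps Qhat(t))."""
    df = (f(t + h) - f(t - h)) / (2.0 * h)
    return eps * df - f(t) @ (Qtilde(t) + eps * Qhat(t))
# A candidate expansion solves (4.40) exactly when this residual vanishes;
# for the truncated expansion it should be small, of order eps^{n+1}.
```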

3.1 Asymptotic Expansions

As in Section 4.2, we seek expansions of the form

$${y}_{n}^{\varepsilon }(t) = {\Phi }_{ n}^{\varepsilon }(t) + {\Psi }_{ n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) +\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right ).$$
(4.43)

For the purpose of estimating the remainder (or error), the terms \({\varphi }_{n+1}(\cdot )\) and \({\psi }_{n+1}(\cdot )\) are needed. Set \({\mathcal{L}}^{\varepsilon }{y}_{n+1}^{\varepsilon }(t) = 0\). Parallel to the approach in Section 4.2, equating like powers of \({\varepsilon }^{i}\) (for \(i = 0,1,\ldots,n + 1\)) leads to the equations for the regular part:

$$\begin{array}{ll} &{\varepsilon }^{0} :\ {\varphi }_{ 0}(t)\widetilde{Q}(t) = 0, \\ &{\varepsilon }^{1} :\ {\varphi }_{ 1}(t)\widetilde{Q}(t) ={ d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)\widehat{Q}(t), \\ &\qquad \cdots \\ &{\varepsilon }^{i} :\ {\varphi }_{ i}(t)\widetilde{Q}(t) ={ d{\varphi }_{i-1}(t) \over dt} - {\varphi }_{i-1}(t)\widehat{Q}(t).\end{array}$$
(4.44)

As discussed in Section 4.2, the approximation above is good for t away from 0. When t is sufficiently close to 0, an initial layer of thickness ε develops. Thus for the singular part of the expansion we enlarge the picture near 0 using the stretched variable τ defined by \(\tau = t/\varepsilon \). Identifying the initial-layer terms in \({\mathcal{L}}^{\varepsilon }{y}_{n+1}^{\varepsilon } = 0\), we obtain

$$\begin{array}{rl} &{ d \over d\tau } \left ({\psi }_{0}(\tau ) + \varepsilon {\psi }_{1}(\tau ) + \cdots + {\varepsilon }^{n+1}{\psi }_{ n+1}(\tau )\right ) \\ &\quad = \left ({\psi }_{0}(\tau ) + \varepsilon {\psi }_{1}(\tau ) + \cdots + {\varepsilon }^{n+1}{\psi }_{ n+1}(\tau )\right )\left (\widetilde{Q}(\varepsilon \tau ) + \varepsilon \widehat{Q}(\varepsilon \tau )\right )\end{array}$$

By means of the Taylor expansion, we have

$$\begin{array}{rl} &\widetilde{Q}(\varepsilon \tau ) =\widetilde{ Q}(0) + \varepsilon \tau { d\widetilde{Q}(0) \over dt} + \cdots \\ &\qquad +{ {(\varepsilon \tau )}^{n+1} \over (n + 1)!} { {d}^{n+1}\widetilde{Q}(0) \over d{t}^{n+1}} +\widetilde{ {R}}_{n+1}(\varepsilon \tau ), \\ &\varepsilon \widehat{Q}(\varepsilon \tau ) = \varepsilon \widehat{Q}(0) + {\varepsilon }^{2}\tau { d\widehat{Q}(0) \over dt} + \cdots \\ &\qquad +{ \varepsilon {(\varepsilon \tau )}^{n} \over n!} { {d}^{n}\widehat{Q}(0) \over d{t}^{n}} +\widehat{ {R}}_{n}(\varepsilon \tau ),\end{array}$$

where

$$\begin{array}{rl} &\widetilde{{R}}_{n+1}(t) ={ {t}^{n+1} \over (n + 1)!} \left({ {d}^{n+1}\widetilde{Q}(\xi ) \over d{t}^{n+1}} -{ {d}^{n+1}\widetilde{Q}(0) \over d{t}^{n+1}} \right), \\ &\widehat{{R}}_{n}(t) ={ \varepsilon {t}^{n} \over n!} \left({ {d}^{n}\widehat{Q}(\zeta ) \over d{t}^{n}} -{ {d}^{n}\widehat{Q}(0) \over d{t}^{n}} \right), \end{array}$$

for some 0 ≤ ξ ≤ t and 0 ≤ ζ ≤  t. Note that in view of (A4.4),

$$\widetilde{{R}}_{n+1}(t) = O({t}^{n+2})\mbox{ and }\widehat{{R}}_{ n}(t) = O(\varepsilon {t}^{n+1}).$$

Equating coefficients of like powers of εi, for \(i = 0,1,\ldots,n + 1\) , and using the Taylor expansion above, we obtain

$$\begin{array}{ll} &{\varepsilon }^{0} :\ { d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0), \\ &{\varepsilon }^{1} :\ { d{\psi }_{1}(\tau ) \over d\tau } = {\psi }_{1}(\tau )\widetilde{Q}(0) \\ & + {\psi }_{0}(\tau )\left(\widehat{Q}(0) + \tau { d\widetilde{Q}(0) \over dt} \right),\\ \\ \\ &\qquad \ \cdots \\ &{\varepsilon }^{i} :\ { d{\psi }_{i}(\tau ) \over d\tau } = {\psi }_{i}(\tau )\widetilde{Q}(0) +\sum\limits_{j=0}^{i-1}{\psi }_{ i-j-1}(\tau ) \\ & \times \left({ {\tau }^{j} \over j!} { {d}^{j}\widehat{Q}(0) \over d{t}^{j}} +{ {\tau }^{j+1} \over (j + 1)!} { {d}^{j+1}\widetilde{Q}(0) \over d{t}^{j+1}} \right).\end{array}$$
(4.45)

In view of the matching of the asymptotic expansions, we must have at t = 0 that

$$\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}\left ({\varphi }_{ i}(0) + {\psi }_{i}(0)\right ) = {p}^{0}.$$
(4.46)

This equation implies

$${p}^{0} = {\varphi }_{ 0}(0) + {\psi }_{0}(0)\mbox{ and }{\varphi }_{i}(0) + {\psi }_{i}(0) = 0,$$

for i ≥ 1. Moreover, note that \({p}^{\varepsilon }(t)\mathrm{1}\mathrm{l} = 1\) for all t ∈ [0, T]. Sending ε → 0 in the asymptotic expansion, one necessarily obtains the following conditions: for all t ∈ [0, T],

$${\varphi }_{0}(t)\mathrm{1}\mathrm{l} = 1\mbox{ and }{\varphi }_{i}(t)\mathrm{1}\mathrm{l} = 0,\;i \geq 1.$$
(4.47)

Our task now is to determine the functions φ i ( ⋅) and ψ i( ⋅).

Determining \({\varphi }_{0}(\cdot )\) and \({\psi }_{0}(\cdot )\). Write \(v = ({v}^{1},\ldots,{v}^{l})\) for a vector \(v \in {\mathbb{R}}^{1\times m}\), where \({v}^{k}\) denotes the subvector corresponding to the kth block of the partition. Meanwhile, a subscript indexes a member of a sequence; thus \({v}_{n}^{k}\) denotes the kth subblock of the partitioned vector \({v}_{n}\).

Let us start with the first equation in (4.44). In view of (4.47), we have

$$\begin{array}{ll} &{\varphi }_{0}(t)\widetilde{Q}(t) = 0, \\ &\sum\limits_{i=1}^{m}{\varphi }_{ 0}^{i}(t) = 1.\end{array}$$
(4.48)

Note that the system above depends only on the generator \(\widetilde{Q}(t)\). However, by itself, the system is not uniquely solvable. Since for each t ∈ [0,  T] and k = 1,  , l, \(\widetilde{{Q}}^{k}(t)\) is weakly irreducible, it follows that \(\mathrm{rank}(\widetilde{{Q}}^{k}(t)) = {m}_{k} - 1\) and \(\mathrm{rank}(\widetilde{Q}(t)) = m - l\) . Therefore, to get a unique solution, we need to supply l auxiliary equations. Where can we find these equations? Upon dividing the system (4.48) into l subsystems, one can apply the Fredholm alternative (see Lemma  A.37 and Corollary  A.38) and use the orthogonality condition to choose l additional equations to replace l equations in the system represented by the first equation in (4.48).

Since for each k, \(\widetilde{{Q}}^{k}(t)\) is weakly irreducible, there exists a unique quasi-stationary distribution νk(t). Therefore any solution to the equation

$${\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = 0$$

can be written as the product of νk(t) and a scalar “multiplier,” say \({{\vartheta}}_{0}^{k}(t)\). It follows from the second equation in (4.48) that \(\sum\limits_{k=1}^{l}{{\vartheta}}_{0}^{k}(t) = 1\). These \({{\vartheta}}_{0}^{k}(t)\)’s can be interpreted as the probabilities of the “grouped states” (or “aggregated states”) \({\mathcal{M}}_{k}\).

As will be seen in the sequel, \({{\vartheta}}_{0}^{k}(t)\) becomes an important spinoff in the process of construction. Effort will subsequently be devoted to finding the unique solution \(({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{0}^{l}(t))\). Let \(\mathrm{1}{\mathrm{l}}_{{m}_{k}} = (1,\ldots,1)^{\prime} \in {\mathbb{R}}^{{m}_{k}\times 1}\).

Lemma 4.21.

Under (A4.3) and (A4.4), for each k = 1,…,l, the solution of the equation

$$\begin{array}{ll} &{\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = 0, \\ &{\varphi }_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = {{\vartheta}}_{0}^{k}(t), \end{array}$$
(4.49)

can be uniquely expressed as \({\varphi }_{0}^{k}(t) = {\nu }^{k}(t){{\vartheta}}_{0}^{k}(t)\) , where ν k (t) is the quasi-stationary distribution corresponding to \(\widetilde{{Q}}^{k}(t)\) . Moreover, φ 0 k (t) is (n + 1)-times continuously differentiable on [0,T], provided that \({{\vartheta}}_{0}^{k}(\cdot )\) is (n + 1)-times continuously differentiable.

Proof: For each k, let us regard \({{\vartheta}}_{0}^{k}(\cdot )\) as a known function temporarily. For t ∈ [0, T], let \(\widetilde{{Q}}_{c}^{k}(t) = (\mathrm{1}{\mathrm{l}}_{{m}_{k}}\vdots\;\widetilde{{Q}}^{k}(t))\) . Then the solution can be written as

$${\varphi }_{0}^{k}(t) = ({{\vartheta}}_{ 0}^{k}(t)\vdots{0}_{{ m}_{k}}^{\prime})\widetilde{{Q}}_{c}^{k,{\prime}}(t){\left (\widetilde{{Q}}_{ c}^{k}(t)\widetilde{{Q}}_{ c}^{k,{\prime}}(t)\right )}^{-1},$$

where \({0}_{{m}_{k}} = (0,\ldots,0)^{\prime} \in {\mathbb{R}}^{{m}_{k}\times 1}\). Moreover, \({\varphi }_{0}^{k}(\cdot )\) is (n + 1)-times continuously differentiable whenever \({{\vartheta}}_{0}^{k}(\cdot )\) is. The lemma is thus concluded. □ 

Remark 4.22.

This lemma indicates that for each k = 1,…,l, the subvector \({\varphi }_{0}^{k}(\cdot )\) is a multiple of the quasi-stationary distribution \({\nu }^{k}(\cdot )\). The multipliers \({{\vartheta}}_{0}^{k}(\cdot )\) are to be determined. Owing to the interactions among different “aggregated states” corresponding to the block matrices, piecing together quasi-stationary distributions does not produce a quasi-stationary distribution for the entire system (i.e., \(({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\) is not a quasi-stationary distribution for the entire system). Therefore, the leading term in the asymptotic expansion is proportional to (or a “multiple” of) the quasi-stationary distributions of the Markov chains generated by \(\widetilde{{Q}}^{k}(t)\), for k = 1,…,l. The multiplier \({{\vartheta}}_{0}^{k}(t)\) reflects the interactions of the Markov chain among the “aggregated states.” The probabilistic meaning of the leading term \({\varphi }_{0}(\cdot )\) is in the sense of total probability. Intuitively, \({{\vartheta}}_{0}^{k}(t)\) is the probability of the chain belonging to \({\mathcal{M}}_{k}\), and \({\varphi }_{0}^{k}(t)\) is the probability distribution of the chain belonging to \({\mathcal{M}}_{k}\) and the transitions taking place within this group of states.

We proceed to determine \({{\vartheta}}_{0}^{k}(\cdot )\) for k = 1, …, l. Define an m × l matrix

$$\widetilde{\mathrm{1}\mathrm{l}} = \left (\begin{array}{*{10}c} \mathrm{1}{\mathrm{l}}_{{m}_{1}} & & & \\ & \mathrm{1}{\mathrm{l}}_{{m}_{2}} & &\\ & & \ddots& \\ & & & \mathrm{1}{\mathrm{l}}_{{m}_{l}}\\ \end{array} \right ) = \mathrm{diag}(\mathrm{1}{\mathrm{l}}_{{m}_{1}},\ldots,\mathrm{1}{\mathrm{l}}_{{m}_{l}}).$$

A crucial observation is that \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\) , that is, \(\widetilde{Q}(t)\) and \(\widetilde{\mathrm{1}\mathrm{l}}\) are orthogonal. Thus postmultiplying by \(\widetilde{\mathrm{1}\mathrm{l}}\) leads to

$${\mathcal{L}}^{\varepsilon }\left (\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)\widetilde{\mathrm{1}\mathrm{l}}\right ) = 0.$$

Recall that

$${\varphi }_{0}^{k}(t) = {{\vartheta}}_{ 0}^{k}(t){\nu }^{k}(t)\mbox{ and }{\varphi }_{ 0}^{k}(t)\mathrm{1}\mathrm{l} = {{\vartheta}}_{ 0}^{k}(t).$$

Equating the coefficients of ε in the above equation yields

$${ d \over dt} ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t)) = ({{\vartheta}}_{ 0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t))\overline{Q}(t),$$
(4.50)

where

$$\begin{array}{rl} \overline{Q}(t) =&\left (\begin{array}{*{10}c} {\nu }^{1}(t)& && \\ &{\nu }^{2}(t)&&\\ &&\ddots& \\ & &&{\nu }^{l}(t)\\ \end{array} \right )\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} \\ =&\mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}}.\end{array}$$
(4.51)
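In matrix terms, (4.51) is a short computation. The sketch below forms \(\overline{Q}\) for two blocks of size two, using the (assumed constant) rates of Example 4.20; the result is again a generator.

```python
# Sketch of (4.51): Qbar = diag(nu^1, nu^2) Qhat 1l_tilde for l = 2
# blocks of size 2; all numerical data are illustrative assumptions.
import numpy as np

nu1 = nu2 = np.array([2.0, 3.0]) / 5.0      # block quasi-stationary rows
l2, m2 = 0.8, 0.5
Qhat = np.array([[-l2, 0, l2, 0], [0, -l2, 0, l2],
                 [m2, 0, -m2, 0], [0, m2, 0, -m2]])

diag_nu = np.zeros((2, 4))
diag_nu[0, :2], diag_nu[1, 2:] = nu1, nu2   # 2 x 4 block-diagonal of nus
ones_tilde = np.zeros((4, 2))
ones_tilde[:2, 0] = ones_tilde[2:, 1] = 1.0 # the m x l matrix 1l_tilde

Qbar = diag_nu @ Qhat @ ones_tilde
print(Qbar)   # a 2 x 2 generator: each row sums to zero
```

For these data the product collapses to [[−λ2, λ2], [μ2, −μ2]], which is exactly the averaged generator that reappears in Section 3.5.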

Remark 4.23.

Intuitively, \(\overline{Q}(t)\) is the “average” of \(\widehat{Q}(t)\) weighted by the collection of quasi-stationary distributions \(({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\). In fact, (4.50) is merely a requirement that the equations in (4.44) be consistent in the sense of Fredholm. This can be seen as follows. Denote by \(N(\widetilde{Q}(t))\) the null space of the matrix \(\widetilde{Q}(t)\). Since \(\mathrm{rank}(\widetilde{Q}(t)) = m - l\), the dimension of \(N(\widetilde{Q}(t))\) is l. Observe that the columns of \(\widetilde{\mathrm{1}\mathrm{l}}\), namely \(\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{1}},\ldots,\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{l}}\), where

$$\begin{array}{ll} &\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{1}} = {(\underbrace{{1,\ldots,1}}_{{m}_{1}},\underbrace{{0,\ldots,0}}_{{m}_{2}+\cdots +{m}_{l}})}^{{\prime}}, \\ &\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{2}} = {(\underbrace{{0,\ldots,0}}_{{m}_{1}},\underbrace{{1,\ldots,1}}_{{m}_{2}},\underbrace{{0,\ldots,0}}_{{m}_{3}+\cdots +{m}_{l}})}^{{\prime}}, \\ &\quad \cdots \\ &\widetilde{\mathrm{1}{\mathrm{l}}}_{{m}_{l}} = {(\underbrace{{0,\ldots,0,}}_{{m}_{1}+\cdots +{m}_{l-1}}\underbrace{{ 1,\ldots,1}}_{{m}_{l}})}^{{\prime}} \end{array}$$
(4.52)

are linearly independent and span the null space of \(\widetilde{Q}(t)\). The equations in (4.44) have solutions only if the right-hand side of each equation is orthogonal to \(\widetilde{\mathrm{1}\mathrm{l}}\). Hence, (4.50) must hold.

Next we determine the initial value \({{\vartheta}}_{0}(0)\). Assuming that the asymptotic expansion of p ε( ⋅) is given by y n ε( ⋅) (see (4.43)), it is necessary that

$${\varphi }_{0}(0)\widetilde{\mathrm{1}\mathrm{l}} =\lim\limits_{\delta \rightarrow 0}\lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon }(\delta )\widetilde{\mathrm{1}\mathrm{l}}.$$
(4.53)

We will refer to such a condition as an initial-value consistency condition. Moreover, in view of (4.40) and \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0,\)

$${p}^{\varepsilon }(\delta )\widetilde{\mathrm{1}\mathrm{l}} = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}} +{ \int }_{0}^{\delta }{p}^{\varepsilon }(s)\widehat{Q}(s)\widetilde{\mathrm{1}\mathrm{l}}\,ds.$$

Since p ε( ⋅) and \(\widehat{Q}(\cdot )\) are both bounded, it follows that

$$ \lim\limits_{\delta \rightarrow 0}\left(\mathop{\lim\sup}\limits_{\varepsilon \rightarrow 0}{ \int }_{0}^{\delta }{p}^{\varepsilon }(s)\widehat{Q}(s)ds\widetilde{\mathrm{1}\mathrm{l}}\right) = 0.$$

Therefore, the initial-value consistency condition (4.53) yields

$${\varphi }_{0}(0)\widetilde{\mathrm{1}\mathrm{l}} = \lim\limits_{\delta \rightarrow 0}\left ( \lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon }(\delta )\widetilde{\mathrm{1}\mathrm{l}}\right ) = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}}.$$

Note that \(({{\vartheta}}_{0}^{1}(0),\ldots,{{\vartheta}}_{0}^{l}(0)) = {\varphi }_{0}(0)\widetilde{\mathrm{1}\mathrm{l}}\). So the initial value for \({{\vartheta}}_{0}(t)\) should be

$$({{\vartheta}}_{0}^{1}(0),\ldots,{{\vartheta}}_{ 0}^{l}(0)) = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}}.$$

Using this initial condition and solving (4.50) yields that

$$({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t)) = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}}X(t,0),$$

where X(t, s) is the principal matrix solution of (4.50) (see Hale [79]). Since the smoothness of X( ⋅,  ⋅) depends solely on the smoothness properties of \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\), \(({{\vartheta}}_{0}^{1}(\cdot ),\ldots,{{\vartheta}}_{0}^{l}(\cdot ))\) is (n + 1)-times continuously differentiable on [0,  T]. Up to now, we have shown that φ0( ⋅) can be constructed so that it is (n + 1)-times continuously differentiable on [0,  T]. Set \({{\vartheta}}_{0}(t) = ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{0}^{l}(t))\). We now summarize the discussion above as follows:

Proposition 4.24.

Assume conditions (A4.3) and (A4.4). Then for t ∈ [0,T], \({\varphi }_{0}(t)\) can be obtained uniquely by solving the following system of equations:

$$\begin{array}{ll} &{\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = 0, \\ &{\varphi }_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = {{\vartheta}}_{0}^{k}(t), \\ &{ d{{\vartheta}}_{0}(t) \over dt} = {{\vartheta}}_{0}(t)\overline{Q}(t), \\ &\mbox{ with }{{\vartheta}}_{0}(0) = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}},\end{array}$$
(4.54)

such that φ 0 (⋅) is (n + 1)-times continuously differentiable. □
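A minimal sketch of Proposition 4.24 follows, under constant-rate assumptions; constancy makes the principal matrix solution of (4.50) a matrix exponential, whereas in the time-varying case X(t,0) must be obtained by integrating (4.50) numerically.

```python
# Sketch of (4.54) with constant, illustrative rates: theta_0 solves
# d theta/dt = theta Qbar, theta(0) = p^0 1l_tilde, and
# phi_0(t) = (nu^1 theta^1(t), nu^2 theta^2(t)).
import numpy as np
from scipy.linalg import expm

l1, m1, l2, m2 = 3.0, 2.0, 0.8, 0.5
nu = np.array([m1, l1]) / (l1 + m1)          # both blocks share nu here
Qbar = np.array([[-l2, l2], [m2, -m2]])      # from (4.51), see above

p0 = np.array([1.0, 0.0, 0.0, 0.0])
theta0 = np.array([p0[:2].sum(), p0[2:].sum()])   # p^0 1l_tilde

def phi0(t):
    # X(t, 0) = expm(Qbar t) because Qbar is constant in this sketch
    theta = theta0 @ expm(Qbar * t)
    return np.concatenate([theta[0] * nu, theta[1] * nu])

print(phi0(1.0), phi0(1.0).sum())   # a distribution; the sum is one
```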

We next consider the initial-layer term ψ0( ⋅). First note that solving (4.45) for each \(i = 0,1,\ldots,n + 1\) leads to

$$\begin{array}{ll} &{\psi }_{0}(\tau ) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau ), \\ &\qquad \cdots \\ &{\psi }_{i}(\tau ) = {\psi }_{i}(0)\exp (\widetilde{Q}(0)\tau ) \\ &\qquad \qquad +\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\tau }{\psi }_{ i-j-1}(s)\left({ {s}^{j} \over j!} { {d}^{j}\widehat{Q}(0) \over d{t}^{j}} + \frac{{s}^{j+1}} {(j + 1)!}{ {d}^{j+1}\widetilde{Q}(0) \over d{t}^{j+1}} \right) \\ & \quad \times \exp (\widetilde{Q}(0)(\tau - s))ds.\end{array}$$
(4.55)

Once again, to match the asymptotic expansion requires that (4.46) hold and hence

$${p}^{0} = {p}^{\varepsilon }(0) = {\varphi }_{ 0}(0) + {\psi }_{0}(0).$$

Solving the first equation in (4.45) together with the above initial condition, one obtains

$${\psi }_{0}(\tau ) = ({p}^{0} - {\varphi }_{ 0}(0))\exp (\widetilde{Q}(0)\tau ).$$
(4.56)

Note that in Proposition  4.25 to follow, we still use \({\kappa }_{0,0}\) as a positive constant, which is generally a different constant from that in Section 4.2.

Proposition 4.25.

Assume conditions (A4.3) and (A4.4). Then \({\psi }_{0}(\cdot )\) can be obtained uniquely by (4.56). In addition, there is a positive number \({\kappa }_{0,0}\) such that

$$\vert {\psi }_{0}(\tau )\vert \leq K\exp (-{\kappa }_{0,0}\tau ),\;\tau \geq 0.$$

Proof: We prove only the exponential decay property, since the rest is obvious. Let \({\nu }^{k}(0)\) be the stationary distribution corresponding to the generator \(\widetilde{{Q}}^{k}(0)\). Define

$$\begin{array}{rl} \pi & =\widetilde{ \mathrm{1}\mathrm{l}}\left (\begin{array}{*{10}c} {\nu }^{1}(0)& && \\ &{\nu }^{2}(0)&&\\ &&\ddots& \\ & &&{\nu }^{l}(0)\\ \end{array} \right ) \\ &\\ \\ \\ & = \left (\begin{array}{*{10}c} \mathrm{1}{\mathrm{l}}_{{m}_{1}}{\nu }^{1}(0)& && \\ &\mathrm{1}{\mathrm{l}}_{{m}_{2}}{\nu }^{2}(0)&&\\ &&\ddots& \\ & &&\mathrm{1}{\mathrm{l}}_{{m}_{l}}{\nu }^{l}(0)\\ \end{array} \right ), \end{array}$$
(4.57)

where

$$\mathrm{1}{\mathrm{l}}_{{m}_{k}}{\nu }^{k}(0) = \left (\begin{array}{*{10}c} {\nu }_{1}^{k}(0)&\cdots &{\nu }_{{ m}_{k}}^{k}(0)\\ & \vdots & \\ {\nu }_{1}^{k}(0)&\cdots &{\nu }_{{m}_{k}}^{k}(0)\\ \end{array} \right ).$$

Noting the block-diagonal structure of \(\widetilde{Q}(0)\), we have

$$\exp (\widetilde{Q}(0)\tau ) = \left (\begin{array}{*{10}c} \exp (\widetilde{{Q}}^{1}(0)\tau )& & & \\ &\exp (\widetilde{{Q}}^{2}(0)\tau )& &\\ & &\ddots& \\ & & &\exp (\widetilde{{Q}}^{l}(0)\tau )\\ \end{array} \right ).$$

It is easy to see that

$$({p}^{0} - {\varphi }_{ 0}(0))\widetilde{\mathrm{1}\mathrm{l}} = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}} - {\varphi }_{ 0}(0)\widetilde{\mathrm{1}\mathrm{l}} = {p}^{0}\widetilde{\mathrm{1}\mathrm{l}} - {{\vartheta}}_{ 0}(0) = 0.$$

Owing to the choice of initial condition, (p 0  − φ 0 (0)) is orthogonal to π, and by virtue of Lemma  4.4, for each k = 1, …,  l and some κ0, k  > 0,

$$\left \vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right \vert \leq K\exp (-{\kappa }_{ 0,k}\tau ),$$

we have

$$\begin{array}{rl} \vert {\psi }_{0}(\tau )\vert & = \vert ({p}^{0} - {\varphi }_{ 0}(0))[\exp (\widetilde{Q}(0)\tau ) - \pi ]\vert \\ & \leq K\sup\limits_{k\leq l}\vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\vert \\ & \leq K\exp (-{\kappa }_{0,0}\tau ), \end{array}$$

where κ 0, 0  = min k ≤ l κ 0, k  > 0. □ 
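Numerically, (4.56) and the decay estimate of Proposition 4.25 can be observed as follows; the sketch reuses the constant-rate flowshop data, an illustrative assumption.

```python
# Sketch of (4.56): psi_0(tau) = (p^0 - phi_0(0)) exp(Qtilde(0) tau),
# with constant, illustrative flowshop matrices.
import numpy as np
from scipy.linalg import expm

l1, m1 = 3.0, 2.0
Qtilde0 = np.array([[-l1, l1, 0, 0], [m1, -m1, 0, 0],
                    [0, 0, -l1, l1], [0, 0, m1, -m1]])
nu = np.array([m1, l1]) / (l1 + m1)

p0 = np.array([1.0, 0.0, 0.0, 0.0])
phi0_at0 = np.concatenate([1.0 * nu, 0.0 * nu])   # theta_0(0) = (1, 0)
psi0_init = p0 - phi0_at0                         # blockwise sums are zero

for tau in [0.5, 1.0, 2.0, 4.0]:
    print(tau, np.linalg.norm(psi0_init @ expm(Qtilde0 * tau)))
# the norms decay like exp(-(l1 + m1) tau): a positive kappa_{0,0} exists
```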

Determining \({\varphi }_{i}(\cdot )\) and \({\psi }_{i}(\cdot )\) for i ≥ 1. In contrast to the situation encountered in Section 4.2, the sequence \(\{{\varphi }_{i}(\cdot )\}\) cannot be obtained without the involvement of \(\{{\psi }_{i}(\cdot )\}\). We thus obtain the sequences pairwise. While the determination of \({\varphi }_{0}(\cdot )\) and \({\psi }_{0}(\cdot )\) is similar to that of Section 4.2, the solutions for the rest of the functions show distinct features resulting from the underlying weak and strong interactions. With the known function

$${b}_{0}(t) ={ d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)\widehat{Q}(t),$$

we proceed to solve the second equation in (4.44) together with the constraint \(\sum\limits_{i=1}^{m}{\varphi }_{1}^{i}(t) = 0\) due to (4.47). Partition the vectors φ1(t) and b 0(t) as

$$\begin{array}{rl} &{\varphi }_{1}(t) = ({\varphi }_{1}^{1}(t),\ldots,{\varphi }_{1}^{l}(t)), \\ &{b}_{0}(t) = ({b}_{0}^{1}(t),\ldots,{b}_{0}^{l}(t))\end{array}$$

In view of the definition of \(\overline{Q}(t)\) in (4.51) and \({\varphi }_{0}^{k}(t) = {\nu }^{k}(t){{\vartheta}}_{0}^{k}(t)\), it follows that \({b}_{0}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\), thus,

$${b}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0,\;k = 1,\ldots,l.$$

Let \({{\vartheta}}_{1}^{k}(t) = {\varphi }_{1}^{k}(t)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\); note that \(\sum\limits_{k=1}^{l}{{\vartheta}}_{1}^{k}(t) = 0\) because φ1(t)1 l = 0. Then for each k = 1, …,  l, the solution to

$$\begin{array}{ll} &{\varphi }_{1}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t), \\ &{\varphi }_{1}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = {{\vartheta}}_{1}^{k}(t), \end{array}$$
(4.58)

can be expressed as

$${\varphi }_{1}^{k}(t) =\widetilde{ {b}}_{ 0}^{k}(t) + {{\vartheta}}_{ 1}^{k}(t){\nu }^{k}(t),$$
(4.59)

where \(\widetilde{{b}}_{0}^{k}(t)\) is a solution to the following equation:

$$\begin{array}{l} \widetilde{{b}}_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t), \\ \widetilde{{b}}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0,\\ \end{array}$$

or equivalently,

$$\widetilde{{b}}_{0}^{k}(t)(\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\vdots\widetilde{{Q}}^{k}(t)) = (0\vdots{b}_{ 0}^{k}(t)).$$

The procedure for solving this equation is similar to that for φ0( ⋅).

Analogously to the previous treatment, we proceed to determine \({{\vartheta}}_{1}^{k}(t)\) by solving the system of equations

$${\mathcal{L}}^{\varepsilon }\bigg{(}\sum\limits_{i=0}^{n+1}{\varepsilon }^{i}{\varphi }_{ i}(t)\widetilde{\mathrm{1}\mathrm{l}}\bigg{)} = 0.$$
(4.60)

Using the conditions

$$\widetilde{{b}}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0\mbox{ and }{\nu }^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 1,$$

we have

$${\varphi }_{1}(t)\widetilde{\mathrm{1}\mathrm{l}} = ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t))$$

and

$${\varphi }_{1}(t)\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t))\overline{Q}(t) + (\widetilde{{b}}_{ 0}^{1}(t),\ldots,\widetilde{{b}}_{ 0}^{l}(t))\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}},$$

where \(\overline{Q}(t)\) was defined in (4.51).

By equating the coefficients of ε2 in (4.60), we obtain a system of linear inhomogeneous equations

$$\begin{array}{ll} &{ d \over dt} ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t)) = ({{\vartheta}}_{ 1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t))\overline{Q}(t) \\ & + (\widetilde{{b}}_{0}^{1}(t),\ldots,\widetilde{{b}}_{ 0}^{l}(t))\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}},\end{array}$$
(4.61)

with initial conditions

$${{\vartheta}}_{1}^{k}(0),\mbox{ for }k = 1,2,\ldots,l\mbox{ such that }\sum\limits_{k=1}^{l}{{\vartheta}}_{ 1}^{k}(0) = 0.$$

Again, as observed in Remark  4.23, equation (4.61) comes from the consideration in the sense of Fredholm since the functions on the right-hand sides in (4.44) must be orthogonal to \(\widetilde{\mathrm{1}\mathrm{l}}\).

The initial conditions \({{\vartheta}}_{1}^{k}(0)\) for k = 1, …, l have not been completely specified yet. We do this later to ensure the matched asymptotic expansion. Once the \({{\vartheta}}_{1}^{k}(0)\)’s are given, the solution of the above equation is

$$\begin{array}{rl} ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{1}^{l}(t)) =&({{\vartheta}}_{1}^{1}(0),\ldots,{{\vartheta}}_{ 1}^{l}(0))X(t,0) \\ &\ +{ \int }_{0}^{t}(\widetilde{{b}}_{ 0}^{1}(s),\ldots,\widetilde{{b}}_{ 0}^{l}(s))\widehat{Q}(s)\widetilde{\mathrm{1}\mathrm{l}}X(t,s)ds\end{array}$$

Thus if the initial value \({{\vartheta}}_{1}^{k}(0)\) is given, then \({{\vartheta}}_{1}^{k}(\cdot ),\) k = 1, …,  l can be found, and so can φ1( ⋅). Moreover, φ1( ⋅) is n-times continuously differentiable on [0,  T]. The problem boils down to finding the initial condition of \({{\vartheta}}_{1}(0)\).

So far, with the proviso of specified initial conditions \({{\vartheta}}_{1}^{k}(0)\), for k = 1, …, l, the construction of φ 1 ( ⋅) has been completed, and its smoothness has been established. Compared with the determination of φ 0 ( ⋅), the multipliers \({{\vartheta}}_{1}^{k}(\cdot )\) can no longer be determined using the information about the regular part alone, because their initial values have to be determined in conjunction with those of the singular part. This will be seen as follows.

In view of (4.55),

$$\begin{array}{ll} &{\psi }_{1 } (\tau ) = {\psi }_{1}(0)\exp (\widetilde{Q}(0)\tau ) \\ &\qquad +{ \int }_{0}^{\tau }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\widehat{Q}(0)\exp (\widetilde{Q}(0)(\tau - s))ds \\ &\qquad +{ \int }_{0}^{\tau }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s){ d\widetilde{Q}(0) \over dt} \exp (\widetilde{Q}(0)(\tau - s))ds.\end{array}$$
(4.62)

Recall that ψ 1 (0) has not been specified yet.

Similar to Section 4.2, for each t ∈ [0, T], \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\) . Therefore,

$$\left(\frac{{d}^{i}\widetilde{Q}(t)} {d{t}^{i}} \right)\widetilde{\mathrm{1}\mathrm{l}} = 0\quad \mbox{ and }\quad \left(\frac{{d}^{i}\widetilde{Q}(0)} {d{t}^{i}} \right)\pi = 0,$$

for \(i = 1,\ldots,n + 1\), where π is defined in (4.57). This together with ψ0 (0)π = 0 yields

$$\begin{array}{ll} &{\biggl |{\int }_{0}^{\tau }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s){ d\widetilde{Q}(0) \over dt} \exp (\widetilde{Q}(0)(\tau - s))ds\biggr |} \\ &\ \leq {\int }_{0}^{\tau }s{\biggl |{\psi }_{ 0}(0)[\exp (\widetilde{Q}(0)s) - \pi ]\biggr |} \\ &\quad \times {\biggl |{ d\widetilde{Q}(0) \over dt} [\exp (\widetilde{Q}(0)(\tau - s)) - \pi ]\biggr |}ds \\ &\ \leq K{\tau }^{2}\exp (-{\kappa }_{ 0,0}\tau ).\end{array}$$
(4.63)

To obtain the desired property, we need only work with the first two terms on the right side of the equal sign of (4.62). Noting the exponential decay property of \({\psi }_{0}(\tau ) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau )\) , we have

$${\int }_{0}^{\infty }\Big{\vert }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\Big{\vert }ds < \infty,$$

that is, the improper integral converges absolutely. Set

$${ \overline{\psi }}_{0} = \left ({\int }_{0}^{\infty }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\widehat{Q}(0) \in {\mathbb{R}}^{1\times m}.$$
(4.64)

Consequently,

$$\begin{array}{ll} & \lim\limits_{\tau \rightarrow \infty }{\psi }_{1}(0)\exp (\widetilde{Q}(0)\tau ) = {\psi }_{1}(0)\pi \quad \mbox{ and } \\ & \lim\limits_{\tau \rightarrow \infty }{\int }_{0}^{\tau }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\widehat{Q}(0)\exp (\widetilde{Q}(0)(\tau - s))ds \\ &\quad = \left ({\int }_{0}^{\infty }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\widehat{Q}(0)\pi \\ &\quad :={ \overline{\psi }}_{0}\pi.\end{array}$$
(4.65)

Recall that \(\pi = \mathrm{diag}(\mathrm{1}{\mathrm{l}}_{{m}_{1}}{\nu }^{1}(0),\ldots,\mathrm{1}{\mathrm{l}}_{{m}_{l}}{\nu }^{l}(0))\). Partitioning the vector \({\overline{\psi }}_{0}\) as \(({\overline{\psi }}_{0}^{1},\ldots,{\overline{\psi }}_{0}^{l})\) in accordance with the blocks, we have

$$\begin{array}{ll} &{\psi }_{1}(0)\pi = \left (\left ({\psi }_{1}^{1}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{1}}\right ){\nu }^{1}(0),\ldots,\left ({\psi }_{ 1}^{l}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{l}}\right ){\nu }^{l}(0)\right ) \\ &{\overline{\psi }}_{0}\pi = \left (\left ({\overline{\psi }}_{0}^{1}\mathrm{1}{\mathrm{l}}_{{ m}_{1}}\right ){\nu }^{1}(0),\ldots,\left ({\overline{\psi }}_{ 0}^{l}\mathrm{1}{\mathrm{l}}_{{ m}_{l}}\right ){\nu }^{l}(0)\right ).\end{array}$$
(4.66)

Our expansion requires that \(\lim\limits_{\tau \rightarrow \infty }{\psi }_{1}(\tau ) = 0\). As a result,

$${\psi }_{1}(0)\pi +{ \overline{\psi }}_{0}\pi = 0,$$
(4.67)

which implies, by virtue of (4.66),

$${\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = -{\overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$

for k = 1, …,  l. Solving these equations, and in view of

$${{\vartheta}}_{1}^{k}(0) = {\varphi }_{ 1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$

we choose

$${{\vartheta}}_{1}^{k}(0) = -{\psi }_{ 1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\mbox{ for }k = 1,\ldots,l.$$

Substituting these into (4.59), we obtain φ1( ⋅). Finally, we use \({\psi }_{1}(0) = -{\varphi }_{1}(0)\). The process of choosing initial conditions for φ1( ⋅) and ψ1( ⋅) is complete. Furthermore,

$$\vert {\psi }_{1}(\tau )\vert \leq K\exp (-{\kappa }_{1,0}\tau )\quad \mbox{ for some }0 < {\kappa }_{1,0} < {\kappa }_{0,0}.$$
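The selection of \({{\vartheta}}_{1}^{k}(0)\) via (4.64) is straightforward to mimic numerically: truncate the improper integral (legitimate, since the integrand decays exponentially because ψ0(0)π = 0) and sum the blocks of \({\overline{\psi }}_{0}\). All data and the truncation level below are illustrative assumptions.

```python
# Sketch of (4.64) and theta_1^k(0) = psi_bar_0^k 1l_{m_k};
# truncation S and the quadrature grid are illustrative assumptions.
import numpy as np
from scipy.linalg import expm

l1, m1, l2, m2 = 3.0, 2.0, 0.8, 0.5
Qtilde0 = np.array([[-l1, l1, 0, 0], [m1, -m1, 0, 0],
                    [0, 0, -l1, l1], [0, 0, m1, -m1]])
Qhat0 = np.array([[-l2, 0, l2, 0], [0, -l2, 0, l2],
                  [m2, 0, -m2, 0], [0, m2, 0, -m2]])
psi0_init = np.array([0.6, -0.6, 0.0, 0.0])   # p^0 - phi_0(0); see above

S, n = 20.0, 800                               # truncate the improper integral
s = np.linspace(0.0, S, n)
vals = np.array([psi0_init @ expm(Qtilde0 * si) for si in s])
ds = s[1] - s[0]
integral = ((vals[0] + vals[-1]) / 2.0 + vals[1:-1].sum(axis=0)) * ds
psi_bar0 = integral @ Qhat0                    # (4.64)

theta1_init = np.array([psi_bar0[:2].sum(), psi_bar0[2:].sum()])
print(theta1_init)   # the initial data theta_1^k(0) for the ODE (4.61)
```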

This procedure can be applied to φi( ⋅) and ψ i ( ⋅) for \(i = 2,\ldots,n + 1\). We proceed recursively to solve for φi ( ⋅) and ψ i ( ⋅) jointly. Using exactly the same methods as the solution for φ 1( ⋅), we define

$${{\vartheta}}_{i}^{k}(t) = {\varphi }_{ i}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$

for each k = 1, …, l and \(i = 2,\ldots,n + 1\). Similarly to \(\widetilde{{b}}_{0}^{k}(\cdot )\), we define \(\widetilde{{b}}_{i}^{k}(\cdot )\) and write

$$\widetilde{{b}}_{i}(t) = (\widetilde{{b}}_{i}^{1}(t),\ldots,\widetilde{{b}}_{ i}^{l}(t)).$$

Proceeding inductively, suppose that \({{\vartheta}}_{i}^{k}(0)\) is selected and in view of (4.55), it has been shown that

$$\vert {\psi }_{i}(\tau )\vert \leq K\exp (-{\kappa }_{i,0}\tau ),\ \ i \leq n$$
(4.68)

for some 0 < κ i, 0  < κ i − 1, 0 . Solve

$${\psi }_{i+1}(0)\pi = -\left(\;\sum\limits_{j=0}^{i}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right)\pi := -{\overline{\psi }}_{i}\pi $$

to obtain \({\psi }_{i+1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}} = -{\overline{\psi }}_{i}^{k}\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) . Set

$${{\vartheta}}_{i+1}^{k}(0) = -{\psi }_{ i+1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{i}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}},\mbox{ for }k = 1,\ldots,l.$$

Finally choose \({\psi }_{i+1}(0) = -{\varphi }_{i+1}(0).\) We thus have determined the initial conditions for φ i ( ⋅). Exactly the same arguments as in Proposition  4.25 lead to

$$\vert {\psi }_{i+1}(\tau )\vert \leq K\exp (-{\kappa }_{i+1,0}\tau )\mbox{ for some }0 < {\kappa }_{i+1,0} < {\kappa }_{i,0}.$$

Proposition 4.26.

Assume (A4.3) and (A4.4). Then the following assertions hold:

  • The sequences of row-vector-valued functions \({\varphi }_{i}(\cdot )\) and \({{\vartheta}}_{i}(\cdot )\) for i = 1, 2, …,  n can be obtained by solving the system of algebraic differential equations

    $$\begin{array}{ll} &{\varphi }_{i}(t)\widetilde{Q}(t) = \frac{d{\varphi }_{i-1}(t)} {dt} - {\varphi }_{i-1}(t)\widehat{Q}(t), \\ &{{\vartheta}}_{i}^{k}(t) = {\varphi }_{ i}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}, \\ &{ d{{\vartheta}}_{i}(t) \over dt} = {{\vartheta}}_{i}(t)\overline{Q}(t) +\widetilde{ {b}}_{i-1}(t)\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}}.\end{array}$$
    (4.69)
  • For i = 1, …, n, the initial conditions are selected as follows:

    • For k = 1, 2, …, l, find \({\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) from the equation

      $${\psi }_{i}(0)\pi = -\left(\;\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j-1}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right)\pi := -{\overline{\psi }}_{i-1}\pi.$$
    • Choose

      $${{\vartheta}}_{i}^{k}(0) = -{\psi }_{ i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}},\mbox{ for }k = 1,\ldots,l.$$
    • Choose \({\psi }_{i}(0) = -{\varphi }_{i}(0).\)

  • There is a positive real number 0 < κ0 < κ i, 0 (given in (4.68)) for \(i = 0,1,\ldots,n + 1\) such that

    $$\vert {\psi }_{i}(\tau )\vert \leq K\exp (-{\kappa }_{0}\tau ).$$
  • The choice of initial conditions yields that \({{\vartheta}}_{i}^{k}(\cdot )\) is \((n + 1 - i)\) -times continuously differentiable on [0, T] and hence φi( ⋅) is \((n + 1 - i)\) -times continuously differentiable on [0, T]. □ 

3.2 Analysis of Remainder

The objective here is to carry out the error analysis and validate the asymptotic expansion. Since the details are quite similar to those of Section 4.2, we make no attempt to spell them out. Only the following lemma and proposition are presented.

Lemma 4.27.

Suppose that (A4.3) and (A4.4) are satisfied. Let η ε (⋅) be a function such that

$$ \sup\limits_{t\in [0,T]}\vert {\eta }^{\varepsilon }(t)\vert = O({\varepsilon }^{k+1})\mbox{ for }k \leq n$$

and let \({\mathcal{L}}^{\varepsilon }\) be an operator defined in (4.42) . If f ε (⋅) is a solution to the equation

$${\mathcal{L}}^{\varepsilon }{f}^{\varepsilon }(t) = {\eta }^{\varepsilon }(t)\mbox{ with }{f}^{\varepsilon }(0) = 0,$$

then f ε (⋅) satisfies

$$ \sup\limits_{t\in [0,T]}\vert {f}^{\varepsilon }(t)\vert = O({\varepsilon }^{k}).$$

Proof: Note that using \({Q}^{\varepsilon }(t) =\widetilde{ Q}(t)/\varepsilon +\widehat{ Q}(t)\) , the differential equation can be written as

$${ d{f}^{\varepsilon }(t) \over dt} = {f}^{\varepsilon }(t){Q}^{\varepsilon }(t) +{ {\eta }^{\varepsilon }(t) \over \varepsilon }.$$

We can then proceed as in the proof of Lemma  4.13. □ 

Lemma  4.27 together with detailed computation similar to that of Section 4.2 yields the following proposition.

Proposition 4.28.

For each i = 0,1,…,n, define

$${e}_{i}^{\varepsilon }(t) = {p}^{\varepsilon }(t) - {y}_{ i}^{\varepsilon }(t).$$
(4.70)

Under conditions (A4.3) and (A4.4),

$$ \sup\limits_{0\leq t\leq T}\vert {e}_{i}^{\varepsilon }(t)\vert = O({\varepsilon }^{i+1}).$$

3.3 Computational Procedure: User’s Guide

Since the constructions of φ i ( ⋅) and ψ i ( ⋅) are rather involved, and the choice of initial conditions is tricky, we summarize the procedure below. This procedure, which can be used as a user’s guide for developing the asymptotic expansion, comprises two main stages.

Step 1: Initialization: finding φ 0 ( ⋅) and ψ 0 ( ⋅).

    1. Obtain the unique solution \({\varphi }_{0}(\cdot )\) via (4.54).

    2. Obtain the unique solution \({\psi }_{0}(\cdot )\) via (4.55) and the initial condition \({\psi }_{0}(0) = {p}^{0} - {\varphi }_{0}(0)\).

Step 2: Iteration: finding φ i ( ⋅) and ψ i ( ⋅) for 1 ≤ i ≤  n.

While i ≤  n, do the following:

    1. Find \({\varphi }_{i}(\cdot )\), the solution of (4.69), with temporarily unspecified \({{\vartheta}}_{i}^{k}(0)\) for k = 1, …, l.

    2. Obtain \({\psi }_{i}(\cdot )\) from (4.55) with temporarily unspecified \({\psi }_{i}(0)\).

    3. Use the equation

      $${\psi }_{i}(0)\pi = -\left(\,\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j-1}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right)\pi := -{\overline{\psi }}_{i-1}\pi $$

      to obtain \({\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}} = -{\overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{m}_{k}}.\)

    4. Set \({{\vartheta}}_{i}^{k}(0) = -{\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}} ={ \overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{m}_{k}}\). By now, \({\varphi }_{i}(\cdot )\) has been determined uniquely.

    5. Choose \({\psi }_{i}(0) = -{\varphi }_{i}(0)\). By now, \({\psi }_{i}(\cdot )\) has also been determined uniquely.

    6. Set \(i = i + 1\).

    7. If i > n, stop.

3.4 Summary of Results

While the previous subsection gives the computational procedure, this subsection presents the main theorem. It establishes the validity of the asymptotic expansion.

Theorem 4.29.

Suppose conditions (A4.3) and (A4.4) are satisfied. Then the asymptotic expansion

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}\left ({\varepsilon }^{i}{\varphi }_{ i}(t) + {\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )\right )$$

can be constructed as in the computational procedure such that

  • φ i ( ⋅) is \((n + 1 - i)\) - times continuously differentiable on [0, T];

  • \(\vert {\psi }_{i}(t)\vert \leq K\exp (-{\kappa }_{0}t)\) for some \({\kappa }_{0} > 0\);

  • \(\vert {p}^{\varepsilon }(t) - {y}_{n}^{\varepsilon }(t)\vert = O({\varepsilon }^{n+1})\mbox{ uniformly in }t \in [0,T]\).

Remark 4.30.

In general, in view of Proposition  4.11, the error bound is of the form \({c}_{2n}(t)\exp (-{\kappa }_{0}t)\), where \({c}_{2n}(t)\) is a polynomial of degree 2n. The exponential constant \({\kappa }_{0}\) typically depends on n: the larger n is, the smaller \({\kappa }_{0}\) must be taken to accommodate the polynomial \({c}_{2n}(t)\).

The following result is a corollary to Theorem  4.29 and will be used in Chapters 5 and 7. Denote the jth component of νk(t) by ν j k(t).

Corollary 4.31.

Assume, in addition to the conditions in Theorem  4.29 with n = 0, that \(\widetilde{Q}(t) =\widetilde{ Q}\) and \(\widehat{Q}(t) =\widehat{ Q}\) are time independent. Then there exist positive constants K and κ 0 ( both independent of ε and t ) such that

$$\Big{\vert }P({\alpha }^{\varepsilon }(t) = {s}_{ kj}) - {\nu }_{j}^{k}(t){{\vartheta}}^{k}(t)\Big{\vert }\leq K\left(\varepsilon (t + 1) +\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right)\right),$$
(4.71)

where \({{\vartheta}}^{k}(t)\) satisfies

$$\frac{d} {dt}({{\vartheta}}^{1}(t),\ldots,{{\vartheta}}^{l}(t)) = ({{\vartheta}}^{1}(t),\ldots,{{\vartheta}}^{l}(t))\overline{Q},$$

with \(({{\vartheta}}^{1}(0),\ldots,{{\vartheta}}^{l}(0)) = (P({\alpha }^{\varepsilon }(0) \in {\mathcal{M}}_{1}),\ldots,P({\alpha }^{\varepsilon }(0) \in {\mathcal{M}}_{l}))\).

Proof: By a slight modification of the analysis of remainder in Section 4.3, we can obtain (4.71) with a constant K independent of ε and t. The second part of the corollary follows from the uniqueness of the solution of the ordinary differential equation for \(({{\vartheta}}^{1}(t),\ldots,{{\vartheta}}^{l}(t))\). □ 

Remark 4.32.

We mention an alternative approach to establishing the asymptotic expansion. In lieu of the constructive procedure presented previously, one may wish to write φi(t) as a sum of solutions of the homogeneous part and the inhomogeneous part. For instance, one may set

$${\varphi }_{i}(t) = {v}_{i}(t)\mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t)) + {U}_{ i}(t),$$
(4.72)

where \({v}_{i}(t) \in {\mathbb{R}}^{l}\) and Ui(t) is a particular solution of the inhomogeneous equation. For i ≥ 0, the equation

$${\varphi }_{i+1}(t)\widetilde{Q}(t) ={ d{\varphi }_{i}(t) \over dt} - {\varphi }_{i}(t)\widehat{Q}(t)$$

and \(\widetilde{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} = 0\) lead to

$$0 =\left({ d{\varphi }_{i}(t) \over dt} - {\varphi }_{i}(t)\widehat{Q}(t)\right)\widetilde{\mathrm{1}\mathrm{l}}.$$

Substituting (4.72) into the equation above, and noting that \({\nu }^{k}(t)\mathrm{1}{\mathrm{l}}_{{m}_{k}} = 1\) for k = 1,…,l, and that \(\mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\widetilde{\mathrm{1}\mathrm{l}} = {I}_{l}\), the l × l identity matrix, one obtains

$${ d{v}_{i}(t) \over dt} = {v}_{i}(t)\overline{Q}(t) + {U}_{i}(t)\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}} -\left({ d{U}_{i}(t) \over dt} \right)\widetilde{\mathrm{1}\mathrm{l}}.$$

One then proceeds to determine vi(0) via the matching condition. The main ideas are similar, and the details are slightly different.

3.5 An Example

Consider Example  4.20 again. Note that the conditions in (A4.3) and (A4.4) require that

$${\lambda }_{1}(t) + {\mu }_{1}(t) > 0\mbox{ for all }t \in [0,T],$$

and that the jump rates \({\lambda }_{i}(t)\) and \({\mu }_{i}(t)\) be smooth enough.

The probability distribution of the state process is given by p ε(t) satisfying

$$\begin{array}{rl} &\frac{d{p}^{\varepsilon }(t)} {dt} = {p}^{\varepsilon }(t){Q}^{\varepsilon }(t), \\ &{p}^{\varepsilon }(0) = {p}^{0}\ \mbox{ such that} \\ &{p}_{i}^{0} \geq 0\mbox{ and }\sum\limits_{i=1}^{4}{p}_{ i}^{0} = 1. \end{array}$$

To solve this set of equations, note that

$$\begin{array}{l} \frac{d} {dt}({p}_{1}^{\varepsilon }(t) + {p}_{ 2}^{\varepsilon }(t)) = -{\lambda }_{ 2}(t)({p}_{1}^{\varepsilon }(t) + {p}_{ 2}^{\varepsilon }(t)) + {\mu }_{ 2}(t)({p}_{3}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)), \\ \frac{d} {dt}({p}_{1}^{\varepsilon }(t) + {p}_{ 3}^{\varepsilon }(t)) = -\frac{{\lambda }_{1}(t)} {\varepsilon } ({p}_{1}^{\varepsilon }(t) + {p}_{ 3}^{\varepsilon }(t)) + \frac{{\mu }_{1}(t)} {\varepsilon } ({p}_{2}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)), \\ \frac{d} {dt}({p}_{2}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)) = \frac{{\lambda }_{1}(t)} {\varepsilon } ({p}_{1}^{\varepsilon }(t) + {p}_{ 3}^{\varepsilon }(t)) -\frac{{\mu }_{1}(t)} {\varepsilon } ({p}_{2}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)), \\ \frac{d} {dt}({p}_{3}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t)) = {\lambda }_{ 2}(t)({p}_{1}^{\varepsilon }(t) + {p}_{ 2}^{\varepsilon }(t)) - {\mu }_{ 2}(t)({p}_{3}^{\varepsilon }(t) + {p}_{ 4}^{\varepsilon }(t))\end{array}$$

To proceed, define functions a 12(t), a 13(t), a 24(t), and a 34(t) as follows:

$$\begin{array}{rl} {a}_{12}(t) =&({p}_{1}^{0} + {p}_{ 2}^{0})\exp \left (-{\int }_{0}^{t}({\lambda }_{ 2}(s) + {\mu }_{2}(s))ds\right ) \\ & +{ \int }_{0}^{t}{\mu }_{ 2}(u)\exp \left (-{\int }_{u}^{t}({\lambda }_{ 2}(s) + {\mu }_{2}(s))ds\right )du, \end{array}$$
$$\begin{array}{rl} {a}_{13}(t) =&({p}_{1}^{0} + {p}_{ 3}^{0})\exp \left (-\frac{1} {\varepsilon }{\int }_{0}^{t}({\lambda }_{ 1}(s) + {\mu }_{1}(s))ds\right ) \\ & +{ \int }_{0}^{t}\frac{{\mu }_{1}(u)} {\varepsilon } \exp \left (-\frac{1} {\varepsilon }{\int }_{u}^{t}({\lambda }_{ 1}(s) + {\mu }_{1}(s))ds\right )du,\end{array}$$
$$\begin{array}{rl} {a}_{24}(t) =&({p}_{2}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{1} {\varepsilon }{\int }_{0}^{t}({\lambda }_{ 1}(s) + {\mu }_{1}(s))ds\right ) \\ & +{ \int }_{0}^{t}\frac{{\lambda }_{1}(u)} {\varepsilon } \exp \left (-\frac{1} {\varepsilon }{\int }_{u}^{t}({\lambda }_{ 1}(s) + {\mu }_{1}(s))ds\right )du,\end{array}$$
$$\begin{array}{rl} {a}_{34}(t) =&({p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-{\int }_{0}^{t}({\lambda }_{ 2}(s) + {\mu }_{2}(s))ds\right ) \\ & +{ \int }_{0}^{t}{\lambda }_{ 2}(u)\exp \left (-{\int }_{u}^{t}({\lambda }_{ 2}(s) + {\mu }_{2}(s))ds\right )du\end{array}$$

Then using the fact that \({p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) + {p}_{3}^{\varepsilon }(t) + {p}_{4}^{\varepsilon }(t) = 1\), we have

$$\begin{array}{l} {p}_{1}^{\varepsilon }(t) + {p}_{2}^{\varepsilon }(t) = {a}_{12}(t), \\ {p}_{1}^{\varepsilon }(t) + {p}_{3}^{\varepsilon }(t) = {a}_{13}(t), \\ {p}_{2}^{\varepsilon }(t) + {p}_{4}^{\varepsilon }(t) = {a}_{24}(t), \\ {p}_{3}^{\varepsilon }(t) + {p}_{4}^{\varepsilon }(t) = {a}_{34}(t).\end{array}$$
(4.73)

Note also that

$$\begin{array}{rl} &\frac{d{p}_{1}^{\varepsilon }(t)} {dt} = -\left (\frac{{\lambda }_{1}(t)} {\varepsilon } + \frac{{\mu }_{1}(t)} {\varepsilon } + {\lambda }_{2}(t) + {\mu }_{2}(t)\right ){p}_{1}^{\varepsilon }(t) \\ &\quad + \frac{{\mu }_{1}(t)} {\varepsilon } {a}_{12}(t) + {\mu }_{2}(t){a}_{13}(t)\end{array}$$

The solution to this equation is

$$\begin{array}{rl} &{p}_{1 }^{\varepsilon }(t) = {p}_{ 1}^{0}\exp \left (-{\int }_{0}^{t}\left (\frac{{\lambda }_{1}(s) + {\mu }_{1}(s)} {\varepsilon } + {\lambda }_{2}(s) + {\mu }_{2}(s)\right )ds\right ) \\ &\qquad +{ \int }_{0}^{t}\left (\frac{{\mu }_{1}(u)} {\varepsilon } {a}_{12}(u) + {\mu }_{2}(u){a}_{13}(u)\right ) \\ &\qquad \quad \times \exp \left (-{\int }_{u}^{t}\left (\frac{{\lambda }_{1}(s) + {\mu }_{1}(s)} {\varepsilon } + {\lambda }_{2}(s) + {\mu }_{2}(s)\right )ds\right )du\end{array}$$

Consequently, in view of (4.73), it follows that

$$\begin{array}{l} {p}_{2}^{\varepsilon }(t) = {a}_{12}(t) - {p}_{1}^{\varepsilon }(t), \\ {p}_{3}^{\varepsilon }(t) = {a}_{13}(t) - {p}_{1}^{\varepsilon }(t), \\ {p}_{4}^{\varepsilon }(t) = {a}_{24}(t) - {p}_{2}^{\varepsilon }(t)\end{array}$$

In this example, the zeroth-order term is given by

$${\varphi }_{0}(t) = ({\nu }^{1}(t){{\vartheta}}_{ 0}^{1}(t),{\nu }^{2}(t){{\vartheta}}_{ 0}^{2}(t)),$$

where the quasi-stationary distributions are given by

$${\nu }^{1}(t) = {\nu }^{2}(t) = \left ( \frac{{\mu }_{1}(t)} {{\lambda }_{1}(t) + {\mu }_{1}(t)}, \frac{{\lambda }_{1}(t)} {{\lambda }_{1}(t) + {\mu }_{1}(t)}\right ),$$

and the multipliers \(({{\vartheta}}_{0}^{1}(t),{{\vartheta}}_{0}^{2}(t))\) are determined by the differential equation

$$\frac{d} {dt}({{\vartheta}}_{0}^{1}(t),{{\vartheta}}_{ 0}^{2}(t)) = ({{\vartheta}}_{ 0}^{1}(t),{{\vartheta}}_{ 0}^{2}(t))\left (\begin{array}{cc} - {\lambda }_{2}(t)& {\lambda }_{2}(t) \\ {\mu }_{2}(t) & - {\mu }_{2}(t) \end{array} \right ),$$

with initial value \(({{\vartheta}}_{0}^{1}(0),{{\vartheta}}_{0}^{2}(0)) = ({p}_{1}^{0} + {p}_{2}^{0},{p}_{3}^{0} + {p}_{4}^{0})\).

The inner-expansion (initial-layer) term ψ0(τ) is determined by

$${ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0),\;{\psi }_{0}(0) = {p}^{0} - {\varphi }_{ 0}(0).$$

By virtue of Theorem  4.29,

$${p}^{\varepsilon }(t) - {\varphi }_{ 0}(t) - {\psi }_{0}\left ({ t \over \varepsilon } \right ) = O(\varepsilon ),$$

provided that Q ε(t) is continuously differentiable on [0, T]. Noting the exponential decay of ψ 0(t ∕ ε), we further have

$${p}^{\varepsilon }(t) = {\varphi }_{ 0}(t) + O\left(\varepsilon +\exp \left( -\frac{{\kappa }_{0}t} {\varepsilon } \right)\right).$$

In particular, for any t > 0,

$$ \lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon }(t) = {\varphi }_{ 0}(t).$$

Namely, φ 0(t) is the limit distribution of the Markov chain generated by Q ε(t).
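
As a numerical illustration of this limit (a sketch under hypothetical rates, assuming NumPy/SciPy; not part of the formal development), one can integrate the forward equation and compare p ε(t) with φ 0(t) = (ν 1(t)ϑ 0 1(t), ν 2(t)ϑ 0 2(t)) for decreasing ε:

```python
import numpy as np
from scipy.integrate import solve_ivp

# hypothetical rates; any smooth, positive, bounded choices will do
lam1 = lambda t: 2.0 + np.sin(t);  mu1 = lambda t: 1.0
lam2 = lambda t: 1.0;              mu2 = lambda t: 1.5 + 0.5 * np.cos(t)

def Q_eps(t, eps):
    # four-state generator: fast 1<->2, 3<->4 (rates /eps); slow 1<->3, 2<->4
    l1, m1, l2, m2 = lam1(t) / eps, mu1(t) / eps, lam2(t), mu2(t)
    return np.array([[-(l1 + l2), l1, l2, 0.0],
                     [m1, -(m1 + l2), 0.0, l2],
                     [m2, 0.0, -(l1 + m2), l1],
                     [0.0, m2, m1, -(m1 + m2)]])

def phi_0(t, p0):
    # outer term: phi_0(t) = (nu(t) theta^1(t), nu(t) theta^2(t))
    nu = np.array([mu1(t), lam1(t)]) / (lam1(t) + mu1(t))
    rhs = lambda s, th: th @ np.array([[-lam2(s), lam2(s)],
                                       [mu2(s), -mu2(s)]])
    th = solve_ivp(rhs, (0.0, t), [p0[0] + p0[1], p0[2] + p0[3]],
                   rtol=1e-10).y[:, -1]
    return np.concatenate([th[0] * nu, th[1] * nu])

p0, t = np.array([1.0, 0.0, 0.0, 0.0]), 1.0
for eps in [0.1, 0.01, 0.001]:
    p = solve_ivp(lambda s, y: y @ Q_eps(s, eps), (0.0, t), p0,
                  method="Radau", rtol=1e-10, atol=1e-12).y[:, -1]
    print(eps, np.abs(p - phi_0(t, p0)).max())   # decreases linearly in eps
```

Off the initial layer, the printed error decreases linearly in ε, consistent with the O(ε) bound above.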

4 Inclusion of Absorbing States

While the case of recurrent states was considered in the previous section, this section concerns the asymptotic expansion when the Markov chain generated by Q ε(t) is such that \(\widetilde{Q}(t)\) includes components corresponding to absorbing states. By rearrangement, the matrix \(\widetilde{Q}(t)\) takes the form

$$\widetilde{Q}(t) = \left (\begin{array}{*{10}c} \widetilde{{Q}}^{1}(t)& & & & \\ &\widetilde{{Q}}^{2}(t)& & &\\ & &\ddots && \\ & & &\widetilde{{Q}}^{l}(t)& \\ & & & &{0}_{{m}_{a}\times {m}_{a}}\\ \end{array} \right ),$$
(4.74)

where \(\widetilde{{Q}}^{k}(t) \in {\mathbb{R}}^{{m}_{k}\times {m}_{k}}\) for k = 1, 2, …, l, \({0}_{{m}_{a}\times {m}_{a}}\) is an m a ×m a zero matrix, and

$${m}_{1} + {m}_{2} + \cdots + {m}_{l} + {m}_{a} = m.$$

Let \({\mathcal{M}}_{a} =\{ {s}_{a1},\ldots,{s}_{a{m}_{a}}\}\) denote the set of absorbing states. We may, as in Section 4.3, represent the state space as

$$\begin{array}{rl} \mathcal{M}& = {\mathcal{M}}_{1} \cup \cdots \cup {\mathcal{M}}_{l} \cup {\mathcal{M}}_{a} \\ & = \{{s}_{11},\ldots,{s}_{1{m}_{1}},\ldots,{s}_{l1},\ldots,{s}_{l{m}_{l}},{s}_{a1},\ldots,{s}_{a{m}_{a}}\}\end{array}$$

Following the development of Section 4.3, suppose that αε ( ⋅) is a Markov chain generated by \({Q}^{\varepsilon }(\cdot ) =\widetilde{ Q}(\cdot )/\varepsilon +\widehat{ Q}(\cdot )\). Compared with Section 4.3, the difference is that now the dominant part in the generator includes absorbing states corresponding to the m a ×m a matrix \({0}_{{m}_{a}\times {m}_{a}}\). As in the previous case, our interest is to obtain an asymptotic expansion of the probability distribution.

Remark 4.33.

The motivation of the current study stems from the formulation of competitive risk theory discussed in Section 3.3. The idea is that within the m states, there are several groups. Some of them are much riskier than the others (in the sense of the frequency of occurrence of the corresponding risks). The different rates (sensitivity) of risks are modeled by the use of a small parameter ε > 0.

Denote by p ε ( ⋅) the solution of (4.40). The objective here is to obtain an asymptotic expansion

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) +\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right ).$$

Since the techniques employed are essentially the same as in the previous section, it will be most instructive here to highlight the main ideas. Thus, we only note the main steps and omit most of the details.

Assume conditions (A4.3) and (A4.4) for the current matrices \(\widetilde{{Q}}^{k}(t),\) \(\widetilde{Q}(t)\) , and \(\widehat{Q}(t)\) . For t ∈ [0, T], substituting the expansion above into (4.40) and equating coefficients of εi, for \(i = 1,\ldots,n + 1\), yields

$$\begin{array}{ll} &{\varphi }_{0}(t)\widetilde{Q}(t) = 0, \\ &{\varphi }_{i}(t)\widetilde{Q}(t) ={ d{\varphi }_{i-1}(t) \over dt} - {\varphi }_{i-1}(t)\widehat{Q}(t),\\ \end{array}$$
(4.75)

and (with the use of the stretched variable \(\tau = t/\varepsilon \))

$$\begin{array}{rl} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0), \\ &{ d{\psi }_{i}(\tau ) \over d\tau } = {\psi }_{i}(\tau )\widetilde{Q}(0) +\sum\limits_{j=0}^{i-1}{\psi }_{ i-j-1}(\tau ) \\ & \times \left({ {\tau }^{j} \over j!} { {d}^{j}\widehat{Q}(0) \over d{t}^{j}} +{ {\tau }^{j+1} \over (j + 1)!} { {d}^{j+1}\widetilde{Q}(0) \over d{t}^{j+1}} \right).\end{array}$$
(4.76)

For each i ≥ 0, we use the following notation for the partitioned vectors:

$$\begin{array}{rl} &{\varphi }_{i}(t) = ({\varphi }_{i}^{1}(t),\ldots,{\varphi }_{ i}^{l}(t),{\varphi }_{ i}^{a}(t)), \\ &{\psi }_{i}(\tau ) = ({\psi }_{i}^{1}(\tau ),\ldots,{\psi }_{ i}^{l}(\tau ),{\psi }_{ i}^{a}(\tau ))\end{array}$$

In the above φ i a(t) and ψ i a (τ) are vectors in \({\mathbb{R}}^{1\times {m}_{a}}\).

To determine the outer- and the initial-layer expansions, let us start with i = 0. For each t ∈ [0, T], the use of the partitioned vector φ 0(t) leads to

$${\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = 0,\mbox{ for }k = 1,\ldots,l.$$

Note that φ0 a(t) does not show up in any of these equations owing to the \({0}_{{m}_{a}\times {m}_{a}}\) matrix in \(\widetilde{Q}(t)\). It will have to be obtained from the equation in (4.75) corresponding to i = 1. Put another way, φ 0 a (t) is determined mainly by the matrix \(\widehat{Q}(t)\).

Similar to Section 4.3, \({\varphi }_{0}^{k}(t) = {{\vartheta}}_{0}^{k}(t){\nu }^{k}(t)\) , where ν k(t) are the quasi-stationary distributions corresponding to the generators \(\widetilde{{Q}}^{k}(t)\) for k = 1, …,  l and \({{\vartheta}}_{0}^{k}(t)\) are the corresponding multipliers. Define

$$\widetilde{\mathrm{1}{\mathrm{l}}}_{a} = \left (\begin{array}{*{10}c} \mathrm{1}{\mathrm{l}}_{{m}_{1}} & & & &\\ & \ddots & && \\ & & & \mathrm{1}{\mathrm{l}}_{{m}_{l}} & \\ & & & & {I}_{{m}_{a}}\\ \end{array} \right ),$$

where \({I}_{{m}_{a}}\) is an m a ×m a identity matrix. Clearly, \(\widetilde{\mathrm{1}{\mathrm{l}}}_{a}\) is orthogonal to \(\widetilde{Q}(t)\) for each t ∈ [0,  T]. As a result, multiplying (4.75) by \(\widetilde{\mathrm{1}{\mathrm{l}}}_{a}\) from the right with i = 1 leads to

$$\begin{array}{ll} &{ d{\varphi }_{0}(t) \over dt} \widetilde{\mathrm{1}{\mathrm{l}}}_{a} = {\varphi }_{0}(t)\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a}, \\ &({{\vartheta}}_{0}(0),{\varphi }_{0}^{a}(0)) = {p}^{0}\widetilde{\mathrm{1}{\mathrm{l}}}_{ a}, \end{array}$$
(4.77)

where \({{\vartheta}}_{0}(0) = ({{\vartheta}}_{0}^{1}(0),\ldots,{{\vartheta}}_{0}^{l}(0)).\)

The above initial condition is a consequence of the initial-value consistency condition in (4.53). It is readily seen that

$$\sum\limits_{k=1}^{l}{{\vartheta}}_{ 0}^{k}(0) = 1 - {\varphi }_{ 0}^{a}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{a}} = 1 - {p}^{0,a}\mathrm{1}{\mathrm{l}}_{{ m}_{a}},$$

where \({p}^{0} = ({p}^{0,1},\ldots,{p}^{0,l},{p}^{0,a})\).

We write

$${\varphi }_{0}(t) = ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t),{\varphi }_{ 0}^{a}(t))\mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t),{I}_{{ m}_{a}}).$$

Define

$$\overline{Q}(t) = \mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t),{I}_{{ m}_{a}})\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a}.$$
(4.78)

Then (4.77) is equivalent to

$$\begin{array}{rl} &{ d \over dt} ({{\vartheta}}_{0}(t),{\varphi }_{0}^{a}(t)) = ({{\vartheta}}_{ 0}(t),{\varphi }_{0}^{a}(t))\overline{Q}(t), \\ &({{\vartheta}}_{0}(0),{\varphi }_{0}^{a}(0)) = {p}^{0}\widetilde{\mathrm{1}{\mathrm{l}}}_{ a}\end{array}$$

This is a linear system of differential equations. Therefore it has a unique solution given by

$$({{\vartheta}}_{0}(t),{\varphi }_{0}^{a}(t)) = {p}^{0}\widetilde{\mathrm{1}{\mathrm{l}}}_{ a}X(t,0),$$

where X( t, 0) is the principal matrix solution of the homogeneous equation. Thus φ0(t) has been found and is ( n + 1)-times continuously differentiable.
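
Numerically, X(t, 0) can be obtained by integrating the matrix ODE directly. The sketch below (assuming NumPy/SciPy, with a hypothetical 3 × 3 generator \(\overline{Q}(t)\) standing in for (4.78) with l = 2 and m a  = 1) illustrates the computation of \(({{\vartheta}}_{0}(t),{\varphi }_{0}^{a}(t))\):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical aggregated generator Qbar(t) (rows sum to zero).
def Qbar(t):
    a = 1.0 + 0.5 * np.sin(t)
    return np.array([[-a, a, 0.0],
                     [0.5, -0.8, 0.3],
                     [0.2, 0.0, -0.2]])

n = 3
# X(t,0) solves dX/dt = X Qbar(t) with X(0,0) = I (principal matrix solution)
rhs = lambda t, x: (x.reshape(n, n) @ Qbar(t)).ravel()
sol = solve_ivp(rhs, (0.0, 2.0), np.eye(n).ravel(), rtol=1e-10)
X = sol.y[:, -1].reshape(n, n)
y0 = np.array([0.3, 0.4, 0.3])     # playing the role of p^0 1l~_a
print(y0 @ X)                      # = (theta_0(2), phi_0^a(2)); sums to 1
```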

Remark 4.34.

Note that in φ0(t), the term φ0 a(t) corresponds to the set of absorbing states \({\mathcal{M}}_{a}\). Clearly, these states cannot be aggregated into a single state as in the case of recurrent states. Nevertheless, the function φ0 a(t) settles near a constant for t large enough. To illustrate, let us consider a stationary case, that is, both \(\widetilde{Q}(t) =\widetilde{ Q}\) and \(\widehat{Q}(t) =\widehat{ Q}\) are independent of t. Partition \(\widehat{Q}\) into blocks of submatrices

$$\widehat{Q} = \left (\begin{array}{cc} \widehat{{Q}}^{11} & \widehat{{Q}}^{12} \\ \widehat{{Q}}^{21} & \widehat{{Q}}^{22}\\ \end{array} \right ),$$

where \(\widehat{{Q}}^{22}\) is an ma × ma matrix. Assume that the eigenvalues of \(\widehat{{Q}}^{22}\) have negative real parts. Then, in view of the definition of \(\overline{Q}(t) = \overline{Q}\) in (4.78), it follows that

$${\varphi }_{0}^{a}(t) \rightarrow \mbox{ a constant as }t \rightarrow \infty.$$

Using the partition ψ0(τ) = (ψ0 1(τ), …, ψ 0 l (τ), ψ 0 a (τ)), consider the zeroth-order initial-layer term given by

$$\begin{array}{rl} { d{\psi }_{0}(\tau ) \over d\tau } & ={ d \over d\tau } ({\psi }_{0}^{1}(\tau ),\ldots,{\psi }_{ 0}^{l}(\tau ),{\psi }_{ 0}^{a}(\tau )) \\ & = {\psi }_{0}(\tau )\widetilde{Q}(0) = ({\psi }_{0}^{1}(\tau )\widetilde{{Q}}^{1}(0),\ldots,{\psi }_{ 0}^{l}(\tau )\widetilde{{Q}}^{l}(0),{0}_{{ m}_{a}}).\end{array}$$

We obtain

$$\begin{array}{rl} &{\psi }_{0}^{k}(\tau ) = {\psi }_{ 0}^{k}(0)\exp (\widetilde{{Q}}^{k}(0)\tau ),\mbox{ for }k = 1,\ldots,l,\mbox{ and } \\ &{\psi }_{0}^{a}(\tau ) = \mbox{ constant.} \end{array}$$

Noting that p 0, a = φ0 a (0) and choosing \({\psi }_{0}(0) = {p}^{0} - {\varphi }_{0}(0)\) lead to \({\psi }_{0}^{a}(\tau ) = {0}_{{m}_{a}}.\) Thus

$${\psi }_{0}(\tau ) = ({\psi }_{0}^{1}(0)\exp (\widetilde{{Q}}^{1}(0)\tau ),\ldots,{\psi }_{ 0}^{l}(0)\exp (\widetilde{{Q}}^{l}(0)\tau ),{0}_{{ m}_{a}}).$$

Similar to the result in Section 4.3, the following lemma holds. The proof is analogous to that of Proposition  4.25.

Lemma 4.35.

Define

$${\pi }_{a} = \mbox{ diag}(\mathrm{1}{\mathrm{l}}_{{m}_{1}}{\nu }^{1}(0),\ldots,\mathrm{1}{\mathrm{l}}_{{ m}_{l}}{\nu }^{l}(0),{I}_{{ m}_{a}}).$$

Then there exist positive constants K and κ0,0 such that

$$\vert \exp (\widetilde{Q}(0)\tau ) - {\pi }_{a}\vert \leq K\exp (-{\kappa }_{0,0}\tau ).$$

By virtue of the lemma above and the orthogonality \(({p}^{0} - {\varphi }_{0}(0)){\pi }_{a} = 0\), we have

$$\begin{array}{rl} \vert {\psi }_{0}(\tau )\vert & = \vert ({p}^{0} - {\varphi }_{ 0}(0))(\exp (\widetilde{Q}(0)\tau ) - {\pi }_{a})\vert \\ &\leq K\exp (-{\kappa }_{0,0}\tau ) \end{array}$$

for some K > 0 and κ 0, 0  > 0 given in Lemma  4.35; that is, ψ0 (τ) decays exponentially fast. Therefore, ψ 0 (τ) has the desired property.
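
A quick numerical check of the exponential decay in Lemma 4.35 (a sketch assuming NumPy/SciPy, with a hypothetical \(\widetilde{Q}(0)\) consisting of one weakly irreducible 2 × 2 block and one absorbing state):

```python
import numpy as np
from scipy.linalg import expm

# Q~(0): one recurrent block and one absorbing state
Qt = np.array([[-1.0, 1.0, 0.0],
               [1.0, -1.0, 0.0],
               [0.0, 0.0, 0.0]])
pi_a = np.array([[0.5, 0.5, 0.0],
                 [0.5, 0.5, 0.0],
                 [0.0, 0.0, 1.0]])   # diag(1l nu^1(0), I_{m_a})
for tau in [1.0, 2.0, 4.0, 8.0]:
    gap = np.abs(expm(Qt * tau) - pi_a).max()
    print(tau, gap)   # decays like exp(-2 tau): the spectral gap of Q~^1(0)
```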

We continue in this fashion and proceed to determine the next term φ1(t) as well as ψ1(t ∕ ε). Let

$$\begin{array}{rl} &{b}_{0}(t) ={ d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)\widehat{Q}(t)\quad \mbox{ with } \\ &{b}_{0}(t) = ({b}_{0}^{1}(t),\ldots,{b}_{ 0}^{l}(t),{b}_{ 0}^{a}(t))\end{array}$$

It is easy to check that \({b}_{0}^{a}(t) = {0}_{{m}_{a}}\) . The equation \({\varphi }_{1}(t)\widetilde{Q}(t) = {b}_{0}(t)\) then leads to

$$\begin{array}{ll} &{\varphi }_{1}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t),\mbox{ for }k = 1,\ldots,l, \\ &{b}_{0}^{a}(t) = {0}_{{ m}_{a}}.\end{array}$$
(4.79)

The solutions of the l inhomogeneous equations in (4.79) above are of the form

$${\varphi }_{1}^{k}(t) = {{\vartheta}}_{ 1}^{k}(t){\nu }^{k}(t) +\widetilde{ {b}}_{ 0}^{k}(t),\ k = 1,\ldots,l,$$

where \({{\vartheta}}_{1}^{k}(t)\) for k = 1, …, l are scalar multipliers. Again, φ 1 a (t) cannot be obtained from the equation above; it must come from the contribution of the matrix-valued function \(\widehat{Q}(t)\).

Note that

$$\widetilde{{b}}_{0}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t)\quad \mbox{ and }\quad \widetilde{{b}}_{ 0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0.$$

Using the equation

$${\varphi }_{2}(t)\widetilde{Q}(t) ={ d{\varphi }_{1}(t) \over dt} - {\varphi }_{1}(t)\widehat{Q}(t),$$

one obtains

$$0 = {\varphi }_{2}(t)\widetilde{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a} ={ d{\varphi }_{1}(t) \over dt} \widetilde{\mathrm{1}{\mathrm{l}}}_{a} - {\varphi }_{1}(t)\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a},$$

which in turn implies that

$$\begin{array}{l} { d \over dt} ({{\vartheta}}_{1}(t),{\varphi }_{1}^{a}(t)) = ({{\vartheta}}_{ 1}(t),{\varphi }_{1}^{a}(t))\overline{Q}(t) \\ + (\widetilde{{b}}_{0}(t),{0}_{{m}_{a}})\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{a}, \end{array}$$
(4.80)

where

$${{\vartheta}}_{1}(t) = ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t))\mbox{ and }\widetilde{{b}}_{ 0}(t) = (\widetilde{{b}}_{0}^{1}(t),\ldots,\widetilde{{b}}_{ 0}^{l}(t)).$$

Let X(t,  s) denote the principal matrix solution to the homogeneous differential equation

$$\frac{dy(t)} {dt} = y(t)\overline{Q}(t).$$

Then the solution to (4.80) can be represented by X( t, s) as follows:

$$\begin{array}{rl} ({{\vartheta}}_{1}(t),{\varphi }_{1}^{a}(t))& = ({{\vartheta}}_{1}(0),{\varphi }_{1}^{a}(0))X(t,0) \\ &\qquad +{ \int }_{0}^{t}(\widetilde{{b}}_{ 0}(s),{0}_{{m}_{a}})\widehat{Q}(s)\widetilde{\mathrm{1}{\mathrm{l}}}_{a}X(t,s)ds\end{array}$$

Note that the initial conditions φ 1 a (0) and \({{\vartheta}}_{1}^{k}(0)\) for k = 1, …, l need to be determined using the initial-layer terms just as in Section 4.3.

Using (4.76) with i = 1, one obtains an equation that has the same form as that of (4.62). That is,

$$\begin{array}{rl} {\psi }_{1}(\tau )& = {\psi }_{1}(0)\exp (\widetilde{Q}(0)\tau ) \\ &\quad +{ \int }_{0}^{\tau }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\widehat{Q}(0)\exp (\widetilde{Q}(0)(\tau - s))ds \\ &\quad +{ \int }_{0}^{\tau }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s){ d\widetilde{Q}(0) \over dt} \exp (\widetilde{Q}(0)(\tau - s))ds\end{array}$$

As in Section 4.3, with the use of π a , it can be shown that | ψ1 (τ) | ≤ Kexp( − κ1, 0τ) for some K > 0 and 0 < κ 1, 0  < κ 0, 0 . By requiring that ψ 1 (τ) decay to 0 as τ → ∞, we obtain the equation

$${\psi }_{1}(0){\pi }_{a} = -{\overline{\psi }}_{0}{\pi }_{a},$$
(4.81)

where

$${\overline{\psi }}_{0} ={ \int }_{0}^{\infty }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\widehat{Q}(0).$$

Owing to (4.81) and the known form of ψ0(τ),

$$\begin{array}{rl} {\overline{\psi }}_{0} & = ({\overline{\psi }}_{0}^{1},\ldots,{\overline{\psi }}_{ 0}^{l},{\overline{\psi }}_{ 0}^{a}) \\ & = ({p}^{0,1} - {\varphi }_{ 0}^{1}(0),\ldots,{p}^{0,l} - {\varphi }_{ 0}^{l}(0),{0}_{{ m}_{a}})\left ({\int }_{0}^{\infty }\exp (\widetilde{Q}(0)s)ds\right )\widehat{Q}(0), \end{array}$$

which is a completely known vector. Thus the solution to (4.81) is

$${\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = -{\overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\mbox{ for }k = 1,\ldots,l,\mbox{ and }{\psi }_{1}^{a}(0) = -{\overline{\psi }}_{ 0}^{a}.$$

To obtain the desired matching property for the inner-outer expansions, choose

$$\begin{array}{rl} &{{\vartheta}}_{1}^{k}(0) = -{\psi }_{ 1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\mbox{ for }k = 1,\ldots,l, \\ &{\varphi }_{1}^{a}(0) = -{\psi }_{ 1}^{a}(0) ={ \overline{\psi }}_{ 0}^{a}\end{array}$$

In general, for i = 2, …, n, the initial conditions are selected as follows: For k = 1, 2, …, l, find \({\psi }_{i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) from the equation

$${\psi }_{i}(0){\pi }_{a} = -\left(\;\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{ {s}^{j} \over j!} {\psi }_{i-j-1}(s)ds{ {d}^{j}\widehat{Q}(0) \over d{t}^{j}} \right){\pi }_{a} := -{\overline{\psi }}_{i-1}{\pi }_{a}.$$

Choose

$${{\vartheta}}_{i}^{k}(0) = -{\psi }_{ i}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} ={ \overline{\psi }}_{i-1}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$

for k = 1, …, l,

$${\varphi }_{i}^{a}(0) = -{\overline{\psi }}_{ i-1}^{a},\quad \mbox{ and }{\psi }_{ i}(0) = -{\varphi }_{i}(0).$$

Proceeding inductively, we then construct all φ i (t) and ψ i (τ). Moreover, we can verify that there exists 0 < κ i, 0  < κ i − 1, 0  < κ 0, 0 such that | ψ i (τ) | ≤  Kexp( − κ i, 0 τ). This indicates that the inclusion of absorbing states is very similar to the case of all recurrent states. In the zeroth-order outer expansion, there is a component φ0 a(t) that “takes care of” the absorbing states. Note, however, that starting from the leading term (zeroth-order approximation), the matching will be determined not only by the multipliers \({{\vartheta}}_{i}(0)\) but also by the vector ψ i (0) associated with the absorbing states. We summarize the results in the following theorem.

Theorem 4.36.

Consider \(\widetilde{Q}(t)\) given by (4.74), and suppose conditions (A4.3) and (A4.4) are satisfied for the matrix-valued functions \(\widetilde{{Q}}^{k}(\cdot )\) for k = 1,…,l and \(\widehat{Q}(\cdot )\). An asymptotic expansion

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}\left ({\varepsilon }^{i}{\varphi }_{ i}(t) + {\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )\right )$$

exists such that

  • φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T];

  •  | ψ i (t) | ≤ Kexp( − κ 0 t) for some K > 0 and 0 < κ 0  < κ i, 0;

  • \(\vert {p}^{\varepsilon }(t) - {y}_{n}^{\varepsilon }(t)\vert = O({\varepsilon }^{n+1})\mbox{ uniformly in }t \in [0,T].\)

Finally, we give a simple example to illustrate the result.

Example 4.37.

Let us consider a Markov chain generated by

$${Q}^{\varepsilon } = \frac{1} {\varepsilon }\widetilde{Q} +\widehat{ Q},$$

where

$$\widetilde{Q} = \left (\begin{array}{*{10}c} -1& 1 &0\\ 1 &-1 &0 \\ 0 & 0 &0\\ \end{array} \right )\mbox{ and }\widehat{Q} = \left (\begin{array}{*{10}c} 0&0& 0\\ 0 &0 & 0 \\ 1&0&-1\\ \end{array} \right ).$$

The chain generated by \(\widetilde{Q}\) is not irreducible; it includes an absorbing state. In this example, \(\overline{Q} = \left (\begin{array}{*{10}c} 0& 0\\ 1 &-1\\ \end{array} \right )\). Let p0 = (p1 0,p2 0,p0,a) denote the initial distribution of αε(⋅). Then solving the forward equation (4.40) gives us

$${p}^{\varepsilon }(t) = ({p}_{ 1}^{\varepsilon }(t),{p}_{ 2}^{\varepsilon }(t),{p}_{ 3}^{\varepsilon }(t)),$$

where

$$\begin{array}{l} {p}_{1}^{\varepsilon }(t) = \frac{{p}_{1}^{0} + {p}_{ 2}^{0} + {p}^{0,a}} {2} \\ \quad \quad -\left (\frac{-{p}_{1}^{0} + {p}_{2}^{0} - {p}^{0,a}} {2} + \frac{{p}^{0,a}} {2 - \varepsilon }\right )\exp \left( -\frac{2t} {\varepsilon } \right) -\left(\frac{(1 - \varepsilon ){p}^{0,a}} {2 - \varepsilon } \right)\exp (-t), \\ {p}_{2}^{\varepsilon }(t) = \frac{{p}_{1}^{0} + {p}_{ 2}^{0} + {p}^{0,a}} {2} \\ \quad \quad + \left (\frac{-{p}_{1}^{0} + {p}_{2}^{0} - {p}^{0,a}} {2} + \frac{{p}^{0,a}} {2 - \varepsilon }\right )\exp \left( -\frac{2t} {\varepsilon } \right) -\left( \frac{{p}^{0,a}} {2 - \varepsilon }\right)\exp (-t), \\ {p}_{3}^{\varepsilon }(t) = {p}^{0,a}\exp (-t)\end{array}$$

Computing φ0(t) yields

$$\begin{array}{rl} {\varphi }_{0}(t)& = \left (\frac{{p}_{1}^{0} + {p}_{2}^{0} + {p}^{0,a}} {2}, \frac{{p}_{1}^{0} + {p}_{2}^{0} + {p}^{0,a}} {2},0\right ) \\ &\qquad + \left (-\frac{{p}^{0,a}} {2},-\frac{{p}^{0,a}} {2},{p}^{0,a}\right )\exp (-t)\end{array}$$

It is easy to see that for t > 0,

$$ \lim\limits_{\varepsilon \rightarrow 0}\vert {p}^{\varepsilon }(t) - {\varphi }_{ 0}(t)\vert = 0.$$

The limit behavior of the underlying Markov chain as ε → 0 is determined by φ0(t) (for t > 0). Moreover, when t is large, the influence from \(\widehat{Q}\) corresponding to the absorbing state (the vector multiplied by exp (−t)) can be ignored because exp (−t) goes to 0 exponentially fast as t →∞.
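
The computations of this example are easy to reproduce numerically. The sketch below (assuming NumPy/SciPy; the initial distribution is an arbitrary choice) evaluates p ε(t) = p 0exp(Q εt) and compares it with the φ 0(t) displayed above:

```python
import numpy as np
from scipy.linalg import expm

Qt = np.array([[-1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 0.0]])
Qh = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, -1.0]])
p0 = np.array([0.2, 0.3, 0.5])             # (p_1^0, p_2^0, p^{0,a})
t = 1.0
s = (p0[0] + p0[1] + p0[2]) / 2.0           # = 1/2
phi0 = np.array([s, s, 0.0]) + p0[2] * np.exp(-t) * np.array([-0.5, -0.5, 1.0])
for eps in [0.1, 0.01, 0.001]:
    p_eps = p0 @ expm((Qt / eps + Qh) * t)  # constant generators: use expm
    print(eps, np.abs(p_eps - phi0).max())  # error shrinks like eps
```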

5 Inclusion of Transient States

If a Markov chain has transient states, then, relabeling the states through suitable permutations, one can decompose the states into several groups of recurrent states, each of which is weakly irreducible, and a group of transient states. Naturally, we consider the generator \(\widetilde{Q}(t)\) in Q ε(t) having the form

$$\widetilde{Q}(t) = \left (\begin{array}{cccc} \widetilde{{Q}}^{1}(t) & & & \\ & \ddots & & \\ & & \widetilde{{Q}}^{l}(t) & \\ \widetilde{{Q}}_{{_\ast}}^{1}(t)&\cdots &\widetilde{{Q}}_{{_\ast}}^{l}(t)&\widetilde{{Q}}_{{_\ast}}(t)\\ \end{array} \right )$$
(4.82)

such that for each t ∈ [0, T], and each k = 1, …,  l, \(\widetilde{{Q}}^{k}(t)\) is a generator with dimension m k ×m k , \(\widetilde{{Q}}_{{_\ast}}(t)\) is an m  ∗  × m  ∗  matrix, \(\widetilde{{Q}}_{{_\ast}}^{k}(t) \in {\mathbb{R}}^{{m}_{{_\ast}}\times {m}_{k}}\), and

$${m}_{1} + {m}_{2} + \cdots + {m}_{l} + {m}_{{_\ast}} = m.$$

We continue our study of singularly perturbed chains with weak and strong interactions by incorporating the transient states into the model. Let αε( ⋅) be a Markov chain generated by Q ε ( ⋅), with \({Q}^{\varepsilon }(t) \in {\mathbb{R}}^{m\times m}\) given by (4.39) with \(\widetilde{Q}(t)\) given by (4.82). The state space of the underlying Markov chain is given by

$$\mathcal{M} = {\mathcal{M}}_{1} \cup \cdots \cup {\mathcal{M}}_{l} \cup {\mathcal{M}}_{{_\ast}}$$

where \({\mathcal{M}}_{k} =\{ {s}_{k1},\ldots,{s}_{k{m}_{k}}\}\) are the states corresponding to the recurrent states and \({\mathcal{M}}_{{_\ast}} =\{ {s}_{{_\ast}1},\ldots,{s}_{{_\ast}{m}_{{_\ast}}}\}\) are those corresponding to the transient states.

Since \(\widetilde{Q}(t)\) is a generator, for each k = 1, …,  l, \(\widetilde{{Q}}^{k}(t)\) is a generator. Thus the matrix \(\widetilde{{Q}}_{{_\ast}}^{k}(t) = (\widetilde{{q}}_{{_\ast},ij}^{k})\) satisfies \(\widetilde{{q}}_{{_\ast},ij}^{k} \geq 0\) for each i = 1, …, m  ∗  and j = 1, …,  m k , and \(\widetilde{{Q}}_{{_\ast}}(t) = (\widetilde{{q}}_{{_\ast},ij})\) satisfies

$$\widetilde{{q}}_{{_\ast},ij}(t) \geq 0\mbox{ for }i\neq j,\widetilde{{q}}_{{_\ast},ii}(t) < 0,\mbox{ and }\widetilde{{q}}_{{_\ast},ii}(t) \leq -\sum\limits_{j\neq i}\widetilde{{q}}_{{_\ast},ij}(t).$$

Roughly, the block matrix \((\widetilde{{Q}}_{{_\ast}}^{1}(t),\ldots,\widetilde{{Q}}_{{_\ast}}^{l}(t),\widetilde{{Q}}_{{_\ast}}(t))\) is “negatively dominated” by the matrix \(\widetilde{{Q}}_{{_\ast}}(t)\) . Thus it is natural to assume that \(\widetilde{{Q}}_{{_\ast}}(t)\) is a stable matrix (or Hurwitz, i.e., all its eigenvalues have negative real parts). Comparing with the setups of Sections 4.3 and 4.4, the difference in \(\widetilde{Q}(t)\) is the additional matrices \(\widetilde{{Q}}_{{_\ast}}^{k}(t)\) for k = 1, …,  l and \(\widetilde{{Q}}_{{_\ast}}(t)\). Note that \(\widetilde{{Q}}_{{_\ast}}^{k}(t)\) are nonsquare matrices, and \(\widetilde{Q}(t)\) no longer has block-diagonal form.

The formulation here is inspired by the work of Phillips and Kokotovic [175] and Delebecque and Quadrat [44]; see also the recent work of Pan and Başar [164], in which the authors treated a time-invariant \(\widetilde{Q}\) matrix of a similar form. Sections 4.3 and 4.4 together with this section essentially cover the generators of finite-state Markov chains of the most practical concern. It ought to be pointed out that, just as one cannot in general simultaneously diagonalize two matrices, for Markov chains with weak and strong interactions one cannot put both \(\widetilde{Q}(t)\) and \(\widehat{Q}(t)\) into the forms mentioned above simultaneously. Although the model to be studied in this section is slightly more complex than the block-diagonal \(\widetilde{Q}(t)\) in (4.41), we demonstrate that an asymptotic expansion of the probability distribution can still be obtained by using the same techniques as in the previous sections. Moreover, it can be seen from the expansion that the underlying Markov chain stays in the transient states only with very small probability. In some cases, for example \(\widehat{Q}(t) = 0\), these transient states can be ignored; see Remark  4.40 for more details.

To incorporate the transient states, we need the following conditions. The main addition is the assumption that \(\widetilde{{Q}}_{{_\ast}}(t)\) is stable.

    • For each t ∈ [0,  T] and k = 1, …, l, \(\widetilde{Q}(t),\) \(\widehat{Q}(t)\) , and \(\widetilde{{Q}}^{k}(t)\) satisfy (A4.3) and (A4.4).

    • For each t ∈ [0, T], \(\widetilde{{Q}}_{{_\ast}}(t)\) is Hurwitz (i.e., all of its eigenvalues have negative real parts).

Remark 4.38.

Condition (A4.6) indicates the inclusion of transient states. Since \(\widetilde{{Q}}_{{_\ast}}(t)\) is Hurwitz, it is nonsingular. Thus the inverse matrix \(\widetilde{{Q}}_{{_\ast}}^{-1}(t)\) exists for each t ∈ [0,T].

Let p ε( ⋅) denote the solution to (4.40) with \(\widetilde{Q}(t)\) specified in (4.82). We seek asymptotic expansions of p ε( ⋅) having the form

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\varphi }_{ i}(t) +\sum\limits_{i=0}^{n}{\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right ).$$

The development is very similar to that of Section 4.3, so no attempt is made to give verbatim details. Instead, only the salient features will be brought out.

Substituting y n ε(t) into the forward equation and equating coefficients of εi for i = 1, …, n lead to the equations

$$\begin{array}{ll} &{\varphi }_{0}(t)\widetilde{Q}(t) = 0, \\ &{\varphi }_{i}(t)\widetilde{Q}(t) ={ d{\varphi }_{i-1}(t) \over dt} - {\varphi }_{i-1}(t)\widehat{Q}(t),\\ \end{array}$$
(4.83)

and with the change of time scale \(\tau = t/\varepsilon \),

$$\begin{array}{rl} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0), \\ &{ d{\psi }_{i}(\tau ) \over d\tau } = {\psi }_{i}(\tau )\widetilde{Q}(0) +\sum\limits_{j=0}^{i-1}{\psi }_{ i-j-1}(\tau ) \\ & \times \left({ {\tau }^{j} \over j!} { {d}^{j}\widehat{Q}(0) \over d{t}^{j}} +{ {\tau }^{j+1} \over (j + 1)!} { {d}^{j+1}\widetilde{Q}(0) \over d{t}^{j+1}} \right).\end{array}$$
(4.84)

As far as the expansions are concerned, the equations have exactly the same form as that of Section 4.3. Note, however, that the partitioned vector φ i (t) has the form

$${\varphi }_{i}(t) = ({\varphi }_{i}^{1}(t),\ldots,{\varphi }_{ i}^{l}(t),{\varphi }_{ i}^{{_\ast}}(t)),\;i = 0,1,\ldots,n,$$

where φ i k(t), k = 1, …, l, is an m k row vector and φ i  ∗ (t) is an m  ∗  row vector. A similar partition holds for the vector ψ i (t). To construct these functions, we begin with i = 0. Writing \({\varphi }_{0}(t)\widetilde{Q}(t) = 0\) in terms of the corresponding partition, we have

$$\begin{array}{rl} &{\varphi }_{0}^{k}(t)\widetilde{{Q}}^{k}(t) + {\varphi }_{ 0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{k}(t) = 0,\mbox{ for }k = 1,\ldots,l,\mbox{ and } \\ &{\varphi }_{0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}(t) = 0.\end{array}$$

Since \(\widetilde{{Q}}_{{_\ast}}(t)\) is stable, it is nonsingular. The last equation above implies \({\varphi }_{0}^{{_\ast}}(t) = {0}_{{m}_{{_\ast}}} = (0,\ldots,0) \in {\mathbb{R}}^{1\times {m}_{{_\ast}}}\). Consequently, as in the previous section, for each k = 1, …, l, the weak irreducibility of \(\widetilde{{Q}}^{k}(t)\) implies that \({\varphi }_{0}^{k}(t) = {{\vartheta}}_{0}^{k}(t){\nu }^{k}(t)\), for some scalar function \({{\vartheta}}_{0}^{k}(t)\). Equivalently,

$${\varphi }_{0}(t) = ({{\vartheta}}_{0}^{1}(t){\nu }^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t){\nu }^{l}(t),{0}_{{ m}_{{_\ast}}}).$$

Comparing the equation above with the corresponding expression of φ0(t) in Section 4.3, the only difference is the addition of the m  ∗ -dimensional row vector \({0}_{{m}_{{_\ast}}}\).

Remark 4.39.

Note that the dominant term in the asymptotic expansion is φ0(t), in which the probabilities corresponding to the transient states are 0. Thus, the probability corresponding to αε(t) ∈{ transient states } is negligibly small.

Define

$$\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) = \left (\begin{array}{cccc} \mathrm{1}{\mathrm{l}}_{{m}_{1}} & & & \\ & \ddots & & \\ & & \mathrm{1}{\mathrm{l}}_{{m}_{l}} & \\ {a}_{{m}_{1}}(t)&\cdots &{a}_{{m}_{l}}(t)&{0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}} \end{array} \right )$$
(4.85)

where \({a}_{{m}_{k}}(t) = -\widetilde{{Q}}_{{_\ast}}^{-1}(t)\widetilde{{Q}}_{{_\ast}}^{k}(t)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) for k = 1, …, l, and \({0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}}\) is the zero matrix in \({\mathbb{R}}^{{m}_{{_\ast}}\times {m}_{{_\ast}}}\).

It is readily seen that

$$\widetilde{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) = 0\mbox{ for each }t \in [0,T].$$
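
This orthogonality is easy to verify numerically; the sketch below (assuming NumPy) does so for the constant \(\widetilde{Q}\) of Example 4.46 later in this section, where l = 1 and m  ∗  = 2:

```python
import numpy as np

Qt = np.array([[-1.0, 1.0, 0.0, 0.0],
               [1.0, -1.0, 0.0, 0.0],
               [1.0, 0.0, -2.0, 1.0],
               [0.0, 1.0, 1.0, -2.0]])
Qs, Qs1 = Qt[2:, 2:], Qt[2:, :2]           # Q~_* and Q~_*^1
a1 = -np.linalg.solve(Qs, Qs1 @ np.ones(2))  # a_{m_1} = -Q~_*^{-1} Q~_*^1 1l
ones_star = np.zeros((4, 3))                 # 1l~_* in R^{m x (l + m_*)}
ones_star[:2, 0] = 1.0
ones_star[2:, 0] = a1
print(np.abs(Qt @ ones_star).max())          # = 0, i.e., Q~ 1l~_* = 0
```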

In view of (4.83), it follows that

$$\begin{array}{l} { d \over dt} ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t),{0}_{{ m}_{{_\ast}}}) \\ \quad \quad = ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t),{0}_{{ m}_{{_\ast}}})\overline{Q}(t), \end{array}$$
(4.86)

where

$$\overline{Q}(t) = \mbox{ diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t),{0}_{{ m}_{{_\ast}}\times {m}_{{_\ast}}})\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t).$$

We write \(\widehat{Q}(t)\) as follows:

$$\widehat{Q}(t) = \left (\begin{array}{cc} \widehat{{Q}}^{11}(t)&\widehat{{Q}}^{12}(t) \\ \widehat{{Q}}^{21}(t)&\widehat{{Q}}^{22}(t)\\ \end{array} \right ),$$

where for each t ∈ [0, T],

$$\begin{array}{rl} &\widehat{{Q}}^{11}(t) \in {\mathbb{R}}^{(m-{m}_{{_\ast}})\times (m-{m}_{{_\ast}})},\;\widehat{{Q}}^{12}(t) \in {\mathbb{R}}^{(m-{m}_{{_\ast}})\times {m}_{{_\ast}} }, \\ &\widehat{{Q}}^{21}(t) \in {\mathbb{R}}^{{m}_{{_\ast}}\times (m-{m}_{{_\ast}})},\mbox{ and }\widehat{{Q}}^{22}(t) \in {\mathbb{R}}^{{m}_{{_\ast}}\times {m}_{{_\ast}} }\end{array}$$

Let

$${\overline{Q}}_{{_\ast}}(t) = \mathrm{diag}({\nu }^{1}(t),\ldots,{\nu }^{l}(t))\left (\widehat{{Q}}^{11}(t)\widetilde{\mathrm{1}\mathrm{l}} +\widehat{ {Q}}^{12}(t)({a}_{{ m}_{1}}(t),\ldots,{a}_{{m}_{l}}(t))\right ).$$

Then \(\overline{Q}(t) = \mathrm{diag}({\overline{Q}}_{{_\ast}}(t),{0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}})\). Moreover, the differential equation (4.86) becomes

$${ d \over dt} ({{\vartheta}}_{0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t)) = ({{\vartheta}}_{ 0}^{1}(t),\ldots,{{\vartheta}}_{ 0}^{l}(t)){\overline{Q}}_{ {_\ast}}(t).$$

Remark 4.40.

Note that the submatrix \(\widehat{{Q}}^{12}(t)\) in \(\widehat{Q}(t)\) determines the jump rates of the underlying Markov chain from a recurrent state in \({\mathcal{M}}_{1} \cup \cdots \cup {\mathcal{M}}_{l}\) to a transient state in \({\mathcal{M}}_{{_\ast}}\). If the magnitude of the entries of \(\widehat{{Q}}^{12}(t)\) is small, then the transient states can be safely ignored because the contribution of \(\widehat{{Q}}^{12}(t)\) to \(\overline{Q}(t)\) is small. On the other hand, if \(\widehat{{Q}}^{12}(t)\) is not negligible, then one has to be careful to include the corresponding terms in \(\overline{Q}(t)\).

We now determine the initial value \({{\vartheta}}_{0}^{k}(0)\). In view of the asymptotic expansions y n ε(t) and the initial-value consistency condition in (4.53), it is necessary that for k = 1, …, l,

$${{\vartheta}}_{0}^{k}(0) = {\varphi }_{ 0}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} =\lim\limits_{\delta \rightarrow 0}\lim\limits_{\varepsilon \rightarrow 0}{p}^{\varepsilon,k}(\delta )\mathrm{1}{\mathrm{l}}_{{ m}_{k}},$$
(4.87)

where p ε(t) = (p ε, 1(t), …, p ε, l(t), p ε, ∗ (t)) is a solution to (4.40). Here p ε, k(t) has dimensions compatible with φ0 k(0) and ψ0 k(0). Similarly, we write the partition of the initial vector as p 0 = (p 0, 1, …, p 0, l, p 0, ∗ ). The next theorem establishes the desired consistency of the initial values. Its proof is placed in Appendix A.4.

Theorem 4.41.

Assume (A4.5) and (A4.6) . Then for k = 1,…,l,

$$ \lim\limits_{\delta \rightarrow 0}\left(\limsup\limits_{\varepsilon \rightarrow 0}\left \vert {p}^{\varepsilon,k}(\delta )\mathrm{1}{\mathrm{l}}_{{ m}_{k}} -\left ({p}^{0,k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} - {p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)\widetilde{{Q}}_{ {_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\right )\right \vert \right) = 0.$$

Remark 4.42.

In view of this theorem, the initial value should be given as

$${{\vartheta}}_{0}^{k}(0) = {p}^{0,k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} - {p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)\widetilde{{Q}}_{ {_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}.$$
(4.88)

Therefore, in view of (4.88), to make sure that the initial condition satisfies the probabilistic interpretation, it is necessary that

$${{\vartheta}}_{0}^{k}(t) \geq 0\mbox{ for }t \in [0,T]\mbox{ and }k = 1,\ldots,l\mbox{ and }\sum\limits_{k=1}^{l}{{\vartheta}}_{ 0}^{k}(0) = 1.$$

In view of the structure of the \(\widetilde{Q}(0)\) matrix, for each k = 1,…,l, all components of the vector \(\widetilde{{Q}}_{{_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{m}_{k}}\) are nonnegative. Note that the solution of the differential equation

$$\begin{array}{rl} &{ dy(t) \over dt} = y(t)\widetilde{Q}(0), \\ &y(0) = {p}^{0} \end{array}$$

is \({p}^{0}\exp (\widetilde{Q}(0)t)\). This implies that all components of \({p}^{0,{_\ast}}\exp (\widetilde{{Q}}_{{_\ast}}(0)t)\) are nonnegative. By virtue of the stability of \(\widetilde{{Q}}_{{_\ast}}(0)\),

$$-\widetilde{{Q}}_{{_\ast}}^{-1}(0) ={ \int }_{0}^{\infty }\exp (\widetilde{{Q}}_{ {_\ast}}(0)t)dt.$$

Thus all components of \(-{p}^{0,{_\ast}}\widetilde{{Q}}_{{_\ast}}^{-1}(0)\) are nonnegative, and as a result, the inner product

$$-{p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)\widetilde{{Q}}_{ {_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}$$

is nonnegative. It follows that for each k = 1,…,l, \({{\vartheta}}_{0}^{k}(0) \geq {p}^{0,k}\mathrm{1}{\mathrm{l}}_{{m}_{k}} \geq 0\). Moreover,

$$\begin{array}{ll} \sum\limits_{k=1}^{l}{{\vartheta}}_{ 0}^{k}(0)& =\sum\limits_{k=1}^{l}{p}^{0,k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} - {p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)\left(\;\sum\limits_{k=1}^{l}\widetilde{{Q}}_{ {_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}\right) \\ & = (1 - {p}^{0,{_\ast}}\mathrm{1}{\mathrm{l}}_{{ m}_{{_\ast}}}) - {p}^{0,{_\ast}}\widetilde{{Q}}_{ {_\ast}}^{-1}(0)(-\widetilde{{Q}}_{ {_\ast}}(0)\mathrm{1}{\mathrm{l}}_{{m}_{{_\ast}}}) = 1.\end{array}$$
(4.89)
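
As a concrete numerical check of (4.88) and (4.89) (a sketch assuming NumPy, using the matrices of Example 4.46 below with an arbitrary initial distribution):

```python
import numpy as np

Qs = np.array([[-2.0, 1.0], [1.0, -2.0]])   # Q~_*(0), Hurwitz
Qs1 = np.eye(2)                             # Q~_*^1(0)
ones = np.ones(2)
p0 = np.array([0.1, 0.2, 0.3, 0.4])         # (p^{0,1}, p^{0,*}), arbitrary
theta0 = p0[:2] @ ones - p0[2:] @ np.linalg.solve(Qs, Qs1 @ ones)
print(theta0)   # = 1.0 since l = 1: nonnegative and sums to one, cf. (4.89)
```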

Before treating the terms in ψ0( ⋅), let us give an estimate on \(\exp (\widetilde{Q}(0)t)\).

Lemma 4.43.

Set

$${\pi }_{{_\ast}} = \left (\begin{array}{cccc} \mathrm{1}{\mathrm{l}}_{{m}_{1}}{\nu }^{1}(0) & & & \\ & \ddots & & \\ & & \mathrm{1}{\mathrm{l}}_{{m}_{l}}{\nu }^{l}(0) & \\ {a}_{{m}_{1}}(0){\nu }^{1}(0)&\cdots &{a}_{{m}_{l}}(0){\nu }^{l}(0)&\mathrm{1}{\mathrm{l}}_{{m}_{{_\ast}}}{0}_{{m}_{{_\ast}}}\\ \end{array} \right ).$$

Then there exist positive constants K and κ 0,0 such that

$$\Big{\vert }\exp (\widetilde{Q}(0)\tau ) - {\pi }_{{_\ast}}\Big{\vert }\leq K\exp (-{\kappa }_{0,0}\tau ),$$
(4.90)

for τ ≥ 0.

Proof: To prove (4.90), it suffices to show that for any row vector \({y}^{0} \in {\mathbb{R}}^{1\times m}\),

$$\Big{\vert }{y}^{0}(\exp (\widetilde{Q}(0)\tau ) - {\pi }_{ {_\ast}})\Big{\vert }\leq K\vert {y}^{0}\vert \exp (-{\kappa }_{ 0,0}\tau ).$$

Given \({y}^{0} = ({y}^{0,1},\ldots,{y}^{0,l},{y}^{0,{_\ast}}) \in {\mathbb{R}}^{1\times m}\), let

$$y(\tau ) = ({y}^{1}(\tau ),\ldots,{y}^{l}(\tau ),{y}^{{_\ast}}(\tau )) = {y}^{0}\exp (\widetilde{Q}(0)\tau ).$$

Then, y(τ) is a solution to

$$\frac{dy(\tau )} {d\tau } = y(\tau )\widetilde{Q}(0),\;y(0) = {y}^{0}.$$

It follows that

$${y}^{{_\ast}}(\tau ) = {y}^{0,{_\ast}}\exp (\widetilde{{Q}}_{ {_\ast}}(0)\tau )$$

and for k = 1, …, l,

$${y}^{k}(\tau ) = {y}^{0,k}\exp (\widetilde{{Q}}^{k}(0)\tau ) +{ \int }_{0}^{\tau }{y}^{{_\ast}}(s)\widetilde{{Q}}_{ {_\ast}}^{k}(0)\exp (\widetilde{{Q}}^{k}(0)(\tau - s))ds.$$

For each k = 1, …, l, we have

$$\begin{array}{rl} &{y}^{k } (\tau ) -\left ({y}^{0,k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0) + {y}^{0,{_\ast}}{\int }_{0}^{\infty }\exp (\widetilde{{Q}}_{ {_\ast}}(0)s)ds\widetilde{{Q}}_{{_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right ) \\ & = {y}^{0,k}\left (\exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right ) \\ &\quad + {y}^{0,{_\ast}}{\int }_{0}^{\tau }\exp (\widetilde{{Q}}_{ {_\ast}}(0)s)\widetilde{{Q}}_{{_\ast}}^{k}(0)\left (\exp (\widetilde{{Q}}^{k}(0)(\tau - s)) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right )ds \\ &\quad - {y}^{0,{_\ast}}{\int }_{\tau }^{\infty }\exp (\widetilde{{Q}}_{ {_\ast}}(0)s)\widetilde{{Q}}_{{_\ast}}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)ds\end{array}$$

By virtue of the stability of \(\widetilde{{Q}}_{{_\ast}}(0)\), the last term above is bounded above by K | y 0, ∗  | exp( − κ ∗ τ) for some κ ∗  > 0. Recall that by virtue of Lemma  4.4, for some κ0, k  > 0,

$$\left \vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right \vert \leq K\exp (-{\kappa }_{ 0,k}\tau ).$$

Choose κ0, 0 = min(κ ∗ , min k {κ 0, k }). The terms in the second and the third lines above are bounded by K | y 0 | exp( − κ0, 0τ). The desired estimate thus follows. □ 
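
The estimate (4.90) can be observed numerically. The sketch below (assuming NumPy/SciPy) uses the \(\widetilde{Q}\) of Example 4.46 (one recurrent block, two transient states); the printed gap decays like exp( − τ), so κ 0, 0 = 1 there:

```python
import numpy as np
from scipy.linalg import expm

Qt = np.array([[-1.0, 1.0, 0.0, 0.0],
               [1.0, -1.0, 0.0, 0.0],
               [1.0, 0.0, -2.0, 1.0],
               [0.0, 1.0, 1.0, -2.0]])
nu1 = np.array([0.5, 0.5])                  # quasi-stationary dist. of Q~^1(0)
a1 = -np.linalg.solve(Qt[2:, 2:], Qt[2:, :2] @ np.ones(2))   # a_{m_1}(0)
pi_star = np.hstack([np.vstack([np.outer(np.ones(2), nu1),
                                np.outer(a1, nu1)]),
                     np.zeros((4, 2))])
for tau in [1.0, 2.0, 4.0, 8.0]:
    print(tau, np.abs(expm(Qt * tau) - pi_star).max())  # ~ K exp(-tau)
```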

Next consider the first equation in the initial-layer expansions:

$${ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0).$$

The solution to this equation can be written as

$${\psi }_{0}(\tau ) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau ).$$

To be able to match the asymptotic expansion, choose

$${\psi }_{0}(0) = {p}^{0} - {\varphi }_{ 0}(0).$$

Thus,

$$\begin{array}{ll} {\psi }_{0}(\tau )& = ({p}^{0} - {\varphi }_{ 0}(0))\exp (\widetilde{Q}(0)\tau ) \\ & = ({p}^{0} - {\varphi }_{ 0}(0))\left (\exp (\widetilde{Q}(0)\tau ) - {\pi }_{{_\ast}}\right ) + ({p}^{0} - {\varphi }_{ 0}(0)){\pi }_{{_\ast}}\end{array}$$

By virtue of the choice of φ0(0), it is easy to show that

$$({p}^{0} - {\varphi }_{ 0}(0)){\pi }_{{_\ast}} = 0.$$

Therefore, in view of Lemma  4.43, ψ0( ⋅) decays exponentially fast in that for some constants K and κ0, 0 > 0 given in Lemma  4.43,

$$\vert {\psi }_{0}(\tau )\vert \leq K\exp (-{\kappa }_{0,0}\tau ),\;\tau \geq 0.$$

We have obtained φ0( ⋅) and ψ0( ⋅). To proceed, set

$${b}_{0}(t) ={ d{\varphi }_{0}(t) \over dt} - {\varphi }_{0}(t)\widehat{Q}(t)$$

and

$${b}_{0}(t) = ({b}_{0}^{1}(t),\ldots,{b}_{ 0}^{l}(t),{b}_{ 0}^{{_\ast}}(t)).$$

Note that b 0(t) is a completely known function.

In view of the second equation in (4.83),

$$\begin{array}{ll} &{\varphi }_{1}^{k}(t)\widetilde{{Q}}^{k}(t) + {\varphi }_{ 1}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{k}(t) = {b}_{ 0}^{k}(t)\mbox{ for }k = 1,\ldots,l, \\ &{\varphi }_{1}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}(t) = {b}_{0}^{{_\ast}}(t).\end{array}$$
(4.91)

Solving the last equation in (4.91) yields

$${\varphi }_{1}^{{_\ast}}(t) = {b}_{ 0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{-1}(t).$$

Putting this back into the first l equations of (4.91) leads to

$${\varphi }_{1}^{k}(t)\widetilde{{Q}}^{k}(t) = {b}_{ 0}^{k}(t) - {b}_{ 0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{-1}(t)\widetilde{{Q}}_{ {_\ast}}^{k}(t).$$
(4.92)

Again, the right side is a known function. In view of the choice of φ0( ⋅) and (4.86), we have \({b}_{0}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) = 0\). This implies

$$\begin{array}{l} {b}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} - {b}_{0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{-1}(t)\widetilde{{Q}}_{ {_\ast}}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} \\ \quad = {b}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} + {b}_{0}^{{_\ast}}(t){a}_{{ m}_{k}}(t) = 0.\end{array}$$

Therefore, (4.92) has a particular solution \(\widetilde{{b}}_{0}^{k}(t)\) with

$$\widetilde{{b}}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0,\mbox{ for }k = 1,\ldots,l.$$

As in the previous section, we write the solution of φ1 k(t) as a sum of the homogeneous solution and a solution of the inhomogeneous equation \(\widetilde{{b}}_{0}^{k}(t)\), that is,

$${\varphi }_{1}^{k}(t) = {{\vartheta}}_{ 1}^{k}(t){\nu }^{k}(t) +\widetilde{ {b}}_{ 0}^{k}(t)\mbox{ for }k = 1,\ldots,l.$$

In view of

$$\begin{array}{rl} &\widetilde{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) = 0\ \mbox{ and} \\ &\widetilde{{b}}_{0}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = 0, \end{array}$$

using the equation

$${\varphi }_{2}(t)\widetilde{Q}(t) = \frac{d{\varphi }_{1}(t)} {dt} - {\varphi }_{1}(t)\widehat{Q}(t),$$

we obtain that

$$\begin{array}{rl} &\frac{d} {dt}({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t),0) \\ & = ({{\vartheta}}_{1}^{1}(t),\ldots,{{\vartheta}}_{ 1}^{l}(t),0)\overline{Q}(t) +\widetilde{ {b}}_{ 0}(t)\widehat{Q}(t)\widetilde{\mathrm{1}{\mathrm{l}}}_{{_\ast}}(t) \\ &\quad -\left(\frac{d\widetilde{{b}}_{0}^{{_\ast}}(t)} {dt} \right)\left ({a}_{{m}_{1}}(t),\ldots,{a}_{{m}_{l}}(t),{0}_{{m}_{{_\ast}}\times {m}_{{_\ast}}}\right ).\end{array}$$
(4.93)

The initial value \({{\vartheta}}_{1}(0)\) will be determined in conjunction with the initial value of ψ1( ⋅) next.

Note that in comparison with the differential equation governing \({{\vartheta}}_{1}(t)\) in Section 4.3, the equation (4.93) has an extra term involving the derivative of \(\widetilde{{b}}_{0}^{{_\ast}}(t)\).

To determine ψ1( ⋅), solving the equation in (4.84) with i = 1, we have

$$\begin{array}{rl} {\psi }_{1}(\tau ) =&{\psi }_{1}(0)\exp (\widetilde{Q}(0)\tau ) \\ & +{ \int }_{0}^{\tau }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\widehat{Q}(0)\exp (\widetilde{Q}(0)(\tau - s))ds \\ & +{ \int }_{0}^{\tau }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)\left(\frac{d\widetilde{Q}(0)} {dt} \right)\exp (\widetilde{Q}(0)(\tau - s))ds\end{array}$$

Choose the initial values of ψ1(0) and \({{\vartheta}}_{1}^{k}(0)\) as follows:

$$\begin{array}{cl} {\psi }_{1}(0) & = -{\varphi }_{1}(0), \\ {{\vartheta}}_{1}^{k}(0) & = -{\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}}, \\ {\psi }_{1}(0){\pi }_{{_\ast}}& = -\left ({\int }_{0}^{\infty }{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\widehat{Q}(0){\pi }_{{_\ast}} \\ &\qquad \quad -\left ({\int }_{0}^{\infty }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\frac{d\widetilde{Q}(0)} {dt} {\pi }_{{_\ast}} \\ & := -{\overline{\psi }}_{0}{\pi }_{{_\ast}}.\end{array}$$
(4.94)

Write \({\overline{\psi }}_{0} = ({\overline{\psi }}_{0}^{1},\ldots,{\overline{\psi }}_{0}^{l},{\overline{\psi }}_{0}^{{_\ast}})\). Then the definition of π ∗  implies that

$${\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} + {\psi }_{1}^{{_\ast}}(0){a}_{{ m}_{k}}(0) = -({\overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} +{ \overline{\psi }}_{0}^{{_\ast}}{a}_{{ m}_{k}}(0)).$$

Recall that

$${\varphi }_{1}^{{_\ast}}(0) + {\psi }_{ 1}^{{_\ast}}(0) = 0$$

and

$${\varphi }_{1}^{{_\ast}}(t) = {b}_{ 0}^{{_\ast}}(t)\widetilde{{Q}}_{ {_\ast}}^{-1}(t).$$

It follows that

$${\psi }_{1}^{k}(0)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = -({\overline{\psi }}_{0}^{k}\mathrm{1}{\mathrm{l}}_{{ m}_{k}} +{ \overline{\psi }}_{0}^{{_\ast}}{a}_{{ m}_{k}}(0)) + {b}_{0}^{{_\ast}}(0)\widetilde{{Q}}_{ {_\ast}}^{-1}(0){a}_{{ m}_{k}}(0).$$

Moreover, it can be verified that | ψ1(τ) | ≤ Kexp( − κ1, 0τ) for some 0 < κ1, 0 < κ0, 0.

Remark 4.44.

Note that there is an extra term

$$\left ({\int }_{0}^{\infty }s{\psi }_{ 0}(0)\exp (\widetilde{Q}(0)s)ds\right )\frac{d\widetilde{Q}(0)} {dt} {\pi }_{{_\ast}}$$

involved in the equation determining \({{\vartheta}}_{1}(0)\) in (4.94). This term does not vanish as in Section 4.3 because generally \(((d/dt)\widetilde{Q}(0)){\pi }_{{_\ast}}\neq 0\).

To obtain the desired asymptotic expansion, continue inductively. For each i = 2, …, n, we first obtain the solution φ i (t) with the “multiplier” given by the solution of the differential equation but with the condition \({{\vartheta}}_{i}(0)\) unspecified; we then solve for ψ i (τ) with the as-yet-unavailable initial condition \({\psi }_{i}(0) = -{\varphi }_{i}(0)\). Next, we jointly prove the exponential decay properties of ψ i (τ) and obtain the solution \({{\vartheta}}_{i}(0)\). The equation to determine \({{\vartheta}}_{i}(0)\) with transient states becomes

$$\begin{array}{l} {\psi }_{i } (0){\pi }_{{_\ast}} \\ \ = -\left(\;\sum\limits_{j=0}^{i-1}{ \int }_{0}^{\infty }{\psi }_{ i-j-1}(s)\left(\frac{{s}^{j}} {j!} \frac{{d}^{j}\widehat{Q}(0)} {d{t}^{j}} + \frac{{s}^{j+1}} {(j + 1)!} \frac{{d}^{j+1}\widetilde{Q}(0)} {d{t}^{j+1}} \right)ds\right){\pi }_{{_\ast}}\end{array}$$

In this way, we have constructed the asymptotic expansion with transient states. In addition, we can show that the φ i ( ⋅) are smooth and ψ i ( ⋅) satisfies | ψ i (τ) | ≤ Kexp( − κ i, 0τ) for some 0 < κ i, 0 < κ i − 1, 0 < κ0, 0. As in the case with all recurrent states, we establish the following theorem.

Theorem 4.45.

Suppose (A4.5) and (A4.6) hold. Then an asymptotic expansion

$${y}_{n}^{\varepsilon }(t) =\sum\limits_{i=0}^{n}\left ({\varepsilon }^{i}{\varphi }_{ i}(t) + {\varepsilon }^{i}{\psi }_{ i}\left ({ t \over \varepsilon } \right )\right )$$

can be constructed such that for i = 0,…,n,

  • φ i ( ⋅) is \((n + 1 - i)\)-times continuously differentiable on [0, T];

  •  | ψ i (t) | ≤ Kexp( − κ0 t) for some K > 0 and 0 < κ0 < κ i, 0;

  • \(\vert {p}^{\varepsilon }(t) - {y}_{n}^{\varepsilon }(t)\vert = O({\varepsilon }^{n+1})\mbox{ uniformly in }t \in [0,T].\)

Example 4.46.

Let \(\widetilde{Q}(t) =\widetilde{ Q},\) a constant matrix such that

$$\widetilde{Q} = \left (\begin{array}{*{10}c} -1& 1 & 0 & 0\\ 1 &-1 & 0 & 0 \\ 1 & 0 &-2& 1\\ 0 & 1 & 1 &-2 \\ \end{array} \right )\mbox{ and }\widehat{Q} = 0.$$

In this example,

$$\widetilde{{Q}}^{1} = \left (\begin{array}{*{10}c} -1& 1 \\ 1 &-1\\ \end{array} \right ),\quad \widetilde{{Q}}_{{_\ast}} = \left (\begin{array}{*{10}c} -2& 1\\ 1 &-2\\ \end{array} \right ),\quad \mbox{ and }\ \widetilde{{Q}}_{{_\ast}}^{1} = \left (\begin{array}{*{10}c} 1&0 \\ 0&1\\ \end{array} \right ).$$

The last two rows in \(\widetilde{Q}\) represent the jump rates corresponding to the transient states. The matrix \(\widetilde{{Q}}^{1}\) is weakly irreducible and \(\widetilde{{Q}}_{{_\ast}}\) is stable. Solving the forward equation gives us

$${p}^{\varepsilon }(t) = ({p}_{ 1}^{\varepsilon }(t),{p}_{ 2}^{\varepsilon }(t),{p}_{ 3}^{\varepsilon }(t),{p}_{ 4}^{\varepsilon }(t)),$$

where

$$\begin{array}{rl} &{p}_{1}^{\varepsilon }(t) = \frac{1} {2} + \frac{1} {2}\biggl [(-{p}_{3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{t} {\varepsilon }\right ) \\ & + ({p}_{1}^{0} - {p}_{ 2}^{0} + {p}_{ 3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{2t} {\varepsilon } \right ) \\ & + (-{p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{3t} {\varepsilon } \right )\biggr ], \\ &{p}_{2}^{\varepsilon }(t) = \frac{1} {2} + \frac{1} {2}\biggl [(-{p}_{3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{t} {\varepsilon }\right ) \\ & + (-{p}_{1}^{0} + {p}_{ 2}^{0} - {p}_{ 3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{2t} {\varepsilon } \right ) \\ & + ({p}_{3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{3t} {\varepsilon } \right )\biggr ], \\ &{p}_{3}^{\varepsilon }(t) = \frac{1} {2}\biggl [({p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{t} {\varepsilon }\right ) + ({p}_{3}^{0} - {p}_{ 4}^{0})\exp \left (-\frac{3t} {\varepsilon } \right )\biggr ], \\ &{p}_{4}^{\varepsilon }(t) = \frac{1} {2}\biggl [({p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{t} {\varepsilon }\right ) + (-{p}_{3}^{0} + {p}_{ 4}^{0})\exp \left (-\frac{3t} {\varepsilon } \right )\biggr ]\end{array}$$

It is easy to see that \({\varphi }_{0}(t) = (1/2,1/2,0,0)\) and

$$\vert {p}^{\varepsilon }(t) - {\varphi }_{ 0}(t)\vert \leq K\exp \left (-\frac{t} {\varepsilon }\right ).$$

The limit behavior of the underlying Markov chain as ε → 0 is determined by φ0(t) for t > 0. It is clear that the probability of the Markov chain staying at the transient states is very small for small ε.
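
For reference, the sketch below (assuming NumPy/SciPy; the initial distribution is an arbitrary choice) reproduces this behavior directly from the matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

Qt = np.array([[-1.0, 1.0, 0.0, 0.0],
               [1.0, -1.0, 0.0, 0.0],
               [1.0, 0.0, -2.0, 1.0],
               [0.0, 1.0, 1.0, -2.0]])
p0 = np.array([0.1, 0.2, 0.3, 0.4])
phi0 = np.array([0.5, 0.5, 0.0, 0.0])
t = 0.5
for eps in [0.1, 0.01, 0.001]:
    p_eps = p0 @ expm(Qt * (t / eps))       # Q^eps = Q~/eps since Q^ = 0
    print(eps, np.abs(p_eps - phi0).max())  # ~ K exp(-t/eps)
```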

Remark 4.47.

The model discussed in this section has the extra ingredient of including transient states as compared with that of Section 4.3. The main feature is embedded in the last few rows of the \(\widetilde{Q}(t)\) matrix. One of the crucial points here is that the matrix \(\widetilde{{Q}}_{{_\ast}}(t)\) in the right corner is Hurwitz. This stability condition guarantees the exponential decay properties of the boundary layers. As far as the regular part (or the outer expansion) is concerned, we have that the last subvector φ0 ∗ (t) = 0. The determination of the initial conditions \({{\vartheta}}_{i}(0)\) uses the same technique as before, namely, matching the outer terms and inner layers. The procedure involves recursively solving a sequence of algebraic and differential equations. Although the model is seemingly more general, the methods and techniques involved in obtaining the asymptotic expansion and the proof of the results are essentially the same as in the previous section. The notation is slightly more complex, nevertheless.

6 Remarks on Countable-State-Space Cases

6.1 Countable-State Spaces: Part I

This section presents an extension of the results on singularly perturbed Markov chains with fast and slow components from finite-state spaces to countable-state spaces. In this section, the generator \(\widetilde{Q}(\cdot )\) is a block-diagonal matrix consisting of infinitely many blocks, each of which is of finite dimension. The generator Q ε(t) still has the form (4.39). However,

$$\widetilde{Q}(t) = \left (\begin{array}{*{10}c} \widetilde{{Q}}^{1}(t)& & & & \\ &\widetilde{{Q}}^{2}(t)& & &\\ & &\ddots && \\ & & &\widetilde{{Q}}^{k}(t)&\\ & & &&\ddots\\ \end{array} \right ),$$
(4.95)

where \(\widetilde{{Q}}^{k}(t) \in {\mathbb{R}}^{{m}_{k}\times {m}_{k}}\) is a generator of an appropriate Markov chain with finite-state space, and \(\widehat{Q}(t)\) is an infinite-dimensional matrix and is a generator of a Markov chain having a countable-state space, that is, \(\widehat{Q}(t) = (\widehat{{q}}_{ij}(t))\) such that

$$\widehat{{q}}_{ij}(t) \geq 0\mbox{ for }i\neq j,\mbox{ and }\sum\limits_{j}\widehat{{q}}_{ij}(t) = 0.$$

We aim at deriving asymptotic results under the current setting. To do so, assume that the following condition holds:

    • For t ∈ [0, T], \(\widetilde{{Q}}^{k}(t)\), for k = 1, 2, …, are weakly irreducible.

Parallel to the development of Section 4.3, the solution of φ i ( ⋅) can be constructed similar to that of Theorem  4.29 as in (4.44) and (4.45). In fact, we obtain φ0( ⋅) from (4.49) and (4.50) with l = ∞; the difference is that now we have an infinite number of equations. Similarly, for all k = 1, 2, … and \(i = 0,1,\ldots,n + 1\), φ i ( ⋅) can be obtained from

$$\begin{array}{ll} &{\varphi }_{0}(t)\widetilde{Q}(t) = 0,\mbox{ if }i = 0 \\ &{\varphi }_{i}(t)\widetilde{Q}(t) ={ d{\varphi }_{i-1}(t) \over dt} - {\varphi }_{i-1}(t)\widehat{Q}(t),\mbox{ if }i \geq 1 \\ &{\varphi }_{i}^{k}(t)\mathrm{1}{\mathrm{l}}_{{ m}_{k}} = {{\vartheta}}_{i}^{k}(t), \\ &{ d{{\vartheta}}_{i}(t) \over dt} = {{\vartheta}}_{i}(t)\overline{Q}(t) +\widetilde{ {b}}_{i-1}(t)\widehat{Q}(t)\widetilde{\mathrm{1}\mathrm{l}}.\end{array}$$
(4.96)

The problem is converted to one that involves infinitely many algebraic differential equations. The same technique as presented before still works.

Nevertheless, the boundary layer corrections deserve more attention. Let us start with ψ0( ⋅), which is the solution of the abstract Cauchy problem

$$\begin{array}{ll} &{ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0), \\ &{\psi }_{0}(0) = {p}^{0} - {\varphi }_{0}(0).\end{array}$$
(4.97)

To continue our study, one needs the notion of a semigroup (see Dunford and Schwartz [52], and Pazy [172]). Recall that for a Banach space \(\mathbb{B}\), a one-parameter family T(t), 0 ≤ t < ∞, of bounded linear operators from \(\mathbb{B}\) into \(\mathbb{B}\) is a semigroup of bounded linear operators on \(\mathbb{B}\) if (i) T(0) = I and (ii) \(T(t + s) = T(t)T(s)\) for every t, s ≥ 0.

Let \({\mathbb{R}}^{\infty }\) be the sequence space with a canonical element \(x = ({x}_{1},{x}_{2},\ldots ) \in {\mathbb{R}}^{\infty }\). Let A = (a ij ) be an operator \(A : {\mathbb{R}}^{\infty }\mapsto {\mathbb{R}}^{\infty }\), equipped with the l 1-norm

$$\vert A{\vert }_{1} =\sup\limits_{j}\sum\limits_{i}\vert {a}_{ij}\vert ;$$

(see Hutson and Pym [90, p. 74]). Using the definition of a semigroup above, the solution of (4.97) is

$${\psi }_{0}(\tau ) = T(\tau ){\psi }_{0}(0),$$

where T(τ) is a one-parameter family of semigroups generated by \(\widetilde{Q}(0)\). Moreover, since \(\widetilde{Q}(0)\) is a bounded linear operator, \(\exp (\widetilde{Q}(0)\tau )\) still makes sense. Thus \(T(\tau ){\psi }_{0}(0) = {\psi }_{0}(0)\exp (\widetilde{Q}(0)\tau )\), where

$$\begin{array}{rl} T(\tau )& =\exp (\widetilde{Q}(0)\tau ) =\sum\limits_{j=0}^{\infty }{ {\left (\widetilde{Q}(0)\tau \right )}^{j} \over j!} \\ & = \mathrm{diag}\left (\exp \left (\widetilde{{Q}}^{1}(0)\tau \right ),\ldots,\exp \left (\widetilde{{Q}}^{k}(0)\tau \right ),\ldots \right )\end{array}$$
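
The block-diagonal structure of T(τ) is readily checked on finite truncations; the following sketch (assuming NumPy/SciPy, with two hypothetical blocks) verifies that the exponential of a block-diagonal generator is the block-diagonal matrix of the blockwise exponentials:

```python
import numpy as np
from scipy.linalg import expm, block_diag

# two hypothetical weakly irreducible blocks standing in for Q~^1(0), Q~^2(0)
blocks = [np.array([[-1.0, 1.0], [2.0, -2.0]]),
          np.array([[-3.0, 3.0], [1.0, -1.0]])]
tau = 1.5
lhs = expm(block_diag(*blocks) * tau)
rhs = block_diag(*[expm(B * tau) for B in blocks])
print(np.abs(lhs - rhs).max())   # agrees to machine precision
```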

Therefore, the solution has the same form as in the previous section. Under (A4.7), exactly the same argument as in the proof of Lemma  4.4 yields that for each k = 1, 2, …,

$$\exp (\widetilde{{Q}}^{k}(0)\tau ) \rightarrow \mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\mbox{ as }\tau \rightarrow \infty $$

and the convergence takes place at an exponential rate, that is,

$$\left \vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\right \vert \leq K\exp (-{\kappa }_{ k}\tau ),$$

for some κ k  > 0. In order to obtain a valid asymptotic expansion, a further assumption is needed, namely, that these κ k , for k = 1, 2, …, are uniformly bounded below by a positive constant κ0.

    • There exists a positive number κ0 = min k {κ k } > 0.

Set

$$\widetilde{\mathrm{1}\mathrm{l}} = \mathrm{diag}\left (\mathrm{1}{\mathrm{l}}_{{m}_{1}},\ldots,\mathrm{1}{\mathrm{l}}_{{m}_{k}},\ldots \right )\mbox{ and }\nu (0) = \mathrm{diag}\left ({\nu }^{1}(0),\ldots,{\nu }^{k}(0),\ldots \right ).$$

In view of (A4.8),

$$\begin{array}{ll} \vert \exp (\widetilde{Q}(0)\tau ) -\widetilde{\mathrm{1}\mathrm{l}}\nu (0){\vert }_{1} & \leq \sup\limits_{k}\vert \exp (\widetilde{{Q}}^{k}(0)\tau ) -\mathrm{1}{\mathrm{l}}_{{ m}_{k}}{\nu }^{k}(0)\vert \\ & \leq K\exp (-{\kappa }_{0}\tau ).\end{array}$$
(4.98)

The exponential decay property of ψ0( ⋅) is thus established. Likewise, it can be proved that all ψ i ( ⋅), for \(i = 1,\ldots,n + 1\), satisfy the exponential decay property. From here on, we can proceed as in the previous section to get the error estimate and verify the validity of the asymptotic expansion. In short, the following theorem is obtained.

Theorem 4.48.

Suppose conditions (A4.7) and (A4.8) are satisfied. Then the results in Theorem  4.29 hold for the countable-state-space model with \(\widetilde{Q}(\cdot )\) given by (4.95).

6.2 Countable-State Spaces: Part II

The aim of this section is to develop further results on singularly perturbed Markov chains with fast and slow components whose generators are infinite-dimensional matrices, but in a different form from that described in Section 4.6.1. The complexity as well as the difficulty increase, and a number of technical issues also arise. One idea arises almost immediately: to approximate the underlying system via a Galerkin-type procedure, that is, to approximate the infinite-dimensional system by finite-dimensional truncations. Unfortunately, this does not work in the setting of this section. We will return to this question at the end of this section.

To proceed, as in the previous sections, the first step invariably involves the solution of algebraic differential equations in the construction of the approximating functions. One of the main ideas used is the Fredholm alternative. There are analogues in the general setting of Banach spaces for compact operators. Nevertheless, the infinite-dimensional matrices are in fact more difficult to handle.

Throughout this section, we treat the class of generators with | Q(t) | 1 < ∞ only. We use 1 l to denote the column vector with all components equal to 1. Consider (1 l⋮Q(t)) as an operator for a generator Q(t) of a Markov chain with state space \(\mathcal{M} =\{ 1,2,\ldots \}\). To proceed, we first give the definitions of irreducibility and quasi-stationary distribution. Set Q c (t) : = (1 l⋮Q(t)).

Definition 4.49.

The generator Q(t) is said to be weakly irreducible at t0 ∈ [0,T] if the equation wQc(t0) = 0, \(w \in {\mathbb{R}}^{\infty }\), has only the zero solution. If Q(t) is weakly irreducible for each t ∈ [0,T], then it is said to be weakly irreducible on [0,T].

Definition 4.50.

A quasi-stationary distribution ν(t) (with respect to Q(t)) is a solution to (2.8) with the finite summation replaced by \(\sum\limits_{i=1}^{\infty }{\nu }_{i}(t) = 1\) that satisfies ν(t) ≥ 0.

As was mentioned before, the Fredholm alternative plays an important role in our study. For infinite-dimensional systems, we state another definition to take this into account.

Definition 4.51.

A generator Q(t) satisfies the F-Property if \(w{Q}_{c}(t) = b\) has a unique solution for each \(b \in {\mathbb{R}}^{\infty }\).

Note that for all weakly irreducible generators of finite dimension (i.e., generators for Markov chains with finite-state space), the F-Property above is automatically satisfied.

Since \(\mathrm{1}\mathrm{l} \in {l}^{\infty }\) (where \({l}^{\infty }\) denotes the sequence space equipped with the \({l}^{\infty }\) norm) and, for each t ∈ [0, T], \(Q(t) \in {\mathbb{R}}^{\infty }\times {\mathbb{R}}^{\infty }\), we naturally use the norm

$$\vert (z\vdots A){\vert }_{\infty,1} =\max \left\{ \sup\limits_{j}\vert {z}_{j}\vert,\ \sup\limits_{j}\sum\limits_{i=1}^{\infty }\vert {a}_{ij}\vert \right\}.$$

It is easily seen that

$$\vert {Q}_{c}(t){\vert }_{\infty,1} \leq \max \left\{ 1,\ \sup\limits_{j}\sum\limits_{i=1}^{\infty }\vert {q}_{ij}(t)\vert \right\}.$$

If a generator Q(t) satisfies the F-Property, then it is weakly irreducible. In fact, if Q(t) satisfies the F-Property on [0, T], then \(y{Q}_{c}(t) = 0\) has the unique solution y = 0.

By the definition of the generator, in particular the q-Property, \({Q}_{c}(t)\) is a bounded linear operator for each t ∈ [0, T]. If \({Q}_{c}(t)\) is bijective (i.e., one-to-one and onto), then it has a bounded inverse; this, in turn, implies that \({Q}_{c}(t)\) exhibits the F-Property. Roughly, the F-Property generalizes the conditions used in dealing with finite-dimensional spaces. Recall from Section 4.2 that although fQ(t) = b is not uniquely solvable, adding the equation \(f\mathrm{1}\mathrm{l} = c\) yields a system with a unique solution.
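
The following small numerical sketch (a finite-dimensional stand-in, with a hypothetical 3-state generator chosen only for illustration) makes the point of the preceding paragraph explicit: fQ = b alone is rank-deficient because of the eigenvalue 0, while appending f1l = c restores unique solvability, which is the finite-dimensional prototype of the F-Property for \({Q}_{c}\).

```python
# Sketch: f Q = b is underdetermined (Q has eigenvalue 0), but the
# augmented operator Qc = (1l : Q) has full row rank, so f Qc = (c, b)
# is uniquely solvable.  Q below is a hypothetical generator.
import numpy as np

Q = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.2, 0.8, -1.0]])          # weakly irreducible generator

print(np.linalg.matrix_rank(Q))           # 2: one rank short
Qc = np.hstack([np.ones((3, 1)), Q])      # augmented operator (1l : Q)
print(np.linalg.matrix_rank(Qc))          # 3: full row rank

# Unique row solution f of f Qc = (1, 0, 0, 0); here f is the
# quasi-stationary distribution of Q.
rhs = np.array([1.0, 0.0, 0.0, 0.0])
f, *_ = np.linalg.lstsq(Qc.T, rhs, rcond=None)
print(f, f @ Q, f.sum())                  # f Q ~ 0 and sum(f) = 1
```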

Owing to the inherent difficulty caused by the infinite dimensionality, the irreducibility and smoothness of Q( ⋅) are not sufficient to guarantee the existence of asymptotic expansions; stronger conditions are needed. In what follows, for ease of presentation, we consider the model with \(\widetilde{Q}(\cdot )\) irreducible and both \(\widetilde{Q}(\cdot )\) and \(\widehat{Q}(\cdot )\) infinite-dimensional.

For each t, we denote the spectrum of Q(t) by σ(Q(t)). In view of Pazy [172] and Hutson and Pym [90], we have

$$\sigma (Q(t)) = {\sigma }_{d}(Q(t)) \cup {\sigma }_{c}(Q(t)) \cup {\sigma }_{r}(Q(t)),$$

where \({\sigma }_{d}(Q(t))\), \({\sigma }_{c}(Q(t))\), and \({\sigma }_{r}(Q(t))\) denote the discrete, continuous, and residual spectrum of Q(t), respectively. Standard linear operator theory implies that for a compact operator A, \({\sigma }_{r}(A) = \varnothing \), and the only possible candidate for \({\sigma }_{c}(A)\) is 0. Keeping this in mind, we assume that the following condition holds.

    • (A4.9) The following conditions hold:

      • (a) The smoothness condition (A4.4) is satisfied.

      • (b) The generator \(\widetilde{Q}(t)\) exhibits the F-Property.

      • (c) \( \sup\limits_{t\in [0,T]}\vert \widetilde{Q}(t){\vert }_{1} < \infty \) and \( \sup\limits_{t\in [0,T]}\vert \widehat{Q}(t)\vert < \infty \).

      • (d) The eigenvalue 0 of \(\widetilde{Q}(t)\) has multiplicity 1, and 0 is not an accumulation point of the eigenvalues.

      • (e) \({\sigma }_{r}(\widetilde{Q}(t)) = \varnothing \).

Remark 4.52.

Item (a) above requires that the smoothness condition be satisfied, and Item (b) requires that the operator \((\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t))\) satisfy a Fredholm-alternative-like condition. Finally, Items (d) and (e) indicate that the spectrum of \((\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t))\) is like that of a compact operator. Recall that for a compact linear operator, 0 is in its spectrum, and the only possible accumulation point is 0. Our conditions mimic this situation; they will be used when we prove the exponential decay property of the initial-layer terms.

Theorem 4.53.

Under condition (A4.9), the results in Theorem 4.29 hold for Markov chains with countable-state space.

Proof: The proof is very similar to its finite-dimensional counterpart; we only point out the differences here.

As far as the regular part is concerned, we obtain the same equation (4.44). One thing to note is that we can no longer use Cramer’s rule to solve the systems of equations. Without such an explicit representation of the solution, the smoothness of \({\varphi }_{i}(\cdot )\) needs to be proved by examining (4.44) directly. For example,

$$\begin{array}{rl} &\sum\limits_{i=1}^{\infty }{\varphi }_{ 0,i}(t) = 1, \\ &{\varphi }_{0}(t)\widetilde{Q}(t) = 0,\end{array}$$

can be rewritten as

$${\varphi }_{0}(t)\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t)\right ) = (1,0,\ldots ).$$
(4.99)

Since \(\widetilde{Q}(t)\) satisfies the F-Property, this equation has a unique solution.

To verify the differentiability, consider also

$${\varphi }_{0}(t + \delta )\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )\right ) = (1,0,\ldots ).$$

Examining the difference quotient leads to

$$\begin{array}{rl} 0& ={ {\varphi }_{0}(t + \delta )\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )\right ) - {\varphi }_{0}(t)\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t)\right ) \over \delta } \\ & ={ \left [{\varphi }_{0}(t + \delta ) - {\varphi }_{0}(t)\right ]\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )\right ) \over \delta } \\ & +{ {\varphi }_{0}(t)\left ((\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )) - (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t))\right ) \over \delta } \end{array}$$

Taking the limit as δ → 0 and by virtue of the smoothness of \(\widetilde{Q}(\cdot )\), we have

$$ \lim\limits_{\delta \rightarrow 0}{ \left [{\varphi }_{0}(t + \delta ) - {\varphi }_{0}(t)\right ]\left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t + \delta )\right ) \over \delta } = -{\varphi }_{0}(t)\left(0\vdots{ d\widetilde{Q}(t) \over dt} \right).$$

That is, \((d/dt){\varphi }_{0}(t)\) exists and is given by the solution of

$${ d{\varphi }_{0}(t) \over dt} \left (\mathrm{1}\mathrm{l}\vdots\widetilde{Q}(t)\right ) = -{\varphi }_{0}(t)\left(0\vdots{ d\widetilde{Q}(t) \over dt} \right).$$

Again by the F-Property, this equation has a unique solution. Higher-order derivatives of \({\varphi }_{0}(\cdot )\) and the smoothness of the remaining \({\varphi }_{i}(\cdot )\) can be proved in a similar way.
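
As a numerical check of this differentiation argument (a finite-dimensional sketch with a hypothetical time-dependent 2-state generator; all names here are assumptions), one can solve (4.99) at t and compare the solution of the derivative equation above with a finite-difference quotient:

```python
# Sketch: solve phi_0(t) from phi_0 (1l : Qt) = (1, 0, ..., 0), then get
# d(phi_0)/dt from  d(phi_0)/dt (1l : Qt) = -phi_0 (0 : dQt/dt),
# and compare with a finite-difference quotient.
import numpy as np

def Qt(t):                                 # hypothetical smooth generator
    a = 1.0 + 0.5 * np.sin(t)
    return np.array([[-a, a], [1.0, -1.0]])

def dQt(t):                                # its derivative in t
    da = 0.5 * np.cos(t)
    return np.array([[-da, da], [0.0, 0.0]])

def solve_row(M, rhs):                     # unique row solution of x M = rhs
    x, *_ = np.linalg.lstsq(M.T, rhs, rcond=None)
    return x

def phi0(t):
    return solve_row(np.hstack([np.ones((2, 1)), Qt(t)]),
                     np.array([1.0, 0.0, 0.0]))

t, delta = 0.3, 1e-6
rhs = -phi0(t) @ np.hstack([np.zeros((2, 1)), dQt(t)])
dphi = solve_row(np.hstack([np.ones((2, 1)), Qt(t)]), rhs)
dphi_fd = (phi0(t + delta) - phi0(t)) / delta
print(dphi, dphi_fd)                       # the two estimates agree
```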

As far as the initial-layer terms are concerned, since \(\widetilde{Q}(0)\) is a bounded linear operator, the semigroup interpretation \(\exp (\widetilde{Q}(0)\tau )\) makes sense. It follows from Theorem 1.4 of Pazy [172, p. 104] that the equation

$${ d{\psi }_{0}(\tau ) \over d\tau } = {\psi }_{0}(\tau )\widetilde{Q}(0),\quad {\psi }_{0}(0) = {p}_{0} - {\varphi }_{0}(0)$$

has a unique solution.

To show that \({\psi }_{0}(\cdot )\) decays exponentially fast, we use an argument analogous to its finite-dimensional counterpart. Roughly, since the multiplicity of the eigenvalue 0 is 1, the subspace generated by the corresponding eigenvector \({v}_{0}\) is one-dimensional. Similar to the situation in Section 4.2, \( \lim\limits_{\tau \rightarrow \infty }\exp (\widetilde{Q}(0)\tau )\) exists, and the limit must have identical rows. Denote the limit by \(\overline{P}\). It then follows that

$$\Big{\vert }\exp (\widetilde{Q}(0)\tau ) -\overline{P}\Big{\vert }\leq K\exp (-{\kappa }_{0}\tau ).$$

The meaning is clear: upon “subtracting” the subspace generated by \({v}_{0}\), the remainder behaves like \(\exp (-{\kappa }_{0}\tau )\). A similar argument works for \(i = 1,\ldots,n + 1\), so the \({\psi }_{i}(\cdot )\) decay exponentially fast. □ 
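
A short numerical sketch of this last step (with a hypothetical 2-state generator and initial data as assumptions) shows why the initial-layer term decays: its initial value \({p}_{0} - {\varphi }_{0}(0)\) has components summing to zero, so the nondecaying direction 1lν is absent.

```python
# Sketch: psi_0(tau) = psi_0(0) exp(Qtilde(0) tau) decays to zero because
# psi_0(0) = p_0 - phi_0(0) sums to zero.  Data here are hypothetical.
import numpy as np
from scipy.linalg import expm

Q0 = np.array([[-1.0, 1.0], [2.0, -2.0]])      # eigenvalues 0 and -3
p0 = np.array([0.9, 0.1])
phi00 = np.array([2.0 / 3.0, 1.0 / 3.0])       # quasi-stationary distribution
psi00 = p0 - phi00                             # components sum to zero
for tau in [1.0, 2.0, 4.0]:
    print(tau, np.abs(psi00 @ expm(Q0 * tau)).max())   # ~ exp(-3 tau) decay
```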

6.3 A Remark on Finite-Dimensional Approximation

Concerning the cases in Section 4.6.2, a typical way of dealing with infinite-dimensional Markov chains is to make a finite-dimensional approximation. Let \(Q(t) = ({q}_{ij}(t))\), t ≥ 0, denote a generator of a Markov chain with countable-state space. We consider an N × N, \(N = 1,2,\ldots\), truncation matrix \({Q}_{N}(t) = {({q}_{ij}(t))}_{i,j=1}^{N}\). Then \({Q}_{N}(t)\) is a subgenerator in the sense that \(\sum\limits_{j=1}^{N}{q}_{ij}(t) \leq 0\) for \(i = 1,2,\ldots,N\).

At first glance, the notion of a subgenerator seems to provide a way to treat the problem of approximating an infinite-dimensional generator by finite-dimensional matrices. In fact, Reuter and Ledermann used such an idea to derive the existence and uniqueness of the solution to the forward equation (see Bharucha-Reid [10]). Dealing with singularly perturbed chains with countable-state space, one would like to know whether a Galerkin-like approximation works, in the sense that an asymptotic expansion of a finite-dimensional system provides an approximation to the probability distribution. To be more precise, let \({\alpha }^{\varepsilon }(\cdot )\) denote the Markov chain generated by Q(t)∕ε and let

$${p}^{\varepsilon }(t) = (P({\alpha }^{\varepsilon }(t) = 1),\ldots,P({\alpha }^{\varepsilon }(t) = k),\ldots ).$$

Consider the following approximation via N-dimensional systems:

$${ d{p}^{\varepsilon,N}(t) \over dt} ={ 1 \over \varepsilon } {p}^{\varepsilon,N}(t){Q}_{ N}(t),\;{p}^{\varepsilon,N}(0) = {p}^{0}.$$
(4.100)

Using the techniques presented in the previous sections, we can find outer and inner expansions to approximate \({p}^{\varepsilon,N}(t)\). The questions are these: For small ε and large N, can we approximate \({p}^{\varepsilon }(t)\) by \({p}^{\varepsilon,N}(t)\)? Can we approximate \({p}^{\varepsilon,N}(t)\) by \({y}_{n}^{\varepsilon,N}(t)\), where \({y}_{n}^{\varepsilon,N}(t)\) is an expansion of the form (4.43) when subgenerators are used? More importantly, can we use \({y}_{n}^{\varepsilon,N}(t)\) to approximate \({p}^{\varepsilon }(t)\)?

Although \({p}_{i}^{\varepsilon }(t)\) can be approximated by its truncation \({p}_{i}^{\varepsilon,N}(t)\) for large N, and \({p}^{\varepsilon,N}(t)\) can be expanded as \({y}_{n}^{\varepsilon,N}(t)\) for small ε, the approximation of \({p}^{\varepsilon }(t)\) by \({y}_{n}^{\varepsilon,N}(t)\) does not work in general, because the limits as ε → 0 and N → ∞ are not interchangeable. This can be seen from the following example.

Let

$$Q(t) = Q = \left (\begin{array}{cccc} - 1& \frac{1} {2} & \frac{1} {{2}^{2}} & \cdots \\ \frac{1} {2} & - 1& \frac{1} {{2}^{2}} & \cdots \\ \frac{1} {{2}^{2}} & \frac{1} {2} & - 1&\cdots \\ \vdots & \vdots & \vdots & \vdots\\ \end{array} \right ).$$

Then for any N, all eigenvalues of the truncation matrix \({Q}_{N}\) have negative real parts. It follows that the solution \({p}^{\varepsilon,N}(t)\) decays exponentially fast, that is,

$$\Big{\vert }{p}^{\varepsilon,N}(t)\Big{\vert }\leq C\exp \left (-\frac{{\kappa }_{0}t} {\varepsilon } \right ).$$

Thus, all terms in the regular part of \({y}_{n}^{\varepsilon,N}\) vanish. It is clear from this example that \({y}_{n}^{\varepsilon,N}(t)\) cannot be used to approximate \({p}^{\varepsilon }(t)\).
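
The failure can be reproduced numerically. The sketch below (illustrative only; the ordering of the off-diagonal entries 1/2, 1/2², … within each row is an assumption and is immaterial for the conclusion) builds the truncation \({Q}_{N}\), confirms that it is a strict subgenerator whose eigenvalues all have negative real parts, and shows the total probability mass of \({p}^{\varepsilon,N}(t)\) draining away as t∕ε grows.

```python
# Sketch of the counterexample: the N x N truncation of Q is a strict
# subgenerator, so the truncated solution p^{eps,N}(t) loses mass
# exponentially fast, and every term of its regular part vanishes.
import numpy as np
from scipy.linalg import expm

def truncation(N):
    Q = np.zeros((N, N))
    for i in range(N):
        off = [j for j in range(N) if j != i]
        for k, j in enumerate(off, start=1):
            Q[i, j] = 2.0 ** (-k)        # off-diagonal entries 1/2, 1/4, ...
        Q[i, i] = -1.0                   # row sums vanish only as N -> infinity
    return Q

eps, N = 0.02, 6
QN = truncation(N)
print("row sums:", QN.sum(axis=1))                       # all < 0
print("max Re(eig):", np.linalg.eigvals(QN).real.max())  # strictly negative
p0 = np.full(N, 1.0 / N)
for t in [0.5, 1.0, 2.0]:
    mass = (p0 @ expm(t * QN / eps)).sum()
    print(f"t = {t}: total mass = {mass:.3f}")           # decays toward zero
```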

7 Remarks on Singularly Perturbed Diffusions

In this section, we present some related results on singular perturbations of diffusions. If, in lieu of a discrete state space, one considers a continuous state space, then the singularly perturbed Markov chains naturally become singularly perturbed Markov processes. We illustrate the idea of matched asymptotic expansions for singularly perturbed diffusions. In this section we only summarize the results and refer the reader to Khasminskii and Yin [116] for detailed proofs. To proceed, consider the following example.

Example 4.54.

This example discusses a model arising from stochastic control, namely, a controlled singularly perturbed system. As pointed out in Kushner [140] and Kokotovic, Bensoussan, and Blankenship [127], many control problems can be modeled by systems of differential equations, where the state variables can be divided into two coupled groups, consisting of “fast” and “slow” variables. A typical system takes the form

$$\begin{array}{rl} &d{x}_{1}^{\varepsilon } = {f}_{ 1}({x}_{1}^{\varepsilon },{x}_{ 2}^{\varepsilon },u)dt + {\sigma }_{ 1}({x}_{1}^{\varepsilon },{x}_{ 2}^{\varepsilon })d{w}_{ 1},\ {x}_{1}^{\varepsilon }(0) = {x}_{ 1}, \\ &d{x}_{2}^{\varepsilon } ={ 1 \over \varepsilon } {f}_{2}({x}_{1}^{\varepsilon },{x}_{ 2}^{\varepsilon },u)dt +{ 1 \over \sqrt{\varepsilon }} {\sigma }_{2}({x}_{1}^{\varepsilon },{x}_{ 2}^{\varepsilon })d{w}_{ 2},\ {x}_{2}^{\varepsilon }(0) = {x}_{ 2}, \end{array}$$

where \({w}_{1}(\cdot )\) and \({w}_{2}(\cdot )\) are independent Brownian motions, \({f}_{i}(\cdot )\) and \({\sigma }_{i}(\cdot )\) for i = 1, 2 are suitable functions, u is the control variable, and ε > 0 is a small parameter. The underlying control problem is to minimize the cost function

$${J}^{\varepsilon }({x}_{ 1},{x}_{2},u) = E{\int }_{0}^{T}R({x}_{ 1}^{\varepsilon }(t),{x}_{ 2}^{\varepsilon }(t),u)dt,$$

where R(⋅) is the running cost function. The small parameter ε > 0 signifies the relative rates of variation of \({x}_{1}^{\varepsilon }\) and \({x}_{2}^{\varepsilon }\). Such singularly perturbed systems have drawn much attention (see Bensoussan [8], Kushner [140], and the references therein). The system is very difficult to analyze directly; the approach of Kushner [140] is to use weak convergence methods to approximate the total system by the reduced system obtained from the differential equation for the slow variable, with the fast variable fixed at its steady-state value as a function of the slow variable. To gain further insight, it is crucial to understand the asymptotic behavior of the rapidly changing process \({x}_{2}^{\varepsilon }\) through the transition density given by the solution of the corresponding Kolmogorov–Fokker–Planck equations.

As demonstrated in the example above, a challenge common to many applications is to study the asymptotic behavior of the following problem. Let ε > 0 be a small parameter, and let \({X}_{1}^{\varepsilon }(\cdot )\) and \({X}_{2}^{\varepsilon }(\cdot )\) be real-valued diffusion processes satisfying

$$\left \{\begin{array}{l} d{X}_{1}^{\varepsilon } = {a}_{ 1}(t,{X}_{1}^{\varepsilon },{X}_{ 2}^{\varepsilon })dt + {\sigma }_{ 1}(t,{X}_{1}^{\varepsilon },{X}_{ 2}^{\varepsilon })d{w}_{ 1}, \\ d{X}_{2}^{\varepsilon } ={ 1 \over \varepsilon } {a}_{2}(t,{X}_{1}^{\varepsilon },{X}_{ 2}^{\varepsilon })dt +{ 1 \over \sqrt{\varepsilon }} {\sigma }_{2}(t,{X}_{1}^{\varepsilon },{X}_{ 2}^{\varepsilon })d{w}_{ 2}, \end{array} \right.$$

where the real-valued functions \({a}_{1}(t,{x}_{1},{x}_{2})\), \({a}_{2}(t,{x}_{1},{x}_{2})\), \({\sigma }_{1}(t,{x}_{1},{x}_{2})\), and \({\sigma }_{2}(t,{x}_{1},{x}_{2})\) represent the drift and diffusion coefficients, respectively, and \({w}_{1}(\cdot )\) and \({w}_{2}(\cdot )\) are independent standard Brownian motions. Define the vector \(X = ({X}_{1},{X}_{2})\). Then \({X}^{\varepsilon }(\cdot ) = ({X}_{1}^{\varepsilon }(\cdot ),{X}_{2}^{\varepsilon }(\cdot ))\) is a diffusion process. This is a model treated in Khasminskii [113], in which a probabilistic approach was employed. It was shown that as ε → 0, the fast component is averaged out and the slow component \({X}_{1}^{\varepsilon }(\cdot )\) has a limit \({X}_{1}^{0}(\cdot )\) such that

$$d{X}_{1}^{0}(t) ={ \overline{a}}_{1}(t,{X}_{1}^{0}(t))dt +{ \overline{\sigma }}_{1}(t,{X}_{1}^{0}(t))d{w}_{1},$$

where

$$\begin{array}{rl} &{\overline{a}}_{1}(t,{x}_{1}) = \int {a}_{1}(t,{x}_{1},{x}_{2})\mu (t,{x}_{1},{x}_{2})d{x}_{2}, \\ &{\overline{\sigma }}_{1}(t,{x}_{1}) = \int {\sigma }_{1}(t,{x}_{1},{x}_{2})\mu (t,{x}_{1},{x}_{2})d{x}_{2}, \end{array}$$

and μ(⋅) is a limit density of the fast process \({X}_{2}^{\varepsilon }(\cdot )\).
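
To visualize the averaging mechanism, the following Euler–Maruyama sketch (with hypothetical coefficients \({a}_{1}\), \({a}_{2}\), \({\sigma }_{1}\), \({\sigma }_{2}\) chosen purely for illustration) simulates the slow–fast pair; the fast component equilibrates on the O(ε) scale while the slow component evolves on the O(1) scale, which is precisely why only averaged coefficients survive in the limit equation.

```python
# Illustrative Euler-Maruyama simulation of a slow-fast diffusion pair.
# Coefficients are hypothetical; dt must be small relative to eps.
import numpy as np

rng = np.random.default_rng(0)
eps, dt, n_steps = 1e-3, 1e-5, 200_000     # simulate up to time 2.0
x1, x2 = 0.0, 1.0
for _ in range(n_steps):
    dw1, dw2 = rng.normal(scale=np.sqrt(dt), size=2)
    # slow component: O(1) drift and diffusion, depending on the fast one
    x1 += -x1 * (1.0 + 0.5 * np.cos(x2)) * dt + 0.3 * dw1
    # fast component: O(1/eps) drift and O(1/sqrt(eps)) diffusion
    x2 += -(x2 / eps) * dt + (1.0 / np.sqrt(eps)) * dw2
print(x1, x2)
# Rerunning with smaller eps changes the law of x1 very little: the fast
# variable is averaged out, as in the limit equation for X_1^0.
```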

To proceed further, it is necessary to investigate the limit properties of the rapidly changing process \({X}_{2}^{\varepsilon }(\cdot )\). To do so, consider the transition density of the underlying diffusion process. It is known that this density satisfies the forward equation

$$\begin{array}{ll} &{ \partial {p}^{\varepsilon } \over \partial t} ={ 1 \over \varepsilon } {\mathcal{L}}_{2}^{{_\ast}}{p}^{\varepsilon } + {\mathcal{L}}_{1}^{{_\ast}}{p}^{\varepsilon }, \\ &{p}^{\varepsilon }(0,{x}_{1},{x}_{2}) = {p}_{0}({x}_{1},{x}_{2})\mbox{ with }{p}_{0}({x}_{1},{x}_{2}) \geq 0\mbox{ and} \\ &\int \int {p}_{0}({x}_{1},{x}_{2})d{x}_{1}d{x}_{2} = 1, \end{array}$$
(4.101)

where

$$\begin{array}{ll} &{\mathcal{L}}_{1}^{{_\ast}}(t,{x}_{1},{x}_{2})\ \cdot ={ 1 \over 2} { {\partial }^{2} \over \partial {x}_{1}^{2}} ({\sigma }_{1}^{2}(t,{x}_{1},{x}_{2})\ \cdot ) -{ \partial \over \partial {x}_{1}} ({a}_{1}(t,{x}_{1},{x}_{2})\ \cdot ), \\ &{\mathcal{L}}_{2}^{{_\ast}}(t,{x}_{1},{x}_{2})\ \cdot ={ 1 \over 2} { {\partial }^{2} \over \partial {x}_{2}^{2}} ({\sigma }_{2}^{2}(t,{x}_{1},{x}_{2})\ \cdot ) -{ \partial \over \partial {x}_{2}} ({a}_{2}(t,{x}_{1},{x}_{2})\ \cdot ).\end{array}$$

Similar to the discrete-state-space cases, the basic problems to be addressed are these: As ε → 0, does the system display certain asymptotic properties? Is there an equilibrium distribution? If \({p}^{\varepsilon }(t,{x}_{1},{x}_{2}) \rightarrow p(t,{x}_{1},{x}_{2})\) for some function p(⋅), can one get a handle on the error bound (i.e., a bound on \(\vert {p}^{\varepsilon }(t,{x}_{1},{x}_{2}) - p(t,{x}_{1},{x}_{2})\vert \))?

To obtain the desired asymptotic expansion in this case, one needs to make sure that the quasi-stationary density exists. Note that for diffusions in unbounded domains, the quasi-stationary density may not exist. Loosely speaking, for the existence of the quasi-stationary distribution, it is necessary that the Markov processes corresponding to \({\mathcal{L}}_{2}^{{_\ast}}\) be positive recurrent for each fixed t. Certain sufficient conditions for the existence of the quasi-stationary density are provided in Il’in and Khasminskii [93]. An alternative way of handling the problem is to concentrate on a compact manifold, in which case the existence of the quasi-stationary density can be established. To illustrate, we choose the second alternative and suppose the following conditions are satisfied.

Suppose that for each t ∈ [0, T]:

    • for each \({x}_{2} \in \mathbb{R}\), \({a}_{1}(t,\cdot,{x}_{2})\), \({\sigma }_{1}^{2}(t,\cdot,{x}_{2})\), and \({p}_{0}(\cdot,{x}_{2})\) are periodic with period 1;

    • for each \({x}_{1} \in \mathbb{R}\), \({a}_{2}(t,{x}_{1},\cdot )\), \({\sigma }_{2}^{2}(t,{x}_{1},\cdot )\), and \({p}_{0}({x}_{1},\cdot )\) are periodic with period 1.

There is an \(n \in {\mathbb{Z}}_{+}\) such that for each i = 1, 2,

$${a}_{i}(\cdot ),\ {\sigma }_{i}^{2}(\cdot ) \in {C}^{n+1,2(n+1),2(n+1)},\mbox{ for all }t \in [0,T],\ {x}_{ 1},{x}_{2} \in [0,1],$$
(4.102)

and the (n + 1)st partial derivatives with respect to t of \({a}_{i}(\cdot,{x}_{1},{x}_{2})\) and \({\sigma }_{i}^{2}(\cdot,{x}_{1},{x}_{2})\) are Lipschitz continuous uniformly in \({x}_{1},{x}_{2} \in [0,1]\). In addition, for each t ∈ [0, T] and each \({x}_{1},{x}_{2} \in [0,1]\), \({\sigma }_{i}^{2}(t,{x}_{1},{x}_{2}) > 0\).

Definition 4.55.

A function μ(⋅) is said to be a quasi-stationary density for the periodic diffusion corresponding to the Kolmogorov–Fokker–Planck operator \({\mathcal{L}}_{2}^{{_\ast}}\) if it is periodic in \({x}_{1}\) and \({x}_{2}\) with period 1,

$$0 \leq \mu (t,{x}_{1},{x}_{2})\mbox{ for each }(t,{x}_{1},{x}_{2}) \in [0,T] \times [0,1] \times [0,1],$$

and for each fixed t and \({x}_{1}\),

$${\int }_{0}^{1}\mu (t,{x}_{ 1},{x}_{2})d{x}_{2} = 1\ \mbox{ and }\ {\mathcal{L}}_{2}^{{_\ast}}\mu (t,{x}_{ 1},{x}_{2}) = 0.$$
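
For a fixed t and \({x}_{1}\), the quasi-stationary density can be computed numerically. The sketch below (with hypothetical periodic coefficients \({a}_{2}\) and \({\sigma }_{2}^{2}\) chosen only for illustration) discretizes \({\mathcal{L}}_{2}^{{_\ast}}\) on a periodic grid over [0, 1) and solves \({\mathcal{L}}_{2}^{{_\ast}}\mu = 0\) together with the normalization \({\int }_{0}^{1}\mu \,d{x}_{2} = 1\) in the least-squares sense.

```python
# Sketch: finite-difference computation of the quasi-stationary density
# mu(t, x1, .) for fixed t, x1.  Coefficients below are hypothetical,
# periodic with period 1, and sigma_2^2 > 0.
import numpy as np

n = 200
h = 1.0 / n
x2 = np.arange(n) * h
a2 = np.sin(2 * np.pi * x2)                    # drift a_2(t, x1, .)
s2 = 1.0 + 0.5 * np.cos(2 * np.pi * x2) ** 2   # sigma_2^2(t, x1, .)

# Discretize L2* mu = 0.5 (s2 mu)'' - (a2 mu)' with periodic wrap-around.
L = np.zeros((n, n))
for j in range(n):
    jm, jp = (j - 1) % n, (j + 1) % n
    L[j, jm] = 0.5 * s2[jm] / h**2 + a2[jm] / (2 * h)
    L[j, j] = -s2[j] / h**2
    L[j, jp] = 0.5 * s2[jp] / h**2 - a2[jp] / (2 * h)

# Append the normalization h * sum(mu) = 1 and solve by least squares.
A = np.vstack([L, h * np.ones((1, n))])
b = np.concatenate([np.zeros(n), [1.0]])
mu, *_ = np.linalg.lstsq(A, b, rcond=None)
print(mu.min() >= 0, h * mu.sum())             # nonnegative, integrates to 1
```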

To proceed, let \(\mathcal{H}\) be the space of functions that are bounded and continuous and are Hölder continuous in \(({x}_{1},{x}_{2}) \in [0,1] \times [0,1]\) (with Hölder exponent Δ for some 0 < Δ < 1), uniformly with respect to t. For each \({h}_{1},{h}_{2} \in \mathcal{H}\), define \(\langle {h}_{1},{h}_{2}{\rangle }_{\mathcal{H}}\) as

$$\langle {h}_{1},{h}_{2}{\rangle }_{\mathcal{H}} ={ \int }_{0}^{T}{ \int }_{0}^{1}{ \int }_{0}^{1}{h}_{1}(t,{x}_{1},{x}_{2}){h}_{2}(t,{x}_{1},{x}_{2})d{x}_{1}d{x}_{2}dt.$$

Under the assumptions mentioned above, two sequences of functions \({\varphi }_{i}(\cdot )\) (periodic in \({x}_{1}\) and \({x}_{2}\)) and \({\psi }_{i}(\cdot )\), \(i = 0,\ldots,n\), can be found such that

  • \({\varphi }_{i}(\cdot,\cdot,\cdot ) \in {C}^{n+1-i,2(n+1-i),2(n+1-i)}\);

  • \({\psi }_{i}(t/\varepsilon,{x}_{1},{x}_{2})\) decay exponentially fast, in that for some \({c}_{1} > 0\) and \({c}_{2} > 0\),

    $$ \sup\limits_{{x}_{1},{x}_{2}\in [0,1]}\left\vert {\psi }_{i}\left ( \frac{t} {\varepsilon },{x}_{1},{x}_{2}\right )\right\vert \leq {c}_{1}\exp \left (-\frac{{c}_{2}t} {\varepsilon } \right );$$
  • define \(\widetilde{{s}}_{n}^{\varepsilon }\) by

    $$\widetilde{{s}}_{n}^{\varepsilon }(t,{x}_{ 1},{x}_{2}) =\sum\limits_{i=0}^{n}\left ({\varepsilon }^{i}{\varphi }_{ i}(t,{x}_{1},{x}_{2}) + {\varepsilon }^{i}{\psi }_{ i}\left ( \frac{t} {\varepsilon },{x}_{1},{x}_{2}\right )\right );$$

    for each \(h \in \mathcal{H}\), the following error bound holds:

    $$\begin{array}{ll} \left \vert \langle {p}^{\varepsilon } -\widetilde{ {s}}_{n}^{\varepsilon },h{\rangle}_{\mathcal{H}}\right \vert = O({\varepsilon }^{n+1}).\end{array}$$
    (4.103)

It is interesting to note that the leading term \({\varphi }_{0}(\cdot )\) of the approximation is approximately the probability density of \({X}_{1}\), namely \({v}_{0}(t,{x}_{1})\), multiplied by the conditional density of \({X}_{2}\) given \({X}_{1} = {x}_{1}\) (i.e., holding \({x}_{1}\) as a parameter), which is the quasi-stationary density \(\mu (t,{x}_{1},{x}_{2})\). The remaining terms in the regular part of the expansion take the form

$$\mu (t,{x}_{1},{x}_{2}){v}_{i}(t,{x}_{1}) + {U}_{i}(t,{x}_{1},{x}_{2}),$$

where \({U}_{i}(\cdot )\) is a particular solution of an inhomogeneous equation. Note the resemblance of this form to that of the Markov-chain cases studied in this chapter. A detailed proof of the assertion is in Khasminskii and Yin [116]. In fact, more complex systems (allowing interaction of \({X}_{1}^{\varepsilon }\) and \({X}_{2}^{\varepsilon }\), mixed partial derivatives in \({x}_{1}\) and \({x}_{2}\), as well as extensions to multidimensional systems) are treated in [116]. In addition, in lieu of \(\langle \cdot,{\cdot \rangle }_{\mathcal{H}}\), convergence under the uniform topology can be considered via the stochastic representation of solutions of partial differential equations or energy integration methods (see, for example, the related treatment of singularly perturbed switching diffusion systems in Il’in, Khasminskii, and Yin [94]).

8 Notes

Two-time-scale Markov chains are dealt with in this chapter using purely analytic methods, which are closely connected with singular perturbation methods. The literature on singular perturbations for ordinary differential equations is rather rich. For an extensive list of references on singular perturbation methods for ordinary differential equations and various techniques such as initial-layer corrections, we refer to Vasil’eva and Butuzov [209], Wasow [215, 216], O’Malley [163], and the references therein. The development of singular perturbation methods has been intertwined with advances in technology and progress in various applications. It can be traced back to the beginning of the twentieth century, when Prandtl dealt with fluid motion with small friction (see Prandtl [178]). Nowadays, the averaging principle developed by Krylov, Bogoliubov, and Mitropolskii (see Bogoliubov and Mitropolskii [18]) has become a popular technique, taught in standard graduate applied mathematics courses and employed widely. General results on singular perturbations can be found in Bensoussan, Lions, and Papanicolaou [7], Bogoliubov and Mitropolskii [18], Eckhaus [54], Erdélyi [58], Il’in [92], Kevorkian and Cole [108, 109], Krylov and Bogoliubov [133], O’Malley [163], Smith [199], Vasil’eva and Butuzov [209, 210], and Wasow [215, 216]; applications to control theory and related fields are in Bensoussan [8], Bielecki and Filar [11], Delebecque and Quadrat [44], Delebecque, Quadrat, and Kokotovic [45], Kokotovic [126], Kokotovic, Bensoussan, and Blankenship [127], Kokotovic and Khalil [128], Kokotovic, Khalil, and O’Reilly [129], Kushner [140], Pan and Başar [164–166], Pervozvanskii and Gaitsgori [174], Phillips and Kokotovic [175], and Yin and Zhang [233], among others; the vast literature on applications to different branches of physics includes Risken [182] and van Kampen [208]; the survey by Hänggi, Talkner, and Borkovec [80] contains hundreds of references concerning applications in physics; related problems via large deviations theory are in Lerman and Schuss [151]; some recent work on singular perturbations in queueing networks, heavy traffic, and related topics is in Harrison and Reiman [81], Knessel and Morrison [125], and the references therein; applications to manufacturing systems are in Sethi and Zhang [192], Soner [202], Zhang [248], and the references cited there; related problems for stochastic differential equations and diffusion approximations can be found in Day [42], Friedlin and Wentzell [67], Il’in and Khasminskii [93], Khasminskii [111, 112], Kushner [139], Ludwig [152], Matkowsky and Schuss [158], Naeh, Klosek, Matkowsky, and Schuss [160], Papanicolaou [169, 170], Schuss [187, 188], Skorohod [198], Yin [222], Yin and Ramachandran [227], and Zhang [247], among others. Singularly perturbed Markov processes also appear in the context of random evolution, a generalization of the motion of a particle on a fixed line with a random velocity or a random diffusivity; see, for example, Griego and Hersh [76, 77] and Pinsky [177]; an extensive survey can be found in Hersh [85]. A first-order approximation of the distribution of the Cox process with rapid switching is in Di Masi and Kabanov [48]. Recently, modeling communication systems via two-time-scale Markov chains has gained renewed interest; see Tse, Gallager, and Tsitsiklis [206], and the references therein.

It should be pointed out that there is a distinct feature in the problem we are studying compared with traditional studies of singularly perturbed systems. In contrast to many singularly perturbed ordinary differential equations, the matrix Q(t) in (4.3) is singular: it has an eigenvalue 0, so the usual stability condition does not hold. To circumvent this difficulty, we utilize the q-Property of the matrix Q(t), which leads to a probabilistic interpretation. The main emphasis in this chapter is on developing approximations to the solutions of the forward equations. The underlying systems arise from a wide range of applications in which a finite-state Markov chain is involved and a fast time scale t∕ε is used. Asymptotic series for the probability distribution of the Markov chain have been developed by employing the techniques of matched expansions. An attempt to obtain the asymptotic expansion of (4.3) was initiated in Khasminskii, Yin, and Zhang [119] for time-inhomogeneous Markov chains. The result presented here is a refinement of the aforementioned reference.

Extending the results for irreducible generators, this chapter further discusses two-time-scale Markov chains with weak and strong interactions. The formulations substantially generalize the work of Khasminskii, Yin, and Zhang [120]. Section 4.3, which discusses Markovian models with recurrent states belonging to several ergodic classes, is a refinement of [120].

Previous work on singularly perturbed Markov chains with weak and strong interactions can be found in Delebecque, Quadrat, and Kokotovic [45], Gaitsgori and Pervozvanskii [69], Pervozvanskii and Gaitsgori [174], and Phillips and Kokotovic [175]. The essence is a decomposition and aggregation point of view. Their models are similar to that considered in this chapter. For example, translating the setup into our setting, the authors of [175] assumed that the Markov chain generated by \(\widetilde{Q}/\varepsilon +\widehat{ Q}\) has a single ergodic class for ε sufficiently small and, moreover, that for each \(j = 1,2,\ldots,l\), the subchain has a single ergodic class. Their formulation requires that \(\widetilde{Q}(t) =\widetilde{ Q}\) and \(\widehat{Q}(t) =\widehat{ Q}\), and it requires essentially the irreducibility of \(\widetilde{Q}/\varepsilon +\widehat{ Q}\) for all ε ≤ ε0 for some ε0 > 0 small enough, in addition to the irreducibility of \(\widetilde{{Q}}^{j}\) for \(j = 1,2,\ldots,l\). The problem considered in this chapter is nonstationary; the generators are time-varying. The irreducibility is in the weak sense, and only weak irreducibility of each subgenerator (or block matrix) \(\widetilde{{Q}}^{j}(t)\) for \(j = 1,2,\ldots,l\) is needed. Thus our results generalize the existing theorems to nonstationary cases under weaker assumptions. The condition on \(\widetilde{Q}(t)\) exploits the intrinsic properties of the underlying chains. Furthermore, our results also include Markov chains with countable-state spaces. The formulation and development of Section 4.5 are inspired by that of [175] (see also Pan and Başar [164]). This, together with the consideration of chains with recurrent states and the inclusion of absorbing states, covers most practical concerns for the rapidly varying part of the generator. Although the forms of the generators with absorbing states and with transient states have more complex structures, the asymptotic expansion of the probability distributions can still be obtained via an approach similar to that for the case of block-diagonal \(\widetilde{Q}(\cdot )\). Applications to manufacturing systems are discussed, for example, in Jiang and Sethi [99] and Sethi and Zhang [192], among others. As a complement to the development in this chapter, the work of Il’in, Khasminskii, and Yin [94] deals with cases in which the underlying Markov processes involve both diffusion and pure jump components; see also Yin and Yang [229]. Previous work on singular perturbations of stochastic systems can be found in Day [42], Friedlin and Wentzell [67], Khasminskii [111–113], Kushner [139], Ludwig [152], Matkowsky and Schuss [158], Naeh, Klosek, Matkowsky, and Schuss [160], Papanicolaou [169, 170], Schuss [187], Yin and Ramachandran [227], and the references therein. Singular perturbations in connection with optimal control problems are treated in Bensoussan [8], Bielecki and Filar [11], Delebecque and Quadrat [44], Kokotovic [126], Kokotovic, Bensoussan, and Blankenship [127], Kushner [140], Lehoczky, Sethi, Soner, and Taksar [150], Martins and Kushner [156], Pan and Başar [164], Pervozvanskii and Gaitsgori [174], Sethi and Zhang [192], Soner [202], and Yin and Zhang [233], among others. For discrete-time two-time-scale Markov chains, we refer the reader to Yin and Zhang [238] and Yin, Zhang, and Badowski [242], among others.

We note that one of the key points enabling us to solve these problems is the Fredholm alternative. It is even more crucial here than in the situation of Section 4.2 for irreducible generators: in Section 4.2 the consistency conditions are readily verified, whereas in the formulation with weak and strong interactions, the verification needs more work, and we have to utilize the consistency conditions to obtain the desired solution.

The discussions on Markov chains with countable-state spaces in this chapter focused on simple situations. For more general cases, see Yin and Zhang [230, 231], in which applications to quasi-birth-death queues were considered; see also Altman, Avrachenkov, and Nunez-Queija [4] for a different approach. The discussion of singularly perturbed diffusion processes dealt mainly with forward equations. For related work on singularly perturbed diffusions, see the papers of Khasminskii and Yin [115, 116] and the references therein; one of the motivations for studying singularly perturbed diffusions comes from wear process modeling (see Rishel [181]). For treatments of averaging principles and related backward equations, we refer the reader to Khasminskii and Yin [117, 118]. For a number of applications to queueing systems, financial engineering, and insurance risk, we refer the reader to Yin, Zhang, and Zhang [232] and references therein.