1 Introduction

Linear quadratic (LQ, for short) optimal control can be traced back to the works of Kalman [13] for the deterministic case and Wonham [25] for the stochastic case (see also [2, 6, 28] and the references therein). In the classical setting, under some mild conditions on the weighting coefficients in the cost functional, such as positive definiteness of the control weighting matrix, stochastic LQ optimal control problems can be solved elegantly via the Riccati equation approach (see [28, Chapter 6]). Chen et al. [3] studied stochastic LQ optimal control problems with an indefinite control weighting matrix, together with financial applications such as continuous-time mean-variance portfolio selection problems (see [17, 35]). Since then, there has been increasing interest in so-called indefinite stochastic LQ optimal control (see [1, 16]).

State systems involving random jumps, such as Poisson jumps or regime switching jumps, are of interest and importance in various fields such as engineering, management, finance, and economics. For example, Wu and Wang [26] considered stochastic LQ optimal control problems with Poisson jumps and obtained the existence and uniqueness of solutions to the associated deterministic Riccati equation. Using the technique of completing squares, Hu and Øksendal [10] discussed the stochastic LQ optimal control problem with Poisson jumps and partial information. Yu [29] investigated a class of infinite horizon backward stochastic LQ optimal control problems. Li et al. [14] solved the indefinite stochastic LQ optimal control problem with Poisson jumps. Meanwhile, there has been dramatically increasing interest in this family of stochastic control problems and their financial applications; see, for example, [7,8,9, 19, 21, 23, 27, 31, 33, 34, 36]. Moreover, [11, 12] formulated a class of continuous-time LQ optimal control problems with Markovian jumps. Zhang and Yin [30] developed hybrid controls of a class of LQ systems modulated by a finite-state Markov chain. Li et al. [16] initiated the study of indefinite stochastic LQ optimal control with regime switching jumps. Liu et al. [18] considered near-optimal controls of regime switching LQ problems with indefinite control weight costs. For some other recent developments concerning regime switching jumps, see [4, 5, 15].

Recently, Sun and Yong [22] investigated two-person zero-sum stochastic LQ differential games. It was shown in [22] that the open-loop solvability is equivalent to the existence of an adapted solution to a constrained forward-backward stochastic differential equation (FBSDE, for short), and that the closed-loop solvability is equivalent to the existence of a regular solution to the Riccati equations. As a continuation of [22], the work [20] systematically studied the open-loop and closed-loop solvabilities for stochastic LQ optimal control problems; moreover, the equivalence between the strongly regular solvability of the Riccati equation and the uniform convexity of the cost functional was established there. Wang et al. [24] introduced the notion of weak closed-loop optimal strategy for LQ problems and showed that its existence is equivalent to the open-loop solvability of the LQ problem. Zhang et al. [32] studied the open-loop and closed-loop solvabilities for stochastic LQ optimal control problems with Markovian regime switching jumps, and established the equivalence between the strongly regular solvability of the Riccati equation and the uniform convexity of the cost functional in the regime switching setting. In this paper, we further study the weak closed-loop solvability of stochastic LQ optimal control problems with a Markovian regime switching system. To present our work more clearly, we now describe the problem in detail.

Let \((\Omega ,{{{\mathcal {F}}}},{\mathbb {F}},{\mathbb {P}})\) be a complete filtered probability space on which a standard one-dimensional Brownian motion \(W=\{W(t); 0\leqslant t < \infty \}\) and a continuous-time, finite-state Markov chain \(\alpha =\{\alpha (t); 0\leqslant t< \infty \}\) are defined, where \({\mathbb {F}}=\{{{{\mathcal {F}}}}_t\}_{t\geqslant 0}\) is the natural filtration of W and \(\alpha \) augmented by all the \({\mathbb {P}}\)-null sets in \({{{\mathcal {F}}}}\), and \({\mathbb {F}}^\alpha =\{{{{\mathcal {F}}}}_t^\alpha \}_{t\geqslant 0}\) is the filtration generated by \(\alpha \), with the related expectation \({\mathbb {E}}^\alpha \). We identify the state space of the chain \(\alpha \) with a finite set \({{{\mathcal {S}}}}\triangleq \{1, 2, \ldots , D\}\), where \(D\in {\mathbb {N}}\), and suppose that the chain is homogeneous and irreducible. Let \(0\leqslant t<T\) and consider the following controlled Markovian regime switching linear stochastic differential equation (SDE, for short) over the finite time horizon \([t,T]\):

$$\begin{aligned} \left\{ \begin{array}{ll} &{} dX(s)=\Big [A(s,\alpha (s))X(s)+B(s,\alpha (s))u(s)+b(s)\Big ]ds\\ &{}\qquad \quad \qquad +\Big [C(s,\alpha (s))X(s)+D(s,\alpha (s))u(s)+\sigma (s)\Big ]dW(s),\quad ~s\in [t,T],\\ &{} X(t)=x,\quad ~\alpha (t)=i, \end{array}\right. \nonumber \\ \end{aligned}$$
(1.1)

where \(A,C:[0,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{n\times n}\) and \(B,D:[0,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{n\times m}\) are given deterministic functions, called the coefficients of the state Eq. (1.1); \(b,\sigma :[0,T]\times \Omega \rightarrow {\mathbb {R}}^n\) are \({\mathbb {F}}\)-progressively measurable processes, called the nonhomogeneous terms; and \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\) is called the initial pair. In the above, the process \(u(\cdot )\), which belongs to the following space:

$$\begin{aligned}&{{{\mathcal {U}}}}[t,T]\triangleq \Big \{u:[t,T]\times \Omega \rightarrow {\mathbb {R}}^m\ \Big |\ u\hbox { is } {\mathbb {F}}\hbox {-progressively measurable and }\\&\quad \quad {\mathbb {E}}\int _t^T|u(s)|^2ds<\infty \Big \}, \end{aligned}$$

is called the control process, and the solution \(X(\cdot )\) of (1.1) is called the state process corresponding to \((t,x,i)\) and \(u(\cdot )\). To measure the performance of the control \(u(\cdot )\), we introduce the following quadratic cost functional:

$$\begin{aligned} J(t,x,i;u(\cdot ))\triangleq & {} {\mathbb {E}}\Bigg \{\Big \langle G(\alpha (T))X(T),X(T)\Big \rangle +2\Big \langle g(\alpha (T)),X(T)\Big \rangle \nonumber \\&\quad +\int _t^T\Bigg [\left\langle \begin{pmatrix}Q(s,\alpha (s))&{}S(s,\alpha (s))^\top \\ S(s,\alpha (s))&{}R(s,\alpha (s))\end{pmatrix} \begin{pmatrix}X(s)\\ u(s)\end{pmatrix}, \begin{pmatrix}X(s)\\ u(s)\end{pmatrix}\right\rangle \nonumber \\&\quad +2\left\langle \begin{pmatrix}q(s,\alpha (s))\\ \rho (s,\alpha (s))\end{pmatrix},\begin{pmatrix}X(s)\\ u(s)\end{pmatrix}\right\rangle \Bigg ]ds\Bigg \}, \end{aligned}$$
(1.2)

where \(G(i)\in {\mathbb {R}}^{n\times n}\) is a symmetric constant matrix, and g(i) is an \({{{\mathcal {F}}}}_T\)-measurable random variable taking values in \({\mathbb {R}}^n\), with \(i\in {{{\mathcal {S}}}}\); \(Q:[0,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{n\times n}\), \(S:[0,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{m\times n}\) and \(R:[0,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{m\times m}\) are deterministic functions with both Q and R being symmetric; \(q:[0,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{n}\) and \(\rho :[0,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{m}\) are deterministic functions. In the above, \(M^\top \) stands for the transpose of a matrix M. The problem that we are going to study is the following:

Problem (M-SLQ). For any given initial pair \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\), find a control \(u^{*}(\cdot )\in {\mathcal {U}}[t,T]\), such that

$$\begin{aligned} J(t,x,i;u^{*}(\cdot ))=\inf _{u(\cdot )\in {{{\mathcal {U}}}}[t,T]}J(t,x,i;u(\cdot )). \end{aligned}$$
(1.3)

The above is called a stochastic linear quadratic optimal control problem of the Markovian regime switching system. Any \(u^{*}(\cdot )\in {\mathcal {U}}[t,T]\) satisfying (1.3) is called an open-loop optimal control of Problem (M-SLQ) for the initial pair \((t,x,i)\); the corresponding state process \(X(\cdot )=X(\cdot \ ;t,x,i,u^*(\cdot ))\) is called an optimal state process; and the function \(V(\cdot ,\cdot ,\cdot )\) defined by

$$\begin{aligned} V(t,x,i)\triangleq \inf _{u(\cdot )\in {{{\mathcal {U}}}}[t,T]}J(t,x,i;u(\cdot )),\quad ~(t,x,i)\in [0,T]\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}, \end{aligned}$$
(1.4)

is called the value function of Problem (M-SLQ). Note that in the special case when \(b(\cdot ),\sigma (\cdot ),g(\cdot ),q(\cdot ,\cdot ),\rho (\cdot ,\cdot )=0\), the state Eq. (1.1) and the cost functional (1.2), respectively, become

$$\begin{aligned} \left\{ \begin{array}{ll} &{} dX(s)=\Big [A(s,\alpha (s))X(s)+B(s,\alpha (s))u(s)\Big ]ds\\ \\ &{}\qquad \qquad +\Big [C(s,\alpha (s))X(s)+D(s,\alpha (s))u(s)\Big ]dW(s),\quad ~s\in [t,T],\\ &{} X(t)=x,\quad ~\alpha (t)=i, \end{array}\right. \end{aligned}$$
(1.5)

and

$$\begin{aligned} J^0(t,x,i;u(\cdot ))= & {} {\mathbb {E}}\Bigg \{\Big \langle G(\alpha (T))X(T),X(T)\Big \rangle \nonumber \\&+\int _t^T\left\langle \begin{pmatrix}Q(s,\alpha (s))&{}S(s,\alpha (s))^\top \\ S(s,\alpha (s))&{}R(s,\alpha (s))\end{pmatrix} \begin{pmatrix}X(s)\\ u(s)\end{pmatrix}, \begin{pmatrix}X(s)\\ u(s)\end{pmatrix}\right\rangle ds\Bigg \}.\nonumber \\ \end{aligned}$$
(1.6)

We refer to the problem of minimizing (1.6) subject to (1.5) as the homogeneous LQ problem associated with Problem (M-SLQ), denoted by Problem (M-SLQ)\(^0\). The corresponding value function is denoted by \(V^0(t,x,i)\). Moreover, when all the coefficients of (1.1) and (1.2) are independent of the regime switching term \(\alpha (\cdot )\), the corresponding problem (1.3) is called Problem (SLQ).

Following the works of [20, 22], Zhang et al. [32] investigated the open-loop and closed-loop solvabilities for stochastic LQ problems of Markovian regime switching systems. It was shown that the open-loop solvability of Problem (M-SLQ) is equivalent to the solvability of a forward-backward stochastic differential equation with constraint. They also showed that the closed-loop solvability of Problem (M-SLQ) is equivalent to the existence of a regular solution of the following general Riccati equation (GRE, for short):

$$\begin{aligned} \left\{ \begin{array}{ll} &{}{\dot{P}}(s,i)+P(s,i)A(s,i)+A(s,i)^\top P(s,i)+C(s,i)^\top P(s,i)C(s,i)+Q(s,i)\\ &{}\qquad \quad -{\hat{S}}(s,i)^\top {\hat{R}}(s,i)^{-1}{\hat{S}}(s,i)+\sum _{k=1}^D\lambda _{ik}(s)P(s,k)=0,\quad {\mathrm{a.e.}}~s\in [0,T],\ i\in {{{\mathcal {S}}}},\\ &{} P(T,i)=G(i), \end{array}\right. \nonumber \\ \end{aligned}$$
(1.7)

where

$$\begin{aligned} {{\hat{S}}}(s,i)&= B(s,i)^\top P(s,i)+ D(s,i)^\top P(s,i)C(s,i)+S(s,i),\\ {{\hat{R}}}(s,i)&= R(s,i)+D(s,i)^\top P(s,i)D(s,i). \end{aligned}$$
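To see the structure of the coupled GRE (1.7) concretely, one can integrate it backward in time once the coefficients are fixed. The following is a minimal numerical sketch, not part of the analysis, assuming constant coefficients and a two-state chain, with the inverse in the \({\hat{S}}^\top {\hat{R}}^{-1}{\hat{S}}\) term taken in the Moore-Penrose sense; all names and matrices are illustrative.

```python
import numpy as np

def solve_gre(A, B, C, Dm, Q, R, S, G, lam, T=1.0, steps=4000):
    """Backward-Euler integration of the coupled GRE (1.7), assuming
    constant (time-independent) coefficients.  All arguments except lam
    are lists indexed by the chain state i = 0, ..., D-1; lam is the
    D x D generator matrix [lambda_ij]."""
    D = len(G)
    dt = T / steps
    P = [np.array(G[i], dtype=float) for i in range(D)]  # P(T, i) = G(i)
    for _ in range(steps):
        new_P = []
        for i in range(D):
            S_hat = B[i].T @ P[i] + Dm[i].T @ P[i] @ C[i] + S[i]
            R_hat = R[i] + Dm[i].T @ P[i] @ Dm[i]
            # Moore-Penrose pseudo-inverse, as required in the regular case
            rhs = (P[i] @ A[i] + A[i].T @ P[i] + C[i].T @ P[i] @ C[i]
                   + Q[i] - S_hat.T @ np.linalg.pinv(R_hat) @ S_hat
                   + sum(lam[i, k] * P[k] for k in range(D)))
            new_P.append(P[i] + dt * rhs)  # dP/ds = -rhs, stepping s -> s - dt
        P = new_P
    return P  # approximation of P(0, i), i = 0, ..., D-1
```

In the scalar test case \(A=C=D=S=0\), \(B=Q=R=G=1\), the constant function \(P\equiv 1\) solves the equation exactly, which the scheme reproduces.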

It is shown in [32] that, for the stochastic LQ optimal control problem of a Markovian regime switching system, the existence of a closed-loop optimal strategy implies the existence of an open-loop optimal control, but not vice versa. Thus, there are LQ problems that are open-loop solvable but not closed-loop solvable. For such problems, one cannot expect to obtain a regular solution (which does not exist) to the associated GRE (1.7); therefore, a state feedback representation of the open-loop optimal control via the Riccati equation might be impossible. To make this point more convincing, let us look at the following simple example.

Example 1.1

Consider the following one-dimensional state equation

$$\begin{aligned} \left\{ \begin{array}{ll} &{} dX(s)=\big [-\alpha (s)X(s)+u(s)\big ]ds+\sqrt{2\alpha (s)}X(s)dW(s),\quad ~s\in [t,1],\\ &{} X(t)=x,\quad ~\alpha (t)=i, \end{array}\right. \end{aligned}$$

and the nonnegative cost functional

$$\begin{aligned} J(t,x,i;u(\cdot ))={\mathbb {E}}|X(1)|^2. \end{aligned}$$

In this example, the GRE reads (noting that \(Q(\cdot ,i)=0,R(\cdot ,i)=0,D(\cdot ,i)=0\) for every \(i\in {{{\mathcal {S}}}}\), and \(0^{-1}=0\)):

$$\begin{aligned} \left\{ \begin{array}{ll} &{}{\dot{P}}(s,i)+\sum _{k=1}^D\lambda _{ik}(s)P(s,k)=0,\quad {\mathrm{a.e.}}~s\in [t,1],\\ &{} P(1,i)=1, \quad i \in {{{\mathcal {S}}}}. \end{array}\right. \end{aligned}$$
(1.8)

It is not hard to check that GRE (1.8) has no regular solution (see Sect. 3 for the definition of a regular solution); thus the corresponding LQ problem is not closed-loop solvable. The usual Riccati equation approach would specify the corresponding state feedback control as follows (again using \(Q(\cdot ,i)=0,R(\cdot ,i)=0,D(\cdot ,i)=0\) for every \(i\in {{{\mathcal {S}}}}\), and \(0^{-1}=0\)):

$$\begin{aligned}&u^*(s) \triangleq -\Big [R(s,i)+D(s,i)^\top P(s,i)D(s,i)\Big ]^{-1}\Big [B(s,i)^\top P(s,i)\\&\quad \quad +D(s,i)^\top P(s,i)C(s,i)+S(s,i)\Big ]X(s)\equiv 0, \end{aligned}$$

which is not an open-loop optimal control for any nonzero initial state x. In fact, let \((t,x,i)\in [0,1)\times {\mathbb {R}}\times {{{\mathcal {S}}}}\) be an arbitrary but fixed initial pair with \(x\ne 0\). By Itô’s formula, the state process \(X^*(\cdot )\) corresponding to \((t,x,i)\) and \(u^*(\cdot )\) is expressed as

$$\begin{aligned} X^*(s)=x\cdot \exp \left\{ -2\int _t^s\alpha (r)dr+\int _t^s\sqrt{2\alpha (r)}dW(r)\right\} ,\quad ~s\in [t,1]. \end{aligned}$$

Thus,

$$\begin{aligned} J(t,x,i;u^*(\cdot ))={\mathbb {E}}|X^*(1)|^2=x^2>0. \end{aligned}$$

On the other hand, let \({\bar{u}}(\cdot )\) be the control defined by

$$\begin{aligned} {\bar{u}}(s)\equiv \frac{x}{t-1}\cdot \exp \left\{ -2\int _t^s\alpha (r)dr+\int _t^s\sqrt{2\alpha (r)}dW(r)\right\} ,\quad ~s\in [t,1]. \end{aligned}$$

By the variation of constants formula, the state process \({\bar{X}}(\cdot )\), corresponding to \((t,x,i)\) and \({\bar{u}}(\cdot )\), is given by

$$\begin{aligned} {\bar{X}}(s)= & {} \exp \left\{ -2\int _t^s\alpha (r)dr+\int _t^s\sqrt{2\alpha (r)}dW(r)\right\} \\&\cdot \left[ x+\int _t^s\exp \left\{ 2\int _t^r\alpha (v)dv-\int _t^r\sqrt{2\alpha (v)}dW(v)\right\} \cdot {\bar{u}}(r)dr\right] \\= & {} \exp \left\{ -2\int _t^s\alpha (r)dr+\int _t^s\sqrt{2\alpha (r)}dW(r)\right\} \cdot \left[ x+\frac{s-t}{t-1}x\right] ,\quad ~s\in [t,1], \end{aligned}$$

which satisfies \({\bar{X}}(1)=0\). Hence,

$$\begin{aligned} J(t,x,i;{\bar{u}}(\cdot ))={\mathbb {E}}|{\bar{X}}(1)|^2=0<J(t,x,i;u^*(\cdot )). \end{aligned}$$

Since the cost functional is nonnegative, the open-loop control \({\bar{u}}(\cdot )\) is optimal for the initial pair \((t,x,i)\), but \(u^*(\cdot )\) is not optimal.
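The computations in this example are easy to check numerically. The sketch below (illustrative only) simulates one path of the chain and of the Brownian exponent on a uniform grid, assuming a two-state chain \({{{\mathcal {S}}}}=\{1,2\}\) with unit switching intensities, and evaluates the closed-form expressions for \(X^*(1)\) and \({\bar{X}}(1)\). The bracket \(x+\frac{1-t}{t-1}x\) vanishes at \(s=1\) regardless of the simulated path, so \(J(t,x,i;{\bar{u}}(\cdot ))=0\), while \({\mathbb {E}}|X^*(1)|^2=x^2\) as computed above.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_example(t=0.0, x=1.0, i=1, steps=2000):
    """Simulate one path of Example 1.1 on [t, 1] and evaluate the
    closed-form expressions for X*(1) and X_bar(1).  Illustrative setup:
    two-state chain S = {1, 2} with generator lambda = [[-1, 1], [1, -1]]."""
    dt = (1.0 - t) / steps
    alpha, I_alpha, I_W = i, 0.0, 0.0
    for _ in range(steps):
        I_alpha += alpha * dt                        # int_t^s alpha(r) dr
        I_W += np.sqrt(2 * alpha) * rng.normal(0.0, np.sqrt(dt))
        if rng.random() < 1.0 * dt:                  # jump with intensity 1
            alpha = 3 - alpha                        # switch 1 <-> 2
    E = np.exp(-2.0 * I_alpha + I_W)
    X_star = x * E                                   # state under u* = 0
    X_bar = E * (x + (1.0 - t) / (t - 1.0) * x)      # state under u_bar
    return X_star, X_bar

X_star, X_bar = simulate_example()
print(X_bar)   # exactly 0: the bracket x + (1-t)/(t-1) x vanishes at s = 1
```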

The above example suggests that the usual solvability of the GRE (1.7) is no longer sufficient for handling the open-loop solvability of certain stochastic LQ problems. It is then natural to ask: when Problem (M-SLQ) is merely open-loop solvable, but not closed-loop solvable, is it still possible to obtain a linear state feedback representation of an open-loop optimal control within the framework of Markovian regime switching systems? The goal of this paper is to tackle this problem.

The contribution of this paper is the study of the weak closed-loop solvability of stochastic LQ optimal control problems with a Markovian regime switching system. In detail, we provide an alternative characterization of the open-loop solvability of Problem (M-SLQ) using the perturbation approach adopted in [20]. In order to obtain a linear state feedback representation of an open-loop optimal control for Problem (M-SLQ), we introduce the notion of weak closed-loop strategies for stochastic LQ optimal control problems of Markovian regime switching systems. We prove that as long as Problem (M-SLQ) is open-loop solvable, there always exists a weak closed-loop strategy whose outcome is an open-loop optimal control. Consequently, the open-loop and weak closed-loop solvabilities of Problem (M-SLQ) are equivalent on [0, T). Compared with [24], this paper extends the results of [24] to stochastic LQ optimal control problems with Markovian regime switching, which can be applied to financial market models whose parameters, such as the interest rate, stock returns, and volatility, are modulated by a Markov chain. However, the regime switching jumps bring some difficulties. The first is how to define closed-loop solvability and weak closed-loop solvability in the setting of a Markovian regime switching system. The second is how to prove the equivalence between the open-loop and weak closed-loop solvabilities of Problem (M-SLQ) in this setting. We adapt the methods of [20, 24, 32] to overcome these difficulties.

The rest of the paper is organized as follows. In Sect. 2, we collect some preliminary results and introduce a few elementary notions for Problem (M-SLQ). Section 3 is devoted to the study of open-loop solvability by a perturbation method. In Sect. 4, we show how to obtain a weak closed-loop optimal strategy and establish the equivalence between open-loop and weak closed-loop solvability. Finally, an example is presented in Sect. 5 to illustrate the results we obtained.

2 Preliminaries

Throughout this paper, recall from the previous section that \((\Omega ,{{{\mathcal {F}}}},{\mathbb {F}},{\mathbb {P}})\) is a complete filtered probability space on which a standard one-dimensional Brownian motion \(W=\{W(t); 0\leqslant t < \infty \}\) and a continuous-time, finite-state Markov chain \(\alpha =\{\alpha (t); 0\leqslant t< \infty \}\) are defined, where \({\mathbb {F}}=\{{{{\mathcal {F}}}}_t\}_{t\geqslant 0}\) is the natural filtration of W and \(\alpha \) augmented by all the \({\mathbb {P}}\)-null sets in \({{{\mathcal {F}}}}\). In the rest of the paper, we will use the following notation:

$$\begin{aligned} \begin{array}{ll} {\mathbb {R}}^n &{}\quad \text{ the } n\text{-dimensional } \text{ Euclidean } \text{ space };\\ M^\top &{} \quad \text{ the } \text{ transpose } \text{ of } \text{ any } \text{ vector } \text{ or } \text{ matrix } M;\\ {\mathrm{tr}}\,[M] &{}\quad \text{ the } \text{ trace } \text{ of } \text{ a } \text{ square } \text{ matrix } M;\\ {{{\mathcal {R}}}}(M) &{} \quad \text{ the } \text{ range } \text{ of } \text{ the } \text{ matrix } M;\\ M^{-1} &{}\quad \text{ the } \text{ Moore-Penrose } \text{ pseudo-inverse } \text{ of } \text{ the } \text{ matrix } M;\\ \langle \cdot \,,\cdot \rangle &{}\quad \text{ the } \text{ Frobenius } \text{ inner } \text{ products } \text{ in } \text{ possibly } \text{ different } \text{ Hilbert } \text{ spaces };\\ {\mathbb {R}}^{n\times m} &{}\quad \text{ the } \text{ Euclidean } \text{ space } \text{ of } \text{ all } n\times m \text{ real } \text{ matrices } \text{ endowed } \text{ with } \text{ inner }\\ &{}\quad \text{ product } \langle M, N\rangle \mapsto {\mathrm{tr}}\,[M^\top N] \text{ and } \text{ the } \text{ norm } |M|=\sqrt{{\mathrm{tr}}\,[M^\top M]};\\ {\mathbb {S}}^n &{}\quad \text{ the } \text{ set } \text{ of } \text{ all } n\times n \text{ symmetric } \text{ matrices }, \end{array} \end{aligned}$$

and for an \({\mathbb {S}}^n\)-valued function \(F(\cdot )\) on \([t,T]\), we use the notation \(F(\cdot )\gg 0\) to indicate that \(F(\cdot )\) is uniformly positive definite on \([t,T]\), i.e., there exists a constant \(\delta >0\) such that

$$\begin{aligned} F(s)\geqslant \delta I,\qquad {\mathrm{a.e.}}~s\in [t,T]. \end{aligned}$$

Next, for any \(t\in [0,T)\) and Euclidean space \({\mathbb {H}}\), we further introduce the following spaces of functions and processes:

$$\begin{aligned} C([t,T];{\mathbb {H}})= & {} \Big \{\varphi :[t,T]\rightarrow {\mathbb {H}}\bigm |\varphi (\cdot )\hbox { is continuous }\negthinspace \Big \},\\ L^p(t,T;{\mathbb {H}})= & {} \left\{ \varphi :[t,T]\rightarrow {\mathbb {H}}\biggm |\int _t^T|\varphi (s)|^pds<\infty \right\} ,\quad 1\leqslant p<\infty ,\\ L^\infty (t,T;{\mathbb {H}})= & {} \left\{ \varphi :[t,T]\rightarrow {\mathbb {H}}\biggm |\mathop {\mathrm{esssup}}_{s\in [t,T]}|\varphi (s)|<\infty \right\} , \end{aligned}$$

and

$$\begin{aligned} L^2_{{{{\mathcal {F}}}}_T}(\Omega ;{\mathbb {H}})= & {} \Big \{\xi :\Omega \rightarrow {\mathbb {H}}\bigm |\xi \hbox { is } {{{\mathcal {F}}}}_T\hbox {-measurable, }{\mathbb {E}}|\xi |^2<\infty \Big \},\\ L_{\mathbb {F}}^2(t,T;{\mathbb {H}})= & {} \bigg \{\varphi :[t,T]\times \Omega \rightarrow {\mathbb {H}}\bigm |\varphi (\cdot )\hbox { is } {\mathbb {F}}\hbox {-progressively measurable, }\\&{\mathbb {E}}\int ^T_t|\varphi (s)|^2ds<\infty \bigg \},\\ L_{\mathbb {F}}^2(\Omega ;C([t,T];{\mathbb {H}}))= & {} \bigg \{\varphi :[t,T]\times \Omega \rightarrow {\mathbb {H}}\bigm |\varphi (\cdot )\hbox { is }{\mathbb {F}}\hbox {-adapted, continuous, }\\&{\mathbb {E}}\left[ \sup _{s\in [t,T]}|\varphi (s)|^2\right]<\infty \bigg \},\\ L^2_{\mathbb {F}}(\Omega ;L^1(t,T;{\mathbb {H}}))= & {} \bigg \{\varphi :[t,T]\times \Omega \rightarrow {\mathbb {H}}\bigm |\varphi (\cdot )\hbox { is }\\&{\mathbb {F}}\hbox {-progressively measurable, } {\mathbb {E}}\left( \int _t^T|\varphi (s)|ds\right) ^2<\infty \bigg \}. \end{aligned}$$

Now we start to formulate our system. We identify the state space of the chain \(\alpha \) with a finite set \({{{\mathcal {S}}}}\triangleq \{1, 2, \ldots , D\}\), where \(D\in {\mathbb {N}}\), and suppose that the chain is homogeneous and irreducible. To specify the statistical or probabilistic properties of the chain \(\alpha \), for \(t\in [0,\infty )\), we define the generator \(\lambda (t)\triangleq [\lambda _{ij}(t)]_{i, j = 1, 2, \ldots , D}\) of the chain under \({\mathbb {P}}\). This is also called the rate matrix, or the Q-matrix. Here, for each \(i, j = 1, 2, \ldots , D\), \(\lambda _{ij}(t)\) is the transition intensity of the chain from state i to state j at time t. Note that \(\lambda _{ij}(t) \geqslant 0\) for \(i \ne j\) and \(\sum ^{D}_{j = 1} \lambda _{ij}(t) = 0\), so \(\lambda _{ii}(t) \leqslant 0\). In what follows, for each \(i, j = 1, 2, \ldots , D\) with \(i \ne j\), we suppose that \(\lambda _{ij}(t) > 0\), so \(\lambda _{ii}(t) < 0\). For each fixed \(i,j = 1, 2, \ldots , D\), let \(N_{ij}(t)\) be the number of jumps from state i into state j up to time t and set

$$\begin{aligned} \lambda _j (t)\triangleq \int _0^t\lambda _{\alpha (s-)\, j}(s)I_{\{\alpha (s-)\ne j\}}ds =\sum ^{D}_{i = 1, i \ne j}{\tilde{\lambda }}_{ij}(t),\ \hbox { with }\ {\tilde{\lambda }}_{ij}(t)\triangleq \int ^{t}_{0}\lambda _{ij}(s)I_{\{\alpha (s-)=i\}} d s. \end{aligned}$$

Then for each \(i,j=1,2,\ldots , D\), the term \({\widetilde{N}}_{ij} (t)\) defined as follows is an \(({\mathbb {F}}, {\mathbb {P}})\)-martingale:

$$\begin{aligned} {\widetilde{N}}_{ii} (t)\equiv 0,\qquad {\widetilde{N}}_{ij} (t)= N_{ij}(t)-{\tilde{\lambda }}_{ij}(t),\quad i\ne j. \end{aligned}$$
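As a sanity check on these constructions, one can simulate the chain from its generator and accumulate both the counting processes \(N_{ij}\) and the compensators \({\tilde{\lambda }}_{ij}\) on a grid; by the martingale property of \({\widetilde{N}}_{ij}\), the difference \(N_{ij}(T)-{\tilde{\lambda }}_{ij}(T)\) should average to zero across independent paths. A rough sketch, assuming a time-homogeneous two-state generator (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_counts(lam, i0=0, T=1.0, steps=500):
    """Grid simulation of the chain alpha together with the counting
    processes N_ij(T) and their compensators tilde-lambda_ij(T).
    lam: D x D generator matrix (time-homogeneous for simplicity)."""
    D = lam.shape[0]
    dt = T / steps
    alpha = i0
    N = np.zeros((D, D))       # observed i -> j jump counts
    comp = np.zeros((D, D))    # compensators tilde-lambda_ij
    for _ in range(steps):
        p = np.clip(lam[alpha], 0.0, None) * dt  # off-diagonal intensities
        comp[alpha] += p                         # d(tilde-lambda) = lam dt
        j = rng.choice(D + 1, p=np.append(p, 1.0 - p.sum()))
        if j < D:                                # a jump alpha -> j occurred
            N[alpha, j] += 1
            alpha = j
    return N, comp
```

Averaging `N - comp` over many independent paths gives a matrix close to zero, consistent with \({\widetilde{N}}_{ij}\) being a martingale.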

To guarantee the well-posedness of the state Eq. (1.1), we adopt the following assumption:

(H1) For every \(i\in {{{\mathcal {S}}}}\), the coefficients and nonhomogeneous terms of (1.1) satisfy

$$\begin{aligned} \left\{ \begin{array}{ll} &{} A(\cdot ,i)\in L^\infty (0,T;{\mathbb {R}}^{n\times n}), \quad B(\cdot ,i)\in L^\infty (0,T;{\mathbb {R}}^{n\times m}), \quad b(\cdot )\in L^2_{\mathbb {F}}(\Omega ;L^1(0,T;{\mathbb {R}}^n)),\\ &{} C(\cdot ,i)\in L^\infty (0,T;{\mathbb {R}}^{n\times n}), \quad D(\cdot ,i)\in L^{\infty }(0,T;{\mathbb {R}}^{n\times m}), \quad \sigma (\cdot )\in L^2_{\mathbb {F}}(0,T;{\mathbb {R}}^n). \end{array}\right. \end{aligned}$$

The following result, whose proof is similar to the result in [22, Proposition 2.1], establishes the well-posedness of the state equation under the assumption (H1).

Lemma 2.1

Under the assumption (H1), for any initial pair \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\) and control \(u(\cdot )\in {{{\mathcal {U}}}}[t,T]\), the state Eq. (1.1) has a unique adapted solution \(X(\cdot )\equiv X(\cdot \ ;t,x,i,u(\cdot ))\). Moreover, there exists a constant \(K>0\), independent of \((t,x,i)\) and \(u(\cdot )\), such that

$$\begin{aligned} {\mathbb {E}}\left[ \sup _{t\leqslant s\leqslant T}|X(s)|^2\right] \leqslant K{\mathbb {E}}\left[ |x|^2+\Big (\int _t^T|b(s)|ds\Big )^2+\int _t^T|\sigma (s)|^2ds+\int _t^T|u(s)|^2ds\right] .\nonumber \\ \end{aligned}$$
(2.1)

To ensure that the random variables in the cost functional (1.2) are integrable, we assume the following holds:

(H2) For every \(i\in {{{\mathcal {S}}}}\), the weighting coefficients in the cost functional (1.2) satisfy

$$\begin{aligned} \left\{ \begin{array}{ll} &{} G(i)\in {\mathbb {S}}^n, \quad Q(\cdot ,i)\in L^1(0,T;{\mathbb {S}}^n), \quad S(\cdot ,i)\in L^2(0,T;{\mathbb {R}}^{m\times n}), \quad R(\cdot ,i)\in L^\infty (0,T;{\mathbb {S}}^m),\\ &{} g(i)\in L^2_{{{{\mathcal {F}}}}_T}(\Omega ;{\mathbb {R}}^n), \quad q(\cdot ,i)\in L^2(0,T;{\mathbb {R}}^n), \quad \rho (\cdot ,i)\in L^2(0,T;{\mathbb {R}}^m). \end{array}\right. \end{aligned}$$

Remark 2.2

Suppose that (H1) holds. Then according to Lemma 2.1, for any initial pair \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\) and control \(u(\cdot )\in {{{\mathcal {U}}}}[t,T]\), the state Eq. (1.1) admits a unique (strong) solution \(X(\cdot )\equiv X(\cdot ;t,x,i,u(\cdot ))\), which belongs to the space \(L_{\mathbb {F}}^2(\Omega ;C([t,T];{\mathbb {R}}^n))\). In addition, if (H2) holds, then the random variables on the right-hand side of (1.2) are integrable, and hence Problem (M-SLQ) is well-posed.

Let us recall some basic notions of stochastic LQ optimal control problems.

Definition 2.3

(Open-loop) Problem (M-SLQ) is said to be

  1. (i)

    (uniquely) open-loop solvable for an initial pair \((t,x,i)\in [0,T]\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\) if there exists a (unique) \(u^*(\cdot )=u^*(\cdot \ ;t,x,i)\in {{{\mathcal {U}}}}[t,T]\) (depending on \((t,x,i)\)) such that

    $$\begin{aligned} J(t,x,i;u^{*}(\cdot ))\leqslant J(t,x,i;u(\cdot )),\qquad \forall u(\cdot )\in {{{\mathcal {U}}}}[t,T]. \end{aligned}$$
    (2.2)

    Such a \(u^*(\cdot )\) is called an open-loop optimal control for \((t,x,i)\).

  2. (ii)

    (uniquely) open-loop solvable if it is (uniquely) open-loop solvable for all the initial pairs \((t,x,i)\in [0,T]\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\).

Definition 2.4

(Closed-loop) Let \(\Theta :[t,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{m\times n}\) be a deterministic function and \(v:[t,T]\times \Omega \rightarrow {\mathbb {R}}^m\) be an \({\mathbb {F}}\)-progressively measurable process.

  1.  (i)

    We call \((\Theta (\cdot ,\cdot ),v(\cdot ))\) a closed-loop strategy on \([t,T]\) if

    $$\begin{aligned} {\mathbb {E}}\int _t^T|\Theta (s,\alpha (s))|^2ds<\infty ,\quad \hbox {and}\quad {\mathbb {E}}\int _t^T|v(s)|^2ds<\infty . \end{aligned}$$
    (2.3)

    The set of all closed-loop strategies \((\Theta (\cdot ,\cdot ),v(\cdot ))\) on \([t,T]\) is denoted by \({\mathscr {C}}[t,T]\).

  2.  (ii)

    A closed-loop strategy \((\Theta ^*(\cdot ,\cdot ),v^*(\cdot ))\in {\mathscr {C}}[t,T]\) is said to be optimal on \([t,T]\) if

    $$\begin{aligned}&J(t,x,i;\Theta ^*(\cdot ,\alpha (\cdot ))X^*(\cdot )+v^*(\cdot ))\leqslant J(t,x,i;\Theta (\cdot ,\alpha (\cdot ))X(\cdot )+v(\cdot )),\nonumber \\&\quad \forall (x,i)\in {\mathbb {R}}^n\times {{{\mathcal {S}}}},\quad \forall (\Theta (\cdot ,\cdot ),v(\cdot ))\in {\mathscr {C}}[t,T], \end{aligned}$$
    (2.4)

    where \(X^*(\cdot )\) is the solution to the closed-loop system under \((\Theta ^*(\cdot ,\cdot ),v^*(\cdot ))\):

    $$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{}dX^*(s)=\Big \{\big [A(s,\alpha (s))+B(s,\alpha (s))\Theta ^*(s,\alpha (s))\big ]X^*(s)+B(s,\alpha (s))v^*(s)+b(s)\Big \}ds\\ &{}\qquad \quad \ +\Big \{\big [C(s,\alpha (s))+D(s,\alpha (s))\Theta ^*(s,\alpha (s))\big ]X^*(s)\\ &{}\qquad \quad \ +D(s,\alpha (s))v^*(s)+\sigma (s)\Big \}dW(s),\qquad s\in [t,T],\\ &{} X^*(t)=x,\quad ~\alpha (t)=i, \end{array}\right. \end{aligned}$$
    (2.5)

    and \(X(\cdot )\) is the solution to the closed-loop system (2.5) corresponding to \((\Theta (\cdot ,\cdot ),v(\cdot ))\).

  3.  (iii)

    Problem (M-SLQ) is said to be (uniquely) closed-loop solvable if, for any \(t\in [0,T)\), a closed-loop optimal strategy (uniquely) exists on \([t,T]\).

Remark 2.5

We emphasize that, in the above definition, \(\Theta \) is a deterministic function, and in (2.3) the randomness of \(\Theta (\cdot ,\alpha (\cdot ))\) comes from \(\alpha (\cdot )\). Moreover, (2.4) must be true for all \((x,i)\in {\mathbb {R}}^n\times {{{\mathcal {S}}}}\). The same remark applies to the definition below.

Definition 2.6

(Weak closed-loop) Let \(\Theta :[t,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{m\times n}\) be a deterministic function and \(v:[t,T]\times \Omega \rightarrow {\mathbb {R}}^m\) be an \({\mathbb {F}}\)-progressively measurable process such that for any \(T'\in [t,T)\),

$$\begin{aligned} {\mathbb {E}}\int _t^{T'}|\Theta (s,\alpha (s))|^2ds<\infty ,\quad \hbox {and}\quad {\mathbb {E}}\int _t^{T'}|v(s)|^2ds<\infty . \end{aligned}$$
  1.  (i)

    We call \((\Theta (\cdot ,\cdot ),v(\cdot ))\) a weak closed-loop strategy on \([t,T)\) if for any initial state \((x,i)\in {\mathbb {R}}^n\times {{{\mathcal {S}}}}\), the outcome \(u(\cdot )\equiv \Theta (\cdot ,\alpha (\cdot ))X(\cdot )+v(\cdot )\) belongs to \({{{\mathcal {U}}}}[t,T]\equiv L^2_{{\mathbb {F}}}(t,T;{\mathbb {R}}^m)\), where \(X(\cdot )\) is the solution to the weak closed-loop system:

    $$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} dX(s)=\Big \{\big [A(s,\alpha (s))+B(s,\alpha (s))\Theta (s,\alpha (s))\big ]X(s)+B(s,\alpha (s))v(s)+b(s)\Big \}ds\\ &{}\qquad \quad ~~~+\Big \{\big [C(s,\alpha (s))+D(s,\alpha (s))\Theta (s,\alpha (s))\big ]X(s)\\ &{}\qquad \qquad \qquad +D(s,\alpha (s))v(s)+\sigma (s)\Big \}dW(s),\qquad s\in [t,T],\\ &{} X(t)=x,\quad ~\alpha (t)=i. \end{array}\right. \end{aligned}$$
    (2.6)

    The set of all weak closed-loop strategies is denoted by \({\mathscr {C}}_w[t,T]\).

  2.  (ii)

    A weak closed-loop strategy \((\Theta ^*(\cdot ,\cdot ),v^*(\cdot ))\in {\mathscr {C}}_w[t,T]\) is said to be optimal on \([t,T)\) if

    $$\begin{aligned}&J(t,x,i;\Theta ^*(\cdot ,\alpha (\cdot ))X^*(\cdot )+v^*(\cdot ))\leqslant J(t,x,i;\Theta (\cdot ,\alpha (\cdot ))X(\cdot )+v(\cdot )),\nonumber \\&\qquad \qquad \qquad \quad \forall (x,i)\in {\mathbb {R}}^n\times {{{\mathcal {S}}}},\quad \forall (\Theta (\cdot ,\cdot ),v(\cdot ))\in {\mathscr {C}}_w[t,T], \end{aligned}$$
    (2.7)

    where \(X(\cdot )\) and \(X^*(\cdot )\) are the solutions of the weak closed-loop system (2.6) corresponding to \((\Theta (\cdot ,\cdot ),v(\cdot ))\) and \((\Theta ^*(\cdot ,\cdot ),v^*(\cdot ))\), respectively.

  3.  (iii)

    Problem (M-SLQ) is said to be (uniquely) weakly closed-loop solvable if, for any \(t\in [0,T)\), a weak closed-loop optimal strategy (uniquely) exists on \([t,T)\).

3 Open-Loop Solvability: A Perturbation Approach

In this section, we study the open-loop solvability of Problem (M-SLQ) through a perturbation approach. We begin by assuming that, for any choice of \((t,i)\in [0,T)\times {{{\mathcal {S}}}}\),

$$\begin{aligned} J^0(t,0,i;u(\cdot ))\geqslant 0,\quad ~\forall u(\cdot )\in {{{\mathcal {U}}}}[t,T], \end{aligned}$$
(3.1)

which is necessary for the open-loop solvability of Problem (M-SLQ) according to [32, Theorem 4.1]. In fact, assumption (3.1) means that \(u(\cdot )\mapsto J^0(t,0,i;u(\cdot ))\) is convex, and one can actually prove that assumption (3.1) implies the convexity of the mapping \(u(\cdot )\mapsto J(t,x,i;u(\cdot ))\) for any choice of \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\) (see [20, 32]).

For \(\varepsilon >0\), consider the LQ problem of minimizing the perturbed cost functional

$$\begin{aligned}&J_\varepsilon (t,x,i;u(\cdot ))\triangleq J(t,x,i;u(\cdot ))+\varepsilon {\mathbb {E}}\int _t^T|u(s)|^2ds\nonumber \\&\qquad \qquad \ ={\mathbb {E}}\Bigg \{\Big \langle G(\alpha (T))X(T),X(T)\Big \rangle +2\Big \langle g(\alpha (T)),X(T)\Big \rangle \nonumber \\&\qquad \qquad \qquad \ \ \,+\int _t^T\Bigg [\left\langle \begin{pmatrix}Q(s,\alpha (s))&{}S(s,\alpha (s))^\top \\ S(s,\alpha (s))&{}R(s,\alpha (s))+\varepsilon I_m\end{pmatrix} \begin{pmatrix}X(s)\\ u(s)\end{pmatrix}, \begin{pmatrix}X(s)\\ u(s)\end{pmatrix}\right\rangle \nonumber \\&\qquad \qquad \qquad +2\left\langle \begin{pmatrix}q(s,\alpha (s))\\ \rho (s,\alpha (s))\end{pmatrix},\begin{pmatrix}X(s)\\ u(s)\end{pmatrix}\right\rangle \Bigg ]ds\Bigg \}, \end{aligned}$$
(3.2)

subject to the state Eq. (1.1). We denote this perturbed LQ problem by Problem (M-SLQ)\(_\varepsilon \) and its value function by \(V_\varepsilon (\cdot ,\cdot ,\cdot )\). Notice that the cost functional \(J^0_\varepsilon (t,x,i;u(\cdot ))\) of the homogeneous LQ problem associated with Problem (M-SLQ)\(_\varepsilon \) is

$$\begin{aligned} J^0_\varepsilon (t,x,i;u(\cdot ))=J^0(t,x,i;u(\cdot ))+\varepsilon {\mathbb {E}}\int _t^T|u(s)|^2ds, \end{aligned}$$

which, by (3.1), satisfies

$$\begin{aligned} J^0_\varepsilon (t,0,i;u(\cdot ))\geqslant \varepsilon {\mathbb {E}}\int _t^T|u(s)|^2ds. \end{aligned}$$

The system of Riccati equations associated with Problem (M-SLQ)\(_\varepsilon \) reads

$$\begin{aligned} \left\{ \begin{array}{ll} &{} {\dot{P}}_\varepsilon (s,i)+P_\varepsilon (s,i)A(s,i)+A(s,i)^\top P_\varepsilon (s,i)+C(s,i)^\top P_\varepsilon (s,i)C(s,i)+Q(s,i)\\ &{}\qquad \quad \ -{\hat{S}}_\varepsilon (s,i)^\top [{\hat{R}}_\varepsilon (s,i)+\varepsilon I_m]^{-1}{\hat{S}}_\varepsilon (s,i)\\ {} &{}\qquad +\sum _{k=1}^D\lambda _{ik}(s)P_\varepsilon (s,k)=0,\quad {\mathrm{a.e.}}~s\in [0,T],\ i\in {{{\mathcal {S}}}},\\ &{} P_\varepsilon (T,i)=G(i), \end{array}\right. \end{aligned}$$
(3.3)

where for every \((s,i)\in [0,T]\times {{{\mathcal {S}}}}\) and \(\varepsilon >0\),

$$\begin{aligned} \begin{aligned} {{\hat{S}}}_\varepsilon (s,i)&\triangleq B(s,i)^\top P_\varepsilon (s,i)+ D(s,i)^\top P_\varepsilon (s,i)C(s,i)+S(s,i),\\ {{\hat{R}}}_\varepsilon (s,i)&\triangleq R(s,i)+D(s,i)^\top P_\varepsilon (s,i)D(s,i). \end{aligned} \end{aligned}$$
(3.4)
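To build intuition for how the perturbation enters (3.3)–(3.4), one can integrate the \(\varepsilon \)-Riccati equation numerically. The sketch below (Python; scalar state and control, a single regime so that the \(\sum _{k}\lambda _{ik}P_\varepsilon (\cdot ,k)\) coupling vanishes, and all coefficient values hypothetical) steps (3.3) backward from \(P_\varepsilon (T)=G\) by the explicit Euler method.

```python
def solve_riccati_eps(A, B, C, D, Q, S, R, G, eps, T=1.0, n=20000):
    """Backward Euler for the scalar single-regime version of (3.3)-(3.4):
    P' + 2*A*P + C**2*P + Q - S_hat**2/(R_hat + eps) = 0,  P(T) = G."""
    h = T / n
    P = G
    for _ in range(n):
        S_hat = B * P + D * P * C + S        # scalar version of (3.4)
        R_hat = R + D * P * D
        dP = -(2.0 * A * P + C * C * P + Q - S_hat ** 2 / (R_hat + eps))
        P -= h * dP                          # one Euler step backward in time
    return P                                 # approximation of P_eps(0)
```

With \(A=C=D=S=0\) and \(B=Q=R=G=1\), the unperturbed equation reduces to \({\dot{P}}=P^2-1\) with \(P(T)=1\), whose solution is the constant \(P\equiv 1\); taking \(\varepsilon >0\) raises the computed value only slightly, consistent with the perturbed cost dominating the original one.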

A solution \(P_\varepsilon (\cdot ,\cdot )\in C([0,T]\times {{{\mathcal {S}}}};{\mathbb {S}}^n)\) of (3.3) is said to be regular if

$$\begin{aligned} {{{\mathcal {R}}}}\big ({\hat{S}}_\varepsilon (s,i)\big )&\subseteq {{{\mathcal {R}}}}\big ({\hat{R}}_\varepsilon (s,i)\big ),\quad {\mathrm{a.e.}}~s\in [0,T],\ i\in {{{\mathcal {S}}}}, \end{aligned}$$
(3.5)
$$\begin{aligned} {\hat{R}}_\varepsilon (\cdot ,i)^{-1}{\hat{S}}_\varepsilon (\cdot ,i)&\in L^2(0,T;{\mathbb {R}}^{m\times n}),\quad ~i\in {{{\mathcal {S}}}}, \end{aligned}$$
(3.6)
$$\begin{aligned} {\hat{R}}_\varepsilon (s,i)&\geqslant 0,\quad {\mathrm{a.e.}}~s\in [0,T],\ i\in {{{\mathcal {S}}}}. \end{aligned}$$
(3.7)

A solution \(P_\varepsilon (\cdot ,\cdot )\) of (3.3) is said to be strongly regular if

$$\begin{aligned} {\hat{R}}_\varepsilon (s,i)\geqslant \lambda I,\quad {\mathrm{a.e.}}~s\in [0,T],\ i\in {{{\mathcal {S}}}}, \end{aligned}$$
(3.8)

for some \(\lambda >0\). The system of Riccati equations (3.3) is said to be (strongly) regularly solvable if it admits a (strongly) regular solution. Clearly, condition (3.8) implies (3.5)–(3.7), so a strongly regular solution \(P_\varepsilon (\cdot ,\cdot )\) is necessarily regular. Moreover, it follows from [32, Theorem 6.3] that, under assumption (3.1), the Riccati equation (3.3) has a unique strongly regular solution \(P_\varepsilon (\cdot ,\cdot )\in C([0,T]\times {{{\mathcal {S}}}};{\mathbb {S}}^n)\), and from (3.7) we have

$$\begin{aligned} {{\hat{R}}}_\varepsilon (s,i)+\varepsilon I_m\geqslant \varepsilon I_m, \quad {\mathrm{a.e.}}~s\in [0,T]. \end{aligned}$$

Furthermore, let \((\eta _\varepsilon (\cdot ),\zeta _\varepsilon (\cdot ), \xi ^\varepsilon (\cdot ))\) be the adapted solution of the following BSDE:

$$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} d\eta _\varepsilon (s)=-\Big \{\big [A(s,\alpha (s))+B(s,\alpha (s))\Theta _\varepsilon (s,\alpha (s))\big ]^\top \eta _\varepsilon (s)\\ &{} \qquad \qquad \quad ~+\big [C(s,\alpha (s))+D(s,\alpha (s))\Theta _\varepsilon (s,\alpha (s))\big ]^\top \zeta _\varepsilon (s)\\ &{} \qquad \qquad \quad ~+\big [C(s,\alpha (s))+D(s,\alpha (s))\Theta _\varepsilon (s,\alpha (s))\big ]^\top P_\varepsilon (s,\alpha (s))\sigma (s)\\ &{} \qquad \qquad \quad ~+\Theta _\varepsilon (s,\alpha (s))^\top \rho (s,\alpha (s))+P_\varepsilon (s,\alpha (s))b(s)+q(s,\alpha (s))\Big \}ds\\ &{} \qquad \qquad \qquad +\zeta _\varepsilon (s) dW(s)+\sum _{k,l=1}^D\xi ^\varepsilon _{kl}(s)d{\widetilde{N}}_{kl}(s),\quad s\in [0,T],\\ &{} \eta _\varepsilon (T)=g(\alpha (T)),\end{array}\right. \end{aligned}$$
(3.9)

and let \(X_\varepsilon (\cdot )\) be the solution of the following closed-loop system:

$$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} dX_\varepsilon (s)=\Big \{\big [A(s,\alpha (s))+B(s,\alpha (s))\Theta _\varepsilon (s,\alpha (s))\big ]X_\varepsilon (s)+B(s,\alpha (s))v_\varepsilon (s)+b(s)\Big \}ds\\ &{}\qquad \qquad ~+\Big \{\big [C(s,\alpha (s))+D(s,\alpha (s))\Theta _\varepsilon (s,\alpha (s))\big ]X_\varepsilon (s)\\ &{}\qquad \qquad \ +D(s,\alpha (s))v_\varepsilon (s)+\sigma (s)\Big \}dW(s),\quad s\in [t,T], \\ &{} X_\varepsilon (t)=x,\quad ~\alpha (t)=i,\end{array}\right. \end{aligned}$$
(3.10)

where \(\Theta _\varepsilon :[0,T]\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{m\times n}\) and \(v_\varepsilon :[0,T]\times \Omega \rightarrow {\mathbb {R}}^m\) are defined by

$$\begin{aligned} \Theta _\varepsilon (s,\alpha (s))&= - [{{\hat{R}}}_\varepsilon (s,\alpha (s))+\varepsilon I_m]^{-1}{{\hat{S}}}_\varepsilon (s,\alpha (s)), \end{aligned}$$
(3.11)
$$\begin{aligned} v_\varepsilon (s)&= - [{{\hat{R}}}_\varepsilon (s,\alpha (s))+\varepsilon I_m]^{-1}{{\hat{\rho }}}_\varepsilon (s,\alpha (s)), \end{aligned}$$
(3.12)

with

$$\begin{aligned} {{\hat{\rho }}}_\varepsilon (s,i)&=B(s,i)^\top \eta _\varepsilon (s)+D(s,i)^\top \zeta _\varepsilon (s)+D(s,i)^\top P_\varepsilon (s,i)\sigma (s)+\rho (s,i). \end{aligned}$$
(3.13)

Then from Theorem 5.2 and Corollary 6.5 in [32], the unique open-loop optimal control of Problem (M-SLQ)\(_\varepsilon \), for the initial pair \((t,x,i)\), is given by

$$\begin{aligned} u_\varepsilon (s)=\Theta _\varepsilon (s,\alpha (s))X_\varepsilon (s)+v_\varepsilon (s), \quad ~s\in [t,T]. \end{aligned}$$
(3.14)
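Numerically, the pair \((\Theta _\varepsilon ,v_\varepsilon )\) in (3.11)–(3.12) amounts to one linear solve per \((s,i)\). The following Python sketch assembles it from matrix data frozen at a fixed \((s,i)\) (all inputs are hypothetical placeholders; \(\eta \), \(\zeta \) stand for the BSDE solution (3.9) evaluated at \(s\)).

```python
import numpy as np

def feedback(B, C, D, S, R, P, eta, zeta, sigma, rho, eps):
    """Assemble Theta_eps and v_eps from (3.4) and (3.11)-(3.13) at fixed (s, i)."""
    S_hat = B.T @ P + D.T @ P @ C + S                         # (3.4)
    R_hat = R + D.T @ P @ D
    rho_hat = B.T @ eta + D.T @ zeta + D.T @ P @ sigma + rho  # (3.13)
    K = np.linalg.inv(R_hat + eps * np.eye(R.shape[0]))       # (R_hat + eps I)^{-1}
    return -K @ S_hat, -K @ rho_hat                           # (3.11), (3.12)
```

For instance, with \(n=2\), \(m=1\), \(B=(1,0)^\top \), \(C=D=S=R=0\), \(P=I_2\) and \(\varepsilon =1\), the sketch returns \(\Theta _\varepsilon =-(1,0)\), the scalar analogue of \(-(\hat{R}_\varepsilon +\varepsilon I)^{-1}\hat{S}_\varepsilon \).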

Before stating the main result of this section, we prove the following lemma.

Lemma 3.1

Under Assumptions (H1) and (H2), for any initial pair \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\), one has

$$\begin{aligned} \lim _{\varepsilon \mathop {\downarrow }0}V_\varepsilon (t,x,i)= V(t,x,i). \end{aligned}$$
(3.15)

Proof

Let \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\) be fixed. On the one hand, for any \(\varepsilon >0\) and any \(u(\cdot )\in {{{\mathcal {U}}}}[t,T]\), we have

$$\begin{aligned} J_\varepsilon (t,x,i;u(\cdot ))=J(t,x,i;u(\cdot ))+\varepsilon {\mathbb {E}}\int _t^T|u(s)|^2ds\geqslant J(t,x,i;u(\cdot ))\geqslant V(t,x,i). \end{aligned}$$

Taking the infimum over all \(u(\cdot )\in {{{\mathcal {U}}}}[t,T]\) on the left-hand side yields

$$\begin{aligned} V_\varepsilon (t,x,i)\geqslant V(t,x,i). \end{aligned}$$
(3.16)

On the other hand, if \(V(t,x,i)\) is finite, then for any \(\delta >0\), we can find a \(u^\delta (\cdot )\in {{{\mathcal {U}}}}[t,T]\), independent of \(\varepsilon >0\), such that

$$\begin{aligned} J(t,x,i;u^\delta (\cdot ))\leqslant V(t,x,i)+\delta . \end{aligned}$$

It follows that

$$\begin{aligned} V_\varepsilon (t,x,i)&\leqslant J(t,x,i;u^\delta (\cdot ))+\varepsilon {\mathbb {E}}\int _t^T|u^\delta (s)|^2ds\leqslant V(t,x,i)+\delta \\&\quad +\varepsilon {\mathbb {E}}\int _t^T|u^\delta (s)|^2ds. \end{aligned}$$

Letting \(\varepsilon \rightarrow 0\), we obtain

$$\begin{aligned} \lim _{\varepsilon \mathop {\downarrow }0}V_\varepsilon (t,x,i)\leqslant V(t,x,i)+\delta . \end{aligned}$$
(3.17)

Since \(\delta >0\) is arbitrary, by combining (3.16) and (3.17), we obtain (3.15). A similar argument applies to the case when \(V(t,x,i)=-\infty \). \(\square \)
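The mechanism behind Lemma 3.1 is visible already in a finite-dimensional analogue: minimizing the quadratic \(J(u)=\langle Mu,u\rangle +2\langle b,u\rangle \) against its Tikhonov regularization \(J_\varepsilon (u)=J(u)+\varepsilon |u|^2\). The toy data below are hypothetical and only illustrate \(V_\varepsilon \downarrow V\).

```python
import numpy as np

# Toy analogue of Lemma 3.1: V_eps = inf_u J_eps(u) decreases to V = inf_u J(u).
M = np.diag([1.0, 0.5])              # positive definite here, so V is finite
b = np.array([1.0, -2.0])

def V_eps(eps):
    Me = M + eps * np.eye(2)
    u = -np.linalg.solve(Me, b)      # unique minimizer of J_eps
    return u @ Me @ u + 2.0 * b @ u

V = -b @ np.linalg.solve(M, b)       # exact value: V = -<M^{-1} b, b>
gaps = [V_eps(e) - V for e in (1e-1, 1e-2, 1e-3)]   # V_eps - V, shrinking with eps
```

Each gap is nonnegative, mirroring (3.16), and shrinks as \(\varepsilon \downarrow 0\), mirroring (3.15).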

Now, we present the main result of this section, which provides a characterization of the open-loop solvability of Problem (M-SLQ) in terms of the family \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\).

Theorem 3.2

Let Assumptions (H1) and (H2) and (3.1) hold. For any given initial pair \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\), let \(u_\varepsilon (\cdot )\) be defined by (3.14), which is the outcome of the closed-loop optimal strategy \((\Theta _\varepsilon (\cdot ,\cdot ),v_\varepsilon (\cdot ))\) of Problem (M-SLQ)\(_\varepsilon \). Then the following statements are equivalent:

  1.  (i)

    Problem (M-SLQ) is open-loop solvable at \((t,x,i)\);

  2.  (ii)

    The family \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) is bounded in \(L^2_{{\mathbb {F}}}(t,T;{\mathbb {R}}^m)\), i.e.,

    $$\begin{aligned} \sup _{\varepsilon >0}{\mathbb {E}}\int _t^T|u_\varepsilon (s)|^2ds<\infty ; \end{aligned}$$
  3.  (iii)

    The family \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) converges strongly in \(L^2_{{\mathbb {F}}}(t,T;{\mathbb {R}}^m)\) as \(\varepsilon \rightarrow 0\).

Proof

We begin by proving the implication (i) \(\mathop {\Rightarrow }\) (ii). Let \(v^*(\cdot )\) be an open-loop optimal control of Problem (M-SLQ) for the initial pair \((t,x,i)\). Then for any \(\varepsilon >0\),

$$\begin{aligned} V_\varepsilon (t,x,i)\leqslant & {} J_\varepsilon (t,x,i;v^*(\cdot ))=J(t,x,i;v^*(\cdot ))+\varepsilon {\mathbb {E}}\int _t^T |v^*(s)|^2ds\nonumber \\= & {} V(t,x,i)+\varepsilon {\mathbb {E}}\int _t^T |v^*(s)|^2ds. \end{aligned}$$
(3.18)

On the other hand, since \(u_\varepsilon (\cdot )\) is optimal for Problem (M-SLQ)\(_\varepsilon \) with respect to \((t,x,i)\), we have

$$\begin{aligned} V_\varepsilon (t,x,i)= & {} J_\varepsilon (t,x,i;u_\varepsilon (\cdot ))=J(t,x,i;u_\varepsilon (\cdot ))+\varepsilon {\mathbb {E}}\int _t^T |u_\varepsilon (s)|^2ds\nonumber \\\geqslant & {} V(t,x,i)+\varepsilon {\mathbb {E}}\int _t^T |u_\varepsilon (s)|^2ds. \end{aligned}$$
(3.19)

Combining (3.18) and (3.19) yields that

$$\begin{aligned} {\mathbb {E}}\int _t^T|u_\varepsilon (s)|^2ds\leqslant \frac{V_\varepsilon (t,x,i)-V(t,x,i)}{\varepsilon }\leqslant {\mathbb {E}}\int _t^T|v^*(s)|^2ds. \end{aligned}$$
(3.20)

This shows that \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) is bounded in \(L^2_{{\mathbb {F}}}(t,T;{\mathbb {R}}^m)\).
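The bound (3.20) likewise has a transparent finite-dimensional analogue: for \(J(u)=\langle Mu,u\rangle +2\langle b,u\rangle \) with \(M\succ 0\), the \(\varepsilon \)-optimal controls \(u_\varepsilon =-(M+\varepsilon I)^{-1}b\) are dominated in norm by \(u^*=-M^{-1}b\), uniformly in \(\varepsilon \). A sketch with hypothetical data:

```python
import numpy as np

# Toy analogue of (3.20): |u_eps| <= |u*| uniformly in eps > 0.
M = np.diag([1.0, 0.25])
b = np.array([1.0, 1.0])
u_star = -np.linalg.solve(M, b)                 # unregularized optimal control
norms = [np.linalg.norm(np.linalg.solve(M + e * np.eye(2), b))
         for e in (1.0, 0.1, 0.01)]             # |u_eps| for decreasing eps
```

The norms increase monotonically toward \(|u^*|\) as \(\varepsilon \downarrow 0\) but never exceed it.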

For (ii) \(\mathop {\Rightarrow }\) (i), the proof is similar to that in [24] (see Remark 3.3 below), and the implication (iii) \(\mathop {\Rightarrow }\) (ii) is trivially true.

Finally, we prove the implication (ii) \(\mathop {\Rightarrow }\) (iii). We divide the proof into two steps.

Step 1: The family \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) converges weakly to an open-loop optimal control of Problem (M-SLQ) for the initial pair \((t,x,i)\) as \(\varepsilon \rightarrow 0\).

To verify this, it suffices to show that every weakly convergent subsequence of \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) has the same weak limit, which is an open-loop optimal control of Problem (M-SLQ) for \((t,x,i)\). Let \(u^*_1(\cdot )\) and \(u^*_2(\cdot )\) be the weak limits of two weakly convergent subsequences \(\{u_{1,\varepsilon _k}(\cdot )\}_{k=1}^\infty \) and \(\{u_{2,\varepsilon _k}(\cdot )\}_{k=1}^\infty \) of \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\). The same argument as in the proof of (ii) \(\mathop {\Rightarrow }\) (i) shows that both \(u^*_1(\cdot )\) and \(u^*_2(\cdot )\) are optimal for \((t,x,i)\). Thus, recalling that the mapping \(u(\cdot )\mapsto J(t, x,i; u(\cdot ))\) is convex, we have

$$\begin{aligned} J\left( t,x,i;\frac{u^*_1(\cdot )+u^*_2(\cdot )}{2}\right) \leqslant \frac{1}{2}J(t,x,i;u^*_1(\cdot ))+\frac{1}{2}J(t,x,i;u^*_2(\cdot ))=V(t,x,i). \end{aligned}$$

This means that \(\frac{u^*_1(\cdot )+u^*_2(\cdot )}{2}\) is also optimal for Problem (M-SLQ) with respect to \((t,x,i)\). Then we can repeat the argument employed in the proof of (i) \(\mathop {\Rightarrow }\) (ii), replacing \(v^*(\cdot )\) by \(\frac{u^*_1(\cdot )+u^*_2(\cdot )}{2}\), to obtain (see (3.20))

$$\begin{aligned} {\mathbb {E}}\int _t^T|u_{i,\varepsilon _k}(s)|^2ds\leqslant {\mathbb {E}}\int _t^T\Big |\frac{u^*_1(s)+u^*_2(s)}{2}\Big |^2ds,\quad ~i=1,2. \end{aligned}$$

Now, note that

$$\begin{aligned} 0\leqslant&~{\mathbb {E}}\int _t^T|u_{i,\varepsilon _k}(s)-u_{i}^*(s)|^2ds ={\mathbb {E}}\int _t^T\Big [|u_{i,\varepsilon _k}(s)|^2-2\langle u_{i,\varepsilon _k}(s),u_{i}^*(s)\rangle +|u_i^*(s)|^2\Big ]ds, \end{aligned}$$

which implies that

$$\begin{aligned} 2{\mathbb {E}}\int _t^T\langle u_{i,\varepsilon _k}(s),u_{i}^*(s)\rangle ds-{\mathbb {E}}\int _t^T|u_i^*(s)|^2ds\leqslant {\mathbb {E}}\int _t^T|u_{i,\varepsilon _k}(s)|^2ds. \end{aligned}$$

The definition of weak convergence then yields

$$\begin{aligned} {\mathbb {E}}\int _t^T|u_i^*(s)|^2ds=&~2\liminf _{\varepsilon _k\rightarrow 0}{\mathbb {E}}\int _t^T\langle u_{i,\varepsilon _k}(s),u_{i}^*(s)\rangle ds-{\mathbb {E}}\int _t^T|u_i^*(s)|^2ds\\ \leqslant&~\liminf _{\varepsilon _k\rightarrow 0}{\mathbb {E}}\int _t^T|u_{i,\varepsilon _k}(s)|^2ds \leqslant {\mathbb {E}}\int _t^T\Big |\frac{u^*_1(s)+u^*_2(s)}{2}\Big |^2ds,\quad ~i=1,2. \end{aligned}$$

Adding the two inequalities above (for \(i=1,2\)) and then multiplying by 2, we get

$$\begin{aligned} 2\left[ {\mathbb {E}}\int _t^T|u^*_{1}(s)|^2ds+{\mathbb {E}}\int _t^T|u^*_{2}(s)|^2ds\right] \leqslant {\mathbb {E}}\int _t^T|u^*_1(s)+u^*_2(s)|^2ds, \end{aligned}$$

or equivalently (by shifting the integral on the right-hand side to the left-hand side),

$$\begin{aligned} {\mathbb {E}}\int _t^T|u^*_1(s)-u^*_2(s)|^2ds\leqslant 0. \end{aligned}$$

It follows that \(u^*_1(\cdot )=u^*_2(\cdot )\), which establishes the claim.
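The last two displays are an instance of the parallelogram identity \(2(|a|^2+|b|^2)-|a+b|^2=|a-b|^2\), which a quick numerical check confirms (hypothetical vectors):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=5), rng.normal(size=5)
lhs = 2.0 * (a @ a + b @ b) - (a + b) @ (a + b)   # left side after the shift
rhs = (a - b) @ (a - b)                            # |a - b|^2
```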

Step 2: The family \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) converges strongly as \(\varepsilon \rightarrow 0\).

According to Step 1, the family \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) converges weakly to an open-loop optimal control \(u^*(\cdot )\) of Problem (M-SLQ) for \((t,x,i)\) as \(\varepsilon \rightarrow 0\). By repeating the argument employed in the proof of (i) \(\mathop {\Rightarrow }\) (ii) with \(u^*(\cdot )\) replacing \(v^*(\cdot )\), we obtain

$$\begin{aligned} {\mathbb {E}}\int _t^T|u_\varepsilon (s)|^2ds\leqslant {\mathbb {E}}\int _t^T|u^*(s)|^2ds,\quad ~\varepsilon >0. \end{aligned}$$
(3.21)

On the other hand, since \(u^*(\cdot )\) is the weak limit of \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\), we have

$$\begin{aligned} {\mathbb {E}}\int _t^T|u^*(s)|^2ds\leqslant \liminf _{\varepsilon \rightarrow 0}{\mathbb {E}}\int _t^T|u_\varepsilon (s)|^2ds. \end{aligned}$$
(3.22)

Combining (3.21) and (3.22), we see that \({\mathbb {E}}\int _t^T|u_\varepsilon (s)|^2ds\) actually has the limit \({\mathbb {E}}\int _t^T|u^*(s)|^2ds\). Therefore (recalling that \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) converges weakly to \(u^*(\cdot )\)),

$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0}{\mathbb {E}}\int _t^T|u_\varepsilon (s)-u^*(s)|^2ds \\&\quad =\lim _{\varepsilon \rightarrow 0}\left[ {\mathbb {E}}\int _t^T|u_\varepsilon (s)|^2ds+{\mathbb {E}}\int _t^T|u^*(s)|^2ds -2{\mathbb {E}}\int _t^T\langle u^*(s),u_\varepsilon (s)\rangle ds\right] \\&\quad =0, \end{aligned}$$

which means that \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) converges strongly to \(u^*(\cdot )\) as \(\varepsilon \rightarrow 0\). \(\square \)

Remark 3.3

A similar result recently appeared in [32], which asserts that if Problem (M-SLQ) is open-loop solvable at \((t,x,i)\), then the limit of any weakly/strongly convergent subsequence of \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) is an open-loop optimal control for \((t,x,i)\). Our result sharpens that of [32] by showing that the family \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) itself converges strongly whenever Problem (M-SLQ) is open-loop solvable. This improvement has at least two advantages. First, it serves as a crucial bridge to the weak closed-loop solvability presented in the next section. Second, it is much more convenient for computational purposes, since no subsequence extraction is required.

Remark 3.4

In Example 1.1, since \(B=1\) and \(D=S=R=0\), we have

$$\begin{aligned} {{\hat{S}}}_\varepsilon (s,i)&\triangleq B(s,i)^\top P_\varepsilon (s,i)+ D(s,i)^\top P_\varepsilon (s,i)C(s,i)+S(s,i)=P_\varepsilon (s,i),\quad \hbox {with }P_\varepsilon (T,i)=1,\\ {{\hat{R}}}_\varepsilon (s,i)&\triangleq R(s,i)+D(s,i)^\top P_\varepsilon (s,i)D(s,i)=0. \end{aligned}$$

So the condition \({{{\mathcal {R}}}}\big ({\hat{S}}_\varepsilon (s,i)\big )\subseteq {{{\mathcal {R}}}}\big ({\hat{R}}_\varepsilon (s,i)\big ),\ {\mathrm{a.e.}}~s\in [0,T],\ i\in {{{\mathcal {S}}}}\) is not satisfied, which implies that GRE (1.8) has no regular solution.
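The range condition \({{{\mathcal {R}}}}({\hat{S}})\subseteq {{{\mathcal {R}}}}({\hat{R}})\) can be tested numerically through the orthogonal projector \({\hat{R}}{\hat{R}}^{+}\) onto \({{{\mathcal {R}}}}({\hat{R}})\): containment holds if and only if the projector leaves \({\hat{S}}\) unchanged. A sketch (hypothetical matrices; the first check mimics the Example 1.1 situation, where \({\hat{R}}=0\) but \({\hat{S}}\ne 0\)):

```python
import numpy as np

def range_contained(S_hat, R_hat, tol=1e-10):
    """R(S_hat) is contained in R(R_hat) iff R_hat @ pinv(R_hat) @ S_hat == S_hat."""
    proj = R_hat @ np.linalg.pinv(R_hat)   # orthogonal projector onto R(R_hat)
    return bool(np.linalg.norm(proj @ S_hat - S_hat) < tol)

# R_hat = 0 while S_hat != 0, so containment fails, as in Example 1.1.
violated = not range_contained(np.array([[1.0]]), np.zeros((1, 1)))
```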

4 Weak Closed-Loop Solvability

In this section, we study the equivalence between open-loop and weak closed-loop solvabilities of Problem (M-SLQ). We shall show that \(\Theta _\varepsilon (\cdot ,\cdot )\) and \(v_\varepsilon (\cdot )\) defined by (3.11) and (3.12) converge locally on [0, T), and that the limit pair \((\Theta ^*(\cdot ,\cdot ),v^*(\cdot ))\) is a weak closed-loop optimal strategy.

We start with a simple lemma, which enables us to work separately with \(\Theta _\varepsilon (\cdot ,\cdot )\) and \(v_\varepsilon (\cdot )\). Recall that the associated Problem (M-SLQ)\(^0\) is to minimize (1.6) subject to (1.5).

Lemma 4.1

Under Assumptions (H1) and (H2), if Problem (M-SLQ) is open-loop solvable, then so is Problem (M-SLQ)\(^0\).

Proof

For arbitrary \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\), we note that if \(b(\cdot ,\cdot ),\sigma (\cdot ,\cdot ),g(\cdot ),q(\cdot ,\cdot ),\rho (\cdot ,\cdot )=0\), then the adapted solution \((\eta _\varepsilon (\cdot ),\zeta _\varepsilon (\cdot ), \xi ^\varepsilon (\cdot ))\) to BSDE (3.9) is identically zero, and hence the process \(v_\varepsilon (\cdot )\) defined by (3.12) is also identically zero. By Theorem 3.2, to prove that Problem (M-SLQ)\(^0\) is open-loop solvable at \((t,x,i)\), it suffices to verify that the family \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) is bounded in \(L^2_{{\mathbb {F}}}(t,T;{\mathbb {R}}^m)\), where (see (3.14) and note that \(v_\varepsilon (\cdot )=0\)),

$$\begin{aligned} u_\varepsilon (\cdot )=\Theta _\varepsilon (\cdot ,\alpha (\cdot ))X_\varepsilon (\cdot ), \end{aligned}$$
(4.1)

where \(X_\varepsilon (\cdot )\) is the solution to the following equation:

$$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} dX_\varepsilon (s)=\big [A(s,\alpha (s))+B(s,\alpha (s))\Theta _\varepsilon (s,\alpha (s))\big ]X_\varepsilon (s)ds\\ &{}\quad \qquad \qquad ~ +\big [C(s,\alpha (s))+D(s,\alpha (s))\Theta _\varepsilon (s,\alpha (s))\big ]X_\varepsilon (s)dW(s),\quad s\in [t,T], \\ &{} X_\varepsilon (t)=x,\quad ~\alpha (t)=i,\end{array}\right. \end{aligned}$$
(4.2)

To this end, we return to Problem (M-SLQ). Let \(v_\varepsilon (\cdot )\) be defined by (3.12) and denote by \(X_\varepsilon ( \cdot \ ;t,x,i)\) and \(X_\varepsilon ( \cdot \ ;t,0,i)\) the solutions to (3.10) with respect to the initial pairs \((t,x,i)\) and \((t,0,i)\), respectively. Since Problem (M-SLQ) is open-loop solvable at both \((t,x,i)\) and \((t,0,i)\), by Theorem 3.2, the families

$$\begin{aligned} \begin{array}{ll} &{} u_\varepsilon (s;t,x,i)\triangleq \Theta _\varepsilon (s,\alpha (s))X_\varepsilon (s;t,x,i)+v_\varepsilon (s),\\ &{} u_\varepsilon (s;t,0,i)\triangleq \Theta _\varepsilon (s,\alpha (s))X_\varepsilon (s;t,0,i)+v_\varepsilon (s), \end{array}\quad ~s\in [t,T], \end{aligned}$$
(4.3)

are bounded in \(L^2_{{\mathbb {F}}}(t,T;{\mathbb {R}}^m)\). Note that, since the process \(v_\varepsilon (\cdot )\) is independent of the initial state, the difference \(X_\varepsilon ( \cdot \ ;t,x,i)-X_\varepsilon ( \cdot \ ;t,0,i)\) satisfies the same Eq. (4.2). Then, by the uniqueness of adapted solutions of SDEs, we obtain that

$$\begin{aligned} X_\varepsilon (\cdot )=X_\varepsilon ( \cdot \ ;t,x,i)-X_\varepsilon ( \cdot \ ;t,0,i), \end{aligned}$$

which, combined with (4.1) and (4.3), implies that

$$\begin{aligned} u_\varepsilon (\cdot )=u_\varepsilon ( \cdot \ ;t,x,i)-u_\varepsilon ( \cdot \ ;t,0,i). \end{aligned}$$

Since \(\{u_\varepsilon ( \cdot \ ;t,x,i)\}_{\varepsilon >0}\) and \(\{u_\varepsilon ( \cdot \ ;t,0,i)\}_{\varepsilon >0}\) are bounded in \(L^2_{{\mathbb {F}}}(t,T;{\mathbb {R}}^m)\), so is \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\). Finally, it follows from Theorem 3.2 that Problem (M-SLQ)\(^0\) is open-loop solvable. \(\square \)
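The cancellation exploited in this proof — two closed-loop paths driven by the same feedback and the same noise, started at \(x\) and at \(0\), differ by a solution of the homogeneous equation (4.2) — can be observed in a scalar Euler–Maruyama simulation (all coefficients hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n, h = 1000, 1e-3
A, B, C, D = 0.3, 1.0, 0.2, 0.5             # dynamics coefficients (hypothetical)
theta, v = -0.8, 0.4                         # frozen feedback pair (Theta, v)
b_, sig = 0.1, 0.2                           # inhomogeneous terms b(s), sigma(s)
dW = rng.normal(scale=np.sqrt(h), size=n)    # one shared Brownian path

def closed_loop(x0):
    # Euler-Maruyama for dX = (A X + B u + b)ds + (C X + D u + sig)dW, u = theta*X + v
    X = np.empty(n + 1); X[0] = x0
    for k in range(n):
        u = theta * X[k] + v
        X[k + 1] = X[k] + (A * X[k] + B * u + b_) * h + (C * X[k] + D * u + sig) * dW[k]
    return X

def homogeneous(x0):
    # Euler-Maruyama for the homogeneous closed-loop dynamics, as in (4.2)
    Y = np.empty(n + 1); Y[0] = x0
    for k in range(n):
        Y[k + 1] = Y[k] + (A + B * theta) * Y[k] * h + (C + D * theta) * Y[k] * dW[k]
    return Y

diff = closed_loop(1.0) - closed_loop(0.0)   # difference of the two noisy paths
hom = homogeneous(1.0)                        # homogeneous path, same noise
```

The inhomogeneous terms \(b\), \(\sigma \) and the affine part \(v\) cancel exactly in the difference, so `diff` and `hom` agree up to floating-point error.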

Next, we prove that the family \(\{\Theta _\varepsilon (\cdot ,\cdot )\}_{\varepsilon >0}\) defined by (3.11) converges locally on [0, T).

Proposition 4.2

Let (H1) and (H2) hold. Suppose that Problem (M-SLQ)\(^0\) is open-loop solvable. Then the family \(\{\Theta _\varepsilon (\cdot ,\cdot )\}_{\varepsilon >0}\) defined by (3.11) converges in \(L^2(0,T';{\mathbb {R}}^{m\times n})\) for any \(0<T'<T\); that is, there exists a locally square-integrable deterministic function \(\Theta ^*:[0,T)\times {{{\mathcal {S}}}}\rightarrow {\mathbb {R}}^{m\times n}\) such that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}{\mathbb {E}}\int _0^{T'}|\Theta _\varepsilon (s,\alpha (s))-\Theta ^*(s,\alpha (s))|^2ds=0,\quad ~\forall ~0<T'<T. \end{aligned}$$

Proof

We need to show that for any \(0<T'<T\), the family \(\{\Theta _\varepsilon (\cdot ,\cdot )\}_{\varepsilon >0}\) is Cauchy in \(L^2(0,T';{\mathbb {R}}^{m\times n})\). To this end, let us first fix an arbitrary initial pair \((t,i)\in [0,T)\times {{{\mathcal {S}}}}\) and let \(\Phi _\varepsilon (\cdot )\in L^2_{{\mathbb {F}}}(\Omega ;C([t,T];{\mathbb {R}}^{n\times n}))\) be the solution to the following SDE:

$$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} d\Phi _\varepsilon (s)=\big [A(s,\alpha (s))+B(s,\alpha (s))\Theta _\varepsilon (s,\alpha (s))\big ]\Phi _\varepsilon (s)ds\\ &{}\quad \qquad \qquad \ +\big [C(s,\alpha (s))+D(s,\alpha (s))\Theta _\varepsilon (s,\alpha (s))\big ]\Phi _\varepsilon (s)dW(s),\quad ~s\in [t,T], \\ &{} \Phi _\varepsilon (t)=I_n,\quad ~\alpha (t)=i.\end{array}\right. \end{aligned}$$
(4.4)

Clearly, for any initial state x, by the uniqueness of solutions of SDEs, the solution of (4.2) is given by

$$\begin{aligned} X_\varepsilon (s)=\Phi _\varepsilon (s)x,\quad ~s\in [t,T]. \end{aligned}$$

Since Problem (M-SLQ)\(^0\) is open-loop solvable, by Theorem 3.2, the family

$$\begin{aligned} u_\varepsilon (s)=\Theta _\varepsilon (s,\alpha (s))X_\varepsilon (s)=\Theta _\varepsilon (s,\alpha (s))\Phi _\varepsilon (s)x,\quad ~s\in [t,T],\quad \varepsilon >0 \end{aligned}$$

is strongly convergent in \(L^2_{\mathbb {F}}(t,T;{\mathbb {R}}^{m})\) for any \(x{\in }{\mathbb {R}}^n\). It follows that \(\{\Theta _\varepsilon (\cdot ,\cdot )\Phi _\varepsilon (\cdot )\}_{\varepsilon >0}\) converges strongly in \(L^2_{\mathbb {F}}(t,T;{\mathbb {R}}^{m\times n})\) as \(\varepsilon \rightarrow 0\). Denote \(U_\varepsilon (\cdot )=\Theta _\varepsilon (\cdot ,\cdot )\Phi _\varepsilon (\cdot )\) and let \(U^*(\cdot )\) be the strong limit of \(U_\varepsilon (\cdot )\). By Jensen’s inequality, we get

$$\begin{aligned} \int _t^T\big |{\mathbb {E}}[U_\varepsilon (s)]-{\mathbb {E}}[U^*(s)]\big |^2ds \leqslant {\mathbb {E}}\int _t^T\big |U_\varepsilon (s)-U^*(s)\big |^2ds\rightarrow 0\quad \hbox {as}\quad \varepsilon \rightarrow 0. \end{aligned}$$
(4.5)

Moreover, from (4.4), we see that \({\mathbb {E}}^\alpha [\Phi _\varepsilon (\cdot )]\) satisfies the following ODE:

$$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} d{\mathbb {E}}^\alpha [\Phi _\varepsilon (s)]=\Big \{{\mathbb {E}}^\alpha [A(s,\alpha (s))\Phi _\varepsilon (s)]+{\mathbb {E}}^\alpha [B(s,\alpha (s))U_\varepsilon (s)]\Big \}ds,\quad ~s\in [t,T],\\ &{} {\mathbb {E}}^\alpha [\Phi _\varepsilon (t)]=I_n,\quad ~\alpha (t)=i. \end{array}\right. \end{aligned}$$

By standard ODE results, combined with (4.5), the family of continuous functions \({\mathbb {E}}^\alpha [\Phi _\varepsilon (\cdot )]\) converges uniformly to the solution of

$$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} d{\mathbb {E}}^\alpha [\Phi ^*(s)]=\Big \{{\mathbb {E}}^\alpha [A(s,\alpha (s))\Phi ^*(s)]+{\mathbb {E}}^\alpha [B(s,\alpha (s))U^*(s)]\Big \}ds,\quad ~s\in [t,T],\\ &{} {\mathbb {E}}^\alpha [\Phi ^*(t)]=I_n,\quad ~\alpha (t)=i. \end{array}\right. \end{aligned}$$
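The ODE-stability step used here can be isolated: if the forcings converge in \(L^2\), the solutions of a linear ODE converge uniformly, by Grönwall's inequality. A scalar Euler sketch under hypothetical data:

```python
import numpy as np

def solve_ode(a, f, h, y0=1.0):
    # Explicit Euler for y' = a*y + f(s) on a uniform grid
    y = np.empty(len(f) + 1); y[0] = y0
    for k in range(len(f)):
        y[k + 1] = y[k] + h * (a * y[k] + f[k])
    return y

n, h, a = 1000, 1e-3, 0.5
s = np.arange(n) * h
f_lim = np.sin(2.0 * np.pi * s)               # limiting forcing
sup_gaps = []
for eps in (0.1, 0.01):
    f_eps = f_lim + eps * np.cos(7.0 * s)     # forcing perturbed in L^2
    sup_gaps.append(float(np.max(np.abs(solve_ode(a, f_eps, h)
                                        - solve_ode(a, f_lim, h)))))
```

The uniform gap between the two solutions shrinks linearly with the size of the forcing perturbation.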

Noting that \(\Phi _\varepsilon (t)=I_n\), we can define the following stopping time:

$$\begin{aligned} \tau \triangleq \inf \Big \{s\in [t,T];\ |\Phi _\varepsilon (s)|<\frac{1}{2}\Big \}. \end{aligned}$$

We claim that the family \(\{\Theta _\varepsilon (\cdot ,i)\}_{\varepsilon >0}\) is Cauchy in \(L^2(t,\tau ;{\mathbb {R}}^{m\times n})\) for each \(i\in {{{\mathcal {S}}}}\). Indeed, first note that when \(s\in [t,\tau ]\), one has for each \(i\in {{{\mathcal {S}}}}\),

$$\begin{aligned}&U_\varepsilon (s)=\Theta _\varepsilon (s,\alpha (s))\Phi _\varepsilon (s)~\\&\quad \Longrightarrow ~{\mathbb {E}}^\alpha [\Theta _\varepsilon (s,\alpha (s))]={\mathbb {E}}^\alpha [U_\varepsilon (s)\Phi _\varepsilon (s)^{-1}]\\&\quad \Longrightarrow ~\big |{\mathbb {E}}^\alpha [\Theta _\varepsilon (s,\alpha (s))]\big |\leqslant \{{\mathbb {E}}^\alpha [|U_\varepsilon (s)|^2]\}^{\frac{1}{2}} \{{\mathbb {E}}^\alpha [|\Phi _\varepsilon (s)^{-1}|^2]\}^{\frac{1}{2}}. \end{aligned}$$

Then we have

$$\begin{aligned}&{\mathbb {E}}\int _t^{\tau }\big |\Theta _{\varepsilon _1}(s,\alpha (s))-\Theta _{\varepsilon _2}(s,\alpha (s))\big |^2ds\\&\quad \leqslant 2{\mathbb {E}}\int _t^{\tau }\big |{\mathbb {E}}^\alpha [U_{\varepsilon _1}(s)-U_{\varepsilon _2}(s)]\big |^2\cdot \big | {\mathbb {E}}^\alpha [\Phi _{\varepsilon _1}(s)]^{-1}\big |^2ds\\&\quad \quad +2{\mathbb {E}}\int _t^{\tau }\big |{\mathbb {E}}^\alpha [U_{\varepsilon _2}(s)]\big |^2\cdot \big |{\mathbb {E}}^\alpha [\Phi _{\varepsilon _1}(s)]^{-1} -{\mathbb {E}}^\alpha [\Phi _{\varepsilon _2}(s)]^{-1}\big |^2ds\\&\quad \leqslant 2{\mathbb {E}}\int _t^{\tau }\big |{\mathbb {E}}^\alpha [U_{\varepsilon _1}(s)-U_{\varepsilon _2}(s)]\big |^2\cdot \big |{\mathbb {E}}^\alpha [\Phi _{\varepsilon _1}(s)]^{-1}\big |^2ds\\&\quad \quad +2{\mathbb {E}}\int _t^{\tau }\big |{\mathbb {E}}^\alpha [U_{\varepsilon _2}(s)]\big |^2\cdot \big |{\mathbb {E}}^\alpha [\Phi _{\varepsilon _1}(s)-\Phi _{\varepsilon _2}(s)]\big |^2 \cdot \big |{\mathbb {E}}^\alpha [\Phi _{\varepsilon _1}(s)]^{-1}\big |^2\\&\qquad \cdot \big | {\mathbb {E}}^\alpha [\Phi _{\varepsilon _2}(s)]^{-1}\big |^2ds\\&\quad \leqslant 8{\mathbb {E}}\int _t^{\tau }\big |{\mathbb {E}}^\alpha [U_{\varepsilon _1}(s)-U_{\varepsilon _2}(s)]\big |^2ds\\&\qquad +32{\mathbb {E}}\bigg [\int _t^{\tau }\big |{\mathbb {E}}^\alpha [U_{\varepsilon _2}(s)]\big |^2ds\cdot \Big (\sup _{t\leqslant s\leqslant \tau }\big |{\mathbb {E}}^\alpha [\Phi _{\varepsilon _1}(s)-\Phi _{\varepsilon _2}(s)]\big |^2\Big )\bigg ]. \end{aligned}$$

Since \(\{U_\varepsilon (\cdot )\}_{\varepsilon >0}\) is Cauchy in \(L^2_{{\mathbb {F}}}(t,T;{\mathbb {R}}^{m\times n})\) and \(\{{\mathbb {E}}^\alpha [\Phi _\varepsilon (\cdot )]\}_{\varepsilon >0}\) converges uniformly on \([t,T]\), the last two terms of the above inequality approach zero as \(\varepsilon _1,\varepsilon _2\rightarrow 0\), which implies that \(\{\Theta _\varepsilon (\cdot ,i)\}_{\varepsilon >0}\) is Cauchy in \(L^2(t,\tau ;{\mathbb {R}}^{m\times n})\) for each \(i\in {{{\mathcal {S}}}}\).

Next we use a compactness argument to prove that, for each \(i\in {{{\mathcal {S}}}}\), \(\{\Theta _\varepsilon (\cdot ,i)\}_{\varepsilon >0}\) is actually Cauchy in \(L^2(0,T';{\mathbb {R}}^{m\times n})\) for any \(0<T'<T\). Take any \(T'\in (0,T)\). From the preceding argument we see that for each \(t\in [0,T']\), there exists a small \(\Delta _t>0\) such that \(\{\Theta _\varepsilon (\cdot ,i)\}_{\varepsilon >0}\) is Cauchy in \(L^2(t,t+\Delta _t;{\mathbb {R}}^{m\times n})\). Since \([0,T']\) is compact, we can choose finitely many \(t\in [0,T']\), say, \(t_1,t_2,...,t_k,\) such that \(\{\Theta _\varepsilon (\cdot ,i)\}_{\varepsilon >0}\) is Cauchy in each \(L^2(t_j,t_j+\Delta _{t_j};{\mathbb {R}}^{m\times n})\) and \([0,T']\subseteq \bigcup _{j=1}^k[t_j,t_j+\Delta _{t_j}]\). It follows that

$$\begin{aligned}&{\mathbb {E}}\int _0^{T'}\big |\Theta _{\varepsilon _1}(s,\alpha (s))-\Theta _{\varepsilon _2}(s,\alpha (s))\big |^2ds\\&\quad \leqslant \sum _{j=1}^k{\mathbb {E}}\int _{t_j}^{t_j+\Delta _{t_j}}\big |\Theta _{\varepsilon _1}(s,\alpha (s))-\Theta _{\varepsilon _2}(s,\alpha (s))\big |^2ds \rightarrow 0\quad \hbox {as}\quad \varepsilon _1,\varepsilon _2\rightarrow 0. \end{aligned}$$

The proof is therefore completed. \(\square \)

The following result shows that the family \(\{v_\varepsilon (\cdot )\}_{\varepsilon >0}\) defined by (3.12) also converges locally on [0, T).

Proposition 4.3

Let (H1) and (H2) hold. Suppose that Problem (M-SLQ) is open-loop solvable. Then the family \(\{v_\varepsilon (\cdot )\}_{\varepsilon >0}\) defined by (3.12) converges in \(L^2_{{\mathbb {F}}}(0,T';{\mathbb {R}}^{m})\) for any \(0<T'<T\); that is, there exists a locally square-integrable process \(v^*:[0,T)\times \Omega \rightarrow {\mathbb {R}}^{m}\) such that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}{\mathbb {E}}\int _0^{T'}|v_\varepsilon (s)-v^*(s)|^2ds=0,\quad ~\forall ~0<T'<T. \end{aligned}$$

Proof

Let \(X_\varepsilon (s)\), \(0\leqslant s\leqslant T\), be the solution to the closed-loop system (3.10) with respect to initial time \(t=0\). Then, on the one hand, from the linearity of the state Eq. (1.1) and Lemma 2.1, we have

$$\begin{aligned} {\mathbb {E}}\left[ \sup _{0\leqslant s\leqslant T}|X_{\varepsilon _1}(s)-X_{\varepsilon _2}(s)|^2\right] \leqslant K{\mathbb {E}}\int _0^T|u_{\varepsilon _1}(s)-u_{\varepsilon _2}(s)|^2ds. \end{aligned}$$

On the other hand, since Problem (M-SLQ) is open-loop solvable, Theorem 3.2 implies that the family

$$\begin{aligned} u_\varepsilon (s)=\Theta _\varepsilon (s,\alpha (s))X_\varepsilon (s)+v_\varepsilon (s),\quad ~s\in [0,T];\quad ~\varepsilon >0 \end{aligned}$$
(4.6)

is Cauchy in \(L^2_{{\mathbb {F}}}(0,T;{\mathbb {R}}^m)\), i.e.,

$$\begin{aligned} {\mathbb {E}}\int _0^T|u_{\varepsilon _1}(s)-u_{\varepsilon _2}(s)|^2ds\rightarrow 0\quad \hbox {as}\quad \varepsilon _1,\varepsilon _2\rightarrow 0. \end{aligned}$$
(4.7)

Therefore

$$\begin{aligned} {\mathbb {E}}\left[ \sup _{0\leqslant s\leqslant T}|X_{\varepsilon _1}(s)-X_{\varepsilon _2}(s)|^2\right] \rightarrow 0\quad \hbox {as}\quad \varepsilon _1,\varepsilon _2\rightarrow 0. \end{aligned}$$
(4.8)

Now fix an arbitrary \(T'\in (0,T)\). Since Problem (M-SLQ) is open-loop solvable, according to Lemma 4.1 and Proposition 4.2, the family \(\{\Theta _\varepsilon (\cdot ,i)\}_{\varepsilon >0}\) is Cauchy in \(L^2(0,T';{\mathbb {R}}^{m\times n})\) for every \(i\in {{{\mathcal {S}}}}\). Thus, combined with (4.8), we have

$$\begin{aligned}&{\mathbb {E}}\int _0^{T'}\Big |\Theta _{\varepsilon _1}(s,\alpha (s))X_{\varepsilon _1}(s)-\Theta _{\varepsilon _2}(s,\alpha (s))X_{\varepsilon _2}(s)\Big |^2ds\\&\quad \leqslant 2{\mathbb {E}}\int _0^{T'}|\Theta _{\varepsilon _1}(s,\alpha (s))-\Theta _{\varepsilon _2}(s,\alpha (s))|^2ds\cdot {\mathbb {E}}\left[ \sup _{0\leqslant s\leqslant T'}|X_{\varepsilon _1}(s)|^2\right] \\&\quad \quad +2{\mathbb {E}}\int _0^{T'}|\Theta _{\varepsilon _2}(s,\alpha (s))|^2ds\cdot {\mathbb {E}}\left[ \sup _{0\leqslant s\leqslant T'}|X_{\varepsilon _1}(s)-X_{\varepsilon _2}(s)|^2\right] \\&\quad \longrightarrow 0\quad \hbox {as}\quad \varepsilon _1,\varepsilon _2\rightarrow 0, \end{aligned}$$

which, combined with (4.6) and (4.7), implies that

$$\begin{aligned}&{\mathbb {E}}\int _0^{T'}|v_{\varepsilon _1}(s)-v_{\varepsilon _2}(s)|^2ds\\&\quad ={\mathbb {E}}\int _0^{T'}\Big |[u_{\varepsilon _1}(s)-\Theta _{\varepsilon _1}(s,\alpha (s))X_{\varepsilon _1}(s)]-[u_{\varepsilon _2}(s) -\Theta _{\varepsilon _2}(s,\alpha (s))X_{\varepsilon _2}(s)]\Big |^2ds\\&\quad \leqslant 2{\mathbb {E}}\int _0^{T'}|u_{\varepsilon _1}(s)-u_{\varepsilon _2}(s)|^2ds +2{\mathbb {E}}\int _0^{T'}|\Theta _{\varepsilon _1}(s,\alpha (s))X_{\varepsilon _1}(s)\\&\qquad -\Theta _{\varepsilon _2}(s,\alpha (s))X_{\varepsilon _2}(s)|^2ds\\&\quad \longrightarrow 0\quad \hbox {as}\quad \varepsilon _1,\varepsilon _2\rightarrow 0. \end{aligned}$$

This shows that the family \(\{v_\varepsilon (\cdot )\}_{\varepsilon >0}\) converges in \(L^2_{{\mathbb {F}}}(0,T';{\mathbb {R}}^m)\). \(\square \)

We are now ready to state and prove the main result of this section, which establishes the equivalence between open-loop and weak closed-loop solvability of Problem (M-SLQ).

Theorem 4.4

Let (H1) and (H2) hold. If Problem (M-SLQ) is open-loop solvable, then the limit pair \((\Theta ^*(\cdot ,\cdot ),v^*(\cdot ))\) obtained in Propositions 4.2 and 4.3 is a weak closed-loop optimal strategy of Problem (M-SLQ) on any \([t,T)\). Consequently, the open-loop and weak closed-loop solvability of Problem (M-SLQ) are equivalent.

Proof

From Definition 2.6, it is obvious that the weak closed-loop solvability of Problem (M-SLQ) implies its open-loop solvability. In the following, we prove the converse.

Take an arbitrary initial pair \((t,x,i)\in [0,T)\times {\mathbb {R}}^n\times {{{\mathcal {S}}}}\) and let \(\{u_\varepsilon (s);t\leqslant s\leqslant T\}_{\varepsilon >0}\) be the family defined by (3.14). Since Problem (M-SLQ) is open-loop solvable at \((t,x,i)\), by Theorem 3.2, \(\{u_\varepsilon (s);t\leqslant s\leqslant T\}_{\varepsilon >0}\) converges strongly to an open-loop optimal control \(\{u^*(s);t\leqslant s\leqslant T\}\) of Problem (M-SLQ) (for the initial pair \((t,x,i)\)). Let \(\{X^*(s);t\leqslant s\leqslant T\}\) be the corresponding optimal state process; i.e., \(X^*(\cdot )\) is the adapted solution of the following equation:

$$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} dX^*(s)=\Big [A(s,\alpha (s))X^*(s)+B(s,\alpha (s))u^*(s)+b(s)\Big ]ds\\ &{}\quad \qquad \qquad +\Big [C(s,\alpha (s))X^*(s)+D(s,\alpha (s))u^*(s)+\sigma (s)\Big ]dW(s),\quad ~s\in [t,T], \\ &{} X^*(t)=x,\quad ~\alpha (t)=i.\end{array}\right. \end{aligned}$$

If we can show that

$$\begin{aligned} u^*(s)=\Theta ^*(s,\alpha (s))X^*(s)+v^*(s),\quad ~t\leqslant s<T, \end{aligned}$$
(4.9)

then \((\Theta ^*(\cdot ,\cdot ),v^*(\cdot ))\) is clearly a weak closed-loop optimal strategy of Problem (M-SLQ) on \([t,T)\). To prove (4.9), note first that, by Lemma 2.1,

$$\begin{aligned} {\mathbb {E}}\left[ \sup _{t\leqslant s\leqslant T}|X_\varepsilon (s)-X^*(s)|^2\right] \leqslant K{\mathbb {E}}\int _t^T|u_\varepsilon (s)-u^*(s)|^2ds\rightarrow 0\quad \hbox {as}\quad \varepsilon \rightarrow 0, \end{aligned}$$

where \(\{X_\varepsilon (s);t\leqslant s\leqslant T\}_{\varepsilon >0}\) is the solution of Eq. (3.10). Second, by Propositions 4.2 and 4.3, one has

$$\begin{aligned} \left\{ \begin{array}{ll} &{} \lim _{\varepsilon \rightarrow 0}{\mathbb {E}}\int _0^{T'}|\Theta _\varepsilon (s,\alpha (s))-\Theta ^*(s,\alpha (s))|^2ds=0,\quad ~\forall 0<T'<T,\\ &{} \lim _{\varepsilon \rightarrow 0}{\mathbb {E}}\int _0^{T'}|v_\varepsilon (s)-v^*(s)|^2ds=0,\quad ~\forall 0<T'<T. \end{array}\right. \end{aligned}$$

It follows that for any \(0<T'<T\),

$$\begin{aligned}&{\mathbb {E}}\int _0^{T'}\Big |\big [\Theta _\varepsilon (s,\alpha (s))X_\varepsilon (s)+v_\varepsilon (s)\big ]-\big [\Theta ^*(s,\alpha (s))X^*(s)+v^*(s)\big ]\Big |^2ds\\&\quad \leqslant 2{\mathbb {E}}\int _0^{T'}|\Theta _\varepsilon (s,\alpha (s))X_\varepsilon (s)-\Theta ^*(s,\alpha (s))X^*(s)|^2ds\\&\qquad +2{\mathbb {E}}\int _0^{T'}|v_\varepsilon (s)-v^*(s)|^2ds\\&\quad \leqslant 4{\mathbb {E}}\int _0^{T'}|\Theta _{\varepsilon }(s,\alpha (s))|^2ds\cdot {\mathbb {E}}\left[ \sup _{0\leqslant s\leqslant T'}|X_{\varepsilon }(s)-X^*(s)|^2\right] \\&\qquad +2{\mathbb {E}}\int _0^{T'}|v_\varepsilon (s)-v^*(s)|^2ds\\&\quad \quad +4{\mathbb {E}}\int _0^{T'}|\Theta _{\varepsilon }(s,\alpha (s))-\Theta ^*(s,\alpha (s))|^2ds\cdot {\mathbb {E}}\left[ \sup _{0\leqslant s\leqslant T'}|X^*(s)|^2\right] \\&\quad \longrightarrow 0\quad \hbox {as}\quad \varepsilon \rightarrow 0. \end{aligned}$$

Recall that \(u_\varepsilon (s)=\Theta _\varepsilon (s,\alpha (s))X_\varepsilon (s)+v_\varepsilon (s)\) converges strongly to \(u^*(s)\), \(t\leqslant s\leqslant T\), in \(L^2_{\mathbb {F}}(t,T;{\mathbb {R}}^m)\) as \(\varepsilon \rightarrow 0\). Thus, (4.9) must hold. The above argument shows that the open-loop solvability implies the weak closed-loop solvability. Consequently, the open-loop and weak closed-loop solvability of Problem (M-SLQ) are equivalent. This completes the proof. \(\square \)

5 Examples

There are (M-SLQ) problems that are open-loop solvable but not closed-loop solvable. For such problems, the associated GRE (3.3) admits no regular solution, so a state feedback representation of the open-loop optimal control may be impossible. In fact, Example 1.1 has already illustrated this point. However, Theorem 4.4 shows that the open-loop and weak closed-loop solvability of Problem (M-SLQ) are equivalent. In the following, we present another example to illustrate the procedure for finding weak closed-loop optimal strategies for (M-SLQ) problems that are open-loop solvable (and hence weakly closed-loop solvable) but not closed-loop solvable.

Example 5.1

In order to present the procedure more clearly, we simplify the problem. Let \(T=1\) and \(D=2\); that is, the state space of \(\alpha (\cdot )\) is \({{{\mathcal {S}}}}=\{1,2\}\). For the generator \(\lambda (s)\triangleq [\lambda _{ij}(s)]_{i, j = 1, 2}\), since each row sums to zero, i.e., \(\sum ^{2}_{j = 1} \lambda _{ij}(s) = 0\) for \(i\in {{{\mathcal {S}}}}\), we have

$$\begin{aligned} \lambda (s)=\begin{pmatrix}\lambda _{11}(s)&{}\lambda _{12}(s)\\ \lambda _{21}(s)&{}\lambda _{22}(s)\end{pmatrix} =\begin{pmatrix}\lambda _{11}(s)&{}-\lambda _{11}(s)\\ -\lambda _{22}(s)&{}\lambda _{22}(s)\end{pmatrix},\quad ~s\in [0,1]. \end{aligned}$$
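Although the example is treated analytically below, the two-state chain \(\alpha (\cdot )\) is easy to simulate, which can be useful for checking the computations numerically. The following Python sketch is ours, not part of the example; since \(\lambda (\cdot )\) is left unspecified, we assume hypothetical constant rates \(-\lambda _{11}=-\lambda _{22}=1\), and the row-sum-zero structure then determines the off-diagonal entries.

```python
import random

def simulate_alpha(i0, rates=(1.0, 1.0), T=1.0, seed=0):
    """Simulate the two-state chain alpha(.) on S = {1, 2} over [0, T].

    rates[i-1] is the (assumed constant) jump intensity -lambda_ii out of
    state i; the row-sum-zero structure forces lambda_12(s) = -lambda_11(s)
    and lambda_21(s) = -lambda_22(s).
    """
    rng = random.Random(seed)
    t, i, path = 0.0, i0, [(0.0, i0)]
    while True:
        t += rng.expovariate(rates[i - 1])  # exponential holding time in state i
        if t >= T:
            break
        i = 3 - i                            # switch 1 <-> 2
        path.append((t, i))
    return path

path = simulate_alpha(1)
assert all(state in (1, 2) for _, state in path)
```

The returned `path` lists the jump times and the successive regimes, which is all that is needed to evaluate the piecewise-constant integrands \(\int _0^s\alpha (r)dr\) appearing throughout the example.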

Consider the following Problem (M-SLQ) with one-dimensional state equation

$$\begin{aligned} \left\{ \begin{array}{ll} &{} dX(s)=\Big [-\alpha (s)X(s)+u(s)+b(s)\Big ]ds+\sqrt{2\alpha (s)}X(s)dW(s),\quad ~s\in [t,1],\\ &{} X(t)=x,\quad ~\alpha (t)=i, \end{array}\right. \end{aligned}$$
(5.1)

and the cost functional

$$\begin{aligned} J(t,x,i;u(\cdot ))={\mathbb {E}}|X(1)|^2, \end{aligned}$$

where the nonhomogeneous term \(b(\cdot ,\cdot )\) is given by

$$\begin{aligned} b(s)=\left\{ \begin{array}{ll} &{} \frac{1}{\sqrt{1-s}}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} ,\qquad \text{ if } s\in [0,1);\\ &{} 0,\qquad \text{ if } s=1. \end{array}\right. \end{aligned}$$

It is easy to see that \(b(\cdot ,i)\in L^2_{\mathbb {F}}(\Omega ;L^1(0,1;{\mathbb {R}}))\) for each \(i\in {{{\mathcal {S}}}}\). In fact,

$$\begin{aligned}&\ \ {\mathbb {E}}\left( \int _0^1|b(s)|ds\right) ^2\\&\quad ={\mathbb {E}}\left( \int _0^1\frac{1}{\sqrt{1-s}}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} ds\right) ^2\\&\quad \leqslant {\mathbb {E}}\left( \int _0^1\frac{1}{\sqrt{1-s}}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-\int _0^s\alpha (r)dr\right\} ds\right) ^2\\&\quad \leqslant {\mathbb {E}}\left( \int _0^1\frac{1}{\sqrt{1-s}}ds\cdot \sup _{0\leqslant s\leqslant 1}\exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-\int _0^s\alpha (r)dr\right\} \right) ^2\\&\quad =\left( \int _0^1\frac{1}{\sqrt{1-s}}ds\right) ^2\cdot {\mathbb {E}}\left( \sup _{0\leqslant s\leqslant 1}\exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-\int _0^s\alpha (r)dr\right\} \right) ^2\\&\quad =4\ {\mathbb {E}}\left( \sup _{0\leqslant s\leqslant 1}\exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-\int _0^s\alpha (r)dr\right\} \right) ^2. \end{aligned}$$

Since \(\alpha (\cdot )\) takes values in \({{{\mathcal {S}}}}=\{1,2\}\), the process \(\exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-\int _0^s\alpha (r)dr\right\} \) is a square-integrable martingale, and it follows from Doob’s maximal inequality that

$$\begin{aligned}&{\mathbb {E}}\left( \sup _{0\leqslant s\leqslant 1}\exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-\int _0^s\alpha (r)dr\right\} \right) ^2\\&\quad \leqslant 4{\mathbb {E}}\exp \left\{ 2\int _0^1\sqrt{2\alpha (r)}dW(r)-2\int _0^1\alpha (r)dr\right\} \\&\quad \leqslant 4e^4. \end{aligned}$$

Thus,

$$\begin{aligned} {\mathbb {E}}\left( \int _0^1|b(s)|ds\right) ^2\leqslant 16e^4, \end{aligned}$$

which implies that \(b(\cdot ,i)\in L^2_{\mathbb {F}}(\Omega ;L^1(0,1;{\mathbb {R}}))\) for each \(i\in {{{\mathcal {S}}}}\).
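The constant \(16e^4\) can be sanity-checked numerically. The following Python sketch is ours (the quadrature helper `integrate` and the tolerances are our assumptions): it verifies \(\int _0^1(1-s)^{-1/2}ds=2\), which produces the factor \(4\) in the estimate, and then the resulting bound \(4\cdot 4e^4=16e^4\).

```python
from math import exp

def integrate(g, a, b, n=200_000):
    """Composite midpoint rule; tolerates the integrable endpoint singularity."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

val = integrate(lambda s: (1.0 - s) ** -0.5, 0.0, 1.0)
assert abs(val - 2.0) < 0.01                 # \int_0^1 (1-s)^{-1/2} ds = 2
bound = val ** 2 * 4.0 * exp(4.0)            # (\int f)^2 * 4e^4
assert abs(bound - 16.0 * exp(4.0)) / (16.0 * exp(4.0)) < 0.01
```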

We first claim that this (M-SLQ) problem is not closed-loop solvable on any [t, 1]. Indeed, the generalized Riccati equation associated with this problem reads

$$\begin{aligned} \left\{ \begin{array}{ll} &{} {\dot{P}}(s,1)+\lambda _{11}(s)P(s,1)-\lambda _{11}(s)P(s,2)=0,\quad ~{\mathrm{a.e.}}~s\in [t,1],\quad ~\\ &{} P(1,1)=1, \end{array}\right. \text{ for } i=1, \end{aligned}$$

and

$$\begin{aligned} \left\{ \begin{array}{ll} &{} {\dot{P}}(s,2)-\lambda _{22}(s)P(s,1)+\lambda _{22}(s)P(s,2)=0,\quad {\mathrm{a.e.}}~s\in [t,1],\quad ~\\ &{} P(1,2)=1, \end{array}\right. \text{ for } i=2, \end{aligned}$$

whose solution is \(P(s,1)=P(s,2)=1\); that is, \(P(s,i)\equiv 1\) for \((s,i)\in [0,1]\times {{{\mathcal {S}}}}\). Then for any \(s\in [t,1]\) and \(i\in {{{\mathcal {S}}}}\), we have

$$\begin{aligned} \begin{array}{ll} &{} {{{\mathcal {R}}}}\big ({\hat{S}}(s,i)\big )={{{\mathcal {R}}}}(1)={\mathbb {R}},\\ &{} {{{\mathcal {R}}}}\big ({\hat{R}}(s,i)\big )={{{\mathcal {R}}}}(0)=\{0\},\quad ~ \end{array}\Longrightarrow \quad ~{{{\mathcal {R}}}}\big ({\hat{S}}(s,i)\big )\nsubseteq {{{\mathcal {R}}}}\big ({\hat{R}}(s,i)\big ), \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} {{\hat{S}}}(s,i)&\triangleq B(s,i)^\top P(s,i)+ D(s,i)^\top P(s,i)C(s,i)+S(s,i), \\ {{\hat{R}}}(s,i)&\triangleq R(s,i)+D(s,i)^\top P(s,i)D(s,i). \end{aligned} \end{aligned}$$
(5.2)

Therefore, the range inclusion condition is not satisfied. This implies that our claim holds.

In the following, we use Theorem 3.2 to conclude that the above (M-SLQ) problem is open-loop solvable (and hence, by Theorem 4.4, weakly closed-loop solvable). Without loss of generality, we consider only the open-loop solvability at \(t=0\). To this end, let \(\varepsilon >0\) be arbitrary and consider the Riccati equation (3.3), which, in our example, reads:

$$\begin{aligned} \left\{ \begin{array}{ll} &{} {\dot{P}}_\varepsilon (s,1)-\frac{1}{\varepsilon }P_\varepsilon (s,1)^2+\lambda _{11}(s)P_\varepsilon (s,1)-\lambda _{11}(s)P_\varepsilon (s,2)=0,\quad ~{\mathrm{a.e.}}~s\in [t,1],\quad ~\\ &{} P_\varepsilon (1,1)=1, \end{array}\right. \text{ for } i=1, \end{aligned}$$

and

$$\begin{aligned} \left\{ \begin{array}{ll} &{} {\dot{P}}_\varepsilon (s,2)-\frac{1}{\varepsilon }P_\varepsilon (s,2)^2-\lambda _{22}(s)P_\varepsilon (s,1)+\lambda _{22}(s)P_\varepsilon (s,2)=0,\quad {\mathrm{a.e.}}~s\in [t,1],\quad ~\\ &{} P_\varepsilon (1,2)=1, \end{array}\right. \text{ for } i=2. \end{aligned}$$

Solving the above equations yields

$$\begin{aligned} P_\varepsilon (s,1)=P_\varepsilon (s,2)=\frac{\varepsilon }{\varepsilon +1-s},\quad ~s\in [0,1]. \end{aligned}$$

that is,

$$\begin{aligned} P_\varepsilon (s,i)=\frac{\varepsilon }{\varepsilon +1-s},\quad ~(s,i)\in [0,1]\times {{{\mathcal {S}}}}. \end{aligned}$$
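Since \(P_\varepsilon (s,1)=P_\varepsilon (s,2)\), the coupling terms involving \(\lambda _{ij}\) cancel, and both equations reduce to the scalar Riccati equation \({\dot{P}}_\varepsilon =P_\varepsilon ^2/\varepsilon \) with \(P_\varepsilon (1)=1\). The following residual check is a numerical sketch of ours (the value of \(\varepsilon \), the step size, and the sample points are arbitrary):

```python
# Candidate solution of the reduced scalar Riccati equation dP/ds = P^2/eps,
# P(1) = 1 (the lambda_ij coupling terms cancel because P(.,1) = P(.,2)).
def P(s, eps):
    return eps / (eps + 1.0 - s)

eps, h = 0.3, 1e-6
for s in (0.0, 0.25, 0.5, 0.75, 0.99):
    dP = (P(s + h, eps) - P(s - h, eps)) / (2.0 * h)   # central difference
    assert abs(dP - P(s, eps) ** 2 / eps) < 1e-6       # Riccati residual ~ 0
assert abs(P(1.0, eps) - 1.0) < 1e-12                  # terminal condition
```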

Noting that the state space of \(\alpha (s)\) is \({{{\mathcal {S}}}}=\{1,2\}\), we let

$$\begin{aligned}&\Theta _\varepsilon (s,\alpha (s))\triangleq -[{{\hat{R}}}_\varepsilon (s,\alpha (s))+\varepsilon I_m]^{-1}{{\hat{S}}}_\varepsilon (s,\alpha (s))\nonumber \\&\quad =-\frac{P_\varepsilon (s,\alpha (s))}{\varepsilon }=-\frac{1}{\varepsilon +1-s},\quad ~s\in [0,1]. \end{aligned}$$
(5.3)

Then, the corresponding BSDE (3.9) reads

$$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} d\eta _\varepsilon (s)=-\Big \{\big [\Theta _\varepsilon (s,\alpha (s))-\alpha (s)\big ]\eta _\varepsilon (s)+\sqrt{2\alpha (s)}\zeta _\varepsilon (s)+P_\varepsilon (s,\alpha (s))b(s)\Big \}ds\\ &{} \qquad \qquad \qquad +\zeta _\varepsilon (s) dW(s)+\sum _{k,l=1}^2\xi ^\varepsilon _{kl}(s)d{\widetilde{N}}_{kl}(s),\quad ~s\in [0,1],\\ &{} \eta _\varepsilon (1)=0.\end{array}\right. \end{aligned}$$

Let \(f(s)=\frac{1}{\sqrt{1-s}}\). Using the variation of constants formula for BSDEs, and noting that \(W(\cdot )\) and \({\widetilde{N}}_{kl}(\cdot )\) are \(({\mathbb {F}}, {\mathbb {P}})\)-martingales, we obtain

$$\begin{aligned} \eta _\varepsilon (s)= & {} \frac{\varepsilon }{\varepsilon +1-s}\cdot \exp \left\{ 2\int _0^s\alpha (r)dr-\int _0^s\sqrt{2\alpha (r)}dW(r)\right\} \\&\cdot {\mathbb {E}}\left[ \int _s^1b(r) \cdot \exp \left\{ \int _0^r\sqrt{2\alpha ({\bar{r}})}dW({\bar{r}})-2\int _0^r\alpha ({\bar{r}})d{\bar{r}}\right\} dr\bigg |{{{\mathcal {F}}}}_s\right] \\= & {} \frac{\varepsilon }{\varepsilon +1-s}\cdot \exp \left\{ 2\int _0^s\alpha (r)dr-\int _0^s\sqrt{2\alpha (r)}dW(r)\right\} \\&\cdot \int _s^1f(r) \cdot {\mathbb {E}}\left[ \exp \left\{ 2\int _0^r\sqrt{2\alpha ({\bar{r}})}dW({\bar{r}}) -4\int _0^r\alpha ({\bar{r}})d{\bar{r}}\right\} \bigg |{{{\mathcal {F}}}}_s\right] dr\\= & {} \frac{\varepsilon }{\varepsilon +1-s}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} \cdot \int _s^1f(r)dr,\quad ~s\in [0,1]. \end{aligned}$$

It should be pointed out that, in the above equality, we used Fubini’s theorem and the martingale property, i.e.,

$$\begin{aligned}&{\mathbb {E}}\left[ \exp \left\{ 2\int _0^r\sqrt{2\alpha ({\bar{r}})}dW({\bar{r}}) -4\int _0^r\alpha ({\bar{r}})d{\bar{r}}\right\} \bigg |{{{\mathcal {F}}}}_s\right] \\&\quad =\exp \left\{ 2\int _0^s\sqrt{2\alpha ({\bar{r}})}dW({\bar{r}}) -4\int _0^s\alpha ({\bar{r}})d{\bar{r}}\right\} ,\quad ~0\leqslant s\leqslant r\leqslant 1. \end{aligned}$$

Now, let

$$\begin{aligned}&v_\varepsilon (s)\triangleq -[{{\hat{R}}}_\varepsilon (s,\alpha (s))+\varepsilon I_m]^{-1}{{\hat{\rho }}}_\varepsilon (s,\alpha (s))=-\frac{\eta _\varepsilon (s)}{\varepsilon }\nonumber \\&\quad =-\frac{1}{\varepsilon +1-s}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} \cdot \int _s^1f(r)dr,\quad ~s\in [0,1].\nonumber \\ \end{aligned}$$
(5.4)

Then, the corresponding closed-loop system (3.10) can be written as

$$\begin{aligned} \left\{ \negthinspace \negthinspace \begin{array}{ll} &{} dX_\varepsilon (s)=\Big \{\big [\Theta _\varepsilon (s,\alpha (s))-\alpha (s)\big ]X_\varepsilon (s)+v_\varepsilon (s)+b(s)\Big \}ds +\sqrt{2\alpha (s)}X_\varepsilon (s)dW(s),\quad s\in [0,1], \\ &{} X_\varepsilon (0)=x,\end{array}\right. \end{aligned}$$

By the variation of constants formula for SDEs, we get

$$\begin{aligned} X_\varepsilon (s)= & {} (\varepsilon +1-s)\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} \\&\quad \cdot \int _0^s\left[ \frac{1}{\varepsilon +1-r}\cdot \exp \left\{ -\int _0^r\sqrt{2\alpha ({\bar{r}})}dW({\bar{r}})+2\int _0^r\alpha ({\bar{r}})d{\bar{r}}\right\} \right. \\&\quad \left. \cdot \big (v_\varepsilon (r)+b(r,\alpha (r))\big )\right] dr\\&+x\cdot \frac{\varepsilon +1-s}{\varepsilon +1}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} ,\quad ~s\in [0,1]. \end{aligned}$$

In light of Theorem 3.2, in order to prove the open-loop solvability at (0, xi), it suffices to show the family \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) defined by

$$\begin{aligned}&u_\varepsilon (s)\triangleq \Theta _\varepsilon (s,\alpha (s))X_\varepsilon (s)+v_\varepsilon (s)\nonumber \\&\quad =-\exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} \nonumber \\&\qquad \cdot \int _0^s\left[ \frac{1}{\varepsilon +1-r}\cdot \exp \left\{ -\int _0^r\sqrt{2\alpha ({\bar{r}})}dW({\bar{r}})\nonumber \right. \right. \\&\qquad \left. \left. +2\int _0^r\alpha ({\bar{r}})d{\bar{r}}\right\} \cdot \big (v_\varepsilon (r)+b(r,\alpha (r))\big )\right] dr\nonumber \\&\qquad -\frac{x}{\varepsilon +1}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} +v_\varepsilon (s),\quad ~s\in [0,1],\nonumber \\ \end{aligned}$$
(5.5)

is bounded in \(L^2_{{\mathbb {F}}}(0,1;{\mathbb {R}})\). For this, let us first simplify (5.5). On the one hand, by Fubini’s theorem,

$$\begin{aligned}&\int _0^s\left[ \frac{1}{\varepsilon +1-r}\cdot \exp \left\{ -\int _0^r\sqrt{2\alpha ({\bar{r}})}dW({\bar{r}})+2\int _0^r\alpha ({\bar{r}})d{\bar{r}}\right\} \cdot v_\varepsilon (r)\right] dr\\&\quad =-\int _0^s\frac{1}{(\varepsilon +1-r)^2}\int _r^1f({\bar{r}})d{\bar{r}}dr\\&\quad =-\int _0^sf({{\bar{r}}})\int _0^{{{\bar{r}}}}\frac{1}{(\varepsilon +1-r)^2}drd{\bar{r}} -\int _s^1f({{\bar{r}}})\int _0^s\frac{1}{(\varepsilon +1-r)^2}drd{{\bar{r}}}\\&\quad =-\int _0^s\frac{1}{\varepsilon +1-{\bar{r}}}\cdot f({{\bar{r}}})d{\bar{r}}+\frac{1}{\varepsilon +1}\int _0^1f({{\bar{r}}})d{{\bar{r}}} -\frac{1}{\varepsilon +1-s}\int _s^1f({{\bar{r}}})d{{\bar{r}}}. \end{aligned}$$
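The last equality can be confirmed numerically. In the following Python sketch (ours, not part of the paper), we use the closed forms \(\int _r^1f({\bar{r}})d{\bar{r}}=2\sqrt{1-r}\), \(\int _0^1f=2\), and \(\int _s^1f=2\sqrt{1-s}\), together with a hypothetical midpoint-rule helper `integrate` and arbitrary test values of \(\varepsilon \) and \(s\).

```python
from math import sqrt

def integrate(g, a, b, n=100_000):
    """Composite midpoint rule."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

eps, s = 0.2, 0.7                        # arbitrary test values with 0 < s < 1
f = lambda r: 1.0 / sqrt(1.0 - r)

# Second line above: -int_0^s (eps+1-r)^{-2} int_r^1 f, with int_r^1 f = 2*sqrt(1-r).
lhs = -integrate(lambda r: 2.0 * sqrt(1.0 - r) / (eps + 1.0 - r) ** 2, 0.0, s)
# Last line above, with int_0^1 f = 2 and int_s^1 f = 2*sqrt(1-s).
rhs = (-integrate(lambda r: f(r) / (eps + 1.0 - r), 0.0, s)
       + 2.0 / (eps + 1.0)
       - 2.0 * sqrt(1.0 - s) / (eps + 1.0 - s))
assert abs(lhs - rhs) < 1e-6
```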

Similarly, on the other hand,

$$\begin{aligned}&\int _0^s\left[ \frac{1}{\varepsilon +1-r}\cdot \exp \left\{ -\int _0^r\sqrt{2\alpha ({\bar{r}})}dW({\bar{r}})+2\int _0^r\alpha ({\bar{r}})d{\bar{r}}\right\} \cdot b(r,\alpha (r))\right] dr\\&\quad =\int _0^s\frac{1}{\varepsilon +1-r}f(r)dr. \end{aligned}$$

Consequently, we get

$$\begin{aligned} u_\varepsilon (s)= & {} -\left( \frac{x}{\varepsilon +1}+\frac{1}{\varepsilon +1}\int _0^1f(r)dr\right) \cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} \nonumber \\= & {} -\frac{x+2}{\varepsilon +1}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} . \end{aligned}$$
(5.6)

A short calculation gives

$$\begin{aligned} {\mathbb {E}}\int _0^1|u_\varepsilon (s)|^2ds=\left( \frac{x+2}{\varepsilon +1}\right) ^2\leqslant (x+2)^2,\quad ~\forall \varepsilon >0. \end{aligned}$$

Therefore, \(\{u_\varepsilon (\cdot )\}_{\varepsilon >0}\) is bounded in \(L^2_{{\mathbb {F}}}(0,1;{\mathbb {R}})\). Now, letting \(\varepsilon \rightarrow 0\) in (5.6), we obtain an open-loop optimal control:

$$\begin{aligned} u^*(s)=-(x+2)\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} ,\quad ~s\in [0,1]. \end{aligned}$$

Note that, like the state process \(X(\cdot )\) of (5.1), the open-loop optimal control \(u^*(\cdot )\) depends on the regime switching process \(\alpha (\cdot )\); that is, as the value of \(\alpha (\cdot )\) varies, the open-loop optimal control \(u^*(\cdot )\) changes accordingly.

Finally, we let \(\varepsilon \rightarrow 0\) in (5.3) and (5.4) to get a weak closed-loop optimal strategy \((\Theta ^*(\cdot ,\cdot ),v^*(\cdot ))\):

$$\begin{aligned}&\Theta ^*(s,\alpha (s))=\lim _{\varepsilon \rightarrow 0}\Theta _\varepsilon (s,\alpha (s))=-\frac{1}{1-s},\qquad s\in [0,1),\\&v^*(s)=\lim _{\varepsilon \rightarrow 0}v_\varepsilon (s)= -\frac{1}{1-s}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} \cdot \int _s^1f(r)dr\\&\qquad \qquad \qquad \quad ~=-\frac{2}{\sqrt{1-s}}\cdot \exp \left\{ \int _0^s\sqrt{2\alpha (r)}dW(r)-2\int _0^s\alpha (r)dr\right\} , \qquad s\in [0,1). \end{aligned}$$
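The second expression for \(v^*\) uses \(\int _s^1f(r)dr=\int _s^1(1-r)^{-1/2}dr=2\sqrt{1-s}\). A brief numerical check (the helper `integrate` and the tolerance are our assumptions, not part of the paper):

```python
from math import sqrt

def integrate(g, a, b, n=200_000):
    """Composite midpoint rule; tolerates the integrable endpoint singularity."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

for s in (0.0, 0.3, 0.6, 0.9):
    tail = integrate(lambda r: 1.0 / sqrt(1.0 - r), s, 1.0)
    assert abs(tail - 2.0 * sqrt(1.0 - s)) < 0.01     # int_s^1 f = 2*sqrt(1-s)
```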

We point out that neither \(\Theta ^*(\cdot ,\cdot )\) nor \(v^*(\cdot )\) is square-integrable on [0, 1). Indeed, one has

$$\begin{aligned} {\mathbb {E}}\int _0^1|\Theta ^*(s,\alpha (s))|^2ds= & {} \int _0^1\frac{1}{(1-s)^2}ds=\infty ,\\ {\mathbb {E}}\int _0^1|v^*(s)|^2ds= & {} {\mathbb {E}}\int _0^1\frac{4}{1-s}\cdot \exp \left\{ 2\int _0^s\sqrt{2\alpha (r)}dW(r)-4\int _0^s\alpha (r)dr\right\} ds\\= & {} {\mathbb {E}}\int _0^1\frac{4}{1-s}ds=\infty . \end{aligned}$$
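The divergence of the first integral can also be observed numerically: the truncated integrals \(\int _0^{1-\delta }(1-s)^{-2}ds=1/\delta -1\) blow up as \(\delta \rightarrow 0\). A small Python sketch (the helper `truncated_integral` is ours):

```python
def truncated_integral(delta, n=100_000):
    """Midpoint rule for int_0^{1-delta} (1-s)^{-2} ds."""
    b = 1.0 - delta
    h = b / n
    return sum((1.0 - (k + 0.5) * h) ** -2 for k in range(n)) * h

for delta in (1e-1, 1e-2, 1e-3):
    exact = 1.0 / delta - 1.0            # closed form of the truncated integral
    assert abs(truncated_integral(delta) - exact) / exact < 1e-2
# the truncated integrals grow like 1/delta, so the full integral is infinite
```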

6 Conclusions

In this paper, we have studied the open-loop and weak closed-loop solvability of a class of stochastic LQ optimal control problems for Markovian regime switching systems. The main result is that these two notions of solvability are equivalent. First, using a perturbation approach, we provided an alternative characterization of open-loop solvability. Then we investigated the weak closed-loop solvability of the LQ problem for Markovian regime switching systems and established the equivalence between open-loop and weak closed-loop solvability. Finally, we presented an example to illustrate the procedure for finding weak closed-loop optimal strategies in the Markovian regime switching setting.