1 Introduction

Stochastic optimal control problems for regime-switching models have been studied by many authors (cf., e.g., [1,2,3,4]). A main reason regime-switching models have received so much attention is their ability to capture the different modes of the financial market in a tractable way. The shifts from one regime to another may be triggered by a change in economic policy, e.g., an exchange rate policy, or by a major event, e.g., the bankruptcy of Lehman Brothers in September 2008.

Moreover, in the real world, investors tend to look at the historical performance of risky assets. This leads us to incorporate a time delay into the model, which may represent the memory in the dynamics of the system or the inertia of the financial market. A comprehensive treatment of the theory of stochastic differential delay equations (SDDEs) can be found in the monograph by Mohammed [5]. On the other hand, the results of the modern theory of regime-switching models with delay are presented in the monograph by Mao and Yuan [6].

Optimal control of SDDEs has already been studied by various authors; see, for example, Øksendal and Sulem [7], Larssen [8] and Elsanosi et al. [9] and the references therein. A stochastic maximum principle for a forward–backward delayed regime-switching diffusion model has been given by Lv et al. [2]. The stochastic maximum principle is one of the main approaches of stochastic optimal control theory; it is the stochastic extension of Pontryagin's maximum principle, which is used for the optimal control of deterministic dynamic systems; here we refer to, for example, the monographs by Øksendal and Sulem [10] and Yong and Zhou [11]. Moreover, to solve optimal control problems by the tools of the maximum principle, it is well known that one needs the adjoint equations, represented by backward stochastic differential equations (BSDEs) (cf., e.g., [12,13,14,15]). A different perspective on BSDEs can be found in [15], which studies BSDEs driven solely by Markov chains. Furthermore, when the state equation is an SDDE, one has to consider a new form of BSDEs for the adjoint equations, called anticipated (time-advanced) BSDEs (ABSDEs). Peng and Yang [16] developed the duality between SDDEs and ABSDEs and provided some main results related to ABSDEs. Øksendal et al. [17] and Tu and Hao [18] extended the existence–uniqueness results of ABSDEs to jump-diffusion models. To the best of our knowledge, our work gives the first extension of the stochastic maximum principle to a Markov regime-switching jump-diffusion model with delay (SDDEJR) and the first existence–uniqueness theorem for ABSDEs with jumps and regimes.

The paper is organized as follows. The model setup and the optimal control problem are presented in Sect. 2. In Sect. 3, we prove the existence–uniqueness theorem for ABSDEs with jumps and regimes. In Sects. 4 and 5, sufficient and necessary maximum principles are developed under full information; in Sects. 6 and 7, they are extended to partial information. An optimal consumption problem from a cash flow with delay is studied in Sect. 8. The final section is devoted to the conclusions.

2 Model Setup and the Control Problem

Throughout the paper we work with a finite time horizon \(T>0\), which is the maturity time. Let \((N(\mathrm{d}t,\mathrm{d}z):t\in [0,T],z\in {\mathbb {R}}_{0})\) be a Poisson random measure on \(([0,T]\times {\mathbb {R}}_{0},\mathscr {B}([0,T])\otimes \mathscr {B}_{0})\), where \({\mathbb {R}}_{0}:={\mathbb {R}}\setminus \left\{ 0\right\} \) and \(\mathscr {B}_{0}\) is the Borel \(\sigma \)-field generated by the open subsets O of \({\mathbb {R}}_{0}\) whose closure does not contain the point 0. Let \(\tilde{N}(\mathrm{d}t,\mathrm{d}z):=N(\mathrm{d}t,\mathrm{d}z)-\nu (\mathrm{d}z)\mathrm{d}t\) be the compensated Poisson random measure, where \(\nu \) is the Lévy measure of the jump measure \(N(\cdot ,\cdot )\). Furthermore, let \(\left( W(t):t\in [0,T]\right) \) be a Brownian motion and \(\left( \alpha (t): t\in [0,T]\right) \) be a continuous-time, finite-state, observable Markov chain. Let \((\Omega ,{\mathbb {F}},\mathscr {F}_{t},{\mathbb {P}})\) be the complete filtered probability space generated by the Brownian motion \(W(\cdot )\), the Poisson random measure \(N(\cdot ,\cdot )\) and the Markov chain \(\alpha (\cdot )\), where \({\mathbb {F}}=\left( \mathscr {F}_{t}:t\in [0,T]\right) \) is a right-continuous, \({\mathbb {P}}\)-completed filtration. We assume that the Brownian motion, the Markov chain and the Poisson random measure are independent of each other and adapted to \({\mathbb {F}}\).

The finite state space of the Markov chain \(\alpha (t)\), \(S=\left\{ e_{1},e_{2},\ldots ,e_{D}\right\} \), is called a canonical state space, where \(D\in {\mathbb {N}}\), \(e_{i}\in {\mathbb {R}}^{D}\) and the jth component of \(e_{i}\) is the Kronecker delta \(\delta _{ij}\) for each pair \(i,j=1,2,\ldots ,D\). We suppose that the chain is homogeneous and irreducible. The generator of the chain under \({\mathbb {P}}\) is denoted by \(\Lambda :=[\lambda _{ij}]_{i,j=1,2,\ldots ,D}\), where, for each \(i,j=1,2,\ldots ,D\), \(\lambda _{ij}\) is the constant transition intensity of the chain from state \(e_{i}\) to state \(e_{j}\). For \(i\ne j\), \(\lambda _{ij}\ge 0\) and \(\sum _{j=1}^{D}\lambda _{ij}=0\); hence, \(\lambda _{ii}\le 0.\) In fact, we suppose that for each \(i,j=1,2,\ldots ,D\) with \(i\ne j\), \(\lambda _{ij}>0\) and \(\lambda _{ii}<0\).

Elliott et al. [19] obtained the following semimartingale representation for the chain \(\alpha \):

$$\begin{aligned} \alpha (t)=\alpha (0)+\int _{0}^{t}\Lambda ^{T}\alpha (u)\mathrm{d}u+M(t), \end{aligned}$$

where \(\left( M(t):t\in [0,T]\right) \) is an \({\mathbb {R}}^{D}\)-valued \(({\mathbb {F}},{\mathbb {P}})\)-martingale and \(\Lambda ^{T}\) denotes the transpose of the matrix \(\Lambda \).
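Before proceeding, it may help to see this representation numerically. The following is a minimal sketch (our own illustration; the generator entries, grid size and number of paths are arbitrary choices), which simulates the chain from \(\Lambda \) and checks that \(M(T)=\alpha (T)-\alpha (0)-\int _{0}^{T}\Lambda ^{T}\alpha (u)\mathrm{d}u\) has approximately zero mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state generator (D = 2, our choice); rows sum to zero.
Lam = np.array([[-0.5, 0.5],
                [0.3, -0.3]])
D, T, n = 2, 1.0, 2_000
dt = T / n

def simulate_chain(Lam, n, dt, i0=0):
    """One Euler-discretized path of the chain, stored as unit vectors e_i."""
    path = np.zeros((n + 1, D))
    i = i0
    path[0, i] = 1.0
    for k in range(n):
        probs = Lam[i] * dt                  # P(jump to j) ~ lambda_ij * dt, j != i
        probs[i] = 1.0 + Lam[i, i] * dt      # probability of staying in state i
        i = rng.choice(D, p=probs)
        path[k + 1, i] = 1.0
    return path

# Check the semimartingale representation: M(T) = alpha(T) - alpha(0)
# - int_0^T Lam^T alpha(u) du should have zero mean, up to Monte Carlo error.
paths = np.array([simulate_chain(Lam, n, dt) for _ in range(200)])
drift_T = (paths[:, :-1] @ Lam).sum(axis=1) * dt    # int_0^T (Lam^T alpha(u))^T du
M_T = paths[:, -1] - paths[:, 0] - drift_T
print("mean of M(T):", M_T.mean(axis=0))            # approximately (0, 0)
```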

Let us introduce a set of Markov jump martingales associated with the chain \(\alpha \).

For each \(i,j=1,2,\ldots ,D\) with \(i\ne j\) and \(t\in [0,T]\), let \(J^{ij}(t)\) denote the number of jumps from state \(e_{i}\) to state \(e_{j}\) up to time t. Then,

$$\begin{aligned} J^{ij}(t)&:=\sum \limits _{0<s\le t}\left\langle \alpha (s-),e_{i}\right\rangle \left\langle \alpha (s),e_{j}\right\rangle \\&=\sum \limits _{0<s\le t}\left\langle \alpha (s-),e_{i}\right\rangle \left\langle \alpha (s)-\alpha (s-),e_{j}\right\rangle \\&=\int _{0}^{t}\left\langle \alpha (s-),e_{i}\right\rangle \left\langle \mathrm{d}\alpha (s),e_{j}\right\rangle \\&=\int _{0}^{t}\left\langle \alpha (s-),e_{i}\right\rangle \left\langle \Lambda ^{T}\alpha (s),e_{j}\right\rangle \mathrm{d}s+\int _{0}^{t}\left\langle \alpha (s-),e_{i}\right\rangle \left\langle \mathrm{d}M(s),e_{j}\right\rangle \\&=\lambda _{ij}\int _{0}^{t}\left\langle \alpha (s-),e_{i}\right\rangle \mathrm{d}s+m_{ij}(t), \end{aligned}$$

where the \(m_{ij}\) are \((\mathbb {F},\mathbb {P})\)-martingales, called the basic martingales associated with the chain \(\alpha \). For each fixed \(j=1,2,\ldots ,D\), let \(\Phi _{j}(t)\) be the number of jumps into state \(e_{j}\) up to time t. Then,

$$\begin{aligned} \Phi _{j}(t)&:=\sum \limits _{i=1,i\ne j}^{D}J^{ij}(t) \\&=\sum \limits _{i=1,i\ne j}^{D} \lambda _{ij}\int _{0}^{t}\left\langle \alpha (s-),e_{i}\right\rangle \mathrm{d}s+ {\tilde{\Phi }}_{j}(t). \end{aligned}$$

Let us define \({\tilde{\Phi }}_{j}(t):=\sum \limits _{i=1,i\ne j}^{D}m_{ij}(t)\) and \(\lambda _{j}(t):=\sum \limits _{i=1,i\ne j}^{D}\lambda _{ij}\int _{0}^{t}\left\langle \alpha (s-),e_{i}\right\rangle \mathrm{d}s\); then for each \(j=1,2,\ldots ,D\),

$$\begin{aligned} {\tilde{\Phi }}_{j}(t)=\Phi _{j}(t)-\lambda _{j}(t) \end{aligned}$$

is an \((\mathbb {F},\mathbb {P})\)-martingale. Let \({\tilde{\Phi }}(t)=({\tilde{\Phi }}_{1}(t),{\tilde{\Phi }}_{2}(t),\ldots ,{\tilde{\Phi }}_{D}(t))^{T}\) denote the compensated measure of the integer-valued random measure induced by \((\Phi _{1},\ldots ,\Phi _{D})^{T}\) on \(([0,T]\times S,\mathscr {B}([0,T])\otimes \mathscr {B}_{S})\), where \(\mathscr {B}_{S}\) is the \(\sigma \)-field of subsets of S. Let \(\mathscr {P}\) be the predictable \(\sigma \)-field on \(\Omega \times [0,T]\).
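For instance, for a two-state chain (\(D=2\)), the compensators read \(\lambda _{1}(t)=\lambda _{21}\int _{0}^{t}\left\langle \alpha (s-),e_{2}\right\rangle \mathrm{d}s\) and \(\lambda _{2}(t)=\lambda _{12}\int _{0}^{t}\left\langle \alpha (s-),e_{1}\right\rangle \mathrm{d}s\): the intensity of jumps into a given state accumulates only while the chain occupies the other state.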

Let us now introduce the controlled Markov regime-switching jump-diffusion with delay (SDDEJR):

$$\begin{aligned} \mathrm{d}X(t)=&\ b(t,X(t),Y(t),A(t),\alpha (t),u(t))\mathrm{d}t \nonumber \\&+\sigma (t,X(t),Y(t),A(t),\alpha (t),u(t))\mathrm{d}W(t) \nonumber \\&+\int _{\mathbb {R}_{0}}\eta (t,X(t),Y(t),A(t),\alpha (t),u(t),z)\tilde{N}(\mathrm{d}t,\mathrm{d}z) \nonumber \\&+\gamma (t,X(t),Y(t),A(t),\alpha (t),u(t))d{\tilde{\Phi }}(t), \quad t\in [0,T], \\ X(t)=&\ x_{0}(t), \qquad t\in [-\delta ,0] \nonumber , \end{aligned}$$
(1)

where

$$\begin{aligned} Y(t)=X(t-\delta ) \quad \hbox {and} \quad A(t)=\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}X(r)dr, \qquad t\in [0,T]. \end{aligned}$$

Furthermore, let \(x_{0}\) be a continuous and deterministic function, \(\rho \ge 0\) be a constant averaging parameter and \(\delta > 0\) be a constant delay.
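For intuition, the state equation (1) can be discretized by an Euler scheme that propagates the delayed value \(Y(t)=X(t-\delta )\) and the moving average A(t) along a time grid. Below is a minimal sketch (our own toy specification: regime-dependent linear drift, constant relative volatility, a deterministic regime path and no jump terms; all numerical values are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy coefficients (ours, purely illustrative), with a drift level per regime.
mu_reg = np.array([0.05, -0.02])                 # drift levels in regimes e_1, e_2
sigma, delta, rho, T = 0.2, 0.1, 0.5, 1.0
n = 1_000
dt = T / n
d = int(delta / dt)                              # delay measured in grid steps

regime = np.where(np.arange(n) < n // 2, 0, 1)   # one switch at mid-horizon
X = np.ones(n + d + 1)                           # X[k] ~ X((k - d) dt); x_0 = 1 on [-delta, 0]
weights = np.exp(-rho * dt * np.arange(d, -1, -1))   # e^{-rho (t - r)} over the window

for k in range(d, n + d):
    Y = X[k - d]                                 # Y(t) = X(t - delta)
    A = np.sum(weights * X[k - d:k + 1]) * dt    # A(t) = int_{t-delta}^t e^{-rho(t-r)} X(r) dr
    b = mu_reg[regime[k - d]] * X[k] + 0.1 * Y - 0.05 * A   # drift b(t, x, y, a, e_i)
    X[k + 1] = X[k] + b * dt + sigma * X[k] * np.sqrt(dt) * rng.standard_normal()

print("X(T) =", X[-1])
```

The only delay-specific ingredients are the shifted index for Y and the discounted window sum for A; jump and regime-martingale terms would enter the Euler step in the same additive way.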

Let us introduce:

$$\begin{aligned}&b:[0,T]\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times S\times \mathscr {U}\rightarrow \mathbb {R},\\&\sigma :[0,T]\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times S\times \mathscr {U}\rightarrow \mathbb {R},\\&\eta :[0,T]\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times S\times \mathscr {U}\times \mathbb {R}_{0}\rightarrow \mathbb {R},\\&\gamma :[0,T]\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times S\times \mathscr {U}\rightarrow \mathbb {R}^{D}, \end{aligned}$$

where for all \(x,y,a\in \mathbb {R}\), \(e_{i}\in S\), \(u\in \mathscr {U}\), \(z\in \mathbb {R}_{0}\) and \(t\in [0,T]\), \(b(t,x,y,a,e_{i},u), \sigma (t,x,y,a,e_{i},u), \ \eta (t,x,y,a,e_{i},u,z)\) and \(\gamma (t,x,y,a,e_{i},u)\) are given \(\mathscr {F}_{t}\)-measurable \(\mathscr {C}^{1}\)-functions with respect to \(x,y,a,u\) such that for all \(x_{i}=x,y,a,u\),

$$\begin{aligned}&E\left[ \int _{0}^{T}\left\{ \left| \frac{\partial {b}}{\partial {x_{i}}}(t,X(t),Y(t),A(t), \alpha (t),u(t))\right| ^{2}+ \left| \frac{\partial {\sigma }}{\partial {x_{i}}}(t,X(t),Y(t),A(t),\alpha (t),u(t))\right| ^{2}\right. \right. \\&\qquad +\int _{\mathbb {R}_{0}}\left| \frac{\partial {\eta }}{\partial {x_{i}}}(t,X(t),Y(t),A(t),\alpha (t),u(t),z)\right| ^{2}\nu (\mathrm{d}z)\\&\qquad \left. \left. +\sum _{j=1}^{D}\left| \frac{\partial {\gamma ^{j}}}{\partial {x_{i}}}(t,X(t),Y(t),A(t),\alpha (t),u(t))\right| ^{2}\lambda _{j}(t)\right\} \mathrm{d}t\right] <\infty . \end{aligned}$$

Let \(\mathscr {U}\) be a non-empty, closed and convex subset of \(\mathbb {R}\). An admissible control is a \(\mathscr {U}\)-valued, \(\mathscr {F}_{t}\)-adapted and càdlàg process u(t), \(t\in [0,T]\), such that (1) has a unique solution and

$$\begin{aligned} E\left[ \int _{0}^{T}\left| u(t)\right| ^{2}\mathrm{d}t\right] <\infty . \end{aligned}$$

We denote by \(\mathscr {A}\) the set of all admissible controls.

Let us define the performance criterion (objective functional) as follows:

$$\begin{aligned} J(u)=E\left[ \int _{0}^{T}f(t,X(t),Y(t),A(t),\alpha (t),u(t))\mathrm{d}t+g(X(T),\alpha (T))\right] \end{aligned}$$

for all \(u\in \mathscr {A}\), where \(f:[0,T]\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times S\times \mathscr {U}\rightarrow \mathbb {R}\) and \(g:\mathbb {R}\times S\rightarrow \mathbb {R}\) are \(\mathscr {C}^{1}\)-functions with respect to \(x,y,a,u\) such that for all \(x_{i}=x,y,a,u\),

$$\begin{aligned}&E\left[ \int _{0}^{T}(\left| f(t,X(t),Y(t),A(t),\alpha (t),u(t))\right| +\left| \frac{\partial {f}}{\partial {x_{i}}}(t,X(t),Y(t),A(t),\alpha (t),u(t))\right| ^{2})\mathrm{d}t\right. \\&\qquad +\left. \left| g(X(T),\alpha (T))\right| +\left| g_{x}(X(T),\alpha (T))\right| ^{2}\right] <\infty . \end{aligned}$$

Our problem is to find an optimal control \(\hat{u}\in \mathscr {A}\) such that

$$\begin{aligned} J(\hat{u})=\sup \limits _{u\in \mathscr {A}}J(u). \end{aligned}$$
(2)

Now let us define the Hamiltonian as follows:

\(H:[0,T]\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times S\times \mathscr {U}\times \mathbb {R}\times \mathbb {R}\times \mathscr {R}\times \mathbb {R}^{D}\rightarrow \mathbb {R}\),

$$\begin{aligned} H(t,x,y,a,e_{i},u,p,q,r,w)=&\ f(t,x,y,a,e_{i},u)+b(t,x,y,a,e_{i},u)p\nonumber \\&+\sigma (t,x,y,a,e_{i},u)q \nonumber \\&+\int _{\mathbb {R}_{0}}\eta (t,x,y,a,e_{i},u,z)r(t,z)\nu {(\mathrm{d}z)} \nonumber \\&+\sum \limits _{j=1}^{D}\gamma ^{j}(t,x,y,a,e_{i},u)w^{j}(t)\lambda _{ij}, \end{aligned}$$
(3)

where \(\mathscr {R}\) denotes the set of all functions \(r:[0,T]\times \mathbb {R}_{0}\rightarrow \mathbb {R}\), for which the integral in (3) converges.

Associated with H, the unknown adapted adjoint processes \(\left( p(t)\in \mathbb {R}:t\in [0,T]\right) \), \(\left( q(t)\in \mathbb {R}:t\in [0,T]\right) \), \(\left( r(t,z)\in \mathscr {R}:t\in [0,T],z\in \mathbb {R}_{0}\right) \) and \(\left( w(t)\in \mathbb {R}^{D}:t\in [0,T]\right) \) are given by the following ABSDE with jumps and regimes:

$$\begin{aligned} dp(t)&=E[\mu (t)|\mathscr {F}_{t}]\mathrm{d}t+q(t)\mathrm{d}W(t)+\int _{\mathbb {R}_{0}}r(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)+w(t)d{\tilde{\Phi }}{(t)}, \\ p(T)&=g_{x}(X(T),\alpha (T)), \nonumber \end{aligned}$$
(4)

where

$$\begin{aligned} \mu (t):=&-\frac{\partial {H}}{\partial {x}}(t,X(t),Y(t),A(t),\alpha (t),u(t),p(t),q(t),r(t,\cdot ),w(t)) \nonumber \\&-\frac{\partial {H}}{\partial {y}}(t+\delta ,X(t+\delta ),Y(t+\delta ),A(t+\delta ),\alpha (t+\delta ),u(t+\delta ),p(t+\delta ),\nonumber \\&q(t+\delta ),r(t+\delta ,\cdot ),w(t+\delta ))\mathbf 1 _{[0,T-\delta ]}(t)-\mathrm{e}^{\rho t}\left( \int _{t}^{t+\delta }\frac{\partial {H}}{\partial {a}}(s,X(s),Y(s),\right. \nonumber \\&\left. A(s),\alpha (s),u(s),p(s),q(s),r(s,\cdot ),w(s))\mathrm{e}^{-\rho s}{} \mathbf 1 _{[0,T]}(s)\mathrm{d}s\right) . \end{aligned}$$
(5)

Note that \(\mu (t)\) in (5) contains future values of \(X(s),\alpha (s),u(s),p(s),q(s),r(s,\cdot )\) and w(s) for \(s\le t+\delta \); hence, the BSDE (4) is anticipated (time-advanced). In the following section, we prove an existence–uniqueness theorem for ABSDEs with jumps and regimes in a general setting; in the rest of the work, we apply it with a constant delay \(\delta >0\).
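As a consistency check (our remark, not needed in the sequel): if \(b,\sigma ,\eta ,\gamma \) and f do not depend on y and a, then \(\frac{\partial {H}}{\partial {y}}=\frac{\partial {H}}{\partial {a}}=0\), the conditional expectation in (4) becomes redundant because \(\mu (t)=-\frac{\partial {H}}{\partial {x}}(t)\) is \(\mathscr {F}_{t}\)-measurable, and (4)–(5) collapse to the classical (non-anticipated) adjoint BSDE of a Markov regime-switching jump-diffusion:

$$\begin{aligned} \mathrm{d}p(t)&=-\frac{\partial {H}}{\partial {x}}(t,X(t),Y(t),A(t),\alpha (t),u(t),p(t),q(t),r(t,\cdot ),w(t))\mathrm{d}t+q(t)\mathrm{d}W(t) \\&\quad +\int _{\mathbb {R}_{0}}r(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)+w(t)d{\tilde{\Phi }}(t), \qquad p(T)=g_{x}(X(T),\alpha (T)). \end{aligned}$$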

Moreover, we assume that the derivatives of \(b, \ \sigma , \ \eta \) and \(\gamma \) with respect to \(x, \ y\) and a are bounded. Under this assumption, it is easy to check that \(\mu \) in (4)–(5) satisfies the Lipschitz condition (A1) with respect to \(p,\ q, \ r, \ w\) and their future values.

Note also that, by the aforementioned integrability conditions on the derivatives of \(b, \ \sigma , \ \eta \), \(\gamma \) and f, condition A2 of Theorem 3.1 is satisfied by \(\mu \) in (4)–(5); see Sect. 3.

Furthermore, note that \(p(T)=g_{x}(X(T),\alpha (T))\) in (4) corresponds to \(\xi (\cdot )\) in Theorem 3.1; hence, it has to satisfy \(E[\left| g_{x}(X(T),\alpha (T))\right| ^{2}]<\infty \).

We will use the following notation and introduce Banach spaces of measurable and integrable variables and processes as follows:

  • \(L^{2}(\mathscr {F}_{T};\mathbb {R})=\{ \mathbb {R}\)-valued, \(\mathscr {F}_{T}\)-measurable random variables such that \(E[\left| \phi \right| ^{2}]<\infty \}\),

  • \(L^{2}(\mathscr {B}_{0};\mathbb {R})=\{ \mathbb {R}\)-valued, \(\mathscr {B}_{0}\)-measurable random variables such that \(\left\| \phi \right\| ^{2}_{J}=\int _{\mathbb {R}_{0}}\left| \phi (z)\right| ^{2}\nu (\mathrm{d}z)<\infty \}\),

  • \(L^{2}(\mathscr {B}_{S};\mathbb {R}^{D})=\{ \mathbb {R}^{D}\)-valued, \(\mathscr {B}_{S}\)-measurable random variables such that \(\left\| \phi \right\| ^{2}_{S}=\sum _{j=1}^{D}\left| \phi ^{j} \right| ^{2}\lambda _{j}(t)<\infty \}\),

  • \(L^{2}(\mathscr {F}_{T}\times \mathscr {B}_{0};\mathbb {R})=\{ \mathbb {R}\)-valued, \(\mathscr {F}_{T}\times \mathscr {B}_{0}\)-measurable random variables such that \(E[\int _{\mathbb {R}_{0}}\left| \phi (z) \right| ^{2}\nu (\mathrm{d}z)]<\infty \}\),

  • \(L^{2}(\mathscr {F}_{T}\times \mathscr {B}_{S};\mathbb {R}^{D})=\{ \mathbb {R}^{D}\)-valued, \(\mathscr {F}_{T}\times \mathscr {B}_{S}\)-measurable random variables such that \(E[\sum _{j=1}^{D}\left| \phi ^{j} \right| ^{2}\lambda _{j}(t)]<\infty \}\),

  • \(L_{\mathbb {F}}^{2}(0,T;\mathbb {R})=\{ \mathbb {R}\)-valued, \(\mathscr {F}_{t}\)-adapted stochastic processes such that \(E[\int _{0}^{T}\left| \phi (t) \right| ^{2}\mathrm{d}t]<\infty \}\),

  • \(S_{\mathbb {F}}^{2}(0,T;\mathbb {R})=\{\)càdlàg processes in \(L_{\mathbb {F}}^{2}(0,T;\mathbb {R})\) such that \(E[\sup _{t\in [0,T]}\left| \phi (t) \right| ^{2}]<\infty \}\),

  • \(\mathscr {H}^{2}_{\mathbb {F}}(0,T;\mathbb {R})=\{ \mathbb {R}\)-valued, \(\mathscr {P}\otimes \mathscr {B}_{0}\)-measurable stochastic processes such that \(\left\| \phi (t) \right\| ^{2}_{\mathscr {H}^{2}} =E[\int _{0}^{T}\left\| \phi (t) \right\| ^{2}_{J}\mathrm{d}t]<\infty \}\),

  • \(\mathscr {M}^{2}_{\mathbb {F}}(0,T;\mathbb {R}^{D})=\{ \mathbb {R}^{D}\)-valued, \(\mathscr {P}\otimes \mathscr {B}_{S}\)-measurable stochastic processes such that \(\left\| \phi (t) \right\| ^{2}_{\mathscr {M}^{2}}=E[\int _{0}^{T}\left\| \phi (t) \right\| ^{2}_{S}\mathrm{d}t]<\infty \}\).

3 Existence and Uniqueness Theorem

We consider a generalized form of the BSDEs as follows:

$$\begin{aligned} -\mathrm{d}Y(t)=&\ f(t,Y(t),Z(t),Q(t),V(t),Y(t+\delta _{1}(t)),Z(t+\delta _{2}(t)), \nonumber \\&Q(t+\delta _{3}(t)),V(t+\delta _{4}(t)),\alpha (t))\mathrm{d}t-Z(t)\mathrm{d}W(t) \nonumber \\&-\int _{\mathbb {R}_{0}}Q(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z) -V(t)d{\tilde{\Phi }}(t), \qquad t\in [0,T],\\ Y(t)=&\ \xi (t), \ Z(t)=\psi (t),\ Q(t)=\zeta (t), \ V(t)=\vartheta (t), \ t\in [T,T+K]. \nonumber \end{aligned}$$
(6)

Let \(\delta _{i}(\cdot )\), \(i=1,2,3,4\), be \(\mathbb {R}^{+}\)-valued continuous functions on [0, T] such that:

(i) There exists a constant \(K\ge 0\) such that for all \(t\in [0,T]\) and \(i=1,2,3,4\),

\(t+\delta _{i}(t)\le T+K\).

(ii) There exists a constant \(L\ge 0\) such that for each \(t\in [0,T]\) and for any non-negative integrable function \(g(\cdot )\),

$$\begin{aligned} \int _{t}^{T}g(s+\delta _{i}(s))\mathrm{d}s\le L\int _{t}^{T+K}g(s)\mathrm{d}s \qquad \hbox {for} \qquad i=1,2,3,4. \end{aligned}$$

Assume that, for all \(t\in [0,T]\) and \(e_{j}\in S\), the generator f maps \([0,T]\times \mathbb {R}\times \mathbb {R} \times L^{2}(\mathscr {B}_{0};\mathbb {R})\times L^{2}(\mathscr {B}_{S};\mathbb {R}^{D})\times L^{2}(\mathscr {F}_{r};\mathbb {R})\times L^{2}(\mathscr {F}_{r^{*}};\mathbb {R})\times L^{2}(\mathscr {F}_{\hat{r}}\times \mathscr {B}_{0};\mathbb {R}) \times L^{2}(\mathscr {F}_{\tilde{r}}\times \mathscr {B}_{S};\mathbb {R}^{D})\times S\) into \(L^{2}(\mathscr {F}_{t};\mathbb {R})\), i.e., \(f(t,y,z,q,v,\xi ,\psi ,\zeta ,\vartheta ,e_{j})\in L^{2}(\mathscr {F}_{t};\mathbb {R})\), where \(r,r^{*},\hat{r},\tilde{r}\in [t,T+K]\).

Furthermore, f satisfies the following conditions:

A1. There exists a constant \(C>0\) such that for all \(t\in [0,T], \ e_{j}\in S, \ y,y',z,z'\)

\(\in \mathbb {R}, \ q,q'\in L^{2}(\mathscr {B}_{0};\mathbb {R}), \ v,v'\in L^{2}(\mathscr {B}_{S};\mathbb {R}^{D}), \ \xi ,\xi ', \psi ,\psi '\in L_{\mathbb {F}}^{2}(t,T+K;\mathbb {R})\),

\(\zeta ,\zeta '\in \mathscr {H}^{2}_{\mathbb {F}}(t,T+K;\mathbb {R}), \vartheta ,\vartheta '\in \mathscr {M}^{2}_{\mathbb {F}}(t,T+K;\mathbb {R}^{D})\) and \(r,r^{*},\hat{r},\tilde{r}\in [t,T+K]\), we have

$$\begin{aligned}&|f(t,y,z,q,v,\xi (r),\psi (r^{*}),\zeta (\hat{r}),\vartheta (\tilde{r}),e_{j})\\&\qquad -f(t,y',z',q',v',\xi '(r),\psi '(r^{*}),\zeta '(\hat{r}),\vartheta '(\tilde{r}),e_{j})| \\&\le C\left( \left| y-y'\right| +\left| z-z'\right| +\left\| q-q'\right\| _{J}+\left\| v-v'\right\| _{S}+E\left[ \left| \xi (r)-\xi '(r)\right| \right. \right. \\&\qquad \left. \left. +\left| \psi (r^{*})-\psi '(r^{*})\right| +\left\| \zeta (\hat{r})-\zeta '(\hat{r})\right\| _{J}+\left\| \vartheta (\tilde{r})-\vartheta '(\tilde{r})\right\| _{S}|\mathscr {F}_{t}\right] \right) . \end{aligned}$$

A2. \(E\left[ \int _{0}^{T}\left| f(t,0,0,0,0,0,0,0,0,e_{j})\right| ^{2}\mathrm{d}t\right] < \infty \),       for all \(e_{j}\in S\).

Let us give the main result of this section.

Theorem 3.1

Suppose f fulfills A1 and A2 and, for \(i=1,2,3,4\), \(\delta _{i}\) satisfies (i) and (ii). Then, for any given terminal variables \(\xi (\cdot )\in S_{\mathbb {F}}^{2}(T,T+K;\mathbb {R})\), \(\psi (\cdot )\in L_{\mathbb {F}}^{2}(T,T+K;\mathbb {R})\), \(\zeta (\cdot )\in \mathscr {H}^{2}_{\mathbb {F}}(T,T+K;\mathbb {R})\) and \(\vartheta (\cdot )\in \mathscr {M}^{2}_{\mathbb {F}}(T,T+K;\mathbb {R}^{D})\), the ABSDE (6) has a unique solution, i.e., there exists a unique 4-tuple of \(\mathscr {F}_{t}\)-adapted processes \((Y,Z,Q,V)\in S_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times L_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times \mathscr {H}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R})\times \mathscr {M}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R}^{D})\) satisfying (6).

Proof

We fix \(\beta =16C^{2}(L+1)(T+1)\), where C is the Lipschitz constant of f given in A1, and introduce a norm on the Banach space \(S_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times L_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times \mathscr {H}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R})\times \mathscr {M}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R}^{D})\) as follows:

$$\begin{aligned} \left\| (Y(t),Z(t),Q(t),V(t))\right\| _{\beta }^{2}= & {} E\left[ \int _{0}^{T+K}\mathrm{e}^{\beta t}\left( \left| Y(t)\right| ^{2}+\left| Z(t)\right| ^{2}\right. \right. \\&\left. \left. \quad +\int _{\mathbb {R}_{0}}\left| Q(t,z)\right| ^{2}\nu (\mathrm{d}z)+\sum _{j=1}^{D}\left| V^{j}(t)\right| ^{2}\lambda _{j}(t)\right) \mathrm{d}t\right] . \end{aligned}$$

Since \(1\le \mathrm{e}^{\beta t}\le \mathrm{e}^{\beta (T+K)}\) on \([0,T+K]\), the \(\beta \)-norm is equivalent to the usual norm, and it is more convenient for applying the Banach fixed point theorem. We pose the problem,

$$\begin{aligned}&-\mathrm{d}Y(t)=f(t,y(t),z(t),q(t),v(t),y(t+\delta _{1}(t)),z(t+\delta _{2}(t)),q(t+\delta _{3}(t)), \\&v(t+\delta _{4}(t)),\alpha (t))\mathrm{d}t-Z(t)\mathrm{d}W(t)-\int _{\mathbb {R}_{0}}Q(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)-V(t)d{\tilde{\Phi }}(t), \quad t\in [0,T],\\&Y(t)=\xi (t), \ Z(t)=\psi (t),\ Q(t)=\zeta (t) \ \hbox {and} \ V(t)=\vartheta (t), \qquad t\in [T,T+K]. \end{aligned}$$

Let us define:

$$\begin{aligned} h: S_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times L_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times \mathscr {H}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R})\times \mathscr {M}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R}^{D}) \\ \rightarrow S_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times L_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times \mathscr {H}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R})\times \mathscr {M}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R}^{D}). \end{aligned}$$

Here, h maps \((y(t),z(t),q(t),v(t))\) to the solution \((Y(t),Z(t),Q(t),V(t))\) of the problem posed above. According to the existence–uniqueness results for BSDEs with jumps and regimes (see Propositions 5.1 and 5.2 by Crépey and Matoussi [20]), that problem has a unique solution for any such input; hence, h is well defined.

Now we will prove that h is a contraction mapping under the norm \(\left\| \cdot \right\| _{\beta }\).

For two arbitrary elements (y(t), z(t), q(t), v(t)) and \((y'(t),z'(t),q'(t),v'(t))\) in \(S_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times L_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times \mathscr {H}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R})\times \mathscr {M}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R}^{D})\), let us set \(h(y(t),z(t),q(t),v(t))=(Y(t),Z(t),Q(t),V(t))\) and \(h(y'(t),z'(t),q'(t),v'(t))=(Y'(t),Z'(t),Q'(t),V'(t))\).

Let us define their differences by \((\hat{y}(t),\hat{z}(t),\hat{q}(t),\hat{v}(t))=(y(t)-y'(t),z(t)-z'(t),q(t)-q'(t),v(t)-v'(t))\) and \((\hat{Y}(t),\hat{Z}(t),\hat{Q}(t),\hat{V}(t))=(Y(t)-Y'(t),Z(t)-Z'(t),Q(t)-Q'(t),V(t)-V'(t)).\)

Let us apply integration by parts for regime-switching jump-diffusions (cf. Lemma 3.2 by Zhang et al. [3]) to \(\mathrm{e}^{\beta t}(\hat{Y}(t))^{2}\) and take expectations; since \(\hat{Y}(T)=0\), this yields:

$$\begin{aligned}&E[\mathrm{e}^{\beta t}(\hat{Y}(t))^{2}]+E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}\left( \left| \hat{Z}(s)\right| ^{2}+\int _{\mathbb {R}_{0}}\left| \hat{Q}(s,z)\right| ^{2}\nu (\mathrm{d}z)+\sum _{j=1}^{D}\left| \hat{V}^{j}(s)\right| ^{2}\lambda _{j}(s)\right) \mathrm{d}s\right] \\&=E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}\left( 2\hat{Y}(s)\left( f(s,y(s),z(s),q(s),v(s),y(s+\delta _{1}(s)),z(s+\delta _{2}(s)),\right. \right. \right. \\&\qquad q(s+\delta _{3}(s)),v(s+\delta _{4}(s)),\alpha (s))-f(s,y'(s),z'(s),q'(s),v'(s),y'(s+\delta _{1}(s)),\\&\qquad \left. \left. \left. z'(s+\delta _{2}(s)),q'(s+\delta _{3}(s)),v'(s+\delta _{4}(s)),\alpha (s))\right) -\beta (\hat{Y}(s))^{2}\right) \mathrm{d}s\right] . \end{aligned}$$

We note that the terms \(2\int _{0}^{t}\mathrm{e}^{\beta s}\hat{Y}(s)\hat{Z}(s)\mathrm{d}W(s)\), \(2\int _{0}^{t}\mathrm{e}^{\beta s}\hat{Y}(s)\hat{Q}(s,z)\tilde{N}(\mathrm{d}s,\mathrm{d}z)\) and \(2\int _{0}^{t}\mathrm{e}^{\beta s}\hat{Y}(s)\hat{V}(s)d{\tilde{\Phi }}(s)\) are uniformly integrable martingales. Let us show this for the last term; with \(a:=\mathrm{e}^{\beta T}\), we have:

$$\begin{aligned}&E\left[ \left( \int _{0}^{T}\sum _{j=1}^{D}\mathrm{e}^{2\beta t}\left| \hat{Y}(t)\right| ^{2}\left| \hat{V}^{j}(t)\right| ^{2}\lambda _{j}(t)\mathrm{d}t\right) ^{\frac{1}{2}}\right] \\&\quad \le aE\left[ \sup \limits _{0\le t\le T}\left| \hat{Y}(t)\right| \left( \int _{0}^{T}\sum _{j=1}^{D}\left| \hat{V}^{j}(t)\right| ^{2}\lambda _{j}(t)\mathrm{d}t\right) ^{\frac{1}{2}}\right] \\&\quad \le \frac{a}{2}E\left[ \sup \limits _{0\le t\le T}\left| \hat{Y}(t)\right| ^{2}\right] +\frac{a}{2}E\left[ \int _{0}^{T}\sum _{j=1}^{D}\left| \hat{V}^{j}(t)\right| ^{2}\lambda _{j}(t)\mathrm{d}t\right] ; \end{aligned}$$

since \(\hat{Y}(t)\in S_{\mathbb {F}}^{2}(0,T;\mathbb {R})\) and \(\hat{V}(t)\in \mathscr {M}^{2}_{\mathbb {F}}(0,T;\mathbb {R}^{D})\), the right-hand side is finite, so by the Burkholder–Davis–Gundy inequality the associated stochastic integral is a uniformly integrable martingale with null expectation. The other two terms can be treated similarly.

Returning to the equality above and using condition A1, condition (ii) and the inequality \(2ab\le a^{2}+b^{2}\), we can continue:

$$\begin{aligned}&\le E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}\left( -\beta \left| \hat{Y}(s)\right| ^{2}+2C \left| \hat{Y}(s)\right| \left( \left| \hat{y}(s)\right| +\left| \hat{z}(s)\right| + \left\| \hat{q}(s)\right\| _{J}+\left\| \hat{v}(s)\right\| _{S}\right. \right. \right. \\&\qquad +E\left[ \left| \hat{y}(s+\delta _{1}(s))\right| + \left| \hat{z}(s+\delta _{2}(s))\right| +\left\| \hat{q}(s+\delta _{3}(s))\right\| _{J}\right. \\&\qquad +\left. \left. \left. \left. \left\| \hat{v}(s+\delta _{4}(s)) \right\| _{S}|\mathscr {F}_{s}\right] \right) \right) \mathrm{d}s\right] \\&\le E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}(-\beta \left| \hat{Y}(s)\right| ^{2})\mathrm{d}s\right] +E \left[ \int _{t}^{T}\mathrm{e}^{\beta s}2C\left| \hat{Y}(s)\right| \left( \left| \hat{y}(s)\right| + \left| \hat{z}(s)\right| +\left\| \hat{q}(s)\right\| _{J} \right. \right. \\&\left. \left. \qquad +\left\| \hat{v}(s)\right\| _{S}\right) \mathrm{d}s\right] + E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}2C\left| \hat{Y}(s)\right| \left( \left| \hat{y} (s+\delta _{1}(s))\right| +\left| \hat{z}(s+\delta _{2}(s))\right| \right. \right. \\&\left. \left. \qquad +\left\| \hat{q}(s+\delta _{3}(s))\right\| _{J}+ \left\| \hat{v}(s+\delta _{4}(s))\right\| _{S}\right) \mathrm{d}s\right] \\&\le E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}(-\beta \left| \hat{Y}(s)\right| ^{2})\mathrm{d}s\right] \\&\qquad +E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}\left( \frac{\beta }{4} \left| \hat{Y}(s)\right| ^{2}+\frac{4C^{2}}{\beta }(\left| \hat{y}(s)\right| + \left| \hat{z}(s)\right| )^{2}\right) \mathrm{d}s\right] \\&\qquad +E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}\left( \frac{\beta }{4} \left| \hat{Y}(s)\right| ^{2}+\frac{4C^{2}}{\beta }(\left\| \hat{q}(s)\right\| _{J}+ \left\| \hat{v}(s)\right\| _{S})^{2}\right) \mathrm{d}s\right] \\&\qquad +E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}\left( \frac{\beta }{4} \left| \hat{Y}(s)\right| ^{2} +\frac{4C^{2}}{\beta }(\left| \hat{y}(s+\delta _{1}(s))\right| +\left| \hat{z}(s+\delta _{2}(s))\right| )^{2}\right) \mathrm{d}s\right] \\&\qquad +E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}\left( \frac{\beta }{4} \left| \hat{Y}(s)\right| ^{2}+\frac{4C^{2}}{\beta } (\left\| \hat{q}(s+\delta _{3}(s))\right\| _{J}+ \left\| \hat{v}(s+\delta _{4}(s))\right\| _{S})^{2}\right) \mathrm{d}s\right] \\&\le E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}\frac{8C^{2}}{\beta } (\left| \hat{y}(s)\right| ^{2}+\left| \hat{z}(s)\right| ^{2}+ \left\| \hat{q}(s)\right\| ^{2}_{J}+\left\| \hat{v}(s)\right\| ^{2}_{S})\mathrm{d}s\right] \\&\qquad +E\left[ \int _{t}^{T}\mathrm{e}^{\beta s}\frac{8C^{2}}{\beta } (\left| \hat{y}(s+\delta _{1}(s))\right| ^{2}+\left| \hat{z} (s+\delta _{2}(s))\right| ^{2}+\left\| \hat{q}(s+\delta _{3}(s))\right\| ^{2}_{J}\right. \\&\qquad \left. +\left\| \hat{v}(s+\delta _{4}(s))\right\| ^{2}_{S})\mathrm{d}s\right] \\&\le \frac{8C^{2}}{\beta }E\left[ \int _{t}^{T+K}\mathrm{e}^{\beta s} (\left| \hat{y}(s)\right| ^{2}+\left| \hat{z}(s)\right| ^{2}+\left\| \hat{q}(s)\right\| ^{2}_{J}+\left\| \hat{v}(s)\right\| ^{2}_{S})\mathrm{d}s\right] \\&\qquad +\frac{8C^{2}L}{\beta }E\left[ \int _{t}^{T+K}\mathrm{e}^{\beta s} (\left| \hat{y}(s)\right| ^{2}+\left| \hat{z}(s)\right| ^{2}+ \left\| \hat{q}(s)\right\| ^{2}_{J}+\left\| \hat{v}(s)\right\| ^{2}_{S})\mathrm{d}s\right] \\&\le \frac{8C^{2}}{\beta }(L+1)E\left[ \int _{0}^{T+K}\mathrm{e}^{\beta s} (\left| \hat{y}(s)\right| ^{2}+\left| \hat{z}(s)\right| ^{2}+ \left\| \hat{q}(s)\right\| ^{2}_{J}+\left\| \hat{v}(s)\right\| ^{2}_{S})\mathrm{d}s\right] . \end{aligned}$$

In particular,

$$\begin{aligned} E\left[ \mathrm{e}^{\beta t}\left| \hat{Y}(t)\right| ^{2}\right]&\le \frac{8C^{2}}{\beta }(L+1) \left\| (\hat{y}(t),\hat{z}(t),\hat{q}(t),\hat{v}(t))\right\| _{\beta }^{2}, \\ E\left[ \int _{0}^{T}\mathrm{e}^{\beta t}\left| \hat{Y}(t)\right| ^{2}\mathrm{d}t\right]&\le \frac{8C^{2}T}{\beta }(L+1)\left\| (\hat{y}(t),\hat{z}(t),\hat{q}(t),\hat{v}(t))\right\| _{\beta }^{2}. \end{aligned}$$

Hence,

$$\begin{aligned}&E\left[ \int _{0}^{T+K}\mathrm{e}^{\beta t}\left( \left| \hat{Y}(t)\right| ^{2}+\left| \hat{Z}(t)\right| ^{2}+\int _{\mathbb {R}_{0}}\left| \hat{Q}(t,z)\right| ^{2}\nu (\mathrm{d}z)+\sum _{j=1}^{D}\left| \hat{V}^{j}(t)\right| ^{2}\lambda _{j}(t)\right) \mathrm{d}t\right] \\&\qquad \le \frac{8C^{2}(L+1)(T+1)}{\beta }\left\| (\hat{y}(t),\hat{z}(t),\hat{q}(t),\hat{v}(t))\right\| _{\beta }^{2}. \end{aligned}$$

Since \(\beta =16C^{2}(L+1)(T+1)\), we obtain,

$$\begin{aligned} \left\| (\hat{Y},\hat{Z},\hat{Q},\hat{V})\right\| _{\beta }\le \frac{1}{\sqrt{2}}\left\| (\hat{y},\hat{z},\hat{q},\hat{v})\right\| _{\beta }. \end{aligned}$$

Hence, h is a contraction mapping on \(S_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times L_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times \mathscr {H}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R})\times \mathscr {M}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R}^{D})\). Then, by the Banach fixed point theorem, (6) has a unique solution \((Y,Z,Q,V)\in S_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times L_{\mathbb {F}}^{2}(0,T+K;\mathbb {R})\times \mathscr {H}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R})\times \mathscr {M}^{2}_{\mathbb {F}}(0,T+K;\mathbb {R}^{D})\). \(\square \)
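The structure of the proof can be illustrated on a deterministic toy analogue of (6): a time-advanced equation \(-Y'(t)=f(Y(t),Y(t+\delta ))\) on [0, T] with Y prescribed on \([T,T+\delta ]\). The sketch below (our own construction; the generator f and the terminal segment \(\xi \) are arbitrary Lipschitz choices) iterates the map h by backward Euler integration and shows the successive Picard iterates converging geometrically, as the contraction argument predicts:

```python
import numpy as np

# Toy anticipated equation: -Y'(t) = f(Y(t), Y(t + delta)) on [0, T], with
# Y(t) = xi(t) prescribed on [T, T + delta] (so K = delta and L = 1).
T, delta = 1.0, 0.25
n = 400                        # grid points on [0, T]
dt = T / n
m = int(delta / dt)            # time advance measured in grid steps
grid = np.linspace(0.0, T + delta, n + m + 1)

f = lambda y, y_adv: -0.5 * y + 0.3 * np.sin(y_adv)   # Lipschitz generator (our choice)
xi = np.cos(grid[n:])                                  # prescribed terminal segment

def h(y):
    """One Picard step: plug the current guess into f, integrate backwards."""
    Y = np.empty_like(y)
    Y[n:] = xi                                         # terminal condition
    for k in range(n - 1, -1, -1):
        Y[k] = Y[k + 1] + f(y[k], y[k + m]) * dt       # backward Euler step
    return Y

y = np.zeros(n + m + 1)
y[n:] = xi
for it in range(8):
    y_new = h(y)
    print(f"iteration {it}: sup-norm change = {np.abs(y_new - y).max():.2e}")
    y = y_new
```

Each sweep contracts the distance between iterates; this is the discrete counterpart of the \(\left\| \cdot \right\| _{\beta }\)-contraction established above.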

Note that if \(\delta _{i}(t)\equiv \delta \in \mathbb {R}^{+}\) for all \(i=1,2,3,4\), then condition (ii) holds automatically, so it can be omitted in the proof and, hence, in Theorem 3.1 itself.
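Indeed, by (i) (taken at \(t=T\)) we have \(\delta \le K\), and for any non-negative integrable function g,

$$\begin{aligned} \int _{t}^{T}g(s+\delta )\mathrm{d}s=\int _{t+\delta }^{T+\delta }g(s)\mathrm{d}s\le \int _{t}^{T+K}g(s)\mathrm{d}s, \end{aligned}$$

so (ii) is satisfied with \(L=1\).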

4 Sufficient Maximum Principle

In this section, we present the sufficient maximum principle and show that, under concavity assumptions, maximizing the Hamiltonian yields the optimal control. We will use the following abbreviations:

$$\begin{aligned} \frac{\partial {\hat{H}}}{\partial {x}}(t)&=\frac{\partial }{\partial {x}}H(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),\hat{u}(t),\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t)), \\ \hat{b}(t)&=b(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),\hat{u}(t)), \\ b(t)&=b(t,X(t),Y(t),A(t),\alpha (t),u(t)), \ \hbox {etc.} \end{aligned}$$

Theorem 4.1

Let \(\hat{u}\in \mathscr {A}\) with corresponding state processes \(\hat{X}(t), \ \hat{Y}(t)\) and \(\hat{A}(t)\) and adjoint processes \(\hat{p}(t), \ \hat{q}(t), \ \hat{r}(t,z) \ \hbox {and} \ \hat{w}(t)\), assumed to satisfy the SDDEJR (1) and the ABSDE with jumps and regimes (4), respectively. Suppose that the following assertions hold:

1.

$$\begin{aligned}&E\left[ \int _{0}^{T}\hat{p}(t)^{2}\left( (\sigma (t)-\hat{\sigma }(t))^{2}+ \int _{\mathbb {R}_{0}}(\eta (t,z)-\hat{\eta }(t,z))^{2}\nu (\mathrm{d}z)\right. \right. \\&\qquad \left. \left. +\sum _{j=1}^{D}(\gamma ^{j}(t)-\hat{\gamma }^{j}(t))^{2}\lambda _{j}(t)\right) \mathrm{d}t\right] <\infty \end{aligned}$$

and

$$\begin{aligned}&E\left[ \int _{0}^{T}(X(t)-\hat{X}(t))^{2}\left\{ \hat{q}^{2}(t)+\int _{\mathbb {R}_{0}}\hat{r}^{2}(t,z)\nu (\mathrm{d}z)+\sum _{j=1}^{D}(\hat{w}^{j})^{2}(t)\lambda _{j}(t)\right\} \mathrm{d}t\right] <\infty . \end{aligned}$$

2. For almost all \(t\in [0,T]\),

$$\begin{aligned}&H(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),\hat{u}(t),\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t)) \\&=\max \limits _{u\in \mathscr {U}}H(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),u,\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t)). \end{aligned}$$

3. \((x,y,a,u)\mapsto H(t,x,y,a,e_{i},u,\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t))\) is a concave function for each \(t\in [0,T]\) almost surely and \(e_{i}\in S\).

4. \(g(x,e_{i})\) is a concave function of x for each \(e_{i}\in S\).

Then, \(\hat{u}(t)\) is an optimal control process and \(\hat{X}(t), \ \hat{Y}(t) \ \hbox {and} \ \hat{A}(t)\) are the corresponding controlled state processes.

Proof

Let \(J(u)-J(\hat{u})=I_{1}+I_{2}\), where

$$\begin{aligned} I_{1}=E\left[ \int _{0}^{T}\left\{ f(t,X(t),Y(t),A(t),\alpha (t),u(t))-f(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),\hat{u}(t))\right\} \mathrm{d}t\right] \end{aligned}$$

and

$$\begin{aligned} I_{2}=E\left[ g(X(T),\alpha (T))-g(\hat{X}(T),\alpha (T))\right] . \end{aligned}$$

By the concavity of H, we have

$$\begin{aligned} I_{1}&=E\left[ \int _{0}^{T}\left\{ \ H(t,X(t),Y(t),A(t),\alpha (t),u(t), \hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t)) \right. \right. \nonumber \\&\qquad -H(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t), \hat{u}(t),\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t))\nonumber \\&\qquad -(b(t)-\hat{b}(t))\hat{p}(t)-(\sigma (t)-\hat{\sigma }(t))\hat{q}(t) \nonumber \\&\qquad -\int _{\mathbb {R}_{0}}(\eta (t,z)-\hat{\eta }(t,z))\hat{r}(t,z)\nu (\mathrm{d}z) \nonumber \\&\qquad \left. \left. -\sum _{j=1}^{D}(\gamma ^{j}(t)-\hat{\gamma }^{j}(t))\hat{w}^{j}(t)\lambda _{j}(t)\right\} \ \mathrm{d}t\right] \nonumber \\&\le E\left[ \int _{0}^{T}\left\{ \frac{\partial {\hat{H}}}{\partial {x}}(t)(X(t)- \hat{X}(t))+\frac{\partial {\hat{H}}}{\partial {y}}(t)(Y(t)-\hat{Y}(t))\right. \right. \nonumber \\&\qquad +\frac{\partial {\hat{H}}}{\partial {a}}(t)(A(t)-\hat{A}(t))+ \frac{\partial {\hat{H}}}{\partial {u}}(t)(u(t)-\hat{u}(t))-(b(t)- \hat{b}(t))\hat{p}(t) \nonumber \\&\qquad -(\sigma (t)-\hat{\sigma }(t))\hat{q}(t)-\int _{\mathbb {R}_{0}}(\eta (t,z)-\hat{\eta }(t,z))\hat{r}(t,z)\nu (\mathrm{d}z) \nonumber \\&\qquad \left. \left. -\sum _{j=1}^{D}(\gamma ^{j}(t)-\hat{\gamma }^{j}(t))\hat{w}^{j}(t)\lambda _{j}(t)\right\} \mathrm{d}t\right] . \end{aligned}$$
(7)

By integration by parts and the concavity of g, we obtain:

$$\begin{aligned} I_{2}&\le E\left[ \frac{\partial {\hat{g}}}{\partial {x}}(T)(X(T)-\hat{X}(T))\right] \nonumber \\&=E\left[ \hat{p}(T)(X(T)-\hat{X}(T))\right] \nonumber \\&=E\left[ \int _{0}^{T}\hat{p}(t)d(X(t)-\hat{X}(t))+ \int _{0}^{T}(X(t)-\hat{X}(t))\mathrm{d}\hat{p}(t) \right. \nonumber \\&\qquad +\int _{0}^{T}\left\{ \ (\sigma (t)-\hat{\sigma }(t))\hat{q}(t)+ \int _{\mathbb {R}_{0}}(\eta (t,z)-\hat{\eta }(t,z))\hat{r}(t,z)\nu (\mathrm{d}z) \right. \nonumber \\&\qquad \left. \left. +\sum _{j=1}^{D}(\gamma ^{j}(t)-\hat{\gamma }^{j}(t))\hat{w}^{j}(t) \lambda _{j}(t)\right\} \mathrm{d}t\right] \nonumber \\&=E\left[ \int _{0}^{T}\hat{p}(t)\left\{ \ (b(t)-\hat{b}(t))\mathrm{d}t+ (\sigma (t)-\hat{\sigma }(t))\mathrm{d}W(t) \right. \right. \nonumber \\&\qquad \left. +\int _{\mathbb {R}_{0}}(\eta (t,z)-\hat{\eta }(t,z)) \tilde{N}(\mathrm{d}t,\mathrm{d}z)+(\gamma (t)-\hat{\gamma }(t))d{\tilde{\Phi }}(t)\right\} \nonumber \\&\qquad +\int _{0}^{T}(X(t)-\hat{X}(t))\left\{ E[\hat{\mu }(t)|\mathscr {F}_{t}]\mathrm{d}t+ \hat{q}(t)\mathrm{d}W(t)+\int _{\mathbb {R}_{0}}\hat{r}(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z) \right. \nonumber \\&\qquad \left. +\hat{w}(t)d{\tilde{\Phi }}(t)\right\} +\int _{0}^{T} \left\{ \ (\sigma (t)-\hat{\sigma }(t))\hat{q}(t)+\int _{\mathbb {R}_{0}}(\eta (t,z)-\hat{\eta }(t,z))\hat{r}(t,z)\nu (\mathrm{d}z) \right. \nonumber \\&\qquad \left. \left. +\sum _{j=1}^{D}(\gamma ^{j}(t)-\hat{\gamma }^{j}(t))\hat{w}^{j}(t)\lambda _{j}(t)\right\} \ \mathrm{d}t\right] . \end{aligned}$$
(8)

Note that \(X(t)=\hat{X}(t)=x_{0}(t)\) for all \(t\in [-\delta ,0]\) and that, by assumption 1, the stochastic integrals with respect to \(\mathrm{d}W\), \(\tilde{N}\) and \(d{\tilde{\Phi }}\) in (8) have zero expectation. Then, adding (7) and (8), we have

$$\begin{aligned} J(u)-J(\hat{u})&\le E\left[ \int _{0}^{T}\left\{ \ \frac{\partial {\hat{H}}}{\partial {x}}(t)(X(t)-\hat{X}(t))+\frac{\partial {\hat{H}}}{\partial {y}}(t)(Y(t)- \hat{Y}(t))+\frac{\partial {\hat{H}}}{\partial {a}}(t) \right. \right. \nonumber \\&\qquad \left. \left. \times (A(t)-\hat{A}(t))+\frac{\partial {\hat{H}}}{\partial {u}}(t)(u(t)- \hat{u}(t))+(X(t)-\hat{X}(t))\hat{\mu }(t)\right\} \ \mathrm{d}t\right] \nonumber \\&=E\left[ \int _{\delta }^{T+\delta }\left\{ \ \frac{\partial {\hat{H}}}{\partial {x}}(t-\delta )+ \frac{\partial {\hat{H}}}{\partial {y}}(t)\mathbf 1 _{[0,T]}(t)+\hat{\mu } (t-\delta )\right\} \right. \nonumber \\&\qquad \times (Y(t)-\hat{Y}(t))\mathrm{d}t +\int _{0}^{T} \frac{\partial {\hat{H}}}{\partial {a}}(t)(A(t)-\hat{A}(t))\mathrm{d}t \nonumber \\&\qquad \left. +\int _{0}^{T}\frac{\partial {\hat{H}}}{\partial {u}}(t)(u(t)-\hat{u}(t))\mathrm{d}t\right] . \end{aligned}$$
(9)

Substituting \(r:=t-\delta \), we get

$$\begin{aligned}&\int _{0}^{T}\frac{\partial {\hat{H}}}{\partial {a}}(s)(A(s)-\hat{A}(s))\mathrm{d}s=\int _{0}^{T}\frac{\partial {\hat{H}}}{\partial {a}}(s)\int _{s-\delta }^{s}\mathrm{e}^{-\rho (s-r)}(X(r)-\hat{X}(r))dr\mathrm{d}s \nonumber \\&\quad =\int _{0}^{T}\left( \int _{r}^{r+\delta }\frac{\partial {\hat{H}}}{\partial {a}}(s)\mathrm{e}^{-\rho s}{} \mathbf 1 _{[0,T]}(s)\mathrm{d}s\right) \mathrm{e}^{\rho r}(X(r)-\hat{X}(r))dr \nonumber \\&\quad =\int _{\delta }^{T+\delta }\left( \int _{t-\delta }^{t}\frac{\partial {\hat{H}}}{\partial {a}}(s)\mathrm{e}^{-\rho s}{} \mathbf 1 _{[0,T]}(s)\mathrm{d}s\right) \mathrm{e}^{\rho (t-\delta )}(X(t-\delta )-\hat{X}(t-\delta ))\mathrm{d}t. \end{aligned}$$
(10)

By combining (9)–(10), we obtain

$$\begin{aligned} J(u)-J({\hat{u}})&\le E\left[ \int _{\delta }^{T+\delta }\left\{ \frac{\partial {\hat{H}}}{\partial {x}}(t-\delta )+\frac{\partial {\hat{H}}}{\partial {y}}(t)\mathbf 1 _{[0,T]}(t)+\hat{\mu }(t-\delta )\right. \right. \\&\qquad \left. +\left( \int _{t-\delta }^{t}\frac{\partial {\hat{H}}}{\partial {a}}(s)\mathrm{e}^{-\rho s} \mathbf 1 _{[0,T]}(s)\mathrm{d}s\right) \mathrm{e}^{\rho (t-\delta )}\right\} \ (Y(t)-\hat{Y}(t))\mathrm{d}t \\&\qquad \left. +\int _{0}^{T}\frac{\partial {\hat{H}}}{\partial {u}}(t)(u(t)-\hat{u}(t))\mathrm{d}t\right] \\&=E\left[ \int _{0}^{T}\frac{\partial {\hat{H}}}{\partial {u}}(t)(u(t)-\hat{u}(t))\mathrm{d}t\right] \le 0. \end{aligned}$$

The equality in the last display holds because the expression in curly brackets vanishes by the definition of \(\hat{\mu }\) in (5). Since \(\hat{u}(t)\) maximizes \(H(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),u,\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t))\) over \(u\in \mathscr {U}\), the last inequality holds (see Proposition 2.1 by Ekeland and Temam [21]). Hence, \(\hat{u}(t)\) is an optimal control for problem (2).

One of the key facts in this proof is that concave differentiable functions are bounded above by their first-order Taylor approximations. The concavity assumptions on H with respect to \(x,y,a,u\) and on g with respect to x for all \(e_{i}\in S\) have been used in this sense. Furthermore, Proposition 2.1 by Ekeland and Temam [21] works under the concavity of H with respect to u.

In the next section, we present the necessary maximum principle, by which one can determine candidate optimal control processes; for verification, however, the concavity conditions are still needed. \(\square \)

5 Necessary Maximum Principle

Let \(\hat{u}\) be an optimal control process and let \(\beta \) be another control process such that \(\hat{u}+\beta =:v'\in \mathscr {A}\). Since \(\mathscr {U}\) is a convex set, for any \(v'\in \mathscr {A}\) the perturbed control process \(u^{s}=\hat{u}+s(v'-\hat{u})\), \(0<s<1\), is also in \(\mathscr {A}\). The directional derivative of the performance criterion \(J(\cdot )\) at \(\hat{u}\) in the direction of \(\beta \) is given by:

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}s}J(\hat{u}+s\beta )|_{s=0} :=\lim \limits _{s \rightarrow 0^{+}}\frac{J(\hat{u}+s\beta )-J(\hat{u})}{s}. \end{aligned}$$

Since \(\hat{u}\) is an optimal control, a necessary condition for optimality is

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}s}J(\hat{u}+s\beta )|_{s=0}\le 0. \end{aligned}$$
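Numerically, this directional derivative can be probed by a one-sided finite difference of a Monte Carlo estimate of J. A minimal sketch (our own toy problem, not the model of this paper): for \(\mathrm{d}X=u(t)\mathrm{d}t+0.1\mathrm{d}W(t)\) on [0, 1] with \(J(u)=E[-\int _{0}^{1}u^{2}(t)\mathrm{d}t+X(1)]\), the optimal control is \(u\equiv 1/2\), and the estimated derivative is (approximately) zero there but positive at a suboptimal control:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy problem (our illustrative choice): dX = u(t) dt + 0.1 dW(t) on [0, 1],
# X(0) = 0, and J(u) = E[-int_0^1 u(t)^2 dt + X(1)], whose optimum is u = 1/2.
n = 200
dt = 1.0 / n

def J_mc(u, dW):
    """Monte Carlo estimate of J(u) on a fixed batch of Brownian increments."""
    X_T = np.sum(u * dt) + 0.1 * dW.sum(axis=1)   # X(1), path by path
    return np.mean(-np.sum(u**2 * dt) + X_T)

dW = rng.standard_normal((20_000, n)) * np.sqrt(dt)   # common random numbers
beta = np.ones(n)                                     # perturbation direction
s = 1e-3

for u0 in (0.5, 0.2):                                 # optimal vs. suboptimal
    u = np.full(n, u0)
    deriv = (J_mc(u + s * beta, dW) - J_mc(u, dW)) / s
    print(f"u = {u0}: directional derivative ~ {deriv:+.3f}")
# ~ 0 at the optimum (up to O(s) bias); ~ +0.6 at u = 0.2,
# so the necessary condition singles out u = 1/2.
```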

Let us assume that the derivative process \(\xi (t)=\frac{\mathrm{d}}{\mathrm{d}s}X^{u+s\beta }(t)|_{s=0}\), \(t\in [0,T]\), exists and satisfies:

$$\begin{aligned} \mathrm{d}\xi (t)&=\left\{ \ \frac{\partial {b}}{\partial {x}}(t)\xi (t)+ \frac{\partial {b}}{\partial {y}}(t)\xi (t-\delta )+\frac{\partial { b}}{\partial {a}}(t) \int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\xi (r)dr\right. \nonumber \\&\qquad \left. +\frac{\partial {b}}{\partial {u}}(t)\beta (t)\right\} \ \mathrm{d}t+ \left\{ \ \frac{\partial {\sigma }}{\partial {x}}(t)\xi (t)+ \frac{\partial {\sigma }}{\partial {y}}(t)\xi (t-\delta )+ \frac{\partial {\sigma }}{\partial {a}}(t)\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\right. \nonumber \\&\qquad \left. \times \xi (r)dr +\frac{\partial {\sigma }}{\partial {u}}(t)\beta (t)\right\} \ \mathrm{d}W(t)+\int _{\mathbb {R}_{0}}\left\{ \ \frac{\partial {\eta }}{\partial {x}}(t,z)\xi (t)+ \frac{\partial {\eta }}{\partial {y}}(t,z)\xi (t-\delta ) \right. \nonumber \\&\qquad \left. +\frac{\partial {\eta }}{\partial {a}}(t,z)\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\xi (r)dr+ \frac{\partial {\eta }}{\partial {u}}(t,z)\beta (t)\right\} \ \tilde{N}(\mathrm{d}t,\mathrm{d}z)+\left\{ \ \frac{\partial {\gamma }}{\partial {x}}(t)\xi (t) \right. \nonumber \\&\qquad \left. +\frac{\partial {\gamma }}{\partial {y}}(t)\xi (t-\delta )+ \frac{\partial {\gamma }}{\partial {a}}(t)\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\xi (r)dr+ \frac{\partial {\gamma }}{\partial {u}}(t)\beta (t)\right\} \ d{\tilde{\Phi }}(t), \end{aligned}$$
(11)

where we know that

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}s}Y^{u+s \beta }(t)|_{s=0}&=\frac{\mathrm{d}}{\mathrm{d}s}X^{u+s \beta }(t-\delta )|_{s=0}=\xi (t-\delta ),\\ \frac{\mathrm{d}}{\mathrm{d}s}A^{u+s \beta }(t)|_{s=0}&=\frac{\mathrm{d}}{\mathrm{d}s}\left( \int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}X^{u+s \beta }(r)dr\right) |_{s=0}\\&=\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\frac{\mathrm{d}}{\mathrm{d}s}X^{u+s \beta }(r)|_{s=0}dr=\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\xi (r)dr, \end{aligned}$$

and we have used the following abbreviations:

$$\begin{aligned} \frac{\partial {b}}{\partial {x}}(t)=\frac{\partial {b}}{\partial {x}}(t,X(t),Y(t),A(t),\alpha (t),u(t)), \ \hbox {etc.} \end{aligned}$$

Note that \(\xi (t)=0\) for all \(t\in [-\delta ,0]\).

Theorem 5.1

Let \(\hat{u}\in \mathscr {A}\) be an optimal control of problem (2) subject to the controlled system (1) and let \((\hat{p}(t),\hat{q}(t),\hat{r}(t,z),\hat{w}(t))\) be the unique solution of (4). Moreover, let us assume that,

$$\begin{aligned}&E\left[ \int _{0}^{T}\hat{p}^{2}(t) \left\{ \ \left( \frac{\partial {\hat{\sigma }}}{\partial {x}}\right) ^{2}(t) \hat{\xi }^{2}(t)+\left( \frac{\partial {\hat{\sigma }}}{\partial {y}}\right) ^{2}(t) \hat{\xi }^{2}(t-\delta )+\left( \frac{\partial {\hat{\sigma }}}{\partial {a}}\right) ^{2}(t)\right. \right. \\&\times \left( \int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\hat{\xi }(r)dr\right) ^{2}+ \left( \frac{\partial {\hat{\sigma }}}{\partial {u}}\right) ^2(t)\beta ^{2}(t)+ \int _{\mathbb {R}_{0}}\left\{ \ \left( \frac{\partial {\hat{\eta }}}{\partial {x}}\right) ^{2}(t,z)\hat{\xi }^{2}(t)\right. \\&+\left( \frac{\partial {\hat{\eta }}}{\partial {y}}\right) ^{2}(t,z) \hat{\xi }^{2}(t-\delta )+\left( \frac{\partial {\hat{\eta }}}{\partial {a}}\right) ^{2} (t,z)\left( \int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\hat{\xi }(r)dr\right) ^{2}\\&\left. +\left( \frac{\partial {\hat{\eta }}}{\partial {u}}\right) ^2(t,z) \beta ^{2}(t) \right\} \ \nu (\mathrm{d}z)+\sum _{j=1}^{D}\left\{ \ \left( \frac{\partial {\hat{\gamma }}^{j}}{\partial {x}}\right) ^{2}(t)\hat{\xi }^{2}(t)+\left( \frac{\partial {\hat{\gamma }}^{j}}{\partial {y}}\right) ^{2}(t)\hat{\xi }^{2}(t-\delta )\right. \\&\left. \left. \left. +\left( \frac{\partial {\hat{\gamma }}^{j}}{\partial {a}}\right) ^{2}(t) \left( \int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\hat{\xi }(r)dr\right) ^{2}+ \left( \frac{\partial {\hat{\gamma }}^{j}}{\partial {u}}\right) ^2(t)\beta ^{2}(t) \right\} \ \lambda _{j}(t) \right\} \ \mathrm{d}t\right] < \infty \end{aligned}$$

and

$$\begin{aligned} E\left[ \int _{0}^{T}(\hat{\xi })^{2}(t)\left\{ \ (\hat{q})^{2}(t)+\int _{\mathbb {R}_{0}}(\hat{r})^{2}(t,z)\nu (\mathrm{d}z)+\sum _{j=1}^{D}(\hat{w}^{j})^{2}(t)\lambda _{j}(t)\right\} \ \mathrm{d}t\right] <\infty . \end{aligned}$$

Then, for any \(v\in \mathscr {U}\), we have

$$\begin{aligned}&\frac{\partial {H}}{\partial {u}}(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),\hat{u}(t),\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t))(v-\hat{u}(t))\le 0 \\&\qquad \mathrm{d}t-a.e., \quad \mathbb {P}-a.s. \end{aligned}$$

Proof

For simplicity of notation, let \(\hat{u}=u\), \(\hat{X}=X\), \(\hat{Y}=Y\), \(\hat{A}=A\), \(\hat{p}=p\), \(\hat{q}=q\), \(\hat{r}=r\) and \(\hat{w}=w\). Then,

$$\begin{aligned} 0&\ge \frac{\mathrm{d}}{\mathrm{d}s}J(u+s\beta )|_{s=0} \nonumber \\&=\frac{\mathrm{d}}{\mathrm{d}s}E\left[ \int _{0}^{T}f(t,X^{u+s \beta }(t),Y^{u+s \beta }(t),A^{u+s \beta }(t),\alpha (t),u(t)+s\beta )\mathrm{d}t \right. \nonumber \\&\qquad \left. +g(X^{u+s \beta }(T),\alpha (T))\right] \biggl |_{s=0} \nonumber \\&=E\left[ \int _{0}^{T}\left\{ \frac{\partial {f}}{\partial {x}}(t)\xi (t)+ \frac{\partial {f}}{\partial {y}}(t)\xi (t-\delta )+ \frac{\partial {f}}{\partial {a}}(t)\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\xi (r)dr\right. \right. \nonumber \\&\qquad \left. \left. +\frac{\partial {f}}{\partial {u}}(t)\beta (t)\right\} \mathrm{d}t+ \frac{\partial {g}}{\partial {x}}(X(T),\alpha (T))\xi (T)\right] \nonumber \\&=E\left[ \int _{0}^{T}\left\{ \frac{\partial {H}}{\partial {x}}(t)- \frac{\partial {b}}{\partial {x}}(t)p(t)-\frac{\partial {\sigma }}{\partial {x}}(t)q(t)- \int _{\mathbb {R}_{0}}\frac{\partial {\eta }}{\partial {x}}(t,z)r(t,z)\nu (\mathrm{d}z) \right. \right. \nonumber \\&\qquad \left. -\sum _{j=1}^{D}\frac{\partial {\gamma }^{j}}{\partial {x}}(t)w^{j}(t) \lambda _{j}(t)\right\} \ \xi (t)\mathrm{d}t+\int _{0}^{T} \left\{ \frac{\partial {H}}{\partial {y}}(t)-\frac{\partial {b}}{\partial {y}}(t)p(t)- \frac{\partial {\sigma }}{\partial {y}}(t)q(t) \right. \nonumber \\&\qquad \left. -\int _{\mathbb {R}_{0}}\frac{\partial {\eta }}{\partial {y}}(t,z)r(t,z)\nu (\mathrm{d}z)- \sum _{j=1}^{D}\frac{\partial {\gamma }^{j}}{\partial {y}}(t)w^{j}(t) \lambda _{j}(t)\right\} \ \xi (t-\delta )\mathrm{d}t \nonumber \\&\qquad +\int _{0}^{T}\left\{ \frac{\partial {H}}{\partial {a}}(t)- \frac{\partial {b}}{\partial {a}}(t)p(t)-\frac{\partial {\sigma }}{\partial {a}}(t)q(t)- \int _{\mathbb {R}_{0}}\frac{\partial {\eta }}{\partial {a}}(t,z)r(t,z)\nu (\mathrm{d}z) \right. \nonumber \\&\qquad \left. -\sum _{j=1}^{D}\frac{\partial {\gamma }^{j}}{\partial {a}}(t)w^{j}(t) \lambda _{j}(t)\right\} \left( \int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)} \xi (r)dr\right) \mathrm{d}t+\int _{0}^{T}\left\{ \frac{\partial {H}}{\partial {u}}(t) \right. \nonumber \\&\qquad -\frac{\partial {b}}{\partial {u}}(t)p(t)- \frac{\partial {\sigma }}{\partial {u}}(t)q(t)-\int _{\mathbb {R}_{0}} \frac{\partial {\eta }}{\partial {u}}(t,z)r(t,z)\nu (\mathrm{d}z)\nonumber \\&\qquad \left. \left. -\sum _{j=1}^{D}\frac{\partial {\gamma }^{j}}{\partial {u}}(t)w^{j}(t) \lambda _{j}(t)\right\} \ \beta (t)\mathrm{d}t+\frac{\partial {g}}{\partial {x}}(X(T), \alpha (T))\xi (T)\right] . \end{aligned}$$
(12)

By (11) and integration by parts, we get

$$\begin{aligned}&E\left[ \frac{\partial {g}}{\partial {x}}(X(T),\alpha (T))\xi (T)\right] = E\left[ p(T)\xi (T)\right] \nonumber \\&\quad =E\left[ \int _{0}^{T}p(t)\mathrm{d}\xi (t)+\int _{0}^{T}\xi (t)dp(t)+\int _{0}^{T}q(t) \left\{ \ \frac{\partial {\sigma }}{\partial {x}}(t)\xi (t)+ \frac{\partial {\sigma }}{\partial {y}}(t)\xi (t-\delta ) \right. \right. \nonumber \\&\qquad \left. +\frac{\partial {\sigma }}{\partial {a}}(t)\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)} \xi (r)dr+\frac{\partial {\sigma }}{\partial {u}}(t)\beta (t)\right\} \ \mathrm{d}t+ \int _{0}^{T}\int _{\mathbb {R}_{0}}r(t,z) \left\{ \frac{\partial {\eta }}{\partial {x}}(t,z)\xi (t) \right. \nonumber \\&\qquad \left. +\frac{\partial {\eta }}{\partial {y}}(t,z)\xi (t-\delta )+ \frac{\partial {\eta }}{\partial {a}}(t,z)\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\xi (r)dr+ \frac{\partial {\eta }}{\partial {u}}(t,z)\beta (t)\right\} \nu (\mathrm{d}z)\mathrm{d}t \nonumber \\&\qquad +\int _{0}^{T}\sum _{j=1}^{D}w^{j}(t)\left\{ \ \frac{\partial {\gamma }^{j}}{\partial {x}}(t)\xi (t)+ \frac{\partial {\gamma }^{j}}{\partial {y}}(t)\xi (t-\delta )+ \frac{\partial {\gamma }^{j}}{\partial {a}}(t)\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)} \xi (r)dr \right. \nonumber \\&\qquad \left. \left. +\frac{\partial {\gamma }^{j}}{\partial {u}}(t)\beta (t)\right\} \ \lambda _{j}(t)\mathrm{d}t\right] . \end{aligned}$$
(13)

By (12)–(13), we obtain

$$\begin{aligned} 0&\ge \frac{\mathrm{d}}{\mathrm{d}s}J(u+s\beta )|_{s=0} \\&=E\left[ \int _{0}^{T}\left\{ \ \frac{\partial {H}}{\partial {x}}(t)\xi (t)+ \frac{\partial {H}}{\partial {y}}(t)\xi (t-\delta )+ \frac{\partial {H}}{\partial {a}}(t)\int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\xi (r)dr\right. \right. \\&\qquad \left. \left. +\frac{\partial {H}}{\partial {u}}(t)\beta (t)+\xi (t)E[\mu (t)| \mathscr {F}_{t}]\right\} \ \mathrm{d}t\right] \\&=E\left[ \int _{0}^{T}\xi (t)\left\{ \ \frac{\partial {H}}{\partial {x}}(t) -\frac{\partial {H}}{\partial {x}}(t)-\frac{\partial {H}}{\partial {y}} (t+\delta )\mathbf 1 _{[0,T-\delta ]}(t)\right. \right. \\&\qquad \left. -\mathrm{e}^{\rho t}\left( \int _{t}^{t+\delta } \frac{\partial {H}}{\partial {a}}(s)\mathrm{e}^{-\rho s} \mathbf 1 _{[0,T]}(s)\mathrm{d}s\right) \right\} \ \mathrm{d}t+\int _{0}^{T} \frac{\partial {H}}{\partial {y}}(t)\xi (t-\delta )\mathrm{d}t \\&\qquad \left. +\int _{0}^{T}\left( \int _{s-\delta }^{s}\mathrm{e}^{-\rho (s-t)} \xi (t)\mathrm{d}t\right) \frac{\partial {H}}{\partial {a}}(s)\mathrm{d}s+\int _{0}^{T} \frac{\partial {H}}{\partial {u}}(t)\beta (t)\mathrm{d}t\right] \\&=E\left[ \int _{0}^{T}\xi (t)\left\{ \ -\frac{\partial {H}}{\partial {y}}(t+\delta ) \mathbf 1 _{[0,T-\delta ]}(t)\right. \right. \\&\qquad \left. -\mathrm{e}^{\rho t}\left( \int _{t}^{t+\delta } \frac{\partial {H}}{\partial {a}}(s)\mathrm{e}^{-\rho s} \mathbf 1 _{[0,T]}(s)\mathrm{d}s\right) \right\} \ \mathrm{d}t+ \int _{0}^{T}\frac{\partial {H}}{\partial {y}}(t)\xi (t-\delta )\mathrm{d}t\\&\qquad \left. +\int _{0}^{T}\mathrm{e}^{\rho t}\left( \int _{t}^{t+\delta } \frac{\partial {H}}{\partial {a}}(s)\mathrm{e}^{-\rho s}{} \mathbf 1 _{[0,T]}(s)\mathrm{d}s\right) \xi (t)\mathrm{d}t+\int _{0}^{T}\frac{\partial {H}}{\partial {u}}(t)\beta (t)\mathrm{d}t\right] \\&=E\left[ \int _{0}^{T}\frac{\partial {H}}{\partial {u}}(t)\beta (t)\mathrm{d}t\right] . \end{aligned}$$

Let \(\beta (t)=v'(t)-u(t)\). Since u(t) is optimal, we have

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}s}J(u+s(v'-u))|_{s=0}=E\left[ \int _{0}^{T}\frac{\partial {H}}{\partial {u}}(t)(v'(t)-u(t))\mathrm{d}t\right] \le 0. \end{aligned}$$

Let us define,

$$\begin{aligned} v'(t):= \left\{ \begin{array}{ll} v, &{}\quad \text {on} \ B\times (t_{0},t_{0}+h), \\ u(t), &{}\quad \text {otherwise}, \end{array}\right. \end{aligned}$$

for any deterministic element \(v\in \mathscr {U}\), any \(t_{0}\in [0,T]\) and \(h>0\) with \(t_{0}+h\le T\), and any element B of \(\mathscr {F}_{t_{0}}\). Then,

$$\begin{aligned} E\left[ \int _{0}^{T}\frac{\partial {H}}{\partial {u}}(t)(v'(t)-u(t))\mathrm{d}t\right] =E\left[ \int _{t_{0}}^{t_{0}+h}\frac{\partial {H}}{\partial {u}}(t)(v-u(t))\mathbf 1 _{B}\mathrm{d}t\right] . \end{aligned}$$

Dividing by h and letting \(h\rightarrow 0\), we obtain

$$\begin{aligned}&\lim _{h \rightarrow 0}\frac{1}{h}E\left[ \int _{t_{0}}^{t_{0}+h}\frac{\partial {H}}{\partial {u}}(t)(v-u(t))\mathbf 1 _{B}\mathrm{d}t\right] \\&\qquad =E\left[ \frac{\partial {H}}{\partial {u}}(t_{0})(v-u(t_{0}))\mathbf 1 _{B}\right] \le 0 \qquad a.e. \end{aligned}$$

for all \(B\in \mathscr {F}_{t_{0}}\); this implies that

$$\begin{aligned} E\left[ \frac{\partial {H}}{\partial {u}}(t_{0})(v-u(t_{0}))|\mathscr {F}_{t_{0}}\right] \le 0. \end{aligned}$$

Since the quantity inside the conditional expectation is \(\mathscr {F}_{t_{0}}\)-measurable, the inequality in Theorem 5.1 holds \(\mathrm{d}t\)-a.e., \(\mathbb {P}\)-a.s., for all \(v\in \mathscr {U}\). \(\square \)

6 Sufficient Maximum Principle Under Partial Information

In this section, we establish a maximum principle of sufficient type under partial information. Under the assumptions of Sect. 2, this theorem extends the result of Øksendal et al. [17] to a Markov regime-switching model.

Let us introduce \(\mathscr {E}_{t}\subseteq \mathscr {F}_{t}\), \(t\in [0,T]\), the subfiltration of \(\left\{ \mathscr {F}_{t}\right\} _{t\in [0,T]}\), which represents the information available to the controller, who decides on the value of u(t) at time t. For example, we may consider \(\mathscr {E}_{t}=\mathscr {F}_{(t-d)^{+}}\) for some given \(d>0\).

Let \(\mathscr {A}_{\mathscr {E}}\) be a given family of admissible control processes u(t), \(t\in [0,T]\), included in the set of càdlàg, \(\mathscr {E}\)-adapted and \(\mathscr {U}\)-valued processes such that (1) has a unique solution.

Theorem 6.1

Let \(\hat{u}\in \mathscr {A}_{\mathscr {E}}\) with corresponding state processes \(\hat{X}(t), \ \hat{Y}(t)\) and \(\hat{A}(t)\) and adjoint processes \(\hat{p}(t), \ \hat{q}(t), \ \hat{r}(t,z)\) and \(\hat{w}(t)\), assumed to satisfy the SDDEJR (1) and the ABSDE with jumps and regimes (4), respectively. Suppose that the following conditions hold:

1.

$$\begin{aligned}&E\left[ \int _{0}^{T}\hat{p}(t)^{2}\left( (\sigma (t)-\hat{\sigma }(t))^{2}+ \int _{\mathbb {R}_{0}}(\eta (t,z)-\hat{\eta }(t,z))^{2}\nu (\mathrm{d}z)\right. \right. \\&\qquad \left. \left. +\sum _{j=1}^{D}(\gamma ^{j}(t)-\hat{\gamma }^{j}(t))^{2}\lambda _{j}(t)\right) \mathrm{d}t\right] <\infty \end{aligned}$$

and

$$\begin{aligned}&E\left[ \int _{0}^{T}(X(t)-\hat{X}(t))^{2}\left\{ \hat{q}^{2}(t)+ \int _{\mathbb {R}_{0}}\hat{r}^{2}(t,z)\nu (\mathrm{d}z)+\sum _{j=1}^{D}(\hat{w}^{j})^{2}(t)\lambda _{j}(t)\right\} \mathrm{d}t\right] <\infty . \end{aligned}$$

2. For almost all \(t\in [0,T]\),

$$\begin{aligned}&E\left[ H(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),\hat{u}(t),\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t))|\mathscr {E}_{t}\right] \\&\ =\max \limits _{u\in \mathscr {U}}E\left[ H(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),u,\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t))|\mathscr {E}_{t}\right] . \end{aligned}$$

3. \((x,y,a,u)\mapsto H(t,x,y,a,e_{i},u,\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t))\) is a concave function for each \(t\in [0,T]\) almost surely and \(e_{i}\in S\).

4. \(g(x,e_{i})\) is a concave function of x for each \(e_{i}\in S\).

Then, \(\hat{u}(t)\) is an optimal control process and \(\hat{X}(t), \hat{Y}(t), \ \hat{A}(t)\) are the corresponding controlled state processes for problem (2).

Proof

Proceeding as in the proof of Theorem 4.1, we obtain (9)–(10). For the sake of completeness, we give the rest of the proof. Then,

$$\begin{aligned} J(u)-J(\hat{u})&\le E\left[ \int _{0}^{T}\frac{\partial {\hat{H}}}{\partial {u}}(t)(u(t)-\hat{u}(t))\mathrm{d}t\right] \\&=E\left[ \int _{0}^{T}E\left[ \frac{\partial {\hat{H}}}{\partial {u}}(t)(u(t)-\hat{u}(t))|\mathscr {E}_{t}\right] \mathrm{d}t\right] \\&=E\left[ \int _{0}^{T}E\left[ \frac{\partial {\hat{H}}}{\partial {u}}(t)|\mathscr {E}_{t}\right] (u(t)-\hat{u}(t))\mathrm{d}t\right] \\&\le 0. \end{aligned}$$

Here, the second equality follows from the tower property, and the third uses that u(t) and \(\hat{u}(t)\) are \(\mathscr {E}_{t}\)-measurable. Since \(\hat{u}(t)\) maximizes \(E\left[ H(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t), u,\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t))|\mathscr {E}_{t}\right] \), the last inequality holds. Hence, \(\hat{u}(t)\) is an optimal control. \(\square \)

7 Necessary Maximum Principle Under Partial Information

In this section, we give the necessary maximum principle under partial information, which extends the result of Øksendal et al. [17] to a Markov regime-switching model. We state the technical assumptions as follows:

B1. For all \(u\in \mathscr {A}_{\mathscr {E}}\) and all bounded \(\beta \in \mathscr {A}_{\mathscr {E}}\), there exists \(\epsilon >0\) such that \(u+s\beta \in \mathscr {A}_{\mathscr {E}}\) for all \(s\in ]-\epsilon ,\epsilon [\).

B2. For all \(t_{0}\in [0,T]\) and all bounded \(\mathscr {E}_{t_{0}}\)-measurable random variables v, the control process \(\beta (t)\), defined by

$$\begin{aligned} \beta (t)=v\mathbf 1 _{[t_{0},T]}(t), \qquad t\in [0,T], \end{aligned}$$
(14)

belongs to \(\mathscr {A}_{\mathscr {E}}\).

B3. For all bounded \(\beta \in \mathscr {A}_{\mathscr {E}}\), the derivative process

$$\begin{aligned} \xi (t):=\frac{\mathrm{d}}{\mathrm{d}s}X^{u+s\beta }(t)|_{s=0} \end{aligned}$$

exists as described by (11).

Theorem 7.1

Let \(\hat{u}\in \mathscr {A}_{\mathscr {E}}\) with corresponding solutions \(\hat{X}(t), \ \hat{Y}(t)\ \hbox {and} \ \hat{A}(t)\) of (1) and \(\hat{p}(t), \ \hat{q}(t), \ \hat{r}(t,z) \ \hbox {and} \ \hat{w}(t)\) of (4) and corresponding derivative process \(\hat{\xi }(t)\) given by (11). Moreover, we assume that

$$\begin{aligned}&E\left[ \int _{0}^{T}\hat{p}^{2}(t)\left\{ \ \left( \frac{\partial {\hat{\sigma }}}{\partial {x}}\right) ^{2}(t) \hat{\xi }^{2}(t)+\left( \frac{\partial {\hat{\sigma }}}{\partial {y}}\right) ^{2}(t) \hat{\xi }^{2}(t-\delta )+\left( \frac{\partial {\hat{\sigma }}}{\partial {a}}\right) ^{2}(t)\right. \right. \\&\qquad \times \left( \int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)} \hat{\xi }(r)dr\right) ^{2}+\left( \frac{\partial {\hat{\sigma }}}{\partial {u}}\right) ^2(t) \beta ^{2}(t)+\int _{\mathbb {R}_{0}}\left\{ \ \left( \frac{\partial {\hat{\eta }}}{\partial {x}}\right) ^{2}(t,z)\hat{\xi }^{2}(t)\right. \\&\qquad +\left( \frac{\partial {\hat{\eta }}}{\partial {y}}\right) ^{2} (t,z)\hat{\xi }^{2}(t-\delta )+\left( \frac{\partial {\hat{\eta }}}{\partial {a}}\right) ^{2}(t,z)\left( \int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)} \hat{\xi }(r)dr\right) ^{2}\\&\qquad \left. +\left( \frac{\partial {\hat{\eta }}}{\partial {u}}\right) ^{2} (t,z)\beta ^{2}(t) \right\} \ \nu (\mathrm{d}z)+\sum _{j=1}^{D} \left\{ \ \left( \frac{\partial {\hat{\gamma }}^{j}}{\partial {x}}\right) ^{2}(t) \hat{\xi }^{2}(t)\right. \\&\qquad +\left( \frac{\partial {\hat{\gamma }}^{j}}{\partial {y}}\right) ^{2}(t) \hat{\xi }^{2}(t-\delta )+\left( \frac{\partial {\hat{\gamma }}^{j}}{\partial {a}}\right) ^{2}(t) \left( \int _{t-\delta }^{t}\mathrm{e}^{-\rho (t-r)}\hat{\xi }(r)dr\right) ^{2}\\&\qquad \left. \left. \left. +\left( \frac{\partial {\hat{\gamma }}^{j}}{\partial {u}}\right) ^2(t)\beta ^{2}(t) \right\} \ \lambda _{j}(t) \right\} \ \mathrm{d}t\right] < \infty \end{aligned}$$

and

$$\begin{aligned} E\left[ \int _{0}^{T}(\hat{\xi })^{2}(t)\left\{ \ (\hat{q})^{2}(t)+\int _{\mathbb {R}_{0}}(\hat{r})^{2}(t,z)\nu (\mathrm{d}z)+\sum _{j=1}^{D}(\hat{w}^{j})^{2}(t)\lambda _{j}(t)\right\} \ \mathrm{d}t\right] <\infty . \end{aligned}$$

Then, the following assertions are equivalent:

(iii) For all bounded \(\beta \in \mathscr {A}_{\mathscr {E}}\),

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}s}J(\hat{u}+s\beta )\Big |_{s=0}=0. \end{aligned}$$

(iv) For all \(t\in [0,T]\),

$$\begin{aligned} E\left[ \frac{\partial {H}}{\partial {u}}(t,\hat{X}(t),\hat{Y}(t),\hat{A}(t),\alpha (t),u,\hat{p}(t),\hat{q}(t),\hat{r}(t,\cdot ),\hat{w}(t))|\mathscr {E}_{t} \right] _{u=\hat{u}(t)}=0 \ a.s. \end{aligned}$$

Proof

By the methods of Theorem 5.1, we obtain (12)–(13). For the sake of completeness, we give the remainder of the proof. In fact,

$$\begin{aligned} 0=\frac{\mathrm{d}}{\mathrm{d}s}J(u+s\beta )|_{s=0}=E\left[ \int _{0}^{T}\frac{\partial {H}}{\partial {u}}(t)\beta (t)\mathrm{d}t\right] . \end{aligned}$$

Let us consider \(\beta (t)=v(\omega )\mathbf 1 _{[s,T]}(t)\) of the form (14), where \(v(\omega )\) is bounded and \(\mathscr {E}_{t_{0}}\)-measurable and \(s\ge t_{0}\). Hence, we get

$$\begin{aligned} E\left[ \int _{s}^{T}\frac{\partial }{\partial {u}}H(t)v\mathrm{d}t\right] =0. \end{aligned}$$

Differentiating with respect to s, we get

$$\begin{aligned} E\left[ \frac{\partial }{\partial {u}}H(s)v\right] =0 \end{aligned}$$

for all \(s\ge t_{0}\) and all such v. In particular, taking \(s=t_{0}\) and using that v is an arbitrary bounded \(\mathscr {E}_{t_{0}}\)-measurable random variable, we obtain

$$\begin{aligned} E\left[ \frac{\partial }{\partial {u}}H(t_{0})|\mathscr {E}_{t_{0}}\right] =0. \end{aligned}$$

This shows that (iii) implies (iv).

Since every bounded \(\beta \in \mathscr {A}_{\mathscr {E}}\) can be approximated by linear combinations of controls of the form (14), i.e., by so-called simple processes having the form of step functions, we can reverse the above steps and show that (iv) implies (iii). \(\square \)

8 An Application to Finance

Let \(b(t,\alpha (t)),\ \sigma (t,\alpha (t)), \ \eta (t,\alpha (t),z)\) and \(\gamma (t,\alpha (t))\) be given bounded and adapted processes. Let us consider a cash flow \(X^{0}(t)\) with the dynamics,

$$\begin{aligned} \mathrm{d}X^{0}(t)=&\ X^{0}(t-\delta )\left[ b(t,\alpha (t))\mathrm{d}t+\sigma (t,\alpha (t))\mathrm{d}W(t) \right. \nonumber \\&\left. +\int _{\mathbb {R}_{0}}\eta (t,\alpha (t),z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)+ \gamma (t,\alpha (t))d{\tilde{\Phi }}(t)\right] , \qquad t\in [0,T],\\ X^{0}(t)=&\ x_{0}(t), \qquad t\in [-\delta ,0], \nonumber \end{aligned}$$

where \(x_{0}(t)\) is a given continuous, non-negative and deterministic function.

A consumption rate \(c(t)\ge 0\) is a càdlàg, \(\mathscr {F}_{t}\)-adapted process satisfying

$$\begin{aligned} E\left[ \int _{0}^{T}\left| c(t)\right| ^{2}\mathrm{d}t\right] <\infty . \end{aligned}$$

Hence, the dynamics of the net cash flow \(X(t)=X^{c}(t)\) are given by

$$\begin{aligned} \mathrm{d}X(t)=&\ (X(t-\delta )b(t,\alpha (t))-c(t))\mathrm{d}t+X(t-\delta )\sigma (t,\alpha (t))\mathrm{d}W(t) \nonumber \\&+X(t-\delta )\int _{\mathbb {R}_{0}}\eta (t,\alpha (t),z)\tilde{N}(\mathrm{d}t,\mathrm{d}z) \nonumber \\&+X(t-\delta )\gamma (t,\alpha (t))d{\tilde{\Phi }}(t), \qquad t\in [0,T], \\ X(t)=&\ x_{0}(t), \qquad t\in [-\delta ,0]. \nonumber \end{aligned}$$
(15)
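To see concretely how the delay, jump and regime terms in (15) interact, the following Euler–Maruyama sketch simulates one path under heavy simplifying assumptions that are not taken from the paper: two regimes with constant coefficients per regime, unit-size Poisson jumps of a fixed intensity, a constant consumption rate, a constant prehistory \(x_{0}\equiv 1\), and a scalar compensated switching martingale \(\tilde{\Phi }\) that counts all regime changes.

```python
# A minimal Euler-Maruyama sketch of the net cash flow (15).
# All parameter values and simplifications below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
T, delta, dt = 1.0, 0.1, 1e-3
n, n_d = int(T / dt), int(delta / dt)

b_, sig, eta, gam = [0.05, 0.02], [0.2, 0.3], [0.1, 0.05], [0.01, -0.01]
q = [1.0, 2.0]          # hypothetical switching intensities out of regimes 0, 1
lam_N = 5.0             # hypothetical Poisson intensity; all jump sizes z = 1
c_rate = 0.1            # hypothetical constant consumption rate

# Xfull[i] approximates X((i - n_d) * dt); indices 0..n_d hold the prehistory.
Xfull = np.empty(n + n_d + 1)
Xfull[:n_d + 1] = 1.0   # x_0(t) = 1 on [-delta, 0] (hypothetical choice)
alpha = 0
for k in range(n):
    i = k + n_d
    Xd = Xfull[i - n_d]                        # X(t - delta), the delay term
    dW = rng.normal(0.0, np.sqrt(dt))          # Brownian increment
    dN = rng.poisson(lam_N * dt)               # number of unit jumps in [t, t+dt)
    switch = rng.random() < q[alpha] * dt      # does the chain change regime?
    dPhi = switch - q[alpha] * dt              # compensated switching increment
    Xfull[i + 1] = (Xfull[i]
                    + (Xd * b_[alpha] - c_rate) * dt
                    + Xd * sig[alpha] * dW
                    + Xd * eta[alpha] * (dN - lam_N * dt)  # compensated jumps
                    + Xd * gam[alpha] * dPhi)
    if switch:
        alpha = 1 - alpha

print("X(T) =", Xfull[-1])
```

Note that the delay enters only through the lookup `Xfull[i - n_d]`, which is why the prehistory \(x_{0}\) must be prescribed on all of \([-\delta ,0]\).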

Let \(U(t,c,e_{i},\omega ):[0,T]\times \mathbb {R}^{+}\times S\times \Omega \rightarrow \mathbb {R}\) be a given stochastic utility function for each \(i=1,2,\ldots ,D\); allowing U to depend on \(\omega \) is a slightly more general way of modeling, and this dependence will be suppressed in the notation. Furthermore, U satisfies the following conditions:

$$\begin{aligned}&\qquad t\mapsto U(t,c,e_{i}) \ \hbox {is} \ \mathscr {F}_{t}\hbox {-adapted for each} \ c\ge 0 \ \hbox {and} \ e_{i}\in S, \\&\qquad c\mapsto U(t,c,e_{i}) \ \hbox {is} \ \mathscr {C}^{1} \ \hbox {and} \ \frac{\partial {U}}{\partial {c}}(t,c,e_{i})>0 \ \hbox {for each} \ e_{i}\in S, \\&\qquad c\mapsto \frac{\partial {U}}{\partial {c}}(t,c,e_{i}) \ \hbox {is strictly decreasing for each} \ e_{i}\in S, \\&\qquad \lim \limits _{c\rightarrow \infty }\frac{\partial {U}}{\partial {c}}(t,c,e_{i})=0 \ \hbox {for all} \ (t,e_{i})\in [0,T]\times S. \end{aligned}$$

Let \(v_{0}(t,e_{i}):=\frac{\partial {U}}{\partial {c}}(t,0,e_{i})\), and define

$$\begin{aligned} I(t,v,e_{i}):= \left\{ \begin{array}{ll} 0, &{}\quad \text {if} \ v\ge v_{0}(t,e_{i}), \\ (\frac{\partial {U}}{\partial {c}}(t,\cdot ,e_{i}))^{-1}(v), &{}\quad \text {if} \ 0\le v< v_{0}(t,e_{i}). \end{array}\right. \end{aligned}$$
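As a worked instance of this inversion (with a hypothetical power-type utility, not the one used in the sequel), take \(U(t,c,e_{i})=\phi (t,e_{i})\frac{(1+c)^{1-R}-1}{1-R}\) with a hypothetical positive, adapted coefficient \(\phi (t,e_{i})\) and risk-aversion parameter \(R>0\), \(R\ne 1\); the logarithmic utility studied below corresponds to the limit \(R\rightarrow 1\). Then

$$\begin{aligned} \frac{\partial {U}}{\partial {c}}(t,c,e_{i})=\phi (t,e_{i})(1+c)^{-R}, \qquad v_{0}(t,e_{i})=\phi (t,e_{i}), \end{aligned}$$

so that, for \(0<v<v_{0}(t,e_{i})\),

$$\begin{aligned} I(t,v,e_{i})=\left( \frac{\phi (t,e_{i})}{v}\right) ^{1/R}-1. \end{aligned}$$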

We want to find the optimal consumption rate \(\hat{c}\) such that

$$\begin{aligned} J(\hat{c})&=\sup \limits _{c\in \mathscr {A}}J(c) \\&=\sup \limits _{c\in \mathscr {A}}E\left[ \int _{0}^{T}U(t,c(t),\alpha (t))\mathrm{d}t+g(X(T),\alpha (T))\right] . \end{aligned}$$

In this case, the Hamiltonian takes the form:

$$\begin{aligned} H(t,x,y,a,e_{i},c,p,q,r(\cdot ),w)=U(t,c,e_{i})+(b(t,e_{i})y-c)p+y\sigma (t,e_{i})q \nonumber \\ +y\int _{\mathbb {R}_{0}}\eta (t,e_{i},z)r(t,z)\nu (\mathrm{d}z)+y\sum _{j=1}^{D}\gamma ^{j}(t,e_{i})w^{j}(t)\lambda _{ij}. \end{aligned}$$
(16)

Here we observe that, since \(\frac{\partial {H}}{\partial {c}}=\frac{\partial {U}}{\partial {c}}(t,c,e_{i})-p\) by (16), maximizing H with respect to c gives the first-order condition

$$\begin{aligned} \frac{\partial {U}}{\partial {c}}(t,\hat{c}(t),\alpha (t))=p(t). \end{aligned}$$

The ABSDE for \(p(t),\ q(t), \ r(t,z) \ \hbox {and} \ w(t)\) is, by (4),

$$\begin{aligned} dp(t)=&\ -E[(b(t+\delta ,\alpha (t+\delta ))p(t+\delta ) \nonumber \\&+\sigma (t+\delta ,\alpha (t+\delta ))q(t+\delta ) \nonumber \\&+\int _{\mathbb {R}_{0}}\eta (t+\delta ,\alpha (t+\delta ),z)r(t+\delta ,z)\nu (\mathrm{d}z) \nonumber \\&+\sum _{j=1}^{D}\gamma ^{j}(t+\delta ,\alpha (t+\delta ))w^{j}(t+\delta )\lambda _{j}(t))\mathbf 1 _{[0,T-\delta ]}(t)|\mathscr {F}_{t}]\mathrm{d}t \nonumber \\&+q(t)\mathrm{d}W(t)+\int _{\mathbb {R}_{0}}r(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)+w(t)d{\tilde{\Phi }}(t), \ t\in [0,T], \\ p(T)=&\ g_{x}(X(T),\alpha (T)).\nonumber \end{aligned}$$
(17)

We solve (17) inductively in the following way:

Step 1. If \(t\in [T-\delta ,T]\), the corresponding adjoint equation takes the form:

$$\begin{aligned} dp(t)&=q(t)\mathrm{d}W(t)+\int _{\mathbb {R}_{0}}r(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)+w(t)d{\tilde{\Phi }}(t), \qquad t\in [T-\delta ,T], \\ p(T)&=g_{x}(X(T),\alpha (T)), \end{aligned}$$

which has the solution

$$\begin{aligned} p(t)=E[g_{x}(X(T),\alpha (T))|\mathscr {F}_{t}], \qquad t\in [T-\delta ,T], \end{aligned}$$

with corresponding \(q(t),\ r(t,z) \ \hbox {and} \ w(t)\) obtained by the martingale representation theorem for regime-switching jump-diffusions; see Crépey and Matoussi [20].

Step 2. If \(t\in [T-2\delta ,T-\delta ]\) and \(T-2\delta >0\), we get:

$$\begin{aligned} dp(t)=&-E[(b(t+\delta ,\alpha (t+\delta ))p(t+\delta )+\sigma (t+\delta ,\alpha (t+\delta ))q(t+\delta ) \\&+\int _{\mathbb {R}_{0}}\eta (t+\delta ,\alpha (t+\delta ),z)r(t+\delta ,z)\nu (\mathrm{d}z) \\&+\sum _{j=1}^{D}\gamma ^{j}(t+\delta ,\alpha (t+\delta ))w^{j}(t+\delta )\lambda _{j}(t))\mathbf 1 _{[0,T-\delta ]}(t)|\mathscr {F}_{t}]\mathrm{d}t \\&+q(t)\mathrm{d}W(t)+\int _{\mathbb {R}_{0}}r(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)+w(t)d{\tilde{\Phi }}(t), \ t\in [T-2\delta ,T-\delta ], \end{aligned}$$

with the terminal value \(p(T-\delta )\) known from Step 1. For \(t\in [T-2\delta ,T-\delta ]\) we have \(t+\delta \in [T-\delta ,T]\), so \(p(t+\delta ), \ q(t+\delta ), \ r(t+\delta ,z) \ \hbox {and} \ w(t+\delta )\) are known from Step 1. Therefore, this BSDE can be solved for \(p(t), \ q(t), \ r(t,z) \ \hbox {and} \ w(t)\) on the interval \([T-2\delta ,T-\delta ]\). Continuing in the same way, by induction we obtain a solution \(p(t)=p_{X(T),\alpha (T)}(t)\) of (17).

If \(0\le p(t)\le v_{0}(t,\alpha (t))\) for all \(t\in [0,T]\), then the optimal consumption rate \(\hat{c}(t)\) is given by

$$\begin{aligned} \hat{c}(t)=\hat{c}_{\hat{X}(T),\alpha (T)}(t)=I(t,p(t),\alpha (t)), \qquad t\in [0,T]. \end{aligned}$$
(18)

Proposition 8.1

Let \(p(t), \ q(t), \ r(t,z)\), and w(t) be the solution of (17) and suppose that \(0\le p(t)\le v_{0}(t,\alpha (t))\) holds for all \(t\in [0,T]\). Then, the corresponding optimal wealth process X(t) and the optimal consumption rate \(\hat{c}(t)\) are given implicitly by (15) and (18), respectively.

To obtain a more explicit solution, let us assume that \(b(t,e_{i})=b(t)\) is deterministic and \(g(x,e_{i})=kx, \ k>0, \ i=1,2,\ldots ,D\). We continue our study with the utility function \(U(t,c,e_{i})=\phi (t,e_{i})\ln (1+c)\) for all \(i=1,2,\ldots ,D\), where \(\phi (t,e_{i})\) is an \(\mathbb {R}^{+}\)-valued, càdlàg, \(\mathscr {F}_{t}\)-adapted process such that

$$\begin{aligned} E\left[ \int _{0}^{T}\left| \phi (t,\alpha (t))\right| ^{2}\mathrm{d}t\right] <\infty . \end{aligned}$$

If we consider (17), since b and the terminal value k are deterministic, we can choose \(q=r=w=0\). Hence, the BSDE becomes a deterministic equation:

$$\begin{aligned} dp(t)&=-b(t+\delta )p(t+\delta )\mathbf 1 _{[0,T-\delta ]}(t)\mathrm{d}t, \qquad t\le T, \\ p(t)&=k, \qquad t\in [T-\delta ,T]. \end{aligned}$$

To solve this, we introduce

$$\begin{aligned} h(t)=p(T-t), \qquad t\in [0,T]. \end{aligned}$$

Then, we obtain a differential delay equation:

$$\begin{aligned} dh(t)&=-dp(T-t)=b(T-t+\delta )p(T-t+\delta )\mathrm{d}t=b(T-t+\delta )h(t-\delta )\mathrm{d}t, \\&\qquad \qquad t\in [\delta ,T], \\ h(t)&=p(T-t)=k, \qquad t\in [0,\delta ]. \end{aligned}$$

Hence, we can determine h(t) inductively on each interval as follows:

If h(t) is known on \([(j-1)\delta ,j\delta ]\), then

$$\begin{aligned} h(t)=h(j\delta )+\int _{j\delta }^{t}h'(s)\mathrm{d}s=h(j\delta )+\int _{j\delta }^{t}b(T-s+\delta )h(s-\delta )\mathrm{d}s \end{aligned}$$
(19)

for \(t\in [j\delta ,(j+1)\delta ]\), \(j=1,2,\ldots \).
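As a check on (19): for constant \(b(t)\equiv b\), it yields \(h(t)=k(1+b(t-\delta ))\) on \([\delta ,2\delta ]\). A minimal numerical sketch of the recursion, with hypothetical values of T, \(\delta \), k and a hypothetical coefficient b (none taken from the paper), might read:

```python
# Numerical sketch of the induction (19): march h forward interval by interval.
# All parameter values below are hypothetical, chosen only for illustration.
import numpy as np

T, delta, k = 1.0, 0.25, 2.0       # hypothetical horizon, delay, terminal weight
b = lambda s: 0.05 * (1.0 + s)     # hypothetical deterministic coefficient b(t)

n_per_delta = 100                  # grid points per delay interval
dt = delta / n_per_delta
t = np.linspace(0.0, T, int(T / dt) + 1)
h = np.full(t.shape, k)            # h(t) = k on [0, delta]; rest overwritten below

# h'(s) = b(T - s + delta) * h(s - delta) for s >= delta (left-endpoint Euler rule)
for i in range(n_per_delta, len(t) - 1):
    h[i + 1] = h[i] + b(T - t[i] + delta) * h[i - n_per_delta] * dt

# Recall p(t) = h(T - t); e.g. the adjoint value at time 0 is h(T):
print("p(0) =", h[-1])
```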

Substituting the utility function \(U(t,c,e_{i})=\phi (t,e_{i})\ln (1+c)\), \(i=1,2,\ldots ,D\), into (16), we have proved the following theorem. Furthermore, since h depends on the constant delay \(\delta \) and the optimal rate depends on the state of the economy through the coefficient \(\phi (t,\alpha (t))\), Theorem 8.1 clarifies the effects of memory and of the different states of the economy on the optimal consumption rate.

Theorem 8.1

The optimal consumption rate \(\hat{c}(t)\) under the above construction is explicitly given by

$$\begin{aligned} \hat{c}(t)&=I(t,h_{\delta }(T-t),\alpha (t))\big |_{\alpha (t)=e_{i}} \\&= \left\{ \begin{array}{ll} 0, &{}\quad \text {if} \quad h_{\delta }(T-t)\ge \phi (t,e_{i}), \\ \frac{\phi (t,e_{i})}{h_{\delta }(T-t)}-1, &{}\quad \text {if} \quad 0\le h_{\delta }(T-t)<\phi (t,e_{i}), \end{array}\right. \end{aligned}$$

where \(h(\cdot )=h_{\delta }(\cdot )\) is determined by (19).
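Continuing the numerical sketch after (19), the rule of Theorem 8.1 can be evaluated on the same grid; `phi` below is a hypothetical stand-in for the stochastic coefficient \(\phi (t,e_{i})\), not a quantity specified in the paper:

```python
# Consumption rule of Theorem 8.1 on the grid (t, h) from the previous sketch.
phi = lambda s, i: 1.0 + 0.5 * i      # hypothetical phi(t, e_i), regimes i = 0, 1, ...

def c_hat(idx, regime):
    """Optimal consumption rate at grid time t[idx] in the given regime."""
    h_val = h[len(t) - 1 - idx]       # h_delta(T - t) read off the uniform grid
    f = phi(t[idx], regime)
    return 0.0 if h_val >= f else f / h_val - 1.0

print("c_hat at t=0 in regime 0:", c_hat(0, 0))
```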

9 Conclusions

In this paper, we study a stochastic optimal control problem via the tools of the maximum principle and prove the necessary and sufficient maximum principles for a delayed jump-diffusion with regimes under full and partial information. We develop the general analytic model setting and the methods for the solution of such a model, and we apply our results to an optimal consumption problem from a cash flow with delay and regimes. In our setting, under the given conditions, one may choose any stochastic utility function reflecting the information available about the investor; in this work, we present the optimal consumption rate explicitly for a specific stochastic utility function.