
1 Introduction

This paper studies the following form of a controlled stochastic differential equation (SDE in short):

$$\begin{aligned} \left\{ \begin{array}{l} d X(t) = F (X(t), u (t)) d t + G (X (t)) d M(t), \;\; 0 \le t \le T,\\ \; X(0) = x_0, \end{array}\right. \end{aligned}$$
(1)

where M is a continuous martingale taking its values in a separable Hilbert space K, while F, G are some mappings with properties to be given later and \(u (\cdot )\) represents a control variable. We will be interested in minimizing the cost functional:

$$\begin{aligned} J(u (\cdot ) ) = \mathbb {E} \, \left[ \int _0^T \ell ( X^{u (\cdot ) } (t), u (t) ) \, dt + h ( X^{u (\cdot ) } (T) )\right] \end{aligned}$$

over a set of admissible controls.
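Although the analysis below takes place in a Hilbert space, the objects in (1) and in the cost functional are easy to visualize in finite dimensions. The following Python sketch is a minimal illustration, not part of the paper's framework: it takes \(K = \mathbb {R}^d\), a martingale \(M(t) = \int _0^t \sigma \, dB(s)\) for a constant matrix \(\sigma \), and hypothetical coefficients F, G, \(\ell \), h, then simulates (1) with an Euler-Maruyama scheme and estimates \(J(u(\cdot ))\) by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)

T, n, m, d = 1.0, 200, 10_000, 2            # horizon, steps, sample paths, dim K
dt = T / n

# Hypothetical data: M(t) = int_0^t sigma dB(s), a continuous martingale.
sigma = np.array([[0.30, 0.00],
                  [0.10, 0.20]])
F = lambda x, u: -x + u                      # drift F(x, u)
G = lambda x: 0.5 * np.tanh(x)               # G(x), acting diagonally on K
ell = lambda x, u: 0.5 * (x**2).sum(1) + 0.5 * (u**2).sum(1)   # running cost
h = lambda x: (x**2).sum(1)                  # terminal cost
u = lambda t, x: -0.5 * x                    # an admissible feedback control

x = np.tile(np.array([1.0, -1.0]), (m, 1))   # x_0
running = np.zeros(m)
for k in range(n):
    uk = u(k * dt, x)
    running += ell(x, uk) * dt               # accumulate int_0^T ell dt
    dM = rng.normal(scale=np.sqrt(dt), size=(m, d)) @ sigma.T
    x += F(x, uk) * dt + G(x) * dM           # Euler-Maruyama step for (1)

print("Monte Carlo estimate of J(u(.)):", (running + h(x)).mean())
```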

We shall follow mainly the ideas of Bensoussan in [10, 11], Zhou in [36, 37], Øksendal et al. [26], and our earlier work [4]. Our main results are stated in Theorems 2 and 3.

We recall that forward SDEs driven by martingales are studied in [6, 15, 16, 21, 34]. In fact, in [6] we derived the maximum principle (necessary conditions) for optimality of stochastic systems governed by SPDEs. However, the results there give the maximum principle only in its local form, and the control domain is assumed to be convex. In the present paper we avoid both of these restrictions, as we explain shortly. Since we deal here with a non-convex domain of controls, it is not obvious how one could allow the control variable u(t) to enter the mapping G in (1) and still obtain a result like Lemma 3 below; this issue was raised also in [10]. Nevertheless, in some special cases (see [8]) we can allow G to depend on the control, overcome this difficulty, and prove the maximum principle. The general case is still open, as pointed out in [6, Remark 6.4].

The maximum principle in infinite dimensions started after the work of Pontryagin [30]. The reader can find a detailed account of these aspects in Li and Yong [22] and the references therein. An expanded discussion of the history of the maximum principle can be found in [36, pp. 153–156]. On the other hand, the use of (linear) backward stochastic differential equations (BSDEs) for deriving the maximum principle for forward controlled stochastic equations goes back to Bismut [12]; in this respect one can also see the works of Bensoussan [10, 11]. In 1990 Pardoux and Peng [27] initiated the theory of nonlinear BSDEs, and Peng then studied the stochastic maximum principle in [28, 29]. Several works on the maximum principle and its relationship with BSDEs have appeared since; see for example [17–19, 33, 36] and the references of Zhou cited therein. Our earlier work in [2] has opened the way to the study of BSDEs and backward SPDEs driven by martingales. One can see [23] for financial applications of BSDEs driven by martingales, and [7, 9, 14, 20] for other applications.

In this paper we first apply a suitable perturbation of an optimal control, by means of the spike variation method, in order to derive the maximum principle in its global form, that is, necessary conditions for optimality. Then we provide sufficient conditions for optimality of our control problem. The results are achieved mainly by using the adjoint equation of (1), which is a BSDE driven by the martingale M; see Eq. (30) in Sect. 5. It is quite important to realize that the adjoint equations of such SDEs, derived in Sect. 5, are in general BSDEs driven by martingales. This happens even when the martingale M appearing in Eq. (1) is a Brownian motion with respect to a right continuous filtration larger than its natural filtration. There is a discussion of this issue in Bensoussan's lecture notes [10, Sect. 4.4], and in [1] and its erratum [5]. In particular, control problems associated with SDEs like (1) with martingale noise cannot be recovered from the work done in the literature for SDEs driven by Brownian motions. We refer the reader to the discussion at the beginning of Sect. 5 below for more details. To the best of our knowledge, our results here, deriving the maximum principle (necessary and sufficient optimality conditions) in its global form for a control problem governed by the SDE (1) with martingale noise, are new. The general case, in which the control variable enters the noise term G, is still an open problem, as stated above.

The paper is organized as follows. Section 2 is devoted to preliminary notation. In Sect. 3 we present our main stochastic control problem. In Sect. 4 we establish the estimates needed to derive the maximum principle for the control problem of (1). The maximum principle in the sense of Pontryagin for this control problem is derived in Sect. 5. In Sect. 6 we establish sufficient conditions for optimality for this control problem, and present some examples as well.

2 Preliminary Notation

Let \((\varOmega , \mathscr {F}, \mathbb {P} )\) be a complete probability space, filtered by a continuous filtration \(\{\mathscr {F}_t\}_{t \ge 0 }\), in the sense that every square integrable K-valued martingale with respect to \(\{\mathscr {F}_t, \; 0 \le t \le T \}\) has a continuous version.

Denoting by \(\mathscr {P}\) the predictable \(\sigma \)-algebra of subsets of \(\varOmega \times [0, T]\), we say that a K-valued process is predictable if it is \(\mathscr {P}/\mathscr {B}(K)\) measurable. Let \(\mathscr {M}^2_{[0, T]} (K)\) be the Hilbert space of càdlàg square integrable martingales \(\{ M (t), 0 \le t \le T \}\) taking their values in K, and let \(\mathscr {M}^{2, c}_{[0, T ]} (K) \) be the subspace of \(\mathscr {M}^2_{[0, T]} (K)\) consisting of all continuous square integrable martingales in K. Two elements M and N of \(\mathscr {M}^2_{[0, T]} (K) \) are said to be very strongly orthogonal (VSO for short) if

$$ \mathbb {E}\, [ M (\tau )\otimes N (\tau ) ] =\, \mathbb {E}\, [ M (0)\otimes N (0) ], $$

for all [0, T]-valued stopping times \(\tau \).

For \(M \in \mathscr {M}^{2, c}_{[0, T ]} (K) \) we shall use the notation \(<M>\) for the predictable quadratic variation of M, and similarly \(\ll M\gg \) for the predictable tensor quadratic variation of M, which takes its values in the space \(L_1(K)\) of all nuclear operators on K; precisely, \(M\otimes M - \ll M\gg \, \in \mathscr {M}^{2, c}_{[0, T ]} (L_1 (K))\). For a given fixed \(M \in \mathscr {M}^{2, c}_{[0, T ]}(K)\) we shall assume that there exists a measurable mapping \({\mathscr {Q} (\cdot ) : [ 0, T ] \times \varOmega \rightarrow L_1 (K)}\) such that \(\mathscr {Q} (t)\) is symmetric, positive definite, \(\mathscr {Q} (t) \le \mathscr {Q} \) for some positive definite nuclear operator \(\mathscr {Q}\) on K, and the following equality holds:

$$ \ll M \gg _t \; = \int _0^t \mathscr {Q} (s) \, ds. $$

We refer the reader to Example 1 for a precise computation of this process \(\mathscr {Q} (\cdot ).\)

For fixed \((t, \omega )\), we denote by \(L_{\mathscr {Q}(t, \omega )} (K)\) the set of all linear operators \(\varphi : \mathscr {Q}^{1/2}(t, \omega ) (K) \rightarrow K \) satisfying \(\varphi \mathscr {Q}^{1/2}(t, \omega ) \in L_2 (K)\), where \(L_2 (K)\) is the space of all Hilbert-Schmidt operators from K into itself. The inner product and norm in \(L_2 (K)\) will be denoted respectively by \(\langle \cdot , \cdot \rangle _2 \) and \(|| \cdot ||_2\). The stochastic integral \(\int _0^{\cdot } \varPhi (s) d M(s) \) is then defined for mappings \(\varPhi \) such that \(\varPhi (t, \omega ) \in L_{\mathscr {Q} (t, \omega )} (K)\) for each \((t, \omega )\), the process \(\varPhi \mathscr {Q}^{1/2} (t, \omega ) (h)\) is predictable for every \(h \in K\), and

$$ \mathbb {E}\; \left[ \, \int _0^T || ( \varPhi \mathscr {Q}^{1/2} ) (t) ||_2^2\; dt \, \right] < \infty . $$

Such integrands form a Hilbert space with respect to the scalar product \(( \varPhi _1, \varPhi _2 ) \mapsto \mathbb {E}\; [ \, \int _0^T \langle \varPhi _1 \mathscr {Q}^{1/2} (t), \varPhi _2 \mathscr {Q}^{1/2} (t)\rangle _2 \; dt \, ]\). Simple processes taking values in L(K, K) are examples of such integrands. We denote by \(\varLambda ^2 ( K ; \mathscr {P}, M ) \) the closure of the set of simple processes in this Hilbert space; it is a Hilbert subspace. We also have the following isometry property:

$$\begin{aligned} \mathbb {E} \, \left[ \, | \int _0^T \varPhi (t) dM(t) |^2 \right] = \mathbb {E} \, \left[ \, \int _0^T || \varPhi (t) \mathscr {Q}^{1/2} (t)||_2^2 \, dt \right] \end{aligned}$$
(2)

for mappings \(\varPhi \in \varLambda ^2 ( K ; \mathscr {P}, M )\). For more details and proofs we refer the reader to [25].
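In a finite-dimensional setting the isometry (2) can also be verified numerically. The sketch below is our own illustration (not from [25]): it takes \(K = \mathbb {R}^2\), \(M = \int \sigma \, dB\) with a constant matrix \(\sigma \), so that \(\mathscr {Q}(t) \equiv \sigma \sigma ^{T}\), and a deterministic matrix-valued integrand \(\varPhi \). Note that for the Hilbert-Schmidt norm any factor L with \(L L^{T} = \mathscr {Q}\) may play the role of \(\mathscr {Q}^{1/2}\), since \(|| \varPhi L ||_2^2 = \mathrm {tr}\, ( \varPhi \mathscr {Q} \varPhi ^{T} )\).

```python
import numpy as np

rng = np.random.default_rng(1)
T, n, m, d = 1.0, 400, 20_000, 2
dt = T / n

sigma = np.array([[0.6, 0.0],
                  [0.2, 0.4]])
Q = sigma @ sigma.T                          # Q(t) = Q, constant in this example
L = np.linalg.cholesky(Q)                    # a square root of Q
Phi = lambda t: np.array([[1.0 + t, 0.5],
                          [0.0, np.cos(t)]]) # deterministic integrand

stoch_int = np.zeros((m, d))                 # int_0^T Phi(t) dM(t), pathwise
rhs = 0.0                                    # int_0^T ||Phi(t) Q^{1/2}||_2^2 dt
for k in range(n):
    P = Phi(k * dt)
    dB = rng.normal(scale=np.sqrt(dt), size=(m, d))
    stoch_int += dB @ (P @ sigma).T          # increment Phi(t) sigma dB(t)
    rhs += np.linalg.norm(P @ L) ** 2 * dt   # squared Frobenius norm

lhs = (stoch_int ** 2).sum(1).mean()         # E | int_0^T Phi dM |^2
print(f"lhs = {lhs:.4f},  rhs = {rhs:.4f}")  # the two sides should agree
```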

On the other hand, we emphasize that the process \(\mathscr {Q} (\cdot )\) will play an important role in deriving the adjoint equation of the SDE (1), as can be seen from Eqs. (29), (30) in Sect. 5. This is due to the fact that the integrand \(\varPhi \) is not necessarily bounded. More precisely, it is needed in order for the mapping \(\nabla _{x} H\), which appears in both equations, to be defined on the space \(L_2 (K)\), since the process \(Z^{u (\cdot )}\) there need not be bounded. This always has to be taken into account when working with BSDEs or BSPDEs driven by infinite dimensional martingales.

Next let us introduce the following space:

$$ L^2_{\mathscr {F}} ( 0, T ; E )\ {:=}\ \{ \psi {:}\ [ 0, T]\times \varOmega {\rightarrow } E,\; \text {predictable and} \; \mathbb {E} \, \left[ \! \int _0^T | \psi (t) |^2 dt \, \!\right] < \infty \, \}, $$

where E is a separable Hilbert space.

Since \(\mathscr {Q} (t) \le \mathscr {Q} \) for all \(t \in [ 0, T]\) a.s., it follows from [3, Proposition 2.2] that if \(\varPhi \in L^2_{\mathscr {F}} ( 0, T ; L_{\mathscr {Q}} (K) ) \), where, as above, \(L_{\mathscr {Q}} (K) = L_2 (\mathscr {Q}^{1/2} (K) ; K) \) is the space of all Hilbert-Schmidt operators from \(\mathscr {Q}^{1/2} (K)\) into K, then \(\varPhi \in \varLambda ^2 ( K ; \mathscr {P}, M ) \) and

$$\begin{aligned} \mathbb {E}\; \left[ \, \int _0^T || \varPhi (t) \mathscr {Q}^{1/2} (t) ||^2_2 \; dt \, \right] \le \mathbb {E}\; \left[ \, \int _0^T || \varPhi (t) ||_{L_{\mathscr {Q}}(K)}^2 \; d t \, \right] . \end{aligned}$$
(3)

An example of such a mapping \(\varPhi \) is the mapping G in Eq. (1); see the domain of G in the introduction of the following section.

3 Formulation of the Control Problem

Let \(\mathscr {O}\) be a separable Hilbert space and U be a nonempty subset of \(\mathscr {O}\). We say that \(u (\cdot ) : [0, T]\times \varOmega \rightarrow \mathscr {O}\) is admissible if \(u (\cdot ) \in L^2_{\mathscr {F}} ( 0, T ; \mathscr {O} )\) and \(u (t) \in U \; \; a.e., \; a.s\). The set of admissible controls will be denoted by \(\mathscr {U}_{ad}\).

Let \(F{:}\ K \times \mathscr {O} \rightarrow K\), \(G{:}\ K \rightarrow L_{\mathscr {Q}} (K)\), \(\ell {:}\ K \times \mathscr {O} \rightarrow \mathbb {R}\) and \(h{:}\ K \rightarrow \mathbb {R} \) be measurable mappings. Consider the following SDE:

$$\begin{aligned} \left\{ \begin{array}{l} d X(t) = F (X(t), u (t) ) \, d t + \, G (X(t) )\, d M(t), \;\; t \in [0, T],\\ \; X(0) = x_0 \in K. \end{array}\right. \end{aligned}$$
(4)

If assumption (E1), stated below, holds, then (4) admits a unique solution in \(L^2_{\mathscr {F}} ( 0, T ; K )\); the proof of this fact can be gleaned from [31] or [32]. In this case we shall denote the solution of (4) by \(X^{u (\cdot ) }\).

Our assumptions are the following.

(E1) \(F, G, \ell , h\) are continuously Fréchet differentiable with respect to x, F and \(\ell \) are continuously Fréchet differentiable with respect to u, the derivatives \(F_x, \, F_{u }, \, G_x, \ell _x, \ell _{u} \) are uniformly bounded, and

$$ | h_x |_{L(K; \mathbb {R})} \le k \, ( 1 + |x|_{K} ) $$

for some constant \(k > 0\).

In particular, \(|F_x|_{L(K,K)} \le C_1, \, ||G_x||_{L(K,L_{\mathscr {Q}}(K))} \le C_2, \, |F_{v}|_{L(\mathscr {O},K)} \le C_3, \) for some positive constants \(C_i, \; i =1, 2, 3\), and similarly for \(\ell \).

(E2) \(\ell _x\) satisfies a Lipschitz condition with respect to u, uniformly in x.

Consider now the cost functional:

$$\begin{aligned} J(u (\cdot ) )\ {:=}\ \mathbb {E} \, \left[ \, \int _0^T \ell ( X^{u (\cdot ) } (t), u (t) ) \, dt + h ( X^{u (\cdot ) } (T) ) \, \right] , \end{aligned}$$
(5)

for \(u (\cdot ) \in \mathscr {U}_{ad}\).

The control problem here is to minimize (5) over the set \(\mathscr {U}_{ad}\). Any \(u^{*} ( \cdot ) \in \mathscr {U}_{ad} \) satisfying

$$\begin{aligned} J(u^{*} ( \cdot ) ) = \inf \{ J(u (\cdot ) ): \; u (\cdot ) \in \, \mathscr {U}_{ad} \} \end{aligned}$$
(6)

is called an optimal control, and its corresponding solution \(X^*\ {:=}\ X^{u^{*} (\cdot ) }\) of (4) is called an optimal solution of the stochastic optimal control problem (4)–(6). The pair \(( X^{*}, u^{*} (\cdot ) )\) is then called an optimal pair.

Remark 1

We mention here that the mappings F, G and \(\ell \) in (4) and (5) can easily be allowed to depend also on time t; the proofs established in the following sections carry over, at the cost of more technical computations.

Since this control problem has no constraints, we shall deal generally with progressively measurable controls. For the case when there are final state constraints, one can mimic our results in Sects. 4, 5 and 6, and use Ekeland's variational principle in a similar way to [24, 28] or [36].

In the following section we begin with a variational method in order to derive the variational inequalities needed to establish the main result of Sect. 5.

4 Estimates

Let \((X^{*}, u^{*} (\cdot ) )\) be the given optimal pair. Fix \(0 \le t_0 < T\) and \(0< \varepsilon < T - t_0\), and let v be an \(\mathscr {F}_{t_0}\)-measurable random variable taking its values in U such that \(\displaystyle {\sup _{\omega \in \varOmega }} \, | v (\omega ) | < \infty \). Consider the following spike variation of the control \(u^{*} (\cdot )\):

$$\begin{aligned} u_{\varepsilon } (t) = \left\{ \begin{array}{ll} u^{*} (t) \;\; &{} \text { if} \; \; t \in [0, T] \backslash [t_0, t_0 + \varepsilon ] \\ v \;\; &{} \text { if} \; \; t \in [t_0, t_0 + \varepsilon ]. \end{array}\right. \end{aligned}$$
(7)

Let \(X^{u_{\varepsilon } (\cdot ) }\) denote the solution of the SDE (4) corresponding to \( u_{\varepsilon } (\cdot )\). We shall denote it briefly by \(X_{\varepsilon }\). Observe that \(X_{\varepsilon } (t) = X^{*} (t) \) for all \(0 \le t \le t_0\).
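On a time grid, the spike variation (7) simply overwrites \(u^{*}\) on the window \([t_0, t_0 + \varepsilon ]\); the following minimal sketch (the function name and the data are ours, purely for illustration) constructs \(u_{\varepsilon } (\cdot )\) from a discretized control path.

```python
import numpy as np

def spike_variation(u_star, t_grid, t0, eps, v):
    """Spike variation (7): equals u_star outside [t0, t0 + eps]
    and the (bounded, F_{t0}-measurable) value v inside it."""
    u_eps = u_star.copy()
    window = (t_grid >= t0) & (t_grid <= t0 + eps)
    u_eps[window] = v
    return u_eps

# Usage with illustrative data:
t_grid = np.linspace(0.0, 1.0, 101)
u_star = np.sin(2 * np.pi * t_grid)          # stands in for the optimal control
u_eps = spike_variation(u_star, t_grid, t0=0.4, eps=0.05, v=0.7)
print(int((u_eps != u_star).sum()), "grid points perturbed")
```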

The following lemmas will be very useful in proving the main results of Sect. 5.

Lemma 1

Let (E1) hold. Assume that \(\{ p(t), \; t_0 \le t \le T \}\) is the solution of the following linear equation:

$$\begin{aligned} \left\{ \begin{array}{ll} d p(t) = F_x (X^{*}(t), u^{*} (t) ) \, p (t) \, dt + G_x (X^{*} (t) )\, p (t) \, d M(t), t_0 < t \le T,\\ \; p (t_0 ) = F (X^{*}(t_0), v ) - F (X^{*}(t_0), u^{*} (t_0) ). \end{array}\right. \end{aligned}$$
(8)

Then

$$ \sup _{t \in [t_0, T]} \mathbb {E} \, [ \, | p (t) |^2 \, ] \, < C $$

for some positive constant C.

Proof

With the help of (E1), apply Itô's formula to compute \(| p (t) |^2\) and take expectations; the required result then follows by Gronwall's inequality.
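In slightly more detail, a sketch of this computation runs as follows. Itô's formula, the isometry (2), (3) and (E1) give, for \(t \in [t_0, T]\),

$$\begin{aligned} \mathbb {E} \, [ \, | p (t) |^2 \, ]&\le \mathbb {E} \, [ \, | p (t_0) |^2 \, ] + \int _{t_0}^t \left( 2 \, \mathbb {E} \, [ \, \langle p(s), F_x (X^{*}(s), u^{*} (s) ) \, p(s) \rangle \, ] + \mathbb {E} \, [ \, || G_x (X^{*} (s) ) \, p(s) ||^2_{L_{\mathscr {Q}}(K)} \, ] \right) ds \\&\le \mathbb {E} \, [ \, | p (t_0) |^2 \, ] + ( 2\, C_1 + C_2^2 ) \int _{t_0}^t \mathbb {E} \, [ \, | p (s) |^2 \, ] \, ds, \end{aligned}$$

and \(p(t_0)\) is bounded by (E1) and the boundedness of v; Gronwall's inequality then yields the claim with any \(C > \mathbb {E} \, [ \, | p (t_0) |^2 \, ] \, e^{ ( 2 C_1 + C_2^2 ) (T - t_0) }\).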

Lemma 2

Assuming (E1) we have

$$ \mathbb {E} \, \left[ \, \sup _{t_0 \le t \le T } \, | X_{\varepsilon } (t) - X^{*} (t) |^2 \, \right] = o (\varepsilon ). $$

Proof

For \(t_0 \le t \le t_0 + \varepsilon \) one observes that

$$\begin{aligned}&X_{\varepsilon } (t) - X^{*} (t) = \int _{t_0}^t [ F( X_{\varepsilon }(s), v ) - F( X^{*}(s), v ) ] \, ds \nonumber \\&\quad + \, \int _{t_0}^t [ F( X^{*}(s), v ){-} F( X^{*}(s), u^{*} (s) ) ] \, ds {+} \int _{t_0}^t [ G( X_{\varepsilon }(s) ) {-} G( X^{*}(s) ) \!] dM(s),\nonumber \\ \end{aligned}$$
(9)

or, in particular,

$$\begin{aligned}&| X_{\varepsilon } (t) - X^{*} (t) |^2 \le 3\, (t-t_0 ) \int _{t_0}^t | F( X_{\varepsilon }(s), v ) - F( X^{*}(s), v ) |^2 \, ds \nonumber \\&\qquad \qquad \qquad \qquad \qquad + \, 3 \, (t-t_0 ) \int _{t_0}^t | F( X^{*}(s), v ) - F( X^{*}(s), u^{*} (s) ) |^2 \, ds \nonumber \\&\qquad \qquad \qquad \quad \quad \quad \quad \quad \quad + \, 3 \; | \int _{t_0}^t [ G( X_{\varepsilon }(s) ) - G( X^{*}(s) ) ] dM(s) |^2. \end{aligned}$$
(10)

But Taylor's expansion (in integral form) yields the three identities:

$$\begin{aligned}&F( X_{\varepsilon }(s), v ) - F( X^{*}(s), v ) \nonumber \\&\qquad = \, \int _0^1 F_x ( X^{* }(s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) ), v ) \, ( X_{\varepsilon }(s) - X^{*}(s) ) \, d\lambda , \end{aligned}$$
(11)
$$\begin{aligned}&F( X^{*}(s), v ) - F( X^{*}(s), u^{*} (s) ) \nonumber \\&\qquad \qquad \qquad = \, \int _0^1 F_v ( X^{* }(s), u^{*} (s) + \lambda (v - u^{*} (s) )) \, (v - u^{*} (s) ) \, d\lambda , \end{aligned}$$
(12)

and

$$\begin{aligned} G( X_{\varepsilon }(s) ) - G( X^{*}(s) ) = \, \int _0^1 G_x ( X^{*} (s)&+\ \lambda ( X_{\varepsilon }(s) - X^{*}(s) )) \, ( X_{\varepsilon }(s) - X^{*}(s) ) \, d\lambda \nonumber \\&\qquad \quad {=:}\ \varPhi (s) \; ( \in L_{\mathscr {Q}} (K) ). \end{aligned}$$
(13)

Then, by using (13), the isometry property (2), (3) and (E1) we deduce that for all \(t \in [ t_0, t_0 + \varepsilon ],\)

$$\begin{aligned}&\mathbb {E}\, \left[ \, | \int _{t_0}^t \left( G( X_{\varepsilon }(s) ) - G( X^{*}(s) ) \right) dM(s) |^2\right] = \mathbb {E}\, \left[ \, | \int _{t_0}^t \varPhi (s) dM(s) |^2\right] \nonumber \\&= \; \mathbb {E}\, \left[ \, \int _{t_0}^t || \varPhi (s) \mathscr {Q}^{1/2} (s) ||_2^2 \, ds\right] \nonumber \\&\le \; \mathbb {E}\, \left[ \, \int _{t_0}^t || \varPhi (s) ||_{L_{\mathscr {Q}} (K)}^2 \, ds\right] \nonumber \\&= \; \mathbb {E}\, \left[ \, \int _{t_0}^t || \int _0^1 G_x ( X^{*} (s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) )) \, ( X_{\varepsilon }(s) - X^{*}(s) ) \, d\lambda ||_{L_{\mathscr {Q}} (K)}^2 \, ds \, \right] \nonumber \\&\le \; \mathbb {E}\, \left[ \, \int _{t_0}^t \int _0^1 || G_x ( X^{*} (s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) )) \, ( X_{\varepsilon }(s) - X^{*}(s) ) ||_{L_{\mathscr {Q}} (K)}^2 \, d\lambda \, ds \, \right] \nonumber \\&\le \; C_2 \; \mathbb {E}\, \left[ \, \int _{t_0}^t | X_{\varepsilon }(s) - X^{*}(s) |^2 \, ds \, \right] . \end{aligned}$$
(14)

Therefore, from (10), (11), (12), (E1) and (14), it follows evidently that

$$\begin{aligned} \mathbb {E}\, [ \, | X_{\varepsilon } (t) - X^{*} (t) |^2 \, ]&\le 3\, \left( C_1 \, (t-t_0 ) + C_2 \right) \int _{t_0}^t \mathbb {E}\, [ \, |\, X_{\varepsilon } (s) - X^{*} (s) \, |^2 \, ] \, ds \nonumber \\&\,\,\,\,\, + \, 3\, (t-t_0 ) \, C_3 \int _{t_0}^t \mathbb {E}\, [ \, |\, v - u^{*} (s) \, |^2 \, ] \, ds, \nonumber \\ \end{aligned}$$

for all \(t \in [ t_0, t_0 + \varepsilon ]\).

Hence by using Gronwall’s inequality we obtain

$$\begin{aligned} \mathbb {E}\, [ \, |\, X_{\varepsilon } (t) - X^{*} (t) \, |^2 \, ] \le 3\, C_3 \, (t-t_0 ) \, e^{3\, \left( C_1 \, (t-t_0 ) + C_2 \right) (t-t_0 ) } {\times }\! \int _{t_0}^{t_0 + \varepsilon } \mathbb {E}\, [ \, | v - u^{*} (s) |^2 \, ] \, ds, \nonumber \\ \end{aligned}$$
(15)

for all \(t \in [t_0, t_0 + \varepsilon ]\). Consequently,

$$\begin{aligned} \mathbb {E}\, \left[ \,\! \int _{t_0}^{t_0 + \varepsilon } |\, X_{\varepsilon } (t) - X^{*} (t) \, |^2 \, dt \, \!\right] \le 3\, C_3 \, \varepsilon ^2 \, e^{3 \, (C_1 \, \varepsilon + C_2 ) \varepsilon } \times \int _{t_0}^{t_0 + \varepsilon } \mathbb {E}\, [ \, | v - u^{*} (s) |^2 \, ] \, ds. \nonumber \\ \end{aligned}$$
(16)

It follows then from (10), (15), standard martingale inequalities, (14) and (16) that

$$\begin{aligned}&\mathbb {E}\, \left[ \, \sup _{t_0 \le t \le t_0 + \varepsilon } | X_{\varepsilon } (t) - X^{*} (t) |^2 \, \right] \nonumber \\&\quad \le 3 \, C_3 \, [ \, 3 \, ( C_1 \, \varepsilon + 4 C_2) \, \varepsilon \, e^{3 \, (C_1 \, \varepsilon + C_2 ) \varepsilon } \, + \, 1 \, ] \, \varepsilon \, \int _{t_0}^{t_0 + \varepsilon } \mathbb {E}\, [ \, | v - u^{*} (s) |^2 \, ] \, ds.\quad \quad \quad \quad \end{aligned}$$
(17)

Next, for \(t_0 + \varepsilon \le t \le T\), we have

$$\begin{aligned}&X_{\varepsilon } (t) - X^{*} (t) = X_{\varepsilon } (t_0 + \varepsilon ) - X^{*} (t_0 + \varepsilon ) \nonumber \\&\qquad \qquad \qquad \qquad + \, \int _{t_0 + \varepsilon }^t [ F( X_{\varepsilon }(s), u^{*} (s) ) - F( X^{*}(s), u^{*} (s) ) ] \, ds \nonumber \\&\qquad \qquad \qquad \qquad + \, \int _{t_0 + \varepsilon }^t [ G( X_{\varepsilon }(s) ) - G( X^{*}(s) ) ] dM(s). \end{aligned}$$
(18)

Thus by working as before and applying (15) we derive

$$\begin{aligned}&\mathbb {E}\, \left[ \, \int _{t_0 + \varepsilon }^T |\, X_{\varepsilon } (t) - X^{*} (t) \, |^2 \, dt \, \right] \le 9 \, C_3 \, \varepsilon ^2 e^{C_4 (\varepsilon )} \int _{t_0}^{t_0 + \varepsilon } \mathbb {E}\, [ \, | v - u^{*} (s) |^2 \, ] \, ds \end{aligned}$$

and

$$\begin{aligned} \mathbb {E}\, \left[ \, \sup _{t_0 + \varepsilon \le t \le T } | X_{\varepsilon } (t) - X^{*} (t) |^2 \, \right]&\le 27 \, C_3 \, \varepsilon \, e^{C_4 (\varepsilon )} \, [ \, 1+ \left( ( T - t_0 - \varepsilon ) \, C_1 + 4 \, C_2 \right) \, \varepsilon \, ] \nonumber \\&\,\,\,\,\,\,\times \int _{t_0}^{t_0 + \varepsilon } \mathbb {E}\, [ \, | v - u^{*} (s) |^2 \, ] \, ds, \end{aligned}$$
(19)

where \(C_4 (\varepsilon ) = [ 3 \, \varepsilon ^2 + 3 \, ( T - t_0 - \varepsilon )^2 ] \, C_1 + ( T - t_0 + 2\, \varepsilon ) \, C_2\).

Now (17) and (19) imply that

$$\begin{aligned}&\mathbb {E}\, \left[ \, \sup _{t_0 \le t \le T } | X_{\varepsilon } (t) - X^{*} (t) |^2 \, \right] \le \left( C_5 (\varepsilon ) + C_6 (\varepsilon ) \right) \int _{t_0}^{t_0 + \varepsilon } \mathbb {E}\, [ \, | v - u^{*} (s) |^2 \, ] \, ds, \end{aligned}$$

with the constants

$$ C_5 (\varepsilon ) = 3 \, C_3 \, [ \, 3 \, ( C_1 \, \varepsilon + 4 C_2) \, \varepsilon \, e^{3 \, (C_1 \, \varepsilon + C_2 ) \varepsilon } \, + \, 1 \, ] \, \varepsilon $$

and

$$ C_6 (\varepsilon ) = 27 \, C_3 \, \varepsilon \, e^{C_4 (\varepsilon )} \, [ \, 1+ \left( ( T - t_0 - \varepsilon ) \, C_1 + 4 \, C_2 \right) \, \varepsilon \, ]. $$

This completes the proof.

Remark 2

We note that for \(a.e. \; s, \)

$$\begin{aligned} \frac{1}{\varepsilon } \, \int _{s}^{s + \varepsilon } \mathbb {E} \, [ \, |\phi ( X^{*}(t), u^{*}(t) ) - \phi ( X^{*}(s), u^{*} (s) ) |^2 \, ] \, dt \rightarrow 0, \; \; \text {as} \;\; \varepsilon \rightarrow 0, \end{aligned}$$
(20)

for \(\phi = F, \ell \). Indeed, if for example \(\phi = F\), then, by splitting the difference and arguing as in (11) and (12), we see that

$$\begin{aligned}&\frac{1}{\varepsilon } \, \int _{s}^{s + \varepsilon } \mathbb {E} \, [ \, | F( X^{*}(t), u^{*}(t) ) - F( X^{*}(s), u^{*} (s) ) |^2 \, ] \, dt \nonumber \\&\quad \le \, \frac{2\, C_1^2}{\varepsilon } \, \int _{s}^{s + \varepsilon } \mathbb {E} \, [ \, | X^{*} (t) - X^{*} (s) |^2 \, ] \, dt \, + \, \frac{2\, C_3^2}{\varepsilon } \, \int _{s}^{s + \varepsilon } \mathbb {E} \, [ \, | u^{*} (t) - u^{*} (s) |^2 \, ] \, dt. \end{aligned}$$
(21)

The first term on the right hand side of (21) tends to 0 as \(\varepsilon \rightarrow 0\) by the mean-square continuity of \(X^{*}\), which follows from standard estimates on (4) under (E1). For the second term, since \(\int _{0}^{T} \mathbb {E} \, [ \, | u^{*} (t) - u^{*} (s) |^2 \, ] \, dt < \infty \) (for fixed s), it is well known from measure theory (e.g. [13]) that there exists a subset O of [0, T] such that \(\text {Leb}([0, T]\setminus O) = 0 \) and the function \( O \ni t \mapsto \mathbb {E} \, [ \, | u^{*} (t) - u^{*} (s) |^2 \, ] \) is continuous. Thus, if \(s \in O\), this function is continuous in a neighborhood of s, and so we have

$$\begin{aligned} \frac{1}{\varepsilon } \, \int _{s}^{s + \varepsilon } \mathbb {E} \, [ \, | u^{*} (t) - u^{*} (s) |^2 \, ] \, dt \rightarrow 0, \; \; \text {as} \;\; \varepsilon \rightarrow 0, \end{aligned}$$

which by (21) implies (20) for \(\phi = F\).

We will choose \(t_0\) such that (20) holds for \(\phi = F, \ell \); this convention remains in force until the end of Sect. 5.

Lemma 3

Assume (E1). Let

$$\xi _{\varepsilon } (t) = \frac{1}{\varepsilon } \, (X_{\varepsilon } (t) - X^{*} (t)) - p (t), t \in [t_0, T ].$$

Then

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0 } \mathbb {E} \, [ \, | \xi _{\varepsilon } (T) |^2 \, ] \, = \, 0. \end{aligned}$$

Proof

First note that, for \(t_0 \le t \le t_0 + \varepsilon ,\)

$$\begin{aligned}&d \xi _{\varepsilon } (t) = \frac{1}{\varepsilon } \, [ \, F( X_{\varepsilon }(t), v ) - F( X^{*}(t), u^{*} (t) ) - \varepsilon \, F_x (X^{*}(t), u^{*} (t) ) \, p(t) \, ] dt \nonumber \\&\qquad \quad + \; \frac{1}{\varepsilon } \, [ \, G( X_{\varepsilon }(t)) - G( X^{*}(t)) - \varepsilon \, G_x (X^{*}(t)) \, p(t) \, ] dM(t), \nonumber \\&\; \xi _{\varepsilon } (t_0 ) = - \left( F( X^{*}(t_0), v ) - F( X^{*}(t_0), u^{*} (t_0) ) \right) . \end{aligned}$$

Thus

$$\begin{aligned} \xi _{\varepsilon } (t_0 + \varepsilon )= & {} \frac{1}{\varepsilon } \, \int _{t_0 }^{t_0 + \varepsilon } [ \, F( X_{\varepsilon }(s), v ) - F( X^{*}(s), v) \, ] \, ds \nonumber \\&+\; \frac{1}{\varepsilon } \, \int _{t_0 }^{t_0 + \varepsilon } [ \, F( X^{*}(s), v ) - F( X^{*}(t_0 ), v) \, ] \, ds \nonumber \\&+\; \frac{1}{\varepsilon } \, \int _{t_0 }^{t_0 + \varepsilon } [ \, F( X^{*}(t_0 ), u^{*} (t_0 ) ) - F( X^{*}(s), u^{*}(s) ) \, ] \, ds \nonumber \\&+\; \frac{1}{\varepsilon } \, \int _{t_0 }^{t_0 + \varepsilon } [ \, G( X_{\varepsilon }(s)) - G( X^{*}(s)) \, ] dM(s) \nonumber \\&- \int _{t_0 }^{t_0 + \varepsilon } F_x (X^{*}(s), u^{*} (s) ) p(s) \, ds - \int _{t_0 }^{t_0 + \varepsilon } G_x (X^{*}(s)) p(s) dM(s). \end{aligned}$$

By using (2), (3) and (E1) we deduce

$$\begin{aligned}&\mathbb {E}\, [ \, |\, \xi _{\varepsilon } (t_0 + \varepsilon ) \, |^2 \, ] \le 6\, C_1 \, \mathbb {E}\, [ \, \sup _{t_0 \le t \le t_0 + \varepsilon } | X_{\varepsilon } (t) - X^{*} (t) |^2 \, ] \nonumber \\&\qquad + \, 6 \, \sup _{t_0 \le t \le t_0 + \varepsilon } \mathbb {E}\, [ \, | F( X^{* }(t), v ) - F( X^{*}(t_0 ), v ) |^2 \, ] \nonumber \\&\qquad + \, \frac{6}{\varepsilon } \, \int _{t_0 }^{t_0 + \varepsilon } \mathbb {E}\, [ \, | F( X^{* }(s), u^{*}(s) ) - F( X^{*}(t_0 ), u^{*}(t_0) ) |^2 \, ] \, ds \nonumber \\&\qquad + \, \frac{6 \, C_2}{\varepsilon } \; \mathbb {E}\, \left[ \, \sup _{t_0 \le t \le t_0 + \varepsilon } | X_{\varepsilon } (t) - X^{*} (t) |^2 \, \right] + 6 \, (C_1 + C_2 ) \; \mathbb {E}\, \left[ \, \int _{t_0 }^{t_0 + \varepsilon } | p(s) |^2 \, ds \, \right] .\nonumber \\ \end{aligned}$$
(22)

But from (17)

$$\begin{aligned}&\frac{1}{\varepsilon } \; \mathbb {E}\, \left[ \, \sup _{t_0 \le t \le t_0 + \varepsilon } | X_{\varepsilon } (t) - X^{*} (t) |^2 \, \right] \nonumber \\&\qquad \le 3 \, C_3 \, [ \, 3 \, ( C_1 \, \varepsilon + 4 C_2) \, \varepsilon \, e^{3 \, (C_1 \, \varepsilon + C_2 ) \varepsilon } \, + \, 1 \, ] \, \int _{t_0}^{t_0 + \varepsilon } \mathbb {E}\, [ \, | v - u^{*} (s) |^2 \, ] \, ds \; \rightarrow 0\nonumber \\ \end{aligned}$$
(23)

as \(\varepsilon \rightarrow 0\). Also, as in (11), by applying (E1) and the mean-square continuity of \(X^{*}\) (see Remark 2), one gets

$$\begin{aligned}&\mathbb {E}\, [ \, | F( X^{* }(t), v ) - F( X^{*}(t_0 ), v ) |^2] \nonumber \\&\qquad = \, \mathbb {E}\, \left[ \, | \int _0^1 F_x ( X^{*} (t_0 ) + \lambda ( X^{* }(t) - X^{*}(t_0 ) ), v ) ( X^{* }(t) - X^{*}(t_0 ) ) \, d\lambda |^2 \, \right] \nonumber \\&\qquad \le \, C_1^2 \; \mathbb {E}\, [ \, | X^{* }(t) - X^{*}(t_0 ) |^2] \; \rightarrow \; 0 \;\; \text {as} \;\; \varepsilon \rightarrow 0, \end{aligned}$$
(24)

uniformly in \(t \in [t_0, t_0 + \varepsilon ]\).

Thus, by applying Lemma 2, (24), (23), (20) and Lemma 1 in (22), we deduce

$$\begin{aligned}&\mathbb {E}\, [ \, |\, \xi _{\varepsilon } (t_0 + \varepsilon ) \, |^2 \, ] \rightarrow 0 \;\; \text {as} \;\; \varepsilon \rightarrow 0. \end{aligned}$$
(25)

Let us now assume that \(t_0 + \varepsilon \le t \le T\). In this case we have

$$\begin{aligned}&d \xi _{\varepsilon } (t) = \frac{1}{\varepsilon } \, [ \, F( X_{\varepsilon }(t), u^{*} (t) ) - F( X^{*}(t), u^{*} (t) ) - \varepsilon \, F_x (X^{*}(t), u^{*} (t) ) \, p(t) \, ] dt \nonumber \\&\qquad \qquad \qquad + \; \frac{1}{\varepsilon } \, [ \, G( X_{\varepsilon }(t)) - G( X^{*}(t)) - \varepsilon \, G_x (X^{*}(t)) \, p(t) \, ] dM(t), \end{aligned}$$

or, in particular, by setting

$$\tilde{\varPhi }_{\varepsilon }(s) = \int _0^1 [ \, G_x ( X^{*} (s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) ) ) - G_x (X^{*}(s)) \, ]\, p(s) \, d\lambda , $$

we get

$$\begin{aligned}&\xi _{\varepsilon } (t) = \xi _{\varepsilon } (t_0 + \varepsilon ) + \int _{t_0 + \varepsilon }^{t} \int _0^1 F_x ( X^{*} (s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) ), u^{*} (s) ) \, \xi _{\varepsilon } (s) \, d\lambda \, ds \nonumber \\&\qquad \qquad \quad + \, \int _{t_0 + \varepsilon }^{t} \int _0^1 G_x ( X^{*} (s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) ) ) \, \xi _{\varepsilon } (s) \, d\lambda \, d M(s) \nonumber \\&\qquad + \, \int _{t_0 + \varepsilon }^{t} \int _0^1 [ \, F_x ( X^{*} (s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) ), u^{*} (s) ) - F_x (X^{*}(s), u^{*} (s) ) \, ]\, p(s) \, d\lambda \, ds \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad + \, \int _{t_0 + \varepsilon }^{t} \tilde{\varPhi }_{\varepsilon } (s) \, dM(s), \end{aligned}$$

for all \(t \in [t_0 + \varepsilon , T]\). Hence by making use of the isometry property (2) it holds \(\forall \; t \in [t_0 + \varepsilon , T],\)

$$\begin{aligned}&\mathbb {E}\, [ \, |\, \xi _{\varepsilon } (t) \, |^2 \, ] \le 5\, \mathbb {E}\, [ \, |\, \xi _{\varepsilon } (t_0 + \varepsilon ) \, |^2 \, ] + 5 \, ( C_1 + C_2 ) \, \int _{t_0 + \varepsilon }^{t} \mathbb {E}\, [ \, | \xi _{\varepsilon } (s) \, |^2 \, ] \, ds \nonumber \\&+ \, 5 \, \mathbb {E}\, \left[ \, \Big ( \int _{t_0 }^T |\, \int _0^1 \left( \, F_x ( X^{*} (s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) ), u^{*} (s) ) - F_x (X^{*}(s), u^{*} (s) ) \, \right) p(s) \, d\lambda \, | \, ds \Big )^2 \, \right] \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad + \, 5 \, \mathbb {E}\, \left[ \, \int _{t_0}^T || \tilde{\varPhi }_{\varepsilon } (s) \, \mathscr {Q}^{1/2} (s) ||^2_{2} \, d s \, \right] . \end{aligned}$$
(26)

But, as in the corresponding steps of (14), we can easily derive

$$\begin{aligned}&\mathbb {E}\, \left[ \, \int _{t_0}^T || \tilde{\varPhi }_{\varepsilon } (s) \, \mathscr {Q}^{1/2} (s) ||^2_{2} \, d s \, \right] \le \; \mathbb {E}\, \left[ \, \int _{t_0}^T || \tilde{\varPhi }_{\varepsilon } (s) ||_{L_{\mathscr {Q}} (K)}^2 \, ds \, \right] \nonumber \\&= \; \mathbb {E}\, \left[ \, \int _{t_0}^T || \int _0^1 [ G_x ( X^{*} (s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) ) ) - G_x (X^{*}(s)) ] p(s) \, d\lambda ||_{L_{\mathscr {Q}} (K)}^2 ds \right] \nonumber \\&\le \; \mathbb {E}\, \left[ \, \int _{t_0}^T \int _0^1 || [ \, G_x ( X^{*} (s) + \lambda ( X_{\varepsilon }(s) - X^{*}(s) ) ) - G_x (X^{*}(s)) \, ] \, p(s) ||_{L_{\mathscr {Q}} (K)}^2 \, d\lambda \, ds \, \right] . \end{aligned}$$
(27)

Therefore, from Lemma 2, the continuity and boundedness of \(G_x\) in (E1), Lemma 1 and the dominated convergence theorem, we deduce that the last term on the right hand side of (27) goes to 0 as \(\varepsilon \rightarrow 0\).

Similarly, the third term on the right hand side of (26) also converges to 0 as \(\varepsilon \rightarrow 0\).

Finally, by applying Gronwall’s inequality to (26), and using (25)–(27), we deduce that

$$\begin{aligned} \sup _{t_0 + \varepsilon \le t \le T } \, \mathbb {E}\, [ \, |\, \xi _{\varepsilon } (t) \, |^2 \, ] \rightarrow 0 \;\; \text {as}\; \; \varepsilon \rightarrow 0, \end{aligned}$$

which proves the lemma.

Lemma 4

Assume (E1) and (E2). Let \(\zeta \) be the solution of the equation:

$$\begin{aligned} \left\{ \begin{array}{ll} d \zeta (t) = \ell _x (X^{*}(t), u^{*} (t) ) p (t) dt, t_0 < t \le T,\\ \, \zeta (t_0 ) = \ell (X^{*}(t_0), v ) - \ell (X^{*}(t_0), u^{*} (t_0) ). \end{array} \right. \end{aligned}$$

Then

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0 } \, \mathbb {E}\, \left[ \, \left| \, \frac{1}{\varepsilon } \int _{t_0 }^T \left( \ell ( X_{\varepsilon }(t), u_{\varepsilon }(t) ) - \ell ( X^{*}(t), u^{*} (t) ) \right) dt - \zeta (T) \, \right| ^2 \, \right] = 0. \end{aligned}$$

Proof

Let

$$\eta _{\varepsilon }(t) = \frac{1}{\varepsilon } \int _{t_0 }^t \left( \, \ell ( X_{\varepsilon }(s), u_{\varepsilon }(s) ) - \ell ( X^{*}(s), u^{*} (s) ) \right) ds - \zeta (t),$$

for \(t \in [ t_0, T ]\). Then \(\eta _{\varepsilon }(t_0) = - \left( \ell (X^{*}(t_0), v ) - \ell (X^{*}(t_0), u^{*} (t_0) ) \right) \), and one can proceed as in the proof of Lemma 3 (this case being in fact simpler) to show that \({ \mathbb {E}\, [ \, | \, \eta _{\varepsilon }(T) \, |^2 \, ] \rightarrow 0 }\) as \(\varepsilon \rightarrow 0\).

Let us now, for a \(C^1\) mapping \(\varPsi : K \rightarrow \mathbb {R}\), denote by \(\nabla \varPsi \) the gradient of \(\varPsi \): using the directional derivative \(D\varPsi (x) (k)\) of \(\varPsi \) at a point \(x \in K\) in the direction \(k \in K\), it is defined by \(\langle \nabla \varPsi (x), k \rangle = D\varPsi (x) (k) \, ( = \varPsi _x (k) )\). We shall sometimes write \(\nabla _x \varPsi \) for \(\nabla \varPsi (x)\).

Corollary 1

Under the assumptions of Lemma 4

$$\begin{aligned} \frac{d}{d \varepsilon } \, J ( u_{\varepsilon } (\cdot ) ) {|}_{ \varepsilon = 0} = \mathbb {E}\, [ \, \langle \nabla \, h ( X^{*}(T) ), \, p (T) \rangle + \zeta (T) \, ]. \end{aligned}$$
(28)

Proof

Note that from the definition of the cost functional in (5) we see that

$$\begin{aligned}&\frac{1}{\varepsilon }~ \left[ J ( u_{\varepsilon } (\cdot ) ) - J ( u^{*} (\cdot ) ) \right] = \frac{1}{\varepsilon }~ \mathbb {E}\, \left[ h ( X_{\varepsilon } (T) ) - h ( X^{*}(T))\right. \nonumber \\&\qquad \qquad \qquad \qquad + \, \left. \int _{t_0 }^T \left( \ell (X_{\varepsilon } (s), u_{\varepsilon } (s) ) - \ell (X^{*} (s), u^{*} (s) ) \right) ds \, \right] \nonumber \\&= \mathbb {E}\, \left[ \, \int _0^1 h_x (X^{*} (T) + \lambda ( X_{\varepsilon }(T) - X^{*}(T) ) ) \, \frac{( X_{\varepsilon }(T) - X^{*}(T) )}{\varepsilon } \, d\lambda \right. \nonumber \\&\qquad \qquad \qquad \qquad + \; \left. \frac{1}{\varepsilon }~ \int _{t_0 }^T \left( \ell (X_{\varepsilon } (s), u_{\varepsilon } (s) ) - \ell (X^{*} (s), u^{*} (s) ) \right) ds \, \right] . \end{aligned}$$

Now let \(\varepsilon \rightarrow 0 \) and use the properties of h in (E1), Lemmas 3 and 4 to get (28).

5 Maximum Principle

The maximum principle is a good tool for studying the optimality of controlled SDEs like (4): the dynamic programming approach to similar optimal control problems usually requires the solution of (4) to satisfy a Markov property, cf. for instance [36, Chap. 4], and this property does not hold in general, especially when the driving noise is a martingale.

Let us recall the SDE (4) and the mappings in (5), and define the Hamiltonian \({ H:[ 0, T ] \times \varOmega \times K \times \mathscr {O} \times K \times L_2 (K) \rightarrow \mathbb {R} }\) for \(( t, \omega , x, u, y, z ) \in [ 0, T ] \times \varOmega \times K \times \mathscr {O} \times K \times L_2 (K) \) by

$$\begin{aligned} H ( t, \omega , x, u, y, z )\ {:=}\ \ell (x, u ) + \langle F (x, u ), y \rangle + \langle G (x) \mathscr {Q}^{1/2} (t, \omega ), z \rangle _2 \,. \end{aligned}$$
(29)
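For orientation, in coordinates (\(K = \mathscr {O} = \mathbb {R}^d\), with \(\langle \cdot , \cdot \rangle _2\) the Frobenius inner product) the Hamiltonian (29) can be evaluated as in the following sketch; all concrete choices in it are ours and merely illustrative.

```python
import numpy as np

def hamiltonian(t, x, u, y, z, ell, F, G, Q_half):
    """H(t, x, u, y, z) = ell(x,u) + <F(x,u), y> + <G(x) Q^{1/2}(t), z>_2,
    where <A, B>_2 = trace(A B^T) is the Hilbert-Schmidt inner product."""
    return ell(x, u) + F(x, u) @ y + np.sum(G(x) @ Q_half(t) * z)

# Illustrative data on K = R^2:
d = 2
ell = lambda x, u: 0.5 * x @ x + 0.5 * u @ u
F = lambda x, u: -x + u
G = lambda x: np.diag(0.5 * np.tanh(x))      # G(x) as a matrix (operator on K)
Q_half = lambda t: 0.3 * np.eye(d)           # constant Q^{1/2} for simplicity

print(hamiltonian(0.0, np.array([1.0, -1.0]), np.zeros(d),
                  np.ones(d), np.eye(d), ell, F, G, Q_half))
```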

The adjoint equation of (4) is the following BSDE:

$$\begin{aligned} \left\{ \begin{array}{ll} -\, d Y^{u (\cdot )}(t) = &{} \nabla _{x} H ( t, X^{u (\cdot )} (t), u (t), Y^{u (\cdot )} (t), Z^{u (\cdot )} (t) \mathscr {Q}^{1/2} (t) ) \, dt \\ &{} ~- Z^{u (\cdot )} (t)\, d M(t) - d N^{u (\cdot )} (t), \; \; \; t_0 \le t < T, \\ \; \; \; \, \, Y^{u (\cdot )} (T) = &{} \nabla h (X^{u (\cdot )} (T)). \end{array} \right. \end{aligned}$$
(30)

The following theorem gives the solvability of the BSDE (30), in the sense that there exists a triple \(( Y^{u (\cdot )}, Z^{u (\cdot )}, N^{u (\cdot )} )\) in \({L^2_{\mathscr {F}} ( 0, T ; K )\times \varLambda ^2 ( K ; \mathscr {P}, M ) \times \mathscr {M}^{2, c}_{[ 0, T] } (K) }\) such that the following equality holds a.s. for all \(t \in [ 0, T ]\), with \(N^{u (\cdot )} (0) = 0\) and \(N^{u (\cdot )}\) VSO to M:

$$\begin{aligned} Y^{u (\cdot )} (t)= & {} \nabla h (X^{u (\cdot )} (T)) + \int _t^T \nabla _{x} H ( s, X^{u (\cdot )} (s), u (s), Y^{u (\cdot )} (s), Z^{u (\cdot )} (s) \mathscr {Q}^{1/2} (s) ) \, ds \nonumber \\&- \int _t^T Z^{u (\cdot )} (s ) d M (s) - \int _t^T d N^{u (\cdot )} (s). \end{aligned}$$

Theorem 1

Assume that (E1)–(E2) hold. Then there exists a unique solution \(( Y^{u (\cdot )}, Z^{u (\cdot )}, N^{u (\cdot )} )\) of the BSDE (30).

For the proof of this theorem one can see [2].

We shall denote briefly the solution of (30) corresponding to the optimal control \(u^{*} ( \cdot )\) by \((Y^{*}, Z^{*}, N^{*})\).

In the following lemma we compute \(\mathbb {E}\, [ \, \langle Y^{*} (T), \, p (T) \rangle \, ]\).

Lemma 5

$$\begin{aligned} \mathbb {E} \, [ \, \langle \, Y^{*} (T), p(T) \, \rangle \, ]&= - \; \mathbb {E} \, \left[ \, \int _{t_0 }^T \ell _x (X^{*} (s), u^{*} (s) ) p(s) \, ds \, \right] \nonumber \\&\qquad + \; \mathbb {E} \, \left[ \, \langle Y^{*}(t_0 ), F ( X^{*}(t_0 ), v) - F ( X^{*}(t_0 ), u^{*} (t_0 ) ) \rangle \, \right] .\quad \quad \quad \end{aligned}$$
(31)

Proof

Apply Itô's formula to compute \(d \, \langle Y^{*} (t), p(t) \rangle \) for \(t \in [t_0, T]\), and use the fact that

$$\begin{aligned}&\int _{t_0}^T \langle \, p(s), \nabla _x H ( s, X^{*} (s), u^{*} (s), Y^{*} (s), Z^{*} (s) \mathscr {Q}^{1/2}(s) ) \rangle \, ds \\&\qquad \quad = \int _{t_0}^T \left[ \, \ell _x (X^{*} (s), u^{*} (s) ) p(s) + \langle \, F_x (X^{*} (s), u^{*} (s) ) p(s), Y^{*}(s) \, \rangle \right] \, ds \\&\qquad \qquad + \, \int _{t_0}^T \langle \, G_x ( X^{*} (s) ) p(s) \mathscr {Q}^{1/2} (s), Z^{*}(s) \mathscr {Q}^{1/2} (s) \, \rangle _2 \, ds, \end{aligned}$$

which is easily seen from (29).
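For the reader's convenience, here is a sketch of that computation. The Itô product formula applied to \(\langle Y^{*} (t), p(t) \rangle \) on \([t_0, T]\), after taking expectations (the stochastic integrals are mean-zero martingales, and \(N^{*}\) is VSO to M, so its covariation with \(\int G_x (X^{*}) \, p \, dM\) vanishes), gives

$$\begin{aligned}&\mathbb {E} \, [ \, \langle \, Y^{*} (T), p(T) \, \rangle \, ] - \mathbb {E} \, [ \, \langle \, Y^{*} (t_0 ), p(t_0 ) \, \rangle \, ] \\&\quad = \mathbb {E} \, \Big [ \, \int _{t_0}^T \big ( \langle \, Y^{*}(s), F_x (X^{*} (s), u^{*} (s) ) \, p(s) \, \rangle - \langle \, p(s), \nabla _x H ( s, X^{*} (s), u^{*} (s), Y^{*} (s), Z^{*} (s) \mathscr {Q}^{1/2}(s) ) \, \rangle \\&\qquad \qquad + \, \langle \, G_x ( X^{*} (s) ) \, p(s) \, \mathscr {Q}^{1/2} (s), Z^{*}(s) \, \mathscr {Q}^{1/2} (s) \, \rangle _2 \big ) \, ds \, \Big ], \end{aligned}$$

and by the identity displayed above the right hand side collapses to \(- \, \mathbb {E} \, [ \, \int _{t_0}^T \ell _x (X^{*} (s), u^{*} (s) ) \, p(s) \, ds \, ]\); recalling the initial value \(p(t_0)\) from (8) then gives (31).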

Now we state our main result of this section.

Theorem 2

Suppose (E1)–(E2). If \(( X^{*}, u^{*} (\cdot ) )\) is an optimal pair for the problem (4)–(6), then there exists a unique solution \(( Y^{*}, Z^{*}, N^{*} )\) to the corresponding BSDE (30) such that the following inequality holds:

$$\begin{aligned} H ( t, X^{*} (t ), v, Y^{*} (t ), Z^{*} (t ) \mathscr {Q}^{1/2} (t ) )&\ge \, H ( t, X^{*} (t ), u^{*} (t ), Y^{*} (t), Z^{*} (t ) \mathscr {Q}^{1/2} (t ) ) \nonumber \\&\,\,\,\,\,\,\,\,\text {a.e.} \; t \in [0, T],\; \text {a.s.} \; \forall \; v \in U. \end{aligned}$$
(32)

Proof

We note that since \(u^{*} (\cdot )\) is optimal, \(\frac{d}{d \varepsilon } \, J ( u_{\varepsilon } (\cdot ) )|_{\varepsilon = 0 } \, \ge 0\), which implies by using Corollary 1 that

$$\begin{aligned} \mathbb {E}\, [ \, \langle Y^{*} (T), \, p (T) \rangle + \zeta (T) \, ] \ge 0. \end{aligned}$$
(33)

On the other hand, by applying (33) and Lemma 5 one sees that

$$\begin{aligned}&0 \le - \; \mathbb {E} \, [ \, \int _{t_0 }^T \ell _x (X^{*} (s), u^{*} (s) ) p(s) \, ds \, ] \nonumber \\&\qquad \qquad \quad + \; \mathbb {E} \, [ \, \langle Y^{*}(t_0 ), F ( X^{*}(t_0 ), v) - F ( X^{*}(t_0 ), u^{*} (t_0 ) ) \rangle + \zeta (T) \, ]. \end{aligned}$$
(34)

But

$$ \zeta (T) = \zeta (t_0 ) + \int _{t_0 }^T \ell _x (X^{*} (s), u^{*} (s) ) p(s) \, ds $$

and

$$\begin{aligned}&H ( t_0, X^{*} (t_0 ), v, Y^{*} (t_0 ), Z^{*} (t_0 ) \mathscr {Q}^{1/2} (t_0 ) )\\&\qquad \qquad \qquad - H ( t_0, X^{*} (t_0 ), u^{*} (t_0 ), Y^{*} (t_0), Z^{*} (t_0 ) \mathscr {Q}^{1/2} (t_0 ) )\\&\qquad \quad = \zeta (t_0 ) + \langle Y^{*}(t_0 ), F ( X^{*}(t_0 ), v ) - F ( X^{*}(t_0 ), u^{*} (t_0 ) ) \rangle . \end{aligned}$$

Hence (34) becomes

$$\begin{aligned}&0 \le \mathbb {E} \; [ \, H ( t_0, X^{*} (t_0 ), v, Y^{*} (t_0 ), Z^{*} (t_0 ) \mathscr {Q}^{1/2} (t_0 ) ) \nonumber \\&\qquad \qquad \qquad - \, H ( t_0, X^{*} (t_0 ), u^{*} (t_0 ), Y^{*} (t_0), Z^{*} (t_0 ) \mathscr {Q}^{1/2} (t_0 ) ) \, ]. \end{aligned}$$
(35)

Now varying \(t_0\) as in (20) shows that (35) holds for a.e. t, and so, by arguing for instance as in [10, p. 19], we easily obtain (32).

Remark 3

Let us assume for example that the space K in Theorem 1 is the real space \(\mathbb {R}\) and M is the martingale given by the formula

$$M(t) = \int _0^t \alpha (s) dB(s), t \in [0, T],$$

for some \(\alpha \in L^2_{\mathscr {F}} (0, T ; \mathbb {R} ) \) and a one dimensional Brownian motion B. If \(\alpha (s) > 0\) for each s, then \(\mathscr {F}_t (M) = \mathscr {F}_t (B)\) for each t,  where

$$\mathscr {F}_t (R) = \sigma \{ R(s), 0 \le s \le t \} $$

for \(R = M, B.\) Consequently, by applying the unique representation property for martingales with respect to \(\{ \mathscr {F}_t (M), \; t \ge 0 \}\) (or a larger filtration) in [2, Theorem 2.2] or [5], together with the Brownian martingale representation theorem as e.g. in [14, Theorem 3.4, p. 200], we deduce that the martingale \(N^{ u ( \cdot ) } \) in (30) vanishes almost surely if the filtration furnished for the SDE (4) is \(\{ \mathscr {F}_t (M), \; 0 \le t \le T \}\). This follows from the construction of the solution of the BSDE (30); more details on this matter can be found in [2, Sect. 3]. As a result, in this particular case BSDE (30) fits well with the BSDEs studied by Pardoux and Peng in [27], but with the variable \(\alpha Z\) replacing Z there.

Thus, in particular, we conclude that many of the applications of BSDEs studied in the literature, in both stochastic optimal control and finance (e.g. [37] and the references therein), carry over directly, or after slight modification, to BSDEs driven by martingales. For example, we refer the reader to [23] for a financial application. Another interesting case can be found in [9].

On the other hand, in this respect we present in Sect. 6 an example (see Example 2) obtained by modifying an interesting example due to Bensoussan [10].

6 Sufficient Conditions for Optimality

In the preceding two sections we derived Pontryagin's maximum principle, which gives necessary conditions for optimality for the control problem (4)–(6). In the following theorem we show when these necessary conditions become sufficient as well. Let us assume from here on that U is convex. The result is a variant of Theorem 4.2 in [3].

Theorem 3

Assume (E1) and, for a given \(u^* (\cdot ) \in \mathscr {U}_{ad}\), let \(X^{*}\) and \(( Y^{*}, Z^{*}, N^{*} )\) be the corresponding solutions of Eqs. (4) and (30) respectively. Suppose that the following conditions hold:

  1. (i)

    U is a convex domain in \(\mathscr {O}\), h is convex.

  2. (ii)

    \((x, v) \mapsto H ( t, x, v, Y^{*}(t), Z^{*}(t) \mathscr {Q}^{1/2} (t) )\) is convex for all \(t \in [0, T]\)  a.s.

  3. (iii)

    \(H ( t, X^{*} (t), u^{*} (t)\), \(Y^{*} (t)\), \(Z^{*} (t) \mathscr {Q}^{1/2} (t) )\)

    $$\begin{aligned}= & {} \min _{v \in U } \; H ( t, X^{*}(t), v, Y^{*}(t), Z^{*}(t) \mathscr {Q}^{1/2} (t) ) \end{aligned}$$

for a.e. \(t \in [0, T]\) a.s.

Then \(( X^{*}, u^{*} (\cdot ) )\) is an optimal pair for the control problem (4)–(6).

Proof

Let \(u (\cdot ) \in \mathscr {U}_{ad}\). Consider the following definitions:

$$ I_1\ {:=}\ \mathbb {E} \; \left[ \; \int _0^T \left( \ell ( X^{*}(t), u^{*} (t ) ) - \ell (X^{u (\cdot )}(t), u (t ) ) \right) dt \; \right] $$

and

$$ I_2\ {:=}\ \mathbb {E} \; [ \, h ( X^{*}(T) ) - h ( X^{u (\cdot )}(T) ) \, ]. $$

Then readily

$$\begin{aligned} J ( u^{*} (\cdot ) ) - J ( u (\cdot ) ) = I_1 + I_2. \end{aligned}$$
(36)

Let us define

$$\begin{aligned}&I_3\ {:=}\ \mathbb {E} \; \big [\; \int _0^T \big ( H ( t, X^{*} (t), u^{*} (t), Y^{*} (t), Z^{*} (t) \, \mathscr {Q}^{1/2} (t) )\\&\qquad \qquad \qquad \qquad - \, H ( t, X^{u (\cdot )} (t), u (t), Y^{*} (t), Z^{*} (t) \, \mathscr {Q}^{1/2} (t) ) \big ) dt \; \big ], \end{aligned}$$
$$\begin{aligned} I_4\ {:=}\ \mathbb {E} \; \left[ \, \int _0^T \langle F ( X^{*} (t), u^{*} (t) ) - F ( X^{u (\cdot ) } (t), u (t) ), Y^{*} (t) \rangle \, dt \, \right] , \end{aligned}$$
$$\begin{aligned} I_5\ {:=}\ \mathbb {E} \; \left[ \, \int _0^T \langle \left( G ( X^{*} (t) ) - G ( X^{u (\cdot ) } (t) ) \right) \mathscr {Q}^{1/2} (t), Z^{*}(t) \mathscr {Q}^{1/2} (t) \rangle _2 \, dt \, \right] , \end{aligned}$$

and

$$\begin{aligned}&I_6\ {:=}\ \mathbb {E} \; \left[ \; \int _0^T \langle \nabla _{x} H ( t, X^{*} (t), u^{*} (t), Y^{*} (t) , Z^{*} (t) \mathscr {Q}^{1/2} (t) ), X^{*}(t) - X^{u (\cdot ) }(t) \rangle \, dt \; \right] . \end{aligned}$$

From the definition of H in (29) we get

$$\begin{aligned} I_1 = I_3 - I_4 - I_5. \end{aligned}$$
(37)

On the other hand, from the convexity of h in condition (i) it follows that

$$\begin{aligned} h ( X^{*}(T) ) - h ( X^{u (\cdot )}(T) ) \le \; \langle \, \nabla h ( X^{*}(T) ), X^{*}(T) - X^{u (\cdot )}(T) \, \rangle \; \; a.s., \end{aligned}$$

which implies that

$$\begin{aligned} I_2 \le \mathbb {E} \; [ \, \langle \; Y^{*}(T), X^{*} (T) - X^{u (\cdot ) }(T) \; \rangle \, ]. \end{aligned}$$
(38)

Next by applying Itô’s formula to compute \(d \, \langle Y^{*} (t), X^{*}(t) - X^{u (\cdot ) }(t) \rangle \) and using Eqs. (4) and (30) we find with the help of (38) that

$$\begin{aligned} I_2 \le I_4 + I_5 - I_6 \,. \end{aligned}$$
(39)

Consequently, by considering (36), (37) and (39) it follows that

$$\begin{aligned} J ( u^{*} (\cdot ) ) - J ( u (\cdot ) ) \le I_3 - I_6. \end{aligned}$$
(40)

On the other hand, from the convexity of the mapping \( (x, v) \mapsto H ( t, x, v, Y^{*}(t), Z^{*}(t) \mathscr {Q}^{1/2} (t) ) \) in assumption (ii), the following inequality holds a.s.:

$$\begin{aligned}&\int _0^T \left( H ( t, X^{*} (t), u^{*} (t), Y^{*} (t), Z^{*} (t) \mathscr {Q}^{1/2} (t) ) \right. \\&\qquad \qquad \qquad \qquad \left. - \; H ( t, X^{u (\cdot )} (t), u (t), Y^{*} (t), Z^{*} (t) \mathscr {Q}^{1/2} (t) ) \right) \; dt \\&\le \, \int _0^T \langle \; \nabla _{x} H ( t, X^{*}(t), u^{*} (t), Y^{*} (t), Z^{*} (t) \mathscr {Q}^{1/2} (t) ), \, X^{*} (t) - X^{u (\cdot )} (t) \; \rangle \; dt \\&+ \int _0^T \langle \; \nabla _{u } H ( t, X^{*} (t), u^{*} (t), Y^{*} (t), Z^{*}(t) \mathscr {Q}^{1/2} (t) ), \, u^{*} (t) - u (t) \, \rangle _{\mathscr {O}} \, dt. \end{aligned}$$

As a result

$$\begin{aligned} I_3 \le I_6 + I_7, \end{aligned}$$
(41)

where

$$\begin{aligned}&I_7 = \mathbb {E} \, \left[ \; \int _0^T \langle \; \nabla _{u } H ( t, X^{*} (t), u^{*} (t), Y^{*} (t), Z^{*}(t) \mathscr {Q}^{1/2} (t) ), u^{*} (t) - u (t) \; \rangle _{\mathscr {O} } \, dt \; \right] . \end{aligned}$$

Since the mapping \(v \mapsto H ( t, X^{*} (t), v, Y^{*} (t), Z^{*}(t) \mathscr {Q}^{1/2} (t) )\) attains its minimum over the convex set U at \(v = u^{*} (t)\), by the minimum condition (iii), we have

$$\langle \; \nabla _{u } H ( t, X^{*} (t), u^{*} (t), Y^{*} (t), Z^{*}(t) \mathscr {Q}^{1/2} (t) ), u^{*} (t) - u (t) \; \rangle _{\mathscr {O} } \le 0.$$

Therefore \(I_7 \le 0\), which by (41) implies that \(I_3 - I_6 \le 0\). So (40) becomes

$$ J ( u^{*} (\cdot ) ) - J ( u (\cdot ) ) \le 0.$$

Now since \(u (\cdot ) \in \mathscr {U}_{ad} \) is arbitrary, this inequality proves that \(( X^{*}, u^{*} (\cdot ) )\) is an optimal pair for the control problem (4)–(6) as required.

Example 1

Let m be a continuous square integrable one dimensional martingale with respect to \(\{ \mathscr {F}_t \}_t\) such that \( <m>_t \, = \int _0^t \alpha (s) ds\) for all \(0 \le t \le T\), for some continuous \(\alpha : [0, T] \rightarrow (0, \infty )\). Consider \(M(t) = \beta \, m(t) \; ( = \int _0^t \beta \, d m(s) )\) with \(\beta \ne 0\) a fixed element of K. Then \(M \in \mathscr {M}^{2, c}_{[0, T ]} (K) \) and \(\ll M\gg _t \; = \widetilde{\beta \otimes \beta } \;\int _0^t \alpha (s) ds \), where \(\widetilde{\beta \otimes \beta }\) is the identification of \(\beta \otimes \beta \) in \(L_1 (K)\), that is, \((\widetilde{\beta \otimes \beta }) (k) = \langle \beta , k \rangle \, \beta \) for \(k \in K\). Also \(<M>_t \; = | \beta |^2 \int _0^t \alpha (s) \, ds\). Now letting \(\mathscr {Q} (t) = \widetilde{ \beta \otimes \beta } \; \alpha (t) \) yields \(\ll M\gg _t \; = \int _0^t \mathscr {Q} (s) \, ds\). This process \(\mathscr {Q}(\cdot )\) is bounded, since \(\mathscr {Q} (t) \le \mathscr {Q} \) for all t, where \(\mathscr {Q} = \widetilde{ \beta \otimes \beta } \; \displaystyle {\max _{0 \le t \le T } }\alpha (t)\). It is also easy to see that \(\mathscr {Q}^{1/2} (t) (k) = \frac{\langle \beta , k \rangle \, \beta }{| \beta |} \; \alpha ^{1/2} (t)\); in particular \(\beta \in \mathscr {Q}^{1/2} (t) (K)\).
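These formulas are straightforward to check numerically in coordinates. The following sketch (with arbitrary illustrative choices of \(\beta \) and \(\alpha \)) verifies that the operator \(k \mapsto \frac{\langle \beta , k \rangle \, \beta }{| \beta |} \, \alpha ^{1/2} (t)\) squares to \(\mathscr {Q} (t) = \widetilde{ \beta \otimes \beta } \; \alpha (t)\).

```python
import numpy as np

beta = np.array([1.0, 2.0, -1.0])            # illustrative beta, K = R^3
alpha = lambda t: 1.0 + 0.5 * np.sin(t)      # continuous and positive

Q = lambda t: np.outer(beta, beta) * alpha(t)            # Q(t) = (beta x beta) alpha(t)
Q_half = lambda t: (np.outer(beta, beta) / np.linalg.norm(beta)
                    * np.sqrt(alpha(t)))                 # claimed square root

t = 0.7
print(np.allclose(Q_half(t) @ Q_half(t), Q(t)))          # True: (Q^{1/2})^2 = Q
```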

Let \(K = L^2 ( \mathbb {R}^n )\) and let M be the above martingale. Suppose that \(\mathscr {O} = K\). Assume that \(\tilde{G} \in L_{\mathscr {Q}} (K) \), or is even a bounded linear operator from K into itself, and that \(\tilde{F} \) is a bounded linear operator from \(\mathscr {O}\) into K. Let us consider the SDE:

$$\begin{aligned} \left\{ \begin{array}{ll} d X(t) = \tilde{F} \, u (t) \; d t + \, \langle X (t ), \beta \rangle \; \tilde{G} \; d M(t), \;\; t \in [0, T],\\ \; X(0) = x_0 \in K. \end{array} \right. \end{aligned}$$

For a given fixed element c of K we assume that the cost functional is given by the formula:

$$\begin{aligned} J ( u (\cdot ) ) = \mathbb {E} \, [ \, \int _0^T | u (t) |^2 \, dt \, ] + \mathbb {E} \; [ \; \langle c, X (T) \rangle \, ], \end{aligned}$$

and the value function is

$$ J^{*} = \inf \{ J( u (\cdot ) ) : \; u (\cdot ) \in \mathscr {{U}}_{ad} \}. $$

This control problem can be related to the control problem (4)–(6) as follows. We define

$$ F ( x, u ) = \tilde{F} \, u, \; G (x) = \langle x, \beta \rangle \, \tilde{G}, \; \; \ell (x, u) = |u|^2, \; \; \text {and} \; \; h (x) = \langle c, x \rangle , $$

where \((x, u ) \in K \times \mathscr {O}.\)

The Hamiltonian then becomes the mapping

$$ H: [0, T] \times \varOmega \times K \times \mathscr {O} \times K \times L_2 (K) \rightarrow \mathbb {R}, $$
$$\begin{aligned} H (t, x, u, y, z) = | u |^2 + \langle \tilde{F} \, u, y \rangle + \langle x, \beta \rangle \; \langle \tilde{G} \, \mathscr {Q}^{1/2} (t), z \rangle _2, \end{aligned}$$

\(( t, x, u, y, z ) \in [ 0, T ] \times K \times \mathscr {O} \times K \times L_2 (K).\)

It is obvious that \(H ( t, \cdot , \cdot , y, z )\) is convex with respect to (x, u) for each y and z, and that \( \nabla _{x } H ( t, x, u, y, z ) = \langle \tilde{G} \, \mathscr {Q}^{1/2} (t), z \rangle _2 \; \beta .\)

Next we consider the adjoint BSDE:

$$\begin{aligned} \left\{ \begin{array}{ll} -\, d Y (t) = [\, \langle \tilde{G} \, \mathscr {Q}^{1/2} (t), Z (t) \rangle _2 \; \beta \; ]\; dt - Z (t)\; d M(t) - d N (t), \\ \; \; \; Y (T) = c. \end{array} \right. \end{aligned}$$

This BSDE admits the explicit solution \(Y(t) = c\), since c is non-random; it follows that \(Z(t) = 0 \) and \(N(t) = 0 \) for each \(t \in [ 0, T ]\).

On the other hand, we note that, for fixed (x, y, z), the function \( \mathscr {O} \ni u \mapsto H ( t, x, u, y, z ) \in \mathbb {R} \) attains its minimum at \(u = - \, \frac{1}{2} \, \tilde{F}^{*} \, y\). So we choose our candidate for an optimal control as

$$ u^* (t, \omega ) = - \, \frac{1}{2} \, \tilde{F}^{*} \, Y(t, \omega ) = - \, \frac{1}{2} \, \tilde{F}^{*} \, c \; \; ( \in U\ {:=}\ \mathscr {O} ). $$

With this choice all the requirements of Theorem 3 are satisfied. Consequently \(u^* (\cdot )\) is an optimal control of this control problem, with optimal solution \(\hat{X} \) given by the solution of the following closed loop equation:

$$\begin{aligned} \left\{ \begin{array}{ll} d \hat{X}(t) = - \, \frac{1}{2} \, \tilde{F} \, \tilde{F}^{*} \, Y(t) \, d t + \, \langle \hat{X} (t ), \beta \rangle \; \tilde{G} \, d M(t),\\ \; \hat{X}(0) = x_0 \in K. \end{array} \right. \end{aligned}$$

The value function takes the following value:

$$\begin{aligned} J^{*} = \, \frac{1}{4} \; | \tilde{F}^{*} c |^2 \, T + \mathbb {E} \; [ \; \langle c, \hat{X} (T) \rangle \; ]. \end{aligned}$$
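The optimality of \(u^{*} (\cdot )\) in this example can also be checked by simulation. The sketch below uses finite-dimensional stand-ins chosen by us (the martingale m a standard Brownian motion, i.e. \(\alpha \equiv 1\), and diagonal \(\tilde{F}, \tilde{G}\)) and estimates J for \(u^* = - \frac{1}{2} \tilde{F}^{*} c\) and for two other constant controls; the estimate for \(J(u^*)\) should come out smallest.

```python
import numpy as np

rng = np.random.default_rng(2)
d, T, n, n_paths = 3, 1.0, 200, 20_000
dt = T / n

beta = np.array([0.5, -0.3, 0.2])
c = np.array([1.0, 2.0, 0.5])
x0 = np.ones(d)
F_tilde = np.diag([1.0, 2.0, 0.5])           # stands in for \tilde{F}
G_tilde = 0.4 * np.eye(d)                    # stands in for \tilde{G}
Gbeta = G_tilde @ beta                       # \tilde{G} beta

def J(u):
    """Monte Carlo cost for the constant control u(t) = u."""
    x = np.tile(x0, (n_paths, 1))
    drift = F_tilde @ u
    for _ in range(n):
        dm = rng.normal(scale=np.sqrt(dt), size=n_paths)  # dm(t), alpha = 1
        # dX = F_tilde u dt + <X, beta> G_tilde dM, with dM = beta dm:
        x += drift * dt + np.outer((x @ beta) * dm, Gbeta)
    return T * (u @ u) + (x @ c).mean()

u_star = -0.5 * F_tilde.T @ c
for label, u in [("u*", u_star), ("0", np.zeros(d)), ("u*+0.3", u_star + 0.3)]:
    print(f"J({label}) = {J(u):.4f}")
```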

Remark 4

One could also take \(h (x) = |x|^2, \; x \in K\), in the preceding example and proceed as above. However, a result on the existence and uniqueness of solutions to what we may call "forward-backward stochastic differential equations with martingale noise" would then be needed, and such a result should certainly be very useful for dealing with both this particular case and similar problems.

Example 2

Let \(\mathscr {O} = K\). We are interested in the following linear quadratic example, gleaned from Bensoussan [10, p. 33]. Namely, we consider the SDE:

$$\begin{aligned} \left\{ \begin{array}{ll} d X(t) = ( A(t) X(t) + C(t) u (t) + f(t) )\, d t + \, ( B(t) X(t) + D(t) )\, d M(t), \\ \; X(0) = x_0, \end{array} \right. \quad \quad \end{aligned}$$
(42)

where \(B (t) x = \langle \gamma (t), x \rangle \, \tilde{G} (t) \) for \(x \in K\), and \(A, C : [0,T] \rightarrow L(K, K)\), \(\gamma : [0,T] \rightarrow K\), \(f : [0,T] \rightarrow K\), and \(\tilde{G}, D: [0,T] \rightarrow L_{\mathscr {Q}}(K)\) are measurable and bounded mappings.

Let \(P, Q : [0,T] \rightarrow L(K, K)\) and \(P_1 \in L(K, K)\) be measurable and bounded mappings. Assume that \(P(t)\) and \(P_1\) are symmetric and non-negative definite, and that \(Q(t)\) is symmetric and positive definite with \(Q^{-1}(t)\) bounded. For the SDE (42) we shall assume that the cost functional is

$$\begin{aligned} J(u (\cdot ) )&= \mathbb {E} \, \left[ \, \int _0^T \left( \, \frac{1}{2} \, \langle P(t) X^{u (\cdot ) } (t), X^{u (\cdot ) } (t) \rangle + \frac{1}{2} \, \langle Q(t) u (t), u (t) \rangle \, \right) \, dt \right. \nonumber \\&\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\left. + \, \frac{1}{2} \, \langle P_1 X^{u (\cdot ) } (T), X^{u (\cdot ) } (T) \rangle \, \right] , \end{aligned}$$
(43)

for \(u (\cdot ) \in \mathscr {U}_{ad}.\)

The control problem now is to minimize (43) over the set \(\mathscr {U}_{ad} \), that is, to find an optimal control \(u^{*} ( \cdot ) \in \mathscr {U}_{ad}\) such that

$$\begin{aligned} J(u^{*} ( \cdot ) ) = \inf \{ J(u (\cdot ) ): \; u (\cdot ) \in \, \mathscr {U}_{ad} \}. \end{aligned}$$
(44)

By recalling Remark 1 we can consider this control problem (42)–(44) as a control problem of the type (4)–(6). To this end, we let

$$F(t,x,u) = A(t) x + C(t) u + f(t), $$
$$G(t, x) = \langle \gamma (t), x \rangle \, \tilde{G} (t) + D(t),$$
$$\ell (t,x,u) = \frac{1}{2} \, \langle P(t) x, x \rangle + \frac{1}{2} \, \langle Q(t) u, u \rangle ,$$
$$h(x) = \frac{1}{2} \, \langle P_1 x, x \rangle .$$

Then the Hamiltonian \({ H:[ 0, T ] \times \varOmega \times K \times K \times K \times L_2 (K) \rightarrow \mathbb {R} }\) is given by

$$\begin{aligned}&H ( t, x, u, y, z ) = \ell (t, x, u ) + \langle F (t, x, u ), y \rangle + \langle G (t,x) \mathscr {Q}^{1/2} (t), z \rangle _2 \, \nonumber \\&\qquad \qquad \qquad = \, \frac{1}{2} \, \langle P(t) x, x \rangle + \frac{1}{2} \, \langle Q(t) u, u \rangle + \langle A(t) x + C(t) u + f(t), y \rangle \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad + \, \langle \, ( \langle \gamma (t), x \rangle \, \tilde{G} (t) + D(t) ) \mathscr {Q}^{1/2} (t), z \rangle _2 \,. \end{aligned}$$

We can compute \(\nabla _x H\) directly to find that

$$\nabla _x H ( t, x, u, y, z ) = P(t) x + A^* (t) y + \langle \tilde{G} (t) \mathscr {Q}^{1/2} (t), z \rangle _2 \, \gamma (t).$$

Hence the adjoint equation of (42) takes the following shape:

$$\begin{aligned} \left\{ \begin{array}{ll} -\, d Y^{u (\cdot )}(t) = &{} \Big ( \, A^* (t) Y^{u (\cdot )}(t) + P(t) X^{u (\cdot )}(t) \\ ~ &{} \qquad \qquad \quad + \big < \tilde{G} (t) \mathscr {Q}^{1/2} (t), Z^{u (\cdot )} (t) \mathscr {Q}^{1/2} (t) \big >_2 \, \gamma (t) \, \Big ) \, dt \\ ~ &{} \qquad \qquad \qquad \qquad \qquad \qquad \quad - Z^{u (\cdot )} (t) d M(t) - d N^{u (\cdot )} (t), \\ \; \; \; \, \, Y^{u (\cdot )} (T) = &{} P_1 X^{u (\cdot )} (T). \end{array} \right. \end{aligned}$$

Now the maximum principle theorems (Theorems 2 and 3) hold readily in this case, taking Remark 1 into account again, and they eventually yield

$$ C^* (t) Y^* (t) + Q(t) \, u^* (t) = 0, \quad \text {that is,} \quad u^* (t) = - \, Q^{-1} (t) \, C^* (t) \, Y^* (t). $$
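In coordinates this last identity is a direct linear solve; a minimal sketch (the matrices are illustrative, and \(Y^{*} (t)\) would in practice come from solving the adjoint BSDE above):

```python
import numpy as np

def optimal_control(Q_t, C_t, Y_t):
    """u*(t) = -Q(t)^{-1} C*(t) Y*(t), i.e. C*(t) Y*(t) + Q(t) u*(t) = 0."""
    return -np.linalg.solve(Q_t, C_t.T @ Y_t)

# Illustrative data (Y_t is a placeholder for the adjoint state at time t):
Q_t = np.array([[2.0, 0.0], [0.0, 1.0]])
C_t = np.array([[1.0, 0.5], [0.0, 1.0]])
Y_t = np.array([0.4, -0.2])
u_t = optimal_control(Q_t, C_t, Y_t)
print(u_t, C_t.T @ Y_t + Q_t @ u_t)          # second vector is ~ 0
```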