Abstract
In this article, we consider a stochastic differential game in which the state process is governed by a controlled Itô–Lévy process and the information available to the controllers is possibly less than the full information generated by the system. All the system coefficients and the objective performance functionals are assumed to be random. We use Malliavin calculus to derive a maximum principle for the optimal control of such a problem. The results are applied to solve a worst-case scenario portfolio problem in finance.
Received 2/18/2011; Accepted 5/23/2012; Final 5/29/2012
Keywords
- Malliavin calculus
- Stochastic differential game
- Stochastic control
- Jump diffusion
- Partial information
- Optimal worst-case scenario portfolio
1 Introduction
Suppose the dynamics of a state process \(X(t) = {X}^{(u_{0},u_{1})}(t,\omega )\); t ≥ 0, ω ∈ Ω, is a controlled Itô–Lévy process in ℝ of the form
where the coefficients b : [0, T] × ℝ × U × Ω → ℝ, σ : [0, T] × ℝ × U × Ω → ℝ and γ : [0, T] × ℝ × U × K × ℝ₀ × Ω → ℝ are all continuously differentiable (C¹) with respect to x ∈ ℝ and u₀ ∈ U, u₁ ∈ K for each t ∈ [0, T] and a.a. ω ∈ Ω; U, K are given open convex subsets of ℝ² and ℝ × ℝ₀, respectively. Here ℝ₀ = ℝ ∖ { 0}, B(t) = B(t, ω) and η(t) = η(t, ω), given by
are a one-dimensional Brownian motion and an independent pure jump Lévy martingale, respectively, on a given filtered probability space \((\Omega,\mathcal{F},\{\mathcal{F}_{t}\}_{t\geq 0},P).\) Thus
$$\widetilde{N}(\mathrm{d}t,\mathrm{d}z) := N(\mathrm{d}t,\mathrm{d}z) -\nu (\mathrm{d}z)\mathrm{d}t$$is the compensated Poisson jump measure of η( ⋅), where N(dt, dz) is the Poisson jump measure and ν(dz) is the Lévy measure of the pure jump Lévy process η( ⋅). For simplicity, we assume that
The processes u₀(t) and u₁(t, z) are the control processes and take values in the given open convex sets U and K, respectively, for a.a. t ∈ [0, T], z ∈ ℝ₀, for a given fixed T > 0. Also, u₀( ⋅) and u₁( ⋅) are càdlàg and adapted to a given filtration \(\{\mathcal{E}_{t}\}_{t\geq 0}\), where
$$\mathcal{E}_{t} \subseteq \mathcal{F}_{t}\quad \mbox{ for all }t \in [0,T].$$
Here \(\{\mathcal{E}_{t}\}_{t\geq 0}\) represents the information available to the controller at time t. For example, we could have
$$\mathcal{E}_{t} = \mathcal{F}_{(t-\delta )^{+}}\quad \mbox{ for some constant delay }\delta > 0,$$meaning that the controller receives delayed information compared to \(\mathcal{F}_{t}\). We refer to [15, 12] for more information about stochastic control of Itô diffusions and jump diffusions, respectively, and to [2, 4, 8, 9, 14] for other papers dealing with optimal control under partial information/observation.
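As an illustration of dynamics of this type, the following sketch simulates a one-dimensional controlled Itô–Lévy process by an Euler scheme. The coefficients b, σ, γ, the jump rate, the mark distribution, and the constant control are all simplified placeholder choices, not taken from the paper.

```python
import numpy as np

def simulate_ito_levy(x0, T=1.0, n=1000, lam=2.0, seed=0):
    """Euler scheme for dX = b dt + sigma dB + gamma dN~ on [0, T].

    Placeholder coefficients (hypothetical, for illustration only):
      b(t, x, u)        = -0.5 * x + u    (mean-reverting drift)
      sigma(t, x, u)    = 0.2             (constant volatility)
      gamma(t, x, u, z) = 0.1 * z         (linear jump coefficient)
    Jumps: compound Poisson with rate `lam` and N(0, 1) marks, so the
    compensator contribution lam * E[gamma] dt vanishes since E[z] = 0.
    """
    rng = np.random.default_rng(seed)
    dt = T / n
    u = 0.0  # a constant open-loop control, for illustration only
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        dB = rng.normal(0.0, np.sqrt(dt))
        nj = rng.poisson(lam * dt)  # number of jumps in this time step
        jump = 0.1 * rng.normal(size=nj).sum() if nj else 0.0
        x[k + 1] = x[k] + (-0.5 * x[k] + u) * dt + 0.2 * dB + jump
    return x
```

The first-variation processes appearing in the proof below are derivatives of such trajectories with respect to a perturbation of the control.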
Let f : [0, T] × ℝ × U × K × Ω → ℝ and g : ℝ × Ω → ℝ be given functions, continuously differentiable (C¹) with respect to x ∈ ℝ and u₀ ∈ U, u₁ ∈ K. Suppose there are two players in the stochastic differential game and that their performance functionals are given as follows:
where μ is a measure on ℝ₀ and \({\mathbb{E}}^{x} = \mathbb{E}_{P}^{x}\) denotes the expectation with respect to P given that X(0) = x. Suppose that the controls u₀(t) and u₁(t, z) have the form
$$u_{0}(t) = (\pi _{0}(t),\theta _{0}(t))\quad \mbox{ and}\quad u_{1}(t,z) = (\pi _{1}(t,z),\theta _{1}(t,z)).$$
Let \(\mathcal{A}_{\Pi }\) and \(\mathcal{A}_{\Theta }\) denote the given families of controls π = (π0, π1) and θ = (θ0, θ1), respectively, contained in the set of càdlàg ℰ t -adapted controls, such that Eq. (22.1) has a unique strong solution up to time T and
The partial information non-zero-sum stochastic differential game problem we consider is the following:
Problem 1.1.
Find \(({\pi }^{{\ast}},{\theta }^{{\ast}}) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) (if it exists) such that
-
(i)
J 1(π, θ ∗ ) ≤ J 1(π ∗ , θ ∗ ) for all \(\pi \in \mathcal{A}_{\Pi }\),
-
(ii)
J 2(π ∗ , θ) ≤ J 2(π ∗ , θ ∗ ) for all \(\theta \in \mathcal{A}_{\Theta }\).
Such a control (π ∗ , θ ∗ ) is called a Nash equilibrium (if it exists). The intuitive idea is that there are two players, I and II. Player I controls π and player II controls θ. Given that each player knows the equilibrium strategy chosen by the other, no player has anything to gain by changing only his or her own strategy (i.e., by deviating unilaterally). Note that since we allow b, σ, γ, f and g to be stochastic processes, and also because our controls are required to be ℰ t -adapted, this problem is not of Markovian type and hence cannot be solved by dynamic programming. Our paper is related to the recent papers [1, 10], where a maximum principle for stochastic differential games with partial information and a mean-field maximum principle are dealt with, respectively. However, the approach in [1] requires the solution of a backward stochastic differential equation (BSDE) for the adjoint processes. This is often a difficult point, particularly in the partial information case. In the current paper, we use Malliavin calculus techniques to obtain a maximum principle for this general non-Markovian stochastic differential game with partial information, without the use of BSDEs.
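The unilateral-deviation inequalities (i) and (ii) can be checked numerically in a toy static quadratic game (purely illustrative, far simpler than Problem 1.1). The payoffs J1, J2 below are hypothetical choices whose best responses are p = q and q = p/2, so (0, 0) is a Nash equilibrium.

```python
import numpy as np

# Toy static quadratic game (illustration only, not Problem 1.1):
#   J1(p, q) = -(p - q)^2,  J2(p, q) = -(q - p/2)^2.
# Best responses are p = q and q = p/2, so (p*, q*) = (0, 0).
def J1(p, q):
    return -(p - q) ** 2

def J2(p, q):
    return -(q - p / 2) ** 2

def is_nash(p_star, q_star, grid=np.linspace(-1.0, 1.0, 201)):
    """Check the unilateral-deviation inequalities (i) and (ii) on a grid."""
    no_gain_1 = all(J1(p, q_star) <= J1(p_star, q_star) for p in grid)
    no_gain_2 = all(J2(p_star, q) <= J2(p_star, q_star) for q in grid)
    return no_gain_1 and no_gain_2
```

Any profile other than (0, 0), e.g. (0.5, 0), fails one of the two inequalities, which is exactly the "nothing to gain by deviating unilaterally" property.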
2 The General Maximum Principle for the Stochastic Differential Games
In this section we use Malliavin calculus to solve Problem 1.1. We assume the following:
-
(A1)
For all s, r, t ∈ (0, T), t ≤ r, and all bounded ℰ t -measurable random variables α = α(ω), ξ = ξ(ω), the controls \(\beta _{\alpha }(s) := (0,\beta _{\alpha }^{i}(s))\) and \(\eta _{\xi }(s) := (0,\eta _{\xi }^{i}(s))\), i = 1, 2, with
$$\displaystyle\begin{array}{rcl} \beta _{\alpha }^{i}(s) = {\alpha }^{i}(\omega )\chi _{ [t,r]}(s)\mbox{ and }\eta _{\xi }^{i}(s) = {\xi }^{i}(\omega )\chi _{ [t,r]}(s);\quad s \in [0,T],& & \\ \end{array}$$belong to \(\mathcal{A}_{\Pi }\) and \(\mathcal{A}_{\Theta }\), respectively. Also, we will denote the transposes of the vectors β and η by β ∗ and η ∗ , respectively.
-
(A2)
For all \(\pi,\beta \in \mathcal{A}_{\Pi }\); \(\theta,\eta \in \mathcal{A}_{\Theta }\) with β and η bounded, there exists δ > 0 such that the controls π(t) + yβ(t) and θ(t) + υη(t), t ∈ [0, T], belong to \(\mathcal{A}_{\Pi }\) and \(\mathcal{A}_{\Theta }\), respectively, for all y, υ ∈ ( − δ, δ), and such that the families
$$\displaystyle\begin{array}{rcl} & \{\frac{\partial f_{1}} {\partial x} (t,{X}^{(\pi +y\beta,\theta )}(t),\pi + y\beta,\theta,z) \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\pi +y\beta,\theta )}(t) & \\ & \qquad + \nabla _{\pi }f_{1}(t,{X}^{(\pi +y\beta,\theta )}(t),\pi + y\beta,\theta,z){\beta }^{{\ast}}(t)\}_{y\in (-\delta,\delta )},& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} & \{\frac{\partial f_{2}} {\partial x} (t,{X}^{(\pi,\theta +\upsilon \eta )}(t),\pi,\theta + \upsilon \eta,z) \frac{\mathrm{d}} {\mathrm{d}\upsilon }{X}^{(\pi,\theta +\upsilon \eta )}(t) & \\ & \qquad + \nabla _{\theta }f_{2}(t,{X}^{(\pi,\theta +\upsilon \eta )}(t),\pi,\theta + \upsilon \eta,z){\eta }^{{\ast}}(t)\}_{\upsilon \in (-\delta,\delta )}& \\ \end{array}$$are λ ×ν ×P-uniformly integrable and the families
$$\displaystyle\begin{array}{rcl} \{g_{1}^{\prime}({X}^{(\pi +y\beta,\theta )}(T)) \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\pi +y\beta,\theta )}(T)\}_{ y\in (-\delta,\delta )},& & \\ \quad \mbox{ }\{g_{2}^{\prime}({X}^{(\pi,\theta +\upsilon \eta )}(T)) \frac{\mathrm{d}} {\mathrm{d}\upsilon }{X}^{(\pi,\theta +\upsilon \eta )}(T)\}_{ \upsilon \in (-\delta,\delta )}& & \\ \end{array}$$are P-uniformly integrable.
In the following, D t F denotes the Malliavin derivative with respect to B(⋅) (at t) of a given (Malliavin differentiable) random variable F = F(ω); ω ∈ Ω. Similarly, D t, z F denotes the Malliavin derivative with respect to \(\widetilde{N}(\cdot,\cdot )\) (at t,z) of F. We let \(\mathbb{D}_{1,2}\) denote the set of all random variables which are Malliavin differentiable with respect to both B( ⋅) and \(\widetilde{N}(\cdot,\cdot )\). We will use the following duality formulas for Malliavin derivatives:
$$\mathbb{E}\Big{[}F\displaystyle\int _{0}^{T}\varphi (t)\mathrm{d}B(t)\Big{]} = \mathbb{E}\Big{[}\displaystyle\int _{0}^{T}\varphi (t)D_{t}F\mathrm{d}t\Big{]},$$(22.7)$$\mathbb{E}\Big{[}F\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}}\psi (t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)\Big{]} = \mathbb{E}\Big{[}\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}}\psi (t,z)D_{t,z}F\nu (\mathrm{d}z)\mathrm{d}t\Big{]},$$(22.8)valid for all Malliavin differentiable F and all ℱ t -predictable processes φ and ψ such that the integrals on the right converge absolutely. We also need the following basic properties of Malliavin derivatives:
If \(F \in \mathbb{D}_{1,2}\) is ℱ s -measurable, then
$$D_{t}F = D_{t,z}F = 0\quad \mbox{ for all }t > s.$$
(Fundamental theorem)
$$D_{t}\Big{(}\displaystyle\int _{0}^{T}u(s)\delta B(s)\Big{)} = u(t) +\displaystyle\int _{0}^{T}D_{t}u(s)\delta B(s),$$where ∫ 0 T u(s)δB(s) denotes the Skorohod integral of u with respect to B( ⋅). (See [11], pp. 35–38 for a definition of Skorohod integrals and for more details.)
provided that all terms involved are well defined. We refer to [3], [5], [6], [7], [10] and [11] for more information about the Malliavin calculus for Lévy processes and its applications.
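The Brownian duality formula above can be sanity-checked by Monte Carlo for a simple choice of F. The sketch below takes F = exp(B(T)), so that D_tF = exp(B(T)) for t ≤ T, and φ ≡ 1; both are hypothetical choices made only for illustration.

```python
import numpy as np

def duality_check(T=1.0, m=400000, seed=1):
    """Monte Carlo check of E[F * int_0^T phi dB] = E[int_0^T phi(t) D_t F dt]
    for F = exp(B(T)) (so D_t F = exp(B(T)) for t <= T) and phi = 1.
    Both sides equal T * exp(T/2); equality holds up to Monte Carlo error."""
    rng = np.random.default_rng(seed)
    BT = rng.normal(0.0, np.sqrt(T), size=m)  # samples of B(T) ~ N(0, T)
    lhs = np.mean(np.exp(BT) * BT)            # E[F * B(T)] = E[F * int phi dB]
    rhs = T * np.mean(np.exp(BT))             # E[int_0^T D_t F dt] = T * E[F]
    return lhs, rhs
```

With T = 1 both sides approximate e^{1/2}; the discrepancy shrinks at the usual Monte Carlo rate as m grows.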
-
(A3)
For all \((\pi,\theta ) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\), we assume that the following processes exist, for i = 1, 2:
$$\displaystyle\begin{array}{rcl} K_{i}(t) = g_{i}^{\prime}(X(T)) +\displaystyle\int _{ t}^{T}\displaystyle\int _{ \mathbb{R}_{0}} \frac{\partial f_{i}} {\partial x} (s,X(s),\pi,\theta,z_{1})\mu (\mathrm{d}z_{1})\mathrm{d}s,& & \end{array}$$(22.12)$$\displaystyle\begin{array}{rcl} H_{i}^{0}(s,x,\pi,\theta ) =& K_{ i}(s)b(s,x,\pi _{0},\theta _{0}) + D_{s}K_{i}(s)\sigma (s,x,\pi _{0},\theta _{0})& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}K_{i}(s)\gamma (s,x,\pi,\theta,z)\nu (\mathrm{d}z), & \end{array}$$(22.13)$$\displaystyle\begin{array}{rcl} G(t,s) :=& \exp \left (\displaystyle\int _{t}^{s}\{ \frac{\partial b} {\partial x}(r,X(r),\pi _{0}(r),\theta _{0}(r))\right. & \\ & -\frac{1} {2}\Big{(}\frac{\partial \sigma } {\partial x}{\Big{)}}^{2}(r,X(r),\pi _{ 0}(r),\theta _{0}(r))\}\mathrm{d}r & \\ & +\displaystyle\int _{t}^{s}\frac{\partial \sigma } {\partial x}(r,X(r),\pi _{0}(r),\theta _{0}(r))\mathrm{d}B(r) & \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\ln \left (1 + \frac{\partial \gamma } {\partial x}(r,X({r}^{-}),\pi ({r}^{-},z),\theta ({r}^{-},z),z)\right )\tilde{N}(\mathrm{d}r,\mathrm{d}z)& \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\{\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}(r,X(r),\pi,\theta,z)\Big{)} & \\ & -\frac{\partial \gamma } {\partial x}(r,X(r),\pi,\theta,z)\}\nu (\mathrm{d}z)\mathrm{d}r\Big{)}, & \end{array}$$(22.14)$$F_{i}(t,s) := \frac{\partial H_{i}^{0}} {\partial x} (s)G(t,s),$$(22.15)$$\displaystyle p_{i}(t) = K_{i}(t)+\displaystyle\int _{t}^{T} \frac{\partial H_{i}^{0}} {\partial x} (s,X(s),\pi _{0}(s),\pi _{1}(s,z),\theta _{0}(s),\theta _{1}(s,z))G(t,s)\mathrm{d}s,$$(22.16)$$q_{i}(t) = D_{t}p_{i}(t),$$(22.17)$$r_{i}(t,z) = D_{t,z}p_{i}(t),$$(22.18)all exist for 0 ≤ t ≤ s, z ∈ ℝ 0.
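In the continuous case (γ ≡ 0) with the coefficient derivatives frozen to constants, the exponential G(t, s) in Eq. (22.14) reduces to a geometric Brownian exponential. The sketch below checks pathwise, under these simplifying placeholder assumptions, that this exponential agrees with an Euler solution of the corresponding linear SDE.

```python
import numpy as np

def doleans_check(a=0.3, s=0.4, T=1.0, n=200000, seed=2):
    """Pathwise check, without jumps, that the exponential of (22.14),
    G(0, T) = exp((a - s^2/2) T + s B(T)), matches the Euler solution of
    the linear SDE dG = a G dt + s G dB, G(0) = 1. Here the constants a
    and s are placeholders for the frozen derivatives db/dx and dsigma/dx."""
    rng = np.random.default_rng(seed)
    dt = T / n
    dB = rng.normal(0.0, np.sqrt(dt), size=n)
    # Euler step G_{k+1} = G_k (1 + a dt + s dB_k) telescopes to a product.
    g_euler = np.prod(1.0 + a * dt + s * dB)
    g_exact = np.exp((a - 0.5 * s ** 2) * T + s * dB.sum())
    return g_euler, g_exact
```

The same exponential, with the extra logarithmic jump terms of (22.14), plays the role of the fundamental solution of the first-variation equation in the proof below.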
We now define the Hamiltonians for this general problem as follows:
Definition 2.1 (The General Stochastic Hamiltonian).
The general stochastic Hamiltonians for the stochastic differential game in Problem 1.1 are the functions
$$H_{i} : [0,T] \times \mathbb{R} \times U \times K \times \Omega \rightarrow \mathbb{R};\quad i = 1,2,$$
defined by
$$\displaystyle\begin{array}{rcl} H_{i}(t,x,\pi,\theta,\omega ) =& \displaystyle\int _{\mathbb{R}_{0}}f_{i}(t,x,\pi,\theta,z,\omega )\mu (\mathrm{d}z) + p_{i}(t)b(t,x,\pi _{0},\theta _{0},\omega ) & \\ & + D_{t}p_{i}(t)\sigma (t,x,\pi _{0},\theta _{0},\omega ) +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}p_{i}(t)\gamma (t,x,\pi,\theta,z,\omega )\nu (\mathrm{d}z),& \end{array}$$
where π = (π0, π1) and θ = (θ0, θ1).
Remark 2.1.
In the classical case, the Hamiltonian \(H_{i}^{{\ast}} : [0,T] \times \mathbb{R} \times U \times K \times \mathbb{R} \times \mathbb{R} \times \mathcal{R}\rightarrow \mathbb{R}\) is defined by
$$\displaystyle\begin{array}{rcl} H_{i}^{{\ast}}(t,x,\pi,\theta,p,q,r) =& \displaystyle\int _{\mathbb{R}_{0}}f_{i}(t,x,\pi,\theta,z)\mu (\mathrm{d}z) + pb(t,x,\pi _{0},\theta _{0}) + q\sigma (t,x,\pi _{0},\theta _{0})& \\ & +\displaystyle\int _{\mathbb{R}_{0}}r_{i}(t,z)\gamma (t,x,\pi,\theta,z)\nu (\mathrm{d}z),& \end{array}$$
where \(\mathcal{R}\) is the set of functions r i : ℝ ×ℝ 0 → ℝ; i = 1, 2; see [12]. Thus the relation between H i ∗ and H i is that
$$H_{i}(t,x,\pi,\theta,\omega ) = H_{i}^{{\ast}}(t,x,\pi,\theta,p_{i}(t),q_{i}(t),r_{i}(t,\cdot )),\quad i = 1,2,$$
where p( ⋅), q( ⋅) and r( ⋅, ⋅) are given by Eqs. (22.16)–(22.18).
Theorem 2.1 (Maximum principle for non-zero-sum games).
-
(i)
Let \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) be a Nash equilibrium with corresponding state process \(\hat{X}(t) = {X}^{(\hat{\pi },\hat{\theta })}(t)\) , i.e.,
$$\displaystyle\begin{array}{rcl} J_{1}(\pi,\hat{\theta })& \leq J_{1}(\hat{\pi },\hat{\theta }),\qquad \mbox{ for all }\pi \in \mathcal{A}_{\Pi },& \\ J_{2}(\hat{\pi },\theta )& \leq J_{2}(\hat{\pi },\hat{\theta }),\qquad \mbox{ for all }\theta \in \mathcal{A}_{\Theta }.& \\ \end{array}$$Assume that the random variables \(\frac{\partial f_{i}} {\partial x}\) and F i (t,s), i = 1,2, belong to \(\mathbb{D}_{1,2}\) . Then
$${\mathbb{E}}^{x}[\nabla _{ \pi }\hat{H}_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi,\hat{\theta },\omega )\vert _{ \pi =\hat{\pi }}\;\vert \mathcal{E}_{t}] = 0,$$(22.22)$${\mathbb{E}}^{x}[\nabla _{ \theta }\hat{H}_{2}(t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi },\theta,\omega )\vert _{ \theta =\hat{\theta }}\;\vert \mathcal{E}_{t}] = 0,$$(22.23)for a.a. t,ω.
-
(ii)
Conversely, suppose that there exists \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) such that Eqs. (22.22) and (22.23) hold. Then
$$\displaystyle\begin{array}{rcl} \frac{\partial } {\partial y}J_{1}(\hat{\pi } + y\beta,\hat{\theta }){\vert }_{y=0}& = 0\quad \text{for all}\;\beta,& \\ \frac{\partial } {\partial \upsilon }J_{2}(\hat{\pi },\hat{\theta } + \upsilon \eta ){\vert }_{\upsilon =0}& = 0\quad \text{for all}\;\eta.& \\ \end{array}$$In particular, if
$$\displaystyle\begin{array}{rcl} \pi \rightarrow J_{1}(\pi,\hat{\theta })\qquad \mbox{ and}\qquad \theta \rightarrow J_{2}(\hat{\pi },\theta ),& & \end{array}$$(22.24)are concave, then \((\hat{\pi },\hat{\theta })\) is a Nash equilibrium.
Proof.
-
(i)
Suppose \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) is a Nash equilibrium. Since (i) and (ii) hold for all π and θ, \((\hat{\pi },\hat{\theta })\) is a directional critical point for J i (π, θ), i = 1, 2, in the sense that for all bounded \(\beta \in \mathcal{A}_{\Pi }\) and \(\eta \in \mathcal{A}_{\Theta }\), there exists δ > 0 such that \(\hat{\pi } + y\beta \in \mathcal{A}_{\Pi }\), \(\hat{\theta } + \upsilon \eta \in \mathcal{A}_{\Theta }\) for all y, υ ∈ ( − δ, δ). Then we have
$$\displaystyle\begin{array}{rcl} 0& =& \frac{\partial } {\partial y}J_{1}\left (\hat{\pi } + y\beta,\hat{\theta }\right ){\vert }_{y=0} \\ & =& {\mathbb{E}}^{x} \left [\displaystyle\int _{ 0}^{T}\displaystyle\int _{ \mathbb{R}_{0}} \left \{\frac{\partial f_{1}} {\partial x} (t,\hat{X}(t),\hat{\pi }_{0}(t),\hat{\pi }_{1}(t,z),\hat{\theta }_{0}(t),\hat{\theta }_{1}(t,z),z) \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\hat{\pi }+y\beta,\hat{\theta })}(t){\vert }_{ y=0}\right.\right. \\ & & +\nabla _{\pi }f_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi _{ 0}(t),\pi _{1}(t,z),\hat{\theta }_{0}(t),\hat{\theta }_{1}(t,z),z){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(t)\}\mu (\mathrm{d}z)\mathrm{d}t \\ & & +g_{1}^{\prime}(\hat{X}(T)) \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\hat{\pi }+y\beta,\hat{\theta })}(T){\vert }_{ y=0}\Big{]} \\ & =& {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ 0}^{T}\displaystyle\int _{ \mathbb{R}_{0}}\{\frac{\partial f_{1}} {\partial x} (t,\hat{X}(t),\hat{\pi }_{0}(t),\hat{\pi }_{1}(t,z),\hat{\theta }_{0}(t),\hat{\theta }_{1}(t,z),z)Y (t) \\ & & +\nabla _{\pi }f_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi _{ 0}(t),\pi _{1}(t,z),\hat{\theta }_{0}(t),\hat{\theta _{1}}(t,z),z){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(t)\} \\ & & \times \mu (\mathrm{d}z)\mathrm{d}t+g_{1}^{\prime}(\hat{X}(T))Y (T)\Big{]}, \end{array}$$(22.25)where
$$\displaystyle\begin{array}{rcl} Y (t) =& {Y }^{(\beta )}(t) = \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\hat{\pi }+y\beta,\hat{\theta })}(t)\vert _{ y=0} & \\ =& \displaystyle\int _{0}^{t}\{ \frac{\partial b} {\partial x}(s,\hat{X}(s),\hat{\pi }_{0}(s),\hat{\theta }_{0}(s))Y (s) & \\ & +\nabla _{\pi }b(s,{X}^{(\pi,\hat{\theta })}(s),\pi _{0}(s),\hat{\theta }_{0}(s)){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(s)\}\mathrm{d}s & \\ & +\displaystyle\int _{0}^{t}\{\frac{\partial \sigma } {\partial x}(s,\hat{X}(s),\hat{\pi }_{0}(s),\hat{\theta }_{0}(s))Y (s) & \\ & +\nabla _{\pi }\sigma (s,{X}^{(\pi,\hat{\theta })}(s),\pi _{0}(s),\hat{\theta }_{0}(s)){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(s)\}\mathrm{d}B(s) & \\ & +\displaystyle\int _{0}^{t}\displaystyle\int _{\mathbb{R}_{0}}\{\frac{\partial \gamma } {\partial x}(s,\hat{X}({s}^{-}),\hat{\pi }({s}^{-}),\hat{\theta }({s}^{-}),z)Y (s) & \\ & +\nabla _{\pi }\gamma (s,{X}^{(\pi,\hat{\theta })}({s}^{-}),\pi ({s}^{-}),\hat{\theta }({s}^{-}),z){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(s)\}\tilde{N}(\mathrm{d}s,\mathrm{d}z).& \end{array}$$(22.26)If we use the shorthand notation
$$\displaystyle\begin{array}{rcl} \frac{\partial f_{1}} {\partial x} (t,\hat{X}(t),\hat{\pi },\hat{\theta },z)=\frac{\partial f_{1}} {\partial x} (t,z),\;\;\nabla _{\pi }f_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi,\hat{\theta },z)\vert _{ \pi =\hat{\pi }}=\nabla _{\pi }f_{1}(t,z),& & \\ \end{array}$$and similarly for \(\frac{\partial b} {\partial x}\), ∇ π b, \(\frac{\partial \sigma } {\partial x}\), ∇ πσ, \(\frac{\partial \gamma } {\partial x}\) and ∇ πγ, we can write
$$\left \{\begin{array}{ll} \mathrm{d}Y (t) =&( \frac{\partial b} {\partial x}(t)Y (t)+\nabla _{\pi }b(t){\beta }^{{\ast}}(t))\mathrm{d}t+(\frac{\partial \sigma } {\partial x}(t)Y (t) + \nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t))\mathrm{d}B(t) \\ & +\displaystyle\int _{\mathbb{R}_{0}}(\frac{\partial \gamma } {\partial x}(t)Y (t) + \nabla _{\pi }\gamma (t,z){\beta }^{{\ast}}(t))\tilde{N}(\mathrm{d}t,\mathrm{d}z); \\ Y (0) & = 0.\end{array} \right.$$(22.27)By the duality formulas (22.7) and (22.8) and the Fubini theorem, we get
$$\displaystyle\begin{array}{rcl} & {\mathbb{E}}^{x}\left [\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (t,z)Y (t)\mu (\mathrm{d}z)\mathrm{d}t\right ] & \\ & \qquad \quad = {\mathbb{E}}^{x}\bigg{[}\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}}\{\displaystyle\int _{0}^{t}\Big{(}\frac{\partial f_{1}} {\partial x} (t,z)\Big{[} \frac{\partial b} {\partial x}(s)Y (s) + \nabla _{\pi }b(s){\beta }^{{\ast}}(s)\Big{]} & \\ & \qquad \qquad + D_{s}\frac{\partial f_{1}} {\partial x} (t,z)\Big{[}\frac{\partial \sigma } {\partial x}(s)Y (s) + \nabla _{\pi }\sigma (s){\beta }^{{\ast}}(s)\Big{]} & \\ & \qquad \qquad +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z_{1}} \frac{\partial f_{1}} {\partial x} (t,z)\Big{[}\frac{\partial \gamma } {\partial x}(s,z_{1})Y (s) & \\ & \qquad \qquad + \nabla _{\pi }\gamma (s,z_{1}){\beta }^{{\ast}}(s)\Big{]}\nu (\mathrm{d}z_{1})\Big{)}\mathrm{d}s\}\mu (\mathrm{d}z)\mathrm{d}t\bigg{]} & \\ & \qquad \quad = {\mathbb{E}}^{x}\bigg{[}\displaystyle\int _{0}^{T}\{\Big{(}\displaystyle\int _{s}^{T}\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (t,z)\mu (\mathrm{d}z)\mathrm{d}t\Big{)}\Big{[} \frac{\partial b} {\partial x}Y (s) + \nabla _{\pi }b(s){\beta }^{{\ast}}(s)\Big{]}& \\ & \qquad \qquad + \Big{(}\displaystyle\int _{s}^{T}\displaystyle\int _{\mathbb{R}_{0}}D_{s}\frac{\partial f_{1}} {\partial x} (t,z)\mu (\mathrm{d}z)\mathrm{d}t\Big{)}\Big{[}\frac{\partial \sigma } {\partial x}Y (s) + \nabla _{\pi }\sigma {\beta }^{{\ast}}(s)\Big{]} & \\ & \qquad \qquad +\displaystyle\int _{\mathbb{R}_{0}}\Big{(}\displaystyle\int _{s}^{T}\displaystyle\int _{\mathbb{R}_{0}}D_{s,z_{1}} \frac{\partial f_{1}} {\partial x} (t,z)\mu (\mathrm{d}z)\mathrm{d}t\Big{)} & \\ & \qquad \qquad \times \Big{[}\frac{\partial \gamma } {\partial x}Y (s) + \nabla _{\pi }\gamma {\beta }^{{\ast}}(s)\Big{]}\nu (\mathrm{d}z_{ 1})\}\mathrm{d}s\bigg{]}. 
& \end{array}$$(22.28)Changing notation s → t and z 1 → z, this becomes
$$\displaystyle\begin{array}{rcl} & {\mathbb{E}}^{x}\bigg{[}\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (t,z)Y (t)\mu (\mathrm{d}z)\mathrm{d}t\bigg{]} & \\ & \quad = {\mathbb{E}}^{x}\bigg{[}\displaystyle\int _{0}^{T}\{\Big{(}\displaystyle\int _{t}^{T}\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (s,z_{1})\mu (\mathrm{d}z_{1})\mathrm{d}s\Big{)}\Big{[} \frac{\partial b} {\partial x}(t)Y (t) + \nabla _{\pi }b(t){\beta }^{{\ast}}(t)\Big{]}& \\ & \qquad + \Big{(}\displaystyle\int _{t}^{T}\displaystyle\int _{\mathbb{R}_{0}}D_{t}\frac{\partial f_{1}} {\partial x} (s,z_{1})\mu (\mathrm{d}z_{1})\mathrm{d}s\Big{)}\Big{[}\frac{\partial \sigma } {\partial x}(t)Y (t) + \nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t)\Big{]} & \\ & \qquad +\displaystyle\int _{\mathbb{R}_{0}}\Big{(}\displaystyle\int _{t}^{T}\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\frac{\partial f_{1}} {\partial x} (s,z_{1})\mu (\mathrm{d}z_{1})\mathrm{d}s\Big{)} & \\ & \qquad \times \Big{[}\frac{\partial \gamma } {\partial x}(t,z)Y (t) + \nabla _{\pi }\gamma (t,z){\beta }^{{\ast}}(t)\Big{]}\nu (\mathrm{d}z)\}\mathrm{d}t\bigg{]}. & \end{array}$$(22.29)On the other hand, by the duality formulas (22.7) and (22.8), we get
$$\displaystyle\begin{array}{rcl} {\mathbb{E}}^{x } \Big{[} g_{ 1}^{\prime}(\hat{X}(T))Y (T)\Big{]}& = {\mathbb{E}}^{x}\Big{[}g_{ 1}^{\prime}(\hat{X}(T))\Big{(}\displaystyle\int _{0}^{T}\left \{ \frac{\partial b} {\partial x}(t)Y (t) + \nabla _{\pi }b(t){\beta }^{{\ast}}(t)\right \}\mathrm{d}t & \\ & \quad\quad \quad + \displaystyle\int _{0}^{T}\left \{\frac{\partial \sigma } {\partial x}(t)Y (t) + \nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t)\right \}\mathrm{d}B(t) & \\ & \quad\quad \quad + \displaystyle\int _{0}^{T}\displaystyle\int _{ \mathbb{R}_{0}}\left \{ \frac{\partial \gamma } {\partial x}(t,z)Y (t) + \nabla _{\pi }\gamma (t,z)\beta (t)\right \}\tilde{N}(\mathrm{d}t,\mathrm{d}z)\Big{)}\Big{]} & \\ & = {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ 0}^{T}\{g_{ 1}^{\prime}(\hat{X}(T)) \frac{\partial b} {\partial x}(t)Y (t) + g_{1}^{\prime}(\hat{X}(T))\nabla _{\pi }b(t){\beta }^{{\ast}}(t) & \\ & \quad\quad \quad + D_{t}(g_{1}^{\prime}(\hat{X}(T)))\frac{\partial \sigma } {\partial x}(t)Y (t) + D_{t}(g_{1}^{\prime}(\hat{X}(T)))\nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t)& \\ & \quad\quad \quad + \displaystyle\int _{\mathbb{R}_{0}}[D_{t,z}(g_{1}^{\prime}(\hat{X}(T)))\frac{\partial \gamma } {\partial x}(t,z)Y (t) & \\ & \quad\quad \quad + D_{t,z}(g_{1}^{\prime}(\hat{X}(T)))\nabla _{\pi }\gamma (t,z){\beta }^{{\ast}}(t)]\nu (\mathrm{d}z)\}\mathrm{d}t\Big{]}. & \\ \end{array}$$We recall that
$$\hat{K}_{1}(t) := g_{1}^{\prime}(\hat{X}(T)) +\displaystyle\int _{ t}^{T}\displaystyle\int _{ \mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (s,z_{1})\mu (\mathrm{d}z_{1})\;\mathrm{d}s,$$and combining Eqs. (22.27)–(22.29), we get
$$\displaystyle\begin{array}{rcl} & {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{0}^{T}\{\hat{K}_{1}(t)\Big{(} \frac{\partial b} {\partial x}(t)Y (t) + \nabla _{\pi }b(t){\beta }^{{\ast}}(t)\Big{)} & \\ & \qquad \qquad + D_{t}\hat{K}_{1}(t)\Big{(}\frac{\partial \sigma } {\partial x}(t)Y (t) + \nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t)\Big{)} & \\ & \qquad \qquad +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\hat{K}_{1}(t)\Big{(}\frac{\partial \gamma } {\partial x}(t,z)Y (t) + \nabla _{\pi }\gamma (t,z){\beta }^{{\ast}}(t)\Big{)}\nu (\mathrm{d}z)& \\ & \qquad \qquad +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(t,z){\beta }^{{\ast}}(t)\mu (\mathrm{d}z)\}\mathrm{d}t\Big{]} = 0. & \end{array}$$(22.30)Now apply this to \(\beta = \beta _{\alpha } \in \mathcal{A}_{\Pi }\) of the form βα(s) = αχ[t, t + h](s), for some t, h ∈ (0, T), t + h ≤ T, where α = α(ω) is bounded and ℰ t -measurable. Then \({Y }^{(\beta _{\alpha })}(s) = 0\) for 0 ≤ s ≤ t. Hence Eq. (22.30) becomes
$$A_{1} + A_{2} = 0,$$(22.31)where
$$\displaystyle\begin{array}{rcl} A_{1} =\ & {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{t}^{T}\{\hat{K}_{1}(s) \frac{\partial b} {\partial x}(s) + D_{s}\hat{K}_{1}(s)\frac{\partial \sigma } {\partial x}(s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{1}(s)\frac{\partial \gamma } {\partial x}(s)\nu (\mathrm{d}z)\}{Y }^{(\beta _{\alpha })}(s)\mathrm{d}s\Big{]},& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} A_{2} =\ & {\mathbb{E}}^{x}\Big{[}\{\displaystyle\int _{t}^{t+h} \Big{(}\hat{K}_{1}(s)\nabla _{\pi }b(s) + D_{s}\hat{K}_{1}(s) \nabla _{\pi }\sigma (s)& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{1}(s)\nabla _{\pi }\gamma (s,z)\nu (\mathrm{d}z) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(s,z)\mu (\mathrm{d}z)\Big{)}\mathrm{d}s\}\alpha \Big{]}. & \\ \end{array}$$Note that, by Eq. (22.26), with \(Y (s) = {Y }^{(\beta _{\alpha })}(s)\) and s ≥ t + h,
$$\displaystyle\begin{array}{rcl} \mathrm{d}Y (s) = Y ({s}^{-})\{ \frac{\partial b} {\partial x}(s)\mathrm{d}s + \frac{\partial \sigma } {\partial x}(s)\mathrm{d}B(s) +\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial \gamma } {\partial x}({s}^{-},z)\tilde{N}(\mathrm{d}s,\mathrm{d}z)\}\,,& & \\ \end{array}$$for s ≥ t + h. Hence, by the Itô formula,
$$Y (s) = Y (t + h)G(t + h,s);\qquad s \geq t + h,$$(22.32)where, in general, for s ≥ t,
$$\displaystyle\begin{array}{rcl} G(t,s) =& \exp \Big{(}\displaystyle\int _{t}^{s}\{ \frac{\partial b} {\partial x}(r) -\tfrac{1} {2}\Big{(}\frac{\partial \sigma } {\partial x}{\Big{)}}^{2}(r)\}\mathrm{d}r +\displaystyle\int _{ t}^{s}\frac{\partial \sigma } {\partial x}(r)\mathrm{d}B(r)& \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}({r}^{-},z)\Big{)}\tilde{N}(\mathrm{d}r,\mathrm{d}z) & \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\{\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}(r,z)\Big{)} -\frac{\partial \gamma } {\partial x}(r,z)\}\nu (\mathrm{d}z)\mathrm{d}r\Big{)}.& \end{array}$$(22.33)Note that G(t, s) does not depend on h. Put
$$\displaystyle\begin{array}{rcl} H_{1}^{0}(s,x,\pi,\theta ) =& K_{ 1}(s)b(s,x,\pi _{0},\theta _{0}) + D_{s}K_{1}(s)\sigma (s,x,\pi _{0},\theta _{0})& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}K_{1}(s)\gamma (s,x,\pi,\theta,z)\nu (\mathrm{d}z), & \end{array}$$(22.34)and \(\hat{H}_{1}^{0}(s) = H_{1}^{0}(s,\hat{X}(s),\hat{\pi },\hat{\theta })\). Then
$$A_{1} = {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t}^{T}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (s)\mathrm{d}s\Big{]}.$$Differentiating with respect to h at h = 0 we get
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}} {\mathrm{d}h}A_{1}{\vert }_{h=0}& =& \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t}^{t+h}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (s)\mathrm{d}s\Big{]}_{h=0} \\ & & + \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t+h}^{T}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (s)\mathrm{d}s\Big{]}_{h=0}. \end{array}$$(22.35)Since Y (t) = 0 we see that
$$\frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t}^{t+h}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (s)\mathrm{d}s\Big{]}_{h=0} = 0.$$(22.36)Therefore, by Eq. (22.31),
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}} {\mathrm{d}h}A_{1}{\vert }_{h=0}& = \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t+h}^{T}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (t + h)G(t + h,s)\mathrm{d}s\Big{]}_{h=0}& \\ & =\displaystyle\int _{ t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (t + h)G(t + h,s)\Big{]}_{h=0}\mathrm{d}s & \\ & =\displaystyle\int _{ t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)G(t,s)Y (t + h)\Big{]}_{h=0}\mathrm{d}s. & \end{array}$$(22.37)On the other hand, Eq. (22.26) gives
$$\displaystyle\begin{array}{rcl} Y (t+ h) =& \alpha \displaystyle\int _{t}^{t+h}\{\nabla _{\pi }b(r)\mathrm{d}r + \nabla _{\pi }\sigma \mathrm{d}B(r) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }\gamma ({r}^{-},z)\tilde{N}(\mathrm{d}r,\mathrm{d}z)\} & \\ & +\displaystyle\int _{t}^{t+h}Y ({r}^{-})\{ \frac{\partial b} {\partial x}(r)\mathrm{d}r + \frac{\partial \sigma } {\partial x}(r)\mathrm{d}B(r) +\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial \gamma } {\partial x}({r}^{-},z)\tilde{N}(\mathrm{d}r,\mathrm{d}z)\}.& \end{array}$$(22.38)Combining this with Eqs. (22.36) and (22.37), we have
$$\frac{\mathrm{d}} {\mathrm{d}h}A_{1}{\vert }_{h=0} = \Lambda _{1} + \Lambda _{2},$$(22.39)where
$$\displaystyle\begin{array}{rcl} \Lambda _{1} =& \displaystyle\int _{t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)G(t,s)\alpha \displaystyle\int _{t}^{t+h}\{\nabla _{ \pi }b(r)\mathrm{d}r + \nabla _{\pi }\sigma (r)\mathrm{d}B(r)& \\ & +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }\gamma ({r}^{-},z)\tilde{N}(\mathrm{d}r,\mathrm{d}z)\}\Big{]}_{h=0}\mathrm{d}s & \end{array}$$(22.40)and
$$\displaystyle\begin{array}{rcl} \Lambda _{2} =& \displaystyle\int _{t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)G(t,s)\displaystyle\int _{t}^{t+h}Y ({r}^{-})\{ \frac{\partial b} {\partial x}(r)\mathrm{d}r + \frac{\partial \sigma } {\partial x}(r)\mathrm{d}B(r)& \\ & +\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial \gamma } {\partial x}({r}^{-},z)\tilde{N}(\mathrm{d}r,\mathrm{d}z)\}\Big{]}_{ h=0}\mathrm{d}s. & \end{array}$$(22.41)By the duality formulae (22.7) and (22.8), we have
$$\displaystyle\begin{array}{rcl} \Lambda _{1} =& \displaystyle\int _{t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\alpha \displaystyle\int _{ t}^{t+h}\{\nabla _{ \pi }b(r)F_{1}(t,s) + \nabla _{\pi }\sigma (r)D_{r}F_{1}(t,s)& \\ & +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }\gamma (r,z)D_{r,z}F_{1}(t,s)\nu (\mathrm{d}z)\}\mathrm{d}r\Big{]}_{h=0}\mathrm{d}s & \\ =& \displaystyle\int _{t}^{T}{\mathbb{E}}^{x}\Big{[}\alpha \{\nabla _{\pi }b(t)F_{1}(t,s) + \nabla _{\pi }\sigma (t)D_{t}F_{1}(t,s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }\gamma (t,z)D_{t,z}F_{1}(t,s)\nu (\mathrm{d}z)\}\Big{]}\mathrm{d}s. & \end{array}$$(22.42)Since Y (t) = 0 we see that
$$\Lambda _{2} = 0.$$(22.43)We conclude that
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}} {\mathrm{d}h}A_{1}{\vert }_{h=0} =& \Lambda _{1} & \\ =& \displaystyle\int _{t}^{T}{\mathbb{E}}^{x}\Big{[}\alpha \{F_{1}(t,s)\nabla _{\pi }b(t) + D_{t}F_{1}(t,s)\nabla _{\pi }\sigma (t)& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}F_{1}(t,s)\nabla _{\pi }\gamma (t,z)\nu (\mathrm{d}z)\}\Big{]}\mathrm{d}s. & \end{array}$$(22.44)Moreover, we see directly that
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}} {\mathrm{d}h}A_{2}{\vert }_{h=0} =& {\mathbb{E}}^{x}\Big{[}\alpha \{\hat{K}_{1}(t)\nabla _{\pi }b(t) + D_{t}\hat{K}_{1}(t)\nabla _{\pi }\sigma (t) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\hat{K}_{1}(t)\nabla _{\pi }\gamma (t,z)\nu (\mathrm{d}z) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(t,z)\mu (\mathrm{d}z)\}\Big{]}.& \end{array}$$(22.45)Therefore, differentiating Eq. (22.30) with respect to h at h = 0 gives the equation
$$\displaystyle\begin{array}{rcl} &{\mathbb{E}}^{x}\Big{[}\alpha \{\Big{(}\hat{K}_{1}(t) +\displaystyle\int _{ t}^{T}F_{1}(t,s)\mathrm{d}s\Big{)}\nabla _{\pi }b(t) + D_{t}\Big{(}\hat{K}_{1}(t) +\displaystyle\int _{ t}^{T}F_{1}(t,s)\mathrm{d}s\Big{)}\nabla _{\pi }\sigma (t)& \\ & \quad +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\Big{(}\hat{K}_{1}(t) +\displaystyle\int _{ t}^{T}F_{1}(t,s)\mathrm{d}s\Big{)}\nabla _{\pi }\gamma (t,z)\nu (\mathrm{d}z) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(t,z)\mu (\mathrm{d}z)\}\Big{]} = 0. & \end{array}$$(22.46)We can reformulate this as follows: If we define, as in Eq. (22.16),
$$\hat{p}_{1}(t) =\hat{ K}_{1}(t) +\displaystyle\int _{ t}^{T}F_{ 1}(t,s)\mathrm{d}s =\hat{ K}_{1}(t) +\displaystyle\int _{ t}^{T}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)G(t,s)\mathrm{d}s,$$(22.47)then Eq. (22.46) can be written:
$$\displaystyle\begin{array}{rcl}{ \mathbb{E}}^{x}\Big{[}& \nabla _{ \pi }\{\displaystyle\int _{\mathbb{R}_{0}}f_{1}(t,\hat{X}(t),\pi,\hat{\theta },z)\mu (\mathrm{d}z) +\hat{ p}_{1}(t)b(t,\hat{X}(t),\pi _{0},\hat{\theta }_{0}) & \\ & \quad + D_{t}\hat{p}_{1}(t)\sigma (t,\hat{X}(t),\pi _{0},\hat{\theta }_{0}) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\hat{p}_{1}(t)\gamma (t,\hat{X}(t),\pi,\hat{\theta },z)\nu (\mathrm{d}z)\}_{\pi =(\pi _{0}(t),\pi _{1}(t,z))}\alpha \Big{]} = 0.& \\ \end{array}$$Since this holds for all bounded ℰ t -measurable random variable α, we conclude that
$${\mathbb{E}}^{x}\Big{[}\nabla _{ \pi }\hat{H}_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi,\hat{\theta })\mid _{ \pi =\hat{\pi }(t)}\mid \mathcal{E}_{t}\Big{]} = 0.$$Similarly, we have
$$\displaystyle\begin{array}{rcl} 0=& \frac{\partial } {\partial \upsilon }J_{2}(\hat{\pi },\hat{\theta } + \upsilon \eta ){\vert }_{\upsilon =0} & \\ =& {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}}\{\frac{\partial f_{2}} {\partial x} (t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi }(t,z),\hat{\theta }(t,z),z)D(t) & \\ & + \nabla _{\theta }f_{2}(t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi }(t,z),\theta (t,z),z){\vert }_{\theta =\hat{\theta }}\eta (t)\}\mu (\mathrm{d}z)\mathrm{d}t+g_{2}^{\prime}(\hat{X}(T))D(T) \Big{]},& \end{array}$$(22.48)where
$$\displaystyle\begin{array}{rcl} D(t) =& {D}^{(\eta )}(t) = \frac{\mathrm{d}} {\mathrm{d}\upsilon }{X}^{(\hat{\pi },\hat{\theta }+\upsilon \eta )}(t){\vert }_{ \upsilon =0} & \\ =& \displaystyle\int _{0}^{t}\{ \frac{\partial b} {\partial x}(s,\hat{X}(s),\hat{\pi }_{0}(s),\hat{\theta }_{0}(s))D(s) & \\ & +\nabla _{\theta }b(s,{X}^{(\hat{\pi }_{0},\theta _{0})}(s),\hat{\pi }_{0}(s),\theta _{0}(s)){\vert }_{\theta =\hat{\theta }}{\eta }^{{\ast}}(s)\}\mathrm{d}s & \\ & +\displaystyle\int _{0}^{t}\{\frac{\partial \sigma } {\partial x}(s,\hat{X}(s),\hat{\pi }_{0}(s),\hat{\theta }_{0}(s))D(s) & \\ & +\nabla _{\theta }\sigma (s,{X}^{(\hat{\pi },\theta )}(s),\hat{\pi }_{0}(s),\theta _{0}(s)){\vert }_{\theta =\hat{\theta }}{\eta }^{{\ast}}(s)\}\mathrm{d}B(s) & \\ & +\displaystyle\int _{0}^{t}\displaystyle\int _{\mathbb{R}_{0}}\{\frac{\partial \gamma } {\partial x}(s,\hat{X}({s}^{-}),\hat{\pi }({s}^{-},z),\hat{\theta }({s}^{-}),z)D(s) & \\ & +\nabla _{\theta }\gamma (s,{X}^{(\hat{\pi },\theta )}({s}^{-}),\hat{\pi }({s}^{-},z),\theta ({s}^{-},z),z){\vert }_{\theta =\hat{\theta }}{\eta }^{{\ast}}(s)\}\tilde{N}(\mathrm{d}s,\mathrm{d}z).& \end{array}$$(22.49)

Define
$$D(s) = D(t + h)G(t + h,s);\quad s \geq t + h,$$

where G(t, s) is defined as in Eq. (22.32). By arguments similar to those above, we get
$${\mathbb{E}}^{x}\Big{[}\nabla _{ \theta }\hat{H}_{2}(t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi },\theta )\mid _{ \theta =\hat{\theta }(t)}\mid \mathcal{E}_{t}\Big{]} = 0.$$

This completes the proof of (i).
-
(ii)
Conversely, suppose that there exists \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) such that Eqs. (22.22) and (22.23) hold. Then by reversing the above arguments, we obtain that Eq. (22.31) holds for all \(\beta _{\alpha }(s,\omega ) = \alpha (\omega )\chi _{(t,t+h]}(s) \in \mathcal{A}_{\Pi }\), where
$$\displaystyle\begin{array}{rcl} A_{1} =& {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{t}^{T}\{\hat{K}_{1}(s) \frac{\partial b} {\partial x}(s) + D_{s}\hat{K}_{1}(s)\frac{\partial \sigma } {\partial x}(s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{1}(s)\frac{\partial \gamma } {\partial x}(s)\nu (\mathrm{d}z)\}{Y }^{(\beta _{\alpha })}(s)\mathrm{d}s\Big{]},& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} A_{2} =& {\mathbb{E}}^{x}\Big{[}\{\displaystyle\int _{t}^{t+h}\Big{(}\hat{K}_{1}(s)\nabla _{\pi }b(s) + D_{s}\hat{K}_{1}(s)\nabla _{\pi }\sigma (s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{1}(s)\nabla _{\pi }\gamma (s,z)\nu (\mathrm{d}z) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(s,z)\mu (\mathrm{d}z)\Big{)}\mathrm{d}s\}\alpha \Big{]},& \\ \end{array}$$

for some t, h ∈ [0, T] with t + h ≤ T and some bounded ℰ t -measurable α. Similarly,
$$A_{3} + A_{4} = 0$$(22.50)

for all \(\eta _{\xi }(s,\omega ) = \xi (\omega )\chi _{(t,t+h]}(s) \in \mathcal{A}_{\Theta }\), where
$$\displaystyle\begin{array}{rcl} A_{3} =& {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{t}^{T}\{\hat{K}_{2}(s) \frac{\partial b} {\partial x}(s) + D_{s}\hat{K}_{2}(s)\frac{\partial \sigma } {\partial x}(s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{2}(s)\frac{\partial \gamma } {\partial x}(s)\nu (\mathrm{d}z)\}{Y }^{(\eta _{\xi })}(s)\mathrm{d}s\Big{]},& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} A_{4} =& {\mathbb{E}}^{x}\Big{[}\{\displaystyle\int _{t}^{t+h} \Big{(}\hat{K}_{2}(s)\nabla _{\theta }b(s) + D_{s}\hat{K}_{2}(s) \nabla _{\theta }\sigma (s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{2}(s)\nabla _{\theta }\gamma (s,z)\nu (\mathrm{d}z) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\theta }f_{2}(s,z)\mu (\mathrm{d}z)\Big{)}\mathrm{d}s\}\xi \Big{]},& \\ \end{array}$$

for some t, h ∈ [0, T] with t + h ≤ T and some bounded ℰ t -measurable ξ. Hence, these equalities hold for all linear combinations of \(\beta _{\alpha }\) and \(\eta _{\xi }\). Since all bounded \(\beta \in \mathcal{A}_{\Pi }\) and \(\eta \in \mathcal{A}_{\Theta }\) can be approximated pointwise boundedly in (t, ω) by such linear combinations, it follows that Eqs. (22.31) and (22.50) hold for all bounded \((\beta,\eta ) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\). Hence, by reversing the remaining part of the proof above, we conclude that
$$\frac{\partial } {\partial y}J_{1}(\hat{\pi } + y\beta,\hat{\theta }){\vert }_{y=0} = 0,$$$$\frac{\partial } {\partial \upsilon }J_{2}(\hat{\pi },\hat{\theta } + \upsilon \eta ){\vert }_{\upsilon =0} = 0,$$for all β and η.
3 Zero-Sum Games
Suppose that the given performance functional of player I is the negative of that of player II, i.e.,
where \({\mathbb{E}}^{x} = \mathbb{E}_{P}^{x}\) denotes the expectation with respect to P given that X(0) = x. Suppose that the controls u 0(t) and u 1(t, z) have the form given in Eqs. (22.5) and (22.6). Let \(\mathcal{A}_{\Pi }\) and \(\mathcal{A}_{\Theta }\) denote the given families of controls π = (π0, π1) and θ = (θ0, θ1), respectively, contained in the set of càdlàg ℰ t -adapted controls and such that Eq. (22.1) has a unique strong solution up to time T and
Then the partial information zero-sum stochastic differential game problem is the following:
Problem 3.1.
Find \(\Phi _{\mathbb{E}}\in \mathbb{R}\), \({\pi }^{{\ast}}\in \mathcal{A}_{\Pi }\) and \({\theta }^{{\ast}}\in \mathcal{A}_{\Theta }\) (if it exists) such that
Such a control (π ∗ , θ ∗ ), if it exists, is called an optimal control. The intuitive idea is that player I controls π while player II controls θ. The actions of the players are antagonistic: the payoff J(π, θ) is a reward for player I and a cost for player II. Note that since we allow b, σ, γ, f and g to be stochastic processes, and since our controls are ℰ t -adapted, this problem is not of Markovian type and hence cannot be solved by dynamic programming.
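Because the problem is non-Markovian, closed-form solutions are rare, but for fixed controls the payoff can always be estimated by simulation. The sketch below is purely illustrative and not from the paper: it estimates a generic performance functional of the form J(π, θ) = E[∫₀ᵀ f dt + g(X(T))] for constant controls and a toy diffusion without jumps; the coefficients b, σ, f, g are hypothetical stand-ins chosen only to make the example runnable.

```python
import numpy as np

rng = np.random.default_rng(1)

T, n_steps, n_paths = 1.0, 200, 2000
dt = T / n_steps

# Toy coefficients -- all hypothetical, standing in for the paper's
# (possibly random) b, sigma, f and g; jumps are omitted for brevity.
def b(x, pi, th):     return th * pi * x                       # drift
def sigma(x, pi, th): return 0.2 * pi * x                      # diffusion
def f(x, pi, th):     return -0.5 * th**2 * np.ones_like(x)    # running term
def g(x):             return np.log(np.maximum(x, 1e-12))      # terminal term

def J(pi, th, x0=1.0):
    """Monte Carlo estimate of E^x[ int_0^T f dt + g(X(T)) ] via Euler steps."""
    x = np.full(n_paths, x0)
    running = np.zeros(n_paths)
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        running += f(x, pi, th) * dt
        x = x + b(x, pi, th) * dt + sigma(x, pi, th) * dB
    return float((running + g(x)).mean())

# Player I picks pi to maximize, player II picks theta to minimize:
print(J(pi=0.5, th=0.05))
```

A saddle point of such a payoff surface over (π, θ) is what Problem 3.1 asks for; Theorem 3.1 replaces any brute-force search by first-order conditions on the Hamiltonian.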
Theorem 3.1 (Maximum principle for zero-sum games).
-
(i)
Suppose \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) is a directional critical point for J(π,θ) in the sense that for all bounded \(\beta \in \mathcal{A}_{\Pi }\) and \(\eta \in \mathcal{A}_{\Theta }\) , there exists δ > 0 such that \(\hat{\pi } + y\beta \in \mathcal{A}_{\Pi }\) , \(\hat{\theta } + \upsilon \eta \in \mathcal{A}_{\Theta }\) for all y,υ ∈ (−δ,δ) and
$$c(y,\upsilon ) := J(\hat{\pi } + y\beta,\hat{\theta } + \upsilon \eta ),\quad y,\upsilon \in (-\delta,\delta )$$

has a critical point at 0, i.e.,
$$\frac{\partial c} {\partial y}(0,0) = \frac{\partial c} {\partial \upsilon }(0,0) = 0.$$(22.54)

Then
$${\mathbb{E}}^{x}[\nabla _{ \pi }\hat{H}(t,{X}^{(\pi,\hat{\theta })}(t),\pi,\hat{\theta },\omega )\vert \mathcal{E}_{ t}]_{\pi =\hat{\pi }} = 0,$$(22.55)$${\mathbb{E}}^{x}[\nabla _{ \theta }\hat{H}(t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi },\theta,\omega )\vert \mathcal{E}_{ t}]_{\theta =\hat{\theta }} = 0\quad \mbox{ for a.a. }t,\omega,$$(22.56)

where
$$\displaystyle\begin{array}{rcl} & \hat{X}(t) = {X}^{(\hat{\pi },\hat{\theta })}(t),& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} \hat{H}(t,\hat{X}(t),\pi,\theta )& =& \displaystyle\int _{\mathbb{R}_{0}}f(t,\hat{X}(t),\pi,\theta,z)\mu (\mathrm{d}z)+\hat{p}(t)b(t,\hat{X}(t),\pi _{0},\theta _{0}) \\ & & +\hat{q}(t)\sigma (t,\hat{X}(t),\pi _{0},\theta _{0})+\displaystyle\int _{\mathbb{R}_{0}}\hat{r}(t,z)\gamma (t,\hat{X}({t}^{-}),\pi,\theta,z)\nu (\mathrm{d}z), \\ & & \end{array}$$(22.57)

with
$$\hat{p}(t) =\hat{ K}(t) +\displaystyle\int _{ t}^{T}\frac{\partial \hat{{H}}^{0}} {\partial x} (s,\hat{X}(s),\hat{\pi }(s),\hat{\theta }(s))\,\hat{G}(t,s)\mathrm{d}s,$$(22.58)$$\hat{K}(t) = {K}^{(\hat{\pi },\hat{\theta })}(t) = g^{\prime}(\hat{X}(T)) +\displaystyle\int _{ t}^{T}\displaystyle\int _{ \mathbb{R}_{0}} \frac{\partial f} {\partial x}(s,\hat{X}(s),\hat{\pi }(s,z),\hat{\theta }(s,z),z)\mu (\mathrm{d}z)\mathrm{d}s,$$(22.59)$$\displaystyle\begin{array}{rcl} \hat{{H}}^{0}(s,\hat{X},\hat{\pi },\hat{\theta }) =& \hat{K}(s)b(s,\hat{X},\hat{\pi }_{ 0},\hat{\theta }_{0}) + D_{s}\hat{K}(s)\sigma (s,\hat{X},\hat{\pi }_{0},\hat{\theta }_{0})& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}(s)\gamma (s,\hat{X},\hat{\pi },\hat{\theta },z)\nu (\mathrm{d}z), & \end{array}$$(22.60)$$\displaystyle\begin{array}{rcl} \hat{G}(t,s) :=& \exp \Big{(}\displaystyle\int _{t}^{s}\{ \frac{\partial b} {\partial x}(r,\hat{X}(r),\hat{\pi }_{0}(r),\hat{\theta }_{0}(r)) & \\ & -\frac{1} {2}\Big{(}\frac{\partial \sigma } {\partial x}{\Big{)}}^{2}(r,\hat{X}(r),\hat{\pi }_{ 0}(r),\hat{\theta }_{0}(r))\}\mathrm{d}r & \\ & +\displaystyle\int _{t}^{s}\frac{\partial \sigma } {\partial x}(r,\hat{X}(r),\hat{\pi }_{0}(r),\hat{\theta }_{0}(r))\mathrm{d}B(r) & \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}(r,\hat{X}({r}^{-}),\hat{\pi }({r}^{-},z),\hat{\theta }({r}^{-},z),z)\Big{)}\tilde{N}(\mathrm{d}r,\mathrm{d}z)& \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\{\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}(r,\hat{X}(r),\hat{\pi },\hat{\theta },z)\Big{)} & \\ & -\frac{\partial \gamma } {\partial x}(r,\hat{X}(r),\hat{\pi },\hat{\theta },z)\}\nu (\mathrm{d}z)\mathrm{d}r\Big{)}; & \\ \end{array}$$$$\hat{q}(t) := D_{t}\hat{p}(t),$$

and
$$\hat{r}(t,z) := D_{t,z}\hat{p}(t).$$(22.61) -
(ii)
Conversely, suppose that there exists \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) such that Eqs. (22.55) and (22.56) hold. Furthermore, suppose that g is an affine function and that H is concave in π and convex in θ. Then \((\hat{\pi },\hat{\theta })\) satisfies Eq. (22.54).
4 Application: Worst-Case Scenario Optimal Portfolio Under Partial Information
We illustrate the results in the previous section by looking at an application to robust portfolio choice in finance:
Consider a financial market with the following two investment possibilities:
-
1.
A risk-free asset, where the unit price S 0(t) at time t is
$$\mathrm{d}S_{0}(t) = r(t)S_{0}(t)\mathrm{d}t;\quad S_{0}(0) = 1;\quad 0 \leq t \leq T,$$

where T > 0 is a given constant.
-
2.
A risky asset, where the unit price S 1(t) at time t is given by
$$\left \{\begin{array}{ll} \mathrm{d}S_{1}(t) = S_{1}({t}^{-})[\theta (t)\mathrm{d}t + \sigma _{0}(t)\mathrm{d}B(t) +\displaystyle\int _{\mathbb{R}_{0}} \gamma _{0}(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)], \\ S_{1}(0) > 0,\end{array} \right.$$(22.62)

where r, θ, σ0 and γ0 are predictable processes such that
We assume that θ is adapted to a given subfiltration ℰ t and that
for some constant δ > 0.
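For intuition, the price dynamics (22.62) can be simulated with an Euler scheme. The snippet below is a minimal sketch, not part of the paper: it assumes constant coefficients (the paper allows predictable processes) and takes γ₀(t, z) = z with normally distributed jump sizes, so the compensated jump measure contributes the realized jumps minus their mean rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical constant coefficients, chosen only for illustration.
theta, sigma0 = 0.07, 0.2
jump_rate, jump_mean, jump_sd = 0.5, -0.1, 0.05  # compound-Poisson jumps
T, n_steps = 1.0, 500
dt = T / n_steps

def simulate_S1(S1_0=1.0):
    """One Euler path of dS1 = S1^- [theta dt + sigma0 dB + int z Ntilde(dt,dz)]."""
    S = S1_0
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))
        n_jumps = rng.poisson(jump_rate * dt)
        jump_sum = rng.normal(jump_mean, jump_sd, size=n_jumps).sum()
        # Compensated increment: realized jumps minus the compensator drift.
        dN_tilde = jump_sum - jump_rate * jump_mean * dt
        S *= 1.0 + theta * dt + sigma0 * dB + dN_tilde
    return S

print(simulate_S1())
```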
Let π(t) = π(t, ω) be a portfolio, representing the amount invested in the risky asset at time t. We require that π be càdlàg, ℰ t -adapted and self-financing; hence the corresponding wealth X(t) = X (π, θ)(t) at time t is given by
Let us assume that the mean relative growth rate θ(t) of the risky asset is not known to the trader, but is subject to uncertainty. We may regard θ as a market scenario, or as a stochastic control of the market, playing against the trader. Let \(\mathcal{A}_{\Pi }^{\epsilon }\) and \(\mathcal{A}_{\Theta }^{\epsilon }\) denote the sets of admissible controls π and θ, respectively. The worst-case scenario optimal portfolio problem under partial information for the trader is to find \({\pi }^{{\ast}}\in \mathcal{A}_{\Pi }^{\epsilon }\), \({\theta }^{{\ast}}\in \mathcal{A}_{\Theta }^{\epsilon }\) and Φ ∈ ℝ such that
where U : [0, ∞) → ℝ is a given utility function, assumed to be concave, strictly increasing and \({\mathcal{C}}^{1}\) on (0, ∞). We want to study this problem by using Theorem 3.1. In this case we have
and
where
Hence,
and
With this value for p(t) we have
Hence Eq. (22.55) becomes
and Eq. (22.56) becomes
Since p(t) > 0 we conclude that
This implies that
and
Substituting this into Eq. (22.69), we get
We have proved the following theorem:
Theorem 4.1 (Worst-case scenario optimal portfolio under partial information).
Suppose there exists a solution \(({\pi }^{{\ast}},{\theta }^{{\ast}}) \in (\mathcal{A}_{\Pi }^{\epsilon },\mathcal{A}_{\Theta }^{\epsilon })\) of the stochastic differential game Eq. (22.64). Then
and
In particular, if r(s) is deterministic, then
Remark 4.1.
-
(i)
If r(s) is deterministic, then Eq. (22.77) states that the worst-case scenario is when \(\hat{\theta }(t) = r(t)\), for all t ∈ [0, T], i.e., when the normalized risky asset price
$${\mathrm{e}}^{-\displaystyle\int _{0}^{t}r(s)\mathrm{d}s }S_{1}(t)$$

is a martingale. In such a situation the trader might as well put all her money in the risk-free asset, i.e., choose \(\pi (t) =\hat{ \pi }(t) = 0\). This trading strategy remains optimal if r(s) is not deterministic, but then the worst-case scenario \(\hat{\theta }(t)\) is given by the more complicated expression (22.74).
-
(ii)
This is a new approach to, and a partial extension of, Theorem 2.2 in [13] and Theorem 4.1 in the subsequent paper [1]. Both of these papers consider the case with deterministic r(t) only. On the other hand, in these papers the scenario is represented by a probability measure and not by the drift.
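Remark 4.1 (i) can also be checked numerically: with θ(t) ≡ r and, for simplicity, no jumps, S₁(T) is lognormal and the discounted price e^{-rT}S₁(T) has expectation S₁(0). The following sketch, with hypothetical constants not taken from the paper, verifies this by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical constants; theta = r and gamma0 = 0 (no jumps) for simplicity.
r, sigma0, T, n_paths = 0.03, 0.2, 1.0, 200_000
S1_0 = 1.0

# Exact lognormal sampling of S1(T) when the drift equals r:
B_T = rng.normal(0.0, np.sqrt(T), size=n_paths)
S1_T = S1_0 * np.exp((r - 0.5 * sigma0**2) * T + sigma0 * B_T)

# Martingale property in expectation: E[e^{-rT} S1(T)] = S1(0).
discounted = np.exp(-r * T) * S1_T
print(discounted.mean())  # close to S1_0 = 1.0, up to Monte Carlo error
```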
References
An, T.T.K., Øksendal, B.: A maximum principle for stochastic differential games with partial information. J. Optim. Theor. Appl. 139, 463–483 (2008)
Bensoussan, A.: Stochastic Control of Partially Observable Systems. Cambridge University Press, Cambridge (1992)
Benth, F.E., Di Nunno, G., Løkka, A., Øksendal, B., Proske, F.: Explicit representation of the minimal variance portfolio in markets driven by Lévy processes. Math. Financ. 13, 55–72 (2003)
Baghery, F., Øksendal, B.: A maximum principle for stochastic control with partial information. Stoch. Anal. Appl. 25, 705–717 (2007)
Di Nunno, G., Meyer-Brandis, T., Øksendal, B., Proske, F.: Malliavin calculus and anticipative Itô formulae for Lévy processes. Inf. Dim. Anal. Quant. Probab. 8, 235–258 (2005)
Di Nunno, G., Øksendal, B., Proske, F.: Malliavin Calculus for Lévy Processes and Applications to Finance. Springer, Berlin (2009)
Karatzas, I., Ocone, D.: A generalized Clark representation formula, with application to optimal portfolios. Stoch. Stoch. Rep. 34, 187–220 (1991)
Karatzas, I., Xue, X.: A note on utility maximization under partial observations. Math. Financ. 1, 57–70 (1991)
Lakner, P.: Optimal trading strategy for an investor: the case of partial information. Stoch. Process. Appl. 76, 77–97 (1998)
Meyer-Brandis, T., Øksendal, B., Zhou, X.Y.: A mean-field stochastic maximum principle via Malliavin calculus. Stochastics 84, 643–666 (2012)
Nualart, D.: Malliavin Calculus and Related Topics, 2nd edn. Springer, Berlin (2006)
Øksendal, B., Sulem, A.: Applied Stochastic Control of Jump Diffusions, 2nd edn. Springer, Berlin (2007)
Øksendal, B., Sulem, A.: A game theoretic approach to martingale measures in incomplete markets. Surv. Appl. Ind. Math. 15, 18–24 (2008)
Pham, H., Quenez M.-C.: Optimal portfolio in partially observed stochastic volatility models. Ann. Appl. Probab. 11, 210–238 (2001)
Yong, J., Zhou, X.Y.: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer, New York (1999)
Acknowledgements
The research leading to these results has received funding from the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007–2013)/ ERC grant agreement no [228087].
© 2013 Springer Science+Business Media New York
Cite this paper
Kieu, A.T.T., Øksendal, B., Okur, Y.Y. (2013). A Malliavin Calculus Approach to General Stochastic Differential Games with Partial Information. In: Viens, F., Feng, J., Hu, Y., Nualart, E. (eds) Malliavin Calculus and Stochastic Analysis. Springer Proceedings in Mathematics & Statistics, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-5906-4_22
Print ISBN: 978-1-4614-5905-7
Online ISBN: 978-1-4614-5906-4