Abstract
In this article, we consider a stochastic differential game in which the state process is governed by a controlled Itô–Lévy process and the information available to the controllers is possibly less than the full information generated by the system. All the system coefficients and the objective performance functionals are assumed to be random. We use Malliavin calculus to derive a maximum principle for the optimal control of such a problem. The results are applied to solve a worst-case scenario portfolio problem in finance.
Received 2/18/2011; Accepted 5/23/2012; Final 5/29/2012
Keywords
- Malliavin calculus
- Stochastic differential game
- Stochastic control
- Jump diffusion
- Partial information
- Optimal worst-case scenario portfolio
1 Introduction
Suppose the dynamics of a state process \(X(t) = {X}^{(u_{0},u_{1})}(t,\omega )\); t ≥ 0, ω ∈ Ω, is a controlled Itô–Lévy process in ℝ of the form
where the coefficients b : [0, T] × ℝ × U × Ω → ℝ, σ : [0, T] × ℝ × U × Ω → ℝ and γ : [0, T] × ℝ × U × K × ℝ₀ × Ω → ℝ are all continuously differentiable (C¹) with respect to x ∈ ℝ and u₀ ∈ U, u₁ ∈ K for each t ∈ [0, T] and a.a. ω ∈ Ω; U, K are given open convex subsets of ℝ² and ℝ × ℝ₀, respectively. Here ℝ₀ = ℝ ∖ { 0}, B(t) = B(t, ω) and η(t) = η(t, ω), given by
are a one-dimensional Brownian motion and an independent pure jump Lévy martingale, respectively, on a given filtered probability space \((\Omega,\mathcal{F},\{\mathcal{F}_{t}\}_{t\geq 0},P).\) Thus
$$\widetilde{N}(\mathrm{d}t,\mathrm{d}z) := N(\mathrm{d}t,\mathrm{d}z) -\nu (\mathrm{d}z)\mathrm{d}t$$is the compensated Poisson jump measure of η( ⋅), where N(dt, dz) is the Poisson jump measure and ν(dz) is the Lévy measure of the pure jump Lévy process η( ⋅). For simplicity, we assume that
The processes u₀(t) and u₁(t, z) are the control processes and take values in the given open convex sets U and K, respectively, for a.a. t ∈ [0, T], z ∈ ℝ₀, for a given fixed T > 0. Also, u₀( ⋅) and u₁( ⋅) are càdlàg and adapted to a given filtration \(\{\mathcal{E}_{t}\}_{t\geq 0}\), where
$$\mathcal{E}_{t} \subseteq \mathcal{F}_{t}\quad \mbox{ for all }t \in [0,T].$$
Here \(\{\mathcal{E}_{t}\}_{t\geq 0}\) represents the information available to the controller at time t. For example, we could have
$$\mathcal{E}_{t} = \mathcal{F}_{(t-\delta )^{+}}\quad \mbox{ for some constant delay }\delta > 0,$$meaning that the controller receives delayed information compared to \(\mathcal{F}_{t}\). We refer to [15, 12] for more information about stochastic control of Itô diffusions and jump diffusions, respectively, and to [2, 4, 8, 9, 14] for other papers dealing with optimal control under partial information/observation.
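As an illustration of dynamics of this type, the following sketch simulates a one-dimensional controlled Itô–Lévy process by an Euler scheme. The coefficients b, σ, γ, the jump rate, the mark distribution, and the constant control are all simplified placeholder choices, not taken from the paper.

```python
import numpy as np

def simulate_ito_levy(x0, T=1.0, n=1000, lam=2.0, seed=0):
    """Euler scheme for dX = b dt + sigma dB + gamma dN~ on [0, T].

    Placeholder coefficients (hypothetical, for illustration only):
      b(t, x, u)        = -0.5 * x + u    (mean-reverting drift)
      sigma(t, x, u)    = 0.2             (constant volatility)
      gamma(t, x, u, z) = 0.1 * z         (linear jump coefficient)
    Jumps: compound Poisson with rate `lam` and N(0, 1) marks, so the
    compensator contribution lam * E[gamma] dt vanishes since E[z] = 0.
    """
    rng = np.random.default_rng(seed)
    dt = T / n
    u = 0.0  # a constant open-loop control, for illustration only
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        dB = rng.normal(0.0, np.sqrt(dt))
        nj = rng.poisson(lam * dt)  # number of jumps in this time step
        jump = 0.1 * rng.normal(size=nj).sum() if nj else 0.0
        x[k + 1] = x[k] + (-0.5 * x[k] + u) * dt + 0.2 * dB + jump
    return x
```

The first-variation processes appearing in the proof below are derivatives of such trajectories with respect to a perturbation of the control.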
Let f : [0, T] × ℝ × U × K × Ω → ℝ and g : ℝ × Ω → ℝ be given functions, continuously differentiable (C¹) with respect to x ∈ ℝ and u₀ ∈ U, u₁ ∈ K. Suppose there are two players in the stochastic differential game and that their performance functionals are given as follows:
where μ is a measure on ℝ₀ and \({\mathbb{E}}^{x} = \mathbb{E}_{P}^{x}\) denotes the expectation with respect to P given that X(0) = x. Suppose that the controls u₀(t) and u₁(t, z) have the form
$$u_{0}(t) = (\pi _{0}(t),\theta _{0}(t))\quad \mbox{ and}\quad u_{1}(t,z) = (\pi _{1}(t,z),\theta _{1}(t,z)).$$
Let \(\mathcal{A}_{\Pi }\) and \(\mathcal{A}_{\Theta }\) denote the given families of controls π = (π0, π1) and θ = (θ0, θ1), respectively, contained in the set of càdlàg ℰ t -adapted controls, such that Eq. (22.1) has a unique strong solution up to time T and
The partial information non-zero-sum stochastic differential game problem we consider is the following:
Problem 1.1.
Find \(({\pi }^{{\ast}},{\theta }^{{\ast}}) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) (if it exists) such that
-
(i)
J 1(π, θ ∗ ) ≤ J 1(π ∗ , θ ∗ ) for all \(\pi \in \mathcal{A}_{\Pi }\),
-
(ii)
J 2(π ∗ , θ) ≤ J 2(π ∗ , θ ∗ ) for all \(\theta \in \mathcal{A}_{\Theta }\).
Such a control (π ∗ , θ ∗ ) is called a Nash equilibrium (if it exists). The intuitive idea is that there are two players, I and II. Player I controls π and player II controls θ. Given that each player knows the equilibrium strategy chosen by the other, no player has anything to gain by changing only his or her own strategy (i.e., by deviating unilaterally). Note that since we allow b, σ, γ, f and g to be stochastic processes, and also because our controls are required to be ℰ t -adapted, this problem is not of Markovian type and hence cannot be solved by dynamic programming. Our paper is related to the recent papers [1, 10], where a maximum principle for stochastic differential games with partial information and a mean-field maximum principle are dealt with, respectively. However, the approach in [1] requires the solution of a backward stochastic differential equation (BSDE) for the adjoint processes. This is often a difficult point, particularly in the partial information case. In the current paper, we use Malliavin calculus techniques to obtain a maximum principle for this general non-Markovian stochastic differential game with partial information, without the use of BSDEs.
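The unilateral-deviation inequalities (i) and (ii) can be checked numerically in a toy static quadratic game (purely illustrative, far simpler than Problem 1.1). The payoffs J1, J2 below are hypothetical choices whose best responses are p = q and q = p/2, so (0, 0) is a Nash equilibrium.

```python
import numpy as np

# Toy static quadratic game (illustration only, not Problem 1.1):
#   J1(p, q) = -(p - q)^2,  J2(p, q) = -(q - p/2)^2.
# Best responses are p = q and q = p/2, so (p*, q*) = (0, 0).
def J1(p, q):
    return -(p - q) ** 2

def J2(p, q):
    return -(q - p / 2) ** 2

def is_nash(p_star, q_star, grid=np.linspace(-1.0, 1.0, 201)):
    """Check the unilateral-deviation inequalities (i) and (ii) on a grid."""
    no_gain_1 = all(J1(p, q_star) <= J1(p_star, q_star) for p in grid)
    no_gain_2 = all(J2(p_star, q) <= J2(p_star, q_star) for q in grid)
    return no_gain_1 and no_gain_2
```

Any profile other than (0, 0), e.g. (0.5, 0), fails one of the two inequalities, which is exactly the "nothing to gain by deviating unilaterally" property.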
2 The General Maximum Principle for the Stochastic Differential Games
In this section we use Malliavin calculus to solve Problem 1.1. We assume the following:
-
(A1)
For all s, r, t ∈ (0, T), t ≤ r, and all bounded ℰ t -measurable random variables α = α(ω), ξ = ξ(ω), the controls \(\beta _{\alpha }(s) := (0,\beta _{\alpha }^{i}(s))\) and \(\eta _{\xi }(s) := (0,\eta _{\xi }^{i}(s))\), i = 1, 2, with
$$\displaystyle\begin{array}{rcl} \beta _{\alpha }^{i}(s) = {\alpha }^{i}(\omega )\chi _{ [t,r]}(s)\mbox{ and }\eta _{\xi }^{i}(s) = {\xi }^{i}(\omega )\chi _{ [t,r]}(s);\quad s \in [0,T],& & \\ \end{array}$$belong to \(\mathcal{A}_{\Pi }\) and \(\mathcal{A}_{\Theta }\), respectively. Also, we will denote the transposes of the vectors β and η by β ∗ and η ∗ , respectively.
-
(A2)
For all \(\pi,\beta \in \mathcal{A}_{\Pi }\); \(\theta,\eta \in \mathcal{A}_{\Theta }\) with β and η bounded, there exists δ > 0 such that the controls π(t) + yβ(t) and θ(t) + υη(t), t ∈ [0, T], belong to \(\mathcal{A}_{\Pi }\) and \(\mathcal{A}_{\Theta }\), respectively, for all y, υ ∈ ( − δ, δ), and such that the families
$$\displaystyle\begin{array}{rcl} & \{\frac{\partial f_{1}} {\partial x} (t,{X}^{(\pi +y\beta,\theta )}(t),\pi + y\beta,\theta,z) \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\pi +y\beta,\theta )}(t) & \\ & \qquad + \nabla _{\pi }f_{1}(t,{X}^{(\pi +y\beta,\theta )}(t),\pi + y\beta,\theta,z){\beta }^{{\ast}}(t)\}_{y\in (-\delta,\delta )},& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} & \{\frac{\partial f_{2}} {\partial x} (t,{X}^{(\pi,\theta +\upsilon \eta )}(t),\pi,\theta + \upsilon \eta,z) \frac{\mathrm{d}} {\mathrm{d}\upsilon }{X}^{(\pi,\theta +\upsilon \eta )}(t) & \\ & \qquad + \nabla _{\theta }f_{2}(t,{X}^{(\pi,\theta +\upsilon \eta )}(t),\pi,\theta + \upsilon \eta,z){\eta }^{{\ast}}(t)\}_{\upsilon \in (-\delta,\delta )}& \\ \end{array}$$are λ ×ν ×P-uniformly integrable and the families
$$\displaystyle\begin{array}{rcl} \{g_{1}^{\prime}({X}^{(\pi +y\beta,\theta )}(T)) \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\pi +y\beta,\theta )}(T)\}_{ y\in (-\delta,\delta )},& & \\ \quad \mbox{ }\{g_{2}^{\prime}({X}^{(\pi,\theta +\upsilon \eta )}(T)) \frac{\mathrm{d}} {\mathrm{d}\upsilon }{X}^{(\pi,\theta +\upsilon \eta )}(T)\}_{ \upsilon \in (-\delta,\delta )}& & \\ \end{array}$$are P-uniformly integrable.
In the following, D t F denotes the Malliavin derivative with respect to B(⋅) (at t) of a given (Malliavin differentiable) random variable F = F(ω); ω ∈ Ω. Similarly, D t, z F denotes the Malliavin derivative with respect to \(\widetilde{N}(\cdot,\cdot )\) (at t,z) of F. We let \(\mathbb{D}_{1,2}\) denote the set of all random variables which are Malliavin differentiable with respect to both B( ⋅) and \(\widetilde{N}(\cdot,\cdot )\). We will use the following duality formulas for Malliavin derivatives:
$$\mathbb{E}\Big{[}F\displaystyle\int _{0}^{T}\varphi (t)\mathrm{d}B(t)\Big{]} = \mathbb{E}\Big{[}\displaystyle\int _{0}^{T}\varphi (t)D_{t}F\mathrm{d}t\Big{]},$$(22.7)$$\mathbb{E}\Big{[}F\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}}\psi (t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)\Big{]} = \mathbb{E}\Big{[}\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}}\psi (t,z)D_{t,z}F\nu (\mathrm{d}z)\mathrm{d}t\Big{]},$$(22.8)valid for all Malliavin differentiable F and all ℱ t -predictable processes φ and ψ such that the integrals on the right converge absolutely. We also need the following basic properties of Malliavin derivatives:
If \(F \in \mathbb{D}_{1,2}\) is ℱ s -measurable, then
$$D_{t}F = D_{t,z}F = 0\quad \mbox{ for all }t > s.$$
(Fundamental theorem)
$$D_{t}\Big{(}\displaystyle\int _{0}^{T}u(s)\delta B(s)\Big{)} = u(t) +\displaystyle\int _{0}^{T}D_{t}u(s)\delta B(s),$$where ∫ 0 T u(s)δB(s) denotes the Skorohod integral of u with respect to B( ⋅). (See [11], pp. 35–38 for a definition of Skorohod integrals and for more details.)
provided that all terms involved are well defined. We refer to [3], [5], [6], [7], [10] and [11] for more information about the Malliavin calculus for Lévy processes and its applications.
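The Brownian duality formula above can be sanity-checked by Monte Carlo for a simple choice of F. The sketch below takes F = exp(B(T)), so that D_tF = exp(B(T)) for t ≤ T, and φ ≡ 1; both are hypothetical choices made only for illustration.

```python
import numpy as np

def duality_check(T=1.0, m=400000, seed=1):
    """Monte Carlo check of E[F * int_0^T phi dB] = E[int_0^T phi(t) D_t F dt]
    for F = exp(B(T)) (so D_t F = exp(B(T)) for t <= T) and phi = 1.
    Both sides equal T * exp(T/2); equality holds up to Monte Carlo error."""
    rng = np.random.default_rng(seed)
    BT = rng.normal(0.0, np.sqrt(T), size=m)  # samples of B(T) ~ N(0, T)
    lhs = np.mean(np.exp(BT) * BT)            # E[F * B(T)] = E[F * int phi dB]
    rhs = T * np.mean(np.exp(BT))             # E[int_0^T D_t F dt] = T * E[F]
    return lhs, rhs
```

With T = 1 both sides approximate e^{1/2}; the discrepancy shrinks at the usual Monte Carlo rate as m grows.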
-
(A3)
For all \((\pi,\theta ) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\), we assume that the following processes exist, for i = 1, 2:
$$\displaystyle\begin{array}{rcl} K_{i}(t) = g_{i}^{\prime}(X(T)) +\displaystyle\int _{ t}^{T}\displaystyle\int _{ \mathbb{R}_{0}} \frac{\partial f_{i}} {\partial x} (s,X(s),\pi,\theta,z_{1})\mu (\mathrm{d}z_{1})\mathrm{d}s,& & \end{array}$$(22.12)$$\displaystyle\begin{array}{rcl} H_{i}^{0}(s,x,\pi,\theta ) =& K_{ i}(s)b(s,x,\pi _{0},\theta _{0}) + D_{s}K_{i}(s)\sigma (s,x,\pi _{0},\theta _{0})& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}K_{i}(s)\gamma (s,x,\pi,\theta,z)\nu (\mathrm{d}z), & \end{array}$$(22.13)$$\displaystyle\begin{array}{rcl} G(t,s) :=& \exp \left (\displaystyle\int _{t}^{s}\{ \frac{\partial b} {\partial x}(r,X(r),\pi _{0}(r),\theta _{0}(r))\right. & \\ & -\frac{1} {2}\Big{(}\frac{\partial \sigma } {\partial x}{\Big{)}}^{2}(r,X(r),\pi _{ 0}(r),\theta _{0}(r))\}\mathrm{d}r & \\ & +\displaystyle\int _{t}^{s}\frac{\partial \sigma } {\partial x}(r,X(r),\pi _{0}(r),\theta _{0}(r))\mathrm{d}B(r) & \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\ln \left (1 + \frac{\partial \gamma } {\partial x}(r,X({r}^{-}),\pi ({r}^{-},z),\theta ({r}^{-},z),z)\right )\tilde{N}(\mathrm{d}r,\mathrm{d}z)& \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\{\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}(r,X(r),\pi,\theta,z)\Big{)} & \\ & -\frac{\partial \gamma } {\partial x}(r,X(r),\pi,\theta,z)\}\nu (\mathrm{d}z)\mathrm{d}r\Big{)}, & \end{array}$$(22.14)$$F_{i}(t,s) := \frac{\partial H_{i}^{0}} {\partial x} (s)G(t,s),$$(22.15)$$\displaystyle p_{i}(t) = K_{i}(t)+\displaystyle\int _{t}^{T} \frac{\partial H_{i}^{0}} {\partial x} (s,X(s),\pi _{0}(s),\pi _{1}(s,z),\theta _{0}(s),\theta _{1}(s,z))G(t,s)\mathrm{d}s,$$(22.16)$$q_{i}(t) = D_{t}p_{i}(t),$$(22.17)$$r_{i}(t,z) = D_{t,z}p_{i}(t),$$(22.18)all exist for 0 ≤ t ≤ s, z ∈ ℝ 0.
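In the continuous case (γ ≡ 0) with the coefficient derivatives frozen to constants, the exponential G(t, s) in Eq. (22.14) reduces to a geometric Brownian exponential. The sketch below checks pathwise, under these simplifying placeholder assumptions, that this exponential agrees with an Euler solution of the corresponding linear SDE.

```python
import numpy as np

def doleans_check(a=0.3, s=0.4, T=1.0, n=200000, seed=2):
    """Pathwise check, without jumps, that the exponential of (22.14),
    G(0, T) = exp((a - s^2/2) T + s B(T)), matches the Euler solution of
    the linear SDE dG = a G dt + s G dB, G(0) = 1. Here the constants a
    and s are placeholders for the frozen derivatives db/dx and dsigma/dx."""
    rng = np.random.default_rng(seed)
    dt = T / n
    dB = rng.normal(0.0, np.sqrt(dt), size=n)
    # Euler step G_{k+1} = G_k (1 + a dt + s dB_k) telescopes to a product.
    g_euler = np.prod(1.0 + a * dt + s * dB)
    g_exact = np.exp((a - 0.5 * s ** 2) * T + s * dB.sum())
    return g_euler, g_exact
```

The same exponential, with the extra logarithmic jump terms of (22.14), plays the role of the fundamental solution of the first-variation equation in the proof below.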
We now define the Hamiltonians for this general problem as follows:
Definition 2.1 (The General Stochastic Hamiltonian).
The general stochastic Hamiltonians for the stochastic differential game in Problem 1.1 are the functions
$$H_{i} : [0,T] \times \mathbb{R} \times U \times K \times \Omega \rightarrow \mathbb{R};\quad i = 1,2,$$
defined by
$$\displaystyle\begin{array}{rcl} H_{i}(t,x,\pi,\theta,\omega ) =& \displaystyle\int _{\mathbb{R}_{0}}f_{i}(t,x,\pi,\theta,z,\omega )\mu (\mathrm{d}z) + p_{i}(t)b(t,x,\pi _{0},\theta _{0},\omega ) & \\ & + D_{t}p_{i}(t)\sigma (t,x,\pi _{0},\theta _{0},\omega ) +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}p_{i}(t)\gamma (t,x,\pi,\theta,z,\omega )\nu (\mathrm{d}z),& \end{array}$$
where π = (π0, π1) and θ = (θ0, θ1).
Remark 2.1.
In the classical case, the Hamiltonian \(H_{i}^{{\ast}} : [0,T] \times \mathbb{R} \times U \times K \times \mathbb{R} \times \mathbb{R} \times \mathcal{R}\rightarrow \mathbb{R}\) is defined by
$$\displaystyle\begin{array}{rcl} H_{i}^{{\ast}}(t,x,\pi,\theta,p,q,r) =& \displaystyle\int _{\mathbb{R}_{0}}f_{i}(t,x,\pi,\theta,z)\mu (\mathrm{d}z) + pb(t,x,\pi _{0},\theta _{0}) + q\sigma (t,x,\pi _{0},\theta _{0})& \\ & +\displaystyle\int _{\mathbb{R}_{0}}r_{i}(t,z)\gamma (t,x,\pi,\theta,z)\nu (\mathrm{d}z),& \end{array}$$
where \(\mathcal{R}\) is the set of functions r i : ℝ ×ℝ 0 → ℝ; i = 1, 2; see [12]. Thus the relation between H i ∗ and H i is that
$$H_{i}(t,x,\pi,\theta,\omega ) = H_{i}^{{\ast}}(t,x,\pi,\theta,p_{i}(t),q_{i}(t),r_{i}(t,\cdot )),\quad i = 1,2,$$
where p( ⋅), q( ⋅) and r( ⋅, ⋅) are given by Eqs. (22.16)–(22.18).
Theorem 2.1 (Maximum principle for non-zero-sum games).
-
(i)
Let \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) be a Nash equilibrium with corresponding state process \(\hat{X}(t) = {X}^{(\hat{\pi },\hat{\theta })}(t)\) , i.e.,
$$\displaystyle\begin{array}{rcl} J_{1}(\pi,\hat{\theta })& \leq J_{1}(\hat{\pi },\hat{\theta }),\qquad \mbox{ for all }\pi \in \mathcal{A}_{\Pi },& \\ J_{2}(\hat{\pi },\theta )& \leq J_{2}(\hat{\pi },\hat{\theta }),\qquad \mbox{ for all }\theta \in \mathcal{A}_{\Theta }.& \\ \end{array}$$Assume that the random variables \(\frac{\partial f_{i}} {\partial x}\) and F i (t,s), i = 1,2, belong to \(\mathbb{D}_{1,2}\) . Then
$${\mathbb{E}}^{x}[\nabla _{ \pi }\hat{H}_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi,\hat{\theta },\omega )\vert _{ \pi =\hat{\pi }}\;\vert \mathcal{E}_{t}] = 0,$$(22.22)$${\mathbb{E}}^{x}[\nabla _{ \theta }\hat{H}_{2}(t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi },\theta,\omega )\vert _{ \theta =\hat{\theta }}\;\vert \mathcal{E}_{t}] = 0,$$(22.23)for a.a. t,ω.
-
(ii)
Conversely, suppose that there exists \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) such that Eqs. (22.22) and (22.23) hold. Then
$$\displaystyle\begin{array}{rcl} \frac{\partial } {\partial y}J_{1}(\hat{\pi } + y\beta,\hat{\theta }){\vert }_{y=0}& = 0\quad \text{for all}\;\beta,& \\ \frac{\partial } {\partial \upsilon }J_{2}(\hat{\pi },\hat{\theta } + \upsilon \eta ){\vert }_{\upsilon =0}& = 0\quad \text{for all}\;\eta.& \\ \end{array}$$In particular, if
$$\displaystyle\begin{array}{rcl} \pi \rightarrow J_{1}(\pi,\hat{\theta })\qquad \mbox{ and}\qquad \theta \rightarrow J_{2}(\hat{\pi },\theta ),& & \end{array}$$(22.24)are concave, then \((\hat{\pi },\hat{\theta })\) is a Nash equilibrium.
Proof.
-
(i)
Suppose \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) is a Nash equilibrium. Since (i) and (ii) hold for all π and θ, \((\hat{\pi },\hat{\theta })\) is a directional critical point for J i (π, θ), i = 1, 2, in the sense that for all bounded \(\beta \in \mathcal{A}_{\Pi }\) and \(\eta \in \mathcal{A}_{\Theta }\), there exists δ > 0 such that \(\hat{\pi } + y\beta \in \mathcal{A}_{\Pi }\), \(\hat{\theta } + \upsilon \eta \in \mathcal{A}_{\Theta }\) for all y, υ ∈ ( − δ, δ). Then we have
$$\displaystyle\begin{array}{rcl} 0& =& \frac{\partial } {\partial y}J_{1}\left (\hat{\pi } + y\beta,\hat{\theta }\right ){\vert }_{y=0} \\ & =& {\mathbb{E}}^{x} \left [\displaystyle\int _{ 0}^{T}\displaystyle\int _{ \mathbb{R}_{0}} \left \{\frac{\partial f_{1}} {\partial x} (t,\hat{X}(t),\hat{\pi }_{0}(t),\hat{\pi }_{1}(t,z),\hat{\theta }_{0}(t),\hat{\theta }_{1}(t,z),z) \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\hat{\pi }+y\beta,\hat{\theta })}(t){\vert }_{ y=0}\right.\right. \\ & & +\nabla _{\pi }f_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi _{ 0}(t),\pi _{1}(t,z),\hat{\theta }_{0}(t),\hat{\theta }_{1}(t,z),z){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(t)\}\mu (\mathrm{d}z)\mathrm{d}t \\ & & +g_{1}^{\prime}(\hat{X}(T)) \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\hat{\pi }+y\beta,\hat{\theta })}(T){\vert }_{ y=0}\Big{]} \\ & =& {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ 0}^{T}\displaystyle\int _{ \mathbb{R}_{0}}\{\frac{\partial f_{1}} {\partial x} (t,\hat{X}(t),\hat{\pi }_{0}(t),\hat{\pi }_{1}(t,z),\hat{\theta }_{0}(t),\hat{\theta }_{1}(t,z),z)Y (t) \\ & & +\nabla _{\pi }f_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi _{ 0}(t),\pi _{1}(t,z),\hat{\theta }_{0}(t),\hat{\theta _{1}}(t,z),z){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(t)\} \\ & & \times \mu (\mathrm{d}z)\mathrm{d}t+g_{1}^{\prime}(\hat{X}(T))Y (T)\Big{]}, \end{array}$$(22.25)where
$$\displaystyle\begin{array}{rcl} Y (t) =& {Y }^{(\beta )}(t) = \frac{\mathrm{d}} {\mathrm{d}y}{X}^{(\hat{\pi }+y\beta,\hat{\theta })}(t)\vert _{ y=0} & \\ =& \displaystyle\int _{0}^{t}\{ \frac{\partial b} {\partial x}(s,\hat{X}(s),\hat{\pi }_{0}(s),\hat{\theta }_{0}(s))Y (s) & \\ & +\nabla _{\pi }b(s,{X}^{(\pi,\hat{\theta })}(s),\pi _{0}(s),\hat{\theta }_{0}(s)){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(s)\}\mathrm{d}s & \\ & +\displaystyle\int _{0}^{t}\{\frac{\partial \sigma } {\partial x}(s,\hat{X}(s),\hat{\pi }_{0}(s),\hat{\theta }_{0}(s))Y (s) & \\ & +\nabla _{\pi }\sigma (s,{X}^{(\pi,\hat{\theta })}(s),\pi _{0}(s),\hat{\theta }_{0}(s)){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(s)\}\mathrm{d}B(s) & \\ & +\displaystyle\int _{0}^{t}\displaystyle\int _{\mathbb{R}_{0}}\{\frac{\partial \gamma } {\partial x}(s,\hat{X}({s}^{-}),\hat{\pi }({s}^{-}),\hat{\theta }({s}^{-}),z)Y (s) & \\ & +\nabla _{\pi }\gamma (s,{X}^{(\pi,\hat{\theta })}({s}^{-}),\pi ({s}^{-}),\hat{\theta }({s}^{-}),z){\vert }_{\pi =\hat{\pi }}{\beta }^{{\ast}}(s)\}\tilde{N}(\mathrm{d}s,\mathrm{d}z).& \end{array}$$(22.26)If we use the shorthand notation
$$\displaystyle\begin{array}{rcl} \frac{\partial f_{1}} {\partial x} (t,\hat{X}(t),\hat{\pi },\hat{\theta },z)=\frac{\partial f_{1}} {\partial x} (t,z),\;\;\nabla _{\pi }f_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi,\hat{\theta },z)\vert _{ \pi =\hat{\pi }}=\nabla _{\pi }f_{1}(t,z),& & \\ \end{array}$$and similarly for \(\frac{\partial b} {\partial x}\), ∇ π b, \(\frac{\partial \sigma } {\partial x}\), ∇ πσ, \(\frac{\partial \gamma } {\partial x}\) and ∇ πγ, we can write
$$\left \{\begin{array}{ll} \mathrm{d}Y (t) =&( \frac{\partial b} {\partial x}(t)Y (t)+\nabla _{\pi }b(t){\beta }^{{\ast}}(t))\mathrm{d}t+(\frac{\partial \sigma } {\partial x}(t)Y (t) + \nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t))\mathrm{d}B(t) \\ & +\displaystyle\int _{\mathbb{R}_{0}}(\frac{\partial \gamma } {\partial x}(t)Y (t) + \nabla _{\pi }\gamma (t,z){\beta }^{{\ast}}(t))\tilde{N}(\mathrm{d}t,\mathrm{d}z); \\ Y (0) & = 0.\end{array} \right.$$(22.27)By the duality formulas (22.7) and (22.8) and the Fubini theorem, we get
$$\displaystyle\begin{array}{rcl} & {\mathbb{E}}^{x}\left [\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (t,z)Y (t)\mu (\mathrm{d}z)\mathrm{d}t\right ] & \\ & \qquad \quad = {\mathbb{E}}^{x}\bigg{[}\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}}\{\displaystyle\int _{0}^{t}\Big{(}\frac{\partial f_{1}} {\partial x} (t,z)\Big{[} \frac{\partial b} {\partial x}(s)Y (s) + \nabla _{\pi }b(s){\beta }^{{\ast}}(s)\Big{]} & \\ & \qquad \qquad + D_{s}\frac{\partial f_{1}} {\partial x} (t,z)\Big{[}\frac{\partial \sigma } {\partial x}(s)Y (s) + \nabla _{\pi }\sigma (s){\beta }^{{\ast}}(s)\Big{]} & \\ & \qquad \qquad +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z_{1}} \frac{\partial f_{1}} {\partial x} (t,z)\Big{[}\frac{\partial \gamma } {\partial x}(s,z_{1})Y (s) & \\ & \qquad \qquad + \nabla _{\pi }\gamma (s,z_{1}){\beta }^{{\ast}}(s)\Big{]}\nu (\mathrm{d}z_{1})\Big{)}\mathrm{d}s\}\mu (\mathrm{d}z)\mathrm{d}t\bigg{]} & \\ & \qquad \quad = {\mathbb{E}}^{x}\bigg{[}\displaystyle\int _{0}^{T}\{\Big{(}\displaystyle\int _{s}^{T}\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (t,z)\mu (\mathrm{d}z)\mathrm{d}t\Big{)}\Big{[} \frac{\partial b} {\partial x}Y (s) + \nabla _{\pi }b(s){\beta }^{{\ast}}(s)\Big{]}& \\ & \qquad \qquad + \Big{(}\displaystyle\int _{s}^{T}\displaystyle\int _{\mathbb{R}_{0}}D_{s}\frac{\partial f_{1}} {\partial x} (t,z)\mu (\mathrm{d}z)\mathrm{d}t\Big{)}\Big{[}\frac{\partial \sigma } {\partial x}Y (s) + \nabla _{\pi }\sigma {\beta }^{{\ast}}(s)\Big{]} & \\ & \qquad \qquad +\displaystyle\int _{\mathbb{R}_{0}}\Big{(}\displaystyle\int _{s}^{T}\displaystyle\int _{\mathbb{R}_{0}}D_{s,z_{1}} \frac{\partial f_{1}} {\partial x} (t,z)\mu (\mathrm{d}z)\mathrm{d}t\Big{)} & \\ & \qquad \qquad \times \Big{[}\frac{\partial \gamma } {\partial x}Y (s) + \nabla _{\pi }\gamma {\beta }^{{\ast}}(s)\Big{]}\nu (\mathrm{d}z_{ 1})\}\mathrm{d}s\bigg{]}. 
& \end{array}$$(22.28)Changing notation s → t and z 1 → z, this becomes
$$\displaystyle\begin{array}{rcl} & {\mathbb{E}}^{x}\bigg{[}\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (t,z)Y (t)\mu (\mathrm{d}z)\mathrm{d}t\bigg{]} & \\ & \quad = {\mathbb{E}}^{x}\bigg{[}\displaystyle\int _{0}^{T}\{\Big{(}\displaystyle\int _{t}^{T}\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (s,z_{1})\mu (\mathrm{d}z_{1})\mathrm{d}s\Big{)}\Big{[} \frac{\partial b} {\partial x}(t)Y (t) + \nabla _{\pi }b(t){\beta }^{{\ast}}(t)\Big{]}& \\ & \qquad + \Big{(}\displaystyle\int _{t}^{T}\displaystyle\int _{\mathbb{R}_{0}}D_{t}\frac{\partial f_{1}} {\partial x} (s,z_{1})\mu (\mathrm{d}z_{1})\mathrm{d}s\Big{)}\Big{[}\frac{\partial \sigma } {\partial x}(t)Y (t) + \nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t)\Big{]} & \\ & \qquad +\displaystyle\int _{\mathbb{R}_{0}}\Big{(}\displaystyle\int _{t}^{T}\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\frac{\partial f_{1}} {\partial x} (s,z_{1})\mu (\mathrm{d}z_{1})\mathrm{d}s\Big{)} & \\ & \qquad \times \Big{[}\frac{\partial \gamma } {\partial x}(t,z)Y (t) + \nabla _{\pi }\gamma (t,z){\beta }^{{\ast}}(t)\Big{]}\nu (\mathrm{d}z)\}\mathrm{d}t\bigg{]}. & \end{array}$$(22.29)On the other hand, by the duality formulas (22.7) and (22.8), we get
$$\displaystyle\begin{array}{rcl} {\mathbb{E}}^{x } \Big{[} g_{ 1}^{\prime}(\hat{X}(T))Y (T)\Big{]}& = {\mathbb{E}}^{x}\Big{[}g_{ 1}^{\prime}(\hat{X}(T))\Big{(}\displaystyle\int _{0}^{T}\left \{ \frac{\partial b} {\partial x}(t)Y (t) + \nabla _{\pi }b(t){\beta }^{{\ast}}(t)\right \}\mathrm{d}t & \\ & \quad\quad \quad + \displaystyle\int _{0}^{T}\left \{\frac{\partial \sigma } {\partial x}(t)Y (t) + \nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t)\right \}\mathrm{d}B(t) & \\ & \quad\quad \quad + \displaystyle\int _{0}^{T}\displaystyle\int _{ \mathbb{R}_{0}}\left \{ \frac{\partial \gamma } {\partial x}(t,z)Y (t) + \nabla _{\pi }\gamma (t,z)\beta (t)\right \}\tilde{N}(\mathrm{d}t,\mathrm{d}z)\Big{)}\Big{]} & \\ & = {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ 0}^{T}\{g_{ 1}^{\prime}(\hat{X}(T)) \frac{\partial b} {\partial x}(t)Y (t) + g_{1}^{\prime}(\hat{X}(T))\nabla _{\pi }b(t){\beta }^{{\ast}}(t) & \\ & \quad\quad \quad + D_{t}(g_{1}^{\prime}(\hat{X}(T)))\frac{\partial \sigma } {\partial x}(t)Y (t) + D_{t}(g_{1}^{\prime}(\hat{X}(T)))\nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t)& \\ & \quad\quad \quad + \displaystyle\int _{\mathbb{R}_{0}}[D_{t,z}(g_{1}^{\prime}(\hat{X}(T)))\frac{\partial \gamma } {\partial x}(t,z)Y (t) & \\ & \quad\quad \quad + D_{t,z}(g_{1}^{\prime}(\hat{X}(T)))\nabla _{\pi }\gamma (t,z){\beta }^{{\ast}}(t)]\nu (\mathrm{d}z)\}\mathrm{d}t\Big{]}. & \\ \end{array}$$We recall that
$$\hat{K}_{1}(t) := g_{1}^{\prime}(\hat{X}(T)) +\displaystyle\int _{ t}^{T}\displaystyle\int _{ \mathbb{R}_{0}} \frac{\partial f_{1}} {\partial x} (s,z_{1})\mu (\mathrm{d}z_{1})\;\mathrm{d}s,$$and combining Eqs. (22.27)–(22.29), we get
$$\displaystyle\begin{array}{rcl} & {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{0}^{T}\{\hat{K}_{1}(t)\Big{(} \frac{\partial b} {\partial x}(t)Y (t) + \nabla _{\pi }b(t){\beta }^{{\ast}}(t)\Big{)} & \\ & \qquad \qquad + D_{t}\hat{K}_{1}(t)\Big{(}\frac{\partial \sigma } {\partial x}(t)Y (t) + \nabla _{\pi }\sigma (t){\beta }^{{\ast}}(t)\Big{)} & \\ & \qquad \qquad +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\hat{K}_{1}(t)\Big{(}\frac{\partial \gamma } {\partial x}(t,z)Y (t) + \nabla _{\pi }\gamma (t,z){\beta }^{{\ast}}(t)\Big{)}\nu (\mathrm{d}z)& \\ & \qquad \qquad +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(t,z){\beta }^{{\ast}}(t)\mu (\mathrm{d}z)\}\mathrm{d}t\Big{]} = 0. & \end{array}$$(22.30)Now apply this to \(\beta = \beta _{\alpha } \in \mathcal{A}_{\Pi }\) of the form βα(s) = αχ[t, t + h](s), for some t, h ∈ (0, T), t + h ≤ T, where α = α(ω) is bounded and ℰ t -measurable. Then \({Y }^{(\beta _{\alpha })}(s) = 0\) for 0 ≤ s ≤ t. Hence Eq. (22.30) becomes
$$A_{1} + A_{2} = 0,$$(22.31)where
$$\displaystyle\begin{array}{rcl} A_{1} =\ & {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{t}^{T}\{\hat{K}_{1}(s) \frac{\partial b} {\partial x}(s) + D_{s}\hat{K}_{1}(s)\frac{\partial \sigma } {\partial x}(s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{1}(s)\frac{\partial \gamma } {\partial x}(s)\nu (\mathrm{d}z)\}{Y }^{(\beta _{\alpha })}(s)\mathrm{d}s\Big{]},& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} A_{2} =\ & {\mathbb{E}}^{x}\Big{[}\{\displaystyle\int _{t}^{t+h} \Big{(}\hat{K}_{1}(s)\nabla _{\pi }b(s) + D_{s}\hat{K}_{1}(s) \nabla _{\pi }\sigma (s)& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{1}(s)\nabla _{\pi }\gamma (s,z)\nu (\mathrm{d}z) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(s,z)\mu (\mathrm{d}z)\Big{)}\mathrm{d}s\}\alpha \Big{]}. & \\ \end{array}$$Note that, by Eq. (22.26), with \(Y (s) = {Y }^{(\beta _{\alpha })}(s)\) and s ≥ t + h,
$$\displaystyle\begin{array}{rcl} \mathrm{d}Y (s) = Y ({s}^{-})\{ \frac{\partial b} {\partial x}(s)\mathrm{d}s + \frac{\partial \sigma } {\partial x}(s)\mathrm{d}B(s) +\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial \gamma } {\partial x}({s}^{-},z)\tilde{N}(\mathrm{d}s,\mathrm{d}z)\}\,,& & \\ \end{array}$$for s ≥ t + h. Hence, by the Itô formula,
$$Y (s) = Y (t + h)G(t + h,s);\qquad s \geq t + h,$$(22.32)where, in general, for s ≥ t,
$$\displaystyle\begin{array}{rcl} G(t,s) =& \exp \Big{(}\displaystyle\int _{t}^{s}\{ \frac{\partial b} {\partial x}(r) -\tfrac{1} {2}\Big{(}\frac{\partial \sigma } {\partial x}{\Big{)}}^{2}(r)\}\mathrm{d}r +\displaystyle\int _{ t}^{s}\frac{\partial \sigma } {\partial x}(r)\mathrm{d}B(r)& \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}({r}^{-},z)\Big{)}\tilde{N}(\mathrm{d}r,\mathrm{d}z) & \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\{\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}(r,z)\Big{)} -\frac{\partial \gamma } {\partial x}(r,z)\}\nu (\mathrm{d}z)\mathrm{d}r\Big{)}.& \end{array}$$(22.33)Note that G(t, s) does not depend on h. Put
$$\displaystyle\begin{array}{rcl} H_{1}^{0}(s,x,\pi,\theta ) =& K_{ 1}(s)b(s,x,\pi _{0},\theta _{0}) + D_{s}K_{1}(s)\sigma (s,x,\pi _{0},\theta _{0})& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}K_{1}(s)\gamma (s,x,\pi,\theta,z)\nu (\mathrm{d}z), & \end{array}$$(22.34)and \(\hat{H}_{1}^{0}(s) = H_{1}^{0}(s,\hat{X}(s),\hat{\pi },\hat{\theta })\). Then
$$A_{1} = {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t}^{T}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (s)\mathrm{d}s\Big{]}.$$Differentiating with respect to h at h = 0 we get
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}} {\mathrm{d}h}A_{1}{\vert }_{h=0}& =& \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t}^{t+h}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (s)\mathrm{d}s\Big{]}_{h=0} \\ & & + \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t+h}^{T}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (s)\mathrm{d}s\Big{]}_{h=0}. \end{array}$$(22.35)Since Y (t) = 0 we see that
$$\frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t}^{t+h}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (s)\mathrm{d}s\Big{]}_{h=0} = 0.$$(22.36)Therefore, by Eq. (22.31),
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}} {\mathrm{d}h}A_{1}{\vert }_{h=0}& = \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\displaystyle\int _{ t+h}^{T}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (t + h)G(t + h,s)\mathrm{d}s\Big{]}_{h=0}& \\ & =\displaystyle\int _{ t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)Y (t + h)G(t + h,s)\Big{]}_{h=0}\mathrm{d}s & \\ & =\displaystyle\int _{ t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)G(t,s)Y (t + h)\Big{]}_{h=0}\mathrm{d}s. & \end{array}$$(22.37)On the other hand, Eq. (22.26) gives
$$\displaystyle\begin{array}{rcl} Y (t+ h) =& \alpha \displaystyle\int _{t}^{t+h}\{\nabla _{\pi }b(r)\mathrm{d}r + \nabla _{\pi }\sigma \mathrm{d}B(r) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }\gamma ({r}^{-},z)\tilde{N}(\mathrm{d}r,\mathrm{d}z)\} & \\ & +\displaystyle\int _{t}^{t+h}Y ({r}^{-})\{ \frac{\partial b} {\partial x}(r)\mathrm{d}r + \frac{\partial \sigma } {\partial x}(r)\mathrm{d}B(r) +\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial \gamma } {\partial x}({r}^{-},z)\tilde{N}(\mathrm{d}r,\mathrm{d}z)\}.& \end{array}$$(22.38)Combining this with Eqs. (22.36) and (22.37), we have
$$\frac{\mathrm{d}} {\mathrm{d}h}A_{1}{\vert }_{h=0} = \Lambda _{1} + \Lambda _{2},$$(22.39)where
$$\displaystyle\begin{array}{rcl} \Lambda _{1} =& \displaystyle\int _{t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)G(t,s)\alpha \displaystyle\int _{t}^{t+h}\{\nabla _{ \pi }b(r)\mathrm{d}r + \nabla _{\pi }\sigma (r)\mathrm{d}B(r)& \\ & +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }\gamma ({r}^{-},z)\tilde{N}(\mathrm{d}r,\mathrm{d}z)\}\Big{]}_{h=0}\mathrm{d}s & \end{array}$$(22.40)and
$$\displaystyle\begin{array}{rcl} \Lambda _{2} =& \displaystyle\int _{t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)G(t,s)\displaystyle\int _{t}^{t+h}Y ({r}^{-})\{ \frac{\partial b} {\partial x}(r)\mathrm{d}r + \frac{\partial \sigma } {\partial x}(r)\mathrm{d}B(r)& \\ & +\displaystyle\int _{\mathbb{R}_{0}} \frac{\partial \gamma } {\partial x}({r}^{-},z)\tilde{N}(\mathrm{d}r,\mathrm{d}z)\}\Big{]}_{ h=0}\mathrm{d}s. & \end{array}$$(22.41)By the duality formulae (22.7) and (22.8), we have
$$\displaystyle\begin{array}{rcl} \Lambda _{1} =& \displaystyle\int _{t}^{T} \frac{\mathrm{d}} {\mathrm{d}h}{\mathbb{E}}^{x}\Big{[}\alpha \displaystyle\int _{ t}^{t+h}\{\nabla _{ \pi }b(r)F_{1}(t,s) + \nabla _{\pi }\sigma (r)D_{r}F_{1}(t,s)& \\ & +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }\gamma (r,z)D_{r,z}F_{1}(t,s)\nu (\mathrm{d}z)\}\mathrm{d}r\Big{]}_{h=0}\mathrm{d}s & \\ =& \displaystyle\int _{t}^{T}{\mathbb{E}}^{x}\Big{[}\alpha \{\nabla _{\pi }b(t)F_{1}(t,s) + \nabla _{\pi }\sigma (t)D_{t}F_{1}(t,s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }\gamma (t,z)D_{t,z}F_{1}(t,s)\nu (\mathrm{d}z)\}\Big{]}\mathrm{d}s. & \end{array}$$(22.42)Since Y (t) = 0 we see that
$$\Lambda _{2} = 0.$$(22.43)We conclude that
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}} {\mathrm{d}h}A_{1}{\vert }_{h=0} =& \Lambda _{1} & \\ =& \displaystyle\int _{t}^{T}{\mathbb{E}}^{x}\Big{[}\alpha \{F_{1}(t,s)\nabla _{\pi }b(t) + D_{t}F_{1}(t,s)\nabla _{\pi }\sigma (t)& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}F_{1}(t,s)\nabla _{\pi }\gamma (t,z)\nu (\mathrm{d}z)\}\Big{]}\mathrm{d}s. & \end{array}$$(22.44)Moreover, we see directly that
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}} {\mathrm{d}h}A_{2}{\vert }_{h=0} =& {\mathbb{E}}^{x}\Big{[}\alpha \{\hat{K}_{1}(t)\nabla _{\pi }b(t) + D_{t}\hat{K}_{1}(t)\nabla _{\pi }\sigma (t) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\hat{K}_{1}(t)\nabla _{\pi }\gamma (t,z)\nu (\mathrm{d}z) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(t,z)\mu (\mathrm{d}z)\}\Big{]}.& \end{array}$$(22.45)Therefore, differentiating Eq. (22.30) with respect to h at h = 0 gives the equation
$$\displaystyle\begin{array}{rcl} &{\mathbb{E}}^{x}\Big{[}\alpha \{\Big{(}\hat{K}_{1}(t) +\displaystyle\int _{ t}^{T}F_{1}(t,s)\mathrm{d}s\Big{)}\nabla _{\pi }b(t) + D_{t}\Big{(}\hat{K}_{1}(t) +\displaystyle\int _{ t}^{T}F_{1}(t,s)\mathrm{d}s\Big{)}\nabla _{\pi }\sigma (t)& \\ & \quad +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\Big{(}\hat{K}_{1}(t) +\displaystyle\int _{ t}^{T}F_{1}(t,s)\mathrm{d}s\Big{)}\nabla _{\pi }\gamma (t,z)\nu (\mathrm{d}z) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(t,z)\mu (\mathrm{d}z)\}\Big{]} = 0. & \end{array}$$(22.46)We can reformulate this as follows: If we define, as in Eq. (22.16),
$$\hat{p}_{1}(t) =\hat{ K}_{1}(t) +\displaystyle\int _{ t}^{T}F_{ 1}(t,s)\mathrm{d}s =\hat{ K}_{1}(t) +\displaystyle\int _{ t}^{T}\frac{\partial \hat{H}_{1}^{0}} {\partial x} (s)G(t,s)\mathrm{d}s,$$(22.47)then Eq. (22.46) can be written:
$$\displaystyle\begin{array}{rcl}{ \mathbb{E}}^{x}\Big{[}& \nabla _{ \pi }\{\displaystyle\int _{\mathbb{R}_{0}}f_{1}(t,\hat{X}(t),\pi,\hat{\theta },z)\mu (\mathrm{d}z) +\hat{ p}_{1}(t)b(t,\hat{X}(t),\pi _{0},\hat{\theta }_{0}) & \\ & \quad + D_{t}\hat{p}_{1}(t)\sigma (t,\hat{X}(t),\pi _{0},\hat{\theta }_{0}) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{t,z}\hat{p}_{1}(t)\gamma (t,\hat{X}(t),\pi,\hat{\theta },z)\nu (\mathrm{d}z)\}_{\pi =(\pi _{0}(t),\pi _{1}(t,z))}\alpha \Big{]} = 0.& \\ \end{array}$$Since this holds for all bounded ℰ t -measurable random variable α, we conclude that
$${\mathbb{E}}^{x}\Big{[}\nabla _{ \pi }\hat{H}_{1}(t,{X}^{(\pi,\hat{\theta })}(t),\pi,\hat{\theta })\mid _{ \pi =\hat{\pi }(t)}\mid \mathcal{E}_{t}\Big{]} = 0.$$Similarly, we have
$$\displaystyle\begin{array}{rcl} 0=& \frac{\partial } {\partial \upsilon }J_{2}(\hat{\pi },\hat{\theta } + \upsilon \eta ){\vert }_{\upsilon =0} & \\ =& {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{0}^{T}\displaystyle\int _{\mathbb{R}_{0}}\{\frac{\partial f_{2}} {\partial x} (t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi }(t,z),\hat{\theta }(t,z),z)D(t) & \\ & + \nabla _{\theta }f_{2}(t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi }(t,z),\theta (t,z),z){\vert }_{\theta =\hat{\theta }}\eta (t)\}\mu (\mathrm{d}z)\mathrm{d}t+g_{2}^{\prime}(\hat{X}(T))D(T) \Big{]},& \end{array}$$(22.48)where
$$\displaystyle\begin{array}{rcl} D(t) =& {D}^{(\eta )}(t) = \frac{\mathrm{d}} {\mathrm{d}\upsilon }{X}^{(\hat{\pi },\hat{\theta }+\upsilon \eta )}(t){\vert }_{ \upsilon =0} & \\ =& \displaystyle\int _{0}^{t}\{ \frac{\partial b} {\partial x}(s,\hat{X}(s),\hat{\pi }_{0}(s),\hat{\theta }_{0}(s))D(s) & \\ & +\nabla _{\theta }b(s,{X}^{(\hat{\pi }_{0},\theta _{0})}(s),\hat{\pi }_{0}(s),\theta _{0}(s)){\vert }_{\theta =\hat{\theta }}{\eta }^{{\ast}}(s)\}\mathrm{d}s & \\ & +\displaystyle\int _{0}^{t}\{\frac{\partial \sigma } {\partial x}(s,\hat{X}(s),\hat{\pi }_{0}(s),\hat{\theta }_{0}(s))D(s) & \\ & +\nabla _{\theta }\sigma (s,{X}^{(\hat{\pi },\theta )}(s),\hat{\pi }_{0}(s),\theta _{0}(s)){\vert }_{\theta =\hat{\theta }}{\eta }^{{\ast}}(s)\}\mathrm{d}B(s) & \\ & +\displaystyle\int _{0}^{t}\displaystyle\int _{\mathbb{R}_{0}}\{\frac{\partial \gamma } {\partial x}(s,\hat{X}({s}^{-}),\hat{\pi }({s}^{-},z),\hat{\theta }({s}^{-}),z)D(s) & \\ & +\nabla _{\theta }\gamma (s,{X}^{(\hat{\pi },\theta )}({s}^{-}),\hat{\pi }({s}^{-},z),\theta ({s}^{-},z),z){\vert }_{\theta =\hat{\theta }}{\eta }^{{\ast}}(s)\}\tilde{N}(\mathrm{d}s,\mathrm{d}z).& \end{array}$$(22.49)

Define
$$D(s) = D(t + h)G(t + h,s);\quad s \geq t + h,$$

where G(t, s) is defined as in Eq. (22.32). By arguments similar to those above, we get
$${\mathbb{E}}^{x}\Big{[}\nabla _{ \theta }\hat{H}_{2}(t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi },\theta )\mid _{ \theta =\hat{\theta }(t)}\mid \mathcal{E}_{t}\Big{]} = 0.$$

This completes the proof of (i).
-
(ii)
Conversely, suppose that there exists \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) such that Eqs. (22.22) and (22.23) hold. Then by reversing the above arguments, we obtain that Eq. (22.31) holds for all \(\beta _{\alpha }(s,\omega ) = \alpha (\omega )\chi _{(t,t+h]}(s) \in \mathcal{A}_{\Pi }\), where
$$\displaystyle\begin{array}{rcl} A_{1} =& {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{t}^{T}\{\hat{K}_{1}(s) \frac{\partial b} {\partial x}(s) + D_{s}\hat{K}_{1}(s)\frac{\partial \sigma } {\partial x}(s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{1}(s)\frac{\partial \gamma } {\partial x}(s)\nu (\mathrm{d}z)\}{Y }^{(\beta _{\alpha })}(s)\mathrm{d}s\Big{]},& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} A_{2} =& {\mathbb{E}}^{x}\Big{[}\{\displaystyle\int _{t}^{t+h}\Big{(}\hat{K}_{1}(s)\nabla _{\pi }b(s) + D_{s}\hat{K}_{1}(s)\nabla _{\pi }\sigma (s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{1}(s)\nabla _{\pi }\gamma (s,z)\nu (\mathrm{d}z) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\pi }f_{1}(s,z)\mu (\mathrm{d}z)\Big{)}\mathrm{d}s\}\alpha \Big{]},& \\ \end{array}$$

for some t, h ∈ [0, T] with t + h ≤ T and some bounded ℰ t -measurable α. Similarly,
$$A_{3} + A_{4} = 0$$(22.50)

for all \(\eta _{\xi }(s,\omega ) = \xi (\omega )\chi _{(t,t+h]}(s) \in \mathcal{A}_{\Theta }\), where
$$\displaystyle\begin{array}{rcl} A_{3} =& {\mathbb{E}}^{x}\Big{[}\displaystyle\int _{t}^{T}\{\hat{K}_{2}(s) \frac{\partial b} {\partial x}(s) + D_{s}\hat{K}_{2}(s)\frac{\partial \sigma } {\partial x}(s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{2}(s)\frac{\partial \gamma } {\partial x}(s)\nu (\mathrm{d}z)\}{Y }^{(\eta _{\xi })}(s)\mathrm{d}s\Big{]},& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} A_{4} =& {\mathbb{E}}^{x}\Big{[}\{\displaystyle\int _{t}^{t+h} \Big{(}\hat{K}_{2}(s)\nabla _{\theta }b(s) + D_{s}\hat{K}_{2}(s) \nabla _{\theta }\sigma (s) & \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}_{2}(s)\nabla _{\theta }\gamma (s,z)\nu (\mathrm{d}z) +\displaystyle\int _{\mathbb{R}_{0}}\nabla _{\theta }f_{2}(s,z)\mu (\mathrm{d}z)\Big{)}\mathrm{d}s\}\xi \Big{]},& \\ \end{array}$$

for some t, h ∈ [0, T] with t + h ≤ T and some bounded ℰ t -measurable ξ. Hence, these equalities hold for all linear combinations of \(\beta _{\alpha }\) and \(\eta _{\xi }\). Since all bounded \(\beta \in \mathcal{A}_{\Pi }\) and \(\eta \in \mathcal{A}_{\Theta }\) can be approximated pointwise boundedly in (t, ω) by such linear combinations, it follows that Eqs. (22.31) and (22.50) hold for all bounded \((\beta,\eta ) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\). Hence, by reversing the remaining part of the proof above, we conclude that
$$\frac{\partial } {\partial y}J_{1}(\hat{\pi } + y\beta,\hat{\theta }){\vert }_{y=0} = 0,$$$$\frac{\partial } {\partial \upsilon }J_{2}(\hat{\pi },\hat{\theta } + \upsilon \eta ){\vert }_{\upsilon =0} = 0,$$for all β and η.
3 Zero-Sum Games
Suppose that the given performance functional of player I is the negative of that of player II, i.e.,
where \({\mathbb{E}}^{x} = \mathbb{E}_{P}^{x}\) denotes the expectation with respect to P given that X(0) = x. Suppose that the controls u 0(t) and u 1(t, z) have the form given in Eqs. (22.5) and (22.6). Let \(\mathcal{A}_{\Pi }\) and \(\mathcal{A}_{\Theta }\) denote the given families of controls π = (π0, π1) and θ = (θ0, θ1), respectively, contained in the set of càdlàg ℰ t -adapted controls and such that Eq. (22.1) has a unique strong solution up to time T and
Then the partial information zero-sum stochastic differential game problem is the following:
Problem 3.1.
Find \(\Phi _{\mathbb{E}}\in \mathbb{R}\), \({\pi }^{{\ast}}\in \mathcal{A}_{\Pi }\) and \({\theta }^{{\ast}}\in \mathcal{A}_{\Theta }\) (if it exists) such that
Such a control (π ∗ , θ ∗ ), if it exists, is called an optimal control. The intuitive idea is that player I controls π while player II controls θ. The actions of the players are antagonistic: the payoff J(π, θ) is a reward for player I and a cost for player II. Note that since we allow b, σ, γ, f and g to be stochastic processes, and since our controls are ℰ t -adapted, this problem is not of Markovian type and hence cannot be solved by dynamic programming.
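Because the problem is non-Markovian, closed-form solutions are rare, but for fixed controls the payoff can always be estimated by simulation. The sketch below is purely illustrative and not from the paper: it estimates a generic performance functional of the form J(π, θ) = E[∫₀ᵀ f dt + g(X(T))] for constant controls and a toy diffusion without jumps; the coefficients b, σ, f, g are hypothetical stand-ins chosen only to make the example runnable.

```python
import numpy as np

rng = np.random.default_rng(1)

T, n_steps, n_paths = 1.0, 200, 2000
dt = T / n_steps

# Toy coefficients -- all hypothetical, standing in for the paper's
# (possibly random) b, sigma, f and g; jumps are omitted for brevity.
def b(x, pi, th):     return th * pi * x                       # drift
def sigma(x, pi, th): return 0.2 * pi * x                      # diffusion
def f(x, pi, th):     return -0.5 * th**2 * np.ones_like(x)    # running term
def g(x):             return np.log(np.maximum(x, 1e-12))      # terminal term

def J(pi, th, x0=1.0):
    """Monte Carlo estimate of E^x[ int_0^T f dt + g(X(T)) ] via Euler steps."""
    x = np.full(n_paths, x0)
    running = np.zeros(n_paths)
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        running += f(x, pi, th) * dt
        x = x + b(x, pi, th) * dt + sigma(x, pi, th) * dB
    return float((running + g(x)).mean())

# Player I picks pi to maximize, player II picks theta to minimize:
print(J(pi=0.5, th=0.05))
```

A saddle point of such a payoff surface over (π, θ) is what Problem 3.1 asks for; Theorem 3.1 replaces any brute-force search by first-order conditions on the Hamiltonian.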
Theorem 3.1 (Maximum principle for zero-sum games).
-
(i)
Suppose \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) is a directional critical point for J(π,θ) in the sense that for all bounded \(\beta \in \mathcal{A}_{\Pi }\) and \(\eta \in \mathcal{A}_{\Theta }\) , there exists δ > 0 such that \(\hat{\pi } + y\beta \in \mathcal{A}_{\Pi }\) , \(\hat{\theta } + \upsilon \eta \in \mathcal{A}_{\Theta }\) for all y,υ ∈ (−δ,δ) and
$$c(y,\upsilon ) := J(\hat{\pi } + y\beta,\hat{\theta } + \upsilon \eta ),\quad y,\upsilon \in (-\delta,\delta )$$

has a critical point at 0, i.e.,
$$\frac{\partial c} {\partial y}(0,0) = \frac{\partial c} {\partial \upsilon }(0,0) = 0.$$(22.54)

Then
$${\mathbb{E}}^{x}[\nabla _{ \pi }\hat{H}(t,{X}^{(\pi,\hat{\theta })}(t),\pi,\hat{\theta },\omega )\vert \mathcal{E}_{ t}]_{\pi =\hat{\pi }} = 0,$$(22.55)$${\mathbb{E}}^{x}[\nabla _{ \theta }\hat{H}(t,{X}^{(\hat{\pi },\theta )}(t),\hat{\pi },\theta,\omega )\vert \mathcal{E}_{ t}]_{\theta =\hat{\theta }} = 0\quad \mbox{ for a.a. }t,\omega,$$(22.56)

where
$$\displaystyle\begin{array}{rcl} & \hat{X}(t) = {X}^{(\hat{\pi },\hat{\theta })}(t),& \\ \end{array}$$$$\displaystyle\begin{array}{rcl} \hat{H}(t,\hat{X}(t),\pi,\theta )& =& \displaystyle\int _{\mathbb{R}_{0}}f(t,\hat{X}(t),\pi,\theta,z)\mu (\mathrm{d}z)+\hat{p}(t)b(t,\hat{X}(t),\pi _{0},\theta _{0}) \\ & & +\hat{q}(t)\sigma (t,\hat{X}(t),\pi _{0},\theta _{0})+\displaystyle\int _{\mathbb{R}_{0}}\hat{r}(t,z)\gamma (t,\hat{X}({t}^{-}),\pi,\theta,z)\nu (\mathrm{d}z), \\ & & \end{array}$$(22.57)

with
$$\hat{p}(t) =\hat{ K}(t) +\displaystyle\int _{ t}^{T}\frac{\partial \hat{{H}}^{0}} {\partial x} (s,\hat{X}(s),\hat{\pi }(s),\hat{\theta }(s))\,\hat{G}(t,s)\mathrm{d}s,$$(22.58)$$\hat{K}(t) = {K}^{(\hat{\pi },\hat{\theta })}(t) = g^{\prime}(\hat{X}(T)) +\displaystyle\int _{ t}^{T}\displaystyle\int _{ \mathbb{R}_{0}} \frac{\partial f} {\partial x}(s,\hat{X}(s),\hat{\pi }(s,z),\hat{\theta }(s,z),z)\mu (\mathrm{d}z)\mathrm{d}s,$$(22.59)$$\displaystyle\begin{array}{rcl} \hat{{H}}^{0}(s,\hat{X},\hat{\pi },\hat{\theta }) =& \hat{K}(s)b(s,\hat{X},\hat{\pi }_{ 0},\hat{\theta }_{0}) + D_{s}\hat{K}(s)\sigma (s,\hat{X},\hat{\pi }_{0},\hat{\theta }_{0})& \\ & +\displaystyle\int _{\mathbb{R}_{0}}D_{s,z}\hat{K}(s)\gamma (s,\hat{X},\hat{\pi },\hat{\theta },z)\nu (\mathrm{d}z), & \end{array}$$(22.60)$$\displaystyle\begin{array}{rcl} \hat{G}(t,s) :=& \exp \Big{(}\displaystyle\int _{t}^{s}\{ \frac{\partial b} {\partial x}(r,\hat{X}(r),\hat{\pi }_{0}(r),\hat{\theta }_{0}(r)) & \\ & -\frac{1} {2}\Big{(}\frac{\partial \sigma } {\partial x}{\Big{)}}^{2}(r,\hat{X}(r),\hat{\pi }_{ 0}(r),\hat{\theta }_{0}(r))\}\mathrm{d}r & \\ & +\displaystyle\int _{t}^{s}\frac{\partial \sigma } {\partial x}(r,\hat{X}(r),\hat{\pi }_{0}(r),\hat{\theta }_{0}(r))\mathrm{d}B(r) & \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}(r,\hat{X}({r}^{-}),\hat{\pi }({r}^{-},z),\hat{\theta }({r}^{-},z),z)\Big{)}\tilde{N}(\mathrm{d}r,\mathrm{d}z)& \\ & +\displaystyle\int _{t}^{s}\displaystyle\int _{\mathbb{R}_{0}}\{\ln \Big{(}1 + \frac{\partial \gamma } {\partial x}(r,\hat{X}(r),\hat{\pi },\hat{\theta },z)\Big{)} & \\ & -\frac{\partial \gamma } {\partial x}(r,\hat{X}(r),\hat{\pi },\hat{\theta },z)\}\nu (\mathrm{d}z)\mathrm{d}r\Big{)}; & \\ \end{array}$$$$\hat{q}(t) := D_{t}\hat{p}(t),$$

and
$$\hat{r}(t,z) := D_{t,z}\hat{p}(t).$$(22.61) -
(ii)
Conversely, suppose that there exists \((\hat{\pi },\hat{\theta }) \in \mathcal{A}_{\Pi } \times \mathcal{A}_{\Theta }\) such that Eqs. (22.55) and (22.56) hold. Furthermore, suppose that g is an affine function and that H is concave in π and convex in θ. Then \((\hat{\pi },\hat{\theta })\) satisfies Eq. (22.54).
4 Application: Worst-Case Scenario Optimal Portfolio Under Partial Information
We illustrate the results in the previous section by looking at an application to robust portfolio choice in finance:
Consider a financial market with the following two investment possibilities:
-
1.
A risk-free asset, where the unit price S 0(t) at time t is
$$\mathrm{d}S_{0}(t) = r(t)S_{0}(t)\mathrm{d}t;\quad S_{0}(0) = 1;\quad 0 \leq t \leq T,$$

where T > 0 is a given constant.
-
2.
A risky asset, where the unit price S 1(t) at time t is given by
$$\left \{\begin{array}{ll} \mathrm{d}S_{1}(t) = S_{1}({t}^{-})[\theta (t)\mathrm{d}t + \sigma _{0}(t)\mathrm{d}B(t) +\displaystyle\int _{\mathbb{R}_{0}} \gamma _{0}(t,z)\tilde{N}(\mathrm{d}t,\mathrm{d}z)], \\ S_{1}(0) > 0,\end{array} \right.$$(22.62)

where r, θ, σ0 and γ0 are predictable processes such that
We assume that θ is adapted to a given subfiltration ℰ t and that
for some constant δ > 0.
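For intuition, the price dynamics (22.62) can be simulated with an Euler scheme. The snippet below is a minimal sketch, not part of the paper: it assumes constant coefficients (the paper allows predictable processes) and takes γ₀(t, z) = z with normally distributed jump sizes, so the compensated jump measure contributes the realized jumps minus their mean rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical constant coefficients, chosen only for illustration.
theta, sigma0 = 0.07, 0.2
jump_rate, jump_mean, jump_sd = 0.5, -0.1, 0.05  # compound-Poisson jumps
T, n_steps = 1.0, 500
dt = T / n_steps

def simulate_S1(S1_0=1.0):
    """One Euler path of dS1 = S1^- [theta dt + sigma0 dB + int z Ntilde(dt,dz)]."""
    S = S1_0
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))
        n_jumps = rng.poisson(jump_rate * dt)
        jump_sum = rng.normal(jump_mean, jump_sd, size=n_jumps).sum()
        # Compensated increment: realized jumps minus the compensator drift.
        dN_tilde = jump_sum - jump_rate * jump_mean * dt
        S *= 1.0 + theta * dt + sigma0 * dB + dN_tilde
    return S

print(simulate_S1())
```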
Let π(t) = π(t, ω) be a portfolio, representing the amount invested in the risky asset at time t. We require that π be càdlàg, ℰ t -adapted and self-financing; hence the corresponding wealth X(t) = X (π, θ)(t) at time t is given by
Let us assume that the mean relative growth rate θ(t) of the risky asset is not known to the trader, but is subject to uncertainty. We may regard θ as a market scenario, or as a stochastic control of the market, playing against the trader. Let \(\mathcal{A}_{\Pi }^{\epsilon }\) and \(\mathcal{A}_{\Theta }^{\epsilon }\) denote the sets of admissible controls π and θ, respectively. The worst-case scenario optimal portfolio problem under partial information for the trader is to find \({\pi }^{{\ast}}\in \mathcal{A}_{\Pi }^{\epsilon }\), \({\theta }^{{\ast}}\in \mathcal{A}_{\Theta }^{\epsilon }\) and Φ ∈ ℝ such that
where U : [0, ∞) → ℝ is a given utility function, assumed to be concave, strictly increasing and \({\mathcal{C}}^{1}\) on (0, ∞). We want to study this problem by using Theorem 3.1. In this case we have
and
where
Hence,
and
With this value for p(t) we have
Hence Eq. (22.55) becomes
and Eq. (22.56) becomes
Since p(t) > 0 we conclude that
This implies that
and
Substituting this into Eq. (22.69), we get
We have proved the following theorem:
Theorem 4.1 (Worst-case scenario optimal portfolio under partial information).
Suppose there exists a solution \(({\pi }^{{\ast}},{\theta }^{{\ast}}) \in (\mathcal{A}_{\Pi }^{\epsilon },\mathcal{A}_{\Theta }^{\epsilon })\) of the stochastic differential game Eq. (22.64). Then
and
In particular, if r(s) is deterministic, then
Remark 4.1.
-
(i)
If r(s) is deterministic, then Eq. (22.77) states that the worst-case scenario is when \(\hat{\theta }(t) = r(t)\), for all t ∈ [0, T], i.e., when the normalized risky asset price
$${\mathrm{e}}^{-\displaystyle\int _{0}^{t}r(s)\mathrm{d}s }S_{1}(t)$$

is a martingale. In such a situation the trader might as well put all her money in the risk-free asset, i.e., choose \(\pi (t) =\hat{ \pi }(t) = 0\). This trading strategy remains optimal if r(s) is not deterministic, but then the worst-case scenario \(\hat{\theta }(t)\) is given by the more complicated expression (22.74).
-
(ii)
This is a new approach to, and a partial extension of, Theorem 2.2 in [13] and Theorem 4.1 in the subsequent paper [1]. Both of these papers consider the case with deterministic r(t) only. On the other hand, in these papers the scenario is represented by a probability measure and not by the drift.
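Remark 4.1 (i) can also be checked numerically: with θ(t) ≡ r and, for simplicity, no jumps, S₁(T) is lognormal and the discounted price e^{-rT}S₁(T) has expectation S₁(0). The following sketch, with hypothetical constants not taken from the paper, verifies this by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical constants; theta = r and gamma0 = 0 (no jumps) for simplicity.
r, sigma0, T, n_paths = 0.03, 0.2, 1.0, 200_000
S1_0 = 1.0

# Exact lognormal sampling of S1(T) when the drift equals r:
B_T = rng.normal(0.0, np.sqrt(T), size=n_paths)
S1_T = S1_0 * np.exp((r - 0.5 * sigma0**2) * T + sigma0 * B_T)

# Martingale property in expectation: E[e^{-rT} S1(T)] = S1(0).
discounted = np.exp(-r * T) * S1_T
print(discounted.mean())  # close to S1_0 = 1.0, up to Monte Carlo error
```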
References
An, T.T.K., Øksendal, B.: A maximum principle for stochastic differential games with partial information. J. Optim. Theor. Appl. 139, 463–483 (2008)
Bensoussan, A.: Stochastic Control of Partially Observable Systems. Cambridge University Press, Cambridge (1992)
Benth, F.E., Di Nunno, G., Løkka, A., Øksendal, B., Proske, F.: Explicit representation of the minimal variance portfolio in markets driven by Lévy processes. Math. Financ. 13, 55–72 (2003)
Baghery, F., Øksendal, B.: A maximum principle for stochastic control with partial information. Stoch. Anal. Appl. 25, 705–717 (2007)
Di Nunno, G., Meyer-Brandis, T., Øksendal, B., Proske, F.: Malliavin calculus and anticipative Itô formulae for Lévy processes. Inf. Dim. Anal. Quant. Probab. 8, 235–258 (2005)
Di Nunno, G., Øksendal, B., Proske, F.: Malliavin Calculus for Lévy Processes and Applications to Finance. Springer, Berlin (2009)
Karatzas, I., Ocone, D.: A generalized Clark representation formula, with application to optimal portfolios. Stoch. Stoch. Rep. 34, 187–220 (1991)
Karatzas, I., Xue, X.: A note on utility maximization under partial observations. Math. Financ. 1, 57–70 (1991)
Lakner, P.: Optimal trading strategy for an investor: the case of partial information. Stoch. Process. Appl. 76, 77–97 (1998)
Meyer-Brandis, T., Øksendal, B., Zhou, X.Y.: A mean-field stochastic maximum principle via Malliavin calculus. Stochastics 84, 643–666 (2012)
Nualart, D.: Malliavin Calculus and Related Topics, 2nd edn. Springer, Berlin (2006)
Øksendal, B., Sulem, A.: Applied Stochastic Control of Jump Diffusions, 2nd edn. Springer, Berlin (2007)
Øksendal, B., Sulem, A.: A game theoretic approach to martingale measures in incomplete markets. Surv. Appl. Ind. Math. 15, 18–24 (2008)
Pham, H., Quenez M.-C.: Optimal portfolio in partially observed stochastic volatility models. Ann. Appl. Probab. 11, 210–238 (2001)
Yong, J., Zhou, X.Y.: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer, New York (1999)
Acknowledgements
The research leading to these results has received funding from the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007–2013)/ ERC grant agreement no [228087].
© 2013 Springer Science+Business Media New York
Cite this paper
Kieu, A.T.T., Øksendal, B., Okur, Y.Y. (2013). A Malliavin Calculus Approach to General Stochastic Differential Games with Partial Information. In: Viens, F., Feng, J., Hu, Y., Nualart, E. (eds) Malliavin Calculus and Stochastic Analysis. Springer Proceedings in Mathematics & Statistics, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-5906-4_22
Print ISBN: 978-1-4614-5905-7
Online ISBN: 978-1-4614-5906-4