Abstract
We establish existence of nearly-optimal controls, as well as conditions for existence of an optimal control and of a saddle point for, respectively, a control problem and a zero-sum differential game associated with payoff functionals of mean-field type, under dynamics driven by weak solutions of stochastic differential equations of mean-field type.
1 Introduction
In this work we investigate existence of an optimal control and a saddle-point for a zero-sum game associated with a payoff functional of mean-field type, under a dynamics driven by the weak solution of a stochastic differential equation (SDE) also of mean-field type. The obtained results extend in a natural way those obtained in [17] for standard payoffs associated with standard diffusion processes.
Given a control process \(u:=(u_t)_{t\le T}\) with values in some compact metric space U, the controlled SDE of mean-field type we consider in this paper is of the following functional form:
i.e. f depends on the whole path \(x_.\) and on \(P^u\circ x_t^{-1}\), the marginal probability distribution of \(x_t\) under the probability measure \(P^u\), while \(\sigma \) depends only on \(x_.\) (this feature can be relaxed substantially, see Remark 3.2), and where \(W^{P^u}\) is a standard Brownian motion under \(P^u\).
The payoff functional \(J(u),\,\, u\in \mathcal {U},\) associated with the controlled SDE is of the form
where \(E^u\) denotes the expectation w.r.t. \(P^u\).
As an example, the functions f, g and h can have the following forms
where \(\varphi _i\), \(i=1,2,3\), are bounded Borel-measurable functions.
Taking \(h=0\) and \(g(x,y)=\varphi _2(x)^2-y^2\), the cost functional reduces to the variance, \(J(u)=E^u[\varphi _2(x_T)^2]-\left( E^u[\varphi _2 (x_T)]\right) ^2=Var_{P^u}[\varphi _2(x_T)]\).
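As a quick numerical sanity check of this identity (a toy sketch, not from the paper: we take \(\varphi _2=\tanh \) and stand-in Gaussian samples for \(x_T\)), the cost \(E^u[\varphi _2(x_T)^2]-(E^u[\varphi _2(x_T)])^2\) can be compared against the empirical variance by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy check of the variance identity: with h = 0 and g(x, y) = phi2(x)^2 - y^2,
# J(u) = E[phi2(x_T)^2] - (E[phi2(x_T)])^2 = Var[phi2(x_T)].
# phi2 and the samples of x_T are hypothetical stand-ins.
phi2 = np.tanh                       # a bounded Borel-measurable function
x_T = rng.normal(size=100_000)       # stand-in samples of x_T under P^u

J = np.mean(phi2(x_T) ** 2) - np.mean(phi2(x_T)) ** 2
assert abs(J - np.var(phi2(x_T))) < 1e-12
```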
While controlling a strong solution of an SDE means controlling the process \(x^u\) defined on a given probability space \((\Omega , \mathcal {F},\mathbb {F}, {\mathbb {P}})\) carrying a Brownian motion W whose natural filtration is \(\mathbb {F}\), controlling a weak solution of an SDE boils down to controlling the Girsanov density process \(L^u:=dP^u/d{\mathbb {P}}\) of \(P^u\) w.r.t. a reference probability measure \({\mathbb {P}}\) on \(\Omega \) such that \((\Omega ,{\mathbb {P}})\) carries a Brownian motion W and such that the coordinates process \(x_t\) is the unique solution of the following stochastic differential equation:
Integrating by parts, the payoff functional can be expressed in terms of \(L^u\) as follows
where \({\mathbb {E}}\) denotes the expectation w.r.t. \({\mathbb {P}}\). For this reason, we do not include a control parameter in the diffusion term \(\sigma \).
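To illustrate the weak-solution viewpoint, here is a minimal Monte Carlo sketch (all choices hypothetical: \(d=1\), \(\sigma \equiv 1\), drift \(f(t,x,u)=u\), feedback control clipped to \(U=[-1,1]\)): one simulates the coordinate process under the reference measure \({\mathbb {P}}\) and controls only the Girsanov density \(L^u\), so that expectations under \(P^u\) become \({\mathbb {P}}\)-expectations weighted by \(L^u\):

```python
import numpy as np

# Minimal sketch (hypothetical choices, not the paper's general setting):
# d = 1, sigma = 1, drift f(t, x, u) = u with a feedback control valued in
# U = [-1, 1].  We simulate the coordinate process under the reference
# measure P and control only the Girsanov density L^u = dP^u/dP.
rng = np.random.default_rng(1)
n_paths, n_steps, T = 50_000, 100, 1.0
dt = T / n_steps

x = np.zeros(n_paths)       # coordinate process under P:  dx = dW
logL = np.zeros(n_paths)    # log-density:  dlogL = u dW - (1/2) u^2 dt
for _ in range(n_steps):
    u = np.clip(-x, -1.0, 1.0)                  # feedback control u_t
    dW = rng.normal(scale=np.sqrt(dt), size=n_paths)
    logL += u * dW - 0.5 * u**2 * dt
    x += dW

L = np.exp(logL)
# Under P^u, E^u[g(x_T)] = E[L^u_T g(x_T)]; here g(x) = x^2 as an example.
cost_under_Pu = np.mean(L * x**2)
assert abs(L.mean() - 1.0) < 0.05   # L^u is a P-martingale with mean 1
```

Since \(L^u\) is a \({\mathbb {P}}\)-martingale with \(L^u_0=1\), the sample mean of \(L^u_T\) should be close to one, which serves as a basic check of the reweighting.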
In the first part of this paper we establish conditions for existence of an optimal control associated with J(u): find a stochastic process \(u^*\) with values in U such that
Optimal control of SDEs of mean-field type is also known as McKean–Vlasov type optimal control or simply optimal control of nonlinear diffusion; see e.g. the recent books [3] and [5] and the references therein.
The recent paper by Carmona and Lacker [6] discusses a similar problem but in the so-called mean-field game setting (where they further consider the marginal laws of the control process, i.e., \(P^u\circ u_t^{-1}\)) which has the following structure (cf. [6]):
- (1)
Fix a probability measure \(\mu \) on the path space and a flow \(\nu : t \mapsto \nu _t\) of measures on the control space;
- (2)
Standard optimization: With \(\mu \) and \(\nu \) frozen, solve the standard optimal control problem:
$$\begin{aligned} \left\{ \begin{array}{lll} \inf _u E^u\left[ \int _0^T h(t,x_.,\mu ,\nu , u_t)dt+ g(x_T,\mu )\right] , \\ dx_t=f(t,x_.,\mu , u_t)dt+\sigma (t, x_.)dW^{P^u}_t,\quad x_0=\text {x}\in \mathbb {R}^d, \end{array} \right. \end{aligned}$$(1.2)
i.e. find an optimal control u, inject it into the dynamics of (1.2), and find the law \(\Phi _x(\mu ,\nu )\) of the optimally controlled state process and the flow \(\Phi _u(\mu ,\nu )\) of marginal laws of the optimal control process;
- (3)
Matching: Find a fixed point \(\mu =\Phi _x(\mu ,\nu ),\,\, \nu =\Phi _u(\mu ,\nu )\).
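The three steps above can be sketched on a toy one-shot model (entirely our own illustration, not the setting of [6]): freeze the mean, solve the frozen problem in closed form, and iterate the induced map until it matches:

```python
# Schematic "matching" loop of the mean-field game approach: freeze the
# mean m, solve the frozen control problem, recompute the mean, repeat.
# Toy one-shot model (our choice, not from the paper): state x_T = x0 + u,
# cost (x_T - a*m)^2 + u^2; the best response is u* = (a*m - x0)/2, so the
# induced mean is Phi(m) = (mean_x0 + a*m)/2, a contraction for |a| < 2.
a, mean_x0 = 0.8, 1.0

def Phi(m):
    return (mean_x0 + a * m) / 2.0   # mean of the optimally controlled state

m = 0.0
for _ in range(100):                  # step (3): matching, m = Phi(m)
    m = Phi(m)

assert abs(m - mean_x0 / (2.0 - a)) < 1e-12   # fixed point mean_x0/(2-a)
```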
To perform the matching step (3), the authors of [6] are led to impose more or less stringent assumptions which in turn narrow the scope of applicability of their framework. This is mainly due to the fact that the functional which is supposed to provide the optimal control is rather irregular. Overall, showing existence of a fixed point is not an easy task and does not extend to broader frameworks. For an in-depth comparison between the mean-field games approach and optimal control of strong solutions of SDEs of mean-field type, see [3, 7] and the references therein. In the recent paper [1] the authors derive a nonlinear Feynman–Kac representation for the value function associated with an optimal control related to such SDEs. However, they do not address the problem of existence of optimal or even \(\epsilon \)-optimal controls.
In this paper we use another approach which in a way addresses the full control problem where the marginal law changes with the control process and is not frozen as in the mean-field game approach. Our strategy goes as follows: By a fixed point argument we first show that for any admissible control u there exists a unique probability \(P^u\) under which the SDE
has a weak solution, where \(W^{P^u}\) is a Brownian motion under \({P^u}\). Moreover, the mapping which to u associates \(P^u\) is continuous. Therefore, the mean-field terms which appear in the drift of the above equation and in the payoff functional J(u) are treated as continuous functions of u. Using this point of view, which avoids the irregularity issues encountered in [6], we suggest conditions for existence of an optimal control using backward stochastic differential equations (BSDEs) in a similar fashion to the standard control problems, i.e. without mean-field terms. Indeed, if \((Y^u,Z^u)\) is the solution of the BSDEs associated with the driver (Hamiltonian) \(H(t,x_.,z,u):=h(t,x_.,P^{u}\circ x_t^{-1},u_t)+z\cdot \sigma ^{-1}(t,x_.)f(t,x_.,P^u\circ x_t^{-1},u_t)\) and the terminal value \(g(x_T,P^{u}\circ x^{-1}_T)\), we have \(Y^u_0=J(u)\). Moreover, the unique solution \((Y^*,Z^*)\) of the BSDE associated with
satisfies, under appropriate assumptions, \(Y^*(t)=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}Y^u(t)\). In particular, this equality holds if g does not depend on the mean-field term. The use of the essential infimum over the whole set of admissible controls \(\mathcal {U}\), instead of the infimum of the Hamiltonian H over the set U of actions (as is the case for the standard control problem, as discussed, e.g., in [17]), is due to the fact that the mean-field coupling \(P^{u}\circ x_t^{-1}\) involves the whole path of the control u over [0, t] and not only \(u_t\). This nonlocal dependence of H on the control does not seem to be covered by the powerful Beneš-type 'progressively' measurable selection frequently used in standard control problems. Thus, if there exists \(u^*\in \mathcal {U}\) such that \(H^*(t,x_.,z)=H(t,x_.,z,u^*)\) and \(g^*(x_.)=g(x_T,P^{u^*}\circ x^{-1}_T)\), then \(u^*\) is an optimal control for J(u). We do not know of any suitable measurable selection theorem that would guarantee existence of such a \(u^*\). At the end of this section, using Ekeland's variational principle, we show existence of a near-optimal control. Finally, we suggest some particular cases where an optimal control exists.
The zero-sum game we consider is between two players with controls u and v valued in some compact metric spaces U and V, respectively. The dynamics and the payoff function associated with the game are both of mean-field type and are given by
and
where \(P^{u,v}\circ x_t^{-1}\) is the marginal probability distribution of \(x_t\) under the probability measure \(P^{u,v}\), \(W^{P^{u,v}}\) is a standard Brownian motion under \(P^{u,v}\) and \(E^{u,v}\) denotes the expectation w.r.t. \(P^{u,v}\).
In the zero-sum game, the first player (with control u) wants to minimize the payoff J(u, v) while the second player (with control v) wants to maximize it. The zero-sum game boils down to investigating the existence of a saddle point for the game, i.e. to show existence of a pair \((u^*, v^*)\) of controls such that
for each (u, v) with values in \(U\times V\).
This framework of games is symmetric in the sense that both players are allowed to use arbitrary adapted controls. Its introduction goes back to the eighties (see e.g. [8, 9, 16, 17]). Moreover, these controls are somehow of feedback form since, in the canonical space, for any control \((u_t)_{0\le t\le T}\) (resp. \((v_t)_{0\le t\le T}\)) there exists a measurable function \({\bar{u}}\) (resp. \({\bar{v}}\)) such that \(u_t={\bar{u}}(t,x_.)\) (resp. \(v_t={\bar{v}}(t,x_.)\)). This framework differs from the one in, e.g., [4, 15], where the zero-sum game is formulated so that the first player uses controls while the second one uses strategies, which are in a way responses, making the game nonsymmetric.
By using the same approach as in the control framework, we show that the game has a saddle-point. The recent paper by Li and Min [21] deals with the same zero-sum game for weak solutions of SDEs of the form (1.1), applying a 'matching argument' similar to that of [6]. However, due to the irregularity of the functional which provides the fixed point, they could only show existence of a so-called generalized saddle-point, i.e. of a pair of controls \((u^*, v^*)\) which satisfies (see, for instance, Theorem 5.6 in [21])
where \(\psi (u,{\bar{u}}):=({\mathbb {E}}[\int _0^T d^2(u_s,{\bar{u}}_s)ds])^{1/4}\) and C is a positive constant depending only on f and h.
In the literature on mean-field models, the Wasserstein metric is by now standard, because it is designed to guarantee weak convergence of probability measures together with convergence of finite moments. In this paper we have instead chosen the total variation distance between probability measures, although it does not control moments, because of its relationship to the Hellinger distance through the celebrated Csiszár–Kullback–Pinsker inequality (see the bound (4.22), Theorem V.4.21 in [19]), which gives a simple and direct proof of existence of a unique probability \(P^u\) (resp. \(P^{u,v}\)) under which the SDE (1.1) (resp. (1.3)) has a weak solution.
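For the reader's convenience, we recall (with the normalization \(\Vert P-Q\Vert _{TV}:=2\sup _{A}|P(A)-Q(A)|\) and the Hellinger distance defined by \(H^2(P,Q):=\frac{1}{2}\int \big (\sqrt{dP/d\lambda }-\sqrt{dQ/d\lambda }\big )^2d\lambda \) for any dominating measure \(\lambda \); constants differ across the literature) the standard comparison and Pinsker-type inequalities underlying this approach:

$$\begin{aligned} H^2(P,Q)\le \tfrac{1}{2}\Vert P-Q\Vert _{TV}\le H(P,Q)\sqrt{2-H^2(P,Q)},\qquad \Vert P-Q\Vert _{TV}\le \sqrt{2\,\mathrm {KL}(P\Vert Q)}. \end{aligned}$$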
The paper is organized as follows. In Sect. 3, we account for existence and uniqueness of the weak solution of the SDE of mean-field type. In Sect. 4, we provide conditions for existence of an optimal control and prove existence of nearly-optimal controls. Finally, in Sect. 5, we investigate existence of a saddle point for a two-person zero-sum game.
2 Preliminaries
Let \(\Omega :=\mathcal {C}([0,T]; \mathbb {R}^d)\) be the space of \(\mathbb {R}^d\)-valued continuous functions on [0, T] endowed with the metric of uniform convergence on [0, T]; \(|w|_t:=\sup _{0\le s\le t}|w_s|\), for \(0\le t\le T\). Denote by \(\mathcal {F}\) the Borel \(\sigma \)-field over \(\Omega \). Given \(t\in [0,T]\) and \(\omega \in \Omega \), let \(x(t,\omega )\) be the position in \(\mathbb {R}^d\) of \(\omega \) at time t. Denote by \(\mathcal {F}^0_t:=\sigma (x_s,\,\, s\le t),\, 0\le t\le T,\) the filtration generated by x. Below, C denotes a generic positive constant which may change from one line to another.
Let \(\sigma \) be a function from \([0,T]\times \Omega \) into \(\mathbb {R}^{d\times d}\) such that
- (A1)
\(\sigma \) is \(\mathcal {F}^0_t\)-progressively measurable;
- (A2)
There exists a constant \(C>0\) such that
- (a)
For every \(t\in [0,T]\) and \(w, \bar{w} \in \Omega \), \( |\sigma (t,w)-\sigma (t,\bar{w})|\le C|w-\bar{w}|_t.\)
- (b)
\(\sigma \) is invertible and its inverse \(\sigma ^{-1}\) satisfies \(|\sigma ^{-1}(t,w)|\le C(1+|w|_t^{\alpha }),\) for some constant \(\alpha \ge 0\).
- (c)
For every \(t\in [0,T]\) and \(w\in \Omega \), \(|\sigma (t,w)|\le C(1+|w|_t).\)
Let \({\mathbb {P}}\) be a probability measure on \(\Omega \) such that \((\Omega ,{\mathbb {P}})\) carries a Brownian motion \((W_t)_{0\le t\le T}\) and such that the coordinates process \((x_t)_{0\le t\le T}\) is the unique solution of the following stochastic differential equation:
Such a triplet \(({\mathbb {P}},W,x)\) exists due to Proposition 4.6 in [20, p. 315] since \(\sigma \) satisfies (A2). Moreover, for every \(p\ge 2\),
where \(C_p\) depends only on p, T, the initial value \(\text {x}\) and the linear growth constant of \(\sigma \) (see [20, p. 306]). Moreover, because \(\sigma \) satisfies (A2), \(\mathcal {F}^0_t\) coincides with \(\sigma \{W_s, s\le t\}\) for any \(t\le T\), since \(dW_t=\sigma ^{-1}(t,x_.)dx_t\). We denote by \(\mathbb {F}:=(\mathcal {F}_t)_{0\le t\le T}\) the completion of \((\mathcal {F}^0_t)_{t\le T}\) with the \({\mathbb {P}}\)-null sets of \(\Omega \).
Let \({\mathcal {P}}(\mathbb {R}^d)\) denote the set of probability measures on \(\mathbb {R}^d\) and \({\mathcal {P}}_2(\mathbb {R}^d)\) the subset of measures \(\nu \) with finite second moment:
For \(\mu ,\nu \in {\mathcal {P}}(\mathbb {R}^d)\), the total variation distance is defined by the formula
Furthermore, let \({\mathcal {P}}(\Omega )\) be the space of probability measures P on \(\Omega \) and \({\mathcal {P}}_p(\Omega ), \,p\ge 1,\) be the subspace of probability measures such that
where \(|x|_t:=\sup _{0\le s\le t}|x_s|\), \(0\le t\le T\).
Define on \(\mathcal {F}\) the total variation metric
Similarly, on the filtration \(\mathbb {F}\), we define the total variation metric between two probability measures P and Q as
It satisfies
For \(P, Q\in {\mathcal {P}}(\Omega )\) with time marginals \(P_t:=P\circ x_t^{-1}\) and \(Q_t:=Q\circ x_t^{-1}\), the total variation distance between \(P_t\) and \(Q_t\) satisfies
Indeed, we have
Endowed with the total variation metric \(D_T\) on \((\Omega , \mathcal {F}_T)\), \({\mathcal {P}}(\Omega )\) is a complete metric space. For the sake of completeness, a proof is displayed in the Appendix. Moreover, by the Portmanteau theorem, convergence in \(D_T\) implies the usual weak convergence of probability measures.
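For intuition on the metric used here, the following sketch (a finite state space as a stand-in for the marginals \(P_t\), \(Q_t\); note that conventions for the normalizing factor differ across the literature) verifies that the supremum over events coincides with half the \(L^1\) distance of the densities:

```python
import numpy as np
from itertools import chain, combinations

# Sketch of the total variation distance between two probability measures
# on a finite set (a stand-in for the marginals P_t, Q_t): the supremum
# over events A of |mu(A) - nu(A)| equals half the L1 distance of the
# densities.  The measures below are arbitrary toy choices.
mu = np.array([0.1, 0.4, 0.2, 0.3])
nu = np.array([0.3, 0.3, 0.1, 0.3])

half_l1 = 0.5 * np.abs(mu - nu).sum()

# Brute-force supremum over all events A (all subsets of the state space).
points = range(len(mu))
events = chain.from_iterable(combinations(points, r) for r in range(len(mu) + 1))
sup_form = max(abs(mu[list(A)].sum() - nu[list(A)].sum()) for A in events)

assert abs(half_l1 - sup_form) < 1e-12
```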
3 Diffusion Process of Mean-Field Type
Hereafter, a process \(\theta \) from \([0,T]\times \Omega \) into a measurable space is said to be progressively measurable if it is progressively measurable w.r.t. \(\mathbb {F}\). Let \({\mathcal {S}}^2_T\) be the set of \(\mathbb {F}\)-progressively measurable continuous processes \((\zeta _t)_{t\le T}\) such that \({\mathbb {E}}[\sup _{t\le T}|\zeta _t|^2]<\infty \) and finally let \({\mathcal {H}}^2_T\) be the set of \(\mathbb {F}\)-progressively measurable processes \((\theta _t)_{t\le T}\) such that \({\mathbb {E}}[\int _0^T|\theta _s|^2ds]<\infty \).
Let b be a measurable function from \([0,T]\times \Omega \times {\mathcal {P}}(\mathbb {R}^d)\) into \(\mathbb {R}^d\) such that
- (A3)
For every \(Q\in {\mathcal {P}}(\Omega )\), the process \((b(t, x_.,Q\circ x_t^{-1}))_{t\le T}\) is progressively measurable.
- (A4)
For every \(t\in [0,T]\), \(w\in \Omega \) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),
$$\begin{aligned} |b(t,w,\mu )-b(t,w,\nu )|\le Cd(\mu ,\nu ). \end{aligned}$$
- (A5)
For every \(t\in [0,T]\), \(w\in \Omega \) and \(\mu \in {\mathcal {P}}(\mathbb {R}^d)\),
$$\begin{aligned} |b(t,w,\mu )|\le C(1+|w|_t). \end{aligned}$$
Next, for \(Q\in {\mathcal {P}}(\Omega )\), let \(P^Q\) be the measure on \((\Omega ,\mathcal {F})\) defined by
with
where, for any \((\mathbb {F},{\mathbb {P}})\)-continuous local martingale \(M=(M_t)_{0\le t\le T}\), \(\mathcal {E}(M)\) denotes the Doléans exponential, i.e., \(\mathcal {E}(M):=(\exp \{M_t-\frac{1}{2}\langle M\rangle _t\})_{{0\le t\le T}}\). Thanks to assumptions (A2) and (A5), \(P^Q\) is a probability measure on \((\Omega ,\mathcal {F})\); a proof of this fact follows the same lines as the proof of Proposition A.1 in [12]. Hence, in view of Girsanov's theorem, the process \((W^Q_t,\,\, 0\le t\le T)\) defined by
is an \((\mathbb {F}, P^Q)\)-Brownian motion. Furthermore, under \(P^{Q}\),
Now, in view of (A2) and (A5), the Hölder and Burkholder–Davis–Gundy inequalities yield, for every \(p\ge 2\),
where the constant \(C_p\) depends only on \(p, T,\text {x}\) and the linear growth constants of b and \(\sigma \). By Gronwall’s inequality, we obtain
Next, we will show that there is \({\bar{Q}}\) such that \(P^{{\bar{Q}}}={{\bar{Q}}}\), i.e., \({\bar{Q}}\) is a fixed point. Moreover, \({\bar{Q}}\) has a finite moment of any order \(p\ge 2\).
Theorem 3.1
The map
admits a unique fixed point.
Moreover, for every \(p\ge 2\), the fixed point, denoted \({\bar{Q}}\), belongs to \({\mathcal {P}}_p(\Omega )\), i.e.
where the constant \(C_p\) depends only on \(p, T,\text {x}\) and the linear growth constants of b and \(\sigma \).
Proof
We show the contraction property of the map \( \Phi \) in the complete metric space \({\mathcal {P}}(\Omega )\), endowed with the total variation distance \(D_T\). To this end, given \(Q,{\widehat{Q}}\in {\mathcal {P}}(\Omega )\), we use an estimate of the total variation distance \(D_T(\Phi (Q),\Phi ({\widehat{Q}}))\) in terms of a version of the Hellinger process associated with the coordinate process x under the probability measures \(\Phi (Q)\) and \(\Phi ({\widehat{Q}})\), respectively. Indeed, since by (3.3),
in view of Theorem IV.1.33 in [19], a version of the associated Hellinger process is
where
and \(a_t:=(\sigma \sigma ^{\dagger })(t,x_.)\) and \(M^{\dagger }\) denotes the transpose of the matrix M. We may use the estimate (4.22) of Theorem V.4.21 in [19], to obtain
By (A2), (A4) and (3.4), we have
which together with (3.7) yield
Iterating this inequality, we obtain, for every \(N>0\),
where \(\Phi ^N\) denotes the N-fold composition of the map \(\Phi \). Hence, for N large enough, \(\Phi ^N\) is a contraction which entails that \(\Phi \) admits a unique fixed point.
Let \({\bar{Q}}\) be such a fixed point for the map \(\Phi \). Thus, under \({\bar{Q}}\),
where \({\bar{Q}}_t:={\bar{Q}}\circ x_t^{-1}\). In view of assumptions (A2) and (A5), the Hölder and Burkholder–Davis–Gundy inequalities yield
By Gronwall’s inequality, we obtain (3.5), i.e.
\(\square \)
Remark 3.2
The dependence of the drift b on the law of \(x_t\) under Q, i.e., on \(Q \circ x_t^{-1}\), can be relaxed substantially: we may replace the latter by \(Q\circ \phi (t,x)^{-1}\), where \(\phi (t,x)\) is an adapted process. For example, one can choose \(\phi (t,x)=\sup _{0\le s\le t}x_s\). The main point is that the inequality (2.7) holds for a general adapted process \(\phi (t,x)\). \(\square \)
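The fixed-point argument of Theorem 3.1 relies on \(\Phi ^N\) being a contraction for N large, thanks to a bound of the form \(C^NT^N/N!\), even though \(\Phi \) itself need not be contracting. A toy stand-in (a scalar Volterra map with \(KT>1\), chosen only for illustration) shows that Picard iteration still converges in this situation:

```python
import numpy as np

# Picard-iteration sketch of the argument of Theorem 3.1: Phi itself need
# not be a contraction, but Phi^N is for N large (the (CT)^N / N! bound),
# so iterating Phi still converges to the unique fixed point.  Toy
# stand-in: the Volterra map (Phi y)(t) = 1 + K * int_0^t y(s) ds, whose
# fixed point is y(t) = exp(K t); here K*T = 3 > 1.
K, T, n = 3.0, 1.0, 2000
t = np.linspace(0.0, T, n)
dt = t[1] - t[0]

def Phi(y):
    # Cumulative trapezoidal integral of y, scaled by K, plus the constant 1.
    return 1.0 + K * np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * dt)))

y = np.zeros(n)
for _ in range(60):        # Picard iterations y <- Phi(y)
    y = Phi(y)

assert np.max(np.abs(y - np.exp(K * t))) < 1e-3   # converged to exp(K t)
```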
4 Optimal Control of the Diffusion Process of Mean-Field Type
Let \((U, \delta )\) be a compact metric space with its Borel \(\sigma \)-field \(\mathcal {B}(U)\) and \(\mathcal {U}\) the set of \(\mathbb {F}\)-progressively measurable processes \(u=(u_t)_{t\le T}\) with values in U. We call \(\mathcal {U}\) the set of admissible controls.
Next, let f and h be two measurable functions from \([0,T]\times \Omega \times {\mathcal {P}}(\mathbb {R}^d)\times U\) into \(\mathbb {R}^d\) and \(\mathbb {R}\), respectively, and let g be a measurable function from \(\mathbb {R}^d\times {\mathcal {P}}(\mathbb {R}^d)\) into \(\mathbb {R}\) such that
- (B1)
For any \(u\in \mathcal {U}\) and \(Q\in {\mathcal {P}}(\Omega )\), the processes \((f(t, x_.,Q\circ x_t^{-1},u_t))_t\) and \((h(t, x_.,Q\circ x_t^{-1},u_t))_t\) are progressively measurable. Moreover, \(g(x_T,Q\circ x_T^{-1})\) is \(\mathcal {F}_T\)-measurable.
- (B2)
For every \(t\in [0,T]\), \(w\in \Omega \), \(u,v\in U\) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),
$$\begin{aligned} |\phi (t,w,\mu , u)-\phi (t,w,\nu ,v)|\le C(d(\mu ,\nu )+\delta (u,v)), \end{aligned}$$
for \(\phi \in \{f,h\}\).
For every \(w\in \Omega \) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),
$$\begin{aligned} |g(w,\mu )-g(w,\nu )|\le C d(\mu ,\nu ). \end{aligned}$$
- (B3)
For every \(t\in [0,T]\), \(w\in \Omega \), \(\mu \in {\mathcal {P}}(\mathbb {R}^d)\) and \(u\in U\),
$$\begin{aligned} |f(t,w,\mu ,u)|\le C(1+|w|_t). \end{aligned}$$
- (B4)
h and g are uniformly bounded.
For \(u\in \mathcal {U}\), let \(P^u\) be the probability measure on \((\Omega ,\mathcal {F})\) which is the fixed point of the map \(\Phi ^u\), defined in the same way as in Theorem 3.1 except that the drift term \(b(\cdot )\) also depends on u, which does not raise any major issue. Thus we have
where
By Girsanov’s theorem, the process \((W^u_t,\,\, 0\le t\le T)\) defined by
is an \((\mathbb {F}, P^u)\)-Brownian motion. Moreover, under \(P^u\),
Let \(E^u\) denote the expectation w.r.t. \(P^u\). In view of (3.5), we have, for every \(u\in \mathcal {U}\),
where the constant C depends only on \(T,\text {x}\) and the linear growth constants of f and \(\sigma \).
We also have the following estimate of the total variation between \(P^u\) and \(P^v\).
Lemma 4.1
For every \(u,v\in \mathcal {U}\), it holds that
In particular, the function \(u\mapsto P^u\) from U into \({\mathcal {P}}(\Omega )\) is Lipschitz continuous: for every \(u,v\in U\),
Proof
Using a similar estimate as (3.7), we have
where \({\tilde{\Gamma }}\) is the following version of the Hellinger process associated with \(P^u\) and \(P^v\):
where
Using (A2) and (B2), we obtain
Hence, in view of (4.7), Gronwall’s inequality yields
Inequality (4.6) follows from (4.5) by letting \(u_t\equiv u\in U\) and \(v_t\equiv v\in U\). \(\square \)
The cost functional \(J(u),\,\, u\in \mathcal {U}\), associated with the controlled SDE (4.3) is
where h and g satisfy (B1)–(B4) above.
Any \(u^*\in \mathcal {U}\) satisfying
is called optimal control. The corresponding optimal dynamics is given by the probability measure \(P^*\) on \((\Omega ,\mathcal {F})\) defined by
under which
We want to find such an optimal control and characterize the optimal cost functional \(J(u^*)\).
For \((t,w,\mu ,z,u)\in [0,T]\times \Omega \times {\mathcal {P}}(\mathbb {R}^d)\times \mathbb {R}^d\times U\) we introduce the Hamiltonian associated with the optimal control problem (4.3) and (4.8)
The function H enjoys the following properties.
Lemma 4.2
Assume that (A1), (A2), (B1) and (B2) hold. Then, the function H satisfies
Assume further that (B3) holds. Then H satisfies the (stochastic) Lipschitz condition
Proof
Inequality (4.13) is a consequence of (A2) and (B2). Assume further that (B3) is satisfied. Then (4.14) is also satisfied since f and \(\sigma ^{-1}\) are of polynomial growth in w. \(\square \)
Next, we show that the payoff functional \(J(u),\,u\in \mathcal {U}\), can be expressed in terms of solutions of linear BSDEs.
Proposition 4.3
Assume that (A1), (A2), (B1), (B2), (B3) and (B4) are satisfied. Then, for every \(u\in \mathcal {U}\), there exists a unique \(\mathbb {F}\)-progressively measurable process \((Y^u,Z^u)\in {\mathcal {S}}^2_T\times {\mathcal {H}}^2_T\) such that
Moreover, \(Y^u_0=J(u)\).
Proof
Since the mapping \(p\mapsto H(t,x_.,P^u\circ x_t^{-1},p,u_t)\) satisfies (4.14), and since \(H(t,x_.,P^u\circ x_t^{-1},0,u_t)=h(t,x_.,P^u\circ x_t^{-1},u_t)\) and \(g(x_T,P^u\circ x_T^{-1})\) are bounded, the BSDE (4.15) has a unique solution by Theorem I-3 in [17]. Note that the proof in [16, 17] is given for \(\alpha =0\); however, it generalizes without any difficulty to the case \(\alpha >0\). The key point is that the moments of any order of \((x_t)_{t\le T}\) exist under \({\mathbb {P}}\).
It remains to show that \(Y^u_0=J(u)\). Indeed, in terms of the \((\mathbb {F}, P^u)\)-Brownian motion
the process \((Y^u,Z^u)\) satisfies
Therefore,
In particular, since \(\mathcal {F}_0\) contains only the \({\mathbb {P}}\)-null sets of \(\Omega \), and \(P^u\) and \({\mathbb {P}}\) are equivalent, then
\(\square \)
4.1 Existence of Optimal Controls
In the remaining part of this section we want to find \(u^*\in \mathcal {U}\) such that \(u^*=\arg \min _{u\in \mathcal {U}}J(u)\). A way to find such an optimal control is to proceed as in Proposition 4.3 and introduce a BSDE whose solution \(Y^*\) satisfies \(Y^*_0=\inf _{u\in \mathcal {U}}J(u)=Y^{u^*}_0\). By the comparison theorem for BSDEs, the problem can be reduced to minimizing the corresponding Hamiltonian and the terminal value g w.r.t. the control u. Since in the Hamiltonian \(H(t,x_.,P^u\circ x_t^{-1},z,u_t)\) the marginal law \(P^u\circ x_t^{-1}\) of \(x_t\) under \(P^u\) depends on the whole path of u over [0, t] and not only on \(u_t\), we should minimize H w.r.t. the whole set \(\mathcal {U}\) of admissible stochastic controls. Therefore, we take the essential infimum of the Hamiltonian over \(\mathcal {U}\), instead of the minimum over U. For the associated BSDE to make sense, we must show that this essential infimum exists and is progressively measurable. This is shown in the next proposition.
Let \({\mathbb {L}}\) denote the \(\sigma \)-algebra of progressively measurable sets on \([0,T]\times \Omega \). For \((t,x_.,z,u)\in [0,T]\times \Omega \times \mathbb {R}^d\times \mathcal {U}\), set
Note that since H is linear in z and, for every fixed z and u, \(H(\cdot ,\cdot ,z,u)\) is a progressively measurable process, H is \({\mathbb {L}}\otimes \mathcal {B}(\mathbb {R}^d)\)-measurable.
Next we have:
Proposition 4.4
For any \(z\in \mathbb {R}^d\), there exists an \({\mathbb {L}}\)-measurable process \(H^*(\cdot ,\cdot ,z)\) such that,
Moreover, \(H^*\) is stochastic Lipschitz continuous in z, i.e., for every \(z,z^{\prime } \in \mathbb {R}^d\),
Proof
For \(n\ge 0\) let \(z_n\in {\mathbb {Q}}^d\), the d-cube of rational numbers. Then, since \((t,\omega )\mapsto H(t,\omega ,z_n,u)\) is \({\mathbb {L}}\)-measurable, its essential infimum w.r.t. \(u\in \mathcal {U}\) is well defined, i.e. there exists a \({\mathbb {L}}\)-measurable r.v. \(H^n\) such that
Moreover, there exists a countable subset \({\mathcal {J}}_n\) of \(\mathcal {U}\) such that
Finally note that the process \((t,\omega )\mapsto \underset{u\in {\mathcal {J}}_n}{\inf \,} H(t,\omega ,z_n,u)\) is \({\mathbb {L}}\)-measurable.
Next, set \(N=\bigcup _{n\ge 0} N_n\), where
Then obviously \(d{\mathbb {P}}\otimes dt(N)=0\).
We now define \(H^*\) as follows: for \((t,\omega )\in N\), \(H^*\equiv 0\) and for \((t,\omega )\in N^c\) (the complement of N) we set:
The last limit exists due to the fact that, for \(n\ne m\), we have
Furthermore, the last inequality implies that the limit does not depend on the sequence \((z_n)_{n\ge 0}\) of \({\mathbb {Q}}^d\) which converges to z. Finally note that \(H^*(t,x_.,z)\) is \({\mathbb {L}}\otimes B(\mathbb {R}^d)\)-measurable and is Lipschitz-continuous in z with the stochastic Lipschitz constant \(C(1+|x|_t^{\alpha +1})\).
It remains to show that, for every \(z\in \mathbb {R}^d\),
If \(z\in {\mathbb {Q}}^d\), the equality follows from the definitions (4.19) and (4.21). Assume \(z\notin {\mathbb {Q}}^d\) and let \(z_n\in {\mathbb {Q}}^d\) such that \(z_n\rightarrow z\). Then
But, \(H^*(t,x_.,z_n)=\underset{u\in {\mathcal {J}}_n}{\inf \,} H(t,x_.,z_n,u)\rightarrow _n H^*(t,x_.,z)\) and \(\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z_n,u)\rightarrow _n\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z,u)\) which finishes the proof. \(\square \)
Consider further the \(\mathcal {F}_T\)-measurable random variable
and let \((Y^*,Z^*)\in {\mathcal {S}}^2_T\times {\mathcal {H}}^2_T\) be the solution of the following BSDE
The existence of the pair \((Y^*,Z^*)\) follows from the boundedness of \(g^*\) and h, the measurability of \(H^*\) and (4.18) (see [17] for more details).
The next proposition displays a comparison result between the solutions \(Y^*\) and \(Y^u,\,u\in \mathcal {U}\) of the BSDEs (4.25) and (4.15), respectively.
Proposition 4.5
(Comparison) For every \(t\in [0,T]\), we have
Proof
For any \(t\le T\), we have:
Since \(g^*(x_.)-g(x_T,P^u\circ x_T^{-1})\le 0\) and \(H^*(s,x_.,Z_s^*)-H(s,x_.,Z^*_s,u_s)\le 0\), performing a change of probability measure and taking the conditional expectation w.r.t. \(\mathcal {F}_t\), we obtain \(Y^*_t\le Y^u_t,\,\, {\mathbb {P}}\text {-a.s.},\,\, \forall u\in \mathcal {U}\). \(\square \)
Proposition 4.6
(\(\varepsilon \)-optimality) Assume that for any \(\varepsilon >0\) there exists \(u^{\varepsilon }\in \mathcal {U}\) such that, \({\mathbb {P}}\)-a.s.,
Then,
Proof
Let \((Y^{\varepsilon },Z^{\varepsilon })\in {\mathcal {S}}^2_T\times {\mathcal {H}}^2_T\) be the solution of the following BSDE
Once more the existence of \((Y^{\varepsilon },Z^{\varepsilon })\) follows from ([17], Theorem I.3). We then have
Since \(g^*(x_.)-g(x_T,P^{u^{\varepsilon }}\circ x_T^{-1})\ge -\varepsilon \) and \(H^*(s,x_.,Z^*_s)-H(s,x_.,Z^*_s,u^{\varepsilon }_s)\ge -\varepsilon \), then, once more, performing a change of probability measure and taking the conditional expectation w.r.t. \(\mathcal {F}_t\), we obtain \(Y^*_t\ge Y^{u^{\varepsilon }}_t -\varepsilon (T+1)\). In view of (4.26), this entails that, for every \(0\le t\le T\), \(Y^*_t=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}Y^u_t\). \(\square \)
In the next theorem, we characterize the set of optimal controls associated with (4.9) under the dynamics (4.3).
Theorem 4.7
(Existence of optimal control) If there exists \(u^*\in \mathcal {U}\) such that
then
In particular, \(Y_0^*=\inf _{u\in \mathcal {U}}J(u)=J(u^*)\).
Proof
Under (4.29), for any \(t\le T\) we have
Performing a change of probability measure and taking expectations leads to \({\tilde{E}}[Y^*_t-Y^{u^*}_t]=0\), \(\forall t\le T\), where \({\tilde{E}}\) is the expectation under the new probability \({\tilde{P}}\), which is equivalent to \({\mathbb {P}}\). As \(Y^*_t\le Y^{u^*}_t\), \({\mathbb {P}}\)-a.s. and hence \({\tilde{P}}\)-a.s., we obtain, taking (4.26) into account, \(Y^*=Y^{u^*}\), which means, once more by (4.26), that \(u^*\) is an optimal control. \(\square \)
Remark 4.8
As is the case for most optimality criteria, checking the sufficient condition (4.29) is quite hard, simply because there are no general conditions which guarantee existence of essential minima; one should rather solve the problem in particular cases. In the special case where the marginal law \(P^u\circ x^{-1}_t\) only depends on \((u_t, x|_{[0,t]})\) at each time \(t\in [0,T]\), we may minimize H and g over the action set U, instead of using the essential infimum, and use the Beneš selection theorem [2] to find two measurable functions, \(u_1^*\) from \([0,T)\times \Omega \times \mathbb {R}^d\) into U and \(u_2^*\) from \(\mathbb {R}^d\) into U, such that
and
Combining (4.31) and (4.32), it is easily seen that the progressively measurable function \(u^*\) defined by
satisfies
\(\square \)
We now deal with the case where the terminal payoff g does not depend on the mean-field term. To begin with, let us show the following result:
Proposition 4.9
Let \(\theta \) be an \({\mathbb {L}}\)-measurable process with values in \(\mathbb {R}^d\). We then have:
Proof
First note that for any \(z\in \mathbb {R}^d\) and \(u\in \mathcal {U}\), \( H^*(t,x_.,z)\le H(t,x_.,z,u), \,\,d{\mathbb {P}} \times dt \text{-a.e. }\) and then, for any \(u\in \mathcal {U}\),
Next, let \(\Phi \) be an \({\mathbb {L}}\)-measurable process such that \(\Phi (t,\omega )\le H(t,x_.,\theta _t,u)\) for any \(u\in \mathcal {U}\). Assume first that \(\theta \) is uniformly bounded. Then there exists a sequence of \({\mathbb {L}}\)-measurable processes \((\theta ^n)_{n\ge 0}\) such that, for any \(n\ge 0\), \(\theta ^n\) takes its values in \({\mathbb {Q}}^d\), is piecewise constant and satisfies \(\Vert \theta ^n-\theta \Vert _\infty :=\sup _{(t,\omega )}|\theta ^n_t(\omega )-\theta _t(\omega ) |\rightarrow 0\) as \(n\rightarrow \infty \) (e.g., for \(d=1\) one can take \(\theta ^n=\sum _{i=1}^{n2^n}\frac{i-1}{2^n}1_{\{\frac{i-1}{2^n}\le \theta <\frac{i}{2^n}\}}+n 1_{\{\theta \ge n\}}\); the generalization to \(d\ge 2\) is straightforward). On the other hand, by the definition of H in (4.16), we have
Now let \(\epsilon >0\) and \(n_0\) such that for any \(n\ge n_0\), \(\Vert \theta ^n-\theta \Vert _\infty \le \epsilon \). Then for \(n\ge n_0\) and \(u\in \mathcal {U}\) we have,
which implies that
where \(B^k_n\) is a subset of \([0,T]\times \Omega \) on which \(\theta ^n\) is constant and equal to \(z^k_n\in {\mathbb {Q}}^d\). Therefore
where \({\mathcal {J}}_n^k\) is the countable subset of \(\mathcal {U}\) defined in (4.20) and associated with \(z^k_n\). Summing now over k, we obtain
since \(H^*\) is stochastic Lipschitz w.r.t. z (see (4.18)) and then
for \(n\ge n_0\). Sending \(\epsilon \) to 0 in (4.35), we obtain \(\Phi (t,\omega )\le H^*(t,x_.,\theta _t)\), which means
If \(\theta \) is not bounded, one can find a sequence of bounded \({\mathbb {L}}\)-processes \(({\bar{\theta }}_n)_{n\ge 0}\) such that \({\bar{\theta }}_n\rightarrow _n \theta \), \(\,\,d{\mathbb {P}} \times dt \text{-a.e. }\)
Therefore we have
But the stochastic Lipschitz property of \(H^*\) and the linearity of H w.r.t. z imply
\(H^*(t,x_.,{\bar{\theta }}_n(t))\rightarrow _n H^*(t,x_.,\theta (t))\) and \(\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}H(t,x_.,{\bar{\theta }}_n(t),u) \rightarrow _n \underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}H(t,x_.,\theta (t),u)\). We then obtain the desired result by taking the limit in (4.36). \(\square \)
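The dyadic, piecewise-constant approximation used in the proof above can be checked numerically. The following sketch (a hypothetical \((t,\omega )\) grid, \(d=1\), \(\theta \ge 0\), all names illustrative) verifies that the quantization converges in sup-norm at rate \(2^{-n}\):

```python
import numpy as np

def dyadic_approx(theta, n):
    """Piecewise-constant dyadic quantization from the proof (d = 1,
    theta >= 0): round down to the grid {i/2^n}, capping at level n."""
    return np.minimum(np.floor(theta * 2.0**n) / 2.0**n, float(n))

rng = np.random.default_rng(0)
theta = 3.0 * rng.random((50, 100))   # a bounded process sampled on a (t, omega) grid

for n in [4, 6, 8]:                   # n >= sup(theta), so the cap is inactive
    err = np.max(np.abs(dyadic_approx(theta, n) - theta))
    print(n, err)                     # sup-norm error below 2^{-n}
```

Since the quantizer rounds down to the nearest dyadic level, the sup-norm error is strictly below \(2^{-n}\) whenever \(n\) exceeds the bound on \(\theta \).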
Proposition 4.10
If g does not depend on the mean-field term then
Proof
Recall that \(Z^*\) is defined in (4.25). Then by the previous result and the properties of \({\mathrm {ess}\inf \,}\) (see [11, p. 229]), there exists a countable subset \({\bar{\mathcal {U}}}\) of \(\mathcal {U}\) such that
Therefore for any \(\epsilon >0\) there exists \(u^\epsilon \in {\bar{\mathcal {U}}}\) such that \( H^*(t,x_.,Z^*_t)\ge H(t,x_.,Z^*_t,u^\epsilon )-\epsilon . \) Thus (4.27) is satisfied since g does not depend on the mean-field term. Finally the result follows from Proposition 4.6. \(\square \)
4.2 Existence of Nearly-Optimal Controls
As noted above, the sufficient condition (4.29) is quite hard to verify in concrete situations, which makes Theorem 4.7 less useful for showing existence of optimal controls. Nevertheless, near-optimal controls enjoy many useful and desirable properties that optimal controls do not. In fact, thanks to Ekeland's variational principle [14], which we use below, near-optimal controls always exist under very mild conditions on the control set \(\mathcal {U}\) and the payoff functional J, while optimal controls may fail to exist or be difficult to establish. Moreover, there are many candidates for near-optimal controls, which makes it possible to select among them those that are easier to implement and to handle both analytically and numerically.
For later use we introduce the Ekeland metric \(d_E\) on the space \(\mathcal {U}\) of admissible controls defined as follows. For \(u, v\in \mathcal {U}\),
where \(d{\widehat{P}}=d{\mathbb {P}} \times dt \) is the product measure of \({\mathbb {P}}\) and the Lebesgue measure on [0, T].
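In a discretized setting, the Ekeland metric is simply the (normalized) \(d{\widehat{P}}\)-measure of the set where two controls disagree. A minimal numerical sketch, with the empirical measure over finitely many paths standing in for \({\mathbb {P}}\) (all names and the discretization are illustrative assumptions):

```python
import numpy as np

def ekeland_distance(u, v, T=1.0):
    """Discretized Ekeland metric: the d(P x dt)-measure of the set
    {(t, omega): u_t(omega) != v_t(omega)}.  u, v are arrays of shape
    (paths, time steps); paths are weighted uniformly (empirical P)."""
    n_paths, n_steps = u.shape
    dt = T / n_steps
    return np.sum(u != v) * dt / n_paths

rng = np.random.default_rng(1)
u = rng.integers(0, 2, size=(1000, 50))   # {0,1}-valued controls on 50 time steps
v = u.copy()
v[:, :5] = 1 - v[:, :5]                   # flip the first 5 time steps on every path
d = ekeland_distance(u, v)
print(d)                                  # ~ 5/50 = 0.1
```

Two controls are close in \(d_E\) when they differ on a small set in time and probability, which is exactly the topology used in the completeness and continuity arguments below.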
On the other hand let us consider the following assumption on \(\sigma \) which will replace (A2)-(b),(c).
Assumption (A6) \(\sigma (t,x_.)\) and \(\sigma ^{-1}(t,x_.)\) are bounded.
Lemma 4.11
- (i):
-
\(d_E\) is a distance. Moreover, \((\mathcal {U},d_E)\) is a complete metric space.
- (ii):
-
Let \((u^n)_n\) and u be in \(\mathcal {U}\). If \(d_E(u^n,u)\rightarrow 0\) then \(\mathbb {E}[\int _0^T\delta ^2(u^n_t,u_t)dt]\rightarrow 0\).
Proof
For a proof of
- (i):
See [10]. The proof of the completeness of \((\mathcal {U},d_E)\) requires only the completeness of the metric space \((U,\delta )\).
- (ii):
Let \((u^n)_n\) and u be in \(\mathcal {U}\). Then, by definition of the distance \(d_E\), since \(d_E(u^n,u)\rightarrow 0\), \(\delta (u^n_t,u_t)\) converges to 0, \(d{\mathbb {P}}\times dt\)-a.e. Now, since the set U is compact, it is totally bounded, i.e. for every \(\varepsilon >0\), there exist an integer \(p_{\varepsilon }\ge 1\) and points \(y_1,\ldots ,y_{p_{\varepsilon }}\) in U such that \(U\subset \bigcup _{j=1}^{p_{\varepsilon }} B(y_j,\varepsilon )\), where \(B(x,\varepsilon )\) denotes the closed ball with center x and radius \(\varepsilon \). In particular, by the triangle inequality, \(\delta (u^n_t(\omega ),u_t(\omega ))\le 2\varepsilon +\max _{1\le i,j\le p_{\varepsilon }}\delta (y_i,y_j)\) for every \(n\ge 1\) and \((t,\omega )\), so \(\delta \) is uniformly bounded on U. Thus, by dominated convergence, \(\mathbb {E}[\int _0^T\delta ^2(u^n_t,u_t)dt]\rightarrow 0\). \(\square \)
Proposition 4.12
Assume (A1), (A2)-(a), (A6) and (B1)–(B4). Let \((u^n)_n\) and u be in \(\mathcal {U}\). If \(d_E(u^n,u)\rightarrow 0\) then \(D^2_T(P^{u^n},P^u)\rightarrow 0\). Moreover, for every \(t\in [0,T]\), \(L^{u^n}_t\) converges to \(L^{u}_t\) in \(L^1({\mathbb {P}})\).
Proof
In view of Lemma 4.11, we have \(\mathbb {E}[\int _0^T \delta ^2(u_t, u^n_t)dt] \rightarrow 0\). Therefore the sequence \((\int _0^T \delta ^2(u_t, u^n_t)dt)_{n\ge 0}\) converges in probability w.r.t. \({\mathbb {P}}\) to 0 and, by compactness of U, it is bounded. On the other hand, since \(L^u_T\) is integrable, the sequence \((L^u_T\int _0^T \delta ^2(u_t, u^n_t)dt)_{n\ge 0}\) also converges in probability w.r.t. \({\mathbb {P}}\) to 0. Next, by the uniform boundedness of \((\int _0^T \delta ^2(u_t, u^n_t)dt)_{n\ge 0}\), the sequence \((L^u_T\int _0^T \delta ^2(u_t, u^n_t)dt)_{n\ge 0}\) is uniformly integrable. Finally, as we have
then
Now to conclude it is enough to use the inequality (4.5).
To prove the last statement, set \(M^u_t:=\int _0^t \sigma ^{-1}(s,x_.)f(s,x_.,P^u\circ x_s^{-1},u_s)dW_s\). In view of (B2), we have
which converges to zero as \(n\rightarrow \infty \).
Furthermore, setting \(f(t,x_.,u):=f(t,x_.,P^{u}\circ x_t^{-1},u_t)\), we have (taking (A6) into account)
which converges to zero as \(n\rightarrow \infty \). Therefore, \(L^{u^n}_t\) converges to \(L^u_t\) in probability w.r.t. \(\mathbb {P}\). But, by Theorem 2.2 in [18], under (A6), \((L^{u^n}_t)_n\) is uniformly integrable. Thus, \(L^{u^n}_t\) converges to \(L^u_t\) in \(L^1(\mathbb {P})\) as \(n\rightarrow \infty \). \(\square \)
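The convergence statement of Proposition 4.12 can be observed numerically in a toy model. The sketch below assumes \(\sigma =1\), drift \(f(t,x,u)=u\) and constant controls — illustrative assumptions, not the paper's general setting — and shows \(E[L^u_T]=1\) and the \(L^1({\mathbb {P}})\) convergence of the densities as the controls converge:

```python
import numpy as np

def girsanov_density(u_path, dW, dt):
    """Exponential martingale L^u_T = exp(int_0^T u dW - 0.5 int_0^T u^2 dt)
    for a toy model with sigma = 1 and drift f(t, x, u) = u (an illustrative
    assumption; the paper allows general bounded drifts)."""
    return np.exp((u_path * dW).sum(axis=1) - 0.5 * (u_path**2).sum(axis=1) * dt)

rng = np.random.default_rng(2)
n_paths, n_steps, T = 100_000, 50, 1.0
dt = T / n_steps
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))

u = np.full((1, n_steps), 0.5)        # limiting control, constant 0.5
L_u = girsanov_density(u, dW, dt)
print(abs(L_u.mean() - 1.0))          # E[L^u_T] = 1 up to Monte Carlo error

dists = []                            # L^1 distances E|L^{u^n}_T - L^u_T|
for eps in [0.4, 0.2, 0.1]:
    L_un = girsanov_density(u + eps, dW, dt)
    dists.append(np.mean(np.abs(L_un - L_u)))
print(dists)                          # shrinks as u^n -> u
```

The same reweighting mechanism underlies the weak formulation throughout the paper: only the density changes with the control, never the driving paths.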
Proposition 4.13
For any \(\varepsilon >0\), there exists a control \(u^{\varepsilon }\in \mathcal {U}\) such that
\(u^{\varepsilon }\) is called near or \(\varepsilon \)-optimal for the payoff functional J.
Proof
The result follows from Ekeland's variational principle, provided we show that the payoff functional J, as a mapping from the complete metric space \((\mathcal {U},d_E)\) into \(\mathbb {R}\), is bounded from below and lower semicontinuous. Since h and g are assumed uniformly bounded (see (B4)), J is obviously bounded. We now show continuity of J: \(J(u^n)\) converges to J(u) when \(d_E(u^n,u)\rightarrow 0\).
Integrating by parts, we obtain
Now by the boundedness of h we have the inequality
and, by (B3) together with the boundedness of h and Proposition 4.12, \({\mathbb {E}}[\int _0^T L^{u^n}_th(t,x_.,P^{u^n}\circ x_t^{-1},u^n_t)dt]\) converges to \({\mathbb {E}}[\int _0^T L^u_th(t,x_.,P^u\circ x_t^{-1},u_t)dt]\) as \(d_E(u^n,u)\rightarrow 0\). A similar argument yields convergence of \({\mathbb {E}}[L^{u^n}_Tg(x_T, P^{u^n}\circ x_T^{-1})]\) to \({\mathbb {E}}[L^u_Tg(x_T, P^u\circ x_T^{-1})]\) when \(d_E(u^n,u)\rightarrow 0\). \(\square \)
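For a concrete instance of an \(\varepsilon \)-optimal control, consider a toy model where \(J\) is available in closed form; a grid search over a discretized action set then exhibits a control within \(\varepsilon \) of the infimum. All modeling choices below (\(\sigma =1\), constant drift \(u\), \(g(x)=x^2\), \(U=[-1,1]\)) are illustrative assumptions:

```python
import numpy as np

def J(u, T=1.0):
    """Toy payoff in closed form: with sigma = 1 and constant drift u,
    x_T ~ N(uT, T) under P^u, so for the terminal cost g(x) = x^2 one
    gets J(u) = E^u[x_T^2] = u^2 T^2 + T."""
    return u**2 * T**2 + T

eps = 1e-3
grid = np.linspace(-1.0, 1.0, 201)    # discretize the action set U = [-1, 1]
values = J(grid)
u_eps = grid[np.argmin(values)]       # grid minimizer: an eps-optimal control
print(u_eps, values.min())            # J(u_eps) <= inf_u J(u) + eps
```

Here the infimum is attained at \(u=0\); in general no minimizer need exist, but the grid search still produces an \(\varepsilon \)-optimal control once the grid is fine enough, which is the practical content of Proposition 4.13.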
Finally, we provide below an example where an optimal control exists. Assume that:
- (i)
The drift f does not depend on the mean field term \(P^u\circ x_t^{-1}\) and the set \(f(t,\zeta ,U)\) is convex for any fixed \((t,\zeta )\in [0,T]\times \Omega \). Additionally for simplicity assume that \(\sigma =I_d\).
- (ii)
The instantaneous (resp. terminal) payoff has the following form:
where:
- (a)
the functions \(\psi _i\), \(i=1,2\), are bounded;
- (b)
the mapping \(a\in \mathbb {R}^d \mapsto (\Gamma (t,\zeta ,a),\Theta (\eta ,a))\) is continuous (\(\zeta \in \mathcal {C}\) and \(\eta \in \mathbb {R}^d\)).
Then an optimal control exists. Indeed, let \((u_n)_{n\ge 0}\) be a sequence in \(\mathcal {U}\) such that
As the set of densities \(\{L^{u}_T, u\in \mathcal {U}\}\) is weakly compact for the topology \(\sigma (L^1,L^\infty )\) (see e.g. [2, p. 470]), there exist \(u^*\in \mathcal {U}\) and a subsequence \(\{L^{u_{n_k}}_T, k\ge 0\}\) which converges weakly to \(L^{u^*}_T\). But for any \(t\le T\),
Using now the boundedness and continuity of \(\Gamma \) and \(\Theta \), and finally the dominated convergence theorem, we obtain:
in \(L^p\) for any \(p\ge 1\). Next
and by Theorem 2.2 in [18], there exists \(p_0>1\) such that \({\mathbb {E}}[(L_T^{u_{n_k}})^{p_0}]\) is bounded by a constant independent of k. Therefore, by the weak convergence of \((L_T^{u_{n_k}})_{k\ge 0}\) and (4.39), we have \(\lim _{k\rightarrow \infty }J(u_{n_k})=J(u^*)\), which implies that \(J(u^*)=\inf _{u\in \mathcal {U}}J(u)\), i.e. \(u^*\) is optimal.
As a final remark, this example can be formalized and generalized substantially. \(\square \)
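Numerically, the weak formulation used throughout this section amounts to reweighting plain Brownian paths by the density \(L^u_T\) rather than re-simulating the controlled dynamics for each control. A Monte Carlo sketch under toy assumptions (\(\sigma =1\), constant drift \(u\), bounded terminal payoff — all illustrative, not the paper's general model) compares the two formulations:

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, T = 100_000, 50, 1.0
dt = T / n_steps
u = 0.7                               # constant control (toy assumption)

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W_T = dW.sum(axis=1)
# Girsanov density for sigma = 1, f = u:
L_T = np.exp(u * W_T - 0.5 * u**2 * T)

g = np.cos                            # bounded terminal payoff
weak = np.mean(L_T * g(W_T))          # weak formulation: reweight under P
strong = np.mean(g(W_T + u * T))      # strong formulation: shift the drift
print(weak, strong, abs(weak - strong))
```

Both estimators target \(E[\cos (N(uT,T))]=\cos (uT)e^{-T/2}\); the weak one reuses the same driving paths for every control, which is precisely why weak compactness of \(\{L^u_T\}\) is the right tool in the existence argument above.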
5 The Zero-Sum Game Problem
In this section we consider a symmetric two-player zero-sum game. Let \(\mathcal {U}\) (resp. \(\mathcal {V}\)) be the set of admissible U-valued (resp. V-valued) controls for the first (resp. second) player, where \((U,\delta _1)\) and \((V,\delta _2)\) are compact metric spaces.
For \((u,v),({\bar{u}},{\bar{v}})\in U\times V\), we set
The distance \(\delta \) defines a metric on the compact space \(U\times V\).
Let f and h be two measurable functions from \([0,T]\times \Omega \times {\mathcal {P}}(\mathbb {R}^d)\times U\times V\) into \(\mathbb {R}^d\) and \(\mathbb {R}\), respectively, and g be a measurable function from \(\mathbb {R}^d\times {\mathcal {P}}(\mathbb {R}^d)\) into \(\mathbb {R}\) such that
- (C1)
For any \((u,v)\in \mathcal {U}\times \mathcal {V}\) and \(Q\in {\mathcal {P}}(\Omega )\), the processes \((f(t, x_.,Q\circ x_t^{-1},u_t,v_t))_t\) and \((h(t, x_.,Q\circ x_t^{-1},u_t,v_t))_t\) are progressively measurable. Moreover, \(g(x_T,Q\circ x_T^{-1})\) is \(\mathcal {F}_T\)-measurable.
- (C2)
For every \(t\in [0,T]\), \(w\in \Omega \), \((u,v),({\bar{u}},{\bar{v}}) \in U\times V\) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),
$$\begin{aligned} |\phi (t,w,\mu , u,v)-\phi (t,w,\nu ,{\bar{u}},{\bar{v}})|\le C(d(\mu ,\nu )+\delta ((u,v),({\bar{u}},{\bar{v}}))), \end{aligned}$$for \(\phi \in \{f,h\}\). For every \(w\in \Omega \) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),
$$\begin{aligned} |g(w,\mu )-g(w,\nu )|\le Cd(\mu ,\nu ). \end{aligned}$$ - (C3)
For every \(t\in [0,T]\), \(w\in \Omega ,\,\mu \in {\mathcal {P}}(\mathbb {R}^d)\) and \((u,v)\in \mathcal {U}\times \mathcal {V}\),
$$\begin{aligned} |f(t,w,\mu ,u,v)|\le C(1+|w|_t). \end{aligned}$$ - (C4)
h and g are uniformly bounded.
For \((u,v)\in \mathcal {U}\times \mathcal {V}\), let \(P^{u,v}\) be the probability measure on \((\Omega ,\mathcal {F})\) defined by
where
The proof of existence of \(P^{u,v}\) follows the same lines as that of \(P^u\) defined in (4.1)–(4.2). Hence, by Girsanov's theorem, the process \((W^{u,v}_t,\,\, 0\le t\le T)\) defined by
is an \((\mathbb {F}, P^{u,v})\)-Brownian motion. Moreover, under \(P^{u,v}\),
Let \(E^{u,v}\) denote the expectation w.r.t. \(P^{u,v}\).
The payoff functional \(J(u,v),\,(u,v)\in \mathcal {U}\times \mathcal {V}\), associated with the controlled SDE (5.4) is
The zero-sum game we consider is between two players, where the first player (with control u) wants to minimize the payoff (5.5), while the second player (with control v) wants to maximize it. The zero-sum game boils down to showing existence of a saddle-point for the game, i.e. to show existence of a pair \((u^*, v^*)\) of controls such that
for each \((u, v)\in \mathcal {U}\times \mathcal {V}\).
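The saddle-point inequalities above mirror the classical finite zero-sum game, where they can be checked directly. A sketch of the finite analogue (hypothetical payoff matrix; rows index the minimizer's actions, columns the maximizer's):

```python
import numpy as np

def find_saddle(J):
    """Pure saddle points of a finite zero-sum game where the row player
    (control u) minimizes and the column player (control v) maximizes."""
    return [(i, j)
            for i in range(J.shape[0]) for j in range(J.shape[1])
            if J[i, j] == J[:, j].min() == J[i, :].max()]

# toy payoff matrix (rows: u, columns: v); entry (1, 1) is a saddle point
J = np.array([[2.0, 4.0, 5.0],
              [1.0, 3.0, 2.0]])
saddles = find_saddle(J)
print(saddles)                        # [(1, 1)]
lower = max(J.min(axis=0))            # sup_v inf_u J
upper = min(J.max(axis=1))            # inf_u sup_v J
print(lower, upper, J[1, 1])          # all equal: the game has a value
```

At a saddle point the lower and upper values coincide, which is the finite counterpart of the equality of \(\underline{Y}\) and \(\overline{Y}\) established below under Isaacs' condition.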
The corresponding dynamics is given by the probability measure \(P^*\) on \((\Omega ,\mathcal {F})\) defined by
under which
For \((u,v)\in \mathcal {U}\times \mathcal {V}\) and \(z\in \mathbb {R}^d\), we introduce the Hamiltonian associated with the game (5.4)–(5.5):
Next, set
- (i)
\(\underline{H}(t,x_.,z):=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, \underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, H(t,x_.,z,u,v),\)
- (ii)
\(\overline{H}(t,x_.,z):=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, \underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, H(t,x_.,z,u,v),\)
- (iii)
\(\underline{g}(x_.):=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, \underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, g(x_T, P^{u,v}\circ x_T^{-1})\),
- (iv)
\(\overline{g}(x_.):=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, \underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, g(x_T, P^{u,v}\circ x_T^{-1})\).
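Discrete analogues of (i)–(ii) are easy to compute on action grids. The following sketch (a toy separable Hamiltonian at a fixed \((t,x,z)\), hypothetical grids for U and V) illustrates that the lower value never exceeds the upper value, and that they coincide — Isaacs' condition — in the separable case:

```python
import numpy as np

def lower_upper(H):
    """Discrete analogues of (i)-(ii): H is a (|U|, |V|) array of
    Hamiltonian values at a fixed (t, x, z)."""
    lower = np.max(np.min(H, axis=0))   # sup_v inf_u H
    upper = np.min(np.max(H, axis=1))   # inf_u sup_v H
    return lower, upper

# toy separable Hamiltonian H(u, v) = z*(u - v) + u^2 - v^2 on grids
z = 0.8
U = np.linspace(-1, 1, 41)
V = np.linspace(-1, 1, 41)
H = z * (U[:, None] - V[None, :]) + U[:, None]**2 - V[None, :]**2
lower, upper = lower_upper(H)
print(lower, upper)                     # equal: Isaacs' condition holds here
assert lower <= upper + 1e-12           # max-min <= min-max always
```

When the Hamiltonian separates into a u-part and a v-part, the inf and sup decouple, so min-max equals max-min; for non-separable H the two grids generally give a strict gap.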
As in Proposition 4.4, \(\underline{H}\), \(\overline{H}\), \(\underline{g}\) and \(\overline{g}\) exist. On the other hand following a similar proof as the one leading to (4.18), \(\underline{H}(t,x_.,z)\) and \(\overline{H}(t,x_.,z)\) are stochastic Lipschitz continuous in z with the Lipschitz constant \(C(1+|x|^{1+\alpha }_t)\).
Let \((\underline{Y},\underline{Z})\) be the solution of the BSDE associated with \((\underline{H}, \underline{g})\) and \((\overline{Y},\overline{Z})\) the solution of the BSDE associated with \((\overline{H}, \overline{g})\).
Definition 5.1
(Isaacs’ condition) We say that the Isaacs’ condition holds for the game if
Applying the comparison theorem for BSDEs and then uniqueness of the solution, we obtain the following
Proposition 5.2
For every \(t\in [0,T]\), it holds that \(\underline{Y}_t\le \overline{Y}_t\), \({\mathbb {P}}\)-a.s. Moreover, if Isaacs' condition holds, then
In the next theorem, we formulate conditions for which the zero-sum game has a value. For \((u,v)\in \mathcal {U}\times \mathcal {V}\), let \((Y^{u,v},Z^{u,v})\in {\mathcal {S}}^2_T\times {\mathcal {H}}^2_T\) be the solution of the BSDE
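Although the paper works with the exact solutions of these BSDEs, a one-dimensional backward-induction sketch on a recombining binomial tree (a standard discretization, offered here as an assumption-laden illustration rather than the paper's construction) may help fix ideas about how \((Y,Z)\) are produced from a terminal condition and a driver:

```python
import numpy as np

def solve_bsde(g, H, T=1.0, n=200):
    """Backward induction for the one-dimensional BSDE
    Y_t = g(x_T) + int_t^T H(Z_s) ds - int_t^T Z_s dW_s
    on a recombining binomial tree for W with steps +/- sqrt(h)
    (a toy driver depending on Z only)."""
    h = T / n
    sq = np.sqrt(h)
    x = sq * np.arange(-n, n + 1)       # terminal nodes of the tree
    Y = g(x)
    for _ in range(n):                  # step back through the tree
        up, down = Y[2:], Y[:-2]
        Z = (up - down) / (2.0 * sq)    # martingale representation term
        Y = 0.5 * (up + down) + h * H(Z)
    return Y[0]                         # Y_0

# sanity checks: with H = 0, Y_0 = E[g(W_T)], e.g. E[W_T^2] = T;
# with H = 1, the driver adds T to that expectation
y0 = solve_bsde(g=lambda x: x**2, H=lambda z: 0.0 * z, T=1.0)
y1 = solve_bsde(g=lambda x: x**2, H=lambda z: 1.0 + 0.0 * z, T=1.0)
print(y0, y1)                           # ~ 1.0 and ~ 2.0
```

The comparison arguments used in the proofs below (larger driver, larger Y) can be observed directly in this scheme, since the backward step is monotone in H.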
Theorem 5.3
(Existence of a value of the zero-sum game) Assume that, for every \(t\in [0,T]\),
If there exists \((u^*,v^*)\in \mathcal {U}\times \mathcal {V}\) such that, for every \(0\le t<T\),
and
then, \({\mathbb {P}}\)-a.s. for any \(t\le T\),
Moreover, the pair \((u^*,v^*)\) is a saddle-point for the game.
Proof
First note that we can replace \(\underline{Z}\) by \(\overline{Z}\) in (5.12) and the result still holds. So assume that \(\underline{H}(t,x_.,\underline{Z}_t)=\overline{H}(t,x_.,\underline{Z}_t)\). Then, by the uniqueness of the solutions of the BSDEs associated with \((\underline{H},\underline{g})\) and \((\overline{H},\overline{g})\), we have \((\underline{Y},\underline{Z})=(\overline{Y},\overline{Z})\).
On the other hand, by (5.13)-(5.14) one can easily check that the pair \((u^*,v^*)\) satisfies a saddle-point property for H and g as well, i.e.,
and
The previous equalities and the uniqueness of the solutions of the BSDEs imply that \(\overline{Y}_t=\underline{Y}_t=Y^{u^*,v^*}_t\).
Now let \((u,v)\in \mathcal {U}\times \mathcal {V}\) and let \(({\widehat{Y}}^u,{\widehat{Z}}^u)\), \(({\widetilde{Y}}^v,{\widetilde{Z}}^v)\) be the solutions of the following BSDEs:
Then by comparison we have
But \({\hat{Y}}^{u^*}\) satisfies the following BSDE:
Taking (5.13)–(5.14) into account and since the solution of the previous BSDE is unique, we obtain that
Moreover, (5.18) implies that \(Y^{u^*,v^*}_t\ge Y^{u^*,v}_t\) for any \(v \in \mathcal {V}\). In the same way we also have \(\underline{Y}_t=Y^{u^*,v^*}_t={\tilde{Y}}^{v^*}_t\le Y^{u,v^*}_t\), \({\mathbb {P}}\)-a.s., for any \(u\in \mathcal {U}\). Therefore,
Thus, \((u^*,v^*)\) is a saddle-point of the game and \(\underline{Y}_t=Y^{u^*, v^*}_t\) is the value of the game, i.e., it satisfies: \({\mathbb {P}}\)-a.s. for any \(t\le T\),
\(\square \)
Final remark Assumptions (B4) and (C4) on the boundedness of the functions g and h can be substantially weakened by using subtle arguments on existence and uniqueness of solutions of one-dimensional BSDEs, which are by now well known in the BSDE literature.
References
Bayraktar, E., Cosso, A., Pham, H.: Randomized dynamic programming principle and Feynman–Kac representation for optimal control of McKean–Vlasov dynamics. Trans. Am. Math. Soc. 370(3), 2115–2160 (2018)
Beneš, V.E.: Existence of optimal stochastic control laws. SIAM J. Control 9, 446–472 (1971)
Bensoussan, A., Frehse, J., Yam, P.: Mean Field Games and Mean Field Type Control Theory, vol. 101. Springer, New York (2013)
Buckdahn, R., Li, J.: Stochastic differential games and viscosity solutions of Hamilton–Jacobi–Bellman–Isaacs equations. SIAM J. Control Optim. 47(1), 444–475 (2008)
Carmona, R., Delarue, F.: Probabilistic Theory of Mean Field Games with Applications. Springer, New York (2018)
Carmona, R., Lacker, D.: A probabilistic weak formulation of mean field games and applications. Ann. Appl. Probab. 25(3), 1189–1231 (2015)
Carmona, R., Delarue, F., Lachapelle, A.: Control of McKean–Vlasov dynamics versus mean field games. Math. Financ. Econ. 7(2), 131–166 (2013)
Elliott, R.J.: The existence of value in stochastic differential games. SIAM J. Control Optim. 14(1), 85–94 (1976)
Elliott, R.J., Davis, M.H.A.: Optimal play in a stochastic differential game. SIAM J. Control Optim. 19(4), 543–554 (1981)
Elliott, R.J., Kohlmann, M.: The variational principle and stochastic optimal control. Stochastics 3, 229–241 (1980)
El-Karoui, N.: Les aspects probabilistes du contrôle stochastique. In: École d'Été de Probabilités de Saint-Flour IX-1979. Lecture Notes in Mathematics, vol. 876, pp. 73–238. Springer, Berlin (1981)
El-Karoui, N., Hamadène, S.: BSDEs and risk-sensitive control, zero-sum and nonzero-sum game problems of stochastic functional differential equations. Stoch. Process. Appl. 107(1), 145–169 (2003)
El-Karoui, N., Peng, S., Quenez, M.C.: Backward stochastic differential equations in finance. Math. Financ. 7(1), 1–71 (1997)
Ekeland, I.: On the variational principle. J. Math. Anal. Appl. 47, 324–353 (1974)
Fleming, W.H., Souganidis, P.E.: On the existence of value functions of two-player, zero-sum stochastic differential games. Indiana Univ. Math. J. 38(2), 293–314 (1989)
Hamadène, S., Lepeltier, J.P.: Zero-sum stochastic differential games and backward equations. Syst. Control Lett. 24(4), 259–263 (1995)
Hamadène, S., Lepeltier, J.P.: Backward equations, stochastic control and zero-sum stochastic differential games. Stoch. Stoch. Rep. 54(3–4), 221–231 (1995)
Haussmann, U.G.: A Stochastic Maximum Principle for Optimal Control of Diffusions. Wiley, Hoboken (1986)
Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes. Grundlehren der Mathematischen Wissenschaften, vol. 288, 2nd edn. Springer, Berlin (2003)
Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus, 2nd edn. Springer, New York (2012)
Li, J., Min, H.: Weak solutions of mean-field stochastic differential equations and application to zero-sum stochastic differential games. SIAM J. Control Optim. 54(3), 1826–1858 (2016)
Pardoux, E., Peng, S.: Adapted solution of a backward stochastic differential equation. Syst. Control Lett. 14(1), 55–61 (1990)
Appendix
For the sake of completeness, we display a proof of the fact that the set of probability measures \({\mathcal {P}}(\Omega )\), endowed with the total variation metric \(D_T\) defined on \((\Omega ,\mathcal {F}_T)\) by
is complete. Indeed, let \((Q_n)_{n\ge 0}\) be a Cauchy sequence for \(D_T\). Then, for each set \(A\in \mathcal {F}_T\), the sequence \((Q_n(A))_{n\ge 0}\) is a Cauchy sequence in \(\mathbb {R}\) and thus converges. By the Vitali–Hahn–Saks–Nikodym theorem, the set function Q defined on \((\Omega ,\mathcal {F}_T)\) by
is indeed a probability measure.
We will now show that \(D_T(Q_n,Q)\rightarrow 0\). Given \(\varepsilon >0\), there exists an integer \(n_0\) such that, for all \(m,n>n_0\) and all \(A\in \mathcal {F}_T\),
Sending m to infinity, we obtain
Now taking the supremum over all \(A\in \mathcal {F}_T\), we finally get that \(D_T(Q_n,Q)\rightarrow 0\), as \(n\rightarrow \infty \).
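For finitely supported measures, the total variation metric of the appendix reduces to half an \(\ell ^1\)-norm, and convergence in \(D_T\) can be watched directly. A small sketch (hypothetical measures on three atoms) of a sequence converging in total variation:

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance sup_A |P(A) - Q(A)| between two probability
    vectors on a finite sigma-algebra; equals 0.5 * ||p - q||_1."""
    return 0.5 * np.sum(np.abs(p - q))

# a sequence Q_n -> Q in total variation, mixing Q with a point mass
Q = np.array([0.2, 0.3, 0.5])
delta = np.array([1.0, 0.0, 0.0])
for n in [1, 2, 4, 8]:
    Q_n = (1 - 2.0**-n) * Q + 2.0**-n * delta
    print(n, total_variation(Q_n, Q))   # = 0.8 * 2^{-n} -> 0
```

The geometric decay here plays the role of the Cauchy estimate in the completeness proof: once the distances are summable, the setwise limits assemble into a probability measure.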
Djehiche, B., Hamadène, S.: Optimal Control and Zero-Sum Stochastic Differential Game Problems of Mean-Field Type. Appl. Math. Optim. 81, 933–960 (2020). https://doi.org/10.1007/s00245-018-9525-6