1 Introduction

In this work we investigate the existence of an optimal control and of a saddle point for a zero-sum game associated with a payoff functional of mean-field type, under a dynamics driven by the weak solution of a stochastic differential equation (SDE), also of mean-field type. The results obtained extend in a natural way those of [17] for standard payoffs associated with standard diffusion processes.

Given a control process \(u:=(u_t)_{t\le T}\) with values in some compact metric space U, the controlled SDE of mean-field type we consider in this paper is of the following functional form:

$$\begin{aligned} dx_t=f(t,x_.,P^u\circ x_t^{-1},u_t)dt+\sigma (t,x_.)dW^{P^u}_t,\quad x_0=\text {x}\in \mathbb {R}^d, \end{aligned}$$
(1.1)

i.e. f depends on the whole path \(x_.\) and on \(P^u\circ x_t^{-1}\), the marginal probability distribution of \(x_t\) under the probability measure \(P^u\) (this feature can be relaxed substantially, see Remark 3.2), \(\sigma \) depends on \(x_.\), and \(W^{P^u}\) is a standard Brownian motion under \(P^u\).

The payoff functional \(J(u),\,\, u\in \mathcal {U},\) associated with the controlled SDE is of the form

$$\begin{aligned} J(u):=E^u\left[ \int _0^T h\left( t,x_.,P^u\circ x_t^{-1},u_t\right) dt+ g\left( x_T,P^u\circ x_T^{-1}\right) \right] , \end{aligned}$$

where \(E^u\) denotes the expectation w.r.t. \(P^u\).

As an example, the functions f, g and h can have the following forms

$$\begin{aligned} f(t,x_.,E^u[\varphi _1(x_t)],u), g(x,E^u[\varphi _2(x_T)]) \text{ and } h(t,x_.,E^u[\varphi _3(x_t)],u) \end{aligned}$$

where \(\varphi _i\), \(i=1,2,3\), are bounded Borel-measurable functions.

Taking \(h=0\) and \(g(x,y)=\varphi _2(x)^2-y^2\), the cost functional reduces to the variance, \(J(u)=E^u[\varphi _2(x_T)^2]-\left( E^u[\varphi _2(x_T)]\right) ^2=\mathrm {Var}_{P^u}[\varphi _2(x_T)]\).
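As a quick sanity check, the following minimal Monte Carlo sketch (in Python, with a hypothetical bounded test function playing the role of \(\varphi _2\), and Gaussian samples standing in for the law of \(x_T\) under \(P^u\)) verifies this identity numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

def phi2(x):
    # a bounded Borel test function playing the role of varphi_2 (an assumption)
    return np.tanh(x)

# Monte Carlo samples standing in for x_T under P^u (toy choice)
x_T = rng.normal(loc=0.5, scale=1.0, size=100_000)

y = phi2(x_T).mean()                   # the mean-field term y = E^u[phi2(x_T)]
J = (phi2(x_T) ** 2).mean() - y ** 2   # J(u) = E^u[g(x_T, y)], g(x, y) = phi2(x)^2 - y^2
print(np.isclose(J, phi2(x_T).var()))  # True: J(u) = Var_{P^u}[phi2(x_T)]
```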

While controlling a strong solution of an SDE means controlling the process \(x^u\) defined on a given probability space \((\Omega , \mathcal {F},\mathbb {F}, {\mathbb {P}})\) carrying a Brownian motion W whose natural filtration is \(\mathbb {F}\), controlling a weak solution of an SDE boils down to controlling the Girsanov density process \(L^u:=dP^u/d{\mathbb {P}}\) of \(P^u\) w.r.t. a reference probability measure \({\mathbb {P}}\) on \(\Omega \) such that \((\Omega ,{\mathbb {P}})\) carries a Brownian motion W and the coordinate process \(x_t\) is the unique solution of the following stochastic differential equation:

$$\begin{aligned} dx_t=\sigma (t,x_.)dW_t,\quad x_0=\text {x}. \end{aligned}$$

Integrating by parts, the payoff functional can be expressed in terms of \(L^u\) as follows

$$\begin{aligned} J(u)={\mathbb {E}}\left[ \int _0^T L^u_th\left( t,x_.,P^u\circ x_t^{-1},u_t\right) dt+ L^u_Tg\left( x_T,P^u\circ x_T^{-1}\right) \right] , \end{aligned}$$

where \({\mathbb {E}}\) denotes the expectation w.r.t. \({\mathbb {P}}\). For this reason, we do not include a control parameter in the diffusion term \(\sigma \).
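The following minimal sketch illustrates this reweighting under toy assumptions (\(d=1\), \(\sigma =1\), a constant control u with drift \(f=u\), and hypothetical bounded costs h and g): the paths are simulated once under the driftless reference measure \({\mathbb {P}}\), and the control enters only through the density \(L^u\).

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, T = 50, 20_000, 1.0                  # time steps, paths, horizon
dt = T / N
dW = np.sqrt(dt) * rng.normal(size=(N, M))
x = np.concatenate([np.zeros((1, M)), np.cumsum(dW, axis=0)])  # dx = dW under P

u = 0.3                                    # a constant control; drift f = u (toy)
h = lambda t, x: np.cos(x)                 # toy bounded running cost
g = lambda x: np.tanh(x)                   # toy bounded terminal cost

logL = np.zeros(M)                         # log of the Girsanov density L^u
J = 0.0
for n in range(N):
    L = np.exp(logL)                       # L^u at time t_n
    J += (L * h(n * dt, x[n])).mean() * dt # contribution E[L_t h(t, x_t)] dt
    logL += u * dW[n] - 0.5 * u**2 * dt    # exponential martingale increment
J += (np.exp(logL) * g(x[-1])).mean()      # terminal term E[L_T g(x_T)]
print("J(u) ≈", J)
```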

In the first part of this paper we establish conditions for the existence of an optimal control associated with J(u), i.e. a stochastic process \(u^*\) with values in U such that

$$\begin{aligned} J(u^*)=\min _{u\in \mathcal {U}}J(u). \end{aligned}$$

Optimal control of SDEs of mean-field type is also known as McKean–Vlasov type optimal control or simply optimal control of nonlinear diffusion; see e.g. the recent books [3] and [5] and the references therein.

The recent paper by Carmona and Lacker [6] discusses a similar problem, but in the so-called mean-field game setting (where they further consider the marginal laws of the control process, i.e. \(P^u\circ u_t^{-1}\)), which has the following structure (cf. [6]):

(1) Fix a probability measure \(\mu \) on the path space and a flow \(\nu : t \mapsto \nu _t\) of measures on the control space;

(2) Standard optimization: With \(\mu \) and \(\nu \) frozen, solve the standard optimal control problem:

    $$\begin{aligned} \left\{ \begin{array}{lll} \inf _u E^u\left[ \int _0^T h(t,x_.,\mu ,\nu , u_t)dt+ g(x_T,\mu )\right] , \\ dx_t=f(t,x_.,\mu , u_t)dt+\sigma (t, x_.)dW^{P^u}_t,\quad x_0=\text {x}\in \mathbb {R}^d, \end{array} \right. \end{aligned}$$
    (1.2)

    i.e. find an optimal control u, inject it into the dynamics of (1.2), and find the law \(\Phi _x(\mu ,\nu )\) of the optimally controlled state process and the flow \(\Phi _u(\mu ,\nu )\) of marginal laws of the optimal control process;

(3) Matching: Find a fixed point \(\mu =\Phi _x(\mu ,\nu ),\,\, \nu =\Phi _u(\mu ,\nu )\).

To perform the matching step (3), the authors of [6] are led to impose more or less stringent assumptions, which in turn narrow the scope of applicability of their framework. This is mainly due to the fact that the functional which is supposed to provide the optimal control is rather irregular. Overall, showing existence of a fixed point is not an easy task and does not carry over to broader frameworks. For an in-depth comparison between the mean-field games approach and optimal control of strong solutions of SDEs of mean-field type, see [3, 7] and the references therein. In the recent paper [1], the authors derive a nonlinear Feynman–Kac representation for the value function associated with an optimal control related to such SDEs. However, they do not address the problem of existence of optimal or even \(\epsilon \)-optimal controls. A minimal numerical sketch of the matching scheme (1)–(3) is given below.
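The sketch uses entirely toy assumptions: one space dimension, \(\sigma =1\), the flow of marginal laws summarized by its means, and a hypothetical closed-form "solver" for the frozen problem in step (2). It is meant only to make the fixed-point structure explicit, not to reproduce the framework of [6].

```python
import numpy as np

def solve_frozen_problem(mu):
    """Step (2): with the marginal flow mu frozen, solve the standard
    problem. Here we simply posit that the optimal feedback steers the
    state toward the frozen mean (a toy stand-in for a real solver)."""
    return lambda n, x: -(x - mu[n])

def law_of_state(control, n_steps, n_paths, dt, rng):
    """Simulate dx = u(t, x) dt + dW under the control and return the
    flow of sample means, a crude stand-in for the flow of marginal laws."""
    x = np.zeros(n_paths)
    means = np.zeros(n_steps)
    for n in range(n_steps):
        means[n] = x.mean()
        x += control(n, x) * dt + np.sqrt(dt) * rng.normal(size=n_paths)
    return means

rng = np.random.default_rng(0)
mu = np.zeros(50)                      # step (1): initial frozen flow
for _ in range(10):                    # step (3): matching by iteration
    u_star = solve_frozen_problem(mu)  # step (2) with mu frozen
    mu = law_of_state(u_star, n_steps=50, n_paths=10_000, dt=0.02, rng=rng)
```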

In this paper we use another approach which in a way addresses the full control problem where the marginal law changes with the control process and is not frozen as in the mean-field game approach. Our strategy goes as follows: By a fixed point argument we first show that for any admissible control u there exists a unique probability \(P^u\) under which the SDE

$$\begin{aligned} dx_t=f(t,x_.,P^u\circ x_t^{-1},u_t)dt+\sigma (t,x_.)dW^{P^u}_t,\quad x_0=\text {x}\in \mathbb {R}^d, \end{aligned}$$

has a weak solution, where \(W^{P^u}\) is a Brownian motion under \({P^u}\). Moreover, the mapping \(u\mapsto P^u\) is continuous. Therefore, the mean-field terms which appear in the drift of the above equation and in the payoff functional J(u) can be treated as continuous functions of u. Using this point of view, which avoids the irregularity issues encountered in [6], we provide conditions for existence of an optimal control using backward stochastic differential equations (BSDEs), in a similar fashion to standard control problems, i.e. those without mean-field terms. Indeed, if \((Y^u,Z^u)\) is the solution of the BSDE associated with the driver (Hamiltonian) \(H(t,x_.,z,u):=h(t,x_.,P^{u}\circ x_t^{-1},u_t)+z\cdot \sigma ^{-1}(t,x_.)f(t,x_.,P^u\circ x_t^{-1},u_t)\) and the terminal value \(g(x_T,P^{u}\circ x^{-1}_T)\), we have \(Y^u_0=J(u)\). Moreover, the unique solution \((Y^*,Z^*)\) of the BSDE associated with

$$\begin{aligned} H^*(t,x_.,z):=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z,u),\,\, g^*(x_.):=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}g(x_T,P^{u}\circ x^{-1}_T) \end{aligned}$$

satisfies, under appropriate assumptions, \(Y^*(t)=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}Y^u(t)\). In particular, if g does not depend on the mean-field term, this equality holds. The use of the essential infimum over the whole set of admissible controls \(\mathcal {U}\), instead of the infimum of the Hamiltonian H over the set U of actions (as is the case for the standard control problem, as discussed e.g. in [17]), is simply due to the fact that the mean-field coupling \(P^{u}\circ x_t^{-1}\) involves the whole path of the control u over [0, t] and not only \(u_t\). This nonlocal dependence of H on the control does not seem to be covered by the powerful Beneš-type progressively measurable selection theorems frequently used in standard control problems. Thus, if there exists \(u^*\in \mathcal {U}\) such that \(H^*(t,x_.,z)=H(t,x_.,z,u^*)\) and \(g^*(x_.)=g(x_T,P^{u^*}\circ x^{-1}_T)\), then \(u^*\) is an optimal control for J(u). We do not know of any suitable measurable selection theorem that would guarantee existence of such a \(u^*\). Using Ekeland's variational principle, we also show the existence of a near-optimal control. Finally, we exhibit some particular cases where an optimal control exists.

The zero-sum game we consider is between two players with controls u and v valued in some compact metric spaces U and V, respectively. The dynamics and the payoff function associated with the game are both of mean-field type and are given by

$$\begin{aligned} dx_t=f\bigg (t,x_.,P^{u,v}\circ x_t^{-1},u_t,v_t\bigg )dt+\sigma (t,x_.)dW^{P^{u,v}}_t,\quad x_0=\text {x}\in \mathbb {R}^d, \end{aligned}$$
(1.3)

and

$$\begin{aligned} J(u,v):=E^{u,v}\left[ \int _0^T h\bigg (t,x_.,P^{u,v}\circ x_t^{-1},u_t,v_t\bigg )dt+ g\bigg (x_T,P^{u,v}\circ x^{-1}_T\bigg )\right] , \end{aligned}$$

where \(P^{u,v}\circ x_t^{-1}\) is the marginal probability distribution of \(x_t\) under the probability measure \(P^{u,v}\), \(W^{P^{u,v}}\) is a standard Brownian motion under \(P^{u,v}\) and \(E^{u,v}\) denotes the expectation w.r.t. \(P^{u,v}\).

In the zero-sum game, the first player (with control u) wants to minimize the payoff J(u,v) while the second player (with control v) wants to maximize it. The zero-sum game boils down to investigating the existence of a saddle point for the game, i.e. to showing existence of a pair \((u^*, v^*)\) of controls such that

$$\begin{aligned} J(u^*, v) \le J(u^*, v^*)\le J(u,v^*), \end{aligned}$$

for each (u,v) with values in \(U\times V\).
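As a toy static illustration of these inequalities, consider \(J(u,v)=u^2-v^2\) on \(U=V=[-1,1]\), for which \((u^*,v^*)=(0,0)\) is a saddle point; the following sketch checks the inequalities on a grid (a purely illustrative example, unrelated to the dynamics above):

```python
import numpy as np

# Toy static zero-sum payoff on U = V = [-1, 1] with saddle point (0, 0):
# J(u*, v) = -v^2 <= 0 = J(u*, v*) <= u^2 = J(u, v*).
J = lambda u, v: u**2 - v**2
u_star, v_star = 0.0, 0.0
grid = np.linspace(-1.0, 1.0, 101)
assert all(J(u_star, v) <= J(u_star, v_star) for v in grid)
assert all(J(u_star, v_star) <= J(u, v_star) for u in grid)
print("saddle-point inequalities verified on the grid")
```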

This framework of games is symmetric in the sense that the two players are allowed to use arbitrary adapted controls. Its introduction goes back to the eighties (see e.g. [8, 9, 16, 17]). Moreover, these controls are somehow of feedback form since, in the canonical space, for any control \((u_t)_{0\le t\le T}\) (resp. \((v_t)_{0\le t\le T}\)) there exists a measurable function \({\bar{u}}\) (resp. \({\bar{v}}\)) such that \(u_t={\bar{u}}(t,x_.)\) (resp. \(v_t={\bar{v}}(t,x_.)\)). This framework differs from the one in e.g. [4, 15], where the zero-sum game is formulated so that the first player uses controls while the second one uses strategies, which are in a way responses, making the game nonsymmetric.

By using the same approach as in the control framework, we show that the game has a saddle point. The recent paper by Li and Min [21] deals with the same zero-sum game for weak solutions of SDEs of the form (1.1), where they apply a matching-argument approach similar to that of [6]. However, due to the irregularity of the functional which provides the fixed point, they could only show existence of a so-called generalized saddle point, i.e. of a pair of controls \((u^*, v^*)\) which satisfies (see, for instance, Theorem 5.6 in [21])

$$\begin{aligned} J(u^*, v)-C\psi (v,v^*) \le J(u^*, v^*)\le J(u,v^*)+C\psi (u,u^*), \end{aligned}$$

where \(\psi (u,{\bar{u}}):=({\mathbb {E}}[\int _0^T d^2(u_s,{\bar{u}}_s)ds])^{1/4}\) and C is a positive constant depending only on f and h.

In the literature on mean-field models, the Wasserstein metric is by now standard, as it is designed to guarantee weak convergence of probability measures together with convergence of finite moments. In this paper we have instead chosen the total variation distance as the metric between two probability measures, although it does not guarantee existence of finite moments. The reason is simply its relationship to the Hellinger distance through the celebrated Csiszár–Kullback–Pinsker inequality (see the bound (4.22) of Theorem V.4.21 in [19]), which gives a simple and direct proof of existence of a unique probability measure \(P^u\) (resp. \(P^{u,v}\)) under which the SDE (1.1) (resp. (1.3)) has a weak solution.

The paper is organized as follows. In Sect. 2, we collect notation and preliminaries. In Sect. 3, we account for existence and uniqueness of the weak solution of the SDE of mean-field type. In Sect. 4, we provide conditions for existence of an optimal control and prove existence of nearly-optimal controls. Finally, in Sect. 5, we investigate existence of a saddle point for a two-person zero-sum game.

2 Preliminaries

Let \(\Omega :=\mathcal {C}([0,T]; \mathbb {R}^d)\) be the space of \(\mathbb {R}^d\)-valued continuous functions on [0, T] endowed with the metric of uniform convergence on [0, T]; \(|w|_t:=\sup _{0\le s\le t}|w_s|\), for \(0\le t\le T\). Denote by \(\mathcal {F}\) the Borel \(\sigma \)-field over \(\Omega \). Given \(t\in [0,T]\) and \(\omega \in \Omega \), let \(x(t,\omega )\) be the position in \(\mathbb {R}^d\) of \(\omega \) at time t. Denote by \(\mathcal {F}^0_t:=\sigma (x_s,\,\, s\le t),\, 0\le t\le T,\) the filtration generated by x. Below, C denotes a generic positive constant which may change from one line to another.

Let \(\sigma \) be a function from \([0,T]\times \Omega \) into \(\mathbb {R}^{d\times d}\) such that

(A1) \(\sigma \) is \(\mathcal {F}^0_t\)-progressively measurable;

(A2) There exists a constant \(C>0\) such that

(a) For every \(t\in [0,T]\) and \(w, \bar{w} \in \Omega \), \( |\sigma (t,w)-\sigma (t,\bar{w})|\le C|w-\bar{w}|_t.\)

(b) \(\sigma \) is invertible and its inverse \(\sigma ^{-1}\) satisfies \(|\sigma ^{-1}(t,w)|\le C(1+|w|_t^{\alpha })\), for some constant \(\alpha \ge 0\).

(c) For every \(t\in [0,T]\) and \(w\in \Omega \), \(|\sigma (t,w)|\le C(1+|w|_t).\)

Let \({\mathbb {P}}\) be a probability measure on \(\Omega \) such that \((\Omega ,{\mathbb {P}})\) carries a Brownian motion \((W_t)_{0\le t\le T}\) and such that the coordinate process \((x_t)_{0\le t\le T}\) is the unique solution of the following stochastic differential equation:

$$\begin{aligned} dx_t=\sigma (t,x_.)dW_t,\quad x_0=\text {x} \in \mathbb {R}^d. \end{aligned}$$
(2.1)

Such a triplet \(({\mathbb {P}},W,x)\) exists due to Proposition 4.6 in [20, p. 315] since \(\sigma \) satisfies (A2). Moreover, for every \(p\ge 2\),

$$\begin{aligned} {\mathbb {E}}_{{\mathbb {P}}}[|x|_T^{p}]\le C_p, \end{aligned}$$
(2.2)

where \(C_p\) depends only on p, T, the initial value \(\text {x}\) and the linear growth constant of \(\sigma \) (see [20, p. 306]). Again, since \(\sigma \) satisfies (A2), we have \(dW_t=\sigma ^{-1}(t,x_.)dx_t\), so \(\mathcal {F}^0_t\) coincides with \(\sigma \{W_s, s\le t\}\) for any \(t\le T\). We denote by \(\mathbb {F}:=(\mathcal {F}_t)_{0\le t\le T}\) the completion of \((\mathcal {F}^0_t)_{t\le T}\) with the \({\mathbb {P}}\)-null sets of \(\Omega \).

Let \({\mathcal {P}}(\mathbb {R}^d)\) denote the set of probability measures on \(\mathbb {R}^d\) and \({\mathcal {P}}_2(\mathbb {R}^d)\) the subset of measures \(\nu \) with finite second moment:

$$\begin{aligned} \int _{\mathbb {R}^d}|y|^2\nu (dy)<+\,\infty . \end{aligned}$$

For \(\mu ,\nu \in {\mathcal {P}}(\mathbb {R}^d)\), the total variation distance is defined by the formula

$$\begin{aligned} d(\mu ,\nu )=2\sup _{B\in \mathcal {B}(\mathbb {R}^d)}|\mu (B)-\nu (B)|. \end{aligned}$$
(2.3)
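For discrete laws on a common finite support, (2.3) reduces to the \(\ell ^1\)-distance of the weight vectors, as the following minimal sketch (with arbitrary toy weights) illustrates; the supremum is attained at \(B=\{i:\mu _i>\nu _i\}\).

```python
import numpy as np

# Two toy laws on a common finite support {y_1, y_2, y_3}
mu = np.array([0.2, 0.5, 0.3])
nu = np.array([0.4, 0.4, 0.2])

# 2 * sup_B |mu(B) - nu(B)| equals the l1-distance of the weights
d = np.abs(mu - nu).sum()
print(d)  # 0.4
```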

Furthermore, let \({\mathcal {P}}(\Omega )\) be the space of probability measures P on \(\Omega \) and \({\mathcal {P}}_p(\Omega ), \,p\ge 1,\) be the subspace of probability measures such that

$$\begin{aligned} \Vert P\Vert _p^p:=\int _{\Omega }|w|^p_TP(dw)=E_{P}[|x|_T^p]<+\infty , \end{aligned}$$

where \(|x|_t:=\sup _{0\le s\le t}|x_s|\), \(0\le t\le T\).

Define on \(\mathcal {F}\) the total variation metric

$$\begin{aligned} d(P,Q):=2\sup _{A\in \mathcal {F}}|P(A)-Q(A)|,\quad P,Q \in {\mathcal {P}}(\Omega ). \end{aligned}$$
(2.4)

Similarly, on the filtration \(\mathbb {F}\), we define the total variation metric between two probability measures P and Q as

$$\begin{aligned} D_t(P,Q):=2\sup _{A\in \mathcal {F}_t}|P(A)-Q(A)|,\quad 0\le t\le T. \end{aligned}$$
(2.5)

It satisfies

$$\begin{aligned} D_s(P,Q)\le D_t(P,Q),\quad 0\le s\le t. \end{aligned}$$
(2.6)

For \(P, Q\in {\mathcal {P}}(\Omega )\) with time marginals \(P_t:=P\circ x_t^{-1}\) and \(Q_t:=Q\circ x_t^{-1}\), the total variation distance between \(P_t\) and \(Q_t\) satisfies

$$\begin{aligned} d(P_t,Q_t)\le D_t(P,Q),\quad 0\le t\le T. \end{aligned}$$
(2.7)

Indeed, we have

$$\begin{aligned}\begin{array}{lll} d(P_t,Q_t)&:=2\sup _{B\in \mathcal {B}(\mathbb {R}^d)}|P_t(B)-Q_t(B)|\\ &=2\sup _{B\in \mathcal {B}(\mathbb {R}^d)}|P(x_t^{-1}(B))-Q(x_t^{-1}(B))|\\ &\le 2\sup _{A\in \mathcal {F}_t}|P(A)-Q(A)|=D_t(P,Q). \end{array} \end{aligned}$$

Endowed with the total variation metric \(D_T\) on \((\Omega , \mathcal {F}_T)\), \({\mathcal {P}}(\Omega )\) is a complete metric space. For the sake of completeness, a proof is displayed in the Appendix. Moreover, by the Portmanteau theorem, convergence in \(D_T\) implies the usual weak convergence of probability measures.

3 Diffusion Process of Mean-Field Type

Hereafter, a process \(\theta \) from \([0,T]\times \Omega \) into a measurable space is said to be progressively measurable if it is progressively measurable w.r.t. \(\mathbb {F}\). Let \({\mathcal {S}}^2_T\) be the set of \(\mathbb {F}\)-progressively measurable continuous processes \((\zeta _t)_{t\le T}\) such that \({\mathbb {E}}[\sup _{t\le T}|\zeta _t|^2]<\infty \), and let \({\mathcal {H}}^2_T\) be the set of \(\mathbb {F}\)-progressively measurable processes \((\theta _t)_{t\le T}\) such that \({\mathbb {E}}[\int _0^T|\theta _s|^2ds]<\infty \).

Let b be a measurable function from \([0,T]\times \Omega \times {\mathcal {P}}(\mathbb {R}^d)\) into \(\mathbb {R}^d\) such that

(A3) For every \(Q\in {\mathcal {P}}(\Omega )\), the process \((b(t, x_.,Q\circ x_t^{-1}))_{t\le T}\) is progressively measurable.

(A4) For every \(t\in [0,T]\), \(w\in \Omega \) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),

    $$\begin{aligned} |b(t,w,\mu )-b(t,w,\nu )|\le Cd(\mu ,\nu ). \end{aligned}$$
(A5) For every \(t\in [0,T]\), \(w\in \Omega \) and \(\mu \in {\mathcal {P}}(\mathbb {R}^d)\),

    $$\begin{aligned} |b(t,w,\mu )|\le C(1+|w|_t). \end{aligned}$$

Next, for \(Q\in {\mathcal {P}}(\Omega )\), let \(P^Q\) be the measure on \((\Omega ,\mathcal {F})\) defined by

$$\begin{aligned} dP^Q:=L_T^Q d{\mathbb {P}} \end{aligned}$$
(3.1)

with

$$\begin{aligned} L_t^Q:=\mathcal {E}_t\left( \int _0^{\cdot } \sigma ^{-1}(s,x_{\cdot })b(s,x_{\cdot },Q\circ x_s^{-1})dW_s\right) ,\quad 0\le t\le T, \end{aligned}$$
(3.2)

where, for any \((\mathbb {F},{\mathbb {P}})\)-continuous local martingale \(M=(M_t)_{0\le t\le T}\), \(\mathcal {E}(M)\) denotes the Doléans exponential, i.e. \(\mathcal {E}_t(M):=\exp (M_t-\frac{1}{2}\langle M\rangle _t)\), \(0\le t\le T\). Thanks to assumptions (A2) and (A5), \(P^Q\) is a probability measure on \((\Omega ,\mathcal {F})\); a proof of this fact follows the same lines as that of Proposition A.1 in [12]. Hence, in view of Girsanov's theorem, the process \((W^Q_t,\,\, 0\le t\le T)\) defined by

$$\begin{aligned} W_t^Q:=W_t-\int _0^t \sigma ^{-1}(s,x_.)b(s,x_.,Q\circ x_s^{-1})ds, \quad 0\le t\le T, \end{aligned}$$

is an \((\mathbb {F}, P^Q)\)-Brownian motion. Furthermore, under \(P^{Q}\),

$$\begin{aligned} dx_t=b(t,x_.,Q\circ x_t^{-1})dt+\sigma (t,x_.)dW^Q_t,\quad x_0=\text {x}\in \mathbb {R}^d. \end{aligned}$$
(3.3)

Now, in view of (A2) and (A5), the Hölder and Burkholder–Davis–Gundy inequalities yield, for every \(p\ge 2\),

$$\begin{aligned} \Vert P^Q\Vert _p^p=E_{P^Q}\left[ |x|_T^p\right] \le C_p\left( 1+E_{P^Q}\left[ \int _0^T |x|_t^p dt\right] \right) . \end{aligned}$$

where the constant \(C_p\) depends only on \(p, T,\text {x}\) and the linear growth constants of b and \(\sigma \). By Gronwall’s inequality, we obtain

$$\begin{aligned} E_{P^Q}[|x|^p_T]\le C_p<+\infty . \end{aligned}$$
(3.4)

Next, we will show that there is \({\bar{Q}}\) such that \(P^{{\bar{Q}}}={{\bar{Q}}}\), i.e., \({\bar{Q}}\) is a fixed point. Moreover, \({\bar{Q}}\) has a finite moment of any order \(p\ge 2\).

Theorem 3.1

The map

$$\begin{aligned}\begin{array}{lll} \Phi : {\mathcal {P}}(\Omega )\longrightarrow {\mathcal {P}}(\Omega ) \\ \qquad \quad Q \mapsto \Phi (Q):=P^Q;\quad dP^Q:=L_T^Q d{\mathbb {P}} \end{array} \end{aligned}$$

admits a unique fixed point.

Moreover, for every \(p\ge 2\), the fixed point, denoted \({\bar{Q}}\), belongs to \({\mathcal {P}}_p(\Omega )\), i.e.

$$\begin{aligned} E_{{\bar{Q}}}[|x|^p_T]\le C_p<+\infty , \end{aligned}$$
(3.5)

where the constant \(C_p\) depends only on \(p, T,\text {x}\) and the linear growth constants of b and \(\sigma \).

Proof

We show the contraction property of the map \( \Phi \) in the complete metric space \({\mathcal {P}}(\Omega )\), endowed with the total variation distance \(D_T\). To this end, given \(Q,{\widehat{Q}}\in {\mathcal {P}}(\Omega )\), we use an estimate of the total variation distance \(D_T(\Phi (Q),\Phi ({\widehat{Q}}))\) in terms of a version of the Hellinger process associated with the coordinate process x under the probability measures \(\Phi (Q)\) and \(\Phi ({\widehat{Q}})\), respectively. Indeed, since by (3.3),

$$\begin{aligned}\left\{ \begin{array}{lll} \text {under}\,\, \Phi (Q),\;\; dx_t=b(t,x_.,Q_t)dt+\sigma (t,x_.)dW^Q_t,\quad x_0=\text {x}\in \mathbb {R}^d,\\ \\ \text {under}\,\, \Phi ({\widehat{Q}}),\;\; dx_t=b(t,x_.,{\widehat{Q}}_t)dt+\sigma (t,x_.)dW^{{\widehat{Q}}}_t,\quad x_0=\text {x}\in \mathbb {R}^d,\\ \end{array} \right. \end{aligned}$$

in view of Theorem IV.1.33 in [19], a version of the associated Hellinger process is

$$\begin{aligned} \Gamma _T:=\frac{1}{8}\int _0^T \Delta b_t(Q,{\widehat{Q}})^{\dagger }a^{-1}_t\Delta b_t(Q,{\widehat{Q}})dt, \end{aligned}$$
(3.6)

where

$$\begin{aligned} \Delta b_t(Q,{\widehat{Q}}):=b(t,x_.,Q_t)-b(t,x_.,{\widehat{Q}}_t) \end{aligned}$$

where \(a_t:=(\sigma \sigma ^{\dagger })(t,x_.)\), and \(M^{\dagger }\) denotes the transpose of the matrix M. We may use the estimate (4.22) of Theorem V.4.21 in [19] to obtain

$$\begin{aligned} D_T(\Phi (Q),\Phi ({\widehat{Q}}))\le 4\sqrt{E_{\Phi (Q)}\left[ \Gamma _T\right] }. \end{aligned}$$
(3.7)

By (A2), (A4) and (3.4), we have

$$\begin{aligned} E_{\Phi (Q)}\left[ \Delta b_t(Q,{\widehat{Q}})^{\dagger }a^{-1}_t\Delta b_t(Q,{\widehat{Q}})\right] \le C d^2(Q_t,{\widehat{Q}}_t)\le CD^2_t(Q,{\widehat{Q}}), \end{aligned}$$

which together with (3.7) yields

$$\begin{aligned} D^2_T(\Phi (Q),\Phi ({\widehat{Q}}))\le C\int _0^TD^2_t(Q,{\widehat{Q}})dt. \end{aligned}$$
(3.8)

Iterating this inequality, we obtain, for every \(N>0\),

$$\begin{aligned} D^2_T(\Phi ^N(Q),\Phi ^N({\widehat{Q}}))\le C^N\int _0^T\frac{(T-t)^{N-1}}{(N-1)!}D^2_t(Q,{\widehat{Q}})dt\le \frac{C^NT^N}{N!}D^2_T(Q,{\widehat{Q}}), \end{aligned}$$

where \(\Phi ^N\) denotes the N-fold composition of the map \(\Phi \). Hence, for N large enough, \(\Phi ^N\) is a contraction which entails that \(\Phi \) admits a unique fixed point.

Let \({\bar{Q}}\) be such a fixed point for the map \(\Phi \). Thus, under \({\bar{Q}}\),

$$\begin{aligned} dx_t=b(t,x_.,{\bar{Q}}_t)dt+\sigma (t,x_.)dW^{{\bar{Q}}}_t,\quad x_0=\text {x}\in \mathbb {R}^d, \end{aligned}$$

where \({\bar{Q}}_t:={\bar{Q}}\circ x_t^{-1}\). In view of assumptions (A2) and (A5), the Hölder and Burkholder–Davis–Gundy inequalities yield

$$\begin{aligned} \Vert {\bar{Q}}\Vert _p^p=E_{{\bar{Q}}}\left[ |x|_T^p\right] \le C_p\left( 1+E_{{\bar{Q}}}\left[ \int _0^T |x|_t^p dt\right] \right) . \end{aligned}$$

By Gronwall’s inequality, we obtain (3.5), i.e.

$$\begin{aligned} E_{{\bar{Q}}}[|x|^p_T]\le C_p<+\infty . \end{aligned}$$

\(\square \)

Remark 3.2

The dependence of the drift b on the law of \(x_t\) under Q, i.e. on \(Q \circ x_t^{-1}\), can be relaxed substantially: we can replace the latter by \(Q\circ \phi (t,x)^{-1}\), where \(\phi (t,x)\) is an adapted process. For example, one can choose \(\phi (t,x)=\sup _{0\le s\le t}x_s\). The main point is that inequality (2.7) holds for a general adapted process \(\phi (t,x)\). \(\square \)
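Although the proof of Theorem 3.1 is purely analytic, the fixed point lends itself to a Picard-type approximation. The following minimal particle sketch (with toy assumptions: \(d=1\), \(\sigma =1\), \(\text {x}=0\), a drift depending on the law only through its mean, and self-normalized importance weights approximating the Girsanov density (3.2)) iterates \(Q\mapsto \Phi (Q)\) on the flow of marginal means:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, T = 50, 20_000, 1.0
dt = T / N
dW = np.sqrt(dt) * rng.normal(size=(N, M))
x = np.concatenate([np.zeros((1, M)), np.cumsum(dW, axis=0)])  # (2.1), sigma = 1

def b(t, x_t, m):
    # toy mean-field drift, Lipschitz in the mean m (an assumption)
    return 0.5 * (m - x_t)

m = np.zeros(N)                      # initial guess for the flow t -> E_Q[x_t]
for _ in range(20):                  # Picard iteration of Q -> Phi(Q)
    logL = np.zeros(M)               # log of the Girsanov density (3.2)
    m_new = np.zeros(N)
    for n in range(N):
        w = np.exp(logL); w /= w.sum()   # weights proportional to L^Q at t_n
        m_new[n] = (w * x[n]).sum()      # marginal mean under Phi(Q)
        drift = b(n * dt, x[n], m[n])
        logL += drift * dW[n] - 0.5 * drift**2 * dt
    m = m_new
print(m[:5])
```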

4 Optimal Control of the Diffusion Process of Mean-Field Type

Let \((U, \delta )\) be a compact metric space with its Borel \(\sigma \)-field \(\mathcal {B}(U)\) and \(\mathcal {U}\) the set of \(\mathbb {F}\)-progressively measurable processes \(u=(u_t)_{t\le T}\) with values in U. We call \(\mathcal {U}\) the set of admissible controls.

Next, let f and h be two measurable functions from \([0,T]\times \Omega \times {\mathcal {P}}(\mathbb {R}^d)\times U\) into \(\mathbb {R}^d\) and \(\mathbb {R}\), respectively, and let g be a measurable function from \(\mathbb {R}^d\times {\mathcal {P}}(\mathbb {R}^d)\) into \(\mathbb {R}\) such that

(B1) For any \(u\in \mathcal {U}\) and \(Q\in {\mathcal {P}}(\Omega )\), the processes \((f(t, x_.,Q\circ x_t^{-1},u_t))_t\) and \((h(t, x_.,Q\circ x_t^{-1},u_t))_t\) are progressively measurable. Moreover, \(g(x_T,Q\circ x_T^{-1})\) is \(\mathcal {F}_T\)-measurable.

(B2) For every \(t\in [0,T]\), \(w\in \Omega \), \(u,v\in U\) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),

$$\begin{aligned} |\phi (t,w,\mu , u)-\phi (t,w,\nu ,v)|\le C(d(\mu ,\nu )+\delta (u,v)) \end{aligned}$$

for \(\phi \in \{f,h\}\).

For every \(w\in \Omega \) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),

    $$\begin{aligned} |g(w,\mu )-g(w,\nu )|\le C d(\mu ,\nu ). \end{aligned}$$
(B3) For every \(t\in [0,T]\), \(w\in \Omega \), \(\mu \in {\mathcal {P}}(\mathbb {R}^d)\) and \(u\in U\),

    $$\begin{aligned} |f(t,w,\mu ,u)|\le C(1+|w|_t). \end{aligned}$$
(B4) h and g are uniformly bounded.

For \(u\in \mathcal {U}\), let \(P^u\) be the probability measure on \((\Omega ,\mathcal {F})\) which is the fixed point of the map \(\Phi ^u\) defined in the same way as in Theorem 3.1, except that the drift term \(b(\cdot )\) now also depends on u; this does not raise any major issue. Thus we have

$$\begin{aligned} dP^u:=L_T^u d{\mathbb {P}}, \end{aligned}$$
(4.1)

where

$$\begin{aligned} L_t^u:=\mathcal {E}_t\left( \int _0^{\cdot } \sigma ^{-1}(s,x_.)f(s,x_.,P^u\circ x_s^{-1},u_s)dW_s\right) ,\quad 0\le t\le T. \end{aligned}$$
(4.2)

By Girsanov’s theorem, the process \((W^u_t,\,\, 0\le t\le T)\) defined by

$$\begin{aligned} W_t^u:=W_t-\int _0^t \sigma ^{-1}(s,x_.)f(s,x_.,P^u\circ x_s^{-1},u_s)ds, \quad 0\le t\le T, \end{aligned}$$

is an \((\mathbb {F}, P^u)\)-Brownian motion. Moreover, under \(P^u\),

$$\begin{aligned} dx_t=f(t,x_.,P^u\circ x_t^{-1},u_t)dt+\sigma (t,x_.)dW^u_t,\quad x_0=\text {x}\in \mathbb {R}^d. \end{aligned}$$
(4.3)

Let \(E^u\) denote the expectation w.r.t. \(P^u\). In view of (3.5), we have, for every \(u\in \mathcal {U}\),

$$\begin{aligned} \forall p \ge 2,\quad \Vert P^u\Vert _p^p=E^u[|x|_T^p]\le C < \infty , \end{aligned}$$
(4.4)

where the constant C depends only on \(p, T,\text {x}\) and the linear growth constants of f and \(\sigma \).

We also have the following estimate of the total variation distance between \(P^u\) and \(P^v\).

Lemma 4.1

For every \(u,v\in \mathcal {U}\), it holds that

$$\begin{aligned} D_T^2(P^u,P^v)\le C E^u\left[ \int _0^T\delta ^2(u_t,v_t)dt\right] . \end{aligned}$$
(4.5)

In particular, the function \(u\mapsto P^u\) from U into \({\mathcal {P}}(\Omega )\) is Lipschitz continuous: for every \(u,v\in U\),

$$\begin{aligned} D_T(P^u,P^v)\le C\delta (u,v). \end{aligned}$$
(4.6)

Proof

Using an estimate similar to (3.7), we have

$$\begin{aligned} D_T(P^u,P^v)\le 4\sqrt{E^u\left[ {\tilde{\Gamma }}^{u,v}_T\right] }, \end{aligned}$$
(4.7)

where \({\tilde{\Gamma }}^{u,v}\) is the following version of the Hellinger process associated with \(P^u\) and \(P^v\):

$$\begin{aligned} {\tilde{\Gamma }}^{u,v}_T:=\frac{1}{8}\int _0^T \Delta f_t(u,v)^{\dagger }a^{-1}_t\Delta f_t(u,v)dt, \end{aligned}$$

where

$$\begin{aligned} \Delta f_t(u,v):=f(t,x_.,P^u\circ x_t^{-1},u_t)-f(t,x_.,P^v\circ x_t^{-1},v_t). \end{aligned}$$

Using (A2) and (B2), we obtain

$$\begin{aligned} \Delta f_t(u,v)^{\dagger }a^{-1}_t\Delta f_t(u,v)&\le C(1+|x|_t^{2\alpha })\big (d^2(P^u\circ x_t^{-1},P^v\circ x_t^{-1})+\delta ^2(u_t,v_t)\big )\\ &\le C(1+|x|_t^{2\alpha })\big (D^2_t(P^u,P^v)+\delta ^2(u_t,v_t)\big ). \end{aligned}$$

Hence, in view of (4.7) and the moment estimate (4.4), Gronwall's inequality yields

$$\begin{aligned} D^2_T(P^u,P^v)\le CE^u\left[ \int _0^T \delta ^2(u_t,v_t)dt\right] . \end{aligned}$$

Inequality (4.6) follows from (4.5) by letting \(u_t\equiv u\in U\) and \(v_t\equiv v\in U\). \(\square \)

The cost functional \(J(u),\,\, u\in \mathcal {U}\), associated with the controlled SDE (4.3) is

$$\begin{aligned} J(u):=E^u\left[ \int _0^T h(t,x_.,P^u\circ x_t^{-1},u_t)dt+ g(x_T,P^u\circ x_T^{-1})\right] , \end{aligned}$$
(4.8)

where h and g satisfy (B1)–(B4) above.

Any \(u^*\in \mathcal {U}\) satisfying

$$\begin{aligned} J(u^*)=\min _{u\in \mathcal {U}}J(u) \end{aligned}$$
(4.9)

is called an optimal control. The corresponding optimal dynamics is given by the probability measure \(P^*\) on \((\Omega ,\mathcal {F})\) defined by

$$\begin{aligned} dP^*=\mathcal {E}\left( \int _0^{\cdot } \sigma ^{-1}(s,x_.)f(s,x_., P^*\circ x_s^{-1},u^*_s)dW_s\right) d{\mathbb {P}}, \end{aligned}$$
(4.10)

under which

$$\begin{aligned} dx_t=f(t,x_., P^*\circ x_t^{-1},u^*_t)dt+\sigma (t,x_.)dW^{u^*}_t,\quad x_0=\text {x}\in \mathbb {R}^d. \end{aligned}$$
(4.11)

We want to find such an optimal control and characterize the optimal cost functional \(J(u^*)\).

For \((t,w,\mu ,z,u)\in [0,T]\times \Omega \times {\mathcal {P}}(\mathbb {R}^d)\times \mathbb {R}^d\times U\), we introduce the Hamiltonian associated with the optimal control problem (4.3) and (4.8):

$$\begin{aligned} H(t,w,\mu ,z,u):=h(t,w,\mu ,u)+z\cdot \sigma ^{-1}(t,w)f(t,w,\mu ,u). \end{aligned}$$
(4.12)

The function H enjoys the following properties.

Lemma 4.2

Assume that (A1), (A2), (B1) and (B2) hold. Then, the function H satisfies

$$\begin{aligned} |H(t,w,\mu ,z,u)-H(t,w,\nu ,z,v)|\le C(1+|w|^{\alpha }_t)(1+|z|)(d(\mu ,\nu )+\delta (u,v)). \end{aligned}$$
(4.13)

Assume further that (B3) holds. Then H satisfies the (stochastic) Lipschitz condition

$$\begin{aligned} \begin{array}{lll} |H(t,w,\mu ,z,u)-H(t,w,\mu ,z^{\prime },u)| \le C(1+|w|^{1+\alpha }_t)|z-z^{\prime }|. \end{array} \end{aligned}$$
(4.14)

Proof

Inequality (4.13) is a consequence of (A2) and (B2). Assume further that (B3) is satisfied. Then (4.14) is also satisfied since f and \(\sigma ^{-1}\) are of polynomial growth in w. \(\square \)

Next, we show that the payoff functional \(J(u),\,u\in \mathcal {U}\), can be expressed in terms of solutions of linear BSDEs.

Proposition 4.3

Assume that (A1), (A2), (B1), (B2), (B3) and (B4) are satisfied. Then, for every \(u\in \mathcal {U}\), there exists a unique pair of \(\mathbb {F}\)-progressively measurable processes \((Y^u,Z^u)\in {\mathcal {S}}^2_T\times {\mathcal {H}}^2_T\) such that

$$\begin{aligned} \left\{ \begin{array}{ll} -dY^u_t&{}=H(t,x_.,P^u\circ x_t^{-1},Z^u_t,u_t) dt-Z^u_tdW_t,\quad 0\le t<T,\\ Y^u_T&{}=g(x_T,P^u\circ x_T^{-1}). \end{array} \right. \end{aligned}$$
(4.15)

Moreover, \(Y^u_0=J(u)\).

Proof

The mapping \(z\mapsto H(t,x_.,P^u\circ x_t^{-1},z,u_t)\) satisfies (4.14), and \(H(t,x_.,P^u\circ x_t^{-1},0,u_t)=h(t,x_.,P^u\circ x_t^{-1},u_t)\) and \(g(x_T,P^u\circ x_T^{-1})\) are bounded; hence, by Theorem I-3 in [17], the BSDE (4.15) has a unique solution. Note that the proof in [16, 17] is given for \(\alpha =0\); however, it generalizes without any difficulty to the case \(\alpha >0\). The essential point is that the moments of any order of \((x_t)_{t\le T}\) exist under \({\mathbb {P}}\).

It remains to show that \(Y^u_0=J(u)\). Indeed, in terms of the \((\mathbb {F}, P^u)\)-Brownian motion

$$\begin{aligned} W_t^u:=W_t-\int _0^t \sigma ^{-1}(s,x_.)f(s,x_.,P^u\circ x_s^{-1},u_s)ds, \quad 0\le t\le T, \end{aligned}$$

the process \((Y^u,Z^u)\) satisfies

$$\begin{aligned} Y^u_t=g(x_T,P^u\circ x_T^{-1})+\int _t^T h(s,x_.,P^u\circ x_s^{-1},u_s) ds-\int _t^T Z^u_sdW^u_s,\quad 0\le t\le T. \end{aligned}$$

Therefore,

$$\begin{aligned} Y^u_t=E^u\left[ \int _t^T h(s,x_.,P^u\circ x_s^{-1},u_s)ds+g(x_T,P^u\circ x_T^{-1})\big |\mathcal {F}_t\right] \quad P^u\text{-a.s. } \end{aligned}$$

In particular, since \(\mathcal {F}_0\) contains only the \({\mathbb {P}}\)-null sets of \(\Omega \) and \(P^u\) and \({\mathbb {P}}\) are equivalent, we get

$$\begin{aligned} Y^u_0=E^u\left[ \int _0^T h(s,x_.,P^u\circ x_s^{-1},u_s)ds+g(x_T,P^u\circ x_T^{-1})\right] =J(u). \end{aligned}$$

\(\square \)
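Numerically, the identity \(Y^u_0=J(u)\) can be checked with a regression-based backward Euler scheme for (4.15). The sketch below uses toy assumptions (\(d=1\), \(\sigma =1\), a frozen control with a toy driver \(H(t,x,z)=\sin (x)+z/2\) standing in for \(h+z\cdot \sigma ^{-1}f\), a bounded terminal payoff, and a crude cubic-polynomial regression for the conditional expectations); it is a sketch, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, T = 50, 20_000, 1.0
dt = T / N
dW = np.sqrt(dt) * rng.normal(size=(N, M))
x = np.concatenate([np.zeros((1, M)), np.cumsum(dW, axis=0)])  # sigma = 1

g = lambda x: np.cos(x)                    # toy bounded terminal payoff
H = lambda t, x, z: np.sin(x) + 0.5 * z    # toy driver h + z * (sigma^{-1} f)

def cond_exp(y, x):
    """Crude conditional expectation E[y | x_t] via cubic regression."""
    A = np.vander(x, 4)                    # basis (x^3, x^2, x, 1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

Y = g(x[-1])
for n in reversed(range(N)):               # backward Euler for (4.15)
    Z = cond_exp(Y * dW[n], x[n]) / dt     # Z_t ~ E[Y dW | F_t] / dt
    Y = cond_exp(Y, x[n]) + H(n * dt, x[n], Z) * dt
print("Y_0 ≈ J(u):", Y.mean())
```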

4.1 Existence of Optimal Controls

In the remaining part of this section we want to find \(u^*\in \mathcal {U}\) such that \(u^*=\arg \min _{u\in \mathcal {U}}J(u)\). A way to find such an optimal control is to proceed as in Proposition 4.3 and introduce a BSDE whose solution \(Y^*\) satisfies \(Y^*_0=\inf _{u\in \mathcal {U}}J(u)=Y^{u^*}_0\). By the comparison theorem for BSDEs, the problem can be reduced to minimizing the corresponding Hamiltonian and the terminal value g w.r.t. the control u. Since, in the Hamiltonian \(H(t,x_.,P^u\circ x_t^{-1},z,u_t)\), the marginal law \(P^u\circ x_t^{-1}\) of \(x_t\) under \(P^u\) depends on the whole path of u over [0, t] and not only on \(u_t\), we should minimize H w.r.t. the whole set \(\mathcal {U}\) of admissible stochastic controls. Therefore, we take the essential infimum of the Hamiltonian over \(\mathcal {U}\), instead of the minimum over U. For the associated BSDE to make sense, we must then show that this essential infimum exists and is progressively measurable. This is shown in the next proposition.

Let \({\mathbb {L}}\) denote the \(\sigma \)-algebra of progressively measurable sets on \([0,T]\times \Omega \). For \((t,x_.,z,u)\in [0,T]\times \Omega \times \mathbb {R}^d\times \mathcal {U}\), set

$$\begin{aligned} H(t,x_.,z,u):=H(t,x_.,P^u\circ x_t^{-1},z,u_t). \end{aligned}$$
(4.16)

Note that, since H is linear in z and, for every fixed z and u, \(H(\cdot ,\cdot ,z,u)\) is a progressively measurable process, H is \({\mathbb {L}}\otimes \mathcal {B}(\mathbb {R}^d)\)-measurable.

Next we have:

Proposition 4.4

For any \(z\in \mathbb {R}^d\), there exists an \({\mathbb {L}}\)-measurable process \(H^*(\cdot ,\cdot ,z)\) such that,

$$\begin{aligned} H^*(t,x_.,z)=\mathrm {ess}\inf _{u\in \mathcal {U}}H(t,x_.,z,u),\quad d{\mathbb {P}} \times dt \text{-a.e. } \end{aligned}$$
(4.17)

Moreover, \(H^*\) is stochastic Lipschitz continuous in z, i.e., for every \(z,z^{\prime } \in \mathbb {R}^d\),

$$\begin{aligned} |H^*(t,x_.,z)-H^*(t,x_.,z^{\prime })|\le C(1+|x|^{1+\alpha }_t)|z-z^{\prime }|. \end{aligned}$$
(4.18)

Proof

For \(n\ge 0\), let \(z_n\in {\mathbb {Q}}^d\), the set of points of \(\mathbb {R}^d\) with rational coordinates. Then, since \((t,\omega )\mapsto H(t,\omega ,z_n,u)\) is \({\mathbb {L}}\)-measurable, its essential infimum w.r.t. \(u\in \mathcal {U}\) is well defined, i.e. there exists an \({\mathbb {L}}\)-measurable r.v. \(H^n\) such that

$$\begin{aligned} H^n(t,x_., z_n)=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z_n,u). \end{aligned}$$
(4.19)

Moreover, there exists a countable subset \({\mathcal {J}}_n\) of \(\mathcal {U}\) such that

$$\begin{aligned} H^n(t,x_.,z_n)=\underset{u\in {\mathcal {J}}_n}{\inf \,} H(t,x_.,z_n,u), \quad d{\mathbb {P}} \times dt\text{-a.e. } \end{aligned}$$
(4.20)

Finally note that the process \((t,\omega )\mapsto \underset{u\in {\mathcal {J}}_n}{\inf \,} H(t,\omega ,z_n,u)\) is \({\mathbb {L}}\)-measurable.

Next, set \(N=\bigcup _{n\ge 0} N_n\), where

$$\begin{aligned} N_n:=\{(t,\omega ):\,\,H^n(t,\omega , z_n)\ne \underset{u\in {\mathcal {J}}_n}{\inf \,} H(t,\omega ,z_n,u)\}. \end{aligned}$$

Then obviously \((d{\mathbb {P}}\otimes dt)(N)=0\).

We now define \(H^*\) as follows: for \((t,\omega )\in N\), \(H^*\equiv 0\) and for \((t,\omega )\in N^c\) (the complement of N) we set:

$$\begin{aligned} H^*(t,x_.,z)=\left\{ \begin{array}{ll} \underset{u\in {\mathcal {J}}_n}{\inf \,} H(t,x_.,z_n,u) &{} \text {if }\,\, z=z_n\in {\mathbb {Q}}^d, \\ \underset{z_n\in {\mathbb {Q}}^d\rightarrow z}{\lim \,\,} \inf _{u\in {\mathcal {J}}_n} H(t,x_.,z_n,u) &{} \text {otherwise. } \end{array} \right. \end{aligned}$$
(4.21)

The last limit exists due to the fact that, for \(n\ne m\), we have

$$\begin{aligned} \begin{array}{ll} |\underset{u\in {\mathcal {J}}_n}{\inf \,} H(t,x_.,z_n,u)-\underset{u\in {\mathcal {J}}_m}{\inf \,} H(t,x_.,z_m,u)| &=|H^n(t,x_.,z_n)-H^m(t,x_.,z_m)|\\ &=|\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z_n,u)-\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z_m,u)|\\ &\le \underset{u\in \mathcal {U}}{\mathrm {ess}\sup \,}|\sigma ^{-1}(t,x_.)f(t,x_.,P^u\circ x_t^{-1},u_t)|\,|z_n-z_m|\\ &\le C(1+|x|_t^{\alpha +1})|z_n-z_m|. \end{array} \end{aligned}$$

Furthermore, the last inequality implies that the limit does not depend on the sequence \((z_n)_{n\ge 0}\) of \({\mathbb {Q}}^d\) which converges to z. Finally, note that \(H^*(t,x_.,z)\) is \({\mathbb {L}}\otimes \mathcal {B}(\mathbb {R}^d)\)-measurable and is Lipschitz continuous in z with the stochastic Lipschitz constant \(C(1+|x|_t^{\alpha +1})\).

It remains to show that, for every \(z\in \mathbb {R}^d\),

$$\begin{aligned} H^*(t,x_.,z)=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z,u),\quad d{\mathbb {P}} \times dt\text{-a.e. } \end{aligned}$$
(4.22)

If \(z\in {\mathbb {Q}}^d\), the equality follows from the definitions (4.19) and (4.21). Assume \(z\notin {\mathbb {Q}}^d\) and let \(z_n\in {\mathbb {Q}}^d\) be such that \(z_n\rightarrow z\). Then

$$\begin{aligned} H^*(t,x_.,z_n)=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z_n,u),\quad d{\mathbb {P}} \times dt\text{-a.e. } \end{aligned}$$
(4.23)

But, \(H^*(t,x_.,z_n)=\underset{u\in {\mathcal {J}}_n}{\inf \,} H(t,x_.,z_n,u)\rightarrow _n H^*(t,x_.,z)\) and \(\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z_n,u)\rightarrow _n\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,z,u)\) which finishes the proof. \(\square \)

Consider further the \(\mathcal {F}_T\)-measurable random variable

$$\begin{aligned} g^*(x_.):=\mathrm {ess}\inf _{u\in \mathcal {U}} g(x_T,P^u\circ x_T^{-1}) \end{aligned}$$
(4.24)

and let \((Y^*,Z^*)\in {\mathcal {S}}^2_T\times {\mathcal {H}}^2_T\) be the solution of the following BSDE

$$\begin{aligned} Y^*_t=g^*(x_.)+\int _t^T H^*(s,x_.,Z_s^*)ds-\int _t^TZ^*_sdW_s,\,\, t\le T. \end{aligned}$$
(4.25)

The existence of the pair \((Y^*,Z^*)\) follows from the boundedness of \(g^*\) and h, the measurability of \(H^*\) and (4.18) (see [17] for more details).

The next proposition displays a comparison result between the solutions \(Y^*\) and \(Y^u,\,u\in \mathcal {U}\) of the BSDEs (4.25) and (4.15), respectively.

Proposition 4.5

(Comparison) For every \(t\in [0,T]\), we have

$$\begin{aligned} Y^*_t\le Y^u_t,\quad {\mathbb {P}}\text {-a.s.},\quad u\in \mathcal {U}. \end{aligned}$$
(4.26)

Proof

For any \(t\le T\), we have:

$$\begin{aligned} \begin{array}{lll} Y^*_t-Y^u_t=g^*(x_.)-g(x_T,P^u\circ x_T^{-1})-\int _t^T (Z^*_s-Z^u_s)dW_s\\ \qquad \qquad \qquad +\int _t^T \{H^*(s,x_.,Z_s^*)-H(s,x_.,Z^*_s,u)\} ds\\ \qquad \qquad \qquad +\int _t^T \{H(s,x_.,Z^*_s,u) -H(s,x_.,Z^u_s,u)\} ds. \end{array} \end{aligned}$$

Since \(g^*(x_.)-g(x_T,P^u\circ x_T^{-1})\le 0\) and \(H^*(s,x_.,Z_s^*)-H(s,x_.,Z^*_s,u)\le 0\), performing a change of probability measure and taking the conditional expectation w.r.t. \(\mathcal {F}_t\), we obtain \(Y^*_t\le Y^u_t,\,\, {\mathbb {P}}\text {-a.s.},\,\, \forall u\in \mathcal {U}\). \(\square \)

Proposition 4.6

(\(\varepsilon \)-optimality) Assume that for any \(\varepsilon >0\) there exists \(u^{\varepsilon }\in \mathcal {U}\) such that, \({\mathbb {P}}\)-a.s.,

$$\begin{aligned} \left\{ \begin{array}{ll} H^*(t,x_.,Z^*_t)\ge H(t,x_.,Z^*_t,u^{\varepsilon })-\varepsilon , \quad 0\le t<T, \\ g^*(x_.)\ge g(x_T,P^{u^{\varepsilon }}\circ x_T^{-1})-\varepsilon . \end{array} \right. \end{aligned}$$
(4.27)

Then,

$$\begin{aligned} Y^*_t=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} Y^u_t,\quad 0\le t\le T. \end{aligned}$$
(4.28)

Proof

Let \((Y^{\varepsilon },Z^{\varepsilon })\in {\mathcal {S}}^2_T\times {\mathcal {H}}^2_T\) be the solution of the following BSDE

$$\begin{aligned} Y^{\varepsilon }_t=g(x_T,P^{u^{\varepsilon }}\circ x_T^{-1})+\int _t^T H(s,x_.,Z^{\varepsilon }_s,u^{\varepsilon })ds-\int _t^T Z_s^{\varepsilon }dW_s. \end{aligned}$$

Once more, the existence of \((Y^{\varepsilon },Z^{\varepsilon })\) follows from Theorem I-3 in [17]. We then have

$$\begin{aligned} Y^*_t-Y^{\varepsilon }_t&=g^*(x_.)-g(x_T,P^{u^{\varepsilon }}\circ x_T^{-1})-\int _t^T (Z^*_s-Z^{\varepsilon }_s)dW_s\\ &\quad +\int _t^T \{H^*(s,x_.,Z^*_s)-H(s,x_.,Z^*_s,u^{\varepsilon })\} ds\\ &\quad +\int _t^T \{H(s,x_.,Z^*_s,u^{\varepsilon })-H(s,x_.,Z^{\varepsilon }_s,u^{\varepsilon })\} ds. \end{aligned}$$

Since \(g^*(x_.)-g(x_T,P^{u^{\varepsilon }}\circ x_T^{-1})\ge -\varepsilon \) and \(H^*(s,x_.,Z^*_s)-H(s,x_.,Z^*_s,u^{\varepsilon })\ge -\varepsilon \), then, once more, performing a change of probability measure and taking the conditional expectation w.r.t. \(\mathcal {F}_t\), we obtain \(Y^*_t\ge Y^{u^{\varepsilon }}_t -\varepsilon (T+1)\). In view of (4.26), this entails that, for every \(0\le t\le T\), \(Y^*_t=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}Y^u_t\). \(\square \)

In the next theorem, we characterize the set of optimal controls associated with (4.9) under the dynamics (4.3).

Theorem 4.7

(Existence of optimal control) If there exists \(u^*\in \mathcal {U}\) such that

$$\begin{aligned} \left\{ \begin{array}{ll} H^*(t,x_.,Z^*_t)=H(t,x_.,P^{u^*}\circ x^{-1}_t,Z^*_t,u^*_t),\quad d{\mathbb {P}}\times dt\text{-a.e. },\quad 0\le t< T, \\ g^*(x_.)=g(x_T,P^{u^*}\circ x^{-1}_T), \quad d{\mathbb {P}}\text{-a.s. } \end{array} \right. \end{aligned}$$
(4.29)

Then,

$$\begin{aligned} Y^*_t=Y^{u^*}_t=\mathrm {ess}\inf _{u\in \mathcal {U}} Y^u_t,\quad 0\le t\le T. \end{aligned}$$
(4.30)

In particular, \(Y_0^*=\inf _{u\in \mathcal {U}}J(u)=J(u^*)\).

Proof

Under (4.29), for any \(t\le T\) we have

$$\begin{aligned} Y^*_t-Y^{u^*}_t&=\int _t^T (Z^*_s-Z^{u^*}_s)dW_s\\ &\quad +\int _t^T \{H(s,x_.,P^{u^*}\circ x^{-1}_s,Z^*_s,u^{*}_s)-H(s,x_.,P^{u^*}\circ x^{-1}_s,Z^{u^*}_s,u^{*}_s)\} ds\\ &=\int _t^T (Z^*_s-Z^{u^*}_s)dW_s+\int _t^T (Z^*_s-Z^{u^*}_s)\cdot \sigma ^{-1}(s,x_.)f(s,x_.,P^{u^*}\circ x^{-1}_s,u^{*}_s)ds. \end{aligned}$$

Performing now a change of probability measure and taking expectations leads to \({\tilde{E}}[Y^*_t-Y^{u^*}_t]=0\) for all \(t\le T\), where \({\tilde{E}}\) denotes the expectation under the new probability measure \({\tilde{P}}\), which is equivalent to \({\mathbb {P}}\). As \(Y^*_t\le Y^{u^*}_t\) \({\mathbb {P}}\)-a.s., and then \({\tilde{P}}\)-a.s., by (4.26), we obtain \(Y^*=Y^{u^*}\), which means, once more by (4.26), that \(u^*\) is an optimal control. \(\square \)

Remark 4.8

As is the case for most optimality criteria, checking the sufficient condition (4.29) is quite hard, simply because there are no general conditions which guarantee the existence of the essential minima involved. One should rather solve the problem in particular cases. In the special case where the marginal law \(P^u\circ x^{-1}_t\) only depends on \((u_t, x|_{[0,t]})\) at each time \(t\in [0,T]\), we may minimize H and g over the action set U, instead of using the essential infimum, and use Beneš' selection theorem [2] to find two measurable functions \(u_1^*\) from \([0,T)\times \Omega \times \mathbb {R}^d\) into U and \(u_2^*\) from \(\mathbb {R}^d\) into U such that

$$\begin{aligned} H^*(t,x_.,z):=\inf _{u\in U}H(t,x_.,P^u\circ x_t^{-1},z,u)= H(t,x,P^{u_1^*}\circ x_t^{-1},z,u_1^*(t,x,z)) \end{aligned}$$
(4.31)

and

$$\begin{aligned} g^*(x_.):=\inf _{u\in U} g(x_T,P^u\circ x_T^{-1})=g(x_T,P^{u_2^*}\circ x_T^{-1}). \end{aligned}$$
(4.32)

Combining (4.31) and (4.32) (see also the numerical sketch following this remark), it is easily seen that the progressively measurable function \({\widehat{u}}\) defined by

$$\begin{aligned} {\widehat{u}}(t,x_.,z):=\left\{ \begin{array}{ll} u_1^*(t,x_.,z), \quad t<T,\\ u_2^*(x_T),\quad t=T, \end{array} \right. \end{aligned}$$
(4.33)

satisfies

$$\begin{aligned} H^*(t,x_.,z)=H(t,x_.,P^{{\widehat{u}}}\circ x_t^{-1},z,{\widehat{u}}) \quad \text {and} \quad g^*(x_.)=g(x_T,P^{{\widehat{u}}}\circ x_T^{-1}). \end{aligned}$$
(4.34)

\(\square \)
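As an illustration of the pointwise minimization (4.31), the minimizer \(u_1^*(t,x,z)\) can be approximated by a grid search over a discretization of the action set U. The sketch below uses toy choices (one dimension, \(U=[-1,1]\), \(\sigma =1\), and hypothetical h and f); Beneš' theorem is what guarantees a measurable selection, and the grid search merely mimics it numerically.

```python
import numpy as np

U_grid = np.linspace(-1.0, 1.0, 201)   # discretized action set U = [-1, 1]

def h(t, x, u):
    # toy running cost (an assumption)
    return (x - u) ** 2

def f(t, x, u):
    # toy drift, with sigma = 1 (an assumption)
    return u

def u1_star(t, x, z):
    """Grid-search approximation of argmin_{u in U} h(t,x,u) + z * f(t,x,u)."""
    values = h(t, x, U_grid) + z * f(t, x, U_grid)
    return U_grid[np.argmin(values)]

print(u1_star(0.0, 0.3, -0.5))
```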

We now deal with the case where the terminal payoff g does not depend on the mean-field term. To begin with, let us show the following result:

Proposition 4.9

Let \(\theta \) be an \({\mathbb {L}}\)-measurable process with values in \(\mathbb {R}^d\). We then have:

$$\begin{aligned} H^*(t,x_.,\theta _t)=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}H(t,x_.,\theta _t,u), \,\,d{\mathbb {P}} \times dt \text{-a.e. } \end{aligned}$$

Proof

First note that for any \(z\in \mathbb {R}^d\) and \(u\in \mathcal {U}\), \( H^*(t,x_.,z)\le H(t,x_.,z,u), \,\,d{\mathbb {P}} \times dt \text{-a.e. }\), and hence, for any \(u\in \mathcal {U}\),

$$\begin{aligned} H^*(t,x_.,\theta _t)\le H(t,x_.,\theta _t,u), \,\,d{\mathbb {P}} \times dt \text{-a.e. } \end{aligned}$$

Next, let \(\Phi \) be an \({\mathbb {L}}\)-measurable process such that \(\Phi (t,\omega )\le H(t,x_.,\theta _t,u)\) for any \(u\in \mathcal {U}\). Assume first that \(\theta \) is uniformly bounded. Then there exists a sequence of \({\mathbb {L}}\)-measurable processes \((\theta ^n)_{n\ge 0}\) such that, for any \(n\ge 0\), \(\theta ^n\) takes its values in \({\mathbb {Q}}^d\), is piecewise constant and verifies \(\Vert \theta ^n-\theta \Vert _\infty :=\sup _{(t,\omega )}|\theta ^n_t(\omega )-\theta _t(\omega ) |\rightarrow 0\) as \(n\rightarrow \infty \) (e.g., if \(d=1\), one can take \(\theta ^n=\sum _{i=1}^{n2^n}\frac{i-1}{2^n}1_{\{\frac{i-1}{2^n}\le \theta <\frac{i}{2^n}\}}+n 1_{\{\theta \ge n\}}\), and the generalization to the case \(d\ge 2\) is straightforward). On the other hand, by the definition of H in (4.16), we have

$$\begin{aligned} |H(t,x_.,\theta _t,u)-H(t,x_.,\theta ^n_t,u)|\le C(1+|x|_t^{1+\alpha })\Vert \theta ^n- \theta \Vert _\infty . \end{aligned}$$

Now let \(\epsilon >0\) and choose \(n_0\) such that, for any \(n\ge n_0\), \(\Vert \theta ^n-\theta \Vert _\infty \le \epsilon \). Then, for \(n\ge n_0\) and \(u\in \mathcal {U}\), we have

$$ \Phi (t,\omega )\le H(t,x_.,\theta _t^n,u)+\epsilon C(1+|x|_t^{1+\alpha })$$

which implies that

$$\begin{aligned} 1_{B^k_n}\Phi (t,\omega )\le 1_{B^k_n}\{H(t,x_.,z^k_n,u)+ \epsilon C(1+|x|_t^{1+\alpha })\} \end{aligned}$$

where \(B^k_n\) is a subset of \([0,T]\times \Omega \) on which \(\theta ^n\) is constant and equal to \(z^k_n\in {\mathbb {Q}}^d\). Therefore

$$\begin{aligned} \begin{array}{ll} 1_{B^k_n}\Phi (t,\omega )&\le 1_{B^k_n}\{\inf _{u\in {\mathcal {J}}_n^k}H(t,x_.,z^k_n,u)+ \epsilon C(1+|x|_t^{1+\alpha })\} \\ &= 1_{B^k_n}\{H^*(t,x_.,z^k_n)+ \epsilon C(1+|x|_t^{1+\alpha })\} \\ &= 1_{B^k_n}\{H^*(t,x_.,\theta ^n_t)+ \epsilon C(1+|x|_t^{1+\alpha })\}, \end{array} \end{aligned}$$

where \({\mathcal {J}}_n^k\) is the countable subset of \(\mathcal {U}\) defined in (4.20) and associated with \(z^k_n\). Summing now over k yields \(\Phi (t,\omega )\le H^*(t,x_.,\theta ^n_t)+ \epsilon C(1+|x|_t^{1+\alpha })\). Since \(H^*\) is stochastic Lipschitz w.r.t. z (see (4.18)), we have, for \(n\ge n_0\),

$$\begin{aligned} |H^*(t,x_.,\theta _t)-H^*(t,x_.,\theta ^n_t)|\le \epsilon C(1+|x|_t^{1+\alpha }), \end{aligned}$$

and therefore

$$\begin{aligned} \begin{array}{l}\Phi (t,\omega )\le H^*(t,x_.,\theta _t)+ 2\epsilon C(1+|x|_t^{1+\alpha }). \end{array}\end{aligned}$$
(4.35)

Sending now \(\epsilon \) to 0 in (4.35), we obtain \(\Phi (t,\omega )\le H^*(t,x_.,\theta _t)\), which means

$$\begin{aligned} H^*(t,x_.,\theta _t)=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,} H(t,x_.,\theta _t,u), \,\,d{\mathbb {P}} \times dt \text{-a.e. } \end{aligned}$$

If \(\theta \) is not bounded, one can find a sequence of bounded \({\mathbb {L}}\)-processes \(({\bar{\theta }}_n)_{n\ge 0}\) such that \({\bar{\theta }}_n\rightarrow _n \theta \), \(\,\,d{\mathbb {P}} \times dt \text{-a.e. }\)

Therefore we have

$$\begin{aligned} H^*(t,x_.,{\bar{\theta }}_n(t))=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}H(t,x_.,{\bar{\theta }}_n(t),u), \,\,d{\mathbb {P}} \times dt \text{-a.e. } \end{aligned}$$
(4.36)

But the stochastic Lipschitz property of \(H^*\) and the linearity of H w.r.t. z imply

\(H^*(t,x_.,{\bar{\theta }}_n(t))\rightarrow _n H^*(t,x_.,\theta (t))\) and \(\underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}H(t,x_.,{\bar{\theta }}_n(t),u) \rightarrow _n \underset{u\in \mathcal {U}}{\mathrm {ess}\inf \,}H(t,x_.,\theta (t),u)\). We then obtain the desired result by taking the limit in (4.36). \(\square \)

Proposition 4.10

If g does not depend on the mean-field term then

$$\begin{aligned} Y^*_0=\inf _{u\in \mathcal {U}}J(u). \end{aligned}$$

Proof

Recall that \(Z^*\) is defined in (4.25). Then, by the previous result and the properties of the essential infimum (see [11, p. 229]), there exists a countable subset \({\bar{\mathcal {U}}}\) of \(\mathcal {U}\) such that

$$\begin{aligned} H^*(t,x_.,Z^*_t)= \inf _{u\in {\bar{\mathcal {U}}}}H(t,x_.,Z^*_t,u). \end{aligned}$$

Therefore, for any \(\epsilon >0\), there exists \(u^\epsilon \in {\bar{\mathcal {U}}}\) such that \( H^*(t,x_.,Z^*_t)\ge H(t,x_.,Z^*_t,u^\epsilon )-\epsilon . \) Thus (4.27) is satisfied, since g does not depend on the mean-field term. Finally, the result follows from Proposition 4.6. \(\square \)

4.2 Existence of Nearly-Optimal Controls

As noted above, the sufficient condition (4.29) is quite hard to verify in concrete situations, which makes Theorem 4.7 less useful for showing existence of optimal controls. Nevertheless, near-optimal controls enjoy many useful and desirable properties that optimal controls do not have. In fact, thanks to Ekeland's variational principle [14], which we use below, under very mild conditions on the control set \(\mathcal {U}\) and the payoff functional J, near-optimal controls always exist, while optimal controls may not exist or may be difficult to establish. Moreover, there are many candidates for near-optimal controls, which makes it possible to select among them appropriate ones that are easier to implement and handle, both analytically and numerically.

For later use we introduce the Ekeland metric \(d_E\) on the space \(\mathcal {U}\) of admissible controls defined as follows. For \(u, v\in \mathcal {U}\),

$$\begin{aligned} d_E(u,v):={\widehat{P}}\{(\omega ,t)\in \Omega \times [0,T],\,\, \delta (u_t(\omega ),v_t(\omega ))>0\}, \end{aligned}$$
(4.37)

where \(d{\widehat{P}}=d{\mathbb {P}} \times dt \) is the product measure of \({\mathbb {P}}\) and the Lebesgue measure on [0, T].
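On a discrete time grid with finitely many sample paths, \(d_E\) is simply the empirical \(({\mathbb {P}}\times dt)\)-measure of the set where the two controls differ, as in the following toy sketch (random \(\{0,1\}\)-valued controls, perturbed on an initial time interval):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, T = 50, 1_000, 1.0             # time steps, sample paths, horizon

u = rng.integers(0, 2, size=(N, M))  # toy {0,1}-valued admissible controls
v = u.copy()
v[:10] = 1 - v[:10]                  # perturb v on the first 10 time steps

# empirical (P x dt)-measure of {(t, omega) : u_t(omega) != v_t(omega)}
d_E = (u != v).mean() * T
print(d_E)                           # 0.2: u and v differ on [0, 0.2) x Omega
```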

On the other hand, let us consider the following assumption on \(\sigma \), which will replace (A2)-(b),(c).

Assumption (A6) \(\sigma (t,x_.)\) and \(\sigma ^{-1}(t,x_.)\) are bounded.

Lemma 4.11

(i) \(d_E\) is a distance. Moreover, \((\mathcal {U},d_E)\) is a complete metric space.

(ii) Let \((u^n)_n\) and u be in \(\mathcal {U}\). If \(d_E(u^n,u)\rightarrow 0\), then \(\mathbb {E}[\int _0^T\delta ^2(u^n_t,u_t)dt]\rightarrow 0\).

Proof

(i) See [10]. The proof of completeness of \((\mathcal {U},d_E)\) needs only the completeness of the metric space \((U,\delta )\).

(ii) Let \((u^n)_n\) and u be in \(\mathcal {U}\). Then, by the definition of the distance \(d_E\), since \(d_E(u^n,u)\rightarrow 0\), \(\delta (u^n_t,u_t)\) converges to 0, \(d{\mathbb {P}}\times dt\)-a.e. Now, since the set U is compact, its diameter is finite, so \(\delta (u^n_t(\omega ),u_t(\omega ))\le \mathrm {diam}(U)\) for every \(n\ge 1\) and \((t,\omega )\). Thus, by dominated convergence, \(\mathbb {E}[\int _0^T\delta ^2(u^n_t,u_t)dt]\rightarrow 0\). \(\square \)

Proposition 4.12

Assume (A1), (A2)-(a), (A6) and (B1)–(B4). Let \((u^n)_n\) and u be in \(\mathcal {U}\). If \(d_E(u^n,u)\rightarrow 0\), then \(D^2_T(P^{u^n},P^u)\rightarrow 0\). Moreover, for every \(t\in [0,T]\), \(L^{u^n}_t\) converges to \(L^{u}_t\) in \(L^1({\mathbb {P}})\).

Proof

In view of Lemma 4.11, we have \(\mathbb {E}[\int _0^T \delta ^2(u_t, u^n_t)dt] \rightarrow 0\). Therefore, the sequence \((\int _0^T \delta ^2(u_t, u^n_t)dt)_{n\ge 0}\) converges to 0 in probability w.r.t. \({\mathbb {P}}\) and, by compactness of U, it is bounded. On the other hand, since \(L^u_T\) is integrable, the sequence \((L^u_T\int _0^T \delta ^2(u_t, u^n_t)dt)_{n\ge 0}\) also converges to 0 in probability w.r.t. \({\mathbb {P}}\). Next, by the uniform boundedness of \((\int _0^T \delta ^2(u_t, u^n_t)dt)_{n\ge 0}\), the sequence \((L^u_T\int _0^T \delta ^2(u_t, u^n_t)dt)_{n\ge 0}\) is uniformly integrable. Finally, as we have

$$\begin{aligned} {E}^u\left[ \int _0^T \delta ^2(u_t, u^n_t)dt\right] =\mathbb {E}\left[ L^u_T\int _0^T \delta ^2(u_t, u^n_t)dt\right] \end{aligned}$$

then

$$\begin{aligned} {E}^u\left[ \int _0^T \delta ^2(u_t, u^n_t)dt\right] \rightarrow _n 0. \end{aligned}$$

Now to conclude it is enough to use the inequality (4.5).

To prove the last statement, set \(M^u_t:=\int _0^t \sigma ^{-1}(s,x_.)f(s,x_.,P^u\circ x_s^{-1},u_s)dW_s\). In view of (B2), we have

$$\begin{aligned} \mathbb {E}[|M^{u^n}_t-M^u_t|^2]&=\mathbb {E}\Big [\int _0^t |\sigma ^{-1}(s,x_.)(f(s,x_.,P^{u^n}\circ x_s^{-1},u^n_s)-f(s,x_.,P^u\circ x_s^{-1},u_s))|^2ds\Big ]\\ &\le C\Big (D^2_t(P^{u^n},P^u)+\mathbb {E}\Big [\int _0^T \delta ^2(u^n_t, u_t)dt\Big ]\Big ), \end{aligned}$$

which converges to zero as \(n\rightarrow +\infty \).

Furthermore, setting \(f(t,x_.,u):=f(t,x_.,P^{u}\circ x_t^{-1},u_t)\), we have, taking (A6) into account,

$$\begin{aligned} \mathbb {E}\big [|\langle M^{u^n}\rangle _t-\langle M^u\rangle _t|\big ]&\le C\,\mathbb {E}\Big [\int _0^t |f(s,x_.,u^n)-f(s,x_.,u)|\,\big (|f(s,x_.,u^n)|+|f(s,x_.,u)|\big )ds\Big ] \\ &\le C\Big (\mathbb {E}\Big [\int _0^t |f(s,x_.,u^n)-f(s,x_.,u)|^2ds\Big ]\Big )^{1/2}\Big (\mathbb {E}\Big [\int _0^t \big (|f(s,x_.,u^n)|+|f(s,x_.,u)|\big )^2ds\Big ]\Big )^{1/2} \\ &\le C\Big (\mathbb {E}\Big [\int _0^t |f(s,x_.,u^n)-f(s,x_.,u)|^2ds\Big ]\Big )^{1/2}\Big (\mathbb {E}\Big [\int _0^t (1+|x|_s^2)ds\Big ]\Big )^{1/2}, \end{aligned}$$

which converges to zero as \(n\rightarrow +\infty \). Therefore, \(L^{u^n}_t\) converges to \(L^u_t\) in probability w.r.t. \(\mathbb {P}\). But, by Theorem 2.2 in [18], under (A6), \((L^{u^n}_t)_n\) is uniformly integrable. Thus, \(L^{u^n}_t\) converges to \(L^u_t\) in \(L^1(\mathbb {P})\) as \(n\rightarrow +\infty \). \(\square \)

Proposition 4.13

For any \(\varepsilon >0\), there exists a control \(u^{\varepsilon }\in \mathcal {U}\) such that

$$\begin{aligned} J(u^{\varepsilon })\le \inf _{u\in \mathcal {U}} J(u)+\varepsilon . \end{aligned}$$
(4.38)

\(u^{\varepsilon }\) is called near or \(\varepsilon \)-optimal for the payoff functional J.

Proof

The result follows from Ekeland’s variational principle, provided that we prove that the payoff function J, as a mapping from the complete metric space \((\mathcal {U},d_E)\) to \(\mathbb {R}\), is lower bounded and lower-semincontinuous. Since f and g are assumed uniformly bounded, J is obviously bounded. We now show continuity of J: \(J(u^n)\) converges to J(u) when \(d_E(u^n,u)\rightarrow 0\).

Integrating by parts, we obtain

$$\begin{aligned} J(u)=\mathbb {E}\left[ \int _0^T L^u_th(t,x_.,P^u\circ x_t^{-1},u_t)dt+L^u_Tg(x_T, P^u\circ x_T^{-1})\right] . \end{aligned}$$

Now by the boundedness of h we have the inequality

$$\begin{aligned} |L^{u^n}_th(t,x_.,u^n)-L^{u}_th(t,x_.,u)|\le C|L^{u^n}_t-L^{u}_t|+L^{u}_t|h(t,x_.,u^n)-h(t,x_.,u)| \end{aligned}$$

Hence, by (B2), the boundedness of h and Proposition 4.12, \({\mathbb {E}}[\int _0^T L^{u^n}_th(t,x_.,P^{u^n}\circ x_t^{-1},u^n_t)dt]\) converges to \({\mathbb {E}}[\int _0^T L^u_th(t,x_.,P^u\circ x_t^{-1},u_t)dt]\) as \(d_E(u^n,u)\rightarrow 0\). A similar argument yields convergence of \({\mathbb {E}}[L^{u^n}_Tg(x_T, P^{u^n}\circ x_T^{-1})]\) to \({\mathbb {E}}[L^u_Tg(x_T, P^u\circ x_T^{-1})]\) when \(d_E(u^n,u)\rightarrow 0\). \(\square \)

Finally, we provide below an example where an optimal control exists. Assume that:

(i) The drift f does not depend on the mean-field term \(P^u\circ x_t^{-1}\) and the set \(f(t,\zeta ,U)\) is convex for any fixed \((t,\zeta )\in [0,T]\times \Omega \). Additionally, for simplicity, assume that \(\sigma =I_d\).

(ii) The instantaneous (resp. terminal) payoff has the following form:

$$\begin{aligned} h(t,x_.,u)=\Gamma (t,x_., E^u[\psi _1(x_t)]) \quad (\text {resp. } g(x_T,P^u\circ x_T^{-1})=\Theta (x_T,E^u[\psi _2(x_T)])), \end{aligned}$$

where:

  1. (a)

the functions \(\psi _i\), \(i=1,2\), are bounded;

  2. (b)

the mapping \(a\in \mathbb {R}^d \mapsto (\Gamma (t,\zeta ,a),\Theta (\eta ,a))\) is continuous (\(\zeta \in \mathcal {C}\) and \(\eta \in \mathbb {R}^d\)).

Then an optimal control exists. Indeed, let \((u_n)_{n\ge 0}\) be a sequence in \(\mathcal {U}\) such that

$$\begin{aligned} \inf _{u\in \mathcal {U}}J(u)=\lim _{n\rightarrow \infty }J(u_n). \end{aligned}$$

As the set of densities \(\{L^{u}_T, u\in \mathcal {U}\}\) is weakly compact for the topology \(\sigma (L^1,L^\infty )\) (see e.g. [2, p. 470]), there exist \(u^*\in \mathcal {U}\) and a subsequence \(\{L^{u_{n_k}}_T, k\ge 0\}\) which converges weakly to \(L^{u^*}_T\). But for any \(t\le T\),

$$\begin{aligned} \begin{array}{l}E^{u_{n_k}}[\psi _1(x_t)] = {\mathbb {E}}[L^{u_{n_k}}_T\psi _1(x_t)]\rightarrow _k {\mathbb {E}}[L^{u^*}_T\psi _1(x_t)] =E^{u^*}[\psi _1(x_t)] \text{ and } \\ E^{u_{n_k}}[\psi _2(x_T)] = {\mathbb {E}}[L^{u_{n_k}}_T\psi _2(x_T)]\rightarrow _k {\mathbb {E}}[L_T^{u^*}\psi _2(x_T)] =E^{u^*}[\psi _2(x_T)].\end{array} \end{aligned}$$
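Here we used that \(\psi _i(x_t)\) is \(\mathcal {F}_t\)-measurable and that \(L^{u}\) is an \((\mathbb {F},{\mathbb {P}})\)-martingale, so that, by the tower property,

$$\begin{aligned} {\mathbb {E}}[L^{u}_T\psi _1(x_t)]={\mathbb {E}}\big [{\mathbb {E}}[L^{u}_T\,|\,\mathcal {F}_t]\,\psi _1(x_t)\big ]={\mathbb {E}}[L^{u}_t\psi _1(x_t)]=E^{u}[\psi _1(x_t)]. \end{aligned}$$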

Using the boundedness and continuity of \(\Gamma \) and \(\Theta \) together with the dominated convergence theorem, we obtain:

$$\begin{aligned} \begin{array}{l} \lim _{k\rightarrow \infty }\int _0^T\Gamma (s,x_.,E^{u_{n_k}}[\psi _1(x_s)])ds+\Theta (x_T, E^{u_{n_k}}[\psi _2(x_T)])\\ \qquad \qquad \qquad =\int _0^T\Gamma (s,x_.,E^{u^*}[\psi _1(x_s)])ds +\Theta (x_T, E^{u^*}[\psi _2(x_T)]) \end{array} \end{aligned}$$
(4.39)

in \(L^p\) for any \(p\ge 1\). Next, observe that

$$\begin{aligned} J(u_{n_k})={\mathbb {E}}\left[ L_T^{u_{n_k}}\left\{ \int _0^T\Gamma (s,x_.,E^{u_{n_k}}[\psi _1(x_s)])ds+\Theta (x_T, E^{u_{n_k}}[\psi _2(x_T)])\right\} \right] \end{aligned}$$

and, by Theorem 2.2 in [18], there exists \(p_0>1\) such that \({\mathbb {E}}[(L_T^{u_{n_k}})^{p_0}]\) is bounded by a constant which does not depend on k. Therefore, by the weak convergence of \((L_T^{u_{n_k}})_{k\ge 0}\) and (4.39), we have \(\lim _{k\rightarrow \infty }J(u_{n_k})=J(u^*)\), which implies that \(J(u^*)=\inf _{u\in \mathcal {U}}J(u)\); hence \(u^*\) is optimal.
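To justify this passage to the limit, denote by \(\Phi _k\) the quantity in braces in the expression of \(J(u_{n_k})\) above, by \(\Phi \) its limit in (4.39), and let \(q_0\) be the conjugate exponent of \(p_0\). Then, by Hölder’s inequality,

$$\begin{aligned} |J(u_{n_k})-{\mathbb {E}}[L_T^{u^*}\Phi ]|\le \big ({\mathbb {E}}[(L_T^{u_{n_k}})^{p_0}]\big )^{1/p_0}\big ({\mathbb {E}}[|\Phi _k-\Phi |^{q_0}]\big )^{1/q_0}+\big |{\mathbb {E}}[(L_T^{u_{n_k}}-L_T^{u^*})\Phi ]\big |, \end{aligned}$$

where the first term on the right-hand side vanishes as \(k\rightarrow \infty \) by (4.39) and the uniform \(L^{p_0}\) bound, and the second term vanishes by the weak \(\sigma (L^1,L^\infty )\) convergence, \(\Phi \) being bounded (\(\Gamma \) and \(\Theta \) are bounded). Since \({\mathbb {E}}[L_T^{u^*}\Phi ]=J(u^*)\), the claim follows.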

As a final remark, this example can be formalized and generalized substantially. \(\square \)

5 The Zero-Sum Game Problem

In this section we consider a symmetric two-player zero-sum game. Let \(\mathcal {U}\) (resp. \(\mathcal {V}\)) be the set of admissible U-valued (resp. V-valued) controls for the first (resp. second) player, where \((U,\delta _1)\) and \((V,\delta _2)\) are compact metric spaces.

For \((u,v),({\bar{u}},{\bar{v}})\in U\times V\), we set

$$\begin{aligned} \delta ((u,v),({\bar{u}},{\bar{v}})):=\delta _1(u,{\bar{u}}) +\delta _2(v,{\bar{v}}). \end{aligned}$$
(5.1)

The distance \(\delta \) defines a metric on the compact space \(U\times V\).

Let f and h be two measurable functions from \([0,T]\times \Omega \times {\mathcal {P}}(\mathbb {R}^d)\times U\times V\) into \(\mathbb {R}^d\) and \(\mathbb {R}\), respectively, and g be a measurable function from \(\mathbb {R}^d\times {\mathcal {P}}(\mathbb {R}^d)\) into \(\mathbb {R}\) such that

  1. (C1)

    For any \((u,v)\in \mathcal {U}\times \mathcal {V}\) and \(Q\in {\mathcal {P}}(\Omega )\), the processes \((f(t, x_.,Q\circ x_t^{-1},u_t,v_t))_t\) and \((h(t, x_.,Q\circ x_t^{-1},u_t,v_t))_t\) are progressively measurable. Moreover, \(g(x_T,Q\circ x_T^{-1})\) is \(\mathcal {F}_T\)-measurable.

  2. (C2)

    For every \(t\in [0,T]\), \(w\in \Omega \), \((u,v),({\bar{u}},{\bar{v}}) \in U\times V\) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),

$$\begin{aligned} |\phi (t,w,\mu , u,v)-\phi (t,w,\nu ,{\bar{u}},{\bar{v}})|\le C\big (d(\mu ,\nu )+\delta ((u,v),({\bar{u}},{\bar{v}}))\big ), \end{aligned}$$

    for \(\phi \in \{f,h\}\). For every \(w\in \Omega \) and \(\mu , \nu \in {\mathcal {P}}(\mathbb {R}^d)\),

    $$\begin{aligned} |g(w,\mu )-g(w,\nu )|\le Cd(\mu ,\nu ). \end{aligned}$$
  3. (C3)

    For every \(t\in [0,T]\), \(w\in \Omega ,\,\mu \in {\mathcal {P}}(\mathbb {R}^d)\) and \((u,v)\in \mathcal {U}\times \mathcal {V}\),

    $$\begin{aligned} |f(t,w,\mu ,u,v)|\le C(1+|w|_t). \end{aligned}$$
  4. (C4)

    h and g are uniformly bounded.

For \((u,v)\in \mathcal {U}\times \mathcal {V}\), let \(P^{u,v}\) be the probability measure on \((\Omega ,\mathcal {F})\) defined by

$$\begin{aligned} dP^{u,v}:=L_T^{u,v}d{\mathbb {P}}, \end{aligned}$$
(5.2)

where

$$\begin{aligned} L_t^{u,v}:=\mathcal {E}_t\left( \int _0^{\cdot } \sigma ^{-1}(s,x_.)f(s,x_.,P^{u,v}\circ x_s^{-1},u_s,v_s)dW_s\right) ,\quad 0\le t\le T. \end{aligned}$$
(5.3)

The proof of existence of \(P^{u,v}\) follows the same lines as that of \(P^u\) defined in (4.1)–(4.2). Hence, by Girsanov’s theorem, the process \((W^{u,v}_t,\,\, 0\le t\le T)\) defined by

$$\begin{aligned} W_t^{u,v}:=W_t-\int _0^t \sigma ^{-1}(s,x_.)f(s,x_.,P^{u,v}\circ x_s^{-1},u_s,v_s)ds, \quad 0\le t\le T, \end{aligned}$$

is an \((\mathbb {F}, P^{u,v})\)-Brownian motion. Moreover, under \(P^{u,v}\),

$$\begin{aligned} dx_t=f(t,x_.,P^{u,v}\circ x_t^{-1},u_t,v_t)dt+\sigma (t,x_.)dW^{u,v}_t,\quad x_0=\text {x}\in \mathbb {R}^d. \end{aligned}$$
(5.4)

Let \(E^{u,v}\) denote the expectation w.r.t. \(P^{u,v}\).

The payoff functional \(J(u,v),\,(u,v)\in \mathcal {U}\times \mathcal {V}\), associated with the controlled SDE (5.4) is

$$\begin{aligned} J(u,v):=E^{u,v}\left[ \int _0^T h(t,x_.,P^{u,v}\circ x_t^{-1},u_t,v_t)dt+ g(x_T,P^{u,v}\circ x^{-1}_T)\right] . \end{aligned}$$
(5.5)

The zero-sum game we consider is between two players, where the first player (with control u) wants to minimize the payoff (5.5), while the second player (with control v) wants to maximize it. Solving the game boils down to showing the existence of a saddle-point, i.e. of a pair \((u^*, v^*)\) of controls such that

$$\begin{aligned} J(u^*, v) \le J(u^*, v^*)\le J(u,v^*) \end{aligned}$$
(5.6)

for each \((u, v)\in \mathcal {U}\times \mathcal {V}\).

The corresponding dynamics is given by the probability measure \(P^*\) on \((\Omega ,\mathcal {F})\) defined by

$$\begin{aligned} dP^*=\mathcal {E}_T\left( \int _0^{\cdot } \sigma ^{-1}(s,x_.)f(s,x_., P^*\circ x_s^{-1},u^*_s, v^*_s)dW_s\right) d{\mathbb {P}} \end{aligned}$$
(5.7)

under which

$$\begin{aligned} dx_t=f(t,x, P^*\circ x_t^{-1},u^*_t,v^*_t)dt+\sigma (t,x)dW^{u^*, v^*}_t,\quad x_0=\text {x}\in \mathbb {R}^d. \end{aligned}$$
(5.8)

For \((u,v)\in \mathcal {U}\times \mathcal {V}\) and \(z\in \mathbb {R}^d\), we introduce the Hamiltonian associated with the game (5.4)–(5.5):

$$\begin{aligned} H(t,x_.,z,u,v):= & {} z\cdot \sigma ^{-1}(t,x_.)f(t,x_.,P^{u,v}\circ x_t^{-1},u_t,v_t)\nonumber \\&+\,h(t,x_.,P^{u,v}\circ x_t^{-1},u_t,v_t). \end{aligned}$$
(5.9)

Next, set

  1. (i)

    \(\underline{H}(t,x_.,z):=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, \underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, H(t,x_.,z,u,v),\)

  2. (ii)

    \(\overline{H}(t,x_.,z):=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, \underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, H(t,x_.,z,u,v),\)

  3. (iii)

    \(\underline{g}(x_.):=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, \underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, g(x_T, P^{u,v}\circ x_T^{-1})\),

  4. (iv)

    \(\overline{g}(x_.):=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, \underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, g(x_T, P^{u,v}\circ x_T^{-1})\).

As in Proposition 4.4, \(\underline{H}\), \(\overline{H}\), \(\underline{g}\) and \(\overline{g}\) exist. Moreover, following a proof similar to the one leading to (4.18), \(\underline{H}(t,x_.,z)\) and \(\overline{H}(t,x_.,z)\) are Lipschitz continuous in z with the stochastic Lipschitz constant \(C(1+|x|^{1+\alpha }_t)\).
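For the z-Lipschitz property, note that only the term \(z\cdot \sigma ^{-1}f\) in (5.9) depends on z, and that \(\mathrm {ess}\inf \) and \(\mathrm {ess}\sup \) are 1-Lipschitz with respect to uniform perturbations. Hence, for all \(z,z'\in \mathbb {R}^d\), using (C3),

$$\begin{aligned} |\underline{H}(t,x_.,z)-\underline{H}(t,x_.,z')|\le \underset{(u,v)}{\mathrm {ess}\sup }\,|(z-z')\cdot \sigma ^{-1}(t,x_.)f(t,x_.,P^{u,v}\circ x_t^{-1},u_t,v_t)|\le C|\sigma ^{-1}(t,x_.)|(1+|x|_t)|z-z'|, \end{aligned}$$

and similarly for \(\overline{H}\). Assuming, consistently with the constant above, that \(|\sigma ^{-1}(t,x_.)|\le C(1+|x|^{\alpha }_t)\), this yields the stochastic Lipschitz constant \(C(1+|x|^{1+\alpha }_t)\).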

Let \((\underline{Y},\underline{Z})\) be the solution of the BSDE associated with \((\underline{H}, \underline{g})\) and \((\overline{Y},\overline{Z})\) the solution of the BSDE associated with \((\overline{H}, \overline{g})\).

Definition 5.1

(Isaacs’ condition) We say that Isaacs’ condition holds for the game if

$$\begin{aligned} \left\{ \begin{array}{l} \underline{H}(t,x_.,z)=\overline{H}(t,x_.,z),\quad z\in \mathbb {R}^d,\,\, 0\le t\le T, \\ \underline{g}(x_.)=\overline{g}(x_.). \end{array} \right. \end{aligned}$$

Applying the comparison theorem for BSDEs and then uniqueness of the solution, we obtain the following

Proposition 5.2

For every \(t\in [0,T]\), it holds that \(\underline{Y}_t\le \overline{Y}_t\), \({\mathbb {P}}\)-a.s. Moreover, if Isaacs’ condition holds, then

$$\begin{aligned} \underline{Y}_t=\overline{Y}_t:=Y_t,\quad {\mathbb {P}}\text{-a.s. },\quad 0\le t\le T. \end{aligned}$$
(5.10)
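The first inequality relies on the pointwise minimax inequality

$$\begin{aligned} \underline{H}(t,x_.,z)\le \overline{H}(t,x_.,z)\quad \text{ and }\quad \underline{g}(x_.)\le \overline{g}(x_.), \end{aligned}$$

which holds because an \(\mathrm {ess}\sup \) of \(\mathrm {ess}\inf \)’s never exceeds the corresponding \(\mathrm {ess}\inf \) of \(\mathrm {ess}\sup \)’s; the comparison theorem for BSDEs then yields \(\underline{Y}_t\le \overline{Y}_t\), and under Isaacs’ condition uniqueness of the solution gives (5.10).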

In the next theorem, we formulate conditions under which the zero-sum game has a value. For \((u,v)\in \mathcal {U}\times \mathcal {V}\), let \((Y^{u,v},Z^{u,v})\in {\mathcal {S}}^2_T\times {\mathcal {H}}^2_T\) be the solution of the BSDE

$$\begin{aligned} \left\{ \begin{array}{ll} -dY^{u,v}_t=H(t,x_.,Z^{u,v}_t,u,v) dt-Z^{u,v}_tdW_t,\quad 0\le t<T,\\ Y^{u,v}_T=g(x_T,P^{u,v}\circ x_T^{-1}). \end{array} \right. \end{aligned}$$
(5.11)

Theorem 5.3

(Existence of a value of the zero-sum game) Assume that, for every \(t\in [0,T]\),

$$\begin{aligned} \underline{H}(t,x_.,\underline{Z}_t)=\overline{H}(t,x_.,\underline{Z}_t). \end{aligned}$$
(5.12)

If there exists \((u^*,v^*)\in \mathcal {U}\times \mathcal {V}\) such that, for every \(\, 0\le t<T\),

$$\begin{aligned} \underline{H}(t,x_.,\underline{Z}_t)=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, H(t,x_.,\underline{Z}_t,u,v^*)=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, H(t,x_.,\underline{Z}_t,u^*,v), \end{aligned}$$
(5.13)

and

$$\begin{aligned} \underline{g}(x_.)=\overline{g}(x_.)=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, g(x_T,P^{u,v^*}\circ x_T^{-1})=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\, \, g(x_T,P^{u^*,v}\circ x_T^{-1}). \end{aligned}$$
(5.14)

Then, \({\mathbb {P}}\)-a.s. for any \(t\le T\),

$$\begin{aligned} Y_t=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }Y_t^{u,v}=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\,\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,Y_t^{u,v}. \end{aligned}$$
(5.15)

Moreover, the pair \((u^*,v^*)\) is a saddle-point for the game.

Proof

First note that (5.12) also holds with \(\underline{Z}\) replaced by \(\overline{Z}\). Since \(\underline{H}(t,x_.,\underline{Z}_t)=\overline{H}(t,x_.,\underline{Z}_t)\) and, by (5.14), \(\underline{g}=\overline{g}\), the pair \((\underline{Y},\underline{Z})\) also solves the BSDE associated with \((\overline{H},\overline{g})\); hence, by uniqueness of the solution of that BSDE, \((\underline{Y},\underline{Z})=(\overline{Y},\overline{Z})\).

On the other hand, by (5.13)–(5.14), one can easily check that the pair \((u^*,v^*)\) satisfies a saddle-point property for H and g as well, i.e.,

$$\begin{aligned} H(t,x_.,\underline{Z}_t,u^*,v)\le \underline{H}(t,x_.,\underline{Z}_t)={H}(t,x_.,\underline{Z}_t,u^*,v^*)\le H(t,x_.,\underline{Z}_t,u,v^*),\quad t<T, \end{aligned}$$

and

$$\begin{aligned} g(x_T,P^{u^*,v}\circ x_T^{-1})\le \underline{g}(x_.)=\overline{g}(x_.)=g(x_T,P^{u^*,v^*}\circ x_T^{-1})\le g(x_T,P^{u,v^*}\circ x_T^{-1}). \end{aligned}$$

The previous relations and the uniqueness of the solutions of the BSDEs imply that \(\overline{Y}_t=\underline{Y}_t=Y^{u^*,v^*}_t\).

Now let \((u,v)\in \mathcal {U}\times \mathcal {V}\) and let \(({\widehat{Y}}^u,{\widehat{Z}}^u)\) and \(({\widetilde{Y}}^v,{\widetilde{Z}}^v)\) be the solutions of the following BSDEs:

$$\begin{aligned} \left\{ \begin{array}{ll} -d{\widehat{Y}}^{u}_t&{}=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup } \,H(t,x_.,{\widehat{Z}}^{u}_t,u,v) dt-{\widehat{Z}}^{u}_tdW_t, \quad 0\le t\le T,\\ {\widehat{Y}}^{u}_T&{}=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\,g(x_T,P^{u,v}\circ x_T^{-1}), \end{array} \right. \end{aligned}$$
(5.16)
$$\begin{aligned} \left\{ \begin{array}{ll} -d{\widetilde{Y}}^{v}_t&{}=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\, H(t,x_.,{\widetilde{Z}}^{v}_t,u,v) dt-{\widetilde{Z}}^{v}_tdW_t, \quad 0\le t\le T,\\ {\widetilde{Y}}^{v}_T&{}=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,g(x_T,P^{u,v}\circ x_T^{-1}). \end{array} \right. \end{aligned}$$
(5.17)

Then, by the comparison theorem for BSDEs, we have

$$\begin{aligned} {\hat{Y}}^{u^*}_t\ge Y^{u^*,v}_t \text{ and } {\tilde{Y}}^{v^*}_t\le Y^{u,v^*}_t. \end{aligned}$$
(5.18)
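Indeed, for fixed \(v\in \mathcal {V}\), the generator and the terminal condition of (5.16) with \(u=u^*\) dominate those of the BSDE (5.11) associated with \((u^*,v)\):

$$\begin{aligned} \underset{v'\in \mathcal {V}}{\mathrm {ess}\sup }\,H(t,x_.,z,u^*,v')\ge H(t,x_.,z,u^*,v)\quad \text{ and }\quad \underset{v'\in \mathcal {V}}{\mathrm {ess}\sup }\,g(x_T,P^{u^*,v'}\circ x_T^{-1})\ge g(x_T,P^{u^*,v}\circ x_T^{-1}), \end{aligned}$$

so the comparison theorem gives the first inequality in (5.18); the second one follows symmetrically from (5.17).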

But \({\hat{Y}}^{u^*}\) satisfies the following BSDE:

$$\begin{aligned} \left\{ \begin{array}{ll} -d{\widehat{Y}}^{u^*}_t&{}=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup } \,H(t,x_.,{\widehat{Z}}^{u^*}_t,u^*,v) dt-{\widehat{Z}}^{u^*}_tdW_t, \quad 0\le t<T,\\ {\widehat{Y}}^{u^*}_T&{}=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\,g(x_T,P^{u^*,v}\circ x_T^{-1}). \end{array} \right. \end{aligned}$$
(5.19)

Taking (5.13)–(5.14) into account, and since the solution of the previous BSDE is unique, we obtain that

$$\begin{aligned} \underline{Y}_t=Y^{u^*,v^*}_t={\hat{Y}}^{u^*}_t. \end{aligned}$$

Moreover, (5.18) implies that \(Y^{u^*,v^*}_t\ge Y^{u^*,v}_t\) for any \(v \in \mathcal {V}\). In the same way we also have \(\underline{Y}_t=Y^{u^*,v^*}_t={\tilde{Y}}^{v^*}_t\le Y^{u,v^*}_t\), \({\mathbb {P}}\)-a.s., for any \(u\in \mathcal {U}\). Therefore,

$$\begin{aligned} Y^{u^*, v}_t\le Y^{u^*, v^*}_t\le Y^{u, v^*}_t. \end{aligned}$$

Thus, \((u^*,v^*)\) is a saddle-point of the game and \(\underline{Y}_t=Y^{u^*, v^*}_t\) is the value of the game, i.e., \({\mathbb {P}}\)-a.s. for any \(t\le T\),

$$\begin{aligned} Y^{u^*, v^*}_t=Y_t=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }Y_t^{u,v}=\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\,\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,Y_t^{u,v}. \end{aligned}$$
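To see why this identity holds, take \(\mathrm {ess}\sup \) over v and \(\mathrm {ess}\inf \) over u in the saddle-point inequality above:

$$\begin{aligned} \underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\,Y_t^{u,v}\le \underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\,Y_t^{u^*,v}=Y^{u^*,v^*}_t=\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,Y_t^{u,v^*}\le \underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\,\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,Y_t^{u,v}, \end{aligned}$$

which, combined with the general inequality \(\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\,\underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,Y_t^{u,v}\le \underset{u\in \mathcal {U}}{\mathrm {ess}\inf }\,\underset{v\in \mathcal {V}}{\mathrm {ess}\sup }\,Y_t^{u,v}\), forces equality throughout.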

\(\square \)

Final remark Assumptions (B4) and (C4) on the boundedness of the functions g and h can be substantially weakened by using refined arguments on existence and uniqueness of solutions of one-dimensional BSDEs, which are by now well known in the BSDE literature.