On the Quadratic Convergence of Newton’s Method for Mean Field Games with Non-separable Hamiltonian

Camilli, Fabio; Tang, Qing

doi:10.1007/s13235-024-00561-y

On the Quadratic Convergence of Newton’s Method for Mean Field Games with Non-separable Hamiltonian

Open access
Published: 25 April 2024

(2024)
Cite this article

Download PDF

You have full access to this open access article

Dynamic Games and Applications Aims and scope Submit manuscript

On the Quadratic Convergence of Newton’s Method for Mean Field Games with Non-separable Hamiltonian

Download PDF

Fabio Camilli¹ &
Qing Tang²

350 Accesses
Explore all metrics

Abstract

We analyze asymptotic convergence properties of Newton’s method for a class of evolutive Mean Field Games systems with non-separable Hamiltonian arising in mean field type models with congestion. We prove the well posedness of the Mean Field Game system with non-separable Hamiltonian and of the linear system giving the Newton iterations. Then, by forward induction and assuming that the initial guess is sufficiently close to the solution of problem, we show a quadratic rate of convergence for the approximation of the Mean Field Game system by Newton’s method. We also consider the case of a nonlocal coupling, but with separable Hamiltonian, and we show a similar rate of convergence.

Time-Dependent Focusing Mean-Field Games: The Sub-critical Case

Article 30 April 2018

The Convergence Problem in Mean Field Games with Local Coupling

Article 23 June 2017

Regularity for Mean-Field Games Systems with Initial-Initial Boundary Conditions: The Subquadratic Case

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Mean Field Games (MFG in short) theory, introduced in [25, 27], arises in the study of differential games with an infinite number of rational agents. The corresponding literature is now vast and concerns both theoretical and applicative aspects, see [4, 17, 30] and references therein. In this regard, a significant part of it is dedicated to the study of numerical methods and algorithms for the computation of the solution to the MFG model, both in the formulation as a PDEs system and as an optimal control problem of a PDE. Such approaches, just to mention a few, include finite differences, semi-Lagrangian methods and Fourier expansions with regard to the approximation methods (see [1, 2, 6, 7, 11, 13, 14, 28, 29, 31, 33, 34]). Many of these methods exploit the variational structure of the problem, concerns the case in which the coupling term involving the distribution of the population is separated from the Hamiltonian, while relatively few works have been dedicated to the so-called non-separable case

$$\begin{aligned} \left\{ \begin{aligned} (i) \qquad&-\partial _t u -\Delta u+H(x,m,Du)=f(m) \qquad{} & {} \textrm{in}\,\,Q,\\ (ii) \qquad&\partial _t m- \Delta m - \text {div} \big ( mH_p(x,m,Du)\big ) =0{} & {} \textrm{in}\,\,Q, \\&m(x,0)=m_0(x), \; u(x,T)=u_T(x){} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(1.1)

(${\mathbb {T}}^d$ is the unit torus and $Q={\mathbb {T}}^d\times (0,T)$). Moreover, the non-separable case is very important in applications to model congestion effects, i.e., situations in which the cost of displacement of the agents increases in those regions where the density is large. MFGs models including congestion were introduced in [30] and a typical Hamiltonian in such cases is

$$\begin{aligned} H(x,m,p)=\frac{h(x)|p|^2}{(1+m)^\alpha }, \quad \alpha >0. \end{aligned}$$

Global in time weak solution to Eq. (1.1) has been considered in [5, 23], short time existence and uniqueness of regular solution in [8, 19] and the stationary case in [22]. In general, MFGs with non-separable Hamiltonian do not have a variational structure and this causes a restriction on the choice of numerical methods. Moreover and in general, implicit schemes are preferred as they enhance the stability and efficiency compared to explicit schemes. To design implicit finite difference schemes, iterative methods are needed to reduce the problem to a sequence of linear systems. Iterative methods employed in solving MFGs include Newton’s method [6, 7, 9, 28], fixed point iteration, fictitious play, policy iteration [15, 29], smoothed policy iteration [34], etc. In particular, numerical solution of MFGs with non-separable Hamiltonian have been discussed in, e.g., [6, 7, 23, 28, 29].

In this paper, we consider Newton’s method from a continuous standpoint, viewing it as a linear system of partial differential equations (PDEs) which approximate the nonlinear problem (1.1). Newton’s method (also known as the Newton Kantorovich method) is effective for convex optimization problems ( [10]) or for solving nonlinear functional equations in Banach spaces, cf. [18]. The novelty and main contributions of this work are theoretical. We rigorously establish a quadratic rate of convergence of the method in a neighborhood of the solution of Eq. (1.1). In the study of Newton’s method to Eq. (1.1), a critical point is in establishing the well posedness of the linearized MFG system. To address this, we broaden the theoretical framework developed in [12, 16] for analyzing master equations. This extension is applicable to MFGs with non-separable Hamiltonians, subject to certain Hessian-type monotonicity conditions. Recently, the convergence analysis of Newton’s method has been considered in [9] for stationary MFGs with separable Hamiltonian.

The Newton method for system Eq. (1.1) reads as follows. Writing the MFG system as an operator equation ${\mathcal {F}}(u,m)=0$ and denoting by $\mathcal{L}\mathcal{F}_{({\check{u}},{\check{m}})}$ the linearized ${\mathcal {F}}$ at $({\check{u}},{\check{m}})$ and by $\mathcal{L}\mathcal{F}_{({\check{u}},{\check{m}})}^{-1}$ the inverse of $\mathcal{L}\mathcal{F}_{({\check{u}},{\check{m}})}$, then we get

$$\begin{aligned} (u^{n},m^n)=(u^{n-1},m^{n-1})-\mathcal{L}\mathcal{F}^{-1}_{(u^{n-1},m^{n-1})}\cdot {\mathcal {F}}(u^{n-1},m^{n-1}), \end{aligned}$$

or equivalently

$$\begin{aligned} \mathcal{L}\mathcal{F}_{(u^{n-1},m^{n-1})}\cdot \big (u^{n}- u^{n-1},m^n-m^{n-1}\big )=-{\mathcal {F}}(u^{n-1},m^{n-1}). \end{aligned}$$

(1.2)

The previous identity in PDE form reads as

$$\begin{aligned} \left\{ \begin{aligned} (i)\qquad&-\partial _t (u^{n}- u^{n-1}) -\Delta (u^{n}- u^{n-1})+H_p(x,m^{n-1},Du^{n-1})D(u^{n}- u^{n-1})\\&+H_m(x,m^{n-1},Du^{n-1}) (m^{n}- m^{n-1})-f'(m^{n-1})(m^n-m^{n-1})\\ ={}&\partial _t u^{n-1} +\Delta u^{n-1}-H(x,m^{n-1},Du^{n-1})+f(m^{n-1})\qquad{} & {} \textrm{in}\,\,Q, \\ (ii)\qquad&\partial _t (m^{n}- m^{n-1})- \Delta (m^{n}- m^{n-1}) - \text {div} \big ( (m^{n}- m^{n-1}) H_p(x,m^{n-1},Du^{n-1})\big )\\&-\text {div} \Big (m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})(m^{n}- m^{n-1})\Big ) \\&-\mathrm{{div}}\Big (m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})(Du^n-Du^{n-1})\Big )\\ ={}&-\partial _t m^{n-1}+ \Delta m^{n-1} +\text {div} \big ( m^{n-1}H_p(x,m^{n-1},Du^{n-1})\big ){} & {} \textrm{in}\,\,Q, \\&m^n(x,0)=m_0(x), \; u^n(x,T)=u_T(x){} & {} \textrm{in}\,\,{\mathbb {T}}^d, \end{aligned}\right. \end{aligned}$$

(1.3)

and, after simplification, we get the coupled linear system in the unknown $(u^{n},m^{n})$

$$\begin{aligned} \left\{ \begin{aligned} (i)\qquad&-\partial _t u^n -\Delta u^n+H_p(x,m^{n-1},Du^{n-1})D(u^{n}- u^{n-1})\\&+H_m(x,m^{n-1},Du^{n-1}) (m^{n}- m^{n-1})\\ ={}&-H(x,m^{n-1},Du^{n-1})+f(m^{n-1})+f'(m^{n-1})(m^n-m^{n-1})\qquad{} & {} \textrm{in}\,\,Q, \\ (ii)\qquad&\partial _t m^n- \Delta m^n - \text {div} \big ( m^nH_p(x,m^{n-1},Du^{n-1})\big ) \\ ={}&\mathrm{{div}}\Big (m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})(Du^n-Du^{n-1})\Big )\\&+\text {div} \Big (m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})(m^{n}- m^{n-1})\Big ){} & {} \textrm{in}\,\,Q, \\&m^n(x,0)=m_0(x), \; u^n(x,T)=u_T(x){} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(1.4)

Assuming that the Hamiltonian is regular and satisfies a classical monotonicity condition (see [5, 30]), we obtain existence and uniqueness of a classical solution (u, m) to Eq. (1.1), see Proposition 2.4. Then, we prove the well posedness of Eq. (1.4) at each iteration and the quadratic rate of convergence of the Newton iteration to the solution of the MFG system when the initial guess $(u^0,m^0)$ is sufficiently close to (u, m). We remark that, even though the algorithm is presented for evolutive MFGs, the ideas extend naturally to stationary MFGs as well.

This paper primarily focuses on analyzing the convergence of the Newton method for the MFG system at the level of PDEs. In the MFG literature, this iterative method has been applied to solve the nonlinear finite dimensional system which results via a finite differences approximation of Eq. (1.1), see [4, 7]. The algorithm is usually coupled with a continuation method (typically with respect to the viscosity parameter). Indeed, it is important to have a good initial guess of the solution and, for that, it is possible to take advantage of the continuation method by choosing the initial guess as the solution obtained with the previous value of the parameter (see [28]). Alternatively, the Newton method may be selectively employed to tackle the Hamilton-Jacobi equation at each iteration while using a fixed point iterations for the MFG system (as in, e.g., [23, 31]). Another approach involves employing a nonlinear discretized system, as presented in [1, 2], followed by the application of automatic differentiation for Newton’s iteration. Within this context, a significant challenge involves establishing a priori estimates for the finite difference scheme, ensuring the stability of the region of attraction of the method with respect to the discretization parameters. Our convergence result can be interpreted as an intermediate step in the proof of the convergence of the Newton method for finite dimensional approximation to the MFG system. However, it is important to note that the convergence analysis presented in this work does not readily extend to the discretized system, and addressing this is left for our future works.

The paper is organized as follows. In Sect. 2, we introduce some notations and preliminary results. In Sect. 3, we discuss the convergence of the Newton method for a non-separable Hamiltonian and local coupling, while in Sect. 4 the case of a separable Hamiltonian and nonlocal coupling. Finally, the Appendix A is devoted to the proof of some basic results necessary for the rest of the paper.

2 Preliminaries

In this section, we introduce the assumptions on the Hamiltonian, prove the well posedness of (1.1) and some preliminary results necessary for the estimate of the rate in the next section. Throughout the paper, we work in the $d-$dimensional torus ${\mathbb {T}}^d$ (i.e., periodic boundary conditions). We consider the set ${\mathcal {P}}({\mathbb {T}}^d)$ of Borel probability measures on ${\mathbb {T}}^d$ is endowed with the Monge Kantorovich (Wasserstein) distance: for $m,m' \in {\mathcal {P}}({\mathbb {T}}^d)$, $\mathbf{{d}}_1(m,m')=\sup _\phi \int _{{\mathbb {T}}^d}\phi (x)\textrm{d}(m-m')(x)$ where the supremum is taken over all Lipschitz maps $\phi : {\mathbb {T}}^d\rightarrow {\mathbb {R}}$ with a Lipschitz constant bounded by 1. In particular, we have that $\mathbf{{d}}_1(m,m')\le \Vert m-m'\Vert _{{\mathcal {C}}^0({\mathbb {T}}^d)}$ if $m,m'\in {\mathcal {P}}({\mathbb {T}}^d)\cap {\mathcal {C}}^0({\mathbb {T}}^d)$. Given a map $f:{\mathbb {T}}^d\times {\mathcal {P}}({\mathbb {T}}^d)\rightarrow {\mathbb {R}}^d$, we will use the notation $\frac{\delta f}{\delta m}$ for the derivative of f w.r.t m, as introduced in [16, Section 2.2]. $\frac{\delta f}{\delta m}:{\mathbb {T}}^d\times {\mathcal {P}}({\mathbb {T}}^d)\times {\mathbb {T}}^d\rightarrow {\mathbb {R}}$ is a continuous map such that

$$\begin{aligned} f[m'](x)-f[m](x)= \int _0^1\int _{{\mathbb {T}}^d}\frac{\delta f}{\delta m}[(1-s)m+sm'](x)(y)d (m'-m)(y) ds. \end{aligned}$$

The above relation defined the map $\frac{\delta f}{\delta m}$ only up to a constant. We always use the normalization

$$\begin{aligned} \int _{{\mathbb {T}}^d} \frac{\delta f}{\delta m}[m](x)(y)\textrm{d}m(y)=0. \end{aligned}$$

Higher-order derivatives are defined similarly.

The set ${\mathcal {C}}^{1,0}(Q)$ with the norm

$$\begin{aligned} \Vert u\Vert _{{\mathcal {C}}^{1,0}(Q)}= \Vert u\Vert _{{\mathcal {C}}^{0}(Q)}+ \Vert D u\Vert _{{\mathcal {C}}^{0}(Q;{\mathbb {R}}^d)} \end{aligned}$$

is the space of continuous functions on Q with continuous derivatives in the $x-$variable, up to the parabolic boundary. We also recall the definition of parabolic Hölder spaces on the torus (we refer to [26] for a more comprehensive discussion). For $0<\alpha <1$, we denote

$$\begin{aligned}{}[u]_{C^{\alpha ,\frac{\alpha }{2}}(Q)}:=\sup _{(x_1,t_1),(x_2,t_2)\in Q}\frac{\vert u(x_1,t_1)-u(x_2,t_2)\vert }{(\textrm{d}(x_1,x_2)^2+\vert t_1-t_2\vert )^{\frac{\alpha }{2}}}, \end{aligned}$$

(2.1)

where $\textrm{d}(x,y)$ stands for the geodesic distance from x to y in ${{\mathbb {T}}^d}$. The parabolic Hölder space $C^{\alpha ,\frac{\alpha }{2}}(Q)$ is the space of functions $u\in L^\infty (Q)$ for which $[u]_{C^{\alpha ,\frac{\alpha }{2}}(Q)}<\infty $. It is endowed with the norm:

$$\begin{aligned} \Vert u\Vert _{{\mathcal {C}}^{\alpha ,\frac{\alpha }{2}}(Q)}:=\Vert u\Vert _{{\mathcal {C}}^{0}(Q)}+[u]_{C^{\alpha ,\frac{\alpha }{2}}(Q)}. \end{aligned}$$

The space ${\mathcal {C}}^{1+\alpha ,\frac{1+\alpha }{2}}(Q)$ and ${\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)$ are endowed with the norms

$$\begin{aligned} \Vert u\Vert _{{\mathcal {C}}^{1+\alpha ,\frac{1+\alpha }{2}}(Q)}:= & {} \Vert u\Vert _{{\mathcal {C}}^{0}(Q)}+\sum _{i=1}^d\Vert \partial _{x_i}u\Vert _{C^{\alpha ,\frac{\alpha }{2}}(Q)}\nonumber \\{} & {} \quad +\sup _{(x_1,t_1),(x_2,t_2)\in Q}\frac{\vert u(x_1,t_1)-u(x_2,t_2)\vert }{\vert t_1-t_2\vert ^{\frac{1+\alpha }{2}}},\end{aligned}$$

(2.2)

$$\begin{aligned}{} & {} \Vert u\Vert _{{\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)}:=\Vert u\Vert _{{\mathcal {C}}^0(Q)}+\sum _{i=1}^d\Vert \frac{\partial u}{\partial x_i}\Vert _{{\mathcal {C}}^{1+\alpha ,\frac{1+\alpha }{2}}(Q)}+\Vert \frac{\partial u}{\partial t}\Vert _{{\mathcal {C}}^{\alpha ,\alpha /2}(Q)}.\nonumber \\ \end{aligned}$$

(2.3)

We now introduce some useful anisotropic Sobolev spaces to handle time-dependent problems. First, given a Banach space X, $L^p(0,T;X)$ denotes the usual vector-valued Lebesgue space for $p\in [1,\infty ]$. For any $r\ge 1$, we denote by $W^{2,1}_r(Q)$ the space of functions u such that $\partial _t^{{\delta }}D^{\sigma }_x u\in L^r(Q)$ for all multi-indices $\sigma $ and ${\delta }$ such that $\vert \sigma \vert +2{\delta }\le 2$, endowed with the norm

$$\begin{aligned} \Vert u\Vert _{W^{2,1}_r(Q)}=\Big (\int _{Q}\sum _{\vert \sigma \vert +2{\delta }\le 2}\vert \partial _t^{{\delta }}D^{\sigma }_x u\vert ^rdxdt\Big )^{\frac{1}{r}}. \end{aligned}$$

We recall that, by classical results in interpolation theory, the sharp space of initial (or terminal) trace of $W^{2,1}_r(Q)$ is given by the fractional Sobolev class $W^{2-\frac{2}{r}}_r({{\mathbb {T}}^d})$. We define $W^{1,0}_r(Q)$ as the space of functions on Q such that the norm

$$\begin{aligned} \Vert {u} \Vert _{W^{1,0}_r(Q)}:=\Vert u\Vert _{L^r(Q)}+ \sum _{i=1}^d\Vert \frac{\partial u}{\partial x_i}\Vert _{L^r(Q)} \end{aligned}$$

is finite and we denote with ${\mathcal {H}}_r^{1}(Q)$ the space of functions $u\in W^{1,0}_r(Q)$ with $\partial _t u\in (W^{1,0}_{r'}(Q))'$, equipped with the natural norm

$$\begin{aligned} \Vert u\Vert _{{\mathcal {H}}_r^{1}(Q)}:=\Vert u\Vert _{W^{1,0}_r(Q)} +\Vert {\partial _tu} \Vert _{(W^{1,0}_{s'}(Q))'}. \end{aligned}$$

From [32, Theorem A.3 (iii)] and [21, Proposition 2.1 (iii)], for $r>d+2$ the space ${\mathcal {H}}^1_r(Q)$ is continuously embedded in ${\mathcal {C}}^{\alpha /2,\alpha }(Q)$, for some $\alpha \in (0,1)$.

We consider the following set of assumptions for the non-separable case with local coupling, while specific assumptions in the case of a nonlocal coupling will be discussed in Sect. 4. The notation $|\cdot |$ both refers to the modulus of a vector and to the norm of a matrix in the appropriate space.

(A1)
$m_0\in {\mathcal {P}}({\mathbb {T}}^d) \cap {\mathcal {C}}^{2+\alpha }({\mathbb {T}}^d)$ and $m_0(x)\ge \vartheta >0$, $u_T\in {\mathcal {C}}^{2+\alpha }({\mathbb {T}}^d)$.
(A2)
$H\in {\mathcal {C}}^4({\mathbb {T}}^d\times {\mathbb {R}}^+ \times {\mathbb {R}}^d)$ and for all $x\in {\mathbb {T}}^d$, $m\in {\mathbb {R}}^+$, $p\in {\mathbb {R}}^d$ and some ${\bar{C}}>0$:
$$\begin{aligned} \begin{aligned}&|H_{px}(x,m,p)|\le {\bar{C}}(|p| + 1), \,\,\,|H_{xx}(x,m,p)|+|H_{xxm}(x,m,p)| \le {\bar{C}} (|p|^2 + 1),\\&\quad |H_{mp}(x,m,p)|\le {\bar{C}}\vert p\vert ,\,\,|H_{mm}(x,m,p)|\le {\bar{C}}\vert p\vert ^2,\\&\quad |H_{pp}(x,m,p) | + |H_{ppm}(x,m,p) |\le {\bar{C}},\\&\quad |H_{ppp}(x,m,p)|+|H_{pppm}(x,m,p)| \le {\bar{C}}. \end{aligned} \end{aligned}$$
(2.4)
$$\begin{aligned} \begin{pmatrix} -H_m(x,m,p) &{} \dfrac{m}{2}H_{pm}(x,m,p)^T\\ \dfrac{m}{2}H_{pm}(x,m,p) &{} mH_{pp}(x,m,p) \end{pmatrix}>0,\quad \forall m>0.\nonumber \\ \end{aligned}$$
(2.5)
(A3)
$f(\cdot )$, $f'(\cdot )$ and $f''(\cdot )$ are uniformly bounded mappings from ${\mathbb {R}}^+$ to ${\mathbb {R}}$. Moreover, $f'(\cdot )\ge 0$.

Some remarks about these assumptions are in order.

Remark 2.1

Equation (2.5), first proposed by P. L. Lions in [30] and then exploited in [5, 6], is a uniqueness condition for the MFG systems with non-separable Hamiltonian. In particular, it implies that H is convex with respect to p and nonincreasing with respect to m and, when H has a separate form $H = ~H( x, p)-{\overline{f}}(m)$, it reduces to $H_{pp}> 0$ and ${\overline{f}} '>0$. Besides for uniqueness, we use this condition to prove the estimate in Lemma 2.6, which is crucial for the rate of convergence.

Remark 2.2

A typical example of Hamiltonian which satisfies (A2) is

$$\begin{aligned} H(x,m,p)=\frac{h(x)|p|^2}{(1+m)^\alpha }, \end{aligned}$$

(2.6)

where $0<\alpha \le 2$, $h(x)\in {\mathcal {C}}^2({\mathbb {T}}^d)$ and $h(x)>0$ for all $x\in {\mathbb {T}}^d$. Existence and uniqueness of a weak solutions to MFGs with such Hamiltonians, under some additional assumptions, can be obtained from results in [5, 24]. In Proposition 2.4, under the stronger assumptions (A1)–(A3), we prove existence and uniqueness of a classical solution to Eq. (1.1).

Remark 2.3

An example of f which satisfies (A3) is the sigmoid function

$$\begin{aligned} f(m)=\frac{1}{1+e^{-m}}. \end{aligned}$$

In fact, the uniformly boundedness of f is included in (A3) only to obtain a relatively simple proof of existence of a solution in the non-separable case, see Proposition 2.4, but one can obtain small time existence and uniqueness results with less restrictions on H and f, see [19]. If we assume a priori that Eq. (1.1) has a classical solution, independently of assuming (A1)–(A3), the key assumption for proving the convergence of Newton method is the uniform boundedness of $f''(\cdot )$, see also Remark 3.3. In this case, we can also include examples such as

$f(m)=m$.
$f(m)=(1+m)^\alpha $, $0<\alpha <2$.

Therefore, in some practical applications, we can apply the Newton method without requiring all the assumptions in this paper to be satisfied. In any case, restrictions on the uniform boundedness of $H_{ppp}(x,m,p)$ and $f''(m)$ are very typical for Newton iterations, c.f. S. Boyd [10, Section 9.5.3]. Some possible generalizations to the coupling $f(m)=m^\alpha $, $\alpha \ge 2$, will be discussed later in the paper, see Remark 3.4. It is also possible to include, under appropriate assumptions, a dependence of f on t, but for simplicity we omit it.

We will consider classical solutions to the MFG system. Recall that a classical solution of (1.1) is a couple (u, m) such that u and m belong to ${\mathcal {C}}^{2, \alpha }(Q)$ for some $\alpha \in (0,1)$ and satisfies the problem in pointwise sense. For the proof of the next result, see the Appendix.

Proposition 2.4

Under assumptions (A1), (A2) and (A3), the system (1.1) has a unique classical solution.

The next two lemmas are devoted to prove an estimate for a perturbation of the linearized MFG system. This result is the main ingredient in our analysis of the convergence of the Newton algorithm.

Lemma 2.5

Assume (A1), (A2) and (A3) and let (u, m) be the solution of Eq. (1.1). Then, the unique weak solution $(v,\rho )$ of the system

$$\begin{aligned} \left\{ \begin{aligned} (i) \qquad&-\partial _t v -\Delta v+H_p(x,m,Du)Dv+H_m(x,m,Du)\rho =f'(m)\rho \qquad{} & {} \textrm{in}\,\,Q,\\ (ii) \qquad&\partial _t \rho - \Delta \rho - \mathrm{{div}} \big ( \rho H_p(x,m,Du)\big )-\mathrm{{div}} \big ( m\rho H_{pm}(x,m,Du)\big ) \\&=\mathrm{{div}}\big (mH_{pp}(x,m,Du)Dv\big ){} & {} \textrm{in}\,\,Q, \\&\rho (x,0)=0, \; v(x,T)=0{} & {} \textrm{in}\,\,{\mathbb {T}}^d \end{aligned}\right. \nonumber \\ \end{aligned}$$

(2.7)

is the trivial solution $(v,\rho )=(0,0)$.

Proof

Multiply by $\rho $ on both sides of (i), integrate on Q and exploit (ii) to get

$$\begin{aligned} \begin{aligned}&\int _Qf'(m)|\rho |^2dxdt\\&\quad ={}\int _QH_m(x,m,Du)|\rho |^2dxdt+\int _Qv\mathrm{{div}}\big (mH_{pp}(x,m,Du)Dv\big )dxdt\\&\qquad +\int _Qv\mathrm{{div}}\big (m\rho H_{pm}(x,m,Du)\rho \big )dxdt\\&\quad ={}\int _QH_m(x,m,Du)|\rho |^2dxdt-\int _QmH_{pp}(x,m,Du)Dv\cdot Dvdxdt\\&\qquad -\int _QmH_{pm}(x,m,Du)\rho Dv. \end{aligned} \end{aligned}$$

It follows from (A1) and parabolic maximum principle (c.f. [34]) that $m>0$. Hence with (A3), we obtain $f'(m)\ge 0$ and

$$\begin{aligned} \int _Qf'(m)|\rho |^2dxdt\ge 0. \end{aligned}$$

Hence, from Eq. (2.5), we get that

$$\begin{aligned} \begin{pmatrix} \rho&Dv \end{pmatrix} \begin{pmatrix} -H_m(x,m,Du) &{} mH_{pm}(x,m,Du)/2\\ mH_{pm}(x,m,Du)/2 &{} mH_{pp}(x,m,Du) \end{pmatrix} \begin{pmatrix} \rho \\ Dv \end{pmatrix} = 0, \end{aligned}$$

otherwise we obtain a contradiction. Therefore $(\rho ,Dv)\equiv (0,0)$. From $v(x,T)=0$, it also follows that $v=0$. $\square $

The estimate in the next lemma is similar to [12, Lemma 5.2], with the key differences that we consider non-separable Hamiltonian and local couplings.

Lemma 2.6

Assume (A1), (A2) and (A3) and let (u, m) be the classical solution of Eq. (1.1). Given $a\in {\mathcal {C}}^0(Q)$ and a vector field $b\in {\mathcal {C}}^0(Q;{\mathbb {R}}^d)$, let $(v,\rho )$ be a classical solution of the perturbed linear system

$$\begin{aligned} \left\{ \begin{aligned} (i)\qquad&-\partial _t v -\Delta v+H_p(x,m,Du)Dv+H_m(x,m,Du)\rho =f'(m)\rho +a(x,t) \qquad{} & {} \textrm{in}\,\,Q,\\ (ii)\qquad&\partial _t \rho - \Delta \rho - \mathrm{{div}} \big ( \rho H_p(x,m,Du)\big ) -\mathrm{{div}} \big ( m\rho H_{pm}(x,m,Du)\big ) \\ ={}&\mathrm{{div}}\big (mH_{pp}(x,m,Du)Dv\big )+\mathrm{{div}}(b(x,t)){} & {} \textrm{in}\,\,Q, \\&\rho (x,0)=0, \; v(x,T)=0{} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(2.8)

Then, there exists a constant $C>0$, depending only on the coefficients of the problem, such that

$$\begin{aligned} \Vert v\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho \Vert _{{\mathcal {C}}^{0}}\le C\Big (\Vert a\Vert _{{\mathcal {C}}^{0}}+\Vert b\Vert _{{\mathcal {C}}^{0}}\Big ). \end{aligned}$$

(2.9)

Proof

First observe that since the system (2.8) is linear, then $(v,\rho )/(\Vert a\Vert _{{\mathcal {C}}^{0}}+\Vert b\Vert _{{\mathcal {C}}^{0}})$ is the solution of the problem corresponding to the perturbation $(a,b)/(\Vert a\Vert _{{\mathcal {C}}^{0}}+\Vert b\Vert _{{\mathcal {C}}^{0}})$. Hence, (2.9) is equivalent to show that, for $\Vert a\Vert _{{\mathcal {C}}^{0}}+\Vert b\Vert _{{\mathcal {C}}^{0}}\le 1$, then $\Vert v\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho \Vert _{{\mathcal {C}}^{0}}\le C$ for some $C>0$. We argue by contradiction and suppose that the estimate is not true. Hence, we assume that there exists $a^k$, $b^k$ and $(v^k,\rho ^k)$ with

$$\begin{aligned} \Vert a^k\Vert _{{\mathcal {C}}^{0}}+\Vert b^k\Vert _{{\mathcal {C}}^{0}}\le 1,\,\,\theta ^k:=\Vert v^k\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^k\Vert _{{\mathcal {C}}^{0}}\ge k. \end{aligned}$$

(2.10)

Set

$$\begin{aligned} {\tilde{v}}^k:=\frac{v^k}{\theta ^k},\,\,\,{\tilde{\rho }}^k:=\frac{\rho ^k}{\theta ^k}. \end{aligned}$$

By definition, $\Vert {\tilde{v}}^k\Vert _{{\mathcal {C}}^{1,0}}+\Vert {\tilde{\rho }}^k\Vert _{{\mathcal {C}}^{0}}=1$ for all k and the pair $({\tilde{v}}^k,{\tilde{\rho }}^k)$ solves:

$$\begin{aligned} \left\{ \begin{aligned} (i) \qquad&-\partial _t {\tilde{v}}^k -\Delta {\tilde{v}}^k+H_p(x,m,Du)D{\tilde{v}}^k+H_m(x,m,Du){\tilde{\rho }}^k=f'(m){\tilde{\rho }}^k+\frac{a^k(x,t)}{\theta ^k} \qquad{} & {} \textrm{in}\,\,Q,\\ (ii) \qquad&\partial _t {\tilde{\rho }}^k- \Delta {\tilde{\rho }}^k - \mathrm{{div}} \big ( {\tilde{\rho }}^k H_p(x,m,Du)\big ) -\mathrm{{div}} \big ( m{\tilde{\rho }}^k H_{pm}(x,m,Du)\big ) \\ ={}&\mathrm{{div}}\big (mH_{pp}(x,m,Du)D{\tilde{v}}^k\big )+\mathrm{{div}}\big (\frac{b^k(x,t)}{\theta ^k}\big ){} & {} \textrm{in}\,\,Q, \\&\rho (x,0)=0, \; v(x,T)=0{} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \nonumber \\ \end{aligned}$$

(2.11)

Observe that ${\tilde{v}}^k$ is a solution of a linear parabolic equation with bounded coefficients. Hence, ${\tilde{v}}^k$ and $D{\tilde{v}}^k$ are bounded in ${\mathcal {C}}^{\alpha ,\alpha /2}$ for some $\alpha \in (0,1)$. Similarly, ${\tilde{\rho }}^k$, solution of a linear equation in divergence form, is bounded in ${\mathcal {C}}^{\beta ,\beta /2}$ for some $\beta \in (0,1)$. By taking subsequences we obtain a cluster point $(v,\rho )$ of $({\tilde{v}}^k,{\tilde{\rho }}^k)$ such that

$$\begin{aligned} \Vert v\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho \Vert _{{\mathcal {C}}^{0}}=\lim _{k\rightarrow +\infty }(\Vert {\tilde{v}}^k\Vert _{{\mathcal {C}}^{1,0}}+\Vert {\tilde{\rho }}^k\Vert _{{\mathcal {C}}^{0}})=1. \end{aligned}$$

(2.12)

By Eq. (2.10), we know $ a^k(x,t)/\theta ^k$ and ${ \textrm{div}} ( b^k(x,t)/\theta ^k )$ actually vanish for $k\rightarrow \infty $ and therefore $(v,\rho )$ is a solution of (2.7). Hence by Lemma 2.5, we have $(v,\rho )=(0,0)$, a contradiction to Eq. (2.12). $\square $

3 The Newton Method for the Mean Field Games System with Non-separable Hamiltonian and Local Coupling

In this section, we give an estimate for the rate of convergence of the Newton method to the MFG system in the case of a non-separable Hamiltonian and local coupling. We first prove the well posedness of the system (1.4) for each n.

Proposition 3.1

For any $n\in {\mathbb {N}}$, there exists a unique solution $(u^{n},m^{n})\in {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)\times {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)$ to the system (1.4).

Proof

Assume to have proved the statement at step $n-1$. Hence, given $(u^{n-1},m^{n-1})\in {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)\times {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)$, Eq. (1.4) is a strongly coupled linear system for $(u^{n},m^{n})$.

We first prove existence of a weak solution $(u^n,m^n)\in W^{2,1}_r(Q)\times {\mathcal {H}}^1_r(Q)$, $r>d+2$ by means of a fixed point argument. Define ${\textbf{X}}:=\{\varrho \in {\mathcal {C}}^0(Q): \varrho \ge 0, \varrho (x,0)=m_0(x), \int _{{\mathbb {T}}^d}\varrho (x,t)\textrm{d}x=1\}$ and consider the compact mapping ${\hat{\varrho }}={\textbf{T}}(\varrho ): {\mathcal {C}}^0(Q)\rightarrow {\mathcal {C}}^{\alpha ,\alpha /2}(Q)$ defined by solving in sequence

$$\begin{aligned} \left\{ \begin{aligned} (i) \qquad&-\partial _t {\hat{u}} -\Delta {\hat{u}}+H_p(x,m^{n-1},Du^{n-1})D{\hat{u}}+H_m(x,m^{n-1},Du^{n-1}) (\varrho - m^{n-1})\\ ={}&H_p(x,m^{n-1},Du^{n-1})Du^{n-1}-H(x,m^{n-1},Du^{n-1})+f(m^{n-1})+f'(m^{n-1})(\varrho -m^{n-1}) \qquad{} & {} \textrm{in}\,\,Q,\\&{\hat{u}}(x,T)=u_T(x){} & {} \textrm{in}\,\,{\mathbb {T}}^d,\\ (ii) \qquad&\partial _t {\hat{\varrho }}- \Delta {\hat{\varrho }} - \text {div} \big ( \varrho H_p(x,Du^{n-1})\big )-\text {div} \big (m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})\varrho \big )\\ ={}&\mathrm{{div}}\big (m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})(D{\hat{u}}-Du^{n-1})\big )-\text {div} \big ((m^{n-1})^2H_{pm}(x,m^{n-1},Du^{n-1})\big ){} & {} \textrm{in}\,\,Q, \\&{\hat{\varrho }}(x,0)=m_0(x){} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \nonumber \\ \end{aligned}$$

(3.1)

We rewrite equation (i) in Eq. (3.1) as

$$\begin{aligned} -\partial _t {\hat{u}} -\Delta {\hat{u}}+H_p(x,m^{n-1},Du^{n-1})D{\hat{u}}={\textsf{f}} \end{aligned}$$

with

$$\begin{aligned} \begin{aligned} {\textsf{f}}:={}&H_p(x,m^{n-1},Du^{n-1})Du^{n-1}-H(x,m^{n-1},Du^{n-1})+f(m^{n-1})\\&+f'(m^{n-1})(\varrho -m^{n-1})-H_m(x,m^{n-1},Du^{n-1}) (\varrho - m^{n-1}) \in L^\infty (Q). \end{aligned} \end{aligned}$$

(3.2)

By Proposition A.2, we obtain the existence of ${\hat{u}}\in W^{2,1}_r(Q)$ solving (i) in (3.1). By Sobolev embedding $D{\hat{u}} \in {\mathcal {C}}^{\alpha ,\alpha /2}(Q;{\mathbb {R}}^d)$. Next we rewrite equation (ii) in Eq. (3.1) as

$$\begin{aligned} \partial _t {\hat{\varrho }}- \Delta {\hat{\varrho }} - \textrm{div}({\textsf{F}})=0 \end{aligned}$$

with

$$\begin{aligned} {\textsf{F}}:={}&\varrho H_p(x,Du^{n-1})+m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})\\&+m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})(D{\hat{u}}-Du^{n-1}) -(m^{n-1})^2H_{pm}(x,m^{n-1},Du^{n-1}). \end{aligned}$$

From $D{\hat{u}} \in {\mathcal {C}}^{\alpha ,\alpha /2}(Q;{\mathbb {R}}^d)$ and the assumptions on $(u^{n-1},m^{n-1})$, ${\textsf{F}}\in L^\infty (Q;{\mathbb {R}}^d)$ and, by Proposition A.3, we obtain there exists a solution ${\hat{\varrho }}\in {\mathcal {H}}^1_r(Q)$ to (i) in (3.1), thus ${\hat{\varrho }}\in {\mathcal {C}}^{\alpha ,\alpha /2}(Q)$.

Set $\delta {\hat{\varrho }}=\hat{\varrho _1}-\hat{\varrho _2}$, $\delta {\hat{u}}={\hat{u}}_1-{\hat{u}}_2$. Then

$$\begin{aligned} \left\{ \begin{aligned} (i) \qquad&-\partial _t \delta {\hat{u}} -\Delta \delta {\hat{u}}+H_p(x,m^{n-1},Du^{n-1})D\delta {\hat{u}}+m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})\delta \varrho \\ ={}&f'(m^{n-1})\delta \varrho \qquad{} & {} \textrm{in}\,\,Q,\\ (ii) \qquad&\partial _t \delta {\hat{\varrho }}- \Delta \delta {\hat{\varrho }} - \text {div} \big ( \delta \varrho H_p(x,m^{n-1},Du^{n-1})\big )-\text {div} \big (m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})\delta \varrho \big )\\ ={}&\mathrm{{div}}\big (m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})D\delta {\hat{u}}\big ){} & {} \textrm{in}\,\,Q, \\&\delta {\hat{\varrho }}(x,0)=0, \; \delta {\hat{u}}(x,T)=0{} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(3.3)

We obtain by Proposition A.2 that

$$\begin{aligned} \Vert \delta {\hat{u}}\Vert _{W^{2,1}_r(Q)}\le C\Vert \delta \varrho \Vert _{L^\infty (Q)},\,\,\,\Vert D\delta {\hat{u}}\Vert _{L^\infty (Q;{\mathbb {R}}^d)}\le C\Vert \delta \varrho \Vert _{L^\infty (Q)}, \end{aligned}$$

then by Proposition A.3

$$\begin{aligned} \Vert \delta {\hat{\varrho }}\Vert _{{\mathcal {C}}^{\alpha ,\alpha /2}}\le C(\Vert \delta \varrho \Vert _{L^\infty (Q)}+\Vert D\delta {\hat{u}}\Vert _{L^\infty (Q;{\mathbb {R}}^d)})\le C\Vert \delta \varrho \Vert _{L^\infty (Q)}. \end{aligned}$$

It then follows that ${\textbf{T}}$ is a continuous map. We conclude, by Schauder fixed point theorem, the existence of a solution to (1.4). It follows that $({\hat{u}},{\hat{\varrho }})\in W^{2,1}_r(Q)\times {\mathcal {C}}^{\alpha ,\alpha /2}(Q)$ is a fixed point defined by Eq. (3.1), with ${\textsf{f}}$ replaced by

$$\begin{aligned} \begin{aligned} \hat{{\textsf{f}}}:={}&H_p(x,m^{n-1},Du^{n-1})Du^{n-1}-H(x,m^{n-1},Du^{n-1})+f(m^{n-1})\\&+f'(m^{n-1})({\hat{\varrho }}-m^{n-1})-H_m(x,m^{n-1},Du^{n-1}) ({\hat{\varrho }}- m^{n-1}). \end{aligned} \end{aligned}$$

Since $\hat{{\textsf{f}}}\in {\mathcal {C}}^{\alpha ,\alpha /2}(Q)$, from Proposition A.1 it follows that ${\hat{u}}\in {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)$. By assumption (A2), ${\hat{u}}\in {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)$ and $(u^{n-1},m^{n-1})\in {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)\times {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)$, we obtain $\textrm{div}({\textsf{F}})\in {\mathcal {C}}^{\alpha ,\alpha /2}(Q)$. Using Proposition A.1 again, we obtain ${\hat{\rho }}\in {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)$. We now prove uniqueness. Assume that there are two solutions $({\hat{u}}_1,{\hat{\rho }}_1)$ and $({\hat{u}}_2,{\hat{\rho }}_2)$ to Eq. (3.1) and set $\delta {\hat{u}}={\hat{u}}_1-{\hat{u}}_2$, $\delta {\hat{\varrho }}=\hat{\varrho _1}-\hat{\varrho _2}$. Clearly $(\delta {\hat{u}},\delta {\hat{\rho }})$ solves

$$\begin{aligned} \left\{ \begin{aligned} (i) \qquad&-\partial _t \delta {\hat{u}} -\Delta \delta {\hat{u}}+H_p(x,m^{n-1},Du^{n-1})D\delta {\hat{u}}+m^{n-1} H_{m}(x,m^{n-1},Du^{n-1})\delta {\hat{\rho }} \\ ={}&f'(m^{n-1})\delta {\hat{\rho }} \qquad{} & {} \textrm{in}\,\,Q,\\ (ii) \qquad&\partial _t \delta {\hat{\varrho }}- \Delta \delta {\hat{\varrho }} - \text {div} \big ( \delta {\hat{\rho }} H_p(x,m^{n-1},Du^{n-1})\big )-\text {div} \big (m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})\delta {\hat{\rho }} \big )\\ ={}&\mathrm{{div}}\big (m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})D\delta {\hat{u}}\big ){} & {} \textrm{in}\,\,Q, \\&\delta {\hat{\varrho }}(x,0)=0, \; \delta {\hat{u}}(x,T)=0{} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(3.4)

Testing the equation (i) with ${\hat{\rho }}$, equation (ii) with ${\hat{u}}$ and subtracting the resulting identities, an easy computation gives

$$\begin{aligned} \begin{aligned}&\int _Qf'(m^{n-1})|\delta {\hat{\varrho }}|^2dxdt\\&\quad ={}\int _QH_m(x,m^{n-1},Du^{n-1})|\delta {\hat{\varrho }}|^2dxdt-\int _Qm^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})D\delta {\hat{u}}\cdot D\delta {\hat{u}}dxdt\\&\qquad -\int _Qm^{n-1}H_{pm}(x,m^{n-1},Du^{n-1})\delta {\hat{\varrho }} D\delta {\hat{u}}\le 0. \end{aligned} \end{aligned}$$

By (A2), Eq. (2.5), and (A3) we get $(\delta {\hat{\varrho }},D\delta {\hat{u}})=(0,0)$. From $\delta {\hat{u}}(x,T)=0$, it also follows that $\delta {\hat{u}}=0$. $\square $

Proposition 3.1 is concerned with the invertibility of the linear operator $\mathcal{L}\mathcal{F}$ at each step n, as defined in Eq. (1.2). Hence, it is not surprising that one needs some Hessian-type condition. The constant C in Proposition 3.1 may depend on the previous step $(u^{n-1},m^{n-1})$. In the discretized setting, invertibility of a linearized system similar to Eq. (3.4) has been discussed in [3, Section 4.1] for solving a mean field planning problem with separable Hamiltonian. We believe the ideas in [3, Section 4.1] can be extended also to MFGs with non-separable Hamiltonian. However, solvability at each iteration is not enough for the convergence of the Newton method since it is necessary to prove some a priori estimates independent of iteration index n. In the next result, we obtain the local quadratic rate of convergence result.

Theorem 3.2

Let (u, m) be the solution of system (1.1) and $(u^n,m^n)$ is the sequence generated by Newton’s algorithm (1.4). Set $v^n=u^n-u$, $\rho ^n=m^n-m$. There exists a constant $\eta >0$ such that if $\Vert v^0\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^0\Vert _{{\mathcal {C}}^{0}}\le \eta $ then $\Vert v^n\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^n\Vert _{{\mathcal {C}}^{0}}\rightarrow 0$ with a quadratic rate of convergence.

Proof

We emphasize that from here and for the rest of the proof, C denotes some generic constant which may increase from line to line. This constant may depend on data of the problem and the solution (u, m), but it is always independent of n. We observe that $v^n$ solves the equation

$$\begin{aligned} \begin{aligned}&-\partial _t v^n -\Delta v^n+H_p(x,m,Du)\cdot Dv^n\\&={}f(m^{n-1})-f(m)+f'(m^{n-1})(m^n-m^{n-1})+H(x,m,Du)-H(x,m^{n-1},Du^{n-1})\\&\quad -H_p(x,m^{n-1},Du^{n-1})D(u^{n}- u^{n-1})-H_m(x,m^{n-1},Du^{n-1}) (m^{n}- m^{n-1}), \end{aligned} \end{aligned}$$

which can be rewritten as

$$\begin{aligned} -\partial _t v^n -\Delta v^n+H_p(x,m,Du)\cdot Dv^n+H_m(x,m,Du)\rho ^n=f'(m)\rho ^n+a, \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} a:={}&H_p(x,m,Du)(Du^n-Du)+H_m(x,m,Du)(m^n-m)+H(x,m,Du)\\&-H(x,m^{n-1},Du^{n-1})\\&-H_p(x,m^{n-1},Du^{n-1})D(u^{n}- u^{n-1})-H_m(x,m^{n-1},Du^{n-1}) (m^{n}- m^{n-1})\\&-f'(m)(m^n-m)+f(m^{n-1})-f(m)+f'(m^{n-1})(m^n-m^{n-1}). \end{aligned} \end{aligned}$$

(3.5)

In order to apply Lemma 2.6, we need to estimate $\Vert a\Vert _{{\mathcal {C}}^0}$. We rewrite the terms involving H in Eq. (3.5) as

$$\begin{aligned} \begin{aligned}&H_p(x,m,Du)(Du^n-Du)+H_m(x,m,Du)(m^n-m)+H(x,m,Du)\\&-H(x,m^{n-1},Du^{n-1})\\&-H_p(x,m^{n-1},Du^{n-1})D(u^{n}- u^{n-1})-H_m(x,m^{n-1},Du^{n-1}) (m^{n}- m^{n-1})\\ ={}&H(x,m,Du)-H(x,m^{n-1},Du^{n-1})-H_p(x,m^{n-1},Du^{n-1})D(u- u^{n-1})\\&-H_m(x,m^{n-1},Du^{n-1}) (m- m^{n-1})+\Big (H_p(x,m,Du)\\&-H_p(x,m^{n-1},Du^{n-1})\Big )(Du^n-Du)\\&+\Big (H_m(x,m,Du)-H_m(x,m^{n-1},Du^{n-1})\Big ) (m^n-m). \end{aligned} \end{aligned}$$

(3.6)

We now estimate the terms on the right hand side of the previous identity. It is clear from (A2) that for any $\tau \in (0,1)$,

$$\begin{aligned} \begin{aligned} \sup _\tau&\big \vert H_{mm}(x,m+\tau (m^{n-1}-m),Du+\tau (Du^{n-1}-Du))\big \vert \\&\le C|Du+\tau (Du^{n-1}-Du)|^2 \le C\big (\vert Du\vert +\vert Du^{n-1}-Du\vert \big )^2\\&\le C\big (2\vert Du\vert ^2+2\vert Du^{n-1}-Du\vert ^2 \big ). \end{aligned} \end{aligned}$$

(3.7)

Likewise,

$$\begin{aligned} \begin{aligned} \sup _\tau \big \vert H_{mp}(x,m+\tau (m^{n-1}-m),Du+\tau (Du^{n-1}-Du))\big \vert&\le C|Du+\tau (Du^{n-1}-Du)|\\&\le C\big (\vert Du\vert +\vert Du^{n-1}-Du\vert \big ), \end{aligned} \end{aligned}$$

(3.8)

and

$$\begin{aligned} \sup _\tau \vert H_{pp}(x,m+\tau (m^{n-1}-m),Du+\tau (Du^{n-1}-Du))\vert \le {\bar{C}}. \end{aligned}$$

Moreover, by the mean value theorem, we have

$$\begin{aligned} \begin{aligned}&H(x,m,Du)-H(x,m^{n-1},Du^{n-1})-H_p(x,m^{n-1},Du^{n-1})D(u- u^{n-1})\\&\quad -H_m(x,m^{n-1},Du^{n-1}) (m- m^{n-1})\\&={}\int _0^1\Big (H_p\big (x,m^{n-1}+\tau (m-m^{n-1}),Du^{n-1}+\tau (Du-Du^{n-1})\big )\\&\quad -H_p(x,m^{n-1},Du^{n-1})\Big )(Du-Du^{n-1})\textrm{d}\tau \\&\quad +\int _0^1\Big (H_m\big (x,m^{n-1}+\tau (m-m^{n-1}),Du^{n-1}+\tau (Du-Du^{n-1})\big )\\&\quad -H_p(x,m^{n-1},Du^{n-1})\Big )(m-m^{n-1})\textrm{d}\tau . \end{aligned} \end{aligned}$$

(3.9)

By using mean value theorem again, we estimate the integrand in Eq. (3.9) by

$$\begin{aligned} \begin{aligned}&H_p\big (x,m^{n-1}+\tau (m-m^{n-1}),Du^{n-1}+\tau (Du-Du^{n-1})\big )\\&\quad -H_p(x,m^{n-1},Du^{n-1})\\&= {} \int _0^1 \Big (H_{pp}\big (x,m^{n-1}+(1-\tau ')\tau (m-m^{n-1}),Du^{n-1}\\&\quad +(1-\tau ')\tau (Du-Du^{n-1})\big )\tau (Du^{n-1}-Du)\textrm{d}\tau ' \\&\quad +\int _0^1 \Big (H_{pm}\big (x,m^{n-1}+(1-\tau ')\tau (m-m^{n-1}),Du^{n-1}+(1-\tau ')\tau (Du-Du^{n-1})\big )\\&\quad -H_p(x,m^{n-1},Du^{n-1})\Big )\tau (m^{n-1}-m)\textrm{d}\tau \end{aligned} \end{aligned}$$

and therefore, as $0<(1-\tau ')\tau <1$, $0<\tau <1$, we get

$$\begin{aligned} \begin{aligned}&\sup _\tau \vert H_p\big (x,m^{n-1}+\tau (m-m^{n-1}),Du^{n-1}+\tau (Du-Du^{n-1})\big )-H_p(x,m^{n-1},Du^{n-1})\vert \\&\quad \le \sup _\tau \vert H_{pp}\big (x,m^{n-1}+\tau (m-m^{n-1}),Du^{n-1}+\tau (Du-Du^{n-1})\vert \vert Du-Du^{n-1}\vert \\&\qquad +\sup _\tau \vert H_{pm}\big (x,m^{n-1}+\tau (m-m^{n-1}),Du^{n-1}+\tau (Du-Du^{n-1})\vert \vert m-m^{n-1}\vert \\&\quad \le {\bar{C}}\vert Du-Du^{n-1}\vert +C(\vert Du\vert +\vert Du-Du^{n-1}\vert ) \vert m-m^{n-1}\vert , \end{aligned} \end{aligned}$$

$$\begin{aligned} \begin{aligned}&\left| \int _0^1\Big (H_p\big (x,m^{n-1}+\tau (m-m^{n-1}),Du^{n-1}+\tau (Du-Du^{n-1})\big )\right. \\&\qquad \left. -H_p(x,m^{n-1},Du^{n-1})\Big )(Du-Du^{n-1})\textrm{d}\tau \right| \\&\quad \le \left( \int _0^1\tau \textrm{d}\tau \right) \Big ({\bar{C}}\vert Du-Du^{n-1}\vert +C(\vert Du\vert +\vert Du-Du^{n-1}\vert ) \vert m-m^{n-1}\vert \Big )\vert Du-Du^{n-1}\vert . \end{aligned} \end{aligned}$$

In a similar way, we obtain

$$\begin{aligned}{} & {} \left| \int _0^1\Big (H_m\big (x,m^{n-1}+\tau (m-m^{n-1}),Du^{n-1}+\tau (Du-Du^{n-1})\big )\right. \\{} & {} \qquad \left. -H_m(x,m^{n-1},Du^{n-1})\Big )(m-m^{n-1})\textrm{d}\tau \right| \\{} & {} \quad \le {} (\int _0^1\tau \textrm{d}\tau )\Big (C(1+\vert Du-Du^{n-1}\vert ^2 )\vert m-m^{n-1}\vert +C(1+\vert Du-Du^{n-1}\vert )\vert Du\\{} & {} \qquad -Du^{n-1}\vert \vert \Big )\vert m-m^{n-1}\vert . \end{aligned}$$

Therefore, replacing in Eq. (3.9), we have

$$\begin{aligned} \begin{aligned}&\left| H(x,m,Du)-H(x,m^{n-1},Du^{n-1})-H_p(x,m^{n-1},Du^{n-1})D(u- u^{n-1})\right. \\&\qquad \left. -H_m(x,m^{n-1},Du^{n-1}) (m- m^{n-1})\right| \\&\quad \le C(\vert Dv^{n-1}\vert ^2)+C(1+\vert Dv^{n-1}\vert )\vert Dv^{n-1}\vert \vert \rho ^{n-1}\vert +C(1+\vert Dv^{n-1}\vert ^2)\vert \rho ^{n-1}\vert ^2\\&\quad \le C\Big (\vert Dv^{n-1}\vert ^2+\vert \rho ^{n-1}\vert ^2+\vert Dv^{n-1}\vert ^4+\vert \rho ^{n-1}\vert ^4\Big ). \end{aligned} \end{aligned}$$

(3.10)

For the other terms in Eq. (3.6), we first observe that by (A2) and Eq. (3.8) we get

$$\begin{aligned} \begin{aligned}&\big \vert H_p(x,m,Du)-H_p(x,m^{n-1},Du^{n-1})\big \vert \le {} \sup _\tau \vert H_{pm}(x,m+\tau (m^{n-1}-m),Du\\&\qquad +\tau (Du^{n-1}-Du)) \vert \vert \rho ^{n-1}\vert \\&\qquad +\sup _\tau \vert H_{pp}(x,m+\tau (m^{n-1}-m),Du+\tau (Du^{n-1}-Du)) \vert \vert v^{n-1}\vert \\&\quad \le C(\vert Du\vert +\vert Dv^{n-1}\vert )\vert \rho ^{n-1}\vert +C\vert Dv^{n-1}\vert , \end{aligned} \end{aligned}$$

and, by Eqs. (3.7) and (3.8),

$$\begin{aligned} \begin{aligned}&\big \vert H_m(x,m,Du)-H_m(x,m^{n-1},Du^{n-1})\big \vert \le \sup _\tau \vert H_{pm}\big (x,m+\tau (m^{n-1}-m),Du\\&\qquad +\tau (Du^{n-1}-Du)\big ) \vert \vert v^{n-1}\vert \\&\qquad +\sup _\tau \vert H_{mm}\big (x,m+\tau (m^{n-1}-m),Du+\tau (Du^{n-1}-Du)\big ) \vert \vert \rho ^{n-1}\vert \\&\quad \le C(\vert Du\vert +\vert Dv^{n-1}\vert )\vert Dv^{n-1}\vert +C(\vert Du\vert ^2+\vert Dv^{n-1}\vert ^2)\vert \rho ^{n-1}\vert . \end{aligned} \end{aligned}$$

We then obtain

$$\begin{aligned} \begin{aligned}&\left| \Big (H_p(x,m,Du)-H_p(x,m^{n-1},Du^{n-1})\Big )(Du^n-Du)+\Big (H_m(x,m,Du)\right. \\&\qquad \left. -H_m(x,m^{n-1},Du^{n-1})\Big ) (m^n-m)\right| \\&\quad \le \vert H_p(x,m,Du)-H_p(x,m^{n-1},Du^{n-1})\vert \vert Du^n-Du\vert \\&\qquad +\vert H_m(x,m,Du)-H_m(x,m^{n-1},Du^{n-1})\vert \vert m^n-m\vert \\&\quad \le \Big (C(1+\vert Dv^{n-1}\vert )\vert \rho ^{n-1}\vert \\&\qquad +C\vert Dv^{n-1}\vert \Big )\vert Dv^n\vert +\Big (C(1+\vert Dv^{n-1}\vert ^2)\vert \rho ^{n-1}\vert \vert +C(1+\vert Dv^{n-1}\vert )\vert Dv^{n-1}\vert \Big )\vert \rho ^n\vert \\&\quad \le C\Big (\vert Dv^{n-1}\vert +\vert Dv^{n-1}\vert ^2+\vert Dv^{n-1}\vert \vert \rho ^{n-1}\vert +\vert Dv^{n-1}\vert ^2 \vert \rho ^{n-1}\vert +\vert \rho ^{n-1}\vert \Big )(\vert Dv^n\vert +\vert \rho ^n\vert )\\&\quad \le C\Big (\vert Dv^{n-1}\vert +\vert Dv^{n-1}\vert ^2+\vert Dv^{n-1}\vert ^4+\vert \rho ^{n-1}\vert +\vert \rho ^{n-1}\vert ^2\Big )\big (\vert Dv^n\vert +\vert \rho ^n\vert \big ). \end{aligned} \end{aligned}$$

(3.11)

To estimate the terms containing f in Eq. (3.5), we observe that

$$\begin{aligned} \begin{aligned}&-f'(m)(m^n-m)+f(m^{n-1})-f(m)+f'(m^{n-1})(m^n-m^{n-1})\\ ={}&f(m^{n-1})-f(m)-f'(m)(m^{n-1}-m)+ (f'(m^{n-1})-f'(m))(m^n-m^{n-1}). \end{aligned} \end{aligned}$$

Exploiting that $f'(\cdot )$ is globally Lipschitz, see (A3), we have

$$\begin{aligned} \vert f(m^{n-1})-f(m)-f'(m)(m^{n-1}-m)\vert \le C|m^{n-1}-m|^2, \end{aligned}$$

(3.12)

and

$$\begin{aligned} \begin{aligned}&\left| \Big (f'(m^{n-1})-f'(m)\Big )(m^n-m^{n-1})\right| \\&\quad \le \vert f'(m^{n-1})-f'(m)\vert \vert m^n-m\vert +\vert f'(m^{n-1})-f'(m)\vert \vert m-m^{n-1}\vert \\&\quad \le C\Big (|\rho ^n||\rho ^{n-1}|+|\rho ^{n-1}|^2\Big ). \end{aligned} \end{aligned}$$

(3.13)

Finally, by Eqs. (3.10), (3.11), (3.12) and (3.13), we get

$$\begin{aligned} \begin{aligned} \Vert a\Vert _{{\mathcal {C}}^{0}} \le {}&C\Big (\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}^2+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}^2+\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}^4+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}^4\\&+\big (\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}} +\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}^2+\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}^4+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}\\&+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}^2\big )\big (\Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}+ \Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}}\big )\Big ). \end{aligned} \end{aligned}$$

(3.14)

Now we consider the equation satisfied by $\rho ^n$. We have

$$\begin{aligned}{} & {} \partial _t \rho ^n -\Delta \rho ^n-\mathrm{{div}}(\rho ^nH_p(x,m,Du))-\mathrm{{div}}(m\rho ^nH_{pm}(x,m,Du))\\{} & {} \quad = \mathrm{{div}}\big (mH_{pp}(x,m,Du)Dv^n\big )\mathrm{{div}}(b), \end{aligned}$$

with

$$\begin{aligned} \begin{aligned} b:={}&-\rho ^nH_p(x,m,Du)-m\rho ^nH_{pm}(x,m,Du)\\&-mH_{pp}(x,m,Du)Dv^n\\&+ m^{n-1}H_p(x,m^{n-1},Du^{n-1}) -mH_p(x,m,Du) \\&+ (m^{n}- m^{n-1}) H_p(x,m^{n-1},Du^{n-1})\\&+m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})(m^{n}- m^{n-1}) \\&+m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})(Du^n-Du^{n-1}). \end{aligned} \end{aligned}$$

To estimate $\Vert b\Vert _{{\mathcal {C}}^0}$, we start observing that

$$\begin{aligned} \begin{aligned} b={}&m^{n-1}H_p(x,m^{n-1},Du^{n-1})\\&-mH_p(x,m,Du)-m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})(Du^{n-1}-Du)\\&-(m^{n-1}-m) H_p(x,m^{n-1},Du^{n-1})-m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})(m^{n-1}-m) \\&+\rho ^n\Big (H_p(x,m^{n-1},Du^{n-1})-H_p(x,m,Du)\Big )\\&+\rho ^n\Big (m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})-mH_{pm}(x,m,Du)\Big )\\&+Dv^n\Big (m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})-mH_{pp}(x,m,Du)\Big ). \end{aligned} \end{aligned}$$

(3.15)

Denoting $\Phi (x,m,p)=mH_p(x,m,p)$, then by elementary calculation

$$\begin{aligned}&\partial _{p}\Phi (x,m,p)=mH_{pp}(x,m,p),\,\partial _{pp}\Phi (x,m,p)=mH_{ppp}(x,m,p),\\&\partial _{m}\Phi (x,m,p)=H_p(x,m,p)+mH_{pm}(x,m,p),\,\partial _{mm}\Phi (x,m,p)\\&\quad =2H_{pm}(x,m,p)+mH_{mmp}(x,m,p). \end{aligned}$$

It is clear from (A2) that

$$\begin{aligned} \begin{aligned}&\sup _\tau \left| \big (m+\tau (m^{n-1}-m)\big )H_{ppp}(x,m+\tau (m^{n-1}-m),Du\right. \\&\quad \left. +\tau (Du^{n-1}-Du))\right| \le C\big (\vert m\vert +|\rho ^{n-1}|\big ). \end{aligned} \end{aligned}$$

From Eq. (3.8) and

$$\begin{aligned}{} & {} \sup _\tau \big \vert H_{mmp}(x,m+\tau (m^{n-1}-m),Du\nonumber \\{} & {} \quad +\tau (Du^{n-1}-Du))\big \vert \le C\big (\vert Du\vert +\vert Du^{n-1}-Du\vert \big ), \end{aligned}$$

(3.16)

we obtain

$$\begin{aligned} \begin{aligned}&\sup _\tau \left| 2H_{pm}(x,m+\tau (m^{n-1}-m),Du+\tau (Du^{n-1}-Du))\right| \\&\qquad +\sup _\tau \left| \big (m+\tau (m^{n-1}-m)\big )H_{pmm}(x,m+\tau (m^{n-1}-m),Du+\tau (Du^{n-1}-Du))\right| \\&\quad \le {} C\big (1+\vert Dv^{n-1}\vert +\vert \rho ^{n-1}\vert +\vert Dv^{n-1}\vert \vert \rho ^{n-1}\vert \big ). \end{aligned} \end{aligned}$$

In addition, we have

$$\begin{aligned} \begin{aligned}&\sup _\tau \Big \vert H_{pp}\big (x,m+\tau (m^{n-1}-m),Du+\tau (Du^{n-1}-Du)\big )\\&\qquad +\big (m+\tau (m^{n-1}-m)\big )H_{ppm}\big (x,m+\tau (m^{n-1}-m),Du+\tau (Du^{n-1}-Du)\big )\Big \vert \\&\quad \le {} C(1+\vert \rho ^{n-1}\vert ). \end{aligned} \end{aligned}$$

Collecting these estimates, we obtain

$$\begin{aligned} \begin{aligned}&\Big \vert m^{n-1}H_p(x,m^{n-1},Du^{n-1}) -mH_p(x,m,Du)-m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})(Du^{n-1}-Du)\\&\qquad -(m^{n-1}-m) H_p(x,m^{n-1},Du^{n-1})-m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})(m^{n-1}-m) \Big \vert \\&\quad \le C\big (1+|\rho ^{n-1}|\big )\vert Dv^{n-1}\vert ^2+C\big (1+\vert v^{n-1}\vert +\vert \rho ^{n-1}\vert \\&\qquad +\vert Dv^{n-1}\vert \vert \rho ^{n-1}\vert \big )\vert \rho ^{n-1}\vert ^2+C(1+\vert \rho ^{n-1}\vert )\vert Dv^{n-1}\vert \vert \rho ^{n-1}\vert \\&\quad \le C\big (\vert Dv^{n-1}\vert ^2+\vert \rho ^{n-1}\vert ^2+\vert \rho ^{n-1}\vert ^3\\&\qquad +\vert Dv^{n-1}\vert \vert \rho ^{n-1}\vert +\vert Dv^{n-1}\vert ^2 \vert \rho ^{n-1}\vert +\vert Dv^{n-1}\vert \vert \rho ^{n-1}\vert ^2+\vert Dv^{n-1}\vert \vert \rho ^{n-1}\vert ^3\big )\\&\quad \le C\big (\vert Dv^{n-1}\vert ^2+\vert \rho ^{n-1}\vert ^2+\vert Dv^{n-1}\vert ^4+\vert \rho ^{n-1}\vert ^4+\vert \rho ^{n-1}\vert ^3+\vert \rho ^{n-1}\vert ^6\big ). \end{aligned} \end{aligned}$$

(3.17)

Moreover, from Eq. (2.4), it follows that

$$\begin{aligned}{} & {} \left| H_p(x,m^{n-1},Du^{n-1})-H_p(x,m,Du)\right| \le C\left( \vert Dv^{n-1}\vert \right. \nonumber \\{} & {} \quad \left. +(1+\vert Dv^{n-1}\vert )\vert \rho ^{n-1}\vert \right) , \end{aligned}$$

(3.18)

and, from Eqs. (2.4) and (3.16),

$$\begin{aligned} \begin{aligned}&\left| m^{n-1} H_{pm}(x,m^{n-1},Du^{n-1})-mH_{pm}(x,m,Du)\right| \\&\quad \le {}C(1+\vert \rho ^{n-1}\vert )\vert Dv^{n-1}\vert +C\Big (1+\vert v^{n-1}\vert +(1+\vert v^{n-1}\vert )(1+\vert \rho ^{n-1}\vert )\Big )\vert \rho ^{n-1}\vert \\&\quad \le {} C\big (\vert Dv^{n-1}\vert +\vert \rho ^{n-1}\vert +\vert Dv^{n-1}\vert ^2+\vert \rho ^{n-1}\vert ^2+\vert \rho ^{n-1}\vert ^4\big ), \end{aligned} \end{aligned}$$

(3.19)

$$\begin{aligned} \begin{aligned}&\left| m^{n-1}H_{pp}(x,m^{n-1},Du^{n-1})-mH_{pp}(x,m,Du)\right| \le C(1+\vert \rho ^{n-1}\vert )\vert Dv^{n-1}\vert \\&\qquad +C\vert \rho ^{n-1}\vert +C(1+\vert \rho ^{n-1}\vert )\vert \rho ^{n-1}\vert . \end{aligned} \end{aligned}$$

(3.20)

Replacing estimates Eqs. (3.17)–(3.20) in Eq. (3.15), we get

$$\begin{aligned} \begin{aligned}&\Vert b\Vert _{{\mathcal {C}}^{0}} \le C\Big (\Vert v^{n-1}\Vert ^2_{{\mathcal {C}}^{1,0}}+\Vert v^{n-1}\Vert ^4_{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n-1}\Vert ^2_{{\mathcal {C}}^{0}}+\Vert \rho ^{n-1}\Vert ^3_{{\mathcal {C}}^{0}}+\Vert \rho ^{n-1}\Vert ^4_{{\mathcal {C}}^{0}}+\Vert \rho ^{n-1}\Vert ^6_{{\mathcal {C}}^{0}}\\&\quad +\big (\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}+ \Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}+\Vert v^{n-1}\Vert ^2_{{\mathcal {C}}^{1,0}}\\&\quad + \Vert \rho ^{n-1}\Vert ^2_{{\mathcal {C}}^{0}}+ \Vert \rho ^{n-1}\Vert ^4_{{\mathcal {C}}^{0}}\big )\big (\Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}+ \Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}}\big )\Big ). \end{aligned} \end{aligned}$$

(3.21)

Finally, from Lemma 2.6 and estimates Eqs. (3.14) and (3.21), we have

$$\begin{aligned} \begin{aligned}&\Vert v^n\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^n\Vert _{{\mathcal {C}}^{0}} \le C(\Vert a\Vert _{{\mathcal {C}}^{0}}+\Vert b\Vert _{{\mathcal {C}}^{0}})\\&\quad \le K\Big [\Vert v^{n-1}\Vert ^2_{{\mathcal {C}}^{1,0}}+\Vert v^{n-1}\Vert ^4_{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n-1}\Vert ^2_{{\mathcal {C}}^{0}}+\Vert \rho ^{n-1}\Vert ^3_{{\mathcal {C}}^{0}}+\Vert \rho ^{n-1}\Vert ^4_{{\mathcal {C}}^{0}}+\Vert \rho ^{n-1}\Vert ^6_{{\mathcal {C}}^{0}}\\&\qquad +\big (\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}+ \Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}+\Vert v^{n-1}\Vert ^2_{{\mathcal {C}}^{1,0}}+\Vert v^{n-1}\Vert ^4_{{\mathcal {C}}^{1,0}}+ \Vert \rho ^{n-1}\Vert ^2_{{\mathcal {C}}^{0}}\\&\qquad + \Vert \rho ^{n-1}\Vert ^4_{{\mathcal {C}}^{0}}\big )\big (\Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}}\big )\Big ], \end{aligned} \end{aligned}$$

(3.22)

where K is a constant which depends only on the data of the problem. Without loss of generality, we assume that $K>1$. Assume that initial guess $(u^0,m^0)$ of the Newton method satisfies

$$\begin{aligned} \Vert v^{0}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{0}\Vert _{{\mathcal {C}}^{0}}\le \frac{1}{12K}, \end{aligned}$$

where K as in Eq. (3.22). Since $K>1$, we have

$$\begin{aligned} \Vert v^0\Vert ^4_{{\mathcal {C}}^{1,0}}\le \Vert v^{0}\Vert _{{\mathcal {C}}^{1,0}},\,\Vert \rho ^{0}\Vert ^6_{{\mathcal {C}}^{0}}\le \Vert \rho ^{0}\Vert _{{\mathcal {C}}^{0}}, \end{aligned}$$

$$\begin{aligned}{} & {} \Vert v^{0}\Vert _{{\mathcal {C}}^{1,0}}+ \Vert \rho ^{0}\Vert _{{\mathcal {C}}^{0}}+\Vert v^{0}\Vert ^2_{{\mathcal {C}}^{1,0}}+\Vert v^{0}\Vert ^4_{{\mathcal {C}}^{1,0}}+ \Vert \rho ^{0}\Vert ^2_{{\mathcal {C}}^{0}}+ \Vert \rho ^{0}\Vert ^4_{{\mathcal {C}}^{0}}\le 3(\Vert v^{0}\Vert _{{\mathcal {C}}^{1,0}}\nonumber \\{} & {} \quad +\Vert \rho ^{0}\Vert _{{\mathcal {C}}^{0}})\le \frac{1}{4K},\nonumber \\{} & {} \Vert v^{0}\Vert ^2_{{\mathcal {C}}^{1,0}}+\Vert \rho ^{0}\Vert ^2_{{\mathcal {C}}^{0}}\le 2(\Vert v^{0}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{0}\Vert _{{\mathcal {C}}^{0}})^2\le \frac{1}{72K},\nonumber \\{} & {} \Vert v^{0}\Vert ^2_{{\mathcal {C}}^{1,0}}+\Vert v^{0}\Vert ^4_{{\mathcal {C}}^{1,0}}+\Vert \rho ^{0}\Vert ^2_{{\mathcal {C}}^{0}}+\Vert \rho ^{0}\Vert ^3_{{\mathcal {C}}^{0}}+\Vert \rho ^{0}\Vert ^4_{{\mathcal {C}}^{0}}+\Vert \rho ^{0}\Vert ^6_{{\mathcal {C}}^{0}}\le 4(\Vert v^{0}\Vert ^2_{{\mathcal {C}}^{1,0}}\nonumber \\{} & {} \quad +\Vert \rho ^{0}\Vert ^2_{{\mathcal {C}}^{0}})\le \frac{1}{18K}. \end{aligned}$$

(3.23)

Replacing the previous estimates in Eq. (3.22) for $n=1$, we get

$$\begin{aligned} \Vert v^1\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^1\Vert _{{\mathcal {C}}^{0}}\le \frac{1}{18K}+\frac{1}{4}(\Vert v^1\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^1\Vert _{{\mathcal {C}}^{0}}). \end{aligned}$$

Hence, by absorbing the term $\frac{1}{4}(\Vert v^1\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^1\Vert _{{\mathcal {C}}^{0}})$ on the right hand side, we get

$$\begin{aligned} \Vert v^1\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^1\Vert _{{\mathcal {C}}^{0}}\le \frac{2}{27K}<\frac{1}{12K}. \end{aligned}$$

(3.24)

Arguing iteratively, we have that, if $\Vert v^{0}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{0}\Vert _{{\mathcal {C}}^{0}}\le \frac{1}{12K}$, then

$$\begin{aligned} \Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}}\le \frac{1}{12K}, \qquad \text {for any} n\in {\mathbb {N}}. \end{aligned}$$

Repeating an estimate similar to Eq. (3.23) for $n-1$, we have

$$\begin{aligned} \Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}+ \Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}+\Vert v^{n-1}\Vert ^2_{{\mathcal {C}}^{1,0}}+\Vert v^{n-1}\Vert ^4_{{\mathcal {C}}^{1,0}}+ \Vert \rho ^{n-1}\Vert ^2_{{\mathcal {C}}^{0}}+ \Vert \rho ^{n-1}\Vert ^4_{{\mathcal {C}}^{0}}\le \frac{1}{4K}. \end{aligned}$$

From Eq. (3.22), we obtain

$$\begin{aligned} \Vert v^n\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^n\Vert _{{\mathcal {C}}^{0}}\le K\Big [8\big (\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}\big )^2+\frac{1}{4K}\big (\Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}}\big )\Big ], \end{aligned}$$

hence we get the estimate

$$\begin{aligned} \Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}}\le \frac{32K}{3}\big (\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}\big )^2. \end{aligned}$$

Multiplying both the side of the previous estimate for $\frac{32K}{3}$, we have

$$\begin{aligned} \frac{32K}{3}\big (\Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}}\big )\le \Big (\frac{32K}{3}\big (\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}\big )\Big )^2. \end{aligned}$$

Hence, as we have assumed $\big (\Vert v^{0}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{0}\Vert _{{\mathcal {C}}^{0}}\big )\le \frac{1}{12K}$, we obtain by induction that

$$\begin{aligned} \Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}}\le \frac{3}{32K}\Big (\frac{8}{9}\Big )^{2^n}. \end{aligned}$$

$\square $

Remark 3.3

We can extend our main convergence rate result to MFG system with superquadratic Hamiltonians, if we assume the system admits a classical solution. In this case, for the rate of convergence in Theorem 3.2, it is sufficient to assume that H is smooth, without the uniform bounds on the derivatives in Eq. (2.4). Indeed, a careful inspection of the previous proof shows that the constant K in Eq. (3.22) depends on (u, m), the data of the problem and, in particular, on the derivative of the Hamiltonian computed in $Dv^{n-1}$, $\rho ^{n-1}$. If we assume that $\Vert v^0\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^0\Vert _{{\mathcal {C}}^{0}}<1/12K$, then, arguing as in the proof, we have that $\Vert v^n\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^n\Vert _{{\mathcal {C}}^{0}}<1/12K$ for any $n\in {\mathbb {N}}$. Hence, the constant K in Eq. (3.22), which depends on $\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}$, does not change with the iterations. The restriction of quadratic H in Theorem 3.2 is made to prove existence and uniqueness of a classical solution to the MFG system.

Remark 3.4

Assumption (A3) requires the uniform Lipschitz continuity of $f'$ and therefore it excludes some interesting cases such as $f(m)=m^\alpha $, $\alpha \ne 1$. We show that we can at least consider the case $\alpha \ge 2$. The main difference from using (A3) is in the estimate Eq. (3.13). We replace the argument in the proof with

$$\begin{aligned} \left| f'(m^{n-1})-f'(m)\right|&\le (\alpha -1)(|m^{\alpha -2}|+|(m^{n-1})^{\alpha -2}|)|\rho ^{n-1}|\\&\le (\alpha -1)(2|m^{\alpha -2}|+|(\rho ^{n-1})^{\alpha -2}|)|\rho ^{n-1}| \end{aligned}$$

and therefore

$$\begin{aligned}&\left| \Big (f'(m^{n-1})-f'(m)\Big )(m^n-m^{n-1})\right| \\&\quad \le {}(\alpha -1)(2|m^{\alpha -2}|+|(\rho ^{n-1})^{\alpha -2}|)|\rho ^{n-1}|^2+ (\alpha -1)(2|m^{\alpha -2}|+|(\rho ^{n-1})^{\alpha -1}|)|\rho ^{n-1}||\rho ^n|. \end{aligned}$$

Then one can proceed similarly as in Theorem 3.2. So far we do not have a corresponding result for $0<\alpha <2$, $\alpha \ne 1$. For global (in time) solutions to MFGs with separable Hamiltonians and $f(m)=m^\alpha $, $\alpha >0$ we refer to the paper of Cirant and Goffi [20, Theorem 1.4].

4 The Newton Method for the Mean Field Games System with Saparable Hamiltonian and Nonlocal Coupling

In this section, we consider the MFG system with Hamiltonian independent of m and nonlocal coupling

$$\begin{aligned} \left\{ \begin{aligned} (i)\qquad&-\partial _t u -\Delta u+H(x,Du)=f[m](x) \qquad{} & {} \textrm{in}\,\,Q,\\ (ii) \qquad&\partial _t m- \Delta m - \text {div} \big ( mH_p(x,Du)\big ) =0{} & {} \textrm{in}\,\,Q, \\&m(x,0)=m_0(x), \; {u(x,T)=g[m(T)](x)}{} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(4.1)

We assume that $m_0$ is as in (A1), the Hamiltonian H satisfies (A2), while the assumption on $u_T$ in (A1) and (A3) are replaced by

(A3’)
$f,g: {\mathbb {T}}^d \times {\mathcal {P}}({\mathbb {T}}^d) \rightarrow {\mathbb {R}}$. f, g and their space derivatives $\partial _{x_i}f$, $\partial _{x_i}g$, $\partial _{x_ix_j}g$ are all globally Lipschitz continuous. The measure derivatives $\frac{\delta f}{\delta m}$ and $\frac{\delta g}{\delta m}$ $:{\mathbb {T}}^d\times {\mathcal {P}}({\mathbb {T}}^d)\times {\mathbb {T}}^d\rightarrow {\mathbb {R}}$ are also Lipschitz continuous. For any $m,m'\in {\mathcal {P}}({\mathbb {T}}^d)$,
$$\begin{aligned} \begin{aligned} \int _{{\mathbb {T}}^d} \left( f[m](x)-f[m'](x)\right) \textrm{d}(m-m')(x)\ge 0, \,\,\\ \int _{{\mathbb {T}}^d} \left( g[m](x)-g[m'](x)\right) \textrm{d}(m-m')(x)\ge 0. \end{aligned} \end{aligned}$$
(4.2)

Remark 4.1

Equation (4.2) implies that $\frac{\delta f}{\delta m}$ and $\frac{\delta g}{\delta m}$ satisfy the following monotonicity property (explained for f):

$$\begin{aligned} \int _{{\mathbb {T}}^d}\int _{{\mathbb {T}}^d}\frac{\delta f}{\delta m}[m](x)(y)\rho (x)\rho (y)\textrm{d}x\textrm{d}y\ge 0 \end{aligned}$$

(4.3)

for any centered measure $\rho $, c.f. [16, p. 36].

Remark 4.2

From assumption (A3’), there exists a constant $C>0$,

$$\begin{aligned} \sup _{x\in {\mathbb {T}}^d}\big |f[m'](x)-f[m](x)\big |+\sup _{x,y \in {\mathbb {T}}^d}\big |\frac{\delta f}{\delta m}[m'](x)(y)-\frac{\delta f}{\delta m}[m](x)(y)\big |&\le C \mathbf{{d}}_1(m,m'),\end{aligned}$$

(4.4)

$$\begin{aligned} \sup _{x\in {\mathbb {T}}^d}\big |f[m' ](x) - f[m](x) - \int _{{\mathbb {T}}^d} \frac{\delta f}{\delta m}[m](x)(y)\textrm{d}(m'-m)(y)\big |&\le C\mathbf{{d}}^2_1(m,m') ,\end{aligned}$$

(4.5)

$$\begin{aligned} \sup _{x\in {\mathbb {T}}^d}\big |g[m' ](x) - g[m](x) - \int _{{\mathbb {T}}^d} \frac{\delta g}{\delta m}[m](x)(y)\textrm{d}(m'-m)(y)\big |&\le C\mathbf{{d}}^2_1(m,m') . \end{aligned}$$

(4.6)

Remark 4.3

For the simplicity, we will be using the shortened notation, c.f. [16, p. 60]

$$\begin{aligned} \int _{{\mathbb {T}}^d} \frac{\delta f}{\delta m}[m](x)(y)\textrm{d}m'(y)=\frac{\delta f}{\delta m}[m](x)m' \end{aligned}$$

for the duality bracket between $\frac{\delta f}{\delta m}[m]$ and $m'$ at x.

The next two lemmas are proved in Cardaliaguet Briani [12, Lemma 5.2]. A similar result with different functional spaces is discussed in Cardaliaguet, Delarue, Lasry and Lions [16, Lemma 3.3.1].

Lemma 4.4

Under assumptions (A1), (A2) and (A3’), let (u, m) be a classical solution to the system (4.1). Then, the unique weak solution of the system

$$\begin{aligned} \left\{ \begin{aligned} (i) \qquad&-\partial _t v -\Delta v+H_p(x,Du)Dv=\frac{\delta f}{\delta m}[m(t)]\rho{} & {} \textrm{in}\,\,Q,\\ (ii) \qquad&\partial _t \rho - \Delta \rho - \mathrm{{div}} \big ( \rho H_p(x,Du)\big )=\mathrm{{div}}\big (mH_{pp}(x,Du)Dv\big ){} & {} \textrm{in}\,\,Q, \\&\rho (x,0)=0, \; v(x,T)=\frac{\delta g}{\delta m}[m(T)]\rho (T){} & {} \textrm{in}\,\,{\mathbb {T}}^d \end{aligned}\right. \end{aligned}$$

(4.7)

is the trivial solution $(v,\rho )=(0,0)$.

Lemma 4.5

Given $a\in {\mathcal {C}}^0(Q)$, $b\in {\mathcal {C}}^0(Q;{\mathbb {R}}^d)$. Let (u, m) be a classical solution to the system (4.1) and $(v,\rho )$ be a classical solution of the perturbed linear system

$$\begin{aligned} \left\{ \begin{aligned} (i)\qquad&-\partial _t v -\Delta v+H_p(x,Du)Dv=\frac{\delta f}{\delta m}[m(t)](x)\rho +a(x,t)\qquad{} & {} \textrm{in}\,\,Q,\\ (ii)\qquad&\partial _t \rho - \Delta \rho - \mathrm{{div}} \big ( \rho H_p(x,Du)\big ) =\mathrm{{div}}\big (mH_{pp}(x,Du)Dv\big )+\mathrm{{div}}(b(x,t)){} & {} \textrm{in}\,\,Q, \\&\rho (x,0)=0, \; v(x,T)=\frac{\delta g}{\delta m}[m(T)](x)\rho (T)+c(x){} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \nonumber \\ \end{aligned}$$

(4.8)

Then, there exists a constant $C>0$ depending on the coefficients of the problem, such that

$$\begin{aligned} \Vert v\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho \Vert _{{\mathcal {C}}^{0}}\le C\left( \Vert a\Vert _{{\mathcal {C}}^{0}}+\Vert b\Vert _{{\mathcal {C}}^{0}}+\Vert c\Vert _{{\mathcal {C}}^{0}}\right) . \end{aligned}$$

Existence and uniqueness result for a classical solution to Eq. (4.1) under rather general assumptions which, in particular, include (A1), (A2) and (A3’), can be found in [27]. The Newton system for solving Eq. (4.1), analogous to Eq. (1.4), can be written as

$$\begin{aligned} \left\{ \begin{aligned} (i)\qquad&-\partial _t u^n -\Delta u^n+H_p(x,Du^{n-1})D(u^n-u^{n-1})\\ ={}&-H(x,Du^{n-1})+f[m^{n-1}(t)](x)+\frac{\delta f}{\delta m}[m^{n-1}(t)](x)(m^n-m^{n-1})\qquad{} & {} \textrm{in}\,\,Q,\\ (ii)\qquad&\partial _t m^n- \Delta m^n - \text {div} \big ( m^nH_p(x,Du^{n-1})\big ) =\mathrm{{div}}\big (m^{n-1}H_{pp}(x,Du^{n-1})(Du^n-Du^{n-1})\big ){} & {} \textrm{in}\,\,Q, \\&m^n(x,0)=m_0(x), \; u^n(x,T)=g[m^{n-1}(T)]+\frac{\delta g}{\delta m}[m^{n-1}(T)](x)\big (m^n(T)-m^{n-1}(T)\big ){} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(4.9)

Existence and uniqueness of a classical solution to Eq. (4.9) can be proved as in Proposition 3.1.

Theorem 4.6

Let (u, m) be the solution of system (4.1) and $(u^n,m^n)$ is the sequence generated by Newton’s algorithm (4.9). Set $v^n=u^n-u$, $\rho ^n=m^n-m$. There exists a constant $\eta >0$ such that if $\Vert v^0\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^0\Vert _{{\mathcal {C}}^{0}}\le \eta $ then $\Vert v^n\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^n\Vert _{{\mathcal {C}}^{0}}\rightarrow 0$ with a quadratic rate of convergence.

Proof

We first observe that $v^n=u^n-u$, $\rho ^n=m^n-m$ satisfy

$$\begin{aligned}&-\partial _t v^n -\Delta v^n+H_p(x,Du)\cdot Dv^n=\frac{\delta f}{\delta m}[m](x)(\rho ^n)+a,\\&\partial _t \rho ^n -\Delta \rho ^n-\mathrm{{div}}(\rho ^nH_p(x,Du))= \mathrm{{div}}\big (mH_{pp}(x,Du)Dv^n\big )+\mathrm{{div}}(b),\\&\rho ^n(x,0)=0,\,v^n(x,T)= \frac{\delta g}{\delta m}[m(T)](x)(\rho ^n) +c(x), \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} a&:={} H_p(x,Du)(Du^n-Du)+H(x,Du)-H(x,Du^{n-1})-H_p(x,Du^{n-1})D(u^{n}- u^{n-1})\\&\quad -\frac{\delta f}{\delta m}[m](x)(m^n-m)+f(m^{n-1})-f(m)+\frac{\delta f}{\delta m}[m^{n-1}](x)(m^n-m^{n-1}), \end{aligned} \end{aligned}$$

(4.10)

$$\begin{aligned} \begin{aligned} b&:={}-\rho ^nH_p(x,Du)-mH_{pp}(x,Du)Dv^n+ m^{n-1}H_p(x,Du^{n-1}) -mH_p(x,Du) \\&\quad + (m^{n}- m^{n-1}) H_p(x,Du^{n-1})+m^{n-1}H_{pp}(x,Du^{n-1})(Du^n-Du^{n-1}), \end{aligned} \end{aligned}$$

$$\begin{aligned} c:= & {} g[m^{n-1}(T)](x)-g[m(T)](x)-\frac{\delta g}{\delta m}[m(T)](x)(\rho ^n)\\{} & {} +\frac{\delta g}{\delta m}[m^{n-1}(T)](x)\big (m^n(T)-m^{n-1}(T)\big ). \end{aligned}$$

We first consider the nonlocal coupling terms as they constitute the main differences with respect to the Proof of Theorem 3.2. Rewrite the terms containing f in Eq. (4.10) as

$$\begin{aligned} \begin{aligned}&-\frac{\delta f}{\delta m}[m](x)(m^n-m)+f(m^{n-1})-f(m)+\frac{\delta f}{\delta m}[m^{n-1}](x)(m^n-m^{n-1})\\ ={}&f(m^{n-1})-f(m)-\frac{\delta f}{\delta m}[m^{n-1}](x)(\rho ^{n-1})+\Big (\frac{\delta f}{\delta m}[m^{n-1}](x)-\frac{\delta f}{\delta m}[m](x)\Big )(\rho ^n). \end{aligned} \end{aligned}$$

By Lipschitz continuity of $\frac{\delta f}{\delta m}$, i.e., Eqs. (4.4) and (4.5), we get

$$\begin{aligned}{} & {} \Vert -\frac{\delta f}{\delta m}[m](\cdot )(m^n-m)+f(m^{n-1})-f(m)+\frac{\delta f}{\delta m}[m^{n-1}](\cdot )(m^n-m^{n-1})\Vert _{{\mathcal {C}}^{0}}\\{} & {} \quad \le C(\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}^2+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}\Vert \rho ^n\Vert _{{\mathcal {C}}^{0}}). \end{aligned}$$

Similarly, rewriting

$$\begin{aligned} c= & {} g[m^{n-1}(T)](x)-g[m(T)](x)-\frac{\delta g}{\delta m}[m^{n-1}(T)](x)(\rho ^{n-1}(T))\\{} & {} \quad +\Big (\frac{\delta g}{\delta m}[m^{n-1}(T)](x)-\frac{\delta g}{\delta m}[m(T)](x)\Big )(\rho ^n(T)), \end{aligned}$$

we obtain

$$\begin{aligned} \Vert c\Vert _{{\mathcal {C}}^{0}}\le C\left( \Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}^2+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}\Vert \rho ^n\Vert _{{\mathcal {C}}^{0}}\right) . \end{aligned}$$

By a straightforward adaptation of the Proof of Theorem 3.2, we estimate the other terms in a and b. Indeed, we have

$$\begin{aligned} \begin{aligned}&\left| H_p(x,Du)(Du^n-Du)+H(x,Du)-H(x,Du^{n-1})-H_p(x,Du^{n-1})D(u^{n}- u^{n-1})\right| \\ \le {}&C(\vert Dv^{n-1}\vert ^2+\vert Dv^{n-1}\vert \vert Dv^{n}\vert ), \end{aligned} \end{aligned}$$

hence

$$\begin{aligned} \Vert a\Vert _{{\mathcal {C}}^{0}}\le C\left( \Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}^2+\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}\Vert v^n\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}^2+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}\Vert \rho ^n\Vert _{{\mathcal {C}}^{0}}\right) . \end{aligned}$$

Moreover, by

$$\begin{aligned} \begin{aligned} b:={}&\rho ^n(H_p(x,Du)-H_p(x,Du^{n-1}))+\rho ^{n-1}H_{pp}(x,Du)Dv^n\\&\quad + m^{n-1}H_p(x,Du^{n-1}) -mH_p(x,Du) \\&\quad -(m^{n-1}-m) H_p(x,Du^{n-1})-m^{n-1}H_{pp}(x,Du^{n-1})(Du^{n-1}-Du), \end{aligned} \end{aligned}$$

we obtain

$$\begin{aligned} \begin{aligned}&\Vert b\Vert _{{\mathcal {C}}^{0}}\\&\quad \le {}C\left( \Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}\Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}}\right. \\&\quad \left. + \Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}+\Vert \rho ^{n-1}\Vert ^2_{{\mathcal {C}}^{0}}+\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}}+(1+\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}})\Vert v^{n-1}\Vert ^2_{{\mathcal {C}}^{1,0}}\right) . \end{aligned} \end{aligned}$$

Collecting the estimate of a, b and c, by Lemma 4.5 we obtain

$$\begin{aligned}{} & {} \Vert v^n\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^n\Vert _{{\mathcal {C}}^{0}} \le C\left( \Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}^2+\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}^4+\Vert \rho ^{n-1}\Vert ^2_{{\mathcal {C}}^{0}}+(\Vert v^{n-1}\Vert _{{\mathcal {C}}^{1,0}}\right. \\{} & {} \quad \left. +\Vert \rho ^{n-1}\Vert _{{\mathcal {C}}^{0}})(\Vert v^{n}\Vert _{{\mathcal {C}}^{1,0}}+\Vert \rho ^{n}\Vert _{{\mathcal {C}}^{0}})\right) . \end{aligned}$$

We omit the rest of the proof as it is very similar to Theorem 3.2. $\square $

Remark 4.7

The monotonicity conditions Eq. (4.2) guarantee uniqueness of the solution to the MFG system with nonlocal coupling. If we do not assume Eq. (4.2), the result and proof methodology in this section can be adapted to prove local convergence to a stable solution of a potential MFG. Recall that, for a potential MFG, a solution (u, m) is said to be stable if the only solution to the linearized MFG system at (u, m) is the trivial one (see [12]). In other words, instead of proving that $(v,\rho )=0$ as in Lemma 4.4, we use it as part of the definition of the stable solution. We plan to study this problem in the future.

Remark 4.8

In this paper, we separated the discussions on MFGs with local or nonlocal couplings for simplicity. One can easily replace in Eq. (1.1) one or both of the terms f and g by nonlocal couplings and obtain similar results as Proposition 2.4 and Theorem 3.2. One can also consider the non-separable Hamiltonians with nonlocal congestions, e.g., replacing m in H(x, m, p) by a convolution with some kernels, as in [6]. However, even though the existence of solution to such type of MFGs has been demonstrated in [6], it is not clear how one can apply the Hessian condition Eq. (2.5) to show the uniqueness of a global (in time) classical solution and the stability property (Lemma 2.6) in the nonlocal congestion case. We leave further developments in this direction for the future.

Data Availability Statement

No data was associated in the manuscript.

References

Achdou Y, Capuzzo Dolcetta I (2010) Mean field games: numerical methods. SIAM J Numer Anal 48(3):1136–1162
Article MathSciNet Google Scholar
Achdou Y, Camilli F, Capuzzo Dolcetta I (2013) Mean field games: convergence of a finite difference method. SIAM J Numer Anal 51(5):2585–2612
Article MathSciNet Google Scholar
Achdou Y, Camilli F, Capuzzo-Dolcetta I (2012) Mean field games: Numerical methods for the planning problem. SIAM J Control Optim 50(1):77–109
Article MathSciNet Google Scholar
Achdou Y, Cardaliaguet P, Delarue F, Porretta A, Santambrogio F (2020) Mean Field Games: Cetraro, Italy 2019, vol 2281. Springer Nature, Berlin
Book Google Scholar
Achdou Y, Porretta A (2018) Mean field games with congestion. Ann Inst Henri Poincare (C) Anal Non Lineaire 35:443–480
Article MathSciNet Google Scholar
Achdou Y, Laurière M (2015) On the system of partial differential equations arising in mean field type control. Discret Contin Dyn Syst 35(9):38–79
Article MathSciNet Google Scholar
Achdou Y, Perez V (2012) Iterative strategies for solving linearized discrete mean field games systems. Networks Heterog Media 7(2):197–217
Article MathSciNet Google Scholar
Ambrose D, Mészáros A (2023) Well-posedness of mean field games master equations involving non-separable local hamiltonians. Trans Am Math Soc 376:2481–2523
MathSciNet Google Scholar
Berry J, Ley O, Silva FJ (2024) Approximation and perturbations of stable solutions to a stationary mean field game system, arXiv preprint arXiv:2402.16377,
Boyd SP, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge
Book Google Scholar
Bonnans FJ, Liu K, Pfeiffer L (2023) Error estimates of a theta-scheme for second-order mean field games. ESAIM Math Model Numer Anal 57(4):2493–2528
Article MathSciNet Google Scholar
Briani A, Cardaliaguet P (2018) Stable solutions in potential mean field game systems. Nonlinear Differ Equ Appl 25:1–26
Article MathSciNet Google Scholar
Cacace S, Camilli F, Goffi A (2021) A policy iteration method for mean field games. ESAIM Control Optim Calc Var 27:85
Article MathSciNet Google Scholar
Carlini E, Silva FJ (2018) On the discretization of some nonlinear Fokker–Planck–Kolmogorov equations and applications SIAM. J Numer Anal 56(4):2148–2177
Article MathSciNet Google Scholar
Camilli F, Tang Q (2022) Rates of convergence for the policy iteration method for mean field games systems. J Math Anal Appl 512:126–138
Article MathSciNet Google Scholar
Cardaliaguet P, Delarue F, Lasry J-M, Lions P-L (2019) The master equation and the convergence problem in mean field games. Princeton University Press, New Jersey
Book Google Scholar
Carmona R, Delarue F (2018) Probabilistic theory of mean field games with applications I. Springer, Berlin
Book Google Scholar
Ciarlet PG, Mardare C (2012) On the Newton–Kantorovich theorem. Anal Appl 10:249–269
Article MathSciNet Google Scholar
Cirant M, Gianni R, Mannucci P (2020) Short-time existence for a general backward-forward parabolic system arising from mean-field games. Dyn Games Appl 10:100–119
Article MathSciNet Google Scholar
Cirant M, Goffi A (2021) Maximal $L^q$-regularity for parabolic Hamilton–Jacobi equations and applications to mean field games. Ann PDE 7:19
Article Google Scholar
Cirant M, Tonon D (2019) Time-dependent focusing mean-field games: the sub-critical case. J Dyn Differ Equ 31:49–79
Article MathSciNet Google Scholar
Evangelista D, Gomes DA (2018) On the existence of solutions for stationary mean-field games with congestion. J Dyn Differ Equ 30(4):1365–1388
Article MathSciNet Google Scholar
Ghattassi M, Masmoudi N. Non-separable mean field games for pedestrian flow: Generalized hughes model, arXiv:2310.04702
Graber PJ (2015) Weak solutions for mean field games with congestion. arXiv preprint arXiv:1503.04733,
Huang M, Caines PE, Malhame RP (2007) Large-population cost-coupled LQG problems with non uniform agents: Individual-mass behaviour and decentralized $\epsilon $-Nash equilibria. IEEE Trans Autom Control 52:1560–1571
Article Google Scholar
Ladyzenskaja OA, Solonnikov VA, Ural’ceva NN (1968) Linear and quasilinear equations of parabolic type, vol 23. Translations of Mathematical Monographs. American Mathematical Society, Providence
Book Google Scholar
Lasry JM, Lions PL (2007) Mean field games. Jpn J Math 2:229–260
Article MathSciNet Google Scholar
Lauriere M (2021) Numerical methods for mean field games and mean field type control, preprint arXiv: 2106.06231,
Laurière M, Song J, Tang Q (2023) Policy iteration method for time-dependent mean field games systems with non-separable Hamiltonians. Appl Math Optim 87:17
Article MathSciNet Google Scholar
Lions PL. Cours au College de France. www.college-de-france.fr
Li H, Fan Y, Ying L (2021) A simple multiscale method for mean field games. J Comput Phys 439:110385
Article MathSciNet Google Scholar
Metafune G, Pallara D, Rhandi A (2009) Global properties of transition probabilities of singular diffusions. Teor Veroyatn Primen 54:116–148
Article MathSciNet Google Scholar
Nurbekyan L, Saúde J (2018) Fourier approximation methods for first-order nonlocal mean-field games. Port Math 75:367–396
Article MathSciNet Google Scholar
Tang Q, Song J (2024) Learning optimal policies in potential mean field games: Smoothed policy iteration algorithms. SIAM J Control Optim 62(1):351–375
Article MathSciNet Google Scholar

Download references

Funding

Open access funding provided by Università degli Studi di Roma La Sapienza within the CRUI-CARE Agreement. No funding to declare

Author information

Authors and Affiliations

SBAI, Sapienza Università di Roma, Roma, Italy
Fabio Camilli
China University of Geosciences (Wuhan), Wuhan, China
Qing Tang

Authors

Fabio Camilli
View author publications
You can also search for this author in PubMed Google Scholar
Qing Tang
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

All authors have contributed equally

Corresponding author

Correspondence to Fabio Camilli.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest

Ethical Approval

Not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Optimal control and differential games: theory, numerics and applications. In memory of Maurizio Falcone”.

Appendix: Some Classical Parabolic Estimates Results

Consider the linear parabolic equation:

$$\begin{aligned} \left\{ \begin{aligned} \qquad&-\partial _t u-\Delta u+b(x,t)\cdot \nabla u+c(x,t)u=f(x,t) \qquad{} & {} \textrm{in}\,\,Q,\\ \qquad&u(x,T)=u_T(x){} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(A.1)

The following two results are very classical for equations on cylinders with boundary conditions (see [26, Theorem 5.1, p. 320 ] and [26, Theorem 9.1 pp. 341–342]). A complete proof of them in the flat torus case can be found in [19, Appendix pp. 17–18].

Proposition A.1

Let $b\in {\mathcal {C}}^{\alpha ,\alpha /2}(Q;{\mathbb {R}}^d)$, c and f belong to ${\mathcal {C}}^{\alpha ,\alpha /2}(Q)$ and $u_T\in C^{2+\alpha }({\mathbb {T}}^d)$. Then, the problem (A.1) admits a unique solution $u\in {\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)$ and it holds

$$\begin{aligned} \Vert u\Vert _{{\mathcal {C}}^{2+\alpha ,1+\alpha /2}(Q)}\le C\left( \Vert f\Vert _{{\mathcal {C}}^{\alpha ,\alpha /2}(Q)}+\Vert u_T\Vert _{{\mathcal {C}}^{2+\alpha }({\mathbb {T}}^d)}\right) , \end{aligned}$$

where C depends on the ${\mathcal {C}}^{\alpha ,\alpha /2}$ norms of b, c and remains bounded for bounded values of T.

Proposition A.2

Let $r>d+2$, $b\in L^\infty (Q;{\mathbb {R}}^d)$, $c\in L^\infty (Q)$, $f\in L^r(Q)$ and $u_T\in W^{2-\frac{2}{r}}_r({\mathbb {T}}^d)$. Then, the problem (A.1) admits a unique solution $u\in W^{1,2}_r(Q)$ and it holds

$$\begin{aligned} \Vert u\Vert _{W^{2,1}_r(Q)}\le C\left( \Vert f\Vert _{L^r(Q)}+\Vert u_T\Vert _{W^{2-\frac{2}{r}}_r({\mathbb {T}}^d)}\right) , \end{aligned}$$

where C depends on the norms of b, c and remains bounded for bounded values of T. Moreover, the following embedding holds:

$$\begin{aligned} \Vert u\Vert _{{\mathcal {C}}^{2-\frac{\textrm{d}+2}{r},1-\frac{\textrm{d}+2}{2r}}(Q)} \le C\Vert u\Vert _{W^{2,1}_r(Q)}. \end{aligned}$$

(A.2)

(see [26, Corollary pp. 342–343])

Next we introduce some results for the parabolic equations in divergence form, for the proof we refer to [34, Proposition A.3]. We use the notation $\iota :=\{1,2\}$.

Proposition A.3

Let $r>d+2$, $m_0\in W^1_r({\mathbb {T}}^d)$, $q_\iota \in L^\infty (Q;{\mathbb {R}}^d)$ and consider the parabolic equation in divergence form

$$\begin{aligned} \left\{ \begin{aligned} \qquad&\partial _t m_\iota - \Delta m_\iota - \mathrm{{div}} ( m_\iota q_\iota ) =0\qquad{} & {} \textrm{in}\,\,Q,\\ \qquad&m_\iota (0,x)=m_0(x){} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(A.3)

Then, there exists a unique solution $m_\iota $ in ${\mathcal {H}}^1_r(Q)$ to (A.3). Then, there exists a constant C depending only on r, T, d, R, $\Vert m_0\Vert _{W^1_r({\mathbb {T}}^d)}$ and $\Vert q_\iota \Vert _{L^\infty (Q;{\mathbb {R}}^d)}$ such that

$$\begin{aligned} \Vert m_\iota \Vert _{{\mathcal {H}}^1_r(Q)}\le C. \end{aligned}$$

Moreover, denote by $\delta m=m_1-m_2$ and $\delta q=q_1-q_2$. Then,

$$\begin{aligned} \Vert \delta m\Vert _{{\mathcal {H}}^1_r(Q)}\le C\Vert \delta q\Vert _{L^\infty (Q;{\mathbb {R}}^d)} \end{aligned}$$

with C as above.

We show the well posedness of the MFG system with non-separable Hamiltonian by means of an argument similar to [5, Lemma 4.1].

Proof of Proposition 2.4

Define ${\textbf{X}}:=\big \{m\in {\mathcal {C}}^0(Q): m>0, m(x,0)=m_0(x), \int _{{\mathbb {T}}^d}m(x,t)\textrm{d}x=1\big \}$. Consider the compact mapping ${\tilde{m}}={\textbf{T}}(m): {\textbf{X}}\rightarrow {\mathcal {C}}^{\alpha ,\alpha /2}(Q)$ ,

$$\begin{aligned} \left\{ \begin{aligned} (i) \qquad&-\partial _t u -\Delta u+H(x,m,Du)=f(m) \qquad{} & {} \textrm{in}\,\,Q,\\ (ii) \qquad&\partial _t {\tilde{m}}- \Delta {\tilde{m}} - \text {div} \big ( {\tilde{m}}H_p(x,m,Du)\big ) =0{} & {} \textrm{in}\,\,Q, \\&{\tilde{m}}(x,0)=m_0(x), \; u(x,T)=u_T(x){} & {} \textrm{in}\,\,{\mathbb {T}}^d. \end{aligned}\right. \end{aligned}$$

(A.4)

$f(\cdot ): {\textbf{X}}\rightarrow {\mathbb {R}}$ is uniformly bounded. By (A2), $H(x,m,p)\le {\bar{C}} (|p|^2 + 1)$. It follows from the standard result of quasilinear parabolic equation (see Theorem 6.3 in Chapter 5 of [26]) that Eq. (A.4) (i) has a unique solution $u\in L^\infty (0,T;W^1_{\infty }({\mathbb {T}}^d))$ and $\Vert u\Vert _{L^\infty (0,T;W^1_{\infty }({\mathbb {T}}^d))}$ is bounded independently of m in ${\textbf{X}}$. From the boundedness of $\Vert Du\Vert _{L^\infty (Q)}$ and (A2), $H_m(x,m,p)$ is bounded independently of m in ${\textbf{X}}$. Under the additional assumption that $f'(m)$ is bounded independently of m in ${\textbf{X}}$, it is easy to see that the map $m\mapsto u$ from ${\textbf{X}}$ to $L^\infty (0,T;W^1_{\infty }({\mathbb {T}}^d))$ is continuous. Equation (A.4) (ii) has a unique solution ${\tilde{m}}\in {\textbf{X}}\cap {\mathcal {C}}^{\alpha ,\alpha /2}(Q)$, which is bounded in ${\mathcal {C}}^{\alpha ,\alpha /2}(Q)$ uniformly with respect to m in ${\textbf{X}}$, see, e.g., [26], Chapter 3, Theorem 10.1. From Proposition A.3 and the continuous embedding of ${\mathcal {H}}_r^1(Q)$ onto ${\mathcal {C}}^{\alpha ,\alpha /2}(Q)$, the map $H_p(x,m,Du) \mapsto {\tilde{m}}$ is continuous from $L^\infty (Q)$ to ${\mathcal {C}}^{\alpha ,\alpha /2}(Q)$. From (A2), $H_p(x,m,Du)$ is continuous with respect to both m and Du, independently of m in ${\textbf{X}}$. As we have shown, the map $m\mapsto u$ from ${\textbf{X}}$ to $L^\infty (0,T;W^1_{\infty }({\mathbb {T}}^d))$ is continuous; hence, the map $m\mapsto {\tilde{m}}$ is continuous from ${\textbf{X}}$ to ${\textbf{X}}\cap {\mathcal {C}}^{\alpha ,\alpha /2}(Q)$. We can obtain by Schauder fixed point theorem that the map ${\textbf{T}}$ has at least one fixed point. Regularity of the solution follows from assumptions (A1), (A2), (A3) and previously cited results in [26]. Uniqueness follows from (A2) ((2.5) in particular) and (A3). We refer to [4, Ch.1, Theorem 13] for details. $\square $

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Camilli, F., Tang, Q. On the Quadratic Convergence of Newton’s Method for Mean Field Games with Non-separable Hamiltonian. Dyn Games Appl (2024). https://doi.org/10.1007/s13235-024-00561-y

Download citation

Accepted: 18 March 2024
Published: 25 April 2024
DOI: https://doi.org/10.1007/s13235-024-00561-y

On the Quadratic Convergence of Newton’s Method for Mean Field Games with Non-separable Hamiltonian

Abstract

Similar content being viewed by others

Time-Dependent Focusing Mean-Field Games: The Sub-critical Case

The Convergence Problem in Mean Field Games with Local Coupling

Regularity for Mean-Field Games Systems with Initial-Initial Boundary Conditions: The Subquadratic Case

1 Introduction

2 Preliminaries

Remark 2.1

Remark 2.2

Remark 2.3

Proposition 2.4

Lemma 2.5

Proof

Lemma 2.6

Proof

3 The Newton Method for the Mean Field Games System with Non-separable Hamiltonian and Local Coupling

Proposition 3.1

Proof

Theorem 3.2

Proof

Remark 3.3

Remark 3.4

4 The Newton Method for the Mean Field Games System with Saparable Hamiltonian and Nonlocal Coupling

Remark 4.1

Remark 4.2

Remark 4.3

Lemma 4.4

Lemma 4.5

Theorem 4.6

Proof

Remark 4.7

Remark 4.8

Data Availability Statement

References

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Conflict of interest

Ethical Approval

Additional information

Publisher's Note

Appendix: Some Classical Parabolic Estimates Results

Appendix: Some Classical Parabolic Estimates Results

Proposition A.1

Proposition A.2

Proposition A.3

Proof of Proposition 2.4

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation