The classical Bernstein’s method is a well-known tool for obtaining gradient estimates for solutions of second-order elliptic and parabolic equations (cf. Caffarelli and Cabré [2], Gilbarg and Trudinger [5, Chap. 15] and Lions [6]). The underlying idea is very simple: if \(\Omega \) is a domain in \({\mathbb {R}}^N\) and \(u : \Omega \rightarrow {\mathbb {R}}\) is a smooth solution of

$$\begin{aligned} -\Delta u = 0 \quad \text {in }\Omega , \end{aligned}$$

where \(\Delta \) denotes the Laplacian in \({\mathbb {R}}^N\), then \(w:=|Du|^2\) satisfies

$$\begin{aligned} -\Delta w \le 0 \quad \text {in }\Omega . \end{aligned}$$

The gradient bound is deduced from this property by using the Maximum Principle if one knows that Du is bounded on \(\partial \Omega \); this bound on the boundary is usually a consequence of the existence of barrier functions.
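
For completeness, here is a sketch of the standard computation behind the inequality satisfied by \(w\); it assumes that u is \(C^3\), and the indices i, k (summed from 1 to N) are notation introduced only for this check. Differentiating \(w=|Du|^2\) twice gives

$$\begin{aligned} \partial _i w = 2\sum _k \partial _k u\, \partial _{ik} u, \qquad \Delta w = 2\sum _{i,k} (\partial _{ik} u)^2 + 2\sum _k \partial _k u\, \partial _k(\Delta u) = 2|D^2u|^2 \ge 0, \end{aligned}$$

where the last equality uses \(\Delta u=0\) in \(\Omega \).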

Of course this strategy, consisting in showing that \(w:=|Du|^2\) is a subsolution of an elliptic equation and then using the Maximum Principle, can be applied to far more general equations but it has a clear defect: in order to justify the above computations, the solution has to be \(C^3\) and, since it is rare that the solution has such a regularity, the classical Bernstein’s method provides, in general, only a priori estimates; then one has to find a suitable approximation of the equation, with smooth enough solutions, to actually obtain the gradient bound.

In 1990, this difficulty was partially overcome by the weak Bernstein’s method, whose idea is even simpler: if one looks at the maximum of the function

$$\begin{aligned} (x,y) \mapsto u(x)-u(y)-L|x-y| \quad \text {in }\quad {\overline{\Omega }} \times {\overline{\Omega }}, \end{aligned}$$

and if one can prove that it is achieved only for \(x=y\) for L large enough, then \(|Du|\le L\). Surprisingly, as explained in the introduction of [1], the computations and structure conditions which are needed to obtain this bound are the same (or almost the same, with tiny differences) as for the classical Bernstein’s method. Of course, the main advantage of the weak Bernstein’s method is that it does not require u to be smooth, since there is no differentiation of u, and it can even be used in the framework of viscosity solutions.
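
The implication can be sketched as follows, under the assumption stated above that, for some L, the maximum is achieved only on the diagonal \(x=y\). Since the function vanishes when \(x=y\), its maximum is then 0 and

$$\begin{aligned} u(x)-u(y) \le L|x-y| \quad \text {for any } (x,y) \in {\overline{\Omega }} \times {\overline{\Omega }}. \end{aligned}$$

Exchanging the roles of x and y yields \(|u(x)-u(y)| \le L|x-y|\), so that u is Lipschitz continuous with constant L and \(|Du|\le L\) at every point of \(\Omega \) where u is differentiable.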

Problem solved? Not completely, because the weak Bernstein’s method is not easy to use if one looks for local bounds instead of global bounds. In fact, in order to get such local gradient bounds, the only possible way seems to be to multiply the solution by a cut-off function and to look for a gradient bound for this new function. Unfortunately, this new function satisfies a rather complicated equation where the derivatives of the cut-off function appear at different places and the computations become rather technical. The classical Bernstein’s method also faces similar difficulties but, at least in some cases, succeeds in providing these local bounds in a not too complicated way.

The aim of this article is to describe a slight improvement of the weak Bernstein’s method which allows one to obtain local gradient bounds in a simpler way, “simpler” meaning that the technicalities are as reduced as possible, although some are unavoidable. This improvement is based on an idea of Cardaliaguet [3] which dramatically simplifies a matrix analysis which is a keystone in [1] and also allows this extension to local bounds.

To present our result, we consider second-order, possibly degenerate, elliptic equations which we write in the general form

$$\begin{aligned} F(x, u, D u, D^2u) = 0 \quad \text {in }\quad \Omega , \end{aligned}$$
(1)

where \(\Omega \) is a domain of \({\mathbb {R}}^N\) and \(F :\Omega \times {\mathbb {R}}\times {\mathbb {R}}^N \times {{\mathcal {S}}}^N\rightarrow {\mathbb {R}}\) is a locally Lipschitz continuous function, \({{\mathcal {S}}}^N\) denotes the space of \(N \times N\) symmetric matrices, the solution u is a real-valued function defined on \(\Omega \), \(Du, D^2u\) denote respectively its gradient and Hessian matrix. We assume that F satisfies the (degenerate) ellipticity condition : for any \((x,r,p)\in \Omega \times {\mathbb {R}}\times {\mathbb {R}}^N\) and for any \(X,Y\in {{\mathcal {S}}}^N\),

$$\begin{aligned} F(x,r,p,X) \le F(x,r,p,Y)\quad \text {if }\quad X\ge Y. \end{aligned}$$

Our results consist in providing several general “structure conditions” on F under which one has a local gradient bound, depending or not on the local oscillation of u and the uniform ellipticity of the equation. We also consider the parabolic case, for which we give a structure condition on the equation allowing us to prove a local gradient bound, depending on the local oscillation of u, where “local” means both in space and time.

In the stationary framework, we focus in particular on the following example

$$\begin{aligned} -\Delta u + |Du|^m = f(x) \quad \text {in }\quad \Omega , \end{aligned}$$
(2)

where \(m>1\) and \(f \in W^{1,\infty }_{loc}(\Omega )\), which is a particular case for which the classical Bernstein’s method provides a local bound (independent of the oscillation of u) in a rather easy way, while it is not the case for the weak Bernstein’s method.

We conclude this introduction by two remarks: the first one concerns the “structure conditions” on F on which our results are based. In [1], it is pointed out that, in general, the equation we consider does not satisfy these structure conditions and we have to make a change of unknown function \(v=\psi (u)\), choosing \(\psi \) in order that the new equation for v satisfies them. Obviously, the same remark is true here and we provide an example where such a change allows one to obtain the desired gradient bound. But, contrary to [1], we are not going to study the effect of such changes in a more systematic way.

The second remark concerns the method we are going to present: the results we obtain are based on several choices we made at several places and, in particular, in the estimates of the terms we have to handle. Clearly, many variants are possible and we have just tried to convince the reader that, actually, the technicalities are really “reasonable”, as we claim in the abstract.

1 Some preliminary results

In this section, we are going to construct the functions we use in the proof of our main result. To do so, we introduce \({\mathcal {K}}\) which is the class of continuous functions \(\chi :[0,+\infty )\rightarrow [0,+\infty )\) such that \(\chi (t)=0\) if \(t\le 1\), \(\chi \) is increasing on \([1,+\infty [\), \(\chi (t)\le {{\tilde{K}}}(\chi )t^\beta \) for \(t\ge 1\), for some \(0<\beta < 1/2\) and some constant \({{\tilde{K}}}(\chi )>0\), and

$$\begin{aligned} \int _{1}^{+\infty }\frac{dt}{t\chi (t)}<+\infty . \end{aligned}$$
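
As an illustration, one can check by hand that the power-type function used later for Eq. (2) belongs to \({\mathcal {K}}\); the name \(\chi _\beta \) is introduced only for this sketch. For \(0<\beta <1/2\), set \(\chi _\beta (t)=(t-1)^\beta \) for \(t\ge 1\) and \(\chi _\beta (t)=0\) for \(t\le 1\). This function is continuous, increasing on \([1,+\infty [\), satisfies \(\chi _\beta (t)\le t^\beta \) for \(t\ge 1\) (so that one may take \({{\tilde{K}}}(\chi _\beta )=1\)) and, with the change of variable \(t=1+s\),

$$\begin{aligned} \int _{1}^{+\infty }\frac{dt}{t(t-1)^\beta } = \int _{0}^{+\infty }\frac{s^{-\beta }}{1+s}\,ds = \Gamma (\beta )\Gamma (1-\beta ) = \frac{\pi }{\sin (\pi \beta )} <+\infty . \end{aligned}$$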

The first ingredient we use below is a smooth function \(\varphi : [0,1[ \rightarrow {\mathbb {R}}\) such that \(\varphi (0)=0\), \(\varphi '(0)=1 \le \varphi '(t)\) for any \(t\in [0,1[\) with \(\varphi (t) \rightarrow +\infty \) as \(t\rightarrow 1^-\) and which solves the ode \(\varphi ''(t)= {K_{1}}\varphi '(t) \chi (\varphi '(t))\) for some constant \(K_1>0\). In fact the existence of such function is classical using that

$$\begin{aligned} {\int _{1}}^{\varphi {'}(t)} \frac{ds}{s\chi (s)} = {K_{1}} t, \end{aligned}$$

and by choosing \(K_1={\int _{1}}^{+\infty } \frac{ds}{s\chi (s)}\) we already see that \(\varphi ' (t) \rightarrow +\infty \) as \(t\rightarrow 1^-\). Moreover

$$\begin{aligned} \int _{\varphi {'}(t)}^{+\infty } \frac{ds}{s\chi (s)} = K_1 (1-t), \end{aligned}$$

and therefore, for t close enough to 1

$$\begin{aligned} K_1 (1-t) \ge [{{\tilde{K}}}(\chi )]^{-1}\int _{\varphi {'}(t)}^{+\infty } \frac{ds}{s^{1+\beta }}= [{{\tilde{K}}}(\chi )\beta ]^{-1}\varphi {'}(t)^{-\beta }. \end{aligned}$$

This means that

$$\begin{aligned} \varphi {'}(t) \ge \left( \frac{K_1 (1-t)}{[{{\tilde{K}}}(\chi )\beta ]^{-1}}\right) ^{-1/\beta }, \end{aligned}$$

and therefore \(\varphi {'}(t)\) is not integrable at 1 since \(1/\beta >2\). Hence we have \(\varphi (t) \rightarrow +\infty \) as \(t\rightarrow 1^-\).

On the other hand, given \(x_0 \in {\mathbb {R}}^N\) and \(R>0\), we use below a smooth function \(C: B(x_0,3R/4) \rightarrow {\mathbb {R}}\) such that \(C(z)= 1\) on \(B(x_0,R/4)\), \(C(z) \ge 1\) in \( B(x_0,3R/4)\), \(C(z)\rightarrow +\infty \) when \(z\rightarrow \partial B(x_0,3R/4)\), and

$$\begin{aligned} \frac{|D^2C(x)|}{C(x)} , \frac{|DC(x)|^2}{[C(x)]^2} \le K_2(R) [\chi (C(x))]^2, \end{aligned}$$

where \(\chi \) is a function in the class \({\mathcal {K}}\). If \(C_1\) is a function which satisfies the above properties for \(x_0=0\) and \(R=1\), we see that we can choose C as

$$\begin{aligned} C(x)=C_1\left( \frac{x-x_0}{R}\right) , \end{aligned}$$

and therefore \(K_2(R)\) behaves like \(R^{-2}K_2(1)\).
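
The scaling claim can be verified directly; this is a short check, with \(z=(x-x_0)/R\) used as shorthand. By the chain rule,

$$\begin{aligned} DC(x) = R^{-1}DC_1(z), \qquad D^2C(x) = R^{-2}D^2C_1(z), \end{aligned}$$

and, since \(C(x)=C_1(z)\),

$$\begin{aligned} \frac{|D^2C(x)|}{C(x)} = R^{-2}\frac{|D^2C_1(z)|}{C_1(z)} \le R^{-2}K_2(1)[\chi (C(x))]^2, \qquad \frac{|DC(x)|^2}{[C(x)]^2} = R^{-2}\frac{|DC_1(z)|^2}{[C_1(z)]^2} \le R^{-2}K_2(1)[\chi (C(x))]^2. \end{aligned}$$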

To build \(C_1\), we first solve

$$\begin{aligned} \psi '' (t) = K_3 \psi (t)[\chi (\psi (t))]^2, \quad \psi (0)=1,\quad \psi '(0)=0, \end{aligned}$$

for some constant \(K_3\) to be chosen later on. Multiplying the equation by \(2 \psi '(t)\) and integrating, we obtain that

$$\begin{aligned} \psi '(t) = F(\psi (t)), \end{aligned}$$

where

$$\begin{aligned}{}[F(\tau )]^2= 2K_3 {\int _{1}}^\tau s[\chi (s)]^2 ds. \end{aligned}$$
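
In more detail, and as a sketch of the step just described: by the chain rule,

$$\begin{aligned} \frac{d}{dt}\bigl ([\psi '(t)]^2\bigr ) = 2\psi '(t)\psi ''(t) = 2K_3\, \psi (t)[\chi (\psi (t))]^2\psi '(t) = \frac{d}{dt}\left( 2K_3 \int _{1}^{\psi (t)} s[\chi (s)]^2 ds\right) , \end{aligned}$$

and both \([\psi '(t)]^2\) and the integral vanish at \(t=0\) because \(\psi (0)=1\) and \(\psi '(0)=0\). Integrating from 0 to t gives \([\psi '(t)]^2=[F(\psi (t))]^2\), and the relation \(\psi '=F(\psi )\) corresponds to selecting the nondecreasing solution.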

Again we look for a function \(\psi \) such that \(\psi (t) \rightarrow +\infty \) as \(t\rightarrow 1^{-}\) and to do so, the following condition should hold

$$\begin{aligned} {\int _{1}}^{+\infty } \frac{d\tau }{F(\tau )} < +\infty . \end{aligned}$$

But, since \(\chi \) is increasing,

$$\begin{aligned}{}[F(\tau )]^2 \ge 2K_3 \int _{\tau /2}^\tau s[\chi (s)]^2 ds\ge \; 2K_3 \left[ \frac{\tau }{2}\, \chi (\tau /2)\right] ^2 , \end{aligned}$$

and since \(\tau \mapsto \chi (\tau /2)\) is in \({\mathcal {K}}\), we have the result for F, and then for \(\psi \) by choosing appropriately the constant \(K_3\).
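
One possible way to make the choice of \(K_3\) explicit is the following sketch, where the auxiliary function G and the blow-up time T are notation introduced here, and where we assume that \(\psi \) is the nonconstant solution and that \(1/G\) is integrable on \((1,+\infty )\). Writing \(F(\tau )=(2K_3)^{1/2}G(\tau )\) with \(G(\tau ):=\bigl (\int _{1}^{\tau } s[\chi (s)]^2ds\bigr )^{1/2}\), separation of variables in \(\psi '=F(\psi )\), \(\psi (0)=1\), gives

$$\begin{aligned} t = \int _{1}^{\psi (t)} \frac{d\tau }{F(\tau )}, \qquad T := \int _{1}^{+\infty } \frac{d\tau }{F(\tau )} = (2K_3)^{-1/2}\int _{1}^{+\infty } \frac{d\tau }{G(\tau )}, \end{aligned}$$

so that \(\psi (t)\rightarrow +\infty \) as \(t\rightarrow T^-\). Requiring \(T=1\) amounts to taking

$$\begin{aligned} K_3 = \frac{1}{2}\left( \int _{1}^{+\infty } \frac{d\tau }{G(\tau )}\right) ^2. \end{aligned}$$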

Moreover

$$\begin{aligned}{}[F(\tau )]^2 \le 2K_3 (\tau -1) \tau [\chi (\tau )]^2 \le 2K_3 [\tau \chi (\tau )]^2, \end{aligned}$$

and therefore

$$\begin{aligned} \psi '(t) \le (2K_3)^{1/2} \psi (t) \chi (\psi (t)). \end{aligned}$$

Finally, we can extend \(\psi \) by setting \(\psi (t)=1\) for \(t\le 0\) and the equations satisfied by \(\psi \) show that we define in that way a \(C^2\)-function on \((-\infty ,1)\).

With such a \(\psi \), the construction of \(C_1\) is easy, we may choose

$$\begin{aligned} C_1(x):= \psi \bigl (4(|x| - 1/2)\bigr )\quad \text {for }\quad x\in B(0,3/4) , \end{aligned}$$

and define C from \(C_1\) as above. We notice that, because of the properties of \(\psi \), \(\dfrac{|DC(x)|}{[C(x)]^2}\) remains bounded on \(B(x_0,3R/4)\) and is a \(O(R^{-1})\), a property that we will use later on.

2 The main result

In the statement of our main result below, for the sake of clarity, we are going to drop the arguments of the partial derivatives of F and to simply denote by \(F_s\) the quantity \(\dfrac{\partial F}{\partial s} (x,r,p,M)\) for \(s=x,r,p,M\). Actually these arguments are \((x,r,p,M)\) everywhere.

Our result is the following

Theorem 2.1

Assume that F is a locally Lipschitz function in \(\Omega \times {\mathbb {R}}\times {\mathbb {R}}^N \times {{\mathcal {S}}}^N\rightarrow {\mathbb {R}}\) which satisfies: \(F(x,r,p,M)\) is Lipschitz continuous in M and

$$\begin{aligned} F_M(x,r,p,M) \le 0 \;\text {and}\; F_r(x,r,p,M) \ge 0\quad \text {a.e. in }\Omega \times {\mathbb {R}}\times {\mathbb {R}}^N \times {{\mathcal {S}}}^N\; , \end{aligned}$$

and let \(u\in C(\Omega )\) be a solution of (1).

  1. (i)

    (Uniformly elliptic equation with coercive gradient dependence: estimates which are independent of the oscillation of u) Assume that there exist a function \(\chi \in {\mathcal {K}}\) and \(0<\eta \le 1\) such that, for any \(K>0\), there exists \(L= L(F,K)\) large enough such that

    $$\begin{aligned}&-(1+\eta )|F_x| |p| (1+K\chi (\eta |p|)) - K |F_p| |p|^2 \left( 1+K\chi (\eta |p| )\right) \chi (\eta |p| ) \\&\quad - \dfrac{1}{1+\eta }F_M\cdot M^2\ge \eta + K \bigl ( |p| \left( 1+K\chi (\eta |p| )\right) \chi (\eta |p| )\bigr )^2 \; \text {a.e.}, \end{aligned}$$

    in the set

    $$\begin{aligned} \{(x,r,p,M);\ |F(x,r,p,M)| \le K \eta |p|[1 + K\chi (\eta |p|)]+\eta \; ,\; |p|\ge L\}. \end{aligned}$$

    If \(\overline{B(x_0,R)} \subset \Omega \) then u is Lipschitz continuous in \(B(x_0,R/2)\) and \(|Du| \le {{\bar{L}}} \) in \(B(x_0,R/2)\) where \({\bar{L}}\) depends only on F and R.

  2. (ii)

    (Uniformly elliptic equation with coercive gradient dependence: estimates depending on the oscillation of u) Assume that there exist a function \(\chi \in {\mathcal {K}}\) and \(0<\eta \le 1\) small enough such that, for any \(K>0\), there exists \(L= L(F,K)\) large enough such that

    $$\begin{aligned} -(1+\eta ) |F_x||p| - K|F_p| |p|^2\chi (\eta |p|) - \frac{1}{1+\eta } F_M\cdot M^2 \ge \eta + K |p|^2\chi (\eta |p|)^{2}\; \text {a.e.}, \end{aligned}$$

    in the set \(\{(x,r,p,M);\ |F(x,r,p,M)| \le K |p|+\eta \; ,\; |p|\ge L\}\). If \(\overline{B(x_0,R)} \subset \Omega \) then u is Lipschitz continuous in \(B(x_0,R/2)\) and \(|Du| \le {{\bar{L}}} \) in \(B(x_0,R/2)\) where \({\bar{L}}\) depends on F, R and \(osc_R (u)\), the oscillation of u on \( \overline{B(x_0,R)}\).

  3. (iii)

    (Non-uniformly elliptic equation: estimates depending on the oscillation of u) Assume that there exist a function \(\chi \in {\mathcal {K}}\) and \(0< \eta \le 1\) small enough such that, for any \(K>0\), there exists \(L=L(F,K)\) large enough such that

    $$\begin{aligned}&-(1+\eta ) |F_x| |p| +(1-\eta )^2 F_r|p|^2 - K|F_p| |p|^2\chi (\eta |p|)\\&\quad - \frac{1}{1+\eta } F_M\cdot M^2 \ge \eta + K |p|^2\chi (\eta |p|)^{2}\; \text {a.e.}, \end{aligned}$$

    in the set \(\{(x,r,p,M);\ |F(x,r,p,M)| \le K |p|+\eta \; ,\; |p|\ge L\}\). If \(\overline{B(x_0,R)} \subset \Omega \) then u is Lipschitz continuous in \(B(x_0,R/2)\) and \(|Du| \le {{\bar{L}}} \) in \(B(x_0,R/2)\) where \({\bar{L}}\) depends on F, R and \(osc_R (u)\).

As an application we consider Eq. (2): in order to have a gradient estimate which is independent of the oscillation of u, i.e. Result (i) in Theorem 2.1, the idea is to choose \(\chi (t)=(t-1)^\beta \) for \(t\ge 1\) with \(0<\beta <1/2\) and \(\gamma :=1+2\beta < m\). The most important point is that, for large |p|, the constraint on F reads

$$\begin{aligned} |F(x,r,p,M)| \le K\eta |p|(1+ K(\eta |p|)^{\beta })+\eta \end{aligned}$$

and therefore \(|F(x,r,p,M)|\) is at most of order \(K^2(\eta |p|)^{1+\beta }\) if |p| is large enough. Since \(1+\beta <m\), this implies that, for such \((x,r,p,M)\),

$$\begin{aligned} \mathrm{Tr}(M)\ge \frac{1}{2} |p|^m - ||f||_{L^{\infty }(B(x_0,R))}. \end{aligned}$$

But, by the Cauchy–Schwarz inequality,

$$\begin{aligned} \mathrm{Tr}(M)\le C(N)[\mathrm{Tr}(M^2)]^{1/2}. \end{aligned}$$

Therefore the term \(-F_M\cdot M^2=\mathrm{Tr}(M^2)\) is at least of order \(|p|^{2m}\). For the other terms, we have, for large |p|

  1. 1.

    the term \(|F_x| |p| (1+K\chi (\eta |p|)) \) behaves like \(|p|^{1+\beta }=|p|^{\gamma -\beta }\);

  2. 2.

    the term \(|F_p| |p|^2 \left( 1+K\chi (\eta |p| )\right) \chi (\eta |p| )\) behaves like \(|p|^{m+1+2\beta }=|p|^{m+\gamma }\);

  3. 3.

    the term \(K ||F_M||_\infty \bigl ( |p| \left( 1+K\chi (\eta |p| )\right) \chi (\eta |p| )\bigr ) ^2\) behaves like \(|p|^{2(1+2\beta )}=|p|^{2\gamma }\).

Since \(\gamma < m\), the term \(-F_M\cdot M^2\) clearly dominates all the other terms as |p| tends to \(+\infty \); therefore we have the gradient bound since the assumption holds for any \(0<\eta \le 1\). Moreover the classical case (\(m=1\)) can also be treated under the assumptions of Result (ii).
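
To make the comparison of exponents concrete, here is an illustrative instance; the values \(m=2\) and \(\beta =1/4\) are chosen only for this example. Then \(\gamma =1+2\beta =3/2<m\) and the powers of |p| carried by the four terms are

$$\begin{aligned} 2m = 4 \ (-F_M\cdot M^2), \qquad 1+\beta = \tfrac{5}{4} \ (F_x\text {-term}), \qquad m+\gamma = \tfrac{7}{2} \ (F_p\text {-term}), \qquad 2\gamma = 3 \ (\text {last term}). \end{aligned}$$

For general m and \(\beta \), the corresponding gaps are \(2m-(m+\gamma )=m-\gamma >0\), \(2m-2\gamma =2(m-\gamma )>0\) and \(2m-(1+\beta )=2m-\gamma +\beta >0\).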

In this example, it is also clear that we can replace the term \(|Du|^m\) by a term H(Du) where H satisfies: there exists \(\chi \in {\mathcal {K}}\) such that

$$\begin{aligned} \frac{|p|\chi (|p|)}{H(p)}\rightarrow 0 \quad \text {as }\quad |p|\rightarrow +\infty , \end{aligned}$$

and

$$\begin{aligned} \frac{|H_p|(|p|\chi (|p|))^2}{[H(p)]^2}\rightarrow 0 \quad \text {as }\quad |p|\rightarrow +\infty \; . \end{aligned}$$

In the case of non-uniformly elliptic equations, the gradient bound comes necessarily from the \(F_r|p|^2\)-term. We consider the equation

$$\begin{aligned} -\mathrm{Tr}(A(x)D^2 u) + |Du|^m = f(x) \quad \text {in }\quad \Omega , \end{aligned}$$
(3)

where \(m>1\) and f is locally bounded and Lipschitz continuous; concerning A, we use the classical assumption: \(A(x)=\sigma (x)\cdot \sigma ^T(x)\) for some bounded, Lipschitz continuous function \(\sigma \), where \(\sigma ^T(x)\) denotes the transpose matrix of \(\sigma (x)\).

In order to obtain a local gradient bound for u, a change of variable is necessary: assuming (without loss of generality) that \(u\ge 1\) at least in the ball \(\overline{B(x_0,R)}\), we can use the change \(u=\exp (v)\). Since \(Du=\exp (v)Dv\) and \(D^2u=\exp (v)(D^2v+Dv\otimes Dv)\), the equation satisfied by v is

$$\begin{aligned} -\mathrm{Tr}(A(x)D^2 v) -A(x)Dv\cdot Dv+ \exp ((m-1)v)|Dv|^m = \exp (-v)f(x) \quad \text {in }\Omega , \end{aligned}$$

and the aim is now to apply Theorem 2.1(iii) to get the gradient bound for v (hence for u).

The computation of the different terms gives

$$\begin{aligned} F_r(x,r,p,M)= & {} (m-1)\exp ((m-1)r)|p|^m + \exp (-r)f(x),\\ F_x(x,r,p,M)= & {} -\mathrm{Tr}(A_x(x)M)-A_x(x)p\cdot p-\exp (-r)f_x (x),\\ F_p(x,r,p,M)= & {} -2A(x)p +m\exp ((m-1)r)|p|^{m-2}p,\\ - F_M(x,r,p,M)M^2= & {} \mathrm{Tr}(A(x)M^2). \end{aligned}$$

We first use the Cauchy–Schwarz inequality and the assumption on A to deduce that, for any \(\eta >0\),

$$\begin{aligned} |\mathrm{Tr}(A_x(x) M)| |p|\le \frac{1}{1+\eta } \mathrm{Tr}(A(x)M^2)+ O((|\sigma _x||p|)^2); \end{aligned}$$

This control of the first term in \(F_x(x,v,p,M)\) is the only use of the term \(- F_M(x,v,p,M)M^2\).

Therefore the \(F_r(x,r,p,M)|p|^2\)-term which behaves like \(|p|^{m+2}\) if \(m>1\), has to control the terms

$$\begin{aligned} (A_x(x)\cdot p)(p\cdot p)=O(|p|^3),\quad -\exp (-v)f_x (x) |p|=O(|p|),\quad 2A(x)p\cdot p =O(|p|^2). \end{aligned}$$

We have now to consider the \(F_p\)-term and the term \(K \bigl (|p|\chi (\eta |p|)\bigr )^{2}\) in the right-hand side. Notice that, for the time being, we have not chosen \(\chi \) nor \(\eta \).

The \(F_p\)-term behaves as \(|p|^{\max (1,m-1)}\) and therefore \(|F_p| |p|^2\chi (\eta |p|)\) behaves as \(|p|^{\max (3,m+1)}\chi (\eta |p|)\). On the other hand, \(K \bigl (|p|\chi (\eta |p|)\bigr )^{2}\) behaves as \(|p|^2[\chi (\eta |p|)]^2\). If we choose any \(\chi \in {\mathcal {K}}\), because of the growth of such \(\chi \) at infinity, these two terms are controlled by the \(F_r|p|^2\)-one. Therefore Theorem 2.1 (iii) applies.

It is worth pointing out that, in this last example, we do not use the fact that the assumption has to hold only in the set \(\{(x,r,p,M);\ |F(x,r,p,M)| \le K |p|+\eta \; ,\; |p|\ge {\bar{L}}\}\). In the parabolic setting, we also argue in this way in most of the cases.

3 Proof of Theorem 2.1

We start by proving (i) : the aim is to prove that, for any \(x\in B(x_0,R/4)\), \(D^+u(x)\) is bounded with an explicit bound. This will provide the desired gradient bound. We recall that

$$\begin{aligned} D^+u(x)=\{p\in {\mathbb {R}}^N:\ u(x+h)\le u(x)+p\cdot h+o(|h|) \ \text { as } h\rightarrow 0\}. \end{aligned}$$

To do so, we consider on

$$\begin{aligned} \Gamma _L :=\{(x,y) \in B(x_0,3R/4) \times B(x_0,R) : LC(x)(|x-y| +\alpha )<1\} \end{aligned}$$

the following function

$$\begin{aligned} \Phi (x,y)= u(x)-u(y) - \varphi \left( LC(x)(|x-y| +\alpha )\right) , \end{aligned}$$

where

  • \(L\ge \max (1,4/R)\) is a constant which is our future gradient bound (and therefore which has to be chosen large enough),

  • the functions \(\varphi \) and C are built in Sect. 1,

  • \(\alpha >0\) is a small constant which will tend to 0.

We remark that the above function achieves its maximum in the open set \(\Gamma _L\): indeed, if \((x,y) \in \Gamma _L\), we have \( LC(x)\alpha <1\) and therefore \(x\in \overline{B(x_0,R')}\) for some \(R'<3R/4\). Moreover \(LC(x)|x-y|<1\) implies \(|x-y|<L^{-1}\) and, since \(L> 4/R\), this implies \(y\in \overline{B(x_0,R'+R/4)}\) and \(R'+R/4<R\). Therefore, clearly \(\Phi (x,y) \rightarrow - \infty \) if \((x,y)\rightarrow \partial \Gamma _L\).

Next we argue by contradiction: if, for some L, this maximum is achieved for any \(\alpha \) at \(({{\bar{x}}}_\alpha ,{{\bar{y}}}_\alpha )\) with \({{\bar{x}}}_\alpha ={{\bar{y}}}_\alpha \), then \(\Phi ({{\bar{x}}}_\alpha ,{{\bar{x}}}_\alpha )=- \varphi (LC({{\bar{x}}}_\alpha )\alpha )\) and therefore necessarily \({{\bar{x}}}_\alpha \in B(x_0,R/4)\) by the maximality property and the form of C. Moreover, for any \(x,y\),

$$\begin{aligned} u(x)-u(y) - \varphi (LC(x)(|x-y| +\alpha ))\le -\varphi (L\alpha ) , \end{aligned}$$

and if this is true, for a fixed L, this implies that, for any \(x,y\),

$$\begin{aligned} u(x)-u(y) - \varphi (LC(x)|x-y| )\le 0. \end{aligned}$$

Choosing \(x\in B(x_0,R/4)\), we have

$$\begin{aligned} u(y)-u(x) \ge - \varphi (L|x-y| ), \end{aligned}$$

and this inequality implies that any element in \(D^+u(x)\) has a norm which is less than or equal to L, which is what we wanted to prove.
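
The last claim can be checked as follows; this is a sketch which uses only \(\varphi (0)=0\) and \(\varphi '(0)=1\). Let \(p\in D^+u(x)\) with \(p\ne 0\) and take \(y=x+h\) with \(h=-s\,p/|p|\) and \(s>0\) small. Since \(\varphi (Ls)=Ls+o(s)\), the inequality above and the definition of \(D^+u(x)\) give

$$\begin{aligned} -Ls+o(s) \le u(x+h)-u(x) \le p\cdot h+o(s) = -s|p|+o(s). \end{aligned}$$

Dividing by s and letting \(s\rightarrow 0\) yields \(|p|\le L\).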

Notice that, by using slightly more complicated arguments, the same conclusion is true if, for some L, we have \({{\bar{x}}}_\alpha -{{\bar{y}}}_\alpha \rightarrow 0\) when \(\alpha \rightarrow 0\).

Therefore, we may assume without loss of generality that, for any fixed L, the maximum points \(({{\bar{x}}}_\alpha ,{{\bar{y}}}_\alpha )\) of \(\Phi \) satisfy not only \({{\bar{x}}}_\alpha \ne {{\bar{y}}}_\alpha \) for \(\alpha \) small enough but also that \({{\bar{x}}}_\alpha -{{\bar{y}}}_\alpha \) is bounded away from 0 when \(\alpha \rightarrow 0\). We are going to prove that this leads to a contradiction for L large enough.

For the sake of simplicity of notations, we omit the indices \(\alpha \) in all the quantities which depend on \(\alpha \) (actually they also depend on L). In particular, we denote by \((x,y)\) a maximum point of \(\Phi \) and we set \(t=LC(x)(|x-y| +\alpha )\) and

$$\begin{aligned} p= \varphi {'}(t)LC(x) \frac{(x-y)}{|x-y|} ,\quad q= \varphi {'}(t)LDC(x) (|x-y|+\alpha ). \end{aligned}$$

By a classical result of the User’s guide (cf. Crandall et al. [4]), there exist matrices \(X,Y \in {{\mathcal {S}}}^N\) such that \((p+q,X)\in \overline{D^{2,+}}u(x)\), \((p,Y)\in \overline{D^{2,-}}u(y)\), for which the following viscosity inequalities hold

$$\begin{aligned} F(x,u(x), p+q,X) \le 0,\quad F(y,u(y), p,Y) \ge 0. \end{aligned}$$

Moreover the matrices \(X,Y\) satisfy, for any \(\varepsilon >0\),

$$\begin{aligned} -\left( \frac{1}{\varepsilon }+ ||A||\right) I_{2N} \le \left( \begin{array}{cc}X &{} 0 \\ 0 &{} -Y\end{array}\right) \le A + \varepsilon A^2 , \end{aligned}$$

where, setting \(\psi (x,y)= \varphi (LC(x)(|x-y| +\alpha ))\), we have \(A=D^2\psi (x,y)\) and \(||A||=\max \{|\lambda |: \lambda \text { is an eigenvalue of } A \}\).

Since \(\varepsilon >0\) is arbitrary and since we are going to use only the second above inequality, we may choose a sufficiently small \(\varepsilon \) in order that the term \(\varepsilon A^2\) becomes negligible. Using this remark, we argue below assuming that \(\varepsilon =0\) in order to simplify the exposition.

With this convention, the matrices \(X,Y\) satisfy, for any \(r,s \in {\mathbb {R}}^N\),

$$\begin{aligned} Xr\cdot r - Ys \cdot s \le \gamma _1|r-s|^2+2\gamma _2 |r-s||r|+\gamma _3|r|^2, \end{aligned}$$
(4)

where

$$\begin{aligned} \gamma _1= & {} \frac{\varphi {'}(t)LC(x)}{|x-y|}+ \varphi ''(t)(LC(x))^2, \\ \gamma _2= & {} \varphi {'}(t)L|DC(x)|+ \varphi ''(t)L^2|DC(x)|C(x)(|x-y|+\alpha ), \\ \gamma _3= & {} \varphi {'}(t)\frac{|D^2C(x)|}{C(x)}t+ \varphi ''(t)\frac{|DC(x)|^2}{[C(x)]^2}t^2. \end{aligned}$$

By simple manipulations, one sees that

$$\begin{aligned} \gamma _2\le & {} \gamma _1\frac{|DC(x)|}{C(x)}(|x-y|+\alpha ) + o_\alpha (1)\le \gamma _1 K_2^{1/2}\chi (C(x))(|x-y|+\alpha )+ o_\alpha (1),\\ \gamma _3\le & {} \gamma _1 K_2[\chi (C(x))]^2(|x-y|+\alpha )^2+ o_\alpha (1), \end{aligned}$$

where the \(o_\alpha (1)\) comes from terms of the form \(\alpha /|x-y|\). Again, for the sake of clarity, we are going to drop these terms which play no role at the end.

By Cauchy–Schwarz inequality, we deduce that, using \(\eta \) appearing in the assumption,

$$\begin{aligned} Xr\cdot r - Ys \cdot s \le (1+\eta )\gamma _1|r-s|^2+ B(R,\eta )\gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2|r|^2, \end{aligned}$$
(5)

where \(B(R,\eta )=(1+\eta ^{-1})K_2\) depends on R through \(K_2\) and therefore is a \(O(R^{-2})\) if \(\eta \) is fixed.
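
Inequality (5) follows from (4) through the elementary inequality \(2ab\le \eta a^2+\eta ^{-1}b^2\); the shorthand a below is introduced only for this sketch, and the \(o_\alpha (1)\)-terms are dropped as above. Setting \(a:=K_2^{1/2}\chi (C(x))(|x-y|+\alpha )\), the bounds on \(\gamma _2\) and \(\gamma _3\) read \(\gamma _2\le \gamma _1 a\) and \(\gamma _3\le \gamma _1 a^2\), hence

$$\begin{aligned} \gamma _1|r-s|^2+2\gamma _2 |r-s||r|+\gamma _3|r|^2\le & {} \gamma _1\bigl ( |r-s|^2+2a|r-s||r|+a^2|r|^2\bigr ) \\\le & {} \gamma _1\bigl ( (1+\eta )|r-s|^2+(1+\eta ^{-1})a^2|r|^2\bigr ) , \end{aligned}$$

which is (5) with \(B(R,\eta )=(1+\eta ^{-1})K_2\).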

Coming back to p and q, we also have

$$\begin{aligned} |q| = |p|\frac{|DC(x)|}{C(x)} (|x-y|+\alpha ) \le |p|\frac{|DC(x)|}{L [C(x)]^2} \le O((RL)^{-1})|p|, \end{aligned}$$

since \(LC(x)(|x-y|+\alpha )\le 1\), \(C\ge 1\) everywhere and since \(\dfrac{|DC(x)|}{ [C(x)]^2}\) is a \(O(R^{-1})\). In order to have simpler formulas, we denote below by \(\varpi _1\) any quantity which is a \(O((RL)^{-1})\).

Now we arrive at the key point of the proof: by (4), choosing \(r=0\), we have \(-Y\le \gamma _1I_N\) where \(I_N\) is the identity matrix in \({\mathbb {R}}^N\). Therefore the matrix \(\displaystyle I_N+[(1+\eta )\gamma _1]^{-1}Y\) is invertible and rewriting (5) as

$$\begin{aligned} Xr\cdot r \le Ys \cdot s + (1+\eta )\gamma _1|r-s|^2+ B(R,\eta )\gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2|r|^2, \end{aligned}$$

we can take the infimum in s in the right-hand side and we end up with

$$\begin{aligned} X \le Y(I_N+\frac{1}{(1+\eta )\gamma _1}Y)^{-1}+B(R,\eta )\gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2I_N\; . \end{aligned}$$

Setting \({\tilde{Y}}:= Y(I_N+\frac{1}{(1+\eta )\gamma _1}Y)^{-1}\), this implies that we have \((p+q,{\tilde{Y}}+B(R,\eta )\gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2I_N)\in \overline{D^{2,+}}u(x)\), \((p,Y)\in \overline{D^{2,-}}u(y)\) and then, using the Lipschitz continuity of F in M, we have the viscosity inequalities

$$\begin{aligned} F(x,u(x), p+q,{\tilde{Y}})\le & {} ||F_M||_\infty B(R,\eta )\gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2,\\ F(y,u(y), p,Y)\ge & {} 0. \end{aligned}$$
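
The closed form of the infimum in s which defines \({\tilde{Y}}\) can be verified by a direct computation, sketched here with the shorthand \(c:=(1+\eta )\gamma _1\) (introduced only for this check). Since \(-Y\le \gamma _1I_N\) and \(c>\gamma _1\), the matrix \(Y+cI_N\) is positive definite, so \(s\mapsto Ys\cdot s+c|r-s|^2\) is strictly convex and its minimum is attained at the point \(s^*\) such that \(Ys^*=c(r-s^*)\), i.e. \(s^*=(I_N+c^{-1}Y)^{-1}r\). Then \(r-s^*=c^{-1}Ys^*\) and

$$\begin{aligned} Ys^*\cdot s^*+c|r-s^*|^2 = Ys^*\cdot s^*+c^{-1}|Ys^*|^2 = Ys^*\cdot (s^*+c^{-1}Ys^*) = Ys^*\cdot r = Y(I_N+c^{-1}Y)^{-1}r\cdot r . \end{aligned}$$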

Next we introduce the function

$$\begin{aligned} g(\tau ):= F(X(\tau ), U(\tau ), P(\tau ), Z(\tau ))-\tau ||F_M||_\infty B(R,\eta )\gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2, \end{aligned}$$

where

$$\begin{aligned} X(\tau )= & {} \tau x+(1-\tau )y,\quad U(\tau )= \tau u(x)+(1-\tau )u(y),\quad P(\tau )=p+\tau q,\\ Z(\tau )= & {} Y(I_N+\frac{\tau }{(1+\eta )\gamma _1}Y)^{-1}. \end{aligned}$$

From now on, in order to simplify the exposition, we are going to argue as if F were \(C^1\): the case when F is just locally Lipschitz continuous follows from tedious but standard approximation arguments.

The above viscosity inequalities read \(g(0)\ge 0\) and \(g(1)\le 0\): if we can show that the \(C^1\)-function g satisfies \(g'(\tau )>0\) if \(g(\tau )=0\), we would have a contradiction. Therefore we compute

$$\begin{aligned} g'(\tau )= & {} F_x\cdot (x-y)+F_r (u(x)-u(y)) + F_p\cdot q + F_M\cdot Z'(\tau )\\&-||F_M||_\infty B(R,\eta )\gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2, \end{aligned}$$

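and using that \(F_r \ge 0\) (together with \(u(x)-u(y)\ge 0\), which follows from \(\Phi (x,y)\ge \Phi (x,x)\) and the monotonicity of \(\varphi \)) and \(Z'(\tau )=-((1+\eta )\gamma _1)^{-1}[Z(\tau )]^2\), we are led to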

$$\begin{aligned} g'(\tau )\ge & {} (\gamma _1)^{-1}\biggl \{-|F_x| \gamma _1 |x-y|- \gamma _1 |F_p| |q| - \dfrac{1}{1+\eta }F_M\cdot [Z(\tau )]^2\\&-B(R,\eta ) ||F_M||_\infty (\gamma _1)^2 [\chi (C(x))]^2(|x-y|+\alpha )^2\biggr \}. \end{aligned}$$

Before estimating the different terms inside the brackets, we point out that, contrary to [1] where \(Z(\tau )\) was given by \(\tau X + (1-\tau )Y\) and where we had to prove an inequality between \(X-Y\) and \(-[Z(\tau )]^2\), here this inequality comes for free because of the form of \(Z(\tau )\): this is the key idea of Cardaliaguet [3].
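
The formula for \(Z'(\tau )\) can be checked as follows; this is a sketch in which \(c:=(1+\eta )\gamma _1\) and \(J(\tau ):=(I_N+\tau c^{-1}Y)^{-1}\) are shorthands introduced only here. Differentiating the identity \(J(\tau )(I_N+\tau c^{-1}Y)=I_N\) gives \(J'(\tau )=-c^{-1}J(\tau )YJ(\tau )\) and, since Y and \(J(\tau )\) commute,

$$\begin{aligned} Z'(\tau ) = YJ'(\tau ) = -c^{-1}\,YJ(\tau )YJ(\tau ) = -\frac{1}{(1+\eta )\gamma _1}[Z(\tau )]^2 . \end{aligned}$$

In particular \(F_M\cdot Z'(\tau )=-((1+\eta )\gamma _1)^{-1}F_M\cdot [Z(\tau )]^2\), which is nonnegative because \(F_M\le 0\) and \([Z(\tau )]^2\ge 0\).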

Now we estimate the terms \(\gamma _1 |x-y|\), \(\gamma _1 |q|\) and \(\gamma _1 \chi (C(x))(|x-y|+\alpha )\) in terms of \(|P(\tau )|\) in order to be able to use the assumptions on F.

Using that \(LC(x)(|x-y|+\alpha ) \le 1\) and the properties of \(\varphi \), we have

$$\begin{aligned} \gamma _1|x-y|\le & {} \varphi {'}(t)LC(x)+ \varphi ''(t)(LC(x))^2|x-y|\\\le & {} \varphi {'}(t)LC(x) + K_1\varphi {'}(t)\chi (\varphi {'}(t)) LC(x)\\\le & {} |P(\tau )| (1+\varpi _1)\left( 1 + K_1\chi (\varphi {'}(t)) \right) \\\le & {} |P(\tau )| (1+\varpi _1)\left( 1+K_1 \chi (L^{-1}|P(\tau )| (1+\varpi _1))\right) . \end{aligned}$$

Indeed, recalling the estimate on |q|, \( \varphi {'}(t)LC(x)=|p|=|P(\tau )|(1+\varpi _1\tau )\) and, on the other hand, since \(C\ge 1\), we have

$$\begin{aligned} \chi (\varphi {'}(t))\le \chi (L^{-1}|p|)\le \chi (L^{-1}|P(\tau )| (1+\varpi _1)). \end{aligned}$$

From now on, we are going to assume that L is chosen large enough in order to have \(L^{-1} (1+\varpi _1)\le \eta \) and, since R is fixed, \(|\varpi _1| \le \eta \). Notice that these constraints on L depend only on R and \(\eta \), hence on R and F.

Using this choice, the above estimate of \(\chi (\varphi {'}(t))\) (and we can argue in the same way for \(\chi (C(x))\)) takes the simple form

$$\begin{aligned} \chi (\varphi {'}(t)), \chi (C(x)) \le \chi (\eta |P(\tau )|). \end{aligned}$$
(6)

This leads to the simpler estimate

$$\begin{aligned} \gamma _1|x-y| \le |P(\tau )| (1+\eta )(1+K_1\chi (\eta |P(\tau )|))\; . \end{aligned}$$

In the same way, since we can take \(\alpha \) as small as we want and \(|x-y|\) is bounded away from 0, one has

$$\begin{aligned} \gamma _1 (|x-y| +\alpha ) \le |P(\tau )| (1+\eta )\left( 1+K_1\chi (\eta |P(\tau )|)\right) +o_\alpha (1). \end{aligned}$$

This allows us to estimate the \(F_p\)-term, namely

$$\begin{aligned} \gamma _1 |q|&\le \gamma _1 |p| \frac{|DC(x)|}{C(x)}(|x-y| +\alpha )\\&\le |P(\tau )|^2 (1+\eta )^2 \left( 1+K_1\chi (\eta |P(\tau )| )\right) K_2^{1/2} \chi (C(x))+o_\alpha (1),\\&\le K_2^{1/2} |P(\tau )|^2 (1+\eta )^2 \left( 1+K_1\chi (\eta |P(\tau )| )\right) \chi (\eta |P(\tau )| )+o_\alpha (1). \end{aligned}$$

Finally, by the same estimates

$$\begin{aligned} \gamma _1 \chi (C(x)) (|x-y|+\alpha ) \le |P(\tau )| (1+\eta )\left( 1+K_1\chi (\eta |P(\tau )| )\right) \chi (\eta |P(\tau )| ) +o_\alpha (1). \end{aligned}$$

We end up with

$$\begin{aligned} g'(\tau )\ge & {} (\gamma _1)^{-1}\biggl \{-|F_x| |P(\tau )| (1+\eta )(1+K_1\chi (\eta |P(\tau )|) ) \\&- |F_p| K_2^{1/2} |P(\tau )|^2 (1+\eta )^2 \left( 1+K_1\chi (\eta |P(\tau )| )\right) \chi (\eta |P(\tau )| ) \\&- \dfrac{1}{1+\eta }F_M\cdot [Z(\tau )]^2\\&-B(R,\eta ) ||F_M||_\infty \bigl ( |P(\tau )| (1+\eta )\left( 1+K_1\chi (\eta |P(\tau )| )\right) \chi (\eta |P(\tau )| )\bigr ) ^2\biggr \} \\&+ o_\alpha (1). \end{aligned}$$

On the other hand, in order to take into account the constraint \(g(\tau )= 0\), we have to estimate \(\gamma _1 [\chi (C(x))]^2 (|x-y|+\alpha )^2\). Since \(|x-y|\) is bounded away from 0 and \(LC(x)(|x-y|+\alpha )\le 1\), we have

$$\begin{aligned} \gamma _1(|x-y|+\alpha )^2\le & {} \varphi {'}(t)+ \varphi ''(t) +o_\alpha (1)\\\le & {} \varphi {'}(t)[1 + K_1\chi (\varphi {'}(t))]+ o_\alpha (1)\\\le & {} (1+\eta ) \frac{|P(\tau )|}{LC(x)}[1 + K_1\chi (\varphi {'}(t))]+o_\alpha (1)\\\le & {} (1+\eta ) \frac{|P(\tau )|}{LC(x)}[1 + K_1\chi (\eta |P(\tau )| )]+o_\alpha (1). \end{aligned}$$

But \(\dfrac{[\chi (C(x))]^2}{C(x)} \le {\tilde{K}}(\chi )\) and therefore

$$\begin{aligned} \gamma _1 [\chi (C(x))]^2 (|x-y|+\alpha )^2 \le \eta (1+\eta ){\tilde{K}}(\chi ) |P(\tau )|[1 + K_1\chi (\eta |P(\tau )| )]+o_\alpha (1). \end{aligned}$$

This implies

$$\begin{aligned} |F(X(\tau ), U(\tau ), P(\tau ), Z(\tau ))|\le \eta (1+\eta ){\tilde{K}}(\chi ) |P(\tau )|[1 + K_1\chi (\eta |P(\tau )|)]+o_\alpha (1), \end{aligned}$$

while

$$\begin{aligned} |P(\tau )| \ge (1-\eta )L. \end{aligned}$$

The conclusion follows by applying the assumption on F for L large enough and \(\alpha \) small enough in order that the \(o_\alpha (1)\)-terms are controlled by the \(\eta \)-terms. Taking L large enough depending on \(\eta \) and R, we have a contradiction and the proof of (i) is complete.

Now we turn to the proof of (ii) where we choose \(\varphi (t)=t\) and

$$\begin{aligned} \Gamma '_L :=\left\{ (x,y) \in B(x_0,3R/4) \times B(x_0,R) : LC(x)(|x-y| +\alpha )\le osc_R (u)\right\} . \end{aligned}$$

The proof follows the same arguments, except that, since \(\varphi ''(t)\equiv 0\), several terms vanish, which allows different estimates on the \(\gamma _i\), \(i=1,2,3\). We denote by \(\varpi _2\) any quantity of the form \(O(osc_R (u)(RL)^{-1})\) and we choose L large enough in order to have \(|\varpi _2|\le \eta \) for any of these terms and \(L^{-1} \le \eta /(1+\eta )\). We notice that, here, the constraints on L depend not only on R and \(\eta \) but also on \(osc_R (u)\).

We have \(p= LC(x) \dfrac{(x-y)}{|x-y|}\) and therefore

$$\begin{aligned} |q|= L|DC(x)| (|x-y|+\alpha )= |p| \frac{|DC(x)|}{[C(x)]^2}\frac{LC(x)(|x-y|+\alpha )}{L}=\varpi _2|p|\le \eta |p|, \end{aligned}$$

since \(\dfrac{|DC(x)|}{C^2}\le O(R^{-1})\). Using this inequality and taking into account our choice of L, it is easy to check that (6) still holds.
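The chain behind the bound \(|q|\le \eta |p|\) can be spelled out as follows. This is only a sketch based on the definition of \(\Gamma '_L\) and on the convention for \(\varpi _2\) fixed above; the symbol \(c\) is a hypothetical name, introduced here, for the constant hidden in \(O(R^{-1})\).

$$\begin{aligned} |q| = |p|\,\frac{|DC(x)|}{[C(x)]^2}\,\frac{LC(x)(|x-y|+\alpha )}{L} \le |p|\,\frac{c}{R}\,\frac{osc_R (u)}{L} = \varpi _2 |p| \le \eta |p|. \end{aligned}$$

The first inequality uses \((x,y)\in \Gamma '_L\), that is \(LC(x)(|x-y|+\alpha )\le osc_R (u)\), and the last one uses the choice of L.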

Moreover we have

$$\begin{aligned} \gamma _1= \frac{LC(x)}{|x-y|},\quad \gamma _2= L|DC(x)|,\quad \gamma _3= L |D^2C(x)|(|x-y|+\alpha ) . \end{aligned}$$

We still have the same estimates on \(\gamma _1, \gamma _2,\gamma _3\):

$$\begin{aligned} \gamma _2= & {} \gamma _1\frac{|DC(x)|}{C(x)}|x-y| \le \gamma _1 \chi (C(x))|x-y|,\\ \gamma _3\le & {} \gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2. \end{aligned}$$
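For the reader's convenience, here is how these two relations can be checked from the expressions of the \(\gamma _i\) when \(x\ne y\). The first line is purely algebraic; the second one only records a sufficient condition on C and is not meant as a statement of the exact property of C used earlier in the paper.

$$\begin{aligned} \gamma _1\frac{|DC(x)|}{C(x)}|x-y|&= \frac{LC(x)}{|x-y|}\,\frac{|DC(x)|}{C(x)}\,|x-y| = L|DC(x)| = \gamma _2,\\ \frac{\gamma _3}{\gamma _1 (|x-y|+\alpha )^2}&= \frac{|D^2C(x)|}{C(x)}\,\frac{|x-y|}{|x-y|+\alpha } \le \frac{|D^2C(x)|}{C(x)}. \end{aligned}$$

Hence the inequality on \(\gamma _2\) holds as soon as \(|DC(x)|\le C(x)\chi (C(x))\), and the one on \(\gamma _3\) as soon as \(|D^2C(x)|\le C(x)[\chi (C(x))]^2\).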

The proof is then done in the same way as in the first case with the computation of \( g'(\tau )\) and then with the estimates of the different terms

$$\begin{aligned} g'(\tau )\ge & {} (\gamma _1)^{-1}\left\{ -|F_x| \gamma _1 |x-y|-\gamma _1 |F_p| |q| - \frac{1}{1+\eta }F_M\cdot [Z(\tau )]^2\right. \\&\left. -B(R,\eta ) ||F_M||_\infty (\gamma _1)^2 [\chi (C(x))]^2(|x-y|+\alpha )^2\right\} . \end{aligned}$$

But here

$$\begin{aligned} \gamma _1 |x-y|=|p| \le |P(\tau )| (1+\eta ), \end{aligned}$$

and in the same way,

$$\begin{aligned} \gamma _1 |q|&= \frac{LC}{|x-y|} L |DC(x)|(|x-y| +\alpha )\\&\le |p|^2 \dfrac{|DC(x)|}{C(x)}(1+o_\alpha (1))\\&\le K_2^{1/2} (1+\eta )^2 |P(\tau )|^2 \chi (\eta |P(\tau )| )+o_\alpha (1)\; , \end{aligned}$$

and

$$\begin{aligned} \gamma _1 \chi (C(x)) (|x-y|+\alpha ) \le (1+\eta )|P(\tau )|\chi (\eta |P(\tau )|)+o_\alpha (1). \end{aligned}$$

We end up with

$$\begin{aligned} g'(\tau )\ge & {} (\gamma _1)^{-1}\biggl \{-|F_x| (1+\eta )|P(\tau )| - K_2^{1/2} (1+\eta )^2|F_p| |P(\tau )|^2 \chi (\eta |P(\tau )| ) \\&-\frac{1}{1+\eta } F_M\cdot [Z(\tau )]^2 - B(R,\eta ) ||F_M||_\infty (1+\eta )^2 |P(\tau )|^2[\chi (\eta |P(\tau )|)]^2 +o_\alpha (1) \biggr \}. \end{aligned}$$

On the other hand, for the constraint \(g(\tau )= 0\), we have

$$\begin{aligned} \gamma _1[\chi (C(x))]^2(|x-y|+\alpha )^2&= |p|\frac{[\chi (C(x))]^2}{C(x)}\dfrac{LC(|x-y|+\alpha )^2}{|x-y|}\\&\le (1+\eta )[{\tilde{K}}(\chi )]^2 |P(\tau )|(1+\varpi _2)(1+o_\alpha (1))\\&\le (1+\eta )^2[{\tilde{K}}(\chi )]^2|P(\tau )|+o_\alpha (1), \end{aligned}$$

and

$$\begin{aligned} |P(\tau )|\ge LC(x) (1-\eta )\ge L(1-\eta ) . \end{aligned}$$

Hence

$$\begin{aligned} |F(X(\tau ), U(\tau ), P(\tau ), Z(\tau ))| \le B(R,\eta )||F_M||_\infty (1+\eta )^2 |P(\tau )| + o_\alpha (1). \end{aligned}$$
(7)

The conclusion follows as in the first case by applying the assumption on F for L large enough and \(\alpha \) small enough, which yields a contradiction.

For the proof of (iii), we keep the same test-function and the same set \(\Gamma '_L\) but since we are not expecting the gradient bound to come from the same term in \(g'(\tau )\), we are going to change the strategy in our computation of \(g'(\tau )\) by keeping the \(F_r\)-term. Using that \(F_r \ge 0\) and

$$\begin{aligned} u(x)-u(y)\ge LC(x)(|x-y|+\alpha )= \frac{|p |^2}{\gamma _1}(1+o_\alpha (1)), \end{aligned}$$

we obtain

$$\begin{aligned} g'(\tau )= & {} F_x\cdot (x-y)+F_r (u(x)-u(y)) + F_p\cdot q + F_M\cdot Z'(\tau )\\&-||F_M||_\infty B(R,\eta )\gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2\; ,\\\ge & {} (\gamma _1)^{-1}\left\{ F_x\cdot p+ F_r |p|^2 - \gamma _1 |F_p| |q| - \frac{1}{1+\eta }F_M\cdot [Z(\tau )]^2\right. \\&\left. -B(R,\eta ) ||F_M||_\infty (\gamma _1)^2 [\chi (C(x))]^2(|x-y|+\alpha )^2+o_\alpha (1)\right\} . \end{aligned}$$

This computation is close to the one given in [1] if there is no localization term (\(C\equiv 1\)).

Since \(|P(\tau )| (1-\eta )\le |p| \le |P(\tau )| (1+\eta )\) and using estimates analogous to those above, we are led to

$$\begin{aligned} g'(\tau )\ge & {} (\gamma _1)^{-1}\biggl \{-(1+\eta ) |F_x| |P(\tau )| +(1-\eta )^2 F_r |P(\tau )|^2\\&- K_2^{1/2} (1+\eta )^2|F_p| |P(\tau )|^2 \chi (\eta |P(\tau )| )- \frac{1}{1+\eta }F_M\cdot [Z(\tau )]^2 \\&-B(R,\eta ) ||F_M||_\infty (1+\eta )^2 [{\tilde{K}}(\chi )]^2 |P(\tau )|^2[\chi (\eta |P(\tau )|)]^2 \biggr \}+o_\alpha (1). \end{aligned}$$

On the other hand, the constraint \(g(\tau )= 0\) still implies (7) and we also conclude by choosing L large enough and \(\alpha \) small enough.

4 The parabolic case

In this section, we consider evolution equations under the general form

$$\begin{aligned} u_t + F(x, t, u, D u, D^2u) = 0 \quad \text {in }\quad \Omega \times (0,T), \end{aligned}$$
(8)

and the aim is to provide a local gradient bound, where “local” means local both in space and in time. As a consequence, we also have to localize in time. A second main difference is that we cannot use the fact that the equation holds, since the \(u_t\)-term has no particular property in general; therefore the assumptions on F have to hold for any \((x,t,r,p,M)\) and not only for those for which \(F(x,t,r,p,M)\) is close to 0.

Theorem 4.1

(Estimates for non-uniformly parabolic equations: estimates depending on the oscillation of u) Assume that F is a locally Lipschitz function in \(\Omega \times (0,T) \times {\mathbb {R}}\times {\mathbb {R}}^N \times {{\mathcal {S}}}^N\) which satisfies: \(F(x,t,r,p,M)\) is Lipschitz continuous in M and

$$\begin{aligned} F_M(x,t,r,p,M) \le 0 \quad \text {a.e. in }\quad \Omega \times (0,T)\times {\mathbb {R}}\times {\mathbb {R}}^N \times {{\mathcal {S}}}^N, \end{aligned}$$

and let \(u\in C(\Omega \times (0,T))\) be a solution of (8). Assume that there exists a function \(\chi \in {\mathcal {K}}\), \(0<\eta \le 1\) such that, for any \(K>0\), there exists \(L=L(\eta ,K)\) large enough such that, for \(|p|\ge L\), we have \(F_r(x,t,r,p,M)\ge 0\) and

$$\begin{aligned}&-(1+\eta ) |F_x| |p| (1+\chi (\eta |p|)) - K |F_p| |p|^2 \left( 1+\chi (\eta |p| )\right) \chi (\eta |p| )\\&\quad - \frac{1}{1+\eta } F_M\cdot M^2\ge \eta + K |p|^2\biggl ( \chi ((1+\eta ) |p|)+ \chi (\eta |p|)^{2}\biggr )\; \text {a.e.} \end{aligned}$$

If \(\overline{B(x_0,R)} \subset \Omega \) and \(\delta >0\), then u is Lipschitz continuous in x in \(B(x_0,R/2)\times [\delta , T-\delta ]\) and \(|Du| \le {{\bar{L}}} \) in \(B(x_0,R/2)\times [\delta , T-\delta ]\) where \({\bar{L}}\) depends on F, R, \(\delta \) and the oscillation of u in \(B(x_0,R)\times (\delta /2,T-\delta ]\).

It is worth pointing out that the assumptions of Theorem 4.1 are rather close to those of Theorem 2.1 (iii) and the same computations provide a gradient bound for the evolution equation

$$\begin{aligned} u_t-\mathrm{Tr}(A(x)D^2 u) + |Du|^m = f(x) \quad \text {in }\quad \Omega \times (0,T), \end{aligned}$$
(9)

if \(m>1\).
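As an illustration only, (9) can be written in the form (8) and the quantities entering the structure condition can be computed formally. The differentiability of A and f, the symmetry and nonnegativity of \(A(x)\), and the reading of \(F_M\cdot M^2\) as a trace are assumptions made here for the sake of the computation.

$$\begin{aligned} F(x,t,r,p,M)&= -\mathrm{Tr}(A(x)M) + |p|^m - f(x),\\ F_r&= 0,\qquad F_p = m|p|^{m-2}p,\qquad |F_p| = m|p|^{m-1},\qquad F_M = -A(x),\\ F_{x_i}&= -\mathrm{Tr}(\partial _{x_i}A(x)\, M) - \partial _{x_i} f(x),\qquad -F_M\cdot M^2 = \mathrm{Tr}(A(x)M^2)\ge 0. \end{aligned}$$

In particular, \(F_r\ge 0\) and \(F_M\le 0\) hold under these assumptions.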

Proof of Theorem 4.1

We argue as in the proof of Theorem 2.1 (iii), except that here \(L=L(t)\) with \(L(t) \rightarrow +\infty \) as \(t \rightarrow (\delta /2)^+\). We still choose \(\varphi (t)=t\) and we denote by \(\Gamma '_L\), the subset of points \((x,y,t) \in B(x_0,3R/4) \times B(x_0,R)\times (\delta /2,T-\delta ]\) such that

$$\begin{aligned} L(t)C(x)(|x-y| +\alpha )\le osc_{R,\delta } (u), \end{aligned}$$

where \(osc_{R,\delta } (u)\) denotes the oscillation of u in \(B(x_0,R)\times (\delta /2,T-\delta ]\).

We consider maximum points \((x,y,t) \in \Gamma '_L\) of the function

$$\begin{aligned} (x,y,t)\mapsto u(x,t)-u(y,t) - L(t)C(x)(|x-y| +\alpha ), \end{aligned}$$

and, if \(x\ne y\), we are led to the viscosity inequalities

$$\begin{aligned} a+ F(x,t, u(x,t), p+q,X) \le 0,\quad b+F(y,t,u(y,t), p,Y) \ge 0, \end{aligned}$$

where \((a,p+q,X)\in D^{2,+}u(x,t)\), \((b,p,Y)\in D^{2,-}u(y,t)\) and

$$\begin{aligned} a-b \ge L'(t)C(x)(|x-y|+\alpha ). \end{aligned}$$

As in the proof of Theorem 2.1, the second inequality holds for \({\tilde{Y}}\) as well and subtracting these inequalities, we have

$$\begin{aligned} L'(t)C(x)(|x-y|+\alpha )+ F(x,t,u(x,t), p+q,X) -F(y,t,u(y,t), p,{\tilde{Y}})\le 0. \end{aligned}$$

Then, with the notations of the proof of Theorem 2.1, we introduce

$$\begin{aligned} g(\tau ):= & {} F(X(\tau ), U(\tau ), P(\tau ), Z(\tau ))-\tau ||F_M||_\infty B(R,\eta )\gamma _1 [\chi (C(x))]^2(|x-y|+\alpha )^2\\&+\tau L'(t)C(x)(|x-y|+\alpha ) . \end{aligned}$$

Here we have no information on the signs of g(0) and g(1), we only know that \(g(1)-g(0)\le 0\); therefore, in order to have the contradiction, we have to show that \(g'(\tau ) >0\) for any \(0\le \tau \le 1\) if we choose a function \(L(\cdot )\) such that L(t) is large enough for any \(t\in (\delta /2,T-\delta ]\).
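The contradiction mechanism described above can be written in one line. This sketch assumes that g is absolutely continuous on \([0,1]\), so that the fundamental theorem of calculus applies; positivity of \(g'\) for almost every \(\tau \) is then enough.

$$\begin{aligned} 0 \ge g(1)-g(0) = \int _0^1 g'(\tau )\, d\tau > 0 \quad \text {if } g'(\tau )>0 \text { for a.e. } \tau \in [0,1], \end{aligned}$$

which is impossible.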

The computation of \(g'(\tau )\) and the estimates are done as above; we have just to estimate the new term \(L'(t)C(x)(|x-y|+\alpha )\) which is multiplied by \(\gamma _1\) when we put it inside the bracket. We have

$$\begin{aligned} \gamma _1 L'(t)C(x)(|x-y|+\alpha ) = L(t) L'(t)[C(x)]^2 (1+o_\alpha (1)), \end{aligned}$$

and we choose L as the solution of the ODE

$$\begin{aligned} L'(t)=-k_T L(t)\chi (L(t)),\quad L(T-\delta ) = L_T \;\text {(large enough)}. \end{aligned}$$

By choosing \(k_T>0\) properly, we have \(L((\delta /2)^+)=+\infty \) (notice that \(k_T\) decreases when \(L_T\) increases). Since \(L(t)\le |p|\le (1+\eta ) |P(\tau )|\), we have

$$\begin{aligned} L(t) L'(t)[C(x)]^2 \ge -k_T |P(\tau )|^2 \chi ((1+\eta )|P(\tau )|)\; . \end{aligned}$$

Using this estimate, the conclusion follows as above by applying the assumption on F for K large enough and \(\alpha \) small enough: taking \(L_T \) large enough then yields a contradiction.
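To make the choice of \(L(\cdot )\) and \(k_T\) more concrete, the ODE for L can be integrated by separation of variables. The following is a sketch under two assumptions that are not recalled in this section: \(\chi >0\) with \(s\mapsto 1/(s\chi (s))\) integrable at infinity (presumably part of the definition of \({\mathcal {K}}\)), and \(\delta /2<T-\delta \).

$$\begin{aligned} \int _{L_T}^{L(t)} \frac{ds}{s\chi (s)} = k_T\,(T-\delta -t), \qquad \text {so that}\quad L((\delta /2)^+)=+\infty \quad \text {exactly when}\quad k_T = \frac{1}{T-3\delta /2}\int _{L_T}^{+\infty } \frac{ds}{s\chi (s)}. \end{aligned}$$

This formula is consistent with the remark that \(k_T\) decreases when \(L_T\) increases. As a purely hypothetical example, for \(\chi (s)=s^\beta \) with \(\beta >0\) one finds

$$\begin{aligned} k_T = \frac{L_T^{-\beta }}{\beta (T-3\delta /2)}, \qquad L(t) = \bigl [\beta k_T\,(t-\delta /2)\bigr ]^{-1/\beta }, \end{aligned}$$

and one checks directly that \(L(T-\delta )=L_T\), \(L'(t)=-k_T L(t)^{1+\beta }\) and \(L(t)\rightarrow +\infty \) as \(t\rightarrow (\delta /2)^+\).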