1 Introduction

1.1 The problem

In this paper, we pursue the study of the regularity of local minimizers of degenerate functionals with orthotropic structure, which we already considered in [1,2,3,4]. More precisely, for \(p\ge 2\), we consider local minimizers of the functional

$$\begin{aligned} \mathfrak {F}_0(u,\Omega ')=\sum _{i=1}^N \frac{1}{p}\,\int _{\Omega '} |u_{x_i}|^p\,dx,\qquad \Omega '\Subset \Omega ,\ u\in W^{1,p}_\mathrm{loc}(\Omega '), \end{aligned}$$
(1.1)

and more generally of the functional

$$\begin{aligned} \mathfrak {F}_\delta (u,\Omega ')=\sum _{i=1}^N \frac{1}{p}\,\int _{\Omega '} (|u_{x_i}|-\delta _i)_{+}^p\,dx +\int _{\Omega '}f\,u\,dx ,\qquad \Omega '\Subset \Omega ,\ u\in W^{1,p}_\mathrm{loc}(\Omega '). \end{aligned}$$

Here, \(\Omega \subset \mathbb {R}^N\) is an open set, \(N\ge 2\), and \(\delta _1, \ldots , \delta _N\) are nonnegative numbers.

A local minimizer u of the functional \(\mathfrak {F}_0\) defined in (1.1) is a local weak solution of the orthotropic p-Laplace equation

$$\begin{aligned} \sum _{i=1}^N \left( |u_{x_i}|^{p-2}\,u_{x_i}\right) _{x_i}=0. \end{aligned}$$
(1.2)

For \(p=2\), this is just the Laplace equation, which is uniformly elliptic. For \(p>2\), this looks quite similar to the usual p-Laplace equation

$$\begin{aligned} \sum _{i=1}^N \left( |\nabla u|^{p-2}\,u_{x_i}\right) _{x_i}=0, \end{aligned}$$

whose local weak solutions are local minimizers of the functional

$$\begin{aligned} \mathfrak {I}(u,\Omega ')=\frac{1}{p}\,\int _{\Omega '} |\nabla u|^p\,dx,\qquad \Omega '\Subset \Omega ,\ u\in W^{1,p}_\mathrm{loc}(\Omega '). \end{aligned}$$
(1.3)

However, as explained in [1, 2], equation (1.2) is much more degenerate. Consequently, as far as the regularity of \(\nabla u\) (i.e. its boundedness and continuity) is concerned, the two equations are dramatically different.

In order to understand this discrepancy between the p-Laplacian and its orthotropic version, let us observe that the map \(\xi \mapsto |\xi |^p\) occurring in the definition (1.3) of \(\mathfrak {I}\) degenerates only at the origin, in the sense that its Hessian is positive definite on \(\mathbb {R}^N\setminus \{0\}\). By contrast, the definition of the orthotropic functional \(\mathfrak {F}_0\) in (1.1) is related to the map \(\xi \mapsto \sum _{i=1}^N |\xi _i|^p\), which degenerates on an unbounded set, namely the N hyperplanes orthogonal to the coordinate axes of \(\mathbb {R}^N\).
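To make the comparison concrete, the Hessian of the orthotropic integrand can be computed explicitly: for \(p>2\),

$$\begin{aligned} D^2\left( \sum _{i=1}^N |\xi _i|^p\right) =p\,(p-1)\,\mathrm {diag}\left( |\xi _1|^{p-2},\ldots ,|\xi _N|^{p-2}\right) , \end{aligned}$$

so this matrix is singular as soon as a single component \(\xi _i\) vanishes, whereas the Hessian of \(\xi \mapsto |\xi |^p\) is positive definite at every \(\xi \not =0\).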

The situation is even worse when

$$\begin{aligned} \max \{\delta _i\, :\, i=1,\ldots ,N\}>0, \end{aligned}$$
(1.4)

since the lack of ellipticity of the degenerate p-orthotropic functional now arises on the larger set

$$\begin{aligned} \bigcup _{i=1}^N \{\xi \in \mathbb {R}^N : |\xi _i|\le \delta _i\}. \end{aligned}$$

As a matter of fact, the regularity theory for these very degenerate functionals is far less understood than the corresponding theory for the standard case (1.3) and its variants.

Under suitable integrability assumptions on the function \(f\), the classical theory for functionals with p-growth ensures that the local minimizers of \(\mathfrak {F}_\delta \) are locally bounded and Hölder continuous, see for example [11, Theorems 7.5 & 7.6]. This theory also ensures that the gradients of local minimizers lie in \(L^r_\mathrm{loc}(\Omega )\) for some \(r>p\), see [11, Theorem 6.7].

We also point out that for \(f\in L^\infty _\mathrm{loc}(\Omega )\), local minimizers of \(\mathfrak {F}_\delta \) are contained in \(W^{1,q}_\mathrm{loc}(\Omega )\), for every \(q<+\infty \) (see [3, Main Theorem]).

1.2 Main result

In this paper, we establish the optimal regularity expected for minimizers of \(\mathfrak {F}_\delta \), namely Lipschitz regularity. More precisely, we establish the following result.

Theorem 1.1

Let \(p\ge 2\), \(f\in W^{1,h}_\mathrm{loc}(\Omega )\) for some \(h>N/2\) and let \(U\in W^{1,p}_\mathrm{loc}(\Omega )\) be a local minimizer of the functional \(\mathfrak {F}_\delta \). Then U is locally Lipschitz in \(\Omega \).

Moreover, in the case \(\delta _1=\cdots =\delta _N=0\), we have the following local scaling-invariant estimate: for every ball \(B_{2R_0}\Subset \Omega \), it holds

(1.5)

for some \(C=C(N,p,h)>1\).

Remark 1.2

(Comparison with previous results) This result unifies and substantially extends the results on the orthotropic functional \(\mathfrak {F}_\delta \) contained in [2], where it was established that the local minimizers of \(\mathfrak {F}_\delta \) are locally Lipschitz, provided that:

  • \(p\ge 2\), \(N=2\) and \(f\in W^{1,p'}_\mathrm{loc}(\Omega )\), see [2, Theorem A];

  • \(p\ge 4\), \(N\ge 2\) and \(f\in W^{1,\infty }_\mathrm{loc}(\Omega )\), see [2, Theorem B].

The second result was based on the so-called Bernstein technique, see for example [12, Proposition 2.19]. This technique had already been exploited in the pioneering paper [17] by Ural’tseva and Urdaletova, for a class of functionals which contains the orthotropic functional \(\mathfrak {F}_0\) defined in (1.1), but not its more degenerate version \(\mathfrak {F}_\delta \). Namely, the result of [17] does not cover the case when condition (1.4) is in force.

Still for the case \(\delta _1=\cdots =\delta _N=0\), an entirely different approach relying on viscosity methods has been developed in [6]. To our knowledge, both methods are limited to (at least) bounded lower order terms f.

On the contrary, [2, Theorem A] can be considered as the true ancestor of Theorem 1.1 above. Indeed, both follow Moser’s iteration technique, originally introduced in [16] to establish regularity for uniformly elliptic problems. However, going beyond the two-dimensional setting requires new ideas, which we explain in Sect. 1.3 below.

In contrast to the partial results of [2, Theorems A & B], the proof of Theorem 1.1 does not depend on the dimension and does not require any restriction on p beyond \(p\ge 2\). It allows unbounded lower order terms, even though the condition \(f\in W^{1,h}_\mathrm{loc}(\Omega )\) for some \(h>N/2\) is certainly not sharp. On this point, it is useful to observe that by Sobolev’s embedding we have

$$\begin{aligned} W^{1,h}\hookrightarrow L^{h^*}, \end{aligned}$$

with \(h^*\) larger than N and as close to N as desired, provided h is close to N / 2. This means that, in terms of summability, our assumption on \(f\) amounts to \(f\in L^q_\mathrm{loc}(\Omega )\) for some \(q>N\). This is exactly the sharp condition on f expected for the local minimizers to be locally Lipschitz, at least if one nurtures the (optimistic) hope that the regularity theory for the orthotropic p-Laplacian agrees with that for the standard p-Laplacian.
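Indeed, for \(N/2<h<N\) the Sobolev exponent is \(h^*=N\,h/(N-h)\), and a direct computation gives

$$\begin{aligned} h^*>N\quad \Longleftrightarrow \quad N\,h>N\,(N-h)\quad \Longleftrightarrow \quad h>\frac{N}{2}, \end{aligned}$$

with \(h^*\searrow N\) as \(h\searrow N/2\).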

Our strategy to prove Theorem 1.1 relies on energy methods and integral estimates, and more precisely on ad hoc Caccioppoli-type inequalities. This only requires growth assumptions on the Lagrangian and its derivatives and can be adapted to a large class of functionals. For instance, we briefly explain in the “Appendix” how to adapt our proof to the case of nonlinear lower order terms, i.e. when \(f\, u\) is replaced by a term of the form \(G(x,u)\).

Remark 1.3

We collect in this remark some interesting open issues:

  1. One word about the assumption \(p\ge 2\): as explained in [1, 2], when \(\delta _1=\cdots =\delta _N=0\), the subquadratic case \(1<p<2\) is in a sense simpler. In this case, the desired Lipschitz regularity can be inferred from [8, Theorem 2.2] (see also [9, Theorem 2.7]). However, the more degenerate case (1.4) is open;

  2. In [1, Main Theorem], local minimizers were proven to be \(C^1\) in the two-dimensional case, for \(1<p<\infty \) and \(\delta _1=\cdots =\delta _N=0\). We also refer to the very recent paper [14], where a modulus of continuity for the gradient of local minimizers is exhibited. We do not know whether such a result still holds in higher dimensions;

  3. In [4, Theorem 1.4], local Lipschitz regularity is established in the two-dimensional case for an orthotropic functional with anisotropic growth conditions, that is, for the functional

    $$\begin{aligned} \sum _{i=1}^2 \frac{1}{p_i}\,\int (|u_{x_i}|-\delta _i)_{+}^{p_i}\,dx +\int f\,u\,dx,\qquad \text{ with } 2\le p_1\le p_2. \end{aligned}$$

    For such a functional, Lipschitz regularity is open in higher dimensions, even for the case \(\delta _1=\cdots =\delta _N=0\), i.e. for the functional

    $$\begin{aligned} \sum _{i=1}^N \frac{1}{p_i}\,\int |u_{x_i}|^{p_i}\,dx +\int f\,u\,dx,\qquad \text{ with } 2\le p_1\le p_2\le \cdots \le p_N. \end{aligned}$$

    We point out that in this case, Lipschitz regularity in every dimension has been obtained in [17, Theorem 1] for bounded local minimizers, under the additional restrictions

    $$\begin{aligned} p_1\ge 4\qquad \text{ and } \qquad p_N<2\,p_1. \end{aligned}$$

    Though these restrictions are not optimal, we recall that regularity cannot be expected when \(p_N\) and \(p_1\) are too far apart, due to the well-known counterexamples by Giaquinta [10] and Marcellini [15].

1.3 Technical novelties of the proof

Our main result is obtained by considering a regularized problem having a unique smooth solution converging to our local minimizer, and proving a local Lipschitz estimate independent of the regularization parameter.

At first sight, the strategy to prove such an estimate may seem quite standard:

(a):

differentiate equation (1.2);

(b):

obtain Caccioppoli-type inequalities for convex powers of the components \(u_{x_k}\) of the gradient;

(c):

derive an iterative scheme of reverse Hölder’s inequalities;

(d):

iterate and obtain the desired local \(L^\infty \) estimate on \(\nabla u\).

However, steps (b) and (c) are quite involved, due to the degeneracy of our equation, and their concrete realization is fairly intricate. Thus, in order to introduce the reader to the proof smoothly, we spend a few words on these two steps.

We point out that our proof is not a mere adaptation of techniques used for the p-Laplace equation. Moreover, it does not even rely on the ideas developed in [2] for the two-dimensional case. In a nutshell, we need new ideas to deal with our functional in full generality.

In order to obtain “good” Caccioppoli-type inequalities for the gradient, we exploit an idea introduced in nuce in [1]. This consists in differentiating (1.2) in the direction \(x_j\) and then testing the resulting equation with a test function of the form

$$\begin{aligned} u_{x_j}|u_{x_j}|^{2s-2}\,|u_{x_k}|^{2m}, \end{aligned}$$

with \(1\le s\le m\). This leads to an estimate of the type (see Proposition 4.1)

$$\begin{aligned}&\sum _{i=1}^N \int |u_{x_i}|^{p-2}\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,s-2}\,|u_{x_k}|^{2\,m}\,\,dx\nonumber \\&\quad \le C\, \sum _{i=1}^N \int |u_{x_i}|^{p-2}\,\left( |u_{x_j}|^{2\,s+2\,m} +|u_{x_k}|^{2\,s+2\,m}\right) \,dx \nonumber \\&\qquad + \sum _{i=1}^N \int |u_{x_i}|^{p-2}\,u_{x_i x_j}^2\,|u_{x_j}|^{4\,s-2}\,|u_{x_k}|^{2\,m-2\,s}\,dx. \end{aligned}$$
(1.6)

Then the idea is the following: let us suppose that we are interested in improving the summability of the component \(u_{x_k}\). Ideally, we would like to take \(s=1\) in (1.6), since in this case the left-hand side boils down to

$$\begin{aligned} \sum _{i=1}^N \int |u_{x_i}|^{p-2}\,u_{x_i x_j}^2\,|u_{x_k}|^{2\,m}\,dx\ge & {} \int |u_{x_k}|^{p-2}\,u_{x_k x_j}^2\,|u_{x_k}|^{2\,m}\,dx\\&\simeq \int \left| \left( |u_{x_k}|^{\frac{p}{2}+m}\right) _{x_j}\right| ^2\,dx. \end{aligned}$$

If we now sum over \(j=1,\ldots ,N\), this would give a control on the \(W^{1,2}\) norms of convex powers of \(u_{x_k}\). But there is a drawback here: this \(W^{1,2}\) norm is still estimated in terms of the Hessian of u, which appears in the right-hand side of (1.6). Observe that (1.6) has the following form

$$\begin{aligned} \mathcal {I}(s-1,m)\le & {} C\,\sum _{i=1}^N \int |u_{x_i}|^{p-2}\, \left( |u_{x_j}|^{2\,s+2\,m}+|u_{x_k}|^{2\,s+2\,m}\right) \,dx\nonumber \\&+\mathcal {I}(2\,s-1,m-s), \end{aligned}$$
(1.7)

where

$$\begin{aligned} \mathcal {I}(s,m)=\sum _{i=1}^N \int |u_{x_i}|^{p-2}\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,s}\,|u_{x_k}|^{2\,m}\,\,dx. \end{aligned}$$

This suggests performing a finite iteration of (1.7) for \(s=s_i\) and \(m=m_i\) such that

$$\begin{aligned} \left\{ \begin{array}{lll} 2\,s_i-1&{}=&{}s_{i+1}-1\\ s_0&{}=&{}1 \end{array}\right. \qquad \text{ and } \qquad m_i-s_i=m_{i+1},\qquad \text{ for } i=0,\ldots ,\ell . \end{aligned}$$

The number \(\ell \) is chosen so that we stop the iteration when we reach \(m_\ell =0\). The above conditions imply that for every \(i=0,\ldots ,\ell \), we have

$$\begin{aligned} m_i+s_i=m_0+s_0=2^\ell . \end{aligned}$$
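Indeed, the first condition gives \(s_{i+1}=2\,s_i\) with \(s_0=1\), i.e. \(s_i=2^i\), while the quantity \(m_i+s_i\) is invariant along the iteration:

$$\begin{aligned} m_{i+1}+s_{i+1}=(m_i-s_i)+2\,s_i=m_i+s_i. \end{aligned}$$

Since \(m_\ell =0\), this forces \(m_0+s_0=m_\ell +s_\ell =s_\ell =2^\ell \).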

In this way, after a finite number of steps (comparable to \(\ell \)), the coupling between \(u_{x_k}\) and the Hessian of u contained in the term \(\mathcal {I}\) will disappear from the right-hand side. In other words, we will end up with an estimate of the type

$$\begin{aligned} \begin{aligned} \int \left| \nabla |u_{x_k}|^{2^\ell +\frac{p-2}{2}}\right| ^2\,dx&\le C\,\sum _{i,j=1}^N\int |u_{x_i}|^{p-2}\,\left( |u_{x_j}|^{2^{\ell +1}} +|u_{x_k}|^{2^{\ell +1}}\right) \,dx\\&\quad +\sum _{i=1}^N \int |u_{x_i}|^{p-2}\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,(2^{\ell }-1)}\,dx. \end{aligned} \end{aligned}$$
(1.8)

Observe that we still have the Hessian of u in the right-hand side (this is the second term), but this time it is harmless. It is sufficient to use the standard Caccioppoli inequality (3.3) for the gradient, which reads

$$\begin{aligned} \sum _{i=1}^N \int |u_{x_i}|^{p-2}\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,(2^{\ell }-1)}\,dx\lesssim \sum _{i=1}^N \int |u_{x_i}|^{p-2}\,|u_{x_j}|^{2^{\ell +1}}\,dx, \end{aligned}$$

and the last term is already contained in the right-hand side of (1.8). All in all, by applying the Sobolev inequality to the left-hand side of (1.8), we get the following type of self-improving information

$$\begin{aligned} \nabla u\in L^{2\,\gamma }(B_R)\qquad \Longrightarrow \qquad \nabla u\in L^{2^*\gamma }(B_r),\qquad \text{ where } \text{ we } \text{ set } \gamma =\frac{p-2}{2}+2^\ell . \end{aligned}$$

In this way, we obtain an iterative scheme of reverse Hölder’s inequalities. This is Step 1 in the proof of Proposition 5.1 below. Thus, apparently, we safely land in step (c) of the strategy described above.

We now want to pass to step (d) and iterate the previous information infinitely many times. The goal would be to define the diverging sequence of exponents \(\gamma _\ell \) by

$$\begin{aligned} \gamma _{\ell }=\frac{p-2}{2}+2^\ell ,\qquad \ell \ge 1, \end{aligned}$$

and conclude by iterating

$$\begin{aligned} \nabla u\in L^{2\,\gamma _\ell }(B_R) \qquad \Longrightarrow \qquad \nabla u \in L^{2^*\gamma _\ell }(B_r). \end{aligned}$$
(1.9)

Once again, there is a drawback. Indeed, observe that by definition

$$\begin{aligned} \frac{2^*}{2}\,\gamma _\ell \not = \gamma _{\ell +1}. \end{aligned}$$

One may think that this is not a big issue: indeed, it would be sufficient to have

$$\begin{aligned} \gamma _{\ell +1}\le \frac{2^*}{2}\,\gamma _\ell , \end{aligned}$$
(1.10)

then an application of Hölder’s inequality in (1.9) would lead us to

$$\begin{aligned} \nabla u\in L^{2\,\gamma _\ell }(B_R) \qquad \Longrightarrow \qquad \nabla u \in L^{2\,\gamma _{\ell +1}}(B_r), \end{aligned}$$

and we could chain all the estimates together. However, since the ratio \(2^*/2\) tends to 1 as the dimension N goes to \(\infty \), it is easy to see that (1.10) cannot be true in general. More precisely, such a condition holds only up to dimension \(N=4\).
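Indeed, \(2^*/2=N/(N-2)\), while

$$\begin{aligned} \lim _{\ell \rightarrow \infty }\frac{\gamma _{\ell +1}}{\gamma _\ell }=\lim _{\ell \rightarrow \infty }\frac{\dfrac{p-2}{2}+2^{\ell +1}}{\dfrac{p-2}{2}+2^{\ell }}=2, \end{aligned}$$

so (1.10) can hold for every \(\ell \) only if \(N/(N-2)\ge 2\), that is, only if \(N\le 4\).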

The idea is then to go back to (1.9) and use interpolation in Lebesgue spaces in order to construct a Moser’s scheme “without holes”. In a nutshell, we control the term

$$\begin{aligned} \int _{B_R} |\nabla u|^{2\,\gamma _\ell }\,dx, \end{aligned}$$

with

$$\begin{aligned} \int _{B_R} |\nabla u|^{2\,\gamma _{\ell -1}}\,dx\qquad \text{ and } \qquad \int _{B_R} |\nabla u|^{2^*\,\gamma _\ell }\,dx, \end{aligned}$$

and use an iteration over shrinking radii in order to absorb the last term, see Step 2 of the proof of Proposition 5.1. Once this is done, we end up with the updated self-improving information

$$\begin{aligned} \nabla u\in L^{2\,\gamma _{\ell -1}}(B_R) \quad \Longrightarrow \quad \nabla u \in L^{2^*\gamma _\ell }(B_r). \end{aligned}$$

What we gain is that now \(2^*\,\gamma _\ell> 2\,\gamma _\ell >2\,\gamma _{\ell -1}\), thus by using Hölder’s inequality we obtain

$$\begin{aligned} \nabla u\in L^{2\,\gamma _{\ell -1}}(B_R) \quad \Longrightarrow \quad \nabla u \in L^{2\,\gamma _\ell }(B_r). \end{aligned}$$

This information comes with a precise iterative estimate and a good control on the relevant constants. We can thus launch Moser’s iteration procedure and obtain the desired \(L^\infty \) estimate, see Step 3 of the proof of Proposition 5.1.

There is still a small detail that needs some care: the first exponent of the iteration is

$$\begin{aligned} 2\,\gamma _0=p+2, \end{aligned}$$

which means that we obtain an \(L^\infty -L^{p+2}\) local estimate on \(\nabla u\). Finally, in order to obtain the desired \(L^\infty -L^p\) estimate, one can simply use an interpolation argument (this is Step 4 of the proof of Proposition 5.1).
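A heuristic version of such an interpolation argument, schematically ignoring the dependence of the constants on the radii (the actual Step 4 is quantitative and absorbs the \(L^\infty \) factor through an iteration over shrinking radii, in the spirit of Lemma 2.5), runs as follows:

$$\begin{aligned} \Vert \nabla u\Vert _{L^\infty (B_r)}\le C\,\left( \int _{B_R} |\nabla u|^{p+2}\,dx\right) ^{\frac{1}{p+2}}\le C\,\Vert \nabla u\Vert _{L^\infty (B_R)}^{\frac{2}{p+2}}\,\left( \int _{B_R} |\nabla u|^{p}\,dx\right) ^{\frac{1}{p+2}}, \end{aligned}$$

and Young’s inequality then permits to absorb the \(L^\infty \) norm, leaving an \(L^\infty -L^p\) estimate.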

1.4 Plan of the paper

In Sect. 2, we define the approximation scheme and set up all the needed machinery. Sect. 3 is dedicated to the new Caccioppoli inequalities, which mix the derivatives of the gradient with respect to two orthogonal directions. In Sect. 4, we exploit these Caccioppoli inequalities to establish integrability estimates on power functions of the gradient. In the subsequent section, we rely on these estimates to construct a Moser iteration scheme, which finally leads to the uniform a priori estimate of Proposition 5.1.

For ease of readability, in both Sects. 4 and 5 we first consider the case \(f=0\) and \(\delta =0\), in order to emphasize the main ideas and novelties of our approach. We then explain in Sects. 4.2 and 5.2, respectively, the technicalities needed to cover the general case \(f\in W^{1,h}_\mathrm{loc}(\Omega )\) and \(\max \{\delta _i\, : \, i=1,\ldots ,N\} >0\).

Finally, in “Appendix”, we generalize Theorem 1.1 to nonlinear lower order terms.

2 Preliminaries

We will use the same approximation scheme as in [2, Section 2]. We introduce the notation

$$\begin{aligned} g_i(t)=\frac{1}{p}\, (|t|-\delta _i)^p_+,\qquad t\in \mathbb {R},\ i=1,\ldots ,N, \end{aligned}$$

where \(0\le \delta _1,\ldots ,\delta _N\) are given real numbers and we also set

$$\begin{aligned} \delta =1+\max \{\delta _i\, :\, i=1,\ldots ,N\}. \end{aligned}$$
(2.1)

We are interested in local minimizers of the following variational integral

$$\begin{aligned} \mathfrak {F}_\delta (u;\Omega ')=\sum _{i=1}^N \int _{\Omega '} g_i(u_{x_i})\, dx+\int _{\Omega '} f\, u\, dx,\qquad u\in W^{1,p}_\mathrm{loc}(\Omega ), \end{aligned}$$

where \(\Omega '\Subset \Omega \) and \(f\in W^{1,h}_\mathrm{loc}(\Omega )\) for some \(h>N/2\). The latter implies that

$$\begin{aligned} f\in L^{h^*}_\mathrm{loc}(\Omega ) \subset L^{N}_\mathrm{loc}(\Omega ) \subset L^{p'}_\mathrm{loc}(\Omega ). \end{aligned}$$

The last inclusion is a consequence of the fact that \(p\ge 2\) and \(N\ge 2\). The condition \(f\in L^{p'}_\mathrm{loc}\) is exactly the one required in [2, Section 2] to justify the approximation scheme that we now describe.

We set

$$\begin{aligned} g_{i,\varepsilon }(t)=g_i(t)+\frac{\varepsilon }{2}\, t^2=\frac{1}{p}\, (|t|-\delta _i)_+^{p}+\frac{\varepsilon }{2}\, t^2,\qquad t\in \mathbb {R}. \end{aligned}$$
(2.2)

Remark 2.1

For \(p=2\) and \(\delta _i>0\), we have \(g_i\in C^{1,1}(\mathbb {R})\cap C^\infty (\mathbb {R}\setminus \{\delta _i,-\delta _i\})\), but \(g_i\) is not \(C^2\). In this case, as in [3, Section 2], one would need to replace \(g_i\) by a regularized version, in particular for the \(C^2\) regularity result of Lemma 2.2 below. In order not to overburden the presentation, we prefer not to write down this regularization explicitly and keep using the same symbol \(g_i\).

From now on, we fix a local minimizer U of \(\mathfrak {F}_\delta \). We also fix a ball

$$\begin{aligned} B \Subset \Omega \quad \text{ such } \text{ that } \quad 2\,B\Subset \Omega \text{ as } \text{ well }. \end{aligned}$$

Here \(\lambda \,B\) denotes the ball having the same center as B, scaled by a factor \(\lambda >0\).

For every \(0<\varepsilon \ll 1\) and every \(x\in \overline{B}\), we set \(U_\varepsilon (x)=U*\varrho _\varepsilon (x)\), where \(\varrho _\varepsilon \) is a smooth convolution kernel, supported in a ball of radius \(\varepsilon \) centered at the origin.

Finally, we define

$$\begin{aligned} \mathfrak {F}_{\delta ,\varepsilon }(v;B)=\sum _{i=1}^N \int _B g_{i,\varepsilon }(v_{x_i})\, dx+\int _B f_{\varepsilon }\, v\, dx, \end{aligned}$$

where \(f_{\varepsilon }=f*\varrho _{\varepsilon }\). The following preliminary result is standard, see [2, Lemma 2.5 and Lemma 2.8].

Lemma 2.2

(Basic energy estimate) There exists \(\varepsilon _0>0\) such that for every \(0<\varepsilon \le \varepsilon _0<1\), the problem

$$\begin{aligned} \min \left\{ \mathfrak {F}_{\delta ,\varepsilon }(v;B)\, :\, v-U_\varepsilon \in W^{1,p}_0(B)\right\} , \end{aligned}$$
(2.3)

admits a unique solution \(u_\varepsilon \). Moreover, there exists a constant \(C=C(N,p)>0\) such that the following uniform estimate holds

$$\begin{aligned} \int _B |\nabla u_\varepsilon |^p\, dx\le C\,\left[ \int _{2\,B} |\nabla U|^p\,dx+|B|^\frac{p'}{N}\,\int _{2\,B} |f|^{p'}\,dx+(\varepsilon _0+(\delta -1)^p)|B|\,\right] . \end{aligned}$$

Finally, \(u_\varepsilon \in C^{2}(B)\).

We also rely on the following stability result, which is slightly more precise than [2, Lemma 2.9].

Lemma 2.3

(Convergence to a minimizer) With the same notation as before, there exists a sequence \(\{\varepsilon _k\}_{k\in \mathbb {N}}\subset (0,\varepsilon _0)\) converging to 0, such that

$$\begin{aligned} \lim _{k\rightarrow \infty } \Vert u_{\varepsilon _k}-\widetilde{u}\Vert _{L^p(B)}=0, \end{aligned}$$

where \(\widetilde{u}\) is a solution of

$$\begin{aligned} \min \left\{ \mathfrak {F}_\delta (v;B)\, :\, v-U\in W^{1,p}_0(B)\right\} . \end{aligned}$$

We also have

$$\begin{aligned} \Big |\widetilde{u}_{x_i} -U_{x_i}\Big | \le 2\,\delta _i,\qquad \text{ for } \text{ a.e. } x\in B,\quad i=1,\ldots ,N. \end{aligned}$$
(2.4)

In the case \(\delta =1\), i.e. when \(\delta _1=\cdots =\delta _N=0\), we have \(\widetilde{u}=U\) and the stronger convergence

$$\begin{aligned} \lim _{k\rightarrow \infty } \Vert u_{\varepsilon _k}-U\Vert _{W^{1,p}(B)}=0. \end{aligned}$$
(2.5)

Proof

The first part is proven in [2, Lemma 2.9], while (2.4) is proven in [2, Lemma 2.3]. For the case \(\delta =1\), we observe that \(\widetilde{u}=U\) follows from the strict convexity of the functional, together with the local minimality of U. In order to prove (2.5), we observe that

$$\begin{aligned} \begin{aligned} \left| \sum _{i=1}^N\frac{1}{p}\,\int _B \left| (u_{\varepsilon _k})_{x_i} \right| ^p\,dx-\sum _{i=1}^N\frac{1}{p}\int _B \left| U_{x_i}\right| ^p\,dx \right|&\le \left| \mathfrak {F}_{\delta ,\varepsilon _k} (u_{\varepsilon _k};B)-\mathfrak {F}_\delta (U;B)\right| \\&\quad +\frac{\varepsilon _k}{2}\,\int _B |\nabla u_{\varepsilon _k}|^2\,dx\\&\quad +\left| \int _B f_{\varepsilon _k}\,u_{\varepsilon _k}\,dx-\int _B f\,U\,dx\right| . \end{aligned} \end{aligned}$$

We now use that \(\{u_{\varepsilon _k}\}_{k\in \mathbb {N}}\) strongly converges in \(L^p(B)\), is bounded in \(W^{1,p}(B)\) and that \(\{f_{\varepsilon _k}\}_{k\in \mathbb {N}}\) strongly converges in \(L^{p'}(B)\) to f. By further using that (see the proof of [2, Lemma 2.9])

$$\begin{aligned} \lim _{k\rightarrow \infty }\left| \mathfrak {F}_{\delta ,\varepsilon _k} (u_{\varepsilon _k};B)-\mathfrak {F}_\delta (U;B)\right| =0, \end{aligned}$$

we finally get

$$\begin{aligned} \lim _{k\rightarrow \infty } \sum _{i=1}^N\int _B \left| (u_{\varepsilon _k})_{x_i}\right| ^p\,dx=\sum _{i=1}^N\int _B \left| U_{x_i}\right| ^p\,dx. \end{aligned}$$
(2.6)

Observe that by Clarkson’s inequality for \(p\ge 2\), we obtain

$$\begin{aligned}&\sum _{i=1}^N\left\| \frac{(u_{\varepsilon _k})_{x_i} +U_{x_i}}{2}\right\| ^p_{L^p(B)}+\sum _{i=1}^N\left\| \frac{(u_{\varepsilon _k})_{x_i}-U_{x_i}}{2}\right\| ^p_{L^p(B)}\\&\quad \le \frac{1}{2}\,\left( \sum _{i=1}^N \Vert (u_{\varepsilon _k})_{x_i}\Vert ^p_{L^p(B)}+\sum _{i=1}^N \Vert U_{x_i}\Vert ^p_{L^p(B)}\right) . \end{aligned}$$

By using this and (2.6), we eventually get (2.5). \(\square \)

Remark 2.4

Observe that the functional \(\mathfrak {F}_\delta \) is not strictly convex when \(\delta >1\). Thus property (2.4) is useful in order to transfer a Lipschitz estimate from the minimizer \(\widetilde{u}\) selected in the limit to the chosen one U.

Finally, we will repeatedly use the following classical result, see [11, Lemma 6.1] for a proof.

Lemma 2.5

Let \(0<r<R\) and let \(Z:[r,R]\rightarrow [0,\infty )\) be a bounded function. Assume that for \(r\le t<s\le R\) we have

$$\begin{aligned} Z(t)\le \frac{\mathcal {A}}{(s-t)^{\alpha _0}} +\frac{\mathcal {B}}{(s-t)^{\beta _0}}+\mathcal {C}+\vartheta \,Z(s), \end{aligned}$$

with \(\mathcal {A},\mathcal {B},\mathcal {C}\ge 0\), \(\alpha _0\ge \beta _0>0\) and \(0\le \vartheta <1\). Then we have

$$\begin{aligned} Z(r)\le \left( \frac{1}{(1-\lambda )^{\alpha _0}}\, \frac{\lambda ^{\alpha _0}}{\lambda ^{\alpha _0}-\vartheta }\right) \, \left[ \frac{\mathcal {A}}{(R-r)^{\alpha _0}}+\frac{\mathcal {B}}{(R-r)^{\beta _0}}+\mathcal {C}\right] , \end{aligned}$$

where \(\lambda \) is any number such that

$$\begin{aligned} \vartheta ^\frac{1}{\alpha _0}<\lambda <1. \end{aligned}$$

3 Caccioppoli-type inequalities

The solution \(u_\varepsilon \) of the regularized problem (2.3) satisfies the Euler-Lagrange equation

$$\begin{aligned} \sum _{i=1}^N \int g'_{i,\varepsilon }((u_\varepsilon )_{x_i})\, \varphi _{x_i}\, dx+\int f_\varepsilon \, \varphi \, dx=0,\qquad \varphi \in W^{1,p}_0(B). \end{aligned}$$
(3.1)

From now on, in order to simplify the notation, we will systematically drop the subscript \(\varepsilon \) on \(u_\varepsilon \) and \(f_\varepsilon \) and simply write u and \(f\), respectively.

We now insert in (3.1) a test function of the form \(\varphi =\psi _{x_j}\in W^{1,p}_0(B)\), where \(\psi \) is compactly supported in B. An integration by parts then yields

$$\begin{aligned} \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\, u_{x_i\,x_j}\, \psi _{x_i}\, dx+\int f_{x_j}\,\psi \,dx=0, \end{aligned}$$
(3.2)

for \(j=1,\ldots ,N\). This is the equation solved by \(u_{x_j}\).

We refer to [2, Lemma 3.2] for a proof of the following Caccioppoli inequality:

Lemma 3.1

Let \(\Phi :\mathbb {R}\rightarrow \mathbb {R}^+\) be a \(C^1\) convex function. Then there exists a universal constant \(C>0\) such that for every function \(\eta \in C^\infty _0(B)\) and every \(j=1,\ldots ,N\), we have

$$\begin{aligned}&\sum _{i=1}^N \int g''_{i,\varepsilon }(u_{x_i})\,\left| \left( \Phi (u_{x_j})\right) _{x_i}\right| ^2\, \eta ^2\, dx\nonumber \\&\quad \le C\,\sum _{i=1}^N \int g''_{i,\varepsilon }(u_{x_i})\,| \Phi (u_{x_j})|^2\, \eta _{x_i}^2\, dx+C\,\int |f_{x_j}|\, | \Phi '(u_{x_j})|\, |\Phi (u_{x_j})|\,\eta ^2\, dx.\nonumber \\ \end{aligned}$$
(3.3)

We need a more elaborate Caccioppoli-type inequality for the gradient, which is reminiscent of [1, Proposition 3.1].

Proposition 3.2

(Weird Caccioppoli inequality) Let \(\Phi , \Psi :[0,+\infty )\rightarrow [0,+\infty )\) be two non-decreasing continuous functions. We further assume that \(\Psi \) is convex and \(C^1\). Then there exists a universal constant \(C>0\) such that for every \(\eta \in C^\infty _0(B)\), \(0\le \theta \le 2\) and \(k,j=1,\dots ,N\),

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\,\eta ^2\,dx\\&\quad \le C\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i}) \,u_{x_j}^2\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,|\nabla \eta |^2\,dx\\&\qquad +C\,\left( \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,u_{x_j}^2\,\Phi (u_{x_j}^2)^2\,\Psi '(u_{x_k}^2)^\theta \,\eta ^2\,dx\right) ^\frac{1}{2}\\&\qquad \times \left[ \left( \sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i}) \,|u_{x_k}|^{2\theta }\,\Psi (u_{x_k}^2)^{2-\theta }\,|\nabla \eta |^2\,dx \right) ^\frac{1}{2}+ \mathcal {E}_1(f)^\frac{1}{2}\right] + C\,\mathcal {E}_2(f) \end{aligned} \end{aligned}$$
(3.4)

where

$$\begin{aligned} \mathcal {E}_1(f):= & {} \int |f_{x_k}|\, |u_{x_k}|^{\theta +1}\, \Big |\Psi (u_{x_k}^2)\,\Psi '(u_{x_k}^2)\Big |^{1 -\frac{\theta }{2}}\,\eta ^2\,dx,\\ \mathcal {E}_2(f):= & {} \int |f_{x_j}|\,|u_{x_j}|\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\eta ^2\,dx. \end{aligned}$$

Proof

By a standard approximation argument, one can assume that \(\Phi \) is \(C^1\) as well. We take in (3.2)

$$\begin{aligned} \varphi =u_{x_j}\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\eta ^2. \end{aligned}$$

This gives

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,\Big (\Phi (u_{x_j}^2)+ 2u_{x_j}^2\,\Phi '(u_{x_j}^2)\Big )\,\Psi (u_{x_k}^2)\,\,\eta ^2\,dx\\&\quad =-2\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}\,u_{x_j}\,\Phi (u_{x_j}^2)\, \Psi (u_{x_k}^2)\,\,\eta \,\eta _{x_i}\,dx\\&\qquad -2\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}\,u_{x_j}\,u_{x_i x_k}\,u_{x_k}\, \Psi '(u_{x_k}^2)\,\Phi (u_{x_j}^2)\,\,\eta ^2\,dx\\&\qquad -\int f_{x_j}\,u_{x_j}\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\, \eta ^2\,dx=:\mathcal {A}_1 +\mathcal {A}_2+\mathcal {A}_3. \end{aligned} \end{aligned}$$
(3.5)

We now estimate the three terms \(\mathcal {A}_\ell \). We have

$$\begin{aligned} \begin{aligned} \mathcal {A}_1&\le \frac{1}{2}\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\eta ^2\,dx\\&\quad +2\, \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\, u_{x_j}^2\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\eta _{x_i}^2\,dx \end{aligned} \end{aligned}$$

and the integral containing the Hessian of u can be absorbed in the left-hand side of (3.5). Using also that \(2\,u_{x_j}^2\,\Phi '(u_{x_j}^2) \ge 0\), this yields

$$\begin{aligned} \begin{aligned}&\frac{1}{2}\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\,\eta ^2\,dx\\&\quad \le 2\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i}) \,u_{x_j}^2\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\eta _{x_i}^2\,dx +\mathcal {A}_2+\mathcal {A}_3. \end{aligned} \end{aligned}$$
(3.6)

We now estimate \(\mathcal {A}_2\), which is the most delicate term: writing \(\Psi '(u_{x_k}^2)\!=\!\Psi '(u_{x_k}^2)^{\frac{\theta }{2}}\,\Psi '(u_{x_k}^2)^{1-\frac{\theta }{2}}\) and using the Cauchy-Schwarz inequality, we get

$$\begin{aligned} \begin{aligned} \mathcal {A}_2&\le 2\,\left( \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,u_{x_j}^2\, \Phi (u_{x_j}^2)^2\,\Psi '(u_{x_k}^2)^{\theta }\,\eta ^2\, dx\right) ^\frac{1}{2}\\&\times \left( \sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_k}^2\,u_{x_k}^2\,\Psi '(u_{x_k}^2)^{2-\theta }\,\eta ^2\, dx\right) ^\frac{1}{2}. \end{aligned} \end{aligned}$$
(3.7)

We observe that

$$\begin{aligned} \sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_k}^2\,u_{x_k}^2\,\Psi '(u_{x_k}^2)^{2-\theta }\,\eta ^2\, dx=\frac{1}{4}\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,\left| \left( G(u_{x_k})\right) _{x_i} \right| ^2\,\eta ^2\,dx, \end{aligned}$$

where \(G\) is the convex nonnegative \(C^1\) function defined by

$$\begin{aligned} G(t)=\int _0^{t^2} \Psi '(\tau )^{1-\frac{\theta }{2}}\,d\tau . \end{aligned}$$

Thus by the Caccioppoli inequality (3.3) with \(x_k\) in place of \(x_j\) and

$$\begin{aligned} \Phi (t)=G(t),\qquad t\in \mathbb {R}, \end{aligned}$$

we get

$$\begin{aligned} \begin{aligned} \sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_k}^2\,u_{x_k}^2\,\Psi '(u_{x_k}^2)^{2-\theta }\,\eta ^2\,dx&\le \, C\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\, G(u_{x_k})^2\,\eta _{x_i}^2\,dx\\&\;+\, C\,\int |f_{x_k}|\,\Big |G(u_{x_k})\, G'(u_{x_k})\Big |\,\eta ^2\,dx. \end{aligned} \end{aligned}$$

By Jensen’s inequality

$$\begin{aligned} 0\le G(u_{x_k})\le |u_{x_k}|^{\theta }\left( \int _{0}^{u_{x_k}^2} \Psi '(\tau )\,d\tau \right) ^{1-\frac{\theta }{2}}\le |u_{x_k}|^{\theta }\,\Psi (u_{x_k}^2)^{1-\frac{\theta }{2}}. \end{aligned}$$

Together with the fact that \(G'(u_{x_k})=2\,u_{x_k}\Psi '(u_{x_k}^2)^{1-\frac{\theta }{2}}\), this implies

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_k}^2\,u_{x_k}^2\,\Psi '(u_{x_k}^2)^{2-\theta }\,\eta ^2\,dx \le C\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\, |u_{x_k}|^{2\theta }\,\Psi (u_{x_k}^2)^{2-\theta }\,\eta _{x_i}^2\,dx\\&\quad + C\,\int |f_{x_k}|\,|u_{x_k}|^{\theta +1}\,\Big |\Psi (u_{x_k}^2) \,\Psi '(u_{x_k}^2)\Big |^{1-\frac{\theta }{2}}\,\eta ^2\,dx, \end{aligned} \end{aligned}$$

which in turn, by (3.6) and (3.7), yields

$$\begin{aligned} \begin{aligned}&\frac{1}{2}\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\,\eta ^2\,dx\\&\quad \le 2\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_j}^2\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\eta _{x_i}^2\,dx\\&\qquad + C\,\left( \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,u_{x_j}^2\,\Phi (u_{x_j}^2)^2\,\Psi '(u_{x_k}^2)^{\theta }\,\eta ^2\,dx\right) ^\frac{1}{2}\\&\qquad \times \left[ \left( \sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\theta }\,\Psi (u_{x_k}^2)^{2-\theta }\,\eta _{x_i}^2\,dx \right) ^\frac{1}{2}\right. \\&\qquad \left. + \left( \int |f_{x_k}|\,|u_{x_k}|^{\theta +1}\,\Big |\Psi (u_{x_k}^2)\,\Psi '(u_{x_k}^2)\Big |^{1-\frac{\theta }{2}}\,\eta ^2\,dx\right) ^\frac{1}{2}\right] +\mathcal {A}_3. \end{aligned} \end{aligned}$$

Here, we have also used the inequality \((A+B)^{1/2} \le A^{1/2} + B^{1/2}.\)

Finally,

$$\begin{aligned} \mathcal {A}_3 \le C\,\int |f_{x_j}|\,|u_{x_j}|\,\Phi (u_{x_j}^2)\,\Psi (u_{x_k}^2)\,\eta ^2\,dx. \end{aligned}$$

This completes the proof. \(\square \)
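As an aside, the Jensen bound used in the proof can be sanity-checked numerically in the model case \(\Psi (t)=t^m\). This is only an illustration with sample parameter values, not part of the argument; the integral defining \(G\) is approximated by a midpoint Riemann sum.

```python
# Numerical sanity check (not part of the proof) of the Jensen bound
#   G(t) = \int_0^{t^2} Psi'(tau)^{1-theta/2} dtau  <=  |t|^theta * Psi(t^2)^{1-theta/2},
# in the model case Psi(t) = t^m.  Parameter values below are illustrative.

def G(t, m, theta, n=20_000):
    # midpoint Riemann sum of Psi'(tau)^{1-theta/2} = (m tau^{m-1})^{1-theta/2} on [0, t^2]
    h = t * t / n
    return sum((m * ((i + 0.5) * h) ** (m - 1)) ** (1 - theta / 2) for i in range(n)) * h

def jensen_gap(t, m, theta):
    rhs = abs(t) ** theta * (t * t) ** (m * (1 - theta / 2))  # |t|^theta * Psi(t^2)^{1-theta/2}
    return rhs - G(t, m, theta)

ok = all(jensen_gap(t, m, theta) >= -1e-6
         for t in (0.5, 1.0, 2.0)
         for m in (2, 3)
         for theta in (0.25, 0.5, 1.0))
```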

4 Local energy estimates for the regularized problem

In order to emphasize the main ideas of the proof, we have divided this section into two parts. In the first one, we explain how (3.4) leads to higher integrability estimates for the gradient when \(f=0\) and \(\delta =1\). This allows us to ignore a certain amount of technicalities. In the second part, we detail the modifications needed to obtain the corresponding estimates in the general case.

4.1 The homogeneous case

In this subsection, we assume that \(f=0\) and \(\delta =1\). Then the two terms \(\mathcal {E}_1(f)\) and \(\mathcal {E}_2(f)\) in (3.4) vanish. Also observe that in this case from (2.2) we have

$$\begin{aligned} g_{i,\varepsilon }''(t)=(p-1)\,|t|^{p-2}+\varepsilon . \end{aligned}$$

Let us single out a particular case of Proposition 3.2 by taking

$$\begin{aligned} \Phi (t)=t^{s-1}\qquad \text{ and } \qquad \Psi (t)=t^m,\qquad \text{ for } t\ge 0, \end{aligned}$$
(4.1)

with \(1\le s \le m\).

Proposition 4.1

(Staircase to the full Caccioppoli) Let \(p\ge 2\) and let \(\eta \in C^{\infty }_0(B)\). Then for every \(k,j=1, \ldots , N\) and \(1\le s\le m\),

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,s-2}\,|u_{x_k}|^{2\,m}\,\,\eta ^2\,dx\\&\quad \le C\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_j}|^{2\,s+2\,m}\,|\nabla \eta |^2\,dx\\&\qquad + C\,(m+1)\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,s+2\,m}\,|\nabla \eta |^2\,dx\\&\qquad + \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{4\,s-2}\,|u_{x_k}|^{2\,m-2\,s}\,\eta ^2\,dx. \end{aligned} \end{aligned}$$
(4.2)

Proof

We use (3.4) with the choices (4.1) above and

$$\begin{aligned} \theta ={\left\{ \begin{array}{ll} \dfrac{m-s}{m-1} \in [0,1] &{} \text { if }\,m>1,\\ &{}\\ 1 &{} \text { if }\, m=1. \end{array}\right. } \end{aligned}$$

This gives

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,s-2}\,|u_{x_k}|^{2\,m}\,\,\eta ^2\,dx\\&\quad \le C\,\sum _{i=1}^N \int g_{i,\varepsilon }'' (u_{x_i})\,|u_{x_j}|^{2\,s}\,|u_{x_k}|^{2\,m}\,|\nabla \eta |^2\,dx\\&\qquad +C\,\left( m^{\theta }\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{4\,s-2}\,|u_{x_k}|^{2\,m-2\,s}\, \eta ^2\,dx\right) ^\frac{1}{2}\\&\qquad \times \left( \,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,m+2\,s}\,|\nabla \eta |^2\,dx\right) ^\frac{1}{2}. \end{aligned} \end{aligned}$$

We use Young’s inequality in the form \(C\,\sqrt{a\,b}\le C^2\, b/4 +a\) for the product in the right-hand side to get

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,s-2}\,|u_{x_k}|^{2\,m}\,\,\eta ^2\,dx\\&\quad \le C\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_j}|^{2\,s}\,|u_{x_k}|^{2\,m}\,|\nabla \eta |^2\,dx\\&\qquad +C\,m^{\theta }\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,m+2\,s}\,|\nabla \eta |^2\,dx\\&\qquad +\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{4\,s-2}\,|u_{x_k}|^{2\,m-2\,s}\,\eta ^2\,dx. \end{aligned} \end{aligned}$$

In the first term of the right-hand side, we use Young’s inequality with the exponents

$$\begin{aligned} \frac{2\,m+2\,s}{2\,s} \qquad \text{ and }\qquad \frac{2\,m+2\,s}{2\,m}. \end{aligned}$$

We also observe for the second term that \(m^\theta \le m\). This gives the desired estimate. \(\square \)
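The two elementary inequalities invoked in this proof can be verified numerically; the values below are illustrative.

```python
# Checks of the elementary inequalities used above:
#   C*sqrt(a*b) <= a + C^2*b/4  (a form of Young's inequality), and
#   Young's inequality with the conjugate exponents (m+s)/s and (m+s)/m.
import math, random

random.seed(0)
ok_sqrt = all(
    C * math.sqrt(a * b) <= a + C * C * b / 4 + 1e-12
    for C in (1.0, 2.5, 10.0)
    for a in (0.1, 1.0, 7.0)
    for b in (0.2, 3.0, 50.0)
)

def young_gap(x, y, s, m):
    p1, p2 = (m + s) / s, (m + s) / m   # conjugate exponents: 1/p1 + 1/p2 = 1
    return x ** p1 / p1 + y ** p2 / p2 - x * y

ok_young = all(young_gap(random.uniform(0, 5), random.uniform(0, 5), s, m) >= -1e-9
               for s in (1, 2, 4) for m in (2, 4, 8) for _ in range(100))
```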

Proposition 4.2

(Caccioppoli for power functions of the gradient) We fix an exponent

$$\begin{aligned} q=2^{\ell _0}-1,\qquad \text{ for } \text{ a } \text{ given } \ell _0\in \mathbb {N}\setminus \{0\}. \end{aligned}$$

Let \(\eta \in C^{\infty }_0(B)\). Then for every \(k=1, \ldots , N\) we have

$$\begin{aligned} \begin{aligned} \int \left| \nabla \left( |u_{x_k}|^{q+\frac{p-2}{2}}\,u_{x_k}\right) \right| ^2\,\eta ^2\,dx \le&C\,q^5\,\sum _{i,j=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_j}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&+ C\,q^5\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,q+2}\,|\nabla \eta |^2\,dx, \end{aligned} \end{aligned}$$
(4.3)

for some \(C=C(N,p)>0\).

Proof

We define the two finite families of indices \(\{s_\ell \}\) and \(\{m_\ell \}\) such that

$$\begin{aligned} s_\ell =2^\ell ,\qquad m_{\ell }=q+1-2^{\ell },\qquad \ell \in \{0,\ldots ,\ell _0\}. \end{aligned}$$

Observe that

$$\begin{aligned}&1\le s_\ell \le m_\ell ,\qquad \ell \in \{0,\ldots ,\ell _0-1\},\\&s_\ell +m_\ell =q+1,\qquad \ell \in \{0,\ldots ,\ell _0\},\\&4\,s_\ell -2=2\,s_{\ell +1}-2,\qquad 2\,m_\ell -2\,s_\ell =2\,m_{\ell +1}, \end{aligned}$$

and

$$\begin{aligned} s_0=1,\qquad m_0=q,\qquad s_{\ell _0}=2^{\ell _0},\qquad m_{\ell _0}=0. \end{aligned}$$
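These identities are elementary and can be verified directly for small values of \(\ell _0\):

```python
# Verification of the algebraic identities satisfied by the families
# s_l = 2^l and m_l = q + 1 - 2^l, with q = 2^{l0} - 1.

def check_families(l0):
    q = 2 ** l0 - 1
    s = [2 ** l for l in range(l0 + 1)]
    m = [q + 1 - 2 ** l for l in range(l0 + 1)]
    assert all(1 <= s[l] <= m[l] for l in range(l0))          # l = 0, ..., l0 - 1
    assert all(s[l] + m[l] == q + 1 for l in range(l0 + 1))
    assert all(4 * s[l] - 2 == 2 * s[l + 1] - 2 for l in range(l0))
    assert all(2 * m[l] - 2 * s[l] == 2 * m[l + 1] for l in range(l0))
    assert s[0] == 1 and m[0] == q and s[l0] == 2 ** l0 and m[l0] == 0
    return True

all_ok = all(check_families(l0) for l0 in range(1, 8))
```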

In terms of these families, inequality (4.2) implies for every \(\ell \in \{0,\ldots ,\ell _0-1\}\)

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,s_\ell -2}\,|u_{x_k}|^{2\,m_\ell }\,\,\eta ^2\,dx\\&\quad \le C\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_j}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\qquad + C\,(m_\ell +1)\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,| u_{x_k}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\qquad +\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,s_{\ell +1}-2}\,|u_{x_k}|^{2 \,m_{\ell +1}}\, \eta ^2\,dx, \end{aligned} \end{aligned}$$

for some \(C>0\) universal. By starting from \(\ell =0\) and iterating the previous estimate up to \(\ell =\ell _0-1\), we then get

$$\begin{aligned} \begin{aligned} \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx&\le C\,q^2\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_j}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\quad + C\,q^2\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\quad +\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,q}\,\eta ^2\,dx, \end{aligned} \end{aligned}$$

for a universal constant \(C>0\). For the last term, we apply the Caccioppoli inequality (3.3) with

$$\begin{aligned} \Phi (t)=\frac{|t|^{q+1}}{q+1},\qquad t\in \mathbb {R}, \end{aligned}$$

thus we get

$$\begin{aligned} \begin{aligned} \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx&\le C\,q^2\, \sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,| u_{x_j}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\quad + C\,q^2\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\quad +\frac{C}{(q+1)^2}\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_j}|^{2\,q+2}\,|\nabla \eta |^2\,dx; \end{aligned} \end{aligned}$$

that is,

$$\begin{aligned} \begin{aligned} \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx&\le C\,q^2\, \sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\, |u_{x_j}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\quad + C\,q^2\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,q+2}\,|\nabla \eta |^2\,dx, \end{aligned} \end{aligned}$$
(4.4)

possibly for a different universal constant \(C>0\).

We now observe that \(g_{i,\varepsilon }''(u_{x_i}) =\Big ((p-1)\,|u_{x_i}|^{p-2}+\varepsilon \Big )\) and thus

$$\begin{aligned} \begin{aligned} \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx\ge&\int |u_{x_k}|^{p-2}\,u_{x_k x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx\\&=\left( \frac{2}{2\,q+p}\right) ^2\,\int \left| \left( |u_{x_k}|^{q+\frac{p-2}{2}} \,u_{x_k}\right) _{x_j}\right| ^2\,\eta ^2\,dx. \end{aligned} \end{aligned}$$

When we sum over \(j=1,\ldots ,N\), we get

$$\begin{aligned} \sum _{i,j=1}^N \int {g_{i,\varepsilon }''(u_{x_i})}\,u_{x_i x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx\ge \left( \frac{2}{2\,q+p}\right) ^2\,\int \left| \nabla \left( |u_{x_k}|^{q+\frac{p-2}{2}}\,u_{x_k}\right) \right| ^2\,\eta ^2\,dx. \end{aligned}$$

This proves the desired inequality. \(\square \)
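The chain-rule identity behind the last two displays can be checked by finite differences; the values of \(p\), \(q\), \(t\) below are illustrative.

```python
# Numerical check of the identity
#   d/dt ( |t|^{q+(p-2)/2} t ) = ((2q+p)/2) * |t|^{q+(p-2)/2},
# equivalently  |t|^{p-2+2q} = (2/(2q+p))^2 * ( d/dt(|t|^{q+(p-2)/2} t) )^2.

def F(t, p, q):
    return abs(t) ** (q + (p - 2) / 2) * t

def check(p, q, t, h=1e-6):
    dF = (F(t + h, p, q) - F(t - h, p, q)) / (2 * h)     # central difference
    lhs = abs(t) ** (p - 2 + 2 * q)
    rhs = (2 / (2 * q + p)) ** 2 * dF ** 2
    return abs(lhs - rhs) <= 1e-4 * max(1.0, lhs)

ok = all(check(p, q, t) for p in (2, 3, 4) for q in (1, 3, 7) for t in (0.5, -1.2, 2.0))
```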

4.2 The non-homogeneous case

In the general case where \(f\not =0\) and/or \(\delta >1\), we can prove the following analogue of (4.2), in a similar way:

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{2\,s-2}\,|u_{x_k}|^{2\,m}\,\,\eta ^2\,dx\\&\quad \le \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_j}|^{4\,s-2}\,|u_{x_k}|^{2\,m-2\,s}\,\eta ^2\,dx\\&\qquad + C\,{(m+1)}\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i}) \,\left( |u_{x_j}|^{2\,s+2\,m}+|u_{x_k}|^{2\,s+2\,m}\right) \,|\nabla \eta |^2\,dx\\&\qquad +C\, m^2\, \int |\nabla f|\, \left( |u_{x_k}|^{2\,s+2\,m-1} +|u_{x_j}|^{2\,s+2\,m-1}\right) \,\eta ^2\,dx. \end{aligned} \end{aligned}$$
(4.5)

We then deduce the following analogue of Proposition 4.2:

Proposition 4.3

We fix an exponent

$$\begin{aligned} q=2^{\ell _0}-1,\qquad \text{ for } \text{ a } \text{ given } \ell _0\in \mathbb {N}\setminus \{0\}. \end{aligned}$$

Let \(\eta \in C^{\infty }_0(\Omega )\). Then for every \(k=1, \ldots , N\) we have

$$\begin{aligned} \begin{aligned}&\int \left| \nabla \left( (|u_{x_k}|-\delta _k)_{+}^{\frac{p}{2}}\, |u_{x_k}|^q\right) \right| ^2\,\eta ^2\,dx \\&\quad \le C\,q^5\,\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,\left( |u_{x_k}|^{2\,q+2} +\sum _{j=1}^N|u_{x_j}|^{2\,q+2}\right) \,|\nabla \eta |^2\,dx\\&\qquad +C\,q^5\, \int |\nabla f|\, \left( |u_{x_k}|^{2\,q+1} + \sum _{j=1}^{N}|u_{x_j}|^{2\,q+1}\right) \,\eta ^2\,dx, \end{aligned} \end{aligned}$$
(4.6)

for some \(C=C(N,p)>0\).

Proof

Using the same notation and the same strategy as in the proof of (4.3), except that we start from (4.5) instead of (4.2), we get the following analogue of (4.4):

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx\\&\quad \le C\,q^2\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,(|u_{x_j}|^{2\,q+2}+|u_{x_k}|^{2\,q+2})\,|\nabla \eta |^2\,dx\\&\qquad +C\,q^3\, \int |\nabla f|\, (|u_{x_k}|^{2\,q+1}+ |u_{x_j}|^{2\,q+1})\,\eta ^2\,dx. \end{aligned} \end{aligned}$$

We now observe that

$$\begin{aligned} \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx\ge (p-1)\,\int (|u_{x_k}|-\delta _k)_{+}^{p-2}\,u_{x_k x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx. \end{aligned}$$

Noting that

$$\begin{aligned} (|u_{x_k}|-\delta _k)_+^{p}\le (|u_{x_k}|-\delta _k)_+^{p-2}|u_{x_k}|^2, \end{aligned}$$

we have

$$\begin{aligned} \begin{aligned} \left| \left( (|u_{x_k}|-\delta _k)_{+}^{\frac{p}{2}} \,|u_{x_k}|^q\right) _{x_j}\right| ^2&\le 2\,\left| \left( (|u_{x_k}| -\delta _k)_{+}^{\frac{p}{2}}\right) _{x_j}\right| ^2\,|u_{x_k}|^{2\,q}\\&\quad + 2\,(|u_{x_k}|-\delta _k)_{+}^{p}\, \left| \left( |u_{x_k}|^q\right) _{x_j}\right| ^2\\&\le C\,q^2\,(|u_{x_k}|-\delta _k)_{+}^{p-2} \,|u_{x_k}|^{2\,q}\,u_{x_k x_j}^2. \end{aligned} \end{aligned}$$

We deduce therefrom

$$\begin{aligned} \sum _{i=1}^N \int g_{i,\varepsilon }''(u_{x_i})\,u_{x_i x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx \ge \frac{C}{q^2}\,\int \left| \left( (|u_{x_k}|-\delta _k)_{+}^{\frac{p}{2}} \,|u_{x_k}|^q\right) _{x_j}\right| ^2\,\eta ^2\,dx, \end{aligned}$$

thus when we sum over \(j=1,\ldots ,N,\) we get

$$\begin{aligned} \sum _{i,j=1}^N \int g_{i,\varepsilon }''(u_{x_i}) \,u_{x_i x_j}^2\,|u_{x_k}|^{2\,q}\,\eta ^2\,dx\ge \frac{C}{q^2}\,\int \left| \nabla \left( (|u_{x_k}|-\delta _k)_{+}^{\frac{p}{2}} \,|u_{x_k}|^q\right) \right| ^2\,\eta ^2\,dx. \end{aligned}$$

This proves the desired inequality (4.6). \(\square \)
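The two pointwise inequalities used in this proof can also be tested numerically. The explicit constant \(p^2/2+2\,q^2\) below is one admissible choice obtained by expanding the square; it is an illustration, not a sharp constant.

```python
# Pointwise sanity check (sample values; the constant is illustrative) of
#   (|t|-d)_+^p <= (|t|-d)_+^{p-2} t^2   and
#   |d/dt( (|t|-d)_+^{p/2} |t|^q )|^2 <= (p^2/2 + 2 q^2) (|t|-d)_+^{p-2} |t|^{2q},
# on the region |t| > d, with the derivative computed analytically.

def bounds_hold(p, q, t, d):
    pos = abs(t) - d                      # (|t|-d)_+ where |t| > d
    assert pos > 0
    if pos ** p > pos ** (p - 2) * t * t + 1e-12:
        return False
    sgn = 1.0 if t > 0 else -1.0
    dG = (p / 2) * pos ** (p / 2 - 1) * sgn * abs(t) ** q \
         + pos ** (p / 2) * q * abs(t) ** (q - 1) * sgn
    return dG ** 2 <= (p * p / 2 + 2 * q * q) * pos ** (p - 2) * abs(t) ** (2 * q) + 1e-9

ok = all(bounds_hold(p, q, t, 0.5)
         for p in (2, 3, 4) for q in (1, 3, 7) for t in (0.75, -2.0, 5.0))
```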

5 Proof of Theorem 1.1

Proof

The core of the proof of Theorem 1.1 is the uniform Lipschitz estimate of Proposition 5.1 below. Its proof, which is postponed for ease of readability, uses the integrability estimates of Sect. 4. Once we have this uniform estimate, we can reproduce the proof of [2, Theorem A] and prove that \(\nabla U\in L^\infty (\Omega ')\), for every \(\Omega '\Subset \Omega \).

We now detail how to obtain the scaling invariant local estimate (1.5) in the case \(\delta _1=\cdots =\delta _N=0\). We take \(0<r_0<R_0\le 1\) and a ball \(B_{2R_0}\Subset \Omega \). We then consider the sequence of minimizers \(\{u_{\varepsilon _k}\}_{k\in \mathbb {N}}\) of (2.3) obtained in Lemma 2.3, with B a ball slightly larger than \(B_{R_0}\) so that \(2\,B\Subset \Omega \). By using the uniform Lipschitz estimate (5.3) below, taking the limit as k goes to \(\infty \) and using the strong convergence of Lemma 2.3, we obtain

$$\begin{aligned} \Vert \nabla U\Vert _{L^\infty (B_{r_0})}\le \frac{C}{(R_0-r_0)^{\sigma _2}}\,\left( 1+\Vert \nabla f\Vert ^{\sigma _2}_{L^{h}(B_{R_0})}\right) \,\left( \Vert \nabla U\Vert ^{\sigma _1}_{L^{p}(B_{R_0})}+1\right) . \end{aligned}$$

Without loss of generality, we can assume that \(\Vert \nabla U\Vert _{L^{p}(B_{R_0})}>0\). Hence, by Young’s inequality,

$$\begin{aligned} \Vert \nabla U\Vert _{L^\infty (B_{r_0})}\le \frac{C}{(R_0-r_0)^{\sigma _2}}\,\left( 1+\Vert \nabla f\Vert _{L^{h}(B_{R_0})}^{2\,\sigma _2}+\Vert \nabla U\Vert _{L^{p}(B_{R_0})}^{2\,\sigma _1}\right) , \end{aligned}$$
(5.1)

possibly for a different \(C=C(N,p,h)>0\). We now observe that for every \(\lambda >0\), \(\lambda \,U\) is still a solution of the orthotropic \(p\)-Laplace equation, with the right-hand side \(f\) replaced by \(\lambda ^{p-1}\,f\). We can use (5.1) for \(\lambda \, U\) and get

$$\begin{aligned} \lambda \,\Vert \nabla U\Vert _{L^\infty (B_{r_0})}\le \frac{C}{(R_0-r_0)^{\sigma _2}}\, \left( 1+\lambda ^{2\,\sigma _2\,(p-1)}\,\Vert \nabla f\Vert _{L^{h}(B_{R_0})}^{2\,\sigma _2}+\lambda ^{2\,\sigma _1}\,\Vert \nabla U\Vert _{L^{p}(B_{R_0})}^{2\,\sigma _1}\right) . \end{aligned}$$

Dividing by \(\lambda \), we obtain

$$\begin{aligned} \Vert \nabla U\Vert _{L^\infty (B_{r_0})}\le \frac{C}{(R_0-r_0)^{\sigma _2}}\,\left( \frac{1}{\lambda } +\lambda ^{2\,\sigma _2\,(p-1)-1}\,\Vert \nabla f\Vert ^{2\sigma _2}_{L^{h}(B_{R_0})}+\lambda ^{2\,\sigma _1-1}\,\Vert \nabla U\Vert _{L^{p}(B_{R_0})}^{2\,\sigma _1}\right) . \end{aligned}$$

We take

$$\begin{aligned} \lambda :=\frac{1}{\Vert \nabla U\Vert _{L^{p}(B_{R_0})} + \Vert \nabla f\Vert _{L^{h}(B_{R_0})}^{\frac{1}{p-1}}}, \end{aligned}$$

and observe that if \(\Vert \nabla f\Vert _{L^{h}(B_{R_0})}>0\), then

$$\begin{aligned} \lambda ^{2\,\sigma _2\,(p-1)-1}\,\Vert \nabla f\Vert ^{2\sigma _2}_{L^{h}(B_{R_0})}\le \frac{1}{ \left( \Vert \nabla f\Vert _{L^{h}(B_{R_0})}^{\frac{1}{p-1}}\right) ^{2\,\sigma _2\,(p-1)-1}}\,\Vert \nabla f\Vert ^{2\sigma _2}_{L^{h}(B_{R_0})}=\Vert \nabla f\Vert _{L^{h}(B_{R_0})}^{\frac{1}{p-1}} \end{aligned}$$

while the inequality is obvious when \(\Vert \nabla f\Vert _{L^{h}(B_{R_0})}=0\). Similarly,

$$\begin{aligned} \lambda ^{2\,\sigma _1-1}\,\Vert \nabla U\Vert _{L^{p}(B_{R_0})}^{2\,\sigma _1}\le \frac{1}{\Vert \nabla U\Vert _{L^{p}(B_{R_0})}^{2\,\sigma _1-1}}\,\Vert \nabla U\Vert _{L^{p}(B_{R_0})}^{2\,\sigma _1}=\Vert \nabla U\Vert _{L^{p}(B_{R_0})}. \end{aligned}$$

It thus follows that

$$\begin{aligned} \Vert \nabla U\Vert _{L^\infty (B_{r_0})}\le \frac{C}{(R_0-r_0)^{\sigma _2}}\,\left( \Vert \nabla f\Vert _{L^{h}(B_{R_0})}^{\frac{1}{p-1}} + \Vert \nabla U\Vert _{L^{p}(B_{R_0})}\right) . \end{aligned}$$
(5.2)
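The absorption argument leading from (5.1) to (5.2) can be tested numerically. The values of \(p\), \(\sigma _1\), \(\sigma _2\) below are illustrative (the actual exponents are those produced by Proposition 5.1), and we use the choice of \(\lambda \) made above.

```python
# Numerical check of the absorption step (5.1) -> (5.2), with illustrative
# values p = 3, sigma1 = sigma2 = 2.

p, s1, s2 = 3, 2, 2
for U, Fn in [(1.0, 0.0), (0.3, 2.0), (5.0, 0.01), (2.0, 7.0)]:
    lam = 1.0 / (U + Fn ** (1.0 / (p - 1)))
    # the two terms of (5.1) after replacing U by lam*U and dividing by lam:
    term_f = lam ** (2 * s2 * (p - 1) - 1) * Fn ** (2 * s2)
    term_u = lam ** (2 * s1 - 1) * U ** (2 * s1)
    assert term_f <= Fn ** (1.0 / (p - 1)) + 1e-12
    assert term_u <= U + 1e-12
ok = True
```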

We now make this estimate dimensionally correct. Given \(R_0>0\), we consider a ball \(B_{2R_0}\Subset \Omega \). Then the rescaled function

$$\begin{aligned} U_{R_0}(x)=U(R_0\,x),\qquad \text{ for } x\in R_0^{-1}\,\Omega , \end{aligned}$$

is a solution of the orthotropic p-Laplace equation, with right-hand side \(f_{R_0}(x):=R_{0}^p\,f(R_0\,x)\). We can use for it the estimate (5.2) with radii 1 and \(1/2\). By scaling back, we thus obtain

$$\begin{aligned} R_0\,\Vert \nabla U\Vert _{L^\infty (B_{R_0/2})}\le C\,\left( R_0^{-\frac{N}{p}+1}\,\Vert \nabla U\Vert _{L^{p}(B_{R_0})}+ R_{0}^{\frac{h\,(p+1)-N}{h\,(p-1)}}\,\Vert \nabla f\Vert _{L^{h}(B_{R_0})}^{\frac{1}{p-1}} \right) , \end{aligned}$$

for some constant \(C=C(N,p,h)>1\). Dividing by \(R_0\), we get

$$\begin{aligned} \Vert \nabla U\Vert _{L^\infty (B_{R_0/2})}\le C\,\left( R_0^{-\frac{N}{p}}\,\Vert \nabla U\Vert _{L^{p}(B_{R_0})}+ R_{0}^{\frac{2\,h-N}{h\,(p-1)}}\,\Vert \nabla f\Vert _{L^{h}(B_{R_0})}^{\frac{1}{p-1}} \right) , \end{aligned}$$

which is the desired scaling invariant estimate. This concludes the proof. \(\square \)
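The exponent bookkeeping of the rescaling step can be checked in exact rational arithmetic; the values of \(N\), \(p\), \(h\) below are illustrative, and this is only a check of the arithmetic, not of the estimate itself.

```python
# Exponent bookkeeping for U_{R0}(x) = U(R0 x), f_{R0}(x) = R0^p f(R0 x).
from fractions import Fraction as Fr

def exponents(N, p, h):
    # change of variables between norms over B_1 and B_{R0}:
    grad_u = 1 - Fr(N, p)                        # from grad U_{R0} = R0 (grad U)(R0 .)
    grad_f = (p + 1 - Fr(N, h)) * Fr(1, p - 1)   # from grad f_{R0} = R0^{p+1} (grad f)(R0 .)
    return grad_u, grad_f

triples = [(N, p, h) for N in (2, 3, 5) for p in (2, 3, 4) for h in (2, 3, 7)]
ok = all(exponents(N, p, h) == (1 - Fr(N, p), Fr(h * (p + 1) - N, h * (p - 1)))
         for N, p, h in triples)
# the exponent obtained after dividing by R0:
ok = ok and all(Fr(h * (p + 1) - N, h * (p - 1)) - 1 == Fr(2 * h - N, h * (p - 1))
                for N, p, h in triples)
```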

Proposition 5.1

(Uniform Lipschitz estimate) Let \(p\ge 2\), \(h>N/2\) and \(0<\varepsilon \le \varepsilon _0\). For every \(B_{r_0}\subset B_{R_0}\Subset B\) with \(0<r_0<R_0\le 1\), we have

$$\begin{aligned} \Vert \nabla u_\varepsilon \Vert _{L^\infty (B_{r_0})}\le C\,\left( \frac{1+\Vert \nabla f_\varepsilon \Vert ^{\sigma _2}_{L^{h}(B_{R_0})}}{(R_0-r_0)^{\sigma _2}}\right) \, \Big (\Vert \nabla u_\varepsilon \Vert ^{\sigma _1}_{L^{p}(B_{R_0})}+1\Big ), \end{aligned}$$
(5.3)

where \(C=C(N,p,h, \delta )>1\) and \(\sigma _i=\sigma _i(N,p,h)>0\), for \(i=1,2\).

5.1 Proof of Proposition 5.1: the homogeneous case

In this subsection, we assume that \(f=0\) and \(\delta =1\).

For simplicity, we assume throughout the proof that \(N\ge 3\), so that the Sobolev exponent \(2^*\) is finite. The case \(N=2\) can be treated with minor modifications and is left to the reader. For ease of readability, we divide the proof into four steps.

Step 1: a first iterative scheme. We add on both sides of inequality (4.3) the term

$$\begin{aligned} \int |\nabla \eta |^2\, |u_{x_k}|^{2\,q+p}\,dx. \end{aligned}$$

We thus obtain

$$\begin{aligned} \begin{aligned} \int \left| \nabla \left( \left( |u_{x_k}|^{q+\frac{p-2}{2}} \,u_{x_k}\right) \,\eta \right) \right| ^2\,dx&\le C\,q^5\, \sum _{i,j=1}^N\int g_{i,\varepsilon }''(u_{x_i}) \,|u_{x_j}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\quad +C\,q^5\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\quad +C\, \int |\nabla \eta |^2\, |u_{x_k}|^{2\,q+p}\,dx. \end{aligned} \end{aligned}$$

An application of the Sobolev inequality leads to

$$\begin{aligned}\begin{aligned} \left( \int |u_{x_k}|^{\frac{2^*}{2}(2\,q+p)} \,\eta ^{2^*}\,dx\right) ^\frac{2}{2^*}&\le C\,q^5\,\sum _{i,j=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_j}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\quad +C\,q^5\,\sum _{i=1}^N\int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,q+2}\,|\nabla \eta |^2\,dx\\&\quad +C\, \int |\nabla \eta |^2\, |u_{x_k}|^{2\,q+p}\,dx. \end{aligned} \end{aligned}$$

We now sum over \(k=1,\ldots ,N\) and use that by the Minkowski inequality,

$$\begin{aligned} \sum _{k=1}^N\left( \int |u_{x_k}|^{\frac{2^*}{2}(2\,q+p)}\,\eta ^{2^*}\,dx\right) ^\frac{2}{2^*} = \sum _{k=1}^N \left\| |u_{x_k}|^{2\,q+p}\eta ^2\right\| _{L^{\frac{2^*}{2}}} \ge \left\| \sum _{k=1}^N|u_{x_k}|^{2\,q+p}\eta ^2\right\| _{L^{\frac{2^*}{2}}}. \end{aligned}$$

This implies

$$\begin{aligned} \begin{aligned} \left( \int \left| \sum _{k=1}^N |u_{x_k}|^{2\,q+p}\right| ^{\frac{2^*}{2}} \,\eta ^{2^*}\,dx\right) ^{\frac{2}{2^*}}&\le C\,q^5 \sum _{i, k=1}^{N} \int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2\,q+2} \, |\nabla \eta |^2\,dx\\&\quad + C\, \int |\nabla \eta |^2\, \sum _{k=1}^{N}|u_{x_k}|^{2\,q+p} \,dx. \end{aligned} \end{aligned}$$
(5.4)

We now introduce the function

$$\begin{aligned} \mathcal {U}(x):= \max _{k=1, \ldots , N}|u_{x_k}(x)|. \end{aligned}$$

We use that

$$\begin{aligned} \mathcal {U}^{2\,q+p}\le \sum _{k=1}^{N}|u_{x_k}|^{2\,q+p} \le N\, \mathcal {U}^{2\,q+p}, \end{aligned}$$

and also that \(g_{i,\varepsilon }''(u_{x_i})\, |u_{x_k}|^{2\,q+2}\le C\,\mathcal {U}^{2\,q+p}+\varepsilon \,\mathcal {U}^{2\,q+2}\) for every \(1\le i, k \le N\). This yields

$$\begin{aligned} \left( \int \mathcal {U}^{\frac{2^*}{2}(2\,q+p)}\, \eta ^{2^*}\,dx\right) ^{\frac{2}{2^*}} \le C\,q^5 \int \mathcal {U}^{2\,q+p}\,|\nabla \eta |^2\,dx + C\,q^5\,\varepsilon \int \mathcal {U}^{2\,q+2}\, |\nabla \eta |^2 \, dx, \end{aligned}$$

for a possibly different \(C=C(N,p)>1\). By using that \(\mathcal {U}^{2\,q+2}\le 1 +\mathcal {U}^{2\,q+p}\), we obtain (for \(\varepsilon <1\))

$$\begin{aligned} \left( \int \mathcal {U}^{\frac{2^*}{2}(2\,q+p)}\,\eta ^{2^*}\,dx\right) ^\frac{2}{2^*} \le C\, q^5 \,\int |\nabla \eta |^2\,\Big (\mathcal {U}^{2q+p}+1\Big )\,dx. \end{aligned}$$
(5.5)

We fix two concentric balls \(B_r\subset B_R \Subset B\) with \(0<r<R\le 1\). Let us assume for simplicity that all the balls are centered at the origin. Then for every pair of radii \(r\le t<s\le R\), we take in (5.5) a standard cut-off function

$$\begin{aligned} \eta \in C^\infty _0(B_s),\quad \eta \equiv 1 \text{ on } B_t,\quad 0\le \eta \le 1,\quad \Vert \nabla \eta \Vert _{L^\infty }\le \frac{C}{s-t}. \end{aligned}$$
(5.6)

This yields

$$\begin{aligned} \left( \int _{B_t} \mathcal {U}^{\frac{2^*}{2}(2\,q+p)}\,dx\right) ^\frac{2}{2^*} \le C\, \frac{q^5}{(s-t)^2} \,\int _{B_s} \Big (\mathcal {U}^{2\,q+p}+1\Big )\,dx. \end{aligned}$$
(5.7)

We define the sequence of exponents

$$\begin{aligned} \gamma _j=p+2^{j+2}-2,\qquad j\in \mathbb {N}, \end{aligned}$$

and take in (5.7) \(q=2^{j+1}-1\). This gives

$$\begin{aligned} \left( \int _{B_{t}} \mathcal {U}^{\frac{2^*}{2}\gamma _j}\,dx\right) ^\frac{2}{2^*}\le C\,\frac{2^{5\,j}}{(s-t)^2}\,\int _{B_{s}}\Big (\mathcal {U}^{\gamma _j} +1\Big )\,dx, \end{aligned}$$
(5.8)

for a possibly different constant \(C=C(N,p)>1\).
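The relation between the choice \(q=2^{j+1}-1\) and the exponents \(\gamma _j\) is elementary and can be checked directly:

```python
# Check that q = 2^{j+1} - 1 in (5.7) produces exactly the exponents
# gamma_j = p + 2^{j+2} - 2, i.e. 2q + p = gamma_j (with q of the admissible
# form 2^{l0} - 1, here with l0 = j + 1).

def gamma(j, p):
    return p + 2 ** (j + 2) - 2

ok = all(2 * (2 ** (j + 1) - 1) + p == gamma(j, p)
         for j in range(10) for p in (2, 3, 5, 11))
```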

Step 2: filling the gaps. We now observe that

$$\begin{aligned} \gamma _{j-1}<\gamma _j<\frac{2^*}{2}\,\gamma _j,\qquad \text{ for } \text{ every } j\in \mathbb {N}\setminus \{0\}. \end{aligned}$$

By interpolation in Lebesgue spaces, we obtain

$$\begin{aligned} \int _{B_{t}} \mathcal {U}^{\gamma _j}\,dx\le \left( \int _{B_{t}} \mathcal {U}^{\gamma _{j-1}}\,dx\right) ^\frac{\tau _j\,\gamma _j}{\gamma _{j-1}}\,\left( \int _{B_{t}} \mathcal {U}^{\frac{2^*}{2} \,\gamma _{j}} \,dx\right) ^\frac{(1-\tau _j)\,2}{2^*} \end{aligned}$$

where \(0<\tau _j<1\) is given by

$$\begin{aligned} \tau _j=\frac{\frac{2^*}{2}-1}{\frac{2^*}{2}\,\dfrac{\gamma _j}{\gamma _{j-1}}-1}. \end{aligned}$$
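The value of \(\tau _j\) can be cross-checked numerically against the interpolation identity it must satisfy; the values of \(N\) and \(p\) below are illustrative, with \(2^*=2N/(N-2)\).

```python
# Check that tau_j solves the interpolation relation
#   1/gamma_j = tau_j/gamma_{j-1} + (1 - tau_j) * (2/2*) * (1/gamma_j),
# and that 0 < tau_j < 1.

def tau(j, p, tstar):
    g_prev, g = p + 2 ** (j + 1) - 2, p + 2 ** (j + 2) - 2   # gamma_{j-1}, gamma_j
    return (tstar / 2 - 1) / ((tstar / 2) * g / g_prev - 1)

ok = True
for N in (3, 4, 6):
    tstar = 2 * N / (N - 2)          # Sobolev exponent 2*
    for p in (2, 3, 4):
        for j in range(1, 12):
            t = tau(j, p, tstar)
            g_prev, g = p + 2 ** (j + 1) - 2, p + 2 ** (j + 2) - 2
            ok &= 0 < t < 1
            ok &= abs(1 / g - (t / g_prev + (1 - t) * (2 / tstar) / g)) < 1e-12
```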

We now rely on (5.8) to get

$$\begin{aligned} \begin{aligned} \int _{B_{t}} \mathcal {U}^{\gamma _j}\,dx&\le \left( \int _{B_{t}} \mathcal {U}^{\gamma _{j-1}}\,dx\right) ^\frac{\tau _j\,\gamma _j}{\gamma _{j-1}}\, \left( C\,\frac{2^{5\,j}}{(s-t)^2}\,\int _{B_{s}} \Big (\mathcal {U}^{\gamma _j}+1\Big )\,dx\right) ^{1-\tau _j}\\&=\left[ \left( C\,\frac{2^{5\,j}}{(s-t)^2}\right) ^{\frac{1 -\tau _j}{\tau _j}}\,\left( \int _{B_{t}} \mathcal {U}^{\gamma _{j-1}} \,dx\right) ^\frac{\gamma _j}{\gamma _{j-1}}\right] ^{\tau _j}\, \left( \int _{B_{s}}\Big (\mathcal {U}^{\gamma _j}+1\Big ) \,dx\right) ^{1-\tau _j}. \end{aligned} \end{aligned}$$

The sequence \((\tau _j)_{j\ge 1}\) is nonincreasing, which implies

$$\begin{aligned} \tau _j\ge \lim _{n\rightarrow \infty } \tau _n =\frac{1}{2}\,\frac{2^*-2}{2^*-1}=:\underline{\tau }\qquad \text{ for } \text{ every } j\in \mathbb {N}\setminus \{0\}. \end{aligned}$$

Hence,

$$\begin{aligned} \frac{1-\tau _j}{\tau _j} \le \frac{1-\underline{\tau }}{\underline{\tau }}=:\beta . \end{aligned}$$

Using that \(s\le R\le 1\) and \(C>1\), this implies that

$$\begin{aligned} \left( C\,\frac{2^{5\,j}}{(s-t)^2}\right) ^{\frac{1-\tau _j}{\tau _j}} \le \left( C\,\frac{2^{5\,j}}{(s-t)^2}\right) ^{\beta }. \end{aligned}$$

By Young’s inequality,

$$\begin{aligned} \begin{aligned} \int _{B_{t}} \mathcal {U}^{\gamma _j}\,dx&\le (1-\tau _j)\,\int _{B_{s}}\Big ( \mathcal {U}^{\gamma _j}+1\Big )\,dx + \tau _j\,\left( C\,\frac{2^{5\,j}}{(s-t)^2}\right) ^{\beta }\, \left( \int _{B_{t}} \mathcal {U}^{\gamma _{j-1}} \,dx\right) ^\frac{\gamma _j}{\gamma _{j-1}}\\&\le (1-\underline{\tau })\,\int _{B_{s}}\mathcal {U}^{\gamma _j}\,dx + C\,\frac{2^{5\,j\,\beta }}{(s-t)^{2\,\beta }}\,\left( \int _{{B_R}} \mathcal {U}^{\gamma _{j-1}}\,dx\right) ^\frac{ \gamma _j}{\gamma _{j-1}} + |B_R|. \end{aligned} \end{aligned}$$

By applying Lemma 2.5 with

$$\begin{aligned} Z(t)= \int _{B_{t}} \mathcal {U}^{\gamma _j}\,dx ,\qquad \alpha _0=2\, \beta , \qquad \text{ and } \qquad \vartheta =1-\underline{\tau }, \end{aligned}$$

we finally obtain

$$\begin{aligned} \int _{B_r} \mathcal {U}^{\gamma _j}\,dx\le C\,\left( 2^{5\,j\,\beta }\,(R-r)^{-2\,\beta }\,\left( \int _{B_R} \mathcal {U}^{\gamma _{j-1}}\,dx\right) ^\frac{\gamma _j}{\gamma _{j-1}}+ 1\right) , \end{aligned}$$
(5.9)

for some \(C=C(N,p)>1\).

Step 3: Moser's iteration. We now want to iterate the previous estimate on a sequence of shrinking balls. We fix two radii \(0<r<R\le 1\), then we consider the sequence

$$\begin{aligned} R_j=r+\frac{R-r}{2^{j-1}},\qquad j\in \mathbb {N}\setminus \{0\}, \end{aligned}$$

and we apply (5.9) with \(R_{j+1}<R_j\) instead of \(r<R\). Thus we get

$$\begin{aligned} \int _{B_{R_{j+1}}} \mathcal {U}^{\gamma _j}\,dx \le \,C\,\left( 2^{7\,j\,\beta }\,(R-r)^{-2\,\beta }\left( \int _{B_{R_j}}\mathcal {U}^{\gamma _{j-1}}\,dx \right) ^{\frac{\gamma _j}{\gamma _{j-1}}}+ 1\right) \end{aligned}$$
(5.10)

where the constant \(C>1\) only depends on N and p.

We introduce the notation

$$\begin{aligned} Y_j=\int _{B_{R_{j}}} \mathcal {U}^{\gamma _{j-1}}\,dx, \end{aligned}$$

thus (5.10) rewrites as

$$\begin{aligned} Y_{j+1} \le \,C\,\left( 2^{7\,j\,\beta }\,(R-r)^{-2\,\beta } \,Y_{j}^{\frac{\gamma _j}{\gamma _{j-1}}}+ 1\right) \le {2}\,C\,2^{7\,j\,\beta }\,(R-r)^{-2\,\beta }\,(Y_{j}+1)^{\frac{ \gamma _j}{\gamma _{j-1}}}. \end{aligned}$$

Here, we have used again that \(R\le 1\), so that the term multiplying \(Y_j\) is larger than 1. By iterating the previous estimate starting from \(j=1\) and using some standard manipulations, we obtain

$$\begin{aligned} Y_{n+1}\le \Big (C\,2^{7\,\beta }\,(R-r)^{-2\,\beta }\Big )^{\sum \limits _{j=0}^{n-1}(n-j)\frac{\gamma _n}{\gamma _{n-j}}}\, \Big [Y_1+1\Big ]^\frac{\gamma _n}{\gamma _0}, \end{aligned}$$

possibly for a different constant \(C=C(N,p)>1\). We now take the power \(1/\gamma _n\) on both sides:

$$\begin{aligned} \begin{aligned} Y_{n+1}^\frac{1}{\gamma _n}&\le \Big (C\,2^{7\,\beta }\,(R-r)^{-2\,\beta }\Big )^{\sum \limits _{j=0}^{n-1}\frac{n-j}{\gamma _{n-j}}} \,\Big [Y_1+1\Big ]^\frac{1}{\gamma _0}\\&=\Big (C\,2^{7\, \beta }\,(R-r)^{-2\, \beta }\Big )^{\sum \limits _{j=1}^{n}\frac{j}{\gamma _{j}}} \,\Big [Y_1+1\Big ]^\frac{1}{\gamma _0}. \end{aligned} \end{aligned}$$

We observe that \(\gamma _{j}\sim 2^{j+2} \) as \(j\rightarrow \infty \). This implies the convergence of the series in the exponent, and we thus get

$$\begin{aligned} \Vert \mathcal {U}\Vert _{L^{\infty }(B_{r})} = \lim _{n\rightarrow \infty }\left( \int _{B_{R_{n+1}}} \mathcal {U}^{\gamma _{n}}\,dx\right) ^\frac{1}{\gamma _{n}} \le C\, (R-r)^{-\beta '}\,\left( \int _{B_{R}} \mathcal {U}^{p+2}\,dx+1\right) ^\frac{1}{p+2}, \end{aligned}$$

for some \(C=C(N,p)>1\) and \(\beta '=\beta '(N,p)>0\). We also used that \(\gamma _0=p+2\). By recalling the definition of \(\mathcal {U}\), we finally obtain

$$\begin{aligned} \Vert \nabla u\Vert _{L^{\infty }(B_{r})} \le C\,(R-r)^{-\beta '}\, \left( \int _{B_{R}} |\nabla u|^{p+2}\,dx+1\right) ^{\frac{1}{p+2}}. \end{aligned}$$
(5.11)
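The bookkeeping behind the limit process of Step 3, namely the reindexing of the exponent sum and the boundedness of \(\sum j/\gamma _j\), can be checked numerically (the value \(p=3\) is illustrative):

```python
# Check of Step 3's bookkeeping: the reindexing
#   sum_{j=0}^{n-1} (n-j)/gamma_{n-j} = sum_{j=1}^{n} j/gamma_j,
# and the boundedness of the partial sums of sum_j j/gamma_j.

def gamma(j, p):
    return p + 2 ** (j + 2) - 2

p = 3
for n in range(1, 15):
    left = sum((n - j) / gamma(n - j, p) for j in range(n))
    right = sum(j / gamma(j, p) for j in range(1, n + 1))
    assert abs(left - right) < 1e-12

partial = [sum(j / gamma(j, p) for j in range(1, n + 1)) for n in (10, 20, 40)]
# gamma_j ~ 2^{j+2}, so the tail is dominated by a convergent geometric-type series
bounded = partial[-1] - partial[0] < 1e-2 and partial[-1] < 2.0
```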

Step 4: \(L^\infty -L^p\) estimate. We fix two concentric balls \(B_{r_0}\subset B_{R_0}\Subset B\) with \(R_0\le 1\). Then for every \(r_0\le t<s\le R_0\), from (5.11) we have

$$\begin{aligned} \Vert \nabla u\Vert _{L^\infty (B_{t})}\le \frac{C}{(s-t)^{\beta '}}\, \left( \int _{B_{s}}|\nabla u|^{p+2}\,dx\right) ^\frac{1}{p+2}+\frac{C}{(s-t)^{\beta '}}, \end{aligned}$$

where we also used the subadditivity of \(\tau \mapsto \tau ^{1/(p+2)}\). We now observe that

$$\begin{aligned} \begin{aligned} \frac{C}{(s-t)^{\beta '}}\, \left( \int _{B_{s}}|\nabla u|^{p+2}\,dx\right) ^\frac{1}{p+2}&\le \frac{C}{(s-t) ^{\beta '}}\, \left( \int _{B_{s}}|\nabla u|^{p}\,dx \right) ^\frac{1}{p+2}\,\Vert \nabla u\Vert _{L^\infty (B_{s})}^\frac{2}{p+2}\\&\le \frac{2}{p+2}\,\Vert \nabla u\Vert _{L^\infty (B_{s})}\\&\quad +\frac{p}{p+2}\,\left( \frac{C}{(s-t)^{\beta '}}\right) ^\frac{p+2}{p}\, \left( \int _{B_{s}}|\nabla u|^{p}\,dx\right) ^\frac{1}{p}. \end{aligned} \end{aligned}$$

We can apply again Lemma 2.5, this time with the choices

$$\begin{aligned} Z(t)=\Vert \nabla u\Vert _{L^\infty (B_{t})},\quad \mathcal {A}=\frac{p}{p+2}\,C^\frac{p+2}{p}\, \left( \int _{B_{R_0}}|\nabla u|^{p}\,dx\right) ^\frac{1}{p},\quad \alpha _0=\frac{p+2}{p}\,\beta ',\quad \beta _0=\beta '. \end{aligned}$$

This yields

$$\begin{aligned} \Vert \nabla u\Vert _{L^\infty (B_{r_0})}\le C\,\left[ \frac{1}{(R_0-r_0)^{\beta '\,\frac{p+2}{p}}}\, \left( \int _{B_{R_0}}|\nabla u|^{p}\,dx\right) ^\frac{1}{p}+\frac{1}{(R_0-r_0)^{\beta '}}\right] , \end{aligned}$$

for every \(R_0\le 1\). This readily implies the desired estimate (5.3) in the homogeneous case. \(\square \)
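The Young inequality with conjugate exponents \((p+2)/2\) and \((p+2)/p\) used in Step 4 can be tested numerically; the values are illustrative.

```python
# Check of the Young inequality used in Step 4:
#   x*y <= x^a/a + y^b/b  with  a = (p+2)/2, b = (p+2)/p  (so 1/a + 1/b = 1),
# applied above with x = ||grad u||_inf^{2/(p+2)}.
import random

random.seed(1)
ok = True
for p in (2, 3, 4, 7):
    a, b = (p + 2) / 2, (p + 2) / p          # conjugate exponents
    assert abs(1 / a + 1 / b - 1) < 1e-12
    for _ in range(500):
        x, y = random.uniform(0, 10), random.uniform(0, 10)
        ok &= x * y <= x ** a / a + y ** b / b + 1e-9
```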

5.2 Proof of Proposition 5.1: the non-homogeneous case

We follow step by step the proof of the homogeneous case and we only indicate the main changes, which essentially occur in Step 1 and Step 2.

Step 1: a first iterative scheme. This time, we add on both sides of inequality (4.6) the term

$$\begin{aligned} \int |\nabla \eta |^2\,(|u_{x_k}|-\delta _k)_{+}^p\, |u_{x_k}|^{2\,q}\,dx. \end{aligned}$$

Then the left-hand side is bounded from below, up to a multiplicative constant, by

$$\begin{aligned} \int \left| \nabla \left( (|u_{x_k}|-\delta _k)_{+}^{\frac{p}{2}} \,|u_{x_k}|^q\,\eta \right) \right| ^2\,dx. \end{aligned}$$

By the Sobolev inequality, the latter is in turn bounded from below, up to a multiplicative constant, by

$$\begin{aligned} \left( \int (|u_{x_k}|-\delta _k)_{+}^{\frac{2^*\,p}{2}}\, |u_{x_k}|^{2^*q}\,\eta ^{2^*}\,dx\right) ^\frac{2}{2^*}. \end{aligned}$$

By summing over \(k=1,\ldots ,N\) and using the Minkowski inequality, we obtain the analogue of (5.4), namely

$$\begin{aligned} \begin{aligned} \left( \int \Big |\sum _{k=1}^N(|u_{x_k}|-\delta _k)_{+}^{p} \,|u_{x_k}|^{2\,q}\Big |^{\frac{2^*}{2}}\eta ^{2^*} \,dx\right) ^{\frac{2}{2^*}}&\le Cq^5 \sum _{i, k=1}^{N} \int g_{i,\varepsilon }''(u_{x_i})\,|u_{x_k}|^{2q+2} \, |\nabla \eta |^2\,dx\\&\quad +C\,q^5\, \sum _{k=1}^{N}\int |\nabla f|\, |u_{x_k}|^{2\,q+1}\,\eta ^2\,dx\\&\quad + C \int |\nabla \eta |^2 \sum _{k=1}^{N} (|u_{x_k}|-\delta _k)_{+}^p\, |u_{x_k}|^{2\,q}\,dx. \end{aligned} \end{aligned}$$

We now introduce the function

$$\begin{aligned} \mathcal {U}(x):= \frac{1}{2\,\delta }\max _{k=1, \ldots , N}|u_{x_k}(x)|, \end{aligned}$$

where the parameter \(\delta \) is defined in (2.1). We use that

$$\begin{aligned} \sum _{k=1}^N(|u_{x_k}|-\delta _k)_{+}^{p}\,|u_{x_k}|^{2\,q}\ge (2\,\delta \,\mathcal {U}-\delta )_{+}^{p}\,|2\,\delta \,\mathcal {U}|^{2\,q} \ge (2\,\delta )^{2\,q+p}\, \left( \mathcal {U}-\frac{1}{2}\right) _{+}^p\,\mathcal {U}^{2\,q}, \end{aligned}$$

and also that for every \(1\le i \le N\),

$$\begin{aligned} g_{i,\varepsilon }''(u_{x_i})=(p-1)\,(|u_{x_i}|-\delta _i)_{+}^{p-2}+\varepsilon \le C\,\delta ^{p-2}\,\mathcal {U}^{p-2}+\varepsilon . \end{aligned}$$

This yields

$$\begin{aligned} \begin{aligned} \left( \int \left( \mathcal {U}-\frac{1}{2}\right) _{+}^{\frac{2^*}{2}\,p} \mathcal {U}^{2^* q}\, \eta ^{2^*}\,dx\right) ^{\frac{2}{2^*}}&\le C\,q^5 \int \mathcal {U}^{2\,q+p}\,|\nabla \eta |^2\,dx \\&\quad + C\,q^5\varepsilon \int \mathcal {U}^{2\,q+2}\, |\nabla \eta |^2 \, dx\\&\quad +C\,q^5\, \int |\nabla f|\, \mathcal {U}^{2\,q+1}\,\eta ^2\,dx \end{aligned} \end{aligned}$$

for a possibly different \(C=C(N,p, \delta )>1\).
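The pointwise bound on \(\sum _k(|u_{x_k}|-\delta _k)_{+}^{p}\,|u_{x_k}|^{2\,q}\) used above is elementary; the following quick numerical sanity check illustrates it. The values of \(p\), \(q\), \(\delta \) and the vectors are illustrative, and we only assume that the parameter \(\delta \) from (2.1) dominates every \(\delta _k\), which is what the inequality requires.

```python
import random

# Sanity check of the pointwise bound
#   sum_k (|g_k| - delta_k)_+^p |g_k|^{2q}
#     >= (2 delta)^{2q+p} (U - 1/2)_+^p U^{2q},   U = max_k |g_k| / (2 delta).
# Illustrative values; we assume delta >= delta_k for every k, as in (2.1).

def lhs(g, deltas, p, q):
    return sum(max(abs(gk) - dk, 0.0) ** p * abs(gk) ** (2 * q)
               for gk, dk in zip(g, deltas))

def rhs(g, delta, p, q):
    U = max(abs(gk) for gk in g) / (2.0 * delta)
    return (2.0 * delta) ** (2 * q + p) * max(U - 0.5, 0.0) ** p * U ** (2 * q)

random.seed(0)
p, q, delta, N = 3, 2, 0.7, 4
violations = 0
for _ in range(2000):
    g = [random.uniform(-3.0, 3.0) for _ in range(N)]
    deltas = [random.uniform(0.0, delta) for _ in range(N)]
    if lhs(g, deltas, p, q) < rhs(g, delta, p, q) - 1e-9:
        violations += 1
```

The check only keeps the term with the maximal \(|g_k|\) on the left-hand side, exactly as in the proof of the displayed inequality.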

With the concentric balls \(B_r\subset B_t \subset B_s \subset B_R\) and the function \(\eta \) as defined in (5.6), an application of Hölder’s inequality leads to

$$\begin{aligned} \begin{aligned} \left( \int _{B_t} \left( \mathcal {U}-\frac{1}{2}\right) _{+}^{\frac{2^*}{2}\,p} \mathcal {U}^{2^*\,q} \,dx\right) ^{\frac{2}{2^*}}&\le C\,\frac{q^5}{(s-t)^2} \int _{B_s} \mathcal {U}^{2\,q+p}\,dx \\&\quad + C\,\frac{q^5}{(s-t)^2}\,\varepsilon \, \int _{B_s} \mathcal {U}^{2\,q+2}\, dx \\&\quad +C\,q^5\, \Vert \nabla f\Vert _{L^h(B_R)}\,\left( \int _{B_s} \mathcal {U}^{(2\,q+1)\,h'}\,dx\right) ^\frac{1}{h'}. \end{aligned} \end{aligned}$$
(5.12)

From now on, we assume that

$$\begin{aligned} q\ge \max \left\{ \frac{p-2\,h'}{2\,(h'-1)},\, \frac{2^*\,p}{2\,h'}-1\right\} . \end{aligned}$$
(5.13)

This in particular implies that

$$\begin{aligned} 2\,q+2\le 2\,q+p\le (2\,q+2)\,h', \end{aligned}$$

Then, by using Hölder’s inequality and taking into account that \(s\le 1\), we get

$$\begin{aligned} \begin{aligned} \left( \int _{B_t} \left( \mathcal {U}-\frac{1}{2}\right) _{+}^{\frac{2^*}{2}\,p} \mathcal {U}^{2^*\,q} \,dx\right) ^{\frac{2}{2^*}}&\le C\,\frac{q^5}{(s-t)^2} \left( \int _{B_s} \mathcal {U}^{(2\,q+2)\,h'}\,dx\right) ^\frac{2\,q+p}{(2\,q+2)\,h'}\\&\quad + C\,\frac{q^5}{(s-t)^2}\,\varepsilon \, \left( \int _{B_s} \mathcal {U}^{(2\,q+2)\,h'}\, dx\right) ^\frac{1}{h'} \\&\quad +C\,q^5\, \Vert \nabla f\Vert _{L^h(B_R)}\,\left( \int _{B_s} \mathcal {U}^{(2\,q+2)\,h'}\,dx\right) ^\frac{2\,q+1}{(2\,q+2)\,h'}. \end{aligned} \end{aligned}$$

Thanks to the relation on the exponents, this gives (recall that \(\varepsilon <1\) and \(s\le 1\))

$$\begin{aligned} \begin{aligned} \left( \int _{B_t} \left( \mathcal {U}-\frac{1}{2}\right) _{+}^{\frac{2^*}{2}\,p} \mathcal {U}^{2^*\,q} \,dx\right) ^{\frac{2}{2^*}}\le&\frac{C\,q^5}{(s-t)^2}\left( 1 + \Vert \nabla f\Vert _{L^{h}(B_{R})}\right) \\&\times \left( \int _{B_s} \mathcal {U}^{(2\,q+2)\,h'}\,dx+1\right) ^{\frac{2\,q+p}{(2\,q+2)\,h'}}. \end{aligned} \end{aligned}$$
(5.14)
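The exponent relations granted by condition (5.13) can be verified numerically for concrete values of the parameters. This is a sketch with illustrative choices of \(N\), \(p\), \(h\); we take \(N\ge 3\) so that \(2^*=2N/(N-2)\), and \(h>N/2\) as in the statement.

```python
# Check that the lower bound (5.13) on q implies
#   2q+2 <= 2q+p <= (2q+2)h'   and   (2q+2)h' - (2^* p)/2 >= 0.
# Illustrative values; N >= 3 so that 2^* = 2N/(N-2), p >= 2 and h > N/2.

def q_lower_bound(N, p, h):
    tstar = 2.0 * N / (N - 2)      # Sobolev exponent 2^*
    hp = h / (h - 1.0)             # conjugate exponent h'
    return max((p - 2 * hp) / (2 * (hp - 1)), tstar * p / (2 * hp) - 1)

def exponents_ok(N, p, h, q):
    tstar = 2.0 * N / (N - 2)
    hp = h / (h - 1.0)
    return (2 * q + 2 <= 2 * q + p <= (2 * q + 2) * hp + 1e-12
            and (2 * q + 2) * hp - tstar * p / 2.0 >= -1e-12)

all_ok = all(exponents_ok(N, p, h, q_lower_bound(N, p, h) + extra)
             for (N, p, h) in [(3, 2.0, 2.0), (4, 3.0, 2.5), (5, 6.0, 3.0)]
             for extra in [0.0, 1.0, 7.0])
```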

We now estimate

$$\begin{aligned} \begin{aligned} \int _{B_s} \mathcal {U}^{(2\,q+2)\,h'}\,dx&= \int _{B_s\cap \{\mathcal {U}\ge 1\}} \mathcal {U}^{(2\,q+2)\,h'}\,dx + \int _{B_s\cap \{\mathcal {U}\le 1\}}\mathcal {U}^{(2\,q+2)\,h'}\,dx\\&\le \int _{B_s\cap \{\mathcal {U}\ge 1\}} \mathcal {U}^{(2\,q+2)\,h'}\,dx +C. \end{aligned} \end{aligned}$$

Observe that on the set \(\{\mathcal {U}\ge 1\}\), we have \(\mathcal {U}\le 2\,\left( \mathcal {U}-1/2\right) _+\). Hence,

$$\begin{aligned} \int _{B_s} \mathcal {U}^{(2\,q+2)\,h'}\,dx \le C\,\int _{B_s} \left( \mathcal {U}-\frac{1}{2} \right) _{+}^{\frac{2^*}{2}\,p} \mathcal {U}^{(2\,q+2)\,h'-\frac{2^*}{2}\,p}\,dx +C, \end{aligned}$$
(5.15)

where the exponent \((2\,q+2)\,h'-(2^*p)/2\) is positive, thanks to the choice (5.13) of q. We deduce from (5.14) that

$$\begin{aligned} \begin{aligned}&\left( \int _{B_t} \left( \mathcal {U}-\frac{1}{2}\right) _{+}^{\frac{2^*}{2}\,p} \mathcal {U}^{2^*\,q} \,dx\right) ^{\frac{2}{2^*}} \le \frac{C\,q^5}{(s-t)^2}\left( 1 + \Vert \nabla f\Vert _{L^{h}(B_{R})}\right) \\&\quad \times \left( \int _{B_s} \left( \mathcal {U}-\frac{1}{2} \right) _{+}^{\frac{2^*}{2}\,p}\,\mathcal {U}^{(2\,q+2)\,h'-\frac{2^*}{2}\,p}\,dx+1\right) ^{\frac{2\,q+p}{(2\,q+2)\,h'}}, \end{aligned} \end{aligned}$$
(5.16)

for a constant \(C=C(N,p,h,\delta )>1\). We now take \(q=2^{j+1}-1\) for \(j\ge j_0-1\), where \(j_0\in \mathbb {N}\) is chosen so as to ensure condition (5.13). Then we define the sequence of positive exponents

$$\begin{aligned} \gamma _j=(2\,q+2)\,h'-\frac{2^*}{2}\,p=2^{j+2}\,h' -\frac{2^*}{2}\,p,\qquad j\ge j_0, \end{aligned}$$

and

$$\begin{aligned} \widehat{\gamma }_j=2^*\,q=2^*\,(2^{j+1}-1),\qquad j\ge j_0. \end{aligned}$$

In order to simplify the notation, we also introduce the absolutely continuous measure

$$\begin{aligned} d\,\mu :=\left( \mathcal {U} -\frac{1}{2}\right) _{+}^{\frac{2^*}{2}\,p}\,dx. \end{aligned}$$

From (5.16), we get

$$\begin{aligned} \left( \int _{B_t} \mathcal {U}^{\widehat{\gamma }_j} \,d\mu \right) ^{\frac{2}{2^*}} \le \frac{C\,2^{5\,j}}{(s-t)^2}\left( 1 + \Vert \nabla f\Vert _{L^{h}(B_{R})}\right) \, \left( \int _{B_s} \mathcal {U}^{\gamma _j}\,d\mu +1\right) ^{\frac{2}{2^*}\,\frac{\widehat{\gamma }_j+\frac{2^*}{2}\,p}{\gamma _j+\frac{2^*}{2}\,p}}. \end{aligned}$$

We now observe that \(h>N/2\) implies \(h'<2^*/2\). By recalling that \(p\ge 2\), we thus have \(2\,h'<(2^*\,p)/2\), which in turn implies

$$\begin{aligned} \frac{\widehat{\gamma }_j}{\gamma _j}\ge \frac{2^*}{2\,h'}> 1, \qquad j\ge j_0. \end{aligned}$$
(5.17)

It follows that

$$\begin{aligned} \frac{\widehat{\gamma }_j+\dfrac{2^*}{2}\,p}{\gamma _j+\dfrac{2^*}{2}\,p} \le \frac{\widehat{\gamma }_j}{\gamma _j}. \end{aligned}$$

Hence, we obtain

$$\begin{aligned} \left( \int _{B_t} \mathcal {U}^{\widehat{\gamma }_j} \,d\mu \right) ^{\frac{2}{2^*}} \le \frac{C\,2^{5\,j}}{(s-t)^2}\left( 1 + \Vert \nabla f\Vert _{L^{h}(B_{R})}\right) \,\left( \int _{B_s} \mathcal {U}^{\gamma _j}\,d\mu +1\right) ^{\frac{2}{2^*}\,\frac{\widehat{\gamma }_j}{\gamma _j}}. \end{aligned}$$
(5.18)

Step 2: filling the gaps Since

$$\begin{aligned} \gamma _{j-1}<\gamma _j<\widehat{\gamma }_j,\qquad \text{ for } \text{ every } j\ge {j_0+1}, \end{aligned}$$

we obtain by interpolation in Lebesgue spaces,

$$\begin{aligned} \int _{B_{t}} \mathcal {U}^{\gamma _j}\,d\mu \le \left( \int _{B_{t}} \mathcal {U}^{\gamma _{j-1}}\,d\mu \right) ^\frac{\tau _j\,\gamma _j}{\gamma _{j-1}}\,\left( \int _{B_{t}} \mathcal {U}^{\widehat{\gamma }_j}\,d\mu \right) ^\frac{(1-\tau _j)\, \gamma _j}{\widehat{\gamma }_j}, \end{aligned}$$

where \(0<\tau _j<1\) is given by

$$\begin{aligned} \tau _j=\frac{\dfrac{\widehat{\gamma }_j}{\gamma _j}-1}{\dfrac{\widehat{\gamma }_j}{\gamma _j}\,\dfrac{\gamma _j}{\gamma _{j-1}}-1}. \end{aligned}$$
(5.19)

We now rely on (5.18) to get

$$\begin{aligned} \begin{aligned} \int _{B_{t}} \mathcal {U}^{\gamma _j}\,d\mu&\le \left( \int _{B_{t}} \mathcal {U}^{\gamma _{j-1}}\,d\mu \right) ^\frac{\tau _j\,\gamma _j}{\gamma _{j-1}}\\&\quad \times \left[ \left( C\,\frac{2^{5\,j}}{(s-t)^2}(1+ \Vert \nabla f\Vert _{L^{h}(B_{R})})\right) ^{\frac{2^*\,\gamma _j}{2\,\widehat{\gamma }_j}}\,\left( \int _{B_{s}} \mathcal {U}^{\gamma _j}\,d\mu +1\right) \right] ^{1-\tau _j}\\&=\left[ \left( C\,\frac{2^{5\,j}}{(s-t)^2}(1+\Vert \nabla f\Vert _{L^{h}(B_{R})})\right) ^{\frac{2^*\,\gamma _j\,(1-\tau _j)}{2\,\widehat{\gamma }_j\,\tau _j}}\,\left( \int _{B_{t}} \mathcal {U}^{\gamma _{j-1}}\,d\mu \right) ^\frac{\gamma _j}{\gamma _{j-1}}\right] ^{\tau _j}\\&\quad \times \left( \int _{B_{s}} \mathcal {U}^{\gamma _j}\,d\mu +1\right) ^{1-\tau _j}. \end{aligned} \end{aligned}$$
(5.20)

We claim that

$$\begin{aligned} \tau _j\ge \underline{\tau }:=\frac{2^*-2\,h'}{4\cdot 2^*-2\,h'}\qquad \text{ for } \text{ every } j\ge {j_0+1}. \end{aligned}$$
(5.21)

We already know by (5.17) that \((\widehat{\gamma }_j/\gamma _j) \ge 2^*/(2h')\). Moreover, relying on the fact that \((2^*\,p)/2\le 2^{j_0}\,h'\) (this follows from the definition of \(j_0\)), we also have

$$\begin{aligned} 2\le \frac{\gamma _j}{\gamma _{j-1}} \le 4,\qquad j\ge {j_0+1}. \end{aligned}$$
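Both ratio bounds, (5.17) and the two-sided bound on \(\gamma _j/\gamma _{j-1}\), can be checked directly. In the sketch below the values of \(N\), \(p\), \(h\) are illustrative, and \(j_0\) is computed as the smallest index for which \(q=2^{j_0+1}-1\) satisfies (5.13) and \((2^*p)/2\le 2^{j_0}h'\), as in the text.

```python
# Check (5.17), gamma_hat_j/gamma_j >= 2^*/(2h') > 1, and 2 <= gamma_j/gamma_{j-1} <= 4.
# Illustrative values; N >= 3, p >= 2, h > N/2.
N, p, h = 4, 3.0, 2.5
tstar = 2.0 * N / (N - 2)          # Sobolev exponent 2^*
hp = h / (h - 1.0)                 # conjugate exponent h'

def valid(j):
    # j_0: (5.13) holds for q = 2^{j+1}-1 and (2^* p)/2 <= 2^j h'
    q = 2 ** (j + 1) - 1
    return (q >= max((p - 2 * hp) / (2 * (hp - 1)), tstar * p / (2 * hp) - 1)
            and tstar * p / 2.0 <= 2 ** j * hp)
j0 = next(j for j in range(64) if valid(j))

gamma = lambda j: 2 ** (j + 2) * hp - tstar * p / 2.0
gamma_hat = lambda j: tstar * (2 ** (j + 1) - 1)

ratios_ok = all(gamma_hat(j) / gamma(j) >= tstar / (2 * hp) > 1.0
                for j in range(j0, j0 + 20))
steps_ok = all(2.0 <= gamma(j) / gamma(j - 1) <= 4.0
               for j in range(j0 + 1, j0 + 20))
```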

By recalling the definition (5.19) of \(\tau _j\), we get

$$\begin{aligned} \tau _j=\zeta \left( \frac{\widehat{\gamma }_j}{\gamma _j}, \frac{\gamma _j}{\gamma _{j-1}}\right) ,\qquad \text{ where } \zeta (x,y) = \frac{x-1}{x\,y-1}. \end{aligned}$$

Observe that on \([2^*/(2\,h'),+\infty )\times [2,4]\), the function \(x\mapsto \zeta (x,y)\) is increasing, while \(y\mapsto \zeta (x,y)\) is decreasing. Thus we get

$$\begin{aligned} \tau _j\ge \zeta \left( \frac{2^*}{2\,h'},4\right) , \end{aligned}$$

which is exactly claim (5.21). We deduce from (5.21) and (5.17) that

$$\begin{aligned} \frac{2^*\,\gamma _j\,(1-\tau _j)}{2\,\widehat{\gamma }_j\,\tau _j}\le \frac{1-\underline{\tau }}{\underline{\tau }}\,h'=:\beta . \end{aligned}$$
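The monotonicity of \(\zeta \), the lower bound (5.21), and the resulting bound \(\beta \) on the exponent can all be sanity-checked numerically. The values of \(N\), \(p\), \(h\) below are illustrative; the formulas for \(\gamma _j\), \(\widehat{\gamma }_j\) and \(\tau _j\) follow the text, and for these values the iteration starts at \(j_0+1=3\).

```python
# Check monotonicity of zeta on [2^*/(2h'), +inf) x [2,4] (sampled on a grid),
# the lower bound (5.21) for tau_j, and the bound by beta on the exponent.
# Illustrative values of N, p, h.
N, p, h = 4, 3.0, 2.5
tstar = 2.0 * N / (N - 2)
hp = h / (h - 1.0)
zeta = lambda x, y: (x - 1.0) / (x * y - 1.0)

xs = [tstar / (2 * hp) + 0.1 * k for k in range(50)]
ys = [2.0 + 0.05 * k for k in range(41)]
mono = (all(zeta(a, y) <= zeta(b, y) for y in ys for a, b in zip(xs, xs[1:]))
        and all(zeta(x, d) <= zeta(x, c) for x in xs for c, d in zip(ys, ys[1:])))

tau_low = (tstar - 2 * hp) / (4 * tstar - 2 * hp)   # claimed lower bound (5.21)
beta = (1 - tau_low) / tau_low * hp

gamma = lambda j: 2 ** (j + 2) * hp - tstar * p / 2.0
gamma_hat = lambda j: tstar * (2 ** (j + 1) - 1)

bounds_ok = True
for j in range(3, 23):             # j >= j_0 + 1 = 3 for these values
    tau = zeta(gamma_hat(j) / gamma(j), gamma(j) / gamma(j - 1))
    if tau < tau_low - 1e-12:
        bounds_ok = False
    if tstar * gamma(j) * (1 - tau) / (2 * gamma_hat(j) * tau) > beta + 1e-12:
        bounds_ok = False
```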

In particular, we have

$$\begin{aligned} \left( C\,\frac{2^{5\,j}}{(s-t)^2}(1+\Vert \nabla f\Vert _{L^{h}(B_{R})})\right) ^{\frac{2^*\,\gamma _j \,(1-\tau _j)}{2\,\widehat{\gamma }_j\,\tau _j}}\le \left( C\,\frac{2^{5\,j}}{(s-t)^2}(1+\Vert \nabla f\Vert _{L^{h}(B_{R})})\right) ^{\beta }, \end{aligned}$$

since the quantity inside the parentheses is larger than \(1\) (here, we use again that \(s\le 1\)). In view of (5.20), this implies

$$\begin{aligned} \begin{aligned} \int _{B_{t}} \mathcal {U}^{\gamma _j}\,d\mu&\le \left[ \left( C\,\frac{2^{5\,j}}{(s-t)^2}(1+\Vert \nabla f\Vert _{L^{h}(B_{R})})\right) ^{\beta }\,\left( \int _{B_{t}} \mathcal {U}^{\gamma _{j-1}}\,d\mu \right) ^\frac{\gamma _j}{\gamma _{j-1}}\right] ^{\tau _j}\\&\quad \times \left( \int _{B_{s}} \mathcal {U}^{\gamma _j}\,d\mu +1\right) ^{1-\tau _j}. \end{aligned} \end{aligned}$$

By Young’s inequality,

$$\begin{aligned} \begin{aligned} \int _{B_{t}} \mathcal {U}^{\gamma _j}\,d\mu&\le (1-\tau _j)\,\left( \int _{B_{s}} \mathcal {U}^{\gamma _j}\,d\mu +1\right) \\&\quad + \tau _j\,\left( C\,\frac{2^{5\,j}}{(s-t)^2}\,(1+ \Vert \nabla f\Vert _{L^{h}(B_{R})})\right) ^{\beta }\,\left( \int _{B_{t}} \mathcal {U}^{\gamma _{j-1}}\,d\mu \right) ^\frac{\gamma _j}{\gamma _{j-1}}\\&\le (1-\underline{\tau })\,\int _{B_{s}}\mathcal {U}^{\gamma _j}\,d\mu \\&\quad + C\,\frac{2^{5\,j\,\beta }}{(s-t)^{2\,\beta }}\,(1+\Vert \nabla f\Vert _{L^{h}(B_{R})})^{\beta }\,\left( \int _{B_{R}} \mathcal {U}^{\gamma _{j-1}}\,d\mu \right) ^\frac{\gamma _j}{\gamma _{j-1}}+ 1, \end{aligned} \end{aligned}$$

where \(C=C(N,p,h,\delta )>1\) as usual. By applying Lemma 2.5 again, this time with the choices

$$\begin{aligned} Z(t)= \int _{B_{t}} \mathcal {U}^{\gamma _j}\,d\mu ,\qquad \alpha _0=2\,\beta , \qquad \text{ and } \qquad \vartheta =1-\underline{\tau }, \end{aligned}$$

we finally obtain

$$\begin{aligned} \int _{B_r} \mathcal {U}^{\gamma _j}\,d\mu \le C\,\frac{2^{5\,j\,\beta }}{(R-r)^{2\,\beta }}\,(1+\Vert \nabla f\Vert _{L^{h}(B_{R})})^{\beta }\,\left( \int _{B_{R}} \mathcal {U}^{\gamma _{j-1}}\,d\mu \right) ^\frac{\gamma _j}{\gamma _{j-1}}+C. \end{aligned}$$
(5.22)

Step 3: Moser’s iteration

Estimate (5.22) is the analogue of (5.9), except that the Lebesgue measure \(dx\) is now replaced by the measure \(d\mu \), and the index \(j\) now ranges over \(j\ge j_0+1\) instead of \(j\ge 0\) as in (5.9). Following the same iteration argument and starting from \(j=j_0+1\), we are led to

$$\begin{aligned} \Vert \mathcal {U}\Vert _{L^{\infty }(B_{r},\,d\mu )}\le C\, \left( \frac{1+\Vert \nabla f\Vert _{L^{h}(B_{R})}}{R-r}\right) ^{\beta '}\,\left( \int _{B_{R}} \mathcal {U}^{\gamma _{j_0}}\,d\mu +1\right) ^\frac{1}{\gamma _{j_0}}, \end{aligned}$$
(5.23)

for some \(C=C(N,p,h, \delta )>1\), \(\beta '=\beta '(N,p,h)>0\).
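To see why the iteration producing (5.23) converges, note that after taking logarithms and normalizing by \(\gamma _j\), the constants of (5.22) contribute a series with terms of size roughly \(j\,2^{-j}\), which is summable since \(\gamma _j\) grows geometrically. The sketch below illustrates this summability; the dyadic radii \(s_j=r+(R-r)\,2^{-j}\) are a standard choice assumed here, and \(C\), \(\beta \), \(R\), \(r\) and the norm of \(f\) are illustrative placeholders (the additive constant in (5.22) is ignored).

```python
import math

# Summability behind the Moser iteration leading to (5.23): iterating (5.22)
# on radii s_j = r + (R - r) 2^{-j} gives log Y_j <= c_j/gamma_j + log Y_{j-1},
# where Y_j = (int U^{gamma_j} dmu)^{1/gamma_j} and c_j collects the constants.
# We check that sum_j c_j/gamma_j converges. All numeric values are placeholders.
N, p, h = 4, 3.0, 2.5
tstar = 2.0 * N / (N - 2)
hp = h / (h - 1.0)
gamma = lambda j: 2 ** (j + 2) * hp - tstar * p / 2.0

C, beta, R, r, normf = 10.0, 30.0, 1.0, 0.5, 1.0
def c(j):
    # log of the multiplicative constant of (5.22) at step j
    return (math.log(C) + 5 * j * beta * math.log(2)
            + 2 * beta * math.log(2 ** (j + 1) / (R - r))
            + beta * math.log(1 + normf))

partial = [0.0]
for j in range(3, 60):             # j starts at j_0 + 1 = 3 for these values
    partial.append(partial[-1] + c(j) / gamma(j))
tail = partial[-1] - partial[30]   # remainder of the series: negligible
```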

Step 4: \(L^{\infty }-L^{p}\) estimate

We now want to replace the \(L^{\gamma _{j_0}}(B_R,d\mu )\) norm of \(\mathcal {U}\) in the right-hand side of (5.23) by its \(L^p(B_R,dx)\) norm. Let \(q_1:=2^{j_1+1}-1\), where

$$\begin{aligned} j_1:=\min \left\{ j\ge j_0 : j+1\ge \log _2\left( 1+\frac{\gamma _{j_0}}{2^*}\right) \right\} . \end{aligned}$$

Then \(\gamma _{j_0}\le 2^*\,q_1\) and thus, by using that

$$\begin{aligned} \mathcal {U}^{\gamma _{j_0}}\le 2^{2^*q_1-\gamma _{j_0}}\,\mathcal {U}^{2^*q_1},\qquad \text{ whenever } \mathcal {U}\ge \frac{1}{2}, \end{aligned}$$

we have

$$\begin{aligned} \Vert \mathcal {U}\Vert _{L^{\gamma _{j_0}}(B_R,\,d\mu )}\le C\,\Vert \mathcal {U}\Vert ^\frac{2^*q_1}{\gamma _{j_0}}_{L^{2^* q_1}(B_R,\,d\mu )}. \end{aligned}$$
(5.24)
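Both the inequality \(\gamma _{j_0}\le 2^*\,q_1\) and the pointwise bound behind (5.24) are easy to verify numerically. The sketch below uses illustrative values of \(N\), \(p\), \(h\), with \(j_0\) and \(j_1\) computed as in the text.

```python
import math

# Check gamma_{j_0} <= 2^* q_1 and the pointwise bound
#   U^{gamma_{j_0}} <= 2^{2^* q_1 - gamma_{j_0}} U^{2^* q_1}   for U >= 1/2.
# Illustrative values of N, p, h; j_0 and j_1 are computed as in the text.
N, p, h = 4, 3.0, 2.5
tstar = 2.0 * N / (N - 2)
hp = h / (h - 1.0)

def valid(j):
    q = 2 ** (j + 1) - 1
    return (q >= max((p - 2 * hp) / (2 * (hp - 1)), tstar * p / (2 * hp) - 1)
            and tstar * p / 2.0 <= 2 ** j * hp)
j0 = next(j for j in range(64) if valid(j))
g0 = 2 ** (j0 + 2) * hp - tstar * p / 2.0          # gamma_{j_0}

j1 = next(j for j in range(j0, 64) if j + 1 >= math.log2(1 + g0 / tstar))
q1 = 2 ** (j1 + 1) - 1

order_ok = g0 <= tstar * q1 + 1e-12
pointwise_ok = all(
    U ** g0 <= 2 ** (tstar * q1 - g0) * U ** (tstar * q1) + 1e-12
    for U in [0.5 + 0.01 * k for k in range(500)])
```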

We rely on (5.14) with \(q=q_1\) to get for every \(0<r<t<s<R\)

$$\begin{aligned} \Vert \mathcal {U}\Vert _{L^{2^* q_1}(B_t,\,d\mu )}^{2\,q_1} \le \frac{C}{(s-t)^2}\,\left( 1+\Vert \nabla f\Vert _{L^{h}(B_{R})}\right) \,\left( \Vert \mathcal {U}\Vert _{L^{2\,(q_1+1)\,h'}(B_s)}^{2\,q_1+p} +1\right) , \end{aligned}$$
(5.25)

for some new constant \(C=C(N,p,h,\delta )>1\).

Since \(j_1\ge j_0\), we have \(p<(2\,q_1+2)\,h'<(2q_1+p)\frac{2^*}{2}\), and thus, by interpolation in Lebesgue spaces

$$\begin{aligned} \Vert \mathcal {U}\Vert _{L^{2\,(q_1+1)\,h'}(B_s)}\le \Vert \mathcal {U}\Vert _{L^{2^*\,q_1+\frac{2^*}{2}\,p}(B_s)}^{\theta }\, \Vert \mathcal {U}\Vert _{L^{p}(B_s)}^{1-\theta }, \end{aligned}$$
(5.26)

where \(\theta \in (0,1)\) is determined as usual by scale invariance, i.e. \(\frac{1}{2\,(q_1+1)\,h'}=\frac{\theta }{2^*\,q_1+\frac{2^*}{2}\,p}+\frac{1-\theta }{p}\). As in the proof of (5.15), we have

$$\begin{aligned} \Vert \mathcal {U}\Vert _{L^{2^*q_1+\frac{2^*}{2}\,p}(B_s)}\le C\,\Vert \mathcal {U}\Vert _{L^{2^*q_1}(B_s,\,d\mu )}^{\frac{2\,q_1}{2\,q_1+p}} + C. \end{aligned}$$

Inserting this last estimate into (5.26), we obtain

$$\begin{aligned} \Vert \mathcal {U}\Vert _{L^{2\,(q_1+1)\,h'}(B_s)}^{2\,q_1+p}\le C\, \Vert \mathcal {U}\Vert _{L^{2^*q_1}(B_s,\, d\mu )}^{2\,q_1\,\theta } \Vert \mathcal {U}\Vert _{L^{p}(B_s)}^{(1-\theta )\,(2\,q_1+p)} + C\,\Vert \mathcal {U}\Vert _{L^{p}(B_s)}^{(1-\theta )\,(2\,q_1+p)}, \end{aligned}$$

up to changing the constant \(C=C(N,p,h,\delta )>1\). In view of (5.25), this gives

$$\begin{aligned} \begin{aligned} \Vert \mathcal {U}\Vert _{L^{2^*q_1}(B_t,\,d\mu )}^{2\,q_1}&\le \frac{C}{(s-t)^2}\,\left( 1+\Vert \nabla f\Vert _{L^{h}(B_{R})}\right) \\&\quad \times \left( \Vert \mathcal {U}\Vert _{L^{2^*q_1}(B_s,\,d\mu )}^{2\,q_1\,\theta }\, \Vert \mathcal {U}\Vert _{L^{p}(B_s)}^{(1-\theta )\,(2\,q_1+p)} + \Vert \mathcal {U}\Vert _{L^{p}(B_s)}^{(1-\theta )\,(2\,q_1+p)}+1\right) . \end{aligned} \end{aligned}$$

By Young’s inequality, we get

$$\begin{aligned} \begin{aligned} \Vert \mathcal {U}\Vert _{L^{2^*q_1}(B_t,\,d\mu )}^{2\,q_1}&\le \theta \,\Vert \mathcal {U}\Vert _{L^{2^*q_1}(B_s,\,d\mu )}^{2\,q_1}\\&\quad +(1-\theta )\,\left( \frac{C}{(s-t)^2}(1+\Vert \nabla f\Vert _{L^{h}(B_{R})})\right) ^{\frac{1}{1-\theta }} \Vert \mathcal {U}\Vert _{L^{p}(B_{R})}^{\,(2\,q_1+p)}\\&\quad + \frac{C}{(s-t)^2}\,\left( 1+\Vert \nabla f\Vert _{L^{h}(B_{R})}\right) \, \left( \Vert \mathcal {U}\Vert _{L^{p}(B_{R})}^{(1-\theta )\,(2\,q_1+p)}+1\right) . \end{aligned} \end{aligned}$$

By Lemma 2.5, this implies

$$\begin{aligned} \Vert \mathcal {U}\Vert _{L^{2^*\,q_1}(B_r,\,d\mu )}^{2\,q_1} \le C\,\left( \frac{1}{(R-r)^2}(1+\Vert \nabla f\Vert _{L^{h}(B_{R})})\right) ^{\frac{1}{1-\theta }} \left( \Vert \mathcal {U}\Vert _{L^{p}(B_R)}^{\,(2\,q_1+p)} + 1\right) , \end{aligned}$$

after some standard manipulations. Coming back to (5.23) and taking into account (5.24), we obtain

$$\begin{aligned} \Vert \mathcal {U}\Vert _{L^{\infty }(B_{r_0},\,d\mu )}\le C\, \left( \frac{1+\Vert \nabla f\Vert _{L^{h}(B_{R_0})}}{R_0-r_0}\right) ^{\sigma _2}\,\left( \Vert \mathcal {U}\Vert _{L^{p}(B_{R_0})}^{\sigma _1}+1\right) , \end{aligned}$$

where \(C=C(N,p,h,\delta )>1\) and \(\sigma _i=\sigma _i(N,p,h)>0\), for \(i=1,2\). By definition of \(\mathcal {U}\), we have

$$\begin{aligned} |\nabla u|\le 2\,\delta \,\sqrt{N}\,\mathcal {U}\le \sqrt{N}\, |\nabla u|. \end{aligned}$$
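This equivalence of norms is a direct consequence of the definition of \(\mathcal {U}\); a quick check with random vectors (the value of \(\delta \) and the vectors are illustrative):

```python
import random, math

# Check |grad u| <= 2 delta sqrt(N) U <= sqrt(N) |grad u|, where
# U = max_k |u_{x_k}| / (2 delta). Random illustrative vectors.
random.seed(1)
N, delta = 4, 0.7
ok = True
for _ in range(1000):
    g = [random.uniform(-3.0, 3.0) for _ in range(N)]
    norm = math.sqrt(sum(gk * gk for gk in g))
    U = max(abs(gk) for gk in g) / (2.0 * delta)
    mid = 2.0 * delta * math.sqrt(N) * U
    if not (norm <= mid + 1e-12 <= math.sqrt(N) * norm + 1e-12):
        ok = False
```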

Since \(\Vert \mathcal {U}\Vert _{L^{\infty }(B_{r_0},\,d\mu )}+1\ge \Vert \mathcal {U}\Vert _{L^{\infty }(B_{r_0})}\), it follows that

$$\begin{aligned} \Vert \nabla u\Vert _{L^{\infty }(B_{r_0})} \le C \, \left( \frac{1+\Vert \nabla f\Vert _{L^{h}(B_{R_0})}}{R_0-r_0}\right) ^{\sigma _2}\,\left( \Vert \nabla u\Vert _{L^{p}(B_{R_0})}^{\sigma _1}+1\right) , \end{aligned}$$

possibly for a different constant \(C=C(N,p,h,\delta )>1\). This completes the proof. \(\square \)