1 Introduction

1.1 Motivation

The following generic nonlinear parabolic model

$$\begin{aligned} \begin{array}{llll} \partial _t \beta ({\overline{u}})-\mathrm{div}\left( {\varvec{a}}({\varvec{x}},\nu ({\overline{u}}),\nabla \zeta ({\overline{u}})) \right) = f &{}\hbox { in } \Omega \times (0,T),\\ \beta ({\overline{u}})({\varvec{x}},0) = \beta (u_\mathrm{ini})({\varvec{x}})&{}\hbox { in } \Omega ,\\ \zeta ({\overline{u}})=0 &{}\hbox { on } \partial \Omega \times (0,T), \end{array}\end{aligned}$$
(1)

where \(\beta \) and \(\zeta \) are non-decreasing, \(\nu \) is such that \(\nu ' = \beta '\zeta '\) and \({\varvec{a}}\) is a Leray–Lions operator, arises in various frameworks (see next section for precise hypotheses on the data). This model includes

  1. The Richards model, setting \(\zeta (s)=s\), \(\nu = \beta \) and \({\varvec{a}}({\varvec{x}},\nu ({\overline{u}}),\nabla \zeta ({\overline{u}})) = K({\varvec{x}},\beta ({\overline{u}})) \nabla {\overline{u}}\), which describes the flow of water in a heterogeneous anisotropic underground medium,

  2. The Stefan model [8], setting \(\beta (s)=s\), \(\nu = \zeta \), \({\varvec{a}}({\varvec{x}},\nu ({\overline{u}}),\nabla \zeta ({\overline{u}})) = K({\varvec{x}},\zeta ({\overline{u}})) \nabla \zeta ({\overline{u}})\), which arises in the study of a simplified heat diffusion process in a melting medium,

  3. The p-Laplace problem, setting \(\beta (s) = \zeta (s) = \nu (s) = s\) and \({\varvec{a}}({\varvec{x}},\nu ({\overline{u}}),\nabla \zeta ({\overline{u}})) = |\nabla {\overline{u}}|^{p-2} \nabla {\overline{u}}\), which is involved in the motion of glaciers [37] or flows of incompressible turbulent fluids through porous media [16]. General Leray–Lions operators \({\varvec{a}}({\varvec{x}},s,{\varvec{\xi }})\) have growth, monotonicity and coercivity properties [see (2f)–(2h) below] which ensure that \(-\mathrm{div}({\varvec{a}}({\varvec{x}},w,\nabla \cdot ))\) maps \(W^{1,p}_0(\Omega )\) into \(W^{-1,p'}(\Omega )\), and thanks to which this differential operator is viewed as a generalisation of the p-Laplace operator.

The numerical approximation of these models has been extensively studied in the literature; see the fundamental work on the Stefan problem [48] and [30, 51] for some of its numerical approximations, [33, 46] for the Richards problem, and [19, 23] and references therein for some studies of convergence of numerical methods for the Leray–Lions problem. In [52], fully discrete implicit schemes are considered in 2D domains for the problem \(\partial _t e-\Delta u=f\), \(e\in \beta (u)\) with \(\beta \) a maximal monotone operator; error estimates are obtained and the results are relevant, e.g., for the Stefan problem and the porous medium equation.

More generally, studies have been carried out on numerical time-stepping approximations of non-linear abstract parabolic equations. In [43] the authors study the stability and convergence properties of linearised implicit methods for the time discretisation of nonlinear parabolic equations in the general framework of Hilbert spaces. The time discretisation of nonlinear evolution equations in an abstract Banach space setting of analytic semigroups is studied in [38]; this setting covers fully nonlinear parabolic initial-boundary value problems with smooth coefficients. Reference [3] deals with a general formulation for semi-discretisations of linear parabolic evolution problems in Hilbert spaces; this time-stepping formulation encompasses continuous and discontinuous Galerkin methods, as well as Runge–Kutta methods. The study in [3] has been extended in [2] to semi-linear equations, i.e. with the addition of a right-hand side which is locally Lipschitz-continuous with respect to the unknown. In the same direction, we also mention [39, 42, 44, 45, 49] for Runge–Kutta time discretisations of linear and quasilinear parabolic equations (reaction-diffusion, Navier–Stokes equations, etc.). Multistep methods have also been considered, see e.g. [50].

However, most of these studies are only applicable under regularity assumptions on the solution or data, and to semi-linear equations or semi-discretised schemes. None deals with as many non-linearities and degeneracies as in (1). Moreover, the results in these works mostly yield space-time averaged convergences, e.g. in \(L^2(\Omega \times (0,T))\). Yet, the quantity of interest is often not \({\overline{u}}\) on \(\Omega \times (0,T)\) but \({\overline{u}}\) at a given time, for example \(t=T\). Current numerical analyses therefore do not ensure that this quantity of interest is properly approximated by numerical methods.

The usual way to obtain pointwise-in-time approximation results for numerical schemes is to prove estimates in \(L^\infty (0,T;L^2(\Omega ))\) on \(u-{\overline{u}}\), where u is the approximate solution. Establishing such error estimates is however only feasible when uniqueness of the solution \({\overline{u}}\) to (1) can be proved, which is the case for Richards’ and Stefan’s problems (with K only depending on \({\varvec{x}}\)), but not for more complex non-linear parabolic problems such as (1) or even p-Laplace problems. It moreover requires some regularity assumptions on \({\overline{u}}\), which clearly fail to hold for (1) (and simpler p-Laplace problems); indeed, because of the possible plateaux of \(\beta \) and \(\zeta \), the solution’s gradient can develop jumps.

The purpose of this article is to prove that, using Discrete Functional Analysis techniques (i.e. the translation to numerical analysis of nonlinear analysis techniques), an \(L^\infty (0,T;L^2(\Omega ))\) convergence result can be established for numerical approximations of (1), without requiring non-physical regularity assumptions on the data. Note that, although Richards’ and Stefan’s models are formally equivalent when \(\beta \) and \(\zeta \) are strictly increasing (consider \(\beta =\zeta ^{-1}\) to pass from one model to the other), they change nature when these functions are allowed to have plateaux. Stefan’s model can degenerate to an ODE (if \(\zeta \) is constant on the range of the solution) and Richards’ model can become a non-transient elliptic equation (if \(\beta \) is constant on this range). The innovative technique we develop in this paper is nonetheless generic enough to work directly on (1) and with a vast number of numerical methods.

That being said, a particular numerical framework must be selected to write precise equations and estimates. The framework we choose is that of gradient schemes, which has the double benefit of covering a vast number of numerical methods, and of having already been studied for many models (elliptic, parabolic, linear or non-linear, possibly degenerate, etc.) with various boundary conditions. The schemes or families of schemes included in the gradient schemes framework, and to which our results therefore directly apply, currently are:

  • Galerkin methods, including conforming finite element schemes,

  • finite element with mass lumping [12],

  • the Crouzeix–Raviart non-conforming finite element, with or without mass lumping [14, 27],

  • the Raviart–Thomas mixed finite elements [9],

  • the vertex approximate gradient scheme [31],

  • the hybrid mimetic mixed family [22], which includes mimetic finite differences [10], mixed finite volume [20] and the SUSHI scheme [29],

  • the discrete duality finite volume scheme in dimension 2 [5, 40], and the CeVeFE-discrete duality finite volume scheme in dimension 3 [13],

  • the multi-point flux approximation O-method [1, 25].

We refer the reader to [21, 23, 28, 32, 34] for more details. Let us finally emphasize that the unified convergence study of numerical schemes for Problem (1), which combines a general Leray–Lions operator and nonlinear functions \(\beta \) or \(\zeta \), seems to be new even without the uniform-in-time convergence result.

The paper is organised as follows. In Sect. 1.2, we present the assumptions and the notion of weak solution for (1) and, in Sect. 1.3, we give an overview of the ideas involved in the proof of uniform-in-time convergence. This overview is given not in a numerical analysis context but in the context of a pure stability analysis of (1) with very little regularity on the data, for which the uniform-in-time convergence result also seems to be new. Section 2 presents the gradient schemes for our generic model (1). We give in Sect. 3 some preliminaries to the convergence study, in particular a crucial uniform-in-time weak-in-space discrete Aubin–Simon compactness result. Section 4 contains the complete convergence proof of gradient schemes for (1), including the uniform-in-time convergence result. This proof is initially conducted under a simplifying assumption on \(\beta \) and \(\zeta \). We demonstrate in Sect. 5 that, in the case \(p\ge 2\), this assumption can be removed thanks to a discrete compensated compactness result. We also remark in this section that our results apply to the model considered in [52]. An appendix concludes the article with technical results, in particular a generalisation of the Ascoli–Arzelà compactness result to discontinuous functions and a characterisation of the uniform convergence of a sequence of functions; these results are critical to establishing our uniform-in-time convergence result. We believe that the discrete functional analysis results we establish in order to study the approximations of (1)—in particular the discrete compensated compactness theorem (Theorem 5.4)—could be critical to the numerical analysis of other degenerate or coupled models of physical importance.

Note that the main results and their proofs have been sketched and illustrated by some numerical examples in [24], for \({\varvec{a}}({\varvec{x}},\nu ({\overline{u}}),\nabla \zeta ({\overline{u}})) = \nabla \zeta ({\overline{u}})\) and \(\beta =\mathrm{Id}\) or \(\zeta =\mathrm{Id}\).

1.2 Hypotheses and weak sense for the continuous problem

We consider the evolution problem (1) under the following hypotheses.

$$\begin{aligned}&\begin{array}{l}\Omega \text{ is } \text{ an } \text{ open } \text{ bounded } \text{ subset } \text{ of } \mathbb R^d (d\in \mathbb N^\star ) \text{ and } T>0,\end{array} \end{aligned}$$
(2a)
$$\begin{aligned}&\begin{array}{l} \zeta \in C^0(\mathbb R) \text{ is } \text{ non-decreasing, } \text{ Lipschitz } \text{ continuous } \text{ with } \text{ Lipschitz } \text{ constant } L_\zeta >0, \\ \zeta (0) = 0 \text{ and, } \text{ for } \text{ some } M_0,M_1>0, |\zeta (s)| \ge M_0 |s| - M_1 \text{ for } \text{ all } s\in \mathbb R. \end{array} \end{aligned}$$
(2b)
$$\begin{aligned}&\begin{array}{l} \beta \in C^0(\mathbb R)\hbox { is non-decreasing, Lipschitz continuous with Lipschitz constant }L_\beta > 0,\\ \hbox {and }\beta (0) = 0.\end{array} \end{aligned}$$
(2c)
$$\begin{aligned}&\begin{array}{l} \displaystyle \forall s\in \mathbb R,\quad \nu (s) = \int _0^s \zeta '(q)\beta '(q) \mathrm{d}q. \end{array}\end{aligned}$$
(2d)
$$\begin{aligned}&\begin{array}{l} {\varvec{a}}: \Omega \times \mathbb R\times \mathbb R^d\rightarrow \mathbb R^d\text { is a Carath}{\acute{\mathrm{e}}}\text {odory function} \end{array} \end{aligned}$$
(2e)

[i.e. a function such that, for a.e. \({\varvec{x}}\in \Omega \), \((s,{\varvec{\xi }}) \mapsto {\varvec{a}}({\varvec{x}}, s,{\varvec{\xi }})\) is continuous and, for any \((s,{\varvec{\xi }}) \in \mathbb R\times \mathbb R^d\), \({\varvec{x}}\mapsto {\varvec{a}}({\varvec{x}},s,{\varvec{\xi }})\) is measurable] and, for some \(p\in (1,+\infty )\),

$$\begin{aligned}&\begin{array}{l} \exists \underline{a}\in (0,+\infty ) \ : \ {\varvec{a}}({\varvec{x}},s,{\varvec{\xi }})\cdot {\varvec{\xi }}\ge \underline{a}|{\varvec{\xi }}|^p,\hbox { for a.e. }{\varvec{x}}\in \Omega ,\ \forall s\in \mathbb R,\ \forall {\varvec{\xi }}\in \mathbb R^d, \end{array} \end{aligned}$$
(2f)
$$\begin{aligned}&\begin{array}{l} ({\varvec{a}}({\varvec{x}},s,{\varvec{\xi }}) - {\varvec{a}}({\varvec{x}},s,{\varvec{\chi }}))\cdot ({\varvec{\xi }}-{\varvec{\chi }})\ge 0,\hbox { for a.e. }{\varvec{x}}\in \Omega ,\ \forall s\in \mathbb R,\ \forall {\varvec{\xi }},{\varvec{\chi }}\in \mathbb R^d, \end{array} \end{aligned}$$
(2g)
$$\begin{aligned}&\exists \overline{a}\in L^{p'}(\Omega ),\,\exists \mu \in (0,+\infty )\ : |{\varvec{a}}({\varvec{x}},s,{\varvec{\xi }})|\le \overline{a}({\varvec{x}}) + \mu |{\varvec{\xi }}|^{p-1},\nonumber \\&\quad \hbox { for a.e. }{\varvec{x}}\in \Omega ,\ \forall s\in \mathbb R,\ \forall {\varvec{\xi }}\in \mathbb R^d. \end{aligned}$$
(2h)

We also assume, setting \(p'=\frac{p}{p-1}\) the dual exponent of the p previously introduced,

$$\begin{aligned}&\begin{array}{l} u_\mathrm{ini}\in L^2(\Omega ),\quad f \in L^{p'}(\Omega \times (0,T)). \end{array} \end{aligned}$$
(2i)

We denote by \(R_\beta \) the range of \(\beta \) and define the pseudo-inverse function \(\beta _r:R_\beta \rightarrow \mathbb R\) of \(\beta \) by

$$\begin{aligned} \begin{array}{llll} \forall s\in R_\beta ,\; \displaystyle \beta _r(s)&{}=&{}\displaystyle \left\{ \begin{array}{ll} \inf \{t\in \mathbb R\,|\,\beta (t)=s\}&{} \text{ if } s> 0,\\ 0&{} \text{ if } s=0,\\ \sup \{t\in \mathbb R\,|\,\beta (t)=s\}&{} \text{ if } s< 0,\end{array}\right. \\ &{}=&{}\displaystyle \text{ closest } t \text{ to } 0 \text{ such } \text{ that } \beta (t)=s. \end{array} \end{aligned}$$
(3)

Since \(\beta (t)\) has the same sign as t, we have \(\beta _r\ge 0\) on \(R_\beta \cap \mathbb R^+\) and \(\beta _r\le 0\) on \(R_\beta \cap \mathbb R^-\). We then define \(B:R_\beta \rightarrow [0,\infty ]\) by

$$\begin{aligned} B(z)=\int _0^z \zeta (\beta _r(s))\,ds. \end{aligned}$$

Since \(\beta _r\) is non-decreasing, this expression is always well-defined in \([0,\infty )\). The signs of \(\beta _r\) and \(\zeta \) ensure that B is non-decreasing on \(R_\beta \cap \mathbb R^+\) and non-increasing on \(R_\beta \cap \mathbb R^-\), and therefore has limits (possibly \(+\infty \)) at the endpoints of \(R_\beta \). We can thus extend B as a function defined on \(\overline{R_\beta }\) with values in \([0,+\infty ]\).
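To make the definitions of \(\beta _r\) and B concrete, the following Python sketch evaluates them for one hypothetical choice of data that is not taken from the article: \(\zeta =\mathrm{Id}\) and a piecewise-linear \(\beta \) with a plateau on \([1,2]\). The function names, the hand-written pseudo-inverse and the midpoint quadrature are assumptions made for this illustration only.

```python
import numpy as np

def beta(s):
    """Hypothetical beta: identity up to 1, plateau on [1, 2], then slope 1."""
    s = np.asarray(s, dtype=float)
    return np.where(s <= 1.0, s, np.where(s <= 2.0, 1.0, s - 1.0))

def zeta(s):
    """Hypothetical zeta: the identity (Richards-type setting)."""
    return np.asarray(s, dtype=float)

def beta_r(z):
    """Pseudo-inverse (3) of the beta above: closest t to 0 with beta(t) = z."""
    z = np.asarray(z, dtype=float)
    return np.where(z <= 1.0, z, z + 1.0)

def B(z, n=20000):
    """Midpoint-rule approximation of B(z) = int_0^z zeta(beta_r(s)) ds."""
    h = z / n
    mid = (np.arange(n) + 0.5) * h
    return float(np.sum(zeta(beta_r(mid))) * h)

def B_exact(z):
    """Closed form of B for this particular beta and zeta."""
    return 0.5 * z**2 if z <= 1.0 else 0.5 * z**2 + z - 1.0
```

For this data, \(\beta _r\) jumps from 1 to 2 across the plateau value \(z=1\), and \(B(\beta (3))=B(2)=3\) coincides with \(\int _0^3 \zeta (q)\beta '(q)\,\mathrm{d}q=\frac{1}{2}+\frac{5}{2}\), which is the integrated form of the relation \((B\circ \beta )'=\zeta \beta '\).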

The precise notion of solution to (1) that we consider is the following:

$$\begin{aligned} \left\{ \begin{array}{llll} {\overline{u}}\in L^p(0,T;L^p(\Omega )),\; \zeta ({\overline{u}}) \in L^p(0,T;W^{1,p}_0(\Omega )),\\ B(\beta ({\overline{u}}))\in L^\infty (0,T;L^1(\Omega )),\ \beta ({\overline{u}})\in C([0,T];L^2(\Omega ){-}\text{ w }),\\ \partial _t\beta ({\overline{u}})\in L^{p'}(0,T;W^{-1,p'}(\Omega )),\\ \beta ({\overline{u}})(\cdot ,0) = \beta (u_\mathrm{ini}) \text{ in } L^2(\Omega ),\\ \displaystyle \int _0^T \langle \partial _t\beta ({\overline{u}})(\cdot ,t), {\overline{v}}(\cdot ,t)\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t \\ \displaystyle \quad +\int _0^T \int _\Omega {\varvec{a}}({\varvec{x}},\nu ({\overline{u}}({\varvec{x}},t)),\nabla \zeta ({\overline{u}})({\varvec{x}},t))\cdot \nabla {\overline{v}}({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t\\ \displaystyle \quad \quad = \int _0^T \int _\Omega f({\varvec{x}},t) {\overline{v}}({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t,\quad \forall {\overline{v}}\in L^p(0,T;W^{1,p}_0(\Omega )). \end{array}\right. \end{aligned}$$
(4)

where \(C([0,T];L^2(\Omega ){-}\text{ w })\) denotes the space of functions \([0,T]\rightarrow L^2(\Omega )\) that are continuous for the weak-\(*\) topology of \(L^2(\Omega )\). Here and in the following, we remove the mention of \(\Omega \) in the duality bracket \(\langle \cdot , \cdot \rangle _{W^{-1,p'},W^{1,p}_0} =\langle \cdot , \cdot \rangle _{W^{-1,p'}(\Omega ),W^{1,p}_0(\Omega )}\).

Remark 1.1

The derivative \(\partial _t\beta ({\overline{u}})\) is to be understood in the usual sense of distributions on \(\Omega \times (0,T)\). Since the set \(\mathcal T=\{\sum _{i=1}^q\varphi _i(t)\gamma _i({\varvec{x}})\,:\, q\in \mathbb N,\varphi _i\in C^\infty _c(0,T),\gamma _i\in C^\infty _c(\Omega )\}\) of tensorial functions in \(C^\infty _c(\Omega \times (0,T))\) is dense in \(L^p(0,T;W^{1,p}_0(\Omega ))\), one can ensure that this distribution derivative \(\partial _t \beta ({\overline{u}})\) belongs to \(L^{p'}(0,T;W^{-1,p'}(\Omega ))=(L^p(0,T;W^{1,p}_0(\Omega )))'\) by checking that the linear form

$$\begin{aligned} \varphi \in \mathcal T\mapsto \langle \partial _t \beta ({\overline{u}}),\varphi \rangle _{\mathcal D',\mathcal D} =-\int _0^T\int _\Omega \beta ({\overline{u}})({\varvec{x}},t)\partial _t\varphi ({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t \end{aligned}$$

is continuous for the norm of \(L^p(0,T;W^{1,p}_0(\Omega ))\).

Note that the continuity property of \(\beta ({\overline{u}})\) in (4) is natural. Indeed, since \(\beta ({\overline{u}})\in L^\infty (0,T;L^2(\Omega ))\) [this comes from \(B(\beta ({\overline{u}}))\in L^\infty (0,T;L^1(\Omega ))\) and (26)], the PDE in the sense of distributions shows that for any \(\varphi \in C^\infty _c(\Omega )\) the mapping \(T_\varphi :t\mapsto \langle \beta ({\overline{u}})(t),\varphi \rangle _{L^2}\) belongs to \(W^{1,1}(0,T)\subset C([0,T])\). By density of \(C^\infty _c(\Omega )\) in \(L^2(\Omega )\) and the integrability properties of \(\beta ({\overline{u}})\), we deduce that \(T_\varphi \in C([0,T])\) for any \(\varphi \in L^2(\Omega )\), which precisely establishes the continuity of \(\beta ({\overline{u}}):[0,T]\rightarrow L^2(\Omega ){-}\text{ w }\).

This notion of \(\beta ({\overline{u}})\) as a function continuous in time is nevertheless a subtle one. It is to be understood in the sense that the function \(({\varvec{x}},t)\mapsto \beta ({\overline{u}}({\varvec{x}},t))\) has an a.e. representative which is continuous \([0,T]\rightarrow L^2(\Omega ){-}\text{ w }\). In other words, there is a function \(Z\in C([0,T];L^2(\Omega ){-}\text{ w })\) such that \(Z(t)({\varvec{x}})=\beta ({\overline{u}}({\varvec{x}},t))\) for a.e. \(({\varvec{x}},t)\in \Omega \times (0,T)\). We must however make sure, when dealing with pointwise values in time, to separate Z from \(\beta ({\overline{u}}(\cdot ,\cdot ))\) as \(\beta ({\overline{u}}(\cdot ,t_1))\) may not make sense for a particular \(t_1\in [0,T]\). That being said, in order to adopt a simple notation, we will denote by \(\beta ({\overline{u}})(\cdot ,\cdot )\) the function Z, and by \(\beta ({\overline{u}}(\cdot ,\cdot ))\) the a.e.-defined composition of \(\beta \) and \({\overline{u}}\). Hence, it will make sense to talk about \(\beta ({\overline{u}})(\cdot ,t_1)\) for a particular \(t_1\in [0,T]\), and we will only write \(\beta ({\overline{u}})({\varvec{x}},t)=\beta ({\overline{u}}({\varvec{x}},t))\) for a.e. \(({\varvec{x}},t)\in \Omega \times (0,T)\). Note that from this a.e. equality we can ensure that \(\beta ({\overline{u}})(\cdot ,\cdot )\) takes its values in the closure \(\overline{R_\beta }\) of the range of \(\beta \).

1.3 General ideas for the uniform-in-time convergence result

As explained in the introduction, the main innovative result of this article is the uniform-in-time convergence result (Theorem 2.16 below). Although it is stated and proved in the context of numerical approximations of (1), we emphasize that the ideas underlying its proof are also applicable to the theoretical analysis of PDEs. Let us informally present these ideas on the following continuous approximation of (1):

$$\begin{aligned} \begin{array}{llll} \partial _t \beta ({\overline{u}}_\varepsilon )-\mathrm{div}\left( {\varvec{a}}_\varepsilon ({\varvec{x}},\nu ({\overline{u}}_\varepsilon ),\nabla \zeta ({\overline{u}}_\varepsilon )) \right) = f &{}\hbox { in } \Omega \times (0,T),\\ \beta ({\overline{u}}_\varepsilon )({\varvec{x}},0) = \beta (u_\mathrm{ini})({\varvec{x}})&{}\hbox { in } \Omega ,\\ \zeta ({\overline{u}}_\varepsilon )=0 &{}\hbox { on } \partial \Omega \times (0,T) \end{array}\end{aligned}$$
(5)

where \({\varvec{a}}_\varepsilon \) satisfies Assumptions (2e)–(2h) with constants not depending on \(\varepsilon \) and, as \(\varepsilon \rightarrow 0\), \({\varvec{a}}_\varepsilon \rightarrow {\varvec{a}}\) locally uniformly with respect to \((s,{\varvec{\xi }})\).

We want to show here how to deduce from averaged convergences a strong uniform-in-time convergence result. We therefore assume the following convergences (up to a subsequence as \(\varepsilon \rightarrow 0\)), which are compatible with basic compactness results that can be obtained on \(({\overline{u}}_\varepsilon )_{\varepsilon }\) and also correspond to the initial convergences (18) on numerical approximations of (1):

$$\begin{aligned} \begin{array}{llll} \beta ({\overline{u}}_\varepsilon )\!\rightarrow \! \beta ({\overline{u}}) \text{ in } C([0,T];L^2(\Omega ){-}\text{ w }),\; \nu ({\overline{u}}_\varepsilon )\!\rightarrow \! \nu ({\overline{u}}) \text{ strongly } \text{ in } L^1(\Omega \times (0,T)),\\ \zeta ({\overline{u}}_\varepsilon )\rightarrow \zeta ({\overline{u}}) \text{ weakly } \text{ in } L^p(0,T;W^{1,p}_0(\Omega )),\\ {\varvec{a}}_\varepsilon (\cdot ,\nu ({\overline{u}}_\varepsilon ),\nabla \zeta ({\overline{u}}_\varepsilon )) \rightarrow {\varvec{a}}(\cdot ,\nu ({\overline{u}}),\nabla \zeta ({\overline{u}})) \text{ weakly } \text{ in } L^{p'}(\Omega \times (0,T))^d. \end{array} \end{aligned}$$
(6)

We will prove from these convergences that, along the same subsequence, \(\nu ({\overline{u}}_\varepsilon )\rightarrow \nu ({\overline{u}})\) strongly in \(C([0,T];L^2(\Omega ))\), which is our uniform-in-time convergence result.

We start by noticing that the weak-in-space uniform-in-time convergence of \(\beta ({\overline{u}}_\varepsilon )\) gives, for any \(T_0\in [0,T]\) and any family \((T_\varepsilon )_{\varepsilon >0}\) converging to \(T_0\) as \(\varepsilon \rightarrow 0\), \(\beta ({\overline{u}}_\varepsilon )(\cdot ,T_\varepsilon ) \rightarrow \beta ({\overline{u}})(\cdot ,T_0)\) weakly in \(L^2(\Omega )\). Classical strong-weak semi-continuity properties of convex functions (see Lemma 3.4) and the convexity of B (see Lemma 3.3) then ensure that

$$\begin{aligned} \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0))\mathrm{d}{\varvec{x}}\le \liminf _{\varepsilon \rightarrow 0} \int _\Omega B(\beta ({\overline{u}}_\varepsilon )({\varvec{x}},T_\varepsilon ))\mathrm{d}{\varvec{x}}. \end{aligned}$$
(7)

The second step is to notice that, by (2g) for \({\varvec{a}}_\varepsilon \),

$$\begin{aligned} \int _0^{T_\varepsilon } \int _\Omega \left[ {\varvec{a}}_\varepsilon (\cdot ,\nu ({\overline{u}}_\varepsilon ), \nabla \zeta ({\overline{u}}_\varepsilon )) - {\varvec{a}}_\varepsilon (\cdot ,\nu ({\overline{u}}_\varepsilon ), \nabla \zeta ({\overline{u}}))\right] \cdot \left[ \nabla \zeta ({\overline{u}}_\varepsilon )-\nabla \zeta ({\overline{u}})\right] \mathrm{d}{\varvec{x}}\mathrm{d}t \ge 0. \end{aligned}$$

Developing this expression and using the convergences (6), we find that

$$\begin{aligned}&\liminf _{\varepsilon \rightarrow 0}\int _0^{T_\varepsilon } \int _\Omega {\varvec{a}}_\varepsilon (\cdot ,\nu ({\overline{u}}_\varepsilon ), \nabla \zeta ({\overline{u}}_\varepsilon ))\cdot \nabla \zeta ({\overline{u}}_\varepsilon )({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\quad \ge \int _0^{T_0}\int _\Omega {\varvec{a}}(\cdot ,\nu ({\overline{u}}),\nabla \zeta ({\overline{u}})) \cdot \nabla \zeta ({\overline{u}}) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(8)

We then establish the following formula:

$$\begin{aligned}&\int _\Omega B(\beta ({\overline{u}}_\varepsilon ({\varvec{x}},T_\varepsilon ))) \mathrm{d}{\varvec{x}}+ \int _{0}^{T_\varepsilon } \int _\Omega {\varvec{a}}_\varepsilon ({\varvec{x}},\nu ({\overline{u}}_\varepsilon ({\varvec{x}},t)),\nabla \zeta ({\overline{u}}_\varepsilon )({\varvec{x}},t)) \cdot \nabla \zeta ({\overline{u}}_\varepsilon )({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t \nonumber \\&\quad = \int _\Omega B(\beta (u_\mathrm{ini}({\varvec{x}}))) \mathrm{d}{\varvec{x}}+ \int _{0}^{T_\varepsilon } \int _\Omega f({\varvec{x}},t) \zeta ({\overline{u}}_\varepsilon )({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(9)

This energy equation is formally obtained by multiplying (5) by \(\zeta ({\overline{u}}_\varepsilon )\) and integrating by parts, using \((B\circ \beta )'=\zeta \beta '\) (see Lemma 3.3); the rigorous justification of (9) is however quite technical – see Lemma 3.6 and Corollary 3.8. Thanks to (8), we can pass to the \(\limsup \) in (9) and we find, using the same energy equality with \(({\overline{u}},{\varvec{a}},T_0)\) instead of \(({\overline{u}}_\varepsilon ,{\varvec{a}}_\varepsilon ,T_\varepsilon )\),

$$\begin{aligned} \limsup _{\varepsilon \rightarrow 0} \int _\Omega B(\beta ({\overline{u}}_\varepsilon ({\varvec{x}},T_\varepsilon )))\mathrm{d}{\varvec{x}}\le \int _\Omega B(\beta ({\overline{u}}({\varvec{x}},T_0)))\mathrm{d}{\varvec{x}}. \end{aligned}$$
(10)

Combined with (7), this shows that \(\int _\Omega B(\beta ({\overline{u}}_\varepsilon ({\varvec{x}},T_\varepsilon )))\mathrm{d}{\varvec{x}}\rightarrow \int _\Omega B(\beta ({\overline{u}}({\varvec{x}},T_0)))\mathrm{d}{\varvec{x}}\). A uniform convexity property of B [see (28)] then allows us to deduce that \(\nu ({\overline{u}}_\varepsilon (\cdot ,T_\varepsilon ))\rightarrow \nu ({\overline{u}}(\cdot ,T_0))\) strongly in \(L^2(\Omega )\) and thus that \(\nu ({\overline{u}}_\varepsilon )\rightarrow \nu ({\overline{u}})\) strongly in \(C([0,T];L^2(\Omega ))\) (see Lemma 6.4).

Remark 1.2

A close examination of this proof indicates that equality in the energy relation (9) is not required for \({\overline{u}}_\varepsilon \). An inequality \(\le \) would be sufficient. This is particularly important in the context of numerical methods which may introduce additional numerical diffusion (for example due to an implicit-in-time discretisation) and therefore only provide an upper bound in this energy estimate, see (42). It is however essential that the limit solution \({\overline{u}}\) satisfies the equivalent of (9) with an equal sign (or \(\ge \)).

2 Gradient discretisations and gradient schemes

2.1 Definitions

We give here a minimal presentation of gradient discretisations and gradient schemes, limiting ourselves to what is necessary to study the discretisation of (1). We refer the reader to [21, 23, 31] for more details.

A gradient scheme can be viewed as a general formulation of several discretisations of (1) that are based on a nonconforming approximation of the weak formulation of the problem. This approximation is constructed by using discrete spaces and mappings, the set of which is called a gradient discretisation.

Definition 2.1

(Space-time gradient discretisation for homogeneous Dirichlet boundary conditions) We say that \({\mathcal D}= (X_{{\mathcal D},0}, \Pi _{\mathcal D},\nabla _{\mathcal D}, {\mathcal I}_{\mathcal D},(t^{(n)})_{n=0,\ldots ,N})\) is a space-time gradient discretisation for homogeneous Dirichlet boundary conditions if

  1. the set of discrete unknowns \(X_{{\mathcal D},0}\) is a finite dimensional real vector space,

  2. the linear mapping \(\Pi _{\mathcal D}:X_{{\mathcal D},0}\rightarrow L^{\infty }(\Omega )\) is a piecewise constant reconstruction operator in the following sense: there exists a set I of degrees of freedom and a family \((\Omega _i)_{i\in I}\) of disjoint subsets of \(\Omega \) such that \(X_{{\mathcal D},0}=\mathbb R^I\), \(\Omega =\bigcup _{i\in I}\Omega _i\) and, for all \(u=(u_i)_{i\in I}\in X_{{\mathcal D},0}\) and all \(i\in I\), \(\Pi _{\mathcal D}u=u_i\) on \(\Omega _i\),

  3. the linear mapping \(\nabla _{\mathcal D}: X_{{\mathcal D},0}\rightarrow L^p(\Omega )^d\) gives a reconstructed discrete gradient. It must be chosen such that \(\Vert \nabla _{\mathcal D}\cdot \Vert _{L^p(\Omega )^d}\) is a norm on \(X_{{\mathcal D},0}\),

  4. \({\mathcal I}_{\mathcal D}: L^2(\Omega )\rightarrow X_{{\mathcal D},0}\) is a linear interpolation operator,

  5. \(t^{(0)}=0<t^{(1)}<t^{(2)}<\cdots <t^{(N)}=T\).

We then set \({\delta t}^{(n+{\frac{1}{2}})} = t^{(n+1)} -t^{(n)}\) for \(n=0,\ldots ,N-1\), and \({\delta t}_{\mathcal D}= \max _{n=0,\ldots ,N-1} {\delta t}^{(n+{\frac{1}{2}})}\). We define the dual semi-norm \(|w|_{\star ,{\mathcal D}}\) of \(w\in X_{{\mathcal D},0}\) by

$$\begin{aligned} |w|_{\star ,{\mathcal D}}=\sup \left\{ \int _\Omega \Pi _{{\mathcal D}} w({\varvec{x}})\Pi _{{\mathcal D}}z({\varvec{x}})\mathrm{d}{\varvec{x}}\,:\, z\in X_{{\mathcal D},0},\;||\nabla _{\mathcal D}z||_{L^p(\Omega )^d}= 1\right\} . \end{aligned}$$
(11)
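For \(p=2\) the supremum in (11) has a closed form, which the following Python sketch illustrates on a hypothetical one-dimensional gradient discretisation of \(\Omega =(0,1)\) chosen only for this example: \(\nabla _{\mathcal D}\) is the derivative of the P1 interpolant of the interior nodal values, and \(\Pi _{\mathcal D}\) is constant on cells centred at the nodes (the two end cells are enlarged so that the cells cover \(\Omega \)). Writing M and K for the matrices of \(\int _\Omega \Pi _{\mathcal D}w\,\Pi _{\mathcal D}z\) and \(\int _\Omega \nabla _{\mathcal D}w\cdot \nabla _{\mathcal D}z\), the Cauchy–Schwarz inequality in the K inner product gives \(|w|_{\star ,{\mathcal D}}=\sqrt{(Mw)^T K^{-1} (Mw)}\). The names `lumped_p1` and `dual_seminorm` are assumptions of this sketch.

```python
import numpy as np

def lumped_p1(n):
    """Matrices of a hypothetical 1D gradient discretisation with n >= 2 interior nodes.

    M[i, i] is the length of the cell on which Pi_D u = u_i;
    K is the stiffness matrix of the P1 interpolant vanishing at 0 and 1.
    """
    h = 1.0 / (n + 1)
    m = np.full(n, h)
    m[0] = m[-1] = 1.5 * h          # end cells extended to the boundary
    M = np.diag(m)
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    return M, K

def dual_seminorm(w, M, K):
    """Dual semi-norm (11) for p = 2: sqrt((M w)^T K^{-1} (M w))."""
    b = M @ w
    return float(np.sqrt(b @ np.linalg.solve(K, b)))
```

The supremum is attained at \(z=K^{-1}Mw\) rescaled so that \(z^TKz=1\); for \(p\ne 2\) no such linear-algebra formula is available and this sketch does not apply.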

Remark 2.2

(Boundary conditions) Other boundary conditions can be seamlessly handled by gradient schemes, see [21].

Remark 2.3

(Nonlinear function of the elements of \(X_{{\mathcal D},0}\)) Let \({\mathcal D}\) be a gradient discretisation in the sense of Definition 2.1. For any \(\chi :\mathbb R\rightarrow \mathbb R\) and any \(u=(u_i)_{i\in I}\in X_{{\mathcal D},0}\), we define \(\chi _I(u)\in X_{{\mathcal D},0}\) by \(\chi _I(u)=(\chi (u_i))_{i\in I}\). As indicated by the subscript I, this definition depends on the choice of the degrees of freedom in \(X_{{\mathcal D},0}\). That said, these degrees of freedom are usually canonical and the index I can be dropped. An important consequence of the fact that \(\Pi _{{\mathcal D}}\) is a piecewise constant reconstruction is the following:

$$\begin{aligned} \forall \chi :\mathbb R\rightarrow \mathbb R,\quad \forall u\in X_{{\mathcal D},0},\quad \Pi _{\mathcal D}\chi (u)=\chi (\Pi _{\mathcal D}u). \end{aligned}$$
(12)

It is customary to use the notations \(\Pi _{\mathcal D}\) and \(\nabla _{\mathcal D}\) also for space-time dependent functions. Moreover, we will need a notation for the jump-in-time of piecewise constant functions in time. Hence, if \((v^{(n)})_{n=0,\ldots ,N}\subset X_{{\mathcal D},0}\), we set

$$\begin{aligned} \begin{array}{llll} \hbox {for a.e. }{\varvec{x}}\in \Omega ,\ \Pi _{{\mathcal D}} v({\varvec{x}},0) = \Pi _{{\mathcal D}} v^{(0)}({\varvec{x}}) \text{ and, } \forall n=0,\ldots ,N-1,\;\forall t\in (t^{(n)},t^{(n+1)}],\\ \qquad \Pi _{\mathcal D}v({\varvec{x}},t) = \Pi _{\mathcal D}v^{(n+1)}({\varvec{x}}),\; \nabla _{\mathcal D}v({\varvec{x}},t) = \nabla _{\mathcal D}v^{(n+1)}({\varvec{x}})\\ \qquad \displaystyle \text{ and } \delta _{{\mathcal D}} v(t) = \delta _{{\mathcal D}}^{(n+{\frac{1}{2}})}v:= \frac{v^{(n+1)}-v^{(n)}}{{\delta t}^{(n+{\frac{1}{2}})}}\in X_{{\mathcal D},0}. \end{array}\end{aligned}$$
(13)
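The commutation property (12) and the discrete time derivative in (13) can be checked on a small example. The Python sketch below assumes a hypothetical reconstruction on equal cells of \((0,1)\) and stores the time sequence \((v^{(n)})_{n=0,\ldots ,N}\) as the rows of an array; the names `pi_D` and `delta_D` are illustrative and not part of the article.

```python
import numpy as np

def pi_D(u, x):
    """Piecewise constant reconstruction on equal cells of (0, 1): value u_i on cell i."""
    u = np.asarray(u, dtype=float)
    n = u.size
    idx = np.minimum((np.asarray(x, dtype=float) * n).astype(int), n - 1)
    return u[idx]

def delta_D(v, t):
    """Discrete time derivatives of (13): row n is (v^(n+1) - v^(n)) / dt^(n+1/2)."""
    v = np.asarray(v, dtype=float)
    dt = np.diff(np.asarray(t, dtype=float))
    return (v[1:] - v[:-1]) / dt[:, None]
```

Because `pi_D` only selects entries of `u`, applying a function \(\chi \) entrywise before or after the reconstruction gives the same result, which is exactly (12); this would fail for a reconstruction that interpolates between degrees of freedom.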

If \({\mathcal D}= (X_{{\mathcal D},0}, \Pi _{\mathcal D},\nabla _{\mathcal D}, {\mathcal I}_{\mathcal D},(t^{(n)})_{n=0,\ldots ,N})\) is a space-time gradient discretisation in the sense of Definition 2.1, the associated gradient scheme for Problem (1) is obtained by replacing in this problem the continuous spaces and mappings with their discrete counterparts. Using the notations in Remark 2.3, the implicit-in-time gradient scheme therefore consists in considering a sequence \((u^{(n)})_{n=0,\ldots ,N}\subset X_{{\mathcal D},0}\) such that

$$\begin{aligned} \left\{ \begin{array}{llll} u^{(0)}={\mathcal I}_{\mathcal D}u_\mathrm{ini}\hbox { and, for all }v=(v^{(n)})_{n=1,\ldots ,N}\subset X_{{\mathcal D},0}, \\ \displaystyle \int _0^T\int _\Omega \left[ \Pi _{\mathcal D}\delta _{{\mathcal D}}\beta (u)({\varvec{x}},t) \Pi _{\mathcal D}v({\varvec{x}},t) \!+\! {\varvec{a}}({\varvec{x}}, \Pi _{\mathcal D}\nu (u)({\varvec{x}},t),\right. \\ \quad \left. \nabla _{\mathcal D}\zeta ( u)({\varvec{x}},t))\cdot \nabla _{\mathcal D}v({\varvec{x}},t)\right] \mathrm{d}{\varvec{x}}\mathrm{d}t\\ \qquad \displaystyle = \int _0^T\int _\Omega f({\varvec{x}},t) \Pi _{\mathcal D}v({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{array}\right. \end{aligned}$$
(14)
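As a minimal sketch of how (14) turns into an algebraic time-stepping procedure, consider the simplest admissible data, \(\beta =\zeta =\nu =\mathrm{Id}\), \({\varvec{a}}({\varvec{x}},s,{\varvec{\xi }})={\varvec{\xi }}\) and \(p=2\), on a hypothetical one-dimensional gradient discretisation of \((0,1)\) (P1 gradient, piecewise constant reconstruction on node-centred cells). Testing (14) with v supported in one time step gives \((M+{\delta t}K)u^{(n+1)}=Mu^{(n)}+{\delta t}F^{(n+1)}\). The Python code below assumes a uniform time step, nodal interpolation in place of \({\mathcal I}_{\mathcal D}\) and a nodal quadrature of f; all function names are illustrative.

```python
import numpy as np

def lumped_p1(n):
    """Mass matrix M (cell lengths) and P1 stiffness matrix K, n >= 2 interior nodes."""
    h = 1.0 / (n + 1)
    m = np.full(n, h)
    m[0] = m[-1] = 1.5 * h          # end cells extended to the boundary
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    return np.diag(m), K

def implicit_scheme(n, n_steps, T, u_ini, f):
    """Implicit scheme (14) for beta = zeta = nu = Id, a(x, s, xi) = xi, p = 2."""
    M, K = lumped_p1(n)
    x = np.arange(1, n + 1) / (n + 1.0)
    dt = T / n_steps
    u = [u_ini(x)]                  # nodal interpolation stands in for I_D (assumption)
    F = []
    for k in range(n_steps):
        Fk = M @ f(x, (k + 1) * dt)  # nodal quadrature of the source term (assumption)
        u.append(np.linalg.solve(M + dt * K, M @ u[-1] + dt * Fk))
        F.append(Fk)
    return M, K, dt, np.array(u), np.array(F)

def energy_balance(M, K, dt, u, F):
    """Both sides of the discrete analogue of (9) at t = T, with B(z) = z^2 / 2."""
    lhs = 0.5 * u[-1] @ M @ u[-1] + dt * sum(v @ K @ v for v in u[1:])
    rhs = 0.5 * u[0] @ M @ u[0] + dt * sum(Fk @ v for Fk, v in zip(F, u[1:]))
    return lhs, rhs
```

In this linear setting the two sides returned by `energy_balance` differ exactly by \(\frac{1}{2}\sum _n \Vert \Pi _{\mathcal D}(u^{(n+1)}-u^{(n)})\Vert _{L^2(\Omega )}^2\ge 0\), so the discrete energy relation holds with \(\le \) rather than with equality, in line with the numerical diffusion discussed in Remark 1.2. Nonlinear \(\beta \), \(\zeta \) or \({\varvec{a}}\) would require a nonlinear solver at each step, which this sketch does not attempt.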

Remark 2.4

(Time-stepping) Scheme (14) is implicit-in-time because of the choice, in the definitions of \(\Pi _{\mathcal D}\) and \(\nabla _{\mathcal D}\) in (13), of \(v^{(n+1)}\) when \(t\in (t^{(n)},t^{(n+1)}]\). As a consequence, \(u^{(n+1)}\) appears in \({\varvec{a}}({\varvec{x}},\cdot ,\cdot )\) in (14) for \(t\in (t^{(n)},t^{(n+1)}]\). Instead of a fully implicit method, we could equally consider a Crank–Nicolson scheme or any scheme between those two (\(\theta \)-scheme). This would consist in choosing \(\theta \in [\frac{1}{2},1]\) and in replacing these terms \(u^{(n+1)}\) with \(u^{(n+\theta )}=\theta u^{(n+1)}+(1-\theta )u^{(n)}\). All results established here for (14) would hold for such a scheme. We refer the reader to the treatment in [23] for the details.

2.2 Properties of gradient discretisations

In order to establish the convergence of the associated gradient schemes, sequences of space-time gradient discretisations are required to satisfy four properties: coercivity, consistency, limit-conformity and compactness.

Definition 2.5

(Coercivity) If \({\mathcal D}\) is a space-time gradient discretisation in the sense of Definition 2.1, the norm of \(\Pi _{\mathcal D}\) is denoted by

$$\begin{aligned} C_{\mathcal D}=\max _{v\in X_{{\mathcal D},0}\backslash \{0\}} \frac{||\Pi _{\mathcal D}v||_{L^p(\Omega )}}{||\nabla _{\mathcal D}v||_{L^p(\Omega )^d}}. \end{aligned}$$

A sequence \(({\mathcal D}_m)_{m\in \mathbb N}\) of space-time gradient discretisations in the sense of Definition 2.1 is said to be coercive if there exists \(C_P\ge 0\) such that, for any \(m\in \mathbb N\), \(C_{{\mathcal D}_m}\le C_P\).
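To make the constant \(C_{\mathcal D}\) concrete, here is a small sketch for a hypothetical one-dimensional gradient discretisation on \(\Omega =(0,1)\) with \(p=2\). The unknowns are values at the n interior nodes of a uniform grid of size \(h=1/(n+1)\), \(\Pi _{\mathcal D}v\) is piecewise constant on cells of width h centred at these nodes (and zero on the two boundary half-cells), and \(\nabla _{\mathcal D}v\) is the difference quotient on each interval between consecutive nodes, with zero boundary values. This discretisation and all function names are assumptions chosen for illustration. For it, the ratio defining \(C_{\mathcal D}\) is maximised by the discrete sine mode, which gives \(C_{\mathcal D}=h/(2\sin (\pi h/2))\le 1/2\), so any sequence of such grids is coercive.

```python
import math
import random

def pi_norm(v, h):
    """L^2 norm of the piecewise-constant reconstruction (cells of width h)."""
    return math.sqrt(h * sum(x * x for x in v))

def grad_norm(v, h):
    """L^2 norm of the difference-quotient gradient, with zero boundary values."""
    ext = [0.0] + list(v) + [0.0]
    return math.sqrt(sum((ext[i + 1] - ext[i]) ** 2 for i in range(len(ext) - 1)) / h)

def ratio(v, h):
    """Quotient ||Pi_D v|| / ||grad_D v|| whose maximum over v defines C_D."""
    return pi_norm(v, h) / grad_norm(v, h)

def coercivity_constant(n):
    """Closed form of C_D for this grid, attained by the discrete sine mode."""
    h = 1.0 / (n + 1)
    return h / (2.0 * math.sin(math.pi * h / 2.0))
```

Evaluating `ratio` on random vectors never exceeds `coercivity_constant(n)`, and the bound \(1/2\) is independent of n, which is what Definition 2.5 asks of a coercive sequence.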

Definition 2.6

(Consistency) If \({\mathcal D}\) is a space-time gradient discretisation in the sense of Definition 2.1, we define

$$\begin{aligned} \forall \varphi \in L^2(\Omega )\cap W^{1,p}_0(\Omega ),\ \widehat{S}_{\mathcal D}(\varphi )= & {} \min _{w\in X_{{\mathcal D},0}}\left( ||\Pi _{{\mathcal D}} w-\varphi ||_{L^{\max (p,2)}(\Omega )}\right. \nonumber \\&\left. +\,||\nabla _{{\mathcal D}} w-\nabla \varphi ||_{L^p(\Omega )^d}\right) . \end{aligned}$$
(15)

A sequence \(({\mathcal D}_m)_{m\in \mathbb N}\) of space-time gradient discretisations in the sense of Definition 2.1 is said to be consistent if

  • for all \(\varphi \in L^2(\Omega ) \cap W^{1,p}_0(\Omega )\), \(\widehat{S}_{{\mathcal D}_m}(\varphi )\rightarrow 0\) as \(m\rightarrow \infty \),

  • for all \(\varphi \in L^2(\Omega )\), \(\Pi _{{\mathcal D}_m}{\mathcal I}_{{\mathcal D}_m}\varphi \rightarrow \varphi \) in \(L^2(\Omega )\) as \(m\rightarrow \infty \), and

  • \({\delta t}_{{\mathcal D}_m}\rightarrow 0\) as \(m\rightarrow \infty \).
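The first item can be checked numerically on a simple case. The sketch below assumes a hypothetical one-dimensional discretisation on \(\Omega =(0,1)\) with \(p=2\): nodal unknowns on a uniform grid, \(\Pi _{\mathcal D}\) piecewise constant on cells centred at the nodes, and \(\nabla _{\mathcal D}\) the difference quotient between consecutive nodes. Choosing for w the nodal values of \(\varphi \) in (15) gives an upper bound on \(\widehat{S}_{\mathcal D}(\varphi )\), which is computed here by a midpoint quadrature. The function name and the quadrature resolution are illustrative choices.

```python
import math

def interpolation_error(phi, dphi, n, sub=50):
    """Upper bound for S_D(phi) obtained with w = nodal values of phi (p = 2).

    Integrals are approximated by a midpoint rule with `sub` points per half-interval.
    """
    h = 1.0 / (n + 1)
    nodes = [i * h for i in range(n + 2)]          # includes both boundary points
    w = [0.0] + [phi(x) for x in nodes[1:-1]] + [0.0]
    err_pi = 0.0
    err_grad = 0.0
    for i in range(n + 1):                         # interval (x_i, x_{i+1})
        slope = (w[i + 1] - w[i]) / h
        for k in range(2 * sub):
            x = nodes[i] + (k + 0.5) * h / (2 * sub)
            # left half of the interval belongs to the cell of node i,
            # right half to the cell of node i+1
            value = w[i] if k < sub else w[i + 1]
            err_pi += (value - phi(x)) ** 2 * h / (2 * sub)
            err_grad += (slope - dphi(x)) ** 2 * h / (2 * sub)
    return math.sqrt(err_pi) + math.sqrt(err_grad)
```

For a smooth \(\varphi \) vanishing on the boundary, for instance \(\varphi (x)=\sin (\pi x)\), the returned value decreases as n grows, in line with the requirement \(\widehat{S}_{{\mathcal D}_m}(\varphi )\rightarrow 0\).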

Definition 2.7

(Limit-conformity) If \({\mathcal D}\) is a space-time gradient discretisation in the sense of Definition 2.1 and \(W^{\mathrm{div},p'}(\Omega )=\{{\varvec{\varphi }}\in L^{p'}(\Omega )^d\,:\,\mathrm{div}{\varvec{\varphi }}\in L^{p'}(\Omega )\}\), we define

$$\begin{aligned} \begin{array}{l} \displaystyle \forall {\varvec{\varphi }}\!\in \! W^{\mathrm{div},p'}(\Omega ),\; \displaystyle W_{{\mathcal D}}(\varvec{\varphi }) \!=\! \max _{u\in X_{{\mathcal D},0}\setminus \{0\}}\frac{\left| \displaystyle \int _\Omega \left( \nabla _{{\mathcal D}} u({\varvec{x}})\cdot {\varvec{\varphi }}({\varvec{x}}){+} \Pi _{{\mathcal D}} u({\varvec{x}}) \mathrm{div}{\varvec{\varphi }}({\varvec{x}})\right) \mathrm{d}{\varvec{x}}\right| }{\Vert \nabla _{\mathcal D}u \Vert _{L^p(\Omega )^d}}. \end{array} \end{aligned}$$
(16)

A sequence \(({\mathcal D}_m)_{m\in \mathbb N}\) of space-time gradient discretisations in the sense of Definition 2.1 is said to be limit-conforming if, for all \(\varvec{\varphi }\in W^{\mathrm{div},p'}(\Omega )\), \( W_{{\mathcal D}_m}(\varvec{\varphi })\rightarrow 0\) as \(m\rightarrow \infty \).
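The quantity in (16) measures how far the discrete operators are from satisfying the integration-by-parts formula. As a rough illustration, the sketch below evaluates the quotient in (16) for one given u, on a hypothetical one-dimensional discretisation of \(\Omega =(0,1)\) with \(p=2\) (nodal unknowns, \(\Pi _{\mathcal D}\) piecewise constant on cells centred at the nodes, \(\nabla _{\mathcal D}\) the difference quotient between consecutive nodes). The integral of \(\varphi \) on each interval is approximated by Simpson's rule; the function name and this quadrature are assumptions of the example.

```python
import math
import random

def conformity_defect(u, phi, h):
    """|int(grad_D u * phi + Pi_D u * phi')| / ||grad_D u||_{L^2} on (0,1), p = 2."""
    ext = [0.0] + list(u) + [0.0]              # zero boundary values
    n = len(u)
    total = 0.0
    for i in range(n + 1):                     # gradient part, interval (x_i, x_{i+1})
        a, b = i * h, (i + 1) * h
        integral_phi = h / 6.0 * (phi(a) + 4.0 * phi(0.5 * (a + b)) + phi(b))  # Simpson
        total += (ext[i + 1] - ext[i]) / h * integral_phi
    for i in range(1, n + 1):                  # reconstruction part, cell around x_i
        x = i * h
        # integral of phi' over the cell, by the fundamental theorem of calculus
        total += ext[i] * (phi(x + 0.5 * h) - phi(x - 0.5 * h))
    grad = math.sqrt(sum((ext[i + 1] - ext[i]) ** 2 for i in range(n + 1)) / h)
    return abs(total) / grad
```

For this discretisation a discrete summation by parts shows that the computed quotient is bounded by \(h^2\max |\varphi ''|/24\) whatever u is, so the defect vanishes as \(h\rightarrow 0\) for smooth \(\varphi \).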

Remark 2.8

The convergences \(\widehat{S}_{{\mathcal D}_m}\rightarrow 0\) on \(L^2(\Omega )\cap W^{1,p}_0(\Omega )\) and \(W_{{\mathcal D}_m}\rightarrow 0\) on \(W^{\mathrm{div},p'}(\Omega )\) only need to be checked on dense subsets of these spaces [21, 31].

Definition 2.9

(Compactness) If \({\mathcal D}\) is a space-time gradient discretisation in the sense of Definition 2.1, we define

$$\begin{aligned} \forall {\varvec{\xi }}\in \mathbb R^d,\;T_{\mathcal D}({\varvec{\xi }})=\max _{v\in X_{{\mathcal D},0}\backslash \{0\}} \frac{||\Pi _{\mathcal D}v(\cdot +{\varvec{\xi }})-\Pi _{\mathcal D}v||_{L^p(\mathbb R^d)}}{||\nabla _{\mathcal D}v||_{L^p(\Omega )^d}}, \end{aligned}$$

where \(\Pi _{\mathcal D}v\) has been extended by 0 outside \(\Omega \).

A sequence \(({\mathcal D}_m)_{m\in \mathbb N}\) of space-time gradient discretisations is said to be compact if

$$\begin{aligned} \lim _{{\varvec{\xi }}\rightarrow 0} \sup _{m\in \mathbb N} T_{{\mathcal D}_m}({\varvec{\xi }})=0. \end{aligned}$$

We refer the reader to [21, 23] for a proof of the following lemma.

Lemma 2.10

(Regularity of the limit) Let \(({\mathcal D}_m)_{m\in \mathbb N}\) be a sequence of space-time gradient discretisations, in the sense of Definition 2.1, that is coercive and limit-conforming in the sense of Definitions 2.5 and 2.7. Let, for any \(m\in \mathbb N\), \(v_m=(v^{(n)}_m)_{n=0,\ldots ,N_m}\subset X_{{\mathcal D}_m,0}\) be such that, with the notations in (13), \((\nabla _{{\mathcal D}_m}v_m)_{m\in \mathbb N}\) is bounded in \(L^p(\Omega \times (0,T))^d\).

Then there exists \(v\in L^p(0,T;W^{1,p}_0(\Omega ))\) such that, up to a subsequence as \(m\rightarrow \infty \), \(\Pi _{{\mathcal D}_m}v_m\rightarrow v\) weakly in \(L^p(\Omega \times (0,T))\) and \(\nabla _{{\mathcal D}_m}v_m \rightarrow \nabla v\) weakly in \(L^p(\Omega \times (0,T))^d\).

2.3 Main results

Uniform-in-time convergence of numerical solutions to schemes for parabolic equations starts with a weak convergence with respect to the space variable. This weak convergence is then used to prove a stronger convergence. We therefore first recall a standard definition related to the weak topology of \(L^2(\Omega )\) (we also refer the reader to Proposition 6.5 in the appendix for a classical characterisation of the weak topology of bounded sets in \(L^2(\Omega )\)).

Definition 2.11

(Uniform-in-time \(L^2(\Omega )\) -weak convergence) Let \(\langle \cdot ,\cdot \rangle _{L^2(\Omega )}\) denote the inner product in \(L^2(\Omega )\), let \((u_m)_{m\in \mathbb N}\) be a sequence of functions \([0,T]\rightarrow L^2(\Omega )\) and let \(u:[0,T]\rightarrow L^2(\Omega )\).

We say that \((u_m)_{m\in \mathbb N}\) converges weakly in \(L^2(\Omega )\) uniformly on [0, T] to u if, for all \(\varphi \in L^2(\Omega )\), as \(m\rightarrow \infty \) the sequence of functions \(t\in [0,T]\mapsto \langle u_m(t),\varphi \rangle _{L^2(\Omega )}\) converges uniformly on [0, T] to the function \(t\in [0,T]\mapsto \langle u(t),\varphi \rangle _{L^2(\Omega )}\).

Our first theorem states weak or space-time averaged convergence properties of gradient schemes for (1). These results have already been established for Leray–Lions’, Richards’ and Stefan’s models; see [23, 28, 32]. The convergence proof we provide afterwards, however, covers more non-linear models and is more compact than the previous proofs.

Theorem 2.12

(Convergence of gradient schemes) We assume (2) and we take a sequence \(({\mathcal D}_m)_{m\in \mathbb N}\) of space-time gradient discretisations, in the sense of Definition 2.1, that is coercive, consistent, limit-conforming and compact (see Sect. 2.2). Then for any \(m\in \mathbb N\) there exists a solution \(u_m\) to (14) with \({\mathcal D}={\mathcal D}_m\).

Moreover, if we assume that

$$\begin{aligned} (\forall s\in \mathbb R,\;\beta (s)=s)\quad \text{ or } \quad (\forall s\in \mathbb R,\;\zeta (s)=s), \end{aligned}$$
(17)

then there exists a solution \({\overline{u}}\) to (4) such that, up to a subsequence, the following convergences hold as \(m\rightarrow \infty \):

$$\begin{aligned} \begin{array}{llll} \displaystyle \Pi _{{\mathcal D}_m}\beta (u_m)\rightarrow \beta ({\overline{u}})\hbox { weakly in }L^2(\Omega )\hbox { uniformly on }[0,T]~ \hbox {(see Definition 2.11)},\\ \displaystyle \Pi _{{\mathcal D}_m}\nu (u_m)\rightarrow \nu ({\overline{u}}) \text{ strongly } \text{ in } L^1(\Omega \times (0,T)),\\ \displaystyle \Pi _{{\mathcal D}_m}\zeta (u_m)\rightarrow \zeta ({\overline{u}}) \text{ weakly } \text{ in } L^p(\Omega \times (0,T)),\\ \displaystyle \nabla _{{\mathcal D}_m}\zeta (u_m)\rightarrow \nabla \zeta ({\overline{u}}) \text{ weakly } \text{ in } L^p(\Omega \times (0,T))^d. \end{array} \end{aligned}$$
(18)

Remark 2.13

Since \(|\nu |\le L_\zeta |\beta |\) and \(|\nu |\le L_\beta |\zeta |\), the \(L^\infty (0,T;L^2(\Omega ))\) bound on \(\Pi _{{\mathcal D}_m}\beta (u_m)\) and the \(L^p(\Omega \times (0,T))\) bound on \(\Pi _{{\mathcal D}_m}\zeta (u_m)\) (see Lemma 4.1 and Definition 2.5) show that the strong convergence of \(\Pi _{{\mathcal D}_m}\nu (u_m)\) is also valid in \(L^q(0,T;L^r(\Omega ))\) for any \((q,r)\in [1,\infty )\times [1,2)\), any \((q,r)\in [1,p)^2\) and, of course, any space interpolated between these two cases.

Remark 2.14

We do not assume the existence of a solution \({\overline{u}}\) to the continuous problem; our convergence analysis establishes this existence.

Remark 2.15

Assumption (17) covers Richards’ and Stefan’s models, as well as many other non-linear parabolic equations. As we prove in Sect. 5, this assumption is actually not required if \(p\ge 2\). However, we first state and prove Theorem 2.12 under (17) in order to simplify the presentation. See also Remark 2.19.

The main innovation of this paper is the following theorem, which states the uniform-in-time strong-in-space convergence of numerical methods for fully non-linear degenerate parabolic equations with no regularity assumptions on the data.

Theorem 2.16

(Uniform-in-time convergence) Under Assumptions (2), let \(({\mathcal D}_m)_{m\in \mathbb N}\) be a sequence of space-time gradient discretisations, in the sense of Definition 2.1, that is coercive, consistent, limit-conforming and compact (see Sect. 2.2). We assume that \(u_m\) is a solution to (14) with \({\mathcal D}={\mathcal D}_m\) that converges as \(m\rightarrow \infty \) to a solution \({\overline{u}}\) of (4) in the sense (18).

Then, as \(m\rightarrow \infty \), \(\Pi _{{\mathcal D}_m}\nu (u_m)\rightarrow \nu ({\overline{u}})\) strongly in \(L^\infty (0,T;L^2(\Omega ))\).

Remark 2.17

Since the functions \(\Pi _{{\mathcal D}_m}\nu (u_m)\) are piecewise constant in time, their convergence in \(L^\infty (0,T;L^2(\Omega ))\) is actually a uniform-in-time convergence (not “uniform a.e. in time”).

The last theorem completes our convergence result by stating the strong space-time averaged convergence of the discrete gradients. Its proof is inspired by the study of gradient schemes for Leray–Lions operators made in [23].

Theorem 2.18

(Strong convergence of gradients) Under Assumptions (2), let \(({\mathcal D}_m)_{m\in \mathbb N}\) be a sequence of space-time gradient discretisations, in the sense of Definition 2.1, that is coercive, consistent, limit-conforming and compact (see Sect. 2.2). We assume that \(u_m\) is a solution to (14) with \({\mathcal D}={\mathcal D}_m\) that converges as \(m\rightarrow \infty \) to a solution \({\overline{u}}\) of (4) in the sense (18). We also assume that \({\varvec{a}}\) is strictly monotone in the sense:

$$\begin{aligned} ({\varvec{a}}({\varvec{x}},s,{\varvec{\xi }}) - {\varvec{a}}({\varvec{x}},s,{\varvec{\chi }}))\cdot ({\varvec{\xi }}-{\varvec{\chi }})> 0,\hbox { for a.e. }{\varvec{x}}\in \Omega ,\ \forall s\in \mathbb R,\ \forall {\varvec{\xi }}\not ={\varvec{\chi }}\in \mathbb R^d. \end{aligned}$$
(19)

Then, as \(m\rightarrow \infty \), \(\Pi _{{\mathcal D}_m}\zeta (u_m)\rightarrow \zeta ({\overline{u}})\) strongly in \(L^p(\Omega \times (0,T))\) and \(\nabla _{{\mathcal D}_m} \zeta (u_m)\rightarrow \nabla \zeta ({\overline{u}})\) strongly in \(L^p(\Omega \times (0,T))^d\).

Remark 2.19

Theorems 2.16 and 2.18 do not require the structural assumption (17); they only require that the convergences (18) hold.

3 Preliminaries

We establish here a few results which will be used in the analysis of the gradient scheme (14).

3.1 Uniform-in-time compactness for space-time gradient discretisations

Aubin–Simon compactness results roughly consist in establishing the compactness of a sequence of space-time functions from some strong bounds on the functions with respect to the space variable (typically, bounds in a Sobolev space with positive exponent) and some weaker bounds on their time derivatives (typically, bounds in a Sobolev space with a negative exponent, i.e. the dual of a Sobolev space with positive exponent). Several variants exist, including for piecewise constant-in-time functions appearing in the numerical approximation of parabolic equations [4, 11, 17, 36]. Although quite strong in space, the convergence results provided by these discrete versions of Aubin–Simon theorems are only averaged-in-time—i.e. in an \(L^p(0,T;E)\) space where E is a normed space.

Theorem 3.1 can be considered as a discrete form of an Aubin–Simon theorem, that establishes a uniform-in-time but weak-in-space compactness result. The corresponding convergence is therefore weaker than in Theorem 2.16, but it is a critical initial step for establishing the uniform-in-time strong-in-space convergence result. Given that the functions considered here are piecewise constant in time, it might be surprising to obtain a uniform-in-time convergence result; everything hinges on the fact that the jumps in time tend to vanish as the time step goes to zero. The proof of Theorem 3.1 is based on the results in the appendix, and in particular on the discontinuous Ascoli–Arzelà theorem stated and proved there.

Theorem 3.1

(Uniform-in-time weak-in-space discrete Aubin–Simon theorem) Let \(T\!>\!0\) and take a sequence \(({\mathcal D}_m)_{m\in \mathbb N} \!=\! (X_{{\mathcal D}_m,0}, \Pi _{{\mathcal D}_m},\nabla _{{\mathcal D}_m}, {\mathcal I}_{{\mathcal D}_m},(t_m^{(n)})_{n=0,\ldots ,N_m})_{m\in \mathbb N}\) of space-time gradient discretisations, in the sense of Definition 2.1, that is consistent in the sense of Definition 2.6.

For any \(m\in \mathbb N\), let \(v_m=(v_m^{(n)})_{n=0,\ldots ,N_m} \subset X_{{\mathcal D}_m,0}\). If there exist \(q>1\) and \(C>0\) such that, for any \(m\in \mathbb N\),

$$\begin{aligned} ||\Pi _{{\mathcal D}_m}v_m||_{L^\infty (0,T;L^2(\Omega ))}\le C\quad \text{ and } \quad \int _0^T |\delta _m v_m(t)|_{\star ,{\mathcal D}_m}^{q}\mathrm{d}t\le C, \end{aligned}$$
(20)

then the sequence \((\Pi _{{\mathcal D}_m}v_m)_{m\in \mathbb N}\) is relatively compact uniformly-in-time and weakly in \(L^2(\Omega )\), i.e. it has a subsequence that converges in the sense of Definition 2.11.

Moreover, any limit of such a subsequence is continuous \([0,T]\rightarrow L^2(\Omega )\) for the weak topology of \(L^2(\Omega )\).

Remark 3.2

The bound on \(|\delta _m v_m|_{\star ,{\mathcal D}_m}\) is often a consequence of a numerical scheme satisfied by \(v_m\) and of a bound on \(||\nabla _{{\mathcal D}_m}v_m||_{L^p(\Omega \times (0,T))^d}\), see the proof of Lemma 4.3 for example.

Proof

This result is a consequence of the discontinuous Ascoli–Arzelà theorem (Theorem 6.2) with \(K=[0,T]\) and E the ball of radius C in \(L^2(\Omega )\) endowed with the weak topology. We let \((\varphi _l)_{l\in \mathbb N}\subset C^\infty _c(\Omega )\) be a dense sequence in \(L^2(\Omega )\) and equip E with the metric (82) built from these \(\varphi _l\) (see Proposition 6.5). The set E is then a compact metric space, and therefore complete, and the functions \(\Pi _{{\mathcal D}_m}v_m\) take their values in E. It remains to estimate \(d_E(\Pi _{{\mathcal D}_m}v_m(s),\Pi _{{\mathcal D}_m}v_m(s'))\). In what follows, we drop the index m in \({\mathcal D}_m\) for the sake of legibility.

Let us define the interpolant \(P_{{\mathcal D}}\varphi _l\in X_{{\mathcal D},0}\) by

$$\begin{aligned} P_{\mathcal D}\varphi _l=\mathop {\,\mathrm argmin\,}_{w\in X_{{\mathcal D},0}}\left( ||\Pi _{\mathcal D}w-\varphi _l||_{L^{\max (p,2)}(\Omega )} +||\nabla _{\mathcal D}w-\nabla \varphi _l||_{L^p(\Omega )^d}\right) . \end{aligned}$$
(21)

For \(0\le s\le s'\le T\), by writing \(\Pi _{\mathcal D}v_m(s')-\Pi _{{\mathcal D}}v_m(s)\) as the sum of its jumps \({\delta t}^{(n+{\frac{1}{2}})}\Pi _{\mathcal D}\delta _{\mathcal D}^{(n+{\frac{1}{2}})}v_m\) at the points \((t^{(n)})_{n=n_1,\ldots ,n_2}\) between s and \(s'\), the definition of \(|\cdot |_{\star ,{\mathcal D}}\), Hölder’s inequality and Estimate (20) give

$$\begin{aligned}&\left| \int _\Omega \left( \Pi _{\mathcal D}v_m({\varvec{x}},s')-\Pi _{{\mathcal D}}v_m({\varvec{x}},s)\right) \Pi _{\mathcal D}P_{\mathcal D}\varphi _l({\varvec{x}})\mathrm{d}{\varvec{x}}\right| \nonumber \\&\quad = \left| \int _{t^{(n_1)}}^{t^{(n_2+1)}}\int _\Omega \Pi _{{\mathcal D}}\delta _{\mathcal D}v_m(t)({\varvec{x}}) \Pi _{\mathcal D}P_{\mathcal D}\varphi _l({\varvec{x}})\mathrm{d}{\varvec{x}}\mathrm{d}t\right| \nonumber \\&\quad \le C^{1/q} (t^{(n_2+1)}-t^{(n_1)})^{1/q'}||\nabla _{\mathcal D}P_{\mathcal D}\varphi _l||_{L^p(\Omega )^d}. \end{aligned}$$
(22)
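The last inequality in (22) can be spelled out as follows. We assume here, as our reading of the definition invoked in the text, that the dual semi-norm is \(|w|_{\star ,{\mathcal D}}=\sup \{\int _\Omega \Pi _{\mathcal D}w\,\Pi _{\mathcal D}z\,\mathrm{d}{\varvec{x}}\,:\,z\in X_{{\mathcal D},0},\ ||\nabla _{\mathcal D}z||_{L^p(\Omega )^d}=1\}\), and we write \(q'=\frac{q}{q-1}\) for the conjugate exponent of q.

$$\begin{aligned}&\left| \int _{t^{(n_1)}}^{t^{(n_2+1)}}\int _\Omega \Pi _{{\mathcal D}}\delta _{\mathcal D}v_m(t)({\varvec{x}}) \Pi _{\mathcal D}P_{\mathcal D}\varphi _l({\varvec{x}})\mathrm{d}{\varvec{x}}\mathrm{d}t\right| \\&\quad \le \int _{t^{(n_1)}}^{t^{(n_2+1)}} |\delta _{\mathcal D}v_m(t)|_{\star ,{\mathcal D}}\,\mathrm{d}t\ ||\nabla _{\mathcal D}P_{\mathcal D}\varphi _l||_{L^p(\Omega )^d}\\&\quad \le \left( \int _0^T |\delta _{\mathcal D}v_m(t)|_{\star ,{\mathcal D}}^{q}\mathrm{d}t\right) ^{1/q} (t^{(n_2+1)}-t^{(n_1)})^{1/q'}||\nabla _{\mathcal D}P_{\mathcal D}\varphi _l||_{L^p(\Omega )^d}\\&\quad \le C^{1/q} (t^{(n_2+1)}-t^{(n_1)})^{1/q'}||\nabla _{\mathcal D}P_{\mathcal D}\varphi _l||_{L^p(\Omega )^d}. \end{aligned}$$

The first step uses the definition of \(|\cdot |_{\star ,{\mathcal D}}\) at each time t, the second is Hölder's inequality in time with exponents q and \(q'\), and the third is the second bound in (20).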

By definition of \(P_{\mathcal D}\), we have

$$\begin{aligned} ||\Pi _{\mathcal D}P_{\mathcal D}\varphi _l-\varphi _l||_{L^2(\Omega )}\le \widehat{S}_{\mathcal D}(\varphi _l) \end{aligned}$$

and

$$\begin{aligned} ||\nabla _{\mathcal D}P_{\mathcal D}\varphi _l||_{L^p(\Omega )^d}\le \widehat{S}_{\mathcal D}(\varphi _l) +||\nabla \varphi _l||_{L^p(\Omega )^d}\le C_{\varphi _l} \end{aligned}$$

with \(C_{\varphi _l}\) not depending on \({\mathcal D}\) (and therefore on m). Since \(t^{(n_2+1)}-t^{(n_1)}\le |s'-s|+{\delta t}\) and \((\Pi _{{\mathcal D}}v_m)_{m\in \mathbb N}\) is bounded in \(L^\infty (0,T;L^2(\Omega ))\), we deduce from (22) that

$$\begin{aligned}&\left| \int _\Omega \left( \Pi _{\mathcal D}v_m({\varvec{x}},s')-\Pi _{{\mathcal D}}v_m({\varvec{x}},s)\right) \varphi _l({\varvec{x}})\mathrm{d}{\varvec{x}}\right| \\&\quad \le \left| \int _\Omega \left( \Pi _{\mathcal D}v_m({\varvec{x}},s')-\Pi _{{\mathcal D}}v_m({\varvec{x}},s)\right) \Pi _{\mathcal D}P_{\mathcal D}\varphi _l({\varvec{x}})\mathrm{d}{\varvec{x}}\right| \\&\qquad +2||\Pi _{\mathcal D}v_m||_{L^\infty (0,T;L^2(\Omega ))}||\Pi _{\mathcal D}P_{\mathcal D}\varphi _l-\varphi _l||_{L^2(\Omega )}\\&\quad \le 2C\widehat{S}_{\mathcal D}(\varphi _l)+C^{1/q}C_{\varphi _l}|s'-s|^{1/q'}+C^{1/q}C_{\varphi _l}{\delta t}^{1/q'}. \end{aligned}$$

Plugged into the definition (82) of the distance in E, this shows that

$$\begin{aligned} d_E\Big (\Pi _{{\mathcal D}}v_m(s'),\Pi _{{\mathcal D}}v_m(s)\Big )\le & {} \sum _{l\in \mathbb N}\frac{\min (1,C^{1/q}C_{\varphi _l}|s'-s|^{1/q'})}{2^l}\\&+\sum _{l\in \mathbb N}\frac{\min (1,2C\widehat{S}_{{\mathcal D}_m}(\varphi _l)+C^{1/q}C_{\varphi _l}{\delta t}_m^{1/q'})}{2^l}\\=: & {} \omega (s,s')+\delta _m. \end{aligned}$$

Using the dominated convergence theorem for series, we see that \(\omega (s,s')\rightarrow 0\) as \(s-s'\rightarrow 0\) and that \(\delta _m\rightarrow 0\) as \(m\rightarrow \infty \) (we invoke consistency to establish that \(\widehat{S}_{{\mathcal D}_m}(\varphi _l)\rightarrow 0\) as \(m\rightarrow \infty \) for any l). Hence, the assumptions of Theorem 6.2 are satisfied and the proof is complete. \(\square \)

3.2 Technical results

We state here a family of technical lemmas, starting with a few properties of \(\nu \) and B.

Lemma 3.3

Under Assumptions (2) there holds

$$\begin{aligned} |\nu (a) - \nu (b)|\le & {} L_\beta |\zeta (a) - \zeta (b)|,\end{aligned}$$
(23)
$$\begin{aligned} (\nu (a) - \nu (b))^2\le & {} L_\beta L_\zeta (\zeta (a) - \zeta (b)) (\beta (a) - \beta (b)). \end{aligned}$$
(24)

The function B is convex continuous on \(\overline{R_\beta }\), the function \(B\circ \beta :\mathbb R\rightarrow [0,\infty )\) is continuous,

$$\begin{aligned} \forall s\in \mathbb R,\quad B(\beta (s))=\int _0^s \zeta (q)\beta '(q)\mathrm{d}q, \end{aligned}$$
(25)
$$\begin{aligned} \exists K_0,K_1,K_2>0 \text{ such } \text{ that },\; \forall s\in \mathbb R,\quad K_0 \beta (s)^2 - K_1\le B(\beta (s))\le K_2 s^2, \end{aligned}$$
(26)
$$\begin{aligned} \forall a\in \mathbb R,\;\forall S\in \overline{R_\beta },\quad \zeta (a)(S-\beta (a))\le B(S) -B(\beta (a)), \end{aligned}$$
(27)

and

$$\begin{aligned} \forall s,s'\in \mathbb R,\; (\nu (s)-\nu (s'))^2\le 4 L_\beta L_\zeta \left[ B(\beta (s))+ B(\beta (s'))-2 B\left( \frac{\beta (s)+\beta (s')}{2}\right) \right] . \end{aligned}$$
(28)

Proof

Inequality (23) is a straightforward consequence of the estimate \(\nu '=\zeta '\beta '\le L_\beta \zeta '\). Note that the same inequality also holds with \(\beta \) and \(\zeta \) swapped. Since these functions are non-decreasing, Inequality (24) follows from (23) and the similar inequality with \(\beta \) and \(\zeta \) swapped.

Since \(\beta \) is non-decreasing, \(\beta _r\) is also non-decreasing on \(R_\beta \) and therefore locally bounded on \(R_\beta \). Hence, B is locally Lipschitz-continuous on \(R_\beta \), with an a.e. derivative \(B'=\zeta (\beta _r)\). \(B'\) is therefore non-decreasing and B is convex continuous on \(R_\beta \), and thus also on \(\overline{R_\beta }\) by choice of its values at the endpoints of \(R_\beta \).

To prove (25), we denote by \(P\subset R_\beta \) the countable set of plateaux values of \(\beta \), i.e. the \(y\in \mathbb R\) such that \(\beta ^{-1}(\{y\})\) is not reduced to a singleton. If \(s\not \in \beta ^{-1}(P)\) then \(\beta ^{-1}(\{\beta (s)\})\) is the singleton \(\{s\}\) and therefore \(\beta _r(\beta (s))=s\). Moreover, \(\beta _r\) is continuous at \(\beta (s)\) and thus B is differentiable at \(\beta (s)\) with \(B'(\beta (s))=\zeta (\beta _r(\beta (s)))=\zeta (s)\). Since \(\beta \) is differentiable a.e., we deduce that, for a.e. \(s\not \in \beta ^{-1}(P)\), \((B(\beta ))'(s)=B'(\beta (s))\beta '(s) =\zeta (s)\beta '(s)\). The set \(\beta ^{-1}(P)\) is a union of intervals on which \(\beta \) and thus \(B(\beta )\) are locally constant; hence, for a.e. s in this set, \((B(\beta ))'(s)=0\) and \(\zeta (s)\beta '(s)=0\). Hence, the locally Lipschitz-continuous functions \(B(\beta )\) and \(s\rightarrow \int _0^s \zeta (q)\beta '(q)\mathrm{d}q\) have identical derivatives a.e. on \(\mathbb R\) and take the same value at \(s=0\). They are thus equal on \(\mathbb R\) and the proof of (25) is complete.

The continuity of \(B\circ \beta \) is an obvious consequence of (25). The second inequality in (26) can also be easily deduced from (25) by noticing that \(|\zeta (s)\beta '(s)|\le L_\zeta L_\beta |s|\) (we can take \(K_2=\frac{L_\zeta L_\beta }{2}\)). To prove the first inequality in (26), we start by inferring from (2b) the existence of \(S>0\) such that \(|\zeta (q)|\ge \frac{M_0}{2}|q|\ge \frac{M_0}{2L_\beta }|\beta (q)|\) whenever \(|q|\ge S\). We then write, for \(s\ge S\),

$$\begin{aligned} B(\beta (s))= & {} \int _0^S \zeta (q)\beta '(q)\mathrm{d}q + \int _S^s \zeta (q)\beta '(q)\mathrm{d}q \ge \frac{M_0}{2L_\beta }\int _S^s \beta (q)\beta '(q)\mathrm{d}q\\= & {} \frac{M_0}{4L_\beta }\left( \beta (s)^2-\beta (S)^2\right) . \end{aligned}$$

A similar inequality holds for \(s\le -S\) (with \(\beta (-S)\) instead of \(\beta (S)\)) and the first inequality in (26) therefore holds with \(K_0=\frac{M_0}{4L_\beta }\) and \(K_1=\frac{M_0}{4L_\beta }\max _{[-S,S]}\beta ^2\).

We now prove (27), which states that \(\zeta (a)\) belongs to the convex sub-differential of B at \(\beta (a)\). We start with the case \(S\in R_\beta \), that is \(S=\beta (b)\) for some \(b\in \mathbb R\). If \(\beta _r\) is continuous at \(\beta (a)\) then this inequality is an obvious consequence of the convexity of B since B is then differentiable at \(\beta (a)\) with \(B'(\beta (a))=\zeta (\beta _r(\beta (a)))=\zeta (a)\). Otherwise, a direct computation also works:

$$\begin{aligned} B(S)-B(\beta (a))= & {} B(\beta (b))-B(\beta (a))\\= & {} \int _{a}^{b}\zeta (q)\beta '(q)\mathrm{d}q\\= & {} \int _{a}^{b}(\zeta (q)-\zeta (a))\beta '(q)\mathrm{d}q + \zeta (a)(\beta (b)-\beta (a))\\\ge & {} \zeta (a)(S-\beta (a)), \end{aligned}$$

the inequality coming from the fact that \(\beta '\ge 0\) and that \(\zeta (q)-\zeta (a)\) has the same sign as \(b-a\) when q is between a and b. The general case \(S\in \overline{R_\beta }\) is obtained by passing to the limit on \(b_n\) such that \(\beta (b_n)\rightarrow S\) and by using the fact that B has limits (possibly \(+\infty \)) at the endpoints of \(R_\beta \).

Let us now take \(s,s'\in \mathbb R\). Let \(\bar{s}\in \mathbb R\) be such that \(\beta (\bar{s}) = \frac{\beta (s)+\beta (s')}{2}\). We notice that

$$\begin{aligned} B(\beta (s))+ B(\beta (s')) -2 B(\beta (\bar{s})) \!=\! \int _{\bar{s}}^s (\zeta (q) - \zeta (\bar{s})) \beta '(q)\mathrm{d}q + \int _{\bar{s}}^{s'} \!(\zeta (q) - \zeta (\bar{s})) \beta '(q)\mathrm{d}q. \end{aligned}$$
(29)

We then notice that \(|\zeta (q) - \zeta (\bar{s})| \ge \frac{1}{L_\beta } |\nu (q) - \nu (\bar{s})|\) and \(\beta '(q)\ge \beta '(q)\frac{\zeta '(q)}{L_\zeta }=\frac{\nu '(q)}{L_\zeta }\). If \(\widetilde{s}=s\) or \(s'\), since \(\zeta (q)-\zeta (\bar{s})\) has the same sign as \(\widetilde{s}-\bar{s}\) for all q between \(\bar{s}\) and \(\widetilde{s}\), we can write

$$\begin{aligned} \int _{\bar{s}}^{\widetilde{s}} (\zeta (q) - \zeta (\bar{s})) \beta '(q)\mathrm{d}q \ge \frac{1}{L_\beta L_\zeta }\int _{\bar{s}}^{\widetilde{s}} \nu '(q) (\nu (q) - \nu (\bar{s}))\mathrm{d}q \!=\! \frac{1}{ 2 L_\beta L_\zeta } (\nu (\widetilde{s}) - \nu (\bar{s}))^2. \end{aligned}$$
(30)

Estimate (28) follows from (29), (30) and the inequality \((\nu (s) - \nu (s'))^2 \le 2(\nu (s) - \nu (\bar{s}))^2 + 2 (\nu (s') - \nu (\bar{s}))^2\). \(\square \)
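The inequalities of Lemma 3.3 can be sanity-checked numerically on a simple example. The sketch below assumes a Stefan-type choice, \(\beta (s)=s\) and \(\zeta \) piecewise linear with a plateau on [0, 1], so that \(L_\beta =L_\zeta =1\), \(\nu =\zeta \) and, by (25), B is the primitive of \(\zeta \) vanishing at 0. This specific \(\zeta \), the sample points and the function names are illustrative assumptions, not data from the paper.

```python
def zeta(s):
    """Stefan-type nonlinearity: identity for s <= 0, plateau on [0, 1], s - 1 beyond."""
    if s <= 0.0:
        return s
    if s <= 1.0:
        return 0.0
    return s - 1.0

def beta(s):
    return s

nu = zeta          # nu' = beta' * zeta' = zeta' and nu(0) = 0

def B(z):
    """B(beta(s)) from (25) with beta = id: the primitive of zeta vanishing at 0."""
    if z <= 0.0:
        return 0.5 * z * z
    if z <= 1.0:
        return 0.0
    return 0.5 * (z - 1.0) ** 2

L_BETA = 1.0
L_ZETA = 1.0

def check_lemma(points, tol=1e-12):
    """Return True if (23), (24), (27) and (28) hold at all pairs of sample points."""
    for a in points:
        for b in points:
            d_nu = nu(a) - nu(b)
            d_zeta = zeta(a) - zeta(b)
            d_beta = beta(a) - beta(b)
            ok23 = abs(d_nu) <= L_BETA * abs(d_zeta) + tol
            ok24 = d_nu ** 2 <= L_BETA * L_ZETA * d_zeta * d_beta + tol
            ok27 = zeta(a) * (beta(b) - beta(a)) <= B(beta(b)) - B(beta(a)) + tol
            mid = 0.5 * (beta(a) + beta(b))
            rhs28 = 4.0 * L_BETA * L_ZETA * (B(beta(a)) + B(beta(b)) - 2.0 * B(mid))
            ok28 = d_nu ** 2 <= rhs28 + tol
            if not (ok23 and ok24 and ok27 and ok28):
                return False
    return True
```

On this example, the pair \(s=-1\), \(s'=2\) gives equality in (28): both sides are equal to 4, so the constant \(4L_\beta L_\zeta \) cannot be lowered here.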

The next lemma is an easy consequence of Fatou’s lemma and the fact that strongly lower semi-continuous convex functions are also weakly lower semi-continuous. We nevertheless provide its short proof.

Lemma 3.4

Let I be a closed interval of \(\mathbb R\) and let \(H:I\rightarrow (-\infty ,\infty ]\) be a convex continuous function (continuity for possible infinite values, at the endpoints of I, corresponding to H having limits at these endpoints). We denote by \(L^2(\Omega ;I)\) the convex set of functions in \(L^2(\Omega )\) with values in I. Let \(v\in L^2(\Omega ;I)\) and let \((v_m)_{m\in \mathbb N}\) be a sequence of functions in \(L^2(\Omega ;I)\) that converges weakly to v in \(L^2(\Omega )\). Then

$$\begin{aligned} \int _\Omega H(v({\varvec{x}}))\mathrm{d}{\varvec{x}}\le \liminf _{m\rightarrow \infty } \int _\Omega H(v_m({\varvec{x}}))\mathrm{d}{\varvec{x}}. \end{aligned}$$

Proof

For \(w\in L^2(\Omega ;I)\) we set \(\Phi (w)=\int _\Omega H(w({\varvec{x}}))\mathrm{d}{\varvec{x}}\). Since H is convex, it is bounded below by an affine function and \(\Phi (w)\) is thus well defined in \((-\infty ,\infty ]\). Moreover, if \(w_k\rightarrow w\) strongly in \(L^2(\Omega ;I)\) then, up to a subsequence, \(w_k\rightarrow w\) a.e. on \(\Omega \) and therefore \(H(w_k)\rightarrow H(w)\) a.e. on \(\Omega \). Thanks to the affine lower bound on H, we can apply Fatou’s lemma to see that \(\Phi (w)\le \liminf _{k\rightarrow \infty }\Phi (w_k)\).

Hence, \(\Phi \) is lower semi-continuous for the strong topology of \(L^2(\Omega ;I)\). Since \(\Phi \) (like H) is convex, we deduce that this lower semi-continuity property is also valid for the weak topology of \(L^2(\Omega ;I)\), see [26]. The result of the lemma is just the translation of this weak lower semi-continuity of \(\Phi \). \(\square \)
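A standard example shows that the inequality in Lemma 3.4 can be strict. The sketch below takes, as an illustrative assumption, \(\Omega =(0,1)\), \(H(s)=s^2\) and \(v_m(x)=\sin (m\pi x)\), which converges weakly (but not strongly) to 0 in \(L^2(\Omega )\). Integrals are approximated by a midpoint rule; all names are hypothetical.

```python
import math

def integrate(f, n=20000):
    """Midpoint rule on (0, 1)."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

def v(m):
    """The oscillating function v_m(x) = sin(m pi x)."""
    return lambda x: math.sin(m * math.pi * x)

def pairing(m, phi):
    """L^2 inner product of v_m with a test function phi."""
    return integrate(lambda x: v(m)(x) * phi(x))

def energy(m, H=lambda s: s * s):
    """Integral of H(v_m) over (0, 1)."""
    return integrate(lambda x: H(v(m)(x)))
```

With \(\varphi (x)=x\), `pairing(m, phi)` equals \((-1)^{m+1}/(m\pi )\) up to quadrature error and tends to 0, whereas `energy(m)` stays at \(1/2\), strictly above \(\int _\Omega H(0)\mathrm{d}{\varvec{x}}=0\).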

The last technical result is a consequence of the Minty trick. It has been proved and used in the \(L^2\) case in [21, 28], but we need here an extension to the non-Hilbertian case.

Lemma 3.5

(Minty’s trick) Let \(H \in C^0(\mathbb R)\) be a non-decreasing function. Let \((X,\mu )\) be a measure space with finite measure and let \((u_n)_{n\in \mathbb N} \subset L^p(X)\), with \(p>1\), satisfy

  1. 1.

    there exists \(u \in L^p(X)\) such that \((u_n)_{n\in \mathbb N}\) converges weakly to u in \(L^p(X)\);

  2. 2.

    \((H(u_n))_{n\in \mathbb N} \subset L^1(X)\) and there exists \(w\in L^1(X)\) such that \((H(u_n))_{n\in \mathbb N}\) converges strongly to w in \(L^1(X)\);

Then \(w = H(u)\) a.e. on X.

Proof

For \(k,l>0\) we define the truncation at levels \(-l\) and k by \(T_{k,l}(s)=\max (-l,\min (s,k))\) and we let \(T_k=T_{k,k}\). Since H is non-decreasing, there exist sequences \((h_k)_{k\in \mathbb N}\) and \((m_k)_{k\in \mathbb N}\) that tend to \(+\infty \) as \(k\rightarrow \infty \) and such that \(H(T_k(s))=T_{h_k,m_k}(H(s))\). Thus, \(H(T_k(u_n))\rightarrow T_{h_k,m_k}(w)\) in \(L^1(X)\) as \(n\rightarrow \infty \). Given that \((H(T_k(u_n)))_{n\in \mathbb N}\) remains bounded in \(L^\infty (X)\), its convergence to \(T_{h_k,m_k}(w)\) also holds in \(L^{p'}(X)\).

Using the fact that \(H\circ T_k\) is non-decreasing, we write for any \(g\in L^{p}(X)\)

$$\begin{aligned} \int _X (H(T_k(u_n))-H(T_k(g)))(u_n-g)\mathrm{d}\mu \ge 0. \end{aligned}$$

By strong convergence of \(H(T_k(u_n))\) in \(L^{p'}(X)\) and weak convergence of \(u_n\) in \(L^{p}(X)\), as well as the fact that \(H\circ T_k\) is bounded, we can take the limit of this expression as \(n\rightarrow \infty \) and we find

$$\begin{aligned} \int _X (T_{h_k,m_k}(w)-H(T_k(g)))(u-g)\mathrm{d}\mu \ge 0. \end{aligned}$$
(31)

We then use Minty’s trick. We pick a generic \(\varphi \in L^{p}(X)\), apply (31) to \(g=u-t\varphi \), divide by t and let \(t\rightarrow \pm 0\) (using the dominated convergence theorem and the fact that \(H\circ T_k\) is continuous and bounded) to find

$$\begin{aligned} \int _X (T_{h_k,m_k}(w)-H(T_k(u)))\varphi \mathrm{d}\mu = 0. \end{aligned}$$

Selecting \(\varphi =\mathrm{sign}(T_{h_k,m_k}(w)-H(T_k(u)))\), we deduce that \(T_{h_k,m_k}(w)=H(T_k(u))\) a.e. on X. Letting \(k\rightarrow \infty \), we conclude that \(w=H(u)\) a.e. on X. \(\square \)
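The commutation between H and the truncations used at the start of this proof can be illustrated on a concrete function. The sketch below assumes that the levels are \(h_k=H(k)\) and \(m_k=-H(-k)\), which is our reading of the construction (the text does not state them explicitly), and takes for H a piecewise linear non-decreasing function with a plateau; both choices and the function names are illustrative.

```python
def truncate(s, k, l):
    """T_{k,l}(s) = max(-l, min(s, k)): truncation at levels -l and k."""
    return max(-l, min(s, k))

def H(s):
    """A continuous non-decreasing function with a plateau on [0, 1]."""
    if s <= 0.0:
        return s
    if s <= 1.0:
        return 0.0
    return s - 1.0

def levels(k):
    """Assumed levels (h_k, m_k) for which H(T_k(s)) = T_{h_k, m_k}(H(s))."""
    return H(k), -H(-k)
```

For this H, `levels(k)` returns \((k-1,k)\) when \(k\ge 1\), so both levels tend to \(+\infty \) with k, and the identity \(H(T_k(s))=T_{h_k,m_k}(H(s))\) can be checked pointwise on any sample of values s.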

3.3 Integration-by-parts for the continuous solution

The last series of preliminary results are properties of the solution to (4), all based on the following integration-by-parts property. This property, used in the proof of Theorems 2.12 and 2.16, enables us to compute the value of the linear form \(\partial _t \beta ({\overline{u}})\in L^{p'}(0,T;W^{-1,p'}(\Omega ))\) on the function \(\zeta ({\overline{u}}) \in L^p(0,T;W^{1,p}_0(\Omega ))\). Because of the lack of regularity of \({\overline{u}}\) and the double non-linearity (\(\beta \) and \(\zeta \)), justifying this integration-by-parts is, however, far from straightforward.

Lemma 3.6

Let us assume (2b) and (2c). Let \(v:\Omega \times (0,T)\rightarrow \mathbb R\) be measurable such that \(\zeta (v)\in L^p(0,T;W^{1,p}_0(\Omega ))\), \(B(\beta (v)) \in L^\infty (0,T;L^1(\Omega ))\), \(\beta (v)\in C([0,T];L^2(\Omega ){-}\text{ w })\) and \(\partial _t \beta (v)\in L^{p'}(0,T;W^{-1,p'}(\Omega ))\). Then \(t\in [0,T]\mapsto \int _\Omega B(\beta (v)({\varvec{x}},t))\mathrm{d}{\varvec{x}}\in [0,\infty )\) is continuous and, for all \(t_1,t_2\in [0,T]\),

$$\begin{aligned} \int _{t_1}^{t_2}\langle \partial _t \beta (v)(t),\zeta (v(\cdot ,t))\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t= & {} \int _\Omega B(\beta (v)({\varvec{x}},t_2))\mathrm{d}{\varvec{x}}\nonumber \\&-\int _\Omega B(\beta (v)({\varvec{x}},t_1)) \mathrm{d}{\varvec{x}}. \end{aligned}$$
(32)

Remark 3.7

Similarly to the discussion at the end of Sect. 1.2, we notice that it is important to keep in mind the separation between \(\beta (v(\cdot ,\cdot ))\) and its continuous representative \(\beta (v)(\cdot ,\cdot )\).

Proof

Without loss of generality, we assume that \(0\le t_1<t_2\le T\).

Step 1: truncation, extension and approximation of \(\beta (v)\).

We define \(\overline{\beta (v)}:\mathbb R\rightarrow L^2(\Omega )\) by setting

$$\begin{aligned} \overline{\beta (v)}(t)=\left\{ \begin{array}{ll} \beta (v)(t)&{} \text{ if } t\in [t_1,t_2],\\ \beta (v)(t_1)&{} \text{ if } t\le t_1,\\ \beta (v)(t_2)&{} \text{ if } t\ge t_2. \end{array}\right. \end{aligned}$$

By the continuity property of \(\beta (v)\), this definition makes sense and gives \(\overline{\beta (v)}\in C(\mathbb R;L^2(\Omega ){-}\text{ w })\) such that \(\partial _t \overline{\beta (v)} =\mathbf {1}_{(t_1,t_2)}\partial _t \beta (v) \in L^{p'}(\mathbb R;W^{-1,p'}(\Omega ))\) where \(\mathbf {1}\) is the characteristic function (no Dirac masses have been introduced at \(t=t_1\) or \(t=t_2\)). This regularity of \(\partial _t\overline{\beta (v)}\) ensures that the function \(D_h\overline{\beta (v)}:\mathbb R\rightarrow W^{-1,p'}(\Omega )\) defined by

$$\begin{aligned} \forall t\in \mathbb R,\;D_h\overline{\beta (v)}(t)=\frac{1}{h}\int _{t}^{t+h} \partial _t\overline{\beta (v)}(s)\mathrm{d}s=\frac{\overline{\beta (v)}(t+h)-\overline{\beta (v)}(t)}{h} \end{aligned}$$
(33)

tends to \(\partial _t \overline{\beta (v)}\) in \(L^{p'}(\mathbb R;W^{-1,p'}(\Omega ))\) as \(h\rightarrow 0\).
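
The convergence of \(D_h\overline{\beta (v)}\) can be read as a standard mollification argument. As a sketch, for \(h>0\), writing \(g=\partial _t\overline{\beta (v)}\) and introducing the kernel \(\rho _h=\frac{1}{h}\mathbf {1}_{(-h,0)}\) (a notation used only here), the change of variable \(s=t-r\) in (33) gives

$$\begin{aligned} D_h\overline{\beta (v)}(t)=\int _\mathbb R\rho _h(r)\,g(t-r)\mathrm{d}r,\qquad \rho _h\ge 0,\qquad \int _\mathbb R\rho _h(r)\mathrm{d}r=1. \end{aligned}$$

Young's inequality for convolutions then yields \(||D_h\overline{\beta (v)}||_{L^{p'}(\mathbb R;W^{-1,p'}(\Omega ))}\le ||g||_{L^{p'}(\mathbb R;W^{-1,p'}(\Omega ))}\) for all \(h>0\). Since \(p'<\infty \), the convergence as \(h\rightarrow 0\) follows from the density of continuous compactly supported functions in \(L^{p'}(\mathbb R;W^{-1,p'}(\Omega ))\).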

Step 2: we prove that \(||B(\overline{\beta (v)}(t))||_{L^1(\Omega )}\le ||B(\beta (v))||_{L^\infty (0,T;L^1(\Omega ))}\) for all \(t\in \mathbb R\) (not only for a.e. t).

Let \(t\in [t_1,t_2]\). Since \(\beta (v)(\cdot ,\cdot )=\beta (v(\cdot ,\cdot ))\) a.e. on \(\Omega \times (t_1,t_2)\), there exists a sequence \(t_n\rightarrow t\) such that \(\beta (v)(\cdot ,t_n)=\beta (v(\cdot ,t_n))\) in \(L^2(\Omega )\) and \(||B(\beta (v)(\cdot ,t_n))||_{L^1(\Omega )}\le ||B(\beta (v))||_{L^\infty (0,T;L^1(\Omega ))}\) for all n. As \(\beta (v)\in C([0,T]; L^2(\Omega ){-}\text{ w })\), we have \(\beta (v)(\cdot ,t_n)\rightarrow \beta (v)(\cdot ,t)\) weakly in \(L^2(\Omega )\). We then use the convexity of B and Lemma 3.4 to write, thanks to our choice of \(t_n\),

$$\begin{aligned} \int _\Omega B(\beta (v)({\varvec{x}},t))\mathrm{d}{\varvec{x}}\le \liminf _{n\rightarrow \infty }\int _\Omega B(\beta (v)({\varvec{x}},t_n))\mathrm{d}{\varvec{x}}\le ||B(\beta (v))||_{L^\infty (0,T;L^1(\Omega ))} \end{aligned}$$

and the proof is complete for \(t\in [t_1,t_2]\). The result for \(t\le t_1\) or \(t\ge t_2\) is obvious since \(\overline{\beta (v)}(t)\) is then either \(\beta (v)(t_1)\) or \(\beta (v)(t_2)\).

Step 3: we prove that, for all \(\tau \in \mathbb R\) and a.e. \(t\in (t_1,t_2)\),

$$\begin{aligned} \langle \overline{\beta (v)}(\tau )-\beta (v)(t),\zeta (v(\cdot ,t))\rangle _{W^{-1,p'},W^{1,p}_0} \le \int _\Omega B(\overline{\beta (v)}({\varvec{x}},\tau ))-B(\beta (v)({\varvec{x}},t))\mathrm{d}{\varvec{x}}. \end{aligned}$$
(34)

If we could just replace the duality product \(W^{-1,p'}\)–\(W^{1,p}_0\) with an \(L^2\) inner product, this formula would be a straightforward consequence of (27). The problem is that nothing ensures that \(\zeta (v)(t)\in L^2(\Omega )\) for a.e. t.

We first notice that \(\overline{\beta (v)}(\tau )-\beta (v)(t) =\int _t^{\tau }\partial _t \overline{\beta (v)}(s)\mathrm{d}s\) belongs to \(W^{-1,p'}(\Omega )\) so the left-hand side of (34) makes sense provided that t is chosen such that \(\zeta (v(\cdot ,t))\in W^{1,p}_0(\Omega )\) (which we do from here on). To deal with the fact that \(\zeta (v(\cdot ,t))\) does not necessarily belong to \(L^2(\Omega )\), we replace it with a truncation. As in the proof of Lemma 3.5, we introduce \(T_{k,l}(s)= \max (-l,\min (s,k))\) and we let \(T_k=T_{k,k}\). By the monotony assumption (2b) on \(\zeta \) we see that there exist sequences \((r_k)_{k\in \mathbb N}\) and \((l_k)_{k\in \mathbb N}\) that tend to \(+\infty \) as \(k\rightarrow +\infty \) and such that \(\zeta (T_k(v(\cdot ,t)))=T_{r_k,l_k}(\zeta (v(\cdot ,t)))\). Hence, \(\zeta (T_k(v(\cdot ,t)))\in W^{1,p}_0(\Omega )\) and converges, as \(k\rightarrow \infty \), to \(\zeta (v(\cdot ,t))\) in \(W^{1,p}_0(\Omega )\).
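
One admissible choice of the truncation levels, given here as an illustration rather than as the only possibility, is \(r_k=\zeta (k)\) and \(l_k=-\zeta (-k)\). Indeed, since \(\zeta \) is non-decreasing, for any \(s\in \mathbb R\) and \(k\ge 0\),

$$\begin{aligned} \zeta (T_k(s))=\left\{ \begin{array}{ll} \zeta (k)&{} \text{ if } s\ge k,\\ \zeta (s)&{} \text{ if } -k\le s\le k,\\ \zeta (-k)&{} \text{ if } s\le -k \end{array}\right\} =\max \big (\zeta (-k),\min (\zeta (s),\zeta (k))\big )=T_{\zeta (k),-\zeta (-k)}(\zeta (s)). \end{aligned}$$

These levels tend to \(+\infty \) as soon as \(\zeta (s)\rightarrow \pm \infty \) when \(s\rightarrow \pm \infty \), which we understand to be part of Assumption (2b).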

We can therefore write

$$\begin{aligned}&\langle \overline{\beta (v)}(\tau )-\beta (v)(t),\zeta (v(\cdot ,t)) \rangle _{W^{-1,p'},W^{1,p}_0}\nonumber \\&\quad =\lim _{k\rightarrow \infty } \langle \overline{\beta (v)}(\tau )-\beta (v)(t),\zeta (T_k(v(\cdot ,t))) \rangle _{W^{-1,p'},W^{1,p}_0}\nonumber \\&\quad =\lim _{k\rightarrow \infty } \int _\Omega \left[ \overline{\beta (v)}({\varvec{x}},\tau )-\beta (v({\varvec{x}},t))\right] \zeta (T_k(v({\varvec{x}},t)))\mathrm{d}{\varvec{x}}, \end{aligned}$$
(35)

the replacement of the duality product by an \(L^2(\Omega )\) inner product being justified since \(\overline{\beta (v)}(\tau )-\beta (v)(t)\) and \(\zeta (T_k(v(\cdot ,t)))\) both belong to \(L^2(\Omega )\). We also used that, for a.e. \(t\in (t_1,t_2)\), \(\beta (v)(\cdot ,t)=\beta (v(\cdot ,t))\) a.e. on \(\Omega \); hence (35) is valid for a.e. \(t\in (t_1,t_2)\).

We then write \(\beta (v({\varvec{x}},t))=\beta (T_k(v({\varvec{x}},t)))+\left[ \beta (v({\varvec{x}},t))-\beta (T_k(v({\varvec{x}},t)))\right] \) and apply (27) with \(S=\overline{\beta (v)}({\varvec{x}},\tau )\) and \(a=T_k(v({\varvec{x}},t))\) to find

$$\begin{aligned}&\int _\Omega \left[ \overline{\beta (v)}({\varvec{x}},\tau )-\beta (v({\varvec{x}},t))\right] \zeta (T_k(v({\varvec{x}},t)))\mathrm{d}{\varvec{x}}\\&\quad =\int _\Omega \left[ \overline{\beta (v)}({\varvec{x}},\tau )-\beta (T_k(v({\varvec{x}},t)))\right] \zeta (T_k(v({\varvec{x}},t)))\mathrm{d}{\varvec{x}}\\&\qquad -\int _\Omega \left[ \beta (v({\varvec{x}},t))-\beta (T_k(v({\varvec{x}},t)))\right] \zeta (T_k(v({\varvec{x}},t)))\mathrm{d}{\varvec{x}}\\&\quad \le \int _\Omega B(\overline{\beta (v)}({\varvec{x}},\tau ))-B(\beta (T_k(v({\varvec{x}},t))))\mathrm{d}{\varvec{x}}\\&\qquad -\int _\Omega \left[ \beta (v({\varvec{x}},t))-\beta (T_k(v({\varvec{x}},t)))\right] \zeta (T_k(v({\varvec{x}},t)))\mathrm{d}{\varvec{x}}. \end{aligned}$$

By the monotony of \(\beta \), the sign of \(\zeta \) and by studying the cases \(v({\varvec{x}},t)\ge k\), \(-k\le v({\varvec{x}},t)\le k\) and \(v({\varvec{x}},t)\le -k\), we notice that the last integrand is everywhere non-negative. We can therefore write

$$\begin{aligned}&\int _\Omega \left[ \overline{\beta (v)}({\varvec{x}},\tau )-\beta (v({\varvec{x}},t))\right] \zeta (T_k(v({\varvec{x}},t)))\mathrm{d}{\varvec{x}}\\&\quad \le \int _\Omega B(\overline{\beta (v)}({\varvec{x}},\tau ))-B(\beta (T_k(v({\varvec{x}},t))))\mathrm{d}{\varvec{x}}. \end{aligned}$$
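
The sign claim on the discarded integrand can be checked case by case. The following is a sketch, assuming as part of (2b) that \(\zeta (0)=0\) (so that \(\zeta (k)\ge 0\ge \zeta (-k)\) for \(k\ge 0\)) and writing \(s=v({\varvec{x}},t)\):

$$\begin{aligned} \left[ \beta (s)-\beta (T_k(s))\right] \zeta (T_k(s))=\left\{ \begin{array}{ll} \left[ \beta (s)-\beta (k)\right] \zeta (k)\ge 0&{} \text{ if } s\ge k,\\ 0&{} \text{ if } -k\le s\le k,\\ \left[ \beta (s)-\beta (-k)\right] \zeta (-k)\ge 0&{} \text{ if } s\le -k. \end{array}\right. \end{aligned}$$

In the first case both factors are non-negative, in the last case both are non-positive, and in the middle case \(T_k(s)=s\).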

We then use the continuity of \(B\circ \beta \) and Fatou’s lemma to deduce

$$\begin{aligned}&\limsup _{k\rightarrow \infty } \int _\Omega \left[ \overline{\beta (v)}({\varvec{x}},\tau )-\beta (v({\varvec{x}},t))\right] \zeta (T_k(v({\varvec{x}},t)))\mathrm{d}{\varvec{x}}\\&\quad \le \int _\Omega B(\overline{\beta (v)}({\varvec{x}},\tau ))\mathrm{d}{\varvec{x}}-\liminf _{k\rightarrow \infty }\int _\Omega B(\beta (T_k(v({\varvec{x}},t))))\mathrm{d}{\varvec{x}}\\&\quad \le \int _\Omega B(\overline{\beta (v)}({\varvec{x}},\tau ))\mathrm{d}{\varvec{x}}-\int _\Omega B(\beta (v({\varvec{x}},t)))\mathrm{d}{\varvec{x}}\end{aligned}$$

which, combined with (35), concludes the proof of (34) (recall that t has been chosen such that \(\beta (v(\cdot ,t))=\beta (v)(\cdot ,t)\) a.e. on \(\Omega \)).

Step 4: proof of the formula

Since \(\mathbf {1}_{(t_1,t_2)}\zeta (v) \!\in \! L^p(\mathbb R;W^{1,p}_0(\Omega ))\) and \(D_h\overline{\beta (v)}\rightarrow \partial _t\overline{\beta (v)}\) in \(L^{p'}(\mathbb R;W^{-1,p'}(\Omega ))\) as \(h\rightarrow 0\), we have

$$\begin{aligned}&\int _{t_1}^{t_2}\langle \partial _t \beta (v)(t),\zeta (v(\cdot ,t))\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t\nonumber \\&\quad =\int _\mathbb R\langle \partial _t \overline{\beta (v)}(t),\mathbf {1}_{(t_1,t_2)}(t)\zeta (v(\cdot ,t))\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t\nonumber \\&\quad =\lim _{h\rightarrow 0}\int _\mathbb R\langle D_h\overline{\beta (v)}(t),\mathbf {1}_{(t_1,t_2)}(t)\zeta (v(\cdot ,t))\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t\nonumber \\&\quad =\lim _{h\rightarrow 0}\frac{1}{h}\int _{t_1}^{t_2} \langle \overline{\beta (v)}(t+h)-\overline{\beta (v)}(t),\zeta (v(\cdot ,t))\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t. \end{aligned}$$
(36)

We then use (34) for a.e. \(t\in (t_1,t_2)\) to obtain, for h small enough such that \(t_1+h<t_2\),

$$\begin{aligned}&\frac{1}{h}\int _{t_1}^{t_2} {\langle \overline{\beta (v)}(t+h)-\overline{\beta (v)}(t),\zeta (v(\cdot ,t))\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t}\nonumber \\&\quad \le \frac{1}{h}\int _{t_1}^{t_2} \int _\Omega B(\overline{\beta (v)}({\varvec{x}},t+h))-B(\overline{\beta (v)}({\varvec{x}},t))\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\quad =\frac{1}{h}\int _{t_2}^{t_2+h} \int _\Omega B(\overline{\beta (v)}({\varvec{x}},t))\mathrm{d}{\varvec{x}}\mathrm{d}t -\frac{1}{h}\int _{t_1}^{t_1+h} \int _\Omega B(\overline{\beta (v)}({\varvec{x}},t))\mathrm{d}{\varvec{x}}\mathrm{d}t\\&\quad =\int _\Omega B(\beta (v)({\varvec{x}},t_2))\mathrm{d}{\varvec{x}}-\frac{1}{h}\int _{t_1}^{t_1+h} \int _\Omega B(\beta (v)({\varvec{x}},t))\mathrm{d}{\varvec{x}}\mathrm{d}t.\nonumber \end{aligned}$$
(37)

We used the estimate in Step 2 to justify the separation of the integrals in (37). We now take the \(\limsup \) as \(h\rightarrow 0\) of this inequality, using again Step 2 to see that \(B(\beta (v)(\cdot ,t_2))\) is integrable and therefore take its integral out of the \(\limsup \). Coming back to (36) we obtain

$$\begin{aligned}&\int _{t_1}^{t_2}\langle \partial _t \beta (v)(t),\zeta (v(\cdot ,t))\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t\nonumber \\&\quad \le \int _\Omega B(\beta (v)({\varvec{x}},t_2))\mathrm{d}{\varvec{x}}-\liminf _{h\rightarrow 0}\frac{1}{h}\int _{t_1}^{t_1+h} \int _\Omega B(\beta (v)({\varvec{x}},t))\mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(38)

Since \(\beta (v)\in C([0,T];L^2(\Omega ){-}\text{ w })\), as \(h\rightarrow 0\) we have \(\frac{1}{h}\int _{t_1}^{t_1+h}\beta (v)(t)\mathrm{d}t\rightarrow \beta (v)(t_1)\) weakly in \(L^2(\Omega )\). Hence, the convexity of B, Lemma 3.4 and Jensen’s inequality give

$$\begin{aligned} \int _\Omega B(\beta (v)({\varvec{x}},t_1))\mathrm{d}{\varvec{x}}\le & {} \liminf _{h\rightarrow 0}\int _\Omega B\left( \frac{1}{h}\int _{t_1}^{t_1+h}\beta (v)({\varvec{x}},t)\mathrm{d}t\right) \mathrm{d}{\varvec{x}}\\\le & {} \liminf _{h\rightarrow 0}\int _\Omega \frac{1}{h}\int _{t_1}^{t_1+h} B(\beta (v)({\varvec{x}},t))\mathrm{d}t\mathrm{d}{\varvec{x}}. \end{aligned}$$

Plugged into (38), this inequality shows that (32) holds with \(\le \) instead of \(=\). The reverse inequality is obtained by reversing the time. We consider \(\widetilde{v}(t)=v(t_1+t_2-t)\). Then \(\zeta (\widetilde{v})\), \(B(\beta (\widetilde{v}))\) and \(\beta (\widetilde{v})\) have the same properties as \(\zeta (v)\), \(B(\beta (v))\) and \(\beta (v)\), and \(\beta (\widetilde{v})\) takes values \(\beta (v)(t_1)\) at \(t=t_2\) and \(\beta (v)(t_2)\) at \(t=t_1\). Applying (32) with “\(\le \)” instead of “\(=\)” to \(\widetilde{v}\) and using the fact that \(\partial _t \beta (\widetilde{v})(t)=-\partial _t \beta (v)(t_1+t_2-t)\), we obtain (32) with “\(\ge \)” instead of “\(=\)” and the proof of (32) is complete.
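
The time-reversal step can be made explicit as follows (a sketch, using the change of variable \(s=t_1+t_2-t\)). Since \(\partial _t \beta (\widetilde{v})(t)=-\partial _t \beta (v)(t_1+t_2-t)\) and \(\zeta (\widetilde{v}(\cdot ,t))=\zeta (v(\cdot ,t_1+t_2-t))\),

$$\begin{aligned} \int _{t_1}^{t_2}\langle \partial _t \beta (\widetilde{v})(t),\zeta (\widetilde{v}(\cdot ,t))\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t =-\int _{t_1}^{t_2}\langle \partial _t \beta (v)(s),\zeta (v(\cdot ,s))\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}s, \end{aligned}$$

while the inequality version of (32) applied to \(\widetilde{v}\) bounds the left-hand side from above by

$$\begin{aligned} \int _\Omega B(\beta (\widetilde{v})({\varvec{x}},t_2))\mathrm{d}{\varvec{x}}-\int _\Omega B(\beta (\widetilde{v})({\varvec{x}},t_1))\mathrm{d}{\varvec{x}}=\int _\Omega B(\beta (v)({\varvec{x}},t_1))\mathrm{d}{\varvec{x}}-\int _\Omega B(\beta (v)({\varvec{x}},t_2))\mathrm{d}{\varvec{x}}. \end{aligned}$$

Multiplying the resulting inequality by \(-1\) gives (32) with “\(\ge \)”.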

The continuity of \(t\in [0,T]\mapsto \int _\Omega B(\beta (v)({\varvec{x}},t))\mathrm{d}{\varvec{x}}\) is straightforward from (32) as the left-hand side of this relation is continuous with respect to \(t_1\) and \(t_2\). \(\square \)

The following corollary states continuity properties and an essential formula for the solution to (4).

Corollary 3.8

Under Assumptions (2a)–(2i), if \({\overline{u}}\) is a solution of (4) then:

  1. 1.

    the function \(t\in [0,T]\mapsto \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},t))\mathrm{d}{\varvec{x}}\in [0,\infty )\) is continuous and bounded,

  2. 2.

    for any \(T_0\in [0,T]\),

    $$\begin{aligned}&\int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0)) \mathrm{d}{\varvec{x}}{+} \int _{0}^{T_0}\! \int _\Omega \! {\varvec{a}}({\varvec{x}},\nu ({\overline{u}}({\varvec{x}},t)),\nabla \zeta ({\overline{u}})({\varvec{x}},t)) \cdot \nabla \zeta ({\overline{u}})({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t \nonumber \\&\quad = \int _\Omega B(\beta (u_\mathrm{ini}({\varvec{x}}))) \mathrm{d}{\varvec{x}}+ \int _{0}^{T_0} \int _\Omega f({\varvec{x}},t) \zeta ({\overline{u}})({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t, \end{aligned}$$
    (39)
  3. 3.

    \(\nu ({\overline{u}})\) is continuous \([0,T]\rightarrow L^2(\Omega )\).

Remark 3.9

The continuity of \(\nu ({\overline{u}})\) has to be understood in the same sense as the continuity of \(\beta ({\overline{u}})\), that is \(\nu ({\overline{u}})\) is a.e. on \(\Omega \times (0,T)\) equal to a continuous function \([0,T]\rightarrow L^2(\Omega )\). We use in particular the notation \(\nu ({\overline{u}})(\cdot ,\cdot )\) for the continuous-in-time representative of \(\nu ({\overline{u}}(\cdot ,\cdot ))\), similarly to the way we denote the continuous-in-time representative of \(\beta ({\overline{u}}(\cdot ,\cdot ))\).

Proof

The continuity of \(t\in [0,T]\mapsto \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},t))\mathrm{d}{\varvec{x}}\in [0,\infty )\) and Formula (39) are straightforward consequences of Lemma 3.6 with \(v={\overline{u}}\) and using (4) with \({\overline{v}}=\zeta ({\overline{u}})\). Note that the bound on \(\int _\Omega B(\beta ({\overline{u}})({\varvec{x}},t))\mathrm{d}{\varvec{x}}\) can be seen as a consequence of (39), or from Step 2 in the proof of Lemma 3.6.

Let us prove the strong continuity of \(\nu ({\overline{u}}):[0,T]\mapsto L^2(\Omega )\). Let \(\mathcal T\) be the set of \(\tau \in [0,T]\) such that \(\beta ({\overline{u}}(\cdot ,\tau ))=\beta ({\overline{u}})(\cdot ,\tau )\) a.e. on \(\Omega \), and let \((s_l)_{l\in \mathbb N}\) and \((t_k)_{k\in \mathbb N}\) be two sequences in \(\mathcal T\) that converge to the same value s. Invoking (28) we can write

$$\begin{aligned}&\int _\Omega (\nu ({\overline{u}}({\varvec{x}},s_l))-\nu ({\overline{u}}({\varvec{x}},t_k)))^2\mathrm{d}{\varvec{x}}\nonumber \\&\quad \le 4L_\beta L_\zeta \left( \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},s_l))\mathrm{d}{\varvec{x}}+\int _\Omega B(\beta ({\overline{u}})({\varvec{x}},t_k))\mathrm{d}{\varvec{x}}\right) \nonumber \\&\qquad -8L_\beta L_\zeta \int _\Omega B\left( \frac{\beta ({\overline{u}})({\varvec{x}},s_l)+\beta ({\overline{u}})({\varvec{x}},t_k)}{2}\right) \mathrm{d}{\varvec{x}}. \end{aligned}$$
(40)

Since \(\frac{\beta ({\overline{u}})(\cdot ,s_l)+\beta ({\overline{u}})(\cdot ,t_k)}{2}\rightarrow \beta ({\overline{u}})(\cdot ,s)\) weakly in \(L^2(\Omega )\) as \(l,k\rightarrow \infty \), Lemma 3.4 gives

$$\begin{aligned} \int _\Omega B\left( \beta ({\overline{u}})({\varvec{x}},s)\right) \mathrm{d}{\varvec{x}}\le \liminf _{l,k\rightarrow \infty }\int _\Omega B\left( \frac{\beta ({\overline{u}})({\varvec{x}},s_l)+\beta ({\overline{u}})({\varvec{x}},t_k)}{2}\right) \mathrm{d}{\varvec{x}}. \end{aligned}$$

Taking the \(\limsup \) as \(l,k\rightarrow \infty \) of (40) and using the continuity of \(t\mapsto \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},t))\mathrm{d}{\varvec{x}}\) thus shows that

$$\begin{aligned} ||\nu ({\overline{u}}(\cdot ,s_l))-\nu ({\overline{u}}(\cdot ,t_k))||_{L^2(\Omega )}\rightarrow 0\quad \text{ as } l,k\rightarrow \infty . \end{aligned}$$
(41)
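
In more detail, and as a sketch of this limit passage: the continuity of \(t\mapsto \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},t))\mathrm{d}{\varvec{x}}\) shows that the first two integrals in the right-hand side of (40) both converge to \(\int _\Omega B(\beta ({\overline{u}})({\varvec{x}},s))\mathrm{d}{\varvec{x}}\), which is finite, and the \(\liminf \) inequality above controls the last term. Hence

$$\begin{aligned}&\limsup _{l,k\rightarrow \infty }\int _\Omega (\nu ({\overline{u}}({\varvec{x}},s_l))-\nu ({\overline{u}}({\varvec{x}},t_k)))^2\mathrm{d}{\varvec{x}}\\&\quad \le 8L_\beta L_\zeta \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},s))\mathrm{d}{\varvec{x}}-8L_\beta L_\zeta \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},s))\mathrm{d}{\varvec{x}}=0, \end{aligned}$$

which is (41).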

The existence of an a.e. representative of \(\nu ({\overline{u}}(\cdot ,\cdot ))\) which is continuous \([0,T]\mapsto L^2(\Omega )\) is a direct consequence of this convergence. Let \(s\in [0,T]\) and let \((s_l)_{l\in \mathbb N}\subset \mathcal T\) be a sequence that converges to s. Applied with \(t_k=s_k\), (41) shows that \((\nu ({\overline{u}}(\cdot ,s_l)))_{l\in \mathbb N}\) is a Cauchy sequence in \(L^2(\Omega )\) and therefore that \(\lim _{l\rightarrow \infty } \nu ({\overline{u}}(\cdot ,s_l))\) exists in \(L^2(\Omega )\). Moreover, (41) shows that this limit, which we denote by \(\nu ({\overline{u}})(\cdot ,s)\), does not depend on the sequence in \(\mathcal T\) that converges to s. Whenever \(s\in \mathcal T\), the choice \(t_k=s\) in (41) shows that \(\nu ({\overline{u}})(\cdot ,s)=\nu ({\overline{u}}(\cdot ,s))\) a.e. on \(\Omega \), and \(\nu ({\overline{u}})(\cdot ,\cdot )\) is therefore equal to \(\nu ({\overline{u}}(\cdot ,\cdot ))\) a.e. on \(\Omega \times (0,T)\).

It remains to establish that \(\nu ({\overline{u}})\) thus defined is continuous \([0,T]\mapsto L^2(\Omega )\). For any \((\tau _r)_{r\in \mathbb N}\subset [0,T]\) that converges to \(\tau \in [0,T]\), we can pick \(s_r\in \mathcal T\cap (\tau _r-\frac{1}{r},\tau _r+\frac{1}{r})\) and \(t_r\in \mathcal T\cap (\tau -\frac{1}{r},\tau +\frac{1}{r})\) such that

$$\begin{aligned} ||\nu ({\overline{u}})(\cdot ,\tau _r)-\nu ({\overline{u}}(\cdot ,s_r))||_{L^2(\Omega )} \le \frac{1}{r},\quad ||\nu ({\overline{u}})(\cdot ,\tau )-\nu ({\overline{u}}(\cdot ,t_r))||_{L^2(\Omega )}\le \frac{1}{r}. \end{aligned}$$

We therefore have

$$\begin{aligned} ||\nu ({\overline{u}})(\cdot ,\tau _r)-\nu ({\overline{u}})(\cdot ,\tau )||_{L^2(\Omega )} \le \frac{2}{r}+||\nu ({\overline{u}}(\cdot ,s_r))-\nu ({\overline{u}}(\cdot ,t_r))||_{L^2(\Omega )}. \end{aligned}$$

This proves by (41) with \(l=k=r\) that \(\nu ({\overline{u}})(\cdot ,\tau _r) \rightarrow \nu ({\overline{u}})(\cdot ,\tau )\) in \(L^2(\Omega )\) as \(r\rightarrow \infty \), and the proof is complete. \(\square \)

4 Proof of the convergence theorems

4.1 Estimates on the approximate solution

As usual in the study of numerical methods for PDE with strong non-linearities or without regularity assumptions on the data, everything starts with a priori estimates.

Lemma 4.1

(\(L^\infty (0,T;L^2(\Omega ))\) estimate and discrete \(L^p(0,T;W^{1,p}_0(\Omega ))\) estimate) Under Assumptions (2), let \({\mathcal D}\) be a space-time gradient discretisation in the sense of Definition 2.1. Let u be a solution to Scheme (14).

Then, for any \(T_0\in (0,T]\), denoting by \(k=1,\ldots ,N\) the index such that \(T_0\in (t^{(k-1)},t^{(k)}]\) we have

$$\begin{aligned}&\int _\Omega B(\Pi _{\mathcal D}\beta (u)({\varvec{x}},T_0))\mathrm{d}{\varvec{x}}\nonumber \\&\quad \quad +\int _0^{T_0}\int _\Omega {\varvec{a}}({\varvec{x}},\Pi _{\mathcal D}\nu (u)({\varvec{x}},t),\nabla _{\mathcal D}\zeta (u)({\varvec{x}},t))\cdot \nabla _{\mathcal D}\zeta (u)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\quad \le \int _\Omega B(\Pi _{\mathcal D}\beta ({\mathcal I}_{\mathcal D}u_\mathrm{ini})({\varvec{x}}))\mathrm{d}{\varvec{x}}+\int _0^{t^{(k)}}\int _\Omega f({\varvec{x}},t)\Pi _{\mathcal D}\zeta (u)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(42)

Consequently, there exists \(C_{1}>0\) only depending on p, \(L_\beta \), \(C_P\ge C_{\mathcal D}\) (see Definition 2.5), \(C_\mathrm{ini} \ge \Vert \Pi _{\mathcal D}{\mathcal I}_{\mathcal D}u_\mathrm{ini}\Vert _{L^2(\Omega )}\), f, \(\underline{a}\) and the constants \(K_0\), \(K_1\) and \(K_2\) in (26) such that

$$\begin{aligned}&\Vert \Pi _{\mathcal D}B( \beta (u))\Vert _{L^\infty (0,T;L^1(\Omega ))} \le C_{1},\; \Vert \nabla _{{\mathcal D}} \zeta (u) \Vert _{L^p(\Omega \times (0,T))^d} \le C_{1}\nonumber \\&\text{ and } \Vert \Pi _{\mathcal D}\beta (u)\Vert _{L^\infty (0,T;L^2(\Omega ))} \le C_{1}. \end{aligned}$$
(43)

Proof

By using (12) and (27) we notice that for any \(n=0,\ldots ,N-1\) and any \(t\in (t^{(n)},t^{(n+1)}]\)

$$\begin{aligned} \Pi _{\mathcal D}\delta _{\mathcal D}\beta (u)(t) \Pi _{\mathcal D}\zeta (u^{(n+1)})= & {} \frac{1}{{\delta t}^{(n+{\frac{1}{2}})}}\left( \beta (\Pi _{\mathcal D}u^{(n+1)})-\beta (\Pi _{\mathcal D}u^{(n)})\right) \zeta (\Pi _{{\mathcal D}}u^{(n+1)})\\\ge & {} \frac{1}{{\delta t}^{(n+{\frac{1}{2}})}}\left( B(\Pi _{\mathcal D}\beta (u^{(n+1)})) - B(\Pi _{\mathcal D}\beta (u^{(n)}))\right) . \end{aligned}$$

Hence, with \(v = (\zeta (u^{(1)}),\ldots ,\zeta (u^{(k)}),0,\ldots ,0) \subset X_{{\mathcal D},0}\) in (14) we find

$$\begin{aligned}&\int _\Omega B(\Pi _{\mathcal D}\beta (u)({\varvec{x}},t^{(k)}))\mathrm{d}{\varvec{x}}\nonumber \\&\quad \quad + \int _0^{t^{(k)}}\int _\Omega {\varvec{a}}({\varvec{x}},\Pi _{\mathcal D}\nu (u)({\varvec{x}},t), \nabla _{\mathcal D}\zeta (u)({\varvec{x}},t))\cdot \nabla _{\mathcal D}\zeta (u)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\quad \le \int _\Omega B(\Pi _{\mathcal D}\beta (u^{(0)})({\varvec{x}}))\mathrm{d}{\varvec{x}}+\int _0^{t^{(k)}}\int _\Omega f({\varvec{x}},t)\Pi _{\mathcal D}\zeta (u)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(44)

Equation (42) is a straightforward consequence of this estimate, of the relation \(\beta (u)(\cdot ,T_0)=\beta (u)(\cdot ,t^{(k)})\) [see (13)] and of the fact that the integrand involving \({\varvec{a}}\) is nonnegative on \([T_0,t^{(k)}]\).

By using Young’s inequality \(ab\le \frac{1}{p}a^p+\frac{1}{p'}b^{p'}\), we can write

$$\begin{aligned}&\int _0^{t^{(k)}}\int _\Omega f({\varvec{x}},t)\Pi _{\mathcal D}\zeta (u)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t \;\le \; \frac{2^{1/(p-1)} C_{\mathcal D}^{p'}}{ (p \underline{a})^{1/(p-1)}\ p'} \Vert f\Vert _{L^{p'}(\Omega \times (0,t^{(k)}))}^{p'}\\&\quad +\, \frac{ \underline{a}}{2 C_{\mathcal D}^{p}}\Vert \Pi _{\mathcal D}\zeta (u)\Vert _{L^p(\Omega \times (0,t^{(k)}))}^{p} \end{aligned}$$

and the first two estimates in (43) therefore follow from (44), (26), the coercivity assumption (2f) on \({\varvec{a}}\) and the Definition 2.5 of \(C_{\mathcal D}\). The estimate on \(\Pi _{\mathcal D}\beta (u)=\beta (\Pi _{\mathcal D}u)\) in \(L^\infty (0,T;L^2(\Omega ))\) is a consequence of the estimate on \(B(\beta (\Pi _{\mathcal D}u))\) in \(L^\infty (0,T;L^1(\Omega ))\) and of (26). \(\square \)
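
The constants in the Young inequality above can be recovered as follows. This is only a verification sketch; the scaling parameter \(\lambda >0\) is introduced here for that purpose. Applying \(ab\le \frac{1}{p}a^p+\frac{1}{p'}b^{p'}\) with \(a=\lambda |\Pi _{\mathcal D}\zeta (u)|\) and \(b=\lambda ^{-1}|f|\) gives

$$\begin{aligned} \int _0^{t^{(k)}}\int _\Omega f({\varvec{x}},t)\Pi _{\mathcal D}\zeta (u)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\le \frac{\lambda ^{p}}{p}\Vert \Pi _{\mathcal D}\zeta (u)\Vert _{L^p(\Omega \times (0,t^{(k)}))}^{p}+\frac{\lambda ^{-p'}}{p'}\Vert f\Vert _{L^{p'}(\Omega \times (0,t^{(k)}))}^{p'}. \end{aligned}$$

Choosing \(\lambda ^p=\frac{p\underline{a}}{2C_{\mathcal D}^p}\), so that \(\frac{\lambda ^p}{p}=\frac{\underline{a}}{2C_{\mathcal D}^p}\), and using \(\frac{p'}{p}=\frac{1}{p-1}\), we find

$$\begin{aligned} \lambda ^{-p'}=\left( \frac{2C_{\mathcal D}^p}{p\underline{a}}\right) ^{p'/p}=\frac{2^{1/(p-1)}C_{\mathcal D}^{p'}}{(p\underline{a})^{1/(p-1)}}, \end{aligned}$$

which, divided by \(p'\), is the coefficient of \(\Vert f\Vert _{L^{p'}(\Omega \times (0,t^{(k)}))}^{p'}\) stated above.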

Corollary 4.2

(Existence of a solution to the gradient scheme) Under Assumptions (2), if \({\mathcal D}\) is a gradient discretisation in the sense of Definition 2.1 then there exists at least one solution to the gradient scheme (14).

Proof

We endow \(E=\{(u^{(n)})_{n=1,\ldots ,N}\,:\,u^{(n)}\in X_{{\mathcal D},0} \text{ for } \text{ all } n\}\) with the dot product “\(\cdot \)” coming from the degrees of freedom I (see Remark 2.3), and we denote by \(|\cdot |\) the corresponding norm. Let \(T:E\mapsto E\) be such that, for all \(u,v\in E\),

$$\begin{aligned} T(u)\cdot v= & {} \int _0^T\int _\Omega \left[ \Pi _{\mathcal D}\delta _{{\mathcal D}}\beta (u)({\varvec{x}},t) \Pi _{\mathcal D}v({\varvec{x}},t)\right. \\&\left. +\, {\varvec{a}}({\varvec{x}}, \Pi _{\mathcal D}\nu (u)({\varvec{x}},t),\nabla _{\mathcal D}\zeta ( u)({\varvec{x}},t))\cdot \nabla _{\mathcal D}v({\varvec{x}},t)\right] \mathrm{d}{\varvec{x}}\mathrm{d}t, \end{aligned}$$

where \(\delta _{\mathcal D}^{(\frac{1}{2})}\beta (u)\) is defined by setting \(u^{(0)}={\mathcal I}_{\mathcal D}u_\mathrm{ini}\). Set \(f_E\in E\) such that, for all \(v\in E\), \(f_E\cdot v = \int _0^T\int _\Omega f({\varvec{x}},t) \Pi _{\mathcal D}v({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t\). A solution to (14) is an element \(u\in E\) such that \(T(u)=f_E\). The continuity and growth properties of \(\beta \), \(\zeta \) and \({\varvec{a}}\) clearly show that T is continuous \(E\mapsto E\), so we can prove that \(T(u)=f_E\) has a solution by establishing that, for R large enough, \(d(T,B(R),f_E)\not =0\) where d is the Brouwer topological degree [15] and B(R) is the open ball of radius R in E.

Following the reasoning used to prove (42), the coercivity property (2f) on \({\varvec{a}}\) and the equivalence of all norms on E give \(C_{2}\) and \(C_{3}\) not depending on \(u\in E\) such that

$$\begin{aligned} T(u)\cdot \zeta (u) \ge \underline{a}||\nabla _{\mathcal D}\zeta (u)||_{L^p(\Omega )^d}^p - ||B(\Pi _{\mathcal D}\beta ({\mathcal I}_{\mathcal D}u_\mathrm{ini}))||_{L^1(\Omega )} \ge C_{2} |u|^p - C_{3}. \end{aligned}$$

From the choice of the dot product on E and Assumption (2b) on \(\zeta \), we have \(|\zeta (v)|\le L_\zeta |v|\) and \(\zeta (v)\cdot v\ge C_{4}|v|^2 - C_{5}\), with \(C_{4}>0\) and \(C_{5}\) not depending on \(v\in E\). Let us consider the homotopy \(h(\rho ,u)=\rho T(u)+(1-\rho )u\) between T and \(\mathrm{Id}\), and assume that u is a solution to \(h(\rho ,u)=f_E\) for some \(\rho \in [0, 1]\). If \(|u|\ge 1\), we have

$$\begin{aligned} |f_E|L_\zeta |u|\ge & {} f_E\cdot \zeta (u) = \rho T(u)\cdot \zeta (u) + (1-\rho ) u\cdot \zeta (u)\\\ge & {} \rho C_{2}|u|^p - \rho C_{3} + (1-\rho )C_{4}|u|^2 - (1-\rho )C_{5}\\\ge & {} \min (C_{2},C_{4})|u|^{\min (p,2)} - C_{3}-C_{5}. \end{aligned}$$

Hence, if we select \(R>1\) such that \(|f_E|L_\zeta R< \min (C_{2},C_{4})R^{\min (p,2)}- C_{3}-C_{5}\), which is possible since \(\min (p,2)>1\), no solution to \(h(\rho ,u)=f_E\) can lie on \(\partial B(R)\). The invariance by homotopy of the topological degree then gives \(d(T,B(R),f_E)=d(\mathrm{Id},B(R),f_E)\), and this last degree is equal to 1 if we select R such that \(f_E\in B(R)\). The proof is complete. \(\square \)

Lemma 4.3

(Estimate on the dual semi-norm of the discrete time derivative) Under Assumptions (2), let \({\mathcal D}\) be a space-time gradient discretisation in the sense of Definition 2.1. Let u be a solution to Scheme (14). Then there exists \(C_{6}\) only depending on p, \(L_\beta \), \(C_P\ge C_{{\mathcal D}}\), \(C_\mathrm{ini}\ge \Vert \Pi _{\mathcal D}{\mathcal I}_{\mathcal D}u_\mathrm{ini}\Vert _{L^2(\Omega )}\), f, \(\underline{a}\), \(\mu \), \(\overline{a}\), T and the constants \(K_0\), \(K_1\) and \(K_2\) in (26) such that

$$\begin{aligned} \int _0^T \vert \delta _{{\mathcal D}} \beta (u)(t)\vert _{\star ,{\mathcal D}}^{p'}\mathrm{d}t\le C_{6}. \end{aligned}$$
(45)

Proof

Let us take a generic \(v=(v^{(n)})_{n=1,\ldots ,N}\subset X_{{\mathcal D},0}\) as a test function in Scheme (14). We have, thanks to Assumption (2h) on \({\varvec{a}}\),

$$\begin{aligned}&\int _0^T \int _\Omega \Pi _{\mathcal D}\delta _{{\mathcal D}}\beta (u)({\varvec{x}},t) \Pi _{\mathcal D}v({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\\&\quad \le \int _0^T \int _\Omega (\overline{a}({\varvec{x}}) + \mu |\nabla _{\mathcal D}\zeta (u)({\varvec{x}},t)|^{p-1}) \vert \nabla _{\mathcal D}v({\varvec{x}},t)\vert \mathrm{d}{\varvec{x}}\mathrm{d}t \\&\qquad + \int _0^T\int _\Omega f({\varvec{x}},t) \Pi _{\mathcal D}v({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$

Using Hölder’s inequality, Definition 2.5 and estimates (43), this leads to the existence of \(C_{7}>0\) only depending on p, \(L_\beta \), \(C_P\), \(C_\mathrm{ini}\), f, \(\underline{a}\), \(\overline{a}\), \(\mu \) and \(K_0\), \(K_1\) and \(K_2\) such that

$$\begin{aligned} \int _0^T \int _\Omega \Pi _{\mathcal D}\delta _{{\mathcal D}} \beta (u)({\varvec{x}},t) \Pi _{\mathcal D}v({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t \le C_{7}\Vert \nabla _{\mathcal D}v\Vert _{L^p(0,T;L^p(\Omega ))^d}. \end{aligned}$$

The proof of (45) is completed by selecting \(v=(|\delta _{\mathcal D}^{(n+{\frac{1}{2}})} \beta (u)|_{\star ,{\mathcal D}}^{p'-1}z^{(n)})_{n=1,\ldots ,N}\) with \((z^{(n)})_{n=1,\ldots ,N}\subset X_{{\mathcal D},0}\) such that, for any \(n=1,\ldots ,N\), \(z^{(n)}\) realises the supremum in (11) with \(w=\delta _{\mathcal D}^{(n+{\frac{1}{2}})}\beta (u)\). \(\square \)
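
To see how this choice yields (45), here is a sketch under the assumption that the supremum in (11) is taken over elements of \(X_{{\mathcal D},0}\) whose discrete gradient has unit norm in \(L^p(\Omega )^d\), so that each \(z^{(n)}\) has this normalisation. We denote by \(z\) the piecewise-constant-in-time function built from the \(z^{(n)}\) and set \(I=\int _0^T \vert \delta _{{\mathcal D}} \beta (u)(t)\vert _{\star ,{\mathcal D}}^{p'}\mathrm{d}t\) (both notations are used only here). Since \((p'-1)p=p'\), we have

$$\begin{aligned} \int _0^T \int _\Omega \Pi _{\mathcal D}\delta _{{\mathcal D}}\beta (u)({\varvec{x}},t) \Pi _{\mathcal D}v({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t&=\int _0^T \vert \delta _{{\mathcal D}} \beta (u)(t)\vert _{\star ,{\mathcal D}}^{p'-1}\,\vert \delta _{{\mathcal D}} \beta (u)(t)\vert _{\star ,{\mathcal D}}\mathrm{d}t=I,\\ \Vert \nabla _{\mathcal D}v\Vert _{L^p(0,T;L^p(\Omega ))^d}^p&=\int _0^T \vert \delta _{{\mathcal D}} \beta (u)(t)\vert _{\star ,{\mathcal D}}^{(p'-1)p}\mathrm{d}t=I. \end{aligned}$$

The previous inequality then reads \(I\le C_{7}I^{1/p}\), that is \(I\le C_{7}^{p'}\), so that (45) holds with, for instance, \(C_{6}=C_{7}^{p'}\).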

Lemma 4.4

(Estimate on the time translates of \(\nu (u)\)) Under Assumptions (2), let \({\mathcal D}\) be a space-time gradient discretisation in the sense of Definition 2.1. Let u be a solution to Scheme (14). Then there exists \(C_{8}\) only depending on p, \(L_\beta \), \(L_\zeta \), \(C_P\ge C_{{\mathcal D}}\), \(C_\mathrm{ini}\ge \Vert \Pi _{\mathcal D}{\mathcal I}_{\mathcal D}u_\mathrm{ini}\Vert _{L^2(\Omega )}\), f, \(\underline{a}\), \(\mu \), \(\overline{a}\), T and \(K_0\), \(K_1\) and \(K_2\) in (26) such that

$$\begin{aligned} \Vert \Pi _{\mathcal D}\nu (u)(\cdot ,\cdot +\tau )-\Pi _{\mathcal D}\nu (u)(\cdot ,\cdot ) \Vert _{L^2(\Omega \times (0,T-\tau ))}^2 \le C_{8}(\tau + {\delta t}), \quad \forall \tau \in (0,T).\quad \quad \end{aligned}$$
(46)

Proof

Let \(\tau \in (0,T)\). Thanks to (24), we can write

$$\begin{aligned} \int _{\Omega \times (0,T-\tau )} \Bigl (\Pi _{\mathcal D}\nu (u)({\varvec{x}},t+\tau )-\Pi _{\mathcal D}\nu (u)({\varvec{x}},t)\Bigr )^2 \mathrm{d}{\varvec{x}}\mathrm{d}t \le L_\beta L_\zeta \int _0^{T-\tau } A(t) \mathrm{d}t, \end{aligned}$$
(47)

where

$$\begin{aligned} A(t) = \int _\Omega \Bigl (\Pi _{\mathcal D}\zeta (u)({\varvec{x}},t+\tau )-\Pi _{\mathcal D}\zeta (u)({\varvec{x}},t)\Bigr ) \Bigl (\Pi _{\mathcal D}\beta (u)({\varvec{x}},t+\tau )-\Pi _{\mathcal D}\beta (u)({\varvec{x}},t)\Bigr ) \mathrm{d}{\varvec{x}}. \end{aligned}$$

For \(s\in (0,T)\), we define \(n(s) \in \{0,\ldots ,N-1\}\) such that \(t^{(n(s))} < s \le t^{(n(s)+1)}\). Taking \(t\in (0,T-\tau )\), we may write

$$\begin{aligned} A(t) = \int _\Omega \Bigl (\Pi _{\mathcal D}\zeta (u^{(n(t+\tau )+1)})({\varvec{x}})-\Pi _{\mathcal D}\zeta (u^{(n(t)+1)})({\varvec{x}})\Bigr )\Bigl (\sum _{n = n(t)+1}^{n(t+\tau )}{\delta t}^{(n+{\frac{1}{2}})}\Pi _{\mathcal D}\delta _{{\mathcal D}}^{(n+{\frac{1}{2}})} \beta (u)({\varvec{x}})\Bigr ) \mathrm{d}{\varvec{x}}. \end{aligned}$$

We then use the definition (11) of the discrete dual semi-norm to infer

$$\begin{aligned} A(t) \!\le \! \sum _{n = n(t)+1}^{n(t+\tau )}{\delta t}^{(n+{\frac{1}{2}})} \left| \left| \nabla _{\mathcal D}\left[ \zeta (u^{(n(t+\tau )+1)})\!-\!\zeta (u^{(n(t)+1)})\right] \right| \right| _{L^p(\Omega )^d} |\delta _{{\mathcal D}}^{(n+{\frac{1}{2}})} \beta (u)|_{\star ,{\mathcal D}}. \end{aligned}$$
(48)

We apply the triangle inequality to the first norm in this right-hand side, then Young’s inequality, and integrate over \(t\in (0,T-\tau )\) to get

$$\begin{aligned} \int _0^{T-\tau }A(t)\mathrm{d}t\le \mathcal {A}_\tau + \mathcal {A}_0+\mathcal {B} \end{aligned}$$
(49)

with, for \(s=0\) or \(s=\tau \),

$$\begin{aligned} \mathcal {A}_s=\frac{1}{p}\int _0^{T-\tau } \sum _{n = n(t)+1}^{n(t+\tau )}{\delta t}^{(n+{\frac{1}{2}})} ||\nabla _{\mathcal D}\zeta (u^{(n(t+s)+1)})||_{L^p(\Omega )^d}^p\mathrm{d}t\le \frac{C_{1}^p}{p}(\tau +{\delta t}) \end{aligned}$$
(50)

and

$$\begin{aligned} \mathcal {B}= \frac{2}{p'}\int _0^{T-\tau } \sum _{n = n(t)+1}^{n(t+\tau )}{\delta t}^{(n+{\frac{1}{2}})} |\delta _{{\mathcal D}}^{(n+{\frac{1}{2}})} \beta (u)|_{\star ,{\mathcal D}}^{p'}\mathrm{d}t\le \frac{2C_{6}}{p'}\tau . \end{aligned}$$
(51)

In (50), the quantity \(\mathcal {A}_s\) has been estimated by using (84) in Lemma 6.6 and the estimate on \(\nabla _{\mathcal D}\zeta (u)\) in (43). In (51), \(\mathcal {B}\) has been estimated by applying (83) in Lemma 6.6 and by using the bound (45) on \(\int _0^T \vert \delta _{{\mathcal D}} \beta (u)(t)\vert _{\star ,{\mathcal D}}^{p'}\mathrm{d}t\). The proof is completed by gathering (47), (49), (50) and (51). \(\square \)
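
The steps leading from (48) to (46) can be spelled out as follows; this is a sketch that also produces one admissible value of \(C_{8}\). For \(s=0\) or \(s=\tau \), Young's inequality gives

$$\begin{aligned} ||\nabla _{\mathcal D}\zeta (u^{(n(t+s)+1)})||_{L^p(\Omega )^d}\,|\delta _{{\mathcal D}}^{(n+{\frac{1}{2}})} \beta (u)|_{\star ,{\mathcal D}}\le \frac{1}{p}||\nabla _{\mathcal D}\zeta (u^{(n(t+s)+1)})||_{L^p(\Omega )^d}^p+\frac{1}{p'}|\delta _{{\mathcal D}}^{(n+{\frac{1}{2}})} \beta (u)|_{\star ,{\mathcal D}}^{p'}, \end{aligned}$$

and the two terms coming from the triangle inequality explain the factor 2 in \(\mathcal {B}\). Then, by (49)–(51) and since \(\tau \le \tau +{\delta t}\),

$$\begin{aligned} \int _0^{T-\tau }A(t)\mathrm{d}t\le \frac{2C_{1}^p}{p}(\tau +{\delta t})+\frac{2C_{6}}{p'}\tau \le 2\left( \frac{C_{1}^p}{p}+\frac{C_{6}}{p'}\right) (\tau +{\delta t}), \end{aligned}$$

so that (47) shows that (46) holds with, for instance, \(C_{8}=2L_\beta L_\zeta \left( \frac{C_{1}^p}{p}+\frac{C_{6}}{p'}\right) \).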

4.2 Proof of Theorem 2.12

Step 1: application of compactness results.

Thanks to Theorem 3.1 and Estimates (43) and (45), we first extract a subsequence such that \((\Pi _{{\mathcal D}_m}\beta (u_m))_{m\in \mathbb N}\) converges weakly in \(L^2(\Omega )\) uniformly on [0, T] (in the sense of Definition 2.11) to some function \(\overline{\beta }\in C([0,T];L^2(\Omega ){-}\text{ w })\) which satisfies \(\overline{\beta }(\cdot ,0) = \beta (u_\mathrm{ini})\) in \(L^2(\Omega )\). Using again Estimates (43) and applying Lemma 2.10, we extract a further subsequence such that, for some \(\overline{\zeta }\in L^p(0,T;W^{1,p}_0(\Omega ))\), \(\Pi _{{\mathcal D}_m}\zeta (u_m)\rightarrow \overline{\zeta }\) weakly in \(L^p(\Omega \times (0,T))\) and \(\nabla _{{\mathcal D}_m}\zeta (u_m) \rightarrow \nabla \overline{\zeta }\) weakly in \(L^p(\Omega \times (0,T))^d\). Estimates (43), Definition 2.5 and the growth assumption (2b) on \(\zeta \) show that \((\Pi _{{\mathcal D}_m}u_m)_{m\in \mathbb N}\) is bounded in \(L^p(\Omega \times (0,T))\) and we can therefore assume, up to a subsequence, that it converges weakly to some \({\overline{u}}\) in this space.

We then prove, by means of the Kolmogorov theorem, that \((\Pi _{{\mathcal D}_m}\nu (u_m))_{m\in \mathbb N}\) is relatively compact in \(L^1(\Omega \times (0,T))\). We first remark that \(|\nu (a) - \nu (b)|\le L_\beta |\zeta (a) - \zeta (b)|\), which implies, using Estimate (43) and Definition 2.9 with \(v=\zeta (u_m)\),

$$\begin{aligned} ||\Pi _{{\mathcal D}_m} \nu (u_m)(\cdot +{\varvec{\xi }},\cdot )-\Pi _{{\mathcal D}_m} \nu (u_m)(\cdot ,\cdot )||_{L^p(\mathbb R^d\times (0,T))} \le L_\beta C_{1}T_{{\mathcal D}_m}({\varvec{\xi }}) \end{aligned}$$
(52)

where \(\Pi _{{\mathcal D}_m} \nu (u_m)\) has been extended by 0 outside \(\Omega \), and \(\lim _{{\varvec{\xi }}\rightarrow 0}\sup _{m\in \mathbb N}T_{{\mathcal D}_m}({\varvec{\xi }})=0\). This takes care of the space translates. Let us now turn to the time translates. Invoking Lemma 4.4 and, to control the time translates at both ends of [0, T], the fact that \(\Pi _{{\mathcal D}_m}\beta (u_m)\) – and therefore also \(\Pi _{{\mathcal D}_m} \nu (u_m)\) since \(|\nu |\le L_{\zeta }|\beta |\) – remains bounded in \(L^\infty (0,T; L^2(\Omega ))\), we can write for any \(M\in \mathbb N\)

$$\begin{aligned}&\sup _{m\in \mathbb N}||\Pi _{{\mathcal D}_m}\nu (u_m)(\cdot ,\cdot +\tau )-\Pi _{{\mathcal D}_m}\nu (u_m)(\cdot ,\cdot )||_{L^2(\Omega \times (0,T))}^2\nonumber \\&\quad \le \max \left( \max _{m\le M}||\Pi _{{\mathcal D}_m}\nu (u_m)(\cdot ,\cdot +\tau )-\Pi _{{\mathcal D}_m}\nu (u_m)(\cdot ,\cdot )||_{L^2(\Omega \times (0,T))}^2;\right. \nonumber \\&\qquad \left. C_{9}\left( \tau +\sup _{m>M}{\delta t}_m\right) \right) , \end{aligned}$$
(53)

where \(C_{9}\) does not depend on m or \(\tau \), and the functions have been extended by 0 outside (0, T). Since each \(||\Pi _{{\mathcal D}_m}\nu (u_m)(\cdot ,\cdot +\tau )-\Pi _{{\mathcal D}_m}\nu (u_m)||_{L^2(\Omega \times (0,T))}^2\) tends to 0 as \(\tau \rightarrow 0\) and since \({\delta t}_m\rightarrow 0\) as \(m\rightarrow \infty \), taking in that order the limsup as \(\tau \rightarrow 0\) and the limit as \(M\rightarrow \infty \) of (53) shows that the left-hand side of this inequality tends to 0 as \(\tau \rightarrow 0\), as required. Hence, Kolmogorov’s theorem shows that, up to extraction of another subsequence, \(\Pi _{{\mathcal D}_m}\nu (u_m)\rightarrow \overline{\nu }\) in \(L^1(\Omega \times (0,T))\).
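
The order of the two limits can be spelled out as follows; this is only a sketch, and the shorthand \(S_m(\tau )\) is introduced here for illustration. Write \(S_m(\tau )=||\Pi _{{\mathcal D}_m}\nu (u_m)(\cdot ,\cdot +\tau )-\Pi _{{\mathcal D}_m}\nu (u_m)||_{L^2(\Omega \times (0,T))}^2\). For a fixed M, the maximum over the finitely many indices \(m\le M\) tends to 0 with \(\tau \), so that (53) yields

$$\begin{aligned} \limsup _{\tau \rightarrow 0}\,\sup _{m\in \mathbb N}S_m(\tau ) \le \max \left( \limsup _{\tau \rightarrow 0}\,\max _{m\le M}S_m(\tau )\,;\,C_{9}\sup _{m>M}{\delta t}_m\right) = C_{9}\sup _{m>M}{\delta t}_m. \end{aligned}$$

The left-hand side does not depend on M, and \(\sup _{m>M}{\delta t}_m\rightarrow 0\) as \(M\rightarrow \infty \) because \({\delta t}_m\rightarrow 0\); hence this \(\limsup \) is equal to 0.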

Let us now identify these limits \(\overline{\beta }\), \(\overline{\zeta }\) and \(\overline{\nu }\). Under the first case in the structural hypothesis (17), we have \(\beta =\mathrm{Id}\), and therefore \(\overline{\beta }={\overline{u}}=\beta ({\overline{u}})\) and \(\nu =\zeta \). The strong convergence of \(\Pi _{{\mathcal D}_m}\nu (u_m)= \Pi _{{\mathcal D}_m}\zeta (u_m)\) to \(\overline{\nu }=\overline{\zeta }\) allows us to apply Lemma 3.5 to see that \(\overline{\zeta }=\zeta ({\overline{u}})\) and \(\overline{\nu }=\nu ({\overline{u}})\). Exchanging the roles of \(\beta \) and \(\zeta \), we see that \(\overline{\beta }=\beta ({\overline{u}})\), \(\overline{\zeta }=\zeta ({\overline{u}})\) and \(\overline{\nu }=\nu ({\overline{u}})\) still hold in the second case of (17). We notice that this is the only place where we use this structural assumption (17) on \(\beta ,\zeta \).

Using the growth assumption (2h) on \({\varvec{a}}\) and Estimates (43), upon extraction of another subsequence we can also assume that \({\varvec{a}}\left( \cdot , \Pi _{{\mathcal D}_m} \nu (u_m),\nabla _{{\mathcal D}_m} \zeta (u_m)\right) \) has a weak limit in \(L^{p'}(\Omega \times (0,T))^d\), which we denote by \({\varvec{A}}\).

Finally, for any \(T_0\in [0,T]\), since \(\Pi _{{\mathcal D}_m}\beta (u_m(\cdot ,T_0))\rightarrow \beta ({\overline{u}})(\cdot ,T_0)\) weakly in \(L^2(\Omega )\), Lemma 3.4 gives

$$\begin{aligned} \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0))\mathrm{d}{\varvec{x}}\le \liminf _{m\rightarrow \infty }\int _\Omega B(\beta (\Pi _{{\mathcal D}_m} u_m)({\varvec{x}},T_0))\mathrm{d}{\varvec{x}}. \end{aligned}$$
(54)

With (43), this shows that \(B(\beta ({\overline{u}}))\in L^\infty (0,T;L^1(\Omega ))\).

Step 2 Passing to the limit in the scheme.

We drop the indices m for legibility reasons. Let \(\varphi \in C^1_c(-\infty , T)\) and let \(w\in W^{1,p}_0(\Omega )\cap L^2(\Omega )\). We introduce \(v=(\varphi (t^{(n-1)})P_{{\mathcal D}} w)_{n=1,\ldots ,N}\) as a test function in (14), with \(P_{{\mathcal D}}\) defined by (21). We get \(T_1^{(m)} + T_2^{(m)} = T_3^{(m)}\) with

$$\begin{aligned} T_1^{(m)} \!= & {} \!\sum _{n=0}^{N-1} \varphi (t^{(n)}){\delta t}^{(n+{\frac{1}{2}})}\int _\Omega \Pi _{\mathcal D}\delta _{{\mathcal D}}^{(n+{\frac{1}{2}})} \beta (u)({\varvec{x}}) \Pi _{\mathcal D}P_{{\mathcal D}} w({\varvec{x}})\mathrm{d}{\varvec{x}},\\ T_2^{(m)} \!= & {} \!\sum _{n=0}^{N-1} \varphi (t^{(n)}){\delta t}^{(n+{\frac{1}{2}})} \!\int _\Omega {\varvec{a}}\left( {\varvec{x}}, \Pi _{\mathcal D}\nu (u^{(n+1)}),\nabla _{\mathcal D}\zeta ( u^{(n+1)})({\varvec{x}})\!\right) \cdot \nabla _{\mathcal D}P_{{\mathcal D}} w({\varvec{x}})\mathrm{d}{\varvec{x}}, \end{aligned}$$

and

$$\begin{aligned} T_3^{(m)} = \sum _{n=0}^{N-1} \varphi (t^{(n)}) \int _{t^{(n)}}^{t^{(n+1)}}\int _\Omega f({\varvec{x}},t) \Pi _{\mathcal D}P_{{\mathcal D}}w({\varvec{x}}) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$

Using a discrete integration by parts to transform the terms \(\varphi (t^{(n)}) (\Pi _{\mathcal D}\beta (u^{(n+1)})-\Pi _{\mathcal D}\beta (u^{(n)}))\) appearing in \(T_1^{(m)}\) into \((\varphi (t^{(n)})-\varphi (t^{(n+1)}))\Pi _{\mathcal D}\beta (u^{(n+1)})\), we have

$$\begin{aligned} T_1^{(m)}= & {} \displaystyle - \int _0^T\varphi '(t) \int _\Omega \Pi _{\mathcal D}\beta (u)({\varvec{x}},t) \Pi _{\mathcal D}P_{{\mathcal D}} w({\varvec{x}})\mathrm{d}{\varvec{x}}\mathrm{d}t\\&-\varphi (0)\int _\Omega \Pi _{\mathcal D}\beta (u^{(0)})({\varvec{x}}) \Pi _{\mathcal D}P_{{\mathcal D}}w({\varvec{x}})\mathrm{d}{\varvec{x}}. \end{aligned}$$
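
The summation by parts behind this formula can be sketched as follows, with the shorthand \(a_n=\varphi (t^{(n)})\) and \(b_n=\Pi _{\mathcal D}\beta (u^{(n)})\) used only here. We assume that \(t^{(0)}=0\) and \(t^{(N)}=T\), so that \(a_N=\varphi (T)=0\) since \(\varphi \) has a compact support in \((-\infty ,T)\), and that \(\Pi _{\mathcal D}\beta (u)(\cdot ,t)=\Pi _{\mathcal D}\beta (u^{(n+1)})\) for \(t\in (t^{(n)},t^{(n+1)}]\) (the same convention as in the rewriting of \(T_2^{(m)}\) below). Then

$$\begin{aligned} \sum _{n=0}^{N-1}a_n(b_{n+1}-b_n)= & {} \sum _{n=0}^{N-1}(a_n-a_{n+1})b_{n+1}+a_Nb_N-a_0b_0\\= & {} -\sum _{n=0}^{N-1}\left( \int _{t^{(n)}}^{t^{(n+1)}}\varphi '(t)\mathrm{d}t\right) b_{n+1}-\varphi (0)b_0. \end{aligned}$$

Multiplying by \(\Pi _{\mathcal D}P_{{\mathcal D}}w\) and integrating over \(\Omega \) gives the expression of \(T_1^{(m)}\) above.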

Setting \(\varphi _{\mathcal D}(t) = \varphi (t^{(n)})\) for \(t\in (t^{(n)},t^{(n+1)})\), we have

$$\begin{aligned}\begin{array}{llll}\displaystyle T_2^{(m)} = \int _0^T\varphi _{\mathcal D}(t) \int _\Omega {\varvec{a}}\left( {\varvec{x}}, \Pi _{\mathcal D}\nu (u)({\varvec{x}},t),\nabla _{\mathcal D}\zeta (u)({\varvec{x}},t)\right) \cdot \nabla _{\mathcal D}P_{{\mathcal D}}w({\varvec{x}})\mathrm{d}{\varvec{x}}\mathrm{d}t \\ \displaystyle T_3^{(m)} = \int _0^T\varphi _{\mathcal D}(t)\int _\Omega f({\varvec{x}},t) \Pi _{\mathcal D}P_{{\mathcal D}}w({\varvec{x}}) \mathrm{d}{\varvec{x}}\mathrm{d}t.\end{array} \end{aligned}$$

Since \(\varphi _{\mathcal D}\rightarrow \varphi \) uniformly on [0, T], \(\Pi _{\mathcal D}P_{\mathcal D}w\rightarrow w\) in \(L^p(\Omega )\cap L^2(\Omega )\) and \(\nabla _{\mathcal D}P_{\mathcal D}w\rightarrow \nabla w\) in \(L^p(\Omega )^d\), we may let \(m\rightarrow \infty \) in \(T_1^{(m)} + T_2^{(m)} = T_3^{(m)}\) to see that \({\overline{u}}\) satisfies

$$\begin{aligned} \left\{ \!\begin{array}{llll} {\overline{u}}\in L^p(\Omega \times (0,T)),\;\zeta ({\overline{u}})\in L^p(0,T;W^{1,p}_0(\Omega )),\; B(\beta ({\overline{u}}))\!\in \! L^\infty (0,T;L^1(\Omega )),\\ \beta ({\overline{u}})\in C([0,T];L^2(\Omega ){-}\text{ w }),\; \beta ({\overline{u}})(\cdot ,0)=\beta (u_\mathrm{ini}),\\ \displaystyle - \int _0^T \varphi '(t)\int _\Omega \beta ({\overline{u}}({\varvec{x}},t)) w({\varvec{x}})\mathrm{d}{\varvec{x}}\mathrm{d}t -\varphi (0)\int _\Omega \beta (u_\mathrm{ini}({\varvec{x}}))w({\varvec{x}})\mathrm{d}{\varvec{x}}\\ \displaystyle +\int _0^T \varphi (t)\int _\Omega {\varvec{A}}({\varvec{x}},t)\cdot \nabla w({\varvec{x}}) \mathrm{d}{\varvec{x}}\mathrm{d}t = \int _0^T \varphi (t)\int _\Omega f({\varvec{x}},t) w({\varvec{x}}) \mathrm{d}{\varvec{x}}\mathrm{d}t,\\ \qquad \forall w \in W^{1,p}_0(\Omega )\cap L^2(\Omega ),\ \forall \varphi \in C^\infty _c(-\infty ,T). \end{array}\right. \qquad \end{aligned}$$
(55)

Note that the regularity properties on \({\overline{u}}\), \(\zeta ({\overline{u}})\), \(\beta ({\overline{u}})\) and \(B(\beta ({\overline{u}}))\) have been established in Step 1. Linear combinations of this relation show that (55) also holds with \(\varphi (t)w({\varvec{x}})\) replaced by tensorial functions in \(C^\infty _c(\Omega \times (0,T))\). This proves that \(\partial _t\beta ({\overline{u}})\in L^{p'}(0,T;W^{-1,p'}(\Omega ))\) (see Remark 1.1). Using the density of tensorial functions in \(L^p(0,T;W^{1,p}_0(\Omega ))\) [18], we then see that \({\overline{u}}\) satisfies

$$\begin{aligned} \begin{array}{llll} &{}&{}\displaystyle \int _0^T \langle \partial _t\beta ({\overline{u}})(\cdot ,t), {\overline{v}}(\cdot ,t)\rangle _{W^{-1,p'},W^{1,p}_0}\mathrm{d}t \\ &{}&{}\qquad \displaystyle + \int _0^T \int _\Omega {\varvec{A}}({\varvec{x}},t)\cdot \nabla {\overline{v}}({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t \\ &{}&{}\quad = \displaystyle \int _0^T \int _\Omega f({\varvec{x}},t) {\overline{v}}({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t,\quad \forall {\overline{v}}\in L^p(0,T;W^{1,p}_0(\Omega )). \end{array}\end{aligned}$$
(56)

Step 3 Proof that \( {\overline{u}}\) is a solution to (4).

It only remains to show that

$$\begin{aligned} {\varvec{A}}({\varvec{x}},t) = {\varvec{a}}({\varvec{x}},\nu ({\overline{u}})({\varvec{x}},t),\nabla \zeta ({\overline{u}})({\varvec{x}},t))\hbox { for a.e. }({\varvec{x}},t)\in \Omega \times (0,T). \end{aligned}$$
(57)

We take \(T_0\in [0,T]\), write (42) with \({\mathcal D}={\mathcal D}_m\) and take the \(\limsup \) as \(m\rightarrow \infty \). We notice that the \(t^{(k)}=:T_m\) from Lemma 4.1 converges to \(T_0\) as \(m\rightarrow \infty \). Hence, by using the convergence \(\Pi _{{\mathcal D}_m} {\mathcal I}_{{\mathcal D}_m}u_\mathrm{ini}\rightarrow u_\mathrm{ini}\) in \(L^2(\Omega )\) [consistency of \(({\mathcal D}_m)_{m\in \mathbb N}\)], and the continuity and quadratic growth of \(B\circ \beta \) [upper bound in (26)], we obtain

$$\begin{aligned}&\limsup _{m\rightarrow \infty }\int _0^{T_0}\int _\Omega {\varvec{a}}({\varvec{x}},\Pi _{{\mathcal D}_m}\nu (u_m)({\varvec{x}},t),\nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t))\cdot \nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\quad \le \int _\Omega B( \beta (u_\mathrm{ini})({\varvec{x}}))\mathrm{d}{\varvec{x}}+\int _0^{T_0}\int _\Omega f({\varvec{x}},t) \zeta ({\overline{u}})({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\qquad -\liminf _{m\rightarrow \infty }\int _\Omega B(\beta (\Pi _{{\mathcal D}_m} u_m)({\varvec{x}},T_0))\mathrm{d}{\varvec{x}}. \end{aligned}$$
(58)

We take \(\overline{v} = \zeta ({\overline{u}})\mathbf {1}_{[0,T_0]}\) in (56) and apply Lemma 3.6 to get

$$\begin{aligned}&\displaystyle \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0))\mathrm{d}{\varvec{x}}- \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},0))\mathrm{d}{\varvec{x}}\\&\quad \displaystyle + \int _0^{T_0} \int _\Omega {\varvec{A}}({\varvec{x}},t)\cdot \nabla \zeta ({\overline{u}})({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t = \int _0^{T_0} \int _\Omega f({\varvec{x}},t) \zeta ({\overline{u}})({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$

This relation, combined with (58) and using (54), shows that

$$\begin{aligned}&\limsup _{m\rightarrow \infty }\int _0^{T_0}\int _\Omega {\varvec{a}}({\varvec{x}},\Pi _{{\mathcal D}_m}\nu (u_m)({\varvec{x}},t),\nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t))\cdot \nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\quad \le \int _0^{T_0} \int _\Omega {\varvec{A}}({\varvec{x}},t)\cdot \nabla \zeta ({\overline{u}})({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(59)

It is now possible to apply Minty’s trick. Consider for \({\varvec{G}}\in L^p(\Omega \times (0,T))^d\) the following relation, stemming from the monotony (2g) of \({\varvec{a}}\):

$$\begin{aligned}&\int _0^{T_0} \int _\Omega \left[ {\varvec{a}}(\cdot ,\Pi _{{\mathcal D}_m} \nu (u_m),\nabla _{{\mathcal D}_m} \zeta (u_m)) - {\varvec{a}}(\cdot ,\Pi _{{\mathcal D}_m} \nu (u_m),{\varvec{G}})\right] \nonumber \\&\quad \cdot \left[ \nabla _{{\mathcal D}_m} \zeta (u_m)-{\varvec{G}}\right] \mathrm{d}{\varvec{x}}\mathrm{d}t \ge 0. \end{aligned}$$
(60)

By strong convergence of \(\Pi _{{\mathcal D}_m}\nu (u_m)\) to \(\nu ({\overline{u}})\) in \(L^1(\Omega \times (0,T))\) and Assumptions (2e), (2h) on \({\varvec{a}}\), we see that \({\varvec{a}}(\cdot ,\Pi _{{\mathcal D}_m}\nu (u_m),{\varvec{G}})\rightarrow {\varvec{a}}(\cdot ,\nu ({\overline{u}}),{\varvec{G}})\) strongly in \(L^{p'}(\Omega \times (0,T))^d\). The development of (60) gives a sum of four terms, the first one being the integral in the left-hand side of (59) and the other three being integrals of products of weakly and strongly converging sequences. We can thus take the \(\limsup \) of (60) with \(T_0=T\) to find

$$\begin{aligned} \int _0^{T} \int _\Omega \left[ {\varvec{A}}({\varvec{x}},t) - {\varvec{a}}({\varvec{x}},\nu ({\overline{u}})({\varvec{x}},t),{\varvec{G}}({\varvec{x}},t))\right] \cdot \left[ \nabla \zeta ({\overline{u}})({\varvec{x}},t)-{\varvec{G}}({\varvec{x}},t)\right] \mathrm{d}{\varvec{x}}\mathrm{d}t \ge 0. \end{aligned}$$

Application of Minty’s method [47] (i.e. taking \({\varvec{G}}=\nabla \zeta ({\overline{u}})+r{\varvec{\varphi }}\) for \({\varvec{\varphi }}\in L^p(\Omega \times (0,T))^d\) and letting \(r\rightarrow 0\)) then shows that (57) holds and concludes the proof that \({\overline{u}}\) satisfies (4).
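
For the reader's convenience, here is a sketch of this last step. It assumes, as is standard for Leray–Lions operators, that \({\varvec{a}}({\varvec{x}},s,\cdot )\) is continuous, so that the growth assumption (2h) allows the use of the dominated convergence theorem. Taking \({\varvec{G}}=\nabla \zeta ({\overline{u}})+r{\varvec{\varphi }}\) with \(r>0\) in the previous inequality and dividing by r gives

$$\begin{aligned} \int _0^{T} \int _\Omega \left[ {\varvec{A}}({\varvec{x}},t) - {\varvec{a}}({\varvec{x}},\nu ({\overline{u}})({\varvec{x}},t),\nabla \zeta ({\overline{u}})({\varvec{x}},t)+r{\varvec{\varphi }}({\varvec{x}},t))\right] \cdot {\varvec{\varphi }}({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t \le 0. \end{aligned}$$

Letting \(r\rightarrow 0\) leads to

$$\begin{aligned} \int _0^{T} \int _\Omega \left[ {\varvec{A}}({\varvec{x}},t) - {\varvec{a}}({\varvec{x}},\nu ({\overline{u}})({\varvec{x}},t),\nabla \zeta ({\overline{u}})({\varvec{x}},t))\right] \cdot {\varvec{\varphi }}({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t \le 0. \end{aligned}$$

Applying this to \(-{\varvec{\varphi }}\) instead of \({\varvec{\varphi }}\) shows that the integral vanishes for all \({\varvec{\varphi }}\in L^p(\Omega \times (0,T))^d\), which is precisely (57).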

4.3 Proof of Theorem 2.16

Let \(T_0\in [0,T]\) and \((T_m)_{m\ge 1}\) be a sequence in [0, T] that converges to \(T_0\). By setting \(T_0=T_m\) and \({\varvec{G}}= \nabla \zeta ({\overline{u}})\) in the developed form of (60), by taking the inferior limit (thanks to the strong convergence of \({\varvec{a}}(\cdot ,\Pi _{{\mathcal D}_m}\nu (u_m),\nabla \zeta ({\overline{u}}))\)) and by using (57), we find

$$\begin{aligned}&\liminf _{m\rightarrow \infty }\int _0^{T_m}\int _\Omega {\varvec{a}}({\varvec{x}},\Pi _{{\mathcal D}_m}\nu (u_m)({\varvec{x}},t),\nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t))\cdot \nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\quad \ge \int _0^{T_0} \int _\Omega {\varvec{a}}({\varvec{x}},\nu ({\overline{u}})({\varvec{x}},t),\nabla \zeta ({\overline{u}})({\varvec{x}},t))\cdot \nabla \zeta ({\overline{u}})({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(61)

We then write (42) with \(T_m\) instead of \(T_0\) and we take the \(\limsup \) as \(m\rightarrow \infty \). We notice that the \(t^{(k)}\) such that \(T_m\in (t^{(k-1)},t^{(k)}]\) converges to \(T_0\) as \(m\rightarrow \infty \). Thanks to (61) and (39) we obtain

$$\begin{aligned} \limsup _{m\rightarrow \infty }\int _\Omega B({\beta }(\Pi _{{\mathcal D}_m} u_m({\varvec{x}},T_m))) \mathrm{d}{\varvec{x}}\le \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0)) \mathrm{d}{\varvec{x}}. \end{aligned}$$
(62)

By Lemma 6.4, the uniform-in-time weak convergence of \(\beta (\Pi _{{\mathcal D}_m}u_m)\) to \(\beta (\bar{u})\) and the continuity of \(\beta (\bar{u}):[0,T]\rightarrow L^2(\Omega ){-}\text{ w }\), we have \(\beta (\Pi _{{\mathcal D}_m}u_m)(T_m)\rightarrow \beta (\bar{u})(T_0)\) weakly in \(L^2(\Omega )\) as \(m\rightarrow \infty \). Therefore, for any \((s_m)_{m\in \mathbb N}\) converging to \(T_0\), \(\frac{1}{2}({\beta }(\Pi _{{\mathcal D}_m} u_m(T_m))+\beta ({\overline{u}})(s_m))\rightarrow \beta ({\overline{u}})(T_0)\) weakly in \(L^2(\Omega )\) as \(m\rightarrow \infty \) and Lemma 3.4 gives, by convexity of B,

$$\begin{aligned} \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0)) \mathrm{d}{\varvec{x}}\le \liminf _{m\rightarrow \infty }\int _\Omega B\left( \frac{{\beta }(\Pi _{{\mathcal D}_m} u_m({\varvec{x}},T_m))+\beta ({\overline{u}})({\varvec{x}},s_m)}{2}\right) \mathrm{d}{\varvec{x}}.\nonumber \\ \end{aligned}$$
(63)

Property (28) of B and the two inequalities (62) and (63) allow us to conclude the proof. Let \((s_m)_{m\in \mathbb N}\) be a sequence in \(\mathcal T\) (see proof of Corollary 3.8) that converges to \(T_0\). Then \(\nu ({\overline{u}}(\cdot ,s_m))\rightarrow \nu ({\overline{u}})(\cdot ,T_0)\) in \(L^2(\Omega )\) as \(m\rightarrow \infty \). Using (28), we get

$$\begin{aligned}&\Vert {\nu }(\Pi _{{\mathcal D}_m} u_m(\cdot ,T_m)) - \nu ({\overline{u}})(\cdot ,T_0)\Vert _{L^2(\Omega )}^2 \\&\quad \le 2\Vert {\nu }(\Pi _{{\mathcal D}_m} u_m(\cdot ,T_m)) - \nu ({\overline{u}}(\cdot ,s_m))\Vert _{L^2(\Omega )}^2\\&\quad +2\Vert {\nu }({\overline{u}}(\cdot ,s_m)) - \nu ({\overline{u}})(\cdot ,T_0)\Vert _{L^2(\Omega )}^2\\&\quad \le 8 L_\beta L_\zeta \int _\Omega \left[ B({\beta }(\Pi _{{\mathcal D}_m} u_m({\varvec{x}},T_m))) + B(\beta ({\overline{u}}({\varvec{x}},s_m)))\right] \mathrm{d}{\varvec{x}}\\&\qquad -\,16L_\beta L_\zeta \int _\Omega B\left( \frac{{\beta }(\Pi _{{\mathcal D}_m} u_m({\varvec{x}},T_m))+\beta ({\overline{u}}({\varvec{x}},s_m))}{2}\right) \mathrm{d}{\varvec{x}}\\&\qquad +\,2\Vert {\nu }({\overline{u}}(\cdot ,s_m)) - \nu ({\overline{u}})(\cdot ,T_0)\Vert _{L^2(\Omega )}^2. \end{aligned}$$

We then take the \(\limsup \) as \(m\rightarrow \infty \) of this expression. Thanks to (62) and the continuity of \(t\in [0,T]\mapsto \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},t))\mathrm{d}{\varvec{x}}\in [0,\infty )\) (see Corollary 3.8), the first term in the right-hand side has a finite \(\limsup \), bounded above by \(16L_\beta L_\zeta \int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0))\mathrm{d}{\varvec{x}}\). We can therefore split the \(\limsup \) of this right-hand side without risking writing \(\infty -\infty \) and we get, thanks to (63),

$$\begin{aligned} \limsup _{m\rightarrow \infty }\Vert {\nu }(\Pi _{{\mathcal D}_m} u_m(\cdot ,T_m)) - \nu ({\overline{u}})(\cdot ,T_0)\Vert _{L^2(\Omega )}^2 \le 0. \end{aligned}$$
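
The bookkeeping behind this bound can be sketched as follows, with shorthand introduced only for this purpose: \(I_m=\int _\Omega B({\beta }(\Pi _{{\mathcal D}_m} u_m({\varvec{x}},T_m)))\mathrm{d}{\varvec{x}}\), \(J_m=\int _\Omega B(\beta ({\overline{u}}({\varvec{x}},s_m)))\mathrm{d}{\varvec{x}}\), \(K_m\) the integral in the right-hand side of (63), and \(I=\int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0))\mathrm{d}{\varvec{x}}\). We have \(\limsup _{m\rightarrow \infty }I_m\le I\) by (62), \(J_m\rightarrow I\) by the continuity stated in Corollary 3.8, \(\liminf _{m\rightarrow \infty }K_m\ge I\) by (63), and the last term in the right-hand side tends to 0 by the choice of \((s_m)_{m\in \mathbb N}\). Since I is finite,

$$\begin{aligned} \limsup _{m\rightarrow \infty }\left( 8L_\beta L_\zeta (I_m+J_m)-16L_\beta L_\zeta K_m\right)\le & {} 8L_\beta L_\zeta \left( \limsup _{m\rightarrow \infty }I_m+\lim _{m\rightarrow \infty }J_m\right) -16L_\beta L_\zeta \liminf _{m\rightarrow \infty }K_m\\\le & {} 16L_\beta L_\zeta I-16L_\beta L_\zeta I=0. \end{aligned}$$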

Thus, \(\nu (\Pi _{{\mathcal D}_m}u_m(\cdot ,T_m))\rightarrow \nu ({\overline{u}})(T_0)\) strongly in \(L^2(\Omega )\). By Lemma 6.4 and the continuity of \(\nu ({\overline{u}}):[0,T]\rightarrow L^2(\Omega )\) stated in Corollary 3.8, this concludes the proof of the convergence of \(\nu (\Pi _{{\mathcal D}_m}u_m)\) to \(\nu ({\overline{u}})\) in \(L^\infty (0,T;L^2(\Omega ))\).

Remark 4.5

Since \(\beta (\Pi _{{\mathcal D}_m}u_m)(T_m)\rightarrow \beta (\bar{u})(T_0)\) weakly in \(L^2(\Omega )\) as \(m\rightarrow \infty \), Lemma 3.4 shows that \(\int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0))\mathrm{d}{\varvec{x}}\le \liminf _{m\rightarrow \infty }\int _\Omega B(\beta (\Pi _{{\mathcal D}_m} u_m)({\varvec{x}},T_m))\mathrm{d}{\varvec{x}}\). Combined with (62), this gives

$$\begin{aligned} \lim _{m\rightarrow \infty }\int _\Omega B({\beta }(\Pi _{{\mathcal D}_m} u_m({\varvec{x}},T_m))) \mathrm{d}{\varvec{x}}=\int _\Omega B(\beta ({\overline{u}})({\varvec{x}},T_0)) \mathrm{d}{\varvec{x}}. \end{aligned}$$
(64)

Item 1 in Corollary 3.8 and Lemma 6.4 therefore show that the functions \(\int _\Omega B(\beta (\Pi _{{\mathcal D}_m}u_m({\varvec{x}},\cdot )))\mathrm{d}{\varvec{x}}\) converge uniformly on [0, T] to \(\int _\Omega B(\beta ({\overline{u}})({\varvec{x}},\cdot ))\mathrm{d}{\varvec{x}}\).

4.4 Proof of Theorem 2.18

By taking the \(\limsup \) as \(m\rightarrow \infty \) of (42) for \(u_m\) with \(T_0=T\), and by using (64) (with \(T_m\equiv T\)) and the continuous integration-by-parts formula (39), we find

$$\begin{aligned}&\limsup _{m\rightarrow \infty }\int _0^{T}\int _\Omega {\varvec{a}}({\varvec{x}},\Pi _{{\mathcal D}_m}\nu (u_m)({\varvec{x}},t),\nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t))\cdot \nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\\&\quad \le \int _0^{T_0} \int _\Omega {\varvec{a}}({\varvec{x}},\nu ({\overline{u}})({\varvec{x}},t),\nabla \zeta ({\overline{u}})({\varvec{x}},t))\cdot \nabla \zeta ({\overline{u}})({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$

Combined with (61), this shows that

$$\begin{aligned}&\lim _{m\rightarrow \infty }\int _0^{T}\int _\Omega {\varvec{a}}({\varvec{x}},\Pi _{{\mathcal D}_m}\nu (u_m)({\varvec{x}},t),\nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t))\cdot \nabla _{{\mathcal D}_m} \zeta (u_m)({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\quad = \int _0^{T_0} \int _\Omega {\varvec{a}}({\varvec{x}},\nu ({\overline{u}})({\varvec{x}},t),\nabla \zeta ({\overline{u}})({\varvec{x}},t))\cdot \nabla \zeta ({\overline{u}})({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(65)

Let us define

$$\begin{aligned} f_m= & {} \left[ {\varvec{a}}({\varvec{x}},\Pi _{{\mathcal D}_m} \nu (u_m),\nabla _{{\mathcal D}_m} \zeta (u_m)) - {\varvec{a}}({\varvec{x}},\Pi _{{\mathcal D}_m} \nu (u_m),\nabla \zeta ({\overline{u}}))\right] \\&\cdot \left[ \nabla _{{\mathcal D}_m} \zeta (u_m)-\nabla \zeta ({\overline{u}})\right] \ge 0. \end{aligned}$$

By developing this expression and using (65), (57) and (18), we see that \(\int _0^T\int _\Omega f_m({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t\rightarrow 0\) as \(m\rightarrow \infty \). This shows that \(f_m\rightarrow 0\) in \(L^1(\Omega \times (0,T))\) and therefore a.e. up to a subsequence. We can then reason as in [23], using the strict monotony (19) of \({\varvec{a}}\), the coercivity assumption (2f) and Vitali’s theorem, to deduce that \(\nabla _{{\mathcal D}_m}\zeta (u_m)\rightarrow \nabla \zeta ({\overline{u}})\) strongly in \(L^p(\Omega \times (0,T))^d\) as \(m\rightarrow \infty \).

5 Removal of the assumption “\(\beta =\mathrm{Id}\) or \(\zeta =\mathrm{Id}\)”

We show here that all previous results are actually true without the structural assumption (17)—i.e. without assuming that \(\beta =\mathrm{Id}\) or \(\zeta =\mathrm{Id}\)—provided that the range of p is slightly restricted. The main theorem in this section is the following convergence result.

Theorem 5.1

Under Assumptions (2), let \(({\mathcal D}_m)_{m\in \mathbb N}\) be a sequence of space-time gradient discretisations, in the sense of Definition 2.1, that is coercive, consistent, limit-conforming and compact (see Sect. 2.2). Let, for any \(m\in \mathbb N\), \(u_m\) be a solution to (14) with \({\mathcal D}={\mathcal D}_m\), provided by Theorem 2.12.

If \(p\ge 2\) then there exists a solution \({\overline{u}}\) to (4) such that, up to a subsequence,

  • the convergences in (18) hold,

  • \(\Pi _{{\mathcal D}_m}\nu (u_m)\rightarrow \nu ({\overline{u}})\) strongly in \(L^\infty (0,T;L^2(\Omega ))\) as \(m\rightarrow \infty \),

  • under the strict monotony assumption on \({\varvec{a}}\) (i.e. (19)), as \(m\rightarrow \infty \) we have \(\Pi _{{\mathcal D}_m}\zeta (u_m)\rightarrow \zeta ({\overline{u}})\) strongly in \(L^p(\Omega \times (0,T))\) and \(\nabla _{{\mathcal D}_m}\zeta (u_m)\rightarrow \nabla \zeta ({\overline{u}})\) strongly in \(L^p(\Omega \times (0,T))^d\).

Proof

We only need to prove the first conclusion of the theorem, i.e. that the convergences (18) hold. Theorems 2.16 and 2.18 then provide the last two conclusions. The difference with respect to Theorem 2.12 is the removal, here, of the structural assumption (17). The only place in the proof of Theorem 2.12 where this assumption was used is in Step 1, to identify the limits \(\overline{\beta }\), \(\overline{\zeta }\) and \(\overline{\nu }\) of \(\Pi _{{\mathcal D}_m}\beta (u_m)\), \(\Pi _{{\mathcal D}_m}\zeta (u_m)\) and \(\Pi _{{\mathcal D}_m}\nu (u_m)\). We will show that these limits can still be identified without assuming (17).

Set \(\mu =\beta +\zeta \), let \(\overline{\mu }=\overline{\beta }+\overline{\zeta }\) and fix a measurable \({\overline{u}}\) such that \((\mu +\nu )({\overline{u}})=\overline{\mu }+\overline{\nu }\). The existence of such a \({\overline{u}}\) is ensured by Assumptions (2b) and (2c). Indeed, these assumptions show that the range of \(\mu +\nu \) is \(\mathbb R\) and therefore that the pseudo-reciprocal \((\mu +\nu )_r\) of \(\mu +\nu \) [defined as in (3)] has domain \(\mathbb R\); this allows us to set, for example, \({\overline{u}}=(\mu +\nu )_r(\overline{\mu }+\overline{\nu })\). Let us now prove that, for such a function \({\overline{u}}\), we have \(\overline{\beta }=\beta ({\overline{u}})\), \(\overline{\zeta }=\zeta ({\overline{u}})\) and \(\overline{\nu }=\nu ({\overline{u}})\).

By using estimates (52) and (53), Kolmogorov’s compactness theorem shows that the convergence of \(\Pi _{{\mathcal D}_m}\nu (u_m)\) towards \(\overline{\nu }\) is actually strong in \(L^2(\Omega \times (0,T))\) (we use \(p\ge 2\) here). Since \(\mu (\Pi _{{\mathcal D}_m}u_m)=\beta (\Pi _{{\mathcal D}_m}u_m)+\zeta (\Pi _{{\mathcal D}_m}u_m) \rightarrow \overline{\beta } + \overline{\zeta }=\overline{\mu }\) weakly in \(L^2(\Omega \times (0,T))\), we can apply Lemma 5.6 with \(\varphi \equiv 1\), \(w_m=\Pi _{{\mathcal D}_m}u_m\), \(w={\overline{u}}\) and \((\mu ,\nu )\) instead of \((\beta ,\zeta )\) to deduce that \(\overline{\nu }=\nu (\overline{u})\) and \(\overline{\mu }=\mu (\overline{u})\). The second of these relations translates into \(\overline{\beta } + \overline{\zeta } = (\beta +\zeta )(\overline{u})\).

We now turn to identifying \(\overline{\beta }\) and \(\overline{\zeta }\). Lemmas 4.1 and 4.3 show that \(\beta _m=\beta (u_m)\) and \(\zeta _m=\zeta (u_m)\) satisfy the assumptions of the discrete compensated compactness Theorem 5.4 below (we use \(p\ge 2\) here). Hence, \(\Pi _{{\mathcal D}_m}\beta (u_m)\Pi _{{\mathcal D}_m}\zeta (u_m)\rightarrow \overline{\beta }\,\overline{\zeta }\) in the sense of measures on \(\Omega \times (0,T)\). Since we already established that \((\beta +\zeta )({\overline{u}})=\overline{\beta }+\overline{\zeta }\), we can therefore apply Lemma 5.6 with \(\varphi \equiv 1\), \(w_m=\Pi _{{\mathcal D}_m}u_m\) and \(w = \overline{u}\). This gives \(\overline{\beta }=\beta (\overline{u})\) and \(\overline{\zeta }=\zeta (\overline{u})\) a.e. on \(\Omega \times (0,T)\), as required.

To summarise, the limits of \(\Pi _{{\mathcal D}_m}\beta (u_m)\), \(\Pi _{{\mathcal D}_m}\zeta (u_m)\) and \(\Pi _{{\mathcal D}_m}\nu (u_m)\) have been identified as \(\beta (\overline{u})\), \(\zeta (\overline{u})\) and \(\nu (\overline{u})\) for some \(\overline{u}\). Since \(\zeta (\overline{u})=\overline{\zeta }\in L^p(\Omega \times (0,T))\), the growth assumptions (2b) on \(\zeta \) ensure that \(\overline{u}\in L^p(\Omega \times (0,T))\). We can then take over the proof of Theorem 2.12 from after the usage of (17), using the \(\overline{u}\) we just found instead of the one defined as the weak limit of \(\Pi _{{\mathcal D}_m}u_m\). This allows us to conclude that \(\overline{u}\) is a solution to (4), and that the convergences in (18) hold. \(\square \)

Remark 5.2

It is not proved that \(\overline{u}\) is a weak limit of \(\Pi _{{\mathcal D}_m}u_m\). Such a limit is not stated in (18) and is not necessarily expected for the model (1), in which the quantities of interest (physically relevant when this PDE models a natural phenomenon) are \(\beta ({\overline{u}})\), \(\zeta ({\overline{u}})\) and \(\nu ({\overline{u}})\).

Remark 5.3

(Maximal monotone operator) Hypotheses (2b) and (2c) imply that the operator T defined by the graph \(\mathcal {G}(T) = \{(\zeta (s),\beta (s)),s\in \mathbb R\}\) is a maximal monotone operator with domain \(\mathbb R\), such that \(0\in T(0)\). Indeed, assume that \(x, y\) satisfy \((\zeta (s)-x)(\beta (s) - y)\ge 0\) for all \(s\in \mathbb R\). Then, letting \(w\in \mathbb R\) be such that

$$\begin{aligned} \frac{\beta (w)+\zeta (w)}{2}=\frac{x+y}{2}, \end{aligned}$$
(66)

we have \((\zeta (w)-x)(\beta (w) - y) = -( \frac{\zeta (w)-\beta (w)}{2} - \frac{x-y}{2})^2 \ge 0\). This implies \(\frac{\zeta (w)-\beta (w)}{2}=\frac{x-y}{2}\) which, combined with (66), gives \(x = \zeta (w)\) and \(y = \beta (w)\) and hence \((x,y)\in \mathcal {G}(T)\).
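
The identity used here can be checked directly. With the shorthand \(a=\zeta (w)-x\) and \(b=\beta (w)-y\) (introduced only for this verification), relation (66) reads \(a+b=0\), and therefore

$$\begin{aligned} (\zeta (w)-x)(\beta (w) - y) = ab = -a^2 = -\left( \frac{a-b}{2}\right) ^2 = -\left( \frac{\zeta (w)-\beta (w)}{2} - \frac{x-y}{2}\right) ^2. \end{aligned}$$

Since the left-hand side is non-negative by assumption (applied with \(s=w\)), this forces \(a=0\), and thus also \(b=0\).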

Conversely, for any maximal monotone operator T from \(\mathbb R\) to \(\mathbb R\) such that \(0\in T(0)\), one can find \(\zeta \) and \(\beta \) satisfying (2b) and (2c), and such that \(\mathcal {G}(T) = \{(\zeta (s),\beta (s)),s\in \mathbb R\}\). Indeed, for all \((x,y)\in \mathcal {G}(T)\) and \((x',y')\in \mathcal {G}(T)\) satisfying \(x+y = x'+y'\), since \((x-x')(y-y')\ge 0\) we have \(x=x'\) and \(y=y'\). We can therefore define \(\zeta \) and \(\beta \) by: for all \((x,y)\in \mathcal G(T)\), \(x = \zeta (\frac{x+y}{2})\) and \(y = \beta (\frac{x+y}{2})\). We observe that these functions are nondecreasing and Lipschitz-continuous with constant 2, and that \(\zeta +\beta = 2 \mathrm{Id}\).
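
Two elementary computations underlie this paragraph; we sketch them, taking for granted that \(\zeta \) and \(\beta \) are nondecreasing. First, if \(x+y=x'+y'\) then \(y-y'=-(x-x')\), so that the monotonicity of T gives \(0\le (x-x')(y-y')=-(x-x')^2\), hence \(x=x'\) and \(y=y'\). Second, for \(s\le s'\), since \(\beta (s')-\beta (s)\ge 0\) and \(\zeta +\beta =2\mathrm{Id}\),

$$\begin{aligned} 0\le \zeta (s')-\zeta (s)\le \left( \zeta (s')-\zeta (s)\right) +\left( \beta (s')-\beta (s)\right) = 2(s'-s), \end{aligned}$$

and the same bound holds with the roles of \(\zeta \) and \(\beta \) exchanged.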

Hence, Theorem 5.1 applies to the model considered in [52], but provides convergence results for much more general equations and various numerical methods in any space dimension.

We now state the two key results that allowed us to remove Assumption (17) if \(p\ge 2\). The first one is a discrete version of a compensated compactness result in [41]. The second is a Minty-like result, useful to identify weak non-linear limits.

We note that Theorem 5.4 states a more general convergence result than needed for the proof of Theorem 5.1 (which only requires \(\varphi \equiv 1\)). We nevertheless state the general form in order to obtain the genuine discrete equivalent of the result in [41]. We also believe that this discrete compensated compactness theorem will find many more applications in the numerical analysis of degenerate or coupled parabolic models. We also refer to [6] for another transposition to the discrete setting of a compensated compactness result.

Theorem 5.4

(Discrete compensated compactness) We take \(T>0\), \(p\ge 2\) and a sequence \(({\mathcal D}_m)_{m\in \mathbb N} = (X_{{\mathcal D}_m,0}, \Pi _{{\mathcal D}_m},\nabla _{{\mathcal D}_m}, {\mathcal I}_{{\mathcal D}_m},(t_m^{(n)})_{n=0,\ldots ,N_m})_{m\in \mathbb N}\) of space-time gradient discretisations, in the sense of Definition 2.1, that is consistent and compact in the sense of Definitions 2.6 and 2.9.

For any \(m\in \mathbb N\), let \(\beta _m=(\beta _m^{(n)})_{n=0,\ldots ,N_m} \subset X_{{\mathcal D}_m,0}\) and \(\zeta _m=(\zeta _m^{(n)})_{n=0,\ldots ,N_m} \subset X_{{\mathcal D}_m,0}\) be such that

  • the sequences \((\int _0^T |\delta _m \beta _m(t)|_{\star ,{\mathcal D}_m}\mathrm{d}t)_{m\in \mathbb N}\) and \((||\nabla _{{\mathcal D}_m}\zeta _m||_{L^2(0,T;L^p(\Omega )^d)})_{m\in \mathbb N}\) are bounded,

  • as \(m\rightarrow \infty \), \(\Pi _{{\mathcal D}_m}\beta _m\rightarrow \overline{\beta }\) and \(\Pi _{{\mathcal D}_m}\zeta _m\rightarrow \overline{\zeta }\) weakly in \(L^2(\Omega \times (0,T))\).

Then \((\Pi _{{\mathcal D}_m}\beta _m)(\Pi _{{\mathcal D}_m}\zeta _m)\rightarrow \overline{\beta }\,\overline{\zeta }\) in the sense of measures on \(\Omega \times (0,T)\), that is, for all \(\varphi \in C(\overline{\Omega }\times [0,T])\),

$$\begin{aligned}&\lim _{m\rightarrow \infty } \int _0^T\int _\Omega \Pi _{{\mathcal D}_m}\beta _m({\varvec{x}},t)\Pi _{{\mathcal D}_m}\zeta _m({\varvec{x}},t)\varphi ({\varvec{x}},t) \mathrm{d}{\varvec{x}}\mathrm{d}t \nonumber \\&\quad = \int _0^T\int _\Omega \overline{\beta }({\varvec{x}},t)\,\overline{\zeta }({\varvec{x}},t)\varphi ({\varvec{x}},t)\mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(67)

Proof

The idea is to reduce to the case where \(\Pi _{{\mathcal D}_m}\zeta _m\) is a tensorial function, in order to separate the space and time variables and make use of the compactness of \(\Pi _{{\mathcal D}_m}\zeta _m\) and \(\Pi _{{\mathcal D}_m}\beta _m\) with respect to each of these variables. Note that the technique we use here apparently provides a new proof for the continuous equivalent of this compensated compactness result.

Step 1: reduction of \(\Pi _{{\mathcal D}_m}\zeta _m\) to tensorial functions.

Let us take \(\delta >0\) and let us consider a covering \((A^\delta _k)_{k=1,\ldots ,K}\) of \(\Omega \) by disjoint cubes of side length \(\delta \). Let \(R_\delta :L^2(\Omega )\rightarrow L^2(\Omega )\) be the operator defined by:

$$\begin{aligned} \forall g\in L^2(\Omega ),\;\forall k=1,\ldots ,K,\;\forall {\varvec{x}}\in A_k^\delta \cap \Omega \,:\, R_\delta g({\varvec{x}}) = \frac{1}{\mathrm{meas}(A_k^\delta )}\int _{A_k^\delta } g({\varvec{y}})\mathrm{d}{\varvec{y}}, \end{aligned}$$

where g has been extended by 0 outside \(\Omega \). Let \({\varvec{x}}\in A_k^\delta \cap \Omega \). Using Jensen’s inequality, the fact that \(\mathrm{meas}(A_k^\delta )=\delta ^d\) and the change of variable \({\varvec{y}}\in A_k^\delta \mapsto {\varvec{\xi }}={\varvec{y}}-{\varvec{x}}\in (-\delta ,\delta )^d\), we can write

$$\begin{aligned} |R_\delta g({\varvec{x}})-g({\varvec{x}})|^2 \le \delta ^{-d}\int _{A_k^\delta }|g({\varvec{y}})\!-\!g({\varvec{x}})|^2\mathrm{d}{\varvec{y}}\le \delta ^{-d}\int _{(-\delta ,\delta )^d}|g({\varvec{x}}+{\varvec{\xi }})-g({\varvec{x}})|^2\mathrm{d}{\varvec{\xi }}. \end{aligned}$$

Integrating over \({\varvec{x}}\in A_k^\delta \) and summing over \(k=1,\ldots ,K\) gives

$$\begin{aligned} ||R_\delta g-g||_{L^2(\Omega )}^2\le & {} \delta ^{-d} \int _{(-\delta ,\delta )^d}\int _{\mathbb R^d}|g({\varvec{x}}+{\varvec{\xi }})-g({\varvec{x}})|^2\mathrm{d}{\varvec{x}}\mathrm{d}{\varvec{\xi }}\nonumber \\\le & {} 2^d\sup _{{\varvec{\xi }}\in (-\delta ,\delta )^d} \int _{\mathbb R^d}|g({\varvec{x}}+{\varvec{\xi }})-g({\varvec{x}})|^2\mathrm{d}{\varvec{x}}. \end{aligned}$$
(68)
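To make the constant in (68) explicit, write \(I({\varvec{\xi }})=\int _{\mathbb R^d}|g({\varvec{x}}+{\varvec{\xi }})-g({\varvec{x}})|^2\mathrm{d}{\varvec{x}}\) (a shorthand introduced only for this remark). Since \(\mathrm{meas}((-\delta ,\delta )^d)=(2\delta )^d\), the second inequality in (68) amounts to

$$\begin{aligned} \delta ^{-d}\int _{(-\delta ,\delta )^d} I({\varvec{\xi }})\mathrm{d}{\varvec{\xi }}\le \delta ^{-d}\,(2\delta )^d \sup _{{\varvec{\xi }}\in (-\delta ,\delta )^d} I({\varvec{\xi }}) = 2^d \sup _{{\varvec{\xi }}\in (-\delta ,\delta )^d} I({\varvec{\xi }}). \end{aligned}$$

In particular, the constant \(2^d\) does not depend on \(\delta \).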

The compactness of \(({\mathcal D}_m)_{m\in \mathbb N}\) (Definition 2.9) and the fact that \(p\ge 2\) give \(\epsilon ({\varvec{\xi }})\) such that \(\epsilon ({\varvec{\xi }})\rightarrow 0\) as \({\varvec{\xi }}\rightarrow 0\) and, for all \(m\in \mathbb N\) and all \(v\in X_{{\mathcal D}_m,0}\),

$$\begin{aligned} ||\Pi _{{\mathcal D}_m}v(\cdot +{\varvec{\xi }})-\Pi _{{\mathcal D}_m}v||_{L^2(\mathbb R^d)}^2\le \epsilon ({\varvec{\xi }})||\nabla _{{\mathcal D}_m} v||_{L^p(\Omega )^d}^2. \end{aligned}$$

Combining this with (68) and using the bound on \(||\nabla _{{\mathcal D}_m}\zeta _m||_{L^2(0,T;L^p(\Omega )^d)}\) shows that

$$\begin{aligned} ||R_\delta \Pi _{{\mathcal D}_m}\zeta _m-\Pi _{{\mathcal D}_m}\zeta _m||_{L^2(\Omega \times (0,T))} \le C\sup _{|{\varvec{\xi }}|_\infty \le \delta }\sqrt{\epsilon ({\varvec{\xi }})}=:\omega (\delta ) \end{aligned}$$
(69)

where C does not depend on m, and \(\omega (\delta )\rightarrow 0\) as \(\delta \rightarrow 0\). Note that a similar estimate holds with \(\Pi _{{\mathcal D}_m}\zeta _m\) replaced with \(\overline{\zeta }\) since \(\overline{\zeta }\in L^2(\Omega \times (0,T))\).

If we denote by \(\mathcal A_m(\Pi _{{\mathcal D}_m}\zeta _m)\) and \(\mathcal A(\overline{\zeta })\) the integrals on the left-hand side and right-hand side of (67), respectively, then since \((\Pi _{{\mathcal D}_m}\beta _m)_{m\in \mathbb N}\) is bounded in \(L^2(\Omega \times (0,T))\) we have by (69)

$$\begin{aligned} |\mathcal A_m(\Pi _{{\mathcal D}_m}\zeta _m)-\mathcal A(\overline{\zeta })|\le C\omega (\delta ) +|\mathcal A_m(R_\delta \Pi _{{\mathcal D}_m}\zeta _m)-\mathcal A(R_\delta \overline{\zeta })|. \end{aligned}$$
(70)
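A sketch of how (70) can be obtained, under the assumption (suggested by the form of (72), since (67) is not restated here) that \(\mathcal A_m(g)=\int _0^T\int _\Omega \Pi _{{\mathcal D}_m}\beta _m\, g\, \varphi \,\mathrm{d}{\varvec{x}}\mathrm{d}t\) and \(\mathcal A(g)=\int _0^T\int _\Omega \overline{\beta }\, g\, \varphi \,\mathrm{d}{\varvec{x}}\mathrm{d}t\) with \(\varphi \) bounded. The triangle inequality gives

$$\begin{aligned} |\mathcal A_m(\Pi _{{\mathcal D}_m}\zeta _m)-\mathcal A(\overline{\zeta })|\le & {} |\mathcal A_m(\Pi _{{\mathcal D}_m}\zeta _m)-\mathcal A_m(R_\delta \Pi _{{\mathcal D}_m}\zeta _m)| +|\mathcal A_m(R_\delta \Pi _{{\mathcal D}_m}\zeta _m)-\mathcal A(R_\delta \overline{\zeta })|\\&+\,|\mathcal A(R_\delta \overline{\zeta })-\mathcal A(\overline{\zeta })|, \end{aligned}$$

and the Cauchy–Schwarz inequality bounds the first term by

$$\begin{aligned} ||\varphi ||_{L^\infty (\Omega \times (0,T))}\,||\Pi _{{\mathcal D}_m}\beta _m||_{L^2(\Omega \times (0,T))}\,||\Pi _{{\mathcal D}_m}\zeta _m-R_\delta \Pi _{{\mathcal D}_m}\zeta _m||_{L^2(\Omega \times (0,T))}\le C\omega (\delta ). \end{aligned}$$

The third term is handled in the same way, with \(\overline{\beta }\) in place of \(\Pi _{{\mathcal D}_m}\beta _m\) and the analogue of (69) for \(\overline{\zeta }\), possibly after enlarging \(\omega \) so that it controls both differences.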

Let us assume that we can prove that, for a fixed \(\delta \),

$$\begin{aligned} \mathcal A_m(R_\delta \Pi _{{\mathcal D}_m}\zeta _m)\rightarrow \mathcal A(R_\delta \overline{\zeta }) \text{ as } m\rightarrow \infty . \end{aligned}$$
(71)

Then (70) gives \(\limsup _{m\rightarrow \infty } |\mathcal A_m(\Pi _{{\mathcal D}_m}\zeta _m)-\mathcal A(\overline{\zeta })|\le C\omega (\delta )\). Letting \(\delta \rightarrow 0\) in this inequality gives \(\mathcal A_m(\Pi _{{\mathcal D}_m}\zeta _m)\rightarrow \mathcal A(\overline{\zeta })\) as wanted. Hence, we only need to prove (71).

The definition of \(R_\delta \) shows that

$$\begin{aligned} R_\delta g=\sum _{k=1}^K\frac{1}{\mathrm{meas}(A_k^\delta )}\mathbf {1}_{A^\delta _k} [g]_{A_k^\delta }, \end{aligned}$$

where \(\mathbf {1}_{A_k^\delta }\) is the characteristic function of \(A_k^\delta \) and \([g]_A=\int _A g({\varvec{x}})\mathrm{d}{\varvec{x}}\). Hence, (71) follows if we can prove that for any measurable set A

$$\begin{aligned}&\lim _{m\rightarrow \infty } \int _0^T\int _\Omega \Pi _{{\mathcal D}_m}\beta _m({\varvec{x}},t)[\Pi _{{\mathcal D}_m}\zeta _m]_A(t) \varphi (t,{\varvec{x}})\mathbf {1}_A({\varvec{x}}) \mathrm{d}{\varvec{x}}\mathrm{d}t \nonumber \\&\quad = \int _0^T\int _\Omega \overline{\beta }({\varvec{x}},t)[\,\overline{\zeta }\,]_A(t) \varphi (t,{\varvec{x}})\mathbf {1}_A({\varvec{x}})\mathrm{d}{\varvec{x}}\mathrm{d}t \end{aligned}$$
(72)

where for \(g\in L^2(\Omega \times (0,T))\) we set \([g]_A(t)=\int _A g(t,{\varvec{y}})\mathrm{d}{\varvec{y}}\).
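To see why (72) suffices, one may substitute the expression of \(R_\delta \) into \(\mathcal A_m\). Assuming as above that \(\mathcal A_m(g)=\int _0^T\int _\Omega \Pi _{{\mathcal D}_m}\beta _m\, g\, \varphi \,\mathrm{d}{\varvec{x}}\mathrm{d}t\) (a reading of the notation inferred from (72), not restated from (67)), this gives

$$\begin{aligned} \mathcal A_m(R_\delta \Pi _{{\mathcal D}_m}\zeta _m)=\sum _{k=1}^K\frac{1}{\mathrm{meas}(A_k^\delta )}\int _0^T\int _\Omega \Pi _{{\mathcal D}_m}\beta _m({\varvec{x}},t)[\Pi _{{\mathcal D}_m}\zeta _m]_{A_k^\delta }(t) \varphi (t,{\varvec{x}})\mathbf {1}_{A_k^\delta }({\varvec{x}}) \mathrm{d}{\varvec{x}}\mathrm{d}t, \end{aligned}$$

with the corresponding formula for \(\mathcal A(R_\delta \overline{\zeta })\). Since K is finite for a fixed \(\delta \), applying (72) with \(A=A_k^\delta \cap \Omega \) to each term of the sum yields (71).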

Step 2: further reductions.

We now reduce \(\varphi \) to a tensorial function and \(\mathbf {1}_A\) to a smooth function. It is well-known that there exist tensorial functions \(\varphi _r=\sum _{l=1}^{L_r} \theta _{l,r}(t)\gamma _{l,r}({\varvec{x}})\), with \(\theta _{l,r}\in C^\infty ([0,T])\) and \(\gamma _{l,r}\in C^\infty (\overline{\Omega })\), such that \(\varphi _r\rightarrow \varphi \) uniformly on \(\Omega \times (0,T)\) as \(r\rightarrow \infty \). Moreover, there exist functions \(\rho _r\in C^\infty _c(\Omega )\) such that \(\rho _r\rightarrow \mathbf {1}_A\) in \(L^2(\Omega )\) as \(r\rightarrow \infty \).

Hence, as \(r\rightarrow \infty \) the function \((t,{\varvec{x}})\mapsto \varphi _r(t,{\varvec{x}})\rho _r({\varvec{x}})\) converges in \(L^\infty (0,T;L^2(\Omega ))\) to the function \((t,{\varvec{x}})\mapsto \varphi (t,{\varvec{x}})\mathbf {1}_A({\varvec{x}})\). Since the sequence of functions \((t,{\varvec{x}})\mapsto \Pi _{{\mathcal D}_m}\beta _m(t,{\varvec{x}}) [\Pi _{{\mathcal D}_m}\zeta _m]_A(t)\) is bounded in \(L^1(0,T;L^2(\Omega ))\) (notice that \(([\Pi _{{\mathcal D}_m}\zeta _m]_A)_{m\in \mathbb N}\) is bounded in \(L^2(0,T)\) since \((\Pi _{{\mathcal D}_m}\zeta _m)_{m\in \mathbb N}\) is bounded in \(L^2(\Omega \times (0,T))\)), a reasoning similar to the one used in Step 1 shows that we only need to prove (72) with \(\varphi (t,{\varvec{x}})\mathbf {1}_A({\varvec{x}})\) replaced with \(\varphi _r(t,{\varvec{x}})\rho _r({\varvec{x}})\) for a fixed r.

We have \(\varphi _r(t,{\varvec{x}})\rho _r({\varvec{x}})=\sum _{l=1}^{L_r} \theta _{l,r}(t)(\gamma _{l,r}\rho _r)({\varvec{x}})\) and \(\gamma _{l,r}\rho _r\in C^\infty _c(\Omega )\). Hence, (72) with \(\varphi (t,{\varvec{x}})\mathbf {1}_A({\varvec{x}})\) replaced with \(\varphi _r(t,{\varvec{x}})\rho _r({\varvec{x}})\) will follow if we can establish that for any \(\theta \in C^\infty ([0,T])\), any \(\psi \in C^\infty _c(\Omega )\) and any measurable set A

$$\begin{aligned}&\lim _{m\rightarrow \infty } \int _0^T\int _\Omega \theta (t)\Pi _{{\mathcal D}_m}\beta _m({\varvec{x}},t)[\Pi _{{\mathcal D}_m}\zeta _m]_A(t) \psi ({\varvec{x}})\mathrm{d}{\varvec{x}}\mathrm{d}t\nonumber \\&\quad = \int _0^T\int _\Omega \theta (t)\overline{\beta }({\varvec{x}},t)[\,\overline{\zeta }\,]_A(t)\psi ({\varvec{x}})\mathrm{d}{\varvec{x}}\mathrm{d}t. \end{aligned}$$
(73)

Step 3: proof of (73).

We now use the estimate on \(\delta _m\beta _m\) to conclude. We write

$$\begin{aligned} \int _0^T\int _\Omega \theta (t)\Pi _{{\mathcal D}_m}\beta _m({\varvec{x}},t)[\Pi _{{\mathcal D}_m}\zeta _m]_A(t) \psi ({\varvec{x}})\mathrm{d}{\varvec{x}}\mathrm{d}t =\int _0^T \theta (t)[\Pi _{{\mathcal D}_m}\zeta _m]_A(t) F_m(t)\mathrm{d}t \end{aligned}$$
(74)

with \(F_m(t)= \int _\Omega \Pi _{{\mathcal D}_m}\beta _m({\varvec{x}},t)\psi ({\varvec{x}})\mathrm{d}{\varvec{x}}\). It is clear from the weak convergence of \(\Pi _{{\mathcal D}_m}\zeta _m\) that \([\Pi _{{\mathcal D}_m}\zeta _m]_A \rightarrow [\,\overline{\zeta }\,]_A\) weakly in \(L^2(0,T)\). Hence, if we can prove that \(F_m\rightarrow F:=\int _\Omega \overline{\beta }({\varvec{x}},\cdot )\psi ({\varvec{x}})\mathrm{d}{\varvec{x}}\) strongly in \(L^2(0,T)\), we can pass to the limit in (74) and obtain (73). Since \(F_m\) weakly converges to F in \(L^2(0,T)\) [thanks to the weak convergence of \(\Pi _{{\mathcal D}_m}\beta _m\) in \(L^2(\Omega \times (0,T))\)], we only have to prove that \((F_m)_{m\in \mathbb N}\) is relatively compact in \(L^2(0,T)\).
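The passage to the limit announced here is the usual weak–strong argument, which can be spelled out as follows. Splitting the difference,

$$\begin{aligned} \int _0^T \theta \,[\Pi _{{\mathcal D}_m}\zeta _m]_A\, F_m\,\mathrm{d}t-\int _0^T \theta \,[\,\overline{\zeta }\,]_A\, F\,\mathrm{d}t = \int _0^T \theta \,[\Pi _{{\mathcal D}_m}\zeta _m]_A\,(F_m-F)\,\mathrm{d}t +\int _0^T \theta F\left( [\Pi _{{\mathcal D}_m}\zeta _m]_A-[\,\overline{\zeta }\,]_A\right) \mathrm{d}t. \end{aligned}$$

The first term on the right-hand side is bounded by \(||\theta ||_{L^\infty (0,T)}\,||[\Pi _{{\mathcal D}_m}\zeta _m]_A||_{L^2(0,T)}\,||F_m-F||_{L^2(0,T)}\), which tends to 0 if \(F_m\rightarrow F\) strongly, because weakly convergent sequences are bounded. The second term tends to 0 by the weak convergence of \([\Pi _{{\mathcal D}_m}\zeta _m]_A\), since \(\theta F\in L^2(0,T)\).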

We introduce the interpolant \(P_{{\mathcal D}_m}\) defined by (21) and we define \(G_m\) as \(F_m\) with \(\psi \) replaced with \(\Pi _{{\mathcal D}_m}P_{{\mathcal D}_m}\psi \). We then have

$$\begin{aligned} |F_m(t)-G_m(t)| \le ||\Pi _{{\mathcal D}_m}\beta _m(\cdot ,t)||_{L^2(\Omega )}S_{{\mathcal D}_m}(\psi ). \end{aligned}$$

The consistency of \(({\mathcal D}_m)_{m\in \mathbb N}\) thus shows that

$$\begin{aligned} F_m-G_m\rightarrow 0 \text{ strongly } \text{ in } L^2(0,T) \text{ as } m\rightarrow \infty . \end{aligned}$$
(75)
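In more detail, squaring the pointwise bound on \(|F_m(t)-G_m(t)|\) and integrating over (0, T) gives

$$\begin{aligned} ||F_m-G_m||_{L^2(0,T)}\le ||\Pi _{{\mathcal D}_m}\beta _m||_{L^2(\Omega \times (0,T))}\,S_{{\mathcal D}_m}(\psi ). \end{aligned}$$

The first factor is bounded uniformly in m, and the second tends to 0 provided consistency is understood, as we assume here, in the sense that \(S_{{\mathcal D}_m}(\psi )\rightarrow 0\) as \(m\rightarrow \infty \) for the fixed function \(\psi \).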

We now study the strong convergence of \(G_m\). This function is, like \(\Pi _{{\mathcal D}_m}\beta _m\), piecewise constant on (0, T) and, by definition of \(|\cdot |_{\star ,{\mathcal D}_m}\), its discrete derivative satisfies

$$\begin{aligned} |\delta _mG_m(t)|\le |\delta _m\beta _m (t)|_{\star ,{\mathcal D}_m}||\nabla _{{\mathcal D}_m}P_{{\mathcal D}_m}\psi ||_{L^p(\Omega )^d}. \end{aligned}$$

Since \(||\nabla _{{\mathcal D}_m}P_{{\mathcal D}_m}\psi ||_{L^p(\Omega )^d}\le S_{{\mathcal D}_m}(\psi )+||\nabla \psi ||_{L^p(\Omega )^d}\) is bounded uniformly with respect to m, the assumption on \(\delta _m \beta _m\) proves that \((||\delta _m G_m||_{L^1(0,T)})_{m\in \mathbb N}\) is bounded. We have \(||\delta _m G_m||_{L^1(0,T)}=|G_m|_{BV(0,T)}\), and \((\Pi _{{\mathcal D}_m}\beta _m)_{m\in \mathbb N}\) is bounded in \(L^2(\Omega \times (0,T))\); hence, \((G_m)_{m\in \mathbb N}\) is bounded in \(BV(0,T)\cap L^2(0,T)\) and therefore relatively compact in \(L^2(0,T)\) (see [7, Theorem 10.1.4]). Combined with (75), this shows that \((F_m)_{m\in \mathbb N}\) is relatively compact in \(L^2(0,T)\) and concludes the proof. \(\square \)

Remark 5.5

If we assume that \((\Pi _{{\mathcal D}_m}\beta _m)_{m\in \mathbb N}\) is bounded in \(L^\infty (0,T;L^2(\Omega ))\) and that, for some \(q>1\), \((\int _0^T |\delta _m \beta _m(t)|^q_{\star ,{\mathcal D}_m}\mathrm{d}t)_{m\in \mathbb N}\) is bounded, then Step 3 becomes a trivial consequence of Theorem 3.1. Indeed, this theorem shows that \((\Pi _{{\mathcal D}_m}\beta _m)_{m\in \mathbb N}\) is relatively compact uniformly-in-time and weakly in \(L^2(\Omega )\), which translates into the relative compactness of \((F_m)_{m\in \mathbb N}\) in \(L^\infty (0,T)\).

Lemma 5.6

Let V be a non-empty measurable subset of \(\mathbb R^N\), \(N\ge 1\). Let \(\beta ,\zeta \in C^0(\mathbb R)\) be two nondecreasing functions such that \(\beta (0)=\zeta (0)=0\). We assume that there exists a sequence \((w_m)_{m\in \mathbb N}\) of measurable functions on V, and two functions \(\overline{\beta },\overline{\zeta } \in L^2(V)\) such that:

  • \(\beta (w_m)\rightarrow \overline{\beta }\) and \(\zeta (w_m)\rightarrow \overline{\zeta }\) weakly in \(L^2(V)\),

  • there exists \(\varphi \in L^\infty (V)\) such that \(\varphi >0\) a.e. on V and

    $$\begin{aligned} \lim _{m\rightarrow \infty } \int _V \varphi ({\varvec{z}})\beta (w_m({\varvec{z}}))\zeta (w_m({\varvec{z}}))\mathrm{d}{\varvec{z}}= \int _V \varphi ({\varvec{z}})\overline{\beta }({\varvec{z}})\,\overline{\zeta }({\varvec{z}})\mathrm{d}{\varvec{z}}. \end{aligned}$$
    (76)

Then, for any measurable function w such that \((\beta + \zeta )(w) = \overline{\beta }+\overline{\zeta }\) a.e. in V, we have

$$\begin{aligned} \overline{\beta } = \beta (w)\hbox { and } \overline{\zeta } = \zeta (w) \hbox { a.e. in } V. \end{aligned}$$
(77)

Proof

We first notice that \(\beta (w)\) and \(\zeta (w)\) belong to \(L^2(V)\). Indeed, since \(\beta \) and \(\zeta \) are non-decreasing and vanish at 0, \(\beta (w)\) and \(\zeta (w)\) have the same sign and therefore satisfy \(|\beta (w)| + |\zeta (w)| = |\overline{\beta }+\overline{\zeta }|\in L^2(V)\). Using again the fact that \(\beta \) and \(\zeta \) are non-decreasing, we can write

$$\begin{aligned} \int _V \varphi ({\varvec{z}}) \left[ \beta (w_m({\varvec{z}})) - \beta (w({\varvec{z}}))\right] \,\left[ \zeta (w_m({\varvec{z}})) - \zeta (w({\varvec{z}}))\right] \mathrm{d}{\varvec{z}}\ge 0. \end{aligned}$$

Letting \(m\rightarrow \infty \) in the above inequality, and using the convergences of \(\beta (w_m)\), \(\zeta (w_m)\) and (76), we obtain

$$\begin{aligned} \int _V \varphi ({\varvec{z}}) \left[ \overline{\beta }({\varvec{z}}) - \beta (w({\varvec{z}}))\right] \left[ \, \overline{\zeta }({\varvec{z}}) - \zeta (w({\varvec{z}}))\right] \mathrm{d}{\varvec{z}}\ge 0. \end{aligned}$$
(78)
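The limit leading to (78) can be checked term by term. Expanding the product in the previous inequality gives

$$\begin{aligned} \int _V \varphi \,\beta (w_m)\zeta (w_m)\,\mathrm{d}{\varvec{z}}-\int _V \varphi \,\zeta (w)\,\beta (w_m)\,\mathrm{d}{\varvec{z}}-\int _V \varphi \,\beta (w)\,\zeta (w_m)\,\mathrm{d}{\varvec{z}}+\int _V \varphi \,\beta (w)\zeta (w)\,\mathrm{d}{\varvec{z}}\ge 0. \end{aligned}$$

The first integral converges by (76). The second and third converge by the weak convergences of \(\beta (w_m)\) and \(\zeta (w_m)\) in \(L^2(V)\), since \(\varphi \zeta (w)\) and \(\varphi \beta (w)\) belong to \(L^2(V)\). The fourth does not depend on m. Regrouping the four limits gives the product form (78).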

We then remark that \(\overline{\beta }+\overline{\zeta }= \beta (w) + \zeta (w)\) gives \(\beta (w) = \frac{\overline{\beta }+\overline{\zeta }}{2} + \left( \frac{\beta -\zeta }{2}\right) (w)\) and \(\zeta (w) = \frac{\overline{\beta }+\overline{\zeta }}{2} - \left( \frac{\beta -\zeta }{2}\right) (w)\). Hence, (78) leads to

$$\begin{aligned} - \int _V \varphi ({\varvec{z}}) \left[ \frac{\overline{\beta }-\overline{\zeta }}{2}({\varvec{z}}) - \left( \frac{\beta -\zeta }{2}\right) (w({\varvec{z}}))\right] ^2\mathrm{d}{\varvec{z}}\ge 0. \end{aligned}$$
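Indeed, setting \(D=\frac{\overline{\beta }-\overline{\zeta }}{2} - \left( \frac{\beta -\zeta }{2}\right) (w)\) (a notation used only in this verification), the two expressions of \(\beta (w)\) and \(\zeta (w)\) above give

$$\begin{aligned} \overline{\beta }-\beta (w) = \overline{\beta }-\frac{\overline{\beta }+\overline{\zeta }}{2} - \left( \frac{\beta -\zeta }{2}\right) (w) = D \quad \hbox { and }\quad \overline{\zeta }-\zeta (w) = \overline{\zeta }-\frac{\overline{\beta }+\overline{\zeta }}{2} + \left( \frac{\beta -\zeta }{2}\right) (w) = -D, \end{aligned}$$

so that the integrand in (78) is equal to \(-\varphi D^2\).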

Since \(\varphi \) is almost everywhere strictly positive on V, we deduce that \(\frac{\overline{\beta }-\overline{\zeta }}{2} = \frac{\beta (w)-\zeta (w)}{2}\) a.e. in V, and (77) follows from \(\frac{\overline{\beta }+\overline{\zeta }}{2}= \frac{\beta (w) + \zeta (w)}{2}\). \(\square \)