1 Introduction and main results

In the 1970s Yurinskii [39], Kozlov [26], and Papanicolaou and Varadhan [35] proved the first homogenization results for elliptic equations with random coefficients. They considered the elliptic operator \(-\nabla \cdot \varvec{a}(\frac{\cdot }{\varepsilon })\nabla \) (on a domain of \(\mathbb R^d\)) with random, uniformly elliptic coefficients \(\varvec{a}(x)\in \mathbb R^{d\times d}\) and studied its asymptotic behavior in the macroscopic limit \(\varepsilon \downarrow 0\). For coefficients that are stationary and ergodic (w.r.t. the shifts \(\varvec{a}(\cdot )\mapsto \varvec{a}(\cdot +z)\), \(z\in \mathbb R^d\)) they proved a qualitative homogenization result which says that as \(\varepsilon \downarrow 0\) the elliptic operator \(-\nabla \cdot \varvec{a}(\frac{\cdot }{\varepsilon })\nabla \) almost surely H-converges to the homogenized elliptic operator \(-\nabla \cdot \varvec{a}_{\mathrm{hom}}\nabla \). Morally speaking this means that the rapidly oscillating random coefficients \(\varvec{a}(\frac{\cdot }{\varepsilon })\) can be replaced in the macroscopic limit by the homogenized coefficient matrix \(\varvec{a}_{\mathrm{hom}}\), which is deterministic and characterized by a homogenization formula. In the present contribution we consider stochastic homogenization in a discrete setting where the continuum domain \(\mathbb R^d\) is replaced by the lattice \(\mathbb Z^d\). The qualitative homogenization theory is similar to the one in the continuum setting, see Künnemann [28], Kozlov [27]. This problem corresponds to random conductance models for a network of resistors (see Biskup [3] for a recent review).

We are interested in the homogenization formula. To be precise let \(d\ge 2\) denote the dimension and \(\lambda >0\) be a constant of ellipticity which is fixed throughout this article. Let \(\Omega \subset (\mathbb R^{d\times d})^{\mathbb Z^d}\) denote the set of admissible coefficient fields, which is defined as the set of all functions from \(\mathbb Z^d\) to the set of diagonal matrices with diagonal entries in \([\lambda ,1]\). We endow \(\Omega \) with a stationary ensemble \(\langle \cdot \rangle \) that describes the statistics of the random coefficients. With \(\langle \cdot \rangle \) we associate the symmetric matrix of homogenized coefficients \(\varvec{a}_{\mathrm{hom}}\in \mathbb R^{d\times d}_\mathrm {sym}\) via the minimization problem:

$$\begin{aligned} \forall e\in \mathbb R^d:\qquad e\cdot \varvec{a}_{\mathrm{hom}}e=\inf _{\overline{\varphi }}\left\langle (e+\nabla \overline{\varphi })\cdot \varvec{a}(e+\nabla \overline{\varphi }) \right\rangle , \end{aligned}$$
(1)

where the infimum runs over all random fields \(\overline{\varphi }:\Omega \times \mathbb Z^d\rightarrow \mathbb R\) that are stationary in the sense of \(\overline{\varphi }(\varvec{a},x+z)=\overline{\varphi }(\varvec{a}(\cdot +z),x)\) for all \(x,z\in \mathbb Z^d\) and \(\langle \cdot \rangle \)-almost every \(\varvec{a}\in \Omega \). The Euler-Lagrange equation associated with (1) is called the corrector equation:

$$\begin{aligned} \nabla ^*\varvec{a}(\nabla \overline{\phi }+e)=0\quad \text{ on } \, \mathbb Z^d,\qquad \overline{\phi }\text { is a }\langle \cdot \rangle \hbox {-stationary random field}. \end{aligned}$$
(2)

We refer to Sect. 2.1 for the definition of the finite-difference gradient \(\nabla \) and its adjoint \(\nabla ^*\). A solution to (2) is called a stationary corrector, as opposed to the non-stationary corrector introduced by Künnemann [28]. In case it exists, the stationary corrector minimizes the Dirichlet energy (1), and the homogenization formula reads

$$\begin{aligned} \varvec{a}_{\mathrm{hom}}e=\left\langle \varvec{a}(\nabla \overline{\phi }+e) \right\rangle . \end{aligned}$$
(3)

The goal of this paper is to introduce quantitative methods that yield optimal estimates on the corrector equation (2) and on approximations of the homogenization formula (3). The methods we present continue and extend earlier ideas of two of the authors in [21, 22].

The quantitative theory for (2) is subtle. As a matter of fact, for general stationary and ergodic ensembles the minimum in (1) may not be attained and stationary correctors may not exist. On top of that, in dimension \(d=2\) stationary correctors do not exist even under the strong assumption of independent and identically distributed (i.i.d.) coefficients. The only existence result of stationary correctors has been obtained recently in [21] by two of the authors in the case of i.i.d. coefficients in dimensions \(d>2\).

Deterministic approaches to (2) that do not exploit the properties of the underlying probability space (e.g. when \(\varvec{a}\in \Omega \) is just viewed as a parameter) are too narrow: One generically does not have existence of bounded solutions of (2) pointwise in \(\varvec{a}\in \Omega \) (as we learn from the small ellipticity-contrast expansion). A natural way to benefit from the underlying probability space relies on the observation that stationary fields \(\overline{\varphi }(\varvec{a},x)\) are fully characterized by the random variable \(\varphi (\varvec{a}):=\overline{\varphi }(\varvec{a},x=0)\). Based on this, one can introduce a differential calculus for random variables and rewrite (2) as

$$\begin{aligned} D^*\varvec{a}(0)(D\phi +e)=0\qquad \phi \text { is }\langle \cdot \rangle \hbox {-measurable.} \end{aligned}$$
(4)

Here \(D\) and \(D^*\) are what we call horizontal derivatives and are related to \(\nabla \) and \(\nabla ^*\) via stationarity (see Sect. 2.1 for the details). Moreover, \(\varvec{a}(0)\) stands for the coordinate projection \(\Omega \ni \varvec{a}\mapsto \varvec{a}(0)\) on diagonal matrices. Thanks to the discrete setting, \(D^*\varvec{a}(0)D\) is a bounded linear operator on \(L^p(\Omega )\) for any \(1\le p\le \infty \). However, it is highly degenerate as an elliptic operator since the horizontal derivative \(D=(D_1,\dots ,D_d)\) with its \(d\) components typically does not yield a Poincaré inequality for functions on the infinite-dimensional space \(\Omega \). Our quantitative analysis is based on the assumption that \(\langle \cdot \rangle \) satisfies a Spectral Gap Estimate (SG) with respect to a Glauber dynamics on coefficient fields: We assume there exists \(\rho >0\) such that

$$\begin{aligned} \forall \zeta \in L^2(\Omega ):\qquad \left\langle (\zeta -\langle \zeta \rangle )^2 \right\rangle \le \frac{1}{\rho }\sum _{y\in \mathbb Z^d}\left\langle \left( \tfrac{\partial \zeta }{\partial y}\right) ^2 \right\rangle , \end{aligned}$$
(SG)

where \(\tfrac{\partial \zeta }{\partial y}\) denotes the vertical derivative and can be viewed as a discrete version of the classical partial derivative \(\tfrac{\partial \zeta (\varvec{a})}{\partial \varvec{a}(y)}\) (see Sect. 2.1 for details). Note that (SG) can be seen as a Poincaré inequality, this time with respect to the vertical derivatives \(\{\tfrac{\partial }{\partial y}\}_{y\in \mathbb Z^d}\). (SG) is stronger than ergodicity, but weaker than the assumption that the coefficients are i.i.d. (cf. Lemma 1 below).
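To make (SG) concrete, the following sketch verifies by exact enumeration the i.i.d. prototype of (SG) (the Efron–Stein inequality, with \(\rho =1\)) for a toy random variable that depends on the coefficients at two sites only. The two-point distribution, the choice \(\zeta (\varvec{a})=\varvec{a}(0)\varvec{a}(e_1)\) (written for scalar stand-ins), and the convention for the vertical derivative (the fluctuation of \(\zeta \) when the coefficient at a single site is averaged out) are illustrative assumptions; the paper's precise definitions are in Sect. 2.1.

```python
import itertools

# Illustrative two-point i.i.d. ensemble: each coefficient takes the values
# {lam, 1} with probability 1/2. The random variable zeta depends on the
# coefficients at two sites, y in {0, e_1} (an invented toy example).
lam = 0.25
vals = (lam, 1.0)

def zeta(a0, a1):
    return a0 * a1   # illustrative choice of random variable

states = list(itertools.product(vals, vals))          # uniform weights 1/4
mean = sum(zeta(*s) for s in states) / len(states)
var = sum((zeta(*s) - mean) ** 2 for s in states) / len(states)

def vertical_term(y):
    # <(zeta - <zeta>_y)^2>, with <.>_y the average over the coefficient at y
    total = 0.0
    for a0, a1 in states:
        cond = (sum(zeta(v, a1) for v in vals) if y == 0
                else sum(zeta(a0, v) for v in vals)) / len(vals)
        total += (zeta(a0, a1) - cond) ** 2
    return total / len(states)

rhs = vertical_term(0) + vertical_term(1)
print(var, rhs, var <= rhs)   # (SG) with rho = 1 holds for this example
```

Here both sides are computed exactly, and the variance is indeed dominated by the sum of the vertical terms, in accordance with (SG).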

To circumvent the degeneracy of \(D^*\varvec{a}(0)D\) one usually considers approximations of (4) that regularize the problem. In this paper we study two natural approximations, which we introduce now.

The modified corrector equation. In the qualitative works by Papanicolaou and Varadhan [35] and Kipnis and Varadhan [25], the corrector equation is regularized by introducing an additional 0th-order (“massive”) term. The result is what we call the modified corrector equation

$$\begin{aligned} \mu \phi _\mu +D^*\varvec{a}(0)(D\phi _\mu +e)=0\qquad \phi _\mu \text { is }\langle \cdot \rangle \hbox {-measurable.} \end{aligned}$$
(5)

Here, the regularization parameter \(\mu \) is a positive number, sometimes written as an inverse time \(\frac{1}{T}\). As a merit of the regularization, the modified corrector equation always admits a unique solution in \(L^2(\Omega )\). By standard arguments, \(\langle |D\phi _\mu |^2 \rangle \) and \(\mu \langle \phi _\mu ^2 \rangle \) are bounded uniformly in \(\mu \). For general ensembles, bounds on \(\phi _\mu \) that are uniform in \(\mu \) are not available, and thus one can only pass to the limit \(\mu \downarrow 0\) on the level of the gradient \(D\phi _\mu \), which is enough for the qualitative homogenization theory. However, it is not sufficient for proving existence and estimates for the original problem (4). As an application of our methods, assuming (SG), we shall prove in Proposition 1 that in dimensions \(d>2\) the modified corrector \(\phi _\mu \) is bounded in \(L^p(\Omega )\) uniformly in \(\mu \) for every \(p<\infty \). In the case of the critical dimension \(d=2\) we obtain the estimate \(\left\langle \phi _\mu ^2 \right\rangle \lesssim (\ln \mu ^{-1})|e|^2\), the scaling in \(\mu \) of which is optimal.
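The uniform-in-\(\mu \) a priori bounds just mentioned can be illustrated on a finite periodic box, where (5) becomes the lattice equation \(\mu \phi _\mu +\nabla ^*\varvec{a}(\nabla \phi _\mu +e)=0\). The sketch below (a finite-dimensional stand-in for the equation on \(\Omega \), with a randomly drawn two-dimensional coefficient field as an illustrative assumption) verifies the energy estimate \(\mu \langle \phi _\mu ^2 \rangle +\tfrac{\lambda }{2}\langle |\nabla \phi _\mu |^2 \rangle \le \tfrac{1}{2}|e|^2\), which follows from testing the equation with \(\phi _\mu \) and applying Young's inequality.

```python
import numpy as np

# Finite-box stand-in for the modified corrector equation (5): solve
# mu*phi + nabla^* a (nabla phi + e) = 0 on the torus (Z/LZ)^2 for a randomly
# drawn diagonal coefficient field with entries in [lam, 1].
rng = np.random.default_rng(0)
lam, L = 0.2, 8
e = np.array([1.0, 0.0])
a = rng.uniform(lam, 1.0, size=(2, L, L))     # a[i] = i-th diagonal entry

N = L * L
idx = np.arange(N).reshape(L, L)
A = np.zeros((N, N))                          # matrix of nabla^* a nabla
b = np.zeros(N)                               # right-hand side -nabla^*(a e)
for i in (0, 1):
    ai = a[i].ravel()                         # a_i(x)
    aim = np.roll(a[i], 1, axis=i).ravel()    # a_i(x - e_i)
    xp = np.roll(idx, -1, axis=i).ravel()     # index of x + e_i
    xm = np.roll(idx, 1, axis=i).ravel()      # index of x - e_i
    r = idx.ravel()
    A[r, r] += ai + aim
    A[r, xp] -= ai
    A[r, xm] -= aim
    b += (ai - aim) * e[i]

energies = []
for mu in (1.0, 1e-2, 1e-4, 1e-6):
    phi = np.linalg.solve(mu * np.eye(N) + A, b)
    grad2 = sum((phi[np.roll(idx, -1, axis=i).ravel()] - phi) ** 2 for i in (0, 1))
    energies.append(mu * np.mean(phi ** 2) + 0.5 * lam * np.mean(grad2))

print(energies)   # each entry is bounded by |e|^2/2 = 0.5, uniformly in mu
```

Note that uniform bounds on \(\langle \phi _\mu ^2 \rangle \) itself are precisely what this elementary estimate does *not* give; obtaining them (for \(d>2\), under (SG)) is the content of Proposition 1.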

The periodic approximation. Another natural approach to approximate (4) is to replace \(\langle \cdot \rangle \) by a stationary ensemble \(\langle \cdot \rangle _L\) that concentrates on \(L\)-periodic coefficient fields. In that case we can unambiguously find a solution \(\phi \) of (4) (with \(\langle \cdot \rangle \) replaced by \(\langle \cdot \rangle _L\)) by solving for all \(L\)-periodic coefficient fields \(\varvec{a}\) the periodic corrector equation on \(\mathbb Z^d\):

$$\begin{aligned}&\nabla ^*\varvec{a}(x)(\nabla \overline{\phi }(\varvec{a},x)+e)=0,\qquad \overline{\phi }(\varvec{a},\cdot )\text { is } L\hbox {-periodic and }\nonumber \\&\quad \sum _{x\in ([0,L)\cap \mathbb Z)^d}\overline{\phi }(\varvec{a},x)=0. \end{aligned}$$
(6)

Standard arguments involving the Poincaré estimate on \(([0,L)\cap \mathbb Z)^d\) only show that \(\langle |D\phi |^2 \rangle _L\) and \(L^{-2}\langle \phi ^2 \rangle _L\) are bounded uniformly in \(L\). Assuming an \(L\)-periodic version of (SG), we shall prove that \(\langle |\phi |^p \rangle _L\lesssim |e|^p\), \(p<\infty \), in dimensions \(d>2\), and that \(\langle \phi ^2 \rangle _L\lesssim (\ln L)|e|^2\) in dimension \(d=2\) (see Proposition 1). Again the latter is optimal in terms of scaling in \(L\).
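For illustration, the periodic corrector equation (6) can be solved directly for small \(L\). The following sketch (in \(d=2\), with a layered coefficient field as an invented test case, not an example from the text) solves (6) by a dense linear solve and evaluates the Dirichlet energy per site; for a field depending only on \(x_1\) and \(e=e_1\) this reduces to the classical harmonic mean of the layer values.

```python
import numpy as np

def periodic_corrector_energy(a, e):
    """Solve the L-periodic corrector equation (6) in d = 2 for the diagonal
    field a (shape (2, L, L)) and return the Dirichlet energy per site."""
    L = a.shape[1]
    N = L * L
    idx = np.arange(N).reshape(L, L)
    A = np.zeros((N, N)); b = np.zeros(N)
    for i in (0, 1):
        ai = a[i].ravel()                       # a_i(x)
        aim = np.roll(a[i], 1, axis=i).ravel()  # a_i(x - e_i)
        xp = np.roll(idx, -1, axis=i).ravel()   # x + e_i
        xm = np.roll(idx, 1, axis=i).ravel()    # x - e_i
        r = idx.ravel()
        A[r, r] += ai + aim; A[r, xp] -= ai; A[r, xm] -= aim
        b += (ai - aim) * e[i]
    # A is singular with kernel the constants, and b has zero sum, so the
    # least-squares solution solves (6); fix the additive constant (zero mean).
    phi = np.linalg.lstsq(A, b, rcond=None)[0]
    phi -= phi.mean()
    energy = 0.0
    for i in (0, 1):
        g = phi[np.roll(idx, -1, axis=i).ravel()] - phi + e[i]
        energy += np.sum(a[i].ravel() * g * g)
    return energy / N

# layered sanity check: a_11 depends only on x_1 and e = e_1, so the energy
# equals the harmonic mean of the layer conductivities (classical 1-d formula)
L = 6
layers = np.array([0.3, 0.5, 1.0, 0.4, 0.9, 0.7])
a = np.empty((2, L, L))
a[0] = layers[:, None]   # a_11 constant in x_2
a[1] = 1.0
val = periodic_corrector_energy(a, np.array([1.0, 0.0]))
print(val, L / np.sum(1.0 / layers))   # the two numbers agree
```

The Dirichlet energy computed here is exactly the quantity \(e\cdot \varvec{a}_{\mathrm{av},L}(\varvec{a})e\) that will serve as the practical approximation of the homogenized coefficients below.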

A parabolic key estimate. We study both the modified and the periodic corrector equations by a unified approach that is based on the parabolic equation

$$\begin{aligned} \left\{ \begin{aligned}&\partial _t u(t)+D^*\varvec{a}(0)Du(t)= 0\qquad t>0,\\&u(t=0)= \zeta . \end{aligned}\right. \end{aligned}$$
(7)

This equation defines a stochastic process on the space \(\Omega \) of coefficient fields that can be conveniently interpreted in the context of random walks in random environments (see e.g. [4]): Consider a continuous-in-time random walker (referred to as a particle) on \(\mathbb Z^d\) whose symmetric jump rate across a bond \(\{x,x+e_i\}\) is given by \(\varvec{a}_{ii}(x)\). Then the above process describes the “environment \(\varvec{a}\) as seen from the particle”.
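The reversibility behind this interpretation can be checked directly: since the jump rate across the bond \(\{x,x+e_i\}\) is the same in both directions, the generator of the walk is a symmetric matrix, i.e. the walk is reversible with respect to the counting measure. A sketch on a small torus (an illustrative finite truncation of \(\mathbb Z^2\), not the environment process on \(\Omega \) itself):

```python
import numpy as np

# Generator of the random walk among random conductances on the torus (Z/LZ)^2:
# the rate across the bond {x, x + e_i} is a_ii(x), in both directions.
rng = np.random.default_rng(1)
lam, L = 0.2, 5
a = rng.uniform(lam, 1.0, size=(2, L, L))   # a[i][x] = rate across {x, x + e_i}

N = L * L
idx = np.arange(N).reshape(L, L)
Q = np.zeros((N, N))
for i in (0, 1):
    r = idx.ravel()
    xp = np.roll(idx, -1, axis=i).ravel()   # x + e_i on the torus
    rate = a[i].ravel()
    Q[r, xp] += rate                        # jump x -> x + e_i
    Q[xp, r] += rate                        # jump x + e_i -> x, same rate
Q -= np.diag(Q.sum(axis=1))                 # generator: rows sum to zero

print(np.max(np.abs(Q - Q.T)), np.max(np.abs(Q.sum(axis=1))))
```

Symmetry of \(Q\) is the finite-dimensional counterpart of the self-adjointness of \(D^*\varvec{a}(0)D\) on \(L^2(\Omega )\).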

The solution of (7) is unique and given by \(u(t)=\exp (-tD^*\varvec{a}(0)D)\zeta \), where \(t\mapsto \exp (-tD^*\varvec{a}(0)D)\) denotes the group associated with \(D^*\varvec{a}(0)D:L^2(\Omega )\rightarrow L^2(\Omega )\). The parabolic equation (7) and the elliptic equation (4) are related by means of the formal identity

$$\begin{aligned} \phi =\int _0^\infty u(t)\,dt, \end{aligned}$$

where for \(u(t)\) we choose the initial condition \(\zeta =-D^*\varvec{a}(0)e\). This formal relation becomes rigorous (and yields estimates on \(\phi \)) as soon as we have suitable decay estimates on \(u(t)\) in \(t\).
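In finite dimensions, the identity \(\phi =\int _0^\infty u(t)\,dt\) is elementary linear algebra: for a symmetric positive semidefinite matrix \(A\) and a right-hand side \(b\) orthogonal to the kernel of \(A\), the time integral of \(\exp (-tA)b\) recovers the mean-zero solution of \(A\phi =b\). A numerical sketch with a weighted graph Laplacian (an illustrative stand-in for the degenerate operator \(D^*\varvec{a}(0)D\), whose kernel likewise consists of the constants):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
W = rng.uniform(0.2, 1.0, size=(n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
A = np.diag(W.sum(axis=1)) - W                 # graph Laplacian: symmetric PSD
b = rng.standard_normal(n)
b -= b.mean()                                  # "divergence form": b ⟂ ker A

phi = np.linalg.pinv(A) @ b                    # mean-zero solution of A phi = b

# u(t) = exp(-tA) b via the eigendecomposition of A, integrated by the
# trapezoid rule; the spectral gap makes the tail beyond t = 60 negligible.
w, V = np.linalg.eigh(A)
c = V.T @ b
ts = np.linspace(0.0, 60.0, 6001)
us = (np.exp(-ts[:, None] * w[None, :]) * c) @ V.T
integral = ((us[:-1] + us[1:]) / 2 * np.diff(ts)[:, None]).sum(axis=0)

print(np.max(np.abs(integral - phi)))          # small
```

The rate at which the tail of the integral becomes negligible is governed by the spectral gap of \(A\); quantifying the analogous decay for \(D^*\varvec{a}(0)D\) is exactly the role of (SG) in Theorem 1.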

Ergodicity of the ensemble implies that \(\langle u^2(t) \rangle \rightarrow 0\) as \(t\uparrow \infty \). (Indeed, since \(D^*\varvec{a}(0)D\) is non-negative, \(u(t)\) converges to the \(L^2(\Omega )\)-orthogonal projection of \(u(t=0)\) onto the kernel of \(D^*\varvec{a}(0)D\), which by ergodicity only consists of the constant functions; this projection is the constant \(\langle u(t=0) \rangle \), which vanishes for initial conditions in divergence form.) While ergodicity does not yield any rate of convergence, we prove that (SG) yields an algebraic rate, and thus quantifies ergodicity. As a main result we prove an optimal decay estimate for \(\exp (-tD^*\varvec{a}(0)D)\) with initial conditions in divergence form.

Theorem 1

Assume that

$$\begin{aligned} \langle \cdot \rangle \text { is stationary}, L\hbox {-periodic and satisfies } \hbox {SG}_{L}(\rho )\hbox { (see Definition~1 below).} \end{aligned}$$
(8)

Then there exists an exponent \(1\le p_0<\infty \) that only depends on the constant of ellipticity \(\lambda \) and the dimension \(d\ge 2\) such that the following statement holds for every \(p\in (p_0,\infty )\):

For \(\xi \in C_b(\Omega )^d\) and \(t\ge 0\) define

$$\begin{aligned} u(t):=\exp (-tD^*\varvec{a}(0)D)D^*\xi . \end{aligned}$$

Then we have

$$\begin{aligned} \left\langle |u(t)|^{2p} \right\rangle ^{\frac{1}{2p}}\lesssim (t+1)^{-(\frac{d}{4}+\frac{1}{2})}\Vert \partial \xi \Vert _{\ell ^1_yL^{2p}_{\langle \cdot \rangle }}, \end{aligned}$$
(9)

where

$$\begin{aligned} \Vert \partial \xi \Vert _{\ell ^1_yL^{2p}_{\langle \cdot \rangle }}:= \sum _{y\in ([0,L)\cap \mathbb Z)^d}\,\left\langle \left| \tfrac{\partial \xi }{\partial _L y}\right| ^{2p} \right\rangle ^{\frac{1}{2p}}, \end{aligned}$$
(10)

and \(\lesssim \) means \(\le \) up to a multiplicative constant that only depends on the integrability exponent \(p\), the spectral gap constant \(\rho \), the constant of ellipticity \(\lambda \), and the dimension \(d\).

A similar result that is formally obtained by letting \(L\uparrow \infty \) holds for the whole-space case, under the assumption

$$\begin{aligned} \langle \cdot \rangle \text { is stationary and satisfies } \hbox {SG}_{\infty }(\rho )\hbox { (see Definition~1 below).} \end{aligned}$$
(11)

The exponent \(\tfrac{d}{4}{+}\tfrac{1}{2}\) in (9) is optimal, since in the case of vanishing ellipticity contrast (that is, when \(\varvec{a}\) is a perturbation of the identity), one may, to first order, replace the elliptic operator by the discrete Laplacian, in which case

$$\begin{aligned} \langle |\exp (-tD^*D)D^*\xi |^2 \rangle ^{\frac{1}{2}}\sim \left( \sup _{x\in \mathbb Z^d}|\nabla ^2G(2t,x)|\right) ^{\frac{1}{2}}\Vert \partial \xi \Vert _{\ell ^1_yL^2_{\langle \cdot \rangle }}\sim (t+1)^{-(\frac{d}{4}+\frac{1}{2})}\Vert \partial \xi \Vert _{\ell ^1_yL^2_{\langle \cdot \rangle }} \end{aligned}$$

for \(\langle \cdot \rangle \) i.i.d. and initial values \(\xi (\varvec{a})\) that depend on \(\varvec{a}\) only through \(\varvec{a}(0)\). Here \(\nabla ^2G\) denotes the Hessian of the discrete heat kernel on \(\mathbb Z^d\).
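The claimed scaling can be checked numerically in \(d=1\) (for simplicity; the statement concerns general \(d\)), where the heat kernel of the discrete Laplacian is \(G(t,x)=e^{-2t}I_x(2t)\) with \(I_x\) a modified Bessel function: the second difference of \(G\) at the origin decays like \(t^{-(\frac{d}{2}+1)}=t^{-3/2}\), so quadrupling \(t\) shrinks it by a factor close to \(4^{3/2}=8\).

```python
import numpy as np
from scipy.special import ive

# Heat kernel of the 1-d discrete Laplacian on Z, via the exponentially
# scaled modified Bessel function: ive(v, z) = I_v(z) * exp(-z) for z > 0,
# so that G(t, x) = e^{-2t} I_x(2t) = ive(x, 2t).
def G(t, x):
    return ive(abs(x), 2.0 * t)

def hessian_at_origin(t):
    # second difference of G(t, .) at the origin, |nabla^2 G(t, 0)| in d = 1
    return abs(G(t, 1) + G(t, -1) - 2.0 * G(t, 0))

t = 200.0
ratio = hessian_at_origin(4 * t) / hessian_at_origin(t)
print(ratio)   # close to 4**(-1.5) = 0.125
```

Taking square roots, this is the decay exponent \(\tfrac{d}{4}+\tfrac{1}{2}=\tfrac{3}{4}\) of (9) in \(d=1\); the document's setting assumes \(d\ge 2\), where the same computation applies coordinatewise.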

As a corollary of Theorem 1 we obtain in Sect. 4 several optimal estimates for the stationary corrector, the modified corrector and the periodic approximation of the corrector. In addition, we obtain optimal estimates of the error introduced by adding the \(0\)th order term in the modified corrector equation. Based on these estimates and spectral theory we derive error estimates for the approximation of the homogenized coefficients by periodization, which we shall present now.

Error estimate for approximations of \(\varvec{a}_{\mathrm{hom}}\). In this paper we provide an optimal error estimate for the approximation by periodization of the homogenized coefficients associated with an i.i.d. ensemble. Approximation by periodization is widely used in the mechanics community and is also called the “representative volume element method”, see [34] for a qualitative analysis in the stationary ergodic case. The homogenized coefficients \(\varvec{a}_{\mathrm{hom}}\) associated with an ergodic ensemble \(\langle \cdot \rangle \) are approximated by the homogenized coefficients \(\varvec{a}_{\mathrm{hom},L}\) associated with a stationary \(L\)-periodic ensemble \(\langle \cdot \rangle _L\), which we think of as having the same statistical specifications as \(\langle \cdot \rangle \). Since \(\varvec{a}_{\mathrm{hom},L}\) is still not computable in practice, one replaces \(\varvec{a}_{\mathrm{hom},L}\) by a spatial average \(\varvec{a}_{\mathrm{av},L}(\varvec{a})\) where \(\varvec{a}\) is distributed according to \(\langle \cdot \rangle _L\) and \(\varvec{a}_{\mathrm{av},L}:\Omega \rightarrow \mathbb R^{d\times d}_\mathrm{sym}\) is given by

$$\begin{aligned} e\cdot \varvec{a}_{\mathrm{av},L}(\varvec{a})e:=\min \limits _{\varphi \, L\hbox {-periodic}}\frac{1}{L^d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}(\nabla \varphi (x)+e)\cdot \varvec{a}(x)(\nabla \varphi (x)+e). \end{aligned}$$
(12)

As an application of our method we quantify the speed of convergence when \(\langle \cdot \rangle \) is an i.i.d. ensemble. In that case it is natural to define \(\langle \cdot \rangle _L\) as the \(L\)-periodic i.i.d. ensemble associated with \(\langle \cdot \rangle \) (cf. Definition 3). We estimate the mean square error \(\langle |\varvec{a}_{\mathrm{av},L}-\varvec{a}_{\mathrm{hom}}|^2 \rangle _L\), which naturally splits into two parts:

$$\begin{aligned} \left\langle |\varvec{a}_{\mathrm{av},L}-\varvec{a}_\mathrm{hom}|^2 \right\rangle _L&= \mathrm{var}_L[\varvec{a}_{\mathrm{av},L}]+|\langle \varvec{a}_{\mathrm{av},L} \rangle _L-\varvec{a}_{\mathrm{hom}}|^2\nonumber \\&= \left\langle |\varvec{a}_{\mathrm{av},L}-\langle \varvec{a}_{\mathrm{av},L} \rangle _L|^2 \right\rangle _L+|\varvec{a}_{\mathrm{hom},L}-\varvec{a}_{\mathrm{hom}}|^2 \nonumber \\&= (\text {random error})^2 + (\text {systematic error})^2, \end{aligned}$$
(13)

where \(\mathrm{var}_L\) denotes the variance w.r.t. the ensemble \(\langle \cdot \rangle _L\). Following the terminology of Gloria and Otto [21], we call the (square roots of the) first and second terms of the RHS the random error and systematic error, respectively. The random error is due to the lack of ergodicity of the \(L\)-periodic ensemble and measures the fluctuation of \(\varvec{a}_{\mathrm{av},L}\) around its average. Although \(\varvec{a}_{\mathrm{av},L}\) is a spatial average of highly correlated random variables, we obtain in Proposition 2 (under the weaker assumption that \(\langle \cdot \rangle _L\) satisfies an \(L\)-periodic version of (SG)) that the random error decays with the rate of the central limit theorem, i.e.

$$\begin{aligned} \mathrm{var}_L[\varvec{a}_{\mathrm{av},L}]\lesssim L^{-d}. \end{aligned}$$
(14)
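The splitting (13) is an instance of the elementary identity \(\mathrm{MSE}=\mathrm{variance}+\mathrm{bias}^2\), and averaging over \(N\) independent realizations contracts the variance part by \(\tfrac{1}{N}\) while leaving the bias untouched. A toy verification by exact enumeration (the three-point distribution and the target value are invented stand-ins for the law of the scalar approximation and the exact homogenized value):

```python
import itertools

# Invented three-point distribution standing in for the law of a scalar
# approximation, and an invented target (so that the bias is nonzero).
vals  = (0.2, 0.5, 1.1)
probs = (0.3, 0.5, 0.2)
target = 0.55
N = 3                    # number of independent realizations to average

EX  = sum(p * v for p, v in zip(probs, vals))
var = sum(p * (v - EX) ** 2 for p, v in zip(probs, vals))

# exact mean-square error of the empirical average over all 3^N outcomes
mse = 0.0
for outcome in itertools.product(range(len(vals)), repeat=N):
    weight = 1.0
    for k in outcome:
        weight *= probs[k]
    avg = sum(vals[k] for k in outcome) / N
    mse += weight * (avg - target) ** 2

print(mse, var / N + (EX - target) ** 2)   # equal: MSE = var/N + bias^2
```

In the setting of (13) the nontrivial work is, of course, in estimating the two terms themselves: (14) for the variance part and the bound on \(|\varvec{a}_{\mathrm{hom},L}-\varvec{a}_{\mathrm{hom}}|\) for the bias part.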

The systematic error is of a different nature: By replacing the infinite ensemble \(\langle \cdot \rangle \) by the \(L\)-periodic ensemble \(\langle \cdot \rangle _L\), we introduce artificial long-range correlations (in order to enforce the periodicity of \(\langle \cdot \rangle _L\)). This produces the systematic error. In contrast to the systematic error, the effect of the random error can be reduced by empirical averaging: For \(N\in \mathbb N\) define the random matrix

$$\begin{aligned} \varvec{a}_{\mathrm{av},L,N}:=\frac{1}{N}\sum _{i=1}^N \varvec{a}_{\mathrm{av},L}(\varvec{a}^i), \end{aligned}$$
(15)

where \(\varvec{a}^1,\ldots ,\varvec{a}^N\) denote \(N\) independent realizations distributed according to \(\langle \cdot \rangle _L\). This is of particular interest since the systematic error is indeed much smaller than the random error. With the help of Theorem 1 we obtain the following optimal error estimate:

Theorem 2

Let \(d\ge 2\), \(L,N\in \mathbb N\). Let \(\langle \cdot \rangle \) be an infinite i.i.d. ensemble and let \(\langle \cdot \rangle _L\) be the associated \(L\)-periodic i.i.d. ensemble. Then we have

$$\begin{aligned} \left\langle |\varvec{a}_{\mathrm{av},L,N}-\varvec{a}_{\mathrm{hom}}|^2 \right\rangle _L^{\frac{1}{2}} \lesssim \frac{1}{\sqrt{N}}L^{-\frac{d}{2}}+L^{-d}\ln ^{d}L, \end{aligned}$$

where \(\lesssim \) means \(\le \) up to a constant that only depends on \(\lambda \) and \(d\).

The proof of Theorem 2 is given in Sect. 5 and uses new estimates on the modified corrector equation. For a discussion of the literature on error estimates for the approximation of homogenized coefficients (and in particular [5, 12], and [40]), we refer the reader to [18] and [21, Section 1.2]. For a review on applications of the approximation by periodization, see [24]. Theorem 2 is the first quantitative result on this method in a stochastic (yet discrete and scalar) setting. For associated numerical results, we refer the reader to [13].

Estimates of the gradient of the parabolic Green’s function. An important observation in the proof of Theorem 1 is that the vertical derivatives \(\frac{\partial u(t)}{\partial y}\) (with \(u(t)\) the solution of the parabolic problem (7)) can be characterized as the solution of a parabolic equation whose RHS involves \(\nabla u(t)\). Hence, it can be represented using the parabolic Green’s function \(G\) associated with the parabolic operator \(\partial _t+\nabla ^*\varvec{a}\nabla \). The resulting Duhamel formula is a nonlinear identity that involves the gradient of the parabolic Green’s function. The quantitative statements we are interested in require estimates on the gradient of the parabolic Green’s function \(G(t,\varvec{a},x,y)\). For our purposes we need estimates that are uniform in \(\varvec{a}\), but nevertheless optimal in terms of the exponent in \(t\). By optimal we mean that the exponent should be identical to the one of the constant-coefficient Green’s function. As a consequence, these estimates cannot be pointwise in \(x\); they must be integral estimates. In particular we shall need to capture the decay in \(x\) in a better way than the following estimate does: For all \(t\ge 0\)

$$\begin{aligned} \left( \sum _{z\in \mathbb Z^d} |\nabla G(t,\varvec{a},z,0)|^2\right) ^{\tfrac{1}{2}} \lesssim \,(t+1)^{-(\tfrac{d}{4}+\frac{1}{2})}. \end{aligned}$$

We do this by establishing weighted integral estimates with weight functions

$$\begin{aligned} \omega (t,x):=\left( \frac{|x|^2}{t+1}+1\right) ^{\frac{1}{2}},\qquad \omega _L(t,x):=\left( \frac{\mathrm{dist}^2(x,L\mathbb Z^d)}{t+1}+1\right) ^{\frac{1}{2}}, \end{aligned}$$
(16)

where \(\mathrm{dist}(x,L\mathbb Z^d):=\min _{z\in \mathbb Z^d}|x-Lz|\) denotes the distance to \(0\) on the \(L\)-torus. Finally, in order to treat the nonlinear term, we shall need a slightly stronger estimate than the square integral estimate. In Sect. 6 we establish the following estimates for the discrete, whole-space parabolic Green’s function \(G\) and the discrete, \(L\)-periodic parabolic Green’s function \(G_L\).

Theorem 3

There exists an exponent \(q_0>1\) (only depending on \(\lambda \) and \(d\)) such that for all \(1\le q < q_0\) and all \(\alpha <\infty \) we have: For all \(L\in \mathbb N\) and \(\varvec{a}\in \Omega _L\)

$$\begin{aligned}&\left( \sum _{x\in ([0,L)\cap \mathbb Z)^d}\left( \omega ^{\alpha }_L(t,x-y)|\nabla _x G_L(t,\varvec{a},x,y)|\right) ^{2q}\right) ^\frac{1}{2q}\\&\quad \lesssim (t+1)^{-(\frac{d}{2}+\frac{1}{2})+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{t}{L^2}\right) , \end{aligned}$$

where the multiplicative constant only depends on \(\alpha \), \(\lambda \), \(d\), and \(q\), and \(c_0>0\) denotes a constant that only depends on \(\lambda \) and \(d\).

A similar result that is formally obtained by letting \(L\uparrow \infty \) holds for the whole-space Green’s function.

The estimates of Theorem 3 are optimal in terms of scaling. Indeed, the exponents are the same for the constant coefficient case (i.e. \(\varvec{a}=\varvec{id}\)) as can be seen by a scaling argument in the continuum case (i.e. \(\mathbb Z^d\) replaced by \(\mathbb R^d\)). In the proof we make extensive use of the elliptic and parabolic regularity theory by Nash [33] and Meyers [29]. The methods we use rely in particular on the maximum principle, which is the reason why we restrict ourselves, in our discrete setting, to the case where \(\varvec{a}\) is diagonal.

Structure of the paper. In Sect. 2 we introduce the general framework, in particular a discrete differential calculus for functions on the lattice \(\mathbb Z^d\), a “horizontal” differential calculus for random variables, the description of random coefficients via the ensemble, and our main assumption on the ensemble, namely the Spectral Gap Estimate (SG). In Sect. 3 we prove Theorem 1, which is the main result of the paper. In Sect. 4 we apply Theorem 1 to the corrector equation in stochastic homogenization. In Sect. 5 we prove Theorem 2, which establishes an optimal error estimate for the approximation of the homogenized coefficients by periodization. Finally, in Sect. 6 we prove the quenched estimates on the gradient of the parabolic Green’s function of Theorem 3.

Relation to previous works. The first paper on quantitative stochastic homogenization is due to Yurinskii [40]. Yurinskii considered the dependence of the gradient of the modified corrector \(D\phi _\mu \) on \(\mu \). For dimension \(d>2\) and under mixing assumptions on the statistics, he obtained non-optimal decay estimates in \(\mu \) on the \(L^2(\Omega )\)-distance between the gradient \(D\phi _\mu \) of the modified corrector and that of the original corrector \(\phi \). As a central ingredient, Yurinskii appeals to optimal deterministic estimates on the parabolic variable-coefficients Green’s function.

The idea of combining stochastic homogenization methods with statistical mechanics methods naturally arises in the study of scaling limits of Gradient Gibbs Measures. Gradient Gibbs Measures can be seen as a model for thermally fluctuating interfaces, as introduced in the mathematical literature by Funaki and Spohn [16], see [15] for a review. Naddaf and Spencer [31] were the first to combine all three concepts of (discrete) spatial, horizontal, and vertical derivatives. They were also the first to use methods of statistical mechanics in stochastic homogenization, cf. the very inspiring unpublished preprint [32]. The main ingredients in [32] are deterministic estimates on the gradient of the elliptic variable-coefficients Green’s function—as opposed to the estimates on the parabolic Green’s function itself by Yurinskii. Implicitly, Naddaf and Spencer obtain deterministic \(\ell ^{2q}\)-estimates (for some \(q\) slightly larger than \(1\)) via Meyers’ argument. They use them to establish a variance estimate in the spirit of (14) in the case of small ellipticity contrast (that is, when the Meyers exponent satisfies \(q\ge 2\)). This approach has been further developed by Conlon and Naddaf [6] to obtain estimates on the annealed elliptic and parabolic Green’s functions (that is, on the expectation of these Green’s functions), and by Conlon and Spencer [7] to quantify the homogenization error, for small ellipticity contrast. Some optimal annealed estimates on the first and mixed second derivatives of the Green’s function (that is, pointwise-in-space estimates of moments of order \(2q\) in probability) have also been obtained by Delmotte and Deuschel [9] for any ellipticity contrast (yet only for \(q=1\) for the first derivative, and \(q=\frac{1}{2}\) for the mixed second derivative).

Another source of inspiring ideas are the works on qualitative stochastic homogenization by Papanicolaou and Varadhan [35] and Kipnis and Varadhan [25], where the modified corrector problem (5) is introduced. The modified corrector yields an approximation \(\varvec{a}_{\mathrm{hom},\mu }\) for the homogenized coefficients \(\varvec{a}_\mathrm{hom}\), see Definition 4 below. In [25, 35] the authors appeal to spectral analysis to treat the original corrector problem (2) with the help of its modified version (5). Furthermore, they devise a spectral representation formula for the homogenized coefficients and its approximation \(\varvec{a}_{\mathrm{hom},\mu }\), which we use to develop quantitative estimates on the error \(|\varvec{a}_{\mathrm{hom},\mu }-\varvec{a}_{\mathrm{hom}}|\) via Theorem 1.

In [21] and [22], the first and third authors obtained the first optimal quantitative results on the corrector equation, namely the boundedness of correctors, the optimal estimate of the random error, and optimal bounds on the systematic error \(|\varvec{a}_{\mathrm{hom},\mu }-\varvec{a}_{\mathrm{hom}}|\) for \(d>2\). In these contributions, a key auxiliary result consists of optimal deterministic estimates on the gradient of the variable-coefficients elliptic Green’s function. In [30], Mourrat obtained a suboptimal version of Theorem 1 for \(d\ge 5\) by a different approach. Then, Mourrat and the first author [19] obtained further quantitative results for the systematic error by appealing to the spectral calculus mentioned above. In particular, in [19], motivated by the spectral representation of \(\varvec{a}_\mathrm{hom}\), a different “higher order” approximation scheme by extrapolation for \(\varvec{a}_\mathrm{hom}\) is introduced, and optimal error estimates for the associated systematic error are obtained for dimensions \(2< d\le 6\), still using the deterministic estimates on the gradient of the elliptic Green’s function. We use this scheme in our analysis of the approximation by periodization.

The interest of the present contribution is threefold. First, we obtain optimal results on the corrector and on the systematic error in any dimension (and in particular for \(d=2\) and \(d>6\)), and give the first complete and optimal quantitative analysis of the very popular approximation by periodization. Second, we introduce a unified method which optimally quantifies the ergodicity of the environment as seen from the particle, a result from which all the other estimates follow. Last, the main auxiliary results are new deterministic estimates on the gradient of the parabolic variable-coefficients Green’s function.

This article is the short version of lecture notes [20], which contain in addition detailed proofs of several classical results which are recalled here without a proof.

2 Differential calculus on stationary random fields

In this section, we introduce two differential calculi on stationary random fields, which we call horizontal and vertical.

2.1 Spatial derivatives, stationarity, and horizontal derivatives

Random coefficient fields. We consider linear second-order difference equations with uniformly elliptic, bounded, diagonal random coefficients. We denote the set of admissible coefficient matrices by

$$\begin{aligned} \Omega _0:=\left\{ \mathrm{diag}(a_1,\ldots ,a_d)\in \mathbb R^{d\times d}:\lambda \le a_j\le 1 \text { for }j=1,\ldots ,d\right\} , \end{aligned}$$
(17)

where \(\mathrm{diag}(a_1,\ldots ,a_d)\) is the diagonal, \(d\times d\) matrix with entries \(a_1,\ldots , a_d\), and \(\lambda >0\) is an ellipticity constant which is fixed throughout the paper. We equip \(\Omega _0\) with the usual topology of \(\mathbb R^{d\times d}\). A coefficient field, denoted by \(\varvec{a}\), is a function on \(\mathbb Z^d\) taking values in \(\Omega _0\). We endow \(\Omega =(\Omega _0)^{\mathbb Z^d}\) with the product topology. For \(L\in \mathbb N\) we denote by \(\Omega _L\) the subspace of \(L\)-periodic coefficient fields (i.e. \(\varvec{a}\in \Omega \) with \(\varvec{a}(x+Lz)=\varvec{a}(x)\) for all \(x,z\in \mathbb Z^d\)).

We describe a random coefficient field by equipping \(\Omega \) with a probability measure. Following the convention in statistical mechanics, we also call a probability measure on \(\Omega \) an ensemble and denote the associated ensemble average by \(\langle \cdot \rangle \). If \(\langle \cdot \rangle \) concentrates on \(\Omega _L\) we call it an L-periodic ensemble.

Unless otherwise stated we always assume that \(\langle \cdot \rangle \) is stationary. Let \(T_z:\Omega \rightarrow \Omega ,\;\varvec{a}(\cdot )\mapsto \varvec{a}(\cdot +z)\) denote the shift by \(z\). Then \(\langle \cdot \rangle \) is stationary if and only if \(T_z\) is \(\langle \cdot \rangle \)-preserving for all shifts \(z\in \mathbb Z^d\).

Random variables and stationary random fields. A random variable is a measurable function on \(\Omega \). We denote by \(L^p(\Omega )\), \(1\le p\le \infty \), the usual spaces of random variables with finite \(p\)-th moment. We denote by \(C_b(\Omega )\) the space of bounded continuous functions on \(\Omega \) equipped with the norm \(||\zeta ||_\infty :=\sup _{\varvec{a}\in \Omega }|\zeta (\varvec{a})|<\infty \).

A random field \(\tilde{\zeta }\) is a measurable function on \(\Omega \times \mathbb Z^d\). With any random variable \(\zeta :\Omega \rightarrow \mathbb R\) we associate its \(\langle \cdot \rangle \)-stationary extension \(\overline{\zeta }:\Omega \times \mathbb Z^d\rightarrow \mathbb R\) via \(\overline{\zeta }(\varvec{a},x):=\zeta (\varvec{a}(\cdot +x))\). Conversely, we say that a random field \(\tilde{\zeta }\) is \(\langle \cdot \rangle \)-stationary if there exists a random variable \(\zeta \) with \(\tilde{\zeta }(\varvec{a},\cdot )=\overline{\zeta }(\varvec{a},\cdot )\) \(\langle \cdot \rangle \)-almost surely. Since \(\langle \overline{\zeta }(x) \rangle \) does not depend on \(x\) by stationarity, we simply write \(\langle \overline{\zeta } \rangle \) for the expectation of a stationary field \(\overline{\zeta }\).

Spatial and horizontal derivatives. For scalar fields \(\zeta :\mathbb Z^d\rightarrow \mathbb R\), vector fields \(\xi =(\xi _1,\ldots ,\xi _d):\mathbb Z^d\rightarrow \mathbb R^d\), and all \(i=1,\ldots ,d\), we define the spatial derivatives

$$\begin{aligned} \nabla _i\zeta (x)&:= \zeta (x+e_i)-\zeta (x),\quad \nabla ^*_i\zeta (x):=\zeta (x-e_i)-\zeta (x),\\ \nabla \zeta&:= (\nabla _1\zeta ,\ldots ,\nabla _d\zeta ),\qquad \nabla ^*\xi :=\sum _{i=1}^d\nabla ^*_i\xi _i, \end{aligned}$$

where \(\{e_1,\ldots ,e_d\}\) denotes the canonical basis of \(\mathbb R^d\), \(\nabla \) is the discrete gradient for functions on \(\mathbb Z^d\), and \(-\nabla ^*\) is the discrete divergence for vector fields on \(\mathbb Z^d\). By discreteness \(\nabla _i\) and \(\nabla ^*_i\) are bounded linear operators on \(\ell ^p(\mathbb Z^d)\), \(1\le p\le \infty \). They are formally adjoint, as are \(\nabla \) and \(\nabla ^*\).
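For readers who wish to experiment, the discrete calculus above is easy to reproduce numerically. The following sketch (a periodic box standing in for compactly supported fields on \(\mathbb Z^d\); all helper names are ad hoc) checks the formal adjointness of \(\nabla \) and \(\nabla ^*\):

```python
import numpy as np

# Periodic box as a stand-in for compactly supported fields on Z^d;
# np.roll with shift -1 realizes the shift x -> x + e_i.
def grad_i(zeta, i):       # (nabla_i zeta)(x) = zeta(x+e_i) - zeta(x)
    return np.roll(zeta, -1, axis=i) - zeta

def grad_star_i(xi, i):    # (nabla*_i xi)(x) = xi(x-e_i) - xi(x)
    return np.roll(xi, 1, axis=i) - xi

def div_star(xi):          # nabla* xi = sum_i nabla*_i xi_i
    return sum(grad_star_i(xi[i], i) for i in range(len(xi)))

rng = np.random.default_rng(0)
zeta = rng.standard_normal((8, 8))       # scalar field, d = 2
xi = rng.standard_normal((2, 8, 8))      # vector field

# formal adjointness: sum_x (nabla zeta).xi = sum_x zeta (nabla* xi)
lhs = sum(np.sum(grad_i(zeta, i) * xi[i]) for i in range(2))
rhs = np.sum(zeta * div_star(xi))
print(abs(lhs - rhs) < 1e-10)  # True
```

On \(\ell ^2(\mathbb Z^d)\) the same summation-by-parts computation applies verbatim.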

Next, we recall a similar standard structure for random variables: For scalar random variables \(\zeta :\Omega \rightarrow \mathbb R\), vector-valued random variables \(\xi =(\xi _1,\ldots ,\xi _d):\Omega \rightarrow \mathbb R^d\), and \(i=1,\ldots ,d\), we define the horizontal derivatives

$$\begin{aligned} D_i\zeta (\varvec{a})&:= \zeta (\varvec{a}(\cdot +e_i))-\zeta (\varvec{a}),\quad D^*_i \zeta (\varvec{a}):=\zeta (\varvec{a}(\cdot -e_i))-\zeta (\varvec{a}),\\ D\zeta&:= (D_1\zeta ,\ldots ,D_d\zeta ),\qquad D^* \xi :=\sum _{i=1}^dD^*_i \xi _i. \end{aligned}$$

The horizontal derivatives \(D_i\) and \(D^*_i\) are bounded linear operators on \(L^p(\Omega )\), \(1\le p\le \infty \), and \(C_b(\Omega )\). They are formally adjoint, as are \(D\) and \(D^*\). An elementary but important observation is the following. Let \(\overline{(\cdot )}\) denote the mapping that associates a random variable with its stationary extension, see above. Then

$$\begin{aligned} \nabla _i\overline{\zeta }=\overline{D_i\zeta },\qquad \nabla _i^*\overline{\zeta }=\overline{D_i^*\zeta },\qquad \text {and}\qquad \nabla ^*\varvec{a}\nabla \overline{\zeta }=\overline{D^*\varvec{a}(0)D\zeta }. \end{aligned}$$
(18)

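The first identity in (18) can be made concrete on a toy example: take \(d=1\) with scalar coefficients for simplicity, and the random variable \(\zeta (\varvec{a})=\varvec{a}(0)^2\), whose stationary extension is \(\overline{\zeta }(\varvec{a},x)=\varvec{a}(x)^2\). A sketch (periodic boundary conditions stand in for \(\mathbb Z\); names are ad hoc):

```python
import numpy as np

# Toy check of nabla_1(bar zeta) = bar(D_1 zeta) in d = 1 with scalar
# coefficients; a periodic array stands in for a field on Z.
rng = np.random.default_rng(0)
a = rng.uniform(0.25, 1.0, size=16)            # coefficient field

zeta = lambda field: field[0] ** 2             # a random variable zeta(a)
shift = lambda field, z: np.roll(field, -z)    # (T_z a)(x) = a(x+z)

ext = np.array([zeta(shift(a, x)) for x in range(16)])   # stationary extension
lhs = np.roll(ext, -1) - ext                   # nabla_1 applied to bar(zeta)
D1 = lambda field: zeta(shift(field, 1)) - zeta(field)   # D_1 zeta
rhs = np.array([D1(shift(a, x)) for x in range(16)])     # bar(D_1 zeta)
print(np.allclose(lhs, rhs))  # True
```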
We use \(\lesssim \) (resp. \(\gtrsim \)) for \(\le \) (resp. \(\ge \)) up to a multiplicative constant whose dependence on the different parameters is made explicit in each statement.

2.2 Spectral gap on Glauber dynamics and vertical derivatives

Definition 1

(Spectral gap and vertical derivatives) We say that \(\langle \cdot \rangle \) satisfies a spectral gap for the Glauber dynamics with constant \(\rho >0\), in short \(SG_{\infty }(\rho )\), if for all \(\zeta \in L^2(\Omega )\) with \(\langle \zeta \rangle =0\) we have

$$\begin{aligned} \left\langle \zeta ^2 \right\rangle \le \frac{1}{\rho }\sum _{y\in \mathbb Z^d}\left\langle \left( \tfrac{\partial \zeta }{\partial y}\right) ^2 \right\rangle \;, \end{aligned}$$
(19)

where the vertical derivative \(\tfrac{\partial \zeta }{\partial y}\) of \(\zeta \) w.r.t. \(y\) is defined as follows. For all \(\zeta \in L^2(\Omega )\) and \(y\in \mathbb Z^d\),

$$\begin{aligned} \frac{\partial \zeta }{\partial y}:=\zeta -\langle \zeta \rangle _y,\qquad \text{ where } \ \langle \zeta \rangle _y:=\left\langle \zeta \Big | \{\varvec{a}(x)\}_{x\in \mathbb Z^d\setminus \{y\}} \right\rangle , \end{aligned}$$

i.e. \(\langle \zeta \rangle _y\) is the conditional expectation of \(\zeta \) given \(\varvec{a}(x)\) for all \(x\in \mathbb Z^d\setminus \{y\}\). More analytically, \(\langle \zeta \rangle _y\) is the \(L^2(\Omega )\)-orthogonal projection of \(\zeta \) on the space of functions of \(\varvec{a}\) that do not depend on \(\varvec{a}(y)\).

The \(L\)-periodic version of (SG) is the following:

Definition 2

(Spectral gap and vertical derivatives, periodic case) We say that \(\langle \cdot \rangle \) satisfies a spectral gap with constant \(\rho >0\) on the torus of size \(L\in \mathbb N\), in short \(\hbox {SG}_{L}(\rho )\), if for all \(\zeta \in L^2(\Omega )\) with \(\langle \zeta \rangle =0\) we have

$$\begin{aligned} \left\langle \zeta ^2 \right\rangle \le \frac{1}{\rho }\sum _{y\in ([0,L)\cap \mathbb Z)^d}\left\langle \left( \tfrac{\partial \zeta }{\partial _L y}\right) ^2 \right\rangle \;, \end{aligned}$$

where the (\(L\)-periodic) vertical derivative \(\tfrac{\partial \zeta }{\partial _L y}\) of \(\zeta \) w.r.t. \(y\) is defined as follows. For all \(\zeta \in L^2(\Omega )\) and \(y\in \mathbb Z^d\),

$$\begin{aligned} \frac{\partial \zeta }{\partial _L y}:=\zeta -\langle \zeta \rangle _{L,y}, \quad \text{ where } \ \langle \zeta \rangle _{L,y}:=\left\langle \zeta \Big | \{\varvec{a}(x)\}_{x\in \mathbb Z^d\setminus \{y+L\mathbb Z^d\}} \right\rangle , \end{aligned}$$

i.e. \(\langle \zeta \rangle _{L,y}\) is the conditional expectation of \(\zeta \) given \(\varvec{a}(x)\) for all \(x\in \mathbb Z^d\setminus \{y+L\mathbb Z^d\}\).

The vertical derivative does not commute with the shift \(T_x:\Omega \rightarrow \Omega \), \(\varvec{a}(\cdot )\mapsto \varvec{a}(\cdot +x)\). We have \(\left\langle \zeta \circ T_x \right\rangle _y=\left\langle \zeta \right\rangle _{y-x}\circ T_x\), which in terms of the stationary extension \(\overline{(\cdot )}\) reads

$$\begin{aligned} \left\langle \overline{\zeta }(x) \right\rangle _y=\overline{\left\langle \zeta \right\rangle _{y-x}}(x). \end{aligned}$$
(20)

Hence,

$$\begin{aligned} \tfrac{\partial \overline{\zeta }(x)}{\partial y}=\overline{\left( \tfrac{\partial \zeta }{\partial (y-x)}\right) }(x). \end{aligned}$$
(21)

Both identities also hold in the \(L\)-periodic case.

Remark 1

Let us comment on the naming of the derivatives \(D\) and \(\frac{\partial }{\partial y}\). A coefficient field \(\varvec{a}\in \Omega \) might be viewed as a “surface” \((x,\varvec{a}(x))\) in the space \(\mathbb Z^d\times \Omega _0\). We call the directions associated with \(\mathbb Z^d\) horizontal and the directions associated with \(\Omega _0\) vertical. The horizontal derivative \(D\zeta \) monitors the sensitivity of \(\zeta \) w.r.t. horizontal shifts of the coefficient field. In contrast, the vertical derivative \(\frac{\partial }{\partial y}\) is associated with variations of the coefficient field in vertical directions. It can be seen as a discrete version of the classical partial derivatives \(\{\tfrac{\partial }{\partial \varvec{a}_{ii}(y)}\}_{i=1,\ldots ,d}\), and monitors how sensitively a random variable \(\zeta \) depends on the value of the coefficient field \(\{\mathbb Z^d\ni y\mapsto \varvec{a}(y)\}\) at site \(y\).

Remark 2

From the functional analytic point of view, (SG) is a Poincaré inequality on \(L^2(\Omega )\) for the vertical derivative \(\frac{\partial }{\partial y}\), which defines a bounded and symmetric operator \(\tfrac{\partial }{\partial y}: L^2(\Omega )\rightarrow L^2(\Omega )\). Since each site \(y\in \mathbb Z^d\) is endowed with a vertical derivative, the number of degrees of freedom that the vertical gradient \(\{\frac{\partial }{\partial y}\}_{y\in \mathbb Z^d}\) controls matches the degrees of freedom of the underlying probability space \(\Omega =\Omega _0^{\mathbb Z^d}\). One can show that \(SG_{\infty }(\rho ){}\) implies that the underlying ensemble is ergodic (so that the qualitative stochastic homogenization theory holds).

\(SG_{\infty }(\rho ){}\) is satisfied in the case of i.i.d. coefficients:

Definition 3

(i.i.d. ensemble) Let \(\beta \) be a probability measure on \(\Omega _0\). The infinite i.i.d. ensemble associated with \(\beta \) is defined as the unique ensemble with the property that the coordinate projections \(\Omega \ni \varvec{a}\mapsto \varvec{a}(x)\in \Omega _0\), \(x\in \mathbb Z^d\), are independent and identically distributed with \(\varvec{a}(0)\sim \beta \). Likewise, we define for \(L\in \mathbb N\) the \(L\)-periodic i.i.d. ensemble associated with \(\beta \) as the unique, \(L\)-periodic ensemble with the property that the coordinate projections \(\Omega \ni \varvec{a}\mapsto \varvec{a}(x)\in \Omega _0\), \(x\in ([0,L)\cap \mathbb Z)^d\), are independent and identically distributed with \(\varvec{a}(0)\sim \beta \).

Lemma 1

Every infinite (resp. \(L\)-periodic) i.i.d. ensemble satisfies \(SG_{\infty }(\rho ){}\) (resp. \(\hbox {SG}_{L}(\rho ){}\)) with constant \(\rho =1\).

The proof, which relies on the tensorization principle, is standard (cf. [20]) and is omitted here.
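Lemma 1 can also be corroborated by brute force: for a fair two-point distribution \(\beta \) on \(\{\lambda ,1\}\) and a deliberately nonlinear (and entirely ad hoc) random variable \(\zeta \) depending on \(N\) sites, enumerating all \(2^N\) configurations verifies (19) with \(\rho =1\):

```python
import itertools
import numpy as np

# Exact check of SG(1) for a fair i.i.d. two-point field on N sites
# (values lam or 1): enumerate all 2^N equally likely configurations.
lam, N = 0.25, 8
configs = np.array(list(itertools.product([lam, 1.0], repeat=N)))  # (2^N, N)
f = lambda c: c.max(axis=1) * c.mean(axis=1)   # an ad-hoc nonlinear zeta
zeta = f(configs)

var = zeta.var()                               # <zeta^2> - <zeta>^2

rhs = 0.0
for y in range(N):
    flipped = configs.copy()
    flipped[:, y] = lam + 1.0 - flipped[:, y]  # swap lam <-> 1 at site y
    cond = 0.5 * (zeta + f(flipped))           # <zeta>_y (fair conditional law)
    rhs += np.mean((zeta - cond) ** 2)         # sum_y <(d zeta / d y)^2>

print(var <= rhs + 1e-12)  # True
```

The inequality with \(\rho =1\) is exactly the Efron–Stein inequality obtained by tensorization.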

In the statistical mechanics literature, in the context of Ising models, there are several criteria for the validity of the Spectral Gap estimate. In their seminal contributions [10, 11], Dobrushin and Shlosman (DS) introduced ten equivalent conditions that ensure the existence and analyticity of Gibbs fields on \(\mathbb Z^d\). It was later proved that the (DS) mixing conditions are in addition equivalent to the validity of a logarithmic Sobolev inequality (which implies (SG)) for discrete spin spaces [38, Theorem 1.8] and for compact spin spaces [37, Corollary 3.19] and [36, Theorem 1.2]. The most suitable form of the (DS) mixing conditions in our context is given by [38, DSM] (see also [37, Remark 3.23]) and holds, roughly speaking, provided correlations are integrable. However, the appropriate measure of independence of \(\varvec{a}(x)\) from \(\varvec{a}(y)\) for \(|y-x|\gg 1\) has to be integrable in \(y\) under conditioning on \(\{\varvec{a}(z)\}_{z\in S}\), uniformly in the subset \(S\) of the lattice. It is this uniformity condition that puts the (DS) criterion in a different class than the usual mixing conditions in the quantitative theory of stochastic homogenization (cf. the uniformly strong mixing condition used in [40, (2.1)] and [2, Definition 2.1]). We do not know whether the condition of integrable correlations on the level of the usual mixing conditions is sufficient for our results.

Roughly speaking, in particular disregarding the above-mentioned difference between these mixing conditions, it is clear that the condition of integrability of correlations is necessary for some of our results, as we shall explain now. Consider for instance Proposition 2. Suppose that we are in the case of small ellipticity contrast, i.e. \(1-\lambda \ll 1\). In this situation, an expansion of the definition (12) of the approximate homogenized coefficient in \(1-\lambda \) reveals that to leading order, \(\varvec{a}_{\mathrm{av},L}(\varvec{a})\) is a simple spatial average

$$\begin{aligned} \varvec{a}_{\mathrm{av},L}(\varvec{a}) \approx L^{-d}\sum _{x\in ([0,L)\cap \mathbb {Z})^d}\varvec{a}(x). \end{aligned}$$

This approximate representation indeed implies that the variance of \(\varvec{a}_{\mathrm{av},L}(\varvec{a})\) only scales as \(L^{-d}\) if the covariances \(\langle (\varvec{a}(x)-\langle \varvec{a}\rangle _L)(\varvec{a}(y)-\langle \varvec{a}\rangle _L)\rangle _L =\langle (\varvec{a}(x-y)-\langle \varvec{a}\rangle _L)(\varvec{a}(0)-\langle \varvec{a}\rangle _L)\rangle _L\) are integrable in \(x-y\) (i.e. summable over the periodic box with a sum that stays finite as \(L\uparrow \infty \)). A main contribution of this paper is to show that this scaling is preserved in the case of arbitrary ellipticity contrast \(\lambda >0\), when \(\varvec{a}_{\mathrm{av},L}(\varvec{a})\) is a highly nonlinear function of \(\varvec{a}\).
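The \(L^{-d}\) scaling of the variance of a spatial average is easy to observe numerically. In \(d=1\), for a moving-average field with finite-range (hence integrable) correlations on a torus, a Monte Carlo sketch (all parameters ad hoc) recovers it:

```python
import numpy as np

# Variance of the spatial average of a correlated L-periodic field
# scales as 1/L; here a(x) = (e(x)+e(x+1))/2 with e i.i.d., checked
# by Monte Carlo over M independent samples.
rng = np.random.default_rng(2)
L, M = 64, 20000
e = rng.uniform(0.25, 1.0, size=(M, L))
a = 0.5 * (e + np.roll(e, -1, axis=1))   # L-periodic, correlated field
avg = a.mean(axis=1)                      # spatial average, M samples
# On the torus each e(x) enters the average with total weight 1/L,
# so Var(avg) = Var(e(0))/L exactly; the estimate should match it.
print(np.isclose(L * avg.var(), e.var(), rtol=0.2))  # True (within MC error)
```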

3 Decay of the variable-coefficient semigroup

This section is devoted to the proof of Theorem 1. The proof essentially relies on the vertical differential calculus and the optimal decay estimates of the parabolic Green’s function from Theorem 3. In order to exploit the vertical differential calculus on the solution \(u\) of (7), we shall work with its stationary extension \(\overline{u}\), as explained in Sect. 3.1. Section 3.2 is then devoted to the proof of Theorem 1.

3.1 The parabolic equation and Green’s function, and Duhamel’s formula

Solutions of (7)—the parabolic problem in the probability space—are characterized by a corresponding parabolic equation in the physical space \(\mathbb Z^d\), and thus admit a representation via the parabolic Green’s function on \(\mathbb Z^d\). To make that precise, recall that the operator \(X\ni \zeta \mapsto D^*\varvec{a}(0)D\zeta \in X\) is bounded and linear with \(X\) denoting any of the spaces \(L^p(\Omega )\), \(1\le p\le \infty \), or \(C_b(\Omega )\). Hence, by standard semigroup theory, for all \(\zeta \in X\) the function \(u(t):=\exp (-tD^*\varvec{a}(0)D)\zeta \) is a \(C^\infty (\mathbb R,X)\)-solution of (7). Thanks to (18) it is elementary to see that for all \(\zeta \in C_b(\Omega )\), the stationary extension \(\overline{u}\) of \(u\) solves (for all \(\varvec{a}\in \Omega \)) the following parabolic equation in physical space:

$$\begin{aligned} \left\{ \begin{aligned} \partial _t\overline{u}(t,x)+\nabla ^*\varvec{a}(x)\nabla \overline{u}(t,x)&= g(t,x)&\qquad&\text {for all } \ t>0, x\in \mathbb Z^d,\\ \overline{u}(t=0,x)&= g_0(x)&\qquad&\text {for all } \ x\in \mathbb Z^d, \end{aligned}\right. \end{aligned}$$
(22)

with \(g(t,x)=0\) and \(g_0(x):=\overline{\zeta }(\varvec{a},x)\). The solution of (22) can be represented via Duhamel’s formula:

$$\begin{aligned} \overline{u}(t,x)=\sum _{y\in \mathbb Z^d}G(t,\varvec{a},x,y)g_0(y)+\int _0^t\sum _{y\in \mathbb Z^d}G(t-s,\varvec{a},x,y)g(s,y)\,ds, \end{aligned}$$
(23)

where \(G\) denotes the parabolic Green’s function and is defined for all \(\varvec{a}\in \Omega \) and \(y\in \mathbb Z^d\) as the function \((t,x)\mapsto G(t,\varvec{a},x,y)\) in \(C^\infty (\mathbb R_+,\ell ^1(\mathbb Z^d))\) given by \(G(t,\varvec{a},\cdot ,y):=\exp (-t\nabla ^*\varvec{a}(\cdot )\nabla )\delta (\cdot -y)\), where \(\delta (x)=1\) if \(x=0\) and \(\delta (x)=0\) if \(x\in \mathbb Z^d\setminus \{0\}\). Likewise, the \(L\)-periodic parabolic Green’s function \(G_L\), \(L\in \mathbb N\), is defined for all \(\varvec{a}\in \Omega _L\) and \(y\in \mathbb Z^d\) as the function \((t,x)\mapsto G_L(t,\varvec{a},x,y)\) in \(C^\infty (\mathbb R_+,\ell ^\infty (\mathbb Z^d))\) given by \(G_L(t,\varvec{a},\cdot ,y):=\exp (-t\nabla ^*\varvec{a}(\cdot )\nabla )\delta _L(\cdot -y)\), where \(\delta _L(x):=\sum _{z\in \mathbb Z^d}\delta (x+Lz)\) denotes the Dirac function on the discrete torus of size \(L\).

Remark 3

Thanks to the assumption that \(\varvec{a}\) is diagonal (and elliptic) we have a discrete mean value property in the following sense: if \((\nabla ^*\varvec{a}\nabla u)(x)=0\), then \(u(x)\le \max \{u(x\pm e_1),\dots ,u(x\pm e_d)\}\). This yields a weak maximum principle for \(\partial _t+\nabla ^*\varvec{a}(\cdot )\nabla \) and implies, in particular, that \(G\) and \(G_L\) are non-negative. This is crucial, since our results heavily rely on estimates of the Green’s function that are based on elliptic and parabolic regularity theory.
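The non-negativity of \(G_L\), together with conservation of mass (a direct consequence of the divergence form of the operator), can be observed numerically. The sketch below (an ad-hoc illustration, not part of the formal argument) assembles \(\nabla ^*\varvec{a}\nabla \) on a discrete torus in \(d=1\), with scalar coefficients for simplicity, and computes \(G_L\) by spectral calculus:

```python
import numpy as np

# L-periodic parabolic Green's function in d = 1 via spectral calculus:
# G_L(t,.,y) = exp(-t nabla* a nabla) delta_L(.-y). We check G >= 0
# and conservation of mass (column sums stay equal to 1).
rng = np.random.default_rng(1)
L, lam, t = 16, 0.1, 3.0
a = rng.uniform(lam, 1.0, size=L)        # conductances a(x) in [lam, 1]

# matrix of nabla* a nabla:
# (nabla* a nabla u)(x) = a(x-1)(u(x)-u(x-1)) - a(x)(u(x+1)-u(x))
M = np.zeros((L, L))
for x in range(L):
    M[x, x] = a[x - 1] + a[x]
    M[x, (x + 1) % L] -= a[x]
    M[x, (x - 1) % L] -= a[x - 1]

w, V = np.linalg.eigh(M)                 # M is symmetric
G = V @ np.diag(np.exp(-t * w)) @ V.T    # exp(-tM), columns = G_L(t,.,y)

print(G.min() >= -1e-10)                 # non-negativity
print(np.allclose(G.sum(axis=0), 1.0))   # mass conservation
```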

3.2 Auxiliary lemmas and proof of Theorem 1

We split the proof of Theorem 1 into several lemmas (which we prove in Sect. 3.3). The starting point is the spectral gap estimate. Since we have to estimate higher moments, we need the following version of (SG):

Lemma 2

\((p\)-version of (SG)) Let \(\langle \cdot \rangle \) satisfy (11). Then for all \(p\ge 1\) and all \(\zeta \in C_b(\Omega )\) with \(\langle \zeta \rangle =0\) we have

$$\begin{aligned} \left\langle (\zeta ^{2})^p \right\rangle ^{\frac{1}{2p}}\lesssim \Vert \partial \zeta \Vert _{L^{2p}_{\langle \cdot \rangle }\ell ^2_y}, \end{aligned}$$

where

$$\begin{aligned} \Vert \partial \zeta \Vert _{L^{2p}_{\langle \cdot \rangle }\ell ^2_y}:= \left\langle \left( \sum _{y\in \mathbb Z^d}\left( \tfrac{\partial \zeta }{\partial y}\right) ^2\right) ^p \right\rangle ^{\frac{1}{2p}}, \end{aligned}$$

and the constant only depends on \(p\) and \(\rho \).

In the \(L\)-periodic case, i.e. when \(\langle \cdot \rangle \) satisfies (8), the statement remains valid for \(\sum _{y\in \mathbb Z^d}\) and \(\frac{\partial }{\partial y}\) replaced by \(\sum _{y\in ([0,L)\cap \mathbb Z)^d}\) and \(\frac{\partial }{\partial _L y}\), respectively.

In the course of proving Theorem 1 via Lemma 2, we have to estimate the vertical derivative \(\frac{\partial u}{\partial y}\), which can be conveniently characterized with the help of the stationary extension \(\overline{u}\) of \(u\). Indeed, since \(\overline{u}\) solves (22) with \(g\equiv 0\) and \(g_0(x)=\nabla ^*\overline{\xi }(\varvec{a},x)\), application of \(\frac{\partial }{\partial y}\) to (22) shows that the non-stationary random field \(\frac{\partial \overline{u}(t,\varvec{a},x)}{\partial y}\) is the unique solution of the parabolic equation (with time variable \(t\) and space variable \(x\))

$$\begin{aligned} \left\{ \begin{aligned} \partial _t\,\tfrac{\partial \overline{u}(t,\varvec{a},x)}{\partial y}+\nabla ^*\varvec{a}(x)\nabla \tfrac{\partial \overline{u}(t,\varvec{a},x)}{\partial y}&=\nabla ^*g(t,\varvec{a},x,y)&\qquad&t>0,x\in \mathbb Z^d,\\ \tfrac{\partial \overline{u}(t=0,\varvec{a},x)}{\partial y}&=\nabla ^*\overline{\left( \tfrac{\partial \xi }{\partial (y-x)}\right) }(x)&\qquad&x\in \mathbb Z^d, \end{aligned} \right. \end{aligned}$$

where by Leibniz’ rule

$$\begin{aligned} g(t,\varvec{a},x,y)\mathop {=}\limits ^{\text {formally}}-\tfrac{\partial \varvec{a}(x)}{\partial y}\overline{D u(t,\varvec{a})}(x). \end{aligned}$$

Since \(\frac{\partial \varvec{a}(x)}{\partial y}\) vanishes for \(x\ne y\), Duhamel’s formula (23) yields after integrating by parts

$$\begin{aligned} \tfrac{\partial \overline{u}(t,\varvec{a},x)}{\partial y}&\mathop {=}\limits ^{\text {formally}} \sum _{z\in \mathbb Z^d}\nabla _{z}G(t,\varvec{a},x,z)\cdot \overline{\left( \tfrac{\partial \xi }{\partial (y-z)}\right) }(z)\nonumber \\&-\int _0^t\nabla _{y}G(t-s,\varvec{a},x,y)\cdot \tfrac{\partial \varvec{a}(y)}{\partial y}\overline{D u(s,\varvec{a})}(y)\,ds. \end{aligned}$$
(24)

Note that the formula for \(g\) (and thus the identity above) is only formal, since Leibniz’ rule does not hold for the (discrete) vertical derivative. However, we obtain precisely the same estimate as if (24) were correct:

Lemma 3

Let \(\langle \cdot \rangle \) be stationary. Consider

$$\begin{aligned} u(t):=\exp \left( -tD^*\varvec{a}(0)D\right) D^*\xi ,\qquad \xi \in C_b(\Omega )^d. \end{aligned}$$

We then have

$$\begin{aligned} \tfrac{\partial u(t)}{\partial y}=\sum _{z\in \mathbb Z^d}\nabla _z G(t,0,z)\cdot \overline{\left( \tfrac{\partial \xi }{\partial (y-z)}\right) }(z) + \int _0^t\nabla _y G(t-s,0,y)\cdot \overline{g}(s,y)\,ds \end{aligned}$$
(25)

\(\langle \cdot \rangle \)-almost surely, where \(\overline{g}\) is the stationary extension of some \(g(t,\varvec{a})\) that satisfies

$$\begin{aligned} \left\langle |g(t)|^{2p} \right\rangle ^{\frac{1}{2p}}\le 2\left\langle |Du(t)|^{2p} \right\rangle ^{\frac{1}{2p}} \end{aligned}$$
(26)

for all \(p<\infty \).

Likewise, the statement holds for \(\sum _{z\in \mathbb Z^d}\), \(\frac{\partial }{\partial (y-z)}\), and \(G\) replaced by \(\sum _{z\in ([0,L)\cap \mathbb Z)^d}\), \(\frac{\partial }{\partial _L (y-z)}\), and \(G_{L}\), respectively.

Lemma 3 shows that we need time-decay estimates of \(\nabla G\). These are given by Theorem 3. As we shall see in the proof of Lemma 4 below, the fact that in Theorem 3 we obtain optimal decay for \(\nabla G\) for exponents up to some \(2q_0\) slightly larger than 2 is crucial. In fact, the exponent \(p_0\) in the statement of Theorem 1 is the dual exponent of \(q_0\). This explains why we are forced to estimate high moments of \(u\) even if ultimately we are mostly interested in the second and fourth moments, cf. Sect. 4.

Combining the Spectral Gap Estimate in its \(p\)-version with the representation formula of Lemma 3 we get:

Lemma 4

In the situation of Theorem 1 there exists an exponent \(1\le p_0<\infty \) (only depending on \(\lambda \) and \(d\)), such that for every \(p\in (p_0,\infty )\) we have

$$\begin{aligned} \left\langle u^{2p}(t) \right\rangle ^{\frac{1}{2p}}&\lesssim (t+1)^{-(\frac{d}{4}+\frac{1}{2})}\Vert \partial \xi \Vert _{\ell ^1_yL^{2p}_{\langle \cdot \rangle }}+\int _0^t(t-s+1)^{-(\frac{d}{4}+\frac{1}{2})}\left\langle |Du(s)|^{2p} \right\rangle ^{\frac{1}{2p}}\,ds,\nonumber \\ \end{aligned}$$
(27)

where \(\Vert \partial \xi \Vert _{\ell ^1_yL^{2p}_{\langle \cdot \rangle }}\) is defined as in Theorem 1, and the constant only depends on \(p\), \(\rho \), \(\lambda \), and \(d\).

To complete the proof of Theorem 1 we have to gain control over the nonlinear term \(\int _0^t(t-s+1)^{-(\frac{d}{4}+\frac{1}{2})}\langle |Du(s)|^{2p} \rangle ^{\frac{1}{2p}}\,ds\). This is done by using Caccioppoli’s inequality in probability combined with an ODE-argument. More precisely, the following lemma shows that \(\langle |Du(t)|^{2p} \rangle \) has better decay than \(\langle u^{2p}(t) \rangle \).

Lemma 5

(Caccioppoli) Let \(\langle \cdot \rangle \) be stationary. Consider \(u(t):=\exp (-tD^*\varvec{a}(0)D)\zeta \) with \(\zeta \in C_b(\Omega )\). Then for all \(p\ge 1\) we have

$$\begin{aligned} \left\langle |Du(t)|^{2p} \right\rangle \lesssim -\frac{d}{dt}\left\langle (u^2)^p(t) \right\rangle , \end{aligned}$$

where the constant only depends on \(d\) and \(p\).

The following lemma shows how to absorb the nonlinear term into the LHS of (27).

Lemma 6

(ODE-argument) Let \(1\le p,\gamma <\infty \) and \(a(t),b(t)\ge 0\). Suppose that there exists \(C_1<\infty \) such that for all \(t\ge 0\),

$$\begin{aligned} a(t)&\le C_1\left( (t+1)^{-\gamma }+\int _0^t(t-s+1)^{-\gamma }b(s)\,ds\right) , \end{aligned}$$
(28a)
$$\begin{aligned} b^p(t)&\le C_1\left( -\frac{d}{dt}a^p(t)\right) . \end{aligned}$$
(28b)

Then there exists \(C_2<\infty \) depending only on \(C_1\), \(p\) and \(\gamma \) such that

$$\begin{aligned} a(t)\le C_2(t+1)^{-\gamma }. \end{aligned}$$

We are now in a position to prove Theorem 1.

Proof of Theorem 1

Let \(p_0\) be given by Lemma 4, and fix an exponent \(p\in (p_0,\infty )\). By homogeneity we may assume that \(\Vert \partial \xi \Vert _{\ell ^1_yL^{2p}_{\langle \cdot \rangle }}=1\), so that the desired estimate reduces to

$$\begin{aligned} \left\langle u^{2p}(t) \right\rangle ^{\frac{1}{2p}}\lesssim (t+1)^{-(\frac{d}{4}+\frac{1}{2})}. \end{aligned}$$
(29)

Set

$$\begin{aligned} a(t):=\left\langle u^{2p}(t) \right\rangle ^{\frac{1}{2p}},\qquad b(t):=\left\langle |Du(t)|^{2p} \right\rangle ^{\frac{1}{2p}},\qquad \gamma =\frac{d}{4}+\frac{1}{2}. \end{aligned}$$

By Lemmas 4 and 5 we have

$$\begin{aligned} a(t)&\lesssim (t+1)^{-\gamma }+\int _0^t(t-s+1)^{-\gamma }b(s)\,ds,\\ b^{2p}(t)&\lesssim -\frac{d}{dt}a^{2p}(t). \end{aligned}$$

Hence, Lemma 6 (applied with exponent \(2p\) in place of \(p\)) yields (29) in the form of \(a(t)\lesssim (t+1)^{-(\frac{d}{4}+\frac{1}{2})}\). \(\square \)

We finally state a slight improvement of Theorem 1 that yields an additional exponential decay (for times \(t\gtrsim L^2\)) in the \(L\)-periodic case:

Lemma 7

In the situation of Theorem 1 the following holds: If \(\langle \cdot \rangle \) satisfies (8), then there exists \(c_0>0\) depending only on \(d\) such that

$$\begin{aligned} \left\langle |\exp (-tD^*\varvec{a}(0)D)D^*\xi |^2 \right\rangle ^{\frac{1}{2}} \lesssim \exp \left( -\frac{c_0 \lambda }{L^2}t\right) \left\langle |\xi |^2 \right\rangle ^{\frac{1}{2}}, \end{aligned}$$

where the constant only depends on \(\rho \), \(\lambda \), and \(d\).

3.3 Proofs of the auxiliary lemmas

We prove the auxiliary lemmas in the case (11). The argument in the periodic case (8) is similar.

Proof of Lemma 2

The only technicality is due to the failure of the Leibniz rule for the vertical derivative. In what follows, for all \(a \in \mathbb R\) and \(p>0\), we use the notation \(a^{2p}:=(a^2)^p\).

Step 1. Substitute for the Leibniz rule: For all \(p\ge 1\),

$$\begin{aligned} \left\langle \left( \tfrac{\partial (\zeta |\zeta |^{p-1})}{\partial y}\right) ^2 \right\rangle \lesssim \left\langle \zeta ^{2(p-1)}\left( \tfrac{\partial \zeta }{\partial y}\right) ^2 + \left( \tfrac{\partial \zeta }{\partial y}\right) ^{2p} \right\rangle , \end{aligned}$$
(30)

where the constant only depends on \(p\) and \(d\).

By definition of \(\frac{\partial }{\partial y}\), (30) can be rewritten as

$$\begin{aligned} \left\langle (\zeta |\zeta |^{p-1}-\langle \zeta |\zeta |^{p-1} \rangle _y)^2 \right\rangle \lesssim \left\langle \zeta ^{2(p-1)}(\zeta -\langle \zeta \rangle _y)^2 + (\zeta -\langle \zeta \rangle _y)^{2p} \right\rangle . \end{aligned}$$

Since \(\langle \langle \cdot \rangle _y \rangle =\langle \cdot \rangle \), it suffices to show that

$$\begin{aligned}&\left\langle (\zeta |\zeta |^{p-1}-\langle \zeta |\zeta |^{p-1} \rangle _y)^2 \right\rangle _y\\&\quad \lesssim \left\langle \zeta ^{2(p-1)}(\zeta -\langle \zeta \rangle _y)^2 + (\zeta -\langle \zeta \rangle _y)^{2p} \right\rangle _y\qquad \langle \cdot \rangle \text {-almost surely}. \end{aligned}$$

Since the conditional expectation is an orthogonal projection in \(L^2(\Omega )\), we have

$$\begin{aligned}&\forall \tilde{\zeta }\in L^2(\Omega ), \forall c\in \mathbb R:\qquad \left\langle (\tilde{\zeta }-\langle \tilde{\zeta } \rangle _y)^2 \right\rangle _y\le \left\langle (\tilde{\zeta }-c)^2 \right\rangle _y\qquad \langle \cdot \rangle \text {-almost surely}. \end{aligned}$$

In particular, with \(\tilde{\zeta }=\zeta |\zeta |^{p-1}\) and \(c=\langle \zeta \rangle _y|\langle \zeta \rangle _y|^{p-1}\) we get \(\langle \cdot \rangle \text {-almost surely}\)

$$\begin{aligned} \langle (\zeta |\zeta |^{p-1}-\langle \zeta |\zeta |^{p-1} \rangle _y)^2 \rangle _y\le \langle (\zeta |\zeta |^{p-1}-\langle \zeta \rangle _y|\langle \zeta \rangle _y|^{p-1})^2 \rangle _y. \end{aligned}$$

Hence, it suffices to argue that

$$\begin{aligned}&\left\langle (\zeta |\zeta |^{p-1}-\langle \zeta \rangle _y|\langle \zeta \rangle _y|^{p-1})^2 \right\rangle _y\\&\quad \lesssim \left\langle \zeta ^{2(p-1)}(\zeta -\langle \zeta \rangle _y)^2 + (\zeta -\langle \zeta \rangle _y)^{2p} \right\rangle _y\qquad \langle \cdot \rangle \text {-almost surely}. \end{aligned}$$

The latter follows from the elementary inequality:

$$\begin{aligned} \forall a,b\in \mathbb R:\qquad (a|a|^{p-1}-b|b|^{p-1})^2 \lesssim a^{2(p-1)}(a-b)^2+(a-b)^{2p}. \end{aligned}$$
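For instance, for \(p=2\) the inequality holds with the convenient (non-optimal) constant 16, i.e. \((a|a|-b|b|)^2\le 16\,(a^2(a-b)^2+(a-b)^4)\); a quick random search corroborates this:

```python
import numpy as np

# Numerical corroboration of the elementary inequality for p = 2 with
# the (non-optimal) constant 16; random a, b over a wide range.
rng = np.random.default_rng(3)
a = rng.uniform(-10, 10, size=100000)
b = rng.uniform(-10, 10, size=100000)

lhs = (a * np.abs(a) - b * np.abs(b)) ** 2
rhs = 16 * (a ** 2 * (a - b) ** 2 + (a - b) ** 4)
print(np.all(lhs <= rhs + 1e-9))  # True
```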

Step 2. Application of the Spectral Gap Estimate.

For all \(p\ge 1\) we claim that

$$\begin{aligned} \left\langle \zeta ^{2p} \right\rangle \lesssim \left\langle \zeta ^2 \right\rangle ^p+\left\langle \left( \sum _{y\in \mathbb Z^d}\left( \tfrac{\partial \zeta }{\partial y}\right) ^2\right) ^p \right\rangle . \end{aligned}$$

Assume that \(p>1\). The application of \(SG_{\infty }(\rho ){}\) to \(\zeta |\zeta |^{p-1}-\langle \zeta |\zeta |^{p-1} \rangle \) yields

$$\begin{aligned} \left\langle (\zeta |\zeta |^{p-1}-\langle \zeta |\zeta |^{p-1} \rangle )^2 \right\rangle \le \frac{1}{\rho }\sum _{y\in \mathbb Z^d}\left\langle \left( \tfrac{\partial }{\partial y}(\zeta |\zeta |^{p-1})\right) ^2 \right\rangle , \end{aligned}$$

which, by Step 1 and the triangle inequality, turns into

$$\begin{aligned} \left\langle \zeta ^{2p} \right\rangle \lesssim \left\langle |\zeta |^{p} \right\rangle ^2 + \left\langle \zeta ^{2(p-1)}\sum _{y\in \mathbb Z^d}\left( \tfrac{\partial \zeta }{\partial y}\right) ^2+\sum _{y\in \mathbb Z^d}\left( \tfrac{\partial \zeta }{\partial y}\right) ^{2p} \right\rangle . \end{aligned}$$
(31)

We treat each of the three terms on the RHS of (31) separately. For the third term we appeal to the discrete \(\ell ^{2p}\)-\(\ell ^2\) estimate. For the second term on the RHS of (31) we use Hölder’s inequality with exponents \((\frac{p}{p-1},p)\), combined with Young’s inequality. We turn to the first term. For \(p=2\) there is nothing to do, whereas for \(p>2\) we may apply Hölder’s inequality with exponents \((2\tfrac{p-1}{p-2},2\tfrac{p-1}{p})\) to \(\langle |\zeta |^p \rangle =\langle {|\zeta |^{p\frac{p-2}{p-1}}|\zeta |^{\frac{p}{p-1}}}\rangle \):

$$\begin{aligned} \left\langle |\zeta |^p \right\rangle ^2 \le \left\langle \zeta ^{2p} \right\rangle ^{\frac{p-2}{p-1}}\left\langle \zeta ^2 \right\rangle ^{\frac{p}{p-1}}, \end{aligned}$$

and we absorb the first factor into the LHS of (31) by Young’s inequality.

Step 3. Conclusion.

Application of \(SG_{\infty }(\rho ){}\) to \(\zeta \) combined with Jensen’s inequality yields

$$\begin{aligned} \left\langle \zeta ^2 \right\rangle ^p\le \frac{1}{\rho ^{p}}\left\langle \left( \sum _{y\in \mathbb Z^d}\left( \tfrac{\partial \zeta }{\partial y}\right) ^{2}\right) ^p \right\rangle , \end{aligned}$$

so that the claim of Lemma 2 follows from Step 2. \(\square \)

Proof of Lemma 3

Recall that the stationary extension \(\overline{u}(t,\cdot )\) of \(u\) satisfies (for all \(\varvec{a}\in \Omega \)) the spatial parabolic equation (22) with \(g\equiv 0\) and \(g_0(x):=\nabla ^*\overline{\xi }(x)\). We take the vertical derivative \(\frac{\partial }{\partial y}\) of this equation:

$$\begin{aligned} \left\{ \begin{aligned} \partial _t\tfrac{\partial \overline{u}(t,x)}{\partial y}+\nabla ^*\varvec{a}(x)\nabla \tfrac{\partial \overline{u}(t,x)}{\partial y}=&\nabla ^*\xi _1(t,x,y)&\text {for all }\ t>0,\,x\in \mathbb Z^d,\\ \tfrac{\partial \overline{u}(t=0,x)}{\partial y}=&\nabla ^*\xi _0(x,y)&\text {for all } \ x\in \mathbb Z^d, \end{aligned}\right. \end{aligned}$$
(32)

where

$$\begin{aligned} \xi _1(t,x,y)&:= \varvec{a}(x)\nabla \tfrac{\partial \overline{u}(t,x)}{\partial y}-\tfrac{\partial }{\partial y}\left( \varvec{a}(x)\nabla \overline{u}(t,x)\right) ,\\ \xi _0(x,y)&:= \tfrac{\partial \overline{\xi }(x)}{\partial y}. \end{aligned}$$

Duhamel’s formula, cf. (23), and two integrations by parts yield

$$\begin{aligned} \tfrac{\partial \overline{u}(t,0)}{\partial y}=\sum _{z\in \mathbb Z^d}\nabla _zG(t,0,z)\cdot \xi _0(z,y) + \int _0^t\sum _{z\in \mathbb Z^d}\nabla _zG(t-s,0,z)\cdot \xi _1(s,z,y)\,ds. \end{aligned}$$
(33)

We then claim that

$$\begin{aligned} \xi _0(x,y)&= \overline{\left( \tfrac{\partial \xi }{\partial (y-x)}\right) }(x),\end{aligned}$$
(34)
$$\begin{aligned} \xi _1(t,x,y)&= \delta (y-x)\overline{g}(t,x), \end{aligned}$$
(35)

where \(g(t):=\left\langle \varvec{a}(0)Du(t) \right\rangle _0-\varvec{a}(0)\left\langle Du(t) \right\rangle _0.\)

Indeed, (34) is a consequence of (21). Let us prove (35). Since \(\xi _1(t,x,y) = \left\langle \varvec{a}(x)\nabla \overline{u}(t,x) \right\rangle _y-\varvec{a}(x)\nabla \left\langle \overline{u}(t,x) \right\rangle _y\) vanishes for all \(y\ne x\), (35) follows from the properties of the stationary extension:

$$\begin{aligned}&{\left\langle \varvec{a}(x)\nabla \overline{u}(t,x) \right\rangle _y-\varvec{a}(x)\langle \nabla \overline{u}(t,x) \rangle _y}\\&\quad {=}\,\left\langle \overline{\varvec{a}(0)Du(t)}(x) \right\rangle _y-\varvec{a}(x)\left\langle \overline{Du(t)}(x) \right\rangle _y\\&~ \mathop {=}\limits ^{(20)}\delta (y-x)\overline{\left( \left\langle \varvec{a}(0)D u(t) \right\rangle _{y-x}-\varvec{a}(0)\left\langle D u(t) \right\rangle _{y-x}\right) }(x)\\&\quad =\delta (y-x)\overline{g}(t,x). \end{aligned}$$

Since \(u(t)=\overline{u}(t,0)\), we have \(\frac{\partial \overline{u}(t,0)}{\partial y}=\frac{\partial u(t)}{\partial y}\), and the combination of (33)–(35) yields (25), whereas (26) follows from the triangle inequality in \(L^{2p}(\Omega )\) and Jensen’s inequality in probability. \(\square \)

Proof of Lemma 4

Since \(\langle D^*\xi \rangle =0\), we have \(\langle u(t) \rangle =0\) for all \(t>0\), so that Lemma 2 yields

$$\begin{aligned} \left\langle u^{2p}(t) \right\rangle \lesssim \left\langle \Bigg (\sum _{y\in \mathbb Z^d}\Bigg (\tfrac{\partial u(t)}{\partial y}\Bigg )^2\Bigg )^p \right\rangle . \end{aligned}$$

The representation formula of Lemma 3 for \(\frac{\partial u(t)}{\partial y}\) and the triangle inequality w.r.t. \(\langle {(\sum _{y\in \mathbb Z^d}(\cdot )^2)^p}\rangle ^{\frac{1}{2p}}\) and in the time integral show that

$$\begin{aligned} \left\langle u^{2p}(t) \right\rangle ^{\frac{1}{2p}}&\lesssim \left\langle \Bigg (\sum _{y\in \mathbb Z^d}\left( \sum _{z\in \mathbb Z^d}|\nabla _zG(t,0,z)|\left| \overline{\left( \tfrac{\partial \xi }{\partial (y-z)}\right) }(z)\right| \right) ^2\Bigg )^p \right\rangle ^{\frac{1}{2p}}\nonumber \\&+\int _0^t\left\langle \left( \sum _{y\in \mathbb Z^d}|\nabla _yG(t-s,0,y)|^2|\overline{g}(s,y)|^2\right) ^p \right\rangle ^{\frac{1}{2p}}\,ds.\nonumber \\ \end{aligned}$$
(36)

Recall that \(\langle {|\overline{g}(t,x)|^{2p}}\rangle ^{\frac{1}{2p}}\le 2\langle {|Du(t)|^{2p}}\rangle ^{\frac{1}{2p}}\). To estimate the first term of the RHS we change variables and use the triangle inequality w.r.t. \(\langle {(\sum _{y\in \mathbb Z^d}(\cdot )^2)^p}\rangle ^{\frac{1}{2p}}\):

$$\begin{aligned}&\left\langle \left( \sum _{y\in \mathbb Z^d}\left( \sum _{z\in \mathbb Z^d}|\nabla _zG(t,0,z)|\left| \overline{\left( \tfrac{\partial \xi }{\partial (y-z)}\right) }(z)\right| \right) ^2\right) ^p \right\rangle ^{\frac{1}{2p}}\\&\quad \mathop {=}\limits ^{x:=y-z} \left\langle \left( \sum _{y\in \mathbb Z^d}\left( \sum _{x\in \mathbb Z^d}|\nabla _yG(t,0,y-x)|\left| \overline{\tfrac{\partial \xi }{\partial x}}(y-x)\right| \right) ^2\right) ^p \right\rangle ^{\frac{1}{2p}}\\&\quad \mathop {\le }\limits ^{\triangle \text {-inequality}} \sum _{x\in \mathbb Z^d}\left\langle \left( \sum _{y\in \mathbb Z^d}\left( |\nabla _yG(t,0,y-x)|\left| \overline{\tfrac{\partial \xi }{\partial x}}(y-x)\right| \right) ^2\right) ^p \right\rangle ^{\frac{1}{2p}}\\&\quad \mathop {=}\limits ^{x':=y-x} \sum _{x\in \mathbb Z^d}\left\langle \left( \sum _{x'\in \mathbb Z^d}\left( |\nabla _{x'}G(t,0,x')|\left| \overline{\tfrac{\partial \xi }{\partial x}}(x')\right| \right) ^2\right) ^p \right\rangle ^{\frac{1}{2p}}. \end{aligned}$$

Hence, (36) turns into

$$\begin{aligned} \left\langle u^{2p}(t) \right\rangle ^{\frac{1}{2p}}&\lesssim \sum _{x\in \mathbb Z^d}\left\langle \left( \sum _{y\in \mathbb Z^d}\left( |\nabla _{y}G(t,0,y)|\left| \overline{\tfrac{\partial \xi }{\partial x}}(y)\right| \right) ^2\right) ^p \right\rangle ^{\frac{1}{2p}}\\&+\int _0^t\left\langle \left( \sum _{y\in \mathbb Z^d}|\nabla _yG(t-s,0,y)|^2\left| \overline{g}(s,y)\right| ^2\right) ^p \right\rangle ^{\frac{1}{2p}}\,ds. \end{aligned}$$

It remains to show that

$$\begin{aligned} \left\langle \left( \sum _{y\in \mathbb Z^d}\left( |\nabla _{y}G(t,0,y)|\left| \overline{\tfrac{\partial \xi }{\partial x}}(y)\right| \right) ^2\right) ^p \right\rangle ^{\frac{1}{2p}}&\lesssim (t+1)^{-(\frac{d}{4}+\frac{1}{2})}\left\langle \left| \tfrac{\partial \xi }{\partial x}\right| ^{2p} \right\rangle ^{\frac{1}{2p}},\end{aligned}$$
(37)
$$\begin{aligned} \left\langle \left( \sum _{y\in \mathbb Z^d}|\nabla _yG(t-s,0,y)|^2|\overline{g}(s,y)|^2\right) ^p \right\rangle ^{\frac{1}{2p}}&\lesssim (t-s+1)^{-(\frac{d}{4}+\frac{1}{2})}\left\langle |Du(s)|^{2p} \right\rangle ^{\frac{1}{2p}}. \end{aligned}$$
(38)

We only give the argument for (37), the argument for (38) being similar. Let \(q:=\frac{p}{p-1}\) be the dual exponent of \(p\), let \(\alpha >0\) be some exponent to be fixed later, and let \(\omega (t,x)\) be the weight defined in (16). By Hölder’s inequality with exponents \((q,p)\) and the symmetry \(G(t,\varvec{a},x,y)=G(t,\varvec{a},y,x)\) of \(G\), we have

$$\begin{aligned} \sum _{y\in \mathbb Z^d}\left( |\nabla _yG(t,0,y)|\left| \overline{\tfrac{\partial \xi }{\partial x}}(y)\right| \right) ^2&\le \left( \sum _{y\in \mathbb Z^d}\left( \omega ^\alpha (t,y)|\nabla _yG(t,y,0)|\right) ^{2q}\right) ^{\frac{1}{q}}\\&\quad \times \left( \sum _{y\in \mathbb Z^d}\left( \omega ^{-\alpha }(t,y)\left| \overline{\tfrac{\partial \xi }{\partial x}}(y)\right| \right) ^{2p}\right) ^{\frac{1}{p}} . \end{aligned}$$

Hence, by stationarity of \(\overline{\tfrac{\partial \xi }{\partial x}}\),

$$\begin{aligned} \left\langle \left( \sum _{y\in \mathbb Z^d}\left( |\nabla _{y}G(t,0,y)|\left| \overline{\tfrac{\partial \xi }{\partial x}}(y)\right| \right) ^2\right) ^p \right\rangle ^{\frac{1}{2p}}&\le \sup _{\varvec{a}\in \Omega }\left( \sum _{y\in \mathbb Z^d}\left( \omega ^{\alpha }(t,y)|\nabla _{y}G(t,y,0)|\right) ^{2q}\right) ^{\frac{1}{2q}}\nonumber \\&\times \left( \sum _{y\in \mathbb Z^d}\omega ^{-2p\alpha }(t,y)\right) ^{\frac{1}{2p}} \left\langle \left| \tfrac{\partial \xi }{\partial x}\right| ^{2p} \right\rangle ^{\frac{1}{2p}}.\nonumber \\ \end{aligned}$$
(39)

We now address the choice of \(p\) and \(\alpha \). First, set \(p_0\) to be the dual exponent of \(q_0\) defined in Theorem 3, and let \(p> p_0\), so that its dual exponent \(q\) lies in the range \([1,q_0)\) of applicability of Theorem 3. We then choose \(\alpha \) so large that \(2p\alpha >d\). Theorem 3 thus implies

$$\begin{aligned} \sup _{\varvec{a}\in \Omega }\left( \sum _{y\in \mathbb Z^d}\left( \omega ^{\alpha }(t,y)|\nabla _yG(t,y,0)|\right) ^{2q}\right) ^{\frac{1}{2q}} \lesssim (t+1)^{-(\frac{d}{2}+\frac{1}{2})+\frac{d}{2}\frac{1}{2q}}, \end{aligned}$$

whereas

$$\begin{aligned} \sum _{y\in \mathbb Z^d}\omega ^{-2p\alpha }(t,y)= \sum _{y\in \mathbb Z^d}\left( \frac{|y|^2}{t+1}+1\right) ^{-p\alpha } \lesssim (t+1)^{\frac{d}{2}}. \end{aligned}$$

Combined with (39), this yields the desired estimate (37). \(\square \)
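As a quick numerical sanity check of the weight-sum scaling used in the last display, the following sketch (with the illustrative choices \(d=2\) and \(p\alpha =2\), so that \(2p\alpha >d\), and a truncated lattice sum; not part of the proof) confirms the \((t+1)^{\frac{d}{2}}\) growth:

```python
def weight_sum(t, p_alpha=2.0, R=250):
    # truncated version of sum_{y in Z^2} (|y|^2/(t+1) + 1)^(-p*alpha);
    # the terms decay like |y|^(-4), so R = 250 captures the sum for t <= 400
    s = 0.0
    for y1 in range(-R, R + 1):
        for y2 in range(-R, R + 1):
            s += ((y1 * y1 + y2 * y2) / (t + 1) + 1.0) ** (-p_alpha)
    return s

# quadrupling t+1 should roughly quadruple the sum (scaling (t+1)^{d/2}, d=2)
print(weight_sum(400) / weight_sum(100))
```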

Proof of Lemma 5

For all \(t>0\) and \(p\ge 1\) we have

$$\begin{aligned} \lambda \left\langle D(u |u|^{2p-2})\cdot Du \right\rangle \le -\frac{d}{dt}\frac{1}{2p}\left\langle (u^{2})^p \right\rangle . \end{aligned}$$

Indeed, since \(u(t)=\exp (-tD^*\varvec{a}(0)D)\zeta \),

$$\begin{aligned} \frac{\partial }{\partial t}u+D^*\varvec{a}(0)Du=0\qquad t>0, \end{aligned}$$
(40)

whose weak formulation with test-function \(u|u|^{2p-2}\in C^\infty (\mathbb R_+,C_b(\Omega ))\) reads

$$\begin{aligned} \frac{d}{dt}\frac{1}{2p}\left\langle (u^{2})^p \right\rangle =-\left\langle D(u|u|^{2p-2})\cdot \varvec{a}(0)Du \right\rangle \le -\lambda \left\langle D(u|u|^{2p-2})\cdot Du \right\rangle , \end{aligned}$$

where we used the diagonality of \(\varvec{a}(0)\) in the last inequality. The claim follows from the inequality \(D(u|u|^{2p-2})\cdot Du\gtrsim |Du|^{2p}\), which is a consequence of the elementary estimate: For all \(a,b\in \mathbb R\) we have \((a|a|^{2p-2}-b|b|^{2p-2})(a-b)\,\gtrsim \,(a-b)^{2p}\). \(\square \)
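The elementary estimate at the end of the proof can be probed numerically. In the sketch below the explicit constant \(c=2^{2-2p}\) is an assumption suggested by the equality case \(b=-a\); the code only tests it on random samples, it is not a proof:

```python
import random

def check_elementary(p, trials=20000):
    # tests (a|a|^{2p-2} - b|b|^{2p-2})(a-b) >= c (a-b)^{2p} on random samples;
    # the constant c = 2^(2-2p) is an assumption suggested by the equality
    # case b = -a, and is only probed numerically here
    c = 2.0 ** (2 - 2 * p)
    rng = random.Random(0)
    for _ in range(trials):
        a, b = rng.uniform(-10, 10), rng.uniform(-10, 10)
        lhs = (a * abs(a) ** (2 * p - 2) - b * abs(b) ** (2 * p - 2)) * (a - b)
        if lhs < c * (a - b) ** (2 * p) - 1e-6:
            return False
    return True

print(check_elementary(2), check_elementary(3))
```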

Proof of Lemma 6

The claim can be reformulated as

$$\begin{aligned} \Lambda (t):=\sup \limits _{0\le s\le t}(1+s)^\gamma a(s)\lesssim 1. \end{aligned}$$
(41)

Step 1. Two auxiliary estimates.

We claim that

$$\begin{aligned} \int _{\tau _1}^{\tau _2}b(s)\,ds \lesssim \left\{ \begin{aligned}&(\tau _2-\tau _1)^{1-\frac{1}{p}}a(\tau _1)&\text {for all } \ 0\le \tau _1\le \tau _2,\\&\tau _1^{1-\gamma -\frac{1}{p}}\Lambda (\tau _2)&\text {for all }\ 1\le \tau _1\le \tau _2. \end{aligned} \right. \end{aligned}$$
(42)

The first estimate follows from Hölder’s inequality, Eq. (28b), and the non-negativity of \(a\):

$$\begin{aligned} \int _{\tau _1}^{\tau _2}b(s)\,ds\le (\tau _2-\tau _1)^{1-\frac{1}{p}}\left( -\int _{\tau _1}^{\tau _2}\frac{d}{dt}a^p(t)\,dt\right) ^{\frac{1}{p}}\le (\tau _2-\tau _1)^{1-\frac{1}{p}}\,a(\tau _1). \end{aligned}$$
(43)

The second inequality can be deduced from the first one as follows: Let \(N\in \mathbb N\) satisfy \(2^{N-1}\tau _1<\tau _2\le 2^N\tau _1\). We then have

$$\begin{aligned} \int _{\tau _1}^{\tau _2}b(s)\,ds&\le \sum _{n=0}^{N-1}\int _{2^n\tau _1}^{2^{n+1}\tau _1}b(s)\,ds \mathop {\lesssim }\limits ^{(43)} \sum _{n=0}^{N-1}(2^n\tau _1)^{1-\frac{1}{p}}\,a(2^n\tau _1)\\&\le \sum _{n=0}^{N-1}(2^n\tau _1)^{1-\frac{1}{p}}\,(1+2^n\tau _1)^{-\gamma }\Lambda (2^n\tau _1)\\&\mathop {\le }\limits ^{\Lambda \text { monotone}} \Lambda (\tau _2)\sum _{n=0}^{N-1}(2^n\tau _1)^{1-\frac{1}{p}}(1+2^n\tau _1)^{-\gamma }\\&\le \Lambda (\tau _2)\sum _{n=0}^{N-1}(2^n\tau _1)^{1-\gamma -\frac{1}{p}} \mathop {\lesssim }\limits ^{\gamma \ge 1} \Lambda (\tau _2)\tau _1^{1-\gamma -\frac{1}{p}}. \end{aligned}$$
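The final geometric-sum step (for \(\gamma \ge 1\) the exponent \(1-\gamma -\frac{1}{p}\) is at most \(-\frac{1}{p}<0\), so the dyadic sum is dominated by its first term, uniformly in \(N\)) can be illustrated numerically; the parameter values below are arbitrary:

```python
def dyadic_sum(tau1, gamma, p, N):
    # sum_{n=0}^{N-1} (2^n tau1)^{1 - gamma - 1/p}, as in the dyadic decomposition
    e = 1 - gamma - 1 / p
    return sum((2 ** n * tau1) ** e for n in range(N))

gamma, p, tau1 = 1.5, 2.0, 3.0
e = 1 - gamma - 1 / p
bound = tau1 ** e / (1 - 2 ** e)   # first term times 1/(1 - ratio) of the geometric series
for N in (1, 10, 100):
    assert dyadic_sum(tau1, gamma, p, N) <= bound + 1e-12
print("uniform-in-N constant:", round(bound / tau1 ** e, 3))
```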

Step 2. A threshold estimate.

Let \(1\le \tau \le \frac{1}{4}t\). We claim that

$$\begin{aligned} (t+1)^{\gamma }a(t)\lesssim 1 + \tau ^{1-\frac{1}{p}} + \left( \tau ^{1-\gamma -\frac{1}{p}}+\frac{\ln (t+1)}{(t+1)^{\frac{1}{p}}}\right) \Lambda (t). \end{aligned}$$

First notice that by (28b) the function \(a(\cdot )\) is non-increasing. Hence,

$$\begin{aligned} a(t)&\le \frac{2}{t}\int _{\frac{t}{2}}^ta(t')\,dt'\nonumber \\&\mathop {\lesssim }\limits ^{(28a)} \frac{1}{t}\int _{\frac{t}{2}}^t(t'+1)^{-\gamma }\,dt' +\frac{1}{t}\int _{\frac{t}{2}}^t\int _0^{t'}(t'-s+1)^{-\gamma }b(s)\,ds\,dt'. \end{aligned}$$
(44)

The first term of the RHS is estimated by

$$\begin{aligned} \frac{1}{t}\int _{\frac{t}{2}}^t(t'+1)^{-\gamma }\,dt' \lesssim (t+1)^{-\gamma }. \end{aligned}$$
(45)

For the second term of the RHS of (44), we split the inner integral into three contributions that we estimate separately. More precisely, we shall prove that

$$\begin{aligned} \frac{1}{t}\int _{\frac{t}{2}}^t\int _0^{\tau }(t'-s+1)^{-\gamma }b(s)\,ds\,dt' \lesssim&(t+1)^{-\gamma }\tau ^{1-\frac{1}{p}},\end{aligned}$$
(46)
$$\begin{aligned} \frac{1}{t}\int _{\frac{t}{2}}^t\int _{\tau }^{\frac{t'}{2}}(t'-s+1)^{-\gamma }b(s)\,ds\,dt' \lesssim&(t+1)^{-\gamma }\tau ^{1-\gamma -\frac{1}{p}}\Lambda (t),\end{aligned}$$
(47)
$$\begin{aligned} \frac{1}{t}\int _{\frac{t}{2}}^t\int _{\frac{t'}{2}}^{t'}(t'-s+1)^{-\gamma }b(s)\,ds\,dt' \lesssim&(t+1)^{-\gamma }\frac{\ln (t+1)}{(t+1)^{\frac{1}{p}}}\Lambda (t). \end{aligned}$$
(48)

Argument for (46): Since \(\tau \le \frac{t}{4}\le \frac{t'}{2}\), we have \(t'-s+1\ge \frac{t'}{2}+1\). Hence,

$$\begin{aligned} \int _0^{\tau }(t'-s+1)^{-\gamma }b(s)ds \lesssim (t'+1)^{-\gamma }\int _0^\tau b(s)ds \mathop {\lesssim }\limits ^{(42)} (t'+1)^{-\gamma }\tau ^{1-\frac{1}{p}}a(0), \end{aligned}$$

and (46) follows by (45) and (28a) for \(t=0\) in the form of \(a(0)\lesssim 1\).

Argument for (47): As above we have

$$\begin{aligned} \int _{\tau }^{\frac{t'}{2}}(t'-s+1)^{-\gamma }b(s)ds \lesssim (t'+1)^{-\gamma }\int _{\tau }^{\frac{t'}{2}} b(s)ds \mathop {\lesssim }\limits ^{(42)} (t'+1)^{-\gamma }\tau ^{1-\gamma -\frac{1}{p}}\Lambda (\tfrac{t'}{2}), \end{aligned}$$

and (47) follows by (45) and the monotonicity of \(\Lambda \).

Argument for (48): Since

$$\begin{aligned} \tfrac{t}{2}\le t'\le t\qquad&\text {and}\qquad \tfrac{t'}{2}\le s\le t'\\&\Leftrightarrow \\ \tfrac{t}{4}\le s\le t\qquad \text {and}\qquad&\max \{s,\tfrac{t}{2}\}\le t'\le \min \{2s,t\}, \end{aligned}$$

we obtain by switching the order of the integrals:

$$\begin{aligned} \frac{1}{t}\int _{\frac{t}{2}}^t\int _{\frac{t'}{2}}^{t'}(t'-s+1)^{-\gamma }b(s)\,ds\,dt'&\le \frac{1}{t}\int _{\frac{t}{4}}^t\int _{s}^{t}(t'{-}s{+}1)^{-\gamma }\,dt'\,b(s)\,ds\\&= \frac{1}{t}\int _{\frac{t}{4}}^t \int _{0}^{t-s}(t''+1)^{-\gamma }\,dt''\,b(s)\,ds\\&\mathop {\le }\limits ^{\gamma \ge 1} \int _{\frac{t}{4}}^t\,b(s)\,ds\,\times \, \frac{1}{t}\int _{0}^{t}(t''+1)^{-1}\,dt''\\&\mathop {\lesssim }\limits ^{(42)} (t+1)^{1-\gamma -\frac{1}{p}}\Lambda (t)\ \frac{\ln (t+1)}{t}\\&\mathop {\lesssim }\limits ^{t\ge 1}(t+1)^{-\gamma }\Lambda (t)\ \frac{\ln (t+1)}{(t+1)^{\frac{1}{p}}}. \end{aligned}$$

The claim of Step 2 follows from the combination of (44)–(48).

Step 3. Proof of (41).

For \(\tau \gg 1\) the function \(t\mapsto \frac{\ln (t+1)}{(t+1)^{\frac{1}{p}}}\) is monotone decreasing on \([\tau ,\infty )\). Hence for such \(\tau \), Step 2 can be upgraded to

$$\begin{aligned} (t+1)^{\gamma }a(t)\lesssim 1 + \tau ^{1-\frac{1}{p}} + \left( \tau ^{1-\gamma -\frac{1}{p}}+\frac{\ln (\tau +1)}{(\tau +1)^{\frac{1}{p}}}\right) \Lambda (t). \end{aligned}$$

Because of \(\gamma \ge 1\) and \(p<\infty \), the expressions \(\tau ^{1-\gamma -\frac{1}{p}}\) and \(\frac{\ln (\tau +1)}{(\tau +1)^{\frac{1}{p}}}\) tend to zero as \(\tau \rightarrow \infty \). Hence, by Step 2 we can find a threshold \(\tau _0>0\) only depending on \(p,\gamma \) and \(C_1\) such that for all \(t\ge 4\tau _0\)

$$\begin{aligned} (t+1)^{\gamma }a(t)\le C_2\left( 1+ \tau _0^{1-\frac{1}{p}}\right) +\frac{1}{2}\Lambda (t), \end{aligned}$$

where \(C_2\) is a constant only depending on \(p,\gamma \) and \(C_1\). For all \(t>0\) we then have

$$\begin{aligned} \Lambda (t)&\le \sup _{0\le s\le 4\tau _0}(1+s)^\gamma a(s)+\sup _{4\tau _0\le s\le t}(1+s)^\gamma a(s) \\&\le \sup _{0\le s\le 4\tau _0}(1+s)^\gamma a(s)+ C_2\left( 1+\tau _0^{1-\frac{1}{p}}\right) +\frac{1}{2}\Lambda (t), \end{aligned}$$

that is

$$\begin{aligned} \Lambda (t)\le 2\sup _{0\le s\le 4\tau _0}(1+s)^\gamma a(s)+2C_2(1+\tau _0^{1-\frac{1}{p}}). \end{aligned}$$

Since

$$\begin{aligned} \sup _{0\le s\le 4\tau _0}(1+s)^\gamma a(s)&\mathop {\lesssim }\limits ^{(28\mathrm{b})}(1+4\tau _0)^{\gamma +1-\frac{1}{p}}a(0) \mathop {\lesssim }\limits ^{(28\mathrm{a})}(1+4\tau _0)^{\gamma +1-\frac{1}{p}}, \end{aligned}$$

we deduce that \(\Lambda (t)\) is bounded for all \(t>0\) by a constant that only depends on \(p,\gamma \), and \(C_1\). \(\square \)
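A toy consistency check of the decay (41) can be run numerically. We do not restate (28a)–(28b) here; instead we model them (an assumption made only for this illustration) by taking \(a(t)=(1+t)^{-\gamma }\) and \(b\) with \(b^p=-\frac{d}{dt}a^p\), and verify that the Duhamel-type right-hand side \((1+t)^{-\gamma }+\int _0^t(t-s+1)^{-\gamma }b(s)\,ds\) then again decays like \((1+t)^{-\gamma }\):

```python
gamma, p = 1.5, 2.0

def b(s):
    # model: b^p = -d/dt a^p with a(t) = (1+t)^{-gamma}
    return (p * gamma) ** (1 / p) * (1 + s) ** (-gamma - 1 / p)

def rhs(t, n=40000):
    # midpoint rule for (1+t)^{-gamma} + int_0^t (t-s+1)^{-gamma} b(s) ds
    h = t / n
    conv = sum((t - (k + 0.5) * h + 1) ** (-gamma) * b((k + 0.5) * h)
               for k in range(n)) * h
    return (1 + t) ** (-gamma) + conv

# the normalized quantity (1+t)^gamma * rhs(t) stays bounded, as in (41)
print([round((1 + t) ** gamma * rhs(t), 2) for t in (100.0, 400.0, 1600.0)])
```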

Proof of Lemma 7

Let \(\overline{v}(x):=\overline{D^*\xi }(x)\) denote the stationary extension of the initial data. Since \(\overline{\xi }\) is almost surely \(L\)-periodic, we have

$$\begin{aligned} \sum _{x\in ([0,L)\cap \mathbb Z)^d}\overline{v}(x)=\sum _{x\in ([0,L)\cap \mathbb Z)^d}\nabla ^*\overline{\xi }(x)=0\qquad \langle \cdot \rangle \text {-almost surely}, \end{aligned}$$

and consequently

$$\begin{aligned} \sum _{x\in ([0,L)\cap \mathbb Z)^d}\overline{u}(t,x)=0\qquad \langle \cdot \rangle \text {-almost surely}. \end{aligned}$$

From the Poincaré inequality for mean free \(L\)-periodic functions, we deduce that

$$\begin{aligned} \sum _{x\in ([0,L)\cap \mathbb Z)^d}|\overline{u}(t,x)|^2\le \frac{L^2}{c_0} \sum _{x\in ([0,L)\cap \mathbb Z)^d}|\nabla \overline{u}(t,x)|^2\qquad \left\langle \cdot \right\rangle \text {-almost surely}, \end{aligned}$$

and thus, by stationarity,

$$\begin{aligned} \left\langle u^2(t) \right\rangle \le \frac{L^2}{c_0}\left\langle |Du(t)|^2 \right\rangle \le \frac{L^2}{c_0 \lambda }\left\langle Du(t)\cdot \varvec{a}(0)Du(t) \right\rangle =-\frac{L^2}{2c_0\lambda }\frac{d}{dt}\left\langle u^2(t) \right\rangle , \end{aligned}$$

so that the claim follows by Gronwall’s lemma. \(\square \)
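The discrete Poincaré inequality driving this Gronwall argument can be checked spectrally. The sketch below works on a 1D ring with random conductances (an illustrative reduction; the paper works on \(\mathbb Z^d\), \(d\ge 2\)) and verifies that the smallest nonzero eigenvalue of \(\nabla ^*\varvec a\nabla \) is at least \(\lambda \,4\sin ^2(\pi /L)\sim \lambda c_0/L^2\), which is the exponential decay rate of the semigroup on mean-free data:

```python
import math, random
import numpy as np

lam, L = 0.25, 32
rng = random.Random(1)
a = [rng.uniform(lam, 1.0) for _ in range(L)]      # conductance a[x] on edge (x, x+1)

# assemble A = nabla* a nabla on the ring Z/LZ
A = np.zeros((L, L))
for x in range(L):
    y = (x + 1) % L
    A[x, x] += a[x]; A[y, y] += a[x]
    A[x, y] -= a[x]; A[y, x] -= a[x]

eigs = np.sort(np.linalg.eigvalsh(A))
gap = eigs[1]          # smallest nonzero eigenvalue (constants span the kernel)
# monotonicity in the conductances: gap >= lam * (gap of the unit Laplacian)
print(gap, lam * 4 * math.sin(math.pi / L) ** 2)
```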

4 Estimates on the corrector in stochastic homogenization

The corrector \(\phi \) can be formally recovered by integrating in time the function

$$\begin{aligned} u(t):=\exp (-tD^*\varvec{a}(0)D)\mathfrak d, \end{aligned}$$
(49)

where \(\mathfrak d:=-D^*\varvec{a}(0)e\) is the RHS of the corrector equation (4). Making this connection rigorous for the modified and periodic correctors yields the following moment bounds.

Proposition 1

Let \(d\ge 2\) and let \(e\in \mathbb R^d\) be an arbitrary direction with \(|e|=1\).

  1. (a)

    (Modified corrector). Let \(\mu >0\). Assume that \(\langle \cdot \rangle \) satisfies either (11) or (8) (for some \(L\in \mathbb N\)). Then the unique solution \(\phi _\mu \in L^2(\Omega )\) of the modified corrector equation (5) satisfies

    $$\begin{aligned} \langle |\phi _\mu |^p \rangle ^{\frac{1}{p}}\lesssim&{\left\{ \begin{array}{ll} \ln ^{\frac{1}{2}}(\tfrac{1}{\mu }+1)&{}\text {for }d=2\text { and }1\le p\le 2,\\ \ln (\tfrac{1}{\mu }+1)&{}\text {for }d=2\text { and }p>2,\\ 1&{}\text {for }d>2, \end{array}\right. }\\ \langle |D\phi _\mu |^p \rangle ^{\frac{1}{p}} \lesssim&1, \end{aligned}$$

    for all \(1\le p<\infty \).

  2. (b)

    \((L\)-periodic corrector). Let \(L\in \mathbb N\) and assume that \(\langle \cdot \rangle \) satisfies (8). Then the unique solution \(\phi \in L^2(\Omega )\) of the corrector equation (4) in the form of (6) satisfies

    $$\begin{aligned} \langle |\phi |^p \rangle ^{\frac{1}{p}} \lesssim&{\left\{ \begin{array}{ll} \ln ^{\frac{1}{2}}(L+1)&{}\text {for }d=2\text { and }1\le p\le 2,\\ \ln (L+1)&{}\text {for }d=2\text { and }p>2,\\ 1&{}\text {for }d>2, \end{array}\right. }\\ \langle |D\phi |^p \rangle ^{\frac{1}{p}} \lesssim&1, \end{aligned}$$

    for all \(1\le p<\infty \).

For both (a) and (b), the multiplicative constants only depend on \(p, \rho ,\lambda \), and \(d\).

(The proof is postponed to the end of this section.) For \(d>2\) in the case of (11), the boundedness of the corrector was originally obtained in [21, Proposition 2.1] (the spectral gap estimate \(SG_{\infty }(\rho )\) indeed implies the version of the spectral gap estimate used in [21, Lemma 2.3]). For \(d=2\) in the case of (11), and for \(d\ge 2\) in the case of (8), these results are new. In terms of scaling in \(\mu \) and \(L\), these estimates are optimal except for \(d=2\) and \(p> 2\). Yet, using the spectral gap estimate on the corrector equation and bounds on the elliptic Green’s function, one can prove that for \(d=2\) and for all \(p\ge 1\) (see e.g. [23] in the case of continuum elliptic equations),

$$\begin{aligned} \langle |\phi _\mu |^{p} \rangle ^{\frac{1}{p}} \lesssim \ln ^{\frac{1}{2}}(\tfrac{1}{\mu }+1)\langle |D\phi _\mu |^{p} \rangle ^{\frac{1}{p}}, \end{aligned}$$

so that the optimal bound for \(p>2\) follows from the bounds on \(\langle |D\phi _\mu |^{p} \rangle \) in Proposition 1. The same argument holds for periodic ensembles.

Remark 4

In dimensions \(d>2\) Proposition 1 implies that the corrector equation (4) admits a unique solution \(\phi \in L^2(\Omega )\) with \(\langle \phi \rangle =0\). In addition, for all \(1\le p<\infty \),

$$\begin{aligned} \langle |\phi |^p \rangle ^{\frac{1}{p}} \lesssim 1. \end{aligned}$$

In dimension \(d=2\) stationary correctors do not exist. However, as \(\mu \downarrow 0\), \(D\phi _\mu \) converges to some potential field \(\Psi \) in \(L^2(\Omega )^d\) which, in view of Proposition 1, satisfies for all \(1\le p<\infty \)

$$\begin{aligned} \langle |\Psi |^p \rangle ^{\frac{1}{p}} \lesssim 1. \end{aligned}$$

Proof of Proposition 1

Both statements (a) and (b) follow from Theorem 1 using the same strategy, and we only prove the former. Let \(u\) be given by (49). Since for any \(1\le p<\infty \) we have \(\sum _{y\in \mathbb Z^d}\left\langle \left( \tfrac{\partial \mathfrak d}{\partial y}\right) ^{2p} \right\rangle ^{\frac{1}{2p}}\le 2\), Theorem 1 combined with Jensen’s inequality in probability yields

$$\begin{aligned} \left\langle |u(t)|^p \right\rangle ^{\frac{1}{p}} \lesssim (t+1)^{-\left( \frac{d}{4}+\frac{1}{2}\right) }\qquad \text {for all }1\le p<\infty , \end{aligned}$$
(50)

from which we deduce that

$$\begin{aligned} \phi _\mu :=\int _0^\infty \exp (-\mu t)u(t)\,dt \end{aligned}$$

defines a function in \(L^p(\Omega )\). By construction \(\phi _\mu \) satisfies the identity:

$$\begin{aligned} \mu \phi _\mu&= \mu \int _0^\infty \exp (-\mu t)u(t)\,dt \\&= -\int _0^\infty \frac{\partial }{\partial t}(\exp (-\mu t)u(t)) + \exp (-\mu t)\frac{\partial }{\partial t}u(t)\,dt\\&= u(0)-\int _0^\infty D^*\varvec{a}(0)D(\exp (-\mu t)u(t))\,dt\\&= \mathfrak d-D^*\varvec{a}(0)D\phi _\mu . \end{aligned}$$

It remains to establish the desired estimates. By the triangle inequality in \(L^p(\Omega )\) and (50) we have

$$\begin{aligned} \left\langle |\phi _\mu |^p \right\rangle ^{\frac{1}{p}}&\le \int _0^\infty \exp (-\mu t)\left\langle |u(t)|^p \right\rangle ^{\frac{1}{p}}\,dt \lesssim \int _0^\infty \exp (-\mu t)(t+1)^{-(\frac{d}{4}+\frac{1}{2})}\,dt\\&\lesssim {\left\{ \begin{array}{ll} \ln (\tfrac{1}{\mu }+1)&{}\text {for }d=2,\\ 1&{}\text {for }d>2. \end{array}\right. } \end{aligned}$$
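The last integral bound for \(d=2\) (where the exponent \(\frac{d}{4}+\frac{1}{2}\) equals \(1\)) can be checked by quadrature; the following sketch approximates \(\int _0^\infty e^{-\mu t}(t+1)^{-1}\,dt\) on a log-spaced grid and confirms the \(\ln (\frac{1}{\mu }+1)\) growth as \(\mu \downarrow 0\):

```python
import math

def laplace_log_integral(mu, n=100000):
    # trapezoid rule on a log-spaced grid for int_0^infty e^{-mu t} (t+1)^{-1} dt
    lo, hi = math.log(1e-8), math.log(60.0 / mu)
    h = (hi - lo) / n
    total, prev = 0.0, None
    for k in range(n + 1):
        t = math.exp(lo + k * h)
        f = math.exp(-mu * t) / (1.0 + t) * t     # extra factor t from dt = t du
        if prev is not None:
            total += 0.5 * (prev + f) * h
        prev = f
    return total

# the integral grows like ln(1/mu + 1) as mu -> 0
print(laplace_log_integral(1e-3), laplace_log_integral(1e-4))
```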

For the sharper estimates on \(\phi _\mu \) for \(d=2\) and \(p\le 2\), it is enough to use the semi-group property for \(p=2\). Since \(D^*\varvec{a}(0)D\) is symmetric, we have for all \(t,s\ge 0\), \( {\left\langle u(s)u(t) \right\rangle } = \left\langle u^2(\tfrac{s+t}{2}) \right\rangle . \) Hence,

$$\begin{aligned} \left\langle \Bigg (\int _0^\infty u(t)\exp (-\mu t)\,dt\Bigg )^2 \right\rangle&= \int _0^\infty \int _0^\infty \langle u^2(\tfrac{s+t}{2})\exp (-\mu (s+t)) \rangle \,dt\,ds\\&= 2 \int _0^\infty \int _{\frac{s}{2}}^\infty \langle u^2(\tau )\exp (-2\mu \tau ) \rangle \,d\tau \,ds\\&\mathop {\lesssim }\limits ^{(50)} \int _0^\infty \int _{\frac{s}{2}}^\infty \exp (-2\mu \tau )(\tau +1)^{-2}\,d\tau \,ds\\&\lesssim \ln (\tfrac{1}{\mu }+1), \end{aligned}$$

as desired.

It remains to prove the estimates on the gradient \(D\phi _\mu \) for \(d=2\) (for \(d>2\) this follows by discreteness and the estimate on \(\phi _\mu \) itself). By Jensen’s inequality in probability, it is enough to prove the claim for \(p=2\tilde{p}\) with \(\tilde{p}\ge 1\) so that one can appeal to the Caccioppoli inequality of Lemma 5. We fix \(\gamma \in (p-1,p)\) and note that this yields both

$$\begin{aligned} \int _0^\infty (t+1)^{-\frac{\gamma }{p-1}}dt<\infty \qquad \text {and}\qquad \int _0^\infty (t+1)^{\gamma -1-p}dt<\infty . \end{aligned}$$
(51)

As announced, we then have:

$$\begin{aligned} \left\langle |D\phi _\mu |^p \right\rangle ^{\frac{1}{p}}&\mathop {\le }\limits ^{\triangle \text {-inequality}} \int _0^\infty \left\langle |\exp (-\mu t)Du(t)|^p \right\rangle ^{\frac{1}{p}}\ dt\\&\le \int _0^\infty \left\langle |Du(t)|^p \right\rangle ^{\frac{1}{p}}(t+1)^{\frac{\gamma }{p}}(t+1)^{-\frac{\gamma }{p}}\,dt\\&\mathop {\le }\limits ^{\text {H}\ddot{\mathrm{o}}\text {lder}} \left( \int _0^\infty \left\langle |Du(t)|^p \right\rangle (t+1)^{\gamma }\,dt\right) ^{\frac{1}{p}} \left( \int _0^\infty (t+1)^{-\frac{\gamma }{p-1}}\,dt\right) ^{\frac{p-1}{p}}\\&\mathop {\lesssim }\limits ^{{\begin{array}{c} (51),\\ \mathrm{Lemma}~ 5 \end{array} }} \left( \int _0^\infty -\frac{d}{dt}\left\langle |u(t)|^p \right\rangle (t+1)^{\gamma }\,dt\right) ^{\frac{1}{p}}\\&\mathop {\lesssim }\limits ^{\langle |u(0)|^p \rangle \lesssim 1} \left( \int _0^\infty \left\langle |u(t)|^p \right\rangle (t+1)^{\gamma -1}\,dt\right) ^{\frac{1}{p}}+1\\&\mathop {\lesssim }\limits ^{(50), d=2} \left( \int _0^\infty (t+1)^{\gamma -1-p}\,dt\right) ^{\frac{1}{p}}+1\, \mathop {\lesssim }\limits ^{(51)}\, 1. \end{aligned}$$

\(\square \)

5 Approximation of the homogenized coefficients by periodization

In this section we prove Theorem 2 which establishes an optimal error estimate for the approximation of \(\varvec{a}_{\mathrm{hom}}\) by periodization. We present a complete analysis for i.i.d. coefficients and partial results for ensembles satisfying (SG).

5.1 Auxiliary results and proof of Theorem 2

The mathematical version of the periodization method is the following: Let \(\langle \cdot \rangle \) denote a stationary and ergodic ensemble and \(\varvec{a}_{\mathrm{hom}}\) the homogenized matrix associated with \(\langle \cdot \rangle \) via (1). To approximate \(\varvec{a}_{\mathrm{hom}}\) we

  • approximate \(\varvec{a}_{\mathrm{hom}}\) by \(\varvec{a}_{\mathrm{hom},L}\), where \(\varvec{a}_{\mathrm{hom},L}\) is the homogenized coefficient associated via (1) with a suitable stationary \(L\)-periodic ensemble \(\langle \cdot \rangle _L\), which we think of as having the same “specifications” as \(\langle \cdot \rangle \). This introduces a systematic error, which we will only fully study in the case of the i.i.d. ensembles;

  • approximate \(\varvec{a}_{\mathrm{hom},L}\) by \(\varvec{a}_{\mathrm{av},L}\), cf. (12), for some realization \(\varvec{a}\) distributed according to \(\langle \cdot \rangle _L\). By stationarity and periodicity we have

    $$\begin{aligned} e\cdot \varvec{a}_{\mathrm{av},L}(\varvec{a})e=L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}(\nabla \overline{\phi }(x)+e)\cdot \varvec{a}(x)(\nabla \overline{\phi }(x)+e)\quad \text {and}\quad \langle \varvec{a}_{\mathrm{av},L} \rangle _L{=}\varvec{a}_{\mathrm{hom},L}, \end{aligned}$$

    where \(\overline{\phi }\) is defined via (6). Replacing \(\varvec{a}_{\mathrm{hom},L}\) by \(\varvec{a}_{\mathrm{av},L}\) introduces a random error.

Remark 5

In the case of the i.i.d. ensemble studied in this paper, the definition of the stationary periodic ensemble \(\langle \cdot \rangle _L\) and its coupling to \(\langle \cdot \rangle \) is natural, see Definition 3. For a general ensemble this is more subtle, and we now sketch two situations in which it is clear what to do.

  1. (1)

    Suppose that \(\langle \cdot \rangle \) is the push-forward of an ensemble \(\langle \cdot \rangle _0\) under a transformation \(\Phi :\Omega \rightarrow \Omega \), which is stationary in the sense of \(\Phi (\varvec{a})(x+z)=\Phi (\varvec{a}(\cdot +z))(x)\) (in order to preserve stationarity of the ensemble), and which we think of as being short-ranged (e.g. a convolution operator with integrable kernel). Suppose further that the base measure \(\langle \cdot \rangle _0\) is such that there exists a natural stationary \(L\)-periodic version \(\langle \cdot \rangle _{0,L}\) and that it is naturally coupled to \(\langle \cdot \rangle _0\). The latter means that there exists an ensemble \(\langle \cdot \rangle _{0,c}\) for pairs \((\varvec{a},\varvec{b})\) of coefficient fields on \(([0,L)\cap \mathbb {Z})^d\) and \(\mathbb {Z}^d\), respectively, such that the distribution of \(\varvec{a}\) is that of \(\langle \cdot \rangle _{0,L}\), the distribution of \(\varvec{b}\) is that of \(\langle \cdot \rangle _0\), and \(\varvec{a}\approx \varvec{b}\) in (the middle of) \(([0,L)\cap \mathbb {Z})^d\) (as would be the case for the i.i.d. ensemble). Then a suitable \(L\)-periodic ensemble \(\langle \cdot \rangle _L\) and coupling \(\langle \cdot \rangle _{c}\) is given by the push-forward of \(\langle \cdot \rangle _{0,L}\) and \(\langle \cdot \rangle _{0,c}\) under \(\Phi \).

  2. (2)

    Suppose that, in the jargon of statistical mechanics, the ensemble \(\langle \cdot \rangle \) is an infinite-volume Gibbs measure, with a translation invariant formal Hamiltonian coming from finite-range specifications. To fix ideas, we consider the one-dimensional lattice (i.e. \(d=1\)), a two-valued spin space (i.e. \(\varvec{a}(x)\in \{\lambda ,1\}\)), and a nearest neighbor interaction (i.e. given by the specification \(H(\varvec{a}(x),\varvec{a}(x+1))\)). In this case, the natural definition of a stationary \(L\)-periodic ensemble is the following: The probability of a configuration \(\varvec{a}=(\varvec{a}(0),\ldots ,\varvec{a}(L-1))\) is given by

    $$\begin{aligned} P_L(\varvec{a})=\frac{1}{Z}\exp \left( -\sum _{x=0}^{L-2}H(\varvec{a}(x),\varvec{a}(x+1))-H(\varvec{a}(L-1),\varvec{a}(0))\right) , \end{aligned}$$

    where \(Z\) is a normalization constant. The coupling is more subtle: consider \(\langle \cdot |\varvec{a}(0)\rangle _L\), i.e. the \(L\)-periodic ensemble conditioned on \(\varvec{a}(0)\), and \(\langle \cdot |\varvec{b}(0),\varvec{b}(L)\rangle \), the infinite ensemble conditioned on \((\varvec{b}(0),\varvec{b}(L))\) (we use the letter \(\varvec{b}\) for the random variable of the infinite ensemble to avoid confusion). Because the interaction is nearest neighbor, \((\varvec{b}(1),\ldots ,\varvec{b}(L-1))\) and \(\{\varvec{b}(x)\}_{x\not \in \{0,\ldots ,L\}}\) are independent under \(\langle \cdot |\varvec{b}(0),\varvec{b}(L)\rangle \). Because the interaction is stationary, \((\varvec{a}(1),\ldots ,\varvec{a}(L-1))\) and \((\varvec{b}(1),\ldots ,\varvec{b}(L-1))\) are identically distributed under \(\langle \cdot |\varvec{a}(0)\rangle _L\) and \(\langle \cdot |\varvec{b}(0),\varvec{b}(L)\rangle \) provided \(\varvec{a}(0)=\varvec{b}(0)=\varvec{b}(L)\). Hence it is straightforward to couple the conditional measures \(\langle \cdot |\varvec{a}(0)\rangle _L\) and \(\langle \cdot |\varvec{b}(0),\varvec{b}(L)\rangle \) in such a way that \((\varvec{a}(1),\ldots ,\varvec{a}(L-1))=(\varvec{b}(1),\ldots ,\varvec{b}(L-1))\) provided \(\varvec{a}(0)=\varvec{b}(0)=\varvec{b}(L)\). Let \(\langle \cdot |\varvec{a}(0),\varvec{b}(0),\varvec{b}(L)\rangle _c\) denote this coupling. Then a coupling \(\langle \cdot \rangle _c\) of \(\langle \cdot \rangle _L\) and \(\langle \cdot \rangle \) is given as follows: Let \(P_L(\varvec{a}(0))\) denote the probability of \(\varvec{a}(0)\) under \(\langle \cdot \rangle _L\) and let \(P(\varvec{b}(0),\varvec{b}(L))\) denote the probability of \((\varvec{b}(0),\varvec{b}(L))\) under \(\langle \cdot \rangle \). For any function \(\zeta =\zeta (\varvec{a};\varvec{b})\) we set

    $$\begin{aligned} \langle \zeta \rangle _c&= \sum _{(\varvec{b}(0),\varvec{b}(L))\in \{\lambda ,1\}^2} \sum _{\varvec{a}(0)\in \{\lambda ,1\}}\langle \zeta |\varvec{a}(0),\varvec{b}(0),\varvec{b}(L)\rangle _c\\&\times P_L(\varvec{a}(0)) P(\varvec{b}(0),\varvec{b}(L)). \end{aligned}$$

    This is a good coupling: Since the distribution of \(\varvec{a}(L/2)\) under \(\langle \cdot |\varvec{a}(0)\rangle _L\) depends only weakly (exponentially weakly in \(L\gg 1\)) on \(\varvec{a}(0)\) and likewise the distribution of \(\varvec{b}(L/2)\) under \(\langle \cdot |\varvec{b}(0),\varvec{b}(L)\rangle \) depends only weakly on \((\varvec{b}(0),\varvec{b}(L))\), we have \(\varvec{a}(L/2)\approx \varvec{b}(L/2)\) under the distribution of \(\langle \cdot |\varvec{a}(0),\varvec{b}(0),\varvec{b}(L)\rangle _c\) and thus under \(\langle \cdot \rangle _c\).
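The \(L\)-periodic Gibbs construction in item (2) can be made concrete for a small chain. The sketch below uses a hypothetical two-valued interaction \(H\) (chosen only for illustration), builds \(P_L\) by brute-force enumeration, and checks that the resulting ensemble is normalized and invariant under the cyclic shift, i.e. stationary:

```python
import itertools, math

lam = 0.5                      # spin values {lam, 1}, as in the two-valued spin space
def H(a, b):
    # hypothetical nearest-neighbor interaction, for illustration only
    return -1.0 if a == b else 1.0

def periodic_gibbs(L):
    # P_L(a) = Z^{-1} exp(-sum_{x=0}^{L-2} H(a(x),a(x+1)) - H(a(L-1),a(0)))
    states = list(itertools.product([lam, 1.0], repeat=L))
    w = [math.exp(-sum(H(s[x], s[(x + 1) % L]) for x in range(L))) for s in states]
    Z = sum(w)
    return states, [wi / Z for wi in w]

states, probs = periodic_gibbs(6)
pmap = dict(zip(states, probs))
# normalized, and stationary under the cyclic shift a(x) -> a(x+1)
print(abs(sum(probs) - 1.0) < 1e-12,
      all(abs(pmap[s[1:] + s[:1]] - pmap[s]) < 1e-12 for s in states))
```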

We first discuss the random error, which is the variance of \(\varvec{a}_{\mathrm{av},L}\). Using the bound on the quartic moment of the gradient of the corrector in Proposition 1 (b), we shall prove the following optimal estimate:

Proposition 2

(Optimal variance estimate) Let \(d\ge 2\), \(\langle \cdot \rangle \) be stationary, \(L\)-periodic and satisfy \(\hbox {SG}_{L}(\rho )\). Then for all \(e\in \mathbb R^d\), \(|e|=1\), we have

$$\begin{aligned} \mathrm{var}[e\cdot \varvec{a}_{\mathrm{av},L}e]\lesssim L^{-d}, \end{aligned}$$

where the constant only depends on \(\rho ,\lambda \), and \(d\).

Proposition 2 shows that the random error decays at the rate \(L^{-\frac{d}{2}}\) of the central limit theorem. Since this error is due to fluctuations, its effect can be reduced by empirical averaging: For \(N\in \mathbb N\) consider the random matrix \(\varvec{a}_{\mathrm{av},L,N}\) defined via (15) where \(\varvec{a}^1,\ldots ,\varvec{a}^N\) denote \(N\) independent realizations of the coefficient field distributed according to \(\langle \cdot \rangle \). Under the assumption of Proposition 2, for all \(e\in \mathbb R^d\), \(|e|=1\), we then have

$$\begin{aligned} \mathrm{var}[e\cdot \varvec{a}_{\mathrm{av},L,N}e]=\frac{1}{N}\mathrm{var}[e\cdot \varvec{a}_{\mathrm{av},L}e]\lesssim \frac{1}{N}L^{-d}, \end{aligned}$$
(52)

where the constant only depends on \(\rho ,\lambda \) and \(d\).
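The variance reduction (52) is simply the independence of the \(N\) realizations; a quick Monte Carlo sketch (with a surrogate random observable standing in for \(e\cdot \varvec{a}_{\mathrm{av},L}e\), chosen only for illustration) exhibits the factor \(\frac{1}{N}\):

```python
import random, statistics

rng = random.Random(7)

def sample_mean(N):
    # mean of N independent copies of a surrogate observable standing in
    # for e . a_av,L e (hypothetical distribution, for illustration only)
    return sum(rng.uniform(0.5, 1.0) for _ in range(N)) / N

def empirical_var(N, M=20000):
    return statistics.pvariance([sample_mean(N) for _ in range(M)])

v1, v16 = empirical_var(1), empirical_var(16)
print(v1 / v16)  # close to N = 16
```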

For the systematic error we prove the following estimate under the assumption that the coefficients are i.i.d.:

Proposition 3

(Optimal estimate of the systematic error) Let \(d\ge 2\), \(\langle \cdot \rangle \) and \(\langle \cdot \rangle _L\) be the infinite and \(L\)-periodic i.i.d. ensembles associated with the same measure \(\beta \) on \(\Omega _0\) (cf. Definition 3). Then we have

$$\begin{aligned} |\varvec{a}_{\mathrm{hom}}-\varvec{a}_{\mathrm{hom},L}|\lesssim L^{-d}\ln ^dL, \end{aligned}$$

where the constant only depends on \(\lambda \) and \(d\).

Evidently, Theorem 2 is a direct consequence of Proposition 2 in the form of (52) and of Proposition 3.

In order to estimate the systematic error we require additional “inner” approximations that rely on the modified corrector equation (5). To that end notice that—as a merit of the \(\mu \)-regularization—the modified corrector \(\phi _\mu \) (associated with a direction \(e\) via (5)) can be defined independently of the ensemble: Indeed, pointwise in \(\varvec{a}\) we have \(\phi _\mu (\varvec{a})=\overline{\phi }_\mu (\varvec{a},x=0)\) where \(\overline{\phi }_\mu (\varvec{a},\cdot )\) is the unique bounded solution of

$$\begin{aligned} \mu \overline{\phi }_\mu (\varvec{a},x)+\nabla ^*\varvec{a}(x)\nabla \overline{\phi }_\mu (\varvec{a},x)=-\nabla ^*\varvec{a}(x)e,\quad x\in \mathbb Z^d. \end{aligned}$$
(53)

The resulting function \(\phi _\mu :\Omega \rightarrow \mathbb R\) is continuous and thus a measurable solution of (5) (see [21, Lemma 2.6]).

Definition 4

(\(\mu \)-approximation) Let \(\langle \cdot \rangle \) be a stationary ensemble. For \(k\in \mathbb N_0\) and \(\mu >0\) we inductively define symmetric matrices \(\varvec{a}_{\mathrm{hom},\mu }^k\) via: For all \(e\in \mathbb R^d\),

$$\begin{aligned} \begin{aligned} e\cdot \varvec{a}_{\mathrm{hom},\mu }^0e&:=e\cdot \varvec{a}_{\mathrm{hom},\mu }e:=\langle (D\phi _\mu +e)\cdot \varvec{a}(0)(D\phi _\mu +e) \rangle ,\\ \varvec{a}_{\mathrm{hom},\mu }^{k}&:=\frac{1}{2^{k+1}-1}\left( 2^{k+1} \varvec{a}^{k-1}_{\mathrm{hom},\mu }-\varvec{a}^{k-1}_{\mathrm{hom},2\mu }\right) , \end{aligned} \end{aligned}$$
(54)

where \(\phi _\mu \) denotes the modified corrector (for direction \(e\)).
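The extrapolation recursion (54) can be illustrated on a scalar toy model: assume an approximation \(a(\mu )\) whose expansion in \(\mu \) starts, beyond the limit value, at order \(\mu ^2\) (as the spectral representation in the proof of Lemma 8 suggests); each extrapolation step then cancels the leading error term. A minimal sketch, with made-up model coefficients:

```python
# Sketch of the Richardson extrapolation (54) on a scalar model:
# a(mu) = A_HOM + c2*mu^2 + c3*mu^3 + c4*mu^4, expansion starting at mu^2.
A_HOM = 2.0

def a(mu):
    # hypothetical mu-approximation with known limit A_HOM
    return A_HOM + 0.7 * mu**2 + 0.3 * mu**3 + 0.1 * mu**4

def a_k(k, mu):
    # the recursion of (54): a^k_mu = (2^{k+1} a^{k-1}_mu - a^{k-1}_{2mu}) / (2^{k+1} - 1)
    if k == 0:
        return a(mu)
    return (2**(k + 1) * a_k(k - 1, mu) - a_k(k - 1, 2 * mu)) / (2**(k + 1) - 1)

mu = 1e-2
err0 = abs(a_k(0, mu) - A_HOM)  # O(mu^2)
err1 = abs(a_k(1, mu) - A_HOM)  # O(mu^3): the mu^2 term is cancelled
err2 = abs(a_k(2, mu) - A_HOM)  # O(mu^4)
print(err0, err1, err2)
```

Each level gains one order in \(\mu\), mirroring the factor \(\mu ^{k+2}\) in the spectral formula of the proof of Lemma 8.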

By elementary spectral theory one can show that \(\varvec{a}_{\mathrm{hom},\mu }\rightarrow \varvec{a}_{\mathrm{hom}}\) as \(\mu \downarrow 0\) for general ergodic ensembles. The matrix \(\varvec{a}_{\mathrm{hom},\mu }^k\) is a “higher-order” approximation for \(\varvec{a}_{\mathrm{hom}}\) based on a Richardson extrapolation. With these approximations at hand we split the systematic error \( |\varvec{a}_{\mathrm{hom},L}-\varvec{a}_{\mathrm{hom}}|\) into two systematic sub-errors and a coupling error: For arbitrary \(\mu >0\) and \(k\in \mathbb N_0\) we have

$$\begin{aligned} \begin{aligned} |\varvec{a}_{\mathrm{hom},L}-\varvec{a}_{\mathrm{hom}}|&\le |\varvec{a}_{\mathrm{hom},L,\mu }^k-\varvec{a}_{\mathrm{hom},L}|\\&\quad +|\varvec{a}_{\mathrm{hom},\mu }^k-\varvec{a}_{\mathrm{hom}}|&\text {(systematic sub-errors)}\\&\quad +|\varvec{a}_{\mathrm{hom},L,\mu }^k-\varvec{a}_{\mathrm{hom},\mu }^k|&\text {(coupling error)}, \end{aligned} \end{aligned}$$
(55)

where \(\varvec{a}_{\mathrm{hom},\mu }^k\) and \(\varvec{a}_{\mathrm{hom},L,\mu }^k\) are associated with \(\langle \cdot \rangle \) and \(\langle \cdot \rangle _L\) respectively, through the induction (54).

We first discuss the systematic sub-errors, the estimate of which only requires (SG). Following [19], we estimate the systematic sub-errors by appealing to spectral calculus. Indeed, since the elliptic operator \(D^*\varvec{a}(0) D\) is bounded, symmetric, and non-negative on \(L^2(\Omega )\), the spectral theorem yields the existence of a spectral measure \(P(d\nu )\) on \([0,+\infty )\) such that for all \(\zeta \in L^2(\Omega )\), and suitable continuous functions \(f\), we have

$$\begin{aligned} f(D^*\varvec{a}(0) D)\zeta =\int _0^\infty f(\nu )P(d\nu )\zeta . \end{aligned}$$

As already used by Mourrat in [30], estimates on the semigroup allow one to quantify the bottom of the spectrum of \(D^*\varvec{a}(0)D\). In particular, by [19, Theorem 7] (see also [30, Theorem 2.4]), Theorem 1 confirms a conjecture of [19] and yields:

Corollary 1

Let \(d\ge 2\). Let \(\langle \cdot \rangle \) denote either a stationary ensemble that satisfies \(SG_{\infty }(\rho ){}\), or a stationary, \(L\)-periodic ensemble that satisfies \(\hbox {SG}_{L}(\rho ){}\). We let \(P(d\nu )\) denote the spectral measure of the operator \(D^*\varvec{a}(0) D\), and let \(\mathfrak {d}= -D^*\varvec{a}(0)e\) denote the RHS of the corrector equation in some fixed direction \(e\in \mathbb R^d\) with \(|e|=1\). Then, for all \(\tilde{\nu }\ge 0\), we have

$$\begin{aligned} \int _0^{\tilde{\nu }} \langle \mathfrak {d} P(d\nu )\mathfrak {d} \rangle \lesssim \tilde{\nu }^{\frac{d}{2}+1}, \end{aligned}$$

where the constant only depends on \(\rho \), \(\lambda \) and \(d\).

Based on this, we shall prove the following estimate of the systematic sub-error:

Lemma 8

Let \(d\ge 2\). Let \(\langle \cdot \rangle \) denote either a stationary ensemble that satisfies \(SG_{\infty }(\rho ){}\), or a stationary, \(L\)-periodic ensemble that satisfies \(\hbox {SG}_{L}(\rho ){}\). Let \(\varvec{a}_{\mathrm{hom}}\) and \(\varvec{a}_{\mathrm{hom},\mu }^k\) be associated with \(\langle \cdot \rangle \) via (1) and (54), respectively. Then for all non-negative integers \(k>\frac{d}{2}-2\) and \(\mu >0\) we have

$$\begin{aligned} |\varvec{a}^k_{\mathrm{hom},\mu }-\varvec{a}_{\mathrm{hom}}|\lesssim \mu ^{\frac{d}{2}}, \end{aligned}$$

where the constant only depends on \(k, \rho \), \(\lambda \), and \(d\).

Note that this lemma applies to both errors \(|\varvec{a}^k_{\mathrm{hom},\mu }-\varvec{a}_{\mathrm{hom}}|\) and \(|\varvec{a}^k_{\mathrm{hom},L,\mu }-\varvec{a}_{\mathrm{hom},L}|\).

Finally, we study the coupling error. It is the only term that relates the original ensemble \(\langle \cdot \rangle \) with the periodic proxy \(\langle \cdot \rangle _L\). Hence we need to specify the approximation mechanism used to construct \(\langle \cdot \rangle _L\) from \(\langle \cdot \rangle \). In the present contribution we discuss this issue rigorously only for i.i.d. coefficients, for which the \(L\)-periodic proxy \(\langle \cdot \rangle _L\) can unambiguously be obtained by using the same base measure \(\beta \). For our purpose the following estimate on the coupling error is sufficient:

Lemma 9

Let \(d\ge 2\), \(\langle \cdot \rangle \) and \(\langle \cdot \rangle _L\) be the infinite and \(L\)-periodic i.i.d. ensembles associated with the same measure \(\beta \) on \(\Omega _0\) (cf. Definition 3). Then there exist \(\alpha >0\) only depending on \(d\) and \(c_0>0\) only depending on \(\lambda \) and \(d\), such that for all \(\mu \in (0,1]\) and any direction \(e\in \mathbb R^d\), \(|e|=1\), we have

$$\begin{aligned} |e\cdot \varvec{a}_{\mathrm{hom},L,\mu }e-e\cdot \varvec{a}_{\mathrm{hom},\mu } e| \lesssim \sqrt{\mu }^{-\alpha }\exp (-c_0\sqrt{\mu } L), \end{aligned}$$
(56)

where the constant only depends on \(\lambda \) and \(d\).

5.2 Proofs of the auxiliary results

Proof of Proposition 2

Set \( \mathcal E:=e\cdot \varvec{a}_{\mathrm{av},L}e=L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}(\nabla \overline{\phi }(x)+e)\cdot \varvec{a}(x)(\nabla \overline{\phi }(x)+e).\) In the following we drop the subscript \(L\) in the notation for \(\tfrac{\partial }{\partial _L y}\) and \(\langle \cdot \rangle _{L,y}\).

Step 1. Estimate of \(\nabla \tfrac{\partial \overline{\phi }}{\partial y}\):

$$\begin{aligned} \sum _{x\in ([0,L)\cap \mathbb Z)^d}\left| \nabla \tfrac{\partial \overline{\phi }}{\partial y}(x)\right| ^2\lesssim \left\langle |e+\nabla \overline{\phi }(y)|^2 \right\rangle _y. \end{aligned}$$
(57)

We apply the vertical derivative \(\frac{\partial }{\partial y}\) to (6):

$$\begin{aligned} 0&= \nabla ^*\varvec{a}(x)\nabla \tfrac{\partial \overline{\phi }(x)}{\partial y}+\nabla ^*\left( \varvec{a}(x)\left\langle \nabla \overline{\phi }(x)+e \right\rangle _y-\left\langle \varvec{a}(x)(\nabla \overline{\phi }(x)+e) \right\rangle _y\right) . \end{aligned}$$

Hence, \(\frac{\partial \overline{\phi }(\varvec{a},\cdot )}{\partial y}\) is an \(L\)-periodic function and satisfies

$$\begin{aligned}&\nabla ^*\varvec{a}(x)\nabla \tfrac{\partial \overline{\phi }(\varvec{a},x)}{\partial y}=\nabla ^*\xi (\varvec{a},x,y)\nonumber \\&\qquad \text {for all } x,y\in \mathbb Z^d \hbox { and } \langle \cdot \rangle \hbox {-almost every } \varvec{a}\in \Omega , \end{aligned}$$
(58)

with a RHS given by

$$\begin{aligned} \xi (\varvec{a},x,y)=\left\langle \varvec{a}(x)(\nabla \overline{\phi }(x)+e) \right\rangle _y-\varvec{a}(x)\left\langle \nabla \overline{\phi }(x)+e \right\rangle _y. \end{aligned}$$

This yields \(\xi (\varvec{a},x,y)=0\) whenever \(x-y\not \in L\mathbb Z^d\). The weak formulation of (58) with (periodic) test-function \(\tfrac{\partial \overline{\phi }}{\partial y}\) yields the a priori estimate

$$\begin{aligned} \lambda \sum _{x\in ([0,L)\cap \mathbb Z)^d}|\nabla \tfrac{\partial \overline{\phi }}{\partial y}(x)|^2\le \nabla \tfrac{\partial \overline{\phi }}{\partial y}(y)\cdot \xi (y,y)\le 2|\nabla \tfrac{\partial \overline{\phi }}{\partial y}(y)| \langle |e+\nabla \overline{\phi }(y)| \rangle _y. \end{aligned}$$

Combined with \(|\nabla \tfrac{\partial \overline{\phi }}{\partial y}(y)|^2\le \sum _{x\in ([0,L)\cap \mathbb Z)^d}|\nabla \tfrac{\partial \overline{\phi }}{\partial y}(x)|^2\), we get (57).

Step 2. Estimate of \(\tfrac{\partial \mathcal E}{\partial y}\):

$$\begin{aligned} \left| \tfrac{\partial \mathcal E}{\partial y}\right| \lesssim L^{-d}\left( |e+\nabla \overline{\phi }(y)|^2+\left\langle |e+\nabla \overline{\phi }(y)|^2 \right\rangle _y\right) . \end{aligned}$$

Indeed, by the following vertical Leibniz rule

$$\begin{aligned} \tfrac{\partial (\zeta _1\zeta _2)}{\partial y} = \zeta _1\tfrac{\partial \zeta _2}{\partial y}+\tfrac{\partial \zeta _1}{\partial y}\zeta _2-\tfrac{\partial \zeta _1}{\partial y}\tfrac{\partial \zeta _2}{\partial y} -\left\langle \tfrac{\partial \zeta _1}{\partial y}\tfrac{\partial \zeta _2}{\partial y} \right\rangle _y, \end{aligned}$$

we have

$$\begin{aligned} \tfrac{\partial \mathcal E}{\partial y}&= L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}\varvec{a}(x):\tfrac{\partial }{\partial y}\left( \left( e+\nabla \overline{\phi }(x)\right) \otimes \left( e+\nabla \overline{\phi }(x)\right) \right) \\&+\,L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}\tfrac{\partial \varvec{a}(x)}{\partial y}:\left( e+\nabla \overline{\phi }(x)\right) \otimes \left( e+\nabla \overline{\phi }(x)\right) \\&-\,L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}\tfrac{\partial \varvec{a}(x)}{\partial y}:\tfrac{\partial }{\partial y}\left( \left( e+\nabla \overline{\phi }(x)\right) \otimes \left( e+\nabla \overline{\phi }(x)\right) \right) \\&-\,L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}\left\langle \tfrac{\partial \varvec{a}(x)}{\partial y}:\tfrac{\partial }{\partial y}\left( \left( e+\nabla \overline{\phi }(x)\right) \otimes \left( e+\nabla \overline{\phi }(x)\right) \right) \right\rangle _y. \end{aligned}$$

For convenience, we denote by \(I_1\) the first term of the RHS, and by \(I_2\) the sum of the other three terms. Since

$$\begin{aligned} \left| \tfrac{\partial \varvec{a}(x)}{\partial y}\right| \le {\left\{ \begin{array}{ll} 2&{}\text {if }x-y\in L\mathbb Z^d,\\ 0&{}\text {else,} \end{array}\right. } \end{aligned}$$

and \(|\tfrac{\partial \zeta }{\partial y}|\le |\zeta |+\langle |\zeta | \rangle _y\) for all \(\zeta \in L^2(\Omega )\), we have

$$\begin{aligned} |I_2|\lesssim L^{-d}\left( |\nabla \overline{\phi }(y)+e|^2+\left\langle |\nabla \overline{\phi }(y)+e|^2 \right\rangle _y\right) . \end{aligned}$$

To estimate \(I_1\) we appeal again to the vertical Leibniz rule:

$$\begin{aligned} I_1&= L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}\varvec{a}(x):\frac{\partial }{\partial y}\left( (e+\nabla \overline{\phi }(x))\otimes (e+\nabla \overline{\phi }(x))\right) \\&= 2L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}\varvec{a}(x):(e+\nabla \overline{\phi }(x))\otimes \nabla \tfrac{\partial \overline{\phi }}{\partial y}(x)\\&\begin{aligned}&-L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}\varvec{a}(x):\nabla \tfrac{\partial \overline{\phi }}{\partial y}(x)\otimes \nabla \tfrac{\partial \overline{\phi }}{\partial y}(x)\\&-L^{-d}\sum _{x\in ([0,L)\cap \mathbb Z)^d}\varvec{a}(x):\left\langle \nabla \tfrac{\partial \overline{\phi }}{\partial y}(x)\otimes \nabla \tfrac{\partial \overline{\phi }}{\partial y}(x) \right\rangle _y. \end{aligned} \end{aligned}$$

The first term of the RHS vanishes identically by (6), whereas the last two terms are controlled by Step 1.
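The vertical Leibniz rule used in this step is an exact algebraic identity once the vertical derivative is realized as \(\tfrac{\partial \zeta }{\partial y}=\zeta -\langle \zeta \rangle _y\), which is consistent with the bound \(|\tfrac{\partial \zeta }{\partial y}|\le |\zeta |+\langle |\zeta | \rangle _y\) used below. It can be checked exhaustively on a toy product space; a minimal sketch (two sites, uniform measure on three values; the observables are made up):

```python
import itertools

# Check of the vertical Leibniz rule: with d zeta / d y := zeta - <zeta>_y,
# where <.>_y averages over resamplings of site y with the other site frozen,
#   d(z1*z2)/dy = z1*dz2/dy + dz1/dy*z2 - dz1/dy*dz2/dy - <dz1/dy*dz2/dy>_y.
VALUES = [0.25, 0.5, 1.0]

def cond_exp(f, state, y):
    # conditional expectation: average f over resamplings of coordinate y
    return sum(f(tuple(v if i == y else s for i, s in enumerate(state)))
               for v in VALUES) / len(VALUES)

def d(f, state, y):
    return f(state) - cond_exp(f, state, y)

def z1(s): return s[0] + 2 * s[1]      # sample observables (made up)
def z2(s): return s[0] * s[1] + 1
def prod(s): return z1(s) * z2(s)

y = 0
for state in itertools.product(VALUES, repeat=2):
    lhs = d(prod, state, y)
    rhs = (z1(state) * d(z2, state, y) + d(z1, state, y) * z2(state)
           - d(z1, state, y) * d(z2, state, y)
           - cond_exp(lambda s: d(z1, s, y) * d(z2, s, y), state, y))
    assert abs(lhs - rhs) < 1e-12
print("vertical Leibniz rule verified on all states")
```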

Step 3. Conclusion via Spectral Gap Estimate.

We apply the Spectral Gap Estimate to \(\mathcal E=e\cdot \varvec{a}_{\mathrm{av},L}e\), use Step 2, and then Jensen’s inequality to bound the variance of \(e\cdot \varvec{a}_{\mathrm{av},L}e\) by the quartic moment of \(D\phi \):

$$\begin{aligned}&\mathrm{var}[e\cdot \varvec{a}_{\mathrm{av},L}e] = \left\langle (\mathcal E-\langle \mathcal E \rangle )^2 \right\rangle \le \frac{1}{\rho }\sum _{y\in ([0,L)\cap \mathbb Z)^d}\left\langle \left( \tfrac{\partial \mathcal E}{\partial y}\right) ^2 \right\rangle \\&\quad \mathop {\lesssim }\limits ^{\text {Step~2}} L^{-2d}\sum _{y\in ([0,L)\cap \mathbb Z)^d}\left( \left\langle |e+\nabla \overline{\phi }(y)|^4 \right\rangle +\left\langle \langle |e+\nabla \overline{\phi }(y)|^2 \rangle _y^2 \right\rangle \right) \\&\qquad \lesssim L^{-d}\left\langle |e+D\phi |^4 \right\rangle , \end{aligned}$$

so that the claim follows from Proposition 1 (b). \(\square \)

We only display the proofs of Corollary 1 and Lemma 8 in the case (11) since the argument in the periodic case (8) is similar.

Proof of Corollary 1

We simply apply the semigroup \(t\mapsto \exp (-tD^*\varvec{a}(0) D)\) of Theorem 1 to \(\mathfrak {d}\), and define for all \(t\ge 0\)

$$\begin{aligned} u(t):=\exp (-tD^*\varvec{a}(0)D) \mathfrak {d}. \end{aligned}$$

By the spectral theorem, \(u(t)=\int _0^\infty \exp (-t\nu )P(d\nu ) \mathfrak {d}\), so that for all \(t>0\),

$$\begin{aligned} \langle u^2(t) \rangle =\int _0^\infty \exp (-2t\nu )\langle \mathfrak {d} P(d\nu ) \mathfrak {d} \rangle \ge \exp (-2) \int _0^{\frac{1}{t}} \langle \mathfrak {d} P(d\nu ) \mathfrak {d} \rangle . \end{aligned}$$

Corollary 1 thus follows from the estimate \(\langle u^2(t) \rangle \lesssim (t+1)^{-(\frac{d}{2}+1)}\) of Theorem 1. \(\square \)

Proof of Lemma 8

We start by writing the Richardson extrapolation in spectral variables in the spirit of [25] and [35]. First, notice that by (4) we have

$$\begin{aligned} e\cdot \varvec{a}_{\mathrm{hom},\mu }e&= \langle (D\phi _\mu +e)\cdot \varvec{a}(0)(D\phi _\mu +e) \rangle \nonumber \\&= \langle e\cdot \varvec{a}(0)e \rangle +2\langle D\phi _\mu \cdot \varvec{a}(0)(D\phi _\mu +e) \rangle -\langle D\phi _\mu \cdot \varvec{a}(0)D\phi _\mu \rangle \nonumber \\&= \langle e\cdot \varvec{a}(0)e \rangle -2\mu \langle \phi _\mu ^2 \rangle -\langle \phi _\mu (D^*\varvec{a}(0)D)\phi _\mu \rangle . \end{aligned}$$
(59)

Since \(\phi _\mu = (\mu +D^*\varvec{a}(0) D)^{-1}\mathfrak {d}\), the spectral theorem yields \(\phi _\mu = \int _0^\infty \frac{1}{\mu +\nu }P(d\nu )\mathfrak {d}\), and thus (59) turns into the spectral representation formula:

$$\begin{aligned} e\cdot \varvec{a}_{\mathrm{hom},\mu } e=\langle e \cdot \varvec{a}(0) e \rangle - \int _0^\infty \frac{2\mu +\nu }{(\mu +\nu )^2}\langle \mathfrak {d} P(d\nu )\mathfrak {d} \rangle . \end{aligned}$$

By the Lebesgue dominated convergence theorem, we obtain in the limit \(\mu \downarrow 0\)

$$\begin{aligned} e\cdot \varvec{a}_{\mathrm{hom}} e&= \langle e \cdot \varvec{a}(0) e \rangle -\int _0^\infty \frac{1}{\nu }\langle \mathfrak {d} P(d\nu )\mathfrak {d} \rangle . \end{aligned}$$
(60)

The combination of both yields

$$\begin{aligned} e\cdot (\varvec{a}_{\mathrm{hom},\mu }-\varvec{a}_\mathrm{hom})e=\mu ^2\int _0^\infty \frac{1}{\nu (\mu +\nu )^2} \langle \mathfrak {d} P(d\nu )\mathfrak {d} \rangle . \end{aligned}$$
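Indeed, subtracting the two spectral representations under the integral sign reduces the formula to the elementary identity

$$\begin{aligned} \frac{1}{\nu }-\frac{2\mu +\nu }{(\mu +\nu )^2}=\frac{(\mu +\nu )^2-\nu (2\mu +\nu )}{\nu (\mu +\nu )^2}=\frac{\mu ^2}{\nu (\mu +\nu )^2}. \end{aligned}$$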

From (54) we conclude by an elementary induction argument that for all \(k\in \mathbb N_0\),

$$\begin{aligned} e\cdot (\varvec{a}_{\mathrm{hom},\mu }^k-\varvec{a}_{\mathrm{hom}})e=\mu ^{k+2}\int _0^\infty \frac{p_{k}(\nu ,\mu )}{\nu (2^0\mu +\nu )^2\cdots (2^k\mu +\nu )^2} \langle \mathfrak {d} P(d\nu )\mathfrak {d} \rangle , \end{aligned}$$

where \(p_{k}\) denotes a linear combination of monomials \(\mu ^i\nu ^{j}\) of total degree \(i+j={k}\). Since \(|p_k(\nu ,\mu )| \lesssim (\mu +\nu )^{k}\) for all \(\nu ,\mu \ge 0\),

$$\begin{aligned} |e\cdot (\varvec{a}^k_{\mathrm{hom},\mu }- \varvec{a}_\mathrm{hom}) e| \lesssim \mu ^{k+2}\int _0^\infty \frac{1}{\nu (\mu +\nu )^{k+2}}\langle \mathfrak {d} P(d\nu )\mathfrak {d} \rangle . \end{aligned}$$

Corollary 1 implies for all monotone decreasing functions \(f(\nu )\):

$$\begin{aligned} \int _0^1f(\nu )\langle \mathfrak {d}P(d\nu )\mathfrak {d} \rangle \lesssim \int _0^1f(\nu )\nu ^{\frac{d}{2}}d\nu , \end{aligned}$$

and thus

$$\begin{aligned} \int _0^{1}\frac{1}{\nu (\mu +\nu )^{k+2}} \langle \mathfrak {d} P(d\nu )\mathfrak {d} \rangle&\lesssim \int _0^{1}\nu ^{\frac{d}{2}-1}\frac{1}{(\mu +\nu )^{k+2}}\, d\nu \\&= \mu ^{\frac{d}{2}-2-k}\int _0^{\frac{1}{\mu }}\frac{\tilde{\nu }^{\frac{d}{2}-1}}{(1+\tilde{\nu })^{k+2}}\,d\tilde{\nu }\\&\lesssim \mu ^{\frac{d}{2}-2-k}, \end{aligned}$$

where we used that \(\tilde{\nu }\mapsto \frac{\tilde{\nu }^{\frac{d}{2}-1}}{(1+\tilde{\nu })^{k+2}}\) is integrable on \((0,\infty )\) for \(k>\frac{d}{2}-2\). Combined with \(\int _0^\infty \frac{1}{\nu }\langle \mathfrak {d} P(d\nu )\mathfrak {d} \rangle \le 1\) (cf. (60)) the claim of the lemma follows. \(\square \)
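The scaling \(\mu ^{\frac{d}{2}-2-k}\) of the last integral can also be checked numerically. A minimal sketch with midpoint quadrature for \(d=3\) and \(k=0\) (so that \(k>\frac{d}{2}-2\)); the quadrature parameters are arbitrary:

```python
# Numeric sketch of the key integral estimate in the proof of Lemma 8:
# the rescaled quantity
#   mu^{k+2-d/2} * \int_0^1 nu^{d/2-1} (mu+nu)^{-(k+2)} d(nu)
# should stay bounded as mu -> 0. Plain midpoint rule, illustration only.
d, k = 3, 0

def integral(mu, n=200000):
    h = 1.0 / n
    s = 0.0
    for i in range(n):
        nu = (i + 0.5) * h
        s += nu**(d / 2 - 1) / (mu + nu)**(k + 2) * h
    return s

vals = [mu**(k + 2 - d / 2) * integral(mu) for mu in (1e-1, 1e-2, 1e-3)]
print(vals)  # stays bounded as mu decreases
```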

Proof of Lemma 9

We split the proof into four steps.

Step 1. Definition of the coupling.

For \(L\in \mathbb N\) consider the periodization mapping

$$\begin{aligned} T_L:\Omega \rightarrow \Omega _L,\qquad \varvec{a}\mapsto T_L\varvec{a}, \end{aligned}$$

where \(T_L\varvec{a}\) denotes the unique element in \(\Omega _L\) such that \(\varvec{a}(x)=T_L\varvec{a}(x)\) for all \(x\in ([-\frac{L}{2},\frac{L}{2})\cap \mathbb Z)^d\). Since \(\langle \cdot \rangle \) is a product measure, \(\langle \cdot \rangle _L\) is obviously the pushforward of \(\langle \cdot \rangle \) under \(T_L\), i.e.

$$\begin{aligned} \langle f \rangle _L=\langle f\circ T_L \rangle \qquad \text {for all }\langle \cdot \rangle _L\text {-measurable }f. \end{aligned}$$
(61)
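The periodization \(T_L\) can be sketched in one dimension (the paper takes \(d\ge 2\)); `periodize` below is an illustrative helper, not notation from the paper:

```python
import random

# Sketch of the periodization T_L of Step 1, in d = 1 for brevity:
# T_L keeps the coefficients on [-L/2, L/2) and extends them L-periodically.
L = 8

def periodize(a, x):
    # evaluate T_L a at x: use the representative of x in [-L/2, L/2) mod L*Z
    x0 = (x + L // 2) % L - L // 2
    return a[x0]

random.seed(1)
a = {x: random.uniform(0.25, 1.0) for x in range(-20, 21)}  # iid field on a window

# T_L a agrees with a on the central box and is L-periodic:
assert all(periodize(a, x) == a[x] for x in range(-L // 2, L // 2))
assert all(periodize(a, x + L) == periodize(a, x) for x in range(-10, 10))
print("T_L agrees with a on the box and is L-periodic")
```

For an i.i.d. product measure, pushing forward under this map reproduces the \(L\)-periodic i.i.d. ensemble, which is the content of (61).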

Step 2. Reduction to an estimate on the corrector.

We claim that

$$\begin{aligned} |e\cdot \varvec{a}_{\mathrm{hom},L,\mu }e-e\cdot \varvec{a}_{\mathrm{hom},\mu }e| \lesssim \sum _{x_0\in \{0,e_1,\ldots ,e_d\}}\left\langle |\phi _{\mu }\circ T_{x_0}\circ T_L\circ T_{-{x_0}}-\phi _{\mu }|^2 \right\rangle ^{{\frac{1}{2}}}, \end{aligned}$$
(62)

where \(T_x:\Omega \rightarrow \Omega ,\,\varvec{a}\mapsto \varvec{a}(\cdot +x)\) is the shift operator. Thanks to (61) we have

$$\begin{aligned} e\cdot \varvec{a}_{\mathrm{hom},L,\mu }e&= \left\langle (D\phi _{\mu }+e)\cdot \varvec{a}(0) (D\phi _{\mu }+e) \right\rangle _L\\&= \left\langle \left( (D\phi _{\mu }+e)\cdot \varvec{a}(0) (D\phi _{\mu }+e)\right) \circ T_L \right\rangle \\&\mathop {=}\limits ^{\varvec{a}(0)\circ T_L=\varvec{a}(0)}\left\langle (D\phi _\mu \circ T_L+e)\cdot \varvec{a}(0) (D\phi _\mu \circ T_L+e) \right\rangle , \end{aligned}$$

so that

$$\begin{aligned}&e\cdot (\varvec{a}_{\mathrm{hom},L,\mu }-\varvec{a}_{\mathrm{hom},\mu })e\\&\quad = \left\langle (D\phi _\mu \circ T_L+e)\cdot \varvec{a}(0) (D\phi _\mu \circ T_L+e)-(D\phi _{\mu }+e)\cdot \varvec{a}(0) (D\phi _{\mu }+e) \right\rangle \\&\quad \lesssim \left\langle |D \phi _{\mu }\circ T_L-D\phi _{\mu }|^2 \right\rangle ^{{\frac{1}{2}}}\left( \left\langle |D\phi _\mu |^2 \right\rangle ^{{\frac{1}{2}}}+\left\langle |D\phi _{\mu }|^2 \right\rangle ^{{\frac{1}{2}}}_L+1\right) \\&\quad \lesssim \left\langle |D \phi _{\mu }\circ T_L-D\phi _{\mu }|^2 \right\rangle ^{{\frac{1}{2}}}, \end{aligned}$$

using the elementary a priori estimates \(\langle |D\phi _\mu |^2 \rangle ,\langle |D\phi _{\mu }|^2 \rangle _L \lesssim 1\). Combined with

$$\begin{aligned} \left\langle |D \phi _{\mu }\circ T_L{-}D\phi _{\mu }|^2 \right\rangle ^{{\frac{1}{2}}}&\le d\left\langle |\phi _{\mu }\circ T_L{-}\phi _{\mu }|^2 \right\rangle ^{{\frac{1}{2}}}\\&\,+\sum _{i=1}^d\left\langle |\phi _{\mu }\circ T_{e_i}\circ T_L\circ T_{-e_i}{-}\phi _{\mu }|^2 \right\rangle ^{{\frac{1}{2}}}, \end{aligned}$$

this yields (62).

Step 3. Representation formula for the difference of the correctors.

Let \(x_0\in \{0,e_1,\ldots ,e_d\}\) be fixed. For \(\varvec{a}\in \Omega \) define \(\tilde{\varvec{a}}:=(T_{x_0}\circ T_L\circ T_{-x_0})(\varvec{a})\). We claim that

$$\begin{aligned} \left| \phi _\mu (\tilde{\varvec{a}})-\phi _\mu (\varvec{a})\right| \le \sum \limits _{\begin{array}{c} z \in \mathbb {Z}^d \\ |z| \ge \tfrac{L}{4} \end{array}} |\nabla _z G_\mu (\tilde{\varvec{a}},0,z)||\nabla \overline{\phi }_\mu (\varvec{a},z)+e|, \end{aligned}$$
(63)

where \(G_\mu (\tilde{\varvec{a}},\cdot ,\cdot )\) is the Green’s function of the elliptic operator \(\mu +\nabla ^*\tilde{\varvec{a}}\nabla \). Indeed, since \(\overline{\phi }_\mu (\tilde{\varvec{a}},\cdot )-\overline{\phi }_\mu (\varvec{a},\cdot )\) is an \(\ell ^\infty (\mathbb Z^d)\)-solution of

$$\begin{aligned} (\mu +\nabla ^*\tilde{\varvec{a}}\nabla )\left( \overline{\phi }_\mu (\tilde{\varvec{a}},\cdot )-\overline{\phi }_\mu (\varvec{a},\cdot )\right) = \nabla ^*(\varvec{a}-\tilde{\varvec{a}})(\nabla \overline{\phi }_\mu (\varvec{a},\cdot )+e), \end{aligned}$$

the Green representation formula reads

$$\begin{aligned} \overline{\phi }_\mu (\tilde{\varvec{a}},0)-\overline{\phi }_\mu (\varvec{a},0) = \sum _{z\in \mathbb Z^d}\nabla _z G_\mu (\tilde{\varvec{a}},0,z)\cdot f(z),\quad \text {with } f(z):=(\varvec{a}(z)-\tilde{\varvec{a}}(z))(\nabla \overline{\phi }_\mu (\varvec{a},z)+e). \end{aligned}$$

By construction, we have \(\tilde{\varvec{a}}(z)=\varvec{a}(z)\) for \(z+x_0\in ([-\frac{L}{2},\frac{L}{2})\cap \mathbb Z)^d\). Hence,

$$\begin{aligned} |f(z)|\le {\left\{ \begin{array}{ll} |\nabla \overline{\phi }_\mu (\varvec{a},z)+e|&{}\text {for }|z|\ge \frac{L}{4},\\ 0&{}\text {else}, \end{array}\right. } \end{aligned}$$

and the desired estimate (63) follows.

Step 4. Conclusion.

We note that \(0\le G_\mu (\varvec{a},0,z) \lesssim \mu ^{-1}\exp (-c_0\sqrt{\mu }|z|)\), for some \(c_0>0\) only depending on \(\lambda \) and \(d\), as can be seen by an elementary argument (see e.g. [20, Lemma 23]). Thanks to discreteness, we get

$$\begin{aligned} |\nabla _z G_\mu (\varvec{a},0,z)| \lesssim \mu ^{-1}\exp \left( -c_0\sqrt{\mu }|z|\right) . \end{aligned}$$
(64)
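The exponential decay (64) can be observed numerically in a one-dimensional periodic toy problem (the paper takes \(d\ge 2\); all parameters below are made up): solve \((\mu +\nabla ^*\varvec{a}\nabla )G=\delta _0\) on \(\mathbb Z/n\mathbb Z\) by Gauss–Seidel sweeps and inspect the tail.

```python
import random

# Toy version of the decay (64) in d = 1: solve
#   mu*G(x) + a(x-1)(G(x)-G(x-1)) - a(x)(G(x+1)-G(x)) = delta_0(x)
# on Z/nZ (negative Python indices give periodicity for free).
random.seed(2)
n, mu = 64, 0.5
a = [random.uniform(0.25, 1.0) for _ in range(n)]  # ellipticity lambda = 1/4

G = [0.0] * n
for _ in range(4000):  # plenty of Gauss-Seidel sweeps for this small system
    for x in range(n):
        rhs = 1.0 if x == 0 else 0.0
        G[x] = (rhs + a[x - 1] * G[x - 1] + a[x] * G[(x + 1) % n]) \
               / (mu + a[x - 1] + a[x])

print(G[0], G[8], G[16])  # decays roughly like exp(-c0*sqrt(mu)*|x|)
```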

The combination of (62), (63), and (64) then yields

$$\begin{aligned}&e\cdot (\varvec{a}_{\mathrm{hom},L,\mu }-\varvec{a}_{\mathrm{hom},\mu })e\\&\quad \lesssim \left\langle \left( \sum _{|z|\ge \frac{L}{4}}\mu ^{-1}\exp \left( -c_0\sqrt{\mu }|z|\right) |\nabla \overline{\phi }_\mu (z)+e|\right) ^2 \right\rangle ^{\frac{1}{2}}\\&\quad \mathop {\le }\limits ^{\triangle \text {-ineq. and stationarity of }\nabla \overline{\phi }_\mu } \left( \left\langle |D\phi _\mu |^2 \right\rangle +1\right) ^{\frac{1}{2}}\sum _{|z|\ge \frac{L}{4}} \mu ^{-1}\exp \left( -c_0\sqrt{\mu }|z|\right) , \end{aligned}$$

and (56) follows from the energy estimate \(\langle {|D\phi _\mu |^2}\rangle \lesssim 1\) and the evaluation of the sum on the RHS. \(\square \)

Proof of Proposition 3

Fix a non-negative integer \(k\) with \(\frac{d}{2}-2<k<\frac{d}{2}\), and let \(\varvec{a}_{\mathrm{hom},\mu }^k\) (resp. \(\varvec{a}_{\mathrm{hom},L,\mu }^k\)) be the approximate homogenized coefficients of Definition 4 associated with \(\langle \cdot \rangle \) (resp. \(\langle \cdot \rangle _L\)). By combining (55), Lemma 8, and Lemma 9 in the form of

$$\begin{aligned} |\varvec{a}_{\mathrm{hom},L,\mu }^k-\varvec{a}_{\mathrm{hom},\mu }^k| \lesssim \sup _{\mu \le \tilde{\mu }\le 2^k\mu }|\varvec{a}_{\mathrm{hom},L,\tilde{\mu }}-\varvec{a}_{\mathrm{hom},\tilde{\mu }}|, \end{aligned}$$

we get

$$\begin{aligned} |\varvec{a}_{\mathrm{hom},L}-\varvec{a}_{\mathrm{hom}}| \lesssim \mu ^{\frac{d}{2}}+\sqrt{\mu }^{-\alpha }\exp \left( -c_0\sqrt{\mu }L\right) . \end{aligned}$$
(65)

It remains to optimize (65) in \(\mu \). To this aim, note that for all \(\alpha '>0\), \(c_0>0\) and \(L\gg 1\), we have

$$\begin{aligned} \left( \frac{\ln L}{L}\right) ^d&\lesssim \min _{\mu >0}\left( \sqrt{\mu }^{d}+\sqrt{\mu }^{-\alpha '}\exp (-c_0\sqrt{\mu }L)\right) \\&\lesssim \left( \frac{\ln L}{L}\right) ^d, \end{aligned}$$

where the constant only depends on \(\alpha '\), \(c_0\) and \(d\). Combined with (65), Proposition 3 follows. \(\square \)
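The optimization in \(\mu \) can be reproduced numerically. A minimal sketch with made-up values of \(\alpha \) and \(c_0\) (only the scaling in \(L\) matters):

```python
import math

# Sketch of the optimization of (65): minimize
#   mu^{d/2} + mu^{-alpha/2} * exp(-c0*sqrt(mu)*L)
# over mu on a log grid and compare with (ln L / L)^d.
d, alpha, c0 = 3, 1.0, 1.0  # alpha, c0 are placeholders

def bound(mu, L):
    return mu**(d / 2) + mu**(-alpha / 2) * math.exp(-c0 * math.sqrt(mu) * L)

def best(L):
    grid = [10**(-6 + 6 * i / 400) for i in range(401)]  # mu in [1e-6, 1]
    return min(bound(mu, L) for mu in grid)

ratios = [best(L) / (math.log(L) / L)**d for L in (100, 1000, 10000)]
print(ratios)  # bounded ratios: the minimum scales like (ln L / L)^d
```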

6 Estimates on the gradient of the parabolic Green’s function

6.1 Auxiliary lemmas and outline of the proof of Theorem 3

Unless stated otherwise, \(\varvec{a}\) denotes in this section an \(L\)-periodic coefficient field in \(\Omega _L\) (cf. Sect. 2.1) and \(G_L(t,\varvec{a},x,y)\) is the associated \(L\)-periodic Green’s function. When no confusion occurs, we suppress the argument \(\varvec{a}\) in the notation. Note that \(\bar{G}_L:=L^{-d}\) is the spatial average on \(([0,L)\cap \mathbb Z)^d\) of \(G_L(t,\cdot ,y)\) for all \(t>0\) and \(y\in \mathbb Z^d\). Throughout this section we shall write \(\int dx\) for the sum over \(x\in ([0,L)\cap \mathbb Z)^d\). We denote by \({d}_L\) the \(L\)-periodic weight \({d}_L(x):=\mathrm{dist}(x,L\mathbb {Z}^d)+1\) and recall that

$$\begin{aligned} \omega _L(t,x)=\left( \frac{{d}_L^2(x)}{t+1}+1\right) ^{\frac{1}{2}}. \end{aligned}$$

We first recall standard pointwise bounds on the Green’s function itself:

Lemma 10

(Estimate of \(G_L)\) For any weight exponent \(\beta <\infty \) we have

$$\begin{aligned} G_L(t,x,y)&\lesssim (t+1)^{-\frac{d}{2}}\omega _L^{-\beta }(t,x-y) \quad \text{ for }\;t\lesssim L^2,\end{aligned}$$
(66)
$$\begin{aligned} |G_L(t,x,y)-\bar{G}_L|&\lesssim L^{-d}\exp \left( -c_0\frac{t}{L^2}\right) \quad \text{ for }\;t\gtrsim L^2, \end{aligned}$$
(67)

where the constant only depends on \(\beta \), \(\lambda \), and \(d\), and \(c_0>0\) is a constant that only depends on \(\lambda \) and \(d\).

Lemma 10 essentially reflects the exponentially decaying tail of the Green’s function. It is well-known that in the continuum case the exponential decay can be upgraded to a Gaussian decay:

$$\begin{aligned} G^\mathrm{cont}(t,x,y) \lesssim t^{-\frac{d}{2}}\exp \left( -c_0\frac{|x-y|^2}{t}\right) , \end{aligned}$$

as was first established for the continuum Green’s function \(G^\mathrm{cont}\) by Nash [33] and Aronson [1]; see also Fabes and Stroock [14] for a streamlined approach. In the spatially discrete case the Gaussian behavior of the tail does not hold. A lower bound with the exact (exponential) tail behavior has been obtained by Delmotte [8]. For the upper bound and a partial Aronson lower bound see also [17, Propositions B.3 and B.4].

Lemma 10 treats the \(L\)-periodic Green’s function, not the whole-space Green’s function as addressed in [8]. Its proof is a slight refinement of the classical approach by Nash, smuggling in the weight function when needed. Since the argument is standard, we omit the proof and refer to [20, Section 7] for details.

In addition, we need for the proof of Theorem 3 the following Meyers estimate for discrete \(L\)-periodic parabolic equations:

Lemma 11

(Discrete, \(L\)-periodic, parabolic Meyers’ estimate) There exists \(\bar{q}>1\) depending only on \(\lambda \) and \(d\) such that for all \(u:\mathbb R\times \mathbb Z^d\rightarrow \mathbb R\), \(f:\mathbb R\times \mathbb Z^d\rightarrow \mathbb R\) and \(g:\mathbb R\times \mathbb Z^d\rightarrow \mathbb R^d\) compactly supported in \(t\) and \(L\)-periodic in \(x\), with \(u\) smooth in time and related to \(f\) and \(g\) via the equation

$$\begin{aligned} \partial _tu+\nabla ^*\varvec{a}\nabla u=\nabla ^* g+f \end{aligned}$$
(68)

for some \(\varvec{a}\in \Omega \), we have for all \(1<q\le \bar{q}\)

$$\begin{aligned} {\left( \int _{-\infty }^\infty \int |\nabla u|^{2q}\,dx\,dt\right) ^\frac{1}{2q}}&\lesssim \left( \int _{-\infty }^\infty \int |g|^{2q}\,dx\,dt\right) ^\frac{1}{2q}\nonumber \\&+\left( \int _{-\infty }^\infty \left( \int |f|^{\frac{2qd}{2q+d}}\,dx\right) ^\frac{2q+d}{d}dt\right) ^\frac{1}{2q},\qquad \end{aligned}$$
(69)

where the constant only depends on \(q\), \(\lambda \), and \(d\).

This result is the discrete counterpart of the well-known parabolic version of the original Meyers estimates for elliptic equations [29]. Since the proof is standard we only sketch the argument and refer to [20, Section 7] for details. The argument relies on the Calderón–Zygmund estimate for discrete parabolic and elliptic equations (with periodic boundary conditions), and a perturbation argument where \(\varvec{a}\) is viewed as a perturbation of the constant matrix \(\frac{1+\lambda }{2}\varvec{id}\) – recall that we assume the uniform ellipticity in the form of \(\lambda \le \varvec{a}\le 1\). Indeed, a direct calculation shows that we may rewrite (68) in terms of the function \(\tilde{u}:=u-\bar{u}\) (where \(\bar{u}\) denotes the mean of \(u\)) as

$$\begin{aligned} \partial _t\tilde{u} +\tfrac{1+\lambda }{2}\nabla ^*\nabla \tilde{u}=\nabla ^*\tilde{g} \end{aligned}$$

with RHS

$$\begin{aligned} \tilde{g}:=g+\nabla v-(\varvec{a}-\tfrac{1+\lambda }{2}\varvec{id})\nabla \tilde{u} \end{aligned}$$

and \(v\) given as the unique solution with \(\sum _{([0,L)\cap \mathbb Z)^d}v=0\) of the \(L\)-periodic Poisson equation

$$\begin{aligned} \nabla ^*\nabla v=f-\bar{f}\qquad \text {(with }\bar{f}\text { the mean of }f\text {)}. \end{aligned}$$

In the constant-coefficient case, i.e. when \(\varvec{a}=\frac{1+\lambda }{2}\varvec{id}\) and therefore \(\tilde{g}=g+\nabla v\), the estimate (69) directly follows from

$$\begin{aligned} \int _{-\infty }^\infty \int |\nabla \tilde{u}(t,x)|^{2q}\,dx\,dt&\lesssim \int _{-\infty }^\infty \int |\tilde{g}(t,x)|^{2q}\,dx\,dt,\end{aligned}$$
(70)
$$\begin{aligned} \int |\nabla ^2 v(x)|^{2q}\,dx\,&\lesssim \left( \int |f(x)-\bar{f}|^\frac{2qd}{2q+d}\,dx\right) ^{\frac{2q+d}{d}}. \end{aligned}$$
(71)

Estimate (70) is a discrete, parabolic Calderón–Zygmund estimate, while (71) is obtained by combining a discrete elliptic Calderón–Zygmund estimate with a Sobolev–Poincaré inequality. Both discrete Calderón–Zygmund estimates can for instance be obtained from their well-known continuum counterparts by a direct comparison of the associated discrete and continuum Fourier multipliers (see [20, Section 7.4] for details). The additional \(\nabla {\tilde{u}}\)-term that appears in the variable-coefficient case in the definition of \(\tilde{g}\) can be absorbed into the LHS. Indeed, the multiplicative constant in (70) tends to \(1\) as \(q\rightarrow 1\), whereas \(|(\varvec{a}-\frac{1+\lambda }{2}\varvec{id})\nabla \tilde{u}|\le \frac{1-\lambda }{2}|\nabla \tilde{u}|\) by uniform ellipticity. For the details we refer to [20, Step 3 and 4, Proof of Lemma 24].

The proof of Theorem 3 is organized as follows. We only address the \(L\)-periodic case, that is, statement (b). Based on Lemma 10, we shall derive the following pointwise-in-time estimate

$$\begin{aligned} \int \left( \omega _L^\alpha (t,x-y)|\nabla G_L(t,x,y)|\right) ^{2} \,dx \lesssim (t+1)^{-\frac{d}{2}-1}\exp \left( -c_0\frac{t}{L^2}\right) ,\nonumber \\ \end{aligned}$$
(72)

see Steps 1 and 2 of the proof. The most delicate part is to increase the exponent of integrability from \(2\) to \(2q>2\) by appealing to Lemma 11. Statement (a) of Theorem 3 follows from statement (b) by soft arguments in the limit \(L\uparrow \infty \). Indeed, by the Arzelà–Ascoli theorem, the periodic Green’s function \(G_L(\varvec{a}_L)\) converges pointwise to the whole-space Green’s function \(G(\varvec{a})\) if \(\varvec{a}_L\in \Omega \) is the periodic extension on \(\mathbb Z^d\) of the restriction \(\varvec{a}_{|([-L/2,L/2)\cap \mathbb Z)^d}\).

6.2 Proof of Theorem 3 (b)

We shall need a Leibniz rule and a chain rule for discrete spatial derivatives:

$$\begin{aligned} \forall u,v:\mathbb Z^d\rightarrow \mathbb R,\quad \left\{ \begin{array}{rcl} \nabla _i (uv)&{}=&{}(\nabla _i u)v+ u(\cdot +e_i) \nabla _i v,\\ \nabla ^*_i(uv)&{}=&{}(\nabla _i^* u)v+ u(\cdot -e_i) \nabla _i^* v, \end{array} \right. \end{aligned}$$
(73)

and for all \(\beta \ge 1\)

$$\begin{aligned} \forall a,b\ge 0 : |a^\beta -b^\beta |\le C (a^{\beta -1}+b^{\beta -1})|a-b|, \end{aligned}$$
(74)

where \(C\) only depends on \(\beta \).
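The discrete Leibniz rules (73) can be verified directly in a one-dimensional periodic setting; a minimal sketch:

```python
import random

# Check of the discrete Leibniz rules (73) on a periodic array (indices mod n):
#   grad u(x)      = u(x+1) - u(x)
#   grad^* u(x)    = u(x-1) - u(x)
#   grad (uv)(x)   = (grad u)(x) v(x) + u(x+1) (grad v)(x)
#   grad^*(uv)(x)  = (grad^* u)(x) v(x) + u(x-1) (grad^* v)(x)
random.seed(3)
n = 16
u = [random.random() for _ in range(n)]
v = [random.random() for _ in range(n)]

def grad(w, x): return w[(x + 1) % n] - w[x]
def grad_star(w, x): return w[x - 1] - w[x]   # negative index wraps around

uv = [u[i] * v[i] for i in range(n)]
for x in range(n):
    assert abs(grad(uv, x)
               - (grad(u, x) * v[x] + u[(x + 1) % n] * grad(v, x))) < 1e-12
    assert abs(grad_star(uv, x)
               - (grad_star(u, x) * v[x] + u[x - 1] * grad_star(v, x))) < 1e-12
print("discrete Leibniz rules hold")
```

Note the shifted factor \(u(\cdot \pm e_i)\): unlike the continuum Leibniz rule, the discrete product rule evaluates one factor at the translated site.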

For further reference we note that

$$\begin{aligned} |\nabla \omega _L(t,x)|&\le (t+1)^{-\frac{1}{2}},\end{aligned}$$
(75)
$$\begin{aligned} \omega _L(t,x)&\lesssim 1\quad \text{ for }\;t\gtrsim L^2, \end{aligned}$$
(76)

and recall that the weight \({d}_L\) satisfies

$$\begin{aligned} {d}_L(x\pm e_i)&\lesssim {d}_L(x),\end{aligned}$$
(77a)
$$\begin{aligned} |\nabla {d}_L(x)|&\le 1. \end{aligned}$$
(77b)

and that for \(2\alpha \ge 1\) we have

$$\begin{aligned} |\nabla _i{d}_L^{2\alpha }(x)|&\mathop {\lesssim }\limits ^{(74),(77\mathrm{b}),(77\mathrm{a})} {d}_L^{2\alpha -1}(x). \end{aligned}$$
(77c)
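In detail, (77c) follows by applying (74) with \(\beta =2\alpha \ge 1\) to \(a={d}_L(x+e_i)\) and \(b={d}_L(x)\):

$$\begin{aligned} |\nabla _i{d}_L^{2\alpha }(x)| \mathop {\le }\limits ^{(74)} C\left( {d}_L^{2\alpha -1}(x+e_i)+{d}_L^{2\alpha -1}(x)\right) |\nabla _i{d}_L(x)| \mathop {\lesssim }\limits ^{(77\mathrm{b}),(77\mathrm{a})} {d}_L^{2\alpha -1}(x). \end{aligned}$$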

Step 1. \(L^2_{t}\ell ^2_x\) estimate of \(\nabla G_L\):

$$\begin{aligned}&\frac{1}{T}\int _T^{2T}\int \left( \omega _L^\alpha (t,x-y)|\nabla _x G_L(t,x,y)|\right) ^2 \,dx\,dt\\&\quad \lesssim (T+1)^{-\frac{d}{2}-1}\exp \left( -c_0\frac{T}{L^2}\right) . \end{aligned}$$

In what follows, the spatial gradient of the Green’s function is always taken w.r.t. the first variable, and we write \(\nabla G_L(t,x,y)\) for \(\nabla _x G_L(t,x,y)\). Since the estimates are uniform in \(\varvec{a}\in \Omega \), we may assume \(y=0\). In view of (76), this statement follows from the two inequalities:

$$\begin{aligned} \frac{1}{T}\int _T^{2T}\int \omega _L^{2\alpha }(t,x)|\nabla G_L(t,x,0)|^2\, dx\,dt&\lesssim (T+1)^{-\frac{d}{2}-1}\quad \text{ for }\;T\lesssim L^2,\qquad \quad \end{aligned}$$
(78)
$$\begin{aligned} \frac{1}{T}\int _T^{2T}\int |\nabla G_L(t,x,0)|^2 \,dx\,dt&\lesssim L^{-d}T^{-1}\exp \left( -c_0\frac{T}{L^2}\right) \nonumber \\&\quad \text{ for }\;T\gtrsim L^2, \end{aligned}$$
(79)

up to changing \(c_0\) when bounding \(L^{-d}T^{-1}\exp (-c_0\frac{T}{L^2})\) by \(T^{-\frac{d}{2}-1}\exp (-c_0\frac{T}{L^2})\) for \(T\gtrsim L^2\). We first note that integrating the inequality

$$\begin{aligned} \begin{array}{ll} \frac{d}{dt}\int \frac{1}{2}(G_L(t,x,0)-\bar{G}_L)^2\,dx &{}=\int (G_L(t,x,0)-\bar{G}_L)\partial _tG_L(t,x,0)\, dx\\ &{}=-\int \nabla G_L(t,x,0)\cdot \varvec{a}\nabla G_L(t,x,0)\,dx\\ &{}\le -\lambda \int |\nabla G_L(t,x,0)|^2\,dx \end{array} \end{aligned}$$

from \(T\) to \(2T\) yields

$$\begin{aligned} \int _{T}^{2T}\int |\nabla G_L(t,x,0)|^2\,dx\,dt\lesssim \int (G_L(T,x,0)-\bar{G}_L)^2\,dx. \end{aligned}$$
(80)

Combined with (67), it yields (79). We now turn to (78). For \(T\lesssim 1\) this reduces to

$$\begin{aligned} \frac{1}{T}\int _T^{2T}\int \omega _L^{2\alpha }(t,x)|\nabla G_L(t,x,0)|^2\,dx\,dt&\lesssim 1, \end{aligned}$$

which follows from the discrete estimate \(|\nabla _iG_L(t,x,0)|\le |G_L(t,x,0)|+|G_L(t,x+e_i,0)|\) and (66). It remains to address the case \(1\lesssim T\lesssim L^2\). By definition of \(\omega _L\),

$$\begin{aligned}&\frac{1}{T}\int _T^{2T}\int \omega _L^{2\alpha }(t,x)|\nabla G_L(t,x,0)|^2\,dx\,dt \lesssim \frac{1}{T}\int _T^{2T}\int |\nabla G_L(t,x,0)|^2\,dx\,dt\\&\quad +\frac{1}{(T+1)^{\alpha +1}}\int _T^{2T}\int {d}_L^{2\alpha }(x)|\nabla G_L(t,x,0)|^2\,dx\,dt. \end{aligned}$$

Hence, we need to show that for all \(\alpha \ge 0\),

$$\begin{aligned} \int _T^{2T}\int {d}_L^{2\alpha }(x)|\nabla G_L(t,x,0)|^2\,dx\,dt \lesssim T^{\alpha -\frac{d}{2}}\quad \text{ for }\;T\lesssim L^2. \end{aligned}$$
(81)

By Hölder’s inequality, it is enough to show (81) for \(\alpha =0\) and \(\alpha \ge 1\). For \(\alpha =0\), this is a consequence of (80) and (66), where the latter is used for a \(\beta \) with \(\beta >\frac{d}{2}\). The starting point for \(\alpha \ge 1\) is the identity

$$\begin{aligned}&{\frac{d}{dt}\int {d}_L^{2\alpha }(x)\frac{1}{2}G_L^2(t,x,0)\,dx}\\&\quad =-\int \nabla ({d}_L^{2\alpha }(x) G_L(t,x,0))\cdot \varvec{a}(x)\nabla G_L(t,x,0)\, dx\\&\quad \mathop {=}\limits ^{(73)}-\int {d}_L^{2\alpha }(x) \nabla G_L(t,x,0)\cdot \varvec{a}(x)\nabla G_L(t,x,0)\, dx\\&\qquad \quad -\int \sum _{i=1}^d G_L(t,x+e_i,0)\nabla _i{d}_L^{2\alpha }(x)\, \varvec{a}_{ii}(x)\nabla _i G_L(t,x,0)\,dx, \end{aligned}$$

which, combined with (77c), yields

$$\begin{aligned} {\frac{d}{dt}\int {d}_L^{2\alpha }(x)\frac{1}{2}G^2_L(t,x,0)\,dx}&\le -\lambda \int {d}_L^{2\alpha }(x)|\nabla G_L(t,x,0)|^2 \,dx\\&+C\int {d}_L^{2\alpha -1}(x) G_L(t,x,0) |\nabla G_L(t,x,0)|\, dx \end{aligned}$$

for some constant \(C\) only depending on \(\alpha \) and \(d\). Young’s inequality then yields

$$\begin{aligned} \int {d}_L^{2\alpha }(x)|\nabla G_L(t,x,0)|^2\, dx&\lesssim -\frac{d}{dt}\int {d}_L^{2\alpha }(x)\frac{1}{2}G_L^2(t,x,0)\,dx\\&+\int {d}_L^{2\alpha -2}(x)G_{L}^2(t,x,0)\, dx, \end{aligned}$$

which we integrate between \(T\) and \(2T\):

$$\begin{aligned}&{\int _T^{2T}\int {d}_L^{2\alpha }(x)|\nabla G_L(t,x,0)|^2\,dx\,dt}\\&\quad \lesssim \int {d}_L^{2\alpha } (x)G_L^2(T,x,0)\,dx +\int _T^{2T}\int {d}_L^{2\alpha -2}(x)G_L^2(t,x,0) \,dx\, dt\\&\quad \lesssim (T+1)^\alpha \int \omega ^{2\alpha }_L(T,x)G_L^2(T,x,0)\,dx\\&\qquad +\int _T^{2T}(t+1)^{\alpha -1}\int \omega ^{2\alpha -2}_L(t,x)G_L^2(t,x,0)\,dx\,dt. \end{aligned}$$

We combine this inequality with (66) for a \(\beta \) with \(2\alpha -2\beta <-d\), and the estimate

$$\begin{aligned} \int \omega _L^{-r}(t,x)\,dx \le \int _{\mathbb {R}^d}\left( \frac{|x|^2}{t+1}+1\right) ^{-\frac{r}{2}}\,dx \, \lesssim (t+1)^{\frac{d}{2}} \end{aligned}$$
(82)

which holds for \(T\lesssim L^2\) and any \(r>d\), and which we apply with \(r=2(\beta -\alpha )\) and \(r=2(\beta -(\alpha -1))\). In conclusion we get (81) for \(\alpha \ge 1\).
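In detail, using (66) in the pointwise form \(G_L(t,x,0)\lesssim (t+1)^{-\frac{d}{2}}\omega _L^{-\beta }(t,x)\) (the form in which it is also used in Step 2 below), the two terms above are bounded by

$$\begin{aligned} (T+1)^\alpha \int \omega ^{2\alpha }_L(T,x)G_L^2(T,x,0)\,dx&\lesssim (T+1)^{\alpha -d}\int \omega _L^{-2(\beta -\alpha )}(T,x)\,dx \mathop {\lesssim }\limits ^{(82)} (T+1)^{\alpha -\frac{d}{2}},\\ \int _T^{2T}(t+1)^{\alpha -1}\int \omega ^{2\alpha -2}_L(t,x)G_L^2(t,x,0)\,dx\,dt&\lesssim \int _T^{2T}(t+1)^{\alpha -1-d}\int \omega _L^{-2(\beta -(\alpha -1))}(t,x)\,dx\,dt\\&\mathop {\lesssim }\limits ^{(82)} \int _T^{2T}(t+1)^{\alpha -1-\frac{d}{2}}\,dt \lesssim T^{\alpha -\frac{d}{2}}, \end{aligned}$$

where both applications of (82) are admissible: the condition \(2\alpha -2\beta <-d\) ensures \(r=2(\beta -\alpha )>d\), and a fortiori \(r=2(\beta -(\alpha -1))>d\).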

Step 2. \(L^\infty _t \ell ^2_x\) estimate of \(\nabla G_L\):

$$\begin{aligned} \int \left( \omega _L^\alpha (t,x-y)|\nabla G_L(t,x,y)|\right) ^2 \,dx&\lesssim (t+1)^{-\frac{d}{2}-1}\exp \left( -c_0\frac{t}{L^2}\right) .\nonumber \\ \end{aligned}$$
(83)

This upgrade of Step 1 follows from the semigroup property \(G_L(T,x,y)=\int G_L(t,x,z)G_L(T-t,z,y)\,dz\), which we use in the form of

$$\begin{aligned} G_L(T,x,y) = \frac{3}{T}\int _{T/3}^{2T/3}\int G_L(t,x,z) G_L(T-t,z,y) dz dt. \end{aligned}$$

We differentiate this identity w.r.t. \(x\) and use that \(\frac{3}{T}\int _{T/3}^{2T/3}\int G_L(T-t,z,y)dzdt=1\), so that by Jensen’s inequality

$$\begin{aligned} |\nabla G_L(T,x,y)|^2\le \frac{3}{T}\int _{T/3}^{2T/3}\int G_L(T-t,z,y) |\nabla G_L(t,x,z)|^2 dz dt.\quad \end{aligned}$$
(84)

Note that for all \(\theta \in [0,1]\), we have by the triangle inequality for all \(t\ge 0\), \(x,y,z\in \mathbb Z^d\),

$$\begin{aligned} \omega _L^2(t,x-y)&= \frac{{d}_L^2(x-y)}{t+1}+1 \le 2 \frac{{d}_L^2(x-z)+{d}_L^2(z-y)}{t+1}+1 \nonumber \\&\le 2\left( \frac{{d}_L^2(x-z)}{t+1}+1 \right) \left( \frac{{d}_L^2(z-y)}{t+1}+1 \right) \nonumber \\&\le 2\omega _L^2(\theta t,x-z)\omega _L^2((1-\theta )t,z-y). \end{aligned}$$
(85)

Thus, integrating (84) on \(([0,L)\cap \mathbb Z)^d\) and using (85) with \(\theta =\frac{t}{T}\in [0,1]\) yields

$$\begin{aligned}&{\int \omega _L^{2\alpha }(T,x-y)|\nabla G_L(T,x,y)|^2\,dx}\nonumber \\&\quad \lesssim \frac{3}{T}\int _{T/3}^{2T/3}\int \omega _L^{2\alpha }(T-t,y-z) G_L(T{-}t,z,y)\!\!\! \nonumber \\&\qquad \times \int \omega _L^{2\alpha }(t,x-z)|\nabla G_L(t,x,z)|^2 \,dx\;dz\, dt. \end{aligned}$$
(86)

Next, we use Lemma 10 to estimate the term \(\omega _L^{2\alpha }(t',z-y)G_L(t',z,y)\) for \(t'=T-t\) in (86):

Case \(t'\lesssim L^2\): Estimate (66) shows that \(G_L(t',z,y)\lesssim (t'+1)^{-\frac{d}{2}}\omega _L^{-\gamma }(t',z-y)\) for any weight exponent \(\gamma <\infty \). Taking \(\gamma =2\alpha +r\) for some \(r>d\) yields

$$\begin{aligned} \omega _L^{2\alpha }(t',z-y)G_L(t',z,y) \lesssim {(t'+1)}^{-\frac{d}{2}}\omega _L^{-r}(t',z-y). \end{aligned}$$

Case \(t'\gtrsim L^2\): Estimate (67) shows that \(G_L(t',z,y)\lesssim L^{-d}\). Combined with (76), this yields

$$\begin{aligned} \omega _L^{2\alpha }(t',z-y)G_L(t',z,y)\lesssim L^{-d}. \end{aligned}$$

Since \(t'=T-t\in [T/3,2T/3]\), we get

$$\begin{aligned}&\omega _L^{2\alpha }(T-t,z-y)G_L(T-t,z,y)\nonumber \\&\quad \lesssim \left\{ \begin{array}{lcc} (T+1)^{-\frac{d}{2}}\omega _L^{-r}(T,z-y)&{}\text{ for }&{}T\le L^2\\ L^{-d } &{}\text{ for }&{}T\ge L^2 \end{array} \right\} . \end{aligned}$$
(87)

We note for further reference that the integral of the RHS is of order 1:

$$\begin{aligned} \left\{ \begin{array}{rcl} T\le L^2&{}:&{}\int (T+1)^{-\frac{d}{2}}\omega _L^{-r}(T,z-y) dz\\ T\ge L^2&{}:&{}\int L^{-d } dz \end{array} \right\} \lesssim 1, \end{aligned}$$
(88)

which follows from (82).

Inserting (87) into (86) yields by Fubini’s theorem

$$\begin{aligned} {\int \omega _L^{2\alpha }(T,x-y)|\nabla G_L(T,x,y)|^2\,dx}&\lesssim \int \left\{ \begin{array}{lcc} (T+1)^{-\frac{d}{2}}\omega _L^{-r}(T,z-y)&{}\text{ for }&{}T\le L^2\\ L^{-d}&{}\text{ for }&{}T\ge L^2 \end{array} \right\} \\&\times \frac{1}{T}\int _{T/3}^{2T/3}\int \omega _L^{2\alpha }(t,x-z)|\nabla G_L(t,x,z)|^2 \,dx \,dt\,dz. \end{aligned}$$

By Step 1 (with \(T\) replaced by \(T/3\) and \(y\) replaced by \(z\)), this estimate turns into

$$\begin{aligned}&\int \omega _L^{2\alpha }(T,x{-}y)|\nabla G_L(T,x,y)|^2\,dx\\&\quad \lesssim \int \left\{ \begin{array}{lcc} (T{+}1)^{{-}\frac{d}{2}}\omega _L^{{-}r}(T,z{-}y)&{}\text{ for }&{}T\le L^2\\ L^{{-}d}&{}\text{ for }&{}T\ge L^2 \end{array} \right\} dz\\&\qquad \times (T{+}1)^{{-}\frac{d}{2}{-}1}\exp \left( {-}c_0\frac{T}{L^2}\right) , \end{aligned}$$

which by (88) yields (83).

Step 3. Weighted \(L^{2q}_{t}\ell ^{2q}_x\)-estimate of \(\nabla G_L\): For all \(1\le q< q_0:=\min \{\bar{q},\frac{d}{d-2}\}\), where \(\bar{q}\) is the Meyers exponent of Lemma 11, we have

$$\begin{aligned}&\left( \frac{1}{T}\int _T^{2T}\int \left( \omega _L^\alpha (t,x-y)|\nabla G_L(t,x,y)|\right) ^{2q}\,dx\,dt\right) ^\frac{1}{2q}\nonumber \\&\quad \lesssim (T+1)^{-\frac{d}{2}-\frac{1}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{T}{L^2}\right) , \end{aligned}$$
(89)

where the constant only depends on \(\lambda \), \(d\), \(\alpha \), and \(q\). In order to treat the cases \(T\le L^2\) and \(T\ge L^2\) simultaneously, we introduce the notation

$$\begin{aligned} G_L'(t,x,y):=\left\{ \begin{array}{lcc} G_L(t,x,y)&{}\text{ for }&{}T\le L^2\\ G_L(t,x,y)-\bar{G}_L&{}\text{ for }&{}T\ge L^2\\ \end{array}\right\} . \end{aligned}$$

W.l.o.g. we assume that \(y=0\). We first treat the case \(T\lesssim 1\). Since \(|\nabla _i G_L(t,x,0)|\le |G_L(t,x,0)|+|G_L(t,x+e_i,0)|\), (77a) and the discrete \(\ell ^{2q}_x\)-\(\ell ^2_x\)-estimate yield

$$\begin{aligned} \int \left( \omega _L^{\alpha }(t,x)|\nabla G_L(t,x,0)|\right) ^{2q}\, dx&\lesssim \int \left( \omega _L^{\alpha }(t,x)|G_L(t,x,0)|\right) ^{2q}\, dx\\&\le \left( \int \left( \omega _L^{\alpha }(t,x)|G_L(t,x,0)|\right) ^{2}\,dx\right) ^q, \end{aligned}$$

so that (89) follows from (66) for \(T\lesssim 1\).

We now assume that \(T\gtrsim 1\), so that \(T\sim (T+1)\). Let \(\eta :\mathbb R^+\rightarrow [0,1]\) be a smooth temporal cut-off function for the interval \([T,2T]\) such that \(\eta (t)=0\) for \(t\le \frac{T}{2}\) and \(t\ge 4T\), \(\eta (t)=1\) for \(T\le t\le 2T\), and \(|\frac{d\eta }{dt}|\lesssim \frac{1}{T}\). We shall apply the Meyers estimate of Lemma 11 to \(u(t,x)=\eta (t)\omega _L^\alpha (T,x)G_L'(t,x,0)\) (note that \(\omega _L^\alpha (T,x)\) does not depend on \(t\)). By applying the (continuum) Leibniz rule to the time derivative, by adding and subtracting the term \(\eta (\nabla ^*\varvec{a}\nabla G'_L)\omega _L^\alpha \), and by the defining equation of the Green’s function

$$\begin{aligned} \partial _tG_L(t,\varvec{a},x,0)+\nabla ^*_x\varvec{a}(x)\nabla _x G_L(t,\varvec{a},x,0) =0, \end{aligned}$$

we have

$$\begin{aligned} \partial _tu+\nabla ^*\varvec{a}\nabla u&= (\partial _t G_L'+\nabla ^*\varvec{a}\nabla G_L')\eta \omega _L^\alpha +\eta \nabla ^*\varvec{a}\nabla (\omega _L^\alpha G_L')\nonumber \\&-\eta (\nabla ^*\varvec{a}\nabla G_L')\omega _L^\alpha +(\partial _t\eta )\omega _L^\alpha G_L'\nonumber \\&= \eta \nabla ^*\varvec{a}\nabla (\omega _L^\alpha G_L')-\eta (\nabla ^*\varvec{a}\nabla G_L')\omega _L^\alpha +(\partial _t\eta )\omega _L^\alpha G_L'.\qquad \end{aligned}$$
(90)

For the first term on the RHS we use the discrete Leibniz rule (73) for all \(i=1,\ldots ,d\):

$$\begin{aligned} \nabla ^*_i\left( \varvec{a}_{ii}\nabla _i(\omega _L^\alpha G_L')\right)&= \nabla ^*_i\left( \varvec{a}_{ii}(\nabla _iG_L')\omega _L^\alpha \right) +\nabla ^*_i\left( \varvec{a}_{ii}(\nabla _i\omega _L^\alpha )G_L'(\cdot +e_i)\right) \\&= \left( \nabla ^*_i(\varvec{a}_{ii}\nabla _iG_L')\right) \omega _L^\alpha + (\nabla ^*_i\omega _L^\alpha )\left( \varvec{a}_{ii}\nabla _iG_L'\right) (\cdot -e_i)\\&+\nabla ^*_i\left( \varvec{a}_{ii}(\nabla _i\omega _L^\alpha )G_L'(\cdot +e_i)\right) . \end{aligned}$$

Hence (90) turns into

$$\begin{aligned} \partial _t u+\nabla ^*\varvec{a}\nabla u=\nabla ^*(\varvec{a}g_0)+f_1+f_2 \end{aligned}$$
(91)

with

$$\begin{aligned} g_0(t,x)&:= \sum _{i=1}^d\left( \nabla _i\omega _L^\alpha (T,x)\right) \,\eta (t) G_L'(t,x+e_i,0)e_i,\\ f_1(t,x)&:= \tfrac{d\eta }{dt}(t)\,\omega _L^\alpha (T,x) G_L'(t,x,0),\\ f_2(t,x)&:= \eta (t)\sum _{i=1}^d\left( \nabla _i^*\omega _L^\alpha (T,x)\right) \,\varvec{a}_{ii}(x-e_i)\nabla _i G_L'(t,x-e_i,0). \end{aligned}$$

Since we have written (91) in the form of (68) we may apply Lemma 11. We thus need to estimate \(\varvec{a}g_0\), \(f_1\) and \(f_2\). It suffices to establish the pointwise-in-time estimates

$$\begin{aligned}&\left( \int |\varvec{a}(x) g_0(t,x)|^{2q}\,dx\right) ^\frac{1}{2q} \lesssim \eta (t)t^{-\frac{d}{2}-\frac{1}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{t}{L^2}\right) ,\end{aligned}$$
(92a)
$$\begin{aligned}&\left( \int |f_1(t,x)|^{\frac{2qd}{2q+d}}\,dx\right) ^\frac{2q+d}{2qd} \lesssim \left| \tfrac{d\eta }{dt}(t)\right| t^{-\frac{d}{2}+\frac{1}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{t}{L^2}\right) ,\end{aligned}$$
(92b)
$$\begin{aligned}&\left( \int |f_2(t,x)|^{\frac{2qd}{2q+d}}\,dx\right) ^\frac{2q+d}{2qd} \lesssim t^{-\frac{d}{2}-\frac{1}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{t}{L^2}\right) . \end{aligned}$$
(92c)

Indeed, (92a)–(92c) yield

$$\begin{aligned}&\left( \int _{-\infty }^\infty \int |\varvec{a}(x) g_0(t,x)|^{2q}\,dx\,dt\right) ^\frac{1}{2q} \!+\!\left( \int _{-\infty }^\infty \left( \int |f_1(t,x)|^{\frac{2qd}{2q+d}}\,dx\right) ^\frac{2q+d}{d}dt\right) ^\frac{1}{2q}\\&\quad + \left( \int _{-\infty }^\infty \left( \int |f_2(t,x)|^{\frac{2qd}{2q+d}}\,dx\right) ^\frac{2q+d}{d}dt\right) ^\frac{1}{2q} \lesssim T^{\frac{1}{2q}}T^{-\frac{d}{2}-\frac{1}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{T}{L^2}\right) , \end{aligned}$$

so that the desired estimate (89) follows from Lemma 11 and the identity \(\nabla u=\eta \omega _L^\alpha \nabla G_L+g_0\).
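For the reader’s convenience, the identity \(\nabla u=\eta \omega _L^\alpha \nabla G_L+g_0\) is a direct consequence of the discrete Leibniz rule (73) applied to the product \(G_L'\,\omega _L^\alpha \):

$$\begin{aligned} \nabla _i u(t,x)&=\eta (t)\nabla _i\left( G_L'(t,\cdot ,0)\,\omega _L^\alpha (T,\cdot )\right) (x)\\&\mathop {=}\limits ^{(73)}\eta (t)\,\omega _L^\alpha (T,x)\nabla _iG_L'(t,x,0)+\eta (t)\, G_L'(t,x+e_i,0)\nabla _i\omega _L^\alpha (T,x), \end{aligned}$$

where \(\nabla G_L'=\nabla G_L\) since \(G_L'\) and \(G_L\) differ at most by the constant \(\bar{G}_L\), and the second term is precisely the \(i\)-th component of \(g_0\).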

It remains to prove (92a)–(92c). From (74), (77a), (75), and the definitions of \(g_0\), \(f_1\) and \(f_2\), we learn that

$$\begin{aligned} \left\{ \begin{array}{l} \int |\varvec{a}(x) g_0(t,x)|^{2q}\,dx\lesssim \eta ^{2q}(t) (t+1)^{-\frac{1}{2}2q}\int |\omega _L^{\alpha -1}(T,x) G_L'(t,x,0)|^{2q}\,dx,\\ \int |f_1(t,x)|^{\frac{2qd}{2q+d}}\,dx\lesssim \left| \tfrac{d\eta }{dt}(t)\right| ^{\frac{2qd}{2q+d}}\int \left| \omega _L^{\alpha } (T,x)G_L'(t,x,0)\right| ^{\frac{2qd}{2q+d}}\,dx,\\ \int |f_2(t,x)|^{\frac{2qd}{2q+d}}\,dx\lesssim \eta ^{\frac{2qd}{2q+d}}(t) (T{+}1)^{-\frac{1}{2}\frac{2qd}{2q+d}}\int \left( \omega _L^{\alpha -1}(T,x)|\nabla G_L(t,x,0)|\right) ^{\frac{2qd}{2q+d}}\,dx. \end{array} \right. \end{aligned}$$
(93)

By definition of \(\eta \), we only need to consider \(t\ge \frac{T}{2}\gtrsim 1\). Since \(\omega _L(t,x)\sim \omega _L(T,x)\) (uniformly in \(x\)) for all \(\frac{T}{2}\le t\le 4T\), estimates (92a)–(92c) follow from (93) combined with

$$\begin{aligned}&\left( \int |\omega _L^{\alpha -1} (t,x)G_L'(t,x,0)|^{2q}\,dx\right) ^\frac{1}{2q} \lesssim t^{-\frac{d}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{t}{L^2}\right) ,\end{aligned}$$
(94a)
$$\begin{aligned}&\left( \int |\omega _L^{\alpha } (t,x)G_L'(t,x,0)|^{\frac{2qd}{2q+d}}\,dx\right) ^\frac{2q+d}{2qd} \lesssim t^{-\frac{d}{2}+\frac{1}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{t}{L^2}\right) ,\end{aligned}$$
(94b)
$$\begin{aligned}&\left( \int (\omega _L^{\alpha -1}(t,x)|\nabla G_L(t,x,0)|)^{\frac{2qd}{2q+d}}\,dx\right) ^\frac{2q+d}{2qd} \lesssim t^{-\frac{d}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{t}{L^2}\right) ,\nonumber \\ \end{aligned}$$
(94c)

which we prove next.

Note that (94a) and (94b) can be combined into the single statement: For all \(1\le {q'}<\infty \) and \(0\le \alpha <\infty \),

$$\begin{aligned} \left( \int |\omega _L^\alpha (t,x) G_L'(t,x,0)|^{{q'}}\,dx\right) ^\frac{1}{{q'}}&\lesssim t^{-\frac{d}{2}+\frac{d}{2}\frac{1}{{q'}}}\exp \left( -c_0\frac{t}{L^2}\right) . \end{aligned}$$
(95)

To prove (95) we distinguish the cases \(T\le L^2\) and \(T\ge L^2\). In the latter case we have \(t\ge \frac{T}{2}\ge \frac{1}{2}L^2\), so that (67) in Lemma 10 combined with (76) yields

$$\begin{aligned} \left( \int \left| \omega _L^\alpha (t,x) (G_L(t,x,0)-\bar{G}_L)\right| ^{{q'}}\,dx\right) ^\frac{1}{{q'}}&\lesssim L^{-d+d\frac{1}{{q'}}}\exp \left( -c_0\frac{t}{L^2}\right) \\&\lesssim t^{-\frac{d}{2}+\frac{d}{2}\frac{1}{{q'}}}\exp \left( -c_0\frac{t}{L^2}\right) , \end{aligned}$$

up to redefining \(c_0\). This proves (95) for \(T\ge L^2\).
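The redefinition of \(c_0\) hides only the elementary bound: since \(t\gtrsim L^2\),

$$\begin{aligned} L^{-d+d\frac{1}{q'}}\exp \left( -c_0\frac{t}{L^2}\right)&= t^{-\frac{d}{2}+\frac{d}{2}\frac{1}{q'}}\left( \frac{t}{L^2}\right) ^{\frac{d}{2}\left( 1-\frac{1}{q'}\right) }\exp \left( -\frac{c_0}{2}\frac{t}{L^2}\right) \exp \left( -\frac{c_0}{2}\frac{t}{L^2}\right) \\&\lesssim t^{-\frac{d}{2}+\frac{d}{2}\frac{1}{q'}}\exp \left( -\frac{c_0}{2}\frac{t}{L^2}\right) , \end{aligned}$$

since \(s^{\gamma }e^{-\frac{c_0}{2}s}\lesssim 1\) uniformly in \(s\ge 0\) for the fixed exponent \(\gamma =\frac{d}{2}(1-\frac{1}{q'})\ge 0\).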

For \(T\le L^2\), we have \(t\le 4L^2\), so that (66) in Lemma 10 (with \(\beta =\alpha +r/q'\) for some \(r>d\)) yields

$$\begin{aligned} \left( \int \left( \omega _L^\alpha (t,x)G_L(t,x,0)\right) ^{q'}\,dx\right) ^\frac{1}{q'}&\lesssim t^{-\frac{d}{2}}\left( \int \omega _L^{-r}(t,x)\,dx\right) ^\frac{1}{q'}\\&\,\mathop {\lesssim }\limits ^{r>d,(82)} t^{-\frac{d}{2}}t^{\frac{d}{2}\frac{1}{q'}}. \end{aligned}$$

This establishes (95) for \(T\le L^2\).

We finally turn to (94c). We note that for \(q<\frac{d}{d-2}\) in dimensions \(d>2\) (and all \(q<\infty \) in dimension \(d=2\)) we have \(\frac{qd}{2q+d}<1\). For all \(1\le q <q_0:=\min \{\bar{q},\frac{d}{d-2}\}\), we then have by Hölder’s inequality with exponents \((\frac{2q+d}{2q+d-qd},\frac{2q+d}{qd})\),

$$\begin{aligned}&\left( \int \left( \omega _L^{\alpha -1}(t,x)|\nabla G_L(t,x,0)|\right) ^\frac{2qd}{2q+d}\,dx\right) ^\frac{2q+d}{2qd}\\&\quad = \left( \int \omega _L^{-\frac{2qd}{2q+d}}(t,x)\left( \omega _L^{\alpha }(t,x)|\nabla G_L(t,x,0)|\right) ^\frac{2qd}{2q+d}\,dx\right) ^\frac{2q+d}{2qd}\\&\quad \mathop {\le }\limits ^{\frac{2qd}{2q+d}<2,\text {H}\ddot{\mathrm{o}}\text {lder}} \left( \int \omega _L^{-\frac{2qd}{2q+d-qd}}(t,x)\,dx\right) ^\frac{2q+d-qd}{2qd} \left( \int \left( \omega _L^{\alpha }(t,x)|\nabla G_L(t,x,0)|\right) ^2\,dx\right) ^\frac{1}{2}\\&\quad \mathop {\lesssim }\limits ^{\frac{2qd}{2q+d(1-q)}>d,(82)} t^{\frac{d}{2}\frac{2q+d-qd}{2qd}} \left( \int \left( \omega _L^{\alpha }(t,x)|\nabla G_L(t,x,0)|\right) ^2\,dx\right) ^\frac{1}{2}\\&\quad \mathop {\lesssim }\limits ^{(83)} t^{\frac{d}{2}\frac{2q+d-qd}{2qd}} \left( t^{-\frac{d}{2}-1}\exp \left( -c_0\frac{t}{L^2}\right) \right) ^\frac{1}{2}\\&\quad \lesssim t^{-\frac{d}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{t}{L^2}\right) . \end{aligned}$$

This establishes (94c).
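The exponent bookkeeping in the last step is as follows:

$$\begin{aligned} \frac{d}{2}\,\frac{2q+d-qd}{2qd}+\frac{1}{2}\left( -\frac{d}{2}-1\right) =\left( \frac{1}{2}+\frac{d}{4q}-\frac{d}{4}\right) -\frac{d}{4}-\frac{1}{2} =-\frac{d}{2}+\frac{d}{2}\,\frac{1}{2q}. \end{aligned}$$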

Step 4. \(L^\infty _t\ell ^{2q}_{x}\)-estimate for \(\nabla G_L\): For all \(1\le q<q_0=\min \{\bar{q},\frac{d}{d-2}\}\), and all \(0\le \alpha <\infty \), we have

$$\begin{aligned}&\left( \int \left( \omega _L^\alpha (T,x-y)|\nabla G_L(T,x,y)|\right) ^{2q}\,dx\right) ^\frac{1}{2q}\\&\quad \lesssim (T+1)^{-\frac{d}{2}-\frac{1}{2}+\frac{d}{2}\frac{1}{2q}}\exp \left( -c_0\frac{T}{L^2}\right) . \end{aligned}$$

We essentially repeat the argument of Step 2. The only differences are:

  • In (84), we use Jensen’s inequality applied to \(\mathbb {R}^d\ni g\mapsto |g|^{2q}\):

    $$\begin{aligned} |\nabla G_L(T,x,y)|^{2q}\le \frac{3}{T}\int _{T/3}^{2T/3}\int G_L(T-t,z,y) |\nabla G_L(t,x,z)|^{2q} dz dt. \end{aligned}$$
  • In (86), we multiply by \(\omega ^{2q\alpha }\):

    $$\begin{aligned}&{\int \omega _L^{2q\alpha }(T,x-y)|\nabla G_L(T,x,y)|^{2q}\,dx}\nonumber \\&\quad \le \frac{3}{T}\int _{T/3}^{2T/3}\int \omega _L^{2q\alpha }(T-t,y-z) G_L(T-t,z,y) \nonumber \\&\qquad \times \int \omega _L^{2q\alpha }(t,x-z)|\nabla G_L(t,x,z)|^{2q} \,dx \,dz \,dt. \end{aligned}$$
    (96)
  • In (87), we replace \(2\alpha \) by \(2q\alpha \):

    $$\begin{aligned} \omega _L^{2q\alpha }(T-t,z-y)G_L(T-t,z,y)\lesssim \left\{ \begin{array}{lcc} T^{-\frac{d}{2}}\omega _L^{-r}(T,z-y)&{}\text{ for }&{}T\le L^2\\ L^{-d } &{}\text{ for }&{}T\ge L^2 \end{array} \right\} . \end{aligned}$$
    (97)

Inserting (97) into (96) then yields

$$\begin{aligned}&{\int \omega _L^{2q\alpha }(T,x-y)|\nabla G_L(T,x,y)|^{2q}\,dx}\\&\quad \lesssim \int \left\{ \begin{array}{lcc} T^{-\frac{d}{2}}\omega _L^{-r}(T,z-y)&{}\text{ for }&{}T\le L^2\\ L^{-d } &{}\text{ for }&{}T\ge L^2 \end{array} \right\} \\&\qquad \times \frac{1}{T}\int _{T/3}^{2T/3}\int \omega _L^{2q\alpha }(t,x-z)|\nabla G_L(t,x,z)|^{2q} \,dx \,dt\,dz, \end{aligned}$$

and the conclusion follows from Step 3 and estimate (88).