1 Introduction

In classical continuum mechanics, the (static) behavior of an elastic body subject to applied body forces and imposed boundary values is described in terms of a deformation mapping which satisfies the (static) partial differential equations of nonlinear elasticity theory. For hyperelastic materials, these equilibrium equations of elastostatics are the Euler–Lagrange equations of an associated energy functional in which the stored elastic energy enters in terms of an integral of a stored energy function acting on the local deformation gradient. Stable configurations are given by deformations which are local minimizers of this functional. The stored energy density in particular induces the stress–strain relation and thus encodes the elastic properties of such a material.

On a more basic level, crystalline solids may be viewed as particle systems consisting of interacting atoms on (a portion of) a Bravais lattice. The interatomic cohesive and repulsive forces, which are dominantly induced by the atomic electronic structure, can effectively be modeled in terms of classical interaction potentials. Stable configurations are now local minimizers of the total atomistic energy and solutions of the high dimensional system of equations for the force balance.

The classical connection between atomistic and continuum models of nonlinear elasticity is provided by the Cauchy–Born rule: The continuum stored energy function associated to a macroscopic affine map is given by the energy (per unit volume) of a crystal which is homogeneously deformed with the same affine mapping. We will call this the Cauchy–Born energy density in the following. In particular, the rule entails the assumption that there are no fine scale oscillations on the atomistic scale. Indeed, if one assumes the Cauchy–Born rule to hold true and consequently requires that every individual atom follow a smooth macroscopic deformation mapping, one can derive a continuum energy expression for such a deformation from given atomistic potentials as shown by Blanc, Le Bris and Lions, cf. [1]. To leading order in the small lattice spacing parameter \(\varepsilon \) one then obtains the continuum energy functional with the Cauchy–Born energy density. However, it is not clear a priori that the Cauchy–Born hypothesis is true. Moreover, it is desirable to not only obtain a pointwise convergence result for the corresponding energy functionals, but also to relate the solutions of the continuum problem to equilibrium configurations of the associated atomistic system.

Our aim in this work is to establish a rigorous link between atomistic models in their asymptotic regime \(\varepsilon \rightarrow 0\) and the corresponding Cauchy–Born models from continuum mechanics for the nonlinear elastic behavior of crystalline solids accounting for body forces and boundary values. To be more precise, from a macroscopic point of view our goal is to show that, under suitable stability assumptions, for each solution of the continuous boundary value problem there are solutions of the associated atomistic boundary value problems with lattice spacing \(\varepsilon \) which converge to the given continuum solution as \(\varepsilon \rightarrow 0\). Conversely, from the microscopic point of view, we aim at establishing sufficient conditions on atomistic body forces and boundary values that yield stable discrete solutions close to the corresponding continuum solution and thus obeying the Cauchy–Born rule.

The crucial assumption of our analysis is an atomistic stability condition which effectively rules out atomistic relaxation effects. We therefore study this condition in detail with the further objective to provide criteria which are well-adapted to applications and verifiable for specific models of interest.

Over the last 15 years in particular there has been considerable progress in identifying conditions that allow for a mathematically rigorous justification of the Cauchy–Born rule. Here, we restrict our attention to those contributions which have directly influenced our results. For a general review on the Cauchy–Born rule we refer to the survey article [9] by Ericksen. In their seminal contribution [10] Friesecke and Theil consider a two-dimensional mass-spring model and prove that the Cauchy–Born rule does indeed hold true for small strains, while it in general fails for large strains. Their result has then been generalized to a wider class of discrete models and arbitrary dimensions by Conti, Dolzmann, Kirchheim and Müller in [5]. More specifically, in these papers a version of the Cauchy–Born rule is established by considering a box containing a portion of a crystal and showing that, under the condition that the atoms in a boundary layer (whose width depends on the maximal interatomic interaction length) follow a given affine deformation, the global minimizer of the energy is given by the homogeneous deformation in which all atoms follow that affine deformation. In [4] we showed that these results can be combined with abstract results on integral representation to give a link in terms of \(\Gamma \)-convergence and, in particular, convergence of global minimizers of the atomistic energy to the continuum energy with Cauchy–Born energy density (for small strains) as the interatomic distances tend to zero. A corresponding discrete-to-continuum convergence result in which simultaneously the strain becomes infinitesimally small had been obtained by the second author in [17] resulting in a continuum energy functional with the linearized Cauchy–Born energy density.

A drawback of these approaches which rely on global energy minimization is that they require strong growth assumptions on the atomic interactions that are not compatible with classical potentials such as, e.g., the Lennard–Jones potential. Based on the observation that elastic deformations in general are merely local energy minimizers, E and Ming have pioneered a different approach. In [8] they show that, under suitable stability assumptions, solutions of the equations of continuum elasticity on the flat torus with smooth body forces are asymptotically approximated by corresponding atomistic equilibrium configurations. Recently these results have been generalized to a large class of interatomic potentials under remarkably mild regularity assumptions on the body forces for problems on the whole space by Ortner and Theil, cf. [16].

In view of these results, the natural question arises if an analogous analysis is possible for a material occupying a general finite domain in space on the boundary of which there might also be prescribed boundary values. To cite Ericksen [9, p. 207], “Cannot someone do something like this for a more realistic case, say zero surface tractions on part of the boundary and given displacements on the remainder?” Our main result in Theorem 5.1 will give an answer to this question. While we formulate our results only in the case of given displacements on the entire boundary, traction and mixed boundary conditions are to a large part included. If one has a solution to the atomistic equations under given Dirichlet boundary conditions on a set of boundary atoms, one can just as easily declare these boundary atoms as non-physical ghost atoms that generate certain forces in their interaction range. Thus one also has a solution under a (specific) traction boundary condition. Compare, e.g., [14] for a thorough discussion in one dimension. Our main restriction either way is that we can only consider a certain range of atomistic boundary data. But this is unsurprising. Indeed, as we will argue below, in the case of general atomistic boundary data the Cauchy–Born rule is typically expected to fail due to relaxation effects at the boundary.

We believe that our treatment of arbitrary domains and general displacement boundary conditions is of interest not only from a theoretical perspective but also with a view to specific situations that are of interest in applications, whose investigation indispensably requires an effective continuum theory.

In order to relate our set-up to the aforementioned previous contributions, we remark that the presence of displacement boundary conditions leads to some subtleties within the statement of our main Theorem 5.1 and to a number of technical difficulties within its proof:

1. As discussed above, boundary values are naturally imposed on a boundary layer of the atomistic system. In contrast to the situation, e.g., in [5], the adequate choice of the atomistic displacements at the boundary for a general non-affine continuum boundary datum is not determined canonically a priori. In view of our rather mild regularity assumptions, one needs to construct the correct atomistic displacements from the continuum Cauchy–Born solution. Doing so we see that not only the correct asymptotic continuous boundary values but also the correct asymptotic normal derivative given by the normal derivative of the continuous Cauchy–Born solution are attained. (If these conditions fail, we again expect surface relaxation effects and a failure of the Cauchy–Born rule close to the boundary.)

2. In order to allow for as many atomistic boundary conditions (and body forces) as possible, we consider general scalings \(\varepsilon ^\gamma \) in Theorem 5.1 and only restrict \(\gamma \) as much as necessary (\(\frac{d}{2} \le \gamma \le 2\)). While smaller \(\gamma \) will lead to a larger variety of atomistic boundary values, \(\gamma = 2\) will lead to optimal convergence rates. One should also note that our result no longer requires \(\varepsilon \) to be small.

3. Certain technical methods, which are available on the flat torus or on the whole space, do not translate to our setting. E.g., quasi-interpolations as in [15] do not preserve boundary conditions. This leads us to prove the important residual estimates, which lie at the core of our main proof, in a more direct way. With the help of a subtle atomic scale regularization, this can be achieved by requiring only slightly higher regularity assumptions for the continuous equations as compared to [16].

The basic stability assumption, which guarantees the existence of stable configurations solving a boundary value problem in continuum elasticity (for small data), is the classical Legendre–Hadamard condition. While still necessary for stability of the atomistic system, the Legendre–Hadamard condition in general will not be sufficient to rule out relaxation effects due to microstructure on the atomistic scale. We introduce a suitable atomistic stability constant, investigate its basic properties and also relate it to the Legendre–Hadamard constant in the long wave-length limit. Our discussion is motivated by the results in [12]. In particular, we prove that the stability constant is determined in the many particle limit, is thus independent of \(\Omega \) and even is equivalent to, though different from, the constant in [12]. Yet, our representation formula in Corollary 3.7 in the spirit of the Legendre–Hadamard condition seems to more directly connect this quantity to its continuous counterpart. In particular, it allows us to analytically compare atomistic and continuous stability in two-dimensional model cases.

As a result of independent interest, we are able to describe the onset of instability in (generalized versions of) the Friesecke–Theil mass-spring model. In [10] Friesecke and Theil noted that the Cauchy–Born rule might fail due to a period doubling shift relaxation if the mismatch of equilibria of the two types of springs within the model is sufficiently large. Our analysis shows that indeed the lattice is stable on all scales up to precisely the point at which period doubling shift relaxations occur. This on the one hand gives a precise description of the stability region in terms of the mismatch parameter. On the other hand it suggests that stability is lost at the critical value due to period doubling shift relaxed configurations forming.

Additionally, we give an easy sufficient condition for stability in Corollary 3.8 which appears to be more tractable and straightforward to check in applications. By way of example we show in Proposition 3.10 that this condition is satisfied by a large class of atomistic potentials.

Having introduced the continuous and atomistic models and their relation in Sect. 2, we devote Sect. 3 to our thorough examination of stability. In Sect. 4 we briefly discuss the existence of solutions of the boundary value problem in continuous elasticity for small body forces and boundary values in the vicinity of a stable affine deformation under fairly mild regularity assumptions. This is fairly standard and is based on an infinite dimensional implicit function theorem.

With these preparations we state and prove our main result Theorem 5.1 in Sect. 5. As in [8, 16] our goal is to find an atomistic solution in the vicinity of a continuous solution of the associated Cauchy–Born system by observing that the latter is an approximate solution of the discrete system, where now we also have to account for the additional boundary data. To this end, we begin by formulating a quantitative version of the implicit function theorem with a small parameter. While tailored to our application to a singularly perturbed problem, such a result appears to have a wider range of applicability as we discuss at the end of this introduction. From a technical point of view, the main point is then to obtain sufficiently strong residual estimates of the discrete operator acting on the continuous solution. We remark that these estimates cannot be obtained by mere Taylor expansion, but additionally require a subtle regularization on the atomistic scale (cf. Propositions 5.8 and 5.9) and cancellation due to a suitable lattice interaction symmetry. We finally conclude by proving Theorem 5.1.

In view of limitations and possible extensions of our work, a natural question is whether analogous results can also be obtained in the dynamic setting. Based on our analysis of the static case, this problem will be addressed by the first author in the forthcoming paper [2], cf. also [3].

Let us also reconsider the still open problem of general atomistic boundary data against the above detailed background on our approach. While in the bulk it is plausible that the Cauchy–Born rule is still approximately true, the situation is, as mentioned, very different close to the surface. In the outermost layers one expects surface relaxation effects. E.g., in case of a free boundary one expects the gradients to have an error of \(\mathscr {O}(1)\) while oscillating on the scale \(\mathscr {O}(\varepsilon )\). Even though this does not affect the highest order of the energy, it does mean that the Cauchy–Born approximation leaves a residual in the equations that does not vanish as \(\varepsilon \rightarrow 0\), e.g., in any \(L^p\)-norm. This makes it much more difficult to find exact solutions to the equations with asymptotically equal bulk behavior. A precise and rigorous mathematical treatment of surface relaxation effects is currently still out of reach. The best known result so far appears to be [19], which gives the correct asymptotics of the surface energy in the limit of vanishing mismatches in the potentials. But even if one were to establish a full characterization of the surface energy, this would still be just a first step towards describing exact solutions of the equations.

We conclude this introduction with a short remark on the more general applicability of the quantitative implicit function theorem, Theorem 5.3. By way of example, let us just touch on one quite different field in which our formulation might have some interest: the area of computer assisted proofs. There, a typical problem can be to find solutions with certain properties to a nonlinear PDE. One starts by finding an approximate solution numerically without the need for a convergence proof. Then, one establishes rigorous bounds on the necessary quantities and concludes with a quantitative existence result that there is indeed a true solution in the vicinity. Such arguments can also be used to establish multiplicity or even uniqueness. Compare [13] for a short survey on recent progress. Our specific version of the quantitative existence result then allows for the dependence on additional parameters in small balls. At least in principle, a covering argument directly extends this to parameters in a given compact set which is even allowed to be infinite dimensional. In practice, due to computational restrictions, this is typically only realistic for parameters in a given finite dimensional bounded set.

2 The models

2.1 The continuous model

Before we start let us fix some notation. Throughout, d will denote the space dimension. We always use \(|\cdot |\) as the standard Euclidean norm for elements of \(\mathbb {R}^d\) or \(\mathbb {R}^{d \times d}\). We will drop any notation for standard matrix multiplications and for scalar products of vectors in \(\mathbb {R}^d\).

In the continuous model we then consider a bounded, open reference set \(\Omega \subset \mathbb {R}^d\), deformations \(y :\Omega \rightarrow \mathbb {R}^d\), a Borel function \(W_\mathrm{cont} :\mathbb {R}^{d \times d} \rightarrow (-\infty ,\infty ]\) which is bounded from below, a body force \(f \in L^2(\Omega ; \mathbb {R}^d)\), a boundary datum \(g \in H^1(\Omega ; \mathbb {R}^d)\) and the deformation energy

$$\begin{aligned} E(y;f) = \int \limits _\Omega W_\mathrm{cont}(\nabla y (x)) - y(x)f(x) \,dx. \end{aligned}$$

We are interested in finding local minimizers y of this energy (in a suitable topology) having boundary values g. In a sufficiently regular setting, these are (stable) solutions of the corresponding Euler–Lagrange equations

$$\begin{aligned} \left\{ \begin{array}{ll} -{{\mathrm{div}}}(DW_\mathrm{cont}(\nabla y (x))) = f(x) &{}\quad \text {in}\ \Omega , \\ y(x) = g(x) &{}\quad \text {on}\ \partial \Omega . \end{array} \right. \end{aligned}$$

We want the assumptions on \(W_\mathrm{cont}\) to be weak enough to include, e.g., Lennard-Jones-type interactions. Therefore, we should not assume global (quasi-)convexity or growth at infinity and \(W_\mathrm{cont}\) should be allowed to have singularities. Of course, under such weak assumptions we cannot hope to solve the problem for all f and g. Instead, we will look at a stable affine reference deformation \(y_{A_0}(x) = A_0 x\) with gradient \(A_0 \in \mathbb {R}^{d \times d}\) and show that for all f small enough and all g close enough to the reference deformation there is a unique deformation close to \(y_{A_0}\) that solves the problem. Here, stability is yet to be defined.

2.2 The atomistic model

Let us start by introducing some important quantities. We consider the reference lattice \( \varepsilon \mathbb {Z}^d\), where \(\varepsilon > 0\) is the lattice spacing. Up to a set of measure zero, we partition \(\mathbb {R}^d\) into the cubes \(\{z\} + \big (-\frac{\varepsilon }{2},\frac{\varepsilon }{2}\big )^d\) with \(z \in \varepsilon \mathbb {Z}^d\). Given \(x \in \mathbb {R}^d\), not in the neglected set of measure zero, we let \(\hat{x}\in \varepsilon \mathbb {Z}^d\) be the midpoint of the corresponding cube and \(Q_\varepsilon (x)\) the cube itself. Furthermore, for certain symmetry arguments we will later use the point \(\bar{x}\) defined as the reflection of x at \(\hat{x}\) (i.e., \(\bar{x} = 2 \hat{x} - x\)).

Now, atomistic deformations are maps \(y :\Omega \cap \varepsilon \mathbb {Z}^d \rightarrow \mathbb {R}^d\). Again, we will look at the reference configuration \(y_{A_0}(x) = A_0 x\), meaning that the reference positions of the atoms are \(A_0 \Omega \cap \varepsilon A_0 \mathbb {Z}^d\) in the macroscopic domain \(A_0 \Omega \). The deformation energy is supposed to result from local finite range atomic interactions. More precisely, there is a finite set \(\mathscr {R} \subset \mathbb {Z}^d \backslash \{0\}\) accounting for the possible interactions, for which we will always assume that \({{\mathrm{span}}}_{\mathbb {Z}} \mathscr {R}=\mathbb {Z}^d\) and \(\mathscr {R} = - \mathscr {R}\). We then assume that the atoms marked by \(x,\tilde{x} \in \varepsilon \mathbb {Z}^d\) can only interact directly if there is a point \(z\in \varepsilon \mathbb {Z}^d\) with \(x,\tilde{x} \in z+ \varepsilon \mathscr {R}\). Furthermore, we assume our system to be translationally invariant such that the interaction can only depend on the matrix of differences \(D_{\mathscr {R},\varepsilon } y (x) = (\frac{y(x+\varepsilon \rho )-y(x)}{\varepsilon })_{\rho \in \mathscr {R}}\) with \(x \in \varepsilon \mathbb {Z}^d\), where we already use the natural scaling such that \(D_{\mathscr {R},\varepsilon } y_{A_0} (x) = (A_0 \rho )_{\rho \in \mathscr {R}}\) for all \(\varepsilon >0\). Our site potential \(W_\mathrm{atom} :(\mathbb {R}^d)^\mathscr {R} \rightarrow (-\infty ,\infty ]\) is then assumed to be independent of \(\varepsilon \). Compare [1] for a detailed discussion of this scaling assumption.
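The scaling of the discrete gradient can be checked numerically. The following Python sketch is only an illustration and not part of the analysis: the names `discrete_gradient`, `affine` and `max_scaling_error`, the interaction range (all nonzero \(\rho \in \{-1,0,1\}^2\)) and the matrix `A0` are assumptions made for the example. It evaluates \(D_{\mathscr {R},\varepsilon } y_{A_0}(x)\) at a lattice point and compares it with \((A_0 \rho )_{\rho \in \mathscr {R}}\).

```python
from itertools import product


def discrete_gradient(y, x, R, eps):
    """Return D_{R,eps} y(x) as a dict rho -> (y(x + eps*rho) - y(x)) / eps."""
    yx = y(x)
    grad = {}
    for rho in R:
        shifted = tuple(xi + eps * ri for xi, ri in zip(x, rho))
        grad[rho] = tuple((a - b) / eps for a, b in zip(y(shifted), yx))
    return grad


def affine(A):
    """Return the affine map y_A(x) = A x for a matrix A given as nested lists."""
    return lambda x: tuple(sum(a * xj for a, xj in zip(row, x)) for row in A)


# Hypothetical interaction range: all nonzero rho in {-1, 0, 1}^2, so R = -R.
R = [rho for rho in product((-1, 0, 1), repeat=2) if rho != (0, 0)]
A0 = [[1.0, 0.5], [0.0, 2.0]]  # illustrative reference gradient
y_A0 = affine(A0)


def max_scaling_error(eps, site=(3, -2)):
    """Largest deviation of D_{R,eps} y_A0 from (A0 rho) at the point eps*site."""
    x = tuple(eps * s for s in site)
    grad = discrete_gradient(y_A0, x, R, eps)
    return max(
        abs(g - t)
        for rho in R
        for g, t in zip(grad[rho], y_A0(rho))
    )
```

Calling `max_scaling_error(eps)` for several values of `eps` should return values at the level of rounding errors, reflecting that the discrete gradient of the affine reference configuration does not depend on \(\varepsilon \).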

As a mild symmetry assumption on \(W_\mathrm{atom}\), we will assume throughout that

$$\begin{aligned} W_\mathrm{atom} (A) = W_\mathrm{atom} (T(A)) \end{aligned}$$

for all \(A \in (\mathbb {R}^d)^\mathscr {R}\), where

$$\begin{aligned} T(A)_{\rho } = -A_{-\rho }. \end{aligned}$$

This is indeed a quite weak assumption. In a typical situation this just means that we have partitioned the overall energy in such a way that the site potential is invariant under a point reflection at that atom combined with the natural relabeling.

Lemma 2.1

If \(W_\mathrm{atom}\) satisfies the symmetry condition and \(B \in \mathbb {R}^{d \times d}\), then

$$\begin{aligned} D^k W_\mathrm{atom} ((B \rho )_{\rho \in \mathscr {R}})[T(A_1), \cdots , T(A_k)] = D^k W_\mathrm{atom} ((B \rho )_{\rho \in \mathscr {R}})[A_1, \cdots , A_k] \end{aligned}$$

whenever these derivatives exist.

Proof

This follows directly from \(T((B \rho )_{\rho \in \mathscr {R}}) = (B \rho )_{\rho \in \mathscr {R}}\). \(\square \)

Letting \(R_\mathrm{max} = \max \{|\rho |:\rho \in \mathscr {R}\}\) and \(R_0 = \max \{R_\mathrm{max}, \frac{\sqrt{d}}{4}\}\), the discrete gradient \(D_{\mathscr {R},\varepsilon }y\) is well-defined on the discrete ‘semi-interior’

$$\begin{aligned} {{\mathrm{sint}}}_\varepsilon \Omega = \{x \in \Omega \cap \varepsilon \mathbb {Z}^d :{{\mathrm{dist}}}(x,\partial \Omega ) > \varepsilon R_0\}. \end{aligned}$$

The total energy below will be defined by a sum over this set, which is justified by our considering variations only on the discrete interior

$$\begin{aligned} {{\mathrm{int}}}_\varepsilon \Omega = \{x \in \Omega \cap \varepsilon \mathbb {Z}^d :{{\mathrm{dist}}}(x,\partial \Omega ) > 2\varepsilon R_0\}, \end{aligned}$$

which do not affect the gradients outside the semi-interior, and by prescribing boundary values on the layer \(\partial _\varepsilon \Omega = \Omega \cap \varepsilon \mathbb {Z}^d \backslash {{\mathrm{int}}}_\varepsilon \Omega \). Now, given a body force \(f :\varepsilon \mathbb {Z}^d \cap \Omega \rightarrow \mathbb {R}^d\) and a boundary datum \(g :\partial _\varepsilon \Omega \rightarrow \mathbb {R}^d\) we define the set of admissible deformations as

$$\begin{aligned} \mathscr {A}_\varepsilon (\Omega , g) = \{ y :\Omega \cap \varepsilon \mathbb {Z}^d \rightarrow \mathbb {R}^d :y(x)=g(x) \text { for all } x \in \partial _\varepsilon \Omega \} \end{aligned}$$

and the elastic energy of an atomistic deformation y by

$$\begin{aligned} E_\varepsilon (y;f;g) = \left\{ \begin{array}{ll} \varepsilon ^d \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y (x)) - \varepsilon ^d \sum \limits _{x \in \varepsilon \mathbb {Z}^d \cap \Omega } y(x)f(x) &{}\quad \text {if}\ y \in \mathscr {A}_\varepsilon (\Omega , g), \\ \infty &{}\quad \text {else.} \end{array} \right. \end{aligned}$$

We remark here that the definition of \(R_0\) implies that

$$\begin{aligned} \Omega _\varepsilon = \bigcup _{z \in {{\mathrm{int}}}_\varepsilon \Omega } Q_\varepsilon (z) \subset \Omega \end{aligned}$$

which will simplify things later on.
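To make the roles of the semi-interior, the discrete interior and the boundary layer concrete, here is a small Python sketch. It assumes the unit square \(\Omega = (0,1)^2\), a lattice spacing \(\varepsilon = 1/n\) and nearest-neighbour interactions; these choices and the function names are hypothetical and serve only as an example. Points are stored in lattice units \(z = x/\varepsilon \).

```python
import math
from itertools import product


def lattice_sets(n, R, d=2):
    """Discrete sets for Omega = (0,1)^d and eps = 1/n, in lattice units z = x/eps."""
    R_max = max(math.sqrt(sum(r * r for r in rho)) for rho in R)
    R0 = max(R_max, math.sqrt(d) / 4)
    points = set(product(range(1, n), repeat=d))  # lattice points inside Omega

    def dist(z):
        # Distance to the boundary of the cube, divided by eps.
        return min(min(zi, n - zi) for zi in z)

    sint = {z for z in points if dist(z) > R0}          # semi-interior
    interior = {z for z in points if dist(z) > 2 * R0}  # discrete interior
    boundary = points - interior                        # boundary layer
    return points, sint, interior, boundary


def gradient_is_defined(z, R, points):
    """Check that all neighbours z + rho needed for the discrete gradient lie in Omega."""
    return all(tuple(a + b for a, b in zip(z, rho)) in points for rho in R)


# Assumed interaction range: nearest neighbours in Z^2.
R = [(1, 0), (-1, 0), (0, 1), (0, -1)]
points, sint, interior, boundary = lattice_sets(10, R)
```

In this example \(R_0 = 1\), and one can verify that `gradient_is_defined` holds on all of `sint` and that every interior point keeps a distance of more than half a lattice spacing from \(\partial \Omega \), so that the cubes \(Q_\varepsilon (z)\) with \(z \in {{\mathrm{int}}}_\varepsilon \Omega \) stay inside \(\Omega \).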

As in the continuous case, our goal is to find local minimizers of the energy which, under suitable assumptions on \(W_\mathrm{atom}\), are (stable) solutions of the corresponding Euler–Lagrange equation

$$\begin{aligned} -{{\mathrm{div}}}_{\mathscr {R},\varepsilon }\big ( DW_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y(x))\big ) = f(x), \end{aligned}$$

with \(x \in {{\mathrm{int}}}_\varepsilon \Omega \), where \(DW_\mathrm{atom}(M) = \big ( \frac{\partial W_\mathrm{atom}(M)}{\partial M_{i\rho }} \big )_{\begin{array}{c} 1 \le i \le d \\ \rho \in \mathscr {R} \end{array}}\) for \(M = (M_{i\rho })_{\begin{array}{c} 1 \le i \le d \\ \rho \in \mathscr {R} \end{array}} \in \mathbb {R}^{d \times \mathscr {R}}\) and we write

$$\begin{aligned} {{\mathrm{div}}}_{\mathscr {R},\varepsilon } M(x) = \sum \limits _{\rho \in \mathscr {R}} \frac{M_\rho (x) - M_\rho (x-\varepsilon \rho )}{\varepsilon } \end{aligned}$$

for any \(M :\Omega \cap \varepsilon \mathbb {Z}^d \rightarrow \mathbb {R}^{d \times \mathscr {R}} \cong (\mathbb {R}^d)^\mathscr {R}\). Of course, there is no reason to hope for existence (or uniqueness) in general. We will also restrict ourselves to ‘elastic’ solutions that are (macroscopically) sufficiently close to some affine lattice. To find such solutions we will look close to continuous solutions.

2.3 The Cauchy–Born rule

As described in detail in the introduction, it is a fundamental problem to identify the correct \(W_\mathrm{cont}\) that should be taken for the continuous equation so that one can hope for atomistic solutions close by as \(\varepsilon \) becomes small enough. The classical ansatz to resolve this question by applying the Cauchy–Born rule leads to setting \(W_\mathrm{cont} = W_\mathrm{CB}\), where in our setting the Cauchy–Born energy density has the simple mathematical expression

$$\begin{aligned} W_\mathrm{CB}(A) := W_\mathrm{atom} ((A\rho )_{\rho \in \mathscr {R}}). \end{aligned}$$

In the following we will only consider \(W_\mathrm{cont} = W_\mathrm{CB}\), where \(W_\mathrm{atom}\) is given. Our main goal is to justify this choice rigorously.
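As a simple illustration of this definition, the following Python sketch evaluates \(W_\mathrm{CB}\) for a hypothetical site potential built from pair interactions, \(W_\mathrm{atom}(A) = \frac{1}{2} \sum _{\rho \in \mathscr {R}} \varphi (|A_\rho |)\), with a Lennard-Jones-type \(\varphi \) normalized to have its minimum at distance 1. The pair-potential form, the nearest-neighbour range and all function names are assumptions chosen for the example; the text itself allows much more general \(W_\mathrm{atom}\).

```python
import math

# Assumed interaction range: nearest neighbours in Z^2, with R = -R.
R = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def lennard_jones(r):
    """Illustrative pair potential phi(r) = r^-12 - 2 r^-6 with minimum phi(1) = -1."""
    return r ** -12 - 2.0 * r ** -6


def W_atom(A, phi=lennard_jones):
    """Hypothetical site potential: half the sum of phi(|A_rho|) over rho in R."""
    return 0.5 * sum(phi(math.hypot(*A[rho])) for rho in R)


def W_CB(F):
    """Cauchy-Born energy density W_CB(F) = W_atom((F rho)_{rho in R}) for a 2x2 matrix F."""
    config = {
        rho: tuple(sum(f * r for f, r in zip(row, rho)) for row in F)
        for rho in R
    }
    return W_atom(config)
```

For this particular choice, `W_CB([[1.0, 0.0], [0.0, 1.0]])` evaluates to \(-2\) up to rounding, since each of the four bonds contributes \(\frac{1}{2}\varphi (1) = -\frac{1}{2}\), and a uniform dilation \(t\,{{\mathrm{Id}}}\) gives \(2\varphi (t)\). Because the example depends only on bond lengths, it is also invariant under rotations of the deformed configuration.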

3 Stability

A crucial ingredient for our main theorem, but also for further applications, is the concept of atomistic stability. Here we define the continuous and atomistic stability constants, discuss their properties, and give simple characterizations.

3.1 Stability constants

For a bilinear form \(L\in \mathbb {R}^{d \times d \times d \times d} \cong {{\mathrm{Bil}}}(\mathbb {R}^{d \times d}) \cong L(\mathbb {R}^{d \times d},\mathbb {R}^{d \times d})\) and matrices \(A, B \in \mathbb {R}^{d \times d}\) we will write

$$\begin{aligned} L[A,B]&= \sum \limits _{j,k,l,m=1}^d L_{jklm} A_{jk} B_{lm},\\ (L[A])_{jk}&= \sum \limits _{l,m=1}^d L_{jklm} A_{lm}, \end{aligned}$$

and

$$\begin{aligned} |L |= \sup \{ L[A,B] :|A |= |B |= 1 \}. \end{aligned}$$

Later we will use a similar notation for higher order tensors. We will also use notation like \(L[A]^2 = L[A,A]\) to shorten long expressions.

In our problem L is the tensor in the equation

$$\begin{aligned} -{{\mathrm{div}}}(L[\nabla u]) = f, \end{aligned}$$

which is the linearization of the continuous equation at the affine deformation \(y_{A_0}\) if we set \(L=D^2W_\mathrm{CB}(A_0)\). The condition that ensures existence, uniqueness and regularity, and at the same time guarantees that solutions are strict local minimizers of the nonlinear energy (and are stable in that sense), is the Legendre–Hadamard condition

$$\begin{aligned} \lambda _\mathrm{LH}(L) = \inf \limits _{\xi , \eta \in \mathbb {R}^d\backslash \{0\}} \frac{L[\xi \otimes \eta ,\xi \otimes \eta ]}{|\xi |^2 |\eta |^2} >0. \end{aligned}$$

It is a well known fact, proven by Fourier transformation and a cutoff argument, that

$$\begin{aligned} \lambda _\mathrm{LH}(L) = \inf \limits _{u \in H^1_0(U;\mathbb {R}^d) \backslash \{0\}} \frac{\int _U L[\nabla u (x), \nabla u(x)] \,dx}{\int _U |\nabla u (x) |^2 \,dx} \end{aligned}$$

for any open, nonempty \(U \subset \mathbb {R}^d\). This is the same as saying that quasiconvexity and rank-one-convexity are equivalent for quadratic densities with constant coefficients, in this case for \(L-\lambda _\mathrm{LH} {{\mathrm{Id}}}\). The result is standard and can be found in the literature, e.g. [6, Theorem 5.25].

We also introduce a modified version that is equivalent but more adapted to the atomistic norms we will use in the following:

$$\begin{aligned} \tilde{\lambda }_\mathrm{LH}(L) = \inf \limits _{\xi , \eta \in \mathbb {R}^d\backslash \{0\}} \frac{L[\xi \otimes \eta ,\xi \otimes \eta ]}{|\xi |^2 \sum \nolimits _{\rho \in \mathscr {R}} (\rho \eta )^2} >0. \end{aligned}$$

Since \({{\mathrm{span}}}_\mathbb {Z}\mathscr {R} = \mathbb {Z}^d\), we have \({{\mathrm{span}}}_\mathbb {R}\mathscr {R} = \mathbb {R}^d\) and there are \(C_1,C_2>0\) such that

$$\begin{aligned} C_1 |\eta |^2 \le \sum \limits _{\rho \in \mathscr {R}} (\rho \eta )^2 \le C_2 |\eta |^2.\end{aligned}$$

Hence,

$$\begin{aligned} C_1 \tilde{\lambda }_\mathrm{LH}(L) \le \lambda _\mathrm{LH}(L) \le C_2 \tilde{\lambda }_\mathrm{LH}(L) \end{aligned}$$

and, in particular, \(\tilde{\lambda }_\mathrm{LH}(L) > 0\) if and only if \(\lambda _\mathrm{LH}(L)>0\).
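Both constants can be approximated numerically in two dimensions by sampling unit vectors \(\xi , \eta \). The sketch below is a hedged illustration: the tensor is an assumed example with \(L[A,A] = |A|^2 + 2t \det A\), the interaction range consists of nearest neighbours, and the function names are hypothetical. A grid search of this kind only yields an upper bound for the infimum in general; in this example the quotient is constant on rank-one matrices, so the bound is sharp.

```python
import math


def make_tensor(t):
    """Tensor with L[A,A] = |A|^2 + 2 t det(A) in d = 2, stored as L[j][k][l][m]."""
    L = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for j in range(2):
        for k in range(2):
            L[j][k][j][k] += 1.0
    # 2 t det(A) = t (A00 A11 + A11 A00 - A01 A10 - A10 A01)
    L[0][0][1][1] += t
    L[1][1][0][0] += t
    L[0][1][1][0] -= t
    L[1][0][0][1] -= t
    return L


def quad(L, A):
    """Evaluate L[A, A] = sum over j,k,l,m of L_jklm A_jk A_lm."""
    return sum(
        L[j][k][l][m] * A[j][k] * A[l][m]
        for j in range(2) for k in range(2) for l in range(2) for m in range(2)
    )


def lh_constants(L, R, n=90):
    """Grid-search estimates of lambda_LH and its modified version over unit vectors."""
    lam = lam_mod = math.inf
    for a in range(n):
        xi = (math.cos(math.pi * a / n), math.sin(math.pi * a / n))
        for b in range(n):
            eta = (math.cos(math.pi * b / n), math.sin(math.pi * b / n))
            A = [[xi[j] * eta[k] for k in range(2)] for j in range(2)]
            q = quad(L, A)  # |xi| = |eta| = 1, so no further normalization
            w = sum((rho[0] * eta[0] + rho[1] * eta[1]) ** 2 for rho in R)
            lam = min(lam, q)
            lam_mod = min(lam_mod, q / w)
    return lam, lam_mod
```

With `R = [(1, 0), (-1, 0), (0, 1), (0, -1)]` one has \(\sum _{\rho } (\rho \eta )^2 = 2|\eta |^2\), so \(C_1 = C_2 = 2\) and the estimates for `make_tensor(3.0)` are approximately \(1\) and \(\frac{1}{2}\). Since the determinant vanishes on rank-one matrices, the Legendre–Hadamard condition holds here although \(L[A,A] < 0\) for \(A = \mathrm{diag}(1,-1)\).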

In the atomistic setting we have tensors on higher dimensional spaces of the type

$$\begin{aligned} K\in \mathbb {R}^{ \{1, \cdots , d\}\times \mathscr {R} \times \{1, \cdots , d\}\times \mathscr {R}} \cong {{\mathrm{Bil}}}(\mathbb {R}^{\{1, \cdots , d\}\times \mathscr {R}}) \cong L(\mathbb {R}^{\{1, \cdots , d\}\times \mathscr {R}},\mathbb {R}^{\{1, \cdots , d\}\times \mathscr {R}}). \end{aligned}$$

Note that with each such K we can associate a tensor of the form \(L\in \mathbb {R}^{d \times d \times d \times d}\) by

$$\begin{aligned} L[A,B] = K[(A\rho )_{\rho \in \mathscr {R}},(B\rho )_{\rho \in \mathscr {R}}]. \end{aligned}$$

In our equations we will consider \(K=D^2W_\mathrm{atom}((A_0\rho )_{\rho \in \mathscr {R}})\), which then corresponds to \(L=D^2W_\mathrm{CB}(A_0)\). It turns out that we need a stronger condition for existence and the local minimizing property in the atomistic case. We define

$$\begin{aligned} \lambda _\varepsilon (K,\Omega ) = \inf \limits _{\begin{array}{c} y \in \mathscr {A}_\varepsilon (\Omega ,0)\\ y \ne 0 \end{array}} \frac{ \varepsilon ^d \sum \nolimits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } K[D_{\mathscr {R},\varepsilon } y (x),D_{\mathscr {R},\varepsilon } y (x)]}{\varepsilon ^d \sum \nolimits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } |D_{\mathscr {R},\varepsilon } y (x)|^2}. \end{aligned}$$

Now by atomistic stability we mean that

$$\begin{aligned} \lambda _\mathrm{atom}(K,\Omega ) = \inf _{\varepsilon>0} \lambda _\varepsilon (K,\Omega ) > 0. \end{aligned}$$

We will first show that \(\lambda _\mathrm{atom}\) is in fact independent of \(\Omega \) and is equivalently given by the minimization of periodic problems. This can be done in the spirit of a thermodynamical limit argument. Let us consider

$$\begin{aligned} \mathscr {B}_{0,N} = \{ y :\{0, \cdots , N\}^d \rightarrow \mathbb {R}^d :y(z)=0, \text { if } {{\mathrm{dist}}}(z, \partial (0,N)^d) \le 2 R_0 \} \end{aligned}$$

and

$$\begin{aligned} \mathscr {B}_{\text {per},N} = \{ y :\{0, \cdots , N\}^d \rightarrow \mathbb {R}^d :y \text { is } [0,N)^d\text {-periodic} \}. \end{aligned}$$

Whenever necessary we will consider these functions to be \([0,N)^d\)-periodically extended to \(\mathbb {Z}^d\).

Let us define

$$\begin{aligned} \mu _{0,N} = \inf \limits _{\begin{array}{c} y \in \mathscr {B}_{0,N}\\ y \ne 0 \end{array}} \frac{ \sum \nolimits _{x \in \{0, \cdots , N\}^d} K[D_{\mathscr {R},1} y (x),D_{\mathscr {R},1} y (x)]}{ \sum \nolimits _{x \in \{0, \cdots , N\}^d} |D_{\mathscr {R},1} y (x)|^2} \end{aligned}$$

and

$$\begin{aligned} \mu _{\text {per},N} = \inf \limits _{\begin{array}{c} y \in \mathscr {B}_{\text {per},N}\\ y \text { not constant} \end{array}} \frac{ \sum \nolimits _{x \in \{0, \cdots , N-1\}^d} K[D_{\mathscr {R},1} y (x),D_{\mathscr {R},1} y (x)]}{ \sum \nolimits _{x \in \{0, \cdots , N-1\}^d} |D_{\mathscr {R},1} y (x)|^2}. \end{aligned}$$

In this definition we used that \(D_{\mathscr {R},1} y (x)= 0\) for all \(x \in \mathbb {Z}^d\) implies that y is constant since \({{\mathrm{span}}}_\mathbb {Z}\mathscr {R} = \mathbb {Z}^d\). Obviously, \(\mu _{0,N}\) is nonincreasing and \(- |K |\le \mu _{0,N} \le |K |\) for N sufficiently large. Hence, \(\mu _{0,N} \rightarrow \mu _0 \in [- |K |, |K |]\), where \(\mu _0 = \inf _N \mu _{0,N}\).

Proposition 3.1

We have \(\mu _{\text {per},N} \rightarrow \mu _0\) as \(N \rightarrow \infty \) and \(\mu _0 = \inf _N \mu _{\text {per},N}\).

Proof

It is clear that \(\mu _{\text {per},N} \le \mu _{0,N}\) for all N. The opposite inequality cannot be expected to hold. Instead, we will look at M periodicity cells at once in a larger cell of size MN. If M is sufficiently large, we can expect boundary effects to be small, so that a simple cutoff at the boundary gives a good result.

More precisely, let \(\delta >0\) and \(N \in \mathbb {N}\) with \(N \ge 6 R_0\). Take a nonconstant \(y \in \mathscr {B}_{\text {per},N}\) such that

$$\begin{aligned} \sum \limits _{x \in \{0, \cdots , N-1\}^d} K[D_{\mathscr {R},1} y (x),D_{\mathscr {R},1} y (x)] \le (\mu _{\text {per},N} + \delta ) \sum \limits _{x \in \{0, \cdots , N-1\}^d} |D_{\mathscr {R},1} y (x)|^2. \end{aligned}$$

Now we consider \(\tilde{y} = \varphi y \in \mathscr {B}_{0,MN}\), where \(M \in \mathbb {N}\), \(M\ge 3\). Here, the cutoff \(\varphi \in C^\infty (\mathbb {R}^d;[0,1])\) should satisfy \(\varphi (x) = 1\) whenever \( {{\mathrm{dist}}}(x, \big ((0,MN)^d\big )^c) \ge 4 R_0\), \(\varphi (x) = 0\) whenever \( {{\mathrm{dist}}}(x, \big ((0,MN)^d\big )^c) \le 2 R_0\), and \(|\nabla \varphi (x) |\le \frac{1}{R_0}\) for all x. A short calculation gives

$$\begin{aligned} |D_{\mathscr {R},1} \tilde{y}(x) |\le C(d,|\mathscr {R}|,R_0) ||y ||_\infty \end{aligned}$$

for all x. Using this we can estimate

$$\begin{aligned} \sum \limits _{x \in \{0, \cdots , MN-1\}^d} K[D_{\mathscr {R},1} \tilde{y} (x),D_{\mathscr {R},1} \tilde{y} (x)]\le & {} (M-2)^d(\mu _{\text {per},N} + \delta ) \sum \limits _{x \in \{0, \cdots , N-1\}^d} |D_{\mathscr {R},1} y (x)|^2\\&+\, C(d, |\mathscr {R}|,R_0) N^d M^{d-1} |K |||y ||_\infty ^2 \\\le & {} (\mu _{\text {per},N} + \delta ) \sum \limits _{x \in \{0, \cdots , MN-1\}^d} |D_{\mathscr {R},1} \tilde{y} (x)|^2\\&+\, C(d, |\mathscr {R}|,R_0) N^d M^{d-1} ( |K |+ |\mu _{\text {per},N} |+ \delta ) ||y ||_\infty ^2. \end{aligned}$$

But for M large enough we have

$$\begin{aligned} C(d, |\mathscr {R}|,R_0) N^d M^{d-1} ( |K |+ |\mu _{\text {per},N} |+ \delta ) ||y ||_\infty ^2 \le \delta \sum \limits _{x \in \{0, \cdots , MN-1\}^d} |D_{\mathscr {R},1} \tilde{y} (x)|^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \mu _0 \le \mu _{0,MN} \le \mu _{\text {per},N} + 2\delta . \end{aligned}$$

The restriction \(N \ge 6 R_0\) is not problematic when we take the infimum since

$$\begin{aligned} \mu _{\text {per},jN} \le \mu _{\text {per},N} \end{aligned}$$

for all \(j \in \mathbb {N}\). \(\square \)

Proposition 3.2

For all open, bounded, nonempty \(\Omega \subset \mathbb {R}^d\) we have

$$\begin{aligned} \lambda _\mathrm{atom}(K,\Omega )= \lim \limits _{\varepsilon \rightarrow 0^+} \lambda _\varepsilon (K, \Omega ) = \mu _0. \end{aligned}$$

Proof

Take \(z_1, z_2 \in \mathbb {R}^d\) and \(0<a_1<a_2\) such that

$$\begin{aligned} \{z_1\} + [0,a_1]^d \subset \Omega \subset \{z_2\} + ( 0,a_2)^d. \end{aligned}$$

Now, define \((z_{1,\varepsilon })_i = \lceil \frac{(z_1)_i}{\varepsilon }\rceil \varepsilon \), \((z_{2,\varepsilon })_i = \lfloor \frac{(z_2)_i}{\varepsilon }\rfloor \varepsilon \), \(N_{1,\varepsilon } = \lfloor \frac{a_1}{\varepsilon }\rfloor - 1\) and \(N_{2,\varepsilon } = \lceil \frac{a_2}{\varepsilon }\rceil + 1\). Then,

$$\begin{aligned} z_{1,\varepsilon }+ \varepsilon \{0, \cdots , N_{1,\varepsilon }\}^d \subset \Omega \cap \varepsilon \mathbb {Z}^d \subset z_{2,\varepsilon }+ \varepsilon \{0, \cdots , N_{2,\varepsilon }\}^d. \end{aligned}$$

Given \(y \in \mathscr {B}_{0,N_{1,\varepsilon }}\) we can set

$$\begin{aligned} \tilde{y}(z_{1,\varepsilon } + \varepsilon v) = \varepsilon y(v) \end{aligned}$$

for \(v \in \{0, \cdots , N_{1,\varepsilon }\}^d\) and \(\tilde{y}(x)=0\) otherwise. Then \(\tilde{y} \in \mathscr {A}_\varepsilon (\Omega ,0)\),

$$\begin{aligned} \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } K[D_{\mathscr {R},\varepsilon } \tilde{y} (x),D_{\mathscr {R},\varepsilon } \tilde{y} (x)] = \sum \limits _{x \in \{0, \cdots , N_{1,\varepsilon }-1\}^d} K[D_{\mathscr {R},1} y (x),D_{\mathscr {R},1} y (x)] \end{aligned}$$

and

$$\begin{aligned} \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } |D_{\mathscr {R},\varepsilon } \tilde{y} (x) |^2 = \sum \limits _{x \in \{0, \cdots , N_{1,\varepsilon }-1\}^d} |D_{\mathscr {R},1} y (x)|^2. \end{aligned}$$

Hence,

$$\begin{aligned} \mu _{0, \lfloor \frac{a_1}{\varepsilon }\rfloor - 1} \ge \lambda _\varepsilon (K, \Omega ). \end{aligned}$$

Since we also have the embedding

$$\begin{aligned} \Omega \cap \varepsilon \mathbb {Z}^d \subset z_{2,\varepsilon }+ \varepsilon \{0, \cdots , N_{2,\varepsilon }\}^d, \end{aligned}$$

a similar argument shows

$$\begin{aligned} \mu _{0, \lceil \frac{a_2}{\varepsilon }\rceil + 1} \le \lambda _\varepsilon (K, \Omega ). \end{aligned}$$

This holds for all \(\varepsilon >0\), if we set \(\mu _{0, - 1}= \infty \). Therefore,

$$\begin{aligned} \lim \limits _{\varepsilon \rightarrow 0} \lambda _\varepsilon (K, \Omega ) = \inf \limits _{\varepsilon > 0} \lambda _\varepsilon (K, \Omega ) = \mu _0. \end{aligned}$$

\(\square \)

Because of Proposition 3.2, the stability constant is independent of the set \(\Omega \) and we can just write \(\lambda _\mathrm{atom}(K)\). We will also abuse the notation by writing \(\lambda _\mathrm{LH}(K)\) for the stability constant of the corresponding \(\mathbb {R}^{d \times d \times d \times d}\) tensor. For \(K=D^2W_\mathrm{atom}((A_0\rho )_{\rho \in \mathscr {R}})\) and \(L = D^2W_\mathrm{CB}(A_0)\) with \(A_0 \in \mathbb {R}^{d \times d}\) we also write \(\lambda _\mathrm{atom}(A_0)\) and \(\lambda _\mathrm{LH}(A_0)\) and suppress the dependency on \(W_\mathrm{atom}\). In the same way we will write \(\lambda _\mathrm{atom}(A_0)\) instead of \(\lambda _\mathrm{atom}(D^2W_\mathrm{atom}(A_0))\) if \(A_0 \in \mathbb {R}^{d \times \mathscr {R}}\). While we are mostly interested in the case \(K=D^2W_\mathrm{atom}(A_0)\), our more general analysis is quite useful to allow, for example, small perturbations of these tensors that are no longer of the same form.

For the dependency on K of \(\lambda _\mathrm{atom}\) we record the following elementary observation.

Proposition 3.3

Given tensors \(K,\tilde{K}\), we have

$$\begin{aligned} |\lambda _\mathrm{atom}(K)- \lambda _\mathrm{atom}(\tilde{K}) |\le |K - \tilde{K} |. \end{aligned}$$

In particular, if \(W_\mathrm{atom} \in C^2(V)\), V open, then

$$\begin{aligned} \{A \in V :\lambda _\mathrm{atom}(A) >0\} \end{aligned}$$

is open as well.

Proof

This is straightforward. Just use

$$\begin{aligned}&\bigg |\sum \limits _{x \in \{0, \cdots , N\}^d} K[D_{\mathscr {R},1} y (x),D_{\mathscr {R},1} y (x)] - \sum \limits _{x \in \{0, \cdots , N\}^d} \tilde{K}[D_{\mathscr {R},1} y (x),D_{\mathscr {R},1} y (x)] \bigg |\\&\quad \le |K - \tilde{K} |\sum \limits _{x \in \{0, \cdots , N\}^d} |D_{\mathscr {R},1} y (x) |^2. \end{aligned}$$

For the additional claim in the case \(K=D^2W_\mathrm{atom}(A)\), we just note that K depends continuously on A on the set V by assumption. \(\square \)

3.2 Representation formulae

Combining Propositions 3.1 and 3.2 we are now basically in the setting of the stability discussion in [12]. We include the most important points here to stay self-contained and, more importantly, to provide a new, more intuitive characterization of the stability constant and sufficient criteria for stability that allow for a direct application in interesting situations.

Note that we use an \(h^1\)-norm based on difference quotients, while the authors in [12] use a Fourier norm. Therefore, the stability constants will be different, but of course everything remains equivalent. While the Fourier norm makes the connection to the continuum case slightly more direct, our approach has the advantage that it is considerably easier to check whether the atomistic stability condition holds true in specific situations, making it possible to rigorously discuss relatively simple examples, as will be detailed in Sect. 3.3.

We will write

$$\begin{aligned} Q_N = \{0, 1, \cdots , N-1\}^d \end{aligned}$$

and

$$\begin{aligned} \hat{Q}_N = \Big \{0, \frac{2 \pi }{N}, \cdots , \frac{2 \pi (N-1)}{N}\Big \}^d \end{aligned}$$

for the dual group. Given \(y :Q_N \rightarrow \mathbb {C}\), the Fourier transformation of y is defined by \(\hat{y} :\hat{Q}_N \rightarrow \mathbb {C}\) with

$$\begin{aligned} \hat{y}(k) = \frac{1}{N^d} \sum \limits _{x \in Q_N} y(x)e^{-ixk}. \end{aligned}$$

We have

$$\begin{aligned} \sum \limits _{x \in Q_N} e^{ix(k-k')} = N^d \delta _{k,k'} \end{aligned}$$

for all \(k,k' \in \hat{Q}_N\) and

$$\begin{aligned} \sum \limits _{k \in \hat{Q}_N} e^{i(x-x')k} = N^d \delta _{x,x'} \end{aligned}$$

for all \(x,x' \in Q_N\). Therefore,

$$\begin{aligned} y(x) = \sum \limits _{k \in \hat{Q}_N} \hat{y}(k)e^{ixk} \end{aligned}$$

for all \(x\in Q_N\).
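The normalization above can be checked numerically. The following is a minimal sketch, assuming \(d=2\), a small \(N\) and a random real lattice function (all three are illustrative choices, not taken from the text); it relies on the fact that NumPy's `fftn` uses the kernel \(e^{-2\pi i x n/N}\), so that only the prefactor \(N^{-d}\) has to be added.

```python
import numpy as np

N, d = 6, 2                                  # illustrative sizes (assumption)
rng = np.random.default_rng(0)
y = rng.standard_normal((N,) * d)            # a real lattice function on Q_N

# hat{y}(k) = N^{-d} * sum_x y(x) exp(-i x.k) with k = 2*pi*n/N, n in Q_N.
# numpy's fftn uses the kernel exp(-2*pi*i*x.n/N), so only the prefactor
# N^{-d} is missing.
y_hat = np.fft.fftn(y) / N**d

# Inversion formula y(x) = sum_k hat{y}(k) exp(i x.k), evaluated directly.
grid = np.arange(N)
k = 2 * np.pi * grid / N
y_rec = np.zeros((N, N), dtype=complex)
for n1 in range(N):
    for n2 in range(N):
        # phase[x1, x2] = exp(i (x1 * k_{n1} + x2 * k_{n2}))
        phase = np.exp(1j * np.add.outer(grid * k[n1], grid * k[n2]))
        y_rec += y_hat[n1, n2] * phase

inversion_error = np.max(np.abs(y_rec - y))
```

Under these assumptions `inversion_error` should be of the order of machine precision, which reflects the two orthogonality relations stated above.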

In the following we will often assume that \(K_{j \rho l \sigma } = K_{l \sigma j \rho }\) which is automatically satisfied if K is the second derivative of a potential. Furthermore, we will sometimes assume \(K_{j \rho l \sigma } = K_{j (-\rho ) l (-\sigma )}\), which is satisfied in our models because of the symmetry condition and Lemma 2.1.

In Fourier space the problem is in diagonal form.

Proposition 3.4

Assume that \(K_{j \rho l \sigma } = K_{l \sigma j \rho }\) for all \(j,l,\rho ,\sigma \). Now, given \(y :Q_N \rightarrow \mathbb {R}^d\) periodically extended to \(\mathbb {Z}^d\), we have

$$\begin{aligned} \sum \limits _{x\in Q_N} K[D_{\mathscr {R},1}y(x), D_{\mathscr {R},1}y(x)] = N^d \sum \limits _{k \in \hat{Q}_N}\hat{y}(k)^T H(k) \overline{\hat{y}(k)}, \end{aligned}$$

where

$$\begin{aligned} H(k)_{jl}= \sum \limits _{\rho ,\sigma \in \mathscr {R}} K_{j \rho l \sigma } \big ( \cos (\rho k) -1 +i \sin (\rho k) \big )\big ( \cos (\sigma k) -1 -i \sin (\sigma k) \big ). \end{aligned}$$

In particular, H(k) is hermitian for all k, H is \([0,2\pi )^d\)-periodic and \(\overline{H(k)}=H(-k)\) for all k.

Furthermore, if K additionally satisfies \(K_{j \rho l \sigma } = K_{j (-\rho ) l (-\sigma )}\), then

$$\begin{aligned} H(k)_{jl}= \sum \limits _{\rho ,\sigma \in \mathscr {R}} K_{j \rho l \sigma } \big ( (\cos (\rho k) -1)(\cos (\sigma k) -1) + \sin (\rho k)\sin (\sigma k) \big ). \end{aligned}$$

In particular, \(H(k) \in \mathbb {R}^{d \times d}_\mathrm{sym}\) for all k.

Proof

$$\begin{aligned}&\sum \limits _{x\in Q_N} K[D_{\mathscr {R},1}y(x), D_{\mathscr {R},1}y(x)]\\&= \sum \limits _{x,j,l,\rho ,\sigma ,k,k'} K_{j \rho l \sigma } (e^{i \rho k} - 1)(e^{-i \sigma k'} - 1) e^{ix(k-k')} \hat{y}_j(k) \overline{\hat{y}_l(k')} \\&= N^d \sum \limits _{j,l,\rho ,\sigma ,k} K_{j \rho l \sigma } \big ( \cos (\rho k) -1 +i \sin (\rho k) \big )\big ( \cos (\sigma k) -1 -i \sin (\sigma k) \big ) \hat{y}_j(k) \overline{\hat{y}_l(k)}. \end{aligned}$$

Everything else follows easily since

$$\begin{aligned} ( \cos (\rho k) -1&+i \sin (\rho k) )( \cos (\sigma k) -1 -i \sin (\sigma k) )\\&= ( \cos (\rho k) -1)( \cos (\sigma k) -1) + \sin (\rho k)\sin (\sigma k)\\&\quad + i (\cos (\sigma k) -1) \sin (\rho k)- i (\cos (\rho k) -1) \sin (\sigma k). \end{aligned}$$

\(\square \)
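The identity of Proposition 3.4 can be tested on a small periodic cell. The sketch below assumes that the discrete gradient is the forward difference \((D_{\mathscr {R},1}y(x))_\rho = y(x+\rho )-y(x)\), which is consistent with the factors \(e^{i\rho k}-1\) in the proof; the interaction range `R`, the random tensor `K` and the cell size are hypothetical sample data.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 5, 2
# Sample interaction range (assumption, chosen only for illustration).
R = np.array([[1, 0], [-1, 0], [0, 1], [0, -1], [1, 1], [-1, -1]])
nR = len(R)

# Random tensor K[j, rho, l, sigma] with K_{j rho l sigma} = K_{l sigma j rho}.
K = rng.standard_normal((d, nR, d, nR))
K = K + K.transpose(2, 3, 0, 1)

y = rng.standard_normal((N, N, d))           # real, periodically extended

# Assumed discrete gradient: (D y(x))_{j, rho} = y_j(x + rho) - y_j(x).
Dy = np.stack(
    [np.roll(y, shift=(-r[0], -r[1]), axis=(0, 1)) - y for r in R], axis=-1
)                                            # shape (N, N, d, nR)
lhs = np.einsum("xyjr,jrls,xyls->", Dy, K, Dy)

y_hat = np.fft.fftn(y, axes=(0, 1)) / N**d   # shape (N, N, d)
rhs = 0.0
for n1 in range(N):
    for n2 in range(N):
        k = 2 * np.pi * np.array([n1, n2]) / N
        e = np.exp(1j * (R @ k)) - 1.0       # cos(rho k) - 1 + i sin(rho k)
        H = np.einsum("jrls,r,s->jl", K, e, np.conj(e))
        v = y_hat[n1, n2]
        rhs += N**d * (v @ H @ np.conj(v))   # hat{y}^T H conj(hat{y})

identity_error = abs(lhs - rhs)
```

With these sample data `identity_error` should vanish up to rounding, and the imaginary part of `rhs` should be negligible because each \(H(k)\) is hermitian.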

Proposition 3.5

Assume that \(K_{j \rho l \sigma } = K_{l \sigma j \rho }\) for all \(j,l,\rho ,\sigma \). Then,

$$\begin{aligned} \mu _{\text {per},N}= \min \bigg \{\frac{h(k)}{\sum _{\rho \in \mathscr {R}} (1-\cos (\rho k))^2+ \sin ^2(\rho k)} :k \in \hat{Q}_N \backslash \{0\} \bigg \} , \end{aligned}$$

where h(k) is the smallest eigenvalue of H(k).

Proof

First of all note that, for \(k \in \hat{Q}_N\), \(\sum _{\rho \in \mathscr {R}} (1-\cos (\rho k))^2+ \sin ^2(\rho k) =0\) is equivalent to \(k\rho \in 2 \pi \mathbb {Z}\) for all \(\rho \in \mathscr {R}\) or, equivalently, for all \(\rho \in \mathbb {Z}^d\) since \({{\mathrm{span}}}_\mathbb {Z}\mathscr {R} = \mathbb {Z}^d\). This is the case if and only if \(k=0\). Now set

$$\begin{aligned} \mu _{\text {F},N}= \min \bigg \{\frac{h(k)}{ \sum _{\rho \in \mathscr {R}} (1-\cos (\rho k))^2+ \sin ^2(\rho k)} :k \in \hat{Q}_N\backslash \{0\} \bigg \}. \end{aligned}$$

Given \(y :Q_N \rightarrow \mathbb {R}^d\), periodically extended, we have

$$\begin{aligned} \sum \limits _{x\in Q_N} K[D_{\mathscr {R},1}y(x), D_{\mathscr {R},1}y(x)]= & {} N^d \sum \limits _{k \in \hat{Q}_N}\hat{y}(k)^T H(k) \overline{\hat{y}(k)}\\\ge & {} N^d \sum \limits _{k \in \hat{Q}_N} h(k) |\hat{y}(k) |^2 \\\ge & {} \mu _{\text {F},N} N^d \sum \limits _{k \in \hat{Q}_N} \sum _{\rho \in \mathscr {R}} |\hat{y}(k) |^2 \big ((1-\cos (\rho k))^2+ \sin ^2(\rho k) \big ) \\= & {} \mu _{\text {F},N} \sum \limits _{x\in Q_N} |D_{\mathscr {R},1}y(x) |^2, \end{aligned}$$

where we used Proposition 3.4 for K and \(\tilde{K}_{j \rho l \sigma } = \delta _{jl} \delta _{\rho \sigma }\). This proves \(\mu _{\text {per},N} \ge \mu _{\text {F},N}\). For the opposite inequality take \(k_0 \in \hat{Q}_N\backslash \{0\}\) such that

$$\begin{aligned} h(k_0) = \mu _{\text {F},N} \sum _{\rho \in \mathscr {R}} (1-\cos (\rho k_0))^2+ \sin ^2(\rho k_0). \end{aligned}$$

Let \(v_0\) be a corresponding eigenvector and \(k_1 \in \hat{Q}_N\) be the unique vector such that \(k_0+k_1 \in 2\pi \mathbb {Z}^d\). In the case \(k_0=k_1\), take \(v_0\) real. We define

$$\begin{aligned} y(x) = \overline{v}_0 e^{ik_0x}+ v_0 e^{ik_1x}. \end{aligned}$$

For \(x \in Q_N\) we have \(y(x) = 2 {{\mathrm{Re}}}v_0 \cos (k_0 x) + 2 {{\mathrm{Im}}}v_0 \sin (k_0 x)\), which is real, \([0,N)^d\)-periodic and nonconstant. We calculate

$$\begin{aligned} \sum \limits _{x\in Q_N} K[D_{\mathscr {R},1}y(x), D_{\mathscr {R},1}y(x)]= & {} N^d \sum \limits _{k \in \hat{Q}_N}\hat{y}(k)^T H(k) \overline{\hat{y}(k)}\\= & {} 2 N^d h(k_0) |v_0 |^2 (1+ \delta _{k_0 k_1}) \\= & {} \mu _{\text {F},N} N^d \sum \limits _{k \in \hat{Q}_N} \sum _{\rho \in \mathscr {R}} |\hat{y}(k) |^2 \big ((1-\cos (\rho k))^2+ \sin ^2(\rho k) \big ) \\= & {} \mu _{\text {F},N} \sum \limits _{x\in Q_N} |D_{\mathscr {R},1}y(x) |^2. \end{aligned}$$

Therefore, \(\mu _{\text {per},N} \le \mu _{\text {F},N}\). \(\square \)
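As an illustration of Proposition 3.5, one can compare the Fourier formula with a direct minimization of the Rayleigh quotient. The following sketch is restricted to \(d=1\) with the hypothetical range \(\mathscr {R}=\{\pm 1,\pm 2\}\) and a random symmetric \(K\); it again assumes the forward difference \(y(x+\rho )-y(x)\) for the discrete gradient. In this scalar case \(H(k)\) is a real number, so \(h(k)=H(k)\).

```python
import numpy as np

rng = np.random.default_rng(2)
N = 12
R = np.array([1, -1, 2, -2])                 # d = 1, span_Z R = Z (assumption)
nR = len(R)
K = rng.standard_normal((nR, nR))
K = K + K.T                                  # K_{rho sigma} = K_{sigma rho}

# Difference operator: row (x, rho) of D maps y to y(x + rho) - y(x).
D = np.zeros((N * nR, N))
for x in range(N):
    for r, rho in enumerate(R):
        D[x * nR + r, (x + rho) % N] += 1.0
        D[x * nR + r, x] -= 1.0

A = D.T @ np.kron(np.eye(N), K) @ D          # numerator quadratic form
B = D.T @ D                                  # denominator quadratic form

# Remove the constants (the kernel of B) and whiten B on the complement,
# which turns the generalized problem into a standard eigenvalue problem.
w, V = np.linalg.eigh(B)
keep = w > 1e-10
Q = V[:, keep] / np.sqrt(w[keep])
mu_direct = np.linalg.eigvalsh(Q.T @ A @ Q)[0]

# Fourier formula over the nonzero wave numbers in the dual group.
ratios = []
for n in range(1, N):
    k = 2 * np.pi * n / N
    e = np.exp(1j * R * k) - 1.0
    H = (e @ K @ np.conj(e)).real
    ratios.append(H / np.sum(np.abs(e) ** 2))
mu_fourier = min(ratios)
```

Both numbers should agree up to rounding. The direct computation scales poorly with \(N\) and \(d\), which is one reason why the Fourier representation is the more practical tool.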

In the limit \(N \rightarrow \infty \) we get the following result:

Theorem 3.6

Assume that \(K_{j \rho l \sigma } = K_{l \sigma j \rho }\) for all \(j,l,\rho ,\sigma \). Then

$$\begin{aligned} \lambda _\mathrm{atom}(K)&= \inf \bigg \{\frac{h(k)}{\sum _{\rho \in \mathscr {R}} (1-\cos (\rho k))^2+ \sin ^2(\rho k)} :k \in [0,2\pi )^d\backslash \{0\} \bigg \},\\ \tilde{\lambda }_\mathrm{LH}(K)&= \lim \limits _{s \rightarrow 0^+} \inf \bigg \{\frac{h(k)}{\sum _{\rho \in \mathscr {R}} (1-\cos (\rho k))^2+ \sin ^2(\rho k)} :k \in (-s,s)^d\backslash \{ 0 \} \bigg \}, \end{aligned}$$

where h(k) is the smallest eigenvalue of H(k). In particular, atomistic stability implies the Legendre–Hadamard condition.

Proof

Set

$$\begin{aligned} \mu _{\text {F}} = \inf \bigg \{\frac{h(k)}{\sum _{\rho \in \mathscr {R}} (1-\cos (\rho k))^2+ \sin ^2(\rho k)} :k \in [0,2\pi )^d\backslash \{0\} \bigg \}. \end{aligned}$$

By Proposition 3.5 we have \(\mu _{\text {per},N} \ge \mu _{\text {F}}\) and thus \(\lambda _\mathrm{atom}(K) \ge \mu _{\text {F}}\). For the opposite inequality let \(M>\mu _{\text {F}}\). Now, take \(k_0 \in [0,2\pi )^d\backslash \{0\}\) such that

$$\begin{aligned} \frac{h(k_0)}{\sum _{\rho \in \mathscr {R}} (1-\cos (\rho k_0))^2+ \sin ^2(\rho k_0)} < M. \end{aligned}$$

By continuity of h, we can find an \(N \in \mathbb {N}\) and a \(k_1 \in \hat{Q}_N\) such that

$$\begin{aligned} \frac{h(k_1)}{\sum _{\rho \in \mathscr {R}} (1-\cos (\rho k_1))^2+ \sin ^2(\rho k_1)} < M. \end{aligned}$$

Therefore, \(\lambda _\mathrm{atom}(K) \le \mu _{\text {per},N} < M\). Now, let \(\eta \in \mathbb {R}^d\) with \(|\eta |=1\) and \(0<\tau \le 1\). Then,

$$\begin{aligned} \big |(1-\cos (\rho \eta \tau ))^2+ \sin ^2(\rho \eta \tau ) - \tau ^2 (\rho \eta )^2 \big |\le C \tau ^4 \end{aligned}$$

and for \(\xi \in \mathbb {C}^d\) with \(|\xi |= 1\)

$$\begin{aligned} \big |\xi ^T H(\eta \tau )\overline{\xi } - \tau ^2 K[\xi \otimes (\rho \eta )_{\rho \in \mathscr {R}}, \overline{\xi } \otimes (\rho \eta )_{\rho \in \mathscr {R}}] \big |\le C \tau ^3. \end{aligned}$$

This implies

$$\begin{aligned} \big |h(\eta \tau ) - \min \limits _{\begin{array}{c} \xi \in \mathbb {C}^d \\ |\xi |= 1 \\ \end{array}}\tau ^2 K[\xi \otimes (\rho \eta )_{\rho \in \mathscr {R}}, \overline{\xi } \otimes (\rho \eta )_{\rho \in \mathscr {R}}] \big |\le C \tau ^3. \end{aligned}$$

Furthermore, for all \(\eta \) as above we have

$$\begin{aligned} 0<c \le \sum _{\rho \in \mathscr {R}} (\rho \eta )^2 \le C \end{aligned}$$

and

$$\begin{aligned} \Big |\min \limits _{\begin{array}{c} \xi \in \mathbb {C}^d \\ |\xi |= 1 \\ \end{array}} K[\xi \otimes (\rho \eta )_{\rho \in \mathscr {R}}, \overline{\xi } \otimes (\rho \eta )_{\rho \in \mathscr {R}}] \Big |\le C. \end{aligned}$$

Thus, for \(\tau \) small enough we also know that

$$\begin{aligned} \sum _{\rho \in \mathscr {R}} (1-\cos (\rho \eta \tau ))^2+ \sin ^2(\rho \eta \tau ) \ge \frac{c \tau ^2}{2}. \end{aligned}$$

Due to the symmetry of K we have

$$\begin{aligned} K[\xi \otimes b, \overline{\xi } \otimes b] = K[{{\mathrm{Re}}}\xi \otimes b, {{\mathrm{Re}}}\xi \otimes b] + K[{{\mathrm{Im}}}\xi \otimes b, {{\mathrm{Im}}}\xi \otimes b] \end{aligned}$$

for all \(\xi \in \mathbb {C}^d\) and \(b \in \mathbb {R}^\mathscr {R}\). In particular,

$$\begin{aligned} \min \limits _{\begin{array}{c} \xi \in \mathbb {C}^d \\ |\xi |= 1 \\ \end{array}} K[\xi \otimes (\rho \eta )_{\rho \in \mathscr {R}}, \overline{\xi } \otimes (\rho \eta )_{\rho \in \mathscr {R}}] = \min \limits _{\begin{array}{c} \xi \in \mathbb {R}^d \\ |\xi |= 1 \\ \end{array}} K[\xi \otimes (\rho \eta )_{\rho \in \mathscr {R}}, \xi \otimes (\rho \eta )_{\rho \in \mathscr {R}}]. \end{aligned}$$

Combining the above inequalities we get

$$\begin{aligned}&\Bigg |\frac{h(\eta \tau )}{\sum _{\rho \in \mathscr {R}} (1-\cos (\rho \eta \tau ))^2+ \sin ^2(\rho \eta \tau )} - \frac{\min \limits _{\xi \in \mathbb {R}^d, |\xi |= 1} K[\xi \otimes (\rho \eta )_{\rho \in \mathscr {R}}, \xi \otimes (\rho \eta )_{\rho \in \mathscr {R}}]}{\sum _{\rho \in \mathscr {R}} (\rho \eta )^2} \Bigg |\\&\quad \le \frac{4C^2}{c^2} \tau \end{aligned}$$

for all \(\tau \) small enough and all \(\eta \) as above. Therefore,

$$\begin{aligned} \lim \limits _{\tau \rightarrow 0^+} \min \limits _{|\eta |= 1} \frac{h(\eta \tau )}{\sum _{\rho \in \mathscr {R}} (1-\cos (\rho \eta \tau ))^2+ \sin ^2(\rho \eta \tau )} = \tilde{\lambda }_\mathrm{LH}(K) \end{aligned}$$

which gives the desired result. \(\square \)

If H is real we can express \(\lambda _\mathrm{atom}\) in a way that looks quite similar to the definition of \(\lambda _\mathrm{LH}\).

Corollary 3.7

Assume that \(K_{j \rho l \sigma } = K_{l \sigma j \rho }\) and additionally \(K_{j \rho l \sigma } = K_{j (-\rho ) l (-\sigma )}\) or \(K_{j \rho l \sigma } = K_{l \rho j \sigma }\) for all \(j,l,\rho ,\sigma \). Then

$$\begin{aligned} \lambda _\mathrm{atom}(K)&= \inf \bigg \{\frac{K[\xi \otimes c(k),\xi \otimes c(k)] + K[\xi \otimes s(k), \xi \otimes s(k)]}{|\xi |^2 (|c(k) |^2 + |s(k) |^2)} :\\&\qquad \xi \in \mathbb {R}^d \backslash \{0\}, k \in [0,2\pi )^d\backslash \{0\} \bigg \}, \end{aligned}$$

where \(c(k)_\rho = \cos (\rho k) -1\) and \(s(k)_\rho = \sin (\rho k)\).

The following criterion is strictly weaker but often easier to check.

Corollary 3.8

Assume that \(K_{j \rho l \sigma } = K_{l \sigma j \rho }\) and additionally \(K_{j \rho l \sigma } = K_{j (-\rho ) l (-\sigma )}\) for all \(j,l,\rho ,\sigma \). Let \(\lambda _\mathrm{LH}(K)>0\), \(K[\xi \otimes s(k),\xi \otimes s(k)] \ge 0\) for all \(\xi ,k \in \mathbb {R}^d\) and

$$\begin{aligned} K[\xi \otimes c(k),\xi \otimes c(k)] \ge \gamma |\xi |^2 |c(k) |^2 \end{aligned}$$

for all \(\xi ,k \in \mathbb {R}^d\) and some \(\gamma > 0\). Then \(\lambda _\mathrm{atom}>0\).

Proof

Since \(\lambda _\mathrm{LH}(K)\) and \(\tilde{\lambda }_\mathrm{LH}(K)\) are equivalent, we can use Theorem 3.6 to see that there are some \(\tilde{\gamma }, \delta > 0\) such that

$$\begin{aligned} K[\xi \otimes c(k),\xi \otimes c(k)] + K[\xi \otimes s(k), \xi \otimes s(k)] \ge \tilde{\gamma }|\xi |^2 (|c(k) |^2 + |s(k) |^2) \end{aligned}$$

for all \(\xi \) and all k with \({{\mathrm{dist}}}(k, 2\pi \mathbb {Z}^d) < \delta \). On the other hand, there is a \(C>0\) such that \(|s(k) |\le C |c(k) |\) whenever \({{\mathrm{dist}}}(k, 2\pi \mathbb {Z}^d) \ge \delta \). Therefore

$$\begin{aligned} K[\xi \otimes c(k),\xi \otimes c(k)] + K[\xi \otimes s(k), \xi \otimes s(k)] \ge \frac{\gamma }{1+C^2}|\xi |^2 (|c(k) |^2 + |s(k) |^2) \end{aligned}$$

for these k and all \(\xi \). \(\square \)

Remark 3.9

The connection to the formulas in [12] is given by

$$\begin{aligned} 4 \sin ^2\Big (\frac{z}{2}\Big )&= (\cos (z)-1)^2 + \sin ^2(z)\\ 2\sin ^2\Big (\frac{y}{2}\Big ) + 2\sin ^2\Big (\frac{z}{2}\Big ) - 2\sin ^2\Big (\frac{z-y}{2}\Big )&= (\cos (y)-1)(\cos (z)-1) + \sin (y)\sin (z). \end{aligned}$$

A little bit of calculation shows that the stability constants here and in [12] then are actually equivalent (with the minor correction that most of their sums should actually run over the set \(\mathscr {R}-\mathscr {R}\) instead of \(\mathscr {R}\)).

3.3 Examples for stability

First of all, let us point out that the general assumptions made in this work are consistent with a large variety of atomic interaction models and lattices. A simple sufficient condition for atomistic stability is the following:

Proposition 3.10

If \(W_\mathrm{atom}\) is \(C^2\) in a neighborhood of \((A_0 \rho )_{\rho \in \mathscr {R}}\), satisfies the symmetry condition, and \((A_0 \rho )_{\rho \in \mathscr {R}}\) is a local minimizer of the energy such that the second derivative in the directions of affine rank-one deformations \(((\xi \otimes \eta ) \rho )_{\rho \in \mathscr {R}}\) and on the orthogonal complement of all affine deformations is strictly positive, then \(\lambda _\mathrm{atom} (A_0)>0\).

Proof

Just use Corollary 3.8 and the fact that \(\xi \otimes c(k)\) is orthogonal to affine deformations. \(\square \)

Remark 3.11

These conditions allow for a large class of frame indifferent interaction models. Examples include the general finite range potentials discussed in [5].

We next want to discuss the connection between atomistic and continuous stability. To do this we will characterize the stability constants in two examples. The examples are two-dimensional to allow for a significantly easier analytical treatment, but the studied effects are expected to be the same in three dimensions.

There is a conjecture that in certain regimes one has \(\lambda _\mathrm{atom}(A) = \tilde{\lambda }_\mathrm{LH}(A)\) for a large set of matrices or at least \(\lambda _\mathrm{atom}(A)>0\) if and only if \(\lambda _\mathrm{LH}(A)>0\), compare [12]. But so far this has only been proven in certain one-dimensional cases (e.g., in [12]). Even more importantly, this is expected to be false in general. In more than one dimension so far this has only been discussed numerically in [12].

First, let us look at a rather simple but multidimensional example where it is possible to analytically prove \(\lambda _\mathrm{atom}(A) = \tilde{\lambda }_\mathrm{LH}(A)\) for a large set of matrices A. To be more precise, we consider uniform contractions and extensions of a triangular lattice where the energy is given by an unspecified pair potential for the nearest neighbors. This means we will look at \(d=2\),

$$\begin{aligned} M = \begin{pmatrix} 1 &{}\quad \frac{1}{2}\\ 0 &{}\quad \frac{\sqrt{3}}{2} \end{pmatrix} \end{aligned}$$

and consider the linearization at \(M(t) = t M\) for \(t>0\). Furthermore,

$$\begin{aligned} \mathscr {R} = \{ \pm e_1,\pm e_2,\pm (e_2-e_1)\} \end{aligned}$$

and the interaction is given by

$$\begin{aligned} W_\mathrm{atom}(A) = \frac{1}{2} \sum \limits _{\rho \in \mathscr {R}} V_0(|A_\rho |) \end{aligned}$$

with some pair potential \(V_0 \in C^2((0,\infty );\mathbb {R})\). The Cauchy–Born energy density is then given by

$$\begin{aligned} W_\mathrm{CB}(A) = V_0 ( |A_{\cdot 1} |) + V_0 ( |A_{\cdot 2} |) + V_0 ( |A_{\cdot 2}-A_{\cdot 1} |). \end{aligned}$$

Direct calculations give

$$\begin{aligned} K(t)_{j \rho l \sigma } = \delta _{\rho \sigma } \Big ( \frac{V_0'(t)}{t} (\delta _{jl} - (M\rho )_j (M \rho )_l) + V_0''(t)(M\rho )_j (M \rho )_l \Big ) \end{aligned}$$

and, with some more effort,

$$\begin{aligned} h(t,k)&= 4 \Big (V_0''(t) + \frac{V_0'(t)}{t}\Big )\Big (\sin ^2\Big (\frac{k_1}{2}\Big ) + \sin ^2\Big (\frac{k_2}{2}\Big ) + \sin ^2\Big (\frac{k_2-k_1}{2}\Big )\Big )\\&\qquad - 2\sqrt{2} \Big |V_0''(t) - \frac{V_0'(t)}{t}\Big |\bigg (\Big (\sin ^2\Big (\frac{k_1}{2}\Big )-\sin ^2\Big (\frac{k_2}{2}\Big )\Big )^2\\&\qquad +\Big (\sin ^2\Big (\frac{k_1}{2}\Big )-\sin ^2\Big (\frac{k_2-k_1}{2}\Big )\Big )^2+\Big (\sin ^2\Big (\frac{k_2-k_1}{2}\Big )-\sin ^2\Big (\frac{k_2}{2}\Big )\Big )^2\bigg )^{\frac{1}{2}}. \end{aligned}$$

The nonlinear minimization problem can be drastically simplified by the substitution \(s_1 = \sin (\frac{k_1}{2})\) and \(s_2= \sin (\frac{k_2}{2})\). Then, only certain algebraic inequalities have to be shown. A lengthy but not too difficult calculation results in the following characterization. All omitted details can be found in [3].

Proposition 3.12

In the above setting we have

$$\begin{aligned} \lambda _\mathrm{atom}(M(t))=\tilde{\lambda }_\mathrm{LH}(M(t))= \frac{1}{2} \Big (V_0''(t) + \frac{V_0'(t)}{t}\Big ) - \frac{1}{4} \Big |V_0''(t) - \frac{V_0'(t)}{t}\Big |. \end{aligned}$$
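The closed form can be compared with a brute-force evaluation of the representation formula from Theorem 3.6. The sketch below takes the tensor \(K(t)\) exactly as displayed above, replaces \(V_0'(t)/t\) and \(V_0''(t)\) by two plain numbers `a` and `b` (hypothetical sample values), and scans a uniform grid of wave numbers; the grid size `n` and the function names are illustrative assumptions. A grid scan is of course only a numerical plausibility check, not a proof.

```python
import numpy as np

def lambda_atom_grid(a, b, n=48):
    """Grid approximation of inf_k h(k) / sum_rho |e^{i rho k} - 1|^2.

    a stands for V_0'(t)/t and b for V_0''(t); both are plain numbers here.
    """
    M = np.array([[1.0, 0.5], [0.0, np.sqrt(3.0) / 2.0]])
    half = np.array([[1, 0], [0, 1], [-1, 1]])
    R = np.vstack([half, -half])             # {+-e1, +-e2, +-(e2 - e1)}
    best = np.inf
    for n1 in range(n):
        for n2 in range(n):
            if n1 == 0 and n2 == 0:
                continue                     # k = 0 is excluded
            k = 2 * np.pi * np.array([n1, n2]) / n
            H = np.zeros((2, 2))
            denom = 0.0
            for rho in R:
                m = M @ rho                  # unit vector M rho
                P = np.outer(m, m)
                w = 2.0 - 2.0 * np.cos(rho @ k)   # |e^{i rho k} - 1|^2
                # K(t) is diagonal in rho, so H(k) = sum_rho w_rho K_rho.
                H += w * (a * (np.eye(2) - P) + b * P)
                denom += w
            best = min(best, np.linalg.eigvalsh(H)[0] / denom)
    return best

def lambda_closed_form(a, b):
    """Right-hand side of the formula in Proposition 3.12."""
    return 0.5 * (b + a) - 0.25 * abs(b - a)
```

For instance, `lambda_atom_grid(-1.0, 5.0)` and `lambda_closed_form(-1.0, 5.0)` should both give 0.5 up to rounding, since the grid contains wave numbers of the form \((k_1,0)\), at which the quotient already equals the closed-form value.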

Remark 3.13

If \(V_0\) is a standard Lennard–Jones potential, i.e.,

$$\begin{aligned} V_0(r) = r^{-12} - 2r^{-6}, \end{aligned}$$

then M(t) is stable in both senses if and only if

$$\begin{aligned} t \in \Big (0,\root 6 \of {\frac{19}{10}}\Big ), \end{aligned}$$

where \(\root 6 \of {\frac{19}{10}} \approx 1.113\).
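The threshold can be reproduced by inserting the Lennard–Jones derivatives into the formula of Proposition 3.12. The following sketch does this directly; the function names are hypothetical and chosen only for this illustration.

```python
def dV(r):
    """V_0'(r) for V_0(r) = r^{-12} - 2 r^{-6}."""
    return -12.0 * r**-13 + 12.0 * r**-7

def d2V(r):
    """V_0''(r) for the same potential."""
    return 156.0 * r**-14 - 84.0 * r**-8

def stability_constant(t):
    """Closed form of Proposition 3.12 evaluated at M(t) = t M."""
    a, b = dV(t) / t, d2V(t)
    return 0.5 * (b + a) - 0.25 * abs(b - a)

t_crit = (19.0 / 10.0) ** (1.0 / 6.0)        # approximately 1.113
```

Multiplying the closed form by \(t^{14}\) gives \(30-12t^6\) for \(t^6\le \frac{7}{4}\) and \(114-60t^6\) for \(t^6\ge \frac{7}{4}\), so the sign changes exactly at \(t^6=\frac{19}{10}\). Accordingly, `stability_constant` should be positive slightly below `t_crit` and negative slightly above it.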

Remark 3.14

In the proof the choice of our \(h^1\)-norm helps to drastically simplify the problem. If one tries to show the equivalent result for the Fourier \(h^1\)-norm one has to prove a fully nonlinear, nonalgebraic inequality. In the approach above only a few algebraic manipulations are necessary.

As a second example, which actually shows the differences between the two notions of stability, we look at a rectangular lattice with nearest and next-to-nearest neighbor interactions that are not balanced with each other. In [10] this problem is discussed in the context of global minimization. We will look at the same setting and give an explicit characterization of our notions of (local) stability. As in [10] the instability we find is a “shift-relaxation”, which corresponds to a period doubling. But we prove even more. We show that there are no macroscopic instabilities at all and that the lattice is stable on all scales up to the point where the instability due to “shift-relaxations” occurs. Additionally, since we only require a local analysis, it is easy to extend the example to a quite general class of potentials, as we will describe in more detail at the end.

We set \(\mathscr {R} = \{\pm e_1, \pm e_2, e_1 \pm e_2, -e_1 \pm e_2 \}\) and

$$\begin{aligned} W_\mathrm{atom}(A) = \frac{K_1}{4} \sum _{\rho \in \mathscr {R}, |\rho |= 1} (|A_\rho |- a_1)^2 + \frac{K_2}{4} \sum _{\rho \in \mathscr {R}, |\rho |= \sqrt{2}} (|A_\rho |- a_2)^2 \end{aligned}$$

for some \(a_1,a_2,K_1,K_2>0\). We are now interested in the stability of \(A_0 = r^*{{\mathrm{Id}}}\) with

$$\begin{aligned} r^*= \frac{K_1 a_1 + \sqrt{2} K_2 a_2}{K_1 + 2 K_2}. \end{aligned}$$

In the following let us use the notation

$$\begin{aligned} \alpha = \frac{a_2}{\sqrt{2}a_1}, \quad \kappa = \frac{K_2}{K_1} \quad \text {and}\ \beta = \frac{1+2\kappa }{1+2\alpha \kappa }. \end{aligned}$$

Proposition 3.15

In this setting we have

$$\begin{aligned} \tilde{\lambda }_\mathrm{LH}(r^*{{\mathrm{Id}}}) = \frac{K_1}{12} \beta \min \{1,2\alpha \kappa \}>0 \end{aligned}$$

for all parameter values, while \(\lambda _\mathrm{atom}(r^*{{\mathrm{Id}}}) >0\) if and only if \(\beta <2\), which corresponds to \(\alpha \ge \frac{1}{2}\) or \(\alpha <\frac{1}{2}\) and \(\kappa < \frac{1}{2(1-2\alpha )}\).

Proof

Calculating the derivatives we find

$$\begin{aligned} K_{j\rho l \sigma }&= D^2 W_\mathrm{atom}((r^*\rho )_{\rho \in \mathscr {R}})[e_j \otimes e_\rho , e_l \otimes e_\sigma ]\\&= \delta _{\rho \sigma } \delta _{|\rho |1}\Big (\delta _{jl} \frac{K_1}{2}\big (1-\frac{a_1}{r^*}\big ) + \rho _j \rho _l\frac{K_1 a_1}{2 r^*}\Big )\\&\quad + \delta _{\rho \sigma } \delta _{|\rho |\sqrt{2}}\Big (\delta _{jl} \frac{K_2}{2}\big (1-\frac{a_2}{\sqrt{2} r^*}\big ) + \rho _j \rho _l\frac{K_2 a_2}{4 \sqrt{2} r^*}\Big ). \end{aligned}$$

One then proceeds similarly to the last example. All details can again be found in [3]. \(\square \)

In this example we see that the Legendre–Hadamard stability constant and the atomistic stability constant can be quite different and the parameter regions where we have macroscopic or atomistic stability can be very different as well. In the Fourier characterization it is clear that this difference occurs whenever a system is stable under macroscopic, long wavelength perturbations but not under some perturbation with wavelength on the atomistic scale. In this example, the instability does indeed occur on the atomistic scale and actually corresponds to a period doubling where the wave number is \(k=(\pi ,\pi )\).
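The role of the wave number \(k=(\pi ,\pi )\) can be made explicit with the Fourier quotient from Theorem 3.6. The sketch below builds \(H(k)\) from the tensor \(K\) computed in the proof of Proposition 3.15 and evaluates \(h(k)\) divided by \(\sum _{\rho } |e^{i\rho k}-1|^2\); the two parameter sets and all function names are hypothetical choices for illustration. At \(k=(\pi ,\pi )\) only the nearest neighbors contribute, and a short calculation gives the value \(\frac{K_1}{4}(2-\beta )\) for this quotient.

```python
import numpy as np

def stability_ratio(k, K1, K2, a1, a2):
    """h(k) / sum_rho |e^{i rho k} - 1|^2 for the linearization at r* Id."""
    r = (K1 * a1 + np.sqrt(2.0) * K2 * a2) / (K1 + 2.0 * K2)   # r*
    R = np.array([[1, 0], [-1, 0], [0, 1], [0, -1],
                  [1, 1], [1, -1], [-1, 1], [-1, -1]])
    H = np.zeros((2, 2))
    denom = 0.0
    for rho in R:
        P = np.outer(rho, rho).astype(float)
        if rho @ rho == 1:                   # nearest neighbors
            Krho = (0.5 * K1 * (1.0 - a1 / r) * np.eye(2)
                    + 0.5 * K1 * a1 / r * P)
        else:                                # next-to-nearest neighbors
            Krho = (0.5 * K2 * (1.0 - a2 / (np.sqrt(2.0) * r)) * np.eye(2)
                    + K2 * a2 / (4.0 * np.sqrt(2.0) * r) * P)
        w = 2.0 - 2.0 * np.cos(rho @ k)      # |e^{i rho k} - 1|^2
        H += w * Krho
        denom += w
    return np.linalg.eigvalsh(H)[0] / denom

def beta(K1, K2, a1, a2):
    alpha, kappa = a2 / (np.sqrt(2.0) * a1), K2 / K1
    return (1.0 + 2.0 * kappa) / (1.0 + 2.0 * alpha * kappa)

k_shift = np.array([np.pi, np.pi])               # period doubling
stable = (1.0, 1.0, 1.0, np.sqrt(2.0))           # beta = 1
unstable = (1.0, 2.0, 1.0, 0.2 * np.sqrt(2.0))   # beta = 25/9 > 2
```

For the sample `stable` the quotient at `k_shift` should be \(\frac{1}{4}\), while for `unstable` it should be negative, in line with the criterion \(\beta <2\). A single negative value of the quotient already rules out atomistic stability, whereas positivity at one wave number proves nothing by itself.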

The example is actually much more general than it looks. Given general pair potentials \(V_1, V_2 \in C^2(0,\infty )\) as well as an \(r^*\) with

$$\begin{aligned} V_1'(r^*) + \sqrt{2} V_2'(\sqrt{2} r^*) = 0, \end{aligned}$$

one can look at the site potential

$$\begin{aligned} W_\mathrm{atom}(A) = \frac{1}{2} \sum _{\rho \in \mathscr {R}, |\rho |= 1} V_1(|A_\rho |) + \frac{1}{2} \sum _{\rho \in \mathscr {R}, |\rho |= \sqrt{2}} V_{2}(|A_\rho |). \end{aligned}$$

We can now set \(K_1 = V_1''(r^*)\), \(K_2 = V_2''(\sqrt{2} r^*)\), \(a_1 =r^*-\frac{V_1'(r^*)}{V_1''(r^*)}\), and \(a_2 =\sqrt{2} r^*-\frac{V_2'(\sqrt{2} r^*)}{V_2''(\sqrt{2} r^*)}\). As long as \(K_1, K_2, a_1, a_2 > 0\), the above analysis applies directly since the linearization K is the same.

4 Solving the continuous equations

4.1 The linearized system

Let us first recall standard results for the linear(-ized) system.

Theorem 4.1

Let \(\Omega \subset \mathbb {R}^d\) be an open, bounded set, \(L \in \mathbb {R}^{d \times d \times d \times d}\), \(f \in L^2(\Omega ;\mathbb {R}^d)\), \(F \in L^2(\Omega ; \mathbb {R}^{d \times d})\), and \(g \in H^1(\Omega ;\mathbb {R}^d)\). Furthermore, assume \(\lambda _\mathrm{LH}(L)>0\). Then there is one and only one weak solution \(u \in g + H^1_0(\Omega ;\mathbb {R}^d)\) of

$$\begin{aligned} -{{\mathrm{div}}}(L[\nabla u]) = f - {{\mathrm{div}}}F. \end{aligned}$$

Theorem 4.2

Let \(m \in \mathbb {N}_0\), \(\Omega \subset \mathbb {R}^d\) an open, bounded set with \(C^{m+2}\) boundary, \(L \in C^{m,1}(\Omega ;\mathbb {R}^{d \times d \times d \times d})\), \(f \in H^m(\Omega ;\mathbb {R}^d)\), \(F \in H^{m+1}(\Omega ; \mathbb {R}^{d \times d})\), and \(g \in H^{m+2}(\Omega ;\mathbb {R}^d)\). Furthermore, assume \(\lambda _\mathrm{LH}(L(x)) \ge \lambda _0\) for some \(\lambda _0 > 0\) and all \(x\in \Omega \). Assume that \(u \in g + H^1_0(\Omega ;\mathbb {R}^d)\) is a weak solution of

$$\begin{aligned} -{{\mathrm{div}}}(L[\nabla u]) = f - {{\mathrm{div}}}F. \end{aligned}$$

Then \(u \in H^{m+2}(\Omega ;\mathbb {R}^d)\) and there is a \(c=c(m,\Omega ,||L ||_{C^{m,1}}, \lambda _0)>0\), such that

$$\begin{aligned} ||\nabla ^{m+2} u ||_{L^2} \le c (||f ||_{H^m} + ||\nabla F ||_{H^m} + ||g ||_{H^{m+2}}). \end{aligned}$$

We only need this theorem for constant L. Reformulating these results we get:

Corollary 4.3

Let \(m \in \mathbb {N}_0\), let \(\Omega \subset \mathbb {R}^d\) be an open, bounded set with \(C^{m+2}\) boundary, let \(L \in \mathbb {R}^{d \times d \times d \times d}\) and assume \(\lambda _\mathrm{LH}(L)>0\). Then the mapping

$$\begin{aligned} u \mapsto {{\mathrm{div}}}( L[\nabla u] ) \end{aligned}$$

is a linear isomorphism from \(H^{m+2}(\Omega ;\mathbb {R}^d)\cap H^1_0(\Omega ;\mathbb {R}^d) \) onto \(H^m(\Omega ;\mathbb {R}^d)\).

Proof

These statements are rather standard and can be found in the literature. See, e.g., [11, Corollary 3.46] and [11, Theorem 4.14]. \(\square \)

4.2 Local solutions of the nonlinear problem

We now improve the linearized result to a local result for the nonlinear problem with the help of an implicit function theorem.

Theorem 4.4

Let \(m \in \mathbb {N}_0\), \(d<2m+2\) and let \(\Omega \subset \mathbb {R}^d\) be an open, bounded set with \(C^{m+2}\)-boundary. Let \(r_0>0\), \(W_\mathrm{atom} \in C^{m+3}(\overline{B_{r_0}((A_0 \rho )_{\rho \in \mathscr {R}})})\) and assume that \(\lambda _\mathrm{LH}(A_0)>0\). Then there are constants \(\kappa _1,\kappa _2>0\) such that for all \(g\in H^{m+2}(\Omega ;\mathbb {R}^d)\) and \(f \in H^m(\Omega ;\mathbb {R}^d)\) that satisfy \(||g-y_{A_0} ||_{H^{m+2}(\Omega ;\mathbb {R}^d)} < \kappa _1\) and \(||f ||_{H^m(\Omega ;\mathbb {R}^d)} < \kappa _1\) the problem

$$\begin{aligned} -{{\mathrm{div}}}( DW_\mathrm{CB}(\nabla y(x)))&= f(x), \quad \text { if } x \in \Omega , \\ y(x)&= g(x), \quad \text { if } x \in \partial \Omega , \end{aligned}$$

has exactly one weak solution with \(||y-g ||_{H^{m+2}(\Omega ;\mathbb {R}^d)} < \kappa _2\). Furthermore, we have

  1. \(\sup _x |((\nabla y(x)-A_0)\rho )_{\rho \in \mathscr {R}} |<r_0\),

  2. y is a \(W^{1,\infty }\)-local minimizer of \(E(\cdot ;f)\) restricted to \(y=g\) on \(\partial \Omega \),

  3. y depends \(C^1\) on f and g in the norms used above.

Let us start with an important statement on compositions:

Lemma 4.5

Let \(m \in \mathbb {N}_0\), \(d<2m+2\) and let \(\Omega \subset \mathbb {R}^d\) be an open, bounded set with Lipschitz boundary. Let \(V \subset \mathbb {R}^{d \times \mathscr {R}}\) and \(W_\mathrm{atom} \in C_b^{m+3}(V)\) with uniformly continuous highest derivatives.

Define the operator \(F:B \mapsto DW_\mathrm{CB} \circ B\). We claim that

$$\begin{aligned} \{B\in H^{m+1}(\Omega ;\mathbb {R}^{d \times d}) :\inf \limits _{x \in \Omega } {{\mathrm{dist}}}( (B(x) \rho )_{\rho \in \mathscr {R}}, V^c)>0\} \end{aligned}$$

is open in \(H^{m+1}(\Omega ;\mathbb {R}^{d \times d})\) and

$$\begin{aligned} F:\{B\in H^{m+1}(\Omega ;\mathbb {R}^{d \times d}) :\inf \limits _{x \in \Omega } {{\mathrm{dist}}}( (B(x) \rho )_{\rho \in \mathscr {R}}, V^c)>0\} \rightarrow H^{m+1}(\Omega ;\mathbb {R}^{d \times d}) \end{aligned}$$

is well-defined and \(C^1\) with

$$\begin{aligned} DF(B)[H](x) = D^2 W_\mathrm{CB} (B(x))[H(x)]. \end{aligned}$$

Proof

This is contained in [20, II. Theorem 4.1]. \(\square \)

Proof (Theorem 4.4)

Let \(X= H^{m+2}(\Omega ;\mathbb {R}^d)\), \(X_0= H^{m+2}(\Omega ;\mathbb {R}^d) \cap H^1_0(\Omega ;\mathbb {R}^d)\) and \(Y = H^m(\Omega ;\mathbb {R}^d)\). Define \(T :B_{r_1}^{X_0}(0) \times B_{r_2}^{X}(0) \times Y \rightarrow Y\),

$$\begin{aligned} T(u,h,f) = -{{\mathrm{div}}}(DW_\mathrm{CB}(A_0 + \nabla h(x) + \nabla u(x))) - f(x). \end{aligned}$$

If we choose \(r_1,r_2>0\) small enough, then we always have

$$\begin{aligned} \sup \limits _{x \in \Omega } |((\nabla h (x) + \nabla u (x))\rho )_{\rho \in \mathscr {R}}|< r_0, \end{aligned}$$

since \(H^{m+2}(\Omega ; \mathbb {R}^d) \hookrightarrow C^1_b(\Omega ;\mathbb {R}^d)\). Using the properties of F from Lemma 4.5 where we set \(V= B_{r_0}((A_0 \rho )_{\rho \in \mathscr {R}})\), this implies that T is well-defined, is in \(C^1\) and

$$\begin{aligned} \partial _u T(u,h,f)[v](x) = -{{\mathrm{div}}}( D^2 W_\mathrm{CB}(A_0+ \nabla u(x) + \nabla h(x))[\nabla v(x)]). \end{aligned}$$

In particular,

$$\begin{aligned} \partial _u T(0,0,0)[v](x) = -{{\mathrm{div}}}( D^2 W_\mathrm{CB}(A_0)[\nabla v(x)]). \end{aligned}$$

Since \(D^2 W_\mathrm{CB}(A_0)\) satisfies the Legendre–Hadamard condition, the invertibility of the linear map \(\partial _u T(0,0,0) :X_0 \rightarrow Y\) follows from Corollary 4.3. Now the main statement on existence, uniqueness and \(C^1\)-dependence follows from a standard Banach space implicit function theorem, as can be found, e.g., in [7, Theorem 15.1 and Corollary 15.1], and then setting \(g = y_{A_0} + h\), \(y = y_{A_0} + h + u\).

Furthermore, if we choose \(r_1,r_2\) even smaller, the above statements are still true and we can achieve that

$$\begin{aligned} \sup \limits _{x \in \Omega } |\nabla h (x) + \nabla u (x)|< \tilde{r} \end{aligned}$$

for all \((u,h) \in B_{r_1}^{X_0}(0) \times B_{r_2}^{X}(0)\), where \(\tilde{r}\) is such that

$$\begin{aligned} \int \limits _\Omega D^2 W_\mathrm{CB}(\nabla z(x))[\nabla s,\nabla s] \,dx \ge \frac{\lambda _\mathrm{LH}(A_0)}{2}\int \limits _\Omega |\nabla s(x) |^2 \,dx \end{aligned}$$

holds for all \(s\in H^1_0(\Omega ;\mathbb {R}^d)\) and \(z \in W^{1,\infty }(\Omega ;\mathbb {R}^d)\) with \(|\nabla z(x) - A_0 |\le \tilde{r}\) a.e. This is possible since \(D^2 W_\mathrm{CB}\) is uniformly continuous.

Now, if \(w\in W^{1,\infty }(\Omega ;\mathbb {R}^d)\) is in this space close enough to y and also has boundary values g, then

$$\begin{aligned} |\nabla w(x) - A_0|\le \tilde{r}, \end{aligned}$$

and, since the first variation of \(E(\cdot ;f)\) at the weak solution y vanishes in the direction \(w-y\), a Taylor expansion with integral remainder gives

$$\begin{aligned} E(w;f)&= E(y;f) + \int \limits _0^1 \int \limits _\Omega (1-t)D^2 W_\mathrm{CB}( (1-t)\nabla y(x) + t \nabla w(x) ) [\nabla w(x) - \nabla y(x)]^2 \,dx \,dt \\&\ge E(y;f) + \frac{\lambda _\mathrm{LH}(A_0)}{4}\int \limits _\Omega |\nabla w(x) - \nabla y(x)|^2 \,dx. \end{aligned}$$

Hence, y is a \(W^{1,\infty }\)-local minimizer of \(E(\cdot ;f)\) restricted to having boundary values g (strongly in the \(H^1_0(\Omega ;\mathbb {R}^d)\)-norm). \(\square \)

5 Existence and convergence of solutions of the atomistic equations

5.1 Statement of the main theorem

Let us define the following discrete norms and semi-norms:

$$\begin{aligned} ||u ||_{\ell _\varepsilon ^2(\Lambda )} = \Big ( \varepsilon ^d \sum _{x \in \Lambda } |u(x) |^2 \Big )^{\frac{1}{2}} \end{aligned}$$

for any finite set \(\Lambda \) and \(u :\Lambda \rightarrow \mathbb {R}^d\),

$$\begin{aligned} ||u ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} = \Big ( \varepsilon ^d \sum _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } |D_{\mathscr {R},\varepsilon } u(x) |^2 \Big )^{\frac{1}{2}} \end{aligned}$$

for \(u :\Omega \cap \varepsilon \mathbb {Z}^d \rightarrow \mathbb {R}^d\) and

$$\begin{aligned} ||u ||_{h_\varepsilon ^{-1}({{\mathrm{int}}}_\varepsilon \Omega )} = \sup \Big \{ \varepsilon ^d \sum _{x\in {{\mathrm{int}}}_\varepsilon \Omega } u(x) \varphi (x) :\varphi \in \mathscr {A}_\varepsilon (\Omega ,0) \text { with }||\varphi ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )}=1\Big \} \end{aligned}$$

for \(u :{{\mathrm{int}}}_\varepsilon \Omega \rightarrow \mathbb {R}^d\). The \(h^1_\varepsilon \)-semi-norm is given by the semi-definite symmetric bilinear form

$$\begin{aligned} (u,v)_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} = \varepsilon ^d\sum _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } D_{\mathscr {R},\varepsilon }u(x) :D_{\mathscr {R},\varepsilon }v(x), \end{aligned}$$

where \(A :B = \sum _\rho \sum _j A_{j \rho } B_{j \rho }\). On \(\mathscr {A}_\varepsilon (\Omega ,0)\) this is a scalar product and \(||\cdot ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )}\) is a norm.
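To make these definitions concrete, here is a minimal one-dimensional Python sketch. It assumes \(d=1\), the stencil \(\mathscr {R} = \{-1,1\}\), the finite differences \(D_{\mathscr {R},\varepsilon }u(x) = \big (\frac{u(x+\varepsilon \rho )-u(x)}{\varepsilon }\big )_{\rho \in \mathscr {R}}\), and it takes as \({{\mathrm{sint}}}_\varepsilon \Omega \) simply those grid points whose whole stencil stays inside the grid. The function names are hypothetical and the simplifications are ours.

```python
import math

def l2_eps_norm(u, eps, d=1):
    """Discrete l^2_eps norm: sqrt(eps^d * sum |u(x)|^2) over a finite list."""
    return math.sqrt(eps ** d * sum(v * v for v in u))

def discrete_gradient(u, i, eps, stencil=(-1, 1)):
    """Finite differences (u(x + eps*rho) - u(x)) / eps for rho in the stencil."""
    return [(u[i + rho] - u[i]) / eps for rho in stencil]

def h1_eps_seminorm(u, eps, stencil=(-1, 1)):
    """Discrete h^1_eps semi-norm in 1D, summing over interior grid indices."""
    reach = max(abs(rho) for rho in stencil)
    total = 0.0
    for i in range(reach, len(u) - reach):
        total += sum(g * g for g in discrete_gradient(u, i, eps, stencil))
    return math.sqrt(eps * total)

eps = 0.25
grid = [i * eps for i in range(5)]       # lattice points 0, 0.25, ..., 1
affine = grid[:]                          # u(x) = x
constant = [1.0] * len(grid)              # u(x) = 1

print(h1_eps_seminorm(affine, eps))       # each interior point contributes |(-1, 1)|^2 = 2
print(h1_eps_seminorm(constant, eps))     # constants lie in the kernel of the semi-norm
print(l2_eps_norm(constant, eps))
```

The constant function shows why this is only a semi-norm in general; on functions with zero boundary values the kernel is trivial, as stated above.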

Given \(g :\partial _\varepsilon \Omega \rightarrow \mathbb {R}^d\), \(y :\Omega \cap \varepsilon \mathbb {Z}^d \rightarrow \mathbb {R}^d\) minimizes \(||y ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )}\) under the constraint \(y(x)=g(x)\) for all \(x \in \partial _\varepsilon \Omega \) if and only if \((y,u)_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )}=0\) for all \(u \in \mathscr {A}_\varepsilon (\Omega ,0)\) and \(y(x)=g(x)\) for all \(x \in \partial _\varepsilon \Omega \). Thus, for every \(g :\partial _\varepsilon \Omega \rightarrow \mathbb {R}^d\) there is precisely one such y, it depends linearly on g and is the unique solution to \({{\mathrm{div}}}_{\mathscr {R},\varepsilon } D_{\mathscr {R},\varepsilon } y = 0\) with boundary values g. We write \(y=T_\varepsilon g\). Accordingly, we define the semi-norm

$$\begin{aligned} ||g ||_{\partial _\varepsilon \Omega } = ||T_\varepsilon g ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )}. \end{aligned}$$

Given \(\varepsilon \in (0,1]\) and \(f \in L^2(\Omega )\) we will write

$$\begin{aligned} \tilde{f}(x) = \frac{1}{|Q_\varepsilon (x)|} \int _{Q_\varepsilon (x)} f(z)\,dz \end{aligned}$$

for \(x \in {{\mathrm{int}}}_\varepsilon \Omega \). If \(\Omega \) has Lipschitz boundary and we have a deformation \(y \in H^1(\Omega ;\mathbb {R}^d)\) we will write

$$\begin{aligned} S_{\varepsilon }y(x) = \eta _\varepsilon *(y_{A_0} + E(y-y_{A_0}) ) (x) \end{aligned}$$

for \(x \in \varepsilon \mathbb {Z}^d\), where \(\eta _\varepsilon \) is the standard scaled smoothing kernel and E is an extension operator for all Sobolev spaces, see [18, Chapter VI], such that every Eu has support in a fixed ball \(B_{R_E}(0)\). In the following \(\tilde{f}\) and \(S_{\varepsilon }y\) are our reference points for the atomistic body forces, boundary conditions and deformations.

Theorem 5.1

Let \(d \in \{1,2,3,4\}\) and let \(\Omega \subset \mathbb {R}^d\) be an open, bounded set with \(C^4\)-boundary. Let \(r_0 >0\), \(W_\mathrm{atom} \in C^5(\overline{B_{r_0}((A_0 \rho )_{\rho \in \mathscr {R}})})\) and assume \(\lambda _\mathrm{atom}(A_0)>0\). Then there are constants \(K_1, K_2, K_3>0\) such that for every \(f \in H^2(\Omega ;\mathbb {R}^d)\) with \(||f ||_{H^2(\Omega ;\mathbb {R}^d)} \le K_1\), \(g \in H^4(\Omega ;\mathbb {R}^d)\) with \(||g - y_{A_0} ||_{H^4(\Omega ;\mathbb {R}^d)} \le K_1\), \(\varepsilon \in (0,1]\), \(\gamma \in [\frac{d}{2},2]\), \(f_\mathrm{atom} :{{\mathrm{int}}}_\varepsilon \Omega \rightarrow \mathbb {R}^d\) with \(||f_\mathrm{atom} - \tilde{f} ||_{h_\varepsilon ^{-1}({{\mathrm{int}}}_\varepsilon \Omega )} \le K_2 \varepsilon ^\gamma \), and \(g_\mathrm{atom} :\partial _\varepsilon \Omega \rightarrow \mathbb {R}^d\) with \(||g_\mathrm{atom} - S_{\varepsilon }y ||_{\partial _\varepsilon \Omega } \le K_2 \varepsilon ^\gamma \), where y is the continuous solution corresponding to f and g given by Theorem 4.4, there is a unique \(y_\mathrm{atom} \in \mathscr {A}_\varepsilon (\Omega , g_\mathrm{atom})\) with \(||y_\mathrm{atom} - S_{\varepsilon }y ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} \le K_3 \varepsilon ^\gamma \) such that

$$\begin{aligned} -{{\mathrm{div}}}_{\mathscr {R},\varepsilon } \big ( DW_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y_\mathrm{atom} (x))\big ) = f_\mathrm{atom}(x) \end{aligned}$$

for all \(x \in {{\mathrm{int}}}_\varepsilon \Omega \). Furthermore, \(y_\mathrm{atom}\) is a strict local minimizer of \(E_\varepsilon (\cdot ,f_\mathrm{atom},g_\mathrm{atom})\).

Additionally, there is a \(K_4>0\) such that whenever \(\gamma \in (1,2]\) and \(E(y-y_{A_0}) \in C^{2,(\gamma -1)}(\Omega )\) then

$$\begin{aligned} ||y_\mathrm{atom} - y ||_{h^1_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega )} \le (K_3 + K_4 ||\nabla ^2 E(y-y_{A_0}) ||_{C^{0,\gamma -1}}) \varepsilon ^\gamma . \end{aligned}$$

Remark 5.2

If \(d=3\) and \(\gamma = \frac{3}{2}\) the assumption in the additional statement is automatically satisfied since \(E(y-y_{A_0})\in H^4(B_{R_E}(0)) \hookrightarrow C^{2, \frac{1}{2}}(B_{R_E}(0))\).
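The exponents in this embedding can be checked by a short computation, assuming the standard Morrey-type embedding \(W^{k,p} \hookrightarrow C^{j,\alpha }\) for \(k - \frac{d}{p} = j + \alpha \) with \(j \in \mathbb {N}_0\) and \(\alpha \in (0,1)\) on balls. With \(k=4\), \(p=2\) and \(d=3\) we get

$$\begin{aligned} k - \frac{d}{p} = 4 - \frac{3}{2} = 2 + \frac{1}{2}, \end{aligned}$$

so that \(j=2\) and \(\alpha = \frac{1}{2} = \gamma - 1\). In the same way, the condition \(d<2m+2\) in Theorem 4.4 is equivalent to \(m+2-\frac{d}{2} > 1\), which is what yields the embedding \(H^{m+2} \hookrightarrow C^1_b\) used in its proof.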

5.2 A quantitative implicit function theorem

The proof of Theorem 5.1 relies on an implicit function theorem, which will eventually yield the desired solution to the atomistic equations if an approximate solution can be found with good estimates on the residual, as well as invertibility, boundedness, and continuity of certain partial derivatives. The approximate solution in our case will be a smooth approximation of the solution to the corresponding continuous equations with the Cauchy–Born energy density. In order to obtain the strong estimates on the rate of convergence stated in Theorem 5.1, we will formulate a quantitative implicit function theorem which also allows for a small parameter.

Theorem 5.3

Let X be a Banach space and Y, Z normed spaces, \(U \subset X\), \(V \subset Y\) open and \(F :U \times V \rightarrow Z\) Fréchet-differentiable. Assume that \(\partial _u F (0,0) :X \rightarrow Z\) is invertible. Furthermore, assume that there are \(\rho ,\tau ,\kappa _1,\kappa _2,\kappa _3>0\) and a function \(\omega :[0,\infty )^2 \rightarrow [0,\infty ]\), non-decreasing in both variables, such that \(\overline{B_\rho (0)} \subset U\), \(\overline{B_\tau (0)} \subset V\),

$$\begin{aligned} ||F(0,0) ||_Z&\le \kappa _1,\\ ||\partial _u F(0,0)^{-1} ||_{L(Z,X)}&\le \kappa _2,\\ ||\partial _h F(u,h) ||_{L(Y,Z)}&\le \kappa _3 \quad \forall (u,h)\in \overline{B_\rho (0)} \times \overline{B_\tau (0)} ,\\ ||\partial _u F(0,0) - \partial _u F(u,h) ||_{L(X,Z)}&\le \omega (||u ||_X,||h ||_Y) \quad \forall (u,h)\in \overline{B_\rho (0)} \times \overline{B_\tau (0)} ,\\ \kappa _2 \omega (\rho ,\tau )&< 1, \text { and}\\ \kappa _2 \left( \kappa _1 + \kappa _3 \tau + \int _0^\rho \omega \left( t,\frac{\tau }{\rho } t \right) \,dt \right)&\le \rho . \end{aligned}$$

Then, for every \(h \in \overline{B_\tau (0)}\) there is a unique \(u \in \overline{B_\rho (0)}\) with \(F(u,h)=0\).

Proof

Let

$$\begin{aligned} G_h(u) = u - \partial _u F(0,0)^{-1} F(u,h). \end{aligned}$$

If \(||u ||\le \rho \) and \(||h ||\le \tau \) then

$$\begin{aligned} G_h(u)= \partial _u F(0,0)^{-1} \big (- F(0,0) + \partial _u F(0,0)u + F(0,0) -F(u,h) \big ). \end{aligned}$$

Therefore,

$$\begin{aligned} ||G_h(u) ||&\le \kappa _2 \kappa _1 + \kappa _2 \Big ||\int _0^1 (\partial _u F(0,0)-\partial _u F(tu,th))u - \partial _h F(tu,th)h \,dt \Big ||\\&\le \kappa _2 \left( \kappa _1 + \int _0^\rho \omega \left( t,\frac{\tau }{\rho } t \right) \,dt + \kappa _3 \tau \right) \le \rho . \end{aligned}$$

Furthermore, for \(u,v \in \overline{B_\rho (0)}\) we have

$$\begin{aligned} ||G_h(u)-G_h(v) ||&\le \kappa _2 \Big ||\int _0^1 (\partial _u F(u+t(v-u),h) -\partial _u F(0,0))(v-u) \,dt \Big ||\\&\le \kappa _2 \omega (\rho ,\tau ) ||v-u ||, \end{aligned}$$

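where \(\kappa _2 \omega (\rho ,\tau ) < 1\). Hence, \(G_h\) is a contraction that maps the closed ball \(\overline{B_\rho (0)}\) of the Banach space X into itself, and by the Banach fixed-point theorem it has a unique fixed point \(u \in \overline{B_\rho (0)}\). Since \(\partial _u F(0,0)^{-1}\) is injective, the fixed points of \(G_h\) are precisely the solutions of \(F(u,h)=0\). \(\square \)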
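The fixed-point construction in this proof can be tried out on a toy problem. The following Python sketch uses the scalar map \(F(u,h) = u + u^2 - h\), which is our own hypothetical example and not taken from the analysis above. For it, \(F(0,0)=0\), \(\partial _u F(0,0)=1\), \(|\partial _h F |= 1\) and \(|\partial _u F(0,0)-\partial _u F(u,h) |= 2|u |\), so one may take \(\kappa _1 = 0\), \(\kappa _2 = \kappa _3 = 1\) and \(\omega (s,t) = 2s\). The radii \(\rho = 0.25\) and \(\tau = 0.1\) are likewise illustrative choices.

```python
import math

def F(u, h):
    """Toy scalar map with F(0, 0) = 0 and dF/du(0, 0) = 1 (hypothetical example)."""
    return u + u * u - h

def G(u, h, dFdu_inv=1.0):
    """Simplified Newton map G_h(u) = u - dF/du(0,0)^{-1} F(u, h)."""
    return u - dFdu_inv * F(u, h)

def solve_by_iteration(h, rho, steps=80):
    """Iterate G_h starting from u = 0 and check that iterates stay in the closed ball."""
    u = 0.0
    for _ in range(steps):
        u = G(u, h)
        assert abs(u) <= rho
    return u

rho, tau = 0.25, 0.1
kappa1, kappa2, kappa3 = 0.0, 1.0, 1.0

def omega(s, t):
    """Modulus for the toy map: |dF/du(0,0) - dF/du(u,h)| = 2|u|."""
    return 2.0 * s

# The two quantitative hypotheses of the theorem for this toy problem;
# the integral of omega(t, tau/rho * t) = 2t over [0, rho] equals rho**2.
contraction_ok = kappa2 * omega(rho, tau) < 1.0
self_map_ok = kappa2 * (kappa1 + kappa3 * tau + rho ** 2) <= rho

u = solve_by_iteration(tau, rho)
exact = (-1.0 + math.sqrt(1.0 + 4.0 * tau)) / 2.0   # root of u + u^2 = h for h = tau
print(contraction_ok, self_map_ok, u, exact)
```

Note that the iteration only ever inverts the derivative at the reference point \((0,0)\), which is what makes the estimates uniform in h.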

More precisely, we want to use the following more specific corollary:

Corollary 5.4

Let \(d \in \{1,2,3,4\}\). Assume we have a family \(F_\varepsilon :U_\varepsilon \times V_\varepsilon \rightarrow Z_\varepsilon \) with \(\varepsilon \in (0,1]\) and \(U_\varepsilon \subset X_\varepsilon , V_\varepsilon \subset Y_\varepsilon \) open, where \(Y_\varepsilon , Z_\varepsilon \) are normed spaces and \(X_\varepsilon \) Banach spaces. Furthermore, assume that the \(F_\varepsilon \) are Fréchet-differentiable and we have fixed \(r_1,r_2>0\) such that \(\overline{B_{r_1\varepsilon ^{\frac{d}{2}}}(0)} \subset U_\varepsilon \) and \(\overline{B_{r_2\varepsilon ^{\frac{d}{2}}}(0)} \subset V_\varepsilon \). Now, assume there are \(A,M_1,M_2,M_3,M_4>0\), such that

$$\begin{aligned} ||F_\varepsilon (0,0) ||_{Z_\varepsilon }&\le A \varepsilon ^2,\\ ||\partial _u F_\varepsilon (0,0)^{-1} ||_{L(Z_\varepsilon ,X_\varepsilon )}&\le M_1,\\ ||\partial _h F_\varepsilon (u,h) ||_{L(Y_\varepsilon ,Z_\varepsilon )}&\le M_2 \qquad \forall (u,h)\in \overline{B_{r_1\varepsilon ^{\frac{d}{2}}}(0)} \times \overline{B_{r_2\varepsilon ^{\frac{d}{2}}}(0)} ,\\ ||\partial _u F_\varepsilon (0,0) - \partial _u F_\varepsilon (u,h) ||_{L(X_\varepsilon ,Z_\varepsilon )}&\le M_3 \varepsilon ^{-\frac{d}{2}} (||u ||_{X_\varepsilon }+||h ||_{Y_\varepsilon }) \\&\qquad \qquad \forall (u,h)\in \overline{B_{r_1\varepsilon ^{\frac{d}{2}}}(0)} \times \overline{B_{r_2\varepsilon ^{\frac{d}{2}}}(0)} \end{aligned}$$

and

$$\begin{aligned} A \le \min \Big \{ \frac{r_1}{3M_1}, \frac{1}{9 M_1^2 M_3}\Big \}. \end{aligned}$$

If we now set \(\rho _\varepsilon = \lambda _1 \varepsilon ^\gamma \) and \(\tau _\varepsilon = \lambda _2 \varepsilon ^\gamma \) for arbitrary \(\gamma \in \big [\frac{d}{2},2\big ]\) and

$$\begin{aligned} \lambda _1&= \min \Big \{r_1,\frac{1}{3 M_1 M_3}\Big \},\\ \lambda _2&= \min \Big \{r_2,\frac{1}{3 M_1 M_3}, \frac{\lambda _1}{3 M_1 M_2}\Big \}, \end{aligned}$$

then for every \(\varepsilon \in (0,1]\) and every \(h_\varepsilon \in \overline{B_{\tau _\varepsilon }(0)}\) there is a unique \(u_\varepsilon \in \overline{B_{\rho _\varepsilon }(0)}\) with \(F_\varepsilon (u_\varepsilon ,h_\varepsilon )=0\).

Proof

We set

$$\begin{aligned} \kappa _1&= A \varepsilon ^2,\\ \kappa _2&= M_1,\\ \kappa _3&= M_2, \text { and}\\ \omega (s,t)&= M_3 \varepsilon ^{-\frac{d}{2}} (s+t). \end{aligned}$$

A simple calculation gives

$$\begin{aligned} \kappa _2 \omega (\rho _\varepsilon ,\tau _\varepsilon ) = M_1 M_3 \varepsilon ^{\gamma -\frac{d}{2}} (\lambda _1 + \lambda _2) \le \frac{2}{3} < 1. \end{aligned}$$

Furthermore, since \(A \le \frac{\lambda _1}{3 M_1}\),

$$\begin{aligned} \kappa _2 \left( \kappa _1 + \kappa _3 \tau + \int _0^\rho \omega \left( t,\frac{\tau }{\rho } t \right) \,dt \right)\le & {} A M_1 \varepsilon ^2 + M_1 M_2 \lambda _2 \varepsilon ^\gamma \\&+\, M_1 M_3 \varepsilon ^{-\frac{d}{2}} \left( 1+ \frac{\lambda _2}{\lambda _1}\right) \frac{1}{2} \lambda _1^2 \varepsilon ^{2 \gamma }\\\le & {} \frac{\lambda _1 \varepsilon ^2}{3} + \frac{\lambda _1 \varepsilon ^\gamma }{3} + \frac{\lambda _1 \varepsilon ^{2 \gamma - \frac{d}{2}}}{3}\\\le & {} \rho _\varepsilon . \end{aligned}$$

We can therefore apply Theorem 5.3. \(\square \)
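The two inequalities verified in this proof are easy to check numerically. The following Python sketch does so for sample constants; the values \(M_1=2\), \(M_2=3\), \(M_3=5\), \(r_1=r_2=1\) and the function names are hypothetical choices for illustration, and A is taken as large as the smallness condition allows.

```python
def corollary_constants(M1, M2, M3, r1, r2):
    """lambda_1, lambda_2 and the largest admissible A from the corollary."""
    lam1 = min(r1, 1.0 / (3.0 * M1 * M3))
    lam2 = min(r2, 1.0 / (3.0 * M1 * M3), lam1 / (3.0 * M1 * M2))
    A_max = min(r1 / (3.0 * M1), 1.0 / (9.0 * M1 ** 2 * M3))
    return lam1, lam2, A_max

def check_hypotheses(eps, gamma, d, M1, M2, M3, r1, r2):
    """Return (kappa_2 * omega(rho, tau), left-hand side of the self-map bound, rho)."""
    lam1, lam2, A = corollary_constants(M1, M2, M3, r1, r2)
    rho, tau = lam1 * eps ** gamma, lam2 * eps ** gamma
    scale = M3 * eps ** (-d / 2.0)                 # omega(s, t) = scale * (s + t)
    contraction = M1 * scale * (rho + tau)
    # closed form of the integral of omega(t, tau/rho * t) over [0, rho]
    integral = scale * (1.0 + tau / rho) * rho ** 2 / 2.0
    self_map = M1 * (A * eps ** 2 + M2 * tau + integral)
    return contraction, self_map, rho

results = []
for d in (1, 2, 3, 4):
    for gamma in (max(d / 2.0, 0.5), 2.0):
        for eps in (1.0, 0.5, 0.1, 0.01):
            results.append(check_hypotheses(eps, gamma, d, 2.0, 3.0, 5.0, 1.0, 1.0))

print(all(c <= 2.0 / 3.0 + 1e-12 and s <= r * (1.0 + 1e-12) for c, s, r in results))
```

Both bounds use \(\gamma \ge \frac{d}{2}\) and \(\gamma \le 2\), which is why the sketch only samples \(\gamma \) from \(\big [\frac{d}{2},2\big ]\).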

Remark 5.5

Without the smallness condition on A the theorem is still true for all \(\varepsilon \) small enough if \(d \le 3\). In the case \(\gamma = 2\) this requires a different choice of \(\lambda _1\) and \(\lambda _2\), e.g., \(\lambda _1 = 3 M_1 A\) and \(\lambda _2 = \frac{A}{M_2}\).

Remark 5.6

In contrast to previous work (e.g., [16]), there is no one-to-one map between body forces and solutions which could be handled with a quantitative inverse function theorem. To treat the higher dimensional set of data consisting of body forces and boundary values we instead use a version of the implicit function theorem.

5.3 Residual estimates

Proposition 5.7

Let \(V \subset \mathbb {R}^{d \times \mathscr {R}}\) be open and \(W_\mathrm{atom} \in C^4_b(V)\). Let \(f \in L^2(\Omega ;\mathbb {R}^d)\) and set as before

$$\begin{aligned} \tilde{f}(x) = \frac{1}{|Q_\varepsilon (x)|} \int _{Q_\varepsilon (x)} f(a)\,da \end{aligned}$$

for \(x \in {{\mathrm{int}}}_\varepsilon \Omega \). Furthermore let \(\varepsilon \in (0,1]\) and \(y \in C^{3,1}(\mathbb {R}^d;\mathbb {R}^d)\) with

$$\begin{aligned} {{\mathrm{co}}}\{D_{\mathscr {R},\varepsilon } y (\hat{x}+ \varepsilon \sigma ), (\nabla y (x) \rho )_{\rho \in \mathscr {R}}\} \subset V \end{aligned}$$

for all \(x \in \Omega _\varepsilon \) and \(\sigma \in \mathscr {R} \cup \{0\}\). Then we have

$$\begin{aligned} \big ||-\tilde{f}&- {{\mathrm{div}}}_{\mathscr {R},\varepsilon } \big ( DW_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y)\big ) \big ||_{\ell _\varepsilon ^2({{\mathrm{int}}}_\varepsilon \Omega )}\\&\le ||-f - {{\mathrm{div}}}DW_\mathrm{CB}(\nabla y)||_{L^2(\Omega _\varepsilon ; \mathbb {R}^d)}+ C \varepsilon ^2 \Big ||||\nabla ^4 y ||_{L^\infty (B_{\varepsilon R}(x))}\\&\quad + ||\nabla ^3 y ||_{L^\infty (B_{\varepsilon R}(x))}^\frac{3}{2} + ||\nabla ^2 y ||_{L^\infty (B_{\varepsilon R}(x))}^3 + \varepsilon ||\nabla ^3 y ||_{L^\infty (B_{\varepsilon R}(x))}^2 \Big ||_{L^2(\Omega _\varepsilon )}, \end{aligned}$$

where \(\Omega _\varepsilon = \bigcup _{z \in {{\mathrm{int}}}_\varepsilon \Omega } Q_\varepsilon (z)\), \(R=2R_\mathrm{max}+\frac{3\sqrt{d}}{2}\) and \(C = C(d,\mathscr {R}, ||D^2 W_\mathrm{atom} ||_{C^2(V)}) >0\).

Proof

For \(x\in \Omega _\varepsilon \), \(\sigma \in \mathscr {R}\) and \(\rho \in \mathscr {R} \cup \{0\}\) set

$$\begin{aligned} r_{1,\varepsilon }(x;\sigma ,\rho )&= \frac{y(\hat{x} - \varepsilon \rho +\varepsilon \sigma ) - y(\hat{x} - \varepsilon \rho )}{\varepsilon } - \nabla y(x)\sigma , \end{aligned}$$
$$\begin{aligned} r_{2,\varepsilon }(x;\sigma ,\rho )&= \frac{y(\hat{x} - \varepsilon \rho +\varepsilon \sigma ) - y(\hat{x} - \varepsilon \rho )}{\varepsilon } - \nabla y(x)\sigma \\&\quad - \frac{1}{2}\varepsilon \nabla ^2 y(x)\left[ \sigma - \rho +\frac{\hat{x} - x}{\varepsilon }, \sigma - \rho +\frac{\hat{x} - x}{\varepsilon }\right] \\&\quad + \frac{1}{2}\varepsilon \nabla ^2 y(x)\left[ -\rho +\frac{\hat{x} - x}{\varepsilon },-\rho +\frac{\hat{x} - x}{\varepsilon }\right] , \end{aligned}$$

and

$$\begin{aligned} r_{3,\varepsilon }(x;\sigma ,\rho )&= \frac{y(\hat{x} - \varepsilon \rho +\varepsilon \sigma ) - y(\hat{x} - \varepsilon \rho )}{\varepsilon } - \nabla y(x)\sigma \\&\quad - \frac{1}{2}\varepsilon \nabla ^2 y(x)\left[ \sigma - \rho +\frac{\hat{x} - x}{\varepsilon }, \sigma - \rho +\frac{\hat{x} - x}{\varepsilon }\right] \\&\quad + \frac{1}{2}\varepsilon \nabla ^2 y(x)\left[ -\rho +\frac{\hat{x} - x}{\varepsilon },-\rho +\frac{\hat{x} - x}{\varepsilon }\right] \\&\quad - \frac{1}{6}\varepsilon ^2 \nabla ^3 y(x)\left[ \sigma - \rho +\frac{\hat{x} - x}{\varepsilon }, \sigma - \rho +\frac{\hat{x} - x}{\varepsilon }, \sigma - \rho +\frac{\hat{x} - x}{\varepsilon }\right] \\&\quad + \frac{1}{6}\varepsilon ^2 \nabla ^3 y(x)\left[ -\rho +\frac{\hat{x} - x}{\varepsilon },-\rho +\frac{\hat{x} - x}{\varepsilon },-\rho +\frac{\hat{x} - x}{\varepsilon }\right] . \end{aligned}$$

First order Taylor expansions with integral remainder of \(y(\hat{x} - \varepsilon \rho +\varepsilon \sigma )\) and \(y(\hat{x} - \varepsilon \rho )\) at x give the estimate

$$\begin{aligned} |r_{1,\varepsilon }(x;\sigma ,\rho ) |\le \varepsilon \bar{R}^2 ||\nabla ^2 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}, \end{aligned}$$

where \(\bar{R} = 2 R_\mathrm{max} + \frac{1}{2}\sqrt{d}\) and we have used that \(|\hat{x}-x |\le \frac{1}{2} \varepsilon \sqrt{d}\). Similarly, second and third order Taylor expansions give

$$\begin{aligned} |r_{2,\varepsilon }(x;\sigma ,\rho ) |&\le \frac{1}{3} \varepsilon ^2 \bar{R}^3 ||\nabla ^3 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}, \\ |r_{3,\varepsilon }(x;\sigma ,\rho ) |&\le \frac{1}{12} \varepsilon ^3 \bar{R}^4 ||\nabla ^4 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}. \end{aligned}$$
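The first of these remainder bounds can be checked numerically in a simple setting. The Python sketch below assumes \(d=1\), the stencil \(\mathscr {R}=\{-1,1\}\) (so \(R_\mathrm{max}=1\) and \(\bar{R} = \frac{5}{2}\)), the test function \(y=\sin \) with \(||y'' ||_\infty \le 1\), and it takes \(\hat{x}\) to be the lattice point nearest to x. These choices and the function names are ours and serve only as an illustration.

```python
import math

def r1(y, dy, x, x_hat, eps, sigma, rho):
    """r_{1,eps}(x; sigma, rho) for a scalar function y on the real line."""
    base = x_hat - eps * rho
    return (y(base + eps * sigma) - y(base)) / eps - dy(x) * sigma

def max_r1(eps, samples=200, stencil=(-1, 1)):
    """Largest |r_1| over sample points x in [0, 1) and all admissible (sigma, rho)."""
    worst = 0.0
    for k in range(samples):
        x = k / samples
        x_hat = eps * round(x / eps)       # nearest lattice point, |x_hat - x| <= eps / 2
        for sigma in stencil:
            for rho in stencil + (0,):
                value = r1(math.sin, math.cos, x, x_hat, eps, sigma, rho)
                worst = max(worst, abs(value))
    return worst

R_BAR = 2.0 * 1.0 + 0.5 * math.sqrt(1.0)   # 2 * R_max + sqrt(d) / 2 with R_max = d = 1

for eps in (0.1, 0.05, 0.025):
    # observed remainder versus the bound eps * R_BAR^2 * ||y''||_inf with ||y''||_inf <= 1
    print(eps, max_r1(eps), eps * R_BAR ** 2)
```

The observed values are well below the bound, which is not sharp, but they decrease linearly in \(\varepsilon \) as predicted.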

Now, doing a second order Taylor expansion of \(DW_\mathrm{atom}\) at \((\nabla y (x) \rho )_{\rho \in \mathscr {R}}\) with integral remainder, using the definition of \(r_{3,\varepsilon }\) in the first order term, the definition of \(r_{2,\varepsilon }\) in the second order term and the definition of \(r_{1,\varepsilon }\) in the remainder and then collecting the terms with the same exponent in \(\varepsilon \) gives

$$\begin{aligned} -f(x)-{{\mathrm{div}}}_{\mathscr {R},\varepsilon } \big ( DW_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y (\hat{x})) \big )=\varepsilon ^{-1} I_{-1} + \varepsilon ^0 I_0 + \varepsilon ^{1} I_1 +R_\varepsilon (x), \end{aligned}$$

where

$$\begin{aligned} I_{-1}&= -\sum \limits _{\rho \in \mathscr {R}} D_{e_\rho } W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}}) - D_{e_\rho } W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}}) = 0,\\ (I_0)_j&= -f_j(x) - \sum \limits _{\rho ,\sigma \in \mathscr {R}} D^2 W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\Big [e_j \otimes e_\rho ,\frac{1}{2} \nabla ^2 y(x) \\&\qquad \Big (\left[ \sigma + \frac{\hat{x}-x}{\varepsilon }\right] ^2 - \left[ \frac{\hat{x}-x}{\varepsilon }\right] ^2 -\left[ \sigma - \rho + \frac{\hat{x}-x}{\varepsilon }\right] ^2 + \left[ -\rho + \frac{\hat{x}-x}{\varepsilon }\right] ^2 \Big ) \otimes e_\sigma \Big ] \\&= -f_j(x) - \sum \limits _{\rho ,\sigma \in \mathscr {R}} D^2 W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\Big [e_j \otimes e_\rho , \nabla ^2 y(x)[\sigma ,\rho ] \otimes e_\sigma \Big ] \\&= -f_j(x) - \sum \limits _{i} \sum \limits _{\rho ,\sigma \in \mathscr {R}} D^2 W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\Big [((e_j \otimes e_i)\rho ) \otimes e_\rho ,\\&\qquad \nabla ^2 y(x)[\sigma ,e_i] \otimes e_\sigma \Big ]\\&= -f_j(x) - \sum \limits _{i} \frac{\partial }{\partial x_i}\sum \limits _{\rho \in \mathscr {R}} DW_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\Big [((e_j \otimes e_i)\rho ) \otimes e_\rho \Big ] \\&= -f_j(x) - \sum \limits _{i} \frac{\partial }{\partial x_i} DW_\mathrm{CB}(\nabla y (x)) [e_j \otimes e_i] \\&= \big (-f(x) - {{\mathrm{div}}}DW_\mathrm{CB}(\nabla y(x))\big )_j, \end{aligned}$$

and

$$\begin{aligned} (I_1)_j&= -\sum \limits _{\rho ,\sigma \in \mathscr {R}} D^2 W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\Bigg [e_j \otimes e_\rho ,\frac{1}{6}\nabla ^3 y(x)\\&\quad \Big ( \left[ \sigma + \frac{\hat{x}-x}{\varepsilon }\right] ^3-\left[ \frac{\hat{x}-x}{\varepsilon }\right] ^3-\left[ \sigma -\rho + \frac{\hat{x}-x}{\varepsilon }\right] ^3+\left[ -\rho + \frac{\hat{x}-x}{\varepsilon }\right] ^3 \Big )\otimes e_\sigma \Bigg ] \\&\quad -\frac{1}{2}\sum \limits _{\rho ,\sigma ,\tau \in \mathscr {R}} D^3 W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\big [e_j \otimes e_\rho \big ]\\&\quad \bigg (\Big [\frac{1}{2}\nabla ^2 y(x)\Bigg (\Big [\sigma + \frac{\hat{x}-x}{\varepsilon }\Big ]^2 - \Big [\frac{\hat{x}-x}{\varepsilon }\Big ]^2 \Bigg ) \otimes e_\sigma ,\\&\quad \frac{1}{2}\nabla ^2 y(x)\Bigg (\Big [\tau + \frac{\hat{x}-x}{\varepsilon }\Big ]^2 - \Big [\frac{\hat{x}-x}{\varepsilon }\Big ]^2 \Bigg ) \otimes e_\tau \Big ]\\&\quad -\Big [\frac{1}{2}\nabla ^2 y(x)\Big (\big [\sigma -\rho + \frac{\hat{x}-x}{\varepsilon }\big ]^2 - \big [\frac{\hat{x}-x}{\varepsilon }-\rho \big ]^2 \Big ) \otimes e_\sigma ,\\&\quad \frac{1}{2}\nabla ^2 y(x)\Big (\big [\tau -\rho + \frac{\hat{x}-x}{\varepsilon }\big ]^2 - \big [\frac{\hat{x}-x}{\varepsilon }-\rho \big ]^2 \Big ) \otimes e_\tau \Big ]\bigg )\\&= -\sum \limits _{\rho ,\sigma \in \mathscr {R}} D^2 W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\Bigg [e_j \otimes e_\rho ,\\&\quad \frac{1}{2}\nabla ^3 y(x)[\sigma ,\rho ,\sigma - \rho + 2\frac{\hat{x}-x}{\varepsilon }]\otimes e_\sigma \Bigg ]\\&\quad - \frac{1}{2}\sum \limits _{\rho ,\sigma ,\tau \in \mathscr {R}} D^3 W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\big [e_j \otimes e_\rho \big ]\\&\quad \bigg (\Big [\nabla ^2 y(x)[\sigma ,\rho ] \otimes e_\sigma , \nabla ^2 y (x)\Big [\tau , \frac{1}{2}\tau + \frac{\hat{x}-x}{\varepsilon }\Big ] \otimes e_\tau \Big ]\\&\quad +\Big [\nabla ^2 y (x)\Big 
[\sigma , \frac{1}{2}\sigma + \frac{\hat{x}-x}{\varepsilon }\Big ] \otimes e_\sigma , \nabla ^2y(x)[\tau ,\rho ] \otimes e_\tau \Big ]\\&\quad -\Big [\nabla ^2 y (x)[\sigma ,\rho ] \otimes e_\sigma , \nabla ^2 y (x)[\tau ,\rho ] \otimes e_\tau \Big ]\bigg )\\&= -\sum \limits _{\rho ,\sigma \in \mathscr {R}} D^2 W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\Big [e_j \otimes e_\rho , \nabla ^3 y(x)\Big [\sigma ,\rho ,\frac{\hat{x}-x}{\varepsilon }\Big ]\otimes e_\sigma \Big ]\\&\quad -\frac{1}{2}\sum \limits _{\rho ,\sigma ,\tau \in \mathscr {R}} D^3 W_\mathrm{atom}((\nabla y (x) \rho )_{\rho \in \mathscr {R}})\big [e_j \otimes e_\rho \big ]\\&\quad \bigg (\Big [\nabla ^2y(x)[\sigma ,\rho ]\otimes e_\sigma , \nabla ^2y(x)\left[ \tau ,\frac{\hat{x}-x}{\varepsilon }\right] \otimes e_\tau \Big ]\\&\quad +\Big [\nabla ^2y(x)\left[ \sigma ,\frac{\hat{x}-x}{\varepsilon }\right] \otimes e_\sigma , \nabla ^2y(x)[\tau ,\rho ] \otimes e_\tau \Big ]\bigg ). \end{aligned}$$

Here, in the last equality we applied the symmetry condition in the form of Lemma 2.1. While the last expression is not necessarily zero, it is linear in \(\frac{\hat{x}-x}{\varepsilon }\), with coefficients depending on x. Therefore, the average \(\frac{1}{2}(I_1(x)+I_1(\bar{x}))\) is actually of higher order. Here \(\bar{x}\) denotes the almost everywhere uniquely defined point in the same cube as x such that \(\frac{1}{2}(x+\bar{x}) = \hat{x}\). To be more precise, we have

$$\begin{aligned} \Big |\frac{\varepsilon }{2}(I_1(x)&+I_1(\bar{x})) \Big |\\ \le \,&\frac{1}{2}\varepsilon ^2 |\mathscr {R}|^2 ||D^2W_\mathrm{atom} ||_\infty \sqrt{d} ||\nabla ^4 y ||_{L^\infty (B_{\varepsilon \sqrt{d}}(x))} R_\mathrm{max}^2 \frac{\sqrt{d}}{2}\\&+ \frac{1}{2} \varepsilon ^2 |\mathscr {R}|^{\frac{5}{2}} ||D^3 W_\mathrm{atom} ||_\infty \sqrt{d} ||\nabla ^2 y ||_{L^\infty (B_{\varepsilon \sqrt{d}}(x))} R_\mathrm{max}^3 \frac{\sqrt{d}}{2} |\nabla ^3 y (x) |\\&+ \varepsilon ^2 |\mathscr {R}|^3 ||D^3 W_\mathrm{atom} ||_\infty \sqrt{d} ||\nabla ^3 y ||_{L^\infty (B_{\varepsilon \sqrt{d}}(x))} ||\nabla ^2 y ||_{L^\infty (B_{\varepsilon \sqrt{d}}(x))} R_\mathrm{max}^3 \frac{\sqrt{d}}{2} \\&+ \varepsilon ^2 |\mathscr {R}|^{\frac{7}{2}} ||D^4 W_\mathrm{atom} ||_\infty \sqrt{d} ||\nabla ^2 y ||_{L^\infty (B_{\varepsilon \sqrt{d}}(x))} R_\mathrm{max}^4 \frac{\sqrt{d}}{2} |\nabla ^2 y (x) |^2 \\ \le \,&C\varepsilon ^2 \Big ( ||\nabla ^4 y ||_{L^\infty (B_{\varepsilon \sqrt{d}}(x))} + ||\nabla ^3 y ||_{L^\infty (B_{\varepsilon \sqrt{d}}(x))}^\frac{3}{2}+||\nabla ^2 y ||_{L^\infty (B_{\varepsilon \sqrt{d}}(x))}^3\Big ). \end{aligned}$$
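The cancellation behind the first inequality can be made explicit. As a sketch, write the last expression for \(I_1\) as \(I_1(x) = c(x)\big [\frac{\hat{x}-x}{\varepsilon }\big ]\), where \(c(x)\) is a shorthand introduced only here for the coefficient map, which is linear in its argument. Since \(\bar{x} = 2\hat{x} - x\) lies in the same cube as x, both points share the midpoint \(\hat{x}\) and \(\hat{x}-\bar{x} = -(\hat{x}-x)\). Hence

$$\begin{aligned} \frac{\varepsilon }{2}\big (I_1(x)+I_1(\bar{x})\big ) = \frac{\varepsilon }{2}\big (c(x)-c(\bar{x})\big )\Big [\frac{\hat{x}-x}{\varepsilon }\Big ], \qquad \Big |\frac{\hat{x}-x}{\varepsilon }\Big |\le \frac{\sqrt{d}}{2}, \qquad |x-\bar{x} |\le \varepsilon \sqrt{d}. \end{aligned}$$

The difference \(c(x)-c(\bar{x})\) is bounded by \(\varepsilon \sqrt{d}\) times the Lipschitz constant of c on the segment between x and \(\bar{x}\). Differentiating the coefficients produces the derivatives of y of one order higher as well as the higher derivatives of \(W_\mathrm{atom}\) that appear on the right-hand side, together with the additional factor \(\varepsilon \).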

Using the bounds we have on \(r_{1,\varepsilon }\), \(r_{2,\varepsilon }\) and \(r_{3,\varepsilon }\), we estimate

$$\begin{aligned} |(R_\varepsilon )_j(x) |&\le \varepsilon ^2 \Big (|\mathscr {R} |^2 ||D^2 W_\mathrm{atom} ||_\infty \bar{R}^4 \frac{1}{6} ||\nabla ^4 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}\\&\quad +|\mathscr {R} |^3 ||D^3 W_\mathrm{atom} ||_\infty \frac{2}{3} \bar{R}^3 R_\mathrm{max}\left( R_\mathrm{max} + \frac{\sqrt{d}}{2}\right) |\nabla ^2 y(x)|||\nabla ^3 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}\\&\quad + |\mathscr {R} |^4 ||D^4 W_\mathrm{atom} ||_\infty \frac{1}{3} \bar{R}^6 ||\nabla ^2 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}^3 \Big ) \\&\quad +\varepsilon ^3 \Big ( |\mathscr {R} |^3 ||D^3 W_\mathrm{atom} ||_\infty \bar{R}^6 \frac{1}{9} ||\nabla ^3 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}^2 \Big ) \\&\le C \varepsilon ^2 \big ( ||\nabla ^4 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))} + ||\nabla ^3 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}^\frac{3}{2}\\&\quad + ||\nabla ^2 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}^3 + \varepsilon ||\nabla ^3 y ||_{L^\infty (B_{\varepsilon \bar{R}}(x))}^2 \big ). \end{aligned}$$

Combining these estimates and using \(\bar{R}+\sqrt{d}=R\), we get

$$\begin{aligned}&\bigg |-\frac{f(x)+ f(\bar{x})}{2}-{{\mathrm{div}}}_{\mathscr {R},\varepsilon } \big ( DW_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y (\hat{x}))\big ) \bigg |\\&\quad \le \Big |\frac{-f(x)-{{\mathrm{div}}}DW_\mathrm{CB}(\nabla y (x))- f(\bar{x})-{{\mathrm{div}}}DW_\mathrm{CB}(\nabla y (\bar{x}))}{2} \Big |\\&\quad \quad + C \varepsilon ^2 \Big ( ||\nabla ^4y ||_{L^\infty (B_{\varepsilon R}(x))} + ||\nabla ^3y ||_{L^\infty (B_{\varepsilon R}(x))}^\frac{3}{2}\\&\quad \quad + ||\nabla ^2y ||_{L^\infty (B_{\varepsilon R}(x))}^3 + \varepsilon ||\nabla ^3 y ||_{L^\infty (B_{\varepsilon R}(x))}^2 \Big ). \end{aligned}$$

But,

$$\begin{aligned} \varepsilon ^d&\sum _{z \in {{\mathrm{int}}}_\varepsilon \Omega } \big (-\tilde{f}(z) - {{\mathrm{div}}}_{\mathscr {R},\varepsilon } \big ( DW_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y (z))\big )\big )^2\\&\le \sum _{z \in {{\mathrm{int}}}_\varepsilon \Omega } \int _{Q_\varepsilon (z)} \Big (-\frac{f(a)+f(\bar{a})}{2}- {{\mathrm{div}}}_{\mathscr {R},\varepsilon } \big (DW_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y (\hat{a}))\big )\Big )^2\,da, \end{aligned}$$

which combined gives the desired result. \(\square \)

These residual estimates are particularly strong if we combine them with the following two approximation results. We begin with a result that lets us convert \(L^\infty \) estimates on small balls into \(L^p\) estimates.

Proposition 5.8

For any \(R>0\), \(k,d \in \mathbb {N}\), \(p\ge 1\), there is a \(C=C(R,d,p)>0\) such that for any \(U \subset \mathbb {R}^d\) measurable and \(y \in W^{k,p}(U + B_{(R+1)\varepsilon }(0);\mathbb {R}^d)\) we have

$$\begin{aligned} \Big ||||\nabla ^k (y *\eta _\varepsilon ) ||_{L^\infty (B_{\varepsilon R}(\cdot ))} \Big ||_{L^p(U)} \le C ||\nabla ^k y ||_{L^{p}(U+B_{(R+1)\varepsilon }(0))}, \end{aligned}$$

where \(\eta _\varepsilon \) is the standard scaled smoothing kernel.

Proof

Using Jensen’s inequality, we calculate

$$\begin{aligned} \Big ||||\nabla ^k (y *\eta _\varepsilon ) ||_{L^\infty (B_{\varepsilon R}(\cdot ))} \Big ||_{L^p(U)}^p&\le \int _U {{\mathrm{ess\,sup}}}_{z \in B_{\varepsilon R}(x)} \int _{\mathbb {R}^d} \eta _\varepsilon (a)|\nabla ^k y(z+a) |^{p}\,da\,dx\\&\le ||\eta ||_\infty \varepsilon ^{-d} \int _U \int _{B_{\varepsilon (R+1)}(x)} |\nabla ^k y (a) |^{p} \,da\,dx \\&\le C(d) (R+1)^d \int _{U+B_{\varepsilon (R+1)}(0)} |\nabla ^k y (x) |^{p} \,dx. \end{aligned}$$

\(\square \)

The second result is about estimating the nonlinearity for approximations.

Proposition 5.9

Let \(d \in \{1,2,3,4\}\), \(V \subset \mathbb {R}^{d \times \mathscr {R}}\) open, \(\Omega \subset \mathbb {R}^d\) open and bounded with Lipschitz boundary, and \(W_\mathrm{atom} \in C^5_b(V)\). Then, there is a \(C>0\) such that for all \(\varepsilon \in (0,1]\) and all \(y \in H^4(\Omega + B_{\varepsilon }(0);\mathbb {R}^d)\) with

$$\begin{aligned} \inf _{x\in \Omega }\inf _{t \in [0,1]} {{\mathrm{dist}}}((1-t)(\nabla y (x) \rho )_{\rho \in \mathscr {R}}+t(\nabla (y *\eta _\varepsilon ) (x) \rho )_{\rho \in \mathscr {R}}, V^c)>0, \end{aligned}$$

we have

$$\begin{aligned} ||{{\mathrm{div}}}&DW_\mathrm{CB}(\nabla y(x)) - {{\mathrm{div}}}DW_\mathrm{CB}(\nabla (y *\eta _\varepsilon )(x)) ||_{L^2(\Omega )}\\&\le C \varepsilon ^2 \big ( ||\nabla ^2 y ||_{L^4(\Omega +B_{\varepsilon }(0))} ||\nabla ^3 y ||_{L^4(\Omega +B_{\varepsilon }(0))} + ||\nabla ^4 y ||_{L^2(\Omega +B_{\varepsilon }(0))} \big ) \end{aligned}$$

where \(\eta _\varepsilon \) is the standard scaled smoothing kernel.

Proof

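Since \(\eta (z) = \eta (-z)\), the first-order term \(\int _{\mathbb {R}^d} \eta _\varepsilon (z) \nabla ^{k+1} y(x)[z]\,dz\) vanishes, so for sufficiently smooth y we have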

$$\begin{aligned} \nabla ^k y(x) - \nabla ^k (y *\eta _\varepsilon )(x)&= \int _{\mathbb {R}^d} \eta _\varepsilon (z) (\nabla ^k y(x) - \nabla ^k y(x+z))\,dz \\&= \int _{\mathbb {R}^d} \eta _\varepsilon (z) (\nabla ^k y(x) + \nabla ^{k+1} y(x)[z] - \nabla ^k y(x+z))\,dz \\&= -\int _{\mathbb {R}^d} \int _0^1 \eta _\varepsilon (z) (1-t) \nabla ^{k+2} y(x+tz)[z,z]\,dt\,dz. \end{aligned}$$

But then

$$\begin{aligned} \int _\Omega |&\nabla ^k y(x) - \nabla ^k (y *\eta _\varepsilon )(x) |^p \,dx\\&= \int _\Omega \Big |\int _{\mathbb {R}^d} \int _0^1 \eta _\varepsilon (z) (1-t) \nabla ^{k+2} y(x+tz)[z,z]\,dt\,dz \Big |^p \,dx \\&\le \varepsilon ^{2p} \int _\Omega \int _{\mathbb {R}^d} \eta _{\varepsilon }(z) \int _0^1 |\nabla ^{k+2} y (x+tz) |^p\,dt\,dz\,dx\\&\le \varepsilon ^{2p} ||\nabla ^{k+2} y ||_{L^p(\Omega +B_\varepsilon (0))}^p. \end{aligned}$$

While we used strong differentiability in the proof, the inequality extends directly to \(W^{k+2,p}(\Omega +B_\varepsilon (0))\) by density. As in the proof of Lemma 4.5 we get

$$\begin{aligned} DW_\mathrm{CB}(\nabla y)&\in H^3(\Omega ; \mathbb {R}^{d}),\\ DW_\mathrm{CB}(\nabla (y*\eta _\varepsilon ))&\in H^3(\Omega ; \mathbb {R}^{d}),\\ D^2W_\mathrm{CB}((1-t)\nabla y + t \nabla (y *\eta _\varepsilon ))&\in H^3(\Omega ; \mathbb {R}^{d \times d}), \end{aligned}$$

and thus

$$\begin{aligned} ||{{\mathrm{div}}}&DW_\mathrm{CB}(\nabla y(x)) - {{\mathrm{div}}}DW_\mathrm{CB}(\nabla (y *\eta _\varepsilon )(x)) ||_{L^2(\Omega )}\\&\le \int _0^1 ||{{\mathrm{div}}}D^2W_\mathrm{CB}((1-t)\nabla y + t \nabla (y *\eta _\varepsilon ))[\nabla y - \nabla (y *\eta _\varepsilon )] ||_{L^2(\Omega )} \,dt\\&\le C \big (||\nabla ^2 y - \nabla ^2 (y *\eta _\varepsilon ) ||_{L^2(\Omega )}\\&\quad + ||\nabla y - \nabla (y *\eta _\varepsilon ) ||_{L^4(\Omega )} (||\nabla ^2 y||_{L^4(\Omega )}+||\nabla ^2 (y *\eta _\varepsilon )||_{L^4(\Omega )})\big )\\&\le C \varepsilon ^2 \big ( ||\nabla ^4 y ||_{L^2(\Omega +B_{\varepsilon }(0))} + ||\nabla ^3 y ||_{L^4(\Omega +B_{\varepsilon }(0))} ||\nabla ^2 y ||_{L^4(\Omega +B_{\varepsilon }(0))} \big ), \end{aligned}$$

where we used the inequality from above with \(k=p=2\) for the first term and with \(k=1\), \(p=4\) for the second. \(\square \)
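The \(\varepsilon ^2\) rate of the mollification error used in this proof can be observed numerically. The following is a minimal one-dimensional sketch under stated assumptions, not part of the argument: it takes \(y=\sin \), the standard bump function as \(\eta \), and a midpoint quadrature; the function names and the number of quadrature nodes are illustrative choices that do not appear in the text.

```python
import math


def bump(s):
    """Unnormalised standard bump function, supported in (-1, 1)."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0


def kernel_weights(n=2000):
    """Midpoint nodes in (-1, 1) and symmetric weights that sum to 1.

    This is a discrete stand-in for eta; the node count n is an assumption.
    """
    h = 2.0 / n
    nodes = [-1.0 + (i + 0.5) * h for i in range(n)]
    raw = [bump(s) for s in nodes]
    total = sum(raw)
    return nodes, [r / total for r in raw]


def mollify(y, x, eps, nodes, weights):
    """Approximate (y * eta_eps)(x) = int eta(s) y(x + eps * s) ds."""
    return sum(w * y(x + eps * s) for s, w in zip(nodes, weights))


def mollification_error(eps, x=1.0):
    """|y(x) - (y * eta_eps)(x)| for the test function y = sin."""
    nodes, weights = kernel_weights()
    return abs(math.sin(x) - mollify(math.sin, x, eps, nodes, weights))


if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05):
        err = mollification_error(eps)
        # The last column should be roughly constant if the error is O(eps^2).
        print(eps, err, err / eps ** 2)
```

Because the kernel is symmetric, halving \(\varepsilon \) should reduce the error by a factor close to four, in line with the pointwise analogue of the estimate above.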

5.4 Proof of the main theorem

We will need a discrete Poincaré inequality:

Proposition 5.10

Let \(\Omega \subset \mathbb {R}^d\) be open and bounded. Then there is a \(C_\mathrm{P}(\Omega ) >0\) such that for all \(\varepsilon \in (0,1]\) and \(u \in \mathscr {A}_\varepsilon (\Omega ,0)\) we have

$$\begin{aligned} ||u ||_{\ell _\varepsilon ^2 ({{\mathrm{int}}}_\varepsilon \Omega )} \le C_\mathrm{P} ||u ||_{h_\varepsilon ^1 ({{\mathrm{sint}}}_\varepsilon \Omega )} \end{aligned}$$

and for \(u :{{\mathrm{int}}}_\varepsilon \Omega \rightarrow \mathbb {R}^d\) we have

$$\begin{aligned} ||u ||_{h^{-1}_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega )} \le C_\mathrm{P} ||u ||_{\ell _\varepsilon ^2 ({{\mathrm{int}}}_\varepsilon \Omega )}. \end{aligned}$$

Proof

Set \(M_\varepsilon = \big \lceil \frac{{{\mathrm{diam}}}\Omega }{\varepsilon } \big \rceil \), fix \(\rho \in \mathscr {R}\) and extend u by 0 to all of \(\varepsilon \mathbb {Z}^d\). Then,

$$\begin{aligned} u(x) = -\varepsilon \sum \limits _{k=1}^{M_\varepsilon } \frac{u(x+k\varepsilon \rho ) - u(x+(k-1)\varepsilon \rho )}{\varepsilon } \end{aligned}$$

for all \(x \in \Omega \cap \varepsilon \mathbb {Z}^d\) and thus

$$\begin{aligned} \varepsilon ^d \sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega } |u(x) |^2&\le \varepsilon ^{d+2} \sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega } \Big (\sum \limits _{k=1}^{M_\varepsilon } |D_{\mathscr {R},\varepsilon } u (x+ (k-1)\varepsilon \rho ) |\Big )^2 \\&\le \varepsilon ^{d+2} M_\varepsilon \sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega } \sum \limits _{k=1}^{M_\varepsilon } |D_{\mathscr {R},\varepsilon } u (x+ (k-1)\varepsilon \rho ) |^2 \\&\le \varepsilon ^{d+2} M_\varepsilon ^2 \sum _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } |D_{\mathscr {R},\varepsilon } u (x) |^2 \\&\le ({{\mathrm{diam}}}\Omega +1)^2 \varepsilon ^d \sum _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } |D_{\mathscr {R},\varepsilon } u (x) |^2. \end{aligned}$$

For the second inequality, take any \(v \in \mathscr {A}_\varepsilon (\Omega ,0)\) and use the first inequality to calculate

$$\begin{aligned} \varepsilon ^d \sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega } u(x)v(x) \le ||u ||_{\ell _\varepsilon ^2 ({{\mathrm{int}}}_\varepsilon \Omega )} ||v ||_{\ell _\varepsilon ^2 ({{\mathrm{int}}}_\varepsilon \Omega )} \le C_\mathrm{P} ||u ||_{\ell _\varepsilon ^2 ({{\mathrm{int}}}_\varepsilon \Omega )} ||v ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )}. \end{aligned}$$

\(\square \)
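The first inequality of Proposition 5.10 can be checked on a toy example. The sketch below assumes \(d=1\), \(\Omega =(0,1)\), \(\varepsilon =1/N\) and a single lattice direction \(\rho =1\), so that the discrete gradient reduces to a forward difference quotient; these simplifications and all function names are assumptions made for illustration only. The constant \(C_\mathrm{P} = {{\mathrm{diam}}}\,\Omega + 1 = 2\) is the one produced by the proof.

```python
import random


def poincare_sides(u_interior, eps):
    """Return (eps * sum |u|^2, eps * sum |D u|^2) for a 1D lattice function.

    u_interior holds u at the lattice points eps, 2*eps, ..., (N-1)*eps of
    Omega = (0, 1); u is extended by zero to the two neighbouring points.
    """
    u = [0.0] + list(u_interior) + [0.0]
    l2_sq = eps * sum(v * v for v in u_interior)
    h1_sq = eps * sum(((u[i + 1] - u[i]) / eps) ** 2 for i in range(len(u) - 1))
    return l2_sq, h1_sq


def check_poincare(n_cells, seed=0):
    """True if the squared inequality holds with C_P = diam(Omega) + 1 = 2."""
    rng = random.Random(seed)
    eps = 1.0 / n_cells
    u_interior = [rng.uniform(-1.0, 1.0) for _ in range(n_cells - 1)]
    l2_sq, h1_sq = poincare_sides(u_interior, eps)
    return l2_sq <= (1.0 + 1.0) ** 2 * h1_sq


if __name__ == "__main__":
    for n_cells in (4, 16, 64, 256):
        print(n_cells, check_poincare(n_cells))
```

The check is independent of \(\varepsilon \) in the sense that the same constant is used for every lattice spacing, which is the point of the proposition.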

Now let us prove the theorem.

Proof (Theorem 5.1)

By Theorem 3.6, \(\lambda _\mathrm{atom}(A_0)>0\) implies \(\lambda _\mathrm{LH}(A_0)>0\), so we can apply Theorem 4.4 with \(m=2\). This already gives a \(K_1\) and the solution y of the continuous problem. Since the solution depends continuously on the data and we have the embedding \(H^4\hookrightarrow C^1\), we can always achieve

$$\begin{aligned} |D_{\mathscr {R},\varepsilon } S_\varepsilon y(x) - (A_0 \rho )_{\rho \in \mathscr {R}} |\le \frac{r_0}{2} \end{aligned}$$

for all \(x \in \varepsilon \mathbb {Z}^d\) and all \(\varepsilon \in (0,1]\) by choosing \(K_1\) small enough.

We want to use Corollary 5.4. Let \(X_\varepsilon = \mathscr {A}_\varepsilon (\Omega ,0)\) with \(||u ||_{X_\varepsilon } = ||u ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )}\),

$$\begin{aligned} Z_\varepsilon = \{ r :{{\mathrm{int}}}_\varepsilon \Omega \rightarrow \mathbb {R}^d \} \end{aligned}$$

with \(||r ||_{Z_\varepsilon } = ||r ||_{h^{-1}_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega )}\) and

$$\begin{aligned} Y_\varepsilon =\Big ( \big \{g :\partial _\varepsilon \Omega \rightarrow \mathbb {R}^d\big \} \big / \{g :||g ||_{\partial _\varepsilon \Omega } = 0\} \Big ) \times \mathscr {A}_\varepsilon (\Omega ,0), \end{aligned}$$

with \(||([g],v) ||_{Y_\varepsilon } = ||g ||_{\partial _\varepsilon \Omega } + ||v ||_{h^{-1}_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega )}\). Note that \(D_{\mathscr {R},\varepsilon } T_\varepsilon g (x) = D_{\mathscr {R},\varepsilon } T_\varepsilon h (x)\) for all \(x \in {{\mathrm{sint}}}_\varepsilon \Omega \) whenever \([g]=[h]\). Now, define \(F_\varepsilon :X_\varepsilon \times Y_\varepsilon \rightarrow Z_\varepsilon \) by

$$\begin{aligned} F_\varepsilon (u,[g],v)(x) = - \tilde{f}(x) - v(x) -{{\mathrm{div}}}_{\mathscr {R},\varepsilon } \big (DW_\mathrm{atom}(D_{\mathscr {R},\varepsilon } (S_\varepsilon y + T_\varepsilon g + u)(x))\big ) \end{aligned}$$

for \(x \in {{\mathrm{int}}}_\varepsilon \Omega \). This is well defined for all \(\varepsilon \in (0,1]\) on an open neighborhood of

$$\begin{aligned} \overline{B_{r_1 \varepsilon ^{\frac{d}{2}}}(0)} \times \overline{B_{r_2 \varepsilon ^{\frac{d}{2}}}(0)} \times \mathscr {A}_\varepsilon (\Omega ,0), \end{aligned}$$

if we choose \(r_1,r_2>0\) small enough. In particular, we can choose them so small that

$$\begin{aligned} D_{\mathscr {R},\varepsilon } (S_\varepsilon y + T_\varepsilon g + u)(x) \in \overline{B_{r_0}((A_0 \rho )_{\rho \in \mathscr {R}})} \end{aligned}$$

for all \(x \in {{\mathrm{sint}}}_\varepsilon \Omega \). Now we use Proposition 5.7 with \(S_\varepsilon y \in C^{3,1}(\Omega ;\mathbb {R}^d)\) and \(V=B_{r_0}((A_0 \rho )_{\rho \in \mathscr {R}})\) to get

$$\begin{aligned} ||F_\varepsilon (0,0,0) ||_{\ell ^2_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega )}&\le ||{{\mathrm{div}}}DW_\mathrm{CB}(\nabla y) - {{\mathrm{div}}}DW_\mathrm{CB}(\nabla S_\varepsilon y) ||_{L^2(\Omega ;\mathbb {R}^d)}\\&\quad +C\varepsilon ^2 \Big ||||\nabla ^4 S_\varepsilon y ||_{L^\infty (B_{\varepsilon R}(x))} + ||\nabla ^3 S_\varepsilon y ||_{L^\infty (B_{\varepsilon R}(x))}^\frac{3}{2}\\&\quad + ||\nabla ^2 S_\varepsilon y ||_{L^\infty (B_{\varepsilon R}(x))}^3 + \varepsilon ||\nabla ^3 S_\varepsilon y ||_{L^\infty (B_{\varepsilon R}(x))}^2\Big ||_{L^2(\Omega )}. \end{aligned}$$

Next, we can apply Propositions 5.8 and 5.9 on \(\bar{y} = y_{A_0} + E(y-y_{A_0})\) and use \(y= \bar{y}\) in \(\Omega \) to obtain

$$\begin{aligned} ||F_\varepsilon (0,0,0) ||_{\ell ^2_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega )}&\le C \varepsilon ^2 \big ( ||\nabla ^2 \bar{y} ||_{L^4(\mathbb {R}^d)} ||\nabla ^3 \bar{y} ||_{L^4(\mathbb {R}^d)} + ||\nabla ^4 \bar{y} ||_{L^2(\mathbb {R}^d)}\\&\quad + ||\nabla ^3 \bar{y} ||_{L^3(\mathbb {R}^d)}^\frac{3}{2} + ||\nabla ^2 \bar{y} ||_{L^6(\mathbb {R}^d)}^3 + \varepsilon ||\nabla ^3 \bar{y} ||_{L^4(\mathbb {R}^d)}^2 \big )\\&\le C_1 \varepsilon ^2 ||y-y_{A_0} ||_{H^4(\Omega ;\mathbb {R}^d)} (1+ ||y-y_{A_0} ||_{H^4(\Omega ;\mathbb {R}^d)}^2). \end{aligned}$$

Hence, we can set

$$\begin{aligned} A = C_\mathrm{P} C_1 ||y-y_{A_0} ||_{H^4(\Omega ;\mathbb {R}^d)} (1+ ||y-y_{A_0} ||_{H^4(\Omega ;\mathbb {R}^d)}^2). \end{aligned}$$

By stability,

$$\begin{aligned} \varepsilon ^d \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } D^2 W_\mathrm{atom}((A_0 \rho )_{\rho \in \mathscr {R}}) [D_{\mathscr {R},\varepsilon } u (x)]^2 \ge \lambda _\mathrm{atom}(A_0) \varepsilon ^d \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } |D_{\mathscr {R},\varepsilon } u (x)|^2 \end{aligned}$$

for all \(u \in \mathscr {A}_\varepsilon (\Omega ,0)\). Continuity of \(D^2 W_\mathrm{atom}\) then implies the existence of a \(\tilde{r} \le r_0\) such that

$$\begin{aligned} \varepsilon ^d \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } D^2 W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } w(x)) [D_{\mathscr {R},\varepsilon } u (x)]^2 \ge \frac{\lambda _\mathrm{atom}(A_0)}{2} \varepsilon ^d \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } |D_{\mathscr {R},\varepsilon } u (x)|^2 \end{aligned}$$

for all \(u \in \mathscr {A}_\varepsilon (\Omega ,0)\) and all \(w :\Omega \cap \varepsilon \mathbb {Z}^d \rightarrow \mathbb {R}^d\) with

$$\begin{aligned} |D_{\mathscr {R},\varepsilon } w(x) - (A_0 \rho )_{\rho \in \mathscr {R}} |\le \tilde{r} \end{aligned}$$

for all \(x \in {{\mathrm{sint}}}_\varepsilon \Omega \). And again, by choosing \(K_1\) small enough this last inequality is automatically satisfied for \(w= S_\varepsilon y\) with \(\varepsilon \in (0,1]\) arbitrary.

Since the spaces are finite dimensional, it is obvious that the \(F_\varepsilon \) are Fréchet-differentiable. For \(w \in X_\varepsilon \) we have

$$\begin{aligned} \varepsilon ^d&\sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega } \partial _u F_\varepsilon (0,0,0)[w](x) w(x)\\&= -\varepsilon ^{d-1} \sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega }\sum _{\sigma , \rho \in \mathscr {R}} w(x) \Big ( D_{e_\rho }D_{e_\sigma }W_\mathrm{atom}(D_{\mathscr {R},\varepsilon }S_\varepsilon y(x)) \frac{w(x+\varepsilon \sigma ) - w(x)}{\varepsilon }\\&\quad - D_{e_\rho }D_{e_\sigma }W_\mathrm{atom}(D_{\mathscr {R},\varepsilon }S_\varepsilon y(x- \varepsilon \rho )) \frac{w(x+\varepsilon \sigma - \varepsilon \rho ) - w(x- \varepsilon \rho )}{\varepsilon }\Big )\\&=\varepsilon ^d \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } D^2 W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } S_\varepsilon y(x)) [D_{\mathscr {R},\varepsilon } w (x),D_{\mathscr {R},\varepsilon } w (x)] \\&\ge \frac{\lambda _\mathrm{atom}(A_0)}{2} \varepsilon ^d \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } |D_{\mathscr {R},\varepsilon } w(x)|^2 \end{aligned}$$

and, thus,

$$\begin{aligned} ||\partial _u F_\varepsilon (0,0,0)^{-1} ||_{L(h^{-1}_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega ), h_\varepsilon ^1({{\mathrm{sint}}}_\varepsilon \Omega ))} \le \frac{2}{\lambda _\mathrm{atom}(A_0)} = M_1. \end{aligned}$$

Furthermore, for \(([h],v) \in Y_\varepsilon \) and \(w \in \mathscr {A}_\varepsilon (\Omega ,0)\) we have

$$\begin{aligned} \varepsilon ^d&\sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega } \partial _{([g],v)}F_\varepsilon (0,0,0)[([h],v)](x)w(x)\\&\le ||v ||_{h^{-1}_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega )} ||w ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} \\&\quad - \varepsilon ^{d-1} \sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega } \sum _{\sigma , \rho \in \mathscr {R}} w(x) \Big ( D_{e_\rho }D_{e_\sigma }W_\mathrm{atom} (D_{\mathscr {R},\varepsilon } S_\varepsilon y(x)) \frac{T_\varepsilon h(x+\varepsilon \sigma ) - T_\varepsilon h(x)}{\varepsilon }\\&\quad - D_{e_\rho }D_{e_\sigma }W_\mathrm{atom} (D_{\mathscr {R},\varepsilon } S_\varepsilon y (x-\varepsilon \rho )) \frac{T_\varepsilon h(x-\varepsilon \rho +\varepsilon \sigma ) - T_\varepsilon h(x - \varepsilon \rho )}{\varepsilon } \Big ) \\&= ||v ||_{h^{-1}_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega )} ||w ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )}\\&\quad + \varepsilon ^d \sum _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } D^2W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } S_\varepsilon y(x))[D_{\mathscr {R},\varepsilon } w(x),D_{\mathscr {R},\varepsilon } T_\varepsilon h(x)]\\&\le ||v ||_{h^{-1}_\varepsilon ({{\mathrm{int}}}_\varepsilon \Omega )} ||w ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} + ||D^2 W_\mathrm{atom} ||_\infty ||w ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} ||h ||_{\partial _\varepsilon \Omega }. \end{aligned}$$

Hence,

$$\begin{aligned} ||\partial _{([g],v)}F_\varepsilon (0,0,0) ||_{L(Y_\varepsilon ,Z_\varepsilon )} \le 1 + ||D^2 W_\mathrm{atom} ||_\infty = M_2. \end{aligned}$$

In a similar fashion we calculate

$$\begin{aligned} \varepsilon ^d&\sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega } \big (\partial _u F_\varepsilon (0,0,0) - \partial _u F_\varepsilon (u,[g],v)\big )[w](x)z(x)\\&= \varepsilon ^d \sum _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } \big (D^2 W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } S_\varepsilon y (x))\\&\qquad -D^2 W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } (S_\varepsilon y +u+T_\varepsilon g) (x)) \big )[D_{\mathscr {R},\varepsilon } w(x), D_{\mathscr {R},\varepsilon } z(x)]\\&\le ||w ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} ||z ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} ||D^3 W_\mathrm{atom} ||_\infty ||D_{\mathscr {R},\varepsilon } (u+T_\varepsilon g) ||_{\ell ^\infty ({{\mathrm{sint}}}_\varepsilon \Omega )}. \end{aligned}$$

Thus,

$$\begin{aligned} ||\partial _u F_\varepsilon (0,0,0) - \partial _u F_\varepsilon (u,[g],v) ||_{L(X_\varepsilon , Z_\varepsilon )} \le ||D^3 W_\mathrm{atom} ||_\infty \varepsilon ^{-\frac{d}{2}} (||u ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} + ||g ||_{\partial _\varepsilon \Omega }), \end{aligned}$$

so that we can take \(M_3 = ||D^3 W_\mathrm{atom} ||_\infty \). Finally,

$$\begin{aligned} \varepsilon ^d&\sum _{x \in {{\mathrm{int}}}_\varepsilon \Omega } \big (\partial _{([g],v)} F_\varepsilon (0,0,0) - \partial _{([g],v)} F_\varepsilon (u,[g],v)\big )[([h],w)](x)z(x)\\&= \varepsilon ^d \sum _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } \big (D^2 W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } S_\varepsilon y(x))\\&\qquad -D^2 W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } (S_\varepsilon y +u+T_\varepsilon g) (x)) \big )[D_{\mathscr {R},\varepsilon } T_\varepsilon h(x), D_{\mathscr {R},\varepsilon } z(x)]\\&\le ||h ||_{\partial _\varepsilon \Omega } ||z ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} ||D^3 W_\mathrm{atom} ||_\infty ||D_{\mathscr {R},\varepsilon } (u+T_\varepsilon g) ||_{\ell ^\infty ({{\mathrm{sint}}}_\varepsilon \Omega )}. \end{aligned}$$

Hence, we can also take \(M_4 = ||D^3 W_\mathrm{atom} ||_\infty \). As before, since y depends continuously on the data, we can take \(K_1\) small enough such that

$$\begin{aligned} C_\mathrm{P} C_1 ||y-y_{A_0} ||_{H^4(\Omega ;\mathbb {R}^d)}&(1+ ||y-y_{A_0} ||_{H^4(\Omega ;\mathbb {R}^d)}^2)\\&\le \min \Big \{ \frac{r_1 \lambda _\mathrm{atom}(A_0)}{8}, \frac{\lambda _\mathrm{atom}(A_0)^2}{64 ||D^3 W_\mathrm{atom} ||_\infty }\Big \}. \end{aligned}$$

Therefore, we can apply Corollary 5.4 and get the fixed point result with

$$\begin{aligned} \lambda _1&= \min \Big \{r_1,\frac{\lambda _\mathrm{atom}(A_0)}{8 ||D^3 W_\mathrm{atom} ||_\infty }\Big \},\\ \lambda _2&= \min \Big \{r_2,\frac{\lambda _\mathrm{atom}(A_0)}{8||D^3 W_\mathrm{atom} ||_\infty }, \frac{r_1 \lambda _\mathrm{atom}(A_0)}{8(1 + ||D^2 W_\mathrm{atom} ||_\infty ) + 2 \lambda _\mathrm{atom}(A_0)},\\&\quad \frac{\lambda _\mathrm{atom}(A_0)^2}{8 ||D^3 W_\mathrm{atom} ||_\infty \big (8 + 8||D^2 W_\mathrm{atom} ||_\infty + 2\lambda _\mathrm{atom}(A_0)\big )}\Big \}. \end{aligned}$$

After performing the substitutions \(g_\mathrm{atom} \in S_\varepsilon y + [g]\), \(f_\mathrm{atom} = \tilde{f} +v\) and \(y_\mathrm{atom} = S_\varepsilon y + T_\varepsilon (g_\mathrm{atom}-S_\varepsilon y) + u\), we get the stated existence result with \(K_2=\frac{\lambda _2}{2}\). The solution then satisfies \(||y_\mathrm{atom} - S_{\varepsilon }y ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} \le K_3 \varepsilon ^\gamma \) with \(K_3 = \lambda _1 + \frac{\lambda _2}{2}\). If \(r_1,r_2\) are chosen small enough, then this implies

$$\begin{aligned} |D_{\mathscr {R},\varepsilon } \tilde{y}_\mathrm{atom}(x) - (A_0 \rho )_{\rho \in \mathscr {R}} |\le \frac{\tilde{r}}{2} \end{aligned}$$

for all \(x \in {{\mathrm{sint}}}_\varepsilon \Omega \) and any \(\tilde{y}_\mathrm{atom}\) with boundary values \(g_\mathrm{atom}\). Furthermore, with \(r_1,r_2\) chosen small enough, for every \(u \in \mathscr {A}_\varepsilon (\Omega , 0)\backslash \{0\}\) with \(||u ||_{h^1_\varepsilon ({{\mathrm{sint}}}_\varepsilon \Omega )} \le K_3 \varepsilon ^\gamma \) we have

$$\begin{aligned} |D_{\mathscr {R},\varepsilon } u(x) |\le \frac{\tilde{r}}{2}. \end{aligned}$$

Now, since \(y_\mathrm{atom}\) is a solution, we can calculate

$$\begin{aligned}&E_\varepsilon (y_\mathrm{atom} + u, f_\mathrm{atom}, g_\mathrm{atom})-E_\varepsilon (y_\mathrm{atom}, f_\mathrm{atom}, g_\mathrm{atom})\\&\quad = \varepsilon ^d \sum _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } \Big ( W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y_\mathrm{atom}(x) + D_{\mathscr {R},\varepsilon } u(x)) - W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y_\mathrm{atom}(x))\\&\qquad - DW_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y_\mathrm{atom}(x))[D_{\mathscr {R},\varepsilon } u(x)] \Big ) \\&\quad = \varepsilon ^d \sum _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } \int _0^1 (1-t) D^2 W_\mathrm{atom}(D_{\mathscr {R},\varepsilon } y_\mathrm{atom}(x) + t D_{\mathscr {R},\varepsilon } u(x))[D_{\mathscr {R},\varepsilon } u(x),D_{\mathscr {R},\varepsilon } u(x)] \,dt\\&\quad \ge \frac{\lambda _\mathrm{atom}(A_0)}{2} \varepsilon ^d \sum \limits _{x \in {{\mathrm{sint}}}_\varepsilon \Omega } |D_{\mathscr {R},\varepsilon } u (x)|^2 > 0, \end{aligned}$$

which shows that \(y_\mathrm{atom}\) is a strict local minimizer. Repeating the same calculation with \(\tilde{y}_\mathrm{atom} - y_\mathrm{atom}\) in place of u shows that the solution is also unique.
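The Taylor identity with integral remainder that drives the last calculation can be illustrated in a scalar toy setting. In the sketch below, the double-well function \(W(r) = (r^2-1)^2\) is a hypothetical stand-in for \(W_\mathrm{atom}\), chosen only because its derivatives are explicit; the function names and the quadrature resolution are likewise assumptions.

```python
def w(r):
    """Hypothetical scalar stored energy (double well), not the paper's W_atom."""
    return (r * r - 1.0) ** 2


def dw(r):
    """First derivative of w."""
    return 4.0 * r * (r * r - 1.0)


def d2w(r):
    """Second derivative of w; positive near the minimum r = 1."""
    return 12.0 * r * r - 4.0


def energy_difference(a, b):
    """w(a + b) - w(a) - w'(a) b, the scalar analogue of the energy gap."""
    return w(a + b) - w(a) - dw(a) * b


def taylor_remainder(a, b, n=2000):
    """Midpoint-rule value of int_0^1 (1 - t) w''(a + t b) b^2 dt."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += (1.0 - t) * d2w(a + t * b) * b * b
    return total * h


if __name__ == "__main__":
    a, b = 1.0, 0.1
    # Both numbers should agree up to quadrature error and be positive.
    print(energy_difference(a, b), taylor_remainder(a, b))
```

As long as the second derivative stays positive along the segment from a to a + b, the remainder is strictly positive for b ≠ 0, which is the scalar version of the strict minimality argument.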

For the additional statement we only have to estimate \(||S_\varepsilon y - y||_{h_\varepsilon ^1({{\mathrm{int}}}_\varepsilon \Omega )}\) with a Taylor expansion. We have

$$\begin{aligned} S_\varepsilon y(x+\varepsilon \rho )- y(x+\varepsilon \rho )&= \int _{B_\varepsilon (0)} \big (\bar{y}(x+\varepsilon \rho +z)- \bar{y}(x+\varepsilon \rho ) \big )\eta _\varepsilon (z)\,dz\\&= \int _{B_\varepsilon (0)} \big (\bar{y}(x+\varepsilon \rho +z)- \bar{y}(x+\varepsilon \rho ) - \nabla \bar{y}(x+\varepsilon \rho )[z] \big )\eta _\varepsilon (z)\,dz\\&= \int _{B_\varepsilon (0)}\int _0^1 (1-t) \nabla ^2 \bar{y}(x+\varepsilon \rho +tz)[z,z] \eta _\varepsilon (z)\,dt\,dz. \end{aligned}$$

This includes the case \(\rho =0\), hence

$$\begin{aligned} \Big |&\frac{S_\varepsilon y(x+\varepsilon \rho ) - S_\varepsilon y(x)}{\varepsilon } - \frac{y(x+\varepsilon \rho ) -y(x)}{\varepsilon }\Big |^2 \\&= \Big |\int _{B_\varepsilon (0)}\int _0^1(1-t) \frac{\nabla ^2 \bar{y}(x+\varepsilon \rho +tz)[z,z] - \nabla ^2 \bar{y}(x+tz)[z,z]}{\varepsilon }\eta _\varepsilon (z)\,dt \,dz \Big |^2\\&\le \varepsilon ^{2\gamma } R_\mathrm{max}^{2(\gamma -1)} |\nabla ^2 \bar{y} |_{\gamma -1}^2, \end{aligned}$$

which gives the desired result. \(\square \)