1 Introduction

The existence of weak solutions is an open problem of nonlinear elastostatics. Locally injective (almost-everywhere) energy minimizers in the Sobolev space \(W^{1,p}\) were shown to exist in 1976 by Ball [2] for polyconvex materials. It remains to be seen if such minimizers correspond to weak solutions of the Euler–Lagrange equations. The underlying difficulty stems from a lack of regularity combined with the blow-up of the energy density as the local volume ratio approaches zero. As such, the energy density could be “infinite” on a set of measure zero at a minimizer. With this in mind, second-gradient nonlinear elasticity was investigated in [7], where the natural setting is in the Sobolev space \(W^{2,p}\). Although the existence of energy minimizers for that class of problems is well known [1], the main point of [7] is that, with \(p>3\) and sufficiently fast blow-up of the energy density function, the volume ratio is uniformly bounded below by a positive constant for all admissible deformations with bounded energy (including energy minimizers). With this in hand, Gâteaux differentiability at a minimizer is routine, and rigorous weak solutions are readily established for a wide variety of boundary conditions—including truly mixed conditions. In the special cases of “oriented Dirichlet” boundary conditions, it follows from well-known arguments that such weak solutions are globally injective, cf. [3, 4].

We return to the setting of [7] in this work and take up the question of global injectivity of weak solutions for second-gradient problems with mixed boundary conditions. Of course this entails the treatment of self-contact. We note that the existence of almost-everywhere globally injective energy minimizers for polyconvex materials (as in [2]) was obtained by Ciarlet and Nečas in [5], cf. also [11]. The minimization is carried out in the presence of an injectivity constraint given by an integral inequality. As in [2], the existence of weak solutions is not known. In addition, we mention the works of [6] and [10], which also consider global injectivity in nonlinear elasticity. None of these address the reactive traction field resulting from self-contact. The contact of polyconvex hyperelastic bodies with rigid obstacles is treated in [12]. Without the assumption that the energy density grows unboundedly as the volume ratio approaches zero, the existence of a measure-valued contact traction is established. In this work we construct a measure that enforces interior global injectivity and corresponds to the magnitude of a non-tensile self-contact traction acting in the direction of the unit normal to the boundary in the deformed configuration.

The outline of the work is as follows. We present our formulation in Sect. 2, specifying our assumptions in accord with those of [7]. In particular, we assume physically reasonable growth and convexity conditions enabling a well-posed minimization analysis, which we take up in Sect. 3. We work in the class of appropriate vector-valued deformations in the Sobolev space that are injective on the interior of the domain. We demonstrate that the total energy attains its minimum within that admissible set; the results of [7] play a crucial role. Sections 4 and 5 comprise the heart of the paper. In the former we first define an appropriate tangent cone at an admissible deformation, and then we establish a variational inequality at a minimizer. Next we define the self-contact coincidence set in a natural way at an admissible deformation; we show that the set is closed and confined to the boundary of the domain. In Sect. 5 we prove the main result, viz., the existence of a non-negative (Radon) measure, vanishing outside of the coincidence set, which represents the normal contact-reaction force distribution. This construction follows from a version of the geometric form of the Hahn–Banach theorem. With that in hand, we obtain the weak form of the equilibrium equations, naturally involving the self-contact field.

2 Problem formulation

Let the reference configuration be an open and bounded subset of three-dimensional Euclidean space, \(\varOmega \subset {\mathbb {E}}^3\). (We use \({\mathbb {E}}^3\) to denote both Euclidean space and the tangent space at a point.) We assume the boundary, \(\partial \varOmega \), is continuously differentiable, i.e. there is a continuous outward unit-normal vector \(\mathbf {n}\in C(\partial \varOmega ,{\mathbb {E}}^3)\). We work with deformations, denoted \(\mathbf {f}:{\overline{\varOmega }}\rightarrow {\mathbb {E}}^3\), in the Sobolev space \(W^{2,p}(\varOmega ,{\mathbb {E}}^3)\) of vector-valued functions with \(p\)-integrable second weak-derivatives for \(p>3\). We denote the deformation-gradient at \(\mathbf {x}\in {\overline{\varOmega }}\) by \(\nabla \mathbf {f}(\mathbf {x})\in GL^+({\mathbb {E}}^3)\), the latter denoting the set of all invertible linear transformations having positive determinant, and we denote the second-gradient by \(\nabla ^2\mathbf {f}(\mathbf {x})\in BL({\mathbb {E}}^3)\), a symmetric bilinear transformation. The norm of the Sobolev space is given by

$$\begin{aligned} \Vert \mathbf {f}\Vert _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}^p\equiv \int _\varOmega \Big [|\nabla \mathbf {f}(\mathbf {x})|^p+ |\nabla ^2\mathbf {f}(\mathbf {x})|^p+|\mathbf {f}(\mathbf {x})|^p\Big ]dV, \end{aligned}$$
(1)

where the Euclidean tensor norm is used in the integrand and dV is the Euclidean volume form.

We focus on the mixed boundary-value problem, where the deformation and its gradient are prescribed on a closed subset \(\varGamma \subset \partial \varOmega \) by \(\mathbf {f}_0\in W^{2,p}(\varOmega ,{\mathbb {E}}^3)\). We let \(W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\subset W^{2,p}(\varOmega ,{\mathbb {E}}^3)\) be the subspace of vector-valued functions that vanish, along with their normal directional derivatives, on \(\varGamma \), and we denote \(\varGamma _c=\partial \varOmega \backslash \varGamma \). We assume that \(\varGamma \) has non-zero area-measure, which implies a Poincaré inequality bounding \(\Vert \mathbf {h}\Vert _{W^{1,p}(\varOmega ,{\mathbb {E}}^3)}\) by the \(L^p\) norm of the second derivatives of \(\mathbf {h}\) when \(\mathbf {h}\in W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\). It is then routine to show that a norm equivalent to (1) on \(W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\) is given by

$$\begin{aligned} \Vert \mathbf {h}\Vert _{W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)}^p\equiv \int _\varOmega |\nabla ^2\mathbf {h}(\mathbf {x})|^pdV. \end{aligned}$$
(2)
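
As a brief sketch of why (1) and (2) are equivalent, write \(C_P>0\) for the constant in the Poincaré inequality just mentioned (the symbol \(C_P\) is our label, introduced only for this illustration), and take \(\Vert \mathbf {h}\Vert _{W^{1,p}(\varOmega ,{\mathbb {E}}^3)}^p=\int _\varOmega \big [|\mathbf {h}|^p+|\nabla \mathbf {h}|^p\big ]dV\), consistent with (1). Assuming \(\Vert \mathbf {h}\Vert _{W^{1,p}(\varOmega ,{\mathbb {E}}^3)}\le C_P\Vert \nabla ^2\mathbf {h}\Vert _{L^p(\varOmega )}\) for \(\mathbf {h}\in W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\), one obtains

$$\begin{aligned} \int _\varOmega |\nabla ^2\mathbf {h}|^pdV\ \le \ \Vert \mathbf {h}\Vert _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}^p&=\Vert \mathbf {h}\Vert _{W^{1,p}(\varOmega ,{\mathbb {E}}^3)}^p+\int _\varOmega |\nabla ^2\mathbf {h}|^pdV\\&\le \big (1+C_P^p\big )\int _\varOmega |\nabla ^2\mathbf {h}|^pdV, \end{aligned}$$

so the two expressions bound each other up to the factor \(1+C_P^p\).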

We consider admissible deformations which are locally invertible, orientation-preserving, and globally injective on \(\varOmega \):

$$\begin{aligned} {\mathcal {A}}\equiv \left\{ \mathbf {f}\in W^{2,p}\big (\varOmega ,{\mathbb {E}}^3\big )\ \begin{array}{|l} \ \mathbf {f}-\mathbf {f}_0\in W^{2,p}_\varGamma \big (\varOmega ,{\mathbb {E}}^3\big ),\\ \ \nabla \mathbf {f}(\mathbf {x})\in GL^+({\mathbb {E}}^3)\ \mathrm{for\ all}\ \mathbf {x}\in {\overline{\varOmega }},\\ \ \mathbf {f}\ \mathrm{is\ injective\ on}\ \varOmega . \end{array}\right\} \end{aligned}$$
(3)

We consider a total energy functional \(E:{\mathcal {A}}\rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} E[\mathbf {f}]\equiv&\int _\varOmega \Big [ W\big (\nabla \mathbf {f},\nabla ^2\mathbf {f},\mathbf {x}\big )-\big (\mathbf {b}\cdot \mathbf {f}+\mathbf {B}\cdot \nabla \mathbf {f}\big )\Big ]dV-\int _{\varGamma _c} \big [{\varvec{\tau }} \cdot \mathbf {f}+{\varvec{\mu }}\cdot [\nabla \mathbf {f} \mathbf {n}]\big ]dS. \end{aligned}$$
(4)

The function \(W:GL^+({\mathbb {E}}^3)\times BL({\mathbb {E}}^3)\times {\overline{\varOmega }}\rightarrow [0,\infty )\) is the hyper-elastic second-gradient stored energy function. The fields \(\mathbf {b}\in L^1( \varOmega , {\mathbb {E}}^3)\), \(\mathbf {B}\in L^1(\varOmega ,L({\mathbb {E}}^3))\) represent the prescribed body-force density and generalized body-force density, respectively, while \({\varvec{\tau }},\varvec{\mu } \in L^1(\varGamma _c , {\mathbb {E}}^3)\) are the prescribed surface traction and surface hyper-traction, cf. [13]. Finally, \(\nabla \mathbf {f}(\mathbf {x})\mathbf {n}(\mathbf {x})\) denotes the partial derivative of the deformation in the direction of the outward unit-normal \(\mathbf {n}(\mathbf {x})\) at \(\mathbf {x}\in \varGamma _c\). We consider “dead” loadings for convenience, while noting that more general classes of conservative potentials can be treated here, cf. [1].

Throughout this work we make the following assumptions:

A1 :

The stored energy function is coercive in the sense that there is a \(p>3\) and a constant \(C>0\), independent of \(\mathbf {F},\mathbf {G}\) or \(\mathbf {x}\), such that

$$\begin{aligned} W(\mathbf {F},\mathbf {G},\mathbf {x})\ge C |\mathbf {G}|^p. \end{aligned}$$
(5)
A2 :

The mappings \(\mathbf {F}\mapsto W\) and \(\mathbf {G}\mapsto W\) are continuously differentiable, and \(\mathbf {x}\mapsto W\), \(\mathbf {x}\mapsto W_F\) and \(\mathbf {x}\mapsto W_G\) are measurable. There is a real-valued function on \(GL^+({\mathbb {E}}^3)\), \(\mathbf {F}\mapsto \alpha (\mathbf {F})\), continuous but not uniformly bounded, cf. (8), such that

$$\begin{aligned} W(\mathbf {F},\mathbf {G},\mathbf {x})\le&\alpha (\mathbf {F}) \big (1+|\mathbf {G}|^p\big ), \\ |W_G(\mathbf {F},\mathbf {G},\mathbf {x})|\le&\alpha (\mathbf {F}) \big (1+|\mathbf {G}|^{p-1}\big ).\nonumber \end{aligned}$$
(6)
A3 :

Material objectivity: For all \(\mathbf {x}\in \varOmega \) and \(\mathbf {Q}\in SO(3)\equiv \{\mathbf {Q}\in GL^+({\mathbb {E}}^3)\ :\ \mathbf {Q}^{-1}=\mathbf {Q}^\top \}\),

$$\begin{aligned} W(\mathbf {F},\mathbf {G},\mathbf {x})=W(\mathbf {Q}\mathbf {F},\mathbf {Q}\mathbf {G},\mathbf {x}). \end{aligned}$$
(7)
A4 :

For some \(q\ge 3p/(p-3)\),

$$\begin{aligned} W(\mathbf {F},\mathbf {G},\mathbf {x})\ge C\det (\mathbf {F})^{-q},\ \mathrm{for\ all}\ \mathbf {F}\in GL^+({\mathbb {E}}^3),\ \mathbf {G}\in BL\big ({\mathbb {E}}^3\big ),\ \mathrm{and}\ \mathbf {x}\in {\overline{\varOmega }}. \end{aligned}$$
(8)
A5 :

We assume that \(\mathbf {f}_0\in {\mathcal {A}}\), and that \(\mathbf {f}_0\) is injective on \({\overline{\varOmega }}\).

A6 :

The map \(\mathbf {G}\mapsto W\) is convex, or more generally polyconvex [1]:

The latter takes the following form in our setting: For any symmetric bilinear transformation \(\mathbf {G}\in BL({\mathbb {E}}^3)\), we associate a 3rd-order tensor \({\hat{\mathbf {G}}}:{\mathbb {E}}^3\rightarrow L({\mathbb {E}}^3)\) by \(({\hat{\mathbf {G}}}\mathbf {a})\mathbf {b}=\mathbf {G}[\mathbf {a}, \mathbf {b}]\). Let \(\mathbf {G}^{[2]}\) denote the list of all \(2\times 2\) sub-determinants of the Cartesian components of \({\hat{\mathbf {G}}}\) (viewed as a \(9\times 3\) matrix), and let \(\mathbf {G}^{[3]}\) denote the list of all \(3\times 3\) sub-determinants. For \(V:BL({\mathbb {E}}^3)\rightarrow {\mathbb {R}}\), we say that V is polyconvex if there is a convex function \({\tilde{V}}\) such that \(V(\mathbf {G})={\tilde{V}}(\mathbf {G},\mathbf {G}^{[2]},\mathbf {G}^{[3]})\).
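
For orientation, here is one simple energy that appears to satisfy A1–A4 and A6. It is a hypothetical model chosen by us for illustration, not one taken from [7]; the constants \(\mu \ge 0\) and \(c>0\) are likewise our own labels.

$$\begin{aligned} W(\mathbf {F},\mathbf {G},\mathbf {x})&=\varPhi (\mathbf {F})+c\,|\mathbf {G}|^p,\qquad \varPhi (\mathbf {F})\equiv \frac{\mu }{2}|\mathbf {F}|^2+c\det (\mathbf {F})^{-q},\\ \alpha (\mathbf {F})&\equiv \max \big \{\varPhi (\mathbf {F}),\,c\,p\big \}. \end{aligned}$$

Since \(\varPhi \ge 0\), A1 and A4 hold with \(C=c\). Because \(|W_G|=c\,p\,|\mathbf {G}|^{p-1}\) and \(c\,p\ge c\), both bounds in (6) hold with the \(\alpha \) above, which is continuous on \(GL^+({\mathbb {E}}^3)\) and unbounded as \(\det (\mathbf {F})\rightarrow 0\). A3 follows from \(|\mathbf {Q}\mathbf {F}|=|\mathbf {F}|\), \(\det (\mathbf {Q}\mathbf {F})=\det (\mathbf {F})\) and \(|\mathbf {Q}\mathbf {G}|=|\mathbf {G}|\) for \(\mathbf {Q}\in SO(3)\), and A6 from the convexity of \(\mathbf {G}\mapsto |\mathbf {G}|^p\). For example, with \(p=4\) the requirement in A4 reads \(q\ge 3\cdot 4/(4-3)=12\).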

3 Existence of injective minimizer

In this section we show that the total potential energy (4) attains its minimum in the set of admissible deformations \({\mathcal {A}}\). With the exception of dealing with the global injectivity constraint, we employ standard arguments of the direct method. A result of [5] is that the property “\(\mathbf {f}\) is almost-everywhere injective” is weakly closed in \(W^{1,p}(\varOmega ,{\mathbb {E}}^3)\). With the property that \(\nabla \mathbf {f}\) is uniformly non-singular, we infer here that \(\mathbf {f}\) is injective on \(\varOmega \). To begin, we need the following preliminary step.

Lemma 1

Suppose that for \(\mathbf {f}\in W^{2,p}(\varOmega ,{\mathbb {E}}^3)\) there is \(K>0\) such that

$$\begin{aligned} \det \big (\nabla \mathbf {f}(\mathbf {x})\big )\ge K,\ \forall \ \mathbf {x}\in {\overline{\varOmega }}, \end{aligned}$$
(9)

and that \(\mathbf {f}\) is almost-everywhere injective on \(\varOmega \). Then \(\mathbf {f}\) is everywhere injective on \(\varOmega \).

Proof

Suppose \(\mathbf {f}(\mathbf {x}_1)=\mathbf {f}(\mathbf {x}_2)=\mathbf {y}\) for \(\mathbf {x}_1,\mathbf {x}_2\in \varOmega \). We consider the equation \(\mathbf {f}(\mathbf {w})=\mathbf {z}\) for \(\mathbf {z}\) in a neighborhood of \(\mathbf {y}\). The Sobolev embedding theorem implies that \(\mathbf {f}\in C^1({\overline{\varOmega }},{\mathbb {E}}^3)\), and (9) guarantees that \(\nabla \mathbf {f}\) is non-singular. Then the inverse function theorem states that there is a neighborhood V of \(\mathbf {y}\) such that for all \(\mathbf {z}\in V\), there exist solutions \(\mathbf {w}\) in a neighborhood of \(\mathbf {x}_1\) and in a neighborhood of \(\mathbf {x}_2\). If \(\mathbf {x}_1\not =\mathbf {x}_2\), these neighborhoods can be taken disjoint, so every \(\mathbf {z}\in V\) has two distinct preimages and \(\mathbf {f}\) fails to be injective on a set of positive measure. This contradicts almost-everywhere injectivity of \(\mathbf {f}\), thus \(\mathbf {x}_1=\mathbf {x}_2\) and \(\mathbf {f}\) is everywhere injective on \(\varOmega \). \(\square \)
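
The embedding used in the proof can be made quantitative. The following is the standard Morrey-type statement for a bounded domain with continuously differentiable boundary, recorded here as a sketch; the constant \(C_M\) and the exponent name \(\lambda \) are our labels.

$$\begin{aligned} W^{2,p}(\varOmega ,{\mathbb {E}}^3)\hookrightarrow C^{1,\lambda }({\overline{\varOmega }},{\mathbb {E}}^3),\quad \lambda =1-\frac{3}{p}\in (0,1),\qquad |\nabla \mathbf {f}(\mathbf {x})-\nabla \mathbf {f}(\mathbf {w})|\le C_M\Vert \mathbf {f}\Vert _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}|\mathbf {x}-\mathbf {w}|^{\lambda }. \end{aligned}$$

For instance, \(p=4\) gives \(\lambda =1/4\). This is why the pointwise condition (9) on \(\nabla \mathbf {f}\) is meaningful for every \(\mathbf {x}\in {\overline{\varOmega }}\) when \(p>3\).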

Proposition 1

There exists \(\mathbf {f}^*\in {\mathcal {A}}\) such that \(E[\mathbf {f}^*]=\inf _{\mathbf {f}\in {\mathcal {A}}} E[\mathbf {f}]\).

Proof

By assumption A5, \({\mathcal {A}}\) is nonempty. We use the Poincaré inequality for the equivalence of the norms (1) and (2). We then infer that \(\inf _{\mathbf {f}\in {\mathcal {A}}} E[\mathbf {f}]\) is bounded below from A1 and the Sobolev embedding into continuous functions.

Suppose \(\{\mathbf {f}_i\}_{i=1}^\infty \subset {\mathcal {A}}\) is an energy infimizing sequence. In view of (2) and A1, there is a constant \(C>0\) such that

$$\begin{aligned} C\Vert \mathbf {f}_i-\mathbf {f}_0\Vert _{W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)}^p\le&\int _\varOmega W(\nabla \mathbf {f}_i(\mathbf {x}),\nabla ^2\mathbf {f}_i(\mathbf {x}),\mathbf {x})dV +\int _\varOmega |\nabla ^2\mathbf {f}_0(\mathbf {x})|^pdV\\ \le&\, E[\mathbf {f}_i]+\sup _{\mathbf {x}\in {\overline{\varOmega }}} |\mathbf {f}_i|\big (\Vert \mathbf {b}\Vert _{L^1(\varOmega ,{\mathbb {E}}^3)}+\Vert \varvec{\tau }\Vert _{L^1(\varGamma _c,{\mathbb {E}}^3)}\big )\\&+\sup _{\mathbf {x}\in {\overline{\varOmega }}} |\nabla \mathbf {f}_i|\big (\Vert \mathbf {B} \Vert _{L^1(\varOmega ,L({\mathbb {E}}^3))}+\Vert {\varvec{\mu }}\Vert _{L^1(\varGamma _c,{\mathbb {E}}^3)} \big )+\int _\varOmega |\nabla ^2\mathbf {f}_0(\mathbf {x})|^pdV. \end{aligned}$$

Then it is clear there exist constants M and N, which depend on C, \(\mathbf {b}\), \(\mathbf {B}\), \({\varvec{\tau }}\), \({\varvec{\mu }}\), \(\varOmega \), \(\mathbf {f}_0\) and \(\inf _{\mathbf {f}\in {\mathcal {A}}} E[\mathbf {f}]\), such that \(\Vert \mathbf {f}_i\Vert _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}\le M \) for \(i>N\). The Banach–Alaoglu theorem implies that there is a subsequence \(\mathbf {f}_{i^k}\rightharpoonup \mathbf {f}^*\) converging in the weak topology of \(W^{2,p}(\varOmega ,{\mathbb {E}}^3)\).

Clearly \(\mathbf {f}^*-\mathbf {f}_0\) vanishes on \(\varGamma \). By virtue of A4 and Theorem 3.1 of [7], there is \(K>0\) such that \(\det (\nabla \mathbf {f}^*(\mathbf {x}))\ge K\) for all \(\mathbf {x}\in {\overline{\varOmega }}\). Also, Theorem 5 of [5] implies that \(\mathbf {f}^*\) is almost-everywhere injective; each element of the weakly convergent sequence, \(\{\mathbf {f}_{i^k}\}\), is almost-everywhere injective. Combined with Lemma 1, this shows that \(\mathbf {f}^*\) is injective on \(\varOmega \) and hence \(\mathbf {f}^*\in {\mathcal {A}}\).

Finally, in view of the continuity and convexity (or polyconvexity) properties of W, cf. A2, A6, Theorem 5.4 of [1] establishes the weak lower semi-continuity of \(E[\cdot ]\). We conclude that \(E[\cdot ]\) achieves its infimum at \(\mathbf {f}^*\). \(\square \)

4 Variational inequality

For \(\mathbf {f}\in {\mathcal {A}}\), we define the tangent cone

$$\begin{aligned} K_f{\mathcal {A}}\equiv \Big \{\mathbf {h}\in W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\ \big |\ \exists \ \epsilon _1>0\ \mathrm{s.t.}\ \mathbf {f}+\epsilon \mathbf {h}\in {\mathcal {A}}\ \mathrm{for\ all}\ \epsilon \in [0,\epsilon _1)\Big \}. \end{aligned}$$
(10)

Later we demonstrate that \(K_f{\mathcal {A}}\) contains a non-empty open subset. Also, conditions A2 ensure that \(E[\cdot ]\) is Gâteaux differentiable. Hence we immediately conclude:

Proposition 2

Let \(\mathbf {f}^*\in {\mathcal {A}}\) be an energy minimizer, cf. Proposition 1. Then

$$\begin{aligned} \langle DE[\mathbf {f}^*],\mathbf {h}\rangle _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}\ge 0,\ \mathrm{for\ all}\ \mathbf {h}\in K_{f^*}{\mathcal {A}}. \end{aligned}$$
(11)
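
To make the step to (11) explicit, a short sketch: for \(\mathbf {h}\in K_{f^*}{\mathcal {A}}\) the deformation \(\mathbf {f}^*+\epsilon \mathbf {h}\) is admissible for small \(\epsilon \ge 0\), so the one-sided difference quotient of the energy is non-negative. Differentiating (4) under the integral, with \(W_F\) and \(W_G\) evaluated at \((\nabla \mathbf {f}^*,\nabla ^2\mathbf {f}^*,\mathbf {x})\), gives

$$\begin{aligned} 0\le \lim _{\epsilon \rightarrow 0^+}\frac{E[\mathbf {f}^*+\epsilon \mathbf {h}]-E[\mathbf {f}^*]}{\epsilon }&=\langle DE[\mathbf {f}^*],\mathbf {h}\rangle _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}\\&=\int _\varOmega \Big [\big (W_F-\mathbf {B}\big )\cdot \nabla \mathbf {h}+W_G\cdot \nabla ^2\mathbf {h}-\mathbf {b}\cdot \mathbf {h}\Big ]dV-\int _{\varGamma _c}\big [{\varvec{\tau }}\cdot \mathbf {h}+{\varvec{\mu }}\cdot [\nabla \mathbf {h}\,\mathbf {n}]\big ]dS. \end{aligned}$$

Only \(\epsilon \ge 0\) is allowed because \(K_{f^*}{\mathcal {A}}\) is a cone rather than a linear space, which is why an inequality, not an equality, results.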

In order to obtain a characterization of \(DE[\mathbf {f}^*]\), we first define the coincidence set at a deformation \(\mathbf {f}\in {\mathcal {A}}\):

$$\begin{aligned} S_f\equiv \Big \{\mathbf {x}\in {\overline{\varOmega }}\ \big |\ \exists \ \mathbf {w}\in {\overline{\varOmega }}\backslash \{\mathbf {x}\}\ \mathrm{s.t.}\ \mathbf {f}(\mathbf {x})=\mathbf {f}(\mathbf {w})\Big \}. \end{aligned}$$
(12)

In what follows, let \(\mathbf {n}_f(\mathbf {x})\) denote the outward pointing unit normal vector in the current configuration at \(\mathbf {f}(\mathbf {x})\), for \(\mathbf {x}\in \partial \varOmega \).

Lemma 2

For a given \(\mathbf {f}\in {\mathcal {A}}\):

  (i) \(S_f\) is closed.

  (ii) \(S_f\subset \partial \varOmega \), and if \(\mathbf {y}\in \mathbf {f}(S_f)\) with \(\{\mathbf {x}_1,\mathbf {x}_2\}\subset \mathbf {f}^{-1}(\mathbf {y})\), then

    $$\begin{aligned} \mathbf {n}_f(\mathbf {x}_1)+\mathbf {n}_f(\mathbf {x}_2)=\mathbf {0}. \end{aligned}$$
    (13)

    Furthermore, \(\{\mathbf {x}_1,\mathbf {x}_2\}= \mathbf {f}^{-1}(\mathbf {y})\).

Proof

Suppose \(\{\mathbf {x}_1^i\}_{i=1}^\infty \subset S_f\) and \(\mathbf {x}_1^i\rightarrow \mathbf {x}_1\in {\overline{\varOmega }}\). Then there exists \(\{\mathbf {x}_2^i\}_{i=1}^\infty \subset S_f\) such that \(\mathbf {x}_1^i\not =\mathbf {x}_2^i\) and \(\mathbf {f}(\mathbf {x}_1^i)=\mathbf {f}(\mathbf {x}_2^i)\). By compactness of \({\overline{\varOmega }}\), there is a subsequence \(\{\mathbf {x}_2^{i_k}\}_{k=1}^\infty \) such that \(\mathbf {x}_2^{i_k}\rightarrow \mathbf {x}_2\in {\overline{\varOmega }}\), and then \(\mathbf {f}(\mathbf {x}_1)=\mathbf {f}(\mathbf {x}_2)\) by continuity. We must show that \(\mathbf {x}_1\not =\mathbf {x}_2\). Suppose to the contrary that \(\mathbf {x}_1=\mathbf {x}_2=\mathbf {x}\). Then the Sobolev embedding and extension theorems imply that \(\mathbf {f}\) extends to a continuously differentiable function in a neighborhood of \(\mathbf {x}\), on which \(\nabla \mathbf {f}\) is invertible. The inverse function theorem implies that \(\mathbf {f}\) is a bijection of a neighborhood, U, of \(\mathbf {x}\) and a neighborhood, V, of \(\mathbf {f}(\mathbf {x})\). For large enough k, \(\{\mathbf {x}_1^{i_k},\mathbf {x}_2^{i_k}\}\subset U\) and \(\{\mathbf {f}(\mathbf {x}_1^{i_k}),\mathbf {f}(\mathbf {x}_2^{i_k})\}\subset V\). We reach a contradiction since \(\mathbf {f}(\mathbf {x}_1^{i_k})=\mathbf {f}(\mathbf {x}_2^{i_k})\) and \(\mathbf {x}_1^{i_k}\not =\mathbf {x}_2^{i_k}\), but \(\mathbf {f}\) is a bijection of U and V.

To prove (ii), we consider \(\mathbf {x}_1\not =\mathbf {x}_2\) with \(\mathbf {f}(\mathbf {x}_1)=\mathbf {f}(\mathbf {x}_2)=\mathbf {y}\). Since \(\mathbf {f}\) is injective on \(\varOmega \), then with, say, \(\mathbf {x}_1\in \partial \varOmega \), there are two possibilities: either \(\mathbf {x}_2\in \varOmega \) or \(\mathbf {x}_2\in \partial \varOmega \). In the first case we let \(\mathbf {z}=-\mathbf {n}_f(\mathbf {x}_1)\), and in the second case we take \(\mathbf {z} = -\mathbf {n}_f(\mathbf {x}_1)-\mathbf {n}_f(\mathbf {x}_2)\). We then consider the equation

$$\begin{aligned} \mathbf {g}(\mathbf {w},\epsilon )\equiv \mathbf {f}(\mathbf {w})-\mathbf {y}-\epsilon \mathbf {z}=\mathbf {0}. \end{aligned}$$
(14)

Again we may assume that \(\mathbf {f}\) is defined and \(\nabla \mathbf {f}\) is non-singular in neighborhoods of \(\mathbf {x}_1\) and \(\mathbf {x}_2\). The implicit function theorem shows that there exist continuously differentiable curves \(\mathbf {w}_1(\epsilon )\) and \(\mathbf {w}_2(\epsilon )\) solving (14) with \(\mathbf {w}_1(0)=\mathbf {x}_1\) and \(\mathbf {w}_2(0)=\mathbf {x}_2\). The derivatives at \(\epsilon =0\), for \(\alpha \in \{1,2\}\), are given by

$$\begin{aligned} \frac{d}{d\epsilon }\mathbf {w}_\alpha (0)=\nabla \mathbf {f}(\mathbf {x}_\alpha )^{-1}\mathbf {z}. \end{aligned}$$

The pullback of \(\mathbf {n}_f(\mathbf {x}_1)\), \(\nabla \mathbf {f}(\mathbf {x}_1)^\top \mathbf {n}_f(\mathbf {x}_1)\), is an outward normal to \(\partial \varOmega \) at \(\mathbf {x}_1\). In the first case \(\mathbf {w}_2(\epsilon )\in \varOmega \), because \(\mathbf {x}_2\) is interior and

$$\begin{aligned} \frac{d}{d\epsilon }\mathbf {w}_1(0)\cdot \nabla \mathbf {f}(\mathbf {x}_1)^\top \mathbf {n}_f(\mathbf {x}_1)=\mathbf {z}\cdot \mathbf {n}_f(\mathbf {x}_1)=-1. \end{aligned}$$

It follows that \(\mathbf {w}_1(\epsilon )\in \varOmega \) for small positive \(\epsilon \), contradicting the injectivity of \(\mathbf {f}\) on \(\varOmega \), so \(S_f\subset \partial \varOmega \). In the second case,

$$\begin{aligned}&\frac{d}{d\epsilon }\mathbf {w}_\alpha (0)\cdot \nabla \mathbf {f} (\mathbf {x}_\alpha )^\top \mathbf {n}_f(\mathbf {x}_\alpha )=\mathbf {z} \cdot \mathbf {n}_f(\mathbf {x}_\alpha )\\&\quad = -1-\mathbf {n}_f(\mathbf {x}_1)\cdot \mathbf {n}_f(\mathbf {x}_2). \end{aligned}$$

If \(\mathbf {n}_f(\mathbf {x}_1)+\mathbf {n}_f(\mathbf {x}_2)\not =0\) then \(\mathbf {w}_\alpha (\epsilon )\in \varOmega \) for small positive \(\epsilon \), contradicting the injectivity of \(\mathbf {f}\) on \(\varOmega \).
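
The algebra behind the last two displays can be spelled out as follows. This is a sketch using only the identity \((\mathbf {A}^{-1}\mathbf {z})\cdot (\mathbf {A}^\top \mathbf {n})=\mathbf {z}\cdot \mathbf {n}\) for invertible \(\mathbf {A}\) and the fact that \(\mathbf {n}_f(\mathbf {x}_1)\), \(\mathbf {n}_f(\mathbf {x}_2)\) are unit vectors. With \(\mathbf {z}=-\mathbf {n}_f(\mathbf {x}_1)-\mathbf {n}_f(\mathbf {x}_2)\),

$$\begin{aligned} \mathbf {z}\cdot \mathbf {n}_f(\mathbf {x}_\alpha )&=-1-\mathbf {n}_f(\mathbf {x}_1)\cdot \mathbf {n}_f(\mathbf {x}_2),\\ |\mathbf {n}_f(\mathbf {x}_1)+\mathbf {n}_f(\mathbf {x}_2)|^2&=2+2\,\mathbf {n}_f(\mathbf {x}_1)\cdot \mathbf {n}_f(\mathbf {x}_2),\\ \text {hence}\quad \mathbf {z}\cdot \mathbf {n}_f(\mathbf {x}_\alpha )&=-\tfrac{1}{2}|\mathbf {n}_f(\mathbf {x}_1)+\mathbf {n}_f(\mathbf {x}_2)|^2. \end{aligned}$$

Thus the curves \(\mathbf {w}_\alpha \) move strictly inward at \(\epsilon =0\) precisely when \(\mathbf {n}_f(\mathbf {x}_1)+\mathbf {n}_f(\mathbf {x}_2)\not =\mathbf {0}\).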

Suppose now that there are three distinct points mapped to \(\mathbf {y}\) by \(\mathbf {f}\), \(\{\mathbf {x}_1,\mathbf {x}_2,\mathbf {x}_3\}\subset \mathbf {f}^{-1}(\mathbf {y})\). Then (13) implies that

$$\begin{aligned} \mathbf {n}_f(\mathbf {x}_3)=-\frac{1}{2}\big (\mathbf {n}_f(\mathbf {x}_1) +\mathbf {n}_f(\mathbf {x}_2)\big )=\mathbf {0}. \end{aligned}$$

This contradicts that \(\mathbf {n}_f(\mathbf {x}_3)\) is a unit-vector, hence \(\{\mathbf {x}_1,\mathbf {x}_2\}= \mathbf {f}^{-1}(\mathbf {y})\). \(\square \)

We next give sufficient conditions on a displacement \(\mathbf {h}\) for \(\mathbf {f}+ \epsilon \mathbf {h}\) to be admissible for small \(\epsilon \). Given \(\mathbf {f}\in {\mathcal {A}}\), we now define

$$\begin{aligned} K_f^0{\mathcal {A}}\equiv \left\{ \mathbf {h}\in W^{2,p}_\varGamma \big (\varOmega ,{\mathbb {E}}^3\big )\ \begin{array}{|l} \ \mathrm{if\ }\mathbf {f}(\mathbf {x}_1)=\mathbf {f}(\mathbf {x}_2)\ \mathrm{and}\ \mathbf {x}_1\not =\mathbf {x}_2,\\ \ \mathrm{then}\ \mathbf {h}(\mathbf {x}_1)\cdot \mathbf {n}_f(\mathbf {x}_1)+\mathbf {h}(\mathbf {x}_2)\cdot \mathbf {n}_f(\mathbf {x}_2)<0\end{array}\right\} . \end{aligned}$$

Proposition 3

\(K_f^0{\mathcal {A}}\subset K_f{\mathcal {A}}\), and \(K_f^0{\mathcal {A}}\) is non-empty. Furthermore, \(K_f^0{\mathcal {A}}\) is open in the norm topology of \(W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\).

Proof

Suppose that \(\mathbf {h}\in K_f^0{\mathcal {A}}\). We must show that \(\mathbf {h}\in K_f{\mathcal {A}}\), i.e. there exists \(\epsilon _1>0\) such that \(\mathbf {v}_\epsilon \equiv \mathbf {f}+\epsilon \mathbf {h}\in {\mathcal {A}}\) for \(\epsilon \in [0,\epsilon _1)\). Clearly, \(\mathbf {v}_\epsilon -\mathbf {f}_0\in W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\), and from the uniform bound on \(|\nabla \mathbf {h}|\) and (9), it is straightforward to show that \(\det (\nabla \mathbf {v}_\epsilon (\mathbf {x}))>0\) for all \(\mathbf {x}\in {\overline{\varOmega }}\) and \(\epsilon \) in a neighborhood of 0. Thus it only remains to show that \(\mathbf {v}_\epsilon \) is injective on \(\varOmega \).

Suppose for contradiction that there is a non-negative sequence \(\epsilon ^i\rightarrow 0\) and \(\{\mathbf {x}_1^i,\mathbf {x}_2^i\}\subset \varOmega \) such that \(\mathbf {v}_{\epsilon ^i}(\mathbf {x}_1^i)=\mathbf {v}_{\epsilon ^i}(\mathbf {x}_2^i)\) and \(\mathbf {x}_1^i\not =\mathbf {x}_2^i\). Then we restrict to a subsequence (without re-indexing) such that \(\mathbf {x}_\alpha ^i\rightarrow \mathbf {x}_\alpha \) for \(\alpha \in \{1,2\}\). By continuous dependence on \(\mathbf {x}\) and \(\epsilon \), \(\mathbf {f}(\mathbf {x}_1)=\mathbf {f}(\mathbf {x}_2)\).

We now consider two cases; first we suppose that \(\mathbf {x}_1=\mathbf {x}_2\). There is some \(0<\epsilon _2<\epsilon _1\) and \(\delta '>0\) such that \(|\nabla \mathbf {v}_\epsilon (\mathbf {x})\mathbf {w}|\ge \delta ' |\mathbf {w}|\) on \([0,\epsilon _2]\). Since \({\overline{\varOmega }}\times [0,\epsilon _2]\) is compact, \(\nabla \mathbf {v}_\epsilon (\mathbf {x})\) is uniformly continuous with respect to \(\mathbf {x}\) and \(\epsilon \), and

$$\begin{aligned} \big |\mathbf {v}_{\epsilon ^i}\big (\mathbf {x}_2^i\big )-\mathbf {v}_{\epsilon ^i} \big (\mathbf {x}_1^i\big )\big |\ge \big |\nabla \mathbf {v}_{\epsilon ^i} \big (\mathbf {x}_1^i\big )[\mathbf {x}_2^i-\mathbf {x}_1^i]\big |-o \big (|\mathbf {x}_2^i-\mathbf {x}_1^i|\big ). \end{aligned}$$
(15)

We choose r such that the error satisfies \(o(r)<\delta ' r\), which implies that \(\mathbf {v}_\epsilon \) is injective on balls of radius r for \(\epsilon \in [0,\epsilon _2]\). Then we choose i large enough that \(|\mathbf {x}_\alpha ^i-\mathbf {x}_\alpha |<\frac{r}{2}\) and \(\epsilon ^i<\epsilon _2\). Since \(\mathbf {x}_1^i\) and \(\mathbf {x}_2^i\) are contained in a ball of radius r, this contradicts that \(\mathbf {v}_{\epsilon ^i}(\mathbf {x}_1^i)=\mathbf {v}_{\epsilon ^i}(\mathbf {x}_2^i)\).

For the second case, we suppose that \(\mathbf {x}_1\not =\mathbf {x}_2\). Lemma 2 implies that \(\{\mathbf {x}_1,\mathbf {x}_2\}\subset \partial \varOmega \) and by definition of \(K_f^0{\mathcal {A}}\), \(\mathbf {h}(\mathbf {x}_1)\cdot \mathbf {n}_f(\mathbf {x}_1)+\mathbf {h}(\mathbf {x}_2)\cdot \mathbf {n}_f(\mathbf {x}_2)=-\delta <0\). We claim there exists a sequence of vectors \(\{\mathbf {n}^i\}_{i=1}^\infty \) such that \((\mathbf {f}(\mathbf {x}_2^i)-\mathbf {f}(\mathbf {x}_1^i))\cdot \mathbf {n}^i\ge 0\) and \(\mathbf {n}^i\rightarrow \mathbf {n}_f(\mathbf {x}_1)\). In this case,

$$\begin{aligned}&\mathbf {n}^i\cdot \Big (\mathbf {v}_{\epsilon ^i}\big (\mathbf {x}_2^i\big ) -\mathbf {v}_{\epsilon ^i}\big (\mathbf {x}_1^i\big )\Big )\nonumber \\&\quad =\mathbf {n}^i\cdot \Big (\mathbf {f}\big (\mathbf {x}_2^i\big )-\mathbf {f} \big (\mathbf {x}_1^i\big )\Big )+\epsilon ^i \mathbf {n}^i\cdot \Big (\mathbf {h}\big (\mathbf {x}_2^i\big )-\mathbf {h}\big (\mathbf {x}_1^i \big )\Big )+o(\epsilon ^i)\nonumber \\&\quad \ge \epsilon ^i \delta +\epsilon ^i \mathbf {n}^i\cdot \Big (\mathbf {h}\big (\mathbf {x}_2^i\big )-\mathbf {h}\big (\mathbf {x}_1^i\big ) \Big )-\epsilon ^i \mathbf {n}_f\big (\mathbf {x}_1\big )\cdot \Big (\mathbf {h}\big (\mathbf {x}_2\big )-\mathbf {h}\big (\mathbf {x}_1 \big )\Big )+o(\epsilon ^i). \end{aligned}$$
(16)

We now choose i large enough that \(o(\epsilon ^i)< \epsilon ^i \delta /2\) and

$$\begin{aligned} \mathbf {n}^i\cdot \Big (\mathbf {h}\big (\mathbf {x}_1^i\big )-\mathbf {h}\big (\mathbf {x}_2^i\big )\Big )- \mathbf {n}_f\big (\mathbf {x}_1\big )\cdot \Big (\mathbf {h}\big (\mathbf {x}_1\big )-\mathbf {h}\big (\mathbf {x}_2\big )\Big )<\delta /2, \end{aligned}$$

in which case inequality (16) contradicts \(\mathbf {v}_{\epsilon ^i}(\mathbf {x}_1^{i})=\mathbf {v}_{\epsilon ^i}(\mathbf {x}_2^{i})\). For the construction of the sequence \(\{\mathbf {n}^i\}_{i=1}^\infty \), we take

$$\begin{aligned} \mathbf {n}^i\equiv \mathbf {n}_f\big (\mathbf {x}_1\big )-\gamma ^i\frac{\mathbf {f} \big (\mathbf {x}_2^i\big )-\mathbf {f}\big (\mathbf {x}_1^i\big )}{\big |\mathbf {f}\big (\mathbf {x}_2^i\big )-\mathbf {f}\big (\mathbf {x}_1^i\big )\big |} \end{aligned}$$

and

$$\begin{aligned} \gamma ^i\equiv \min \Big \{\frac{ \mathbf {n}_f\big (\mathbf {x}_1 \big )\cdot \big (\mathbf {f}\big (\mathbf {x}_2^i\big )-\mathbf {f} \big (\mathbf {x}_1^i\big )\big )}{\big |\mathbf {f}\big (\mathbf {x}_2^i \big )-\mathbf {f}\big (\mathbf {x}_1^i\big )\big |},0\Big \}. \end{aligned}$$

The choice of \(\gamma ^i\) ensures that \((\mathbf {f}(\mathbf {x}_2^i)-\mathbf {f}(\mathbf {x}_1^i))\cdot \mathbf {n}^i\ge 0\). Note that \(\gamma ^i\le 0\) by definition. If \(\gamma ^i\not \rightarrow 0\), then there is some \(\gamma _0>0\) such that for any N there is \(i>N\) with \(\gamma ^i\le -\gamma _0\). If \(\gamma ^i<0\), then \(\mathbf {f}(\mathbf {x}_2^i)-\mathbf {f}(\mathbf {x}_1^i)\) is inward pointing to \(\mathbf {f}(\varOmega )\) at \(\mathbf {f}(\mathbf {x}_1)\). Since \(\mathbf {f}(\mathbf {x}_2^i)\rightarrow \mathbf {f}(\mathbf {x}_1)\), it would be the case that \(\mathbf {f}(\mathbf {x}_2^i)\in \mathbf {f}(\varOmega )\) for large enough i, which contradicts injectivity of \(\mathbf {f}\). Thus, \(\gamma ^i\rightarrow 0\) and \(\mathbf {n}^i\rightarrow \mathbf {n}_f(\mathbf {x}_1)\), which completes the proof that \(K_f^0{\mathcal {A}}\subset K_f{\mathcal {A}}\).
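
The sign claim for \(\mathbf {n}^i\) can be checked in one line. In this sketch \(\mathbf {d}^i\) and \(c^i\) are shorthand introduced by us, and we assume \(\mathbf {f}(\mathbf {x}_2^i)\not =\mathbf {f}(\mathbf {x}_1^i)\) so that the quotient is defined:

$$\begin{aligned} \mathbf {d}^i&\equiv \frac{\mathbf {f}\big (\mathbf {x}_2^i\big )-\mathbf {f}\big (\mathbf {x}_1^i\big )}{\big |\mathbf {f}\big (\mathbf {x}_2^i\big )-\mathbf {f}\big (\mathbf {x}_1^i\big )\big |},\qquad c^i\equiv \mathbf {n}_f\big (\mathbf {x}_1\big )\cdot \mathbf {d}^i,\qquad \gamma ^i=\min \{c^i,0\},\\ \mathbf {d}^i\cdot \mathbf {n}^i&=\mathbf {d}^i\cdot \big (\mathbf {n}_f\big (\mathbf {x}_1\big )-\gamma ^i\mathbf {d}^i\big )=c^i-\gamma ^i=\max \{c^i,0\}\ge 0,\\ \big |\mathbf {n}^i-\mathbf {n}_f\big (\mathbf {x}_1\big )\big |&=|\gamma ^i|. \end{aligned}$$

In words, \(\mathbf {n}^i\) is obtained from \(\mathbf {n}_f(\mathbf {x}_1)\) by removing any component that opposes \(\mathbf {d}^i\), and the size of the correction is exactly \(|\gamma ^i|\).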

We now construct an interior vector-field \(\mathbf {h}_0\in K_f^0{\mathcal {A}}\). We begin with the exterior normal vector-field \(\mathbf {n}_f\in C(\partial \varOmega ,{\mathbb {E}}^3)\) and extend \(\mathbf {n}_f\) to a continuous compactly supported vector-field on \({\mathbb {E}}^3\). We let \({\tilde{\mathbf {n}}}\) be a mollification of \(\mathbf {n}_f\) such that \({\tilde{\mathbf {n}}}\in C^\infty _c({\mathbb {E}}^3,{\mathbb {E}}^3)\), \(|{\tilde{\mathbf {n}}}|=1\) in a neighborhood of \(\partial \varOmega \), and \(|{\tilde{\mathbf {n}}}(\mathbf {x})-\mathbf {n}_f(\mathbf {x})|<\frac{1}{3}\) for all \(\mathbf {x}\in \partial \varOmega \). This implies that \({\tilde{\mathbf {n}}}(\mathbf {x})\cdot \mathbf {n}_f(\mathbf {x})>0\) for \(\mathbf {x}\in \partial \varOmega \). We let \(\phi \) be any smooth, real-valued function on \({\overline{\varOmega }}\) that is strictly negative on \(\partial \varOmega \backslash \varGamma \), and vanishes, along with its normal directional derivative, on \(\varGamma \). We define \(\mathbf {h}_0(\mathbf {x})\equiv \phi (\mathbf {x}) {\tilde{\mathbf {n}}}(\mathbf {x})\). Since \(\phi \) and \(\nabla \phi \cdot \mathbf {n}\) vanish on \(\varGamma \), \(\mathbf {h}_0\in W_\varGamma ^{2,p}(\varOmega ,{\mathbb {E}}^3)\). Moreover, since \(\phi (\mathbf {x})<0\) on \(\partial \varOmega \backslash \varGamma \) and \({\tilde{\mathbf {n}}}(\mathbf {x})\cdot \mathbf {n}_f(\mathbf {x})>0\), it follows that \(\mathbf {h}_0\) satisfies \(\mathbf {h}_0(\mathbf {x})\cdot \mathbf {n}_f(\mathbf {x})<0\) for all \(\mathbf {x}\in \partial \varOmega \backslash \varGamma \). If \(\mathbf {f}(\mathbf {x}_1)=\mathbf {f}(\mathbf {x}_2)\) with \(\mathbf {x}_1\not =\mathbf {x}_2\), then either \(\mathbf {x}_1\not \in \varGamma \) or \(\mathbf {x}_2\not \in \varGamma \) by injectivity of \(\mathbf {f}\) on \(\varGamma \), where it coincides with \(\mathbf {f}_0\), cf. A5. We conclude that \(\mathbf {h}_0\in K_f^0{\mathcal {A}}\).

Finally, we show that \(K_f^0{\mathcal {A}}\) is open. Given any \(\delta >0\) and \(\mathbf {h}\in K_f^0{\mathcal {A}}\), the embedding of \(W_\varGamma ^{2,p}(\varOmega ,{\mathbb {E}}^3)\) into uniformly continuous functions implies that there is an open neighborhood \(U_\delta \) of \(\mathbf {h}\) in \(W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\) such that \(|\mathbf {h}(\mathbf {x})-\mathbf {g}(\mathbf {x})|<\delta \) for all \(\mathbf {x}\in {\overline{\varOmega }}\) and \(\mathbf {g}\in U_\delta \). By item (i) of Lemma 2, \(S_f\) is compact and thus we can find some \(\delta >0\) such that \(\mathbf {h}(\mathbf {x}_1)\cdot \mathbf {n}_f(\mathbf {x}_1)+\mathbf {h}(\mathbf {x}_2)\cdot \mathbf {n}_f(\mathbf {x}_2)\le -2\delta \) wherever \(\mathbf {f}(\mathbf {x}_1)=\mathbf {f}(\mathbf {x}_2)\) and \(\mathbf {x}_1\not =\mathbf {x}_2\). It then follows that \(U_\delta \subset K_f^0{\mathcal {A}}\). \(\square \)
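
The final inclusion in the proof amounts to the following estimate, written out here as a sketch. For \(\mathbf {g}\in U_\delta \) and any pair \(\mathbf {x}_1\not =\mathbf {x}_2\) with \(\mathbf {f}(\mathbf {x}_1)=\mathbf {f}(\mathbf {x}_2)\), using \(|\mathbf {n}_f|=1\),

$$\begin{aligned} \mathbf {g}(\mathbf {x}_1)\cdot \mathbf {n}_f(\mathbf {x}_1)+\mathbf {g}(\mathbf {x}_2)\cdot \mathbf {n}_f(\mathbf {x}_2)&\le \mathbf {h}(\mathbf {x}_1)\cdot \mathbf {n}_f(\mathbf {x}_1)+\mathbf {h}(\mathbf {x}_2)\cdot \mathbf {n}_f(\mathbf {x}_2)+|\mathbf {g}(\mathbf {x}_1)-\mathbf {h}(\mathbf {x}_1)|+|\mathbf {g}(\mathbf {x}_2)-\mathbf {h}(\mathbf {x}_2)|\\&<-2\delta +\delta +\delta =0, \end{aligned}$$

so \(\mathbf {g}\) satisfies the strict inequality defining \(K_f^0{\mathcal {A}}\).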

5 Equilibrium equations

We now give our main result:

Theorem 1

There exist \(\mathbf {f}^*\in {\mathcal {A}}\) and \(\sigma \in C(\partial \varOmega )^*\) (a finite Radon measure) such that:

  1. (i)

    The equilibrium equation is satisfied for all \(\mathbf {h}\in W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\):

    $$\begin{aligned} \langle DE[\mathbf {f}^*],\mathbf {h}\rangle _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}+\langle \sigma ,\mathbf {h}\cdot \mathbf {n}_{f^*}\rangle _{C(\partial \varOmega )}=0. \end{aligned}$$
    (17)
  2. (ii)

    The measure \(\sigma \) is non-negative.

  3. (iii)

    Complementary slackness holds:

    $$\begin{aligned} \langle \sigma ,z\rangle _{C(\partial \varOmega )}= 0 \end{aligned}$$
    (18)

    for all \(z\in C(\partial \varOmega )\) that satisfy \(z(\mathbf {x}_1)+z(\mathbf {x}_2)=0\) wherever \(\mathbf {f}^*(\mathbf {x}_1)=\mathbf {f}^*(\mathbf {x}_2)\) and \(\mathbf {x}_1\not =\mathbf {x}_2\). In particular, \(\sigma \) is supported on \(S_{f^*}\).

Before giving the proof, we discuss some ramifications of the theorem. To begin, we observe that the equilibrium equations (17) take the explicit weak form

$$\begin{aligned}&\int _\varOmega \Big [\big [W_F\big (\nabla \mathbf {f}^*,\nabla ^2\mathbf {f}^*\big )-\mathbf {B}\big ]\cdot \nabla \mathbf {h}+ W_G\big (\nabla \mathbf {f}^*,\nabla ^2\mathbf {f}^*\big )\cdot \nabla ^2\mathbf {h}-\mathbf {b}\cdot \mathbf {h}\Big ]dV\nonumber \\&\quad -\int _{\varGamma _c} {\varvec{\tau }}\cdot \mathbf {h}\ dS-\int _{\varGamma _c}{\varvec{\mu }} \cdot (\nabla \mathbf {h}\mathbf {n})dS+\int _{\varGamma _c} \mathbf {n}_{f^*}\cdot \mathbf {h}\ d\sigma =0, \end{aligned}$$
(19)

for all \(\mathbf {h}\in W_\varGamma ^{2,p}(\varOmega ,{\mathbb {E}}^3)\). In particular, the second term on the left side of (17) and the last term on the left side of (19) coincide. Clearly the latter represents a measure-valued reactive traction, \(-\mathbf {n}_{f^*}d\sigma \), which according to (ii), is either compressive (negative) or zero. More specifically, suppose that \(\sigma \) is absolutely continuous with respect to the area measure of \(\partial \varOmega \). Then the Radon–Nikodym derivative exists, which we denote \(d\sigma /dS = T_*\). In fact, \(T_*\in L^1(\partial \varOmega )\) is non-negative with support on \(S_{f^*}\), and the last term of (19) combines directly with the applied-traction integral via an effective traction, say \({\varvec{\tau }}_{eff}={\varvec{\tau }}-T_*\mathbf {n}_{f^*}\).
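The combination into an effective traction can be written out explicitly. As a brief sketch, assuming only the absolute continuity \(d\sigma =T_*\,dS\) introduced above, the two traction-type terms of (19) give

$$\begin{aligned} -\int _{\varGamma _c} {\varvec{\tau }}\cdot \mathbf {h}\ dS+\int _{\varGamma _c} \mathbf {n}_{f^*}\cdot \mathbf {h}\ d\sigma =-\int _{\varGamma _c} \big ({\varvec{\tau }}-T_*\mathbf {n}_{f^*}\big )\cdot \mathbf {h}\ dS =-\int _{\varGamma _c} {\varvec{\tau }}_{eff}\cdot \mathbf {h}\ dS. \end{aligned}$$

Since \(T_*\ge 0\), the contribution \(-T_*\mathbf {n}_{f^*}\) points along the inward normal of the deformed boundary, consistent with a compressive reaction.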

In view of (13), property (iii) of Theorem 1 represents “action-reaction” of the contact traction on \(S_{f^*}\), which can be stated as \(d\sigma (\mathbf {x}_1)=d\sigma (\mathbf {x}_2)\) (in the sense of measures) wherever \(\mathbf {f}^*(\mathbf {x}_1)=\mathbf {f}^*(\mathbf {x}_2)\) for \(\mathbf {x}_1\not =\mathbf {x}_2\). To better understand this, we consider the following two special cases, both of which are possible: (a) \(\sigma \) is absolutely continuous, as above; (b) there are isolated points of \(S_{f^*}\) on which \(\sigma \) has finite measure. For \(\mathbf {f}^*(\mathbf {x}_1)=\mathbf {f}^*(\mathbf {x}_2)\) with \(\mathbf {x}_1\not =\mathbf {x}_2\), we claim in case (a) that \(T_*(\mathbf {x}_1)=T_*(\mathbf {x}_2)\), except possibly on a set of zero area-measure. In case (b) we claim that \(\sigma (\{\mathbf {x}_1\})=\sigma (\{\mathbf {x}_2\})\). To see this, we construct a special test function z as follows. For \(\mathbf {y}\in {\mathbb {E}}^3\) and \(0<r<1\), we define \(\eta _{r,y}\) to be a non-negative smooth “bump” function, such that \(\eta _{r,y}\) has support on the ball of radius r centered at \(\mathbf {y}\), and \(\eta _{r,y}\) equals 1 on the ball of radius \(r-r^2\) centered at \(\mathbf {y}\). We now let \(\mathbf {y}\in \mathbf {f}^*(S_{f^*})\) and consider \(\{\mathbf {x}_1,\mathbf {x}_2\}= {\mathbf {f}^*}^{-1}(\mathbf {y})\). For sufficiently small r, \(\mathbf {x}\mapsto \eta _{r,y}(\mathbf {f}^*(\mathbf {x}))\) has support in the union of two disjoint neighborhoods of \(\mathbf {x}_1\) and \(\mathbf {x}_2\), \(O_1^r\) and \(O_2^r\) respectively, cf. Lemma 2. We choose the test function on \(\partial \varOmega \) given by:

$$\begin{aligned} z(\mathbf {x})\equiv \left\{ \begin{array}{ll}r^{-3}\eta _{r,y}(\mathbf {f}^*(\mathbf {x}))\ &{}\ \mathbf {x}\in O_1^r\cap \partial \varOmega \\ -r^{-3}\eta _{r,y}(\mathbf {f}^*(\mathbf {x}))\ &{}\ \mathbf {x}\in O^r_2\cap \partial \varOmega \\ 0\ &{}\ \mathbf {x}\in \partial \varOmega \backslash (O_1^r\cup O_2^r).\end{array}\right. \end{aligned}$$
(20)

With this choice, z satisfies the criterion of (iii). Equation (18) becomes

$$\begin{aligned} r^{-3}\int _{O_1^r\cap \partial \varOmega } T_*(\mathbf {x})\eta _{r,y}\big (\mathbf {f}^*(\mathbf {x})\big )dS=r^{-3}\int _{O_2^r\cap \partial \varOmega } T_*(\mathbf {x})\eta _{r,y}\big (\mathbf {f}^*(\mathbf {x})\big )dS. \end{aligned}$$

If \(\mathbf {x}_1\) and \(\mathbf {x}_2\) are Lebesgue points of \(T_*\), then the limit as \(r\rightarrow 0\) above exists, which shows that \(T_*(\mathbf {x}_1)=T_*(\mathbf {x}_2)\).

Now suppose that \(\mathbf {x}_1\) and \(\mathbf {x}_2\) are isolated points of \(S_{f^*}\) on which \(\sigma \) has finite measure. We choose z as in (20) with r small enough so that \(O_1^r\cap S_{f^*}=\{\mathbf {x}_1\}\) and \(O_2^r\cap S_{f^*}=\{\mathbf {x}_2\}\). Since \(\sigma \) is supported on \(S_{f^*}\) and \(\eta _{r,y}(\mathbf {y})=1\), (iii) immediately implies that \(\sigma (\{\mathbf {x}_1\})=\sigma (\{\mathbf {x}_2\})\). We now give the proof of Theorem 1.

Proof

Suppose that \(\mathbf {f}^*\) is an energy minimizer, cf. Proposition 1. We consider \(M_{f^*}^+,M^-\subset {\mathbb {R}}\times C(\partial \varOmega )\) defined by

$$\begin{aligned} M_{f^*}^+\equiv&\Big \{(l,{z}):\exists \ \mathbf {h} \in W^{2,p}_\varGamma (\varOmega ,{\mathbb {E}}^3)\ s.t.\ \langle DE[\mathbf {f}^*],\mathbf {h}\rangle _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}\le l,\nonumber \\&\mathrm{and\ if}\ \mathbf {f}^*(\mathbf {x}_1)=\mathbf {f}^*(\mathbf {x}_2)\ \mathrm{and}\ \mathbf {x}_1\not =\mathbf {x}_2\ \mathrm{then}\nonumber \\&\mathbf {h}(\mathbf {x}_1)\cdot \mathbf {n}_{f^*}(\mathbf {x}_1)+ \mathbf {h}(\mathbf {x}_2)\cdot \mathbf {n}_{f^*}(\mathbf {x}_2)\le z(\mathbf {x}_1)+z(\mathbf {x}_2)\Big \},\\ M^-\equiv&\Big \{(l,{z}): l\le 0,\ z(\mathbf {x})\le 0\ \forall \ \mathbf {x}\in \partial \varOmega \Big \}.\nonumber \end{aligned}$$
(21)

Clearly both \(M_{f^*}^+\) and \(M^-\) are convex cones containing the origin, and \(M^-\) has non-empty interior. We claim that if \((l,{z})\in \mathrm{int}\ M^-\), then \((l,{z})\not \in M_{f^*}^+\). Indeed, by virtue of Proposition 3, \((l,{z})\in \mathrm{int}\ M^-\) implies that if \(\mathbf {h}\) satisfies the conditions of (21), then \(\mathbf {h}\in K_{f^*}{\mathcal {A}}\). But then \(l<0\) contradicts Proposition 2.

Since the interior of \(M^-\) is non-empty and disjoint from \(M_{f^*}^+\), the separating hyperplane theorem [8] implies the existence of \((\lambda _0,\sigma _0)\in {\mathbb {R}}\times C(\partial \varOmega )^*\) separating \(M_{f^*}^+\) from \(M^-\) in the sense that

$$\begin{aligned} \lambda _0 l+\langle \sigma _0,z\rangle _{C(\partial \varOmega )}\ge 0 \mathrm{\ for\ all}\ (l,{z})\in M_{f^*}^+, \end{aligned}$$

and

$$\begin{aligned} \lambda _0 l+\langle \sigma _0,z\rangle _{C(\partial \varOmega )}\le 0 \mathrm{\ for\ all}\ (l,{z})\in M^-. \end{aligned}$$

The inequality for \((0,{z})\in M^-\) with \(z\le 0\) implies that \(\sigma _0\) is non-negative.

Next we claim that \(\lambda _0>0\). This follows from the existence of an interior point \((l,{z})\in M_{f^*}^+\) with \((l',{z})\in \mathrm{int}\ M^-\) for some \(l'\), i.e. \({z}(\mathbf {x})<0\) for all \(\mathbf {x}\in \partial \varOmega \). We have shown there exists \(\mathbf {h}\in K_{f^*}^0{\mathcal {A}}\) in Proposition 3. By compactness of \(S_{f^*}\), there is \(\delta >0\) such that \(\mathbf {h}(\mathbf {x}_1)\cdot \mathbf {n}_{f^*}(\mathbf {x}_1)+ \mathbf {h}(\mathbf {x}_2)\cdot \mathbf {n}_{f^*}(\mathbf {x}_2)\le -2\delta \) whenever \(\mathbf {f}^*(\mathbf {x}_1)=\mathbf {f}^*(\mathbf {x}_2)\) and \(\mathbf {x}_1\not =\mathbf {x}_2\). Then we let \(z(\mathbf {x})=-\delta \) for all \(\mathbf {x}\in \partial \varOmega \) and \((l,z)\in M_{f^*}^+\) for \(l\ge \langle DE[\mathbf {f}^*],\mathbf {h}\rangle _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}\). Since l can be chosen arbitrarily large, this shows that \(\lambda _0>0\). We set \(\sigma =\sigma _0/\lambda _0\).
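The sign of \(\lambda _0\) can be traced through the separation inequalities step by step. The following is only a sketch; it assumes, as is standard for the separating hyperplane theorem, that the pair \((\lambda _0,\sigma _0)\) is not identically zero. With \(z\equiv -\delta \) as above, the inequality on \(M_{f^*}^+\) reads

$$\begin{aligned} \lambda _0 l-\delta \langle \sigma _0,1\rangle _{C(\partial \varOmega )}\ge 0\ \mathrm{\ for\ all}\ l\ge \langle DE[\mathbf {f}^*],\mathbf {h}\rangle _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}. \end{aligned}$$

Since \(\sigma _0\) is non-negative, \(\delta \langle \sigma _0,1\rangle _{C(\partial \varOmega )}\ge 0\), and letting \(l\rightarrow \infty \) rules out \(\lambda _0<0\). If \(\lambda _0=0\), the inequality reduces to \(\langle \sigma _0,1\rangle _{C(\partial \varOmega )}\le 0\), which together with \(\sigma _0\ge 0\) forces \(\sigma _0=0\) and contradicts the non-triviality of \((\lambda _0,\sigma _0)\). Hence \(\lambda _0>0\), and the normalization \(\sigma =\sigma _0/\lambda _0\) preserves non-negativity.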

To show the complementary slackness property, suppose that \({z}\in C(\partial \varOmega )\) satisfies \(z(\mathbf {x}_1)+z(\mathbf {x}_2)=0\) wherever \(\mathbf {f}^*(\mathbf {x}_1)=\mathbf {f}^*(\mathbf {x}_2)\) and \(\mathbf {x}_1\not =\mathbf {x}_2\). Then using \(\mathbf {h}=\mathbf {0}\), we find that \((0,{z})\in M_{f^*}^+\) and, by the same reasoning, \((0,-{z})\in M_{f^*}^+\). The separation inequality thus gives both \(\langle \sigma ,{z}\rangle _{C(\partial \varOmega )}\ge 0\) and \(\langle \sigma ,-{z}\rangle _{C(\partial \varOmega )}\ge 0\), and hence \(\langle \sigma ,{z}\rangle _{C(\partial \varOmega )}= 0\). In particular, this implies that \(\sigma \) has support on \(S_{f^*}\). Indeed \(\langle \sigma ,{z}\rangle _{C(\partial \varOmega )}= 0\) for any \(z\in C(\partial \varOmega )\) with support on \(\partial \varOmega \backslash S_{f^*}\); membership of \((0,z)\) in \(M_{f^*}^+\) only restricts the values of z on \(S_{f^*}\).

Given \(\mathbf {h}\in W_\varGamma ^{2,p}(\varOmega ,{\mathbb {E}}^3)\), we choose \(l=\langle DE[\mathbf {f}^*],\mathbf {h}\rangle _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}\) and \({z}(\mathbf {x})=\mathbf {h}(\mathbf {x})\cdot \mathbf {n}_{f^*}(\mathbf {x})\), which implies that \((l,{z})\in M_{f^*}^+\) and

$$\begin{aligned} \langle DE[\mathbf {f}^*],\mathbf {h}\rangle _{W^{2,p}(\varOmega ,{\mathbb {E}}^3)}+\langle \sigma ,\mathbf {h}\cdot \mathbf {n}_{f^*}\rangle _{C(\partial \varOmega )}\ge 0. \end{aligned}$$

Since the opposite inequality follows from the same argument with \(-\mathbf {h}\) in place of \(\mathbf {h}\), we conclude that (17) is satisfied. \(\square \)

6 Concluding remarks

Given the second-gradient theory considered here, it is perhaps surprising that the enforcement of global interior injectivity does not trigger a reactive surface hyper-traction (e.g. the second surface integral on the left side of (19), cf. [13]). Indeed only a surface traction field arises. However, the latter acts only in the direction normal to the deformed surface, i.e., the reaction is frictionless. In the absence of tangential self-contact reactions, no reactive hyper-tractions are induced. Accounting for friction in some manner (not considered here) would undoubtedly induce reactive hyper-tractions as well.

Global injectivity is clearly an important property of physically realistic elastic deformations. The embedding of the Sobolev space \(W^{2,p}(\varOmega ,{\mathbb {E}}^3)\) into the continuously differentiable vector-valued functions (for \(p>3\)) plays a key role in our analysis. In particular, the continuity of the deformation gradient and the smoothness of the boundary enable the use of the outward-normal vectors of the current configuration to characterize the interior cones to the set of admissible displacements. This leads to the existence of a measure-valued Lagrange multiplier satisfying the weak equilibrium equations, along with non-negativity and a balance-of-forces condition. An alternative form of Theorem 1 can be obtained for strongly Lipschitz-continuous boundaries, cf. [9], which will be presented in a future work.

In general, we do not expect additional regularity of energy minimizers in our setting. Even when \(\partial \varOmega \) is smooth, the interfaces at the edges of \(\varGamma \) and \(S_{f^*}\) generally preclude regularity up to the boundary. Moreover, conditions compatible with our assumptions that imply regularity on the interior of the domain are not known.