1 Introduction

We consider problems of static equilibrium in which the primary unknown is the stress field and the solutions minimize a complementary energy subject to equilibrium constraints. Such problems arise, for example, in the limit analysis of solids at collapse, which is characterized by continuing deformations, or yielding, at constant applied loads [10]. In a geometrically linear framework, the elastic strains and the stress remain constant during collapse. Therefore, the plastic strain rate coincides with the total strain rate and is compatible. In addition, the stress is constrained to be in equilibrium and to take values in the elastic domain K, which, for ideal plasticity and in the absence of hardening, is a fixed subset of \(\mathbb {R}^{n\times n}_\mathrm {sym}\). Static theory then aims to minimize over all possible velocities \(v:\Omega \rightarrow \mathbb {R}^n\) compatible with the boundary data \(g_D:\partial \Omega \rightarrow \mathbb {R}^n\), and to maximize over all possible stress fields \(\sigma :\Omega \rightarrow K\) in equilibrium, the plastic dissipation

$$\begin{aligned} \int _\Omega \sigma \cdot Dv \,\mathrm{d}x. \end{aligned}$$
(1.1)

Natural spaces of functions are \(\sigma \in L^\infty (\Omega ;\mathbb {R}^{n\times n}_\mathrm {sym})\) with \(\sigma \in K\) almost everywhere and \(v\in W^{1,p}(\Omega ;\mathbb {R}^n)\) with \(v=g_D\) on \(\partial \Omega \) in the sense of traces. If the elastic domain K is convex, then the mathematical analysis of the problem is straightforward. Thus, the supremum of (1.1) with respect to \(\sigma \) can be taken locally, and the resulting dissipation functional

$$\begin{aligned} \int _\Omega \psi (Dv) \, \mathrm{d}x \end{aligned}$$
(1.2)

can then be minimized over all admissible v. In (1.2), \(\psi (\xi ):=\sup _{\sigma \in K} \sigma \cdot \xi \) is the dissipation potential. Thus, for convex K the classical kinematic problem of limit analysis is recovered. The functional (1.2) is itself convex and, for compact K, coercive, whence existence of minimizers follows by the direct method of the calculus of variations.
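As a concrete illustration of the dissipation potential in (1.2), the following minimal Python sketch evaluates \(\psi \) for a finite set K of symmetric \(2\times 2\) matrices. The finite set, the sample strain rates `xi` and `eta`, and the function names `inner` and `psi` are assumptions made only for illustration. Since \(\psi \) is a supremum of linear functions, it is convex and positively one-homogeneous whatever the shape of K, so it cannot distinguish K from its convex hull.

```python
def inner(a, b):
    """Frobenius inner product a . b of two matrices given as nested lists."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def psi(xi, K):
    """Dissipation potential psi(xi) = sup over sigma in K of sigma . xi (K is finite here)."""
    return max(inner(sigma, xi) for sigma in K)

# Hypothetical finite elastic domain: three symmetric 2x2 stress states.
K = [
    [[1.0, 0.0], [0.0, -1.0]],
    [[-1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.5], [0.5, 0.0]],
]

# Two sample symmetric strain rates (illustrative values only).
xi = [[2.0, 1.0], [1.0, 0.0]]
eta = [[0.0, -3.0], [-3.0, 1.0]]

# Sum of the two strain rates, used to probe subadditivity of psi.
xi_plus_eta = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(xi, eta)]
```

Comparing `psi(xi_plus_eta, K)` with `psi(xi, K) + psi(eta, K)` probes subadditivity, which together with one-homogeneity is equivalent to convexity of \(\psi \).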

However, the elastic domain K of some notable materials is not convex. An illustrative example is silica glass. Indeed, Meade and Jeanloz [11] made measurements of the shear strength of amorphous silica at pressures up to 81 GPa at room temperature and showed that the strength initially decreases sharply as the material is compressed to denser structures of higher coordination and then rises again (Fig. 1a) resulting in a strongly non-convex elastic domain in the pressure-shear stress plane. Several authors [13, 17] have performed molecular dynamics calculations of amorphous solids deforming in pressure-shear and have found that the resulting deformation field forms distinctive patterns to accommodate permanent macroscopic deformations; see Fig. 1b. Remarkably, whereas convex limit analysis is standard [10], the case of non-convex elastic domains does not appear to have been studied.

Fig. 1. (a) Measurements of the shear yield strength of silica glass at pressures up to 81 GPa at room temperature reveal a non-convex elastic domain in pressure-shear space [11, Fig. 1]. Reprinted with permission from The American Association for the Advancement of Science. (b) Molecular dynamics simulations of glass exhibit distinctive patterns in the deformation field [13, Fig. 3]. © IOP Publishing. Reproduced with permission. All rights reserved.

More generally, we may consider static problems where the material response is expressed as

$$\begin{aligned} \varepsilon = \frac{\partial \chi }{\partial \sigma }(x, \sigma ) , \end{aligned}$$
(1.3)

in terms of a complementary energy function \(\chi \). The functional of interest is then the complementary energy

$$\begin{aligned} \sigma \mapsto \int _{\Gamma _D} \sigma (x) \nu (x) \cdot g_D(x) \, \mathrm{d}\mathcal {H}^{n-1} - \int _\Omega \chi (x, \sigma (x)) \, \mathrm{d}x , \end{aligned}$$
(1.4)

to be minimized subject to the equilibrium constraints

$$\begin{aligned}&\mathrm{div} \sigma (x) + b(x) = 0 ,&\text {in } \Omega , \end{aligned}$$
(1.5a)
$$\begin{aligned}&\sigma (x) \nu (x) = h(x) ,&\text {on } \Gamma _N , \end{aligned}$$
(1.5b)

where \(\sigma : \Omega \rightarrow \mathbb {R}^{{n}\times {n}}\) is a local stress field, \(b : \Omega \rightarrow \mathbb {R}^{n}\) are body forces and \(h : \Gamma _N \rightarrow \mathbb {R}^{n}\) applied tractions over the Neumann boundary \(\Gamma _N\subseteq \partial \Omega \). If \(\chi \) is non-convex, the question of relaxation again becomes non-standard and it may be expected to result in the development of microstructure in the form of rapidly oscillatory stress fields.

A powerful mathematical tool for elucidating such questions is furnished by \(\mathcal {A}\)-quasiconvexity, introduced by Fonseca and Müller [5] as a necessary and sufficient condition for the sequential lower-semicontinuity of functionals of the form

$$\begin{aligned} (u,v) \mapsto \int _\Omega f(x, u(x), v(x)) \, \mathrm{d}x , \end{aligned}$$
(1.6)

where \(f : \Omega \times \mathbb {R}^m \times \mathbb {R}^d \rightarrow [0,+\infty )\) is a normal integrand, \(\Omega \subseteq \mathbb {R}^n\) open and bounded, and v must satisfy the differential constraint

$$\begin{aligned} \mathcal {A}\, v = 0 . \end{aligned}$$
(1.7)

Here,

$$\begin{aligned} \mathcal {A}\, v := \sum _{i=1}^n A^{(i)} \frac{\partial v}{\partial x_i} , \end{aligned}$$
(1.8)

where \(A^{(i)} \in \mathrm{Lin}(\mathbb {R}^d; \mathbb {R}^l)\), and \(\mathcal {A}\) is assumed to be a constant-rank partial differential operator. Specifically, \(f(x, u, \cdot )\) is \(\mathcal {A}\)-quasiconvex if

$$\begin{aligned} f(x, u, v) \le \int _Q f(x, u, v + w(y)) \, \mathrm{d}y \end{aligned}$$
(1.9)

for all \(v \in \mathbb {R}^d\) and all \(w \in C^\infty (Q; \mathbb {R}^d)\) such that \(\mathcal {A}w = 0\) and w is Q-periodic, with \(Q = (0,1)^n\). In particular, with \(\mathcal {A}= \mathrm{curl}\), \(\mathcal {A}\)-quasiconvexity reduces to Morrey’s notion of quasiconvexity. In the context of the static problem (1.4) and (1.5), we may identify the state field v with \(\sigma \) and the operative differential operator \(\mathcal {A}\) with \(\mathrm{div}\). The pertinent notion of quasiconvexity is, therefore, \(\mathrm{div}\)-quasiconvexity, acting on fields of symmetric \(n\times n\) matrices. Whereas for kinematic problems of the energy-minimization type there is a well-developed theory of relaxation relating to \(\mathrm{curl}\)-quasiconvexity, the relaxation of static problems of the form (1.4) and (1.5), relating instead to \(\mathrm{div}\)-quasiconvexity, has been less extensively studied.

In this paper, we develop a theory of symmetric \(\mathrm{div}\)-quasiconvex relaxation for static problems. For definiteness, we confine attention to the static problem of limit analysis [10]

$$\begin{aligned} \sup \{ F(\sigma ): \sigma \in L^\infty (\Omega ;K)\} . \end{aligned}$$
(1.10)

Here, \(K \subseteq \mathbb {R}^{n\times n}_{\mathrm{sym}}\) is the elastic domain, which we assume to be compact, and

$$\begin{aligned} F(\sigma ):=\inf _v \Big \{ \int _\Omega \sigma \cdot Dv \, \mathrm{d}x \, : \, v\in W^{1,1}(\Omega ;\mathbb {R}^n), \ v = g_D\text { on } \partial \Omega \Big \} , \end{aligned}$$
(1.11)

where \(g_D\in L^1(\partial \Omega ;\mathbb {R}^n)\) gives the boundary data. The domain \(\Omega \) is assumed to be a bounded Lipschitz domain. The stress field \(\sigma \) is a divergence-free field, which takes values in symmetric matrices. This symmetry sets the present setting apart from previous applications of \(\mathrm {div\,}\)-quasiconvexity, also denoted \(\mathcal {S}\)-quasiconvexity or solenoidal quasiconvexity, which have focused on the characterization of the \(\mathrm {div\,}\)-quasiconvex hull of a 3-point set in relation to the three-well problem in linear elasticity [7, 15, 16] and on the Born-Infeld equations [12]. We call the present setting symmetric \(\mathrm {div\,}\)-quasiconvexity.

In Section 2, we show how the concept of symmetric \(\mathrm {div}\)-quasiconvexity fits within the framework of \(\mathcal {A}\)-quasiconvexity and discuss the relevant properties of symmetric \(\mathrm {div}\)-quasiconvex functions, which mainly follow directly from [5]. We also present in Lemma 2.7 an important example of a nonconvex symmetric \(\mathrm {div}\)-quasiconvex function. Section 3 deals with \(\mathrm {div}\)-quasiconvexity for sets and their hulls, in the context of relaxation theory. An important result, announced in [17, Th. 1 and Th. 2], is Theorem 3.3, which shows that the variational problem (1.10) has a solution if K is symmetric \(\mathrm {div}\)-quasiconvex. We then discuss, in particular, the definition of the symmetric \(\mathrm {div}\)-quasiconvex hull of a set K, which in principle depends on the growth of the class of test functions employed. However, we show that all \(p\in (1,\infty )\) give equivalent definitions (Theorem 3.6). Finally, Section 4 deals with the important case of sets K that can be characterized in terms of the first two stress invariants alone and shows how their symmetric \(\mathrm {div}\)-quasiconvex hulls can be explicitly characterized. We recall that this elastic domain representation is the basis for a broad range of pressure-dependent plasticity models, including the Mohr–Coulomb model of sands ([10] and references therein), the Cam-Clay model of soils ([19] and references therein), the Drucker-Prager model of pressure-dependent metal plasticity ([10] and references therein) and Gurson’s model of porous metal plasticity [8].

2 Symmetric div-quasiconvex Functions

We start by giving the basic definitions and recalling the main results from [5], specializing them to the case of interest here.

Definition 2.1

A Borel-measurable, locally bounded function \(f:\mathbb {R}^{n\times n}_\mathrm {sym}\rightarrow \mathbb {R}\) is symmetric \(\mathrm {div\,}\)-quasiconvex if, for all \(\varphi \in C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym})\) which obey \(\mathrm {div\,}\varphi =0\) everywhere,

$$\begin{aligned} f\biggl (\int _{(0,1)^n} \varphi \, \mathrm{d}x\biggr )\leqq \int _{(0,1)^n} f(\varphi ) \mathrm{d}x\,. \end{aligned}$$
(2.1)

For \(\xi \in \mathbb {R}^{n\times n}_\mathrm {sym}\), the symmetric \(\mathrm {div}\)-quasiconvex envelope of \(f:\mathbb {R}^{n\times n}_\mathrm {sym}\rightarrow \mathbb {R}\) is defined as

$$\begin{aligned} \begin{aligned} \mathcal {Q}_{\mathrm {sdqc}}f(\xi ):=&\inf \left\{ \int _{(0,1)^n} f(\varphi )\mathrm{d}x:\right. \varphi \in C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym}), \\&\left. \mathrm {div\,}\varphi =0, \int _{(0,1)^n} \varphi \, \mathrm{d}x =\xi \right\} . \end{aligned} \end{aligned}$$
(2.2)

We recall that \(C^\infty _\mathrm {per}((0,1)^n)\) is the set of \(\varphi \in C^\infty (\mathbb {R}^n)\) such that \(\varphi (x+e_i) = \varphi (x)\) for \(i=1,\dots ,n\).

Remark 2.2

From the definition it follows that, if f and g are symmetric \(\mathrm {div}\)-quasiconvex, then so are \(\max \{f,g\}\) and \(f+\lambda g\), for any \(\lambda \in [0,\infty )\). Furthermore, all convex functions are symmetric \(\mathrm {div}\)-quasiconvex.

For a generic first-order differential operator of the form given in (1.8) and a wavevector \(w\in \mathbb {R}^n\setminus \{0\}\), the linear operator \({\mathbb {A}}(w)\in \mathrm {Lin}(\mathbb {R}^d;\mathbb {R}^l)\) is defined as

$$\begin{aligned} {\mathbb {A}}(w):= \sum _{i=1}^n A^{(i)} w_i. \end{aligned}$$
(2.3)

The general theory of \(\mathcal {A}\)-quasiconvexity requires that \({\mathbb {A}}\) be constant rank, in the sense that \(\mathop {\mathrm {rank}}{\mathbb {A}}(w)\) does not depend on w (as long as \(w\ne 0\)). We first show that this condition holds in the present case and compute the characteristic cone. We recall that the characteristic cone is the union of the kernels of \({\mathbb {A}}(w)\) over all \(w\ne 0\), and that symmetric \(\mathrm {div}\)-quasiconvex functions are convex in the directions of the characteristic cone.

Lemma 2.3

The condition of being divergence-free is constant rank on symmetric \(n\times n\) matrices. The characteristic cone consists of all non-invertible matrices and spans \(\mathbb {R}^{n\times n}_\mathrm {sym}\).

Proof

Let \(J:\mathbb {R}^{n(n+1)/2}\rightarrow \mathbb {R}^{n\times n}_\mathrm {sym}\) be a linear bijection which maps \(\{e_1,\dots ,e_{n(n+1)/2}\}\) to \(\{e_i\odot e_j\}_{1\leqq i\leqq j\leqq n}\). We recall that \((a\odot b)_{ij}:=\frac{1}{2} (a_ib_j+a_jb_i)\). We define the differential operator \(\mathcal {A}^\mathrm {s-div\,}\) on \(C^\infty (\Omega ;\mathbb {R}^{n(n+1)/2})\) as \(\mathcal {A}^\mathrm {s-div\,}\varphi := \mathrm {div\,}(J\varphi )\). The corresponding linear operator \({\mathbb {A}}^\mathrm {s-div\,}(w)\in \mathrm {Lin}(\mathbb {R}^{n(n+1)/2};\mathbb {R}^n)\), for \(w\in \mathbb {R}^n\), is defined by its action on a vector \(\xi \in \mathbb {R}^{n(n+1)/2}\),

$$\begin{aligned} ({\mathbb {A}}^\mathrm {s-div\,}(w)\xi )_i = \sum _{j=1}^n (J\xi )_{ij}w_j, \end{aligned}$$
(2.4)

which can be written as \({\mathbb {A}}^\mathrm {s-div\,}(w)\xi = (J\xi )w\).

For example, for \(n=2\),

$$\begin{aligned} J\begin{pmatrix} \xi _1\\ \xi _2\\ \xi _3 \end{pmatrix} =\begin{pmatrix} \xi _1 &{} \frac{1}{2}\xi _3 \\ \frac{1}{2}\xi _3 &{} \xi _2 \end{pmatrix} \end{aligned}$$
(2.5)

and

$$\begin{aligned} \begin{aligned} \mathcal {A}^\mathrm {s-div\,}\begin{pmatrix} \varphi _1\\ \varphi _2\\ \varphi _3 \end{pmatrix} = \begin{pmatrix} \partial _1\varphi _1 + \frac{1}{2} \partial _2\varphi _3\\ \partial _2\varphi _2 + \frac{1}{2} \partial _1\varphi _3 \end{pmatrix}, \\ {\mathbb {A}}^\mathrm {s-div\,}\begin{pmatrix} w_1\\ w_2 \end{pmatrix} \begin{pmatrix} \xi _1\\ \xi _2\\ \xi _3\end{pmatrix} = \begin{pmatrix} w_1\xi _1 + \frac{1}{2} w_2\xi _3\\ w_2\xi _2 + \frac{1}{2} w_1\xi _3 \end{pmatrix}. \end{aligned} \end{aligned}$$
(2.6)

We now show that the operator \({\mathbb {A}}^\mathrm {s-div\,}(w)\) is surjective for every \(w\in S^{n-1}\). Indeed, fix any vector \(v\in \mathbb {R}^n\) and let \(F^{v,w}\in \mathbb {R}^{n\times n}_\mathrm {sym}\) be such that \(F^{v,w}w=v\) (for example, let \(F^{v,w}=v\otimes w + w\otimes v - (v\cdot w) w\otimes w\)). Then, choose \(\xi :=J^{-1}(F^{v,w})\) to obtain \({\mathbb {A}}^\mathrm {s-div\,}(w)J^{-1}(F^{v,w}) = F^{v,w}w=v\). Therefore, \({\mathbb {A}}^\mathrm {s-div\,}(w)\) has rank n for all \(w\ne 0\), and the constant-rank condition holds.
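The explicit choice of \(F^{v,w}\) in the surjectivity argument can be checked numerically. The following Python sketch, in which the helper names `sym_preimage` and `matvec` and the sample vectors are assumptions made for illustration, builds \(F^{v,w}=v\otimes w + w\otimes v - (v\cdot w) w\otimes w\) for a unit vector w in \(\mathbb {R}^3\) and evaluates \(F^{v,w}w\).

```python
import math

def sym_preimage(v, w):
    """Return F = v(x)w + w(x)v - (v.w) w(x)w; for a unit vector w, F is symmetric and F w = v."""
    n = len(v)
    vw = sum(a * b for a, b in zip(v, w))  # scalar product v . w
    return [[v[i] * w[j] + w[i] * v[j] - vw * w[i] * w[j] for j in range(n)]
            for i in range(n)]

def matvec(F, w):
    """Matrix-vector product F w for a square matrix given as nested lists."""
    return [sum(F[i][j] * w[j] for j in range(len(w))) for i in range(len(F))]

# Sample data (illustrative only): a unit wavevector w and an arbitrary target v.
w = [1.0 / math.sqrt(3.0)] * 3
v = [1.0, -2.0, 0.5]

F = sym_preimage(v, w)
Fw = matvec(F, w)  # should reproduce v up to rounding
```

Up to rounding, `Fw` reproduces `v`, which is the identity \(F^{v,w}w=v\) used above; the unit length of w is essential for this formula.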

The characteristic cone, first introduced by Murat and Tartar [14, 21], is defined as

$$\begin{aligned} \Lambda := \bigcup _{w\in S^{n-1}} \ker {\mathbb {A}}^\mathrm {s-div\,}(w) \subseteq \mathbb {R}^{n(n+1)/2}. \end{aligned}$$
(2.7)

In the present context, the cone \(\Lambda \) may be identified (via the mapping J) with the set of non-invertible matrices,

$$\begin{aligned} J\Lambda = \bigcup _{w\in S^{n-1}} \{\sigma \in \mathbb {R}^{n\times n}_\mathrm {sym}: \sigma w =0\} = \{\sigma \in \mathbb {R}^{n\times n}_\mathrm {sym}: \det \sigma =0\}. \end{aligned}$$
(2.8)

\(\square \)

The next three results are essentially special cases of more general assertions that hold within the framework of \(\mathcal {A}\)-quasiconvexity in [5]. For convenience, we restate here the statements that are needed in the following:

Lemma 2.4

Let f be symmetric \(\mathrm {div}\)-quasiconvex. Then, it is convex along all non-invertible directions, in the sense that \(f(\lambda A + (1-\lambda )B)\leqq \lambda f(A)+(1-\lambda ) f(B)\) whenever \(\lambda \in [0,1]\), \(A,B\in \mathbb {R}^{n\times n}_\mathrm {sym}\), \(\det (A-B)=0\). Furthermore, all such f are locally Lipschitz continuous.

Proof

If f is upper semicontinuous, then the assertion follows directly from [5, Prop. 3.4] using Lemma 2.3. Here, we give a direct proof without assuming upper semicontinuity.

We first assume that there is a vector \(\nu \in \mathbb {Q}^{n}\setminus \{0\}\) such that \((A-B)\nu =0\). We let \(h:\mathbb {R}\rightarrow \{0,1\}\) be one-periodic, with \(h(t)=0\) for \(t\in (0,\lambda )\) and \(h(t)=1\) for \(t\in (\lambda ,1)\). We choose \(M\in \mathbb {N}\) such that \(M\nu \in \mathbb {Z}^n\) and define \(u(x):=A+(B-A) h(Mx\cdot \nu )\). From \(Me_i\cdot \nu =M\nu _i\in \mathbb {Z}\), we deduce that \(u(x+e_i)=u(x)\) for all i. Furthermore, \(\mathrm {div\,}u=0\) in the sense of distributions, \(|\{u=A\}\cap (0,1)^n|=\lambda \), and \(|\{u=B\}\cap (0,1)^n|=1-\lambda \), which implies \(\int _{(0,1)^n} u \, \mathrm{d}x = \lambda A + (1-\lambda ) B\).

Let \(\theta _\varepsilon \in C^\infty _c(B_\varepsilon )\) be a mollifier. Then, \(u*\theta _\varepsilon \in C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym})\) and, therefore, by (2.1), we obtain

$$\begin{aligned} f(\lambda A +(1-\lambda )B)\leqq \int _{(0,1)^n} f(u*\theta _\varepsilon ) \mathrm{d}x. \end{aligned}$$
(2.9)

Since f is locally bounded, u is bounded and \(|\{u*\theta _\varepsilon \ne u\}\cap (0,1)^n|\rightarrow 0\). Taking the limit \(\varepsilon \rightarrow 0\), we deduce that

$$\begin{aligned} f(\lambda A+(1-\lambda ) B)\leqq \int _{(0,1)^n} f(u)\, \mathrm{d}x =\lambda f(A)+(1-\lambda ) f(B) \end{aligned}$$
(2.10)

whenever A and B are such that \((A-B)\nu =0\) for some \(\nu \in \mathbb {Q}^n\). In particular, f is separately convex and finite-valued, hence locally Lipschitz continuous.

Consider now any two matrices A, B and a vector \(w\in S^{n-1}\) such that \((A-B)w=0\). We choose \(\nu _j\in \mathbb {Q}^n\) such that \(\nu _j\rightarrow w\), which implies \((A-B)\nu _j\rightarrow 0\). Let now \(B_j:=B+(A-B)\nu _j\otimes \nu _j/|\nu _j|^2\). Then, \((A-B_j)\nu _j=0\), hence \(f(\lambda A+(1-\lambda )B_j)\leqq \lambda f(A)+(1-\lambda ) f(B_j)\). Taking \(j\rightarrow \infty \), by continuity of f we conclude the proof. \(\quad \square \)
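The laminate used in the first step of the proof can be made concrete with a small numerical experiment. The Python sketch below is an illustration under stated assumptions: it takes \(n=2\), hypothetical matrices A and B with \((A-B)\nu =0\) for the integer vector \(\nu =(1,2)\) (so that \(M=1\)), and approximates the average of \(u(x)=A+(B-A)h(x\cdot \nu )\) over the unit square by a midpoint rule. The function name `laminate_average` and the grid size are choices made here.

```python
import math

def laminate_average(A, B, nu, lam, N=200):
    """Midpoint-rule average over (0,1)^2 of u(x) = A + (B - A) h(x . nu).

    h is one-periodic with h = 0 on (0, lam) and h = 1 on (lam, 1);
    nu has integer entries, so M = 1 and u is (0,1)^2-periodic.
    """
    avg = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(N):
        for j in range(N):
            t = ((i + 0.5) * nu[0] + (j + 0.5) * nu[1]) / N  # x . nu at a cell midpoint
            h = 0.0 if t - math.floor(t) < lam else 1.0
            for a in range(2):
                for b in range(2):
                    avg[a][b] += (A[a][b] + (B[a][b] - A[a][b]) * h) / N**2
    return avg

# Hypothetical data: A - B = a (x) a with a = (2, -1), which is orthogonal to nu = (1, 2).
A = [[1.0, 0.0], [0.0, 2.0]]
B = [[-3.0, 2.0], [2.0, 1.0]]
nu = [1, 2]
lam = 0.3

# Jump condition (A - B) nu, which makes the laminate divergence-free across interfaces.
jump = [sum((A[i][j] - B[i][j]) * nu[j] for j in range(2)) for i in range(2)]
average = laminate_average(A, B, nu, lam)
target = [[lam * A[i][j] + (1 - lam) * B[i][j] for j in range(2)] for i in range(2)]
```

Here `jump` vanishes, so the normal component of u is continuous across the interfaces, and `average` agrees with `target`, the convex combination \(\lambda A+(1-\lambda )B\) appearing on the left-hand side of (2.10).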

Lemma 2.5

(i) Let f be symmetric \(\mathrm {div}\)-quasiconvex, \(u_j\overset{*}{\rightharpoonup }u\) weakly in \(L^\infty (\Omega ;\mathbb {R}^{n\times n}_\mathrm {sym})\), \(\mathrm {div\,}u_j=0\) in the sense of distributions. Then,

$$\begin{aligned} \int _\Omega f(u(x))\mathrm{d}x\leqq \liminf _{j\rightarrow \infty } \int _\Omega f(u_j(x))\mathrm{d}x. \end{aligned}$$
(2.11)

(ii) Let f be symmetric \(\mathrm {div}\)-quasiconvex, \(f(\xi )\leqq c(|\xi |^p+1)\) for some \(p\in [1,\infty )\), \(u_j{\rightharpoonup }u\) weakly in \(L^p(\Omega ;\mathbb {R}^{n\times n}_\mathrm {sym})\), \(\mathrm {div\,}u_j=0\) in the sense of distributions. Then,

$$\begin{aligned} \int _\Omega f(u(x))\mathrm{d}x\leqq \liminf _{j\rightarrow \infty } \int _\Omega f(u_j(x))\mathrm{d}x. \end{aligned}$$
(2.12)

Proof

Lemma 2.4 shows that f is continuous. The result follows then immediately from [5, Th. 3.7] using Lemma 2.3. \(\quad \square \)

Lemma 2.6

Let \(f\in C^0(\mathbb {R}^{n\times n}_\mathrm {sym};[0,\infty ))\). Then, \(\mathcal {Q}_{\mathrm {sdqc}}f\) is symmetric \(\mathrm {div}\)-quasiconvex.

Proof

Follows from [5, Prop. 3.4]. \(\quad \square \)

We now recall an important example of a nontrivial symmetric \(\mathrm {div}\)-quasiconvex function, due to Luc Tartar.

Lemma 2.7

(From [23]) The function \(f_\mathrm {T}:\mathbb {R}^{n\times n}_\mathrm {sym}\rightarrow \mathbb {R}\), \(f_\mathrm {T}(\sigma ):=(n-1)|\sigma |^2-(\mathop {\mathrm {Tr}}\sigma )^2\), is symmetric \(\mathrm {div\,}\)-quasiconvex.

For completeness, we provide a short proof of this result, which plays an important role in the explicit examples discussed in Section 4.

Proof

We first observe that, for any matrix \(A\in \mathbb {C}^{n\times n}\), we have

$$\begin{aligned} (\mathop {\mathrm {rank}}A) |A|^2 \geqq |\mathop {\mathrm {Tr}}A|^2. \end{aligned}$$
(2.13)

To verify this inequality, it suffices to write A in a basis in which only the first \(\mathop {\mathrm {rank}}A\) diagonal entries are nonzero and then to apply to these entries the basic inequality \(|\sum _i A_{ii}|^2\leqq (\mathop {\mathrm {rank}}A) \sum _i |A_{ii}|^2\). We now show that for any \(\varphi \in C^1_\mathrm {per}((0,1)^n;\mathbb {R}^{n\times n})\) with \(\mathrm {div\,}\varphi =0\) the functional \(I(\varphi ):=\int _{(0,1)^n} f_\mathrm {T}(\varphi (x))\mathrm{d}x\) is nonnegative. Indeed, letting \({\hat{\varphi }}_\lambda \) be the Fourier coefficients of \(\varphi \), by Plancherel’s theorem we have

$$\begin{aligned} \int _{(0,1)^n} f_\mathrm {T}(\varphi )\, \mathrm{d}x=\sum _{\lambda \in 2\pi \mathbb {Z}^n} \left[ (n-1) |{\hat{\varphi }}_\lambda |^2 - |\mathop {\mathrm {Tr}}{\hat{\varphi }}_\lambda |^2\right] \geqq 0 , \end{aligned}$$
(2.14)

where we have used (2.13) and the fact that \(\mathrm {div\,}\varphi =0\) implies \({\hat{\varphi }}_\lambda \lambda = 0\) and therefore \(\mathop {\mathrm {rank}}{\hat{\varphi }}_\lambda \leqq n-1\). Let now \(\varphi \) be as in the definition of \(\mathrm {div\,}\)-quasiconvexity, \(\xi :=\int _{(0,1)^n} \varphi \, \mathrm{d}x\). Since \(f_\mathrm {T}\) is quadratic and \(\varphi -\xi \) has average zero, expanding, we obtain

$$\begin{aligned} \int _{(0,1)^n} f_\mathrm {T}(\varphi ) \mathrm{d}x = f_\mathrm {T}(\xi ) +\int _{(0,1)^n} f_\mathrm {T}(\varphi -\xi ) \mathrm{d}x \geqq f_\mathrm {T}(\xi ). \end{aligned}$$
(2.15)

\(\square \)
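The two features of \(f_\mathrm {T}\) used above, namely that it is nonnegative on non-invertible symmetric matrices by (2.13) and that it is not convex, can be illustrated numerically. The Python sketch below is a sanity check under stated assumptions, not part of the proof: the helper names `tartar` and `random_singular_symmetric`, the dimension \(n=3\), the seed and the number of samples are all choices made here. Singular symmetric matrices are generated as \(PMP\) with \(P=\mathrm {Id}-w\otimes w\) for a random unit vector w and a random symmetric M.

```python
import math
import random

def tartar(sigma):
    """Tartar's function f_T(sigma) = (n - 1) |sigma|^2 - (Tr sigma)^2."""
    n = len(sigma)
    norm2 = sum(x * x for row in sigma for x in row)
    trace = sum(sigma[i][i] for i in range(n))
    return (n - 1) * norm2 - trace * trace

def random_singular_symmetric(n, rng):
    """Return (S, w) with S = P M P, P = I - w(x)w, so that S is symmetric and S w = 0."""
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in w))
    w = [x / norm for x in w]
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            M[i][j] = M[j][i] = rng.gauss(0.0, 1.0)
    P = [[(1.0 if i == j else 0.0) - w[i] * w[j] for j in range(n)] for i in range(n)]
    PM = [[sum(P[i][k] * M[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    S = [[sum(PM[i][k] * P[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return S, w

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
samples = [random_singular_symmetric(3, rng) for _ in range(100)]
values = [tartar(S) for S, _ in samples]        # expected to be nonnegative up to rounding
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
value_at_identity = tartar(identity)            # (n - 1) n - n^2 = -n
```

Since \(f_\mathrm {T}(0)=0\) and \(f_\mathrm {T}(t\,\mathrm {Id})=-nt^2\), the function is strictly concave along the invertible direction \(\mathrm {Id}\), while the sampled values on singular matrices remain nonnegative.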

We close this section with a brief discussion of the relation to \(\mathrm {div\,}\)-quasiconvexity. In particular, we show that symmetric \(\mathrm {div\,}\)-quasiconvexity is not equivalent to \(\mathrm {div\,}\)-quasiconvexity composed with projection to symmetric matrices. We recall that a Borel-measurable, locally bounded function \(f:\mathbb {R}^{m\times n}\rightarrow \mathbb {R}\) is \(\mathrm {div\,}\)-quasiconvex if, for every \(\varphi \in C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{m\times n})\) such that \(\mathrm {div\,}\varphi =0\) everywhere,

$$\begin{aligned} f\left( \int _{(0,1)^n} \varphi \,\mathrm{d}x\right) \leqq \int _{(0,1)^n} f(\varphi ) \mathrm{d}x\,. \end{aligned}$$
(2.16)

Lemma 2.8

For a given function \(f:\mathbb {R}^{n\times n}_\mathrm {sym}\rightarrow \mathbb {R}\), we define \(\mathcal {S}f:\mathbb {R}^{n\times n}\rightarrow \mathbb {R}\) as \(\mathcal {S}f(\xi ) := f((\xi +\xi ^T)/2)\). If \(\mathcal {S}f\) is \(\mathrm {div\,}\)-quasiconvex, then f is symmetric \(\mathrm {div\,}\)-quasiconvex. However, there are symmetric \(\mathrm {div\,}\)-quasiconvex functions f such that the corresponding \(\mathcal {S}f\) is not \(\mathrm {div\,}\)-quasiconvex.

Proof

In order to prove that f is symmetric \(\mathrm {div\,}\)-quasiconvex, we pick \(\varphi \in C^\infty _\mathrm {per}((0,1)^n; \mathbb {R}^{n\times n}_\mathrm {sym})\) with \(\mathrm {div\,}\varphi =0\) and observe that

$$\begin{aligned}&f\left( \int _{(0,1)^n} \varphi \,\mathrm{d}x\right) =\mathcal {S}f\left( \int _{(0,1)^n} \varphi \,\mathrm{d}x\right) \nonumber \\&\quad \leqq \int _{(0,1)^n} \mathcal {S}f(\varphi )\mathrm{d}x=\int _{(0,1)^n} f(\varphi )\mathrm{d}x. \end{aligned}$$
(2.17)

To prove the second assertion, we consider \(n=2\) and \(f(F)=\det (F)\), so that

$$\begin{aligned} \mathcal {S}f(F)=\det \frac{F+F^T}{2} = \det F - \frac{1}{4} (F_{12}-F_{21})^2. \end{aligned}$$
(2.18)

We first check that f is symmetric \(\mathrm {div\,}\)-quasiconvex. Let \(\xi \in \mathbb {R}^{2\times 2}_\mathrm {sym}\), \(\varphi \in C^\infty _\mathrm {per}([0,1]^2;\mathbb {R}^{2\times 2}_\mathrm {sym})\) with \(\mathrm {div\,}\varphi =0\) and \(\int _{(0,1)^n} \varphi \mathrm{d}x=0\). Then, there is \(v\in C^\infty (\mathbb {R}^2;\mathbb {R}^2)\) with \(Dv={}^\perp \varphi ^{\perp }\), where by this compact notation we mean \(Dv=R\varphi R\), with \(R=e_1\otimes e_2-e_2\otimes e_1\). Since \(\varphi \) has average 0 and is periodic, we can choose \(v\in C^\infty _\mathrm {per}([0,1]^2;\mathbb {R}^2)\). In particular,

$$\begin{aligned} \int _{[0,1]^2} f(\xi +\varphi )\mathrm{d}x = \det \xi + \int _{[0,1]^2} \det Dv \mathrm{d}x = \det \xi = f(\xi ). \end{aligned}$$
(2.19)

At the same time, the function \(\varphi (x):=e_1\otimes e_2 \sin (2\pi x_1)\) is \([0,1]^2\)-periodic, divergence-free, has average 0, and gives

$$\begin{aligned} \int _{[0,1]^2} \mathcal {S}f(\varphi ) \mathrm{d}x= -\frac{1}{4} \int _{[0,1]^2} \sin ^2(2\pi x_1) \mathrm{d}x = -\frac{1}{8} < 0=\mathcal {S}f(0). \end{aligned}$$
(2.20)

\(\square \)
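The counterexample in (2.20) lends itself to a direct numerical check. The Python sketch below, whose function names and grid size are assumptions made for illustration, evaluates \(\mathcal {S}f\) for \(f=\det \) on the field \(\varphi (x)=e_1\otimes e_2 \sin (2\pi x_1)\) and averages it with a midpoint rule in \(x_1\); the field does not depend on \(x_2\), so a one-dimensional sum suffices.

```python
import math

def det2(F):
    """Determinant of a 2x2 matrix given as nested lists."""
    return F[0][0] * F[1][1] - F[0][1] * F[1][0]

def S_det(F):
    """S f(F) = det((F + F^T) / 2) for f = det and n = 2."""
    s = 0.5 * (F[0][1] + F[1][0])  # off-diagonal entry of the symmetric part
    return F[0][0] * F[1][1] - s * s

def phi(x1):
    """phi(x) = e_1 (x) e_2 sin(2 pi x_1): only the (1,2) entry is nonzero."""
    return [[0.0, math.sin(2.0 * math.pi * x1)], [0.0, 0.0]]

def average_S_det(N=1000):
    """Midpoint-rule average of S f(phi) over the unit square (phi is independent of x_2)."""
    return sum(S_det(phi((i + 0.5) / N)) for i in range(N)) / N

# Sample non-symmetric matrix (illustrative) for checking the identity (2.18).
F_sample = [[1.0, 2.0], [3.0, 4.0]]
identity_gap = S_det(F_sample) - (det2(F_sample) - 0.25 * (F_sample[0][1] - F_sample[1][0]) ** 2)

mean_value = average_S_det()
```

The value `mean_value` approximates the integral in (2.20) and is strictly below \(\mathcal {S}f(0)=0\), while `identity_gap` vanishes up to rounding, in agreement with (2.18).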

3 Symmetric \(\mathrm {div}\)-quasiconvex Sets and Hulls

3.1 Symmetric \(\mathrm {div}\)-quasiconvex Sets

In this section, we discuss symmetric \(\mathrm {div\,}\)-quasiconvexity of sets and their hulls. As in the case of quasiconvexity, there are different possible definitions of the hulls, depending on the growth that is assumed. For quasiconvexity, it has been shown that the p-quasiconvex hull of a compact set does not depend on the assumed growth p. The key technical ingredient is Zhang’s truncation lemma; see [26]. In the present setting, we can only prove the corresponding result for \(1<p<\infty \), since the bounds on the potentials of the oscillatory fields are based on singular-integral estimates that hold only in that range, see Lemma 3.13 below. For clarity we give separate definitions for \(p\in [1,\infty ]\).

Definition 3.1

A compact set \(K \subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) is symmetric \(\mathrm {div}\)-quasiconvex if, for any \(\xi \in \mathbb {R}^{n\times n}_\mathrm {sym}\setminus K\), there is a symmetric \(\mathrm{div}\)-quasiconvex function \(g\in C^0(\mathbb {R}^{n\times n}_\mathrm {sym};[0,\infty ))\) such that \(g(\xi )>\max g(K)\).

A compact set \(K \subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) is p-symmetric \(\mathrm {div}\)-quasiconvex, with \(p\in [1,\infty )\), if the function g can be chosen to have p-growth, in the sense that \(g(\sigma )\leqq c(|\sigma |^p+1)\) for some \(c\in \mathbb {R}\) and all \(\sigma \in \mathbb {R}^{n\times n}_\mathrm {sym}\).

We remark that the function g can be chosen so that it vanishes on K by replacing it with \({\hat{g}}:=\max \{g-\max g(K),0\}\).

It is clear that if K is p-symmetric \(\mathrm {div}\)-quasiconvex for some p then it is symmetric \(\mathrm {div}\)-quasiconvex. As in the case of quasiconvexity, the definition for noncompact sets depends crucially on growth and many variants are possible. We do not discuss this case here.

Lemma 3.2

Let \(K\subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) be compact and symmetric \(\mathrm {div\,}\)-quasiconvex, \(E:=\{\sigma \in L^\infty (\Omega ;K): \mathrm {div\,}\sigma =0\}\). Then, E is closed with respect to weak-\(*\) convergence in \(L^\infty (\Omega ;\mathbb {R}^{n\times n}_\mathrm {sym})\).

Proof

Let \(\sigma _j\in E\) be such that \(\sigma _j\overset{*}{\rightharpoonup }\sigma \) in \(L^\infty (\Omega ;\mathbb {R}^{n\times n}_\mathrm {sym})\).

For any \(\xi \in \mathbb {R}^{n\times n}_\mathrm {sym}\setminus K\), there is a symmetric \(\mathrm {div}\)-quasiconvex function \(g_\xi \in C^0(\mathbb {R}^{n\times n}_\mathrm {sym};[0,\infty ))\) which vanishes on K and with \(g_\xi (\xi )>0\). By continuity, \(g_\xi >0\) on \(B_{r_\xi }(\xi )\), for some \(r_\xi >0\). The set \(\mathbb {R}^{n\times n}_\mathrm {sym}\setminus K\) can be covered by countably many such balls \(B_i\). Let \(g_i\) be the corresponding functions. It suffices to show that \(\{x: \sigma (x)\in B_i\}\) is a null set for any i.

By Lemma 2.5(i), recalling that \(\sigma _j\in K\) almost everywhere for all j, we obtain \(\int _\Omega g_i(\sigma )\mathrm{d}x\leqq \liminf _{j\rightarrow \infty } \int _\Omega g_i(\sigma _j)\mathrm{d}x=0\). This implies that \(g_i(\sigma (x))= 0\) almost everywhere. Since \(g_i>0\) on \(B_i\) we obtain that \(\{x: \sigma (x)\in B_i\}\) is a null set, which concludes the proof. \(\quad \square \)

We are now ready to prove our first main result, namely, an existence statement for static problems with symmetric \(\mathrm {div}\)-quasiconvex yield sets. We refer to the introduction for the formulation and the main definitions and recall in particular that \(g_D\in L^1(\partial \Omega ;\mathbb {R}^n)\) denotes the boundary data.

Theorem 3.3

If K is nonempty and symmetric \(\mathrm {div}\)-quasiconvex, then F is weakly upper semicontinuous and the problem defined in (1.10) and (1.11) has a solution \(\sigma _*\in L^\infty (\Omega ;K)\), which obeys \(\mathrm {div\,}\sigma _*=0\) in the sense of distributions.

Proof

We first prove that \(\sup F\in \mathbb {R}\).

Let \(\xi _0\in K\). Using the constant function \(\sigma =\xi _0\) gives

$$\begin{aligned} F(\xi _0)=\xi _0\cdot \int _\Omega Dv\, \mathrm{d}x = \xi _0\cdot \int _{\partial \Omega } g_D\otimes \nu \, \mathrm{d}\mathcal {H}^{n-1}\in \mathbb {R}, \end{aligned}$$
(3.1)

hence \(\sup F\ne -\infty \).

By the trace theorem for \(W^{1,1}\) (see for example [1, p. 168]), we can extend \(g_D\) to a function in \(W^{1,1}(\Omega ;\mathbb {R}^n)\), which we shall also denote \(g_D\). For any \(\sigma \in L^\infty (\Omega ;K)\) we have

$$\begin{aligned} F(\sigma )\leqq \int _\Omega \sigma \cdot Dg_D\, \mathrm{d}x\leqq \Vert g_D\Vert _{W^{1,1}}\max \{|\xi |: \xi \in K\}, \end{aligned}$$
(3.2)

hence \(\sup F\ne +\infty \).

Next, we show that only fields \(\sigma \) that are divergence-free need be considered. If we assume additional regularity, then an integration by parts gives

$$\begin{aligned} \int _\Omega \sigma \cdot Dv\, \mathrm{d}x = \int _{\partial \Omega } \sigma g_D\cdot \nu \mathrm{d}\mathcal {H}^{n-1} - \int _\Omega v \cdot \mathrm {div\,}\sigma \, \mathrm{d}x, \end{aligned}$$
(3.3)

which does not contain any derivative of v. In particular, the \(\inf \) is \(-\infty \) unless \(\mathrm {div\,}\sigma =0\) almost everywhere.

Consider now a generic \(\sigma \in L^\infty (\Omega ;\mathbb {R}^{n\times n}_\mathrm {sym})\). If \(\mathrm {div\,}\sigma \ne 0\) in the sense of distributions, then there is \(\theta \in C^\infty _c(\Omega ;\mathbb {R}^n)\) such that \(\int _\Omega \sigma \cdot D\theta \, \mathrm{d}x\ne 0\). We consider the one-parameter family of test functions \(v_t:=g_D+t\theta \) and obtain

$$\begin{aligned} F(\sigma ) \leqq \int _\Omega \sigma \cdot Dv_t\, \mathrm{d}x = \int _\Omega \sigma \cdot Dg_D\, \mathrm{d}x + t \int _\Omega \sigma \cdot D\theta \,\mathrm{d}x\,\,\, \text { for all } t\in \mathbb {R},\quad \end{aligned}$$
(3.4)

which shows that \(F(\sigma )=-\infty \). Therefore, we can restrict attention to fields \(\sigma \) that are divergence-free in the sense of distributions.

Let \(\sigma _k\in L^\infty (\Omega ;K)\) be a maximizing sequence. By the preceding argument, \(\mathrm {div\,}\sigma _k=0\) in the sense of distributions. Since the sequence is bounded in \(L^\infty \), after extracting a subsequence it converges weak-\(*\) to some \(\sigma _*\). Since weak-\(*\) convergence implies convergence in the sense of distributions, \(\mathrm {div\,}\sigma _*=0\). Lemma 3.2 implies that \(\sigma _*\in K\) almost everywhere. Hence, we only need to show that it is a maximizer. For any \(v\in W^{1,1}(\Omega ;\mathbb {R}^n)\) with \(v=g_D\) on the boundary we have

$$\begin{aligned} \int _\Omega \sigma _*\cdot Dv \, \mathrm{d}x = \lim _{k\rightarrow \infty } \int _\Omega \sigma _k\cdot Dv \, \mathrm{d}x \geqq \limsup _{k\rightarrow \infty } F(\sigma _k), \end{aligned}$$
(3.5)

hence,

$$\begin{aligned} F(\sigma _*)\geqq \limsup _{k\rightarrow \infty } F(\sigma _k) = \sup F. \end{aligned}$$
(3.6)

\(\square \)

3.2 Symmetric \(\mathrm {div\,}\)-quasiconvex Hulls

We now deal with the case that K is not symmetric \(\mathrm {div}\)-quasiconvex. Within the framework of relaxation theory, we begin by defining the symmetric \(\mathrm {div}\)-quasiconvex hull.

Definition 3.4

Let \(K\subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) be compact, \(p\in [1,\infty )\), \(f_p(\xi ):={\text {dist}}^p(\xi ,K)\). We define

$$\begin{aligned} K^{(p)}:=\{\xi \in \mathbb {R}^{n\times n}_\mathrm {sym}: \mathcal {Q}_{\mathrm {sdqc}}f_p(\xi )=0\} \end{aligned}$$
(3.7)

and

$$\begin{aligned} \begin{aligned} K^{(\infty )}:= \{&{\xi } \in \mathbb {R}^{n\times n}_\mathrm {sym}\, : \, g(\xi ) \leqq \max g(K) \\&\text { for all symmetric } \mathrm{div}\text {-quasiconvex }g\in C^0(\mathbb {R}^{n\times n}_\mathrm {sym};[0,\infty )) \} . \end{aligned} \end{aligned}$$
(3.8)

Lemma 3.5

\(K^{(\infty )}\) is the smallest symmetric \(\mathrm {div}\)-quasiconvex compact set that contains K. \(K^{(p)}\) is the smallest p-symmetric \(\mathrm {div}\)-quasiconvex compact set that contains K.

As usual, the first assertion means that any symmetric \(\mathrm {div}\)-quasiconvex compact set that contains K also contains \(K^{(\infty )}\), and analogously for the second.

Proof

We start with \(K^{(p)}\). By Lemma 2.6 the function \(\mathcal {Q}_{\mathrm {sdqc}}f_p\) is symmetric \(\mathrm {div}\)-quasiconvex. From \(\mathcal {Q}_{\mathrm {sdqc}}f_p\leqq f_p\) it follows that \(\mathcal {Q}_{\mathrm {sdqc}}f_p\) has p-growth and that \(K\subseteq K^{(p)}\). If \(\xi \in \mathbb {R}^{n\times n}_\mathrm {sym}\setminus K^{(p)}\), then \(\mathcal {Q}_{\mathrm {sdqc}}f_p(\xi )>0=\max \mathcal {Q}_{\mathrm {sdqc}}f_p(K^{(p)})\). Therefore, \(K^{(p)}\) is p-symmetric \(\mathrm {div}\)-quasiconvex.

To show minimality, we consider a p-symmetric \(\mathrm {div}\)-quasiconvex compact set \({\tilde{K}}\) with \(K\subseteq {\tilde{K}}\) and show that \(K^{(p)}\subseteq {\tilde{K}}\). To this end, we fix a \(\xi \in K^{(p)}\) and a symmetric \(\mathrm {div}\)-quasiconvex function g with p-growth and show that \(g(\xi )\leqq \max g(K)\leqq \max g({\tilde{K}})\). If this holds for any such function g, then necessarily \(\xi \in {\tilde{K}}\), which implies that \(K^{(p)}\subseteq {\tilde{K}}\) and concludes the proof.

It remains to show that \(g(\xi )\leqq \max g(K)\). Let \(\varepsilon >0\). Since g is continuous and \(f_p>0\) outside K, there is \(\delta >0\) such that \(g(\sigma )\leqq \max g(K)+\varepsilon \) for all \(\sigma \) with \(f_p(\sigma )\leqq \delta \). Using the fact that g has p-growth, we then obtain \(g\leqq \max g(K)+\varepsilon + C_\varepsilon f_p\) pointwise. By monotonicity of the symmetric \(\mathrm {div}\)-quasiconvex envelope, this gives \(g=\mathcal {Q}_{\mathrm {sdqc}}g \leqq \max g(K)+\varepsilon +C_\varepsilon \mathcal {Q}_{\mathrm {sdqc}}f_p\) pointwise and, therefore, \(g(\xi )\leqq \max g(K)+\varepsilon \). Since \(\varepsilon \) is arbitrary, this concludes the proof.
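The pointwise bound used in the preceding paragraph can be made explicit as follows. This is only a sketch under assumptions not stated in this proof: that p-growth means \(g(\sigma )\leqq c(1+|\sigma |^p)\) for some \(c>0\), and with the auxiliary notation \(R:=\max \{|\eta |:\eta \in K\}\) and \(c':=c+|\max g(K)|\), both introduced here for illustration. Since \(|\sigma |\leqq {\text {dist}}(\sigma ,K)+R\) and \((a+b)^p\leqq 2^{p-1}(a^p+b^p)\) for \(a,b\geqq 0\), on the set \(\{f_p>\delta \}\) we have

$$\begin{aligned} \begin{aligned} g(\sigma )-\max g(K)-\varepsilon \leqq&\, c'(1+|\sigma |^p) \leqq c'\bigl (1+2^{p-1}R^p+2^{p-1}f_p(\sigma )\bigr )\\ \leqq&\, c'\Bigl (\frac{1+2^{p-1}R^p}{\delta }+2^{p-1}\Bigr ) f_p(\sigma ) =: C_\varepsilon f_p(\sigma ), \end{aligned} \end{aligned}$$

whereas on \(\{f_p\leqq \delta \}\) the left-hand side is nonpositive by the choice of \(\delta \), and \(f_p\geqq 0\). The constant depends on \(\varepsilon \) only through \(\delta \).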

We now treat the \(p=\infty \) case. The fact that \(K\subseteq K^{(\infty )}\) is obvious. To show that \(K^{(\infty )}\) is symmetric \(\mathrm {div}\)-quasiconvex, we pick \(\xi \not \in K^{(\infty )}\). By the definition of \(K^{(\infty )}\), there is a symmetric \(\mathrm {div}\)-quasiconvex function g with \(g(\xi )>\max g(K)\). At the same time, for any \(\sigma \in K^{(\infty )}\) it follows that \(g(\sigma )\leqq \max g(K)\), which implies \(\max g(K^{(\infty )})=\max g(K)\). We conclude that \(g(\xi )>\max g(K^{(\infty )})\), which shows that \(K^{(\infty )}\) is symmetric \(\mathrm {div}\)-quasiconvex.

To show minimality, we assume that \({\tilde{K}}\) is symmetric \(\mathrm {div}\)-quasiconvex and \(K\subseteq {\tilde{K}}\). We wish to show that \(K^{(\infty )}\subseteq {\tilde{K}}\). To this end, we fix a \(\xi \in \mathbb {R}^{n\times n}_\mathrm {sym}\setminus {\tilde{K}}\) and choose a symmetric \(\mathrm {div}\)-quasiconvex function g with \(g(\xi )>\max g({\tilde{K}})\). From \(K\subseteq {\tilde{K}}\), we obtain \(\max g({\tilde{K}})\geqq \max g(K)\). Therefore, \(\xi \not \in K^{(\infty )}\). This implies \(K^{(\infty )}\subseteq {\tilde{K}}\) and concludes the proof. \(\quad \square \)

We proceed to show that \(K^{(p)}\) does not depend on p, as long as \(p\in (1,\infty )\). One inclusion can easily be obtained from the definition. The other will be discussed in Section 3.3 below.

Theorem 3.6

Let \(K\subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) be compact, \(1<p<q<\infty \). Then, \(K^{(p)}=K^{(q)}\).

Proof

Follows from Lemmas 3.8 and 3.15 below. \(\quad \square \)

Definition 3.7

Let \(K\subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) be compact. For every \(p\in (1,\infty )\), we set \(K^{\mathrm {sdqc}}=K^{(p)}\). This is admissible by Theorem 3.6.

Lemma 3.8

Let \(K\subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) be compact. Then, \(K^{(q)}\subseteq K^{(p)}\) for any \(p,q\) with \(1\leqq p < q \leqq \infty \).

Proof

Assume first that \(q<\infty \). We write \(f_p(\xi ):={\text {dist}}^p(\xi ,K)\) and, analogously, \(f_q\). For all \(\delta >0\), we have

$$\begin{aligned} f_p \leqq \delta ^p + \frac{1}{\delta ^{q-p}} f_q, \end{aligned}$$
(3.9)

and, therefore,

$$\begin{aligned} \mathcal {Q}_{\mathrm {sdqc}}f_p \leqq \delta ^p + \delta ^{p-q} \mathcal {Q}_{\mathrm {sdqc}}f_q. \end{aligned}$$
(3.10)

Let now \(\xi \in K^{(q)}\), so that \(\mathcal {Q}_{\mathrm {sdqc}}f_q(\xi )=0\). The above inequality implies that \(\mathcal {Q}_{\mathrm {sdqc}}f_p(\xi )\leqq \delta ^p\) for any \(\delta >0\). We conclude that \(\mathcal {Q}_{\mathrm {sdqc}}f_p(\xi )=0\) and \(K^{(q)}\subseteq K^{(p)}\).

If, instead, \(q=\infty \), it suffices to observe that the function \(\mathcal {Q}_{\mathrm {sdqc}}f_p\) is symmetric \(\mathrm {div}\)-quasiconvex (Lemma 2.6). Therefore, it is one of the candidates in the definition of \(K^{(\infty )}\). Since \(\mathcal {Q}_{\mathrm {sdqc}}f_p=0\) on K, we obtain that, necessarily, \(\mathcal {Q}_{\mathrm {sdqc}}f_p=0\) on \(K^{(\infty )}\). Hence, \(K^{(\infty )}\subseteq K^{(p)}\). \(\quad \square \)
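The elementary inequality (3.9) follows from a two-case argument, which we sketch for completeness with the shorthand \(d:={\text {dist}}(\xi ,K)\) (a symbol used only in this remark):

$$\begin{aligned} \begin{aligned} d\leqq \delta \quad&\Longrightarrow \quad f_p(\xi )=d^p\leqq \delta ^p,\\ d>\delta \quad&\Longrightarrow \quad f_p(\xi )=d^{q}\,d^{-(q-p)}\leqq \delta ^{-(q-p)}d^q=\delta ^{p-q}f_q(\xi ). \end{aligned} \end{aligned}$$

In both cases \(f_p\leqq \delta ^p+\delta ^{p-q}f_q\), since both terms on the right are nonnegative. The passage to (3.10) then uses, presumably via the definition (2.2) of the envelope as an infimum of averages, that \(\mathcal {Q}_{\mathrm {sdqc}}\) is monotone and that \(\mathcal {Q}_{\mathrm {sdqc}}(a+bf)=a+b\,\mathcal {Q}_{\mathrm {sdqc}}f\) for constants \(a\in \mathbb {R}\), \(b>0\).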

Remark 3.9

By analogy with the case of quasiconvexity, one might expect that \(K^{(p)}=K^{(\infty )}\) for every \(p\in [1,\infty )\) and every compact set K. This property holds in dimension \(n=2\), since \(\mathrm {div\,}\)-quasiconvexity is equivalent to quasiconvexity composed with a 90-degree rotation. We do not know if the statement is true in higher dimensions.

Lemma 3.10

Let \(K\subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) be compact, \(A\in \mathbb {R}^{n\times n}\) invertible, \(B\in \mathbb {R}^{n\times n}_\mathrm {sym}\). Then,

$$\begin{aligned} (AKA^T+B)^{\mathrm {sdqc}}=AK^{\mathrm {sdqc}}A^T+B \end{aligned}$$
(3.11)

and

$$\begin{aligned} (AKA^T+B)^{(\infty )}=AK^{(\infty )}A^T+B. \end{aligned}$$
(3.12)

Proof

We shall prove below that

$$\begin{aligned} (AKA^T+B)^{\mathrm {sdqc}}\subseteq AK^{\mathrm {sdqc}}A^T+B. \end{aligned}$$
(3.13)

In order to derive the other inclusion, we then consider the set \({\tilde{K}}:=AKA^{T}+B\), so that \(K=A^{-1}({\tilde{K}}-B)A^{-T}\). Application of (3.13) to \({\tilde{K}}\) gives

$$\begin{aligned} K^{\mathrm {sdqc}}= (A^{-1}{\tilde{K}}A^{-T} - A^{-1}BA^{-T})^{\mathrm {sdqc}}\subseteq A^{-1}{\tilde{K}}^{\mathrm {sdqc}}A^{-T} - A^{-1}BA^{-T}.\nonumber \\ \end{aligned}$$
(3.14)

Multiplying on the left by A and on the right by \(A^T\) yields

$$\begin{aligned} A K^{\mathrm {sdqc}}A^T \subseteq {\tilde{K}}^{\mathrm {sdqc}}-B, \end{aligned}$$
(3.15)

which, recalling the definition of \({\tilde{K}}\), is the desired second inclusion.

It remains to prove (3.13). We consider the set \(H:=AK^{\mathrm {sdqc}}A^T+B\). It is obvious that \(AKA^T+B\subseteq H\). If we can prove that H is p-symmetric \(\mathrm {div}\)-quasiconvex, then Lemma 3.5 implies \((AKA^T+B)^{\mathrm {sdqc}}\subseteq H\) and concludes the proof.

In order to show that H is p-symmetric \(\mathrm {div}\)-quasiconvex, we fix a symmetric matrix \({\hat{\sigma }}\not \in H\) and show that there is a symmetric \(\mathrm {div}\)-quasiconvex function f with p-growth such that \(f({\hat{\sigma }})>\max f(H)\). Theorem 3.6 shows that \(p\in (1,\infty )\) can be chosen arbitrarily. In the case of \(K^{(\infty )}\), the requirement of p-growth does not apply.

We define \(\sigma :=A^{-1}({\hat{\sigma }}-B)A^{-T}\), so that \({\hat{\sigma }}=A\sigma A^T+B\). The definitions of H and \({\hat{\sigma }}\) show that \(\sigma \not \in K^{\mathrm {sdqc}}\). Since \(K^{\mathrm {sdqc}}\) is p-symmetric \(\mathrm {div}\)-quasiconvex, there is a symmetric \(\mathrm {div}\)-quasiconvex function g with p-growth such that \(g(\sigma )>\max g(K^{\mathrm {sdqc}})\). We define \(f(\xi ):=g(A^{-1}(\xi -B) A^{-T})\), so that \(f({\hat{\sigma }})>\max f(H)\). Growth and continuity are automatically inherited from g.
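The inequality \(f({\hat{\sigma }})>\max f(H)\) can be checked directly. Every \(\eta \in H\) has the form \(\eta =A\zeta A^T+B\) with \(\zeta \in K^{\mathrm {sdqc}}\) (the symbols \(\eta \) and \(\zeta \) are used only in this remark), so that

$$\begin{aligned} f(\eta )=g\bigl (A^{-1}(A\zeta A^T+B-B)A^{-T}\bigr )=g(\zeta )\leqq \max g(K^{\mathrm {sdqc}})<g(\sigma )=g\bigl (A^{-1}({\hat{\sigma }}-B)A^{-T}\bigr )=f({\hat{\sigma }}). \end{aligned}$$

Taking the maximum over \(\eta \in H\), which is attained because H is compact and f is continuous, gives the claim.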

To conclude the proof it remains to show that f is symmetric \(\mathrm {div}\)-quasiconvex. To this end, pick some \(\varphi \in C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym})\) with \(\mathrm {div\,}\varphi =0\) and let \(\xi :=\int _{(0,1)^n} \varphi \,\mathrm{d}x\).

For some \(F\in \mathbb {R}^{n\times n}\) chosen below, we define \(\psi (x):=A^{-1}(\varphi (Fx)-B)A^{-T}\) and compute

$$\begin{aligned} \psi _{ij}(x)=\sum _{\alpha ,\beta } A^{-1}_{i\alpha } \varphi _{\alpha \beta }(Fx) A^{-1}_{j\beta }-A^{-1}_{i\alpha }B_{\alpha \beta } A^{-1}_{j\beta } \end{aligned}$$
(3.16)

and

$$\begin{aligned} \partial _k \psi _{ij}(x)=\sum _{\alpha ,\beta ,\gamma } A^{-1}_{i\alpha } \partial _\gamma \varphi _{\alpha \beta }(Fx) A^{-1}_{j\beta } F_{\gamma k}. \end{aligned}$$
(3.17)

Therefore,

$$\begin{aligned} (\mathrm {div\,}\psi )_i(x)=\sum _{\alpha ,\beta ,\gamma ,j} A^{-1}_{i\alpha } \partial _\gamma \varphi _{\alpha \beta }(Fx) A^{-1}_{j\beta } F_{\gamma j}. \end{aligned}$$
(3.18)

We choose \(F:=A\), so that \(\sum _j A^{-1}_{j\beta } F_{\gamma j}={\text {Id}}_{\beta \gamma }\) and

$$\begin{aligned} (\mathrm {div\,}\psi )_i(x)=\sum _{\alpha ,\beta } A^{-1}_{i\alpha } \partial _\beta \varphi _{\alpha \beta }(Fx) =0. \end{aligned}$$
(3.19)

Recalling the definitions of f and \(\psi \), we compute

$$\begin{aligned} \begin{aligned} \int _{(0,1)^n} f(\varphi (x))\mathrm{d}x=&\int _{(0,1)^n} g(A^{-1}(\varphi (x)-B) A^{-T})\mathrm{d}x\\ =&\int _{(0,1)^n} g(\psi (A^{-1}x))\mathrm{d}x = |\det A|\int _{A^{-1}(0,1)^n} g(\psi (y))\mathrm{d}y. \end{aligned} \end{aligned}$$
(3.20)

The function \(\psi \) is \(A^{-1}(0,1)^n\)-periodic and has average \(A^{-1}(\xi -B)A^{-T}\). The maps \(u_j(x):=\psi (jx)\) are divergence-free and converge weakly-\(*\) in \(L^{\infty }(\mathbb {R}^n;\mathbb {R}^{n\times n}_\mathrm {sym})\) to their average, which is \(A^{-1}(\xi -B)A^{-T}\). The functions \(x\mapsto g(u_j(x))=g(\psi (jx))\) are equally periodic and converge weakly-\(*\) to their average, which is the last expression in the previous equation (the period cell \(A^{-1}(0,1)^n\) has volume \(1/|\det A|\)). Since g is symmetric \(\mathrm {div}\)-quasiconvex, recalling the lower semicontinuity (Lemma 2.5) we conclude

$$\begin{aligned} g(A^{-1}(\xi -B)A^{-T})\leqq |\det A|\int _{A^{-1}(0,1)^n} g(\psi (y))\mathrm{d}y, \end{aligned}$$
(3.21)

and recalling the definition of g and the previous computation this gives

$$\begin{aligned} f(\xi )\leqq \int _{(0,1)^n} f(\varphi (x))\mathrm{d}x. \end{aligned}$$
(3.22)

Therefore, f is symmetric \(\mathrm {div}\)-quasiconvex. This concludes the proof. \(\quad \square \)

Lemma 3.11

Let \(K\subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) be compact. If \(A,B\in K^{\mathrm {sdqc}}\) and \(\mathop {\mathrm {rank}}(A-B)<n\) then \(\lambda A+(1-\lambda )B\in K^{\mathrm {sdqc}}\) for all \(\lambda \in [0,1]\). The corresponding assertion holds for \(K^{(\infty )}\).

Proof

The proof follows immediately from the definition and Lemma 2.4. Indeed, the assumption gives \(\mathcal {Q}_{\mathrm {sdqc}}f_p(A)=\mathcal {Q}_{\mathrm {sdqc}}f_p(B)=0\). Since \(\mathcal {Q}_{\mathrm {sdqc}}f_p\) is symmetric \(\mathrm {div}\)-quasiconvex, it is convex in the direction of \(B-A\), and \(\mathcal {Q}_{\mathrm {sdqc}}f_p(\lambda A+(1-\lambda )B)=0\).

In the case of \(K^{(\infty )}\), we consider any symmetric \(\mathrm {div}\)-quasiconvex function \(f\in C^0(\mathbb {R}^{n\times n}_\mathrm {sym};[0,\infty ))\), and deduce as above \(f(\lambda A+(1-\lambda )B)\leqq \lambda f(A)+(1-\lambda ) f(B)\leqq \max f(K^{(\infty )})\). By the definition of \(K^{(\infty )}\), we obtain \(\max f(K^{(\infty )})=\max f(K)\) and, therefore, \(f(\lambda A+(1-\lambda )B)\leqq \max f(K)\). \(\quad \square \)
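Written out, the first part of the argument is the following chain. We assume here that Lemma 2.4 provides convexity of symmetric \(\mathrm {div}\)-quasiconvex functions along directions of rank smaller than n, and that \(\mathcal {Q}_{\mathrm {sdqc}}f_p\geqq 0\) because \(f_p\geqq 0\):

$$\begin{aligned} 0\leqq \mathcal {Q}_{\mathrm {sdqc}}f_p(\lambda A+(1-\lambda )B)\leqq \lambda \,\mathcal {Q}_{\mathrm {sdqc}}f_p(A)+(1-\lambda )\,\mathcal {Q}_{\mathrm {sdqc}}f_p(B)=0. \end{aligned}$$

Hence \(\lambda A+(1-\lambda )B\in K^{(p)}=K^{\mathrm {sdqc}}\).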

In closing this section, we present an explicit example in which K consists of two matrices.

Lemma 3.12

Let \(K:=\{A,B\}\subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\). If \(\mathop {\mathrm {rank}}(A-B)=n\), then \(K^{\mathrm {sdqc}}=K^{(\infty )}=K\). Otherwise, \(K^{\mathrm {sdqc}}=K^{(\infty )}=[A,B]\), where \([A,B]\) is the segment with endpoints A and B.

Proof

The function \(f(\xi ):={\text {dist}}(\xi ,[A,B])\) is convex, hence symmetric \(\mathrm {div}\)-quasiconvex, therefore \(K^{\mathrm {sdqc}}\subseteq [A,B]\).

If \(\mathop {\mathrm {rank}}(B-A)<n\), Lemma 3.11 shows that \([A,B]\subseteq K^{(\infty )}\subseteq K^{\mathrm {sdqc}}\) and concludes the proof.

Assume now that \(\mathop {\mathrm {rank}}(B-A)=n\). By Lemma 3.10, it suffices to consider the case \(A={\text {Id}}\), \(B=-{\text {Id}}\) and we need only show that no matrix of the form \(t{\text {Id}}\), \(t\in (-1,1)\), belongs to \(K^{\mathrm {sdqc}}\). Let \(f(\xi ):=((n-1)|\xi |^2-(\mathop {\mathrm {Tr}}\xi )^2+n)_+\). Lemma 2.7 implies that f is symmetric \(\mathrm {div}\)-quasiconvex, and we verify that \(f({\text {Id}})=f(-{\text {Id}})=0\). However, \(f(t{\text {Id}})=n(1-t^2)>0\) for all \(t\in (-1,1)\), hence \(t{\text {Id}}\not \in K^{\mathrm {sdqc}}\). \(\square \)
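The values of f used above follow from \(|{\text {Id}}|^2=\mathop {\mathrm {Tr}}{\text {Id}}=n\), assuming that \(|\cdot |\) denotes the Euclidean (Frobenius) norm on matrices. For \(t\in [-1,1]\),

$$\begin{aligned} (n-1)|t{\text {Id}}|^2-(\mathop {\mathrm {Tr}}(t{\text {Id}}))^2+n=(n-1)nt^2-n^2t^2+n=n(1-t^2), \end{aligned}$$

which vanishes at \(t=\pm 1\) and is strictly positive for \(|t|<1\). In particular, the positive part in the definition of f is not active on this segment.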

3.3 Truncation of Symmetric Divergence-Free Fields

In the remainder of this Section, we prove that \(K^{(p)}\) does not depend on p, for \(p\in (1,\infty )\). This proof requires truncation and approximation of vector fields that satisfy differential constraints, which is made much easier by working with the corresponding potentials. Following [2], we introduce a stress potential \(\Theta \), which is related to the field \(\sigma \) by \(\sigma =\mathrm {div\,}\mathrm {div\,}\Theta \), in a sense we now make precise. Let \(\mathbb {R}^{n^4}_*\) be the set of \(\zeta \in \mathbb {R}^{n\times n\times n\times n}\) such that

$$\begin{aligned} \zeta _{ijhk}=\zeta _{jikh}=-\zeta _{ihjk} \quad \text { for all } i,j,k,h\in \{1, 2, \dots , n\}. \end{aligned}$$
(3.23)

For \(\Theta \in L^1_\mathrm {loc}(\mathbb {R}^n;\mathbb {R}^{n^4}_*)\) we define the distribution

$$\begin{aligned} (\mathrm {div\,}\mathrm {div\,}\Theta )_{ij} = \sum _{h,k}\partial _{h}\partial _{k} \Theta _{ijhk}. \end{aligned}$$
(3.24)

We observe that, by (3.23), \(\mathrm {div\,}(\mathrm {div\,}\mathrm {div\,}\Theta )=0\) and \(\mathrm {div\,}\mathrm {div\,}\Theta =(\mathrm {div\,}\mathrm {div\,}\Theta )^T\). Therefore, every potential generates a divergence-free symmetric matrix field.
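Both identities are consequences of the symmetries (3.23). A formal sketch, carried out for smooth \(\Theta \) and understood in the sense of distributions otherwise, with the convention \((\mathrm {div\,}\sigma )_i=\sum _j\partial _j\sigma _{ij}\) assumed here:

$$\begin{aligned} \begin{aligned} (\mathrm {div\,}(\mathrm {div\,}\mathrm {div\,}\Theta ))_i=&\sum _{j,h,k}\partial _j\partial _h\partial _k\Theta _{ijhk}=-\sum _{j,h,k}\partial _j\partial _h\partial _k\Theta _{ihjk}=-(\mathrm {div\,}(\mathrm {div\,}\mathrm {div\,}\Theta ))_i,\\ (\mathrm {div\,}\mathrm {div\,}\Theta )_{ji}=&\sum _{h,k}\partial _h\partial _k\Theta _{jihk}=\sum _{h,k}\partial _h\partial _k\Theta _{ijkh}=(\mathrm {div\,}\mathrm {div\,}\Theta )_{ij}. \end{aligned} \end{aligned}$$

In the first line we used \(\zeta _{ijhk}=-\zeta _{ihjk}\) and then exchanged the names of the summation indices j and h; in the second line we used \(\zeta _{ijhk}=\zeta _{jikh}\) and exchanged h and k.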

In order to construct potentials, we start from a fixed matrix \(M\in \mathbb {R}^{n\times n}_\mathrm {sym}\) and define \(\Theta ^M:\mathbb {R}^n\rightarrow \mathbb {R}^{n^4}_*\) as

$$\begin{aligned} \Theta ^M(x)_{ijhk}=\frac{1}{n(n-1)} \bigl ( M_{ij}x_hx_k+M_{hk}x_ix_j - M_{ih}x_jx_k - M_{kj}x_hx_i \bigr ). \end{aligned}$$
(3.25)

A straightforward computation shows that \(\mathrm {div\,}\mathrm {div\,}\Theta ^M=M\), with \(|\Theta ^M|(x)\leqq 2|x|^2|M|\), \(|D\Theta ^M|(x)\leqq 4|x|\, |M|\), \(|D^2\Theta ^M|(x)\leqq 4|M|\) for all \(x\in \mathbb {R}^n\), \(n\geqq 2\). Working in Fourier space, this procedure can be generalized to any divergence-free symmetric matrix field.
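The computation behind \(\mathrm {div\,}\mathrm {div\,}\Theta ^M=M\) can be sketched using \(\partial _h\partial _k(x_ax_b)=\delta _{ah}\delta _{bk}+\delta _{ak}\delta _{bh}\) and the symmetry of M. The four terms in the bracket of (3.25) contribute

$$\begin{aligned} \begin{aligned} \sum _{h,k}\partial _h\partial _k(M_{ij}x_hx_k)=&\,(n^2+n)M_{ij},&\sum _{h,k}\partial _h\partial _k(M_{hk}x_ix_j)=&\,2M_{ij},\\ \sum _{h,k}\partial _h\partial _k(M_{ih}x_jx_k)=&\,(n+1)M_{ij},&\sum _{h,k}\partial _h\partial _k(M_{kj}x_hx_i)=&\,(n+1)M_{ij}, \end{aligned} \end{aligned}$$

so that

$$\begin{aligned} (\mathrm {div\,}\mathrm {div\,}\Theta ^M)_{ij}=\frac{(n^2+n)+2-2(n+1)}{n(n-1)}M_{ij}=M_{ij}. \end{aligned}$$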

Lemma 3.13

  1. (i)

    Let \(w\in C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym})\) with \(\mathrm {div\,}w=0\) and \(\int _{(0,1)^n} w\, \mathrm{d}x=0\). Then, there is \(\Theta \in C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n^4}_*)\) such that \(\mathrm {div\,}\mathrm {div\,}\Theta =w\). The map \(w\mapsto \Theta \) is linear.

  2. (ii)

    Let \(w\in L^p((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym})\) for some \(p\in (1,\infty )\), \(\mathrm {div\,}w=0\), \(\int _{(0,1)^n} w\, \mathrm{d}x=0\). Then, there is \(\Theta \in W^{2,p}_\mathrm {per}((0,1)^n;\mathbb {R}^{n^4}_*)\), with \(\Vert D^2\Theta \Vert _{p}\leqq c \Vert w\Vert _p\) and \(\mathrm {div\,}\mathrm {div\,}\Theta =w\). The map \(w\mapsto \Theta \) is linear and extends the map in (i).

  3. (iii)

    Let \(w=w_p+w_q\), with \(w_p\in L^p((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym})\), \(w_q\in L^q((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym})\) for some \(p,q\in (1,\infty )\), \(\mathrm {div\,}w=0\), \(\int _{(0,1)^n} w_p\, \mathrm{d}x=\int _{(0,1)^n} w_q\, \mathrm{d}x=0\). Then, there are \(\Theta _p\in W^{2,p}_\mathrm {per}((0,1)^n;\mathbb {R}^{n^4}_*)\), with \(\Vert D^2\Theta _p\Vert _{p}\leqq c \Vert w_p\Vert _p\), and \(\Theta _q\in W^{2,q}_\mathrm {per}((0,1)^n;\mathbb {R}^{n^4}_*)\), with \(\Vert D^2\Theta _q\Vert _{q}\leqq c \Vert w_q\Vert _q\), such that \(\mathrm {div\,}\mathrm {div\,}(\Theta _p+\Theta _q)=w\).

We stress that (iii) does not assert \(\mathrm {div\,}\mathrm {div\,}\Theta _p=w_p\).

Proof

(i): Let \({\hat{w}}:2\pi \mathbb {Z}^n\rightarrow \mathbb {R}^{n\times n}_\mathrm {sym}\) be the Fourier coefficients of w, so that

$$\begin{aligned} w(x)=\sum _{\lambda \in 2\pi \mathbb {Z}^n} {\hat{w}}(\lambda ) e^{i\lambda \cdot x}. \end{aligned}$$
(3.26)

The assumptions on w imply \({\hat{w}}(0)=0\), \({\hat{w}}_{ij}={\hat{w}}_{ji}\) and \(\sum _j{\hat{w}}_{ij}\lambda _j=0\). We define, in analogy to (3.25), \({\hat{\Theta }}(0)=0\) and, for \(\lambda \in 2\pi \mathbb {Z}^n\setminus \{0\}\),

$$\begin{aligned} {\hat{\Theta }}(\lambda )_{ijhk}=-\frac{1}{|\lambda |^4} \bigl ( {\hat{w}}_{ij}\lambda _h\lambda _k+{\hat{w}}_{hk}\lambda _i\lambda _j - {\hat{w}}_{ih}\lambda _j\lambda _k - {\hat{w}}_{jk}\lambda _i\lambda _h \bigr ). \end{aligned}$$
(3.27)

We easily verify that \({\hat{\Theta }}(\lambda )\in \mathbb {R}^{n^4}_*\) and, using \(\sum _j{\hat{w}}_{ij}\lambda _j=0\), that \(-\sum _{h,k}\lambda _h\lambda _k {\hat{\Theta }}_{ijhk}(\lambda )={\hat{w}}_{ij}(\lambda )\) for all \(\lambda \). This is the Fourier-side form of \(\mathrm {div\,}\mathrm {div\,}\Theta =w\), since \(\partial _h\partial _k e^{i\lambda \cdot x}=-\lambda _h\lambda _k e^{i\lambda \cdot x}\). Since the decay of the coefficients \({\hat{\Theta }}\) is faster than the decay of the coefficients \({\hat{w}}\), the Fourier series

$$\begin{aligned} \Theta (x)=\sum _{\lambda \in 2\pi \mathbb {Z}^n} {\hat{\Theta }}(\lambda ) e^{i\lambda \cdot x} \end{aligned}$$
(3.28)

defines a smooth periodic function \(\Theta \in C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n^4}_*)\) such that \(\mathrm {div\,}\mathrm {div\,}\Theta =w\).
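The algebraic identity behind this construction can be checked by contracting the bracket in (3.27) with \(\lambda _h\lambda _k\). Writing \(b_{ijhk}:={\hat{w}}_{ij}\lambda _h\lambda _k+{\hat{w}}_{hk}\lambda _i\lambda _j-{\hat{w}}_{ih}\lambda _j\lambda _k-{\hat{w}}_{jk}\lambda _i\lambda _h\) for that bracket (notation used only in this remark), we obtain

$$\begin{aligned} \begin{aligned} \sum _{h,k}\lambda _h\lambda _kb_{ijhk}=&\,{\hat{w}}_{ij}|\lambda |^4+\Bigl (\sum _{h,k}{\hat{w}}_{hk}\lambda _h\lambda _k\Bigr )\lambda _i\lambda _j-\Bigl (\sum _h{\hat{w}}_{ih}\lambda _h\Bigr )\lambda _j|\lambda |^2-\Bigl (\sum _k{\hat{w}}_{jk}\lambda _k\Bigr )\lambda _i|\lambda |^2\\ =&\,|\lambda |^4{\hat{w}}_{ij}, \end{aligned} \end{aligned}$$

since each of the last three terms contains a factor of the form \(\sum _h{\hat{w}}_{ah}\lambda _h=0\). Moreover, \(|b(\lambda )|\leqq 4|\lambda |^2|{\hat{w}}(\lambda )|\), so that \(|\lambda |^{-4}|b(\lambda )|\leqq 4|\lambda |^{-2}|{\hat{w}}(\lambda )|\), which is the improved decay of the coefficients mentioned above.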

(ii): Let \(T:C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym})\rightarrow C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n^4}_*)\), \(w\mapsto Tw:=\Theta _w\), be the linear operator defined above. We consider the operator \(D^2T : C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n\times n}_\mathrm {sym})\rightarrow C^\infty _\mathrm {per}((0,1)^n;\mathbb {R}^{n^6})\), defined by \(w\mapsto D^2Tw:=D^2\Theta _w\). Its Fourier symbol is smooth on \(S^{n-1}\) and homogeneous of degree zero. By [5, Proposition 2.13] (which is based on [18, Ex. (iii), page 94] and [20, Cor. 3.16, p. 263]) the operator \(D^2T\) can be extended to a continuous operator from \(L^p\) to \(L^p\) for any \(p\in (1,\infty )\). By Poincaré's inequality, and using the fact that \(Tw\) and \(DTw\) have average zero, the estimate in \(W^{2,p}\) follows.

(iii): We define \(\Theta _p:=Tw_p\), \(\Theta _q:=Tw_q\). The estimates on the norm follow as for (ii). By linearity of the operator T, the differential condition holds as well. We remark that the \(L^p\) extension and the \(L^q\) extension of the operator defined on smooth functions coincide on \(L^p\cap L^q\). Therefore, we can use the symbol T for the operator defined on \(L^p\cup L^q\). \(\quad \square \)

A crucial element in subsequent steps is the following truncation result, which is a minor variant of those given in Section 6.6.2 of [3] and Prop. A.1 of [4] and is based on Zhang’s Lemma [26].

Lemma 3.14

Let \(u\in W^{2,p}_\mathrm {per}({(0,1)^n};V)\), \(M>0\), V a finite-dimensional vector space. Then, there is \(v\in W^{2,\infty }_\mathrm {per}({(0,1)^n};V)\) such that

  1. (i)

    \(\displaystyle \Vert D^2v\Vert _{2,\infty }\leqq c M\);

  2. (ii)

    \(\displaystyle |\{v\ne u\}| \leqq \frac{c}{M^p} \int _{|u|+|Du|+|D^2u|>M} |u|^p+|Du|^p+|D^2u|^p \mathrm{d}x\).

The constant depends only on n and V.

The above estimates immediately imply

$$\begin{aligned} \Vert D^2u-D^2v\Vert _p^p \leqq c\int _{|u|+|Du|+|D^2u|>M} |u|^p+|Du|^p+|D^2u|^p \mathrm{d}x. \end{aligned}$$
(3.29)
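A possible derivation of (3.29) from (i) and (ii), reading (i) as the bound \(|D^2v|\leqq cM\) almost everywhere and writing \(G_M:=\{|u|+|Du|+|D^2u|>M\}\) (notation introduced only for this remark), is the following. Since \(D^2u=D^2v\) almost everywhere on \(\{u=v\}\), and \(|D^2u|\leqq M\) outside \(G_M\),

$$\begin{aligned} \begin{aligned} \Vert D^2u-D^2v\Vert _p^p=&\int _{\{u\ne v\}}|D^2u-D^2v|^p\mathrm{d}x\leqq 2^{p-1}\int _{\{u\ne v\}}|D^2u|^p\mathrm{d}x+2^{p-1}(cM)^p|\{u\ne v\}|\\ \leqq&\, 2^{p-1}\int _{G_M}|D^2u|^p\mathrm{d}x+2^{p-1}(1+c^p)M^p|\{u\ne v\}|. \end{aligned} \end{aligned}$$

Inserting (ii) to bound \(M^p|\{u\ne v\}|\) gives (3.29) with a larger constant.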

Proof

After choosing a basis and working componentwise, we can assume \(V=\mathbb {R}\). We define \(h:=(u,Du,D^2u)\) and

(3.30)

Here and subsequently, . If \(E_M\) is a null set, then it suffices to take \(v=u\) and the proof is concluded. Otherwise, using the Vitali or the Besicovitch covering theorem it follows that the volume of \(E_M\) obeys (ii). We can further enlarge \(E_M\) by a null set and assume that all points of \({(0,1)^n}\setminus E_M\) are Lebesgue points of h.

For \(x\in {(0,1)^n}\setminus E_M\) and \(r\in (0,\sqrt{n})\), we define

(3.31)

From the definition of \(E_M\) we obtain \(0\leqq \eta _r \leqq 4M\) for all r and x and \(\eta _r\rightarrow 0\) pointwise on \((0,1)^n\setminus E_M\). Therefore, there is a set \({\tilde{E}}_M\) with \(|{\tilde{E}}_M|\leqq |E_M|\) such that \(\eta _{r}\rightarrow 0\) uniformly in \({(0,1)^n}\setminus E_M\setminus {\tilde{E}}_M\). We define \(S_M:=(0,1)^n\setminus E_M\setminus {\tilde{E}}_M\).

We have shown that there is \(\omega :(0,\infty )\rightarrow (0,4M]\) nondecreasing with \(\omega _r\rightarrow 0\) such that

(3.32)

Fix now \(x\in S_M\). By Poincaré’s inequality, for any \(r\in (0,\sqrt{n})\) there is \(A_r=A_r(x)\in \mathbb {R}^n\) such that

(3.33)

With x being a Lebesgue point of \(Du\), we have \(\lim _{r\rightarrow 0} A_r=Du(x)\). Comparing the above equation on the balls \(B(x,r)\) and \(B(x,r/2)\) we obtain \(|A_r-A_{r/2}|\leqq cr\omega _r\), which (summing the telescoping series \(\sum _{k\geqq 0}(A_{2^{-k}r}-A_{2^{-k-1}r})\), whose terms are bounded by a geometric series) implies \(|A_r-Du(x)|\leqq c r\omega _r\) and

(3.34)

A second application of Poincaré’s inequality yields

(3.35)

for some \(b_r=b_r(x)\in \mathbb {R}\), and the same argument as above leads to

(3.36)

where \(P_x\) is the second-order Taylor polynomial of \(u\) centered at x.

For \(x,x'\in S_M\) and \(r=|x-x'|\), we have

(3.37)

Since the space of polynomials of degree two is finite dimensional, this is an estimate on the difference of the coefficients and also a uniform estimate on the difference of the two polynomials. The conclusion then follows from Whitney’s extension theorem. We remark that the standard construction in Whitney’s extension theorem, if given periodic inputs, produces periodic outputs, and that, if \(E_M\) is not a null set, this procedure actually produces a \(C^2\) function. \(\quad \square \)

We are finally in a position to prove the other inclusion in Theorem 3.6. Specifically, we show the following:

Lemma 3.15

Let \(K\subseteq \mathbb {R}^{n\times n}_\mathrm {sym}\) be compact. Then, \(K^{(p)}\subseteq K^{(q)}\) for any \(p,q\) with \(1< p \leqq q < \infty \).

Proof

As usual, we define \(f_p(\sigma ):={\text {dist}}^p(\sigma ,K)\) and, analogously, \(f_q\). For brevity, we write \(T=(0,1)^n\). Pick \(\xi \in K^{(p)}\). Since \(\mathcal {Q}_{\mathrm {sdqc}}f_p(\xi )=0\), by the definition (2.2) there is a sequence of functions \(w_k\in C^\infty _\mathrm {per}(T;\mathbb {R}^{n\times n}_\mathrm {sym})\) with \(\mathrm {div\,}w_k=0\), \(\int _T w_k\, \mathrm{d}x=\xi \) and \(\int _T f_p(w_k(x))\, \mathrm{d}x\rightarrow 0\). We choose \(M>0\) such that \(K\subseteq B_{M-1}\) and \(|\xi |\leqq M-1\) and define

$$\begin{aligned} w_k^M:= w_k \chi _{|w_k|< M} \quad \text { and }\quad w_k^L := w_k-w_k^M=w_k\chi _{|w_k|\geqq M}, \end{aligned}$$
(3.38)

where \(\chi _{|w_k|< M}(x)=1\) if \(|w_k|(x)< M\) and 0 otherwise. Then, \(\Vert w_k^M\Vert _{L^{2q}}\leqq \Vert w_k^M\Vert _{L^\infty }\leqq M\). Since \(|\sigma |\geqq M\) implies \({\text {dist}}(\sigma ,K)\geqq 1\), we obtain

$$\begin{aligned} \begin{aligned} |w_k^L|&=|w_k|\chi _{|w_k|\geqq M} \leqq {\text {dist}}(w_k, K)+(M-1)\chi _{|w_k|\geqq M}\\&\leqq M{\text {dist}}(w_k,K) \end{aligned} \end{aligned}$$
(3.39)

and, therefore, \(\Vert w_k^L\Vert _{L^p}\rightarrow 0\). Let \(\Theta _k^{M}\in W^{2,2q}_\mathrm {per}(T;\mathbb {R}^{n^4}_*)\) and \(\Theta _k^{L}\in W^{2,p}_\mathrm {per}(T;\mathbb {R}^{n^4}_*)\) be corresponding potentials obtained from \(w_k^M-\int _T w_k^M\, \mathrm{d}x\in L^{2q}\) and \(w_k^L-\int _T w_k^L\, \mathrm{d}x\in L^p\) using Lemma 3.13(iii) with the exponents 2q and p. In particular, this implies that \(w_k=\xi +\mathrm {div\,}\mathrm {div\,}(\Theta _k^M+\Theta _k^L)\), with

$$\begin{aligned} \Vert \Theta _k^M \Vert _{2,2q}\leqq c M \quad \text { and }\quad \Vert \Theta _k^L \Vert _{2,p}\rightarrow 0 \text { as }k\rightarrow \infty . \end{aligned}$$
(3.40)

Let \(\Theta _k^T\in C^2(T;\mathbb {R}^{n^4}_*)\) be the truncation of \(\Theta _k^{L}\) obtained from Lemma 3.14, \(\Vert \Theta _k^T\Vert _{2,\infty }\leqq cM\). The above estimates show that \(\Vert \Theta _k^T\Vert _{2,p}\rightarrow 0\) and, therefore, \(\Vert \Theta _k^T\Vert _{2,2q}\rightarrow 0\). We define \(w_k^*:=\xi +\mathrm {div\,}\mathrm {div\,}(\Theta _k^M+\Theta _k^T)\in L^{2q}\). Then, \(w_k-w_k^*=\mathrm {div\,}\mathrm {div\,}(\Theta _k^L-\Theta _k^T)\rightarrow 0\) in \(L^p\).

We now proceed to prove that \(\int _T f_q(w_k^*)\mathrm{d}x\rightarrow 0\) as \(k\rightarrow \infty \). For every \(N>M\), we write

$$\begin{aligned} f_q(w_k^*) \leqq (2N)^{q-p} f_p(w_k^*)\chi _{|w_k^*|<N} + (2|w_k^*|)^q\chi _{|w_k^*|\geqq N} \end{aligned}$$
(3.41)

and treat the two terms separately. The second can be estimated as

$$\begin{aligned} \limsup _{k\rightarrow \infty } \int _{|w_k^*|\geqq N} |w_k^*|^q \mathrm{d}x \leqq \limsup _{k\rightarrow \infty } \frac{1}{N^q}\int _T |w_k^*|^{2q} \mathrm{d}x \leqq \frac{ c M^{2q}}{N^q}. \end{aligned}$$
(3.42)
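The splitting (3.41) rests on the bound \({\text {dist}}(\sigma ,K)\leqq |\sigma |+M-1\), valid because \(K\subseteq B_{M-1}\). A sketch of the two cases, for \(N>M\), \(p\leqq q\) and \(\sigma \in \mathbb {R}^{n\times n}_\mathrm {sym}\):

$$\begin{aligned} \begin{aligned} |\sigma |<N\quad&\Longrightarrow \quad {\text {dist}}(\sigma ,K)<2N\quad \Longrightarrow \quad f_q(\sigma )={\text {dist}}^{q-p}(\sigma ,K)f_p(\sigma )\leqq (2N)^{q-p}f_p(\sigma ),\\ |\sigma |\geqq N\quad&\Longrightarrow \quad {\text {dist}}(\sigma ,K)\leqq |\sigma |+M-1<2|\sigma |\quad \Longrightarrow \quad f_q(\sigma )\leqq (2|\sigma |)^q. \end{aligned} \end{aligned}$$

Applying this with \(\sigma =w_k^*(x)\) gives (3.41).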

It remains to estimate the first term. For fixed N, the function \(f_p\) is uniformly continuous on \(B_N\), so there is \(\delta _N>0\) such that \(|\sigma |<N\), \(|\sigma -\eta |<\delta _N\) imply \(f_p(\sigma )\leqq f_p(\eta )+1/N^q\). Therefore, for all \(\sigma ,\eta \in \mathbb {R}^{n\times n}_\mathrm {sym}\), we have

$$\begin{aligned} f_p(\sigma )\chi _{|\sigma |<N} \leqq f_p(\eta ) + \frac{1}{N^q} + (2N)^p \frac{|\sigma -\eta |^p}{\delta _N^p}. \end{aligned}$$
(3.43)
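Inequality (3.43) can be checked by distinguishing three cases, again using \({\text {dist}}(\sigma ,K)<2N\) for \(|\sigma |<N\) (which holds since \(N>M\) and \(K\subseteq B_{M-1}\)):

$$\begin{aligned} \begin{aligned}&|\sigma |\geqq N:&\text {the left-hand side vanishes};\\&|\sigma |<N,\ |\sigma -\eta |<\delta _N:&f_p(\sigma )\leqq f_p(\eta )+\frac{1}{N^q};\\&|\sigma |<N,\ |\sigma -\eta |\geqq \delta _N:&f_p(\sigma )\leqq (2N)^p\leqq (2N)^p\frac{|\sigma -\eta |^p}{\delta _N^p}. \end{aligned} \end{aligned}$$

In each case the right-hand side of (3.43) dominates, since all three of its terms are nonnegative.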

Setting \(\sigma =w_k^*(x)\), \(\eta =w_k(x)\), integrating, and recalling that \(w_k-w_k^*\rightarrow 0\) in \(L^p\) yields

$$\begin{aligned} \begin{aligned} \limsup _{k\rightarrow \infty } \int _{|w_k^*|<N} f_p(w_k^*) \mathrm{d}x \leqq&\limsup _{k\rightarrow \infty } \int _T f_p(w_k) \mathrm{d}x + \frac{1}{N^q}\\&+ \frac{(2N)^p}{\delta _N^p}\limsup _{k\rightarrow \infty } \Vert w_k-w_k^*\Vert _p^p = \frac{1}{N^q}. \end{aligned} \end{aligned}$$
(3.44)

From (3.41) to (3.44), we conclude that

$$\begin{aligned} \limsup _{k\rightarrow \infty } \int _T f_q(w_k^*) \mathrm{d}x \leqq \frac{2^{q-p}}{N^p}+ \frac{ c M^{2q}}{N^q} \end{aligned}$$
(3.45)

for all \(N>M\) and, therefore, \(\int _T f_q(w_k^*) \mathrm{d}x\rightarrow 0\). Finally, by continuity and density we can replace \(w_k^*\) by a sequence of smooth functions with the same properties (mollification preserves the differential constraint, periodicity and the average), and therefore \(\mathcal {Q}_{\mathrm {sdqc}}f_q(\xi )=0\). \(\quad \square \)

4 Explicit Relaxation for Yield Surfaces Depending on the First Two Invariants

4.1 General Setting and Main Results

In this section, we focus on the case of rotationally symmetric sets of stresses in three dimensions. Lemma 3.10 implies that if \(K\subseteq \mathbb {R}^{3\times 3}_\mathrm {sym}\) is rotationally invariant, in the sense that \(Q^TKQ=K\) for any \(Q\in \mathrm {SO}(3)\), then its symmetric \(\mathrm {div}\)-quasiconvex hull is also rotationally invariant, in the sense that \(Q^TK^{\mathrm {sdqc}}Q=K^{\mathrm {sdqc}}\) for any \(Q\in \mathrm {SO}(3)\), and the same holds for \(K^{(\infty )}\). We consider here the situation where K is described by only two invariants, one corresponding to the pressure (the isotropic stress) and another to the deviatoric stress (a measure of the distance to the multiples of the identity). We leave the case of generic rotationally invariant elastic domains for future work.

For \(\sigma \in \mathbb {R}^{3\times 3}_\mathrm {sym}\), we define the two variables

$$\begin{aligned} p(\sigma ):=\frac{1}{3}\mathop {\mathrm {Tr}}\sigma \text { and } q(\sigma ):=\frac{|\sigma -p{\text {Id}}|}{\sqrt{2}}, \end{aligned}$$
(4.1)

and denote by \(\Phi :\mathbb {R}^{3\times 3}_\mathrm {sym}\rightarrow \mathbb {R}\times [0,\infty )\) the mapping \(\Phi :=(p,q)\), so that

$$\begin{aligned} \Phi (\sigma )=\left( \frac{1}{3}\mathop {\mathrm {Tr}}\sigma ,\frac{|\sigma -p{\text {Id}}|}{\sqrt{2}}\right) . \end{aligned}$$
(4.2)

We remark that \(2q^2(\sigma )=|\sigma _D|^2\) where \(\sigma _D:=\sigma -p{\text {Id}}\) is the deviatoric part of \(\sigma \). For example, for any \((p_*,q_*)\in \mathbb {R}\times [0,\infty )\), the matrices

$$\begin{aligned} \xi _0:=\begin{pmatrix} p_*+q_*&{}0&{}0\\ 0&{}p_*-q_*&{}0\\ 0&{}0&{}p_* \end{pmatrix} \text { and } \xi _1:=\begin{pmatrix} p_*&{}\quad q_*&{}\quad 0\\ q_*&{}\quad p_*&{}\quad 0\\ 0&{}\quad 0&{}\quad p_* \end{pmatrix} \end{aligned}$$
(4.3)

obey \(\Phi (\xi _0)=\Phi (\xi _1)=(p_*,q_*)\).
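To verify this, assuming that \(|\cdot |\) denotes the Euclidean (Frobenius) norm, note that both matrices have trace \(3p_*\), and that

$$\begin{aligned} \xi _0-p_*{\text {Id}}=\begin{pmatrix} q_*&{}0&{}0\\ 0&{}-q_*&{}0\\ 0&{}0&{}0 \end{pmatrix} \text { and } \xi _1-p_*{\text {Id}}=\begin{pmatrix} 0&{}q_*&{}0\\ q_*&{}0&{}0\\ 0&{}0&{}0 \end{pmatrix}, \end{aligned}$$

so that \(|\xi _0-p_*{\text {Id}}|^2=|\xi _1-p_*{\text {Id}}|^2=2q_*^2\) and hence \(q(\xi _0)=q(\xi _1)=q_*\).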

Here, we consider sets K that can be characterized by the values of these two invariants, in the sense that

$$\begin{aligned} K=\{\sigma \in \mathbb {R}^{3\times 3}_\mathrm {sym}: (p(\sigma ), q(\sigma ))\in H\} \text { for some } H\subseteq \mathbb {R}\times [0,\infty ). \end{aligned}$$
(4.4)

We seek a characterization of \(K^{\mathrm {sdqc}}\) in the \((p,q)\) plane, that is, we aim at characterizing the set

$$\begin{aligned} \begin{aligned} \Phi (K^{\mathrm {sdqc}}) {=}&\{(p_*,q_*): \exists \sigma \in K^{\mathrm {sdqc}}\text { with } (p(\sigma ), q(\sigma ))=(p_*,q_*)\}, \end{aligned} \end{aligned}$$
(4.5)

and the same for \(K^{(\infty )}\). An explicit expression is given in Theorem 4.1 below.

In some cases, we shall additionally show that \(K^{\mathrm {sdqc}}\) is fully characterized by the values of p and q, in the sense that \(\sigma \in K^{\mathrm {sdqc}}\) if and only if \((p(\sigma ),q(\sigma ))\in {\tilde{H}}\) for some \({\tilde{H}}\subseteq \mathbb {R}\times [0,\infty )\), see Theorem 4.2 below. This is, however, not always true; see Lemma 4.12 for an example where this representation fails.

Our results are restricted to the case in which the relevant set \({\tilde{H}}\) is connected. Connectedness of hulls is, in general, a very subtle issue related to the locality of the various convexity conditions. In the case of quasiconvexity, it relates to the compactness of sequences taking values in sets without rank-one connections, a question known as Tartar’s conjecture [22]. We recall that nonlocality of quasiconvexity was proven, in dimension 3 and above, by Kristensen [9] based on Šverák’s counterexample to the equivalence of rank-one convexity and quasiconvexity [24]. However, in dimension two the situation is different and positive results have been obtained by Šverák [25] and Faraco and Székelyhidi [6].

We begin by explaining the construction qualitatively and then present a proof of its correctness. In order to get started, we fix \(p_0\in \mathbb {R}\) and consider the rank-two line

$$\begin{aligned} t\mapsto \xi _t := \begin{pmatrix} p_0+ t &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad p_0 - t &{}\quad 0\\ 0 &{}\quad 0 &{}\quad p_0 \end{pmatrix}. \end{aligned}$$
(4.6)

Clearly, \(p(\xi _t)=p_0\) and \(q(\xi _t)=|t|\). In particular, if \((p_0, q_0)\in H\) then both \(\xi _{q_0}\) and \(\xi _{-q_0}\) belong to K and, with Lemma 3.11, we obtain \(\xi _t\in K^{\mathrm {sdqc}}\) for all \(t\in [-q_0, q_0]\). Based on this argument, we define the set

$$\begin{aligned} {\hat{H}}:=\{(p,q)\in \mathbb {R}\times [0,\infty ): (p,q+a)\in H\text { for some }a\geqq 0\}. \end{aligned}$$
(4.7)
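The rank condition required by Lemma 3.11 (with \(n=3\)) is easily checked for this pair:

$$\begin{aligned} \xi _{q_0}-\xi _{-q_0}=\begin{pmatrix} 2q_0&{}0&{}0\\ 0&{}-2q_0&{}0\\ 0&{}0&{}0 \end{pmatrix}, \qquad \mathop {\mathrm {rank}}(\xi _{q_0}-\xi _{-q_0})\leqq 2<3. \end{aligned}$$

Moreover, \(\Phi (\xi _t)=(p_0,|t|)\), so the image under \(\Phi \) of the segment \(\{\xi _t: t\in [-q_0,q_0]\}\) is the vertical segment \(\{p_0\}\times [0,q_0]\), which is the part of \({\hat{H}}\) generated by the point \((p_0,q_0)\).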

The set \(\Phi (K^{\mathrm {sdqc}})\) mentioned in (4.5) will then be characterized in Theorem 4.1 as a set \(H^\mathrm {rel}\) that we now show how to construct explicitly. Specifically, \(H^\mathrm {rel}\) is obtained from \({\hat{H}}\) by first taking the convex hull and then eliminating all points that can be separated from \({\hat{H}}\) by means of a translation of Tartar’s function, \(f(\sigma ) := 4q^2(\sigma )-3p^2(\sigma )\), which is symmetric \(\mathrm {div\,}\)-quasiconvex; see Lemma 4.3 below. We say that a point \(y_*=(p_*,q_*)\) can be separated from \({\hat{H}}\) if there is \(y_0=(p_0, q_0)\in \mathbb {R}\times [0,\infty )\) such that the function \(f_{y_0}(p,q):=4(q^2-q_0^2)-3 (p-p_0)^2\) obeys \(\max f_{y_0}(H)<f_{y_0}(y_*)\). Then, the set \(H^\mathrm {rel}\) is

$$\begin{aligned} H^\mathrm {rel}:=\{y_*\in {\hat{H}}^\mathrm {conv}: y_* \text { cannot be separated from }{\hat{H}}\}. \end{aligned}$$
(4.8)

We refer to Fig. 2 for an illustration.
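For a finite set H, the separation condition can be tested directly. The sketch below is only illustrative and rests on two elementary observations that we state as assumptions of the example: the term \(-4q_0^2\) appears on both sides of \(\max f_{y_0}(H)<f_{y_0}(y_*)\) and cancels, so only \(p_0\) matters; and, after expanding the squares, each point of H contributes one inequality that is linear in \(p_0\). The function name `can_be_separated` and the sample data are hypothetical.

```python
import math

def f_y0(p, q, p0, q0):
    """Translated Tartar function f_{y0}(p, q) = 4 (q^2 - q0^2) - 3 (p - p0)^2."""
    return 4.0 * (q * q - q0 * q0) - 3.0 * (p - p0) ** 2

def can_be_separated(H, y_star):
    """True if some p0 gives max f_{y0}(H) < f_{y0}(y_star), for finite H.

    Each (p, q) in H yields the linear inequality c + d * p0 < 0, with
    c = (4 q^2 - 3 p^2) - (4 q_*^2 - 3 p_*^2) and d = 6 (p - p_*).
    """
    p_s, q_s = y_star
    lo, hi = -math.inf, math.inf          # open interval of admissible p0
    for p, q in H:
        c = (4.0 * q * q - 3.0 * p * p) - (4.0 * q_s * q_s - 3.0 * p_s * p_s)
        d = 6.0 * (p - p_s)
        if d > 0.0:
            hi = min(hi, -c / d)          # requires p0 < -c/d
        elif d < 0.0:
            lo = max(lo, -c / d)          # requires p0 > -c/d
        elif c >= 0.0:
            return False                  # inequality fails for every p0
    return lo < hi

# Two-point set as in Fig. 2 (the numerical values are an arbitrary choice).
H = [(-0.5, 1.0), (0.5, 1.0)]
print(can_be_separated(H, (0.0, 0.95)))   # a point near the top of the rectangle
print(can_be_separated(H, (0.0, 0.85)))   # a point further down
```

The test only decides separation; membership in \(H^\mathrm {rel}\) additionally requires \(y_*\in {\hat{H}}^\mathrm {conv}\).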

Our main result is the following:

Theorem 4.1

Let \(H\subseteq \mathbb {R}\times [0,\infty )\) be a compact set, \(K:=\{\sigma \in \mathbb {R}^{3\times 3}_\mathrm {sym}: (p(\sigma ), q(\sigma ))\in H\}\). If the set \(H^\mathrm {rel}\) defined in (4.7) and (4.8) is connected, then \(\Phi (K^{\mathrm {sdqc}})= \Phi (K^{(\infty )})= H^\mathrm {rel}\).

Proof

The result follows from Lemmas 4.4 and 4.10 below, using the inclusion \(K^{(\infty )}\subseteq K^{\mathrm {sdqc}}\) that was proven in Lemma 3.8. \(\quad \square \)

With an additional condition on the tangent to the boundary of \(H^\mathrm {rel}\), we obtain a full characterization of the hull. The necessity of the condition on the tangent is proven in Lemma 4.12 below.

Theorem 4.2

Under the assumptions of Theorem 4.1, if additionally the tangent to \(\partial H^\mathrm {rel}\) belongs to \(\{e\in S^1: |e_2|\leqq \frac{\sqrt{3}}{4}|e_1|\}\) for any \(y_*\in \partial H^\mathrm {rel}\setminus {\hat{H}}\), then \(K^{\mathrm {sdqc}}=K^{(\infty )}=\{\sigma :\Phi (\sigma )\in H^\mathrm {rel}\}\).

Proof

The result follows from Lemmas 4.4 and 4.11 below, using the inclusion \(K^{(\infty )}\subseteq K^{\mathrm {sdqc}}\) that is proven in Lemma 3.8. \(\quad \square \)

Fig. 2

Sketch of the construction of \(H^\mathrm {rel}\) in the case that H consists of two points. The set \({\hat{H}}\) consists of two segments, which join the points in H with their projections on the \(\{q=0\}\) axis. The set \(H^\mathrm {rel}\) consists of the part of the rectangle between these two lines that cannot be separated by the function \(f_{y_0}\) for any \(y_0\). Graphically, this corresponds to delimiting the set by the graph of \(f_{y_0}\). In this case, it suffices to consider a single function of the family (dotted)

4.2 Outer Bound

The next two lemmas contain the proof of the outer bound, i.e., the inclusion \(\Phi (K^{\mathrm {sdqc}})\subseteq H^\mathrm {rel}\).

Lemma 4.3

Let \(g:\mathbb {R}^{3\times 3}_\mathrm {sym}\rightarrow \mathbb {R}\) be defined by \(g(\xi ):=f_{y_0}(p(\xi ),q(\xi ))\), where \(f_{y_0}(p,q):=4(q^2-q_0^2)-3 (p-p_0)^2\) and \(y_0=(p_0, q_0)\in \mathbb {R}\times [0,\infty )\). Then, g is symmetric \(\mathrm {div\,}\)-quasiconvex.

Proof

By Lemma 2.7, we know that the function \(f_\mathrm {T}:\mathbb {R}^{3\times 3}_\mathrm {sym}\rightarrow \mathbb {R}\),

$$\begin{aligned} f_\mathrm {T}(\xi ):=2|\xi |^2-(\mathop {\mathrm {Tr}}\xi )^2 \end{aligned}$$
(4.9)

is symmetric \(\mathrm {div\,}\)-quasiconvex. From

$$\begin{aligned} |\xi |^2=|\xi -p(\xi ){\text {Id}}|^2+|p(\xi ){\text {Id}}|^2=2q(\xi )^2 + 3 p(\xi )^2 , \end{aligned}$$
(4.10)

we obtain

$$\begin{aligned} f_\mathrm {T}(\xi )=4q(\xi )^2-3p(\xi )^2. \end{aligned}$$
(4.11)

Therefore, \(g(\xi )=f_\mathrm {T}(\xi -p_0{\text {Id}})-4q_0^2\) is symmetric \(\mathrm {div\,}\)-quasiconvex. \(\quad \square \)
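The identities (4.11) and \(g(\xi )=f_\mathrm {T}(\xi -p_0{\text {Id}})-4q_0^2\) are purely algebraic and can be confirmed on random symmetric matrices. The following sketch assumes the normalization \(2q^2(\xi )=|\xi -p(\xi ){\text {Id}}|^2\) from (4.10); all function names are ours, and the sampling ranges are arbitrary.

```python
import math
import random

def trace(xi):
    return xi[0][0] + xi[1][1] + xi[2][2]

def invariants(xi):
    """Return (p, q) with p = Tr(xi)/3 and 2 q^2 = |xi - p Id|^2."""
    p = trace(xi) / 3.0
    dev2 = sum((xi[i][j] - (p if i == j else 0.0)) ** 2
               for i in range(3) for j in range(3))
    return p, math.sqrt(dev2 / 2.0)

def tartar(xi):
    """Tartar's function f_T(xi) = 2 |xi|^2 - (Tr xi)^2, see (4.9)."""
    norm2 = sum(xi[i][j] ** 2 for i in range(3) for j in range(3))
    return 2.0 * norm2 - trace(xi) ** 2

def shift(xi, p0):
    """Return xi - p0 * Id."""
    return [[xi[i][j] - (p0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

def random_sym(rng):
    a = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    return [[0.5 * (a[i][j] + a[j][i]) for j in range(3)] for i in range(3)]

def max_identity_error(samples=200, seed=0):
    """Largest deviation from (4.11) and from g = f_T(. - p0 Id) - 4 q0^2."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        xi = random_sym(rng)
        p0, q0 = rng.uniform(-1.0, 1.0), rng.uniform(0.0, 1.0)
        p, q = invariants(xi)
        err_411 = abs(tartar(xi) - (4.0 * q * q - 3.0 * p * p))
        g_direct = 4.0 * (q * q - q0 * q0) - 3.0 * (p - p0) ** 2
        err_g = abs(tartar(shift(xi, p0)) - 4.0 * q0 * q0 - g_direct)
        worst = max(worst, err_411, err_g)
    return worst

print(max_identity_error())  # expected to be of the order of rounding errors
```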

Lemma 4.4

Under the assumptions of Theorem 4.1, \(\Phi (K^{\mathrm {sdqc}})\subseteq H^\mathrm {rel}\).

Proof

We pick a \(\sigma \in K^{\mathrm {sdqc}}\) and define \(y:=(p(\sigma ),q(\sigma ))\). We need to show that \(y\in H^\mathrm {rel}\).

If \(y\not \in {\hat{H}}^\mathrm {conv}\), then there is an affine function \(a:\mathbb {R}^2\rightarrow \mathbb {R}\) of the form \((p,q)\mapsto a(p,q)=bp+cq+d\) such that \(a(y)>0\) and \(a\leqq 0\) on \({\hat{H}}\).

We first show that we can assume \(c\geqq 0\). Indeed, if this were not the case, we could consider the new affine function \(a'(p,q):=bp+d\), which obeys \(a'(y)\geqq a(y)>0\). Let now \((p',q')\in {\hat{H}}\). By the definition of \({\hat{H}}\) we have \((p',0)\in {\hat{H}}\). By the definition of \(a'\) and the properties of a we obtain \(a'(p',q')=a(p',0)\leqq 0\). Therefore, we can assume \(c\geqq 0\), or, equivalently, that a is nondecreasing in its second argument.

The function \(g:\mathbb {R}^{3\times 3}_\mathrm {sym}\rightarrow \mathbb {R}\), \(g(\xi ):=a(p(\xi ),q(\xi ))\) is the composition of convex functions, with p linear, and a nondecreasing in the second argument. Therefore, g is convex, as can be easily verified:

$$\begin{aligned} g(\lambda \xi _1+(1-\lambda ) \xi _2)&= a(p(\lambda \xi _1+(1-\lambda ) \xi _2), q(\lambda \xi _1+(1-\lambda ) \xi _2))\\&\leqq a(\lambda p(\xi _1)+(1-\lambda ) p(\xi _2), \lambda q(\xi _1)+(1-\lambda ) q(\xi _2))\\&= \lambda g(\xi _1)+(1-\lambda ) g(\xi _2). \end{aligned}$$

In particular, \(g\leqq 0\) on K, \(g(\sigma )>0\) and g is convex. Hence, \(\sigma \) does not belong to the convex hull of K and neither does it belong to the symmetric \(\mathrm {div\,}\)-quasiconvex hull.

Assume now that \(y\in {\hat{H}}^\mathrm {conv}\setminus H^\mathrm {rel}\). Then, it is separated from \({\hat{H}}\) in the sense of (4.8). Let \(y_0=(p_0,q_0)\) be as in the definition of separation. By Lemma 4.3 the function \(\xi \mapsto f_{y_0}(p(\xi ), q(\xi ))=4(q(\xi )^2-q_0^2)-3(p(\xi )-p_0)^2\) is symmetric \(\mathrm {div\,}\)-quasiconvex and this implies \(\sigma \not \in K^{\mathrm {sdqc}}\). Therefore, \(\Phi (K^{\mathrm {sdqc}})\subseteq H^\mathrm {rel}\). \(\quad \square \)

4.3 Inner Bound

We now prove the inner bound. Specifically, we first show that for any \(y_*\in H^\mathrm {rel}\) there is a matrix \(\sigma \in K^{(\infty )}\) with \(\Phi (\sigma )=y_*\) (Lemma 4.10) and then that, if an additional condition on the slope of the boundary of \(H^\mathrm {rel}\) is fulfilled, any matrix \(\sigma \) with \(\Phi (\sigma )=y_*\) belongs to \(K^{(\infty )}\) (Lemma 4.11).

Our key result is a characterization of a family of rank-two curves in the \((p,q)\) plane. We say that \(t\mapsto \gamma (t)\) is a rank-two curve if it is a reparametrization of \(s\mapsto \Phi (A+s(B-A))\) for some A, \(B\in \mathbb {R}^{3\times 3}_\mathrm {sym}\) with \(\mathop {\mathrm {rank}}(A-B)\leqq 2\). The curves we construct are at the same time level sets of symmetric \(\mathrm {div}\)-quasiconvex functions, either of the type used to separate points in the definition of \(H^\mathrm {rel}\) or (piecewise) affine. This allows us (see proof of Lemma 4.10) to show below that any point in \({\hat{H}}^\mathrm {conv}\) that cannot be separated from \({\hat{H}}\) can be constructed. This strategy is illustrated in Fig. 3.

Fig. 3

Strategy for the proof of the inner bound. From every point y, we construct a one-parameter family of rank-two lines that start in all possible directions (left panel) and which are at the same time level sets of symmetric \(\mathrm {div}\)-quasiconvex functions. Then, we distinguish two cases: if there is a direction such that the rank-two line intersects the set H on both sides of y, then y belongs to the hull. If there is a direction such that the rank-two line does not intersect H on any side of y, then we can separate y from H. By continuity of the family of curves and compactness of H, one of the two must occur

Lemma 4.5

Let K, H and \({\hat{H}}\) be as above. Then, any \(\sigma _*\in \mathbb {R}^{3\times 3}_\mathrm {sym}\) with \((p(\sigma _*),q(\sigma _*))\in {\hat{H}}\) belongs to \(K^{(\infty )}\).

Proof

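Let \(\sigma _*\in \mathbb {R}^{3\times 3}_\mathrm {sym}\) be such that \(p_*:=p(\sigma _*)\), \(q_*:=q(\sigma _*)\) obey \((p_*,q_*+a)\in H\) for some \(a\geqq 0\). If \(a=0\), then \(\sigma _*\in K\subseteq K^{(\infty )}\) and there is nothing to prove, so we assume \(a>0\). We consider the rank-two line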

$$\begin{aligned} t\mapsto \xi _t := \sigma _* + \begin{pmatrix} t&{}\quad 0&{}\quad 0\\ 0&{}\quad -t&{}\quad 0\\ 0&{}\quad 0&{}\quad 0 \end{pmatrix}. \end{aligned}$$
(4.12)

This obeys \(\xi _0=\sigma _*\) and \(p(\xi _t)=p_*\) for all t. The map \(t\mapsto q(\xi _t)\) is continuous, equals \(q_*\) at \(t=0\) and diverges for \(t\rightarrow \pm \infty \). Hence, there are \(t_-<0<t_+\) such that \(q(\xi _{t_\pm })=q_*+a\). In particular, \(\xi _{t_\pm }\in K\) and, therefore, (Lemma 3.11) \(\sigma _*=\xi _0\in K^{(\infty )}\). \(\quad \square \)

Lemma 4.6

Let \(y=(p_*,q_*)\in \mathbb {R}\times (0,\infty )\). Then, there is a continuous function \(\Gamma _y:S^1\times \mathbb {R}\rightarrow \mathbb {R}\times [0,\infty )\) such that for any \(e\in S^1\) the map \(t\mapsto \Gamma _y(e,t)\) is a rank-two curve parametrized by arc-length, with \(\Gamma _y(e,0)=y\), \(\partial _t \Gamma _y(e,0)=e\), and \(\Gamma _y(e,t)=\Gamma _y(-e,-t)\). The curves \(\Gamma _y(e,\cdot )\) are either of the form (4.14) or of the form (4.19).

Fig. 4

Sketch of the lines constructed in the proof of Lemma 4.6. Left panel: directions in \(S^1_+\), lines defined in (4.14). The point \(y=(p_*,q_*)\) and one choice of \((p_0,0)\) are marked. Right panel: directions in \(S^1_-\), lines defined in (4.19). The point \(y=(p_*,q_*)\) and one choice of \((p_0,0)\) are marked

Proof

For reasons that will become clear subsequently, we treat separately the two sets

$$\begin{aligned} S^1_+:=\left\{ e\in S^1: |e_2|\geqq \frac{\sqrt{3}}{2}|e_1|\right\} \quad \text { and }\quad S^1_-:=\left\{ e\in S^1: |e_2|\leqq \frac{\sqrt{3}}{2}|e_1|\right\} .\nonumber \\ \end{aligned}$$
(4.13)

We observe that both are closed, that their union is \(S^1\) and their intersection consists of the four points \((\pm \frac{2}{\sqrt{7}},\pm \frac{\sqrt{3}}{\sqrt{7}})\).

We start from \(S^1_+\). For \(p_0,a\in \mathbb {R}\), we consider the rank-two line

$$\begin{aligned} t\mapsto \xi _t := \begin{pmatrix} p_0+(1+ a) t &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad p_0 + (1-a) t &{}\quad 0\\ 0 &{}\quad 0 &{}\quad p_0 \end{pmatrix} \end{aligned}$$
(4.14)

(see Fig. 4, left panel). We compute

$$\begin{aligned} p(\xi _t)=p_0 + \frac{2}{3} t \quad \text { and }\quad q^2(\xi _t)= \left( \frac{1}{3}+a^2\right) t^2. \end{aligned}$$
(4.15)

Solving for t the first equation and inserting into the second, we obtain that the graph of \(t\mapsto (p(\xi _t), q(\xi _t))\) is the set

$$\begin{aligned} q^2= \frac{3}{4} (1+3a^2)(p-p_0)^2 , \end{aligned}$$
(4.16)

which can be rewritten (recalling that \(q\geqq 0\)) as

$$\begin{aligned} q= \frac{\sqrt{3}}{2} \sqrt{1+3a^2} |p-p_0|. \end{aligned}$$
(4.17)

Therefore, any line of the form \(\{q=\alpha |p-p_0|\}\) with \(|\alpha |\geqq \sqrt{3}/2\) is a rank-two line of the type given in (4.14). In turn, this means that we can define

$$\begin{aligned} \Gamma _y(e,t):= \Pi (y+et) \quad \text { for } e\in S^1_+, \end{aligned}$$
(4.18)

where \(\Pi (p,q):=(p,|q|)\) denotes reflection onto the upper half-plane.

We now turn to \(S^1_-\). Let \((p_0,q_0)\in \mathbb {R}\times [0,\infty )\) and consider the rank-two line

$$\begin{aligned} t\mapsto \xi _t := \begin{pmatrix} p_0+q_0+ t &{} 0 &{} 0 \\ 0 &{} p_0 -q_0+ t &{} 0\\ 0 &{} 0 &{} p_0 \end{pmatrix}. \end{aligned}$$
(4.19)

As above, a simple computation shows that

$$\begin{aligned} p(\xi _t)=p_0 + \frac{2}{3} t \quad \text { and }\quad q^2(\xi _t) = q_0^2 + \frac{1}{3} t^2. \end{aligned}$$
(4.20)

We now consider the equation \((p(\xi _{t_*}), q(\xi _{t_*}))=(p_*, q_*)\). For every \(t_*\in [-\sqrt{3}q_*, \sqrt{3}q_*]\) there is a unique solution \((p_0, q_0)\in \mathbb {R}\times [0,\infty )\), namely,

$$\begin{aligned} p_0=p_*-\frac{2}{3} t_* \quad \text { and }\quad q_0=\sqrt{ q_*^2-\frac{1}{3} t_*^2}. \end{aligned}$$
(4.21)

We compute

$$\begin{aligned} \left. \frac{\mathrm{d}}{\mathrm{d}t} \begin{pmatrix} p(\xi _t)\\ q(\xi _t) \end{pmatrix}\right| _{t=t_*}=\begin{pmatrix} 2/3 \\ t_*/(3 q_*) \end{pmatrix}= \frac{1}{3q_*} \begin{pmatrix} 2q_* \\ t_* \end{pmatrix}. \end{aligned}$$
(4.22)

Since we can choose \(t_*\) freely in \([-\sqrt{3}q_*, \sqrt{3}q_*]\), we conclude that for every \(e\in S^1_-\) there is a unique triplet \((p_0,q_0,t_*)\) such that the curve \(t\mapsto (p(\xi _t),q(\xi _t))\) passes through \(y=(p_*,q_*)\) at \(t=t_*\) with the tangent parallel to e. Indeed, this solution can be explicitly written as

$$\begin{aligned} t_*=2q_* \frac{e_2}{e_1} \,,\quad q_0=\sqrt{q_*^2-\frac{1}{3} t_*^2}\,,\quad p_0=p_*-\frac{2}{3} t_*. \end{aligned}$$
(4.23)

It is clear that this solution and, hence, \(\xi _t\), depends continuously on e. We finally define \(\Gamma _y(e,t)\) for \(e\in S^1_-\) as the arc-length reparametrization of \(t\mapsto \xi _{t_*+t}\) or \(t\mapsto \xi _{t_*-t}\) depending on the sign of \(e_1\) (see Fig. 4, right panel).

It remains to check that this definition agrees with the previous one for the four points in \(S^1_-\cap S^1_+\). For these points, the formulas above give \(q_0=0\) and \(a=0\), so that the two definitions of \(\xi _t\) also coincide (with the same \(p_0\)). This concludes the proof. \(\quad \square \)
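The construction for \(S^1_-\) is explicit enough to be traced numerically. The following sketch, in which the function names and the sample values of \(y\) and e are our own choices, computes \((p_0,q_0,t_*)\) from (4.23) and checks, using (4.20), that the resulting curve passes through \(y=(p_*,q_*)\) with tangent parallel to e.

```python
import math

def hyperbola_through(p_star, q_star, e):
    """Parameters (p0, q0, t_star) from (4.23) for a direction e in S^1_-."""
    e1, e2 = e
    t_star = 2.0 * q_star * e2 / e1
    # max(..., 0) only guards against rounding at the endpoints of S^1_-.
    q0 = math.sqrt(max(q_star ** 2 - t_star ** 2 / 3.0, 0.0))
    p0 = p_star - 2.0 * t_star / 3.0
    return p0, q0, t_star

def curve(p0, q0, t):
    """Image (p, q) of the rank-two line (4.19), using (4.20)."""
    return p0 + 2.0 * t / 3.0, math.sqrt(q0 ** 2 + t ** 2 / 3.0)

def tangent(p0, q0, t, h=1e-6):
    """Central finite-difference approximation of d/dt (p, q)."""
    p_plus, q_plus = curve(p0, q0, t + h)
    p_minus, q_minus = curve(p0, q0, t - h)
    return (p_plus - p_minus) / (2.0 * h), (q_plus - q_minus) / (2.0 * h)

# Sample point and direction with |e_2| / |e_1| = 1/2 < sqrt(3)/2.
p_star, q_star = 0.2, 1.5
e = (2.0 / math.sqrt(5.0), 1.0 / math.sqrt(5.0))
p0, q0, t_star = hyperbola_through(p_star, q_star, e)
dp, dq = tangent(p0, q0, t_star)
print(curve(p0, q0, t_star))      # should reproduce (p_star, q_star)
print(dp * e[1] - dq * e[0])      # cross product with e, close to zero
```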

Lemma 4.7

Let \(y_*=(p_*,q_*)\) with \(q_*>0\), and assume that there are \(e\in S^1\) and \(t_-<0<t_+\) such that \(\Gamma _{y_*}(e,t_\pm )\in {\hat{H}}\), where \(\Gamma _{y_*}\) is the map constructed in Lemma 4.6. Then, \(y_*\in \Phi ( K^{(\infty )})\). If, additionally, \(|e_2|\leqq \frac{\sqrt{3} }{4} |e_1|\) then any matrix \(\sigma _*\in \mathbb {R}^{3\times 3}_\mathrm {sym}\) with \(\Phi (\sigma _*)=y_*\) belongs to \(K^{(\infty )}\).

Proof

In order to prove the first assertion we observe that, by Lemma 4.6, there is a rank-two line \(t\mapsto \xi _t\) such that \(\Phi (\xi _0)=y_*\) and \(\Gamma _{y_*}(e,\mathbb {R})\) is the graph of \(t\mapsto \Phi (\xi _t)\). In particular, there is \(s_-<0\) such that \((p,q)(\xi _{s_-})=\Gamma _{y_*}(e,t_-)\in {\hat{H}}\), which by Lemma 4.5 implies that \(\xi _{s_-}\in K^{(\infty )}\). Analogously for \(s_+\). By Lemma 3.11, we obtain \(\xi _0\in K^{(\infty )}\) and, therefore, \(y_*=\Phi (\xi _0)\in \Phi (K^{(\infty )})\).

We now turn to the second assertion. By Lemma 4.8 below, there is a rank-two line \(t\mapsto \xi _t\) with the same properties and, additionally, with \(\xi _0=\sigma _*\). The same argument then implies \(\sigma _*\in K^{(\infty )}\). \(\quad \square \)

Lemma 4.8

Let \(\sigma _*\in \mathbb {R}^{3\times 3}_\mathrm {sym}\). Let \(e\in S^1\) be such that \(|e_2|\leqq \frac{\sqrt{3}}{4} |e_1|\). Then, there is a rank-two line \(t\mapsto \xi _t\) through \(\xi _0=\sigma _*\) such that the curve \(t\mapsto (p(\xi _t),q(\xi _t))\) is a hyperbola of the type (4.20) which is parallel to e at \(t=0\).

Proof

Any rank-two line through \(\sigma _*\) has the form \(t\mapsto \xi _t:=\sigma _*+t B\), for some \(B\in \mathbb {R}^{3\times 3}_\mathrm {sym}\) with \(\det B=0\). Let a, b be the eigenvalues of B, and let e, f be a pair of orthonormal vectors such that \(B=ae\otimes e + bf\otimes f\). We let \(p_*:=p(\sigma _*)\), \(q_*:=q(\sigma _*)\) and compute

$$\begin{aligned} p(\xi _t)=p_*+\frac{a+b}{3} t \end{aligned}$$
(4.24)

and

$$\begin{aligned} \begin{aligned} 2q^2(\xi _t)=&|\xi _t|^2-3p(\xi _t)^2 \\ =&2q^2_*+t^2(a^2+b^2 - \frac{1}{3} (a+b)^2) \\&+2t (ae\cdot \sigma _* e+bf\cdot \sigma _* f)-2t p_*(a+b). \end{aligned} \end{aligned}$$
(4.25)

From (4.24), we obtain \(t=3(p(\xi _t)-p_*)/(a+b)\). Inserting in the previous expression leads to

$$\begin{aligned} \begin{aligned} 2q^2(\xi _t)=&2q^2_*+6(p(\xi _t)-p_*)^2 \frac{(a+b)^2-3ab}{(a+b)^2} \\&+6\frac{p(\xi _t)-p_*}{a+b} (ae\cdot \sigma _* e+bf\cdot \sigma _* f)-6p_*(p(\xi _t)-p_*) \end{aligned} \end{aligned}$$
(4.26)

(the case \(a+b=0\) is not relevant, since in this case \(t\mapsto p(\xi _t)\) is constant). The expression

$$\begin{aligned} \frac{(a+b)^2-3ab}{(a+b)^2}=\frac{1}{4} +\frac{3}{4} \frac{ (a-b)^2}{(a+b)^2} \end{aligned}$$
(4.27)

can take any value in \([1/4,\infty )\) and the value 1/4 is taken if and only if \(a=b\). Therefore, the coefficient of the quadratic term \((p(\xi _t)-p_*)^2 \) can be the required value of 3/2 [see (4.20)] if and only if \(a=b\). We can scale to \(a=b=1\) and obtain

$$\begin{aligned} \begin{aligned} 2q^2(\xi _t)=&2q^2_*+\frac{3}{2}(p(\xi _t)-p_*)^2 \\&+3(p(\xi _t)-p_*) (e\cdot \sigma _* e+f\cdot \sigma _* f)-6p_*(p(\xi _t)-p_*). \end{aligned} \end{aligned}$$
(4.28)

We are left with the task of choosing e and f. Let \(g:=e\wedge f\), so that (e, f, g) is an orthonormal basis of \(\mathbb {R}^3\). Then,

$$\begin{aligned} e\cdot \sigma _* e+f\cdot \sigma _* f+g\cdot \sigma _*g = \mathop {\mathrm {Tr}}\sigma _*=3p_* , \end{aligned}$$
(4.29)

so that, after some rearrangement, the linear term takes the form

$$\begin{aligned} \begin{aligned}&3(p(\xi _t)-p_*)(p_*-g\cdot \sigma _* g). \end{aligned} \end{aligned}$$
(4.30)

We conclude that the graph of \(t\mapsto (p(\xi _t),q(\xi _t))\) is the graph of the curve defined by

$$\begin{aligned} 2q^2= 2q^2_*+\frac{3}{2}(p-p_*)^2 +3(p-p_*)(p_*-g\cdot \sigma _* g) \end{aligned}$$
(4.31)

and its derivative at \(p_*\) is given by

$$\begin{aligned} \left. \frac{dq}{dp}\right| _{p=p_*}=\frac{3}{4q_*} (p_*-g\cdot \sigma _* g). \end{aligned}$$
(4.32)

It remains to show that we can choose B such that this quantity equals \(e_2/e_1\), which is a number in \([-\sqrt{3}/4,\sqrt{3}/4]\). To this end, we first show that the ordered eigenvalues \(\lambda _1\leqq \lambda _2\leqq \lambda _3\) of the matrix \(\sigma _D:=\sigma _*-p_*{\text {Id}}\) obey \(\lambda _1\leqq -q_*/\sqrt{3}\), \(\lambda _3\geqq q_*/\sqrt{3}\). Indeed, assume the former was not the case. If \(\lambda _2\leqq 0\), then \(\lambda _3<2 q_*/\sqrt{3}\) and \(\lambda _1^2+\lambda _2^2+\lambda _3^2<(1/3+1/3+4/3 )q_*^2=2q_*^2\), which is a contradiction. If, instead, \(\lambda _2\geqq 0\), then \(\lambda _2,\lambda _3\leqq q_*/\sqrt{3}\), with the same conclusion. The argument for \(\lambda _3\) is similar.

Therefore, the set \(\{g\cdot \sigma _D g : g\in S^2\}\) contains the interval \([-q_*/\sqrt{3},q_*/\sqrt{3}]\), and we can choose g (and hence e, f) such that \(p_*-g\cdot \sigma _* g=-g\cdot \sigma _D g=4q_*e_2/(3e_1)\in [-q_*/\sqrt{3},q_*/\sqrt{3}]\). \(\quad \square \)
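The slope formula (4.32) lends itself to a finite-difference check. In the sketch below we take \(a=b=1\), so that \(B=e\otimes e+f\otimes f={\text {Id}}-g\otimes g\) for the unit vector g; the matrix \(\sigma _*\), the vector g and the function names are arbitrary illustrative choices, not taken from the text.

```python
import math

def invariants(m):
    """Return (p, q) with p = Tr(m)/3 and 2 q^2 = |m - p Id|^2."""
    p = (m[0][0] + m[1][1] + m[2][2]) / 3.0
    dev2 = sum((m[i][j] - (p if i == j else 0.0)) ** 2
               for i in range(3) for j in range(3))
    return p, math.sqrt(dev2 / 2.0)

def point_on_line(sigma, g, t):
    """sigma + t * B with B = Id - g (x) g, a rank-two matrix for |g| = 1."""
    return [[sigma[i][j] + t * ((1.0 if i == j else 0.0) - g[i] * g[j])
             for j in range(3)] for i in range(3)]

def slope_formula(sigma, g):
    """Right-hand side of (4.32): 3 (p_* - g . sigma g) / (4 q_*)."""
    p_star, q_star = invariants(sigma)
    g_sigma_g = sum(g[i] * sigma[i][j] * g[j]
                    for i in range(3) for j in range(3))
    return 3.0 * (p_star - g_sigma_g) / (4.0 * q_star)

def slope_numeric(sigma, g, h=1e-5):
    """Central finite-difference approximation of dq/dp at t = 0."""
    p_plus, q_plus = invariants(point_on_line(sigma, g, h))
    p_minus, q_minus = invariants(point_on_line(sigma, g, -h))
    return (q_plus - q_minus) / (p_plus - p_minus)

sigma_star = [[1.0, 0.2, 0.0], [0.2, -0.5, 0.3], [0.0, 0.3, 0.4]]
g = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]   # unit vector
print(slope_formula(sigma_star, g), slope_numeric(sigma_star, g))
```

The two printed numbers should agree up to the discretization error of the difference quotient.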

Lemma 4.9

Let \(p_{\min }:=\min \{p: \exists q, (p,q)\in H\}\), \(p_{\max }:=\max \{p: \exists q, (p,q)\in H\}\) and

$$\begin{aligned}&\displaystyle A:=[p_{\min },p_{\max }], \end{aligned}$$
(4.33)
$$\begin{aligned}&\displaystyle B:=\{p: p{\text {Id}}\in K^{(\infty )}\}, \end{aligned}$$
(4.34)
$$\begin{aligned}&\displaystyle C:=\{p: (p,0)\in H^\mathrm {rel}\}. \end{aligned}$$
(4.35)

Assume \(H^\mathrm {rel}\) is connected. Then, \(A=B=C\).

We remark that the definition of A immediately implies \({\hat{H}}^\mathrm {conv}\subseteq A\times [0,\infty )\).

Proof

By convexity, we easily obtain \(B\subseteq A\) and \(C\subseteq A\). By the construction of \({\hat{H}}\), we have \(p_{\min }\in C\), \(p_{\max }\in C\). From the construction of \(H^\mathrm {rel}\), we see that \((p,q)\in H^\mathrm {rel}\) implies that the segment joining (p, q) with (p, 0) also belongs to \(H^\mathrm {rel}\). This proves that \(H^\mathrm {rel}\) is connected if and only if C is connected and that C is the orthogonal projection of \(H^\mathrm {rel}\) onto the \(q=0\) axis. In particular, we have \(A=C\).

It remains to show that \(A\subseteq B\). By Lemma 4.5, we have that \(p_{\min }\in B\) and \(p_{\max }\in B\). We define

$$\begin{aligned} D_+:=\bigcup \left\{ \left[ p,p+\frac{2}{\sqrt{3}} q\right] : (p,q)\in H\right\} \end{aligned}$$
(4.36)

and

$$\begin{aligned} D_-:=\bigcup \left\{ \left[ p-\frac{2}{\sqrt{3}} q,p\right] : (p,q)\in H\right\} . \end{aligned}$$
(4.37)

We first show that \(D_+\cap D_-\subseteq B\). Indeed, let \(p_*\in D_+\cap D_-\) and let \(\sigma _*:=p_*{\text {Id}}\). By assumption, there are \((p_-,q_-), (p_+,q_+)\in H\) such that \(p_-\leqq p_*\leqq p_+\), \(q_-\geqq \gamma (p_*-p_-)\), \(q_+\geqq \gamma (p_+-p_*)\), where \(\gamma :=\frac{\sqrt{3}}{2}\). In particular, \((p_-, \gamma (p_*-p_-))\in {\hat{H}}\) and \((p_+, \gamma (p_+-p_*))\in {\hat{H}}\). We consider the rank-two line

$$\begin{aligned} t\mapsto \xi _t := \begin{pmatrix} p_*+ t &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad p_* + t &{}\quad 0\\ 0 &{}\quad 0 &{}\quad p_* \end{pmatrix} \end{aligned}$$
(4.38)

and observe that there are \(t_-\leqq 0\leqq t_+\) such that \(p(\xi _{t_\pm })=p_\pm \), \(q(\xi _{t_\pm })=\gamma |p_\pm -p_*|\). Lemma 4.5 implies \(\xi _{t_\pm }\in K^{(\infty )}\) and, with Lemma 3.11, one then deduces \(\sigma _*=\xi _0\in K^{(\infty )}\).

We next show that \(A\subseteq D_+\cup D_-\). Indeed, if \(p_*\not \in D_+\cup D_-\) then \(q(\sigma )< \frac{\sqrt{3}}{2} |p(\sigma )-p_*|\) for any \(\sigma \in K\). Consider the function \(f(p,q):=4q^2-3(p-p_*)^2\). Then, \(f(p,q)<0=f(p_*,0)\) for all \((p,q)\in {\hat{H}}\), therefore \((p_*,0)\) is separated from \({\hat{H}}\) and does not belong to \(H^\mathrm {rel}\). This implies that \(p_*\not \in C=A\).

Up until now we have shown that

$$\begin{aligned} D_+\cap D_-\subseteq B\subseteq A\subseteq D_+\cup D_-. \end{aligned}$$
(4.39)

Assume that there is \(p_*\in A{\setminus }B\). Without loss of generality, assume \(p_*\in D_+\). Let \({\bar{p}} :=\min \{p\in B: p>p_*\}\). Since \(p_{\max }\in B\), the set is nonempty. Since B is closed, \(p_*<{\bar{p}}\). The sets \(D_+\) and \(D_-\) are compact, cover the interval \([p_*,{\bar{p}}]\) and are disjoint in \([p_*,{\bar{p}})\). Therefore, \([p_*,{\bar{p}}]\subseteq D_+\).

Let \(p'\in (p_*,{\bar{p}})\subseteq D_+\). If there was \(q'\geqq 0\) such that \((p',q')\in H\), then we would have \((p',0)\in {\hat{H}}\) and \(p'\in B\). Therefore, \(([p_*,{\bar{p}})\times [0,\infty ))\cap H=\emptyset \). By the definition of \(D_+\), for any \(p'\in (p_*,{\bar{p}})\) there is a point \(y=(p_-,q_-)\in H\) with \(p_-<p_*\), \(q_-\geqq \gamma (p'-p_-)\). Consider a sequence of such points, \(p'_j\rightarrow {\bar{p}}\). By compactness of H, the corresponding points \(y_j=(p^-_j, q^-_j)\) converge (after extracting a subsequence) to some \(y_0=(p_0,q_0)\in H\), with \(q_0\geqq \gamma ({\bar{p}}-p_0)\). Since \(p^-_j<p_*\) for all j, we have \(p_0\leqq p_*\); since \(y_0\in H\) and H does not intersect \([p_*,{\bar{p}})\times [0,\infty )\), we obtain \(p_0< p_*\).

We finally consider the rank-two line

$$\begin{aligned} t\mapsto \xi _t := \begin{pmatrix} {\bar{p}}+ t &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {\bar{p}} + t &{}\quad 0\\ 0 &{}\quad 0 &{}\quad {\bar{p}} \end{pmatrix}. \end{aligned}$$
(4.40)

Let \(t_0\) be such that \({\bar{p}} + \frac{2}{3} t_0=p_0\). The condition \({\bar{p}}\in B\) corresponds to \(\xi _0={\bar{p}} {\text {Id}}\in K^{(\infty )}\), the definition of \(y_0\) shows that \(\Phi (\xi _{t_0})\in {\hat{H}}\) and, with Lemma 4.5, we obtain \(\xi _{t_0}\in K^{(\infty )}\). Therefore, \(\xi _t\in K^{(\infty )}\) for all \(t\in [t_0,0]\).

Let now \(t_1\in (t_0,0)\) be such that \({\bar{p}} + \frac{2}{3} t_1=p_*\). After swapping coordinates, we see that the two matrices

$$\begin{aligned} \xi _A:=\xi _{t_1}=\begin{pmatrix} {\bar{p}}+ t_1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {\bar{p}} + t_1 &{}\quad 0\\ 0 &{}\quad 0 &{}\quad {\bar{p}} \end{pmatrix}\,,\quad \xi _B:=\begin{pmatrix} {\bar{p}}+ t_1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {\bar{p}} &{}\quad 0\\ 0 &{} 0 &{} {\bar{p}}+ t_1 \end{pmatrix} \end{aligned}$$

belong to \(K^{(\infty )}\). Since \(\mathop {\mathrm {rank}}(\xi _A-\xi _B)=2\), so do all matrices in the segment joining them and, in particular,

$$\begin{aligned} \xi _C:=\begin{pmatrix} {\bar{p}}+ t_1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {\bar{p}} + \frac{2}{3} t_1 &{}\quad 0\\ 0 &{} \quad 0 &{}\quad {\bar{p}}+\frac{1}{3} t_1 \end{pmatrix}\,. \end{aligned}$$

Again, swapping coordinates, the same is true for

$$\begin{aligned} \xi _D:=\begin{pmatrix} {\bar{p}}+ \frac{1}{3}t_1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {\bar{p}} + \frac{2}{3} t_1 &{}\quad 0\\ 0 &{}\quad 0 &{}\quad {\bar{p}}+ t_1 \end{pmatrix}\,. \end{aligned}$$

Since \(\mathop {\mathrm {rank}}(\xi _D-\xi _C)=2\) and \(p_*{\text {Id}}=\frac{1}{2}\xi _D+\frac{1}{2}\xi _C\), we obtain \(p_*{\text {Id}}\in K^{(\infty )}\). This implies \(p_*\in B\), a contradiction. Therefore, we conclude that \(A\subseteq B\). \(\quad \square \)
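The chain of rank-two connections \(\xi _A,\xi _B\rightarrow \xi _C\), \(\xi _C,\xi _D\rightarrow p_*{\text {Id}}\) involves only diagonal matrices and can be verified with a few lines of arithmetic. The sketch below stores each diagonal matrix as a triple; the values of \({\bar{p}}\) and \(t_1\) and the helper names are arbitrary illustrative choices.

```python
def lemma_4_9_matrices(p_bar, t1):
    """Diagonals of xi_A, xi_B, xi_C, xi_D from the proof of Lemma 4.9."""
    xi_A = (p_bar + t1, p_bar + t1, p_bar)
    xi_B = (p_bar + t1, p_bar, p_bar + t1)
    # xi_C lies on the segment joining xi_A and xi_B, with weights 2/3 and 1/3.
    xi_C = tuple(2.0 / 3.0 * a + 1.0 / 3.0 * b for a, b in zip(xi_A, xi_B))
    # xi_D is obtained from xi_C by swapping the first and third coordinates.
    xi_D = (xi_C[2], xi_C[1], xi_C[0])
    return xi_A, xi_B, xi_C, xi_D

def rank_of_difference(x, y, tol=1e-12):
    """Rank of the diagonal matrix diag(x) - diag(y)."""
    return sum(1 for a, b in zip(x, y) if abs(a - b) > tol)

p_bar, t1 = 1.0, -0.6
p_star = p_bar + 2.0 / 3.0 * t1
xi_A, xi_B, xi_C, xi_D = lemma_4_9_matrices(p_bar, t1)
midpoint = tuple(0.5 * (c + d) for c, d in zip(xi_C, xi_D))
print(rank_of_difference(xi_A, xi_B), rank_of_difference(xi_D, xi_C))
print(midpoint, p_star)   # every entry of the midpoint should equal p_star
```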

Lemma 4.10

Under the assumptions of Theorem 4.1, \(H^\mathrm {rel}\subseteq \Phi (K^{(\infty )})\).

Proof

We fix \(y_*=(p_*, q_*)\in H^\mathrm {rel}\). If \(q_*=0\), then, in the notation of Lemma 4.9, we have \(p_*\in C=B\) and therefore \(p_*{\text {Id}}\in K^{(\infty )}\). If \(y_*\in {\hat{H}}\), then the result follows from Lemma 4.5.

It remains to consider the case \(y_*\in H^\mathrm {rel}\setminus {\hat{H}}\) and \(q_*>0\). We consider the set of directions such that the rank-two line constructed in Lemma 4.6 intersects \({\hat{H}}_A:={\hat{H}}\cup A\times \{0\}\), where A is the set constructed in Lemma 4.9 and define

$$\begin{aligned} D(y_*):=\{e\in S^1: \Gamma _{y_*}(e,[0,\infty ))\cap {\hat{H}}_A\ne \emptyset \} \end{aligned}$$
(4.41)

(this is illustrated in Fig. 3). By continuity of \(\Gamma _{y_*}\) and compactness of \({\hat{H}}_A\), it follows that \(D(y_*)\) is a closed subset of \(S^1\).

We now distinguish two cases. If there is \(e\in D(y_*)\cap -D(y_*)\), then there are \(t_-<0<t_+\) such that \(\Gamma _{y_*}(e,t_\pm )\in {\hat{H}}_A\) and Lemma 4.7 implies that \(y_*\in \Phi (K^{(\infty )})\).

If instead there is no such e, then \(D(y_*)\) and \(-D(y_*)\) are disjoint. Since they are both closed, and \(S^1\) is connected, they cannot cover \(S^1\). In particular, there is \(e\in S^1\) such that \(e,-e\not \in D(y_*)\).

In the notation of Lemma 4.6, if \(e\in S^1_+\) then the curve \(\Gamma _{y_*}(e,\mathbb {R})\) is the graph of \(q=b|p-p_0|\) for some \(b\geqq \sqrt{3}/2\), \(p_0\in \mathbb {R}\) such that \(q_*=b|p_*-p_0|\). Assume, for definiteness, that \(p_0>p_*\). The remaining case is identical up to a few signs.

This curve does not intersect \({\hat{H}}_A\) and, by the form of \({\hat{H}}_A\), this implies that \(q<b|p-p_0|\) for all \((p,q)\in {\hat{H}}_A\). In particular, \(p_0\not \in A\). Since A is an interval and \(p_*\in A\), we have that \(A\subseteq (-\infty ,p_0)\) and \({\hat{H}}^\mathrm {conv}\subseteq (-\infty ,p_0)\times [0,\infty )\). Hence, \(q<b(p_0-p)\) for all \((p,q)\in {\hat{H}}\) and, by convexity, \(q<b(p_0-p)\) for all \((p,q)\in {\hat{H}}^\mathrm {conv}\). Since \(q_*=b(p_0-p_*)\), this contradicts the assumption \((p_*,q_*)\in H^\mathrm {rel}\subseteq {\hat{H}}^\mathrm {conv}\).

The case \(e\in S^1_-\) is similar. The curve \(\Gamma _{y_*}(e,\mathbb {R})\) is of the type \(\{f_{y_1}(\cdot )=0\}\), for some \(y_1\). Then, \(f_{y_1}(y_*)=0\) but \(f_{y_1}<0\) on \({\hat{H}}\), so that \(y_*\) is separated from \({\hat{H}}\), contradicting the assumption that \(y_*\in H^\mathrm {rel}\). \(\quad \square \)

Lemma 4.11

Under the assumptions of Theorem 4.1, if additionally the tangent to \(\partial H^\mathrm {rel}\) belongs to \(\{e\in S^1: |e_2|\leqq \frac{\sqrt{3}}{4}|e_1|\}\) for any \(y_*\in \partial H^\mathrm {rel}\setminus {\hat{H}}\), then any \(\sigma \) with \(\Phi (\sigma )\in H^\mathrm {rel}\) belongs to \(K^{(\infty )}\).

In particular, the assumption implies that \(\partial H^\mathrm {rel}\) is differentiable (as a graph) at any point not belonging to \({\hat{H}}\), but does not require differentiability on \({\hat{H}}\).

Proof

The argument is similar to the proof of the previous Lemma. By construction of \(H^\mathrm {rel}\), there is a map \(\psi : A\rightarrow [0,\infty )\) such that

$$\begin{aligned} H^\mathrm {rel}=\{(p,q): p\in A, 0\leqq q\leqq \psi (p)\}. \end{aligned}$$
(4.42)

We first show that any \(\sigma _*\) such that \((p_*,q_*):=\Phi (\sigma _*)\in \partial H^\mathrm {rel}\) belongs to \(K^{(\infty )}\). We distinguish several cases. If \(q_*=0\), then \(p_*\in A\) and the claim follows from the equality \(A=B\) in Lemma 4.9. If \((p_*,q_*)\in {\hat{H}}\), then the claim follows from Lemma 4.5. It remains the case that \((p_*,q_*)\in {\hat{H}}^\mathrm {conv}\setminus {\hat{H}}\) and cannot be separated from \({\hat{H}}\).

At this point, we repeat the argument in Lemma 4.10. In particular, since \(y_*\in H^\mathrm {rel}\) we know that there is \(e\in S^1\) such that \(e\in D(y_*)\cap -D(y_*)\). This means that there are \(t_-<0<t_+\) such that \(\Gamma _{y_*}(e,t_\pm )\in {\hat{H}}\) and that \(\Gamma _{y_*}(e,t)\in H^\mathrm {rel}\) for all \(t\in [t_-,t_+]\). This implies that \(\Gamma _{y_*}(e,\cdot )\) is tangential to \(\partial H^\mathrm {rel}\) at \(t=0\) and, in particular, that e is tangential to \(\partial H^\mathrm {rel}\). We remark that e cannot be \((0,\pm 1)\), since in that case we would have \(y_*\in {\hat{H}}\), a case we have already dealt with.

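Therefore, \(|e_2|\leqq \frac{\sqrt{3}}{4}|e_1|\), so that by Lemma 4.8 there is a rank-two line \(t\mapsto \xi _t\) with \(\xi _0=\sigma _*\) whose image under \(\Phi \) is the curve \(\Gamma _{y_*}(e,\cdot )\). In particular, there are \(s_-<0<s_+\) such that \(\Phi (\xi _{s_\pm })\in {\hat{H}}\), which by Lemma 4.5 implies \(\xi _{s_\pm }\in K^{(\infty )}\). Therefore, by Lemma 3.11, \(\sigma _*=\xi _0\in K^{(\infty )}\).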

This shows that, for any \(p\in A\), any matrix \(\sigma \) with \(\Phi (\sigma )=(p,\psi (p))\) belongs to \(K^{(\infty )}\). The argument of Lemma 4.5 then concludes the proof. \(\quad \square \)

Fig. 5

Sketch of the sets H and \(H^\mathrm {rel}\) in the proof of Lemma 4.12. The three points marked correspond to \(\sigma _*\) and to the points of H

We finally show that \(H^\mathrm {rel}=\Phi (K^{(\infty )})\) does not imply \(K^{(\infty )}=\Phi ^{-1}(H^\mathrm {rel})\). We refer to Fig. 5 for an illustration.

Lemma 4.12

Let \(H:=\{(0,0),(1,\sqrt{3}/2)\}\), and define K as in (4.4). Then, \(H^\mathrm {rel}=\{(p,q): 0\leqq p\leqq 1, 0\leqq q \leqq \sqrt{3} p/2\}\), the matrix \(\sigma _*:=\mathrm {diag}(1,1/4,1/4)\) obeys \((p(\sigma _*), q(\sigma _*)) = (1/2,\sqrt{3}/4)\in H^\mathrm {rel}\), but \(\sigma _*\not \in K^{(\infty )}\subseteq K^{\mathrm {sdqc}}\).

Proof

The formula for \(H^\mathrm {rel}\) follows immediately from the definition in (4.7) and (4.8); the fact that \(\Phi (\sigma _*)\in H^\mathrm {rel}\) from the definition of p and q in (4.1). Lemma 3.8 shows that \(K^{(\infty )}\subseteq K^{\mathrm {sdqc}}\).

It remains to prove that \(\sigma _*\not \in K^{(\infty )}\). Since \(\mathop {\mathrm {rank}}\sigma _*=3\), Lemma 3.12 implies \(\{0, 2\sigma _*\}^{(\infty )}=\{0, 2\sigma _*\}\). Therefore, it suffices to show that \(\sigma _*\in K^{(\infty )}\) would imply \(\sigma _*\in \{0, 2\sigma _*\}^{(\infty )}\).

We first define \(h:\mathbb {R}^{3\times 3}_\mathrm {sym}\rightarrow \mathbb {R}\), \(h(\xi ):=2p(\xi )-\xi _{11}\) and observe that \(h(0) = h(\sigma _*) = h(2\sigma _*) = 0\). We fix any \(\xi \in K\setminus \{0\}\). Then, necessarily \(p(\xi )=1\) and \(q(\xi )=\sqrt{3}/2\). Recalling that \( 2q^2(\xi )=|\xi -p(\xi ){\text {Id}}|^2\) and \(\xi _{33}=3p(\xi )-\xi _{11}-\xi _{22}\), we compute that

$$\begin{aligned} \begin{aligned} \frac{3}{2}= 2q^2(\xi )&=|\xi -p(\xi ){\text {Id}}|^2= |\xi -{\text {Id}}|^2\\&\geqq (\xi _{11}-1)^2+(\xi _{22}-1)^2+(2-\xi _{11}-\xi _{22})^2\\&\geqq (\xi _{11}-1)^2+2\left( \frac{1}{2}-\frac{\xi _{11}}{2}\right) ^2 = \frac{3}{2} (\xi _{11}-1)^2, \end{aligned} \end{aligned}$$
(4.43)

and we conclude that \(\xi _{11}\leqq 2\), so that \(h(\xi )\geqq 0\). Furthermore, if \(h(\xi )=0\) then necessarily \(\xi _{11}=2\), so that equality holds throughout in (4.43). This, in turn, implies that \(\xi =2\sigma _*\). We have therefore proven that \(h\geqq 0\) on K, with \(\{h=0\}\cap K=\{0,2\sigma _*\}\).
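The inequality \(h\geqq 0\) on K can be illustrated on the diagonal elements of \(K\setminus \{0\}\), which form a circle in the plane of traceless diagonal matrices. The sketch below is restricted to this diagonal slice (an assumption of the example, since a general element of K also has off-diagonal entries, which by (4.43) only make the inequality stricter); the parametrization by an angle and the function names are ours.

```python
import math

def h(diag):
    """h(xi) = 2 p(xi) - xi_11 for a diagonal matrix given by its diagonal."""
    return 2.0 * sum(diag) / 3.0 - diag[0]

def diagonal_element_of_K(theta):
    """Diagonal matrix with p = 1 and 2 q^2 = 3/2, parametrized by an angle."""
    r = math.sqrt(1.5)                       # norm of the deviatoric part
    u = (1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0), 0.0)
    v = (1.0 / math.sqrt(6.0), 1.0 / math.sqrt(6.0), -2.0 / math.sqrt(6.0))
    return tuple(1.0 + r * (math.cos(theta) * a + math.sin(theta) * b)
                 for a, b in zip(u, v))

samples = [diagonal_element_of_K(2.0 * math.pi * k / 360.0) for k in range(360)]
print(min(h(xi) for xi in samples))          # nonnegative up to rounding
print(diagonal_element_of_K(math.pi / 6.0))  # close to 2 sigma_* = diag(2, 1/2, 1/2)
```

On this slice the minimum of h is attained at \(2\sigma _*\), in agreement with \(\{h=0\}\cap K=\{0,2\sigma _*\}\).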

We now assume \(\sigma _*\in K^{(\infty )}\), so that, for any \(g\in C^0(\mathbb {R}^{3\times 3}_\mathrm {sym};[0,\infty ))\) which is symmetric \(\mathrm {div}\)-quasiconvex, \(g(\sigma _*)\leqq \max g(K)\). In order to show that \(\sigma _*\in \{0, 2\sigma _*\}^{(\infty )}\), we fix a function \(f\in C^0(\mathbb {R}^{3\times 3}_\mathrm {sym};[0,\infty ))\) which is symmetric \(\mathrm {div}\)-quasiconvex, and let \(\alpha :=\max \{f(0),f(2\sigma _*)\}\). We need to show that \(f(\sigma _*)\leqq \alpha \).

Fix \(\varepsilon >0\). By continuity there is \(\delta >0\) such that \(f\leqq \alpha +\varepsilon \) on \(B_\delta (2\sigma _*)\). Let \(M:=\max f(K)\geqq \alpha \), \(m:=\min h(K\setminus \{0\}\setminus B_\delta (2\sigma _*))>0\). We define

$$\begin{aligned} g(\xi ) := f(\xi ) - (M-\alpha )\frac{h(\xi )}{m} . \end{aligned}$$
(4.44)

Then, \(g(0)=f(0)\leqq \alpha \), \(g\leqq \alpha +\varepsilon \) on \(K\cap B_\delta (2\sigma _*)\), \(g\leqq M-(M-\alpha )=\alpha \) on the rest of K, and g is continuous and symmetric \(\mathrm {div}\)-quasiconvex. The function \(g_+=\max \{g,0\}\in C^0(\mathbb {R}^{3\times 3}_\mathrm {sym};[0,\infty ))\) obeys \(\max g_+(K)\leqq \alpha +\varepsilon \). Since \(\sigma _*\in K^{(\infty )}\), we have \(f(\sigma _*)=g_+(\sigma _*)\leqq \alpha +\varepsilon \). However, \(\varepsilon \) was arbitrary, hence we conclude that \(f(\sigma _*)\leqq \max f(\{0,2\sigma _*\})\). Therefore, \(\sigma _*\in \{0,2\sigma _*\}^{(\infty )}\), as claimed, and the proof is concluded. \(\quad \square \)

4.4 Examples

We close by presenting two specific examples for which the symmetric \(\mathrm {div}\)-quasiconvex hull can be explicitly characterized.

Lemma 4.13

Let \(p_1,q_1>0\), with \(0<p_1<2q_1/\sqrt{3}\), and let \(H:=\{(-p_1,q_1),(p_1,q_1)\}\). Then,

$$\begin{aligned} H^\mathrm {rel}=\left\{ (p,q): -p_1\leqq p\leqq p_1, 0\leqq q \leqq \sqrt{q_1^2+\frac{3}{4} (p^2-p_1^2)}\right\} \end{aligned}$$
(4.45)

and \(\Phi (K^{\mathrm {sdqc}})=H^\mathrm {rel}\). If, additionally, \(p_1\leqq q_1/\sqrt{3}\), then

$$\begin{aligned} \begin{aligned} K^{\mathrm {sdqc}}&=\{\sigma : \Phi (\sigma )\in H^\mathrm {rel}\}\\&=\left\{ \sigma : p(\sigma )\in [-p_1,p_1], q^2(\sigma )-\frac{3}{4} p^2(\sigma )\leqq q_1^2-\frac{3}{4} p_1^2\right\} . \end{aligned} \end{aligned}$$
(4.46)

We refer to Fig. 2 for an illustration.
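
As a concrete illustration (the numerical values are chosen here only as an example and are not taken from Fig. 2), let \(p_1=1\) and \(q_1=2\). Both \(p_1<2q_1/\sqrt{3}\approx 2.31\) and \(p_1\leqq q_1/\sqrt{3}\approx 1.15\) hold, and (4.45) becomes

$$\begin{aligned} H^\mathrm {rel}=\left\{ (p,q): -1\leqq p\leqq 1,\ 0\leqq q\leqq \sqrt{\frac{13}{4}+\frac{3}{4}p^2}\right\} , \end{aligned}$$

so that the upper boundary is an arc of a hyperbola joining \((-1,2)\) to \((1,2)\), with lowest point \((0,\sqrt{13}/2)\approx (0,1.80)\).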

Proof

We observe that \({\hat{H}}=\{-p_1,p_1\}\times [0,q_1]\) and \({\hat{H}}^\mathrm {conv}=[-p_1,p_1]\times [0, q_1]\).

Let \(W:=\{(p,q): -p_1\leqq p\leqq p_1, q^2-\frac{3}{4} p^2\leqq q_1^2-\frac{3}{4}p_1^2, q\geqq 0\}\) be the set in (4.45). We first show that \(H^\mathrm {rel}\subseteq W\). We define \(q_0:=\sqrt{q_1^2-\frac{3}{4} p_1^2}\) and consider the corresponding function \(f_{(0,q_0)}(p,q)=4(q^2-q_0^2)-3p^2 = 4(q^2-q_1^2)-3(p^2-p_1^2)\). Then, \(f_{(0,q_0)}\leqq 0\) on \({\hat{H}}\), and \(f_{(0,q_0)}>0\) on \({\hat{H}}^\mathrm {conv}\setminus W\). Recalling (4.8), we obtain \(H^\mathrm {rel}\subseteq W\).

To obtain the remaining inclusion, it suffices to show that we cannot separate any point of W from \({\hat{H}}\). We fix a point \((p,q)\in W\) and consider a generic pair \(y_0=(p_0,q_0)\in \mathbb {R}\times [0,\infty )\). The function \(f_{y_0}\) separates \((p,q)\) from \({\hat{H}}\) if

$$\begin{aligned} \max \{4(q_1^2-q_0^2)-3(p_1\pm p_0)^2\} < 4(q^2-q_0^2) - 3 (p-p_0)^2, \end{aligned}$$
(4.47)

which, expanding all squares, is the same as

$$\begin{aligned} 4q_1^2-3 p_1^2 + 6 |p_1p_0|< 4q^2 - 3 p^2 + 6 pp_0. \end{aligned}$$
(4.48)
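
As a routine check of this step, the two sides of (4.47) can be expanded as

$$\begin{aligned} \max \{4(q_1^2-q_0^2)-3(p_1\pm p_0)^2\}&=4q_1^2-3p_1^2+6|p_1p_0|-4q_0^2-3p_0^2,\\ 4(q^2-q_0^2)-3(p-p_0)^2&=4q^2-3p^2+6pp_0-4q_0^2-3p_0^2, \end{aligned}$$

where we used \(\max \{-6p_1p_0, 6p_1p_0\}=6|p_1p_0|\). The term \(-4q_0^2-3p_0^2\) appears on both sides and cancels, which leaves (4.48).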

From \((p,q)\in W\) we obtain \(|p|\leqq p_1\), which implies \(6pp_0\leqq 6 |p_1p_0|\), and \( 4q^2 - 3 p^2 \leqq 4q_1^2-3 p_1^2 \). Summing the two gives

$$\begin{aligned} 4q^2 - 3 p^2 + 6 pp_0\leqq 4q_1^2-3 p_1^2 + 6 |p_1p_0|, \end{aligned}$$
(4.49)

which means that we cannot separate \((p,q)\) from \({\hat{H}}\). Therefore, \(W\subseteq H^\mathrm {rel}\).

From the definition and the condition \(p_1<2q_1/\sqrt{3}\), we see that \(H^\mathrm {rel}\) is connected, so that the first assertion directly follows from Theorem 4.1.

To prove the second assertion we need only control the slope of the boundary. The vertical sides of \(H^\mathrm {rel}\) belong to \({\hat{H}}\). The slope of the hyperbola is maximal at the two extreme points, i.e., at \((\pm p_1, q_1)\). Differentiating \(q^2-\frac{3}{4}p^2=c\), we obtain \(q'q=\frac{3}{4} p'p\), so that at these points \(|q'|/|p'|=\frac{3}{4} p_1/q_1\). If \(p_1\leqq q_1/\sqrt{3}\), this implies that the slope is not larger than \(\frac{\sqrt{3}}{4}\). The conclusion then follows from Lemma 4.2.
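
The two facts used above can be verified directly. Writing the upper boundary as \(q(p)=\sqrt{c+\frac{3}{4}p^2}\) with \(c=q_1^2-\frac{3}{4}p_1^2>0\), one finds

$$\begin{aligned} q'(p)=\frac{3}{4}\frac{p}{q(p)},\qquad \frac{\mathrm{d}}{\mathrm{d}p}\left( \frac{p}{q(p)}\right) =\frac{q^2(p)-\frac{3}{4}p^2}{q^3(p)}=\frac{c}{q^3(p)}>0, \end{aligned}$$

so that \(|q'|\) is increasing in \(|p|\) and attains its maximum over \([-p_1,p_1]\) at \(p=\pm p_1\). There,

$$\begin{aligned} |q'(\pm p_1)|=\frac{3}{4}\frac{p_1}{q_1}\leqq \frac{3}{4}\cdot \frac{1}{\sqrt{3}}=\frac{\sqrt{3}}{4} \end{aligned}$$

whenever \(p_1\leqq q_1/\sqrt{3}\).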

\(\square \)

Next, we consider a second example in which H consists of a half-circle of radius r centered at \(C:=(p_C,0)\) and a single point \(D:=(p_D,q_D)\):

$$\begin{aligned} H:=\{(p_D,q_D)\} \cup \{(p,q): (p-p_C)^2+q^2\leqq r^2, q\geqq 0\}. \end{aligned}$$
(4.50)

There are several different cases, depending on the existence of one or two hyperbolas in the family considered above which contain the point D and are tangent to the circle. The boundaries between the different phases are vertical lines (corresponding to the construction of \({\hat{H}}\) from H) and lines with slope \(\pm \sqrt{3}/2\) (corresponding to the maximal slope of the hyperbolas, which is also the boundary between \(S^1_+\) and \(S^1_-\)). The phase diagram is sketched in Fig. 6. The critical points are \(X=(p_C-\frac{\sqrt{7}}{\sqrt{3}} r,0)\), \(Y=(p_C+\frac{\sqrt{7}}{\sqrt{3}} r,0)\) and \(Z=(p_C,\frac{\sqrt{7}}{2} r)\). For definiteness, we focus on two representative regions.
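
The values of X, Y and Z are consistent with the following reading, which is our interpretation of the sketch and is not spelled out in the text: X and Y are the points where the two lines of slope \(\pm \sqrt{3}/2\) tangent to the upper half of the circle meet the axis \(q=0\), and Z is the point where these two lines cross. Indeed, the line \(\sqrt{3}(p-p_X)-2q=0\) has distance

$$\begin{aligned} \frac{\sqrt{3}\,|p_C-p_X|}{\sqrt{3+4}}=\frac{\sqrt{3}}{\sqrt{7}}|p_C-p_X| \end{aligned}$$

from \(C=(p_C,0)\); setting this equal to r gives \(|p_C-p_X|=\frac{\sqrt{7}}{\sqrt{3}}r\), and the same computation applies to Y by symmetry. Evaluating the line at \(p=p_C\) gives \(q_Z=\frac{\sqrt{3}}{2}\cdot \frac{\sqrt{7}}{\sqrt{3}}r=\frac{\sqrt{7}}{2}r\).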

Lemma 4.14

Let H be as in (4.50) with D in region I, defined as

$$\begin{aligned} p_D<p_C-r,\quad \frac{\sqrt{3}}{2}|p_D- p_X|< q_D< \frac{\sqrt{3}}{2}|p_D- p_Y | . \end{aligned}$$
(4.51)

Then, there is a unique \(y_0=(p_0,q_0)\in \mathbb {R}\times [0,\infty )\) such that the hyperbola \(\{q^2-q_0^2=\frac{3}{4}(p-p_0)^2\}\) contains \(D=(p_D,q_D)\) and is tangent to the circle with radius r centered at \(C=(p_C,0)\) at a point T. Furthermore,

$$\begin{aligned} H^\mathrm {rel}= H\cup \left\{ (p,q): p_D\leqq p\leqq p_T, q^2\leqq q_0^2 + \frac{3}{4} (p-p_0)^2\right\} . \end{aligned}$$

If, instead, D is in region II, defined by

$$\begin{aligned} q_D-q_Z \geqq \frac{\sqrt{3}}{2} |p_D-p_C|, \end{aligned}$$
(4.52)

then \(H^\mathrm {rel}={\hat{H}}^\mathrm {conv}\).

Fig. 6. Different regions for the location of D with respect to the circle in the construction of (4.50), see Lemma 4.14. The constructions in regions I and II are shown in Figs. 7 and 8, respectively.

Fig. 7. Example with H consisting of a point and a half-circle, see Lemma 4.14, for D in region I (see Fig. 6). The right panel shows some details of the construction, and in particular the location of the two curves \(\Gamma _D((2/\sqrt{7}, \pm \sqrt{3}/\sqrt{7}), \mathbb {R})\) used in the proof.

Fig. 8. Two examples with H consisting of a point and a half-circle, see Lemma 4.14, for D in region II (see Fig. 6).

Proof

The second case is straightforward. The boundary of \({\hat{H}}^\mathrm {conv}\) has slope at least \(\sqrt{3}/2\), hence there is no possibility to separate any point of it using the given hyperbolas. A sketch is shown in Fig. 8.

The first case, corresponding to region I in Fig. 6, requires a more detailed argument. We first have to show that there is a unique hyperbola of the type \(q^2-q_0^2=\frac{3}{4}(p-p_0)^2\) which contains D and is tangent to the half-circle. We refer to Fig. 7 for an illustration.

The condition that \(D=(p_D,q_D)\) belongs to the hyperbola translates into

$$\begin{aligned} q_0^2=q_D^2-\frac{3}{4} (p_D-p_0)^2. \end{aligned}$$
(4.53)

The condition of being tangential means that the system

$$\begin{aligned} \left\{ \begin{array}{ll} q^2=q_D^2-\frac{3}{4} (p_D-p_0)^2+\frac{3}{4}(p-p_0)^2\\ (p-p_C)^2+q^2=r^2 \end{array}\right. \end{aligned}$$
(4.54)

has a double solution. Note that these equations are both quadratic in p and linear in \(q^2\), hence the system is overall of second order in these two variables. Substituting \(q^2\) into the second equation leads to the condition that

$$\begin{aligned} (p-p_C)^2+q_D^2-\frac{3}{4} (p_D-p_0)^2+\frac{3}{4}(p-p_0)^2=r^2 \end{aligned}$$
(4.55)

has a double solution \(p_T\), which should satisfy \(p_T\in [p_C-r,p_C+r]\). This solution can be computed explicitly, but to prove the assertion, existence suffices. To this end, we consider the family of curves \(\Gamma _D(e,\mathbb {R})\) constructed in Lemma 4.6 for \(|e_2|\leqq \frac{\sqrt{3}}{2}e_1\). The assumption (4.51) implies that \(\Gamma _D((2/\sqrt{7}, -\sqrt{3}/\sqrt{7}), [0,\infty ))\) intersects \(B_C(r)\), but \(\Gamma _D((2/\sqrt{7}, +\sqrt{3}/\sqrt{7}), [0,\infty ))\) does not (notice that both these curves are piecewise affine). By continuity there is \(e_*\) in the given interval such that \(\Gamma _D(e_*,\mathbb {R})\) is tangent to \(B_C(r)\). We denote by T the point of tangency, and define \((p_0, q_0)\) so that \(\Gamma _D(e_*,\mathbb {R})\) is the set \(q^2-q_0^2=\frac{3}{4}(p-p_0)^2\) (see Fig. 7).
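
Although it is not needed for the argument, the double root of (4.55) can be written down. Collecting powers of p in (4.55), and using \(\frac{3}{4}p_0^2-\frac{3}{4}(p_D-p_0)^2=\frac{3}{2}p_Dp_0-\frac{3}{4}p_D^2\), one obtains

$$\begin{aligned} \frac{7}{4}p^2-\left( 2p_C+\frac{3}{2}p_0\right) p+\kappa =0,\qquad \kappa :=p_C^2+q_D^2-\frac{3}{4}p_D^2+\frac{3}{2}p_Dp_0-r^2, \end{aligned}$$

where \(\kappa \) is auxiliary notation introduced only for this remark. The quadratic has a double root if and only if its discriminant vanishes, that is,

$$\begin{aligned} \left( 2p_C+\frac{3}{2}p_0\right) ^2=7\kappa ,\qquad \text {in which case}\qquad p_T=\frac{4p_C+3p_0}{7}. \end{aligned}$$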

To conclude the proof, it suffices to show that no point of the given set can be separated by another hyperbola. For this, it is enough to show that no other hyperbola of the given family can have two points in common with the one just constructed. This follows from the fact that any solution to the system

$$\begin{aligned} \left\{ \begin{array}{ll} q^2-q_0^2=\frac{3}{4}(p-p_0)^2\\ q^2-q_1^2=\frac{3}{4}(p-p_1)^2 \end{array}\right. \end{aligned}$$
(4.56)

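obeys, after subtracting the second equation from the first, \(q_0^2-q_1^2=\frac{3}{4} (p_1^2-p_0^2-2pp_1+2pp_0)\), which for \(p_0\ne p_1\) is a linear equation in p and, therefore, has at most one solution (if \(p_0=p_1\), then either \(q_0=q_1\) and the two hyperbolas coincide, or they are disjoint). Since \(q\geqq 0\), q is uniquely determined by p. This concludes the proof. \(\quad \square \)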