1 Introduction

In this contribution, we discuss and investigate the impact of dominant gradient forces in nearly-incompressible, linear elasticity. Moreover, we restrict our considerations to an isotropic model problem. Thus, we assume that \(\Omega \subset {\mathbb {R}}^d\), \(d=2,3\) denotes a bounded, polyhedral Lipschitz domain and we consider the following vector-valued PDE in the displacement formulation

$$\begin{aligned} - {{\,\mathrm{{\text {div}}}\,}}\left( 2\mu \nabla _s\varvec{u}\right) - {\nabla }\left( \lambda \,{{\,\mathrm{{\text {div}}}\,}}\varvec{u}\right)&= \varvec{f}\qquad \text {in }\Omega , \end{aligned}$$
(1a)
$$\begin{aligned} \varvec{u}&= \varvec{0} \qquad \text {on }\partial \Omega , \end{aligned}$$
(1b)

where \(\varvec{u}\) denotes the displacement, \(\nabla _s\varvec{u} = {({\nabla }\varvec{u} + {\nabla }^T\varvec{u})}/{2}\) denotes the symmetric gradient operator, \(\mu > 0\), \(\lambda \ge 0\) denote the (constant) Lamé parameters, and \(\varvec{f}\) denotes an external force. We focus on the case, where it holds simultaneously:

  1. (1)

    the Poisson ratio \(0< \nu := \frac{\lambda }{2 (\lambda +\mu )} < \frac{1}{2}\) fulfills \(\nu \approx \frac{1}{2}\), i.e., the Lamé parameters are related by \(\lambda \gg \mu \), and the material is nearly incompressible;

  2. (2)

    the exterior volume force is a gradient field, i.e., there exists some potential \(\psi \) with \(\varvec{f} = \nabla \psi \).
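The relation between the Lamé parameters and the Poisson ratio in condition (1) can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names `poisson_ratio` and `lame_lambda` are hypothetical helpers, not part of any scheme discussed here, and the inverse formula \(\lambda = 2\mu \nu /(1-2\nu )\) is obtained by solving the definition of \(\nu \) for \(\lambda \).

```python
def poisson_ratio(lam, mu):
    """Poisson ratio nu = lam / (2 (lam + mu)) for Lame parameters lam >= 0, mu > 0."""
    if mu <= 0 or lam < 0:
        raise ValueError("need mu > 0 and lam >= 0")
    return lam / (2.0 * (lam + mu))


def lame_lambda(nu, mu):
    """Inverse relation: lam = 2 mu nu / (1 - 2 nu), valid for 0 <= nu < 1/2."""
    if not 0.0 <= nu < 0.5:
        raise ValueError("need 0 <= nu < 1/2")
    return 2.0 * mu * nu / (1.0 - 2.0 * nu)


if __name__ == "__main__":
    mu = 1.0
    for lam in (1.0, 1e2, 1e4, 1e6):
        nu = poisson_ratio(lam, mu)
        # The gap 1/2 - nu equals mu / (2 (lam + mu)) and shrinks like 1/lam.
        print(f"lambda/mu = {lam / mu:9.1e}   nu = {nu:.8f}   1/2 - nu = {0.5 - nu:.2e}")
```

The gap \(\frac{1}{2} - \nu = \frac{\mu }{2(\lambda +\mu )}\) decays like \(\lambda ^{-1}\), which is why \(\lambda \gg \mu \) and \(\nu \approx \frac{1}{2}\) describe the same regime.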

Evidently, this discussion is especially relevant, if there does not exist any a-priori information that a load vector \(\varvec{f}\) is actually a gradient field, i.e., if its Helmholtz–Hodge decomposition is not known a-priori, which can be algorithmically exploited [34], or if \(\varvec{f}\) only contains a dominant gradient field part in the sense of the Helmholtz–Hodge decomposition, see [47, 60].

The special role of gradient-forces in nearly-incompressible linear elasticity can be derived from the observation that in the limit \(\lambda \rightarrow \infty \), the displacement \(\mathbf{u} = \mathbf{u} (\lambda )\) formally fulfills an incompressible Stokes problem \(- {{\,\mathrm{{\text {div}}}\,}}\left( 2\mu \nabla _s\varvec{u}^\infty \right) + \nabla p^\infty = \varvec{f}\), \({{\,\mathrm{{\text {div}}}\,}}\varvec{u}^\infty = 0\), where \(p^\infty \) denotes a formal pressure, acting as a Lagrange multiplier for the divergence constraint. Thus, in the incompressible limit it holds for the fully clamped situation (1)

$$\begin{aligned} \mathbf{f} = \nabla \psi \quad \Rightarrow \quad \lim _{\lambda \rightarrow \infty } \mathbf{u} (\lambda ) = \mathbf{0} , \end{aligned}$$
(2)

which compares well to (incompressible) hydrostatics in fluid dynamics.

Gradient-robustness: The concept of gradient-robustness is based on the observation (2): an accurate discretization for the linear elasticity problem (1) will be called gradient-robust, if a scheme for the system (1) delivers in the limit \(\lambda \rightarrow \infty \) on every fixed mesh the discrete displacement \(\mathbf{u} _h = \mathbf{0} \), if it holds \(\mathbf{f} = \nabla \psi \). Actually, a gradient-robust discretization for the linear elasticity problem (1) is asymptotic preserving (AP) in the sense of [45]. A related concept of gradient-robustness has been introduced first for hydrostatic situations in compressible flows [2].

Indeed, we will show in this contribution:

  1. (i)

    dominant and complicated gradient fields are a possible source of spurious displacements in isotropic, nearly-incompressible linear elasticity, if schemes are only free of volume-locking, but are not gradient-robust;

  2. (ii)

    \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming finite element spaces can be exploited to construct efficient gradient-robust numerical schemes for gradient forces \(\nabla \psi \in \varvec{L}^2\) on rather arbitrary unstructured grids;

  3. (iii)

    the approximation spaces do not need to be \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming, only certain test functions need to be \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming in the discretization of the load term \(\varvec{f}\), in order to enforce the \(\varvec{L}^2\) orthogonality of arbitrary gradient fields against discretely divergence-free vector fields. This gives more flexibility to construct gradient-robust schemes that are computationally efficient;

  4. (iv)

    gradient-robustness is especially needed in non-trivial multi-physics problems.

We want to emphasize that it has not gone unnoticed in the elasticity community that inf-sup stable schemes, which are free of volume-locking, are not sufficient for accurate schemes in nearly-incompressible elasticity. For example, in the abstract of the article Approximation of incompressible large deformation elastic problems: some unresolved issues [12] the authors write:

[...] it is shown that within the framework of displacement/pressure mixed elements, even schemes that are inf-sup stable for linear elasticity may exhibit problems when used in the finite deformation regime. The roots of such troubles are identified, but a general strategy to cure them is still missing [...]

As the root of the problem, the authors identify in the conclusion the need for an exact fulfillment of the incompressibility constraint in the linearized problems, a statement that was already made in the earlier works [10, 11]. A closer look at the two-dimensional, incompressible and nonlinear benchmarks presented in [12, Tables 1 and 8] reveals that \({\varvec{H}}^1\)-conforming \(\varvec{P}_k\) elements [70, 74] with \(k=2,3,4\) in a Galerkin displacement formulation perform best in the detection of a certain stability range, where stability around the trivial displacement \(\varvec{u} = \varvec{0}\) under a parameter-dependent gradient forcing is investigated, which is similar to (2). Finally, the authors of [12] warn the readers about problems with ill-conditioning of high-order \(\varvec{P}_k\) elements, and recommend future research on NURBS based approaches, Discontinuous Galerkin and nonconforming methods, in order to find accurate and efficient alternatives.

In order to enable the construction of novel competitive schemes for some unresolved or maybe not fully-understood issues, the novel notion of gradient-robustness aims at changing the focus from the incompressibility of the displacements \(\varvec{u}\), i.e., the trial functions, to the incompressibility in certain test functions. The concept of gradient-robustness makes clear that the very same scheme may behave very well when confronted with divergence-free forces, but may fail when confronted with forces of gradient-type, thus obscuring the origin of numerical errors.

And indeed, the classical \(\varvec{P}_k\) displacement elements, which delivered the most accurate results in [12], are closely related to the classical Scott–Vogelius elements for the incompressible Stokes problem [7, 71, 76], which have been demonstrated in recent years to be advantageous for incompressible fluid dynamics, whenever the momentum balance is dominated by strong and complicated gradient fields [1, 33, 47, 59, 69]. In incompressible CFD, such schemes have been called pressure-robust [60], since strong gradient fields in the momentum balance lead to strong pressure gradients likewise, but pressure-robustness is just a special case of gradient-robustness, and implies that for rather arbitrary gradient forcings \(\varvec{f}=\nabla \psi \), the hydrostatic solution \(\varvec{u}_h=\varvec{0}\) is preserved.

Besides inf-sup stability, the algorithmic key of these schemes is that they exploit a certain \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conformity, in order to be robust against strong gradient forces \(\nabla \psi \in \varvec{L}^2\). Partially in parallel, rather recently \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conformity has also been exploited for the construction of well-balanced schemes in hyperbolic conservation laws [2, 24, 63], where gradient fields are also an important trouble maker in nearly hydrostatic or nearly geostrophic force balances for PDEs like the shallow water, the compressible Euler or the compressible Navier–Stokes equations [19, 24, 36, 37, 63, 64].

As an application of the concept of gradient-robustness in elasticity problems, we will construct efficient HDG schemes for (1), which fully exploit the robustness of \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conformity without being \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming due to lowering the approximation order of the face variables and thus being computationally cheaper. The idea is inspired by a quite recent observation for the incompressible Stokes problem [26, 29, 50, 51, 52, 54, 55, 56, 57, 58]. For the incompressible Stokes problem a modified discretization of the exterior forcing via

$$\begin{aligned} \int _{\Omega } \mathbf{f} \cdot \mathbf{v} _h \, dx \rightarrow \int _{\Omega } \mathbf{f} \cdot \Pi \mathbf{v} _h \, dx \end{aligned}$$
(3)

is able to re-establish the \(\varvec{L}^2\) orthogonality against arbitrary gradient fields \(\mathbf{f} = \nabla \psi \in \varvec{L}^2\). Here, \(\Pi \) is a (locally defined) operator that maps into a discrete \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming space. It needs to have some approximation properties, and has to preserve the discrete divergence of \(\mathbf{v} _h\), i.e., it has to fulfill \({{\,\mathrm{{\text {div}}}\,}}(\Pi \mathbf{v} _h) = {{\,\mathrm{{\text {div}}}\,}}_h \mathbf{v} _h\) in a certain sense. In the case \(\mathbf{f} = \nabla \psi \) this leads to

$$\begin{aligned} {{\,\mathrm{{\text {div}}}\,}}_h \mathbf{v} _h\! =\! 0 ~~ \Rightarrow ~~ \int _{\Omega } \nabla \psi \cdot \Pi \mathbf{v} _h \, dx = - \! \int _{\Omega } \psi {{\,\mathrm{{\text {div}}}\,}}(\Pi \mathbf{v} _h) \, dx = - \! \int _{\Omega } \psi {{\,\mathrm{{\text {div}}}\,}}_h \mathbf{v} _h \, dx = 0. \end{aligned}$$
(4)

According to [47], this reestablished \(\varvec{L}^2\) orthogonality of gradient fields against discretely divergence-free vector fields can be interpreted as a scheme that—on unstructured grids—fulfills for arbitrary \(\varvec{L}^2\) gradient fields the vector calculus identity \(\nabla _h\times \nabla \psi = \mathbf{0} \). Here, \(\nabla _h \times \bullet \) denotes some implicitly defined discrete curl operator, whereas for standard schemes it only holds \(\nabla _h \times \nabla \psi = {\mathcal {O}}(h^k)\) [47].
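The continuous counterpart of the orthogonality in (4) can be checked numerically with plain Python. The following is a minimal sketch under stated assumptions: the potential \(\psi = x^6 + y^6 + x^2 y^3\), the stream function \(s = \sin ^2(\pi x)\sin ^2(\pi y)\) on the unit square, the comparison field `w_not_divfree`, and all function names are illustrative choices and are not taken from the schemes discussed here. The field \(\varvec{v} = (\partial _y s, -\partial _x s)\) is divergence-free and vanishes on the boundary, so \(\int _\Omega \nabla \psi \cdot \varvec{v} \, dx\) should vanish up to quadrature error, whereas a field that is not divergence-free gives a clearly nonzero value.

```python
import math


def grad_psi(x, y):
    """Gradient of the (assumed) potential psi = x^6 + y^6 + x^2 y^3."""
    return (6.0 * x**5 + 2.0 * x * y**3, 6.0 * y**5 + 3.0 * x**2 * y**2)


def v_divfree(x, y):
    """v = (d_y s, -d_x s) for s = sin^2(pi x) sin^2(pi y): div v = 0, v = 0 on the boundary."""
    return (math.pi * math.sin(math.pi * x) ** 2 * math.sin(2.0 * math.pi * y),
            -math.pi * math.sin(2.0 * math.pi * x) * math.sin(math.pi * y) ** 2)


def w_not_divfree(x, y):
    """Comparison field that vanishes on the boundary but is not divergence-free."""
    return (math.sin(math.pi * x) * math.sin(math.pi * y), 0.0)


def load_functional(field, n=200):
    """Midpoint-rule approximation of int_Omega grad(psi) . field dx on (0,1)^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            g = grad_psi(x, y)
            v = field(x, y)
            total += g[0] * v[0] + g[1] * v[1]
    return total * h * h


if __name__ == "__main__":
    print("divergence-free test function:", load_functional(v_divfree))
    print("general test function:        ", load_functional(w_not_divfree))
```

A discretization whose discretely divergence-free test functions are not exactly divergence-free behaves, with respect to gradient loads, like the second case; the operator \(\Pi \) in (3) is what restores the first case.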

Last but not least, we will present numerical examples in 2D and 3D for a problem in thermo-elasticity, where gradient-robustness against strong gradient fields in \(\varvec{L}^2\) leads to much more accurate schemes for nearly incompressible materials. We conjecture that nearly incompressible elasticity behaves somewhat similar to incompressible CFD: the more difficult a problem is (in CFD: multi physics, high Reynolds numbers [33, 47]) the more important gradient-robustness will be for numerical accuracy. Finally, we remark that the schemes proposed in this contribution will probably not be sufficient to solve the numerical issues reported in [12]. We conjecture that for solving them, gradient-robustness against arbitrary gradient fields in \(\varvec{H}^{-1}\) is necessary, which is true for the classical high order \(\varvec{P}_k\) displacement method. However, very recent developments by Zanotti, Verfürth and Kreuzer [49, 73] have proved that the approach in (3) can be extended to the construction of novel schemes that are gradient-robust even against rather arbitrary gradient fields in \(\varvec{H}^{-1}\).

The rest of the paper is organized as follows: In Sect. 2, we introduce the concepts of volume-locking and gradient-robustness by considering very basic discretization ideas for (1). Thereafter, in Sect. 3 we give an overview of existing discretizations for linear elasticity in the literature with respect to these properties. Then, in Sect. 4 we present and analyze a divergence-conforming HDG scheme; in particular, we prove that the scheme is both volume-locking-free and gradient-robust. In Sect. 5 we consider and analyze two (computationally more efficient) modified HDG schemes. Both have the same stiffness matrix, but the discretization of the load term differs. The scheme with a standard discretization of the load term is not gradient-robust, while a modified load term according to (3) leads to gradient-robustness. The numerical Sect. 6 shows the importance of gradient-robustness for a multi-physics problem from thermo-elasticity and also demonstrates the practical efficiency of using (relaxed) HDG schemes. We conclude in Sect. 7.

2 Motivation: Volume-Locking and Gradient-Robustness

In this section we briefly repeat the concept of volume-locking and introduce the concept of gradient-robustness. To illustrate these we consider very basic discretization ideas for (1) in this section and give a definition of volume-locking and gradient-robustness, cf. [32] for a more extensive discussion. Only later, in the subsequent sections we turn our attention to our proposed discretization, a (relaxed) \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming HDG method and analyse it, with a focus on these properties. Due to the prominent role of the divergence operator in nearly incompressible elasticity, we introduce the following space:

$$\begin{aligned} \varvec{V}^0 := \{ \varvec{u} \in \varvec{H}^1_0(\Omega ) : {{\,\mathrm{{\text {div}}}\,}}\varvec{u} = 0 \}. \end{aligned}$$
(5)

2.1 A Basic Method

Let us start with a very basic method. Let \({\mathcal {T}}_h=\{T\}\) be a conforming simplicial triangulation of \(\Omega \). We use a standard \({\varvec{H}}^1\)-conforming piecewise polynomial finite element space for the displacement \(\varvec{u}\) in (1):

$$\begin{aligned} \varvec{P}_{\!h,0}^{k} : = [P_{\!h,0}^{k}]^d \quad \text {with} \quad P_{\!h}^{k} : = \prod _{T\in {\mathcal {T}}_h} {\mathbb {P}}^{k}(T) \cap H^1(\Omega ), \text { and } P_{\!h,0}^{k}:=P_{\!h}^{k}\cap H^1_0(\Omega ) \end{aligned}$$

where \({\mathbb {P}}^{k}(T)\) is the space of polynomials up to degree k. The numerical scheme is: Find \(\varvec{u}_h \in \varvec{P}_{\!h,0}^{k}\) s.t. for all \(\varvec{v}_h \in \varvec{P}_{\!h,0}^{k}\) there holds

$$\begin{aligned} a(\varvec{u}_h, \varvec{v}_h) \! := \!\!\int _{\Omega } 2\mu \,\nabla _s(\varvec{u}_h)\!:\!\!\nabla _s(\varvec{v}_h)\! \,\mathrm {d}\varvec{x}+ \int _{\Omega } \lambda \,{{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_h){{\,\mathrm{{\text {div}}}\,}}(\varvec{v}_h)\! \,\mathrm {d}\varvec{x}= \int _{\Omega } \!\!\varvec{f} \cdot \varvec{v}_h \! \,\mathrm {d}\varvec{x}\end{aligned}$$
(M1)

We choose a simple numerical example to investigate the performance of the method.

Example 1

We consider the domain \((0,1)^2\) and a uniform triangulation into right triangles. For the right-hand side we choose the divergence-free field

$$\begin{aligned} \varvec{f} = 2 \mu \pi ^2 (\sin (\pi x)\sin (\pi y),\cos (\pi x)\cos (\pi y)) \end{aligned}$$

and Dirichlet boundary conditions such that the unique solution is

$$\begin{aligned} \varvec{u} = (\sin (\pi x)\sin (\pi y), \cos (\pi x)\cos (\pi y)). \end{aligned}$$
Fig. 1 Discretization errors for the method (M1), \(k=1\), under mesh refinement (x-axis: refinement level L) and different values of \(\lambda \) for Example 1
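The data of Example 1 can be sanity-checked without any finite element code. Since \({{\,\mathrm{{\text {div}}}\,}}\varvec{u} = 0\), the \(\lambda \)-term in (1a) drops out and \(-{{\,\mathrm{{\text {div}}}\,}}(2\mu \nabla _s\varvec{u}) = -\mu \Delta \varvec{u}\). The sketch below verifies both facts with central finite differences; the value \(\mu = 1\), the step sizes, and the function names are assumptions made only for this illustration.

```python
import math

MU = 1.0  # assumed value of the Lame parameter mu, for illustration only


def u(x, y):
    """Exact displacement of Example 1."""
    return (math.sin(math.pi * x) * math.sin(math.pi * y),
            math.cos(math.pi * x) * math.cos(math.pi * y))


def f(x, y, mu=MU):
    """Right-hand side of Example 1: f = 2 mu pi^2 u."""
    ux, uy = u(x, y)
    c = 2.0 * mu * math.pi ** 2
    return (c * ux, c * uy)


def divergence(x, y, h=1e-5):
    """Central-difference approximation of div u."""
    du1_dx = (u(x + h, y)[0] - u(x - h, y)[0]) / (2.0 * h)
    du2_dy = (u(x, y + h)[1] - u(x, y - h)[1]) / (2.0 * h)
    return du1_dx + du2_dy


def minus_mu_laplacian(x, y, mu=MU, h=1e-3):
    """Five-point approximation of -mu * Laplace(u), component by component."""
    result = []
    for i in range(2):
        lap = (u(x + h, y)[i] + u(x - h, y)[i] + u(x, y + h)[i] + u(x, y - h)[i]
               - 4.0 * u(x, y)[i]) / h ** 2
        result.append(-mu * lap)
    return tuple(result)


if __name__ == "__main__":
    for (x, y) in [(0.3, 0.7), (0.15, 0.4), (0.8, 0.55)]:
        print((x, y), "div u ~", divergence(x, y),
              " -mu*Lap(u) ~", minus_mu_laplacian(x, y), " f =", f(x, y))
```

Because \(\lambda \) does not enter this check, the exact solution is independent of \(\lambda \); any \(\lambda \)-dependence of the errors in Fig. 1 is therefore a property of the discretization.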

For successively refined meshes with smallest edge length \(h=2^{-(L+2)}\), fixed polynomial degree \(k=1\) and levels \(L=0,\ldots ,6\) we compute the error \(\varvec{u} - \varvec{u}_h\) in the \(\varvec{L}^2\) norm and the \({\varvec{H}}^1\) semi-norm for different values of \(\lambda \). The absolute errors are displayed in Fig. 1. Let us emphasize that the solution \(\varvec{u}\) is independent of \(\lambda \). However, we observe that this is not true for \(\varvec{u}_h\). For instance, for \(\lambda =10^5\) convergence cannot yet be observed on the chosen meshes. Overall, we observe an error behavior of the form \({\mathcal {O}}(\lambda \cdot h)\) for the \({\varvec{H}}^1\) semi-norm and \({\mathcal {O}}(\lambda \cdot h^{2})\) for the \(\varvec{L}^2\) norm. From the discretization (M1) we directly see that with increasing \(\lambda \) we enforce that \({{\,\mathrm{{\text {div}}}\,}}\varvec{u}\) tends to zero (pointwise). For piecewise linear functions, however, the only divergence-free function that can be represented is the constant function. This leads to the observed effect which is known as volume-locking:

Definition 1

Volume-locking means that the discrete subspace of discretely divergence-free vector fields \( \varvec{V}^0_h := \left\{ \varvec{v}_h \in \varvec{P}_{\!h,0}^{k} : {{\,\mathrm{{\text {div}}}\,}}_h \, \varvec{v}_h = 0 \right\} \), does not have optimal approximation properties versus smooth, divergence-free functions \(\varvec{v} \in \varvec{V}^0 \cap \varvec{H}^{k+1}(\Omega )\)

$$\begin{aligned} \inf _{\varvec{v}_h \in \varvec{V}^0_h} \Vert \nabla \varvec{v}- \nabla \varvec{v}_h\Vert _{\Omega } \not \le C \inf _{\varvec{v}_h \in \varvec{P}_{\!h,0}^{k}} \Vert \nabla \varvec{v}- \nabla \varvec{v}_h\Vert _{\Omega } \le C h^k | \varvec{v} |_{k+1}. \end{aligned}$$
(6a)

In the sense of Definition 1 the discretization (M1) with \(k=1\) is obviously not free of volume-locking.
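To get a feeling for the observed \({\mathcal {O}}(\lambda \cdot h)\) behavior, the following sketch tabulates the factor \(\lambda \cdot h\) for the mesh sizes \(h = 2^{-(L+2)}\) of Example 1. The function names are hypothetical, and the hidden constant in the \({\mathcal {O}}\)-notation is simply set to one, so only orders of magnitude are meaningful.

```python
def mesh_size(level):
    """Smallest edge length h = 2^{-(L+2)} on refinement level L (Example 1)."""
    return 2.0 ** (-(level + 2))


def locking_factor(lam, level):
    """The factor lam * h from the observed O(lam * h) behavior, with constant 1 (assumption)."""
    return lam * mesh_size(level)


if __name__ == "__main__":
    levels = range(7)  # L = 0, ..., 6
    print("lambda   " + "".join(f"   L={L}    " for L in levels))
    for lam in (1.0, 1e1, 1e3, 1e5):
        row = "".join(f"{locking_factor(lam, L):10.2e}" for L in levels)
        print(f"{lam:7.0e} {row}")
```

For \(\lambda = 10^5\) the factor is still about 390 on the finest level \(L=6\), which is consistent with the observation that convergence is not yet visible on the chosen meshes.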

2.2 A Volume-Locking-Free Discretization Through Mixed Formulation

To get rid of the locking-effect one often reformulates the grad-div term in (1) by rewriting the problem in mixed form as

$$\begin{aligned} - {{\,\mathrm{{\text {div}}}\,}}\left( 2\mu \nabla _s\varvec{u}\right) + \nabla p = \varvec{f} \quad \text {in }\Omega , \quad {{\,\mathrm{{\text {div}}}\,}}\varvec{u} + \lambda ^{-1} p = 0 \quad \text {in }\Omega . \end{aligned}$$
(7)
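As a short consistency check (a sketch, assuming constant \(\lambda > 0\) as stated after (1)), eliminating the pressure from the second equation in (7) recovers the displacement formulation (1a):

$$\begin{aligned} p = -\lambda \,\mathrm {div}\,\varvec{u} \quad \Rightarrow \quad - \mathrm {div}\left( 2\mu \nabla _s\varvec{u}\right) + \nabla p = - \mathrm {div}\left( 2\mu \nabla _s\varvec{u}\right) - \nabla \left( \lambda \,\mathrm {div}\,\varvec{u}\right) = \varvec{f}. \end{aligned}$$

In the formal limit \(\lambda \rightarrow \infty \) the term \(\lambda ^{-1} p\) vanishes and the second equation in (7) reduces to the incompressibility constraint \(\mathrm {div}\,\varvec{u} = 0\) of the Stokes problem.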

With the intention to avoid volume-locking we now consider a discretization that is known to be stable in the Stokes limit \(\lambda \rightarrow \infty \). Here, we take the well-known Taylor-Hood velocity-pressure pair: Find \((\varvec{u}_h, p_h) \in \varvec{P}_{\!h,0}^{k}\times P_{\!h}^{k-1}\), s.t.

$$\begin{aligned} \begin{aligned} \int _{\Omega } 2\mu \,\nabla _s(\varvec{u}_h):\nabla _s(\varvec{v}_h) \,\mathrm {d}\varvec{x}- \int _{\Omega } {{\,\mathrm{{\text {div}}}\,}}(\varvec{v}_h) p_h \,\mathrm {d}\varvec{x}&= \int _{\Omega } \varvec{f} \!\cdot \! \varvec{v}_h \, \,\mathrm {d}\varvec{x}~&\forall ~\varvec{v}_h \in \varvec{P}_{\!h,0}^{k},\\ \int _{\Omega } {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_h) q_h \,\mathrm {d}\varvec{x}+ \int _{\Omega } \lambda ^{-1} p_h q_h \,\mathrm {d}\varvec{x}&= 0 ~&\forall ~q_h \in P_{\!h}^{k-1}. \end{aligned} \end{aligned}$$
(M2)

It is well-known that for every LBB-stable Stokes discretization the mixed formulation of linear elasticity guarantees that the discretization is free of volume-locking in the sense of Definition 1, cf. [16, Chapter VI.3]. Indeed, we observe numerically (cf. [32]) that the discretization errors of the method (M2) for Example 1 are essentially independent of \(\lambda \) and optimally convergent.

2.3 Gradient-Robustness

In the previous subsection we considered a divergence-free force field. As a result of the Helmholtz decomposition we can decompose every \(\varvec{L}^2\) force field into a divergence-free and an irrotational part. In this subsection we now consider the case where the force field is irrotational, i.e. a gradient of an \(H^1\) function. This will lead us to gradient-robustness. Assume that there is \(\phi \in H^1(\Omega )\) so that \(\varvec{f} = \nabla \phi \). With \(\lambda \rightarrow \infty \) we have \(p \rightarrow \phi + c,~c\in {\mathbb {R}}\) and \(\varvec{u} \rightarrow \varvec{0}\), i.e. in the Stokes limit gradients in the force field are solely balanced by the pressure and have no impact on the displacement. In the next subsection, this reasoning will be made more precise considering the limit \(\lambda \rightarrow \infty \).

2.4 A Definition of Gradient-Robustness

First, we introduce the orthogonal complement of the weakly differentiable divergence-free vector fields (5) with respect to the inner product \(a(\cdot , \cdot )\) defined in (M1):

$$\begin{aligned} \varvec{V}^\perp := \{ \varvec{u} \in \varvec{H}^1_0(\Omega ) : a(\varvec{u}, \varvec{v}) = 0, \forall \varvec{v} \in \varvec{V}^0 \}. \end{aligned}$$
(8)

Then, the solution of the linear elasticity equation can be decomposed as

$$\begin{aligned} \varvec{u} = \varvec{u}^0 + \varvec{u}^\perp , \qquad \varvec{u}^0 \in \varvec{V}^0, \quad \varvec{u}^\perp \in \varvec{V}^\perp , \end{aligned}$$
(9)

where \(\varvec{u}^0\) satisfies

$$\begin{aligned} a(\varvec{u}^0,\varvec{v}^0) = a(\varvec{u}, \varvec{v}^0) =(\varvec{f}, \varvec{v}^0),\quad \forall \varvec{v}^0\in \varvec{V}^0. \end{aligned}$$
(10)

The following theorem characterizes a robustness property of exact solutions to linear elasticity problems.

Theorem 1

(Gradient-robustness of nearly incompressible materials) If the right hand side \(\varvec{f} \in H^{-1}(\Omega )\) in (1a) is a gradient field, i.e. \(\varvec{f}=\nabla \phi \), \(\phi \in L^2(\Omega )\), then it holds for the solution \(\varvec{u} = \varvec{u}^0 + \varvec{u}^\perp \) (under homogeneous Dirichlet boundary conditions)

$$\begin{aligned} \varvec{u}^0 = 0, \qquad \Vert \varvec{u}^\perp \Vert _{{\varvec{H}}^1} = {\mathcal {O}}(\lambda ^{-1}), \text { i.e., for }\lambda \rightarrow \infty \text { one gets } \varvec{u} = \varvec{u}^\perp \rightarrow \varvec{0}. \end{aligned}$$

Proof

Taking \(\varvec{v}^0=\varvec{u}^0\) in (10), we get

$$\begin{aligned} a(\varvec{u}^0,\varvec{u}^0) = (\varvec{f},\varvec{u}^0) =(\nabla \phi , \varvec{u}^0) = (-\phi , {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}^0)) = 0. \end{aligned}$$

Hence, \(\varvec{u}^0= \varvec{0}\). On the other hand we obtain

$$\begin{aligned} ( 2\mu \,\nabla _s(\varvec{u}^\perp ), \nabla _s(\varvec{u}^\perp )) \!+\! ( \lambda {{\,\mathrm{{\text {div}}}\,}}\varvec{u}^\perp , {{\,\mathrm{{\text {div}}}\,}}\varvec{u}^\perp ) \!=\! \varvec{f}(\varvec{u}^\perp ) \!=\! -(\phi ,{{\,\mathrm{{\text {div}}}\,}}\varvec{u}^\perp ) \!\le \! \Vert \phi \Vert _{L^2} \Vert \varvec{u}^\perp \Vert _{{\varvec{H}}^1}. \end{aligned}$$

From Korn’s inequality \( \Vert \varvec{u}^\perp \Vert _{{\varvec{H}}^1}^2 \le C ( 2 \nabla _s(\varvec{u}^\perp ), \nabla _s(\varvec{u}^\perp )), \) and an estimate on the \({\varvec{H}}^1\) norm of functions in \(\varvec{V}^\perp \), \(\beta \Vert \varvec{u}^\perp \Vert _{{\varvec{H}}^1} \le \Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}^\perp \Vert _{{\varvec{L}}^2}\), where C is the constant in Korn’s inequality and \(\beta \) is the inf-sup constant of a corresponding Stokes problem, cf. [46, Corollary 3.47], the left-hand side above is bounded from below by \((\frac{\mu }{C}+\lambda \beta ^2) \Vert \varvec{u}^\perp \Vert _{{\varvec{H}}^1}^2\).

Dividing by \(\Vert \varvec{u}^\perp \Vert _{{\varvec{H}}^1}\), we hence have

$$\begin{aligned} (\frac{\mu }{C}+\lambda \beta ^2) \Vert \varvec{u}^\perp \Vert _{{\varvec{H}}^1(\Omega )} \le \Vert \phi \Vert _{L^2(\Omega )}. \end{aligned}$$

\(\square \)

This characterization does not automatically carry over to discretization schemes.

Definition 2

We call a space discretization for the linear elasticity problem (1) gradient-robust if, for a gradient force \(\varvec{f} = \nabla \phi \), it holds in the limit \(\lambda \rightarrow \infty \) on every fixed grid

$$\begin{aligned} \Vert \varvec{u}_h \Vert _{1,h} = {\mathcal {O}}(\lambda ^{-1}), \end{aligned}$$

where \(\Vert \bullet \Vert _{1, h}\) denotes an appropriate discrete \({\varvec{H}}^1\) norm.

We demonstrate the consequences for the linear elasticity problem in the following, where the load term \(\varvec{f}\) is a gradient field.

Example 2

We take \(\varvec{f} = \nabla \phi \) with \(\phi = x^6 + y^6\) and (homogeneous) Dirichlet boundary conditions so that it holds \(\varvec{u} \rightarrow \varvec{0}\) in the asymptotic limit \(\lambda \rightarrow \infty \).

We now compare (M1) and (M2) on a couple of fixed grids and we investigate the norms of the solutions \(\varvec{u}_h\) with respect to \(\lambda \rightarrow \infty \). The results for Example 2 are shown in Fig. 2. While (M1) behaves well as \(\Vert \nabla \varvec{u}_h \Vert _{{\varvec{L}}^2}\) goes to zero with \(\lambda ^{-1}\) essentially independent of h, for the method in (M2) we observe an upper bound for \(\Vert \nabla \varvec{u}_h \Vert _{{\varvec{L}}^2}\) that depends on the mesh.

Fig. 2 Norm of discrete solution for methods in (M1) (left) and (M2) (right), \(k=2\), under mesh refinement (x-axis: refinement level L) and different values of \(\lambda \) for Example 2

As a conclusion of the numerical examples, let us summarize that neither of the two basic methods considered here, the discretization (M1) with \(k=1\) and the Taylor-Hood based method in (M2), is satisfactory. While (M1) with \(k=1\) seems to be gradient-robust but is not free of volume-locking, the Taylor-Hood based method in (M2) has exactly the opposite properties.

3 Literature

There exists a variety of discretization methods for nearly-incompressible linear elasticity. In this section we give a—non-exhaustive—overview on existing methods and classify them with respect to the structure properties volume-locking and gradient-robustness:

  1. (i)

    Historically, the pure displacement-based conforming finite element methods (M1) were among the first methods for nearly incompressible linear elasticity. They are all gradient-robust against gradient fields in \(\varvec{H}^{-1}\), because they are connected to the divergence-free (and thus pressure-robust) \({\varvec{H}}^1\)-conforming mixed Scott–Vogelius element for the incompressible Stokes problem. However, the low-order versions of these methods are prone to volume-locking on general shape-regular meshes. In order to be robust against volume-locking, one has to choose a local polynomial degree of \(k \ge 4\) in 2D [71, 74]. In 3D, the corresponding bound is partially proven and conjectured to be \(k \ge 6\) [78]. In the late 1970s and early 1980s such high polynomial degrees seemed to be infeasible due to conditioning problems with the corresponding stiffness matrices, and researchers were looking for alternatives. However, nowadays certain families of shape-regular meshes are known that allow one to decrease the polynomial order without running into the problem of volume-locking [65]. Shape-regular meshes with barycentric refinements (Alfeld splits) allow one to use \(k=2\) in 2D [7] and \(k=3\) in 3D [76]. Shape-regular meshes of Powell–Sabin type even allow for \(k=1\) in 2D [77] and \(k=2\) in 3D [79]. Nonconforming methods like [28] are typically designed to be free of volume-locking, but are not gradient-robust.

  2. (ii)

    Already in the 1970s, various techniques were introduced in the literature to avoid volume-locking, in order to be able to use low-order methods. This includes, for example, the technique of reduced and selective integration [44, 80] and mixed methods like (M2), which were later shown to be equivalent [61]. The idea of these methods is to relax the strong penalization of the term \(\lambda ({{\,\mathrm{{\text {div}}}\,}}\varvec{u}_h, {{\,\mathrm{{\text {div}}}\,}}\varvec{v}_h)\) by something lighter, \(\lambda ({{\,\mathrm{{\text {div}}}\,}}_h \varvec{u}_h, {{\,\mathrm{{\text {div}}}\,}}_h \varvec{v}_h)\), conceptually replacing divergence-free \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming displacements by discretely divergence-free displacements and leading to a larger space \(\varvec{V}^0_h\), see Definition 1. But all these methods have a drawback: though free of volume-locking, they are not gradient-robust, since they likewise relax the orthogonality of gradient fields and divergence-free fields in the \(\varvec{L}^2\) scalar product. This can be identified by the appearance of the possibly large term \(\lambda {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\) in the right hand side of a-priori error estimates for the displacements. Sometimes authors hide this lack of gradient-robustness of their schemes by replacing the right hand side dependency on \(\lambda {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\) by a dependency on \(\varvec{f}\), cf. also Remark 4 below. Only quite recently, this issue has been observed and the origin of numerical inaccuracies has been traced back to the discretization of the load term \((\varvec{f}, \varvec{v}_h)\). This insight made it possible to build schemes which repair the \(\varvec{L}^2\) orthogonality of gradient fields (in \(\varvec{L}^2\)) and discretely divergence-free displacements, as we propose in this contribution. A more sophisticated and more involved approach even allows one to reestablish gradient-robustness against gradient fields in \(\varvec{H}^{-1}\) [49, 73].

  3. (iii)

    Starting in the 1990s, Discontinuous Galerkin methods were introduced that are free of volume-locking. Some of them are only \(\varvec{L}^2\)-conforming [40], some of them are even \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming [43, 48]. \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming schemes lead naturally to gradient-robustness against gradient fields in \(\varvec{L}^2\), while schemes that are only \(\varvec{L}^2\)-conforming are not gradient-robust, in general, e.g. [18, 41]. However, \(\varvec{L}^2\)-conforming schemes can also be gradient-robust versus \(\varvec{L}^2\) gradient fields if they converge (sufficiently fast) to an \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming scheme in the incompressible limit \(\lambda \rightarrow \infty \), cf. [3]. For example in [40, 75], this is achieved through the penalization of normal discontinuities which are scaled proportional to \(\lambda \).

  4. (iv)

    Further, various (stress-based) mixed methods have been introduced, which are free of volume-locking. Most of these methods are not gradient-robust [4,5,6, 8, 9, 22, 35, 38, 66]. More recent discretization strategies deliver similar results: the virtual element methods [14] and the hybrid high-order methods [25] are free of volume locking, but are not gradient-robust. The isogeometric analysis [13] is locking-free and gradient-robust by construction, since a stream function formulation for the incompressible material is used. This is similar to discretizations for the incompressible Navier–Stokes equations that directly discretize the vorticity equation. The decisive property for gradient-robustness \(\nabla \times \nabla \psi = \varvec{0}\) is fulfilled here by construction. On the other hand, the isogeometric method [27] is not gradient-robust, since it uses a reduced and selective integration technique.

  5. (v)

    Concerning hybridizable discontinuous Galerkin (HDG) methods, the schemes [20, 23, 72] are free of volume locking, but are not gradient-robust.

  6. (vi)

    Last but not least, we mention the approach [34], which could be adapted from its original application for the incompressible Stokes problem to linear elasticity. It is applicable to any computational method for the linear elasticity problem, which is connected to an inf-sup stable mixed method for the incompressible Stokes problem. The approach allows one to reduce the numerical error induced by a lack of gradient-robustness at the cost of solving an elliptic problem, which actually delivers a discrete Helmholtz–Hodge decomposition. Then, the approach will replace the standard discretization of the load vector \((\varvec{f}, \varvec{v}_h)\) by something more sophisticated, which is dependent on the given mixed method under consideration. Effectively, one will end up with something similar to our suggestion (3), but at the cost of solving an elliptic problem, which we can avoid completely.

Summing up, we observe that there exist several locking-free methods in the literature that are already gradient-robust. However, even among the most recently presented schemes, which are typically superior to the older schemes in view of computational efficiency, there are several schemes that lack gradient-robustness. In the subsequent two sections we introduce additional discretization schemes which are both volume-locking-free and gradient-robust, while exploiting the power of modern HDG schemes with respect to computational efficiency.

4 \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-Conforming HDG Discretization and Analysis

In this and the subsequent section we consider a special class of discretizations for linear elasticity: \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming HDG discretizations, where we also keep track of the volume-locking and gradient-robustness properties of the method. In Subsects. 4.1–4.3 we introduce preliminaries, notation and the numerical method, which we then analyse with respect to quasi-optimal error estimates and volume-locking in Subsect. 4.4. The proof of gradient-robustness is carried out in Subsect. 4.5. Numerical results for the simple example of Sect. 2 support these theoretical findings in Subsect. 4.6. In the subsequent section, Sect. 5, we consider a (more efficient) modified scheme which is volume-locking-free, but is gradient-robust only after a simple modification.

4.1 Preliminaries

Let \({\mathcal {F}}_h=\{F\}\) be the collection of facets (edges in 2D, faces in 3D) in \({\mathcal {T}}_h\). We distinguish functions with support only on facets indicated by a subscript F and those with support also on the volume elements which is indicated by a subscript T. Compositions of both types are used for the HDG discretization of the displacement and indicated by underlining, \(\underline{{\varvec{u}}}= (\varvec{u}_T ,\varvec{u}_F)\). On each simplex T, we denote the tangential component of a vector \(\varvec{v}_T\) on a facet F by \((\varvec{v}_T)^t = \varvec{v}_T-(\varvec{v}_T\cdot \varvec{n})\varvec{n}\), where \(\varvec{n}\) is the unit normal vector on F. Furthermore, we denote the compound exact solution as \(\underline{{\varvec{u}}}:=(\varvec{u}, \varvec{u}^t)\), and introduce the composite space of sufficiently smooth functions

$$\begin{aligned} \underline{{\varvec{U}}\!}(h): = ({\varvec{H}}^2(\Omega )\cap {\varvec{H}}^1_0(\Omega )) \times {\varvec{H}}^1_0({\mathcal {F}}_h). \end{aligned}$$
(11)

We denote the \({\varvec{H}}^{s}\)-norm on \(\Omega \) as \(\Vert \cdot \Vert _{s}\), and when \(s=0\), we simply denote \(\Vert \cdot \Vert \) as the \({\varvec{L}}^2\)-norm on \(\Omega \).
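The tangential component \((\varvec{v}_T)^t = \varvec{v}_T-(\varvec{v}_T\cdot \varvec{n})\varvec{n}\) introduced above can be illustrated pointwise by a minimal sketch. The function name `tangential_component` and the tuple representation of vectors are assumptions made here for illustration; they are not part of the method's implementation.

```python
def tangential_component(v, n):
    """Return v - (v . n) n for a vector v and a unit normal n.

    Both arguments are tuples of floats of equal length (d = 2 or 3).
    The caller is assumed to pass a normal of unit length.
    """
    v_dot_n = sum(vi * ni for vi, ni in zip(v, n))
    return tuple(vi - v_dot_n * ni for vi, ni in zip(v, n))


# Example: a facet with normal (0, 1); only the first component survives.
v = (1.0, 2.0)
n = (0.0, 1.0)
v_t = tangential_component(v, n)
print(v_t)  # (1.0, 0.0)
```

By construction the result is orthogonal to \(\varvec{n}\), which is the property that the facet space \({\varvec{M}}_h\) in Subsect. 4.2 builds in through the constraint \(\varvec{v}_F\cdot \varvec{n} = 0\).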

4.2 Finite Elements

We consider an HDG method which approximates the displacement on the mesh \({\mathcal {T}}_h\) using an \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming space and the tangential component of the displacement on the mesh skeleton \({\mathcal {F}}_h\) with a DG facet space given as follows:

$$\begin{aligned} \varvec{V}_{\!h}: =&\; \{\varvec{v}_T\in \prod _{T\in {\mathcal {T}}_h}[{\mathbb {P}}^{k}(T)]^d: \;\; [\![\varvec{v}_T\cdot \varvec{n} ]\!]_F = 0 \,\forall F\in {\mathcal {F}}_h\}\subset {\varvec{H}}_0({{\,\mathrm{{\text {div}}}\,}},\Omega ), \end{aligned}$$
(12a)
$$\begin{aligned} {\varvec{M}}_{h} :=&\;\{\varvec{v}_F\in \prod _{F\in {\mathcal {F}}_h} {\varvec{M}}_k(F): \;\; \varvec{v}_F\cdot \varvec{n} = 0 \,\forall F\in {\mathcal {F}}_h, \;\; \varvec{v}_F = 0 \,\;\;\forall F\subset \partial \Omega \}, \end{aligned}$$
(12b)

where \([\![\cdot ]\!]_F\) is the usual jump operator, \({\mathbb {P}}^k\) the space of polynomials up to degree k, and

$$\begin{aligned} {\varvec{M}}_k(F):=\left\{ \begin{array}{l l} {[}{\mathbb {P}}^{0}(F)]^3\oplus \varvec{x}\times [{\mathbb {P}}^{0}(F)]^3 &{} \hbox { if } k = 1 \hbox { and } d=3, \\ {[}{\mathbb {P}}^{k-1}(F)]^d &{} \hbox {else}. \end{array} \right. \end{aligned}$$

Note that functions in \({\varvec{M}}_h\) are defined only on the mesh skeleton and have normal component zero.

To further simplify notation, we denote the composite space as

$$\begin{aligned} \underline{{\varvec{U}}\!}_h: = \varvec{V}_{\!h}\times {\varvec{M}}_h. \end{aligned}$$

4.3 The Numerical Scheme

We introduce the \({\varvec{L}}^2\) projection onto \({\varvec{M}}_k(F)\) as \(\Pi _M\):

$$\begin{aligned} \Pi _M: {\varvec{L}}^2(F) \rightarrow {\varvec{M}}_k(F), \quad \int _F (\Pi _M f) v \, \mathrm {ds} = \int _{F}f\,v\, \mathrm {ds} \quad \forall v\in {\varvec{M}}_k(F). \end{aligned}$$

Then, for all \(\underline{{\varvec{u}}}, \underline{{\varvec{v}}}\in \underline{{\varvec{U}}\!}_h\), we introduce the bilinear and linear forms

$$\begin{aligned} a_{h}(\underline{{\varvec{u}}}, \underline{{\varvec{v}}}) :=&\; a_{h}^{\mu }(\underline{{\varvec{u}}}, \underline{{\varvec{v}}}) + a_{h}^{\lambda }(\underline{{\varvec{u}}}, \underline{{\varvec{v}}})\end{aligned}$$
(13a)
$$\begin{aligned} a_{h}^{\mu }(\underline{{\varvec{u}}}, \underline{{\varvec{v}}}) :=&\; \sum _{T\in {\mathcal {T}}_h}\int _T2\mu \,\nabla _s(\varvec{u}_T):\nabla _s(\varvec{v}_T) \,\mathrm {d}\varvec{x}\nonumber \\&\; -\int _{\partial T}2\mu \,\nabla _s(\varvec{u}_T)\varvec{n}\cdot [\![\underline{{\varvec{v}}}^t ]\!]\,\mathrm {d}\varvec{s}-\int _{\partial T}2\mu \,\nabla _s(\varvec{v}_T)\varvec{n}\cdot [\![\underline{{\varvec{u}}}^t ]\!]\,\mathrm {d}\varvec{s}\nonumber \\&\; + \int _{\partial T}\mu \frac{\alpha }{h}\Pi _M[\![\underline{{\varvec{u}}}^t ]\!]\cdot \Pi _M[\![\underline{{\varvec{v}}}^t ]\!]\,\mathrm {d}\varvec{s}, \end{aligned}$$
(13b)
$$\begin{aligned} a_h^{\lambda }(\underline{{\varvec{u}}}, \underline{{\varvec{v}}}) :=&\; \sum _{T\in {\mathcal {T}}_h}\int _T \lambda \,{{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T){{\,\mathrm{{\text {div}}}\,}}(\varvec{v}_T) \,\mathrm {d}\varvec{x}, \end{aligned}$$
(13c)
$$\begin{aligned} f(\underline{{\varvec{v}}}) :=&\; \sum _{T\in {\mathcal {T}}_h}\int _{T}\varvec{f}\cdot \varvec{v}_T\,\mathrm {d}\varvec{x}. \end{aligned}$$
(13d)

where \([\![\underline{{\varvec{u}}}^t ]\!]= (\varvec{u}_T)^t-\varvec{u}_F\) is the (tangential) jump between element interior and facet unknowns, and \(\alpha = \alpha _0 k^2\) with \(\alpha _0\) a sufficiently large positive constant.

The numerical scheme then reads: Find \(\underline{{\varvec{u}}}_h\in \underline{{\varvec{U}}\!}_h\) such that

$$\begin{aligned} a_h(\underline{{\varvec{u}}}_h, \underline{{\varvec{v}}}_h) = f(\underline{{\varvec{v}}}_h), \quad \forall \underline{{\varvec{v}}}_h\in \underline{{\varvec{U}}\!}_h. \end{aligned}$$
(S1)

4.4 Error Estimates

We write

$$\begin{aligned} A\preceq B \end{aligned}$$

to indicate that there exists a constant C, independent of the mesh size h, the Lamé parameters \(\mu \) and \(\lambda \), and the numerical solution, such that \(A\le CB.\)

Denote the space of rigid motions

$$\begin{aligned} \varvec{RM}(T)=\{\varvec{a}+B \,\varvec{x}:\;\;\varvec{a}\in {\mathbb {R}}^d, B\in S_d\}, \end{aligned}$$

where \(S_d\) is the space of anti-symmetric \(d\times d\) matrices. We observe that the tangential trace on a facet F of any function in \(\varvec{RM}(T)\) is a constant in 2D, and lies in the space \({\varvec{M}}_1(F)\) in 3D. Hence, there holds

$$\begin{aligned} \varvec{v}^t|_F \in {\varvec{M}}_k(F), \quad \quad \forall \varvec{v}\in \varvec{RM}(T). \end{aligned}$$
(14)
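To see why (14) holds in the simplest case, the following short computation treats \(d=2\). The parametrization of the facet by a point \(\varvec{x}_0\), a unit tangent \(\varvec{t}\) and an arclength parameter s, as well as the rotation matrix J, are notation introduced here for illustration only.

$$\begin{aligned} \varvec{v}(\varvec{x}) =&\; \varvec{a} + b\,J\varvec{x}, \quad J = \begin{pmatrix} 0 &{} -1\\ 1 &{} 0 \end{pmatrix}, \quad \varvec{x} = \varvec{x}_0 + s\,\varvec{t} \in F,\\ \varvec{v}\cdot \varvec{t} =&\; \varvec{a}\cdot \varvec{t} + b\,(J\varvec{x}_0)\cdot \varvec{t} + b\,s\,\underbrace{(J\varvec{t})\cdot \varvec{t}}_{=0} = \text {const on } F. \end{aligned}$$

Since \(\varvec{v}^t = (\varvec{v}\cdot \varvec{t})\,\varvec{t}\) in 2D, the tangential trace is a constant vector on F and hence lies in \([{\mathbb {P}}^{k-1}(F)]^2 = {\varvec{M}}_k(F)\) for every \(k\ge 1\).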

The above property is the key to proving coercivity of the bilinear form (13a).

We use the following projection \(\Pi _{\varvec{RM}}\) from \({\varvec{H}}^1(T)\) onto \(\varvec{RM}(T)\) [17]:

$$\begin{aligned} \int _T \Pi _{\varvec{RM}} \varvec{u} \,\mathrm {d}\varvec{x}=&\; \int _T \varvec{u} \,\mathrm {d}\varvec{x},\\ \int _T {{ {\varvec{\mathrm {curl}}}}}\, (\Pi _{\varvec{RM}} \varvec{u})\,\mathrm {d}\varvec{x}=&\; \int _T {{ {\varvec{\mathrm {curl}}}}}\,\varvec{u} \,\mathrm {d}\varvec{x}, \end{aligned}$$

where \({{ {\varvec{\mathrm {curl}}}}}\, \varvec{u}\) is the anti-symmetric part of the gradient of \(\varvec{u}\). Following [17] this projection operator has the approximation properties

$$\begin{aligned} ||{\nabla }(\varvec{u}-\Pi _{\varvec{RM}}\varvec{u})||_{T} \preceq&\; ||\nabla _s(\varvec{u})||_{T}, \end{aligned}$$
(15a)
$$\begin{aligned} ||\varvec{u}-\Pi _{\varvec{RM}}\varvec{u}||_{T}\preceq&\; h_T||{\nabla }(\varvec{u}-\Pi _{\varvec{RM}}\varvec{u})||_{T}. \end{aligned}$$
(15b)
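As a quick sanity check of (15), consider an affine displacement on a single element. The symbols \(\varvec{a}\), G, E, W and the centroid \(\bar{\varvec{x}}_T\) of T are introduced here for illustration only.

$$\begin{aligned} \varvec{u}(\varvec{x}) =&\; \varvec{a} + G\varvec{x}, \quad E := \tfrac{1}{2}(G+G^T) = \nabla _s\varvec{u}, \quad W := \tfrac{1}{2}(G-G^T),\\ \Pi _{\varvec{RM}}\varvec{u}(\varvec{x}) =&\; \varvec{a} + E\,\bar{\varvec{x}}_T + W\varvec{x}, \qquad \varvec{u}-\Pi _{\varvec{RM}}\varvec{u} = E\,(\varvec{x}-\bar{\varvec{x}}_T). \end{aligned}$$

Both defining conditions of the projection are met: the mean of \(E(\varvec{x}-\bar{\varvec{x}}_T)\) over T vanishes, and its gradient E is symmetric, so its anti-symmetric part is zero. Hence \({\nabla }(\varvec{u}-\Pi _{\varvec{RM}}\varvec{u}) = \nabla _s\varvec{u}\), i.e. (15a) holds with equality in this case, and (15b) follows from \(|\varvec{x}-\bar{\varvec{x}}_T|\le h_T\) on T.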

We introduce the following (semi)norms:

$$\begin{aligned} \Vert \underline{{\varvec{v}}}\Vert _{\mu ,h} := \mu ^\frac{1}{2} \Vert \underline{{\varvec{v}}}\Vert _{1,h},\quad \Vert \underline{{\varvec{v}}}\Vert _{\mu ,*,h} := \mu ^\frac{1}{2} \Vert \underline{{\varvec{v}}}\Vert _{1,*,h},\quad \Vert \underline{{\varvec{v}}}\Vert _{\mu ,**,h} := \mu ^\frac{1}{2} \Vert \underline{{\varvec{v}}}\Vert _{1,**,h}, \end{aligned}$$
$$\begin{aligned} \Vert \underline{{\varvec{v}}}\Vert _{1,h} :=&\; \left( \sum _{T\in {\mathcal {T}}_h} 2\Vert \nabla _s\varvec{v}_T\Vert ^2_T +\frac{2}{h}\Vert \Pi _M[\![\underline{{\varvec{v}}}^t ]\!]\Vert ^2_{\partial T} \right) ^{1/2}, \end{aligned}$$
(16a)
$$\begin{aligned} \Vert \underline{{\varvec{v}}}\Vert _{1,*,h} :=&\; \Big ( \Vert \underline{{\varvec{v}}}\Vert _{1,h}^2+ \sum _{T\in {\mathcal {T}}_h} 2 h \Vert \nabla _s(\varvec{v}_T)\varvec{n}\Vert ^2_{\partial T} \Big )^{1/2}, \end{aligned}$$
(16b)
$$\begin{aligned} \Vert \underline{{\varvec{v}}}\Vert _{1,**,h} :=&\; \Big ( \Vert \underline{{\varvec{v}}}\Vert _{1,*,h}^2+\sum _{T\in {\mathcal {T}}_h} \frac{2}{h}\Vert [\![\underline{{\varvec{v}}}^t ]\!]\Vert ^2_{\partial T} \Big )^{1/2}. \end{aligned}$$
(16c)

To derive optimal \({\varvec{L}}^2\) error estimates, we shall assume the following full \({\varvec{H}}^2\)-regularity

$$\begin{aligned} \mu \Vert \varvec{\phi }\Vert _{2} + \lambda \Vert {{\,\mathrm{{\text {div}}}\,}}\,\varvec{\phi }\Vert _{1} \preceq \Vert \varvec{\theta }\Vert \end{aligned}$$
(17)

for the dual problem with any source term \(\varvec{\theta }\in [{\varvec{L}}^2(\Omega )]^d\):

$$\begin{aligned} - {{\,\mathrm{{\text {div}}}\,}}\left( 2\mu \nabla _s\varvec{\phi }\right) - {\nabla }\left( \lambda \,{{\,\mathrm{{\text {div}}}\,}}\varvec{\phi }\right) =&\; \varvec{\theta }\quad \text { in } \Omega , \end{aligned}$$
(18a)
$$\begin{aligned} \varvec{\phi }=&\; \varvec{0} \quad \text { on } \partial \Omega . \end{aligned}$$
(18b)

The estimate (17) holds on convex polygons [18].

We have the following estimates.

Theorem 2

Assume \(k\ge 1\) and the regularity \(\varvec{u}\in {\varvec{H}}^{k+1}(\Omega )\). Let \(\underline{{\varvec{u}}}_h\in \underline{{\varvec{U}}\!}_h\) be the numerical solution to the scheme (S1). Then, for sufficiently large stabilization parameter \(\alpha _0\), the following estimates hold

$$\begin{aligned} \Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{\mu ,h} \preceq&\; \mu ^{1/2} h^{k} \Vert \varvec{u}\Vert _{k+1}, \end{aligned}$$
(19a)
$$\begin{aligned} \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}-\varvec{u}_T)\Vert \preceq&\;(\mu /\lambda )^{1/2} h^{k} \Vert \varvec{u}\Vert _{k+1} + h^{k}\Vert {{\,\mathrm{{\text {div}}}\,}}\,\varvec{u}\Vert _{k}. \end{aligned}$$
(19b)

Moreover, under the regularity assumption (17), the following estimate holds

$$\begin{aligned} \Vert \varvec{u}-\varvec{u}_T\Vert \preceq h^{k+1} \Vert \varvec{u}\Vert _{k+1}. \end{aligned}$$
(19c)

Remark 1

(Volume-locking-free estimates) From the energy estimate (19a), we get that

$$\begin{aligned} \sum _{T\in {\mathcal {T}}_h} \Vert \nabla _s(\varvec{u}-\varvec{u}_{h,T})\Vert ^2_T +\frac{1}{h}\Vert \Pi _M[\![(\underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h)^t ]\!]\Vert ^2_{\partial T} \preceq h^{2 k} \Vert \varvec{u}\Vert _{k+1}^2, \end{aligned}$$

with the hidden constant independent of the Lamé constants \(\lambda \) and \(\mu \). This observation also holds for the \({\varvec{L}}^2\)-norm estimate (19c). Hence, the estimates are free of volume-locking when \(\lambda \rightarrow \infty \).
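The step from (19a) to the estimate displayed in this remark can be made explicit as follows. It only uses the definition of the scaled norm \(\Vert \cdot \Vert _{\mu ,h}\) and of \(\Vert \cdot \Vert _{1,h}\) in (16a), up to the fixed factor 2.

$$\begin{aligned} \mu ^{1/2}\Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{1,h} = \Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{\mu ,h} \preceq \mu ^{1/2} h^{k} \Vert \varvec{u}\Vert _{k+1} \quad \Longrightarrow \quad \Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{1,h}^2 \preceq h^{2k} \Vert \varvec{u}\Vert _{k+1}^2. \end{aligned}$$

The factor \(\mu ^{1/2}\) cancels since \(\mu >0\), and \(\lambda \) does not enter at all.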

Proof

We proceed in the following five steps.

Step 1 (Coercivity): Using the definition (13b) of the bilinear form \(a_h^\mu (\cdot ,\cdot )\), and applying the Cauchy-Schwarz inequality combined with trace-inverse inequalities, we obtain, cf. [31, Lemma 2], for sufficiently large \(\alpha \),

$$\begin{aligned} \Vert \underline{{\varvec{v}}}_h\Vert _{\mu ,h}^2\preceq a_h^{\mu }(\underline{{\varvec{v}}}_h,\underline{{\varvec{v}}}_h) \quad \forall \underline{{\varvec{v}}}_h\in \underline{{\varvec{U}}\!}_h. \end{aligned}$$
(20)

Step 2 (Norm equivalence): By property (14), we have \(\Pi _M(\Pi _{\varvec{RM}}\varvec{v}_T)^t = (\Pi _{\varvec{RM}}\varvec{v}_T)^t\). Hence, for any interior facet \(F\in {\mathcal {F}}_h\backslash \partial \Omega \) and any function \(\underline{{\varvec{v}}}\in \underline{{\varvec{U}}\!}(h)+\underline{{\varvec{U}}\!}_h\), we have

$$\begin{aligned} \Vert [\![\underline{{\varvec{v}}}^t ]\!]\Vert _{F}\le&\; \Vert \Pi _M[\![\underline{{\varvec{v}}}^t ]\!]\Vert _{F}+ \Vert \varvec{v}_T^t-\Pi _M\varvec{v}_T^t\Vert _{F}\\ \le&\; \Vert \Pi _M[\![\underline{{\varvec{v}}}^t ]\!]\Vert _{F}+ \Vert (\varvec{v}_T-\Pi _{\varvec{RM}}\varvec{v}_T)^t-\Pi _M(\varvec{v}_T -\Pi _{\varvec{RM}}\varvec{v}_T)^t\Vert _{F}\\ \preceq&\; \Vert \Pi _M[\![\underline{{\varvec{v}}}^t ]\!]\Vert _{F}+ \Vert \varvec{v}_T-\Pi _{\varvec{RM}}\varvec{v}_T\Vert _{F}. \end{aligned}$$

Using the trace theorem and approximation properties (15) of the projector \(\Pi _{\varvec{RM}}\), we get

$$\begin{aligned} \Vert \varvec{v}_T-\Pi _{\varvec{RM}}\varvec{v}_T\Vert _{F}^2 \preceq&\; \sum _{T\in {\mathcal {T}}(F)}(h|\varvec{v}_T-\Pi _{\varvec{RM}} \varvec{v}_T|^2_{1,T}+h^{-1}\Vert \varvec{v}_T-\Pi _{\varvec{RM}}\varvec{v}_T\Vert ^2_T)\\ \preceq&\; h\, \Vert \nabla _s\varvec{v}_T\Vert ^2_{{\mathcal {T}}(F)}, \end{aligned}$$

where \({\mathcal {T}}(F)\) is the set of the two simplices meeting F. Hence,

$$\begin{aligned} \Vert [\![\underline{{\varvec{v}}}^t ]\!]\Vert _{F}\preceq&\; \Vert \Pi _M[\![\underline{{\varvec{v}}}^t ]\!]\Vert _{F} + h^{1/2}\, \Vert \nabla _s\varvec{v}_T\Vert _{{\mathcal {T}}(F)} \quad \quad \forall \underline{{\varvec{v}}}\in \underline{{\varvec{U}}\!}(h)+\underline{{\varvec{U}}\!}_h. \end{aligned}$$
(21)

Recalling the norms defined in (16), this directly implies

$$\begin{aligned} \Vert \underline{{\varvec{v}}}\Vert _{\mu ,**,h}\preceq \Vert \underline{{\varvec{v}}}\Vert _{\mu ,*,h}\quad \quad \forall \underline{{\varvec{v}}}\in \underline{{\varvec{U}}\!}(h)+\underline{{\varvec{U}}\!}_h. \end{aligned}$$
(22a)
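One way to spell out this implication is sketched below; the finite-overlap argument is standard and is assumed here rather than taken from the text. Squaring (21), multiplying by \(2/h\) and summing over all facets, where each element belongs to \({\mathcal {T}}(F)\) for at most \(d+1\) facets, gives

$$\begin{aligned} \sum _{T\in {\mathcal {T}}_h} \frac{2}{h}\Vert [\![\underline{{\varvec{v}}}^t ]\!]\Vert ^2_{\partial T} \preceq \sum _{T\in {\mathcal {T}}_h} \Big ( \frac{2}{h}\Vert \Pi _M[\![\underline{{\varvec{v}}}^t ]\!]\Vert ^2_{\partial T} + 2\Vert \nabla _s\varvec{v}_T\Vert ^2_T \Big ) = \Vert \underline{{\varvec{v}}}\Vert _{1,h}^2 \le \Vert \underline{{\varvec{v}}}\Vert _{1,*,h}^2. \end{aligned}$$

Adding \(\Vert \underline{{\varvec{v}}}\Vert _{1,*,h}^2\) to both sides and multiplying by \(\mu \) yields (22a). Boundary facets, which (21) does not cover explicitly, are assumed to be handled by the same argument with a single adjacent element.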

On the other hand, by trace and inverse inequalities, we have, cf. [31, Lemma 1],

$$\begin{aligned} \Vert \underline{{\varvec{v}}}_h\Vert _{\mu ,*,h}\preceq \Vert \underline{{\varvec{v}}}_h\Vert _{\mu ,h}\quad \quad \forall \underline{{\varvec{v}}}_h\in \underline{{\varvec{U}}\!}_h. \end{aligned}$$
(22b)

Step 3 (Boundedness): Applying the Cauchy-Schwarz inequality to the bilinear form \(a_h^\mu (\cdot ,\cdot )\) and using the estimate (21), we obtain

$$\begin{aligned} a_h^\mu (\underline{{\varvec{v}}},\underline{{\varvec{w}}}) \le&\; \Vert \underline{{\varvec{v}}}\Vert _{\mu ,**,h} \Vert \underline{{\varvec{w}}}\Vert _{\mu ,**,h} \preceq \;\Vert \underline{{\varvec{v}}}\Vert _{\mu ,*,h} \Vert \underline{{\varvec{w}}}\Vert _{\mu ,*,h}\quad \quad \forall \underline{{\varvec{v}}},\underline{{\varvec{w}}}\in \underline{{\varvec{U}}\!}(h)+\underline{{\varvec{U}}\!}_h. \end{aligned}$$
(23)

Step 4 (Galerkin orthogonality, BDM interpolation): Consistency of the scheme yields \(a_h(\underline{{\varvec{u}}}, \underline{{\varvec{v}}}_h) = f(\underline{{\varvec{v}}}_h)\) for all \(\underline{{\varvec{v}}}_h\in \underline{{\varvec{U}}\!}_h\). Subtracting (S1) gives the Galerkin orthogonality \(a_h(\underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h, \underline{{\varvec{v}}}_h) = 0\). We estimate the error by first applying a triangle inequality to split

$$\begin{aligned} \Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{\mu ,h} \le \Vert \underline{{\varvec{v}}}_h-\underline{{\varvec{u}}}\Vert _{\mu ,h} + \Vert \underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h\Vert _{\mu ,h}, \end{aligned}$$

where we choose \(\underline{{\varvec{v}}}_h= (\Pi _V \varvec{u}, \Pi _M \varvec{u})\), with \(\Pi _V\) the classical BDM interpolator [15, Proposition 2.3.2]. As a consequence of its commuting diagram property, the interpolation operator \(\Pi _V\) satisfies

$$\begin{aligned} \int _{\Omega } {{\,\mathrm{{\text {div}}}\,}}(\Pi _V \varvec{u} - \varvec{u}) q_h \,\mathrm {d}\varvec{x}= \int _{\Omega } (\Pi _Q {{\,\mathrm{{\text {div}}}\,}}\varvec{u} - {{\,\mathrm{{\text {div}}}\,}}\varvec{u}) q_h \,\mathrm {d}\varvec{x}= 0 \quad \forall ~ q_h \in Q_h, \end{aligned}$$

where \(\Pi _Q\) is the \(L^2\) projection onto \(Q_h = \prod _{T\in {\mathcal {T}}_h}{\mathbb {P}}^{k-1}(T) = {{\,\mathrm{{\text {div}}}\,}}\varvec{V}_{\!h}\). Hence,

$$\begin{aligned}&\Vert \underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h\Vert _{\mu ,h}^2 +\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T-\varvec{v}_T)\Vert ^2\\&\preceq a_h^{\mu }(\underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h, \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h) +\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T-\varvec{v}_T)\Vert ^2\\&= a_h(\underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h, \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h) = a_h(\underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h, \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h) \\&= a_h^{\mu }(\underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h,\; \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h) + \underbrace{a_h^{\lambda }(\underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h,\; \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h)}_{=0}\\&\preceq \Vert \underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h\Vert _{\mu ,*,h} \Vert \underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h\Vert _{\mu ,*,h} \preceq \Vert \underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h\Vert _{\mu ,*,h} \Vert \underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h\Vert _{\mu ,h}. \end{aligned}$$

This implies

$$\begin{aligned} \Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{\mu ,h} +\lambda ^{1/2}\Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T-\varvec{v}_T)\Vert \preceq \Vert \underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h\Vert _{\mu ,*,h} \preceq \mu ^{1/2} h^k\Vert \varvec{u}\Vert _{k+1}, \end{aligned}$$
(24)

where the last estimate follows from usual Bramble–Hilbert-type arguments, cf. [54, Proposition 2.3.8] for a proof in an almost identical setting. The estimate (19a) follows directly from (24), and the estimate (19b) follows from (24) and the triangle inequality:

$$\begin{aligned} \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}-\varvec{u}_T)\Vert \le&\; \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T-\varvec{v}_T)\Vert + \underbrace{\Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}-\varvec{v}_T)\Vert }_{=\Vert (I-\Pi _Q){{\,\mathrm{{\text {div}}}\,}}\,\varvec{u}\Vert } \\ \preceq&\; (\mu /\lambda )^{1/2} h^k\Vert \varvec{u}\Vert _{k+1} +h^k \Vert {{\,\mathrm{{\text {div}}}\,}}\,\varvec{u}\Vert _k. \end{aligned}$$
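The vanishing of the \(a_h^{\lambda }\) term in the chain of estimates of Step 4 can be made explicit; the abbreviation \(q_h\) is introduced here only for this purpose. Since \(\varvec{u}_T\) and \(\varvec{v}_T = \Pi _V\varvec{u}\) both lie in \(\varvec{V}_{\!h}\) and \({{\,\mathrm{{\text {div}}}\,}}\varvec{V}_{\!h} = Q_h\), we have \(q_h := {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T-\varvec{v}_T)\in Q_h\), so that

$$\begin{aligned} a_h^{\lambda }(\underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h, \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h) = \lambda \int _{\Omega } {{\,\mathrm{{\text {div}}}\,}}(\varvec{u} - \Pi _V \varvec{u})\, q_h \,\mathrm {d}\varvec{x}= 0 \end{aligned}$$

by the commuting diagram property of \(\Pi _V\). This is the point where the \(\lambda \)-dependence drops out of the energy estimate.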

Step 5 (Duality): Let \(\varvec{\phi }\) be the solution to the dual problem (18) with \(\varvec{\theta }= \varvec{u}- \varvec{u}_T\) and \(\underline{\varvec{\phi }}= (\varvec{\phi }, \varvec{\phi }^t) \in \underline{{\varvec{U}}\!}(h)\). By symmetry of the bilinear form \(a_h(\cdot ,\cdot )\) and consistency of the numerical scheme (S1), we have with \(\underline{\varvec{\phi }}_h= (\Pi _V \varvec{\phi }, \Pi _M \varvec{\phi }) \in \underline{{\varvec{U}}\!}_h\)

$$\begin{aligned} \Vert \varvec{u}- \varvec{u}_T\Vert _\Omega ^2&= a_h(\underline{\varvec{\phi }}, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) = a_h(\underline{\varvec{\phi }}-\underline{\varvec{\phi }}_h, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) \\&= a_h^{\mu }(\underline{\varvec{\phi }}- \underline{\varvec{\phi }}_h, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) + a_h^{\lambda }(\underline{\varvec{\phi }}- \underline{\varvec{\phi }}_h, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) \\&= a_h^{\mu }(\underline{\varvec{\phi }}- \underline{\varvec{\phi }}_h, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) + \lambda \sum _{T\in {\mathcal {T}}_h} \int _T \underbrace{{{\,\mathrm{{\text {div}}}\,}}(\varvec{\phi }- \Pi _V \varvec{\phi }) {{\,\mathrm{{\text {div}}}\,}}(\varvec{u} - \Pi _V\varvec{u})}_{ =(I-\Pi _Q){{\,\mathrm{{\text {div}}}\,}}\varvec{\phi }\, (I-\Pi _Q){{\,\mathrm{{\text {div}}}\,}}\varvec{u}} \,\mathrm {d}\varvec{x}\\&\preceq \Vert \underline{\varvec{\phi }}-\underline{\varvec{\phi }}_h\Vert _{\mu ,*,h} \Vert \underline{{\varvec{u}}}_h-\underline{{\varvec{u}}}\Vert _{\mu ,*,h}+ \lambda \Vert (I-\Pi _Q){{\,\mathrm{{\text {div}}}\,}}\varvec{\phi }\Vert \cdot \Vert (I -\Pi _Q){{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert \\&\preceq \mu h^{k+1} \Vert \varvec{\phi }\Vert _2\Vert \varvec{u}\Vert _{k+1}+ \lambda h^{k+1} \Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{\phi }\Vert _1\Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _{k} \\&\preceq h^{k+1} \Vert \varvec{u}- \varvec{u}_T\Vert _\Omega \Vert \varvec{u}\Vert _{k+1}, \end{aligned}$$

In the last step we invoked the regularity assumption (17). This completes the proof of (19c). \(\square \)

4.5 Gradient-Robustness

In this subsection we show that the \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming HDG method (S1) is gradient-robust. A splitting into a discretely divergence-free subspace and an orthogonal complement is crucial here. To proceed, it is more natural to work with a DG-equivalent reformulation of the HDG scheme (S1), obtained by eliminating the facet unknowns (for analysis purposes only). In Remark 2 below we explain how this translates to the HDG setting.

We introduce the lifting \({\mathcal {L}}_h: \varvec{V}_{\!h}+ {\varvec{H}}^2_0(\Omega ) \rightarrow {\varvec{M}}_h\) where \({\mathcal {L}}_h(\varvec{w}_T)\) is the unique function in \({\varvec{M}}_h\) such that

$$\begin{aligned} a_h((\varvec{w}_T, {\mathcal {L}}_h(\varvec{w}_T)), (0, \varvec{v}_F))=0,\quad \forall \varvec{v}_F\in {\varvec{M}}_h. \end{aligned}$$

For the case of a uniform mesh size h, an explicit formula can easily be derived, yielding

$$\begin{aligned} {\mathcal {L}}_h(\varvec{w}_T)= \{\!\!\{ \Pi _M \varvec{w}_T \}\!\!\}_* - \frac{h}{2 \alpha } [\![\nabla _s\varvec{w}_T \cdot \varvec{n} ]\!]_*, \end{aligned}$$

where \(\{\!\!\{ \cdot \}\!\!\}_*\) and \([\![\cdot ]\!]_*\) are the usual DG average and jump operators. Then, the solution \(\underline{\varvec{u}}_h=(\varvec{u}_T, \varvec{u}_F)\in \underline{{\varvec{U}}\!}_h\) to the scheme (S1) satisfies \(\varvec{u}_F = {\mathcal {L}}_h(\varvec{u}_T)\), with \(\varvec{u}_T\in \varvec{V}_{\!h}\) being the unique function such that

$$\begin{aligned} {\hat{a}}_h(\varvec{u}_T,\varvec{v}_T) = \hat{f}(\varvec{v}_T) \quad \forall \varvec{v}_T\in \varvec{V}_{\!h}, \end{aligned}$$
(S1-DG)

where \({\hat{a}}_h(\cdot ,\cdot )\) and \({\hat{f}}\) are defined on \(\varvec{V}_{\!h}\) as follows:

$$\begin{aligned} {\hat{a}}_h(\varvec{v}_T,\varvec{w}_T) := a_h\left( (\varvec{v}_T, {\mathcal {L}}_h(\varvec{v}_T)),(\varvec{w}_T, 0)\right) , \quad \hat{f}(\varvec{w}_T) := f((\varvec{w}_T, 0)), ~\varvec{v}_T, \varvec{w}_T\in \varvec{V}_{\!h}. \end{aligned}$$

Analogously (with slight abuse of notation) we define a norm on \(\varvec{V}_{\!h}\) with

$$\begin{aligned} \Vert \varvec{u}_T \Vert _{1,h} := \Vert ( \varvec{u}_T, {\mathcal {L}}_h(\varvec{u}_T) ) \Vert _{1,h}. \end{aligned}$$

Introducing the spaces

$$\begin{aligned} \varvec{V}_{\!h}^0 := \{ \varvec{v}_T \in \varvec{V}_{\!h}: {{\,\mathrm{{\text {div}}}\,}}\varvec{v}_T = 0, \quad \forall T \in {\mathcal {T}}_h\}, \end{aligned}$$
(25a)

and

$$\begin{aligned} \varvec{V}_{\!h}^\perp := \{ \varvec{v}_T \in \varvec{V}_{\!h}: \hat{a}_h(\varvec{v}_T, \varvec{w}_T) = 0, \quad \forall \varvec{w}_T \in \varvec{V}_{\!h}^0\}, \end{aligned}$$
(25b)

we then split the solution \(\varvec{u}_T \in \varvec{V}_{\!h}\) to the scheme (S1-DG) as \(\varvec{u}_T = \varvec{u}_T^0 +\varvec{u}_T^\perp \), where \(\varvec{u}_T^0 \in \varvec{V}_{\!h}^0\) and \(\varvec{u}_T^\perp \in \varvec{V}_{\!h}^\perp \) are the unique solutions to the following equations:

$$\begin{aligned} {\hat{a}}_{h}(\varvec{u}_T^0, \varvec{v}_T^0)&= {\hat{f}}(\varvec{v}_T^0) \quad \forall ~ \varvec{v}_T^0 \in \varvec{V}_{\!h}^0, \end{aligned}$$
(26a)
$$\begin{aligned} {\hat{a}}_{h}(\varvec{u}_T^\perp , \varvec{v}_T^\perp )&= {\hat{f}}(\varvec{v}_T^\perp ) \quad \forall ~ \varvec{v}_T^\perp \in \varvec{V}_{\!h}^\perp . \end{aligned}$$
(26b)
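That the two subproblems (26) indeed reproduce the solution of (S1-DG) can be checked directly, assuming that \(\varvec{V}_{\!h}^\perp \) is understood as the \({\hat{a}}_h\)-orthogonal complement of \(\varvec{V}_{\!h}^0\) in \(\varvec{V}_{\!h}\) and using the symmetry of \({\hat{a}}_h\), which is inherited from \(a_h\) through the definition of the lifting. For any test function \(\varvec{v}_T = \varvec{v}_T^0 + \varvec{v}_T^\perp \in \varvec{V}_{\!h}\):

$$\begin{aligned} {\hat{a}}_h(\varvec{u}_T^0 + \varvec{u}_T^\perp , \varvec{v}_T^0 + \varvec{v}_T^\perp ) =&\; {\hat{a}}_h(\varvec{u}_T^0, \varvec{v}_T^0) + \underbrace{{\hat{a}}_h(\varvec{u}_T^0, \varvec{v}_T^\perp ) + {\hat{a}}_h(\varvec{u}_T^\perp , \varvec{v}_T^0)}_{=0} + {\hat{a}}_h(\varvec{u}_T^\perp , \varvec{v}_T^\perp )\\ =&\; {\hat{f}}(\varvec{v}_T^0) + {\hat{f}}(\varvec{v}_T^\perp ) = {\hat{f}}(\varvec{v}_T). \end{aligned}$$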

We are now ready to state the following gradient-robustness property of the schemes (S1-DG) and (S1), analogous to the continuous case in Theorem 1.

Theorem 3

(Gradient-robustness of (S1-DG)) The scheme (S1-DG) (and hence scheme (S1)) is gradient-robust, i.e. for \(\varvec{f}=\nabla \phi \), \(\phi \in H^1(\Omega )\), the solution \(\varvec{u}_T = \varvec{u}_T^0 + \varvec{u}_T^\perp \in \varvec{V}_{\!h}\) satisfies

$$\begin{aligned} \varvec{u}_T^0= 0 ,\quad \Vert \varvec{u}_T^\perp \Vert _{1,h} = {\mathcal {O}}(\lambda ^{-1}). \end{aligned}$$

In particular, for \(\lambda \rightarrow \infty \) one gets \(\varvec{u}_T \rightarrow \varvec{0}\).

To prove Theorem 3, we shall first recall the following inf-sup stability result.

Lemma 1

(inf-sup stability)

There holds the discrete LBB condition

$$\begin{aligned} \sup _{\varvec{u}_T \in \varvec{V}_{\!h}\setminus \{\varvec{0}\}} \frac{({{\,\mathrm{{\text {div}}}\,}}\varvec{u}_T, q_h)}{\Vert \varvec{u}_T \Vert _{1,h}} \ge \beta \Vert q_h \Vert \quad \text {for all } q_h \in Q_h \end{aligned}$$
(27a)

for \(\beta \) independent of \(\mu ,~h,~k\). Moreover, for all \(q_h \in Q_h\) there exists a unique \(\varvec{u}_T^\perp \in \varvec{V}_{\!h}^\perp \) such that

$$\begin{aligned} {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T^\perp ) = q_h \quad \text { and } \quad \Vert \varvec{u}_T^\perp \Vert _{1,h} \le \beta ^{-1} \Vert q_h \Vert . \end{aligned}$$
(27b)

Proof

For (27a) we refer to [53]. The statement (27b) is a direct consequence of (27a), as the latter implies the existence of an isomorphism between \(\varvec{V}_{\!h}^\perp \) and \(Q_h\) related to \(({{\,\mathrm{{\text {div}}}\,}}(\cdot ),\cdot )\), cf. e.g. [46, Lemma 3.58]. \(\square \)

We now prove Theorem 3.

Proof of Theorem 3

For the proof it is crucial to first establish a result as in (4). With \({\hat{f}}(\cdot ) = (\nabla \phi , \cdot )_{\Omega }\), integration by parts yields

$$\begin{aligned} {\hat{f}}(\varvec{v}_T^0)= - \sum _{T \in {\mathcal {T}}_h} (\phi , {{\,\mathrm{{\text {div}}}\,}}\varvec{v}^0_T)_T + \sum _{F \in {\mathcal {F}}_h} ( \phi , [\![\varvec{v}^0_T \cdot \varvec{n} ]\!]_F ) = 0 \quad \forall ~ \varvec{v}_T^0 \in \varvec{V}_{\!h}^0. \end{aligned}$$
(28)

From the decomposition in (26) we hence have \(\varvec{u}_T^0=0\). Taking \(\varvec{v}_T^\perp := \varvec{u}_T^\perp \) in (26b) we get

$$\begin{aligned} \mu \Vert \varvec{u}_T^\perp \Vert _{1,h}^2 +\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}({{\varvec{u}}}_T^\perp ) \Vert ^2 \preceq {\hat{a}}_h(\varvec{u}_T^\perp ,\varvec{u}_T^\perp ) = {\hat{f}} (\varvec{u}_T^\perp ) \preceq \Vert \phi \Vert _{1} \Vert \varvec{u}_T^\perp \Vert _{1,h}. \end{aligned}$$

Since Lemma 1 implies that

$$\begin{aligned} \Vert \varvec{u}_T^\perp \Vert _{1,h} \le \beta ^{-1} \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T^\perp ) \Vert , \end{aligned}$$

we finally obtain

$$\begin{aligned} \Vert \varvec{u}_T^\perp \Vert _{1,h} \preceq \frac{1}{\mu + \lambda } \Vert \phi \Vert _{1} \quad {\mathop {\longrightarrow }\limits ^{\lambda \rightarrow \infty }} 0. \end{aligned}$$

\(\square \)
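The last step of the proof of Theorem 3 combines the two preceding displays. Written out, with the constants depending on \(\beta \) absorbed into \(\preceq \) (recall that \(\beta \) is independent of \(\lambda \) by Lemma 1), it reads:

$$\begin{aligned} (\mu + \lambda \beta ^2) \Vert \varvec{u}_T^\perp \Vert _{1,h}^2 \le&\; \mu \Vert \varvec{u}_T^\perp \Vert _{1,h}^2 +\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T^\perp ) \Vert ^2 \preceq \Vert \phi \Vert _{1} \Vert \varvec{u}_T^\perp \Vert _{1,h},\\ \Vert \varvec{u}_T^\perp \Vert _{1,h} \preceq&\; \frac{1}{\mu + \lambda \beta ^2} \Vert \phi \Vert _{1} \le \frac{\max \{1,\beta ^{-2}\}}{\mu + \lambda } \Vert \phi \Vert _{1}. \end{aligned}$$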

Remark 2

The splitting into a divergence-free subspace and its \(a_h\)-orthogonal complement can also be done for \(\underline{{\varvec{U}}\!}_h\). Let us relate the splitting of \(\varvec{V}_{\!h}\) to a corresponding splitting of \(\underline{{\varvec{U}}\!}_h\). First, there holds \(\underline{{\varvec{U}}\!}_h^0 = \varvec{V}_{\!h}^0 \times {\varvec{M}}_h\) and \(\underline{{\varvec{U}}\!}_h^\perp = \{ (\varvec{v}_T, \varvec{v}_F) \in \underline{{\varvec{U}}\!}_h\mid \varvec{v}_T \in \varvec{V}_{\!h}^\perp , \varvec{v}_F = {\mathcal {L}}_h(\varvec{v}_T)\}\). Second, the solution \(\underline{{\varvec{u}}}_h\) of (S1) then has the splitting \(\underline{{\varvec{u}}}_h= \underline{{\varvec{u}}}_h^0 + \underline{{\varvec{u}}}_h^\perp \) with \(\underline{{\varvec{u}}}_h^0 = (\varvec{u}_T^0,{\mathcal {L}}_h(\varvec{u}_T^0)) \in \underline{{\varvec{U}}\!}_h^0\) and \(\underline{{\varvec{u}}}_h^\perp = (\varvec{u}_T^\perp ,{\mathcal {L}}_h(\varvec{u}_T^\perp )) \in \underline{{\varvec{U}}\!}_h^\perp \). For \(\varvec{f} = \nabla \phi ,~\phi \in H^1(\Omega )\), there holds \(\underline{{\varvec{u}}}_h^0 = 0\) and \(\Vert \underline{{\varvec{u}}}_h^\perp \Vert _{1,h} = {\mathcal {O}}(\lambda ^{-1})\).

4.6 Numerical Results

The numerical results for the two examples in Sect. 2 for the scheme (S1) are given in Fig. 3 and are consistent with the results in Theorem 2 and Theorem 3.

Fig. 3

Discretization error for Example 1 (left) and norm of discrete error for Example 2 (right) for the method (S1), \(k=2\), under mesh refinement (x-axis: refinement level L) and different values of \(\lambda \) for Example 2

5 Relaxed \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-Conforming HDG Discretization

The results in Theorem 2 provide optimal error estimates for the method (S1). However, for the approximation of the displacement with a polynomial degree k it requires unknowns of degree k for the normal component of the displacement on every facet of the mesh. In view of the superconvergence property of other HDG methods [20, 67], where only unknowns of polynomial degree \(k-1\) on the facets are required to obtain an accurate polynomial approximation of order k (possibly after a local post-processing) this is sub-optimal. Here we follow [50] to slightly relax the \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conformity so that only unknowns of polynomial degree \(k-1\) are involved for normal-continuity. This allows for optimality of the method also in the sense of superconvergent HDG methods. The resulting method is still volume-locking-free. We assume the polynomial degree \(k\ge 2\) in the following discussion.

5.1 The Relaxed \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-Conforming HDG Scheme

We introduce the modified vector space

$$\begin{aligned} \varvec{V}_{\!h}^- : =&\; \{\varvec{v}_T\in \prod _{T\in {\mathcal {T}}_h}[{\mathbb {P}}^{k}(T)]^d: \;\; \Pi _F^{k-1}[\![\varvec{v}_T\cdot \varvec{n} ]\!]_F = 0,\; \,\forall F\in {\mathcal {F}}_h\}, \end{aligned}$$
(29)

where \(\Pi _F^{k-1}: L^2(F)\rightarrow {\mathbb {P}}^{k-1}(F)\) is the \(L^2(F)\)-projection:

$$\begin{aligned} \int _F(\Pi _F^{k-1}w)v\,\mathrm {ds} = \int _Fw\,v\,\mathrm {ds},\quad \forall v\in {\mathbb {P}}^{k-1}(F). \end{aligned}$$
(30)

Details of the construction of the finite element space \(\varvec{V}_{\!h}^-\) can be found in [50, Sect. 3]. Functions in \(\varvec{V}_{\!h}^-\) are only “almost normal-continuous”: their normal jumps across facets may be nonzero, but only in the highest-order (degree k) modes.
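The effect of the projection \(\Pi _F^{k-1}\) in (29) and (30) can be illustrated on a one-dimensional reference facet \([-1,1]\) (the case \(d=2\)). The following is a minimal sketch using exact rational arithmetic and a Legendre basis; the function names and the coefficient-list representation of polynomials are assumptions made for illustration and do not reflect the construction in [50].

```python
from fractions import Fraction


def legendre(n):
    """Coefficients (lowest degree first) of the Legendre polynomial P_n."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for m in range(1, n):
        # Bonnet recursion: (m+1) P_{m+1} = (2m+1) s P_m - m P_{m-1}
        nxt = [Fraction(2 * m + 1, m + 1) * c for c in [Fraction(0)] + p]
        for i, c in enumerate(p_prev):
            nxt[i] -= Fraction(m, m + 1) * c
        p_prev, p = p, nxt
    return p


def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def integrate(poly):
    """Exact integral over [-1, 1]; odd powers vanish, s^i gives 2/(i+1)."""
    return sum(c * Fraction(2, i + 1) for i, c in enumerate(poly) if i % 2 == 0)


def l2_project(w, degree):
    """L2(-1,1)-projection of the polynomial w onto degree <= `degree`."""
    result = [Fraction(0)] * (degree + 1)
    for n in range(degree + 1):
        pn = legendre(n)
        # Legendre polynomials are orthogonal with squared norm 2/(2n+1).
        coeff = Fraction(2 * n + 1, 2) * integrate(multiply(w, pn))
        for i, c in enumerate(pn):
            result[i] += coeff * c
    return result


# Projection of s^2 onto polynomials of degree <= 1 is the constant 1/3.
print(l2_project([0, 0, 1], 1))
# The degree-2 Legendre mode is invisible to the degree-1 projection.
print(l2_project(legendre(2), 1))
```

For \(k=2\), a normal jump proportional to the degree-2 Legendre polynomial therefore satisfies the constraint in (29) without being zero, which is exactly the sense in which normal-continuity is relaxed.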

We denote the compound finite element space by

$$\begin{aligned} \underline{{\varvec{U}}\!}_h^{\,-}:= \varvec{V}_{\!h}^-\times {\varvec{M}}_h. \end{aligned}$$

The relaxed \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming HDG scheme then reads: Find \(\underline{{\varvec{u}}}_h\in \underline{{\varvec{U}}\!}_h^{\,-}\) such that

$$\begin{aligned} a_h(\underline{{\varvec{u}}}_h,\underline{{\varvec{v}}}_h) = f(\underline{{\varvec{v}}}_h), \quad \forall \underline{{\varvec{v}}}_h\in \underline{{\varvec{U}}\!}_h^{\,-}. \end{aligned}$$
(S2)

Remark 3

Notice that the globally coupled degrees of freedom for the above relaxed \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming scheme are polynomials of degree \(k-1\) per facet for both the tangential and the normal component of the displacement, while those for the original \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming scheme (S1) are polynomials of degree \(k-1\) per facet for the tangential component of the displacement, and polynomials of degree k per facet for the normal component. This relaxation reduces the number of globally coupled degrees of freedom, which improves the sparsity pattern of the linear systems, cf. Table 1 below for the effect on a numerical example.
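The counting in this remark can be turned into a small calculation. The sketch below assumes \(k\ge 2\) (so that the special case \(k=1\), \(d=3\) of \({\varvec{M}}_k(F)\) does not occur), simplicial facets, and \(d-1\) tangential components per facet; the function names are hypothetical and the counts are per facet only, not totals for a given mesh.

```python
from math import comb


def dim_poly(degree, facet_dim):
    """Dimension of polynomials of total degree <= degree in facet_dim variables."""
    return comb(degree + facet_dim, facet_dim)


def coupled_dofs_per_facet(k, d, relaxed):
    """Globally coupled unknowns per facet for (S1) (relaxed=False) or (S2)."""
    if k < 2 or d not in (2, 3):
        raise ValueError("sketch assumes k >= 2 and d in {2, 3}")
    facet_dim = d - 1
    # Tangential component: d-1 components of degree k-1 in both schemes.
    tangential = (d - 1) * dim_poly(k - 1, facet_dim)
    # Normal component: degree k for (S1), degree k-1 for the relaxed scheme.
    normal = dim_poly(k - 1 if relaxed else k, facet_dim)
    return tangential + normal


for d in (2, 3):
    s1 = coupled_dofs_per_facet(2, d, relaxed=False)
    s2 = coupled_dofs_per_facet(2, d, relaxed=True)
    print(f"d={d}, k=2: (S1) {s1} vs. (S2) {s2} unknowns per facet")
```

Under these assumptions, for \(k=2\) the count drops from 5 to 4 per facet in 2D and from 12 to 9 in 3D.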

5.2 Error Estimates

The error analysis of the relaxed scheme (S2) closely follows that of the original scheme (S1) in Theorem 2.

Due to the violation of \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conformity of \(\varvec{V}_{\!h}^-\), we have a consistency term to take care of.

Lemma 2

Let \(\varvec{u}\in {\varvec{H}}^2(\Omega )\cap {\varvec{H}}_0^1(\Omega )\) be the solution to the equations (1) and define the splitting \(\varvec{f} = \varvec{f}^\mu + \varvec{f}^\lambda \) with \(\varvec{f}^\mu = - {{\,\mathrm{{\text {div}}}\,}}\left( 2\mu \nabla _s\varvec{u}\right) \) and \(\varvec{f}^\lambda = - {\nabla }\left( \lambda \,{{\,\mathrm{{\text {div}}}\,}}\varvec{u}\right) \) and \(f(\cdot ) = f^{\mu }(\cdot ) + f^{\lambda }(\cdot )\) correspondingly. Denote \(\underline{{\varvec{u}}}:=(\varvec{u}, \varvec{u}^t)\in \underline{{\varvec{U}}\!}(h)\). There holds for all \(\underline{{\varvec{v}}}=(\varvec{v}_T,\varvec{v}_F)\in \underline{{\varvec{U}}\!}_h^{\,-} + \underline{{\varvec{U}}\!}(h)\)

$$\begin{aligned} a_h^{\mu }(\underline{{\varvec{u}}}, \underline{{\varvec{v}}})&= f^\mu (\underline{{\varvec{v}}}) + {\mathcal {E}}_c^\mu (\varvec{u}, \underline{{\varvec{v}}}), \end{aligned}$$
(31a)
$$\begin{aligned} a_h^\lambda (\underline{{\varvec{u}}}, \underline{{\varvec{v}}})&= f^\lambda (\underline{{\varvec{v}}}) + {\mathcal {E}}_c^\lambda (\varvec{u}, \underline{{\varvec{v}}}), \end{aligned}$$
(31b)
$$\begin{aligned} a_h(\underline{{\varvec{u}}}, \underline{{\varvec{v}}})&= f(\underline{{\varvec{v}}}) + {\mathcal {E}}_c(\varvec{u}, \underline{{\varvec{v}}}), \end{aligned}$$
(31c)

with

$$\begin{aligned} {\mathcal {E}}_c^{\mu }(\varvec{u}, \underline{{\varvec{v}}})&= \sum _{T\in {\mathcal {T}}_h}\int _{\partial T}\left( 2\mu (\nabla _s(\varvec{u})\varvec{n})\cdot \varvec{n}\right) ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})(\varvec{v}_T\cdot \varvec{n}), \end{aligned}$$
(31d)
$$\begin{aligned} {\mathcal {E}}_c^{\lambda }(\varvec{u}, \underline{{\varvec{v}}})&= \sum _{T\in {\mathcal {T}}_h}\int _{\partial T}\left( \lambda {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\right) ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})(\varvec{v}_T\cdot \varvec{n}), \end{aligned}$$
(31e)
$$\begin{aligned} {\mathcal {E}}_c(\varvec{u}, \underline{{\varvec{v}}})&= {\mathcal {E}}_c^{\mu }(\varvec{u}, \underline{{\varvec{v}}}) + {\mathcal {E}}_c^{\lambda }(\varvec{u}, \underline{{\varvec{v}}}). \end{aligned}$$
(31f)

Moreover, for \(\varvec{u}\in {\varvec{H}}^\ell (\Omega )\), \(\ell \ge 2\) and \(1\le m\le \min (k,\ell -1)\) we have

$$\begin{aligned} {\mathcal {E}}_c^\mu (\varvec{u}, \underline{{\varvec{v}}})&\preceq h^m \mu ^{1/2}\Vert \varvec{u}\Vert _{m+1} \Vert \underline{{\varvec{v}}}\Vert _{\mu ,h}, \quad {\mathcal {E}}_c^\lambda (\varvec{u}, \underline{{\varvec{v}}}) \preceq h^m \frac{\lambda }{\mu ^{1/2}}\Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _m \Vert \underline{{\varvec{v}}}\Vert _{\mu ,h}. \end{aligned}$$
(32a)
$$\begin{aligned} {\mathcal {E}}_c(\varvec{u}, \underline{{\varvec{v}}})&\preceq h^m\left( \mu ^{1/2}\Vert \varvec{u}\Vert _{m+1}+\frac{\lambda }{\mu ^{1/2}}\Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _m\right) \Vert \underline{{\varvec{v}}}\Vert _{\mu ,h}. \end{aligned}$$
(32b)

Proof

By continuity of \(\varvec{u}\) and integration by parts, we get

$$\begin{aligned} a_h^{\mu }(\underline{{\varvec{u}}}, \underline{{\varvec{v}}})-f^{\mu }(\underline{{\varvec{v}}}) =&\; \sum _{T\in {\mathcal {T}}_h} \int _{\partial T}2\mu \,\nabla _s(\varvec{u})\varvec{n}\cdot (\varvec{v}_T-\varvec{v}_T^t)\,\mathrm {d}\varvec{s}\\ =&\; \sum _{T\in {\mathcal {T}}_h} \int _{\partial T} (2\mu (\nabla _s(\varvec{u})\varvec{n})\cdot \varvec{n})\, (\varvec{v}_T\cdot \varvec{n})\,\,\mathrm {d}\varvec{s}\\ =&\; \sum _{T\in {\mathcal {T}}_h} \int _{\partial T}(2\mu (\nabla _s(\varvec{u})\varvec{n})\cdot \varvec{n}) ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})(\varvec{v}_T\cdot \varvec{n})\,\,\mathrm {d}\varvec{s}\\ =&\; {\mathcal {E}}_c^{\mu }(\varvec{u},\underline{{\varvec{v}}}), \end{aligned}$$

where the third equality follows from the fact that \(\Pi _{F}^{k-1}[\![\varvec{v}\cdot \varvec{n} ]\!]_F = 0\) for all \(\varvec{v}\in \varvec{V}_{\!h}^-\). Analogously we obtain \(a_h^{\lambda }(\underline{{\varvec{u}}}, \underline{{\varvec{v}}})-f^{\lambda }(\underline{{\varvec{v}}}) = {\mathcal {E}}_c^{\lambda }(\varvec{u},\underline{{\varvec{v}}})\).

Applying the Cauchy-Schwarz inequality and properties of the \(L^2\)-projection, we have

$$\begin{aligned} {\mathcal {E}}_c^{\mu }(\varvec{u},\underline{{\varvec{v}}}) =&\int _{\partial T}({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})\left( 2\mu (\nabla _s(\varvec{u})\varvec{n})\cdot \varvec{n}\right) ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})(\varvec{v}_T\cdot \varvec{n})\\&\;\le \left( 2\mu \Vert ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})\nabla _s(\varvec{u})\Vert _{\partial T}\right) \Vert ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})(\varvec{v}_T\cdot \varvec{n})\Vert _{\partial T}\\&\;\preceq h^{m-1/2} \mu \Vert \nabla _s(\varvec{u})\Vert _{{\varvec{H}}^m(T)} \Vert ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})(\varvec{v}_T\cdot \varvec{n})\Vert _{\partial T}\\&\;\preceq h^m \mu \Vert \varvec{u}\Vert _{{\varvec{H}}^{m+1}(T)} \Vert ({{\,\mathrm{\varvec{{\text {id}}}}\,}}-\Pi _{\varvec{RM}})\varvec{v}_T\Vert _{\partial T} \; \preceq h^m\mu \Vert \varvec{u}\Vert _{{\varvec{H}}^{m+1}(T)} \Vert \nabla _s\varvec{v}_{T} \Vert _{T}, \end{aligned}$$

where the last inequality follows from the trace theorem and the approximation properties (15). Similarly,

$$\begin{aligned} {\mathcal {E}}_c^{\lambda }(\varvec{u},\underline{{\varvec{v}}}) =&\int _{\partial T}({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1}) \lambda {{\,\mathrm{{\text {div}}}\,}}\varvec{u} ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})(\varvec{v}_T\cdot \varvec{n})\\&\;\le \lambda \Vert ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1}){{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _{\partial T} \Vert ({{\,\mathrm{{\text {id}}}\,}}-\Pi _F^{k-1})(\varvec{v}_T\cdot \varvec{n})\Vert _{\partial T}\\&\; \preceq h^m { \lambda }\Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _{{\varvec{H}}^m(T)} \Vert \nabla _s\varvec{v}_{T} \Vert _{T}. \end{aligned}$$

Summing over all elements concludes the proof. \(\square \)

We have the following error estimates, the proof of which we only sketch with a focus on the modification needed from the proof for Theorem 2.

Theorem 4

Assume \(k\ge 2\) and the regularity \(\varvec{u}\in {\varvec{H}}^{k+1}(\Omega )\). Let \(\underline{{\varvec{u}}}_h\in \underline{{\varvec{U}}\!}_h^{\,-}\) be the numerical solution to the scheme (S2). Then, for sufficiently large stabilization parameter \(\alpha _0\), the following estimate holds

$$\begin{aligned} \Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{\mu ,h} \preceq&\; h^{k} (\mu ^{1/2} \Vert \varvec{u}\Vert _{k+1}+\frac{\lambda }{\mu ^{1/2}}\Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _k), \end{aligned}$$
(33a)
$$\begin{aligned} \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}-\varvec{u}_T)\Vert \preceq&\;(\mu /\lambda )^{1/2} h^{k} \Vert \varvec{u}\Vert _{k+1} + \left( \frac{\lambda ^{1/2}}{\mu ^{1/2}}+1\right) h^{k}\Vert {{\,\mathrm{{\text {div}}}\,}}\,\varvec{u}\Vert _{k}. \end{aligned}$$
(33b)

Moreover, under the regularity assumption (17), the following estimate holds

$$\begin{aligned} \Vert \varvec{u}-\varvec{u}_T\Vert \preceq h^{k+1} \left( \Vert \varvec{u}\Vert _{k+1}+(\frac{\lambda }{\mu }+1)\Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _{k}\right) . \end{aligned}$$
(33c)

Remark 4

(Volume-locking-free estimates) For a convex polygonal domain \(\Omega \), it is proven in [18] that

$$\begin{aligned} \mu \Vert \varvec{u}\Vert _2+\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _1\preceq \Vert \varvec{f}\Vert . \end{aligned}$$

If we have the regularity shift, for \(k\ge 2\),

$$\begin{aligned} \mu \Vert \varvec{u}\Vert _{k+1}+\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _{k}\preceq \Vert \varvec{f}\Vert _k, \end{aligned}$$

the above estimates are free of volume-locking when \(\lambda \rightarrow +\infty \).
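To make this explicit for the energy estimate, the following short derivation factors \(\mu ^{-1/2}\) out of the right-hand side of (33a). It is a sketch that takes the regularity shift stated above as an assumption.

$$\begin{aligned} \Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{\mu ,h} \preceq h^{k}\mu ^{-1/2}\left( \mu \Vert \varvec{u}\Vert _{k+1}+\lambda \Vert \mathrm {div}\,\varvec{u}\Vert _{k}\right) \preceq h^{k}\mu ^{-1/2}\Vert \varvec{f}\Vert _{k}. \end{aligned}$$

The final bound involves \(\lambda \) only through the data norm \(\Vert \varvec{f}\Vert _k\), so for a fixed load it does not deteriorate as \(\lambda \rightarrow +\infty \).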

Proof

To prove the energy estimates (33a) and (33b), we still take \(\underline{{\varvec{v}}}_h= (\Pi _V \varvec{u}, \Pi _M \varvec{u})\in \underline{{\varvec{U}}\!}_h\subset \underline{{\varvec{U}}\!}_h^{\,-}\) as in the proof of Theorem 2. By coercivity,

$$\begin{aligned}&\Vert \underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h\Vert _{\mu ,h}^2 +\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T-\varvec{v}_T)\Vert ^2\\&\preceq a_h^{\mu }(\underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h, \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h) +\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T-\varvec{v}_T)\Vert ^2\\&= a_h(\underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h, \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h) = a_h(\underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h, \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h) -{\mathcal {E}}_c(\varvec{u},\underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h)\\&= a_h^{\mu }(\underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h,\; \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h) -{\mathcal {E}}_c(\varvec{u},\underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h) \\&\preceq \left( \Vert \underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h\Vert _{\mu ,*,h} +\mu ^{1/2}h^k\Vert \varvec{u}\Vert _{k+1}+\frac{\lambda }{\mu ^{1/2}}h^k\Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _k\right) \Vert \underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h\Vert _{\mu ,h} \end{aligned}$$

This implies

$$\begin{aligned} \Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{\mu ,h} +\lambda ^{1/2}\Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T-\varvec{v}_T)\Vert \preceq h^k\left( \mu ^{1/2}\Vert \varvec{u}\Vert _{k+1}+\frac{\lambda }{\mu ^{1/2}}\Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _k\right) . \end{aligned}$$

Then, the estimates (33a) and (33b) follow from (24) and the triangle inequality.

To prove the \({\varvec{L}}^2\)-estimate, let \(\varvec{\phi }\) be the solution to the dual problem (18) with \(\varvec{\theta }= \varvec{u}- \varvec{u}_T\) and \(\underline{\varvec{\phi }}= (\varvec{\phi }, \varvec{\phi }^t) \in \underline{{\varvec{U}}\!}(h)\). By symmetry of the bilinear form \(a_h(\cdot ,\cdot )\) and Lemma 2, we have, with \(\underline{\varvec{\phi }}_h= (\Pi _V \varvec{\phi }, \Pi _M \varvec{\phi }) \in \underline{{\varvec{U}}\!}_h\)

$$\begin{aligned} \Vert \varvec{u}- \varvec{u}_T\Vert _\Omega ^2&= a_h(\underline{\varvec{\phi }}, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) - {\mathcal {E}}_c(\varvec{\phi }, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h)\\&= a_h(\underline{\varvec{\phi }}-\underline{\varvec{\phi }}_h, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) - {\mathcal {E}}_c(\varvec{\phi }, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) + \underbrace{{\mathcal {E}}_c(\varvec{u}, \underline{\varvec{\phi }}_h)}_{=0} \\&\preceq h (\mu \Vert \varvec{\phi }\Vert _2+\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{\phi }\Vert _1) (\mu ^{-1/2}\Vert \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h\Vert _{\mu ,*,h}+ \Vert (I-\Pi _Q){{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert )\\&\preceq h^{k+1} \Vert \varvec{u}- \varvec{u}_T\Vert _\Omega \left( \Vert \varvec{u}\Vert _{k+1}+(\frac{\lambda }{\mu }+1)\Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{u}\Vert _{k}\right) . \end{aligned}$$

In the last step we invoked the regularity assumption (17). This completes the proof of (33c). \(\square \)

5.3 Numerical Results for the Scheme (S2)

The numerical results for the two examples in Sect. 2 for the scheme (S2) are given in Fig. 4. We observe from Fig. 4 (left) that the errors for the scheme (S2) in Example 1 are independent of \(\lambda \) and similar to those for the scheme (S1). This is consistent with the volume-locking-free estimates in Theorem 4. However, for Example 2 the norm of the discrete solution for the scheme (S2) shows an upper bound depending on h, which indicates that the scheme is not gradient-robust. In the next subsection, we slightly modify the scheme (S2) to make it gradient-robust.

Fig. 4

Discretization error for Example 1 (left) and norm of discrete error for Example 2 (right) for the method (S2), \(k=2\), under mesh refinement (x-axis: refinement level L) and different values of \(\lambda \) for Example 2

5.4 Gradient-Robust Relaxed \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-Conforming HDG Scheme

Note that Theorem 3 does not directly translate to the relaxed \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming case only because (28), the counterpart to (4), does not hold as the facet normal jumps do not vanish. In this section we introduce the following modification of (S2) in the treatment of the right hand side that re-enables gradient-robustness: Find \(\underline{{\varvec{u}}}_h\in \underline{{\varvec{U}}\!}_h^{\,-}\) such that

$$\begin{aligned} a_h(\underline{{\varvec{u}}}_h,\underline{{\varvec{v}}}_h) = f( (\Pi _V \varvec{v}_T, 0)), \quad \forall \underline{{\varvec{v}}}_h\in \underline{{\varvec{U}}\!}_h^{\,-}. \end{aligned}$$
(S3)

Here, \(\Pi _V\) is a generalization of the BDM interpolator [15, Proposition 2.3.2] that, by means of averaging, can also be applied to functions that are only element-wise smooth, cf. the Appendix for a definition.

Remark 5

Let us note that the BDM interpolator is not mandatory here. In [50] and [51] several conditions on a suitable reconstruction operator are formulated. A much simpler version of the BDM interpolation operator is suggested that exploits the knowledge on the pre-image \(\varvec{V}_{\!h}^{\,-}\) and a proper basis for the relaxed \({\varvec{H}}({{\,\mathrm{{\text {div}}}\,}})\)-conforming finite element space. The reconstruction operation can then be realized by a simple averaging of a few unknowns which makes it computationally very cheap. In the numerical examples below we make use of this operator.

To prove gradient-robustness of the scheme (S3), we consider its equivalent DG formulation as in Sect. 4.5: Find \(\varvec{u}_T \in \varvec{V}_{\!h}^{\,-}\) such that

$$\begin{aligned} {\hat{a}}_h(\varvec{u}_T,\varvec{v}_T) = {\hat{f}}( \Pi _V \varvec{v}_T), \quad \forall \varvec{v}_T \in \varvec{V}_{\!h}^{\,-}. \end{aligned}$$
(S3-DG)

If we consider a splitting as in (25) with

$$\begin{aligned} \varvec{V}_{\!h}^{\,-,0} := \{ \varvec{v}_T \in \varvec{V}_{\!h}^{\,-} : {{\,\mathrm{{\text {div}}}\,}}\varvec{v}_T = 0, \ \forall T \in {\mathcal {T}}_h\} \end{aligned}$$
(34a)

and

$$\begin{aligned} \varvec{V}_{\!h}^{\,-,\perp } := \{ \varvec{v}_T \in \varvec{V}_{\!h}^{\,-} : {\hat{a}}_h(\varvec{v}_T, \varvec{w}_T) = 0, \forall \varvec{w}_T \in \varvec{V}_{\!h}^{\,-,0} \}, \end{aligned}$$
(34b)

we can again decompose the numerical solution \(\varvec{u}_T \in \varvec{V}_{\!h}^{\,-}\) to the scheme (S3-DG) as \(\varvec{u}_T = \varvec{u}_T^0 + \varvec{u}_T^\perp \) with \(\varvec{u}_T^0 \in \varvec{V}_{\!h}^{\,-,0},\varvec{u}_T^\perp \in \varvec{V}_{\!h}^{\,-,\perp }\) satisfying

$$\begin{aligned} {\hat{a}}_{h}(\varvec{u}_T^0, \varvec{v}_T^0)&= {\hat{f}}(\Pi _V\varvec{v}_T^0) \quad \forall ~ \varvec{v}_T^0 \in \varvec{V}_{\!h}^{\,-,0}, \end{aligned}$$
(35a)
$$\begin{aligned} {\hat{a}}_{h}(\varvec{u}_T^\perp , \varvec{v}_T^\perp )&= {\hat{f}}(\Pi _V\varvec{v}_T^\perp ) \quad \forall ~ \varvec{v}_T^\perp \in \varvec{V}_{\!h}^{\,-,\perp }. \end{aligned}$$
(35b)

Lemma 3

The scheme (S3-DG) (and hence scheme (S3)) is gradient-robust, i.e. for \(\varvec{f}=\nabla \phi \), \(\phi \in H^1(\Omega )\), the solution \(\varvec{u}_T= \varvec{u}_T^0 + \varvec{u}_T^\perp \in \varvec{V}_{\!h}^{\,-}\) has \( \varvec{u}_T^0 = \varvec{0},~ \Vert \varvec{u}_T^\perp \Vert _{1,h} = {\mathcal {O}}(\lambda ^{-1}). \)

Proof

With the application of \(\Pi _V\) on the right hand side we can re-establish a result as in (4).

There holds after partial integration

$$\begin{aligned} {\hat{f}}(\Pi _V\varvec{v}_T^0)= - \sum _{T \in {\mathcal {T}}_h} (\phi , {{\,\mathrm{{\text {div}}}\,}}\Pi _V \varvec{v}^0_T)_T + \sum _{F \in {\mathcal {F}}_h} ( \phi , [\![ \Pi _V \varvec{v}^0_T \cdot \varvec{n} ]\!]_F ) = 0~ \forall ~ \varvec{v}_T^0 \in \varvec{V}_{\!h}^{\,-,0}. \end{aligned}$$
(36)

where we used \({{\,\mathrm{{\text {div}}}\,}}\Pi _V \varvec{v}^0_T=0\) cf. [50, Lemma 4.8] and \([\![ \Pi _V \varvec{v}^0_T \cdot \varvec{n} ]\!]_F = 0\). The remainder of the proof follows from the proof of Theorem 3. \(\square \)
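The step from (36) to \(\varvec{u}_T^0 = \varvec{0}\) can be spelled out as follows. This is a sketch which assumes, as in the setting of Theorem 3, that \({\hat{a}}_h\) is coercive on \(\varvec{V}_{\!h}^{\,-}\) with respect to \(\Vert \cdot \Vert _{1,h}\); the coercivity constant \(c>0\) is notation introduced here. Testing (35a) with \(\varvec{v}_T^0 = \varvec{u}_T^0\) and inserting (36) gives

$$\begin{aligned} c\,\Vert \varvec{u}_T^0\Vert _{1,h}^2 \le {\hat{a}}_h(\varvec{u}_T^0,\varvec{u}_T^0) = {\hat{f}}(\Pi _V\varvec{u}_T^0) = 0, \end{aligned}$$

so that \(\varvec{u}_T^0 = \varvec{0}\) for every gradient force \(\varvec{f}=\nabla \phi \).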

Next, we present the improved volume-locking-free error estimates for the scheme (S3). We need the following improved version of Lemma 2.

Lemma 4

Let \(\varvec{u}\in {\varvec{H}}^2(\Omega )\cap {\varvec{H}}_0^1(\Omega )\) be the solution to the equations (1) and define the splitting \(\varvec{f} = \varvec{f}^\mu + \varvec{f}^\lambda \) with \(\varvec{f}^\mu = - {{\,\mathrm{{\text {div}}}\,}}\left( 2\mu \nabla _s\varvec{u}\right) \) and \(\varvec{f}^\lambda = - {\nabla }\left( \lambda \,{{\,\mathrm{{\text {div}}}\,}}\varvec{u}\right) \) and \(f(\cdot ) = f^{\mu }(\cdot ) + f^{\lambda }(\cdot )\) and \({\hat{f}}(\cdot ) = {\hat{f}}^{\mu }(\cdot ) + {\hat{f}}^{\lambda }(\cdot )\) correspondingly. Denote \(\underline{{\varvec{u}}}:=(\varvec{u}, \varvec{u}^t)\in \underline{{\varvec{U}}\!}(h)\). There holds for all \(\underline{{\varvec{v}}}=(\varvec{v}_T,\varvec{v}_F)\in \underline{{\varvec{U}}\!}_h^{\,-}\)

$$\begin{aligned} a_h^{\mu }(\underline{{\varvec{u}}}, \underline{{\varvec{v}}})&= {\hat{f}}^\mu (\Pi _V \varvec{v}_T) + \widetilde{{\mathcal {E}}}_c^\mu (\varvec{u}, \underline{{\varvec{v}}}), \end{aligned}$$
(37a)
$$\begin{aligned} a_h^\lambda (\underline{{\varvec{u}}}, \underline{{\varvec{v}}})&= {\hat{f}}^\lambda (\Pi _V \varvec{v}_T), \end{aligned}$$
(37b)
$$\begin{aligned} a_h(\underline{{\varvec{u}}}, \underline{{\varvec{v}}})&= {\hat{f}}(\Pi _V \varvec{v}_T) + \widetilde{{\mathcal {E}}}_c^{\mu }(\varvec{u}, \underline{{\varvec{v}}}), \end{aligned}$$
(37c)
$$\begin{aligned} \text {with } \qquad \widetilde{{\mathcal {E}}}_c^{\mu }(\varvec{u}, \underline{{\varvec{v}}})&= {\mathcal {E}}_c^{\mu }(\varvec{u}, \underline{{\varvec{v}}}) + {\hat{f}}^\mu (\varvec{v}_T - \Pi _V \varvec{v}_T). \end{aligned}$$
(37d)

Moreover, for \(\varvec{u}\in {\varvec{H}}^\ell (\Omega )\), \(\ell \ge 2\) and \(1\le m\le \min (k,\ell -1)\) we have

$$\begin{aligned} \widetilde{{\mathcal {E}}}_c^{\mu }(\varvec{u}, \underline{{\varvec{v}}})&\preceq h^m \mu ^{1/2}\Vert \varvec{u}\Vert _{m+1}\Vert \underline{{\varvec{v}}}\Vert _{\mu ,h}. \end{aligned}$$
(38)

Proof

From (31a) the result (37a) follows directly. Next, we note that \({{\,\mathrm{{\text {div}}}\,}}\Pi _V \varvec{v}_T = {{\,\mathrm{{\text {div}}}\,}}\varvec{v}_T\) for \(\varvec{v}_T \in \varvec{V}_{\!h}^-\). This can be seen from the following observation. Let \(T \in {\mathcal {T}}_h\) and \(q \in {\mathbb {P}}^{k-1}(T)\). Then, we have

$$\begin{aligned} \int _{T} {{\,\mathrm{{\text {div}}}\,}}(\Pi _V \varvec{v}_T ) q \, dx&= - \int _{T} \Pi _V \varvec{v}_T \cdot \nabla q \, dx + \int _{\partial T} \Pi _V \varvec{v}_T \cdot n \, q \, ds \\&= - \int _{T} \varvec{v}_T \cdot \nabla q \, dx + \int _{\partial T} \varvec{v}_T \cdot n \, q \, ds = \int _{T} {{\,\mathrm{{\text {div}}}\,}}(\varvec{v}_T) q \, dx \end{aligned}$$

where we exploited (41a) and (41b) of the BDM-type interpolation. As \({{\,\mathrm{{\text {div}}}\,}}(\varvec{v}_T), {{\,\mathrm{{\text {div}}}\,}}(\Pi _V \varvec{v}_T) \in {\mathbb {P}}^{k-1}(T)\) we obtain \({{\,\mathrm{{\text {div}}}\,}}(\varvec{v}_T) = {{\,\mathrm{{\text {div}}}\,}}(\Pi _V \varvec{v}_T)\) pointwise. Then, (37b) follows from partial integration:

$$\begin{aligned} {\hat{f}}^{\lambda }(\Pi _V \varvec{v}_T)&= \sum _{T\in {\mathcal {T}}_h} \int _T - \nabla ( \lambda {{\,\mathrm{{\text {div}}}\,}}\varvec{u} ) \Pi _V \varvec{v}_T \,\mathrm {d}\varvec{x}\\&= \sum _{T\in {\mathcal {T}}_h} \int _T \lambda {{\,\mathrm{{\text {div}}}\,}}\varvec{u} \underbrace{{{\,\mathrm{{\text {div}}}\,}}(\Pi _V \varvec{v}_T)}_{= {{\,\mathrm{{\text {div}}}\,}}\varvec{v}_T} \,\mathrm {d}\varvec{x}- \int _{\partial T} \lambda {{\,\mathrm{{\text {div}}}\,}}\varvec{u} \Pi _V \varvec{v}_T \cdot \varvec{n} \,\mathrm {d}\varvec{s}\\&= a_h^{\lambda }(\underline{{\varvec{u}}},\underline{{\varvec{v}}}) - \sum _{F \in {\mathcal {F}}_h \setminus \partial \Omega } \int _{F} \lambda {{\,\mathrm{{\text {div}}}\,}}\varvec{u} \underbrace{[\![\Pi _V \varvec{v}_T ]\!]_F}_{=0} \cdot \varvec{n} \,\mathrm {d}\varvec{s}= a_h^{\lambda }(\underline{{\varvec{u}}},\underline{{\varvec{v}}}). \end{aligned}$$

Next, we note that for \(T \in {\mathcal {T}}_h\) there holds with standard Bramble-Hilbert arguments (\(\varvec{v}_T \in {\varvec{H}}^1(T)\))

$$\begin{aligned} \Vert ({{\,\mathrm{\varvec{{\text {id}}}}\,}}- \Pi _V) \varvec{v}_T \Vert _T \preceq h \Vert \nabla \varvec{v}_T \Vert _T \end{aligned}$$
(39)

as constants are in the kernel of \({{\,\mathrm{\varvec{{\text {id}}}}\,}}- \Pi _V\). Let further \({\mathcal {P}}^{m-2} \varvec{f}\) be the element-wise \({\varvec{L}}^2\) projection into \([{\mathbb {P}}^{m-2}(T)]^d,~T\in {\mathcal {T}}_h\). Then, we have

$$\begin{aligned} ( \varvec{f}^{\mu },&\varvec{v}_T - \Pi _V \varvec{v}_T) = ( \varvec{f}^{\mu } - {\mathcal {P}}^{m-2} \varvec{f}^{\mu }, \varvec{v}_T - \Pi _V \varvec{v}_T) \le \Vert \varvec{f}^{\mu } - {\mathcal {P}}^{m-2} \varvec{f}^{\mu } \Vert \Vert \varvec{v}_T - \Pi _V \varvec{v}_T \Vert \\&\preceq h^{m-1} \Vert \varvec{f}^{\mu } \Vert _{m-1}~ h \Vert \varvec{v}_T \Vert _{1,h} \preceq h^{m} \mu \Vert \varvec{u} \Vert _{m+1}~ \Vert \underline{{\varvec{v}}}\Vert _{1,h} \preceq h^{m} \mu ^{\frac{1}{2}} \Vert \varvec{u} \Vert _{m+1}~ \Vert \underline{{\varvec{v}}}\Vert _{\mu ,h}. \end{aligned}$$

Here, we made use of (41b) in the last step. \(\square \)

The improved locking-free error estimates for the scheme (S3), compared with the estimates in Theorem 4 for the scheme (S2), are given below.

Theorem 5

Assume \(k\ge 2\) and the regularity \(\varvec{u}\in {\varvec{H}}^{k+1}(\Omega )\). Let \(\underline{{\varvec{u}}}_h\in \underline{{\varvec{U}}\!}_h^{\,-}\) be the numerical solution to the scheme (S3). Then, for sufficiently large stabilization parameter \(\alpha _0\), the estimates (19a)–(19c) hold.

Proof

Proceeding as in the proof of Theorem 4 (and hence using the equivalent HDG-version again) with \(\underline{{\varvec{v}}}_h= (\Pi _V \varvec{u}, \Pi _M \varvec{u})\in \underline{{\varvec{U}}\!}_h\subset \underline{{\varvec{U}}\!}_h^{\,-}\) and \(\underline{{\varvec{w}}}_h:= \underline{{\varvec{u}}}_h- \underline{{\varvec{v}}}_h\in \underline{{\varvec{U}}\!}_h^{\,-}\), we obtain

$$\begin{aligned}&\Vert \underline{{\varvec{w}}}_h\Vert _{\mu ,h}^2 +\lambda \Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{w}_T)\Vert ^2\\&\preceq a_h(\underline{{\varvec{w}}}_h, \underline{{\varvec{w}}}_h) = a_h(\underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h, \underline{{\varvec{w}}}_h) - \widetilde{{\mathcal {E}}}_c^{\mu }(\varvec{u},\underline{{\varvec{w}}}_h)\\&= a_h^{\mu }(\underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h,\; \underline{{\varvec{w}}}_h) + \underbrace{a_h^{\lambda }(\underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h,\; \underline{{\varvec{w}}}_h)}_{=0} -\widetilde{{\mathcal {E}}}_c^{\mu }(\varvec{u},\underline{{\varvec{w}}}_h) \\&\preceq \left( \Vert \underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h\Vert _{\mu ,*,h} +\mu ^{\frac{1}{2}}h^k\Vert \varvec{u}\Vert _{k+1}\right) \Vert \underline{{\varvec{w}}}_h\Vert _{\mu ,h}. \end{aligned}$$

With interpolation estimates for \(\Vert \underline{{\varvec{u}}}-\underline{{\varvec{v}}}_h\Vert _{\mu ,*,h} \) this implies

$$\begin{aligned} \Vert \underline{{\varvec{u}}}_h-\underline{{\varvec{v}}}_h\Vert _{\mu ,h} +\lambda ^{\frac{1}{2}}\Vert {{\,\mathrm{{\text {div}}}\,}}(\varvec{u}_T-\varvec{v}_T)\Vert \preceq \mu ^{\frac{1}{2}} h^k \Vert \varvec{u}\Vert _{k+1}. \end{aligned}$$

Then, the estimates (19a) and (19b) follow from triangle inequalities.

For the \({\varvec{L}}^2\)-estimate, let \(\varvec{\phi }\) be the solution to the dual problem (18) with \(\varvec{\theta }= \Pi _V (\varvec{u}- \varvec{u}_T)\) and \(\underline{\varvec{\phi }}_h\in \underline{{\varvec{U}}\!}_h\) the corresponding interpolation as before. Noting that \(\widetilde{{\mathcal {E}}}_c^{\mu }(\cdot ,\underline{{\varvec{w}}}_h)\) does not depend on \(\varvec{w}_F = \varvec{u}_F-\Pi _M \varvec{u} \), cf. Lemma 2 and Lemma 4, and \(\underline{\varvec{\phi }}= (\varvec{\phi }, \varvec{\phi }^t)\) we get for \(\Pi \underline{{\varvec{v}}}= \Pi (\varvec{v}_T, \varvec{v}_F) = (\Pi _V \varvec{v}_T, \Pi _M \varvec{v}_F)\), \(\underline{{\varvec{v}}}= (\varvec{v}_T, \varvec{v}_F) \in \underline{{\varvec{U}}\!}(h)\)

$$\begin{aligned} \Vert \Pi _V (\varvec{u}- \varvec{u}_T) \Vert _\Omega ^2&= a_h(\underline{\varvec{\phi }}, \Pi (\underline{{\varvec{u}}}- \underline{{\varvec{u}}}_h)) - \overbrace{\widetilde{{\mathcal {E}}}_c^{\mu }(\varvec{\phi }, \Pi (\underline{{\varvec{u}}}- \underline{{\varvec{u}}}_h))}^{=0}\\&= a_h(\underline{\varvec{\phi }}, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) - a_h(\underline{\varvec{\phi }}, ({{\,\mathrm{\varvec{{\text {id}}}}\,}}-\Pi )(\underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h)) \\&= a_h(\underline{\varvec{\phi }}- \underline{\varvec{\phi }}_h, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) - a_h(\underline{\varvec{\phi }}, ({{\,\mathrm{\varvec{{\text {id}}}}\,}}-\Pi )(\underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h)) + \overbrace{\widetilde{{\mathcal {E}}}^{\mu }_c(\varvec{u}, \underline{\varvec{\phi }}_h)}^{=0} \\&= a_h(\underline{\varvec{\phi }}- \underline{\varvec{\phi }}_h, \underline{{\varvec{u}}}-\underline{{\varvec{u}}}_h) - \underbrace{(\varvec{\theta },({{\,\mathrm{\varvec{{\text {id}}}}\,}}-\Pi _V)(\varvec{u}- \varvec{u}_T))}_{( \Pi _V(\varvec{u}- \varvec{u}_T),({{\,\mathrm{\varvec{{\text {id}}}}\,}}-\Pi _V)(\varvec{u}- \varvec{u}_T)) = 0} \\&\preceq h ( \mu \Vert \varvec{\phi }\Vert _2 + \lambda \Vert {{\,\mathrm{{\text {div}}}\,}}\varvec{\phi }\Vert _1) (\mu ^{-\tfrac{1}{2}} \Vert \underline{{\varvec{u}}}- \underline{{\varvec{u}}}_h\Vert _{\mu ,*,h} + \Vert ({{\,\mathrm{{\text {id}}}\,}}- \Pi _Q) {{\,\mathrm{{\text {div}}}\,}}\varvec{u} \Vert ) \\&\preceq \Vert \Pi _V (\varvec{u}- \varvec{u}_T) \Vert _\Omega \cdot \left( h \left( \mu ^{-\tfrac{1}{2}} \Vert \underline{{\varvec{u}}}- \underline{{\varvec{u}}}_h\Vert _{\mu ,*,h} + \Vert ({{\,\mathrm{{\text {id}}}\,}}- \Pi _Q) {{\,\mathrm{{\text {div}}}\,}}\varvec{u} \Vert \right) \right) \end{aligned}$$

Dividing by \(\Vert \Pi _V (\varvec{u}- \varvec{u}_T) \Vert _\Omega \) and applying the triangle inequality:

$$\begin{aligned} \Vert \varvec{u}- \varvec{u}_T \Vert _\Omega \le \Vert \Pi _V (\varvec{u}- \varvec{u}_T) \Vert _\Omega + \underbrace{\Vert ({{\,\mathrm{\varvec{{\text {id}}}}\,}}- \Pi _V) (\varvec{u}- \varvec{u}_T) \Vert _\Omega }_{\preceq h \Vert \underline{{\varvec{u}}}- \underline{{\varvec{u}}}_h\Vert _{1,h}} \end{aligned}$$

yields

$$\begin{aligned} \Vert \varvec{u}- \varvec{u}_T \Vert _\Omega \preceq h \left( \mu ^{-\tfrac{1}{2}} \Vert \underline{{\varvec{u}}}- \underline{{\varvec{u}}}_h\Vert _{\mu ,*,h} + \Vert ({{\,\mathrm{{\text {id}}}\,}}- \Pi _Q) {{\,\mathrm{{\text {div}}}\,}}\varvec{u} \Vert \right) \end{aligned}$$

and hence the claim. \(\square \)

With this result we conclude that method (S3) has quasi-optimal a-priori error bounds and is both volume-locking-free and gradient-robust.

5.5 Numerical Results for the Scheme (S3)

The numerical results for the two examples in Sect. 2 for the scheme (S3) are given in Fig. 5. As expected, the results are now essentially similar to those for the scheme (S1), but are obtained with reduced computational costs.

Fig. 5

Discretization error for Example 1 (left) and norm of discrete error for Example 2 (right) for the method (S3), \(k=2\), under mesh refinement (x-axis: refinement level L) and different values of \(\lambda \) for Example 2

6 Numerical Experiments for Linear-Thermoelastic Solids

At the moment, we are aware of three different multi-physics applications where gradient fields \(\varvec{f}=\nabla \phi \) arise naturally: thermo-elasticity [42], poro-elasticity [30, 48] and models for strained InGaAs quantum dots [62, equation (7)]. We remark that even more interesting examples would be multi-physics situations in which \(\varvec{f}\) is a load vector that depends on some other physical process and in which it is not clear a-priori whether it represents a gradient field, a divergence-free vector field or a linear combination of both, in the sense of the Helmholtz–Hodge decomposition. Such situations are well known from fluid dynamics, where the nonlinear convection term \((\varvec{u} \cdot \nabla ) \varvec{u} = \nabla \cdot (\varvec{u} \otimes \varvec{u})\) represents such a forcing [47]. Thus, due to structural similarities between fluid dynamics and Maxwell’s equations, we expect to find appropriate applications in elasticity wherever electromagnetic forces interact with mechanical forces, i.e., where \(\varvec{f}\) is given by the Maxwell stress tensor. This is ongoing work.

In the following, we will numerically investigate an application coming from a multi-physics context, where complicated gradient forces arise. Thus, we consider linear-thermoelastic solids, where the constitutive equation for the stress tensor reads as

$$\begin{aligned} \sigma =\varvec{C}\left\{ \varepsilon -\varepsilon ^{\mathrm {th}}\right\} \quad \text {with} \quad \varepsilon ^{\mathrm {th}}=\alpha (\theta -\theta _{0}) \varvec{I}, \end{aligned}$$

where \(\varvec{C}\) and \(\varepsilon = \varepsilon (\varvec{u}) = \nabla _s\varvec{u}\) denote the elasticity tensor and the linearized strain tensor. Further, \(\alpha \) denotes the thermal expansion coefficient, \(\theta \) is the temperature field, and \(\theta _{0}\) denotes a constant reference temperature. For isotropic materials, this reduces to

$$\begin{aligned} \sigma ^{\mathrm {el}}&=\varvec{C} \varepsilon =2\mu \varepsilon +\lambda \mathrm {tr}(\varepsilon ) \varvec{I} \\ \sigma ^{\mathrm {th}}&=\varvec{C} \varepsilon ^{\mathrm {th}}=(2\mu +3\lambda )\alpha (\theta -\theta _0 ) \varvec{I} \\ \sigma =\sigma ^{\mathrm {el}} - \sigma ^{\mathrm {th}}&= 2\mu \varepsilon +\lambda \mathrm {tr}(\varepsilon ) \varvec{I}-(2\mu +3\lambda )\alpha (\theta -\theta _0) \varvec{I} \end{aligned}$$

with Lamé coefficients \(\mu \), \(\lambda \), see [42, pp. 528–529]. Thus, we finally obtain a momentum balance

$$\begin{aligned} - {{\,\mathrm{{\text {div}}}\,}}\left( 2\mu \nabla _s\varvec{u}\right) - {\nabla }\left( \lambda \,{{\,\mathrm{{\text {div}}}\,}}\varvec{u}\right) = - (2\mu +3\lambda )\alpha {{\,\mathrm{{\text {div}}}\,}}\left( \theta \varvec{I} \right) = - (2\mu +3\lambda )\alpha \nabla \theta . \end{aligned}$$
(40)

where \(-(2\mu +3\lambda )\alpha \theta \) denotes the potential of a gradient force. For complicated and large temperature profiles \(\theta \) this gradient force can be arbitrarily complicated.
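The isotropic reduction of the thermal stress can be checked with a few lines of code. The sketch below applies \(\varvec{C}\varepsilon = 2\mu \varepsilon + \lambda \mathrm {tr}(\varepsilon )\varvec{I}\) to the thermal strain in three dimensions and compares the result with \((2\mu +3\lambda )\alpha (\theta -\theta _0)\varvec{I}\). The function names and the sample values are illustrative assumptions.

```python
import math

def isotropic_stress(eps, mu, lam):
    """Apply the isotropic elasticity tensor: C eps = 2 mu eps + lam tr(eps) I."""
    d = len(eps)
    trace = sum(eps[i][i] for i in range(d))
    return [[2.0 * mu * eps[i][j] + (lam * trace if i == j else 0.0)
             for j in range(d)] for i in range(d)]

def thermal_strain(alpha, theta, theta0, dim=3):
    """Thermal strain eps_th = alpha (theta - theta0) I."""
    return [[alpha * (theta - theta0) if i == j else 0.0
             for j in range(dim)] for i in range(dim)]

# Illustrative values in SI units (Pa, 1/K, K).
mu, lam = 1.667e7, 8.332e10
alpha, theta, theta0 = 8e-5, 14.0, 0.0

sigma_th = isotropic_stress(thermal_strain(alpha, theta, theta0), mu, lam)
expected = (2.0 * mu + 3.0 * lam) * alpha * (theta - theta0)
```

The factor 3 in front of \(\lambda \) comes from \(\mathrm {tr}(\varvec{I})=3\), so this check is specific to the three-dimensional strain tensor.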

We consider two weakly-coupled linear thermoelasticity problems in both two dimensions (2D) and three dimensions (3D), where the source term in the momentum equation (1a) is given according to (40) by

$$\begin{aligned} \varvec{f} = - (2\mu +3\lambda )\alpha \nabla \theta . \end{aligned}$$

The computational domain is a square with length L in 2D and a cube with length L in 3D. We take the length \(L=0.1\, \mathsf {[m]}\). The temperature field is obtained as the solution of the steady-state heat equation:

$$\begin{aligned} -\nabla \cdot \gamma \nabla \theta = f, \end{aligned}$$

where \(\gamma = 0.2\, \mathsf {[W/(m K)]}\) is the thermal conductivity coefficient, and \(f = 4\times 10^3\exp (-40r^2)\, [\mathsf {W/m}^\mathsf {3}]\) is the heat source, with \(r^2=r(x,y)^2=(x-0.5L)^2+(y-0.5L)^2\) in 2D and \(r^2=r(x,y,z)^2=(x-0.5L)^2+(y-0.5L)^2+(z-0.5L)^2\) in 3D. With such a heat source, the temperature field attains its maximum value at the center of the domain with \(\theta _{\max }=\theta (0.5L,0.5L)\approx 14 \mathsf {[K]}\) in 2D and \(\theta _{\max } = \theta (0.5L,0.5L,0.5L)\approx 10 \mathsf {[K]}\) in 3D. For the other material parameters, we use a nearly incompressible hard rubber material with Young’s modulus \(E=5\times 10^7 \mathsf {[Pa]}\), Poisson ratio \(\nu =0.4999\), and thermal expansion coefficient \(\alpha = 8\times 10^{-5} \mathsf {[1/K]}\). Hence, the Lamé parameters are \(\mu = \frac{E}{2(1+\nu )}\approx 1.667\times 10^{7}\, \mathsf {[Pa]}, \lambda = \frac{E\nu }{(1-2\nu )(1+\nu )}\approx 8.332\times 10^{10}\, \mathsf {[Pa]}\). Homogeneous Dirichlet boundary conditions are imposed for both displacement and temperature. Let us note that \(\theta \in H^1(\Omega )\) so that \(\varvec{f} \in {\varvec{L}}^2(\Omega )\).
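The parameter values quoted above can be cross-checked with a short script. The sketch below converts \(E\) and \(\nu \) to the Lamé parameters and approximates the 2D temperature field with a five-point finite-difference stencil and Gauss-Seidel sweeps. This simple solver is an assumption chosen to keep the example self-contained; it is not the quadratic finite element discretization used for the experiments. The sketch also assumes that \(r\) is measured in meters, and the function names and grid size are illustrative.

```python
import math

# Data quoted in the text (SI units).
E, nu = 5e7, 0.4999        # Young's modulus [Pa], Poisson ratio
L, gamma = 0.1, 0.2        # domain length [m], conductivity [W/(m K)]

def lame_parameters(E, nu):
    """Convert Young's modulus and Poisson ratio to the Lame pair (mu, lambda)."""
    mu = E / (2.0 * (1.0 + nu))
    lam = E * nu / ((1.0 - 2.0 * nu) * (1.0 + nu))
    return mu, lam

def heat_source(x, y):
    """Heat source f = 4e3 * exp(-40 r^2) with r measured from the center."""
    r2 = (x - 0.5 * L) ** 2 + (y - 0.5 * L) ** 2
    return 4e3 * math.exp(-40.0 * r2)

def solve_heat_2d(n=20, sweeps=2000):
    """Gauss-Seidel for -gamma * Laplace(theta) = f on (0, L)^2, theta = 0 on the boundary."""
    h = L / n
    # Scaled right-hand side h^2 f / gamma at every grid point.
    rhs = [[h * h * heat_source(i * h, j * h) / gamma for j in range(n + 1)]
           for i in range(n + 1)]
    theta = [[0.0] * (n + 1) for _ in range(n + 1)]
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                theta[i][j] = 0.25 * (theta[i - 1][j] + theta[i + 1][j]
                                      + theta[i][j - 1] + theta[i][j + 1]
                                      + rhs[i][j])
    return theta

mu, lam = lame_parameters(E, nu)
theta = solve_heat_2d()
theta_max = max(max(row) for row in theta)
```

On this coarse grid the computed maximum should lie close to the reported value of about 14 K, and the Lamé parameters should agree with the rounded values given in the text.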

The finite element library NGSolve [68] is used for all the simulations. We compare five volume-locking-free methods, namely the mixed method (M2), the \({\varvec{H}}(\mathrm {div})\)-conforming HDG (div-HDG) scheme (S1), the relaxed \({\varvec{H}}(\mathrm {div})\)-conforming HDG (r.div-HDG) scheme (S2), its gradient-robust modification (S3), and the \({\varvec{H}}(\mathrm {div})\)-conforming discontinuous Galerkin (div-DG) method used in [43, 48]. Among these five methods, only the schemes (S1), (S3) and the div-DG scheme are gradient-robust.

For the three-dimensional problem, we run the simulation only on the subdomain \(\Omega = (0,0.5L)\times (0,0.5L)\times (0,0.5L)\) and impose symmetry boundary conditions on the top, front, and right faces.
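The 2D peak temperature quoted in the problem setup can be cross-checked independently of the finite element code. The sketch below is not the quadratic conforming finite element discretization used in the simulations; it swaps in a standard 5-point finite-difference stencil with Gauss-Seidel sweeps, and the grid resolution `n` and the number of sweeps are assumptions made here. Under these assumptions the center value should land in the vicinity of the reported \(\theta _{\max }\approx 14\, \mathsf {[K]}\).

```python
import math

# Finite-difference cross-check (NOT the paper's finite element method):
# solve -gamma * Laplace(theta) = f on (0, L)^2 with theta = 0 on the boundary.
L = 0.1        # side length of the square [m]
gamma = 0.2    # thermal conductivity [W/(m K)]
n = 32         # grid intervals per direction (assumed resolution)
h = L / n

def heat_source(x, y):
    """Heat source f = 4e3 * exp(-40 r^2) in W/m^3, r measured from the center."""
    r2 = (x - 0.5 * L) ** 2 + (y - 0.5 * L) ** 2
    return 4.0e3 * math.exp(-40.0 * r2)

# Scaled right-hand side h^2 * f / gamma at every grid node.
rhs = [[h * h * heat_source(i * h, j * h) / gamma for j in range(n + 1)]
       for i in range(n + 1)]

# Boundary entries stay zero (homogeneous Dirichlet condition).
theta = [[0.0] * (n + 1) for _ in range(n + 1)]

# Gauss-Seidel sweeps for the 5-point stencil
#   4*theta[i][j] - (sum of the four neighbours) = h^2 * f / gamma.
for _ in range(2000):
    for i in range(1, n):
        for j in range(1, n):
            theta[i][j] = 0.25 * (theta[i - 1][j] + theta[i + 1][j]
                                  + theta[i][j - 1] + theta[i][j + 1]
                                  + rhs[i][j])

theta_center = theta[n // 2][n // 2]
print(f"theta at the center: {theta_center:.2f} K")
```

Since the heat source varies only mildly over the small domain, the result is close to that of a constant source \(4\times 10^3\, [\mathsf {W/m}^\mathsf {3}]\), which gives a quick plausibility bound on the temperature scale that drives the gradient force.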

Table 1 Measures for the costs of solving linear systems for different methods in 2D and 3D. Polynomial degree \(k=2\)
Fig. 6 Displacement magnitude for the 2D problem. Top left: mixed method (M2), top right: div-DG scheme [43, 48]. Bottom left: r.div-HDG scheme (S2), bottom right: r.div-HDG scheme (S3). Polynomial degree \(k=2\). Mesh size \(h=0.1L\) (200 triangles)

Fig. 7 Displacement magnitude for the 3D problem. Top left: mixed method (M2), top right: div-DG scheme [43, 48]. Bottom left: r.div-HDG scheme (S2), bottom right: r.div-HDG scheme (S3). Polynomial degree \(k=2\). Mesh size \(h=0.05L\) (6000 tetrahedra)

We use polynomials of degree \(k=2\) throughout, and compute the temperature field using quadratic conforming finite elements on the same mesh for all the methods. We compare the computational effort of the methods by the following measures. We count the number of unknowns that appear in each method (dofs). For the HDG schemes, we apply static condensation prior to solving the linear systems, i.e., we eliminate all unknowns that have only element-local couplings. The remaining unknowns are denoted as the globally coupled degrees of freedom (gdofs). We measure the sparsity of the resulting linear systems by the number of non-zero entries in the matrix (nze). Finally, we use a direct factorization method to prepare the solution of the linear systems and measure the factorization time on a shared-memory machine with 16 cores (f.time). These measurements are identical for the two r.div-HDG schemes (S2) and (S3), hence the results for scheme (S2) are omitted and four methods remain. As examples we consider a \(100\times 100\times 2\) structured triangular mesh in 2D (\(h=0.01L\)) and a \(10\times 10\times 10\times 6\) structured tetrahedral mesh in 3D (\(h=0.05L\)). The results are shown in Table 1. From this table we observe that in 2D the matrix factorization for the r.div-HDG scheme (S3) is the cheapest among the four methods: it is about 1.4 times faster than for the schemes (M2) and (S1), and about 5.4 times faster than for the div-DG scheme. In 3D, the scheme (S3) is about two times slower than the scheme (M2), two times faster than the scheme (S1), and 8 times faster than the div-DG scheme.
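The static condensation step mentioned above is, algebraically, a Schur complement: unknowns with purely element-local couplings are eliminated, and only a smaller system for the globally coupled unknowns is factorized. The following toy sketch illustrates the idea on a \(3\times 3\) symmetric positive definite system with one local and two global unknowns. The matrix, the right-hand side, and all names are hypothetical and are not taken from the HDG schemes or from NGSolve.

```python
# Toy static condensation (Schur complement) on a 3x3 SPD system K u = f.
# Unknown 0 plays the role of an element-local dof; unknowns 1 and 2 are
# the globally coupled dofs. All numbers are hypothetical.
K = [[4.0, 1.0, 2.0],
     [1.0, 3.0, 0.0],
     [2.0, 0.0, 5.0]]
f = [1.0, 2.0, 3.0]

K_ll = K[0][0]                       # local-local block (1x1 here)
K_lg = K[0][1:]                      # local-global coupling
K_gl = [K[1][0], K[2][0]]            # global-local coupling
K_gg = [row[1:] for row in K[1:]]    # global-global block

# Condensed system S u_g = g with S = K_gg - K_gl K_ll^{-1} K_lg
# and g = f_g - K_gl K_ll^{-1} f_l.
S = [[K_gg[i][j] - K_gl[i] * K_lg[j] / K_ll for j in range(2)]
     for i in range(2)]
g = [f[1 + i] - K_gl[i] * f[0] / K_ll for i in range(2)]

# Solve the 2x2 condensed system by Cramer's rule.
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
u_g = [(g[0] * S[1][1] - S[0][1] * g[1]) / det,
       (S[0][0] * g[1] - S[1][0] * g[0]) / det]

# Recover the eliminated local unknown by back-substitution.
u_l = (f[0] - sum(K_lg[j] * u_g[j] for j in range(2))) / K_ll

# Check the residual of the full, uncondensed system.
u = [u_l] + u_g
residual = [sum(K[i][j] * u[j] for j in range(3)) - f[i] for i in range(3)]
print("u =", u)
print("max |residual| =", max(abs(r) for r in residual))
```

In an actual HDG code the local blocks are inverted element by element, so only the condensed matrix contributes to the gdofs, nze, and f.time measures.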

We remark that for large 3D simulations, one quickly runs into memory issues when direct methods are used. The development of efficient iterative methods for the proposed HDG schemes is the subject of our ongoing work.

Finally, we plot the displacement magnitude for the four methods (M2), (S2), (S3), and the div-DG scheme in 2D on a \(10\times 10\times 2\) structured triangular mesh in Fig. 6, and in 3D on a \(10\times 10\times 10\times 6\) structured tetrahedral mesh in Fig. 7. The results for the (gradient-robust) div-HDG scheme (S1) are very similar to those for the (gradient-robust) r.div-HDG scheme (S3), and are therefore omitted. In both figures, we clearly observe that the (non-gradient-robust) mixed method (M2) and r.div-HDG scheme (S2) fail to produce accurate displacement approximations, whereas the results for the (gradient-robust) div-DG scheme and r.div-HDG scheme (S3) are consistent with each other.

7 Conclusion

This paper introduces the concept of gradient-robustness for numerical methods for linear elasticity. The class of divergence-conforming HDG methods is presented and analyzed as an example of volume-locking-free and gradient-robust finite element methods for linear elasticity. Two efficient variants of a divergence-conforming HDG scheme with a reduced number of globally coupled degrees of freedom are also discussed and analyzed. On a linear thermoelasticity example we demonstrate the importance of gradient-robustness and the computational efficiency of the proposed relaxed \({\varvec{H}}({\text {div}})\)-conforming HDG method.