1 Introduction

In this paper, we consider the model diffusion problem with a discontinuous coefficient

$$\begin{aligned} \left\{ \begin{array}{r l} -\nabla \cdot (a\nabla u)=f,&{} \mathrm {in}~\varOmega ,\\ u=0, &{} \mathrm {on}~\partial \varOmega , \end{array}\right. \end{aligned}$$
(1.1)

where \(\varOmega \subset \mathbb {R}^d~(d=2,3)\) is a polygonal (polyhedral) domain (open, bounded, and connected set), \(a\in L^\infty (\varOmega )\) is a scalar diffusion coefficient (\(a>0\)), and \(f\in L^2(\varOmega )\) is a source term. The purpose of this paper is to derive a unified framework for a posteriori error estimates for problem (1.1) discretized by different numerical methods on anisotropic meshes. In particular, the framework applies to vertex-centered finite volume, cell-centered finite volume, finite difference and piecewise linear finite element methods.

There is a well-developed literature on a posteriori error estimation for isotropic finite element meshes. We refer to the overview textbooks by Verfürth [33], Ainsworth and Oden [4], and Repin [31]. Moreover, under the so-called monotonicity assumption on the discontinuous coefficient \(a\), robust estimates with respect to \(a\) have been derived by Dörfler and Wilderotter [16], Bernardi and Verfürth [9], Petzoldt [28], Ainsworth [3], Chen and Dai [15], and Cai and Zhang [11]. These discussions of the robustness of error estimates are all carried out on isotropic meshes. For a summary of these works, see [37]. For anisotropic meshes, the theory of error estimation, which is based on the introduction of an alignment measure, is much less developed but has attracted some attention. Several types of a posteriori error estimators have already been generalized to anisotropic meshes by Kunert et al., such as the residual error estimate, the Dirichlet local problem error estimate and the Zienkiewicz-Zhu error estimate, see, e.g., [22–26] and the citations therein. In addition, Grosman has also studied a posteriori error estimates on anisotropic meshes, including a hierarchical error estimator [20] and the equilibrated residual method [21]. Let us finally mention the approach by Picasso [29], who derived an anisotropic error estimator that depends on \(\nabla (u-u_h)\), where \(\nabla u\) is replaced in practice by a recovered gradient \(\nabla ^{\mathrm R}u\) when applied to anisotropic adaptive refinement. It has been noted by Apel et al. [6] that the alignment measure can be controlled to drive anisotropic adaptive refinement in the same way. However, the robustness of error estimates with respect to the discontinuous coefficient \(a\) on anisotropic meshes has not been addressed.

For the finite volume method on isotropic meshes, the literature about a posteriori error estimation is less voluminous, cf. e.g. [1, 27]. Recently, Afif et al. [2] have developed a posteriori error estimates for the vertex-centered finite volume method on anisotropic meshes for the singularly perturbed reaction-diffusion problem, where a residual type error estimator is proposed and analysed.

A new family of error estimates was recently established based on \({\mathbf {H}}(\mathrm {div})\)-conforming flux reconstruction for various numerical methods by Vohralík et al., cf. [37] and the citations therein. These estimates are explicitly and easily computable, and moreover completely robust with respect to the discontinuous coefficient \(a\) in problem (1.1). The ideas go back to the Prager-Synge equality [30]. However, the estimates of Vohralík et al. cannot be applied to anisotropic meshes due to a (potentially unbounded) factor appearing in the lower bound. This factor is \({\mathcal {O}}(1)\) on isotropic meshes, but it can be as large as the maximum aspect ratio on anisotropic meshes. Furthermore, we show here that the estimates developed by Vohralík [37] fail on anisotropic meshes, see the numerical experiments in Sect. 6.1.1.

In this paper, we propose a modification of the estimates of Vohralík in [37], which leads to error estimators that are robust with respect to the discontinuous coefficient \(a\) on anisotropic meshes. Our error estimators are related to the anisotropy of the meshes, so the \({\mathbf {H}}(\mathrm {div})\)-conforming flux reconstruction must be able to reflect this point. Up to now, no comparable result is known to the authors. The upper error bound corresponding to the modification contains an alignment measure, which is in accordance with the results of Kunert [22]. We are able to achieve reliable and efficient a posteriori error estimates on anisotropic meshes in two different ways (using the energy norm and the dual norm). As stated in [37], the energy norm error estimates require harmonic averaging to be used in the scheme definition, while simultaneously aligning the discontinuities of the diffusion coefficient \(a\) with a dual mesh formed around vertices. This is based on the observation of [18] that harmonic weighting can ensure robustness in a posteriori error estimates. The dual norm error estimates apply to any method mentioned in this paper and require neither alignment of the discontinuities nor the use of particular averages, but we need to introduce a (not local or locally computable) dual norm of the residual. Such an approach has been pursued by Angermann [5], Verfürth [34], Chaillou and Suri [12, 13], or El Alaoui et al. [17] for other problems.

In addition, compared with the residual error estimator [23] and Zienkiewicz-Zhu estimator [25] on anisotropic meshes, our estimators keep the same advantages as the isotropic versions derived in [37]. For the details, see [37, Remarks 4.10 and 4.12].

The rest of this paper is organised as follows. We specify our notation and give some preliminary results in Sect. 2. In Sect. 3, we give a posteriori error estimates with the mesh anisotropy under an abstract framework both in the energy and dual norms. Then we discuss two different ways of constructing an equilibrated flux in the lowest-order Raviart–Thomas space on the dual mesh associated with the original anisotropic simplicial mesh. The proofs of the (local) efficiency and robustness are given in Sect. 4. Sections 3 and 4 present the a posteriori error estimates in a quite general setting of conforming approximations without requiring any particular numerical scheme. In Sect. 5, we briefly discuss the application of the previous results to different numerical methods. A series of numerical experiments are presented in Sect. 6. Finally, some conclusions are given in Sect. 7.

2 Preliminaries

In this section, we describe the meshes, fix the notation, state the anisotropic mesh requirements, recall some important equivalence relations on anisotropic meshes, and describe the continuous problem we shall work with.

2.1 Notation

Let \(\{{\mathcal {T}}_h\}\) be a set of triangulations which for all \(h>0\) consist of closed simplices (triangles for \(d=2\) and tetrahedra for \(d=3\)) such that \(\overline{\varOmega }=\bigcup _{K\in {\mathcal {T}}_h}K\), and which are conforming, i.e., if \(K,L\in {\mathcal {T}}_h,~K\ne L\), then \(K\cap L\) is either an empty set or a common vertex, edge, or face of \(K\) and \(L\). We denote by \({\mathcal {E}}_h\) the set of all sides of \({\mathcal {T}}_h\), by \({\mathcal {E}}_h^{\mathrm {int}}\) the set of interior sides, by \({\mathcal {E}}_h^{\mathrm {ext}}\) the set of boundary sides, and by \({\mathcal {E}}_K\) the set of all the sides of an element \(K\in {\mathcal {T}}_h\). We also denote by \({\mathcal {V}}_h\) the set of all vertices of \({\mathcal {T}}_h\). Define for \(V\in {\mathcal {V}}_h\), \({\mathcal {T}}_V:=\{L\in {\mathcal {T}}_h;L\cap V\ne \emptyset \}\).

We shall next work with the dual partitions \({\mathcal {D}}_h\) of \(\varOmega \) such that \(\overline{\varOmega }=\bigcup _{D\in {\mathcal {D}}_h}D\) and such that each \(V\in {\mathcal {V}}_h\) is in exactly one dual volume \(D_V\in {\mathcal {D}}_h\). The notation \(V_D\) stands inversely for the vertex associated with a given \(D\in {\mathcal {D}}_h\). For \(d=2\), the dual volume \(D_V\) associated with \(V\) is the polygon which is obtained by connecting, for all triangles \(K\in {\mathcal {T}}_V\), the barycentre of \(K\) with the midpoints of its edges. An example of such a dual volume is shown in Fig. 1. For \(d=3\), the dual volume \(D_V\) is obtained by connecting the barycentre of each tetrahedron \(K\in {\mathcal {T}}_V\) with the barycentres of its faces and edges. We denote by \({\mathcal {D}}_h^{\mathrm {int}}~({\mathcal {D}}_h^{\mathrm {ext}})\) the set of all the dual volumes associated with the interior vertices (boundary vertices).

Fig. 1. Original simplicial mesh \({\mathcal {T}}_h\), the associated dual mesh \({\mathcal {D}}_h\), and the fine simplicial mesh \({\mathcal {S}}_h\)

In addition, we need to introduce a second simplicial triangulation \({\mathcal {S}}_h\) of \(\varOmega \) such that \({\mathcal {S}}_h:=\bigcup _{D\in {\mathcal {D}}_h}{\mathcal {S}}_D\), where the local triangulation \({\mathcal {S}}_D\) of \(D\in {\mathcal {D}}_h\) is given by connecting the associated vertex \(V_D\) with all the vertices of the dual volume \(D\), see Fig. 1 for \(d=2\). We will use the notation \({\mathcal {G}}_h\) for all sides of \({\mathcal {S}}_h\) and \({\mathcal {G}}_h^{\mathrm {int}}~({\mathcal {G}}_h^{\mathrm {ext}})\) for all interior (boundary) sides of \({\mathcal {S}}_h\). The notation \({\mathcal {G}}_D^{\mathrm {int}}\) stands for all interior sides of \({\mathcal {S}}_D\), \({\mathcal {G}}_D^{\mathrm {ext}}\) for all boundary sides of \({\mathcal {S}}_D\), and \({\mathcal {G}}_D\) for \({\mathcal {G}}_D^{\mathrm {int}}\cup {\mathcal {G}}_D^{\mathrm {ext}}\).

Next, for \(K\in {\mathcal {T}}_h\), \({\mathbf {n}}_K\) (\({\mathbf {n}}_{\sigma ,K}\)) always denotes its exterior unit normal vector (with respect to the side \(\sigma \in {\mathcal {E}}_K\)); we shall also use the notation \({\mathbf {n}}_\sigma \) for a unit normal vector of a side \(\sigma \in {\mathcal {E}}_h\), whose orientation is chosen arbitrarily but fixed for interior sides and coinciding with the exterior normal of \(\varOmega \) for boundary sides. For a function \(\varphi \) and a side \(\sigma _{K,L}\in {\mathcal {E}}_h^{\mathrm {int}}\) shared by \(K,L\in {\mathcal {T}}_h\) such that \({\mathbf {n}}_\sigma \) points from \(K\) to \(L\), we shall use the abbreviated notation \(\sigma :=\sigma _{K,L}\) and define the jump operator \([\![\cdot ]\!]_\sigma \) through the side \(\sigma \) by

$$\begin{aligned}{}[\![\varphi ]\!]_\sigma :=(\varphi |_K)|_\sigma -(\varphi |_L)|_\sigma . \end{aligned}$$
(2.1)

For any \(\sigma \in {\mathcal {E}}_h^{\mathrm {ext}}\), set \([\![\varphi ]\!]_\sigma :=\varphi |_\sigma \). We associate with each \(K\in {\mathcal {T}}_h\) and each \(\sigma \in {\mathcal {E}}_K\) a weight \(\omega _{K,\sigma }\) such that

$$\begin{aligned} \left\{ \begin{array}{l l} 0\le \omega _{K,\sigma }\le 1, &{} \forall K\in {\mathcal {T}}_h,~\forall \sigma \in {\mathcal {E}}_K,\\ \omega _{K,\sigma }+\omega _{L,\sigma }=1, &{} \forall \sigma =\sigma _{K,L}\in {\mathcal {E}}_h^{\mathrm {int}},\\ \omega _{K,\sigma }=1, &{} \forall \sigma \in {\mathcal {E}}_h^{\mathrm {ext}}~\mathrm {and}~\sigma \in {\mathcal {E}}_K,~K\in {\mathcal {T}}_h. \end{array}\right. \end{aligned}$$
(2.2)

For \(\sigma =\sigma _{K,L}\in {\mathcal {E}}_h^{\mathrm {int}}\), the weighted average operator \(\{\!\{\cdot \}\!\}_\omega \) is defined by

$$\begin{aligned} \{\!\{\varphi \}\!\}_\omega :=\omega _{K,\sigma }(\varphi |_K)|_\sigma +\omega _{L,\sigma }(\varphi |_L)|_\sigma , \end{aligned}$$
(2.3)

whereas for \(\sigma \in {\mathcal {E}}_h^{\mathrm {ext}},~\{\!\{\varphi \}\!\}_\omega :=\varphi |_\sigma \). Recall that we have denoted by \(a\) the diffusion coefficient. Two basic choices for the weights in \(\{\!\{a\}\!\}_\omega \) on a side \(\sigma =\sigma _{K,L}\in {\mathcal {E}}_h^{\mathrm {int}}\) are

$$\begin{aligned} \omega _{K,\sigma }=\omega _{L,\sigma }=\frac{1}{2}, \end{aligned}$$
(2.4)

which corresponds to the arithmetic averaging, and

$$\begin{aligned} \omega _{K,\sigma }=\frac{a_L}{a_K+a_L},~~\omega _{L,\sigma } =\frac{a_K}{a_K+a_L}, \end{aligned}$$
(2.5)

which corresponds to the harmonic averaging. In contrast to the arithmetic averaging (2.4), the harmonic averaging (2.5) can lead to robust a posteriori error estimates for the error measured in the energy norm, which we will show below. We use the abbreviated notation \(\{\!\{\varphi \}\!\}=\{\!\{\varphi \}\!\}_\omega \) when \(\omega _{K,\sigma }=\omega _{L,\sigma }=\frac{1}{2}\) and the same type of notation also for the meshes \({\mathcal {D}}_h\) and \({\mathcal {S}}_h\).
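The two weight choices (2.4) and (2.5) can be compared on a single interior side. The following is a minimal sketch; the function names and the sample coefficients are illustrative assumptions, not part of the paper. It shows that, with the weights (2.5), the weighted average (2.3) of \(a\) itself equals the harmonic mean \(2a_Ka_L/(a_K+a_L)\), which stays close to the smaller coefficient when the jump is large.

```python
def arithmetic_weights():
    """Weights (2.4): both neighbouring elements contribute equally."""
    return 0.5, 0.5


def harmonic_weights(a_K, a_L):
    """Weights (2.5): each element is weighted by the other element's coefficient."""
    total = a_K + a_L
    return a_L / total, a_K / total


def weighted_average(phi_K, phi_L, w_K, w_L):
    """Weighted average (2.3) of the two one-sided traces on an interior side."""
    return w_K * phi_K + w_L * phi_L


# Hypothetical coefficients with a large jump across the side.
a_K, a_L = 1.0, 1000.0
w_K, w_L = harmonic_weights(a_K, a_L)
avg_harmonic = weighted_average(a_K, a_L, w_K, w_L)
avg_arithmetic = weighted_average(a_K, a_L, *arithmetic_weights())
```

Both choices satisfy the constraints (2.2); only the harmonic one keeps the averaged coefficient bounded by twice the smaller of \(a_K\) and \(a_L\).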

We denote by \((\cdot ,\cdot )_S\) the \(L^2\)-scalar product on \(S\) and by \(\Vert \cdot \Vert _S\) the associated norm where \(S\subset \varOmega \); when \(S=\varOmega \), the subscript will be omitted. Let \(|S|\) stand for the Lebesgue measure of \(S\), \(|\sigma |\) the \((d-1)\)-dimensional Lebesgue measure of \(\sigma \subset \mathbb R^{d-1}\), and in particular, \(|s|\) the length of a segment \(s\). We denote by \(H^1(S)\) the Sobolev space of functions with square-integrable weak derivatives and by \(H_0^1(S)\) its subspace of functions with vanishing traces on \(\partial S\). In addition, \({\mathbf {H}}(\mathrm {div},S)\) is the space of vector functions with square-integrable weak divergences, i.e., \({\mathbf {H}}(\mathrm {div},S)=\{{\mathbf {q}}\in {\mathbf {L}}^2(S);\nabla \cdot {\mathbf {q}}\in L^2(S)\}\), and \(\langle \cdot ,\cdot \rangle _{\partial S}\) stands for the appropriate duality pairing on \(\partial S\).

Let \(X_h^0\) be a subspace of \(H_0^1(\varOmega )\) which consists of continuous piecewise affine functions on the primal simplicial mesh \({\mathcal {T}}_h\) with vanishing value on the boundary of \(\varOmega \), i.e., \(X_h^0:=\{v_h\in H_0^1(\varOmega );v_h|_K\in \mathbb P_1(K),K\in {\mathcal {T}}_h\}\), where \(\mathbb P_1(K)\) is the space of linear polynomials on \(K\in {\mathcal {T}}_h\).

2.2 Anisotropic Meshes

The previous exposition is independent of the element shape. In this subsection, we describe anisotropic elements in detail, present their notation, basic properties, some weak mesh assumptions and equivalences on anisotropic meshes. For simplicity, we use the shorthand notation \(x\lesssim y\) or \(x\sim y\) instead of \(x\le cy\) or \(c_1x\le y\le c_2x\) (with constants independent of \(x,y\) and the meshes), respectively.

We start with an arbitrary (anisotropic) tetrahedron \(K\in {\mathcal {T}}_h\) for \(d=3\). Enumerate its vertices such that \(P_0P_1\) is the longest edge, \(\mathrm {meas}_2(\triangle P_0P_1P_2)\ge \mathrm {meas}_2(\triangle P_0P_1P_3)\), and \(\mathrm {meas}_1(P_1P_2)\ge \mathrm {meas}_1(P_0P_2)\). Additionally define three pairwise orthogonal vectors \(\varvec{p}_{i,K}\) having the length \(h_{i,K}:=|\varvec{p}_{i,K}|\), see Fig. 2. Notice that \(h_{1,K}\ge h_{2,K}\ge h_{3,K}\). We set \(h_{\min ,K}:=h_{3,K},~h_{\max ,K}:= h_{1,K}\) and, for \(\sigma \in {\mathcal {E}}_K\), let

$$\begin{aligned} h_{\sigma ,K}:=\frac{d|K|}{|\sigma |}, \end{aligned}$$

be the length of the height over the face \(\sigma \). The three main anisotropic directions \({\mathbf {p}}_{i,K}\) play an important role in a posteriori error estimation on anisotropic meshes. Define the matrices \(A_K,C_K\in \mathbb R^{3\times 3}\) by

$$\begin{aligned} A_K:=(\overset{\longrightarrow }{P_0P_1},\overset{\longrightarrow }{P_0P_2}, \overset{\longrightarrow }{P_0P_3}) ~~~\mathrm {and}~~~C_K:= (\varvec{p}_{1,K}, \varvec{p}_{2,K}, \varvec{p}_{3,K}). \end{aligned}$$

Then \(A_K\) and \(C_K\) can be considered as the transformation matrices which implicitly define the so-called standard tetrahedron \({\bar{K}}:=A_K^{-1}(K-\varvec{P}_0)\) and the reference tetrahedron \(\widehat{K}:=C_K^{-1}(K-\varvec{P}_0)\), respectively, see Fig. 3. Variables that are related to the standard tetrahedron \({\bar{K}}\) and the reference tetrahedron \(\widehat{K}\) are referred to with a bar and a hat, respectively, e.g., \(\bar{\nabla },\hat{v}\). The determinants of both matrices are equal, i.e., \(|\det A_K|=|\det C_K|=6|K|\), and the transformed derivatives satisfy \(\bar{\nabla }=A_K^\top \nabla \) and \(\widehat{\nabla }=C_K^\top \nabla \), respectively. Note in particular that the standard tetrahedron \({\bar{K}}\) and the reference tetrahedron \(\widehat{K}\) are all of size \({\mathcal {O}}(1)\). Hence, we have

$$\begin{aligned} \Vert C_K^{-1}A_K\Vert \sim \Vert A_K^{-1}C_K\Vert \sim 1. \end{aligned}$$
(2.6)

For more details on both transformation matrices, refer to [23].

Fig. 2. Notation of tetrahedron \(K\)

Fig. 3. Standard tetrahedron \({\bar{K}}\) (left) and reference tetrahedron \(\widehat{K}\) (right)

For the case \(d=2\), the notation is similar. The enumeration of the triangle \(K\) is as in the bottom triangle \(\triangle P_0P_1P_2\) of Fig. 2. Furthermore \(h_{\min ,K}:=h_{2,K}\) and \(A_K,C_K\) become \(2\times 2\) matrices.
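For \(d=2\) the quantities above can be computed directly from the vertex coordinates. The sketch below is only an illustration under one stated assumption: since Fig. 2 is not reproduced here, \(\varvec{p}_{2,K}\) is taken to be the component of \(\overset{\longrightarrow }{P_0P_2}\) orthogonal to \(\varvec{p}_{1,K}=\overset{\longrightarrow }{P_0P_1}\). The function name and the sample triangle are hypothetical.

```python
import numpy as np


def anisotropic_frame_2d(vertices):
    """Return (h1, h2, C, area) for a triangle given as three (x, y) points."""
    pts = np.asarray(vertices, dtype=float)
    # P0P1 is the longest edge; P2 is the remaining vertex.
    i, j = max([(0, 1), (0, 2), (1, 2)],
               key=lambda e: np.linalg.norm(pts[e[0]] - pts[e[1]]))
    k = 3 - i - j
    v0, v1, v2 = pts[i], pts[j], pts[k]
    # Enforce meas_1(P1P2) >= meas_1(P0P2) by swapping the end points if needed.
    if np.linalg.norm(v1 - v2) < np.linalg.norm(v0 - v2):
        v0, v1 = v1, v0
    p_1 = v1 - v0                                # first anisotropic direction
    h1 = np.linalg.norm(p_1)
    w = v2 - v0
    p_2 = w - (np.dot(w, p_1) / h1**2) * p_1     # assumed: part of P0P2 orthogonal to p_1
    h2 = np.linalg.norm(p_2)
    C = np.column_stack([p_1, p_2])              # C_K = (p_1, p_2)
    A = np.column_stack([v1 - v0, v2 - v0])      # A_K = (P0P1, P0P2)
    area = 0.5 * abs(np.linalg.det(A))
    return h1, h2, C, area


# A stretched triangle with aspect ratio of about 10.
h1, h2, C, area = anisotropic_frame_2d([(3.0, 1.0), (0.0, 0.0), (10.0, 0.0)])
height_over_longest_side = 2 * area / h1         # h_{sigma,K} = d|K|/|sigma| with d = 2
```

Under this assumption, \(|\det C_K|=h_{1,K}h_{2,K}=2|K|\), the two-dimensional analogue of \(|\det C_K|=6|K|\), and the height over the longest side coincides with \(h_{\min ,K}\).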

For a sub-element \(L\in {\mathcal {S}}_h\), analogously we introduce the anisotropic directions \({\mathbf {p}}_{i,L}\), the matrix \(C_L\), and the minimal dimension \(h_{\min ,L}\).

The following lemma states that the element \(K\) and the sub-element \(L\subset K\) have similar anisotropic directions \({\mathbf {p}}_{i,K}\) and \({\mathbf {p}}_{i,L}\). Mathematically this is expressed as a certain norm equivalence. The proof can be found in [2].

Lemma 2.1

(Equivalence of \(C_K\) and \(C_L\)) For any sub-element \(L\subset K\), one has

$$\begin{aligned} |C_K^\top y|_{\mathbb R^{d\times d}}\sim |C_L^\top y|_{\mathbb R^{d\times d}},~~~\forall y\in \mathbb R^d. \end{aligned}$$

In addition to the usual conformity conditions of the mesh, the primal mesh \({\mathcal {T}}_h\) should satisfy the following anisotropic mesh requirements:

  1. For any given vertex \(V\in {\mathcal {V}}_h\), the number of elements contained in \({\mathcal {T}}_V\) is bounded uniformly.

  2. The dimensions of adjacent elements must not change rapidly, i.e.,

    $$\begin{aligned} h_{i,K}\sim h_{i,L},~~~K\cap L\ne \emptyset ,~i=1,\ldots ,d. \end{aligned}$$

At some places of our exposition, it is advantageous to replace the minimal anisotropic dimension \(h_{\min ,K}\) by certain average values. For a dual volume \(D_V\) with respect to the vertex \(V\in {\mathcal {V}}_h\), thus define

$$\begin{aligned} h_{\min ,D_V}:=\frac{\sum _{K\in {\mathcal {T}}_V} h_{\min ,K}}{\sum _{K\in {\mathcal {T}}_V}1}. \end{aligned}$$

For the common side \(\sigma \) of two elements \(K\) and \(L\), set

$$\begin{aligned} h_\sigma :=\frac{h_{\sigma ,K}+h_{\sigma ,L}}{2},~~~ h_{\min ,\sigma }:=\frac{h_{\min ,K}+h_{\min ,L}}{2}. \end{aligned}$$

The advantage is that they are no longer related to \(K\) or \(L\) but to \(\sigma \). The definitions are modified in the obvious way for boundary sides.

Note that the original term \(h_{\min ,K}\) is of comparable size to the average values, since the dimensions of adjacent elements do not change rapidly (see the second mesh requirement above). More precisely, we have

$$\begin{aligned} h_{\min ,K}\sim h_{\min ,D_V}\sim h_{\min ,L}\sim h_{\min ,\sigma }\sim h_{\min ,\rho }, \end{aligned}$$
(2.7)

where \(V\in {\mathcal {V}}_h,~V\cap K\ne \emptyset ,~K\in {\mathcal {T}}_h,~L\in {\mathcal {S}}_{D_V},~\sigma \in {\mathcal {E}}_K\) and \(\rho \in {\mathcal {E}}_L\).

2.3 Continuous Problem

For problem (1.1), we define a bilinear form \({\mathcal {B}}\) by

$$\begin{aligned} {\mathcal {B}}(u,v):=(a\nabla u,\nabla v),~~~ u\in H_0^1(\varOmega ),~v\in H_0^1(\varOmega ). \end{aligned}$$

The weak formulation of problem (1.1) is to find \(u\in H_0^1(\varOmega )\) such that

$$\begin{aligned} {\mathcal {B}}(u,v)=(f,v),~~~\forall v\in H_0^1(\varOmega ). \end{aligned}$$
(2.8)

The associated energy norm is given by

$$\begin{aligned} \parallel \mid \!v\!\parallel \mid :={\mathcal {B}}(v,v)^{\frac{1}{2}}=\Vert a^{\frac{1}{2}}\nabla v\Vert ,~~~v\in H_0^1(\varOmega ), \end{aligned}$$
(2.9)

and dual norm (\(H^{-1}\)-norm of the residual) given by

$$\begin{aligned} \parallel \mid \!v\!\parallel \mid _\#:=\sup _{\varphi \in H_0^1(\varOmega )}\frac{{\mathcal {B}}(v,\varphi )}{\Vert \nabla \varphi \Vert },~~~v\in H_0^1(\varOmega ). \end{aligned}$$
(2.10)

The energy and dual norms are closely related to each other, see [37] for the details. We will present our a posteriori error estimates in both the energy norm and the dual one.

2.4 Alignment Measure

From a heuristic point of view, one should stretch the elements in that direction where the (directional) derivative of the solution shows little change. The better the primal anisotropic mesh \({\mathcal {T}}_h\) is aligned with the anisotropy of the solution, the more accurate one would expect the error estimates to be. In order to quantify this alignment, we need to introduce the so-called alignment measures \(m_1(v,{\mathcal {T}}_h)\) and \(m_2(v,{\mathcal {T}}_h)\) with respect to the energy and dual norms, respectively, which originate from [22, 23].

Definition 2.1

(Alignment measures) Let \(v\in H^1(\varOmega )\), and \(\{{\mathcal {T}}_h\}\) be a family of triangulations of \(\varOmega \). Define the energy alignment measure \(m_1:H^1(\varOmega )\times \{{\mathcal {T}}_h\}\rightarrow \mathbb R\) by

$$\begin{aligned} m_1(v,{\mathcal {T}}_h):=\Big (\sum _{K\in {\mathcal {T}}_h}h_{\min ,K}^{-2}\Vert a^{\frac{1}{2}}C_K^\top \nabla v\Vert _K^2\Big )^{1/2}\Big /\parallel \mid \!v\!\parallel \mid , \end{aligned}$$
(2.11)

and dual alignment measure \(m_2:H^1(\varOmega )\times \{{\mathcal {T}}_h\}\rightarrow \mathbb R\) by

$$\begin{aligned} m_2(v,{\mathcal {T}}_h):=\Big (\sum _{K\in {\mathcal {T}}_h}h_{\min ,K}^{-2}\Vert C_K^\top \nabla v\Vert _K^2\Big )^{1/2}\Big /\Vert \nabla v\Vert . \end{aligned}$$
(2.12)

The alignment measures satisfy the following property

$$\begin{aligned} 1\le m_i(v,{\mathcal {T}}_h)\lesssim \underset{K\in {\mathcal {T}}_h}{\max }\frac{h_{\max ,K}}{h_{\min ,K}}. \end{aligned}$$

The above property implies that a mesh \({\mathcal {T}}_h\) which is well aligned with an anisotropic function \(v\) results in small alignment measures. In practice, one almost always obtains \(m_i(v,{\mathcal {T}}_h)\sim 1\) for 'sensible' anisotropic meshes, i.e., the alignment measures are no obstacle to reliable error estimation. We refer to [23] for more details.
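To make the role of the alignment measure concrete, the following sketch evaluates \(m_2\) from (2.12) for a function whose gradient is constant on each element, so that each \(L^2\)-norm reduces to an area-weighted vector norm. This restriction, the function name, and the sample data are assumptions made for illustration only.

```python
import numpy as np


def dual_alignment_measure(C_list, h_min_list, area_list, grad_list):
    """Evaluate m_2 of (2.12) for element-wise constant gradients."""
    numerator = 0.0
    denominator = 0.0
    for C, h_min, area, grad in zip(C_list, h_min_list, area_list, grad_list):
        g = np.asarray(grad, dtype=float)
        scaled = C.T @ g                                  # C_K^T grad v
        numerator += area * np.dot(scaled, scaled) / h_min**2
        denominator += area * np.dot(g, g)                # contribution to ||grad v||^2
    return np.sqrt(numerator / denominator)


# One element stretched along x: h_1 = 10, h_2 = h_min = 1, |K| = 5.
C = np.diag([10.0, 1.0])
m_aligned = dual_alignment_measure([C], [1.0], [5.0], [(0.0, 1.0)])     # v varies across the short direction
m_misaligned = dual_alignment_measure([C], [1.0], [5.0], [(1.0, 0.0)])  # v varies along the long direction
```

In this toy setting the well-aligned case attains the lower bound \(1\) and the misaligned case attains the aspect ratio \(h_{\max ,K}/h_{\min ,K}=10\), matching the two ends of the bound stated above.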

3 A Posteriori Error Estimates for Conforming Approximations

In this section, we derive a posteriori error estimates that are valid for an arbitrary conforming approximation \(u_h\in H_0^1(\varOmega )\) and take the anisotropy of the meshes into account. Note that the results of this section are presented generally, without reference to any particular numerical scheme. For our a posteriori error estimates of Theorem 3.2 or Corollary 3.2, we only require Assumption 3.1, which was introduced by Vohralík in [37]. Two different constructions of the equilibrated flux \({\mathbf {t}}_h\) satisfying Assumption 3.1 are then presented.

In order to present the results of this section as generally as possible, we suppose that the domain \(\varOmega \) is partitioned by a mesh \({\mathcal {D}}_h^*\), consisting in general of elements in the interior of the domain, \({\mathcal {D}}_h^{\mathrm {int},*}\), and elements near the boundary, \({\mathcal {D}}_h^{\mathrm {ext},*}\) (\(|\partial \varOmega \cap \partial D|\ne 0\) for all \(D\in {\mathcal {D}}_h^{\mathrm {ext},*}\)). As shown in [37], the meshes \({\mathcal {D}}_h^*\) differ between the types of construction of the equilibrated flux \({\mathbf {t}}_h\), and here two main possibilities exist: either \({\mathcal {D}}_h^{\mathrm {int},*}={\mathcal {D}}_h^{\mathrm {int}}\) and \({\mathcal {D}}_h^{\mathrm {ext},*}={\mathcal {D}}_h^{\mathrm {ext}}\), or \({\mathcal {D}}_h^{\mathrm {int},*}={\mathcal {S}}_h\) and \({\mathcal {D}}_h^{\mathrm {ext},*}=\emptyset \).

We shall suppose that the coefficient \(a({\mathbf {x}})\) is piecewise constant on the meshes \({\mathcal {D}}_h\) (or \({\mathcal {T}}_h\)). This assumption corresponds to a situation where there exist subdomains of the domain \(\varOmega \) where \(a\) is constant and where the meshes \({\mathcal {D}}_h\) (or \({\mathcal {T}}_h\)) are conforming with respect to these subdomains. We denote by \(a_{\min ,K}\) and \(a_{\max ,K}\) for all \(K\in {\mathcal {T}}_h\) the best positive constants such that \(a_{\min ,K}\le a({\mathbf {x}})\le a_{\max ,K}\) for all \({\mathbf {x}}\in K\). Similar notation will also be used for \(D\in {\mathcal {D}}_h\) or other appropriate domains.

3.1 Anisotropic Interpolation Error Estimates

In order to obtain a posteriori error estimates on anisotropic meshes, we require a global, \({\mathcal {D}}_h^*\)-piecewise constant interpolation operator \({\mathbf {I}}:L^2(\varOmega )\rightarrow L^2(\varOmega )\) which is defined by

$$\begin{aligned} {\mathbf {I}}v:=\left\{ \begin{array}{l l} |D|^{-1}(v,1)_D, &{} D\in {\mathcal {D}}_h^{\mathrm {int},*},\\ 0, &{} D\in {\mathcal {D}}_h^{\mathrm {ext},*}. \end{array}\right. \end{aligned}$$
(3.1)
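As a rough illustration of definition (3.1), the sketch below applies \({\mathbf {I}}\) to a function that is represented by one value per sub-cell of each volume \(D\). That piecewise-constant sampling, as well as all names and data, are simplifying assumptions introduced here. The point of the example is the orthogonality \((v-{\mathbf {I}}v,1)_D=0\) on interior volumes, which follows directly from (3.1).

```python
import numpy as np


def interpolate_piecewise_mean(values, areas, volume_of, is_interior):
    """Return the constant value of I v on each volume D, following (3.1).

    values[i], areas[i]: sampled value of v and measure of sub-cell i (assumed data layout).
    volume_of[i]: index of the volume D containing sub-cell i.
    is_interior[D]: True for interior volumes, False for boundary volumes.
    """
    values = np.asarray(values, dtype=float)
    areas = np.asarray(areas, dtype=float)
    volume_of = np.asarray(volume_of, dtype=int)
    is_interior = np.asarray(is_interior, dtype=bool)
    n_volumes = len(is_interior)
    measure = np.bincount(volume_of, weights=areas, minlength=n_volumes)            # |D|
    integral = np.bincount(volume_of, weights=values * areas, minlength=n_volumes)  # (v, 1)_D
    return np.where(is_interior, integral / measure, 0.0)


# Two volumes with two sub-cells each; volume 0 is interior, volume 1 touches the boundary.
values = [1.0, 3.0, 2.0, 4.0]
areas = [1.0, 1.0, 2.0, 2.0]
volume_of = [0, 0, 1, 1]
Iv = interpolate_piecewise_mean(values, areas, volume_of, [True, False])
# (v - I v, 1)_D on the interior volume 0:
defect_interior = sum(a * (v - Iv[0]) for v, a, d in zip(values, areas, volume_of) if d == 0)
```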

Lemma 3.1

(Local interpolation error bound for \({\mathbf {I}}\)) Let \(v\in H_0^1(\varOmega )\) and \(D\in {\mathcal {D}}_h^*\). Then

$$\begin{aligned} \Vert v-{\mathbf {I}}v\Vert _D\lesssim \Vert C_D^\top \nabla v\Vert _D, \end{aligned}$$
(3.2)

where \(C_D|_K:=C_K,~\forall K\in \mathcal S_D\).

Proof

For \({\mathcal {D}}_h^*={\mathcal {D}}_h\), the proof is given in [2]. For the other case, the proof is standard and follows from the affine transformation and the Poincaré/Friedrichs inequality (cf. [35]); see [23] for the details.

By using the equivalence relations on anisotropic meshes, i.e., Lemma 2.1 and (2.7), we proceed along the same lines as the proof of [2, Lemma 5.3], and easily obtain the global interpolation error estimates.

Corollary 3.1

(Global interpolation error bounds for \({\mathbf {I}}\)) Let \(v\in H_0^1(\varOmega )\), then

$$\begin{aligned} \sum _{D\in {\mathcal {D}}_h^*}a_{\min ,D}h_{\min ,D}^{-2}\Vert v-{\mathbf {I}}v\Vert _D^2&\lesssim m_1(v,{\mathcal {T}}_h)^2\parallel \mid \!v\!\parallel \mid ^2, \end{aligned}$$
(3.3)
$$\begin{aligned} \sum _{D\in {\mathcal {D}}_h^*}h_{\min ,D}^{-2}\Vert v-{\mathbf {I}}v\Vert _D^2&\lesssim m_2(v,{\mathcal {T}}_h)^2\Vert \nabla v\Vert ^2. \end{aligned}$$
(3.4)

3.2 A Posteriori Error Estimates on Anisotropic Meshes

In order to proceed without specifying any particular numerical method, we will make an assumption on the equilibrated flux \(\mathbf{t}_h\), which is introduced by Vohralík in [37].

Assumption 3.1

(Equilibrated flux) Suppose that there is a flux \({\mathbf {t}}_h\in {\mathbf {H}}(\mathrm {div},\varOmega )\), arbitrary but such that

$$\begin{aligned} (\nabla \cdot {\mathbf {t}}_h,1)_D=(f,1)_D,~~~\forall D\in {\mathcal {D}}_h^{\mathrm {int},*}. \end{aligned}$$

With the above preparations, we can now present our mesh-dependent error estimators.

Theorem 3.2

(Energy norm a posteriori error estimate) Let \(u\) be the weak solution of problem (1.1) and \(u_h\in H_0^1(\varOmega )\) be arbitrary. Let Assumption 3.1 hold. Then

$$\begin{aligned} \parallel \mid \!u-u_h\!\parallel \mid \lesssim m_1(u-u_h,{\mathcal {T}}_h) \Big \{\sum _{D\in {\mathcal {D}}_h^*}(\eta _{\mathrm R,D}^2+\eta _{\mathrm {DF},D}^2)\Big \}^{1/2}, \end{aligned}$$
(3.5)

where the diffusive flux estimator \(\eta _{\mathrm {DF},D}\) is given by

$$\begin{aligned} \eta _{\mathrm {DF},D}:=h_{\min ,D}\Vert C_D^{-1}(a^{\frac{1}{2}}\nabla u_h+a^{-\frac{1}{2}}{\mathbf {t}}_h)\Vert _D,~~~D\in {\mathcal {D}}_h^*, \end{aligned}$$
(3.6)

and the residual estimator \(\eta _{\mathrm R,D}\) by

$$\begin{aligned} \eta _{\mathrm R,D}:=a_{\min ,D}^{-\frac{1}{2}}h_{\min ,D}\Vert f-\nabla \cdot {\mathbf {t}}_h\Vert _D,~~~D\in {\mathcal {D}}_h^*. \end{aligned}$$
(3.7)

Proof

Set \(\varphi :=u-u_h\in H_0^1(\varOmega )\). Similar to the proof of [37, Theorem 4.1], for an arbitrary \({\mathbf {t}}\in {\mathbf {H}}(\mathrm {div},\varOmega )\), we have

$$\begin{aligned} {\mathcal {B}}(u-u_h,\varphi )\le |(f-\nabla \cdot {\mathbf {t}},\varphi )|+|(a\nabla u_h+{\mathbf {t}},\nabla \varphi )|. \end{aligned}$$
(3.8)
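The steps behind (3.8) can be sketched as follows; this is a reconstruction from the weak formulation (2.8) and the Green theorem, not a quotation of [37]. Since \(\varphi \in H_0^1(\varOmega )\) and \({\mathbf {t}}\in {\mathbf {H}}(\mathrm {div},\varOmega )\), the Green theorem gives \(({\mathbf {t}},\nabla \varphi )=-(\nabla \cdot {\mathbf {t}},\varphi )\), so that

$$\begin{aligned} {\mathcal {B}}(u-u_h,\varphi )&=(f,\varphi )-(a\nabla u_h,\nabla \varphi ) \nonumber \\&=(f-\nabla \cdot {\mathbf {t}},\varphi )+(\nabla \cdot {\mathbf {t}},\varphi )-(a\nabla u_h,\nabla \varphi ) \nonumber \\&=(f-\nabla \cdot {\mathbf {t}},\varphi )-(a\nabla u_h+{\mathbf {t}},\nabla \varphi ), \end{aligned}$$

and (3.8) follows from the triangle inequality.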

Set \({\mathbf {t}}={\mathbf {t}}_h\). Using Assumption 3.1, the interpolation error estimate (3.3), and the definition (2.9) of the energy norm, we have

$$\begin{aligned} |(f-\nabla \cdot {\mathbf {t}}_h,\varphi )|&= |(f-\nabla \cdot {\mathbf {t}}_h,\varphi -{\mathbf {I}}\varphi )| \nonumber \\&\lesssim \Big (\sum _{D\in {\mathcal {D}}_h^*}\eta _{\mathrm R,D}^2\Big )^{1/2} \Big (\sum _{D\in {\mathcal {D}}_h^*}a_{\min ,D}h_{\min ,D}^{-2}\Vert \varphi -{\mathbf {I}} \varphi \Vert _D^2\Big )^{1/2} \nonumber \\&\lesssim m_1(\varphi ,{\mathcal {T}}_h)\parallel \mid \!\varphi \!\parallel \mid \Big (\sum _{D\in {\mathcal {D}}_h^*}\eta _{\mathrm R,D}^2\Big )^{1/2}. \end{aligned}$$
(3.9)

For the second term in (3.8), according to Lemma 2.1 and (2.7) it holds that

$$\begin{aligned} |(a\nabla u_h+{\mathbf {t}}_h,\nabla \varphi )|&= |\sum _{D\in {\mathcal {D}}_h^*} (C_D^{-1}(a^{\frac{1}{2}}\nabla u_h+a^{-\frac{1}{2}}{\mathbf {t}}_h),a^{\frac{1}{2}}C_D^\top \nabla \varphi )| \nonumber \\&\lesssim m_1(\varphi ,{\mathcal {T}}_h)\parallel \mid \!\varphi \!\parallel \mid \Big (\sum _{D\in {\mathcal {D}}_h^*}\eta _{\mathrm {DF},D}^2\Big )^{1/2}. \end{aligned}$$
(3.10)

The proof is complete.
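For orientation, the two local estimators (3.6) and (3.7) can be evaluated on a single volume \(D\) as in the sketch below. It assumes, purely for illustration, that \(a\), \(\nabla u_h\), \({\mathbf {t}}_h\) and \(f-\nabla \cdot {\mathbf {t}}_h\) are constant on each sub-element of \({\mathcal {S}}_D\); a flux in \(\mathbf {RTN}({\mathcal {S}}_h)\) is only piecewise affine, so a real implementation would need a quadrature rule. The data layout and all names are hypothetical.

```python
import numpy as np


def energy_estimators(h_min_D, sub_elements):
    """Return (eta_R, eta_DF) of (3.7) and (3.6) for one volume D.

    sub_elements: list of dicts with keys 'area', 'a', 'C', 'grad_uh', 't', 'residual',
    all taken constant per sub-element (an assumption of this sketch).
    """
    a_min = min(s["a"] for s in sub_elements)
    df_sq = 0.0
    r_sq = 0.0
    for s in sub_elements:
        a = s["a"]
        mismatch = (np.sqrt(a) * np.asarray(s["grad_uh"], dtype=float)
                    + np.asarray(s["t"], dtype=float) / np.sqrt(a))
        scaled = np.linalg.solve(s["C"], mismatch)     # C_D^{-1} applied to the mismatch
        df_sq += s["area"] * float(np.dot(scaled, scaled))
        r_sq += s["area"] * s["residual"] ** 2
    eta_DF = h_min_D * np.sqrt(df_sq)
    eta_R = h_min_D * np.sqrt(r_sq / a_min)
    return eta_R, eta_DF


cell = {"area": 5.0, "a": 4.0, "C": np.diag([10.0, 1.0]),
        "grad_uh": (0.0, 1.0), "t": (0.0, -4.0), "residual": 2.0}
eta_R, eta_DF = energy_estimators(1.0, [cell])                       # t_h = -a grad u_h
eta_R0, eta_DF0 = energy_estimators(1.0, [dict(cell, t=(0.0, 0.0))])  # t_h = 0
```

When \({\mathbf {t}}_h=-a\nabla u_h\) on \(D\), the diffusive flux estimator vanishes and only the residual estimator remains, which is consistent with reading \(\eta _{\mathrm {DF},D}\) as a measure of how far \(-a\nabla u_h\) is from the reconstructed flux.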

The proof of the following corollary is completely similar.

Corollary 3.2

(Dual norm a posteriori error estimate) Let the assumptions of Theorem 3.2 be satisfied. Then

$$\begin{aligned} \parallel \mid \!u-u_h\!\parallel \mid _\#\lesssim m_2(u-u_h,{\mathcal {T}}_h) \Big \{\sum _{D\in {\mathcal {D}}_h^*}(\eta _{\mathrm R,D}^2+\eta _{\mathrm {DF},D}^2)\Big \}^{1/2}, \end{aligned}$$
(3.11)

where the diffusive flux estimator \(\eta _{\mathrm {DF},D}\) is given by

$$\begin{aligned} \eta _{\mathrm {DF},D}:=h_{\min ,D}\Vert C_D^{-1}(a\nabla u_h+{\mathbf {t}}_h)\Vert _D,~~~D\in {\mathcal {D}}_h^*, \end{aligned}$$
(3.12)

and the residual estimator \(\eta _{\mathrm R,D}\) by

$$\begin{aligned} \eta _{\mathrm R,D}:=h_{\min ,D}\Vert f-\nabla \cdot {\mathbf {t}}_h\Vert _D,~~~D\in {\mathcal {D}}_h^*. \end{aligned}$$
(3.13)

Remark 3.1

It is possible that the alignment measures \(m_1(u-u_h,{\mathcal {T}}_h)\) and \(m_2(u-u_h,{\mathcal {T}}_h)\) can be approximated numerically in an efficient way, e.g., by replacing \(\nabla u\) by a recovered gradient \(\nabla ^{\mathrm R}u_h\) of \(u_h\) such as an approximate \(L^2\)-projection of \(\nabla u_h\) onto the space of continuous piecewise linear functions. See numerical results in Tables 2, 3, 5 and 6 in Sect. 6 and reference [23] for the discussion of the alignment measure, where the approximation of \(m_1(u-u_h,{\mathcal {T}}_h)\) is denoted by \(m_1^{\mathrm R}(u-u_h,{\mathcal {T}}_h)\).

According to Theorem 3.2 and Corollary 3.2, we need to construct a (finite-dimensional) equilibrated flux \({\mathbf {t}}_h\in {\mathbf {H}}(\mathrm {div},\varOmega )\) such that Assumption 3.1 holds. More precisely, we will define it in the lowest-order Raviart–Thomas–Nédélec space \(\mathbf {RTN}({\mathcal {S}}_h)\) over the fine simplicial mesh \({\mathcal {S}}_h\) introduced in Sect. 2.1, following the ideas presented in [37]. For more details on the space \(\mathbf {RTN}({\mathcal {S}}_h)\), we refer to Brezzi and Fortin [10] or Roberts and Thomas [32].

3.3 Constructions of the Equilibrated Flux \({\mathbf {t}}_h\)

We show here two different ways of constructing an equilibrated flux \({\mathbf {t}}_h\) satisfying Assumption 3.1. In order to keep the presentation as general as possible, we will henceforth make an additional assumption on the numerical solution, also introduced by Vohralík in [37].

Assumption 3.3

(General local conservativity of the numerical scheme) Assume that \(u_h\in X_h^0\) and that, for \({\mathcal {D}}_h\) defined in Sect. 2.1, there holds

$$\begin{aligned} -\langle \{\!\{a\nabla u_h\cdot {\mathbf {n}}_D\}\!\}_\omega ,1\rangle _{\partial D}=(f,1)_D,~~~\forall D\in {\mathcal {D}}_h^\mathrm{{int}}. \end{aligned}$$
(3.14)

3.3.1 Construction of \({\mathbf {t}}_h\) by Direct Prescription

Following [37], the simplest construction of \({\mathbf {t}}_h\in \mathbf {RTN}({\mathcal {S}}_h)\) is given by

$$\begin{aligned} {\mathbf {t}}_h\cdot {\mathbf {n}}_\sigma :=-\{\!\{a\nabla u_h\cdot {\mathbf {n}}_\sigma \}\!\}_\omega ,~~~\forall \sigma \in {\mathcal {G}}_h, \end{aligned}$$
(3.15)

that is, we merely prescribe the degrees of freedom of \({\mathbf {t}}_h\) by a simple (weighted) average of the normal components of the discontinuous approximate flux \(-a\nabla u_h\) over the sides of the fine simplicial mesh \({\mathcal {S}}_h\). Note that, according to (3.15) and the Green theorem, Assumption 3.1 follows immediately from Assumption 3.3. We here take \({\mathcal {D}}_h^*={\mathcal {D}}_h\), i.e., \({\mathcal {D}}_h^{\mathrm {int},*}={\mathcal {D}}_h^{\mathrm {int}}\) and \({\mathcal {D}}_h^{\mathrm {ext},*}={\mathcal {D}}_h^{\mathrm {ext}}\).
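The prescription (3.15) on a single interior side can be illustrated by the following minimal sketch. The weight formulas are assumptions, since (2.2) and (2.5) are not restated in this section: the harmonic weights are taken as \(\omega _{K,\sigma }=a_L/(a_K+a_L)\) and \(\omega _{L,\sigma }=a_K/(a_K+a_L)\), the arithmetic ones as \(1/2\), and the weighted average as \(\omega _{K,\sigma }v|_K+\omega _{L,\sigma }v|_L\). All function names and data are hypothetical.

```python
def harmonic_weights(a_K, a_L):
    """Assumed form of the harmonic averaging weights (2.5)."""
    s = a_K + a_L
    return a_L / s, a_K / s


def arithmetic_weights():
    """Arithmetic averaging: equal weights on both sides."""
    return 0.5, 0.5


def prescribed_normal_flux(a_K, grad_K, a_L, grad_L, normal, weights):
    """Degree of freedom t_h . n_sigma = -{{a grad(u_h) . n_sigma}}_omega, cf. (3.15)."""
    w_K, w_L = weights
    flux_K = a_K * (grad_K[0] * normal[0] + grad_K[1] * normal[1])
    flux_L = a_L * (grad_L[0] * normal[0] + grad_L[1] * normal[1])
    return -(w_K * flux_K + w_L * flux_L)


# Hypothetical side between K (a = 1) and L (a = 100) with unit normal (1, 0).
a_K, a_L = 1.0, 100.0
grad_K, grad_L = (1.0, 0.0), (0.02, 0.0)
n_sigma = (1.0, 0.0)

t_harm = prescribed_normal_flux(a_K, grad_K, a_L, grad_L, n_sigma,
                                harmonic_weights(a_K, a_L))
t_arith = prescribed_normal_flux(a_K, grad_K, a_L, grad_L, n_sigma,
                                 arithmetic_weights())
```

With these data the one-sided normal fluxes \(a\nabla u_h\cdot {\mathbf {n}}_\sigma \) are 1 on \(K\) and 2 on \(L\); under the assumed formula the harmonic weights put almost all the weight on the side with the smaller coefficient, whereas arithmetic averaging treats both sides equally.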

This construction of \({\mathbf {t}}_h\) has been fully discussed by Vohralík in [37]. In general, as pointed out in [18, 37], with this construction the residual estimators \(\eta _{\mathrm R,D}\) need not be higher-order terms and may dominate the diffusive flux estimators \(\eta _{\mathrm {DF},D}\), since Assumption 3.3 in this case only holds on the dual volumes \(D\in {\mathcal {D}}_h^{\mathrm {int}}\) and not on each \(K\in {\mathcal {S}}_h\). As a consequence, the effectivity index may not approach the optimal value of one (see the numerical experiments in Sect. 7.2.1 of [37]). This drawback persists on anisotropic meshes (see the numerical experiments in Sect. 6). The approach of the next subsection is designed to remedy it.

3.3.2 Construction of \({\mathbf {t}}_h\) by Mixed Finite Element Approximations of Local Neumann/Dirichlet Problems

We set, for a given dual volume \(D\in {\mathcal {D}}_h\),

$$\begin{aligned} \mathbf {RTN}_{\mathrm N}({\mathcal {S}}_D):=\{{\mathbf {q}}_h\in \mathbf {RTN}({\mathcal {S}}_D);~ {\mathbf {q}}_h\cdot {\mathbf {n}}_\sigma =-\{\!\{a\nabla u_h\cdot {\mathbf {n}}_\sigma \}\!\}_\omega ,~\forall \sigma \in {\mathcal {G}}_h^{\mathrm {int}}\cap \partial D\}, \end{aligned}$$
(3.16)

which is a space of Raviart–Thomas–Nédélec vector functions over the simplicial submesh \({\mathcal {S}}_D\) of the dual volume \(D\) whose normal fluxes are given by \(-\{\!\{a\nabla u_h\cdot {\mathbf {n}}_\sigma \}\!\}_\omega \) on \(\partial D\backslash \partial \varOmega \). As in the previous subsection (see also [37]), the same process implies that Assumption 3.1 holds with \({\mathcal {D}}_h^{\mathrm {int},*}={\mathcal {D}}_h^{\mathrm {int}}\). Moreover, as we will see below, the approach of this subsection eventually yields a flux \({\mathbf {t}}_h\) for which Assumption 3.1 holds more specifically with \({\mathcal {D}}_h^{\mathrm {int},*}={\mathcal {S}}_h\) and \({\mathcal {D}}_h^{\mathrm {ext},*}=\emptyset \).

Let \(f_h\) be given by \((f,1)_K/|K|\) for all \(K\in {\mathcal {S}}_h\). In order to reflect the mesh dependence of the error estimators, we define \({\mathbf {t}}_h\in \mathbf {RTN}({\mathcal {S}}_h)\) by solving on each \(D\in {\mathcal {D}}_h\) the following mesh-dependent minimization problem

$$\begin{aligned} {\mathbf {t}}_h|_D:=\arg \inf _{{\mathbf {q}}_h\in \mathbf {RTN}_{\mathrm N}({\mathcal {S}}_D),\nabla \cdot {\mathbf {q}}_h=f_h} \Vert h_{\min ,D}C_D^{-1}(a^{\frac{1}{2}}\nabla u_h+a^{-\frac{1}{2}}{\mathbf {q}}_h)\Vert _D. \end{aligned}$$
(3.17)

Note that, since \(f-\nabla \cdot {\mathbf {t}}_h=f-f_h\), the residual estimator (3.7) will be very small, and it vanishes altogether when \(f=f_h\) (see numerical experiments in Sect. 6.1.2). The equilibrated flux \({\mathbf {t}}_h\) defined by (3.17) minimizes the diffusive flux estimator (3.6).
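To make the role of \(f_h\) concrete, the following sketch computes the elementwise mean \(f_h|_K=(f,1)_K/|K|\) and the norm \(\Vert f-f_h\Vert _K\) on one triangle. It is an illustration only: the edge-midpoint quadrature used here is exact for polynomials of degree at most two, so the computed norm is exact only for affine \(f\); the function names and the sample triangle are hypothetical.

```python
import math


def triangle_area(p0, p1, p2):
    return abs((p1[0] - p0[0]) * (p2[1] - p0[1])
               - (p1[1] - p0[1]) * (p2[0] - p0[0])) / 2.0


def edge_midpoints(p0, p1, p2):
    return [((p0[0] + p1[0]) / 2.0, (p0[1] + p1[1]) / 2.0),
            ((p1[0] + p2[0]) / 2.0, (p1[1] + p2[1]) / 2.0),
            ((p2[0] + p0[0]) / 2.0, (p2[1] + p0[1]) / 2.0)]


def elementwise_mean(f, tri):
    """f_h|_K = (f,1)_K / |K| with the edge-midpoint rule (exact up to degree 2)."""
    return sum(f(x, y) for x, y in edge_midpoints(*tri)) / 3.0


def residual_norm(f, tri):
    """||f - f_h||_K with the same quadrature rule (exact when f is affine)."""
    f_h = elementwise_mean(f, tri)
    area = triangle_area(*tri)
    return math.sqrt(area / 3.0
                     * sum((f(x, y) - f_h) ** 2 for x, y in edge_midpoints(*tri)))


K = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))   # hypothetical element
r_const = residual_norm(lambda x, y: 7.0, K)    # f piecewise constant
r_affine = residual_norm(lambda x, y: x, K)     # f affine
```

For constant \(f\) the residual is zero, in line with the remark that the residual estimator disappears when \(f=f_h\); for \(f(x,y)=x\) on the unit reference triangle the norm equals \(1/6\).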

Define \(\mathbf {RTN}_{\mathrm N,0}({\mathcal {S}}_D)\) as a subspace of \(\mathbf {RTN}_{\mathrm N}({\mathcal {S}}_D)\) but with the normal flux condition \({\mathbf {q}}_h\cdot {\mathbf {n}}_\sigma =0\) on \(\sigma \in {\mathcal {G}}_h^{\mathrm {int}}\cap \partial D\) for the functions \({\mathbf {q}}_h\) from this space. Let \(\mathbb P_0^*({\mathcal {S}}_D)\) be spanned by piecewise constants on \({\mathcal {S}}_D\) with vanishing mean on \(D\) when \(D\in {\mathcal {D}}_h^{\mathrm {int}}\); when \(D\in {\mathcal {D}}_h^{\mathrm {ext}}\), the mean value condition is not imposed. Then it is easy to show (cf. [19]) that (3.17) is equivalent to finding \({\mathbf {t}}_h\in \mathbf {RTN}_{\mathrm N}({\mathcal {S}}_D)\) and \(\psi _h\in \mathbb P_0^*({\mathcal {S}}_D)\) such that

$$\begin{aligned} \left\{ \begin{array}{l l} (h_{\min ,D}^2C_D^{-\top }C_D^{-1}(a^{-1}{\mathbf {t}}_h+\nabla u_h),{\mathbf {q}}_h)_D-(\psi _h,\nabla \cdot {\mathbf {q}}_h)_D=0, &{} \forall {\mathbf {q}}_h\in \mathbf {RTN}_{\mathrm N,0}({\mathcal {S}}_D), \\ (\nabla \cdot {\mathbf {t}}_h,\varphi _h)_D=(f,\varphi _h)_D, &{} \forall \varphi _h\in \mathbb P_0^*({\mathcal {S}}_D). \end{array}\right. \end{aligned}$$
(3.18)

The existence and uniqueness of a solution to the above system are standard, cf. [37].
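The equivalence between the constrained minimization (3.17) and the saddle-point system (3.18) has a simple algebraic analogue, sketched below. This is not the Raviart–Thomas–Nédélec discretization itself: it assumes a diagonal weight matrix and a single scalar constraint, so that the Lagrange multiplier (playing the role of \(\psi _h\)) has a closed form. All names and data are hypothetical.

```python
def constrained_min(w, g, b, f):
    """Minimize 0.5 * sum_i w_i (q_i + g_i)^2 subject to sum_i b_i q_i = f.

    Stationarity reads w_i (q_i + g_i) - psi * b_i = 0, mirroring the first
    equation of (3.18); the constraint mirrors the second one.
    """
    s = sum(bi * bi / wi for bi, wi in zip(b, w))
    psi = (f + sum(bi * gi for bi, gi in zip(b, g))) / s
    q = [-gi + psi * bi / wi for gi, bi, wi in zip(g, b, w)]
    return q, psi


def objective(w, g, q):
    return 0.5 * sum(wi * (qi + gi) ** 2 for wi, gi, qi in zip(w, g, q))


# Hypothetical data: three unknowns, one "divergence" constraint.
w = [1.0, 4.0, 9.0]       # positive weights (stand-in for the scaled mass matrix)
g = [1.0, -2.0, 0.5]      # stand-in for the discrete gradient term
b = [1.0, 1.0, 1.0]       # constraint row
f = 2.0                   # right-hand side of the constraint

q, psi = constrained_min(w, g, b, f)

# Any other feasible point (perturbation d with b . d = 0) has a larger objective.
d = [1.0, -1.0, 0.0]
q_other = [qi + 0.1 * di for qi, di in zip(q, d)]
J_min, J_other = objective(w, g, q), objective(w, g, q_other)
```

In the actual method the unknowns are the normal fluxes on the sides of \({\mathcal {S}}_D\) and there is one constraint per element, so a small linear system has to be solved on each dual volume instead of the closed-form expression used here.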

The above presentation is done in the energy norm (2.9) setting. For the dual norm (2.10), we only need to replace \((h_{\min ,D}^2C_D^{-\top }C_D^{-1}(a^{-1}{\mathbf {t}}_h+\nabla u_h),{\mathbf {q}}_h)_D\) by \((h_{\min ,D}^2C_D^{-\top }C_D^{-1}({\mathbf {t}}_h+a\nabla u_h),{\mathbf {q}}_h)_D\) in (3.18).

Other constructions of the equilibrated flux \({\mathbf {t}}_h\), such as the local minimization and its improvement [14, 37], can be extended to anisotropic meshes in a similar way; we do not repeat the process here.

4 Efficiency and Robustness of the a Posteriori Error Estimates

Recalling the two constructions of the equilibrated flux \({\mathbf {t}}_h\), we now prove the (local) efficiency and robustness of our estimates on anisotropic meshes without the assumption that \({\mathcal {T}}_h\) is shape-regular. We still proceed as generally as possible, without specifying any particular numerical method. We first focus on the equilibrated flux \({\mathbf {t}}_h\) obtained by the direct prescription of Sect. 3.3.1 and present a robust energy norm (2.9) result in the case of discontinuities aligned with the dual meshes \({\mathcal {D}}_h\) and use of harmonic averaging. Then robustness in the dual norm (2.10) is proven without any special requirement. Similar results are given for the construction of \({\mathbf {t}}_h\) by the local Neumann/Dirichlet problems in Sect. 3.3.2. In fact, we show that our estimators represent, up to a generic constant, local lower bounds for the usual residual ones of Kunert [23] on anisotropic meshes.

4.1 Construction of \({\mathbf {t}}_h\) by the Direct Prescription of Sect. 3.3.1

We first consider the energy norm (2.9) setting and then the dual norm (2.10) one.

Theorem 4.1

(Local efficiency and robustness of the energy norm estimate with the harmonic weighting by the direct prescription) Let \(a\) be piecewise constant on \({\mathcal {D}}_h\), \(f\) a piecewise polynomial of degree \(m\) on \({\mathcal {S}}_h\), \(u\) the weak solution of problem (1.1), and \(u_h\in X_h^0\) satisfy Assumption 3.3 with the harmonic averaging weights (2.5). Let \({\mathbf {t}}_h\) be given by (3.15), \(\eta _{\mathrm {DF},D}\) by (3.6), \(\eta _{\mathrm {R},D}\) by (3.7), and \({\mathcal {D}}_h^*={\mathcal {D}}_h\) in Theorem 3.2. Then, for each \(D\in {\mathcal {D}}_h\), there holds

$$\begin{aligned} \eta _{\mathrm {DF},D}&\lesssim \parallel \mid \!u-u_h\!\parallel \mid _{{\mathcal {T}}_{V_D}}, \end{aligned}$$
(4.1)
$$\begin{aligned} \eta _{\mathrm {R},D}&\lesssim \parallel \mid \!u-u_h\!\parallel \mid _{{\mathcal {T}}_{V_D}}. \end{aligned}$$
(4.2)

Proof

The proof of Theorem 4.1 is split into two parts, which establish the local efficiency of the diffusive flux estimator and of the residual estimator, respectively, on anisotropic finite element meshes.

I. Local efficiency of the diffusive flux estimator

Recall the affine transformation matrix \(A_K\) defined in Sect. 2.2 and for each given \({\mathbf {q}}_h\in \mathbf {RTN}(K)\), define the corresponding function \(\bar{{\mathbf {q}}}_h\in \mathbf {RTN}({\bar{K}})\) via the usual Piola transformation [10, 32] such that

$$\begin{aligned} {\mathbf {q}}_h=\frac{A_K}{\det A_K}\bar{{\mathbf {q}}}_h,~\bar{\nabla }=A_K^\top \nabla ,~ |\sigma |{\mathbf {q}}_h\cdot {\mathbf {n}}_{\sigma ,K}=|\bar{\sigma }|\bar{{\mathbf {q}}}_h\cdot {\mathbf {n}}_{\bar{\sigma },{\bar{K}}}, \end{aligned}$$
(4.3)

which, together with (2.6), implies the following estimate

$$\begin{aligned} \Vert C_K^{-1}{\mathbf {q}}_h\Vert _K^2&= |\det A_K|\Vert \frac{C_K^{-1}A_K}{\det A_K}\bar{{\mathbf {q}}}_h\Vert _{{\bar{K}}}^2 \\&\lesssim \frac{|{\bar{K}}|}{|K|}\sum _{\bar{\sigma }\in {\mathcal {E}}_{{\bar{K}}}} \Vert \bar{{\mathbf {q}}}_h\cdot {\mathbf {n}}_{\bar{\sigma },{\bar{K}}}\Vert _{\bar{\sigma }}^2 \\&= \frac{|{\bar{K}}|}{|K|}\sum _{\sigma \in {\mathcal {E}}_{K}} \frac{|\sigma |}{|\bar{\sigma }|}\Vert {\mathbf {q}}_h\cdot {\mathbf {n}}_{\sigma ,K}\Vert _\sigma ^2\\&\lesssim \sum _{\sigma \in {\mathcal {E}}_{K}}h_{\sigma ,K}^{-1}\Vert {\mathbf {q}}_h\cdot {\mathbf {n}}_{\sigma }\Vert _\sigma ^2, \end{aligned}$$

i.e.,

$$\begin{aligned} \Vert C_K^{-1}{\mathbf {q}}_h\Vert _K^2\lesssim \sum _{\sigma \in {\mathcal {E}}_{K}} h_{\sigma ,K}^{-1}\Vert {\mathbf {q}}_h\cdot {\mathbf {n}}_{\sigma }\Vert _\sigma ^2. \end{aligned}$$
(4.4)
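The flux-preserving property of the Piola transformation in (4.3), \(|\sigma |{\mathbf {q}}_h\cdot {\mathbf {n}}_{\sigma ,K}=|\bar{\sigma }|\bar{{\mathbf {q}}}_h\cdot {\mathbf {n}}_{\bar{\sigma },{\bar{K}}}\), can be checked numerically on a stretched triangle. The sketch below assumes \(d=2\), the reference triangle with vertices \((0,0),(1,0),(0,1)\), an affine map \(x=A_K\bar{x}+b\) with \(\det A_K>0\), and a lowest-order function \(\bar{{\mathbf {q}}}_h(\bar{x})=\alpha +\beta \bar{x}\); the concrete numbers and function names are hypothetical.

```python
def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def matvec(A, v):
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])


def scaled_outward_normal(p, q):
    """|sigma| * n for the edge p -> q of a counterclockwise triangle."""
    return (q[1] - p[1], -(q[0] - p[0]))


def side_fluxes(A, b, alpha, beta):
    """Return pairs (|sigma_ref| q_ref . n_ref, |sigma| q . n) for the three sides."""
    ref = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    phys = [tuple(c + s for c, s in zip(matvec(A, v), b)) for v in ref]
    d = det2(A)
    out = []
    for i in range(3):
        j = (i + 1) % 3
        mid = ((ref[i][0] + ref[j][0]) / 2.0, (ref[i][1] + ref[j][1]) / 2.0)
        q_ref = (alpha[0] + beta * mid[0], alpha[1] + beta * mid[1])
        Aq = matvec(A, q_ref)
        q_phys = (Aq[0] / d, Aq[1] / d)          # Piola transformation (4.3)
        n_ref = scaled_outward_normal(ref[i], ref[j])
        n_phys = scaled_outward_normal(phys[i], phys[j])
        out.append((q_ref[0] * n_ref[0] + q_ref[1] * n_ref[1],
                    q_phys[0] * n_phys[0] + q_phys[1] * n_phys[1]))
    return out


# Hypothetical anisotropic element: strongly stretched in the x-direction.
A_K = [[4.0, 1.0], [0.0, 0.05]]
b_K = (1.0, 2.0)
fluxes = side_fluxes(A_K, b_K, alpha=(0.3, -0.7), beta=0.5)
```

The two entries of each pair agree up to rounding regardless of the aspect ratio of the element, which is the property exploited in the third line of the estimate leading to (4.4).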

Let now \(K\) be an arbitrary element in the simplicial mesh \({\mathcal {S}}_D\) of a given dual volume \(D\in {\mathcal {D}}_h\) and set \({\mathbf {q}}_h=a\nabla u_h+{\mathbf {t}}_h\). Together with (5.3) in the proof of [37, Lemma 5.2], the inequality (4.4) implies that

$$\begin{aligned} h_{\min ,K}^2\Vert C_K^{-1}(a^{\frac{1}{2}}\nabla u_h+a^{-\frac{1}{2}}{\mathbf {t}}_h)\Vert _K^2&= a_K^{-1}h_{\min ,K}^2\Vert C_K^{-1}{\mathbf {q}}_h\Vert _K^2 \nonumber \\&\lesssim a_K^{-1}h_{\min ,K}^2\sum _{\sigma \in {\mathcal {E}}_K\cap {\mathcal {G}}_h^{\mathrm {int}}} h_{\sigma ,K}^{-1}\Vert \omega _{L,\sigma }[\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!]\Vert _\sigma ^2.\nonumber \\ \end{aligned}$$
(4.5)

Considering a fixed \(\sigma =\sigma _{K,L}\in {\mathcal {E}}_K\cap {\mathcal {G}}_h^{\mathrm {int}}\), the proof of the estimate

$$\begin{aligned} h_{\min ,K}h_{\sigma ,K}^{-\frac{1}{2}}\Vert [\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!]\Vert _\sigma \lesssim \sum _{M\in \{K,L\}}a_M^{\frac{1}{2}}\parallel \mid \!u-u_h\!\parallel \mid _M \end{aligned}$$

is standard; we refer to the proof of [23, Theorem 2]. Recall that, as in [22, 23], the proof uses the side and element bubble functions, the anisotropic inverse and trace inequalities, and the equivalence of norms on finite-dimensional spaces (this is where the assumption that \(f\) is a piecewise polynomial enters). Therefore we do not repeat the proof itself. It then follows that

$$\begin{aligned} \omega _{L,\sigma }a_K^{-\frac{1}{2}}h_{\min ,K}h_{\sigma ,K}^{-\frac{1}{2}}\Vert [\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!]\Vert _\sigma \lesssim \sum _{M\in \{K,L\}}\omega _{L,\sigma }a_K^{-\frac{1}{2}}a_M^{\frac{1}{2}}\parallel \mid \!u-u_h\!\parallel \mid _M. \end{aligned}$$

Following the final part in the proof of [37, Lemma 5.2] and noting the equivalence (2.7), we obtain

$$\begin{aligned} \eta _{\mathrm {DF},D}^2\lesssim \parallel \mid \!u-u_h\!\parallel \mid _{{\mathcal {T}}_{V_D}}^2, \end{aligned}$$

which is (4.1).

II. Local efficiency of the residual estimator

By (4.3), we have, for any fixed \({\mathbf {q}}_h\in {\mathbf {H}}(\mathrm {div},K)\),

$$\begin{aligned} \nabla \cdot {\mathbf {q}}_h=\frac{1}{\det A_K}\bar{\nabla }\cdot \bar{{\mathbf {q}}}_h. \end{aligned}$$
(4.6)
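Relation (4.6) can likewise be verified numerically. The following sketch assumes \(d=2\), an affine map \(x=A_K\bar{x}+b\) and a lowest-order reference function \(\bar{{\mathbf {q}}}_h(\bar{x})=\alpha +\beta \bar{x}\), whose reference divergence is \(2\beta \); the physical divergence is approximated by central differences, which are exact up to rounding for affine fields. Names and data are hypothetical.

```python
def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def inv2(A):
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]


def matvec(A, v):
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])


def q_phys(x, A, b, alpha, beta):
    """Piola image q(x) = A q_ref(x_ref) / det A with x_ref = A^{-1}(x - b)."""
    x_ref = matvec(inv2(A), (x[0] - b[0], x[1] - b[1]))
    q_ref = (alpha[0] + beta * x_ref[0], alpha[1] + beta * x_ref[1])
    Aq = matvec(A, q_ref)
    d = det2(A)
    return (Aq[0] / d, Aq[1] / d)


def divergence_fd(q, x, h=1e-3):
    """Central-difference approximation of div q at the point x."""
    dqx = (q((x[0] + h, x[1]))[0] - q((x[0] - h, x[1]))[0]) / (2.0 * h)
    dqy = (q((x[0], x[1] + h))[1] - q((x[0], x[1] - h))[1]) / (2.0 * h)
    return dqx + dqy


A_K = [[4.0, 1.0], [0.0, 0.05]]   # hypothetical anisotropic map, det = 0.2
b_K = (1.0, 2.0)
alpha, beta = (0.3, -0.7), 0.5

div_ref = 2.0 * beta
div_phys = divergence_fd(lambda x: q_phys(x, A_K, b_K, alpha, beta), (2.0, 2.02))
```

Here `div_phys` agrees with `div_ref / det2(A_K)`, i.e. with the right-hand side of (4.6).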

Set \({\mathbf {q}}_h=a\nabla u+{\mathbf {t}}_h\). Considering a fixed dual volume \(D\in {\mathcal {D}}_h\), by the standard argument we get, for any \(K\in {\mathcal {S}}_D\),

$$\begin{aligned} \Vert f-\nabla \cdot {\mathbf {t}}_h\Vert _K&= \Vert \nabla \cdot {\mathbf {q}}_h\Vert _K \\&= |\det A_K|^{\frac{1}{2}}\Vert \frac{1}{\det A_K}\bar{\nabla }\cdot \bar{{\mathbf {q}}}_h\Vert _{{\bar{K}}}\\&\lesssim |\det A_K|^{-\frac{1}{2}}\Vert \bar{{\mathbf {q}}}_h\Vert _{{\bar{K}}}\\&= \Vert A_K^{-1}{\mathbf {q}}_h\Vert _K \\&\lesssim \Vert C_K^{-1}{\mathbf {q}}_h\Vert _K. \end{aligned}$$

In the above inequality, we have used the equivalence of norms on finite dimensional spaces, because of the assumption that \(f\) is a piecewise polynomial. Then it holds that

$$\begin{aligned} \Vert f-\nabla \cdot {\mathbf {t}}_h\Vert _D \lesssim a_{\max ,D}^{\frac{1}{2}}\Vert C_D^{-1}(a^{\frac{1}{2}}\nabla u+a^{-\frac{1}{2}}{\mathbf {t}}_h)\Vert _D. \end{aligned}$$

The remainder of the proof is entirely analogous to the proof of [37, Lemma 5.3].

Next we consider the a posteriori estimate in the dual norm (2.10), without any restriction on the distribution of the discontinuities (\(a\) can be piecewise constant on \({\mathcal {T}}_h\) or on \({\mathcal {D}}_h\)) or on the type of averaging (both arithmetic and harmonic averaging are allowed) in Assumption 3.3.

Theorem 4.2

(Global efficiency and robustness of the dual norm estimate by the direct prescription) Let \(f\) be a piecewise polynomial of degree \(m\) on \({\mathcal {S}}_h\), \(u\) the weak solution of problem (1.1), and \(u_h\in X_h^0\) satisfy Assumption 3.3 with any weight satisfying (2.2). Let \({\mathbf {t}}_h\) be given by (3.15), \(\eta _{\mathrm {DF},D}\) by (3.12), \(\eta _{\mathrm {R},D}\) by (3.13), and \({\mathcal {D}}_h^*={\mathcal {D}}_h\) in Corollary 3.2. Then there holds

$$\begin{aligned} \Big \{\sum _{D\in {\mathcal {D}}_h^*}(\eta _{\mathrm R,D}^2+\eta _{\mathrm {DF},D}^2)\Big \}^{1/2}\lesssim \parallel \mid \!u-u_h\!\parallel \mid _\#. \end{aligned}$$
(4.7)

Proof

Following the same arguments as in the proofs of (4.1) and (4.2) (see also the proof of [37, Theorem 5.4]), we have

$$\begin{aligned} \sum _{D\in {\mathcal {D}}_h}(\eta _{\mathrm {DF},D}^2+\eta _{\mathrm R,D}^2) \lesssim \sum _{K\in {\mathcal {S}}_h}h_{\min ,K}^2\Vert f+\nabla \cdot (a\nabla u_h)\Vert _K^2+\sum _{\sigma \in {\mathcal {G}}_h^{\mathrm {int}}}h_{\min ,\sigma }^2h_\sigma ^{-1}\Vert [\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!]\Vert _\sigma ^2, \end{aligned}$$
(4.8)

which means that the present estimates represent a lower bound for the residual ones on anisotropic meshes given by Kunert in [23]. The rest of the proof is based on the tools from [23] and is analogous to the standard technique of Verfürth [33], i.e., it relies on bubble functions (see also [22]). The only differences here are the anisotropic weights of the residual norms and the use of anisotropic inverse inequalities for bubble functions. For more details, we refer to [23]. Therefore we state without proof that

$$\begin{aligned} \Big (\sum _{K\in {\mathcal {S}}_h}h_{\min ,K}^2\Vert f+\nabla \cdot (a\nabla u_h)\Vert _K^2\Big )^{\frac{1}{2}}&\lesssim \parallel \mid \!u-u_h\!\parallel \mid _\#, \end{aligned}$$
(4.9)
$$\begin{aligned} \Big (\sum _{\sigma \in {\mathcal {G}}_h^{\mathrm {int}}}h_{\min ,\sigma }^2h_\sigma ^{-1}\Vert [\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!]\Vert _\sigma ^2\Big )^{\frac{1}{2}}&\lesssim \parallel \mid \!u-u_h\!\parallel \mid _\#, \end{aligned}$$
(4.10)

which implies that (4.7) holds.

4.2 Construction of \({\mathbf {t}}_h\) by the Local Neumann/Dirichlet Problem of Sect. 3.3.2

We focus here on the equilibrated flux \({\mathbf {t}}_h\) defined in Sect. 3.3.2 by (3.18). As in the previous subsection, we first consider the energy norm (2.9) and then pass to the dual norm (2.10).

Theorem 4.3

(Local efficiency and robustness of the energy norm estimate with the harmonic weighting by the local Neumann/Dirichlet problem) Let \(a\) be piecewise constant on \({\mathcal {D}}_h\), \(f\) a piecewise polynomial of degree \(m\) on \({\mathcal {S}}_h\), \(u\) the weak solution of problem (1.1), and \(u_h\in X_h^0\) satisfy Assumption 3.3 with the harmonic averaging weights (2.5). Let \({\mathbf {t}}_h\) be given by (3.18), \(\eta _{\mathrm {DF},D}\) by (3.6), \(\eta _{\mathrm {R},D}\) by (3.7), and \({\mathcal {D}}_h^{\mathrm {int},*}={\mathcal {S}}_h\) and \({\mathcal {D}}_h^{\mathrm {ext},*}=\emptyset \) in Theorem 3.2. Then, for each \(D\in {\mathcal {D}}_h\), there holds

$$\begin{aligned}&\eta _{\mathrm {R},K}\lesssim a_{\min ,K}^{-\frac{1}{2}}h_{\min ,K}\Vert f-f_h\Vert _K, ~~~\forall K\in {\mathcal {S}}_D,\end{aligned}$$
(4.11)
$$\begin{aligned}&\eta _{\mathrm {DF},D}\lesssim \parallel \mid \!u-u_h\!\parallel \mid _{{\mathcal {T}}_{V_D}}, \end{aligned}$$
(4.12)

where \(f_h\) is the piecewise average over \({\mathcal {S}}_h\), i.e., \(f_h|_K=|K|^{-1}(f,1)_K,~K\in {\mathcal {S}}_h\).

Proof

The result (4.11) is an immediate consequence of the second equation of (3.18). It thus remains to prove (4.12).

For a given \(D\in {\mathcal {D}}_h\), we need a hybridized version of (3.18), cf. [10, 32]. Therein, the first equation of (3.18) is replaced by

$$\begin{aligned}&(h_{\min ,D}^2C_D^{-\top }C_D^{-1}(a^{-1}{\mathbf {t}}_h+\nabla u_h),{\mathbf {q}}_h)_D-(\psi _h,\nabla \cdot {\mathbf {q}}_h)_D +\sum _{K\in {\mathcal {S}}_D}\langle {\mathbf {q}}_h\cdot {\mathbf {n}}_K,\lambda _h\rangle _{\partial K}\\&=0,\forall {\mathbf {q}}_h\in \mathbf {RTN}_{\mathrm N,0}^*({\mathcal {S}}_D), \end{aligned}$$

where \(\mathbf {RTN}_{\mathrm N,0}^*({\mathcal {S}}_D)\) is the same space as \(\mathbf {RTN}_{\mathrm N,0}({\mathcal {S}}_D)\) of Sect. 3.3.2 with, however, no normal trace continuity constraint. \(\lambda _h\) is the Lagrange multiplier, a piecewise constant function on the sides \({\mathcal {G}}_D\) (set \(\lambda _h=0\) on \(\partial \varOmega \)). Thus we are able to set \({\mathbf {q}}_h={\mathbf {t}}_h+a\nabla u_h\) in the above equation, which leads to

$$\begin{aligned} \Vert h_{\min ,D}C_D^{-1}(a^{\frac{1}{2}}\nabla u_h+a^{-\frac{1}{2}}{\mathbf {t}}_h)\Vert _D^2&= (\psi _h,f_h+\nabla \cdot (a\nabla u_h))_D-\sum _{\sigma \in {\mathcal {G}}_D^{\mathrm {int}}} \langle [\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!],\lambda _h\rangle _\sigma \nonumber \\&-\sum _{\sigma \in {\mathcal {G}}_D^{\mathrm {ext}}\cap {\mathcal {G}}_h^{\mathrm {int}}}\langle (a\nabla u_h\cdot {\mathbf {n}}_D)|_D\!-\!\{\!\{a\nabla u_h\cdot {\mathbf {n}}_D\}\!\}_\omega ,\lambda _h\rangle _\sigma ,\nonumber \\ \end{aligned}$$
(4.13)

since \({\mathbf {t}}_h\in \mathbf {RTN}_{\mathrm N}({\mathcal {S}}_D)\) and \(\nabla \cdot {\mathbf {t}}_h=f_h\) by (3.18).

Similarly to the approach of [37] and [36] (cf. also [7, 8]), we define a postprocessing \(\widetilde{\psi }_h\in M({\mathcal {S}}_D)\) of \(\psi _h\) such that

$$\begin{aligned} -a\nabla \widetilde{\psi }_h&= h_{\min ,D}^2C_D^{-\top }C_D^{-1}({\mathbf {t}}_h+a\nabla u_h),~~~\forall K\in {\mathcal {S}}_D, \end{aligned}$$
(4.14)
$$\begin{aligned} \frac{(\widetilde{\psi }_h,1)_K}{|K|}&= \psi _h|_K, \forall K\in {\mathcal {S}}_D, \end{aligned}$$
(4.15)
$$\begin{aligned} \frac{\langle \widetilde{\psi }_h,1\rangle _\sigma }{|\sigma |}&= \lambda _h|_\sigma ,~\forall \sigma \in {\mathcal {G}}_D, \end{aligned}$$
(4.16)

where \(M({\mathcal {S}}_D)\) is a space of particular piecewise polynomials on \({\mathcal {S}}_D\) of total degree \(\le 2\).

If \(D\in {\mathcal {D}}_h^{\mathrm {int}}\), then \((\psi _h,1)_D=0\), as \(\psi _h\in \mathbb P_0^*({\mathcal {S}}_D)\), see (3.18). Observing (4.15), we have \((\widetilde{\psi }_h,1)_D/|D|=0\), i.e., the mean value of \(\widetilde{\psi }_h\) over \(D\) vanishes. If \(D\in {\mathcal {D}}_h^{\mathrm {ext}}\), then \(\lambda _h=0\) on \(\partial D\cap \partial \varOmega \), and (4.16) implies that \(\langle \widetilde{\psi }_h,1\rangle _\sigma /|\sigma |=0\), i.e., the mean value of \(\widetilde{\psi }_h\) over the sides lying in \(\partial \varOmega \) vanishes. Thus for both \(D\in {\mathcal {D}}_h^{\mathrm {int}}\) and \(D\in {\mathcal {D}}_h^{\mathrm {ext}}\), using the Poincaré/Friedrichs inequality (cf. [35]), similarly to the proof of (3.2) (cf. [23]) we obtain

$$\begin{aligned} \Vert \widetilde{\psi }_h\Vert _D\lesssim \Vert C_D^\top \nabla \widetilde{\psi }_h\Vert _D. \end{aligned}$$
(4.17)

We also use the anisotropic trace inequality [23], i.e., for any \(K\) sharing a side \(\sigma \in {\mathcal {G}}_D\), it holds that

$$\begin{aligned} \Vert v\Vert _\sigma ^2\lesssim h_\sigma ^{-1}(\Vert v\Vert _K^2+\Vert C_K^\top \nabla v\Vert _K^2), ~~~\forall v\in H^1(K). \end{aligned}$$
(4.18)

Recalling (4.5), the harmonic averaging (2.5) and the equivalence (2.7), the above results imply that

$$\begin{aligned}&h_{\min ,D}^2\Vert C_D^{-1}(a^{\frac{1}{2}}\nabla u_h+a^{-\frac{1}{2}}{\mathbf {t}}_h)\Vert _D^2 \\&=(\widetilde{\psi }_h,f_h+\nabla \cdot (a\nabla u_h))_D-\sum _{\sigma \in {\mathcal {G}}_D^{\mathrm {int}}} \langle [\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!],\widetilde{\psi }_h\rangle _\sigma \\&\ \ \ -\sum _{\sigma \in {\mathcal {G}}_D^{\mathrm {ext}}\cap {\mathcal {G}}_h^{\mathrm {int}}}\langle (a\nabla u_h\cdot {\mathbf {n}}_D)|_D-\{\!\{a\nabla u_h\cdot {\mathbf {n}}_D\}\!\}_\omega ,\widetilde{\psi }_h\rangle _\sigma \\&\lesssim \Vert \widetilde{\psi }_h\Vert _D\Vert f_h+\nabla \cdot (a\nabla u_h)\Vert _D \\&\ \ \ +\Big (\sum _{K\in {\mathcal {S}}_D}\sum _{\sigma \in {\mathcal {E}}_K\cap {\mathcal {G}}_h^{\mathrm {int}}}h_\sigma ^{-1} \Vert [\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!]\Vert _\sigma ^2\Big )^{\frac{1}{2}} (\Vert \widetilde{\psi }_h\Vert _D^2+\Vert C_D^\top \nabla \widetilde{\psi }_h\Vert _D^2)^{\frac{1}{2}}\\&\lesssim \Vert C_D^\top \nabla \widetilde{\psi }_h\Vert _D\Big \{\Vert f_h+\nabla \cdot (a\nabla u_h)\Vert _D +\Big (\sum _{K\in {\mathcal {S}}_D}\sum _{\sigma \in {\mathcal {E}}_K\cap {\mathcal {G}}_h^{\mathrm {int}}}h_\sigma ^{-1} \Vert [\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!]\Vert _\sigma ^2\Big )^{\frac{1}{2}}\Big \}. \end{aligned}$$

Therefore we obtain

$$\begin{aligned}&h_{\min ,D}\Vert C_D^{-1}(a^{\frac{1}{2}}\nabla u_h+a^{-\frac{1}{2}}{\mathbf {t}}_h)\Vert _D \nonumber \\&\quad \lesssim a_{\min ,D}^{-\frac{1}{2}}h_{\min ,D}\Big \{\Vert f_h+\nabla \cdot (a\nabla u_h)\Vert _D +\Big (\sum _{K\in {\mathcal {S}}_D}\sum _{\sigma \in {\mathcal {E}}_K\cap {\mathcal {G}}_h^{\mathrm {int}}}h_\sigma ^{-1} \Vert [\![a\nabla u_h\cdot {\mathbf {n}}_\sigma ]\!]\Vert _\sigma ^2\Big )^{\frac{1}{2}}\Big \}.\nonumber \\ \end{aligned}$$
(4.19)

Then by the standard argument in the proof of Theorem 4.1 (also cf. [23] and [37]), we complete the proof.

For completeness, we also give the corresponding result in the dual norm (2.10), without any restriction on the distribution of discontinuities or type of averaging in Assumption 3.3. The proof is an easy consequence of the previous results.

Corollary 4.1

(Global efficiency and robustness of the dual norm estimate by the local Neumann/Dirichlet problem) Let \(f\) be a piecewise polynomial of degree \(m\) on \({\mathcal {S}}_h\), \(u\) the weak solution of problem (1.1), and \(u_h\in X_h^0\) satisfy Assumption 3.3 with any weight satisfying (2.2). Let \({\mathbf {t}}_h\) be given by (3.18), \(\eta _{\mathrm {DF},D}\) by (3.12), \(\eta _{\mathrm {R},D}\) by (3.13), and \({\mathcal {D}}_h^{\mathrm {int},*}={\mathcal {S}}_h\) and \({\mathcal {D}}_h^{\mathrm {ext},*}=\emptyset \) in Corollary 3.2. Then there holds

$$\begin{aligned} \Big \{\sum _{D\in {\mathcal {D}}_h^*}(\eta _{\mathrm R,D}^2+\eta _{\mathrm {DF},D}^2)\Big \}^{1/2}\lesssim \parallel \mid \!u-u_h\!\parallel \mid _\#. \end{aligned}$$
(4.20)

Remark 4.1

(The alignment between the meshes and discontinuous coefficient \(a\)) From Theorems 4.1 and 4.3 we know that, in order to get robust energy norm error estimates, the diffusion coefficient \(a\) has to be piecewise constant on \({\mathcal {D}}_h\). However, if we first constructed a simplicial mesh \({\mathcal {T}}_h\) of \(\varOmega \) and then a dual mesh \({\mathcal {D}}_h\) as in Sect. 2.1, it would be very difficult to keep the dual mesh aligned with the discontinuity, especially under adaptive refinement. A possible solution is to first define the dual mesh \({\mathcal {D}}_h\) and then the primal one \({\mathcal {T}}_h\), as presented in [37].

On the other hand, if \(a\) is piecewise constant on the primal mesh \({\mathcal {T}}_h\) (it is easy to keep the primal mesh aligned with the discontinuity) but harmonic averaging (2.5) has been used in the interior sides of each \(D\in {\mathcal {D}}_h\), one has the local efficiency directly on each dual volume \(D\in {\mathcal {D}}_h\) and not on the patch \({\mathcal {T}}_{V_D}\) which is larger than \(D\). Recalling the proof of Theorem 4.1, in this case it gives

$$\begin{aligned} \eta _{\mathrm {DF},D}&\lesssim \parallel \mid \!u-u_h\!\parallel \mid _{{\mathcal {D}}}, \end{aligned}$$
(4.21)
$$\begin{aligned} \eta _{\mathrm {R},D}&\lesssim \frac{a_{\max ,D}}{a_{\min ,D}}\parallel \mid \!u-u_h\!\parallel \mid _{{\mathcal {D}}}. \end{aligned}$$
(4.22)

Similarly, the final upper bound of Theorem 4.3 changes from (4.12) to (4.22).

Note that Theorem 4.2 and Corollary 4.1 give full robustness with respect to the discontinuities in \(a\) without any restriction on the distribution of the discontinuities (\(a\) can be piecewise constant on \({\mathcal {T}}_h\) or on \({\mathcal {D}}_h\)) or on the type of averaging (both arithmetic and harmonic averaging are allowed) in Assumption 3.3. However, these results are established in the dual norm \(\parallel \mid \!\cdot \!\parallel \mid _\#\) and one only has global (and not local) efficiency. For a similar discussion, see also [37].

5 Application of the Error Estimates and Efficiency Results to Different Numerical Methods

In the previous sections, we have presented the a posteriori error estimates and their efficiency in general form, without reference to any particular numerical scheme. Here we briefly describe how the results of Sects. 3 and 4 can be applied to different numerical methods, such as the finite volume-type, finite difference and finite element methods. The definitions of these numerical methods and their mutual relations are summarized by Vohralík in [37], so we do not list them here.

For the finite volume-type and finite difference methods, the discussion is identical to that in [37, Sect. 6]. In what follows we focus only on the finite element method.

First, the finite element method can be written equivalently in the form of Assumption 3.3 when both \(a\) and \(f\) are piecewise constants on the mesh \({\mathcal {T}}_h\) (cf. [37]). For general \(f\), following [37] we have the following result.

Theorem 5.1

(Energy norm a posteriori error estimate for the finite element method) Let \(u\) be the weak solution of problem (1.1) and \(u_h\) its finite element approximation; let \(\tilde{u}\) be the weak solution of problem (1.1) with \(f\) replaced by \(f_h\) and \(\tilde{u}_h\) its finite element approximation. Then

$$\begin{aligned} \parallel \mid \!u-u_h\!\parallel \mid \lesssim \parallel \mid \!\tilde{u}-\tilde{u}_h\!\parallel \mid +(m_1(u-\tilde{u},{\mathcal {T}}_h)+m_1(u_h-\tilde{u}_h, {\mathcal {T}}_h))\Big (\sum _{K\in {\mathcal {T}}_h}\eta _{\mathrm {osc},K}^2\Big )^{\frac{1}{2}}, \end{aligned}$$
(5.1)

where

$$\begin{aligned} \eta _{\mathrm {osc},K}:=a_{\min ,K}^{-\frac{1}{2}}h_{\min ,K}\Vert f-f_h\Vert _K,~~~K\in {\mathcal {T}}_h. \end{aligned}$$

Proof

Following the proof of [37, Theorem 6.1], the anisotropic version of the Poincaré inequality (4.17) and the definition of the alignment measure yield (5.1).

We proceed similarly for the dual norm setting.

Corollary 5.1

(Dual norm a posteriori error estimate for the finite element method) Let \(u\) be the weak solution of problem (1.1) and \(u_h\) its finite element approximation; let \(\tilde{u}\) be the weak solution of problem (1.1) with \(f\) replaced by \(f_h\) and \(\tilde{u}_h\) its finite element approximation. Then

$$\begin{aligned} \parallel \mid \!u-u_h\!\parallel \mid _{\#}\lesssim \parallel \mid \!\tilde{u}-\tilde{u}_h\!\parallel \mid _{\#} +(m_2(u-\tilde{u},{\mathcal {T}}_h)+m_2(u_h-\tilde{u}_h,{\mathcal {T}}_h))\Big (\sum _{K\in {\mathcal {T}}_h}\eta _{\mathrm {osc},K}^2\Big )^{\frac{1}{2}}, \end{aligned}$$
(5.2)

where

$$\begin{aligned} \eta _{\mathrm {osc},K}:=h_{\min ,K}\Vert f-f_h\Vert _K,~~~K\in {\mathcal {T}}_h. \end{aligned}$$

Note that the alignment measures \(m_1(v,{\mathcal {T}}_h)\) and \(m_2(v,{\mathcal {T}}_h)\) have been introduced in Definition 2.1 to measure how well the anisotropic mesh \({\mathcal {T}}_h\) is aligned with an anisotropic function \(v\). Hence, it is reasonable to expect that, for any of the conforming approximations, i.e., \(\tilde{u},u_h\) or \(\tilde{u}_h\),

$$\begin{aligned} m_1(u-u_h,{\mathcal {T}}_h)\sim m_1(u-\tilde{u},{\mathcal {T}}_h)\sim m_1(u_h-\tilde{u}_h,{\mathcal {T}}_h)\sim m_1(\tilde{u}-\tilde{u}_h,{\mathcal {T}}_h)\sim 1, \end{aligned}$$

and

$$\begin{aligned} m_2(u-u_h,{\mathcal {T}}_h)\sim m_2(u-\tilde{u},{\mathcal {T}}_h)\sim m_2(u_h-\tilde{u}_h,{\mathcal {T}}_h)\sim m_2(\tilde{u}-\tilde{u}_h, {\mathcal {T}}_h)\sim 1, \end{aligned}$$

as long as the anisotropic mesh \({\mathcal {T}}_h\) sufficiently reflects the anisotropy of the solution.

When \(a\) is piecewise constant on \({\mathcal {D}}_h\), to obtain an a posteriori error estimate, one can define the equilibrated flux \({\mathbf {t}}_h\) as stated in [37, Sect. 6.3]. After separating the oscillations with respect to \(f\) as in Theorem 5.1, robustness can then be proved as in Theorem 5.1.

For more information on the energy norm and dual norm estimates for the various numerical methods, we refer to [37, Tables 1 and 2].

6 Numerical Experiments

In this section, we present the results of several numerical experiments and study the asymptotic behaviour of the error estimators on anisotropic meshes. Two examples are considered: the first focuses on robustness with respect to the anisotropy of the mesh, the second on robustness with respect to the discontinuous coefficient. For convenience, in the following two subsections we only present the results for the energy norm error estimates.

6.1 The Diffusion Problem with the Constant Coefficient

We first consider problem (1.1) with a constant coefficient, where \(\varOmega =(0,1)\times (0,1)\), \(a=2/(k^2+1),~f=2\pi ^2\sin (k\pi x)\sin (\pi y)\) and \(k>0\). The exact solution is \(u=\sin (k\pi x)\sin (\pi y)\); it is shown in the left part of Fig. 4. For \(k\gg 1\) the exact solution varies rapidly in the \(x\)-axis direction, which calls for anisotropic meshes that reflect this variation.
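The consistency of this test case can be checked numerically: since \(-a\Delta u=a(k^2+1)\pi ^2u=2\pi ^2u\), the stated \(f\) matches the stated \(u\). The sketch below verifies this with a five-point finite difference stencil at a few sample points; the step size, the sample points and the function names are choices made here for illustration, and the value \(k=20\) is the one used in Fig. 4.

```python
import math


def a_coef(k):
    return 2.0 / (k * k + 1.0)


def u_exact(x, y, k):
    return math.sin(k * math.pi * x) * math.sin(math.pi * y)


def f_source(x, y, k):
    return 2.0 * math.pi ** 2 * math.sin(k * math.pi * x) * math.sin(math.pi * y)


def minus_div_a_grad_u(x, y, k, h=1e-4):
    """-a * Laplace(u) approximated by the standard five-point stencil."""
    lap = (u_exact(x + h, y, k) + u_exact(x - h, y, k)
           + u_exact(x, y + h, k) + u_exact(x, y - h, k)
           - 4.0 * u_exact(x, y, k)) / (h * h)
    return -a_coef(k) * lap


k = 20
points = [(0.13, 0.4), (0.5, 0.5), (0.77, 0.21)]
max_defect = max(abs(minus_div_a_grad_u(x, y, k) - f_source(x, y, k))
                 for x, y in points)
# Homogeneous Dirichlet data: for integer k, u vanishes on the boundary
# of the unit square up to rounding.
boundary_values = [u_exact(0.0, 0.3, k), u_exact(1.0, 0.3, k),
                   u_exact(0.3, 0.0, k), u_exact(0.3, 1.0, k)]
```

The remaining defect is the truncation error of the stencil, which grows like \((k\pi )^4h^2\) in the \(x\)-direction; this is another view of why the resolution in \(x\) has to be much finer than in \(y\) when \(k\gg 1\).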

Fig. 4 The exact solution with \(k=20\) (left) and the anisotropic mesh \((30\times 5)\) (right)

To facilitate comparison, we first present the numerical results of the error estimates of Vohralík [37] on the anisotropic meshes. Then we present the numerical results of our estimators under the same conditions. The comparison shows that our estimators are reliable and efficient on anisotropic meshes.

We consider the case \(k=100\). As shown in the right part of Fig. 4, we construct a sequence of \(m\times n\) meshes with \(m\) uniform subintervals in the \(x\)-axis direction and \(n\) uniform subintervals in the \(y\)-axis direction. The corresponding mesh information is listed in Table 1.

Table 1 The meshes on different levels

On this kind of mesh, the vertex- and cell-centered finite volume methods are equivalent [37], and so are here simply called the finite volume method. Tables 2 and 3 present the alignment measures for the finite volume and finite element methods on different meshes, respectively, where \(m_1(u-u_h,{\mathcal {T}}_h)=m_2(u-u_h,{\mathcal {T}}_h)\) since \(a\) is a constant on \(\varOmega \). The small size of the alignment measures further indicates that the meshes are well adapted to the exact solution; hence efficient error estimation is to be expected.

Table 2 The alignment measure for the finite volume method with the constant coefficient
Table 3 The alignment measure for the finite element method with the constant coefficient

6.1.1 The Error Estimates Developed by Vohralík

A posteriori error estimates developed by Vohralík are very successful for various numerical methods on isotropic meshes [37]. However, they fail on anisotropic meshes as shown in this subsection.

Let the assumptions of Theorem 3.2 be satisfied. Vohralík gives the following energy norm error estimate

$$\begin{aligned} \parallel \mid \!u-u_h\!\parallel \mid \le \zeta , \end{aligned}$$

where

$$\begin{aligned} \zeta :=\left\{ \sum _{D\in {\mathcal {D}}_h^*}(\zeta _{\mathrm R,D}+\zeta _{\mathrm {DF},D})^2\right\} ^{1/2}, \end{aligned}$$

the diffusive flux estimator \(\zeta _{\mathrm {DF},D}\) is given by

$$\begin{aligned} \zeta _{\mathrm {DF},D}:=\Vert a^{\frac{1}{2}}\nabla u_h+a^{-\frac{1}{2}}{\mathbf {t}}_h\Vert _D,~~~D\in {\mathcal {D}}_h^*, \end{aligned}$$

and the residual estimator \(\zeta _{\mathrm R,D}\) is given by

$$\begin{aligned} \zeta _{\mathrm R,D}:=m_{D,a}\Vert f-\nabla \cdot {\mathbf {t}}_h\Vert _D,~~~D\in {\mathcal {D}}_h^*, \end{aligned}$$

where

$$\begin{aligned} m_{D,a}^2:=C_{\mathrm P,D}\frac{h_{\max ,D}^2}{a_{\min ,D}},~~D\in {\mathcal {D}}_h^{\mathrm {int},*},~~~~ m_{D,a}^2:=C_{\mathrm F,D,\partial \varOmega }\frac{h_{\max ,D}^2}{a_{\min ,D}},~~D\in {\mathcal {D}}_h^{\mathrm {ext},*}, \end{aligned}$$

with \(C_{\mathrm P,D}\) the constant from the Poincaré inequality and \(C_{\mathrm F,D,\partial \varOmega }\) the constant from the Friedrichs inequality. Let

$$\begin{aligned} \zeta _{\mathrm {DF}}&:= \left\{ \sum _{D\in {\mathcal {D}}_h^*} \zeta _{\mathrm {DF},D}^2\right\} ^{1/2},\\ \zeta _{\mathrm R}&:= \left\{ \sum _{D\in {\mathcal {D}}_h^*}\zeta _{\mathrm R,D}^2\right\} ^{1/2}. \end{aligned}$$

Vohralík suggests four different ways of constructing the equilibrated flux \({\mathbf {t}}_h\) (cf. [37]). As an example, we present only the numerical results for one of them, namely the direct prescription (3.15), for the finite volume approximation on anisotropic meshes; see Fig. 5. The conclusions for the other constructions are similar.

From Fig. 5 (here and below, the effectivity index is the ratio of the error estimate to the exact error), we see that neither the diffusive flux estimator nor the residual estimator obtained by the direct prescription is efficient for the finite volume method on anisotropic meshes.
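The effectivity index used in the figures can be computed as in the following minimal sketch. The function name `effectivity_index` is hypothetical and the numbers are placeholders chosen for illustration only; they are not results from the experiments.

```python
def effectivity_index(estimate, exact_error):
    """Ratio of the error estimate to the exact (energy norm) error."""
    if exact_error <= 0.0:
        raise ValueError("the exact error must be positive")
    return estimate / exact_error


# Placeholder values on three successive meshes (illustration only).
estimates = [0.80, 0.41, 0.21]
errors = [0.50, 0.25, 0.125]
indices = [effectivity_index(est, err) for est, err in zip(estimates, errors)]
print(indices)
```

For a guaranteed upper bound the index is at least 1, and an estimator is efficient when the index stays bounded as the mesh is refined or stretched.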

Fig. 5 Estimated and actual energy error (left) and the corresponding effectivity index (right), finite volume method with the constant coefficient on anisotropic meshes, estimates by the direct prescription, developed by Vohralík

6.1.2 Our Error Estimates

Let

$$\begin{aligned} \eta&:= \left\{ \sum _{D\in {\mathcal {D}}_h^*}(\eta _{\mathrm R,D}^2+\eta _{\mathrm {DF},D}^2)\right\} ^{1/2}, \\ \eta _{\mathrm {DF}}&:= \left\{ \sum _{D\in {\mathcal {D}}_h^*}\eta _{\mathrm {DF},D}^2\right\} ^{1/2},\\ \eta _{\mathrm R}&:= \left\{ \sum _{D\in {\mathcal {D}}_h^*}\eta _{\mathrm R,D}^2\right\} ^{1/2}, \end{aligned}$$

where \(\eta _{\mathrm {DF},D}\) and \(\eta _{\mathrm R,D}\) are given by (3.6) and (3.7), respectively.
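The two ways of assembling local contributions into a global estimator, the one used for \(\zeta \) in Sect. 6.1.1 and the one used for \(\eta \) above, can be compared in a short sketch. The function names and the local values below are hypothetical; the local estimators themselves, (3.6) and (3.7), are not implemented here.

```python
import math


def component(local_values):
    """Global component, e.g. eta_R = sqrt(sum_D eta_{R,D}^2)."""
    return math.sqrt(sum(v * v for v in local_values))


def zeta_total(zeta_R, zeta_DF):
    """zeta = sqrt(sum_D (zeta_{R,D} + zeta_{DF,D})^2)."""
    return math.sqrt(sum((r + d) ** 2 for r, d in zip(zeta_R, zeta_DF)))


def eta_total(eta_R, eta_DF):
    """eta = sqrt(sum_D (eta_{R,D}^2 + eta_{DF,D}^2))."""
    return math.sqrt(sum(r * r + d * d for r, d in zip(eta_R, eta_DF)))


# Placeholder local values on two dual volumes (illustration only).
local_R = [3.0, 0.0]
local_DF = [0.0, 4.0]

eta = eta_total(local_R, local_DF)
zeta = zeta_total(local_R, local_DF)
print(eta, component(local_R), component(local_DF), zeta)
```

With the second form the global estimator splits exactly as \(\eta ^2=\eta _{\mathrm R}^2+\eta _{\mathrm {DF}}^2\), whereas for the first form the triangle inequality only gives \(\zeta \le \zeta _{\mathrm R}+\zeta _{\mathrm {DF}}\).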

Figures 6, 7, 8 and 9 show that our estimators are reliable and efficient for the two numerical methods on anisotropic meshes, since the corresponding alignment measures remain comparable to 1 (see Tables 2 and 3).

In Figs. 6 and 8, we present the results of the estimates by the direct prescription (3.15) for the two numerical methods, respectively. Compared with the diffusive flux term \(\eta _{\mathrm {DF}}\), the residual term \(\eta _{\mathrm R}\) makes the dominant contribution to the error estimates by the direct prescription (3.15). In Figs. 7 and 9, the situation is reversed for the error estimates by the local Neumann/Dirichlet mixed finite element problems (3.18). Hence, the numerical results and the theoretical estimates are consistent.

Fig. 6 Estimated and actual energy error (left) and the corresponding effectivity index (right), finite volume method with the constant coefficient on anisotropic meshes, our estimates by the direct prescription (3.15)

Fig. 7 Estimated and actual energy error (left) and the corresponding effectivity index (right), finite volume method with the constant coefficient on anisotropic meshes, our estimates by the local Neumann/Dirichlet mixed finite element problems (3.18)

Fig. 8 Estimated and actual energy error (left) and the corresponding effectivity index (right), finite element method with the constant coefficient on anisotropic meshes, our estimates by the direct prescription (3.15)

Fig. 9 Estimated and actual energy error (left) and the corresponding effectivity index (right), finite element method with the constant coefficient on anisotropic meshes, our estimates by the local Neumann/Dirichlet mixed finite element problems (3.18)

6.2 The Diffusion Problem with Discontinuous Coefficients

Next we consider problem (1.1) with the discontinuous coefficient \(a\) on \(\varOmega =\varOmega _1\cup \varOmega _2\), where \(\varOmega _1=(-1,0)\times (-1,1)\) and \(\varOmega _2=(0,1)\times (-1,1)\). Let \(a=2/(k^2+1)\), where \(k=k_1\) on \(\varOmega _1\) and \(k=k_2\) on \(\varOmega _2\). The right-hand side is chosen such that \(u=\sin (k\pi x^2)\sin (\pi y)\) is the exact solution, which is shown in the left part of Fig. 10. For \(k_2\gg 1\) the exact solution varies rapidly in the \(x\)-axis direction for \(x>0\), which again requires anisotropic meshes to resolve the variation.
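As a sanity check on this example, which we add here for the reader's convenience, note that on each subdomain

$$\begin{aligned} \partial _x u=2k\pi x\cos (k\pi x^2)\sin (\pi y), \end{aligned}$$

so that both \(u\) and the normal flux \(a\,\partial _x u\) vanish on the interface \(x=0\) for any \(k_1\) and \(k_2\); the solution and its flux are therefore continuous across the interface. A direct computation then gives, on \(\varOmega _i\) with \(k=k_i\) and \(a=2/(k_i^2+1)\) \((i=1,2)\),

$$\begin{aligned} f=-\nabla \cdot (a\nabla u)=a\left[ (4k^2\pi ^2x^2+\pi ^2)\sin (k\pi x^2)-2k\pi \cos (k\pi x^2)\right] \sin (\pi y). \end{aligned}$$

The boundary condition \(u=0\) holds on \(y=\pm 1\) for any \(k\), and on \(x=\pm 1\) when \(k_1\) and \(k_2\) are integers, as in the cases considered here.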

Fig. 10 The exact solution with \(k_1=1\) and \(k_2=15\) (left) and the anisotropic mesh \((32\times 8)\) (right)

We consider the case \(k_1=1\) and \(k_2=100\). As shown in the right part of Fig. 10, we construct a sequence of \(m\times n\) meshes with \(m_1\) uniform subintervals on \(\varOmega _1\) and \(m_2\) uniform subintervals on \(\varOmega _2\) in the \(x\)-axis direction and \(n\) uniform subintervals in the \(y\)-axis direction, where \(m_1=n/2\) and \(m_2=m-m_1\). The corresponding mesh information is listed in Table 4. Tables 5 and 6 present the alignment measures for the finite volume and finite element methods on the different meshes, respectively.
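The node distribution described above can be sketched as follows. This is only an illustration under the stated rule \(m_1=n/2\), \(m_2=m-m_1\); the helper name `interface_grid` is hypothetical, and the sizes used are those of the mesh in Fig. 10, not the entries of Table 4.

```python
def interface_grid(m, n):
    """Node coordinates of an m-by-n grid on (-1,1) x (-1,1).

    The x-direction uses m1 = n/2 uniform subintervals on (-1,0)
    and m2 = m - m1 uniform subintervals on (0,1); the y-direction
    uses n uniform subintervals.
    """
    if n % 2 != 0:
        raise ValueError("n must be even so that m1 = n/2 is an integer")
    m1 = n // 2
    m2 = m - m1
    if m2 <= 0:
        raise ValueError("m must be larger than n/2")
    left = [-1.0 + i / m1 for i in range(m1)]    # nodes in [-1, 0)
    right = [j / m2 for j in range(m2 + 1)]      # nodes in [0, 1]
    xs = left + right
    ys = [-1.0 + 2.0 * j / n for j in range(n + 1)]
    return xs, ys, m1, m2


# Mesh of Fig. 10: m = 32, n = 8.
xs, ys, m1, m2 = interface_grid(32, 8)
print(m1, m2, len(xs), len(ys))
```

The interface \(x=0\) is always a grid line, and almost all of the refinement in the \(x\)-axis direction is placed in \(\varOmega _2\), where the solution oscillates.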

Strictly speaking, these meshes do not satisfy our assumption on the alignment between the discontinuous coefficient \(a\) and the dual mesh \({\mathcal {D}}_h\), since \(a\) is piecewise constant on the primal mesh \({\mathcal {T}}_h\) instead. According to Remark 4.1, in this case one has the local efficiency directly on each dual volume \(D\in {\mathcal {D}}_h\). Nevertheless, these meshes yield similar numerical results, shown in Figs. 11, 12, 13 and 14, which confirm the robustness of the error estimators with respect to the discontinuous coefficient on anisotropic meshes.

Table 4 The meshes on different levels
Table 5 The alignment measure for the finite volume method with discontinuous coefficients
Table 6 The alignment measure for the finite element method with discontinuous coefficients

Fig. 11 Estimated and actual energy error (left) and the corresponding effectivity index (right), finite volume method with discontinuous coefficients on anisotropic meshes, our estimates by the direct prescription (3.15)

Fig. 12 Estimated and actual energy error (left) and the corresponding effectivity index (right), finite volume method with discontinuous coefficients on anisotropic meshes, our estimates by the local Neumann/Dirichlet mixed finite element problems (3.18)

Fig. 13 Estimated and actual energy error (left) and the corresponding effectivity index (right), finite element method with discontinuous coefficients on anisotropic meshes, our estimates by the direct prescription (3.15)

Fig. 14 Estimated and actual energy error (left) and the corresponding effectivity index (right), finite element method with discontinuous coefficients on anisotropic meshes, our estimates by the local Neumann/Dirichlet mixed finite element problems (3.18)

7 Conclusions

We have considered error estimation based on flux reconstruction for conforming discretizations of diffusion problems with discontinuous coefficients, and have provided robust a posteriori error estimates on anisotropic meshes, which can essentially be regarded as generalizations of the estimates from [37] to anisotropic meshes.

The resulting error estimators are new and robust with respect to discontinuous coefficients on anisotropic meshes. Our main contribution is the anisotropic character of the estimators, which allows them to be applied on anisotropic meshes, where the isotropic versions from [37] fail.

The proofs of the upper and lower error bounds are more technical in the anisotropic setting. In particular, in the proof of the local lower bound we rigorously analyze the effect of the anisotropy of the mesh by introducing the usual Piola transformation for vector-valued functions (see the proofs of Theorems 4.1 and 4.3), which is a first attempt at proving efficiency in this way on anisotropic meshes. In the isotropic case, this transformation can be omitted entirely, see [37].