1 Introduction

As a prototype of problems with interface singularities, this paper studies a priori error estimates of mixed finite element methods for the following interface problem (i.e., the diffusion problem with discontinuous coefficients):

$$\begin{aligned} -\nabla \cdot \,(\alpha (x)\nabla \, u) = f \quad \text{ in } \,\,\varOmega \end{aligned}$$
(1.1)

with homogeneous Dirichlet boundary conditions (for simplicity)

$$\begin{aligned} u = 0 \quad \text{ on } \partial \varOmega , \end{aligned}$$
(1.2)

where \(\varOmega \) is a bounded polygonal domain in \(\mathbb {R}^d\) with \(d=2\) or 3; \(f \in L^{2}(\varOmega )\) is a given function; and the diffusion coefficient \(\alpha (x)\) is positive and piecewise constant with possibly large jumps across subdomain boundaries (interfaces):

$$\begin{aligned} \alpha (x)=\alpha _i > 0\quad \text{ in } \,\varOmega _i \quad \text{ for } \, i=1, \ldots , n. \end{aligned}$$

Here, \(\{\varOmega _i\}_{i=1}^n\) is a partition of the domain \(\varOmega \) with \(\varOmega _i\) being an open polygonal domain. It is well known that the solution u of problem (1.1) belongs to \(H^{1+s}(\varOmega )\) with possibly very small \(s> 0\), see for example Kellogg [23]. We should also note, however, that even when the global regularity is low, for a given finite element mesh the singularities (or the elements on which the solution has a large gradient) often appear only near some points or along a curve. Thus it is not optimal to base the a priori error estimate on the global regularity and a global uniform mesh size.

In [10], we introduced the idea of robust and local optimal a priori error estimates. Robustness means that the generic constants appearing in the estimates are independent of the parameters of the equation, in our case the coefficient \(\alpha \). Local optimality means that the upper bound in the error estimate is optimal with respect to the regularity on each element and the local mesh sizes, instead of relying on a global uniform mesh size and a global regularity.

The local optimal and robust a priori error estimate is very important for adaptive mesh refinement algorithms. Mesh refinement algorithms are often based on the so-called “error equi-distribution” principle [26], that is, each element carries an almost equal share of the error measured in an appropriate norm. We need to show that the “error equi-distribution” principle is achievable via the a priori error estimate. In the ideal case, if the exact solution u is known so that the (a priori) error can be computed exactly, we should be able to find an optimal mesh with a fixed number of degrees of freedom on which each element has an error of very similar size. Also, in robust a posteriori error analysis, we always try to find an equivalence between some intrinsic norm of the error and a computable error estimator, the so-called reliability and efficiency bounds. When constructing the error estimator, it is essential to realize that the best an adaptive numerical method can achieve is restricted by the robust local a priori estimates on each element. This is especially important for mixed methods, since there are two unknowns, the flux and the potential, and there are various post-processing methods. It is important to find the right quantity and norm to estimate in the a posteriori error estimates. For example, should we use the a posteriori estimator related to the weighted \(L^2\)-norm of the flux, the weighted \(L^2\)-norm of the potential, the weighted discrete \(H^1\)-norm of the potential, or combinations of them? To answer this question, we need to know what can be derived from the a priori estimates.
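The equi-distribution idea can be made concrete in one dimension. The following is a minimal sketch, not taken from the paper: it assumes a model function \(g(x)=x^{\lambda }\) on (0, 1) with the hypothetical exponent \(\lambda =0.25\), approximates it by piecewise constants in \(L^2\) (a crude stand-in for a lowest-order flux approximation), and compares the element-wise errors on a uniform mesh and on a mesh graded toward the singular point. The name `local_l2_errors` and the grading exponent are illustrative choices.

```python
import numpy as np

def local_l2_errors(nodes, lam):
    """Squared L2 error of the best piecewise-constant fit of g(x) = x**lam
    on each cell [a, b]:  int g^2 - (int g)^2 / (b - a)  (closed form)."""
    a, b = nodes[:-1], nodes[1:]
    int_g2 = (b ** (2 * lam + 1) - a ** (2 * lam + 1)) / (2 * lam + 1)
    int_g = (b ** (lam + 1) - a ** (lam + 1)) / (lam + 1)
    return int_g2 - int_g ** 2 / (b - a)

lam = 0.25            # assumed singular exponent; g is not in H^1 near x = 0
n = 32                # number of elements (same for both meshes)
t = np.linspace(0.0, 1.0, n + 1)
uniform = t
# Assumed grading: balances h^3 * g'(x)^2 across the elements.
graded = t ** (3.0 / (1.0 + 2.0 * lam))

err_u = local_l2_errors(uniform, lam)
err_g = local_l2_errors(graded, lam)

# Ratio of largest to smallest element error (1 means perfect equi-distribution).
spread_u = err_u.max() / err_u.min()
spread_g = err_g.max() / err_g.min()

# Global L2 errors with the same number of degrees of freedom.
total_u = np.sqrt(err_u.sum())
total_g = np.sqrt(err_g.sum())
```

Under these assumptions the graded mesh spreads the squared error almost evenly over the elements and gives a smaller global error, while on the uniform mesh the element touching the singular point dominates.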

The proof of a local optimal and robust a priori error estimate often consists of two parts: one is the robust best approximation result (a Céa’s lemma type of result), which is of importance in its own right; the other is the robust local approximation property of the interpolation operator.

Before we discuss the robust best approximation results and robust local interpolation results for the mixed approximations, we first discuss the corresponding results for the conforming, Crouzeix–Raviart nonconforming, and discontinuous Galerkin approximations of the interface problem.

For the interface problem (1.1), the robust best approximation property is well known and is almost trivial for the \(H^1\) conforming approximation:

$$\begin{aligned} \Vert \alpha ^{1/2}\nabla (u-u_k^c)\Vert _0 \le \inf _{v_k^c \in V_k^c}\Vert \alpha ^{1/2}\nabla (u-v_k^c)\Vert _0, \end{aligned}$$

where \(V_k^c\) is the k-th degree \(H^1_0\)-conforming finite element space, and \(u_k^c\) is the corresponding \(H^1\) conforming approximation.

On the other hand, the proofs of the robust best approximation property for the Crouzeix–Raviart nonconforming and discontinuous Galerkin approximations are not easy. In [10], for the Crouzeix–Raviart nonconforming element approximation, we showed the robust best approximation property (with the constant C independent of \(\alpha \) and the mesh size):

$$\begin{aligned} \Vert \alpha ^{1/2}\nabla _h (u-u_1^{nc})\Vert _0 \le C\left( \inf _{v_1^{nc} \in V_1^{nc}}\Vert \alpha ^{1/2}\nabla _h (u-v_1^{nc})\Vert _0 +\mathrm{osc \, }_{\alpha ,nc} \right) , \end{aligned}$$

where \(V_1^{nc}\) is the Crouzeix–Raviart non-conforming finite element space, \(u_1^{nc}\) is the corresponding non-conforming approximation, and \(\mathrm{osc \, }_{\alpha ,nc}\) is a robust oscillation term. Also in [10], for the discontinuous Galerkin approximation, we showed the robust best approximation property (with the constant C independent of \(\alpha \) and the mesh size):

$$\begin{aligned} |\!|\!| u-u_k^{dg} |\!|\!|_{dg} \le C\left( \inf _{v_k \in D_k} |\!|\!| u-v_k |\!|\!|_{dg} +\mathrm{osc \, }_{\alpha ,dg} \right) , \end{aligned}$$

where \(D_k\) is the k-th degree discontinuous finite element space, \(u_k^{dg}\) is the corresponding discontinuous Galerkin approximation, \(|\!|\!| \cdot |\!|\!|_{dg}\) is the \(\alpha \)-weighted \(H^1\) discontinuous Galerkin norm, and \(\mathrm{osc \, }_{\alpha ,dg}\) is a robust oscillation term.

The local approximation properties of the interpolation operators for the DG and Crouzeix–Raviart spaces are easy to show. For the conforming finite element approximation, there are two types of local interpolations: nodal interpolations, which require high regularity of the solution, and the Scott–Zhang or Clément interpolations, whose regularity requirement is very low. The nodal interpolation is completely local in each element, but it needs very high regularity to exist, especially in three dimensions. The Scott–Zhang/Clément interpolations are defined on a local patch, so their local robustness depends on an unrealistic assumption, the quasi-monotonicity assumption, see [5, 10, 11, 13, 19]. Thus, the existence of a robust local optimal result for the conforming finite element approximation of the low regularity interface problem is still open.

For the mixed methods, we have two unknowns: the flux \(\varvec{\sigma }\) and the potential u. For the potential u, the discontinuous finite element approximation is used, so the robust local interpolation property is obvious. We use Raviart–Thomas (RT) or Brezzi–Douglas–Marini (BDM) elements [6] to approximate the flux variable; a robust local interpolation property can be proved by the average Taylor series technique developed in [20]. This leaves, as the main task in proving the robust local optimal error estimates, the proof of the robust best approximation properties of the mixed methods. Unlike the conforming, non-conforming, or DG methods, where the energy norms and approximation spaces are obvious, we have several choices for the mixed methods.

Our first robust best approximation property is simple: it concerns the weighted \(L^2\)-norm of the flux error in the discrete equilibrated spaces, see Theorems 2 and 3.

In the standard a priori analysis of the mixed method [6], the error of the potential u is estimated in the \(L^2\)-norm. It turns out that it is difficult to obtain a robust inf-sup condition with the weighted \(L^2\) norm for the discrete approximation \(u_h\) and a modified \(H(\mathrm{div})\) norm. Thus, we use \(\alpha \)- and mesh-dependent norms to carry out the robust analysis. The norm chosen for \(u_h\) is similar to the standard discontinuous Galerkin norm, that is, a weighted discrete \(H^1\) norm. With this \(\alpha \)- and mesh-dependent norm analysis, we show a robust best approximation result for the potential approximation in the \(\alpha \)-dependent discrete \(H^1\) norm. But since the approximation space for the potential u is not rich enough, the order of approximation of u in the \(\alpha \)-dependent discrete \(H^1\) norm is one or two orders lower than that of the flux approximation. This order discrepancy suggests that we should not try to estimate the \(\alpha \)-weighted discrete \(H^1\)-norm of the potential approximation robustly in the a posteriori error analysis, as stated in the earlier discussion by Kim [24].

For the flux approximation, with the help of \(\alpha \)- and mesh-dependent analysis, we show the robust best approximation result in the non-equilibrated RT/BDM space with an \(\alpha \) and h weighted \(H(\mathrm{div})\) norm for the first time. The corresponding robust and local a priori error estimates are also given without order loss even for the BDM approximations.

Finally, since the error of the potential approximation \(u_h\) in the discrete \(H^1\) norm is often of lower order than that of the corresponding flux approximation, we use Stenberg’s post-processing to recover a new approximation with a compatible polynomial degree. We show that for the recovered potential approximation, the robust local best approximation result is true and a robust local a priori error estimate of the same order as the flux approximation is obtained. We also prove a new trace inequality for the normal trace. We also point out in the paper that any recovery or post-processing should be based on the flux approximation, since it is more accurate.

There are many a priori estimates available for mixed methods. The standard analysis can be found in the books and papers [6, 18, 21, 27]. In these analyses, \(L^2\) or \(H(\mathrm{div})\) norms are used for the flux approximation and the \(L^2\) norm is used for the potential approximation. No robust analysis is discussed in these papers or books. The mesh-dependent norm analysis can be found in [7, 25]; again, no robust analysis is discussed. In [9, 24, 29, 30], many a priori and a posteriori error results are presented for the mixed methods, some of which are robust and some non-robust. No robust and local optimal estimates have been discussed for mixed methods before.

The paper is organized as follows. Section 2 describes the mixed finite element methods for the model problem. Various robust best approximation results and robust and local a priori error estimates are presented in Sect. 3, including the robust best approximation results for the flux in the weighted \(L^2\) norm in the discrete equilibrated space and in the weighted \(H(\mathrm{div})\) norm in the whole mixed approximation space, and the robust best approximation result for the potential in the weighted discrete \(H^1\) norm. In Sect. 4, we discuss Stenberg’s post-processing and show its robust and local optimal a priori error estimates on each element. In Sect. 5, we make some concluding remarks.

2 Mixed Finite Element Methods

Let the flux be

$$\begin{aligned} \varvec{\sigma }= -\alpha (x)\nabla u. \end{aligned}$$

The mixed variational formulation for the problem in (1.1) and (1.2) is to find \((\varvec{\sigma },\,u)\in H(\mathrm{div};\varOmega )\times L^2(\varOmega )\) such that

$$\begin{aligned} \left\{ \begin{array}{lclll} (\alpha ^{-1}\varvec{\sigma },\,\varvec{\tau })-(\nabla \cdot \varvec{\tau },\, u)&{}=&{}0 \quad &{} \quad \forall \,\, \varvec{\tau }\in H(\mathrm{div};\varOmega ),\\ (\nabla \cdot \varvec{\sigma }, \,v) &{}=&{} (f,\,v)&{} \quad \forall \,\, v\in L^2(\varOmega ). \end{array}\right. \end{aligned}$$
(2.1)

Let \(\mathcal{T}=\{K\}\) be a regular triangulation of the domain \(\varOmega \) (see, e.g., [8, 16]). Denote by \(h_K\) the diameter of the element K. Assume that interfaces \(\{\partial \varOmega _i\cap \partial \varOmega _j\,:\, i,j=1, \ldots , n\}\) do not cut through any element \(K\in \mathcal{T}\). For any element \(K\in \mathcal{T}\), denote by \(P_k(K)\) the space of polynomials on K with total degree less than or equal to k.

Define the discontinuous piecewise polynomial space of degree k by

$$\begin{aligned} D_k = \{ v \in L^2(\varOmega )\, :\, v|_K \in P_k(K) \; \forall \, K\in \mathcal{T}\}. \end{aligned}$$

Define the \(H(\mathrm{div})\) conforming Raviart–Thomas (RT) finite element space and Brezzi–Douglas–Marini (BDM) finite element space of order k by

$$\begin{aligned} RT_k = \{ \varvec{\tau }\in H(\mathrm{div};\varOmega )\, :\, \varvec{\tau }|_K \in P_k(K)^d + \mathbf{x}P_k(K) \; \forall \, K\in \mathcal{T}\}. \end{aligned}$$

and

$$\begin{aligned} BDM_k = \{ \varvec{\tau }\in H(\mathrm{div};\varOmega )\, :\, \varvec{\tau }|_K \in P_k(K)^d \; \forall \, K\in \mathcal{T}\}. \end{aligned}$$

For mixed problems, \(RT_k\times D_k\) and \(BDM_{k+1}\times D_k\) are stable pairs. Thus, we use the notation \(\varSigma _k\) to denote \(RT_k\) or \(BDM_{k+1}\).
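To get a feeling for the sizes of the two flux spaces, the local dimensions on a simplex can be tabulated directly from the definitions above. The helper names below (`dim_P`, `dim_RT`, `dim_BDM`) are hypothetical and exist only for this sketch; the count for \(RT_k\) uses the fact that \(\mathbf{x}P_k(K)\) adds only the homogeneous polynomials of degree k to \(P_k(K)^d\).

```python
from math import comb

def dim_P(k, d):
    """Dimension of P_k(K): polynomials of total degree <= k in d variables."""
    return comb(k + d, d)

def dim_RT(k, d):
    """Local dimension of P_k(K)^d + x P_k(K).
    x * P_{k-1} is already contained in P_k^d, so only the homogeneous
    polynomials of degree k (there are comb(k + d - 1, d - 1)) are new."""
    return d * dim_P(k, d) + comb(k + d - 1, d - 1)

def dim_BDM(k, d):
    """Local dimension of P_k(K)^d."""
    return d * dim_P(k, d)

# Local sizes of the two stable pairs RT_k x D_k and BDM_{k+1} x D_k for d = 2.
table = [(k, dim_RT(k, 2), dim_BDM(k + 1, 2), dim_P(k, 2)) for k in range(3)]
```

For the same potential space \(D_k\), the \(BDM_{k+1}\) flux space is locally larger than \(RT_k\), which is consistent with its higher approximation order in the \(L^2\) norm.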

The mixed finite element approximation is to find \((\varvec{\sigma }_h,\,u_h) \in \varSigma _k \times D_k\) such that

$$\begin{aligned} \left\{ \begin{array}{lllll} (\alpha ^{-1}\varvec{\sigma }_h,\,\varvec{\tau }_h)-(\nabla \cdot \varvec{\tau }_h,\, u_h)=0 \quad &{} \quad \forall \,\, \varvec{\tau }_h \in \varSigma _k,\\ (\nabla \cdot \varvec{\sigma }_h,\, v_h) = (f,\,v_h)&{}\quad \forall \,\, v_h\in D_k. \end{array}\right. \end{aligned}$$
(2.2)

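Since \(\varSigma _k \subset H(\mathrm{div};\varOmega )\) and \(D_k \subset L^2(\varOmega )\), subtracting (2.2) from (2.1) tested with discrete functions yields the following error equation: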

$$\begin{aligned} \left\{ \begin{array}{lllll} (\alpha ^{-1}(\varvec{\sigma }-\varvec{\sigma }_h),\,\varvec{\tau }_h)-(\nabla \cdot \varvec{\tau }_h,\, u-u_h)=0 \quad &{}\quad \forall \,\, \varvec{\tau }_h \in \varSigma _k,\\ (\nabla \cdot (\varvec{\sigma }-\varvec{\sigma }_h),\, v_h) = 0&{}\quad \forall \,\, v_h\in D_k. \end{array}\right. \end{aligned}$$
(2.3)

3 Robust and Local Optimal A Priori Error Estimates

3.1 Mixed Finite Element Interpolations and Approximation Properties

For a fixed \(r>0\), denote by \(I^{rt,k}_{h}: H(\mathrm{div};\,\varOmega )\cap [H^r(\varOmega )]^d \mapsto RT_k\) the standard RT interpolation operator and \(I^{bdm,k}_{h}: H(\mathrm{div};\,\varOmega )\cap [H^r(\varOmega )]^d \mapsto BDM_k\) the standard BDM interpolation operator. We have the following local approximation property: for \(\varvec{\tau }\in H^{s_K}(K)\), \(s_K >0\),

$$\begin{aligned} \Vert \varvec{\tau }- I^{\varSigma ,k}_{h} \varvec{\tau }\Vert _{0,K}\le & {} C h_K^{\min \{k+1,s_K\}} |\varvec{\tau }|_{\min \{k+1,s_K\},K} \quad \forall \,\, K\in \mathcal{T}, \end{aligned}$$
(3.1)

with \(I^{\varSigma ,k}_{h} = I^{rt,k}_{h}\) or \(I^{bdm,k}_{h}\). The estimate in (3.1) is standard for \(s_K\ge 1\); for \(0<s_K<1\), it can be proved by the average Taylor series technique developed in [20] together with the standard reference element technique with the Piola transformation. Note also that the interpolation operators and their approximation properties are completely local.

Denote by \(Q^k_{h}: L^2 (\varOmega ) \mapsto D_k\) the \(L^2\)-projection onto \(D_k\). The following commutativity property is well-known:

$$\begin{aligned} \nabla \cdot (I^{rt,k}_{h}\,\varvec{\tau })= & {} Q^k_{h}\,\nabla \cdot \varvec{\tau }\qquad \quad \forall \,\,\varvec{\tau }\in H(\mathrm{div};\,\varOmega )\cap H^r(\varOmega )^d \, \text{ with } \, r>0, \end{aligned}$$
(3.2)
$$\begin{aligned} \nabla \cdot (I^{bdm,k}_{h}\,\varvec{\tau })= & {} Q^{k-1}_{h}\,\nabla \cdot \varvec{\tau }\qquad \quad \forall \,\,\varvec{\tau }\in H(\mathrm{div};\,\varOmega )\cap H^r(\varOmega )^d \, \text{ with } \, r>0. \end{aligned}$$
(3.3)

Remark 1

The requirement \(r>0\) in \(H(\mathrm{div};\,\varOmega )\cap [H^r(\varOmega )]^d\) is to make sure that the mixed interpolations are well defined. Another choice is \(\{\varvec{\tau }\in L^p(\varOmega )^d \text{ and } \nabla \cdot \varvec{\tau }\in L^2(\varOmega )\}\) for \(p>2\) or \(W^{1,t}(K)\) for \(t>2d/(d+2)\) as in [6]. We use the Hilbert space based choice since it is more suitable for our analysis.

3.2 Robust Best Approximation in the Discrete Equilibrated Space for the Flux

Define the discrete equilibrated space

$$\begin{aligned} \varSigma _k^f = \{\varvec{\tau }_h \in \varSigma _k : \nabla \cdot \varvec{\tau }_h =Q^k_{h} f\}. \end{aligned}$$

Note that \(\varSigma _k^f = RT_k^f = \{\varvec{\tau }_h \in RT_k : \nabla \cdot \varvec{\tau }_h =Q^k_{h} f\}\) for the RT case and \(\varSigma _k^f = BDM_{k+1}^f= \{\varvec{\tau }_h \in BDM_{k+1} : \nabla \cdot \varvec{\tau }_h =Q^k_{h} f\}\) for the BDM case.

The following theorem is almost standard in the mixed finite element analysis.

Theorem 2

(Robust best approximation in the discrete equilibrated space) Let \((\varvec{\sigma }, u)\) and \((\varvec{\sigma }_h,\,u_h) \in \varSigma _k \times D_k\) be the solutions of (2.1) and (2.2), respectively, then the following robust best approximation result holds:

$$\begin{aligned} \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0,\varOmega } \le \inf _{\varvec{\tau }_h^f \in \varSigma _k^f} \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\tau }_h^f)\Vert _{0,\varOmega }. \end{aligned}$$
(3.4)

Proof

To establish (3.4), denote by

$$\begin{aligned} \mathbf{E}= \varvec{\sigma }-\varvec{\sigma }_h \quad \text{ and }\quad e = u- u_h \end{aligned}$$

the respective errors of the flux and the potential.

Now, let \(\varvec{\tau }_h^f\) be an arbitrary function in \(\varSigma _k^f\). Then \(\nabla \cdot (\varvec{\tau }_h^f-\varvec{\sigma }_h)=0\) because \(\varvec{\sigma }_h \in \varSigma _k^f\), and it follows from the first equation in (2.3) and the Cauchy–Schwarz inequality that

$$\begin{aligned} \Vert \alpha ^{-1/2}\mathbf{E}\Vert _{0,\varOmega }^2= & {} (\alpha ^{-1}\mathbf{E},\, \varvec{\sigma }-\varvec{\tau }_h^f) + (\alpha ^{-1}\mathbf{E},\, \varvec{\tau }_h^f -\varvec{\sigma }_h)\\= & {} (\alpha ^{-1}\mathbf{E},\, \varvec{\sigma }-\varvec{\tau }_h^f) + (\nabla \cdot (\varvec{\tau }_h^f-\varvec{\sigma }_h),\,e)\\= & {} (\alpha ^{-1}\mathbf{E},\, \varvec{\sigma }-\varvec{\tau }_h^f) \le \Vert \alpha ^{-1/2}\mathbf{E}\Vert _{0,\varOmega }\,\Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\tau }_h^f)\Vert _{0,\varOmega }, \end{aligned}$$

which implies the result of the theorem. \(\square \)
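A one-dimensional analogue gives a quick sanity check of (3.4). The sketch below is an illustration under stated assumptions, not the implementation used in the paper: it takes \(\varOmega =(0,1)\), \(f=1\), and a coefficient with a single jump at \(x=1/2\); the flux space is the continuous piecewise linears (the 1D counterpart of \(RT_0\)) and the potential space is \(D_0\). Since the exact flux \(\sigma = x + c\) is then itself in the discrete equilibrated space, the right-hand side of (3.4) vanishes and the computed flux should agree with the exact one up to round-off, whatever the size of the jump. The function names are hypothetical.

```python
import numpy as np

def solve_mixed_1d(n, alpha_left, alpha_right, f=1.0):
    """Lowest-order mixed method on (0,1) for -(alpha u')' = f, u(0)=u(1)=0.
    Flux: continuous piecewise linears; potential: piecewise constants.
    The interface sits at x = 1/2, so n must be even (mesh-aligned)."""
    x = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n
    mid = 0.5 * (x[:-1] + x[1:])
    alpha = np.where(mid < 0.5, alpha_left, alpha_right)
    A = np.zeros((n + 1, n + 1))   # (alpha^{-1} sigma_h, tau_h)
    B = np.zeros((n, n + 1))       # (tau_h', v_h)
    for k in range(n):
        A[k:k + 2, k:k + 2] += h / (6.0 * alpha[k]) * np.array([[2.0, 1.0],
                                                               [1.0, 2.0]])
        B[k, k], B[k, k + 1] = -1.0, 1.0
    # Saddle-point system: A sigma - B^T u = 0,  B sigma = F.
    K = np.block([[A, -B.T], [B, np.zeros((n, n))]])
    rhs = np.concatenate([np.zeros(n + 1), f * h * np.ones(n)])
    sol = np.linalg.solve(K, rhs)
    return x, sol[:n + 1], sol[n + 1:]

def exact_flux(x, alpha_left, alpha_right):
    """Exact flux sigma = x + c for f = 1; c follows from u(0) = u(1) = 0."""
    inv_l, inv_r = 1.0 / alpha_left, 1.0 / alpha_right
    c = -(inv_l + 3.0 * inv_r) / (4.0 * (inv_l + inv_r))
    return x + c

x, sigma_h, u_h = solve_mixed_1d(8, 1.0, 1000.0)
flux_error = np.max(np.abs(sigma_h - exact_flux(x, 1.0, 1000.0)))
```

In this special case the flux error is at round-off level for any ratio of the two coefficients, which is the behaviour (3.4) predicts when the infimum on the right-hand side is zero.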

Theorem 3

(Robust local a priori error estimates) Let \((\varvec{\sigma }, u)\) and \((\varvec{\sigma }_h,\,u_h) \in \varSigma _k \times D_k\) \((k\ge 0)\) be the solutions of (2.1) and (2.2), respectively. Assume that \(u\in H^{1+r}(\varOmega )\) with some \(r>0\) and that \(u|_K\in H ^{1+s_K}(K)\) with an element-wise defined regularity \(s_K>0\) for all \(K\in \mathcal{T}\). Then there exists a constant \(C>0\) independent of \(\alpha \) and h, in both two and three dimensions, such that

$$\begin{aligned} \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0}\le & {} C \sum _{K\in \mathcal{T}} h_K^{\min \{k+1,s_K\}} |\alpha ^{1/2}\nabla u|_{\min \{k+1,s_K\},K}, \quad RT_k \text{ case } , \\ \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0}\le & {} C \sum _{K\in \mathcal{T}} h_K^{\min \{k+2,s_K\}} |\alpha ^{1/2}\nabla u|_{\min \{k+2,s_K\},K}, \quad BDM_{k+1} \text{ case }. \end{aligned}$$

Proof

For the \(RT_k \times D_k\) case, the commutativity property in (3.2) and the second equations in (2.1) and (2.2) lead to

$$\begin{aligned} \nabla \cdot (I_h^{rt,k}\varvec{\sigma }) = Q^k_{h}\,\nabla \cdot \varvec{\sigma }= Q^k_{h} f = \nabla \cdot \varvec{\sigma }_h. \end{aligned}$$

Thus, the result is a direct consequence of the best approximation property in (3.4) and the local approximation property in (3.1) by choosing \(\varvec{\tau }_h^f = I_h^{rt,k}\varvec{\sigma }\in RT_k^f\).

Using the same argument, we can get the result for the \(BDM_{k+1} \times D_k\) case. \(\square \)

Remark 4

For those elements with a low regularity \(0<s_K<1\), \(RT_0\) is enough and there is no need to use BDM or higher order RT approximations.
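The remark can be read off from the exponents in Theorem 3. The small helper below is only an illustrative sketch (the name `local_flux_rate` and its interface are assumptions): it returns the power of \(h_K\) appearing in the local bound for a given degree k, local regularity \(s_K\), and element family.

```python
def local_flux_rate(k, s_K, family="RT"):
    """Exponent of h_K in the local bound of Theorem 3:
    min{k+1, s_K} for RT_k and min{k+2, s_K} for BDM_{k+1}."""
    if s_K <= 0:
        raise ValueError("s_K must be positive")
    if family == "RT":
        order = k + 1
    elif family == "BDM":
        order = k + 2
    else:
        raise ValueError("family must be 'RT' or 'BDM'")
    return min(order, s_K)

# On an element with low regularity, raising the degree does not help.
low = [local_flux_rate(k, 0.3, fam) for k in range(3) for fam in ("RT", "BDM")]
# On an element where the solution is smooth, it does.
smooth = [local_flux_rate(k, 5.0, "RT") for k in range(3)]
```

For \(s_K=0.3\) every entry of `low` equals 0.3, so only local mesh refinement, not a higher polynomial degree, improves the bound on such an element.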

Remark 5

If, in each element \(K\in \mathcal{T}\), the diffusion coefficient is a full symmetric positive definite constant matrix \(A|_K\) instead of a scalar constant \(\alpha _K\), it is clear from the proofs that the above robust best approximation result remains true:

$$\begin{aligned} \Vert A^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0,\varOmega } \le \inf _{\varvec{\tau }_h^f \in \varSigma _k^f} \Vert A^{-1/2}(\varvec{\sigma }-\varvec{\tau }_h^f)\Vert _{0,\varOmega }. \end{aligned}$$

In each element \(K\in \mathcal{T}\), for the quantity \(\mathbf{q}\in P_k^d\), \(A^{-1/2} \mathbf{q}\) is also in \(P_k^d\), and thus \(A^{-1/2}I_h^{\varSigma ,k} \mathbf{q}= A^{-1/2}\mathbf{q}\). Thus for a piecewise constant symmetric positive definite matrix A, we have

$$\begin{aligned} \Vert A^{-1/2}(\varvec{\tau }- I^{\varSigma ,k}_{h} \varvec{\tau })\Vert _{0,K} \le C h_K^{\min \{k+1,s_K\}} |A^{-1/2}\varvec{\tau }|_{\min \{k+1,s_K\},K} \quad \forall \,\, K\in \mathcal{T}. \end{aligned}$$

And we have the robust local a priori error estimates

$$\begin{aligned} \Vert A^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0,\varOmega }\le & {} C \sum _{K\in \mathcal{T}} h_K^{\min \{k+1,s_K\}} |A^{1/2}\nabla u|_{\min \{k+1,s_K\},K}, \quad RT_k \text{ case },\\ \Vert A^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0,\varOmega }\le & {} C\sum _{K\in \mathcal{T}} h_K^{\min \{k+2,s_K\}} |A^{1/2}\nabla u|_{\min \{k+2,s_K\},K}, \quad BDM_{k+1} \text{ case }. \end{aligned}$$

The corresponding results for discontinuous Galerkin methods are not proved, since the robustness of the DG method for the diffusion problem depends on the right choice of the weights of the averages and penalty coefficients. For the full tensor case, the right weight is not clear, and a robust choice may not even be possible for a full matrix A, see [10]. For the conforming finite element approximations, due to the lack of nodal interpolations in the low regularity case, such robust local optimal estimates are not available. For averaging operators like the Scott–Zhang [28] or Clément [17] interpolations, robustness with respect to the full tensor A is also impossible, since even the well-known quasi-monotonicity assumption (see [5]) is not meaningful in this case. For the Crouzeix–Raviart non-conforming finite element approximation, it may be possible to obtain a similar result by using the relation between the \(RT_0\) and Crouzeix–Raviart elements.

3.3 Mesh-Dependent Norm Analysis

In this subsection, we use mesh-dependent norm analysis to derive the robust best approximation properties for the flux and the potential in appropriate norms. Earlier analysis on the mixed methods using mesh-dependent norms can be found in Babuška, Osborn, and Pitkäranta [3], Braess and Verfürth [7], and [15]. In the mesh-dependent analysis, we need to restrict ourselves to the scalar case.

First, we discuss the averages of the coefficients on the edge/face \(F\in \mathcal{E}\). For \(F = \partial K_F^{+} \cap \partial K_F^{-}\in \mathcal{E}_{I}\), denote by \(\alpha ^+_{F}\) and \(\alpha ^-_{F}\) the restrictions of \(\alpha \) to \(K_F^{+}\) and \(K_F^{-}\), respectively. Denote the harmonic averages of \(\alpha \) on \(F \in \mathcal{E}\) by

$$\begin{aligned} \alpha _{F,H} = \left\{ \begin{array}{ll} \displaystyle \frac{\alpha _F^+ \alpha _F^- }{\alpha _F^+ + \alpha _F^-},&{}\quad F \in \mathcal{E}_{I},\\ \alpha _F^-, &{}\quad F \in \mathcal{E}_{{_D}}\cup \mathcal{E}_{{_N}}, \end{array}\right. \end{aligned}$$

which is equivalent to the minimum of \(\alpha \):

$$\begin{aligned} \displaystyle \frac{1}{2}\min \{\alpha _F^+, \alpha _F^- \}\le \alpha _{F,H} \le \min \{\alpha _F^+, \alpha _F^- \} . \end{aligned}$$
(3.5)
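The equivalence (3.5) is elementary, but a quick numerical check over widely varying coefficients makes the robustness explicit. The snippet below is an illustrative sketch; the sampling range and the function name `harmonic_average` are assumptions made for the example. Note that \(\alpha _{F,H}\) as defined above is half of the usual harmonic mean.

```python
import numpy as np

def harmonic_average(a_plus, a_minus):
    """alpha_{F,H} on an interior face: a+ a- / (a+ + a-)."""
    return a_plus * a_minus / (a_plus + a_minus)

rng = np.random.default_rng(0)
# Coefficients spanning twelve orders of magnitude on the two sides of a face.
a_plus = 10.0 ** rng.uniform(-6, 6, size=1000)
a_minus = 10.0 ** rng.uniform(-6, 6, size=1000)

a_H = harmonic_average(a_plus, a_minus)
a_min = np.minimum(a_plus, a_minus)

# Ratio alpha_{F,H} / min(alpha+, alpha-): (3.5) says it lies in [1/2, 1].
ratio = a_H / a_min
```

The ratio stays between 1/2 and 1 regardless of how large the jump is, which is what allows \(\alpha _{F,H}\) to be replaced by the smaller coefficient in the estimates that follow.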

Lemma 6

The bilinear form \((\nabla \cdot \varvec{\tau }, v)\) for \((\varvec{\tau },v)\in H(\mathrm{div};\varOmega )\times L^2(\varOmega )\) has the following representation:

$$\begin{aligned} (\nabla \cdot \varvec{\tau }, v) = -\sum _{K\in \mathcal{T}} (\nabla v,\varvec{\tau })_{K} + \sum _{F\in \mathcal{E}_{I}} (\varvec{\tau }\cdot \mathbf{n}, [\![ v]\!])_F + \sum _{F\in \mathcal{E}_D} (\varvec{\tau }\cdot \mathbf{n}, v)_F \end{aligned}$$
(3.6)

Proof

The representation (3.6) is a consequence of integration by parts.\(\square \)
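For completeness, the integration by parts behind (3.6) can be written out as follows. This sketch assumes that v is piecewise \(H^1\), that \(\varvec{\tau }\) is regular enough for its normal trace to be single-valued and square integrable on each face, and that the jump is \([\![ v]\!] = v|_{K_F^+}-v|_{K_F^-}\) with \(\mathbf{n}\) on F pointing from \(K_F^+\) to \(K_F^-\) (a sign convention that is not spelled out in the text).

```latex
\begin{aligned}
(\nabla\cdot\boldsymbol{\tau},\, v)
 &= \sum_{K\in\mathcal{T}} (\nabla\cdot\boldsymbol{\tau},\, v)_K
  = \sum_{K\in\mathcal{T}} \Big( -(\boldsymbol{\tau},\, \nabla v)_K
     + (\boldsymbol{\tau}\cdot\mathbf{n}_K,\, v)_{\partial K} \Big) \\
 &= -\sum_{K\in\mathcal{T}} (\nabla v,\, \boldsymbol{\tau})_K
    + \sum_{F\in\mathcal{E}_I}
      \big(\boldsymbol{\tau}\cdot\mathbf{n},\; v|_{K_F^+} - v|_{K_F^-}\big)_F
    + \sum_{F\in\mathcal{E}_D} (\boldsymbol{\tau}\cdot\mathbf{n},\, v)_F .
\end{aligned}
```

On an interior face the two neighbouring elements contribute with opposite outward normals, \(\mathbf{n}_{K_F^+}=\mathbf{n}\) and \(\mathbf{n}_{K_F^-}=-\mathbf{n}\), and the normal component of \(\varvec{\tau }\) is continuous, so the two boundary terms combine into the jump term.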

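Define \((\alpha ,\,h)\)-dependent norms on \(\mathcal{T}\) by

$$\begin{aligned} \Vert \varvec{\tau }\Vert _{\alpha ,h}^2:= & {} \Vert \alpha ^{-1/2}\varvec{\tau }\Vert _0^2 + \sum _{F\in \mathcal{E}_{I}\cup \mathcal{E}_D} \frac{h_F}{\alpha _{F,H}} \Vert \varvec{\tau }\cdot \mathbf{n}\Vert _{0,F}^2, \\ |\!|\!| v |\!|\!|_{\alpha ,h}^2:= & {} \sum _{K\in \mathcal{T}} \Vert \alpha ^{1/2}\nabla v\Vert _{0,K}^2 + \sum _{F\in \mathcal{E}_{I}} \frac{\alpha _{F,H}}{h_F} \Vert [\![ v]\!]\Vert _{0,F}^2 + \sum _{F\in \mathcal{E}_D} \frac{\alpha _{F}}{h_F} \Vert v\Vert _{0,F}^2. \end{aligned}$$

Note that the norm \(|\!|\!| \cdot |\!|\!|_{\alpha ,h}\) is the standard \(\alpha \)-weighted DG norm used in the discontinuous Galerkin methods, see [10]. For a \(v\in H_0^1(\varOmega )\), \(|\!|\!| v |\!|\!|_{\alpha ,h} = \Vert \alpha ^{1/2}\nabla v\Vert _0\).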

Lemma 7

There exists a constant \(C>0\) independent of \(\alpha \) and h such that, for all \(\varvec{\tau }_h \in \varSigma _k(K)\),

$$\begin{aligned} \displaystyle \sum _{F\in \mathcal{E}_K}\frac{h_F}{\alpha _K} \Vert \varvec{\tau }_h \cdot \mathbf{n}\Vert _{0,F}^2 \le C \Vert \alpha ^{-1/2}\varvec{\tau }_h\Vert _{0,K}^2, \end{aligned}$$

where \(\mathcal{E}_K\) is the collection of edges (in 2D) or faces (in 3D) of the element K.

Proof

The lemma is a simple consequence of the standard scaling argument and the fact that both \(RT_k(K)\) and \(BDM_{k+1}(K)\) are finite dimensional. \(\square \)

Theorem 8

The following norm equivalence holds with \(C>0\) independent of \(\alpha \) and h:

$$\begin{aligned} \Vert \alpha ^{-1/2}\varvec{\tau }_h\Vert _0 \le \Vert \varvec{\tau }_h\Vert _{\alpha ,h} \le C \Vert \alpha ^{-1/2}\varvec{\tau }_h\Vert _0, \quad \forall \varvec{\tau }_h \in \varSigma _k. \end{aligned}$$
(3.7)

Proof

Since for the harmonic average \(\alpha _{F,H}\), we have \(1/\alpha _{F,H} = 1/\alpha _{F}^+ +1/\alpha _{F}^-\), by Lemma 7, we immediately get the robust discrete norm equivalence. \(\square \)
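The step from Lemma 7 to (3.7) can be spelled out as follows. This is a sketch under an explicit assumption about the form of the mesh-dependent flux norm, namely \(\Vert \varvec{\tau }\Vert _{\alpha ,h}^2 = \Vert \alpha ^{-1/2}\varvec{\tau }\Vert _0^2 + \sum _{F} \frac{h_F}{\alpha _{F,H}} \Vert \varvec{\tau }\cdot \mathbf{n}\Vert _{0,F}^2\); this form is consistent with the left inequality in (3.7) holding with constant one. Only interior faces are displayed; boundary faces are treated in the same way with a single neighbouring element.

```latex
\begin{aligned}
\sum_{F\in\mathcal{E}_I} \frac{h_F}{\alpha_{F,H}}
   \|\boldsymbol{\tau}_h\cdot\mathbf{n}\|_{0,F}^2
 &= \sum_{F\in\mathcal{E}_I}
    \Big( \frac{h_F}{\alpha_F^+} + \frac{h_F}{\alpha_F^-} \Big)
    \|\boldsymbol{\tau}_h\cdot\mathbf{n}\|_{0,F}^2 \\
 &\le \sum_{K\in\mathcal{T}} \sum_{F\in\mathcal{E}_K}
    \frac{h_F}{\alpha_K}\,
    \|\boldsymbol{\tau}_h\cdot\mathbf{n}\|_{0,F}^2
 \;\le\; C \sum_{K\in\mathcal{T}}
    \|\alpha^{-1/2}\boldsymbol{\tau}_h\|_{0,K}^2 .
\end{aligned}
```

The first equality is the identity \(1/\alpha _{F,H} = 1/\alpha _F^+ + 1/\alpha _F^-\), the regrouping uses that the normal component of \(\varvec{\tau }_h\) is single-valued on each face, and the last step is Lemma 7 applied element by element.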

For \(\varvec{\tau }\in H(\mathrm{div};\varOmega )\), define the following \(\alpha \) and h dependent norm:

$$\begin{aligned} \Vert \varvec{\tau }\Vert _{\alpha ,h,H(\mathrm{div})}:= \left( \Vert \alpha ^{-1/2} \varvec{\tau }\Vert _0^2 + \sum _{K\in \mathcal{T}}h_K^2\Vert \alpha ^{-1/2}\nabla \cdot \varvec{\tau }\Vert _{0,K}^2 \right) ^{1/2}. \end{aligned}$$
(3.8)

We also use \(\Vert \varvec{\tau }\Vert _{\alpha ,h,H(\mathrm{div}),K}\) to denote the norm on a single element K.

The following trace inequality can be found in Lemma 2.4 and Remark 2.5 of [10].

Lemma 9

Let F be an edge/face of \(K\in \mathcal{T}\) and \(\mathbf{n}_F\) the unit vector normal to F. Assume that \(\varvec{\tau }\) is a given function in \(H(\mathrm{div};K)\cap [H^r(K)]^d\) with \(r>0\). Then for any \(w_h\in P_k(K)\), we have

$$\begin{aligned} (\varvec{\tau }\cdot \mathbf{n}, w_h)_F\le & {} C\, h_F^{-1/2}\Vert w_h\Vert _{0,F} \left( \Vert \varvec{\tau }\Vert _{0,K} + h_K\Vert \nabla \cdot \varvec{\tau }\Vert _{0,K}\right) . \end{aligned}$$
(3.9)

The following two continuity results are true.

Lemma 10

The following continuity results hold with constants \(C_{con,1}>0\) and \(C_{con,2}>0\) independent of \(\alpha \) and h:

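$$\begin{aligned} (\nabla \cdot \varvec{\tau }_h, v)\le & {} C_{con,1}\, \Vert \alpha ^{-1/2}\varvec{\tau }_h\Vert _{0}\, |\!|\!| v |\!|\!|_{\alpha ,h} \quad \forall \,\, (\varvec{\tau }_h, v) \in \varSigma _k \times \left( H_0^1(\varOmega ) + D_k\right) , \end{aligned}$$
(3.10)
$$\begin{aligned} (\nabla \cdot \varvec{\tau }, v_h)\le & {} C_{con,2}\, \Vert \varvec{\tau }\Vert _{\alpha ,h,H(\mathrm{div})}\, |\!|\!| v_h |\!|\!|_{\alpha ,h} \quad \forall \,\, (\varvec{\tau }, v_h) \in \left( H(\mathrm{div};\varOmega )\cap [H^r(\varOmega )]^d\right) \times D_k. \end{aligned}$$
(3.11)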

Proof

The continuity (3.10) is clear from the representation (3.6), the Cauchy–Schwarz inequality, the definitions of the norms \(\Vert \varvec{\tau }\Vert _{\alpha ,h}\) and \(|\!|\!| v |\!|\!|_{\alpha ,h}\), and the robust norm equivalence result (3.7).

To show (3.11), we still start from the representation (3.6):

$$\begin{aligned} (\nabla \cdot \varvec{\tau }, v_h) = -\sum _{K\in \mathcal{T}} (\nabla v_h,\varvec{\tau })_{K} + \sum _{F\in \mathcal{E}_{I}} (\varvec{\tau }\cdot \mathbf{n}, [\![ v_h]\!])_F + \sum _{F\in \mathcal{E}_D} (\varvec{\tau }\cdot \mathbf{n}, v_h)_F. \end{aligned}$$

For the term \((\varvec{\tau }\cdot \mathbf{n}, [\![ v_h]\!])_F\), where \(F\in \mathcal{E}_I\), by (3.9),

$$\begin{aligned} (\varvec{\tau }\cdot \mathbf{n},\,\, [\![ v_h]\!])_F\le & {} C\, h_F^{-1/2}\Vert [\![ v_h]\!]\Vert _{0,F} \left( \Vert \varvec{\tau }\Vert _{0,K} + h_K\Vert \nabla \cdot \varvec{\tau }\Vert _{0,K}\right) , \end{aligned}$$

where K is one of the elements having F as an edge/face. Choose K to be the element with the smaller \(\alpha _K\). By (3.5), this smaller \(\alpha _K\) is equivalent to the harmonic average \(\alpha _{F,H}\), so

$$\begin{aligned} (\varvec{\tau }\cdot \mathbf{n},\,\, [\![ v_h]\!])_F\le & {} C\, \alpha _{F,H}^{1/2}h_F^{-1/2}\Vert [\![ v_h]\!]\Vert _{0,F} \left( \Vert \alpha ^{-1/2}\varvec{\tau }\Vert _{0,K} + h_K\Vert \alpha ^{-1/2}\nabla \cdot \varvec{\tau }\Vert _{0,K}\right) . \end{aligned}$$

The term \((\varvec{\tau }\cdot \mathbf{n}, v_h)_F\), \(F\in \mathcal{E}_D\), can be handled similarly. Then by the Cauchy–Schwarz inequality, (3.11) can be easily proved. \(\square \)

Lemma 11

The following discrete inf-sup condition

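$$\begin{aligned} \sup _{\varvec{\tau }_h \in \varSigma _k} \displaystyle \frac{(\nabla \cdot \varvec{\tau }_h,v_h)}{ \Vert \alpha ^{-1/2}\varvec{\tau }_h\Vert _{0}} \ge \beta \, |\!|\!| v_h |\!|\!|_{\alpha ,h} \quad \forall \, v_h \in D_k \end{aligned}$$
(3.12)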

holds with a constant \(\beta >0\) independent of \(\alpha \) and h.

Proof

By the robust norm equivalence result (3.7), we only need to prove the result for \(\varvec{\tau }_h\) in the norm \(\Vert \varvec{\tau }_h\Vert _{\alpha ,h}\). Since \(RT_k \subset BDM_{k+1}\), we have

$$\begin{aligned} \sup _{\varvec{\tau }_h \in BDM_{k+1}} \displaystyle \frac{(\nabla \cdot \varvec{\tau }_h,v_h)}{ \Vert \varvec{\tau }_h\Vert _{\alpha ,h}} \ge \sup _{\varvec{\tau }_h \in RT_k} \displaystyle \frac{(\nabla \cdot \varvec{\tau }_h,v_h)}{ \Vert \varvec{\tau }_h\Vert _{\alpha ,h}}, \quad \forall \, v_h \in D_k, \end{aligned}$$

so we only need to prove the RT version.

Choose a \(\tilde{\varvec{\tau }}_h\in RT_k\) such that

$$\begin{aligned} (\tilde{\varvec{\tau }}_h,\nabla q)_K = -(\alpha \nabla v_h, \nabla q)_K \quad \forall \, q\in P_{k-1}(K) \quad \forall \,\, K\in \mathcal{T}\end{aligned}$$

and that

$$\begin{aligned} \tilde{\varvec{\tau }}_h \cdot \mathbf{n}|_F =\left\{ \begin{array}{llll} \displaystyle \frac{\alpha _{F,H}}{h_F}[\![ v_h]\!] &{} \,\, F\in \mathcal{E}_{I}, \\ \displaystyle \frac{\alpha _F}{ h_F} v_h &{} \,\, F\in \mathcal{E}_D, \end{array}\right. \end{aligned}$$
(3.13)

which, together with (3.6), gives

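$$\begin{aligned} (\nabla \cdot \tilde{\varvec{\tau }}_h, v_h) = \sum _{K\in \mathcal{T}} \Vert \alpha ^{1/2}\nabla v_h\Vert _{0,K}^2 + \sum _{F\in \mathcal{E}_{I}} \frac{\alpha _{F,H}}{h_F} \Vert [\![ v_h]\!]\Vert _{0,F}^2 + \sum _{F\in \mathcal{E}_D} \frac{\alpha _{F}}{h_F} \Vert v_h\Vert _{0,F}^2 = |\!|\!| v_h |\!|\!|_{\alpha ,h}^2. \end{aligned}$$
(3.14)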

For every \(K\in \mathcal{T}\), by the standard scaling argument, there exists a constant \(C>0\) independent of \(\alpha \) and the mesh size such that

$$\begin{aligned} \Vert \tilde{\varvec{\tau }}_h\Vert _{0,K}^2 \le C \left( \Vert \alpha _K \nabla v_h \Vert _{0,K}^2 + h_K \sum _{F\in \mathcal{E}_K\cap \mathcal{E}_{I}} \Vert \displaystyle \frac{\alpha _{F,H}}{h_F}[\![ v_h]\!]\Vert _{0,F}^2 +h_K \sum _{F\in \mathcal{E}_K\cap \mathcal{E}_{D}} \Vert \displaystyle \frac{\alpha _{F}}{h_F} v_h\Vert _{0,F}^2 \right) , \end{aligned}$$

which, together with (3.5), gives

$$\begin{aligned} \Vert \alpha _K^{-1/2}\tilde{\varvec{\tau }}_h\Vert _{0,K}^2 \le C \left( \Vert \alpha _K^{1/2} \nabla v_h \Vert _{0,K}^2 +\sum _{F\in \mathcal{E}_K\cap \mathcal{E}_{I}} \displaystyle \frac{\alpha _{F,H}}{h_F} \Vert [\![ v_h]\!]\Vert _{0,F}^2 + \sum _{F\in \mathcal{E}_K\cap \mathcal{E}_{D}} \displaystyle \frac{\alpha _{F}}{h_F} \Vert v_h\Vert _{0,F}^2 \right) , \end{aligned}$$

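Hence, there exists a constant \(\tilde{C}>0\) independent of \(\alpha \) and h such that

$$\begin{aligned} \Vert \tilde{\varvec{\tau }}_h\Vert _{\alpha ,h} \le \tilde{C}\, |\!|\!| v_h |\!|\!|_{\alpha ,h}, \end{aligned}$$

which, together with (3.14), leads to the discrete inf-sup condition of the lemma. \(\square \)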

Define the following discrete divergence-free subspace of \(\varSigma _k\):

$$\begin{aligned} \varSigma _k^0 =\{\varvec{\tau }_h \in \varSigma _k : \nabla \cdot \varvec{\tau }_h =0\}. \end{aligned}$$

Its orthogonal complement is

$$\begin{aligned} (\varSigma _k^0)^\perp =\{\varvec{\tau }_h \in \varSigma _k : (\varvec{\tau }_h, \varvec{\rho }_h) =0, \forall \varvec{\rho }_h \in \varSigma _k^0\}. \end{aligned}$$

Note that the inf-sup condition (3.12) is also equivalent to the following inf-sup condition with \(\beta >0\) independent of \(\alpha \) and h (see Lemma I.4.1 of [22]):

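$$\begin{aligned} \sup _{v_h \in D_k} \displaystyle \frac{(\nabla \cdot \varvec{\tau }_h,v_h)}{ |\!|\!| v_h |\!|\!|_{\alpha ,h}} \ge \beta \, \Vert \alpha ^{-1/2}\varvec{\tau }_h\Vert _{0} \quad \forall \, \varvec{\tau }_h \in (\varSigma _k^0)^\perp . \end{aligned}$$
(3.15)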

The condition (3.12) also guarantees that for each \(g\in L^2(\varOmega )\), there exists a unique solution \(\varvec{\tau }_h \in (\varSigma _k^0)^\perp \) such that

$$\begin{aligned} (\nabla \cdot \varvec{\tau }_h, v_h) = (g, v_h), \quad \forall v_h \in D_k. \end{aligned}$$
(3.16)

Now let us prove the following robust best approximation property for the potential error \(u-u_h\) in the weighted discrete \(H^1\) norm.

Theorem 12

(Robust best approximation in the weighted discrete \(H^1\) norm) Let \((\varvec{\sigma }, u)\) and \((\varvec{\sigma }_h,\,u_h)\in \varSigma _k\times D_k\) be the solutions of (2.1) and (2.2), respectively. Assume that \(u\in H^{1+r}(\varOmega )\) with \(r>0\) and that \(u|_K\in H ^{1+s_K}(K)\) with an element-wise defined regularity \(s_K>0\) for all \(K\in \mathcal{T}\). Then there exists a constant \(C>0\) independent of \(\alpha \) and h, in both two and three dimensions, such that

(3.17)

Proof

By the inf-sup condition, for each \(v_h \in D_k\) we have

(3.18)

By the first equation in the error equations (2.3),

$$\begin{aligned} (\nabla \cdot \varvec{\tau }_h, u_h-v_h)= & {} (\nabla \cdot \varvec{\tau }_h, u-v_h) + (\nabla \cdot \varvec{\tau }_h, u_h-u) \\= & {} (\nabla \cdot \varvec{\tau }_h, u-v_h) - (\alpha ^{-1}(\varvec{\sigma }-\varvec{\sigma }_h),\,\varvec{\tau }_h). \end{aligned}$$

Then, by the continuity result (3.11) and the Cauchy–Schwarz inequality,

Thus by (3.18) and the equivalence of \( \Vert \varvec{\tau }_h\Vert _{\alpha ,h}\) and \(\Vert \alpha ^{-1/2}\varvec{\tau }_h\Vert _{0}\),

A simple application of the triangle inequality yields

By the optimal convergence results of \(\varvec{\sigma }_h\), we have the robust best approximation result of the theorem. \(\square \)

Remark 13

Even though we have the robust best approximation result (3.17), the approximation orders of \(\varSigma _k\) and \(D_k\) differ in the corresponding norms. As a result, the order of convergence of \(u-u_h\) in the discrete \(H^1\) norm is one or two orders lower than that of the corresponding weighted \(L^2\) RT or BDM approximation errors in Theorem 3, respectively.

Due to this order difference, in the a posteriori error analysis, we should only construct the error estimator related to \(\Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _0\).

Now, let us show the robust best approximation property in \(\varSigma _k\).

Theorem 14

(Robust best approximation in the mixed approximation space) The following robust best approximation properties are true with a constant C independent of \(\alpha \) and h:

$$\begin{aligned} \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _0\le & {} C \inf _{\varvec{\tau }_h\in \varSigma _k}\Vert \varvec{\sigma }- \varvec{\tau }_h\Vert _{\alpha ,h,H(\mathrm{div})}, \end{aligned}$$
(3.19)
$$\begin{aligned} \Vert \varvec{\sigma }- \varvec{\sigma }_h\Vert _{\alpha ,h,H(\mathrm{div})}\le & {} C \inf _{\varvec{\tau }_h\in \varSigma _k}\Vert \varvec{\sigma }- \varvec{\tau }_h\Vert _{\alpha ,h,H(\mathrm{div})}. \end{aligned}$$
(3.20)

Proof

For an arbitrary \(\varvec{\tau }_h \in \varSigma _k\), by (3.16), there exists a unique \(\varvec{\zeta }_h \in (\varSigma _k^0)^\perp \), such that

$$\begin{aligned} (\nabla \cdot \varvec{\zeta }_h, v_h) = (\nabla \cdot (\varvec{\sigma }- \varvec{\tau }_h), v_h), \quad \forall v_h \in D_k, \end{aligned}$$

and

(3.21)

By the continuity (3.11),

Thus,

$$\begin{aligned} \Vert \alpha ^{-1/2}\varvec{\zeta }_h\Vert _0 \le C \Vert \varvec{\sigma }- \varvec{\tau }_h\Vert _{\alpha ,h,H(\mathrm{div})}. \end{aligned}$$

Setting \(\varvec{\tau }_h^f := \varvec{\zeta }_h + \varvec{\tau }_h\), it is clear that \(\varvec{\tau }_h^f \in \varSigma _k^f\). Then by the best approximation (3.4),

$$\begin{aligned} \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _0\le & {} \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\tau }_h^f)\Vert _0 \le \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\tau }_h)\Vert _0+\Vert \alpha ^{-1/2}\varvec{\zeta }_h\Vert _0 \\\le & {} C \Vert \varvec{\sigma }- \varvec{\tau }_h\Vert _{\alpha ,h,H(\mathrm{div})}. \end{aligned}$$

On the other hand, since on each element \(K\in \mathcal{T}\),

$$\begin{aligned} (\nabla \cdot \varvec{\zeta }_h, v_h)_K = (\nabla \cdot (\varvec{\sigma }- \varvec{\tau }_h), v_h)_K, \quad \forall v_h \in P_k(K), \end{aligned}$$

and \(\nabla \cdot \varvec{\zeta }_h \in P_k(K)\), we have

$$\begin{aligned} \Vert \nabla \cdot \varvec{\zeta }_h\Vert _{0,K} \le \Vert \nabla \cdot (\varvec{\sigma }- \varvec{\tau }_h)\Vert _{0,K}. \end{aligned}$$
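The inequality above is the usual \(L^2\)-stability of a local projection. As a short sketch, using only that \(\nabla \cdot \varvec{\zeta }_h|_K \in P_k(K)\) is itself an admissible test function \(v_h\) in the element-wise identity, the Cauchy–Schwarz inequality gives

$$\begin{aligned} \Vert \nabla \cdot \varvec{\zeta }_h\Vert _{0,K}^2 = (\nabla \cdot (\varvec{\sigma }- \varvec{\tau }_h), \nabla \cdot \varvec{\zeta }_h)_K \le \Vert \nabla \cdot (\varvec{\sigma }- \varvec{\tau }_h)\Vert _{0,K}\, \Vert \nabla \cdot \varvec{\zeta }_h\Vert _{0,K}, \end{aligned}$$

and dividing by \(\Vert \nabla \cdot \varvec{\zeta }_h\Vert _{0,K}\) (when it is nonzero) yields the bound.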

Since \(\nabla \cdot (\varvec{\sigma }_h-\varvec{\tau }_h^f)=0\), we have

$$\begin{aligned} \Vert \alpha ^{-1/2}\nabla \cdot (\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0,K}\le & {} \Vert \alpha ^{-1/2}\nabla \cdot (\varvec{\sigma }-\varvec{\tau }_h^f)\Vert _{0,K}+\Vert \alpha ^{-1/2}\nabla \cdot (\varvec{\sigma }_h-\varvec{\tau }_h^f)\Vert _{0,K}\\= & {} \Vert \alpha ^{-1/2}\nabla \cdot (\varvec{\sigma }-\varvec{\tau }_h^f)\Vert _{0,K}\\\le & {} \Vert \alpha ^{-1/2}\nabla \cdot (\varvec{\sigma }-\varvec{\tau }_h)\Vert _{0,K}+ \Vert \alpha ^{-1/2}\nabla \cdot \varvec{\zeta }_h\Vert _{0,K} \\\le & {} 2 \Vert \alpha ^{-1/2}\nabla \cdot (\varvec{\sigma }-\varvec{\tau }_h)\Vert _{0,K}. \end{aligned}$$

With this, together with (3.19), the robust best approximation property (3.20) in \(\Vert \cdot \Vert _{\alpha ,h,H(\mathrm{div})}\) can be proved. \(\square \)

We classify the elements in the mesh into two sets:

$$\begin{aligned} \mathcal{T}_{low} = \{ K\in \mathcal{T}: 0<s_K<1\} \quad \text{ and }\quad \mathcal{T}_{high} = \{ K\in \mathcal{T}: 1\le s_K\}. \end{aligned}$$
(3.22)

Theorem 15

(Robust local a priori error estimates in weighted \(H(\mathrm{div})\) norm) Let \((\varvec{\sigma }, u)\) and \((\varvec{\sigma }_h,\,u_h) \in \varSigma _k \times D_k\) \((k\ge 0)\) be the solutions of (2.1) and (2.2), respectively. Assume that \(u\in H^{1+r}(\varOmega )\) with some \(r>0\) and that \(u|_K\in H ^{1+s_K}(K)\) with an element-wise defined regularity \(s_K>0\) for all \(K\in \mathcal{T}\). Then there exists a constant \(C>0\) independent of \(\alpha \) and h, in both two and three dimensions, such that

$$\begin{aligned} \Vert \varvec{\sigma }- \varvec{\sigma }_h\Vert _{\alpha ,h,H(\mathrm{div})}&\le C \sum _{K\in \mathcal{T}_{low}} \left( h_K^{s_K} |\alpha ^{1/2}\nabla u|_{s_K,K} + h_K \Vert \alpha ^{-1/2}f\Vert _{0,K}\right) \\&\quad + C\sum _{K\in \mathcal{T}_{high}} \left( h_K^{\min \{k+1,s_K\}} |\alpha ^{1/2}\nabla u|_{\min \{k+1,s_K\},K} \right. \\&\quad + \left. h_K^{\min \{k+2,s_K\}}\Vert \alpha ^{-1/2}f\Vert _{\min \{k+1,s_K-1\},K}\right) , RT_{k} \text{ case }. \\ \Vert \varvec{\sigma }- \varvec{\sigma }_h\Vert _{\alpha ,h,H(\mathrm{div})}&\le C \sum _{K\in \mathcal{T}_{low}} \left( h_K^{s_K} |\alpha ^{1/2}\nabla u|_{s_K,K} + h_K \Vert \alpha ^{-1/2}f\Vert _{0,K}\right) \\&\quad + C\sum _{K\in \mathcal{T}_{high}} h_K^{\min \{k+2,s_K\}} \left( |\alpha ^{1/2}\nabla u|_{\min \{k+2,s_K\},K} \right. \\&\quad +\left. \Vert \alpha ^{-1/2}f\Vert _{\min \{k+1,s_K-1\},K}\right) , BDM_{k+1} \text{ case }. \end{aligned}$$

Proof

By the definition of the norm \(\Vert \cdot \Vert _{\alpha ,h,H(\mathrm{div})}\), we only need to discuss the term

$$\begin{aligned} h_K\Vert \alpha ^{-1/2}\nabla \cdot (\varvec{\sigma }- \varvec{\sigma }_h)\Vert _{0,K} = h_K\Vert \alpha ^{-1/2}(f-Q^k_h f)\Vert _{0,K} \end{aligned}$$

for each element \(K\in \mathcal{T}\).

The first case is that the regularity is low in the element \(K\in \mathcal{T}_{low}\), with \(0< s_K <1\). In this case, notice that \(f \in L^2(K)\), thus

$$\begin{aligned} h_K\Vert \alpha ^{-1/2}(f-Q^k_h f)\Vert _{0,K} \le h_K\Vert \alpha ^{-1/2}f\Vert _{0,K}. \end{aligned}$$

Compared to the error \(h_K^{s_K} |\alpha ^{1/2}\nabla u|_{s_K,K}\) from the weighted \(L^2\) approximation, this term is of higher order, since \(s_K<1\).
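The bound for the low regularity case can be checked in one line. As a sketch, assuming (as the notation suggests) that \(Q^k_h\) is the \(L^2(K)\)-orthogonal projection onto \(P_k(K)\), the orthogonality \((f-Q^k_h f, Q^k_h f)_K=0\) gives

$$\begin{aligned} \Vert f\Vert _{0,K}^2 = \Vert f-Q^k_h f\Vert _{0,K}^2 + \Vert Q^k_h f\Vert _{0,K}^2 \ge \Vert f-Q^k_h f\Vert _{0,K}^2. \end{aligned}$$

Since \(\alpha \) is constant on K, multiplying both sides by \(h_K^2\alpha _K^{-1}\) and taking square roots gives the weighted inequality.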

The other case is that \(s_K \ge 1\) in the element K. Since \(\alpha _K\) is assumed to be constant in K, we have \(f = -\nabla \cdot (\alpha _K\nabla u) = -\alpha _K \varDelta u \in H^{s_K-1}(K)\), and thus

$$\begin{aligned} h_K\Vert \alpha ^{-1/2}(f-Q^k_h f)\Vert _{0,K} \le C h_K^ {\min \{s_K,k+2\}}\Vert \alpha ^{-1/2}f\Vert _{\min \{s_K-1,k+1\},K}. \end{aligned}$$

Compared with the weighted \(L^2\) error, this term is of the same order for the \(BDM_{k+1}\) approximation and one order higher for the \(RT_k\) approximation. \(\square \)
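To make the comparison concrete, consider the lowest-order case \(k=0\) on an element with \(s_K=2\) (values chosen here only for illustration). The element contributions in Theorem 15 then read

$$\begin{aligned} RT_0:&\quad h_K\, |\alpha ^{1/2}\nabla u|_{1,K} + h_K^{2}\, \Vert \alpha ^{-1/2}f\Vert _{1,K}, \\ BDM_1:&\quad h_K^{2} \left( |\alpha ^{1/2}\nabla u|_{2,K} + \Vert \alpha ^{-1/2}f\Vert _{1,K}\right) , \end{aligned}$$

so the divergence term is one order higher than the flux term for \(RT_0\) and of the same order for \(BDM_1\).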

Remark 16

One may want to use Brezzi’s theory directly, as in [25], to get the following a priori error estimate

This is not right: for problems with low regularity, the \(L^2\) norm of the trace \(\Vert \varvec{\sigma }\cdot \mathbf{n}\Vert _{0,F}\) is not defined, and thus \(\Vert \varvec{\sigma }\Vert _{\alpha ,h}\) is not well-defined. Moreover, the result obtained in this way is sub-optimal for the flux approximation.

Remark 17

In the standard mixed method analysis, the \(L^2\) norm of \(u-u_h\) is analyzed, and it has the same order of convergence as the RT approximation. In the case of the robust local a priori error estimate, we cannot get a robust local estimate for \(\Vert \alpha ^{1/2}(u-u_h)\Vert _0\), since a robust inf-sup condition

$$\begin{aligned} \sup _{\varvec{\tau }_h \in \varSigma _k} \displaystyle \frac{(\nabla \cdot \varvec{\tau }_h,v_h)}{ \Vert \varvec{\tau }_h\Vert _{\alpha ,h,H(\mathrm{div})}} \ge \beta \Vert \alpha ^{1/2} v_h \Vert _0 \quad \forall \, v_h \in D_k, \end{aligned}$$

with a constant \(\beta \) independent of h and \(\alpha \) is not available.

4 Stenberg’s Post-processing

Since, in the mixed methods, the convergence order of the approximation \(u_h\) measured in the weighted discrete \(H^1\) energy norm is lower than that of the flux approximation, we introduce Stenberg’s post-processing to recover an approximation of the same order.

On each element \(K\in \mathcal{T}\), if \((\varvec{\sigma }_h,u_h) \in RT_k\times D_k\) (\(k\ge 0\)) or \((\varvec{\sigma }_h,u_h) \in BDM_k\times D_{k-1}\) (\(k\ge 1\)), i.e., the index of the flux approximation space is k, we find a \(u_{h,K}^* \in P_{k+1}(K)\), such that

$$\begin{aligned} (\alpha \nabla u_{h,K}^*, \nabla v_h)_K = (f,v_h)_K - (\varvec{\sigma }_h\cdot \mathbf{n}, v_h)_{\partial K}, \quad \forall v_h\in P_{k+1}(K)/\mathbb {R}, \end{aligned}$$
(4.1)

and

$$\begin{aligned} \int _K u_{h,K}^* dx = \int _K u_{h} dx. \end{aligned}$$
(4.2)
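The local problem (4.1) is tested with mean-free functions only, and this loses nothing because its right-hand side vanishes for constants. As a sketch, assume the divergence relation \(\nabla \cdot \varvec{\sigma }_h = Q_h f\) used in the proof of Theorem 15. Here \(Q_h\) is shorthand, introduced only for this remark, for the \(L^2\) projection onto the local potential space (\(P_k(K)\) or \(P_{k-1}(K)\), respectively), which contains the constants. The divergence theorem then gives

$$\begin{aligned} (f,1)_K - (\varvec{\sigma }_h\cdot \mathbf{n}, 1)_{\partial K} = (f-\nabla \cdot \varvec{\sigma }_h, 1)_K = (f-Q_h f, 1)_K = 0. \end{aligned}$$

Hence (4.1) in fact holds for all \(v_h\in P_{k+1}(K)\). The gradient form is coercive on \(P_{k+1}(K)/\mathbb {R}\), so (4.1) determines \(u_{h,K}^*\) up to a constant, which is fixed by (4.2).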

We first prove the following trace theorem by using techniques in [4, 12].

Theorem 18

For an element \(K\in \mathcal{T}\) with the mesh size \(h_K\), we have

$$\begin{aligned} \Vert \varvec{\tau }\cdot \mathbf{n}\Vert _{-1/2,\partial K} \le C(\Vert \varvec{\tau }\Vert _{0,K} + h_K \Vert \nabla \cdot \varvec{\tau }\Vert _{0,K}), \quad \forall \varvec{\tau }\in H(\mathrm{div};K). \end{aligned}$$
(4.3)

Proof

For any \(\varvec{\tau }\in H(\mathrm{div};K)\) and \(v\in H^1(K)\), we have the following identity:

$$\begin{aligned} \langle v, \varvec{\tau }\cdot \mathbf{n}\rangle _{\partial K} =(\varvec{\tau }, \nabla v)_K + (\nabla \cdot \varvec{\tau }, v)_K, \end{aligned}$$
(4.4)

where \(\langle v, \varvec{\tau }\cdot \mathbf{n}\rangle _{\partial K}\) should be viewed as the duality pairing between \(H^{1/2}(\partial K)\) and \(H^{-1/2}(\partial K)\). Thus, with \(v\in H^1(K)\) denoting any extension of \(g\in H^{1/2}(\partial K)\), i.e., \(v=g\) on \(\partial K\),

$$\begin{aligned} \Vert \varvec{\tau }\cdot \mathbf{n}\Vert _{-1/2,\partial K} = \sup _{g\in H^{1/2}(\partial K)} \displaystyle \frac{(\varvec{\tau }, \nabla v)_K + (\nabla \cdot \varvec{\tau }, v)_K}{\Vert g\Vert _{1/2,\partial K}}. \end{aligned}$$

On a reference element \(\hat{K}\), given \(g\in H^{1/2}(\partial \hat{K})\), consider the following equation

$$\begin{aligned} - \varDelta z + z =0 \text{ in } \hat{K}, \quad z = g \text{ on } \partial \hat{K}. \end{aligned}$$

By the elliptic stability theory, we have

$$\begin{aligned} \Vert \nabla z\Vert _{0,\hat{K}} + \Vert z\Vert _{0,\hat{K}} \le C\Vert g\Vert _{1/2, \partial \hat{K}}. \end{aligned}$$

Mapping back to the physical element K, we have that, given a \(g\in H^{1/2}(\partial K)\), there exists a \(w_g\in H^1(K)\) with \(w_g=g\) on \(\partial K\), such that

$$\begin{aligned} \Vert \nabla w_g\Vert _{0, K} + h_K^{-1} \Vert w_g\Vert _{0, K} \le C\Vert g\Vert _{1/2, \partial K}. \end{aligned}$$

Thus

$$\begin{aligned} \Vert \varvec{\tau }\cdot \mathbf{n}\Vert _{-1/2,\partial K} = \sup _{g\in H^{1/2}(\partial K)} \displaystyle \frac{(\varvec{\tau }, \nabla w_g)_K + (\nabla \cdot \varvec{\tau }, w_g)_K}{\Vert g\Vert _{1/2,\partial K}} \le C(\Vert \varvec{\tau }\Vert _{0,K} + h_K \Vert \nabla \cdot \varvec{\tau }\Vert _{0,K}). \end{aligned}$$
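The last inequality combines the Cauchy–Schwarz inequality with the scaled stability bound for \(w_g\). Spelled out as a sketch, for each \(g\in H^{1/2}(\partial K)\),

$$\begin{aligned} (\varvec{\tau }, \nabla w_g)_K + (\nabla \cdot \varvec{\tau }, w_g)_K&\le \Vert \varvec{\tau }\Vert _{0,K} \Vert \nabla w_g\Vert _{0,K} + h_K\Vert \nabla \cdot \varvec{\tau }\Vert _{0,K}\, h_K^{-1}\Vert w_g\Vert _{0,K} \\&\le \left( \Vert \varvec{\tau }\Vert _{0,K} + h_K \Vert \nabla \cdot \varvec{\tau }\Vert _{0,K}\right) \left( \Vert \nabla w_g\Vert _{0, K} + h_K^{-1} \Vert w_g\Vert _{0, K}\right) \\&\le C \left( \Vert \varvec{\tau }\Vert _{0,K} + h_K \Vert \nabla \cdot \varvec{\tau }\Vert _{0,K}\right) \Vert g\Vert _{1/2, \partial K}, \end{aligned}$$

where the second step uses \(ab+cd\le (a+c)(b+d)\) for nonnegative numbers.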

\(\square \)

Theorem 19

In each element \(K\in \mathcal{T}\), the following robust best approximation property holds:

$$\begin{aligned} \Vert \alpha ^{1/2}_K\nabla (u-u_{h,K}^*)\Vert _{0,K} \le C \left( \inf _{w_h \in P_{k+1}(K)}\Vert \alpha ^{1/2}_K\nabla (u-w_h)\Vert _{0,K} + \Vert \varvec{\sigma }-\varvec{\sigma }_h\Vert _{\alpha ,h, H(\mathrm{div}),K} \right) . \end{aligned}$$
(4.5)

Proof

Let \(w_h\) be an arbitrary function in \(P_{k+1}(K)\), and \(v_h= u_{h,K}^* - w_h\). Let \(\overline{v}_h = \int _K v_h dx /|K|\) be the average of \(v_h\) on K, then \(v_h-\overline{v}_h\) belongs to the test space \(P_{k+1}(K)/\mathbb {R}\). Then

$$\begin{aligned} \Vert \alpha ^{1/2}_K\nabla (u_{h,K}^* - w_h)\Vert _{0,K}^2= & {} \Vert \alpha ^{1/2}_K\nabla v_h\Vert _{0,K}^2 = (\alpha \nabla (u_{h,K}^* - w_h), \nabla v_h)_K\\= & {} (\alpha _K\nabla u_{h,K}^*,\nabla (v_h- \overline{v}_h))_K -(\alpha \nabla w_h, \nabla v_h)_K\\= & {} (f,v_h-\overline{v}_h)_K - (\varvec{\sigma }_h\cdot \mathbf{n}, v_h-\overline{v}_h)_{\partial K} -(\alpha _K\nabla w_h, \nabla v_h)_K \\= & {} (\alpha _K \nabla (u-w_h),\nabla v_h)_K +((\varvec{\sigma }-\varvec{\sigma }_h)\cdot \mathbf{n}, v_h-\overline{v}_h)_{\partial K}, \end{aligned}$$

where we use the fact that \((\alpha _K \nabla u, \nabla v)_K = (f,v)_K - (\varvec{\sigma }\cdot \mathbf{n}, v)_{\partial K}\) is true for any \(v \in H^1(K)\).
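This identity follows from (4.4). As a sketch, assuming the flux is defined by \(\varvec{\sigma }= -\alpha \nabla u\) with \(\nabla \cdot \varvec{\sigma }= f\) (the sign convention consistent with (1.1) and with the identity itself), taking \(\varvec{\tau }=\varvec{\sigma }\) in (4.4) gives

$$\begin{aligned} (\alpha _K \nabla u, \nabla v)_K = -(\varvec{\sigma }, \nabla v)_K = (\nabla \cdot \varvec{\sigma }, v)_K - \langle v, \varvec{\sigma }\cdot \mathbf{n}\rangle _{\partial K} = (f,v)_K - (\varvec{\sigma }\cdot \mathbf{n}, v)_{\partial K}. \end{aligned}$$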

By the Cauchy–Schwarz inequality,

$$\begin{aligned} (\alpha _K \nabla (u-w_h),\nabla v_h)_K \le \Vert \alpha _K^{1/2}\nabla (u-w_h)\Vert _{0,K}\Vert \alpha _K^{1/2}\nabla v_h\Vert _{0,K}. \end{aligned}$$

By the definition of the dual norm, the trace inequality (4.3), and the fact \(\Vert v_h-\overline{v}_h\Vert _{0,K} \le C h_K\Vert \nabla v_h\Vert _{0,K}\), we have

$$\begin{aligned}&((\varvec{\sigma }-\varvec{\sigma }_h)\cdot \mathbf{n}, v_h-\overline{v}_h)_{\partial K} \le \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\cdot \mathbf{n}\Vert _{-1/2,\partial K} \Vert \alpha ^{1/2}(v_h-\overline{v}_h)\Vert _{1/2,\partial K} \\&\quad \le C h_K^{-1}\Vert \alpha ^{1/2}(v_h-\overline{v}_h)\Vert _{0,K} \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\cdot \mathbf{n}\Vert _{-1/2,\partial K}\\&\quad \le C \Vert \alpha ^{1/2}\nabla v_h\Vert _{0,K}(\Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0,K} + h_K\Vert \alpha ^{-1/2}\nabla \cdot (\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0,K}). \end{aligned}$$

Thus

$$\begin{aligned}&\Vert \alpha ^{1/2}\nabla (u_{h,K}^* - w_h)\Vert _{0,K} \\&\quad \le C(\Vert \alpha ^{1/2}\nabla (u-w_h)\Vert _{0,K} + \Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0,K} + h_K\Vert \alpha ^{-1/2}\nabla \cdot (\varvec{\sigma }-\varvec{\sigma }_h)\Vert _{0,K}). \end{aligned}$$

By the triangle inequality,

$$\begin{aligned} \Vert \alpha ^{1/2}\nabla (u-u_{h,K}^*)\Vert _{0,K} \le \Vert \alpha ^{1/2}\nabla (u-w_h)\Vert _{0,K}+\Vert \alpha ^{1/2}\nabla (u_{h,K}^* - w_h)\Vert _{0,K}. \end{aligned}$$

The theorem is proved. \(\square \)

By the approximation property of \(P_{k+1}(K)\) and the robust local optimal error estimate of \(\varvec{\sigma }_h\), we immediately have the following robust local optimal error estimate for Stenberg’s post-processing.

Theorem 20

For both the \((\varvec{\sigma }_h,u_h) \in RT_k\times D_k\) (\(k\ge 0\)) and the \((\varvec{\sigma }_h,u_h) \in BDM_k\times D_{k-1}\) (\(k\ge 1\)) cases, Stenberg’s recovery \(u_{h,K}^* \in P_{k+1}(K)\) satisfies the following robust local a priori error estimate in the low regularity elements \(K\in \mathcal{T}_{low}\) with \(0< s_K<1\):

$$\begin{aligned} \Vert \alpha ^{1/2}_K\nabla (u-u_{h,K}^*)\Vert _{0,K} \le C \left( h_K^{s_K} |\alpha ^{1/2}\nabla u|_{s_K,K} + h_K \Vert \alpha ^{-1/2}f\Vert _{0,K}\right) , \quad K \in \mathcal{T}_{low}. \end{aligned}$$

For those elements \(K\in \mathcal{T}_{high}\) with \(1\le s_K\), the following robust local a priori error estimate holds:

$$\begin{aligned} \Vert \alpha ^{1/2}_K\nabla (u-u_{h,K}^*)\Vert _{0,K}\le & {} C \left( h_K^{\min \{k+1,s_K\}} |\alpha ^{1/2}\nabla u|_{\min \{k+1,s_K\},K} \right. \\&+ \left. h_K^{\min \{k+2,s_K\}}\Vert \alpha ^{-1/2}f\Vert _{\min \{k+1,s_K-1\},K} \right) , RT_{k}\times D_k \text{ case }. \\ \Vert \alpha ^{1/2}_K\nabla (u-u_{h,K}^*)\Vert _{0,K}\le & {} C h_K^{\min \{k+1,s_K\}} \left( |\alpha ^{1/2}\nabla u|_{\min \{k+1,s_K\},K} \right. \\&+\left. \Vert \alpha ^{-1/2}f\Vert _{\min \{k,s_K-1\},K}\right) , BDM_{k}\times D_{k -1} \text{ case }. \end{aligned}$$

Remark 21

There are other post-processing methods available, such as the one proposed in [2] and analyzed in [30]. Since the recovered potential there is also built mainly from the numerical flux \(\varvec{\sigma }_h\), a similar robust and local optimal a priori error estimate can also be derived.

It is also well known that, if the mixed method is implemented by hybridization, the Lagrange multiplier is a better approximation of u than \(u_h\) and is a good source for post-processing or solution reconstruction. With careful analysis, it should not be hard to derive a robust and local optimal result for the Lagrange multiplier and its post-processed solution under a similar weighted discrete \(H^1\) norm.

5 Final Comments

In this paper, for elliptic interface problems in two and three dimensions with possibly very low regularity, we establish robust and local optimal a priori error estimates for the Raviart–Thomas and Brezzi–Douglas–Marini mixed finite element approximations. For the flux approximation, we show the robust best approximation in the discrete equilibrated space and in the whole mixed approximation space with appropriate norms: an \(\alpha \)-weighted \(L^2\) norm or an \((\alpha ,h)\)-weighted \(H(\mathrm{div})\) norm. We show the robust local optimal error estimates for the flux approximation in these norms. For the potential approximation, we show a robust best approximation result in a weighted discrete \(H^1\) norm and show that the convergence order is sub-optimal compared to the flux approximation. We then show that, with the flux as the main source of post-processing, Stenberg’s post-processing can recover a potential with the robust local optimal error estimate.

These robust and local optimal a priori estimates provide guidance for constructing robust a posteriori error estimates and adaptive methods for the mixed approximations. For robust a posteriori error estimation for the mixed methods of the interface problem, we should focus on \(\Vert \alpha ^{-1/2}(\varvec{\sigma }-\varvec{\sigma }_h)\Vert _0\), like the approaches in [1, 14, 24, 30]. The approaches in [7, 25] are not optimal since they all try to put \(u_h\) into the estimator. If any post-processing is going to be used to construct the a posteriori error estimator, the main source of information should be the numerical flux \(\varvec{\sigma }_h\), not the numerical potential \(u_h\) itself.