1 Introduction

In this paper, we consider the following second-order elliptic problem

$$ \begin{array}{@{}rcl@{}} -\nabla\cdot(A\nabla u) &=& f \quad \text{in} \ {\varOmega}, \end{array} $$
(1)
$$ \begin{array}{@{}rcl@{}} u &=& 0 \quad \text{on} \ \partial{\varOmega}, \end{array} $$
(2)

where Ω is a bounded polygonal or polyhedral domain in \(\mathbb {R}^{d} (d=2,3)\) with boundary \(\partial {\varOmega }\). The domain is partitioned into m non-overlapping sub-domains \({\varOmega }_{i}, 1\leqslant i\leqslant m\). Let \(\mathcal {T}_{0}\) be an initial partition of Ω that is consistent with the decomposition \(\bar {{\varOmega }}={\bigcup }_{i=1}^{m}\bar {{\varOmega }}_{i}\). For each \(\tau \in \mathcal {T}_{0}\), assume that A(x) is constant on τ and satisfies \(\alpha _{0} \leqslant A(x)\leqslant \beta _{0}, x\in {\varOmega }\), where \(\alpha _{0}\) and \(\beta _{0}\) are positive constants. The variational formulation of (1) and (2) is to find \(u\in {H_{0}^{1}}({\varOmega })\) such that

$$ (A\nabla u, \nabla v)=(f, v), \quad \forall\ v\in {H_{0}^{1}}({\varOmega}). $$
(3)

Weak Galerkin (WG) methods were proposed recently to solve partial differential equations. They use weak differential operators to approximate classical differential operators, e.g., the gradient, divergence and curl. WG methods were first introduced by Wang and Ye [30, 31] to solve second-order elliptic problems. Since then, WG methods have been successfully applied to many problems, for example, second-order elliptic interface problems [18], parabolic equations [10, 15, 38], the Helmholtz equation [9, 20, 23], biharmonic equations [19, 21, 29], Stokes equations [32, 33], Maxwell equations [22, 26], reaction-diffusion problems [1], Navier-Stokes equations [13, 17], Darcy-Stokes equations [6] and Darcy equations [16]. Later, a modified weak Galerkin (MWG) method was put forward by Wang, Malluwawadu, Gao and McMillan [34] for elliptic problems. Compared with WG methods, MWG methods involve fewer unknowns, while the accuracy stays the same. The MWG method has since been applied to other problems, such as parabolic problems [11], the Sobolev equation [12], Signorini and obstacle problems [37] and the Stokes problem [27].

The solution of (3) may contain singularities. We can resolve a singularity by refining the mesh uniformly. However, uniform refinement increases the computational workload dramatically, since the number of unknowns grows rapidly. In this paper, we use local mesh refinement to resolve the singularity, which places denser grids where the solution changes rapidly. In other words, uniform refinement needs more computational work to reach the same accuracy. The adaptive finite element method (AFEM) is a local mesh refinement technique that can optimize the relation between accuracy and computational work. The theoretical study of adaptive conforming finite element methods is relatively mature; see [24, 28] and the references therein. However, most of the existing work on adaptive weak Galerkin methods for second-order elliptic problems is limited to the design and analysis of a posteriori error estimators. For example, Chen, Wang and Ye [5] first defined an error estimator and proved an a posteriori error estimate by applying the Helmholtz decomposition of \(L^{2}\) functions. Li, Mu and Ye [14] gave a simplified a posteriori error indicator that is applicable to polygonal meshes, mixed meshes and other general meshes with hanging nodes. Zhang and Chen [40] designed an error indicator and proved upper and lower bound estimates in a discrete \(H^{1}\) norm. Zhang, Li, Li and Zhang [39] proposed an a posteriori error estimate for elliptic problems with mixed boundary conditions. For MWG methods, Zhang and Lin [41] presented an a posteriori error estimate for second-order elliptic problems. There are only a few results on the convergence of adaptive WG or MWG methods. Xie and Zhong [36] first proved the convergence of an adaptive weak Galerkin method for the model problem (1)-(2), in which the combination of polynomial spaces in [30] is considered.
Recently, Xie, Cao, Chen and Zhong [35] proved not only the convergence but also the quasi-optimality of an adaptive MWG method; however, they only considered the lowest-order case. Compared with the existing work [35, 36], we use the much simpler finite element spaces introduced in [34], at the price of a more complicated bilinear form in the discrete variational problem. We also design and analyze a simpler a posteriori error estimator, but the corresponding convergence proof is more complex, because the penalty term in our bilinear form involves a negative power of the local mesh size together with the jump term. In particular, the errors in our convergence result are measured only in the “energy norm”, whereas the convergence result of [36] also involves the error in the \(L^{2}\) norm and the data oscillation.

The main purpose of this paper is to construct a simpler a posteriori error estimator and to prove the convergence of an adaptive modified weak Galerkin (AMWG) algorithm for any order. It is worth mentioning that our error estimator is also simpler than the one in [41], where the jump term is a component of the error estimator. In this paper, we not only drop this jump term from our error estimator, but also prove that the jump term can be controlled by the error estimator and establish the corresponding reliability of the error estimator. Furthermore, the usual orthogonality property of conforming finite elements does not hold for MWG methods. To overcome this difficulty, we follow the theoretical analysis of the adaptive interior penalty discontinuous Galerkin method in [3]. However, it is not straightforward to extend the convergence analysis in [3] to adaptive MWG methods, because in MWG methods the gradient operator is approximated by a weak gradient defined in the distributional sense rather than by the classical differential operator. To handle this, we estimate the difference between the classical gradient operator and the weak gradient operator.

The remainder of the paper is structured as follows. In Section 2, the modified weak Galerkin method and some notation and preliminaries are introduced. In Section 3, an adaptive modified weak Galerkin (AMWG) method for solving (3) is introduced and each step of AMWG is described. In Section 4, the convergence of AMWG is obtained by showing the reliability of the error estimator, the comparison of solutions and the reduction of the error estimator. Finally, numerical experiments are presented in Section 5 to support the theoretical results.

2 A modified weak Galerkin method

In this section, we first define the modified weak gradient operator. Then, we introduce the MWG method for (3). Finally, we present some preliminaries.

For any domain \(D\subset \mathbb {R}^{d}, d=2, 3\), we denote by \((\cdot ,\cdot )_{D}\) and \(\|\cdot \|_{D}\) the \(L^{2}\)-inner product and \(L^{2}\)-norm, respectively. We also use the standard definition of the Sobolev space \(H^{1}(D)\) and its associated norm \(\|\cdot \|_{1,D}\). In particular,

$$ {H_{0}^{1}}(D)=\{v\in H^{1}(D)\ {:}\ v|_{\partial D}=0\}. $$

2.1 Modified weak gradient

Given a shape-regular triangulation \(\mathcal {T}\) for Ω, we define MWG finite element spaces as follows:

$$ \begin{array}{@{}rcl@{}} \mathcal{V}(\mathcal{T})&=&\left\{ v :\left.v\right|_{\tau} \in P_{l}(\tau), \forall\ \tau\in \mathcal{T}\right\} , \\ \mathcal{V}^{0}(\mathcal{T})&=&\left\{ v \in \mathcal{V}(\mathcal{T}):\left.v\right|_{e}=0, \forall\ e \in \partial {\varOmega}\right\}, \end{array} $$
(4)

where \(P_{l}(\tau )\) denotes the set of polynomials on τ of degree at most l (\(l\geqslant 1\)).

Let \(\mathcal {E}_{\mathcal {T}}\) be the set of all the edges (or faces) and \(\mathcal {E}_{\mathcal {T}}^{0}\) be the set of all the interior edges (or faces). For any \(e\in \mathcal {E}_{\mathcal {T}}^{0}\), we assume that e is the common edge (or face) of \(\tau _{1}, \tau _{2}\in \mathcal {T}\), and that \(\boldsymbol {n}_{1}\) and \(\boldsymbol {n}_{2}\) are the unit outward normal vectors on e for \(\tau _{1}\) and \(\tau _{2}\), respectively. For a scalar function ϕ and a vector function w, their averages and jumps on e are defined as

$$ \begin{array}{@{}rcl@{}} \left\{\left\{ \phi \right\}\right\}_{e} = (\phi|_{\tau_{1}} + \phi|_{\tau_{2}})/{2}, &&\left[\left[ \phi\right]\right]_{e} =\phi|_{\tau_{1}} \boldsymbol{n}_{1} + \phi|_{\tau_{2}} \boldsymbol{n}_{2}, \\ \left\{\left\{\boldsymbol{w}\right\}\right\}_{e} = (\boldsymbol{w}|_{\tau_{1}} + \boldsymbol{w}|_{\tau_{2}})/{2}, &&\left[\left[ \boldsymbol{w}\right]\right]_{e}=\boldsymbol{w}|_{\tau_{1}} \cdot\boldsymbol{n}_{1} + \boldsymbol{w}|_{\tau_{2}}\cdot \boldsymbol{n}_{2}, \end{array} $$

where \(\phi |_{\tau _{i}}\) and \(\boldsymbol {w}|_{\tau _{i}}\) denote the values of ϕ and w on \(\tau _{i}\), \(i=1,2\), respectively.

For any \(e\subset \partial {\varOmega }\), denoting by \(\boldsymbol {n}\) the unit outward normal vector on e, we also define

$$ \begin{array}{@{}rcl@{}} \left\{\left\{ \phi \right\}\right\}_{e}=\phi, && \left[\left[ \phi \right]\right]_{e}= \phi\boldsymbol{n}, \\ \left\{\left\{ \boldsymbol{w}\right\}\right\}_{e}=\boldsymbol{w}, && \left[\left[ \boldsymbol{w}\right]\right]_{e}=\boldsymbol{w}\cdot\boldsymbol{n}. \end{array} $$

Next we define the modified weak gradient operator used in the MWG methods.

Definition 1 (Definition 1.1 in [34])

Given a partition \(\mathcal {T}\) of Ω, let v be a piecewise smooth function on Ω. For all \(\tau \in \mathcal {T}\), the discrete gradient of v on τ is the unique element \(\nabla _{w,\tau } v\) in \([P_{l-1}(\tau )]^{d}\) such that

$$ \begin{array}{@{}rcl@{}} (\nabla_{w, \tau} v, q)_{\tau} := - (v, \nabla\cdot q)_{\tau} + \langle \left\{\left\{ v \right\}\right\}, q \cdot \mathbf{n}\rangle_{\partial \tau}, \quad \forall q\in \left[P_{l-1}(\tau)\right]^{d}. \end{array} $$
(5)

For simplicity of notation, when no ambiguity arises, we shall abbreviate \(\nabla _{w,\tau }\) as \(\nabla _{w}\).

Using Green's formula, we obtain the following relationship between the weak gradient and the classical gradient:

$$ (\nabla_{w, \tau} v, q)_{\tau} = (\nabla v, q)_{\tau} + \langle \left\{\left\{ v \right\}\right\} - v, q \cdot \mathbf{n}\rangle_{\partial \tau}, \forall q\in \left[P_{l-1}(\tau)\right]^{d}, $$
(6)

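Hence, choosing \(v\in {H_{0}^{1}}({\varOmega })\cap \mathcal {V}(\mathcal {T})\) in (6), for which \(\left \{\left \{ v\right \}\right \}=v\) on \(\partial \tau \), gives \(\nabla _{w,\tau } v = \nabla v\). For a general \(v_{\mathcal {T}}\in \mathcal {V}(\mathcal {T})\), we have the following property.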

Lemma 1 (Lemma 2.1 in [41])

For \(v_{\mathcal {T}} \in \mathcal {V}(\mathcal {T})\), it holds

$$ \| \nabla_{w} v_{\mathcal{T}}-\nabla v_{\mathcal{T}} \|_{\mathcal{T}}^{2}\lesssim {\sum}_{e \in \mathcal{E}_{\mathcal{T}}} h_{\tau}^{-1}\|\left[\left[ v_{\mathcal{T}}\right]\right]\|^{2}_{e}, $$

where \(\|\cdot \|_{\mathcal {T}}^{2}={\sum }_{\tau \in \mathcal {T}}\|\cdot \|_{\tau }^{2}\).
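To make Definition 1 and Lemma 1 concrete, the following Python sketch works in a one-dimensional, lowest-order (l = 1) analogue on a uniform mesh. This setting is an assumption made purely for illustration (the paper treats d = 2, 3), and the function names are hypothetical. For discontinuous piecewise linear functions the test space in (5) is \(P_{0}\), so the weak gradient on an element reduces to the difference of the two nodal averages divided by h.

```python
import numpy as np

def weak_gradient_1d(v, h):
    """Modified weak gradient (5) for discontinuous P1 functions in 1D.

    v has shape (n_elem, 2): v[i, 0] and v[i, 1] are the values of v at the
    left and right end of element i. With l = 1 the test functions q are
    constants, so (5) gives h * g_i = {{v}}(x_{i+1}) - {{v}}(x_i).
    """
    v = np.asarray(v, dtype=float)
    n_elem = v.shape[0]
    avg = np.empty(n_elem + 1)
    avg[0] = v[0, 0]                          # boundary node: {{v}} = v
    avg[-1] = v[-1, 1]
    avg[1:-1] = 0.5 * (v[:-1, 1] + v[1:, 0])  # interior node: mean of both traces
    return (avg[1:] - avg[:-1]) / h

def interior_jumps_1d(v):
    """Jumps [[v]] at interior nodes: left trace * (+1) + right trace * (-1)."""
    v = np.asarray(v, dtype=float)
    return v[:-1, 1] - v[1:, 0]

# Compare weak and classical gradients for a random discontinuous function.
rng = np.random.default_rng(0)
n_elem, h = 8, 1.0 / 8
v = rng.standard_normal((n_elem, 2))
grad_classical = (v[:, 1] - v[:, 0]) / h
lhs = h * np.sum((weak_gradient_1d(v, h) - grad_classical) ** 2)
rhs = np.sum(interior_jumps_1d(v) ** 2) / h
print(lhs, rhs)  # lhs is bounded by rhs in this 1D setting
```

In this 1D setting the difference between the weak and the classical gradient on an element equals minus the sum of the jumps at its two end nodes divided by 2h (with zero contribution from boundary nodes), so the bound of Lemma 1 holds with constant 1. For a continuous function all jumps vanish and the two gradients coincide.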

Next, we will introduce the MWG method for solving (3).

2.2 The MWG discretization

The MWG formula for solving (3) is to find \(u_{\mathcal {T}}\in \mathcal {V}^{0}(\mathcal {T})\) such that

$$ a_{\mathcal{T}}\left( u_{\mathcal{T}}, v_{\mathcal{T}}\right)=(f, v_{\mathcal{T}}), \quad \forall\ v_{\mathcal{T}} \in \mathcal{V}^{0}(\mathcal{T}), $$
(7)

where for \(w_{\mathcal {T}}, v_{\mathcal {T}}\in \mathcal {V}(\mathcal {T})\), the bilinear form is defined as

$$ a_{\mathcal{T}}\left( w_{\mathcal{T}}, v_{\mathcal{T}}\right) :=\left( A\nabla_{w} w_{\mathcal{T}}, \nabla_{w} v_{\mathcal{T}}\right)_{\mathcal{T}} + {\sum}_{e\in\mathcal{E}_{\mathcal{T}}}\mu h^{-1}_{\tau} { \langle\left[\left[ w_{\mathcal{T}}\right]\right]_{e}, \left[\left[ v_{\mathcal{T}}\right]\right]_{e}\rangle_{e}}. $$
(8)

Here, \((\cdot , \cdot )_{\mathcal {T}}={\sum }_{\tau \in \mathcal {T}}(\cdot , \cdot )_{\tau }\) and μ is a positive penalty parameter.

The next lemma shows that the bilinear form defined in (8) is symmetric positive definite on \(\mathcal {V}^{0}(\mathcal {T})\).

Lemma 2

The bilinear form \(a_{\mathcal {T}}\left (\cdot , \cdot \right )\) is symmetric positive definite on \(\mathcal {V}^{0}(\mathcal {T})\).

Proof

We only need to prove that \(a_{\mathcal {T}}(\cdot , \cdot )\) is positive definite. In fact, if \(a_{\mathcal {T}}\left (v_{\mathcal {T}}, v_{\mathcal {T}}\right )=0\) for some \(v_{\mathcal {T}}\in \mathcal {V}^{0}(\mathcal {T})\), then in view of (8), we get

$$ \nabla_{w} v_{\mathcal{T}}|_{\tau} = 0\ \text{for}\ \forall \tau\in \mathcal{T}, \ \left[\left[ v_{\mathcal{T}}\right]\right]_{e} = 0\ \text{for}\ \forall e\in\mathcal{E}_{\mathcal{T}}. $$

Since all jumps vanish, Lemma 1 gives \(\nabla v_{\mathcal {T}}|_{\tau }= \nabla _{w} v_{\mathcal {T}}|_{\tau }\), and together with \(\nabla _{w} v_{\mathcal {T}} |_{\tau }= 0\) this shows that \(v_{\mathcal {T}}\) is piecewise constant on \(\mathcal {T}\). Furthermore, \(\left [\left [ v_{\mathcal {T}}\right ]\right ]_{e} = 0\) implies that \(v_{\mathcal {T}}\) is continuous across every \(e\in \mathcal {E}_{\mathcal {T}}^{0}\) and that \(v_{\mathcal {T}}|_{\partial {\varOmega }} = 0\). Hence \(v_{\mathcal {T}} = 0\). □

From Lemma 2, we can define a mesh dependent norm on \(\mathcal {V}^{0}(\mathcal {T})\) as \(|\|v_{\mathcal {T}}\||_{A, \mathcal {T}}^{2}:= a_{\mathcal {T}}\left (v_{\mathcal {T}}, v_{\mathcal {T}}\right )\) for any \(\ v_{\mathcal {T}}\in \mathcal {V}^{0}(\mathcal {T})\). Furthermore, according to Lemma 2 and Lax-Milgram theorem, we can prove that the discrete problem (7) is well-posed.
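As an illustration of the discrete problem (7) and of Lemma 2, the sketch below assembles the bilinear form (8) in a one-dimensional, lowest-order analogue and solves a model problem. Everything specific here is an assumption for illustration only: the 1D setting on (0, 1), the uniform mesh, A ≡ 1 by default, the value μ = 10, the two-point Gauss rule for the load vector, and the function names.

```python
import numpy as np

def assemble_mwg_1d(n_elem, mu=10.0, a_coef=1.0):
    """1D analogue of the bilinear form (8) for discontinuous P1 functions.

    Dof 2*i is the value at the left end of element i, dof 2*i + 1 the value
    at its right end. Returns the full matrix, the node-average map and h.
    """
    h = 1.0 / n_elem
    n_dof = 2 * n_elem
    avg = np.zeros((n_elem + 1, n_dof))   # rows: nodal averages {{v}}
    jmp = np.zeros((n_elem + 1, n_dof))   # rows: nodal jumps [[v]]
    for k in range(n_elem + 1):
        left, right = 2 * k - 1, 2 * k    # traces from element k-1 and element k
        if k > 0:
            jmp[k, left] = 1.0            # outward normal of element k-1 is +1
        if k < n_elem:
            jmp[k, right] = -1.0          # outward normal of element k is -1
        if k == 0:
            avg[k, right] = 1.0
        elif k == n_elem:
            avg[k, left] = 1.0
        else:
            avg[k, left] = avg[k, right] = 0.5
    grad_w = (avg[1:] - avg[:-1]) / h     # weak gradient, one constant per element
    stiff = a_coef * h * grad_w.T @ grad_w + (mu / h) * jmp.T @ jmp
    return stiff, avg, h

def solve_mwg_1d(n_elem, f, mu=10.0):
    """Solve the 1D analogue of (7) with homogeneous Dirichlet data."""
    stiff, avg, h = assemble_mwg_1d(n_elem, mu)
    load = np.zeros(2 * n_elem)
    gauss = np.array([-1.0, 1.0]) / np.sqrt(3.0)
    for i in range(n_elem):
        xq = i * h + 0.5 * h * (1.0 + gauss)      # quadrature points in element i
        phi_right = (xq - i * h) / h
        load[2 * i] = 0.5 * h * np.sum(f(xq) * (1.0 - phi_right))
        load[2 * i + 1] = 0.5 * h * np.sum(f(xq) * phi_right)
    free = np.arange(1, 2 * n_elem - 1)           # drop the two boundary dofs
    k_free = stiff[np.ix_(free, free)]
    u = np.zeros(2 * n_elem)
    u[free] = np.linalg.solve(k_free, load[free])
    return u, avg @ u, k_free

u_h, node_avg, k_free = solve_mwg_1d(32, lambda x: np.pi**2 * np.sin(np.pi * x))
print(np.linalg.eigvalsh(k_free).min() > 0.0)     # positive definiteness check
```

The restriction to the free degrees of freedom mirrors the space \(\mathcal {V}^{0}(\mathcal {T})\); the smallest eigenvalue of the restricted matrix being positive is the discrete counterpart of Lemma 2 in this toy setting.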

In the next subsection, we introduce some preliminaries which will be used in the error estimates.

2.3 Some preliminaries

Note that Galerkin orthogonality does not hold for the modified weak Galerkin approximation. Hence, we establish a partial orthogonality by using arguments similar to those in [5].

Lemma 3 (Partial orthogonality)

Assume that \(u\in {H_{0}^{1}}({\varOmega })\) and \(u_{\mathcal {T}} \in \mathcal {V}^{0}(\mathcal {T})\) are the solutions of (3) and (7), respectively. We have

$$ \begin{array}{@{}rcl@{}} (A(\nabla u - \nabla_{w}u_{\mathcal{T}}), \nabla v^{c}_{\mathcal{T}})_{\mathcal{T}} = 0, \quad \forall\ v^{c}_{\mathcal{T}}\in \mathcal{V}^{c}(\mathcal{T}), \end{array} $$
(9)

where \(\mathcal {V}^{c}(\mathcal {T}) = \mathcal {V}(\mathcal {T})\cap {H_{0}^{1}}({\varOmega })\).

Proof

For any \(v^{c}_{\mathcal {T}}\in \mathcal {V}^{c}(\mathcal {T})\), noting that \(\nabla v^{c}_{\mathcal {T}}=\nabla _{w} v^{c}_{\mathcal {T}}\), setting \(v=v^{c}_{\mathcal {T}}\) in (3), and applying the fact \(\left [\left [ v^{c}_{\mathcal {T}}\right ]\right ]_{e}=0\) for any \(e\in \mathcal {E}_{\mathcal {T}}\) and (7), we obtain

$$ \begin{array}{@{}rcl@{}} &&(A\nabla u - A\nabla_{w}u_{\mathcal{T}}, \nabla v^{c}_{\mathcal{T}})_{\mathcal{T}}\\ && =(A\nabla u, \nabla v^{c}_{\mathcal{T}})_{\mathcal{T}} - (A\nabla_{w}u_{\mathcal{T}}, \nabla_{w} v^{c}_{\mathcal{T}})_{\mathcal{T}}\\ &&= (f, v^{c}_{\mathcal{T}}) - (A\nabla_{w}u_{\mathcal{T}}, \nabla_{w} v^{c}_{\mathcal{T}})_{\mathcal{T}} - {\sum}_{e\in\mathcal{E}_{\mathcal{T}}} \mu h^{-1}_{e}\left( \left[\left[ u_{\mathcal{T}}\right]\right],\left[\left[ v^{c}_{\mathcal{T}}\right]\right]\right)_{e}\\ &&=(f, v^{c}_{\mathcal{T}}) -(f, v^{c}_{\mathcal{T}}) =0. \end{array} $$

In the following, we will use the next orthogonal decomposition of \(\mathcal {V}(\mathcal {T})\). Let \(\mathcal {V}^{\bot }(\mathcal {T})\) be the orthogonal complement of \(\mathcal {V}^{c}(\mathcal {T})\) in \(\mathcal {V}(\mathcal {T})\) with respect to \(a_{\mathcal {T}} (\cdot , \cdot )\) defined in (8), namely

$$ \mathcal{V}(\mathcal{T}) = \mathcal{V}^{c}(\mathcal{T})\oplus \mathcal{V}^{\bot}(\mathcal{T}). $$
(10)

Meanwhile, we also need to present a local operator \(I_{\mathcal {T}}^{c}\) onto \(\mathcal {V}^{c}(\mathcal {T})\).

Lemma 4 (Lemma 6.6 in [3])

There exists an interpolation operator \(I_{\mathcal {T}}^{c}: \mathcal {V}(\mathcal {T})\rightarrow \mathcal {V}^{c}(\mathcal {T})\) and a constant CI depending only on the shape regularity of \(\mathcal {T}\), such that for all \(\tau \in \mathcal {T}\) the following inequalities hold:

$$ \|v_{\mathcal{T}} - I_{\mathcal{T}}^{c} v_{\mathcal{T}}\|_{\tau} \leqslant C_{I} h\|\nabla v_{\mathcal{T}}\|_{{\varOmega}_{\tau}}, \ \forall v_{\mathcal{T}}\in {H_{0}^{1}}({\varOmega}), $$
(11)

and for |a| = 0,1,

$$ \|D^{a}(v_{\mathcal{T}} - I_{\mathcal{T}}^{c} v_{\mathcal{T}})\|_{\tau}^{2} \leqslant C_{I} {\sum}_{e\in \mathcal{E}_{\mathcal{T}}\cap{\varOmega}_{\tau}} h_{\tau}^{1-2|a|}\|\left[\left[ v_{\mathcal{T}}\right]\right]\|^{2}_{e}, \ \forall v_{\mathcal{T}}\in \mathcal{V}(\mathcal{T}), $$
(12)

where \({\varOmega }_{\tau }=\{\tau ^{\prime } \in \mathcal {T}\ |\ \tau \cap \tau ^{\prime } \not = \varnothing \}\).

3 An adaptive modified weak Galerkin method

In this section, we introduce an adaptive modified weak Galerkin (AMWG) method and discuss each procedure of AMWG method.

[Algorithm 1: the AMWG loop, consisting of the steps SOLVE, ESTIMATE, MARK and REFINE described below.]

In the SOLVE step, by solving the problem (7) we get the discrete solution \(u_{\mathcal {T}} = \mathbf {SOLVE}(\mathcal {T}, f)\), where \(f\in L^{2}({\varOmega })\) is a given function and \(\mathcal {T}\) is a conforming triangulation.

In this step, we assume that the discrete linear system associated with (7) can be solved exactly.

In the ESTIMATE step, we need to define an efficient and reliable error estimator. For two elements \(\tau _{1}, \tau _{2}\in \mathcal {T}\) sharing a common edge (or face) e, denote by \({\varOmega }_{e} = \tau _{1}\cup \tau _{2}\) the macro-element associated with e. Similarly, for an element τ, we denote \({\varOmega }_{\tau }=\{\tau ^{\prime } \in \mathcal {T}| \tau \cap \tau ^{\prime } \not = \varnothing \}\). Let \(A_{\tau } = A|_{\tau }\) and \(A^{\max \limits }_{e} = \max \limits _{\tau \in {\varOmega }_{e}} A_{\tau }\). The local error indicator \(\eta _{\mathcal {T}}(v_{\mathcal {T}}, \tau )\) for any \(v_{\mathcal {T}}\in \mathcal {V}^{0}(\mathcal {T})\) and any \(\tau \in \mathcal {T}\) is defined by

$$ \eta^{2}_{\mathcal{T}}(v_{\mathcal{T}}, \tau) = h_{\tau}^{2}A_{\tau}^{-1}\left\|R(v_{\mathcal{T}})\right\|_{\tau}^{2} + \sum\limits_{e\in\partial\tau}h_{\tau}{\int}_{e} (A^{\max}_{e})^{-1} {J^{2}_{e}}(A\nabla_{w} v_{\mathcal{T}}) \mathrm{d}s, $$
(14)

where

$$ \begin{array}{@{}rcl@{}} R(v_{\mathcal{T}}) = f+\nabla\cdot\left( A\nabla_{w} v_{\mathcal{T}}\right), \quad J_{e}(A\nabla_{w} v_{\mathcal{T}}) = \left\{ \begin{array}{l} \left[\left[ A\nabla_{w} v_{\mathcal{T}}\right]\right]_{e}, \quad \text{if} \ \ e\in \mathcal{E}_{\mathcal{T}}^{0},\\ 0, \quad\quad \quad\quad\quad \text{otherwise}. \end{array} \right. \end{array} $$

We also define the corresponding error estimator for \({\mathscr{M}}\subset \mathcal {T}\) as

$$ \eta^{2}_{\mathcal{T}}(v_{\mathcal{T}},\mathcal{M})=\sum\limits_{\tau\in\mathcal{M}}\eta^{2}_{\mathcal{T}}(v_{\mathcal{T}}, \tau), $$

when \({\mathscr{M}}=\mathcal {T}\), we get the definition of the global estimator \(\eta _{\mathcal {T}}(v_{\mathcal {T}},\mathcal {T})\). To simplify notation, we write \(\eta _{\mathcal {T}}(v_{\mathcal {T}},\mathcal {T})\) as \(\eta (v_{\mathcal {T}},\mathcal {T})\).
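The structure of the indicator (14) can be sketched in a one-dimensional, lowest-order analogue. The assumptions, made only for illustration, are: a uniform 1D mesh with size h, piecewise constant weak gradients (l = 1), so that \(\nabla \cdot (A\nabla _{w} v_{\mathcal {T}})=0\) on each element and \(R(v_{\mathcal {T}})=f\), and edge integrals that reduce to point values at the nodes. The function name and argument layout are hypothetical.

```python
import numpy as np

def local_indicators_1d(grad_w, f_norm_sq, a_elem, h):
    """Squared local indicators eta^2(v, tau) from (14), 1D analogue.

    grad_w    : weak gradient on each element (one constant per element)
    f_norm_sq : squared L2 norm of f on each element
    a_elem    : value of the piecewise constant coefficient A on each element
    """
    grad_w = np.asarray(grad_w, dtype=float)
    f_norm_sq = np.asarray(f_norm_sq, dtype=float)
    a_elem = np.asarray(a_elem, dtype=float)
    flux = a_elem * grad_w
    jump_sq = np.zeros(flux.size + 1)              # J_e = 0 at the two boundary nodes
    a_max = np.maximum(a_elem[:-1], a_elem[1:])    # A_e^max over the macro-element
    jump_sq[1:-1] = (flux[:-1] - flux[1:]) ** 2 / a_max
    residual = h**2 * f_norm_sq / a_elem           # element residual term
    return residual + h * (jump_sq[:-1] + jump_sq[1:])

eta_sq = local_indicators_1d([2.0, 1.0], [0.0, 0.0], [1.0, 4.0], 0.5)
print(eta_sq, eta_sq.sum())   # local values and their sum over all elements
```

Summing the returned array over a subset of elements gives the squared estimator on that subset, and summing over all elements gives the squared global estimator.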

In the MARK step, we obtain a set of marked elements \({\mathscr{M}}_{k}\) by applying the Dörfler marking strategy [7] to the error indicators \(\{\eta (u_{\mathcal {T}}, \tau )\}_{\tau \in \mathcal {T}}\) on \(\mathcal {T}\) obtained in the ESTIMATE step.
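A minimal sketch of Dörfler marking in its usual form is given below: choose a set of elements of minimal cardinality whose squared indicators sum to at least a fraction θ of the total. The marking parameter `theta` in (0, 1] and the function name are assumptions introduced here, since the exact statement of the marking criterion in Algorithm 1 is not reproduced in the text.

```python
import numpy as np

def dorfler_mark(eta_sq, theta):
    """Indices of a minimal set M with sum_{tau in M} eta^2 >= theta * sum_{tau} eta^2."""
    eta_sq = np.asarray(eta_sq, dtype=float)
    order = np.argsort(eta_sq)[::-1]              # largest indicators first
    cumulative = np.cumsum(eta_sq[order])
    target = theta * cumulative[-1]
    # first position at which the running sum reaches the target
    n_marked = int(np.searchsorted(cumulative, target, side="left")) + 1
    return order[: min(n_marked, eta_sq.size)]

marked = dorfler_mark([1.0, 4.0, 2.0, 1.0], theta=0.6)
print(sorted(marked.tolist()))   # elements 1 and 2 carry 6 of the total 8
```

Sorting costs O(N log N); this is adequate for a sketch, and the greedy choice of the largest indicators yields a set of minimal cardinality.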

In the REFINE step, we use bisection (see [2, 24]) to refine at least all the marked elements \({\mathscr{M}}_{k}\), thereby generating \(\mathcal {T}_{k+1}\) from \(\mathcal {T}_{k}\) in such a way that the operator defined in Lemma 4 remains valid on \(\mathcal {V}(\mathcal {T}_{k+1})\). We remark that \(\mathcal {T}_{k+1}\) is allowed to contain hanging nodes. For more details we refer to the adaptive DG literature (e.g., §3.4 of [3]).

In this paper, we denote by \(\mathcal {C}(\mathcal {T}_{0})\) the set of conforming triangulations \(\mathcal {T}\) refined from \(\mathcal {T}_{0}\), and we write \(\mathcal {T}_{1} \leqslant \mathcal {T}_{2}\) to mean that \(\mathcal {T}_{2}\) is a refinement of \(\mathcal {T}_{1}\).

4 Convergence of the AMWG method

In this section, we first verify the reliability of error estimator. Then we provide the comparison of solutions and the reduction of the error estimator. At last, we prove the convergence of Algorithm 1.

4.1 Reliability

For any \(v\in {H_{0}^{1}}({\varOmega })\) and any \(v_{\mathcal {T}}\in \mathcal {V}(\mathcal {T})\), we define the following error

$$ |\|v - v_{\mathcal{T}}\||_{\mathcal{T}}^{2} = \|A^{1/2}(\nabla v- \nabla_{w} v_{\mathcal{T}})\|^{2}_{\mathcal{T}} + \sum\limits_{e\in \mathcal{E}_{\mathcal{T}}} \mu h^{-1}_{\tau}\|\left[\left[ v_{\mathcal{T}}\right]\right]_{e}\|^{2}_{e}. $$
(15)

Remark 1

Here we use \(\left [\left [ v_{\mathcal {T}}\right ]\right ]_{e}\) instead of \(\left [\left [ v-v_{\mathcal {T}}\right ]\right ]_{e}\), since \(\left [\left [ v\right ]\right ]_{e}=0\) for \(v\in {H_{0}^{1}}({\varOmega })\).

In this subsection, we are going to prove the upper bound of error \(|\| u-u_{\mathcal {T}}\||_{\mathcal {T}}\), where \(u\in {H_{0}^{1}}({\varOmega })\) and \(u_{\mathcal {T}} \in \mathcal {V}^{0}(\mathcal {T})\) are the solution of (3) and (7), respectively.

The next lemma shows that the second term of \(|\|u-u_{\mathcal {T}}\||_{\mathcal {T}}\) can be controlled by the error estimator.

Lemma 5

Let \(u_{\mathcal {T}} \in \mathcal {V}^{0}(\mathcal {T})\) be the solution of (7), we have

$$ \sum\limits_{e\in\mathcal{E}_{\mathcal{T}}}\mu h_{\tau}^{-1}\|\left[\left[ u_{\mathcal{T}}\right]\right]\|_{e}^{2} \lesssim \mu^{-1}\eta_{\mathcal{T}}^{2}(u_{\mathcal{T}}, \mathcal{T}), $$
(16)

where the constant depends on the shape regularity of \(\mathcal {T}\) and the constant β0 for the upper bound of A(x).

Proof

Noticing that \(I_{\mathcal {T}}^{c} u_{\mathcal {T}}\in \mathcal {V}^{c}(\mathcal {T})\) with interpolation \(I_{\mathcal {T}}^{c}\) given by Lemma 4, then using (7) and the definition of weak gradient of (5), we have

$$ \begin{array}{@{}rcl@{}} &&\sum\limits_{e\in\mathcal{E}_{\mathcal{T}}}\mu h_{\tau}^{-1}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|_{e}^{2}\\ &&= \sum\limits_{e\in\mathcal{E}_{\mathcal{T}}}\mu h_{\tau}^{-1}\langle\left[\left[ u_{\mathcal{T}}\right]\right]_{e}, \left[\left[ u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}}\right]\right]_{e}\rangle_{e}\\ &&= (f, u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}})_{\mathcal{T}} - (A\nabla_{w} u_{\mathcal{T}}, \nabla_{w}(u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}}))_{\mathcal{T}}\\ &&= (f, u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}})_{\mathcal{T}} + (\nabla\cdot (A\nabla_{w} u_{\mathcal{T}}), u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}})_{\mathcal{T}} \\ &&\quad- \sum\limits_{\tau\in\mathcal{T}}\langle\left\{\left\{ u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}}\right\}\right\}_{e} , (A\nabla_{w} u_{\mathcal{T}}) \cdot\boldsymbol{n}\rangle_{\partial \tau}\\ &&= (f +\nabla\cdot (A\nabla_{w} u_{\mathcal{T}}), u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}})_{\mathcal{T}} - \sum\limits_{e\in\mathcal{E}_{\mathcal{T}}}\langle\left\{\left\{ u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}}\right\}\right\}_{e}, \left[\left[ A\nabla_{w} u_{\mathcal{T}}\right]\right]_{e}\rangle_{e}, \end{array} $$

making use of the Cauchy-Schwarz inequality, the trace inequality, Lemma 4 and the definition of the error estimator, we arrive at

$$ \begin{array}{@{}rcl@{}} \sum\limits_{e\in\mathcal{E}_{\mathcal{T}}}\mu h_{\tau}^{-1}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|_{e}^{2}&&\leqslant\left( \sum\limits_{\tau\in\mathcal{T}}h_{\tau}^{2} A_{\tau}^{-1}\|R(u_{\mathcal{T}})\|^{2}_{\tau}\right)^{1/2}\left( \sum\limits_{\tau\in\mathcal{T}}h_{\tau}^{-2} A_{\tau} \|u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}}\|^{2}_{\tau}\right)^{1/2}\\ && + \left( \sum\limits_{e\in\mathcal{E}_{\mathcal{T}}} h_{\tau}^{-1} A_{e}^{\max}\|\left\{\left\{ u_{\mathcal{T}} - I_{\mathcal{T}}^{c} u_{\mathcal{T}}\right\}\right\}\|^{2}_{e}\right)^{1/2} \left( \sum\limits_{e\in\mathcal{E}_{\mathcal{T}}}h_{\tau} (A_{e}^{\max})^{ - 1} \|J_{e}(A\nabla_{w} u_{\mathcal{T}})\|^{2}_{e}\right)^{1/2} \\ &&\lesssim\eta(u_{\mathcal{T}}, \mathcal{T}) \cdot \left( \sum\limits_{\tau\in\mathcal{T}}h_{\tau}^{-2}\|u_{\mathcal{T}}-I_{\mathcal{T}}^{c} u_{\mathcal{T}}\|^{2}_{\tau}\right)^{1/2} \\ &&\lesssim\eta(u_{\mathcal{T}}, \mathcal{T}) \cdot \left( \sum\limits_{e\in\mathcal{E}_{\mathcal{T}}}h_{\tau}^{-1}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|_{e}^{2}\right)^{1/2}, \end{array} $$

which implies

$$ \begin{array}{@{}rcl@{}} \left( \sum\limits_{e\in\mathcal{E}_{\mathcal{T}}}\mu h_{\tau}^{-1}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|_{e}^{2}\right)^{1/2} \lesssim \frac{1}{\sqrt{\mu}}\eta(u_{\mathcal{T}}, \mathcal{T}). \end{array} $$

Finally, the lemma follows by squaring both sides of the above inequality. □
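The cancellation used at the end of this proof can be written out explicitly. The shorthand \(S\) and the generic constant \(C\) below are introduced only for this illustration. Set \(S:={\sum }_{e\in \mathcal {E}_{\mathcal {T}}} h_{\tau }^{-1}\|\left [\left [ u_{\mathcal {T}}\right ]\right ]_{e}\|_{e}^{2}\). The chain of inequalities above then reads \(\mu S\leqslant C\,\eta (u_{\mathcal {T}}, \mathcal {T})\, S^{1/2}\), and, if \(S>0\) (the case \(S=0\) being trivial), dividing by \(\mu S^{1/2}\) gives

$$ \begin{array}{@{}rcl@{}} S^{1/2}\leqslant C\mu^{-1}\eta(u_{\mathcal{T}}, \mathcal{T}) \quad\Longrightarrow\quad \mu S\leqslant C^{2}\mu^{-1}\eta^{2}(u_{\mathcal{T}}, \mathcal{T}), \end{array} $$

which is the bound (16).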

For the first term of \(|\|u-u_{\mathcal {T}}\||_{\mathcal {T}}\), we need to take care of the nonconforming component \(u_{\mathcal {T}}^{\bot }\) of discrete solution \(u_{\mathcal {T}}\) as follows.

Lemma 6

Let \(u_{\mathcal {T}} \in \mathcal {V}^{0}(\mathcal {T})\) be the solution of (7), decomposed as \(u_{\mathcal {T}} = u_{\mathcal {T}}^{c}+u_{\mathcal {T}}^{\bot }\) with \(u_{\mathcal {T}}^{c}\in \mathcal {V}^{c}(\mathcal {T})\) and \(u_{\mathcal {T}}^{\bot }\in \mathcal {V}^{\bot }(\mathcal {T})\). Then

$$ |\|u_{\mathcal{T}}^{\bot}\||_{A, \mathcal{T}}^{2} \lesssim\mu^{-1}\eta_{\mathcal{T}}^{2}(u_{\mathcal{T}}, \mathcal{T}), $$
(17)

where the constant depends on the shape regularity of \(\mathcal {T}\) and the constant β0.

Proof

Applying the definition of \(|\|\cdot \||_{A, \mathcal {T}}\) and (8), then for any \(w^{c}_{\mathcal {T}}\in \mathcal {V}^{c}(\mathcal {T})\), we get

$$ \begin{array}{@{}rcl@{}} &&|\|u_{\mathcal{T}}^{\bot}\||_{A,\mathcal{T}}^{2} =a_{\mathcal{T}}(u_{\mathcal{T}}^{\bot}, u_{\mathcal{T}}^{\bot})\\ &=& \inf_{w^{c}_{\mathcal{T}}\in \mathcal{V}^{c}(\mathcal{T})}a_{\mathcal{T}}(u_{\mathcal{T}}-w^{c}_{\mathcal{T}}, u_{\mathcal{T}}-w^{c}_{\mathcal{T}}) \\ &=&\inf_{w^{c}\in \mathcal{V}^{c}(\mathcal{T})}(A (\nabla_{w} u_{\mathcal{T}}- \nabla { w_{\mathcal{T}}^{c}}), \nabla_{w} u_{\mathcal{T}}-\nabla { w_{\mathcal{T}}^{c}})_{\mathcal{T}}\\ &&+\sum\limits_{e\in \mathcal{E}_{\mathcal{T}}}\mu h^{-1}_{\tau}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|^{2}_{e}. \end{array} $$
(18)

Noting that A is piecewise constant and applying triangle inequality, Lemma 1 and Lemma 4 yield

$$ \begin{array}{@{}rcl@{}} &&(A(\nabla_{w} u_{\mathcal{T}}-\nabla I_{\mathcal{T}}^{c}u_{\mathcal{T}}), \nabla_{w} u_{\mathcal{T}}-\nabla I_{\mathcal{T}}^{c}u_{\mathcal{T}})_{\mathcal{T}}\\ &&\lesssim \|\nabla_{w} u_{\mathcal{T}}-\nabla I_{\mathcal{T}}^{c}u_{\mathcal{T}}\|_{\mathcal{T}}^{2} \\ &&\lesssim\|\nabla_{w} u_{\mathcal{T}}-\nabla u_{\mathcal{T}}\|_{\mathcal{T}}^{2}+ \|\nabla u_{\mathcal{T}} -\nabla I_{\mathcal{T}}^{c}u_{\mathcal{T}}\|_{\mathcal{T}}^{2} \\ &&\lesssim\sum\limits_{e\in \mathcal{E}_{\mathcal{T}}} h^{-1}_{\tau}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|^{2}_{e}. \end{array} $$
(19)

Choosing \(w^{c}_{\mathcal {T}} = I_{\mathcal {T}}^{c}u_{\mathcal {T}}\) in (18) and substituting (19) into (18), we have

$$ |\|u_{\mathcal{T}}^{\bot}\||_{A, \mathcal{T}}^{2} \leqslant \sum\limits_{e\in \mathcal{E}_{\mathcal{T}}} h^{-1}_{\tau}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|^{2}_{e} +\sum\limits_{e\in \mathcal{E}_{\mathcal{T}}} \mu h^{-1}_{\tau}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|^{2}_{e} \lesssim \frac{1}{\mu}\eta^{2}_{\mathcal{T}}(u_{\mathcal{T}}, \mathcal{T}), $$

where the last step uses Lemma 5. □

Now, combining Lemma 3, orthogonal decomposition (10), Lemmas 4 and 6, we can estimate the first term of \(|\| u-u_{\mathcal {T}}\||_{\mathcal {T}}\).

Lemma 7

Let \(u\in {H_{0}^{1}}({\varOmega })\) and \(u_{\mathcal {T}} \in \mathcal {V}^{0}(\mathcal {T})\) be the solutions of (3) and (7), respectively, we have

$$ \|A^{1/2}(\nabla u-\nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}}^{2}\lesssim\eta^{2}(u_{\mathcal{T}}, \mathcal{T}), $$
(20)

where the constant depends on the shape regularity of \(\mathcal {T}\), the constant \(\beta _{0}\) and the parameter \(\mu ^{-1}\).

Proof

According to space decomposition (10), we get \(u_{\mathcal {T}} = u_{\mathcal {T}}^{c}+u_{\mathcal {T}}^{\bot }\) with \(u_{\mathcal {T}}^{c}\in \mathcal {V}^{c}(\mathcal {T})\) and \(u_{\mathcal {T}}^{\bot }\in \mathcal {V}^{\bot }(\mathcal {T})\), hence we get

$$ u -u_{\mathcal{T}}= u- u_{\mathcal{T}}^{c} - u_{\mathcal{T}}^{\bot} = w - u_{\mathcal{T}}^{\bot}, $$
(21)

where \(w = u- u_{\mathcal {T}}^{c}\in {H_{0}^{1}}({\varOmega })\). Using the decomposition (21) and the partial orthogonality (9), we get

$$ \begin{array}{@{}rcl@{}} &&(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla u- \nabla_{w} u_{\mathcal{T}})_{\mathcal{T}}\\ &&=(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla u- \nabla u_{\mathcal{T}})_{\mathcal{T}}+(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla u_{\mathcal{T}}- \nabla_{w} u_{\mathcal{T}})_{\mathcal{T}} \\ &&=(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla w)_{\mathcal{T}} -(A(\nabla u- \nabla_{w} u_{\mathcal{T}}),\nabla u_{\mathcal{T}}^{\bot})_{\mathcal{T}} \\ &&\quad+(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla u_{\mathcal{T}}- \nabla_{w} u_{\mathcal{T}})_{\mathcal{T}} \\ &&=I_{1}+I_{2} +I_{3}, \end{array} $$
(22)

where

$$ \begin{array}{@{}rcl@{}} I_{1} &=&(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla w)_{\mathcal{T}},\\ I_{2}&=&-(A(\nabla u- \nabla_{w} u_{\mathcal{T}}),\nabla u_{\mathcal{T}}^{\bot})_{\mathcal{T}},\\ I_{3}&=&(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla u_{\mathcal{T}}- \nabla_{w} u_{\mathcal{T}})_{\mathcal{T}}. \end{array} $$

First, we estimate the term I1. Let \(w^{c}_{\mathcal {T}}\in \mathcal {V}^{c}(\mathcal {T})\) be an interpolation of w satisfying (e.g., [8, 25])

$$ \begin{array}{@{}rcl@{}} A_{\tau}^{1/2}\|w-w^{c}_{\mathcal{T}}\|_{0, \tau}&&\lesssim h_{\tau}\|A^{1/2}\nabla w\|_{{\varOmega}_{\tau}}, \end{array} $$
(23)
$$ \begin{array}{@{}rcl@{}} (A_{e}^{\max})^{1/2}\|w-w^{c}_{\mathcal{T}}\|_{0, e}&&\lesssim h_{e}^{1/2}\|A^{1/2}\nabla w\|_{{\varOmega}_{e}}. \end{array} $$
(24)

Using (9), Green formulas, (3), Cauchy-Schwarz inequality, (23) and (24), we arrive at

$$ \begin{array}{@{}rcl@{}} I_{1}&=&(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla w)_{\mathcal{T}}\\ &=&(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla w - \nabla w^{c}_{\mathcal{T}})_{\mathcal{T}} \\ &=&(f+\nabla\cdot (A\nabla_{w} u_{\mathcal{T}}), w-w^{c}_{\mathcal{T}})_{\mathcal{T}} - \sum\limits_{\tau\in\mathcal{T}}\langle(A\nabla_{w} u_{\mathcal{T}})\cdot\boldsymbol{n}, w-w^{c}_{\mathcal{T}}\rangle_{\partial \tau} \\ &=&(f+ \nabla\cdot(A\nabla_{w} u_{\mathcal{T}}), w-w^{c}_{\mathcal{T}})_{\mathcal{T}} - \sum\limits_{e\in\mathcal{E}_{\mathcal{T}}}\langle\left[\left[ A\nabla_{w} u_{\mathcal{T}}\right]\right]_{e}, w-w^{c}_{\mathcal{T}}\rangle_{e} \\ &\leqslant&\sum\limits_{\tau\in \mathcal{T}}\|R(u_{\mathcal{T}})\|_{0,\tau} \|w-w^{c}_{\mathcal{T}}\|_{0,\tau}+ \sum\limits_{e\in \mathcal{E}_{\mathcal{T}}}\|J_{e}(A\nabla_{w} u_{\mathcal{T}})\|_{0, e} \|w-w^{c}_{\mathcal{T}}\|_{0,e} \\ &\lesssim& \eta(u_{\mathcal{T}}, \mathcal{T}) \|A^{1/2}\nabla w\|_{\mathcal{T}}. \end{array} $$
(25)

Using (21), the triangle inequality, \(\nabla _{w} u_{\mathcal {T}}- \nabla _{w} u_{\mathcal {T}}^{c}=\nabla _{w} u_{\mathcal {T}}^{\bot }\), and Lemmas 1, 5 and 6, we get

$$ \begin{array}{@{}rcl@{}} \|A^{1/2}\nabla w\|_{\mathcal{T}}&=&\|A^{1/2}(\nabla u-\nabla u_{\mathcal{T}}^{c})\|_{\mathcal{T}}\\ &=&\|A^{1/2}(\nabla u-\nabla_{w} u_{\mathcal{T}}^{c})\|_{\mathcal{T}}\\ &\leqslant&\|A^{1/2}(\nabla u-\nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}} +\|A^{1/2}\nabla_{w} u_{\mathcal{T}}^{\bot}\|_{\mathcal{T}}\\ &\leqslant&\|A^{1/2}(\nabla u-\nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}} +|\|u_{\mathcal{T}}^{\bot}\||_{A, \mathcal{T}}\\ &\lesssim& \|A^{1/2}(\nabla u-\nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}} +\eta(u_{\mathcal{T}}, \mathcal{T}). \end{array} $$
(26)

Substituting (26) into (25), we arrive at

$$ \begin{array}{@{}rcl@{}} I_{1}\lesssim\|A^{1/2}(\nabla u-\nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}}\eta(u_{\mathcal{T}}, \mathcal{T})+\eta^{2}(u_{\mathcal{T}}, \mathcal{T}).\ \ \end{array} $$
(27)

Now we estimate the second term \(I_{2}\). Noting that \(u_{\mathcal {T}}^{\bot }=u_{\mathcal {T}}-u_{\mathcal {T}}^{c}\), and using the partial orthogonality (9) for \(I_{\mathcal {T}}^{c}u_{\mathcal {T}}- u_{\mathcal {T}}^{c} \in \mathcal {V}^{c}(\mathcal {T})\), the Cauchy-Schwarz inequality and Lemmas 4 and 5, we obtain

$$ \begin{array}{@{}rcl@{}} I_{2} &=&- (A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla u_{\mathcal{T}}^{\bot})_{\mathcal{T}}\\ &=& (A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla u_{\mathcal{T}}^{c}-\nabla u_{\mathcal{T}})_{\mathcal{T}}\\ &=&(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla u_{\mathcal{T}}^{c}-\nabla I_{\mathcal{T}}^{c}u_{\mathcal{T}})_{\mathcal{T}} + (A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla I_{\mathcal{T}}^{c} u_{\mathcal{T}}-\nabla u_{\mathcal{T}})_{\mathcal{T}}\\ &&\lesssim \|A^{1/2}(\nabla u- \nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}}\| \nabla I_{\mathcal{T}}^{c} u_{\mathcal{T}}-\nabla u_{\mathcal{T}}\|_{\mathcal{T}} \\ &&\lesssim \|A^{1/2}(\nabla u- \nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}} \left( {\sum}_{e\in\mathcal{E}_{\mathcal{T}}}h_{\tau}^{-1}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|^{2}_{e}\right)^{1/2}\\ &&\lesssim \|A^{1/2}(\nabla u- \nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}} \eta(u_{\mathcal{T}}, \mathcal{T}), \end{array} $$
(28)

where the constant depends on the shape regularity of \(\mathcal {T}\) and the constant β0.

For the last term \(I_{3}\), using the Cauchy-Schwarz inequality and Lemmas 1 and 5, we get

$$ \begin{array}{@{}rcl@{}} I_{3} & =&(A(\nabla u- \nabla_{w} u_{\mathcal{T}}), \nabla u_{\mathcal{T}}- \nabla_{w} u_{\mathcal{T}})_{\mathcal{T}}\\ &&\lesssim \|A^{1/2}(\nabla u- \nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}}\|\nabla u_{\mathcal{T}}-\nabla_{w} u_{\mathcal{T}}\|_{\mathcal{T}}\\ &&\lesssim \|A^{1/2}(\nabla u- \nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}} \left( {\sum}_{e\in\mathcal{E}_{\mathcal{T}}}h_{\tau}^{-1}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|^{2}_{e}\right)^{1/2}\\ &&\lesssim \|A^{1/2}(\nabla u- \nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}}\eta(u_{\mathcal{T}}, \mathcal{T}). \end{array} $$
(29)

Substituting (27), (28) and (29) into (22) and applying Young's inequality, we arrive at

$$ \|A^{1/2}(\nabla u-\nabla_{w} u_{\mathcal{T}})\|_{\mathcal{T}}^{2} \lesssim\eta^{2}(u_{\mathcal{T}}, \mathcal{T}). $$

According to Lemmas 7 and 5, we can derive the reliability of the error estimator.

Theorem 1 (Upper bound)

Let \(u\in {H_{0}^{1}}({\varOmega })\) and \(u_{\mathcal {T}} \in \mathcal {V}^{0}(\mathcal {T})\) be the solutions of (3) and (7), respectively. There exists a positive constant \(C_{UB}\), depending on the shape regularity of \(\mathcal {T}\) and on \(\mu ^{-1}\), such that

$$ \begin{array}{@{}rcl@{}} |\|u - u_{\mathcal{T}}\||_{\mathcal{T}}^{2}\leqslant C_{UB}\eta_{\mathcal{T}}^{2}(u_{\mathcal{T}}, \mathcal{T}). \end{array} $$
(30)

Proof

By (15), we know

$$ |\|u - u_{\mathcal{T}}\||_{\mathcal{T}}^{2} = \|A^{1/2}(\nabla u- \nabla_{w} u_{\mathcal{T}})\|^{2}_{\mathcal{T}} + \sum\limits_{e\in \mathcal{E}_{\mathcal{T}}} \mu h^{-1}_{\tau}\|\left[\left[ u_{\mathcal{T}}\right]\right]_{e}\|^{2}_{e}, $$

then combining with Lemmas 7 and 5, we get the desired result. □

4.2 Comparison of solutions

In this subsection, we prove the comparison of solutions. The idea originates from [3].

Lemma 8 (Comparison of solutions)

Let \(u\in {H_{0}^{1}}({\varOmega })\) be the solution of (3), and let \(u_{\mathcal {T}}\in \mathcal {V}^{0}(\mathcal {T})\) and \(u_{\mathcal {T}_{*}} \in \mathcal {V}^{0}(\mathcal {T}_{*})\) be the corresponding discrete solutions of (7), where \(\mathcal {T}, \mathcal {T}_{*}\in \mathcal {C} (\mathcal {T}_{0})\) satisfy \(\mathcal {T}\leqslant \mathcal {T}_{*}\). Then for any 𝜖 ∈ (0,1), there holds

$$ \begin{array}{@{}rcl@{}} |\|u-u_{\mathcal{T}_{*}}\||_{\mathcal{T}_{*}}^{2} &&\leqslant(1+\epsilon)|\|u-u_{\mathcal{T}}\||^{2}_{\mathcal{T}} -\frac{1}{2}\|A^{1/2}(\nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}-\nabla_{w,\tau}u_{\mathcal{T}})\|_{\mathcal{T}_{*}}^{2} \\ && + C_{\epsilon}\mu^{-1}\left( \eta^{2}(u_{\mathcal{T}}, \mathcal{T})+ \eta^{2}(u_{\mathcal{T}_{*}}, \mathcal{T}_{*})\right), \end{array} $$

where the constant \(C_{\epsilon }\) is independent of μ and the mesh size.

Proof

By (15), we have

$$ |\|u - u_{\mathcal{T}_{*}}\||_{\mathcal{T}_{*}}^{2} = \|A^{1/2}(\nabla u- \nabla_{w, \tau_{*}} u_{\mathcal{T}_{*}})\|^{2}_{\mathcal{T}_{*}} + \sum\limits_{e_{*}\in \mathcal{E}_{\mathcal{T}_{*}}} \mu h^{-1}_{\tau_{*}}\|\left[\left[ u_{\mathcal{T}_{*}}\right]\right]_{e_{*}}\|^{2}_{e_{*}}. $$
(31)

According to the decomposition (10), we write \(u_{\mathcal {T}} = u_{\mathcal {T}}^{c} + u_{\mathcal {T}}^{\bot }\) with \(u_{\mathcal {T}}^{c}\in \mathcal {V}^{c}(\mathcal {T})\) and \(u_{\mathcal {T}}^{\bot }\in \mathcal {V}^{\bot }(\mathcal {T})\), and \(u_{\mathcal {T}_{*}} = u_{\mathcal {T}_{*}}^{c} + u_{\mathcal {T}_{*}}^{\bot }\) with \(u_{\mathcal {T}_{*}}^{c}\in \mathcal {V}^{c}(\mathcal {T}_{*})\) and \(u_{\mathcal {T}_{*}}^{\bot }\in \mathcal {V}^{\bot }(\mathcal {T}_{*})\). Then, noting that \(\nabla _{w,\tau _{*}} u_{\mathcal {T}_{*}}^{c}=\nabla u_{\mathcal {T}_{*}}^{c}\), \(\nabla _{w,\tau } u_{\mathcal {T}}^{c}=\nabla u_{\mathcal {T}}^{c}\) and \(u_{\mathcal {T}}^{c}\in \mathcal {V}^{c}(\mathcal {T})\subset \mathcal {V}^{c}(\mathcal {T}_{*})\), and applying Lemma 3, we have

$$ \begin{array}{@{}rcl@{}} &&\|A^{1/2}(\nabla u- \nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}})\|^{2}_{\mathcal{T}_{*}}\\ &=& \|A^{1/2}(\nabla u- \nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}+ \nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}^{c}- \nabla_{w,\tau}u_{\mathcal{T}}^{c} )\|^{2}_{\mathcal{T}_{*}} \\ &&\quad- 2(A(\nabla u-\nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}), \nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}^{c}-\nabla_{w, \tau}u_{\mathcal{T}}^{c})_{\mathcal{T}_{*}} \\ &&\quad - \|A^{1/2}(\nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}^{c}- \nabla_{w,\tau} u_{\mathcal{T}}^{c})\|^{2}_{\mathcal{T}_{*}}\\ &=&\|A^{1/2}(\nabla u- \nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}+ \nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}^{c}- \nabla_{w,\tau}u_{\mathcal{T}}^{c} )\|^{2}_{\mathcal{T}_{*}} \\ &&\quad- \|A^{1/2}(\nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}^{c}- \nabla_{w,\tau} u_{\mathcal{T}}^{c})\|^{2}_{\mathcal{T}_{*}}. \end{array} $$
(32)

For the first part of the right-hand side in (32), using \(\nabla _{w,\tau _{*}} u_{\mathcal {T}_{*}}- \nabla _{w,\tau _{*}} u_{\mathcal {T}_{*}}^{c}= \nabla _{w,\tau _{*}} u^{\bot }_{\mathcal {T}_{*}}\) and \(\nabla _{w,\tau } u_{\mathcal {T}}^{c}=\nabla _{w,\tau } u_{\mathcal {T}}-\nabla _{w,\tau } u^{\bot }_{\mathcal {T}}\), we have

$$ \begin{array}{@{}rcl@{}} &&\|A^{1/2}(\nabla u- \nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}+ \nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}}^{c}- \nabla_{w,\tau}u_{\mathcal{T}}^{c} )\|^{2}_{\mathcal{T}_{*}} \\ &=&\|A^{1/2}(\nabla u- \nabla_{w,\tau} u_{\mathcal{T}} + \nabla_{w,\tau} u_{\mathcal{T}}^{\bot}- \nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot} )\|^{2}_{\mathcal{T}_{*}} \\ &=&\|A^{1/2}(\nabla u- \nabla_{w,\tau} u_{\mathcal{T}})\|^{2}_{\mathcal{T}} +2(A(\nabla u- \nabla_{w,\tau} u_{\mathcal{T}}), \nabla_{w,\tau} u_{\mathcal{T}}^{\bot}- \nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot})_{\mathcal{T}_{*}} \\ &&+\|A^{1/2}(\nabla_{w,\tau} u_{\mathcal{T}}^{\bot}- \nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot})\|^{2}_{\mathcal{T}_{*}}. \end{array} $$
(33)

For the second part of the right-hand side in (32), noting that \(u^{c}_{\mathcal {T}_{*}}=u_{\mathcal {T}_{*}}-u^{\bot }_{\mathcal {T}_{*}}\) and \(u^{c}_{\mathcal {T}} = u_{\mathcal {T}}-u_{\mathcal {T}}^{\bot }\), then using the reversed triangle inequality yields

$$ \begin{array}{@{}rcl@{}} &&\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{c}-\nabla_{w,\tau}u_{\mathcal{T}}^{c})\|^{2}_{\mathcal{T}_{*}} \\ &=& \|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}-\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot})-A^{1/2}(\nabla_{w,\tau}u_{\mathcal{T}}-\nabla_{w,\tau}u_{\mathcal{T}}^{\bot})\|^{2}_{\mathcal{T}_{*}} \\ &\geqslant&\frac{1}{2}\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}-\nabla_{w,\tau}u_{\mathcal{T}})\|_{\mathcal{T}_{*}}^{2} -\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot}-\nabla_{w,\tau}u_{\mathcal{T}}^{\bot})\|_{\mathcal{T}_{*}}^{2}.\ \ \ \ \ \ \end{array} $$
(34)

Substituting (33) and (34) into (32) and employing Young's inequality \(2ab\leqslant \epsilon a^{2}+ C_{\epsilon }^{\prime } b^{2}\) with an arbitrary constant 𝜖 > 0, we have

$$ \begin{array}{@{}rcl@{}} &&\|A^{1/2}(\nabla u- \nabla_{w,\tau_{*}} u_{\mathcal{T}_{*}})\|^{2}_{\mathcal{T}_{*}} \\ &&\leqslant\|A^{1/2}(\nabla u- \nabla_{w,\tau} u_{\mathcal{T}})\|^{2}_{\mathcal{T}} +2(A(\nabla u- \nabla_{w,\tau} u_{\mathcal{T}}), \nabla_{w,\tau} u_{\mathcal{T}}^{\bot}- \nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot})_{\mathcal{T}_{*}} \\ && +\|A^{1/2}(\nabla_{w,\tau} u_{\mathcal{T}}^{\bot}- \nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot})\|^{2}_{\mathcal{T}_{*}} \\ &&-\frac{1}{2}\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}-\nabla_{w,\tau}u_{\mathcal{T}})\|_{\mathcal{T}_{*}}^{2}+\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot}-\nabla_{w,\tau}u_{\mathcal{T}}^{\bot})\|_{\mathcal{T}_{*}}^{2} \\ &&\leqslant(1+\epsilon)\|A^{1/2}(\nabla u- \nabla_{w,\tau} u_{\mathcal{T}})\|^{2}_{\mathcal{T}} - \frac{1}{2}\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}-\nabla_{w,\tau}u_{\mathcal{T}})\|_{\mathcal{T}_{*}}^{2}\\ &&\quad + (2+C_{\epsilon}^{\prime})\|A^{1/2}(\nabla_{w,\tau} u_{\mathcal{T}}^{\bot}- \nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot})\|^{2}_{\mathcal{T}_{*}}\\ &&\leqslant(1+\epsilon)\|A^{1/2}(\nabla u- \nabla_{w,\tau} u_{\mathcal{T}})\|^{2}_{\mathcal{T}} - \frac{1}{2}\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}-\nabla_{w,\tau}u_{\mathcal{T}})\|_{\mathcal{T}_{*}}^{2}\\ &&\quad + (4+2C_{\epsilon}^{\prime})\|A^{1/2}\nabla_{w,\tau} u_{\mathcal{T}}^{\bot}\|^{2}_{\mathcal{T}}+(4+2C_{\epsilon}^{\prime})\|A^{1/2}\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot}\|^{2}_{\mathcal{T}_{*}}, \end{array} $$
(35)

where 𝜖 ∈ (0,1) is an arbitrary constant.

Substituting (35) into (31), noting that \(u_{\mathcal {T}_{*}}= u_{\mathcal {T}_{*}}^{c}+ u_{\mathcal {T}_{*}}^{\bot }\) with \(u_{\mathcal {T}_{*}}^{c}\in \mathcal {V}^{c}(\mathcal {T}_{*})\), and using Lemma 6, we obtain

$$ \begin{array}{@{}rcl@{}} &&|\|u - u_{\mathcal{T}_{*}}\||_{\mathcal{T}_{*}}^{2}\\ &&\leqslant(1+\epsilon)\|A^{1/2}(\nabla u- \nabla_{w,\tau} u_{\mathcal{T}})\|^{2}_{\mathcal{T}} -\frac{1}{2}\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}-\nabla_{w,\tau}u_{\mathcal{T}})\|_{\mathcal{T}_{*}}^{2}\\ &&\quad + (4+2C_{\epsilon}^{\prime})\|A^{1/2}\nabla_{w,\tau} u_{\mathcal{T}}^{\bot}\|^{2}_{\mathcal{T}}+(4+2C_{\epsilon}^{\prime})\|A^{1/2}\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}^{\bot}\|^{2}_{\mathcal{T}_{*}} \\ && \quad + \sum\limits_{e_{*}\in \mathcal{E}_{\mathcal{T}_{*}}} \mu h^{-1}_{\tau_{*}}\|\left[\left[ u_{\mathcal{T}_{*}}^{\bot}\right]\right]_{e_{*}}\|^{2}_{e_{*}} \\ && \leqslant(1+\epsilon)\|A^{1/2}(\nabla u- \nabla_{w,\tau} u_{\mathcal{T}})\|^{2}_{\mathcal{T}} -\frac{1}{2}\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}-\nabla_{w,\tau}u_{\mathcal{T}})\|_{\mathcal{T}_{*}}^{2}\\ &&\quad + (4+2C_{\epsilon}^{\prime})|\|u_{\mathcal{T}}^{\bot}\||^{2}_{A, \mathcal{T}}+(4+2C_{\epsilon}^{\prime})|\|u_{\mathcal{T}_{*}}^{\bot}\||^{2}_{A, \mathcal{T}_{*}} \\ &&\leqslant (1+\epsilon) |\|u - u_{\mathcal{T}}\||_{\mathcal{T}}^{2} - \frac{1}{2}\|A^{1/2}(\nabla_{w,\tau_{*}}u_{\mathcal{T}_{*}}-\nabla_{w,\tau}u_{\mathcal{T}})\|_{\mathcal{T}_{*}}^{2}\\ &&\quad+ (4+2C_{\epsilon}^{\prime})\mu^{-1}\left( \eta^{2}(u_{\mathcal{T}}, \mathcal{T})+ \eta^{2}(u_{\mathcal{T}_{*}}, \mathcal{T}_{*})\right ), \end{array} $$

Setting \(C_{\epsilon } = 4+2C_{\epsilon }^{\prime }\) in the above inequality, we get the desired result. □

4.3 Reduction of error estimator

Here we skip the proof of the reduction of the error indicator, since the corresponding techniques are quite standard and can be found, e.g., in [4].

Lemma 9

Let \(\mathcal {R}_{\mathcal {T}_{k}\rightarrow \mathcal {T}_{k+1}}\) be the set of refined elements from \(\mathcal {T}_{k}\) to \(\mathcal {T}_{k+1}\). Then for any ζ > 0, there exists a constant λ ∈ (0,1) such that

$$ \begin{array}{@{}rcl@{}} \eta^{2}(u_{k+1}, \mathcal{T}_{k+1}) &&\leqslant (1+\zeta)\left( \eta^{2}(u_{k}, \mathcal{T}_{k})-\lambda\eta_{\mathcal{T}_{k}}^{2}(u_{k}, \mathcal{R}_{\mathcal{T}_{k}\rightarrow\mathcal{T}_{k+1}})\right) \\ && + C_{\eta}(1+\zeta^{-1})\|A^{1/2}(\nabla_{w, \tau_{k+1}}u_{k+1} - \nabla_{w, \tau_{k}}u_{k})\|^{2}_{\mathcal{T}_{k+1}}, \end{array} $$
(36)

where the constant \(C_{\eta }\) depends on the shape regularity of \(\mathcal {T}_{k+1}\) and the constant β0.

4.4 Convergence of the AMWG method

Now we are in a position to prove the convergence of Algorithm 1.

Theorem 2

Let a marking parameter 𝜃 ∈ (0,1) and an initial mesh \(\mathcal {T}_{0}\) be given. Let \(u\in {H_{0}^{1}}({\varOmega })\) be the solution of (3), and let \(\{ \mathcal {T}_{k}, u_{k}, \eta (u_{k},\mathcal {T}_{k})\}_{k\geq 0}\) be the sequence of meshes, MWG solutions and error estimators produced by Algorithm 1. Then there exists a constant \(\mu _{A} > 0\) such that, for all \(\mu > \mu _{A}\),

$$ \begin{array}{@{}rcl@{}} |\|u-u_{k+1}\||_{\mathcal{T}_{k+1}}^{2} +\delta\eta^{2}(u_{k+1}, \mathcal{T}_{k+1}) \leqslant\alpha\left( |\|u-u_{k}\||_{\mathcal{T}_{k}}^{2} + \delta\eta^{2}(u_{k}, \mathcal{T}_{k})\right), \end{array} $$
(37)

where the constants α ∈ (0,1) and δ > 0 depend only on 𝜃, the shape regularity of \(\mathcal {T}_{0}\), the polynomial degree l and the constant β0.

Proof

By Lemmas 8 and 9, we have

$$ \begin{array}{@{}rcl@{}} &&|\|u-u_{k+1}\||_{\mathcal{T}_{k+1}}^{2} + (\tilde{\delta} - C_{\epsilon}\mu^{-1})\eta^{2}(u_{k+1}, \mathcal{T}_{k+1})\\ &&\leqslant(1+\epsilon) |\| u - u_{k}\||_{\mathcal{T}_{k}}^{2} + \left( C_{\eta}(1 + \zeta^{-1})\tilde{\delta} -\frac{1}{2}\right)\|A^{1/2}(\nabla_{w,\tau_{k+1}}u_{k+1} - \nabla_{w,\tau_{k}}u_{k})\|_{\mathcal{T}_{k+1}}^{2}\\ &&\quad+(1+\zeta)\tilde{\delta} \left( \eta^{2}(u_{k}, \mathcal{T}_{k})-\lambda\eta_{\mathcal{T}_{k}}^{2}(u_{k}, \mathcal{R}_{\mathcal{T}_{k}\rightarrow\mathcal{T}_{k+1}})\right) + C_{\epsilon}\mu^{-1}\eta^{2}(u_{k}, \mathcal{T}_{k}). \end{array} $$

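Let \(\tilde {\delta }\) satisfy \(C_{\eta }(1+\zeta ^{-1})\tilde {\delta }=\frac {1}{2}\). Then, since the marking step gives \(\eta _{\mathcal {T}_{k}}^{2}(u_{k}, \mathcal {R}_{\mathcal {T}_{k}\rightarrow \mathcal {T}_{k+1}})\geqslant \theta ^{2}\eta ^{2}(u_{k}, \mathcal {T}_{k})\), we have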

$$ \begin{array}{@{}rcl@{}} &&|\|u-u_{k+1}\||_{\mathcal{T}_{k+1}}^{2} + (\tilde{\delta} - C_{\epsilon}\mu^{-1})\eta^{2}(u_{k+1}, \mathcal{T}_{k+1})\\ &&\leqslant (1+\epsilon) |\|u-u_{k}\||_{\mathcal{T}_{k}}^{2}+(1+\zeta)\tilde{\delta} (1-\lambda\theta^{2})\eta^{2}(u_{k}, \mathcal{T}_{k})+ C_{\epsilon}\mu^{-1}\eta^{2}(u_{k}, \mathcal{T}_{k}) \\ &&\leqslant (1+\epsilon) |\|u-u_{k}\||_{\mathcal{T}_{k}}^{2} - \frac{\tilde{\delta} (1+\zeta)\lambda\theta^{2}}{2}\eta^{2}(u_{k}, \mathcal{T}_{k})\\ &&\quad+ \left( (1+\zeta)\tilde{\delta} - \frac{\tilde{\delta} (1+\zeta)\lambda\theta^{2}}{2} + C_{\epsilon}\mu^{-1}\right)\eta^{2}(u_{k}, \mathcal{T}_{k})\\ &&\leqslant (1+\epsilon) |\|u-u_{k}\||_{\mathcal{T}_{k}}^{2} - \frac{\zeta\lambda\theta^{2}}{4C_{\eta}}\eta^{2}(u_{k}, \mathcal{T}_{k}) + \left( (1+\zeta)\tilde{\delta} - \frac{\tilde{\delta} (1+\zeta)\lambda\theta^{2}}{2} + C_{\epsilon}\mu^{-1}\right)\eta^{2}(u_{k}, \mathcal{T}_{k}). \end{array} $$

Using the upper bound (30) of Theorem 1, we have

$$ \begin{array}{@{}rcl@{}} &&|\|u-u_{k+1}\||_{\mathcal{T}_{k+1}}^{2} +\tilde{\delta} (1 - \frac{C_{\epsilon}}{\mu\tilde{\delta} })\eta^{2}(u_{k+1}, \mathcal{T}_{k+1})\\ &&\ \ \ \leqslant (1+\epsilon-\frac{\zeta\lambda\theta^{2}}{4C_{\eta}C_{UB}})|\|u-u_{k}\||_{\mathcal{T}_{k}}^{2}+ \tilde{\delta}\left( (1+\zeta)(1 - \frac{\lambda\theta^{2}}{2}) + \frac{C_{\epsilon}}{\mu\tilde{\delta} }\right)\eta^{2}(u_{k}, \mathcal{T}_{k}). \end{array} $$

Suppose that \(\mu _{1}\) satisfies

$$ 1 - \frac{\lambda^{2}\theta^{4}}{4}+ \frac{C_{\epsilon}}{\mu_{1}\tilde{\delta}}\leqslant 1 - \frac{C_{\epsilon}}{\mu_{1}\tilde{\delta}}. $$

Let \(\zeta =\frac {\lambda \theta ^{2}}{2}\), \(\epsilon = \frac {\zeta \lambda \theta ^{2}}{8C_{\eta }C_{UB}}\) and \(\mu _{A}=\max \limits \{0, \mu _{1}\}\). Then we have

$$ \begin{array}{@{}rcl@{}} &&|\|u-u_{k+1}\||_{\mathcal{T}_{k+1}}^{2} +\tilde{\delta}(1 - \frac{C_{\epsilon}}{\mu\tilde{\delta}})\eta^{2}(u_{k+1}, \mathcal{T}_{k+1})\\ &&\leqslant (1-\frac{\zeta\lambda\theta^{2}}{8C_{\eta}C_{UB}})|\|u-u_{k}\||_{\mathcal{T}_{k}}^{2}+ \tilde{\delta} \left( 1 - \frac{\lambda^{2}\theta^{4}}{4}+ \frac{C_{\epsilon}}{\mu\tilde{\delta}}\right)\eta^{2}(u_{k}, \mathcal{T}_{k})\\ &&\leqslant (1-\frac{\zeta\lambda\theta^{2}}{8C_{\eta}C_{UB}})|\|u-u_{k}\||_{\mathcal{T}_{k}}^{2}+ \tilde{\delta} (1 - \frac{C_{\epsilon}}{\mu\tilde{\delta}}) \left( 1 - \frac{\lambda^{2}\theta^{4}}{4} + \frac{C_{\epsilon}}{\mu\tilde{\delta}}\right)\left( 1 - \frac{C_{\epsilon}}{\mu\tilde{\delta}}\right)^{-1} \eta^{2}(u_{k}, \mathcal{T}_{k})\\ &&\leqslant\alpha\left( |\|u-u_{k}\||_{\mathcal{T}_{k}}^{2} + \tilde{\delta} (1 - \frac{C_{\epsilon}}{\mu\tilde{\delta}}) \eta^{2}(u_{k}, \mathcal{T}_{k})\right), \end{array} $$

where \(\alpha = \max \limits \left \{ (1-\frac {\zeta \lambda \theta ^{2}}{8C_{\eta }C_{UB}}), \left (1 - \frac {\lambda ^{2}\theta ^{4}}{4}+ \frac {C_{\epsilon }}{\mu \tilde {\delta }}\right )\left (1 - \frac {C_{\epsilon }}{\mu \tilde {\delta }}\right )^{-1}\right \} \). Finally, setting \(\delta = \tilde {\delta }\left (1 - \frac {C_{\epsilon }}{\mu \tilde {\delta }}\right )\), we obtain (37). □
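The parameter choices in this proof can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function name `contraction_constants` is hypothetical, and the constants λ, \(C_{\eta }\), \(C_{UB}\) and \(C_{\epsilon }\) are passed in as plain numbers, although in the analysis \(C_{\epsilon }\) depends on 𝜖 and none of these constants is known explicitly.

```python
def contraction_constants(lam, theta, c_eta, c_ub, c_eps, mu):
    """Evaluate zeta, eps, delta_tilde, delta and alpha as chosen in the proof.

    All constants are illustrative inputs (assumption): in the analysis they
    are not known explicitly, and c_eps actually depends on eps.
    """
    zeta = lam * theta**2 / 2.0
    eps = zeta * lam * theta**2 / (8.0 * c_eta * c_ub)
    # delta_tilde solves C_eta * (1 + 1/zeta) * delta_tilde = 1/2
    delta_tilde = 1.0 / (2.0 * c_eta * (1.0 + 1.0 / zeta))
    ratio = c_eps / (mu * delta_tilde)
    if ratio >= 1.0:
        raise ValueError("mu is too small: delta would not be positive")
    delta = delta_tilde * (1.0 - ratio)
    # The two candidates whose maximum defines alpha
    alpha_error = 1.0 - zeta * lam * theta**2 / (8.0 * c_eta * c_ub)
    alpha_estimator = (1.0 - lam**2 * theta**4 / 4.0 + ratio) / (1.0 - ratio)
    alpha = max(alpha_error, alpha_estimator)
    return zeta, eps, delta_tilde, delta, alpha


zeta, eps, delta_tilde, delta, alpha = contraction_constants(
    lam=0.5, theta=0.5, c_eta=1.0, c_ub=1.0, c_eps=1.0, mu=1.0e6
)
```

With these made-up inputs the returned α lies in (0,1) only because μ is taken large; for small μ the sketch raises an error or returns α ≥ 1, which mirrors the role of the restriction on μ in the theorem.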

Remark 2

The restriction \(\mu > \mu _{A}\) can be removed if we choose l = 1 in the MWG spaces (4); for more details, we refer to [35].

By recursion, the following decay of the energy error plus the error estimator can be obtained.

Corollary 1

Under the hypotheses of Theorem 2, we have

$$ |\|u-u_{k}\||_{\mathcal{T}_{k}}^{2} +\delta\eta^{2}(u_{k}, \mathcal{T}_{k}) \leqslant C_{0}\alpha^{k}, $$

where the constants α and δ are given in Theorem 2, and \(C_{0} = |\|u-u_{0}\||_{\mathcal {T}_{0}}^{2} +\delta \eta ^{2}(u_{0}, \mathcal {T}_{0})\). Thus the AMWG algorithm will terminate in a finite number of steps.
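The finite termination can be made concrete: if the stopping criterion is that the bound \(C_{0}\alpha ^{k}\) drops below a tolerance, then it suffices to take \(k \geqslant \ln (\text {tol}/C_{0})/\ln \alpha \). The Python sketch below computes this step count; the function name `iterations_to_tolerance`, the tolerance `tol` and the sample values are illustrative assumptions, not quantities from the paper.

```python
import math


def iterations_to_tolerance(c0, alpha, tol):
    """Smallest integer k >= 0 with c0 * alpha**k <= tol, for 0 < alpha < 1."""
    if not (0.0 < alpha < 1.0):
        raise ValueError("alpha must lie in (0, 1)")
    if c0 <= 0.0 or tol <= 0.0:
        raise ValueError("c0 and tol must be positive")
    if c0 <= tol:
        return 0
    k = math.ceil(math.log(tol / c0) / math.log(alpha))
    # Guard against floating-point rounding in the logarithms
    while c0 * alpha**k > tol:
        k += 1
    while k > 0 and c0 * alpha ** (k - 1) <= tol:
        k -= 1
    return k


k_needed = iterations_to_tolerance(c0=1.0, alpha=0.5, tol=1.0e-3)
```

Since α may be close to 1 in the analysis, the bound obtained this way can be pessimistic compared with the number of steps observed in practice.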

5 Numerical experiments

In this section, numerical experiments are presented to verify the convergence of Algorithm 1.

Example 1

In this example, we choose the domain Ω = (0,1) × (0,1) and the coefficient A = I. The exact solution of (3) is

$$u(x, y)=(x-x^{2})(y-y^{2})\arctan(100(\sqrt{(x-9/8)^{2}+ (y+1/2)^{2}}-1)).$$

In the variational problem (7), we choose the penalty parameter μ = 1.
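For reference, the exact solution of Example 1 can be evaluated pointwise as in the following Python sketch (the function name `u_exact_example1` is an illustrative choice). The factor \((x-x^{2})(y-y^{2})\) enforces the homogeneous boundary condition, and the arctan factor changes sign across the circle of radius 1 centred at (9/8, −1/2), which produces a steep interior layer.

```python
import math


def u_exact_example1(x, y):
    """Exact solution of Example 1 on the unit square."""
    # Distance to the centre (9/8, -1/2) of the circular layer
    r = math.sqrt((x - 9.0 / 8.0) ** 2 + (y + 0.5) ** 2)
    return (x - x * x) * (y - y * y) * math.atan(100.0 * (r - 1.0))


value_outside_circle = u_exact_example1(0.5, 0.5)  # r > 1, so u > 0 here
value_inside_circle = u_exact_example1(0.9, 0.1)   # r < 1, so u < 0 here
```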

The left panel of Fig. 1 shows the initial mesh \(\mathcal {T}_{0}\) for Example 1, and the right panel shows the refined mesh after k = 6 iterations. Figure 2 shows the behaviour of \(\ln \# \mathcal {T}_{k}-\ln |\|\nabla u - \nabla _{w} u_{{k}}\||_{\mathcal {T}_{k}}\) for the marking parameters 𝜃 = 0.1, 0.3 and 0.5, where \(\#\mathcal {T}_{k}\) and \(u_{k}\) denote the number of elements and the corresponding solution, respectively, obtained from Algorithm 1.
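The marking parameter 𝜃 controls how many elements are refined in each step. The Python sketch below illustrates one common strategy (bulk, or Dörfler, marking) under the assumption that the marked set \(\mathcal {M}\) should satisfy \(\eta ^{2}(u_{k}, \mathcal {M})\geqslant \theta ^{2}\eta ^{2}(u_{k}, \mathcal {T}_{k})\), which is the form used in the proof of Theorem 2; the function name `bulk_mark` and the sample indicator values are hypothetical, and the sketch is not claimed to reproduce the marking step of Algorithm 1 exactly.

```python
def bulk_mark(indicators_sq, theta):
    """Greedy bulk marking.

    indicators_sq: list of squared local indicators, one per element.
    Returns the indices of marked elements, largest indicators first, such
    that their sum is at least theta**2 times the total (assumed criterion).
    """
    total = sum(indicators_sq)
    target = theta**2 * total
    order = sorted(range(len(indicators_sq)),
                   key=lambda i: indicators_sq[i], reverse=True)
    marked, accumulated = [], 0.0
    for i in order:
        if accumulated >= target:
            break
        marked.append(i)
        accumulated += indicators_sq[i]
    return marked


sample = [4.0, 1.0, 0.5, 0.25]          # made-up squared indicators
marked_small_theta = bulk_mark(sample, 0.5)
marked_large_theta = bulk_mark(sample, 0.9)
```

A smaller 𝜃 marks fewer elements per step, so more adaptive iterations are typically needed to reach a given number of elements.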

Fig. 1 The initial mesh (left); adaptively refined mesh after k = 6 iterations (right) for Example 1

Fig. 2 Quasi-optimality of the adaptive mesh refinements with marking parameters 𝜃 = 0.1, 0.3, 0.5 (right)

Example 2

In this example, we choose the L-shaped domain \({\varOmega } = (-1,1)^{2}\setminus ([0,1) \times (-1,0])\) and the coefficient A = I. The exact solution of (3) is \(u(x, y)=r^{2/3}\sin (\frac {2\theta }{3})\) in polar coordinates \((r,\theta )\). In the variational problem (7), we choose the penalty parameter μ = 1.
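The exact solution of Example 2 can be evaluated in Cartesian coordinates as in the Python sketch below. It assumes that (r, θ) are polar coordinates centred at the re-entrant corner (0,0) with the angle measured counterclockwise from the positive x-axis in [0, 2π); the function name `u_exact_example2` is an illustrative choice.

```python
import math


def u_exact_example2(x, y):
    """Exact solution r^(2/3) * sin(2*theta/3) of Example 2."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)        # value in (-pi, pi]
    if theta < 0.0:
        theta += 2.0 * math.pi      # shift to [0, 2*pi) (assumed convention)
    return r ** (2.0 / 3.0) * math.sin(2.0 * theta / 3.0)


value_on_positive_x_axis = u_exact_example2(1.0, 0.0)   # theta = 0
value_on_negative_y_axis = u_exact_example2(0.0, -1.0)  # theta = 3*pi/2
```

With this convention the function vanishes on the two edges meeting at the re-entrant corner, and its gradient is unbounded as r tends to 0, which is the singularity the adaptive refinement is expected to resolve.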

The left panel of Fig. 3 shows the initial mesh for Example 2, and the right panel shows the refined mesh after k = 8 iterations. Figure 4 shows the behaviour of \(\ln \# \mathcal {T}_{k}-\ln |\|\nabla u - \nabla _{w} u_{{k}}\||_{\mathcal {T}_{k}}\) for the marking parameters 𝜃 = 0.1, 0.3 and 0.5.

Fig. 3 The initial mesh (left); adaptively refined mesh after k = 8 iterations (right) for Example 2

Fig. 4 Quasi-optimality of the adaptive mesh refinements with marking parameters 𝜃 = 0.1, 0.3, 0.5 (right)

From the above two numerical experiments, we can see that Algorithm 1 is convergent. Furthermore, the right panels of Figs. 1 and 3 show that the meshes produced by Algorithm 1 are locally refined, with more elements around the edge singularity. From Figs. 2 and 4, we also observe the following quasi-optimality after several steps:

$$ |\|\nabla u - \nabla_{w} u_{{k}}\||_{\mathcal{T}_{k}}\leqslant C(\# \mathcal{T}_{k})^{-1/2}. $$
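A rate of this kind is usually read off as the slope of a least-squares line through the points \((\ln \#\mathcal {T}_{k}, \ln \text {error}_{k})\). The Python sketch below shows one way to do this; the function name `loglog_slope` is hypothetical, and the data are synthetic values generated to follow \(C(\#\mathcal {T}_{k})^{-1/2}\) exactly, not results from the experiments above.

```python
import math


def loglog_slope(n_elems, errors):
    """Least-squares slope of ln(error) against ln(number of elements)."""
    xs = [math.log(n) for n in n_elems]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic data (assumption): errors decaying exactly like 2 * N^(-1/2)
n_elems = [100, 400, 1600, 6400]
errors = [2.0 * n ** -0.5 for n in n_elems]
slope = loglog_slope(n_elems, errors)
```

A fitted slope close to −1/2 over the later adaptive steps would be consistent with the quasi-optimal decay stated above.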