1 Introduction

In this paper, we consider the convergence of adaptive weak Galerkin finite element methods (AWG) for the following model second order elliptic problem:

$$\begin{aligned} -\nabla \cdot (A\nabla u)= & {} f \quad \text{ in } \ \Omega , \end{aligned}$$
(1)
$$\begin{aligned} u= & {} 0 \quad \text{ on } \ \partial \Omega , \end{aligned}$$
(2)

where \(\Omega \) is a bounded polygonal or polyhedral domain in \({\mathbb {R}}^d (d = 2,3)\) that is partitioned into non-overlapping subdomains \(\Omega _i, 1\leqslant i\leqslant m\). We assume that the initial partition \({\mathcal {T}}_0\) of \(\Omega \) is consistent with the partition \({\bar{\Omega }}=\bigcup _{i=1}^{m}{\bar{\Omega }}_i\) in the sense that each \({\mathcal {T}}_0\cap \Omega _i ,1\leqslant i\leqslant m\), is a partition of \(\Omega _i\). We consider the case in which the coefficient A is piece-wise constant, i.e., constant on each \(\tau \in {\mathcal {T}}_0\). We further assume that there exist constants \(\alpha >0\) and \(\beta >0\) such that \(\alpha \leqslant A\leqslant \beta \).

Weak Galerkin (WG) methods make use of discontinuous finite element functions for partial differential equations, in which differential operators are approximated by weak forms as distributions. WG methods were first used to solve second order elliptic problems on simplicial grids in [27] and later on regular polytopal meshes in [21]. Then, WG methods in mixed form were applied to second order elliptic problems on arbitrary polygonal (or polyhedral) meshes in 2D (or 3D) in [28]. WG methods were subsequently applied to other problems, such as second order elliptic interface problems [17], the Helmholtz equation [7, 19, 23], the biharmonic equation [18, 22, 26], the Darcy equation [12] and so on. WG methods are closely related to mixed finite element methods and hybridized discontinuous Galerkin (DG) methods. However, when the coefficients are general variable functions, the WG methods differ from these methods.

Computation with adaptive grid refinement has proved to be a useful and efficient tool in scientific computing over the last several decades. We consider the following standard adaptive procedure:

$$\begin{aligned} \mathbf{SOLVE}\rightarrow \mathbf{ESTIMATE} \rightarrow \mathbf{MARK}\rightarrow \mathbf{REFINE} . \end{aligned}$$
(3)

The precise definition of the algorithm can be found in Sect. 3. For elliptic and Maxwell problems, the theory of convergence and computational complexity of adaptive methods of the form (3) has seen great developments in the past few decades; see, e.g., [1, 3, 8, 15, 33, 34]. We also refer to [24] for an introduction to the theory of adaptive finite element methods.

For adaptive WG methods, there are only a few results on a posteriori error estimates. For second order elliptic problems, a residual type a posteriori error estimator was first presented and analyzed in [6]; an a posteriori error estimator for a modified WG method was considered in [31]; a residual type error estimator that provides global upper and lower bounds for the error of the WG method in a discrete \(H^1\)-norm was proposed in [29]; and, recently, a simple a posteriori error estimator that can be applied to general meshes, such as hybrid and polytopal meshes and those with hanging nodes, was introduced in [11]. An a posteriori error estimate of WG finite element methods for second order elliptic interface problems is presented in [16], and a residual-based a posteriori error estimator for the Stokes problem is discussed in [32]. However, to the best of our knowledge, no work in the literature studies the convergence of adaptive WG methods.

Our work is motivated by the convergence analysis of adaptive mixed finite element methods (AMFEM) in [5, 9]. In both approaches, the authors study AMFEM for second order elliptic problems with a constant coefficient. In this paper, we present the convergence of the AWG method for second order elliptic problems whose coefficient is piece-wise constant. Because the weak gradient is defined in a polynomial space and the finite element spaces differ from the classical finite element spaces, the proof of the quasi-orthogonality in [5, 9] cannot be used directly. The data oscillation and the error indicator are estimated separately in [5, 9]. However, in WG methods, the data oscillation is one part of the corresponding error indicator, and we have to estimate the data oscillation and the corresponding error indicator together. We also notice that the corresponding error estimates of WG methods are more complicated than those for mixed element methods.

In this paper, we shall follow the state-of-the-art convergence theory [9] to prove the convergence of adaptive WG methods without extra marking for the data oscillation. We stress that the extension of the convergence theory to adaptive WG methods is not straightforward, since the data oscillation and the error indicator in [9] are estimated separately, whereas in WG methods the data oscillation is one part of the corresponding error indicator and we have to estimate the data oscillation and the corresponding error indicator together. We also notice that the convergence technique used for hybridized DG or mixed methods cannot be applied directly to the WG methods, since the corresponding error estimates of WG methods are more complicated than those for mixed element methods. In particular, we need to establish the corresponding quasi-orthogonality.

We summarize our main result in the following theorem.

Theorem 1

Let a parameter \(\theta \in (0, 1)\) and an initial mesh \({\mathcal {T}}_0\) be given. Let u be the solution of (1)–(2), and let \(\{{\mathcal {T}}_k, u_k, \eta (u_k,{\mathcal {T}}_k)\}_{k\ge 0}\) be a sequence of meshes, finite element solutions and error estimates produced by the AWG method. Then there exist constants \(\rho \in (0, 1), \sigma _1>0, \sigma _2>0\) and \(\epsilon \) depending only on the shape regularity of \({\mathcal {T}}_0\), the polynomial order l, the coefficient A, and the parameters \(\theta \) and \(\mu _0\), such that if

$$\begin{aligned} 0<\epsilon <\min \left( \dfrac{\sigma _1(1-\xi )}{C_1}, 1\right) , \end{aligned}$$

then

$$\begin{aligned}&{(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k+1})\Vert _{{\mathcal {T}}_{k+1}}^2 + \sigma _1\eta ^2(u_{k+1}, {\mathcal {T}}_{k+1}) + \sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k+1})}\\&\quad \leqslant \rho \Big ((1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k})\Vert _{{\mathcal {T}}_{k}}^2 + \sigma _1\eta ^2(u_{k}, {\mathcal {T}}_{k}) +\sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k})\Big ), \end{aligned}$$

where the constants \(C_1\) and \(\xi \) are given by Lemmas 7 and 12, respectively.

As a consequence, the AWG method will terminate in a finite number of steps for a given tolerance.
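The step from the contraction to finite termination can be sketched as follows. The abbreviation \(E_k\) and the stopping test \(\eta (u_k,{\mathcal {T}}_k)\leqslant \text{tol}\) are assumptions introduced here for illustration only; the precise stopping criterion is the one stated in the algorithm of Sect. 3. Writing

$$\begin{aligned} E_k := (1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k})\Vert _{{\mathcal {T}}_{k}}^2 + \sigma _1\eta ^2(u_{k}, {\mathcal {T}}_{k}) +\sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k}), \end{aligned}$$

Theorem 1 states \(E_{k+1}\leqslant \rho E_k\), and hence, by induction,

$$\begin{aligned} \sigma _1\eta ^2(u_{k}, {\mathcal {T}}_{k}) \leqslant E_k \leqslant \rho ^k E_0, \end{aligned}$$

where the first inequality uses \(\epsilon <1\), so that all three terms of \(E_k\) are non-negative. Under the assumed stopping test and for \(E_0>0\), the loop therefore stops at the latest for the first k with \(k\geqslant \log (\sigma _1\text{tol}^2/E_0)/\log \rho \).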

We use the following notation throughout the paper. To avoid writing generic constants repeatedly, we follow [33] and write \(x\lesssim y\) for \(x \le Cy\), where C is a generic positive constant independent of the variables that appear in the inequalities and, in particular, of the mesh parameters. The notation \(C_i\), with a subscript, denotes specific and important constants.

The rest of the article is organized as follows. In Sect. 2, we describe the definitions of weak gradient and discrete weak gradient, the weak Galerkin finite element spaces and the corresponding bilinear form \(a(\cdot , \cdot )\). In Sect. 3, we present the adaptive algorithm and discuss each procedure of (3) in detail. We prove the convergence of the proposed adaptive algorithm in Sect. 4 and report some numerical results in support of theoretical ones in Sect. 5.

2 Preliminaries and Notations

In this section, we recall the definitions of weak gradient and discrete weak gradient, the weak Galerkin finite element spaces and the corresponding bilinear form \(a(\cdot , \cdot )\).

First, we introduce some notation. For any domain \(D\subset {\mathbb {R}}^d, d=2, 3\), we use standard definitions for the Sobolev spaces \(H^s (D)\) and their associated norms \(\Vert \cdot \Vert _{s, D}\) for \(s\geqslant 0\). Since the space \(L^2(D)\) is \(H^0(D)\), we denote its norm by \(\Vert \cdot \Vert _D\). When \(D = \Omega \), we shall simplify the notation as \(\Vert \cdot \Vert \). In addition, we define \({\mathbf {H}}(\mathrm {div}, D) = \{{\varvec{q}} : {\varvec{q}} \in (L^2(D))^d, \nabla \cdot {\varvec{q}} \in L^2(D)\}, d=2, 3\).

2.1 Weak Gradient and Discrete Weak Gradient

Let K be any polygonal domain with boundary \(\partial K\). Following [27], a weak function on the region K refers to a function \(v = \{v_0 , v_b\}\) such that \(v_0 \in L^2 (K)\) and \(v_b \in H^{\frac{1}{2}}(\partial K)\). The first component \(v_0\) can be understood as the value of v in K, and the second component \(v_b\) represents v on the boundary of K. Note that \(v_b\) may not necessarily be related to the trace of \(v_0\) on \(\partial K\) even if the trace is well-defined. Denote by W(K) the space of weak functions on K, i.e.,

$$\begin{aligned} W(K) := \{v = \{v_0 , v_b\} : v_0 \in L^{2} (K), v_b\in H^{\frac{1}{2}}(\partial K)\}. \end{aligned}$$
(4)

According to [27], we define the weak gradient as follows.

Definition 1

(Weak Gradient) The weak gradient of \(v= \{v_0 , v_b\}\in W(K)\) is defined as a linear functional \(\nabla _w v\) in the dual space of \({\mathbf {H}} (\mathrm {div}, K)\) satisfying the following equation

$$\begin{aligned} (\nabla _w v, {\varvec{q}})_K := -(v_0, \nabla \cdot {\varvec{q}})_K + \langle v_b, {\varvec{q}} \cdot {\varvec{n}}\rangle _{\partial K} \qquad \forall {\varvec{q}} \in {\mathbf {H}}(\mathrm {div}, K), \end{aligned}$$
(5)

where \({\varvec{n}}\) is the unit outward normal direction to \(\partial K\), \((v_0, \nabla \cdot {\varvec{q}})_K = \int _K v_0(\nabla \cdot {\varvec{q}})\mathrm {d} {\varvec{x}}\) is the action of \(v_0\) on \(\nabla \cdot {\varvec{q}}\), and \( \langle v_b, {\varvec{q}} \cdot {\varvec{n}}\rangle _{\partial K} =\int _{\partial K} v_b ( {\varvec{q}} \cdot {\varvec{n}})\mathrm {d} s\) is the action of \({\varvec{q}}\cdot {\varvec{n}}\) on \(v_b\in H^{\frac{1}{2}}(\partial K)\).
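As a consistency check on (5), consider the special case, chosen here purely for illustration, of a function \(v\in H^1(K)\) with \(v_0 = v\) in K and \(v_b = v|_{\partial K}\). For sufficiently smooth \({\varvec{q}}\), integration by parts gives

$$\begin{aligned} (\nabla _w v, {\varvec{q}})_K = -(v, \nabla \cdot {\varvec{q}})_K + \langle v, {\varvec{q}} \cdot {\varvec{n}}\rangle _{\partial K} = (\nabla v, {\varvec{q}})_K, \end{aligned}$$

so in this case the weak gradient acts as the classical gradient.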

In WG methods, we also need discrete analogues of the weak gradient. We consider a shape-regular partition \({\mathcal {T}} = \cup \{\tau \}\) for the domain \(\Omega \). For each integer \(l \geqslant 0\), let \(P_l(\tau )\) be the set of polynomials on \(\tau \) with degree no more than l and \({\hat{P}}_l(\tau )\) be the set of homogeneous polynomials of order l in the variable \({\varvec{x}} = (x_1,\ldots , x_d )^T\). Let \({\mathbf {G}}_l (\tau )\) be either \((P_l(\tau ))^d\) or \(RT_l(\tau ) = (P_l(\tau ))^d + {\hat{P}}_l(\tau ){\varvec{x}}\). For the weak function space \(W (\tau )\), we discretize it by \(W_{i, j}(\tau )\) given as follows

$$\begin{aligned} W_{i, j}(\tau ) := \left\{ v = \{v_0 , v_b\}: v_0 \in P_i(\tau ), v_b\in P_j(\partial \tau )\right\} . \end{aligned}$$

Definition 2

(Discrete weak gradient) The discrete weak gradient of \(v= \{v_0 , v_b\}\in W_{i, j}(\tau )\) denoted by \(\nabla _{w, l, \tau } v\) is defined as the unique polynomial \(\nabla _{w, l, \tau } v\in {\mathbf {G}}_l (\tau )\) satisfying the following equation

$$\begin{aligned} (\nabla _{w, l, \tau } v, {\varvec{q}})_\tau := - (v_0, \nabla \cdot {\varvec{q}})_\tau + \langle v_b, {\varvec{q}} \cdot {\varvec{n}}\rangle _{\partial \tau },\qquad \forall {\varvec{q}}\in {\mathbf {G}}_l (\tau ). \end{aligned}$$
(6)

Note that if \(v \in H^1(\tau )\) and \(\nabla v\in {\mathbf {G}}_l(\tau )\), then \(\nabla _{w, l, \tau } v = \nabla v\).
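The definition (6) can be made concrete in the simplest setting. The sketch below assumes a single triangle and the choice \({\mathbf {G}}_0(\tau ) = (P_0(\tau ))^2\), which is used here only for illustration and is not one of the element pairs analyzed in this paper. For constant \({\varvec{q}}\) the term \((v_0, \nabla \cdot {\varvec{q}})_\tau \) vanishes, so (6) reduces to \(\nabla _{w} v = |\tau |^{-1}\sum _{e\subset \partial \tau } {\varvec{n}}_e\int _e v_b\). The function name `weak_gradient_p0` and its arguments are hypothetical.

```python
import numpy as np


def weak_gradient_p0(vertices, vb_edge_means):
    """Discrete weak gradient in (P_0)^2 on one triangle (illustrative sketch).

    vertices      : three (x, y) pairs in counterclockwise order.
    vb_edge_means : mean value of v_b on edge i, where edge i runs from
                    vertex i to vertex (i + 1) % 3.
    The interior part v_0 does not enter, because div q = 0 for constant q.
    """
    verts = np.asarray(vertices, dtype=float)
    e1, e2 = verts[1] - verts[0], verts[2] - verts[0]
    area = 0.5 * (e1[0] * e2[1] - e1[1] * e2[0])  # signed area
    if area <= 0.0:
        raise ValueError("vertices must be in counterclockwise order")
    grad = np.zeros(2)
    for i in range(3):
        t = verts[(i + 1) % 3] - verts[i]      # edge tangent, length |e|
        scaled_normal = np.array([t[1], -t[0]])  # |e| times outward unit normal
        grad += vb_edge_means[i] * scaled_normal  # equals n_e * integral of v_b over e
    return grad / area


# For u = 2x + 3y + 1, the edge means of u are its values at the edge midpoints.
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(weak_gradient_p0(tri, [2.0, 3.5, 2.5]))
```

With \(v_b\) taken as the edge means of a linear function, the computed weak gradient agrees with the classical gradient, in line with the remark following (6).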

Different weak Galerkin finite element methods can be derived by choosing \(W_{ i, j}(\tau )\) and \({\mathbf {G}}_l (\tau )\) with various combinations of the indices i, j and l (see [20, 27]). This paper shall mainly consider two pairs \(W_{l, l}(\tau )-RT_l(\tau )\) and \(W_{l, l+1}(\tau )-\left( P_{l+1}(\tau )\right) ^d\), for integers \(l\geqslant 0\) defined on simplices \(\tau \).

In the next subsection, the weak Galerkin finite element spaces and the corresponding bilinear form \(a(\cdot , \cdot )\) will be presented.

2.2 Weak Galerkin Finite Element Method

Let \({\mathcal {T}}_h\) be a shape-regular partition of the domain \(\Omega \) into a set of elements \(\tau \). We use the notation \({\mathcal {E}}_h\) to denote the set of all edges or faces in \({\mathcal {T}}_h\) and \({\mathcal {E}}_h^0 = {\mathcal {E}}_h\setminus \partial \Omega \) denote the set of all interior edges or faces. For a d-dimensional simplex S, we write \(h_S = |S|^{1/d}\) to denote the size of the element S where |S| is the d-dimensional Lebesgue measure of S.

Denote by \(W_l (\tau ) - {\mathbf {G}}_l (\tau )\) a local weak Galerkin element that can be either \(W_{l, l}(\tau )-RT_l(\tau )\) or \(W_{l, l+1}(\tau ) - \left( P_{l+1}(\tau )\right) ^d\). Associated with \({\mathcal {T}}_h\) and a local element \(W_l (\tau ) - {\mathbf {G}}_l (\tau )\), we define global weak Galerkin finite element spaces,

$$\begin{aligned} V_h:= & {} \left\{ v = \{v_0 , v_b\}: \{v_0 , v_b\}|_\tau \in W_l(\tau )\right\} ,\\ V_h^0:= & {} \left\{ v : v\in V_h, v_b = 0 \ \ \text{ on }\ \ \partial \Omega \right\} . \end{aligned}$$

We would like to emphasize that any function \(v = \{v_0, v_b\}\in V_h^0\) has a single value \(v_b\) on each edge \(e\in {\mathcal {E}}_h\). We can also note that \(v = \{v_0, v_b\}\in V_h^0\) is a reasonable approximation of a function in \(H_0^1(\Omega )\) (see Sect. 3 in [6]).

Now, we can define the discrete weak gradient operator \(\nabla _{w, l}\) on the weak finite element space \(V_h\), which is computed element-wise by using (6); i.e., for any \(\tau \), we have \(\nabla _{w, l ,\tau } (v|_\tau )\in {\mathbf {G}}_l (\tau )\) and

$$\begin{aligned} (\nabla _{w, l} v)|_\tau := \nabla _{w, l, \tau } (v|_\tau )\quad \forall v\in V_h. \end{aligned}$$

Here and afterwards, for simplicity of notation, we shall drop the subscript l in the notation \(\nabla _{w, l}\) for the discrete weak gradient when no confusion arises.

For any \(w, v\in V_h\), we define the bilinear form as follows

$$\begin{aligned} a(w, v) = (A\nabla _{w} w, \nabla _{w} v)_{{\mathcal {T}}_h}:= \sum _{\tau \in {\mathcal {T}}_h}(A\nabla _{w} w, \nabla _{w} v)_\tau . \end{aligned}$$

The WG method for solving (1)–(2) reads: find \(u_h = \{u_0^h, u_b^h\}\in V_h^0\), such that

$$\begin{aligned} a(u_h, v_h) = (f, v_0^h), \quad \forall v_h = \{v_0^h, v_b^h\}\in V_h^0. \end{aligned}$$
(7)

The well-posedness of variational problem (7) can be found in [27].

Remark 1

Optimal order error estimates, which are between weak Galerkin finite element solutions and the exact solution in both the discrete \(H^1\) and \(L^2\) norms, were also presented in [27].

3 Adaptive Weak Galerkin Finite Element Methods

In this section, we present the standard adaptive algorithm (see Sect. 5 in [6]) and discuss each step in the algorithm in detail.

figure a

The goal of this paper is to prove that the algorithm AWG will terminate in finite steps for a given tolerance. In the following subsections, we shall discuss each step in detail.
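The loop (3) can be sketched in code as a generic driver. This is a minimal illustration under stated assumptions, not the algorithm of this paper: the four callables, the stopping test (total squared estimator at most `tol ** 2`) and the one-dimensional toy stand-ins with indicator \(h^3\) are all hypothetical and serve only to make the sketch runnable.

```python
def adaptive_loop(mesh, solve, estimate, mark, refine, tol, max_iter=50):
    """Generic SOLVE -> ESTIMATE -> MARK -> REFINE driver (illustrative sketch)."""
    history = []
    for _ in range(max_iter):
        u = solve(mesh)                # SOLVE
        eta2 = estimate(u, mesh)       # ESTIMATE: squared element indicators
        total = sum(eta2)
        history.append(total)
        if total <= tol ** 2:          # assumed stopping test
            break
        marked = mark(eta2)            # MARK
        mesh = refine(mesh, marked)    # REFINE
    return mesh, history


# Hypothetical toy stand-ins on intervals (a, b) of [0, 1].
def toy_solve(mesh):
    return None  # no discrete problem is solved in this toy


def toy_estimate(u, mesh):
    return [(b - a) ** 3 for a, b in mesh]  # made-up indicator h^3


def toy_mark(eta2, theta=0.5):
    # Greedy bulk marking: largest indicators first until the fraction theta is reached.
    order = sorted(range(len(eta2)), key=lambda i: eta2[i], reverse=True)
    total, acc, marked = sum(eta2), 0.0, set()
    for i in order:
        if acc >= theta * total:
            break
        marked.add(i)
        acc += eta2[i]
    return marked


def toy_refine(mesh, marked):
    new_mesh = []
    for i, (a, b) in enumerate(mesh):
        if i in marked:
            mid = 0.5 * (a + b)
            new_mesh += [(a, mid), (mid, b)]  # bisect marked intervals
        else:
            new_mesh.append((a, b))
    return new_mesh


final_mesh, history = adaptive_loop(
    [(0.0, 1.0)], toy_solve, toy_estimate, toy_mark, toy_refine, tol=0.1
)
print(len(final_mesh), history[-1])
```

In one dimension no closure step is needed; for the simplicial meshes considered in this paper, REFINE must additionally remove hanging nodes, as discussed in Sect. 3.4.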

3.1 Procedure SOLVE

Given a function \(f \in L^2(\Omega )\) and a shape regular mesh \( {\mathcal {T}}_k\), let \(u_{k}\) be the exact WG solution of (7). In this step, we suppose that the finite dimensional problems (7) will be solved efficiently and accurately.

3.2 Procedure ESTIMATE

The crucial ingredient of the AWG method is the control of the error by the estimator, namely the so-called reliability. Here, we use a residual-type a posteriori error estimator similar to the one in [6]. Given a mesh \({\mathcal {T}}_h\), let \(\tau _1\) and \(\tau _2\) be two elements sharing a common edge or face e, and denote by \({\varvec{n}}_1\) and \({\varvec{n}}_2\) the unit normal vectors on e exterior to \(\tau _1\) and \(\tau _2\). In \({\mathbb {R}}^2\), the unit tangential vectors \({\varvec{t}}_1\) and \({\varvec{t}}_2\) are obtained by rotating \({\varvec{n}}_1\) and \({\varvec{n}}_2\) by 90 degrees counterclockwise, and we denote by \(\gamma _{t,\partial \tau _i}({\varvec{v}})={\varvec{v}}\cdot {\varvec{t}}_i\) the tangential trace in \(\tau _i\) of a vector function \({\varvec{v}}\). In \({\mathbb {R}}^3\), the tangential trace for \({\varvec{v}}\) in \(\tau _i\) is \(\gamma _{t,\partial \tau _i}({\varvec{v}})={\varvec{v}}\times {\varvec{n}}_i\) for \(i = 1, 2\). Then the normal jump across e is defined as \([{\varvec{w}} \cdot {\varvec{n}}]_e = {\varvec{w}}|_{\partial \tau _1} \cdot {\varvec{n}}_1 + {\varvec{w}}|_{\partial \tau _2} \cdot {\varvec{n}}_2\) and the tangential jump across e is defined as \([\gamma _t ({\varvec{w}})]_e = \gamma _{t, \partial \tau _1}({\varvec{w}})+ \gamma _{t, \partial \tau _2} ({\varvec{w}})\). For all \(v_h\in V_h\), we define

$$\begin{aligned} {\mathbf {J}}_e(A\nabla _w v_h)= & {} \left\{ \begin{array}{ll} [A\nabla _w v_h\cdot {\varvec{n}}]_e, &{}\quad \text{ if } \ \ e\in {\mathcal {E}}_h^0\\ 0, &{}\quad \text{ otherwise }, \end{array} \right. \\ {\mathbf {J}}_e(\gamma _t(\nabla _w v_h))= & {} \left\{ \begin{array}{ll} [\gamma _t(\nabla _w v_h)]_e, &{}\quad \text{ if }\ \ e \in {\mathcal {E}}_h^0\\ 2\gamma _t(\nabla _w v_h), &{}\quad \text{ otherwise }. \end{array} \right. \end{aligned}$$

For \(e\in {\mathcal {E}}_h^0\), denote by \(\omega _e = \tau _1 \cup \tau _2\) the macro-element associated with e, where \(\tau _1\) and \(\tau _2\) are the two elements in \({\mathcal {T}}_h\) sharing e as a common edge/face. Similarly, we define \(\omega _x = \{\tau ^{\prime }\in {\mathcal {T}}_h, x\in \tau ^{\prime }\}\) for a vertex x, and \(\omega _\tau = \{\tau ^{\prime }\in {\mathcal {T}}_h, \tau ^{\prime }\cap \tau \not = \varnothing \}\) for an element \(\tau \in {\mathcal {T}}_h\). For the piece-wise constant A, we use |A| to denote its absolute value. We use the notations \(A_\tau = A|_\tau \), \(|A^{\max }_e| = \max _{\tau \in \omega _e} |A_\tau |\), and \(|A^{\min }_e| = \min _{\tau \in \omega _e} |A_\tau |\).

Let \(f_h\) be the \(L^2\) projection of f to the discontinuous Galerkin space

$$\begin{aligned} S_h=\{w\in L^2(\Omega ): w|_\tau \in P_l(\tau ), \forall \tau \in {\mathcal {T}}_h\}. \end{aligned}$$
(8)

Then, for \(v_h\in V_h\) and \(\tau \in {\mathcal {T}}_h\), we define

$$\begin{aligned} \eta _{c}^2(v_{h}, \tau ) = h_\tau ^2|A_\tau |^{-1} \Vert f_h + \nabla \cdot (A\nabla _w v_h)\Vert _\tau ^2 + \frac{1}{2}\sum _{e\in \partial \tau }h_\tau |A_e^{\max }|^{-1}\int _e {\mathbf {J}}_e^2(A\nabla _w v_h), \end{aligned}$$
(9)
$$\begin{aligned} \eta _{m}^2(v_{h}, \tau ) = h_\tau ^2|A_\tau |\cdot \Vert \nabla \times \nabla _w v_h\Vert _\tau ^2 + \frac{1}{2}\sum _{e\in \partial \tau }h_\tau |A_e^{\min }|\int _e {\mathbf {J}}_e^2(\gamma _t(\nabla _wv_h)), \end{aligned}$$
(10)
$$\begin{aligned} \text{ osc}^2 (f, \tau ) = h_\tau ^2|A_\tau |^{-1}\Vert f - f_h\Vert _\tau ^2, \end{aligned}$$
(11)

and the element-wise error estimator

$$\begin{aligned} \eta ^2(v_h , \tau ) =\text{ osc}^2 (f, \tau ) + \eta _{c}^2(v_{h}, \tau ) + \eta _{m}^2(v_{h}, \tau ). \end{aligned}$$
(12)
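The data oscillation (11) is the simplest of these quantities to evaluate, and the following sketch computes it on one triangle. It assumes \(d = 2\), the lowest order \(l = 0\) (so that \(f_h\) is the mean of f on \(\tau \)), and the edge-midpoint quadrature rule, which is exact only for polynomials of degree at most two; the function name `osc2_p0` and its arguments are hypothetical. As in Sect. 2.2, \(h_\tau = |\tau |^{1/d}\), so \(h_\tau ^2 = |\tau |\) here.

```python
import numpy as np


def osc2_p0(vertices, f, a_tau):
    """Squared data oscillation h^2 |A|^{-1} ||f - f_h||^2 on one triangle.

    Illustrative sketch for l = 0: f_h is the mean of f on the triangle.
    Integrals use the edge-midpoint rule (exact up to degree two), so the
    result is exact for affine f and approximate otherwise.
    """
    verts = np.asarray(vertices, dtype=float)
    e1, e2 = verts[1] - verts[0], verts[2] - verts[0]
    area = 0.5 * abs(e1[0] * e2[1] - e1[1] * e2[0])
    mids = 0.5 * (verts + np.roll(verts, -1, axis=0))  # three edge midpoints
    vals = np.array([f(m[0], m[1]) for m in mids])
    f_h = vals.mean()                                  # P_0 projection of f
    l2_sq = area / 3.0 * np.sum((vals - f_h) ** 2)     # ||f - f_h||^2 on tau
    h_sq = area                                        # h_tau^2 = |tau|^{2/d}, d = 2
    return h_sq / abs(a_tau) * l2_sq


tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(osc2_p0(tri, lambda x, y: x, 2.0))  # f(x, y) = x, A = 2
```

For \(f(x,y) = x\) on the reference triangle one has \(\Vert f - f_h\Vert _\tau ^2 = 1/36\), so with \(A_\tau = 2\) the exact value is \(\tfrac{1}{2}\cdot \tfrac{1}{2}\cdot \tfrac{1}{36} = 1/144\).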

Remark 2

Note that \( \eta _{c}(v_{h}, \tau )\) is an analogue of the error estimator for the conforming finite element and \( \eta _{m}(v_{h}, \tau )\) is an analogue of the error estimator for the mixed finite element. The \(\text{ osc }(f, \tau )\) is an analogue of the data oscillation for conforming finite elements.

Remark 3

There is a slight difference between the error estimator given in (12) and the one introduced in [6]. For the mesh size in the jump terms, we use \(h_\tau \) instead of \(h_e\). Although \(h_\tau \) and \(h_e\) are comparable, the use of \(h_\tau \) is crucial for the reduction of the error estimator, as we can see from the proof of Lemma 10.

For any subset \({\mathcal {W}}_h\subset {\mathcal {T}}_h\) and \(v_h\in V_h\), we define

$$\begin{aligned} \eta ^2(v_h , {\mathcal {W}}_h) = \sum _{\tau \in {\mathcal {W}}_h}\eta ^2(v_h , \tau ), \ \ \text{ osc}^2(f, {\mathcal {W}}_h) = \sum _{\tau \in {\mathcal {W}}_h} \text{ osc}^2 (f, \tau ). \end{aligned}$$

3.3 Procedure MARK

In the selection of elements, we rely on the Dörfler marking [8]. Given a mesh \({\mathcal {T}}_k\), a set of indicators \(\{\eta ^2(u_k, \tau _k)\}_{\tau _k\in {\mathcal {T}}_k}\), and a marking parameter \(\theta \in (0, 1)\), we suppose that the procedure MARK outputs a subset of marked elements \({\mathcal {M}}_k \subset {\mathcal {T}}_k\) with minimal cardinality, such that

$$\begin{aligned} \eta ^2(u_k, {\mathcal {M}}_k)\ge \theta \eta ^2(u_k,{\mathcal {T}}_k ). \end{aligned}$$
(13)
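A common way to realize (13) is a greedy selection: sort the squared indicators in decreasing order and accumulate them until the bulk criterion holds, which yields a marked set of minimal cardinality. The sketch below illustrates this under the assumption that the indicators are stored in a dictionary keyed by element identifiers; the function name `dorfler_mark` is hypothetical.

```python
def dorfler_mark(eta2, theta):
    """Greedy Doerfler marking (illustrative sketch).

    eta2  : dict mapping an element identifier to its squared indicator.
    theta : marking parameter in (0, 1).
    Returns a list of marked elements whose indicators sum to at least
    theta times the total, using as few elements as possible.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in (0, 1)")
    total = sum(eta2.values())
    marked, acc = [], 0.0
    # Largest indicators first, so the bulk criterion is met with minimal cardinality.
    for elem, val in sorted(eta2.items(), key=lambda kv: kv[1], reverse=True):
        if acc >= theta * total:
            break
        marked.append(elem)
        acc += val
    return marked


indicators = {"a": 4.0, "b": 1.0, "c": 3.0, "d": 2.0}
print(dorfler_mark(indicators, 0.6))
```

Sorting costs \(O(N\log N)\) for N elements; this sketch favors clarity over the cheaper approximate sorting strategies sometimes used in practice.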

3.4 Procedure REFINE

Starting from an initial triangulation \({\mathcal {T}}_0\), we denote by

$$\begin{aligned} {\mathbb {L}}({\mathcal {T}}_0) = \{{\mathcal {T}}: {\mathcal {T}} \ \ \text{ is } \text{ conforming } \text{ and } \text{ refined } \text{ from } {\mathcal {T}}_0\}, \end{aligned}$$
(14)

and \({\mathcal {T}}_1 \leqslant {\mathcal {T}}_2\) if \({\mathcal {T}}_2\) is a refinement of \({\mathcal {T}}_1\).

For any \({\mathcal {T}}_k \in {\mathbb {L}}({\mathcal {T}}_0)\) and a subset \({\mathcal {M}}_k \subset {\mathcal {T}}_k\) of marked elements, we suppose that Procedure REFINE outputs a conforming triangulation \({\mathcal {T}}_{k+1} \in {\mathbb {L}}({\mathcal {T}}_0)\), i.e.,

$$\begin{aligned} {\mathcal {T}}_{k+1} = \text{ REFINE }({\mathcal {T}}_k, {\mathcal {M}}_k). \end{aligned}$$

To generate \({\mathcal {T}}_{k+1} \), we first subdivide the marked elements in \({\mathcal {M}}_{k} \) to get a new triangulation \({\mathcal {T}}_{k} ^{\prime }\). In general, \({\mathcal {T}}_{k}^{\prime }\) might have hanging nodes; therefore, we have to refine additional elements in \({\mathcal {T}}_k\backslash {\mathcal {M}}_k\) to obtain a conforming triangulation \({\mathcal {T}}_{k+1} \). Throughout this paper, we assume that the local refinement preserves shape regularity, i.e., that the triangulations in \({\mathbb {L}}({\mathcal {T}}_0)\) are shape regular.

4 Convergence of the AWG Method

In this section, we begin with a quasi-orthogonality result. Then, we recall the upper bound of the a posteriori error estimator (see [6]). Moreover, we present the reduction of \(\text{ osc}^2(f, {\mathcal {T}}_{h})\) and \(\eta _1^2(v_h, {\mathcal {T}}_h)= \sum _{\tau \in {\mathcal {T}}_h} (\eta _{c}^2(v_h, \tau ) + \eta _{m}^2(v_h, \tau ))\), respectively. Finally, we prove that the sum of the energy error and the error estimator, between two consecutive adaptive loops, is a contraction and that the adaptive algorithm will terminate in a finite number of steps for a given tolerance.

4.1 Quasi-Orthogonality

The standard convergence analysis of adaptive Galerkin methods is based on the orthogonality or quasi-orthogonality of the error in different finite element spaces. For the case of mixed methods in particular, we refer to [5, 9]. However, the quasi-orthogonality of WG methods is more complicated than that of mixed element methods.

First, for \(\tau \in {\mathcal {T}}_h\), we denote the \(L^2\) projection onto \(W_l(\tau )\) by \(Q_\tau \cdot = \{Q_0^\tau \cdot , Q_b^\tau \cdot \}\) and the \(L^2\) projection onto \({\mathbf {G}}_l(\tau )\) by \({\mathbb {Q}}_\tau \). The next lemma presents the conservation property of the WG approximation.

Lemma 1

Let u be the solution of (1)–(2) and \(u_h = \{u_0^h, u_b^h\} \in V_h^0\) be the solution of (7). Then we have \(A\nabla _wu_h \in {\mathbf {H}}(\mathrm {div}, \Omega )\) and

$$\begin{aligned} -\nabla \cdot ( A\nabla _wu_h) = f_h, \end{aligned}$$
(15)

where \(f_h\) is the \(L^2\) projection of f to the space \(S_h\).

Proof

The proof of Lemma 1 is similar to that of Lemma 3.3 in [6]. Notice that the coefficient A is piece-wise constant.

Let \(v = \{0,v_b\}\in V_h^0\). Using (6) with \({\varvec{q}} = A\nabla _w u_h\), we have

$$\begin{aligned} a(u_h, v)= & {} (A\nabla _w u_h, \nabla _w v) \\= & {} \sum _{\tau \in {\mathcal {T}}_h} (A\nabla _w u_h, \nabla _w v)_{\tau } \\= & {} \sum _{\tau \in {\mathcal {T}}_h} \left( -(v_0, \nabla \cdot (A\nabla _w u_h))_{\tau } + \langle v_b, (A\nabla _w u_h)\cdot {\mathbf {n}}\rangle _{\partial \tau }\right) \\= & {} \sum _{\tau \in {\mathcal {T}}_h} \langle v_b, (A\nabla _w u_h)\cdot {\mathbf {n}}\rangle _{\partial \tau } \\= & {} \sum _{e\in {\mathcal {E}}_h^{0}} \langle v_b, {\varvec{J}}_{e}((A\nabla _w u_h)\cdot {\mathbf {n}})\rangle _{e}, \end{aligned}$$

and using (7) gives

$$\begin{aligned} a(u_h, v) =(f,v_0)=(f, 0)=0, \end{aligned}$$

so that

$$\begin{aligned} \sum _{e\in {\mathcal {E}}_h^{0}} \langle v_b, {\varvec{J}}_{e}((A\nabla _w u_h)\cdot {\mathbf {n}})\rangle _{e}=0. \end{aligned}$$

Choosing \(v_b|_{e}={\varvec{J}}_{e}((A\nabla _w u_h)\cdot {\mathbf {n}})\) on each interior edge/face e, we obtain

$$\begin{aligned} {\varvec{J}}_{e}((A\nabla _w u_h)\cdot {\mathbf {n}})=0, \quad \end{aligned}$$

so that \((A\nabla _w u_h)\cdot {\mathbf {n}}\) is continuous across every interior edge/face. Therefore, \(A\nabla _wu_h \in \mathbf {H}(\mathrm {div},\Omega )\).

When \(v = \{v_0, 0\}\), we get

$$\begin{aligned} a(u_h, v)= & {} \sum _{\tau \in {\mathcal {T}}_h}\left( -(v_0, \nabla \cdot (A\nabla _w u_h))_{\tau } + \langle v_b, (A\nabla _w u_h)\cdot {\mathbf {n}} \rangle _{\partial \tau }\right) \\= & {} -\sum _{\tau \in {\mathcal {T}}_h}(v_0, \nabla \cdot (A\nabla _w u_h))_{\tau }\\= & {} (f, v_0)=(f_h, v_0), \end{aligned}$$

which implies

$$\begin{aligned} -\nabla \cdot (A\nabla _wu_h) = f_h. \end{aligned}$$

\(\square \)

For two nested triangulations \({\mathcal {T}}_h, {\mathcal {T}}_{h_*}\in {\mathbb {L}}({\mathcal {T}}_0)\) with \({\mathcal {T}}_h\leqslant {\mathcal {T}}_{h_*}\), in order to prove the quasi-orthogonality of WG methods, we also introduce an intermediate solution \({\tilde{u}}_{h_*} = \{{\tilde{u}}_0^{h_*}, {\tilde{u}}_b^{h_*}\}\in V_{h_*}^0\) satisfying the following equation,

$$\begin{aligned} (A\nabla _w {\tilde{u}}_{h_*}, \nabla _w v_{h_*})_{{\mathcal {T}}_{h_*}} = (f_h, v^{h_*}_0)\qquad \forall v_{h_*}= \{v^{h_*}_0, v^{h_*}_b\}\in V_{h_*}^0. \end{aligned}$$
(16)

The following lemma presents the property of the intermediate solution \({\tilde{u}}_{h_*}\).

Lemma 2

Given an \(f\in L^2(\Omega )\) and two nested triangulations \({\mathcal {T}}_h, {\mathcal {T}}_{h_*}\in {\mathbb {L}}({\mathcal {T}}_0)\) with \({\mathcal {T}}_h\leqslant {\mathcal {T}}_{h_*}\), let u be the solution of (1)–(2), \(u_h = \{u_0^h, u_b^h\} \in V_h^0\) and \(u_{h_*}= \{u_0^{h_*}, u_b^{h_*}\} \in V_{h_*}^0\) be the corresponding WG solutions of (7), \({\tilde{u}}_{h_*} = \{{\tilde{u}}_0^{h_*}, {\tilde{u}}_b^{h_*}\}\in V_{h_*}^0\) be the solution of (16). Then

$$\begin{aligned} \sum _{{\tau _*}\in {\mathcal {T}}_{h_*}} (\nabla u-\nabla _{w, \tau _{*}}u_{h_*}, A (\nabla _{w, \tau _{*}}{\tilde{u}}_{h_*} - \nabla _{w,\tau }u_{h}))_{\tau _{*}} = 0, \end{aligned}$$
(17)

where \(\tau \in {\mathcal {T}}_h, \tau _{*}\in {\mathcal {T}}_{h_*}\) and \(\tau _{*}\subseteq \tau \).

Proof

The main idea follows from [5].

For all \(\tau _*\in {\mathcal {T}}_{h_*}\), let \(Q_{\tau _*}\cdot = \{Q_0^{\tau _*}\cdot , Q_b^{\tau _*}\cdot \}\) and \({\mathbb {Q}}_{\tau _*}\) be the \(L^2\) projections onto \(W_l(\tau _*)\) and \({\mathbf {G}}_l(\tau _*)\), respectively. Comparing the right-hand sides of (7) and (16), arguing as in the proof of Lemma 1, and noting that the projection from \(V_h\) to \(V_h\) is the identity operator, we obtain \(A\nabla _w{\tilde{u}}_{h_*} \in {\mathbf {H}}(\mathrm {div}, \Omega )\) and

$$\begin{aligned} -\nabla \cdot (A\nabla _{w, \tau _*}{\tilde{u}}_{h_*}) = f_h. \end{aligned}$$
(18)

For all \(\tau _*\in {\mathcal {T}}_{h_*}\), we have \(\nabla _{w, \tau _*} (Q_{\tau _*}v)= {\mathbb {Q}}_{\tau _*}\nabla v, \forall v\in H^{1}(\tau _*)\).

Lemma 1 implies \(A\nabla _w u_{h} \in {\mathbf {H}}(\mathrm {div}, \Omega )\). Moreover, since \(u\in H_0^1(\Omega )\) and \(u_{h_*}= \{u_0^{h_*}, u_b^{h_*}\} \in V_{h_*}^0\), both \(Q_b^{\tau _*} u\) and \(u_b^{h_*}\) have a single value on each edge \(e\in {\mathcal {E}}_{*}^0\), and \(Q_b^{\tau _*} u|_{\partial \Omega } = u_b^{h_*}|_{\partial \Omega } =0\). Combining these facts with (15) and (18), we have

$$\begin{aligned}&{\sum _{{\tau _{*} }\in {\mathcal {T}}_{h_*} } \left( \nabla u- \nabla _{w,\tau _{*}}u_{h_*}, A(\nabla _{w,\tau _{*}} {\tilde{u}}_{h_*} - \nabla _{w,\tau }u_{h})\right) _{\tau _{*}}} \nonumber \\&\quad =\sum _{{\tau _{*} }\in {\mathcal {T}}_{h_*} } \left( {\mathbb {Q}}_{\tau _*} (\nabla u-\nabla _{w,\tau _{*}} u_{h_*}), A(\nabla _{w,\tau _{*}} {\tilde{u}}_{h_*} - \nabla _{w,\tau }u_{h})\right) _{\tau _{*}} \nonumber \\&\quad =\sum _{{\tau _{*} }\in {\mathcal {T}}_{h_*} } \left( {\mathbb {Q}}_{\tau _*}\nabla u-\nabla _{w,\tau _{*}} u_{h_*}, A(\nabla _{w,\tau _{*}} {\tilde{u}}_{h_*} - \nabla _{w,\tau }u_{h})\right) _{\tau _{*}} \nonumber \\&\quad = \sum _{{\tau _{*}}\in {\mathcal {T}}_{h_*} } \left( \nabla _{w,\tau _{*}} (Q_{\tau _*}u-u_{h_*}), A\nabla _{w,\tau _{*}} {\tilde{u}}_{h_*}\right) _{\tau _{*}} \nonumber \\&\qquad - \sum _{{\tau _{*}}\in {\mathcal {T}}_{h_*}} \left( \nabla _{w,\tau _{*}} (Q_{\tau _*}u-u_{h_*}), A\nabla _{w,\tau } u_{h}\right) _{\tau _{*}} \nonumber \\&\quad =- \sum _{{\tau _{*}}\in {\mathcal {T}}_{h_*} } \left( Q_0^{\tau _*} u-u_0^{h_*}, \nabla \cdot (A\nabla _{w,\tau _{*}} {\tilde{u}}_{h_*})\right) _{\tau _{*}} \nonumber \\&\qquad + \sum _{{\tau _{*}}\in {\mathcal {T}}_{h_*} }\langle Q_b^{\tau _*} u-u_b^{h_*},(A\nabla _{w,\tau _{*}} {\tilde{u}}_{h_*}) \cdot {\varvec{n}}\rangle _{\partial \tau _{*}} \nonumber \\&\qquad + \sum _{{\tau _{*}}\in {\mathcal {T}}_{h_*} } \left( Q_0^{\tau _*} u-u_0^{h_*}, \nabla \cdot (A\nabla _{w,\tau } u_h)\right) _{\tau _{*}} \nonumber \\&\qquad - \sum _{{\tau _{*}}\in {\mathcal {T}}_{h_*} }\langle Q_b^{\tau _*} u-u_b^{h_*}, (A\nabla _{w,\tau } u_{h})\cdot {\varvec{n}}\rangle _{\partial \tau _{*}} \nonumber \\&\quad =\sum _{{\tau _{*}}\in {\mathcal {T}}_{h_*} } (Q_0^{\tau _*} u-u_0^{h_*}, f_h - f_h)_{\tau _{*}} = 0. \end{aligned}$$
(19)

\(\square \)

The following lemma reveals the relationship between \({\tilde{u}}_0^{h_*} - u_0^{h_*}\) and \(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*}\).

Lemma 3

Let \(u_{h_*}= \{u_0^{h_*}, u_b^{h_*}\} \in V_{h_*}^0\) and \({\tilde{u}}_{h_*}= \{{\tilde{u}}_0^{h_*}, {\tilde{u}}_b^{h_*}\}\in V_{h_*}^0\) be the WG solutions of (7) and (16), respectively. Assume that problem (1)–(2) has the \(H^{1+s}\) regularity with \(s \in (0, 1]\). Then, we have

$$\begin{aligned} \Vert {\tilde{u}}_0^{h_*} - u_0^{h_*}\Vert _{{\mathcal {T}}_{h_*}}\lesssim h_{\tau _*}^s\Vert \nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*}\Vert _{{\mathcal {T}}_{h_*}}, \end{aligned}$$
(20)

where the constant only depends on the shape regularity of \({\mathcal {T}}_{h_*}\) and coefficient A.

Proof

Here, we adapt the technique from [27].

Let \(w\in H^1(\Omega )\) solve the following auxiliary problem

$$\begin{aligned} \left\{ \begin{array}{l} -\nabla \cdot (A\nabla w) = {\tilde{u}}_0^{h_*} - u_0^{h_*} \quad \text{ in } \ \Omega , \\ w = 0 \quad \text{ on } \ \partial \Omega . \end{array} \right. \end{aligned}$$
(21)

Then the assumption of \(H^{1+s}\) regularity implies that \(w \in H^{1+s}(\Omega )\) such that

$$\begin{aligned} \Vert w\Vert _{1+s}\lesssim \Vert {\tilde{u}}_0^{h_*} - u_0^{h_*}\Vert _{{\mathcal {T}}_{h_*}}. \end{aligned}$$
(22)

We choose the projection \(\Pi _h\) introduced in [2] satisfying the following two properties

$$\begin{aligned}&\sum _{\tau \in {\mathcal {T}}_h}(-\nabla \cdot {\varvec{q}}, v_0)_\tau = \sum _{\tau \in {\mathcal {T}}_h}(\Pi _h {\varvec{q}}, \nabla _w v)_\tau \ \quad \forall {\varvec{q}}\in {\mathbf {H}}(\mathrm {div}, \Omega ), \forall v = \{v_0, v_b\}\in V_h, \end{aligned}$$
(23)
$$\begin{aligned}&\Vert \Pi _h(A\nabla u) - A\nabla _w(Q_\tau u)\Vert \lesssim h^s\Vert u\Vert _{1+s}\qquad \forall u\in H^{1+s}(\Omega ), s>0. \end{aligned}$$
(24)

Formulas (23) and (24) can be found in Lemmas 7.2 and 7.3 of [27], respectively.

Using the variational problem of (21) with the test function \({\tilde{u}}_0^{h_*} - u_0^{h_*}\), (23) and (24), we have

$$\begin{aligned} \Vert \tilde{u}_0^{h_*} - u_0^{h_*}\Vert ^2_{\mathcal {T}_{h_*}}= & {} \sum _{\tau _*\in \mathcal {T}_{h_*}}(-\nabla \cdot (A\nabla w), \tilde{u}_0^{h_*} - u_0^{h_*})_{\tau _*} \nonumber \\= & {} \sum _{\tau _*\in \mathcal {T}_{h_*}}(\Pi _{h_*}(A\nabla w), \nabla _w \tilde{u}_{h_*} - \nabla _w u_{h_*})_{\tau _*} \nonumber \\= & {} \left( \Pi _{h_*}(A\nabla w) - A\nabla _w(Q_{h_*} w), \nabla _w \tilde{u}_{h_*} - \nabla _w u_{h_*}\right) _{\mathcal {T}_{h_*}}\nonumber \\\lesssim & {} h_{\tau _*}^s\Vert w\Vert _{1+s}\Vert \nabla _w \tilde{u}_{h_*} - \nabla _w u_{h_*}\Vert _{{\mathcal {T}}_{h_*}}, \end{aligned}$$
(25)

where the constant depends only on the shape regularity of \({\mathcal {T}}_{h_*}\) and the coefficient A. In the last equality, we also used the following identity

$$\begin{aligned} (A\nabla _w(Q_{h_*} w), \nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*})_{{\mathcal {T}}_{h_*}} = (f_{h} - f_{h_*}, Q_0^{h_*}w)_{{\mathcal {T}}_{h_*}} = 0. \end{aligned}$$

Substituting (22) into (25), we arrive at

$$\begin{aligned} \Vert {\tilde{u}}_0^{h_*} - u_0^{h_*}\Vert _{{\mathcal {T}}_{h_*}}\lesssim h_{\tau _*}^s\Vert \nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*}\Vert _{{\mathcal {T}}_{h_*}}, \end{aligned}$$

which completes the proof. \(\square \)

Now we define \({\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}\) as the set of elements of \({\mathcal {T}}_{h}\) that are refined in passing from \({\mathcal {T}}_{h}\) to \({\mathcal {T}}_{h_*}\), and \(\overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\) as the set of new elements of \({\mathcal {T}}_{h_*}\) obtained by refining the elements of \({\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}\). Obviously, \({\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}} = {\mathcal {T}}_{h}\backslash {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}\) is the set of unchanged elements.

Lemma 4

For \({\mathcal {T}}_{h}, {\mathcal {T}}_{h_*}\in {\mathbb {L}}({\mathcal {T}}_0)\) with \({\mathcal {T}}_{h}\leqslant {\mathcal {T}}_{h_*}\), we have

$$\begin{aligned} \Vert f_{h_*} - f_h\Vert _{\tau _{*}} \left\{ \begin{array}{l} = 0, \quad \forall \tau _{*}\in {\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}, \\ \leqslant \Vert f - f_h\Vert _{\tau _{*}}, \quad \forall \tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}. \end{array} \right. \end{aligned}$$

Proof

Notice that the functions \(f_{h}\) and \(f_{h_*}\) are the \(L^2\) projections of f to the spaces \(S_{h}\) and \(S_{h_*}\), respectively. For any \(\tau _{*}\in {\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\), the element \(\tau _*\) belongs to both meshes, so the two projections coincide on it and \(\Vert f_{h_*} - f_h\Vert _{\tau _{*}} = 0\). For any \(\tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\), the definition of the \(L^2\) projection gives \( (f - f_{h_*}, v_{h_*})_{\tau _*} = 0, \forall v_{h_*}\in V_{h_*}. \) In particular, let

$$\begin{aligned} v_{h_*} = \left\{ \begin{aligned}&f_h-f_{h_*},&\text{ on }\ \tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}, \\&0,&\text{ otherwise }. \end{aligned} \right. \end{aligned}$$

Then

$$\begin{aligned} (f - f_{h_*}, f_h-f_{h_*})_{\tau _{*}} = 0. \end{aligned}$$
(26)

According to (26) and the Cauchy–Schwarz inequality, we get

$$\begin{aligned} \Vert f_{h_*} -f_h\Vert _{\tau _{*}}^2= & {} (f_{h_*} -f_h, f_{h_*} -f_h)_{\tau _{*}} \\= & {} (f_{h_*} -f, f_{h_*} -f_h)_{\tau _{*}} + (f -f_{h}, f_{h_*} -f_h)_{\tau _{*}} \\= & {} (f -f_h, f_{h_*} -f_h)_{\tau _{*}}\leqslant \Vert f -f_h\Vert _{\tau _{*}}\Vert f_{h_*} -f_h\Vert _{\tau _{*}}. \end{aligned}$$

Canceling one factor \(\Vert f_{h_*} -f_h\Vert _{\tau _{*}}\), we get \(\Vert f_{h_*} -f_h\Vert _{\tau _{*}}\leqslant \Vert f -f_h\Vert _{\tau _{*}}\). \(\square \)
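As an informal numerical illustration of Lemma 4 (not part of the original analysis), the following sketch checks the inequality in a deliberately simplified setting: a one-dimensional cell, piecewise constant projections and midpoint quadrature. The helper names `mean`, `l2_norm` and `lemma4_check` are hypothetical and introduced only for this example.

```python
import math

def nodes(a, b, n):
    """Midpoints of n equal subintervals of (a, b) and the subinterval width."""
    w = (b - a) / n
    return [a + (k + 0.5) * w for k in range(n)], w

def mean(f, a, b, n):
    """L2 projection of f onto constants on (a, b), i.e. the quadrature mean."""
    xs, w = nodes(a, b, n)
    return sum(f(x) for x in xs) * w / (b - a)

def l2_norm(g, a, b, n):
    """Midpoint-rule approximation of the L2 norm of g on (a, b)."""
    xs, w = nodes(a, b, n)
    return math.sqrt(sum(g(x) ** 2 for x in xs) * w)

def lemma4_check(f, parent, n=200):
    """Bisect the parent cell and return, for each child, the pair
    (||f_{h*} - f_h||, ||f - f_h||) measured on that child."""
    a, b = parent
    m = 0.5 * (a + b)
    # 2n nodes on the parent coincide with n nodes on each child.
    f_h = mean(f, a, b, 2 * n)
    pairs = []
    for c, d in [(a, m), (m, b)]:
        f_hs = mean(f, c, d, n)
        lhs = l2_norm(lambda x: f_hs - f_h, c, d, n)
        rhs = l2_norm(lambda x: f(x) - f_h, c, d, n)
        pairs.append((lhs, rhs))
    return pairs

for lhs, rhs in lemma4_check(math.exp, (0.0, 1.0)):
    print(lhs <= rhs + 1e-12)
```

Because the parent and child quadrature nodes coincide, the discrete orthogonality behind (26) holds up to rounding, so the comparison is not affected by quadrature error.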

In the rest of this subsection, we will prove the following discrete result, and use it to derive the quasi-orthogonality.

Lemma 5

Given an \(f\in L^2(\Omega )\) and two triangulations \({\mathcal {T}}_h, {\mathcal {T}}_{h_*}\in {\mathbb {L}}({\mathcal {T}}_0)\) with \({\mathcal {T}}_h\leqslant {\mathcal {T}}_{h_*}\), let u be the solution of (1)–(2), \(u_h = \{u_0^h, u_b^h\} \in V_h^0\) and \(u_{h_*}= \{u_0^{h_*}, u_b^{h_*}\} \in V_{h_*}^0\) be the corresponding WG solutions of (7), \({\tilde{u}}_{h_*}=\{{\tilde{u}}_0^{h_*}, {\tilde{u}}_b^{h_*}\}\in V_{h_*}^0\) be the solution of the variational problem (16). Then there exists a constant \(C_0\) which depends only on the shape regularity of \({\mathcal {T}}_{h_*}\), satisfying

$$\begin{aligned} \Vert A^{1/2}(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*})\Vert _{{\mathcal {T}}_{h_*}} \leqslant \sqrt{C_0}\text{ osc }(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}). \end{aligned}$$

Proof

Applying (7) and (16), for any \(v_{h_*}= \{v^{h_*}_0, v^{h_*}_b\}\in V_{h_*}^0\), we have

$$\begin{aligned} (A(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*}), \nabla _w v_{h_*})_{{\mathcal {T}}_{h_*}} = (f_h -f_{h_*}, v^{h_*}_0). \end{aligned}$$
(27)

Noting that \({\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}} = {\mathcal {T}}_{h}\backslash {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}\) is the set of unchanged elements, choosing \(v_{h_*} = {\tilde{u}}_{h_*} - u_{h_*}\in V_{h_*}^0\) in (27), and using the property of the \(L^{2}\) projection, the Hölder inequality, (20) and the Cauchy–Schwarz inequality, we arrive at

$$\begin{aligned}&{\Vert A^{1/2}(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*})\Vert _{{\mathcal {T}}_{h_*}}^2} \\&\quad = (f_h -f_{h_*}, {\tilde{u}}_0^{h_*} - u_0^{h_*})_{{\mathcal {T}}_{h_*}} = \sum _{\tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}}(f_h - f, {\tilde{u}}_0^{h_*} - u_0^{h_*})_{\tau _{*}}\\&\quad \leqslant \sum _{\tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}} \Vert f - f_h\Vert _{\tau _{*}} \cdot \Vert {\tilde{u}}_0^{h_*} - u_0^{h_*}\Vert _{\tau _{*}} \\&\quad \lesssim \sum _{\tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}} \Vert f - f_h\Vert _{\tau _{*}} \cdot h_{\tau _{*}}\Vert \nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*}\Vert _{\tau _{*}}\\&\quad \lesssim \sum _{\tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}} |A|^{-\frac{1}{2}}\Vert f - f_h\Vert _{\tau _{*}} \cdot h_{\tau _{*}}\Vert A^{1/2}(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*})\Vert _{\tau _{*}} \\&\quad \lesssim \left( \sum _{\tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}} h_{\tau _{*}}^2|A|^{-1}\Vert f - f_h\Vert _{\tau _{*}}^2\right) ^{1/2}\left( \sum _{\tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}} \Vert A^{1/2}(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*})\Vert _{\tau _{*}}^2\right) ^{1/2} \\&\quad \lesssim \left( \sum _{\tau \in {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}} h_{\tau }^2|A|^{-1}\Vert f - f_h\Vert _{\tau }^2\right) ^{1/2}\left( \sum _{\tau _{*}\in {\mathcal {T}}_{h_*}} \Vert A^{1/2}(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*})\Vert _{\tau _{*}}^2\right) ^{1/2} \\&\quad \lesssim \text{ osc }(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}})\cdot \Vert A^{1/2}(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*})\Vert _{{\mathcal {T}}_{h_*}}, 
\end{aligned}$$

where the constants depend only on the shape regularity of \({\mathcal {T}}_{h_*}\). Finally, canceling one factor \(\Vert A^{1/2}(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*})\Vert _{{\mathcal {T}}_{h_*}}\), we conclude that there exists a constant \(C_0\) such that

$$\begin{aligned} \Vert A^{1/2}(\nabla _w {\tilde{u}}_{h_*} - \nabla _w u_{h_*})\Vert _{{\mathcal {T}}_{h_*}} \leqslant \sqrt{C_0}\text{ osc }(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}). \end{aligned}$$

\(\square \)

Now, we use Lemmas 2 and 5 to derive a quasi-orthogonality result.

Lemma 6

Given an \(f\in L^2(\Omega )\) and two triangulations \({\mathcal {T}}_h, {\mathcal {T}}_{h_*}\in {\mathbb {L}}({\mathcal {T}}_0)\) defined in (14) with \({\mathcal {T}}_h\leqslant {\mathcal {T}}_{h_*}\), let u be the solution of (1)–(2), \(u_h = \{u_0^h, u_b^h\} \in V_h^0\) and \(u_{h_*}= \{u_0^{h_*}, u_b^{h_*}\} \in V_{h_*}^0\) be the corresponding WG solutions of (7). Then for any \(\epsilon \in (0, 1)\), we have

$$\begin{aligned} (1 - \epsilon )\Vert A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*})\Vert _{{\mathcal {T}}_{h_*}}^2\leqslant & {} \Vert A^{1/2}(\nabla u - \nabla _{w, \tau }u_h)\Vert _{{\mathcal {T}}_h}^2 \nonumber \\&\quad - \Vert A^{1/2}(\nabla _{w, \tau _*} u_{h_*}- \nabla _{w, \tau } u_h) \Vert _{{\mathcal {T}}_{h_*}}^2+ \frac{C_0}{\epsilon }\text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}),\nonumber \\ \end{aligned}$$
(28)

where the constant \(C_{0}\) is given in Lemma 5, \(\tau \in {\mathcal {T}}_h, \tau _{*}\in {\mathcal {T}}_{h_*}\) and \(\tau _{*}\subseteq \tau \).

Proof

First, making use of Lemma 2, Cauchy–Schwarz inequality and Lemma 5, we obtain

$$\begin{aligned}&{ (A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*}), A^{1/2}(\nabla _{w, \tau } u_h - \nabla _{w, \tau _*} u_{h_*}))_{{\mathcal {T}}_{h_*}}} \nonumber \\&\quad = (A(\nabla u - \nabla _{w, \tau _*}u_{h_*}), \nabla _{w, \tau } u_h - \nabla _{w, \tau _*}{\tilde{u}}_{h_*})_{{\mathcal {T}}_{h_*}} \nonumber \\&\qquad + (A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*}), A^{1/2}(\nabla _{w, \tau _*} {\tilde{u}}_{h_*} - \nabla _{w, \tau _*} u_{h_*}))_{{\mathcal {T}}_{h_*}} \nonumber \\&\quad =(A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*}), A^{1/2}(\nabla _{w, \tau _*} {\tilde{u}}_{h_*} - \nabla _{w, \tau _*} u_{h_*}))_{{\mathcal {T}}_{h_*}} \nonumber \\&\quad \leqslant \Vert A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*})\Vert _{{\mathcal {T}}_{h_*}} \Vert A^{1/2}(\nabla _{w, \tau _*} {\tilde{u}}_{h_*} - \nabla _{w, \tau _*} u_{h_*})\Vert _{{\mathcal {T}}_{h_*}} \nonumber \\&\quad \leqslant \sqrt{C_0}\Vert A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*})\Vert _{{\mathcal {T}}_{h_*}}\text{ osc }(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}). \end{aligned}$$
(29)

For any \(\epsilon >0\), using the inequality \( 2ab\leqslant \epsilon a^2 + \dfrac{1}{\epsilon } b^2 \) and (29), we have

$$\begin{aligned}&{ \Vert A^{1/2}(\nabla u - \nabla _{w, \tau }u_h)\Vert _{{\mathcal {T}}_h}^2 } \\&\quad = \Vert A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*})\Vert _{{\mathcal {T}}_{h_*}}^2 + \Vert A^{1/2}(\nabla _{w, \tau _*} u_{h_*} - \nabla _{w, \tau }u_h)\Vert _{{\mathcal {T}}_{h_*} }^2 \\&\qquad - 2\left( A^{1/2}(\nabla u -\nabla _{w, \tau _*}u_{h_*}), A^{1/2}(\nabla _{w, \tau } u_h - \nabla _{w, \tau _*} u_{h_*})\right) _{{\mathcal {T}}_{h_*}}\\&\quad \geqslant \Vert A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*})\Vert _{{\mathcal {T}}_{h_*} }^2 + \Vert A^{1/2}(\nabla _{w, \tau _*} u_{h_*} - \nabla _{w, \tau }u_h)\Vert _{{\mathcal {T}}_{h_*} }^2 \\&\qquad -2\Vert A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*})\Vert _{{\mathcal {T}}_{h_*}} \sqrt{C_0}\text{ osc }(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}) \\&\quad \geqslant (1 - \epsilon )\Vert A^{1/2}(\nabla u - \nabla _{w, \tau _*}u_{h_*})\Vert _{{\mathcal {T}}_{h_*}}^2 + \Vert A^{1/2}(\nabla _{w, \tau _*} u_{h_*} - \nabla _{w, \tau } u_h)\Vert _{{\mathcal {T}}_{h_*} }^2 \\&\qquad -\frac{C_0}{\epsilon }\text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}). \end{aligned}$$

This completes the proof. \(\square \)
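The algebra behind (28) can be checked on plain vectors. The sketch below is only an illustration under stated assumptions: `e_star` stands in for \(A^{1/2}(\nabla u - \nabla _{w}u_{h_*})\), `d` for \(A^{1/2}(\nabla _{w} u_{h_*} - \nabla _{w} u_h)\), and the number `c` plays the role of \(\sqrt{C_0}\,\text{osc}\) in (29). The function name `quasi_orthogonality_sides` is hypothetical.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def quasi_orthogonality_sides(e_star, d, eps):
    """Return (lhs, rhs) of the vector analogue of (28).

    The coarse error is e_h = e_star + d. The value c is the smallest
    nonnegative number with (e_star, -d) <= ||e_star|| * c, mimicking (29).
    Assumes e_star is not the zero vector.
    """
    e_h = [a + b for a, b in zip(e_star, d)]
    norm_star = math.sqrt(dot(e_star, e_star))
    c = max(0.0, -dot(e_star, d) / norm_star)
    lhs = (1.0 - eps) * dot(e_star, e_star)
    rhs = dot(e_h, e_h) - dot(d, d) + c * c / eps
    return lhs, rhs

random.seed(0)
for _ in range(1000):
    e = [random.gauss(0, 1) for _ in range(6)]
    d = [random.gauss(0, 1) for _ in range(6)]
    eps = random.uniform(0.05, 0.95)
    lhs, rhs = quasi_orthogonality_sides(e, d, eps)
    assert lhs <= rhs + 1e-9
```

The gap `rhs - lhs` equals \((\sqrt{\epsilon }\,\Vert e_*\Vert - c/\sqrt{\epsilon })^2\) whenever the cross term is negative, which is exactly the Young inequality used in the proof.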

4.2 Residual Type Error Estimate: Upper Bound

In this subsection, we recall the upper bound, which is essential for proving the convergence of the adaptive WG methods.

Lemma 7

(Theorem 4.4 in [6]) Let u be the solution of (1)–(2) and \(u_h = \{u_0^h, u^h_b\}\in V_h^0\) be the solution of (7). Then, there exists a positive constant \(C_1\) depending on the shape regularity of \({\mathcal {T}}_h\) and coefficient A, such that

$$\begin{aligned} \Vert A^{1/2}(\nabla u - \nabla _wu_h)\Vert _{{\mathcal {T}}_h}\leqslant C_1\eta (u_h, {\mathcal {T}}_h). \end{aligned}$$
(30)

Remark 4

Although the error estimator in the above inequality differs from the one introduced in [6], the two estimators bound each other up to constants; see Remark 3.

4.3 Contraction of the Error Estimator

In this subsection, we establish the contraction of the error estimator. To this end, we split the error estimator \(\eta (v_h, {\mathcal {T}}_h)\) into two parts, \(\text{ osc}^2(f, {\mathcal {T}}_{h})\) and \(\eta _1^2(v_h, {\mathcal {T}}_h)\), and prove the reduction of each part separately.

First, we prove the reduction of the oscillation \(\text{ osc}^2(f, {\mathcal {T}}_{h})\).

Lemma 8

For \({\mathcal {T}}_{h}, {\mathcal {T}}_{h_*}\in {\mathbb {L}}({\mathcal {T}}_0)\) with \({\mathcal {T}}_{h}\leqslant {\mathcal {T}}_{h_*}\), let \(\lambda := 1- \mu \in (0, 1)\), where \(\mu : = 2^{-1/d}\in (0, 1)\). We have

$$\begin{aligned} \text{ osc}^2(f, {\mathcal {T}}_{h_*})\leqslant \text{ osc}^2(f, {\mathcal {T}}_{h}) - \lambda \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}). \end{aligned}$$
(31)

Proof

For all \(\tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\), applying (26) and the Cauchy–Schwarz inequality, we arrive at

$$\begin{aligned} \Vert f - f_{h_*}\Vert _{\tau _{*}}^2= & {} (f - f_{h_*}, f - f_{h_*})_{\tau _{*}} \\= & {} (f - f_{h_*}, f -f_h)_{\tau _{*}} \\\leqslant & {} \Vert f -f_{h_*}\Vert _{\tau _{*}}\Vert f -f_h\Vert _{\tau _{*}}, \end{aligned}$$

which implies

$$\begin{aligned} \Vert f - f_{h_*}\Vert _{\tau _{*}}\leqslant \Vert f -f_h\Vert _{\tau _{*}}. \end{aligned}$$
(32)

For each \(\tau \in {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}\), suppose that \(\tau \) is bisected into \(\tau _*^1, \tau _*^2 \in {\mathcal {T}}_{h_*}\). Then \(h_{\tau _*^1}^d =|\tau _*^1|=|\tau _*^2|= h_{\tau _*^2}^d = \dfrac{1}{2}|\tau | = \dfrac{1}{2}h_{\tau }^d\) \((d = 2, 3)\), which together with (11) and (32) yields

$$\begin{aligned}&\text{ osc}^2(f, \tau _*^1) + \text{ osc}^2(f, \tau _*^2)\nonumber \\&\quad = h_{\tau _*^1}^2|A_{\tau _*^1}|^{-1}\Vert f -f_{h_*}\Vert _{\tau _*^1}^2 + h_{\tau _*^2}^2|A_{\tau _*^2}|^{-1}\Vert f -f_{h_*}\Vert _{\tau _*^2}^2 \nonumber \\&\quad \leqslant h_{\tau _*^1}^2|A_{\tau _*^1}|^{-1}\Vert f -f_{h}\Vert _{\tau _*^1}^2 + h_{\tau _*^2}^2|A_{\tau _*^2}|^{-1}\Vert f -f_{h}\Vert _{\tau _*^2}^2\nonumber \\&\quad = 2^{-2/d} \cdot h_{\tau }^2|A_{\tau }|^{-1}\Vert f -f_{h}\Vert _{\tau _*^1}^2 + 2^{-2/d} \cdot h_{\tau }^2|A_{\tau }|^{-1}\Vert f -f_{h}\Vert _{\tau _*^2}^2 \nonumber \\&\quad = 2^{-2/d} h_{\tau }^2|A_{\tau }|^{-1}\Vert f -f_{h}\Vert _{\tau }^2 \nonumber \\&\quad \leqslant \mu h_{\tau }^2|A_{\tau }|^{-1}\Vert f -f_{h}\Vert _{\tau }^2 \nonumber \\&\quad = \mu \text{ osc}^2(f, \tau ). \end{aligned}$$
(33)

Using the fact that \({\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}} = {\mathcal {T}}_{h}\backslash {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}\) in conjunction with (33) and (11), we arrive at

$$\begin{aligned}&{\text{ osc}^2(f, {\mathcal {T}}_{h_*})}\\&\quad = \sum _{\tau _*\in {\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}}h_{\tau _*}^2|A_{\tau _*}|^{-1}\Vert f -f_{h_*}\Vert _{\tau _{*}}^2 + \sum _{\tau _*\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}}h_{\tau _*}^2|A_{\tau _*}|^{-1}\Vert f -f_{h_*}\Vert _{\tau _{*}}^2 \\&\quad \leqslant \sum _{\tau _*\in {\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}}h_{\tau _*}^2|A_{\tau _*}|^{-1}\Vert f -f_{h_*}\Vert _{\tau _{*}}^2 +\mu \sum _{\tau \in {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\text{ osc}^2(f, \tau )\\&\quad =\sum _{\tau \in {\mathcal {T}}_{h}\backslash {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}} h_{\tau }^2|A_{\tau }|^{-1}\Vert f -f_h\Vert _{\tau }^2 +\mu \sum _{\tau \in {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\text{ osc}^2(f, \tau ) \\&\quad \leqslant \sum _{\tau \in {\mathcal {T}}_{h}} \text{ osc}^2(f, \tau ) - \sum _{\tau \in {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\text{ osc}^2(f, \tau ) +\mu \sum _{\tau \in {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\text{ osc}^2(f, \tau )\\&\quad \leqslant \text{ osc}^2(f, {\mathcal {T}}_{h}) - \lambda \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}). \end{aligned}$$

This completes the proof. \(\square \)
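The bookkeeping in the proof of Lemma 8 can be illustrated with a small sketch. It is a simplified model, not the authors' implementation: each element is represented only by its volume, its coefficient value and the squared residuals \(\Vert f - f_h\Vert ^2\) on its two halves, with \(h_\tau = |\tau |^{1/d}\). The names `osc2` and `oscillation_reduction` are hypothetical. Following (32), the fine-mesh oscillation is bounded using \(f_h\) on the children.

```python
def osc2(volume, A, res2, d):
    """Squared oscillation h^2 |A|^{-1} ||f - f_h||^2 with h = volume**(1/d)."""
    return volume ** (2.0 / d) / abs(A) * res2

def oscillation_reduction(elements, marked, d):
    """elements: list of (volume, A, (r1, r2)), where r1 and r2 are the
    squared residuals ||f - f_h||^2 on the two halves of the element.
    marked: indices of the elements that are bisected.
    Returns (upper bound for osc^2 on the fine mesh, right-hand side of (31))."""
    mu = 2.0 ** (-1.0 / d)
    lam = 1.0 - mu
    coarse = refined = fine_bound = 0.0
    for i, (vol, A, (r1, r2)) in enumerate(elements):
        parent = osc2(vol, A, r1 + r2, d)
        coarse += parent
        if i in marked:
            refined += parent
            # Each child has half the volume, hence h^2 shrinks by 2^(-2/d).
            fine_bound += osc2(vol / 2.0, A, r1, d) + osc2(vol / 2.0, A, r2, d)
        else:
            fine_bound += parent
    return fine_bound, coarse - lam * refined

mesh = [(1.0, 1.0, (0.3, 0.2)), (0.5, 2.0, (0.1, 0.4)), (0.25, 1.0, (0.05, 0.05))]
for d in (2, 3):
    bound, rhs = oscillation_reduction(mesh, {0, 2}, d)
    print(d, bound <= rhs)
```

The children contribute the factor \(2^{-2/d}\), which is smaller than \(\mu = 2^{-1/d}\), so the stated reduction holds with some room to spare.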

Now we are in a position to present the reduction of the second part. We first present the difference between \(\eta _1^2(v_{h_*}, {\mathcal {T}}_{h_*}) \) and \(\eta _1^2(v_h, {\mathcal {T}}_{h_*})\).

Lemma 9

For \({\mathcal {T}}_{h}, {\mathcal {T}}_{h_*}\in {\mathbb {L}}({\mathcal {T}}_0)\) with \({\mathcal {T}}_{h}\leqslant {\mathcal {T}}_{h_*}\), let \(v_h = \{v_0^h, v_b^h\}\in V_h^0, v_{h_*} = \{v_0^{h_*}, v_b^{h_*}\}\in V_{h_*}\). Then for any \(\zeta >0\), there exists a constant \(\sigma _1\), depending on the shape regularity of \({\mathcal {T}}_{h_*}\), the polynomial order l, the coefficient A and the parameter \(\zeta \), such that

$$\begin{aligned} \eta _1^2(v_{h_*}, {\mathcal {T}}_{h_*})\leqslant & {} (1+\zeta )\eta _1^2(v_h, {\mathcal {T}}_{h_*}) \nonumber \\&+\frac{1}{\sigma _1}\Big (\mu \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}})+\Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert ^2_{{\mathcal {T}}_{h_*}}\Big ). \nonumber \\ \end{aligned}$$
(34)

Proof

For each \(\tau _*\in {\mathcal {T}}_{h_*}\), we will consider the four terms in \(\eta _1^2(v_{h_*}, {\mathcal {T}}_{h_*})\) one by one.

a) We first deal with the element terms \(R_1(v_{h_{*}}, f_{h_{*}}): = f_{h_{*}} + \nabla \cdot (A\nabla _w v_{h_*})\) and \(R_2(v_{h_*}):=\nabla \times \nabla _w v_{h_*}\). For \(R_1(v_{h_{*}}, f_{h_{*}})\), using the triangle inequality, we have

$$\begin{aligned}&{h_{\tau _*}|A_{\tau _{*}}|^{-1/2} \Vert R_1(v_{h_{*}}, f_{h_{*}})\Vert _{\tau _{*}}} \nonumber \\&\quad =h_{\tau _*}|A_{\tau _{*}}|^{-1/2} \Vert f_{h_{*}} + \nabla \cdot (A\nabla _w v_{h_*})\Vert _{\tau _{*}} \nonumber \\&\quad = h_{\tau _*}|A_{\tau _{*}}|^{-1/2} \Vert (f_{h} + \nabla \cdot (A\nabla _{w, \tau } v_h)) + f_{h_*} -f_h+ \nabla \cdot A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}} \nonumber \\&\quad \leqslant h_{\tau _*}|A_{\tau _{*}}|^{-1/2} \Vert R_1(v_{h}, f_{h})\Vert _{\tau _*} \nonumber \\&\qquad + h_{\tau _*}|A_{\tau _{*}}|^{-1/2} \Vert f_{h_*} -f_h+ \nabla \cdot A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}}. \end{aligned}$$
(35)

Applying the triangle inequality, the chain rule and the inverse inequality, we obtain

$$\begin{aligned}&{h_{\tau _{*}}|A_{\tau _{*}}|^{-1/2}\Vert f_{h_*} -f_h+ \nabla \cdot A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}}} \nonumber \\&\quad \lesssim h_{\tau _{*}}|A_{\tau _{*}}|^{-1/2}\Vert f_{h_*} -f_h\Vert _{\tau _{*}}+ h_{\tau _{*}}\Vert \nabla \cdot A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}} \nonumber \\&\quad \lesssim h_{\tau _{*}}|A_{\tau _{*}}|^{-1/2}\Vert f_{h_*} -f_h\Vert _{\tau _{*}}+ \Vert A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}} \nonumber \\&\quad \lesssim h_{\tau _{*}}|A_{\tau _{*}}|^{-1/2}\Vert f_{h_*} -f_h\Vert _{\tau _{*}}+ \Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}}. \end{aligned}$$
(36)

Substituting (36) into (35) and making use of Lemma 4, for any \(\tau _{*}\in {\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\), we have

$$\begin{aligned}&{h_{\tau _*}|A_{\tau _{*}}|^{-1/2} \Vert R_1(v_{h_{*}}, f_{h_{*}})\Vert _{\tau _{*}}} \nonumber \\&\quad \lesssim h_{\tau _*}|A_{\tau _{*}}|^{-1/2} \Vert R_1(v_{h}, f_{h})\Vert _{\tau _*} + \Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}}, \end{aligned}$$
(37)

and for any \(\tau _{*}\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}\), we have

$$\begin{aligned}&{h_{\tau _*}|A_{\tau _{*}}|^{-1/2} \Vert R_1(v_{h_{*}}, f_{h_{*}})\Vert _{\tau _{*}}} \nonumber \\&\quad \lesssim h_{\tau _*}|A_{\tau _{*}}|^{-1/2} \Vert R_1(v_{h}, f_{h})\Vert _{\tau _*} + h_{\tau _{*}}|A_{\tau _{*}}|^{-1/2}\Vert f -f_h\Vert _{\tau _{*}} \nonumber \\&\qquad + \Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}}. \end{aligned}$$
(38)

For \(R_2(v_{h_*})\), by an argument similar to the proof of (37), we get

$$\begin{aligned}&{h_{\tau _{*}} |A_{\tau _{*}}|^{1/2}\Vert R_2(v_{h_*})\Vert _{\tau _{*}} }\nonumber \\&\quad \leqslant h_{\tau _{*}} |A_{\tau _{*}}|^{1/2} \Vert \nabla \times \nabla _{w, \tau } v_h\Vert _{\tau _{*}} + h_{\tau _{*}} |A_{\tau _{*}}|^{1/2} \Vert \nabla \times (\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}} \nonumber \\&\quad \lesssim h_{\tau _{*}} |A_{\tau _{*}}|^{1/2}\Vert R_2(v_{h})\Vert _{\tau _{*}} + \Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _{*}}. \end{aligned}$$
(39)

b) Now, we consider the jump terms \({\mathbf {J}}_{e_*}(A\nabla _wv_{h_*})\) and \({\mathbf {J}}_{e_*}(\gamma _t(\nabla _wv_{h_*}))\). For each \(e_*\in {\mathcal {E}}_{h_*}^0\), we assume that \(e_* = \tau _*^1\cap \tau _*^2\) with \(\tau _*^1, \tau _*^2\in {\mathcal {T}}_{h_*}\). Let \({\varvec{n}}_{*}^1\) and \({\varvec{n}}_{*}^2\) be the unit normal vectors on \(e_*\) exterior to \(\tau _*^1\) and \(\tau _*^2\), respectively. Applying the triangle inequality, we obtain

$$\begin{aligned}&{h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\max }|^{-1/2}\Vert {\mathbf {J}}_{e_*}(A\nabla _{w, \tau _*}v_{h_*})\Vert _{e_*}}\nonumber \\&\quad \leqslant h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\max }|^{-1/2}\Vert {\mathbf {J}}_{e_*}(A\nabla _{w, \tau } v_h)\Vert _{e_*} \nonumber \\&\qquad +h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\max }|^{-1/2}\Vert {\mathbf {J}}_{e_*}(A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h))\Vert _{e_*}. \end{aligned}$$
(40)

Using the definition of \({\mathbf {J}}_{e_*}(A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h))\) and the trace inequality, we have

$$\begin{aligned}&{h_{\tau _{*}}^{1/2}|A_{e_{*}}^{\max }|^{-1/2}\Vert {\mathbf {J}}_{e_*}(A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h))\Vert _{e_*}} \nonumber \\&\quad \leqslant h_{\tau _{*}}^{1/2}|A_{e_{*}}^{\max }|^{-1/2}\Vert A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)|_{\tau _*^1}\cdot {\varvec{n}}_{*}^1\Vert _{e_*} \nonumber \\&\qquad + h_{\tau _{*}}^{1/2}|A_{e_{*}}^{\max }|^{-1/2}\cdot \Vert A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)|_{\tau _*^2}\cdot {\varvec{n}}_{*}^2\Vert _{e_*} \nonumber \\&\quad \lesssim h_{\tau _{*}}^{1/2}\Vert A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)|_{\tau _*^1}\Vert _{e_*} +h_{\tau _{*}}^{1/2} \Vert A(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)|_{\tau _*^2}\Vert _{e_*} \nonumber \\&\quad \lesssim \Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _*^1\cup \tau _*^2}. \end{aligned}$$
(41)

Substituting (41) into (40), we get

$$\begin{aligned}&{h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\max }|^{-1/2}\Vert {\mathbf {J}} _{e_*}(A\nabla _wv_{h_*})\Vert _{e_*}}\nonumber \\&\quad \leqslant h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\max }|^{-1/2}\Vert {\mathbf {J}}_{e_*}(A\nabla _{w, \tau } v_{h})\Vert _{e_*} \nonumber \\&\qquad + \Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _*^1\cup \tau _*^2}. \end{aligned}$$
(42)

Similar to the proof of (42), we obtain

$$\begin{aligned}&{h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\min }|^{1/2}\Vert {\mathbf {J}}_{e_*}(\gamma _t(\nabla _{w, \tau _*} v_{h_*}))\Vert _{e_*}} \nonumber \\&\quad \leqslant h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\min }|^{1/2}\Vert {\mathbf {J}}_{e_*}(\gamma _t(\nabla _{w, \tau } v_{h}))\Vert _{e_*} \nonumber \\&\qquad +h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\min }|^{1/2}\Vert {\mathbf {J}}_{e_*}(\gamma _t(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h))\Vert _{e_*} \nonumber \\&\quad \lesssim h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\min }|^{1/2}\Vert {\mathbf {J}}_{e_*}(\gamma _t(\nabla _{w, \tau } v_{h}))\Vert _{e_*} + h_{\tau _{*}}^{1/2} \Vert (\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)|_{\tau _*^1}\Vert _{e_*} \nonumber \\&\qquad + h_{\tau _{*}}^{1/2}\Vert (\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)|_{\tau _*^2}\Vert _{e_*} \nonumber \\&\quad \lesssim h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\min }|^{1/2}\Vert {\mathbf {J}}_{e_*}(\gamma _t(\nabla _{w, \tau } v_{h}))\Vert _{e_*} \nonumber \\&\qquad +\Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _*^1\cup \tau _*^2}. \end{aligned}$$
(43)

For each \(e_*\in {\mathcal {E}}_{h_*}\cap \partial \Omega \), we assume that \(e_*\subset \partial \tau _*\) with \(\tau _*\in {\mathcal {T}}_{h_*}\). By the definition of \({\mathbf {J}}_{e_*}(A\nabla _w v_{h_*})\), we have \({\mathbf {J}}_{e_*}(A\nabla _wv_{h_*} ) = 0\). Next, by an argument similar to the proof of (43), we get

$$\begin{aligned}&{h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\min }|^{1/2}\Vert {\mathbf {J}}_{e_*}(\gamma _t(\nabla _w v_{h_*}))\Vert _{e_*}} \nonumber \\&\quad \leqslant h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\min }|^{1/2}\Vert {\mathbf {J}}_{e_*}(\gamma _t(\nabla _{w, \tau } v_{h}))\Vert _{e_*} \nonumber \\&\qquad + h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\min }|^{1/2}\Vert {\mathbf {J}}_{e_*}\left( \gamma _t(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\right) \Vert _{e_*} \nonumber \\&\quad \lesssim h_{\tau _{*}}^{1/2} |A_{e_{*}}^{\min }|^{1/2}\Vert {\mathbf {J}}_{e_*}(\gamma _t(\nabla _{w, \tau } v_{h}))\Vert _{e_*} + \Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert _{\tau _*}. \end{aligned}$$
(44)

From (33), we also arrive at

$$\begin{aligned} \sum _{\tau _*\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}}h_{\tau _{*}}^2|A_{\tau _{*}}|^{-1}\Vert f -f_h\Vert _{\tau _{*}}^2 \leqslant \mu \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}). \end{aligned}$$
(45)

Squaring both sides of (37), (38), (39), (42), (43), (44), applying Young’s inequality \(2ab\leqslant \zeta a^2 + \zeta ^{-1} b^2\) for \(a, b>0, \zeta >0\), summing over all elements \(\tau _*\in {\mathcal {T}}_{h_*}\) and edges/faces \(e_*\in {\mathcal {E}}_{h_*}\), observing the shape regularity of the mesh \({\mathcal {T}}_{h_*}\) and using (45), we arrive at

$$\begin{aligned}&{\eta _1^2(v_{h_*}, {\mathcal {T}}_{h_*})}\\&\quad \leqslant (1+\zeta )\eta _1^2(v_h, {\mathcal {T}}_{h_*}) +C_2(1 + \zeta ^{-1})\Big (\sum _{\tau _*\in \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}}}h_{\tau _{*}}^2|A_{\tau _{*}}|^{-1}\Vert f -f_h\Vert _{\tau _{*}}^2 \\&\qquad +\Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert ^2_{{\mathcal {T}}_{h_*}}\Big )\\&\quad \leqslant (1+\zeta )\eta _1^2(v_h, {\mathcal {T}}_{h_*}) \\&\qquad +C_2(1 + \zeta ^{-1})\Big (\mu \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}})+\Vert A^{1/2}(\nabla _{w, \tau _*}v_{h_*}-\nabla _{w, \tau }v_h)\Vert ^2_{{\mathcal {T}}_{h_*}}\Big ). \end{aligned}$$

The constant \(C_2\) depends on the shape regularity of \({\mathcal {T}}_{h_*}\), the coefficient A and the polynomial order l. Finally, setting \(1/\sigma _1 = C_2(1 + \zeta ^{-1})\), we get the desired inequality (34). \(\square \)
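The factors \((1+\zeta )\) and \((1+\zeta ^{-1})\) in the last step come from squaring a bound of the form \(x \leqslant a + b\) and applying Young's inequality to the cross term. A minimal sketch of this scalar step is given below; the names `young_split` and `sigma_1` are hypothetical, and the value of \(C_2\) is a free input rather than a computed constant.

```python
def young_split(a, b, zeta):
    """Return ((a + b)^2, (1 + zeta) a^2 + (1 + 1/zeta) b^2) for zeta > 0.

    The second value bounds the first because
    2ab <= zeta a^2 + b^2 / zeta.
    """
    lhs = (a + b) ** 2
    rhs = (1.0 + zeta) * a * a + (1.0 + 1.0 / zeta) * b * b
    return lhs, rhs

def sigma_1(C2, zeta):
    """Constant from the end of the proof: 1 / sigma_1 = C2 (1 + 1/zeta)."""
    return 1.0 / (C2 * (1.0 + 1.0 / zeta))

for a, b, zeta in [(1.0, 0.5, 0.5), (2.0, 3.0, 0.1), (0.2, 4.0, 2.0)]:
    lhs, rhs = young_split(a, b, zeta)
    print(lhs <= rhs + 1e-12)
```

Equality holds when \(b = \zeta a\), so the split cannot be improved for a fixed \(\zeta \). A small \(\zeta \) keeps the factor in front of \(\eta _1^2(v_h, {\mathcal {T}}_{h_*})\) close to one at the price of a small \(\sigma _1\).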

Next, we prove the contraction of the error estimator if the solution does not change.

Lemma 10

Let \({\mathcal {T}}_{h_*}\) be a shape regular triangulation which is refined from a shape regular triangulation \({\mathcal {T}}_{h}\). Let \(u_h\in V_h\) be the discrete solution of (7). Then

$$\begin{aligned} \eta _1^2(u_h, {\mathcal {T}}_{h_*})\leqslant \eta _1^2(u_h, {\mathcal {T}}_h)-\lambda \eta _1^2(u_{h}, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}). \end{aligned}$$

Proof

We shall divide the proof into two steps. In the first step, we prove the element-wise contraction if one element is divided into at least two parts, and in the second step, we prove the global version.

Step 1. Suppose \(\tau \in {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}\) is bisected into \(\tau _*^1\in {\mathcal {T}}_{h_*}\) and \(\tau _*^2 \in {\mathcal {T}}_{h_*}\). We shall prove that

$$\begin{aligned} \eta _1^2(u_h, \tau _*^1) + \eta _1^2(u_h, \tau _*^2)\leqslant \mu \eta _1^2(u_h, \tau ), \end{aligned}$$
(46)

where \(\mu \in (0, 1)\) is given in Lemma 8.

In fact, similar to the proof of (33), we can obtain that the two element-wise terms are reduced, namely,

$$\begin{aligned}&h_{\tau _{*}^1}^2|A_{\tau _{*}^1}|^{-1} \Vert R_1(u_{h}, f_h)\Vert _{\tau _{*}^1}^2 + h_{\tau _{*}^2}^2|A_{\tau _{*}^2}|^{-1} \Vert R_1(u_{h}, f_h)\Vert _{\tau _{*}^2}^2 \nonumber \\&\quad \leqslant \mu h_{\tau }^2|A_{\tau }|^{-1}\cdot \Vert R_1(u_{h}, f_h)\Vert _{\tau }^2, \end{aligned}$$
(47)

and

$$\begin{aligned} h_{\tau _{*}^1}^2 |A_{\tau _{*}^1}|\cdot \Vert R_2(u_h)\Vert _{\tau _{*}^1}^2+ h_{\tau _{*}^2}^2 |A_{\tau _{*}^2}|\cdot \Vert R_2(u_h)\Vert _{\tau _{*}^2}^2 \leqslant \mu h_{\tau }^2|A_{\tau }|\cdot \Vert R_2(u_h)\Vert _{\tau }^2. \end{aligned}$$
(48)

For the jump residuals associated with edges/faces, we note that after \(\tau \in {\mathcal {T}}_{h}\) is bisected into \(\tau _*^1\in {\mathcal {T}}_{h_*}\) and \(\tau _*^2\in {\mathcal {T}}_{h_*}\), there are three types of edges/faces.

  1.

    For the new edge/face \(e_*\) created by the bisection, which is inside the element \(\tau \), the function \(\nabla _w u_h|_{\tau }\) is a polynomial and its coefficients are continuous. Therefore \([A\nabla _w u_h\cdot {\varvec{n}}]_{e_*}\) and \([\gamma _t(\nabla _wu_h)]_{e_*}\) are zero.

  2.

    For the edges/faces obtained by dividing an edge/face of \(\tau \), the jump values are invariant, but the mesh size changes. Consider first \(e\in {\mathcal {E}}_h^0\) with \(e = \tau _1\cap \tau _2\) and \(\tau _1, \tau _2\in {\mathcal {T}}_{h}\). Let \(\tau _{*, i}^1\in {\mathcal {T}}_{h_*}\) and \(\tau _{*, i}^2\in {\mathcal {T}}_{h_*}\) be the children of \(\tau _i\) \((i=1, 2)\), and define \(e_*^i = \tau _{*, 1}^i\cap \tau _{*, 2}^i\), so that \(e = e_*^1\cup e_*^2\). For the first jump term, applying Lemma 1, we obtain \({\mathbf {J}}_{e_*^i}(A\nabla _w u_h)=0, i=1, 2\). For the second jump term, we have

    $$\begin{aligned}&\frac{1}{2}h_{\tau _{*}}|A_{e_*^1}^{\min }|\cdot \Vert {\mathbf {J}}_{e_*^1}(\gamma _t(\nabla _{w, \tau }u_h))\Vert _{e_{*}^1}^2 + \frac{1}{2}h_{\tau _{*}}|A_{e_*^2}^{\min }|\cdot \Vert {\mathbf {J}}_{e_*^2}(\gamma _t(\nabla _{w, \tau }u_h))\Vert _{e_{*}^2}^2 \nonumber \\&\quad =2^{-1/d} \frac{|A_{e_*^1}^{\min }|}{|A_{e}^{\min }|} \cdot \frac{1}{2} h_{\tau } |A_{e}^{\min }|\cdot \Vert {\mathbf {J}}_{e_*^1}(\gamma _t(\nabla _{w, \tau }u_h))\Vert ^2_{e_*^1} \nonumber \\&\qquad + 2^{-1/d} \frac{|A_{e_*^2}^{\min }|}{|A_{e}^{\min }|}\cdot \frac{1}{2} h_{\tau } |A_{e}^{\min }|\cdot \Vert {\mathbf {J}}_{e_*^2}(\gamma _t(\nabla _{w, \tau }u_h))\Vert ^2_{e_*^2} \nonumber \\&\quad \leqslant \mu \cdot \frac{1}{2}h_{\tau }|A_{e}^{\min }|\cdot \Vert {\mathbf {J}}_{e}(\gamma _t(\nabla _wu_h))\Vert _{e}^2, \end{aligned}$$
    (49)

    where in the last step we used the fact that \(\dfrac{|A_{e_*^i}^{\min }|}{|A_{e}^{\min }|} =1\). Consider next \(e\in {\mathcal {E}}_h\cap \partial \Omega \) with \(e \subset \partial \tau \) and \(\tau \in {\mathcal {T}}_{h}\). Let \(\tau _*^1\in {\mathcal {T}}_{h_*}\) and \( \tau _*^2\in {\mathcal {T}}_{h_*}\) be the children of \(\tau \), and define \(e_*^i = e\cap \tau _*^i\) \((i=1, 2)\), so that \(e = e_*^1\cup e_*^2\). For the first jump term, using the definition of \({\mathbf {J}}_{e}(A\nabla _w u_h)\), we obtain \({\mathbf {J}}_{e_*^i}(A\nabla _w u_h)=0, i=1, 2\). For the second jump term, by an argument similar to the proof of (49), we have

    $$\begin{aligned}&{ \frac{1}{2}h_{\tau _{*}}|A_{e_*^1}^{\min }|\cdot \Vert {\mathbf {J}}_{e_*^1}(\gamma _t(\nabla _{w, \tau }u_h))\Vert _{e_{*}^1}^2 + \frac{1}{2}h_{\tau _{*}}|A_{e_*^2}^{\min }|\cdot \Vert {\mathbf {J}}_{e_*^2}(\gamma _t(\nabla _{w, \tau }u_h))\Vert _{e_{*}^2}^2 } \nonumber \\&\quad \leqslant \mu \cdot \frac{1}{2}h_{\tau }|A_{e}^{\min }|\cdot \Vert {\mathbf {J}}_{e}(\gamma _t(\nabla _wu_h))\Vert _{e}^2. \end{aligned}$$
    (50)
  3.

    For the edges/faces unchanged or inherited from \(\tau \), the jump values are also invariant, but the mesh size is decreased by the factor \(2^{-1/d}, d = 2, 3\). The crucial observation is that we use the mesh size \(h_{\tau }\) in the jump residual.

Hence, using (47), (48), (49), (50) and the fact \(\mu =2^{-1/d}, d = 2, 3\), we get the inequality (46).

Step 2. Notice that \( {\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}} = {\mathcal {T}}_{h}\backslash {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}\). Combining this identity with (46), we get

$$\begin{aligned} \eta _1^2(u_h, {\mathcal {T}}_{h_*})= & {} \eta _1^2(u_h, \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}})+ \eta _1^2(u_h, {\mathcal {T}}_{h_*}\backslash \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}})\\= & {} \eta _1^2(u_h, \overline{{\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}})+ \eta _1^2(u_h, {\mathcal {T}}_{h}\backslash {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}) \\\leqslant & {} \mu \eta _1^2(u_h, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}) + \eta _1^2(u_h, {\mathcal {T}}_{h}) - \eta _1^2(u_h, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}) \\\leqslant & {} \eta _1^2(u_h, {\mathcal {T}}_h)-\lambda \eta _1^2(u_{h}, {\mathcal {R}}_{{\mathcal {T}}_{h}\rightarrow {\mathcal {T}}_{h_*}}). \end{aligned}$$

\(\square \)

The following lemma summarizes the contraction of \(\eta _1^2(\cdot , \cdot )\), obtained from Lemmas 9 and 10.

Lemma 11

For any \(\zeta >0\), there exists a constant \(\sigma _1\) depending on the shape regularity of \({\mathcal {T}}_{k+1}\), the polynomial order l, the coefficient A and the parameter \(\zeta \), such that

$$\begin{aligned} \eta _1^2(u_{k+1}, {\mathcal {T}}_{k+1})\leqslant & {} (1+\zeta )\left( \eta _1^2(u_k, {\mathcal {T}}_k)-\lambda \eta _1^2(u_{k}, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})\right) \nonumber \\&\quad +\frac{1}{\sigma _1}\Big (\mu \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})+\Vert A^{1/2}(\nabla _{w, \tau _{k+1}}u_{k+1} - \nabla _{w, \tau _k}u_k)\Vert ^2_{{\mathcal {T}}_{k+1}}\Big ),\nonumber \\ \end{aligned}$$
(51)

where \(\tau _k\in {\mathcal {T}}_k, \tau _{k+1}\in {\mathcal {T}}_{k+1}\) and \(\tau _{k+1}\subseteq \tau _k\).

Proof

Taking \({\mathcal {T}}_{h} = {\mathcal {T}}_{k}\) and \({\mathcal {T}}_{h_*} = {\mathcal {T}}_{k+1}\) in Lemmas 9 and 10, we get the desired result (51). \(\square \)

At the end of this section, we present the contraction of the error estimator, which follows from Lemmas 8 and 11.

Lemma 12

There exists \(\xi \in (0, 1)\), depending only on the shape regularity of \({\mathcal {T}}_{k+1}\) and on the parameters \(\theta \), \(\lambda \) and \(\zeta \) given in the marking strategy (13), Lemmas 8 and 9, respectively, such that

$$\begin{aligned} \eta ^2(u_{k+1}, {\mathcal {T}}_{k+1})\leqslant & {} \xi \eta ^2(u_{k}, {\mathcal {T}}_{k}) - \zeta \text{ osc}^2(f, {\mathcal {T}}_{k}) + \left( \zeta \lambda + \frac{\mu }{\sigma _1}\right) \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}}) \\&+ \frac{1}{\sigma _1}\Vert A^{1/2}(\nabla _{w, \tau _{k+1}}u_{k+1} - \nabla _{w, \tau _k}u_k)\Vert ^2_{{\mathcal {T}}_{k+1}}, \end{aligned}$$

where \(\mu , \sigma _1\) are defined in Lemmas 8 and 9, respectively; \(\tau _k\in {\mathcal {T}}_k, \tau _{k+1}\in {\mathcal {T}}_{k+1}\) and \(\tau _{k+1}\subseteq \tau _k\).

Proof

Using the definition of the error estimator \(\eta ^2(\cdot , \cdot )\), Lemma 11, and Lemma 8 with \({\mathcal {T}}_{h} = {\mathcal {T}}_{k}, {\mathcal {T}}_{h_*} = {\mathcal {T}}_{k+1}\), we have

$$\begin{aligned}&\eta ^2(u_{k+1}, {\mathcal {T}}_{k+1})\nonumber \\&= \eta _1^2(u_{k+1}, {\mathcal {T}}_{k+1}) + \text{ osc}^2(f, {\mathcal {T}}_{k+1}) \nonumber \\&\quad \leqslant (1+\zeta )\left( \eta _1^2(u_k, {\mathcal {T}}_k)-\lambda \eta _1^2(u_{k}, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})\right) + \text{ osc}^2(f, {\mathcal {T}}_{k}) \nonumber \\&\qquad + \frac{1}{\sigma _1}\Big (\mu \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})+\Vert A^{1/2}(\nabla _{w, \tau _{k+1}}u_{k+1} - \nabla _{w, \tau _k}u_k)\Vert ^2_{{\mathcal {T}}_{k+1}}\Big ) \nonumber \\&\qquad - \lambda \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}}) \nonumber \\&\quad =(1+\zeta )\left( \eta ^2(u_k, {\mathcal {T}}_k)-\lambda \eta ^2(u_{k}, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})\right) \nonumber \\&\qquad - \zeta \left( \text{ osc}^2(f, {\mathcal {T}}_{k}) - \lambda \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})\right) \nonumber \\&\qquad +\frac{1}{\sigma _1}\Big (\mu \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})+\Vert A^{1/2}(\nabla _{w, \tau _{k+1}}u_{k+1} - \nabla _{w, \tau _k}u_k)\Vert ^2_{{\mathcal {T}}_{k+1}}\Big ) \nonumber \\&\quad = (1+\zeta )\left( \eta ^2(u_k, {\mathcal {T}}_k)-\lambda \eta ^2(u_{k}, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})\right) - \zeta \text{ osc}^2(f, {\mathcal {T}}_{k}) \nonumber \\&\qquad +\left( \zeta \lambda + \frac{\mu }{\sigma _1}\right) \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})\nonumber \\&\qquad + \frac{1}{\sigma _1}\Vert A^{1/2}(\nabla _{w, \tau _{k+1}}u_{k+1} - \nabla _{w, \tau _k}u_k)\Vert ^2_{{\mathcal {T}}_{k+1}}. \end{aligned}$$
(52)

Applying the marking strategy (13) and choosing \(\zeta \) small enough such that \(\xi :=(1+\zeta )(1-\theta \lambda )\in (0, 1), \) in conjunction with (52), we obtain

$$\begin{aligned} \eta ^2(u_{k+1}, {\mathcal {T}}_{k+1})\leqslant & {} \xi \eta ^2(u_{k}, {\mathcal {T}}_{k}) - \zeta \text{ osc}^2(f, {\mathcal {T}}_{k}) + \left( \zeta \lambda + \frac{\mu }{\sigma _1}\right) \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}}) \\&+ \frac{1}{\sigma _1}\Vert A^{1/2}(\nabla _{w, \tau _{k+1}}u_{k+1} - \nabla _{w, \tau _k}u_k)\Vert ^2_{{\mathcal {T}}_{k+1}}, \end{aligned}$$

which completes the proof. \(\square \)
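The smallness condition on \(\zeta \) in the last step can be made explicit: \(\xi = (1+\zeta )(1-\theta \lambda )<1\) holds exactly when \(\zeta < \theta \lambda /(1-\theta \lambda )\). The following is a minimal Python sketch of this bound. It assumes \(\lambda = 1-\mu \) with \(\mu = 2^{-1/d}\), which is our reading of the estimator reduction step and is not stated explicitly here; the function names are illustrative and belong neither to the paper nor to iFEM.

```python
def reduction_constants(d):
    """Return (mu, lam) with mu = 2**(-1/d) and lam = 1 - mu.

    The relation lam = 1 - mu is an assumption made for this sketch.
    """
    mu = 2.0 ** (-1.0 / d)
    return mu, 1.0 - mu


def zeta_upper_bound(theta, d):
    """Supremum of the zeta > 0 for which (1 + zeta) * (1 - theta * lam) < 1."""
    _, lam = reduction_constants(d)
    return theta * lam / (1.0 - theta * lam)


def xi(theta, zeta, d):
    """Contraction factor xi = (1 + zeta) * (1 - theta * lam) of Lemma 12."""
    _, lam = reduction_constants(d)
    return (1.0 + zeta) * (1.0 - theta * lam)
```

For instance, `xi(0.5, 0.5 * zeta_upper_bound(0.5, 2), 2)` evaluates the factor for \(\theta = 0.5\), \(d=2\) and a value of \(\zeta \) at half of the admissible bound; any \(\zeta \) strictly below the bound gives \(\xi \in (0, 1)\).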

4.4 Convergence of the AWG

In this subsection, we prove that the algorithm AWG terminates in a finite number of steps for a given tolerance. First of all, we prove the contraction of the sum of the energy error and the scaled error indicator.

Theorem 2

Let \(\theta \in (0, 1)\) be a marking parameter and \({\mathcal {T}}_0\) an initial mesh. Let u be the solution of (1)–(2), and let \(\{{\mathcal {T}}_k, u_k, \eta (u_k,{\mathcal {T}}_k)\}_{k\ge 0}\) be the sequence of meshes, finite element solutions and error estimators produced by the AWG. Then there exist constants \(\rho \in (0, 1), \sigma _1>0, \sigma _2>0\) depending only on the shape regularity of \({\mathcal {T}}_0\), the polynomial order l, the coefficient A, and the parameters \(\theta \), \(\mu _0\) and \(\epsilon \), such that if

$$\begin{aligned} 0<\epsilon <\min \left( \dfrac{\sigma _1(1-\xi )}{C_1}, 1\right) , \end{aligned}$$

then

$$\begin{aligned}&{(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k+1})\Vert _{{\mathcal {T}}_{k+1}}^2 + \sigma _1\eta ^2(u_{k+1}, {\mathcal {T}}_{k+1}) + \sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k+1})}\\&\quad \leqslant \rho \Big ((1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k})\Vert _{{\mathcal {T}}_{k}}^2 + \sigma _1\eta ^2(u_{k}, {\mathcal {T}}_{k}) +\sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k})\Big ), \end{aligned}$$

where the constants \(C_1\) and \(\xi \) are given by Lemmas 7 and 12, respectively.

Remark 5

Notice that the data oscillation \(\text{ osc}^2(f,\cdot )\) is one part of the error indicator \(\eta ^2(\cdot , \cdot )\). If we want to get rid of the term \(\sigma _2\text{ osc}^2(f,\cdot )\), we have to add an extra marking for the data oscillation, see [5].

Proof

Adding \(\sigma _1 \eta ^2(u_{k+1}, {\mathcal {T}}_{k+1})\) to both sides of (28) and then applying Lemma 12, we have

$$\begin{aligned}&{(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k+1})\Vert _{{\mathcal {T}}_{k+1}}^2 + \sigma _1\eta ^2(u_{k+1}, {\mathcal {T}}_{k+1})} \nonumber \\&\quad \leqslant \Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2 - \Vert A^{1/2}(\nabla _{w, \tau _{k+1}} u_{k+1} - \nabla _{w, \tau _{k}}u_k)\Vert _{{\mathcal {T}}_{k+1}}^2 \nonumber \\&\qquad +\frac{C_0}{\epsilon }\text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}})+\sigma _1 \eta ^2(u_{k+1}, {\mathcal {T}}_{k+1})\nonumber \\&\quad \leqslant \Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2 - \Vert A^{1/2}(\nabla _{w, \tau _{k+1}} u_{k+1} - \nabla _{w, \tau _{k}}u_k)\Vert _{{\mathcal {T}}_{k+1}}^2 \nonumber \\&\qquad +\frac{C_0}{\epsilon }\text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}}) + \sigma _1 \xi \eta ^2(u_k, {\mathcal {T}}_k) - \sigma _1 \zeta \text{ osc}^2(f, {\mathcal {T}}_{k}) \nonumber \\&\qquad + (\zeta \lambda \sigma _1+ \mu )\text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}}) +\Vert A^{1/2}(\nabla _{w, \tau _{k+1}} u_{k+1} - \nabla _{w, \tau _{k}}u_k)\Vert ^2_{{\mathcal {T}}_{k+1}}\nonumber \\&\quad \leqslant \Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2 +\sigma _1 \xi \eta ^2(u_k, {\mathcal {T}}_k) \nonumber \\&\qquad +\left( \frac{C_0}{\epsilon } + \zeta \lambda \sigma _1+ \mu \right) \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}}) - \sigma _1 \zeta \text{ osc}^2(f, {\mathcal {T}}_{k}), \end{aligned}$$
(53)

for any constant \(\epsilon \in (0, 1)\). Let \(\sigma _2>0\) be a constant to be determined later. Adding \(\sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k+1})\) to both sides of (53) and applying Lemma 8 with \({\mathcal {T}}_{h} = {\mathcal {T}}_{k}, {\mathcal {T}}_{h_*} = {\mathcal {T}}_{k+1}\), we obtain

$$\begin{aligned}&{(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k+1})\Vert _{{\mathcal {T}}_{k+1}}^2 + \sigma _1\eta ^2(u_{k+1}, {\mathcal {T}}_{k+1}) + \sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k+1})} \nonumber \\&\quad \leqslant \Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2 +\sigma _1 \xi \eta ^2(u_k, {\mathcal {T}}_k) + (\sigma _2- \sigma _1 \zeta )\text{ osc}^2(f, {\mathcal {T}}_{k}) \nonumber \\&\qquad +\left( \dfrac{C_0}{\epsilon } +\mu - (\sigma _2-\zeta \sigma _1)\lambda \right) \text{ osc}^2(f, {\mathcal {R}}_{{\mathcal {T}}_{k}\rightarrow {\mathcal {T}}_{k+1}}). \end{aligned}$$
(54)

The above inequality (54) along with a sufficiently large \(\sigma _2\) satisfying

$$\begin{aligned} \dfrac{C_0}{\epsilon } +\mu - (\sigma _2-\zeta \sigma _1)\lambda \leqslant 0, \end{aligned}$$
(55)

and some \(\rho _1\in (0, 1)\) to be determined later implies

$$\begin{aligned}&{(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k+1})\Vert _{{\mathcal {T}}_{k+1}}^2 + \sigma _1\eta ^2(u_{k+1}, {\mathcal {T}}_{k+1}) + \sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k+1})}\nonumber \\&\quad \leqslant \Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2 +\sigma _1 \xi \eta ^2(u_k, {\mathcal {T}}_k)+ (\sigma _2- \sigma _1 \zeta )\text{ osc}^2(f, {\mathcal {T}}_{k}) \nonumber \\&\quad \leqslant \rho _1(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2 + \left( 1-\rho _1(1 - \epsilon )\right) \Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2\nonumber \\&\qquad +\sigma _1 \xi \eta ^2(u_k, {\mathcal {T}}_k)+ (\sigma _2- \sigma _1 \zeta )\text{ osc}^2(f, {\mathcal {T}}_{k}). \end{aligned}$$
(56)

The upper bound (30), together with (56), yields

$$\begin{aligned}&(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k+1})\Vert _{{\mathcal {T}}_{k+1}}^2 + \sigma _1\eta ^2(u_{k+1}, {\mathcal {T}}_{k+1}) + \sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k+1})\nonumber \\&\quad \leqslant \rho _1(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2 + \Big (C_1-C_1\rho _1(1 - \epsilon ) + \sigma _1 \xi \Big )\eta ^2(u_k, {\mathcal {T}}_k) \nonumber \\&\qquad +(\sigma _2- \sigma _1 \zeta )\text{ osc}^2(f,{\mathcal {T}}_{k}) , \end{aligned}$$
(57)

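Requiring that the coefficient of \(\eta ^2(u_k, {\mathcal {T}}_k)\) in (57) equal \(\rho _1\sigma _1\), that is,

$$\begin{aligned} \rho _1\sigma _1 = C_1-C_1\rho _1(1 - \epsilon ) + \sigma _1\xi , \end{aligned}$$

we choose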

$$\begin{aligned} \rho _1 = \dfrac{C_1+ \sigma _1 \xi }{C_1+ \sigma _1 - C_1\epsilon }, \end{aligned}$$

where the requirement \(0<\epsilon <\min \left( \dfrac{\sigma _1(1-\xi )}{C_1}, 1\right) \) with \(\xi \in (0, 1)\) ensures that \(\rho _1\in (0, 1)\). By (55), we obtain \(\sigma _2- \sigma _1 \zeta >0\). Setting \(\rho _2 = \dfrac{\sigma _2- \sigma _1 \zeta }{\sigma _2}\), we get \(\rho _2\in (0, 1)\) and

$$\begin{aligned}&{ (1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k+1})\Vert _{{\mathcal {T}}_{k+1}}^2 + \sigma _1\eta ^2(u_{k+1}, {\mathcal {T}}_{k+1}) +\sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k+1}) }\\&\quad \leqslant \rho _1(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2 + \rho _1\sigma _1\eta ^2(u_k, {\mathcal {T}}_k) +(\sigma _2- \sigma _1 \zeta )\text{ osc}^2(f,{\mathcal {T}}_{k}) \\&\quad \leqslant \rho _1(1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_k)\Vert _{{\mathcal {T}}_{k}}^2 + \rho _1\sigma _1\eta ^2(u_k, {\mathcal {T}}_k) + \rho _2\sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k}). \end{aligned}$$

We complete the proof by setting \(\rho = \max \{\rho _1, \rho _2\}\in (0, 1)\). \(\square \)
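The choice of constants in this proof can be traced numerically. The sketch below, in Python, takes \(\sigma _2\) as the smallest value allowed by (55) and then evaluates \(\rho _1\), \(\rho _2\) and \(\rho = \max \{\rho _1, \rho _2\}\) from the formulas above. The function name and any input values are hypothetical: the constants \(C_0\), \(C_1\), \(\sigma _1\) are not known explicitly, so the sketch only illustrates how the quantities depend on each other.

```python
def contraction_constants(C0, C1, sigma1, xi, zeta, lam, mu, eps):
    """Evaluate sigma2, rho1, rho2 and rho as chosen in the proof of Theorem 2.

    All arguments are placeholders for constants the paper does not quantify.
    """
    if not 0.0 < xi < 1.0:
        raise ValueError("xi must lie in (0, 1)")
    if not 0.0 < eps < min(sigma1 * (1.0 - xi) / C1, 1.0):
        raise ValueError("eps violates the smallness condition of Theorem 2")

    # Smallest sigma2 with C0/eps + mu - (sigma2 - zeta*sigma1)*lam <= 0, see (55).
    sigma2 = zeta * sigma1 + (C0 / eps + mu) / lam

    # rho1 solves rho1*sigma1 = C1 - C1*rho1*(1 - eps) + sigma1*xi.
    rho1 = (C1 + sigma1 * xi) / (C1 + sigma1 - C1 * eps)

    # rho2 rescales the oscillation coefficient sigma2 - sigma1*zeta.
    rho2 = (sigma2 - sigma1 * zeta) / sigma2

    return sigma2, rho1, rho2, max(rho1, rho2)
```

Since \(\sigma _2\) grows like \(C_0/\epsilon \), the factor \(\rho _2\) is typically very close to 1 for small \(\epsilon \); the theorem guarantees only \(\rho <1\), not a quantitatively fast contraction.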

By recursion, we get the decay of the error plus the estimator.

Corollary 1

Under the hypotheses of Theorem 2, we have

$$\begin{aligned} (1 - \epsilon )\Vert A^{1/2}(\nabla u-\nabla _wu_{k})\Vert _{{\mathcal {T}}_{k}}^2 + \sigma _1\eta ^2(u_{k}, {\mathcal {T}}_{k}) +\sigma _2\text{ osc}^2(f,{\mathcal {T}}_{k})\leqslant {\hat{C}}_{0}\rho ^{k}, \end{aligned}$$

where the constants \(\epsilon , \sigma _1, \sigma _2, \rho \) are given in Theorem 2, and \({\hat{C}}_{0} = (1 - \epsilon )\Vert A^{1/2}(\nabla u - \nabla _w u_{0})\Vert _{{\mathcal {T}}_{0}}^2 + \sigma _1\eta ^2(u_{0}, {\mathcal {T}}_{0}) + \sigma _2\text{ osc}^2(f,{\mathcal {T}}_{0})\). Thus the algorithm AWG will terminate in a finite number of steps.
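The geometric decay \({\hat{C}}_{0}\rho ^{k}\) gives an explicit bound on the number of iterations needed to push the weighted quantity of Corollary 1 below a tolerance: it suffices that \(k\geqslant \log (\text{tol}/{\hat{C}}_{0})/\log \rho \). The Python sketch below computes the smallest such \(k\). The function name and the tolerance argument are illustrative assumptions; in practice \({\hat{C}}_{0}\) and \(\rho \) are not known, so this is a reading of the corollary rather than a usable stopping criterion.

```python
import math


def iterations_to_tolerance(c_hat0, rho, tol):
    """Smallest integer k >= 0 such that c_hat0 * rho**k <= tol."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in (0, 1)")
    if tol <= 0.0:
        raise ValueError("tol must be positive")
    if c_hat0 <= tol:
        return 0
    k = math.ceil(math.log(tol / c_hat0) / math.log(rho))
    # Correct for floating-point rounding in the logarithms.
    while c_hat0 * rho ** k > tol:
        k += 1
    while k > 0 and c_hat0 * rho ** (k - 1) <= tol:
        k -= 1
    return k
```

The bound concerns the squared, weighted sum of error, estimator and oscillation, so a tolerance on the energy error itself corresponds to the square of that tolerance times \((1-\epsilon )\).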

5 Numerical Experiments

In this section, we present some numerical experiments to show the performance of the adaptive algorithm AWG. We carry out these numerical experiments by using the MATLAB software package iFEM [4]. We choose the lowest order WG method and compute the energy error \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) in the following numerical experiments.

Example 1

In this example, we test the ‘L-shape’ problem in two dimensions. We choose an L-shape domain \(\Omega = (-1, 1)^2\backslash [0, 1)^2\) and the coefficient \(A = {\mathbf {I}}\). For the source \(f =0\), the exact solution is \(u = r^{2/3}\sin (\frac{2}{3}\theta )\) in polar coordinates. The left of Fig. 1 shows the initial mesh \({\mathcal {T}}_0\), and the right of Fig. 1 shows an adaptively refined mesh with marking parameter \(\theta = 0.5\) after \(k =14\) iterative steps, which indicates that the mesh is locally refined in a small vicinity of the singularity at the re-entrant corner.

Denote by \(\#{\mathcal {T}}_k\) the number of elements of \({\mathcal {T}}_k\) and by \(u_{k}\) the corresponding weak Galerkin finite element solution on the mesh \({\mathcal {T}}_k\). The left of Fig. 2 shows the curves of \(\log \# {\mathcal {T}}_k-\log \Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) for marking parameters \(\theta = 0.1, 0.3, 0.5\), which indicate the convergence and the quasi-optimality of the adaptive algorithm AWG with respect to the energy error \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\), namely

$$\begin{aligned} \Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\lesssim (\# {\mathcal {T}}_k)^{-1/2}. \end{aligned}$$

The right of Fig. 2 plots the behavior of \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) and \(\eta (u_k, {\mathcal {T}}_k)\), which shows that the energy error \(\Vert A^{1/2}(\nabla u-\nabla _w u_{k})\Vert _{{\mathcal {T}}_k}\) is controlled by the error estimator \( \eta (u_{k}, {\mathcal {T}}_k)\) and that the energy error and the error estimator decay at approximately the same rate.

Fig. 1

The initial mesh \({\mathcal {T}}_0\) (left); an adaptively refined mesh after 14 adaptive iterations with marking parameter \(\theta = 0.5\) (right)

Fig. 2

Quasi-optimality of the adaptive algorithm AWG with respect to the error \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) with different marking parameters \(\theta \) (left); the behavior of \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) and \(\eta (u_k, {\mathcal {T}}_k)\) for Example 1 with \(\theta =0.5\) (right)

Example 2

In this example, we consider the Kellogg problem introduced in [10]. We choose the domain \(\Omega = (-1, 1)^2\) and, for \(f=0\), the exact solution in polar coordinates is \(u(r, \theta )=r^{\gamma } \mu (\theta )\), where

$$\begin{aligned} \mu (\theta )=\left\{ \begin{array}{ll}{\cos \left( \left( \frac{\pi }{2}-\sigma \right) \gamma \right) \cos \left( \left( \theta -\frac{\pi }{2}+\rho \right) \gamma \right) } &{} { \text{ if } 0 \le \theta \le \frac{\pi }{2}}, \\ {\cos (\rho \gamma ) \cos ((\theta -\pi +\sigma ) \gamma )} &{} { \text{ if } \frac{\pi }{2} \le \theta \le \pi }, \\ {\cos (\sigma \gamma ) \cos ((\theta -\pi -\rho ) \gamma )} &{} { \text{ if } \pi \le \theta \le \frac{3 \pi }{2}}, \\ {\cos \left( \left( \frac{\pi }{2}-\rho \right) \gamma \right) \cos \left( \left( \theta -\frac{3 \pi }{2}-\sigma \right) \gamma \right) } &{} { \text{ if } \frac{3 \pi }{2} \le \theta \le 2 \pi }, \end{array}\right. \end{aligned}$$

the coefficient matrix A is piecewise constant, with \(A = 161.44764\ {\mathbf {I}}\) in the first and third quadrants and \(A = {\mathbf {I}}\) in the second and fourth quadrants, and the constants are \(\gamma = 0.1, \sigma = -14.92256, \rho = \pi /4\). The exact solution satisfies \(u\in H^{1+s}(\Omega )\) only for \(s<\gamma \), so its regularity is very low. The left of Fig. 3 shows the initial mesh \({\mathcal {T}}_0\), and the right of Fig. 3 shows an adaptively refined mesh with marking parameter \(\theta = 0.5\) after \(k=130\) iterative steps. We can again see that the mesh is locally refined in a small vicinity of the singularity at the origin.
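The parameters of the Kellogg problem are tied together by the interface conditions: across each coordinate half-axis, both \(u\) and the flux \(A\,\partial u/\partial \theta \) must be continuous. The Python sketch below evaluates \(\mu (\theta )\) with the rounded constants quoted above and measures how well these two conditions hold. The function names are illustrative, the derivative of \(\mu \) is computed from the formula by hand, and the flux mismatch is nonzero only because the constants are given to a few digits.

```python
import math

GAMMA = 0.1
SIGMA = -14.92256
RHO = math.pi / 4
R = 161.44764  # coefficient in the first and third quadrants


def branch(q):
    """Amplitude and angular shift of mu on quadrant q = 0, 1, 2, 3."""
    h = math.pi / 2
    table = [
        (math.cos((h - SIGMA) * GAMMA), h - RHO),
        (math.cos(RHO * GAMMA), math.pi - SIGMA),
        (math.cos(SIGMA * GAMMA), math.pi + RHO),
        (math.cos((h - RHO) * GAMMA), 3 * h + SIGMA),
    ]
    return table[q]


def mu(theta, q=None):
    """Angular part mu(theta) for theta in [0, 2*pi]; q forces a branch."""
    if q is None:
        q = min(int(theta // (math.pi / 2)), 3)
    amp, shift = branch(q)
    return amp * math.cos((theta - shift) * GAMMA)


def dmu(theta, q):
    """Derivative of the q-th branch of mu with respect to theta."""
    amp, shift = branch(q)
    return -GAMMA * amp * math.sin((theta - shift) * GAMMA)


def kellogg_u(x, y):
    """Exact solution u = r**GAMMA * mu(theta) in Cartesian coordinates."""
    theta = math.atan2(y, x) % (2 * math.pi)
    return math.hypot(x, y) ** GAMMA * mu(theta)


def interface_mismatch():
    """Jump of mu and relative jump of A*mu' across the four interfaces."""
    result = []
    for q in range(4):
        q_next = (q + 1) % 4
        theta_left = (q + 1) * math.pi / 2
        theta_right = 0.0 if q_next == 0 else theta_left  # 2*pi is identified with 0
        a_left = R if q in (0, 2) else 1.0
        a_right = R if q_next in (0, 2) else 1.0
        jump_u = mu(theta_left, q) - mu(theta_right, q_next)
        flux_l = a_left * dmu(theta_left, q)
        flux_r = a_right * dmu(theta_right, q_next)
        result.append((jump_u, (flux_l - flux_r) / max(abs(flux_l), abs(flux_r))))
    return result
```

Calling `interface_mismatch()` returns one pair per interface; the first entries vanish up to rounding for any parameter values, while the second entries are small only when \(\gamma \), \(\sigma \), \(\rho \) and the coefficient ratio are matched as above.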

The left of Fig. 4 shows the curves of \(\log \# {\mathcal {T}}_k-\log \Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) for the Kellogg problem with marking parameters \(\theta = 0.1, 0.3, 0.5\), which also indicate the convergence and the following quasi-optimality of the adaptive algorithm AWG, i.e.

$$\begin{aligned} \Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\lesssim (\# {\mathcal {T}}_k)^{-1/2}. \end{aligned}$$

The right of Fig. 4 plots the behavior of \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) and \(\eta (u_k, {\mathcal {T}}_k)\), which shows that the energy error \(\Vert A^{1/2}(\nabla u-\nabla _w u_{k})\Vert _{{\mathcal {T}}_k}\) is controlled by the error estimator \( \eta (u_{k}, {\mathcal {T}}_k)\) and that the energy error and the error estimator decay at approximately the same rate.

Fig. 3

The initial mesh \({\mathcal {T}}_0\) (left); an adaptively refined mesh for the Kellogg problem with marking parameter \(\theta = 0.5\) after \(k=130\) adaptive iterations (right)

Fig. 4

Quasi-optimality of the adaptive algorithm AWG with respect to the error \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) with different marking parameters \(\theta \) (left); the behavior of \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) and \(\eta (u_k, {\mathcal {T}}_k)\) for Example 2 (right)

Fig. 5

The initial mesh \({\mathcal {T}}_0\) (left); an adaptively refined mesh for the L-shape problem in three dimensions with marking parameter \(\theta = 0.5\) after \(k=17\) adaptive iterations (right)

Fig. 6

Quasi-optimality of the adaptive algorithm AWG with respect to the error \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) with different marking parameters \(\theta \) (left); the behavior of \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) and \(\eta (u_k, {\mathcal {T}}_k)\) for Example 3 (right)

Example 3

In this example, we test the ‘L-shape’ problem in three dimensions. We choose an L-shape domain \(\Omega = (-1, 1)^3\backslash [0, 1)\times [0,1)\times (-1, 1)\). We obtain an initial mesh \({\mathcal {T}}_0\) by partitioning the given domain \(\Omega \) into four subintervals along each of the x-, y- and z-axes and then dividing every cube into 6 tetrahedra. We set \(A={\mathbf {I}}\) and the source \(f = 0\) such that the exact solution in cylindrical coordinates is \(u=r^{\frac{2}{3}} \sin \left( \frac{2}{3} \theta \right) \). The left of Fig. 5 shows the initial mesh \({\mathcal {T}}_0\), and the right of Fig. 5 shows an adaptively refined mesh with marking parameter \(\theta = 0.5\) after \(k=17\) iterative steps, which also indicates that the mesh is locally refined.

The left of Fig. 6 plots the curves of \(\log \# {\mathcal {T}}_k-\log \Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) for \(\theta = 0.1, 0.3, 0.5\), which indicate the convergence and the following quasi-optimality of the adaptive algorithm AWG with respect to the energy error, i.e.

$$\begin{aligned} \Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\lesssim (\# {\mathcal {T}}_k)^{-1/3}. \end{aligned}$$

The right of Fig. 6 plots the behavior of \(\Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\) and \(\eta (u_k, {\mathcal {T}}_k)\) for Example 3, which shows that the energy error \(\Vert A^{1/2}(\nabla u-\nabla _w u_{k})\Vert _{{\mathcal {T}}_k}\) is controlled by the error estimator \( \eta (u_{k}, {\mathcal {T}}_k)\) and that the energy error and the error estimator decay at approximately the same rate.

From the above numerical examples, we see that the AWG method introduced in Sect. 3 is convergent, and the examples also indicate the following quasi-optimality:

$$\begin{aligned} \Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}}\lesssim (\# {\mathcal {T}}_k)^{-1/d}, d=2, 3. \end{aligned}$$
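An observed rate of this kind is usually read off as the slope of a least-squares line through the points \((\log \# {\mathcal {T}}_k, \log \Vert A^{1/2}(\nabla u - \nabla _w u_{k})\Vert _{{\mathcal {T}}_{k}})\). The Python sketch below performs such a fit. It is not the procedure used in the experiments above (those were run in MATLAB with iFEM), and the function name and any data passed to it are hypothetical.

```python
import math


def fitted_rate(num_elements, errors):
    """Least-squares slope of log(error) against log(number of elements).

    A slope close to -1/d is consistent with error <~ (#T)^(-1/d).
    """
    if len(num_elements) != len(errors) or len(errors) < 2:
        raise ValueError("need at least two (size, error) pairs")
    xs = [math.log(n) for n in num_elements]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx
```

Applied to synthetic data of the form `errors = [3.0 * n ** -0.5 for n in sizes]`, the function recovers the slope \(-1/2\) up to rounding; on measured data the first few coarse meshes are often excluded from the fit, since the asymptotic rate is not yet visible there.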