1 Introduction

The goal of this paper is to formulate and analyze a weak Galerkin method for solving an elliptic problem with a low regularity solution. We consider a second order elliptic equation with a homogeneous Dirichlet boundary condition:

$$\begin{aligned} -\nabla \cdot A\nabla u= & {} f, \quad \mathrm {in}\ \Omega , \end{aligned}$$
(1.1)
$$\begin{aligned} u= & {} 0,\quad \mathrm {on}\ \partial \Omega , \end{aligned}$$
(1.2)

where \(\Omega \subset \mathbb {R}^2\) is an open bounded polygonal domain, and the coefficient A is either a positive piecewise constant or a symmetric, uniformly positive definite matrix-valued function \(A:=\{ a_{ij}(x)\},\, a_{ij}=a_{ji}\in W^{1,\infty }(\Omega )\) in \(\mathbb {R}^{2\times 2}\), i.e., there exists a constant \(\gamma >0\) such that

$$\begin{aligned} \sum \limits _{i,j=1}^{2}a_{ij}(x)\xi _i\xi _j\ge \gamma \left( \xi _1^2+\xi _2^2\right) ,\quad \forall \,\xi :=(\xi _1,\xi _2)\in \mathbb {R}^{2}, \, x\in \bar{\Omega }. \end{aligned}$$

The force term \(f\in L^p(\Omega )\) is a given function for some \(p\in (1,\infty ]\).

We shall use standard notation for Sobolev spaces (see [1]). For any nonnegative integer s and \(r\ge 1\), the classical Sobolev space on a bounded domain \(D\subset \mathbb {R}^2\) is

$$\begin{aligned} W^{s,r}(D)=\big \{v\in L^r(D)|\,\partial ^n v\in L^r(D),\, \forall \, |n|\le s\big \}, \end{aligned}$$

where \(\partial ^n v\) denotes the partial derivative of v with multi-index n and \(L^r(D)\) is the space of all (scalar-valued) functions on D for which the corresponding \(L^r\)-norm

$$\begin{aligned} \Vert v\Vert _{L^r(D)}=\left\{ \begin{array}{ll} \Big (\int _D |v(x)|^rdx\Big )^{1/r},&{}\quad r\in [1,\infty ),\\ \text {ess}~{\sup }_{x\in D} |v(x)|, &{}\quad r=\infty , \end{array}\right. \end{aligned}$$

is bounded. The corresponding norm in \(W^{s,r}(D)\) is denoted by \(\Vert \cdot \Vert _{W^{s,r}(D)}\) and the semi-norm by \(|\cdot |_{W^{s,r}(D)}.\) The \(L^2\) inner product is denoted by \((\cdot , \cdot )_D\), or by \((\cdot , \cdot )\) if \(D =\Omega \). For the Hilbert space \(H^s(D) = W^{s,2}(D)\), the norm is denoted by \(\Vert \cdot \Vert _{H^s(D)}\). We denote by \(H^1_ 0(D)\) the subspace of functions in \(H^1(D)\) that vanish on \(\partial D\). Throughout the paper, boldface characters denote vector quantities.

The standard weak formulation of (1.1)–(1.2) is to find \(u\in H^1_0(\Omega )\) such that

$$\begin{aligned} (A\nabla u, \nabla v)&=(f,v),&\quad \forall \, v\in H^1_0(\Omega ). \end{aligned}$$
(1.3)

Continuous and piecewise polynomials are used in the classical (conforming) finite element methods, but the use of discontinuous functions in the finite element approximations often provides much required flexibility in handling complex problems.

A variety of articles deal with discontinuous Galerkin (DG) discretizations of elliptic problems under standard regularity assumptions (e.g., for solutions in \(W^{2,2}(\Omega )\) or \(W^{\frac{3}{2}+\epsilon ,2}(\Omega )\) with \(\epsilon >0\)); see [2, 3, 13] and the references therein for different types of DG methods. Deriving algebraic convergence rates for DG methods is considerably more difficult for low regularity solutions, even when the numerical flux is treated delicately. For solutions belonging to weighted Sobolev spaces (based on a weighted \(W^{2,2}\)-norm), DG methods have been analyzed in many papers for elliptic problems with corner singularities in polygons. In Ref. [16], interior penalty DG methods, including the symmetric interior penalty Galerkin (SIPG) and non-symmetric interior penalty Galerkin (NIPG) schemes, were presented and an error analysis was carried out in an energy norm for solutions in \(W^{2,p}\), \(p\in (1,2]\).

Recently, using a well-defined weak gradient, Wang and Ye [15] developed a new weak Galerkin finite element (WG) method for second order elliptic equations formulated as a system of two first order linear equations, and presented optimal a priori error estimates in both discrete \(H^1\) and \(L^2\) norms, as well as a residual type a posteriori error estimator [4]. In most studies, the weak Galerkin finite element methods are analyzed for exact solutions in the Sobolev space \(W^{k+1,2}(\Omega )\) (see [8–10, 14]). For high regularity solutions, a stable and efficient stabilization term has been presented in [15]. Compared with DG finite element methods, WG methods have several attractive features, such as continuous normal flux across element interfaces, fewer unknowns, and no need to choose penalty factors.

The motivation of this paper is to use a relaxed stabilizer in the weak Galerkin method to obtain an improved approximation of the low regularity solution of problem (1.1)–(1.2). To this end, we introduce a relaxed power index \(\beta \) on the mesh size h in the stabilization of [15]. Without adding any penalty factor, we investigate optimal error estimates for low regularity solutions in the \(L^p\) setting with \(1<p\le 2\). Two questions are of particular interest: which value of the index \(\beta \) is optimal in the numerical analysis for low regularity solutions, and whether the choice \(\beta =1\) is recovered in the case of high regularity solutions. In this paper, we investigate a WG method with an over-relaxed stabilization term for solutions in \(H^1(\Omega )\cap W^{2,p}(\Omega )\) with \(p\in (1,2]\), and develop a priori error estimates in the standard energy norm and in the \(L^2\) and \(L^p\) norms. Furthermore, we prove that the over-relaxed WG method converges at an (optimal) algebraic rate even if \(p\in (1,2]\).

The following Sobolev embedding and regularity results will be used in our analysis.

Lemma 1.1

([1, Theorem 4.12]) For \(p\in [1,\infty ]\), let \(D\subset \mathbb {R}^2\) be a bounded open Lipschitz domain. Then, the embedding \(W^{1,p}(D)\hookrightarrow L^q(D)\) is continuous for all \(q\in [1,\infty )\), if \(p\ge 2\), and for all \(q\in [1,\frac{2p}{2-p}]\) if \(p<2\).

Lemma 1.2

([7, 16]) The variational formulation (1.3) has a unique solution in \(W^{1,2}(\Omega )\) (equivalently, in \(H^1(\Omega )\)) for any \(f\in L^p(\Omega )\) with \(p\in (1,\infty ).\)

Lemma 1.3

([16]) Given \(\mu >1\), for any \(f\in L^p(\Omega )\) with \(p\in (1,\mu )\), the solution of (1.1)–(1.2) belongs to \(X:=W^{2,p}(\Omega )\cap H^1_0(\Omega )\) and the regularity estimate holds

$$\begin{aligned} \Vert u\Vert _{X}\le C\Vert f\Vert _{L^p(\Omega )}, \end{aligned}$$
(1.4)

where C is a positive constant independent of u.

The rest of this paper is organized as follows. In Sect. 2, we recall the definition of the weak gradient and its discrete approximation, and then present an over-relaxed WG approximation. In Sect. 3, we derive an a priori error estimate in the energy norm and present some key inequalities. In Sect. 4, error estimates in the \(L^p\) and \(L^2\) norms are derived by a duality argument. In the last section, we present numerical experiments, including three low-regularity problems, to show the effectiveness and convergence rates of the over-relaxed weak Galerkin method.

2 Weak Galerkin Methods with an Over-Relaxed Stabilization

Let \({\mathcal {T}}_h\) be a partition of the domain \(\Omega \) into polygons, \({\mathcal {E}}_h\) be the set of all edges, and \({\mathcal {E}}_h^0\) be the set of all interior edges. For any \(T\in {\mathcal {T}}_h\), we denote its diameter by \(h_T\) and its boundary by \(\partial T.\) Further, let \(h = \max _{T\in {\mathcal {T}}_h}h_T\) denote the characteristic mesh size of the whole partition. We assume that the partition is shape regular, see [15].

On the partition \({\mathcal {T}}_h\), we define a broken Sobolev space, for \(s\in \mathbb {N}\) and \(r\in [1,\infty ]\), by

$$\begin{aligned} W^{s,r}({\mathcal {T}}_h)=\{v\in L^r(\Omega )|v\in W^{s,r}(T), \, \forall \,T\in {\mathcal {T}}_h\}. \end{aligned}$$

On an element T, we define a weak function by \(v=\{v_0,v_b\}\) such that \(v_0\in L^2(T)\) and \(v_b\in L^2(\partial T),\) where \(v_0\) and \(v_b\) represent the values of v in the interior and on the boundary of T, respectively. Let \({\mathbb {P}}_j(T)\) be the set of polynomials on T with degree no more than j. For a given finite element mesh \({\mathcal {T}}_h\) and polynomial degrees \(j,\,l\ge 1\), we consider the following finite element spaces

$$\begin{aligned} V_h&= \{v=\{v_0,v_b\}:v|_T\in {\mathbb {P}}_j(T)\times {\mathbb {P}}_l(e),e\subset \partial T,\ \forall \, T\in {\mathcal {T}}_h\},\\ V_h^0&= \{v=\{v_0,v_b\}:v\in V_h,v_b|_e=0,e\subset \partial \Omega \},\\ U_h&{=\{v_0\in L^2(\Omega ): ~v_0|_{T}\in {\mathbb {P}}_j(T), \,\forall \, T\in {\mathcal {T}}_h\}.} \end{aligned}$$

Here we take \(j=l=k\) for a fixed positive integer \(k\ge 1\) and the function \(v_b\) is not necessarily the trace of \(v_0\) on \(\partial T\). The component \(v_0\) is defined element-wise and may be discontinuous with respect to \(v_b\). The idea of polynomial reduction [12] presents an optimal combination for the polynomial spaces. For instance, one can use \(({\mathbb {P}}_{k}(T),{\mathbb {P}}_{k-1}(e),[{\mathbb {P}}_{k-1}(T)]^2)\) instead of \(({\mathbb {P}}_{k}(T),{\mathbb {P}}_{k}(e),[{\mathbb {P}}_{k-1}(T)]^2)\), to minimize the number of unknowns in the WG scheme without compromising the accuracy of the stabilized approximations [11, 12].

We denote by \(Q_0\) and \(Q_b\) the \(L^2\) projection operators from \(L^2(T)\) onto \({\mathbb {P}}_k(T)\) and from \(L^2(e)\) onto \({\mathbb {P}}_k(e)\), respectively. We write \(Q_h=\{Q_0,Q_b\}\). Moreover, let \(\mathcal {Q}_h\) be the \(L^2\) projection from \([L^2(T)]^2\) onto the local discrete gradient space \([{\mathbb {P}}_{k-1}(T)]^2.\) The weak gradient is defined by \(\nabla _w v\in [{\mathbb {P}}_{k-1}(T)]^2\) for any function \(v\in V_h\) satisfying

$$\begin{aligned} (\nabla _w v, q)_T=-(v_0,\nabla \cdot q)_T+\langle v_b,q\cdot \varvec{n}\rangle _{\partial T},\quad \forall \, q\in [{\mathbb {P}}_{k-1}(T)]^2, \end{aligned}$$
(2.1)

where \(\langle \cdot ,\cdot \rangle _{\partial T}\) stands for the usual inner product in \(L^2(\partial T)\). For any \(u_h=\{u_0,u_b\},\,v_h=\{v_0,v_b\}\in V_h\), we introduce the following bilinear form

$$\begin{aligned} (A\nabla _wu_h,\nabla _wv_h):=\sum \limits _{T\in {\mathcal {T}}_h}(A\nabla _wu_h,\nabla _wv_h)_T. \end{aligned}$$
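As an illustration of definition (2.1), the following Python sketch computes the weak gradient in the lowest-order case \(k=1\) on a single triangle. It is a minimal sketch under stated assumptions, not the implementation used in this paper: T is assumed to be a triangle with counterclockwise vertices, the gradient space is \([{\mathbb {P}}_0(T)]^2\), and the function name `weak_gradient_p0` and its arguments are hypothetical. For a constant test field q the term \((v_0,\nabla \cdot q)_T\) in (2.1) vanishes, so the weak gradient reduces to \(|T|^{-1}\sum _{e\subset \partial T}\int _e v_b\varvec{n}\,ds\), and for \(v_b\in {\mathbb {P}}_1(e)\) each edge integral equals \(|e|\) times the midpoint value of \(v_b\).

```python
def weak_gradient_p0(vertices, vb_mid):
    """Weak gradient in [P_0(T)]^2 of v = {v_0, v_b} on a triangle T.

    vertices: three (x, y) pairs in counterclockwise order (assumption).
    vb_mid:   values of v_b at the midpoints of the edges
              vertices[i] -> vertices[i+1]; exact edge means for v_b in P_1(e).
    """
    (x0, y0), (x1, y1), (x2, y2) = vertices
    # Signed area; positive for counterclockwise ordering.
    area = 0.5 * ((x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0))
    if area <= 0:
        raise ValueError("vertices must be in counterclockwise order")
    gx = gy = 0.0
    for i in range(3):
        ax, ay = vertices[i]
        bx, by = vertices[(i + 1) % 3]
        tx, ty = bx - ax, by - ay
        # |e| * n_e: the edge tangent rotated clockwise by 90 degrees
        # is the outward normal scaled by the edge length.
        gx += vb_mid[i] * ty
        gy += vb_mid[i] * (-tx)
    return gx / area, gy / area


def u(x, y):
    """Hypothetical test function: u(x, y) = 2x - 3y + 1."""
    return 2.0 * x - 3.0 * y + 1.0


# Usage: v_b is the trace of u on the reference triangle.
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
mids = [((tri[i][0] + tri[(i + 1) % 3][0]) / 2,
         (tri[i][1] + tri[(i + 1) % 3][1]) / 2) for i in range(3)]
grad = weak_gradient_p0(tri, [u(x, y) for x, y in mids])
```

In this usage `grad` equals the classical gradient \((2,-3)\) of the linear function, as expected when \(v_b\) is the trace of a linear function; the interior component \(v_0\) does not enter because the divergence of a constant field is zero.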

Weak Galerkin Algorithm. A numerical approximation for (1.1)–(1.2) can be obtained by seeking \(u_h=\{u_0,u_b\}\in V_h^0\) such that for any \( v_h=\{v_0,v_b\}\in V_h^0,\)

$$\begin{aligned} a(u_h,v_h):=(A\nabla _wu_h,\nabla _wv_h)+\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }} \langle u_0-u_b,v_0-v_b\rangle _e = (f,v_0), \end{aligned}$$
(2.2)

where \(\beta \) is a positive number to be defined later.

Next, we justify the well-posedness of the scheme (2.2). For any \(v\in V_h\), we define an energy norm by

$$\begin{aligned} |||v |||:=\sqrt{a(v,v)}. \end{aligned}$$

Note that \( |||\cdot |||\) defines a semi-norm in \(V_h\). However, it defines a norm in \(V_h^0\). To verify this, it suffices to check the positivity property for \( |||\cdot |||\). To this end, assume that \(v\in V_h^0\) and \(|||v |||=0\). It follows that

$$\begin{aligned} (A\nabla _wv,\nabla _wv)+\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }} \int _e|Q_bv_0-v_b\big |^2ds =0, \end{aligned}$$

which implies that \(\nabla _w v=0\) on each element K and \(Q_bv_0=v_b\) on e. By the definition of the weak gradient, it holds

$$\begin{aligned} (\nabla _w v,\tau )_T=(\nabla v_0,\tau )_T-\langle v_0-v_b,\tau \cdot \varvec{n}\rangle _{\partial T}=(\nabla v_0,\tau )_T-\langle Q_bv_0-v_b,\tau \cdot \varvec{n}\rangle _{\partial T}. \end{aligned}$$

Taking \(\tau =\nabla v_0\) and using \(\nabla _w v=0\) and \(Q_bv_0=v_b\) gives \(\nabla v_0=0\); thus \(v_0=const\) on every \(K\in {\mathcal {T}}_h\). For any edge \(e\in {\mathcal {E}}_h^0\), there exist two elements \(K_1\) and \(K_2\) in \({\mathcal {T}}_h\) such that \(e=\partial K_1\cap \partial K_2\). This, together with the fact that \(Q_bv_0=v_b\) on e from both \(K_1\) and \(K_2\) and that \(v_b=0\) on \(\partial \Omega \), implies that \(v_0=v_b=0\).

Lemma 2.1

The weak Galerkin finite element scheme (2.2) has a unique solution.

Proof

It suffices to prove the uniqueness. If \(u_h^{(1)}\) and \(u_h^{(2)}\) are two solutions of (2.2), then \(e_h=u_h^{(1)}-u_h^{(2)}\) would satisfy the following equation

$$\begin{aligned} a(e_h,v)=0,~~~\forall \, v\in V_h^0. \end{aligned}$$

Note that \(e_h\in V_h^0\). Then by letting \(v=e_h\) in the above equation we arrive at

$$\begin{aligned} |||e_h |||^2=a(e_h,e_h)=0. \end{aligned}$$

It follows that \(e_h\equiv 0,\) or equivalently, \(u_h^{(1)}\equiv u_h^{(2)}\). This completes the proof of the lemma. \(\square \)

3 Error Estimate in Energy Norm

Analogously to Lemma 5.1 in [11], the following lemma will be used several times.

Lemma 3.1

Let \(Q_h\) and \(\mathcal {Q}_h\) be the \(L^2\) projection operators as defined above. Then, on each element \(T\in {\mathcal {T}}_h\), we have the following properties: for any \(v\in X,\,\tau \in [{\mathbb {P}}_{k-1}(T)]^2\),

$$\begin{aligned} (A\nabla _w(Q_h v), \tau )_T=(\mathcal {Q}_h(A\nabla v),\tau )_T. \end{aligned}$$
(3.1)

Proof

It follows from the definition of \(\nabla _w\), the symmetry of A, and integration by parts that

$$\begin{aligned} (A\nabla _w(Q_h v), \tau )_T&=(\nabla _w(Q_h v), A\tau )_T\\&=-(Q_0v, \nabla \cdot (A\tau ))_T+\langle Q_bv, (A\tau )\cdot \varvec{n}\rangle _{\partial T}\\&=-(v, \nabla \cdot (A\tau ))_T+\langle v, (A\tau )\cdot \varvec{n}\rangle _{\partial T}\\&=(A\nabla v,\tau )_T\\&=(\mathcal {Q}_h(A\nabla v),\tau )_T. \end{aligned}$$

\(\square \)

Let \(e_h=\{e_0,e_b\}=\{Q_0u-u_0,Q_bu-u_b\}\). We have the error equation between \(Q_h u\) and \(u_h\) as follows:

$$\begin{aligned} a(e_h,v_h)&=(A\nabla _we_h,\nabla _wv_h)+\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }} \langle e_0-e_b,v_0-v_b\rangle _e\nonumber \\&= \sum _{e\in {\mathcal {E}}_h}\langle {\big (A\frac{\partial u}{\partial \varvec{n}}-\mathcal {Q}_h(A\nabla u)\cdot \varvec{n}\big )},v_0-v_b\rangle _e+ \sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\langle Q_0 u-Q_b u, v_0-v_b\rangle _e\nonumber \\&=:l_1(u,v_h)+l_2(u,v_h). \end{aligned}$$
(3.2)

Indeed, for any \(v_h\in V_h^0\), testing (1.1) with \(v_0\), and using integration by parts, the continuity of \(\nabla u\cdot \varvec{n}\) on interior edge e and Lemma 3.1 leads to

$$\begin{aligned}&(-\nabla \cdot (A\nabla u), v_0)=\sum _{T\in {\mathcal {T}}_h}(A\nabla u, \nabla v_0)_T-\sum _{T\in {\mathcal {T}}_h}\langle v_0,A\nabla u\cdot \varvec{n}\rangle _{\partial T}\nonumber \\&\quad =\sum _{T\in {\mathcal {T}}_h}(\mathcal {Q}_h(A \nabla u),\nabla v_0)_T-\sum _{T\in {\mathcal {T}}_h}\langle v_0-v_b,A\nabla u\cdot \varvec{n}\rangle _{\partial T}-\sum _{e\subset \partial \Omega }\langle v_b,A\nabla u\cdot \varvec{n}\rangle _{e}\nonumber \\&\quad =\sum _{T\in {\mathcal {T}}_h}(\mathcal {Q}_h(A\nabla u),\nabla _w v_h)_T -\sum _{e\in {\mathcal {E}}_h}\langle v_0-v_b,A\frac{\partial u}{\partial \varvec{n}}-\mathcal {Q}_h(A\nabla u)\cdot \varvec{n}\rangle _e\nonumber \\&\quad =(A\nabla _w(Q_h u),\nabla _w v_h)-\sum _{e\in {\mathcal {E}}_h}\langle v_0-v_b,A\frac{\partial u}{\partial \varvec{n}}-\mathcal {Q}_h(A\nabla u)\cdot \varvec{n}\rangle _e, \end{aligned}$$
(3.3)

where in the second-to-last identity we have used the fact \(v_b|_{\partial \Omega }=0\) and

$$\begin{aligned} (\mathcal {Q}_h(A\nabla u),\nabla _w v_h)_T&=-(v_0,\nabla \cdot \mathcal {Q}_h(A\nabla u))_T+\langle v_b, \mathcal {Q}_h(A\nabla u)\cdot \varvec{n}\rangle _{\partial T}\\&=(\mathcal {Q}_h(A\nabla u),\nabla v_0)_T-\langle v_0-v_b, \mathcal {Q}_h(A\nabla u)\cdot \varvec{n}\rangle _{\partial T}. \end{aligned}$$

Consequently, subtracting (2.2) from (3.3) yields (3.2).

For any two neighboring elements \(K_1,\,K_2\in {\mathcal {T}}_h\) sharing an edge \(e\in {\mathcal {E}}_h\), let \(\Omega _e=(\bar{K}_1\cup \bar{K}_2)^{0}\) and \(p\in (1,2]\). Note that if \(w\in W^{1,p}(\Omega _e)\), then w is continuous on e; if \(w\in W^{2,p}(\Omega _e)\), then \(\nabla w\) is continuous on e.

Lemma 3.2

For \(q\in [2,+\infty )\), there holds

$$\begin{aligned} \Vert h^{-\frac{1}{q}}(v_0-v_b)\Vert _{L^q(e)}\le C\Vert h^{-\frac{1}{2}}(v_0-v_b)\Vert _{L^2(e)},\quad \forall \, e\in {\mathcal {E}}_h,\,v\in V_h, \end{aligned}$$
(3.4)

equivalently,

$$\begin{aligned} \sum _{e\in {\mathcal {E}}_h}\int _{e}h^{-1}|v_0-v_b|^qds\le C\sum _{e\in {\mathcal {E}}_h}\Big (\int _{e}h^{-1}|v_0-v_b|^2ds\Big )^{\frac{q}{2}}. \end{aligned}$$
(3.5)

Proof

To prove (3.4), we first write \(\psi =v_0-v_b\) and define a reference element by \((\hat{e},\hat{\psi },\hat{{\mathbb {P}}}_k)\) with an invertible affine mapping such that the two finite elements \((e,\psi ,{\mathbb {P}}_k)\) and \((\hat{e},\hat{\psi },\hat{{\mathbb {P}}}_k)\) are affine-equivalent (see the details in [5]). It holds by Theorems 3.1.2 and 3.1.3 in [5] and by norm equivalence for \(\hat{\psi }\)

$$\begin{aligned} h^{-\frac{1}{q}}\Vert \psi \Vert _{L^q(e)} \le C\Vert \hat{\psi }\Vert _{L^q(\hat{e})}\le C\Vert \hat{\psi }\Vert _{L^2(\hat{e})}\le Ch^{-\frac{1}{2}}\Vert \psi \Vert _{L^2(e)}. \end{aligned}$$

Thus, (3.4) follows.

Raising (3.4) to the power q and summing over all edges \(e\in {\mathcal {E}}_h\) gives (3.5). Moreover, by using the fact

$$\begin{aligned} \Big (\sum \limits _{j=1}^n a_j^s\Big )^{1/s}\le \Big (\sum \limits _{j=1}^n a_j^r\Big )^{1/r}, \quad \text {for}~~ 0<r\le s,\, a_j\ge 0, \end{aligned}$$

and taking \(r=1,\, s=\frac{q}{2}\ge 1\), we have

$$\begin{aligned} \Big (\sum _{e\in {\mathcal {E}}_h}\Big (\int _{e}h^{-1}|v_0-v_b|^2ds\Big )^{\frac{q}{2}}\Big )^{\frac{2}{q}}\le C\sum _{e\in {\mathcal {E}}_h}\int _{e}h^{-1}|v_0-v_b|^2ds, \end{aligned}$$

which, combined with (3.5), is the form used in (3.13). \(\square \)

We need the following interpolation lemma from [16].

Lemma 3.3

Let \(p\in (1,2]\), and \(u\in W^{l+1,p}(\Omega )\) with \(l\ge 1\). Then there exists an interpolant \(\Pi :\, W^{l+1,p}(\Omega )\rightarrow U_h\) such that for all \(T\in {\mathcal {T}}_h\) there holds

$$\begin{aligned} |u-\Pi u|_{W^{m,p}(T)}\le C h_T^{l+1-m}\Vert u\Vert _{W^{l+1,p}(T)},\quad 0\le m\le l+1, \end{aligned}$$
(3.6)

where \(l\ge 1\) is the polynomial degree in the approximation space \(V_h\). Furthermore,

$$\begin{aligned} \sum _{T\in {\mathcal {T}}_h}\Big (h_T^{-2}\Vert u-\Pi u\Vert _{L^2(T)}^2+\Vert \nabla (u-\Pi u)\Vert _{L^2(T)}^2\Big )\le C\Big (\sum _{T\in {\mathcal {T}}_h} h_T^{p(l+1)-2}\Vert u\Vert _{W^{l+1,p}(T)}^p\Big )^{\frac{2}{p}}. \end{aligned}$$
(3.7)

Now we give the main optimal convergence theorem for the error \(e_h\) in the energy norm.

Theorem 3.1

Let \(u_h\in V_h^0\) be the weak Galerkin finite element solution of (2.2) arising from the problem (1.1)–(1.2). Assume that the exact solution satisfies \(u\in W^{l+1,p}({\mathcal {T}}_h)\cap W^{2,p}(\Omega )\), where \(l\ge 1\) and \(p\in (1,2].\) Then there exists a constant C such that

$$\begin{aligned} |||Q_h u-u_h |||\le C\Big (h^{(\frac{\beta +1}{2}+l-\frac{2}{p})}+h^{(\frac{1-\beta }{2}+l+1-\frac{p}{2})}\Big )\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}. \end{aligned}$$
(3.8)

When the index \(\beta =1+\frac{2}{p}-\frac{p}{2}\) is taken, we have the optimal estimate in the energy norm

$$\begin{aligned} |||Q_h u-u_h |||\le Ch^{l+1-\frac{1}{p}-\frac{p}{4}}\Vert u\Vert _{W^{l+1,p}(\Omega )}.\end{aligned}$$
(3.9)

Especially, if \(p=2\), then \(\beta =1\) and it holds

$$\begin{aligned} |||Q_hu-u_h |||\le Ch^{l}\Vert u\Vert _{W^{l+1,2}(\Omega )}. \end{aligned}$$
(3.10)

Proof

The estimates (3.9) and (3.10) follow from (3.8) with the stated choice of \(\beta \). Therefore, it suffices to prove the validity of (3.8).

Taking \(v_h=e_h\) in equation (3.2), we get

$$\begin{aligned} |||e_h|||^2&=a(e_h,e_h)=l_1(u,e_h)+l_2(u,e_h). \end{aligned}$$
(3.11)

Thus we need to bound the terms \(l_1\) and \(l_2\). For the term \(l_1\), by the Hölder inequality and the boundedness of A, it holds

$$\begin{aligned} \begin{aligned}&|l_1(u,e_h)|\\&\quad =\Big |\sum _{e\in {\mathcal {E}}_h}\langle {\big (A\nabla u-\mathcal {Q}_h(A\nabla u)\big )\cdot \varvec{n}},e_0-e_b\rangle _e\Big |\\&\quad \le C\Big (\sum _{T\in {\mathcal {T}}_h}h_T^{p-\bar{c}}\int _{\partial T} {\big |(A\nabla u-\mathcal {Q}_h(A\nabla u))\cdot \varvec{n}\big |^p} ds\Big )^{\frac{1}{p}}\Big (\sum _{T\in {\mathcal {T}}_h}\int _{\partial T}\frac{1}{h^{\beta }}|e_0-e_b|^{p^*}ds\Big )^{\frac{1}{p^*}}\\&\quad \le C\Big (\sum _{T\in {\mathcal {T}}_h}h_T^{p-\bar{c}}\int _{\partial T} {\big |(\nabla u-\mathcal {Q}_h(\nabla u))\cdot \varvec{n}\big |^p} ds\Big )^{\frac{1}{p}}\Big (\sum _{T\in {\mathcal {T}}_h}\int _{\partial T}\frac{1}{h^{\beta }}|e_0-e_b|^{p^*}ds\Big )^{\frac{1}{p^*}}, \end{aligned} \end{aligned}$$
(3.12)

where \(\frac{1}{p}+\frac{1}{p^*}=1\) with \(p^*\in [2,\infty )\) and \(\bar{c}=p-(p-1)\beta .\)

We then give the following estimate from Lemma 3.2

$$\begin{aligned}&\Big (\sum _{T\in {\mathcal {T}}_h}\int _{\partial T}\frac{1}{h^{\beta }}|e_0-e_b|^{p^*}ds\Big )^{\frac{1}{p^*}}\nonumber \\&\quad \le Ch^{\frac{1-\beta }{p^*}}\Big (\sum _{T\in {\mathcal {T}}_h}\int _{\partial T}h^{-1}|e_0-e_b|^{p^*}ds\Big )^{\frac{1}{p^*}}\nonumber \\&\quad \le Ch^{\frac{1-\beta }{p^*}} \Big (\sum _{T\in {\mathcal {T}}_h}\int _{\partial T}h^{-1}|e_0-e_b|^{2}ds\Big )^{\frac{1}{2}}\nonumber \\&\quad \le Ch^{(\frac{1}{2}-\frac{1}{p^*})(\beta -1)}{\Big (\sum _{T\in {\mathcal {T}}_h}\int _{\partial T}h^{-\beta }|e_0-e_b|^{2}ds\Big )^{\frac{1}{2}}}\nonumber \\&\quad \le Ch^{(\frac{1}{2}-\frac{1}{p^*})(\beta -1)}|||e_h |||. \end{aligned}$$
(3.13)

Applying the trace inequality and scaling argument results in

$$\begin{aligned}\begin{aligned}&\sum _{T\in {\mathcal {T}}_h}h_T^{p-\bar{c}}\big \Vert (\nabla u-\mathcal {Q}_h\nabla u)\cdot \varvec{n}\big \Vert _{L^p(\partial T)}^p\\&\quad \le C\sum _{T\in {\mathcal {T}}_h} h^{(p-1)\beta }_T\Big (h_T^{-\frac{1}{p}}\Vert \nabla u-\mathcal {Q}_h\nabla u\Vert _{L^p(T)}+h_T^{1-\frac{1}{p}}\Vert \nabla (\nabla u-\mathcal {Q}_h\nabla u)\Vert _{L^p(T)}\Big )^p, \end{aligned}\end{aligned}$$

which implies by applying (3.6)

$$\begin{aligned} \sum _{T\in {\mathcal {T}}_h}h_T^{p-\bar{c}}\big \Vert (\nabla u-\mathcal {Q}_h\nabla u)\cdot \varvec{n}\big \Vert _{L^p(\partial T)}^p\le C\sum _{T\in {\mathcal {T}}_h}h_T^{(p-1)\beta +pl-1}\big \Vert u\Vert _{W^{l+1,p}(T)}^p. \end{aligned}$$
(3.14)

Then, inserting (3.13)–(3.14) into (3.12) results in

$$\begin{aligned} |l_1(u,e_h)|\le Ch^{\frac{\beta +1}{2}+l-\frac{2}{p}}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}|||e_h |||. \end{aligned}$$
(3.15)
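As a bookkeeping check of the power of h in (3.15), assume \(h_T\le h\) and recall \(\frac{1}{p^*}=1-\frac{1}{p}\). The p-th root of the right-hand side of (3.14) contributes the power \(\frac{(p-1)\beta }{p}+l-\frac{1}{p}=\frac{\beta }{p^*}+l-\frac{1}{p}\), and (3.13) contributes \((\frac{1}{2}-\frac{1}{p^*})(\beta -1)\). Their sum is

$$\begin{aligned} \frac{\beta }{p^*}+l-\frac{1}{p}+\Big (\frac{1}{2}-\frac{1}{p^*}\Big )(\beta -1)=\frac{\beta -1}{2}+\frac{1}{p^*}+l-\frac{1}{p}=\frac{\beta +1}{2}+l-\frac{2}{p}, \end{aligned}$$

which is the exponent appearing in (3.15).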

Next, we estimate the bound of the term \(l_2\). It follows

$$\begin{aligned} |l_2(u,e_h)|&=\Big |\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\langle Q_0 u-Q_b u, e_0-e_b\rangle _e\Big |\nonumber \\&=\Big |\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\langle Q_0 u-u, e_0-e_b\rangle _e\Big |\nonumber \\&\le C\Big (\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\Vert Q_0 u-u\Vert _{L^2(e)}^2\Big )^{\frac{1}{2}}\Big (\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\Vert e_0-e_b\Vert _{L^2(e)}^2\Big )^{\frac{1}{2}}. \end{aligned}$$
(3.16)

By the trace inequality and elementwise scaling, we get

$$\begin{aligned}&\Big (\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\Vert Q_0 u-u\Vert _{L^2(e)}^2\Big )^{\frac{1}{2}}\nonumber \\&\quad \le C\Big (\sum _{T\in {\mathcal {T}}_h} \big (h^{-1-\beta }_T\Vert {Q}_0 u-u\Vert ^2_{L^2(T)}+h_T^{1-\beta }\Vert \nabla {Q}_0 u-\nabla u\Vert ^2_{L^2(T)}\big )\Big )^{\frac{1}{2}}\nonumber \\&\quad \le C h^{\frac{1-\beta }{2}+(l+1)-\frac{p}{2}}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}, \end{aligned}$$
(3.17)

where in the last inequality we have used (3.7). Then substituting (3.17) into (3.16) gives

$$\begin{aligned} |l_2(u,e_h)|&\le Ch^{\frac{1-\beta }{2}+(l+1)-\frac{p}{2}}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Big (\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\Vert e_0-e_b\Vert _{L^2(e)}^2\Big )^{\frac{1}{2}}\nonumber \\&\le Ch^{\frac{1-\beta }{2}+(l+1)-\frac{p}{2}}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}|||e_h |||. \end{aligned}$$
(3.18)

Consequently, combining (3.11) with (3.15) and (3.18) and dividing by \(|||e_h |||\) yields

$$\begin{aligned} |||e_h |||\le C\Big (h^{\frac{\beta +1}{2}+l-\frac{2}{p}}+h^{\frac{1-\beta }{2}+(l+1)-\frac{p}{2}}\Big )\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}. \end{aligned}$$
(3.19)

This completes the proof of the theorem. \(\square \)

Remark 3.1

Note that we prove the optimal convergence rate \(O(h^{l+1-\frac{1}{p}-\frac{p}{4}})\) of the error \(Q_hu-u_h\) in the energy norm for the choice \(\beta =1+\frac{2}{p}-\frac{p}{2}\). When p is close to 1, the rate approaches \(O(h^{l-\frac{1}{4}})\) with \(\beta \) close to \(\frac{5}{2}\), which shows that the relaxed stabilization term is suited to low regularity solutions.
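The balancing of the two powers of h in (3.8) can be checked with exact rational arithmetic. The following Python sketch is only an illustration of the formulas in Theorem 3.1; the function names `optimal_beta`, `energy_exponents` and `energy_rate` are hypothetical and do not come from any library, and \(p=1\) is accepted only as the limiting case discussed in this remark.

```python
from fractions import Fraction


def optimal_beta(p):
    """Return beta = 1 + 2/p - p/2, which equalizes the two powers of h in (3.8)."""
    p = Fraction(p)
    if not 1 <= p <= 2:
        raise ValueError("expected p in (1, 2]; p = 1 is only the limiting case")
    return 1 + 2 / p - p / 2


def energy_exponents(beta, l, p):
    """Return the two powers of h on the right-hand side of (3.8)."""
    beta, p = Fraction(beta), Fraction(p)
    from_l1 = (beta + 1) / 2 + l - 2 / p        # contribution of l_1, see (3.15)
    from_l2 = (1 - beta) / 2 + l + 1 - p / 2    # contribution of l_2, see (3.18)
    return from_l1, from_l2


def energy_rate(l, p):
    """Rate in (3.9): the smaller of the two exponents at the balancing beta."""
    return min(energy_exponents(optimal_beta(p), l, p))


# Usage: linear elements (l = 1) for a few values of p.
for p in (Fraction(2), Fraction(3, 2), Fraction(4, 3)):
    print(p, optimal_beta(p), energy_rate(1, p))
```

Passing `Fraction` values (or integers) for p keeps every quantity exact, so the equality of the two exponents at the balancing \(\beta \) can be tested without rounding error.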

Remark 3.2

From (3.17), we notice that the convergence rate \(h^{\frac{1-\beta }{2}+(l+1)-\frac{p}{2}}\) mainly comes from the oscillations of u and \(\nabla u\). Therefore, when u is smooth enough (at least \(p=2\)) and \(\beta =1\), the error estimate in the energy norm remains optimal, of order \(O(h^l)\), which agrees with the theoretical results in [14].

4 Error Estimate in \(L^p\) and \(L^2\) Norms

In this section, we derive estimates of the error \(Q_0u-u_0\) in the \(L^p\) and \(L^2\) norms and identify suitable choices for \(\beta \).

First, a duality argument will be employed in our analysis for the weak Galerkin finite element approximation. To this end, we consider a dual problem that seeks \(w\in H_0^1(\Omega )\cap W^{2,q}(\Omega )\) satisfying

$$\begin{aligned}&-\nabla \cdot (A\nabla w)=|Q_0u-u_0|^{p-1}\text {sign}(Q_0u-u_0),&\qquad \text {in}\,\,\Omega ,\end{aligned}$$
(4.1)
$$\begin{aligned}&w=0,&\qquad \text {on}\,\,\partial \Omega , \end{aligned}$$
(4.2)

where \(p\in (1,2]\) and \(Q_0u-u_0 \in W^{2,p}(\Omega )\). Assume that the above dual problem has the \(W^{2,q}\)-regularity. Set \(Y:=W^{2,q}(\Omega )\cap W^{l+1,q}({\mathcal {T}}_h)\cap H^1_0(\Omega )\). There exists a constant C such that

$$\begin{aligned} \Vert w\Vert _{Y}\le C\Vert Q_0u-u_0\Vert ^{p-1}_{L^{p}(\Omega )}. \end{aligned}$$
(4.3)
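The power \(p-1\) in (4.3) can be traced to the \(L^q\) norm of the right-hand side of (4.1). As a short check, assuming \(\frac{1}{p}+\frac{1}{q}=1\) (so that \((p-1)q=p\)) and writing \(e_0=Q_0u-u_0\),

$$\begin{aligned} \big \Vert |e_0|^{p-1}\text {sign}(e_0)\big \Vert _{L^q(\Omega )}=\Big (\int _\Omega |e_0|^{(p-1)q}dx\Big )^{\frac{1}{q}}=\Vert e_0\Vert _{L^p(\Omega )}^{\frac{p}{q}}=\Vert e_0\Vert _{L^p(\Omega )}^{p-1}. \end{aligned}$$

Under the assumed \(W^{2,q}\)-regularity, bounding w by the \(L^q\) norm of this right-hand side then gives the right-hand side of (4.3).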

The space \(H_\mathrm{{div}}(\Omega )\) is defined as the set of vector-valued functions on \(\Omega \) which, together with their divergence, are square integrable. We denote by \(\Pi _h\) a projection for \(\varvec{\tau _0}\in H_\mathrm{{div}}(\Omega )\) such that \(\Pi _h\varvec{\tau _0}\in H_\mathrm{{div}}(\Omega )\), and on each \(T\in {\mathcal {T}}_h\), \(\Pi _h{\varvec{\tau _0}}\in RT_k(T)\) and the following identity holds

$$\begin{aligned} (\nabla \cdot \varvec{\tau _0},v_0)_T=(\nabla \cdot (\Pi _h\varvec{\tau _0}),v_0)_T, \quad \quad \forall \, v_0\in {\mathbb {P}}_k(T),\, T\in {\mathcal {T}}_h, \end{aligned}$$

where \(RT_k(T)\) is the Raviart-Thomas element of order k.

Lemma 4.1

For any \(w\in W^{2,q}(\Omega )\cap H^1_0(\Omega )\) and \(u\in W^{l+1,p}(\Omega )\), for which \(\frac{1}{p}+\frac{1}{q}=1\) and \(p\in (1,2]\), there holds

$$\begin{aligned}&\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\Big (\langle e_b-e_0, Q_0w-w\rangle _e+\langle Q_0u-u, Q_0w-w\rangle _e\Big )\nonumber \\&\quad \le Ch^{l+1}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Vert w\Vert _{W^{2,q}(\Omega )}, \end{aligned}$$
(4.4)

where \(\beta \in [1,1+\frac{2}{p}-\frac{p}{2}]\).

Proof

From the Cauchy–Schwarz inequality, the definition of \(|||\cdot |||\), and the trace inequality [15, Lemma A.3], it follows that

$$\begin{aligned}&\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\langle e_b-e_0, Q_0w-w\rangle _e \nonumber \\&\quad \le C\Big (\sum _{T\in {\mathcal {T}}_h}h_T^{-\beta }\Vert e_0-e_b\Vert _{L^2(\partial T)}^2\Big )^{\frac{1}{2}}\cdot \Big (\sum _{T\in {\mathcal {T}}_h}h_T^{-\beta }\Vert Q_0w-w\Vert _{L^2(\partial T)}^2\Big )^{\frac{1}{2}}\nonumber \\&\quad \le Ch^{\frac{1-\beta }{2}}\Big (\sum _{T\in {\mathcal {T}}_h}h_T^{-\beta }\Vert e_0-e_b\Vert _{L^2(\partial T)}^2\Big )^{\frac{1}{2}}\cdot \Big (\sum _{T\in {\mathcal {T}}_h}h_T^{-1}\Vert Q_0w-w\Vert _{L^2(\partial T)}^2\Big )^{\frac{1}{2}}\nonumber \\&\quad \le Ch^{\frac{1-\beta }{2}} |||e_h |||\Big (\sum _{T\in {\mathcal {T}}_h}h_T^{-2}\Vert Q_0w-w\Vert _{L^2(T)}^2+\Vert \nabla (Q_0w-w)\Vert _{L^2(T)}^2\Big )^{\frac{1}{2}}\nonumber \\&\quad \le C\Big (h^{(1-\frac{2}{p}+l)}+h^{({2-\beta }+l-\frac{p}{2})}\Big )\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Big (\sum _{T\in {\mathcal {T}}_h}h_T^{2q-2}\Vert w\Vert _{W^{2,q}(T)}^q\Big )^{\frac{1}{q}}\nonumber \\&\quad \le C\Big (h^{(l+1)}+h^{(2-\beta +l-\frac{p}{2}+\frac{2}{p})}\Big )\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Vert w\Vert _{W^{2,q}(\Omega )}, \end{aligned}$$
(4.5)

where in the fourth inequality we have used (3.7) and Theorem 3.1.

Analogously, with the use of the Cauchy–Schwarz inequality and (3.17), we have

$$\begin{aligned}&\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\langle Q_0u-u, Q_0w-w\rangle _e\nonumber \\&\quad \le Ch^{\frac{1-\beta }{2}}\Big (\sum _{T\in {\mathcal {T}}_h}h_T^{-\beta }\Vert Q_0u-u\Vert _{L^2(\partial T)}^2\Big )^{\frac{1}{2}}\Big (\sum _{T\in {\mathcal {T}}_h}h_T^{-1}\Vert Q_0w-w\Vert _{L^2(\partial T)}^2\Big )^{\frac{1}{2}}\nonumber \\&\quad \le Ch^{\frac{1-\beta }{2}}h^{\frac{1-\beta }{2}+(l+1)-\frac{p}{2}}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Big (\sum _{T\in {\mathcal {T}}_h}h_T^{2q-2}\Vert w\Vert _{W^{2,q}(T)}^q\Big )^{\frac{1}{q}}\nonumber \\&\quad \le Ch^{(2-\beta +l-\frac{p}{2}+\frac{2}{p})}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Vert w\Vert _{W^{2,q}(\Omega )}. \end{aligned}$$
(4.6)

Since \(\beta \in [1,1+\frac{2}{p}-\frac{p}{2}]\) implies, for \(h\le 1\),

$$\begin{aligned} h^{l+1+\frac{2}{p}-\frac{p}{2}} \le h^{(2-\beta +l-\frac{p}{2}+\frac{2}{p})}\le h^{(l+1)}, \end{aligned}$$
(4.7)

adding (4.5) and (4.6) completes the proof of (4.4). \(\square \)
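The two-sided bound (4.7) on the exponent can be verified for sampled values of \(\beta \) with a few lines of Python. This is a sanity check under the assumption \(h\le 1\), not part of the proof; the function names `lemma41_exponent` and `satisfies_47` are hypothetical.

```python
from fractions import Fraction


def lemma41_exponent(beta, l, p):
    """Power of h in (4.6), which is also the second power of h in (4.5)."""
    beta, p = Fraction(beta), Fraction(p)
    return 2 - beta + l - p / 2 + 2 / p


def satisfies_47(l, p, samples=5):
    """Check l+1 <= exponent <= l+1+2/p-p/2 for beta sampled in [1, 1+2/p-p/2]."""
    p = Fraction(p)
    beta_max = 1 + 2 / p - p / 2
    # Equally spaced sample points between beta = 1 and beta = beta_max.
    betas = [1 + (beta_max - 1) * Fraction(i, samples - 1) for i in range(samples)]
    upper = l + 1 + 2 / p - p / 2
    return all(l + 1 <= lemma41_exponent(b, l, p) <= upper for b in betas)
```

For \(h\le 1\), a larger exponent gives a smaller power of h, so an exponent of at least \(l+1\) is exactly what is needed to bound both (4.5) and (4.6) by \(Ch^{l+1}\).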

Theorem 4.1

Let \(u_h\) be the weak Galerkin finite element solution to (2.2) arising from the problem (1.1). Given the factor \(\beta \in [1,1+\frac{2}{p}-\frac{p}{2} ]\), assume the exact solution \(u\in W^{2,p}(\Omega )\cap W^{l+1,p}({\mathcal {T}}_h)\) with \(l\ge 1,\, p\in (1,2]\) and \(\frac{1}{p}+\frac{1}{q}=1\). In addition, assume that the dual problem (4.1)–(4.2) has a \(W^{2,q}(\Omega )\cap H^1_0(\Omega ) \)-regularity. Then, there exists a constant C independent of h and \(\beta \) such that

$$\begin{aligned} \Vert Q_0u-u_0\Vert _{L^p({\mathcal {T}}_h)}\le C\Big (h^{2}\Vert f-Q_0f\Vert _{L^p({\mathcal {T}}_h)}+h^{l+1}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Big ), \end{aligned}$$
(4.8)

and

$$\begin{aligned}&\Vert Q_0u-u_0\Vert _{L^2({\mathcal {T}}_h)}\le C h^{1-\frac{2}{p}}\Big (h^{2}\Vert f-Q_0f\Vert _{L^p({\mathcal {T}}_h)}+h^{l+1}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Big ), \end{aligned}$$
(4.9)

where C is independent of the mesh size h and u.

Proof

Testing (4.1) with \(e_0\) we obtain

$$\begin{aligned} \begin{aligned} \Vert e_0\Vert ^p_{L^{p}(\Omega )}&=(-\nabla \cdot (A\nabla w), Q_0u-u_0)\\&=\sum _{T\in {\mathcal {T}}_h}(-\nabla \cdot (\Pi _h(A\nabla w)),Q_0u-u_0)_{T}. \end{aligned} \end{aligned}$$
(4.10)

It is well known that \(\forall \,v_h\in V_h,\,\tau \in V_h^2\)

$$\begin{aligned} \sum _{T\in {\mathcal {T}}_h}(-\nabla \cdot \Pi _h\tau ,v_0)_T=\sum _{T\in {\mathcal {T}}_h}(\Pi _h\tau ,\nabla _w v_h)_T-\sum _{T\in {\mathcal {T}}_h}\langle v_b, \Pi _h\tau \cdot \varvec{n}\rangle _{\partial T}, \end{aligned}$$

which, together with (4.10), gives

$$\begin{aligned}&\Vert Q_0u-u_0\Vert ^p_{L^{p}(\Omega )}\nonumber \\&\quad =\sum _{T\in {\mathcal {T}}_h}(\Pi _h(A\nabla w),\nabla _w (Q_hu-u_h))_T-\sum _{T\in {\mathcal {T}}_h}\langle Q_bu-u_b, \Pi _h(A\nabla w)\cdot \varvec{n}\rangle _{\partial T}\nonumber \\&\quad =\sum _{T\in {\mathcal {T}}_h}(\Pi _h(A\nabla w),\nabla _w (Q_hu-u_h))_T-\sum _{T\in {\mathcal {T}}_h}\langle Q_bu, \Pi _h(A\nabla w)\cdot \varvec{n}\rangle _{\partial T}\nonumber \\&\quad =\sum _{T\in {\mathcal {T}}_h}(\Pi _h(A\nabla w),\nabla _w (Q_hu-u_h))_T-\sum _{T\in {\mathcal {T}}_h}\langle u, \Pi _h(A\nabla w)\cdot \varvec{n}\rangle _{\partial T}\nonumber \\&\quad =\sum _{T\in {\mathcal {T}}_h}(\Pi _h(A\nabla w),\nabla _w (Q_hu-u_h))_T:=I_3, \end{aligned}$$
(4.11)

where we have used the fact that both \(\Pi _h(A\nabla w)\) and u are continuous across each interior edge and \(u_b=0\) on \(\partial \Omega \).

Thanks to the fact

$$\begin{aligned} (\nabla _w(Q_h u),\tau )_T=(\mathcal {Q}_h(\nabla u),\tau )_T, \quad \quad \forall \,\tau \in [{\mathbb {P}}_{k-1}(T)]^2, \end{aligned}$$
(4.12)

the term \(I_3\) in (4.11) can be rewritten as

$$\begin{aligned} \begin{aligned} I_3&=\sum _{T\in {\mathcal {T}}_h}(\Pi _h(A\nabla w), \mathcal {Q}_h(\nabla u)-\nabla _w u_h)_T\\&=\sum _{T\in {\mathcal {T}}_h}(\Pi _h(A\nabla w)-A\nabla w, \mathcal {Q}_h(\nabla u)-\nabla _w u_h)_T+\sum _{T\in {\mathcal {T}}_h}(A\nabla w, \mathcal {Q}_h(\nabla u)-\nabla _w u_h)_T. \end{aligned} \end{aligned}$$
(4.13)

With the use of the Cauchy–Schwarz inequality, we estimate the first term of the right hand side of \(I_3\) in (4.13)

$$\begin{aligned}&\sum _{T\in {\mathcal {T}}_h}(\Pi _h(A\nabla w)-A\nabla w, \mathcal {Q}_h(\nabla u)-\nabla _w u_h)_T\nonumber \\&\quad \le C \Big (\sum _{T\in {\mathcal {T}}_h}\Vert \Pi _h(A\nabla w)-A\nabla w\Vert _{L^2(T)}^2\Big )^{\frac{1}{2}}\Big (\sum _{T\in {\mathcal {T}}_h}\Vert \mathcal {Q}_h(\nabla u)-\nabla _w u_h\Vert _{L^2(T)}^2\Big )^{\frac{1}{2}}. \end{aligned}$$
(4.14)

Applying the Sobolev embedding \(W^{1,q}(T)\hookrightarrow L^2(T)\) for all \(T\in {\mathcal {T}}_h,\,q\ge 2\), and employing a scaling argument yields

$$\begin{aligned} \sum _{T\in {\mathcal {T}}_h}h_T^{-2}\Vert \Pi _h(A\nabla w)-A\nabla w\Vert _{L^2(T)}^2\le C\sum _{T\in {\mathcal {T}}_h}h_T^{2-\frac{4}{q}}\Vert w\Vert _{W^{2,q}(T)}^2, \end{aligned}$$

which implies

$$\begin{aligned} \Big (\sum _{T\in {\mathcal {T}}_h}\Vert \Pi _h(A\nabla w)-A\nabla w\Vert _{L^2(T)}^2\Big )^{\frac{1}{2}}\le Ch^{2-\frac{2}{q}}\Vert w\Vert _{W^{2,q}(\Omega )}. \end{aligned}$$

Moreover, notice that

$$\begin{aligned} \sum _{T\in {\mathcal {T}}_h}\Vert \mathcal {Q}_h(\nabla u)-\nabla _w u_h\Vert ^2_{L^2(T)}=\sum _{T\in {\mathcal {T}}_h}\Vert \nabla _w(Q_hu)-\nabla _w u_h\Vert ^2_{L^2(T)}\le C|||Q_hu- u_h |||^2. \end{aligned}$$

Then, substituting the estimates above with Theorem 3.1 into (4.14) and taking \(\beta \in [1,1+\frac{2}{p}-\frac{p}{2} ]\) results in

$$\begin{aligned}&\sum _{T\in {\mathcal {T}}_h}(\Pi _h(A\nabla w)-A\nabla w, \mathcal {Q}_h(\nabla u)-\nabla _w u_h)_T\nonumber \\&\quad \le Ch^{2-\frac{2}{q}}|||Q_hu- u_h |||\Vert w\Vert _{W^{2,q}(\Omega )}\nonumber \\&\quad \le C\Big (h^{(\frac{\beta +1}{2}+l)}+h^{(\frac{1-\beta }{2}+l+1-\frac{p}{2}+\frac{2}{p})}\Big )\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Vert w\Vert _{W^{2,q}(\Omega )}\nonumber \\&\quad \le Ch^{l+1}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Vert Q_0u-u_0\Vert ^{p-1}_{L^{p}(\Omega )}, \end{aligned}$$
(4.15)

where we have recalled the following estimates in the last inequality:

$$\begin{aligned}&h^{l+1+\frac{1}{p}-\frac{p}{4}} \le h^{(\frac{\beta +1}{2}+l)}\le h^{(l+1)},\\&h^{l+1-\frac{p}{2}+\frac{2}{p}}\le h^{(\frac{1-\beta }{2}+l+1-\frac{p}{2}+\frac{2}{p})}\le h^{l+1-\frac{p}{4}+\frac{1}{p}}, \end{aligned}$$

and the bound in (4.15) attains its maximal order when \(\beta =1+\frac{2}{p}-\frac{p}{2}\), for which \(\frac{\beta +1}{2}+l=\frac{1-\beta }{2}+l+1-\frac{p}{2}+\frac{2}{p}=l+1+\frac{1}{p}-\frac{p}{4}\ (\ge l+1)\).
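The balance of the two exponents in (4.15) can be checked with exact rational arithmetic. The following is only an illustrative sketch: the function names `exponents` and `balanced_beta` are hypothetical and do not come from the paper, and the sample value \(p=3/2\) is chosen arbitrarily.

```python
from fractions import Fraction

def exponents(beta, p, l):
    """Return the two powers of h appearing in the third line of (4.15)."""
    e1 = (beta + 1) / 2 + l
    e2 = (1 - beta) / 2 + l + 1 - p / 2 + 2 / p
    return e1, e2

def balanced_beta(p):
    """Upper end of the admissible interval for beta, where both powers coincide."""
    return 1 + 2 / p - p / 2

# Illustrative values only: p = 3/2 and linear elements (l = 1).
p, l = Fraction(3, 2), 1
beta_star = balanced_beta(p)
e1, e2 = exponents(beta_star, p, l)
common = l + 1 + 1 / p - p / 4
print(beta_star, e1, e2, common)
```

For any \(\beta \) in the admissible interval both exponents stay at or above \(l+1\), which is what the last inequality in (4.15) uses.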

For the second term on the right-hand side of \(I_3\) in (4.13), by using (4.12) and the weak Galerkin formulation (2.2) with the test function \(v_h=Q_hw\), we can derive

$$\begin{aligned}&\sum _{T\in {\mathcal {T}}_h}(A\nabla w, \mathcal {Q}_h(\nabla u)-\nabla _w u_h)_T\\ {}&\quad =\sum _{T\in {\mathcal {T}}_h}(A\nabla w, \mathcal {Q}_h(\nabla u)-\nabla u)_T+\sum _{T\in {\mathcal {T}}_h}(A\nabla w, \nabla u-\nabla _w u_h)_T\\&\quad =\sum _{T\in {\mathcal {T}}_h}(A\nabla w, \mathcal {Q}_h(\nabla u)-\nabla u)_T+(A\nabla w, \nabla u)-\sum _{T\in {\mathcal {T}}_h}(\mathcal {Q}_h(A\nabla w), \nabla _w u_h)_T\\&\quad =\sum _{T\in {\mathcal {T}}_h}(A\nabla w, \mathcal {Q}_h(\nabla u)-\nabla u)_T+(A\nabla w, \nabla u)-\sum _{T\in {\mathcal {T}}_h}(A\nabla _w (Q_hw), \nabla _w u_h)_T\\&\quad =\sum _{T\in {\mathcal {T}}_h}(A\nabla w-\mathcal {Q}_h(A\nabla w), \mathcal {Q}_h(\nabla u)-\nabla u)_T+(f,w)-(f,Q_0w)\\&\quad \quad +\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\langle u_0-u_b, Q_0w-Q_bw\rangle _e\\&\quad =\sum _{T\in {\mathcal {T}}_h}(A\nabla w-\mathcal {Q}_h(A\nabla w), \mathcal {Q}_h(\nabla u)-\nabla u)_T+(f-Q_0f,w-Q_0w)\\&\quad \quad +\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\langle u_0-u_b, Q_0w-w\rangle _e. \end{aligned}$$

We bound the terms by the Hölder inequality

$$\begin{aligned}&\sum _{T\in {\mathcal {T}}_h}(A\nabla w-\mathcal {Q}_h(A\nabla w), \mathcal {Q}_h(\nabla u)-\nabla u)_T\nonumber \\&\quad \le C\Vert A\nabla w-\mathcal {Q}_h(A\nabla w)\Vert _{L^q({\mathcal {T}}_h)}\Vert \mathcal {Q}_h(\nabla u)-\nabla u\Vert _{L^p({\mathcal {T}}_h)}\nonumber \\&\quad \le Ch^{l+1}\Vert w\Vert _{W^{2,q}(\Omega )}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}, \end{aligned}$$
(4.16)

and

$$\begin{aligned} \begin{aligned} (f-Q_0f,w-Q_0w)&\le C\Vert w-Q_0w\Vert _{L^q({\mathcal {T}}_h)}\Vert f-Q_0f\Vert _{L^p({\mathcal {T}}_h)}\\&\le Ch^{2}\Vert w\Vert _{W^{2,q}({\mathcal {T}}_h)}\Vert f-Q_0f\Vert _{L^p({\mathcal {T}}_h)}. \end{aligned}\end{aligned}$$
(4.17)

It follows from (4.4) that

$$\begin{aligned}&\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\langle u_0-u_b, Q_0w-w\rangle _e\nonumber \\&\quad =\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\Big (\langle (u_0-Q_0u)-(u_b-Q_bu), Q_0w-w\rangle _e+\langle Q_0u-Q_bu, Q_0w-w\rangle _e\Big )\nonumber \\&\quad =\sum _{e\in {\mathcal {E}}_h}\frac{1}{h^{\beta }}\Big (\langle e_b-e_0, Q_0w-w\rangle _e+\langle Q_0u-u, Q_0w-w\rangle _e\Big )\nonumber \\&\quad \le Ch^{l+1}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Vert w\Vert _{W^{2,q}({\mathcal {T}}_h)}. \end{aligned}$$
(4.18)

Then combining (4.16)–(4.18), we get the following estimate

$$\begin{aligned}&\sum _{T\in {\mathcal {T}}_h}(A\nabla w, \mathcal {Q}_h(\nabla u)-\nabla _w u_h)_T\nonumber \\&\quad \le C\Big (h^{2}\Vert f-Q_0f\Vert _{L^p({\mathcal {T}}_h)}+h^{l+1}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Big )\Vert w\Vert _{W^{2,q}(\Omega )}\nonumber \\&\quad \le C\Big (h^{2}\Vert f-Q_0f\Vert _{L^p({\mathcal {T}}_h)}+h^{l+1}\Vert u\Vert _{W^{l+1,p}({\mathcal {T}}_h)}\Big )\Vert Q_0u-u_0\Vert ^{p-1}_{L^{p}(\Omega )}. \end{aligned}$$
(4.19)

Substituting (4.15) and (4.19) into (4.13) and dividing both sides by \(\Vert Q_0u-u_0\Vert ^{p-1}_{L^{p}(\Omega )}\) yields the desired error estimate (4.8) with the choice \(\beta \in [1,1+\frac{2}{p}-\frac{p}{2} ]\) from Lemma 4.1. Furthermore, (4.9) follows from (4.8) and the fact (see [6]) that for all \(\phi \in V_h,\,T\in {\mathcal {T}}_h\)

$$\begin{aligned} \Vert \phi \Vert _{L^2(T)}\le h_T^{1-\frac{2}{p}}\Vert \phi \Vert _{L^p(T)}. \end{aligned}$$

This completes the proof of the theorem. \(\square \)

5 Numerical Experiments

In this section, we report on results of numerical tests meant to assess the theoretical a priori error estimates and to illustrate the performance of the over-relaxed WG method (2.2) when dealing with low regularity elliptic problems.

In the following numerical studies, all examples are computed on uniformly refined triangulations of \(\Omega \), and the WG method is applied to find a solution \(u_h = \{u_0, u_b\}\) with \(u_0|_T\in {\mathbb {P}}_1(T)\) and \(u_b|_e \in {\mathbb {P}}_1(e).\) Using the piecewise linear elements \(({\mathbb {P}}_1(T),{\mathbb {P}}_1(e),[{\mathbb {P}}_{0}(T)]^2)\), we test four examples on triangular meshes of regular pattern, and the third example also on locally refined meshes. The error of the over-relaxed WG solution of (2.2) is measured in the following two norms:

$$\begin{aligned}&|||e_h |||^2=\sum \limits _{T\in {\mathcal {T}}_h}\Big (\int _{T}\big |A^{\frac{1}{2}}\nabla _w e_0\big |^2dx+h_T^{-\beta }\int _{\partial T}|e_0-e_b|^2ds\Big ),\\&\Vert e_h\Vert ^2=\sum \limits _{T\in {\mathcal {T}}_h}\int _{T}|e_0|^2dx. \end{aligned}$$
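As a minimal sketch of how the discrete \(L^2\) norm \(\Vert e_h\Vert \) above could be evaluated for piecewise linear \(e_0\), one may use the edge-midpoint quadrature rule on each triangle, which is exact for quadratic integrands such as \(|e_0|^2\). The data layout (one vertex triple and one value triple per element) and the function names are assumptions made for illustration. The energy norm is not covered here, since it also requires the weak gradient and the edge terms.

```python
import math

def tri_area(a, b, c):
    """Area of the triangle with vertices a, b, c given as (x, y) pairs."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0

def l2_norm_p1(triangles, vertex_values):
    """Discrete L2 norm of a (possibly discontinuous) piecewise linear function.

    triangles[i]     -- three vertices of element i
    vertex_values[i] -- values of the function at those vertices, seen from element i
    """
    total = 0.0
    for (a, b, c), (va, vb, vc) in zip(triangles, vertex_values):
        # A linear function takes the average of the endpoint values at an edge midpoint.
        mids = ((va + vb) / 2.0, (vb + vc) / 2.0, (vc + va) / 2.0)
        total += tri_area(a, b, c) / 3.0 * sum(m * m for m in mids)
    return math.sqrt(total)

# Illustration: e_0(x, y) = x on the unit square split into two triangles.
tris = [((0, 0), (1, 0), (1, 1)), ((0, 0), (1, 1), (0, 1))]
vals = [(0.0, 1.0, 1.0), (0.0, 1.0, 0.0)]
norm_x = l2_norm_p1(tris, vals)
```

Storing the vertex values per element rather than per node keeps the sketch valid for discontinuous \(e_0\).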

We first investigate an example with a smooth solution to test the choices of \(\beta \) in the stabilizer of the weak Galerkin method, and we apply incomplete LU (ILU) preconditioning to the discrete linear algebraic systems when \(\beta >1\).

Example 1

We consider the domain \(\Omega =(0,1)^2\) and the elliptic problem (1.1)–(1.2) with the diffusion coefficient matrix given by \(A=\begin{bmatrix}x^2+y^2+1&xy \\ xy&x^2+y^2+1 \end{bmatrix}\) such that the exact solution is

$$\begin{aligned} u(x,y) = \sin (\pi x)\cos (\pi y). \end{aligned}$$

The errors in the norms \(\Vert e_h\Vert \) and \(|||e_h |||\) as well as the rates of convergence are presented in Tables 1 and 2. Since the solution is smooth and belongs to \(W^{l+1,2}(\Omega )\) with \(l=1\), the convergence rates in the \(L^2\) and energy norms are optimal for \(\beta =1\), and superconvergence in the energy norm is observed for \(\beta =2,3\). Figure 1 suggests that choosing \(\beta \) greater than 1 generally results in a better convergence rate in the energy norm, while for \(\beta =3\) the WG method has a convergence rate in the \(L^2\) norm comparable to that for \(\beta =1\).
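The paper does not spell out how the tabulated rates are obtained. A common convention, assumed here, is the observed order between two consecutive meshes, \(\log (e_k/e_{k+1})/\log (h_k/h_{k+1})\). The sketch below uses synthetic errors only, not values from Tables 1 and 2.

```python
import math

def observed_rates(h, err):
    """Observed convergence orders between consecutive mesh levels."""
    return [math.log(err[k] / err[k + 1]) / math.log(h[k] / h[k + 1])
            for k in range(len(err) - 1)]

# Synthetic data (not from the paper): errors behaving exactly like 0.3 * h^2.
h = [2.0 ** -k for k in range(2, 7)]
err = [0.3 * hk ** 2 for hk in h]
rates = observed_rates(h, err)
print(rates)
```

With uniform refinement the mesh size halves at each level, so the denominator reduces to \(\log 2\).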

Table 1 Errors for example 1 with \(\beta =0.5\) and 0.8
Table 2 Errors for example 1 with \(\beta =1,2,3\)
Fig. 1 Convergence rates for different values of \(\beta \). Left Error in the \(H^1\) norm. Right Error in the \(L^2\) norm

To address the ill-conditioning of the discrete linear algebraic systems for \(\beta >1\), we employ ILU preconditioning and a restarted generalized minimal residual (GMRES) method to drive the relative residual below a given tolerance. All tests in this section start from the zero vector and terminate when the iterate satisfies \(r^{(n)}/r^{(0)}\le 10^{-6}\), where \(r^{(n)}\) is the residual of the n-th iteration. To limit the amount of memory required, we set the restart number to at most 100. Tables 3 and 4 show the outer iterations (outer it.), inner iterations (inner it.) and CPU time of the restarted GMRES method with and without ILU preconditioning. It is observed that the preconditioned GMRES method performs efficiently and robustly.
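The solver setup described above can be sketched in plain Python as follows. This is not the authors' implementation: the ILU variant (zero fill-in, ILU(0)), the use of right preconditioning, the dense storage, and the 5-point Laplacian used as a stand-in for the WG system are all assumptions made for illustration. Only the zero initial guess, the relative residual tolerance \(10^{-6}\) and the restart number 100 are taken from the text.

```python
import math

def laplacian_2d(m):
    """Dense 5-point Laplacian on an m-by-m grid (stand-in matrix, not a WG system)."""
    n = m * m
    A = [[0.0] * n for _ in range(n)]
    for i in range(m):
        for j in range(m):
            k = i * m + j
            A[k][k] = 4.0
            if i > 0: A[k][k - m] = -1.0
            if i < m - 1: A[k][k + m] = -1.0
            if j > 0: A[k][k - 1] = -1.0
            if j < m - 1: A[k][k + 1] = -1.0
    return A

def matvec(A, v):
    return [sum(a * t for a, t in zip(row, v)) for row in A]

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def ilu0(A):
    """ILU(0): Gaussian elimination restricted to the nonzero pattern of A."""
    n = len(A)
    LU = [row[:] for row in A]
    for i in range(1, n):
        for k in range(i):
            if A[i][k] != 0.0:
                LU[i][k] /= LU[k][k]
                for j in range(k + 1, n):
                    if A[i][j] != 0.0:
                        LU[i][j] -= LU[i][k] * LU[k][j]
    return LU

def ilu_solve(LU, r):
    """Apply the preconditioner: forward solve with unit-lower L, then backward solve with U."""
    n, y = len(LU), r[:]
    for i in range(n):
        y[i] -= sum(LU[i][j] * y[j] for j in range(i))
    for i in reversed(range(n)):
        y[i] = (y[i] - sum(LU[i][j] * y[j] for j in range(i + 1, n))) / LU[i][i]
    return y

def gmres_restarted(A, b, prec=None, tol=1e-6, restart=100, max_outer=50):
    """Right-preconditioned GMRES(restart) from the zero vector.

    Returns (x, number of outer cycles, total number of inner iterations)."""
    prec = prec or (lambda v: v[:])
    n, r0 = len(b), norm(b)
    x, inner = [0.0] * n, 0
    for outer in range(max_outer + 1):
        r = [bi - ci for bi, ci in zip(b, matvec(A, x))]
        beta = norm(r)
        if beta <= tol * r0 or outer == max_outer:
            return x, outer, inner
        V, H, cs, sn, g = [[t / beta for t in r]], [], [], [], [beta]
        for j in range(restart):
            w = matvec(A, prec(V[j]))
            h = []
            for v in V:  # modified Gram-Schmidt orthogonalization
                hij = sum(a * c for a, c in zip(w, v))
                h.append(hij)
                w = [a - hij * c for a, c in zip(w, v)]
            hn = norm(w)
            h.append(hn)
            for i in range(j):  # apply the previous Givens rotations to the new column
                h[i], h[i + 1] = cs[i] * h[i] + sn[i] * h[i + 1], -sn[i] * h[i] + cs[i] * h[i + 1]
            d = math.hypot(h[j], h[j + 1])
            cs.append(h[j] / d)
            sn.append(h[j + 1] / d)
            h[j] = d
            g.append(-sn[j] * g[j])
            g[j] = cs[j] * g[j]
            H.append(h[:j + 1])
            inner += 1
            if abs(g[j + 1]) <= tol * r0:  # |g[j+1]| equals the current residual norm
                break
            V.append([a / hn for a in w])
        k = len(H)
        y = [0.0] * k
        for i in reversed(range(k)):  # back substitution with the rotated Hessenberg matrix
            y[i] = (g[i] - sum(H[c][i] * y[c] for c in range(i + 1, k))) / H[i][i]
        z = prec([sum(y[c] * V[c][i] for c in range(k)) for i in range(n)])
        x = [xi + zi for xi, zi in zip(x, z)]
```

A possible usage is `LU = ilu0(A)` followed by `gmres_restarted(A, b, prec=lambda v: ilu_solve(LU, v))`. Right preconditioning is chosen in this sketch so that the monitored quantity is the residual of the original system, in line with the stopping rule \(r^{(n)}/r^{(0)}\le 10^{-6}\).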

Table 3 GMRES method for example 1 with \(\beta =2\)
Table 4 GMRES method for example 1 with \(\beta =3\)

Example 2

This example originates from [16]. Taking the coefficient matrix \(A=\begin{bmatrix}1&\quad 0 \\ 0&\quad 1 \end{bmatrix}\), we now test the method for problem (1.1)–(1.2) with the low regularity solution

$$\begin{aligned} u(x,y) = x(x-1)y(y-1)r^{-2+\alpha }, \end{aligned}$$

where \(\alpha \in (0,1]\) is a constant, and \(r=\sqrt{x^2+y^2}\) denotes the distance to the origin. Note that \(u\in W_0^{1,2}(\Omega )\cap W^{2,p}(\Omega )\) for all \(p\in (1,\frac{2}{2-\alpha })\subseteq (1,2).\) As \(\alpha \) changes, the theory in this work predicts that the errors behave as

$$\begin{aligned}&\Vert Q_0u- u_0\Vert _{L^2}\sim o(h^{3-\frac{2}{p}})\sim o(h^{1+\alpha }),\\ \hbox {and}\quad&|||Q_hu- u_h |||\sim o(h^{2-\frac{1}{p}-\frac{p}{4}})\sim o(h^{\frac{3-\alpha ^2}{4-2\alpha }}), \end{aligned}$$

where the optimal value of \(\beta \) is \(1+\frac{2}{p}-\frac{p}{2}\).
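The predicted rates and the corresponding \(\beta \) can be tabulated as functions of \(\alpha \). The sketch below evaluates the formulas at the limiting value \(p=\frac{2}{2-\alpha }\), which is an endpoint that the regularity statement only approaches; the function name `predicted` is hypothetical.

```python
def predicted(alpha):
    """Limiting rates and balanced beta for Example 2, using p = 2 / (2 - alpha)."""
    p = 2.0 / (2.0 - alpha)            # supremum of admissible p (not attained)
    l2_rate = 3.0 - 2.0 / p            # equals 1 + alpha
    energy_rate = 2.0 - 1.0 / p - p / 4.0
    beta_opt = 1.0 + 2.0 / p - p / 2.0
    return p, l2_rate, energy_rate, beta_opt

for alpha in (1.0, 2.0 ** -1, 2.0 ** -5):
    p, l2_rate, energy_rate, beta_opt = predicted(alpha)
    print(f"alpha={alpha:.5f}  p={p:.4f}  L2={l2_rate:.4f}  "
          f"energy={energy_rate:.4f}  beta={beta_opt:.4f}")
```

For \(\alpha =2^{-5}\) this gives \(\beta \approx 2.4608\), which appears to be the value used alongside \(\beta =2\) and \(\beta =3\) in Table 15.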

Table 5 Errors for example 2 with different \(\alpha =1,2^{-1},2^{-5}\) and \(\beta =0.5\)
Table 6 Errors for example 2 with different \(\alpha =1,2^{-1},2^{-5}\) and \(\beta =0.8\)
Table 7 Errors for example 2 with different \(\alpha =1,2^{-1},2^{-5}\) and \(\beta =1\)
Fig. 2 Convergence rates for different values of \(\beta \) and \(\alpha =1\). Left Error in the \(H^1\) norm. Right Error in the \(L^2\) norm

Fig. 3 Convergence rates for different values of \(\beta \) and \(\alpha =2^{-5}\). Left Error in the \(H^1\) norm. Right Error in the \(L^2\) norm

Table 8 Errors for example 2 with different \(\alpha =1,2^{-1},2^{-5}\) and \(\beta =2\)
Table 9 Errors for example 2 with \(\alpha =1,2^{-1},2^{-5}\) and \(\beta =3\)
Table 10 Errors for example 2 with \(\alpha =2^{-3},2^{-4},2^{-5}\) and optimal values of \(\beta \)

On the uniform triangular meshes, we present the errors and convergence rates for \(\beta =0.5,\,0.8,\,1\) in Tables 5, 6, and 7, respectively. As \(\alpha \) tends to 0, the convergence rates of the WG method with \(\beta =1\) tend to 0 in the energy norm and to 1 in the \(L^2\) norm. When \(\beta =1\) is taken, the convergence rates in the \(L^2\) and energy norms are optimal for the high-regularity solution, which is consistent with the theory. In Fig. 2, we compare the convergence rates for different \(\beta \) and fixed \(\alpha =1\). It is clear from Fig. 2 that the overall convergence behavior is very similar to that of Fig. 1, and the choice \(\beta =2\) gives better convergence rates than the other two choices \(\beta =1,\,3\). In the case \(\alpha =2^{-5}\), it is observed in Fig. 3 that the WG methods with \(\beta =0.5,\,0.8\) do not converge in the \(H^1\) norm, while the method with \(\beta =1\) converges slowly. Furthermore, as \(\beta \) increases from 1 to 3 and \(\alpha \) decreases, it is observed in Tables 7, 8, and 9 that the WG methods with \(\beta =2,\,3\) produce better convergence rates and accuracy in the energy norm for all values of \(\alpha \), whereas the method with \(\beta =1\) has the best convergence rate in the \(L^2\) norm only for the smooth solution (\(\alpha =1\)). In particular, for the low-regularity solution with \(\alpha =2^{-5}\) and \(\beta \ge 1\), the WG method has the first-order optimal convergence rate in the \(L^2\) norm and orders 0.5023 and 0.6963 in the energy norm for \(\beta =2\) and \(\beta =3\), respectively. Since the condition numbers of the discrete linear algebraic systems from the WG approximation grow up to \(O(h^{-2})\), \(O(h^{-3})\) and \(O(h^{-4})\) for \(\beta =1,\,2,\,3\), respectively, ILU preconditioning is indispensable in our computation.
Table 10 shows the best convergence rates in the energy and \(L^2\) norms when some critical values of \(\beta \) are chosen for different values of \(\alpha =2^{-3},2^{-4},2^{-5}\), respectively.

Moreover, since \(u \in W^{2,p}(\Omega )\) only, linear elements are used in the computation. In Fig. 4, the profiles of the numerical solutions illustrate that the solutions become steeper close to the origin as \(\alpha \) decreases. Considering the convergence rates in the energy norm, we compare the WG methods (\(\beta =1,\,2,\,3\)) with the non-symmetric interior penalty Galerkin (NIPG), symmetric interior penalty Galerkin (SIPG) and continuous finite element (FEM) methods with linear elements presented in Ref. [16], and collect the results in Table 11. It is observed that when \(\beta =2,\,3,\) the WG methods give higher convergence rates than the other methods.

Example 3

The next example is an elliptic problem with a corner singularity in the L-shaped domain \(\Omega =(0,1)^2\backslash [1/2,1)^2\) with \(A=I_{2\times 2}\), the identity matrix. In a polar coordinate system \((r,\theta )\) with the origin at \((\frac{1}{2},\frac{1}{2})\), the solution is

$$\begin{aligned} u(r,\theta )=r^{\frac{2}{3}}\sin \Big (\frac{2\theta -\pi }{3}\Big ), \quad \frac{\pi }{2}\le \theta \le 2\pi . \end{aligned}$$

Note that the solution in Example 3 has a corner singularity at the node \((\frac{1}{2},\frac{1}{2})\), the reentrant corner with the interior angle \(3\pi /2\), as well as at the other five vertices of the L-shaped domain. Therefore, the solution has the global regularity \(H^{\frac{5}{3}-\epsilon }(\Omega )\), where \(\epsilon \) is any positive number and \(p=2\). Tests are made on uniform grids and locally refined grids to investigate errors and convergence rates. From Table 12, we observe that as \(\beta \) increases from 0.5 up to 1, the weak Galerkin method with \(\beta =1\) has optimal convergence rates in the \(L^2\) and energy norms for the singular problem. The WG solutions for \(\beta =2,\,3\) have better accuracy and convergence rates in the energy norm in Table 13, although the convergence rate of the WG method with \(\beta =1\) in the \(L^2\) norm is the best among those in Tables 12 and 13.
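For reference, the exact solution of Example 3 can be evaluated in Cartesian coordinates as sketched below. The mapping of the angle to \([\frac{\pi }{2},2\pi ]\) and the function name `u_exact` are implementation choices made here for illustration, not details taken from the paper.

```python
import math

def u_exact(x, y):
    """Evaluate u = r^(2/3) * sin((2*theta - pi)/3) with polar origin at (1/2, 1/2)."""
    dx, dy = x - 0.5, y - 0.5
    r = math.hypot(dx, dy)
    theta = math.atan2(dy, dx) % (2.0 * math.pi)   # angle in [0, 2*pi)
    if theta == 0.0:
        theta = 2.0 * math.pi                      # the edge y = 1/2, x >= 1/2 is theta = 2*pi
    elif theta < math.pi / 2.0:
        raise ValueError("point lies in the removed square [1/2, 1)^2")
    return r ** (2.0 / 3.0) * math.sin((2.0 * theta - math.pi) / 3.0)

# The function vanishes on both edges that meet at the reentrant corner.
print(u_exact(0.5, 0.9), u_exact(0.9, 0.5), u_exact(0.0, 0.5))
```

On the two edges meeting at the reentrant corner, \(\theta =\frac{\pi }{2}\) and \(\theta =2\pi \), the sine factor vanishes, so the nonzero boundary data are confined to the remaining edges of the L-shaped domain.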

Fig. 4 Comparison of numerical solutions with different values of \(\alpha =1,\,2^{-1},\,2^{-2},\,2^{-3},\,2^{-4},\,2^{-5}\), respectively, listed in order from left to right and from top to bottom

Table 11 Comparison on convergence rates of \(|||e_h |||\) by using different methods for example 2 with different \(\alpha \)
Table 12 Errors for example 3 with \(\beta =0.5,\,0.8\) and 1.
Table 13 Errors for example 3 with \(\beta =2\) and 3

We also employ locally refined grids to illustrate the numerical error in Fig. 5, and verify convergence rates \(\sigma \) of the error \(|||e_h |||\) with respect to the number of degrees of freedom (Dof), defined by

$$\begin{aligned} |||e_h |||=O(\text {Dof}^{\sigma }). \end{aligned}$$
(5.1)

Table 14 shows the WG method has better approximation behavior in the locally refined grids than in the uniform meshes, and the choice of \(\beta =2\) gives the best convergence rate in the energy norm.
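One way to estimate the exponent \(\sigma \) in (5.1) from a sequence of refinements is a least-squares fit of \(\log |||e_h |||\) against \(\log \text {Dof}\). The paper does not state which fitting procedure was used, so the following is only a sketch with synthetic data.

```python
import math

def dof_rate(dofs, errs):
    """Least-squares slope of log(error) versus log(Dof), i.e. sigma in (5.1)."""
    xs = [math.log(d) for d in dofs]
    ys = [math.log(e) for e in errs]
    n = len(xs)
    xm, ym = sum(xs) / n, sum(ys) / n
    num = sum((x - xm) * (y - ym) for x, y in zip(xs, ys))
    den = sum((x - xm) ** 2 for x in xs)
    return num / den

# Synthetic data (not from Table 14): error = 5 * Dof^(-1/2).
dofs = [100, 400, 1600, 6400]
errs = [5.0 * d ** -0.5 for d in dofs]
sigma = dof_rate(dofs, errs)
print(sigma)
```

On quasi-uniform meshes in two dimensions the number of degrees of freedom scales like \(h^{-2}\), so a rate \(h^{s}\) corresponds to \(\sigma =-s/2\); the synthetic value \(\sigma =-0.5\) therefore mimics first-order convergence in h.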

Example 4

In this case, we employ the same analytic solution as in Example 2, now posed on a domain with a narrow line crack of size \(2\times 10^{-5}\) (see Fig. 6), defined by \(\Omega =(-2,2)^2\backslash [-2,0.00001]\times (-0.00001,0.00001)\). We notice that the problem with a Dirichlet boundary condition has low regularity and a singularity at the corners near the origin.

Fig. 5 A locally refined grid (left) and error profile in 3D (right) with \(\beta =1\)

Table 14 Convergence rates of \(|||e_h |||\) and \(\Vert e_h\Vert \) with respect to Dof for example 3 on locally refined grids, with \(\beta =1,\,2,\,3\)
Fig. 6 An initial grid with a crack (left) and a locally zoomed area around the origin (right)

Fig. 7 Numerical solution in 3D (left) and the corresponding numerical error profile (right) with the initial grid refined three times

Table 15 Errors for example 4 with \(\beta =2,\,2.4608\) and 3 on the uniform grids

In Fig. 7, it is observed that the solution around the line crack is discontinuous and has sharp slopes along the bottom-left diagonal direction, but the error is mainly concentrated around the origin. Table 15 shows that when \(\alpha =2^{-5}\), the rate in the \(L^2\) norm with \(\beta =2.4608\) is better than that with \(\beta =3\), and the convergence rates in the energy norm are comparable in the three cases for the low regularity solution in the cracked domain.

All numerical examples above are in good agreement with the theoretical analysis, which validates the optimal convergence rates of the stabilized WG finite element method (2.2) with suitable choices of the over-relaxed factor.

6 Conclusions

In this work, we have proposed the over-relaxed weak Galerkin method for solving low regularity elliptic problems and derived a priori error estimates in the energy norm and in the \(L^p\) and \(L^2\) norms. For low regularity elliptic solutions, an over-relaxed factor \(\beta >1\) in the stabilization term has been specified in terms of \(p\in (1,2)\) to enforce weak continuity in the WG method. The WG method with the over-relaxed stabilization is optimally convergent and attains notably good rates in the energy norm. The optimal relaxed factor for \(p\in (1,2)\) has been derived, and in the case \(p=2\), optimal error estimates in the energy and \(L^2\) norms are recovered when \(\beta =1\) is taken. The effect of the relaxation for low regularity solutions has been verified by numerical results. Furthermore, an ILU preconditioning technique for the over-relaxed WG scheme is employed within the restarted GMRES method to reduce iterations and save computational cost.