1 Introduction

In this paper, we are concerned with a modified weak Galerkin finite element method (MWG-FEM) for convection–diffusion problems in two space dimensions, in which one seeks an unknown function \(u=u(x,y)\) satisfying

$$\begin{aligned} -\nabla \cdot (a\nabla u)+\nabla \cdot (\mathbf{b}u)+cu&= f(x,y),\quad (x,y)\in \Omega ,\end{aligned}$$
(1)
$$\begin{aligned} u(x,y)&= g(x,y),\quad (x,y)\in \Gamma , \end{aligned}$$
(2)

where \(\Omega \subset \mathbb{R}^2\) is a bounded domain with boundary \(\Gamma =\partial \Omega \); \(a=a(x,y)\) is the diffusion coefficient, satisfying \(0<a_0\le a(x,y)\le a_1\) for \((x,y)\in \Omega \) with positive constants \(a_0, a_1\); \(\mathbf{b}=\mathbf{b}(x,y):=(b_1(x,y),b_2(x,y))^T\) is the convection velocity; \(c=c(x,y)\ge 0\) is the reaction coefficient; \(f(x,y)\) is the right-hand side function; \(g(x,y)\) is the prescribed Dirichlet data on the boundary \(\Gamma \); and \(u=u(x,y)\) is the unknown function representing the concentration of the solution in the flow. We impose the physical condition

$$\begin{aligned} \dfrac{\nabla \cdot \mathbf{b}}{2}+c\ge 0,\quad (x,y)\in \Omega . \end{aligned}$$
(3)

It is well known that the convection–diffusion equation, which combines convection and diffusion processes, is a fundamental equation describing fluid transport and is widely used in many fields of science and engineering, such as fluid mechanics, petroleum reservoir simulation, groundwater contamination, and environmental protection (see, for example, Refs. [1, 2, 10]). In many such applications the convection term essentially dominates the diffusion term, and the numerical approximation of these problems is a challenging computational task. It is well documented that, when the governing equation is convection-dominated, many standard methods developed for diffusion-dominated processes, such as the standard finite difference (volume) or finite element methods, often exhibit severe non-physical oscillations, since the corresponding discrete schemes are unstable for such problems.
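The instability mentioned above can already be observed in one space dimension. The following minimal sketch is not part of the method of this paper: it assumes the model problem \(-\varepsilon u''+u'=0\) on \((0,1)\) with \(u(0)=0\), \(u(1)=1\), discretized by standard central differences on a uniform grid. The function name `solve_central` and the parameter values are illustrative choices. When the mesh Péclet number \(h/(2\varepsilon )\) exceeds one, the discrete solution takes negative values, although the exact solution is monotone between 0 and 1.

```python
def solve_central(eps, n):
    """Central-difference solution of -eps*u'' + u' = 0, u(0)=0, u(1)=1.

    Illustrative 1D model problem (an assumption, not taken from the paper).
    Returns the nodal values on a uniform grid with n cells.
    """
    h = 1.0 / n
    pe = h / (2.0 * eps)                      # mesh Peclet number
    # Row i of the scaled scheme: lower*u[i-1] + diag*u[i] + upper*u[i+1] = 0
    lower, diag, upper = -(1.0 + pe), 2.0, -(1.0 - pe)
    m = n - 1                                 # number of interior unknowns
    rhs = [0.0] * m
    rhs[-1] = -upper * 1.0                    # contribution of u(1) = 1
    # Thomas algorithm: forward elimination, then back substitution
    d = [diag] * m
    for i in range(1, m):
        w = lower / d[i - 1]
        d[i] = diag - w * upper
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * m
    u[-1] = rhs[-1] / d[-1]
    for i in range(m - 2, -1, -1):
        u[i] = (rhs[i] - upper * u[i + 1]) / d[i]
    return [0.0] + u + [1.0]


# Convection-dominated case (Peclet number 50) versus diffusion-dominated case.
u_conv = solve_central(eps=1e-3, n=10)
u_diff = solve_central(eps=1.0, n=10)
print("min of convection-dominated solution:", min(u_conv))
print("min of diffusion-dominated solution: ", min(u_diff))
```

In this sketch the oscillation disappears once the grid is refined so that \(h<2\varepsilon \), which is often impractical for very small \(\varepsilon \).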

The weak Galerkin (WG) method refers to general finite element techniques for partial differential equations in which differential operators are approximated by weak forms as distributions. In [11], a WG method was introduced and analyzed for second order elliptic equations based on a discrete weak gradient arising from local RT [8] or BDM [5] elements. However, the WG finite element formulation of [11] was limited to classical finite element partitions of triangles (\(d=2\)) or tetrahedra (\(d=3\)). This restriction was lifted for the WG mixed finite element formulation developed in [12]. In [9], we introduced a new discrete weak gradient operator and a new WG finite element method for second order Poisson equations based on this new operator. The goal of this paper is to introduce a new discrete weak divergence operator and an MWG-FEM for the convection–diffusion problem (1) and (2) based on these new operators.

The paper is organized as follows. In Sect. 2, we introduce the definition and approximation of the modified weak gradient operator and weak divergence operator, and we define some local projection operators. In Sect. 3, we provide a detailed description for the MWG finite element scheme, including a discussion on the element shape regularity assumption. We also derive some approximation properties which are useful in error analysis. In Sect. 4, we establish an optimal order error estimate for the MWG finite element approximation in an \(H^1\)-equivalent discrete norm. We also derive an optimal order error estimate in \(L^2\) by using a duality argument, as is commonly done for the standard Galerkin finite element methods [3, 6]. Finally, in Sect. 5, we present some numerical results which confirm the theory developed in earlier sections.

2 Discrete weak gradient and weak divergence

For the convection–diffusion problem (1) with the nonhomogeneous Dirichlet boundary condition (2), the corresponding variational form seeks \(u\in H^1(\Omega )\) satisfying \(u|_{\partial \Omega }=g\) and

$$\begin{aligned} (a\nabla u, \nabla v)+(\nabla \cdot (\mathbf{b}u),v)+(cu,v) = (f, v),\qquad \forall v\in H_0^1(\Omega ), \end{aligned}$$
(4)

where \(H_0^1(\Omega )\) is the subspace of \(H^1(\Omega )\) consisting of functions with vanishing value on \(\partial \Omega \).

The key in MWG-FEMs is the use of new discrete weak derivatives in place of strong derivatives, and of discrete weak divergences in place of strong divergences, in the variational form for the underlying partial differential equations. For the model problem (4), the gradient \(\nabla \) is the principal differential operator involved in the variational formulation. Thus, it is critical to define and understand discrete weak gradients for the corresponding numerical methods. Following the idea originated in [11], the new discrete weak gradient is given by approximating the strong gradient operator with piecewise vector-valued polynomial functions, and the weak divergence is given by approximating the strong divergence operator with piecewise polynomial functions; details are presented in the rest of this section.

Let \(\mathcal {T}_h\) be a partition of the domain \(\Omega \) consisting of polygons. Denote by \(\mathcal {E}_h\) the set of all edges in \(\mathcal {T}_h\) and by \(\mathcal {E}_h^0=\mathcal {E}_h\backslash \partial \Omega \) the set of all interior edges. For each element \(K\in \mathcal {T}_h\), we denote by \(h_K\) its diameter, and we define the mesh size of \(\mathcal {T}_h\) by \(h=\max _{K\in \mathcal {T}_h}h_K\).
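As a small illustration of these definitions, the sketch below computes \(h_K\) and \(h\) for a polygonal partition. It assumes that each element is stored as a list of vertex coordinates; the function names `diameter` and `mesh_size` are hypothetical. The diameter of a polygon is attained at a pair of vertices, so it suffices to compare vertex distances.

```python
from itertools import combinations
from math import dist


def diameter(polygon):
    """Diameter h_K of a polygon given as a list of (x, y) vertices."""
    return max(dist(p, q) for p, q in combinations(polygon, 2))


def mesh_size(partition):
    """Mesh size h = max over all elements K of h_K."""
    return max(diameter(K) for K in partition)


# Unit square split into two triangles along the diagonal (illustrative mesh).
partition = [
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
]
print(mesh_size(partition))
```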

All the elements of \({\mathcal T}_h\) are assumed to be closed and simply connected polygons. We need some shape regularity for the partition \({\mathcal T}_h\) described as follows (see [7] for more details).

A1 :

There exists a positive constant \(\Theta \) such that for every element \(K\in \mathcal {T}_h\) and each edge \(e\) of \(K\), one can define an irregular pyramid \(P(e)\subseteq K\) with base \(e\) and apex \(V_e\in K\), such that the height of \(P(e)\), denoted by \(H_{K,e}\), satisfies

$$\begin{aligned} H_{K,e}\ge \Theta h_K. \end{aligned}$$
A2 :

There exists a positive constant \(N\) such that every element \(K \in \mathcal {T}_h\) has at most \(N\) edges.

A3 :

The convex hull of each \(K\in \mathcal {T}_h\), denoted by \(S_K\), is contained in \(\bar{\Omega }\), the closure of \(\Omega \). Furthermore, each convex hull intersects only a fixed and small number of convex hulls of other polygons in \(\mathcal {T}_h\).

We define \(V_h\), for \(k\ge 1\), as follows

$$\begin{aligned} V_h=\{ v\in L^2(\Omega ): v|_K\in P_k(K),\forall K\in \mathcal {T}_h \}, \end{aligned}$$
(5)

where \(P_k(K)\) is the set of polynomials defined on \(K\) with degree no more than \(k\). Based on \(V_h\), we define \(V_h^0\) as a subspace of \(V_h\) with zero boundary value, i.e.,

$$\begin{aligned} V_h^0=\{v: v\in V_h, v|_{\partial \Omega }=0 \}. \end{aligned}$$
(6)

We also introduce a scalar polynomial space \(W_h\)

$$\begin{aligned} W_h:=\{w\in L^2(\Omega ), w|_{K}\in P_{k-1}(K), \forall K\in \mathcal {T}_h\}, \end{aligned}$$
(7)

and a vector valued space \(G_h\)

$$\begin{aligned} G_h:=\{q\in [L^2(\Omega )]^2, q|_{K}\in [P_{k-1}(K)]^2, \forall K\in \mathcal {T}_h\}. \end{aligned}$$
(8)

For each \(v\in V_h^0\), if \(K_i,\,i=1,2\), are two elements with a common edge \(e\), and unit outward normal vectors \(\mathbf{n}_i,\,i=1,2,\) across \(e\), the average and jump of \(v\) across \(e\) are denoted by \(\{\{v\}\}=(v|_{K_1}+v|_{K_2})/2\) and

$$\begin{aligned}{}[[v]]_e=\left\{ \begin{array}{ll} v|_{K_1}\mathbf{n}_1+v|_{K_2}\mathbf{n}_2, &{}\quad e\in \mathcal {E}^0_h,\\ 0,&{}\quad e\subset \partial \Omega . \end{array}\right. \end{aligned}$$
(9)
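The average and jump on an interior edge can be sketched as follows. The helper names `average` and `jump` are hypothetical, and the two traces are represented by their values at a single point of the edge. The example also checks the elementary identity \([[v^2]]=2\{\{v\}\}[[v]]\), which holds because \(\mathbf{n}_2=-\mathbf{n}_1\).

```python
def average(v1, v2):
    """Average {{v}} of the two traces v1 = v|_{K_1}, v2 = v|_{K_2}."""
    return 0.5 * (v1 + v2)


def jump(v1, n1, v2, n2):
    """Vector-valued jump [[v]] = v1*n1 + v2*n2 on an interior edge."""
    return tuple(v1 * a + v2 * b for a, b in zip(n1, n2))


# A vertical interior edge: K_1 lies to the left, K_2 to the right.
n1, n2 = (1.0, 0.0), (-1.0, 0.0)
v1, v2 = 5.0, 2.0

print(jump(v1, n1, v2, n2))        # jump of a discontinuous function
print(jump(3.0, n1, 3.0, n2))      # a continuous function has zero jump

# Identity [[v^2]] = 2 {{v}} [[v]], checked componentwise.
lhs = jump(v1 ** 2, n1, v2 ** 2, n2)
rhs = tuple(2.0 * average(v1, v2) * j for j in jump(v1, n1, v2, n2))
print(lhs == rhs)
```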

For each element \(K\in \mathcal {T}_h\), denote by \(\mathcal {Q}_h\) the local \(L^2\)-projection onto \(G_h\), i.e., for any vector valued function \(\mathbf{w}(x)\), \(\mathcal {Q}_h\mathbf{w}(x)\) satisfies

$$\begin{aligned} (\mathcal {Q}_h \mathbf{w}, q)_K=(\mathbf{w}, q)_K,\quad \forall q\in G_{k-1}(K), \end{aligned}$$
(10)

where \(G_{k-1}(K):=[P_{k-1}(K)]^2\). Denote by \(Q_h\) the local \(L^2\) projection from \(L^2(K)\) to \(P_k(K)\).

Definition 2.1

Given a partition \(\mathcal {T}_h\) of \(\Omega \) and a piecewise smooth function \(v\), we define the weak gradient of \(v\) on \(K\), \(\forall K\in \mathcal {T}_h\), as \(\nabla _w v\in G_h\), such that

$$\begin{aligned} (\nabla _w v, q)_K=-(v,\nabla \cdot q)_K+\langle \{\{v\}\},q\cdot \mathbf{n}\rangle _{\partial K},\quad \forall q\in G_{k-1}(K), \end{aligned}$$
(11)

where \(\mathbf{n}\) is the unit outward normal to \(\partial K\).

Remark 2.1

First, note that the definition for \(\nabla _w\) here is different from the one defined in [11]. Second, when \(v\) is continuous in \(\Omega \), \(\{\{v\}\}=v\) on \(\partial K\), \(\forall K\in \mathcal {T}_h\). Thus from the definition of \(\nabla _w\),

$$\begin{aligned} (\nabla _w v, q)_K=-(v,\nabla \cdot q)_K+\langle v,q\cdot \mathbf{n}\rangle _{\partial K}=(\nabla v,q)_K,\quad \quad \forall q\in G_{k-1}(K),\quad \end{aligned}$$
(12)

that is to say, the weak gradient \(\nabla _w v\) is actually the same as \(L^2\) projection of the traditional gradient \(\nabla v\) for such a \(v\). Thus if \(v\in P_k(\Omega ), \nabla _w v=\nabla v\).
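For \(k=1\) the weak gradient is a constant vector on \(K\), and since \(\nabla \cdot q=0\) for constant \(q\), definition (11) reduces to \(|K|\,\nabla _w v=\int _{\partial K}\{\{v\}\}\mathbf{n}\,\mathrm{d}s\). The following sketch evaluates this formula with exact rational arithmetic; the function name `weak_gradient_p0` is hypothetical, the polygon is assumed to be listed counterclockwise, and the edge values are assumed to be linear along each edge so that the midpoint rule is exact. For a continuous linear function it reproduces the classical gradient, in agreement with the remark above.

```python
from fractions import Fraction as F


def weak_gradient_p0(vertices, trace):
    """Constant weak gradient on a counterclockwise polygon K (case k = 1).

    vertices: list of (x, y) pairs of Fractions.
    trace(x, y): the edge value {{v}}, assumed linear along each edge.
    """
    n = len(vertices)
    gx = gy = area2 = F(0)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        area2 += x0 * y1 - x1 * y0            # shoelace formula, twice the area
        mid = trace((x0 + x1) / 2, (y0 + y1) / 2)
        gx += mid * (y1 - y0)                 # |e| * n_x on this edge
        gy += mid * (x0 - x1)                 # |e| * n_y on this edge
    area = area2 / 2
    return gx / area, gy / area


# Triangle with vertices (0,0), (1,1), (0,1) and the linear function 2x + 3y + 1.
K = [(F(0), F(0)), (F(1), F(1)), (F(0), F(1))]
print(weak_gradient_p0(K, lambda x, y: 2 * x + 3 * y + 1))
```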

Definition 2.2

Given a partition \(\mathcal {T}_h\) of \(\Omega \) and a piecewise smooth vector-valued function \(v\), we define the weak divergence of \(v\) on \(K\), \(\forall K\in \mathcal {T}_h\), as \(\nabla _w\cdot v\in W_h\), such that

$$\begin{aligned} (\nabla _w\cdot v,\; w)_K=-(v,\;\nabla w)_K+\langle \{\{v\}\},\;w \mathbf{n}\rangle _{\partial K},\quad \forall w\;\in W_h, \end{aligned}$$
(13)

where \(\mathbf{n}\) is the unit outward normal to \(\partial K\).

Remark 2.2

When \(v\) is continuous in \(\Omega \), \(\{\{v\}\}=v\) on \(\partial K\), \(\forall K\in \mathcal {T}_h\). Thus from the definition of \(\nabla _w\cdot \),

$$\begin{aligned} (\nabla _w\cdot v,\; w)_K=-(v,\;\nabla w)_K+\langle v,\;w \mathbf{n}\rangle _{\partial K}=(\nabla \cdot v,w)_K,\; \forall w\;\in W_h, \end{aligned}$$
(14)

that is to say, the weak divergence \(\nabla _w\cdot v\) is actually the same as \(L^2\) projection of the traditional divergence \(\nabla \cdot v\) for such a vector-valued function \(v\). Thus if \(v\in [P_k(\Omega )]^2, \nabla _w\cdot v=\nabla \cdot v\).

Example 1

Let \(K\) be a triangular element \(\triangle ABC\) with vertices \(A(0,0)\), \(B(1,1)\), \(C(0,1)\) and edges \(\overline{AB}=e_1\), \(\overline{BC}=e_2\), \(\overline{CA}=e_3\). The unit outward normals are \(\mathbf{n}_1= \left( \dfrac{1}{\sqrt{2}},-\dfrac{1}{\sqrt{2}} \right) \), \(\mathbf{n}_2=(0,1)\), \(\mathbf{n}_3=(-1,0)\). Suppose \(u|_K=(1,1)\) and \(u=0\) in \(\Omega \setminus K\), and assume that each \(e_i\) is an interior edge. Then \(\{\{u\}\}|_{e_i}= \left( \dfrac{1}{2},\;\dfrac{1}{2} \right) ,\;(i=1,2,3)\).

  1. (1)

    When we choose \(k=1\), \(\nabla _w\cdot u|_K=a\in P_0(K), K\in \mathcal {T}_h\). We have

    $$\begin{aligned} \begin{array}{ll} \dfrac{a}{2}=(\nabla _w\cdot u,\;1)_K&{}=-(u,\;\nabla 1)_K+\sum \limits _{i=1}^3{\langle }\{\{u\}\},\; n{\rangle }_{e_i}\\ &{}=0+\displaystyle \int \limits _{e_1}0\text{ d }s+\dfrac{1}{2}\displaystyle \int \limits _{e_2}1\text{ d }s-\dfrac{1}{2}\displaystyle \int \limits _{e_3}1\text{ d }s\\ &{}=0. \end{array} \end{aligned}$$

    So \(a=0\), and thus \(\nabla _w\cdot u=0 \).

  2. (2)

    When we choose \(k=2\), \(\nabla _w\cdot u|_K=a+bx+cy\in P_1(K)\). We have

    $$\begin{aligned}&{\begin{array}{ll} \dfrac{a}{2}+\dfrac{b}{6}+\dfrac{c}{3}=\displaystyle \int _{0}^{1}\int _{x}^1(a+bx+cy)\text{ d }y\text{ d }x=(\nabla _w\cdot u,\;1)_K=0\\ \end{array}}\\&{\begin{array}{ll} \dfrac{a}{6}+\dfrac{b}{12}+\dfrac{c}{8}=\displaystyle \int _{0}^{1}\int _{x}^1(ax+bx^2+cxy)\text{ d }y\text{ d }x=(\nabla _w\cdot u,\;x)_K\\ \qquad \qquad \qquad \quad =-\displaystyle \int _{0}^{1}\int _{x}^1 1\text{ d }y\text{ d }x+\int \limits _{e_1}\dfrac{1}{2}u|_{e_1}\cdot x\mathbf{n}_1\text{ d }s+\int \limits _{e_2}\dfrac{1}{2}u|_{e_2}\cdot x\mathbf{n}_2\text{ d }s\\ \qquad \qquad \qquad \qquad +\displaystyle \int _{e_3}\dfrac{1}{2}u|_{e_3}\cdot x\mathbf{n}_3\text{ d }s=-\dfrac{1}{4} \end{array}}\\&{\begin{array}{ll} \dfrac{a}{3}+\dfrac{b}{8}+\dfrac{c}{4}=\displaystyle \int _{0}^{1}\int _{x}^1(ay+bxy+cy^2)\text{ d }y\text{ d }x=(\nabla _w\cdot u,\;y)_K\\ \qquad \qquad \,\,\qquad =-\displaystyle \int _{0}^{1}\int _{x}^1 1\text{ d }y\text{ d }x+\int \limits _{e_1}\dfrac{1}{2}u|_{e_1}\cdot y\mathbf{n}_1\text{ d }s+\int \limits _{e_2}\dfrac{1}{2}u|_{e_2}\cdot y\mathbf{n}_2\text{ d }s\\ \qquad \qquad \qquad \quad +\displaystyle \int _{e_3}\dfrac{1}{2}u|_{e_3}\cdot y\mathbf{n}_3\text{ d }s=-\dfrac{1}{4} \end{array}} \end{aligned}$$

          So \(a=6,\;b=-6,\;c=-6\), and thus \(\nabla _w\cdot u=6-6x-6y\).
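The arithmetic of Example 1 can be checked with exact rational numbers. The sketch below is only a verification aid: the helper `solve` is a hypothetical Gauss-Jordan routine, the moments of \(1,x,y,x^2,xy,y^2\) over \(K\) are the values appearing on the left-hand sides above, and the right-hand sides are assembled from the volume term \(-(u,\nabla w)_K\) and the edge terms on \(e_2\) and \(e_3\) (the edge \(e_1\) does not contribute because \(\{\{u\}\}\cdot \mathbf{n}_1=0\)).

```python
from fractions import Fraction as F


def solve(A, b):
    """Gauss-Jordan elimination with exact fractions (small dense systems)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)   # nonzero pivot row
        M[c], M[p] = M[p], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


# Moments over K = {0 <= x <= y <= 1}: integrals of 1, x, y, x^2, xy, y^2.
m1, mx, my, mxx, mxy, myy = F(1, 2), F(1, 6), F(1, 3), F(1, 12), F(1, 8), F(1, 4)

# Normal components {{u}}.n on e_2 and e_3, with {{u}} = (1/2, 1/2).
flux_e2, flux_e3 = F(1, 2), F(-1, 2)
# Edge integrals of the test functions w = 1, x, y over e_2 (y = 1) and e_3 (x = 0).
int_e2 = [F(1), F(1, 2), F(1)]
int_e3 = [F(1), F(0), F(1, 2)]
# Volume term -(u, grad w)_K with u = (1, 1): zero for w = 1, -|K| for w = x, y.
vol = [F(0), -m1, -m1]

rhs = [v + flux_e2 * a + flux_e3 * b for v, a, b in zip(vol, int_e2, int_e3)]

a_k1 = rhs[0] / m1                                  # case k = 1
coeffs_k2 = solve([[m1, mx, my], [mx, mxx, mxy], [my, mxy, myy]], rhs)  # case k = 2
print(rhs, a_k1, coeffs_k2)
```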

3 MWG-FEMs and some lemmas

In finite element methods, mesh generation is a crucial first step in the algorithm design. For the usual finite element methods [4, 6], the meshes are mostly required to consist of simple shapes: triangles or quadrilaterals in two dimensions and tetrahedra or hexahedra in three dimensions, or their variations known as isoparametric elements. Our MWG-FEM is designed to be sufficiently flexible so that general meshes of polytopes (e.g., polygons in 2D) are allowed. For simplicity, we shall refer to the elements as polygons in the rest of the paper.

Now we introduce two bilinear forms on \(V_h\) as follows:

$$\begin{aligned} a(w,\;v)&= \sum \limits _{K\in \mathcal{T}_h}(a\nabla _w w,\;\nabla _w v)_K+\sum \limits _{K\in \mathcal{T}_h}(\nabla _w\cdot (\mathbf{b} w),\; v)_K+\sum \limits _{K\in \mathcal{T}_h}(c w,\; v)_K\\ s(w,\;v)&= \rho \sum _{K\in \mathcal{T}_h} h^{-1}\langle [[w]],\;\;[[v]]\rangle _{\partial K}, \end{aligned}$$

where \(\rho >0\) is a constant-valued parameter without the need of being “large enough”. In practical computation, one might set \(\rho =1\). Denote by \(a_s(\cdot ,\;\cdot )\) a stabilization of \(a(\cdot ,\;\cdot )\) given by

$$\begin{aligned} a_s(w,\;v):=a(w,\;v)+s(w,\;v). \end{aligned}$$

Remark 3.1

Note that the stabilization term \(s(w,\; v)\) here is different from the one defined in [11].

3.1 Modified weak Galerkin algorithm

A numerical approximation for (1) and (2) can be obtained by seeking \(u_h\in V_h\) satisfying \(u_h= g_I\) on \(\partial \Omega \) and the following equation:

$$\begin{aligned} a_s(u_h,\;v)=(f,\;v), \quad \forall \ v\in V_h^0, \end{aligned}$$
(15)

where \(g_I\) is an approximation of the Dirichlet boundary value in the polynomial space \(P_k(\partial K\cap \partial \Omega )\). For simplicity, one may take \(g_I\) as the standard \(L^2\) projection of the boundary value \(g\) on each boundary segment.
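The \(L^2\) projection used for \(g_I\) can be sketched on a single boundary segment. The example below assumes that the segment is parametrized by \(t\in [0,1]\) and that \(g\) is a polynomial in \(t\) given by its monomial coefficients; the function name `l2_project` is hypothetical. It solves the normal equations \(\sum _j c_j\int _0^1 t^{i+j}\,\mathrm{d}t=\int _0^1 g\,t^i\,\mathrm{d}t\), \(i=0,\dots ,k\), exactly.

```python
from fractions import Fraction as F


def l2_project(g_coeffs, k):
    """L2 projection onto P_k on [0, 1] of g(t) = sum_j g_coeffs[j] * t**j.

    Returns the coefficients of the projection in the basis 1, t, ..., t**k.
    """
    n = k + 1
    # Augmented normal equations; the Gram matrix has entries 1 / (i + j + 1).
    M = [[F(1, i + j + 1) for j in range(n)]
         + [sum(F(c) / (i + j + 1) for j, c in enumerate(g_coeffs))]
         for i in range(n)]
    # Gauss-Jordan elimination; the Gram matrix is symmetric positive
    # definite, so every pivot is nonzero and no row exchange is needed.
    for c in range(n):
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [row[n] for row in M]


# Projection of g(t) = t^2 onto linear polynomials (k = 1).
print(l2_project([0, 0, 1], 1))
```

Polynomials of degree at most \(k\) are reproduced by this projection, so the boundary data are matched exactly whenever \(g\) restricted to the segment already lies in \(P_k\).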

Now, we introduce a norm in \(V_h\) as follows

$$\begin{aligned} {|||} v{|||}:=\sqrt{\sum _{K\in \mathcal{T}_h}(\Vert \nabla _w v\Vert _{0,K}^2+h^{-1}\Vert [[v]]\Vert _{0,\partial K}^2)},\; v\in V_h. \end{aligned}$$
(16)

Lemma 3.1

Let \(K\in \mathcal {T}_h\) and \(e\in \partial K\). For any function \(w\in G_h\) and \(v\in L^2(\Omega )\), \(v|_{K}\in C^1(K)\),

$$\begin{aligned} \sum \limits _{K\in \mathcal {T}_h}(\nabla v,w)_{K}=\sum \limits _{K\in \mathcal {T}_h}(\nabla _w v,w)_K+\sum \limits _{e\in \mathcal {E}_h}\langle [[v]],\{\{w\}\}\rangle _e. \end{aligned}$$
(17)

Proof

From the definition of \(\nabla _w\) and integration by parts, for each \(K\in \mathcal {T}_h\), we have

$$\begin{aligned} (\nabla _w v,\; w)_K=-(v,\;\nabla \cdot w)_K+\langle \{\{v\}\},\;w\cdot \mathbf{n} \rangle _{\partial K}, \end{aligned}$$

and

$$\begin{aligned} (\nabla v,\; w)_K=-(v,\;\nabla \cdot w)_K+\langle v,\;w\cdot \mathbf{n} \rangle _{\partial K}. \end{aligned}$$

So,

$$\begin{aligned} (\nabla v-\nabla _w v,\; w)_K&=\langle (v-\{\{v\}\})\cdot \mathbf{n_1},\;w_1 \rangle _{\partial K}\\&=\langle v_1\cdot \mathbf{n_1}+v_2\cdot \mathbf{n_2},\;\dfrac{w_1}{2} \rangle _{\partial K}=\langle [[v]],\;\dfrac{w_1}{2} \rangle _{\partial K}, \end{aligned}$$

where \(v_1, w_1\) are the values of \(v, w\) restricted to the element \(K\), and \(v_2\) is the value of \(v\) restricted to the element adjacent to \(K\) across the edge under consideration. Summing these identities over all \(K\in \mathcal {T}_h\) yields

$$\begin{aligned} \sum \limits _{K\in \mathcal {T}_h}(\nabla v-\nabla _w v,\; w)_K=\sum \limits _{e\in \mathcal {E}_h}\langle [[v]],\;\{\{w\}\}\rangle _e. \end{aligned}$$

\(\square \)

Lemma 3.2

Let \(K\in \mathcal {T}_h\). For any function \(v\in V_h^0\),

$$\begin{aligned} \sum \limits _{K\in \mathcal {T}_h}(\nabla _w\cdot (\mathbf{b} v),\;v)_{K}=\sum \limits _{K\in \mathcal {T}_h}(\dfrac{\nabla \cdot \mathbf{b}}{2}v,\;v)_K. \end{aligned}$$
(18)

Proof

From the definition of weak divergence \(\nabla _w\cdot \), for \(v\in V_h^0\), we have

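$$\begin{aligned} \sum \limits _{K\in \mathcal {T}_h}(\nabla _w\cdot (\mathbf{b} v),\;v)_{K}&=-\sum \limits _{K\in \mathcal {T}_h}(\mathbf{b}v,\;\nabla v)_K+\sum \limits _{e\in \mathcal {E}_h}\langle \{\{\mathbf{b}v\}\},\;[[v]] \rangle _e\\&=\sum \limits _{K\in \mathcal {T}_h}(\dfrac{\nabla \cdot \mathbf{b}}{2}v,\;v)_K-\sum \limits _{e\in \mathcal {E}_h}\langle \{\{\mathbf{b}v\}\},\;[[v]] \rangle _e\\&\quad +\sum \limits _{e\in \mathcal {E}_h}\langle \{\{\mathbf{b}v\}\},\;[[v]] \rangle _e. \end{aligned}$$

Here the second equality uses \(\mathbf{b}v\cdot \nabla v=\frac{1}{2}\mathbf{b}\cdot \nabla (v^2)\), integration by parts on each \(K\), and, for continuous \(\mathbf{b}\), the identity \([[v^2]]=2\{\{v\}\}[[v]]\) on interior edges together with \(v=0\) on \(\partial \Omega \). The two edge sums cancel, which gives (18). \(\square \)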
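The cancellation of the edge terms can be checked numerically in a one-dimensional analogue. The sketch below is an illustration under simplifying assumptions, not the two-dimensional setting of the lemma: the domain is an interval, \(v\) is a discontinuous piecewise linear function vanishing at the two end points, \(b\) is a linear function, and the jump at an interior node is the left value minus the right value. The function names `simpson` and `lemma_sides` are hypothetical.

```python
from fractions import Fraction as F


def simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree 3."""
    return (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))


def lemma_sides(nodes, vals, b, db):
    """Both sides of the 1D analogue of identity (18).

    vals[i] = (value at the left end, value at the right end) of v on cell i.
    b, db: convection coefficient and its derivative (b assumed linear).
    """
    lhs = rhs = F(0)
    for i, (vl, vr) in enumerate(vals):
        x0, x1 = nodes[i], nodes[i + 1]
        slope = (vr - vl) / (x1 - x0)
        v = lambda x, vl=vl, x0=x0, slope=slope: vl + slope * (x - x0)
        lhs -= simpson(lambda x: b(x) * v(x) * slope, x0, x1)    # -(b v, v')_K
        rhs += simpson(lambda x: db(x) / 2 * v(x) ** 2, x0, x1)  # (b'/2 v, v)_K
    for i in range(1, len(nodes) - 1):                           # interior nodes
        v_left, v_right = vals[i - 1][1], vals[i][0]
        lhs += b(nodes[i]) * (v_left + v_right) / 2 * (v_left - v_right)
    return lhs, rhs


nodes = [F(0), F(1, 2), F(1)]
vals = [(F(0), F(2)), (F(5), F(0))]          # v jumps from 2 to 5 at x = 1/2
lhs, rhs = lemma_sides(nodes, vals, b=lambda x: 1 + x, db=lambda x: F(1))
print(lhs, rhs)
```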

We are now in a position to establish the existence and uniqueness of the solution of the MWG-FEM (15). Since (15) is a square linear system on a finite-dimensional space, it suffices to prove that the solution is unique. To this end, we let \(f\,=\,g\,=\,0\).

Lemma 3.3

The modified WG finite element scheme (15) has a unique solution.

Proof

Taking \(v=u_h\in V_h^0\) in (15) and using Lemma 3.2 gives

$$\begin{aligned} (a\nabla _w u_h,\;\nabla _w u_h)+\left( \left( \dfrac{\nabla \cdot \mathbf{b}}{2}+c \right) u_h,\; u_h \right) +s(u_h,\;u_h)=0, \end{aligned}$$

combining this with (3) implies that \( s(u_h,u_h)=0 \) and \( (a\nabla _wu_h,\;\nabla _wu_h)=0 \). The first identity gives \([[u_h]]=0\) on every edge, so \(u_h\) is continuous in the whole domain \(\Omega \). Therefore \(\nabla _wu_h=\nabla u_h=0\), and thus \(u_h=C\) in \(\Omega \). Since \(u_h=0\) on \(\partial \Omega \), we conclude that \(u_h\equiv 0\), which completes the proof of the lemma.\(\square \)

The following lemma provides some estimates for the projection operators \(Q_h\) and \(\mathcal {Q}_h\). Observe that the underlying mesh \({\mathcal T}_h\) is assumed to be sufficiently general to allow polygons. A proof of the lemma can be found in [7, 12]. It should be pointed out that the proof of the lemma requires some non-trivial technical tools in analysis, which have also been established in [7, 12].

Lemma 3.4

Let \({\mathcal T}_h\) be a finite element partition of \(\Omega \) satisfying the shape regularity assumptions A1–A3. Then, for any \(\phi \in H^{k+1}(\Omega )\), we have

$$\begin{aligned}&\sum _{K\in {\mathcal T}_h} \Vert \phi -Q_h\phi \Vert _{K}^2 +\sum _{K\in {\mathcal T}_h}h_K^2 \Vert \nabla (\phi -Q_h\phi )\Vert _{K}^2\le C h^{2(k+1)} \Vert \phi \Vert ^2_{k+1},\end{aligned}$$
(19)
$$\begin{aligned}&\sum _{K\in {\mathcal T}_h} \Vert a(\nabla \phi -\mathcal {Q}_h(\nabla \phi ))\Vert ^2_{K} \le Ch^{2k} \Vert \phi \Vert ^2_{k+1}. \end{aligned}$$
(20)

Here and in what follows of this paper, \(C\) denotes a generic constant independent of the mesh size \(h\) and the functions in the estimates.

Let \(K\) be an element with \(e\) as an edge. For any function \(\varphi \in H^1(K)\), the following trace inequality has been proved to be valid for general meshes satisfying A1–A3 (see [7, 12] for details):

$$\begin{aligned} \Vert \varphi \Vert _{e}^2 \le C \left( h_K^{-1} \Vert \varphi \Vert _K^2 + h_K \Vert \nabla \varphi \Vert _{K}^2\right) . \end{aligned}$$
(21)

Using (21), we can obtain the following estimates.

Lemma 3.5

Assume that \(\mathcal {T}_h\) is shape regular. Then, for any \(w\in H^{k+1}(\Omega )\) and \(v\in V_h\), we have the following estimates

$$\begin{aligned} \left| \sum \limits _{K\in \mathcal {T}_h}h_K^{-1}\langle [[Q_hw]],[[v]]\rangle _{\partial K}\right| \le Ch^{k}\Vert w\Vert _{k+1}{|||}{v}{|||}, \end{aligned}$$
(22)

and

$$\begin{aligned} \left| \sum \limits _{K\in \mathcal {T}_h}\langle \{\{a(\nabla w-\mathcal {Q}_h\nabla w)\}\},[[v]]\rangle _{\partial K}\right| \le Ch^{k}\Vert w\Vert _{k+1}{|||}{v}{|||}. \end{aligned}$$
(23)

Proof

Using the definition of \(Q_h\), trace inequality (21), Lemma 3.4, and \([[w]]=0\) for \(w\in H^{k+1}(\Omega )\), we have

$$\begin{aligned} \begin{array}{ll} &{}\left| \sum \limits _{K\in \mathcal {T}_h}h_K^{-1}\langle [[Q_hw]],[[v]]\rangle _{\partial K}\right| =\left| \sum \limits _{K\in \mathcal {T}_h}h_K^{-1}\langle [[Q_hw-w]],[[v]]\rangle _{\partial K}\right| \\ &{}\le C\sum \limits _{K\in \mathcal {T}_h}h_K^{-1}\Vert [[Q_hw-w]]\Vert _{\partial K}\Vert [[v]]\Vert _{\partial K}\\ &{}\le C \left( \sum \limits _{K\in \mathcal {T}_h}h_K^{-1}\Vert [[Q_hw-w]]\Vert _{\partial K}^{2} \right) ^{1/2} \left( \sum \limits _{K\in \mathcal {T}_h}h_K^{-1}\Vert [[v]]\Vert _{\partial K}^2 \right) ^{1/2}\\ &{}\le C \left( \sum \limits _{K\in \mathcal {T}_h}\left( h_K^{-2}\Vert Q_hw-w\Vert _{K}^{2}+\Vert \nabla (Q_hw-w)\Vert _{K}^{2}\right) \right) ^{1/2} \left( \sum \limits _{K\in \mathcal {T}_h}h_K^{-1}\Vert [[v]]\Vert _{\partial K}^2 \right) ^{1/2}\\ &{}\le Ch^{k}\Vert w\Vert _{k+1}{|||}{v}{|||}. \end{array} \end{aligned}$$
(24)

Similarly, it follows from (21) and Lemma 3.4 that

$$\begin{aligned} \begin{array}{ll} &{}\left| \sum \limits _{K\in \mathcal {T}_h}\langle \{\{a(\nabla w-\mathcal {Q}_h\nabla w)\}\},[[v]]\rangle _{\partial K}\right| \\ &{}\le C \left( \sum \limits _{K\in \mathcal {T}_h}h_K\Vert a(\nabla w-\mathcal {Q}_h\nabla w)\Vert ^2_{\partial K}\right) ^{1/2} \left( \sum \limits _{K\in \mathcal {T}_h}h_K^{-1}\Vert [[v]]\Vert ^2_{\partial K}\right) ^{1/2}\\ &{}\le Ch^{k}\Vert w\Vert _{k+1}{|||}{v}{|||}. \end{array} \end{aligned}$$
(25)

This completes the proof of Lemma 3.5.\(\square \)

4 Error analysis

The goal of this section is to establish some error estimates for the MWG finite element solution \(u_h\) arising from (15). The error will be measured in two natural norms: the triple-bar norm as defined in (16) and the standard \(L^2\) norm. The triple bar norm is essentially a discrete \(H^1\) norm for the underlying weak function.

4.1 Error equation

For simplicity of analysis, we assume that the diffusion coefficient \(a\) in (1) is piecewise constant with respect to the finite element partition \(\mathcal {T}_h\). The result can be extended to variable coefficients without any difficulty, provided that \(a\) is piecewise sufficiently smooth.

Lemma 4.1

Let \(u_h\in V_h\) be the MWG finite element solution of the problem (1, 2) arising from (15). Assume the exact solution \(u\in H^{k+1}(\Omega )\). Then, there exists a constant \(C\) such that

$$\begin{aligned} {|||} Q_hu-u_h{|||} \le C(h^{k}+\Vert Q_hu-u_h\Vert )|u|_{k+1}. \end{aligned}$$
(26)

Proof

Let \(\phi \in H^1(\Omega )\) and \(v\in V_h\) be any finite element function. It follows from the definition of the discrete weak divergence (13), the definition of the discrete weak gradient (11), and integration by parts that

$$\begin{aligned} \sum \limits _{K\in \mathcal {T}_h}(\nabla \cdot (\mathbf{b}\phi ),\; v)_K&= -\sum \limits _{K\in \mathcal {T}_h}(\mathbf{b}\phi ,\;\nabla v)_K+\sum \limits _{K\in \mathcal {T}_h}\langle \mathbf{b}\phi ,\; [[v]]\rangle _{\partial K}\nonumber \\&= \sum \limits _{K\in \mathcal {T}_h}(\nabla _w\cdot (\mathbf{b}\phi ),\;v)_K, \end{aligned}$$
(27)

and

$$\begin{aligned} -\sum \limits _{K\in \mathcal {T}_h}(\nabla \cdot (a\nabla \phi ),\;v)_K&=\sum \limits _{K\in \mathcal {T}_h}(a\nabla \phi ,\;\nabla v)_K-\sum \limits _{e\in \mathcal {E}_h}\langle a\nabla \phi ,\; [[v]]\rangle _e\nonumber \\&=\sum \limits _{K\in \mathcal {T}_h}(a\nabla _w\phi ,\;\nabla v)_K-\sum \limits _{e\in \mathcal {E}_h}\langle a\nabla \phi ,\; [[v]]\rangle _e\nonumber \\&=\sum \limits _{K\in \mathcal {T}_h}(a\nabla _w\phi ,\;\nabla _w v)_K+\sum \limits _{e\in \mathcal {E}_h}{\langle }[[v]],\; \{\{a\nabla _w \phi \}\}{\rangle }_e\nonumber \\&\quad -\sum \limits _{e\in \mathcal {E}_h}\langle a\nabla \phi ,\; [[v]]\rangle _e\nonumber \\&=\sum \limits _{K\in \mathcal {T}_h}(a\nabla _w\phi ,\;\nabla _w v)_K\!+\!\sum \limits _{e\in \mathcal {E}_h}{\langle }[[v]],\; \{\{a\nabla _w \phi \!-\!a\nabla \phi \}\}{\rangle }_e.\nonumber \\ \end{aligned}$$
(28)

Testing (1) by using \(v\in V_h^0\) we arrive at

$$\begin{aligned} \sum _{K\in {\mathcal T}_h}(-\nabla \cdot (a\nabla u),\; v)_K +\sum _{K\in {\mathcal T}_h}(\nabla \cdot (\mathbf{b} u),\; v)_K+(c u,\; v)=(f,\; v). \end{aligned}$$
(29)

Letting \(\phi =u\) in (27) and (28), combining (27), (28), and (29), we have that

$$\begin{aligned} \begin{array}{ll} \sum \limits _{K\in {\mathcal T}_h}(a\nabla _w u,\; \nabla _w v)_K +\sum \limits _{e\in \mathcal {E}_h}\langle \{\{a\nabla _wu-a\nabla u\}\},\;[[v]]\rangle _{e}\\ \qquad +\sum \limits _{K\in \mathcal {T}_h}(\nabla _w\cdot (\mathbf{b}u),\;v)_K+(c u,\; v)=(f,\; v). \end{array} \end{aligned}$$
(30)

Subtracting (15) from (30) yields the following error equation

$$\begin{aligned} a_s(u-u_h,\; v)=\sum _{K\in {\mathcal T}_h} \langle \{\{a\nabla u-a\nabla _w u\}\},\; [[v]]\rangle _{\partial K}, \forall v\in V_h^0. \end{aligned}$$
(31)

We have used the fact \([[u]]=0\) for the true solution \(u\).

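Denote \(\theta =u-u_h\) and \(\rho =u-Q_hu\) (here \(\rho \) denotes the projection error, not the stabilization parameter in \(s(\cdot ,\cdot )\)). Letting \(v=Q_hu-u_h\) in (31) and using \(\theta =\rho +v\), we have

$$\begin{aligned} a_s(v,\; v)=-a_s(\rho ,\;v)+\sum _{K\in {\mathcal T}_h} \langle \{\{a\nabla u-a\nabla _w u\}\},\; [[v]]\rangle _{\partial K}\equiv r_1+r_2, \end{aligned}$$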
(32)

and

$$\begin{aligned} r_1&= -a_s(\rho ,\; v)\nonumber \\&= -\sum \limits _{K\in \mathcal {T}_h}(a\nabla _w\rho ,\;\nabla _w v)_K-\sum \limits _{K\in \mathcal {T}_h}(\nabla _w\cdot (\mathbf{b}\rho ),\;v)_K-(c\rho ,\;v)-s(\rho , \;v)\nonumber \\&\equiv r_{1}^{1}+r_{1}^{2}+r_{1}^{3}+r_{1}^{4}. \end{aligned}$$
(33)

Obviously, we have

$$\begin{aligned} \left| r_{1}^{1}\right|&= \left| -\sum \limits _{K\in \mathcal {T}_h}(a\nabla _w\rho ,\;\nabla _w v)_K\right| \nonumber \\&\le \sum \limits _{K\in \mathcal {T}_h}\left| (a\nabla \rho ,\;\nabla _w v)_K\right| +C\sum \limits _{e\in \mathcal {E}_h}\left| {\langle }[[\rho ]],\;\{\{\nabla _w v\}\}{\rangle }_e\right| \nonumber \\&\le {|||} v{|||} Ch^{k}|u|_{k+1}+Ch^{k+1/2}|u|_{k+1}h^{-1/2}\Vert \nabla _wv\Vert \le Ch^k {|||}v{|||}|u|_{k+1},\quad \end{aligned}$$
(34)
$$\begin{aligned} \left| r_{1}^{3}\right| =\left| -(c\rho ,\; v)\right| \le Ch^{k+1}\Vert v\Vert |u|_{k+1} \end{aligned}$$

and

$$\begin{aligned} \left| r_{1}^{4}\right| =\left| -s(\rho , \;v)\right| \le Ch^{k+1/2}|u|_{k+1}h^{-1/2}{|||}v{|||}\le Ch^{k}|u|_{k+1}{|||}v{|||}. \end{aligned}$$

Note that

$$\begin{aligned} \sum \limits _{K\in \mathcal {T}_h}(\nabla _w\cdot (\mathbf{b}\rho ),\; v)_K=\sum \limits _{K\in \mathcal {T}_h}(\nabla \cdot (\mathbf{b}\rho ),\;v)_K-\sum \limits _{e\in \mathcal {E}_h}{\langle }[[\mathbf{b}\rho ]],\;\{\{v\}\}{\rangle }_e. \end{aligned}$$

We have

$$\begin{aligned} \left| r_{1}^{2}\right| =\left| -\sum \limits _{K\in \mathcal {T}_h}(\nabla _w\cdot (\mathbf{b}\rho ),\; v)_K\right|&\le C|\mathbf{b}\rho |_1\Vert v\Vert +Ch^{-1/2}\Vert v\Vert h^{k+1/2}|u|_{k+1}\\&\le Ch^k\Vert v\Vert |u|_{k+1}. \end{aligned}$$

So, we obtain

$$\begin{aligned} \left| r_1\right| \le Ch^k|u|_{k+1}({|||} v{|||} +\Vert v\Vert ). \end{aligned}$$
(35)

Using the trace inequality and the Cauchy–Schwarz inequality, we get

$$\begin{aligned} \begin{array}{ll}\left| r_2\right| &{}=\left| -\sum \limits _{K\in {\mathcal T}_h} {\langle }\{\{a\nabla u-a\nabla _w u\}\},\; [[v]]{\rangle }_{\partial K} \right| \\ &{}\le C{|||} v{|||}h^{1/2}(h^{-1/2}\Vert a(\nabla u-\nabla _w u)\Vert +h^{1/2}\Vert a(\nabla u-\nabla _wu)\Vert _1)\\ &{}\le Ch^k|u|_{k+1}{|||} v{|||}. \end{array} \end{aligned}$$
(36)

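Using the coercivity \(a_s(v,v)\ge C{|||}v{|||}^2\), which follows from Lemma 3.2, condition (3), and the lower bound \(a\ge a_0\), together with (32), (35), (36), and the inequality \(\mathfrak {a}\mathfrak {b}\le \epsilon \mathfrak {a}^2+\dfrac{1}{\epsilon } \mathfrak {b}^2\), it is not hard to get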

$$\begin{aligned} {|||} v{|||}\le C(h^k+\Vert v\Vert )|u|_{k+1}. \end{aligned}$$
(37)

\(\square \)

Corollary 4.1

Let \(u_h\in V_h\) be the MWG finite element solution of the problem (1, 2) arising from (15). Assume that the exact solution \(u\in H^{k+1}(\Omega )\). Then, there exists a constant \(C\) such that

$$\begin{aligned} {|||} u-u_h{|||} \le C(h^{k}+\Vert u-u_h\Vert )\Vert u\Vert _{k+1}. \end{aligned}$$
(38)

Proof

From the triangle inequality, Lemma 4.1, and the approximation properties of \(Q_h\), we have

$$\begin{aligned} \begin{array}{ll} {|||} u-u_h{|||} &{}\le {|||} Q_hu-u_h{|||}+{|||} u-Q_hu{|||}\\ &{}\le C(h^k+\Vert Q_hu-u_h\Vert )\Vert u\Vert _{k+1}+Ch^{k}\Vert u\Vert _{k+1}\\ &{}\le C(h^k+\Vert u-u_h\Vert +\Vert u-Q_hu\Vert )\Vert u\Vert _{k+1}\\ &{}\le C(h^k+\Vert u-u_h\Vert )\Vert u\Vert _{k+1}. \end{array} \end{aligned}$$

This completes the proof. \(\square \)

4.2 Error estimates

The error equation (32) can be used to derive the following error estimate for the MWG finite element solution.

To obtain an error estimate in the standard \(L^2\) norm, we consider a dual problem that seeks \(\Phi \in H_0^1(\Omega )\cap H^2(\Omega )\) satisfying

$$\begin{aligned} -\nabla \cdot (a \nabla \Phi )-\mathbf{b}\cdot \nabla \Phi +c\Phi&= \theta \quad \text{ in }\;\Omega . \end{aligned}$$
(39)

Assume that the usual \(H^{2}\)-regularity is satisfied for the dual problem. This means that there exists a constant \(C\) such that

$$\begin{aligned} \Vert \Phi \Vert _2\le C\Vert \theta \Vert , \end{aligned}$$
(40)

where \(\theta =u-u_h\).

Theorem 4.1

Let \(u_h\in V_h\) be the MWG finite element solution of the problem (1, 2) arising from (15). Assume that the exact solution \(u\in H^{k+1}(\Omega )\) and the dual problem (39) has the usual \(H^2\)-regularity (40). Then, there exists a constant \(C\) such that

$$\begin{aligned} \Vert u-u_h\Vert \le Ch^{k+1}\Vert u\Vert _{k+1}. \end{aligned}$$
(41)
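In practice, an estimate of the form (41) is usually compared with computed errors through the observed convergence order \(\log (e_i/e_{i+1})/\log (h_i/h_{i+1})\) on successively refined meshes. The sketch below shows this computation; the function name `observed_orders` is hypothetical, and the error values are synthetic numbers generated from \(e=0.7\,h^{k+1}\) with \(k=1\) purely for illustration. They are not results of the MWG-FEM.

```python
from math import log


def observed_orders(hs, errors):
    """Observed convergence orders between consecutive mesh levels."""
    levels = list(zip(hs, errors))
    return [log(e0 / e1) / log(h0 / h1)
            for (h0, e0), (h1, e1) in zip(levels, levels[1:])]


# Synthetic data (NOT computed results): errors behaving exactly like h^(k+1), k = 1.
hs = [1 / 4, 1 / 8, 1 / 16, 1 / 32]
errors = [0.7 * h ** 2 for h in hs]
print(observed_orders(hs, errors))
```

For errors that follow \(Ch^{k+1}\), the observed orders approach \(k+1\) as the mesh is refined.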

Proof

By testing (39) with \(\theta \) we obtain

$$\begin{aligned} \Vert \theta \Vert ^2&= -(\nabla \cdot (a\nabla \Phi ),\;\theta )-(\mathbf{b}\cdot \nabla \Phi ,\;\theta ) +(c\Phi ,\;\theta )\nonumber \\&= \sum _{K\in {\mathcal T}_h}(a\nabla \Phi ,\; \nabla \theta )_K-\sum _{K\in {\mathcal T}_h}{\langle }a\nabla \Phi \cdot \mathbf{n},\ \theta {\rangle }_{\partial K}\nonumber \\&-\sum \limits _{K\in \mathcal {T}_h}(\nabla \Phi ,\; \mathbf{b}\theta )_K+(c\Phi ,\;\theta )\nonumber \\&= \sum _{K\in {\mathcal T}_h}(\nabla Q_h\Phi ,\; a\nabla \theta )_K-\sum _{e\in \mathcal {E}_h}{\langle }a\nabla \Phi ,\; [[\theta ]]{\rangle }_{e}+\sum \limits _{K\in \mathcal {T}_h}(\Phi ,\; \nabla \cdot (\mathbf{b}\theta ))_K\nonumber \\&-\sum \limits _{e\in \mathcal {E}_h}{\langle }\Phi ,\;[[\mathbf{b}\theta ]]{\rangle }_e+(c\Phi ,\;\theta )+\sum \limits _{K\in \mathcal {T}_h}(\nabla (\Phi -Q_h\Phi ),\;a\nabla \theta )_K\nonumber \\&= \sum _{K\in {\mathcal T}_h}(a\nabla Q_h\Phi ,\; \nabla \theta )_K-\sum _{e\in \mathcal {E}_h}{\langle }a\nabla \Phi ,\; [[\theta ]]{\rangle }_{e}+\sum \limits _{K\in \mathcal {T}_h}(Q_h\Phi ,\; \nabla \cdot (\mathbf{b}\theta ))_K\nonumber \\&+\sum \limits _{K\in \mathcal {T}_h}(\Phi -Q_h\Phi ,\; \nabla \cdot (\mathbf{b}\theta ))_K-\sum \limits _{e\in \mathcal {E}_h}{\langle }\Phi ,\;[[\mathbf{b}\theta ]]{\rangle }_e\nonumber \\&+\sum \limits _{K\in \mathcal {T}_h}(\nabla (\Phi -Q_h\Phi ),\;a\nabla \theta )_K +(c\Phi ,\;\theta )\nonumber \\&= \sum _{K\in {\mathcal T}_h}(a\nabla Q_h\Phi ,\; \nabla _w \theta )_K+\sum \limits _{e\in \mathcal {E}_h}{\langle }[[\theta ]],\;\{\{a\nabla Q_h\Phi \}\}{\rangle }_e-\sum _{e\in \mathcal {E}_h}{\langle }a\nabla \Phi ,\; [[\theta ]]{\rangle }_{e}\nonumber \\&+\sum \limits _{K\in \mathcal {T}_h}(\Phi -Q_h\Phi ,\; \nabla \cdot (\mathbf{b}\theta ))_K+\sum \limits _{K\in \mathcal {T}_h}(Q_h\Phi ,\; \nabla _w\cdot (\mathbf{b}\theta ))_K\nonumber \\&+\sum \limits _{e\in \mathcal {E}_h}{\langle }[[\mathbf{b}\theta ]],\;\{\{Q_h\Phi \}\}{\rangle }_e\!-\!\sum \limits _{e\in \mathcal {E}_h}{\langle }\Phi ,\;[[\mathbf{b}\theta ]]{\rangle }_e\!+\!\sum \limits _{K\in \mathcal {T}_h}(\nabla (\Phi -Q_h\Phi ),\;a\nabla \theta )_K\nonumber \\&+\,(cQ_h\Phi ,\;\theta )+(c(\Phi -Q_h\Phi ),\;\theta )\nonumber \\&= \sum _{K\in {\mathcal T}_h}(a\nabla _w Q_h\Phi ,\; \nabla _w \theta )_K+\sum \limits _{e\in \mathcal {E}_h}{\langle }[[aQ_h\Phi ]],\; \{\{\nabla _w\theta \}\}{\rangle }_e+(cQ_h\Phi ,\;\theta )\nonumber \\&+\,(c(\Phi -Q_h\Phi ),\;\theta )+\sum \limits _{e\in \mathcal {E}_h}{\langle }[[\theta ]],\;\{\{a(\nabla Q_h\Phi -\nabla \Phi )\}\}{\rangle }_e\nonumber \\&+\sum \limits _{K\in \mathcal {T}_h}(\Phi -Q_h\Phi ,\; \nabla \cdot (\mathbf{b}\theta ))_K+\sum \limits _{K\in \mathcal {T}_h}( Q_h\Phi ,\;\nabla _w\cdot (\mathbf{b}\theta ))_K\nonumber \\&-\sum \limits _{e\in \mathcal {E}_h}{\langle }[[\mathbf{b}\theta ]],\;\{\{\Phi -Q_h\Phi \}\}{\rangle }_e+\sum \limits _{K\in \mathcal {T}_h}(\nabla (\Phi -Q_h\Phi ),\;a\nabla \theta )_K. \end{aligned}$$
(42)

Here, we have used

$$\begin{aligned} \sum \limits _{K\in \mathcal {T}_h}(\nabla _w\cdot (\mathbf{b}w),\;q)_K=\sum \limits _{K\in \mathcal {T}_h}(\nabla \cdot (\mathbf{b}w),\;q)_K-\sum \limits _{e\in \mathcal {E}_h}{\langle }[[\mathbf{b}w]],\;\{\{q\}\}{\rangle }_e. \end{aligned}$$

Letting \(v=Q_h\Phi \) in (31) yields

$$\begin{aligned}&\sum \limits _{K\in \mathcal {T}_h}(a\nabla _w\theta ,\;\nabla _w Q_h\Phi )_K+\sum \limits _{K\in \mathcal {T}_h}(\nabla _w\cdot (\mathbf{b}\theta ),\; Q_h\Phi )_K+(c\theta ,\;Q_h\Phi )\nonumber \\&\quad =-s(\theta ,\;Q_h\Phi )+\sum _{K\in {\mathcal T}_h} \langle \{\{a(\nabla u-\nabla _w u)\}\},\; [[Q_h\Phi ]]\rangle _{\partial K}. \end{aligned}$$
(43)

Substituting (43) into (42), we arrive at

$$\begin{aligned} \Vert \theta \Vert ^2&= -s(\theta ,\;Q_h\Phi )+\sum _{K\in {\mathcal T}_h} \langle \{\{a(\nabla u-\nabla _w u)\}\},\; [[Q_h\Phi ]]\rangle _{\partial K}\nonumber \\&+\,(c(\Phi -Q_h\Phi ),\;\theta )+\sum \limits _{e\in \mathcal {E}_h}{\langle }[[\theta ]],\;\{\{a(\nabla Q_h\Phi -\nabla \Phi )\}\}{\rangle }_e\nonumber \\&+\sum \limits _{K\in \mathcal {T}_h}(\Phi -Q_h\Phi ,\; \nabla \cdot (\mathbf{b}\theta ))_K+\sum \limits _{e\in \mathcal {E}_h}{\langle }[[aQ_h\Phi ]],\; \{\{\nabla _w\theta \}\}{\rangle }_e\nonumber \\&-\sum \limits _{e\in \mathcal {E}_h}{\langle }[[\mathbf{b}\theta ]],\;\{\{\Phi -Q_h\Phi \}\}{\rangle }_e+\sum \limits _{K\in \mathcal {T}_h}(\nabla (\Phi -Q_h\Phi ),\;a\nabla \theta )_K\nonumber \\&= \sum \limits _{i=1}^8 R_i. \end{aligned}$$
(44)

Let us bound the terms on the right-hand side of (44) one by one. Using the Cauchy-Schwarz inequality, the definition of \(s(\cdot ,\cdot )\), and (26), we obtain

$$\begin{aligned}&\left| R_1\right| =\left| -s(\theta ,\;Q_h\Phi -\Phi )\right| \le {|||}\theta {|||} Ch^{-1/2}h^{3/2}\Vert \Phi \Vert _2 \le C(h^k+\Vert \theta \Vert )h\Vert \theta \Vert , \end{aligned}$$

and

$$\begin{aligned} \left| R_3\right| \le Ch^2\Vert \Phi \Vert _2\Vert \theta \Vert . \end{aligned}$$

From the trace inequality (21) and the estimate (19) we have

$$\begin{aligned} \left| R_2\right|&= \left| \sum \limits _{e\in \mathcal {E}_h}{\langle }[[Q_h\Phi -\Phi ]],\;\{\{a\nabla _w u-a\nabla u\}\}{\rangle }_e\right| \le Ch^{k+1}|u|_{k+1}\Vert \theta \Vert ,\\ \left| R_6 \right|&= \left| \sum \limits _{e\in \mathcal {E}_h}{\langle }[[aQ_h\Phi ]],\; \{\{\nabla _w\theta \}\}{\rangle }_e\right| \le C(h^k+\Vert \theta \Vert )h\Vert \theta \Vert \end{aligned}$$

and

$$\begin{aligned} \left| R_4\right| =\left| \sum \limits _{e\in \mathcal {E}_h}{\langle }[[\theta ]],\;\{\{a(\nabla Q_h\Phi -\nabla \Phi )\}\}{\rangle }_e\right| \le C (h^k+\Vert \theta \Vert )h\Vert \theta \Vert , \end{aligned}$$

and

$$\begin{aligned} \left| R_7\right| =\left| -\sum \limits _{e\in \mathcal {E}_h}{\langle }[[\mathbf{b}\theta ]],\;\{\{\Phi -Q_h\Phi \}\}{\rangle }_e\right| \le C(h^k+\Vert \theta \Vert )h^2\Vert \theta \Vert . \end{aligned}$$

Using the Cauchy-Schwarz inequality, we obtain

$$\begin{aligned} \begin{array}{ll}&\left| R_5\right| =\left| \sum \limits _{K\in \mathcal {T}_h}\left( \Phi -Q_h\Phi ,\; \nabla \cdot (\mathbf{b}\theta )\right) _K\right| \le Ch^2(h^k+\Vert \theta \Vert )\Vert \theta \Vert , \end{array} \end{aligned}$$

and

$$\begin{aligned} \begin{array}{ll}&\left| R_8\right| =\left| \sum \limits _{K\in \mathcal {T}_h}\left( \nabla (\Phi -Q_h\Phi ),\; a\nabla \theta \right) _K\right| \le C(h^k+\Vert \theta \Vert )h\Vert \theta \Vert . \end{array} \end{aligned}$$

Substituting the estimates for \(R_i,\;i=1,2,\cdots , 8\), into (44) and dividing by \(\Vert \theta \Vert \) yields

$$\begin{aligned} \begin{array}{ll} \Vert \theta \Vert&\le C((h^k+\Vert \theta \Vert )h+h^{k+1}|u|_{k+1}). \end{array} \end{aligned}$$

Thus,

$$\begin{aligned} \begin{array}{ll} \Vert \theta \Vert&\le Ch^{k+1}|u|_{k+1}. \end{array} \end{aligned}$$
(45)

which implies (41). This completes the proof.\(\square \)
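
The passage from the inequality preceding (45) to (45) itself is an absorption argument. As a sketch, assuming \(h\) is small enough that \(Ch\le 1/2\) (a smallness condition not stated explicitly in the proof), expanding the right-hand side and moving the \(\Vert \theta \Vert \) term to the left gives

$$\begin{aligned} (1-Ch)\Vert \theta \Vert&\le Ch^{k+1}(1+|u|_{k+1}),\nonumber \\ \Vert \theta \Vert&\le 2Ch^{k+1}(1+|u|_{k+1}). \end{aligned}$$

The estimate (45) records this bound with the \(h\)-independent factor collected into \(C|u|_{k+1}\).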

Theorem 4.2

Let \(u_h\in V_h\) be the MWG finite element solution of problem (1)–(2) arising from (15). Assume that the exact solution \(u\in H^{k+1}(\Omega )\). Then, there exists a constant \(C\) such that

$$\begin{aligned} {|||} u-u_h{|||} \le Ch^{k}\Vert u\Vert _{k+1}. \end{aligned}$$
(46)

Proof

It follows from Corollary 4.1 that

$$\begin{aligned} {|||} u-u_h{|||}\le C(h^k+\Vert u-u_h\Vert )\Vert u\Vert _{k+1}. \end{aligned}$$
(47)

Combining (47) with (41) yields (46). This completes the proof.\(\square \)

5 Numerical experiment

In this section, we report several numerical results for the MWG finite element method. For simplicity, we take \(\mathbf{b}=(1,1)^T\), \(c=1\), and \(a=1,\ 0.01,\ 0.0001\) on the square domain \(\Omega =[0,1]\times [0,1]\) with a uniform triangulation. The triangular mesh is constructed by: 1) uniformly partitioning the domain into \(N\times N\) sub-rectangles; 2) dividing each rectangular element by the diagonal line with a negative slope. Denote the mesh size by \(h=1/N\).
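
The two-step mesh construction above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `unit_square_mesh` and `triangle_area`, the node numbering, and the vertex ordering are assumptions introduced here.

```python
def unit_square_mesh(n):
    """Uniform triangulation of [0,1]x[0,1]: n x n sub-rectangles, each split
    by its negative-slope diagonal (top-left corner to bottom-right corner).
    Names and numbering are illustrative assumptions."""
    h = 1.0 / n
    # Node (i, j) sits at (i*h, j*h) and gets index j*(n+1) + i.
    nodes = [(i * h, j * h) for j in range(n + 1) for i in range(n + 1)]
    triangles = []
    for j in range(n):
        for i in range(n):
            bl = j * (n + 1) + i   # bottom-left corner of the cell
            br = bl + 1            # bottom-right
            tl = bl + (n + 1)      # top-left
            tr = tl + 1            # top-right
            # Both triangles share the edge tl-br, which has slope -1.
            triangles.append((bl, br, tl))
            triangles.append((br, tr, tl))
    return nodes, triangles, h


def triangle_area(nodes, tri):
    """Signed area; positive for counterclockwise vertex order."""
    (x0, y0), (x1, y1), (x2, y2) = (nodes[v] for v in tri)
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
```

With this numbering, a mesh with \(N\) subdivisions has \((N+1)^2\) nodes and \(2N^2\) triangles, each of area \(h^2/2\).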

All the numerical experiments use linear weak Galerkin elements (\(k=1\)) in the finite element space \(V_h\). In this case, \(u_h\) is a linear combination of piecewise linear basis functions.

Let \(u_h\) and \(u\) be the solutions of the MWG scheme and of the original problem, respectively, and denote \(e_h=u-u_h\). The accuracy and efficiency are examined in the following tests. The following norms are measured in all the numerical experiments:

Discrete \(H^1\) norm: \({|||} e_h{|||}=\left\{ \sum \limits _{K\in \mathcal {T}_h} \left( \int _K|\nabla _w e_h|^2\text{ d }x+h^{-1}\int _{\partial K}|[[e_h]]|^2\text{ d }s \right) \right\} ^{1/2}\).

Element-based \(L^2\) norm: \(\Vert e_h\Vert =\left\{ \sum \limits _{K\in \mathcal {T}_h}\int _K|e_h|^2\text{ d }x\right\} ^{1/2}\).
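
As an illustration of how the element-based \(L^2\) norm can be evaluated for a piecewise linear \(u_h\), the sketch below sums a quadrature approximation of \(\int _K|e_h|^2\) over the triangles. The function name `l2_error`, the data layout `uh_local` (three vertex values per triangle, so that \(u_h\) may be discontinuous across edges), and the choice of the edge-midpoint rule are assumptions made here, not details taken from the paper.

```python
import math


def l2_error(nodes, triangles, u_exact, uh_local):
    """Element-based L2 norm of e_h = u - u_h.

    uh_local[t] holds the three vertex values of the piecewise-linear u_h on
    triangle t (hypothetical layout). The edge-midpoint rule used here is
    exact for quadratic integrands, so for a general smooth u the result is
    a quadrature approximation of the norm.
    """
    total = 0.0
    for tri, vals in zip(triangles, uh_local):
        pts = [nodes[v] for v in tri]
        (x0, y0), (x1, y1), (x2, y2) = pts
        area = 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
        acc = 0.0
        for a, b in ((0, 1), (1, 2), (2, 0)):
            mx = 0.5 * (pts[a][0] + pts[b][0])
            my = 0.5 * (pts[a][1] + pts[b][1])
            uh_mid = 0.5 * (vals[a] + vals[b])  # linear u_h at the edge midpoint
            acc += (u_exact(mx, my) - uh_mid) ** 2
        total += area * acc / 3.0  # equal weights area/3 at the three midpoints
    return math.sqrt(total)
```

The discrete \(H^1\) norm additionally requires the weak gradient and the edge jumps, which depend on definitions from the earlier sections and are not sketched here.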

5.1 Homogeneous boundary cases

First, we consider two homogeneous boundary cases, i.e., \(g=0\).

Example 5.1.1

Let the analytical solution to (1) be \(u=x(1-x)y(1-y)\). The right-hand side \(f\) is derived by substituting this \(u\) into (1). Tables 1 and 2 show the convergence rates of the MWG solutions measured in the discrete \(H^1\) norm and the element-based \(L^2\) norm on triangular meshes. The numerical results indicate that the MWG solution converges with rate \(O(h)\) in the \(H^1\) norm and \(O(h^2)\) in the \(L^2\) norm, in agreement with the theoretical results of Theorems 4.2 and 4.1, respectively.
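
For reference, the source term for this example can be written out by hand. The sketch below assumes the constant coefficients used in this section (\(\mathbf{b}=(1,1)^T\), \(c=1\), constant \(a\)), so that \(\nabla \cdot \mathbf{b}=0\) and \(\nabla \cdot (\mathbf{b}u)=\mathbf{b}\cdot \nabla u\); the function names `u_exact` and `f_rhs` are introduced here for illustration only.

```python
def u_exact(x, y):
    """Manufactured solution of Example 5.1.1."""
    return x * (1 - x) * y * (1 - y)


def f_rhs(x, y, a, b=(1.0, 1.0), c=1.0):
    """f = -a*Laplace(u) + b . grad(u) + c*u, valid for constant a, b, c
    (then div(b u) = b . grad u because div b = 0)."""
    ux = (1 - 2 * x) * y * (1 - y)    # du/dx
    uy = x * (1 - x) * (1 - 2 * y)    # du/dy
    uxx = -2 * y * (1 - y)            # d2u/dx2
    uyy = -2 * x * (1 - x)            # d2u/dy2
    return -a * (uxx + uyy) + b[0] * ux + b[1] * uy + c * u_exact(x, y)
```

Because \(u\) vanishes on \(\partial \Omega \), the boundary data is \(g=0\), while \(f\) changes with the diffusion coefficient \(a\).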

Table 1 Numerical error and convergence rate of Example 5.1.1 in norm \({|||} e_h{|||}\)
Table 2 Numerical error and convergence rate of Example 5.1.1 in norm \(\Vert e_h\Vert \)
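
The convergence rates reported in such tables are commonly computed from errors on two consecutive meshes as \(\log (e_{h_1}/e_{h_2})/\log (h_1/h_2)\). The sketch below assumes this standard formula; the helper name `observed_rates` is hypothetical, and the data are synthetic values following \(Ch^2\) exactly, not the results of the paper.

```python
import math


def observed_rates(hs, errors):
    """Observed order between consecutive mesh levels:
    log(e_i / e_{i+1}) / log(h_i / h_{i+1})."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(hs[i] / hs[i + 1])
        for i in range(len(errors) - 1)
    ]


# Synthetic data only (not the paper's results): errors that follow C*h^2.
hs = [1.0 / n for n in (4, 8, 16, 32)]
errors = [0.3 * h ** 2 for h in hs]
rates = observed_rates(hs, errors)
```

For measured errors the computed rates only approach the theoretical order as \(h\) decreases, so they are usually judged on the finest meshes.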

Example 5.1.2

We use the same meshes and elements as in Example 5.1.1, with the analytical solution \(u=\sin (\pi x)\sin (\pi y)\). The errors are presented in Tables 3 and 4, which show convergence rates of optimal order in the \(H^1\) and \(L^2\) norms.

Table 3 Numerical error and convergence rate of Example 5.1.2 in norm \({|||} e_h{|||}\)
Table 4 Numerical error and convergence rate of Example 5.1.2 in norm \(\Vert e_h\Vert \)

5.2 Nonhomogeneous boundary cases

In this subsection, we test (1) with a nonhomogeneous boundary condition.

Example 5.2.1

The analytical solution is \(u=x+\exp (x+y)\), which is not zero on \(\partial \Omega \). From Tables 5 and 6, we can see that the MWG solution again converges with rate \(O(h)\) in the \(H^1\) norm and \(O(h^2)\) in the \(L^2\) norm, in agreement with the theoretical results of Theorems 4.2 and 4.1, respectively.

Table 5 Numerical error and convergence rate of Example 5.2.1 in norm \({|||} e_h{|||}\)
Table 6 Numerical error and convergence rate of Example 5.2.1 in norm \(\Vert e_h\Vert \)

All three numerical examples given above are in good agreement with the theoretical analysis in Sect. 4, which demonstrates that the MWG-FEM (15) is accurate and robust.

6 Conclusion

We developed an MWG-FEM for two-dimensional convection–diffusion problems in this paper. The analysis and the numerical experiments show that the proposed MWG-FEM converges at the optimal rates in the discrete \(H^1\) and \(L^2\) norms, including for problems with small diffusion and large local Péclet numbers. The important feature of the method is that we introduce a modified weak gradient and weak divergence, together with a new stabilization term that does not require a “large enough” constant-valued parameter. The algorithm of this MWG-FEM can be extended to three- and higher-dimensional convection–diffusion problems with more general boundary conditions. However, we have not considered an upwinding technique to handle the convection term over non-standard grids in the present MWG-FEM framework; we will consider this idea in future work.