1 Introduction

In this paper, we consider the following one-dimensional singularly perturbed convection–diffusion problem

$$\begin{aligned} \left\{ \begin{array}{ll} -\varepsilon u''+bu'+cu=f &{}\quad \text { in } \varOmega =(0,1),\\ u(0)=u(1)=0 , \end{array}\right. \end{aligned}$$
(1.1)

where \(0<\varepsilon \ll 1\) is a small positive parameter, and \(b\), \(c\), \(f\) are sufficiently smooth functions with the following properties

$$\begin{aligned} b(x) \ge b_0> 0,\ c(x)\ge 0,\ c(x) - \frac{1}{2}b'(x) \ge c_0 > 0, \quad \forall x\in \bar{\varOmega }, \end{aligned}$$
(1.2)

for some constants \(b_0\) and \(c_0\). This assumption guarantees that problem (1.1) has a unique solution in \(H^2(\varOmega )\cap H_0^1(\varOmega )\) for all \(f\in L^2(\varOmega )\) [16, 26].

It is well known that the exact solution of problem (1.1) typically has an exponential boundary layer at \(x=1\), which causes difficulties for classical numerical methods. For example, the standard finite element or finite difference method fails to produce an accurate numerical solution unless the mesh size is comparable to or smaller than the parameter \(\varepsilon \).
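The failure on coarse uniform meshes can be reproduced with a few lines of code. The following is a minimal sketch, not part of the method studied in this paper: it applies central differences on a uniform mesh to the model data \(b=1\), \(c=0\), \(f=1\), chosen only because the exact solution is then available in closed form (\(c=0\) is a simplification relative to (1.2)). All function names and the parameter values used below are illustrative assumptions.

```python
import math

def central_difference_solution(eps, n):
    """Central differences for -eps*u'' + u' = 1, u(0) = u(1) = 0, on n uniform cells."""
    h = 1.0 / n
    lower = -eps / h**2 - 1.0 / (2.0 * h)  # coefficient of u_{i-1}
    diag = 2.0 * eps / h**2                # coefficient of u_i
    upper = -eps / h**2 + 1.0 / (2.0 * h)  # coefficient of u_{i+1}
    m = n - 1                              # number of interior unknowns
    # Thomas algorithm: forward elimination ...
    c_mod = [0.0] * m
    d_mod = [0.0] * m
    c_mod[0] = upper / diag
    d_mod[0] = 1.0 / diag
    for i in range(1, m):
        denom = diag - lower * c_mod[i - 1]
        c_mod[i] = upper / denom
        d_mod[i] = (1.0 - lower * d_mod[i - 1]) / denom
    # ... followed by back substitution.
    interior = [0.0] * m
    interior[m - 1] = d_mod[m - 1]
    for i in range(m - 2, -1, -1):
        interior[i] = d_mod[i] - c_mod[i] * interior[i + 1]
    return [0.0] + interior + [0.0]

def exact_solution(eps, x):
    """Closed-form solution of the same model problem."""
    decay = math.exp(-1.0 / eps)
    return x - (math.exp((x - 1.0) / eps) - decay) / (1.0 - decay)

def max_nodal_error(eps, n):
    """Largest nodal error of the central-difference solution."""
    u = central_difference_solution(eps, n)
    return max(abs(u[i] - exact_solution(eps, i / n)) for i in range(n + 1))
```

Calling `max_nodal_error(1e-3, 32)` uses a mesh size far larger than \(\varepsilon \) and returns an error of order one, whereas `max_nodal_error(0.05, 200)` uses a mesh size below \(\varepsilon \) and returns a small error.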

Layer-adapted meshes [9, 13], such as the Bakhvalov mesh and the Shishkin mesh, have been developed to remedy the difficulties caused by boundary layers. As shown in [4], standard discretization techniques such as the conforming finite element method [16, 26] can be used on layer-adapted meshes, but small oscillations still appear in the discrete solution, so additional stabilization is necessary. Over the past several decades, many stabilized numerical methods, such as the upwind finite difference scheme [8], the streamline-diffusion finite element method [10, 11, 17], the variational multiscale method [19], and the discontinuous Galerkin finite element method [5, 6, 14, 18, 23, 24, 25, 27, 30, 31], have been developed for the singularly perturbed convection–diffusion problem. Details of these methods can be found in the classical book [15] and the references therein.

Recently, weak Galerkin (WG) finite element methods have attracted increasing attention. The WG methods, first proposed and analyzed by Wang and Ye [20], provide a general finite element technique for solving partial differential equations. In general, a WG scheme for a PDE is obtained by replacing the usual derivatives in the corresponding weak form by weakly defined derivatives and adding a parameter-free stabilization term. The WG methods have been successfully applied to elliptic problems [20, 28], the options pricing problem [29], the Stokes equation [21], the Maxwell equations [12], the biharmonic equations [22], etc.

Most recently, WG methods have been shown to provide robust and stable discretizations for singularly perturbed problems (SPPs). For example, a WG method with an upwind-type stabilization was presented and analyzed for SPPs of convection–diffusion type [7]. A \(P_0\)-\(P_0\) WG method was investigated in [1] for SPPs of reaction–diffusion type. The WG method was also studied for fourth-order singularly perturbed problems [2]. However, the uniform convergence of the WG finite element method on layer-adapted meshes has not been discussed so far. The main concern here is to investigate the uniform convergence of the WG finite element scheme on a Shishkin mesh for one-dimensional singularly perturbed convection–diffusion equations.

The paper is organized as follows. In Sect. 2, we introduce some preliminaries and notation that will be used later. The formulation of the WG finite element method for the singularly perturbed convection–diffusion equation is presented in Sect. 3. The error estimates of the proposed method are discussed in Sect. 4. Some numerical experiments are displayed in Sect. 6. They aim to confirm our theoretical results and to investigate some interesting convergence phenomena.

In the following, C denotes a generic positive constant independent of N and \(\varepsilon \), whose value may differ from one inequality to another.

2 Preliminaries and Notation

2.1 The Shishkin Mesh

Let N be an even integer. Define the transition parameter

$$\begin{aligned}\tau = \min \left( \frac{1}{2},\frac{k+1}{b_0}\varepsilon \ln N\right) ,\end{aligned}$$

where k is the degree of the polynomials in the finite element space, which will be given later. Then divide each of the subdomains \(\varOmega _1=[0,1-\tau ]\) and \(\varOmega _2=[1-\tau , 1]\) into N/2 equidistant subintervals. Since \(\varepsilon \ll 1\), here and below we take \(\tau =\frac{k+1}{b_0}\varepsilon \ln N\). Now, we have

$$\begin{aligned} x_0=0, x_j=x_{j-1}+h_j, h_j=\left\{ \begin{array}{ll} h_c, &{}\quad j=1,\ldots ,N/2,\\ h_f,&{}\quad j= N/2+1,\ldots ,N,\end{array}\right. \end{aligned}$$

where

$$\begin{aligned}h_c=2(1-\tau )/N,\qquad h_f=2\tau /N.\end{aligned}$$

It can be easily shown that

$$\begin{aligned} h_c=\mathcal {O}(N^{-1}),\qquad h_f=\mathcal {O}(\varepsilon \,N^{-1}\ln \,N). \end{aligned}$$

Denote the mesh by \(I_j=[x_{j-1},x_j]\) for \(j=1,\ldots ,N\) and set \(\mathcal {T}_N=\{I_j, j=1,\ldots ,N\}\). For each interval \(I_j\in \mathcal {T}_N\), we define its outward unit normal \(n_{I_j}(x_j)=1\) and \(n_{I_j}(x_{j-1})=-1\); if there is no confusion, instead of \(n_{I_j}\) we simply write n.
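The construction of the mesh nodes can be sketched in code as follows. This is a minimal illustration under the definitions of this subsection; the function name `shishkin_mesh` and its argument names are assumptions introduced here, not notation from the paper.

```python
import math

def shishkin_mesh(n, eps, k, b0):
    """Return (nodes, tau) for the Shishkin mesh with n subintervals on [0, 1]."""
    if n % 2 != 0:
        raise ValueError("n must be an even integer")
    # Transition parameter tau = min(1/2, (k+1)/b0 * eps * ln n).
    tau = min(0.5, (k + 1) / b0 * eps * math.log(n))
    h_c = 2.0 * (1.0 - tau) / n  # coarse mesh size on [0, 1 - tau]
    h_f = 2.0 * tau / n          # fine mesh size on [1 - tau, 1]
    half = n // 2
    nodes = [j * h_c for j in range(half + 1)]
    nodes += [1.0 - tau + j * h_f for j in range(1, half + 1)]
    nodes[-1] = 1.0              # remove round-off at the right endpoint
    return nodes, tau
```

For instance, `shishkin_mesh(16, 1e-4, 1, 1.0)` returns 17 nodes, of which the node with index 8 is the transition point \(1-\tau \).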

2.2 Weak Function and Weak Derivative

On each interval \(I_j=[x_{j-1},x_j]\), a weak function on the interval \(I_j\) refers to a function \(v=\{v_0,v_b\}\) such that \(v_0\in L^2(I_j)\) and \(v_b\in L^{\infty }(\partial I_j)\), where \(\partial I_j=\{x_{j-1},x_j\}\). That is, for each interval \(I_j\in \mathcal {T}_N, j=1,\ldots ,N\), we have

$$\begin{aligned} v=\left\{ \begin{array}{ll} v_0,&{}\quad \text{ in } I_j,\\ v_b,&{}\quad \text{ on } \partial I_j. \end{array}\right. \end{aligned}$$

Here \(v_0\) can be understood as the value of v in \((x_{j-1},x_j)\), and \(v_b\) represents the values of v on the endpoints of \(I_j\). Denote by \(\mathcal {M}(I_j)\) the space of weak functions on \(I_j\), i.e.,

$$\begin{aligned} \mathcal {M}(I_j)=\{v=\{v_0,v_b\}:v_0\in L^2(I_j),v_b\in L^{\infty }(\partial I_j)\}. \end{aligned}$$

The local Sobolev space \(H^1(I_j)\) can be embedded into the space \(\mathcal {M}(I_j)\) by the inclusion map

$$\begin{aligned} i_{\mathcal {M}}(v) = \{v|_{I_j}, v|_{\partial I_j}\},\quad \forall v\in H^1(I_j). \end{aligned}$$

Let \(\mathbb {P}^k(I_j)\) be the set of polynomials defined on \(I_j\) with degree no more than k, and let \(\mathbb {P}^0(\partial I_j)\) be the set of piecewise constants on \(\partial I_j\). For a given integer \(k\ge 1\), we define a local WG finite element space \(\mathcal {M}_N(I_j)\) on each element \(I_j\in \mathcal {T}_N\) as follows

$$\begin{aligned} \mathcal {M}_N(I_j)=\{v=\{v_0,v_b\}:v_0|_{I_j}\in \mathbb {P}^k(I_j),v_b|_{\partial I_j}\in \mathbb {P}^0(\partial I_j)\}. \end{aligned}$$

A global WG finite element space \(\mathcal {M}_N\) is then obtained by gluing together all the local spaces \(\mathcal {M}_N(I_j)\) with common values on interior nodes. In other words, for any function \(v=\{v_0,v_b\}\in \mathcal {M}_N\), \(v_0|_{I_j}\) belongs to the polynomial space \(\mathbb {P}^k(I_j)\) for \(j=1,\ldots ,N\), and \(v_b\) has a single value at each node of the partition \(\mathcal {T}_N\).

Let \(\mathcal {M}_N^0\) be the subspace of \(\mathcal {M}_N\) consisting of discrete weak functions with vanishing boundary values, i.e.,

$$\begin{aligned} \mathcal {M}_N^0=\{v=\{v_0,v_b\}:v\in \mathcal {M}_N,v_b(0)=v_b(1)=0\}. \end{aligned}$$
(2.1)

The weak derivative of a weak function \(v=\{v_0,v_b\}\in \mathcal {M}_N\) is defined as follows.

Definition 2.1

For any weak function \(v\in \mathcal {M}_N(I_j)\), the weak derivative of \(v=\{v_0, v_b\}\) is defined as the unique polynomial \(D_{w,I_j} v\in \mathbb {P}^{k-1}(I_j)\) satisfying

$$\begin{aligned} (D_{w,I_j} v, q)_{I_j}=-(v_0,q')_{I_j}+\langle v_b,qn\rangle _{\partial I_j},\quad \forall q\in \mathbb {P}^{k-1}(I_j). \end{aligned}$$
(2.2)

Here, we have used the notation

$$\begin{aligned} (\varphi ,\psi )_{I_j}:=\int _{I_j}\varphi (x)\psi (x)\mathrm {d}x \end{aligned}$$

and

$$\begin{aligned} \langle \varphi ,\psi n\rangle _{\partial I_j}:=\varphi (x_j)\psi (x_j)-\varphi (x_{j-1})\psi (x_{j-1}). \end{aligned}$$

To approximate the convection term \(bu'\) in the problem (1.1), we introduce a weak convection derivative as follows.

Definition 2.2

For any weak function \(v\in \mathcal {M}_N(I_j)\), the weak convection derivative of \(v=\{v_0, v_b\}\) is defined as the unique polynomial \(D_{w,I_j}^b v\in \mathbb {P}^k(I_j)\) satisfying

$$\begin{aligned} (D_{w,I_j}^b v, q)_{I_j}=-(v_0,(bq)')_{I_j}+\langle v_b,bqn\rangle _{\partial I_j},\quad \forall q\in \mathbb {P}^k(I_j). \end{aligned}$$
(2.3)

The weak derivatives \(D_{w}\) and \(D_{w}^b\) on the finite element space \(\mathcal {M}_N\) can be computed by applying (2.2) and (2.3), respectively, on each element \(I_j\in \mathcal {T}_N\). More precisely, they are given by

$$\begin{aligned} (D_{w}v)|_{I_j}=D_{w,I_j}(v|_{I_j}),\quad (D_{w}^bv)|_{I_j}=D_{w,I_j}^b(v|_{I_j}), \quad \forall v\in \mathcal {M}_N. \end{aligned}$$
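To illustrate how the weak derivative of Definition 2.1 can be computed on one element, the following minimal sketch solves the small linear system obtained by testing with the monomials \(q=s^n\), \(n=0,\ldots ,k-1\), where \(s=x-x_{j-1}\). The monomial basis, the exact rational arithmetic, and the names `weak_derivative` and `solve_linear_system` are assumptions made for this illustration; a practical code would more likely use an orthogonal basis.

```python
from fractions import Fraction

def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [entry / pivot_value for entry in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def weak_derivative(v0_coeffs, vb_left, vb_right, h):
    """Monomial coefficients (in s = x - x_{j-1}) of D_w v on an element of length h.

    v0_coeffs[i] multiplies s**i, so k = len(v0_coeffs) - 1 and D_w v has degree k - 1.
    """
    h = Fraction(h)
    k = len(v0_coeffs) - 1
    # Mass matrix of the monomials 1, s, ..., s**(k-1) on [0, h].
    mass = [[h ** (m + n + 1) / (m + n + 1) for m in range(k)] for n in range(k)]
    rhs = []
    for n in range(k):
        # (v_0, q') with q = s**n, i.e. q' = n * s**(n-1).
        interior = sum(Fraction(a) * n * h ** (i + n) / (i + n)
                       for i, a in enumerate(v0_coeffs) if i + n > 0)
        # <v_b, q n>: q equals h**n at the right endpoint and 0**n at the left one.
        boundary = Fraction(vb_right) * h ** n - Fraction(vb_left) * (1 if n == 0 else 0)
        rhs.append(boundary - interior)
    return solve_linear_system(mass, rhs)
```

If \(v_b\) coincides with the trace of \(v_0\), integration by parts in (2.2) shows that the result is the classical derivative \(v_0'\); for example, `weak_derivative([1, 2, 3], 1, Fraction(11, 4), Fraction(1, 2))` returns the coefficients of \(2+6s\).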

3 The Weak Galerkin Finite Element Scheme

For simplicity, we adopt the following notations,

$$\begin{aligned} (\varphi , \psi )_{\mathcal {T}_N}=\sum _{j=1}^N(\varphi , \psi )_{I_j},\quad \langle \varphi , \psi \rangle _{\partial \mathcal {T}_N}=\sum _{j=1}^N{\langle }\varphi , \psi {\rangle }_{\partial I_j}. \end{aligned}$$

To describe our weak Galerkin finite element method, we need to introduce three bilinear forms on \(\mathcal {M}_N\) as follows: for any \(\varphi =\{\varphi _0, \varphi _b\},\psi =\{\psi _0, \psi _b\}\in \mathcal {M}_N\), we define

$$\begin{aligned} \mathcal {A}(\varphi ,\psi )&:=\varepsilon (D_w \varphi ,D_w \psi )_{\mathcal {T}_N} +(D_w^b \varphi +c\varphi _0,\psi _0)_{\mathcal {T}_N},\\ \mathcal {S}_d(\varphi ,\psi )&:=\sum _{j=1}^N \langle \sigma _j (\varphi _0-\varphi _b),\psi _0-\psi _b\rangle _{\partial I_j},\\ \mathcal {S}_c(\varphi ,\psi )&:=\sum _{j=1}^N \langle bn_{I_j}(\varphi _0-\varphi _b),\psi _0-\psi _b\rangle _{\partial _{+} I_j}, \end{aligned}$$

where \(\partial _{+} I_j=\{x\in \partial I_j: b(x)n_{I_j}(x)\ge 0\}\), \(\sigma _j\) is a penalty parameter given as follows:

$$\begin{aligned} \sigma _j = \left\{ \begin{array}{ll} 1,&{}\quad \text{ if } j=1,\ldots , N/2,\\ N/\ln N, &{}\quad \text{ if } j= N/2+1,\ldots , N. \end{array}\right. \end{aligned}$$
(3.1)

Remark 1

The value of \(\sigma _j\) is chosen as \(\sigma _j=\varepsilon h_j^{-1}\) in most existing works on the WG finite element method, such as [20, 28, 29]. However, \(\varepsilon \)-uniform error estimates cannot be obtained with this choice of \(\sigma _j\).
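The relation between the penalty (3.1) and the classical value \(\varepsilon h_j^{-1}\) can be checked numerically. The sketch below evaluates the ratio \(\varepsilon h_j^{-1}/\sigma _j\) on the coarse and fine parts of the Shishkin mesh; the function name `penalty_ratio` and the sample parameters are illustrative assumptions.

```python
import math

def penalty_ratio(n, eps, k, b0):
    """Return eps / (h_j * sigma_j) on the coarse part and on the fine part of the mesh."""
    tau = min(0.5, (k + 1) / b0 * eps * math.log(n))
    h_c = 2.0 * (1.0 - tau) / n   # coarse mesh size
    h_f = 2.0 * tau / n           # fine mesh size
    sigma_c = 1.0                 # penalty (3.1) for j = 1, ..., N/2
    sigma_f = n / math.log(n)     # penalty (3.1) for j = N/2 + 1, ..., N
    return eps / (h_c * sigma_c), eps / (h_f * sigma_f)
```

With \(\tau =\frac{k+1}{b_0}\varepsilon \ln N\), the fine-mesh ratio equals \(b_0/(2(k+1))\) independently of \(\varepsilon \) and N, while the coarse-mesh ratio is of order \(\varepsilon N\); in both regions the penalty (3.1) therefore dominates the classical value up to a constant.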

With the above notations and definitions, the weak Galerkin finite element approximation of the problem (1.1) is to find an approximate solution \(u_N=\{u_0,u_b\}\in \mathcal {M}_N^0\) such that

$$\begin{aligned} \mathcal {B}(u_N,v_N)=(f,v_0), \quad \forall \, v_N=\{v_0,v_b\}\in \mathcal {M}_N^0, \end{aligned}$$
(3.2)

where

$$\begin{aligned} \mathcal {B}(\varphi ,\psi ):=\mathcal {A}(\varphi ,\psi )+\mathcal {S}_d(\varphi ,\psi )+\mathcal {S}_c(\varphi ,\psi ). \end{aligned}$$
(3.3)

Let \(\phi _{0,i}^{j}, i=1,\ldots , k+1\) be basis functions of the polynomial space \(\mathbb {P}^k(I_j)\). Denote by \(\mathcal {E}_N=\{x_j, j=0,\ldots ,N\}\) the set of all nodes and by \(\mathcal {E}_N^0=\{x_j, j=1,\ldots ,N-1\}\) the set of interior nodes of the mesh \(\mathcal {T}_N\). Let \(\phi _{b,j}, j=0,\ldots ,N\) be the nodal basis functions of \(\mathbb {P}^0(\mathcal {E}_N)\), i.e., \(\phi _{b,j}(x_i)=\delta _{ij}\), where \(\delta _{ij}=1\) if \(i=j\) and \(\delta _{ij}=0\) otherwise. Denote \(\varPhi _{0,m}=\{\phi _{0,i}^j, 0\}\), where \(m=i+(j-1)(k+1)\) with \(i= 1,\ldots ,k+1\), \(j=1,\ldots ,N\), and let \(\varPhi _{b,j}=\{0, \phi _{b,j}\}\) with \(j=1,\ldots ,N-1\). Then the WG finite element space is \(\mathcal {M}_N^0=\mathrm {span}\{\varPhi _{0,1},\ldots , \varPhi _{0,(k+1)N}, \varPhi _{b,1},\ldots , \varPhi _{b,N-1}\}\). Denote by

$$\begin{aligned} (B_{0,0})_{ij}&= \mathcal {B}(\varPhi _{0,j}, \varPhi _{0,i}),\quad i,j=1,\ldots ,(k+1)N,\\ (B_{0,b})_{ij}&= \mathcal {B}(\varPhi _{b,j}, \varPhi _{0,i}),\quad i=1,\ldots ,(k+1)N,\ j=1,\ldots ,N-1,\\ (B_{b,0})_{ij}&= \mathcal {B}(\varPhi _{0,j}, \varPhi _{b,i}),\quad j=1,\ldots ,(k+1)N,\ i=1,\ldots ,N-1,\\ (B_{b,b})_{ij}&= \mathcal {B}(\varPhi _{b,j}, \varPhi _{b,i}),\quad i,j=1,\ldots ,N-1,\\ F_j&= (f, \varPhi _{0,j}),\quad j=1,\ldots ,(k+1)N, \end{aligned}$$

then the matrix form of the WG scheme (3.2) can be written as

$$\begin{aligned} \begin{pmatrix} B_{0,0}&{} B_{0,b}\\ B_{b,0}&{} B_{b,b} \end{pmatrix} \begin{pmatrix} U_{0}\\ U_{b} \end{pmatrix} =\begin{pmatrix} F\\ 0 \end{pmatrix}, \end{aligned}$$

where \(U_0\) and \(U_b\) represent the vectors of degrees of freedom for \(u_0\) and \(u_b\), respectively. We can write the above system as

$$\begin{aligned} (B_{b,b}-B_{b,0}B_{0,0}^{-1}B_{0,b})U_b + B_{b,0}B_{0,0}^{-1}F=0 \end{aligned}$$

and

$$\begin{aligned} U_0 = B_{0,0}^{-1}(F-B_{0,b}U_b). \end{aligned}$$

We emphasize that the inverse \(B_{0,0}^{-1}\) can be computed on each element independently of each other since the matrix \(B_{0,0}\) is block-diagonal owing to the discontinuous nature of the approximation space \(\mathcal {M}_N^0\).
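The elimination of \(U_0\) can be sketched on a toy system with \(N=2\) elements and \(k=1\), so that \(B_{0,0}\) consists of two \(2\times 2\) blocks and there is a single interface unknown. The matrix entries below are hypothetical numbers chosen only to make the algebra concrete; they are not assembled from the bilinear form (3.3), and all function names are assumptions of this sketch.

```python
def invert_2x2(block):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = block
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def apply_block_diagonal_inverse(blocks, vector):
    """Apply B_00^{-1} to a vector, one 2x2 element block at a time."""
    result = []
    for e, block in enumerate(blocks):
        inv = invert_2x2(block)
        x0, x1 = vector[2 * e], vector[2 * e + 1]
        result += [inv[0][0] * x0 + inv[0][1] * x1, inv[1][0] * x0 + inv[1][1] * x1]
    return result

def condensed_solve(blocks, b_0b, b_b0, b_bb, f):
    """Eliminate U_0, solve the scalar Schur complement equation, then recover U_0."""
    inv_f = apply_block_diagonal_inverse(blocks, f)        # B_00^{-1} F
    inv_b0b = apply_block_diagonal_inverse(blocks, b_0b)   # B_00^{-1} B_0b
    schur = b_bb - sum(r * x for r, x in zip(b_b0, inv_b0b))
    u_b = -sum(r * x for r, x in zip(b_b0, inv_f)) / schur
    u_0 = [x - y * u_b for x, y in zip(inv_f, inv_b0b)]
    return u_0, u_b

def residual(blocks, b_0b, b_b0, b_bb, f, u_0, u_b):
    """Largest residual of the full block system for the pair (u_0, u_b)."""
    res = []
    for e, ((a, b), (c, d)) in enumerate(blocks):
        x0, x1 = u_0[2 * e], u_0[2 * e + 1]
        res.append(a * x0 + b * x1 + b_0b[2 * e] * u_b - f[2 * e])
        res.append(c * x0 + d * x1 + b_0b[2 * e + 1] * u_b - f[2 * e + 1])
    res.append(sum(r * x for r, x in zip(b_b0, u_0)) + b_bb * u_b)
    return max(abs(r) for r in res)

# Hypothetical data for illustration only.
blocks = [[[4.0, 1.0], [2.0, 3.0]], [[5.0, -1.0], [1.0, 2.0]]]
b_0b = [-1.0, -2.0, -3.0, 1.0]
b_b0 = [-2.0, -1.0, -1.0, -2.0]
b_bb = 6.0
f = [1.0, 2.0, 3.0, 4.0]
u_0, u_b = condensed_solve(blocks, b_0b, b_b0, b_bb, f)
```

Only \(2\times 2\) element blocks are inverted, which mirrors the element-by-element structure described above; the helper `residual` confirms that the condensed solution satisfies the full block system.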

Remark 2

It can be observed that the interior degrees of freedom \(U_0\) can be locally eliminated in terms of the interface degrees of freedom \(U_b\) in practical implementations. This means that the linear system resulting from the WG finite element method only involves the degrees of freedom on the skeleton of the mesh. Therefore, the number of degrees of freedom of the WG method is comparable with that of conforming finite elements, and is much smaller than that of the discontinuous Galerkin method. It is worth pointing out that the elimination of \(U_0\) in terms of \(U_b\) is the so-called Schur complement technique in the domain decomposition community, which can be used for problems in any dimension.
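The sizes of the linear systems can be compared by a simple count in the one-dimensional setting. The sketch below is an illustration only; the function name and dictionary keys are assumptions, and the conforming count refers to continuous piecewise polynomials of degree k with homogeneous Dirichlet conditions.

```python
def dof_counts(n, k):
    """Numbers of unknowns on a mesh with n elements and polynomial degree k."""
    return {
        "wg_total": (k + 1) * n + (n - 1),  # interior unknowns plus interior-node unknowns
        "wg_condensed": n - 1,              # after eliminating U_0
        "conforming": k * n - 1,            # continuous P^k with u(0) = u(1) = 0
        "dg": (k + 1) * n,                  # fully discontinuous P^k
    }
```

For example, `dof_counts(8, 2)` gives 7 unknowns for the condensed WG system, compared with 15 for the conforming space and 24 for a discontinuous space.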

3.1 Coercivity of the Bilinear form \(\mathcal {B}(\cdot ,\cdot )\)

We introduce an energy norm \(|||\cdot |||\) in the finite element space \(\mathcal {M}_N\) as follows: for all \(v=\{v_0,v_b\}\in \mathcal {M}_N\),

$$\begin{aligned} ||| v|||^2 := |v|_{1,\varepsilon }^2+\Vert \sqrt{c_0}v_0\Vert _{L^2(\mathcal {T}_N)}^2 +|v|_{\mathrm {J}}^2, \end{aligned}$$
(3.4)

with the seminorms

$$\begin{aligned} |v|_{1,\varepsilon }^2:=\varepsilon \Vert D_w v\Vert _{L^2(\mathcal {T}_N)}^2 +\mathcal {S}_d(v,v),\\ |v|_{\mathrm J}^2:=\sum _{j=1}^{N}w_j|\sqrt{b}(v_0-v_b)|^2(x_j^-), \end{aligned}$$

where

$$\begin{aligned} w_j=\left\{ \begin{array}{ll} \frac{1}{2},&{}\quad j = N,\\ 1,&{}\quad j=1,\ldots , N-1. \end{array}\right. \end{aligned}$$

In addition, for \(v\in \mathcal {M}_N+H_0^1(\varOmega )\), define the discrete \(H^1\) energy norm

$$\begin{aligned} \Vert v\Vert _{\mathcal {M}}^2 := |v|_{*,\varepsilon }^2+\Vert \sqrt{c_0}v_0\Vert _{L^2(\mathcal {T}_N)}^2 +|v|_{\mathrm {J}}^2. \end{aligned}$$
(3.5)

with the seminorm

$$\begin{aligned} |v|_{*,\varepsilon }^2:=\varepsilon \Vert v_0'\Vert _{L^2(\mathcal {T}_N)}^2+\mathcal {S}_d(v,v). \end{aligned}$$

It is worth noting that a function \(v\in H_0^1(\varOmega )\) can be understood as a weak function \(\{v_0, v_b\}\) with \(v_0=v|_{I_j}\) and \(v_b=v|_{\partial I_j}\) for any \(I_j\in {\mathcal T}_N\).

The following lemma shows that the \(|||\cdot |||\)-norm and \(\Vert \cdot \Vert _{\mathcal {M}}\) are equivalent in the WG finite element space \(\mathcal {M}_N^0\).

Lemma 3.1

For any \(v_N=\{v_0, v_b\}\in \mathcal {M}_N^0\), there holds

$$\begin{aligned} C_{\mathrm{lb}}\Vert v_N\Vert _{\mathcal {M}}\le ||| v_N||| \le C_\mathrm{ub}\Vert v_N\Vert _{\mathcal {M}}, \end{aligned}$$

where \(C_\mathrm{ub}:=\max \{C_\mathrm{eq}, 1\}\) with \(C_\mathrm{eq}=\max \{\sqrt{2}, \sqrt{1+2C_{*}^2}\}\) and \(C_{\mathrm{lb}}:=1/C_\mathrm{ub}\).

Proof

For any \(v_N=\{v_0, v_b\}\in \mathcal {M}_N^0\), it follows from the definition of the weak derivative (2.2) and integration by parts that

$$\begin{aligned} (D_wv_N, w)_{I_j} = (v_0', w)_{I_j} - \langle v_0-v_b, wn\rangle _{\partial I_j},\quad \forall w\in \mathbb {P}^{k-1}(I_j), \forall I_j\in \mathcal {T}_N. \end{aligned}$$
(3.6)

Taking \(w=D_wv_N\) in (3.6), we have

$$\begin{aligned} (D_wv_N, D_wv_N)_{I_j} = (v_0', D_wv_N)_{I_j} - \langle v_0-v_b, D_wv_Nn\rangle _{\partial I_j}. \end{aligned}$$

Using the Cauchy-Schwarz inequality and the trace inequality (4.3), we infer

$$\begin{aligned} \Vert D_wv_N\Vert _{L^2(I_j)}^2&\le \Vert v_0'\Vert _{L^2(I_j)} \Vert D_wv_N\Vert _{L^2(I_j)} +\Vert v_0-v_b\Vert _{L^2(\partial I_j)} \Vert D_wv_N\Vert _{L^2(\partial I_j)}\\&\le (\Vert v_0'\Vert _{L^2(I_j)}+C_{*}h_j^{-1/2}\Vert v_0-v_b\Vert _{L^2(\partial I_j)})\Vert D_wv_N\Vert _{L^2(I_j)}. \end{aligned}$$

Thus,

$$\begin{aligned} \Vert D_wv_N\Vert _{L^2(I_j)}&\le (\Vert v_0'\Vert _{L^2(I_j)} +C_{*}h_j^{-1/2}\Vert v_0-v_b\Vert _{L^2(\partial I_j)}). \end{aligned}$$

Squaring this inequality and summing over \(I_j\in \mathcal {T}_N\) yields

$$\begin{aligned} \varepsilon \Vert D_wv_N\Vert _{L^2(\mathcal {T}_N)}^2&\le 2(\varepsilon \Vert v_0'\Vert _{L^2(\mathcal {T}_N)}^2 +C_{*}^2\sum _{j=1}^N\varepsilon h_j^{-1}\Vert v_0-v_b\Vert _{L^2(\partial I_j)}^2 ). \end{aligned}$$

Recalling (3.1), we have

$$\begin{aligned} \frac{\varepsilon h_j^{-1}}{\sigma _j}\le C\quad \text{ for } j=1,\ldots ,N. \end{aligned}$$

Then, from the definition of \(\mathcal {S}_d(\cdot ,\cdot )\), we get

$$\begin{aligned} \sum _{j=1}^N\varepsilon h_j^{-1}\Vert v_0-v_b\Vert _{L^2(\partial I_j)}^2 =\sum _{j=1}^N\frac{\varepsilon h_j^{-1}}{\sigma _j}\cdot \sigma _j\Vert v_0-v_b\Vert _{L^2(\partial I_j)}^2 \le C\mathcal {S}_d(v_N, v_N) \end{aligned}$$

As a result,

$$\begin{aligned} \varepsilon \Vert D_wv_N\Vert _{L^2(\mathcal {T}_N)}^2&\le 2(\varepsilon \Vert v_0'\Vert _{L^2(\mathcal {T}_N)}^2 +C_{*}^2\mathcal {S}_d(v_N, v_N) ). \end{aligned}$$

Moreover,

$$\begin{aligned} |v_N|_{1,\varepsilon }^2&\le 2\varepsilon \Vert v_0'\Vert _{L^2(\mathcal {T}_N)}^2 +(1+2C_{*}^2)\mathcal {S}_d(v_N, v_N), \end{aligned}$$

which yields

$$\begin{aligned} |v_N|_{1,\varepsilon }\le C_\mathrm{eq}|v_N|_{*,\varepsilon } \end{aligned}$$
(3.7)

with \(C_\mathrm{eq}=\max \{\sqrt{2}, \sqrt{1+2C_{*}^2}\}\).

As to the lower bound, we choose \(w=v_0'\) in (3.6) to obtain

$$\begin{aligned} (v_0', v_0')_{I_j} = (D_wv_N, v_0')_{I_j} + \langle v_0-v_b, v_0'n\rangle _{\partial I_j}. \end{aligned}$$

Using the Cauchy-Schwarz inequality and the trace inequality (4.3), we infer

$$\begin{aligned} \Vert v_0'\Vert _{L^2(I_j)}^2&\le \Vert D_w v_N\Vert _{L^2(I_j)}\Vert v_0'\Vert _{L^2(I_j)} + \Vert v_0-v_b\Vert _{L^2(\partial I_j)} \Vert v_0'\Vert _{L^2(\partial I_j)}\\&\le (\Vert D_w v_N\Vert _{L^2(I_j)} + C_{*}h_j^{-1/2}\Vert v_0-v_b\Vert _{L^2(\partial I_j)}) \Vert v_0'\Vert _{L^2(I_j)}. \end{aligned}$$

Thus,

$$\begin{aligned} \Vert v_0'\Vert _{L^2(I_j)}&\le (\Vert D_w v_N\Vert _{L^2(I_j)} + C_{*}h_j^{-1/2}\Vert v_0-v_b\Vert _{L^2(\partial I_j)}). \end{aligned}$$

As a result,

$$\begin{aligned} \varepsilon \Vert v_0'\Vert _{L^2(\mathcal {T}_N)}^2&\le 2(\varepsilon \Vert D_wv_N\Vert _{L^2(\mathcal {T}_N)}^2 +C_{*}^2\mathcal {S}_d(v_N, v_N)), \end{aligned}$$

which yields

$$\begin{aligned} |v_N|_{*,\varepsilon }^2&\le 2\varepsilon \Vert D_wv_N\Vert _{L^2(\mathcal {T}_N)}^2 +(1+2C_{*}^2)\mathcal {S}_d(v_N, v_N) \le C_\mathrm{eq}^2|v_N|_{1,\varepsilon }^2. \end{aligned}$$

Then, we arrive at

$$\begin{aligned} C_\mathrm{eq}^{-1}|v_N|_{*,\varepsilon }\le |v_N|_{1,\varepsilon }, \end{aligned}$$

which together with (3.7) yields

$$\begin{aligned} C_\mathrm{eq}^{-1}|v_N|_{*,\varepsilon } \le |v_N|_{1,\varepsilon }\le C_\mathrm{eq}|v_N|_{*,\varepsilon }. \end{aligned}$$

From the definition of \(|||\cdot |||\)-norm and \(\Vert \cdot \Vert _{\mathcal {M}}\)-norm, we observe that

$$\begin{aligned} C_{\mathrm{lb}}\Vert v_N\Vert _{\mathcal {M}}\le ||| v_N||| \le C_\mathrm{ub}\Vert v_N\Vert _{\mathcal {M}}, \end{aligned}$$

with \(C_\mathrm{ub}:=\max \{C_\mathrm{eq}, 1\}\) and \(C_{\mathrm{lb}}:=1/C_\mathrm{ub}\). The proof is completed. \(\square \)

Now we turn to the coercivity of the WG bilinear form \(\mathcal {B}(\cdot ,\cdot )\) with respect to the \(|||\cdot |||\)-norm defined by (3.4).

Lemma 3.2

(Coercivity with respect to the  \(|||\cdot |||\)-norm) The WG bilinear form defined by (3.3) is coercive on \(\mathcal {M}_N^0\) with respect to the \(|||\cdot |||\)-norm, i.e.,

$$\begin{aligned} \mathcal {B}(v_N,v_N)\ge ||| v_N|||^2,\quad \forall v_N\in \mathcal {M}_N^0. \end{aligned}$$
(3.8)

Proof

Let \(v_N=\{v_0, v_b\}, w_N=\{w_0, w_b\}\in \mathcal {M}_N^0\). It follows from (2.3) and integration by parts that

$$\begin{aligned} (D_w^bv_N, w_0)_{\mathcal {T}_N}&= - (v_0, (bw_0)')_{\mathcal {T}_N} + \langle v_b, bnw_0\rangle _{\partial \mathcal {T}_N}\nonumber \\&=(bv_0', w_0)_{\mathcal {T}_N} -\langle bn(v_0-v_b), w_0\rangle _{\partial \mathcal {T}_N}. \end{aligned}$$
(3.9)

Since \(v_b\) and \(w_b\) are single-valued at the interior nodes of \(\mathcal {T}_N\) and vanish at the boundary nodes of \(\mathcal {T}_N\), we have

$$\begin{aligned} \langle bnv_b, w_b\rangle _{\partial \mathcal {T}_N}&= \sum _{j=1}^N[(bv_bw_b)(x_j)-(bv_bw_b)(x_{j-1})]\\&=(bv_bw_b)(1)-(bv_bw_b)(0)=0, \end{aligned}$$

whence we infer from (2.3) that

$$\begin{aligned} (D_w^bw_N, v_0)_{\mathcal {T}_N}&= - (w_0, (bv_0)')_{\mathcal {T}_N} + \langle w_b, bnv_0\rangle _{\partial \mathcal {T}_N}\nonumber \\&=- (w_0, (bv_0)')_{\mathcal {T}_N} + \langle w_b, bn(v_0-v_b)\rangle _{\partial \mathcal {T}_N}. \end{aligned}$$
(3.10)

Summing (3.9) and (3.10) and setting \(w_N=v_N\), we obtain

$$\begin{aligned} (D_w^bv_N, v_0)_{\mathcal {T}_N} = -\frac{1}{2}(b'v_0,v_0)_{\mathcal {T}_N} -\frac{1}{2}\langle bn(v_0-v_b), v_0-v_b\rangle _{\partial \mathcal {T}_N}. \end{aligned}$$
(3.11)

By a simple manipulation, we have

$$\begin{aligned} \mathcal {S}_c(v_N, v_N)-\frac{1}{2}\langle bn(v_0-v_b), v_0-v_b\rangle _{\partial \mathcal {T}_N} =|v_N|_{\mathrm {J}}^2, \end{aligned}$$

which together with (3.11) yields

$$\begin{aligned} (D_w^bv_N+cv_0, v_0)_{\mathcal {T}_N}+\mathcal {S}_c(v_N, v_N)&=((c-\frac{1}{2}b')v_0,v_0)_{\mathcal {T}_N}+|v_N|_{\mathrm {J}}^2\nonumber \\&\ge \Vert \sqrt{c_0}v_0\Vert _{L^2(\mathcal {T}_N)}^2+|v_N|_{\mathrm {J}}^2 \end{aligned}$$
(3.12)

Owing to the definition of \(\mathcal {B}(\cdot ,\cdot )\) and (3.12), we obtain, for any \(v_N\in \mathcal {M}_N^0\),

$$\begin{aligned} \mathcal {B}(v_N,v_N)\ge \varepsilon (D_w v_N,D_w v_N)_{\mathcal {T}_N}+\mathcal {S}_d(v_N,v_N) + \Vert \sqrt{c_0}v_0\Vert _{L^2(\mathcal {T}_N)}^2+|v_N|_{\mathrm {J}}^2&=||| v_N|||^2. \end{aligned}$$
(3.13)

The proof is completed.\(\square \)

As a consequence of Lemmas 3.1 and 3.2, the WG bilinear form \(\mathcal {B}(\cdot ,\cdot )\) is also coercive with respect to the \(\Vert \cdot \Vert _{\mathcal {M}}\)-norm defined by (3.5).

Lemma 3.3

(Coercivity with respect to the \(\Vert \cdot \Vert _{\mathcal {M}}\)-norm) The WG bilinear form defined by (3.3) is coercive on \(\mathcal {M}_N^0\) with respect to the \(\Vert \cdot \Vert _{\mathcal {M}}\)-norm, i.e.,

$$\begin{aligned} \mathcal {B}(v_N,v_N)\ge C_{\mathrm{lb}}^2\Vert v_N\Vert _{\mathcal {M}}^2,\quad \forall v_N\in \mathcal {M}_N^0. \end{aligned}$$

3.2 Interpolation Operator

Usually, the locally defined \(L^2\) projections onto each element and its boundary are used in the error analysis of WG finite element methods in existing references such as [20, 28, 29]. Unfortunately, the interpolation error bound of the \(L^2\) projection is not \(\varepsilon \)-uniform on a Shishkin mesh because of the anisotropic nature of the mesh. Hence, in our analysis we adopt a special interpolation introduced in [19].

On each element \(I_j\in \mathcal {T}_N\) with \(I_j=[x_{j-1},x_j]\), we define the set of \(k+1\) nodal functionals

$$\begin{aligned} N_0(v)&= v(x_{j-1}),\quad N_k(v)=v(x_j), \end{aligned}$$
(3.14)
$$\begin{aligned} N_l(v)&= h_j^{-l}\int _{I_j}(x-x_{j-1})^{l-1}v(x)\mathrm {d}x,\quad l=1,\ldots ,k-1. \end{aligned}$$
(3.15)

Now a local interpolation \(\mathcal {I}:{ H^1(I_j)}\rightarrow \mathbb {P}^k(I_j)\) is defined by

$$\begin{aligned} N_l(\mathcal {I}v -v)=0,\quad l=0,1,\ldots ,k, \end{aligned}$$
(3.16)

which can be extended to a continuous global interpolation \(\mathcal {I} v\).

Obviously, \(\mathcal {I}v|_{I_j}\) is continuous on \(I_j\) and belongs to \(H^1(I_j)\). Then, the weak function \(\{(\mathcal {I}v)|_{I_j}, (\mathcal {I}v)|_{\partial I_j}\}\), still denoted by \(\mathcal {I}v\) for simplicity, belongs to the local WG finite element space \(\mathcal {M}_N(I_j)\).
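For \(k=2\) the local interpolant can be written down explicitly. The sketch below assumes that the single interior functional is the scaled mean value \(h_j^{-1}\int _{I_j}v\,\mathrm {d}x\) (our reading of (3.15) for \(l=1\)), so that \(\mathcal {I}v\) matches v at both endpoints and in the mean. The element integral of v is approximated by a composite Simpson rule, which is an additional assumption of this illustration, as are all function names.

```python
import math

def simpson(f, a, b, panels=64):
    """Composite Simpson rule with an even number of panels (exact for cubics)."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def interpolate_p2(v, a, b):
    """Quadratic on [a, b] matching v at both endpoints and in the mean value."""
    h = b - a
    va, vb = v(a), v(b)
    # Iv = linear interpolant + alpha * bubble, with bubble(x) = (x - a)(b - x),
    # whose integral over [a, b] equals h**3 / 6.
    mean_defect = simpson(v, a, b) - 0.5 * h * (va + vb)
    alpha = mean_defect / (h**3 / 6.0)

    def iv(x):
        return va + (vb - va) * (x - a) / h + alpha * (x - a) * (b - x)

    return iv
```

Since the three conditions determine a quadratic uniquely, the operator reproduces quadratics: applying `interpolate_p2` to \(v(x)=x^2\) on \([0,1]\) returns \(x^2\) up to round-off.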

Lemma 3.4

(Commutativity of \(\mathcal {I}\)) Let \(\mathcal {I}\) be the interpolation operator defined by (3.16). Then, on each element \(I_j\in \mathcal {T}_N\), we have

$$\begin{aligned} D_w(\mathcal {I}v) = (\mathcal {I}v)',\quad \forall v\in H^1(I_j). \end{aligned}$$

Proof

It follows from the definition of the weak derivative (2.2) that for any \(w\in \mathbb {P}^{k-1}(I_j)\)

$$\begin{aligned} (D_w(\mathcal {I}v), w)_{I_j} = -(\mathcal {I}v, w')_{I_j} +\langle \mathcal {I}v, wn\rangle _{\partial I_j}. \end{aligned}$$

Applying integration by parts to the first term on the right hand side of the above equation leads to the assertion.\(\square \)

3.3 Error Equation

The WG finite element scheme (3.2) is not consistent in the sense that, for the solution u of problem (1.1), the identity \(\mathcal {B}(u,v_N)=(f,v_0)\) does not hold for all \(v_N=\{v_0,v_b\}\in \mathcal {M}_N^0\). As a result of this inconsistency, the usual orthogonality property of conforming Galerkin finite element methods does not hold for the weak Galerkin method; i.e., \(\mathcal {B}(u-u_N,v_N)\ne 0\) for some \(v_N=\{v_0,v_b\}\in \mathcal {M}_N^0\). In this subsection, we derive an error equation which will be used in the error analysis.

Lemma 3.5

Let u be the solution of the problem (1.1). Then for \(v_N=\{v_0, v_b\}\in \mathcal {M}_N^0\), there holds

$$\begin{aligned} - {\varepsilon } (u'', v_0)_{\mathcal {T}_N} = {\varepsilon }(D_w(\mathcal {I}u), D_wv_N)_{\mathcal {T}_N}-\mathcal {E}_1(u,v_N). \end{aligned}$$
(3.17)

where

$$\begin{aligned} \mathcal {E}_1(u,v_N) = \varepsilon \langle u'-(\mathcal {I}u)', (v_0-v_b)n\rangle _{\partial \mathcal {T}_N}. \end{aligned}$$
(3.18)

Proof

Let \(v_N=\{v_0, v_b\}\in \mathcal {M}_N^0\). We infer from Lemma 3.4 that \(D_w(\mathcal {I}u)=(\mathcal {I}u)'\), which yields

$$\begin{aligned} (D_w(\mathcal {I}u), D_wv_N)_{I_j} = ((\mathcal {I}u)', D_wv_N)_{I_j},\quad \forall I_j\in \mathcal {T}_N. \end{aligned}$$
(3.19)

Then, it follows from the definition of the weak derivative (2.2) and integration by parts that

$$\begin{aligned} ((\mathcal {I}u)', D_wv_N)_{I_j}&=-(v_0, (\mathcal {I}u)'')_{I_j} + \langle v_bn, (\mathcal {I}u)'\rangle _{\partial I_j}\nonumber \\&=((\mathcal {I}u)', v_0')_{I_j} -\langle (\mathcal {I}u)', (v_0-v_b)n\rangle _{\partial I_j}. \end{aligned}$$
(3.20)

The definition of \(\mathcal {I}\) and integration by parts imply

$$\begin{aligned} ((u-\mathcal {I}u)', v_0')_{I_j}=-(u-\mathcal {I}u, v_0'')_{I_j}+\langle u-\mathcal {I}u, {v_0'n}\rangle _{\partial I_j}=0, \end{aligned}$$

thus

$$\begin{aligned}( (\mathcal {I}u)', v_0')_{I_j}=(u', v_0')_{I_j},\end{aligned}$$

which together with (3.19) and (3.20), leads to

$$\begin{aligned} (D_w(\mathcal {I}u), D_wv_N)_{I_j} = (u', v_0')_{I_j} -\langle (\mathcal {I}u)', (v_0-v_b)n\rangle _{\partial I_j}. \end{aligned}$$

Summing the above equation over all elements \(I_j\in \mathcal {T}_N\), we obtain

$$\begin{aligned} (D_w(\mathcal {I}u), D_wv_N)_{\mathcal {T}_N} = (u', v_0')_{\mathcal {T}_N} -\langle (\mathcal {I}u)', (v_0-v_b)n\rangle _{\partial \mathcal {T}_N}. \end{aligned}$$
(3.21)

Integration by parts shows that

$$\begin{aligned} - (u'', v_0)_{I_j} = (u', v_0')_{I_j} -\langle u', v_0n\rangle _{\partial I_j} \end{aligned}$$

Summing the above equation over all elements \(I_j\in \mathcal {T}_N\), and recalling the fact

$$\begin{aligned} \sum _{j=1}^N\langle u', v_bn\rangle _{\partial I_j}=0, \end{aligned}$$

we obtain

$$\begin{aligned} - (u'', v_0)_{\mathcal {T}_N} = (u', v_0')_{\mathcal {T}_N} -\langle u', (v_0-v_b)n\rangle _{\partial \mathcal {T}_N}, \end{aligned}$$

which, combined with (3.21), yields the assertion (3.17).\(\square \)

Lemma 3.6

Let u be the solution of the problem (1.1). Then for \(v_N=\{v_0, v_b\}\in \mathcal {M}_N^0\), there holds

$$\begin{aligned} (bu', v_0)_{\mathcal {T}_N} = (D_w^b(\mathcal {I}u), v_0)_{\mathcal {T}_N}-\mathcal {E}_2(u, v_N), \end{aligned}$$
(3.22)

where

$$\begin{aligned} \mathcal {E}_2(u,v_N) = ( u-\mathcal {I}u , (bv_0)')_{\mathcal {T}_N}. \end{aligned}$$
(3.23)

Proof

It follows from the definition of the weak convection derivative (2.3) that

$$\begin{aligned} (D_w^b(\mathcal {I}u), v_0)_{\mathcal {T}_N} = -(\mathcal {I}u, (bv_0)')_{\mathcal {T}_N} +\langle \mathcal {I}u, bnv_0\rangle _{\partial \mathcal {T}_N}. \end{aligned}$$
(3.24)

Integration by parts shows that

$$\begin{aligned} (bu', v_0)_{\mathcal {T}_N} = -(u, (bv_0)')_{\mathcal {T}_N} + \langle u, bnv_0\rangle _{\partial \mathcal {T}_N}, \end{aligned}$$

which together with (3.24) and recalling the fact \(\mathcal {I}u = u\) on \(\partial I_j\) yields the assertion (3.22).\(\square \)

Lemma 3.7

(Error equation) Let u and \(u_N\in \mathcal {M}_N^0\) be the solutions of problem (1.1) and (3.2), respectively. Then, for any \(v_N\in \mathcal {M}_N^0\), there holds

$$\begin{aligned} \mathcal {B}(\mathcal {I}u-u_N, v_N) = \mathcal {E}(u, v_N), \end{aligned}$$
(3.25)

where

$$\begin{aligned} \mathcal {E}(u, v_N) := \mathcal {E}_1(u, v_N)+\mathcal {E}_2(u, v_N)+\mathcal {E}_3(u, v_N). \end{aligned}$$
(3.26)

Here \(\mathcal {E}_1(u,v_N)\) and \(\mathcal {E}_2(u,v_N)\) are defined by (3.18) and (3.23) respectively, and \(\mathcal {E}_3(u,v_N)\) is given as

$$\begin{aligned} \mathcal {E}_3(u, v_N) = (c(\mathcal {I}u-u), v_0)_{\mathcal {T}_N}. \end{aligned}$$
(3.27)

Proof

Testing (1.1) by \(v_N=\{v_0, v_b\}\in \mathcal {M}_N^0\), we arrive at

$$\begin{aligned} -\varepsilon (u'', v_0)_{\mathcal {T}_N} + (bu', v_0)_{\mathcal {T}_N} + (cu, v_0)_{\mathcal {T}_N} = (f, v_0)_{\mathcal {T}_N}. \end{aligned}$$

Plugging (3.17) and (3.22) into the above equation yields

$$\begin{aligned} \mathcal {A}(\mathcal {I}u, v_N) =(f, v_0)_{\mathcal {T}_N} + \mathcal {E}(u,v_N). \end{aligned}$$

Since \(\mathcal {I}u\) is continuous in \(\varOmega \), with the aid of the definitions of \(\mathcal {S}_c(\cdot , \cdot )\) and \(\mathcal {S}_d(\cdot , \cdot )\), we conclude

$$\begin{aligned} \mathcal {S}_c(\mathcal {I}u, v_N)=0,\quad \mathcal {S}_d(\mathcal {I}u, v_N)=0.\end{aligned}$$

Thus,

$$\begin{aligned} \mathcal {B}(\mathcal {I}u, v_N) =(f, v_0)_{\mathcal {T}_N} + \mathcal {E}(u,v_N). \end{aligned}$$
(3.28)

Subtracting (3.2) from (3.28) yields the error equation (3.25). The proof is completed.\(\square \)

4 Error Analysis on a Shishkin Mesh

In this section, we provide an \(\varepsilon \)-uniform error estimate for the error \(u-u_N\) in the \(\Vert \cdot \Vert _{\mathcal {M}}\)-norm defined by (3.5). The error analysis relies on a layer-adapted mesh (the Shishkin mesh), the S-decomposition and a priori estimates of the exact solution of (1.1), and the special interpolation introduced in [19]. In the following analysis, we assume \(\varepsilon \le N^{-1}\), which is realistic for singularly perturbed problems.

The following trace inequality and inverse inequality from [3] will be used frequently in our analysis:

$$\begin{aligned} \Vert v\Vert _{L^2(\partial I_j)}^2&\le C_\mathrm{tr}(h_j^{-1}\Vert v\Vert _{L^2(I_j)}^2+\Vert v\Vert _{L^2(I_j)}\Vert v'\Vert _{L^2(I_j)}),\quad \forall v\in H^1(I_j), \end{aligned}$$
(4.1)
$$\begin{aligned} \Vert v_N'\Vert _{L^2(I_j)}&\le C_\mathrm{inv}h_j^{-1}\Vert v_N\Vert _{L^2(I_j)},\quad \forall v_N\in \mathbb {P}^k(I_j), \end{aligned}$$
(4.2)
$$\begin{aligned} \Vert v_N\Vert _{L^p(\partial I_j)}&\le C_{*}h_j^{-1/p}\Vert v_N\Vert _{L^p(I_j)},\quad \forall 1\le p\le \infty , \forall v_N\in \mathbb {P}^k(I_j), \end{aligned}$$
(4.3)

where \(C_\mathrm{tr}\), \(C_\mathrm{inv}\) and \(C_*\) are positive constants independent of both \(I_j\) and \(h_j\).

The following lemma presents a decomposition of the exact solution u of problem (1.1) into the sum of a smooth part and a layer part, which is essential for \(\varepsilon \)-uniform error estimates of numerical methods for singularly perturbed problems [8].

Lemma 4.1

(S-decomposition) [15] Let q be a positive integer. Consider problem (1.1) under assumption (1.2). The exact solution u can be decomposed as \(u=u_{S}+u_{E}\), where the smooth part \(u_S\) and the layer part \(u_E\) satisfy

$$\begin{aligned} -\varepsilon u_S''+bu_S'+cu_S = f,\\ -\varepsilon u_E''+bu_E'+cu_E = 0, \end{aligned}$$

and

$$\begin{aligned} |u_S^{(l)}(x)|\le C, \quad |u_E^{(l)}(x)|\le C\varepsilon ^{-l}\exp (-b_0(1-x)/\varepsilon )\quad \text{ for } 0\le l\le q. \end{aligned}$$
(4.4)
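To see how (4.4) interacts with the mesh, the following short computation may help. It assumes, for illustration only, that the transition point of the Shishkin mesh is \(\tau =(k+1)\varepsilon \ln N/b_0\) and that the coarse region is \(\varOmega _1=(0,1-\tau )\). For \(x\in \varOmega _1\), the bound (4.4) with \(l=0\) then gives

$$\begin{aligned} |u_E(x)|\le C\exp (-b_0(1-x)/\varepsilon )\le C\exp (-b_0\tau /\varepsilon )=C\exp (-(k+1)\ln N)=CN^{-(k+1)}, \end{aligned}$$

so under this assumption the layer part is already of size \(N^{-(k+1)}\) outside the fine region.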

The following lemma shows the approximation properties of the interpolation operator \(\mathcal {I}\) defined by (3.16).

Lemma 4.2

[19] For any element \(I_j\in \mathcal {T}_N\) with \(I_j=[x_{j-1}, x_j]\) and \(v\in H^{k+1}(I_j)\), the interpolation \(\mathcal {I}v\) defined by (3.16) has the following approximation properties:

$$\begin{aligned} |v-\mathcal {I}v|_{H^{l}(I_j)}&\le Ch_j^{k+1-l}|v|_{H^{k+1}(I_j)},\quad l=0,1,\ldots ,k+1, \end{aligned}$$
(4.5)
$$\begin{aligned} \Vert v-\mathcal {I}v\Vert _{L^{\infty }(I_j)}&\le Ch_j^{k+1}|v|_{W^{k+1,\infty }(I_j)}, \end{aligned}$$
(4.6)

where C is independent of \(h_j\) and \(\varepsilon \).

From Lemmas 4.1 and 4.2, we have the following interpolation error estimates on the Shishkin mesh \(\mathcal {T}_N\).

Lemma 4.3

[19, 31] Let the exact solution of problem (1.1) be decomposed as \(u=u_S+u_E\) into a smooth part and a layer part. Denote by \(\mathcal {I}u_S\) and \(\mathcal {I}u_E\) the interpolations of \(u_S\) and \(u_E\) on a Shishkin mesh, respectively. Assume \(\varepsilon \ln N\le b_0/(2(k+1))\). Then, we have \(\mathcal {I}u=\mathcal {I}u_S+\mathcal {I}u_E\) and the estimates

$$\begin{aligned} \Vert u-\mathcal {I}u\Vert _{L^{\infty }(\varOmega _1)}&\le CN^{-(k+1)}, \end{aligned}$$
(4.7a)
$$\begin{aligned} \Vert u-\mathcal {I}u\Vert _{L^{\infty }(\varOmega _2)}&\le C(N^{-1}\ln N)^{k+1}, \end{aligned}$$
(4.7b)
$$\begin{aligned} \Vert (u_S-\mathcal {I}u_S)^{(l)}\Vert _{L^2(\varOmega )}&\le CN^{l-(k+1)},\,\, l=0,\ldots ,k, \end{aligned}$$
(4.7c)
$$\begin{aligned} \Vert u_E-\mathcal {I}u_E\Vert _{L^2(\varOmega _2)}&\le C\varepsilon ^{1/2}(N^{-1}\ln N)^{k+1}, \end{aligned}$$
(4.7d)
$$\begin{aligned} N^{-1}\Vert (\mathcal {I}u_E)'\Vert _{L^2(\varOmega _1)}+ \Vert \mathcal {I}u_E\Vert _{L^2(\varOmega _1)}&\le C(\varepsilon ^{1/2}+N^{-1/2})N^{-(k+1)}, \end{aligned}$$
(4.7e)
$$\begin{aligned} \Vert u_E\Vert _{L^{\infty }(\varOmega _1)}+\varepsilon ^{-1/2}\Vert u_E\Vert _{L^2(\varOmega _1)}&\le CN^{-(k+1)}, \end{aligned}$$
(4.7f)
$$\begin{aligned} \Vert u_E'\Vert _{L^2(\varOmega _1)}&\le C\varepsilon ^{-1/2}N^{-(k+1)}. \end{aligned}$$
(4.7g)

Lemma 4.4

Assume \(u\in H^{k+1}(\varOmega )\). Under the conditions of Lemma 4.3, there holds

$$\begin{aligned} \Vert (u_E-\mathcal {I}u_E)^{(l)}\Vert _{L^2(\varOmega _1)}&\quad \le C\varepsilon ^{1/2-l}N^{-(k+1)},\\ \Vert (u_E-\mathcal {I}u_E)^{(l)}\Vert _{L^2(\varOmega _2)}&\quad \le C\varepsilon ^{1/2-l}(N^{-1}\ln N)^{k+1-l} \end{aligned}$$

with \(l=1,2\).

Proof

Owing to the triangle inequality and (4.7e) and (4.7g) of Lemma 4.3,

$$\begin{aligned} \Vert (u_E-\mathcal {I}u_E)'\Vert _{L^2(\varOmega _1)}\le \Vert u_E'\Vert _{L^2(\varOmega _1)}+\Vert (\mathcal {I}u_E)'\Vert _{L^2(\varOmega _1)} \le C\varepsilon ^{-1/2}N^{-(k+1)}. \end{aligned}$$

Following the same procedure and using the inverse inequality, we get

$$\begin{aligned} \Vert (u_E-\mathcal {I}u_E)''\Vert _{L^2(\varOmega _1)}&\le \Vert u_E''\Vert _{L^2(\varOmega _1)}+CN\Vert (\mathcal {I}u_E)'\Vert _{L^2(\varOmega _1)}\\&\le C\varepsilon ^{-3/2}[1+(\varepsilon N)^{3/2}+(\varepsilon N)^{2}]N^{-(k+1)}\\&\le C\varepsilon ^{-3/2}N^{-(k+1)}. \end{aligned}$$

Due to (4.5) of Lemma 4.2 and (4.4), we obtain, for \(l=1,2\),

$$\begin{aligned} \Vert (u_E-\mathcal {I}u_E)^{(l)}\Vert _{L^2(\varOmega _2)}^2&=\sum _{I_j\subset \varOmega _2}\Vert (u_E-\mathcal {I}u_E)^{(l)}\Vert _{L^2(I_j)}^2\\&\le \sum _{I_j\subset \varOmega _2}Ch_j^{2(k+1-l)}\Vert u_E^{(k+1)}\Vert _{L^2(I_j)}^2\\&\le C h_f^{2(k+1-l)}\cdot \int _{1-\tau }^{1}\varepsilon ^{-2(k+1)}\exp (-2b_0(1-x)/\varepsilon )\mathrm {d}x\\&\le C\varepsilon ^{1-2l}(N^{-1}\ln N)^{2(k+1-l)}. \end{aligned}$$

The proof is completed.\(\square \)
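The last inequality in the chain above can be spelled out as follows. This is a sketch under the assumption, not restated in this section, that the fine mesh size satisfies \(h_f\le C\varepsilon N^{-1}\ln N\). Evaluating the integral and inserting the bound on \(h_f\) gives

$$\begin{aligned} \int _{1-\tau }^{1}\varepsilon ^{-2(k+1)}\exp (-2b_0(1-x)/\varepsilon )\mathrm {d}x&=\varepsilon ^{-2(k+1)}\cdot \frac{\varepsilon }{2b_0}\left( 1-\exp (-2b_0\tau /\varepsilon )\right) \le \frac{1}{2b_0}\varepsilon ^{1-2(k+1)},\\ h_f^{2(k+1-l)}\cdot \varepsilon ^{1-2(k+1)}&\le C\varepsilon ^{2(k+1-l)}(N^{-1}\ln N)^{2(k+1-l)}\varepsilon ^{1-2(k+1)}=C\varepsilon ^{1-2l}(N^{-1}\ln N)^{2(k+1-l)}. \end{aligned}$$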

Lemma 4.5

Assume \(u\in H^{k+1}(\varOmega )\) and let \(\sigma _j\) be given by (3.1). Under the conditions of Lemma 4.3, there holds

$$\begin{aligned} \left\{ \sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j}\Vert (u-\mathcal {I}u)'\Vert _{L^2(\partial I_j)}^2\right\} ^{1/2} \le C(N^{-1}\ln N)^k. \end{aligned}$$

Proof

To simplify notation in the proof, let \(\eta _S:=u_S-\mathcal {I}u_S\) and \(\eta _E:=u_E-\mathcal {I}u_E\) denote the interpolation errors of \(u_S\) and \(u_E\), respectively. Then, the total interpolation error \(\eta :=u-\mathcal {I}u\) can be written as \(\eta =\eta _S+\eta _E\).

By the triangle inequality, we have

$$\begin{aligned} \sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j}\Vert \eta '\Vert _{L^2(\partial I_j)}^2 \le {2}\sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j}(\Vert \eta _S'\Vert _{L^2(\partial I_j)}^2 +\Vert \eta _E'\Vert _{L^2(\partial I_j)}^2). \end{aligned}$$
(4.8)

Owing to the trace inequality (4.1),

$$\begin{aligned} \Vert \eta _S'\Vert _{L^2(\partial I_j)}^2 \le C_\mathrm{tr}(h_j^{-1}\Vert \eta _S'\Vert _{L^2(I_j)}^2 +\Vert \eta _S'\Vert _{L^2(I_j)}\Vert \eta _S''\Vert _{L^2(I_j)}), \end{aligned}$$

then, by (4.5) of Lemma 4.2, we arrive at

$$\begin{aligned} \sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j}\Vert \eta _S'\Vert _{L^2(\partial I_j)}^2&\le C_\mathrm{tr}\sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j} (h_j^{-1}\Vert \eta _S'\Vert _{L^2(I_j)}^2 +\Vert \eta _S'\Vert _{L^2(I_j)}\Vert \eta _S''\Vert _{L^2(I_j)})\nonumber \\&\le C(\varepsilon ^2N\Vert \eta _S'\Vert _{L^2(\varOmega _1)}^2+\varepsilon \Vert \eta _S'\Vert _{L^2(\varOmega _2)}^2\nonumber \\&\quad +\, \varepsilon ^2\Vert \eta _S'\Vert _{L^2(\varOmega _1)}\Vert \eta _S''\Vert _{L^2(\varOmega _1)} +\varepsilon ^2 N^{-1}\ln N\Vert \eta _S'\Vert _{L^2(\varOmega _2)}\Vert \eta _S''\Vert _{L^2(\varOmega _2)})\nonumber \\&\le C\varepsilon N^{-2k}, \end{aligned}$$
(4.9)

where \(\varepsilon N<1\) and \(\varepsilon \ln N<1\) are used.

Using the trace inequality (4.1) again, we have

$$\begin{aligned} \Vert \eta _E'\Vert _{L^2(\partial I_j)}^2 \le C_\mathrm{tr}(h_j^{-1}\Vert \eta _E'\Vert _{L^2(I_j)}^2 +\Vert \eta _E'\Vert _{L^2(I_j)}\Vert \eta _E''\Vert _{L^2(I_j)}). \end{aligned}$$

As a result,

$$\begin{aligned} \sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j}\Vert \eta _E'\Vert _{L^2(\partial I_j)}^2&\le C_\mathrm{tr}\sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j} (h_j^{-1}\Vert \eta _E'\Vert _{L^2(I_j)}^2 +{\Vert \eta _E'\Vert _{L^2(I_j)}\Vert \eta _E''\Vert _{L^2(I_j)} })\\&\le C(\varepsilon ^2N\Vert \eta _E'\Vert _{L^2(\varOmega _1)}^2+\varepsilon \Vert \eta _E'\Vert _{L^2(\varOmega _2)}^2)\\&\quad +\, C\varepsilon ^2(\Vert \eta _E'\Vert _{L^2(\varOmega _1)}\Vert \eta _E''\Vert _{L^2(\varOmega _1)} +N^{-1}\ln N\Vert \eta _E'\Vert _{L^2(\varOmega _2)}\Vert \eta _E''\Vert _{L^2(\varOmega _2)}). \end{aligned}$$

Then, it follows from Lemma 4.4 that

$$\begin{aligned} \sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j}\Vert \eta _E'\Vert _{L^2(\partial I_j)}^2 \le C[(\varepsilon +N^{-1})N^{-(2k+1)}+(N^{-1}\ln N)^{2k}], \end{aligned}$$

which, combined with (4.8) and (4.9), yields

$$\begin{aligned} \sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j}\Vert \eta '\Vert _{L^2(\partial I_j)}^2 \le C[(\varepsilon +N^{-2})N^{-2k}+(N^{-1}\ln N)^{2k}]. \end{aligned}$$

Thus,

$$\begin{aligned} \left\{ \sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j}\Vert \eta '\Vert _{L^2(\partial I_j)}^2\right\} ^{1/2} \le C(N^{-1}\ln N)^{k}. \end{aligned}$$

The proof is completed.\(\square \)

Lemma 4.6

Let \(u\in H^{k+1}(\varOmega )\) solve problem (1.1) and let \(\sigma _j\) be given by (3.1). Then, for \(v_N\in \mathcal {M}_N^0\), there holds

$$\begin{aligned} |\mathcal {E}(u,v_N)|&\le C(N^{-1}\ln N)^k\Vert v_N\Vert _{\mathcal {M}}, \end{aligned}$$
(4.10)

where C is independent of N and \(\varepsilon \).

Proof

It follows from the Cauchy-Schwarz inequality and Lemma 4.5 that

$$\begin{aligned} |\mathcal {E}_1(u,v_N)|&\le \sum _{j=1}^N\varepsilon |\langle u'-(\mathcal {I}u)', (v_0-v_b)n\rangle _{\partial I_j}|\nonumber \\&\le \sum _{j=1}^N\varepsilon \Vert (u-\mathcal {I}u)'\Vert _{L^2(\partial I_j)}\Vert v_0-v_b\Vert _{L^2(\partial I_j)}\nonumber \\&\le \left\{ \sum _{j=1}^N\frac{\varepsilon ^2}{\sigma _j}\Vert (u-\mathcal {I}u)'\Vert _{L^2(\partial I_j)}^2 \right\} ^{1/2}\left\{ \sum _{j=1}^N\sigma _j\Vert v_0-v_b\Vert _{L^2(\partial I_j)}^2 \right\} ^{1/2}\nonumber \\&\le C(N^{-1}\ln N)^{k}\mathcal {S}_d^{1/2}(v_N,v_N). \end{aligned}$$
(4.11)

From (3.23) and (3.27), we observe that

$$\begin{aligned} \mathcal {E}_2(u,v_N) + \mathcal {E}_3(u, v_N)&= (u-\mathcal {I}u, bv_0')+(u-\mathcal {I}u, (b'-c)v_0)\\&=T_1+T_2. \end{aligned}$$

With the aid of Hölder's inequality and the estimates (4.7a), (4.7b) of Lemma 4.3, we have

$$\begin{aligned} |T_1|&\le C[\Vert u-\mathcal {I}u\Vert _{L^{\infty }(\varOmega _1)}\Vert v_0'\Vert _{L^1(\varOmega _1)} +\Vert u-\mathcal {I}u\Vert _{L^{\infty }(\varOmega _2)}\Vert v_0'\Vert _{L^1(\varOmega _2)}]\\&\le C[N^{-(k+1)}\Vert v_0'\Vert _{L^1(\varOmega _1)} +(N^{-1}\ln N)^{k+1}\Vert v_0'\Vert _{L^1(\varOmega _2)}] \end{aligned}$$

On \(\varOmega _1\), the inverse inequality implies

$$\begin{aligned} \Vert v_0'\Vert _{L^1(\varOmega _1)} \le CN\Vert v_0\Vert _{L^1(\varOmega _1)} \le CN|\varOmega _1|^{1/2}\Vert v_0\Vert _{L^2(\varOmega _1)} \le CN\Vert v_N\Vert _{\mathcal {M}}, \end{aligned}$$

while on \(\varOmega _2\) the Cauchy-Schwarz inequality gives

$$\begin{aligned} \Vert v_0'\Vert _{L^1(\varOmega _2)}\le \sqrt{\tau }\Vert v_0'\Vert _{L^2(\varOmega _2)} \le C(\ln N)^{1/2}\Vert v_N\Vert _{\mathcal {M}}. \end{aligned}$$

As a result,

$$\begin{aligned} |T_1|&\le C[N^{-k} +N^{-1}(\ln N)^{3/2}\cdot (N^{-1}\ln N)^{k}]\Vert v_N\Vert _{\mathcal {M}}\nonumber \\&\le C(N^{-1}\ln N)^{k}\Vert v_N\Vert _{\mathcal {M}}, \end{aligned}$$
(4.12)

where we use the fact \(N^{-1}(\ln N)^{3/2}<1\).
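The fact used here can be checked by elementary calculus, as the following sketch shows. Setting \(g(N):=N^{-1}(\ln N)^{3/2}\) for \(N>1\), we have

$$\begin{aligned} g'(N)=N^{-2}(\ln N)^{1/2}\left( \frac{3}{2}-\ln N\right) ,\qquad \max _{N>1}g(N)=g(e^{3/2})=\left( \frac{3}{2e}\right) ^{3/2}\approx 0.41<1. \end{aligned}$$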

From (4.15) in the proof of Theorem 4.2 below, we observe that

$$\begin{aligned}\Vert u-\mathcal {I}u\Vert _{L^2(\varOmega )}\le CN^{-(k+1)}.\end{aligned}$$

Hence, \(T_2\) can be bounded by

$$\begin{aligned} |T_2|\le C\Vert u-\mathcal {I}u\Vert _{L^2(\varOmega )}\Vert v_0\Vert _{L^2(\varOmega )}\le CN^{-(k+1)}\Vert v_N\Vert _{\mathcal {M}}, \end{aligned}$$

which together with (4.11) and (4.12) completes the proof.\(\square \)

Theorem 4.1

Let u solve problem (1.1) and let \(u_N\in \mathcal {M}_N^0\) be the WG finite element solution of (3.2) computed on the Shishkin mesh \(\mathcal {T}_N\). Then, there holds

$$\begin{aligned} \Vert \mathcal {I}u - u_N\Vert _{\mathcal {M}} \le C(N^{-1}\ln N)^k, \end{aligned}$$

where C is independent of N and \(\varepsilon \).

Proof

Let \(\xi :=\mathcal {I}u - u_N\). Owing to Lemma 3.3,

$$\begin{aligned} C_{\mathrm{lb}}\Vert \xi \Vert _{\mathcal {M}}^2 \le \mathcal {B}(\xi ,\xi ) \end{aligned}$$
(4.13)

Taking \(v_N=\xi \) in the error equation (3.25) leads to

$$\begin{aligned} \mathcal {B}(\xi ,\xi ) = \mathcal {E}(u,\xi ). \end{aligned}$$

It follows from Lemma 4.6 that

$$\begin{aligned} \mathcal {B}(\xi ,\xi )\le C(N^{-1}\ln N)^k\Vert \xi \Vert _{\mathcal {M}}, \end{aligned}$$

which together with (4.13) completes the proof.\(\square \)
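Written out, the final step is the following chain, where the case \(\xi =0\) is trivial and otherwise one divides by \(\Vert \xi \Vert _{\mathcal {M}}\):

$$\begin{aligned} C_{\mathrm{lb}}\Vert \xi \Vert _{\mathcal {M}}^2 \le \mathcal {B}(\xi ,\xi ) = \mathcal {E}(u,\xi )\le C(N^{-1}\ln N)^k\Vert \xi \Vert _{\mathcal {M}} \quad \Longrightarrow \quad \Vert \xi \Vert _{\mathcal {M}}\le \frac{C}{C_{\mathrm{lb}}}(N^{-1}\ln N)^k. \end{aligned}$$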

Theorem 4.2

Assume \(u\in H^{k+1}(\varOmega )\) and \(\sqrt{\varepsilon }(\ln N)^{k+1}<C\). Under the conditions of Lemma 4.3, there holds

$$\begin{aligned} \Vert u-\mathcal {I}u\Vert _{\mathcal {M}}\le C(N^{-1}\ln N)^{k}, \end{aligned}$$

where C is independent of N and \(\varepsilon \).

Proof

Let \(\eta = u-\mathcal {I}u\). Since \(\eta \) is continuous in \(\varOmega \), we have \(|\eta |_\mathrm{J}=0\) and \(\mathcal {S}_d(\eta ,\eta )=0\). Then,

$$\begin{aligned} \Vert \eta \Vert _{\mathcal {M}}^2=\varepsilon \Vert \eta '\Vert _{L^2(\varOmega )}^2 + c_0\Vert \eta \Vert _{L^2(\varOmega )}^2 \end{aligned}$$
(4.14)

Applying the triangle inequality and the estimates (4.7c)–(4.7f) of Lemma 4.3, we obtain

$$\begin{aligned} \Vert u-\mathcal {I}u\Vert _{L^2(\varOmega )}&\le \Vert u_S-\mathcal {I}u_S\Vert _{L^2(\varOmega )} +\Vert u_E-\mathcal {I}u_E\Vert _{L^2(\varOmega _2)}\nonumber \\&\quad +\, \Vert u_E\Vert _{L^2(\varOmega _1)}+\Vert \mathcal {I}u_E\Vert _{L^2(\varOmega _1)}\nonumber \\&\le CN^{-(k+1)}[1+\varepsilon ^{1/2}(\ln N)^{k+1}+\varepsilon ^{1/2}+N^{-1/2}]\nonumber \\&\le CN^{-(k+1)}[1+\varepsilon ^{1/2}(\ln N)^{k+1}]\nonumber \\&\le CN^{-(k+1)}. \end{aligned}$$
(4.15)

Due to Lemma 4.4, we obtain

$$\begin{aligned} \Vert (u_E-\mathcal {I}u_E)'\Vert _{L^2(\varOmega _2)}^2&\le C\varepsilon ^{-1}(N^{-1}\ln N)^{2k},\\ \Vert (u_E-\mathcal {I}u_E)'\Vert _{L^2(\varOmega _1)}^2&\le C\varepsilon ^{-1}N^{-2(k+1)}, \end{aligned}$$

which together with (4.7c) of Lemma 4.3 yields

$$\begin{aligned} \varepsilon \Vert (u-\mathcal {I}u)'\Vert _{L^2(\varOmega )}^2&\le 2\varepsilon \Vert (u_S-\mathcal {I}u_S)'\Vert _{L^2(\varOmega )}^2 +2\varepsilon \Vert (u_E-\mathcal {I}u_E)'\Vert _{L^2(\varOmega _1)}^2\nonumber \\&\quad +\, 2\varepsilon \Vert (u_E-\mathcal {I}u_E)'\Vert _{L^2(\varOmega _2)}^2\nonumber \\&\le C[\varepsilon N^{-2k} + N^{-2(k+1)} + (N^{-1}\ln N)^{2k}]\nonumber \\&\le C(N^{-1}\ln N)^{2k}. \end{aligned}$$
(4.16)

Combining (4.14), (4.15), and (4.16) leads to

$$\begin{aligned} \Vert u-\mathcal {I}u\Vert _{\mathcal {M}}\le C(N^{-1}\ln N)^{k}, \end{aligned}$$

which completes the proof.\(\square \)

Using the triangle inequality and the results of Theorems 4.1 and 4.2, we arrive at the following result.

Theorem 4.3

Let \(u\in H^{k+1}(\varOmega )\) and \(u_N\in \mathcal {M}_N^0\) solve the problem (1.1) and (3.2), respectively. Then, there holds

$$\begin{aligned} \Vert u - u_N\Vert _{\mathcal {M}} \le C(N^{-1}\ln N)^{k}, \end{aligned}$$

where C is independent of N and \(\varepsilon \).
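For completeness, the triangle inequality argument behind this estimate can be sketched as

$$\begin{aligned} \Vert u - u_N\Vert _{\mathcal {M}} \le \Vert u-\mathcal {I}u\Vert _{\mathcal {M}}+\Vert \mathcal {I}u - u_N\Vert _{\mathcal {M}} \le C(N^{-1}\ln N)^{k}, \end{aligned}$$

where the first term is bounded by Theorem 4.2 and the second by Theorem 4.1.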

5 Numerical Experiments

In this section, we carry out numerical experiments to verify the theoretical findings of the previous section. The Shishkin mesh with N elements is called mesh N. Let \(e_N\) denote the error of the approximate solution computed on mesh N. Then the approximate order of convergence, order(2N), is computed by

$$\begin{aligned}order(2N):=\frac{\ln (e_N/e_{2N})}{\ln (2\ln (N)/\ln (2N))}.\end{aligned}$$
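The formula above measures the rate with respect to \(N^{-1}\ln N\) rather than \(N^{-1}\). A minimal sketch of its evaluation follows; the function name `shishkin_order` is hypothetical, and the errors in the usage lines are synthetic values constructed to behave exactly like \((N^{-1}\ln N)^2\), not data from the tables.

```python
import math

def shishkin_order(e_N, e_2N, N):
    """Observed order with respect to N^{-1} ln N when N is doubled.

    Implements ln(e_N / e_2N) / ln(2 ln N / ln(2N)); requires N > 2 so
    that the denominator is positive.
    """
    return math.log(e_N / e_2N) / math.log(2.0 * math.log(N) / math.log(2.0 * N))

# Synthetic check: errors that behave exactly like (ln N / N)^2.
N = 64
e_N = (math.log(N) / N) ** 2
e_2N = (math.log(2 * N) / (2 * N)) ** 2
rate = shishkin_order(e_N, e_2N, N)  # recovers the exponent 2 up to rounding
```

If the error behaves like \(C(N^{-1}\ln N)^k\), then \(e_N/e_{2N}=(2\ln N/\ln (2N))^k\) and the formula returns k, which is why this denominator is used instead of \(\ln 2\).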

Firstly, we confirm the convergence rate of the errors between the exact solution u and the WG finite element solution \(u_N=\{u_0, u_b\}\) computed by (3.2) measured in the \(\Vert \cdot \Vert _{\mathcal {M}}\)-norm defined by (3.5). Furthermore, we investigate the convergence properties of the error \(u-u_N\) measured in the \(L^2\)-norm defined by

$$\begin{aligned} \Vert u-u_0\Vert _{L^2(\mathcal {T}_N)}:=\left\{ \sum _{j=1}^N\Vert u-u_0\Vert _{L^2(I_j)}^2\right\} ^{1/2}, \end{aligned}$$

and the discrete \(L^{\infty }\)-norm given by

$$\begin{aligned} \Vert u-u_b\Vert _{L^{\infty }(\mathcal {T}_N)}:=\max _{0\le j\le N}|u(x_j)-u_b(x_j)|. \end{aligned}$$
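The two norms just defined could be evaluated as in the sketch below. The callback `u0_on_element(j, x)`, which returns the interior component \(u_0\) on an element, and the array `ub_values` of nodal values of \(u_b\) are hypothetical interfaces introduced only for this illustration; elements are indexed from 0 here, and the use of Gauss-Legendre quadrature is likewise an assumption, not a statement about how the reported errors were computed.

```python
import numpy as np

def l2_error(nodes, u_exact, u0_on_element, n_quad=5):
    """Broken L2 norm of u - u_0, accumulated element by element."""
    xi, w = np.polynomial.legendre.leggauss(n_quad)  # rule on [-1, 1]
    total = 0.0
    for j in range(len(nodes) - 1):
        a, b = nodes[j], nodes[j + 1]
        x = 0.5 * (b - a) * xi + 0.5 * (a + b)  # map the rule to [a, b]
        diff = u_exact(x) - u0_on_element(j, x)
        total += 0.5 * (b - a) * np.sum(w * diff**2)
    return np.sqrt(total)

def nodal_max_error(nodes, u_exact, ub_values):
    """Discrete L-infinity norm: max over mesh nodes of |u(x_j) - u_b(x_j)|."""
    return np.max(np.abs(u_exact(nodes) - ub_values))

# Toy usage with u(x) = x^2, u_0 = 0 and a nodal perturbation of size 0.1.
nodes = np.linspace(0.0, 1.0, 9)
u = lambda x: x**2
err_l2 = l2_error(nodes, u, lambda j, x: np.zeros_like(x))
err_max = nodal_max_error(nodes, u, nodes**2 + 0.1)
```

In the toy usage, the integrand \(x^4\) is integrated exactly by the five-point rule, so `err_l2` equals \(\sqrt{1/5}\) up to rounding.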

Example 1

Consider the following convection–diffusion problem

$$\begin{aligned} \left\{ \begin{array}{ll} -\varepsilon u'' + (2-x)u' + u =f &{}\quad \text{ in } (0,1),\\ u(0)=u(1)=0, \end{array}\right. \end{aligned}$$

with the right-hand side f chosen such that

$$\begin{aligned}u(x)= \sin \left( \frac{1}{2}\pi x\right) -\frac{e^{-(1-x)/\varepsilon }-e^{-1/\varepsilon }}{1-e^{-1/\varepsilon }}\end{aligned}$$

is the exact solution, which has a boundary layer with the width \(\mathcal {O}(\varepsilon \ln \frac{1}{\varepsilon })\) at the outflow boundary \(x=1\).
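The right-hand side f is not written out explicitly. One way to obtain it, sketched below, is to differentiate the stated exact solution analytically and substitute into \(f=-\varepsilon u''+(2-x)u'+u\). The function name `example1` is hypothetical, and the derivative formulas are derived here from the expression for u rather than quoted from the paper.

```python
import numpy as np

def example1(x, eps):
    """Exact solution u of Example 1 and f = -eps*u'' + (2 - x)*u' + u."""
    x = np.asarray(x, dtype=float)
    d = 1.0 - np.exp(-1.0 / eps)       # denominator of the layer term
    layer = np.exp(-(1.0 - x) / eps)   # exp(-(1 - x)/eps)
    u = np.sin(0.5 * np.pi * x) - (layer - np.exp(-1.0 / eps)) / d
    du = 0.5 * np.pi * np.cos(0.5 * np.pi * x) - layer / (eps * d)
    d2u = -0.25 * np.pi**2 * np.sin(0.5 * np.pi * x) - layer / (eps**2 * d)
    f = -eps * d2u + (2.0 - x) * du + u
    return u, f

# Boundary values for a small perturbation parameter.
u_bc, _ = example1(np.array([0.0, 1.0]), eps=1e-2)

# Finite-difference cross-check of f at one point for a moderate eps.
eps, x0, h = 0.5, 0.3, 1e-4
(um, u0, up), _ = example1(np.array([x0 - h, x0, x0 + h]), eps)
_, f0 = example1(x0, eps)
f_fd = -eps * (up - 2 * u0 + um) / h**2 + (2 - x0) * (up - um) / (2 * h) + u0
```

The cross-check uses a moderate \(\varepsilon \) because central differences cannot resolve the layer for very small \(\varepsilon \); the analytic formulas themselves remain usable there, since \(e^{-1/\varepsilon }\) simply underflows to zero.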

Table 1 displays the history of convergence of the WG finite element method for Example 1. The results clearly illustrate the k-th order convergence rate in the energy-like norm (3.5), which agrees with the theoretical result of Theorem 4.3. The errors \(\Vert u-u_N\Vert _{\mathcal {M}}\), \(\Vert u-u_0\Vert _{L^2(\mathcal {T}_N)}\) and \(\Vert u-u_b\Vert _{L^{\infty }(\mathcal {T}_N)}\) for Example 1 with \(\varepsilon =10^{-9}\) are plotted on log-log scales in Fig. 1. The observed rate of convergence in the \(\Vert \cdot \Vert _{\mathcal {M}}\)-norm is \(\mathcal {O}((N^{-1}\ln N)^k)\), as predicted. Fig. 1 also indicates that the WG finite element scheme (3.2) attains the optimal convergence rate of \(\mathcal {O}(N^{-(k+1)})\) in the \(L^2\)-norm and the superconvergence rate of \(\mathcal {O}((N^{-1}\ln N)^{2k})\) in the discrete \(L^{\infty }\)-norm.

Table 1 History of convergence of the WG method, under the norm \(\Vert \cdot \Vert _{\mathcal {M}}\)

Fig. 1 Example 1. Convergence curves of the errors with \(\varepsilon =10^{-9}\): (a) \(\mathbb {P}^1\) element; (b) \(\mathbb {P}^2\) element

Example 2

Consider the following convection–diffusion problem

$$\begin{aligned} \left\{ \begin{array}{ll} -\varepsilon u'' + (1+x)u' + (2+x)u =4\sin (\pi x) &{}\quad \text{ in } (0,1),\\ u(0)=u(1)=0. \end{array}\right. \end{aligned}$$

The exact solution of this test problem is unknown. Therefore, we use the following variant of the double mesh principle to estimate the errors. Compute

$$\begin{aligned} e_N = \Vert u_N-u_{2N}\Vert _{{\mathcal T}_N}, \end{aligned}$$

where \(\Vert \cdot \Vert _{{\mathcal T}_N}\) refers to one of the three norms \(\Vert \cdot \Vert _{\mathcal {M}}\), \(\Vert \cdot \Vert _{L^2}\) and \(\Vert \cdot \Vert _{L^{\infty }}\), and \(u_{2N}\) is the WG solution obtained on the mesh consisting of the mesh points of the original Shishkin mesh \({\mathcal T}_N\) and the midpoints \(x_{j+1/2}=(x_{j}+x_{j+1})/2, j=0,1,\ldots ,N-1\).
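The refined mesh used in this double mesh comparison can be built directly from the nodes of \({\mathcal T}_N\), as in the following sketch. The function name `refine_with_midpoints` and the sample node array are hypothetical; the array merely stands in for a Shishkin mesh with a small last element.

```python
import numpy as np

def refine_with_midpoints(nodes):
    """Insert the midpoint of every element: N elements become 2N."""
    nodes = np.asarray(nodes, dtype=float)
    mid = 0.5 * (nodes[:-1] + nodes[1:])
    fine = np.empty(2 * len(nodes) - 1)
    fine[0::2] = nodes  # original mesh points keep even positions
    fine[1::2] = mid    # midpoints fill the odd positions
    return fine

coarse_nodes = np.array([0.0, 0.5, 0.9, 1.0])
fine_nodes = refine_with_midpoints(coarse_nodes)
```

Because every coarse node is also a fine node, the transition point is unchanged, so the refined mesh is in general not the Shishkin mesh one would construct directly with 2N elements.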

Table 2 History of convergence of the WG method, under the norm \(\Vert \cdot \Vert _{\mathcal {M}}\)

For \(\varepsilon =10^{-1}, 10^{-2}, 10^{-3}\), the numerical solutions of Example 2 computed by the WG scheme (3.2) with the \(\mathbb {P}^1\) element on Shishkin meshes with \(N=32\) elements are displayed in Fig. 2. A boundary layer near \(x=1\) can be observed for small \(\varepsilon \).

We show the history of convergence of the WG finite element method for Example 2 in Table 2. The errors \(\Vert u-u_N\Vert _{\mathcal {M}}\), \(\Vert u-u_0\Vert _{L^2(\mathcal {T}_N)}\) and \(\Vert u-u_b\Vert _{L^{\infty }(\mathcal {T}_N)}\) for Example 2 with \(\varepsilon =10^{-9}\), estimated by the double mesh principle described above, are plotted on log-log scales in Fig. 3. From Table 2 and Fig. 3, we observe the same convergence behavior as in Example 1.

Fig. 2 Example 2. The WG solution computed with the \(\mathbb {P}^1\) element, \(N=32\), and \(\varepsilon =10^{-1}, 10^{-2}, 10^{-3}\)

Fig. 3 Example 2. Convergence curves of the errors with \(\varepsilon =10^{-9}\): (a) \(\mathbb {P}^1\) element; (b) \(\mathbb {P}^2\) element

6 Conclusion

In this article, a WG finite element method is presented and analyzed for the one-dimensional singularly perturbed problem of convection–diffusion type. To obtain an \(\varepsilon \)-independent error estimate, a special stabilization term is proposed for the discretization of the diffusion term. Optimal and uniformly convergent error estimates in the energy-like norm are proved on the Shishkin mesh for elements of arbitrary order. From the implementation point of view, the presented WG finite element method and the technique of eliminating interior unknowns can be extended to two-dimensional singularly perturbed problems of convection–diffusion type. In that setting, using our error analysis approach, it is not hard to prove optimal and uniformly convergent error estimates in the energy-like norm for the presented method with linear elements on Shishkin meshes. For high order elements, the main difficulty is to construct a special interpolation satisfying the following two conditions: (1) its interpolation error is uniformly convergent on Shishkin meshes; (2) it is suitable for the analysis of the WG finite element method. We will investigate this topic in future work.