1 Introduction

In this paper, we consider the following steady state convection–diffusion equation

$$\begin{aligned} -\epsilon \Delta u+\nabla \cdot (\mathbf {b}u)= & {} f, \quad \text {in }\Omega , \end{aligned}$$
(1.1)

where \(\Omega \subset \mathbb {R}^{d}\) is a polyhedral domain with \(d=2,3\). For simplicity, we only consider the homogeneous Dirichlet boundary condition

$$\begin{aligned} u= & {} 0, \quad \text {on }\partial \Omega . \end{aligned}$$

Extensions to the case of an inhomogeneous Dirichlet boundary condition and to other types of boundary conditions are straightforward. In the above equation, u is the unknown function to be approximated, \(f\in L^{2}(\Omega )\) is a given source term, and \(\mathbf {b}\) is a given vector field which is assumed to be sufficiently smooth and divergence free, i.e., \(\nabla \cdot \mathbf {b}=0\). Throughout this paper, vector fields are denoted by boldface letters. Moreover, the diffusion coefficient \(\epsilon \) is a constant. We mainly consider the convection dominated case, that is, the case where the diffusion is small. Hence, we assume that \(\epsilon \leqslant 1\) throughout this paper.

There have been a number of attempts to solve the convection–diffusion equation numerically. One of the most popular choices is the finite element (FE) method, which uses continuous basis functions; examples include Galerkin/Least Squares methods [27], Continuous Interior Penalty methods [6, 7], and Local Projection Stabilization [1, 2, 4, 33]. Codina [25] also summarized and compared several existing variants of the FE method and proposed some new ideas. Another important class of methods is the discontinuous Galerkin (DG) methods [10, 23, 24, 36, 37], which use piecewise approximations without enforcing any continuity across cell interfaces. They offer several advantages, such as high-order accuracy, an extremely local data structure and high parallel efficiency. There are many successful works in this area, such as [3, 9, 21, 29]. The staggered DG (SDG) method is a relatively new class of DG method in the literature, which uses staggered meshes. A distinctive advantage of using staggered meshes is that physical laws arising from the corresponding partial differential equations are automatically preserved. The SDG method can be viewed as a hybrid of the standard DG method and the FE method in the sense that each of the numerical unknowns is continuous along some of the faces in the triangulation, but is discontinuous along the other faces. The SDG method was first developed by Chung and Engquist [17] in 2006 to solve the wave propagation problem. Since then, there have been a number of successful works on SDG, such as [12, 13, 19, 32]. In 2012, Chung and Lee [20] proposed an SDG scheme for the convection–diffusion equation in which the diffusion coefficient \(\epsilon \) is set to 1. This SDG method is successful in preserving a number of physical laws. We note that the SDG method is related to the very successful HDG method [14, 15].

It is well known that the exact solution of the mathematical problem (1.1) may contain singular points. Also, in the convection dominated case of the convection–diffusion equation, that is, when the diffusion coefficient \(\epsilon \) is relatively small, the exact solution may contain sharp layers. In these cases, numerical computations require a very fine mesh in order to capture the detailed features of the solution near those singular points or sharp layers. Thus, a significant amount of computer memory and time is needed, and the computation of the solution is very challenging. In many cases, adaptive refinement schemes are used for quicker convergence. They refine the mesh locally at suitable locations and can thus reduce the computational cost. Over the past few decades, there have been many successful works on adaptive FE or DG methods for solving the convection–diffusion equation, such as [7, 8, 28, 42]. As far as we know, there are only two works on adaptive SDG methods, namely, [22] for the time-harmonic Maxwell's equations and [16] for the Stokes system, and there is no prior work on adaptive SDG methods for convection–diffusion equations.

In this paper, we are devoted to constructing an adaptive SDG method for solving the steady state convection–diffusion equation (1.1). The key ingredient is the construction of an efficient and reliable error indicator. Note that there is an extra coefficient \(\epsilon \) in Eq. (1.1) compared to the equation considered in [20]. A straightforward generalization of the SDG method in [20] would lead to a term with coefficient \(\frac{1}{\epsilon }\) in the proof of the reliability of the error indicator. This term would be very large in the convection dominated case. Hence, we make a small modification of the scheme in [20] and obtain a new SDG method for solving (1.1). As the exact solution may not be known a priori, we derive a computable error estimator without undetermined quantities and use it as an error indicator for the new SDG method. The error indicator is composed of local residuals and jumps of the numerical solution. It estimates a DG-norm error of the numerical solution locally in each cell of the triangulation. We then prove the reliability and efficiency of this error indicator. In particular, we show that the DG-norm error of the numerical solution is bounded both above and below by our computable error estimator, up to a data approximation term. Based on the derived error indicator, we can refine the mesh at locations with high estimated error and hence construct an adaptive mesh refinement strategy.

We admit that there is a term with the coefficient \(\frac{h}{\epsilon }\) in our error indicator, and the efficiency is proved under the assumption that the ratio \(\frac{h}{\epsilon }\) is bounded. That is, for the convection dominated problem, this error estimator may theoretically be much larger than the exact error on the coarse part of the mesh, and it becomes efficient only when the mesh size is locally small enough. However, numerical examples still show that our scheme performs roughly the same as the adaptive refinement scheme that uses the exact error, computed from the exact solution, as the indicator. We can see that our adaptive refinement method is much better than regular uniform refinement and reaches the optimal rate of convergence. Also, our adaptive SDG method is able to capture the correct locations of singular points and sharp layers and hence can recover the complicated structure of the solution.

This paper is organized as follows. In Sect. 2, we make a small modification of the SDG scheme proposed in [20] and prove the numerical stability of the new SDG method. In Sect. 3, we propose a residual-type a-posteriori error estimator for this new SDG method and prove its reliability and efficiency. An adaptive mesh refinement strategy based on this error estimator is also given. Numerical experiments are performed in Sect. 4 to show the accuracy and efficiency of the proposed error estimator and the adaptive SDG method. Finally, a conclusion is given in Sect. 5.

2 Numerical Scheme

In this section, we briefly present the numerical scheme for solving the steady state convection–diffusion equation (1.1). We follow the basic idea in [20]. However, since there is an extra coefficient \(\epsilon \leqslant 1\) in the diffusion term in this paper, we need to make some modifications. Also, we adjust the notation to make the scheme simpler. We prove that the stability of this new modified SDG scheme still holds, as in [20].

In the following, we first present a new mixed form of the original convection–diffusion equation and derive the variational form satisfied by the exact solution in Sect. 2.1. Based on the mixed form, we show how to construct the new SDG method in Sect. 2.2. For simplicity, we only discuss the two-dimensional case; its generalization to the three-dimensional case is straightforward.

2.1 A New Mixed Form of the Convection–Diffusion Equation

The key step of the SDG method is to transform the original convection–diffusion equation into a mixed form. Let us first recall the mixed form used in [20], restricted to the steady state case:

$$\begin{aligned}&\mathbf {p} = \nabla u-\frac{1}{2}\mathbf {b}u, \nonumber \\&\mathbf {w} = \mathbf {b} u, \nonumber \\&-\nabla \cdot \mathbf {p}+\frac{1}{2}\mathbf {b}\cdot \left( \mathbf {p}+\frac{1}{2}\mathbf {w}\right) =f. \end{aligned}$$
(2.1)

Notice that \(\mathbf {p}+\frac{1}{2}\mathbf {w}\) in the last equation is in fact \(\nabla u\), which derives from the convection part of the original convection–diffusion equation, whereas \(\nabla u\) in the definition of \(\mathbf {p}\) comes from the diffusion part. That is, \(\nabla u\) in the convection part is recovered by using the diffusion part. Now we consider Eq. (1.1), which contains a coefficient \(\epsilon \) in the diffusion term. A straightforward generalization of the above mixed form leads to

$$\begin{aligned}&\mathbf {p} = \epsilon \nabla u-\frac{1}{2}\mathbf {b}u, \nonumber \\&\mathbf {w} = \mathbf {b} u, \nonumber \\&-\nabla \cdot \mathbf {p}+\frac{1}{2\epsilon }\mathbf {b}\cdot \left( \mathbf {p}+\frac{1}{2}\mathbf {w}\right) =f. \end{aligned}$$
(2.2)

Now we have a coefficient \(\frac{1}{\epsilon }\) in (2.2): since \(\mathbf {p} = \epsilon \nabla u-\frac{1}{2}\mathbf {b}u\) and \(\mathbf {w} = \mathbf {b} u\), the gradient in the convection part is recovered as \(\nabla u=\frac{1}{\epsilon }(\mathbf {p}+\frac{1}{2}\mathbf {w})\). This coefficient \(\frac{1}{\epsilon }\) would still appear in the proof of the reliability of the error indicator. Hence, we need a new formulation to eliminate this \(\frac{1}{\epsilon }\) term.

To derive the new formulation, we denote \(\nabla u\) in the convection part as \(\mathbf {s}\) directly and introduce the following two variables:

$$\begin{aligned} \mathbf {p} = \epsilon \nabla u-\frac{1}{2}\mathbf {b}u, \qquad \mathbf {s} = \nabla u. \end{aligned}$$

Then the left hand side of Eq. (1.1) becomes

$$\begin{aligned} -\,\epsilon \Delta u+\nabla \cdot (\mathbf {b}u) =- \nabla \cdot \left( \epsilon \nabla u-\frac{1}{2}\mathbf {b}u\right) +\frac{1}{2}\nabla \cdot (\mathbf {b}u) =-\nabla \cdot \mathbf {p}+\frac{1}{2}\nabla \cdot (\mathbf {b}u). \end{aligned}$$

Since \(\mathbf {b}\) is divergence free, we know that

$$\begin{aligned} \nabla \cdot (\mathbf {b}u)= \mathbf {b}\cdot \nabla u=\mathbf {b}\cdot \mathbf {s}. \end{aligned}$$

Hence, Eq. (1.1) becomes

$$\begin{aligned} -\nabla \cdot \mathbf {p}+\frac{1}{2}\mathbf {b}\cdot \mathbf {s}=f. \end{aligned}$$

Finally, we obtain the following new mixed form of Eq. (1.1)

$$\begin{aligned}&\mathbf {p} = \epsilon \nabla u-\frac{1}{2}\mathbf {b}u, \end{aligned}$$
(2.3)
$$\begin{aligned}&\mathbf {s} =\nabla u, \end{aligned}$$
(2.4)
$$\begin{aligned}&-\nabla \cdot \mathbf {p}+\frac{1}{2}\mathbf {b}\cdot \mathbf {s}=f, \end{aligned}$$
(2.5)

which contains no \(\frac{1}{\epsilon }\) term.

Before we proceed, let us introduce some notation. We simply use \((\cdot ,\cdot )\) to denote the standard \(L^2\) inner product on \(\Omega \). For any domain \(\Lambda \subset \mathbb {R}^{d}\) and functions u and v defined on \(\Lambda \), we define the norms

$$\begin{aligned} \Vert v\Vert _{0;\Lambda }^{2}:= & {} \Vert v\Vert _{L^{2}(\Lambda )}^{2},\qquad |v|_{ 1;\Lambda }^{2} := \Vert \nabla v\Vert _{L^{2}(\Lambda )}^{2},\\ \Vert v\Vert _{1;\Lambda }^{2}:= & {} \Vert v\Vert _{0;\Lambda }^{2}+|v|_{ 1;\Lambda }^{2}, \end{aligned}$$

provided that these norms are well-defined.

By multiplying by test functions and using integration by parts, the variational form of Eqs. (2.3) to (2.5) reads: find \((\mathbf {p},\mathbf {s},u)\in [L^2(\Omega )]^2\times [L^2(\Omega )]^2 \times H_0^1(\Omega )\), such that

$$\begin{aligned}&(\mathbf {p},\mathbf {q}) = \epsilon (\nabla u,\mathbf {q})-\frac{1}{2}(\mathbf {b}u,\mathbf {q}), \end{aligned}$$
(2.6)
$$\begin{aligned}&(\mathbf {s},\mathbf {q}) = (\nabla u,\mathbf {q}), \end{aligned}$$
(2.7)
$$\begin{aligned}&(\mathbf {p},\nabla v)+\frac{1}{2}(\mathbf {b}\cdot \mathbf {s},v)=(f,v), \end{aligned}$$
(2.8)

for all test functions \((\mathbf {q},v)\in [L^2(\Omega )]^2 \times H_0^1(\Omega )\).

Now we try to eliminate the auxiliary variables \(\mathbf {p}\) and \(\mathbf {s}\) from the variational form and thus derive the equation satisfied by \(u\in H_0^1(\Omega )\). Note that for any test function \(v \in H_0^1(\Omega )\), by taking \(\mathbf {q}=\nabla v \in [L^2(\Omega )]^2\) and \(\mathbf {q}=\mathbf {b} v \in [L^2(\Omega )]^2\) in Eqs. (2.6) and (2.7), respectively, we have

$$\begin{aligned} (\mathbf {p},\nabla v)= & {} \epsilon (\nabla u,\nabla v)-\frac{1}{2}(\mathbf {b}u,\nabla v), \\ (\mathbf {s},\mathbf {b} v)= & {} (\nabla u,\mathbf {b} v). \end{aligned}$$

Substituting the above two equations into Eq. (2.8) and using the fact that \(v \in H_0^1(\Omega )\) and \(\mathbf {b}\) is divergence free, we obtain

$$\begin{aligned} \epsilon (\nabla u,\nabla v)+(\nabla \cdot (\mathbf {b}u),v)=(f,v). \end{aligned}$$
(2.9)

If we denote

$$\begin{aligned} B_d(u,v)= & {} \epsilon (\nabla u,\nabla v),\qquad B_c(u,v)=(\nabla \cdot (\mathbf {b}u),v), \\ B(u,v)= & {} B_d(u,v)+B_c(u,v), \end{aligned}$$

then the exact solution \(u\in H_0^1(\Omega )\) satisfies

$$\begin{aligned} B(u,v)=(f,v),\qquad \forall v \in H_0^1(\Omega ). \end{aligned}$$
(2.10)

This equation will be used in Sect. 3. Since \(\mathbf {b}\) is divergence free, we can observe that

$$\begin{aligned} B_c(v,v)= & {} (\nabla \cdot (\mathbf {b}v),v)=-\,(\mathbf {b}v, \nabla v) \nonumber \\= & {} -\,(\mathbf {b}\cdot \nabla v,v)=-\,(\nabla \cdot (\mathbf {b} v),v) = -B_c(v,v), \qquad \forall v \in H_0^1(\Omega ), \end{aligned}$$

which shows the following property of \(B_c\)

$$\begin{aligned} B_c(v,v)=0, \qquad \forall v \in H_0^1(\Omega ). \end{aligned}$$
(2.11)

2.2 The Modified SDG Method

Based on the mixed formulation in the last section, we are ready to construct a new modified SDG method. Following [20], we first define the triangulation. Let \(\mathcal {T}_{0}\) be a shape regular initial triangulation of \(\Omega \), as illustrated by the solid lines in Fig. 1. We denote the collection of all edges in \(\mathcal {T}_{0}\) by \(\mathcal {F}_{u}\) and the subset of all interior edges by \(\mathcal {F}_{u}^0\). Next, we construct a staggered mesh by further subdividing the triangles. Each triangle \(\tau \in \mathcal {T}_{0}\) is subdivided into three small triangles by joining its center with its vertices. Hence, we obtain a new triangulation consisting of all the small triangles thus formed, which we denote by \(\mathcal {T}\). As illustrated by the dotted lines in Fig. 1, we denote the collection of all new edges formed under \(\mathcal {T}\) by \(\mathcal {F}_{p}\). Moreover, we define the set of all interior edges of \(\mathcal {T}\) as \(\mathcal {F}^0=\mathcal {F}_{p}\bigcup \mathcal {F}_{u}^0\).

Fig. 1: An illustration of the staggered mesh
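For concreteness, the subdivision step above can be sketched in a few lines of code. The following Python fragment is a minimal illustration, assuming a simple array-based mesh representation with hypothetical names `points` and `triangles`; it takes the interior point of each triangle to be its centroid and is not tied to any particular implementation.

```python
import numpy as np

def build_staggered_mesh(points, triangles):
    """Split every triangle of an initial mesh T_0 into three sub-triangles
    by joining its centroid to its vertices (the dotted edges in Fig. 1).

    points    : (N, 2) array of vertex coordinates of T_0
    triangles : (M, 3) array of vertex indices of T_0
    Returns the enlarged vertex array and the (3M, 3) connectivity of T.
    """
    points = np.asarray(points, dtype=float)
    triangles = np.asarray(triangles, dtype=int)

    centroids = points[triangles].mean(axis=1)     # one new node per triangle of T_0
    new_points = np.vstack([points, centroids])
    first_new = len(points)

    sub_triangles = []
    for t, (a, b, c) in enumerate(triangles):
        g = first_new + t                          # index of the centroid node
        # each edge of the original triangle, joined to the centroid, gives one sub-triangle
        sub_triangles += [(a, b, g), (b, c, g), (c, a, g)]
    return new_points, np.asarray(sub_triangles, dtype=int)

# toy example: a single triangle of T_0 is split into the three triangles of T
pts, tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], [(0, 1, 2)]
new_pts, new_tri = build_staggered_mesh(pts, tri)
print(new_tri)   # [[0 1 3] [1 2 3] [2 0 3]]: three sub-triangles sharing the centroid node 3
```

In this picture, the edges of \(\mathcal {T}_0\) (solid lines) make up \(\mathcal {F}_u\), while the newly created edges joining centroids to vertices (dotted lines) make up \(\mathcal {F}_p\).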

For each edge e of \(\mathcal {T}\), we define a unit normal vector \(\mathbf {n}_e\) in the following way. If \(e\subset \partial \Omega \), then we define \(\mathbf {n}_e\) as the unit normal vector pointing outside of \(\Omega \). For an interior edge \(e=\partial \tau ^+ \bigcap \partial \tau ^-\), we use the notations \(\mathbf {n}^+\) and \(\mathbf {n}^-\) to denote the outward unit normal vectors of e taken from \(\tau ^+\) and \(\tau ^-\), respectively, and fix \(\mathbf {n}_e\) as one of \(\mathbf {n}^{\pm }\). We use the notations \(v^+\) and \(v^-\) to denote the values of a function v on e taken from \(\tau ^+\) and \(\tau ^-\), respectively. Then the jump [v] over an edge e of a scalar-valued function v is defined as

$$\begin{aligned}{}[v]|_e := (v^+\mathbf {n}^+ + v^-\mathbf {n}^-)\cdot \mathbf {n}_e. \end{aligned}$$

For a vector-valued function \(\mathbf {q}\), the notation \([\mathbf {q}\cdot \mathbf {n}]\) is defined as

$$\begin{aligned}{}[\mathbf {q}\cdot \mathbf {n}]|_e := \mathbf {q}^+\cdot \mathbf {n}^+ + \mathbf {q}^-\cdot \mathbf {n}^-. \end{aligned}$$
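In an implementation, these two jump definitions amount to one-line formulas once the traces and normals on an edge are available (for instance at edge quadrature points). The short sketch below simply mirrors the definitions above; the argument names are hypothetical.

```python
import numpy as np

def scalar_jump(v_plus, v_minus, n_plus, n_minus, n_e):
    """[v]|_e = (v^+ n^+ + v^- n^-) . n_e for a scalar-valued function v."""
    return (v_plus * n_plus + v_minus * n_minus) @ n_e

def normal_jump(q_plus, q_minus, n_plus, n_minus):
    """[q . n]|_e = q^+ . n^+ + q^- . n^- for a vector-valued function q."""
    return q_plus @ n_plus + q_minus @ n_minus

# example on a vertical edge, fixing n_e = n^+ = (1, 0)
n_p, n_m = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
print(scalar_jump(2.0, 0.5, n_p, n_m, n_p))                                  # 1.5 = v^+ - v^-
print(normal_jump(np.array([1.0, 2.0]), np.array([0.3, -1.0]), n_p, n_m))   # 0.7
```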

Note that \(\mathbf {b}\) is a given vector field, which can be any sufficiently smooth function. In order to solve our problem numerically, we approximate \(\mathbf {b}\) by a piecewise polynomial vector field \(\mathbf {b}_h\). In each triangle \(\tau \in \mathcal {T}\), \(\mathbf {b}_h|_{\tau }\) is the Raviart–Thomas projection of \(\mathbf {b}\) onto the Raviart–Thomas space \(RT^k(\tau )\), defined as follows:

$$\begin{aligned}&(\mathbf {b}_h|_{\tau },\mathbf {q})_{0;\tau } = (\mathbf {b},\mathbf {q})_{0;\tau },\quad \forall \mathbf {q}\in [P^{k-1}(\tau )]^2, \end{aligned}$$
(2.12)
$$\begin{aligned}&(\mathbf {b}_h|_{\tau }\cdot \mathbf {n}_e,v)_{0;e} = (\mathbf {b}\cdot \mathbf {n}_e,v)_{0;e},\quad \forall v\in P^k(e),\,\forall e\in \partial \tau , \end{aligned}$$
(2.13)

where \(P^{k}(\Lambda )\) denotes the space of polynomials of degree up to k on the domain \(\Lambda \subset \mathbb {R}^{d}\). Note that \(\mathbf {b}_h|_{\tau }\cdot \mathbf {n}_e\in P^k(e)\) on \(e\in \partial \tau \). Hence, (2.13) shows that \(\mathbf {b}_h|_{\tau }\cdot \mathbf {n}_e\) on \(e\in \partial \tau \) is just the \(L^2\) projection of \(\mathbf {b}\cdot \mathbf {n}_e\) onto \(P^k(e)\). Thus, \(\mathbf {b}_h\cdot \mathbf {n}_e\) is continuous across each interior edge of \(\mathcal {T}\). Another property of the Raviart–Thomas projection is that \(\nabla \cdot \mathbf {b}_h|_{\tau }\) is the \(L^2\) projection of \(\nabla \cdot \mathbf {b}\) onto \(P^k(\tau )\). Since \(\mathbf {b}\) is divergence free, we obtain \(\nabla \cdot \mathbf {b}_h=0\). In the rest of this paper, we write \(\mathbf {e}_b=\mathbf {b-b}_{h}\) for simplicity. It is well known that the following error estimate holds

$$\begin{aligned} \Vert \mathbf {e}_b\Vert _{0;\tau }\leqslant h_{\tau }^{k+1} |\mathbf {b}|_{k+1;\tau }. \end{aligned}$$
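To make the projection (2.12)–(2.13) concrete, the following sketch computes \(\mathbf {b}_h|_{\tau }\) in the lowest-order case \(k=0\), where (2.12) is vacuous and (2.13) reduces to matching the mean normal flux of \(\mathbf {b}\) on each edge. The two-point Gauss rule used for the edge means and the function names are illustrative choices, not part of the definition.

```python
import numpy as np

def rt0_projection(vertices, b):
    """Lowest-order (k = 0) Raviart-Thomas projection of a vector field b on one
    triangle; returns a callable b_h.  vertices: (3, 2) array, b: x -> (2,) array."""
    P = np.asarray(vertices, dtype=float)
    area = 0.5 * abs((P[1, 0] - P[0, 0]) * (P[2, 1] - P[0, 1])
                     - (P[2, 0] - P[0, 0]) * (P[1, 1] - P[0, 1]))

    coeffs, lengths = [], []
    for i in range(3):                       # edge e_i is opposite vertex P_i
        a, c = P[(i + 1) % 3], P[(i + 2) % 3]
        tangent = c - a
        length = np.linalg.norm(tangent)
        normal = np.array([tangent[1], -tangent[0]]) / length
        if np.dot(normal, a - P[i]) < 0:     # orient the normal outward
            normal = -normal
        # mean of b . n_i over e_i, approximated by a two-point Gauss rule
        g = 0.5 / np.sqrt(3.0)
        q1, q2 = a + (0.5 - g) * tangent, a + (0.5 + g) * tangent
        coeffs.append(0.5 * (np.dot(b(q1), normal) + np.dot(b(q2), normal)))
        lengths.append(length)

    def b_h(x):
        x = np.asarray(x, dtype=float)
        val = np.zeros(2)
        for i in range(3):
            # RT0 basis: psi_i(x) = |e_i| / (2|tau|) (x - P_i), with psi_i . n_j = delta_ij on e_j
            val += coeffs[i] * lengths[i] / (2.0 * area) * (x - P[i])
        return val
    return b_h

# constant fields belong to RT^0 and are reproduced exactly
bh = rt0_projection([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], lambda x: np.array([1.0, 2.0]))
print(bh([0.25, 0.25]))    # ~ [1. 2.]
```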

Next, we define two finite element spaces on the constructed staggered mesh:

$$\begin{aligned} U_{h}:= & {} \left\{ v: v|_{\tau }\in P^{k}(\tau ), {\forall \tau \in \mathcal {T}}; \; v\text { is continuous on } e\in \mathcal {F}_{u}^{0};\; v|_{\partial \Omega }=0\right\} , \\ W_{h}:= & {} \left\{ \mathbf {q}:\mathbf {q}|_{\tau }\in [P^{k}(\tau )]^{2}, {\forall \tau \in \mathcal {T}};\;\mathbf {q}\cdot \mathbf {n}_e\text { is continuous on } e\in \mathcal {F}_{p}\right\} . \end{aligned}$$

In the space \(U_h\), we define the following norms

$$\begin{aligned} \Vert v \Vert _X^2:= & {} \int _{\Omega } v^2 d\mathbf {x}+\sum _{e\in \mathcal {F}_{u}^{0}} h_e \int _e v^2 ds, \\ \Vert v \Vert _Z^2:= & {} \int _{\Omega } |\nabla v|^2 d\mathbf {x}+\sum _{e\in \mathcal {F}_{p}} h_e^{-1} \int _e [v]^2 ds, \end{aligned}$$

where \(h_e\) is the length of e. In the space \(W_h\), we define the following norms

$$\begin{aligned} \Vert \mathbf {q}\Vert _{X'}^2:= & {} \int _{\Omega } |\mathbf {q}|^2 d\mathbf {x}+\sum _{e\in \mathcal {F}_{p}} h_e \int _e (\mathbf {q}\cdot \mathbf {n}_e)^2 ds, \\ \Vert \mathbf {q}\Vert _{Z'}^2:= & {} \int _{\Omega } (\nabla \cdot \mathbf {q})^2 d\mathbf {x}+\sum _{e\in \mathcal {F}_{u}^0} h_e^{-1} \int _e [\mathbf {q} \cdot \mathbf {n}]^2 ds. \end{aligned}$$

Based on all concepts introduced above, we are ready to construct a new SDG scheme in order to approximate the variational form (2.6)–(2.8) satisfied by the exact solution. In our SDG method, we find the numerical solution \((\mathbf {p}_{h},\mathbf {s}_{h},u_{h})\in W_{h}\times W_{h}\times U_{h}\), such that

$$\begin{aligned}&(\mathbf {p}_{h},\mathbf {q}) = \epsilon (\mathbf {s}_h,\mathbf {q})-\frac{1}{2}(\mathbf {b}_h u_h,\mathbf {q}), \end{aligned}$$
(2.14)
$$\begin{aligned}&(\mathbf {s}_{h},\mathbf {q}) = B_{h}^{*}(u_{h},\mathbf {q}), \end{aligned}$$
(2.15)
$$\begin{aligned}&B_{h}(\varvec{p}_{h},v) + \frac{1}{2}(\mathbf {b}_h \cdot \mathbf {s}_h,v) = (f,v), \end{aligned}$$
(2.16)

for all test functions \(\mathbf {q}\in W_h\) and \(v\in U_h\), where

$$\begin{aligned} B_{h}^{*}(u_{h},\mathbf {q})= & {} -(u_{h},\nabla \cdot \mathbf {q})+\sum _{e\in \mathcal {F}_{u}^{0}}\int _{e}u_{h}[\mathbf {q}\cdot \varvec{n}] ds, \\ B_{h}(\mathbf {p}_{h},v)= & {} (\mathbf {p}_{h},\nabla v)-\sum _{e\in \mathcal {F}_{p}}\int _{e}\,\mathbf {p}_{h}\cdot \mathbf {n}_e [v] ds. \end{aligned}$$

We remark that the variables \(\mathbf {p}_h\) and \(\mathbf {s}_h\) in (2.16) can easily be eliminated by using (2.14) and (2.15), so that the scheme reduces to one system involving only \(u_h\); a matrix-level sketch of this elimination is given below. Compared with other types of DG schemes, our SDG scheme has fewer degrees of freedom due to the additional continuity condition on \(u_h\). On the other hand, it is proved by Chung and Engquist in [18] that

$$\begin{aligned} B_{h}^{*}(v,\mathbf {q}) = B_{h}(\mathbf {q},v),\qquad \forall v\in U_h, \,\,\forall \mathbf {q}\in W_h. \end{aligned}$$
(2.17)

Moreover, the following inf-sup condition holds:

$$\begin{aligned} K\Vert v\Vert _Z\leqslant \sup _{\mathbf {q}\in W_h}\frac{B_h^*(v,\mathbf {q})}{\Vert \mathbf {q}\Vert _{X'}}, \end{aligned}$$
(2.18)

where K is a constant independent of the mesh size.
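To illustrate the elimination of \(\mathbf {p}_h\) and \(\mathbf {s}_h\) mentioned above, suppose the scheme (2.14)–(2.16) has been assembled into matrices (hypothetical, pre-assembled objects): a mass matrix M on \(W_h\), a matrix G with entries \(B_h^*(u_j,\mathbf {q}_i)\), a coupling matrix C with entries \((\mathbf {b}_h u_j,\mathbf {q}_i)\), and a load vector F with entries \((f,v_i)\). By (2.17), the term \(B_{h}(\mathbf {p}_{h},v_i)\) corresponds to \((G^{T}p)_i\), and \((\mathbf {b}_h\cdot \mathbf {s}_h,v_i)=(C^{T}s)_i\). A minimal NumPy sketch of the reduction to a single system in \(u_h\) is as follows.

```python
import numpy as np

def eliminate_auxiliary_unknowns(M, G, C, F, eps):
    """Reduce the SDG system (2.14)-(2.16), written in matrix form as
         M p = eps * M s - 0.5 * C u,   M s = G u,   G^T p + 0.5 * C^T s = F,
    to one linear system for the vector u of degrees of freedom of u_h."""
    Minv_G = np.linalg.solve(M, G)
    Minv_C = np.linalg.solve(M, C)
    # eps * G^T M^{-1} G is symmetric positive semidefinite (diffusion part);
    # the bracket below is skew-symmetric (convection part), cf. (2.25) below
    A = eps * G.T @ Minv_G + 0.5 * (C.T @ Minv_G - G.T @ Minv_C)
    u = np.linalg.solve(A, F)
    s = Minv_G @ u
    p = eps * s - 0.5 * Minv_C @ u
    return u, p, s
```

Because the convection block in this reduced system is skew-symmetric, multiplying it by \(u^{T}\) gives \(\epsilon \, s^{T}Ms=F^{T}u\), which is the matrix form of the discrete identity (2.25) derived below.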

By using the inf-sup condition of \(B_h^*\), we can prove the following stability result for the modified SDG method above.

Theorem 2.1

Let \((\mathbf {p}_{h},\mathbf {s}_{h},u_{h})\in W_{h}\times W_{h}\times U_{h}\) be the solution of the SDG scheme (2.14)–(2.16). Then the following stability estimate holds:

$$\begin{aligned} \Vert u_h\Vert _Z\leqslant \frac{K}{\epsilon } \Vert f\Vert _{0;\Omega }. \end{aligned}$$
(2.19)

Proof

The inf-sup condition (2.18) for the operator \(B_h^*\) implies that

$$\begin{aligned} \Vert \mathbf {s}_{h}\Vert _{0;\Omega } =\sup _{\mathbf {q}\in W_h}\frac{(\mathbf {s}_{h},\mathbf {q})}{\Vert \mathbf {q}\Vert _{0;\Omega }} \geqslant \sup _{\mathbf {q}\in W_h}\frac{B_{h}^{*}(u_{h},\mathbf {q})}{\Vert \mathbf {q}\Vert _{X'}} \geqslant K \Vert u_h\Vert _Z, \end{aligned}$$
(2.20)

where we have used the scheme (2.15). Next, we try to compute \(\Vert \mathbf {s}_{h}\Vert _{0;\Omega }^2\). Taking \(\mathbf {q}=\mathbf {s}_h\), \(\mathbf {q}=\mathbf {p}_h\), and \(v=u_h\) in Eqs. (2.14), (2.15), and (2.16), respectively, we have

$$\begin{aligned}&(\mathbf {p}_{h},\mathbf {s}_h) = \epsilon (\mathbf {s}_h,\mathbf {s}_h)-\frac{1}{2}(\mathbf {b}_h u_h,\mathbf {s}_h), \end{aligned}$$
(2.21)
$$\begin{aligned}&(\mathbf {s}_{h},\mathbf {p}_h) = B_{h}^{*}(u_{h},\mathbf {p}_h), \end{aligned}$$
(2.22)
$$\begin{aligned}&B_{h}(\varvec{p}_{h},u_h) + \frac{1}{2}(\mathbf {b}_h \cdot \mathbf {s}_h,u_h) = (f,u_h). \end{aligned}$$
(2.23)

Combining (2.21) and (2.22), we get

$$\begin{aligned} \frac{1}{2}(\mathbf {b}_h u_h,\mathbf {s}_h)=\epsilon (\mathbf {s}_h,\mathbf {s}_h)-B_{h}^{*}(u_{h},\mathbf {p}_h). \end{aligned}$$
(2.24)

Substituting the above equation into (2.23) and using the property (2.17), we obtain

$$\begin{aligned} \Vert \mathbf {s}_h\Vert _{0;\Omega }^2=\frac{1}{\epsilon }(f,u_h). \end{aligned}$$
(2.25)

Combining (2.20) and (2.25), we obtain

$$\begin{aligned} \Vert u_h\Vert _Z^2 \leqslant K \Vert \mathbf {s}_{h}\Vert _{0;\Omega }^2 = \frac{K}{\epsilon }(f,u_h) \leqslant \frac{K}{\epsilon } \Vert f\Vert _{0;\Omega }\Vert u_h\Vert _Z. \end{aligned}$$
(2.26)

Hence, we arrive at the conclusion that

$$\begin{aligned} \Vert u_h\Vert _Z\leqslant \frac{K}{\epsilon } \Vert f\Vert _{0;\Omega }. \end{aligned}$$
(2.27)

\(\square \)

Remark Note that the convection term is skew-symmetric; it is therefore easy to obtain the following underlying physical law from the original convection–diffusion equation (1.1):

$$\begin{aligned} \Vert \nabla u\Vert _{0;\Omega }^2= \frac{1}{\epsilon } (f,u). \end{aligned}$$

Equation (2.25) is a result of the preservation of the skew-symmetry of the discrete convection operator and shows that our SDG scheme preserves the above physical property in the discrete sense. Such a property can also enhance the stability when solving the incompressible Navier–Stokes equations [11]. This is an advantage of using staggered meshes. One can show that this physical law is not exactly satisfied by the local DG (LDG) method, due to the use of numerical fluxes.

By using the definition of \(\Vert u_h\Vert _Z\), the stability of our scheme gives

$$\begin{aligned} \Vert \nabla u_h\Vert _{0;\Omega }\leqslant \Vert u_h\Vert _Z\leqslant \frac{K}{\epsilon } \Vert f\Vert _{0;\Omega }. \end{aligned}$$
(2.28)

By combining Eqs. (2.26) and (2.27), we can also derive the following bound of \(\mathbf {s}_h\)

$$\begin{aligned} \Vert \mathbf {s}_h\Vert _{0,\Omega }\leqslant \frac{K}{\epsilon } \Vert f\Vert _{0;\Omega }. \end{aligned}$$
(2.29)

3 An Adaptive SDG Method

In this section, we derive a reliable and efficient a-posteriori error estimator for the new SDG scheme constructed in the last section, in order to develop an adaptive mesh refinement strategy. The error estimator gives a computable estimate of the numerical error in each triangle \(\tau \in \mathcal {T}\). Thus, we can use it as an error indicator and refine the mesh adaptively at locations with larger estimated numerical error.

Let \((\mathbf {p},\mathbf {s},u)\in [L^2(\Omega )]^2\times [L^2(\Omega )]^2 \times H_0^1(\Omega )\) be the exact solution of (2.6)–(2.8) and let \((\mathbf {p}_{h},\mathbf {s}_{h},u_{h})\in W_{h}\times W_{h}\times U_{h}\) be the numerical solution of the SDG scheme (2.14)–(2.16), where \(\mathbf {b}_h\) is the Raviart–Thomas projection of \(\mathbf {b}\). We denote the numerical errors by

$$\begin{aligned}&e_{u}=u-u_{h}, \qquad \mathbf {e}_{bu}=\mathbf {b}u-\mathbf {b}_{h}u_h,\\&\mathbf {e}_{p}=\mathbf {p-p}_{h}, \qquad e_{bs}=\mathbf {b}\cdot \mathbf {s}-\mathbf {b}_h\cdot \mathbf {s}_h. \end{aligned}$$

Moreover, we define the following DG norm of the numerical error

$$\begin{aligned} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG}}^{2}:= & {} \epsilon ^2 |e_u|_{1;\Omega }^{2} +\Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0;\Omega }^{2} +\Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\Omega }^{2}\nonumber \\&+\sum _{e\in \mathcal {F}_p} h_{e}^{-1} \Vert [e_u] \Vert _{0;e}^{2}. \end{aligned}$$
(3.1)

In Sect. 3.1, we give an error indicator which provides a local a-posteriori estimate of the above DG norm \(\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG}}^2\) and prove its reliability. The efficiency of this error indicator is proved in Sect. 3.2. Based on the error indicator, we give the adaptive refinement technique in Sect. 3.3.

3.1 Reliability of the Error Indicator

In the following theorem, we show that the DG norm of the numerical error defined in (3.1) is bounded above by a computable error indicator \(\eta ^2\), which can be computed locally on each triangle \(\tau \) of the mesh. Throughout the paper, the notation \(\alpha \lesssim \beta \) means that \(\alpha \le C \beta \) for a constant C independent of the mesh size.

Theorem 3.1

Let \((\mathbf {p},\mathbf {s},u)\in [L^2(\Omega )]^2\times [L^2(\Omega )]^2 \times H_0^1(\Omega )\) be the exact solution of (2.6)–(2.8) and let \((\mathbf {p}_{h},\mathbf {s}_{h},u_{h})\in W_{h}\times W_{h}\times U_{h}\) be the numerical solution of the SDG scheme (2.14)–(2.16). Then the DG norm of the numerical error defined in Eq. (3.1) satisfies

$$\begin{aligned} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG}}^{2} \lesssim \eta ^2=\sum _{\tau \in \mathcal {T}}\eta _{\tau }^2, \end{aligned}$$
(3.2)

where for each \(\tau \in \mathcal {T}\),

$$\begin{aligned} \eta _{\tau }^2:= & {} h_{\tau }^{2}\Vert R_1\Vert _{0;\tau }^{2}+\Vert \mathbf {R}_{2}\Vert _{0;\tau }^{2}+\Vert R_{3}\Vert _{0;\tau }^{2}+\frac{h_{\tau }^2}{\epsilon ^2} \Vert \mathbf {e}_b\Vert _{\infty ;\tau }^2\nonumber \\&+\sum _{e\in \mathcal {F}_{p}\cap \tau }h_{e}^{-1}\Vert J_{1}\Vert _{0;e}^{2} + \sum _{e\in \mathcal {F}^{0}\cap \tau }h_{e}\Vert J_{2}\Vert _{0;e}^{2}, \end{aligned}$$
(3.3)

with

$$\begin{aligned} \left\{ \begin{array}{rcl} \displaystyle R_1 &{} = &{} f+\nabla \cdot \mathbf {p}_h-\frac{1}{2}\mathbf {b}_h\cdot \mathbf {s}_h,\\ \displaystyle \mathbf {R}_2 &{} = &{} \mathbf {p}_h-\epsilon \nabla u_{h}+\frac{1}{2}\mathbf {b}_hu_h,\\ \displaystyle R_3 &{} = &{} \mathbf {b}_h\cdot (\mathbf {s}_h- \nabla u_h),\\ \displaystyle J_1 &{} = &{}[{u}_h],\\ \displaystyle J_2 &{} = &{} [ (\mathbf {p}_h+\frac{1}{2}\mathbf {b}_h u_h)\cdot \mathbf {n}]. \end{array} \right. \end{aligned}$$

Here, \(h_{\tau }\) denotes the diameter of the circumcircle of a triangle \(\tau \), and \(h_e\) is the length of an edge e. For each \(e\in \partial \tau \), it is obvious that \(h_e\leqslant h_{\tau }\). Throughout this paper, we assume that our mesh is regular, that is, \( h_{\tau }\lesssim h_e\) for each \(e\in \partial \tau \).
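Once the local quantities entering (3.3) have been computed, for instance by quadrature on each triangle and its edges, assembling the indicator is straightforward. The sketch below assumes these element and edge contributions are already available (all argument names are hypothetical) and simply forms \(\eta _{\tau }^2\) and \(\eta ^2\).

```python
import numpy as np

def local_error_indicators(h_tau, R1_sq, R2_sq, R3_sq, eb_inf_sq,
                           edges_p, edges_0, eps):
    """Assemble eta_tau^2 for every element, following (3.3).

    h_tau, R1_sq, R2_sq, R3_sq, eb_inf_sq : per-element arrays of h_tau and of the
        squared norms ||R_1||, ||R_2||, ||R_3||, ||e_b||_inf on each triangle
    edges_p[t], edges_0[t] : lists of pairs (h_e, J_sq) for the edges of F_p (with
        ||J_1||^2) and of F^0 (with ||J_2||^2) belonging to element t
    """
    n = len(h_tau)
    eta_sq = np.zeros(n)
    for t in range(n):
        eta_sq[t] = (h_tau[t] ** 2 * R1_sq[t] + R2_sq[t] + R3_sq[t]
                     + (h_tau[t] / eps) ** 2 * eb_inf_sq[t]
                     + sum(J_sq / h_e for h_e, J_sq in edges_p[t])
                     + sum(h_e * J_sq for h_e, J_sq in edges_0[t]))
    return eta_sq                      # global indicator: eta^2 = eta_sq.sum()
```

Elements with the largest values of \(\eta _{\tau }^2\) are then the natural candidates for refinement; the precise refinement strategy is described in Sect. 3.3.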

From the definition of \(\eta ^2\) and the fact that \([u]|_e=0\) on \(e\in \mathcal {F}_p\), we can easily see that the last term of \(\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG}}^{2}\) in Eq. (3.1) is also one of the terms in \(\eta ^2\), and hence is automatically bounded by \(\eta ^2\). Thus we only need to find upper estimates for the first three terms in \(\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG}}^{2}\). In the following lemma, we first deal with the second and third terms.

Lemma 3.2

Under the assumptions of Theorem 3.1, we have the following upper estimates:

$$\begin{aligned}&\Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0;\Omega } \leqslant \epsilon |e_u|_{1;\Omega } +\sum _{\tau \in \mathcal {T}}\Vert \mathbf {R}_2\Vert _{0;\tau }, \end{aligned}$$
(3.4)
$$\begin{aligned}&\Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\Omega } \leqslant \sum _{\tau \in \mathcal {T}} \Vert R_3\Vert _{0;\tau }. \end{aligned}$$
(3.5)

Proof

We first prove (3.4). By the variational form (2.6), we know that for any \(\mathbf {q} \in [L^2(\Omega )]^2\), we have

$$\begin{aligned} \left( \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu},\mathbf {q}\right)= & {} \left( \mathbf {p}+\frac{1}{2}\mathbf {b}u,\mathbf {q}\right) -\left( \mathbf {p}_h+\frac{1}{2}\mathbf {b}_hu_h,\mathbf {q}\right) \nonumber \\= & {} \epsilon (\nabla u,\mathbf {q})-\left( \mathbf {p}_h+\frac{1}{2}\mathbf {b}_hu_h,\mathbf {q}\right) \nonumber \\= & {} \epsilon (\nabla (u-u_h),\mathbf {q})-\left( \mathbf {p}_h-\epsilon \nabla u_h+\frac{1}{2}\mathbf {b}_hu_h,\mathbf {q}\right) \nonumber \\= & {} \epsilon (\nabla e_u,\mathbf {q})-(\mathbf {R}_2,\mathbf {q}) \nonumber \\\leqslant & {} \left( \epsilon |e_u|_{1;\Omega }+ \sum _{\tau \in \mathcal {T}}\Vert \mathbf {R}_2\Vert _{0;\tau } \right) \Vert \mathbf {q}\Vert _{0;\Omega }. \end{aligned}$$

Taking \(\mathbf {q}=\mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\in [L^2(\Omega )]^2\), we obtain

$$\begin{aligned} \left\| \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\right\| _{0;\Omega }\leqslant & {} \epsilon |e_u|_{1;\Omega }+ \sum _{\tau \in \mathcal {T}} \Vert \mathbf {R}_2\Vert _{0;\tau }. \end{aligned}$$

We move on to prove (3.5). Using the fact that \(\mathbf {b}\) and \(\mathbf {b}_h\) are divergence free, we know that for any \(v \in L^2(\Omega )\), we have

$$\begin{aligned} (e_{bs}-\nabla \cdot \mathbf {e}_{bu},v)= & {} (\mathbf {b}\cdot \mathbf {s}- \mathbf {b}\cdot \nabla u, v)-(\mathbf {b}_h\cdot \mathbf {s}_h- \mathbf {b}_h\cdot \nabla u_h, v)\nonumber \\= & {} (\mathbf {s}-\nabla u, \mathbf {b} v)-(R_3, v). \end{aligned}$$

From the variational form (2.7), we get

$$\begin{aligned} (\mathbf {s}-\nabla u, \mathbf {b} v)=0. \end{aligned}$$

Hence, we obtain

$$\begin{aligned} (e_{bs}-\nabla \cdot \mathbf {e}_{bu},v)= & {} -(R_3,v) \leqslant \sum _{\tau \in \mathcal {T}} \Vert R_3\Vert _{0;\tau }\Vert v\Vert _{0;\Omega }. \end{aligned}$$

Taking \(v=e_{bs}-\nabla \cdot \mathbf {e}_{bu}\in L^2(\Omega )\), we have

$$\begin{aligned} \Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\Omega }\leqslant & {} \sum _{\tau \in \mathcal {T}} \Vert R_3\Vert _{0;\tau }. \end{aligned}$$

\(\square \)

From the above lemma, we know that the remaining task is to find an upper bound for \(\epsilon ^2|e_u|_{1;\Omega }^2\), which is also the first term of \(\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG}}^{2}\). For this purpose, we need to introduce an auxiliary variable \(u^{c}\in H_0^1(\Omega )\bigcap U_h\) as in the following lemma on polynomial approximation.

Lemma 3.3

Let \(u_{h}\in U_{h}\). There exists \(u^{c}\in H_0^1(\Omega )\bigcap U_{h}\), such that

$$\begin{aligned} |u_{h}-u^{c}|_{1;\Omega }^2 \lesssim \sum _{e\in \mathcal {F}_{p}}h_{e}^{-1}\Vert J_{1}\Vert _{0;e}^{2}. \end{aligned}$$
(3.6)

Proof

From Theorem 2.2 of [31], we know that there exists \(u^{c}\in H_0^1\bigcap U_{h}\) such that

$$\begin{aligned} |u_{h}-u^{c}|_{1;\Omega }^2 \lesssim \sum _{e\in \mathcal {F}_{u}^0 \bigcup \mathcal {F}_{p}}h_{e}^{-1}\Vert J_{1}\Vert _{0;e}^{2} +\sum _{e\in \partial \Omega }h_{e}^{-1}\Vert u_h\Vert _{0;e}^{2}. \end{aligned}$$

By the definition of \(U_h\), we know that \(J_1|_{\mathcal {F}_{u}^0}=0\) and \(u_h|_{\partial \Omega }=0\). Hence, the conclusion follows. \(\square \)

Using the triangle inequality, we know that

$$\begin{aligned} \epsilon ^2|e_u|_{1;\Omega }^2 \lesssim \epsilon ^2|u-u^c|_{1;\Omega }^2 +\epsilon ^2|u_h-u^c|_{1;\Omega }^2, \end{aligned}$$

where the second term on the right hand side is already bounded by a term in our error indicator, as shown in the above lemma. Up to this point, only the first term on the right hand side, namely \(\epsilon ^2|u-u^c|_{1;\Omega }^2 \), does not have an upper estimate. The following lemma shows that this term is bounded by the “error” obtained by plugging \(u_h\) into Eq. (2.10), which is satisfied by the exact solution u, plus a jump term that already appears in our error indicator.

Lemma 3.4

For \(u^c\) obtained in Lemma 3.3, we have

$$\begin{aligned} \epsilon |u-u^c|_{1;\Omega }\lesssim & {} \sup _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{(f,v)-B(u_{h},v)\} +\left( \sum _{e\in \mathcal {F}_{p}}h_{e}^{-1}\Vert J_{1}\Vert _{0;e}^{2} \right) ^{\frac{1}{2}}, \end{aligned}$$
(3.7)

where we remark that the gradient operator \(\nabla \) in \(B(u_{h},v)\) means the discrete/broken gradient and \(B(u_{h},v)\) is defined element by element.

Proof

By the definitions of B, \(B_c\), \(B_d\), and the fact that \(B_c(v,v)=0\) for \(v\in H_{0}^{1}(\Omega )\), we know that

$$\begin{aligned} \epsilon |u-u^c|_{1;\Omega }^{2}= & {} B_d(u-u^c,u-u^c)=B(u-u^c,u-u^c) \nonumber \\= & {} B(u-u_h,u-u^c)+B(u_h-u^c,u-u^c)\nonumber \\= & {} B(u-u_h,u-u^c)+B_{c}(u_h-u^c,u-u^c)+B_{d}(u_h-u^c,u-u^c)\nonumber \\\leqslant & {} |u-u^c|_{1;\Omega }B\left( u-u_h,\frac{u-u^c}{|u-u^c|_{1;\Omega }}\right) \nonumber \\&+\,|u-u^c|_{1;\Omega }B_{c}\left( u_h-u^c,\frac{u-u^c}{|u-u^c|_{1;\Omega }}\right) \nonumber \\&+\,\epsilon |u_h-u^c|_{1;\Omega }|u-u^c|_{1;\Omega }, \end{aligned}$$

and therefore we have

$$\begin{aligned} \epsilon |u-u^c|_{1;\Omega }\lesssim & {} \sup _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{B(u-u_h,v)+B_c(u_h-u^c,v)\} +\epsilon |u_h-u^c|_{1;\Omega }\nonumber \\\lesssim & {} \sup _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{(f,v)-B(u_h,v)+B_c(u_h-u^c,v)\}+\epsilon |u_h-u^c|_{1;\Omega }, \nonumber \\ \end{aligned}$$
(3.8)

where we have used Eq. (2.10). Since \(\mathbf {b}\) is divergence free, using the Cauchy–Schwarz inequality we obtain, for any \(v\in H_0^1(\Omega )\) with \(|v|_{1;\Omega }=1\),

$$\begin{aligned} B_c(u_h-u^c,v)= & {} (\mathbf {b}\cdot \nabla (u_h-u^c),v)\nonumber \\\lesssim & {} \Vert \mathbf {b}\Vert _{\infty } |u_h-u^c|_{1;\Omega }\Vert v\Vert _{0;\Omega }\nonumber \\\lesssim & {} |u_h-u^c|_{1;\Omega }, \end{aligned}$$
(3.9)

where the last inequality holds because \(\Vert v\Vert _{0;\Omega }\) is bounded, by the Poincaré inequality and the normalization \(|v|_{1;\Omega }=1\). Using (3.8)–(3.9) and Lemma 3.3, our result follows. \(\square \)

Since the second term in the above lemma is already a term in our error indicator, we only need to find an upper estimate for \(\displaystyle \sup \nolimits _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{(f,v)-B(u_{h},v)\}\). Before we proceed, let us state the following lemma, which collects some standard error bounds for polynomial approximation.

Lemma 3.5

Let \(v\in H_0^1(\Omega )\). Then there exists \(v_{h}\in H_0^1(\Omega )\bigcap U_{h}\) such that

$$\begin{aligned}&\Vert v-v_{h}\Vert _{0;\tau } \lesssim h_{\tau }\Vert v\Vert _{1;\tau }, \end{aligned}$$
(3.10)
$$\begin{aligned}&|v-v_{h}|_{1;\tau } \lesssim \Vert v\Vert _{1;\tau }, \end{aligned}$$
(3.11)

for all \(\tau \in \mathcal {T}\).

Recall that our goal is to find an upper estimate for \(\sup _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{(f,v)-B(u_{h},v)\}\). A usual technique for this kind of estimate is to split the test function v into two parts, namely \(v_{h}\) and \(v-v_{h}\), where \(v_{h}\) is conforming. This technique is useful because, when \(v_{h}\) is conforming, we can use the numerical scheme to handle the term \((f,v_{h})\), while the difference \(v-v_{h}\) has nice approximation properties, as shown in the last lemma. Let us first deal with the conforming part \(v_h\). We can prove the following lemma.

Lemma 3.6

Let \(v\in H_0^1(\Omega )\) with \(|v|_{1;\Omega }=1\). Choose \(v_{h}\in H_0^1(\Omega )\bigcap U_{h}\) as in Lemma 3.5. Then,

$$\begin{aligned} (f,v_{h})-B(u_{h},v_{h})\lesssim & {} \sum _{\tau \in \mathcal {T}} \left( \Vert \mathbf {R}_{2}\Vert _{0;\tau }+\Vert R_{3}\Vert _{0;\tau }+\frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }\right) \nonumber \\&+\left( \sum _{e\in \mathcal {F}_{p}}h_{e}^{-1}\Vert J_{1}\Vert _{0;e}^{2}\right) ^{1/2}. \end{aligned}$$
(3.12)

Proof

By using (2.16) and the fact that \(v_h\in H_0^1(\Omega )\), we know that

$$\begin{aligned} (f,v_{h})= & {} B_{h}(\mathbf {p}_{h},v_h) + \frac{1}{2}(\mathbf {b}_h \cdot \mathbf {s}_h,v_h) = (\mathbf {p}_{h},\nabla v_h)+ \frac{1}{2}(\mathbf {b}_h \cdot \mathbf {s}_h,v_h). \end{aligned}$$

Hence, by the definitions of B and \(\mathbf {R}_2\), we have

$$\begin{aligned}&(f,v_{h}) -B(u_{h},v_{h})\nonumber \\&\qquad = (\mathbf {p}_{h}-\epsilon \nabla u_h,\nabla v_h) + \frac{1}{2}(\mathbf {b}_h \cdot \mathbf {s}_h,v_h)-(\nabla \cdot (\mathbf {b}u_h),v_h) \nonumber \\&\qquad = (\mathbf {R}_{2},\nabla v_h)-\frac{1}{2}(\mathbf {b}_hu_h,\nabla v_h ) \nonumber \\&\quad \qquad +\, \frac{1}{2}(\mathbf {b}_h \cdot (\mathbf {s}_h-\nabla u_h),v_h)+\frac{1}{2}(\mathbf {b}_h \cdot \nabla u_h,v_h) -(\nabla \cdot (\mathbf {b}u_h),v_h)\nonumber \\&\qquad = (\mathbf {R}_{2},\nabla v_h)+\frac{1}{2}(R_3,v_h)\nonumber \\&\quad \qquad -\, \frac{1}{2}(\mathbf {b}_hu_h,\nabla v_h ) +\frac{1}{2}(\mathbf {b}_h \cdot \nabla u_h,v_h)-(\nabla \cdot (\mathbf {b}u_h),v_h) \end{aligned}$$
(3.13)

Using the fact that \(\nabla \cdot \mathbf {b}=0\) and integration by parts, we have

$$\begin{aligned}&-\frac{1}{2}(\mathbf {b}_hu_h,\nabla v_h ) +\frac{1}{2}(\mathbf {b}_h \cdot \nabla u_h,v_h)-(\nabla \cdot (\mathbf {b}u_h),v_h)\nonumber \\&\quad =-\frac{1}{2}(\mathbf {b}_hu_h,\nabla v_h )-\frac{1}{2}(\nabla \cdot (\mathbf {b} u_h), v_h )+\frac{1}{2}(\mathbf {b}_h \cdot \nabla u_h,v_h) -\frac{1}{2}( \mathbf {b}\cdot \nabla u_h,v_h)\nonumber \\&\quad = \frac{1}{2} (\mathbf {e}_b u_h, \nabla v_h)-\frac{1}{2} (\mathbf {e}_b\cdot \nabla u_h, v_h) - \frac{1}{2} \sum _{e\in \mathcal {F}_{p}}\int _e v_h \mathbf {b}\cdot \mathbf {n}_e[u_h]. \end{aligned}$$
(3.14)

Combining (3.13) and (3.14), we obtain

$$\begin{aligned}&(f,v_{h}) -B(u_{h},v_{h})\nonumber \\&\quad = (\mathbf {R}_{2},\nabla v_h)+\frac{1}{2}(R_3,v_h) - \frac{1}{2} \sum _{e\in \mathcal {F}_{p}}\int _e v_h \mathbf {b}\cdot \mathbf {n}_e[u_h] \nonumber \\&\qquad +\frac{1}{2} (\mathbf {e}_b u_h, \nabla v_h)-\frac{1}{2} (\mathbf {e}_b\cdot \nabla u_h, v_h). \end{aligned}$$
(3.15)

Now we deal with the term \(\frac{1}{2}(\mathbf {e}_bu_h,\nabla v_h )\). By denoting the cell average value of \(u_h\) on \(\tau \) as \(\bar{u}_{\tau }\), we have

$$\begin{aligned} (\mathbf {e}_bu_h,\nabla v_h ) =\sum _{\tau \in \mathcal {T} }(\mathbf {e}_b(u_h-\bar{u}_{\tau }),\nabla v_h)_{0;\tau }+\sum _{\tau \in \mathcal {T} }\bar{u}_{\tau }(\mathbf {e}_b,\nabla v_h)_{0;\tau }. \end{aligned}$$

By using the fact that \(\nabla v_h|_{\tau } \in [P^{k-1}(\tau )]^2\) and Eq. (2.12), we know that

$$\begin{aligned} (\mathbf {e}_b,\nabla v_h)_{0;\tau }=0, \end{aligned}$$

and hence

$$\begin{aligned} (\mathbf {e}_bu_h,\nabla v_h )= & {} \sum _{\tau \in \mathcal {T} }(\mathbf {e}_b(u_h-\bar{u}_{\tau }),\nabla v_h)_{0;\tau } \\\lesssim & {} \sum _{\tau \in \mathcal {T} } \Vert \mathbf {e}_b( u_h-\bar{u}_{\tau })\Vert _{0;\tau }\Vert \nabla v_h\Vert _{0;\tau }\\\lesssim & {} \sum _{\tau \in \mathcal {T} } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }h_{\tau }\Vert \nabla u_h\Vert _{0;\tau }|v_h|_{1;\tau }, \end{aligned}$$

where we have used the Poincaré inequality. By using (2.28) and the fact that \(f\in L^2(\Omega )\) is the given source term, we have

$$\begin{aligned} (\mathbf {e}_bu_h,\nabla v_h )\lesssim & {} \sum _{\tau \in \mathcal {T} } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }\frac{h_{\tau }}{\epsilon } \Vert f\Vert _{0;\Omega }|v_h|_{1;\tau } \lesssim \sum _{\tau \in \mathcal {T} }\frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }|v_h|_{1;\tau }. \end{aligned}$$
(3.16)

Similarly, for the term \(-\frac{1}{2} (\mathbf {e}_b\cdot \nabla u_h, v_h)\), we have

$$\begin{aligned} -\frac{1}{2}(\mathbf {e}_b \nabla u_h, v_h )= & {} -\frac{1}{2}(\mathbf {e}_b v_h, \nabla u_h )\nonumber \\= & {} -\frac{1}{2}\sum _{\tau \in \mathcal {T} }(\mathbf {e}_b(v_h-\bar{v}_{\tau }),\nabla u_h)_{0;\tau }-\frac{1}{2}\sum _{\tau \in \mathcal {T} }\bar{v}_{\tau }(\mathbf {e}_b,\nabla u_h)_{0;\tau }\nonumber \\\lesssim & {} \frac{1}{2}\sum _{\tau \in \mathcal {T} } \Vert \mathbf {e}_b( v_h-\bar{v}_{\tau })\Vert _{0;\tau }\Vert \nabla u_h\Vert _{0;\tau }\nonumber \\\lesssim & {} \sum _{\tau \in \mathcal {T} } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }h_{\tau }|v_h|_{1;\tau }\Vert \nabla u_h\Vert _{0;\tau }\nonumber \\\lesssim & {} \sum _{\tau \in \mathcal {T} }\frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }|v_h|_{1;\tau }. \end{aligned}$$
(3.17)

Substituting (3.16) and (3.17) into (3.15), we obtain

$$\begin{aligned}&(f,v_{h}) -B(u_{h},v_{h})\nonumber \\&\quad \lesssim (\mathbf {R}_{2},\nabla v_h)+\frac{1}{2}(R_3,v_h)\nonumber \\&\qquad +\sum _{\tau \in \mathcal {T} }\frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }|v_h|_{1;\tau } - \frac{1}{2} \sum _{e\in \mathcal {F}_{p}}\int _e v_h \mathbf {b}\cdot \mathbf {n}_e[u_h] \nonumber \\&\quad \lesssim \sum _{\tau \in \mathcal {T}}\left( \Vert \mathbf {R}_{2}\Vert _{0;\tau }|v_h|_{1;\tau }+\Vert R_{3}\Vert _{0;\tau }\Vert v_h\Vert _{0;\tau }+\frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }|v_h|_{1;\tau }\right) \nonumber \\&\qquad +\sum _{e\in \mathcal {F}_{p}} \Vert \mathbf {b}\cdot \mathbf {n}_e\Vert _{\infty ;e}\Vert v_h\Vert _{0;e} \Vert J_1\Vert _{0;e}, \end{aligned}$$
(3.18)

where we have used the Cauchy–Schwarz inequality. Since \(\mathbf {b}\) is the given vector field, we can assume that \(\Vert \mathbf {b}\cdot \mathbf {n}_e\Vert _{\infty ;e}\) is bounded by a constant. By using Lemma 3.5, we know that

$$\begin{aligned} \Vert v_h\Vert _{0;\tau }\lesssim & {} \Vert v_h-v\Vert _{0;\tau }+\Vert v\Vert _{0;\tau } \lesssim h_{\tau }\Vert v\Vert _{1;\tau }+\Vert v\Vert _{0;\tau }\lesssim \Vert v\Vert _{1;\tau },\\ |v_{h}|_{1;\tau }\lesssim & {} |v_h-v|_{1;\tau } +|v|_{1;\tau }\lesssim \Vert v\Vert _{1;\tau }. \end{aligned}$$

Since \(|v|_{1;\Omega }=1\), the Poincaré inequality implies that \(\Vert v\Vert _{1;\Omega }\) is bounded, and hence \(\Vert v_h\Vert _{0;\tau }\) and \(|v_{h}|_{1;\tau }\) have upper bounds too. Hence, we obtain

$$\begin{aligned} (f,v_{h}) -B(u_{h},v_{h})\lesssim & {} \sum _{\tau \in \mathcal {T}}\left( \Vert \mathbf {R}_{2}\Vert _{0;\tau }+\Vert R_{3}\Vert _{0;\tau }+\frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }\right) \nonumber \\&+\sum _{e\in \mathcal {F}_{p}} \Vert v_h\Vert _{0;e} \Vert J_1\Vert _{0;e}. \end{aligned}$$
(3.19)

For the last term in (3.19) on edge \(e=\partial \tau _1 \bigcap \partial \tau _2\), we use the standard trace inequality

$$\begin{aligned} \Vert v_h\Vert _{0;e}^2 \lesssim h_e^{-1} \Vert v_h\Vert _{1;\tau _e}^2, \end{aligned}$$

where \(\tau _e=\tau _1 \bigcup \tau _2\). Hence, we have

$$\begin{aligned} \sum _{e\in \mathcal {F}_{p}} \Vert v_h\Vert _{0;e} \Vert J_1\Vert _{0;e} \leqslant \left( \sum _{e\in \mathcal {F}_{p}} h_e \Vert v_h\Vert _{0;e}^2 \right) ^{\frac{1}{2}} \left( \sum _{e\in \mathcal {F}_{p}} h_e^{-1} \Vert J_1\Vert _{0;e}^2\right) ^{\frac{1}{2}} \lesssim \left( \sum _{e\in \mathcal {F}_{p}} h_e^{-1} \Vert J_1\Vert _{0;e}^2\right) ^{\frac{1}{2}}. \qquad \end{aligned}$$
(3.20)

Substituting (3.20) into (3.19), we obtain

$$\begin{aligned} (f,v_{h}) -B(u_{h},v_{h})\lesssim & {} \sum _{\tau \in \mathcal {T}}\left( \Vert \mathbf {R}_{2}\Vert _{0;\tau }+ \Vert R_{3}\Vert _{0;\tau }+ \frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau } \right) +\left( \sum _{e\in \mathcal {F}_{p}}h_{e}^{-1}\Vert J_{1}\Vert _{0;e}^{2}\right) ^{1/2}. \end{aligned}$$

\(\square \)

As mentioned before, to find an upper estimate for \(\sup _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{(f,v)-B(u_{h},v)\}\), we still need to consider the non-conforming part, as in the following lemma.

Lemma 3.7

Let \(v\in H_0^1(\Omega )\) with \(|v|_{1;\Omega }=1\). Choose \(v_{h}\in H_0^1(\Omega )\bigcap U_{h}\) as in Lemma 3.5 and denote \(z=v-v_{h}\). Then,

$$\begin{aligned} (f,z)-B(u_{h},z)\lesssim & {} \sum _{\tau \in \mathcal {T}}\left( h_{\tau }\Vert R_1\Vert _{0;\tau }+\Vert \mathbf {R}_2\Vert _{0;\tau }+h_{\tau }\Vert R_{3}\Vert _{0;\tau }+\frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }\right) \nonumber \\&+\left( \sum _{e\in \mathcal {F}^{0}}h_e\Vert J_2\Vert _{0;e}^{2}\right) ^{1/2}. \end{aligned}$$
(3.21)

Proof

By definitions, we have

$$\begin{aligned} (f,z)-B(u_h,z)= & {} (f,z)-(\epsilon \nabla u_{h},\nabla z)-(\mathbf {b}\cdot \nabla u_{h},z) \nonumber \\= & {} (R_{1},z)-(\nabla \cdot \mathbf {p}_h,z)+\frac{1}{2}(\mathbf {b}_h\cdot \mathbf {s}_h,z) \nonumber \\&-\,(\epsilon \nabla u_{h},\nabla z)-(\mathbf {b}\cdot \nabla u_{h},z). \end{aligned}$$
(3.22)

Using integration by parts, we have

$$\begin{aligned} -(\nabla \cdot \mathbf {p}_h,z)= & {} (\mathbf {p}_h,\nabla z)- \sum _{e\in \mathcal {F}_u^0}\int _e z\left[ \mathbf {p}_h\cdot \mathbf {n}\right] . \end{aligned}$$
(3.23)

Substituting (3.23) into (3.22) and using the definitions of \(\mathbf {R}_2\) and \(R_3\), we can obtain

$$\begin{aligned}&(f,z)-B(u_h,z)\nonumber \\&\quad = (R_{1},z)+(\mathbf {p}_h,\nabla z)-(\epsilon \nabla u_{h},\nabla z) +\frac{1}{2}(\mathbf {b}_h u_{h},\nabla z) -\frac{1}{2}(\mathbf {b}_h u_{h},\nabla z) \nonumber \\&\qquad +\,\frac{1}{2}(\mathbf {b}_h\cdot \mathbf {s}_h,z) -\frac{1}{2}(\mathbf {b}_h\cdot \nabla u_{h},z) +\frac{1}{2}(\mathbf {b}_h\cdot \nabla u_{h},z)\nonumber \\&\qquad -\,(\mathbf {b}\cdot \nabla u_{h},z) -\sum _{e\in \mathcal {F}_u^0}\int _e z\left[ \mathbf {p}_h\cdot \mathbf {n}\right] \nonumber \\&\quad = (R_{1},z)+(\mathbf {R}_2,\nabla z) +\frac{1}{2}(R_3,z)-\sum _{e\in \mathcal {F}_u^0}\int _e z\left[ \mathbf {p}_h\cdot \mathbf {n}\right] \nonumber \\&\qquad -\frac{1}{2}(\mathbf {b}_h u_{h},\nabla z) +\frac{1}{2}(\mathbf {b}_h\cdot \nabla u_{h},z)-(\mathbf {b}\cdot \nabla u_{h},z). \end{aligned}$$

Using integration by parts and the fact that \(\mathbf {b}_h\) is divergence free, we have

$$\begin{aligned} -\frac{1}{2}(\mathbf {b}_h u_{h},\nabla z)= & {} \frac{1}{2}(\mathbf {b}_h \cdot \nabla u_{h}, z)-\frac{1}{2}\sum _{e\in \mathcal {F}_p}\int _e z\left[ \mathbf {b}_hu_h\cdot \mathbf {n} \right] , \end{aligned}$$

and hence

$$\begin{aligned}&(f,z)-B(u_h,z)\nonumber \\&\quad = (R_{1},z)+(\mathbf {R}_2,\nabla z)+\frac{1}{2}(R_3,z)-(\mathbf {e}_b \cdot \nabla u_{h}, z) -\sum _{e\in \mathcal {F}^0}\int _e z \left[ (\mathbf {p}_h+\frac{1}{2}\mathbf {b}_h u_h)\cdot \mathbf {n} \right] \nonumber \\&\quad \lesssim \sum _{\tau \in \mathcal {T}} \left( \Vert R_1\Vert _{0;\tau }+\Vert R_3\Vert _{0;\tau }+\Vert \mathbf {e}_b\Vert _{\infty ;\tau }\Vert \nabla u_{h} \Vert _{0;\tau } \right) \Vert z\Vert _{0;\tau } + \sum _{\tau \in \mathcal {T}} \Vert \mathbf {R}_2\Vert _{0;\tau }|z|_{1;\tau }\nonumber \\&\qquad +\sum _{e\in \mathcal {F}^0} \Vert z\Vert _{0;e}\Vert J_2\Vert _{0;e}. \end{aligned}$$

From Lemma 3.5, we already know that \(\Vert z\Vert _{0;\tau }\lesssim h_{\tau } \Vert v\Vert _{1;\tau }\) and \(|z|_{1;\tau }\lesssim \Vert v\Vert _{1;\tau }\). Using the fact that \(|v|_{1;\Omega }=1\) and (2.28), we obtain

$$\begin{aligned} (f,z)-B(u_h,z)\lesssim & {} \sum _{\tau \in \mathcal {T}} \left( h_{\tau }\Vert R_1\Vert _{0;\tau }+\Vert \mathbf {R}_2\Vert _{0;\tau }+h_{\tau }\Vert R_3\Vert _{0;\tau }+\frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau } \right) \nonumber \\&+\sum _{e\in \mathcal {F}^0} \Vert z\Vert _{0;e}\Vert J_2\Vert _{0;e}. \end{aligned}$$
(3.24)

For \( \Vert z\Vert _{0;e}\) on edge \(e=\partial \tau _1 \bigcap \partial \tau _2\), we employ the following trace inequality [39]

$$\begin{aligned} \Vert z\Vert _{0;e}^2\lesssim & {} \Vert z\Vert _{0;\tau _e} |z|_{1;\tau _e}+h_e^{-1} \Vert z\Vert _{0;\tau _e}^2 \lesssim h_e \Vert v\Vert _{1;\tau _e}^2, \end{aligned}$$

where \(\tau _e=\tau _1 \bigcup \tau _2\). Hence, we obtain

$$\begin{aligned} \sum _{e\in \mathcal {F}^0} \Vert z\Vert _{0;e}\Vert J_2\Vert _{0;e}\leqslant & {} \left( \sum _{e\in \mathcal {F}^0} h_e^{-1}\Vert z\Vert _{0;e}^2\right) ^{\frac{1}{2}} \left( \sum _{e\in \mathcal {F}^0} h_e\Vert J_2\Vert _{0;e}^2\right) ^{\frac{1}{2}} \lesssim \left( \sum _{e\in \mathcal {F}^0} h_e\Vert J_2\Vert _{0;e}^2\right) ^{\frac{1}{2}}. \end{aligned}$$

Our lemma follows by substituting the above equation into (3.24). \(\square \)

Combining Lemmas 3.2–3.7, we can prove Theorem 3.1.

Lemma 3.8

Theorem 3.1 holds.

Proof

Combining Lemmas 3.6 and 3.7, we know that

$$\begin{aligned}&\sup _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{(f,v)-B(u_{h},v)\} \nonumber \\&\quad \lesssim \sup _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{(f,v_h)-B(u_{h},v_h)\}+ \sup _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{(f,z)-B(u_{h},z)\} \nonumber \\&\quad \lesssim \sum _{\tau \in \mathcal {T}}\left( h_{\tau }\Vert R_1\Vert _{0;\tau }+\Vert \mathbf {R}_2\Vert _{0;\tau }+\Vert R_{3}\Vert _{0;\tau }+\frac{h_{\tau }}{\epsilon } \Vert \mathbf {e}_b\Vert _{\infty ;\tau }\right) \nonumber \\&\qquad +\left( \sum _{e\in \mathcal {F}_{p}}h_e^{-1}\Vert J_1\Vert _{0;e}^{2}\right) ^{1/2}+ \left( \sum _{e\in \mathcal {F}^{0}}h_e\Vert J_2\Vert _{0;e}^{2}\right) ^{1/2}. \end{aligned}$$

Hence, by Lemma 3.4, we get

$$\begin{aligned} \epsilon ^2|u-u^c|_{1;\Omega }^2\lesssim & {} \left( \sup _{v\in H_{0}^{1}(\Omega ),\,|v|_{1;\Omega }=1}\{(f,v)-B(u_{h},v)\}\right) ^2 +\sum _{e\in \mathcal {F}_{p}}h_{e}^{-1}\Vert J_{1}\Vert _{0;e}^{2} \nonumber \\\lesssim & {} \sum _{\tau \in \mathcal {T}}\left( h_{\tau }^2\Vert R_1\Vert _{0;\tau }^2+\Vert \mathbf {R}_2\Vert _{0;\tau }^2+\Vert R_{3}\Vert _{0;\tau }^2+\frac{h_{\tau }^2}{\epsilon ^2} \Vert \mathbf {e}_b\Vert _{\infty ;\tau }^2\right) \nonumber \\&+\sum _{e\in \mathcal {F}_{p}}h_e^{-1}\Vert J_1\Vert _{0;e}^{2}+\sum _{e\in \mathcal {F}^{0}}h_e\Vert J_2\Vert _{0;e}^{2}\nonumber \\\lesssim & {} \eta ^2. \end{aligned}$$
(3.25)

Combining Lemma 3.3 and the above equation, we obtain

$$\begin{aligned} \epsilon ^2|e_u|_{1;\Omega }^2 \lesssim \epsilon ^2|u-u^c|_{1;\Omega }^2 +\epsilon ^2|u_h-u^c|_{1;\Omega }^2\lesssim & {} \eta ^2. \end{aligned}$$
(3.26)

From Lemma 3.2, we know that

$$\begin{aligned}&\Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0;\Omega }^2 \lesssim \epsilon ^2 |e_u|_{1;\Omega }^2 + \sum _{\tau \in \mathcal {T}}\Vert \mathbf {R}_2\Vert _{0;\tau }^2\lesssim \eta ^2, \end{aligned}$$
(3.27)
$$\begin{aligned}&\Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\Omega }^2 \lesssim \sum _{\tau \in \mathcal {T}} \Vert R_3\Vert _{0;\tau }^2\lesssim \eta ^2. \end{aligned}$$
(3.28)

Combining (3.26)–(3.28), our lemma follows. \(\square \)

Remark 1

Notice that in our error indicator \(\eta ^2\) there is a term with the coefficient \(\frac{h_{\tau }^2}{\epsilon ^2}\). In our computations, we simply assume that the ratio \(\frac{h_{\tau }}{\epsilon }\) is bounded above by a constant independent of the mesh size, which is a reasonable assumption.

Remark 2

We can easily see that the last term in \(\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG}}^{2}\) is in fact the \(J_1\) term in \(\eta ^2\). We include it in the DG norm in order to prove the efficiency of \(\eta ^2\). It is not hard to see that the \(J_1\) term in \(\eta ^2\) comes from the convection part of the original convection–diffusion equation when we try to find an upper bound for the diffusion term \(\epsilon ^2 |e_u|_{1;\Omega }^2\). Hence, the first term \(|e_u|_{1;\Omega }^2\) in the DG norm carries a coefficient \(\epsilon ^2\), while the last term \(h_e^{-1}\Vert [e_u]\Vert _0^2\) does not.

3.2 Efficiency of the Error Indicator

In this section, we prove the efficiency of the error indicator derived in the last section. We use the standard bubble function technique, which was introduced by Verfürth [41] in 1994. Let \(\tau \in \mathcal {T}\) be a triangle and let e be an interior edge shared by two triangles \(\tau _1\) and \(\tau _2\). We denote by \(\beta _{\tau }\) and \(\beta _e\) the standard polynomial bubble functions on \(\tau \) and e, respectively, which are uniquely defined by the following properties:

$$\begin{aligned} supp \, \beta _{\tau } \subset \tau ,\,\,\, \beta _{\tau } \in P^3(\tau ),\,\,\, \beta _{\tau }\geqslant 0,\,\,\, \max _{x\in \tau } \beta _{\tau }(x)=1, \end{aligned}$$

and

$$\begin{aligned} supp \, \beta _{e} \subset \tau _1 \cup \tau _2,\,\,\, \beta _{e}|_{\tau _i} \in P^2(\tau _i),\,i=1,2,\,\,\, \beta _{e}\geqslant 0,\,\,\, \max _{x\in \tau _1 \cup \tau _2} \beta _{e}(x)=1. \end{aligned}$$
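In barycentric coordinates \(\lambda _1,\lambda _2,\lambda _3\) of a triangle, these properties are satisfied by the usual choices \(\beta _{\tau }=27\lambda _1\lambda _2\lambda _3\) and, on each of the two triangles sharing e, \(\beta _{e}=4\lambda _a\lambda _b\), where \(\lambda _a,\lambda _b\) are the barycentric coordinates associated with the two endpoints of e. A small sketch for evaluating them (with illustrative function names) is given below.

```python
import numpy as np

def barycentric(vertices, x):
    """Barycentric coordinates of the point x in the triangle 'vertices' ((3, 2) array)."""
    vertices = np.asarray(vertices, dtype=float)
    T = np.column_stack([vertices[1] - vertices[0], vertices[2] - vertices[0]])
    lam = np.linalg.solve(T, np.asarray(x, dtype=float) - vertices[0])
    return np.array([1.0 - lam.sum(), lam[0], lam[1]])

def element_bubble(vertices, x):
    """Cubic element bubble beta_tau: vanishes on the boundary of tau, equals 1 at the centroid."""
    l = barycentric(vertices, x)
    return 27.0 * l[0] * l[1] * l[2]

def edge_bubble(vertices, x, local_edge):
    """Quadratic edge bubble beta_e restricted to one of the two triangles sharing e:
    vanishes on the other two edges, equals 1 at the midpoint of e.
    local_edge = (i, j) are the local vertex indices of the endpoints of e."""
    l = barycentric(vertices, x)
    i, j = local_edge
    return 4.0 * l[i] * l[j]

tri = np.array([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
print(element_bubble(tri, tri.mean(axis=0)))    # 1.0 at the centroid
print(edge_bubble(tri, [0.5, 0.0], (0, 1)))     # 1.0 at the midpoint of the edge (0, 1)
```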

We first state the following lemmas by Houston et al. [30]. To save space, we combine the scalar and vector-valued cases.

Lemma 3.9

(Lemmas 5.1 and 5.2 in [30]) Let v be a scalar/vector-valued polynomial function on \(\tau \). Then

$$\begin{aligned}&\left\| \beta _{\tau }v\right\| _{0;\tau } \lesssim \left\| v\right\| _{0;\tau } \end{aligned}$$
(3.29)
$$\begin{aligned}&\left\| v\right\| _{0;\tau } \lesssim \left\| \beta _{\tau }^{1/2}v\right\| _{0;\tau } \end{aligned}$$
(3.30)
$$\begin{aligned}&\left\| \nabla (\beta _{\tau }v)\right\| _{0;\tau } \lesssim h_{\tau }^{-1}\left\| v\right\| _{0;\tau } \end{aligned}$$
(3.31)

Moreover, let e be an edge shared by two triangles, say \(\tau _{1}\) and \(\tau _{2}\). Let q be a scalar/vector-valued polynomial function on e. Then

$$\begin{aligned} \left\| q\right\| _{0;e}\lesssim & {} \left\| \beta _{e}^{1/2}q\right\| _{0;e} \end{aligned}$$
(3.32)

Finally, there exists an extension \(Q_{b}\in H_{0}^{1}(\tau _{1}\cup \tau _{2})\) (in the vector-valued case, \(\mathbf {Q}_{b}\in [H_{0}^{1}(\tau _{1}\cup \tau _{2})]^{2}\)) of \(\beta _{e}q\) such that \(Q_{b}|_{e}=\beta _{e}q\) and

$$\begin{aligned}&\left\| Q_{b}\right\| _{0;\tau _{i}} \lesssim h_{e}^{1/2}\left\| q\right\| _{0;e} \end{aligned}$$
(3.33)
$$\begin{aligned}&\left\| \nabla Q_{b}\right\| _{0;\tau _{i}} \lesssim h_{e}^{-1/2}\left\| q\right\| _{0;e} \end{aligned}$$
(3.34)

for \(i=1,2\).

Remark For the vector-valued polynomial case, \(\nabla (\beta _{\tau }v)\) in (3.31) and \(\nabla Q_{b}\) in (3.34) mean \(\nabla \cdot (\beta _{\tau }\mathbf {v})\) and \(\nabla \cdot \mathbf {Q}_{b}\), respectively.

Throughout our discussion, we denote the space of all piecewise polynomials of a fixed order \(k_0\) (\(k_0\ge k\)) on \(\mathcal {T}\) by \(P(\mathcal {T})\). For any \(f_{h}\in P(\mathcal {T})\), we denote \(e_f=f-f_h\). Noting that the error indicator \(\eta ^2\) is defined as the sum of the element-wise error indicators \(\eta _{\tau }^2\), we now define the element-wise norm of the numerical error as

$$\begin{aligned} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG};\tau }^{2}:= & {} \epsilon ^2 |e_u|_{1;\tau }^{2} +\Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0;\tau }^{2} +\Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\tau }^{2} \nonumber \\&+\sum _{e\in \mathcal {F}_p\bigcap \tau } h_{e}^{-1} \Vert [e_u] \Vert _{0;e}^{2}. \end{aligned}$$

To prove the efficiency, we consider all terms involved in \(\eta _{\tau }^2\) one by one. It turns out that each term can be bounded by the right-hand side of Eq. (3.42). We will first deal with the residual terms.

Lemma 3.10

For the residual terms \(R_1\), \(\mathbf {R}_2\) and \(R_3\), we have

$$\begin{aligned} h_{\tau }^{2}\Vert R_{1}\Vert _{0;\tau }^{2}\lesssim & {} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG};\tau }^{2} +\frac{h_{\tau }^2}{\epsilon ^2}\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG};\tau }^{2}\nonumber \\&+\,h_{\tau }^2 \Vert e_f\Vert _{0;\tau }^2 +\frac{h_{\tau }^2}{\epsilon ^2}\Vert \mathbf {e}_b\Vert _{\infty ;\tau }^2, \\ \Vert \mathbf {R}_{2}\Vert _{0;\tau }^{2}\lesssim & {} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau }^{2}, \\ \Vert R_{3}\Vert _{0;\tau }^{2}\lesssim & {} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau }^{2}, \end{aligned}$$

Proof

We define

$$\begin{aligned} v_1:= & {} f_h+\nabla \cdot \mathbf {p}_h-\frac{1}{2}\mathbf {b}_h\cdot \mathbf {s}_h,\\ \mathbf {v}_2:= & {} \mathbf {R}_2=\mathbf {p}_h-\epsilon \nabla u_h+\frac{1}{2}\mathbf {b}_h u_h, \\ v_3:= & {} R_3=\mathbf {b}_h \cdot (\mathbf {s}_h-\nabla u_h), \end{aligned}$$

which are polynomials on each \(\tau \in \mathcal {T}\), and let \(v_{b1}=\beta _{\tau } v_1\), \(\mathbf {v}_{b2}=\beta _{\tau } \mathbf {v}_2\) and \(v_{b3}=\beta _{\tau } v_3\). Then we have \(R_1=v_1+f-f_h\), and hence

$$\begin{aligned} \Vert R_1\Vert _{0;\tau }^2\lesssim & {} \Vert v_1\Vert _{0;\tau }^2+\Vert e_f\Vert _{0;\tau }^2. \end{aligned}$$
(3.35)

Next, we find upper bounds of \(\Vert v_1\Vert _{0;\tau }^2\), \(\Vert \mathbf {v}_2\Vert _{0;\tau }^2\) and \(\Vert v_3\Vert _{0;\tau }^2\). By using the bubble function technique, we know that

$$\begin{aligned} \Vert v_1\Vert _{0;\tau }^2\lesssim & {} \Vert \beta _{\tau }^{1/2}v_1\Vert _{0;\tau }^2=( v_1 ,v_{b1})_{0;\tau } =\left( f_h+\nabla \cdot \mathbf {p}_h-\frac{1}{2}\mathbf {b}_h\cdot \mathbf {s}_h,v_{b1}\right) _{0;\tau },\qquad \end{aligned}$$
(3.36)
$$\begin{aligned} \Vert \mathbf {v}_2\Vert _{0;\tau }^2\lesssim & {} \Vert \beta _{\tau }^{1/2}\mathbf {v}_2\Vert _{0;\tau }^2 =(\mathbf {v}_2,\mathbf {v}_{b2})_{0;\tau } =\left( \mathbf {p}_h-\epsilon \nabla u_h+\frac{1}{2}\mathbf {b}_h u_h,\mathbf {v}_{b2}\right) _{0;\tau },\end{aligned}$$
(3.37)
$$\begin{aligned} \Vert v_3\Vert _{0;\tau }^2\lesssim & {} \Vert \beta _{\tau }^{1/2}v_3\Vert _{0;\tau }^2 =(v_3,v_{b3})_{0;\tau } =(\mathbf {b}_h \cdot (\mathbf {s}_h-\nabla u_h),v_{b3})_{0;\tau }. \end{aligned}$$
(3.38)

Also, the variational form (2.6)–(2.8) gives

$$\begin{aligned} \left( f+\nabla \cdot \mathbf {p}-\frac{1}{2}\mathbf {b}\cdot \mathbf {s},v_{b1}\right) _{0;\tau }=0,\\ \left( \mathbf {p}-\epsilon \nabla u+ \frac{1}{2}\mathbf {b}u,\mathbf {v}_{b2}\right) _{0;\tau } =0,\\ \left( \mathbf {s}-\nabla u,\mathbf {b}v_{b3}\right) _{0;\tau } =0. \end{aligned}$$

Subtracting the above equations from (3.36)–(3.38), we get

$$\begin{aligned} \Vert v_1\Vert _{0;\tau }^2\lesssim & {} \left( -\,e_f-\nabla \cdot \mathbf {e}_p+\frac{1}{2}e_{bs}, v_{b1}\right) _{0;\tau }\nonumber \\= & {} -\,( e_f, v_{b1})_{0;\tau }+\left( \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu},\nabla v_{b1}\right) _{0;\tau }+\frac{1}{2}(e_{bs}-\nabla \cdot \mathbf {e}_{bu}, v_{b1})_{0;\tau }\\&+\,(\mathbf {b}\cdot \nabla e_u, v_{b1})_{0;\tau }+(\mathbf {e}_b\cdot \nabla u_h, v_{b1})_{0;\tau },\\\lesssim & {} \Vert e_f\Vert _{0;\tau }\Vert v_{b1}\Vert _{0;\tau } +\Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0;\tau }\Vert \nabla v_{b1}\Vert _{0;\tau } +\Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\tau }\Vert v_{b1}\Vert _{0;\tau }\nonumber \\&+\,|e_u|_{1;\tau }\Vert v_{b1}\Vert _{0;\tau } +\Vert \mathbf {e}_b\Vert _{\infty ;\tau }\Vert \nabla u_h\Vert _{0;\tau }\Vert v_{b1}\Vert _{0;\tau },\\ \Vert \mathbf {v}_2\Vert _{0;\tau }^2\lesssim & {} \left( -\,\mathbf {e}_p+\epsilon \nabla e_u-\frac{1}{2}\mathbf {e}_{bu},\mathbf {v}_{b2}\right) _{0;\tau } \lesssim \left( \Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0;\tau } +\epsilon |e_u|_{1;\tau } \right) \Vert \mathbf {v}_{b2}\Vert _{0;\tau },\\ \Vert v_3\Vert _{0;\tau }^2\lesssim & {} (-\,e_{bs}+\nabla \cdot \mathbf {e}_{bu}, v_{b3})_{0;\tau } \lesssim \Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\tau } \Vert v_{b3}\Vert _{0;\tau }. \end{aligned}$$

where we have used integration by parts and the Cauchy–Schwarz inequality. Applying (3.29) and (3.31) to bound \(\Vert v_{b1}\Vert _{0;\tau }\), \(\Vert \nabla v_{b1}\Vert _{0;\tau }\) and the analogous terms for \(\mathbf {v}_{b2}\) and \(v_{b3}\), and then cancelling one factor of \(\Vert v_1\Vert _{0;\tau }\), \(\Vert \mathbf {v}_2\Vert _{0;\tau }\) and \(\Vert v_3\Vert _{0;\tau }\) on both sides of the respective inequalities, we get

$$\begin{aligned} \Vert v_1\Vert _{0;\tau }\lesssim & {} \Vert e_f\Vert _{0;\tau } +h_{\tau }^{-1}\Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0;\tau } +\Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\tau }\nonumber \\&+\,|e_u|_{1;\tau }+\Vert \mathbf {e}_b\Vert _{\infty ;\tau }\Vert \nabla u_h\Vert _{0;\tau },\\ \Vert \mathbf {v}_2\Vert _{0;\tau }\lesssim & {} \Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0;\tau } +\epsilon |e_u|_{1;\tau },\\ \Vert v_3\Vert _{0;\tau }\lesssim & {} \Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\tau }, \end{aligned}$$

and hence

$$\begin{aligned} h_{\tau }^{2}\Vert v_1\Vert _{0;\tau }^2\lesssim & {} \left\| \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\right\| _{0;\tau }^2 +h_{\tau }^{2}\Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\tau }^2+\frac{h_{\tau }^2}{\epsilon ^2}\epsilon ^2|e_u|_{1;\tau }^2 \nonumber \\&+\,h_{\tau }^{2}\Vert e_f\Vert _{0;\tau }^2+h_{\tau }^{2}\Vert \mathbf {e}_b\Vert _{\infty ;\tau }^2\Vert \nabla u_h\Vert _{0;\tau }^2,\\\lesssim & {} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau }^{2}+ \frac{h_{\tau }^2}{\epsilon ^2}\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau }^{2} \nonumber \\&+\,h_{\tau }^{2}\Vert e_f\Vert _{0;\tau }^2 +h_{\tau }^{2}\Vert \mathbf {e}_b\Vert _{\infty ;\tau }^2\Vert \nabla u_h\Vert _{0;\tau }^2,\\ \Vert \mathbf {R}_2\Vert _{0;\tau }^2= \Vert \mathbf {v}_2\Vert _{0;\tau }^2\lesssim & {} \Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0;\tau }^2 +\epsilon ^2 |e_u|_{1;\tau }^2 \lesssim \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau }^{2},\\ \Vert R_3\Vert _{0;\tau }^2=\Vert v_3\Vert _{0;\tau }^2\lesssim & {} \Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0;\tau }^2 \lesssim \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau }^{2}. \end{aligned}$$

By using (2.28) together with the fact that \(f\in L^{2}(\Omega )\) is the given source term, we can bound \(\Vert \nabla u_h\Vert _{0;\tau }\) and obtain

$$\begin{aligned} h_{\tau }^{2}\Vert v_1\Vert _{0;\tau }^2\lesssim & {} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau }^{2}+\frac{h_{\tau }^2}{\epsilon ^2}\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau }^{2} \nonumber \\&+\,h_{\tau }^{2}\Vert e_f\Vert _{0;\tau }^2 +\frac{h_{\tau }^{2}}{\epsilon ^2}\Vert \mathbf {e}_b\Vert _{\infty ;\tau }^2. \end{aligned}$$

Our lemma follows by combining the above formula with (3.35). \(\square \)

Now, we proceed to the jump term.

Lemma 3.11

Let \(e\in \mathcal {F}^{0}\) with \(e=\tau _1 \cap \tau _2\), and assume that the exact flux \(\mathbf {p}\cdot \mathbf {n}_e\) is continuous across e. Then we have

$$\begin{aligned} h_e\Vert J_2\Vert _{0;e}^{2}\lesssim & {} \sum _{i=1,2}\left( \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau _i}^{2} +\frac{h_e^2}{\epsilon ^2}\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau _i}^{2} \right) \nonumber \\&+\sum _{i=1,2}\left( h_e^2\Vert R_1\Vert _{0,\tau _i}^2+\frac{h_e^2}{\epsilon ^2} \Vert \mathbf {e}_b\Vert _{\infty ;\tau _i}^2 \right) . \end{aligned}$$
(3.39)

Proof

Define \(q:=J_2=[(\mathbf {p}_h+\frac{1}{2}\mathbf {b}_h u_h)\cdot \mathbf {n}]\), which is a polynomial on \(e\in \mathcal {F}^{0}\). Since \(u\in H_0^1(\Omega )\), \(\mathbf {b}\in \mathbf {H}(\mathrm {div}, \Omega )\) and \(\mathbf {p}\cdot \mathbf {n}_e\) is continuous on each \(e\in \mathcal {F}^{0}\), we have

$$\begin{aligned} q=\left[ \left( \mathbf {p}_h+\frac{1}{2}\mathbf {b}_hu_h-\mathbf {p}-\frac{1}{2}\mathbf {b}u\right) \cdot \mathbf {n}\right] . \end{aligned}$$

Let \(Q_b\in H_0^1(\tau _1 \cup \tau _2)\) be the extension of \(\beta _e q\) to \(\tau _1 \cup \tau _2\) given in Lemma 3.9, so that \(Q_b|_e=\beta _e q\). Again, by the standard bubble function technique, we have

$$\begin{aligned} \Vert q\Vert _{0;e}^2\lesssim & {} \Vert \beta _{e}^{1/2}q\Vert _{0;e}^2=(q,Q_b)_{0;e} =-\int _e \left[ \left( \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\right) \cdot \mathbf {n}\right] Q_b ds \nonumber \\= & {} -\sum _{i=1,2} \int _{\tau _i} \nabla \cdot \left( \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\right) Q_b -\sum _{i=1,2} \int _{\tau _i} \left( \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\right) \cdot \nabla Q_b. \qquad \end{aligned}$$
(3.40)

Since \(Q_b\in H_0^1(\tau _1 \cup \tau _2)\), the variational form (2.8) gives

$$\begin{aligned} \sum _{i=1,2}\int _{\tau _i}\left( f+\nabla \cdot \mathbf {p}-\frac{1}{2}\mathbf {b}\cdot \mathbf {s} \right) Q_b =0. \end{aligned}$$

Adding the above equation to (3.40), we get

$$\begin{aligned} \Vert q\Vert _{0;e}^2\lesssim & {} \sum _{i=1,2} \int _{\tau _i} \left( f+\nabla \cdot \mathbf {p}_h-\frac{1}{2}\mathbf {b}\cdot \mathbf {s}- \frac{1}{2} \nabla \cdot \mathbf {e}_{bu} \right) Q_b -\sum _{i=1,2} \int _{\tau _i} \left( \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\right) \cdot \nabla Q_b \nonumber \\= & {} \sum _{i=1,2} \int _{\tau _i} \left( R_1-\frac{1}{2}(e_{bs}-\nabla \cdot \mathbf {e}_{bu})- \mathbf {b}\cdot \nabla e_u-\mathbf {e}_b\cdot \nabla u_h\right) Q_b \\&-\sum _{i=1,2} \int _{\tau _i} (\mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}) \cdot \nabla Q_b \nonumber \\\lesssim & {} \sum _{i=1,2} h_e^{\frac{1}{2}}\left( \Vert R_1\Vert _{0,\tau _i} +\Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0,\tau _i}+| e_u|_{1,\tau _i}+\Vert \mathbf {e}_b\cdot \nabla u_h\Vert _{0,\tau _i}\right) \Vert q\Vert _{0,e}, \\&+\sum _{i=1,2} h_e^{-\frac{1}{2}}\left\| \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\right\| _{0;\tau _i} \Vert q\Vert _{0;e} \end{aligned}$$

where we have used the bubble function technique (3.33) and (3.34). Using (2.28), we know that

$$\begin{aligned} \Vert \mathbf {e}_b\cdot \nabla u_h\Vert _{0,\tau _i} \lesssim \Vert \nabla u_h\Vert _{0,\tau _i} \Vert \mathbf {e}_b\Vert _{\infty ,\tau _i}\lesssim \frac{1}{\epsilon }\Vert \mathbf {e}_b\Vert _{\infty ,\tau _i}. \end{aligned}$$

Hence, we have

$$\begin{aligned} h_e^{\frac{1}{2}}\Vert q\Vert _{0;e}\lesssim & {} \sum _{i=1,2}\left( h_e\Vert R_1\Vert _{0,\tau _i}+h_e\Vert e_{bs}-\nabla \cdot \mathbf {e}_{bu}\Vert _{0,\tau _i}+h_e| e_u|_{1,\tau _i}\right. \\&\left. + \frac{h_e}{\epsilon }\Vert \mathbf {e}_b\Vert _{\infty ,\tau _i}+\Vert \mathbf {e}_p+\frac{1}{2}\mathbf {e}_{bu}\Vert _{0,\tau _i} \right) , \end{aligned}$$

and

$$\begin{aligned} h_e\Vert J_2\Vert _{0;e}^2\lesssim & {} \sum _{i=1,2}\left( \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG};\tau _i}^{2} +\frac{h_e^2}{\epsilon ^2}\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG};\tau _i}^{2} \right) \nonumber \\&+\sum _{i=1,2}\left( h_e^2\Vert R_1\Vert _{0,\tau _i}^2+\frac{h_e^2}{\epsilon ^2} \Vert \mathbf {e}_b\Vert _{\infty ;\tau _i}^2 \right) . \end{aligned}$$
(3.41)

\(\square \)

Combining all the lemmas above, we obtain the following theorem on the efficiency.

Theorem 3.12

Let \((\mathbf {p},\mathbf {s},u)\in [L^2(\Omega )]^2\times [L^2(\Omega )]^2 \times H_0^1(\Omega )\) be the exact solution of (2.6)–(2.8) and \((\mathbf {p}_{h},\mathbf {s}_{h},u_{h})\in W_{h}\times W_{h}\times U_{h}\) be the numerical solution of the SDG scheme (2.14)–(2.16). Suppose that the ratio \(\frac{h_{\tau }}{\epsilon }\) is bounded above by a constant independent of the mesh size. Then, for any \(f_{h}\in P(\mathcal {T})\), we have

$$\begin{aligned} \eta ^{2}\lesssim & {} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{DG}^{2}+ \sum _{\tau \in \mathcal {T}}\left( h_{\tau }^2 \Vert e_f\Vert _{0;\tau }^2 +\Vert \mathbf {e}_b\Vert _{\infty ;\tau }^2\right) . \end{aligned}$$
(3.42)

Proof

Since we have assumed that the ratio \(\frac{h_{\tau }}{\epsilon }\) is bounded by a constant for each cell \(\tau \in \mathcal {T}\), Lemma 3.10 yields the following estimate for \(R_1\):

$$\begin{aligned} h_{\tau }^{2}\Vert R_{1}\Vert _{0;\tau }^{2}\lesssim & {} \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG};\tau }^{2}+h_{\tau }^2 \Vert e_f\Vert _{0;\tau }^2 +\Vert \mathbf {e}_b\Vert _{\infty ;\tau }^2. \end{aligned}$$
(3.43)

For each edge \(e\in \mathcal {F}_{u}^{0}\) with \(e=\tau _1 \cap \tau _2\), we have \(\frac{h_e}{h_{\tau _i}}\leqslant 1\) for both \(i=1\) and \(i=2\). By using Lemma 3.11 and the above inequality, we know that

$$\begin{aligned} h_e\Vert J_2\Vert _{0;e}^{2}\lesssim & {} \sum _{i=1,2}\left( \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau _i}^{2} +\frac{h_{\tau _i}^2}{\epsilon ^2}\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau _i}^{2} \right) \nonumber \\&+\sum _{i=1,2}\left( h_{\tau _i}^2\Vert R_1\Vert _{0,\tau _i}^2+\frac{h_{\tau _i}^2}{\epsilon ^2} \Vert \mathbf {e}_b\Vert _{\infty ;\tau _i}^2 \right) \nonumber \\\lesssim & {} \sum _{i=1,2}\left( \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG},\tau _i}^{2} +h_{\tau _i}^2 \Vert e_f\Vert _{0;\tau _i}^2 +\Vert \mathbf {e}_b\Vert _{\infty ;\tau _i}^2\right) . \end{aligned}$$
(3.44)

The rest of the proof follows directly from (3.43), (3.44) and Lemma 3.10, together with the fact that the term involving \(J_1\) in \(\eta \) also appears in \(\Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG}}\). \(\square \)

3.3 The Adaptive Refinement Strategy

As proved in Sects. 3.1 and 3.2, \(\eta ^2\) is a reliable and efficient local error estimator. Based on well-established ideas for adaptive algorithms [26, 34, 35, 40], we can now derive a residual-type adaptive mesh refinement strategy for our SDG method that uses \(\eta ^2\) as the error indicator.

Note that our error estimator \(\eta ^2\) is defined on each element of \(\mathcal {T}\), the staggered mesh constructed from an initial triangulation \(\mathcal {T}_{0}\). The purpose of constructing \(\mathcal {T}\) is to implement our scheme in a staggered way, as described in Sect. 2.2. Our adaptive refinement is driven by the initial mesh \(\mathcal {T}_{0}\) rather than the final mesh \(\mathcal {T}\). To define an error indicator on each element \(\rho \) of the initial mesh \(\mathcal {T}_{0}\), we use the natural choice

$$\begin{aligned} \xi _{\rho }^{2}:= & {} \sum _{\tau \in \mathcal {T},\,\tau \cap \rho \ne \emptyset }\eta _{\tau }^{2}. \end{aligned}$$
(3.45)

It is clear that

$$\begin{aligned} \eta ^{2}=\sum _{\tau \in \mathcal {T}}\eta _{\tau }^{2}= \sum _{\rho \in \mathcal {T}_{0}}\xi _{\rho }^{2}. \end{aligned}$$

Now we present the adaptive refinement strategy. The idea is to compute the error indicator for each element in the initial mesh, locate the elements with the largest errors and refine only those elements. After this process, we obtain a new initial mesh, which is used to construct a new staggered mesh and the corresponding solution spaces for the new SDG system. Denoting the j-th level initial mesh by \(\mathcal {T}_{0}^j\), we implement the adaptive refinement by the following iteration (a schematic sketch of this loop is given after the list):

1. Subdivide each triangle in \(\mathcal {T}_{0}^j\) to get the staggered mesh. Use the SDG scheme (2.14)–(2.16) to solve for a numerical solution \((\mathbf {p}_{h}^j,\mathbf {s}_{h}^j,u_{h}^j)\in W_{h}\times W_{h}\times U_{h}\).

2. If the total number of triangles in \(\mathcal {T}_{0}^{j}\) is larger than a threshold \(N_0\), stop the refinement procedure and use \((\mathbf {p}_{h}^j,\mathbf {s}_{h}^j,u_{h}^j)\) as the final result. Otherwise, evaluate \(\xi _{\rho }^2\) for each element \(\rho \in \mathcal {T}_{0}^{j}\) and compute their sum \(\eta ^2\).

3. If the total estimated error \(\eta ^2\) is less than a threshold \(\delta _0\), stop the refinement procedure. Otherwise, use the following two steps to construct a refined mesh \(\mathcal {T}_{0}^{j+1}\).

4. Enumerate all triangles in \(\mathcal {T}_{0}^j\) such that \(\xi _{\rho _{1}}\ge \xi _{\rho _{2}}\ge \xi _{\rho _{3}}\ge \cdots \). Choose \(0<\theta <1\) and find the smallest m such that

   $$\begin{aligned} \theta \eta ^{2}\le \sum _{i=1}^{m}\xi _{\rho _{i}}^2. \end{aligned}$$

5. Obtain a new initial mesh \(\mathcal {T}_{0}^{j+1}\) by refining the first m triangles of \(\mathcal {T}_{0}^{j}\) chosen in the previous step, together with any additional triangles needed to keep \(\mathcal {T}_{0}^{j+1}\) conforming.
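The marking in Steps 4 and 5 is the standard bulk (Dörfler) criterion, and the whole loop admits a compact implementation. The following Python sketch is only a schematic illustration under our notation; `solve_sdg`, `element_indicators` and `refine` are hypothetical placeholders for the SDG solver (2.14)–(2.16), the computation of the indicators \(\xi _{\rho }^2\) in (3.45), and a conformity-preserving local refinement (e.g. newest-vertex bisection), respectively.

```python
import numpy as np

def adaptive_refine(mesh0, solve_sdg, element_indicators, refine,
                    theta=0.3, N0=100000, delta0=1e-8):
    """Schematic version of Steps 1-5; the three callables are placeholders."""
    mesh = mesh0
    while True:
        solution = solve_sdg(mesh)                # Step 1: build staggered mesh and solve (2.14)-(2.16)
        if mesh.num_triangles > N0:               # Step 2: stop if the mesh is already large enough
            return mesh, solution
        xi2 = element_indicators(mesh, solution)  # xi_rho^2 for each rho in the initial mesh, cf. (3.45)
        eta2 = xi2.sum()
        if eta2 < delta0:                         # Step 3: stop if the estimated error is small enough
            return mesh, solution
        order = np.argsort(xi2)[::-1]             # Step 4: sort indicators in decreasing order
        cumulative = np.cumsum(xi2[order])
        m = int(np.searchsorted(cumulative, theta * eta2)) + 1
        marked = order[:m]                        # smallest set with sum of indicators >= theta * eta2
        mesh = refine(mesh, marked)               # Step 5: refine marked triangles, keep mesh conforming
```

Here `np.searchsorted` applied to the cumulative sums returns the smallest m for which the marked indicators reach the fraction \(\theta \) of \(\eta ^2\).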

4 Numerical Examples

In this section, we provide several numerical examples to demonstrate the accuracy and efficiency of the proposed error indicator and the corresponding adaptive refinement technique. We use structured triangular meshes. All the numerical experiments are performed on the square domain \(\Omega =[0,1]\times [0,1]\). The initial mesh consists of only two triangles, obtained by bisecting \(\Omega \) along the diagonal from (0, 0) to (1, 1). The parameter \(\theta \) in the adaptive refinement procedure is chosen to be 0.3. To test the performance of our method, we compare our adaptive refinement strategy with two other refinement schemes: (1) uniform refinement; (2) the same adaptive refinement scheme with the error estimator replaced by the exact error.
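For reference, the two-triangle initial mesh described above can be written down explicitly; the snippet below (illustrative only) records its vertex coordinates and connectivity in the form typically passed to a mesh refinement routine.

```python
import numpy as np

# Unit square [0,1] x [0,1] split into two triangles by the diagonal from (0,0) to (1,1)
vertices = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [1.0, 1.0],
                     [0.0, 1.0]])
triangles = np.array([[0, 1, 2],   # lower-right triangle
                      [0, 2, 3]])  # upper-left triangle
```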

For the space \(U_h\), the degrees of freedom in each triangle \(\tau \in \mathcal {T}\) are taken as function values at \(\frac{(k+1)(k+2)}{2}\) points, with the requirement that each edge contains \(k+1\) of these points. The basis functions are then the Lagrange interpolation functions at these points. For each edge \(e=\tau _1 \cap \tau _2 \in \mathcal {F}_{u}^0\), \(\tau _1\) and \(\tau _2\) share the same degrees of freedom on e, and thus functions in \(U_h\) are continuous across each edge in \(\mathcal {F}_{u}^0\). For \(W_{h}\), we use the Brezzi–Douglas–Marini (BDM) finite element [5]. The degrees of freedom include the values of the normal component at \(k+1\) points per edge; for \(k > 1\), they also include interior moments over the triangle. For each edge \(e=\tau _1 \cap \tau _2 \in \mathcal {F}_{p}\), \(\tau _1\) and \(\tau _2\) share the same degrees of freedom on e, since functions in \(W_{h}\) must have a continuous normal component across each edge in \(\mathcal {F}_{p}\). A detailed description of the basis functions can be found in [38]. For simplicity, we use piecewise linear elements (\(k=1\)) in all examples.

Fig. 2 Example 1 with the singular point (0.5, 0.5). a Comparison of different refinement schemes. b Mesh level 30

Example 1

For the first example, we take \(\epsilon =1\) and \(\varvec{b}=(1, 1)^T\). We consider the exact solution

$$\begin{aligned} u=\sin (\pi x)\sin (\pi y)\left[ (x-{c_1})^{2}+(y-{c_2})^{2}\right] ^{1/6}, \end{aligned}$$

which has a point singularity at \((c_1,c_2)\); the source term f is then computed from this exact solution. We first consider the case \(c_1=0.5\) and \(c_2=0.5\), in which the singularity is located on the mesh interface. We compare the log-log plots of the numerical error for the different refinement schemes in Fig. 2a. Our scheme performs roughly the same as the adaptive refinement scheme that uses the exact error as the error indicator, which confirms the reliability and efficiency of the proposed error indicator. Also, it is evident that the error of our adaptive refinement scheme is smaller than that of the uniform refinement scheme for the same number of elements. More importantly, \(\log \Vert (e_{u}, \mathbf {e}_{bu},\mathbf {e}_{p}, e_{bs})\Vert _{\mathrm {DG}}\) decreases at a rate of 0.5 against the logarithm of the number of elements for our adaptive scheme, which corresponds to first order convergence in 2D. This shows that our scheme outperforms the uniform refinement scheme and attains the optimal rate of convergence for piecewise linear elements. Our adaptive mesh at level 30 is shown in Fig. 2b; the refinement is concentrated around the singular point, as expected. We further consider the case \(c_1=0.3\) and \(c_2=0.6\), in which the singularity is not located on the mesh interface. The numerical results are shown in Fig. 3. The behavior is largely the same as in the previous case, which confirms the robustness of our scheme.
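Since the exact solution is prescribed, the corresponding source term can be generated symbolically rather than by hand. The following sympy sketch (an illustration only, not part of the original implementation) computes \(f=-\epsilon \Delta u+\mathbf {b}\cdot \nabla u\) for the case \(c_1=c_2=0.5\), where \(\nabla \cdot (\mathbf {b}u)=\mathbf {b}\cdot \nabla u\) because \(\mathbf {b}\) is divergence free.

```python
import sympy as sp

x, y = sp.symbols('x y')
eps = 1
c1, c2 = sp.Rational(1, 2), sp.Rational(1, 2)
bx, by = 1, 1                                    # b = (1, 1)^T, divergence free

u = sp.sin(sp.pi*x)*sp.sin(sp.pi*y)*((x - c1)**2 + (y - c2)**2)**sp.Rational(1, 6)

# f = -eps*(u_xx + u_yy) + b . grad(u), valid away from the singular point (c1, c2)
f = -eps*(sp.diff(u, x, 2) + sp.diff(u, y, 2)) + bx*sp.diff(u, x) + by*sp.diff(u, y)
f = sp.simplify(f)

# sample evaluation, e.g. for a quadrature rule in the assembly of the right-hand side
print(f.subs({x: sp.Rational(1, 4), y: sp.Rational(3, 4)}).evalf())
```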

Fig. 3 Example 1 with the singular point (0.3, 0.6). a Comparison of different refinement schemes. b Mesh level 30

Example 2

For the second example, we consider the solution with a circular internal layer. We take \(\epsilon =10^{-4}\) and \(\varvec{b}=(2, 3)^T\). The exact solution is taken as

$$\begin{aligned} u=16xy(1-x)(1-y)\left\{ \frac{1}{\pi }\arctan {\left[ \frac{2}{\sqrt{\epsilon }}(0.25^2-(x-0.5)^2-(y-0.5)^2)\right] }+0.5\right\} . \end{aligned}$$

We use our adaptive refinement scheme to approximate the solution of this example. Figure 4a shows the adaptive mesh at level 50; the refinement is concentrated around the circular internal layer, as expected. The largest edge length in this mesh is 0.25 and the smallest edge length near the layer is \(6.51\times 10^{-4}\). Figure 4b, c show the contour plot and the 3D plot of \(u_h\), respectively. The numerical solution possesses a circular internal layer, which confirms the ability of our scheme to capture the position of the layer.

Fig. 4 Example 2, circular internal layer. a Mesh level 50. b Contour plot. c 3D plot

Example 3

In the third example, we take \(\varvec{b}=(\dfrac{1}{2}, \dfrac{\sqrt{3}}{2})^T\), \(\epsilon =5\times 10^{-5}\) and \(f=0\). Denoting \(\partial \Omega _1=\left\{ (x,y)\in \partial \Omega :\, x=0\right\} \cup \left\{ (x,y)\in \partial \Omega :\, x\le 0.5,\,y=0\right\} \), the boundary condition for this example is given by

$$\begin{aligned} u={\left\{ \begin{array}{ll} 1, &{} \qquad \mathrm {on\;} (x,y)\in \partial \Omega _1, \\ 0, &{} \qquad \mathrm {on\;} (x,y)\in \partial \Omega {\setminus } \partial \Omega _1. \end{array}\right. } \end{aligned}$$

For this example, the exact solution is unknown, but it is expected to possess both an internal layer and boundary layers. In Fig. 5, we can see that all the layers are recovered by our adaptive refinement scheme, which confirms the ability of our scheme to capture the positions of the layers. Here we show the adaptive mesh at level 34. The largest edge length in this mesh is 0.141 and the smallest edge length near the layer is \(5.66\times 10^{-6}\).

Fig. 5 Example 3, internal and boundary layers. a Mesh level 34. b Contour plot. c 3D plot

5 Conclusion

In this paper, we propose a new SDG method for solving the steady state convection–diffusion equation with a small diffusion coefficient \(\epsilon \). A residual-type a posteriori error estimator for the numerical solutions of this SDG method is derived, and its reliability and efficiency are proved. Using this error estimator as the error indicator, an adaptive mesh refinement technique is proposed. Numerical examples with point singularities and sharp layers, which are computationally expensive for regular uniform refinement methods, are provided. The proposed error indicator is close to the exact numerical error. Our adaptive mesh refinement method outperforms the uniform refinement scheme and attains the optimal rate of convergence for piecewise linear elements. Moreover, the adaptive method captures the positions of singular points and sharp layers accurately, and thus improves the resolution near these regions by locally refining the mesh.