1 Introduction

The numerical solution of singularly perturbed problems has been studied extensively over the last decades (see, e.g., the books [8, 11] and the references therein). These problems typically feature boundary layers (and, more generally, also internal layers). Their resolution requires the use of strongly refined, layer-adapted meshes. In the context of fixed order methods, well-known representatives of such meshes include the Bakhvalov mesh [1] and the Shishkin mesh [14]. For the \(p\)/\(hp\)-version Finite Element Method (FEM) or for spectral methods, the Spectral Boundary Layer mesh [3, 4, 13] is essentially the smallest mesh that permits the resolution of boundary layers (see Definition 2.2 ahead for the 1D version and Sect. 3.1 for a realization in 2D).

The use of the above-mentioned meshes can lead to robust convergence, i.e., convergence uniform in the singular perturbation parameter. For the reaction-diffusion equations (2.1), (3.1) under consideration here, the FEM is naturally analyzed in the energy norm (2.6), (3.4), which is simply the norm induced by the inner product defined by the bilinear form of the variational problem; robust convergence of the \(h\)-FEM on Shishkin meshes can be found, for example, in [11] and robust exponential convergence on Spectral Boundary Layer meshes is shown in [3, 4]. The (natural) energy norm associated with this boundary value problem is rather weak in that the layer contributions are not “seen” by it; that is, the energy norm of the layer contribution vanishes as the singular perturbation parameter \(\varepsilon \) tends to zero whereas the energy norm of the smooth part of the solution does not. This has motivated the recent work [2, 9, 10], which studies the convergence of the \(h\)-FEM in norms stronger than the energy norm. The analysis of [2, 9, 10] is performed in an \(\varepsilon \)-weighted \(H^1\)-norm which is balanced in the sense that both the smooth part and the layer part are (generically) bounded away from zero uniformly in \(\varepsilon \); both energy norm [see (2.6), (3.4) for the 1D and 2D case, respectively] and balanced norm [see (2.10), (3.5)] are \(\varepsilon \)-weighted \(H^1\)-norms but they differ in the \(\varepsilon \)-scaling. Robust convergence in this balanced norm is shown in [2, 9, 10] if Shishkin meshes are employed. We show in the present work that this analysis can be extended to the \(hp\)-version FEM on Spectral Boundary Layer meshes to give robust exponential convergence of the \(hp\)-version FEM in this balanced norm. An additional outcome of our convergence analysis in the balanced norm is the robust exponential convergence in the maximum norm.

It is worth mentioning that robust exponential convergence of the \(hp\)-FEM on Spectral Boundary Layer meshes in the balanced norm was shown earlier in special cases. For example, for the case of equations with constant coefficients and polynomial right-hand sides, [13] observes that the smooth part of the asymptotic expansion is again polynomial and therefore in the finite element space. It follows that a factor \(\varepsilon ^{1/2}\) is gained in the convergence estimate, which leads to robust exponential convergence in the balanced norm. A more detailed discussion of similar effects can be found in the concluding remarks of [5] and in the section with numerical results in [6].

Let us briefly discuss the ideas underlying our analysis. Asymptotic expansions may be viewed as a tool to decompose the solution into components associated with different length scales. Roughly speaking, our analysis in balanced norms mimics this technique on the discrete level in that the Galerkin approximation is likewise decomposed into components associated with different length scales. In total, our analysis involves the following ideas:

  1. An analysis of the difference between the FEM approximation and a Galerkin approximation to a reduced problem.

  2. A stable decomposition of the FEM space on the layer-adapted mesh into fine and coarse components. This decomposition relies essentially on strengthened Cauchy–Schwarz inequalities.

Throughout the paper we will utilize the usual Sobolev space notation \(H^{k}\left( \Omega \right) \) to denote the space of functions on \(\Omega \) with weak derivatives up to order \(k\) in \(L^{2}\left( \Omega \right) \), equipped with the norm \(\left\| \cdot \right\| _{k,\Omega }\) and seminorm \(\left| \cdot \right| _{k,\Omega }\). We will also use the space \(H_{0}^{1}\left( \Omega \right) =\left\{ u\in H^{1}\left( \Omega \right) :\left. u\right| _{\partial \Omega }=0\right\} \), where \(\partial \Omega \) denotes the boundary of \(\Omega \). The norm of the space \(L^\infty (\Omega )\) of essentially bounded functions is denoted by \(\Vert \cdot \Vert _{\infty ,\Omega }\). The letters \(C\), \(c\) will be used to denote generic positive constants, independent of any discretization or singular perturbation parameters and possibly having different values in each occurrence. Finally, the notation \(A\lesssim B\) means the existence of a positive constant \(C\), which is independent of the quantities \(A\) and \(B\) under consideration and of the singular perturbation parameter \(\varepsilon \), such that \(A\le CB\).

2 The one-dimensional case

We start with the one-dimensional case as many of the ideas can be seen in this setting already.

2.1 Problem formulation and solution regularity

We consider the following model problem: Find \(u\) such that

$$\begin{aligned} -\varepsilon ^{2}u^{\prime \prime }+bu= & {} f\quad \text { in }I=(0,1),\end{aligned}$$
(2.1a)
$$\begin{aligned} u(0)= & {} u(1)=0. \end{aligned}$$
(2.1b)

The parameter \(0<\varepsilon \le 1\) is given, as are the functions \(b>0\) and \(f\), which are assumed to be analytic on \(\overline{I}=[0,1]\). In particular, we assume that there exist constants \(C_{f}\), \(\gamma _{f}\), \(C_{b}\), \(\gamma _{b}\), \(c_b >0\), such that

$$\begin{aligned} \left\{ \begin{array}{ll} \left\| f^{(n)}\right\| _{\infty ,I}\le C_{f}\gamma _{f}^{n}n! &{} \quad \forall \;n\in \mathbb {N}_{0}, \\ \left\| b^{(n)}\right\| _{\infty ,I}\le C_{b}\gamma _{b}^{n}n! &{} \quad \forall \;n\in \mathbb {N}_{0}, \\ b(x) \ge c_b > 0 &{} \quad \forall x \in \overline{I}. \end{array} \right. \end{aligned}$$
(2.2)

The variational formulation of (2.1) reads: Find \(u\in H_{0}^{1}\left( I\right) \) such that

$$\begin{aligned} {\mathcal {B}}_{\varepsilon }\left( u,v\right) ={\mathcal {F}}\left( v\right) \;\;\forall \;v\in H_{0}^{1}\left( I\right) , \end{aligned}$$
(2.3)

where, with \(\left\langle \cdot ,\cdot \right\rangle _{I}\) the usual \(L^{2}(I)\) inner product,

$$\begin{aligned} {\mathcal {B}}_{\varepsilon }\left( u,v\right)= & {} \varepsilon ^{2}\left\langle u^{\prime },v^{\prime }\right\rangle _{I}+\left\langle bu,v\right\rangle _{I}, \end{aligned}$$
(2.4)
$$\begin{aligned} {\mathcal {F}}\left( v\right)= & {} \left\langle f,v\right\rangle _{I}. \end{aligned}$$
(2.5)

The bilinear form \({\mathcal {B}}_{\varepsilon }\left( \cdot ,\cdot \right) \) given by (2.4) is coercive with respect to the energy norm

$$\begin{aligned} \left\| u\right\| _{E,I}^{2}:={\mathcal {B}}_{\varepsilon }\left( u,u\right) , \end{aligned}$$
(2.6)

i.e.,

$$\begin{aligned} {\mathcal {B}}_{\varepsilon }\left( u,u\right) \ge \left\| u\right\| _{E,I}^{2}\quad \forall \;u\in H_{0}^{1}\left( I\right) . \end{aligned}$$

The solution \(u\) is analytic in \({I}\) and features boundary layers at the endpoints. Its regularity was described in [3] (our presentation below follows [4, Prop. 2.2.1]) both in terms of classical differentiability (see Proposition 2.1, (i)) as well as asymptotic expansions (see Proposition 2.1, (ii)):

Proposition 2.1

([4, Prop. 2.2.1], [3]) Assume (2.2) and let \(u \in H^1_0(I)\) be the solution of (2.1). Then:

  (i) There are constants \(C\), \(K > 0\) independent of \(\varepsilon \in (0,1]\) such that \(\Vert u^{(n)}\Vert _{L^2(I)} \le C K^n \max \{n+1,\varepsilon ^{-1}\}^n\) for all \(n \in \mathbb {N}_0\).

  (ii) \(u\) can be decomposed as \(\displaystyle u=w+u^{BL}+r \) where, for some constants \(C_{w}\), \(\gamma _{w}\), \(C_{BL}\), \(\gamma _{BL}\), \(C_{r}\), \(\gamma _{r}\), \(b >0\) independent of \(\varepsilon \in (0,1]\),

    $$\begin{aligned} \left\| w^{(n)}\right\| _{\infty ,I}\le & {} C_{w}\gamma _{w}^{n}n^{n} \quad \forall n \in {\mathbb {N}}_0, \end{aligned}$$
    (2.7a)
    $$\begin{aligned} \left| \left( u^{BL}\right) ^{(n)}(x)\right|\le & {} C_{BL}\gamma _{BL}^{n} \max \{n+1,\varepsilon ^{-1}\}^{n} e^{-b \,\mathrm{dist}(x,\partial I)/\varepsilon } \quad \forall n \in {\mathbb {N}}_0, \qquad \end{aligned}$$
    (2.7b)
    $$\begin{aligned} \Vert r^{(n)}\Vert _{0,I}\le & {} C_{r}\varepsilon ^{2-n}e^{-\gamma _{r}/\varepsilon }, \quad n\in \{0,1,2\}. \end{aligned}$$
    (2.7c)

2.2 High order FEM

The discrete version of the variational formulation (2.3) reads: Given \(V_N \subset H^1_0(I)\), find \(u_{FEM} \in V_N\) such that

$$\begin{aligned} {\mathcal {B}}_{\varepsilon }\left( u_{FEM},v\right) ={\mathcal {F}}\left( v\right) \quad \forall v \in V_N. \end{aligned}$$
(2.8)

In order to define the FEM space \(V_N\), let \(\Delta =\left\{ 0=x_{0}<x_{1}<\cdots <x_{N}=1\right\} \) be an arbitrary partition of \(I=\left( 0,1\right) \) and set

$$\begin{aligned} I_{j}=\left[ x_{j-1},x_{j}\right] ,\quad h_{j}=x_{j}-x_{j-1},\quad j=1,\ldots ,N. \end{aligned}$$

Also, define the reference element \(I_{ST}=[ -1,1]\) and note that it can be mapped onto the \(j^{\text {th}}\) element \(I_{j}\) by the standard affine mapping \(x=M_{j}(t)=\frac{1}{2}\left( 1-t\right) x_{j-1}+\frac{1}{2}\left( 1+t\right) x_{j}\). With \(\Pi _{p}\left( I_{ST}\right) \) the space of polynomials of degree \(\le p\) on \(I_{ST}\) (and with \(\circ \) denoting composition of functions), we define the finite dimensional subspace as

$$\begin{aligned} {\mathcal S}^{{p}}(\Delta )= & {} \left\{ v\in H^{1}\left( I\right) :v\circ M_{j}\in \Pi _{p_{j}}(I_{ST}),j=1,\ldots ,N\right\} , \\ {\mathcal S}_{0}^{{p}}(\Delta )= & {} {\mathcal S}^{{p}}(\Delta )\cap H_{0}^{1}(I). \end{aligned}$$

We restrict our attention here to constant polynomial degree \(p\) for all elements, i.e., \(p_{j}=p\), \(j=1,\ldots ,N\); clearly, more general settings with variable polynomial degree are possible. The following Spectral Boundary Layer mesh is essentially the minimal mesh that yields robust exponential convergence.

Definition 2.2

(Spectral Boundary Layer mesh) For \(\lambda >0\), \(p\in \mathbb {N}\) and \(0<\varepsilon \le 1\), define the Spectral Boundary Layer mesh \(\Delta _{BL}(\lambda ,p)\) as

$$\begin{aligned} \Delta _{BL}(\lambda ,p):= {\left\{ \begin{array}{ll} \{0,\lambda p \varepsilon ,1-\lambda p \varepsilon ,1\} \quad &{} \text{ if } \, \lambda p \varepsilon < 1/4 \\ \{0,1\} \quad &{} \text{ if } \, \lambda p \varepsilon \ge 1/4. \end{array}\right. } \end{aligned}$$

The spaces \(S(\lambda ,p)\) and \(S_0(\lambda ,p)\) of piecewise polynomials of degree \(p\) are given by

$$\begin{aligned} S(\lambda ,p):= & {} {\mathcal S}^p(\Delta _{BL}(\lambda ,p)), \\ S_0(\lambda ,p):= & {} {\mathcal S}^p_0(\Delta _{BL}(\lambda ,p)) = S(\lambda ,p) \cap H^1_0(I). \end{aligned}$$
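As a minimal illustration of Definition 2.2, the following Python sketch returns the mesh points of \(\Delta _{BL}(\lambda ,p)\). The function name `spectral_boundary_layer_mesh` and the argument checks are hypothetical choices made for this illustration only; they are not part of the notation used in this article.

```python
def spectral_boundary_layer_mesh(lam, p, eps):
    """Mesh points of Delta_BL(lam, p) on I = (0, 1), following Definition 2.2.

    lam : mesh parameter lambda > 0
    p   : polynomial degree, a positive integer
    eps : singular perturbation parameter, 0 < eps <= 1
    """
    if lam <= 0 or p < 1 or not (0 < eps <= 1):
        raise ValueError("need lam > 0, p >= 1 and 0 < eps <= 1")
    width = lam * p * eps  # width of each boundary layer element
    if width < 0.25:
        # three elements: two small layer elements and one large interior element
        return [0.0, width, 1.0 - width, 1.0]
    # asymptotic case: a single element
    return [0.0, 1.0]


# Layer regime: lam * p * eps < 1/4 gives a three-element mesh.
print(spectral_boundary_layer_mesh(1.0, 4, 1e-3))
# Asymptotic regime: lam * p * eps >= 1/4 gives a single element.
print(spectral_boundary_layer_mesh(1.0, 4, 0.1))
```

Note that the number of elements never exceeds three; all further resolution comes from increasing the polynomial degree \(p\), which simultaneously widens the layer elements.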

We quote the following result from [3].

Proposition 2.3

([3, Thm. 16]) Assume that (2.2) holds and let \(u\) be the solution to (2.3). Then, there exists \(\lambda _0>0\) (depending only on  \(b, f)\) such that for every \(\lambda \in (0,\lambda _0)\) there are \(C, \sigma >0,\) independent of \(\varepsilon \in (0,1]\) and \(p\in \mathbb {N}\) such that

$$\begin{aligned} \inf _{v \in S_0(\lambda ,p)} \Vert u - v\Vert _{E,I} \le C e^{-\sigma p}. \end{aligned}$$
(2.9)

By Céa’s Lemma the Galerkin approximation \(u_{FEM} \in S_0(\lambda ,p)\) satisfies \(\Vert u_{FEM}-u\Vert _{E,I}\sim \left\| u_{FEM}-u\right\| _{0,I}+\varepsilon \left\| \left( u_{FEM}-u\right) ^{\prime }\right\| _{0,I}\le Ce^{-\sigma p}. \)

Define the balanced norm by

$$\begin{aligned} \Vert v\Vert ^2_{balanced,I}:= \Vert v\Vert ^2_{0,I} + \varepsilon \Vert v^\prime \Vert ^2_{0,I}. \end{aligned}$$
(2.10)
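To illustrate the different \(\varepsilon \)-scalings of the energy norm (2.6) and the balanced norm (2.10), the following Python sketch evaluates both for the model layer function \(v(x)=e^{-x/\varepsilon }\). It rests on two simplifying assumptions that are not taken from the analysis here: the coefficient is \(b\equiv 1\), and \(v\) is a prototype of a layer rather than an actual solution component. The function name `layer_norms` is hypothetical.

```python
import math


def layer_norms(eps):
    """Energy and balanced norms of v(x) = exp(-x/eps) on I = (0, 1), with b = 1.

    Closed forms used:
        ||v||_0^2  = (eps / 2) * (1 - exp(-2/eps)),
        ||v'||_0^2 = (1 / (2 * eps)) * (1 - exp(-2/eps)).
    """
    tail = 1.0 - math.exp(-2.0 / eps)
    l2_sq = 0.5 * eps * tail
    h1_semi_sq = 0.5 * tail / eps
    energy = math.sqrt(eps**2 * h1_semi_sq + l2_sq)  # norm (2.6), assuming b = 1
    balanced = math.sqrt(l2_sq + eps * h1_semi_sq)   # norm (2.10)
    return energy, balanced


for eps in (1e-1, 1e-2, 1e-4):
    energy, balanced = layer_norms(eps)
    print(f"eps={eps:g}  energy={energy:.4f}  balanced={balanced:.4f}")
```

Under these assumptions the energy norm of the layer behaves like \(\sqrt{\varepsilon }\), whereas its balanced norm stays close to \(1/\sqrt{2}\) as \(\varepsilon \rightarrow 0\).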

We note that the balanced norm \(\Vert \cdot \Vert _{balanced,I}\) is stronger than the energy norm \(\Vert \cdot \Vert _{E,I}\) of (2.6). In Lemma 2.5 below, we will show that the approximation result (2.9) can be sharpened to

$$\begin{aligned} \inf _{v \in S_0(\lambda ,p)} \Vert u - v\Vert _{balanced,I} \le C e^{-\sigma p}. \end{aligned}$$

The key step towards this result is a better treatment of the boundary layer part than is done in [3, Thm. 16]. This modification is due to [13]. For future reference we formulate this modification as a separate lemma:

Lemma 2.4

Let \(\varepsilon \in (0,1]\). Let the function \(v\) satisfy on \(I = (0,1)\) the estimate

$$\begin{aligned} |v^{(n)}(x)| \le C_{v} \gamma ^n \max \{n+1,\varepsilon ^{-1}\}^n e^{-x/\varepsilon } \quad \forall x \in I, \quad \forall n \in \mathbb {N}_0. \end{aligned}$$
(2.11)

Then there are constants \(C\), \(\beta \), \(\eta > 0\) (depending only on \(\gamma )\) such that the following is true: Let \(\Delta \) be any mesh with a mesh point \(\xi \in (0,1]\) that satisfies

$$\begin{aligned} \frac{\xi }{p\varepsilon } \le \eta . \end{aligned}$$
(2.12)

Then there exists an approximation \(I_p v \in {\mathcal S}^p(\Delta )\) with \(I_p v(0) = v(0)\) and \(I_p v (1) = v(1)\) as well as the approximation properties

$$\begin{aligned}&\Vert v - I_p v \Vert _{\infty ,(0,\xi )} +\xi ^{-1/2} \Vert v - I_p v \Vert _{0,(0,\xi )} +\xi ^{1/2} \Vert v - I_p v \Vert _{1,(0,\xi )} \nonumber \\&\le C C_{v}\left[ \frac{\xi }{p \varepsilon } e^{-\beta p} + e^{-\xi /\varepsilon }\right] , \end{aligned}$$
(2.13)
$$\begin{aligned}&\Vert v - I_p v \Vert _{\infty ,(\xi ,1)} \le C C_{v} e^{-\xi /\varepsilon }, \end{aligned}$$
(2.14)
$$\begin{aligned}&\Vert v - I_p v \Vert _{0,(\xi ,1)} + {\varepsilon } \Vert v - I_p v \Vert _{1,(\xi ,1)} \le C C_{v}\sqrt{\varepsilon } e^{-\xi /\varepsilon }. \end{aligned}$$
(2.15)

Proof

We will assume that \(\xi \in (0,1/2)\); in the converse, “asymptotic” case we have \( \varepsilon ^{-1} \lesssim p\) so that a suitable approximation on a single element may be taken (e.g., the Gauß–Lobatto interpolant or the operator \({\mathcal I}_p\) discussed in detail in [12, Thm. 3.14] and [5, Sec. 3.2.1]).

It suffices to assume that the mesh consists of the two elements \(I_1:= (0,\xi )\) and \(I_2:=(\xi ,1)\). We construct \(I_p v\) separately on the two elements, starting with \(I_1\).

On \(I_1\), we construct \(I_p v\) in two steps. In the first step, we let \(\pi ^1 \in \Pi _p\) be the polynomial (on \(I_1\)) given by [5, Lemma 3.8]. It interpolates in the endpoints \(0\), \(\xi \) of the interval \(I_1\), i.e.,

$$\begin{aligned}&\pi ^1(0) = v(0), \quad \pi ^1(\xi ) = v(\xi ). \end{aligned}$$
(2.16)

Furthermore, [5, Lemma 3.8] asserts the existence of \(\eta > 0\) such that the constraint (2.12) implies

$$\begin{aligned}&\xi ^{-1} \Vert \pi ^1 - v\Vert _{0,I_1} + |\pi ^1 - v|_{1,I_1} \le C C_{v} \frac{\xi ^{1/2}}{p \varepsilon } e^{-\beta p}. \end{aligned}$$
(2.17)

(Note that [5, Lemma 3.8] constructs an approximation on the reference element \(I_{ST}\) instead of \(I_1\). It is applicable with \(K = \varepsilon ^{-1}\) and \(h = \xi \)). The 1D Sobolev embedding theorem in the form \(\Vert v\Vert _{\infty ,J} \lesssim |J|^{-1/2} \Vert v\Vert _{0,J} + |J|^{1/2} \Vert v^\prime \Vert _{0,J}\) (where \(|J|\) denotes the length of the interval \(J\)) gives

$$\begin{aligned} \xi ^{-1/2} \Vert \pi ^1- v\Vert _{\infty ,I_1} + \xi ^{-1} \Vert \pi ^1 - v\Vert _{0,I_1} + |\pi ^1 - v|_{1,I_1} \le C C_{v} \frac{\xi ^{1/2}}{p \varepsilon } e^{-\beta p}. \end{aligned}$$

In the second step, we modify \(\pi ^1\) as proposed in [13] in order to obtain a better approximation on the element \(I_2\). We define \(\pi ^2 \in \Pi _p\) on \(I_1\) as

$$\begin{aligned} \pi ^2(x):= \pi ^1(x) - \frac{x}{\xi } \left( 1-\sqrt{\varepsilon }\right) v(\xi ), \end{aligned}$$

so that \(\pi ^2(\xi ) = \pi ^1(\xi ) - (1-\sqrt{\varepsilon }) v(\xi ) = \sqrt{\varepsilon } v(\xi )\). In view of \(|v(\xi )| \le C_{v} e^{-\xi /\varepsilon }\), this modification leads to

$$\begin{aligned}&\xi ^{-1/2} \Vert \pi ^2- v\Vert _{\infty ,I_1} + \xi ^{-1} \Vert \pi ^2 - v\Vert _{0,I_1} + |\pi ^2 - v|_{1,I_1}\\&\quad \le C C_{v} \left[ \frac{\xi ^{1/2}}{p \varepsilon } e^{-\beta p} + \xi ^{-1/2} e^{-\xi /\varepsilon } \right] . \end{aligned}$$
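The additional term \(\xi ^{-1/2} e^{-\xi /\varepsilon }\) stems from the linear correction \(\ell (x):= \frac{x}{\xi }\left( 1-\sqrt{\varepsilon }\right) v(\xi )\); the symbol \(\ell \) is introduced here only for this illustration. A sketch of the elementary computation on \(I_1 = (0,\xi )\):

```latex
\begin{aligned}
\Vert \ell \Vert_{\infty,I_1} &= (1-\sqrt{\varepsilon})\,|v(\xi)|, \\
\Vert \ell \Vert_{0,I_1} &= \sqrt{\xi/3}\,(1-\sqrt{\varepsilon})\,|v(\xi)|, \\
|\ell|_{1,I_1} &= \xi^{-1/2}\,(1-\sqrt{\varepsilon})\,|v(\xi)|, \\
\xi^{-1/2}\Vert \ell\Vert_{\infty,I_1} + \xi^{-1}\Vert \ell\Vert_{0,I_1} + |\ell|_{1,I_1}
  &\le 3\,\xi^{-1/2}\,|v(\xi)| \le 3\,C_v\,\xi^{-1/2}\, e^{-\xi/\varepsilon}.
\end{aligned}
```

The triangle inequality applied to \(\pi ^2 - v = (\pi ^1 - v) - \ell \) then combines this with the estimate for \(\pi ^1 - v\).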

We take \((I_p v)|_{I_1} = \pi ^2\), and this shows (2.13). On \(I_2\), we take \((I_p v)|_{I_2}\) as the linear interpolant between the values \(\pi ^2(\xi ) = \sqrt{\varepsilon } v(\xi )\) at \(\xi \) and \(v(1)\) at \(1\). We immediately get

$$\begin{aligned} \Vert I_p v \Vert _{\infty ,I_2} + \Vert (I_p v)^\prime \Vert _{\infty ,I_2} \le C \sqrt{\varepsilon } |v(\xi )| \le C C_{v} \sqrt{\varepsilon } e^{-\xi /\varepsilon }. \end{aligned}$$
(2.18)

Furthermore, for \(v\) we have

$$\begin{aligned} \Vert v\Vert _{\infty ,I_2} + \varepsilon ^{-1/2} \Vert v\Vert _{0,I_2} + \sqrt{\varepsilon } \Vert v\Vert _{1,I_2} \le C C_{v} e^{-\xi /\varepsilon }. \end{aligned}$$
(2.19)
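For the reader's convenience, we sketch how (2.19) can be obtained from (2.11) with \(n = 0\) and \(n = 1\). The only additional ingredient is the elementary observation \(\max \{2,\varepsilon ^{-1}\} \le 2\varepsilon ^{-1}\) for \(\varepsilon \in (0,1]\); the explicit constants below are illustrative and play no role in the sequel.

```latex
\begin{aligned}
\Vert v\Vert_{\infty,I_2} &\le C_v\, e^{-\xi/\varepsilon}, \\
\Vert v\Vert_{0,I_2}^2 &\le C_v^2 \int_\xi^1 e^{-2x/\varepsilon}\,dx
  \le \frac{\varepsilon}{2}\, C_v^2\, e^{-2\xi/\varepsilon}, \\
\Vert v^\prime\Vert_{0,I_2}^2 &\le \left(2 C_v \gamma\, \varepsilon^{-1}\right)^2 \int_\xi^1 e^{-2x/\varepsilon}\,dx
  \le 2\, C_v^2 \gamma^2\, \varepsilon^{-1}\, e^{-2\xi/\varepsilon}.
\end{aligned}
```

Hence \(\varepsilon ^{-1/2}\Vert v\Vert _{0,I_2}\) and \(\sqrt{\varepsilon }\Vert v^\prime \Vert _{0,I_2}\) are both bounded by a constant multiple of \(C_v e^{-\xi /\varepsilon }\), with the constant depending only on \(\gamma \).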

Together with the triangle inequality, (2.18) and (2.19) imply (2.14) and (2.15). \(\square \)

Lemma 2.4 shows that boundary layer functions can be approximated at a robust exponential rate in various norms including \(L^\infty \) and the energy norm (2.6), if the mesh is suitably chosen. We now show approximability of solutions to (2.3) in the balanced norm (2.10):

Lemma 2.5

Assume that (2.2) holds and let \(u\) be the solution to (2.3). Then there are constants \(\lambda _0\), \(C\), \(\beta > 0\) (depending only on the constants appearing in (2.2)\()\) such that for every \(\lambda \in (0,\lambda _0]\), \(\varepsilon \in (0,1]\), \(p \in \mathbb {N}\), there exists an approximant \(I_p u \in {\mathcal S}^p_0(\Delta _{BL}(\lambda ,p))\) that satisfies

$$\begin{aligned} \Vert u - I_p u\Vert _{\infty ,I}\le & {} C e^{-\beta \lambda p}, \end{aligned}$$
(2.20a)
$$\begin{aligned} \Vert u - I_p u\Vert _{0,I} + \sqrt{\lambda p \varepsilon } \Vert (u - I_p u)^\prime \Vert _{0,I}\le & {} C e^{-\beta \lambda p}. \end{aligned}$$
(2.20b)

Proof

The proof follows the lines of [3, Thm. 16]. For the case of \(p\varepsilon \) sufficiently small, Proposition 2.1 decomposes the solution \(u\) as \(u = w + u^{BL} + r\). The approximation of \(w\) and \(r\) is done as in [3, Thm. 16]. The treatment of the boundary layer part \(u^{BL}\) of [3, Thm. 16] is replaced with an appeal to Lemma 2.4. We remark that slightly sharper estimates are possible if one formulates bounds for \(u - I_p u\) on the two elements \((0,\lambda p \varepsilon )\) and \((\lambda p \varepsilon ,1)\) separately. \(\square \)

2.3 Robust exponential convergence in a balanced norm

The goal of this article is to improve on Proposition 2.3 by showing that the Galerkin error \(u - u_{FEM}\) converges at a robust exponential rate also in the balanced norm \(\Vert \cdot \Vert _{balanced,I}\):

Theorem 2.6

Assume (2.2). Let \(u\) solve (2.3) and \(u_{FEM} \in S_0(\lambda ,p)\) be obtained by (2.8) based on the Spectral Boundary Layer mesh \(\Delta _{BL}(\lambda ,p)\). Then there exists \(\lambda _0 > 0\) (depending solely on \(b\) and \(f)\) such that for every \(\lambda \in (0,\lambda _0)\) there are constants \(C, \sigma > 0\) such that for every \(\varepsilon \in (0,1], p \in \mathbb {N}\)

$$\begin{aligned} \left\| u-u_{FEM}\right\| _{0,I}+\sqrt{\varepsilon }\left\| \left( u - u_{FEM}\right) ^{\prime }\right\| _{0,I}\le Ce^{-\sigma p}. \end{aligned}$$
(2.21)

The remainder of this section is devoted to the proof of Theorem 2.6. Before that, we note a consequence of Theorem 2.6:

Corollary 2.7

Under the assumptions of Theorem 2.6 there is \(\lambda _0 > 0\) such that for every \(\lambda \in (0,\lambda _0)\) there are constants \(C\), \(\sigma > 0\) such that for all \(\varepsilon \in (0,1]\), \(p \in \mathbb {N}\)

$$\begin{aligned} \Vert u - u_{FEM}\Vert _{\infty ,I} \le C e ^{-\sigma p}. \end{aligned}$$

Proof

We first observe that standard inverse estimates yield the result when \(\lambda p \varepsilon \ge 1/4\), in which case the mesh consists of a single element. Let us therefore consider the 3-element case \(\lambda p \varepsilon < 1/4\). Using the boundary condition at \(x = 0\) we can write

$$\begin{aligned} \left| u(x)-u_{FEM}(x)\right| =\left| \int \nolimits _{0}^{x}\left( u(t)-u_{FEM}(t)\right) ^{\prime }dt\right| . \end{aligned}$$

Assume first that \(x\in (0,\lambda p\varepsilon ].\) Then by the Cauchy–Schwarz inequality and (2.21)

$$\begin{aligned} \left| u(x)-u_{FEM}(x)\right| \le \sqrt{\lambda p\varepsilon }\left( C \varepsilon ^{-1/2}e^{-\sigma p}\right) \le C\sqrt{\lambda p}e^{-\sigma p}. \end{aligned}$$

The same technique works if \(x \in [1-\lambda p \varepsilon , 1)\). For \(x \in [\lambda p \varepsilon , 1-\lambda p \varepsilon ]\), we use the approximation \(I_p u\) of Lemma 2.5 and the triangle inequality to write \(|u(x) - u_{FEM}(x)| \le |u(x) - I_p u(x)| + |I_p u(x) - u_{FEM}(x)|\). Lemma 2.5 takes care of \(|u(x) - I_p u(x)|\), while \(|I_p u(x) - u_{FEM}(x)|\) is treated with the standard polynomial inverse estimate \(\Vert I_p u - u_{FEM} \Vert _{\infty ,[\lambda p \varepsilon ,1-\lambda p \varepsilon ]} \le C p^2 \Vert I_p u - u_{FEM}\Vert _{0,I}\) and the energy estimate of Proposition 2.3. \(\square \)

The proof of Theorem 2.6 is done in two steps: First, in Sect. 2.3.1 we reduce the analysis to an \(H^1\)-stability analysis of a projection operator \({\mathcal P}_0\) that is closely connected with the reduced/limit problem. Next, we recognize that polynomial inverse estimates will be needed for the \(H^1\)-stability analysis. In order to minimize the adverse impact of small elements of size \(O(\varepsilon p)\) on inverse estimates, we work with a decomposition of the space \(S(\lambda ,p)\) into global polynomials and polynomials supported by the small elements near the boundary. Section 2.3.2 provides the necessary strengthened Cauchy–Schwarz inequality, and Lemma 2.9 formulates the \(H^1\)-stability results for \({\mathcal P}_0\). Finally, in Sect. 2.3.3 we conclude the proof of Theorem 2.6.

2.3.1 Reduction to an \(H^1\)-stability analysis for a reduced problem

Since the desired estimate in the “asymptotic” case \(\lambda p \varepsilon \ge 1/4\) is easily shown (see the formal proof of Theorem 2.6 at the end of the section) we will focus in the following analysis on the 3-element case, i.e., \(\lambda p \varepsilon < 1/4\).

We begin by defining the bilinear form

$$\begin{aligned} {\mathcal {B}}_{0}\left( u,v\right) =\left\langle bu,v\right\rangle _{I}, \end{aligned}$$
(2.22)

corresponding to the reduced/limit problem. We also introduce the operator \({\mathcal {P}}_{0}:L^{2}(I)\rightarrow S_0(\lambda ,p)\) by the orthogonality condition

$$\begin{aligned} {\mathcal {B}}_{0}\left( u-{\mathcal {P}}_{0}u,v\right) =0 \quad \forall \; v\in S_0(\lambda ,p). \end{aligned}$$
(2.23)

Then, by Galerkin orthogonality satisfied by \(u-u_{FEM}\) (with respect to the bilinear form \({\mathcal B}_\varepsilon \)) and by \(u - {\mathcal P}_0 u\) (with respect to the bilinear form \({\mathcal B}_0\)) we have

$$\begin{aligned} \left\| u_{FEM}-{\mathcal {P}}_{0}u\right\| _{E,I}^{2}= & {} {\mathcal {B}}_{\varepsilon }\left( u_{FEM}-{\mathcal {P}}_{0}u,u_{FEM}-{\mathcal {P}}_{0}u\right) \\= & {} {\mathcal {B}}_{\varepsilon }\left( u-{\mathcal {P}} _{0}u,u_{FEM}-{\mathcal {P}} _{0}u\right) \nonumber \\= & {} \varepsilon ^{2}\left\langle \left( u-{\mathcal {P}}_{0}u\right) ^{\prime },\left( u_{FEM}- {\mathcal {P}} _{0}u\right) ^{\prime }\right\rangle _{I} \nonumber \\\le & {} \varepsilon ^2 \Vert \left( u-{\mathcal {P}}_{0}u\right) ^{\prime } \Vert _{0,I} \Vert \left( u_{FEM}-{\mathcal {P}}_{0}u\right) ^{\prime } \Vert _{0,I}. \nonumber \end{aligned}$$
(2.24)

Hence

$$\begin{aligned} \varepsilon \left\| \left( u_{FEM}-{\mathcal {P}} _{0}u\right) ^{\prime }\right\| _{0,I}\le \left\| u_{FEM}-{\mathcal {P}} _{0}u\right\| _{E,I}\le \varepsilon \left\| \left( u-{\mathcal {P}} _{0}u\right) ^{\prime }\right\| _{0,I}. \end{aligned}$$

The triangle inequality will then allow us to infer from this the exponential convergence result (2.21) provided we can show that

$$\begin{aligned} \left\| \left( u-{\mathcal {P}} _{0}u\right) ^{\prime }\right\| _{0,I} \le C\varepsilon ^{-1/2}e^{-\sigma p}, \end{aligned}$$

for some \(C\) and \(\sigma >0\) independent of \(\varepsilon \) and \(p\). This calculation shows that we have to study the \(H^{1}\)-stability of the operator \({\mathcal {P}}_{0}\) on Spectral Boundary Layer meshes. This is achieved in Lemma 2.9. Subsequently in Lemma 2.10, we control \(\Vert (u - {\mathcal P}_0 u)^\prime \Vert _{0,I}\).
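The triangle inequality argument mentioned above can be spelled out as follows. This is only a sketch: it uses the displayed bound \(\varepsilon \Vert (u_{FEM}-{\mathcal P}_0 u)^\prime \Vert _{0,I} \le \varepsilon \Vert (u-{\mathcal P}_0 u)^\prime \Vert _{0,I}\) and assumes that the estimate \(\Vert (u-{\mathcal P}_0 u)^\prime \Vert _{0,I} \le C\varepsilon ^{-1/2}e^{-\sigma p}\) has already been established.

```latex
\begin{aligned}
\sqrt{\varepsilon}\,\Vert (u-u_{FEM})^\prime\Vert_{0,I}
  &\le \sqrt{\varepsilon}\,\Vert (u-{\mathcal P}_0 u)^\prime\Vert_{0,I}
     + \sqrt{\varepsilon}\,\Vert ({\mathcal P}_0 u-u_{FEM})^\prime\Vert_{0,I} \\
  &\le 2\sqrt{\varepsilon}\,\Vert (u-{\mathcal P}_0 u)^\prime\Vert_{0,I}
   \le 2\,C\, e^{-\sigma p}.
\end{aligned}
```

The \(L^2\)-part of (2.21) requires no new argument, since \(\Vert u-u_{FEM}\Vert _{0,I}\) is already controlled by the energy norm estimate that follows from Proposition 2.3 and Céa’s Lemma.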

2.3.2 Stable decompositions of the spaces \(S(\lambda ,p)\)

Asymptotic expansions are a tool to decompose the solution \(u\) into components on the different length scales. We need to mimic this on the discrete level for \({\mathcal {P}} _{0}u\). We define (implicitly assuming \(\lambda p \varepsilon <1/4\)) the layer region

$$\begin{aligned} I_\varepsilon := [0,\lambda p \varepsilon ] \cup [1-\lambda p \varepsilon ,1] \end{aligned}$$

and the following two subspaces of \(S(\lambda ,p)\):

$$\begin{aligned} S_{1}= & {} {\mathcal S}^{p}(\Delta ), \qquad \Delta =\{0,1\}, \end{aligned}$$
(2.25)
$$\begin{aligned} S_{\varepsilon }= & {} \{u\in S(\lambda ,p){:}\,\mathrm{supp}\, u\subset I_\varepsilon \}. \end{aligned}$$
(2.26)

Note that the spaces \(S_{1}\) and \(S_{\varepsilon }\) do not carry any boundary conditions at the endpoints of \(I\)—this is a reflection of the fact that the reduced problem does not satisfy the homogeneous Dirichlet boundary conditions. It is important for the further developments to observe that for the three-element mesh of sufficiently small \(\lambda p\varepsilon \), there holds \({S}(\lambda ,p)=S_{1}\oplus S_{\varepsilon }\). In other words, each \(z\in {S} (\lambda ,p)\) has a unique decomposition \(z=z_{1}+z_{\varepsilon }\) with \(z_{1}\in S_{1}\) and \(z_{\varepsilon }\in S_{\varepsilon }\), when \(\lambda p\varepsilon <1/4.\) We also have the inverse estimates

$$\begin{aligned} \Vert z^{\prime }\Vert _{0,I}\le & {} Cp^{2}\Vert z\Vert _{0,I}\quad \forall z\in S_{1}, \end{aligned}$$
(2.27)
$$\begin{aligned} \Vert z^{\prime }\Vert _{0,I}\le & {} C\frac{p^{2}}{\lambda p\varepsilon } \Vert z\Vert _{0,I} \quad \forall z\in S_{\varepsilon }, \end{aligned}$$
(2.28)

by [12, Thm. 3.91]. Furthermore, we have the following strengthened Cauchy–Schwarz inequality:

Lemma 2.8

(Strengthened Cauchy–Schwarz inequality) Let \(\mathcal {B}_{0}\) be given by (2.22). Then, there is a constant \(C > 0\) depending solely on \(\Vert b\Vert _{\infty ,I}\) and \(\inf _{x \in I} b(x)\) such that

$$\begin{aligned} \left| {\mathcal {B}}_{0}\left( u,v\right) \right| \le C \min \{1,\sqrt{\lambda p\varepsilon }p\} \left\| u\right\| _{0,I}\left\| v\right\| _{0,I_{\varepsilon }}\quad \forall u\in S_{1},v\in S_{\varepsilon }. \end{aligned}$$

Proof

The standard Cauchy–Schwarz inequality yields \(|{\mathcal B}_0(u,v)| \le \Vert b\Vert _{\infty ,I} \Vert u\Vert _{0,I} \Vert v\Vert _{0,I}\), which accounts for the “\(1\)” in the minimum.

Let \(I_{1}=(0,\delta _{1})\) and \(I_{2}=(0,\delta _{2})\) be two intervals with \(\delta _{1}<\delta _{2}\). Consider polynomials \(\pi _{1}\) and \(\pi _{2}\) of degree \(p\). Then, using an inverse inequality [12, eq. (3.6.4)],

$$\begin{aligned} \left| \int \nolimits _{I_{1}}\pi _{1}(x)\pi _{2}(x)\,dx\right| \le \int \nolimits _{I_{1}} \left| \pi _{1}(x) \right| \left| \pi _{2}(x) \right| \,dx \le C\sqrt{\frac{\delta _{1}}{\delta _{2}}}p\Vert \pi _{1}\Vert _{0,I_{2}}\Vert \pi _{2}\Vert _{0,I_{1}}. \end{aligned}$$

The result follows by taking \(\delta _{1}=\lambda p\varepsilon \), \(\delta _{2}=1\). \(\square \)
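The inverse inequality behind this proof bounds the \(L^2\)-norm of a polynomial on a short interval \((0,\delta )\) by \(C\sqrt{\delta }\,p\) times its \(L^2\)-norm on \((0,1)\). The following Python sketch estimates the best constant numerically for a few degrees. It is an illustration under stated assumptions, not part of the proof: the function names are hypothetical, the basis of shifted Legendre polynomials is our choice, and the comparison value \(\sqrt{\delta }(p+1)\) comes from the elementary bound \(\Vert \pi \Vert _{\infty ,(0,1)} \le (p+1)\Vert \pi \Vert _{0,(0,1)}\) rather than from [12].

```python
from fractions import Fraction
import math


def shifted_legendre(p):
    """Monomial coefficients (lowest degree first) of the Legendre polynomials
    P_0, ..., P_p shifted to (0, 1), stored as exact fractions."""
    polys = [[Fraction(1)], [Fraction(-1), Fraction(2)]]
    for k in range(1, p):
        # (k+1) P_{k+1}(x) = (2k+1) (2x-1) P_k(x) - k P_{k-1}(x)
        nxt = [Fraction(0)] * (k + 2)
        for i, c in enumerate(polys[k]):
            nxt[i] -= (2 * k + 1) * c
            nxt[i + 1] += 2 * (2 * k + 1) * c
        for i, c in enumerate(polys[k - 1]):
            nxt[i] -= k * c
        polys.append([c / (k + 1) for c in nxt])
    return polys[: p + 1]


def local_to_global_ratio(p, delta):
    """Power-iteration estimate (a lower bound) of
    max ||pi||_{0,(0,delta)} / ||pi||_{0,(0,1)} over polynomials of degree <= p."""
    polys = shifted_legendre(p)
    delta = Fraction(delta)

    def integral(a, b):
        # exact integral of a(x) * b(x) over (0, delta)
        total = Fraction(0)
        for i, ca in enumerate(a):
            for j, cb in enumerate(b):
                total += ca * cb * delta ** (i + j + 1) / (i + j + 1)
        return total

    n = p + 1
    # Gram matrix on (0, delta) in the basis sqrt(2k+1) P_k, orthonormal on (0, 1)
    gram = [[float(integral(polys[i], polys[j])) * math.sqrt((2 * i + 1) * (2 * j + 1))
             for j in range(n)] for i in range(n)]
    # power iteration for the largest eigenvalue, started from the constant function
    x = [1.0] + [0.0] * (n - 1)
    lam = gram[0][0]
    for _ in range(200):
        y = [sum(gram[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
        lam = sum(x[i] * gram[i][j] * x[j] for i in range(n) for j in range(n))
    return math.sqrt(lam)


delta = Fraction(1, 100)
for p in (2, 4, 8):
    # second column: computed ratio, third column: comparison value sqrt(delta) * (p + 1)
    print(p, local_to_global_ratio(p, delta), math.sqrt(delta) * (p + 1))
```

With \(\delta = \lambda p \varepsilon \), the computed ratio corresponds to the factor \(\sqrt{\lambda p\varepsilon }\,p\) in Lemma 2.8, up to the constant.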

As already mentioned, since \({S}(\lambda ,p)=S_{1}\oplus S_{\varepsilon }\) when \(\lambda p \varepsilon <1/4,\) we can uniquely decompose \({\mathcal {P}} _{0}u\) into components in \(S_{1}\) and \(S_{\varepsilon }\). The strengthened Cauchy–Schwarz inequality of Lemma 2.8 allows us to quantify the size of these contributions:

Lemma 2.9

(stability of \({\mathcal P}_0)\) There exist constants \(C\), \(c>0\) depending solely on \(\inf _{x \in I} b(x) > 0\) and \(\Vert b\Vert _{\infty ,I}\) such that the following is true under the assumption

$$\begin{aligned} \sqrt{\lambda p\varepsilon }p\le c: \end{aligned}$$
(2.29)

For each \(z\in L^{2}(I)\), the (unique) decomposition of

$$\begin{aligned} {\mathcal {P}}_{0}z=z_{1}+z_{\varepsilon } \end{aligned}$$

into the components \(z_1 \in S_{1}\) and \(z_{\varepsilon } \in S_{\varepsilon }\) satisfies

$$\begin{aligned} \Vert z_{1}\Vert _{0,I}\le & {} C\Vert z\Vert _{0,I}, \end{aligned}$$
(2.30)
$$\begin{aligned} \Vert z_{\varepsilon }\Vert _{0,I}\le & {} C\{\Vert z\Vert _{0,I_{\varepsilon }}+\sqrt{\lambda p\varepsilon }p\Vert z\Vert _{0,I}\}. \end{aligned}$$
(2.31)

Furthermore,

$$\begin{aligned} \Vert z_{1}^\prime \Vert _{0,I}\le & {} C p^2\Vert z\Vert _{0,I}, \end{aligned}$$
(2.32)
$$\begin{aligned} \Vert z_{\varepsilon }^\prime \Vert _{0,I}\le & {} C\left\{ \frac{p^2}{\lambda p \varepsilon } \Vert z\Vert _{0,I_{\varepsilon }} +(\lambda p\varepsilon )^{-1/2} p^3\Vert z\Vert _{0,I}\right\} . \end{aligned}$$
(2.33)

Proof

Before we start with the proof of (2.30), (2.31), we mention that (2.30) follows by fairly standard arguments. Indeed, the smallness assumption (2.29) on \(c\) implies the strengthened Cauchy–Schwarz inequality by Lemma 2.8, and for this setting, it is well-known that the contributions \(z_1\) and \(z_\varepsilon \) can be controlled in terms of the constant of the strengthened Cauchy–Schwarz inequality and \(\Vert {\mathcal P}_0 z\Vert _{0,I}\). This result produces (2.30) but not (2.31), for which we need to refine the standard analysis. This is done below. In the interest of completeness, we will nevertheless present a proof for both (2.30), (2.31).

Write \({\mathcal P}_0 z = z_1 + z_\varepsilon \) with \(z_1 \in S_1\) and \(z_\varepsilon \in S_\varepsilon \). We define the auxiliary function

$$\begin{aligned} \psi _{1,\varepsilon }:={\left\{ \begin{array}{ll} \left( 1-\frac{x}{\lambda p\varepsilon }\right) ^{p} &{}\quad \text{ if } x\in [0,\lambda p\varepsilon ] \\ 0 &{} \quad \text{ otherwise. } \end{array}\right. } \end{aligned}$$

Then \(\text {supp }\psi _{1,\varepsilon }\subset [0,\lambda p\varepsilon ]\), \(\psi _{1,\varepsilon }(0)=1\), and \(\left\| \psi _{1,\varepsilon }\right\| _{0,I_{\varepsilon }} \sim p^{-1/2}\sqrt{\lambda p\varepsilon }\). For the right endpoint we define \(\psi _{2,\varepsilon }(x):=\psi _{1,\varepsilon }(1-x)\), so that \(\text {supp }\psi _{2,\varepsilon }\subset [1-\lambda p \varepsilon , 1]\). We also define

$$\begin{aligned} \widetilde{z}_{\varepsilon }:=z_{\varepsilon }+\psi _{1,\varepsilon }z_{1}(0)+\psi _{2,\varepsilon }z_{1}(1), \end{aligned}$$

and note that \({\mathcal {P}}_0 z \in S_0(\lambda ,p)\). Thus, \((z_1 + z_\varepsilon )|_{{\partial }{I}}= 0\) so that \(\widetilde{z}_{\varepsilon }\in S_{\varepsilon }\cap H_{0}^{1}(I)\subset S_{\varepsilon }\cap S_0(\lambda ,p)\). Utilizing the inverse estimate [12, Thm. 3.92]

$$\begin{aligned} \left\| \pi \right\| _{\infty , I} \le C p \left\| \pi \right\| _{0,I} \quad \forall \; \pi \in S_{1}, \end{aligned}$$

we arrive at

$$\begin{aligned} \left\| \widetilde{z}_{\varepsilon }\right\| _{0,I}=\left\| \widetilde{z}_{\varepsilon }\right\| _{0,I_{\varepsilon }}\le C\left\{ \left\| z_{\varepsilon }\right\| _{0,I_{\varepsilon }}+p^{1/2}\sqrt{ \lambda p\varepsilon }\left\| z_{1}\right\| _{0,I}\right\} . \end{aligned}$$
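The factor \(p^{1/2}\sqrt{\lambda p\varepsilon }\) in this estimate can be traced to the \(L^2\)-norm of \(\psi _{1,\varepsilon }\). A short sketch of the computation, using the inverse estimate above to bound \(|z_1(0)|\):

```latex
\begin{aligned}
\Vert \psi_{1,\varepsilon}\Vert_{0,I_\varepsilon}^2
  &= \int_0^{\lambda p\varepsilon} \left(1-\frac{x}{\lambda p\varepsilon}\right)^{2p} dx
   = \frac{\lambda p\varepsilon}{2p+1}, \\
\Vert \psi_{1,\varepsilon}\, z_1(0)\Vert_{0,I_\varepsilon}
  &\le \sqrt{\frac{\lambda p\varepsilon}{2p+1}}\;\Vert z_1\Vert_{\infty,I}
   \le C\, p^{1/2} \sqrt{\lambda p\varepsilon}\;\Vert z_1\Vert_{0,I}.
\end{aligned}
```

The term \(\psi _{2,\varepsilon } z_1(1)\) is treated in the same way.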

The representation \({\mathcal {P}}_{0}z=z_{1}+z_{\varepsilon }\in S_0(\lambda ,p)\) also implies

$$\begin{aligned} {\mathcal {B}}_{0}(z_{1},v_{1})+{\mathcal {B}}_{0}(z_{\varepsilon },v_{1})= & {} {\mathcal {B}}_{0}({\mathcal {P}} _{0}z,v_{1}) \quad \forall \text { }v_{1}\in S_{1}, \end{aligned}$$
(2.34)
$$\begin{aligned} {\mathcal {B}}_{0}(z_{1},v_{\varepsilon })+{\mathcal {B}}_{0}(z_{\varepsilon },v_{\varepsilon })= & {} {\mathcal {B}}_{0}({\mathcal {P}} _{0}z,v_{\varepsilon })= {\mathcal {B}}_{0}(z,v_{\varepsilon })\quad \forall \text { }v_{\varepsilon }\in S_{\varepsilon }\cap S_0(\lambda ,p), \qquad \qquad \end{aligned}$$
(2.35)

where in (2.35) we used the fact that \({\mathcal {P}} _{0}\) is the \({\mathcal B}_0\)-projection onto \(S_0(\lambda ,p)\). Taking \(v_{1}=z_{1}\) in (2.34) and \(v_{\varepsilon }=\widetilde{z}_{\varepsilon }\in S_{\varepsilon }\cap S_0(\lambda ,p)\) in (2.35) yields, together with the strengthened Cauchy–Schwarz inequality of Lemma 2.8,

$$\begin{aligned} \Vert z_{1}\Vert _{0,I}^{2}\le & {} C \{ \Vert {\mathcal {P}}_{0}z\Vert _{0,I}\Vert z_{1}\Vert _{0,I}+p\sqrt{\lambda p\varepsilon }\Vert z_{\varepsilon }\Vert _{0,I}\Vert z_{1}\Vert _{0,I} \}, \end{aligned}$$
(2.36a)
$$\begin{aligned} \Vert z_{\varepsilon }\Vert _{0,I}^{2}\le & {} C \{ \Vert z\Vert _{0,I_{\varepsilon }}\Vert \widetilde{z}_{\varepsilon }\Vert _{0,I_{\varepsilon }}+p\sqrt{\lambda p\varepsilon }\Vert \widetilde{z} _{\varepsilon }\Vert _{0,I}\Vert z_{1}\Vert _{0,I}+\Vert z_{\varepsilon }\Vert _{0,I}\Vert z_{1}\Vert _{0,I}\sqrt{\lambda p\varepsilon }p^{1/2} \} \nonumber \\\le & {} C \{ \Vert z_{\varepsilon }\Vert _{0,I}\left[ \Vert z\Vert _{0,I_{\varepsilon }}+p\sqrt{\lambda p\varepsilon }\Vert z_{1}\Vert _{0,I}+ \sqrt{\lambda p \varepsilon }p^{1/2}\Vert z_{1}\Vert _{0,I}\right] \nonumber \\&+\left[ \Vert z\Vert _{0,I_{\varepsilon }}+p\sqrt{\lambda p\varepsilon }\Vert z_{1}\Vert _{0,I}\right] \sqrt{\lambda p\varepsilon } p^{1/2}\Vert z_{1}\Vert _{0,I} \}. \end{aligned}$$
(2.36b)

Estimating generously \(\sqrt{\lambda p\varepsilon }p^{1/2}\le \sqrt{\lambda p\varepsilon }p\) and using an appropriate Young inequality in (2.36b) we get

$$\begin{aligned} \Vert z_{1}\Vert _{0,I}\le & {} C \{ \Vert {\mathcal {P}}_{0}z\Vert _{0,I}+p\sqrt{\lambda p\varepsilon }\Vert z_{\varepsilon }\Vert _{0,I}\}, \end{aligned}$$
(2.37a)
$$\begin{aligned} \Vert z_{\varepsilon }\Vert _{0,I}\le & {} C \{ \Vert z\Vert _{0,I_{\varepsilon }}+p\sqrt{\lambda p\varepsilon }\Vert z_{1}\Vert _{0,I} \} . \end{aligned}$$
(2.37b)

Inserting (2.37b) in (2.37a), assuming that \(\sqrt{\lambda p\varepsilon }p\) is sufficiently small and using the stability \(\Vert {\mathcal {P}}_{0}z\Vert _{0,I}\le C \Vert z\Vert _{0,I}\) gives \(\Vert z_{1}\Vert _{0,I}\le C\Vert z\Vert _{0,I}\). Inserting this bound in (2.37b) concludes the proof of (2.30) and (2.31). Finally, the proof of (2.32), (2.33) follows from a further application of the standard polynomial inverse estimates (2.27), (2.28). \(\square \)
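
The absorption step in the last paragraph of the proof can be made explicit. As a sketch, abbreviate \(\delta := p\sqrt{\lambda p\varepsilon }\) (a shorthand introduced only for this remark) and let \(C\) denote the larger of the two constants in (2.37a), (2.37b). Inserting (2.37b) into (2.37a) then gives

$$\begin{aligned} \Vert z_{1}\Vert _{0,I}\le C\Vert {\mathcal {P}}_{0}z\Vert _{0,I}+C^{2}\delta \Vert z\Vert _{0,I_{\varepsilon }}+C^{2}\delta ^{2}\Vert z_{1}\Vert _{0,I}. \end{aligned}$$

If, for instance, \(C^{2}\delta ^{2}\le 1/2\), the last term can be absorbed in the left-hand side, which yields \(\Vert z_{1}\Vert _{0,I}\le 2C\Vert {\mathcal {P}}_{0}z\Vert _{0,I}+2C^{2}\delta \Vert z\Vert _{0,I_{\varepsilon }}\). The specific threshold \(1/2\) is chosen here for illustration; any fixed value below \(1\) works.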

2.3.3 Conclusion of the proof of Theorem 2.6

We are now in the position to prove the following

Lemma 2.10

Assume (2.2). Let \(u\) be the solution of (2.3) and let \(\lambda _0\) be given by Lemma 2.5. Let \(\lambda \in (0,\lambda _0]\) and assume that \(\lambda \), \(p\), \(\varepsilon \) satisfy (2.29). Then there exist constants \(C\), \(\beta > 0\) (independent of \(\varepsilon \in (0,1]\) and \(p\in \mathbb {N}\) but dependent on \(\lambda \)) such that

$$\begin{aligned} \Vert (u - {\mathcal {P}}_0 u)^\prime \Vert _{0,I} \le C \varepsilon ^{-1/2} e^{-\beta p}. \end{aligned}$$
(2.38)

Proof

Recall that only the case \(\lambda p \varepsilon < 1/4\) is of interest. By Lemma 2.5 we can find an approximation \(I_{p}u\in S_0(\lambda ,p)\) with

$$\begin{aligned} \Vert u-I_{p}u\Vert _{0,I}+\sqrt{\varepsilon }\Vert (u-I_{p}u)^{\prime }\Vert _{0,I}\le Ce^{-\beta p}. \end{aligned}$$
(2.39)

We stress that, while the estimate (2.20) is explicit in the parameter \(\lambda \), we have absorbed this dependence here in the constants \(C\) and \(\beta \) for simplicity of exposition.

Since \({\mathcal {P}}_0\) is a projection on \(S_0(\lambda ,p)\) and \(I_p u \in S_0(\lambda ,p)\), we can write \(u- {\mathcal {P}}_0 u=u-I_{p}u-{\mathcal {P}}_0 (u-I_{p}u)\). The first term, \(u - I_p u\), is already treated in (2.39). For the second term, \({\mathcal {P}}_0 (u-I_{p}u)\in S_0(\lambda ,p)\), we decompose \({\mathcal {P}}_0 (u-I_{p}u)=z_{1}+z_{\varepsilon }\) and use the estimates (2.32), (2.33) of Lemma 2.9 to get

$$\begin{aligned} \Vert z_{1}^{\prime }\Vert _{0,I}\lesssim & {} p^{2}\Vert u-I_{p}u\Vert _{0,I} \le Ce^{-\beta p}, \\ \Vert z_{\varepsilon }^{\prime }\Vert _{0,I}\lesssim & {} \frac{p^{2}}{\lambda p\varepsilon }\left[ \Vert u-I_{p}u\Vert _{0,I_{\varepsilon }}+\sqrt{\lambda p\varepsilon }p\Vert u-I_{p}u \Vert _{0,I}\right] . \end{aligned}$$

There are several possible ways to treat the term \(\Vert (u-I_{p}u)\Vert _{0,I_{\varepsilon }}\). A rather generous approach exploits the fact that \((u-I_{p}u)(0)=(u-I_{p}u)(1)=0\) so that we use \(z(x)=\int \nolimits _{0}^{x}z^{\prime }(t)\,dt\) and obtain

$$\begin{aligned} \Vert u-I_{p}u\Vert _{0,I_{\varepsilon }}\le C\lambda p\varepsilon \Vert (u-I_{p}u)^{\prime }\Vert _{0,I_{\varepsilon }}. \end{aligned}$$

Hence,

$$\begin{aligned} \Vert z_{\varepsilon }^{\prime }\Vert _{0,I}\lesssim \frac{p^{2}}{\lambda p \varepsilon }\left[ \lambda p \varepsilon \Vert (u-I_{p}u)^{\prime }\Vert _{0,I_{\varepsilon }}+\sqrt{\lambda p\varepsilon } p\Vert u-I_{p}u\Vert _{0,I}\right] \lesssim \varepsilon ^{-1/2}e^{-\beta p}. \end{aligned}$$

\(\square \)
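
The last inequality of the proof can be traced term by term. As a sketch, using only (2.39) and \(\Vert \cdot \Vert _{0,I_{\varepsilon }}\le \Vert \cdot \Vert _{0,I}\):

$$\begin{aligned} p^{2}\Vert (u-I_{p}u)^{\prime }\Vert _{0,I_{\varepsilon }}&\le Cp^{2}\varepsilon ^{-1/2}e^{-\beta p}, \\ \frac{p^{3}}{\sqrt{\lambda p\varepsilon }}\Vert u-I_{p}u\Vert _{0,I}&\le C\lambda ^{-1/2}p^{5/2}\varepsilon ^{-1/2}e^{-\beta p}. \end{aligned}$$

The algebraic factors \(p^{2}\) and \(p^{5/2}\) are absorbed in the exponential at the expense of a smaller rate, since \(\sup _{p\in \mathbb {N}}p^{k}e^{-\beta p/2}<\infty \) for every fixed \(k\). This is why \(C\) and \(\beta \) in (2.38) depend on \(\lambda \) and need not coincide with the constants in (2.39).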

Proof of Theorem 2.6

In view of \(\Vert u - u_{FEM}\Vert _{0,I} \le C \Vert u - u_{FEM}\Vert _{E,I} \le C e^{-\sigma p}\) by Proposition 2.3, we focus on the control of \(\sqrt{\varepsilon } \Vert (u - u_{FEM})^\prime \Vert _{0,I}\). We distinguish two cases:

Case 1 Assume that (2.29) is satisfied. Then (2.38) yields the result.

Case 2 Assume that \(\sqrt{\lambda p \varepsilon } p \ge c\) for the constant \(c\) appearing in (2.29). Then \(\varepsilon ^{-1/2} \le c^{-1} p^{3/2} \lambda ^{1/2}\) so that

$$\begin{aligned} \sqrt{\varepsilon } \Vert (u - u_{FEM})^\prime \Vert _{0,I}\le & {} \varepsilon ^{-1/2} \Vert u - u_{FEM}\Vert _{E,I}\\\le & {} c^{-1} \lambda ^{1/2} p^{3/2} \Vert u - u_{FEM}\Vert _{E,I} \lesssim e^{-\sigma p}, \end{aligned}$$

which concludes the proof.\(\square \)

2.4 Numerical example

To illustrate the theoretical findings presented above, we show in Fig. 1 the results of numerical computations for the following problem:

$$\begin{aligned} -\varepsilon ^2 u^{\prime \prime }(x)+u(x)= & {} \left( x+\frac{1}{2}\right) ^{-1},\quad x\in (0,1), \\ u(0)= & {} u(1)=0. \end{aligned}$$

We use the Spectral Boundary Layer mesh \(\Delta _{BL}(\lambda ,p)\) with \(\lambda = 1\) and polynomials of degree \(p\) which we increase from \(1\) to \(5\) to improve accuracy. We select \(\varepsilon = 10^{-j}\), \(j=4,\ldots ,8\). We note \(\mathrm{dim} S_0(\lambda ,p) = 2+3(p-1)\). Since no exact solution is available, we use a reference solution to estimate the error. In Fig. 1, we present the error in the balanced norm (2.10) versus the polynomial degree \(p\) as well as the error \(\varepsilon ^{1/2} \Vert (u - u_{FEM})^\prime \Vert _{0,I}\) and the \(L^2\)-error. The error curves are on top of each other, which supports the robust exponential convergence in the balanced norm.
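
The dimension count stated above can be reproduced with a few lines of Python. This is a minimal sketch, not the code used for the experiments: the function names are hypothetical, and the threshold at which the needle elements are switched on is an assumption of this sketch (it plays no role for the parameter values of the example, where \(\lambda p\varepsilon \) is tiny).

```python
def spectral_bl_mesh_1d(lam, p, eps, threshold=0.25):
    """Nodes of a two-sided boundary layer mesh on [0, 1].

    Assumption: needle elements of width lam*p*eps are used when
    lam*p*eps < threshold; otherwise a single element is used.
    The default threshold is an assumption of this sketch.
    """
    h = lam * p * eps
    if h >= threshold:
        return [0.0, 1.0]
    return [0.0, h, 1.0 - h, 1.0]


def dim_S0(nodes, p):
    """Dimension of continuous piecewise polynomials of degree p vanishing at 0 and 1."""
    n_elements = len(nodes) - 1
    interior_nodes = n_elements - 1        # nodal degrees of freedom
    interior_modes = n_elements * (p - 1)  # element-internal degrees of freedom
    return interior_nodes + interior_modes


# Parameter ranges of the numerical example: lambda = 1, p = 1..5, eps = 1e-4..1e-8.
for p in range(1, 6):
    for j in range(4, 9):
        nodes = spectral_bl_mesh_1d(1.0, p, 10.0 ** (-j))
        assert dim_S0(nodes, p) == 2 + 3 * (p - 1)
```

For all parameter combinations of the example the mesh has three elements, so the count is two interior nodes plus \(p-1\) internal modes per element.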

Fig. 1 Convergence on Spectral Boundary Layer meshes. Top: convergence in the balanced norm. Bottom left: error \(\varepsilon ^{1/2} \Vert (u - u_{FEM})^\prime \Vert _{0,I}\) versus \(p\). Bottom right: convergence in \(L^2\).

3 The two-dimensional case

The ideas of the previous section carry over to the two-dimensional case. We consider the following boundary value problem: Find \(u\) such that

$$\begin{aligned} -\varepsilon ^{2}\Delta u+b u= & {} f \quad \text { in }\Omega \subset \mathbb {R}^{2}, \end{aligned}$$
(3.1a)
$$\begin{aligned} u= & {} 0\quad \text { on }\partial \Omega , \end{aligned}$$
(3.1b)

where \(\varepsilon \in (0,1]\), and the functions \(b\), \(f\) are given with \(b>0\) on \(\overline{\Omega }\). We assume that the data of the problem is analytic, i.e., \(\partial \Omega \) is an analytic curve and that there exist constants \(C_{f}\), \(\gamma _{f}\), \(C_{b}\), \(\gamma _{b}\), \(c_b >0\) such that

$$\begin{aligned} \left\{ \begin{array}{ll} \left\| \nabla ^{n}f\right\| _{\infty ,\Omega }\le C_{f}\gamma _{f}^{n}n! &{} \quad \forall \;n\in \mathbb {N}_{0}, \\ \left\| \nabla ^{n}b\right\| _{\infty ,\Omega }\le C_{b}\gamma _{b}^{n}n! &{} \quad \forall \;n\in \mathbb {N}_{0}, \\ \inf _{x \in \Omega } b(x) \ge c_b > 0 . \end{array} \right. \end{aligned}$$
(3.2)

The variational formulation of (3.1a), (3.1b) reads: Find \(u\in H_{0}^{1}\left( \Omega \right) \) such that

$$\begin{aligned} {\mathcal {B}}_{\varepsilon }(u,v):=\varepsilon ^2 \left\langle \nabla u,\nabla v\right\rangle _\Omega + \left\langle b u,v\right\rangle _{\Omega } =F(v):= \left\langle f,v\right\rangle _{\Omega } \quad \forall v\in H_{0}^{1}\left( \Omega \right) , \end{aligned}$$
(3.3)

where \(\left\langle {\cdot ,\cdot }\right\rangle _{\Omega } \) denotes the usual \(L^{2}(\Omega )\) inner product. As in 1D, the energy norm \(\Vert \cdot \Vert _{E,\Omega }\) and the balanced norm \(\Vert \cdot \Vert _{balanced,\Omega }\) are defined by

$$\begin{aligned} \Vert v\Vert ^2_{E,\Omega }:= & {} {\mathcal B}_\varepsilon (v,v), \end{aligned}$$
(3.4)
$$\begin{aligned} \Vert v\Vert ^2_{balanced,\Omega }:= & {} \Vert v\Vert ^2_{0,\Omega } + \varepsilon \Vert \nabla v\Vert ^2_{0,\Omega }. \end{aligned}$$
(3.5)
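
The difference in the \(\varepsilon \)-scaling of (3.4) and (3.5) is easiest to see on a model layer function. The following sketch evaluates both norms in closed form for \(v(x) = e^{-x/\varepsilon }\) on \((0,1)\) in the 1D setting with \(b \equiv 1\); the function name `layer_norms` and the choice of \(v\) are assumptions made for illustration (in particular, \(v\) does not satisfy the boundary conditions).

```python
import math


def layer_norms(eps):
    """Energy and balanced norms of v(x) = exp(-x/eps) on (0, 1) with b = 1."""
    decay = -math.expm1(-2.0 / eps)   # 1 - exp(-2/eps)
    l2_sq = 0.5 * eps * decay         # ||v||_0^2
    h1_sq = 0.5 * decay / eps         # ||v'||_0^2
    energy = math.sqrt(eps ** 2 * h1_sq + l2_sq)
    balanced = math.sqrt(l2_sq + eps * h1_sq)
    return energy, balanced


for j in (2, 4, 6, 8):
    eps = 10.0 ** (-j)
    energy, balanced = layer_norms(eps)
    print(f"eps=1e-{j}: energy={energy:.3e}, balanced={balanced:.3e}")
```

In this model case the energy norm of the layer behaves like \(\varepsilon ^{1/2}\), whereas the balanced norm stays bounded away from zero as \(\varepsilon \rightarrow 0\).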

The discrete version of (3.3) reads: find \(u_{FEM}\in V_{N}\subset H_{0}^{1}\left( \Omega \right) \) such that (3.3) holds for all \(v\in V_{N}\subset H_{0}^{1}\left( \Omega \right) \), with \(u\) replaced by \(u_{FEM}\), where the subspace \(V_{N}\) will be defined shortly.

3.1 Meshes and spaces

Concerning the meshes and the \(hp\)-FEM space based on these meshes, we adopt the simplest case that generalizes our 1D analysis to 2D: The elements are (curvilinear) quadrilaterals and the needle elements required to resolve the boundary layer are obtained as mappings of needle elements of a reference configuration. This approach is discussed in more detail in [7, Sec. 3.1.2] and expanded as the notion of “patchwise structured meshes” in [4, Sec. 3.3.2].

Our \(hp\)-FEM spaces have the following general structure: Let \(\Delta =\left\{ \Omega _{i}\right\} _{i=1}^{N}\) be a mesh consisting of curvilinear quadrilaterals \(\Omega _{i}\), \(i=1,\ldots ,N\), subject to the usual restrictions (see, e.g., [7]) and associate with each \(\Omega _{i}\) a bijective, Lipschitz continuous (further smoothness assumptions are imposed below) element mapping \(M_{i}:S_{ST}\rightarrow \overline{\Omega }_{i},\) where \(S_{ST}=[0,1]^{2}\) denotes the usual reference square. With \(Q_{p}(S_{ST})\) the space of polynomials of degree \(p\) (in each variable) on \(S_{ST}\), we set

$$\begin{aligned} \mathcal {S}^{p}(\Delta )= & {} \left\{ u\in H^{1}\left( \Omega \right) :\left. u\right| _{\Omega _{i}}\circ M_{i} \in Q_p(S_{ST}),\quad i=1,\ldots ,N\right\} ,\\ \mathcal {S}_{0}^{p}(\Delta )= & {} \mathcal {S}^{p}(\Delta )\cap H_{0}^{1}(\Omega ). \end{aligned}$$

We now describe the mesh \(\Delta \) and the element maps that we will use (see Fig. 2). Our starting point is a fixed mesh \(\Delta _A\) (the subscript “\(A\)” stands for “asymptotic”) consisting of curvilinear quadrilateral elements \(\Omega _i\), \(i=1,\ldots ,N^\prime \). These elements \(\Omega _i\) are the images of the reference square \(S_{ST}= [0,1]^2\) under the element maps \(M_{A,i}\), \(i=1,\ldots ,N^\prime \) (we added the subscript “\(A\)” to emphasize that they correspond to the asymptotic mesh \(\Delta _A\)). They are assumed to satisfy the conditions (M1)–(M3) of [7] in order to ensure that the space \({\mathcal S}^p(\Delta _A)\) has suitable approximation properties. The element maps \(M_{A,i}\) are assumed to be analytic with analytic inverse; that is, as in [7] we require for some constants \(C_1\), \(C_2\), \(\gamma > 0\)

$$\begin{aligned} \Vert (M_{A,i}^\prime )^{-1} \Vert _{\infty ,S_{ST}} \le C_1, \quad \Vert D^\alpha M_{A,i}\Vert _{\infty ,S_{ST}} \le C_2 \alpha ! \gamma ^{|\alpha |}\,\, \forall \alpha \in \mathbb {N}_0^2, \quad i=1,\ldots ,N^\prime . \end{aligned}$$

We furthermore assume that elements do not have a single vertex on the boundary \(\partial \Omega \) but only complete, single edges, i.e., the following dichotomy holds:

$$\begin{aligned} \text{ either } \quad \overline{\Omega _i} \cap \partial \Omega = \emptyset \quad \text{ or } \overline{\Omega _i} \cap \partial \Omega \quad \text{ is } \text{ a } \text{ single } \text{ edge } \text{ of }~\Omega _i. \end{aligned}$$
(3.6)

Edges of curvilinear quadrilaterals are, of course, the images of the edges of \(S_{ST}\) under the element maps. For notational convenience, we assume that the edges lying on \(\partial \Omega \) are the image of the edge \(\{0\} \times [0,1]\) under the element map. It then follows that these elements have one edge on \(\partial \Omega \) and the images of the edges \(\{y = 1\}\) and \(\{y = 0\}\) of \(S_{ST}\) are shared with elements that likewise have one edge on \(\partial \Omega \). For notational convenience, we assume that the elements at the boundary are numbered first, i.e., they are the elements \(\Omega _i\), \(i=1,\ldots ,n < N^\prime \). For a parameter \(\lambda > 0\) and a degree \(p \in \mathbb {N}\), the boundary layer mesh \(\Delta _{BL} = \Delta _{BL}(\lambda ,p)\) is defined as follows.

Definition 3.1

(Spectral Boundary Layer mesh \(\Delta _{BL}(\lambda ,p)\)) Given parameters \(\lambda > 0\), \(p \in \mathbb {N}\), \(\varepsilon \in (0,1]\) and the asymptotic mesh \(\Delta _A\), the mesh \(\Delta _{BL}(\lambda ,p)\) is defined as follows:

  1. \(\lambda p\varepsilon \ge 1/2\). In this case we are in the asymptotic regime, and we use the asymptotic mesh \(\Delta _{A}\).

  2. \(\lambda p\varepsilon <1/2\). In this regime, we need to define so-called needle elements. This is done by splitting the elements \(\Omega _{i}\), \(i=1,\ldots ,n\), into two elements \(\Omega _{i}^{need}\) and \(\Omega _{i}^{reg}\). To that end, split the reference square \(S_{ST}\) into two elements

    $$\begin{aligned} S^{need}=\left[ 0,\lambda p\varepsilon \right] \times [0,1], \qquad S^{reg}= \left[ \lambda p\varepsilon ,1\right] \times [0,1], \end{aligned}$$

    and define the elements \(\Omega _i^{need}\), \(\Omega _i^{reg}\) as the images of these two elements under the element map \(M_{A,i}\) and the corresponding element maps as the concatenation of the affine maps

    $$\begin{aligned}&A^{need}: S_{ST}\rightarrow S^{need}, \qquad (\xi ,\eta ) \rightarrow (\lambda p \varepsilon \xi , \eta ), \\&A^{reg}: S_{ST}\rightarrow S^{reg}, \qquad (\xi ,\eta ) \rightarrow (\lambda p \varepsilon + (1-\lambda p \varepsilon ) \xi , \eta ) \end{aligned}$$

    with the element map \(M_{A,i}\), i.e., \(M_i^{need} = M_{A,i} \circ A^{need}\) and \(M_i^{reg} = M_{A,i} \circ A^{reg}\). Explicitly:

    $$\begin{aligned} \Omega _{i}^{need}&=M_{A,i}\left( S^{need}\right) ,&\Omega _{i}^{reg}&=M_{A,i}\left( S^{reg}\right) , \\ M_{i}^{need}(\xi ,\eta )&=M_{A,i}\left( \lambda p\varepsilon \xi ,\eta \right) ,&M_{i}^{reg}(\xi ,\eta )&=M_{A,i}\left( \lambda p\varepsilon +(1-\lambda p\varepsilon )\xi ,\eta \right) . \end{aligned}$$
Fig. 2 Example of an admissible mesh. Left: asymptotic mesh \(\Delta _A\). Right: boundary layer mesh \(\Delta _{BL}\).

In Fig. 2 we show an example of such a mesh construction on the unit circle. In total, the mesh \(\Delta _{BL}(\lambda ,p)\) consists of \(N = N^\prime +n\) elements if \(\lambda p \varepsilon <1/2\).
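
The element maps of Definition 3.1 can be sketched in code. In the following, `A_need` and `A_reg` are the affine maps of the definition, while `M_A` is a hypothetical analytic element map (a sector of an annulus inside the unit disk) chosen only for illustration; it is not the map used for Fig. 2. The parameter `h` stands for \(\lambda p\varepsilon \).

```python
import math


def A_need(xi, eta, h):
    """Affine map S_ST -> S^need = [0, h] x [0, 1]."""
    return h * xi, eta


def A_reg(xi, eta, h):
    """Affine map S_ST -> S^reg = [h, 1] x [0, 1]."""
    return h + (1.0 - h) * xi, eta


def M_A(x, y, r_inner=0.5, theta0=0.0, theta1=math.pi / 2):
    """Hypothetical element map: the edge {x = 0} is mapped to an arc of the unit circle."""
    r = 1.0 - (1.0 - r_inner) * x
    theta = theta0 + (theta1 - theta0) * y
    return r * math.cos(theta), r * math.sin(theta)


def M_need(xi, eta, h):
    """Element map of the needle element: M_A composed with A_need."""
    return M_A(*A_need(xi, eta, h))


def M_reg(xi, eta, h):
    """Element map of the regular element: M_A composed with A_reg."""
    return M_A(*A_reg(xi, eta, h))


h = 1.0 * 4 * 1e-3  # lambda = 1, p = 4, eps = 1e-3
# The needle and the regular element share the image of the line {x = h}.
assert M_need(1.0, 0.3, h) == M_reg(0.0, 0.3, h)
```

The needle element has width proportional to \(\lambda p\varepsilon \) in the direction normal to the boundary and width of order one in the tangential direction.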

Anticipating that we will need, for the case \(\lambda p \varepsilon <1/2\), a decomposition of

$$\begin{aligned} S(\lambda ,p):= {\mathcal S}^{p} (\Delta _{BL}(\lambda ,p)) \end{aligned}$$

into two spaces reflecting the two scales present, we proceed as follows: With \(\Delta _{A}\) the asymptotic (coarse) mesh that resolves the geometry we set

$$\begin{aligned} S_{1}:= & {} \mathcal {S}^{p}(\Delta _{A}), \end{aligned}$$
(3.7)
$$\begin{aligned} S_{\varepsilon }:= & {} \{v \in {\mathcal S}^p(\Delta _{BL}(\lambda ,p)) \,|\, \mathrm{supp}\, v \subset \overline{\Omega }_{\lambda p \varepsilon }\}, \end{aligned}$$
(3.8)

where the boundary layer region \(\Omega _{\lambda p \varepsilon }\) is defined as

$$\begin{aligned} \Omega _{\lambda p\varepsilon }=\overset{n}{\underset{i=1}{\cup }}\Omega _{i}^{need} . \end{aligned}$$
(3.9)

As in the 1D situation, our approximation space \({\mathcal S}^p(\Delta _{BL}(\lambda ,p))\) can be written as a direct sum of \(S_1\) and \(S_\varepsilon \) if \(\lambda p \varepsilon <1/2\):

Lemma 3.2

Let \(\lambda p \varepsilon < 1/2\). Then \({\mathcal S}^p(\Delta _{BL}(\lambda ,p))\) is the direct sum \(S_1 \oplus S_\varepsilon \). Furthermore, we have the inverse estimates

$$\begin{aligned} \Vert u\Vert _{0,\partial \Omega _{i}}\le & {} Cp\Vert u\Vert _{0,\Omega _{i}}\quad \forall u\in S_{1},\quad i=1,\ldots ,N^\prime , \end{aligned}$$
(3.10)
$$\begin{aligned} |u|_{1,\Omega _{i}}\le & {} Cp^{2}\Vert u\Vert _{0,\Omega _{i}}\quad \forall u\in S_{1},\quad i=1,\ldots ,N^\prime , \end{aligned}$$
(3.11)
$$\begin{aligned} |u|_{1,\Omega _{i}}\le & {} C\frac{p^{2}}{\lambda p\varepsilon }\Vert u\Vert _{0,\Omega _{i}}\quad \forall u\in S_{\varepsilon },\quad i=1,\ldots ,n. \end{aligned}$$
(3.12)

Proof

The claim that \({\mathcal S}^p(\Delta _{BL}(\lambda ,p)) = S_1 \oplus S_\varepsilon \) follows from the way \(\Delta _{BL}(\lambda ,p)\) is constructed. Let \(z \in {\mathcal S}^p(\Delta _{BL}(\lambda ,p))\). Define \(z_1 \in S_1\) as follows: For the internal elements \(\Omega _i\) with \(i=n+1,\ldots ,N^\prime \) take \(z_1|_{\Omega _i}:= z|_{\Omega _i}\). For \(\Omega _i\), \(i \in \{1,\ldots ,n\}\), which is further decomposed into \(\Omega _i^{need}\) and \(\Omega _i^{reg}\), we consider the pull-back \(\widetilde{z}_i:= z|_{\Omega _i} \circ M_{A,i}\). This pull-back \(\widetilde{z}_i\) is a piecewise polynomial on \(S_{ST}= S^{need} \cup S^{reg}\). Define the polynomial \(\widehat{z}_i \in Q_p(S_{ST})\) on the full reference element \(S_{ST}\) by the condition

$$\begin{aligned} \widehat{z}_i|_{S^{reg}} = \widetilde{z}_i|_{S^{reg}} \end{aligned}$$

and then set \(z_1|_{\Omega _i}:= \widehat{z}_i \circ M_{A,i}^{-1}\); that is, the restriction \(\widetilde{z}_i|_{S^{reg}}\) is extended polynomially to \(S_{ST}\). In this way, the function \(z_1\) is defined elementwise, and the assumptions on the element maps \(M_{A,i}\) of the asymptotic mesh \(\Delta _A\) ensure that \(z_1 \in H^1(\Omega )\), i.e., \(z_1 \in S_1\). Since by construction \(z|_{\Omega _i^{reg}} = z_1|_{\Omega _i^{reg}}\) for \(i=1,\ldots ,n\), we conclude that \(\mathrm{supp} (z - z_1) \subset \overline{\Omega }_{\lambda p \varepsilon }\) and therefore \(z_\varepsilon := z - z_1 \in S_\varepsilon \). The construction also shows the uniqueness of the decomposition.
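
The construction is perhaps most transparent in a 1D analogue with the two elements \([0,\xi ]\) and \([\xi ,1]\). The following is a minimal sketch under that simplification; the helper names and the representation of polynomials by monomial coefficient lists are assumptions of the sketch.

```python
def polyval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**k by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def split_two_scale(z_need, z_reg, xi, tol=1e-12):
    """Split z (continuous, piecewise polynomial on [0, xi] and [xi, 1]) as z_1 + z_eps.

    z_need, z_reg: monomial coefficients of z on [0, xi] and on [xi, 1].
    Returns the coefficients of z_1 (valid on all of [0, 1]) and of z_eps
    restricted to [0, xi]; z_eps vanishes identically on [xi, 1].
    """
    if abs(polyval(z_need, xi) - polyval(z_reg, xi)) > tol:
        raise ValueError("z must be continuous at xi")
    n = max(len(z_need), len(z_reg))
    need = list(z_need) + [0.0] * (n - len(z_need))
    reg = list(z_reg) + [0.0] * (n - len(z_reg))
    z_1 = list(z_reg)                                # polynomial extension of the regular piece
    z_eps_need = [a - b for a, b in zip(need, reg)]  # supported in the needle element
    return z_1, z_eps_need


xi = 0.05
z_reg = [1.0, -2.0, 0.5]
z_need = [1.0, -2.0 - 3.0 * xi, 3.5]  # equals z_reg + 3 x (x - xi)
z_1, z_eps_need = split_two_scale(z_need, z_reg, xi)
```

Since \(z_\varepsilon \) vanishes at \(\xi \) and on \([\xi ,1]\), it is continuous and supported in the needle element, as required for membership in \(S_\varepsilon \).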

The inverse estimates (3.10), (3.11), (3.12) can be seen as follows. The estimate (3.11) is an easy consequence of the assumptions on the element maps \(M_{A,i}\) of the asymptotic mesh \(\Delta _A\) and the polynomial inverse estimates [12, Thm. 4.76]. In a similar manner, the inverse estimate (3.10), which estimates the \(L^2\)-norm on the boundary \(\partial \Omega _i\) of \(\Omega _i\) by the \(L^2\)-norm on \(\Omega _i\), follows from a suitable application of 1D inverse estimates (cf. [12, eqn. (3.6.4)]).

For the estimate (3.12), we note that for an element \(\Omega _i^{need}\), we can estimate for any \(v \in S_\varepsilon \) again with assumptions on the element maps \(M_{A,i}\)

$$\begin{aligned} \Vert \nabla v\Vert _{0,\Omega _i^{need}} \sim \Vert \nabla (v \circ M_{A,i}) \Vert _{0,S^{need}} \le C \frac{p^2}{\lambda p \varepsilon } \Vert v \circ M_{A,i}\Vert _{0,S^{need}} \sim C \frac{p^2}{\lambda p \varepsilon } \Vert v \Vert _{0,\Omega _i^{need}}, \end{aligned}$$

where we exploited that \(v \circ M_{A,i}\) is a polynomial of degree \(p\) and used the inverse estimate [12, Thm. 3.91]. \(\square \)

We mention already at this point that we will quantify the contributions \(z_1\) and \(z_\varepsilon \) of this decomposition in Lemma 3.9 ahead. We close this section by pointing out that in our setting, one has very good control over the element maps: There exists a constant \(C > 0\) (depending solely on the asymptotic mesh \(\Delta _A\)) such that

$$\begin{aligned} \Vert M^\prime _{A,i} \Vert _{\infty ,S_{ST}} + \Vert (M^\prime _{A,i})^{-1} \Vert _{\infty ,S_{ST}}\le & {} C, \quad i=1,\ldots ,N^\prime , \end{aligned}$$
(3.13a)
$$\begin{aligned} \Vert (M^{reg}_{i})^\prime \Vert _{\infty ,S_{ST}} + \Vert ((M^{reg}_{i})^\prime )^{-1} \Vert _{\infty ,S_{ST}}\le & {} C, \quad i=1,\ldots ,n, \end{aligned}$$
(3.13b)
$$\begin{aligned} \Vert (M^{need}_{i})^\prime \Vert _{\infty ,S_{ST}} + \Vert ((M^{need}_{i})^\prime )^{-1} \Vert _{\infty ,S_{ST}}\le & {} C \frac{1}{\lambda p \varepsilon }, \quad i=1,\ldots ,n.\qquad \quad \end{aligned}$$
(3.13c)

3.2 Approximation properties of the Spectral Boundary Layer mesh

By construction, the resulting mesh (in the case \(\lambda p\varepsilon <1/2\))

$$\begin{aligned} \Delta _{BL} = \Delta _{BL}(\lambda ,p) =\left\{ \Omega _{1}^{need},\ldots ,\Omega _{n}^{need}, \Omega _{1}^{reg},\ldots ,\Omega _{n}^{reg},\Omega _{n+1},\ldots ,\Omega _{N^\prime }\right\} \end{aligned}$$

is a regular admissible mesh in the sense of [7]. Therefore, [7] gives that the space

$$\begin{aligned} S_0(\lambda ,p):= {\mathcal S}^p_0(\Delta _{BL}(\lambda ,p)) \end{aligned}$$

has the following approximation properties:

Proposition 3.3

([7]) Let \(u\) be the solution to (3.3) and assume that (3.2) holds. Then there exist constants \(\lambda _0\), \(\lambda _1\), \(C\), \(\beta > 0\) independent of \(\varepsilon \in (0,1]\) and \(p \in \mathbb {N}\), such that the following is true: For every \(p\) and every \(\lambda \in (0,\lambda _0]\) with \(\lambda p \ge \lambda _1\) there exists \(\pi _{p} u \in \mathcal {S}_{0}^{{p}}(\Delta _{BL}(\lambda ,p) )\) such that

$$\begin{aligned} \left\| u-\pi _{p} u \right\| _{\infty , \Omega }+\varepsilon \left\| \nabla (u-\pi _{p} u) \right\| _{\infty ,\Omega }\le Cp^{2}\left( \ln p+1\right) ^{2}e^{-\beta p\lambda }. \end{aligned}$$

We mention in passing that Proposition 3.3 provides robust exponential convergence in the energy norm. However, as in the 1D case of Lemma 2.5, we can modify the boundary layer part of the approximant of Proposition 3.3, so as to be able to approximate at a robust exponential rate in the balanced norm. This is achieved with the following 2D analog of Lemma 2.4.

Lemma 3.4

Let \(v\) be defined on \(S = [0,1]^2\), and let \(v\) be analytic on \([0,d_0] \times [0,1]\) for some fixed \(d_0 \in (0,1]\). Assume that for some \(C_v\), \(\gamma _v>0\) and \(\varepsilon \in (0,1]\), the function \(v\) satisfies the following hypotheses:

  1. (R1)

    For every \(\xi \in (0,d_0)\), the stretched function \(\widehat{v}_{\xi }: S \rightarrow \mathbb {R}\) given by \(\widehat{v}_{\xi } (x,y):= v(x \xi ,y)\), satisfies

    $$\begin{aligned} \Vert D^\alpha \widehat{v}_{\xi } \Vert _{\infty ,S} \le C_v \gamma _v^{|\alpha |} \max \{|\alpha |+1,\xi /\varepsilon \}^{|\alpha |} \qquad \forall \alpha \in \mathbb {N}_0^2. \end{aligned}$$
  2. (R2)

    The function \(v\) satisfies

    $$\begin{aligned} \sup _{y \in [0,1]} |\nabla ^n v(x,y)| \le C_v \varepsilon ^{-n} e^{-x/\varepsilon } \quad \forall x \in [0,1], \quad n \in \{0,1\}. \end{aligned}$$

Then there are constants \(C\), \(\beta \), \(\eta > 0\) (depending only on \(\gamma _v\)) such that under the assumption

$$\begin{aligned} \frac{\xi }{p \varepsilon } \le \eta , \end{aligned}$$

the following is true for the mesh \(\Delta _\xi = \{S_\xi ^{need}, S_\xi ^{reg}\}\) with \(S_\xi ^{need}:= [0,\xi ]\times [0,1]\) and \(S_\xi ^{reg}:= [\xi ,1]\times [0,1]\): There is a piecewise polynomial approximation \(I_p v \in {\mathcal S}^{p}(\Delta _\xi )\) with the following properties:

  1. (i)

    On the two edges \(x = 0\) and \(x = 1\) of \(S\), the approximation \(I_p v\) coincides with the Gauß–Lobatto interpolant of \(v\). On the edge \((0,\xi ) \times \{0\}\), \(I_p v\) is given by the Gauß–Lobatto interpolant corrected by \((1-\sqrt{\varepsilon }) \frac{x}{\xi } v(\xi ,0)\) (so that \((I_p v)(\xi ,0) = \sqrt{\varepsilon } v(\xi ,0)\)), and on the edge \((\xi ,1) \times \{0\}\), \(I_p v\) is the linear polynomial interpolating the values \(\sqrt{\varepsilon } v(\xi ,0)\) and \(v(1,0)\) at the endpoints. \(I_p v\) is defined analogously on the edges \((0,\xi ) \times \{1\}\) and \((\xi ,1) \times \{1\}\).

  2. (ii)

    The approximation \(I_p v\) satisfies

    $$\begin{aligned}&\Vert (v - I_p v)\Vert _{\infty ,S_\xi ^{need}} + \xi \Vert \partial _x (v - I_p v)\Vert _{\infty ,S_\xi ^{need}} + \Vert \partial _y (v - I_p v)\Vert _{\infty ,S_\xi ^{need} }\\&\le C C_v \left[ e^{-\beta p} + p^2 (1 + \ln p)^2 e^{-\xi /\varepsilon }\right] , \\&\Vert v - I_p v\Vert _{\infty ,S_\xi ^{reg}} \le C C_v (1+\ln p)^2 e^{-\xi /\varepsilon }, \\&\Vert v - I_p v\Vert _{0,S_\xi ^{reg}} + \varepsilon \Vert \nabla (v - I_p v)\Vert _{0,S_\xi ^{reg}} \le C C_v p^2 (1+\ln p)^2 \sqrt{\varepsilon } e^{-\xi /\varepsilon }. \end{aligned}$$

Proof

As in the corresponding 1D result (Lemma 2.4), we construct \(I_p v\) in two steps. In the first step, we study the approximation \(v_1\) which is given by the piecewise Gauß–Lobatto interpolant. In the second step, we modify \(v_1\) to obtain the additional factor \(\sqrt{\varepsilon }\) for the error in the \(L^2\)-based norms on the large element \(S_\xi ^{reg}\).

Step 1: For \(\xi \in (0,d_0)\), let \(v_1\) be the piecewise Gauß–Lobatto interpolant of \(v\) on the mesh \(\Delta _\xi \). For simplicity, we assume \(\xi \le 1/2\). The error analysis for \(v - v_1\) can be extracted from the proof of [7, Thm. 3.12]; we highlight here the main arguments for completeness’ sake. The one-dimensional Gauß–Lobatto interpolation operator \(i_p:C([0,1]) \rightarrow \Pi _p\) has the stability property \(\Vert i_p\Vert _{\infty ,[0,1]} \le C (1 + \ln p)\) by [15]. Together with a polynomial inverse estimate (Markov’s inequality) we get on \(S_\xi ^{reg}\):

$$\begin{aligned} \Vert v_1\Vert _{\infty ,S_\xi ^{reg}}&\le C (1 + \ln p)^2 \Vert v\Vert _{\infty ,S_\xi ^{reg}} \le C C_v (1 + \ln p)^2 e^{-\xi /\varepsilon }, \\ \Vert \nabla v_1\Vert _{\infty ,S_\xi ^{reg}}&\le C p^2 \Vert v_1\Vert _{\infty ,S_\xi ^{reg}}\\&\le C p^2 (1 + \ln p)^2 \Vert v\Vert _{\infty ,S_\xi ^{reg}} \le C C_v p^2 (1 + \ln p)^2 e^{-\xi /\varepsilon }. \end{aligned}$$

The error analysis for the Gauß–Lobatto interpolation on \(S_\xi ^{need}\) is achieved by (anisotropically) scaling \(S_\xi ^{need}\) to the reference element \(S = [0,1]^2\). In order to make use of the regularity properties of the stretched function \(\widehat{v}_\xi \), we first observe that for \(n \in \mathbb {N}_0\)

$$\begin{aligned} \max \{n+1,\xi /\varepsilon \}^n&= \max \left\{ (n+1)^n,\frac{1}{n!} (\xi /\varepsilon )^n n!\right\} \le \max \left\{ (n+1)^n,n! e^{\xi /\varepsilon }\right\} \\&\le n! e^{\xi /\varepsilon } \frac{(n+1)^n}{n!}\le C n! e^n e^{\xi /\varepsilon }, \end{aligned}$$

for some \(C > 0\), where the last inequality follows from Stirling’s formula. The tensor product Gauß–Lobatto interpolant \(\widehat{v}_1\) of the stretched function \(\widehat{v}_\xi \) satisfies on \(S\)

$$\begin{aligned} \Vert \widehat{v}_\xi - \widehat{v}_1 \Vert _{\infty ,S} + \Vert \nabla ( \widehat{v}_\xi - \widehat{v}_1) \Vert _{\infty ,S} \le C C_v e^{\xi /\varepsilon } e^{-\beta p}, \end{aligned}$$

for some \(C\), \(\beta > 0\) that depend solely on \(\gamma _v\). Returning to \(S_\xi ^{need}\), we get for the Gauß–Lobatto interpolation error

$$\begin{aligned} \Vert v - v_1 \Vert _{\infty ,S_\xi ^{need}} + \xi \Vert \partial _x ( v - v_1) \Vert _{\infty ,S_\xi ^{need}} + \Vert \partial _y ( v - v_1) \Vert _{\infty ,S_\xi ^{need}} \le C C_v e^{\xi /\varepsilon } e^{-\beta p}. \end{aligned}$$
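
The elementary bound \(\max \{n+1,\xi /\varepsilon \}^n \le C n! e^n e^{\xi /\varepsilon }\) used in Step 1 can be checked numerically. The sketch below works with logarithms to avoid overflow and writes \(t\) for \(\xi /\varepsilon \); the value \(C = e\) is one admissible constant assumed here for the check and is not taken from the text.

```python
import math


def log_lhs(n, t):
    """Logarithm of max{n + 1, t}^n."""
    return n * math.log(max(n + 1.0, t))


def log_rhs(n, t, C=math.e):
    """Logarithm of C * n! * e^n * e^t."""
    return math.log(C) + math.lgamma(n + 1.0) + n + t


# Largest value of log(lhs) - log(rhs) over a grid of degrees n and ratios t.
worst = max(
    log_lhs(n, t) - log_rhs(n, t)
    for n in range(0, 201)
    for t in (0.01 * 1.5 ** k for k in range(0, 40))
)
assert worst <= 0.0
```

The check covers both regimes of the maximum: \(t \le n+1\), where Stirling's formula is needed, and \(t > n+1\), where \(t^n/n! \le e^t\) suffices.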

Step 2: We define \(I_p v\) as follows (thus correcting \(v_1\)):

$$\begin{aligned} I_p v(x,y):= {\left\{ \begin{array}{ll} v_1(x,y) - (1-\sqrt{\varepsilon }) v_1(\xi ,y) \frac{x}{\xi }, \quad &{} (x,y) \in S_\xi ^{need} \\ \sqrt{\varepsilon } v_1(\xi ,y) \frac{1-x}{1-\xi } + \frac{x-\xi }{1-\xi } v_1(1,y), \quad &{} (x,y) \in S_\xi ^{reg}. \end{array}\right. } \end{aligned}$$

We note

$$\begin{aligned}&\sup _{y \in [0,1]} |v_1(\xi ,y)| \le C C_v (1 + \ln p)^2 e^{-\xi /\varepsilon }, \\&\sup _{y \in [0,1]} |\partial _y v_1(\xi ,y)| \le C C_v p^2 (1 + \ln p)^2 e^{-\xi /\varepsilon }, \\&\sup _{y \in [0,1]} |v_1(1,y)| \le C C_v (1 + \ln p)^2 e^{-1/\varepsilon },\\&\sup _{y \in [0,1]} |\partial _y v_1(1,y)| \le C C_v p^2 (1 + \ln p)^2 e^{-1/\varepsilon }. \end{aligned}$$

From this, we get on \(S_\xi ^{need}\)

$$\begin{aligned}&\Vert v - I_p v\Vert _{\infty ,S_\xi ^{need}} + \xi \Vert \partial _x (v - I_p v)\Vert _{\infty ,S_\xi ^{need}} + \Vert \partial _y (v - I_p v)\Vert _{\infty ,S_\xi ^{need}} \\&\quad \le C C_v \left[ e^{\xi /\varepsilon } e^{-\beta p} + p^2 (1+\ln p)^2 e^{-\xi /\varepsilon } \right] . \end{aligned}$$

The hypothesis \(\xi /\varepsilon \le \eta p\) implies that \(e^{\xi /\varepsilon } e^{-\beta p} \le e^{(\eta - \beta )p}\), so that \(\eta < \beta \) guarantees exponential convergence (in \(p\)). The claimed approximation properties on \(S_\xi ^{need}\) follow.

The approximation properties on \(S_\xi ^{reg}\) follow from the triangle inequality \(\Vert v - I_p v\Vert \le \Vert v\Vert + \Vert I_p v\Vert \). The control of \(I_p v\) is easily achieved by observing

$$\begin{aligned} \Vert I_p v\Vert _{\infty ,S_\xi ^{reg}}\le & {} C C_v (1 + \ln p)^2 \sqrt{\varepsilon } e^{-\xi /\varepsilon }, \\ \Vert \nabla I_p v\Vert _{\infty ,S_\xi ^{reg}}\le & {} C C_v p^2 (1 + \ln p)^2 \sqrt{\varepsilon } e^{-\xi /\varepsilon }. \end{aligned}$$

Note that we suppressed the contributions arising from \(v_1(1,\cdot )\) since our assumption \(\xi \le 1/2\) provides \(e^{-1/\varepsilon } \le C \sqrt{\varepsilon } e^{-\xi /\varepsilon }\) for some \(C > 0\). \(\square \)
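
The correction of Step 2 can be illustrated in a 1D analogue, where the dependence on \(y\) is dropped. The sketch below is hypothetical in two respects: the function name `corrected_interpolant` is introduced only here, and the exact layer function \(e^{-x/\varepsilon }\) is used as a stand-in for the Gauß–Lobatto interpolant \(v_1\).

```python
import math


def corrected_interpolant(v1, xi, eps):
    """1D analogue of I_p v on the mesh {[0, xi], [xi, 1]}, built from a given v1."""
    s = math.sqrt(eps)

    def Ip(x):
        if x <= xi:
            # needle element: subtract a linear correction so that Ip(xi) = sqrt(eps) * v1(xi)
            return v1(x) - (1.0 - s) * v1(xi) * x / xi
        # regular element: linear interpolation between sqrt(eps) * v1(xi) and v1(1)
        return s * v1(xi) * (1.0 - x) / (1.0 - xi) + (x - xi) / (1.0 - xi) * v1(1.0)

    return Ip


eps, xi = 1e-4, 2e-3
v = lambda x: math.exp(-x / eps)  # stand-in for v1 (assumption of this sketch)
Ip = corrected_interpolant(v, xi, eps)
```

By construction, the value at \(x = 0\) is unchanged, the value at the interface \(x = \xi \) is damped by the factor \(\sqrt{\varepsilon }\), and the function remains continuous across the interface.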

The improved treatment of the boundary layer contribution allows us to sharpen the approximation result of Proposition 3.3 in the balanced norm:

Corollary 3.5

Under the assumptions of Proposition 3.3, there exist constants \(\lambda _0\), \(\lambda _1\), \(C\), \(\beta > 0\) independent of \(\varepsilon \in (0,1]\) and \(p \in \mathbb {N}\), such that the following is true: For every \(p\) and every \(\lambda \in (0,\lambda _0]\) with \(\lambda p \ge \lambda _1\), there exists \(\widetilde{\pi }_{p} u \in \mathcal {S}_{0}^{{p}}(\Delta _{BL}(\lambda ,p) )\) such that

$$\begin{aligned} \Vert u- \widetilde{\pi }_{p} u\Vert _{\infty ,\Omega } + \varepsilon ^{1/2}\left\| \nabla (u-\widetilde{\pi }_{p} u) \right\| _{0,\Omega }\le Cp^{2}\left( \ln p+1\right) ^{2}e^{-\beta p\lambda }. \end{aligned}$$

Proof

In the case that the mesh \(\Delta _{BL}(\lambda ,p)\) consists of the asymptotic mesh \(\Delta _A\), we set \(\widetilde{\pi }_p u = \pi _p u\) and the proof follows easily from Proposition 3.3, since \(\varepsilon \ge 1/(2 \lambda p) \ge 1/(2 \lambda _1)\). Let, therefore, \(\Delta _{BL}(\lambda ,p)\) have needle elements, i.e., the elements \(\Omega _i\), \(i=1,\ldots ,n\) of the asymptotic mesh \(\Delta _A\) are further subdivided into \(\Omega _i^{need}\) and \(\Omega _i^{reg}\). Our starting point is the proof of Proposition 3.3 in [7]. There, the approximation is obtained by a piecewise Gauß–Lobatto interpolation of the function \(u\), which is decomposed into a smooth (analytic) part \(w\), a boundary layer part \(u^{BL}\), and a remainder \(r\):

$$\begin{aligned} u = w + u^{BL} + r. \end{aligned}$$

The approximations of the smooth part \(w\) and the remainder \(r\) are taken to be those of [7], i.e., the elementwise Gauß–Lobatto interpolants. The boundary layer part \(u^{BL}\), however, is approximated by its elementwise Gauß–Lobatto interpolant only on the elements \(\Omega _i\) with \(\overline{\Omega _i } \cap \partial \Omega = \emptyset \); on the elements at the boundary, it is approximated with the aid of the operator \(I_p\) of Lemma 3.4. Inspection of the procedure in [7] shows that the regularity hypotheses (R1), (R2) of Lemma 3.4 are satisfied and that the approximation result holds if \(\xi = \lambda p \varepsilon \) with \(\lambda \le \lambda _0\) and \(\lambda _0\) sufficiently small. \(\square \)

3.3 Robust exponential convergence in balanced norms

The main result of the paper is the following robust exponential convergence in the balanced norm:

Theorem 3.6

There is a \(\lambda _{0}>0\) depending only on the functions \(b\), \(f\) and the asymptotic mesh \(\Delta _A\) such that for every \(\lambda \in (0,\lambda _{0}]\), \(\varepsilon \in (0,1]\), \(p \in \mathbb {N}\), the \(hp\)-FEM space \(\mathcal {S}_{0}^{p}(\Delta _{BL}(\lambda ,p))\) leads to a finite element approximation \(u_{FEM}\in \mathcal {S}_{0}^{p}(\Delta _{BL}(\lambda ,p))\) satisfying

$$\begin{aligned} \sqrt{\varepsilon }\Vert \nabla (u-u_{FEM})\Vert _{0,\Omega }+ \Vert u-u_{FEM}\Vert _{0,\Omega }\le Ce^{-\beta p}; \end{aligned}$$

the constants \(C\), \(\beta >0\) depend on the choice of \(\lambda \) but are independent of \(\varepsilon \) and \(p\).

The proof is deferred to the end of the section. As a corollary, we get exponential convergence in the maximum norm.

Corollary 3.7

Let \(u\) be the solution of (3.3) and let \(u_{FEM}\in {\mathcal S}_{0}^{p}(\Delta _{BL}(\lambda ,p))\) be its finite element approximation. Then there exist constants \(C\), \(\sigma >0\) independent of \(\varepsilon \in (0,1]\) and \(p\in \mathbb {N}\) such that

$$\begin{aligned} \left\| u-u_{FEM}\right\| _{\infty , \Omega }\le C e^{-\sigma p}. \end{aligned}$$

Proof

First we note that Corollary 3.5 provides an approximation \(\pi _p u \in {\mathcal S}_{0}^{p}(\Delta _{BL}(\lambda ,p))\) with

$$\begin{aligned} \left\| u-\pi _p u \right\| _{\infty , \Omega }\le C e^{-\beta \lambda p}. \end{aligned}$$

In view of the triangle inequality \(\left\| u-u_{FEM}\right\| _{\infty , \Omega }\le \left\| u-\pi _p u \right\| _{\infty , \Omega } + \Vert \pi _p u - u_{FEM}\Vert _{\infty , \Omega }, \) we may focus on the term \(\left\| \pi _p u-u_{FEM}\right\| _{\infty , \Omega }\). It suffices to prove the result in the layer region, i.e., for the elements \(\Omega _i^{need}\), since outside \(\Omega _{\lambda p \varepsilon }\) standard inverse estimates (bounding the \(L^\infty \)-norm of polynomials by their \(L^2\)-norm up to powers of \(p\)) yield the desired bound in view of (3.13a), (3.13b).

For a needle element \(\Omega _i^{need}\) we introduce \(\widetilde{\pi }_p u:= \pi _p u|_{\Omega _i^{need}} \circ M_{A,i}\) and \(\widetilde{u}_{FEM}:= u_{FEM}|_{\Omega _i^{need}} \circ M_{A,i}\). The polynomial inverse estimate of [12, Thm. 4.76] and an affine scaling argument (between \(S_{ST}\) and \(S^{need}\)) yield

$$\begin{aligned} \left\| \pi _p u - u_{FEM}\right\| _{\infty , \Omega _i^{need}}= & {} \left\| \widetilde{\pi }_p u - \widetilde{u}_{FEM}\right\| _{\infty , S^{need}} \le C \frac{p^2}{\sqrt{\lambda p \varepsilon }} \left\| \widetilde{\pi }_p u - \widetilde{u}_{FEM}\right\| _{0,S^{need}}\\\sim & {} \frac{p^2}{\sqrt{\lambda p \varepsilon }} \left\| \pi _p u - u_{FEM}\right\| _{0,\Omega _i^{need}}, \end{aligned}$$

where in the last step we used the assumptions on the element maps \(M_{A,i}\). The triangle inequality then gives

$$\begin{aligned} \left\| \pi _p u - u_{FEM}\right\| _{\infty , \Omega _i^{need}} \le C \frac{p^2}{\sqrt{\lambda p \varepsilon }} \left[ \left\| \pi _p u - u\right\| _{0,\Omega _i^{need}} + \left\| u - u_{FEM} \right\| _{0,\Omega _i^{need}} \right] . \end{aligned}$$
(3.14)

For the first term in (3.14) we obtain from the \(L^\infty \)-bound of Corollary 3.5 and the fact that \(|\Omega _i^{need}| \sim \lambda p \varepsilon \),

$$\begin{aligned} \left\| \pi _p u - u\right\| _{0,\Omega _i^{need}} \lesssim \sqrt{\lambda p \varepsilon } e^{-\beta p}. \end{aligned}$$
(3.15)

For the second term in (3.14) we exploit the fact that \(u_{FEM} = 0 = \pi _p u\) on \(\partial \Omega \) and a 1D Poincaré inequality. To that end, we note that for any function \(\widetilde{v} \in H^1(S^{need})\) with \(\widetilde{v} = 0\) on the edge \(\{(0,y)\,|\, 0 \le y \le 1\}\) of \(S^{need} = \{(x,y)\,|\, 0 \le x \le \lambda p \varepsilon , 0 \le y \le 1\}\), we obtain from a 1D Poincaré inequality

$$\begin{aligned} \Vert \widetilde{v}\Vert _{0,S^{need}} \le C \sqrt{\lambda p \varepsilon } \Vert \partial _x \widetilde{v}\Vert _{0,S^{need} } \le C \sqrt{\lambda p \varepsilon } \Vert \nabla \widetilde{v}\Vert _{0,S^{need}}. \end{aligned}$$
(3.16)
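The scaling of the 1D Poincaré constant with the interval length can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes polynomials in the power basis on \([0,L]\) with \(v(0)=0\), and the helper names `l2_norm_on_interval` and `poincare_ratio` are illustrative. It compares the ratio \(\Vert v\Vert _{0}/\Vert v'\Vert _{0}\) with the sharp 1D constant \(2L/\pi \), which is bounded by \(\sqrt{L}\) for \(L \le 1\).

```python
import numpy as np
from numpy.polynomial import polynomial as P
from numpy.polynomial.legendre import leggauss


def l2_norm_on_interval(coeffs, length, n_quad=64):
    """L2(0, length) norm of a polynomial with ascending power-basis coefficients."""
    nodes, weights = leggauss(n_quad)      # Gauss-Legendre rule on [-1, 1]
    x = 0.5 * length * (nodes + 1.0)       # map the nodes to [0, length]
    w = 0.5 * length * weights
    return np.sqrt(np.sum(w * P.polyval(x, coeffs) ** 2))


def poincare_ratio(coeffs, length):
    """Return ||v|| / ||v'|| in L2(0, length) after enforcing v(0) = 0."""
    coeffs = np.asarray(coeffs, dtype=float).copy()
    coeffs[0] = 0.0                        # constant term removed, so v(0) = 0
    deriv = P.polyder(coeffs)
    return l2_norm_on_interval(coeffs, length) / l2_norm_on_interval(deriv, length)


rng = np.random.default_rng(0)
for length in (0.5, 1e-2, 1e-4):           # plays the role of lambda * p * eps
    worst = max(poincare_ratio(rng.standard_normal(9), length) for _ in range(200))
    # columns: L, worst observed ratio, sharp constant 2L/pi, sqrt(L)
    print(length, worst, 2 * length / np.pi, np.sqrt(length))
```

In this experiment the observed ratio stays below \(2L/\pi \) for every sample, as the sharp constant requires.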

Upon setting \(\widetilde{v}:= (u - u_{FEM})|_{\Omega _i^{need}} \circ M_{A,i}\), we may use (3.16) together with the properties of \(M_{A,i}\) to get

(3.17)

Combining (3.14), (3.15), (3.17) gives the desired result. \(\square \)

3.4 Proof of Theorem 3.6

The proof of Theorem 3.6 parallels that of the 1D case in Sect. 2. We begin by defining the bilinear form for the reduced problem,

$$\begin{aligned} {\mathcal {B}}_{0}(u,v)=\left\langle bu,v\right\rangle _{\Omega }. \end{aligned}$$
(3.18)

We also introduce the projection operator \(\mathcal {P}_{0}:L^{2}(\Omega )\rightarrow \mathcal {S}_{0}^{p}(\Delta _{BL}(\lambda ,p) )\) by the condition

$$\begin{aligned} {\mathcal {B}}_{0}\left( u-\mathcal {P}_{0}u,v\right) =0 \quad \forall v\in \mathcal {S}_{0}^{p}(\Delta _{BL}(\lambda ,p) ). \end{aligned}$$

Then, by reasoning as in (2.24) with Galerkin orthogonalities, we get

$$\begin{aligned} \left\| u_{FEM}-\mathcal {P}_{0}u\right\| _{E,\Omega }^{2}= & {} \varepsilon ^{2}\left\langle \nabla \left( u-\mathcal {P}_{0}u\right) , \nabla \left( u_{FEM}-\mathcal {P}_{0}u\right) \right\rangle _{\Omega }. \end{aligned}$$

Hence

$$\begin{aligned} \varepsilon \left\| \nabla \left( u_{FEM}-\mathcal {P}_{0}u\right) \right\| _{0,\Omega }\le \left\| u_{FEM}-\mathcal {P}_{0}u\right\| _{E,\Omega }\le \varepsilon \left\| \nabla \left( u-\mathcal {P}_{0}u\right) \right\| _{0,\Omega }. \end{aligned}$$

The key step towards showing robust exponential convergence in balanced norms is therefore to show

$$\begin{aligned} \left\| \nabla \left( u-\mathcal {P}_{0}u\right) \right\| _{0,\Omega } \le C\varepsilon ^{-1/2}e^{-\sigma p}, \end{aligned}$$

for some \(C\) and \(\sigma >0\) independent of \(\varepsilon \) and \(p\). Completely analogous to the one-dimensional case, we are therefore led to studying the \(H^{1}\)-stability of the projection operator \(\mathcal {P}_{0}\) on the Spectral Boundary Layer mesh of Definition 3.1.

Lemma 3.8

(Strengthened Cauchy–Schwarz inequality in 2D) Let \({\mathcal {B}}_{0}\) be given by (3.18). Then,

$$\begin{aligned} \left| {\mathcal {B}}_{0}\left( u,v\right) \right| \le C \min \{1,\sqrt{\lambda p\varepsilon }p\}\left\| u\right\| _{0,\Omega }\left\| v\right\| _{0,\Omega _{\lambda p \varepsilon } }\quad \forall u\in S_{1},\quad v\in S_{\varepsilon }, \end{aligned}$$

with \(S_{1}\), \(S_{\varepsilon }\) given by (3.7) and (3.8), respectively. The constant \(C > 0\) depends solely on \(\Vert b\Vert _{\infty ,\Omega }\), \(\inf _{x \in \Omega } b(x) > 0\), and the element maps of the asymptotic mesh \(\Delta _A\).

Proof

We restrict our attention to the case \(\lambda p \varepsilon <1/2\) as the “\(1\)” in the minimum is a simple consequence of the Cauchy–Schwarz inequality. With \(u \in S_1\), \(v \in S_{\varepsilon }\) there holds \({\mathcal {B}}_0 (u,v) = \iint \nolimits _{\Omega _{\lambda p \varepsilon }} b u v\). Fix \(\Omega _i^{need}\) and recall that it is obtained from an element \(\Omega _i\) (\(i \in \{1,\ldots ,n\}\)) by a splitting, i.e., \(\overline{\Omega }_i = \overline{\Omega ^{need}_i} \cup \overline{\Omega ^{reg}_i}\). The construction of \(\Delta _{BL}(\lambda ,p)\) implies that the pull-back \(\pi _1:= u|_{\Omega _i} \circ M_{A,i}\) to \(S_{ST}\) is a polynomial of degree \(p\) (in each variable) whereas the pull-back \(\pi _2:= v|_{\Omega _i} \circ M_{A,i}\) is a piecewise polynomial of degree \(p\) (in each variable) with \(\mathrm{supp}\, \pi _2 \subset S^{need}\). Upon setting \(\widehat{b}:= b|_{\Omega _i^{need}} \circ M_{A,i}\), which is uniformly bounded on \(S^{need}\), we calculate

$$\begin{aligned} \iint \nolimits _{\Omega _i} b uv \,dx\,dy&= \iint \nolimits _{\Omega _i^{need}} b u v\,dx\,dy {=} \iint \nolimits _{S^{need}} \pi _1(x,y) \pi _2(x,y) \widehat{b} |\,\mathrm{det} M_{A,i}^\prime | \,dx\, dy . \end{aligned}$$

Since \(|\mathrm{det}\, M_{A,i}^\prime |\) is bounded uniformly (in \((x,y)\)), we obtain

$$\begin{aligned} \left| \iint \nolimits _{\Omega _i^{need}} b u v \right|\le & {} C \iint \nolimits _{S^{need}} |\pi _1(x,y)| |\pi _2(x,y)| dx dy\\= & {} C \int \nolimits _{0}^{1} \int \nolimits _{0}^{\lambda p \varepsilon } |\pi _1(x,y)| |\pi _2(x,y)| dx dy. \end{aligned}$$

Now, fix \(y \in [0,1]\) and consider

$$\begin{aligned}&\int \nolimits _{0}^{\lambda p \varepsilon } |\pi _1(x,y)| |\pi _2(x,y)| dx \\&\quad \le C p \sqrt{\lambda p \varepsilon } \left[ \int \nolimits _{0}^{1} |\pi _1(x,y)|^2 dx \right] ^{1/2} \left[ \int \nolimits _{0}^{\lambda p \varepsilon } |\pi _2(x,y)|^2 dx \right] ^{1/2} \end{aligned}$$

by Lemma 2.8. Integrating in \(y\) from 0 to 1 gives

$$\begin{aligned}&\int \nolimits _0^{1}\int \nolimits _{0}^{\lambda p \varepsilon } |\pi _1(x,y)| |\pi _2(x,y)| dx dy\\&\quad \le C p \sqrt{\lambda p \varepsilon } \int \nolimits _0^1 \left[ \int \nolimits _{0}^{1} |\pi _1(x,y)|^2 dx \right] ^{1/2} \left[ \int \nolimits _{0}^{\lambda p \varepsilon } |\pi _2(x,y)|^2 dx \right] ^{1/2} dy. \end{aligned}$$

Using once more the Cauchy–Schwarz inequality, we arrive at

$$\begin{aligned} \iint \nolimits _{S^{need}} |\pi _1(x,y)| |\pi _2(x,y)| dx dy \le C p \sqrt{\lambda p \varepsilon } \Vert \pi _1 \Vert _{0,S_{ST}} \Vert \pi _2 \Vert _{0,S^{need}}. \end{aligned}$$

The assumptions on the element map \(M_{A,i}\) allow us to infer \(\Vert \pi _1 \Vert _{0,S_{ST}} \Vert \pi _2 \Vert _{0,S^{need}} \sim \Vert u \Vert _{0,\Omega _i} \Vert v \Vert _{0,\Omega _i^{need}}\), which concludes the proof. \(\square \)
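To illustrate where a factor of the form \(p\sqrt{\lambda p \varepsilon }\) can come from, the following sketch checks one possible mechanism numerically; it is not necessarily the argument behind Lemma 2.8, which is not reproduced in this section. For a polynomial \(q\) of degree \(p\) on \([0,1]\), the elementary bounds \(\Vert q\Vert _{0,(0,\delta )} \le \sqrt{\delta }\,\Vert q\Vert _{\infty ,(0,1)}\) and \(\Vert q\Vert _{\infty ,(0,1)} \le (p+1)\Vert q\Vert _{0,(0,1)}\) give \(\Vert q\Vert _{0,(0,\delta )} \le (p+1)\sqrt{\delta }\,\Vert q\Vert _{0,(0,1)}\). The names `l2_norm` and `localization_ratio` and the random test polynomials are assumptions made for this illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P
from numpy.polynomial.legendre import leggauss


def l2_norm(coeffs, a, b, n_quad=32):
    """L2(a, b) norm of a polynomial given by ascending power-basis coefficients."""
    nodes, weights = leggauss(n_quad)            # Gauss-Legendre rule on [-1, 1]
    x = 0.5 * (b - a) * nodes + 0.5 * (a + b)    # affine map to [a, b]
    w = 0.5 * (b - a) * weights
    return np.sqrt(np.sum(w * P.polyval(x, coeffs) ** 2))


def localization_ratio(coeffs, delta):
    """||q||_{L2(0,delta)} / (sqrt(delta) * ||q||_{L2(0,1)}) for a polynomial q."""
    return l2_norm(coeffs, 0.0, delta) / (np.sqrt(delta) * l2_norm(coeffs, 0.0, 1.0))


rng = np.random.default_rng(1)
for p in (2, 4, 8):
    for delta in (1e-1, 1e-3):                   # delta plays the role of lambda*p*eps
        worst = max(localization_ratio(rng.standard_normal(p + 1), delta)
                    for _ in range(500))
        # columns: degree, delta, worst observed ratio, theoretical bound p + 1
        print(p, delta, worst, p + 1)
```

Combined with the ordinary Cauchy–Schwarz inequality on \((0,\delta )\), this localization bound yields an estimate of the same shape as the one used for \(\pi _1\) and \(\pi _2\) above.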

Lemma 3.9

(Stability of \({\mathcal P}_0\)) There exist constants \(C\), \(c>0\) depending solely on \(\Vert b\Vert _{\infty ,\Omega }\), \(\inf _{x \in \Omega } b(x) > 0\), and the element maps of the asymptotic mesh \(\Delta _A\) such that the following is true under the assumption

$$\begin{aligned} \sqrt{\lambda p\varepsilon } p \le c: \end{aligned}$$
(3.19)

For each \(z \in L^2(\Omega )\), the (unique) decomposition

$$\begin{aligned} \mathcal{{P}} _{0}z=z_{1}+z_{\varepsilon } \end{aligned}$$

into the components \(z_1 \in S_{1}\) and \(z_{\varepsilon } \in S_{\varepsilon }\) satisfies

$$\begin{aligned} \Vert z_{1}\Vert _{0,\Omega }\le & {} C\Vert z\Vert _{0,\Omega }, \end{aligned}$$
(3.20)
$$\begin{aligned} \Vert z_{\varepsilon }\Vert _{0,\Omega }\le & {} C\{\Vert z\Vert _{0,\Omega _{\lambda p \varepsilon }}+\sqrt{\lambda p\varepsilon }p \Vert z\Vert _{0,\Omega }\}. \end{aligned}$$
(3.21)

Furthermore,

$$\begin{aligned} \Vert \nabla z_{1}\Vert _{0,\Omega }\le & {} C p^2 \Vert z\Vert _{0,\Omega }, \end{aligned}$$
(3.22)
$$\begin{aligned} \Vert \nabla z_{\varepsilon }\Vert _{0,\Omega }\le & {} C\frac{p^2}{\lambda p \varepsilon } \left\{ \Vert z\Vert _{0,\Omega _{\lambda p \varepsilon }} +\sqrt{\lambda p\varepsilon }p \Vert z\Vert _{0,\Omega }\right\} . \end{aligned}$$
(3.23)

Proof

The proof parallels that of Lemma 2.9. With Lemma 3.2 we can write \({\mathcal P}_0 z = z_1 + z_\varepsilon \). We define the auxiliary function \(\psi _\varepsilon \) on \(S_{ST}\) by

$$\begin{aligned} \psi _{\varepsilon }(x,y ):={\left\{ \begin{array}{ll} \left( 1-\frac{ 2x }{\lambda p\varepsilon }\right) ^{p} \quad &{} \text { if } (x,y) \in S^{need} \\ 0\quad &{} \text { otherwise.} \end{array}\right. } \end{aligned}$$

Then \(\text {supp }\psi _{\varepsilon }\subset S^{need},\psi _{\varepsilon }(0,y)=1\) and \(\left\| \psi _{\varepsilon }\right\| _{0,S_{ST}} = \left\| \psi _{\varepsilon }\right\| _{0,S^{need}}\sim p^{-1/2}\sqrt{\lambda p\varepsilon }\). We define the function \(\widetilde{z}_\varepsilon \in S_\varepsilon \) on the needle elements \(\Omega _i^{need}\) by prescribing its pull-back to \(S^{need}\):

$$\begin{aligned}&(\widetilde{z}_{\varepsilon }|_{\Omega _i^{need}} \circ M_{A,i}) (x,y)\\&\quad := (z_{\varepsilon }|_{\Omega _i^{need}} \circ M_{A,i})(x,y) +\psi _{\varepsilon }(x, y) (z_{1}|_{\Omega _i} \circ M_{A,i})(0,y), \quad (x,y) \in S^{need}; \end{aligned}$$

here, \(\Omega _i\) and \(\Omega _i^{need}\) are related to each other by \(\Omega _i = \Omega _i^{need} \cup \Omega _i^{reg}\). It is a consequence of the assumptions on the asymptotic mesh \(\Delta _A\) that the elementwise defined function \(\widetilde{z}_\varepsilon \) is in fact in \(H^1(\Omega )\) and therefore indeed \(\widetilde{z}_\varepsilon \in S_\varepsilon \). By construction, \(\widetilde{z}_\varepsilon |_{\partial \Omega } = (z_1 + z_\varepsilon )|_{\partial \Omega } = ({\mathcal P}_0 z)|_{\partial \Omega } = 0\) so that \(\widetilde{z}_\varepsilon \in S_\varepsilon \cap \mathcal {S}_{0}^{p}(\Delta _{BL}(\lambda ,p))\). Noting the product structure of \((z_\varepsilon - \widetilde{z}_\varepsilon )|_{\Omega _i^{need}} \circ M_{A,i}\) on \(S^{need}\) and the above estimate on \(\Vert \psi _\varepsilon \Vert _{0,S^{need}}\), we get for \(\widetilde{z}_\varepsilon \) with the inverse estimate (3.10),

$$\begin{aligned} \left\| \widetilde{z}_{\varepsilon }\right\| _{0,\Omega }= \left\| \widetilde{z}_{\varepsilon }\right\| _{0,\Omega _{\lambda p\varepsilon }}\le C\left\{ \left\| z_{\varepsilon }\right\| _{0,\Omega _{\lambda p \varepsilon }}+ p^{1/2}\sqrt{\lambda p\varepsilon }\left\| z_{1}\right\| _{0,\Omega }\right\} . \end{aligned}$$
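The size of \(\Vert \psi _{\varepsilon }\Vert _{0,S^{need}}\) stated above can be verified directly. With \(L = \lambda p \varepsilon \), the substitution \(t = 1 - 2x/L\) gives \(\int _0^L (1-2x/L)^{2p}\,dx = L/(2p+1)\), and the integration in \(y\) over \([0,1]\) contributes a factor 1, so \(\Vert \psi _{\varepsilon }\Vert _{0,S^{need}} = \sqrt{L/(2p+1)} \sim p^{-1/2}\sqrt{\lambda p \varepsilon }\). The sketch below checks this closed form by Gauss quadrature; the function name `psi_l2_norm` and the parameter values are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss


def psi_l2_norm(lam, p, eps):
    """L2 norm of psi_eps(x, y) = (1 - 2x/(lam*p*eps))**p on the needle reference element."""
    length = lam * p * eps
    nodes, weights = leggauss(p + 1)        # exact for polynomials up to degree 2p + 1
    x = 0.5 * length * (nodes + 1.0)        # map the nodes to [0, length]
    w = 0.5 * length * weights
    values = (1.0 - 2.0 * x / length) ** p
    # psi does not depend on y, so integrating over y in [0, 1] multiplies by 1
    return np.sqrt(np.sum(w * values ** 2))


for p in (1, 4, 16):
    lam, eps = 0.5, 1e-3                    # sample values, not taken from the paper
    closed_form = np.sqrt(lam * p * eps / (2 * p + 1))
    print(p, psi_l2_norm(lam, p, eps), closed_form)
```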

We also have in view of \({\mathcal P}_0 z = z_1 + z_\varepsilon \)

$$\begin{aligned} {\mathcal {B}}_{0}(z_{1},v_{1})+{\mathcal {B}}_{0}(z_{\varepsilon },v_{1})= & {} {\mathcal {B}}_{0}(\mathcal {P}_{0}z,v_{1}) \quad \forall v_{1}\in S_{1}, \end{aligned}$$
(3.24)
$$\begin{aligned} {\mathcal {B}}_{0}(z_{1},v_{\varepsilon })+{\mathcal {B}}_{0}(z_{\varepsilon },v_{\varepsilon })= & {} {\mathcal {B}}_{0}(\mathcal {P}_{0}z,v_{\varepsilon })={\mathcal {B}}_{0}(z,v_{\varepsilon }) \quad \forall v_{\varepsilon }\in {S}_{\varepsilon }\cap \mathcal {S}_{0}^{p}\left( \Delta _{BL}(\lambda ,p)\right) , \end{aligned}$$
(3.25)

where in (3.25) we used the fact that \(\mathcal {P}_{0}\) is the \({\mathcal B}_0\)-projection onto \(\mathcal {S}_{0}^{p}\left( \Delta _{BL}(\lambda ,p)\right) \). Taking \(v_{1}=z_{1}\) in (3.24) and \(v_{\varepsilon }=\widetilde{z}_{\varepsilon }\in S_{\varepsilon }\cap \mathcal {S}_{0}^{p}\left( \Delta _{BL}(\lambda ,p)\right) \) in (3.25) yields, together with the strengthened Cauchy–Schwarz inequality of Lemma 3.8, just as in the 1D case, the bounds (3.20), (3.21). The final estimates (3.22), (3.23) follow from (3.20), (3.21) with the aid of the inverse estimates (3.11), (3.12) of Lemma 3.2. \(\square \)

We are now in the position to prove the following

Lemma 3.10

Assume (3.2) and let \(u\) be the solution of (3.3). Let \(\lambda _0 > 0\) be given by Corollary 3.5. Assume that \(\lambda \le \lambda _0\) and that \(\lambda \), \(p\), \(\varepsilon \) satisfy (3.19). Then, for constants \(C\), \(\beta > 0\) independent of \(\varepsilon \in (0,1]\) and \(p\in \mathbb {N}\) (but depending on \(\lambda \)),

$$\begin{aligned} \Vert \nabla (u - \mathcal{{P}}_0 u) \Vert _{0,\Omega } \le C \varepsilon ^{-1/2} e^{-\beta p}. \end{aligned}$$
(3.26)

Proof

By Corollary 3.5 we can find an approximation \(\pi _{p}u\in \mathcal {S}_{0}^{p}(\Delta _{BL}(\lambda ,p))\) with \((u-\pi _{p}u)|_{\partial \Omega }=0\) such that

$$\begin{aligned} \sqrt{\varepsilon }\left\| \nabla (u-\pi _{p}u)\right\| _{0,\Omega }\le C p^{2}\left( \ln p+1\right) ^{2}e^{-\beta \lambda p}. \end{aligned}$$

Since \(\mathcal {P}_{0}(u-\pi _{p}u)\in \mathcal {S}^p_0(\Delta _{BL}(\lambda ,p))\), we decompose \({\mathcal {P}}_{0}(u-\pi _{p}u)=z_{1}+z_{\varepsilon }\) and use (3.22), (3.23),

$$\begin{aligned} \vert z_{1}\vert _{1,\Omega }\lesssim & {} p^{2}\Vert u-\pi _{p}u \Vert _{0,\Omega }\lesssim e^{-\beta p}, \end{aligned}$$
(3.27)
$$\begin{aligned} \vert z_{\varepsilon }\vert _{1,\Omega }\lesssim & {} \frac{p^{2}}{\lambda p\varepsilon }\left[ \Vert u-\pi _{p}u \Vert _{0,\Omega _{\lambda p \varepsilon }}+ \sqrt{\lambda p\varepsilon }p\Vert u-\pi _{p}u \Vert _{0,\Omega }\right] . \end{aligned}$$
(3.28)

Let us treat the term \(\Vert u-\pi _{p}u\Vert _{0,\Omega _{\lambda p \varepsilon }}\) above. Recall that \(\Omega _{\lambda p \varepsilon } = \cup _{i=1}^n \Omega _i^{need}\); from (3.15) we therefore get \(\displaystyle \Vert u-\pi _{p}u\Vert _{0,\Omega _{\lambda p \varepsilon }} \lesssim \sqrt{\lambda p \varepsilon } e^{-\beta p}. \) Furthermore, from Corollary 3.5 we readily have \(\Vert u-\pi _{p}u\Vert _{0,\Omega } \lesssim e^{-\beta p}\). Inserting these two estimates into (3.28) produces

$$\begin{aligned} \vert z_{\varepsilon }\vert _{1,\Omega }\lesssim \frac{p^{2}}{\lambda p \varepsilon } \left[ \sqrt{\lambda p \varepsilon }\, e^{-\beta p} + \sqrt{\lambda p\varepsilon }\, p\, e^{-\beta p} \right] \lesssim \varepsilon ^{-1/2}e^{-\beta p}, \end{aligned}$$

where the constant \(\beta > 0\) is suitably adjusted in each estimate. The result follows. \(\square \)
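The phrase "suitably adjusted" refers to absorbing algebraic factors in \(p\) into the exponential. A short calculation makes this explicit: for \(0< \beta ' < \beta \) and \(k \ge 0\), the function \(p \mapsto p^{k} e^{-(\beta -\beta ')p}\) attains its maximum at \(p = k/(\beta -\beta ')\), so that \(p^{k} e^{-\beta p} \le \left( \frac{k}{e(\beta -\beta ')}\right) ^{k} e^{-\beta ' p}\) for all \(p > 0\). The following sketch evaluates this constant; the function name and the sample values of \(k\), \(\beta \), \(\beta '\) are assumptions chosen for illustration.

```python
import math


def absorb_polynomial_factor(k, beta, beta_new):
    """Smallest C with p**k * exp(-beta*p) <= C * exp(-beta_new*p) for all p > 0."""
    if not 0.0 < beta_new < beta:
        raise ValueError("need 0 < beta_new < beta")
    gap = beta - beta_new
    # sup over p > 0 of p**k * exp(-gap*p), attained at p = k / gap (equals 1 for k = 0)
    return (k / (math.e * gap)) ** k


# Example: absorb the factor p**3 by halving the rate (sample values only).
C = absorb_polynomial_factor(3, beta=1.0, beta_new=0.5)
for p in (1, 6, 20, 50):
    lhs = p ** 3 * math.exp(-1.0 * p)
    rhs = C * math.exp(-0.5 * p)
    print(p, lhs, rhs)
```

The price of a smaller constant \(C\) is a smaller rate \(\beta '\), which is why the constants in the final estimate depend on \(\lambda \) but not on \(\varepsilon \) or \(p\).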

Proof of Theorem 3.6

Again, we focus only on the control of \(\sqrt{\varepsilon }\Vert \nabla (u-u_{FEM})\Vert _{0,\Omega }\). We distinguish two cases:

Case 1 Assume that (3.19) is satisfied. Then (3.26) and Lemma 2.10 yield the result.

Case 2 Assume (3.19) is not satisfied. Then \(\varepsilon \ge c^{2}p^{-3}\lambda ^{-1}\) so that

$$\begin{aligned} \sqrt{\varepsilon }\Vert \nabla (u-u_{FEM})\Vert _{0,\Omega }\le \varepsilon ^{-1/2}\Vert u-u_{FEM}\Vert _{E,\Omega }\le \frac{1}{c}\sqrt{\lambda } p^{3/2}\Vert u-u_{FEM}\Vert _{E,\Omega }\lesssim e^{-\beta p}. \end{aligned}$$
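The case distinction can be made concrete as follows. The sketch below is only an illustration: the value of the constant \(c\) from (3.19) is not quantified in the text, so `c = 1.0` is a hypothetical placeholder, and the function names are not from the paper. It classifies a parameter triple \((\lambda , p, \varepsilon )\) and checks that, in Case 2, \(\varepsilon ^{-1/2}\) is indeed bounded by \(\sqrt{\lambda }\,p^{3/2}/c\).

```python
import math


def balanced_norm_case(lam, p, eps, c):
    """Return 1 if condition (3.19), sqrt(lam*p*eps)*p <= c, holds and 2 otherwise."""
    return 1 if math.sqrt(lam * p * eps) * p <= c else 2


def case2_prefactor(lam, p, c):
    """Upper bound sqrt(lam) * p**1.5 / c for eps**(-1/2) that is valid in Case 2."""
    return math.sqrt(lam) * p ** 1.5 / c


c = 1.0                                   # hypothetical value of the constant in (3.19)
lam = 0.5                                 # sample value of lambda
for p in (2, 4, 8):
    for eps in (1e-1, 1e-3, 1e-6):
        case = balanced_norm_case(lam, p, eps, c)
        if case == 2:
            # in Case 2 the energy-norm estimate only loses an algebraic factor in p
            print(p, eps, "Case 2", eps ** -0.5, "<=", case2_prefactor(lam, p, c))
        else:
            print(p, eps, "Case 1")
```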

3.5 Numerical example

We close with a numerical example in two dimensions: We consider the problem

$$\begin{aligned} -\varepsilon ^{2}\Delta u+u= & {} 1 \quad \text { in }\Omega :=\left\{ (x,y)\,|\, 0 \le \left( \frac{x}{2}\right) ^2 + y^2 < 1 \right\} \subset \mathbb {R}^2,\\ u= & {} 0 \quad \text { on }\partial \Omega . \end{aligned}$$

We approximate the solution to this problem on the mesh shown in Fig. 3 below, using polynomials of degree \(1, \ldots , 7\).

Fig. 3. Mesh used for the two-dimensional example

In Fig. 4 we present the error

$$\begin{aligned} \max _{1\le i \le M }\left| u(r_i) - u_{FEM}(r_i) \right| , \qquad M:= 20, \end{aligned}$$

versus the polynomial degree \(p\), in a semi-log scale. The \(M\) points \(r_i\) were uniformly distributed first on the mesh line connecting the points \((8 \varepsilon ,0), (1,0)\), as highlighted in Fig. 3, and second on the generic line, of width approximately \(8\varepsilon \), within the layer starting from the boundary point \((\sqrt{2},\sqrt{2}/2)\) at a \(-45\) degree angle. Figure 4 clearly shows the robust exponential convergence in the \(L^\infty (\Omega )\)-norm of the \(hp\)-FEM on the Spectral Boundary Layer mesh.
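The error measure above is a discrete maximum over sample points on a line segment. A minimal sketch of such an evaluation is given below. It is not the code used for the experiments: the callables `u` and `u_fem` are hypothetical stand-ins for the exact and discrete solutions, and including both endpoints of the segment in the uniform distribution is an assumption.

```python
import numpy as np


def sample_segment(start, end, m=20):
    """Return m uniformly spaced points on the segment from start to end (endpoints included)."""
    t = np.linspace(0.0, 1.0, m)[:, None]
    return (1.0 - t) * np.asarray(start, dtype=float) + t * np.asarray(end, dtype=float)


def max_pointwise_error(u, u_fem, points):
    """Discrete maximum of |u(r_i) - u_fem(r_i)| over the given sample points r_i."""
    return max(abs(u(x, y) - u_fem(x, y)) for x, y in points)


# Usage with placeholder functions (illustration only, not a solution of the problem):
eps = 1e-2
points = sample_segment((8 * eps, 0.0), (1.0, 0.0), m=20)
err = max_pointwise_error(lambda x, y: x, lambda x, y: x - 0.25, points)
print(points.shape, err)
```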

Fig. 4. Maximum norm convergence of the \(hp\)-FEM. Left: on a mesh line within the layer. Right: on a generic line within the layer