1 Introduction

This paper is concerned with error estimates for discrete approximations to the solution of the obstacle problem. Concerning the underlying domain, we only assume that it is polygonally or polyhedrally bounded and convex. Under a structural and commonly used assumption on the obstacle we show that the sequence of discrete approximations possesses a convergence rate close to two in the \(L^2\)-norm. Thus, we obtain convergence results similar to those of the Ritz-projection of the solution. This is the main contribution of the present paper.

Before going into further details, let us review common discretization concepts for the obstacle problem and related convergence results from the literature. The first approach consists of a direct discretization of the variational inequality (corresponding to the obstacle problem) based on linear finite elements. For this approach, it is well-known that the resulting sequence of discrete solutions exhibits a convergence rate of one in the \(H^1\)-norm if the domain is convex. The corresponding proof has meanwhile become classical, see [6]. Essentially, it is based on the variational formulations of the problems (continuous and discrete) and standard interpolation error estimates. In contrast, a universal approach for the derivation of optimal error estimates in the \(L^2\)-norm is unknown. A review of \(L^p\)-error estimates from the literature and a discussion of their validity can be found in [4]. It has even been shown in this reference, see [4, Theorem 5 and Theorem 10], that a duality argument, similar to that for the \(L^2\)-error of the Ritz-projection, cannot be established, as the \(H^2\)-regularity of the solution is not sufficient in this situation to guarantee second order convergence in \(L^2\). To circumvent this issue, it is possible to consider pointwise error estimates, since such estimates also imply estimates in \(L^2\) due to the Hölder inequality. In [14, 15, 18] it is shown that a convergence rate of two (times \(\log \)-factors) can be achieved in \(L^\infty \). This result requires sufficiently smooth data and interior angles that are small enough to guarantee sufficiently smooth solutions in the presence of corner and edge singularities. For instance, in two dimensional polygonal domains, it is well-known that in general the interior angles must be less than or equal to \(\pi /2\) for the validity of those rates. For larger interior angles the convergence rates are reduced.
In addition, based on the pointwise estimates, it is proven in [14] that a convergence rate close to two can be expected in \(L^2\) if the domain is only assumed to be convex. However, this result requires an obstacle which is sufficiently smooth and, more importantly, which is inactive on the boundary. It is also crucial to note that all the pointwise estimates (and hence the \(L^2\) estimate in [14]) discussed so far only hold if a discrete maximum principle holds for the discrete solutions (at least this is the state of the art). For instance, this can be ensured by weakly acute finite element meshes. However, in practice, this is a serious restriction on the construction of finite element meshes, especially in the three dimensional case.

A second strategy to obtain approximations to the solution of the obstacle problem can be summarized as follows: First, appropriately regularize the obstacle problem (for instance, we use a Moreau-Yosida type relaxation) to get a semilinear partial differential equation, where the nonlinearity depends on the regularization parameter, and then truncate the regularized equation and discretize it by linear finite elements. Typically, the regularization parameter is chosen as a function of the mesh parameter in order to balance both error contributions. This approach is pursued in the present paper. In case that the boundary is smooth enough and the data are regular enough, it is shown in [17] that by this type of discretization a convergence rate of two (times \(\log \)-factors) in \(L^\infty \), and hence in \(L^2\), can be achieved. Moreover, the proof can be extended to polygonal and polyhedral domains if the interior angles are small enough such that the appearing corner and edge singularities are mild enough. For larger interior angles the convergence rates in \(L^\infty \) are again reduced. This is in agreement with corresponding discretization error estimates for semilinear partial differential equations, where the nonlinearity does not additionally depend on a (mesh parameter dependent) regularization parameter. In this case, a convergence rate of two can also be proven in \(L^2\) if the underlying domains are only assumed to be convex. Of course, this raises the question whether such a result (or at least a comparable one) is also valid for the approximations of the present discretization strategy. Typically, in order to obtain error estimates in \(L^2\), a duality argument is applied. However, a straightforward application of the duality argument in the \(L^2\)-setting is not promising here, as an inappropriate coupling between regularization parameter and mesh parameter cannot be avoided in this case.
Nevertheless, under the commonly used structural assumption that the obstacle is sufficiently smooth and inactive on the boundary, we show that a convergence rate of two (times \(\log \)-factors) in \(L^2\) can be established in convex polygonal/polyhedral domains, which represents the main result of the paper.

Our proof heavily relies on the fact that in the continuous and discrete setting the obstacle is inactive in a non-empty strip at the boundary. This is deduced by basic pointwise estimates and the structural assumption that the obstacle is inactive on the boundary. Then, by using results for locally discrete harmonic functions that are new in a certain sense, we are able to split the discretization error in \(L^2\) into two error terms. The first one is nothing but an \(L^2\)-error for the Ritz-projection in the domain, where we can rely on standard estimates from the literature. The second error contribution represents an error in the interior of the domain, where the solution enjoys more regularity. In order to appropriately bound this term, we employ techniques from [17] (introduced there for global \(L^\infty \)-error estimates). However, we always take care of the local support of this error, which lies in the interior of the domain. A more detailed outline of the proof is given at the beginning of Sect. 4.

The paper is organized as follows: In Sect. 2 we introduce the variational formulation of the obstacle problem and a Moreau-Yosida type relaxation of this problem. Moreover, we state basic properties of the corresponding solutions, such as regularity results, and we establish pointwise convergence of these solutions to each other. Some of the results are already known in a similar fashion in the literature. Nevertheless, we state the basic ideas in order to be self-contained. Moreover, and more importantly, this also enables us to ensure that those results do not depend on a smooth boundary in general, which is very often assumed in the literature. After having introduced the discrete problem in Sect. 3, we consider the \(L^2\)-error estimates in Sect. 4. There we start by giving a short roadmap for the remainder of the paper. This is followed by the observation that in all problems that we consider the obstacle is inactive in a strip at the boundary, which relies on pointwise estimates and the aforementioned structural assumption on the obstacle. Then we establish error estimates for locally discrete harmonic functions, which are new in a certain sense compared to the results from the literature. Using these estimates, standard results for the Ritz-projection, and duality arguments in the \(L^\infty \)-\(L^1\)-setting, we establish at the end of Sect. 4 the main result of the paper, the discretization error estimates in \(L^2\) in convex polygonal and polyhedral domains. Finally, in Sect. 5 we state numerical examples which underline our theoretical findings.

Before closing the introduction, we emphasize that in all that follows C denotes a generic positive constant which is always independent of the regularization parameter \(\varepsilon \) and the mesh parameter h.

2 The continuous and the regularized problem

We start by introducing some notation which is used throughout the paper. We let \(\Omega \subset \mathbb {R}^N\), \(N \in \{2,3\}\), denote an open, convex and polygonally/polyhedrally bounded domain with boundary \(\partial \Omega \). The Sobolev spaces are classically denoted by \(W^{k,p}(\Omega )\) with \(k\in \mathbb {N}_0\) and \(p\in [1,\infty ]\). In case that \(p=2\), we also use the notation \(H^k(\Omega )\). The norms in these spaces are denoted by \(||\cdot ||_{W^{k,p}(\Omega )}\) and \(||\cdot ||_{H^{k}(\Omega )}\), respectively. In addition, \(H^k_0(\Omega )\) denotes the completion of \(C^\infty _0(\Omega )\) (the space of infinitely often differentiable functions with compact support in \(\Omega \)) with respect to \(||\cdot ||_{H^k(\Omega )}\). Again classically, we denote the norm in \(L^p(\Omega )=W^{0,p}(\Omega )\) by \(||\cdot ||_{L^p(\Omega )}\). For the inner product in \(L^2(\Omega )\) we use the notation \((\cdot ,\cdot )\). The dual space of \(H^1_0(\Omega )\) is denoted by \(H^{-1}(\Omega )\) and we use the notation \(\langle \cdot ,\cdot \rangle \) to indicate the corresponding duality pairing.

Let us now formulate the problem which we are dealing with. For \(f \in L^\infty (\Omega )\) and \(\psi \in W^{2,\infty }(\Omega )\), which satisfies \(\psi \le 0\) on \(\partial \Omega \), we consider the variational problem: Find \(u \in \mathcal K_\psi := \{v\in H^1_0(\Omega )\,|\, v \ge \psi \text{ a.e. } \text{ in } \Omega \}\) such that

$$\begin{aligned} (\nabla u,\nabla (v- u))&\ge (f,v-u)\quad \forall v \in \mathcal K_\psi . \end{aligned}$$
(1)

That is, we discuss the classical obstacle problem for a function \(u\in \mathcal K_\psi \subset H^1_0(\Omega )\) with an obstacle \(\psi \in W^{2,\infty }(\Omega )\). By classical means it is possible to show that there exists a unique solution to this problem, see for instance [12, Chap. II, Theorem 2.1]. For the existence of a unique solution our regularity assumptions on the domain, the obstacle and the data can certainly be relaxed. However, let us emphasize that the assumptions stated above are crucial for our numerical analysis. In Sect. 4 we even assume that the obstacle is inactive on the boundary.

Next, let us recall a well-known reformulation of the obstacle problem which can be deduced by using concepts from convex analysis, see for instance [2, Sects. 1 and 2]. Let \(I_{\mathcal {K}_\psi }\) denote the indicator functional of the convex set \(\mathcal {K}_\psi \). Then, the subdifferential \(\partial I_{\mathcal {K}_\psi }\) of \(I_{\mathcal {K}_\psi }\) at a point \(u\in \mathcal {K}_\psi \) is given by

$$\begin{aligned} \partial I_{\mathcal {K}_\psi }(u)=\left\{ v\in H^{-1}(\Omega )\,\big |\, \langle v,u-w\rangle \ge 0\, \forall w\in \mathcal {K}_\psi \right\} . \end{aligned}$$

Moreover, a function \(u\in H^1_0(\Omega )\) solves the obstacle problem (1) if and only if there exists a Lagrange multiplier \(\beta (u-\psi ) \in \partial I_{\mathcal {K}_\psi }(u)\) such that

$$\begin{aligned} (\nabla u, \nabla v) + \langle \beta (u-\psi ), v\rangle = (f,v) \quad \forall v\in H^1_0(\Omega ). \end{aligned}$$
(2)

Remark 1

Note that, as the solution u to the obstacle problem is unique, Eq. (2) uniquely determines \(\beta (u-\psi )\in H^{-1}(\Omega )\) by the relation

$$\begin{aligned} \beta (u-\psi )=f+\Delta u \in H^{-1}(\Omega ). \end{aligned}$$

Next, we introduce a regularized problem, where we replace the Lagrange multiplier \(\beta (u-\psi )\) by a suitable relaxation. Our approach follows that of [17]. A similar one is also used in [12, Chap. IV, Sect. 5].

For \(\varepsilon > 0\) we substitute \(\beta (u-\psi )\) by the monotonically increasing, and globally Lipschitz continuous function

$$\begin{aligned} \beta _\varepsilon (s) := \left\{ \begin{array}{lll} 0, &{} \quad \text{ if } &{} s\ge 0,\\ s/\varepsilon , &{} \quad \text{ if } &{} s<0, \end{array}\right. \end{aligned}$$

and consider the semi-linear partial differential equation

$$\begin{aligned} (\nabla u_\varepsilon , \nabla v) + (\beta _\varepsilon (u_\varepsilon -\psi ),v) = (f,v) \quad \forall v \in H^1_0(\Omega ) \end{aligned}$$
(3)

as an approximation of (1). Due to the above formulated assumptions on \(\beta _\varepsilon \), \(\psi \) and f, existence of a unique solution \(u_\varepsilon \in H^1_0(\Omega )\cap C(\bar{\Omega })\) to (3) follows for any \(\varepsilon >0\) by arguments due to Browder and Minty, see for instance [19, Theorem 4.7]. We also note that this is an outer approximation or Moreau–Yosida relaxation, [8], of the variational inequality (1).

The following two lemmas about regularity issues for the obstacle problem and its regularized version are in the spirit of [12, Chap. IV, Lemma 5.1 and Theorem 5.2].

Lemma 1

Let \(u_\varepsilon \in H^1_0(\Omega )\) for \(\varepsilon \in (0,1]\) denote the solution to (3). Then, \(\beta _\varepsilon (u_\varepsilon -\psi )\) belongs to \(L^\infty (\Omega )\) fulfilling

$$\begin{aligned} ||\beta _\varepsilon (u_\varepsilon -\psi ) ||_{L^\infty (\Omega )} \le ||f + \Delta \psi ||_{L^\infty (\Omega )}. \end{aligned}$$
(4)

Further, the solution \(u_\varepsilon \) possesses the higher regularity \(H^2(\Omega )\cap H^1_0(\Omega )\) and satisfies

$$\begin{aligned} ||u_\varepsilon ||_{H^2(\Omega )}\le C(||f||_{L^\infty (\Omega )}+||\Delta \psi ||_{L^\infty (\Omega )}) \end{aligned}$$

with a constant \(C>0\) independent of \(\varepsilon \).

Proof

To prove the boundedness of \(\beta _\varepsilon (u_\varepsilon -\psi )\) in \(L^\infty (\Omega )\), one can proceed completely analogously to the proof of [12, Chap. IV, Lemma 5.1]. For the convenience of the reader and also to see the exact regularity requirements, let us quickly summarize the most essential steps of the proof. Since \(u_\varepsilon \in H^1_0(\Omega )\cap C(\bar{\Omega })\) and \(\psi \in W^{2,\infty }(\Omega )\) with \(\psi |_{\partial \Omega }\le 0\) we have that \(\beta _\varepsilon (u_\varepsilon -\psi )|_{\partial \Omega }=0\) and \(\beta _\varepsilon (u_\varepsilon -\psi ) \in H^1(\Omega ) \cap L^\infty (\Omega )\) (according to [12, Chap. II, Theorem A.1]). Thus, we may choose \(-(-\beta _\varepsilon (u_\varepsilon -\psi ))^{p-1} = -|\beta _\varepsilon (u_\varepsilon -\psi )|^{p-1}\), which then belongs to \(H^1_0(\Omega )\cap L^\infty (\Omega )\) as well (definitely for \(p>2\)), as a test function in (3). This yields

$$\begin{aligned} ||\beta _\varepsilon (u_\varepsilon&-\psi ) ||_{L^p(\Omega )}^p\\&= (\nabla (u_\varepsilon -\psi ), \nabla ( (-\beta _\varepsilon (u_\varepsilon -\psi ) )^{p-1} ) )+(f+\Delta \psi ,-(-\beta _\varepsilon (u_\varepsilon -\psi ))^{p-1})\\&=(1-p) (\nabla (u_\varepsilon -\psi ),(- \beta _\varepsilon (u_\varepsilon -\psi ))^{p-2} \beta _\varepsilon ^\prime (u_\varepsilon -\psi ) \nabla (u_\varepsilon -\psi ) )\\&\quad -(f+\Delta \psi ,|\beta _\varepsilon (u_\varepsilon -\psi )|^{p-1}), \end{aligned}$$

where we used the integration by parts formula and the chain rule. Then, employing the monotonicity of \(\beta _\varepsilon \) together with \(\beta _\varepsilon \le 0\) and the Hölder inequality results in

$$\begin{aligned} ||\beta _\varepsilon (u_\varepsilon -\psi ) ||_{L^p(\Omega )}^p\le ||f + \Delta \psi ||_{L^p(\Omega )} ||\beta _\varepsilon (u_\varepsilon -\psi ) ||_{L^p(\Omega )}^{p-1}. \end{aligned}$$

Finally, after having divided by \(||\beta _\varepsilon (u_\varepsilon -\psi ) ||_{L^p(\Omega )}^{p-1}\), we take the limit \(p\rightarrow \infty \) and obtain the desired bound for \(\beta _\varepsilon (u_\varepsilon -\psi )\). As a consequence, the higher regularity can be deduced from [9, Theorem 3.2.1.2] after having sent \(\beta _\varepsilon (u_\varepsilon -\psi )\) to the right hand side. \(\square \)
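For completeness, the passage to the limit in the last step of the proof can be spelled out as follows. This is only a sketch of a standard argument; it uses that \(\Omega \) is bounded, so that \(|\Omega |<\infty \), and that \(||g||_{L^p(\Omega )}\rightarrow ||g||_{L^\infty (\Omega )}\) as \(p\rightarrow \infty \) for every \(g\in L^\infty (\Omega )\):

$$\begin{aligned} ||\beta _\varepsilon (u_\varepsilon -\psi ) ||_{L^p(\Omega )} \le ||f + \Delta \psi ||_{L^p(\Omega )} \le |\Omega |^{1/p}\, ||f + \Delta \psi ||_{L^\infty (\Omega )} \xrightarrow {p\rightarrow \infty } ||f + \Delta \psi ||_{L^\infty (\Omega )}. \end{aligned}$$

Since the left hand side tends to \(||\beta _\varepsilon (u_\varepsilon -\psi ) ||_{L^\infty (\Omega )}\), this is the bound (4).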

Lemma 2

Let \(u \in \mathcal K_\psi \) and \(u_\varepsilon \in H^1_0(\Omega )\) denote the solutions to (1) and (3), respectively. Then, we have

$$\begin{aligned} u_\varepsilon&\xrightarrow {\varepsilon \rightarrow 0} u \text { weakly in } H^2(\Omega )\text { and strongly in } C(\bar{\Omega }),\nonumber \\ \beta _\varepsilon (u_\varepsilon -\psi )&\xrightarrow {\varepsilon \rightarrow 0} \beta (u-\psi ) \text { weakly in } L^2(\Omega ), \end{aligned}$$
(5)

and

$$\begin{aligned} ||\beta (u-\psi ) ||_{L^\infty (\Omega )} \le ||f + \Delta \psi ||_{L^\infty (\Omega )}. \end{aligned}$$

Proof

We proceed similarly to the proof of [12, Chap. IV, Theorem 5.2]. However, we rely on the reformulation (2) of the obstacle problem instead of considering the variational inequality (1). Due to the uniform boundedness of \(u_\varepsilon \) in \(H^2(\Omega )\cap H^1_0(\Omega )\) and \(\beta _\varepsilon (u_\varepsilon -\psi )\in L^2(\Omega )\) according to Lemma 1, we get the existence of functions \(\hat{u}\in H^2(\Omega )\cap H^1_0(\Omega )\) and \(\hat{\beta }\in L^2(\Omega )\) such that

$$\begin{aligned} u_\varepsilon&\xrightarrow {\varepsilon \rightarrow 0} \hat{u} \text { weakly in } H^2(\Omega ),\\ \beta _\varepsilon (u_\varepsilon -\psi )&\xrightarrow {\varepsilon \rightarrow 0} \hat{\beta }\text { weakly in } L^2(\Omega ). \end{aligned}$$

Actually, we only get the convergence of subsequences at first. However, as the limits will be unique (the unique solution u of the obstacle problem and the corresponding unique Lagrange multiplier \(\beta (u-\psi )\)), we will have the convergence of the whole sequences, and therefore we skip this detail in the following. Next, we show that \(\hat{u}\) and \(\hat{\beta }\) fulfill (2). Due to the weak convergence, we already know that

$$\begin{aligned} (\nabla \hat{u}, \nabla v) + (\hat{\beta }, v) = (f,v) \quad \forall v\in H^1_0(\Omega ). \end{aligned}$$

Thus, as the duality pairing between \(H^{-1}(\Omega )\) and \(H^1_0(\Omega )\) is compatible with the inner product in \(L^2(\Omega )\), it only remains to show that \(\hat{\beta }\in \partial I_{\mathcal {K}_\psi }(\hat{u})\). In a first step towards this, we show that \(\hat{u}\) belongs to \(\mathcal {K}_\psi \), since then the subdifferential \(\partial I_{\mathcal {K}_\psi }(\hat{u})\) at \(\hat{u}\) can be characterized as

$$\begin{aligned} \partial I_{\mathcal {K}_\psi }(\hat{u})=\left\{ v\in H^{-1}(\Omega )\,\big |\, \langle v,\hat{u}-w\rangle \ge 0\quad \forall w\in \mathcal {K}_\psi \right\} . \end{aligned}$$

As \(H^2(\Omega )\) is compactly embedded in \(C(\bar{\Omega })\), we get that

$$\begin{aligned} u_\varepsilon \xrightarrow {\varepsilon \rightarrow 0} \hat{u} \text { strongly in } C(\bar{\Omega }). \end{aligned}$$

Assume next that there is a set \(O\subset \Omega \) with \(|O|>0\) and \(\delta >0\) such that \(\hat{u} \le \psi - \delta \) on O. By the strong convergence in \(C(\bar{\Omega })\) we have for \(\varepsilon \) small enough that \(u_\varepsilon \le \psi -\delta /2\) on O. Thus, by the Cauchy–Schwarz inequality and the definition of \(\beta _\varepsilon \), we deduce

$$\begin{aligned} ||\beta _\varepsilon (u_\varepsilon - \psi ) ||_{L^2(O)} ||u_\varepsilon -\psi ||_{L^2(O)}&\ge (\beta _\varepsilon (u_\varepsilon - \psi ), u_\varepsilon - \psi )_{L^2(O)}\\&=\frac{1}{\varepsilon } ||u_\varepsilon -\psi ||_{L^2(O)}^2 \ge |O| \frac{\delta ^2}{4\varepsilon }, \end{aligned}$$

which is a contradiction to the boundedness of the left hand side of this inequality (according to Lemma 1) if we send \(\varepsilon \) to zero. As a consequence, we have shown \(\hat{u}\in \mathcal {K}_\psi \). Now, we show that \(\hat{\beta }\) belongs to \(\partial I_{\mathcal {K}_\psi }(\hat{u})\). By introducing appropriate intermediate functions, we get for any \(w\in \mathcal {K}_\psi \), which implies \(\beta _\varepsilon (w-\psi )=0\), that

$$\begin{aligned} \int _{\Omega }\hat{\beta }(\hat{u}-w)&=\int _\Omega (\hat{\beta }-\beta _\varepsilon (u_\varepsilon -\psi ))(\hat{u}-w)+\int _{\Omega }\beta _\varepsilon (u_\varepsilon -\psi )(\hat{u} - u_\varepsilon )\\&\quad +\int _{\Omega }(\beta _\varepsilon (u_\varepsilon -\psi )-\beta _\varepsilon (w-\psi ))((u_\varepsilon -\psi )-(w-\psi ))\\&\ge \int _\Omega (\hat{\beta }-\beta _\varepsilon (u_\varepsilon -\psi ))(\hat{u}-w)+\int _{\Omega }\beta _\varepsilon (u_\varepsilon -\psi )(\hat{u} - u_\varepsilon ), \end{aligned}$$

where we used the monotonicity of \(\beta _\varepsilon \) in the last step. Sending \(\varepsilon \) to zero implies

$$\begin{aligned} \int _{\Omega }\hat{\beta }(\hat{u}-w)\ge 0\quad \forall w\in \mathcal {K}_\psi , \end{aligned}$$

which means that \(\hat{\beta }\in \partial I_{\mathcal {K}_\psi }(\hat{u})\), and hence \(u=\hat{u}\) and \(\beta (u-\psi )=\hat{\beta }\). Finally, the estimate for the Lagrange multiplier \(\beta (u-\psi )\) in \(L^\infty (\Omega )\) is a direct consequence of the weak convergence (5) and the estimate (4) due to the weak lower semicontinuity of the norm. \(\square \)

Remark 2

Due to the convergence results of Lemma 2 it is possible to further characterize \(\beta (u-\psi )\). For any non-negative function \(v\in C^{\infty }_0(\Omega )\) we have according to the definition of \(\beta _\varepsilon \)

$$\begin{aligned} 0\ge \lim _{\varepsilon \rightarrow 0}\int _{\Omega } \beta _\varepsilon (u_\varepsilon -\psi )v=\int _{\Omega } \beta (u-\psi )v. \end{aligned}$$

Thus, by means of the fundamental lemma of variational calculus, we get \(\beta (u-\psi )\le 0\), and hence, \(\beta (u-\psi )(u-\psi )\le 0\) almost everywhere, such that the definition of \(\beta _\varepsilon \) implies

$$\begin{aligned} 0\le \lim _{\varepsilon \rightarrow 0}\int _{\Omega } \beta _\varepsilon (u_\varepsilon -\psi )(u_\varepsilon -\psi )=\int _{\Omega } \beta (u-\psi )(u-\psi )\le 0 \end{aligned}$$

or, equivalently,

$$\begin{aligned} || \beta (u-\psi )(u-\psi )||_{L^1(\Omega )}=0. \end{aligned}$$

To summarize, this means in the almost everywhere sense

$$\begin{aligned} \beta (u-\psi )\left\{ \begin{array}{lll} =0&{}\quad \text {if }&{} u-\psi >0,\\ \le 0&{}\quad \text {if }&{} u-\psi =0. \end{array}\right. \end{aligned}$$
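As a purely illustrative example, consider a one-dimensional analogue, which lies outside the setting \(N\in \{2,3\}\) of this paper; the data are our own choice and are not taken from the references. Let \(\Omega =(0,1)\), \(f\equiv -8\) and \(\psi \equiv -1/4\). One checks directly that the solution of (1) is then

$$\begin{aligned} u(x)=\left\{ \begin{array}{lll} 4x^2-2x, &{} \quad \text {if }&{} 0\le x\le 1/4,\\ -1/4, &{} \quad \text {if }&{} 1/4< x< 3/4,\\ 4(1-x)^2-2(1-x), &{} \quad \text {if }&{} 3/4\le x\le 1, \end{array}\right. \end{aligned}$$

since \(u-\psi =(2x-1/2)^2\ge 0\) on \([0,1/4]\) (and symmetrically on \([3/4,1]\)), \(u'\) is continuous, and \(-u''=f\) away from the contact set. The relation from Remark 1 gives

$$\begin{aligned} \beta (u-\psi )=f+u''=\left\{ \begin{array}{lll} 0, &{} \quad \text {if }&{} u-\psi >0,\\ -8, &{} \quad \text {if }&{} u-\psi =0, \end{array}\right. \end{aligned}$$

which agrees with the characterization above. Moreover, \(||\beta (u-\psi )||_{L^\infty (\Omega )}=8=||f+\psi ''||_{L^\infty (\Omega )}\), so the bound of Lemma 2 is attained in this example.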

The next theorem is concerned with the regularization error in \(L^\infty (\Omega )\) and the related convergence rate. It basically reflects the results of [17, Theorem 2.1]. Nevertheless, we recall the proof in order to ensure that it does not depend on the smoothness of the boundary since in [17] a smooth boundary is assumed.

Theorem 1

Let \(u \in \mathcal {K}_\psi \) and \(u_\varepsilon \in H^{1}_0(\Omega )\) denote the solutions to (1) and (3), respectively. Then there is the estimate

$$\begin{aligned} \Vert u-u_\varepsilon \Vert _{L^\infty (\Omega )} \le \varepsilon \Vert f+\Delta \psi \Vert _{L^\infty (\Omega )}. \end{aligned}$$

Proof

Let us abbreviate \(e_\varepsilon =u-u_\varepsilon \). Having in mind the \(L^2\)-regularity of \(\beta (u-\psi )\) from Lemma 2, we obtain from (2) and (3)

$$\begin{aligned} (\nabla e_\varepsilon ,\nabla v ) = (\beta _\varepsilon (u_\varepsilon -\psi )-\beta (u-\psi ),v) \quad \forall v \in H^1_0(\Omega ). \end{aligned}$$

Next, we observe that the function \(e_\varepsilon ^{2p+1}\), where p is an arbitrary positive integer, belongs to \(H^1_0(\Omega )\) if u and \(u_\varepsilon \) belong to \(H^1_0(\Omega )\cap L^\infty (\Omega )\). The \(L^\infty (\Omega )\) regularity is given by Lemma 2. Thus, we may choose it as a test function in the above variational equation. Employing the chain rule several times, this yields

$$\begin{aligned} (\beta _\varepsilon (u_\varepsilon -\psi )&-\beta (u-\psi ),e_\varepsilon ^{2p+1})\\&=(\nabla e_\varepsilon ,\nabla e_\varepsilon ^{2p+1})=\frac{2p+1}{(p+1)^2}||\nabla e_\varepsilon ^{p+1}||_{L^2(\Omega )}^2\\&\ge C\frac{2p+1}{(p+1)^2}||e_\varepsilon ^{p+1}||_{L^2(\Omega )}^2=C\frac{2p+1}{(p+1)^2}||e_\varepsilon ||_{L^{2(p+1)}(\Omega )}^{2(p+1)}, \end{aligned}$$

where we applied the Poincaré inequality in between. Notice that the constant from the Poincaré inequality is independent of p. We now estimate the term on the left hand side. Due to the definition of \(\beta _\varepsilon \) and Remark 2 we notice that \(\beta _\varepsilon (u-\psi )=\beta (u-\psi )=0\) almost everywhere if \(u-\psi >0\). Then, due to the monotonicity of \(\beta _\varepsilon \) we get

$$\begin{aligned} 0\ge (\beta _\varepsilon (u_\varepsilon -\psi )-\beta (u-\psi ))(u-u_\varepsilon )\quad \text {a.e. in }\{x\in \Omega \,|\, u(x)-\psi (x)>0\}. \end{aligned}$$

According to the definition of \(\beta _\varepsilon \) and Remark 2, we also obtain

$$\begin{aligned} 0\ge (\beta _\varepsilon (u_\varepsilon -\psi )-\beta (u-\psi ))(u-u_\varepsilon ) \text { a.e. in }\{x\in \Omega \,|\, u_\varepsilon (x)-\psi (x)>0\, \wedge \, u(x)=\psi (x)\}. \end{aligned}$$

Next, let us define \(I=\{x\in \Omega \,|\, u_\varepsilon (x)-\psi (x)\le 0\, \wedge \, u(x)=\psi (x)\}\) and \(e_\psi =\psi -u_\varepsilon \ge 0\). Then, combining the previous results yields

$$\begin{aligned} (\beta _\varepsilon (u_\varepsilon -\psi )-\beta (u-\psi ),e_\varepsilon ^{2p+1})\le (\beta _\varepsilon (u_\varepsilon -\psi )-\beta (u-\psi ),e_\psi ^{2p+1})_{L^2(I)}. \end{aligned}$$

Due to the definition of \(\beta _\varepsilon \) and Remark 2, this also implies

$$\begin{aligned} \frac{1}{\varepsilon }||e_\psi ||_{L^{2(p+1)}(I)}^{2(p+1)}+(\beta _\varepsilon (u_\varepsilon -\psi )-\beta (u-\psi ),e_\varepsilon ^{2p+1})\le (|\beta (u-\psi )|,e_\psi ^{2p+1})_{L^2(I)}. \end{aligned}$$

By means of the Hölder and the Young inequality, we get

$$\begin{aligned} (|\beta (u&-\psi )|,e_\psi ^{2p+1})_{L^2(I)}\\&\le ||\beta (u-\psi )||_{L^{2(p+1)}(I)}||e_\psi ||_{L^{2(p+1)}(I)}^{2p+1}\\&\le \frac{1}{2(p+1)} \varepsilon ^{2p+1}||\beta (u-\psi )||_{L^{2(p+1)}(I)}^{2(p+1)}+\frac{2p+1}{2(p+1)}\frac{1}{\varepsilon }||e_\psi ||_{L^{2(p+1)}(I)}^{2(p+1)} \\&\le \frac{1}{2(p+1)} \varepsilon ^{2p+1}||\beta (u-\psi )||_{L^{2(p+1)}(I)}^{2(p+1)}+ \frac{1}{\varepsilon }||e_\psi ||_{L^{2(p+1)}(I)}^{2(p+1)}, \end{aligned}$$

such that

$$\begin{aligned} (\beta _\varepsilon (u_\varepsilon -\psi )-\beta (u-\psi ),e_\varepsilon ^{2p+1})\le \frac{1}{2(p+1)} \varepsilon ^{2p+1}||\beta (u-\psi )||_{L^{2(p+1)}(I)}^{2(p+1)}, \end{aligned}$$

and hence

$$\begin{aligned} ||e_\varepsilon ||_{L^{2(p+1)}(\Omega )}&\le \left( C\frac{p+1}{2p+1}\right) ^{\frac{1}{2(p+1)}}\varepsilon ^{\frac{2p+1}{2(p+1)}}||\beta (u-\psi )||_{L^{2(p+1)}(I)}\\&\le C^{\frac{1}{2(p+1)}}\varepsilon ^{\frac{2p+1}{2(p+1)}}||\beta (u-\psi )||_{L^{2(p+1)}(I)}, \end{aligned}$$

where the constant C is still independent of p. If we let p tend to infinity, the desired result follows from Lemma 2. \(\square \)
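The estimate of Theorem 1 can be observed in a small numerical experiment. The following is a minimal sketch under assumptions of our own: a one-dimensional finite-difference analogue on \((0,1)\) (not the finite element method analyzed in this paper), the hypothetical data \(f\equiv -8\) and \(\psi \equiv -1/4\), and NumPy as the only dependency. All function names are illustrative. The regularized problem is solved by an active-set (semismooth Newton) iteration and the obstacle problem by a primal-dual active set iteration.

```python
import numpy as np

def neg_laplacian_1d(n):
    """Finite-difference matrix of -d^2/dx^2 on n interior nodes of (0, 1), zero boundary values."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def solve_regularized(A, f, psi, eps, max_iter=500):
    """Active-set (semismooth Newton) iteration for A u + min(u - psi, 0) / eps = f."""
    active = np.linalg.solve(A, f) < psi
    for _ in range(max_iter):
        D = np.diag(active / eps)            # penalty acts only where u < psi
        u = np.linalg.solve(A + D, f + D @ psi)
        if np.array_equal(u < psi, active):  # active set settled: u solves the equation
            return u
        active = u < psi
    raise RuntimeError("active set did not settle")

def solve_obstacle(A, f, psi, max_iter=500):
    """Primal-dual active set iteration for u >= psi, lam = A u - f >= 0, lam * (u - psi) = 0."""
    active = np.linalg.solve(A, f) < psi
    for _ in range(max_iter):
        free = ~active
        u = psi.copy()                       # u = psi on the active set
        rhs = f[free] - A[np.ix_(free, active)] @ psi[active]
        u[free] = np.linalg.solve(A[np.ix_(free, free)], rhs)
        lam = np.where(active, A @ u - f, 0.0)
        new_active = lam + (psi - u) > 0
        if np.array_equal(new_active, active):
            return u
        active = new_active
    raise RuntimeError("active set did not settle")

n = 99
A = neg_laplacian_1d(n)
f = np.full(n, -8.0)        # hypothetical load
psi = np.full(n, -0.25)     # hypothetical constant obstacle, so Delta psi = 0
bound = np.max(np.abs(f))   # plays the role of ||f + Delta psi||_inf

u = solve_obstacle(A, f, psi)
gaps = {}
for eps in (1e-1, 1e-2, 1e-3):
    u_eps = solve_regularized(A, f, psi, eps)
    gaps[eps] = np.max(np.abs(u - u_eps))
    print(f"eps = {eps:.0e}: max|u - u_eps| = {gaps[eps]:.3e}, eps * bound = {eps * bound:.3e}")
```

In this discrete setting the system matrix is an M-matrix, so a discrete comparison principle is available and the printed differences should stay below \(\varepsilon \) times the bound, mirroring the statement of Theorem 1.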

Remark 3

We later consider the error \(\Vert u-u_\varepsilon \Vert _{L^2(\Omega )}\). By the Hölder inequality, Theorem 1 also provides an upper bound for this error. Moreover, in Sect. 5.1 this rate is numerically confirmed to be sharp.
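Spelled out, the step from Theorem 1 to the \(L^2(\Omega )\)-norm reads as follows, where \(|\Omega |\) denotes the (finite) Lebesgue measure of \(\Omega \):

$$\begin{aligned} \Vert u-u_\varepsilon \Vert _{L^2(\Omega )} \le |\Omega |^{1/2}\, \Vert u-u_\varepsilon \Vert _{L^\infty (\Omega )} \le |\Omega |^{1/2}\, \varepsilon \, \Vert f+\Delta \psi \Vert _{L^\infty (\Omega )}. \end{aligned}$$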

We close this section with an interior regularity result for the solution of the Poisson equation, which is needed later in the proof of Lemma 8.

Lemma 3

Let \(U \subset U_\delta \subset \Omega \) denote two connected subsets with \({\text {dist}}(\partial U,\partial U_\delta ) \ge \delta \), \(\delta >0\), with boundaries of class \(C^{1,1}\). Let \(\phi \in L^2(\Omega ) \cap L^\infty (U_\delta )\) be given and let \(z \in H^1_0(\Omega )\) denote the unique solution to

$$\begin{aligned} -\Delta z&= \phi \quad \text{ in } \Omega ,\\ z&= 0 \quad \text{ on } \partial \Omega . \end{aligned}$$

Then, for \(p \in [2,\infty )\) there holds

$$\begin{aligned} \Vert z\Vert _{W^{2,p}(U)} \le Cp ( \Vert \phi \Vert _{L^p(U_\delta )} + \Vert \phi \Vert _{L^2(\Omega )} ), \end{aligned}$$

where the constant C depends on \(\delta \) but not on p.

Proof

We follow the proof of [13, Lemma 2.4], i.e., we apply a bootstrapping argument. First we introduce an intermediate smooth domain \(U_{\delta /2}\) such that \(U \subset U_{\delta /2} \subset U_\delta \) with \(\text{ dist }(\partial U, \partial U_{\delta /2}) \ge \delta /2\) and \(\text{ dist }(\partial U_{\delta /2}, \partial U_\delta ) \ge \delta /2\). In a first step we show \(W^{1,\infty }(U_{\delta /2})\)-regularity for z. Let \(\omega \in C^{\infty }(\Omega )\) denote a smooth cut-off function on \(U_{\delta /2}\), such that \(\omega |_{U_{\delta /2}} \equiv 1\), \(\omega |_{\Omega {\setminus } U_{\delta }} \equiv 0\), and \(|\omega |_{W^{r,\infty }(\Omega )} \le C \delta ^{-r}\) for \(r \in \{0,1,2\}\), see [10, Theorem 1.4.1 and Eq. (1.42)] for the existence of such a cut-off function and the corresponding estimates. We set \(v := \omega z\). Then \(v\in H^1_0(U_\delta )\) is the weak solution to

$$\begin{aligned} \begin{array}{ll} -\Delta v = \phi \omega + (-\Delta \omega ) z - 2\nabla \omega \cdot \nabla z =: g &{} \text{ in } U_\delta ,\\ v = 0 &{} \text{ on } \partial U_\delta . \end{array} \end{aligned}$$

Due to the smoothness properties of \(\omega \), the right hand side g can be bounded by

$$\begin{aligned} \Vert g\Vert _{L^6(U_\delta )} \le C \left( \Vert \phi \Vert _{L^6(U_\delta )} + \Vert z\Vert _{W^{1,6}(U_\delta )} \right) , \end{aligned}$$

where the constant C depends on \(\delta \). Moreover, due to the \(H^2\)-regularity of z, as \(\Omega \) is convex, we have

$$\begin{aligned} \Vert z\Vert _{W^{1,6}(\Omega )} \le C\Vert z\Vert _{H^2(\Omega )} \le C \Vert \phi \Vert _{L^2(\Omega )}. \end{aligned}$$

Consequently, by elliptic regularity, cf. [7, Theorem 9.9], we obtain

$$\begin{aligned} \Vert v\Vert _{W^{2,6}(U_\delta )} \le C\Vert g\Vert _{L^6(U_\delta )} \le C( \Vert \phi \Vert _{L^6(U_\delta )} + \Vert \phi \Vert _{L^2(\Omega )} ). \end{aligned}$$

Since \(\omega \equiv 1\) on \(U_{\delta /2}\) we have \(v|_{U_{\delta /2}} \equiv z|_{U_{\delta /2}}\) and therefore

$$\begin{aligned} \Vert z\Vert _{W^{1,\infty }(U_{\delta /2})} \le C \Vert z\Vert _{W^{2,6}(U_{\delta /2})} \le C ( \Vert \phi \Vert _{L^6(U_{\delta })} + \Vert \phi \Vert _{L^2(\Omega )} ). \end{aligned}$$
(6)

Next, we repeat the above argumentation for U and \(U_{\delta /2}\) with correspondingly changed cut-off function \(\omega \) and auxiliary problem for v. Let \(p>6\) and \(\omega \) denote a cut-off function such that \(\omega \equiv 1\) on U and \(\omega \equiv 0\) on \(\Omega {\setminus } U_{\delta /2}\). As above, we obtain

$$\begin{aligned} \Vert g\Vert _{L^p(U_{\delta /2})} \le C \left( \Vert \phi \Vert _{L^p(U_{\delta /2})} + \Vert z\Vert _{W^{1,p}(U_{\delta /2})} \right) \le C\left( \Vert \phi \Vert _{L^p(U_\delta )} + \Vert \phi \Vert _{L^2(\Omega )}\right) , \end{aligned}$$

where we used (6). Finally, from elliptic regularity, cf. [7, Theorem 9.9], we get for \(p\in [2,\infty )\) the desired result,

$$\begin{aligned} \Vert z\Vert _{W^{2,p}(U)} \le Cp( \Vert \phi \Vert _{L^p(U_{\delta })} + \Vert \phi \Vert _{L^2(\Omega )} ), \end{aligned}$$

where we notice that the constant C is independent of p. This can be seen from the proof of [7, Theorem 9.9]. \(\square \)

3 The discrete problem

In the following we derive optimal a-priori error estimates in \(L^2(\Omega )\) for a numerical approximation to (1) which is based on the regularized problem (3). We rely on the approach of [17]. However, we again notice that the results from that reference are not directly applicable in our setting as in [17] global \(W^{2,p}\)-regularity is required with arbitrarily large \(p < \infty \).

Let us now introduce the numerical approximation which we are dealing with. We do not discretize (3) directly but an equivalent reformulation of it. According to (4), we may truncate the nonlinearity \(\beta _\varepsilon \) without changing the solution to (3). More precisely, if we choose the constant

$$\begin{aligned} \lambda := c||f + \Delta \psi ||_{L^\infty (\Omega )} \end{aligned}$$
(7)

with \(c\ge 1\), we may redefine \(\beta _\varepsilon \) by the bounded, monotonically increasing, and globally Lipschitz continuous function

$$\begin{aligned} \beta _\varepsilon (s) := \left\{ \begin{array}{lll} 0, &{} \text{ if } &{} s\ge 0,\\ \max (s/\varepsilon ,-\lambda ), &{} \text{ if } &{} s<0, \end{array}\right. \end{aligned}$$
(8)

without changing the solution of (3). This problem with the redefined nonlinearity is now discretized by continuous, piecewise linear finite elements. Let \(\{\mathcal T_h\}\) be a family of conforming and quasi-uniform triangulations of \(\Omega \) which are admissible in the sense of Ciarlet. We denote by \(h:=\max _{T\in \mathcal {T}_h} {\text {diam}}T\) the global mesh parameter and assume that \(h<1/2\). For each element \(T\in \mathcal {T}_h\) we assume that it is isoparametrically equivalent to the unit simplex in \(\mathbb {R}^N\).

We comment on the case of elements that are equivalent to the unit cube in Remark 4.

On \(\mathcal T_h\) we define

$$\begin{aligned} V_h := \{ v_h \in C(\overline{\Omega }) \,|\, v_h|_{T} \text{ is } \text{ affine } \forall T \in \mathcal T_h, \, v_h|_{\partial \Omega } \equiv 0\}, \end{aligned}$$

and determine approximations to the solution \(u_\varepsilon \) of (3) by solving the problem: Find \(u_{\varepsilon ,h}\in V_h\) such that

$$\begin{aligned} (\nabla u_{\varepsilon ,h},\nabla v_h) + (\beta _\varepsilon (u_{\varepsilon ,h}-\psi ),v_h) = (f,v_h) \quad \forall v_h \in V_h. \end{aligned}$$
(9)

For each mesh parameter h the existence of a unique solution to this finite dimensional problem follows by standard arguments. For later reference, we define \(I_h:C(\overline{\Omega }) \rightarrow V_h\) as the usual Lagrangian interpolation operator, and the Ritz projection of \(w \in H^1_0(\Omega )\) as the function \(R_h w\) in \(V_h\) which satisfies

$$\begin{aligned} (\nabla (R_h w-w),\nabla v_h)=0 \quad \forall v_h \in V_h. \end{aligned}$$

Finally, let us stress that we assume exact integration for the non-smooth nonlinearity \(\beta _\varepsilon (u_{\varepsilon ,h}-\psi )\). We refer to [11], where a lumping technique is used for the numerical approximation of the nonlinear term.
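To make the structure of (9) concrete, the following is a minimal sketch of a one-dimensional analogue on \((0,1)\), using the data \(f=-30\), \(\psi =-1\) and \(c=6\) from Sect. 5. Several ingredients are assumptions made purely for illustration and are not taken from the analysis above: the restriction to one space dimension, the nodal lumping of the nonlinear term (the analysis assumes exact integration), the active-set Newton iteration, and the hypothetical function names. The iteration works with the untruncated penalty; the final residual is evaluated with the truncated \(\beta _\varepsilon \) from (8) to check that the truncation is not active at the computed solution.

```python
import numpy as np


def beta_eps(s, eps, lam):
    """Truncated penalty term from (8): 0 for s >= 0, max(s/eps, -lam) for s < 0."""
    s = np.asarray(s, dtype=float)
    return np.where(s >= 0.0, 0.0, np.maximum(s / eps, -lam))


def solve_penalized_obstacle_1d(n=64, eps=1e-4, f=-30.0, psi=-1.0, c=6.0, max_iter=200):
    """Illustrative 1D analogue of (9) on (0, 1) with n uniform elements.

    Assumptions (not from the paper): one space dimension, constant f and psi,
    nodal lumping of the penalty term, and an active-set Newton iteration.
    """
    h = 1.0 / n
    m = n - 1                                   # number of interior nodes
    # P1 stiffness matrix of -u'' with homogeneous Dirichlet conditions
    stiff = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    load = f * h * np.ones(m)                   # exact load vector for constant f
    mass = h * np.ones(m)                       # lumped mass weights
    lam = c * abs(f)                            # (7); Delta psi = 0 for constant psi

    u = np.zeros(m)
    active = np.zeros(m, dtype=bool)            # nodes where u < psi
    converged = False
    for _ in range(max_iter):
        # On the current active set the penalty acts as (u - psi) / eps.
        d = np.where(active, mass / eps, 0.0)
        u = np.linalg.solve(stiff + np.diag(d), load + d * psi)
        new_active = u < psi
        if np.array_equal(new_active, active):
            converged = True                    # active set is stationary
            break
        active = new_active

    # Residual of the lumped analogue of (9), evaluated with the truncated beta_eps.
    residual = stiff @ u + mass * beta_eps(u - psi, eps, lam) - load
    return u, float(np.max(np.abs(residual))), converged
```

Calling `u, res, ok = solve_penalized_obstacle_1d()` returns the nodal values at the interior nodes, the maximal residual, and a flag indicating whether the active set became stationary. In this sketch the penetration \(\psi - u_{\varepsilon ,h}\) at the nodes is at most \(\varepsilon |f|\), which mirrors the first order regularization error.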

4 Error estimates in \(L^2(\Omega )\)

In Theorem 1 we have already seen that the regularization error can appropriately be bounded in \(L^2(\Omega )\), even in \(L^\infty (\Omega )\). In the following we derive a priori bounds with respect to the discretization parameter h for the discretization error \(u_\varepsilon -u_{\varepsilon ,h}\) in \(L^2(\Omega )\), see Theorem 3. Afterwards, we combine these results in Theorem 4.

Before going into detail, let us quickly elucidate the structure of the main part of this section, the proof of estimates for the discretization error. Based on the assumption that \(\psi <0\) on the boundary, i.e., the obstacle is inactive on the boundary, we show in a first step that there exists a (non-empty) strip \(D_d\) at the boundary \(\partial \Omega \) of width d (independent of \(\varepsilon \) and h) such that

$$\begin{aligned} \beta (u-\psi )=\beta _\varepsilon (u_\varepsilon -\psi )=\beta _\varepsilon (u_{\varepsilon ,h}-\psi )=0\quad \text {a.e. in }D_d\subset \Omega , \end{aligned}$$
(10)

see Lemma 5, and hence, the constraint is inactive in the neighborhood \(D_d\) of the boundary for each problem. The proof requires that \(\varepsilon \) and h are small enough as it relies on the fact that we already have pointwise convergence of \(u_\varepsilon \) towards u, see Theorem 1, and pointwise convergence of \(u_{\varepsilon ,h}\) towards \(u_\varepsilon \) with some (possibly suboptimal) rate, see Lemma 4. According to (10), we also have that \(R_hu_\varepsilon -u_{\varepsilon ,h}\) is discretely harmonic, see (13), on \(D_d\). This implies that there exists another strip D at the boundary (for instance of width \(d/2\)) such that

$$\begin{aligned} ||R_hu_\varepsilon -u_{\varepsilon ,h}||_{H^1(D)}\le C ||R_hu_\varepsilon -u_{\varepsilon ,h}||_{L^2(D_d\backslash D)}\le C ||R_hu_\varepsilon -u_{\varepsilon ,h}||_{L^2(\Omega \backslash D)},\qquad \end{aligned}$$
(11)

where the constant C depends on the distance between D and \(D_d\), see Theorem 2. Based on this, we get after having introduced \(R_hu_\varepsilon \) as an intermediate function

$$\begin{aligned} ||u_\varepsilon&-u_{\varepsilon ,h}||_{L^2(\Omega )}\nonumber \\&\le ||u_\varepsilon -R_hu_\varepsilon ||_{L^2(D)}+||R_hu_\varepsilon -u_{\varepsilon ,h}||_{L^2(D)}+||u_\varepsilon -u_{\varepsilon ,h}||_{L^2(\Omega \backslash D)}\nonumber \\&\le ||u_\varepsilon -R_hu_\varepsilon ||_{L^2(D)}+C||R_hu_\varepsilon -u_{\varepsilon ,h}||_{L^2(\Omega \backslash D)}+||u_\varepsilon -u_{\varepsilon ,h}||_{L^2(\Omega \backslash D)}\nonumber \\&\le C\left( ||u_\varepsilon -R_hu_\varepsilon ||_{L^2(\Omega )}+||u_\varepsilon -u_{\varepsilon ,h}||_{L^2(\Omega \backslash D)}\right) , \end{aligned}$$
(12)

where we introduced \(u_\varepsilon \) as an intermediate function in the last step. Estimating the error of the Ritz-projection \(R_h u_\varepsilon \) is standard, taking into account the \(H^2(\Omega )\)-regularity of \(u_\varepsilon \) according to Lemma 2. It remains to bound the second term in the previous inequality. More precisely, we estimate the difference \(u_\varepsilon -u_{\varepsilon ,h}\) in \(L^\infty (\Omega \backslash D)\), which also bounds its \(L^2(\Omega \backslash D)\)-norm since \(\Omega \) is bounded. Here we rely on a duality argument as in [17], see Theorem 3. However, we consistently exploit the fact that this term only lives in the interior of the domain, where we have higher regularity. This is the main reason for having second order convergence (times a \(\log \)-factor) in \(L^2(\Omega )\) in the case of general convex polygonal/polyhedral domains.

We start with providing an \(L^\infty (\Omega )\)-estimate for the discretization error, which is valid in convex domains, but only has a lower convergence rate.

Lemma 4

Let \(u_\varepsilon \) and \(u_{\varepsilon ,h}\) be the solutions of (3) and (9), respectively. Then, there is the estimate

$$\begin{aligned} \Vert u_\varepsilon - u_{\varepsilon ,h}\Vert _{L^\infty (\Omega )} \le C h^{2-\frac{N}{2}}( \Vert f \Vert _{L^\infty (\Omega )} + \Vert \Delta \psi \Vert _{L^\infty (\Omega )} ), \end{aligned}$$

where the constant C is independent of \(\varepsilon \) and h.

Proof

This follows from [17, Lemma 2.2 and Theorem 2.3] using only \(H^2(\Omega )\)-regularity which holds in general convex domains. Equivalently, one can set \(D=\emptyset \) within the proof of Theorem 3 and Lemma 8. Then, by taking into account only the \(H^2(\Omega )\)-regularity of z within the proof of Lemma 8 for estimating \(\Vert z-R_hz\Vert _{L^\infty (\Omega )}\), one obtains the desired result as well. \(\square \)

Based on the previous lemma, we next show that (10) holds.

Lemma 5

Let u, \(u_\varepsilon \) and \(u_{\varepsilon ,h}\) be the solutions of (2), (3) and (9), respectively. In addition assume that \(\psi <0\) on the boundary. Then, there exist constants \(d>0\), \(\varepsilon _0>0\) and \(h_0>0\) such that for all \(\varepsilon \le \varepsilon _0\) and \(h\le h_0\) there holds

$$\begin{aligned} \beta (u-\psi )=\beta _\varepsilon (u_\varepsilon -\psi )=\beta _\varepsilon (u_{\varepsilon ,h}-\psi )=0 \text { a.e. in }D_d:=\{ x \in \Omega \,|\, {\text {dist}}(x,\partial \Omega ) \le d \}. \end{aligned}$$

Proof

As the obstacle \(\psi \) is a continuous function on the boundary, which represents a compact set, we obtain that there exists a \(\tau >0\) such that \(\psi \le -\tau \) on the boundary, and hence, there holds \(u - \psi \ge \tau \) on the boundary. Next, we notice that \(u-\psi \) is a continuous function up to the boundary, see Lemma 2. Consequently, there is a constant \(d>0\) such that \(u - \psi \ge \frac{1}{2}\tau \) on \(D_d\). Further, from Theorem 1 we have that \(|u(x)-u_\varepsilon (x)| \le \varepsilon \Vert f+\Delta \psi \Vert _{L^\infty }\) for all \(x\in \Omega \). Consequently, there exists a constant \(\varepsilon _0>0\) such that \(u_\varepsilon - \psi \ge \frac{1}{4}\tau \) on \(D_d\) for all \(\varepsilon \le \varepsilon _0\). In the same manner, now using the \(L^\infty (\Omega )\)-estimate from Lemma 4 (note that the constant there is independent of \(\varepsilon \) and h), we deduce the existence of a constant \(h_0>0\) such that for all \(h\le h_0\) there holds \(u_{\varepsilon ,h}- \psi \ge \frac{1}{8}\tau \) on \(D_d\). The assertion now follows from the discussion in Remark 2 and the definition of \(\beta _\varepsilon \) in (8). \(\square \)

Next, we are concerned with proving (11). For that reason, let us first introduce the notion of locally discrete harmonic functions as it is used in the following. Let \(U_\delta \) denote a subset of \(\Omega \). We call a function \(w_h \in V_h\) discretely harmonic on \(U_\delta \) if

$$\begin{aligned} (\nabla w_h,\nabla v_h) = 0 \quad \forall v_h \in V_h\cap \{v\in H^1(\Omega )\,|\, v=0\text { a.e. in } \Omega \backslash U_\delta \}. \end{aligned}$$
(13)

It is well-known that discretely harmonic functions fulfill the following Caccioppoli-type estimate: Let U and \(U_\delta \) be subsets of \(\Omega \) such that \(U\subset U_\delta \) and \({\text {dist}}(U,\partial U_\delta \backslash \partial \Omega )=\delta \) with \(\delta >0\). Further, assume that \(w_h\in V_h\) is discretely harmonic on \(U_\delta \). Then for h small enough (depending on \(\delta \)) there is the estimate

$$\begin{aligned} ||\nabla w_h||_{L^2(U)}\le C \delta ^{-1} ||w_h||_{L^2(U_\delta )}, \end{aligned}$$
(14)

where the constant C is independent of \(\delta \). Estimates of this kind are essential when proving local energy norm estimates, which can be traced back to [16]. We also mention [5], where, in contrast to [16], the assumption of quasi-uniform meshes is avoided and sharply varying grids are admitted. A more detailed discussion of local estimates and a survey of related results from the literature can be found in [5] as well.

In (14) the norm on the right hand side is defined on \(U_\delta \) but not on \(U_\delta \backslash U\) as it is required for our purposes. However, the results from the literature can be extended to this by minor modifications. We summarize this in the following lemma. We assume that the mesh is quasi-uniform, and only notice that the results also extend to the more general setting of [5].

Lemma 6

Let U and \(U_\delta \) be subsets of \(\Omega \) such that \(U\subset U_\delta \) and \({\text {dist}}(U,\partial U_\delta \backslash \partial \Omega )=\delta \) with \(\delta >0\). Further, assume that \(w_h\in V_h\) is discretely harmonic on \(U_\delta \) in the sense of (13). Then, there exists a constant \(h_\delta >0\) (depending on \(\delta \)) such that for \(h\le h_\delta \) there is the estimate

$$\begin{aligned} ||\nabla w_h||_{L^2(U)}\le C \delta ^{-1} ||w_h||_{L^2(U_\delta \backslash U)}, \end{aligned}$$

where the constant C is independent of \(\delta \).

Proof

For \(i=1,\ldots ,4\), let \(U_{i\delta /5}\) be a subset of \(\Omega \) such that \(U\subset U_{i\delta /5}\subset U_\delta \) and \({\text {dist}}(U,\partial U_{i\delta /5}\backslash \partial \Omega )=i\delta /5\). Moreover, we define the smooth cut-off function \(\omega \in C^\infty (\Omega )\) which satisfies

$$\begin{aligned} \omega |_{U_{2\delta /5}} \equiv 1, \quad \omega |_{\Omega {\setminus } U_{3\delta /5}} \equiv 0, \quad \text{ and }\quad |\omega |_{W^{r,\infty }(\Omega )} \le C\delta ^{-r} \text { for } 0\le r \le 2, \end{aligned}$$

see [10, Theorem 1.4.1 and Eq. (1.42)] for the existence of such a cut-off function and the corresponding estimates. By simple calculations we deduce

$$\begin{aligned} \Vert \nabla w_h \Vert _{L^2(U)}^2&\le \Vert \omega \nabla w_h\Vert ^2_{L^2(U_\delta )} = \int _{U_\delta } \omega ^2 \nabla w_h \cdot \nabla w_h \nonumber \\&= \int _{U_{\delta }}\nabla w_h\cdot \nabla (\omega ^2 w_h) - \int _{U_{\delta }}w_h\nabla w_h\cdot \nabla \omega ^2. \end{aligned}$$
(15)

For the second term we obtain by the Cauchy–Schwarz inequality and the properties of \(\omega \)

$$\begin{aligned} \bigg |&\int _{U_\delta }w_h\nabla w_h\cdot \nabla \omega ^2\bigg | =2\left| \int _{U_\delta }\omega \nabla w_h\cdot w_h\nabla \omega \right| \le 2\Vert \omega \nabla w_h\Vert _{L^2(U_\delta )} \Vert w_h\nabla \omega \Vert _{L^2(U_\delta {\setminus } U)}\nonumber \\&\le C\delta ^{-1}\Vert \omega \nabla w_h\Vert _{L^2(U_\delta )} \Vert w_h\Vert _{L^2(U_\delta {\setminus } U)} \le \frac{1}{4}\Vert \omega \nabla w_h\Vert _{L^2(U_\delta )}^2 + C \delta ^{-2}\Vert w_h\Vert ^2_{L^2(U_\delta {\setminus } U)}, \end{aligned}$$
(16)

where we applied Young’s inequality in the last step. Next, we consider the first term in (15). We notice that there exists a constant \(h_\delta >0\) such that for all \(h\le h_\delta \) there holds \(I_h(\omega ^2w_h)\in V_h\cap \{v\in H^1(\Omega )\,|\, v=0\text { a.e. in } \Omega \backslash U_{4\delta /5}\}\) and \(I_h(\omega ^2w_h) \equiv \omega ^2w_h\) on \(U_{\delta /5}\). Thus, using (13) to insert \(I_h(\omega ^2w_h)\) we obtain

$$\begin{aligned} \int _{U_\delta }\nabla w_h\cdot \nabla (\omega ^2 w_h)&= \int _{U_\delta }\nabla w_h\cdot \nabla \left( \omega ^2 w_h - I_h(\omega ^2 w_h)\right) \\&= \int _{U_{4\delta /5}{\setminus } U_{\delta /5}}\nabla w_h\cdot \nabla \left( \omega ^2 w_h - I_h(\omega ^2 w_h)\right) \\&\le \sum _{T\subset U_\delta {\setminus } U} \Vert \nabla w_h\Vert _{L^2(T)} \Vert \nabla \left( \omega ^2 w_h - I_h(\omega ^2 w_h)\right) \Vert _{L^2(T)}. \end{aligned}$$

For each element \(T\subset U_\delta {\setminus } U\) we deduce by means of an inverse inequality and a standard interpolation error estimate

$$\begin{aligned} ||\nabla w_h||_{L^2(T)} ||\nabla \left( \omega ^2 w_h - I_h(\omega ^2 w_h)\right) ||_{L^2(T)}&\le C h^{-1}\Vert w_h\Vert _{L^2(T)} h |\omega ^2 w_h|_{H^2(T)} \nonumber \\&=C\Vert w_h\Vert _{L^2(T)} |\omega ^2 w_h|_{H^2(T)}. \end{aligned}$$
(17)

Moreover, using the bounds for \(\omega \) and its derivatives, we get by elementary calculations

$$\begin{aligned} |\omega ^2 w_h|_{H^2(T)}&\le C \left( |\omega |_{W^{1,\infty }(T)}\Vert \omega \nabla w_h\Vert _{L^2(T)} + |\omega ^2|_{W^{2,\infty }(T)}\Vert w_h\Vert _{L^2(T)}\right) \nonumber \\&\le C \left( \delta ^{-1}\Vert \omega \nabla w_h\Vert _{L^2(T)} + \delta ^{-2}\Vert w_h\Vert _{L^2(T)} \right) . \end{aligned}$$
(18)

Note that all second derivatives of the affine function \(w_h\) vanish on T. The previous inequalities imply

$$\begin{aligned} \int _{U_\delta }\nabla w_h\cdot \nabla (\omega ^2 w_h)&\le C\sum _{T\subset U_\delta {\setminus } U} \left( \delta ^{-1}\Vert w_h\Vert _{L^2(T)}\Vert \omega \nabla w_h\Vert _{L^2(T)} + \delta ^{-2}\Vert w_h\Vert ^2_{L^2(T)} \right) \nonumber \\&\le \frac{1}{4}\Vert \omega \nabla w_h\Vert _{L^2(U_\delta )}^2 + C\delta ^{-2}\Vert w_h\Vert ^2_{L^2(U_\delta {\setminus } U)}, \end{aligned}$$
(19)

where we applied Young’s inequality in the last step. We finally get the assertion from (15), (16) and (19). \(\square \)

Remark 4

For the proof of Lemma 6 it is essential that second derivatives of \(w_h\) vanish on each element T, see (17) and (18). This is no longer the case if the elements are isoparametrically equivalent to the unit cube in \(\mathbb {R}^N\) as the corresponding shape functions on the reference element (unit cube) are multilinear. However, using sharper versions of the Bramble-Hilbert Lemma in (17), which only involve pure, but not mixed, second derivatives, leads to a comparable result. For instance, such a version of the Bramble-Hilbert Lemma is valid if the underlying mesh is rectangular, see e.g. [1, Sect. 2.4.2].

We now combine the previous results to deduce (11).

Theorem 2

Let \(u_\varepsilon \) and \(u_{\varepsilon ,h}\) be the solutions of (3) and (9), respectively. In addition assume that \(\psi <0\) on the boundary. Then, there exist a non-empty strip D at the boundary and constants \(\varepsilon _1>0\) and \(h_1>0\) such that for all \(\varepsilon \le \varepsilon _1\) and \(h\le h_1\) there holds

$$\begin{aligned} ||R_hu_\varepsilon -u_{\varepsilon ,h}||_{H^1(D)}\le C ||R_hu_\varepsilon -u_{\varepsilon ,h}||_{L^2(\Omega \backslash D)}. \end{aligned}$$

Proof

Define \(D:=\{ x \in \Omega \,|\, \text{ dist }(x,\partial \Omega ) \le d/2 \}\), where d denotes the width of the strip \(D_d\) in Lemma 5. From the same lemma we obtain that \(R_hu_\varepsilon -u_{\varepsilon ,h}\) is discretely harmonic on \(D_d\) for all \(h\le h_0\) and \(\varepsilon \le \varepsilon _1:=\varepsilon _0\) as \(\beta _\varepsilon (u_\varepsilon -\psi )=\beta _\varepsilon (u_{\varepsilon ,h}-\psi )=0\) on \(D_d\). Consequently, employing Lemma 6 there exists a constant \(h_d\) such that for all \(h\le h_1:=\min \{h_0,h_d\}\) there holds

$$\begin{aligned} ||\nabla (R_hu_\varepsilon -u_{\varepsilon ,h})||_{L^2(D)} \le C ||R_hu_\varepsilon -u_{\varepsilon ,h}||_{L^2(D_d{\setminus } D)} \le C ||R_hu_\varepsilon -u_{\varepsilon ,h}||_{L^2(\Omega \backslash D)}. \end{aligned}$$

As \(R_hu_\varepsilon -u_{\varepsilon ,h}\) fulfills homogeneous boundary conditions on \(\partial \Omega \), the estimate of the assertion is finally a consequence of the Poincaré inequality. \(\square \)

For the remainder of this section let D denote a strip at the boundary where we have

$$\begin{aligned} \beta _\varepsilon (u_\varepsilon -\psi )|_D=\beta _\varepsilon (u_{\varepsilon ,h}-\psi )|_D=0. \end{aligned}$$
(20)

This is the same strip as introduced in Theorem 2 when we collect all the intermediate estimates in Theorem 4. In a next step for the final result, we estimate \(u_\varepsilon -u_{\varepsilon ,h}\) in \(L^\infty (\Omega \backslash D)\). As already announced, we use a duality argument for that purpose. For the corresponding dual problem, we define

$$\begin{aligned} b := \left\{ \begin{array}{ll} [\beta _\varepsilon (u_\varepsilon - \psi ) - \beta _\varepsilon (u_{\varepsilon ,h}- \psi )]/(u_\varepsilon -u_{\varepsilon ,h}) &{} \text{ if } (u_\varepsilon -u_{\varepsilon ,h})(x) \ne 0,\\ 0 &{} \text{ else }. \end{array}\right. \end{aligned}$$
(21)

Note that \(0\le b \le \varepsilon ^{-1}\) almost everywhere in \(\Omega \). The upper bound follows from the Lipschitz-continuity of \(\beta _\varepsilon \) with Lipschitz constant \(\varepsilon ^{-1}\), while the lower bound is a consequence of the monotonicity of \(\beta _\varepsilon \).
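The two bounds on b can be checked numerically. The sketch below evaluates the difference quotient (21) for sample values and verifies \(0\le b\le \varepsilon ^{-1}\); the function names and the sample data are hypothetical and serve only as an illustration.

```python
import numpy as np


def beta_eps(s, eps, lam):
    """Truncated penalty term from (8)."""
    s = np.asarray(s, dtype=float)
    return np.where(s >= 0.0, 0.0, np.maximum(s / eps, -lam))


def difference_quotient(u, u_h, psi, eps, lam):
    """Pointwise coefficient b from (21) for arrays u and u_h of equal shape."""
    u = np.asarray(u, dtype=float)
    u_h = np.asarray(u_h, dtype=float)
    diff = u - u_h
    num = beta_eps(u - psi, eps, lam) - beta_eps(u_h - psi, eps, lam)
    b = np.zeros_like(diff)
    # b is set to zero wherever u and u_h coincide, as in (21).
    np.divide(num, diff, out=b, where=diff != 0.0)
    return b
```

Since \(\beta _\varepsilon \) is monotone, numerator and denominator in (21) never have opposite signs, and since its slope never exceeds \(\varepsilon ^{-1}\), the quotient stays below \(\varepsilon ^{-1}\). Where both arguments lie in the truncated region \(s\le -\lambda \varepsilon \) or in the region \(s\ge 0\), the quotient vanishes.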

Moreover, let

$$\begin{aligned} \tilde{\delta }\text { be a function from } C^\infty (\Omega ) \text { with } {\text {supp}} {\tilde{\delta }}\subset \Omega {\setminus } D \text { and } ||{\tilde{\delta }} ||_{L^1(\Omega )} \le 1. \end{aligned}$$
(22)

Then, we define \(G \in H^1_0(\Omega )\) as the weak solution to the dual problem

$$\begin{aligned} -\Delta G + bG= & {} {\tilde{\delta }}\quad \text{ in } \Omega ,\nonumber \\ G= & {} 0 \quad \text{ on } \partial \Omega . \end{aligned}$$
(23)

Before applying the duality argument in Theorem 3, let us state several auxiliary results.

Lemma 7

Let D with \(|D|\ge 0\) be a strip at the boundary where (20) holds. Moreover, let b and \(\tilde{\delta }\) be the functions from (21) and (22), respectively, and let \(G\in H^1_0(\Omega )\) be the solution of (23). Then, there holds

(i) \(||bG||_{L^1(\Omega )}\le 1,\)

(ii) \({\text {supp}} bG \subset \Omega {\setminus } D\).

Proof

(i) For \(t>0\) we define the regularized sign function \({{\,\mathrm{sgn}\,}}_t(x) := \frac{x}{\sqrt{x^2+t}}\). Testing (23) with \({{\,\mathrm{sgn}\,}}_t(G)\) yields

$$\begin{aligned} 1&\ge ({\tilde{\delta }},{{\,\mathrm{sgn}\,}}_t(G)) = (\nabla G, {{\,\mathrm{sgn}\,}}_t^\prime (G) \nabla G) + (bG, {{\,\mathrm{sgn}\,}}_t(G)). \end{aligned}$$

As a consequence, by means of the monotonicity of \({{\,\mathrm{sgn}\,}}_t\), we get

$$\begin{aligned} 1\ge (bG, {{\,\mathrm{sgn}\,}}_t(G))_{L^2(\Omega )}. \end{aligned}$$

Sending t to zero and recalling that \(b\ge 0\) yields \(1 \ge \int _{\Omega } b{{\,\mathrm{sgn}\,}}(G)G = ||bG ||_{L^1(\Omega )}\).

(ii) According to (20) we have \(\beta _\varepsilon (u_\varepsilon -\psi )=\beta _\varepsilon (u_{\varepsilon ,h}-\psi )=0\) a.e. on D such that \(b=0\) a.e. on D, and hence \(bG=0\) a.e. on D.

\(\square \)
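The three properties of the regularized sign function used in the proof of Lemma 7 (boundedness by one, monotonicity, and convergence to the sign function as \(t\rightarrow 0\)) can be illustrated by a short sketch. The function names are hypothetical; the derivative \(t/(x^2+t)^{3/2}\) follows from a direct computation.

```python
import numpy as np


def sgn_t(x, t):
    """Regularized sign function x / sqrt(x^2 + t) for t > 0."""
    x = np.asarray(x, dtype=float)
    return x / np.sqrt(x * x + t)


def sgn_t_prime(x, t):
    """Derivative of sgn_t, namely t / (x^2 + t)^(3/2), which is positive."""
    x = np.asarray(x, dtype=float)
    return t / (x * x + t) ** 1.5
```

Positivity of the derivative is what allows the gradient term \((\nabla G, {{\,\mathrm{sgn}\,}}_t^\prime (G) \nabla G)\) to be dropped in the proof.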

Lemma 8

Let D with \(|D|>0\) (independent of \(\varepsilon \) and h) be a strip at the boundary where (20) holds. Moreover, let b and \(\tilde{\delta }\) be the functions from (21) and (22), respectively. Then, there exists a constant \(h_d>0\) such that for all \(h\le h_d\) the solution G of (23) and its Ritz-projection \(R_hG\) fulfill

$$\begin{aligned} \Vert G-R_hG\Vert _{L^1(\Omega )} \le C h^2|\log h|^2 \end{aligned}$$

with a constant \(C>0\) independent of \(\varepsilon \), h and \(\tilde{\delta }\).

Proof

Let \(z \in H^1_0(\Omega )\) denote the unique weak solution to

$$\begin{aligned} \begin{array}{ll} -\Delta z = {{\,\mathrm{sgn}\,}}(G-R_hG) &{}\quad \text{ in } \quad \Omega ,\\ z = 0 &{} \quad \text{ on } \quad \partial \Omega . \end{array} \end{aligned}$$

By means of this equation, the orthogonality of the Ritz-projection, (23) and Lemma 7, we obtain

$$\begin{aligned} ||G&-R_hG||_{L^1(\Omega )}\\&=(G-R_hG,{{\,\mathrm{sgn}\,}}(G-R_hG)) =(\nabla (G-R_hG),\nabla z)\\&=(\nabla (G-R_hG),\nabla (z-R_hz)) = (\nabla G, \nabla (z-R_hz))\\&=({\tilde{\delta }}-bG, z-R_hz)_{L^2(\Omega {\setminus } D)} \le (||{\tilde{\delta }}||_{L^1(\Omega )}+||bG||_{L^1(\Omega )})\Vert z-R_hz\Vert _{L^\infty (\Omega {\setminus } D)}\\&\le 2 \Vert z-R_hz\Vert _{L^\infty (\Omega {\setminus } D)}. \end{aligned}$$

For technical reasons, we have to introduce another subset \(D'\) of \(\Omega \), which is smoothly bounded, fulfills \(D'\subset D\), and has a fixed and positive distance to D and to \(\partial \Omega \). Using local \(L^\infty \)-error estimates from [20, Theorem 10.1] in combination with a standard interpolation error estimate we get for h small enough

$$\begin{aligned} \Vert z-R_hz\Vert _{L^\infty (\Omega {\setminus } D)}&\le C \left( h^{2-\frac{N}{p}} |\log h| ||z||_{W^{2,p}(\Omega {\setminus } D')} + ||z-R_hz||_{L^2(\Omega )}\right) . \end{aligned}$$

A standard \(L^2(\Omega )\)-error estimate for the Ritz-projection together with elliptic regularity for z, and Lemma 3 implies

$$\begin{aligned} \Vert z&-R_hz\Vert _{L^\infty (\Omega {\setminus } D)}\\&\le C\left( ph^{2-\frac{N}{p}}|\log h| ||{{\,\mathrm{sgn}\,}}(G-R_hG)||_{L^p(\Omega )} + h^2 ||{{\,\mathrm{sgn}\,}}(G-R_hG)||_{L^2(\Omega )}\right) \\&\le C h^2 |\log h|^2(p|\log h|^{-1}h^{-\frac{N}{p}}+1), \end{aligned}$$

where we used that \(||{{\,\mathrm{sgn}\,}}(G-R_hG)||_{L^\infty (\Omega )}\le 1\). If we set \(p=|\log h|\), the desired result follows as \(h^{-\frac{N}{|\log h|}}=e^N\). \(\square \)
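The effect of the choice \(p=|\log h|\) can be verified directly. The following sketch, with hypothetical function names, evaluates \(h^{-N/p}\) and the first term \(p\,h^{2-N/p}|\log h|\) of the bound for this choice of p.

```python
import math


def growth_factor(h, N):
    """Evaluate h^(-N/p) for p = |log h| and 0 < h < 1; equals e^N up to rounding."""
    p = abs(math.log(h))
    return h ** (-N / p)


def first_term(h, N):
    """Evaluate p * h^(2 - N/p) * |log h| for p = |log h|."""
    p = abs(math.log(h))
    return p * h ** (2.0 - N / p) * abs(math.log(h))
```

For every \(h\in (0,1)\) the first function returns \(e^N\) up to rounding, so the second equals \(e^N h^2|\log h|^2\). Note that \(p=|\log h|\ge 2\), as required for \(p\in [2,\infty )\), holds only for \(h\le e^{-2}\), which is covered by the restriction to sufficiently small h.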

Theorem 3

Let D with \(|D|>0\) (independent of \(\varepsilon \) and h) be a strip at the boundary where (20) holds. Moreover, let \(u_\varepsilon \) and \(u_{\varepsilon ,h}\) be the solutions of (3) and (9), respectively. Then, there exists a constant \(h_d>0\) such that for all \(h\le h_d\) there holds

$$\begin{aligned} \Vert u_\varepsilon -u_{\varepsilon ,h}\Vert _{L^\infty (\Omega {\setminus } D)} \le C h^2 |\log h|^2 (\Vert f \Vert _{L^\infty (\Omega )} + \Vert \Delta \psi \Vert _{L^\infty (\Omega )} ) \end{aligned}$$

with a constant \(C>0\) independent of \(\varepsilon \) and h.

Proof

As \(L^\infty (\Omega {\setminus } D)=(L^1(\Omega {\setminus } D))^*\) we have that

$$\begin{aligned} ||u_\varepsilon -u_{\varepsilon ,h}||_{L^\infty (\Omega {\setminus } D)} = \sup _{\begin{array}{c} {\tilde{\delta }}\in C^\infty (\Omega )\\ {\text {supp}} {\tilde{\delta }} \subset \Omega {\setminus } D\\ ||{\tilde{\delta }} ||_{L^1(\Omega )} \le 1 \end{array}} \left| \int _{\Omega }(u_\varepsilon -u_{\varepsilon ,h}) {\tilde{\delta }}\right| . \end{aligned}$$

Let such a \({\tilde{\delta }}\) be the right hand side of (23). Consequently, we get

$$\begin{aligned} \int _{\Omega }(u_\varepsilon&-u_{\varepsilon ,h}){\tilde{\delta }}\\&= (\nabla (u_\varepsilon -u_{\varepsilon ,h}),\nabla G) + (\beta _\varepsilon (u_\varepsilon -\psi )-\beta _\varepsilon (u_{\varepsilon ,h}-\psi ),G) \\&= (\nabla (u_\varepsilon -u_{\varepsilon ,h}),\nabla (G-R_hG)) + (\beta _\varepsilon (u_\varepsilon -\psi )-\beta _\varepsilon (u_{\varepsilon ,h}-\psi ),G-R_hG), \end{aligned}$$

where we also used (3) and (9). The orthogonality of the Ritz-projection and (3) imply

$$\begin{aligned} \int _{\Omega }(u_\varepsilon -u_{\varepsilon ,h}){\tilde{\delta }}&= (\nabla u_\varepsilon ,\nabla (G-R_hG)) + (\beta _\varepsilon (u_\varepsilon -\psi )-\beta _\varepsilon (u_{\varepsilon ,h}-\psi ),G-R_hG)\nonumber \\&= (f - \beta _\varepsilon (u_{\varepsilon ,h}-\psi ),G-R_hG)\nonumber \\&\le ||f - \beta _\varepsilon (u_{\varepsilon ,h}-\psi )||_{L^\infty (\Omega )}||G-R_hG||_{L^1(\Omega )}. \end{aligned}$$

The assertion follows from the boundedness \(\Vert \beta _\varepsilon (u_{\varepsilon ,h}-\psi )\Vert _{L^\infty (\Omega )} \le c\Vert f+\Delta \psi \Vert _{L^\infty (\Omega )}\) according to (8), and Lemma 8. \(\square \)

If we now combine the results from Theorem 1, Lemma 5, and Theorems 2 and 3, as outlined in (12), we obtain the following result.

Theorem 4

Let u, \(u_\varepsilon \) and \(u_{\varepsilon ,h}\) be the solutions of (1), (3) and (9), respectively. In addition assume that \(\psi <0\) on the boundary. Then, there exist constants \(\varepsilon _d>0\) and \(h_d>0\) such that for all \(\varepsilon \le \varepsilon _d\) and \(h\le h_d\) there holds

$$\begin{aligned} \Vert u-u_{\varepsilon ,h}\Vert _{L^2(\Omega )} \le C\left( \varepsilon + h^2|\log h|^2\right) ( \Vert f \Vert _{L^\infty (\Omega )} + \Vert \Delta \psi \Vert _{L^\infty (\Omega )} ) \end{aligned}$$

with a constant \(C>0\) independent of \(\varepsilon \) and h, and using \(\varepsilon = \mathcal O(h^2|\log h|^2)\) we get

$$\begin{aligned} \Vert u-u_{\varepsilon ,h}\Vert _{L^2(\Omega )} \le Ch^2|\log h|^2 ( \Vert f \Vert _{L^\infty (\Omega )} + \Vert \Delta \psi \Vert _{L^\infty (\Omega )} ). \end{aligned}$$
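As a rough guide to what the coupling \(\varepsilon = \mathcal O(h^2|\log h|^2)\) means in practice, the sketch below evaluates the h- and \(\varepsilon \)-dependent factor of the bound (without the unknown constant C and the data norms) and the order one would predict from it when h is halved. The function names and the coupling constant `kappa` are hypothetical.

```python
import math


def rate_factor(eps, h):
    """Factor eps + h^2 |log h|^2 from Theorem 4, without C and the data norms."""
    return eps + h * h * math.log(h) ** 2


def coupled_eps(h, kappa=1.0):
    """Regularization parameter coupled to the mesh size, eps = kappa h^2 |log h|^2."""
    return kappa * h * h * math.log(h) ** 2


def predicted_order(h, kappa=1.0):
    """Order predicted by the bound when passing from mesh size h to h/2."""
    coarse = rate_factor(coupled_eps(h, kappa), h)
    fine = rate_factor(coupled_eps(h / 2.0, kappa), h / 2.0)
    return math.log2(coarse / fine)
```

For \(h=2^{-k}\) the predicted order equals \(2-2\log _2((k+1)/k)\), independently of `kappa`. It stays below two on every finite mesh and approaches two only slowly as \(k\rightarrow \infty \), which is the effect of the \(\log \)-factor.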

We close this section with some remarks on certain additional aspects of our approach.

Remark 5

(Inactivity at the boundary \(\partial \Omega \)) The previous results are derived under the assumption that the obstacle is inactive on the boundary. This is due to the appearance of singular terms within the primal and dual solutions at the singular points of the boundary, which are the corners of the domain for \(N=2\), and the corners and edges for \(N=3\). However, the singularities are local phenomena. Away from the singular points, the regularity of the primal and dual solutions is only limited by the regularity of the data and the obstacle. For that reason, it is also sufficient to only assume inactivity of \(\psi \) on the boundary at the singular points.

Remark 6

(Non-convex domains) Throughout the whole paper, we have assumed that the domain is convex. Let us briefly comment on the non-convex case. As already noticed in the previous remark, the singularities are only local phenomena around the singular points. Thus, the \(W^{2,p}(\Omega \backslash D')\) regularity in the interior of the domain still holds. Further, the global \(H^1(\Omega )\cap C(\overline{\Omega })\) regularity is valid. Only the \(H^2(\Omega )\) regularity up to the boundary might no longer be true. Instead one may employ \(H^{1+t}(\Omega )\) regularity with some \(t\in (1/2,1]\) (depending on the singularities). Consequently, one may use the weak convergence of \(u_\varepsilon \) to \(\hat{u}\) in \(H^{1+t}(\Omega )\) (instead of \(H^2(\Omega )\)) within the proof of Lemma 2 in order to show the strong convergence of the sequence in \(C(\overline{\Omega })\). In addition, one has to replace the estimates for

$$\begin{aligned} ||u_\varepsilon -R_hu_\varepsilon ||_{L^2(\Omega )} \text { in (12) and }||z-R_hz||_{L^2(\Omega )} \text { within the proof of Lemma}\, 8 \end{aligned}$$

by the correspondingly adapted estimates. In general, this will lead to reduced overall convergence rates of order 2t (due to the corner/edge singularities affecting those estimates). However, it is possible to use mesh grading techniques to retain the full order of convergence for the critical terms. As a consequence, and as the results of Lemma 6 also hold on sharply varying grids (see the discussion before Lemma 6), graded meshes can also be used for the present discretization strategy for the obstacle problem to retain the convergence rates of Theorem 4 in non-convex domains.

5 Numerical validation

For the numerical realization of the fully discrete Eq. (9) we employ the finite element toolbox iFEM [3] inside Matlab® R2018a. In the numerical examples of the subsequent Sects. 5.1 and 5.2 the discrete subspaces \(V_h\) are constructed by piecewise linear and globally continuous functions on a sequence of subdivisions of \(\Omega \) into triangles. The computational domains \(\Omega \subset \mathbb {R}^2\) are exactly specified below. Moreover, we fix \(f=-30\) and \(\psi = -1\). The constant c in the definition of \(\lambda \) (7) is chosen as \(c=6\).

The example in Sect. 5.1 shows that the convergence rates of \(u_{\varepsilon ,h}\) in terms of \(\varepsilon \) and h are sharp. More precisely, we see that the exponents of \(\varepsilon \) and \(h|\log h|\) in

$$\begin{aligned} ||u - u_{\varepsilon ,h}||_{L^2(\Omega )} \le C(\varepsilon + h^2|\log h|^2), \end{aligned}$$
(24)

proven in Theorem 4, can essentially not be improved.

The example in Sect. 5.2 studies the influence of the largest interior angle of a polygonal domain on the convergence rates in \(L^2(\Omega )\) and \(L^\infty (\Omega )\). It illustrates that the result of Theorem 4, and hence estimate (24), is valid in general convex domains. However, the convergence rates in \(L^\infty (\Omega )\) may be reduced depending on the largest interior angle due to the appearance of corner singularities. Let us denote by \(\alpha \in [\pi /3,\pi )\) the largest interior angle of the domain. Then, one can show (neglecting \(\log \)-terms)

$$\begin{aligned} ||u - u_{\varepsilon ,h}||_{L^\infty (\Omega )} \le C ( \varepsilon + h^{\min \{2,\pi /\alpha \}-\delta }) \end{aligned}$$
(25)

for an arbitrarily small \(\delta >0\). For instance, this can be deduced from [17, Lem. 2.2 and Thm. 2.3] having in mind the reduced regularity stemming from the corner singularities.
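For orientation, the exponent \(\min \{2,\pi /\alpha \}\) in (25) can be tabulated for given angles. The helper below is a hypothetical illustration; it ignores the arbitrarily small \(\delta \) and the \(\log \)-terms.

```python
import math


def linf_rate_exponent(alpha):
    """Exponent min{2, pi/alpha} from (25) for a largest interior angle alpha in (0, pi)."""
    if not 0.0 < alpha < math.pi:
        raise ValueError("alpha must lie in (0, pi)")
    return min(2.0, math.pi / alpha)
```

For the angles \(\frac{5}{8}\pi \), \(\frac{3}{4}\pi \) and \(\frac{17}{18}\pi \) appearing in Tables 1, 2 and 3 this gives \(8/5\), \(4/3\) and \(18/17\), respectively, whereas any angle \(\alpha \le \pi /2\) yields the full exponent two.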

Before turning our attention to the numerical examples, we notice that we use reference solutions (computed on a fine mesh and with a small regularization parameter) for the purpose of comparison, as we do not have analytic solutions to any of our numerical examples.

5.1 Validation of the discretization error estimates in \(L^2(\Omega )\)

In this section we verify (24). As underlying domain we choose the unit square \(\Omega = (0,1)^2\). The reference solution is computed with \(\varepsilon _{ref} = 10^{-7}\) and \(h_{ref} = 0.5^{11} \approx 4.9\cdot 10^{-4}\). From the structure of the estimate one expects that for small h the total error is dominated by the error caused by \(\varepsilon \) and vice versa. To show this, we calculate solutions \(u_{\varepsilon ,h}\) to (9) for sequences \(\varepsilon \) and h tending to zero. In Fig. 1 we show \(||u_{\varepsilon ,h}-u_{ref} ||_{L^2(\Omega )}\) as a function of \(\varepsilon \) for fixed values of h, while in Fig. 2 we show \(||u_{\varepsilon ,h}-u_{ref} ||_{L^2(\Omega )}\) as a function of h for fixed values of \(\varepsilon \).

Fig. 1 Evolution of \(\Vert u_{ref}-u_{\varepsilon ,h}\Vert _{L^2(\Omega )}\) for \(\varepsilon \rightarrow 0\) and for several values of h

In Fig. 1 we observe that for every fixed h, the error becomes stationary for small \(\varepsilon \) and cannot be further reduced by reducing \(\varepsilon \). Hence, the discretization error is dominating in this case. Moreover, for h sufficiently small we observe first order convergence in terms of \(\varepsilon \). This is in agreement with our theoretical findings, see (24). An analogous result is observed in Fig. 2, but with \(\varepsilon \) and h changing their roles. As expected, in terms of h we see a convergence rate close to two.

Fig. 2 Evolution of \(\Vert u_{ref}-u_{\varepsilon ,h}\Vert _{L^2(\Omega )}\) for \(h\rightarrow 0\) and for several values of \(\varepsilon \)

Table 1 Experimental orders of convergence of \(\eta _h^{L^2} = \Vert u_{ref}-u_{\varepsilon ,h}\Vert _{L^2(\Omega )}\) and \(\eta _h^{L^\infty }=~\Vert u_{ref}-u_{\varepsilon ,h}\Vert _{L^\infty (\Omega )}\) on \(\Omega _{\frac{5}{8}\pi }\)
Table 2 Experimental orders of convergence of \(\eta _h^{L^2} = \Vert u_{ref}-u_{\varepsilon ,h}\Vert _{L^2(\Omega )}\) and \(\eta _h^{L^\infty } =~\Vert u_{ref}-u_{\varepsilon ,h}\Vert _{L^\infty (\Omega )}\) on \(\Omega _{\frac{3}{4}\pi }\)
Table 3 Experimental orders of convergence of \(\eta _h^{L^2} = \Vert u_{ref}-u_{\varepsilon ,h}\Vert _{L^2(\Omega )}\) and \(\eta _h^{L^\infty }=~\Vert u_{ref}-u_{\varepsilon ,h}\Vert _{L^\infty (\Omega )}\) on \(\Omega _{\frac{17}{18}\pi }\)

5.2 Influence of the largest interior angle on the error in \(L^2\) and \(L^\infty \)

In this section we verify (24) and (25) on domains \(\Omega \) with varying largest interior angle. The reference solution for each experiment is calculated with \(\varepsilon _{ref} = 10^{-4}\) and \(h_{ref} = 2^{-11} \approx 4.9\cdot 10^{-4}\). Moreover, we fix \(\varepsilon = 10^{-4}\) for all experiments. Hence, we only investigate how the error behaves with respect to h, depending on the largest interior angle. As computational domains, we consider the domains \(\Omega _\alpha \) with largest interior angle \(\alpha \in [\pi /2,\pi )\), which are defined by

$$\begin{aligned} \bar{\Omega }_\alpha := {\text {conv}} \{ (0,0), (1,0), (0,1), \frac{1}{2}( 1 + \tan ( \alpha / 2 )^{-1} ) (1,1) \}. \end{aligned}$$

In particular, the case \(\alpha = \frac{1}{2}\pi \) leads to the unit square \((0,1)^2\), while for \(\alpha \rightarrow \pi \) the domain \(\Omega _\alpha \) degenerates to a right triangle.
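As a quick plausibility check of this definition, the following Python sketch builds the four vertices of \(\bar{\Omega }_\alpha \) and computes the interior angles of the resulting quadrilateral. The function names `omega_alpha_vertices` and `interior_angles` are introduced here for illustration only; the sketch assumes that the vertices are listed in counterclockwise order and that the polygon is convex.

```python
import math


def omega_alpha_vertices(alpha):
    """Vertices of the closure of Omega_alpha, in counterclockwise order."""
    if not (math.pi / 2 <= alpha < math.pi):
        raise ValueError("alpha must lie in [pi/2, pi)")
    # Fourth vertex t * (1, 1) with t = (1 + 1 / tan(alpha / 2)) / 2.
    t = 0.5 * (1.0 + 1.0 / math.tan(alpha / 2.0))
    return [(0.0, 0.0), (1.0, 0.0), (t, t), (0.0, 1.0)]


def interior_angles(vertices):
    """Interior angles (radians) of a convex polygon given counterclockwise."""
    n = len(vertices)
    angles = []
    for i in range(n):
        px, py = vertices[i - 1]
        cx, cy = vertices[i]
        qx, qy = vertices[(i + 1) % n]
        # Edge vectors pointing from the current vertex to its two neighbors.
        ax, ay = px - cx, py - cy
        bx, by = qx - cx, qy - cy
        cos_angle = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
        # Clamp against rounding before calling acos.
        angles.append(math.acos(max(-1.0, min(1.0, cos_angle))))
    return angles


for alpha in (5 * math.pi / 8, 3 * math.pi / 4, 17 * math.pi / 18):
    largest = max(interior_angles(omega_alpha_vertices(alpha)))
    print(f"alpha = {alpha:.6f}, largest interior angle = {largest:.6f}")
```

The largest angle is attained at the vertex on the diagonal, where a short computation gives \(\cos \theta = \cos \alpha \), so the parametrization indeed realizes the prescribed angle \(\alpha \).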

We perform experiments for three particular domains with largest interior angle \(\frac{5}{8}\pi \), \(\frac{3}{4}\pi \), and \(\frac{17}{18}\pi \). Our observations are presented in Tables 1, 2, and 3. Here \(\eta _h^{L^p} := ||u_{\varepsilon ,h}-u _{ref}||_{L^p(\Omega )}\) abbreviates the error between the numerical solution \(u_{\varepsilon ,h}\) and the reference solution \(u_{ref}\) in the \(L^p\)-norm (\(p \in \{2,\infty \}\)). For sequences \((h_k)\) and \((\eta _k)\) we define the experimental order of convergence (EOC) by

$$\begin{aligned} \mathrm {EOC}_k = \frac{\log (\eta _k) - \log (\eta _{k-1})}{\log (h_k) - \log (h_{k-1})} \end{aligned}$$

as an approximation to the convergence rate of \((\eta _k)\) with respect to \((h_k)\). We observe that the experimental orders of convergence for \(\eta _h^{L^2}\) are close to two on all three domains, as expected from (24). In the case of \(\eta _h^{L^\infty }\), we observe a decreasing convergence rate for an increasing largest interior angle. The corresponding experimental orders of convergence closely follow the theoretical rate \(\min \{2,\pi /\alpha \}-\delta \) from (25).
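The EOC formula and the rate predicted by (25) are easy to evaluate. The following Python sketch is an illustration only: the function names `experimental_orders` and `predicted_linf_rate` are hypothetical, and the error sequence is synthetic data generated from an exact power law, not the values reported in Tables 1, 2, and 3.

```python
import math


def experimental_orders(h, eta):
    """EOC_k = (log eta_k - log eta_{k-1}) / (log h_k - log h_{k-1}), k >= 1."""
    if len(h) != len(eta):
        raise ValueError("h and eta must have the same length")
    return [
        (math.log(eta[k]) - math.log(eta[k - 1]))
        / (math.log(h[k]) - math.log(h[k - 1]))
        for k in range(1, len(h))
    ]


def predicted_linf_rate(alpha):
    """Rate min{2, pi / alpha} from (25), ignoring the small delta."""
    return min(2.0, math.pi / alpha)


# Synthetic errors eta = 3 * h**1.6 (hypothetical data for illustration).
h = [2.0 ** (-k) for k in range(3, 8)]
eta = [3.0 * hk ** 1.6 for hk in h]
print("EOC:", [round(e, 4) for e in experimental_orders(h, eta)])

# Predicted L^infinity rates for the three angles considered in the text.
for alpha in (5 * math.pi / 8, 3 * math.pi / 4, 17 * math.pi / 18):
    print(f"alpha = {alpha:.4f}, predicted rate = {predicted_linf_rate(alpha):.4f}")
```

For the three angles considered here, \(\min \{2,\pi /\alpha \}\) equals \(8/5\), \(4/3\), and \(18/17\), respectively, which are the values the \(L^\infty \) orders should approach up to \(\delta \).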