Abstract
We define and analyse a least-squares finite element method for a first-order reformulation of the obstacle problem. Moreover, we derive variational inequalities that are based on similar but non-symmetric bilinear forms. A priori error estimates including the case of non-conforming convex sets are given and optimal convergence rates are shown for the lowest-order case. We provide a posteriori bounds that can be used as error indicators in an adaptive algorithm. Numerical studies are presented.
1 Introduction
Many physical problems are of obstacle type or, more generally, are described by variational inequalities [25, 29]. In this article we consider, as a model problem, the classical obstacle problem where one seeks the equilibrium position of an elastic membrane constrained to lie over an obstacle. Another important example of an elliptic obstacle problem is the bending of a plate over an obstacle.
Numerical methods for such problems, in particular finite element methods, already have a long history; see, e.g., the books [16, 17] for an overview of the topic. However, the literature on least-squares methods for obstacle problems is scarce. In fact, until the writing of this paper only [9] was available for the classical obstacle problem; the idea there goes back to a Nitsche-based method for contact problems introduced and analyzed in [11]. An analysis of first-order least-squares finite element methods for Signorini problems can be found in [1] and, more recently, [26]. Let us also mention the pioneering work [14] on the a priori analysis of a classical finite element scheme. Newer articles include [18,19,20], where mixed and stabilized methods are considered.
Least-squares finite element methods are a widespread class of numerical schemes; their basic idea is to approximate the solution by minimizing the residual in a given norm. Let us recall some important properties of least-squares finite element methods; a detailed list is given in the introduction of the overview article [5], see also the book [6].
Unconstrained stability One feature of least-squares schemes is that the methods are stable for all pairings of discrete spaces.
Adaptivity Another feature is that a posteriori error bounds are obtained by simply evaluating the least-squares functional. For instance, standard least-squares methods for the Poisson problem [6] are based on minimizing residuals in \(L^2\) norms, which can be localized and then used as error indicators in an adaptive algorithm.
The main purpose of this paper is to close this gap in the literature and define least-squares based methods for the obstacle problem. In particular, we want to study whether the aforementioned properties transfer to the case of obstacle problems. Let us briefly describe the functional our method is based on. For simplicity assume a zero obstacle (the remainder of the paper deals with general non-zero obstacles). Then, the problem reads
\[ -\Delta u - f \ge 0, \qquad u \ge 0, \qquad u\,(-\Delta u - f) = 0 \]
in some domain \(\Omega \) and \(u|_{\partial \Omega }=0\). Introducing the Lagrange multiplier (or reaction force) \(\lambda = -\Delta u-f\) and \(\varvec{\sigma }=\nabla u\), we rewrite the problem as a first-order system, see also [2, 3, 9, 18],
\[ {{\,\mathrm{div}\,}}\varvec{\sigma } + \lambda + f = 0, \qquad \nabla u - \varvec{\sigma } = 0, \qquad u\ge 0, \quad \lambda \ge 0, \quad \lambda \, u = 0. \]
Note that \(f\in L^2(\Omega )\) does not imply more regularity for u, so that in general \(\lambda \in H^{-1}(\Omega )\) lives only in the dual space. However, observe that \({{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda =-f\in L^2(\Omega )\) and therefore the functional
\[ J(u,\varvec{\sigma },\lambda ) := \Vert {{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda +f\Vert _{}^2 + \Vert \nabla u-\varvec{\sigma }\Vert _{}^2 + \langle \lambda ,u\rangle , \]
where \(\langle \cdot ,\cdot \rangle \) denotes a duality pairing, is well-defined for \({{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda \in L^2(\Omega )\). We will show that minimizing J over a convex set with the additional constraints \(u\ge 0\), \(\lambda \ge 0\) is equivalent to solving the obstacle problem. We will consider the variational inequality associated to this problem with corresponding bilinear form \(a(\cdot ,\cdot )\). An issue that arises is that \(a(\cdot ,\cdot )\) is not necessarily coercive. However, as it turns out, a simple scaling of the first term in the functional ensures coercivity on the whole space. In view of the aforementioned properties, this means that our method is unconstrained stable. The recent work [18], based on a Lagrange formulation (without reformulation to a first-order system), considers augmenting the trial spaces with bubble functions (mixed method) or adding residual terms (stabilized method) to obtain stability. The authors also extended their work to plate-bending problems, see [20].
Another motivation for the proposed first-order reformulation is that it allows the simultaneous approximation of displacements and stresses. In many problems of structural engineering the stress is usually the primary quantity of interest. For the present problem of an elastic membrane the stress is directly related to the gradient, and for the problem of bending a plate over an obstacle the physical quantities of interest are the bending moments.
Furthermore, we will see that the functional J evaluated at some discrete approximation \((u_h,\varvec{\sigma }_h,\lambda _h)\) with \(u_h,\lambda _h\ge 0\) is an upper bound for the error. Note that for \(\lambda _h\in L^2(\Omega )\) the duality \(\langle \lambda _h,u_h\rangle \) reduces to the \(L^2\) inner product. Thus, all the terms in the functional can be localized and used as error indicators.
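As an illustration of how such localized contributions drive an adaptive loop, here is a minimal marking sketch in Python; the elementwise values \(\eta _T^2\) and the bulk parameter \(\theta \) are made up for illustration, and the function name is ours, not taken from any library:

```python
import numpy as np

# Hypothetical elementwise contributions eta_sq[T] of a least-squares functional,
# e.g. ||div sigma_h + lambda_h + f||_T^2 + ||grad u_h - sigma_h||_T^2 + (lambda_h, u_h - g)_T,
# all of which are non-negative and therefore usable as local error indicators.
eta_sq = np.array([0.08, 0.01, 0.32, 0.02, 0.16, 0.01])

def doerfler_mark(eta_sq, theta=0.5):
    """Return a minimal set M of elements with sum_{T in M} eta_sq[T] >= theta * sum eta_sq."""
    order = np.argsort(eta_sq)[::-1]        # elements sorted by decreasing indicator
    cumulative = np.cumsum(eta_sq[order])
    k = np.searchsorted(cumulative, theta * cumulative[-1]) + 1
    return order[:k]

marked = doerfler_mark(eta_sq, theta=0.5)   # elements to refine
estimator = np.sqrt(eta_sq.sum())           # global estimator value
```

The marked elements would then be refined and the loop repeated; the Dörfler (bulk) criterion shown here is one common choice, not the only one.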
Additionally, we will derive and analyse other variational inequalities that are also based on the first-order reformulation. The resulting methods are quite similar to the least-squares scheme since they share the same residual terms. The only difference is that the complementarity condition \(\lambda u = 0\) is incorporated in a different, non-symmetric, way. We will present a uniform analysis that covers the least-squares formulation and the novel variational inequalities of the obstacle problem.
Finally, we point out that the use of adaptive schemes for obstacle problems is quite natural. First, the solutions may suffer from singularities stemming from the geometry, and second, the free boundary is a priori unknown. There exists plenty of literature on a posteriori estimators and adaptivity for finite element methods for the obstacle problem, see e.g., [4, 7, 10, 27, 28, 31, 32] to name a few. Many of the estimators are based on the use of a discrete Lagrange multiplier which is obtained in a postprocessing step. In contrast, our proposed methods simultaneously approximate the Lagrange multiplier. This allows for a simple analysis of reliable a posteriori bounds.
1.1 Outline
The remainder of the paper is organized as follows. In Sect. 2 we describe the model problem, introduce the corresponding first-order system and based on that reformulation define our least-squares method. Then, Sect. 3 deals with the definition and analysis of different variational inequalities. In Sect. 4 we provide an a posteriori analysis and numerical studies are presented in Sect. 5. The appendix contains an example, which shows that \(a(\cdot ,\cdot )\) is not coercive in general, and proofs of some auxiliary results.
2 Least-squares method
In Sects. 2.1 and 2.2 we describe the model problem and introduce the reader to our notation. Then, Sect. 2.3 is devoted to the definition and analysis of a least-squares functional.
2.1 Model problem
Let \(\Omega \subset \mathbb {R}^n\), \(n=2,3\) denote a polygonal Lipschitz domain with boundary \(\Gamma =\partial \Omega \). For given \(f\in L^2(\Omega )\) and \(g\in H^1(\Omega )\) with \(g|_{\Gamma }\le 0\) we consider the classical obstacle problem: find a solution u to
\[ -\Delta u \ge f, \quad u \ge g, \quad (-\Delta u - f)(u - g) = 0 \quad \text {in } \Omega , \qquad u|_\Gamma = 0. \tag{1} \]
It is well-known that this problem admits a unique solution \(u\in H_0^1(\Omega )\), and it can be equivalently characterized by the variational inequality: find \(u\in H_0^1(\Omega )\), \(u\ge g\) such that
\[ (\nabla u,\nabla v - \nabla u) \ge (f,v-u) \quad \text {for all } v\in H_0^1(\Omega ),\ v\ge g, \tag{2} \]
see [25]. For a more detailed description of the involved function spaces we refer to Sect. 2.2 below.
2.2 Notation and function spaces
We use the common notation for Sobolev spaces \(H_0^1(\Omega )\), \(H^s(\Omega )\) (\(s>0\)). Let \((\cdot ,\cdot )\) denote the \(L^2(\Omega )\) inner product, which induces the norm \(\Vert \cdot \Vert _{}\). The dual of \(H_0^1(\Omega )\) is denoted by \(H^{-1}(\Omega ) := (H_0^1(\Omega ))^*\), where duality \(\langle \cdot ,\cdot \rangle \) is understood with respect to the extended \(L^2(\Omega )\) inner product. We equip \(H^{-1}(\Omega )\) with the dual norm
\[ \Vert \lambda \Vert _{-1} := \sup _{0\ne v\in H_0^1(\Omega )} \frac{\langle \lambda ,v\rangle }{\Vert \nabla v\Vert _{}}. \]
Recall Friedrichs’ inequality
\[ \Vert v\Vert _{} \le C_F \Vert \nabla v\Vert _{} \quad \text {for all } v\in H_0^1(\Omega ), \]
where \(0<C_F=C_F(\Omega )\le {{\,\mathrm{diam}\,}}(\Omega )\). Thus, by definition we have \(\Vert \lambda \Vert _{-1}\le C_F\Vert \lambda \Vert _{}\) for \(\lambda \in L^2(\Omega )\).
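The final bound is a one-line computation for \(\lambda \in L^2(\Omega )\); assuming the dual norm is taken with respect to the gradient norm, \(\Vert \lambda \Vert _{-1} = \sup _{0\ne v\in H_0^1(\Omega )}\langle \lambda ,v\rangle /\Vert \nabla v\Vert _{}\) (the convention consistent with the constant \(C_F\) appearing here), the chain of estimates reads:

```latex
\|\lambda\|_{-1}
  = \sup_{0 \neq v \in H_0^1(\Omega)} \frac{(\lambda, v)}{\|\nabla v\|}
  \le \sup_{0 \neq v \in H_0^1(\Omega)} \frac{\|\lambda\|\,\|v\|}{\|\nabla v\|}
  \le \sup_{0 \neq v \in H_0^1(\Omega)} \frac{\|\lambda\|\,C_F\|\nabla v\|}{\|\nabla v\|}
  = C_F\,\|\lambda\| .
```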
Let \({{\,\mathrm{div}\,}}{:}\,\varvec{L}^2(\Omega ):=L^2(\Omega )^n \rightarrow H^{-1}(\Omega )\) denote the generalized divergence operator, i.e., \(\langle {{\,\mathrm{div}\,}}\varvec{\sigma },u\rangle := -(\varvec{\sigma },\nabla u)\) for all \(\varvec{\sigma }\in \varvec{L}^2(\Omega )\), \(u\in H_0^1(\Omega )\). This operator is bounded,
\[ \Vert {{\,\mathrm{div}\,}}\varvec{\sigma }\Vert _{-1} \le \Vert \varvec{\sigma }\Vert _{} \quad \text {for all } \varvec{\sigma }\in \varvec{L}^2(\Omega ). \]
Let \(v\in H^1(\Omega )\). We say \(v\ge 0\) if \(v\ge 0\) a.e. in \(\Omega \). Moreover, \(\lambda \ge 0\) for \(\lambda \in H^{-1}(\Omega )\) means that \(\langle \lambda ,v\rangle \ge 0\) for all \(v\in H_0^1(\Omega )\) with \(v\ge 0\). We also interpret \(v\ge w\) as \(v-w\ge 0\) for \(v,w\in H^1(\Omega )\).
Define the space
with norm
and the space
with norm
Observe that \(\Vert \cdot \Vert _{U}\) is a stronger norm than \(\Vert \cdot \Vert _{V}\), i.e.,
Our first least-squares formulation will be based on the minimization over the non-empty, convex and closed subset
\[ K^s := \big \{(v,{\varvec{\tau }},\mu )\in U {:}\, v\ge g,\ \mu \ge 0\big \}, \]
where g is the given obstacle function. We will also derive and analyse variational inequalities based on non-symmetric bilinear forms that utilize the sets
\[ K^0 := \big \{(v,{\varvec{\tau }},\mu )\in U {:}\, v\ge g\big \}, \qquad K^1 := \big \{(v,{\varvec{\tau }},\mu )\in U {:}\, \mu \ge 0\big \}. \]
Clearly, \(K^s\subset K^{j}\) for \(j=0,1\).
We write \(A\lesssim B\) if there exists a constant \(C>0\), independent of quantities of interest, such that \(A\le C B\). Analogously we define \(A\gtrsim B\). If \(A\lesssim B\) and \(B\lesssim A\) hold then we write \(A\simeq B\).
2.3 Least-squares functional
Let \(u\in H_0^1(\Omega )\) denote the unique solution of the obstacle problem (1). Define \(\lambda := -\Delta u - f\in H^{-1}(\Omega )\) and \(\varvec{\sigma }:=\nabla u\). Problem (1) can equivalently be written as the first-order problem
\[ {{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda +f = 0, \quad \nabla u-\varvec{\sigma }= 0, \quad u\ge g, \quad \lambda \ge 0, \quad \langle \lambda ,u-g\rangle = 0. \tag{3} \]
Observe that \({{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda \in L^2(\Omega )\) and that the unique solution \(\varvec{u}=(u,\varvec{\sigma },\lambda )\in U\) satisfies \(\varvec{u}\in K^s\). We consider the functional
\[ J(\varvec{u};f,g) := \Vert {{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda +f\Vert _{}^2 + \Vert \nabla u-\varvec{\sigma }\Vert _{}^2 + \langle \lambda ,u-g\rangle \]
for \(\varvec{u}=(u,\varvec{\sigma },\lambda )\in U\), \(f\in L^2(\Omega )\), \(g\in H_0^1(\Omega )\) and the minimization problem: find \(\varvec{u}\in K^s\) with
\[ J(\varvec{u};f,g) = \min _{\varvec{v}\in K^s} J(\varvec{v};f,g). \tag{4} \]
Note that the definition of the functional only makes sense if \(g\in H_0^1(\Omega )\), since otherwise the duality \(\langle \lambda ,u-g\rangle \) is not defined for \(\lambda \in H^{-1}(\Omega )\).
Theorem 1
If \(f\in L^2(\Omega )\), \(g\in H_0^1(\Omega )\), then problems (3) and (4) are equivalent. In particular, there exists a unique solution \(\varvec{u}\in K^s\) of (4) and it holds that
\[ \Vert \varvec{u}-\varvec{v}\Vert _{U}^2 \le C_J\, J(\varvec{v};f,g) \quad \text {for all } \varvec{v}\in K^s. \tag{5} \]
The constant \(C_J>0\) depends only on \(\Omega \).
Proof
Let \(\varvec{u}:= (u,\varvec{\sigma },\lambda ) =(u,\nabla u,-\Delta u - f)\in K^s\) denote the unique solution of (3). Observe that \(J(\varvec{v};f,g)\ge 0\) for all \(\varvec{v}\in K^s\) and \(J(\varvec{u};f,g)=0\), thus, \(\varvec{u}\) minimizes the functional. Suppose (5) holds and that \(\varvec{u}^*\in K^s\) is another minimizer. Then, (5) proves that \(\varvec{u}=\varvec{u}^*\). It only remains to show (5). Let \(\varvec{v}=(v,{\varvec{\tau }},\mu )\in K^s\). Note that all terms in \(J(\varvec{v};f,g)\) are non-negative. Since \(f=-{{\,\mathrm{div}\,}}\varvec{\sigma }-\lambda \) and \(\nabla u-\varvec{\sigma }= 0\) we have with the constant \(C_F>0\) that
Moreover, \(\langle \lambda ,u-g\rangle =0\) and \(\langle \lambda ,v-g\rangle \ge 0\), \(\langle \mu ,u-g\rangle \ge 0\). We estimate
Define \(\varvec{w}:= (w,{\varvec{\chi }},\nu ) := \varvec{v}-\varvec{u}\). Then, the Cauchy–Schwarz inequality, Young’s inequality and the definition of the divergence operator yield
Application of the Cauchy–Schwarz inequality, Friedrichs’ inequality and Young’s inequality gives us for the last term and \(\delta >0\)
Putting everything together and choosing \(\delta =\tfrac{1}{2}\) we end up with
which finishes the proof. \(\square \)
Remark 2
Note that (5) measures the error of any function \(\varvec{v}\in K^s\), in particular, it can be used as a posteriori error estimator when \(\varvec{v}\in K_h^s\subset K^s\) is a discrete approximation. However, in practice the condition \(K_h^s \subset K^s\) is often hard to realize. Below we introduce a simple scaling of the first term in the least-squares functional that allows us to prove coercivity of the associated bilinear form on the whole space U.
For given \(f\in L^2(\Omega )\), \(g\in H_0^1(\Omega )\), and fixed parameter \(\beta >0\) define the bilinear form \(a_\beta {:}\,U\times U \rightarrow \mathbb {R}\) and the functional \(F_\beta {:}\,U\rightarrow \mathbb {R}\) by
for all \(\varvec{u}=(u,\varvec{\sigma },\lambda ), \varvec{v}= (v,{\varvec{\tau }},\mu )\in U\). We stress that \(a_1(\cdot ,\cdot )\) and \(F_1(\cdot )\) induce the functional \(J(\cdot ;\cdot )\), i.e.,
Since J is differentiable it is well-known that the solution \(\varvec{u}\in K^s\) of (4) satisfies the variational inequality
\[ a_1(\varvec{u},\varvec{v}-\varvec{u}) \ge F_1(\varvec{v}-\varvec{u}) \quad \text {for all } \varvec{v}\in K^s. \tag{8} \]
Conversely, if J is also convex in \(K^s\), then any solution of (8) solves (4). However, J is convex on \(K^s\) iff \(a_1(\varvec{v}-\varvec{w},\varvec{v}-\varvec{w})\ge 0\) for all \(\varvec{v},\varvec{w}\in K^s\), which is not true in general, see the example in “Appendix A”. In Sect. 3 below we will show that for sufficiently large \(\beta >1\) the bilinear form \(a_\beta (\cdot ,\cdot )\) is coercive, even on the whole space U. This has the advantage that we can prove unique solvability of the continuous problem and its discretization simultaneously. More importantly, in practice this allows the use of non-conforming subsets \(K_h^s\nsubseteq K^s\).
3 Variational inequalities
In this section we introduce and analyse different variational inequalities. The idea of including the complementarity condition in different ways has also been used in [15] to derive DPG methods for contact problems.
We define the bilinear forms \(b_\beta ,c_\beta {:}\,U\times U\rightarrow \mathbb {R}\) and functionals \(G_\beta \), \(H_\beta \) by
Let \(\varvec{u}= (u,\varvec{\sigma },\lambda )\in K^s\subset K^j\) (\(j=0,1\)) denote the unique solution of (3) with \(f\in L^2(\Omega )\), \(g\in H_0^1(\Omega )\). Recall that \({{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda = -f\). Testing this identity with \({{\,\mathrm{div}\,}}{\varvec{\tau }}+\mu \), multiplying with \((\beta -1)\) and adding it to (8) we see that the solution \(\varvec{u}\in K^s\) satisfies the variational inequality
\[ a_\beta (\varvec{u},\varvec{v}-\varvec{u}) \ge F_\beta (\varvec{v}-\varvec{u}) \quad \text {for all } \varvec{v}\in K^s. \tag{VIa} \]
For the derivation of our second variational inequality let \(\varvec{u}=(u,\varvec{\sigma },\lambda )\in K^{0}\) denote the unique solution of (3) with \(f\in L^2(\Omega )\), \(g\in H^1(\Omega )\), \(g|_\Gamma \le 0\). Recall that \(\lambda = -\Delta u-f\). By (2) we have that
\[ \langle \lambda ,v-u\rangle = (\nabla u,\nabla (v-u)) - (f,v-u) \ge 0 \]
for all \(v\in H_0^1(\Omega )\), \(v\ge g\). Thus, \(\varvec{u}\in K^0\) satisfies the variational inequality
\[ b_\beta (\varvec{u},\varvec{v}-\varvec{u}) \ge G_\beta (\varvec{v}-\varvec{u}) \quad \text {for all } \varvec{v}\in K^0. \tag{VIb} \]
Our final method is based on the observation that for \(\mu \ge 0\), we have that \(\langle \mu ,u-g\rangle \ge 0\) for \(u\ge g\in H_0^1(\Omega )\). Together with \(\langle \lambda ,u-g\rangle =0\) we conclude \(\langle \mu -\lambda ,u-g\rangle \ge 0\). Thus, \(\varvec{u}\in K^1\) satisfies the variational inequality
\[ c_\beta (\varvec{u},\varvec{v}-\varvec{u}) \ge H_\beta (\varvec{v}-\varvec{u}) \quad \text {for all } \varvec{v}\in K^1. \tag{VIc} \]
Note that \(a_\beta \) is symmetric, whereas \(b_\beta \), \(c_\beta \) are not.
3.1 Solvability
In what follows we analyse the (unique) solvability of the variational inequalities (VIa)–(VIc) in a uniform manner (including discretizations).
Lemma 3
Suppose \(\beta >0\). Let \(A\in \{a_\beta ,b_\beta ,c_\beta \}\). There exists \(C_\beta >0\) depending only on \(\beta >0\) and \(\Omega \) such that
\[ |A(\varvec{u},\varvec{v})| \le C_\beta \Vert \varvec{u}\Vert _{U} \Vert \varvec{v}\Vert _{U} \quad \text {for all } \varvec{u},\varvec{v}\in U. \]
If \(\beta \ge 1+C_F^2\), then A is coercive, i.e.,
\[ A(\varvec{v},\varvec{v}) \ge C \Vert \varvec{v}\Vert _{U}^2 \quad \text {for all } \varvec{v}\in U. \]
The constant \(C>0\) is independent of \(\beta \) and \(\Omega \).
Proof
We prove boundedness of \(A = a_\beta \). Let \(\varvec{u}=(u,\varvec{\sigma },\lambda ),\varvec{v}=(v,{\varvec{\tau }},\mu )\in U\) be given. The Cauchy–Schwarz inequality together with Friedrichs’ inequality and boundedness of the divergence operator yields
This shows boundedness of \(a_\beta (\cdot ,\cdot )\). Similarly, one concludes boundedness of \(b_\beta (\cdot ,\cdot )\) and \(c_\beta (\cdot ,\cdot )\).
For the proof of coercivity, observe that \(a_\beta (\varvec{w},\varvec{w}) = b_\beta (\varvec{w},\varvec{w}) = c_\beta (\varvec{w},\varvec{w})\) for all \(\varvec{w}\in U\). We stress that coercivity directly follows from the arguments given in the proof of Theorem 1. Note that the choice of \(\beta \) yields
for \(\varvec{w}=(w,{\varvec{\chi }},\nu )\in U\). The right-hand side can be further estimated following the argumentation as in the proof of Theorem 1 which gives us
This finishes the proof. \(\square \)
Remark 4
Recall that \(C_F\le {{\,\mathrm{diam}\,}}(\Omega )\). Therefore, we can always choose \(\beta =1+{{\,\mathrm{diam}\,}}(\Omega )^2\) to ensure coercivity of our bilinear forms. We stress that a choice of \(\beta \) of order \({{\,\mathrm{diam}\,}}(\Omega )\) is not only sufficient to ensure coercivity but also necessary in general, as the example from “Appendix A” shows. Another possibility is to rescale \(\Omega \) such that \({{\,\mathrm{diam}\,}}(\Omega )\le 1\), which implies that we can choose \(\beta =2\). Furthermore, observe that a scaling of \(\Omega \) transforms (1) to an equivalent obstacle problem (with appropriately redefined functions f, g). To be more precise, define \(\widetilde{u}(x) := u(dx)\) with \(d:={{\,\mathrm{diam}\,}}(\Omega )>0\) and \(u\in H_0^1(\Omega )\) the solution of (1). Moreover, set \(\widetilde{f}(x) = d^2 f(dx)\), \(\widetilde{g}(x) := g(dx)\). Then, \(\widetilde{u}\) solves (1) in \(\widetilde{\Omega }:= \big \{x/d{:}\,x\in \Omega \big \}\) with f, g replaced by \(\widetilde{f},\widetilde{g}\).
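The rescaling claim can be verified with the chain rule; a short computation, using only the definitions \(\widetilde u(x)=u(dx)\), \(\widetilde f(x)=d^2f(dx)\), \(\widetilde g(x)=g(dx)\) from above:

```latex
% chain rule for \widetilde u(x) = u(dx):
\Delta\widetilde u(x) = d^2\,(\Delta u)(dx),
\qquad\text{hence}\qquad
-\Delta\widetilde u(x) - \widetilde f(x)
  = d^2\big[(-\Delta u)(dx) - f(dx)\big] \;\ge\; 0.
% the constraint and the complementarity condition transform likewise:
\widetilde u(x) - \widetilde g(x) = (u-g)(dx) \;\ge\; 0,
\qquad
\big(\widetilde u - \widetilde g\big)(x)\,
\big(-\Delta\widetilde u - \widetilde f\big)(x)
  = d^2\,(u-g)(dx)\,\big(-\Delta u - f\big)(dx) \;=\; 0,
% and \operatorname{diam}(\widetilde\Omega) = 1, so \beta = 2 suffices.
```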
The variational inequalities (VIa)–(VIc) are of the first kind and we use a standard framework for the analysis (Lions–Stampacchia theorem), see [16, 17, 25].
Theorem 5
Suppose \(\beta \ge 1+C_F^2\). Let \(A\in \{a_\beta ,b_\beta ,c_\beta \}\) and let \(F: U\rightarrow \mathbb {R}\) denote a bounded linear functional. If \(K\subseteq U\) is a non-empty convex and closed subset, then the variational inequality
\[ \varvec{u}\in K{:}\quad A(\varvec{u},\varvec{v}-\varvec{u}) \ge F(\varvec{v}-\varvec{u}) \quad \text {for all } \varvec{v}\in K \tag{9} \]
admits a unique solution.
In particular, for \(f\in L^2(\Omega )\), \(g\in H_0^1(\Omega )\) each of the problems (VIa)–(VIc) has a unique solution and the problems are equivalent to (3).
Proof
With the assumption on \(\beta \), Lemma 3 proves that the bilinear forms are coercive and bounded. Then, unique solvability of (9) follows from the Lions–Stampacchia theorem, see e.g., [16, 17, 25].
Unique solvability of (VIa)–(VIc) follows since the functionals \(F_\beta \), \(G_\beta \), \(H_\beta \) are linear and bounded: For example, boundedness of \(F_\beta \) can be seen from
The same arguments prove that \(G_\beta \) and \(H_\beta \) are bounded.
Finally, equivalence to (3) follows since all problems admit unique solutions and by construction the solution of (3) also solves each of the problems (VIa)–(VIc). \(\square \)
Remark 6
We stress that the assumption \(g\in H_0^1(\Omega )\) is necessary for (VIa) and (VIc): if only \(g\in H^1(\Omega )\), then the term \(\langle \mu ,g\rangle \) in \(F_\beta \), \(H_\beta \) is not well-defined. However, this term does not appear in \(G_\beta \), and therefore the variational inequality (VIb) admits a unique solution if we only assume \(g\in H^1(\Omega )\) with \(g|_\Gamma \le 0\).
Remark 7
The variational inequality (VIa) corresponds to a least-squares finite element method with convex functional
Then, Theorem 5 proves that the problem
admits a unique solution for all non-empty convex and closed sets \(K\subseteq U\). Moreover, \(J_\beta (\varvec{u};f,g)\simeq J(\varvec{u};f,g)\) for \(\varvec{u}\in K^s\), so that this problem is equivalent to (4) for \(K=K^s\).
3.2 A priori analysis
The following three results provide general bounds on the approximation error. The proofs are based on standard arguments, see e.g., [14]. We give details for the proof of the first result, the others follow the same lines of argumentation and are left to the reader.
Theorem 8
Suppose \(\beta \ge 1+C_F^2\). Let \(\varvec{u}\in K^s\) denote the solution of (VIa), where \(f\in L^2(\Omega )\), \(g\in H_0^1(\Omega )\). Let \(K_h\subset U\) denote a non-empty convex and closed subset and let \(\varvec{u}_h \in K_h\) denote the solution of (9) with \(A=a_\beta \), \(F=F_\beta \) and \(K=K_h\). It holds that
The constant \(C_\mathrm {opt}>0\) depends only on \(\beta \) and \(\Omega \).
Proof
Throughout let \(\varvec{v}=(v,{\varvec{\tau }},\mu )\in K^s\), \(\varvec{v}_h=(v_h,{\varvec{\tau }}_h,\mu _h)\in K_h\) and let \(\varvec{u}=(u,\varvec{\sigma },\lambda )\in K^s\) denote the exact solution of (VIa). Thus, \({{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda +f=0\) and \(\nabla u-\varvec{\sigma }= 0\). For arbitrary \(\varvec{w}= (w,{\varvec{\chi }},\nu )\in U\) it holds that
Using coercivity of \(a_\beta (\cdot ,\cdot )\), identity (10) and the fact that \(\varvec{u}_h\) solves the discretized variational inequality (on \(K_h\)) shows that
Note that \(0=\langle \lambda ,u-g\rangle \le \langle \lambda ,v-g\rangle \) and \(\langle \lambda ,u-g\rangle \le \langle \mu ,u-g\rangle \). Hence,
This and identity (10) with \(\varvec{w}=\varvec{u}-\varvec{v}_h\) imply that
Putting everything together, boundedness of \(a_\beta (\cdot ,\cdot )\) and an application of Young’s inequality with parameter \(\delta >0\) show that
Subtracting the term \(\delta /2\Vert \varvec{u}-\varvec{u}_h\Vert _{U}^2\) for some sufficiently small \(\delta >0\) finishes the proof since \(\varvec{v}\in K^s\), \(\varvec{v}_h\in K_h\) are arbitrary. \(\square \)
Theorem 9
Suppose \(\beta \ge 1+C_F^2\). Let \(\varvec{u}\in K^0\) denote the solution of (VIb), where \(f\in L^2(\Omega )\), \(g\in H^1(\Omega )\) with \(g|_\Gamma \le 0\). Let \(K_h\subset U\) denote a non-empty convex and closed subset and let \(\varvec{u}_h \in K_h\) denote the solution of (9) with \(A=b_\beta \), \(F=G_\beta \), and \(K=K_h\). It holds that
The constant \(C_\mathrm {opt}>0\) depends only on \(\beta \) and \(\Omega \).
Theorem 10
Suppose \(\beta \ge 1+C_F^2\). Let \(\varvec{u}\in K^1\) denote the solution of (VIc), where \(f\in L^2(\Omega )\), \(g\in H_0^1(\Omega )\). Let \(K_h\subset U\) denote a non-empty convex and closed subset and let \(\varvec{u}_h \in K_h\) denote the solution of (9) with \(A=c_\beta \), \(F=H_\beta \), and \(K=K_h\). It holds that
The constant \(C_\mathrm {opt}>0\) depends only on \(\beta \) and \(\Omega \).
3.3 Discretization
Let \(\mathcal {T}\) denote a regular triangulation of \(\Omega \), \(\bigcup _{T\in \mathcal {T}} \overline{T}=\overline{\Omega }\). We assume that \(\mathcal {T}\) is \(\kappa \)-shape regular, i.e.,
Moreover, let \(\mathcal {V}\) denote the vertices of the mesh \(\mathcal {T}\) and \(\mathcal {V}_0:=\mathcal {V}{\setminus }\Gamma \). Let \(h_\mathcal {T}\in L^\infty (\Omega )\) denote the mesh-size function, \(h_\mathcal {T}|_T := h_T := {{\,\mathrm{diam}\,}}(T)\) for \(T\in \mathcal {T}\). Set \(h:=\max _{T\in \mathcal {T}}{{\,\mathrm{diam}\,}}(T)\). We use standard finite element spaces for the discretization. Let \(\mathcal {P}^p(\mathcal {T})\) denote the space of \(\mathcal {T}\)-elementwise polynomials of degree at most \(p\in \mathbb {N}_0\). Let \(\mathcal {R}\!\mathcal {T}^p(\mathcal {T})\) denote the Raviart–Thomas space of degree \(p\in \mathbb {N}_0\), \(\mathcal {S}_0^{p+1}(\mathcal {T}) := \mathcal {P}^{p+1}(\mathcal {T})\cap H_0^1(\Omega )\), and
\[ U_{hp} := \mathcal {S}_0^{p+1}(\mathcal {T}) \times \mathcal {R}\!\mathcal {T}^p(\mathcal {T}) \times \mathcal {P}^p(\mathcal {T}). \]
Clearly, \(U_{hp} \subset U\). We stress that the polynomial degrees are chosen so that the best approximation in the norm \(\Vert \cdot \Vert _{U}\) is of order \(h^{p+1}\).
To define admissible convex sets for the discrete variational inequalities we need to put constraints on functions from the space \(\mathcal {S}_0^{p+1}(\mathcal {T})\), from \(\mathcal {P}^p(\mathcal {T})\), or both. Let us remark that for polynomial degrees \(\ge 2\) such constraints are not straightforward to implement. One possibility would be to impose them pointwise and then analyse the consistency error. We comment on the case \(p=1\) and \(n=2\) below. For hp-FEM methods for elliptic obstacle problems we refer to [2, 3]. To avoid such rather technical treatments, and for a simpler presentation of the basic ideas, we consider from now on the lowest-order case only, where the linear constraints can easily be built in. To that end define the non-empty convex subsets
In the definition of \(K_h^s\), \(K_h^0\) we assume \(g\in H^1(\Omega )\cap C^0(\overline{\Omega })\) so that the point evaluation is well-defined.
Let us shortly comment on how to incorporate the constraints for the higher-order space \(U_{h1}\) and \(n=2\). Let \(\mathcal {V}_m\) denote the midpoints of interior edges of the triangulation \(\mathcal {T}\). Then, a choice for the discrete convex set is
In the same manner one defines \(K_{h1}^0\) resp. \(K_{h1}^1\).
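For intuition on how such nodal constraints behave, the following self-contained sketch (an illustration of ours, not the method of this paper) solves a lowest-order discretization of a 1D toy obstacle problem by projected Gauss–Seidel; all data are made up:

```python
import numpy as np

# Toy 1D obstacle problem: -u'' = f on (0,1), u(0) = u(1) = 0, u >= g,
# discretized with P1 elements on a uniform mesh and solved by projected
# Gauss-Seidel. With f = -10 and the flat obstacle g = -0.2 the exact
# contact region is [0.2, 0.8].
n = 29                                   # interior nodes
h = 1.0 / (n + 1)
f = -10.0 * np.ones(n)
g = -0.2 * np.ones(n)

K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h   # P1 stiffness matrix
F = h * f                                                       # lumped load vector

u = np.maximum(np.zeros(n), g)           # feasible initial guess
for _ in range(4000):                    # projected Gauss-Seidel sweeps
    for i in range(n):
        r = F[i] - K[i] @ u + K[i, i] * u[i]
        u[i] = max(g[i], r / K[i, i])    # Gauss-Seidel update, projected onto u_i >= g_i

lam = K @ u - F                          # discrete multiplier (reaction force)
```

At convergence one observes the discrete complementarity conditions nodewise: \(u_h\ge g\), \(\lambda _h\ge 0\) and \(\lambda _h(u_h-g)=0\).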
3.4 Auxiliary results
For the analysis of the convergence rates we use the nodal interpolation operator \(I_h{:}\,H^2(\Omega )\rightarrow \mathcal {S}^1(\mathcal {T}):= \mathcal {P}^1(\mathcal {T})\cap C^0(\overline{\Omega })\), the Raviart–Thomas projector \(\Pi ^{{{\,\mathrm{div}\,}}}_h{:}\,H^1(\Omega )^n \rightarrow \mathcal {R}\!\mathcal {T}^0(\mathcal {T})\), and the \(L^2(\Omega )\) projector \(\Pi _h{:}\,L^2(\Omega )\rightarrow \mathcal {P}^0(\mathcal {T})\). Observe that with \(v\ge 0\), \(\mu \ge 0\) we have (with sufficient regularity) that \(I_h v\ge 0\), \(\Pi _h\mu \ge 0\). Moreover, recall the commutativity property \({{\,\mathrm{div}\,}}\Pi ^{{{\,\mathrm{div}\,}}}_h = \Pi _h{{\,\mathrm{div}\,}}\), as well as the approximation properties
Here, \(\nabla {\varvec{\tau }}\) is understood componentwise, \(\nabla _\mathcal {T}\mu \) denotes the \(\mathcal {T}\)-elementwise gradient of \(\mu \in H^1(\mathcal {T}) := \big \{\nu \in L^2(\Omega ){:}\,\nu |_T \in H^1(T), \, T\in \mathcal {T}\big \}\). Set \(\Vert \nu \Vert _{H^1(\mathcal {T})}^2 := \Vert \nu \Vert _{}^2 + \Vert \nabla _\mathcal {T}\nu \Vert _{}^2\). The involved constants depend only on the \(\kappa \)-shape regularity of \(\mathcal {T}\) but are otherwise independent of \(\mathcal {T}\). Furthermore, for \(\mu \in L^2(\Omega )\), it also holds that
which follows from the definition of the dual norm, the projection and approximation property of \(\Pi _h\).
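The first-order approximation property of \(\Pi _h\) can also be observed numerically; a small sketch of ours (for the smooth choice \(\mu (x)=\sin (\pi x)\) on \((0,1)\)) that halves the mesh width and sees the \(L^2\) projection error halve:

```python
import numpy as np

def l2_projection_error(num_elems, nq=200):
    """L2 error ||mu - Pi_h mu|| for mu(x) = sin(pi*x) and the projection onto
    elementwise constants on a uniform mesh of (0,1) with num_elems elements."""
    h = 1.0 / num_elems
    err_sq = 0.0
    for k in range(num_elems):
        a = k * h
        # exact elementwise mean of mu (the L2 projection onto P^0)
        mean = (np.cos(np.pi * a) - np.cos(np.pi * (a + h))) / (np.pi * h)
        # midpoint-rule quadrature of the squared error over the element
        t = a + (np.arange(nq) + 0.5) * h / nq
        err_sq += np.sum((np.sin(np.pi * t) - mean) ** 2) * (h / nq)
    return np.sqrt(err_sq)

e_coarse = l2_projection_error(10)
e_fine = l2_projection_error(20)
rate = np.log2(e_coarse / e_fine)   # observed convergence rate, close to 1
```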
The proof of optimal a priori convergence rates will also rely on the following two results. Scaling arguments and the continuous embedding \(H^2(T_\mathrm {ref}) \hookrightarrow C^0(\overline{T_\mathrm {ref}})\) show the next result. Here, \(\overline{T_\mathrm {ref}}\) denotes some reference element.
Lemma 11
There exists a constant \(C>0\) depending only on \(T_\mathrm {ref}\) and \(\kappa \)-shape regularity of the triangulation such that
The next result is proven along the lines of [12, Lemma 7]. For completeness we present the proof of this nontrivial result, adapted to our situation, in “Appendix B”. For each element \(T\in \mathcal {T}\) and \(v\in H^2(T)\) we define the level sets
\[ T_\mathrm {C}(v) := \big \{x\in T{:}\, v(x) = 0\big \}, \qquad T_\mathrm {NC}(v) := \big \{x\in T{:}\, v(x) \ne 0\big \}. \]
Note that \(v\in H^2(T)\) implies that these sets are measurable. Moreover, \(|T_\mathrm {C}(v)| + |T_\mathrm {NC}(v)| = |T|\).
Lemma 12
Let \(v\in H^2(T)\). Assume \(|T_\mathrm {C}(v)|>0\). Then,
and in particular
Here, \(C = \sqrt{n}\) for \(n=2,3\).
3.5 Optimal a priori convergence rates
Theorem 13
Suppose \(\beta \ge 1+C_F^2\). Let \(\varvec{u}\in K^s\) denote the solution of (VIa) with data \(f\in L^2(\Omega )\), \(g\in H_0^1(\Omega )\). Let \(K_h^s\) denote the set defined in (11a) and let \(\varvec{u}_h \in K_h^s\) denote the solution of (9) with \(A=a_\beta \), \(F=F_\beta \), and \(K=K_h^s\). If \(u \in H^2(\Omega )\), \(g\in H^2(\Omega )\) and \(f\in H^1(\mathcal {T})\), then
\[ \Vert \varvec{u}-\varvec{u}_h\Vert _{U} \le C_\mathrm {app}\, h \big (\Vert u\Vert _{H^2(\Omega )}+\Vert g\Vert _{H^2(\Omega )}+\Vert f\Vert _{H^1(\mathcal {T})}\big ). \]
The constant \(C_\mathrm {app}>0\) depends only on \(\beta \), \(\Omega \), and \(\kappa \)-shape regularity of \(\mathcal {T}\).
Proof
Choose \(\varvec{v}_h = (I_hu,\Pi ^{{{\,\mathrm{div}\,}}}_h\varvec{\sigma },\Pi _h\lambda )\in K_h^s\). The commutativity property of \(\Pi ^{{{\,\mathrm{div}\,}}}_h\) shows that
Therefore, using the approximation properties of the involved operators proves
Moreover,
We have to estimate the term
Define \(T_\mathrm {C} := T_\mathrm {C}(u-g)\) and \(T_\mathrm {NC} := T_\mathrm {NC}(u-g)\). Note that these two sets are measurable and we have that \(|T_\mathrm {C}| + |T_\mathrm {NC}|=|T|\). We consider three cases: First, assume that \(|T_\mathrm {C}| = 0\). This implies that \(u-g>0\) a.e. in T but since \((u-g)\lambda = 0\) we infer that \(\lambda = 0\) a.e. in T. Therefore, \(\Pi _h\lambda |_T = 0\) and we have that \(\langle \Pi _h\lambda -\lambda ,u-g\rangle _T = 0\). Second, assume that \(|T_\mathrm {NC}| = 0\). But then, \(u-g = 0\) a.e. in T and we have again \(\langle \Pi _h\lambda -\lambda ,u-g\rangle _T = 0\). The final case to be considered is \(|T_\mathrm {NC}|>0, |T_\mathrm {C}|>0\): We have that
Note that \(\lambda |_{T_\mathrm {NC}} = 0\). Thus, \(\Vert \lambda \Vert _{L^1(T)} = \Vert \lambda \Vert _{L^1(T_\mathrm {C})} \le |T_\mathrm {C}|^{1/2} \Vert \lambda \Vert _{T}\). For the second term we apply Lemma 11 with \(v = (1-\Pi _h)(u-g)\) and together with the approximation property of \(\Pi _h\) we get the estimate
We can estimate the gradient term by applying the second inequality of Lemma 12 which gives us
Clearly \(|T_\mathrm {C}|^{1/2}\le |T|^{1/2}\), thus \(|T|^{-1/2}\le |T_\mathrm {C}|^{-1/2}\) and we conclude that
Using \(\Vert \lambda \Vert _{L^1(T)} \le |T_\mathrm {C}|^{1/2} \Vert \lambda \Vert _{T}\) then yields that
Summing up we have that
Therefore, in view of Theorem 8 it only remains to estimate the consistency error
Define \(\varvec{v}:= (v,{\varvec{\chi }},\mu ):=(v,0,\lambda _h)\in U\) with \(v:=\sup \{u_h,g\}\) and observe that \(\varvec{v}\in K^s\). This directly leads to \(\langle \mu -\lambda _h,u-g\rangle = 0\). For the remaining term we follow the seminal work [14] of Falk. The same lines as in the proof of [14, Lemma 4] show that
This finishes the proof. \(\square \)
The proof of the following result can be obtained in the same fashion as the previous one and is therefore omitted.
Theorem 14
Suppose \(\beta \ge 1+C_F^2\). Let \(\varvec{u}\in K^0\) denote the solution of (VIb) with data \(f\in L^2(\Omega )\), \(g\in H^1(\Omega )\), \(g|_\Gamma \le 0\). Let \(\varvec{u}_h \in K_h\) denote the solution of (9) with \(A=b_\beta \), \(F=G_\beta \), and \(K=K_h\), where either \(K_h=K_h^s\) or \(K_h=K_h^0\). If \(u \in H^2(\Omega )\), \(g\in H^2(\Omega )\) and \(f\in H^1(\mathcal {T})\), then
\[ \Vert \varvec{u}-\varvec{u}_h\Vert _{U} \le C_\mathrm {app}\, h \big (\Vert u\Vert _{H^2(\Omega )}+\Vert g\Vert _{H^2(\Omega )}+\Vert f\Vert _{H^1(\mathcal {T})}\big ). \]
The constant \(C_\mathrm {app}>0\) depends only on \(\beta \), \(\Omega \), and \(\kappa \)-shape regularity of \(\mathcal {T}\).
Finally, we show convergence rates for problem (VIc) and its approximation. Note that for the sets \(K_h^1\), \(K_h^s\) defined in (11c), (11a) it holds that \(K_h^s\subset K_h^1\subset K^1\) and thus the consistency error, see Theorem 10, vanishes. The proof is similar to the one of Theorem 13 and is therefore left to the reader.
Theorem 15
Suppose \(\beta \ge 1+C_F^2\). Let \(\varvec{u}\in K^1\) denote the solution of (VIc) with data \(f\in L^2(\Omega )\), \(g\in H_0^1(\Omega )\). Let \(\varvec{u}_h \in K_h\) denote the solution of (9) with \(A=c_\beta \), \(F=H_\beta \), and \(K=K_h\), where either \(K_h=K_h^s\) or \(K_h=K_h^1\). If \(u \in H^2(\Omega )\), \(g\in H^2(\Omega )\) and \(f\in H^1(\mathcal {T})\), then
\[ \Vert \varvec{u}-\varvec{u}_h\Vert _{U} \le C_\mathrm {app}\, h \big (\Vert u\Vert _{H^2(\Omega )}+\Vert g\Vert _{H^2(\Omega )}+\Vert f\Vert _{H^1(\mathcal {T})}\big ). \]
The constant \(C_\mathrm {app}>0\) depends only on \(\beta \), \(\Omega \), and \(\kappa \)-shape regularity of \(\mathcal {T}\).
4 A posteriori analysis
In this section we derive reliable error bounds that can be used as an a posteriori estimator. We define
The estimator below includes the residual term
which can be localized. The derivation of our estimators is quite simple and is based on the following observation. Let \(\varvec{u}\in K^s\subset K^j\) denote the unique solution of (3) and let \(\varvec{u}_h\in U_{h0}\) be arbitrary. Take \(\beta = 1+C_F^2\) and recall that by Lemma 3 it holds that \(a_\beta (\varvec{v},\varvec{v})=b_\beta (\varvec{v},\varvec{v})=c_\beta (\varvec{v},\varvec{v}) \gtrsim \Vert \varvec{v}\Vert _{U}^2\) for all \(\varvec{v}\in U\). Then, together with the Pythagoras theorem \(\Vert \mu \Vert _{}^2 = \Vert (1-\Pi _h)\mu \Vert _{}^2 + \Vert \Pi _h\mu \Vert _{}^2\) for \(\mu \in L^2(\Omega )\) and using \({{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda +f=0\), \(\nabla u = \varvec{\sigma }\), \({{\,\mathrm{div}\,}}\varvec{\sigma }_h+\lambda _h\in \mathcal {P}^0(\mathcal {T})\), it follows that
The remaining results in this section are proved by estimating the duality term \(\langle \lambda _h-\lambda ,u_h-u\rangle \) from (16). In particular, the proof of the next result employs only \(\lambda _h\ge 0\). We will need the positive resp. negative part of a function \(v{:}\,\Omega \rightarrow \mathbb {R}\),
This definition implies that \(v = v_+-v_-\). The ideas of estimating the duality term are similar as in [18, 31] and references therein, see also [15] for a related estimate for Signorini-type problems. Note that we do not need to assume \(g\in H_0^1(\Omega )\).
Theorem 16
Let \(\varvec{u}\in K^s\) denote the solution of (3). Let \(\varvec{u}_h \in K_h\), where \(K_h\in \{K_h^s,K_h^1\}\), be arbitrary. The error satisfies
where the estimator contribution \(\rho \) is given by
The constant \(C_\mathrm {rel}>0\) depends only on \(\Omega \).
Proof
In view of estimate (16) we only have to tackle the term \(\langle \lambda _h-\lambda ,u_h-u\rangle \). Define \(v_h := \max \{u_h,g\}\). Clearly, \(v_h\ge g\) and \(v_h\in H_0^1(\Omega )\). Note that \(\lambda = -\Delta u - f \in H^{-1}(\Omega )\). Therefore, \(\langle \lambda ,v\rangle = (\nabla u,\nabla v)-(f,v)\) for all \(v\in H_0^1(\Omega )\) and using the variational inequality for the exact solution (2) yields
for all \(\delta >0\). Employing \(\lambda _h\ge 0\), \(g-u\le 0\), and \(v+v_-=v_+\) we further infer that
Recall that \(\Vert \lambda -\lambda _h\Vert _{-1}\le \Vert \varvec{u}-\varvec{u}_h\Vert _{V}\lesssim \Vert \varvec{u}-\varvec{u}_h\Vert _{U}\), where the involved constant depends only on \(\Omega \). Thus, choosing \(\delta >0\) sufficiently small the proof is concluded with (16). \(\square \)
We could derive a similar estimate if \(\varvec{u}_h\in K_h^0\) by interchanging the roles of \(u_h\) and \(\lambda _h\) resp. of u and \(\lambda \) in the proof. However, this leads to an estimator with a non-local term. To see this, suppose \(g=0\). Then, following the last proof we get
for \(\delta >0\). For the total error this would yield
The last term is not localizable and therefore it is not feasible to use this estimate as an a posteriori error estimator in an adaptive algorithm.
Remark 17
The derived estimator is efficient up to the term \(\rho \), i.e.,
To see this, we employ the Pythagoras theorem to obtain
Then, \({{\,\mathrm{div}\,}}\varvec{\sigma }+\lambda =-f\), \(\nabla u = \varvec{\sigma }\) and the triangle inequality prove the asserted estimate. The proof of the efficiency estimate \(\rho \lesssim \Vert \varvec{u}-\varvec{u}_h\Vert _{U}\) (up to possible data resp. obstacle oscillations) is an open problem, see also the related works [1, 18].
5 Examples
In this section we present numerical studies that demonstrate the performance of our proposed methods in different situations:
In Sect. 5.3 we consider a problem on the unit square with smooth obstacle and known smooth solution.
In Sect. 5.4 we consider the example from [4, Section 5.2] where the solution is known and exhibits a singularity.
In Sect. 5.5 we consider a problem on an L-shaped domain with a pyramid-like obstacle and unknown solution.
Before we come to a detailed discussion on the numerical studies some remarks are in order. In all examples we choose \(\beta = 1+{{\,\mathrm{diam}\,}}(\Omega )^2\) to ensure coercivity of the bilinear forms (Lemma 3). This also implies that the Galerkin matrices associated to the bilinear forms \(a_\beta \), \(b_\beta \), and \(c_\beta \) are positive definite. Choosing standard basis functions for \(\mathcal {S}_0^1(\mathcal {T})\) (nodal basis), \(\mathcal {R}\!\mathcal {T}^0(\mathcal {T})\) (lowest-order Raviart–Thomas basis) and \(\mathcal {P}^0(\mathcal {T})\) (characteristic functions), the constraints in the discrete convex sets \(K_h^\star \), where \(\star =0\), \(\star =1\) or \(\star =s\), are straightforward to impose. The resulting discrete variational inequalities are then solved using a (primal-dual) active set strategy, see e.g., [21,22,23].
5.1 Active set method and discrete variational inequalities
In this section we first define and collect results on the (primal-dual) active set method. Then, we recall the variational inequalities (VIa)–(VIc) and write down their discrete variants.
5.1.1 Active set method
Let \(\mathcal {N}= \{1,\dots ,N\}\), \(N\in \mathbb {N}\), and let \(\mathcal {N}_\gamma \subseteq \mathcal {N}\) be a non-empty subset. We set \(\mathcal {N}_\omega := \mathcal {N}{\setminus } \mathcal {N}_\gamma \). For a vector \(\varvec{x}\in \mathbb {R}^N\) we write \(\varvec{x}=0\) if all components are equal to 0. Similarly, \(\varvec{x}\ge 0\) means that all components of \(\varvec{x}\) are \(\ge 0\). For a subset \(\mathcal {I}\subseteq \mathcal {N}\), \(\varvec{x}_{\mathcal {I}} = 0\) means \(\varvec{x}_i = 0\) for all \(i\in \mathcal {I}\). We also use the notation \(\varvec{x}_{\mathcal {I}}\ge 0\), which means \(\varvec{x}_i\ge 0\) for all \(i\in \mathcal {I}\) and \(\varvec{x}_\mathcal {I}\ge \varvec{y}_\mathcal {I}\) stands for \(\varvec{x}_\mathcal {I}-\varvec{y}_\mathcal {I}\ge 0\).
For \(\varvec{g}\in \mathbb {R}^N\) we consider the convex set
Let \(\varvec{S}\in \mathbb {R}^{N\times N}\) denote a positive definite (but possibly non-symmetric) matrix, and \(\varvec{b}\in \mathbb {R}^N\) some arbitrary vector. We consider the variational inequality: find \(\varvec{x}\in K\), such that
where \(\langle \cdot ,\cdot \rangle _2\) denotes the Euclidean inner product on \(\mathbb {R}^N\). Since \(\varvec{S}\) is positive definite this problem admits a unique solution. It is well-known that problem (17) can be rewritten as follows: find \((\varvec{x},\varvec{\lambda })\in \mathbb {R}^N\times \mathbb {R}^N\) such that
where \(\max \{\cdot ,\cdot \}\) denotes the componentwise maximum and \(C>0\) is some constant. Note that the solution is independent of C. Now following the seminal work [21] one defines a (semi-smooth) Newton method for solving (18). The same lines of argumentation as in [21] show that the method can be written as an active set strategy. The algorithm adapted to our situation is given in Algorithm 1.
The solution of the linear system in Line 8 of Algorithm 1 can be written (with \(\mathcal {I}=\mathcal {I}^k\), \(\mathcal {J}=\mathcal {J}^k\)) as
With the constraints \(\varvec{x}_{\mathcal {J}} = \varvec{g}_\mathcal {J}\) and \(\varvec{\lambda }_{\mathcal {I}} = 0\) this reduces to the solution of the system
and the definition
Since \(\varvec{S}\) is positive definite the subblock \(\varvec{S}_{\mathcal {I}\mathcal {I}}\) is as well and thus \(\varvec{S}_{\mathcal {I}\mathcal {I}}\) is invertible.
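The reduced-system structure above can be sketched in code. The following is a minimal illustration (not the authors' implementation) of the primal-dual active set strategy of Algorithm 1, assuming the KKT form \(\varvec{S}\varvec{x}-\varvec{\lambda }=\varvec{b}\) with \(\varvec{\lambda }\ge 0\) and complementarity on the constrained indices, rewritten via the max-test of (18); the function name and interface are hypothetical:

```python
import numpy as np

def primal_dual_active_set(S, b, g, constrained, C=1.0, max_iter=100):
    """Sketch of Algorithm 1 for the VI (17): find x in K = {y : y_i >= g_i
    on `constrained`} with <Sx - b, y - x>_2 >= 0 for all y in K.
    Assumes the equivalent system (18): S x - lam = b, lam = max(0, lam + C(g - x))."""
    N = len(b)
    x = np.linalg.solve(S, b)           # unconstrained initial guess
    lam = np.zeros(N)
    J_prev = None
    for _ in range(max_iter):
        # predicted active set via the componentwise max-test in (18)
        J = sorted(i for i in constrained if lam[i] + C * (g[i] - x[i]) > 0)
        if J == J_prev:                 # stopping criterion of Line 5 (Lemma 19)
            break
        J_prev = J
        I = [i for i in range(N) if i not in J]
        x = np.zeros(N)
        lam = np.zeros(N)
        x[J] = g[J]                     # binding constraints: x_J = g_J
        if I:                           # reduced system S_II x_I = b_I - S_IJ g_J
            x[I] = np.linalg.solve(S[np.ix_(I, I)],
                                   b[I] - S[np.ix_(I, J)] @ g[J])
        lam[J] = (S @ x - b)[J]         # recover multiplier on the active set
    return x, lam
```

Since \(\varvec{S}_{\mathcal {I}\mathcal {I}}\) inherits positive definiteness from \(\varvec{S}\), the reduced solve is always well posed, matching the remark above.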
Some remarks are in order. We can follow the analysis of [21] to see that the basic (local) convergence result holds true in our case as well.
Proposition 18
[21, Theorem 3.1] If the initial guess \((\varvec{x}^0,\varvec{\lambda }^0)\) is sufficiently close to the exact solution \((\varvec{x},\varvec{\lambda })\) of (18), then the iterates \((\varvec{x}^k,\varvec{\lambda }^k)\) in Algorithm 1 converge superlinearly to \((\varvec{x},\varvec{\lambda })\).
The stopping criterion in Line 5 can be replaced by other criteria. Here, we choose \(\mathcal {J}^k =\mathcal {J}^{k-1}\) because then we know that we have hit the exact solution of (17). The proof of the following result is a slight modification of the proof of [23, Lemma 3.1] and the interested reader can find it in “Appendix C”.
Lemma 19
If the stopping criterion in Line 5 of Algorithm 1 is satisfied, then \(\varvec{x}=\varvec{x}^k\) is the solution of (17).
5.1.2 Discrete variational inequalities
In this section we recall the discrete versions of the variational inequalities (VIa)–(VIc) and present them in matrix-vector form. They fit into the abstract framework given in Sect. 5.1.1.
Let us recall the discrete space from Sect. 3.3,
Let \(\mathcal {E}\) denote the set of edges (\(n=2\)) resp. faces (\(n=3\)). Then, \(\dim (U_{h0}) = \#\mathcal {V}_0+\#\mathcal {E}+\#\mathcal {T}=:N\). Numbering the nodes \(x_j\) of \(\mathcal {V}_0\), the edges/faces \(E_j\) in \(\mathcal {E}\) and the elements \(T_j\) in \(\mathcal {T}\), we consider the following functions:
For \(j=1,\dots ,\#\mathcal {V}_0\) let \(v_j\) denote the nodal basis functions associated to the node \(x_j\in \mathcal {V}_0\).
For \(j=1,\dots ,\#\mathcal {E}\) let \({\varvec{\tau }}^{(j)}\) denote the Raviart–Thomas basis functions associated to the edge/face \(E_j\in \mathcal {E}\).
For \(j=1,\dots ,\#\mathcal {T}\) let \(\chi _j\) denote the characteristic function of the element \(T_j\in \mathcal {T}\).
We define the basis \(({\varvec{\xi }}^{(j)})_{j=1}^N\) for the space \(U_{h0}\) by
Recall from Eq. 11 the discrete convex sets
These convex subsets of \(U_{h0}\) correspond to convex subsets of \(\mathbb {R}^N\) as follows: For given obstacle function \(g\in H_0^1(\Omega )\cap C^0(\overline{\Omega })\) define the vector \(\varvec{g}\in \mathbb {R}^N\) by
Let \(\mathcal {N}= \{1,\dots ,N\}\) and define \(\mathcal {N}_\gamma ^s\), \(\mathcal {N}_\gamma ^0\), \(\mathcal {N}_\gamma ^1\) by
Then, the three sets \(K_h^s\), \(K_h^0\), \(K_h^1\) correspond to the sets
With these definitions we can now state the algebraic forms of the discrete variational inequalities:
5.1.3 Discrete version of (VIa) with \(K_h^s\)
The discrete version of (VIa) with convex set \(K_h^s\) reads: find \(\varvec{u}_h\in K_h^s\) such that
Let \(\varvec{S}^{(s)}\in \mathbb {R}^{N\times N}\) denote the Galerkin matrix of the bilinear form \(a_\beta (\cdot ,\cdot )\) and let \(\varvec{b}^{(s)}\in \mathbb {R}^N\) denote the load vector, i.e.,
for all \(j,k=1,\dots ,N\). Note that \(\varvec{S}^{(s)}\) is symmetric and positive definite. Problem (19) then reads in algebraic form as: find \(\varvec{x}\in K_N^s\) such that
5.1.4 Discrete version of (VIb) with \(K_h^0\)
The discrete version of (VIb) with convex set \(K_h^0\) reads: find \(\varvec{u}_h\in K_h^0\) such that
Let \(\varvec{S}^{(0)}\in \mathbb {R}^{N\times N}\) denote the Galerkin matrix of the bilinear form \(b_\beta (\cdot ,\cdot )\) and let \(\varvec{b}^{(0)}\in \mathbb {R}^N\) denote the load vector, i.e.,
for all \(j,k=1,\dots ,N\). Note that \(\varvec{S}^{(0)}\) is non-symmetric and positive definite. Problem (21) then reads in algebraic form as: find \(\varvec{x}\in K_N^0\) such that
5.1.5 Discrete version of (VIc) with \(K_h^1\)
The discrete version of (VIc) with convex set \(K_h^1\) reads: find \(\varvec{u}_h\in K_h^1\) such that
Let \(\varvec{S}^{(1)}\in \mathbb {R}^{N\times N}\) denote the Galerkin matrix of the bilinear form \(c_\beta (\cdot ,\cdot )\) and let \(\varvec{b}^{(1)}\in \mathbb {R}^N\) denote the load vector, i.e.,
for all \(j,k=1,\dots ,N\). Note that \(\varvec{S}^{(1)}\) is non-symmetric and positive definite. Problem (23) then reads in algebraic form as: find \(\varvec{x}\in K_N^1\) such that
5.1.6 Solver setup
The algebraic problems (20), (22) and (24) are then solved using Algorithm 1. The initial data \((\varvec{x}^0,\varvec{\lambda }^0)\) is chosen as the solution of
and
where \(\star =s\), \(\star =0\) or \(\star =1\). The constant C in Algorithm 1 is chosen as \(C=1\). The linear systems in Line 8 of Algorithm 1 are solved using the MATLAB backslash operator.
5.2 Error and estimator quantities
We define the error resp. total estimator by
Note that the estimator can be decomposed into local contributions,
where \(\Vert \cdot \Vert _{T}\) denotes the \(L^2(T)\) norm and \((\cdot ,\cdot )_T\) the \(L^2(T)\) inner product. Moreover, we will estimate the error in the weaker norm \(\Vert \cdot \Vert _{V}\). To do so we consider an upper bound given by
where the evaluation of \(\Vert \cdot \Vert _{-1,h}\) is based on the discrete \(H^{-1}(\Omega )\) norm discussed in the seminal work [8]: Let \(Q_h{:}\,L^2(\Omega )\rightarrow \mathcal {S}_0^1(\mathcal {T})\) denote the \(L^2(\Omega )\) projector. Let \(\mu \in L^2(\Omega )\). We stress that using the projection and local approximation property of \(Q_h\) yields
where the involved constant depends on shape regularity of \(\mathcal {T}\). Following [8] it holds that
where \(u_h[\mu ]\in \mathcal {S}_0^1(\mathcal {T})\) is the solution of
Note that \(\Vert \nabla u_h[\mu ]\Vert _{}\le \Vert \mu \Vert _{-1}\). The estimate \(\Vert Q_h\mu \Vert _{-1}\lesssim \Vert \nabla u_h[\mu ]\Vert _{}\) depends on the stability of the projection \(Q_h\) in \(H^1(\Omega )\), \(\Vert \nabla Q_h v\Vert _{} \lesssim \Vert \nabla v\Vert _{}\) for \(v\in H_0^1(\Omega )\), i.e.,
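The evaluation of the surrogate norm thus reduces to a single discrete Poisson solve. A minimal sketch, assuming the \(\mathcal {P}^1\) stiffness matrix and the load vector of \(\mu \) against the nodal basis are given (function and variable names are ours, not from the paper):

```python
import numpy as np

def discrete_hminus1(A, m):
    """Computable surrogate for the H^{-1} norm following [8] (sketch).
    A : P1 stiffness matrix on S^1_0(T), A_jk = (grad v_k, grad v_j).
    m : load vector, m_j = (mu, v_j)_{L^2(Omega)}.
    Then u_h[mu] solves A u = m and ||grad u_h[mu]||^2 = u^T A u = u^T m."""
    u = np.linalg.solve(A, m)
    return np.sqrt(u @ m)
```

In practice the same sparse factorization used for the Galerkin systems can be reused here, so the evaluation adds only negligible cost per mesh.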
Here, we use newest-vertex bisection [30] as refinement strategy where stability of the \(L^2(\Omega )\) projection is known [24].
We use an adaptive algorithm that basically consists of iterating the four steps
where the marking step is done with the bulk criterion, i.e., we determine a set \(\mathcal {M}\subseteq \mathcal {T}\) of (up to a constant) minimal cardinality with
For the experiments the marking parameter \(\theta \) is set to \(\tfrac{1}{4}\).
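The bulk criterion admits a simple greedy realization: sort the squared local indicators in descending order and mark until the threshold is reached. A sketch (names hypothetical):

```python
import numpy as np

def mark_bulk(eta_sq, theta=0.25):
    """Bulk (Doerfler) criterion: return a set M of elements of minimal
    cardinality with sum_{T in M} eta_T^2 >= theta * sum_T eta_T^2.
    eta_sq : array of squared local indicators eta_T^2, one per element."""
    order = np.argsort(eta_sq)[::-1]          # indices, largest indicator first
    cum = np.cumsum(eta_sq[order])            # running sums of marked indicators
    k = int(np.searchsorted(cum, theta * eta_sq.sum())) + 1
    return order[:k]
```

Sorting makes the greedy set exactly minimal at \(\mathcal {O}(\#\mathcal {T}\log \#\mathcal {T})\) cost; a bin-based variant would recover the linear-complexity, up-to-a-constant minimality stated above.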
Convergence rates in the figures are indicated by triangles, where the number \(\alpha \) next to the triangle denotes the experimental rate \(\mathcal {O}( (\#\mathcal {T})^{-\alpha })\). For uniform refinement we have \(h^{2\alpha } \simeq (\#\mathcal {T})^{-\alpha }\).
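The experimental rates between consecutive refinement levels can be computed as follows (a standard log-log slope; the function name is ours):

```python
import numpy as np

def experimental_rates(num_elements, errors):
    """Experimental rates alpha with err = O((#T)^{-alpha}), computed from
    consecutive levels: alpha = log(e_k / e_{k+1}) / log(n_{k+1} / n_k).
    For uniform refinement in 2D, #T ~ h^{-2}, so alpha matches O(h^{2 alpha})."""
    n = np.asarray(num_elements, dtype=float)
    e = np.asarray(errors, dtype=float)
    return np.log(e[:-1] / e[1:]) / np.log(n[1:] / n[:-1])
```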
5.3 Smooth solution
Let \(\Omega = (0,1)^2\), \(u(x,y) = (1-x)x(1-y)y\),
Then, u solves the obstacle problem (1) with data f and obstacle
where \(\widetilde{g}\) is the unique polynomial of degree 3 such that g and \(\nabla g\) are continuous at the lines \(x=\tfrac{1}{2},\tfrac{3}{4}\). In particular, \(g\in H^2(\Omega )\). Note that \(\lambda = -\Delta u-f \in H^1(\mathcal {T})\). Figure 1 shows that the convergence rates for the solutions of the discrete variational inequalities (VIa)–(VIc) based on the convex sets \(K_h^s\), \(K_h^0\), \(K_h^1\) are optimal. This perfectly fits to our theoretic considerations in Theorems 13–15. Additionally, we plot \({{\,\mathrm{err}\,}}_V\) which is in all cases slightly smaller than \({{\,\mathrm{err}\,}}_U\) but of the same order. Note that since \(\lambda \) is a \(\mathcal {T}\)-elementwise polynomial, an inverse inequality shows that \(h\Vert \lambda -\lambda _h\Vert _{} \lesssim \Vert \lambda -\lambda _h\Vert _{-1}\) and thus \({{\,\mathrm{err}\,}}_V\) is equivalent to \(\Vert \varvec{u}-\varvec{u}_h\Vert _{V}\).
5.4 Manufactured solution on L-shaped domain
We consider the same problem as given in [4, Section 5.2], where \(g=0\), \(\Omega = (-2,2)^2{\setminus } [0,2]^2\) and
where \((r,\varphi )\) denote polar coordinates. With \(r_* = 2(r-1/4)\), \(\gamma ,\delta \) are given by
The exact solution then reads \(u(r,\varphi ) = r^{2/3}\sin (2/3\varphi )\gamma (r)\). Note that u has a generic singularity at the reentrant corner. We consider the discrete version of (VIa), where solutions are sought in the convex set \(K_h^s\). We conducted various tests with \(\beta \) between 1 and 100 and the results were in all cases comparable. For the results displayed here we have used \(\beta =3\). Figure 2 displays convergence rates in the case of uniform and adaptive mesh-refinement. We note that in the first plot the lines for \({{\,\mathrm{err}\,}}_U\) and \({{\,\mathrm{est}\,}}\) are almost identical. In the second plot we compare the contributions of the overall error and estimator in the adaptive case. The lines for \({{\,\mathrm{osc}\,}}\) and \(\Vert {{\,\mathrm{div}\,}}\varvec{\sigma }_h+\lambda _h+f\Vert _{}\) are almost identical. This means that the estimator contribution \(\Vert {{\,\mathrm{div}\,}}\varvec{\sigma }_h+\lambda _h+\Pi _hf\Vert _{}\) in \(\eta \) is negligible and \({{\,\mathrm{osc}\,}}\) is dominating the overall estimator. We observe from the first plot that \({{\,\mathrm{err}\,}}_V\) is much smaller than \({{\,\mathrm{err}\,}}_U\) but has the same rate of convergence. In the uniform case we see that the errors and estimators approximately converge at rate 0.45. One would expect a smaller rate due to the singularity. However, in this example the solution has a large gradient so that the algorithm first refines the regions where the gradient resp. f is large. This preasymptotic behavior was also observed in [4, Section 5.2]. Nevertheless, adaptivity yields a significant error reduction.
Figure 3 shows the approximation \(\lambda _h\) (left column) and the distribution of the estimator contribution \(\rho ^2\) (right column) on some adaptively refined meshes.
5.5 Unknown solution
For our final experiment, we choose \(\Omega = (-1,1)^2 {\setminus } [-1,0]^2\), \(f=1\), and the pyramid-like obstacle \(g(x) = \max \{0,{{\,\mathrm{dist}\,}}(x,\partial \Omega _u)-\tfrac{1}{4}\}\), where \(\Omega _u = (0,1)^2\). The solution in this case is unknown. We solve the discrete version of (VIa) with convex set \(K_h^s\). Since f is constant we have \({{\,\mathrm{osc}\,}}= 0\). Figure 4 shows the overall estimator (left) and its contributions (right). We observe that uniform refinement leads to the reduced rate \(\tfrac{1}{3}\), whereas for adaptive refinement we recover the optimal rate. Heuristically, we expect the solution to have a singularity at the reentrant corner as well as in the contact regions. This would explain the reduced rates. Figure 5 visualizes meshes produced by the adaptive algorithm and corresponding solution components \(u_h\). We observe strong refinements towards the corner (0, 0) and around the point \((\tfrac{1}{2},\tfrac{1}{2})\), which coincides with the tip of the pyramid obstacle.
References
Attia, F.S., Cai, Z., Starke, G.: First-order system least squares for the Signorini contact problem in linear elasticity. SIAM J. Numer. Anal. 47(4), 3027–3043 (2009)
Banz, L., Schröder, A.: Biorthogonal basis functions in \(hp\)-adaptive FEM for elliptic obstacle problems. Comput. Math. Appl. 70(8), 1721–1742 (2015)
Banz, L., Stephan, E.P.: A posteriori error estimates of \(hp\)-adaptive IPDG-FEM for elliptic obstacle problems. Appl. Numer. Math. 76, 76–92 (2014)
Bartels, S., Carstensen, C.: Averaging techniques yield reliable a posteriori finite element error control for obstacle problems. Numer. Math. 99(2), 225–249 (2004)
Bochev, P., Gunzburger, M.: Least-squares finite element methods. In: International Congress of Mathematicians, vol. III, pp. 1137–1162. Eur. Math. Soc., Zürich (2006)
Bochev, P.B., Gunzburger, M.D.: Least-Squares Finite Element Methods. Applied Mathematical Sciences, vol. 166. Springer, New York (2009)
Braess, D.: A posteriori error estimators for obstacle problems—another look. Numer. Math. 101(3), 415–421 (2005)
Bramble, J.H., Lazarov, R.D., Pasciak, J.E.: A least-squares approach based on a discrete minus one inner product for first order systems. Math. Comput. 66(219), 935–955 (1997)
Burman, E., Hansbo, P., Larson, M.G., Stenberg, R.: Galerkin least squares finite element method for the obstacle problem. Comput. Methods Appl. Mech. Eng. 313, 362–374 (2017)
Chen, Z., Nochetto, R.H.: Residual type a posteriori error estimates for elliptic obstacle problems. Numer. Math. 84(4), 527–548 (2000)
Chouly, F., Hild, P.: A Nitsche-based method for unilateral contact problems: numerical analysis. SIAM J. Numer. Anal. 51(2), 1295–1307 (2013)
Drouet, G., Hild, P.: Optimal convergence for discrete variational inequalities modelling Signorini contact in 2D and 3D without additional assumptions on the unknown contact set. SIAM J. Numer. Anal. 53(3), 1488–1507 (2015)
Evans, L.C., Gariepy, R.F.: Measure Theory and Fine Properties of Functions. Textbooks in Mathematics, revised edn. CRC Press, Boca Raton (2015)
Falk, R.S.: Error estimates for the approximation of a class of variational inequalities. Math. Comput. 28, 963–971 (1974)
Führer, T., Heuer, N., Stephan, E.P.: On the DPG method for Signorini problems. IMA J. Numer. Anal. 38(4), 1893–1926 (2018)
Glowinski, R.: Numerical methods for nonlinear variational problems. In: Scientific Computation. Springer, Berlin (2008). Reprint of the 1984 original
Glowinski, R., Lions, J.-L., Trémolières, R.: Numerical Analysis of Variational Inequalities, Volume 8 of Studies in Mathematics and Its Applications. North-Holland Publishing Co., Amsterdam (1981). Translated from the French
Gustafsson, T., Stenberg, R., Videman, J.: Mixed and stabilized finite element methods for the obstacle problem. SIAM J. Numer. Anal. 55(6), 2718–2744 (2017)
Gustafsson, T., Stenberg, R., Videman, J.: On finite element formulations for the obstacle problem—mixed and stabilised methods. Comput. Methods Appl. Math. 17(3), 413–429 (2017)
Gustafsson, T., Stenberg, R., Videman, J.: A stabilised finite element method for the plate obstacle problem. BIT 59(1), 97–124 (2019)
Hintermüller, M., Ito, K., Kunisch, K.: The primal–dual active set strategy as a semismooth Newton method. SIAM J. Optim. 13(3), 865–888 (2003), 2002
Hoppe, R.H.W., Kornhuber, R.: Adaptive multilevel methods for obstacle problems. SIAM J. Numer. Anal. 31(2), 301–323 (1994)
Kärkkäinen, T., Kunisch, K., Tarvainen, P.: Augmented Lagrangian active set methods for obstacle problems. J. Optim. Theory Appl. 119(3), 499–533 (2003)
Karkulik, M., Pavlicek, D., Praetorius, D.: On 2D newest vertex bisection: optimality of mesh-closure and \(H^1\)-stability of \(L_2\)-projection. Constr. Approx. 38(2), 213–234 (2013)
Kinderlehrer, D., Stampacchia, G.: An Introduction to Variational Inequalities and Their Applications, Volume 31 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (2000). Reprint of the 1980 original
Krause, R., Müller, B., Starke, G.: An adaptive least-squares mixed finite element method for the Signorini problem. Numer. Methods Partial Differ. Equ. 33(1), 276–289 (2017)
Nochetto, R.H., Siebert, K.G., Veeser, A.: Pointwise a posteriori error control for elliptic obstacle problems. Numer. Math. 95(1), 163–195 (2003)
Nochetto, R.H., Siebert, K.G., Veeser, A.: Fully localized a posteriori error estimators and barrier sets for contact problems. SIAM J. Numer. Anal. 42(5), 2118–2135 (2005)
Rodrigues, J.-F.: Obstacle Problems in Mathematical Physics, Volume 134 of North-Holland Mathematics Studies. North-Holland Publishing Co., Amsterdam (1987). Notas de Matemática [Mathematical Notes], 114
Stevenson, R.: The completion of locally refined simplicial partitions created by bisection. Math. Comput. 77(261), 227–241 (2008)
Veeser, A.: Efficient and reliable a posteriori error estimators for elliptic obstacle problems. SIAM J. Numer. Anal. 39(1), 146–167 (2001)
Weiss, A., Wohlmuth, B.I.: A posteriori error estimator for obstacle problems. SIAM J. Sci. Comput. 32(5), 2627–2658 (2010)
Acknowledgements
This work was supported by CONICYT through FONDECYT project “Least-squares methods for obstacle problems” under Grant 11170050.
Appendices
Appendix A: Non-convexity of functional J
Recall that the functional \(J(\cdot ;f,g)\) is convex if and only if
In the following we construct a simple example that shows that the above inequality does not hold in general, thus J is not convex resp. \(a(\cdot ,\cdot ) = a_1(\cdot ,\cdot )\) is not coercive.
To that end, let \(u\in H_0^1(\Omega )\) denote the solution of \(\Delta u = 1\) in the square domain \(\Omega = (0,d)^2\). Then, \(u\le 0\) in \(\Omega \). Choose the obstacle as \(g= u\) (or \(g\le u\)). Note that \(\varvec{v}:= (0,0,0)\in K^s\) and that \(\varvec{u}:=(u,\varvec{\sigma },\Delta u) := (u,\nabla u,\Delta u)\in K^s\). We have that
Therefore \(\Vert 1\Vert _{-1}^2 = \Vert \nabla u\Vert _{}^2 = -\langle 1,u\rangle \). Using this we infer that
Hence, \(a_1(\varvec{u}-\varvec{v},\varvec{u}-\varvec{v}) < 0\) if and only if \(\Vert 2\Vert _{} < \Vert 1\Vert _{-1}\). Clearly, \(\Vert 2\Vert _{} = 2 |\Omega |^{1/2} = 2d\). We investigate the scaling of the negative order norm. Let \(\widehat{\Omega }\) denote the unit square \((0,1)^2\). Then,
Finally, \(\Vert 2\Vert _{} < \Vert 1\Vert _{-1}\) if
which holds for sufficiently large d.
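The scaling step can be made explicit by transforming to the unit square; the following short computation is our reconstruction of the argument, consistent with \(\Vert 1\Vert _{-1}^2 = \Vert \nabla u\Vert ^2\) derived above:

```latex
% Transform Omega = (0,d)^2 to the unit square \widehat{\Omega} = (0,1)^2
% via x = d\widehat{x} and set \widehat{u}(\widehat{x}) := d^{-2} u(d\widehat{x}).
% Then \widehat{\Delta}\widehat{u} = 1 on \widehat{\Omega}, and
\[
  \Vert \nabla u\Vert_{L^2(\Omega)}^2
  = \int_\Omega |\nabla u|^2 \,dx
  = d^4 \int_{\widehat{\Omega}} |\widehat{\nabla}\widehat{u}|^2 \,d\widehat{x},
  \qquad\text{hence}\qquad
  \Vert 1\Vert_{H^{-1}(\Omega)} = d^2\, \Vert 1\Vert_{H^{-1}(\widehat{\Omega})}.
\]
% The condition 2d < d^2 \Vert 1\Vert_{H^{-1}(\widehat{\Omega})} therefore
% holds for all d > 2 / \Vert 1\Vert_{H^{-1}(\widehat{\Omega})}.
```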
Appendix B: Proof of Lemma 12
We use \(T_\mathrm {C}:= T_\mathrm {C}(v)\) and \(T_\mathrm {NC} := T_\mathrm {NC}(v)\). If \(|T_\mathrm {NC}|=0\), then \(v=0\) on T and the first inequality is trivial. Note that also \(\nabla v =0\). Therefore, the second inequality is trivial as well. From now on we thus assume \(|T_\mathrm {NC}|>0\). Using that \(v=0\) on \(T_\mathrm {C}\) and the identity
we obtain that
where in the ultimate step we have used symmetry in \(\varvec{x}\) and \(\varvec{y}\). With the substitution \(\varvec{z}= \phi (\varvec{x}) = s\varvec{x}+ (1-s)\varvec{y}\in T\) we further get that
Putting everything together proves the first inequality.
For the second inequality we use the fact that the gradient on level sets vanishes, see [13, Theorem 3.3]. This means that \(\nabla v = 0\) a.e. in \(T_\mathrm {C}\). Then, the same lines of proof as above (with v replaced by the components of \(\nabla v\)) show the second inequality, which finishes the proof.
Appendix C: Proof of Lemma 19
We consider the decompositions
For the decomposition of \(\mathcal {J}^k\) note that from Lines 8–9 of Algorithm 1 we have that \(\varvec{\lambda }_{\mathcal {N}_\gamma \cap \mathcal {I}^{k-1}} = 0\) and \(\varvec{x}_{\mathcal {J}^k} = \varvec{g}_{\mathcal {J}^k}\). This yields
If \(\mathcal {J}^k = \mathcal {J}^{k-1}\), then, since all decompositions are disjoint,
This also means that \(\widehat{\varvec{\lambda }}_j^k>0\) if and only if \(\varvec{\lambda }_j^k>0\) with \(\varvec{x}_j^k = \varvec{g}_j\). Thus,
which implies that (18) is satisfied for \(\varvec{x}=\varvec{x}^k\), \(\varvec{\lambda }=\varvec{\lambda }^k\) or equivalently \(\varvec{x}=\varvec{x}^k\) solves (17).