
1 Introduction

Let Ω be a bounded convex polygonal domain in \({\mathbb{R}}^{2}\), \(y_{d} \in L_{2}(\Omega )\), and γ ≥ 0 and β > 0 be constants. The following problem [33] is a model elliptic distributed optimal control problem with pointwise state constraints:

Find the minimizer of the functional

$$\displaystyle{ J(y,u) = \frac{\gamma } {2}\int _{\Omega }{(y - y_{d})}^{2}\,dx + \frac{\beta } {2}\int _{\Omega }{u}^{2}\,dx, }$$
(1.1)

where \((y,u) \in H_{0}^{1}(\Omega ) \times L_{2}(\Omega )\) are subjected to the constraints

$$\displaystyle{ \begin{array}{llll} \int _{\Omega }\nabla y \cdot \nabla v\,dx& =\int _{\Omega }uv\,dx&\qquad &\forall \,v \in H_{0}^{1}(\Omega ), \end{array} }$$
(1.2)
$$\displaystyle{ \psi _{1} \leq y \leq \psi _{2}\qquad \mbox{ a.e. in $\Omega $}. }$$
(1.3)

Here the functions \(\psi _{1}(x),\psi _{2}(x) \in {C}^{2}(\Omega ) \cap C(\bar{\Omega })\) satisfy

$$\displaystyle{ \begin{array}{llll} &\psi _{1} <\psi _{2} & \qquad & \mbox{ in $\Omega $}, \end{array} }$$
(1.4a)
$$\displaystyle{ \begin{array}{llll} &\psi _{1} < 0 <\psi _{2} & \qquad & \mbox{ on $\partial \Omega $}.\end{array} }$$
(1.4b)

Since \(\Omega\) is convex, elliptic regularity [36, 45, 58] implies that (1.2) is equivalent to \(y \in {H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )\) and \(u = -\varDelta y\). Note that [46, Theorem 2.2.1]

$$\displaystyle{ \int _{\Omega }(\varDelta v)(\varDelta w)\,dx =\int _{\Omega }({D}^{2}v: {D}^{2}w)\,dx\qquad \forall \,v,w \in {H}^{2}(\Omega ) \cap H_{ 0}^{1}(\Omega ), }$$
(1.5)

where

$$\displaystyle{{D}^{2}v: {D}^{2}w =\sum _{ 1\leq i,j\leq 2}\left ( \frac{{\partial }^{2}v} {\partial x_{i}\partial x_{j}}\right )\left ({ {\partial }^{2}w \over \partial x_{i}\partial x_{j}}\right )}$$

is the (Frobenius) inner product of the Hessian matrices of v and w. Therefore we can solve the optimal control problem (1.1)–(1.3) by looking for the minimizer of the reduced functional

$$\displaystyle{ \hat{J}(y) = \frac{\gamma } {2}\int _{\Omega }{(y - y_{d})}^{2}dx + \frac{\beta } {2}\int _{\Omega }({D}^{2}y: {D}^{2}y)\,dx }$$

in the set

$$\displaystyle{ K =\{ v \in {H}^{2}(\Omega ) \cap H_{ 0}^{1}(\Omega ):\,\psi _{ 1} \leq v \leq \psi _{2}\;\text{in}\;\Omega \}. }$$
(1.6)

A simple calculation shows that this is equivalent to the following problem:

$$\displaystyle{ \text{Find}\quad \bar{y} =\mathop{ \mathrm{argmin}}_{y\in K}\left [\frac{1} {2}\mathcal{A}(y,y) - (f,y)\right ], }$$
(1.7)

where \(f =\gamma y_{d}\), (⋅ , ⋅) is the inner product of \(L_{2}(\Omega )\), and

$$\displaystyle{ \mathcal{A}(v,w) =\int _{\Omega }\big[\beta ({D}^{2}v: {D}^{2}w) +\gamma vw\big]\,dx. }$$
(1.8)
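The simple calculation behind (1.7) can be spelled out as follows; it is a sketch that uses only the definitions of \(\hat{J}\), f and \(\mathcal{A}(\cdot,\cdot )\) given above. Expanding the square in \(\hat{J}\) yields

$$\displaystyle{ \hat{J}(y) = \frac{1} {2}\int _{\Omega }\big[\beta ({D}^{2}y: {D}^{2}y) +\gamma {y}^{2}\big]\,dx -\gamma \int _{\Omega }y_{d}\,y\,dx + \frac{\gamma } {2}\int _{\Omega }y_{d}^{2}\,dx = \frac{1} {2}\mathcal{A}(y,y) - (f,y) + \frac{\gamma } {2}\|y_{d}\|_{L_{2}(\Omega )}^{2}. }$$

The last term does not depend on y, so \(\hat{J}\) and the functional in (1.7) have the same minimizer in K.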

Since (1.4) implies that K is a nonempty closed convex subset of \({H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )\) and the bilinear form \(\mathcal{A}(\cdot,\cdot )\) is symmetric, bounded, and coercive on \({H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )\), we can apply the standard theory [43, 52, 54, 59] to conclude that the problem (1.7) has a unique solution \(\bar{y} \in K\) characterized by the variational inequality

$$\displaystyle{ \mathcal{A}(\bar{y},y -\bar{ y}) \geq (f,y -\bar{ y})\qquad \forall \,y \in K. }$$
(1.9)

The solution of the optimal control problem is then given by \((\bar{y},\bar{u})\), where \(\bar{u} = -\varDelta \bar{y}\). Note that (1.7) becomes the displacement obstacle problem for simply supported Kirchhoff plates if we take γ to be 0. For this reason we will also refer to (1.7) as an obstacle problem.

According to the regularity results in [32, 41, 42] for fourth order obstacle problems, the solution \(\bar{y}\) of (1.7) belongs to \(H_{\mathrm{loc}}^{3}(\Omega ) \cap {C}^{2}(\Omega )\) under our assumptions on the functions \(y_{d}\), \(\psi _{1}\), and \(\psi _{2}\). Note that (1.4b) implies that the constraints are inactive near \(\partial \Omega\) and hence

$$\displaystyle{{\beta \varDelta }^{2}\bar{y} +\gamma \bar{ y} = f}$$

near \(\partial \Omega\). It then follows from the elliptic regularity theory for the biharmonic equation (cf. [8] and Appendix A) that there exists α ∈ (0, 1] (determined by the interior angles of \(\Omega\)) such that \(\bar{y} \in {H}^{2+\alpha }(\mathcal{N})\) in a neighborhood \(\mathcal{N}\) of \(\partial \Omega\) disjoint from the active set. Thus globally \(\bar{y}\) belongs to \({H}^{2+\alpha }(\Omega )\). We shall refer to α as the index of elliptic regularity for the obstacle problem (1.7).

A main difficulty in the analysis of finite element methods for fourth order obstacle problems is that the solutions in general do not belong to \(H_{\mathrm{loc}}^{4}(\Omega )\) even for smooth data, which means that the complementarity form of the variational inequality (1.9) in general only exists in a weak sense. In contrast, the solutions of second order obstacle problems belong to \({H}^{2}(\Omega )\) under appropriate assumptions on the data (cf. [29, 53]). Hence the complementarity forms of the variational inequalities arising from second order obstacle problems exist in the strong sense, which is a crucial ingredient for the derivations of optimal error estimates in [30, 31, 40].

A new approach to the obstacle problem for clamped Kirchhoff plates on convex polygonal domains was introduced in [25], where optimal error estimates were obtained for \({C}^{1}\) finite element methods, classical nonconforming finite element methods, and discontinuous Galerkin methods. The results were later extended to general domains and general Dirichlet boundary conditions in [15, 23, 24]. This new approach does not rely on the complementarity forms of the variational inequalities and hence can bypass the aforementioned difficulty. The goal of this paper is to extend the results in [23] to (1.7)/(1.9), which covers both obstacle problems for simply supported plates and optimal control problems with pointwise state constraints. We will show that the magnitude of the error in the energy norm is \(O({h}^{\alpha })\) on quasi-uniform meshes and O(h) on graded meshes.

Finite element methods for state constrained elliptic optimal control problems were investigated in [37, 56], where the finite element approximation \((\bar{y}_{h},\bar{u}_{h})\) of \((\bar{y},\bar{u})\) is obtained from discrete versions of the optimal control problems. In this approach the error analysis for the state and the error analysis for the control are coupled and hence the estimates for \(\vert \bar{y} -\bar{ y}_{h}\vert _{{H}^{1}(\Omega )}\) and \(\|\bar{u} -\bar{ u}_{h}\|_{L_{2}(\Omega )}\) have the same magnitude, which in the case of a rectangle with quasi-uniform meshes is \(O({h}^{1-\epsilon })\). In our approach we obtain instead an error estimate for the approximation \(\bar{y}_{h}\) of \(\bar{y}\) in an \({H}^{2}\)-like energy norm, which then implies an error estimate in the \(L_{2}\) norm for the approximation \(\bar{u}_{h}\) of \(\bar{u}\) (generated from \(\bar{y}_{h}\) by a postprocessing procedure) with the same magnitude. In the case of a rectangle with quasi-uniform meshes, the magnitudes of these errors are O(h). On the other hand, the convergence of \(\bar{y}_{h}\) in the \({H}^{1}(\Omega )\) norm and the \(L_{\infty }(\Omega )\) norm, which are weaker than the energy norm, can be expected to be of higher order. This is indeed observed in our numerical experiments, where the magnitudes of the errors of \(\bar{y}_{h}\) in the \({H}^{1}(\Omega )\) norm and the \(L_{\infty }(\Omega )\) norm are \(O({h}^{2})\) for a rectangle.

The optimal control problem defined by (1.1)–(1.3) is solved as a fourth order variational inequality in [55] by a Morley finite element method and in [44] by a mixed finite element method. However the analyses in [44, 55] rely on additional assumptions on the active set first introduced in [7]. Our new approach for fourth order obstacle problems may provide an error analysis for the finite element methods in [44, 55] without the additional assumptions on the active set.

Other numerical methods for (1.1)–(1.3) are investigated, for example, in [5, 6, 34, 48–51, 57, 60].

The rest of the paper is organized as follows. We introduce a quadratic C 0 interior penalty method for (1.7) in Sect. 2 and an intermediate obstacle problem that connects the continuous and discrete obstacle problems in Sect. 3. Section 4 contains several preliminary estimates which are useful for the convergence analysis carried out in Sect. 5. Numerical results that illustrate the performance of our method are presented in Sect. 6, followed by some concluding remarks in Sect. 7. Elliptic regularity results for simply supported plates, which play an important role in the error analysis, are summarized in Appendix A. Some technical results concerning an enriching operator that connects the discrete and continuous spaces are given in Appendix B.

We will follow the notation for Sobolev spaces and norms in [20, 35]. Throughout the paper we will denote by C a generic positive constant independent of mesh sizes that can take different values at different occurrences. To avoid the proliferation of constants, we will also use A ≲ B (or B ≳ A) to denote the statement that A ≤ (constant)B, where the positive constant is independent of mesh sizes. The statement A ≈ B is equivalent to A ≲ B and B ≲ A.

2 A Quadratic C 0 Interior Penalty Method

\({C}^{0}\) interior penalty methods were introduced in [39] for fourth order elliptic boundary value problems. They were further studied in [13, 16, 18, 21] and fast solvers for \({C}^{0}\) interior penalty methods were developed in [22, 26, 27]. Adaptive [17] and isoparametric [19] versions of \({C}^{0}\) interior penalty methods are also available. Below we will recall the notation for \({C}^{0}\) interior penalty methods and introduce the discrete obstacle problem for (1.7).

2.1 Triangulation

Let \(\mathcal{T}_{h}\) be a simplicial triangulation of \(\Omega\) that is regular (i.e., \(\mathcal{T}_{h}\) satisfies a minimum angle condition). We will use the following notation throughout the paper.

  • \(h_{T}\) is the diameter of the triangle T.

  • h is a mesh parameter proportional to \(\max _{T\in \mathcal{T}_{h}}h_{T}\).

  • \(v_{T}\) is the restriction of the function v to the triangle T.

  • \(\mathcal{E}_{h}\) is the set of the edges of the triangles in \(\mathcal{T}_{h}\).

  • \(\mathcal{E}_{h}^{i}\) is the subset of \(\mathcal{E}_{h}\) consisting of edges interior to \(\Omega\).

  • \(\mathcal{E}_{h}^{b}\) is the subset of \(\mathcal{E}_{h}\) consisting of edges along \(\partial \Omega\).

  • \(\vert e\vert \) is the length of an edge e.

  • \(\mathcal{V}_{h}\) is the set of the vertices of the triangles in \(\mathcal{T}_{h}\).

  • \(\mathcal{V}_{T}\) is the set of the three vertices of T.

  • \(\mathcal{E}_{\mathcal{V}_{T}}^{i}\) is the set of the edges in \(\mathcal{E}_{h}^{i}\) emanating from the vertices of T.

  • \(\mathcal{T}_{T}\) is the set of triangles sharing a vertex with T.

  • \(\mathcal{S}_{T}\) is the interior of the closure of \(\cup _{T^\prime \in \mathcal{T}_{T}}T^\prime \).

  • \(\mathcal{T}_{p}\) is the set of the triangles in \(\mathcal{T}_{h}\) that share the common vertex p.

  • \(\mathcal{T}_{e}\) is the set of the triangles in \(\mathcal{T}_{h}\) that share the common edge e.

  • \(\vert \mathcal{T}_{p}\vert \) (resp. \(\vert \mathcal{T}_{e}\vert \)) is the number of triangles in \(\mathcal{T}_{p}\) (resp. \(\mathcal{T}_{e}\)).

  • Let \(e \in \mathcal{E}_{h}^{b}\). Then \(T_{e}\) is the triangle in \(\mathcal{T}_{h}\) such that \(\mathcal{T}_{e} =\{ T_{e}\}\).

We will consider both quasi-uniform and graded triangulations. For a quasi-uniform triangulation \(\mathcal{T}_{h}\), we have

$$\displaystyle{ h_{T} \approx h\qquad \forall \,T \in \mathcal{T}_{h}. }$$
(2.1)

Let \(p_{1},\ldots,p_{L}\) be the corners of \(\Omega\) and \(\omega _{\ell}\) be the interior angle at \(p_{\ell}\) for 1 ≤ ℓ ≤ L. For a graded triangulation \(\mathcal{T}_{h}\), we have

$$\displaystyle{ h_{T} \approx h\varPhi (c_{T})\qquad \forall \,T \in \mathcal{T}_{h}, }$$
(2.2)

where \(c_{T}\) is the center of T,

$$\displaystyle{ \varPhi (x) =\prod _{ \ell=1}^{L}\vert p_{\ell} - x{\vert }^{1-\alpha _{\ell}}, }$$
(2.3)

and the grading parameters \(\alpha _{\ell} > 0\) are determined as follows:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \alpha _{\ell} = 1 \quad &\qquad \text{if}\;\;\omega _{\ell} \leq \frac{\pi } {2}, \\ \alpha _{\ell} < \left (\frac{\pi } {\omega _{\ell}}\right ) - 1\quad &\qquad \text{if}\;\; \frac{\pi } {2} <\omega _{\ell} <\pi. \end{array} \right. }$$
(2.4)

Note that (2.2) and (2.3) imply

$$\displaystyle{ h_{T}^{\alpha _{\ell}} \approx h }$$
(2.5)

if \(T \in \mathcal{T}_{h}\) touches the corner \(p_{\ell}\).
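The step from (2.2) and (2.3) to (2.5) can be sketched as follows. Suppose T touches the corner \(p_{\ell}\) and h is small enough that the distances from T to the other corners are bounded below by a positive constant. Then all factors in (2.3) other than the ℓ-th one are ≈ 1 at the center \(c_{T}\), and \(\vert p_{\ell} - c_{T}\vert \approx h_{T}\) by the regularity of \(\mathcal{T}_{h}\), so that

$$\displaystyle{ \varPhi (c_{T}) \approx \vert p_{\ell} - c_{T}{\vert }^{1-\alpha _{\ell}} \approx h_{T}^{1-\alpha _{\ell}}\qquad \text{and hence}\qquad h_{T} \approx h\,h_{T}^{1-\alpha _{\ell}},\quad \text{i.e.,}\quad h_{T}^{\alpha _{\ell}} \approx h. }$$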

Remark 2.1.

We can take \(\alpha =\min _{1\leq \ell\leq L}\alpha _{\ell}\) to be the index of elliptic regularity (cf. Appendix A).

Remark 2.2.

The construction of regular triangulations that satisfy (2.2) is discussed, for example, in [1, 10, 14].
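As an illustration of (2.2)–(2.4), the following minimal Python sketch evaluates the grading parameters and the weight Φ for a convex polygon. It is not taken from the references above: the function names and the safety factor `margin` (used to keep \(\alpha _{\ell}\) strictly below \(\pi /\omega _{\ell} - 1\)) are assumptions made for this example, and the sketch only computes a target element diameter, not a triangulation.

```python
import math


def grading_exponents(interior_angles, margin=0.05):
    """Return one grading exponent alpha_l per corner, following (2.4).

    `margin` is a hypothetical safety factor: at corners with
    omega_l > pi/2 it keeps alpha_l strictly below pi/omega_l - 1.
    """
    alphas = []
    for omega in interior_angles:
        if not 0.0 < omega < math.pi:
            raise ValueError("a convex polygon has interior angles in (0, pi)")
        if omega <= math.pi / 2:
            alphas.append(1.0)
        else:
            alphas.append((1.0 - margin) * (math.pi / omega - 1.0))
    return alphas


def grading_weight(x, corners, alphas):
    """Evaluate Phi(x) = prod_l |p_l - x|^(1 - alpha_l) from (2.3)."""
    phi = 1.0
    for (px, py), alpha in zip(corners, alphas):
        phi *= math.hypot(px - x[0], py - x[1]) ** (1.0 - alpha)
    return phi


def target_diameter(h, center, corners, alphas):
    """Target element diameter h * Phi(c_T) suggested by (2.2)."""
    return h * grading_weight(center, corners, alphas)
```

For a rectangle all interior angles equal π∕2, so every exponent is 1 and Φ ≡ 1; the graded condition (2.2) then reduces to the quasi-uniform condition (2.1).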

2.2 Jumps and Averages

The jumps and averages of the normal derivatives for functions in the piecewise Sobolev spaces

$$\displaystyle{ {H}^{s}(\Omega,\mathcal{T}_{ h}) =\{ v \in L_{2}(\Omega ):\, v_{T} = v\vert _{T} \in {H}^{s}(T)\quad \forall T \in \mathcal{T}_{ h}\} }$$

are defined as follows.

Let \(e \in \mathcal{E}_{h}^{i}\) be the common edge of \(T_{\pm }\in \mathcal{T}_{h}\) and \(n_{e}\) be the unit normal of e pointing from \(T_{-}\) to \(T_{+}\). We define on e

$$\displaystyle{ \begin{array}{llll} \left \{\!\!\!\left\{\frac{{\partial }^{2}v} {\partial {n}^{2}} \right \}\!\!\!\right\} & = \frac{1} {2}\left (\frac{{\partial }^{2}v_{ +}} {\partial n_{e}^{2}} \bigg\vert _{e} + \frac{{\partial }^{2}v_{ -}} {\partial n_{e}^{2}} \bigg\vert _{e}\right )&\qquad &\forall v \in {H}^{s}(\Omega,\mathcal{T}_{h}),s > \frac{5} {2},\end{array} }$$
(2.6a)
$$\displaystyle{ \begin{array}{llll} \left[\!\!\left [\frac{\partial v} {\partial n}\right ]\!\!\right ]& = \frac{\partial v_{+}} {\partial n_{e}} \bigg\vert _{e} -\frac{\partial v_{-}} {\partial n_{e}} \bigg\vert _{e}&\qquad &\forall v \in {H}^{2}(\Omega,\mathcal{T}_{h}),\end{array} }$$
(2.6b)

where \(v_{\pm } = v\big\vert _{T_{\pm }}\). Similarly, we define on e

$$\displaystyle{ \begin{array}{llll} \left\{ \!\!\! \left \{ \frac{\partial v} {\partial n_{e}}\right \}\!\!\!\right \} & = \frac{1} {2}\left (\frac{\partial v_{+}} {\partial n_{e}} \bigg\vert _{e} + \frac{\partial v_{-}} {\partial n_{e}} \bigg\vert _{e}\right )&\qquad &\forall v \in {H}^{2}(\Omega,\mathcal{T}_{h}),\end{array} }$$
(2.7a)
$$\displaystyle{ \begin{array}{llll} \left[ \!\! \left [ \frac{{\partial }^{2}v} {\partial n_{e}^{2}} \right ]\!\!\right ]& = \frac{{\partial }^{2}v_{ +}} {\partial n_{e}^{2}} \bigg\vert _{e} -\frac{{\partial }^{2}v_{ -}} {\partial n_{e}^{2}} \bigg\vert _{e}&\qquad &\forall v \in {H}^{s}(\Omega,\mathcal{T}_{h}),s > \frac{5} {2}.\end{array} }$$
(2.7b)

Remark 2.3.

Note that the definitions for the average \(\{\!\!\{{\partial }^{2}v/\partial {n}^{2}\}\!\!\}\) and the jump \([\![\partial v/\partial n]\!]\) in (2.6), which appear in \({C}^{0}\) interior penalty methods, are independent of the choice of \(T_{\pm }\) (or \(n_{e}\)). On the other hand, the definitions in (2.7) for \(\{\!\!\{\partial v/\partial n_{e}\}\!\!\}\) and \([\![{\partial }^{2}v/\partial n_{e}^{2}]\!]\), which appear only in the analysis, do depend on the choice of \(T_{\pm }\) (or \(n_{e}\)). However, their product is also independent of the choice of \(T_{\pm }\) (or \(n_{e}\)).
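The orientation statements in this remark can be checked numerically at a single point of an interior edge. The Python sketch below is only an illustration: the function names and the sample gradients and Hessians are assumptions, and the inputs stand for the traces of ∇v and D²v from the two triangles. Swapping the roles of \(T_{+}\) and \(T_{-}\) also replaces \(n_{e}\) by \(-n_{e}\).

```python
def normal_derivative(grad, n):
    """First normal derivative: grad . n (changes sign when n -> -n)."""
    return grad[0] * n[0] + grad[1] * n[1]


def second_normal_derivative(hess, n):
    """Second normal derivative: n^T H n (unchanged when n -> -n)."""
    (hxx, hxy), (hyx, hyy) = hess
    return hxx * n[0] * n[0] + (hxy + hyx) * n[0] * n[1] + hyy * n[1] * n[1]


def edge_quantities(grad_plus, grad_minus, hess_plus, hess_minus, n):
    """Averages and jumps of (2.6)-(2.7) at a point, with n pointing from T_- to T_+."""
    d_plus = normal_derivative(grad_plus, n)
    d_minus = normal_derivative(grad_minus, n)
    s_plus = second_normal_derivative(hess_plus, n)
    s_minus = second_normal_derivative(hess_minus, n)
    return {
        "avg_d2": 0.5 * (s_plus + s_minus),   # (2.6a)
        "jump_d1": d_plus - d_minus,          # (2.6b)
        "avg_d1": 0.5 * (d_plus + d_minus),   # (2.7a)
        "jump_d2": s_plus - s_minus,          # (2.7b)
    }


# Hypothetical sample traces on the two sides of an edge.
g1, g2 = (1.0, 2.0), (3.0, -1.0)
H1, H2 = ((2.0, 0.0), (0.0, 4.0)), ((1.0, 1.0), (1.0, -2.0))
n = (0.6, 0.8)

first = edge_quantities(g1, g2, H1, H2, n)
# Same edge with T_+ and T_- interchanged, hence n_e reversed.
swapped = edge_quantities(g2, g1, H2, H1, (-n[0], -n[1]))
```

With these inputs the entries `avg_d2` and `jump_d1` agree for both orientations, while `avg_d1` and `jump_d2` change sign, so that their product is again orientation independent.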

Let \(e \in \mathcal{E}_{h}^{b}\) be a boundary edge and \(n_{e}\) be the unit normal of e pointing towards the outside of \(\Omega\). We define on e

$$\displaystyle{ \begin{array}{llll} \left\{\!\!\left \{ \frac{\partial v} {\partial n_{e}}\right \} \!\! \right \} & = \frac{\partial v} {\partial n_{e}}\bigg\vert _{e}&\qquad &\forall v \in {H}^{2}(\Omega,\mathcal{T}_{h}), \end{array} }$$
(2.8a)
$$\displaystyle{ \begin{array}{llll} \left[ \!\! \left [ \frac{{\partial }^{2}v} {\partial n_{e}^{2}} \right] \!\! \right]& = -\frac{{\partial }^{2}v} {\partial n_{e}^{2}} \bigg\vert _{e}&\qquad &\forall v \in {H}^{s}(\Omega,\mathcal{T}_{h}),s > \frac{5} {2}.\end{array} }$$
(2.8b)

2.3 The Discrete Obstacle Problem

Let \(V _{h} \subset H_{0}^{1}(\Omega )\) be the \(\mathbb{P}_{2}\) Lagrange finite element space associated with \(\mathcal{T}_{h}\) whose members vanish on \(\partial \Omega\). We define the bilinear form \(a_{h}(\cdot,\cdot )\) on \(V _{h} \times V _{h}\) by

$$\displaystyle\begin{array}{rcl} a_{h}(v,w)& =& \sum _{T\in \mathcal{T}_{h}}\int _{T}({D}^{2}v: {D}^{2}w)dx +\sum _{ e\in \mathcal{E}_{h}^{i}}\int _{e}\{\!\!\{{\partial }^{2}v/\partial {n}^{2}\}\!\!\}[\![\partial w/\partial n]\!]ds \\ & +& \sum _{e\in \mathcal{E}_{h}^{i}}\int _{e}\{\!\!\{{\partial }^{2}w/\partial {n}^{2}\}\!\!\}[\![\partial v/\partial n]\!]ds \\ & +& \sigma \sum _{e\in \mathcal{E}_{h}^{i}}\vert e{\vert }^{-1}\int _{ e}[\![\partial v/\partial n]\!][\![\partial w/\partial n]\!]ds, {}\end{array}$$
(2.9)

where σ > 0 is a penalty parameter. Note that \(a_{h}(\cdot,\cdot )\) is a consistent bilinear form for the biharmonic equation with the boundary conditions of simply supported plates.

It follows from (2.6a) and scaling that

$$\displaystyle{ \sum _{e\in \mathcal{E}_{h}^{i}}\vert e\vert \|\{\!\!\{{\partial }^{2}v/\partial {n}^{2}\}\!\!\}\|_{ L_{2}(e)}^{2} \lesssim \sum _{ T\in \mathcal{T}_{h}}\vert v\vert _{{H}^{2}(T)}^{2}\qquad \forall v \in V _{ h}. }$$
(2.10)

Therefore, for sufficiently large σ, we have (cf. [21])

$$\displaystyle{ a_{h}(v,v) \gtrsim \left (\sum _{T\in \mathcal{T}_{h}}\vert v\vert _{{H}^{2}(T)}^{2} +\sum _{ e\in \mathcal{E}_{h}^{i}}\vert e{\vert }^{-1}\|[\![\partial v/\partial n]\!]\|_{ L_{2}(e)}^{2}\right )\qquad \forall v \in V _{ h}. }$$
(2.11)

The discrete bilinear form that approximates \(\mathcal{A}(\cdot,\cdot )\) is then given by

$$\displaystyle{ \mathcal{A}_{h}(v,w) =\beta a_{h}(v,w) +\gamma (v,w), }$$
(2.12)

and

$$\displaystyle\begin{array}{rcl} & & \|v\|_{h} = \left [\beta \left (\sum _{T\in \mathcal{T}_{h}}\vert v\vert _{{H}^{2}(T)}^{2} +\sum _{ e\in \mathcal{E}_{h}^{i}}\vert e{\vert }^{-1}\|[\![\partial v/\partial n]\!]\|_{ L_{2}(e)}^{2}\right )\right. \\ & & \qquad {\left.\quad +\gamma \| v\|_{L_{2}(\Omega )}^{2}\right ]}^{\frac{1} {2} } {}\end{array}$$
(2.13)

is the mesh-dependent energy norm. It follows from (2.10)–(2.13) that

$$\displaystyle{ \begin{array}{llll} \vert \mathcal{A}_{h}(v,w)\vert & \lesssim \| v\|_{h}\|w\|_{h}&\qquad &\forall v,w \in V _{h},\end{array} }$$
(2.14)
$$\displaystyle{ \begin{array}{llll} \mathcal{A}_{h}(v,v)& \gtrsim \| v\|_{h}^{2} & \qquad & \forall v \in V _{h},\end{array} }$$
(2.15)

provided that σ is sufficiently large, which we assume to be the case.

Note that

$$\displaystyle{ \|v\|_{{H}^{1}(\Omega )}^{2} \lesssim \sum _{ T\in \mathcal{T}_{h}}\vert v\vert _{{H}^{2}(T)}^{2} +\sum _{ e\in \mathcal{E}_{h}^{i}}\vert e{\vert }^{-1}\|[\![\partial v/\partial n]\!]\|_{ L_{2}(e)}^{2} }$$
(2.16)

for all \(v \in {H}^{2}(\Omega,\mathcal{T}_{h}) \cap H_{0}^{1}(\Omega )\) by a Poincaré–Friedrichs inequality [28, Example 5.4], and hence

$$\displaystyle{ \|v\|_{{H}^{1}(\Omega )} \lesssim \| v\|_{h} }$$
(2.17)

for all \(v \in {H}^{2}(\Omega,\mathcal{T}_{h}) \cap H_{0}^{1}(\Omega )\;(\supset V _{h} + {H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega ))\).

We can now define the discrete obstacle problem for (1.7):

$$\displaystyle{ \text{Find}\quad \bar{y}_{h} =\mathop{ \mathrm{argmin}}_{y_{h}\in K_{h}}\left [\frac{1} {2}\mathcal{A}_{h}(y_{h},y_{h}) - (f,y_{h})\right ], }$$
(2.18)

where

$$\displaystyle{ K_{h} =\{ v \in V _{h}:\,\psi _{1}(p) \leq v(p) \leq \psi _{2}(p)\quad \forall p \in \mathcal{V}_{h}\}. }$$
(2.19)

Let \(\Pi _{h}\) be the nodal interpolation operator for the \(\mathbb{P}_{2}\) Lagrange finite element space. Then \(\Pi _{h}\) maps \({H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )\) into \(V _{h}\) and K into \(K_{h}\). Therefore \(K_{h}\) is a nonempty closed convex subset of \(V _{h}\). Moreover the bilinear form \(\mathcal{A}_{h}(\cdot,\cdot )\) is symmetric positive definite by (2.15). Hence the discrete problem (2.18) has a unique solution \(\bar{y}_{h} \in K_{h}\) characterized by the discrete variational inequality:

$$\displaystyle{ \mathcal{A}_{h}(\bar{y}_{h},y_{h} -\bar{ y}_{h}) \geq (f,y_{h} -\bar{ y}_{h})\qquad \forall y_{h} \in K_{h}. }$$
(2.20)
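In matrix form, (2.18) is a quadratic program with bound constraints on the vertex degrees of freedom only. The following Python sketch shows one generic way to solve such a problem, a projected Gauss–Seidel iteration; it is a textbook method used here purely for illustration and is not claimed to be the solver used in this paper. The matrix `A` and vector `b` stand for hypothetical assembled versions of \(\mathcal{A}_{h}(\cdot,\cdot )\) and \((f,\cdot )\); degrees of freedom at edge midpoints, which are unconstrained in (2.19), would receive the bounds ∓∞.

```python
def projected_gauss_seidel(A, b, lower, upper, sweeps=200, tol=1e-12):
    """Minimize 0.5 * x^T A x - b^T x subject to lower <= x <= upper.

    A is a dense symmetric positive definite matrix given as a list of rows.
    Unconstrained unknowns use lower = -inf and upper = +inf.
    """
    n = len(b)
    # Start from zero, moved into the admissible box if necessary.
    x = [min(max(0.0, lower[i]), upper[i]) for i in range(n)]
    for _ in range(sweeps):
        change = 0.0
        for i in range(n):
            # Exact minimization in the i-th coordinate ...
            residual = b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)
            # ... followed by projection onto [lower[i], upper[i]].
            new_value = min(max(residual / A[i][i], lower[i]), upper[i])
            change = max(change, abs(new_value - x[i]))
            x[i] = new_value
        if change < tol:
            break
    return x


inf = float("inf")
A = [[2.0, 1.0], [1.0, 2.0]]
b = [3.0, 3.0]
# Both unknowns constrained from above by 0.5.
x_box = projected_gauss_seidel(A, b, [-inf, -inf], [0.5, 0.5])
# Only the first unknown constrained (second one mimics a midpoint node).
x_mixed = projected_gauss_seidel(A, b, [-inf, -inf], [0.5, inf])
```

The returned vector satisfies the algebraic counterpart of (2.20): for every admissible vector y, the inequality (Ax − b)⋅(y − x) ≥ 0 holds.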

Let \(\Pi _{T}\) be the nodal interpolation operator for the \(\mathbb{P}_{2}\) Lagrange finite element on a triangle T. We have a standard local interpolation error estimate [20, 35]

$$\displaystyle{ \sum _{m=0}^{2}h_{ T}^{m-2}\vert \zeta -\Pi _{ T}\zeta \vert _{{H}^{m}(T)} \lesssim h_{T}^{s}\vert \zeta \vert _{{ H}^{2+s}(T)} }$$
(2.21)

for all \(\zeta \in {H}^{2+s}(T)\), \(T \in \mathcal{T}_{h}\) and s ∈ [0, 1].

The following lemma provides global interpolation error estimates for the solution \(\bar{y}\) of (1.7)/(1.9).

Lemma 2.1.

There exists a positive constant C independent of h such that

$$\displaystyle{ \|\bar{y} -\Pi _{h}\bar{y}\|_{h} \leq C{h}^{\tau }, }$$

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4) .

Proof.

Since the estimate for a quasi-uniform \(\mathcal{T}_{h}\) is standard (cf. [21]), we will focus on a graded \(\mathcal{T}_{h}\). Let \(\mathcal{T}_{h}^{I}\) be the set of triangles in \(\mathcal{T}_{h}\) that do not touch any corner of \(\Omega\) and \(\mathcal{T}_{h}^{C} = \mathcal{T}_{ h}\setminus \mathcal{T}_{h}^{I} = \cup _{ 1\leq \ell\leq L}\mathcal{T}_{h,\ell}^{C}\), where \(\mathcal{T}_{h,\ell}^{C}\) is the set of the triangles that touch the corner \(p_{\ell}\).

Since \(\bar{y}_{T} \in {H}^{3}(T)\) for \(T \in \mathcal{T}_{h}^{I}\) (cf. Appendix A), we have, by (2.21),

$$\displaystyle\begin{array}{rcl} & & \sum\limits_{T\in \mathcal{T}_{h}^{I}}\sum\limits_{m=0}^{2}h_{ T}^{2(m-2)}\vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{m}(T)}^{2} \\ & & \quad \lesssim \sum\limits_{T\in \mathcal{T}_{h}^{I}}\big({\varPhi }^{-2}(c_{ T})h_{T}^{2}\big){\varPhi }^{2}(c_{ T})\vert \bar{y}\vert _{{H}^{3}(T)}^{2},{}\end{array}$$
(2.22)

where the function Φ is defined in (2.3).

It follows from (2.2), (2.3), (2.22), and (A.7) that

$$\displaystyle{ \sum _{T\in \mathcal{T}_{h}^{I}}\sum _{m=0}^{2}h_{ T}^{2(m-2)}\vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{m}(T)}^{2} \lesssim {h}^{2}. }$$
(2.23)

Let \(T \in \mathcal{T}_{h,\ell}^{C}\) be a triangle that touches a corner \(p_{\ell}\). Then \(\bar{y} \in {H}^{2+\alpha _{\ell}}(T)\) (cf. Appendix A) and we have, by (2.21),

$$\displaystyle{ \sum _{m=0}^{2}h_{ T}^{m-2}\vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{m}(T)} \lesssim h_{T}^{\alpha _{\ell}}\vert \bar{y}\vert _{{ H}^{2+\alpha _{\ell}}(T)}\qquad \forall \,T \in \mathcal{T}_{h,\ell}^{C }. }$$
(2.24)

It follows from (2.5) and (2.24) that

$$\displaystyle\begin{array}{rcl} & & \sum _{T\in \mathcal{T}_{h}^{C}}\sum _{m=0}^{2}h_{ T}^{2(m-2)}\vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{m}(T)}^{2} \\ & & \quad =\sum _{ \ell=1}^{L}\sum _{ T\in \mathcal{T}_{h,\ell}^{C}}\sum _{m=0}^{2}h_{ T}^{2(m-2)}\vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{m}(T)}^{2} \lesssim {h}^{2}.{}\end{array}$$
(2.25)
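The passage from (2.24) to (2.25) can be sketched as follows. For \(T \in \mathcal{T}_{h,\ell}^{C}\), a sum of squares of nonnegative terms is bounded by the square of their sum, so (2.24) and (2.5) give

$$\displaystyle{ \sum _{m=0}^{2}h_{ T}^{2(m-2)}\vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{m}(T)}^{2} \leq {\left (\sum _{m=0}^{2}h_{ T}^{m-2}\vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{m}(T)}\right )}^{2} \lesssim h_{T}^{2\alpha _{\ell}}\vert \bar{y}\vert _{{H}^{2+\alpha _{\ell}}(T)}^{2} \approx {h}^{2}\vert \bar{y}\vert _{{ H}^{2+\alpha _{\ell}}(T)}^{2}. }$$

The number of triangles touching a corner is bounded by a constant determined by the minimum angle of \(\mathcal{T}_{h}\), and, assuming as in Appendix A that \(\bar{y}\) belongs to \({H}^{2+\alpha _{\ell}}\) in a fixed neighborhood of \(p_{\ell}\), each seminorm on the right-hand side is bounded independently of h. Summing over T and ℓ then gives the bound \({h}^{2}\) in (2.25).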

Combining (2.23) and (2.25), we find

$$\displaystyle{ \sum _{T\in \mathcal{T}_{h}}\sum _{m=0}^{2}h_{ T}^{2(m-2)}\vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{m}(T)}^{2} \lesssim {h}^{2}. }$$
(2.26)

By the trace theorem with scaling, (2.6b) and (2.26), we also have

$$\displaystyle\begin{array}{rcl} & & \sum _{e\in \mathcal{E}_{h}^{i}}\vert e{\vert }^{-1}\|[\![\partial (\bar{y} -\Pi _{ h}\bar{y})/\partial n]\!]\|_{L_{2}(e)}^{2} \\ & & \quad \lesssim \sum _{T\in \mathcal{T}_{h}}\Big(h_{T}^{-2}\vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{1}(T)}^{2} + \vert \bar{y} -\Pi _{ h}\bar{y}\vert _{{H}^{2}(T)}^{2}\Big) \lesssim {h}^{2}.{}\end{array}$$
(2.27)

The lemma for a graded \(\mathcal{T}_{h}\) follows from (2.13), (2.16), (2.26), and (2.27).

3 An Intermediate Obstacle Problem

As mentioned in Sect. 1, the difficulties due to the lack of \(H_{\mathrm{loc}}^{4}(\Omega )\) regularity can be bypassed if the convergence analysis does not rely on the complementarity form of the variational inequality (1.9). We can accomplish this by introducing the following intermediate obstacle problem:

$$\displaystyle{ \text{Find}\qquad \bar{y}_{h}^{{\ast}} =\mathop{ \mathrm{argmin}}_{ y_{h}^{{\ast}}\in K_{h}^{{\ast}}}\left [\frac{1} {2}\mathcal{A}(y_{h}^{{\ast}},y_{ h}^{{\ast}}) - (f,y_{ h}^{{\ast}})\right ], }$$
(3.1)

where

$$\displaystyle{ K_{h}^{{\ast}} =\{ v \in {H}^{2}(\Omega ) \cap H_{ 0}^{1}(\Omega ):\,\psi _{ 1}(p) \leq v(p) \leq \psi _{2}(p)\qquad \forall p \in \mathcal{V}_{h}\}. }$$
(3.2)

By the standard theory (3.1) has a unique solution \(\bar{y}_{h}^{{\ast}}\) characterized by the variational inequality

$$\displaystyle{ \mathcal{A}(\bar{y}_{h}^{{\ast}},y_{ h}^{{\ast}}-\bar{y}_{ h}^{{\ast}}) \geq (f,y_{ h}^{{\ast}}-\bar{y}_{ h}^{{\ast}})\qquad \forall y_{ h}^{{\ast}}\in K_{ h}^{{\ast}}. }$$
(3.3)

Note that, on the one hand, \(\bar{y}_{h}^{{\ast}}\in {H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )\) minimizes the same functional as \(\bar{y}\) but on the larger set \(K_{h}^{{\ast}}\supset K\), and, on the other hand, \(\bar{y}_{h}^{{\ast}}\) shares the same pointwise constraints as \(\bar{y}_{h}\). Thus the intermediate obstacle problem connects the continuous obstacle problem (1.7) and the discrete obstacle problem (2.18). We will carry out the convergence analysis using (1.9), (2.20), and (3.3), but not their complementarity forms.

3.1 Relation Between \(\bar{y}\) and \(\bar{y}_{h}^{{\ast}}\)

Using the fact that \({H}^{2}(\Omega )\) is compactly embedded in \(C(\bar{\Omega })\), it was shown in [25] that there exist two nonnegative functions \(\phi _{1},\phi _{2} \in C_{0}^{\infty }(\Omega )\) and a positive number \(h_{0}\) such that for any \(h \leq h_{0}\) we can find two positive numbers \(\delta _{h,1}\) and \(\delta _{h,2}\) with the following properties:

$$\displaystyle{ \hat{y}_{h}:= \bar{y}_{h}^{{\ast}} +\delta _{ h,1}\phi _{1} -\delta _{h,2}\phi _{2} \in K\quad \text{and}\quad \delta _{h,i} \lesssim {h}^{2}. }$$
(3.4)

Note that we can treat \(\bar{y}\) as an internal approximation of \(\bar{y}_{h}^{{\ast}}\) since \(K \subset K_{h}^{{\ast}}\). It then follows from (3.4) and a standard result [3] that

$$\displaystyle{ \|\bar{y}_{h}^{{\ast}}-\bar{ y}\|_{{ H}^{2}(\Omega )} \lesssim {\left [\inf _{y\in K}\|\bar{y}_{h}^{{\ast}}- y\|_{{ H}^{2}(\Omega )}\right ]}^{\frac{1} {2} } \lesssim \|\bar{y}_{h}^{{\ast}}-\hat{ y}_{h}\|_{{H}^{2}(\Omega )}^{\frac{1} {2} } \lesssim h. }$$
(3.5)
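For the last inequality in (3.5), note that \(\phi _{1}\) and \(\phi _{2}\) are fixed functions, so their \({H}^{2}(\Omega )\) norms are constants independent of h. The definition of \(\hat{y}_{h}\) and the bounds for \(\delta _{h,i}\) in (3.4) therefore give

$$\displaystyle{ \|\bar{y}_{h}^{{\ast}}-\hat{ y}_{h}\|_{{H}^{2}(\Omega )} =\|\delta _{h,1}\phi _{1} -\delta _{h,2}\phi _{2}\|_{{H}^{2}(\Omega )} \leq \delta _{h,1}\|\phi _{1}\|_{{H}^{2}(\Omega )} +\delta _{h,2}\|\phi _{2}\|_{{H}^{2}(\Omega )} \lesssim {h}^{2}, }$$

and taking the square root yields the bound h.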

Remark 3.1.

Even though the results in [25] are obtained for clamped Kirchhoff plates on convex polygonal domains, these results are also valid for general boundary conditions and general polygonal domains because they are interior results that only require the following ingredients: (i) The set \(K_{h}^{{\ast}}\) is a closed convex subset of \({H}^{2}(\Omega )\). (ii) The constraints and the boundary conditions are separated. (iii) The obstacle functions \(\psi _{1}\), \(\psi _{2}\) and the solution \(\bar{y}\) belong to \({C}^{2}(\Omega )\).

3.2 Connection Between \(K_{h}\) and \(K_{h}^{{\ast}}\)

We can connect \(K_{h}\) and \(K_{h}^{{\ast}}\) by an enriching operator \(E_{h}\) that maps \(V _{h}\) into \({H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )\). By construction \(E_{h}\) is a linear operator that preserves the nodal values at the vertices of \(\mathcal{T}_{h}\), i.e.,

$$\displaystyle{ (E_{h}v)(p) = v(p)\qquad \forall p \in \mathcal{V}_{h},\;v \in V _{h}, }$$
(3.6)

which, in view of (2.19) and (3.2), implies

$$\displaystyle{ E_{h}K_{h} \subset K_{h}^{{\ast}}. }$$
(3.7)

Moreover we have (cf. the notation in Sect. 2.1),

$$\displaystyle\begin{array}{rcl} & & \sum _{m=0}^{2}h_{ T}^{2m}\vert v - E_{ h}v\vert _{{H}^{m}(T)}^{2} \\ & & \quad \lesssim h_{T}^{4}\left (\sum _{ T^\prime \in \mathcal{T}_{T}}\vert v\vert _{{H}^{2}(T^\prime )}^{2} +\sum _{ e\in \mathcal{E}_{\mathcal{V}_{T}}^{i}}\vert e{\vert }^{-1}\|[\![\partial v/\partial n]\!]\|_{ L_{2}(e)}^{2}\right ){}\end{array}$$
(3.8)

for any \(v \in V _{h}\) and \(T \in \mathcal{T}_{h}\), and

$$\displaystyle{ \sum _{m=0}^{2}h_{ T}^{m-2}\vert \zeta - E_{ h}\Pi _{h}\zeta \vert _{{H}^{m}(T)} \lesssim h_{T}^{s}\vert \zeta \vert _{{ H}^{2+s}(\mathcal{S}_{T})} }$$
(3.9)

for all \(\zeta \in {H}^{2+s}(\mathcal{S}_{T})\), \(T \in \mathcal{T}_{h}\) and s ∈ [0, 1]. The construction of \(E_{h}\), which is similar to the constructions of the enriching operators in [16, 21] under different boundary conditions, is given in Appendix B, where we also derive the estimates (3.8) and (3.9).

The estimate (3.8) implies

$$\displaystyle{ \sum _{T\in \mathcal{T}_{h}}h_{T}^{2(m-2)}\vert v - E_{ h}v\vert _{{H}^{m}(T)}^{2} \lesssim \| v\|_{ h}^{2}\qquad \forall \,v \in V _{ h}, }$$
(3.10)

and in particular,

$$\displaystyle{ \vert E_{h}v\vert _{{H}^{2}(\Omega )} \lesssim \| v\|_{h}\qquad \forall \,v \in V _{h}. }$$
(3.11)
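To see how (3.11) follows from (3.10), recall that \(E_{h}v \in {H}^{2}(\Omega )\), so its global seminorm is the sum of the contributions from the triangles. The triangle inequality and the m = 2 term of (3.10) then give, with a constant that depends on the fixed parameter β > 0,

$$\displaystyle{ \vert E_{h}v\vert _{{H}^{2}(\Omega )}^{2} =\sum _{ T\in \mathcal{T}_{h}}\vert E_{h}v\vert _{{H}^{2}(T)}^{2} \leq 2\sum _{ T\in \mathcal{T}_{h}}\Big(\vert v\vert _{{H}^{2}(T)}^{2} + \vert v - E_{ h}v\vert _{{H}^{2}(T)}^{2}\Big) \lesssim \| v\|_{ h}^{2}\qquad \forall \,v \in V _{ h}. }$$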

Combining (2.7a), (2.8a), (3.10) and the trace theorem with scaling, we also have

$$\displaystyle{ \sum _{e\in \mathcal{E}_{h}}\vert e{\vert }^{-1}\|\{\!\!\{\partial (v - E_{ h}v)/\partial n_{e}\}\!\!\}\|_{L_{2}(e)}^{2} \lesssim \| v\|_{ h}^{2}\qquad \forall \,v \in V _{ h}. }$$
(3.12)

Finally the quasi-local estimate (3.9) implies the following result for the solution \(\bar{y}\) of (1.7). We omit the proof due to its similarity with the proof of Lemma 2.1.

Lemma 3.1.

There exists a positive constant C independent of h such that

$$\displaystyle{ \|\bar{y} - E_{h}\Pi _{h}\bar{y}\|_{L_{2}(\Omega )} + h\vert \bar{y} - E_{h}\Pi _{h}\bar{y}\vert _{{H}^{1}(\Omega )} + {h}^{2}\vert \bar{y} - E_{ h}\Pi _{h}\bar{y}\vert _{{H}^{2}(\Omega )} \leq C{h}^{2+\tau }, }$$

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4) .

4 Preliminary Estimates

In this section we derive some preliminary estimates that are useful for the convergence analysis in Sect. 5. We begin by stating the following integration by parts formula that holds for \(v,w \in V _{h}\):

$$\displaystyle\begin{array}{rcl} & & \sum _{T\in \mathcal{T}_{h}}\int _{T}{D}^{2}v: {D}^{2}(w - E_{ h}w)dx \\ & & \quad =\sum _{T\in \mathcal{T}_{h}}\int _{\partial T}\left [\left (\frac{{\partial }^{2}v} {\partial {n}^{2}}\right )\left (\frac{\partial (w - E_{h}w)} {\partial n} \right )\right. \\ & & \qquad \left.+\left ( \frac{{\partial }^{2}v} {\partial n\partial t}\right )\left (\frac{\partial (w - E_{h}w)} {\partial t} \right )\right ]ds \\ & & \quad = -\sum _{e\in \mathcal{E}_{h}}\int _{e}\left [\left [ \frac{{\partial }^{2}v} {\partial n_{e}^{2}}\right ]\right ]\left \{\left \{\frac{\partial (w - E_{h}w)} {\partial n_{e}} \right \}\right \}ds \\ & & \qquad -\sum _{e\in \mathcal{E}_{h}^{i}}\int _{e}\left \{\left \{\frac{{\partial }^{2}v} {\partial {n}^{2}}\right \}\right \}\left [\left [\frac{\partial (w - E_{h}w)} {\partial n} \right ]\right ]ds.{}\end{array}$$
(4.1)

Note that \(v \in \mathbb{P}_{2}(T)\) and hence on any edge e of \(T \in \mathcal{T}_{h}\) we have

$$\displaystyle{ \int _{e}\left (\frac{{\partial }^{2}v_{T}} {\partial n\partial t}\right )\left (\frac{\partial (w_{T} - E_{h}w)} {\partial t} \right )ds = \left (\frac{{\partial }^{2}v_{T}} {\partial n\partial t}\right )\int _{e}\frac{\partial (w_{T} - E_{h}w)} {\partial t} ds = 0 }$$

because of (3.6).
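In more detail, the mixed derivative \({\partial }^{2}v_{T}/\partial n\partial t\) is constant on e because \(v_{T}\) is a quadratic polynomial, and the remaining integral can be evaluated by the fundamental theorem of calculus. If \(p_{1}\) and \(p_{2}\) denote the endpoints of e (notation introduced only for this display), ordered along the tangent t, then

$$\displaystyle{ \int _{e}\frac{\partial (w_{T} - E_{h}w)} {\partial t} ds = (w_{T} - E_{h}w)(p_{2}) - (w_{T} - E_{h}w)(p_{1}) = 0, }$$

since \(p_{1},p_{2} \in \mathcal{V}_{h}\) and \(E_{h}w\) agrees with w at the vertices of \(\mathcal{T}_{h}\).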

Next we derive a basic estimate for \(\bar{y} -\bar{ y}_{h}\), where \(\bar{y}\) (resp. \(\bar{y}_{h}\)) is the solution of (1.7)/(1.9) [resp. (2.18)/(2.20)].

Lemma 4.1.

There exists a positive constant C independent of h such that

$$\displaystyle{ \|\bar{y} -\bar{ y}_{h}\|_{h}^{2} \leq 2\|\bar{y} -\Pi _{ h}\bar{y}\|_{h}^{2} + C\big[\mathcal{A}_{ h}(\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) - (f,\Pi _{h}\bar{y} -\bar{ y}_{h})\big]. }$$
(4.2)

Proof.

Since \(\Pi _{h}\bar{y} \in K_{h}\), we deduce from (2.15) and (2.20) that

$$\displaystyle\begin{array}{rcl} \|\bar{y} -\bar{ y}_{h}\|_{h}^{2}& \leq & 2\|\bar{y} -\Pi _{ h}\bar{y}\|_{h}^{2} + 2\|\Pi _{ h}\bar{y} -\bar{ y}_{h}\|_{h}^{2} {}\\ & \leq & 2\|\bar{y} -\Pi _{h}\bar{y}\|_{h}^{2} + C\mathcal{A}_{ h}(\Pi _{h}\bar{y} -\bar{ y}_{h},\Pi _{h}\bar{y} -\bar{ y}_{h}) {}\\ & \leq & 2\|\bar{y} -\Pi _{h}\bar{y}\|_{h}^{2} + C\big[\mathcal{A}_{ h}(\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) - (f,\Pi _{h}\bar{y} -\bar{ y}_{h})\big]. {}\\ \end{array}$$
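The last inequality in this chain uses (2.20) with the admissible choice \(y_{h} = \Pi _{h}\bar{y} \in K_{h}\). Writing \(e_{h} = \Pi _{h}\bar{y} -\bar{ y}_{h}\) (a shorthand used only in this display), we have

$$\displaystyle{ \mathcal{A}_{h}(e_{h},e_{h}) = \mathcal{A}_{h}(\Pi _{h}\bar{y},e_{h}) -\mathcal{A}_{h}(\bar{y}_{h},e_{h}) \leq \mathcal{A}_{h}(\Pi _{h}\bar{y},e_{h}) - (f,e_{h}), }$$

because (2.20) gives \(\mathcal{A}_{h}(\bar{y}_{h},e_{h}) \geq (f,e_{h})\).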

In view of Lemmas 2.1 and 4.1, we can complete the error analysis by bounding the second term on the right-hand side of (4.2). This will be carried out in Sect. 5 after we have developed several technical lemmas in the remaining part of this section.

Lemma 4.2.

There exists a positive constant C independent of h such that

$$\displaystyle{ \sum _{e\in \mathcal{E}_{h}}\vert e\vert \big\|\left [\left [{\partial }^{2}(\Pi _{ h}\bar{y})/\partial n_{e}^{2}\right ]\right ]\big\|_{ L_{2}(e)}^{2} \leq C{h}^{2\tau }, }$$

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4).

Proof.

We will split the estimate into two cases. Let \(\mathcal{E}_{h}^{R} =\{ e \in \mathcal{E}_{ h}:\) e is not an edge of any triangle that touches a corner of \(\Omega\) where the angle is strictly greater than π∕2} and \(\mathcal{E}_{h}^{S} = \mathcal{E}_{ h}\setminus \mathcal{E}_{h}^{R}\). Note that the number of edges in \(\mathcal{E}_{h}^{S}\) is bounded by a constant determined by the minimum angle of \(\mathcal{T}_{h}\).

Since, away from the corners of \(\Omega\) where the angles are strictly greater than π∕2, the function \(\bar{y}\) belongs to \({H}^{3}\) and \({\partial }^{2}\bar{y}/\partial {n}^{2} =\varDelta \bar{ y}\) vanishes on \(\partial \Omega\) (cf. Appendix A.), we have, by (2.7b), (2.21) and the trace theorem with scaling,

$$\displaystyle\begin{array}{rcl} \sum _{e\in \mathcal{E}_{h}^{R}}\vert e\vert \big\|\left [\left [{\partial }^{2}(\Pi _{ h}\bar{y})/\partial n_{e}^{2}\right ]\right ]\big\|_{ L_{2}(e)}^{2}& =& \sum _{ e\in \mathcal{E}_{h}^{R}}\vert e\vert \big\|[\![{\partial }^{2}(\Pi _{ h}\bar{y} -\bar{ y})/\partial n_{e}^{2}]\!]\big\|_{ L_{2}(e)}^{2} {}\\ & \lesssim & \sum _{e\in \mathcal{E}_{h}^{R}}\sum _{T\in \mathcal{T}_{e}}{(h_{T}^{2}{\varPhi }^{-2}(c_{ T}))\varPhi }^{2}(c_{ T})\vert \bar{y}\vert _{{H}^{3}(T)}^{2}, {}\\ \end{array}$$

where the function Φ is defined in (2.3). It then follows from (2.1)–(2.3) and (A.7) that

$$\displaystyle{ \sum _{e\in \mathcal{E}_{h}^{R}}\vert e\vert \big\|\left [\left [{\partial }^{2}(\Pi _{ h}\bar{y})/\partial n_{e}^{2}\right ]\right ]\big\|_{ L_{2}(e)}^{2} \lesssim {h}^{2\tau }, }$$
(4.3)

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) satisfies (2.2)–(2.4).

Let \(e \in \mathcal{E}_{h}^{S}\) be an edge of a triangle that touches a corner p of \(\Omega\) where the angle ω  ∈ (π∕2, π). It follows from scaling that

$$\displaystyle\begin{array}{rcl} \vert e\vert \big\|\left [\left [{\partial }^{2}(\Pi _{ h}\bar{y})/\partial n_{e}^{2}\right ]\right ]\big\|_{ L_{2}(e)}^{2}& \lesssim & \sum _{ T\in \mathcal{T}_{e}}\vert \Pi _{h}\bar{y}\vert _{{H}^{2}(T)}^{2} {}\\ & \lesssim & \sum _{T\in \mathcal{T}_{e}}\left (\vert \Pi _{h}\bar{y} -\bar{ y}\vert _{{H}^{2}(T)}^{2} + \vert \bar{y}\vert _{{ H}^{2}(T)}^{2}\right ). {}\\ \end{array}$$

Let \(T \in \mathcal{T}_{e}\). Since \(\bar{y} \in {H}^{2+\alpha _{\ell}}(T)\) (cf. Appendix A.), we have

$$\displaystyle{ \vert \Pi _{h}\bar{y} -\bar{ y}\vert _{{H}^{2}(T)} \lesssim \vert \bar{y}\vert _{{H}^{2+\alpha _{\ell}}(T)}h_{T}^{\alpha _{\ell}}. }$$

Moreover we have

$$\displaystyle\begin{array}{rcl} \vert \bar{y}\vert _{{H}^{2}(T)}& \approx & \sum _{\vert \mu \vert =2}\|{\partial }^{\mu }\bar{y}\|_{L_{2}(T)} {}\\ & =& \sum {_{\vert \mu \vert =2}\|\varPsi }^{-1}\big(\varPsi ({\partial }^{\mu }\bar{y})\big)\|_{ L_{2}(T)} \lesssim h_{T}^{\alpha _{\ell}}\sum _{ \vert \mu \vert =2}\|\varPsi ({\partial }^{\mu }\bar{y})\|_{ L_{2}(T)}, {}\\ \end{array}$$

where Ψ is defined in (A.9). Therefore it follows from (2.1), (2.5), Remark 2.1 and (A.8) that

$$\displaystyle{ \sum _{e\in \mathcal{E}_{h}^{S}}\vert e\vert \big\|\left [\left [{\partial }^{2}(\Pi _{ h}\bar{y})/\partial n_{e}^{2}\right ]\right ]\big\|_{ L_{2}(e)}^{2} \lesssim {h}^{2\tau }, }$$
(4.4)

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) satisfies (2.2)–(2.4).

The lemma follows from (4.3) and (4.4).

Lemma 4.3.

There exists a positive constant C independent of h such that

$$\displaystyle\begin{array}{rcl} & & \left \vert a_{h}(\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) -\int _{\Omega }{D}^{2}\bar{y}: {D}^{2}E_{ h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\,dx\right \vert \\ & &\quad \leq C{h}^{\tau }\|\Pi _{h}\bar{y} -\bar{ y}_{h}\|_{h}, {}\end{array}$$
(4.5)

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4).

Proof.

Since both \([\![\partial E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})/\partial n]\!]\) and \([\![\partial \bar{y}/\partial n]\!]\) equal 0, we have, from (2.9),

$$\displaystyle\begin{array}{rcl} & & a_{h}(\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) \\ & & \quad =\sum _{T\in \mathcal{T}_{h}}\int _{T}{D}^{2}\bar{y}: {D}^{2}E_{ h}(\Pi _{h}\bar{y} -\bar{ y}_{h})dx \\ & & \qquad +\sum _{T\in \mathcal{T}_{h}}\int _{T}{D}^{2}(\Pi _{ h}\bar{y} -\bar{ y}): {D}^{2}E_{ h}(\Pi _{h}\bar{y} -\bar{ y}_{h})dx \\ & & \qquad +\sum _{T\in \mathcal{T}_{h}}\int _{T}{D}^{2}(\Pi _{ h}\bar{y}): {D}^{2}\big[(\Pi _{ h}\bar{y} -\bar{ y}_{h}) - E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big]dx \\ & & \qquad +\sum _{e\in \mathcal{E}_{h}^{i}}\int _{e}\left \{\left \{\frac{{\partial }^{2}(\Pi _{h}\bar{y})} {\partial {n}^{2}} \right \}\right \}\left [\left [\frac{\partial [(\Pi _{h}\bar{y} -\bar{ y}_{h}) - E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})]} {\partial n} \right ]\right ]ds \\ & & \qquad +\sum _{e\in \mathcal{E}_{h}^{i}}\int _{e}\left \{\left \{\frac{{\partial }^{2}(\Pi _{h}\bar{y} -\bar{ y}_{h})} {\partial {n}^{2}} \right \}\right \}\left [\left [\frac{\partial (\Pi _{h}\bar{y} -\bar{ y})} {\partial n} \right ]\right ]ds \\ & & \qquad +\sum _{e\in \mathcal{E}_{h}^{i}} \frac{\sigma } {\vert e\vert }\int _{e}\left [\left [\frac{\partial (\Pi _{h}\bar{y} -\bar{ y})} {\partial n} \right ]\right ]\left [\left [\frac{\partial (\Pi _{h}\bar{y} -\bar{ y}_{h})} {\partial n} \right ]\right ]ds, {}\end{array}$$
(4.6)

and we can use (2.6), (2.13), Lemma 2.1, (3.11) and scaling to estimate the second, fifth, and sixth terms on the right-hand side of (4.6) as follows:

$$\displaystyle\begin{array}{rcl} & & \left \vert \sum _{T\in \mathcal{T}_{h}}\int _{T}{D}^{2}(\Pi _{ h}\bar{y} -\bar{ y}): {D}^{2}E_{ h}(\Pi _{h}\bar{y} -\bar{ y}_{h})dx\right \vert \\ & &\qquad \leq {\left (\sum _{T\in \mathcal{T}_{h}}\vert \Pi _{h}\bar{y} -\bar{ y}\vert _{{H}^{2}(T)}^{2}\right )}^{\frac{1} {2} }\vert E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\vert _{{H}^{2}(\Omega )} \\ & & \qquad \lesssim {h}^{\tau }\|\Pi _{h}\bar{y} -\bar{ y}_{h}\|_{h}, {}\end{array}$$
(4.7)
$$\displaystyle\begin{array}{rcl} & & \left \vert \sum _{e\in \mathcal{E}_{h}^{i}}\int _{e}\left \{\left \{\frac{{\partial }^{2}(\Pi _{h}\bar{y} -\bar{ y}_{h})} {\partial {n}^{2}} \right \}\right \}\left [\left [\frac{\partial (\Pi _{h}\bar{y} -\bar{ y})} {\partial n} \right ]\right ]ds\right \vert \\ & &\quad \leq {\left (\sum _{e\in \mathcal{E}_{h}^{i}}\vert e\vert \|\{\!\!\{{\partial }^{2}(\Pi _{ h}\bar{y} -\bar{ y}_{h})/\partial {n}^{2}\}\!\!\}\|_{ L_{2}(e)}^{2}\right )}^{\frac{1} {2} } \\ & & \qquad \times {\left (\sum _{e\in \mathcal{E}_{h}^{i}} \frac{1} {\vert e\vert }\|[\![\partial (\Pi _{h}\bar{y} -\bar{ y})/\partial n]\!]\|_{L_{2}(e)}^{2}\right )}^{\frac{1} {2} } \\ & & \quad \lesssim {\left (\sum _{T\in \mathcal{T}_{h}}\vert \Pi _{h}\bar{y} -\bar{ y}_{h}\vert _{{H}^{2}(T)}^{2}\right )}^{\frac{1} {2} }{\left (\sum _{e\in \mathcal{E}_{ h}^{i}} \frac{1} {\vert e\vert }\|[\![\partial (\Pi _{h}\bar{y} -\bar{ y})/\partial n]\!]\|_{L_{2}(e)}^{2}\right )}^{\frac{1} {2} } \\ & & \quad \lesssim {h}^{\tau }\|\Pi _{h}\bar{y} -\bar{ y}_{h}\|_{h}, {}\end{array}$$
(4.8)
$$\displaystyle\begin{array}{rcl} & & \left \vert \sum _{e\in \mathcal{E}_{h}^{i}} \frac{\sigma } {\vert e\vert }\int _{e}\left [\left [\frac{\partial (\Pi _{h}\bar{y} -\bar{ y})} {\partial n} \right ]\right ]\left [\left [\frac{\partial (\Pi _{h}\bar{y} -\bar{ y}_{h})} {\partial n} \right ]\right ]ds\right \vert \\ & &\quad \lesssim {\left (\sum _{e\in \mathcal{E}_{h}^{i}} \frac{1} {\vert e\vert }\|\,\left [\left [\partial (\Pi _{h}\bar{y} -\bar{ y})/\partial n\right ]\right ]\|_{L_{2}(e)}^{2}\right )}^{\frac{1} {2} } \\ & & \qquad \times {\left (\sum _{e\in \mathcal{E}_{h}^{i}} \frac{1} {\vert e\vert }\|\,\left [\left [\partial (\Pi _{h}\bar{y} -\bar{ y}_{h})/\partial n\right ]\right ]\|_{L_{2}(e)}^{2}\right )}^{\frac{1} {2} } \\ & & \quad \lesssim {h}^{\tau }\|\Pi _{h}\bar{y} -\bar{ y}_{h}\|_{h}. {}\end{array}$$
(4.9)

Now we use (3.12), the integration by parts formula (4.1) together with Lemma 4.2 to estimate the sum of the third and fourth terms on the right-hand side of (4.6) by

$$\displaystyle\begin{array}{rcl} & & \sum _{T\in \mathcal{T}_{h}}\int _{T}{D}^{2}(\Pi _{ h}\bar{y}): {D}^{2}\big[(\Pi _{ h}\bar{y} -\bar{ y}_{h}) - E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big]dx \\ & & \qquad +\sum _{e\in \mathcal{E}_{h}^{i}}\int _{e}\left \{\left \{\frac{{\partial }^{2}(\Pi _{h}\bar{y})} {\partial {n}^{2}} \right \}\right \}\left [\left [\frac{\partial [(\Pi _{h}\bar{y} -\bar{ y}_{h}) - E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})]} {\partial n} \right ]\right ]ds \\ & & \quad = -\sum _{e\in \mathcal{E}_{h}}\int _{e}\left [\left [\frac{{\partial }^{2}(\Pi _{h}\bar{y})} {\partial n_{e}^{2}} \right ]\right ]\left \{\left \{\frac{\partial [(\Pi _{h}\bar{y} -\bar{ y}_{h}) - E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})]} {\partial n_{e}} \right \}\right \}ds \\ & & \quad \leq {\left (\sum _{e\in \mathcal{E}_{h}}\vert e\vert \big\|\left [\left [{\partial }^{2}(\Pi _{ h}\bar{y})/\partial n_{e}^{2}\right ]\right ]\big\|_{ L_{2}(e)}^{2}\right )}^{\frac{1} {2} } \\ & & \qquad \times {\left (\sum _{e\in \mathcal{E}_{h}} \frac{1} {\vert e\vert }\big\|\left \{\left \{\partial [(\Pi _{h}\bar{y} -\bar{ y}_{h}) - E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})]/\partial n_{e}\right \}\right \}\big\|_{L_{2}(e)}^{2}\right )}^{\frac{1} {2} } \\ & & \quad \lesssim {h}^{\tau }\|\Pi _{h}\bar{y} -\bar{ y}_{h}\|_{h}. {}\end{array}$$
(4.10)

The lemma follows from (4.6)–(4.10).

Lemma 4.4.

There exists a positive constant C independent of h such that

$$\displaystyle{ \mathcal{A}(\bar{y},E_{h}\Pi _{h}\bar{y} -\bar{ y}) \leq C{h}^{1+\tau }, }$$

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4).

Proof.

Since \(\varDelta \bar{y} \in H_{0}^{1}(\Omega )\) (cf. Appendix A.), we have, by (1.5) and Lemma 3.1,

$$\displaystyle\begin{array}{rcl} \int _{\Omega }{D}^{2}\bar{y}: {D}^{2}(E_{ h}\Pi _{h}\bar{y} -\bar{ y})\,dx& =& \int _{\Omega }(\varDelta \bar{y})\big(\varDelta (E_{h}\Pi _{h}\bar{y} -\bar{ y})\big)\,dx \\ & =& -\int _{\Omega }\nabla (\varDelta \bar{y}) \cdot \nabla (E_{h}\Pi _{h}\bar{y} -\bar{ y})\,dx \\ & \lesssim & \vert E_{h}\Pi _{h}\bar{y} -\bar{ y}\vert _{{H}^{1}(\Omega )} \lesssim {h}^{1+\tau }.{}\end{array}$$
(4.11)

Moreover Lemma 3.1 also implies

$$\displaystyle{ (\bar{y},E_{h}\Pi _{h}\bar{y} -\bar{ y}) \lesssim {h}^{2+\tau }. }$$
(4.12)

The lemma follows from (1.8), (4.11), and (4.12).

5 Convergence Analysis

In this section we complete the error analysis by finding a bound for the second term on the right-hand side of (4.2). We will show that

$$\displaystyle{ \mathcal{A}_{h}(\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) - (f,\Pi _{h}\bar{y} -\bar{ y}_{h}) \lesssim {h}^{2\tau } + {h}^{\tau }\|\Pi _{ h}\bar{y} -\bar{ y}_{h}\|_{h}, }$$
(5.1)

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) satisfies (2.2)–(2.4). But first we use (5.1) to establish the main result of this paper.

Theorem 5.1.

There exists a positive constant C independent of h such that

$$\displaystyle{ \|\bar{y} -\bar{ y}_{h}\|_{h} \leq C{h}^{\tau }, }$$
(5.2)

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4).

Proof.

It follows from Lemma 2.1, (4.2), (5.1), and the inequality of arithmetic and geometric means that

$$\displaystyle\begin{array}{rcl} \|\bar{y} -\bar{ y}_{h}\|_{h}^{2}& \leq & C\big({h}^{2\tau } + {h}^{\tau }\|\Pi _{ h}\bar{y} -\bar{ y}_{h}\|_{h}\big) {}\\ & \leq & C\big({h}^{2\tau } + {h}^{\tau }\|\bar{y} -\bar{ y}_{ h}\|_{h}\big) \leq C{h}^{2\tau } + \frac{1} {2}\|\bar{y} -\bar{ y}_{h}\|_{h}^{2}, {}\\ \end{array}$$

which implies (5.2).
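The absorption step in this proof can be written out explicitly. The display below is only a sketch: the abbreviation \(t =\|\bar{y} -\bar{ y}_{h}\|_{h}\) and the generic constant \(C_{1}\) are labels introduced here for illustration, and the interpolation bound \(\|\bar{y} -\Pi _{h}\bar{y}\|_{h} \lesssim {h}^{\tau }\) is assumed to be the content of Lemma 2.1.

```latex
\begin{align*}
\|\Pi_{h}\bar{y}-\bar{y}_{h}\|_{h}
  &\leq \|\Pi_{h}\bar{y}-\bar{y}\|_{h} + \|\bar{y}-\bar{y}_{h}\|_{h}
   \lesssim h^{\tau} + t, \\
t^{2} &\leq C_{1}\big(h^{2\tau} + h^{\tau}t\big), \\
C_{1}h^{\tau}t &\leq \tfrac{1}{2}C_{1}^{2}h^{2\tau} + \tfrac{1}{2}t^{2}, \\
t^{2} &\leq \big(2C_{1} + C_{1}^{2}\big)h^{2\tau}.
\end{align*}
```

The third line is the inequality of arithmetic and geometric means applied to \(C_{1}{h}^{\tau }\) and t; moving \(\frac{1}{2}t^{2}\) to the left-hand side of the second line gives the last line and hence (5.2).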

The following lemma reduces the derivation of (5.1) to an estimate at the continuous level.

Lemma 5.1.

There exists a positive constant C independent of h such that

$$\displaystyle\begin{array}{rcl} & & \mathcal{A}_{h}(\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) - (f,\Pi _{h}\bar{y} -\bar{ y}_{h}) {}\\ & & \quad \leq C{h}^{\tau }\|\Pi _{h}\bar{y} -\bar{ y}_{h}\|_{h} + \mathcal{A}\big(\bar{y},E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) -\big (f,E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big), {}\\ \end{array}$$

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4).

Proof.

From (1.8) and (2.12) we have

$$\displaystyle\begin{array}{rcl} & & \mathcal{A}_{h}(\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) - (f,\Pi _{h}\bar{y} -\bar{ y}_{h}) \\ & & \quad =\beta \left [a_{h}(\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) -\int _{\Omega }\big({D}^{2}\bar{y}: {D}^{2}E_{ h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big)\,dx\right ] \\ & & \qquad +\gamma \big [(\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) -\big (\bar{y},E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big)\big] \\ & & \qquad -\big (f,(\Pi _{h}\bar{y} -\bar{ y}_{h}) - E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) \\ & & \qquad + \mathcal{A}\big(\bar{y},E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) -\big (f,E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big), {}\end{array}$$
(5.3)

and we can bound the second and third terms on the right-hand side of (5.3) as follows:

$$\displaystyle\begin{array}{rcl} & & (\Pi _{h}\bar{y},\Pi _{h}\bar{y} -\bar{ y}_{h}) -\big (\bar{y},E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) \\ & & \quad = (\Pi _{h}\bar{y} -\bar{ y},\Pi _{h}\bar{y} -\bar{ y}_{h}) +\big (\bar{y},(\Pi _{h}\bar{y} -\bar{ y}_{h}) - E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) \\ & & \quad \lesssim {h}^{2}\|\Pi _{ h}\bar{y} -\bar{ y}_{h}\|_{h} {}\end{array}$$
(5.4)

by (2.21) and (3.10); and

$$\displaystyle{ \big\vert \big(f,(\Pi _{h}\bar{y} -\bar{ y}_{h}) - E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big)\big\vert \lesssim {h}^{2}\|\Pi _{ h}\bar{y} -\bar{ y}_{h}\|_{h} }$$
(5.5)

by (3.10).

The lemma follows from Lemma 4.3 and (5.3)–(5.5).

In view of Lemma 5.1, it only remains to show that

$$\displaystyle{ \mathcal{A}\big(\bar{y},E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) -\big (f,E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) \lesssim {h}^{2\tau } + {h}^{\tau }\|\Pi _{ h}\bar{y} -\bar{ y}_{h}\|_{h}. }$$
(5.6)

We will use the relation \(A\stackrel{\leq }{.}B\) to streamline the derivation of (5.6), where

$$\displaystyle{A\stackrel{\leq }{.}B\quad \text{means that}\quad A - B \lesssim {h}^{2\tau } + {h}^{\tau }\|\Pi _{ h}\bar{y} -\bar{ y}_{h}\|_{h}.}$$

The estimate (5.6) can then be written as

$$\displaystyle{ \mathcal{A}\big(\bar{y},E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big)\stackrel{\leq }{.}\big(f,E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big). }$$
(5.7)

It follows from (1.9), (3.3), (3.4), (3.7), and Lemma 4.4 that

$$\displaystyle\begin{array}{rcl} & & \mathcal{A}\big(\bar{y},E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) = \mathcal{A}(\bar{y},E_{h}\Pi _{h}\bar{y} -\bar{ y}) + \mathcal{A}(\bar{y},\bar{y} - E_{h}\bar{y}_{h}) \\ & & \quad \stackrel{\leq }{.}\mathcal{A}(\bar{y},\bar{y} - E_{h}\bar{y}_{h}) \\ & & \quad = \mathcal{A}(\bar{y},\bar{y} -\hat{ y}_{h}) + \mathcal{A}(\bar{y},\hat{y}_{h} - E_{h}\bar{y}_{h}) \\ & & \quad \leq (f,\bar{y} -\hat{ y}_{h}) + \mathcal{A}(\bar{y},\bar{y}_{h}^{{\ast}}- E_{ h}\bar{y}_{h}) + \mathcal{A}(\bar{y},\delta _{h,1}\phi _{1} -\delta _{h,2}\phi _{2}) \\ & & \quad \stackrel{\leq }{.}(f,\bar{y} -\hat{ y}_{h}) + \mathcal{A}(\bar{y},\bar{y}_{h}^{{\ast}}- E_{ h}\bar{y}_{h}) \\ & & \quad = (f,\bar{y} -\hat{ y}_{h}) + \mathcal{A}(\bar{y}_{h}^{{\ast}},\bar{y}_{ h}^{{\ast}}- E_{ h}\bar{y}_{h}) + \mathcal{A}(\bar{y} -\bar{y}_{h}^{{\ast}},\bar{y}_{ h}^{{\ast}}- E_{ h}\bar{y}_{h}) \\ & & \quad \leq (f,\bar{y} -\hat{ y}_{h}) + (f,\bar{y}_{h}^{{\ast}}- E_{ h}\bar{y}_{h}) + \mathcal{A}(\bar{y} -\bar{y}_{h}^{{\ast}},\bar{y}_{ h}^{{\ast}}- E_{ h}\bar{y}_{h}) \\ & & \quad \stackrel{\leq }{.}(f,\bar{y} - E_{h}\bar{y}_{h}) + \mathcal{A}(\bar{y} -\bar{y}_{h}^{{\ast}},\bar{y}_{ h}^{{\ast}}- E_{ h}\bar{y}_{h}). {}\end{array}$$
(5.8)

Moreover we have

$$\displaystyle\begin{array}{rcl} (f,\bar{y} - E_{h}\bar{y}_{h})& =& \big(f,E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) + (f,\bar{y} - E_{h}\Pi _{h}\bar{y}) \\ & \stackrel{\leq }{.}& \big(f,E_{h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) {}\end{array}$$
(5.9)

by Lemma 3.1, and

$$\displaystyle\begin{array}{rcl} & & \mathcal{A}(\bar{y} -\bar{y}_{h}^{{\ast}},\bar{y}_{ h}^{{\ast}}- E_{ h}\bar{y}_{h}) \\ & & \quad = \mathcal{A}(\bar{y} -\bar{y}_{h}^{{\ast}},\bar{y}_{ h}^{{\ast}}-\bar{ y}) + \mathcal{A}(\bar{y} -\bar{y}_{ h}^{{\ast}},\bar{y} - E_{ h}\bar{y}_{h}) \\ & & \quad \leq \mathcal{A}(\bar{y} -\bar{y}_{h}^{{\ast}},\bar{y} - E_{ h}\Pi _{h}\bar{y}) + \mathcal{A}\big(\bar{y} -\bar{y}_{h}^{{\ast}},E_{ h}(\Pi _{h}\bar{y} -\bar{ y}_{h})\big) \\ & & \quad \stackrel{\leq }{.}0 {}\end{array}$$
(5.10)

by (3.5), (3.11), and Lemma 3.1. The relation (5.7) then follows from (5.8)–(5.10). Therefore we have established (5.6) and hence (5.1).

The following corollary is an immediate consequence of (2.17) and Theorem 5.1.

Corollary 5.1.

There exists a positive constant C independent of h such that

$$\displaystyle{ \vert \bar{y} -\bar{ y}_{h}\vert _{{H}^{1}(\Omega )} \leq C{h}^{\tau }, }$$

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4).

Since the energy norm \(\|\cdot \|_{h}\) is an \({H}^{2}\)-like norm, we can also deduce an \(L_{\infty }\) norm error estimate from Theorem 5.1. The proof of the following theorem, which is based on Lemmas 2.1 and 3.1, Theorem 5.1, standard inverse estimates and the Sobolev inequality, is identical to the proof of Theorem 4.1 in [23] and is thus omitted.

Theorem 5.2.

There exists a positive constant C independent of h such that

$$\displaystyle{ \|\bar{y} -\bar{ y}_{h}\|_{L_{\infty }(\Omega )} \leq C{h}^{\tau }, }$$

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4).

Remark 5.1.

Since the norms \(\|\cdot \|_{L_{\infty }(\Omega )}\) and \(\vert \cdot \vert _{{H}^{1}(\Omega )}\) are weaker than the energy norm \(\|\cdot \|_{h}\), the order of convergence in these norms should be higher than the order of convergence in \(\|\cdot \|_{h}\). This is confirmed by the numerical results in Sect. 6. Therefore the estimates for \(\|\bar{y} -\bar{ y}_{h}\|_{L_{\infty }(\Omega )}\) and \(\vert \bar{y} -\bar{ y}_{h}\vert _{{H}^{1}(\Omega )}\) in Corollary 5.1 and Theorem 5.2 are not sharp.

For the optimal control problem defined by (1.1)–(1.3), we can take the approximation for the optimal control \(\bar{u}\) to be the function \(\bar{u}_{h} \in V _{h}\) defined by

$$\displaystyle{ \int _{\Omega }\nabla \bar{y}_{h} \cdot \nabla v\,dx =\int _{\Omega }\bar{u}_{h}v\,dx\qquad \forall \,v \in V _{h}. }$$
(5.11)
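In matrix form, (5.11) amounts to solving a mass-matrix system whose right-hand side is the stiffness matrix applied to the discrete state. The following is a minimal sketch of that idea on a one-dimensional analogue with piecewise linear elements on (0, 1); it is not the \(\mathbb{P}_{2}\) discretization used in this paper, and the names `solve_tridiagonal` and `recover_control` are hypothetical helpers introduced only for this illustration.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def recover_control(y_interior, h):
    """1D analogue of (5.11): find u_h with (u_h, v) = (y_h', v') for all v in V_h.

    y_interior holds the values of y_h at the interior nodes of a uniform mesh
    with spacing h; y_h and u_h vanish at both end points (assumption of this sketch).
    """
    n = len(y_interior)
    rhs = []
    for i in range(n):
        left = y_interior[i - 1] if i > 0 else 0.0
        right = y_interior[i + 1] if i < n - 1 else 0.0
        # Stiffness matrix (1/h) * tridiag(-1, 2, -1) applied to the state.
        rhs.append((2.0 * y_interior[i] - left - right) / h)
    # Mass matrix (h/6) * tridiag(1, 4, 1).
    lower = [h / 6.0] * n
    diag = [4.0 * h / 6.0] * n
    upper = [h / 6.0] * n
    lower[0] = 0.0
    upper[-1] = 0.0
    return solve_tridiagonal(lower, diag, upper, rhs)


# Usage: y = sin(pi x), so that u = -y'' = pi^2 sin(pi x).
h = 1.0 / 64
xs = [i * h for i in range(1, 64)]
y_nodal = [math.sin(math.pi * x) for x in xs]
u_nodal = recover_control(y_nodal, h)
```

In this model setting the recovered nodal values stay close to \({\pi }^{2}\sin (\pi x)\); the two-dimensional case differs only in how the mass and stiffness matrices are assembled.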

Theorem 5.3.

There exists a positive constant C independent of h such that

$$\displaystyle{ \|\bar{u} -\bar{ u}_{h}\|_{L_{2}(\Omega )} \leq C{h}^{\tau }, }$$
(5.12)

where τ = α if \(\mathcal{T}_{h}\) is quasi-uniform and τ = 1 if \(\mathcal{T}_{h}\) is graded according to (2.2)–(2.4).

Proof.

Let \(Q_{h}: L_{2}(\Omega )\longrightarrow V _{h}\) be the orthogonal projection. From (1.2) we have

$$\displaystyle{ \int _{\Omega }\nabla \bar{y} \cdot \nabla v\,dx =\int _{\Omega }\bar{u}v\,dx =\int _{\Omega }(Q_{h}\bar{u})v\,dx\qquad \forall \,v \in V _{h}. }$$
(5.13)

Let \(v \in V _{h}\) be arbitrary. Using integration by parts, the Cauchy–Schwarz inequality, scaling, (2.7b), (2.13), and Theorem 5.1, we find

$$\displaystyle\begin{array}{rcl} & & \int _{\Omega }\nabla (\bar{y} -\bar{ y}_{h}) \cdot \nabla v\,dx {}\\ & & \quad = -\sum _{e\in \mathcal{E}_{h}^{i}}\int _{e}[\![\partial (\bar{y} -\bar{ y}_{h})/\partial n]\!]v\,ds -\sum _{T\in \mathcal{T}_{h}}\int _{T}[\varDelta (\bar{y} -\bar{ y}_{h})]v\,dx {}\\ & & \quad \leq {\left (\sum _{e\in \mathcal{E}_{h}^{i}}\vert e{\vert }^{-1}\|[\![\partial (\bar{y} -\bar{ y}_{ h})/\partial n]\!]\|_{L_{2}(e)}^{2}\right )}^{\frac{1} {2} }{\left (\sum _{e\in \mathcal{E}_{ h}^{i}}\vert e\vert \|v\|_{L_{2}(e)}^{2}\right )}^{\frac{1} {2} } {}\\ & & \qquad +{ \left (\sum _{T\in \mathcal{T}_{h}}\vert \bar{y} -\bar{ y}_{h}\vert _{{H}^{2}(T)}^{2}\right )}^{\frac{1} {2} }\|v\|_{L_{ 2}(\Omega )} {}\\ & & \quad \lesssim \|\bar{ y} -\bar{ y}_{h}\|_{h}\|v\|_{L_{2}(\Omega )} \lesssim {h}^{\tau }\|v\|_{L_{2}(\Omega )}. {}\\ \end{array}$$

It then follows from (5.11), (5.13) and duality that

$$\displaystyle\begin{array}{rcl} & & \|Q_{h}\bar{u} -\bar{ u}_{h}\|_{L_{2}(\Omega )} \\ & & \quad =\sup _{v\in V _{h}\setminus \{0\}}\left (\int _{\Omega }(Q_{h}\bar{u} -\bar{ u}_{h})v\,dx\right )/\|v\|_{L_{2}(\Omega )} \\ & & \quad =\sup _{v\in V _{h}\setminus \{0\}}\left (\int _{\Omega }\nabla (\bar{y} -\bar{ y}_{h}) \cdot \nabla v\,dx\right )/\|v\|_{L_{2}(\Omega )} \lesssim {h}^{\tau }.{}\end{array}$$
(5.14)

Furthermore, we have, by a standard interpolation error estimate [61],

$$\displaystyle{ \|Q_{h}\bar{u} -\bar{ u}\|_{L_{2}(\Omega )} \lesssim \vert \bar{u}\vert _{{H}^{1}(\Omega )}h. }$$
(5.15)

The estimate (5.12) follows from (5.14) and (5.15).
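For completeness, the final step can be spelled out as a triangle inequality. The display is a sketch that uses only (5.14), (5.15), the fact that τ ≤ 1, and the boundedness of h.

```latex
\begin{align*}
\|\bar{u}-\bar{u}_{h}\|_{L_{2}(\Omega)}
  &\leq \|\bar{u}-Q_{h}\bar{u}\|_{L_{2}(\Omega)}
      + \|Q_{h}\bar{u}-\bar{u}_{h}\|_{L_{2}(\Omega)} \\
  &\lesssim h + h^{\tau} \lesssim h^{\tau}.
\end{align*}
```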

Remark 5.2.

One can also take the piecewise constant function \(\bar{u}_{h} = -\varDelta _{h}\bar{y}_{h}\) to be an approximation of the optimal control \(\bar{u}\), where \(\varDelta _{h}\) is the piecewise Laplacian with respect to \(\mathcal{T}_{h}\). The estimate (5.12) then immediately follows from (2.13) and Theorem 5.1. But numerical results indicate that the approximation of \(\bar{u}\) defined by (5.11) is a better choice.

Remark 5.3.

By tracing the constants in all the estimates (including (3.4)) one can show (using (A.7), (A.8), and (A.10)) that the constant C in Theorem 5.1, Corollary 5.1, Theorem 5.2, and Theorem 5.3 is of the form

$$\displaystyle\begin{array}{rcl} & & \mathfrak{C}\left (\|f\|_{L_{2}(\Omega )} +\sum _{ i=1}^{2}\|\psi _{ i}\|_{W_{\infty }^{2}(K)} + \vert \varDelta \bar{y}\vert _{{H}^{1}(\Omega )} +\sum _{\vert \mu \vert =3}\|\varPhi ({\partial }^{\mu }\bar{y})\|_{ L_{2}(\Omega )}\right. {}\\ & & \quad \left.+\sum _{\vert \mu \vert =2}\|\varPsi ({\partial }^{\mu }\bar{y})\|_{L_{2}(\Omega )} +\|\bar{ y}\|_{W_{\infty }^{2}(K)}\right ), {}\\ \end{array}$$

where Φ (resp. Ψ) is defined in (2.3) (resp. (A.9)), \(K \subset \subset \Omega\) is a compact neighborhood of the contact set where \((\bar{y} -\psi _{1})(\bar{y} -\psi _{2}) = 0\), and the positive constant \(\mathfrak{C}\) depends only on \(\Omega\) and the shape regularity of \(\mathcal{T}_{h}\).

6 Numerical Results

In this section we present several numerical examples for the obstacle problem (1.7) with \(\psi _{1}(x) = -\infty \). The computational domain for the first four examples is the square \((-0.5,0.5) \times (-0.5,0.5)\). The discrete problems are defined on uniform triangulations \(\mathcal{T}_{j}\) with mesh parameter \(h_{j} = {2}^{-j}\) (the length of the horizontal and vertical edges) for 1 ≤ j ≤ 8, and the penalty parameter σ is chosen to be 5, which ensures the coercivity of the discrete bilinear form on uniform meshes. The solutions of the discrete problems are denoted by \(\bar{y}_{j}\) (1 ≤ j ≤ 8), which are obtained by a primal–dual active set algorithm [4, 47].
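To illustrate the kind of iteration a primal–dual active set algorithm performs, here is a minimal sketch for a one-dimensional second-order model problem with an upper obstacle (minimize \(\frac{1}{2}{y}^{T}Ay - {f}^{T}y\) subject to y ≤ ψ, with A tridiagonal). It is not the solver used for the experiments in this section: the model problem, the function names, and the parameter `c` are assumptions made only for this illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def primal_dual_active_set(a_diag, a_off, f, psi, c=1.0, max_iter=100):
    """Solve A y + lam = f, y <= psi, lam >= 0, lam * (y - psi) = 0
    for A = tridiag(a_off, a_diag, a_off)."""
    n = len(f)
    active = [False] * n
    for _ in range(max_iter):
        lower, diag, upper, rhs = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        for i in range(n):
            if active[i]:
                # Active node: enforce y_i = psi_i.
                diag[i], rhs[i] = 1.0, psi[i]
            else:
                # Inactive node: keep the i-th row of A y = f.
                diag[i], rhs[i] = a_diag, f[i]
                lower[i] = a_off if i > 0 else 0.0
                upper[i] = a_off if i < n - 1 else 0.0
        y = solve_tridiagonal(lower, diag, upper, rhs)
        lam = [0.0] * n
        for i in range(n):
            if active[i]:
                a_y = a_diag * y[i]
                if i > 0:
                    a_y += a_off * y[i - 1]
                if i < n - 1:
                    a_y += a_off * y[i + 1]
                lam[i] = f[i] - a_y  # multiplier on the active set
        new_active = [lam[i] + c * (y[i] - psi[i]) > 0.0 for i in range(n)]
        if new_active == active:
            return y, lam, active
        active = new_active
    raise RuntimeError("active set did not settle within max_iter iterations")


# Usage: -y'' = 10 on (0, 1), y(0) = y(1) = 0, obstacle y <= 0.5, mesh size h = 1/32.
h = 1.0 / 32
n = 31
y, lam, active = primal_dual_active_set(2.0 / h, -1.0 / h, [10.0 * h] * n, [0.5] * n)
```

The iteration stops when the active set no longer changes, at which point the returned pair satisfies the complementarity conditions stated in the docstring.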

Example 1.

In this example we validate our numerical scheme by solving (1.7)/(1.9) with a known solution. We begin with the obstacle problem on the disc \(\{x:\, \vert x\vert < 2\}\) with γ = 0, β = 1, f = 0 and \(\psi _{2}(x) = \frac{1} {2}\vert x{\vert }^{2} - 1\). This problem can be solved analytically because of rotational symmetry, and the exact solution is given by

$$\displaystyle{ y_{\dag }(x) = \left \{\begin{array}{@{}l@{\quad }l@{}} C_{1}\vert x{\vert }^{2}\ln \vert x\vert + C_{2}\vert x{\vert }^{2} + C_{3}\ln \vert x\vert + C_{4}\quad &\quad \vert x\vert > r_{0} \\ \frac{1} {2}\vert x{\vert }^{2} - 1 \quad &\quad \vert x\vert \leq r_{ 0} \end{array} \right., }$$
(6.1)

where \(r_{0} = 0.31078820\), \(C_{1} = -0.26855864\ldots\), \(C_{2} = 0.45470930\), \(C_{3} = -0.02593989\ldots\), and \(C_{4} = -1.05625438\ldots\).
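The quoted constants can be sanity-checked numerically. The sketch below evaluates the outer branch of (6.1) as a function of r = |x| and compares its value and its first two radial derivatives with those of \(\psi _{2}\) at \(r_{0}\); it also evaluates the outer branch and its Laplacian at r = 2. The function names are hypothetical, the constants are the truncated decimals printed above, and agreement can therefore only be expected up to roughly that truncation.

```python
import math

# Truncated constants as printed in the text.
R0 = 0.31078820
C1 = -0.26855864
C2 = 0.45470930
C3 = -0.02593989
C4 = -1.05625438


def psi2(r):
    """Obstacle psi_2 as a function of r = |x|."""
    return 0.5 * r * r - 1.0


def y_outer(r):
    """Outer branch of (6.1), valid for r > R0."""
    return C1 * r * r * math.log(r) + C2 * r * r + C3 * math.log(r) + C4


def y_outer_prime(r):
    """First radial derivative of the outer branch."""
    return C1 * (2.0 * r * math.log(r) + r) + 2.0 * C2 * r + C3 / r


def y_outer_second(r):
    """Second radial derivative of the outer branch."""
    return C1 * (2.0 * math.log(r) + 3.0) + 2.0 * C2 - C3 / (r * r)


def laplacian_outer(r):
    """Laplacian of a radial function: y'' + y'/r."""
    return y_outer_second(r) + y_outer_prime(r) / r


def y_dagger(r):
    """Piecewise exact solution (6.1) as a function of r = |x|."""
    return y_outer(r) if r > R0 else psi2(r)


# Mismatches at the free boundary r = R0 (psi2' = r and psi2'' = 1).
gap_value = y_outer(R0) - psi2(R0)
gap_slope = y_outer_prime(R0) - R0
gap_curvature = y_outer_second(R0) - 1.0
```

With these digits the three gaps are small, and both `y_outer(2.0)` and `laplacian_outer(2.0)` are close to zero, which is consistent with homogeneous simply supported boundary conditions on the disc.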

Let \(\bar{y}\) be the restriction of \(y_{\dag }\) to \(\Omega = {(-0.5,0.5)}^{2}\). Then we have

$$\displaystyle{ \bar{y} =\mathop{ \mathrm{argmin}}_{y\in \tilde{K}}\left [\frac{1} {2}\int _{\Omega }({D}^{2}y: {D}^{2}y)dx -\int _{ \partial \Omega }\left (\frac{{\partial }^{2}y_{\dag }} {\partial {n}^{2}} \right )\left (\frac{\partial y} {\partial n}\right )ds\right ], }$$
(6.2)

where n is the unit outer normal on \(\partial \Omega\) and

$$\displaystyle{ \tilde{K} =\{ v \in {H}^{2}(\Omega ):\, v - y_{ \dag }\in H_{0}^{1}(\Omega )\quad \text{and}\quad v \leq \psi _{ 2}\quad \text{in}\;\Omega \}, }$$

i.e., \(\bar{y}\) is the solution of an obstacle problem for a simply supported plate with nonhomogeneous boundary conditions.

As in the case of clamped plates [23], our results for simply supported plates with homogeneous boundary conditions (Theorems 5.1 and 5.2) can be extended to the nonhomogeneous case. Let \(\tilde{V }_{h}\) be the \(\mathbb{P}_{2}\) Lagrange finite element space associated with the triangulation \(\mathcal{T}_{h}\). The discrete problem for (6.2) is to find

$$\displaystyle{ \bar{y}_{h} =\mathop{ \mathrm{argmin}}_{y_{h}\in \tilde{K}_{h}}\left [\frac{1} {2}a_{h}(y_{h},y_{h}) -\sum _{e\in \mathcal{E}_{h}^{b}}\int _{e}\left (\frac{{\partial }^{2}y_{\dag }} {\partial {n}^{2}} \right )\left (\frac{\partial y_{h}} {\partial n} \right )ds\right ], }$$
(6.3)

where

$$\displaystyle{ \tilde{K}_{h} =\{ v \in \tilde{ V }_{h}:\; v -\Pi _{h}y_{\dag }\in H_{0}^{1}(\Omega )\quad \text{and}\quad v(p) \leq \psi _{ 2}(p)\quad \forall \,p \in \mathcal{V}_{h}\}. }$$

Let \(\bar{y}_{j}\) be the solution of (6.3) for the jth level triangulation and \(\Pi _{j}\) be the Lagrange nodal interpolation operator for the jth level finite element space \(V _{j}\). We evaluate the error \(e_{j} =\Pi _{j}\bar{y} -\bar{ y}_{j}\) in the energy norm \(\|\cdot \|_{h_{j}}\) and in the norm \(\|\cdot \|_{\infty }\) defined by

$$\displaystyle{\|e_{j}\|_{\infty } =\max _{p\in \mathcal{N}_{j}}\vert e_{j}(p)\vert,}$$

where \(\mathcal{N}_{j}\) is the set of the vertices and midpoints of \(\mathcal{T}_{j}\). We also compute the order of convergence in these norms by the formulas

$$\displaystyle{ \ln (\|e_{j-1}\|_{h_{j-1}}/\|e_{j}\|_{h_{j}})/\ln 2\quad \text{ and }\ \quad \ln (\|e_{j-1}\|_{\infty }/\|e_{j}\|_{\infty })/\ln 2. }$$
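The order formulas translate directly into a few lines of code. This is a minimal sketch that assumes the mesh size is halved from one level to the next, as it is for \(h_{j} = {2}^{-j}\); the function name `observed_orders` is a hypothetical helper, and the error values in the usage lines are synthetic, not data from the tables.

```python
import math


def observed_orders(errors):
    """Return ln(e_{j-1} / e_j) / ln 2 for consecutive entries of errors.

    errors[j] is the error norm on level j; the mesh size is assumed to be
    halved between consecutive levels.
    """
    return [
        math.log(errors[j - 1] / errors[j]) / math.log(2.0)
        for j in range(1, len(errors))
    ]


# Usage with synthetic errors that decay like h^{1.5}.
synthetic_errors = [2.0 ** (-1.5 * j) for j in range(1, 6)]
orders = observed_orders(synthetic_errors)
```

The same helper applies to both the energy norm and the \(\|\cdot \|_{\infty }\) norm, since only the ratio of consecutive errors enters.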

The numerical results are presented in Table 1. The order of convergence in the energy norm is observed to be 1.5, which is better than the order of 1 predicted by Theorem 5.1. This is likely due to the fact that \(\bar{y}\) is actually a \({C}^{\infty }\) function on \(\Omega\) away from the circle with radius \(r_{0}\), and therefore superconvergence occurs since we use uniform triangulations. We also observe that the order of convergence in the \(\ell_{\infty }\) norm is close to 2, better than the order of 1 predicted by Theorem 5.2.

Table 1 Energy and \(\ell_{\infty }\) errors for Example 1

We plot the discrete coincidence sets \(I_{7}\) and \(I_{8}\) in Fig. 1, where

$$\displaystyle{ I_{j} =\{ p \in \mathcal{N}_{j}:\;\bar{ y}_{j}(p) \geq \psi _{2}(p) -\| e_{j}\|_{\infty }\}. }$$

The black circle represents the exact free boundary \(F =\{ x \in \Omega: \vert x\vert = r_{0}\}\) (cf. (6.1)). It is evident that the discrete coincidence sets (resp. free boundaries) are converging to the exact coincidence set (resp. free boundary).
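The definition of \(I_{j}\) is a simple threshold test on nodal values. A minimal sketch, assuming the nodal values of \(\bar{y}_{j}\) and \(\psi _{2}\) are available as parallel lists (the function and argument names are hypothetical, and the data in the usage lines are made up for illustration):

```python
def discrete_coincidence_set(nodes, y_nodal, psi_nodal, tol):
    """Return the nodes p with y(p) >= psi(p) - tol.

    nodes, y_nodal and psi_nodal are parallel sequences; tol plays the role
    of the nodal max-norm error used as the threshold in the definition of I_j.
    """
    return [
        p
        for p, y, psi in zip(nodes, y_nodal, psi_nodal)
        if y >= psi - tol
    ]


# Usage with three made-up nodes on the x_1 axis.
sample_nodes = [(0.0, 0.0), (0.25, 0.0), (0.5, 0.0)]
sample_y = [-1.0, -0.97, -0.9]
sample_psi = [-1.0, -0.96875, -0.875]
coincidence = discrete_coincidence_set(sample_nodes, sample_y, sample_psi, tol=0.002)
```

Using the error itself as the tolerance means a node is counted as in contact whenever the gap to the obstacle is no larger than the observed discretization error.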

Fig. 1 Discrete coincidence sets \(I_{7}\) (left) and \(I_{8}\) (right) for Example 1

The second set of examples consists of optimal control problems with state constraints that come from [6, 55]. The value of γ is taken to be 1. Since the exact solutions are not known, we take \(\tilde{e}_{\bar{y},j} =\bar{ y}_{j-1} -\bar{ y}_{j}\) and evaluate \(\|\tilde{e}_{\bar{y},j}\|_{h_{j}}\) (the error of the state in the energy norm), \(\vert \tilde{e}_{\bar{y},j}\vert _{{H}^{1}}\) (the error of the state in the \({H}^{1}(\Omega )\) seminorm), and \(\|\tilde{e}_{\bar{y},j}\|_{\infty }\) (the error of the state in the \(\ell_{\infty }\) norm) defined by

$$\displaystyle{ \|\tilde{e}_{\bar{y},j}\|_{\infty } =\max _{p\in \mathcal{N}_{j}}\vert \tilde{e}_{\bar{y},j}(p)\vert. }$$

The approximations of the optimal control in these examples are given by the piecewise quadratic functions \(\bar{u}_{j} \in V _{j}\) defined by (5.11). We take \(\tilde{e}_{\bar{u},j} =\bar{ u}_{j-1} -\bar{ u}_{j}\) and evaluate \(\|\tilde{e}_{\bar{u},j}\|_{L_{2}}\) (the error of the control in the \(L_{2}(\Omega )\) norm). The orders of convergence in these examples are generated by the formulas

$$\displaystyle{ \ln (\|\tilde{e}_{\bar{y},j-1}\|/\|\tilde{e}_{\bar{y},j}\|)/\ln (2)\quad \text{and}\ \quad \ln (\|\tilde{e}_{\bar{u},j-1}\|/\|\tilde{e}_{\bar{u},j}\|)/\ln (2). }$$

Example 2.

In this example we take \(y_{d}(x) = 10(\sin (2\pi (x_{1} + 0.5)) + (x_{2} + 0.5))\), \(\psi _{2}(x) = 0.01\) and β = 0.1. The errors for the approximations of the state and the control are reported in Tables 2 and 3. The discrete state \(\bar{y}_{8}\) and control \(\bar{u}_{8}\) are depicted in Fig. 2.

Table 2 Energy and state errors for Example 2

Table 3 \({H}^{1}\) state errors and \(L_{2}\) control errors for Example 2

Fig. 2 Discrete state \(\bar{y}_{8}\) (left) and control \(\bar{u}_{8}\) (right) for Example 2

Example 3.

In this example we take \(y_{d}(x) =\sin (2\pi (x_{1} + 0.5)(x_{2} + 0.5))\), \(\psi _{2}(x) = 0.1\) and \(\beta = 1{0}^{-3}\). The errors for the approximations of the state and the control are given in Tables 4 and 5. Figure 3 contains the plots for the discrete state \(\bar{y}_{8}\) and the discrete control \(\bar{u}_{8}\).

Table 4 Energy and state errors for Example 3

Table 5 \({H}^{1}\) state errors and \(L_{2}\) control errors for Example 3

Fig. 3 Discrete state \(\bar{y}_{8}\) (left) and control \(\bar{u}_{8}\) (right) for Example 3

Example 4.

In this example we take \(y_{d}(x) =\sin (4\pi (x_{1} + 0.5)(x_{2} + 0.5)) + 1.5\), \(\psi _{2}(x) = 1\) and \(\beta = 1{0}^{-4}\). The errors for the approximations of the state and the control are presented in Tables 6 and 7. The plots of the discrete state \(\bar{y}_{8}\) and the discrete control \(\bar{u}_{8}\) are given in Fig. 4.

Table 6 Energy and state errors for Example 4

Table 7 \({H}^{1}\) state errors and \(L_{2}\) control errors for Example 4

Fig. 4 Discrete state \(\bar{y}_{8}\) (left) and control \(\bar{u}_{8}\) (right) for Example 4

The numerical results in Tables 2–7 confirm the error estimate for \(\|\bar{y} -\bar{ y}_{h}\|_{h}\) in Theorem 5.1, since the index of elliptic regularity α = 1 for a rectangular domain. On the other hand, the order of convergence for \(\bar{y}_{h}\) is 2 for both the \(\ell_{\infty }\) norm and the \({H}^{1}(\Omega )\) seminorm, which is better than the first order convergence predicted by Theorem 5.2 and Corollary 5.1; and the order of convergence for \(\bar{u}_{h}\) is around 1.5, which is also better than the first order convergence predicted by Theorem 5.3. The plots of the state and control in Figs. 2–4 also agree with the ones reported in [6, 55].

Example 5.

In this example we take \(\Omega\) to be the pentagonal domain obtained from the square \({(-0.5,0.5)}^{2}\) by deleting the triangle with vertices (0.5, 0), (0.5, 0.5) and (0, 0.5). We use the same data as Example 3, i.e., \(\psi _{2}(x) = 0.1\), \(y_{d}(x) =\sin (2\pi (x_{1} + 0.5)(x_{2} + 0.5))\), \(\beta = 1{0}^{-3}\) and γ = 1. The mesh parameter for the jth level uniform triangulation \(\mathcal{T}_{j}\) is \(h_{j} = {2}^{-(j+1)}\). The errors for the approximate state \(\bar{y}_{j}\) and approximate control \(\bar{u}_{j}\) are presented in Tables 8 and 9. Since the index of elliptic regularity α for the pentagonal domain can be taken to be any number less than 1∕3 (cf. Remark 2.1), the results in Tables 8 and 9 agree with Theorems 5.1 and 5.3. However, for this example the magnitude of the \(\ell_{\infty }\) error of the state seems to be \(O({h}^{2\alpha })\) and the magnitude of the \({H}^{1}(\Omega )\) error of the state seems to be \(O(h)\).

We also plot the discrete state \(\bar{y}_{8}\) and control \(\bar{u}_{8}\) in Fig. 5. The singular nature of \(\bar{y}\) near the corners at (0.5, 0) and (0, 0.5) can be observed in the plot of \(\bar{u}_{8}\).

Table 8 Energy and state errors for Example 5
Table 9 \(H^{1}\) state errors and \(L_{2}\) control errors for Example 5
Fig. 5 Discrete state \(\bar{y}_{8}\) (left) and control \(\bar{u}_{8}\) (right) for Example 5

Example 6.

In this example we solve the same problem as in Example 5 on graded meshes obtained from a uniform triangulation \(\mathcal{T}_{0}\) of the pentagonal domain by the refinement process in [10] (cf. Fig. 6), and we take the penalty parameter σ to be 20.

The errors of the approximate state \(\bar{y}_{j}\) and approximate control \(\bar{u}_{j}\) are reported in Tables 10 and 11. It is observed that the order of convergence for the state in the energy norm and for the control in the \(L_{2}(\Omega )\) norm is about 1, which agrees with Theorems 5.1 and 5.3. On the other hand, the order of convergence for the state in the \(\ell_{\infty }\) norm and the \({H}^{1}(\Omega )\) seminorm is about 1.5, which is better than the order of convergence predicted by Theorem 5.2 and Corollary 5.1.

Table 10 Energy and state errors for Example 6
Table 11 \(H^{1}\) state errors and \(L_{2}\) control errors for Example 6

The discrete state \(\bar{y}_{7}\) and control \(\bar{u}_{3}\) are depicted in Fig. 7. By comparing Figs. 5 and 7 we see that the graphs of the optimal states computed on a uniform mesh and on a graded mesh are very similar. The graph of the optimal control computed on graded meshes, however, exhibits a more pronounced singular behavior near the corners (0, 0.5) and (0.5, 0), since the triangles at these corners are much smaller than the corresponding ones in a uniform mesh.

Fig. 6 Triangulation \(\mathcal{T}_{0}\) (left) and \(\mathcal{T}_{1}\) (right) for the pentagonal domain

Fig. 7 Discrete state \(\bar{y}_{7}\) (left) and control \(\bar{u}_{3}\) (right) for Example 6

7 Concluding Remarks

In this paper we have only considered the optimal control problem (1.1)–(1.3) on convex polygonal domains. It is possible to treat this problem on general polygonal domains, in which case the space \({H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )\) will be replaced by the space \(\{v \in H_{0}^{1}(\Omega ):\,\varDelta v \in L_{2}(\Omega )\}\) that has been thoroughly analyzed in [45, 46] and the discretization will involve singular functions.

The three-dimensional version of (1.1)–(1.3) can also be solved as fourth order variational inequalities by finite element methods. For smooth domains, a straightforward extension of the approach in [15, 23–25] and this paper will lead to \(O({h}^{\frac{1} {2} })\) errors for the state in the energy norm and the control in the \(L_{2}(\Omega )\) norm, similar to the error estimates in [37, 56]. Again we expect the convergence of the state in the \({H}^{1}(\Omega )\) norm and the \(L_{\infty }(\Omega )\) norm to be of higher order.

These and other topics, such as the solution of optimal control problems with both state and control constraints as fourth order variational inequalities, are subjects of ongoing investigations.

APPENDIX

A. Elliptic Regularity for Simply Supported Plates

In this appendix we summarize elliptic regularity results for the biharmonic equation on convex polygonal domains with the boundary conditions of simply supported plates and also discuss related results for the solution \(\bar{y}\) of the obstacle problem (1.7). We will focus on the \(H^{3}\) regularity (or lack thereof) for the solution since \(\bar{y} \in H_{\mathrm{loc}}^{3}(\Omega )\).

Let \(\Omega\) be a convex polygonal domain with corners \(p_{1},\ldots,p_{L}\) and let \(\omega _{\ell}\) be the interior angle of \(\Omega\) at \(p_{\ell}\). Let \(g \in L_{2}(\Omega )\) and \(z \in {H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )\) satisfy

$$\displaystyle{ \int _{\Omega }{D}^{2}z: {D}^{2}v\,dx =\int _{\Omega }gv\,dx\qquad \forall \,v \in {H}^{2}(\Omega ) \cap H_{ 0}^{1}(\Omega ). }$$
(A.1)

It follows from (A.1) that \(w = -\varDelta z \in L_{2}(\Omega )\) has the following properties: (i) w is an \(H^{2}\) function away from the corners of \(\Omega\), (ii) w vanishes on \(\partial \Omega \setminus \{p_{1},\ldots,p_{L}\}\). These two conditions then imply that

$$\displaystyle{ \mbox{ $w = -\varDelta z$ belongs to ${H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )$} }$$
(A.2)

and that z also satisfies

$$\displaystyle{ \int _{\Omega }\nabla z \cdot \nabla v\,dx =\int _{\Omega }wv\,dx\qquad \forall \,v \in H_{0}^{1}(\Omega ). }$$

Thus we can deduce the elliptic regularity of z from the elliptic regularity theory for the Laplace operator [36, 45, 58].

First of all,

$$\displaystyle{ \mbox{ $z$ is an ${H}^{4}$ function away from the corners of $\Omega $}, }$$
(A.3)

which also follows directly from (A.1). Secondly we have

$$\displaystyle{ z \in {H}^{3}(\mathcal{N}_{\ell})\qquad \text{if}\ \omega _{\ell} \leq \pi /2, }$$
(A.4)

where \(\mathcal{N}_{\ell}\subset \Omega\) is a neighborhood of \(p_{\ell}\). Finally, at a corner \(p_{\ell}\) where \(\omega _{\ell} >\pi /2\), we have

$$\displaystyle{ z -\kappa _{\ell}\varphi _{\ell}\in {H}^{3}(\mathcal{N}_{\ell}), }$$
(A.5)

where \(\mathcal{N}_{\ell}\subset \Omega\) is a neighborhood of \(p_{\ell}\), \(\kappa _{\ell}\) is a constant (generalized stress intensity factor), and the singular function \(\varphi _{\ell}\) is defined by

$$\displaystyle{ \varphi _{\ell} = r_{\ell}^{\pi /\omega _{\ell}}\sin \big((\pi /\omega _{\ell})\theta _{\ell}\big). }$$
(A.6)

Here \((r_{\ell},\theta _{\ell})\) are the polar coordinates at \(p_{\ell}\) such that the two edges of \(\Omega\) emanating from \(p_{\ell}\) are given by \(\theta _{\ell} = 0\) and \(\theta _{\ell} =\omega _{\ell}\). Note that \(\varphi _{\ell}\) is a harmonic function and \(\varphi _{\ell} \in {H}^{1+(\pi /\omega _{\ell})-\epsilon }(\mathcal{N}_{\ell})\) for any ε > 0.
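The two basic properties of the singular function in (A.6), namely that it is harmonic and that it vanishes on the two edges meeting at the corner, can be checked numerically. The following is a minimal sketch, assuming the corner is placed at the origin with one edge along the positive \(x_{1}\)-axis; the five-point stencil and the names `phi` and `discrete_laplacian` are illustrative choices, not part of the analysis.

```python
import math

def phi(x, y, omega):
    """r**(pi/omega) * sin((pi/omega)*theta), corner at the origin, edge theta = 0 on the x-axis."""
    lam = math.pi / omega
    r = math.hypot(x, y)
    theta = math.atan2(y, x)  # lies in [0, omega] inside a convex corner (omega < pi)
    return r ** lam * math.sin(lam * theta)

def discrete_laplacian(f, x, y, h):
    """Standard five-point finite difference approximation of the Laplacian."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4.0 * f(x, y)) / h ** 2

omega = 3 * math.pi / 4   # interior angle at the cut corners of the pentagon
f = lambda x, y: phi(x, y, omega)

lap = discrete_laplacian(f, 0.3, 0.2, 1e-3)                       # interior point of the sector
on_edge_0 = f(0.25, 0.0)                                          # edge theta = 0
on_edge_w = f(0.25 * math.cos(omega), 0.25 * math.sin(omega))     # edge theta = omega
print(lap, on_edge_0, on_edge_w)
```

The discrete Laplacian is zero only up to the truncation error of the stencil, which grows as the evaluation point approaches the corner, in line with the limited regularity of \(\varphi _{\ell}\) there.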

Now we turn to the solution \(\bar{y}\) of (1.7)/(1.9). Since the constraints in (1.3) are not active near \(\partial \Omega\) because of (1.4b), we have

$$\displaystyle\begin{array}{rcl} \int _{\Omega }\left [\beta \big({D}^{2}(\rho _{ 1}\bar{y}): {D}^{2}w\big)\right ]\,dx& =& \int _{\Omega }\beta \left [{D}^{2}(\rho _{ 1}\bar{y}): {D}^{2}\big((1 -\rho _{ 2})w\big)\right ]\,dx {}\\ & \quad +& \int _{\Omega }\rho _{2}(f -\gamma \bar{ y})w\,dx {}\\ \end{array}$$

for all \(w \in {H}^{2}(\Omega ) \cap H_{0}^{1}(\Omega )\), where \(\rho _{1} =\rho _{2} = 1\) near \(\partial \Omega\), \(\rho _{1} = 1\) on the support of \(\rho _{2}\), and the support of \(\rho _{1}\) is disjoint from the active set where \(\bar{y}(x) =\psi _{1}(x)\) or \(\bar{y}(x) =\psi _{2}(x)\). Note that standard interior elliptic regularity [62, Sect. 20] implies

$$\displaystyle{ \int _{\Omega }\big[{D}^{2}(\rho _{1}\bar{y}): {D}^{2}\big((1 -\rho _{2})w\big)\big]\,dx =\int _{\Omega }(1 -\rho _{2})\big[{\varDelta }^{2}(\rho _{1}\bar{y})\big]w\,dx, }$$

where \((1 -\rho _{2}){\varDelta }^{2}(\rho _{1}\bar{y}) \in L_{2}(\Omega )\).

Therefore \(z =\rho _{1}\bar{y}\) satisfies (A.1) with \(g =\rho _{2}(f -\gamma \bar{y})/\beta + (1 -\rho _{2}){\varDelta }^{2}(\rho _{1}\bar{y}) \in L_{2}(\Omega )\). Combining (A.2)–(A.6) and the fact that \(\bar{y} \in H_{\mathrm{loc}}^{3}(\Omega )\), we can draw the following conclusions about \(\bar{y}\).

  • The function \(\varDelta \bar{y}\) belongs to \(H_{0}^{1}(\Omega )\). Therefore \(\bar{u} = -\varDelta \bar{y}\) belongs to \(H_{0}^{1}(\Omega )\) for the optimal control problem (1.1)–(1.3).

  • Let \(\alpha _{\ell}\) be chosen according to (2.4). Then \(\bar{y} \in {H}^{2+\alpha _{\ell}}(\mathcal{N}_{\ell})\), where \(\mathcal{N}_{\ell}\,(\subset \Omega )\) is a neighborhood of \(p_{\ell}\). Globally we have \(\bar{y} \in {H}^{2+\alpha }(\Omega )\), where \(\alpha =\min _{1\leq \ell\leq L}\alpha _{\ell}\).

  • We can write \(\bar{y} =\bar{ y}_{S} +\bar{ y}_{R}\), where \(\bar{y}_{R} \in {H}^{3}(\Omega ) \cap H_{0}^{1}(\Omega )\), \(\varDelta \bar{y}_{R} \in H_{0}^{1}(\Omega )\) and \(\bar{y}_{S}\) has the following properties.

    • \(\bar{y}_{S}\) is an \(H^{3}\) function away from the corners of \(\Omega\) where the angles are > π∕2.

    • \(\bar{y}_{S}\) is a multiple of \(\varphi _{\ell}\) in a neighborhood \(\mathcal{N}_{\ell}\) of a corner \(p_{\ell}\) where \(\omega _{\ell} >\pi /2\).

    • \(\varDelta \bar{y}_{S}\) belongs to \(H_{0}^{1}(\Omega )\).

  • Since \(r_{\ell}^{1-\alpha _{\ell}}({\partial }^{\mu }\varphi _{\ell}) \in L_{2}(\mathcal{N}_{\ell})\) for \(\vert \mu \vert = 3\), we have \(\varPhi ({\partial }^{\mu }\bar{y}_{S}) \in L_{2}(\Omega )\) for \(\vert \mu \vert = 3\) and hence

$$\displaystyle{ \varPhi ({\partial }^{\mu }\bar{y}) \in L_{2}(\Omega )\quad \text{for}\ \vert \mu \vert = 3, }$$
(A.7)
    where the function Φ is defined by (2.3).

  • Since \(r_{\ell}^{-\alpha _{\ell}}({\partial }^{\mu }\varphi _{\ell}) \in L_{2}(\mathcal{N}_{\ell})\) for \(\vert \mu \vert = 2\), we have \(\varPsi ({\partial }^{\mu }\bar{y}_{S}) \in L_{2}(\Omega )\) for \(\vert \mu \vert = 2\) and hence

$$\displaystyle{ \varPsi ({\partial }^{\mu }\bar{y}) \in L_{2}(\Omega )\quad \text{for}\ \vert \mu \vert = 2, }$$
(A.8)
    where the function Ψ is defined by

$$\displaystyle{ \varPsi (x) =\prod _{ \ell=1}^{L}\vert p_{\ell} - x{\vert }^{-\alpha _{\ell}}. }$$
(A.9)
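To make the weight in (A.9) concrete, it can be evaluated directly from the corner positions. The sketch below uses the pentagonal domain of Example 5 with the illustrative choice \(\alpha _{\ell} = 1\) at the right-angle corners and \(\alpha _{\ell} = 0.3\) at the two obtuse corners; these values and the name `weight_psi` are assumptions made for the example only.

```python
import math

def weight_psi(x, corners, alphas):
    """Evaluate Psi(x) = prod_l |p_l - x|**(-alpha_l) as in (A.9)."""
    value = 1.0
    for (px, py), a in zip(corners, alphas):
        value *= math.hypot(px - x[0], py - x[1]) ** (-a)
    return value

corners = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.0), (0.0, 0.5), (-0.5, 0.5)]
alphas = [1.0, 1.0, 0.3, 0.3, 1.0]   # hypothetical indices, 0.3 < 1/3 at the obtuse corners

psi_center = weight_psi((0.0, 0.0), corners, alphas)
psi_near_corner = weight_psi((0.4999, 0.0), corners, alphas)
print(psi_center, psi_near_corner)
```

The weight is unbounded as x approaches any corner, so (A.8) is a statement that the second derivatives of \(\bar{y}\) are small near the corners in a weighted \(L_{2}\) sense.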

Finally we note that (cf. [36, Theorem AA.3 and Theorem AA.7])

$$\displaystyle{ \vert \bar{y}\vert _{{H}^{2+\alpha }(\Omega )} \leq C_{\Omega }\sum _{\vert \mu \vert =3}\|\varPhi ({\partial }^{\mu }\bar{y})\|_{L_{2}(\Omega )}. }$$
(A.10)

B. An Enriching Operator

In this appendix we construct the enriching operator introduced in Sect. 3.2. Such operators have played an important role in the design and analysis of fast solvers for nonconforming finite element methods [11, 12, 22, 26].

Let \(\tilde{V }_{h} \subset {H}^{1}(\Omega )\) be the \(\mathbb{P}_{2}\) Lagrange finite element space associated with \(\mathcal{T}_{h}\) and \(\tilde{W}_{h} \subset {H}^{2}(\Omega )\) be the \(\mathbb{P}_{6}\) Argyris finite element space [2] associated with \(\mathcal{T}_{h}\). The degrees of freedom of \(w \in \tilde{ W}_{h}\) (cf. Fig. 8) consist of the values of the derivatives of w up to second order at the vertices of \(\mathcal{T}_{h}\), the values of w at the midpoints of the edges of \(\mathcal{T}_{h}\) and at the centers of the triangles of \(\mathcal{T}_{h}\), and the values of the normal derivative of w at two nodes on each edge in \(\mathcal{E}_{h}\).

Fig. 8 Degrees of freedom for the \(\mathbb{P}_{6}\) Argyris finite element

The enriching operator \(E_{h}:\tilde{ V }_{h}\longrightarrow \tilde{W}_{h}\) is defined by averaging as follows (cf. Sect. 2.1 for the notation).

  (i) Let N be a degree of freedom associated with an interior node p. We define

    $$\displaystyle{ N(E_{h}v) = \frac{1} {\vert \mathcal{T}_{p}\vert }\sum _{T\in \mathcal{T}_{p}}N(v_{T}). }$$
  (ii) Let N be a degree of freedom involving the normal derivative associated with a boundary node interior to an edge \(e \in \mathcal{E}_{h}^{b}\). We define

    $$\displaystyle{ N(E_{h}v) = N(v_{T_{e}}). }$$
  (iii) Let p be a boundary node which is not a corner of \(\Omega\) such that p is the common endpoint of two edges \(e_{1},e_{2} \in \mathcal{E}_{h}^{b}\). For any degree of freedom N associated with p, we define

    $$\displaystyle{ N(E_{h}v) = \frac{1} {2}\big[N(v_{T_{e_{ 1}}}) + N(v_{T_{e_{ 2}}})\big]. }$$
  (iv) Let p be a corner of \(\Omega\). Then p is the common endpoint of \(e_{1},e_{2} \in \mathcal{E}_{h}^{b}\). Let \(t_{j}\) (resp. \(n_{j}\)) be a unit tangent (resp. normal) of \(e_{j}\). We define

    $$\displaystyle{\begin{array}{llll} (E_{h}v)(p) & = v(p), \\ (\partial (E_{h}v)/\partial t_{j})(p) & = (\partial v_{T_{e_{ j}}}/\partial t_{j})(p) &\qquad &\mbox{ for $j = 1,2$}, \\ ({\partial }^{2}(E_{h}v)/\partial t_{j}^{2})(p) & = ({\partial }^{2}v_{T_{e_{ j}}}/\partial t_{j}^{2})(p) &\qquad &\mbox{ for $j = 1,2$}, \\ ({\partial }^{2}(E_{h}v)/\partial t_{1}\partial n_{1})(p)& = ({\partial }^{2}v_{T_{e_{ 1}}}/\partial t_{1}\partial n_{1})(p).\end{array} }$$
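The averaging rules (i) and (iii) can be sketched in a few lines. This is only an illustration of the averaging step, assuming that the values \(N(v_{T})\) of a degree of freedom on the triangles sharing a node have already been computed; the function names and the sample values are hypothetical, and no finite element data structures are modeled.

```python
def average_dof(values_by_triangle):
    """Rule (i): mean of N(v_T) over the triangles T that share the interior node p."""
    values = list(values_by_triangle.values())
    return sum(values) / len(values)

def boundary_node_dof(value_on_T_e1, value_on_T_e2):
    """Rule (iii): mean of N(v_T) over the two triangles with boundary edges e1, e2 at p."""
    return 0.5 * (value_on_T_e1 + value_on_T_e2)

# Hypothetical one-sided values of a derivative of v at an interior vertex p;
# they differ from triangle to triangle because v is only continuous.
derivative_values = {"T1": 1.0, "T2": 3.0, "T3": 2.0, "T4": 2.0}
averaged_derivative = average_dof(derivative_values)

# The function value itself is single-valued at p, so averaging reproduces it.
function_values = {"T1": 0.5, "T2": 0.5, "T3": 0.5}
averaged_value = average_dof(function_values)

print(averaged_derivative, averaged_value, boundary_node_dof(1.0, 2.0))
```

The second call shows why averaging leaves the nodal values of v unchanged: a degree of freedom that takes the same value on every triangle sharing p is reproduced exactly by the mean.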

Remark B.1.

We can also replace the last equation in (iv) by

$$\displaystyle{({\partial }^{2}(E_{ h}v)/\partial t_{2}\partial n_{2})(p) = ({\partial }^{2}v_{ T_{e_{2}}}/\partial t_{2}\partial n_{2})(p).}$$

Since v is continuous at the vertices, the relation (3.6) follows immediately from (i), (iii), and (iv). It is also easy to check that

$$\displaystyle{ E_{h}v \in W_{h} =\tilde{ W}_{h} \cap H_{0}^{1}(\Omega ) \subset {H}^{2}(\Omega ) \cap H_{ 0}^{1}(\Omega )\quad \text{ if }\quad v \in V _{ h} =\tilde{ V }_{h} \cap H_{0}^{1}(\Omega ). }$$

We now turn to the derivations of (3.8) and (3.9). Let \(T \in \mathcal{T}_{h}\) be arbitrary. Since \(v = E_{h}v\) at the vertices and the center of T, we have, by scaling,

$$\displaystyle\begin{array}{rcl} & & \|v - E_{h}v\|_{L_{2}(T)}^{2} \lesssim h_{ T}^{4}\left (\sum _{ p\in \mathcal{V}_{T}}\vert \nabla (v - E_{h}v)(p){\vert }^{2}\right. \\ & & \quad \left.+\sum _{p\in \mathcal{N}_{T}}{\left \vert \frac{\partial (v - E_{h}v)} {\partial n} (p)\right \vert }^{2} +\sum _{ p\in \mathcal{V}_{T}}h_{T}^{2}\vert {D}^{2}(v - E_{ h}v)(p){\vert }^{2}\right ){}\end{array}$$
(B.1)

for all \(v \in \tilde{ V }_{h}\), where \(\mathcal{N}_{T}\) is the set of the six nodes on \(\partial T\) associated with the degrees of freedom of the \(\mathbb{P}_{6}\) Argyris finite element that involve the normal derivative (cf. Fig. 8).

Let \(p \in \mathcal{V}_{T}\) be interior to \(\Omega\). Since the tangential derivative of \(v - E_{h}v\) is continuous across element boundaries, we have, by the definition of \(E_{h}\) and a standard inverse estimate,

$$\displaystyle\begin{array}{rcl} \vert \nabla (v - E_{h}v)(p){\vert }^{2}& =& \Bigg\vert \frac{1} {\vert \mathcal{T}_{p}\vert }\sum _{T^\prime \in \mathcal{T}_{p}}(\nabla v_{T}(p) -\nabla v_{T^\prime }(p))\Bigg{\vert }^{2} \\ & \lesssim & \sum _{e\in \mathcal{E}_{p}^{i}}\vert e{\vert }^{-1}\|\,\left [\left [\partial v/\partial n\right ]\right ]\|_{ L_{2}(e)}^{2} {}\end{array}$$
(B.2)

where \(\mathcal{E}_{p}^{i}\) is the set of the edges in \(\mathcal{E}_{h}^{i}\) sharing p as a common endpoint. Similarly, we have

$$\displaystyle\begin{array}{rcl} \vert {D}^{2}(v - E_{ h}v)(p){\vert }^{2}& =& \Bigg\vert \frac{1} {\vert \mathcal{T}_{p}\vert }\sum _{T^\prime \in \mathcal{T}_{p}}{D}^{2}(v_{ T} - v_{T^\prime })(p)\Bigg{\vert }^{2} \\ & \lesssim & \sum _{T^\prime \in \mathcal{T}_{p}}h_{T^\prime }^{-2}\vert v\vert _{{ H}^{2}(T^\prime )}^{2}. {}\end{array}$$
(B.3)

The estimates (B.2) and (B.3) are also valid for \(p \in \partial \Omega\) by similar arguments.

Now we consider \(p \in \mathcal{N}_{T}\). If p is a boundary node, then \(\big\vert (\partial (v - E_{h}v)/\partial n)(p)\big\vert = 0\) by the definition of \(E_{h}\). Otherwise we have, by a standard inverse estimate,

$$\displaystyle{ \vert \partial (v - E_{h}v)/\partial n(p){\vert }^{2} \lesssim \vert e{\vert }^{-1}\|\,\left [\left [\partial v/\partial n\right ]\right ]\|_{ L_{2}(e)}^{2} }$$
(B.4)

for some \(e \in \mathcal{E}_{h}^{i}\).

Combining (B.1)–(B.4), we obtain the estimate (3.8) for m = 0, which then implies the estimates for m = 1 and 2 through standard inverse estimates.

For the operator \(E_{h}\varPi _{h}\), first we observe that it is a bounded linear operator from \({H}^{2+s}(\mathcal{S}_{T})\) into \(H^{2}(T)\) because of (2.21) and (3.8). Furthermore, by construction, \(E_{h}\varPi _{h}\zeta =\zeta\) on T if \(\zeta \in \mathbb{P}_{2}(\mathcal{S}_{T})\). Hence the estimate (3.9) follows from the Bramble–Hilbert lemma (cf. [9, 38]).