1 Introduction

PDE-constrained optimal control problems with pointwise state constraints are known to cause certain theoretical and numerical difficulties. Some progress has recently been made regarding the numerical analysis of such problems. A priori discretization error estimates and convergence results are available for different classes of problems, including linear-quadratic distributed control problems [9, 12, 17, 19, 26, 27], problems with Neumann boundary control [21], problems with finitely many state constraints [10, 22, 25], and problems with finitely many control parameters [24, 25]. In [24] the control parameters may influence a linear combination of nonhomogeneous Dirichlet boundary data with high regularity. In this work, we are concerned with a Dirichlet boundary control problem, which for \(L^2\)-control functions admits less regular solutions than, for instance, Neumann boundary control problems. We will focus on presenting a priori error estimates for the finite element discretization of such linear-quadratic problems with pointwise state constraints in the interior of the domain.

We prove an error rate of \({\mathcal {O}}(h^{1-1/p})\) in the \(L^2(\varGamma )\)-norm of the control for problems without control constraints (cf. Theorem 3), which seems to be in accordance with the existing results in the literature. The error rate is limited to \(h^{1-1/p}\) by the effects of the boundary term (cf. [13, 23]) and by the fact that the Lagrange multiplier associated with the state constraints is only a measure. If we also include control constraints in our analysis, we obtain an order of convergence of \({\mathcal {O}}(h^{3/4-1/(2p)})\) (cf. Theorem 4). To the authors’ knowledge, results on discretization error estimates for state-constrained problems in the literature deal only with distributed or Neumann boundary control problems. The order of almost \({\mathcal {O}}(h)\) obtained by Deckelnick and Hinze [17] or by Meyer [26] for distributed controls is for domains with smooth boundary. In [26] a comment about convex polygons is made, and an order of \({\mathcal {O}}(h^{1/2})\) is obtained. The estimate of order \({\mathcal {O}}(h|\log h|)\) obtained by Casas, Mateos and Vexler in [12] is based on the fact that, for the problem treated in that work, enhanced regularity of the Lagrange multiplier can be proven under mild assumptions. The same order is obtained in [19, Corollary 3.3] under the assumption of uniform boundedness of the distributed controls in the \(L^\infty (\varOmega )\)-norm.

In [19, Remark 3.4] it is noted that for both distributed and Neumann boundary state-constrained control problems (and using a variational discretization) an easy argument shows that, in the presence of control constraints, the proof for the pure state-constrained case also applies to the control-and-state-constrained case. This is possible due to the high regularity of both the control and the state. Unfortunately, such an argument cannot be transferred to our problem due to the low regularity of the involved functions. Therefore, we must use two completely different methods of proof for the two cases.

Let us present an outline of the paper. In the next section we introduce the problem and the notation that will be used throughout the work. In Sect. 3 we collect and prove the regularity results we are going to need. Section 4 is devoted to the derivation of optimality conditions, as well as the regularity properties of the optimal solution that can be derived from them. Next, in Sect. 5, we discretize the problem using finite elements. Our main results are presented and proven in Sect. 6. After introducing a technical assumption on the mesh and proving an approximation result for the normal derivative of the adjoint state, we split the presentation and use different techniques of proof for the no-control-constrained and the control-constrained cases. The proofs are presented in Sects. 6.1 and 6.2, respectively, and the aforementioned orders of convergence of \({\mathcal {O}}(h^{1-1/p})\) and \({\mathcal {O}}(h^{3/4-1/(2p)})\) are obtained in each setting. We also remark in Sect. 6.3 that if we apply the technique of Sect. 6.1 to the control-constrained case, or that of Sect. 6.2 to the no-control-constrained case, we get worse orders of convergence in both cases.

Finally, we provide a numerical example and we discuss the sharpness of the obtained estimates as well as of existing error estimates for Dirichlet control problems.

2 The control problem

Throughout the article, we are dealing with the following linear-quadratic optimal control problem:

Here, \(\varOmega \subset {\mathbb {R}}^2\) is a bounded convex domain with polygonal boundary \(\varGamma \) and \(\varOmega _1\subset \subset \varOmega \) is an open set. With this notation, we mean that the closure of \(\varOmega _1\) is included in \(\varOmega \): \(\bar{\varOmega }_1\subset \varOmega \).

We denote by \(\pi /3\le \omega <\pi \) the largest interior angle of \(\varGamma \), and by

$$\begin{aligned} p_\varOmega = 2/(2-\pi /\max \{\omega ,\pi /2\})>2\quad \text {and}\quad s_\varOmega =1+\pi /\omega \in (2,4] \end{aligned}$$

the exponents giving the maximal elliptic regularity in \(W^{2,p}(\varOmega )\) for \(p<p_\varOmega \) (cf. [18, Theorem 4.4.3.7]) and \(H^s(\varOmega )\) for \(s<s_\varOmega \) (cf. [18, Theorem 5.1.1.4]). We consider a target state \(y_d\) regular enough, i.e., we will assume \(y_d\in L^p(\varOmega )\cap H^{s-2}(\varOmega )\) for all \(p<p_\varOmega \) and all \(s<s_\varOmega \).
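For instance, for a largest interior angle \(\omega = 2\pi /3\) these formulas give

$$\begin{aligned} p_\varOmega = \frac{2}{2-\pi /(2\pi /3)} = \frac{2}{2-3/2} = 4 \quad \text {and}\quad s_\varOmega = 1+\frac{\pi }{2\pi /3} = \frac{5}{2}, \end{aligned}$$

so in this case elliptic regularity holds in \(W^{2,p}(\varOmega )\) for all \(p<4\) and in \(H^s(\varOmega )\) for all \(s<5/2\); as \(\omega \rightarrow \pi \) both exponents degenerate towards their limit values \(p_\varOmega \rightarrow 2\) and \(s_\varOmega \rightarrow 2\).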

Moreover, for the state constraints, we consider two functions \(a,b\in C(\bar{\varOmega }_1)\) such that \(a(x)<b(x)\) on \(\bar{\varOmega }_1\) and, for the control constraints, two functions \(\alpha ,\beta \in W^{1-1/p_\varOmega ,p_\varOmega }(\varGamma )\), such that \(\alpha (x)<\beta (x)\) on \(\varGamma \). With an abuse of notation, we will include in our formulation the absence of one or several constraints allowing the cases \(a\equiv -\infty \), \( b \equiv \infty \), \(\alpha \equiv -\infty \) or \(\beta \equiv \infty \). Further assumptions on the regularity of the state constraints will be made in order to obtain error estimates. Finally, let \(\nu >0\) be a regularization parameter.

To end this section, let us introduce some short notation. As usual, \((\cdot ,\cdot )\) is the inner product in \(L^2(\varOmega )\), \((\cdot ,\cdot )_\varGamma \) is the inner product in \(L^2(\varGamma )\), and \(\langle \cdot ,\cdot \rangle \) is the duality product between \(C(\bar{\varOmega }_1)\) and its dual \({\mathcal {M}}(\bar{\varOmega }_1)\), the space of regular Borel measures on \(\bar{\varOmega }_1\). To handle the constraints, we will use the sets

$$\begin{aligned} K= & {} \{y\in C(\bar{\varOmega }_1):a(x)\le y(x)\le b(x)\ \forall x\in \bar{\varOmega }_1\},\\ U_{\alpha ,\beta }= & {} \{u\in L^2(\varGamma ):\alpha (x)\le u(x)\le \beta (x) \text{ for } \text{ a.e. } x\in \varGamma \}, \end{aligned}$$

and

$$\begin{aligned} U_{ad} = \{u\in U_{\alpha ,\beta }:y_u\in K\}. \end{aligned}$$

For real numbers \(a\le b\) and \(c\), we will denote by \({{\mathrm{Proj}}}_{[a,b]}(c)=\min \{b,\max \{a,c\}\}\) the projection of \(c\) onto the interval \([a,b]\). Finally, we will denote by \(\{\chi _j\}_{j=1}^m\) the vertices of \(\varGamma \) counted counterclockwise, with \(\chi _{m+1}=\chi _1\), and by \(\varGamma _j\) the part of \(\varGamma \) joining the vertices \(\chi _j\) and \(\chi _{j+1}\).
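The projection \({{\mathrm{Proj}}}_{[a,b]}\) just introduced is the usual clamping of a number to an interval; a minimal sketch in Python (the function name proj is ours) illustrates it, including the convention \(a\equiv -\infty \), \(b\equiv \infty \) for absent constraints:

```python
def proj(a, b, c):
    """Projection of c onto [a, b], i.e. min{b, max{a, c}}."""
    return min(b, max(a, c))

# Clamping behaviour on the interval [0, 1]:
values = [proj(0.0, 1.0, c) for c in (-0.5, 0.3, 2.0)]  # -> [0.0, 0.3, 1.0]

# With a = -inf and b = +inf the projection is the identity, which matches
# the convention used above for the absence of control constraints.
identity = proj(float("-inf"), float("inf"), 42.0)  # -> 42.0
```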

3 Some regularity results for the related PDEs

It is well known that in a polygonal domain, for any \(u\in L^2(\varGamma )\) there exists a unique \(y_u\in H^{1/2}(\varOmega )\) solving the state equation in the transposition sense:

$$\begin{aligned} \int _\varOmega y_u \varDelta z dx = \int _\varGamma u\partial _n z ds\ \forall z\in H^2(\varOmega )\cap H^1_0(\varOmega ). \end{aligned}$$

Moreover, the estimate

$$\begin{aligned} \Vert y_u\Vert _{H^{1/2}(\varOmega )}\le C \Vert u\Vert _{L^2(\varGamma )} \end{aligned}$$
(1)

holds; see [1, Theorem 2.4] for a proof even in non-convex polygonal domains. This defines a linear and continuous control-to-state operator

$$\begin{aligned} S:L^2(\varGamma )\rightarrow H^{1/2}(\varOmega ). \end{aligned}$$

In this section we will collect some regularity results for the state and the adjoint state equation that will be needed in the rest of the work.

Lemma 1

The control-to-state mapping is continuous from \(H^{s-3/2}(\varGamma )\) to \(H^{s-1}(\varOmega )\) for all \({3/2}\le s<s_\varOmega \).

Proof

If \(s=3/2\), then the continuity follows directly from the transposition method. Now let \(3/2<s<s_\varOmega \) and consider \(u\in H^{s-3/2}(\varGamma )\): from the trace theorem [18, Theorem 1.5.2.8] we know that there exists \(U\in H^{s-1}(\varOmega )\) such that \({{\mathrm{trace}}}(U)=u\). Consider \(z=U-y_u\). This function satisfies

$$\begin{aligned} -\varDelta z = -\varDelta U \text{ in } \varOmega ,\ z=0 \text{ on } \varGamma . \end{aligned}$$

Since \(-\varDelta U\in H^{s-3}(\varOmega )\) and \(s-3<s_\varOmega -2\), then [18, Theorem 5.1.1.4] implies that \(z\in H^{s-1}(\varOmega )\) and consequently \(y_u\) belongs to \(H^{s-1}(\varOmega )\) as well. \(\square \)

Lemma 2

For any open set \(\varOmega '\subset \subset \varOmega \), the control-to-state mapping \(S:L^2(\varGamma ) \rightarrow H^{1/2}(\varOmega )\), \(Su = y_u\), is continuous

  1. from \(L^2(\varGamma )\) to \(C(\bar{\varOmega }')\);

  2. from \(W^{1-1/p,p}(\varGamma )\) to \(W^{2,p}(\varOmega ')\) for all \(p<p_\varOmega \);

  3. and from \(H^{s-3/2}(\varGamma )\) to \(H^s(\varOmega ')\) for all \(s<s_\varOmega \).

Proof

The proof follows the usual techniques for interior regularity results. We will prove in detail the first statement.

  1. Since \(y_u\) is a harmonic function, and hence continuous in \(\varOmega \), we have that \(y_u\in C(\bar{\varOmega }')\) and S is well defined from \(L^2(\varGamma )\) to \(C(\bar{\varOmega }')\). In \({\mathbb {R}}^2\), we have that \(H^{1/2}(\varOmega )\hookrightarrow L^4(\varOmega )\), and using (1) we can write

    $$\begin{aligned} \Vert y_u\Vert _{L^{4}(\varOmega )}+\Vert \nabla y_u\Vert _{W^{-1,4}(\varOmega )}\le C \Vert u\Vert _{L^2(\varGamma )}. \end{aligned}$$
    (2)

    Consider now a cut-off function \(\eta \in {\mathcal {D}}(\varOmega )\), \(0\le \eta \le 1\) and \(\eta \equiv 1\) on \(\bar{\varOmega }'\), as well as \(\eta \equiv 0\) on \(\varOmega {\setminus }\varOmega ''\), with some subdomain \(\varOmega ''\) satisfying \(\varOmega '\subset \subset \varOmega ''\subset \subset \varOmega \). Taking into account that \(\varDelta y_u=0\), we have that \(\eta y_u\) satisfies the equation

    $$\begin{aligned} -\varDelta (\eta y_u) = -y_u\varDelta \eta -2\nabla \eta \cdot \nabla y_u \text{ in } \varOmega ,\quad \eta y_u = 0 \text{ on } \varGamma . \end{aligned}$$
    (3)

    Since \(4>2\), using the classical estimate by Stampacchia [29], we obtain

    $$\begin{aligned} \Vert y_u\Vert _{L^\infty (\varOmega ')} \le \Vert \eta y_u\Vert _{L^\infty (\varOmega )} \le C\Vert y_u\varDelta \eta +2\nabla \eta \cdot \nabla y_u\Vert _{W^{-1,4}(\varOmega )} \end{aligned}$$

    and the first result follows from this inequality and (2).

  2. We make use of \(y_u\in W^{1,p}(\varOmega )\) (see [1, Lemma 2.3]), and hence \(\nabla y_u\in L^p(\varOmega )\). We repeat the process from Step 1, taking into account the \(W^{2,p}(\varOmega )\) regularity of \(\eta y_u\), which follows from [18, Theorem 4.4.3.7].

  3. From Lemma 1 we have \(y_u\in H^{s-1}(\varOmega )\), and therefore \(\nabla y_u\in H^{s-2}(\varOmega )\). Since \(s<s_\varOmega \), we can apply [18, Theorem 5.1.1.4] to (3) to obtain \(\eta y_u\in H^s(\varOmega )\), and hence \(y_u\in H^s(\varOmega ')\).

\(\square \)

To eventually obtain regularity results for the control via the optimality system, we now discuss the regularity of the adjoint equations. For \(u\in L^2(\varGamma )\) and \(\mu \in {\mathcal {M}}(\bar{\varOmega }_1)\) we define \(\varphi _{{\mathrm {r}}}(u)\in H^1_0(\varOmega )\) and \(\varphi _{{\mathrm {s}}}(\mu )\in W^{1,t}_0(\varOmega )\), \(t<2\), as the unique solutions of

$$\begin{aligned} -\varDelta \varphi _{\mathrm {r}}(u) = y_u-y_d \text{ in } \varOmega ,\ \varphi _{\mathrm {r}}(u) = 0 \text{ on } \varGamma ,\\ -\varDelta \varphi _{\mathrm {s}}(\mu ) = \mu \quad \text{ in } \varOmega ,\ \varphi _{\mathrm {s}}(\mu ) = 0 \text{ on } \varGamma , \end{aligned}$$

where the last equation must be understood in the transposition sense:

$$\begin{aligned} (\varphi _{\mathrm {s}}(\mu ),-\varDelta z) = \langle \mu ,z\rangle \ \forall z\in H^1_0(\varOmega ) \text{ s.t. } \varDelta z\in L^2(\varOmega ). \end{aligned}$$
(4)

Notice that if \(\varDelta z\in L^2(\varOmega )\), then \(z\in H^2_{loc}(\varOmega )\) and hence \(z\in C(\bar{\varOmega }_1)\), so the definition is meaningful.

Lemma 3

If \(u\in L^2(\varGamma )\), then

$$\begin{aligned} \varphi _{\mathrm {r}}(u)\in W^{2,q}(\varOmega ),\ \partial _n\varphi _{\mathrm {r}}(u)\in W^{1-1/q,q}(\varGamma )\ \forall q\le 4,\ q<p_\varOmega . \end{aligned}$$
(5)

If, further, \(u\in H^{1/2}(\varGamma )\), then we also have that

$$\begin{aligned}&\varphi _{\mathrm {r}}(u)\in W^{2,p}(\varOmega ),\ \partial _n\varphi _{\mathrm {r}}(u)\in W^{1-1/p,p}(\varGamma )\ \forall p<p_\varOmega , \end{aligned}$$
(6)
$$\begin{aligned}&\varphi _{{\mathrm {r}}}(u)\in H^{s}(\varOmega ),\ \partial _n\varphi _{\mathrm {r}}(u)\in \prod _{j=1}^m H^{s-3/2}(\varGamma _j)\ \forall s\le 3,\ s < s_\varOmega , \end{aligned}$$
(7)

and

$$\begin{aligned} \partial _n\varphi _{\mathrm {r}}(u)\in H^{s-3/2}(\varGamma ) \ \forall s<\min \{3,s_\varOmega \}. \end{aligned}$$
(8)

Proof

Suppose \(u\in L^2(\varGamma )\). Then \(y_u\in H^{1/2}(\varOmega )\hookrightarrow L^4(\varOmega )\), and the usual regularity results (cf. [18, Theorem 4.4.3.7]) give us that \(\varphi _{\mathrm {r}}(u)\in W^{2,q}(\varOmega )\) for \(q\le 4\), \(q<p_\varOmega \). The trace theorem (e.g. [18, Theorem 1.6.1.5]) then states that

$$\begin{aligned} \partial _n\varphi _{\mathrm {r}}(u)\in \prod _{j=1}^m W^{1-1/q,q}(\varGamma _j)\ \forall q\le 4, q<p_\varOmega . \end{aligned}$$

Since \(\varphi _{\mathrm {r}}(u)=0\) on \(\varGamma \), we have that \(\partial _n\varphi _{\mathrm {r}}(u)(\chi _j)=0\) (see [11, Lemma A.2] and [8, §4]) and \(\partial _n\varphi _{\mathrm {r}}(u)\in C(\varGamma )\). This compatibility condition is enough (cf. [18, Theorem 1.5.2.3(b)]) to obtain the global regularity in \(\varGamma \).

If \(u\in H^{1/2}(\varGamma )\), then \(y_u\in H^1(\varOmega )\subset L^p(\varOmega )\) for all \(p<p_\varOmega \). Relations in (6) follow now in the same way as we proved (5). The regularity result [18, Theorem 5.1.1.4] gives us \(\varphi _{{\mathrm {r}}}(u)\in H^{s}(\varOmega )\) for all \(s\le 3\), \(s<s_\varOmega \) and the trace theorem hence yields

$$\begin{aligned} \partial _n\varphi _{\mathrm {r}}(u)\in \prod _{j=1}^m H^{s-3/2}(\varGamma _j) \ \forall s\le 3, s < s_\varOmega . \end{aligned}$$

If \(s_\varOmega \le 5/2\) (i.e., for \(\omega \ge 2\pi /3\)), the already mentioned global continuity of \(\partial _n\varphi _{{\mathrm {r}}}\) is enough to obtain the desired global regularity on the boundary. If \(5/2<s_\varOmega <3\) (that is, for angles \(\pi /2<\omega <2\pi /3\)), this continuity condition also gives us that \(\partial _n\varphi _{\mathrm {r}}(u)\in H^1(\varGamma )\); on the other hand, the definition of the Sobolev space \(H^{s-3/2}(\varGamma _j)\) for \(s>5/2\) gives that

$$\begin{aligned} \partial _\tau \partial _n\varphi _{\mathrm {r}}(u)\in \prod _{j=1}^m H^{s-5/2}(\varGamma _j). \end{aligned}$$

Since \(s<3\), it is known (cf. [18, Theorem 1.5.2.3(a)]) that no compatibility condition is required at the corners to have

$$\begin{aligned} \prod _{j=1}^m H^{s-5/2}(\varGamma _j) = H^{s-5/2}(\varGamma ). \end{aligned}$$
(9)

All together, we obtain that \(\partial _n\varphi _{\mathrm {r}}(u)\in H^1(\varGamma )\) and its derivative satisfies \(\partial _\tau \partial _n\varphi _{\mathrm {r}}(u)\in H^{s-5/2}(\varGamma )\). These are precisely the conditions that define the space \(H^{s-3/2}(\varGamma )\) (for \(5/2<s<7/2\)), and therefore \(\partial _n\varphi _{\mathrm {r}}(u)\in H^{s-3/2}(\varGamma )\) by definition.

For \(s_\varOmega \ge 3\) and \(s=3\), (9) is no longer true in general. \(\square \)

Lemma 4

For every open set \(\varOmega _2\) with smooth boundary \(\varGamma _2\) such that \(\varOmega _1\subset \subset \varOmega _2\subset \subset \varOmega \) and every \(\mu \in {\mathcal {M}}(\bar{\varOmega }_1)\)

$$\begin{aligned}&\varphi _{{\mathrm {s}}}(\mu )\in W^{1,t}_0(\varOmega )\cap W^{2,p}(\varOmega {\setminus }\bar{\varOmega }_2)\cap H^{s}(\varOmega {\setminus }\bar{\varOmega }_2)\ \forall t<2,\ p<p_\varOmega ,\ s<s_\varOmega , \nonumber \\\end{aligned}$$
(10)
$$\begin{aligned}&\partial _n\varphi _{\mathrm {s}}(\mu )\in W^{1-1/p,p}(\varGamma )\cap \prod _{j=1}^m H^{s-3/2}(\varGamma _j)\ \forall p<p_\varOmega ,\ s<s_\varOmega , \end{aligned}$$
(11)

and

$$\begin{aligned} \partial _n\varphi _{\mathrm {s}}(\mu )\in H^{s-3/2}(\varGamma )\ \forall s<\min \{3,s_\varOmega \}. \end{aligned}$$
(12)

Proof

Since \(\varphi _{\mathrm {s}}(\mu )\) is harmonic in \(\varOmega {\setminus }\bar{\varOmega }_1\), we have that \(\varphi _{\mathrm {s}}(\mu )\in C^\infty _{loc}(\varOmega {\setminus }\bar{\varOmega }_1)\).

For any open set \(\varOmega _2\) with smooth boundary \(\varGamma _2\) such that \(\varOmega _1\subset \subset \varOmega _2\subset \subset \varOmega \), \(\varphi _{\mathrm {s}}(\mu )\) is the solution of the following boundary value problem:

$$\begin{aligned} -\varDelta \varphi _{\mathrm {s}}(\mu ) = 0 \text{ in } \varOmega {\setminus }\bar{\varOmega }_2,\ \varphi _{\mathrm {s}}(\mu )=0 \text{ on } \varGamma , \varphi _{\mathrm {s}}(\mu )=g \text{ on } \varGamma _2, \end{aligned}$$
(13)

where g is the trace of \(\varphi _{\mathrm {s}}(\mu )\) on \(\varGamma _2\) and is a \(C^\infty (\varGamma _2)\) function. Therefore, using [18, Theorems 4.4.3.7 and 5.1.1.4] we obtain (10). Notice that now we do not have the restriction \(s\le 3\), since the right hand side of (13) is zero.

The regularity of its normal derivative is proven using the trace theory as in Lemma 3. \(\square \)

Some further interior regularity will also be useful later.

Lemma 5

For any open sets \(\varOmega _2\) and \(\varOmega _3\) such that \(\varOmega _1\subset \subset \varOmega _2\subset \subset \varOmega _3\subset \subset \varOmega \)

$$\begin{aligned} \varphi _{\mathrm {s}}(\mu )\in W^{2,\infty }(\varOmega _3{\setminus }\bar{\varOmega }_2) \end{aligned}$$

and

$$\begin{aligned} \Vert \varphi _{\mathrm {s}}(\mu )\Vert _{W^{2,\infty }(\varOmega _3{\setminus }\bar{\varOmega }_2)} \le C\Vert \mu \Vert _{{\mathcal {M}}(\bar{\varOmega }_1)}, \end{aligned}$$

where C depends on the distance from \(\bar{\varOmega }_1\) to \(\varOmega _3{\setminus }\bar{\varOmega }_2\).

Proof

The first statement is obvious since \(\varphi _{{\mathrm {s}}}(\mu )\) is harmonic in \(\varOmega {\setminus }\bar{\varOmega }_1\) and \(\varOmega _3{\setminus }\bar{\varOmega }_2\subset \subset \varOmega {\setminus }\bar{\varOmega }_1\).

The continuity estimate is proven as in Lemma 2. Here we need to use a bootstrapping argument with two open sets \(\varOmega '\) and \(\varOmega ''\) such that \(\varOmega _3{\setminus }\bar{\varOmega }_2 \subset \subset \varOmega '' \subset \subset \varOmega '\subset \subset \varOmega {\setminus }\bar{\varOmega }_1\) to obtain the intermediate results

$$\begin{aligned} \Vert \varphi _{\mathrm {s}}(\mu )\Vert _{W^{2,\infty }(\varOmega _3{\setminus }\bar{\varOmega }_2)}\le & {} \Vert \varphi _{\mathrm {s}}(\mu )\Vert _{W^{4,t}(\varOmega _3{\setminus }\bar{\varOmega }_2)} \le C_1 \Vert \varphi _{\mathrm {s}}(\mu )\Vert _{W^{3,t}(\varOmega '')}\\\le & {} C_2 \Vert \varphi _{\mathrm {s}}(\mu )\Vert _{W^{2,t}(\varOmega ')}\le C_3 \Vert \varphi _{\mathrm {s}}(\mu )\Vert _{W^{1,t}(\varOmega )} \le C\Vert \mu \Vert _{{\mathcal {M}}(\bar{\varOmega }_1)}. \end{aligned}$$

\(\square \)

4 Optimality conditions and regularity of the solution

Define the Lagrangian of the problem, \({\mathcal {L}}:L^2(\varGamma )\times {\mathcal {M}}(\bar{\varOmega }_1)\times {\mathcal {M}}(\bar{\varOmega }_1)\rightarrow {\mathbb {R}}\) as

$$\begin{aligned} {\mathcal {L}}(u,\mu ^+,\mu ^-) = J(u) +\langle \mu ^+,y_u-b\rangle + \langle \mu ^-,a-y_u\rangle . \end{aligned}$$

We have that for any \(u,v\in L^2(\varGamma )\) and \(\mu ^+,\mu ^-\in {\mathcal {M}}(\bar{\varOmega }_1)\), with \(\mu = \mu ^+ -\mu ^-\), the first derivatives are given by the expressions (see [13])

$$\begin{aligned} J'(u)v&= (-\partial _n\varphi _{\mathrm {r}}(u) +\nu u,v)_\varGamma \\ \partial _u{\mathcal {L}}(u,\mu ^+,\mu ^-)v&= (-\partial _n\varphi _{\mathrm {r}}(u)-\partial _n\varphi _{\mathrm {s}}(\mu ) + \nu u,v)_\varGamma \end{aligned}$$

and the second derivatives are independent of u, \(\mu ^+\), and \(\mu ^-\) since the problem is quadratic and the constraints are linear:

$$\begin{aligned} J''(u) v^2 = \partial ^2_{uu}{\mathcal {L}}(u,\mu ^+,\mu ^-)v^2 = \Vert y_v\Vert _{L^2(\varOmega )}^2+\nu \Vert v\Vert _{L^2(\varGamma )}^2. \end{aligned}$$

Definition 1

We will say that u is a feasible point for \(({\mathbb {P}})\) if \(u\in U_{ad}\). We will say that \(u_0\in U_{ad}\) is a feasible Slater point for \(({\mathbb {P}})\) if there exist \(\delta >0\) and \(\varepsilon >0\) such that

Theorem 1

Suppose problem \(({{\mathbb {P}}})\) has a feasible point. Then it has a unique solution \(\bar{u}\in U_{ad}\) with related state \(\bar{y} = y_{\bar{u}}\in K\). If, further, \(({{\mathbb {P}}})\) has a feasible Slater point, then there exist two nonnegative measures \(\bar{\mu }^+, \bar{\mu }^-\in {\mathcal {M}}(\bar{\varOmega }_1)\) such that

$$\begin{aligned}&-\varDelta \bar{y} = 0 \text{ in } \varOmega ,\ \bar{y} = \bar{u} \text{ on } \varGamma \end{aligned}$$
(14a)
$$\begin{aligned}&-\varDelta \bar{\varphi } = \bar{y}-y_d+\bar{\mu } \text{ in } \varOmega ,\ \bar{\varphi } = 0 \text{ on } \varGamma \end{aligned}$$
(14b)
$$\begin{aligned}&\bar{u}(x) = {{\mathrm{Proj}}}_{[\alpha (x),\beta (x)]} \left( \frac{1}{\nu }\partial _n\bar{\varphi }(x)\right) \text{ on } \varGamma \end{aligned}$$
(14c)
$$\begin{aligned}&\langle \bar{\mu },y-\bar{y}\rangle \le 0\ \forall y\in K \end{aligned}$$
(14d)

and

$$\begin{aligned}&{{\mathrm{supp}}}\bar{\mu }^+\subset \{x\in \bar{\varOmega }_1:\bar{y}(x)=b(x)\}\end{aligned}$$
(15a)
$$\begin{aligned}&{{\mathrm{supp}}}\bar{\mu }^-\subset \{x\in \bar{\varOmega }_1:\bar{y}(x)=a(x)\} \end{aligned}$$
(15b)

where \(\bar{\mu } = \bar{\mu }^+-\bar{\mu }^-\) and \(\bar{\varphi }=\varphi _{\mathrm {r}}(\bar{u})+\varphi _{\mathrm {s}}(\bar{\mu })\).

Proof

Since problem \(({{\mathbb {P}}})\) is strictly convex and we are supposing the existence of a feasible point, the existence and uniqueness of a solution \(\bar{u}\in L^2(\varGamma )\) is immediate.

Thanks to Lemma 2 and our assumption on the existence of a Slater point, from the expression of the derivative of the Lagrangian, we obtain (see, e.g., [7]) the existence of two nonnegative measures \(\bar{\mu }^+\) and \(\bar{\mu }^-\) such that (14d) holds and

$$\begin{aligned} \partial _u{\mathcal {L}}\left( \bar{u},\bar{\mu }^+,\bar{\mu }^-\right) (u-\bar{u})\ge 0\ \forall u\in U_{\alpha ,\beta }, \end{aligned}$$

which in our case means

$$\begin{aligned} \left( -\partial _n\bar{\varphi } + \nu \bar{u},u-\bar{u}\right) _\varGamma \ge 0\ \forall u\in U_{\alpha ,\beta }, \end{aligned}$$
(16)

which leads directly to the projection formula (14c). Relations like (15a) and (15b) are well known in the context of state-constrained problems. See e.g. [12] for a proof for non-constant constraints. \(\square \)

Remark 1

The Lagrange multiplier \(\bar{\mu }\) and the adjoint state \(\bar{\varphi }\) need not be unique. Consider the following one-dimensional problem: \(\varOmega =(-1,1)\), \(\varOmega _1=(-1/2,1/2)\), \(y_d\equiv -1/2\), \(\nu =1\), \(b\equiv -1/2\). Then \(\bar{y}\equiv -1/2\), \(\bar{u}\equiv -1/2\) is the unique solution of the problem. But both pairs

$$\begin{aligned} \bar{\varphi }_1=\frac{1}{2}(1-|x|),\ \bar{\mu }_1=\delta _{0} \end{aligned}$$

and

$$\begin{aligned} \bar{\varphi }_2=\left\{ \begin{array}{cc} (1-|x|)/2&{} \text{ if } |x|>1/2\\ -x^2/2+3/8&{} \text{ if } |x|<1/2, \end{array} \right. \ \bar{\mu }_2=\chi _{\varOmega _1} \end{aligned}$$

satisfy the optimality system.
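Both pairs can be checked by hand or numerically; the following sketch (our own, using difference quotients, with hypothetical names phi1 and phi2) verifies that the adjoint states are harmonic away from the support of the multipliers, that \(-\bar{\varphi }_2''\) equals the density of \(\bar{\mu }_2=\chi _{\varOmega _1}\) inside \(\varOmega _1\), and that both produce the same normal derivative at the boundary, hence the same control \(\bar{u}\equiv -1/2\) via (14c) with \(\nu =1\):

```python
def phi1(x):
    # First adjoint state: phi1(x) = (1 - |x|)/2, paired with mu1 = delta_0.
    return 0.5 * (1.0 - abs(x))

def phi2(x):
    # Second adjoint state, paired with mu2 = chi_{(-1/2,1/2)}.
    return 0.5 * (1.0 - abs(x)) if abs(x) > 0.5 else -0.5 * x * x + 0.375

h = 1e-4
def neg_second_derivative(phi, x):
    # Centered difference approximation of -phi''(x).
    return -(phi(x - h) - 2.0 * phi(x) + phi(x + h)) / h ** 2

out1 = neg_second_derivative(phi1, 0.75)  # ~ 0: harmonic away from supp(mu1)
out2 = neg_second_derivative(phi2, 0.75)  # ~ 0: harmonic outside Omega_1
in2 = neg_second_derivative(phi2, 0.25)   # ~ 1: the density of chi_{Omega_1}

# Outward normal derivative at x = 1 (one-sided difference); with nu = 1
# (14c) gives the same control u = -1/2 for both adjoint states.
dn1 = (phi1(1.0) - phi1(1.0 - h)) / h     # ~ -0.5
dn2 = (phi2(1.0) - phi2(1.0 - h)) / h     # ~ -0.5
```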

Remark 2

It is also possible to state first order necessary optimality conditions without the use of measures. Due to the convexity of \(U_{ad}\) and the expression for the derivative of J, we have that

$$\begin{aligned} (-\partial _n\varphi _{\mathrm {r}}(\bar{u})+\nu \bar{u},u-\bar{u})_\varGamma \ge 0\ \forall u\in U_{ad}. \end{aligned}$$

This would lead to the expression

$$\begin{aligned} \bar{u} = {{\mathrm{Proj}}}_{U_{ad}}\left( \frac{1}{\nu }\partial _n \varphi _{\mathrm {r}}(\bar{u})\right) \text{ in } \text{ the } \text{ sense } \text{ of } L^2(\varGamma ). \end{aligned}$$

This strategy is used in [27].

Corollary 1

If \(({{\mathbb {P}}})\) has a feasible Slater point, then

$$\begin{aligned} \bar{u}\in W^{1-1/p,p}(\varGamma ),\ \bar{y}\in W^{1,p}(\varOmega )\ \forall p<p_\varOmega \end{aligned}$$
(17)

and

$$\begin{aligned} \bar{u}\in H^{s-3/2}(\varGamma ), \bar{y}\in H^{s-1}(\varOmega )\ \forall s<\min \{3,s_\varOmega \}. \end{aligned}$$
(18)

Proof

On the one hand, using Lemma 4 we know that \(\partial _n\varphi _{{\mathrm {s}}}(\bar{\mu })\in W^{1-1/p,p}(\varGamma )\) for all \(p<p_\varOmega \).

On the other hand, to deal with \(\partial _n\varphi _{{\mathrm {r}}}\), note that (5) implies that \(\partial _n\varphi _{{\mathrm {r}}}(\bar{u})\in W^{1-1/q,q}(\varGamma )\) for all \(q\le 4\), \(q<p_\varOmega \). So the projection relation (14c) gives us that \(\bar{u}\in W^{1-1/q,q}(\varGamma )\subset H^{1/2}(\varGamma )\). We can now use a bootstrap argument based on the relations (6) to obtain (17). The regularity of the state is an immediate consequence of the trace theorem; see [1, Lemma 2.3] for details.

Relation (18) follows from (8) and (12) and the projection formula (14c). See [4, Theorem 18]. The regularity of the state follows directly from Lemma 1. \(\square \)

5 Discretization

Let \(\{{\mathcal {T}}_h\}_h\) be a quasi-uniform family of triangulations of \(\bar{\varOmega }\). For the discretization of the state and the adjoint state we use the space of linear finite elements \(Y_{h}\subset H^1(\varOmega )\),

$$\begin{aligned} Y_{h}=\left\{ y_h\in C(\bar{\varOmega }):y_h|_T\in P^1(T)\ \forall T\in {\mathcal {T}}_h\right\} . \end{aligned}$$

As usual, we will abbreviate \(Y_{h0}=Y_h\cap H^1_0(\varOmega )\). For the control we use the space \(U_h\) of continuous piecewise linear functions that are the traces of elements of \(Y_h\). We define the set of boundary nodes \({\mathcal {B}}_h=\{j:x_j\in \varGamma \}\) for later use. Finally, for the discrete Lagrange multiplier we use the space \({\mathcal {M}}_h\subset {\mathcal {M}}(\bar{\varOmega }_1)\) spanned by the Dirac measures corresponding to the nodes \(\{x_j\}_{j\in {\mathcal {I}}_{1h}}\) of the finite element mesh that lie in \(\bar{\varOmega }_1\).

For any function \(y\in C(\bar{\varOmega })\) (resp. \(u\in C(\varGamma )\)) we denote by \(I_h y\in Y_h\) (resp. \(I_h u\in U_h\)) its nodal interpolator and for any function \(u\in L^2(\varGamma )\), we will denote by \(\varPi _h u\in U_h\) its projection onto \(U_h\) in the \(L^2(\varGamma )\) sense, i.e.,

$$\begin{aligned} (\varPi _h u,v_h)_{\varGamma } = (u,v_h)_\varGamma \ \forall v_h\in U_h. \end{aligned}$$

Notice that for \(u_h\in U_h\), \(\varPi _h u_h = u_h\). It is known (see [5, Eq. (2.20)], [13, Eq. (4.1)] or [16, Eq. (3.8)]) that if \(u\in H^t(\varGamma )\) for some \(0\le t\le 2\), then

$$\begin{aligned} \Vert u-\varPi _h u\Vert _{L^2(\varGamma )}\le C h^t \Vert u\Vert _{H^t(\varGamma )}. \end{aligned}$$
(19)
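As an illustration of (19), the following self-contained Python sketch (our own one-dimensional analogue: the \(L^2\) projection onto continuous piecewise linear functions on a uniform mesh of \([0,1]\), assembled from the \(P^1\) mass matrix and solved with the Thomas algorithm) reproduces the rate \(h^t\) with \(t=2\) for a smooth function:

```python
import math

def l2_projection_p1(u, n):
    # L2-orthogonal projection of u onto continuous piecewise linear
    # functions on a uniform mesh of [0, 1] with n elements.
    h = 1.0 / n
    d = [h / 3.0] + [2.0 * h / 3.0] * (n - 1) + [h / 3.0]  # mass matrix diagonal
    e = h / 6.0                                            # off-diagonal entry
    b = [0.0] * (n + 1)
    for i in range(n):  # right-hand side (u, phi_i) by Simpson's rule
        xl, xm, xr = i * h, (i + 0.5) * h, (i + 1) * h
        b[i] += h / 6.0 * (u(xl) + 2.0 * u(xm))
        b[i + 1] += h / 6.0 * (2.0 * u(xm) + u(xr))
    # Thomas algorithm for the symmetric tridiagonal system M c = b.
    cp, bp = [0.0] * (n + 1), [0.0] * (n + 1)
    cp[0], bp[0] = e / d[0], b[0] / d[0]
    for i in range(1, n + 1):
        m = d[i] - e * cp[i - 1]
        cp[i] = e / m
        bp[i] = (b[i] - e * bp[i - 1]) / m
    c = [0.0] * (n + 1)
    c[n] = bp[n]
    for i in range(n - 1, -1, -1):
        c[i] = bp[i] - cp[i] * c[i + 1]
    return c

def l2_error(u, c, n):
    # || u - Pi_h u ||_{L2(0,1)} by Simpson's rule on each element.
    h, s = 1.0 / n, 0.0
    for i in range(n):
        xl, xm, xr = i * h, (i + 0.5) * h, (i + 1) * h
        pm = 0.5 * (c[i] + c[i + 1])  # the projection is linear per element
        s += h / 6.0 * ((u(xl) - c[i]) ** 2
                        + 4.0 * (u(xm) - pm) ** 2
                        + (u(xr) - c[i + 1]) ** 2)
    return math.sqrt(s)

u = lambda x: math.sin(2.0 * math.pi * x)
errs = [l2_error(u, l2_projection_p1(u, n), n) for n in (16, 32, 64)]
rates = [math.log(errs[k] / errs[k + 1], 2) for k in (0, 1)]
```

Halving the mesh size divides the error by roughly four, i.e. the computed rates are close to 2, in agreement with (19) for \(t=2\).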

We will also use the space

$$\begin{aligned} Y_h^\varGamma =\left\{ y_h\in Y_h:y_h(x_j)=0 \text{ if } x_j\not \in \varGamma \right\} . \end{aligned}$$

We discretize the state equation without penalization, imposing the Dirichlet boundary condition in a variational sense (a variational crime; see [2, Theorem 5.2]): for any \(u\in L^2(\varGamma )\), \(y_h(u)\in Y_h\) is the solution of

$$\begin{aligned} (\nabla y_h(u),\nabla z_h)=0\ \forall z_h\in Y_{h0},\ (y_h(u),v_h)_{\varGamma } = (u,v_h)_\varGamma \ \forall v_h\in U_h. \end{aligned}$$

It is customary to say that \(y_h(u)\) is the discrete harmonic extension of u. Notice that \(y_h(u)\equiv \varPi _h u\) on \(\varGamma \) and hence, if \(u_h\in U_h\), \(y_h(u_h)\equiv u_h\) on \(\varGamma \).
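In algebraic terms (the block notation below is ours), splitting the nodal unknowns of \(y_h(u)\) into interior and boundary values \(\mathsf {y}_I\) and \(\mathsf {y}_\varGamma \), the two variational relations amount to the linear system

$$\begin{aligned} A_{II}\,\mathsf {y}_I + A_{I\varGamma }\,\mathsf {y}_\varGamma = 0,\quad M_\varGamma \,\mathsf {y}_\varGamma = \mathsf {b}_u,\quad (\mathsf {b}_u)_j = (u,e_j)_\varGamma \ \forall j\in {\mathcal {B}}_h, \end{aligned}$$

where \(A_{II}\) and \(A_{I\varGamma }\) are blocks of the stiffness matrix, \(M_\varGamma \) is the boundary mass matrix and \(\{e_j\}_{j\in {\mathcal {B}}_h}\) is the nodal basis of \(U_h\); the second equation is exactly the relation \(y_h(u)\equiv \varPi _h u\) on \(\varGamma \).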

The discrete objective functional is defined as

$$\begin{aligned} J_h(u)=\frac{1}{2}\Vert y_h(u)-y_d\Vert ^2_{L^2(\varOmega )}+\frac{\nu }{2}\Vert u\Vert ^2_{L^2(\varGamma )}. \end{aligned}$$

We will denote by

$$\begin{aligned} U_{\alpha ,\beta ,h}= & {} \{u_h\in U_h:\alpha (x_j)\le u_h(x_j)\le \beta (x_j)\ \forall j\in {\mathcal {B}}_h\},\\ K_h= & {} \{y_h\in Y_h:a(x_j)\le y_h(x_j)\le b(x_j)\ \forall x_j\in \bar{\varOmega }_1\}, \end{aligned}$$

and

$$\begin{aligned} U_{ad,h} = \{u_h\in U_{\alpha ,\beta ,h}:y_h(u_h)\in K_h\}. \end{aligned}$$

Our discrete control problem then reads as

We will discuss some properties of problem \(({{\mathbb {P}}}_{h})\) similar to those of problem \(({{\mathbb {P}}})\).

Definition 2

We will say that \(u_h\) is a feasible point for \(({{\mathbb {P}}}_{h})\) if \(u_h\in U_{ad,h}\). We will call \(u_{h0}\in U_{ad,h}\) a feasible Slater point for \(({{\mathbb {P}}}_{h})\) if there exist \(\delta _h>0\) and \(\varepsilon _h>0\) such that

Theorem 2

Suppose that \(({{\mathbb {P}}})\) has a regular feasible Slater point \(u_0\in W^{1-1/p,p}(\varGamma )\) for some \(p>2\). Then there exists \(h_0>0\) such that for all \(0<h<h_0\) the discrete problem \(({{\mathbb {P}}}_{h})\) has a feasible Slater point \(u_{h0}=\varPi _hu_0\).

Moreover, the quantities \(\delta _h\) and \(\varepsilon _h\) can be taken independent of h for h small enough.

Remark 3

Different assumptions on the regularity of the Slater point are not rare in the related literature on control problems with both control and state constraints. See e.g. [26, Assumption 6.2], [10, Remark 3.8] or [27, Assumption 2.1].

Proof

Let \(u_0\in U_{ad}\) be the feasible Slater point for problem \(({{\mathbb {P}}})\), and define \(u_{h0}=\varPi _h u_0\). With an inverse inequality, usual interpolation error estimates, and estimate (19) we obtain

$$\begin{aligned}&\Vert u_0-\varPi _h u_0\Vert _{L^\infty (\varGamma )} \\&\quad \le \Vert u_0-I_h u_0\Vert _{L^\infty (\varGamma )} + \Vert I_h u_0-\varPi _h u_0\Vert _{L^\infty (\varGamma )} \\&\quad \le C h^{1-1/p} \Vert u_0\Vert _{W^{1-1/p,p}(\varGamma )} + C h^{-1/2} \Vert I_h u_0-\varPi _h u_0\Vert _{L^2(\varGamma )} \\&\quad \le C h^{1-1/p} \Vert u_0\Vert _{W^{1-1/p,p}(\varGamma )}\\&\quad \quad + C h^{-1/2} \left( \Vert I_h u_0- u_0\Vert _{L^2(\varGamma )} + \Vert u_0-\varPi _h u_0\Vert _{L^2(\varGamma )}\right) \\&\quad \le C h^{1-1/p} \Vert u_0\Vert _{W^{1-1/p,p}(\varGamma )} + C h^{-1/2} \left( h^{1-1/p}\Vert u_0\Vert _{H^{1-1/p}(\varGamma )}\right) \le C h^{1/2-1/p}. \end{aligned}$$

From this uniform convergence, and the fact that \(\alpha (x) < u_0 <\beta (x)\) for all \(x\in \varGamma \), we deduce the existence of some \(h_0>0\) such that for all \(0<h<h_0\), \(\alpha (x_j) < u_{h0}(x_j) <\beta (x_j)\) holds for all \(x_j\in \varGamma \).

Since \(u_{h0}\rightarrow u_0\) in \(L^2(\varGamma )\), Lemma 2 allows us to deduce that

$$\begin{aligned} \lim _{h\rightarrow 0}\Vert y_{u_0}- y_{u_{h0}}\Vert _{L^\infty (\varOmega _1)}=0. \end{aligned}$$
(20)

On the other hand, using the interior error estimate from [28, Theorem 5.1] we have that for some open set \(\varOmega _2\) such that \(\varOmega _1\subset \subset \varOmega _2\subset \subset \varOmega \) the estimate

$$\begin{aligned}&\Vert y_{u_{h0}}-y_h(u_{h0})\Vert _{L^\infty (\varOmega _1)}\\&\quad \le C\left( |\log h|\Vert y_{u_{h0}}-I_h y_{u_{h0}}\Vert _{L^\infty (\varOmega _2)} + \Vert y_{u_{h0}}-y_h(u_{h0})\Vert _{L^2(\varOmega )} \right) \end{aligned}$$

holds. The first addend in this expression converges to zero since \(y_{u_{h0}}\) is harmonic in \(\varOmega \), and the second one as a consequence of [2, Theorem 5.5]. So we obtain

$$\begin{aligned} \lim _{h\rightarrow 0}\Vert y_{u_{h0}}-y_h(u_{h0})\Vert _{L^\infty (\varOmega _1)}=0. \end{aligned}$$
(21)

From the triangle inequality, (20), and (21), we conclude \(y_{h}(u_{h0})\rightarrow y_{u_{0}}\) in \(L^\infty (\varOmega _1)\). Since \(a(x) < y_{u_{0}}(x) < b(x)\) for all \(x\in \bar{\varOmega }_1\), there exists \(h_0>0\) such that \(a(x_j) < y_{h}(u_{h0})(x_j) < b(x_j)\) holds for all \(0<h<h_0\) and all \(x_j\in \bar{\varOmega }_1\). Hence, \(u_{h0}\) is a Slater point.

The independence of \(\delta _h\) and \(\varepsilon _h\) with respect to h is clear from the definition of the Slater point \(u_0\) and the proven uniform convergences. \(\square \)

For any \(u\in L^2(\varGamma )\) and \(\mu \in {\mathcal {M}}(\bar{\varOmega }_1)\), we define \(\varphi _{{\mathrm {r}},h}(u),\varphi _{{\mathrm {s}},h}(\mu )\in Y_{h0}\) to be the unique solutions of

$$\begin{aligned} (\nabla z_h,\nabla \varphi _{{\mathrm {r}},h}(u))&=(y_h(u)-y_d,z_h)\ \forall z_h\in Y_{h0},\\ (\nabla z_h,\nabla \varphi _{{\mathrm {s}},h}(\mu ))&=\langle \mu ,z_h\rangle \ \forall z_h\in Y_{h0}. \end{aligned}$$

Let us now introduce the discrete variational normal derivative. For any linear operator \(T_h:Y_h\rightarrow {\mathbb {R}}\), let \(\varphi _h\in Y_{h0}\) be the solution of

$$\begin{aligned} (\nabla z_h,\nabla \varphi _h)=T_h(z_h)\ \forall z_h\in Y_{h0}. \end{aligned}$$

Then its discrete variational normal derivative \(\partial _n^h\varphi _h\in U_h\) is the unique solution of

$$\begin{aligned} (\partial _n^h\varphi _h,z_h)_\varGamma = (\nabla z_h,\nabla \varphi _h) -T_h(z_h) \ \forall z_h\in Y_h^\varGamma . \end{aligned}$$

The Lagrangian \({\mathcal {L}}_h:L^2(\varGamma )\times {\mathcal {M}}_h\times {\mathcal {M}}_h\rightarrow {\mathbb {R}}\) of \(({{\mathbb {P}}}_{h})\) is defined by

$$\begin{aligned} {\mathcal {L}}_h\left( u,\mu ^+_h,\mu ^-_h\right) = J_h(u)+\langle \mu ^+_h,y_h(u)-b\rangle + \langle \mu ^-_h,a-y_h(u)\rangle . \end{aligned}$$

We have that for any \(u,v\in L^2(\varGamma )\) and any \(\mu ^+_h,\mu ^-_h\in {\mathcal {M}}_h\), with \(\mu _h=\mu ^+_h-\mu ^-_h\), the first derivatives of \(J_h\) and \({\mathcal {L}}_h\) are given by the expression

$$\begin{aligned} J_h'(u)v&= \left( -\partial ^h_n\varphi _{{\mathrm {r}},h}(u)+\nu u,v\right) _\varGamma \\ \partial _u{\mathcal {L}}_h\left( u,\mu ^+_h,\mu ^-_h\right) v&= \left( -\partial ^h_n\varphi _{{\mathrm {r}},h}(u)-\partial ^h_n\varphi _{{\mathrm {s}},h}(\mu _h)+\nu u,v\right) _\varGamma , \end{aligned}$$
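For completeness, let us sketch how the first expression follows (our derivation, assuming the standard form of the discrete functional, \(J_h(u)=\tfrac{1}{2}\Vert y_h(u)-y_d\Vert _{L^2(\varOmega )}^2+\tfrac{\nu }{2}\Vert u\Vert _{L^2(\varGamma )}^2\), and that the defining identity of the discrete normal derivative may be tested with \(y_h(v)\)): taking \(T_h(z_h)=(y_h(u)-y_d,z_h)\), so that \(\varphi _{{\mathrm {r}},h}(u)\) plays the role of \(\varphi _h\) above, we obtain

$$\begin{aligned} J_h'(u)v&= (y_h(u)-y_d,y_h(v)) + \nu (u,v)_\varGamma \\&= (\nabla y_h(v),\nabla \varphi _{{\mathrm {r}},h}(u)) - (\partial _n^h\varphi _{{\mathrm {r}},h}(u),y_h(v))_\varGamma + \nu (u,v)_\varGamma \\&= \left( -\partial _n^h\varphi _{{\mathrm {r}},h}(u)+\nu u,v\right) _\varGamma , \end{aligned}$$

since the gradient term vanishes (\(\varphi _{{\mathrm {r}},h}(u)\in Y_{h0}\) while \(y_h(v)\) is discrete harmonic), and since \(\partial _n^h\varphi _{{\mathrm {r}},h}(u)\in U_h\) allows us to replace \(y_h(v)\) by \(v\) in the boundary product.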

and again the second derivatives are independent of u, \(\mu ^+_h\), and \(\mu ^-_h\) since the problem is quadratic and the constraints are linear:

$$\begin{aligned} J''_h(u) v^2 = \partial ^2_{uu}{\mathcal {L}}_h\left( u,\mu ^+_h,\mu ^-_h\right) v^2 = \Vert y_h(v)\Vert _{L^2(\varOmega )}^2+\nu \Vert v\Vert _{L^2(\varGamma )}^2. \end{aligned}$$

Corollary 2

If \(({{\mathbb {P}}})\) has a regular feasible Slater point, then there exists \(h_0>0\) such that for all \(0<h<h_0\) the discrete problem \(({{\mathbb {P}}}_{h})\) has a unique solution \(\bar{u}_h\in U_{ad,h}\) with related discrete state \(\bar{y}_h\in K_h\). Moreover, there exist nonnegative measures \(\bar{\mu }^+_h,\bar{\mu }^-_h\in {\mathcal {M}}_h\) such that

$$\begin{aligned} (\nabla \bar{y}_h,\nabla z_h)=0\ \forall z_h\in Y_{h0},\ (\bar{y}_h,v_h)_{\varGamma }&= (\bar{u}_{{h}},v_h)_\varGamma \ \forall v_h\in U_h \end{aligned}$$
(22a)
$$\begin{aligned} (\nabla z_h,\nabla \bar{\varphi }_h)&=(\bar{y}_h-y_d,z_h)+ \langle \bar{\mu }_h,z_h\rangle \ \forall z_h\in Y_{h0}\end{aligned}$$
(22b)
$$\begin{aligned} \langle \bar{\mu }_h,y_h-\bar{y}_h\rangle&\le 0\ \forall y_h\in K_h\end{aligned}$$
(22c)
$$\begin{aligned} (\nu \bar{u}_h-\partial _n^h\bar{\varphi }_h,u_h-\bar{u}_h)_\varGamma&\ge 0\ \forall u_h\in U_{\alpha ,\beta ,h} \end{aligned}$$
(22d)

where \(\bar{\mu }_h = \bar{\mu }^+_h-\bar{\mu }^-_h\) and \(\bar{\varphi }_h=\varphi _{{\mathrm {r}},h}(\bar{u}_h)+\varphi _{{\mathrm {s}},h}(\bar{\mu }_h)\).

Proof

Problem \(({{\mathbb {P}}}_{h})\) is a finite dimensional strictly convex optimization problem whose feasible set is not empty due to Theorem 2, so it has a unique solution \(\bar{u}_h\in U_{ad,h}\).

The optimality system is immediately obtained from the expression for the first derivative of the discrete Lagrangian. \(\square \)

Lemma 6

Under the assumptions of Corollary 2, the discrete Lagrange multipliers are bounded independently of h.

Proof

Consider the sequence \(u_{h0}\) of feasible Slater points for problems \(({{\mathbb {P}}}_{h})\) found in Theorem 2. Since \(u_{h0}\rightarrow u_0\) in \(L^2(\varGamma )\), it is a bounded sequence, and the continuity of the solution operator from \(L^2(\varGamma )\) to \(L^2(\varOmega )\), together with [2, Theorem 5.5], implies that \(y_h(u_{h0})\) is also bounded in \(L^2(\varOmega )\). So we may deduce the existence of \(C>0\) such that

$$\begin{aligned} \Vert \bar{u}_h\Vert ^2_{L^2(\varGamma )}\le \frac{2}{\nu }J_h(\bar{u}_h)\le \frac{2}{\nu }J_h(u_{h0})\le C. \end{aligned}$$

With the same reasoning made for the discrete states related to the Slater points, we deduce that the sequence of discrete optimal states \(\bar{y}_h\) is also bounded in \(L^2(\varOmega )\).

Since \(u_{h0}\) is a Slater point for problem \(({{\mathbb {P}}}_{h})\), there exists \(\rho >0\) such that

$$\begin{aligned} a(x_j)\le y_{h}(u_{h0})(x_j)-\rho < y_{h}(u_{h0})(x_j)+\rho \le b(x_j)\ \forall x_j\in \bar{\varOmega }_1. \end{aligned}$$

Notice that \(\bar{\mu }_h=\bar{\mu }_h^+-\bar{\mu }_h^-\in {\mathcal {M}}_h\), and hence it is a combination of Dirac deltas centered at the nodes of the mesh: there exist coefficients \(\bar{\lambda }_j\in {\mathbb {R}}\), for the indices \(j\) with \(x_j\in \bar{\varOmega }_1\), such that

$$\begin{aligned} \bar{\mu }_h = \sum \bar{\lambda }_j\delta _{x_j}. \end{aligned}$$

Define \(z_h\in Y_h\) as

$$\begin{aligned} z_h(x_j)=\left\{ \begin{array}{cl}\rho &{} \text{ if } x_j\in \bar{\varOmega }_1 \text{ and } \bar{\lambda }_j\ge 0\\ -\rho &{} \text{ if } x_j\in \bar{\varOmega }_1 \text{ and } \bar{\lambda }_j < 0\\ 0&{} \text{ if } x_j\not \in \bar{\varOmega }_1. \end{array}\right. \end{aligned}$$

Clearly, \(y_h(u_{h0})+z_h\in K_h\), and using (22c) we have

$$\begin{aligned} \langle \bar{\mu }_h,y_h(u_{h0})+z_h-\bar{y}_h\rangle \le 0. \end{aligned}$$

So, using in turn the definition of the discrete normal derivative of \(\varphi _{{\mathrm {s}},h}(\bar{\mu }_h)\); the fact that \(\varphi _{{\mathrm {s}},h}(\bar{\mu }_h)\in Y_{h0}\) together with the definition of the discrete state; the discrete Euler–Lagrange condition (22d) together with the boundary conditions satisfied by the discrete states; the definition of the discrete normal derivative of \(\varphi _{{\mathrm {r}},h}(\bar{u}_h)\); the fact that \(\varphi _{{\mathrm {r}},h}(\bar{u}_h)\in Y_{h0}\) together with the definition of the discrete state; and the already proved boundedness in \(L^2(\varGamma )\) of the discrete optimal controls and the discrete Slater controls and in \(L^2(\varOmega )\) of their related states, we obtain:

$$\begin{aligned} \rho \Vert \bar{\mu }_h\Vert _{{\mathcal {M}}(\bar{\varOmega }_1)}&=\rho \sum |\bar{\lambda }_j| = \langle \bar{\mu }_h,z_h\rangle \\&\le \langle \bar{\mu }_h,\bar{y}_h-y_h(u_{h0})\rangle \\&=(\nabla (\bar{y}_{h}-y_h(u_{h0})),\nabla \varphi _{{\mathrm {s}},h}(\bar{\mu }_h))-(\partial _n^h\varphi _{{\mathrm {s}},h}(\bar{\mu }_h),\bar{y}_h-y_h(u_{h0}))_\varGamma \\&\le (\nu \bar{u}_h-\partial ^h_n\varphi _{{\mathrm {r}},h}(\bar{u}_h),\bar{u}_h-u_{h0})_\varGamma \\&= (\nu \bar{u}_h,\bar{u}_h-u_{h0})_\varGamma \\&\quad + (\nabla (\bar{y}_{h}-y_h(u_{h0})),\nabla \varphi _{{\mathrm {r}},h}(\bar{u}_h))- (\bar{y}_{h}-y_d,\bar{y}_{h}-y_h(u_{h0}))\\&= (\nu \bar{u}_h,\bar{u}_h-u_{h0})_\varGamma \le C. \end{aligned}$$

Hence the assertion is proven. \(\square \)

6 Error estimates

To obtain error estimates, we will make the following technical assumption on the triangulation, which is not difficult to fulfill in practice:

Assumption (H)

There exist \(\bar{h}>0\), an open set \(\varOmega _2\) with smooth boundary \(\varGamma _2\), and an open set \(\varOmega _{2,\bar{h}}\) such that \(\varOmega _1\subset \subset \varOmega _2\subset \subset \varOmega _{2,\bar{h}}\subset \subset \varOmega \) and, for all \(0<h<\bar{h}\),

$$\begin{aligned} \bar{\varOmega }_{2,\bar{h}}=\cup \left\{ T\in {\mathcal {T}}_h: x_j\in \bar{\varOmega }_{2,\bar{h}}\ \forall x_j \text{ vertex } \text{ of } T\right\} . \end{aligned}$$

Notice that for every \(T\in {\mathcal {T}}_h\), either \({\mathrm {int}}\,T\subset \varOmega _{2,\bar{h}}\) or \({\mathrm {int}}\,T\subset \varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}\), and \(\{{\mathcal {T}}_h\}_{h<\bar{h}}\) induces a quasi-uniform family of triangulations \(\{{\mathcal {T}}_{2,h}\}_{h<\bar{h}}\) on \(\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}\). We define

$$\begin{aligned} \tilde{Y}_h = \left\{ y_h\in C(\bar{\varOmega }{\setminus }\varOmega _{2,\bar{h}}):y_h\in P^1(T)\ \forall T\in {\mathcal {T}}_{2,h}\right\} \end{aligned}$$

and

$$\begin{aligned} \tilde{Y}_{h,0} =\tilde{Y}_h\cap H^1_{0}\left( \varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}\right) . \end{aligned}$$

We will denote by \((\cdot ,\cdot )_{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}}\) the inner product in \(L^2(\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}})\). We will also use the space \(\tilde{U}_h\) of the traces of the elements of \(\tilde{Y}_h\) on \(\varGamma _{2,\bar{h}}\), the boundary of \(\varOmega _{2,\bar{h}}\).

We can also define a variational discrete normal derivative on \(\varGamma _{2,\bar{h}}\). For any \(e_h\in U_h\), and \(T_h:\tilde{Y}_h\rightarrow {\mathbb {R}}\) linear, let \(\phi _h\in \tilde{Y}_h\) be the unique solution of

$$\begin{aligned} (\nabla \phi _h,\nabla z_h)_{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}} = T_h(z_h) \text{ for } \text{ all } z_h\in \tilde{Y}_{h,0},\ \phi _h = e_h \text{ on } \varGamma ,\ \phi _h = 0\ \text{ on } \varGamma _{2,\bar{h}}. \end{aligned}$$

Then it can be shown as in [13] that there exists a unique \(\partial _n^h\phi _h\in \tilde{U}_h\) such that

$$\begin{aligned} \left( \partial _n^h\phi _h,z_h\right) _{\varGamma _{2,\bar{h}}} = \left( \nabla \phi _h,\nabla z_h\right) _{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}}- T_h(z_h)\ \ \ \forall z_h\in \tilde{Y}_h. \end{aligned}$$
(23)

We have the following relation between the boundary data on \(\varGamma \) and the discrete normal derivative on \(\varGamma _{2,\bar{h}}\).

Lemma 7

Suppose that Assumption (H) is satisfied, consider \(e_h\in U_h\) and let \(\phi _h\in \tilde{Y}_h\) be the unique solution of

$$\begin{aligned} \left( \nabla \phi _h,\nabla z_h\right) _{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}} = 0\ \forall z_h\in \tilde{Y}_{h,0},\ \phi _h = e_h \text{ on } \varGamma ,\ \phi _h = 0\ \text{ on } \varGamma _{2,\bar{h}}. \end{aligned}$$

Then, there exist \(h_0>0\) and \(C>0\) such that for all \(0<h<h_0\)

$$\begin{aligned} \Vert \partial _n^h\phi _h\Vert _{L^2(\varGamma _{2,\bar{h}})} \le \frac{C}{h} \Vert e_h\Vert _{L^2(\varGamma )} \end{aligned}$$

is satisfied.

Proof

Take any \(v_h\in \tilde{U}_h\) and let \(\eta _h\in \tilde{Y}_h\) be the unique solution of

$$\begin{aligned} (\nabla \eta _h,\nabla z_h)_{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}}=0\ \forall z_h\in \tilde{Y}_{h,0},\ \eta _h =0 \text{ on } \varGamma , \eta _h = v_h \text{ on } \varGamma _{2,\bar{h}}. \end{aligned}$$

Then, with (23) and using the appropriate inverse inequality (cf. [6, Theorem (4.5.11)]), we obtain

$$\begin{aligned} (\partial _n^h\phi _h,v_h)_{\varGamma _{2,\bar{h}}}&= (\nabla \phi _h,\nabla \eta _h)_{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}} \le \Vert \nabla \phi _h\Vert _{L^2(\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}})} \Vert \nabla \eta _h\Vert _{L^2(\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}})}\\&\le C \Vert e_h\Vert _{H^{1/2}(\varGamma )} \Vert v_h\Vert _{H^{1/2}(\varGamma _{2,\bar{h}})} \\&\le C \frac{1}{h^{1/2}}\Vert e_h\Vert _{L^{2}(\varGamma )} \frac{1}{h^{1/2}}\Vert v_h\Vert _{L^{2}(\varGamma _{2,\bar{h}})} \end{aligned}$$
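The passage from this bilinear estimate to the asserted bound is the discrete duality identity (our restatement): since \(\partial _n^h\phi _h\) itself belongs to \(\tilde{U}_h\),

$$\begin{aligned} \Vert \partial _n^h\phi _h\Vert _{L^2(\varGamma _{2,\bar{h}})}=\sup _{v_h\in \tilde{U}_h{\setminus }\{0\}}\frac{(\partial _n^h\phi _h,v_h)_{\varGamma _{2,\bar{h}}}}{\Vert v_h\Vert _{L^2(\varGamma _{2,\bar{h}})}}\le \frac{C}{h}\Vert e_h\Vert _{L^2(\varGamma )}. \end{aligned}$$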

and the result follows. \(\square \)

Lemma 8

For any \(u\in W^{1-1/p,p}(\varGamma )\) there exist some \(h_1>0\) and some \(C>0\), independent of \(u\), such that for all \(0<h<h_1\) the following estimate holds

$$\begin{aligned} \Vert \partial _n\varphi _{\mathrm {r}}(u) - \partial _n^h\varphi _{{\mathrm {r}},h}(u)\Vert _{L^2(\varGamma )} \le C h^{1-1/p}\Vert u\Vert _{W^{1-1/p,p}(\varGamma )}\ \forall p<p_\varOmega . \end{aligned}$$
(24)

Suppose further that assumption (H) is satisfied. Then, for any \(\mu \in {\mathcal {M}}(\bar{\varOmega }_1)\), there exist some \(h_2>0\) and \(C>0\) independent of \(\mu \) such that for all \(0<h<h_2\) the following estimate holds

$$\begin{aligned} \Vert \partial _n\varphi _{{\mathrm {s}}}(\mu )- \partial ^h_n\varphi _{{\mathrm {s}},h}(\mu )\Vert _{L^2(\varGamma )}\le C h^{1-1/p}\Vert \mu \Vert _{{\mathcal {M}}(\bar{\varOmega }_1)}\ \forall p<p_\varOmega . \end{aligned}$$
(25)

Proof

The proofs of the estimates for the regular part and the singular part are similar. We will write the details for the singular part, since it requires additional arguments. We will drop the dependence on \(\mu \) in the following lines. First we write

$$\begin{aligned} \Vert \partial _n\varphi _{{\mathrm {s}}}- \partial ^h_n\varphi _{{\mathrm {s}},h}\Vert ^2_{L^2(\varGamma )} = \Vert \partial _n\varphi _{{\mathrm {s}}}- \varPi _h \partial _n\varphi _{{\mathrm {s}}}\Vert ^2_{L^2(\varGamma )} +\Vert \varPi _h\partial _n\varphi _{{\mathrm {s}}}- \partial ^h_n\varphi _{{\mathrm {s}},h}\Vert ^2_{L^2(\varGamma )}. \end{aligned}$$
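The equality above is Pythagoras' theorem for the orthogonal projection \(\varPi _h\): since \(\varPi _h\partial _n\varphi _{{\mathrm {s}}}-\partial _n^h\varphi _{{\mathrm {s}},h}\in U_h\), while \(\partial _n\varphi _{{\mathrm {s}}}-\varPi _h\partial _n\varphi _{{\mathrm {s}}}\) is \(L^2(\varGamma )\)-orthogonal to \(U_h\), the cross term vanishes:

$$\begin{aligned} \left( \partial _n\varphi _{{\mathrm {s}}}-\varPi _h\partial _n\varphi _{{\mathrm {s}}},\ \varPi _h\partial _n\varphi _{{\mathrm {s}}}-\partial _n^h\varphi _{{\mathrm {s}},h}\right) _\varGamma =0. \end{aligned}$$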

From Lemma 4 and estimate (19) it follows that

$$\begin{aligned} \Vert \partial _n\varphi _{{\mathrm {s}}}- \varPi _h \partial _n\varphi _{{\mathrm {s}}}\Vert _{L^2(\varGamma )}\le C h^s \Vert \mu \Vert _{{\mathcal {M}}(\bar{\varOmega }_1)}\ \forall s<\min \{3/2,s_\varOmega -3/2\}. \end{aligned}$$

For the second addend, denote by \(e_h= \varPi _h\partial _n\varphi _{{\mathrm {s}}}- \partial ^h_n\varphi _{{\mathrm {s}},h}\) and define \(\phi _h\in \tilde{Y}_h\) as the unique solution of

$$\begin{aligned} (\nabla \phi _h,\nabla z_h)_{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}} = 0 \text{ for } \text{ all } z_h\in \tilde{Y}_{h,0},\ \phi _h = e_h \text{ on } \varGamma ,\ \phi _h = 0\ \text{ on } \varGamma _{2,\bar{h}}. \end{aligned}$$

We use the definition of \(\varPi _h\) and the value of \(\phi _h\) on \(\varGamma \) to write

$$\begin{aligned} \Vert e_h\Vert _{L^2(\varGamma )}^2&=\Vert \varPi _h\partial _n\varphi _{{\mathrm {s}}}- \partial ^h_n\varphi _{{\mathrm {s}},h}\Vert ^2_{L^2(\varGamma )}\nonumber \\&= \left( \varPi _h\partial _n\varphi _{{\mathrm {s}}}- \partial ^h_n\varphi _{{\mathrm {s}},h},\varPi _h\partial _n\varphi _{{\mathrm {s}}}- \partial ^h_n\varphi _{{\mathrm {s}},h}\right) _\varGamma \nonumber \\&=\left( \partial _n\varphi _{{\mathrm {s}}}- \partial ^h_n\varphi _{{\mathrm {s}},h},\varPi _h\partial _n\varphi _{{\mathrm {s}}}- \partial ^h_n\varphi _{{\mathrm {s}},h}\right) _\varGamma \nonumber \\&= \left( \partial _n\varphi _{\mathrm {s}},\phi _h\right) _{\varGamma } - \left( \partial ^h_n\varphi _{{\mathrm {s}},h}, \phi _h\right) _{\varGamma }. \end{aligned}$$
(26)

Since \(\phi _h=0\) on \(\varGamma _{2,\bar{h}}\), the extension of \(\phi _h\) to \(\varOmega _{2,\bar{h}}\) by 0 is an element of \(Y_h\). With an abuse of notation we will also refer to this extension as \(\phi _h\). Now we can use that \(\phi _h\in H^1(\varOmega )\) and apply Green’s formula to obtain

$$\begin{aligned} (\partial _n\varphi _{\mathrm {s}},\phi _h)_{\varGamma } = -(\phi _h,\mu ) +(\nabla \phi _h,\nabla \varphi _{\mathrm {s}}) = (\nabla \phi _h,\nabla \varphi _{\mathrm {s}})= (\nabla \phi _h,\nabla \varphi _{\mathrm {s}})_{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}}, \end{aligned}$$
(27)

where we have used that \({{\mathrm{supp}}}\,\mu \subset \bar{\varOmega }_1\subset \varOmega _2\subset \subset \varOmega _{2,\bar{h}}\) and \(\phi _h\equiv 0\) in \(\varOmega _{2,\bar{h}}\). In the same way we use that \(\phi _h\in Y_h\) and the definition of the discrete normal derivative to obtain

$$\begin{aligned} (\partial ^h_n\varphi _{{\mathrm {s}},h},\phi _h)_{\varGamma }&= -(\phi _h,\mu ) +(\nabla \phi _h,\nabla \varphi _{{\mathrm {s}},h}) \nonumber \\&= (\nabla \phi _h,\nabla \varphi _{{\mathrm {s}},h})= (\nabla \phi _h,\nabla \varphi _{{\mathrm {s}},h}) _{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}}. \end{aligned}$$
(28)

Now we use (26)–(28) and insert the zero \(\pm (\nabla \phi _h,\nabla I_h\varphi _{\mathrm {s}})_{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}}\) to write

$$\begin{aligned} \Vert e_h\Vert _{L^2(\varGamma )}^2= \left( \nabla \phi _h,\nabla \varphi _{\mathrm {s}}- \nabla I_h\varphi _{\mathrm {s}}\right) _{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}} + \left( \nabla \phi _h,\nabla I_h \varphi _{{\mathrm {s}}}-\nabla \varphi _{{\mathrm {s}},h}\right) _{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}}. \end{aligned}$$
(29)

Let us discuss the second term of (29). For any \(z_h\in \tilde{Y}_h\) such that \(z_h=0 \text{ on } \varGamma \), using the definition of discrete normal derivative, we have

$$\begin{aligned} \left( \nabla \phi _h,\nabla z_h\right) _{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}} = \left( \partial _n^h\phi _h,z_h\right) _{\varGamma _{2,\bar{h}}} \end{aligned}$$

and therefore

$$\begin{aligned} \left( \nabla \phi _h,\nabla I_h \varphi _{{\mathrm {s}}}-\nabla \varphi _{{\mathrm {s}},h}\right) _{\varOmega {\setminus } \bar{\varOmega }_{2,\bar{h}}}&= \left( \partial _n^h\phi _h, I_h\varphi _{{\mathrm {s}}}-\varphi _{{\mathrm {s}},h}\right) _{\varGamma _{2,\bar{h}}}\\&= \left( \partial _n^h\phi _h, I_h\varphi _{{\mathrm {s}}}-\varphi _{{\mathrm {s}}}\right) _{\varGamma _{2,\bar{h}}} + \left( \partial _n^h\phi _h, \varphi _{{\mathrm {s}}}-\varphi _{{\mathrm {s}},h}\right) _{\varGamma _{2,\bar{h}}}. \end{aligned}$$

From Lemma 5, we know that \(\varphi _{\mathrm {s}}\) is regular in \(\varOmega _3{\setminus }\bar{\varOmega }_2\) for some \(\varOmega _3\subset \subset \varOmega \) such that \(\varGamma _{2,\bar{h}}\subset \varOmega _3{\setminus }\bar{\varOmega }_2\), so we use interpolation error estimates (see e.g. [15, Theorem 17.2]) and Lemma 7 for the first term. For the second one we also use Lemma 7 and the uniform estimate for Green functions [28, Theorem 6.1(i)]. This result is proved for Dirac measures, but the proof is the same (with the obvious changes) for any measure with compact support. We obtain

$$\begin{aligned} (\nabla \phi _h,\nabla I_h \varphi _{{\mathrm {s}}}-\nabla \varphi _{{\mathrm {s}},h})_{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}} \le C h |\log h| \Vert e_h\Vert _{L^2(\varGamma )}\Vert \mu \Vert _{{\mathcal {M}}(\bar{\varOmega }_1)}. \end{aligned}$$

For the first term in (29)

$$\begin{aligned} (\nabla \phi _h,\nabla \varphi _{\mathrm {s}}- \nabla I_h\varphi _{\mathrm {s}})_{\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}}}&\le \Vert \phi _h\Vert _{W^{1,p'}(\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}} ) } \Vert \varphi _{\mathrm {s}}- I_h\varphi _{\mathrm {s}}\Vert _{W^{1,p}(\varOmega {\setminus }\bar{\varOmega }_{2,\bar{h}} ) } \nonumber \\&\le C \Vert e_h\Vert _{W^{1-1/p',p'}(\varGamma )} h\Vert \varphi _{\mathrm {s}}\Vert _{W^{2,p}(\varOmega {\setminus }\varOmega _{2,\bar{h}})} \nonumber \\&\le C \Vert e_h\Vert _{H^{1-1/p'}(\varGamma )} h\Vert \mu \Vert _{{\mathcal {M}}(\bar{\varOmega }_1)} \nonumber \\&\le C h^{1/p'-1}\Vert e_h\Vert _{L^{2}(\varGamma )} h \Vert \mu \Vert _{{\mathcal {M}}(\bar{\varOmega }_1)} \nonumber \\&= C h^{1-1/p}\Vert e_h\Vert _{L^{2}(\varGamma )}\Vert \mu \Vert _{{\mathcal {M}}(\bar{\varOmega }_1)}. \end{aligned}$$
(30)

Collecting all the estimates, the proof is complete. The regular part is easier, since we can define \(\phi _h\in Y_h\) and the second term in (29) does not appear. \(\square \)

To obtain error estimates, we will follow two different methods of proof for problems with pure state constraints and problems with additional control constraints. We discuss the main differences of these methods along with the advantages and disadvantages of each of them at the end of the paper.

6.1 No control constraints

The main result of this part is the error estimate proved in Theorem 3. A technical lemma necessary for the proof is provided first.

Lemma 9

Suppose that \(({{\mathbb {P}}})\) has a regular feasible Slater point, \(a,b\in W^{2,p}(\varOmega _1)\) for all \(p<p_\varOmega \) and \(\alpha (x) < \bar{u}(x) < \beta (x)\) for all \(x\in \varGamma \). Let \(\bar{u}\) and \(\bar{u}_h\) be the solutions of problems \(({{\mathbb {P}}})\) and \(({{\mathbb {P}}}_{h})\), respectively, and \(\bar{\mu }\) and \(\bar{\mu }_h\) Lagrange multipliers associated to these solutions. Then

$$\begin{aligned} \left( \partial _n^h\varphi _{{\mathrm {s}},h}\left( \bar{\mu }\right) - \partial _n^h\varphi _{{\mathrm {s}},h}\left( \bar{\mu }_h\right) , \bar{u}-\bar{u}_h\right) _\varGamma \le C h^{2(1-1/p)}\ \forall p<p_\varOmega . \end{aligned}$$

Proof

Using the definition of the \(L^2(\varGamma )\) projection, the definition of the discrete normal derivative, the equalities \(y_h(\bar{u})\equiv \varPi _h\bar{u}\), \(\bar{y}_h\equiv \bar{u}_h\) on \(\varGamma \), the fact that both \(\varphi _{{\mathrm {s}},h}(\bar{\mu }_h),\varphi _{{\mathrm {s}},h}(\bar{\mu })\in Y_{h0}\) and the discrete state equation, we obtain

$$\begin{aligned} \left( \partial _n^h\varphi _{{\mathrm {s}},h}\left( \bar{\mu }\right) -\partial _n^h\varphi _{{\mathrm {s}},h}\left( \bar{\mu }_h\right) ,\bar{u}-\bar{u}_h\right) _\varGamma= & {} \left( \partial _n^h\varphi _{{\mathrm {s}},h}\left( \bar{\mu }\right) -\partial _n^h\varphi _{{\mathrm {s}},h}\left( \bar{\mu }_h\right) ,\varPi _h\bar{u}-\bar{u}_h\right) _\varGamma \\= & {} \left( \nabla \left( \varphi _{{\mathrm {s}},h}\left( \bar{\mu }\right) -\varphi _{{\mathrm {s}},h}\left( \bar{\mu }_h\right) \right) , \nabla \left( y_h\left( \bar{u}\right) -\bar{y}_h\right) \right) \\&- \langle \bar{\mu } -\bar{\mu }_h, y_h\left( \bar{u}\right) -\bar{y}_h \rangle \\= & {} \langle \bar{\mu }_h -\bar{\mu }, y_h\left( \bar{u}\right) -\bar{y}_h \rangle \\= & {} \langle \bar{\mu }^+,\bar{y}_h- y_h\left( \bar{u}\right) \rangle -\langle \bar{\mu }^-,\bar{y}_h- y_h\left( \bar{u}\right) \rangle \\&+\langle \bar{\mu }_h^+,y_h\left( \bar{u}\right) -\bar{y}_h\rangle -\langle \bar{\mu }_h^-,y_h\left( \bar{u}\right) -\bar{y}_h\rangle . \end{aligned}$$

For the first two addends we use that \(\bar{y}=b\) on \({{\mathrm{supp}}}\,\bar{\mu }^+\), \(\bar{y} = a\) on \({{\mathrm{supp}}}\,\bar{\mu }^-\), \(I_ha \le \bar{y}_h \le I_hb\), and the estimates for the interpolation error to obtain

$$\begin{aligned}&\langle \bar{\mu }^+,\bar{y}_h- y_h(\bar{u}) \rangle -\langle \bar{\mu }^-,\bar{y}_h- y_h(\bar{u})\rangle \\&\quad \le \langle \bar{\mu }^+,b- y_h(\bar{u}) \rangle + \langle \bar{\mu }^+,I_hb - b \rangle + \langle \bar{\mu }^-,-a+ y_h(\bar{u}) \rangle + \langle \bar{\mu }^-,a - I_ha \rangle \\&\quad = \langle \bar{\mu }^+,b- \bar{y} \rangle - \langle \bar{\mu }^-,a- \bar{y} \rangle + \langle \bar{\mu }^+-\bar{\mu }^-,\bar{y}- y_h(\bar{u}) \rangle \\&\quad \quad + \langle \bar{\mu }^+,I_hb - b \rangle + \langle \bar{\mu }^-,a - I_ha \rangle \\&\quad = \langle \bar{\mu },\bar{y}- y_h(\bar{u}) \rangle \ + \langle \bar{\mu }^+,I_hb - b \rangle + \langle \bar{\mu }^-,a - I_ha \rangle \\&\quad \le \Vert \bar{\mu }\Vert _{\mathcal { M}(\bar{\varOmega }_1)}\Big (\Vert \bar{y}- y_h(\bar{u})\Vert _{L^\infty (\varOmega _1)} \\&\quad \quad + Ch^{2-2/p}(\Vert a\Vert _{W^{2,p}(\varOmega _1)} + \Vert b\Vert _{W^{2,p}(\varOmega _1)})\Big )\ \forall p<p_\varOmega . \end{aligned}$$

To finish, we use that \(\bar{y}_h=b\) on \({{\mathrm{supp}}}\bar{\mu }^+_h\), \(\bar{y}- b\le 0\), \(\bar{y}_h=a\) on \({{\mathrm{supp}}}\bar{\mu }^-_h\), and \(\bar{y}- a\ge 0\) to obtain

$$\begin{aligned}&\langle \bar{\mu }_h^+,y_h(\bar{u})-\bar{y}_h \rangle -\langle \bar{\mu }_h^-,y_h(\bar{u}) -\bar{y}_h\rangle \\&\quad = \langle \bar{\mu }_h^+,y_h(\bar{u})-b \rangle -\langle \bar{\mu }_h^-,y_h(\bar{u}) -a\rangle \\&\quad =\langle \bar{\mu }_h^+,y_h(\bar{u})-\bar{y} \rangle +\langle \bar{\mu }_h^+,\bar{y}-b \rangle -\langle \bar{\mu }_h^-,y_h(\bar{u}) -\bar{y}\rangle -\langle \bar{\mu }_h^-,\bar{y} -a\rangle \\&\quad \le \langle \bar{\mu }_h,y_h(\bar{u})-\bar{y} \rangle \\&\quad \le \Vert \bar{\mu }_h\Vert _{\mathcal { M}(\bar{\varOmega }_1)}\Vert \bar{y}- y_h(\bar{u})\Vert _{L^\infty (\varOmega _1)}. \end{aligned}$$

All together we arrive at

$$\begin{aligned}&(\partial _n^h\varphi _{{\mathrm {s}},h}(\bar{\mu })-\partial _n^h\varphi _{{\mathrm {s}},h}(\bar{\mu }_h),\bar{u}-\bar{u}_h)_\varGamma \\&\quad \le (\Vert \bar{\mu }\Vert _{{\mathcal {M}}(\varOmega _1)} + \Vert \bar{\mu }_h\Vert _{{\mathcal {M}}(\varOmega _1)} )\Vert \bar{y}-y_h(\bar{u})\Vert _{L^\infty (\varOmega _1)}\\&\quad \quad + C h^{2-2/p}(\Vert a\Vert _{W^{2,p}(\varOmega _1)} +\Vert b\Vert _{W^{2,p}(\varOmega _1)})\qquad \forall p<p_\varOmega . \end{aligned}$$

Thanks to the boundedness of \(\bar{\mu }_h\) proved in Lemma 6, it only remains to estimate \(\Vert \bar{y}-y_h(\bar{u})\Vert _{L^\infty (\varOmega _1)}\). We use the interior error estimates of [28, Theorem 5.1], interpolation error estimates, finite element error estimates for non-regular problems (cf. [6, Theorem (12.3.5)]), the interior regularity results of Lemmas 1 and 2, and the regularity of the optimal state from Corollary 1. For any open set \(\varOmega _2\) such that \(\varOmega _1\subset \subset \varOmega _2\subset \subset \varOmega \), all \(p<p_\varOmega \) and \(s<\min \{3,s_\varOmega \}\) we obtain

$$\begin{aligned} \Vert \bar{y}-y_h(\bar{u})\Vert _{L^\infty (\varOmega _1)}&\le C \big ( |\log h| \Vert \bar{y}-I_h\bar{y}\Vert _{L^\infty (\varOmega _2)} + \Vert \bar{y}-y_h(\bar{u})\Vert _{L^2(\varOmega _2)}\big )\nonumber \\&\le C \big ( |\log h| h^{2-2/p} \Vert \bar{y}\Vert _{W^{2,p}(\varOmega _2)} + h^{s-1}\Vert \bar{y}\Vert _{H^{s-1}(\varOmega )}\big )\nonumber \\&\le C \big ( |\log h| h^{2-2/p} \Vert \bar{u}\Vert _{W^{1-1/p,p}(\varGamma )} + h^{s-1}\Vert \bar{u}\Vert _{H^{s-3/2}(\varGamma )}\big ). \end{aligned}$$
(31)

Choosing \(s=3-2/p\) (which is smaller than both 3 and \(s_\varOmega \)), the proof is complete. Since the result is valid for all \(p<p_\varOmega \), the \(|\log h|\) term can be absorbed by slightly reducing \(p\). \(\square \)

Theorem 3

Let \(\bar{u}\) and \(\bar{u}_h\) be the solutions of problems \(({{\mathbb {P}}})\) and \(({{\mathbb {P}}}_{h})\), respectively, and suppose that \(({{\mathbb {P}}})\) has a regular feasible Slater point, Assumption (H) is satisfied, \(a,b\in W^{2,p}(\varOmega _1)\) for all \(p<p_\varOmega \) and \(\alpha (x) < \bar{u}(x) < \beta (x)\) for all \(x\in \varGamma \). Then there exists some \(h_0>0\) and \(C>0\) such that for all \(0<h<h_0\)

$$\begin{aligned} \Vert \bar{u}-\bar{u}_h\Vert _{L^2(\varGamma )}\le C h^{1-1/p}\ \forall p<p_\varOmega . \end{aligned}$$

Proof

Since \(J_h\) is quadratic, we can write

$$\begin{aligned}&\nu \Vert \bar{u}-\bar{u}_h\Vert ^2_{L^2(\varGamma )}\le J_h''(u_\xi )(\bar{u}-\bar{u}_h)^2 = J_h'(\bar{u})(\bar{u}-\bar{u}_h)-J'_h(\bar{u}_h)(\bar{u}-\bar{u}_h)\\&\quad =(-\partial _n^h\varphi _{{\mathrm {r}},h}(\bar{u}) +\nu \bar{u},\bar{u}-\bar{u}_h)_\varGamma -(-\partial _n^h\varphi _{{\mathrm {r}},h}(\bar{u}_h) +\nu \bar{u}_h,\bar{u}-\bar{u}_h)_\varGamma \end{aligned}$$

with some \(u_\xi =\bar{u}_h+\xi (\bar{u}-\bar{u}_h)\), \(0\le \xi \le 1\). Inserting the term \(\pm \partial _n\varphi _{{\mathrm {r}}}(\bar{u})\) and taking into account that, in the absence of control constraints, the first order optimality conditions read

$$\begin{aligned} \nu \bar{u} -\partial _n\varphi _{{\mathrm {r}}}(\bar{u})-\partial _n\varphi _{{\mathrm {s}}}(\bar{\mu })&= 0 \end{aligned}$$
(32)
$$\begin{aligned} \nu \bar{u}_h -\partial _n^h\varphi _{{\mathrm {r}},h}(\bar{u}_h)-\partial _n^h\varphi _{{\mathrm {s}},h}(\bar{\mu }_h)&= 0, \end{aligned}$$
(33)

we get to

$$\begin{aligned} \nu \Vert \bar{u}-\bar{u}_h\Vert ^2_{L^2(\varGamma )}&\le ( \partial _n\varphi _{{\mathrm {s}}}(\bar{\mu }) -\partial ^h_n\varphi _{{\mathrm {s}},h}(\bar{\mu }_h) ,\bar{u}-\bar{u}_h)_\varGamma \nonumber \\&\quad +(\partial _n\varphi _{{\mathrm {r}}}(\bar{u}) -\partial _n^h\varphi _{{\mathrm {r}},h}(\bar{u}),\bar{u}-\bar{u}_h)_\varGamma \nonumber \\&= ( \partial _n\varphi _{{\mathrm {s}}}(\bar{\mu })-\partial ^h_n\varphi _{{\mathrm {s}},h}(\bar{\mu }) ,\bar{u}-\bar{u}_h)_\varGamma \nonumber \\&\quad + ( \partial ^h_n\varphi _{{\mathrm {s}},h}(\bar{\mu })-\partial ^h_n\varphi _{{\mathrm {s}},h}(\bar{\mu }_h),\bar{u}-\bar{u}_h)_\varGamma \nonumber \\&\quad +(\partial _n\varphi _{{\mathrm {r}}}(\bar{u}) -\partial _n^h\varphi _{{\mathrm {r}},h}(\bar{u}),\bar{u}-\bar{u}_h)_\varGamma . \end{aligned}$$
(34)

The result then follows from (34) and Lemmas 8 and 9. \(\square \)

6.2 Control constrained case

We hence provide a proof different from the one given for the case without control constraints, using a technique similar to that followed by Meyer in [26] or by Rösch and Steinig in [27], and we show an order of convergence of \({\mathcal {O}}(h^{3/4-1/(2p)})\). Before stating and proving the main result of this section, we collect some auxiliary results. We begin with the error estimates for the \(L^2(\varGamma )\) projection.

Lemma 10

The \(L^2\)-projection \(\varPi _h\) fulfills the projection error estimates

$$\begin{aligned} \Vert u-\varPi _hu\Vert _{L^2(\varGamma )}\le Ch^{1-1/p}\Vert u\Vert _{ W^{1-1/p,p}(\varGamma )} \end{aligned}$$
(35)

as well as

$$\begin{aligned} \Vert u-\varPi _h u\Vert _{H^{-1/2}(\varGamma )}\le Ch^{3/2-1/p} \Vert u\Vert _{ W^{1-1/p,p}(\varGamma )} \end{aligned}$$
(36)

for all \(u \in W^{1-1/p,p}(\varGamma )\) and all \(p<+\infty \).

Proof

Estimate (35) follows from [14, Theorem 2.1] and usual interpolation error estimates. The proof of estimate (36) is a bit more delicate. It involves a duality argument that relies on the approximation property (19).

To shorten notation let us define \({\mathcal {F}}= \{v\in H^{1/2}(\varGamma ):\Vert v\Vert _{H^{1/2}(\varGamma )}=1\}.\) Using the definition of \(\varPi _h\) and of the dual norm we may write

$$\begin{aligned} \Vert u-\varPi _h u\Vert _{H^{-1/2}(\varGamma )}= & {} \sup _{v\in {\mathcal {F}}}\langle u-\varPi _h u,v\rangle _{ H^{-1/2}(\varGamma ), H^{1/2}(\varGamma )} \\= & {} \sup _{v\in {\mathcal {F}}}( u-\varPi _h u,v) \\= & {} \sup _{v\in {\mathcal {F}}}( u-\varPi _h u,v-\varPi _h v) \\\le & {} \sup _{v\in {\mathcal {F}}}\Vert u-\varPi _h u\Vert _{L^2(\varGamma )}\Vert v-\varPi _h v\Vert _{L^2(\varGamma )} \\\le & {} \sup _{v\in {\mathcal {F}}} c h^{1-1/p}\Vert u\Vert _{W^{1-1/p,p}(\varGamma )} h^{1/2} \Vert v\Vert _{H^{1/2}(\varGamma )} \\= & {} c h^{3/2-1/p} \Vert u\Vert _{W^{1-1/p,p}(\varGamma )}, \end{aligned}$$

and the proof is complete.\(\square \)

The reader may compare (36) with [26, Eq. (4.2)] or [27, Eq. (3.4)] and wonder why we have not used the norm of \(W^{1-1/p,p}(\varGamma )^*\) instead of the norm in \(H^{-1/2}(\varGamma )\), which would have led to an estimate of order \(h^{2-2/p}\). The reason is that we will need the continuity of the solution operator into \(L^2(\varOmega )\) in (43), and this is not possible for data in \(W^{1-1/p,p}(\varGamma )^*\).

Let us now make precise the meaning of the state equation for data \(u\in H^{-1/2}(\varGamma )\). We will say that \(y=Su\) if

$$\begin{aligned} \int _\varOmega y \varDelta z\, dx = \langle u,\partial _n z\rangle _{H^{-1/2}(\varGamma ), H^{1/2}(\varGamma )}\ \forall z\in H^2(\varOmega )\cap H^1_0(\varOmega ). \end{aligned}$$

Since \(z=0\) on \(\varGamma \), \(\partial _n z\in H^{1/2}(\varGamma )\) and the definition makes sense (see [11, Lemma A.2]).
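In one space dimension this very weak (transposition) formulation can be checked symbolically. The following toy sketch takes \((0,1)\) in place of \(\varOmega \) with a polynomial test function \(z\) vanishing at both endpoints (an illustrative assumption, not the paper's setting); the outward normal derivative is \(-z'(0)\) at the left endpoint and \(z'(1)\) at the right one:

```python
import sympy as sp

x, u0, u1 = sp.symbols('x u0 u1')   # u0, u1: Dirichlet data (assumed symbols)
y = u0 + (u1 - u0) * x              # harmonic in 1-D with y(0)=u0, y(1)=u1
z = x * (1 - x) * (1 + x**2)        # test function with z(0)=z(1)=0
# left side of the transposition identity: integral of y * z''
lhs = sp.integrate(y * sp.diff(z, x, 2), (x, 0, 1))
# right side: duality pairing <u, dz/dn> on the two boundary points
rhs = u0 * (-sp.diff(z, x).subs(x, 0)) + u1 * sp.diff(z, x).subs(x, 1)
assert sp.simplify(lhs - rhs) == 0  # identity holds for all u0, u1
```

The identity holds for arbitrary symbolic boundary data, mirroring how the definition above determines \(y\) uniquely through all test functions \(z\).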

Lemma 11

The control-to-state mapping \(S u = y_u\) is well defined and continuous from \(H^{-1/2}(\varGamma )\) to \(L^2(\varOmega )\). For any open set \(\varOmega '\subset \subset \varOmega \), it is also continuous from \(H^{-1/2}(\varGamma )\) to \(C(\bar{\varOmega }')\).

Proof

The proof of the first part follows the usual duality argument. To shorten notation, let us denote \({\mathcal {F}}=\{f\in L^{2}(\varOmega ):\Vert f\Vert _{L^{2}(\varOmega )}=1\}\) and for every \(f\in L^2(\varOmega )\), let z be the unique element in \(H^2(\varOmega )\cap H^1_0(\varOmega )\) such that \(-\varDelta z = f\) in \(\varOmega \). Then

$$\begin{aligned} \Vert y\Vert _{L^2(\varOmega )}&= \sup _{f\in {\mathcal {F}}}\int _\varOmega yf dx = \sup _{f\in {\mathcal {F}}} -\langle u,\partial _n z\rangle _{H^{-1/2}(\varGamma ), H^{1/2}(\varGamma )}\\&\le \sup _{f\in {\mathcal {F}}} \Vert u\Vert _{H^{-1/2}(\varGamma )} \Vert \partial _n z\Vert _{H^{1/2}(\varGamma )}\\&\le \sup _{f\in {\mathcal {F}}} C \Vert u\Vert _{H^{-1/2}(\varGamma )} \Vert f\Vert _{L^{2}(\varOmega )} = C \Vert u\Vert _{H^{-1/2}(\varGamma )}. \end{aligned}$$

The interior regularity can be proven using arguments similar to those of Lemma 3 and a bootstrap argument as in Lemma 5. \(\square \)

Lemma 12

Suppose that \(({{\mathbb {P}}})\) has a regular feasible Slater point and that Assumption (H) is satisfied. Then the sequence of discrete optimal controls \(\bar{u}_h\) of Problem \(({{\mathbb {P}}}_{h})\) is bounded in the \(W^{1-1/p,p}(\varGamma )\)-norm independently of h for all \(p<p_\varOmega \).

Proof

For the proof we refer to [13, Theorem 6.2]. This proof is based on the stability of the \(L^2(\varGamma )\)-projections in \(W^{1-1/p,p}(\varGamma )\) stated in [14] and can be adapted with the obvious changes starting with our Lemma 8. \(\square \)

Remark 4

The reader may have noticed that in this subsection we are using the \(W^{1-1/p,p}(\varGamma )\)-regularity of the optimal control proved in (17) instead of the \(H^{s-3/2}(\varGamma )\)-regularity proved in (18). The reason is that we are not able to prove uniform boundedness of the discrete optimal controls in \(H^{s-3/2}(\varGamma )\). To repeat the sketch of the proof of [13, Theorem 6.2] for that norm, we would need an exponent \(s-3/2\) in (30), instead of \(1-1/p\), which we do not have. In that case, we would eventually obtain an order of convergence of \(h^{1-1/p}\) in Theorem 4, instead of \(h^{3/4-1/(2p)}\), and there would be no need to write a separate proof in Sect. 6.1 for the no-control-constraints case.

Lemma 13

Suppose that \(({{\mathbb {P}}})\) has a regular feasible Slater point. Let \(\bar{u}_h\) be the optimal control of \(({\mathbb {P}}_h)\). There exists a sequence \(u^*=u^*(h)\) of controls, uniformly bounded in \(W^{1-1/p,p}(\varGamma )\) for all \(p<p_\varOmega \), that are feasible for \(({\mathbb {P}})\), and a constant \(C>0\) independent of h such that

$$\begin{aligned} \Vert \bar{u}_h-u^*(h)\Vert _{H^{-1/2}(\varGamma )}\le C h^{3/2-1/p}\ \forall p<p_\varOmega . \end{aligned}$$
(37)

Proof

For \(h>0\) consider \(u_{h0}=\varPi _h u_0\) the discrete Slater point introduced in Theorem 2. For \(\kappa =\kappa (h)\) to be determined define the auxiliary control

$$\begin{aligned} u^*=\bar{u}_h+\kappa (u_{h0}-\bar{u}_h). \end{aligned}$$

The boundedness of the sequence \(u^*\) follows directly from Lemma 12 and the stability of \(\varPi _h\) stated in [14, Theorem 2.3]. Then, clearly, the error \(\Vert \bar{u}_h-u^*\Vert _{H^{-{1}/{2}}(\varGamma )}\) is determined by \(\kappa (h)\). Notice, for instance,

$$\begin{aligned} u^*=(1-\kappa )\bar{u}_h+\kappa u_{h0}\le (1-\kappa )\beta +\kappa (\beta -\delta _h)=\beta -\kappa \delta _h\le \beta , \end{aligned}$$

where \(\delta _h\) is introduced in Definition 2. Repeating these calculations for the lower bound results in feasibility of \(u^*\) with respect to the control constraints. To check feasibility regarding the state constraints, observe that in \(\bar{\varOmega }_1\) we have

$$\begin{aligned} {y_{u^*}}&=y_h\left( u^*\right) +{y_{u^*}}-y_h\left( u^*\right) \\&\le (1-\kappa ){\bar{y}_h}+\kappa y_h(u_{h0})+\Vert {y_{u^*}}-y_h\left( u^*\right) \Vert _{L^\infty (\varOmega _1)}. \end{aligned}$$

Similar to (31), we obtain with [13, Theorem 5.4]

$$\begin{aligned}&\Vert {y_{u^*}}-y_h\left( u^*\right) \Vert _{L^\infty (\varOmega _1)}\nonumber \\&\quad \le C \big ( |\log h| \Vert {y_{u^*}}-I_h{y_{u^*}}\Vert _{L^\infty (\varOmega _2)} + \Vert {y_{u^*}}-y_h\left( u^*\right) \Vert _{L^2(\varOmega )}\big )\nonumber \\&\quad \le C \big ( |\log h| h^{2-2/p} \Vert {y_{u^*}}\Vert _{W^{2,p}(\varOmega _2)} + h^{3/2-1/p}\Vert u^*\Vert _{{ W^{1-1/p,p}(\varGamma )}}\big )\nonumber \\&\quad \le C \big ( |\log h| h^{2-2/p} \Vert u^*\Vert _{ W^{1-1/p,p}(\varGamma )}+ h^{3/2-1/p}\Vert u^*\Vert _{ W^{1-1/p,p}(\varGamma )}\big )\nonumber \\&\quad \le Ch^{3/2-1/p}\Vert u^*\Vert _{ W^{1-1/p,p}(\varGamma )}. \end{aligned}$$
(38)

Taking into account that \(u^*(h)\) is bounded in \(W^{1-1/p,p}(\varGamma )\), all the estimates yield

$$\begin{aligned} {y_{u^*}}&=y_h\left( u^*\right) +{y_{u^*}}-y_h\left( u^*\right) \\ {}&\le (1-\kappa )b+\kappa (b-\varepsilon _h)+Ch^{3/2-1/p}\\&\le b-\kappa \varepsilon _h+Ch^{3/2-1/p}. \end{aligned}$$

Noting that, for h small enough, \(\varepsilon _h>0\) can be bounded from below independently of h (cf. Theorem 2), we obtain feasibility with respect to the upper bound for \(\kappa =Ch^{3/2-1/p}/\varepsilon _h\). Analogous calculations for the lower bound, together with this definition of \(\kappa =\kappa (h)\), yield the assertion, including the required error estimate. \(\square \)
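The bookkeeping in this proof reduces to a one-line convexity computation. The following scalar caricature (all numbers are illustrative assumptions, and linearity of the control-to-state map stands in for the discrete setting) shows how the choice of \(\kappa \) restores feasibility:

```python
# Scalar caricature of the Slater-point correction (illustrative values only)
h, p = 0.01, 4.0
b = 1.0                               # upper state bound
eps = 0.5                             # Slater margin: y(u_h0) <= b - eps
violation = h ** (1.5 - 1.0 / p)      # state may overshoot b by O(h^{3/2-1/p})
y_ubar = b + violation                # worst-case state of the discrete optimum
y_slater = b - eps
kappa = violation / eps               # kappa = C h^{3/2-1/p} / eps
# state of u* = (1 - kappa) u_bar_h + kappa u_h0, by linearity of u -> y_u
y_star = (1 - kappa) * y_ubar + kappa * y_slater
assert y_star <= b                    # feasibility restored
```

Since \(y^* = b - \text{violation}^2/\varepsilon \le b\), the corrected control is feasible, and the size of the correction is exactly \(\kappa = O(h^{3/2-1/p})\), which is what drives estimate (37).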

Lemma 14

Suppose that \(({{\mathbb {P}}})\) has a regular feasible Slater point. Let \(\bar{u}\) be the optimal control of \(({\mathbb {P}})\). There exists a sequence \(u^*_h\) of controls, uniformly bounded in \(W^{1-1/p,p}(\varGamma )\) for all \(p<p_\varOmega \), that are feasible for \(({\mathbb {P}}_h)\) and a constant \(C>0\) independent of h such that

$$\begin{aligned} \Vert \bar{u}-u^*_h\Vert _{L^2(\varGamma )}\le & {} C h^{1-1/p}\ \forall p<p_\varOmega , \end{aligned}$$
(39)
$$\begin{aligned} \Vert \bar{u}-u^*_h\Vert _{H^{-1/2}(\varGamma )}\le & {} C h^{3/2-1/p}\ \forall p<p_\varOmega . \end{aligned}$$
(40)

Proof

The proof is similar to the one of Lemma 13. Define

$$\begin{aligned} u^*_h=\varPi _h\bar{u}+\kappa (u_{h0}-\varPi _h\bar{u}), \end{aligned}$$

and note that \(u_{h0}=\varPi _h u_0\). The boundedness of the sequence \(u_h^*\) follows again directly from the stability of \(\varPi _h\) stated in [14, Theorem 2.3]. For \(\kappa \) sufficiently small, \(u_h^*\in U_{\alpha ,\beta ,h}\) is readily verified. To check the state constraints in the interior of \(\varOmega \), we combine the projection error estimate from Lemma 10 with the interior regularity result of Lemma 11 and estimate (36), together with the interior \(L^\infty \)-error estimate for the state, which is obtained as in the proof of the previous lemma, see (38). We obtain

$$\begin{aligned} y_h(u_h^*)= & {} {y_{u_h^*}}+y_h(u_h^*)-{y_{u_h^*}}\nonumber \\= & {} (1-\kappa )\bar{y}+\kappa {y_{u_0}}+(1-\kappa )y_{(\varPi _h\bar{u}-\bar{u})}\nonumber \\&+\,\kappa y_{(\varPi _hu_0-u_0)}+y_h(u_h^*)-{y_{u_h^*}}\nonumber \\\le & {} (1-\kappa )b+\kappa (b-\varepsilon ) +C(1-\kappa )h^{3/2-1/p}\Vert \bar{u}\Vert _{W^{1-1/p,p}(\varGamma )} \nonumber \\&+\,C\kappa h^{3/2-1/p}\Vert u_0\Vert _{W^{1-1/p,p}(\varGamma )} +Ch^{3/2-1/p} \Vert u_h^*\Vert _{W^{1-1/p,p}(\varGamma )}\nonumber \\\le & {} b-\kappa \varepsilon +Ch^{3/2-1/p}, \end{aligned}$$
(41)

and thus we may choose \(\kappa =Ch^{3/2-1/p}/\varepsilon \). To obtain the estimates (39) and (40), we use \(\kappa \le C h^{3/2-1/p}\) together with (35) and (36), respectively:

$$\begin{aligned} \Vert u_h^*-\bar{u}\Vert _{L^2(\varGamma )}\le \Vert \varPi _h\bar{u}-\bar{u}\Vert _{L^2(\varGamma )}+Ch^{3/2-1/p}\Vert u_{h0}-\varPi _h\bar{u}\Vert _{L^2(\varGamma )}\le Ch^{1-1/p} \end{aligned}$$

as well as

$$\begin{aligned} \Vert u_h^*-\bar{u}\Vert _{H^{-1/2}(\varGamma )}&\le \Vert \varPi _h\bar{u}-\bar{u}\Vert _{H^{-1/2}(\varGamma )}+Ch^{3/2-1/p}\Vert u_{h0}-\varPi _h\bar{u}\Vert _{H^{-1/2}(\varGamma )}\\&\le Ch^{3/2-1/p}. \end{aligned}$$

\(\square \)

Lemma 15

There exists \(C>0\) such that the following estimate holds:

$$\begin{aligned} \Vert {y_{\bar{u}_h}}-{\bar{y}_h}\Vert _{L^2(\varOmega )}+\Vert {y_{u_h^*}}-y_h(u_h^*)\Vert _{L^2(\varOmega )}\le C h^{3/2-1/p}\ \forall p<p_\varOmega . \end{aligned}$$
(42)

Proof

The assertion follows from the error estimate for semilinear equations in [13, Theorem 5.4] and the uniform bounds stated in Lemmas 12 and 14:

$$\begin{aligned} \Vert {y_{\bar{u}_h}}-{\bar{y}_h}\Vert _{L^2(\varOmega )}&+\Vert {y_{u_h^*}}-y_h(u_h^*)\Vert _{L^2(\varOmega )}\\&\le Ch^{3/2-1/p}\left( \Vert \bar{u}_h\Vert _{W^{1-1/p,p}(\varGamma )}+\Vert u_h^*\Vert _{W^{1-1/p,p}(\varGamma )}\right) \\&\le Ch^{3/2-1/p}. \end{aligned}$$

\(\square \)

With the preceding results, we are now in the position to prove our error estimates in the control-constrained case.

Theorem 4

Let \(\bar{u}\) and \(\bar{u}_h\) be the solutions of problems \(({{\mathbb {P}}})\) and \(({{\mathbb {P}}}_{h})\), respectively, and suppose that \(({{\mathbb {P}}})\) has a regular feasible Slater point and Assumption (H) is satisfied. Then there exists some \(h_0>0\) and \(C>0\) such that for all \(0<h<h_0\)

$$\begin{aligned} \Vert \bar{u}-\bar{u}_h\Vert _{L^2(\varGamma )}\le C h^{\frac{3}{4}-\frac{1}{2p}}\ \forall p<p_\varOmega . \end{aligned}$$

Proof

We follow closely the technique of proof in [26], Lemma 7 and Theorem 3. We use the auxiliary controls \(u^*\) and \(u_h^*\) from Lemmas 13 and 14, that are feasible for Problems \(({{\mathbb {P}}})\) and \(({{\mathbb {P}}}_{h})\), respectively, to test the variational inequalities for \(({{\mathbb {P}}})\) and \(({{\mathbb {P}}}_{h})\). This leads to

[Display: the variational inequalities of \(({{\mathbb {P}}})\) and \(({{\mathbb {P}}}_{h})\) tested with \(u_h^*\) and \(u^*\), respectively.]

where the Lagrange multiplier terms disappear because of the feasibility of \(u^*\) and \(u_h^*\) with respect to the state constraints. Adding both inequalities, straightforward computations lead to

$$\begin{aligned} 0\le&-\nu \Vert \bar{u}-\bar{u}_h\Vert ^2_{L^2(\varGamma )}+\nu \left( (\bar{u},u^*-\bar{u}_h)_\varGamma +(\bar{u},u_h^*-\bar{u})_\varGamma +(\bar{u}_h-\bar{u},u_h^*-\bar{u})_\varGamma \right) \\&-\Vert \bar{y}_h-\bar{y}\Vert ^2_{L^2(\varOmega )}+({\bar{y}_h}-{\bar{y}},\ y_h(u_h^*)-{y_{u_h^*}}+{y_{u_h^*}}-{\bar{y}})\\&+({\bar{y}}-y_d,\ {y_{u^*}}-{y_{\bar{u}_h}}+{y_{u_h^*}}-{\bar{y}}+{y_{\bar{u}_h}}-{\bar{y}_h}+y_h(u_h^*)-{y_{u_h^*}}). \end{aligned}$$

Rearranging terms and estimating the right-hand side of the last inequality further, we arrive at

$$\begin{aligned}&\frac{\nu }{2}\Vert \bar{u}-\bar{u}_h\Vert ^2_{L^2(\varGamma )}+\frac{1}{2}\Vert \bar{y}-\bar{y}_h\Vert ^2_{L^2(\varOmega )}\\&\quad \le \nu \Vert \bar{u}\Vert _{H^{1/2}(\varGamma ) } \left( \Vert u^*-\bar{u}_h\Vert _{H^{-1/2}(\varGamma )} +\Vert u_h^*-\bar{u}\Vert _{H^{-1/2}(\varGamma )} \right) \\&\qquad + \frac{\nu }{2} \Vert u_h^*-\bar{u}\Vert ^2_{L^2(\varGamma )} + \Vert y_h(u_h^*)-{y_{u_h^*}}\Vert ^2_{L^2(\varOmega )} + \Vert {y_{u_h^*}}-{\bar{y}}\Vert ^2_{L^2(\varOmega )}\\&\qquad + \Vert {\bar{y}}-y_d\Vert _{L^2(\varOmega )} \left( \Vert {y_{u^*}}-{y_{\bar{u}_h}}\Vert _{L^2(\varOmega )} +\Vert {y_{u_h^*}}-{\bar{y}}\Vert _{L^2(\varOmega )}\right) \\&\qquad + \Vert {\bar{y}}-y_d\Vert _{L^2(\varOmega )} \left( \Vert {y_{\bar{u}_h}}-{\bar{y}_h}\Vert _{L^2(\varOmega )} +\Vert y_h(u_h^*)-{y_{u_h^*}}\Vert _{L^2(\varOmega )}\right) \\&\quad \le \frac{\nu }{2}\Vert u_h^*-\bar{u}\Vert ^2_{L^2(\varGamma )} + C \Vert u_h^*-\bar{u}\Vert ^2_{H^{-1/2}(\varGamma )} + \Vert {y_{u_h^*}}-y_h(u_h^*)\Vert ^2_{L^2(\varOmega )}\\&\qquad + \left( \nu \Vert \bar{u}\Vert _{H^{1/2}(\varGamma )} +C\Vert {\bar{y}}-y_d\Vert _{L^2(\varOmega )}\right) \\&\qquad \qquad \left( \Vert u^*-\bar{u}_h\Vert _{H^{-1/2}(\varGamma )} +\Vert u_h^*-\bar{u}\Vert _{H^{-1/2}(\varGamma )}\right) \\&\qquad + \Vert {\bar{y}}-y_d\Vert _{L^2(\varOmega )}\left( \Vert {y_{\bar{u}_h}}-{\bar{y}_h}\Vert _{L^2(\varOmega )} +\Vert y_h(u_h^*)-{y_{u_h^*}}\Vert _{L^2(\varOmega )}\right) , \end{aligned}$$

where we applied in particular Young’s inequality, the Cauchy–Schwarz inequality, as well as the estimate

$$\begin{aligned} \Vert {y_{u_h^*}}-{\bar{y}}\Vert _{L^2(\varOmega )}\le C\Vert u_h^*-\bar{u}\Vert _{H^{-1/2}(\varGamma )}. \end{aligned}$$
(43)

This estimate follows from Lemma 11.

We now use estimates (37), (39), (40) and (42). Collecting all estimates yields the assertion after taking the square root. \(\square \)

6.3 Comparison between the two methods of proof

Let us end this manuscript with a short comment on the different methods of proof in Sects. 6.1 and 6.2. If we tried to write the proof of Sect. 6.2 for the non-control-constrained case and wanted to obtain the order \({\mathcal {O}}(h^{1-1/p})\) of Sect. 6.1, we would need to use the norm in \(H^{s-3/2}(\varGamma )\) (\(s<3,s<s_\varOmega \)) instead of the norm in \(W^{1-1/p,p}(\varGamma )\). Indeed, the optimal control has that regularity, which would improve the error for the \(L^2\)-projection, estimate (36). But to improve the FEM estimates (38), (41), and (42) using the same technique as in (31), we would need the norm in \(H^{s-3/2}(\varGamma )\) of the discrete optimal controls to be bounded, as stated in Lemma 12 for the norm in \(W^{1-1/p,p}(\varGamma )\). To obtain that bound, we would have to prove stability of \(\varPi _h\) in \(H^{s-3/2}(\varGamma )\) (this is not proved in [14], but it can be proven with the same technique used therein) and an error estimate for the approximation of the adjoint state analogous to that of Lemma 8. The key point, as already mentioned in Remark 4, is that we are not able to improve the order of convergence \({\mathcal {O}}(h^{1-1/p})\) in (30), so the subsequent argument in the proof of Theorem 6.2 in [13], which eventually uses an inverse estimate, would not lead to the desired result.

On the other hand, to adapt the method of Sect. 6.1 to the control-constrained case, we have to use the inequality form of the first order necessary conditions (16) and (22d) instead of (32) and (33). One idea to compare both inequalities is to use the interpolate introduced by Casas and Raymond, cf. [13, Equation (7.9)], as test function, but this only leads to an order of \({\mathcal {O}}(h^{1/2-1/(2p)})\). The main reason for this is that in the analogue of Lemma 9 we would find the term \(\Vert \bar{y}-y_h(u_h^{{\mathrm {CR}}})\Vert _{L^\infty (\varOmega _1)}\), where \(u_h^{{\mathrm {CR}}}\) is the aforementioned interpolate, which would be bounded by the finite element error estimate plus the interpolation error \(\Vert \bar{u}- u_h^{{\mathrm {CR}}}\Vert _X\) in some appropriate norm. The finite element error is of order \({\mathcal {O}}(h^{3/2-1/p})\) [in contrast to the unconstrained case, where it is \({\mathcal {O}}(h^{2-2/p})\) due to the higher regularity of the control, as shown in (31)], but the interpolation error \(\Vert \bar{u}- u_h^{{\mathrm {CR}}}\Vert _{L^2(\varGamma )}\) is of order \({\mathcal {O}}(h^{1-1/p})\) (cf. [13, Lemma 7.5]). To obtain a final order of \({\mathcal {O}}(h^{3/4-1/(2p)})\), it would be enough to prove that \(\Vert \bar{u}- u_h^{{\mathrm {CR}}}\Vert _{H^{-1/2}(\varGamma )}\le C h^{3/2-1/p}\), but we have not been able to prove such an estimate. The key difference is that with the technique used in Sect. 6.2 we are able to use the \(L^2\)-projection instead of the Casas and Raymond interpolate, and we obtain an interpolation error in \(H^{-1/2}(\varGamma )\) of order \({\mathcal {O}}(h^{3/2-1/p})\) [see Eq. (36)]. Notice also that we do not need to assume \(a,b\in W^{2,p}(\varOmega _1)\) to obtain the final error estimate in Theorem 4.

7 Numerical experiments

In contrast to the previous works [13, 23], where the authors assumed only \(L^{p}(\varOmega )\) regularity of the data \(y_\varOmega \) to obtain an error estimate of order \(O(h^{1-1/p})\) for the optimal control in the control-constrained and the unconstrained cases, respectively, in the present work we have also supposed \(y_\varOmega \in H^{s-2}(\varOmega )\) to achieve the same \(O(h^{1-1/p})\) error estimate (or even the worse \(O(h^{3/4-1/(2p)})\) estimate in the control-and-state constrained case). This assumption leads to the higher regularity \(\bar{u}\in H^{s-3/2}(\varGamma )\) of the solution, cf. Eq. (18), in the four possible cases: (1) unconstrained, (2) control-constrained, (3) state-constrained and (4) control-and-state constrained. This higher regularity, the existing numerical experiments in the literature, and our own numerical experiment presented below seem to indicate that our estimates are not sharp. We conjecture that the sharp error estimate for regular enough data should be \(O(h^{\min \{1,s-3/2\}})\) for all \(s<s_\varOmega \).

Let \(\varOmega \) be the interior of the convex hull of the points \((-0.5,-0.5)\), \((0.5,-0.5)\), \((0.5, 0)\), \((0, 0.5)\), \((-0.5,0.5)\), and let \(\varOmega _1\) be the open ball centered at \((-0.1,-0.1)\) with radius 0.2. Set \(\beta \equiv 0.16\) and \(b\equiv 0.15\), define \(y_\varOmega \equiv 1\), and consider the regularization parameter \(\nu =1\). We are going to solve the four problems

$$\begin{aligned} (P^1)\min J(u),\ (P^2)\min _{u\le \beta }J(u),\ (P^3)\min _{y\le b} J(u),\ (P^4)\min _{\begin{array}{c}y\le b\\ u\le \beta \end{array}} J(u). \end{aligned}$$

For this domain the largest interior angle is \(\omega =3\pi /4\), so we have \(1-1/p_\varOmega \approx 0.67\) and \(s_\varOmega -3/2\approx 0.83\).

We have solved all four problems starting with a mesh of size \(h_0\) satisfying Assumption (H) and obtaining subsequent meshes of size \(h_j=h_{j-1}/2\) by regular dyadic refinement. We collect the mesh data in Table 1.

Table 1 Mesh data

To solve the control-constrained problems, we have followed a primal-dual active set strategy as described in [3]. For the state-constrained problems, we use a penalization strategy similar to the one described in [20]. The unconstrained problems arising in the optimization procedures have been solved using the preconditioned conjugate gradient method. All the code has been written by the authors in Matlab R2015a and has been run on a PC with Windows 7 SP1 (64 bits), 16 GB RAM, and an Intel(R) Core(TM) i7 CPU 870 @ 2.93 GHz.
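The primal-dual active set strategy of [3] can be sketched for a generic box-constrained quadratic program standing in for the discretized control problem (the matrix, data, and bound below are placeholders, not the authors' actual Matlab code or system):

```python
import numpy as np

def pdas(A, f, beta, c=1.0, maxit=50):
    """Primal-dual active set iteration for min 1/2 u'Au - f'u  s.t.  u <= beta."""
    n = len(f)
    u, lam = np.zeros(n), np.zeros(n)
    active = np.zeros(n, dtype=bool)
    for it in range(maxit):
        new_active = lam + c * (u - beta) > 0   # active-set prediction
        if it > 0 and np.array_equal(new_active, active):
            break                               # active set settled: KKT point
        active, inact = new_active, ~new_active
        u = np.where(active, beta, 0.0)         # clamp active components
        # KKT system A u + lam = f, with lam = 0 on the inactive set
        u[inact] = np.linalg.solve(
            A[np.ix_(inact, inact)],
            f[inact] - A[np.ix_(inact, active)] @ beta[active])
        lam = np.where(active, f - A @ u, 0.0)  # multiplier on the active set
    return u, lam

# tiny example: unconstrained minimizer is [1.5, 0.5], bound beta = 1
u, lam = pdas(2.0 * np.eye(2), np.array([3.0, 1.0]), np.ones(2))
```

For the toy data in the last line, the first component is clipped to the bound (\(u=[1.0, 0.5]\)) with multiplier \(\lambda =[1.0, 0.0]\); the iteration typically terminates in a few steps once the active set stops changing.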

We denote by \(u^i_j\) the approximate solution of problem \((P^i)\) on the mesh of size \(h_j\). For reference and possible double-checking, we quote

$$\begin{aligned} J_h(u^1_6) = 0.347116,\ J_h(u^2_6) = 0.353813,\ J_h(u^3_6) =0.355277,\ J_h(u^4_6) = 0.355292. \end{aligned}$$

Since we do not know the analytic solution \(u^i\) of problem \((P^i)\), we measure the error and the experimental order of convergence as

$$\begin{aligned} e^i_j = \Vert u^i_j-u^i_{j-1}\Vert _{L^2(\varGamma )} \text{ and } {\mathcal {O}}^i_{j} = \log _2 e^i_{j-1}-\log _2 e^i_{j}. \end{aligned}$$

We obtain the results summarized in Table 2.
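The experimental orders of convergence are computed from consecutive differences on the dyadically refined meshes; a minimal sketch (the error values below are placeholders for illustration, not the data of Table 2):

```python
import numpy as np

# e_j = ||u_j - u_{j-1}||_{L^2(Gamma)} on meshes with h_j = h_{j-1}/2
errors = [0.080, 0.045, 0.025, 0.014]   # placeholder values
# each entry estimates the exponent r in e_j ~ C h_j^r
eoc = [np.log2(errors[j - 1] / errors[j]) for j in range(1, len(errors))]
```

Since \(h_j = h_{j-1}/2\), the ratio of consecutive errors directly exposes the exponent: a value of the logarithm near 0.83 matches \(s_\varOmega -3/2\), while 0.67 would match \(1-1/p_\varOmega \).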

Table 2 \(L^2(\varGamma )\) errors and experimental orders of convergence for all the example problems

We think it is remarkable that in all cases, the final experimental order of convergence (in boldface in the table) is closer to \(s_\varOmega -3/2\approx 0.83\) than to the existing theoretical predictions \(1-1/p_\varOmega \approx 0.67\) or \(3/4-1/(2p_\varOmega )\approx 0.58\).