1 Introduction

This paper has two main aims. First, we develop a general theory of error analysis for smooth nonlinear programming problems in Banach spaces, which is applicable in particular to optimal control problems. Second, as our main application, we prove new error estimates for optimal control problems governed by a class of quasilinear elliptic equations.

Roughly speaking, the following question motivates our general theory: let a control problem be discretized by a finite element scheme, where the associated grid has mesh size h. How large is the difference between a locally optimal control and its numerical approximation, if the state equation is handled by a finite element method and the controls are discretized appropriately?

In the last decade, the error analysis of optimal control problems for partial differential equations (PDEs) has made considerable progress. This refers in particular to the case of elliptic PDEs. Here, various error estimates are known that can be used to quantify the discretization error for the optimal solution(s) of the control problem. For control-constrained elliptic problems, we mention only the early papers on linear-quadratic elliptic control problems by Falk [15] and Geveci [16], the discussion of semilinear elliptic control problems in [1, 7], error estimates for Dirichlet boundary control problems in [8, 12], the concept of variational discretization in [18], and the investigation of superconvergence in [20]. In the recent past, state-constrained control problems have received more attention. For instance, we refer to the recent contributions [5, 13], and [14]. This list of papers could be extended; we refer to the references in the survey paper [19].

Nevertheless, there are still important classes of optimal control problems for which reliable a priori error estimates are desired but not yet proven. For instance, because of intrinsic technical difficulties, the class of optimal control problems for quasilinear elliptic equations has not yet been considered. The optimal control of coupled systems of quasilinear PDEs leads to additional difficulties.

We observed that very similar ideas are repeated in almost every new contribution to this field. A general theorem on error estimates might save tremendous work in future investigations of other types of control problems. Therefore, in the first part of the paper, we provide a general tool for deriving error estimates for the optimal control under control constraints. In the second part, our general analysis is applied to an optimal control problem governed by a class of quasilinear elliptic PDEs. We prove the error estimates stated in [10] without proof. Our general result should also be applicable to other classes of control problems, in particular if the state equation is of parabolic type. It is only required that the partial differential equation under consideration and its numerical approximation obey certain regularity properties of their solutions.

We consider the following quasilinear optimal control problem (P),

$$\mbox{(P)} \quad \left\{\everymath{\displaystyle}\begin{array}{l} {\min J(u) := \int_\Omega L(x,y_u(x),u(x)) \, dx},\\[5pt]\alpha \le u(x) \le \beta \quad \mbox{for a.e.}\ x \in \Omega,\end{array}\right.$$

where \(y_u\) is the solution of the state equation

$$\left\{\begin{array}{l@{\quad }l}-\operatorname{div}\left[ a(x,y(x)) \nabla y(x) \right] + f(x,y(x)) = u(x)&\mbox{in}\ \Omega\\[5pt]y(x) =0& \mbox{on}\ \Gamma.\end{array}\right. $$
(1.1)

We began with the numerical analysis of (P) in [10], where we studied the finite element approximation of (1.1), its linearization and its adjoint equation. Moreover, we were able to prove that, for any strict locally optimal control \(\bar{u}\), there exists a sequence of locally optimal controls \(\bar{u}_{h}\) of the associated discretized optimal control problems that converges to \(\bar{u}\) as the mesh size h tends to zero. Here, we will prove the error estimates announced in [10].

The objective functional J of (P) will lead to the well-known two-norm discrepancy: it is twice continuously Fréchet-differentiable in \(L^\infty(\Omega)\), but in general not in \(L^2(\Omega)\). On the other hand, we need a second-order sufficient optimality condition, which can only be satisfied in the norm of \(L^2(\Omega)\). This difficulty complicates the estimation of the error and requires the use of two norms.

With the following simpler functional, the two-norm discrepancy does not occur:

$$ J(u) := \int_\Omega \biggl\{\ell(x,y_u(x)) + \frac{\Lambda}{2} u(x)^2\biggr\}\, dx.$$
(1.2)

It is twice continuously differentiable in \(L^2(\Omega)\).
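The role of the two norms can be seen in a simple numerical experiment. The following sketch (our own illustration, not part of the analysis; the grid and the functions are hypothetical) discretizes a sequence of characteristic functions on \(\Omega = (0,1)\) whose \(L^2\)-norms tend to zero while their \(L^\infty\)-norms stay equal to one, so the sequence eventually enters every \(L^2\)-ball around 0 but no \(L^\infty\)-ball of radius less than one.

```python
import numpy as np

# Illustrative only: "spike" controls u_k on Omega = (0,1), discretized on a
# fine grid, showing why L^2- and L^infty-neighborhoods differ (the source of
# the two-norm discrepancy). The functions u_k are hypothetical examples.
x = np.linspace(0.0, 1.0, 100001)
dx = x[1] - x[0]

def spike(k):
    # u_k = 1 on an interval of length 1/k, 0 elsewhere
    return np.where(x < 1.0 / k, 1.0, 0.0)

ks = (10, 100, 1000)
l2_norms = [np.sqrt(np.sum(spike(k) ** 2) * dx) for k in ks]  # ~ k^{-1/2} -> 0
sup_norms = [np.max(np.abs(spike(k))) for k in ks]            # = 1 for every k
```

Second-order conditions formulated in the \(L^2\)-norm must therefore control perturbations that are not small in \(L^\infty\), which is exactly the difficulty handled by the two-norm technique below.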

2 A unified theory of error estimates

2.1 An abstract optimization problem and optimality conditions

Let \(U_\infty\) and \(U_2\) be a Banach space and a Hilbert space, respectively, endowed with the norms \(\|\cdot\|_\infty\) and \(\|\cdot\|_2\). We assume that \(U_\infty \subset U_2\) with continuous embedding. In particular, the choice \(U_\infty = U_2\) is possible. The latter case is of interest for problems where the two-norm discrepancy does not occur, as for the functional (1.2).

We denote by \(\mathcal {K}\) a nonempty convex subset of \(U_\infty\) that is closed in \(U_2\). Moreover, an objective function \(J: U_\infty \longrightarrow \mathbb{R}\) is given. With these quantities, we define the abstract optimization problem

$$\mbox {\textrm{(}$\mathcal {P}$)} \quad \min_{u \in \mathcal {K}} J(u).$$

For the well-posedness of the problem, we require the next assumption.

(A1) The function J satisfies the following properties:

$$\mbox{\textit{If}}\ \{u_k\}_{k=1}^\infty \subset \mathcal {K}\ \mbox{\textit{and}}\ u_k \rightharpoonup u\ \mbox{\textit{in}}\ U_2,\quad \mbox{\textit{then}}\ J(u) \le \liminf_{k \rightarrow \infty} J(u_k), $$
(2.1)

$$\mbox{\textit{if}}\ \mathcal {K}\ \mbox{\textit{is unbounded in}}\ U_2\ \mbox{\textit{and}}\ \{u_k\}_{k=1}^\infty \subset \mathcal {K}\ \mbox{\textit{satisfies}}\ \|u_k\|_2 \rightarrow \infty,\quad \mbox{\textit{then}}\ J(u_k) \rightarrow +\infty. $$
(2.2)

Theorem 2.1

Under the assumptions above, \((\mathcal{P})\) has at least one solution.

The proof is standard. In the case of nonconvex optimization, local solutions play an important role. By a local solution of (\(\mathcal{P}\)) we mean an element \(\bar{u} \in \mathcal {K}\) such that, with some ε>0, \(J(\bar{u}) \le J(u)\) holds for all \(u \in \mathcal {K}\cap \{u \in U_{\infty}: \|u - \bar{u}\|_{\infty} < \varepsilon\}\). In this way, local optimality is understood in the sense of the topology of \(U_\infty\). If the strict inequality \(J(\bar{u}) < J(u)\) is satisfied in this set for \(u \neq \bar{u}\), then the solution is called a strict local solution.

The next well-known result provides first-order optimality conditions in the form of a variational inequality.

Theorem 2.2

If \(\bar{u}\) is a local solution of (\(\mathcal {P}\)) and J is directionally differentiable at \(\bar{u}\), both in the sense of \(U_\infty\), then

$$J^\prime(\bar{u})(u-\bar{u}) \ge 0 \quad \forall u \in \mathcal {K}. $$
(2.3)

Notice that any local solution of (\(\mathcal {P}\)) in the \(U_2\) sense is also a local solution in the \(U_\infty\) sense. Therefore, (2.3) also holds for local solutions of (\(\mathcal {P}\)) in the \(U_2\) sense.

In the sequel, we shall need stronger differentiability properties of J. Let us mention here once and for all that first- and second-order differentiability of J and notions such as class \(C^1\) or \(C^2\) are always to be understood in the sense of the space \(U_\infty\). The same applies to the approximation \(J_h\) that will be defined later.

Next, we establish second-order optimality conditions. To this end, we need some further assumptions. Let us fix a point \(\bar{u} \in \mathcal {K}\) as a reference element. In what follows, \(B_{2}(\bar{u},r)\) and \(B_{\infty}(\bar{u},r)\) denote the open balls of radius r>0 centered at \(\bar{u}\) in \(U_2\) and \(U_\infty\), respectively.

(A2) There exists an open subset \(\mathcal{A} \subset U_{\infty}\) containing \(\mathcal {K}\), such that \(J:\mathcal{A} \longrightarrow \mathbb{R}\) is of class \(C^2\). Furthermore, there exist constants r>0 and \(M_i\), i=1,2, such that, for all \(v,v_1,v_2 \in U_\infty\) and \(u \in B_{2}(\bar{u},r) \cap \mathcal {K}\),

$$|J^\prime(u)v| \le M_1\|v\|_2 \quad \mbox{\textit{and}}\quad |J^{\prime\prime}(u)(v_1,v_2)| \le M_2\|v_1\|_2\|v_2\|_2. $$
(2.4)

For every ε>0 there exists δ>0 such that for all \(u_{1}, u_{2} \in B_{\infty}(\bar{u},r)\) and \(v \in U_\infty\)

$$\|u_1 - u_2\|_\infty < \delta \quad \Rightarrow \quad \left\{\everymath{\displaystyle}\begin{array}{l}|[J^\prime(u_1) - J^\prime(u_2)]v| \le \varepsilon\|v\|_2,\\[5pt]|[J^{\prime\prime}(u_1) - J^{\prime\prime}(u_2)]v^2| \le \varepsilon\|v\|^2_2.\end{array}\right. $$
(2.5)

Finally, we assume that the quadratic form \(Q: v \mapsto J^{\prime\prime}(\bar{u})v^{2}\), \(Q: U_2 \longrightarrow \mathbb{R}\), is a Legendre form according to the definition below.

Remark 2.3

(i) For the objective functional J of (P), (A2) is satisfied with \(U_\infty = L^\infty(\Omega)\) and \(U_2 = L^2(\Omega)\) under appropriate differentiability and Lipschitz conditions on L, cf. Sect. 4.2. This holds because \(\mathcal {K}= \{ u \in L^{\infty}(\Omega)\,| \, \alpha \le u(x) \le \beta \mbox{ a.e. in } \Omega\}\) is bounded in \(L^\infty(\Omega)\).

If \(\mathcal {K}\) is unbounded, e.g. \(\mathcal {K}= L^{\infty}(\Omega)\), then (A2) cannot be expected for an objective functional as general as that of (P). In this case, (A2) can be verified if J has the particular form (1.2).

(ii) By the mean value theorem, the estimate (2.4) implies the upper inequality of (2.5), if u 1,u 2 belong to \(B_{2}(\bar{u},r) \cap \mathcal {K}\). This follows from \([J^{\prime}(u_{1}) - J^{\prime}(u_{2})]v = J^{\prime\prime}(\hat{u})(u_{1}-u_{2},v)\) with some \(\hat{u} \in [u_{1},u_{2}]\).

By (2.4) and (2.5), the linear and bilinear forms J′(u) and J′′(u) can be continuously extended to \(U_2\) and \(U_2 \times U_2\), respectively, so that the expressions and estimates above also make sense for \(v, v_1, v_2 \in U_2\). Condition (2.5) expresses the continuity of the mappings \(u \mapsto J^\prime(u)\) and \(u \mapsto J^{\prime\prime}(u)\) from \(U_\infty\) to the associated spaces of linear and bilinear forms.

Following Bonnans and Shapiro [2] or Bonnans and Zidani [3] we say that J′′(u) is a Legendre form, if the following implications hold:

(i) If \(v_k \rightharpoonup v\) in \(U_2\) as k→∞, then \(J^{\prime\prime}(u) v^{2} \le \liminf_{k \to \infty} J^{\prime\prime}(u) v^{2}_{k}\).

(ii) If additionally \(\lim_{k \to \infty} J^{\prime\prime}( u) v^{2}_{k} = J^{\prime\prime}(u) v^{2}\) holds, then \(\|v - v_k\|_2 \rightarrow 0\).
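The interplay of the two conditions can be observed in a toy computation (our own illustration; the sequence and the test function are hypothetical, not from the paper). On \(U_2 = L^2(0,1)\), the oscillations \(v_k(x) = \sqrt{2}\sin(k\pi x)\) converge weakly to 0 while \(\|v_k\|_2 = 1\); for the Legendre form \(Q(v) = \|v\|_2^2\), condition (ii) correctly detects the lack of strong convergence, since \(Q(v_k) = 1 \not\to Q(0) = 0\):

```python
import numpy as np

# Toy check (hypothetical data): v_k = sqrt(2) sin(k*pi*x) on (0,1) converges
# weakly to 0 in L^2 -- pairings with a fixed test function tend to 0 --
# while ||v_k||_2 stays equal to 1, so v_k does not converge strongly.
x = np.linspace(0.0, 1.0, 200000, endpoint=False)
w = np.cos(2 * np.pi * x)                 # a fixed test function in L^2(0,1)

def inner(f, g):
    return float(np.mean(f * g))          # quadrature of the L^2 inner product

ks = (5, 51, 501)
vk = {k: np.sqrt(2.0) * np.sin(k * np.pi * x) for k in ks}
pairings = [inner(vk[k], w) for k in ks]  # -> 0: weak convergence against w
norms = [inner(vk[k], vk[k]) for k in ks] # Q(v_k) = ||v_k||_2^2 = 1 for all k
```

By contrast, the identically vanishing quadratic form satisfies (i) but not (ii): it reports \(Q(v_k) \to Q(0)\) for this sequence although \(\|v_k - 0\|_2 = 1\) for every k.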

We define the cones \(S_{\bar{u}}\) of feasible directions and \(C_{\bar{u}}\) of critical directions by

$$\begin{array}{rcl}S_{\bar{u}} &=& \left\{v \in U_\infty : v = \lambda(u - \bar{u}) \mbox{ for some } \lambda > 0 \mbox{ and } u \in \mathcal {K}\right\},\\[1ex]C_{\bar{u}} &=& \operatorname{cl}_{2} (S_{\bar{u}}) \cap \{v \in U_2 : J^\prime(\bar{u})v = 0\},\end{array}$$

where \(\operatorname{cl}_{2} (S_{\bar{u}})\) denotes the closure of \(S_{\bar{u}}\) in \(U_2\). Now we can prove necessary second-order optimality conditions under a regularity assumption given in the next theorem.

Theorem 2.4

Let \(\bar{u}\) be a local solution of (\(\mathcal {P}\)) in \(U_\infty\). Assume that (A2) is satisfied and that \(C_{\bar{u}} = \operatorname{cl}_{2}(\mathcal{C}_{\bar{u}})\), where

$$\mathcal{C}_{\bar{u}} = \left\{v \in S_{\bar{u}} : J^\prime(\bar{u})v = 0\right\}.$$

Then \(J^{\prime\prime}(\bar{u})v^{2} \ge 0\) holds for all \(v \in C_{\bar{u}}\).

Proof

Given \(v \in C_{\bar{u}}\), we take a sequence \(\{v_{k}\}_{k = 1}^{\infty}\subset \mathcal{C}_{\bar{u}}\) converging to v in U 2. By definition of \(\mathcal{C}_{\bar{u}}\), we have

$$v_k = \lambda_k(u_k - \bar{u}),\quad u_k \in \mathcal {K}, \ \lambda_k > 0,\qquad J^\prime(\bar{u})v_k = 0.$$

Since \(\bar{u}\) is a local minimum of (\(\mathcal {P}\)), there exists \(\bar{\varepsilon}> 0\) such that \(J(\bar{u}) \le J(u)\) for any \(u \in B_{\infty}(\bar{u},\bar{\varepsilon}) \subset U_{\infty}\). We can assume that \(\bar{\varepsilon}\le r\). Then, for \(0 < \rho < \min\{1/\lambda_{k},\bar{\varepsilon}/\|v_{k}\|_{\infty}\}\), the elements \(\bar{u} + \rho v_{k}\) belong to \(\mathcal {K}\cap B_{\infty}(\bar{u},\bar{\varepsilon})\). The second order Taylor expansion

$$J(\bar{u}) \le J(\bar{u} + \rho v_k) = J(\bar{u}) + \rho J^\prime(\bar{u})v_k + \frac{\rho^2}{2}J^{\prime\prime}(\bar{u} + \theta_\rho\rho v_k)v_k^2, \quad \theta_\rho \in (0,1),$$

together with \(J^\prime(\bar{u})v_k = 0\) leads to

$$J^{\prime\prime}(\bar{u} + \theta_\rho\rho v_k)v_k^2 \ge 0 \quad \Rightarrow \quad J^{\prime\prime}(\bar{u})v_k^2 = \lim_{\rho \searrow 0}J^{\prime\prime}(\bar{u} + \theta_\rho\rho v_k)v_k^2 \ge 0.$$

Finally, by (2.4), we arrive at \(J^{\prime\prime}(\bar{u})v^{2} = \lim_{k \rightarrow \infty}J^{\prime\prime}(\bar{u})v_{k}^{2} \ge 0\). □

Remark 2.5

In control problems with constraints of the type \(\alpha \le u(x) \le \beta\) for \(x \in X\), we have

$$C_{\bar{u}} = \Biggl\{v \in L^2(X) : v(x) = \left\{\begin{array}{l@{\quad }l}\ge 0 & \mbox{if}\ \bar{u}(x) = \alpha\\\le 0 & \mbox{if}\ \bar{u}(x) = \beta \\0 & \mbox{if}\ \bar{d}(x) \neq 0\end{array}\right. \Biggr\},$$

where \(\bar{d} \in U_{2}\) is the Riesz representative of the derivative of J at \(\bar{u}\), more precisely

$$J^\prime(\bar{u})v = \int_X \bar{d}(x) v(x) \, dx.$$

In this case, the regularity assumption of Theorem 2.4 holds with \(U_\infty = L^\infty(X)\) and \(U_2 = L^2(X)\).

Indeed, for given \(v \in C_{\bar{u}}\) set

$$v_k(x) = \left\{\everymath{\displaystyle}\begin{array}{l@{\quad }l}0 & \mbox{if}\ \alpha < \bar{u}(x) < \alpha + \frac{1}{k}\ \mbox{or}\ \beta - \frac{1}{k} < \bar{u}(x) < \beta,\\\mathbb {P}_{[-k,+k]}(v(x)) & \mbox{otherwise}.\end{array}\right.$$

Let us check first that \(\alpha \le u_{k}(x) = \bar{u}(x) + \rho_{k} v_{k}(x) \le \beta\) if \(0 < \rho_{k} \le \min\{1,\beta-\alpha\}/k^{2}\). If e.g. \(\alpha < \bar{u}(x) < \alpha + 1/k\), then \(v_{k}(x) = 0\), so that the inequality above is trivial. If \(\bar{u}(x) = \alpha\), then \(0 \le \rho_{k} v_{k}(x) \le \rho_{k} k \le \beta - \alpha\), hence \(\alpha \le u_{k}(x) \le \beta\).

If \(\alpha + 1/k \le \bar{u}(x) \le \beta - 1/k\), then

$$\bar{u}(x) + \rho_k v_k(x) \ge \bar{u}(x) - \rho_k k\ge \alpha + \frac{1}{k} - \frac{1}{k^2} k = \alpha,$$

and

$$\bar{u}(x) + \rho_k v_k(x) \le \bar{u}(x) + \rho_k k \le \beta - \frac{1}{k} + \frac{1}{k^2} k = \beta.$$

The upper bound β is handled analogously. Consequently, \(v_{k} = \frac{1}{\rho_{k}}(u_{k} - \bar{u})\) belongs to \(S_{\bar{u}}\).

On the other hand, it is obvious that |v k (x)|≤|v(x)|. Hence, if \(\bar{d}(x) \neq 0\), then v(x)=0 and also v k (x)=0. Consequently, \(v_{k} \in \mathcal{C}_{\bar{u}}\) and v k v in L 2(X), which proves that \(v \in \operatorname{cl}_{2}(\mathcal{C}_{\bar{u}})\). Since v was taken arbitrarily in \(C_{\bar{u}}\), we deduce that \(C_{\bar{u}} \subset \operatorname{cl}_{2}(\mathcal{C}_{\bar{u}})\). The converse inclusion is obvious.
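The truncation defining \(v_k\) is easy to visualize in a discrete setting. The following sketch (our own toy example with randomly generated data; all names are hypothetical) implements the two cut-offs of Remark 2.5, i.e. setting \(v_k\) to zero near inactive-but-close bounds and clipping it to \([-k,k]\) elsewhere, and confirms numerically that \(v_k \to v\) in the discrete \(L^2\)-norm.

```python
import numpy as np

# Discrete sketch of the construction in Remark 2.5 (hypothetical data):
# v_k vanishes where ubar lies strictly within 1/k of a bound and equals
# the clipped value P_[-k,k](v) elsewhere; then v_k -> v in discrete L^2.
rng = np.random.default_rng(0)
alpha, beta = 0.0, 1.0
n = 10000
ubar = rng.uniform(alpha, beta, n)        # a control (bounds inactive a.e.)
v = rng.normal(size=n)                    # an illustrative direction

def truncate(v, ubar, k):
    vk = np.clip(v, -k, k)                # P_[-k,+k](v)
    near = ((ubar > alpha) & (ubar < alpha + 1.0 / k)) | \
           ((ubar > beta - 1.0 / k) & (ubar < beta))
    return np.where(near, 0.0, vk)        # cut off near the bounds

errs = [float(np.linalg.norm(v - truncate(v, ubar, k)) / np.sqrt(n))
        for k in (2, 10, 100)]            # discrete ||v - v_k||_2, decreasing
```

Since \(|v_k| \le |v|\) pointwise, the sign conditions characterizing \(C_{\bar{u}}\) at active bounds would be preserved automatically in this construction.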

Theorem 2.6

Suppose that assumption (A2) holds. Let \(\bar{u} \in \mathcal {K}\) satisfy (2.3) and

$$J^{\prime\prime}(\bar{u})v^2 > 0 \quad \forall v \in C_{\bar{u}} \setminus \{0\}. $$
(2.6)

Then, there exist ε>0 and δ>0 such that

$$J(\bar{u}) + \frac{\delta}{2}\|u - \bar{u}\|_2^2 \le J(u) \quad \forall u \in \mathcal {K}\cap B_\infty(\bar{u},\varepsilon). $$
(2.7)

Proof

We argue by contradiction and assume that for any positive integer k there exists \(u_{k} \in \mathcal {K}\) such that

$$\|u_k - \bar{u}\|_\infty < \frac{1}{k} \quad \mbox{and}\quad J(\bar{u}) + \frac{1}{2k}\|u_k - \bar{u}\|_2^2 > J(u_k). $$
(2.8)

Setting \(\rho_{k} = \|u_{k} - \bar{u}\|_{2}\) and \(v_{k} = (u_{k} - \bar{u})/\rho_{k}\), we can assume that \(v_k \rightharpoonup v\) in \(U_2\); if necessary, we select a subsequence. Let us prove that \(v \in C_{\bar{u}}\). From assumption (A2) and (2.3) we deduce

$$J^\prime(\bar{u})v = \lim_{k \rightarrow \infty}J^\prime(\bar{u})v_k = \lim_{k \rightarrow \infty}\frac{1}{\rho_k}J^\prime(\bar{u})(u_k - \bar{u}) \ge 0.$$

To prove the opposite inequality, we use (2.8) and find

$$J^\prime(\bar{u})v = \lim_{k \rightarrow \infty}\frac{J(\bar{u} + \rho_kv_k) - J(\bar{u})}{\rho_k} \le \lim_{k \rightarrow \infty}\frac{\rho_k}{2k} = 0.$$

Thus we have that \(J^{\prime}(\bar{u})v = 0\). The first equality above follows from

$$\frac{J(\bar{u} + \rho_kv_k) - J(\bar{u})}{\rho_k} = J^\prime(\bar{u} + \theta_k(u_k - \bar{u}))v_k \quad \mbox{with some}\ \theta_k \in (0,1).$$
For arbitrary ε>0, we deduce from (2.5) and (2.8) the existence of k ε such that

$$|[J^\prime(\bar{u} + \theta_k(u_k - \bar{u})) - J^\prime(\bar{u})]v_k| \le \varepsilon\|v_k\|_2 = \varepsilon \quad \forall k \ge k_\varepsilon.$$

Therefore, by (2.4),

$$\everymath{\displaystyle}\begin{array}{l}\lim_{k \rightarrow \infty}\frac{J(\bar{u} + \rho_kv_k) - J(\bar{u})}{\rho_k}\\[2ex]\quad {} = \lim_{k \rightarrow \infty}J^\prime(\bar{u})v_k + \lim_{k \rightarrow \infty}[J^\prime(\bar{u} + \theta_k(u_k - \bar{u})) - J^\prime(\bar{u})]v_k= J^\prime(\bar{u})v.\end{array}$$

Next, we prove that \(v \in \operatorname{cl}_{2} (S_{\bar{u}})\). From \(v_{k} = (u_{k} - \bar{u})/\rho_{k}\) and \(u_{k} \in \mathcal {K}\), we conclude \(v_{k} \in S_{\bar{u}} \subset \operatorname{cl}_{2} (S_{\bar{u}})\). The set \(\operatorname{cl}_{2}(S_{\bar{u}})\) is closed and convex in \(U_2\), hence \(v \in \operatorname{cl}_{2} (S_{\bar{u}})\). Thus, we obtain \(v \in C_{\bar{u}}\).

Invoking again (2.8) and (2.3), we get by a Taylor expansion

$$J(\bar{u}) + \frac{\rho_k^2}{2k} > J(u_k) = J(\bar{u}) + \rho_kJ^\prime(\bar{u})v_k + \frac{\rho_k^2}{2}\bigl[J^{\prime\prime}(\bar{u})v_k^2 + r_k\bigr] \ge J(\bar{u}) + \frac{\rho_k^2}{2}\bigl[J^{\prime\prime}(\bar{u})v_k^2 + r_k\bigr],$$

where

$$r_k = [J^{\prime\prime}(\bar{u} + \theta_k\rho_kv_k) - J^{\prime\prime}(\bar{u})]v_k^2 = [J^{\prime\prime}(\bar{u} + \theta_k(u_k - \bar{u})) - J^{\prime\prime}(\bar{u})]v_k^2.$$

Therefore, it holds

$$J^{\prime\prime}(\bar{u})v_k^2 + r_k < \frac{1}{k}.$$

Once again, (2.5) and (2.8) imply \(|r_k| \rightarrow 0\) as k→∞. Then, the above inequality, along with (2.4) and (2.6), leads to

$$0 \le J^{\prime\prime}(\bar{u})v^2 \le \liminf_{k \rightarrow \infty}J^{\prime\prime}(\bar{u})v_k^2 \le \limsup_{k \rightarrow \infty}J^{\prime\prime}(\bar{u})v_k^2 \le \limsup_{k \rightarrow \infty}\frac{1}{k} = 0,$$

so that \(J^{\prime\prime}(\bar{u})v_{k}^{2} \rightarrow J^{\prime\prime}(\bar{u})v^{2} = 0\). From (2.6), it follows that v=0. Finally, using that \(J^{\prime\prime}(\bar{u})\) is a Legendre form, we get that \(v_k \rightarrow v = 0\) in \(U_2\). This contradicts the fact that \(\|v_k\|_2 = 1\) for every k. □

Remark 2.7

Under the assumption (A2), the optimality condition (2.6) is equivalent to the following one:

$$\exists \alpha > 0\quad \mbox{such that}\quad J^{\prime\prime}(\bar{u})v^2 \ge \alpha\|v\|^2_2\quad \forall v \in C_{\bar{u}}. $$
(2.9)

Indeed, it is obvious that (2.9) implies (2.6). We verify the converse implication by contradiction. Suppose that (2.6) holds, but not (2.9). Then for any positive integer k there exists an element \(v_{k} \in C_{\bar{u}}\) such that

$$J^{\prime\prime}(\bar{u})v_k^2 < \frac{1}{k}\|v_k\|^2_2 \quad \forall k \ge 1.$$

Re-defining \(v_k := v_k/\|v_k\|_2\) yields \(J^{\prime\prime}(\bar{u})v_{k}^{2} < \frac{1}{k}\). By taking a subsequence, denoted in the same way, we can assume that \(v_k \rightharpoonup v\) weakly in \(U_2\). Since \(C_{\bar{u}}\) is closed and convex in \(U_2\), v belongs to \(C_{\bar{u}}\). Moreover, \(J^{\prime\prime}(\bar{u})\) is a Legendre form. Therefore (2.6) implies

$$0 \le J^{\prime\prime}(\bar{u})v^2 \le \liminf_{k \rightarrow \infty}J^{\prime\prime}(\bar{u})v_k^2 \le \limsup_{k \rightarrow \infty}J^{\prime\prime}(\bar{u})v_k^2 \le \limsup_{k \rightarrow \infty}\frac{1}{k} = 0.$$

Consequently, v=0 and \(J^{\prime\prime}(\bar{u})v^{2}_{k} \rightarrow 0\) must hold, hence \(\|v_k\|_2 \rightarrow 0\), contradicting the fact that \(\|v_k\|_2 = 1\).

Remark 2.8

If \(\bar{u} \in \mathcal {K}\) satisfies (2.7) and the regularity assumption \(C_{\bar{u}}={\operatorname{cl}_{2}(\mathcal{C}_{\bar{u}})}\) holds true, then (2.9) is fulfilled by α=δ. Indeed, by (2.7), \(\bar{u}\) is a local solution to

$$\min_{u \in \mathcal {K}} J(u) - \frac{\delta}{2} \|u - \bar{u}\|^2_2.$$

In view of Theorem 2.4, the necessary second-order condition

$$J^{\prime\prime}(\bar{u})v^2 - \delta \|v\|^2_2 \ge 0 \quad \forall v \in C_{\bar{u}}$$

must hold. Consequently, under the regularity condition above, inequality (2.6) holds if and only if (2.7) is satisfied.

2.2 Approximation of (\(\mathcal {P}\))

Let h>0 denote a small parameter. In applications to control problems, h is the mesh size of a grid underlying a numerical approximation of (\(\mathcal {P}\)). We consider a family of problems (\(\mathcal {P}_{h}\)), characterized by the parameter h, that approximate (\(\mathcal {P}\)) as h→0.

For any h>0, we consider a family of sets \(\mathcal {K}_{h}\) with the following properties:

(A3) \(\mathcal {K}_{h}\subset \mathcal {K}\) is convex and closed in \(U_2\). Moreover, for any \(u \in \mathcal {K}\) there exist elements \(u_{h} \in \mathcal {K}_{h}\) such that \(\|u - u_h\|_2 \rightarrow 0\) as h→0.
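For the control discretizations we have in mind, (A3) is typically realized by piecewise constant controls. The following sketch (our own illustration under the assumed concrete choice \(\mathcal {K} = \{u : 0 \le u \le 1\}\) on Ω=(0,1); all names are hypothetical) verifies on a sample control that the cellwise average lies in \(\mathcal {K}_h\) and converges to u in the discrete \(L^2\)-norm as h→0.

```python
import numpy as np

# Sketch of (A3) for K_h = piecewise constant functions on a uniform mesh of
# width h = 1/m satisfying the box constraints (hypothetical concrete choice):
# the cellwise average u_h of u stays in the box and ||u - u_h||_2 -> 0.
def cell_average(u_vals, m):
    # average a fine-grid function over m equal cells, then re-expand
    return np.concatenate([np.full(p.size, p.mean())
                           for p in np.array_split(u_vals, m)])

x = np.linspace(0.0, 1.0, 2**12, endpoint=False)
u = np.clip(np.sin(5 * x) + 0.5, 0.0, 1.0)     # a sample u in K, alpha=0, beta=1

errs = []
for m in (4, 16, 64):                          # mesh sizes h = 1/m
    uh = cell_average(u, m)
    assert 0.0 <= uh.min() and uh.max() <= 1.0 # averaging preserves the box
    errs.append(float(np.sqrt(np.mean((u - uh) ** 2))))
```

Averaging is an \(L^2\)-projection onto the piecewise constants and preserves pointwise bounds, which is why \(\mathcal {K}_h \subset \mathcal {K}\) in this construction.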

We introduce a family of functionals \(J_h: U_\infty \longrightarrow \mathbb{R}\) satisfying the following assumptions.

(A4) All functions J h have the following properties:

$$\mbox{\textit{If}}\ \{u_k\}_{k=1}^\infty \subset \mathcal {K}_h\ \mbox{\textit{and}}\ u_k \rightharpoonup u\ \mbox{\textit{in}}\ U_2,\quad \mbox{\textit{then}}\ J_h(u) \le \liminf_{k \rightarrow \infty} J_h(u_k). $$
(2.10)

If \(\mathcal {K}_{h}\) is unbounded in \(U_2\), then the following properties hold:

$$\lim_{\substack{\|u\|_2 \rightarrow \infty \\ u \in \mathcal {K}_h}} J_h(u) = +\infty \quad \mbox{\textit{for every}}\ h > 0, $$
(2.11)

$$\mbox{\textit{if}}\ \{u_h\}_{h>0},\ u_h \in \mathcal {K}_h,\ \mbox{\textit{satisfies}}\ \|u_h\|_2 \rightarrow \infty\ \mbox{\textit{as}}\ h \to 0,\quad \mbox{\textit{then}}\ J_h(u_h) \rightarrow +\infty. $$
(2.12)

We investigate the family of approximating control problems

$$\mbox {\textrm{(}$\mathcal {P}_{h}$)} \quad \min_{u_h \in \mathcal {K}_h} J_h(u_h).$$

To guarantee that the family \(\{\mbox {\textrm{(}$\mathcal {P}_{h}$)} \}_{h}\) really approximates problem (\(\mathcal {P}\)), we need a further assumption.

(A5) Let \(\{u_{h}\}_{h > 0} \subset \mathcal {K}\) and \(u \in \mathcal {K}\) be given.

$$\mbox{\textit{If}}\ \|u_h - u\|_2 \rightarrow 0,\quad \mbox{\textit{then}}\ \lim_{h \to 0} J(u_h) = J(u). $$
(2.13)

$$\mbox{\textit{If}}\ u_h \rightharpoonup u\ \mbox{\textit{in}}\ U_2,\quad \mbox{\textit{then}}\ \lim_{h \to 0}\,[J_h(u_h) - J(u_h)] = 0 \quad \mbox{\textit{and}}\quad J(u) \le \liminf_{h \to 0} J_h(u_h). $$
(2.14)

Remark 2.9

If assumption (A1) is satisfied, then the property of weak lower semicontinuity in (2.14) follows just from \(\lim_{h \to 0}|J_h(u_h) - J(u_h)| = 0\). Indeed, we obtain

$${\liminf_{h \to 0} J_h(u_h)= \liminf_{h \to 0} J(u_h) + \lim_{h \to 0} (J_h(u_h) - J(u_h)) .}$$

As the last expression tends to zero, we obtain from (A1)

$$\liminf_{h \to 0} J_h(u_h) = \liminf_{h \to 0}J(u_h) \ge J(u).$$

As a consequence of assumption (A5) the next result is deduced.

Lemma 2.10

Under assumptions (A1) and (A5) the following holds:

$$\mbox{\textit{If}}\ u_h \in \mathcal {K}_h\ \mbox{\textit{for each}}\ h>0\ \mbox{\textit{and}}\ \|u_h - u\|_2 \rightarrow 0\ \mbox{\textit{with}}\ u \in \mathcal {K},\quad \mbox{\textit{then}}\ \lim_{h \rightarrow 0} J_h(u_h) = J(u). $$
(2.15)

Proof

We apply first (2.13) and then (2.14),

$$J(u) = \lim_{h \rightarrow 0} J(u_h) = \lim_{h \rightarrow 0}J_h(u_h) +\lim_{h \rightarrow 0}[J(u_h) - J_h(u_h)] = \lim_{h \rightarrow 0} J_h(u_h).$$

 □

Next, we show that the problems (\(\mathcal {P}_{h}\)) approximate problem (\(\mathcal {P}\)). To this end, we introduce another assumption.

(A6) If \(\{u_{k}\}_{k=1}^{\infty}\subset \mathcal {K}\), \(u_k \rightharpoonup u\) in \(U_2\), and \(J(u_k) \rightarrow J(u)\), then \(\|u_k - u\|_2 \rightarrow 0\).

Theorem 2.11

Assume that (A1) and (A3)(A5) hold. Then, for all h>0, the problem (\(\mathcal {P}_{h}\)) has at least one (global) solution \(\bar{u}_{h}\). Any sequence of solutions \(\{\bar{u}_{h}\}_{h > 0}\) contains a subsequence, denoted for convenience in the same way, that converges weakly in U 2 to a point \(\bar{u}\). Each of these weak limit points is a solution to (\(\mathcal {P}\)). Moreover, \(\lim_{h \rightarrow 0}J_{h}(\bar{u}_{h}) =\lim_{h \rightarrow 0}J(\bar{u}_{h})= J(\bar{u})\). Under assumption (A6), the convergence \(\bar{u}_{h} \to \bar{u}\) is even strong in U 2.

Proof

First we prove that every problem (\(\mathcal {P}_{h}\)) has at least one solution. Thanks to (A3), \(\mathcal {K}_{h}\) is not empty. Let \(\{u_{k}\}_{k = 1}^{\infty}\subset \mathcal {K}_{h}\) be a minimizing sequence of (\(\mathcal {P}_{h}\)). Assumption (2.11) implies that \(\{u_{k}\}_{k = 1}^{\infty}\) is bounded in U 2. Then, selecting a subsequence if necessary, we can assume that \(u_{k} \rightharpoonup \bar{u}_{h}\) weakly in U 2. Since \(\mathcal {K}_{h}\) is convex and closed in U 2 by assumption (A3), it holds \(\bar{u}_{h} \in \mathcal {K}_{h}\). Finally, we conclude from (2.10) that \(\bar{u}_{h}\) is a solution of (\(\mathcal {P}_{h}\)).

Next, we verify the boundedness of \(\{\bar{u}_{h}\}_{h > 0}\) in \(U_2\). This is obvious if \(\mathcal {K}\) is bounded in \(U_2\). If \(\mathcal {K}\) is unbounded, we fix an element \(u \in \mathcal {K}\). Thanks to assumption (A3), there exists a sequence \(\{u_h\}_{h>0}\) with \(u_{h} \in \mathcal {K}_{h}\), such that \(u_h \rightarrow u\) strongly in \(U_2\). From (2.15) and the fact that \(\bar{u}_{h}\) is a solution of (\(\mathcal {P}_{h}\)), we get

$$J_h(\bar{u}_h) \le J_h(u_h) \rightarrow J(u) \quad \Rightarrow \quad \exists M >0 \mbox{ such that } J_h(\bar{u}_h) \le M\quad \forall h > 0.$$

In view of (2.12), this implies that \(\{\bar{u}_{h}\}_{h > 0}\) is bounded in \(U_2\). Therefore, there exist subsequences weakly convergent in \(U_2\) to points \(\bar{u}\). Let us prove that any of these limit points is a solution of (\(\mathcal {P}\)). First of all, \(\bar{u} \in \mathcal {K}\) because \(\{\bar{u}_{h}\}_{h > 0} \subset \mathcal {K}\) and \(\mathcal {K}\) is convex and closed in \(U_2\). Let \(\tilde{u}\) be a solution of (\(\mathcal {P}\)) and consider a sequence \(\{\tilde{u}_{h}\}_{h > 0}\), with \(\tilde{u}_{h} \in \mathcal {K}_{h}\), such that \(\tilde{u}_{h} \to \tilde{u}\) in \(U_2\). Such a sequence exists thanks to (A3). Then, (2.14), (2.15) and the fact that \(\bar{u}_{h}\) is a solution of (\(\mathcal {P}_{h}\)) lead to

$$\everymath{\displaystyle}\begin{array}{rcl}J(\bar{u}) &\le& \liminf_{h \rightarrow 0}J_h(\bar{u}_h) \le \limsup_{h \rightarrow 0}J_h(\bar{u}_h) \le \limsup_{h \rightarrow 0}J_h(\tilde{u}_h)\\[2ex]&=& J(\tilde{u}) = \inf \mbox {\textrm{(}$\mathcal {P}$)}\le J(\bar{u}),\end{array}$$

which proves that \(\bar{u}\) is a solution of (\(\mathcal {P}\)) and \(J_{h}(\bar{u}_{h}) \rightarrow J(\bar{u})\). From (2.14) we get

$$\lim_{h \rightarrow 0}J(\bar{u}_h) = \lim_{h \rightarrow 0}J_h(\bar{u}_h) + \lim_{h \rightarrow 0}[J(\bar{u}_h) - J_h(\bar{u}_h)] = J(\bar{u}).$$

Finally, (A6) and the above equality yield the strong convergence \(\|\bar{u} - \bar{u}_{h}\|_{2} \rightarrow 0\). □

The reader might wonder whether there is a converse of the previous theorem. More precisely, we have proved that solutions of (\(\mathcal {P}_{h}\)) converge to solutions of (\(\mathcal {P}\)). Now, the question is: given a solution \(\bar{u}\) of (\(\mathcal {P}\)), can it be approximated by solutions of (\(\mathcal {P}_{h}\))? The answer is yes if \(\bar{u}\) is the unique solution of (\(\mathcal {P}\)) in a neighborhood. Even more, the next theorem states that every strict local solution of (\(\mathcal {P}\)) can be approximated by local solutions of (\(\mathcal {P}_{h}\)).

Theorem 2.12

Under the assumptions (A1) and (A3)(A6), if \(\bar{u}\) is a strict local solution of (\(\mathcal {P}\)) in the sense of U 2, then there exists a sequence \(\{\bar{u}_{h}\}_{h > 0}\) such that \(\bar{u}_{h}\) is a local solution of (\(\mathcal {P}_{h}\)) and \(\|\bar{u} - \bar{u}_{h}\|_{2} \rightarrow 0\).

Proof

Since \(\bar{u}\) is a strict local solution of (\(\mathcal {P}\)), there exists ε>0 such that

$$J(\bar{u}) < J(u) \quad \forall u \in \mathcal {K}^\varepsilon\setminus\{\bar{u}\},\quad \mbox{where}\ \mathcal {K}^\varepsilon = \{u \in \mathcal {K}: \|u - \bar{u}\|_2 \le \varepsilon\}. $$
(2.16)

Consider the problems

$$\mbox{(\textrm{P}$^{\varepsilon}$)} \quad \min_{u \in \mathcal {K}^\varepsilon}J(u) \quad \mbox{and}\quad \mbox{(\textrm{P}$^{\varepsilon}_{h}$)} \quad \min_{u_h \in \mathcal {K}_h^\varepsilon}J_h(u_h),$$

where \(\mathcal {K}_{h}^{\varepsilon}= \mathcal {K}^{\varepsilon}\cap \mathcal {K}_{h}\). From (2.16) we know that \(\bar{u}\) is the unique solution of (Pε). From assumption (A3) we obtain the existence of a sequence {u h } h>0, with \(u_{h} \in \mathcal {K}_{h}\), such that \(\|\bar{u} - u_{h}\|_{2} \rightarrow 0\). Therefore, \(\|\bar{u} - u_{h}\|_{2} \le \varepsilon\) for all hh ε , hence \(u_{h} \in \mathcal {K}_{h}^{\varepsilon}\) for every hh ε . Since \(\mathcal {K}_{h}^{\varepsilon}\) is nonempty, bounded, convex and closed in U 2, we can deduce easily from (2.10) and (2.11) that (P\(^{\varepsilon}_{h}\)) has at least one solution \(\bar{u}_{h}\) for hh ε . Then, applying Theorem 2.11 we deduce that \(\|\bar{u} - \bar{u}_{h}\|_{2} \rightarrow 0\). Therefore, there exists h 0>0 such that \(\|\bar{u} - \bar{u}_{h}\|_{2} < \varepsilon\) for hh 0. This implies that \(\bar{u}_{h}\) is a local minimum of (\(\mathcal {P}_{h}\)) for hh 0, which concludes the proof. □

Remark 2.13

Often, Theorem 2.11 can be improved by deriving the convergence \(\|\bar{u} - \bar{u}_{h}\|_{\infty}\rightarrow 0\); see e.g. [1, 7, 10]. In such cases, Theorem 2.12 can be extended: any strict local solution of (\(\mathcal {P}\)) in the sense of \(U_\infty\) can be approximated by local solutions of (\(\mathcal {P}_{h}\)) in the sense of \(U_\infty\).

In the remainder of this section, \(\{\bar{u}_{h}\}_{h > 0}\) will denote a sequence of local minima of problems (\(\mathcal {P}_{h}\)) converging strongly in U 2 to a local minimum \(\bar{u}\) of (\(\mathcal {P}\)); see Theorems 2.11 and 2.12. Our next goal is to prove some error estimates for \(\|\bar{u} - \bar{u}_{h}\|_{2}\). To this end, we need extra hypotheses.

(A7) The functions \(J_{h}:\mathcal{A} \longrightarrow \mathbb{R}\) are of class \(C^1\), where \(\mathcal{A}\) is the open subset of \(U_\infty\) introduced in assumption (A2). Furthermore, with a sequence \(\varepsilon_h \to 0\), it holds

$$|[J_h^\prime(u) - J^\prime(u)]v| \le \varepsilon_h\|v\|_2 \quad \forall u \in \mathcal {K},\ \forall v \in U_2, $$
(2.17)

and

$$\left\{\begin{array}{l}\mbox{\textit{either}}\ \|\bar{u} - \bar{u}_h\|_\infty \rightarrow 0,\\[1pt]\mbox{\textit{or for any sequence}}\ \{(u_k,v_k)\}_{k=1}^\infty \subset \mathcal {K}\times U_2\ \mbox{\textit{with}}\ \|u_k - u\|_2 \rightarrow 0, \\[3pt]\quad \mbox{\textit{if}}\ v_k \rightharpoonup v\ \mbox{\textit{weakly in}}\ U_2 \Rightarrow J^{\prime\prime}(u)v^2 \le \liminf_{k \rightarrow \infty}J^{\prime\prime}(u_k)v_k^2,\\[1pt]\quad \mbox{\textit{if}}\ v_k \rightharpoonup 0\ \mbox{\textit{weakly in}}\ U_2 \Rightarrow \liminf_{k \rightarrow \infty}J^{\prime\prime}(u_k)v_k^2 \ge \Lambda \liminf_{k \rightarrow \infty}\|v_k\|_2^2\end{array}\right. $$
(2.18)

with some Λ>0.

Theorem 2.14

Let the assumptions (A2), (A3) and (A7) be satisfied and let \(\{\bar{u}_{h}\}_{h > 0}\) be a sequence of local solutions to \((\mathcal{P}_{h})\) converging strongly to \(\bar{u}\) in U 2. If the second-order sufficiency condition (2.6) holds, then there exist C>0 and h 0>0 such that

$$\|\bar{u} - \bar{u}_h\|_2 \le C\bigl[\varepsilon_h^2 + \|\bar{u} - u_h\|_2^2 + J^\prime(\bar{u})(u_h - \bar{u})\bigr]^{1/2}\quad \forall u_h \in \mathcal {K}_h, \ \forall h < h_0. $$
(2.19)

Proof

Since \(\bar{u}_{h}\) is a local minimum of (\(\mathcal {P}_{h}\)) and \(J_h\) is of class \(C^1\) around \(\bar{u}_{h}\), we have

$$J^\prime_h(\bar{u}_h)(u_h - \bar{u}_h) \ge 0 \quad \forall u_h \in \mathcal {K}_h. $$
(2.20)

From this inequality we get

$$\begin{array}{l}J^\prime(\bar{u}_h)(\bar{u} - \bar{u}_h) + [J^\prime_h(\bar{u}_h) - J^\prime(\bar{u}_h)](\bar{u} - \bar{u}_h) \\[2ex]\quad {}+[J^\prime_h(\bar{u}_h) - J^\prime(\bar{u})](u_h - \bar{u}) + J^\prime(\bar{u})(u_h - \bar{u}) \ge 0.\end{array}$$

Taking \(u = \bar{u}_{h}\) in (2.3) and adding the resulting inequality to the previous one, we get

$$[J^\prime(\bar{u}_h) - J^\prime(\bar{u})](\bar{u}_h - \bar{u}) \le [J^\prime_h(\bar{u}_h) - J^\prime(\bar{u}_h)](\bar{u} - \bar{u}_h) + [J^\prime_h(\bar{u}_h) - J^\prime(\bar{u})](u_h - \bar{u}) + J^\prime(\bar{u})(u_h - \bar{u}).$$

We introduce an additional term

$$\begin{array}{l}[J^\prime(\bar{u}_h) - J^\prime(\bar{u})](\bar{u}_h - \bar{u}) \le [J^\prime_h(\bar{u}_h) - J^\prime(\bar{u}_h)](\bar{u} - \bar{u}_h) + [J^\prime_h(\bar{u}_h) - J^\prime(\bar{u}_h)](u_h - \bar{u})\\[1ex]\quad {}+ [J^\prime(\bar{u}_h) - J^\prime(\bar{u})](u_h - \bar{u}) + J^\prime(\bar{u})(u_h - \bar{u}).\end{array}$$
We use (2.17) for the first and second term of the right-hand side of this inequality, apply the mean value theorem to the third one and estimate the resulting term by (2.4). In this way, we deduce from the above inequality

$$\begin{array}{l}[J^\prime(\bar{u}_h) - J^\prime(\bar{u})](\bar{u}_h - \bar{u})\\\quad {}\le \varepsilon_h\|\bar{u} -\bar{u}_h\|_2 + \varepsilon_h\|\bar{u} - u_h\|_2 + M_2 \|\bar{u}_h -\bar{u}\|_2\|\bar{u} - u_h\|_2\\[1pt]\qquad {} + J'(\bar{u})(u_h - \bar{u}).\end{array}$$

Since J is of class \(C^2\) in \(\mathcal{A}\), we have

$$(J^\prime(\bar{u}_h)-J^\prime(\bar{u}))(\bar{u}_h - \bar{u}) = J^{\prime\prime}(\hat{u}_{h})(\bar{u}_h - \bar{u})^2$$

with some \(\hat{u}_{h}\) in the segment \([\bar{u}_{h},\bar{u}]\). Hence, from the above two relations, we get

$$J^{\prime\prime}(\hat{u}_h)(\bar{u}_h - \bar{u})^2 \le \varepsilon_h\|\bar{u} - \bar{u}_h\|_2 + \varepsilon_h\|\bar{u} - u_h\|_2 + M_2\|\bar{u}_h - \bar{u}\|_2\|\bar{u} - u_h\|_2 + J^\prime(\bar{u})(u_h - \bar{u}). $$
(2.21)

From our assumptions \(\|\bar{u}_{h} - \bar{u}\|_{2} \to 0\) as h→0, hence also \(\|\hat{u}_{h} - \bar{u}\|_{2} \to 0\).

Now, let us argue by contradiction and suppose that (2.19) is false. Then there exists a subsequence \(\{h_{k}\}_{k=1}^{\infty}\) converging to 0, with \(u_{h_{k}} \in \mathcal{K}_{h_{k}}\), such that

$$\|\bar{u} - \bar{u}_{h_k}\|_2^2 > k[\varepsilon_{h_k}^2 + \|\bar{u} - u_{h_k}\|_2^2 + J^\prime(\bar{u})(u_{h_k} - \bar{u})],$$

or equivalently

$$\frac{1}{k} > \frac{\varepsilon_{h_k}^2}{\|\bar{u} - \bar{u}_{h_k}\|_2^2} + \frac{\|u_{h_k}- \bar{u}\|_2^2}{\|\bar{u} - \bar{u}_{h_k}\|_2^2} + \frac{J^\prime(\bar{u})(u_{h_k}-\bar{u})}{\|\bar{u} - \bar{u}_{h_k}\|_2^2}. $$
(2.22)

Since the three summands on the right-hand side are nonnegative, (2.22) implies that all of them converge to zero. Define

$$v_{h_k} := \frac{\bar{u}_{h_k} - \bar{u}}{\|\bar{u}_{h_k} - \bar{u}\|_2}.$$

All \(v_{{h_{k}}}\) belong to the unit sphere of \(U_2\), hence we can assume \(v_{{h_{k}}} \rightharpoonup v\) in \(U_2\) with some \(v \in U_2\). Let us prove that \(v \in C_{\bar{u}}\). It is clear that \(v_{h_{k}} \in S_{\bar{u}}\), hence \(v \in \operatorname{cl}_{2}(S_{\bar{u}})\), because this set is convex and closed in \(U_2\). It remains to prove that \(J'(\bar{u})v = 0\). The optimality conditions (2.3) imply that \(J^{\prime}(\bar{u})v =\lim_{k \rightarrow \infty}J^{\prime}(\bar{u})v_{h_{k}} \ge 0\). To prove the converse inequality we use (2.4) and (2.17) as follows

(2.23)

The first limit above exists due to the weak convergence of \(v_{h_{k}}\). We verify below that the third and fourth limits exist; hence the second limit must exist as well. Indeed, by assumption (2.17), we have for the third limit

$$\lim_{k \rightarrow \infty}|[J^\prime(\bar{u}_{h_k}) - J_{h_k}^\prime(\bar{u}_{h_k})]v_{h_k}|\le \lim_{k \rightarrow \infty} \varepsilon_{h_k} \|v_{h_k}\|_2 = \lim_{k \rightarrow \infty} \varepsilon_{h_k} = 0.$$

Thanks to the mean value theorem and (2.4), we obtain for the fourth limit

$$\lim_{k \rightarrow \infty}| [J^\prime(\bar{u}) - J^\prime(\bar{u}_{h_k})]v_{h_k}| \le \lim_{k \rightarrow \infty}M_2 \|\bar{u}-\bar{u}_{h_k}\|_2 \|v_{h_k}\|_2 = 0.$$

In view of (2.23), this yields the inequality \(J^{\prime}(\bar{u})v \le \lim_{k \rightarrow \infty}J^{\prime}_{h_{k}}(\bar{u}_{h_{k}})v_{h_{k}}\). Now, from this inequality and using first (2.20), next (2.4) along with (2.17) and finally (2.22), we obtain

Let us show how (2.4) and (2.17) imply the last inequality. For any \(u \in B_{\infty}(\bar{u},r) \cap \mathcal {K}\), see assumption (A7), and every \(v \in U_{2}\),

We observe that \(J^{\prime\prime}(\bar{u})v^{2} \le \liminf_{k \rightarrow \infty}J^{\prime\prime}(\hat{u}_{h_{k}})v_{h_{k}}^{2}\). Indeed, this is a consequence of (2.18). The inequality is obvious if the second condition of (2.18) is fulfilled. Under the first condition, we also have that \(\|\bar{u} - \hat{u}_{h_{k}}\|_{\infty}\rightarrow 0\). Because \(J^{\prime\prime}(\bar{u}):U_{2} \times U_{2} \longrightarrow \mathbb{R}\) is a Legendre form, we get from (2.5)

Combining this with (2.21) and (2.22), we obtain

This inequality, the fact that \(v \in C_{\bar{u}}\), and (2.6) imply that v=0. Finally, (2.18) leads to a contradiction. Indeed, under the first assumption of (2.18), using again (2.5) along with the fact that \(\|\bar{u} - \hat{u}_{h_{k}}\|_{\infty}\rightarrow 0\), we have

Thus, we have proved that \(J^{\prime\prime}(\bar{u})v_{h_{k}}^{2} \rightarrow 0\) and \(v_{h_{k}} \rightharpoonup 0\) weakly in \(U_{2}\). Since \(J^{\prime\prime}(\bar{u})\) is a Legendre quadratic form, we get \(\|v_{h_{k}}\|_{2} \rightarrow 0\), which contradicts the fact that \(\|v_{h_{k}}\|_{2} = 1\) for every k.

Under the second assumption of (2.18), we reach a contradiction via \(0 \ge \liminf_{k \rightarrow \infty}J^{\prime\prime}(\hat{u}_{h_{k}})v_{h_{k}}^{2} \ge \Lambda\liminf_{k \rightarrow \infty}\|v_{h_{k}}\|_{2}^{2} = \Lambda > 0\). □

3 Application to the optimal control problem (P)

3.1 Main assumptions and known results on the control-to-state mapping

We apply Theorem 2.14 on error estimates to the PDE constrained optimal control problem (P). In this way, we derive the error estimates we have already announced in [10] and illustrate the use of our abstract theory. We rely on the assumptions (H1)–(H4) below.

(H1) :

Ω is an open, convex and bounded subset of \(\mathbb{R}^{n}\), n=2 or n=3, with boundary Γ of class \(C^{1,1}\). For n=2, Ω is allowed to be polygonal instead of class \(C^{1,1}\).

In the following assumptions, \(C_{M}>0\) denotes a constant depending on a real bound M>0. We assume in (H2)–(H4) that such a constant \(C_{M}\) exists. Although we might use different constants for a, L, and f, we take \(C_{M}\) as the maximum of all of them.

(H2) :

The function \(a:\bar{\Omega}\times \mathbb{R} \longrightarrow \mathbb{R}\) is of class \(C^{2}\) with respect to the second component. For any M>0, it holds for all \(x_{i} \in \bar{\Omega}\) and \(|y|,|y_{i}|\le M\), i=1,2,

(3.1)
(3.2)

Moreover, there exists \(a_{0}>0\) such that

$$ a(x,y) \ge a_0\quad \forall x \in \Omega, \ \forall y \in \mathbb{R}. $$
(3.3)
(H3) :

The function f:Ω×ℝ⟶ℝ is measurable with respect to the first variable and twice differentiable with respect to the second. It obeys the following properties:

(3.4)
(3.5)
(3.6)
(3.7)

for almost all x∈Ω and all \(|y|,|y_{1}|,|y_{2}|\le M\).

(H4) :

The function L:Ω×ℝ×ℝ⟶ℝ is measurable with respect to the first variable and twice differentiable with respect to the others. It fulfills the convexity condition

$$ \frac{\partial^2 L}{\partial u^2}(x,y,u) \ge \Lambda > 0 \quad \mbox{for a.a.}\ x \in \Omega, \ \forall y, u \in \mathbb{R}.$$
(3.8)

Moreover

(3.9)
(3.10)
(3.11)

for all \(x_{1}, x_{2} \in \bar{\Omega}\), for a.e. x∈Ω, and all \(|y|,|y_{i}|,|u|,|u_{i}|\le M\), i=1,2, where \(\bar{p} > n\) is as in assumption (H3) and \(D^{2}_{(y,u)}L\) denotes the second derivative of L with respect to (y,u), i.e. the associated Hessian matrix.

Remark 3.1

To simplify the presentation, the conditions (H3) and (H4) are required to be slightly stronger than necessary. Continuity of the second-order derivatives of f and L with respect to y and u, uniformly with respect to x∈Ω, would suffice.

We used similar assumptions as (H1)–(H4) in our papers [9] and [10]. Preparing our error analysis for locally optimal controls, we discussed in [9] second-order sufficient optimality conditions. Consequently, we had to require associated second-order derivatives. In [10], we did not need second-order derivatives of a and f, since our focus was only on convergence of discretizations of the state equation and the adjoint equation.

In (H1), we have now added the possibility of a convex and polygonal domain if n=2. Our assumptions are imposed to assure that the states and adjoint states belong to W 2,p(Ω) for p>n.

It follows from Theorem 3.2 below that for every \(u \in L^{2}(\Omega)\), the state equation (1.1) has a unique solution \(y_{u} \in H^{2}(\Omega) \cap H_{0}^{1}(\Omega)\). Moreover, for every bounded set \(\mathcal{B} \subset L^{2}(\Omega)\), there exists a constant \(C_{\mathcal{B}} > 0\) such that

$$ \|y_u\|_{H^{2}(\Omega)} \le C_{\mathcal{B}} \quad \forall u \in \mathcal{B}.$$
(3.12)

The control-to-state mapping \(G: L^{2}(\Omega) \to H^{1}_{0}(\Omega)\cap H^{2}(\Omega)\), \(G: u \mapsto y_{u}\), which assigns to u the unique solution y of (1.1), is now well defined. We fix the control spaces \(U_{2}:=L^{2}(\Omega)\) and \(U_{\infty}:=L^{\infty}(\Omega)\), and take \(Y := H_{0}^{1}(\Omega)\cap H^{2}(\Omega)\) as the state space. Notice that Y is continuously embedded in \(C(\bar{\Omega})\).

Let us define the set \(\mathcal {K}\) of admissible controls by

$$\mathcal {K}= \{ u \in L^\infty(\Omega) : \alpha \le u(x) \le \beta \ \mbox{for a.a.}\ x \in \Omega\}.$$

We write the integral functional of (P) as

$$F(y,u):= \int_\Omega L(x,y(x),u(x))\,dx.$$

Then the reduced objective functional J admits the form

$$ J(u) := F(G(u),u) = \int_\Omega L(x,y_u(x),u(x))\,dx.$$
(3.13)

Thanks to our assumptions, F is defined and twice continuously Fréchet differentiable on \(C(\bar{\Omega}) \times L^{\infty}(\Omega)\), but it is in general not differentiable on \(C(\bar{\Omega})\times L^{2}(\Omega)\). In contrast, the objective functional (1.2) has better properties: if L(x,y,u) is the sum of a term depending only on (x,y) and the quadratic term \(\frac{\Lambda}{2}u^{2}\), and satisfies assumption (H4), then this functional is of class \(C^{2}\) even in \(C(\bar{\Omega}) \times L^{2}(\Omega)\). Therefore, for (1.2) we can take \(U_{2}=U_{\infty}:=L^{2}(\Omega)\).

Now we are able to relate the control problem (P) to the abstract optimization problem (\(\mathcal{P}\)). (P) can be written in the form

$$\mbox{(P)} \quad \min_{u \in \mathcal {K}} J(u),$$

where \(\mathcal {K}\) and J are defined as above.

Theorem 3.2

(See [9])

For every q>n/2, the mapping \(G: L^{q}(\Omega) \longrightarrow W^{2,q}(\Omega)\), defined by \(G(u)=y_{u}\), is of class \(C^{2}\). For any \(v \in L^{q}(\Omega)\), the function \(z_{v} = G'(u)v\) is the unique solution in \(W^{2,q}(\Omega) \cap W_{0}^{1,q}(\Omega)\) of the equation

$$ \left\{\everymath{\displaystyle}\begin{array}{l@{\quad }l}- \operatorname{div}\biggl[a(x,y_u) \nabla z + \frac{\partial a}{\partial y}(x,y_u) z \nabla y_u\biggr] +\frac{\partial f}{\partial y}(x,y_u) z = v &\mbox{\textit{in}}\ \Omega\\[5pt]z = 0& \mbox{\textit{on}}\ \Gamma. \end{array}\right. $$
(3.14)

Moreover, for any \(v_{1},v_{2} \in L^{q}(\Omega)\), the function \(z=G''(u)[v_{1},v_{2}]\) is the unique solution in \(W^{2,q}(\Omega) \cap W_{0}^{1,q}(\Omega)\) of the following equation, where \(z_{v_{i}} = G'(u)v_{i}\), i=1,2:

$$ \left\{\everymath{\displaystyle}\begin{array}{l@{\quad }l}-\operatorname{div}\biggl[ a(x,y_u)\nabla z + \frac{\partial a}{\partial y}(x,y_u)z\nabla y_u \biggr] +\frac{\partial f}{\partial y}(x,y_u) z\\[5pt]\quad {} = -\frac{\partial^2f}{\partial y^2}(x,y_u)z_{v_1}z_{v_2}\\[5pt]\qquad {} + \operatorname{div}\biggl[\frac{\partial a}{\partial y}(x,y_u)(z_{v_1}\nabla z_{v_2} + \nabla z_{v_1} z_{v_2}) + \frac{\partial^2 a}{\partial y^2}(x,y_u)z_{v_1}z_{v_2}\nabla y_u\biggr]& \mbox{\textit{in}}\ \Omega\\z = 0 & \mbox{\textit{on}}\ \Gamma.\end{array}\right. $$
(3.15)

Since n≤3, our mapping G is of class \(C^{2}\) from \(U_{2}\) to Y. For the next theorem, we recall that \(\bar{p}\) was defined in (H3).

Theorem 3.3

(See [9])

The functional \(J:L^{\infty}(\Omega)\to \mathbb{R}\) is of class \(C^{2}\). For every \(u,v \in L^{\infty}(\Omega)\) we have

$$J'(u)v=\int_\Omega\biggl(\frac{\partial L}{\partial u}(x,y_u,u)+\varphi_{u}\biggr)v\,dx , $$
(3.16)

where \(\varphi_{u}\in W_{0}^{1,\bar{p}}(\Omega) \cap W^{2,\bar{p}}(\Omega)\) is the unique solution of the adjoint equation

$$\left\{\everymath{\displaystyle}\begin{array}{l@{\quad }l}-\operatorname{div}\left[a(x,y_u) \nabla \varphi\right] + \frac{\partial a}{\partial y}(x,y_u)\nabla y_u\cdot \nabla\varphi + \frac{\partial f}{\partial y}(x,y_u) \varphi ={\frac{\partial L}{\partial y}(x,y_u,u)} & \mbox{\textit{in}}\ \Omega\\[5pt]\varphi = 0 & \mbox{\textit{on}}\ \Gamma.\end{array}\right. $$
(3.17)

With \(z_{v_{i}} = G'(u)v_{i}\), i=1,2, the second derivative J′′ is given by

(3.18)

The property \(\varphi_{u} \in W^{2,\bar{p}}(\Omega)\) is not explicitly mentioned in the associated proof. However, it follows as in the proof of [9, Thm. 2.11]. In the case of a locally optimal control \(\bar{u}\), (3.16) and (2.3) amount to

$$ \int_\Omega\biggl(\frac{\partial L}{\partial u}(x,y_{\bar{u}}(x),\bar{u}(x))+\varphi_{\bar{u}}(x)\biggr)(u(x)-\bar{u}(x))\,dx \ge 0 \quad \forall u \in \mathcal {K}.$$
(3.19)

A standard discussion of (3.19) yields the following projection result:

Theorem 3.4

(See [9])

If \(\bar{u}\) is a local minimum of (P), then the equation

$$ \frac{\partial L}{\partial u}(x,\bar{y}(x),t) + \bar{\varphi}(x) = 0 $$
(3.20)

has a unique solution \(\bar{t} = \bar{s}(x)\) for every \(x \in \bar{\Omega}\). The function \(\bar{s}:\bar{\Omega}\to \mathbb{R}\) is Lipschitz and \(\bar{u}\) is related to \(\bar{s}\) by the projection formula

$$ \bar{u}(x) = \mathbb{P}_{[\alpha,\beta]}(\bar{s}(x)) =\max\{\min\{\beta,\bar{s}(x)\},\alpha\}.$$
(3.21)

Consequently, also \(\bar{u}\) is Lipschitz in \(\bar{\Omega}\).

In the particular case of (1.2), the projection formula (3.21) simplifies to

$$ \bar{u}(x) = \mathbb{P}_{[\alpha,\beta]}\biggl\{-\frac{1}{\Lambda}\bar{\varphi}(x)\biggr\}.$$
(3.22)
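The pointwise projections in (3.21) and (3.22) are straightforward to evaluate numerically. The following sketch (with made-up values of Λ, the bounds α, β, and samples of the adjoint state; none of these data come from the paper) implements the clamp \(\mathbb{P}_{[\alpha,\beta]}\) and applies formula (3.22):

```python
import numpy as np

def project(s, alpha, beta):
    """Pointwise projection P_[alpha,beta](s) = max(min(beta, s), alpha), cf. (3.21)."""
    return np.maximum(np.minimum(beta, s), alpha)

# Hypothetical data: control bounds, convexity parameter, adjoint-state samples.
alpha, beta, Lam = 0.0, 1.0, 0.1
phi = np.array([-0.25, -0.05, 0.02, -0.12])   # invented values of the adjoint state

# Formula (3.22): u(x) = P_[alpha,beta](-phi(x)/Lambda), evaluated pointwise
u = project(-phi / Lam, alpha, beta)
print(u)  # values clamped into [alpha, beta]
```

The same clamp realizes the general formula (3.21) once \(\bar{s}(x)\) has been computed from (3.20).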

3.2 Finite element discretization of (P)

For a numerical approximation of (P), we have to discretize the state equation and, in general, also the set of control functions. First, we introduce the finite element approximation of the quasilinear state equation of (P). To this end, we consider a family of regular triangulations \(\{\mathcal{T}_{h}\}_{h>0}\) of \(\bar{\Omega}\), defined in the standard way, e.g. as in [4]. In particular, this definition excludes so-called hanging nodes. Moreover, the triangulation is supposed to be regular and to satisfy an inverse assumption; see (i) below. For \(\Omega \subset \mathbb{R}^{3}\), we shall also speak of a triangulation, although the elements T are then tetrahedra.

With each element \(T \in \mathcal{T}_{h}\), we associate two parameters ρ(T) and σ(T), where ρ(T) denotes the diameter of the set T and σ(T) is the diameter of the largest ball contained in T. Define the mesh size by

$$h:=\max_{T\in\mathcal{T}_h}\rho(T).$$

We suppose that the following standard regularity assumptions are satisfied.

  1. (i)

    There exist two positive constants ρ and σ such that

    $$\frac{\rho(T)}{\sigma(T)}\leq \sigma, \qquad \frac{h}{\rho(T)}\leq \rho $$

    hold for all \(T\in \mathcal{T}_{h}\) and all h>0.

  2. (ii)

    Define \(\overline{\Omega}_{h}=\bigcup_{T\in \mathcal{T}_{h}} T\), and let \(\Omega_{h}\) and \(\Gamma_{h}\) denote its interior and its boundary, respectively. We assume that \(\overline{\Omega}_{h}\) is convex and that the vertices of \(\mathcal{T}_{h}\) lying on the boundary \(\Gamma_{h}\) are points of Γ. From [21, estimate (5.2.19)] we know that

    $$|\Omega\setminus \Omega_h| \leq Ch^2. $$
    (3.23)
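The quadratic bound (3.23) on the omitted boundary strip can be illustrated in a simple model situation: the unit disk approximated by an inscribed regular N-gon, where the edge length plays the role of h. The geometry below is purely illustrative and not taken from the paper; doubling N (halving h) should shrink the area deficit by a factor of about 4:

```python
import math

def area_deficit(N):
    """Area of the unit disk minus the area of the inscribed regular N-gon."""
    return math.pi - 0.5 * N * math.sin(2.0 * math.pi / N)

# The polygon edge length plays the role of h; doubling N halves h, and the
# omitted area |Omega \ Omega_h| should shrink by about 4, matching (3.23).
d1, d2 = area_deficit(64), area_deficit(128)
print(d1 / d2)  # approximately 4
```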

We will use piecewise linear approximations for the states, thus we set

$$Y_h=\{ y_h\in C(\bar{\Omega}) \mid {y_h}_{\mid T} \in \mathcal{P}_1,\ \mbox{for all}\ T\in {\mathcal{T}}_h,\ \mbox{and}\ y_h=0 \ \mbox{on}\ \bar{\Omega}\setminus\Omega_h\},$$

where \(\mathcal{P}_{1}\) is the space of polynomials of degree less than or equal to 1.

The discrete version of the state equation is

$$\left\{\everymath{\displaystyle}\begin{array}{l} \mbox{Find } y_h \in Y_h \mbox{ such that, for all } z_h\in Y_h,\\[5pt]{{\int_{\Omega_h}[a(x,y_h(x))\nabla y_h\cdot \nabla z_h + f(x,y_h(x))z_h]\,dx =\int_{\Omega_h}uz_h\,dx.}}\end{array}\right. $$
(3.24)

By an application of the Brouwer fixed point theorem, we showed the existence of at least one solution to this equation. To our surprise, we were not able to show uniqueness of the solution. To the best of our knowledge, this remains an open question. Under the additional requirement that a be bounded, we were able to show uniqueness for all sufficiently small h. The situation is easier for semilinear elliptic state equations (a(x,y)=a(x)), where the solution of (3.24) is unique, see [1].
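The fixed-point approach can be mimicked in a one-dimensional analogue of (3.24). The sketch below (a toy problem \(-(a(y)y')' + f(y) = u\) on (0,1) with zero boundary values; the data a(y)=1+y², f(y)=y³, u≡1 are invented for illustration and are not the paper's) solves the piecewise linear Galerkin system by Picard iteration, freezing the coefficients at the previous iterate:

```python
import numpy as np

def solve_quasilinear_1d(N=50, tol=1e-10, max_iter=100):
    """Picard iteration for -(a(y) y')' + f(y) = 1, y(0) = y(1) = 0,
    discretized with P1 finite elements on a uniform mesh of N cells."""
    a = lambda y: 1.0 + y**2      # uniformly positive, cf. (3.3) with a_0 = 1
    f = lambda y: y**3            # monotone lower-order term
    h = 1.0 / N
    x = np.linspace(0.0, 1.0, N + 1)
    y = np.zeros(N + 1)
    for _ in range(max_iter):
        am = a(0.5 * (y[:-1] + y[1:]))        # coefficient frozen at cell midpoints
        K = np.zeros((N - 1, N - 1))          # stiffness matrix, interior nodes only
        for i in range(N - 1):
            K[i, i] = (am[i] + am[i + 1]) / h
            if i > 0:
                K[i, i - 1] = -am[i] / h
            if i < N - 2:
                K[i, i + 1] = -am[i + 1] / h
        rhs = h * (1.0 - f(y[1:-1]))          # lumped load with frozen nonlinearity
        y_new = np.zeros(N + 1)
        y_new[1:-1] = np.linalg.solve(K, rhs)
        if np.max(np.abs(y_new - y)) < tol:
            return x, y_new
        y = y_new
    return x, y

x, y = solve_quasilinear_1d()
print(y.max())  # maximum of the discrete state, roughly 0.12
```

Each Picard step solves a linear elliptic problem; for these mild data the iteration converges rapidly, which mirrors the fixed-point existence argument, though it of course proves nothing about uniqueness.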

To set up the discretized problem, we define its objective functional by

$$F_h(y_{h},u) = \int_{\Omega_h} L(x,y_{h}(x), u(x))\,dx.$$

Remark 3.5

In what follows, we will derive various estimates of \(\|u_{h} - u\|_{L^{2}(\Omega_{h})}\) for functions \(u_{h}, u \in \mathcal {K}\). Notice that then \(\|u_{h} - u\|_{L^{2}(\Omega_{h})} \to 0\) is equivalent to \(\|u_{h} - u\|_{L^{2}(\Omega)} \to 0\), independently of how the functions \(u, u_{h}\) are defined on Ω∖Ω h . This follows immediately from \(\|u - u_{h}\|^{2}_{L^{2}(\Omega \setminus \Omega_{h})} \le C h^{2}\) for all \(u_{h}, u \in \mathcal {K}\).

We also introduce a set \(\mathcal {K}_{h}\subset \mathcal {K}\) of discrete control functions u h . These functions are only needed on Ω h , because the integral of F h is only evaluated there. However, we need \(\mathcal {K}_{h}\subset \mathcal {K}\) to apply the abstract theory. For this reason, discretized functions u h defined only on Ω h can be extended arbitrarily to Ω∖Ω h such that the resulting function belongs to \(\mathcal {K}\). Later, in the context of error estimates, we will always set \(u_{h}(x) = \bar{u}(x)\mbox{ a.e. in } \Omega\setminus \Omega_{h}\), where \(\bar{u}\) is a fixed locally optimal reference control.

With these definitions, we consider the following family of discretized problems depending on the mesh size h>0:

$$\mbox{(\textrm{Q}$_{h}$)}\quad \min_{(y_h,u_h) \in Y_h\times \mathcal {K}_h} F_h(y_{h},u_h).$$

(Q h ) is a discrete approximation of the optimal control problem (P). Since y h is possibly not uniquely determined by u h , this problem does not yet fit into the abstract theory. However, the discretized states are locally unique:

Theorem 3.6

(See [10, Thm. 3.2])

Suppose that \(n < p \le \bar{p}\). Then there exist h 0>0, \(\rho_{\bar{u}} > 0\) and \(\rho_{\bar{y}} > 0\) such that, for any h<h 0 and any u in the closed ball \(\bar{B}(\bar{u},\rho_{\bar{u}})\) of L p(Ω), the equation (3.24) has a unique solution y h (u) in the closed ball \(\bar{B}(\bar{y},\rho_{\bar{y}})\) of \(W^{1,p}_{0}(\Omega)\). Moreover, there holds the estimate

$$ \|y_u - y_h(u)\|_{L^p(\Omega_h)}+ h\|y_u - y_h(u)\|_{W^{1,p}(\Omega_h)} \le C(\bar{u}) h^2. $$
(3.25)

To simplify the presentation, we shall use the same notation \(\bar{B}(\bar{u},\rho_{\bar{u}})\) and \(\bar{B}(\bar{y},\rho_{\bar{y}})\) for the balls in \(L^{p}(\Omega)\) and \(W^{1,p}(\Omega_{h})\), respectively. This should not lead to confusion. Moreover, as in this theorem, we shall denote the particular solution \(y_{h}\) of (3.24) contained in \(\bar{B}(\bar{y},\rho_{\bar{y}})\) by \(y_{h}(u)\). Introducing the mapping \(G_{h}: \bar{B}(\bar{u},\rho_{\bar{u}}) \to \bar{B}(\bar{y},\rho_{\bar{y}})\), \(G_{h}: u \mapsto y_{h}(u)\), we define the family of approximated functionals

$$J_h(u) := F_h(G_h(u),u)$$

and introduce the family of discretized problems

$$\mbox{(\textrm{P}$_{h}$)}\quad \min_{u_h \in \mathcal {K}_h\cap \bar{B}(\bar{u},\rho_{\bar{u}})} J_h(u_h).$$

Theorem 3.7

Let \(\bar{u}\) be a strict local solution of (P). Then there exists a sequence \(\{(\bar{y}_{h},\bar{u}_{h})\}_{h>0}\) of local solutions to (Q h ) such that

$$\lim_{h \to 0} \{ \|\bar{u}_h - \bar{u}\|_{L^p(\Omega)} + \|\bar{y}_h - y_{\bar{u}}\|_{W^{1,p}(\Omega)} \} = 0 \quad \forall p \le \bar{p}$$

and \(\lim_{h \to 0} F_{h}(\bar{y}_{h},\bar{u}_{h}) = F(y_{\bar{u}},\bar{u}) = J(\bar{u})\), where \(\bar{y} = y_{\bar{u}}\).

Proof

Consider the admissible set \(\hat{\mathcal {K}}:= \mathcal {K}\cap \bar{B}(\bar{u},\rho_{\bar{u}})\). It is clear that \(\bar{u}\) is also a strict local solution to

$$(\hat{\mathrm{P}}) \quad \min_{u \in \hat{\mathcal {K}}} J(u).$$

This problem fits into our abstract theory with (\(\mathcal{P}\)) := (\(\hat{P}\)). We now consider the approximated problem

$$(\hat{\mathrm{P}}_h)\quad \min_{u_h \in {\hat{\mathcal {K}}}_h } J_h(u_h),$$

where \({\hat{\mathcal {K}}}_{h} := \mathcal {K}_{h} \cap \bar{B}(\bar{u},\rho_{\bar{u}})\). Later, we shall verify that all the assumptions (A1)–(A7) are satisfied for (P), in particular for (\(\hat{\mathrm{P}}\)), hence Theorem 2.11 is applicable: there exists a sequence \(\bar{u}_{h}\) of global solutions to \((\hat{\mathrm{P}}_{h})\) that converges strongly in \(U_{2}\) to \(\bar{u}\). Consequently, \(\bar{u}_{h} \in B(\bar{u},\rho_{\bar{u}}/2)\) holds for all sufficiently small h>0, so that \(\bar{u}_{h}\) does not touch the boundary of \(B(\bar{u},\rho_{\bar{u}})\). Therefore, \((\bar{y}_{h},\bar{u}_{h})\) is a local solution to (Q h ), since the restriction \(u \in \bar{B}(\bar{u},\rho_{\bar{u}})\) is not active.

The convergence \(\|\bar{u}_{h} - \bar{u}\|_{L^{2}(\Omega)} \to 0\) implies the convergence \(\bar{u}_{h} \to \bar{u}\) in all spaces \(L^{p}(\Omega)\), 1≤p<∞, because \({\hat{\mathcal {K}}}_{h} \subset \mathcal {K}\) and \(\mathcal {K}\) is bounded in \(L^{\infty}(\Omega)\). The remaining statements then follow immediately from (3.25). □

3.3 Finite element discretization of the adjoint equation

The discrete adjoint equation for (P) is the finite element version of the adjoint equation (3.17):

$$\left\{\everymath{\displaystyle}\begin{array}{l}\mbox{Find } \varphi_h \in Y_h \mbox{ such that, for all }\phi_h\in Y_h,\\[5pt]\int_{\Omega_h}\biggl\{a(x,y_h)\nabla \varphi_h \cdot \nabla \phi_h +\biggl[\frac{\partial a}{\partial y}(x,y_h)\nabla y_h\cdot \nabla\varphi_h + \frac{\partial f}{\partial y} (x,y_h)\varphi_h\biggr] \phi_h\biggr\}\,dx\\\quad {}= \int_{\Omega_h}\frac{\partial L}{\partial y}(x,y_h,u_h)\phi_h\,dx.\end{array}\right. $$
(3.26)

By [10, Thms. 4.1, 5.1], this equation has a unique solution for all sufficiently small h>0, see also Theorem 3.10. In view of the possible non-uniqueness of the solution to the discrete state equation, we associated with \(u \in B(\bar{u},\rho_{\bar{u}})\) the locally unique state y h (u). We insert y h (u) for y h in the discrete adjoint equation above; the associated unique solution φ h is denoted by φ h (u). Then the following result holds:

Lemma 3.8

(See [10])

For every pair \(u, v \in \mathcal {K}\) with \(v \in B(\bar{u},\rho_{\bar{u}})\), there holds

(3.27)
(3.28)
(3.29)

3.4 Necessary optimality conditions for (P h )

The theory of first-order necessary conditions for (P h ) is similar to that of (P).

Theorem 3.9

(See [10])

For every \(h \le h_{0}\), the functional \(J_{h}:B(\bar{u},\rho_{\bar{u}})\cap L^{\infty}(\Omega) \to \mathbb{R}\) is of class \(C^{1}\), and its derivative is given by

$$J_h'(u)v=\int_{\Omega_h}\biggl(\frac{\partial L}{\partial u}(x,y_h(u),u)+\varphi_h(u)\biggr)v\,dx, $$
(3.30)

where φ h (u)∈Y h is the unique solution of the adjoint state equation

(3.31)

From this expression for \(J_{h}'(u)\), we obtain first-order necessary optimality conditions for the discretized problem. Let \((\bar{y}_{h},\bar{u}_{h}) \in B(\bar{y},\rho_{\bar{y}})\times B(\bar{u},\rho_{\bar{u}})\) be a local solution to (Q h ) in the sense of \(U_{\infty}\) such that \(\bar{u}_{h}\) belongs to \(B(\bar{u},\rho_{\bar{u}}/2)\). The existence of such solutions follows from Theorem 3.7 for sufficiently small h>0. Then \(\bar{u}_{h}\) is also a local solution of (P h ) in the sense of \(U_{\infty}\) that does not touch the boundary of \(B(\bar{u},\rho_{\bar{u}})\), where \(J_{h}\) is well defined. Therefore, we are justified in applying the following result on necessary optimality conditions, which was proved in [10] for the problem (P h ).

Theorem 3.10

(See [10])

Let \((\bar{y}_{h},\bar{u}_{h}) \in \bar{B}(\bar{y},\rho_{\bar{y}})\times \bar{B}(\bar{u},\rho_{\bar{u}})\) be a local solution of (Q h ). Then there exists a unique solution \(\bar{\varphi}_{h}\) in Y h of

(3.32)

such that

$$\int_{\Omega_h} \biggl(\frac{\partial L}{\partial u}(x,\bar{y}_h,\bar{u}_h) + \bar{\varphi}_h\biggr)(u_h -\bar{u}_h) \, dx \ge 0 \quad \forall u_h \in \mathcal {K}_h. $$
(3.33)

From (3.33), we again derive projection formulas for \(\bar{u}_{h}\) if \(\mathcal {K}_{h}\) is the set (4.3) of piecewise constant functions on Ω h or if the variational discretization \(\mathcal {K}_{h}= \mathcal {K}\) is applied. In the case of the variational discretization \(\mathcal {K}_{h}= \mathcal {K}\), proceeding as in Theorem 3.4, we deduce from (3.33) that

$$\bar{u}_h(x) = \mathbb{P}_{[\alpha,\beta]}(\bar{s}_h(x)) \quad \forall x \in \Omega_h. $$
(3.34)

Here, \(\bar{s}_{h}(x)\) is the unique solution t of the equation

$$ \frac{\partial L}{\partial u}(x,\bar{y}_h(x),t) +\bar{\varphi}_h(x) = 0\quad \forall x \in \Omega_h. $$
(3.35)

In the case of the integral functional (1.2), this formula admits the form

$$\bar{u}_h(x) = \mathbb{P}_{[\alpha,\beta]}\biggl(-\frac{1}{\Lambda}\bar{\varphi}_h(x)\biggr), $$
(3.36)

hence \(\bar{u}_{h}\) is a piecewise linear function, although it was formally not discretized.

For piecewise constant control functions, formula (3.33) yields

$$ \bar{u}_{h\mid_T} = \mathbb{P}_{[\alpha,\beta]}(\bar{s}_{h\mid_T})\quad \forall T \in \mathcal{T}_h,$$
(3.37)

where \(\bar{s}_{h\mid_{T}}\) is the unique real number satisfying the equation

$$ \int_T\biggl(\frac{\partial L}{\partial u}(x,\bar{y}_h(x),\bar{s}_{h\mid_T})+ \bar{\varphi}_h(x)\biggr)\, dx = 0.$$
(3.38)
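In the special case of the functional (1.2), where the u-derivative of L is Λu, equation (3.38) can be solved in closed form: \(\bar{s}_{h\mid_T}\) is the negative cell average of \(\bar{\varphi}_{h}\) divided by Λ, and (3.37) then clamps it. A sketch with hypothetical cell averages of the discrete adjoint state (made-up numbers, for illustration only):

```python
import numpy as np

def piecewise_constant_control(phi_cell_means, Lam, alpha, beta):
    """Cellwise control from (3.37)-(3.38) when the u-derivative of L is Lambda*u:
    s_T = -mean_T(phi_h)/Lambda, then clamp to [alpha, beta]."""
    s = -np.asarray(phi_cell_means) / Lam
    return np.clip(s, alpha, beta)

# Invented cell averages of the discrete adjoint state phi_h on four cells
phi_means = [-0.3, -0.08, 0.05, -0.15]
u_cells = piecewise_constant_control(phi_means, Lam=0.1, alpha=0.0, beta=1.0)
print(u_cells)  # one constant control value per cell
```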

For piecewise linear controls, we do not have an analogous projection formula.

4 Error estimates for the problem (P)

4.1 Control discretization and error estimate for (P)

We shall verify the general assumptions (A1)–(A7) for our quasilinear control problem (P), so that Theorem 2.14 provides a general tool for deriving error estimates. To apply it, the three terms on the right-hand side of (2.19) must be chosen appropriately.

Once and for all, we select a fixed locally optimal reference control \(\bar{u} \in \mathcal {K}\). The treatment of the first term in (2.19) is fairly clear: to estimate \(\varepsilon_{h}^{2}\), an error analysis of the state equation is needed. The two other terms, \(\|u_{h} - \bar{u}\|^{2}\) and \(J^{\prime}(\bar{u})(u_{h} - \bar{u})\), need special strategies to arrive at optimal error estimates. In particular, they depend on how the control functions are discretized. We explain the associated methods and their consequences below. Let us note already here that, within the framework of the abstract theory, we have to work with \(\hat{\mathcal {K}}_{h}\) and \(\hat{\mathcal {K}}\) instead of \(\mathcal {K}_{h}\) and \(\mathcal {K}\), respectively.

4.1.1 Variational discretization of controls

If a control \(u \in \mathcal {K}\) is not discretized, we set \(\mathcal {K}_{h}=\mathcal {K}\). Then \(u_{\mid {\Omega_{h}}}\) is inserted in J h . We extend u h to Ω∖Ω h by \(u_{h}(x) := \bar{u}(x)\) on Ω∖Ω h . In this case, a discretization of the control enters only implicitly through the necessary optimality conditions. The discretized adjoint state \(\bar{\varphi}_{h}\) is a piecewise linear function. If \(\mathcal {K}_{h}=\mathcal {K}\) and the simplified functional (1.2) is considered, the discretized optimal control \(\bar{u}_{h}\) of (P h ) satisfies the projection formula (3.36). Then, along with \(\bar{\varphi}_{h}\), the associated control \(\bar{u}_{h}\) is piecewise linear on Ω h . This strategy is called variational discretization, cf. Hinze [18].

For the more general objective functional of (P), the projection formula (3.34) has to be used instead. In this case, the associated discretized control \(\bar{u}_{h}\) is not necessarily piecewise linear and has a more complicated structure. Then the variational discretization is only of limited value. Nevertheless, the following result holds:

Theorem 4.1

(Variational discretization, L 2-estimate)

Assume that (H1)–(H4) are satisfied. Let \(\bar{u}\) be a strict locally optimal control of (P) that fulfills the second-order sufficient optimality condition (2.6), and suppose that \(\mathcal {K}_{h}= \mathcal {K}\). Let further \(\{(\bar{y}_{h},\bar{u}_{h})\}\) be a sequence of locally optimal solutions of (Q h ) that converges strongly in \(W^{1,p}(\Omega)\times L^{p}(\Omega)\) to \((\bar{y},\bar{u})\); such a sequence exists according to Theorem 2.12. Then there is a constant C>0 not depending on h such that

$$ \|\bar{u}_h - \bar{u}\|_{L^2(\Omega_h)} \le C h^2 \quad \forall h > 0.$$
(4.1)

Proof

We can apply Theorem 2.14, since all needed assumptions are satisfied. In (2.19), we set \(u_{h} := \bar{u}\). Then only the term \(\varepsilon_{h}^{2}\) does not vanish and we obtain instantly the error estimate

$$ \|\bar{u}_h - \bar{u}\|_{L^2(\Omega_h)} \le C \varepsilon_h.$$
(4.2)

In (4.21) we will show ε h =Ch 2, hence (2.19) yields the result. □

4.1.2 Piecewise constant control approximation

We assumed in (H1) that Ω is convex with smooth boundary to obtain the needed regularity of \(\bar{u}\). For formal reasons, we must specify the discretized controls u h also in Ω∖Ω h . We define

$$ \mathcal {K}_h= \{ u \in \mathcal {K}: {u}_{\mid T} \ \mbox{is constant for all}\ T\in {\mathcal{T}}_h\ \mbox{and}\ u_{\mid \Omega\setminus\Omega_h} =\bar{u}\}.$$
(4.3)

Notice that we do not need the values of u h in Ω∖Ω h ; F h and the discretized equations are only considered in Ω h . It does not matter that we do not know \(\bar{u}\) in advance. For (P), the derivative \(J^{\prime}(\bar{u})\) is given by (3.16). We write it in the form

$$J^\prime(\bar{u}) u = \int_\Omega \bar{d}(x) u(x)\,dx,$$

where the reduced gradient \(\bar{d} \in L^{2}(\Omega)\) is given by

$$ \bar{d}(x) :=\frac{\partial L}{\partial u}(x,\bar{y}(x),\bar{u}(x))+\bar{\varphi}(x).$$
(4.4)

For error estimates, we apply Theorem 2.14. Again, we have to define the function u h in (2.19) in the right way. With piecewise constant controls, we proceed as follows: On any triangle \(T \in \mathcal{T}_{h}\), we define u h as the piecewise constant interpolate

$$ u_h(x) := \frac{1}{|T|}\int_T \bar{u}(s)\,ds \quad \forall x\in T, \ \forall T \in \mathcal{T}_h.$$
(4.5)

Notice that we have set \(u_{h}(x) = \bar{u}(x)\) for all x∈Ω∖Ω h .
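The first-order rate behind (4.7) is easy to observe in one dimension. The sketch below (the function and mesh are arbitrary illustrative choices, not data from the paper) computes the \(L^2\) error of the cell-average interpolate (4.5) for a Lipschitz function and checks that halving h halves the error:

```python
import numpy as np

def l2_error_cell_average(u, N, q=64):
    """L2 error between u and its piecewise constant cell-average
    interpolate (cf. (4.5)) on a uniform mesh of N cells over [0, 1]."""
    h = 1.0 / N
    err2 = 0.0
    for i in range(N):
        t = i * h + h * (np.arange(q) + 0.5) / q   # midpoint quadrature nodes
        vals = u(t)
        mean = vals.mean()                          # approximate cell average
        err2 += ((vals - mean) ** 2).mean() * h
    return np.sqrt(err2)

u = lambda x: np.sin(3.0 * x)                       # smooth, hence Lipschitz on [0, 1]
e1, e2 = l2_error_cell_average(u, 32), l2_error_cell_average(u, 64)
print(e1 / e2)  # close to 2: first order in h, as in (4.7)
```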

In view of (3.27), we can select \(\varepsilon_{h}=Ch^{2}\) in Theorem 2.14. For piecewise constant control approximation, we cannot fully benefit from this: even \(\varepsilon_{h}=Ch\) would guarantee the error estimate below.

In the following proofs, we denote by \(\mathcal{T}_{0} \subset \mathcal{T}_{h}\) the set of all triangles T that contain at least one zero \(x_{T}\) of \(\bar{d}\). Notice that the continuity of \(\bar{d}\) is needed to make this definition meaningful. In all other triangles, \(\bar{d}\) is, by continuity, either everywhere positive or everywhere negative. The set \(\mathcal{T}_{0}\) will be essential for deriving optimal error estimates.

Theorem 4.2

(Piecewise constant controls, L 2-estimate)

Assume (H1)–(H4) and let a locally optimal control \(\bar{u}\) of (P) satisfy the second-order sufficient conditions (2.6). Define the admissible set \(\mathcal {K}_{h}\) by (4.3). Let further \(\{\bar{u}_{h}\}\) be a sequence of locally optimal (piecewise constant) solutions to (Q h ) that converges strongly in \(L^{2}(\Omega)\) to \(\bar{u}\). Then there is a constant C>0 not depending on h such that

$$ \|\bar{u}_h - \bar{u}\|_{L^2(\Omega_h)} \le C h \quad \forall h > 0.$$
(4.6)

Proof

By Theorem 3.3, the adjoint state \(\bar{\varphi}\) belongs to \(W^{2,\bar{p}}(\Omega)\), since the right-hand side of the adjoint equation (3.17) belongs to \(L^{\bar{p}} (\Omega)\). Therefore, \(\bar{\varphi}\) even belongs to \(C^{1}(\bar{\Omega})\). Theorem 3.4 yields that \(\bar{u}\) is Lipschitz, hence \(\bar{d}\) is Lipschitz as well. Notice that assumption (3.10) implies that L is Lipschitz with respect to x. The projection formula ensures in particular \(\bar{u} \in H^{1}(\Omega)\), hence it holds

$$ \|u_h - \bar{u}\|_{L^2(\Omega)} \le C h,$$
(4.7)

cf. Ciarlet [11]. Consider first the triangles \(T \notin \mathcal{T}_{0}\). Here, \(\bar{d}\) is either positive or negative in T. If \(\bar{d}(x) > 0\) in T, then \(\bar{u}(x) = \alpha\) a.e. in T. This follows from a standard pointwise discussion of (3.19), see e.g. [22, Lemma 2.2.6].

By the construction (4.5), it then also holds that \(u_{h}(x)=\alpha\) ∀x∈T. If \(\bar{d}(x)<0\) a.e. in T, we obtain accordingly \(u_{h}(x)=\beta\) ∀x∈T. In either case, we have

$$\int_T \bar{d}(x) (\bar{u}_h(x) - \bar{u}(x)) \,dx = 0 \quad \forall T\notin \mathcal{T}_0.$$

In each \(T \in \mathcal{T}_{0}\), there exists x T such that \(\bar{d}(x_{T}) = 0\). We estimate

(4.8)

where \(L_{\bar{d}}\) is the Lipschitz constant of \(\bar{d}\); we used the inequality (4.7).

Now we apply Theorem 2.14 and use (3.27), (4.7) and (4.8). These estimates show that all three terms appearing on the right-hand side of (2.19) are bounded by \(Ch^{2}\), hence

$$\varepsilon_h^2 + \|u_h - \bar{u}\|_{L^2(\Omega)}^2 +J^\prime(\bar{u})(u_h - \bar{u}) \le C h^2.$$

Now (2.19) yields \(\|\bar{u}_{h} - \bar{u}\|_{L^{2}(\Omega)} \le C h \) as it was claimed in the theorem. □

This estimate is optimal, since the order of the \(L^{2}\)-approximation of a function by step functions is in general not better than h. However, as observed numerically, the error \(\|\bar{y}_{h} - \bar{y}\|_{L^{\infty}(\Omega_{h})}\) for the state function often has a higher order. This phenomenon was first observed in [7] and explained analytically in [20].

Our last result leads directly to an associated maximum-norm estimate.

Theorem 4.3

(Piecewise constant controls, L -estimate)

Under the assumptions of Theorem 4.2, there is a constant C>0 not depending on h such that

$$\|\bar{u}_h - \bar{u}\|_{L^\infty(\Omega_h)} \le C h \quad \forall h > 0.$$
(4.9)

Proof

From Lemma 3.8, estimate (3.29) and Theorem 4.2, we get

$$\|\varphi_{\bar{u}} - \varphi_h(\bar{u}_h)\|_{C(\bar{\Omega}_h)} \le C h.$$

Subtracting (3.21) and (3.37), we obtain a.e. on each triangle T

(4.10)
(4.11)

The second inequality follows with some effort by an application of the mean value theorem in (3.38) and a Taylor expansion of \(L(x,\bar{y}_{h}(x),s_{h\mid_{T}})\) with respect to the second and third variable. Moreover, Lemma 3.8, the Lipschitz continuity of \(y_{\bar{u}}\), \(\varphi_{\bar{u}}\), L(x,y,u), and of \(\bar{s}\) with respect to x, and the convexity condition of (H4) are needed. In the last line, we applied Theorem 4.2. For details we refer to [10, p. 29]. □

4.1.3 Piecewise linear control approximation

This type of approximation is frequently used in numerical computations, since its implementation is fairly easy. We adopt the notation \(\bar{d}\) and \(\mathcal{T}_{0}\) from above. We define

$$ \mathcal {K}_h= \{ u \in \mathcal {K}: {u}_{\mid T} \in \mathcal{P}_1 \ \mbox{for all}\ T\in {\mathcal{T}}_h\ \mbox{and}\ u_{\mid \Omega\setminus\Omega_h} =\bar{u}\}.$$
(4.12)

To apply Theorem 2.14, we use the function

$$ u_h := \Pi_h \bar{u} := \left\{ \begin{array}{l@{\quad }l}(\pi_h \bar{u})(x) & \forall x \in \Omega_h\\[1ex]\bar{u}(x) & \forall x \in \Omega\setminus\Omega_h, \end{array} \right.$$
(4.13)

where \(\pi_{h}: C(\bar{\Omega}) \to C(\bar{\Omega}_{h})\) associates to \(u \in C(\bar{\Omega}) \) its standard piecewise linear interpolate on Ω h . It holds that \(\|u - \Pi_{h} u\|_{C(\bar{\Omega}_{h})} \to 0\), if u is continuous on \(\bar{\Omega}\).
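For a smooth function, the nodal interpolate converges in \(L^2\) at the classical rate \(O(h^2)\), in particular o(h) as used in (4.15). A one-dimensional sketch (the test function is chosen only for illustration):

```python
import numpy as np

def l2_error_p1_interp(u, N, q=200):
    """L2 error between u and its piecewise linear nodal interpolate
    on a uniform mesh of N cells over [0, 1]."""
    nodes = np.linspace(0.0, 1.0, N + 1)
    t = np.linspace(0.0, 1.0, q * N + 1)            # fine quadrature grid
    interp = np.interp(t, nodes, u(nodes))          # the interpolate pi_h u
    diff2 = (u(t) - interp) ** 2
    dt = t[1] - t[0]                                # composite trapezoidal rule
    return np.sqrt(dt * (diff2.sum() - 0.5 * (diff2[0] + diff2[-1])))

u = lambda x: np.sin(np.pi * x)
e1, e2 = l2_error_p1_interp(u, 16), l2_error_p1_interp(u, 32)
print(e1 / e2)  # close to 4: second order, hence o(h)
```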

In general, we expect a better approximation of \(\bar{u}\) by piecewise linear functions than by piecewise constant controls. However, without further assumptions, the improvement is only marginal.
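A one-dimensional numerical illustration of this comparison may be helpful (a toy example outside the paper's setting; the function, grid, and helper names are our own): for a smooth function, the \(L^2\) interpolation error of step functions decays like h, while that of piecewise linear interpolants decays like \(h^2\).

```python
import numpy as np

def l2_error(f, approx, a=0.0, b=1.0, samples=20000):
    """Approximate the L2(a,b) error ||f - approx|| by a fine midpoint rule."""
    x = np.linspace(a, b, samples, endpoint=False) + (b - a) / (2 * samples)
    return np.sqrt(np.mean((f(x) - approx(x)) ** 2) * (b - a))

def pc_interp(f, n):
    """Piecewise constant approximation on n uniform cells (cell-midpoint values)."""
    def approx(x):
        cells = np.clip((x * n).astype(int), 0, n - 1)
        return f((cells + 0.5) / n)
    return approx

def pl_interp(f, n):
    """Continuous piecewise linear interpolation on a uniform grid with n cells."""
    nodes = np.linspace(0.0, 1.0, n + 1)
    vals = f(nodes)
    return lambda x: np.interp(x, nodes, vals)

f = lambda x: np.sin(2 * np.pi * x)
for n in (16, 32, 64):
    # halving h roughly halves the piecewise constant error (order h)
    # and quarters the piecewise linear error (order h^2)
    print(n, l2_error(f, pc_interp(f, n)), l2_error(f, pl_interp(f, n)))
```

For the nonsmooth optimal controls considered here, the piecewise linear rate degrades, which is why Theorem 4.4 below yields only o(h) without structural assumptions on the active set.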

Theorem 4.4

(Piecewise linear controls, low order \(L^2\)-estimate)

Assume that \(\bar{u}\) is a local solution to (P) that satisfies the second-order sufficient condition (2.6). Let the assumptions (H1)–(H4) be satisfied and \(\mathcal {K}_{h}\) be the set defined in (4.12). If \(\{\bar{u}_{h}\}\) is a sequence of locally optimal piecewise linear controls to (Q h ) with \(\|\bar{u}_{h} - \bar{u}\|_{L^{2}(\Omega_{h})} \to 0\), then \(\|\bar{u}_{h} - \bar{u}\|_{L^{2}(\Omega_{h})} = o(h)\), i.e.

$$\lim_{h \to 0} \frac{1}{h}\|\bar{u}_h - \bar{u}\|_{L^2(\Omega_h)} = 0.$$
(4.14)

Proof

We proceed similarly as in the proof of the last theorem, but we now define \(u_{h} = \Pi_{h} \bar{u} \) as in (4.13). If \(T\notin \mathcal{T}_{0}\), then again either \(\bar{u}(x) \equiv \alpha\) or \(\bar{u}(x) \equiv \beta\) holds in T, so that \(u_{h}(x) = \Pi_{h} \bar{u}(x) =\bar{u}(x)\) in T. Now we estimate

in view of the Lipschitz continuity of \(\bar{d}\) and the estimate

$$ \|\Pi_h u-u\|_{L^2(\Omega_h)} = o(h)$$
(4.15)

for \(u \in W^{1,p}(\Omega)\), \(p>n\), cf. Brenner and Scott [4]. In Theorem 2.14 we have to deal with the terms \(\varepsilon_{h}^{2}\), \(\|u_{h} - \bar{u}\|_{L^{2}(\Omega_{h})}^{2}\), and \(J^{\prime}(\bar{u})(u_{h} - \bar{u})\). For \(\varepsilon_h\), the finite element analysis yields again \(\varepsilon_h \le Ch^2\). By (4.15), the second term is of the order \(o(h^2)\), while the last one was shown above to have the order \(o(h^2)\).
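The order o(h) in (4.15), which the first-order estimate \(Ch\|u\|_{W^{1,p}(\Omega)}\) alone would not give, can be obtained by a density argument of the following type (our sketch, under the stated assumption \(u \in W^{1,p}(\Omega)\), \(p>n\); C denotes generic constants). Given \(\delta > 0\), choose \(u_\delta \in C^\infty(\bar{\Omega})\) with \(\|u - u_\delta\|_{W^{1,p}(\Omega)} < \delta\). Then

```latex
\|\Pi_h u - u\|_{L^2(\Omega_h)}
  \le \|\Pi_h (u-u_\delta) - (u-u_\delta)\|_{L^2(\Omega_h)}
    + \|\Pi_h u_\delta - u_\delta\|_{L^2(\Omega_h)}
  \le C h\,\|u-u_\delta\|_{W^{1,p}(\Omega)} + C h^2\,\|u_\delta\|_{H^2(\Omega)},
```

so that \(\limsup_{h\to 0} h^{-1}\|\Pi_h u - u\|_{L^2(\Omega_h)} \le C\delta\) for every \(\delta > 0\), which is precisely (4.15).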

All three terms appearing in the right-hand side of (2.19) were confirmed to have the order \(o(h^2)\), hence \(\varepsilon_{h}^{2} +\|u_{h} - \bar{u}\|_{L^{2}(\Omega_{h})}^{2} + J^{\prime}(\bar{u})(u_{h} - \bar{u}) = o(h^{2})\), so that (2.19) yields \(\|\bar{u}_{h} - \bar{u}\|_{L^{2}(\Omega_{h})} = o(h) \), as claimed in the theorem. □

In numerical tests with piecewise linear control approximation, the higher order \(h^{3/2}\) of approximation is often observed. The reason is that the boundary of the active set of \(\bar{u}\) often has zero measure.
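The measure condition imposed in the next theorem then has a simple geometric interpretation (a heuristic count, not a proof, under the assumption that the zero set of \(\bar{d}\) meets \(\Omega_h\) in a Lipschitz curve of finite length \(\ell\) and that the mesh is quasi-uniform):

```latex
\#\{T \in \mathcal{T}_h :\ \bar{d} \not\equiv 0 \ \text{on } T,\ \bar{d} \ \text{vanishes somewhere on } T\}
  \le C\,\ell\,h^{-1},
\qquad |T| \le C h^2
\quad\Longrightarrow\quad
|\mathcal{T}_*| \le C\,\ell\,h,
```

since a rectifiable curve of length \(\ell\) intersects at most \(O(\ell/h)\) triangles of a quasi-uniform mesh, each of area \(O(h^2)\).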

Theorem 4.5

(Piecewise linear controls, optimal \(L^2\)-estimate)

Suppose, in addition to the assumptions of Theorem 4.4, that \(L(x,y,u)=l(x,y)+(\Lambda/2)u^2\) and that the measure of the set \(\mathcal{T}_{*}\), formed by the union of all triangles \(T \in \mathcal{T}_h\) on which \(\bar{d}\) is not identically zero but vanishes on a subset of T of positive measure, can be estimated by

$$|\mathcal{T}_*| \le C h.$$

Then there exists some constant C>0 not depending on h such that

$$ \|\bar{u}_h - \bar{u}\|_{L^2(\Omega_h)} \le C h^{\frac{3}{2}}\quad \forall h > 0.$$
(4.16)

Proof

We decompose \(\mathcal{T}_{h}\) into three families of triangles: \(\mathcal{T}_{1}\), containing the triangles where \(\bar{d}\) vanishes nowhere; \(\mathcal{T}_{2}\), containing those where \(\bar{d}\) vanishes identically; and \(\mathcal{T}_{3}\), containing the remaining triangles, whose union is \(\mathcal{T}_{*}\).

If \(T \in \mathcal{T}_{1}\), then either \(\bar{u}(x) = \alpha\ \forall x \in T\) or \(\bar{u}(x) = \beta\ \forall x \in T\), therefore \(\bar{u}(x) - u_{h}(x) \equiv 0\) in T. If \(T \in \mathcal{T}_{2}\), then \(\bar{u}(x) = -(1/\Lambda)\bar{\varphi}(x)\) for every xT, hence \(\bar{u}_{\mid T} \in H^{2}(T)\) and we have the interpolation error

$$\|\bar{u} - u_h\|_{L^2(T)} \le Ch^2\|\bar{u}\|_{H^2(T)} = \frac{C}{\Lambda}h^2\|\bar{\varphi}\|_{H^2(T)}.$$

Finally, in \(\mathcal{T}_{3}\) it holds

$$\sum_{T \in \mathcal{T}_3}\int_T|\bar{u} - u_h|^2 \,dx \le Ch^2\|\bar{u}\|^2_{C^{0,1}(\bar{\Omega})}\sum_{T \in \mathcal{T}_3}|T| = Ch^2\|\bar{u}\|^2_{C^{0,1}(\bar{\Omega})}|\mathcal{T}_*| \le C h^3.$$

From these estimates we infer

$$\|\bar{u} -u_h\|^2_{L^2(\Omega_h)} \le C h^4\|\bar{\varphi}\|^2_{H^2(\Omega)} + C h^3 \le Ch^3, \quad\mbox{hence}\quad \|\bar{u} -u_h\|_{L^2(\Omega_h)} \le Ch^{3/2}.$$

It remains to estimate \(J'(\bar{u})(u_{h} - \bar{u})\). By the above remarks and the fact that \(\bar{u}\) and u h coincide in Ω∖Ω h , only the triangles of \(\mathcal{T}_{3}\) contribute. In each \(T \in \mathcal{T}_{3}\) there exists x T such that \(\bar{d}(x_{T}) = 0\), hence

 □

4.2 Verification of the assumptions (A1)–(A7)

Finally, we verify the assumptions (A1)–(A7) for the quasilinear optimal control problem (P). This is a fairly technical task, characteristic of problems with quasilinear equations. The verification would be much easier for semilinear elliptic problems. In either case, the discussion of error estimates would be much longer without our general Theorem 2.14.

We recall our choice \(U_\infty = L^\infty(\Omega)\), \(U_2 = L^2(\Omega)\). Notice that we must verify the assumptions for the sets \(\hat{\mathcal {K}}\) and \(\hat{\mathcal {K}}_{h}\) instead of \(\mathcal {K}\) and \(\mathcal {K}_{h}\), respectively.

(A1): By Theorem 3.2 and the uniform boundedness of \(\mathcal {K}\), all possible states \(y_u\) satisfy, with some C>0,

$$|y_u(x)| \le C \quad \forall x \in \Omega, \ \forall u \in \mathcal {K}.$$

The mapping \(u \mapsto y_u\) is in particular continuous from \(L^2(\Omega)\) to \(H^{1}_{0}(\Omega)\), see Theorem 3.2. Let now \(u_k \rightharpoonup u\) in \(L^2(\Omega)\). The embedding of \(L^2(\Omega)\) in \(W^{-1,p}(\Omega)\) is compact for all \(p>1\) if \(n=2\), and for all \(1<p<6\) if \(n=3\). Therefore, we have \(u_k \to u\) in \(W^{-1,p}(\Omega)\), and Theorem 2.3 of [9] ensures \(y_{u_{k}} \to y_{u}\) in \(W^{1,p}(\Omega)\subset C(\bar{\Omega})\). We split

By the uniform boundedness of \(y_{u_{k}}\), \(y_{u_{k}} \to y_{u}\) in \(C(\bar{\Omega})\) and the Lipschitz property (3.11), the second integral tends to zero. Thanks to condition (3.8), the mapping uL(x,y,u) is convex.

The functional above is defined on \(\mathcal {K}\), but not in general on L 2(Ω). In view of this, general results on lower semicontinuity of convex and continuous functionals defined on the whole space are not directly applicable. Therefore, to obtain the property of lower semicontinuity above, we apply the Mazur theorem and select a sequence of convex combinations of the u k that converges strongly to u. Notice that also the convex combinations belong to \(\mathcal {K}\). Then the property follows by the convexity of uL(x,y,u).
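The Mazur argument just described can be recorded as follows (our sketch, for \(u_k \rightharpoonup u\) in \(L^2(\Omega)\)): there are convex combinations \(\tilde{u}_m = \sum_{k=m}^{N_m} \lambda_k^{(m)} u_k \in \mathcal{K}\), \(\lambda_k^{(m)} \ge 0\), \(\sum_k \lambda_k^{(m)} = 1\), with \(\tilde{u}_m \to u\) in \(L^2(\Omega)\). By convexity of \(u \mapsto L(x,y_u(x),u)\),

```latex
\int_\Omega L(x,y_u,\tilde{u}_m)\,dx
  \;\le\; \sum_{k=m}^{N_m}\lambda_k^{(m)} \int_\Omega L(x,y_u,u_k)\,dx
  \;\le\; \sup_{k \ge m} \int_\Omega L(x,y_u,u_k)\,dx,
```

and passing to the limit \(m \to \infty\) (along a subsequence with \(\tilde{u}_m \to u\) a.e., using the uniform bounds on \(\mathcal{K}\) and dominated convergence on the left) yields \(\int_\Omega L(x,y_u,u)\,dx \le \liminf_{k\to\infty}\int_\Omega L(x,y_u,u_k)\,dx\).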

(A2): Here, we take \(\mathcal{A} = L^{\infty}(\Omega)\) and select any fixed real number \(r>0\). It follows from Theorem 3.2 and assumption (H4) on L that J is of class \(C^2\) in \(L^\infty(\Omega)\). The first-order derivative J′(u) is given by (3.16). We estimate

$$|J'(u)v|\le \int_\Omega\biggl|\frac{\partial L}{\partial u}(x,y_u,u)+\varphi_{u}\biggr| |v|\,dx.$$

The inequality \(|J^{\prime}(u)v| \le c \|v\|_{L^{2}(\Omega)}\) follows immediately, since y u , u, and φ u are all bounded and measurable. The boundedness of φ u is obtained from Theorem 3.3.

To discuss J′′(u), we employ its representation (3.18). The functions \(z_{v_{i}} = G^{\prime}(u)v_{i}\) were introduced in Theorem 3.2. Thanks to \(\mathcal {K}\subset L^{\infty}(\Omega)\), the functions \(y_u\) and \(\varphi_u\) belong to \(W^{2,\bar{p}}(\Omega) \subset C^{1}(\bar{\Omega})\). Therefore, they and their gradients are bounded and measurable. The linear mappings \(v_{i} \mapsto z_{v_{i}}\) and \(v_{i} \mapsto \nabla z_{v_{i}}\) are continuous from \(L^2(\Omega)\) to \(L^2(\Omega)\) and to \(L^2(\Omega)^n\), respectively, as the reader may verify with Theorem 3.2. Therefore, the second part of (2.4) is easy to confirm. Let us consider the most difficult term in (3.18),

$$\int_\Omega \biggl|\nabla \varphi_u \frac{\partial a}{\partial y}(x,y_u)z_{v_1}\nabla z_{v_2}\biggr|\,dx\le \|\varphi_u\|_{W^{2,\bar{p}}(\Omega)}\biggl\|\frac{\partial a}{\partial y}(x,y_u)\biggr\|_{\infty}\|z_{v_1}\|_{L^2(\Omega)}\|z_{v_2}\|_{H^1_0(\Omega)}.$$

It is clear that this term can be estimated against \(\|v_{1}\|_{L^{2}(\Omega)}\|v_{2}\|_{L^{2}(\Omega)}\).

Next, we verify (2.5).

Lemma 4.6

Under the assumptions (H1)–(H4), the objective functional J of problem (P) obeys the estimates

(4.17)
(4.18)

for all \(u_{i} \in B_{\infty}(\bar{u},r)\), v,v i L 2(Ω), if r>0 is sufficiently small. The constant C(r)>0 does not depend on u i , v, and v i .

Proof

Thanks to the representations of J′ and J′′, it is sufficient to verify

$$ \|\varphi_{u_1}-\varphi_{u_2}\|_{W^{2,p}(\Omega)} \le C(r) \|u_1-u_2\|_{L^\infty(\Omega)} \quad \forall u_i \in \mathcal {K}\cap B_\infty(\bar{u},r).$$
(4.19)

Write for short \(\varphi_{i} := \varphi_{u_{i}}\) and \(y_{i} := y_{u_{i}}\), i=1,2; then

$$\everymath{\displaystyle}\begin{array}{l}-\operatorname{div}\left[a(x,y_1)\nabla(\varphi_1 -\varphi_2)\right] + \frac{\partial a}{\partial y}(x,y_1)\nabla y_1\cdot \nabla(\varphi_1 - \varphi_2) + \frac{\partial f}{\partial y}(x,y_1)(\varphi_1 - \varphi_2)\\[2ex]\quad {}= -\operatorname{div}\left[(a(x,y_2) - a(x,y_1))\nabla\varphi_2\right] +\biggl[\frac{\partial a}{\partial y}(x,y_2)\nabla y_2 -\frac{\partial a}{\partial y}(x,y_1)\nabla y_1\biggr]\cdot \nabla\varphi_2\\[2ex]\qquad {}+\biggl[\frac{\partial f}{\partial y}(x,y_2)- \frac{\partial f}{\partial y}(x,y_1)\biggr]\varphi_2.\end{array}$$

The right-hand side can be estimated in L p(Ω) by \(\|u_{2} - u_{1}\|_{L^{p}(\Omega)}\), cf. [10, (4.22)–(4.24)]. Now the W 2,p(Ω)-estimate (4.19) follows from regularity results by [17] for smooth domains. □

It remains to show that J′′(u) is a Legendre form. To this aim, we select a sequence \(v_k \rightharpoonup v\) in \(L^2(\Omega)\), k→∞, and consider (3.18) with \(v_1 = v_2 := v_k\)

(4.20)
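For convenience, we recall the standard notion being verified (stated here for the reader; it is not restated in this section): a quadratic form Q on a Hilbert space H is a Legendre form if it is sequentially weakly lower semicontinuous and if

```latex
v_k \rightharpoonup v \ \text{in } H
\quad\text{and}\quad Q(v_k) \to Q(v)
\quad\Longrightarrow\quad
v_k \to v \ \text{strongly in } H.
```

The argument that follows establishes exactly these two properties for \(Q = J''(u)\).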

The mapping vz v is linear and continuous, hence also weakly continuous, thus \(z_{v_{k}} \rightharpoonup z_{v}\) in H 2(Ω). The embedding of H 2(Ω) into \(C(\bar{\Omega}) \cap H^{1}_{0}(\Omega)\) is compact, hence \(z_{v_{k}} \to z_{v}\) in \(C(\bar{\Omega})\) and \(\nabla z_{v_{k}} \to \nabla z_{v}\) in L 2(Ω). Therefore, all expressions above except the last one tend to the terms associated with the limit v. By the convexity condition of (H4), it holds \(\frac{\partial^{2} L}{\partial u^{2}}(x,y_{u},u) \ge \Lambda\) a.e. in Ω; therefore the integral of the last term is lower semicontinuous. Altogether, we obtain

$$\liminf_{k \to \infty} J''(u)v_k^2 \ge J''(u)v^2.$$

If, in addition, \(J''(u)v_{k}^{2} \to J''(u)v^2\), then it follows

$$\everymath{\displaystyle}\begin{array}{rcl}\Lambda \|v_k - v\|_{L^2(\Omega)}^2 &\le&\int_\Omega \frac{\partial^2 L}{\partial u^2}(x,y_u,u)(v_k -v)^2\, dx\\[2ex]&=&\int_\Omega \frac{\partial^2 L}{\partial u^2}(x,y_u,u)v_k^2\, dx+\int_\Omega \frac{\partial^2 L}{\partial u^2}(x,y_u,u)v^2\, dx\\&&{}- 2 \int_\Omega \frac{\partial^2 L}{\partial u^2}(x,y_u,u)v_kv\, dx.\end{array}$$

By our additional assumption, the first integral tends to the value of the second one. The same holds true for the third one by weak convergence. Therefore, the right-hand side converges to zero. This implies \(\|v_{k} - v\|_{L^{2}(\Omega)} \to 0\), hence J′′(u) is a Legendre form and all requirements of (A2) are confirmed.

(A3): For our three cases of control discretization, this approximation condition is satisfied. This is trivial for the variational discretization. Moreover, any function \(u \in \mathcal {K}\) can be approximated by step functions u h via the projection (4.5) (insert \(\bar{u} := u\) there); it holds that \(u_h \to u\) in \(L^2(\Omega)\) and \(u_{h} \in \mathcal {K}_{h}\) for sufficiently small h>0.

For piecewise linear control discretization, we mention that \(W^{1,p}(\Omega)\) is dense in \(L^2(\Omega)\). To any fixed \(u \in \mathcal {K}\) and ε>0 there exists \(u_\varepsilon \in W^{1,p}(\Omega)\) such that \(\|u - u_{\varepsilon}\|_{L^{2}(\Omega)} < \varepsilon/2\). The function \(\hat{u}_{\varepsilon}(x) := \mathbb{P}_{[\alpha,\beta]}u_{\varepsilon}(x) \) belongs to \(W^{1,p}(\Omega) \cap \mathcal {K}\) and satisfies

$$|u(x) - \hat{u}_\varepsilon(x)| = |\mathbb{P}_{[\alpha,\beta]} u(x) - \mathbb{P}_{[\alpha,\beta]}u_\varepsilon(x)|\le |u(x) - u_\varepsilon(x)| \quad \mbox{in}\ \Omega $$

so that also \(\|u - \hat{u}_{\varepsilon}\|_{L^{2}(\Omega)} < \varepsilon/2\). The function \(u_{\varepsilon,h} := \Pi_{h}\hat{u}_{\varepsilon}\) belongs to \(\mathcal {K}_{h}\) and it holds \(\|\hat{u}_{\varepsilon}- u_{\varepsilon,h}\|_{L^{2}(\Omega)} < \varepsilon/2\) for all sufficiently small h>0. By the triangle inequality, we have \(\|u - u_{\varepsilon,h}\|_{L^{2}(\Omega)} < \varepsilon\) for all h<h 0(ε) and this finally confirms (A3).

(A4): Take, as in (A4), \(u_{k} \in \mathcal {K}_{h}\cap B(\bar{u},\rho_{\bar{u}})\) with \(u_k \rightharpoonup u\) in \(L^2(\Omega)\); we write for short \(y_k = y_h(u_k)\). We have to show the lower semicontinuity property

$$J_h(u) \le \liminf_{k \to \infty} J_h(u_k)$$

for each fixed h>0. The sequence {u k } is bounded in L (Ω), hence also {y k } in the finite-dimensional space Y h . Consequently, there exists a subsequence of {y k } denoted in the same way that converges strongly in \(H^{1}_{0}(\Omega)\) to some y h Y h . This is a solution to the discretized quasilinear equation associated with \(u \in \bar{B}(\bar{u},\rho_{\bar{u}})\). By local uniqueness, we have y h =y h (u), hence all such subsequences converge to the same limit y h (u) so that \(\|y_{h}(u_{k}) - y_{h}(u)\|_{L^{2}(\Omega_{h})} \to 0\). The sequence {y h (u k )} is bounded in L (Ω). Now the lower semicontinuity of J h can be verified as the one of J.

Since our set \(\mathcal {K}\) is bounded, the second requirement of (A4) need not be verified.

(A5): The condition (2.13) follows immediately from J(u)=F(G(u),u) and the continuity of the mapping \(G: L^2(\Omega) \to H^2(\Omega)\).

Let us show that |J h (u h )−J(u h )|→0 as h→0. Indeed, we obtain

$$\everymath{\displaystyle}\begin{array}{l}|J_h(u_h) - J(u_h)|\\\quad{}\le\int_{\Omega_h} |L(x,y_h(u_h), u_h) - L(x,y_{u_h}, u_h)|\, dx+ \int_{\Omega\setminus \Omega_h} |L(x,y_{u_h}, u_h)|\, dx\\[1ex]\quad {}\le C \|y_h(u_h)- y_{u_h}\|_{L^\infty(\Omega)} + \int_{\Omega\setminus \Omega_h}|L(x,0,0)|dx + c h^2 \to 0 \mbox{ as } h \to 0.\end{array}$$

This follows from the finite element estimate (3.24), since all \(u_h\) are bounded; by Lemma 3.8, \(\|y_{u_{h}}\|_{L^{\infty}(\Omega)}\) is bounded as well and, in view of the estimate (3.26), so is \(\|y_{h}(u_{h})\|_{L^{\infty}(\Omega)}\). Notice that \(\|L(\cdot,0,0)\|_{L^{1}(\Omega\setminus \Omega_{h})} \to 0\) as h→0. Now the semicontinuity requirement (2.14) is obtained by Remark 2.9, because (A1) has already been confirmed.

(A6): Assume that \(u_k \rightharpoonup u\), \(u_{k} \in \mathcal {K}\). We write \(y_{k} := y_{u_{k}}\) and split

with some \(u_\theta \in [u_k, u]\). Assume \(J(u_k) \to J(u)\) as in (A6). Then the left-hand side and the integrals I and II converge to zero (cf. the verification of (A5) above), hence so does the integral III. By the convexity condition in (H4), we deduce \(\|u_{k} - u\|_{L^{2}(\Omega)} \to 0\).

(A7): In view of the possible non-uniqueness of the discrete state y h , we restricted the discussion of (A1)–(A7) to the set \(\tilde{\mathcal {K}}= \mathcal {K}\cap \bar{B}(\bar{u},\rho_{\bar{u}})\). This is compatible with our needs, since we perform a local analysis around \(\bar{u}\). The set \(\tilde{\mathcal {K}}\) is also a convex and closed subset of \(L^\infty(\Omega)\). Therefore, our theory is applicable to this set instead of \(\mathcal {K}\). Let us first confirm (2.17). We obtain

$$\everymath{\displaystyle}\begin{array}{l}|(J_h^\prime(u)-J^\prime(u))(u_h - \bar{u})|\\\quad {}\le \int_{\Omega_h} \biggl|\varphi_h(u)-\varphi_{u}+ \frac{\partial L}{\partial u}(x,y_h(u),u)\\[1ex]\qquad {}-\frac{\partial L}{\partial u}(x,y_{u},u)\biggr| |u_h - \bar{u}|\, dx + \int_{\Omega \setminus \Omega_h}\biggl| \varphi_{u} + \frac{\partial L}{\partial u}(x,y_{u},u)\biggr| |u_h - \bar{u}| \, dx\end{array}$$

and, by Lemma 3.8,

$$\|\varphi_h(u)-\varphi_{u}\|_{L^2(\Omega_h)} + \|y_h(u)-y_{u}\|_{L^2(\Omega_h)} \le C h^2.$$

Therefore, the integral on Ω h can be estimated by \(c h^{2} \|u_{h} - \bar{u}\|_{L^{2}(\Omega)}\). The integral on Ω∖Ω h cannot be handled this way, but it vanishes. Notice that \(u_{h}(x) = \bar{u}(x)\) was assumed a.e. on Ω∖Ω h for the types of control discretization we consider. Therefore, (2.17) is satisfied with

$$ \varepsilon(h) = C h^2.$$
(4.21)

The confirmation of (2.18) is more delicate. It was shown in [10] that, under our assumptions on (P), the convergence \(\bar{u}_{h} \to \bar{u}\) in \(L^2(\Omega)\) implies \(\bar{u}_{h} \to \bar{u}\) in \(L^\infty(\Omega)\), provided that the controls are not discretized (variational discretization) or are taken as piecewise constant functions. This follows from the available projection formulas for \(\bar{u}_{h}\). The situation is more difficult for piecewise linear control approximation. Then (2.18) requires the discussion of \(J^{\prime\prime}(u_{k})v_{k}^{2}\) for weakly converging sequences {v k }. In (3.18) there is one delicate term, namely \(\int_{\Omega}\frac{\partial^{2} L}{\partial u^{2}}(x,y_{u_{k}},u_{k})v_{k}^{2}\,dx\), where we cannot really control the convergence of \(v_{k}^{2}\). It was shown in [6, pp. 149–150] by Egorov’s theorem that condition (2.18) is satisfied for piecewise linear control approximation. In [6], the equation was semilinear, but the method does not depend on the particular type of the equation.