1 Introduction

In this paper we study the following parabolic optimal control problem:

$$\begin{aligned} \min \limits _{u\in U_{ad}}\ \ J(y,u)&={1\over 2}\Vert y-y_d\Vert ^2_{L^2(0,T;L^2(\varOmega ))}+{\alpha \over 2}\Vert u\Vert ^2_{L^2(0,T;L^2(\varGamma ))} \end{aligned}$$
(1.1)

subject to

$$\begin{aligned} \left\{ {\begin{array}{l@{\quad }l@{\quad }l} \frac{\partial y}{\partial t} -\varDelta y=f \, \, &{}\text{ in }\ \varOmega _T, \\ \ y=u \,\, &{} \text{ on }\ \varSigma ,\\ y(0)=y_0\,\, &{} \text{ in }\ \varOmega , \end{array}} \right. \end{aligned}$$
(1.2)

where \(\varOmega _T=\varOmega \times (0,T], \,\varSigma =\partial \varOmega \times (0,T]\) with \(\varOmega \) denoting an open bounded domain with boundary \(\varGamma :=\partial \varOmega \), \(U_{ad}\) is the admissible control set which is assumed to be of box type

$$\begin{aligned} U_{ad}:=\big \{u\in L^2(0,T;L^2(\varGamma )):\ u_a\le u(x,t)\le u_b,\ \text{ a.e. }\ \text{ on }\ \varSigma \big \}, \end{aligned}$$
(1.3)

with \(u_a< u_b\) denoting constants. For convenience we make the following assumption on the domain \(\varOmega \) and the given data, which shall be valid throughout the paper without explicit mention:

Assumption 1

We assume that \(\varOmega \) is an open, bounded, convex polygonal domain in \(\mathbb {R}^2\). Moreover, \(\alpha >0, \,f\in L^2(0,T;L^2(\varOmega )), \,y_0\in L^2(\varOmega ), \,y_d\in L^2(0,T;L^2(\varOmega ))\) and \(T>0\) are fixed data.

Dirichlet boundary control is important in many practical applications such as the active boundary control of flows, see e.g. [13, 18, 20]. If one is, e.g. interested in blowing and suction as control on part of the boundary, controls with low regularity should be admissible, which could have jumps and satisfy pointwise bounds. In the mathematical theory one has to use the concept of very weak solutions in this situation, see [4] for a more detailed discussion of this fact.

In the present work we consider a parabolic Dirichlet boundary control problem of tracking type, which may be regarded as a prototype problem to study Dirichlet boundary control for time-dependent PDEs. For parabolic optimal boundary control problems of Dirichlet type, only a few contributions can be found in the literature [2, 3, 23]. Kunisch and Vexler [23] considered a semi-smooth Newton method for the numerical solution of parabolic Dirichlet boundary control problems. A penalization method using Robin-type boundary conditions, applied to parabolic Dirichlet boundary control problems, is investigated in [3]. However, to the best of the authors’ knowledge no error analysis is available for the finite element approximation of this kind of problem. With the present paper we intend to fill this gap and derive a priori error estimates for parabolic Dirichlet boundary control problems. Compared to the elliptic case, parabolic Dirichlet boundary control problems are more involved in both the definition of discrete schemes and the a priori error analysis, since the regularity of the involved state variable is low.

Finite element approximations of optimal control problems are important for the numerical treatment of optimal control problems related to practical applications, see e.g. [22, Ch. 4]. An overview on the numerical a priori and a posteriori analysis for elliptic control problems can be found in [22, Ch. 3] and [28], respectively. To the best of the authors’ knowledge the first contribution to parabolic optimal control problems is given in [36]. The state of the art in the numerical a priori analysis of distributed parabolic optimal control problems can be found in [30, 31]. More recent contributions with higher order in time Galerkin schemes can be found in [1, 32, 35]. Residual-based a posteriori error estimates are presented in [26] and [27]. For boundary control problems with parabolic equations we refer to [15]. There is a long list of contributions to boundary control of elliptic PDEs, see e.g. [7, 8, 10, 11, 14, 16, 21, 29, 34]. Further references can be found in [22, Ch. 3].

In this paper we use the very weak solution concept for the state equation and \(L^2(0,T;L^2(\varGamma ))\) as the control space to argue the existence of a unique solution to the optimal control problem (1.1)–(1.2). For the numerical discretization of the optimal control problem we discretize the state using standard piecewise linear and continuous finite elements in space and a dG(0) scheme in time. The Dirichlet boundary conditions are approximated by the space–time \(L^2\)-projection. The control is discretized in space either by piecewise linear finite elements or implicitly through the discretization of the adjoint state, the so-called variational discretization (see [19]). For both cases we derive a priori error bounds for the state and control in the \(L^2\)-norm for problems posed on polygonal domains. As the main result we obtain the error bound

$$\begin{aligned} \Vert u-U_{hk}\Vert _{L^2(L^2(\varGamma ))}+\Vert y-Y_{hk}\Vert _{L^2(L^2(\varOmega ))}\le & {} Ch^{{1\over 2}} \end{aligned}$$
(1.4)

under the coupling \(k=O(h^2)\), with both full control discretization and variational control discretization, for the optimal solution \((y,u)\) of (1.1), where \(Y_{hk}\) and \(U_{hk}\) denote the optimal discrete state and control; see also Corollary 1. We present several numerical examples which support our theoretical findings.
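A rate such as the one in (1.4) is usually checked numerically through the experimental order of convergence between consecutive meshes. The following Python sketch shows that computation only; the function name `experimental_orders` is hypothetical, and the error values are synthetic numbers manufactured to behave exactly like \(Ch^{1/2}\). They are not results from this paper.

```python
import math

def experimental_orders(hs, errors):
    """Observed convergence orders between consecutive mesh sizes."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(hs[i] / hs[i + 1])
        for i in range(len(hs) - 1)
    ]

# Synthetic data only (an assumption for illustration, not measured errors).
hs = [2.0 ** (-j) for j in range(2, 7)]      # mesh sizes h
ks = [h ** 2 for h in hs]                    # time steps under the coupling k = O(h^2)
errors = [3.0 * math.sqrt(h) for h in hs]    # manufactured to equal C * h^(1/2)

orders = experimental_orders(hs, errors)     # each entry is close to 0.5
```

In an actual experiment `errors` would hold the measured norms of \(u-U_{hk}\) or \(y-Y_{hk}\), computed with \(k\) reduced together with \(h\) as in `ks`.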

The rest of our paper is organized as follows. In Sect. 2 we present the analytical setting of the parabolic Dirichlet boundary control problem and argue the existence of a unique solution. In Sect. 3 we establish the fully discrete finite element approximation to the state equation and the corresponding stability results. Then we formulate the fully discrete approximation for parabolic Dirichlet boundary control problems. The a priori error analysis for the finite element approximation and the variational discretization of the optimal control problems posed on convex, polygonal domains is studied in Sect. 4. Furthermore, we present some numerical experiments in Sect. 5 to support our theoretical results.

2 Optimal Control Problem

For \(m\ge 0\) and \(1\le s\le \infty \), we adopt the standard notation \(W^{m,s}(\varOmega )\) for Sobolev spaces on \(\varOmega \) with norm \(\Vert \cdot \Vert _{m,s,\varOmega }\) and seminorm \(|\cdot |_{m,s,\varOmega }\), where \(H^m(\varOmega )=W^{m,2}(\varOmega ), \,\Vert \cdot \Vert _{m,\varOmega }=\Vert \cdot \Vert _{m,2,\varOmega }\) and \(|\cdot |_{m,\varOmega }=|\cdot |_{m,2,\varOmega }\) for \(s=2\). Note that \(H^0(\varOmega )=L^2(\varOmega )\) and \(H_0^1(\varOmega )=\{v\in H^1(\varOmega );\ v=0\ \text{ on }\ \partial \varOmega \}\). We denote by \(L^r(0,T;W^{m,s}(\varOmega ))\) the Banach space of all \(L^r\)-integrable functions from [0, T] into \(W^{m,s}(\varOmega )\) with norm \(\Vert v\Vert _{L^r(0,T;W^{m,s}(\varOmega ))}= \Big (\int _0^T\Vert v\Vert ^r_{m,s,\varOmega }dt\Big )^{{1\over r}}\) for \(1\le r<\infty \), and with the standard modification for \(r=\infty \). For a Banach space Y, we use the abbreviations \(L^2(Y)=L^2(0,T;Y), \,H^s(Y)=H^s(0,T;Y), \,s\in [0,\infty )\), and \(C(Y)=C([0,T];Y)\). We denote the \(L^2\)-inner products on \(L^2(\varOmega ), \,L^2(\varOmega _T)\) and \(L^2(\varGamma )\) by \((\cdot ,\cdot ), \,(\cdot ,\cdot )_{\varOmega _T}\) and \(\langle \cdot ,\cdot \rangle \), respectively. In addition, c and C denote generic positive constants.

Let

$$\begin{aligned} a(y,w)=\int _{\varOmega }\nabla y\cdot \nabla w \quad \forall \ y,w\in H^1(\varOmega ). \end{aligned}$$

The standard weak form for the parabolic equation (1.2) is to find \(y\in L^2(H^1(\varOmega ))\cap H^1(H^{-1}(\varOmega ))\) with \(y|_{\varSigma }=u\) and \(y(\cdot ,0)|_\varOmega =y_0(\cdot )\) such that

$$\begin{aligned} \left( \frac{\partial y}{\partial t},v \right) +a(y, v)=(f,v)\ \ \text{ a.a. }\ t\in (0,T], \ \forall \ v\in H_0^1(\varOmega ). \end{aligned}$$
(2.1)

This setting requires \(u\in L^2(H^{1\over 2}(\varGamma ))\). Motivated by practical considerations (see e.g. the discussion in [4]) we are interested in controls \(u\in U_{ad}\) defined in (1.3). For a proper treatment of the state equation in this case we use the transposition technique introduced by Lions and Magenes (see [24, Ch. 2, Sec. 5.2] and [25, Ch. 2]) to argue the existence of a unique solution to the state equation (1.2) in the present paper. The very weak form of (1.2) that we shall utilize reads: Find \(y\in L^2(L^2(\varOmega ))\) such that

$$\begin{aligned} \int _{\varOmega _T}y(-z_t-\varDelta z)dxdt=-\int _{\varSigma }u\partial _{{n}}zdsdt+\int _{\varOmega _T}fzdxdt+\int _\varOmega y_0z(\cdot ,0)dx\nonumber \\ \ \ \forall \ z\in L^2(H^2(\varOmega )\cap H_0^1(\varOmega ))\cap H^1(L^2(\varOmega )) \end{aligned}$$
(2.2)

with \(z(\cdot ,T)=0\), where \(\partial _{{n}}v:=\nabla v\cdot n\) and n denotes the unit outward normal to \(\varGamma \). The existence and uniqueness of a very weak solution of (2.2), which we denote by \(y=\mathcal {G}(u)\), are established in the following lemma (see, e.g. [25]).

Lemma 1

For each \(u\in L^2(L^2(\varGamma ))\), there exists a unique very weak solution \(y\in L^2(L^2(\varOmega ))\) of (2.2) satisfying

$$\begin{aligned} \Vert y\Vert _{L^2(L^2(\varOmega ))}\le C\big (\Vert y_0\Vert _{L^2(\varOmega )}+\Vert f\Vert _{L^2(L^2(\varOmega ))}+\Vert u\Vert _{L^2(L^2(\varGamma ))}\big ). \end{aligned}$$
(2.3)

Proof

For \(y_0\in L^2(\varOmega ), \,f\in L^2(L^2(\varOmega ))\) and \(u\equiv 0\) it is straightforward to show that (1.2) admits a unique solution \(y\in L^2(H^1_0(\varOmega ))\cap H^1(H^{-1}(\varOmega ))\) in the sense of (2.1), which also satisfies (2.3). To prove the lemma in the case \(u\ne 0\) it is sufficient to consider the case \(f\equiv 0, \,y_0\equiv 0\), where we follow the constructive approach of [10]. For each \(g\in L^2(L^2(\varOmega ))\) we denote by \(z\in L^2(H^2(\varOmega )\cap H_0^1(\varOmega ))\cap H^1(L^2(\varOmega ))\) the solution of

$$\begin{aligned} \left\{ { \begin{array}{l@{\quad }l@{\quad }l} -\frac{\partial z}{\partial t} -\varDelta z=g \, \, &{}\text{ in }\ \varOmega _T, \\ \ z=0 \, \, &{}\text{ on }\ \varSigma ,\\ z(T)=0\, \, &{}\text{ in }\ \varOmega . \end{array}} \right. \end{aligned}$$
(2.4)

Then we have \(\partial _{{n}}z\in H^{\frac{1}{4}}(L^2(\varGamma ))\) according to [23, Th. 3.2]. Moreover, from the fact that \(z\in L^2(H^2(\varOmega ))\) and \(z=0\) on \(\varSigma \) we obtain that \(\partial _{{n}}z\in L^2(H^{1\over 2}(\varGamma ))\) according to Lemma A.2 in [6]. We denote by \(T:L^2(L^2(\varOmega ))\rightarrow L^2(L^2(\varGamma ))\) the continuous linear operator which is defined by \(Tg:=-\partial _{{n}}z|_{\varSigma }\) and denote its adjoint by \(T^*\). Then with \(y=T^*u\) we have

$$\begin{aligned} \int _{\varOmega _T}ygdxdt=\int _{\varOmega _T}y(-z_t-\varDelta z)dxdt=\int _{\varOmega _T}T^*ugdxdt=-\int _{\varSigma }u\partial _{{n}}zdsdt, \end{aligned}$$

which verifies that y satisfies (2.2). The estimate (2.3) follows by observing that

$$\begin{aligned} |\int _{\varOmega _T}ygdxdt|\le C\Vert u\Vert _{L^2(L^2(\varGamma ))}\Vert \partial _{{n}}z\Vert _{L^2(L^2(\varGamma ))}\le C\Vert u\Vert _{L^2(L^2(\varGamma ))}\Vert g\Vert _{L^2(L^2(\varOmega ))}. \end{aligned}$$
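The passage from this inequality to (2.3) can be spelled out as follows. This is only a sketch of the standard duality argument, written in the notation already introduced: since \(g\in L^2(L^2(\varOmega ))\) is arbitrary,

$$\begin{aligned} \Vert y\Vert _{L^2(L^2(\varOmega ))}=\sup \limits _{0\ne g\in L^2(L^2(\varOmega ))}\frac{\int _{\varOmega _T}ygdxdt}{\Vert g\Vert _{L^2(L^2(\varOmega ))}}\le C\Vert u\Vert _{L^2(L^2(\varGamma ))}, \end{aligned}$$

where the bound \(\Vert \partial _{{n}}z\Vert _{L^2(L^2(\varGamma ))}\le C\Vert z\Vert _{L^2(H^2(\varOmega ))}\le C\Vert g\Vert _{L^2(L^2(\varOmega ))}\) combines the trace theorem with the stability estimate (2.13) for the backward problem.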

\(\square \)

Now we are ready to formulate the optimal control problem considered in the present paper. It reads

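$$\begin{aligned} \min \limits _{u\in U_{ad}}\ \ J(y,u)\quad \text{ subject } \text{ to }\ \ y=\mathcal {G}(u). \end{aligned}$$
(2.5)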

By standard arguments (see, e.g. [24, Ch. 2, Sec. 1.2]), there exists a unique solution \((y,u)\) of problem (2.5). Let \(J(u):=J(y(u),u)\) denote the reduced cost functional, where for each \(u\in L^2(L^2(\varGamma ))\) the state y(u) is the unique very weak solution of (2.2). Then J is infinitely often Fréchet differentiable. Moreover, the first order necessary and sufficient optimality conditions for problem (2.5) are given in the following theorem.

Theorem 1

Assume that \(u\in L^2(L^2(\varGamma ))\) is the unique solution of problem (2.5) and let y be the associated state. Then there exists a unique adjoint state \(z\in L^2(H^1_0(\varOmega ))\cap H^1(H^{-1}(\varOmega ))\) such that

$$\begin{aligned} \left\{ { \begin{array}{l@{\quad }l@{\quad }l}-\frac{\partial z}{\partial t} -\varDelta z=y-y_d \, \, &{}\text{ in }\ \varOmega _T, \\ \ z=0 \, \, &{}\text{ on }\ \varSigma ,\\ z(T)=0\, \, &{}\text{ in }\ \varOmega , \end{array}} \right. \end{aligned}$$
(2.6)

and

$$\begin{aligned} J'(u)(v-u)=\int _{\varSigma }(\alpha u-\partial _{{ n}}z)(v-u)dsdt\ge 0,\ \ \forall \ v\in U_{ad}. \end{aligned}$$
(2.7)

We note that (2.7) is equivalent to

$$\begin{aligned} J'(u)(v-u)= & {} \int _{\varSigma }\alpha u(v-u)dsdt+\int _{\varOmega _T}(y-y_d)(y(v)-y)dxdt\nonumber \\\ge & {} 0,\ \ \forall \ v\in U_{ad} \end{aligned}$$
(2.8)

or

$$\begin{aligned} u=P_{U_{ad}}\big (\frac{1}{\alpha }\partial _{{ n}}z\big ), \end{aligned}$$
(2.9)

where for each \(v\in L^2(L^2(\varGamma )), \,y(v)\) is the solution of problem (2.2) with u replaced by v, and \(P_{U_{ad}}:L^2(L^2(\varGamma ))\rightarrow U_{ad}\) denotes the orthogonal projection.
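For the box constraints (1.3), the orthogonal projection onto \(U_{ad}\) acts pointwise by cutting values off at \(u_a\) and \(u_b\), so (2.9) can be evaluated sample by sample. The following Python sketch illustrates this; the function names and the sample values of \(\partial _n z\) are hypothetical and chosen only for illustration.

```python
import numpy as np

def project_box(v, u_a, u_b):
    """Pointwise projection of the samples v onto the interval [u_a, u_b]."""
    return np.clip(v, u_a, u_b)

def control_from_adjoint_flux(dn_z, alpha, u_a, u_b):
    """Evaluate u = P_{U_ad}(dn_z / alpha) at given sample points, cf. (2.9)."""
    return project_box(dn_z / alpha, u_a, u_b)

# Hypothetical samples of the normal derivative of the adjoint state on Sigma.
dn_z = np.array([-3.0, -0.5, 0.2, 4.0])
u = control_from_adjoint_flux(dn_z, alpha=0.5, u_a=-1.0, u_b=1.0)
# dn_z / alpha = [-6, -1, 0.4, 8], so u = [-1, -1, 0.4, 1]
```

Where the bounds are inactive, the control coincides with \(\frac{1}{\alpha }\partial _n z\); elsewhere it sits on one of the bounds.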

We now turn to the regularity properties of optimal controls u on \(\varSigma \). The proof of the following theorem can be found in, e.g. [23, Th. 3.4].

Theorem 2

Let \((y,u,z)\in L^2(L^2(\varOmega ))\times L^2(L^2(\varGamma ))\times L^2(H^1_0(\varOmega ))\cap H^1(H^{-1}(\varOmega ))\) be the solution of optimal control problem (2.5)–(2.8). Then we have

$$\begin{aligned} u\in L^2(H^{1\over 2}(\varGamma ))\cap H^{\frac{1}{4}}(L^2(\varGamma )),\ \ \ y\in L^2(H^1(\varOmega ))\cap H^{\frac{1}{2}}(L^2(\varOmega )), \end{aligned}$$
(2.10)

and

$$\begin{aligned} z\in L^2(H^2(\varOmega )\cap H_0^1(\varOmega ))\cap H^1(L^2(\varOmega )). \end{aligned}$$
(2.11)

Proof

From \(f\in L^2(L^2(\varOmega )), \,y_0\in L^2(\varOmega )\) and \(u\in L^2(L^2(\varGamma ))\) we conclude that \(y\in L^2(L^2(\varOmega ))\) according to Lemma 1. Thus, \(y_d\in L^2(L^2(\varOmega ))\) implies \(z\in L^2(H^2(\varOmega )\cap H_0^1(\varOmega ))\cap H^1(L^2(\varOmega ))\), which in turn implies \(\partial _n z\in L^2(H^{1\over 2}(\varGamma ))\cap H^{\frac{1}{4}}(L^2(\varGamma ))\) (see [6, 17, 23]). From (2.9) we obtain that \(u\in L^2(H^{1\over 2}(\varGamma ))\cap H^{\frac{1}{4}}(L^2(\varGamma ))\) and thus \(y\in L^2(H^1(\varOmega ))\cap H^{\frac{1}{2}}(L^2(\varOmega ))\) (see [25, Vol. II, p. 78]). This completes the proof. \(\square \)

In our analysis we frequently use results of the following backward in time parabolic problem:

$$\begin{aligned} \left\{ {\begin{array}{l@{\quad }l@{\quad }l} -w_t-\varDelta w=g\ \ \ \text{ in }\ \varOmega _T,\\ w=0\ \ \text{ on }\ \varSigma ,\\ w(T)=0\ \ \text{ in }\ \varOmega . \end{array}}\right. \end{aligned}$$
(2.12)

If \(g\in L^2(L^2(\varOmega ))\), then (2.12) has a unique solution \(w\in L^2(H^2(\varOmega )\cap H_0^1(\varOmega ))\cap H^1(L^2(\varOmega ))\) satisfying

$$\begin{aligned}&\Vert w\Vert _{L^2(H^2(\varOmega ))}+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\le C\Vert g\Vert _{L^2(L^2(\varOmega ))}, \end{aligned}$$
(2.13)
$$\begin{aligned}&\Vert w(0)\Vert _{1,\varOmega }\le C\Vert g\Vert _{L^2(L^2(\varOmega ))}. \end{aligned}$$
(2.14)

3 Finite Element Discretization of the State Equation and Optimal Control Problems

First let us consider the finite element approximation of the state equation (1.2). For the spatial discretization we use conforming Lagrange triangular elements.

Let \(\mathcal {T}^h\) be a quasi-uniform partitioning of \(\varOmega \) into disjoint regular triangles \(\tau \), so that \(\bar{\varOmega }=\bigcup _{\tau \in \mathcal {T}^h}\bar{\tau }\). Associated with \(\mathcal {T}^h\) is a finite dimensional subspace \(V^h\) of \(C(\bar{\varOmega })\), such that for \(\chi \in V^h\) and \(\tau \in \mathcal {T}^h, \,\chi |_{\tau }\) are piecewise linear polynomials. We set \(V^h_0=V^h\cap H_0^1(\varOmega )\).

Let \(\mathcal {T}_U^h\) be a partitioning of \(\varGamma \) into disjoint regular segments s, so that \(\varGamma =\bigcup _{s\in \mathcal {T}_U^h}\bar{s}\). Associated with \(\mathcal {T}_U^h\) is another finite dimensional subspace \(U^h\) of \(L^2(\varGamma )\), such that for \(\chi \in U^h\) and \(s\in \mathcal {T}_U^h, \,\chi |_s\) are piecewise linear polynomials. Here we suppose that \(\mathcal {T}_U^h\) is the restriction of \(\mathcal {T}^h\) on the boundary \(\varGamma \) and \(U^h=V^h(\varGamma )\), where \(V^h(\varGamma )\) is the restriction of \(V^h\) on the boundary \(\varGamma \).

For the standard Lagrange interpolation operator \(I_h:C(\bar{\varOmega })\rightarrow V^h\), we have the following error estimate (see, e.g. [9, Sec. 3.1])

$$\begin{aligned} \Vert w-I_hw\Vert _{l,\varOmega }\le Ch^{m-l}\Vert w\Vert _{m,\varOmega },\ \ \ 0\le l\le 1\le m\le 2. \end{aligned}$$
(3.1)

To define our discrete scheme, we need to introduce some projection operators. Here \(Q_h:L^2(\varGamma )\rightarrow V^h(\varGamma )\) and \(\tilde{Q}_h:L^2(\varOmega )\rightarrow V^h_0\) denote the orthogonal projection operators. Furthermore, \(R_h: H^1_0(\varOmega )\rightarrow V^h_0\) denotes the Ritz projection operator defined as

$$\begin{aligned} a(R_hw,v_h)=a(w,v_h),\ \ w\in H^1_0(\varOmega ),\ \forall \ v_h\in V_0^h. \end{aligned}$$
(3.2)

It is well known that the Ritz projection satisfies (see, e.g. [9, Sec. 3.1])

$$\begin{aligned} \Vert w-R_hw\Vert _{s,\varOmega }\le Ch^{l-s}\Vert w\Vert _{l,\varOmega },w\in H_0^1(\varOmega )\cap H^l(\varOmega ),\forall \ 0\le s\le 1\le l\le 2. \end{aligned}$$
(3.3)

For the \(L^2(\varGamma )\) projection operator \(Q_h\) we also have (see [9] and [12, pp. 85–86, Eq. (25) and (28)])

$$\begin{aligned} \Vert w-Q_hw\Vert _{0,\varGamma }\le Ch^{s-{1\over 2}}\Vert w\Vert _{s,\varOmega }\ \ \text{ for }\ w\in H^s(\varOmega ), \ \frac{1}{2}\le s\le 2, \end{aligned}$$
(3.4)

and

$$\begin{aligned} \Vert (I-Q_h)\partial _{{n}}w\Vert _{0,\varGamma }\le Ch^{{1\over 2}}\Vert w\Vert _{2,\varOmega }\ \ \text{ for }\ w\in H^2(\varOmega ). \end{aligned}$$
(3.5)
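In practice an \(L^2\)-projection such as \(Q_h\) is computed by solving a linear system with the mass matrix. The following Python sketch does this for continuous piecewise linear functions on a one-dimensional mesh, which stands in for a single straight piece of \(\varGamma \). The function name `l2_project_p1`, the Gauss quadrature rule and the dense linear algebra are assumptions made for illustration.

```python
import numpy as np

def l2_project_p1(func, nodes, n_quad=4):
    """L2-project func onto continuous piecewise linears on a 1D mesh.

    Returns the nodal coefficients c solving M c = b, where M is the mass
    matrix and b_i = int func * phi_i for the hat functions phi_i.
    """
    n = len(nodes)
    M = np.zeros((n, n))
    b = np.zeros(n)
    xq, wq = np.polynomial.legendre.leggauss(n_quad)  # Gauss rule on [-1, 1]
    for e in range(n - 1):
        a, c = nodes[e], nodes[e + 1]
        h = c - a
        # Exact local mass matrix of the two hat functions on this segment.
        M[e:e + 2, e:e + 2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        x = 0.5 * (a + c) + 0.5 * h * xq              # quadrature points
        w = 0.5 * h * wq                              # quadrature weights
        phi = np.array([(c - x) / h, (x - a) / h])    # local hat functions
        b[e:e + 2] += phi @ (w * func(x))
    return np.linalg.solve(M, b)

nodes = np.linspace(0.0, 1.0, 9)
# A discontinuous datum, as allowed for controls in L^2 (jump at a mesh node).
coeffs = l2_project_p1(lambda x: np.where(x < 0.5, 0.0, 1.0), nodes)
# A function already in the discrete space is reproduced.
linear = l2_project_p1(lambda x: 2.0 * x + 1.0, nodes)
```

Unlike nodal interpolation, this projection is defined for data that are merely in \(L^2\), which is why it is suited to non-smooth Dirichlet data.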

In our following analysis we need estimates for discrete harmonic functions.

Lemma 2

Let \(v_h\in V^h(\varGamma )\), and suppose that \(w\in H^1(\varOmega )\) is the solution of

$$\begin{aligned} a(w,\phi )=0,\ \ \ \forall \ \phi \in H^1_0(\varOmega ),\ \ \ w=v_h \ \text{ on }\ \varGamma \end{aligned}$$
(3.6)

and \(w_h\in V^h\) is the solution of

$$\begin{aligned} a(w_h,\phi _h)=0,\ \ \ \forall \ \phi _h\in V^h_0,\ \ \ w_h=v_h \ \text{ on }\ \varGamma . \end{aligned}$$
(3.7)

Then

$$\begin{aligned}&\Vert w-w_h\Vert _{1,\varOmega }\le C\Vert v_h\Vert _{{1\over 2},\varGamma }\le Ch^{-{1\over 2}}\Vert v_h\Vert _{0,\varGamma }, \end{aligned}$$
(3.8)
$$\begin{aligned}&\Vert w_h\Vert _{0,\varOmega }+h^{{1\over 2}}\Vert w_h\Vert _{1,\varOmega }\le C\Vert v_h\Vert _{0,\varGamma }, \end{aligned}$$
(3.9)
$$\begin{aligned}&\Vert w-w_h\Vert _{{1\over 2},\varOmega }\le C\Vert v_h\Vert _{0,\varGamma }. \end{aligned}$$
(3.10)

Proof

The proof of (3.8) and (3.9) can be found in [5, Lm. 3.2], [7, Th. 5.4] and [11, Lm. 1]. Here we provide a proof of (3.10). For each \(g\in H^{-{1\over 2}}(\varOmega )\) let \(\psi _g\in H^{3\over 2}(\varOmega )\cap H^1_0(\varOmega )\) be the solution of

$$\begin{aligned} a(\phi ,\psi _g)=\langle g,\phi \rangle _{H^{-{1\over 2}}, H^{1\over 2}},\ \ \ \forall \ \phi \in H^1_0(\varOmega ). \end{aligned}$$
(3.11)

Then we have \(\Vert \psi _g\Vert _{{3\over 2},\varOmega }\le C\Vert g\Vert _{-{1\over 2},\varOmega }\). Note that from (3.6) and (3.7) we have

$$\begin{aligned} \langle g,w-w_h\rangle _{H^{-{1\over 2}}, H^{1\over 2}}= a(w-w_h,\psi _g)= a(w-w_h,\psi _g-I_h\psi _g), \end{aligned}$$

where \(I_h\psi _g\) is the linear Lagrange interpolation of \(\psi _g\) [9]. Then standard error estimates lead to

$$\begin{aligned} \langle g,w-w_h\rangle _{H^{-{1\over 2}}, H^{1\over 2}}= & {} a(w-w_h,\psi _g-I_h\psi _g)\nonumber \\\le & {} \Vert w-w_h\Vert _{1,\varOmega }\Vert \psi _g-I_h\psi _g\Vert _{1,\varOmega }\nonumber \\\le & {} Ch^\frac{1}{2}\Vert v_h\Vert _{{1\over 2},\varGamma }\Vert \psi _g\Vert _{\frac{3}{2},\varOmega }\nonumber \\\le & {} C\Vert v_h\Vert _{0,\varGamma }\Vert g\Vert _{-\frac{1}{2},\varOmega }, \end{aligned}$$
(3.12)

where we have used the estimate (3.8). This implies

$$\begin{aligned} \Vert w_h-w\Vert _{\frac{1}{2},\varOmega }\le C\Vert v_h\Vert _{0,\varGamma }, \end{aligned}$$

which proves (3.10).\(\square \)

The semi-discrete finite element approximation of (1.2) reads: Find \(y_h(u)\in L^2(V^h)\) such that

$$\begin{aligned} \left\{ \begin{aligned}&-(y_h(u),\partial _tv_h)_{\varOmega _T} +a(y_h(u),v_h)_{\varOmega _T}=(f,v_h)_{\varOmega _T} + \left( y_0^h, v_h(\cdot ,0)\right) \ \ \forall \ v_h\in H^1(V^h_0),\\&\ y_h(u)= Q_h(u)\ \ \text{ on }\ \varSigma \end{aligned}\right. \end{aligned}$$
(3.13)

with \(v_h(\cdot , T)=0\), where \(y_0^h=\tilde{Q}_hy_0\in V^h\) is the approximation of \(y_0\) obtained by \(L^2\)-projection and \(Q_h\) is the projection operator from \(L^2(\varGamma )\) to \(V^h(\varGamma )\). Note that the above semi-discrete scheme is well-defined and admits a unique solution \(y_h(u)\in L^2(H^1(\varOmega ))\), which we denote by \(y_h(u)=\mathcal {G}_h(u)\), since \(Q_h(u)\in L^2(H^{1\over 2}(\varGamma ))\). Thus, in contrast to the very weak form (2.2), we can use the standard bilinear form \(a(\cdot ,\cdot )\).

The semi-discrete finite element approximation of (1.1)–(1.2) reads as follows:

$$\begin{aligned} \left\{ \begin{aligned} \min \limits _{u_h\in U_{ad}^h,y_h\in L^2(V^h)}&J_h(y_h,u_h)={1\over 2}\Vert y_h-y_d\Vert ^2_{L^2(L^2(\varOmega ))}+{\alpha \over 2}\Vert u_h\Vert ^2_{L^2(L^2(\varGamma ))}\\&\text{ subject } \text{ to }\ \ y_h=\mathcal {G}_h(u_h)\ \ \text{ defined } \text{ in }\ (3.13), \end{aligned}\right. \end{aligned}$$
(3.14)

where \(y_0^h\in V^h\) is an approximation of \(y_0\), and \(U_{ad}^{h}\) is an appropriate approximation to \(U_{ad}\) depending on the discretization scheme for the control.

It follows that the control problem (3.14) has a unique solution \((y_h,u_h)\) and that a pair \((y_h,u_h)\) is the solution of the problem (3.14) if and only if there is a co-state \(z_h\in L^2(V^h_0)\) such that the triplet \((y_h,z_h,u_h)\) satisfies (3.13) and the following optimality conditions:

$$\begin{aligned} \left\{ {\begin{array}{l@{\quad }l} -\big (\frac{\partial z_h}{\partial t},q_h\big )+a(q_h,z_h)=(y_h-y_{d},q_h),\ \ \forall \ q_h\in V^h_0,\\ z_h=0\ \ \text{ on }\ \varSigma ;\ z_h(T)=0\ \ \ \text{ in }\ \varOmega , \end{array}}\right. \end{aligned}$$
(3.15)
$$\begin{aligned} \int _{\varOmega _T}(y_h-y_{d})(y_h(v_h)-y_h)dxdt+ \alpha \int _{\varSigma }u_h(v_h-u_h)dsdt\ge 0, \ \ \forall \ v_h\in U_{ad}^h, \end{aligned}$$
(3.16)

where \(y_h(v_h)\in L^2(V^h)\) is the solution of state equation (3.13) with Dirichlet boundary condition \(Q_h(v_h)\).

We next consider the fully discrete approximation of the above semi-discrete problem using the dG(0) scheme in time [33]. We note that the dG(0) scheme is equivalent to the backward Euler method with the right hand side replaced by its average over each time interval.

Let \(0=t_0<t_1<\cdots <t_{N-1}<t_N=T\) be a time domain partitioning with \(k_n=t_n-t_{n-1}, \,n=1,2,\ldots ,N\) and \(k=\max \limits _{1\le n\le N}k_n\). We assume that the time partitioning is quasi-uniform, i.e., there exist positive constants \(c_1\) and \(c_2\) such that \(c_1k_n\le k\le c_2k_n\) holds for each \(n=1,2,\ldots ,N\). We also set \(I_n:=(t_{n-1},t_n]\). For \(n=1,2,\ldots ,N\), we construct the finite element spaces \(V^h\subset H^1(\varOmega )\) with the mesh \(\mathcal {T}^h\). Similarly, we construct the finite element spaces \(U^h\subset L^2(\varGamma )\) with the mesh \(\mathcal {T}^h_U\). In our case we have \(U^h=V^h(\varGamma )\). Then we denote by \(V^h\) and \(U^h\) the finite element spaces defined on \(\mathcal {T}^h\) and \(\mathcal {T}^h_U\) on each time step.

Let \(V_k\) denote the space of piecewise constant functions on the time partition. We define the \(L^2\) projection operator \(P_k:L^2(0,T)\rightarrow V_k\) on \(I_n\) through

$$\begin{aligned} P_k^nw:=(P_kw)(t)|_{I_n}=\frac{1}{k_n}\int _{I_n}w(s)ds\ \ \text{ for }\ t\in I_n\ (n=1,\ldots ,N). \end{aligned}$$

Then we have the following estimate

$$\begin{aligned} \Vert (I-P_k)w\Vert _{L^2(0,T;H)}\le Ck\Vert w_t\Vert _{L^2(0,T;H)},\ \ \forall \ w\in H^1(0,T;H), \end{aligned}$$
(3.17)

where H denotes some separable Hilbert space.
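The first-order behaviour in (3.17) can be observed numerically. The following Python sketch takes the scalar case \(H=\mathbb {R}\) as a simplifying assumption, computes the interval means \(P_k^nw\) by Gauss quadrature, and measures \(\Vert (I-P_k)w\Vert _{L^2(0,T)}\) on uniform partitions. The function names and the test function \(w(t)=\sin t\) are chosen only for illustration.

```python
import numpy as np

def interval_averages(w, t, n_quad=8):
    """Mean value of w over each interval I_n = (t[n-1], t[n]]."""
    xq, wq = np.polynomial.legendre.leggauss(n_quad)
    means = np.empty(len(t) - 1)
    for n in range(len(t) - 1):
        k_n = t[n + 1] - t[n]
        s = 0.5 * (t[n] + t[n + 1]) + 0.5 * k_n * xq
        means[n] = 0.5 * np.dot(wq, w(s))   # equals (1/k_n) * int_{I_n} w
    return means

def projection_error(w, t, n_quad=8):
    """L2(0,T) norm of w - P_k w for a scalar function w."""
    xq, wq = np.polynomial.legendre.leggauss(n_quad)
    means = interval_averages(w, t, n_quad)
    total = 0.0
    for n in range(len(t) - 1):
        k_n = t[n + 1] - t[n]
        s = 0.5 * (t[n] + t[n + 1]) + 0.5 * k_n * xq
        total += 0.5 * k_n * np.dot(wq, (w(s) - means[n]) ** 2)
    return np.sqrt(total)

# Halving k should roughly halve the error for a smooth w.
errs = [projection_error(np.sin, np.linspace(0.0, 1.0, N + 1)) for N in (10, 20, 40)]
rates = [np.log2(errs[i] / errs[i + 1]) for i in range(2)]
```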

We consider a dG(0) scheme for the time discretization and set

$$\begin{aligned} V_{hk}:=\Big \{\phi :\overline{\varOmega }\times [0,T]\rightarrow \mathbb {R},\ \phi (\cdot ,t)|_{\overline{\varOmega }}\in V^h,\ \phi (x,\cdot )|_{I_n}\in \mathbb {P}_0\ \text{ for } \ n=1,\ldots ,N\Big \}, \end{aligned}$$

i.e. \(\phi \in V_{hk}\) is a piecewise constant polynomial w.r.t. time. We also set \(V_{hk}(\varGamma )\) as the restriction of \(V_{hk}\) on \(L^2(L^2(\varGamma ))\). We set \(Q=Q_hP_k=P_kQ_h\). Thus, we have \(Q:L^2(L^2(\varGamma ))\rightarrow V_{hk}(\varGamma )\). For \(Y,\varPhi \in V_{hk}\) we set

$$\begin{aligned} A(Y,\varPhi ):=\sum \limits _{n=1}^N k_na(Y^n,\varPhi ^n)+\sum \limits _{n=2}^N(Y^n-Y^{n-1},\varPhi ^{n}) +(Y_+^0,\varPhi _+^0), \end{aligned}$$

where \(\varPhi ^n:=\varPhi ^n_-=\lim _{s\rightarrow 0^+}\varPhi (t_n-s), \,\varPhi ^{n+1}:=\varPhi ^n_{+}=\lim _{s\rightarrow 0^+}\varPhi (t_n+s)\).

For each \(u\in L^2(L^2(\varGamma ))\) the fully discrete dG(0)-cG(1) finite element approximation of (3.13) now reads: Find \(Y_{hk}\in V_{hk}\) such that

$$\begin{aligned} \left\{ \begin{aligned} A(Y_{hk},\varPhi )=( f,\varPhi )_{\varOmega _T}+(y_0,\varPhi _+^0),\, \,&\forall \ \varPhi \in V_{hk}^0,\\ Y_{hk}=Q(u)\, \,&\text{ on }\ \varSigma , \end{aligned}\right. \end{aligned}$$
(3.18)

where \(V_{hk}^0\) denotes the subspace of \(V_{hk}\) with functions vanishing on the boundary \(\varGamma \).

It is easy to see that on each time interval \(I_n, \,Y_{hk}^n\in V^h\) solves the following problem:

$$\begin{aligned} \left\{ \begin{aligned}&\left( \frac{Y_{hk}^n-Y_{hk}^{n-1}}{k_n},w_h \right) +a(Y_{hk}^n,w_h)=(P_k^nf,w_h),\ \forall \ w_h\in V^h_0,\;\; n=1,\ldots ,N,\\&Y_{hk}^0=y_0^h,\ \text{ in }\ \varOmega ; \ \ Y_{hk}^n= Q_h(P_k^nu),\ n=1,\ldots ,N\ \ \text{ on }\ \varGamma . \end{aligned}\right. \end{aligned}$$
(3.19)
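The step from (3.18) to (3.19) can be sketched as follows, using the notation above; the symbol \(\chi _{I_n}\) for the indicator function of \(I_n\) is introduced here only for illustration. Testing (3.18) with \(\varPhi =w_h\chi _{I_n}\), \(w_h\in V_0^h\), \(n\ge 2\), leaves a single term in each sum of \(A(\cdot ,\cdot )\):

$$\begin{aligned} A(Y_{hk},\varPhi )&=(Y_{hk}^n-Y_{hk}^{n-1},w_h)+k_na(Y_{hk}^n,w_h),\\ (f,\varPhi )_{\varOmega _T}&=\int _{I_n}(f,w_h)dt=k_n(P_k^nf,w_h). \end{aligned}$$

For \(n=1\) the jump term is replaced by \((Y_{hk}^1,w_h)\) on the left and \((y_0,w_h)=(\tilde{Q}_hy_0,w_h)=(y_0^h,w_h)\) appears on the right, since \(w_h\in V_0^h\). Dividing by \(k_n\) gives the first line of (3.19).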

Here we use the \(L^2\)-projection to approximate the non-smooth Dirichlet boundary condition in (3.18).
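A time-stepping loop of the form (3.19) can be sketched in Python as follows. To keep the example short it makes several simplifying assumptions that are not part of the scheme above: one space dimension with \(\varOmega =(0,1)\), so that \(\varGamma \) consists of two points and \(Q_h\) is trivial, \(f\equiv 0\) and \(y_0\equiv 0\), dense matrices, and boundary data passed in as precomputed interval means. All function names are hypothetical.

```python
import numpy as np

def p1_matrices(nodes):
    """Mass and stiffness matrices for P1 elements on a 1D mesh."""
    n = len(nodes)
    M = np.zeros((n, n))
    K = np.zeros((n, n))
    for e in range(n - 1):
        h = nodes[e + 1] - nodes[e]
        M[e:e + 2, e:e + 2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        K[e:e + 2, e:e + 2] += 1.0 / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return M, K

def dg0_dirichlet(nodes, t, g_left, g_right):
    """Backward Euler steps with f = 0, y_0 = 0 and Dirichlet data.

    g_left[n], g_right[n] are the time averages of the boundary data over
    the (n+1)-th time interval, imposed at x = nodes[0] and x = nodes[-1].
    """
    M, K = p1_matrices(nodes)
    n = len(nodes)
    interior = np.arange(1, n - 1)
    boundary = np.array([0, n - 1])
    Y = np.zeros(n)                       # Y^0 = 0
    history = [Y.copy()]
    for step in range(len(t) - 1):
        k_n = t[step + 1] - t[step]
        A = M + k_n * K
        Y_new = np.zeros(n)
        Y_new[boundary] = [g_left[step], g_right[step]]
        # Test only with interior basis functions; move known boundary
        # values of Y^n to the right-hand side.
        rhs = M[interior] @ Y - A[np.ix_(interior, boundary)] @ Y_new[boundary]
        Y_new[interior] = np.linalg.solve(A[np.ix_(interior, interior)], rhs)
        Y = Y_new
        history.append(Y.copy())
    return np.array(history)

nodes = np.linspace(0.0, 1.0, 11)
t = np.linspace(0.0, 2.0, 201)
ones = np.ones(len(t) - 1)
hist = dg0_dirichlet(nodes, t, ones, ones)   # constant boundary value 1
```

With constant boundary value 1 the discrete solution approaches the constant state 1, which gives a simple sanity check of the loop.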

In the following we need to investigate the stability behavior of the fully discrete scheme (3.18) with respect to the initial value \(y_0\), the right hand side f and the Dirichlet boundary conditions u.

Lemma 3

There exists a constant C independent of h, k and the data \((f,y_0)\) such that

$$\begin{aligned}&\sum \limits _{n=1}^{N}\Big (\Vert Y_{hk}^n- Y_{hk}^{n-1}\Vert _{0,\varOmega }^2+k_n\Vert Y_{hk}^n\Vert _{1,\varOmega }^2\Big )+\Vert Y_{hk}^N\Vert _{0,\varOmega }^2\nonumber \\&\le C(h^{-1}+hk^{-1})\Vert u\Vert _{L^2(L^2(\varGamma ))}^2 \end{aligned}$$
(3.20)

and

$$\begin{aligned}&\sum \limits _{n=1}^{N}\Big (\Vert Y_{hk}^n- Y_{hk}^{n-1}\Vert _{0,\varOmega }^2+k_n\Vert Y_{hk}^n\Vert _{1,\varOmega }^2\Big )+\Vert Y_{hk}^N\Vert _{0,\varOmega }^2\nonumber \\&\le C(1+h^2k^{-1})\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))}^2 \end{aligned}$$
(3.21)

hold in case \(f\equiv 0, \,y_0\equiv 0\). In the case \(u\equiv 0\) the estimate

$$\begin{aligned} \sum \limits _{n=1}^{N}\Vert Y_{hk}^n- Y_{hk}^{n-1}\Vert _{0,\varOmega }^2 \le C\big (h^{-2}k \Vert y_0\Vert ^2_{0,\varOmega }+k\Vert f\Vert _{L^2(L^2(\varOmega ))}^2\big ) \end{aligned}$$
(3.22)

is valid.

Proof

Let us first assume that \(f\equiv 0, \,y_0\equiv 0\). The proof closely follows the idea of [12]; here we give a sketch for the case of variable time steps. To begin with we introduce the following problem: Find \(y_u\in V_{hk}\) with

$$\begin{aligned} \left\{ \begin{aligned}&\left( \frac{y_u^n-y_u^{n-1}}{k_n},w_h \right) +a(y_u^n,w_h)=0,\ \forall \ w_h\in V^h_0,\;\; n=1,\ldots ,N,\\&y_u^0=0\ \text{ in }\ \varOmega ; \ \ y_u^n= Q_h(P_k^nu),\ n=1,\ldots ,N\ \ \text{ on }\ \varGamma . \end{aligned}\right. \end{aligned}$$
(3.23)

For arbitrary \(y_h\in V^h\) we have the splitting

$$\begin{aligned} y_h=y_1+R_hy_h\ \ \ \ \text{ and }\ \ y_h=y_2+\tilde{Q}_hy_h, \end{aligned}$$

where \(\tilde{Q}_hy_h\in V_0^h\) and \(R_hy_h\in V_0^h\) are the \(L^2\)-projection and Ritz-projection of \(y_h\), respectively. Then we have \(y_2|_{\varGamma }=y_h, \,y_1|_{\varGamma }=y_h\) and

$$\begin{aligned} (y_2,v_h)=0\ \ \ \ \text{ and }\ \ a(y_1,v_h)=0,\ \ \forall \ v_h\in V_0^h. \end{aligned}$$

Let \(y^n_u=y_2^n+\tilde{Q}_hy_u^n\). Then (3.23) delivers

$$\begin{aligned} (\tilde{Q}_hy^n_u-\tilde{Q}_hy^{n-1}_u,w_h)+k_na(\tilde{Q}_hy^n_u,w_h)=-k_na(y_2^n,w_h),\ \ \forall \ w_h\in V^h_0. \end{aligned}$$
(3.24)

Similar to the proof of Proposition 1 in [12, P. 88] we conclude from (3.24) that

$$\begin{aligned}&\sum \limits _{n=1}^{N}\big (\Vert y^n_u-y_u^{n-1}\Vert _{0,\varOmega }^2+k_na( y_u^n,y_u^n)\big )+\Vert y_u^N\Vert _{0,\varOmega }^2\nonumber \\\le & {} C\sum \limits _{n=1}^{N}\big (k_na(y_2^n, y_2^n)+\Vert y^n_2\Vert _{0,\varOmega }^2\big ). \end{aligned}$$
(3.25)

For \(y^n_u\in V^h\) we also have the splitting \(y^n_u=y_1^n+R_hy_u^n\). It follows from the proof of Lemma 3 in [12, P. 87] that

$$\begin{aligned} \Vert y_2^n\Vert _{0,\varOmega }\le Ch^{1\over 2}\Vert y_1^n\Vert _{{1\over 2},\varOmega }. \end{aligned}$$

Similarly, we also have

$$\begin{aligned} \Vert y_2^n\Vert _{0,\varOmega }\le Ch\Vert y_1^n\Vert _{1,\varOmega }. \end{aligned}$$

We note that \(y_1^n|_{\varGamma }=y^n_u=Q_h(P_k^nu)\in H^{1\over 2}(\varGamma )\) and

$$\begin{aligned} a(y_1^n,\phi _h)=0,\ \ \forall \ \phi _h\in V_0^h. \end{aligned}$$

Let \(w^n\in H^1(\varOmega )\) be the solution of (3.6) with \(v_h\) substituted by \(Q_h(P_k^nu)\). Then

$$\begin{aligned} \Vert w^n\Vert _{{1\over 2},\varOmega }\le C\Vert Q_h(P_k^nu)\Vert _{0,\varGamma },\quad \Vert w^n\Vert _{1,\varOmega }\le C\Vert Q_h(P_k^nu)\Vert _{{1\over 2},\varGamma } \end{aligned}$$

and \(y_1^n\) is the finite element approximation to \(w^n\). So we deduce from Lemma 2 that

$$\begin{aligned} \Vert y_1^n\Vert _{{1\over 2},\varOmega }\le & {} \Vert y_1^n-w^n\Vert _{{1\over 2},\varOmega }+\Vert w^n\Vert _{{1\over 2},\varOmega }\\\le & {} C\Vert Q_h(P_k^nu)\Vert _{0,\varGamma } \end{aligned}$$

and

$$\begin{aligned} \Vert y_1^n\Vert _{1,\varOmega }\le & {} \Vert y_1^n-w^n\Vert _{1,\varOmega }+\Vert w^n\Vert _{1,\varOmega }\\\le & {} C\Vert Q_h(P_k^nu)\Vert _{{1\over 2},\varGamma }, \end{aligned}$$

which in turn give

$$\begin{aligned} \Vert y_2^n\Vert _{0,\varOmega }\le Ch^{1\over 2}\Vert Q_h(P_k^nu)\Vert _{0,\varGamma },\quad \Vert y_2^n\Vert _{0,\varOmega }\le Ch\Vert Q_h(P_k^nu)\Vert _{{1\over 2},\varGamma }. \end{aligned}$$

Inverse estimates also yield

$$\begin{aligned} \Vert y_2^n\Vert _{1,\varOmega }\le Ch^{-{1\over 2}}\Vert Q_h(P_k^nu)\Vert _{0,\varGamma }. \end{aligned}$$

With the help of the above estimates and norm interpolation we are led to

$$\begin{aligned} \Vert y_2^n\Vert _{s,\varOmega }\le Ch^{{1\over 2}-s}\Vert Q_h(P_k^nu)\Vert _{0,\varGamma },\quad \Vert y_2^n\Vert _{s,\varOmega }\le Ch^{1-s}\Vert Q_h(P_k^nu)\Vert _{{1\over 2},\varGamma } \end{aligned}$$
(3.26)

for all \(0\le s\le 1\) and \(n=1,2,\ldots ,N\). Thus, from the quasi-uniformity of the time partitioning we have

$$\begin{aligned}&\sum \limits _{n=1}^{N}\big (\Vert y^n_u-y_u^{n-1}\Vert _{0,\varOmega }^2+k_na( y_u^n,y_u^n)\big )+\Vert y_u^N\Vert _{0,\varOmega }^2\nonumber \\&\le C\sum \limits _{n=1}^{N}\big (k_na(y_2^n, y_2^n)+\Vert y^n_2\Vert _{0,\varOmega }^2\big )\nonumber \\&\le C\sum \limits _{n=1}^{N}\big (k_nh^{-1}\Vert Q_h(P_k^nu)\Vert _{0,\varGamma }^2+h\Vert Q_h(P_k^nu)\Vert _{0,\varGamma }^2\big )\nonumber \\&\le C(h^{-1}+hk^{-1})\sum \limits _{n=1}^{N}\int _{I_n}\Vert Q_h(P_k^nu)\Vert _{0,\varGamma }^2\nonumber \\&\le C(h^{-1}+hk^{-1})\Vert Q(u)\Vert _{L^2(L^2(\varGamma ))}^2\le C(h^{-1}+hk^{-1})\Vert u\Vert _{L^2(L^2(\varGamma ))}^2. \end{aligned}$$
(3.27)

This gives

$$\begin{aligned}&\sum \limits _{n=1}^{N}\big (\Vert y^n_u-y_u^{n-1}\Vert _{0,\varOmega }^2+k_n\Vert y_u^n\Vert _{1,\varOmega }^2\big )+\Vert y_u^N\Vert _{0,\varOmega }^2\nonumber \\&\le C(h^{-1}+hk^{-1})\Vert u\Vert _{L^2(L^2(\varGamma ))}^2. \end{aligned}$$
(3.28)
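The passage from the second to the fourth line of (3.27) can be spelled out as follows. This is only a sketch, assuming that \(P_k^nu\) is constant in time on \(I_n\) and that quasi-uniformity of the time partitioning means \(k\le Ck_n\) for all n, with a generic constant C:

$$\begin{aligned} k_nh^{-1}\Vert Q_h(P_k^nu)\Vert _{0,\varGamma }^2&=h^{-1}\int _{I_n}\Vert Q_h(P_k^nu)\Vert _{0,\varGamma }^2dt,\\ h\Vert Q_h(P_k^nu)\Vert _{0,\varGamma }^2&=hk_n^{-1}\int _{I_n}\Vert Q_h(P_k^nu)\Vert _{0,\varGamma }^2dt\le Chk^{-1}\int _{I_n}\Vert Q_h(P_k^nu)\Vert _{0,\varGamma }^2dt. \end{aligned}$$

Summing both lines over n produces the factor \(h^{-1}+hk^{-1}\) in front of the time integral.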

Similarly, we can derive from (3.26) and the \(W^{s,q}(\varGamma )\) (\(0\le s\le 1\) and \(1\le q\le \infty \)) stability of the \(L^2\)-projection operator \(Q_h\) (see [7, P. 1601]) that

$$\begin{aligned}&\sum \limits _{n=1}^{N}\big (\Vert y^n_u-y_u^{n-1}\Vert _{0,\varOmega }^2+k_n\Vert y_u^n\Vert _{1,\varOmega }^2\big )+\Vert y_u^N\Vert _{0,\varOmega }^2\nonumber \\&\le C(1+h^2k^{-1})\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))}^2. \end{aligned}$$
(3.29)

Combining (3.28) and (3.29) we prove the case of \(f\equiv 0\) and \(y_0\equiv 0\) with \(Y_{hk}^n=y_u^n\).

For the case \(u\equiv 0\), let \(y_f\in L^2(H_0^1(\varOmega ))\cap H^{1}(H^{-1}(\varOmega ))\) be the solution of the following problem

$$\begin{aligned} \left\{ \begin{aligned}&\big (\frac{\partial y_f}{\partial t},w\big )+a(y_f,w)=(f,w),\ \ \forall \ w\in H_0^1(\varOmega ),\;\;t\in (0,T],\\&y_f(\cdot ,0)=y_0 \ \ \text{ in }\ \varOmega ;\ \ y_f= 0\ \ \text{ on }\ \varSigma . \end{aligned}\right. \end{aligned}$$
(3.30)

Then we have

$$\begin{aligned} \Vert y_f\Vert _{L^2(H^1(\varOmega ))}+\Vert \frac{\partial y_f}{\partial t}\Vert _{L^2(H^{-1}(\varOmega ))}\le C\big (\Vert f\Vert _{L^2(L^2(\varOmega ))}+ \Vert y_0\Vert _{0,\varOmega }\big ). \end{aligned}$$
(3.31)

Let \(y_f^n\in V^h, \,n=1,2,\ldots , N\) be the solutions of the following problems:

$$\begin{aligned} \left\{ \begin{aligned}&\left( \frac{y_f^n-y_f^{n-1}}{k_n},w_h \right) +a(y_f^n,w_h)=(P_k^nf,w_h), \forall w_h\in V^h_0,n=1,\ldots ,N,\\&y_f^0=y_0^h,\ \text{ in }\ \varOmega ; \ \ y_f^n= 0,\ n=1,\ldots ,N,\ \ \text{ on }\ \varGamma . \end{aligned}\right. \end{aligned}$$
(3.32)

Then \( y_f^n\) is the standard fully discrete approximation of \(y_f\). Setting \(w_h=k_n(y^n_f- y_f^{n-1})\) in (3.32) we get

$$\begin{aligned} \left( y_f^n-y_f^{n-1},y_f^n-y_f^{n-1}\right) +k_na\left( y_f^n,y_f^n-y_f^{n-1}\right) =k_n\left( P_k^nf,y_f^n-y_f^{n-1}\right) , \end{aligned}$$

thus we have

$$\begin{aligned}&\Vert y_f^n-y_f^{n-1}\Vert ^2_{0,\varOmega } +k_n\Vert y_f^n\Vert ^2_{1,\varOmega }\\&=k_na(y_f^n,y_f^{n-1})+\int _{I_n}(f,y_f^n-y_f^{n-1})dt\\&\le {1\over 2}k_n\Vert y_f^n\Vert ^2_{1,\varOmega }+{1\over 2}k_n\Vert y_f^{n-1}\Vert ^2_{1,\varOmega }+\Vert y_f^n-y_f^{n-1}\Vert _{0,\varOmega }\int _{I_n}\Vert f\Vert _{0,\varOmega }dt\\&\le {1\over 2}k_n\Vert y_f^n\Vert ^2_{1,\varOmega }+{1\over 2}k_n\Vert y_f^{n-1}\Vert ^2_{1,\varOmega }+\frac{1}{2}k_n\int _{I_n}\Vert f\Vert ^2_{0,\varOmega }dt +\frac{1}{2}\Vert y_f^n-y_f^{n-1}\Vert ^2_{0,\varOmega }. \end{aligned}$$

Summing the above inequalities over n from 1 to N we obtain

$$\begin{aligned} \sum \limits _{n=1}^{N}\Vert y_f^n-y_f^{n-1}\Vert ^2_{0,\varOmega } +k_N\Vert y_f^N\Vert ^2_{1,\varOmega }\le & {} k_1\Vert \tilde{Q}_hy_0\Vert ^2_{1,\varOmega }+k\sum \limits _{n=1}^{N}\int _{I_n}\Vert f\Vert ^2_{0,\varOmega }dt\nonumber \\\le & {} k\Vert \tilde{Q}_hy_0\Vert ^2_{1,\varOmega }+k\Vert f\Vert _{L^2(L^2(\varOmega ))}^2\nonumber \\\le & {} Ckh^{-2}\Vert y_0\Vert ^2_{0,\varOmega }+k\Vert f\Vert _{L^2(L^2(\varOmega ))}^2, \end{aligned}$$
(3.33)

where we used the estimate \(\Vert \tilde{Q}_hy_0\Vert ^2_{1,\varOmega }\le Ch^{-2}\Vert \tilde{Q}_hy_0\Vert ^2_{0,\varOmega }\le Ch^{-2}\Vert y_0\Vert ^2_{0,\varOmega }\). This proves the case \(u\equiv 0\) with \(Y_{hk}^n=y_f^n\). \(\square \)
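The summation step leading to (3.33) can be sketched as follows for a uniform time step \(k_n=k\); this uniformity is an assumption made here only to keep the telescoping exact. Absorbing the terms carrying the factor \({1\over 2}\) into the left-hand side of the inequality preceding (3.33) and multiplying by 2 gives

$$\begin{aligned} \Vert y_f^n-y_f^{n-1}\Vert ^2_{0,\varOmega }+k\Vert y_f^n\Vert ^2_{1,\varOmega }\le k\Vert y_f^{n-1}\Vert ^2_{1,\varOmega }+k\int _{I_n}\Vert f\Vert ^2_{0,\varOmega }dt. \end{aligned}$$

Summing over \(n=1,\ldots ,N\), the terms \(k\Vert y_f^n\Vert ^2_{1,\varOmega }\) with \(1\le n\le N-1\) cancel on both sides, so that

$$\begin{aligned} \sum \limits _{n=1}^{N}\Vert y_f^n-y_f^{n-1}\Vert ^2_{0,\varOmega }+k\Vert y_f^N\Vert ^2_{1,\varOmega }\le k\Vert y_f^0\Vert ^2_{1,\varOmega }+k\Vert f\Vert _{L^2(L^2(\varOmega ))}^2, \end{aligned}$$

which is the first line of (3.33) with \(y_f^0=\tilde{Q}_hy_0\).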

We next consider the fully discrete approximation of the above semi-discrete optimal control problems by using the dG(0) scheme in time. The fully discrete approximation scheme of (3.14) is to find \((Y_{hk},U_{hk})\in V_{hk}\times U_{ad}^{hk}\) such that

$$\begin{aligned} \min \limits _{U_{hk}\in U^{hk}_{ad},Y_{hk}\in V_{hk}}J_{hk}(Y_{hk},U_{hk})=\sum \limits ^{N}_{i=1}k_i\bigg \{{ 1\over 2}\int _{\varOmega }(Y_{hk}^i-P_k^i y_d)^2dx+{\alpha \over 2}\int _{\varGamma }(U_{hk}^i)^2ds\bigg \} \end{aligned}$$
(3.34)

subject to

$$\begin{aligned} \left\{ \begin{aligned} A(Y_{hk},\varPhi )=(f,\varPhi )_{\varOmega _T}+(y_0,\varPhi _+^0),\, \,&\forall \ \varPhi \in V_{hk}^0,\\ Y_{hk}=Q(U_{hk})\, \,&\text{ on }\ \varSigma . \end{aligned}\right. \end{aligned}$$
(3.35)

Here \(U_{ad}^{hk}\) is an appropriate approximation to \(U_{ad}\). We set \(U_{ad}^{hk}= V_{hk}(\varGamma )\cap U_{ad}\) for the full discretization of the control problem (1.1)–(1.2) and \(U_{ad}^{hk}\equiv U_{ad}\) for its variational discretization.

It follows from standard arguments (see [24]) that the above control problem has a unique solution \((Y_{hk},U_{hk})\), and that a pair \((Y_{hk},U_{hk})\in V_{hk}\times U_{ad}^{hk}\) is the solution of (3.34)–(3.35) if and only if there is a co-state \(Z_{hk}\in V_{hk}^0\), such that the triplet \((Y_{hk},Z_{hk},U_{hk})\in V_{hk}\times V_{hk}^0\times U_{ad}^{hk}\) satisfies (3.35) and the following optimality conditions:

$$\begin{aligned}&\left\{ {\begin{array}{l@{\quad }l} A(\varPhi ,Z_{hk})=\sum \limits ^{N}_{i=1}\int _{I_i}(Y_{hk}^i-y_d,\varPhi )dt,\ \ \forall \ \varPhi \in V_{hk}^0,\\ Z_{hk}=0\ \ \text{ on }\ \varSigma , \end{array}}\right. \end{aligned}$$
(3.36)
$$\begin{aligned}&\int _{\varOmega _T}(Y_{hk}-y_{d})(Y_{hk}(v_{hk})-Y_{hk})dxdt+\alpha \int _{\varSigma }U_{hk}(v_{hk}-U_{hk})dsdt\ge 0,\ \forall \ v_{hk}\in U_{ad}^{hk},\nonumber \\ \end{aligned}$$
(3.37)

where \(Y_{hk}(v_{hk})\) is the solution of problem (3.35) with Dirichlet boundary conditions \(Q(v_{hk})\).

To derive an expression for the derivative of \(J_{hk}:L^2(L^2(\varGamma ))\rightarrow \mathbb {R}\) analogous to the one of J given by formula (2.7) we have to define a discrete normal derivative \(\partial ^{hk}_nZ_{hk}\in V_{hk}(\varGamma )\) satisfying

$$\begin{aligned} \int _{\varSigma }\partial ^{hk}_nZ_{hk}\varPhi dsdt= & {} A(\varPhi ,Z_{hk})-\int _{\varOmega _T}(Y_{hk}-y_{d})\varPhi dxdt,\ \ \forall \ \varPhi \in V_{hk}. \end{aligned}$$
(3.38)

It is easy to verify that the linear form

$$\begin{aligned} L(\varPhi ):=A(\varPhi ,Z_{hk})-\int _{\varOmega _T}(Y_{hk}-y_{d})\varPhi dxdt \end{aligned}$$

is well defined on \(V_{hk}(\varGamma )\) and is also continuous. Thus, by the Riesz representation theorem, equation (3.38) admits a unique solution \(\partial ^{hk}_nZ_{hk}\) in \(V_{hk}(\varGamma )\). For an analogous reconstruction of discrete normal derivatives for elliptic Dirichlet boundary control problems we refer to [7]. With the help of (3.38) it is not difficult to show that

$$\begin{aligned} 0\le & {} J'_{hk}(U_{hk})(v_{hk}-U_{hk})\\= & {} \alpha \int _{\varSigma }U_{hk}(v_{hk}-U_{hk})dsdt+\int _{\varOmega _T}(Y_{hk}-y_{d})(Y_{hk}(v_{hk})-Y_{hk})dxdt\\= & {} \alpha \int _{\varSigma }U_{hk}(v_{hk}-U_{hk})dsdt+A(Y_{hk}(v_{hk})-Y_{hk},Z_{hk})\\&-\int _{\varSigma }\partial ^{hk}_nZ_{hk}(Y_{hk}(v_{hk})-Y_{hk})dsdt\\= & {} \alpha \int _{\varSigma }U_{hk}(v_{hk}-U_{hk})dsdt-\int _{\varSigma }\partial ^{hk}_nZ_{hk}\cdot Q(v_{hk}-U_{hk})dsdt\\= & {} \int _{\varSigma }(\alpha U_{hk}-\partial ^{hk}_nZ_{hk})(v_{hk}-U_{hk})dsdt \end{aligned}$$

for \(v_{hk}\in U_{ad}^{hk}\), which in turn implies

$$\begin{aligned} U_{hk}=P_{U_{ad}^{hk}}\left( {1\over \alpha }\partial ^{hk}_nZ_{hk}\right) , \end{aligned}$$
(3.39)

where \(P_{U_{ad}^{hk}}:L^2(L^2(\varGamma ))\rightarrow U_{ad}^{hk}\) denotes the orthogonal projection in \(L^2(L^2(\varGamma ))\) onto \(U_{ad}^{hk}\).
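For the variational discretization \(U_{ad}^{hk}=U_{ad}\) the projection in (3.39) can be made explicit. Since \(U_{ad}\) is a box in \(L^2(L^2(\varGamma ))\), the orthogonal projection onto it acts pointwise; this is a standard fact for box constraints, recorded here only as an illustration:

$$\begin{aligned} U_{hk}(x,t)=\max \Big \{u_a,\min \Big \{u_b,{1\over \alpha }\partial ^{hk}_nZ_{hk}(x,t)\Big \}\Big \}\quad \text{ for } \text{ a.e. }\ (x,t)\in \varSigma . \end{aligned}$$

For the full discretization \(U_{ad}^{hk}=V_{hk}(\varGamma )\cap U_{ad}\) such a pointwise formula need not hold in general, because the projected function must in addition belong to \(V_{hk}(\varGamma )\).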

4 Error Estimates for the Optimal Control Problems

As a preliminary result we first estimate the error introduced by the discretization of the state equation, i.e., the error between the solutions of problems (2.2) and (3.18).

Theorem 3

Let \(y\in L^2(L^2(\varOmega ))\) and \(Y_{hk}(u)\in V_{hk}\) with \(Y_{hk}(u)|_{\varSigma }=Q(u)\) be the solutions of problems (2.2) and (3.18), respectively. Then for \(u\in L^2(L^2(\varGamma ))\) we have

$$\begin{aligned} \Vert y-Y_{hk}(u)\Vert _{L^2(L^2(\varOmega ))}&\le C\left( h^{{1\over 2}}+k^{{1\over 4}}+h^{3\over 2}k^{-{1\over 2}}+h^{-{1\over 2}}k^{{1\over 2}}+h^{{5\over 2}}k^{-1}\right) \nonumber \\&\quad \big (\Vert f\Vert _{L^2(L^2(\varOmega ))}+\Vert y_0\Vert _{0,\varOmega }+\Vert u\Vert _{L^2(L^2(\varGamma ))}\big ) \end{aligned}$$
(4.1)

and for \(u\in L^2(H^{1\over 2}(\varGamma ))\cap H^{1\over 4}(L^2(\varGamma ))\) we have

$$\begin{aligned}&\Vert y-Y_{hk}(u)\Vert _{L^2(L^2(\varOmega ))} \le C(h+k^{1\over 2}+h^2k^{-{1\over 2}}+h^{3}k^{-1}+h^{-1}k)\nonumber \\&\left( \Vert f\Vert _{L^2(L^2(\varOmega ))}+\Vert y_0\Vert _{0,\varOmega }+\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))}+\Vert u\Vert _{H^{1\over 4}(L^2(\varGamma ))} \right) . \end{aligned}$$
(4.2)

Proof

In view of the linearity of the problem it is sufficient to consider the problems with either \(f\equiv 0, \,y_0\equiv 0\) or \(u\equiv 0\).

Let us first assume that \(f\equiv 0, \,y_0\equiv 0\) and \(u\in L^2(L^2(\varGamma ))\). We first note that according to [12] \(y\in L^2(H^{1\over 2}(\varOmega ))\) holds. Let \(w\in L^2(H^2(\varOmega )\cap H_0^1(\varOmega ))\cap H^1(L^2(\varOmega ))\) be the solution of problem (2.12) with right hand side \(g=y- Y_{hk}(u)\). Since \(w(T)=0\), we deduce from (2.2) and (3.18) that

$$\begin{aligned} \Vert y-Y_{hk}(u)\Vert _{L^2(L^2(\varOmega ))}^2= & {} \int _{\varOmega _T}(-w_t-\varDelta w)(y- Y_{hk}(u))dxdt\nonumber \\= & {} \int _{\varOmega _T}(-w_ty-\varDelta wy)dxdt+\int _{\varSigma } Q(u)\frac{\partial w}{\partial n}dsdt\nonumber \\&+\int _{\varOmega _T}(w_tY_{hk}(u)-\nabla w\nabla Y_{hk}(u))dxdt\nonumber \\= & {} \int _{\varSigma }(Q(u)-u)\frac{\partial w}{\partial n} dsdt\nonumber \\&+\int _{\varOmega _T}\big (w_tY_{hk}(u)-\nabla w\nabla Y_{hk}(u)\big )dxdt\nonumber \\:= & {} E_1+E_2. \end{aligned}$$
(4.3)

We treat \(E_1\) by exploiting the properties of \(P_k\) and \(Q_h\):

$$\begin{aligned} E_1= & {} \int _{\varSigma }( Q(u)-u)\frac{\partial w}{\partial n}dsdt\\= & {} \int _0^T\big \langle (P_k-I)u,\frac{\partial w}{\partial n}\big \rangle dt+\int _0^T\big \langle ( Q_h-I)P_ku, \frac{\partial w}{\partial n}\big \rangle dt\\= & {} \int _0^T\big \langle (P_k-I)u,\frac{\partial }{\partial n}(I-P_k)w\big \rangle dt+\int _0^T\big \langle ( Q_h-I)P_ku,(I-Q_h)\frac{\partial w}{\partial n}\big \rangle dt. \end{aligned}$$

From Young’s inequality, the trace inequality and a norm interpolation inequality we derive (see, e.g. [12])

$$\begin{aligned}&\left\| \frac{\partial }{\partial n}(I-P_k)w \right\| _{L^2(L^2(\varGamma ))}^2\\&\le \frac{C}{\epsilon }\Vert (I-P_k)w\Vert _{L^2(H^2(\varOmega ))}^2+{\epsilon }\Vert (I-P_k)w\Vert _{L^2(H^1(\varOmega ))}^2\\&\le \frac{C}{\epsilon }\Vert (I-P_k)w\Vert _{L^2(H^2(\varOmega ))}^2 +{\epsilon }\Vert (I-P_k)w\Vert _{L^2(H^2(\varOmega ))}\Vert (I-P_k)w\Vert _{L^2(L^2(\varOmega ))}\\&\le \frac{2C}{\epsilon }\Vert (I-P_k)w\Vert _{L^2(H^2(\varOmega ))}^2+{\epsilon ^3}\Vert (I-P_k)w\Vert _{L^2(L^2(\varOmega ))}^2. \end{aligned}$$

Setting \(\epsilon = k^{-\frac{1}{2}}\) and using the approximation property (3.17) of \(P_k\) gives

$$\begin{aligned} \left\| \frac{\partial }{\partial n}(I-P_k)w\right\| _{L^2(L^2(\varGamma ))}^2\le Ck^{1/2}\big (\Vert w\Vert _{L^2(H^2(\varOmega ))}^2+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}^2\big ). \end{aligned}$$
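The choice \(\epsilon =k^{-{1\over 2}}\) balances the two terms of the preceding bound. As a sketch, assume that (3.17) provides \(\Vert (I-P_k)w\Vert _{L^2(H^2(\varOmega ))}\le C\Vert w\Vert _{L^2(H^2(\varOmega ))}\) and \(\Vert (I-P_k)w\Vert _{L^2(L^2(\varOmega ))}\le Ck\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\); the precise form of (3.17) is not repeated here, so these two bounds are an assumption. Then

$$\begin{aligned} \frac{2C}{\epsilon }\Vert (I-P_k)w\Vert _{L^2(H^2(\varOmega ))}^2&\le Ck^{1\over 2}\Vert w\Vert _{L^2(H^2(\varOmega ))}^2,\\ \epsilon ^3\Vert (I-P_k)w\Vert _{L^2(L^2(\varOmega ))}^2&\le Ck^{-{3\over 2}}k^2\Vert w_t\Vert _{L^2(L^2(\varOmega ))}^2=Ck^{1\over 2}\Vert w_t\Vert _{L^2(L^2(\varOmega ))}^2, \end{aligned}$$

so both contributions are of order \(k^{1\over 2}\).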

We also have

$$\begin{aligned} \left\| (I-Q_h)\frac{\partial w}{\partial n}\right\| _{L^2(L^2(\varGamma ))}\le Ch^{1/2}\Vert w\Vert _{L^2(H^2(\varOmega ))}. \end{aligned}$$

Using the Cauchy-Schwarz inequality and stability results for \(Q_h\) and \(P_k\) we estimate

$$\begin{aligned} |E_1|\le & {} Ck^{1/4}\Vert u\Vert _{L^2(L^2(\varGamma ))}\big (\Vert w\Vert _{L^2(H^2(\varOmega ))}+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\big )\nonumber \\&+ \;Ch^{1/2}\Vert P_ku\Vert _{L^2(L^2(\varGamma ))}\Vert w\Vert _{L^2(H^2(\varOmega ))}\nonumber \\\le & {} C(h^{1/2}+k^{1/4})\Vert u\Vert _{L^2(L^2(\varGamma ))}\Vert g\Vert _{L^2(L^2(\varOmega ))}. \end{aligned}$$
(4.4)

Next we estimate \(E_2\). Considering (3.18) and \(w^N=w(\cdot ,T)=0\) we calculate

$$\begin{aligned} E_2= & {} \int _{\varOmega _T}\big (w_tY_{hk}(u)-\nabla w\nabla Y_{hk}(u)\big )dxdt\nonumber \\= & {} \sum \limits _{n=1}^N\Big (\big (Y_{hk}^n(u),w^{n}-w^{n-1}\big )-k_n(\nabla Y_{hk}^n(u),\nabla P_k^nw)\Big )\nonumber \\= & {} -\sum \limits _{n=1}^N\Big (\big (Y_{hk}^n(u)- Y_{hk}^{n-1}(u),w^{n-1}\big )+k_n(\nabla Y_{hk}^n(u),\nabla P_k^nw)\Big )\nonumber \\= & {} -\sum \limits _{n=1}^N\Big ((Y_{hk}^n(u)- Y_{hk}^{n-1}(u),w^{n-1}- R_hP_k^nw)\nonumber \\&+\;k_n\big (\nabla Y_{hk}^n(u),\nabla (P_k^nw-R_hP_k^nw)\big )\Big ). \end{aligned}$$
(4.5)

By the Cauchy-Schwarz inequality we have

$$\begin{aligned} |E_2|\le F_1\cdot F_2, \end{aligned}$$

where

$$\begin{aligned} F_1=\Big (\sum \limits _{n=1}^N\big (\Vert Y_{hk}^n(u)- Y_{hk}^{n-1}(u)\Vert _{0,\varOmega }^2+k_n(\nabla Y_{hk}^n(u),\nabla Y_{hk}^n(u))\big )\Big )^{{1\over 2}} \end{aligned}$$

and

$$\begin{aligned} F_2=\Big (\sum \limits _{n=1}^N\big (\Vert w^{n-1}- R_hP_k^nw\Vert _{0,\varOmega }^2+k_n(\nabla (I- R_h)P_k^nw,\nabla (I- R_h)P_k^nw)\big )\Big )^{{1\over 2}}. \end{aligned}$$

In view of the stability result (3.20) of Lemma 3 we have

$$\begin{aligned} |F_1|\le C(h^{-{1\over 2}}+h^{{1\over 2}}k^{-{1\over 2}})\Vert u\Vert _{L^2(L^2(\varGamma ))}. \end{aligned}$$
(4.6)

It remains to estimate \(F_2\). To begin with we note that

$$\begin{aligned} \Vert w^{n-1}-R_hP_k^nw\Vert _{0,\varOmega }\le & {} \Vert w^{n-1}- P_k^nw\Vert _{0,\varOmega }+\Vert (I- R_h)P_k^nw\Vert _{0,\varOmega }\nonumber \\\le & {} \Vert w^{n-1}-P_k^nw\Vert _{0,\varOmega }+Ch^2\Vert P_k^nw\Vert _{2,\varOmega }, \end{aligned}$$
(4.7)

and

$$\begin{aligned} (\nabla (I- R_h)P_k^nw,\nabla (I-R_h)P_k^nw)\le Ch^2\Vert P_k^nw\Vert _{2,\varOmega }^2. \end{aligned}$$
(4.8)

It is straightforward to show that

$$\begin{aligned} \Vert w^{n-1}-P_k^nw\Vert _{0,\varOmega }\le k_n^{1/2}\Vert w_t\Vert _{L^2(I_n,L^2(\varOmega ))} \end{aligned}$$
(4.9)

and

$$\begin{aligned} \Vert P_k^nw\Vert _{2,\varOmega }\le k_n^{-1/2}\Vert w\Vert _{L^2(I_n,H^2(\varOmega ))}. \end{aligned}$$
(4.10)

Combining (4.7)–(4.10) we get

$$\begin{aligned} F_2\le & {} C\Big (\sum \limits _{n=1}^N\left( (h^4+k_nh^2)\Vert P_k^nw\Vert _{2,\varOmega }^2+k_n\Vert w_t\Vert _{L^2(I_n,L^2(\varOmega ))}^2\right) \Big )^{{1\over 2}}\nonumber \\\le & {} C(h+h^2k^{-{1\over 2}}+k^{{1\over 2}})\Big (\sum \limits _{n=1}^N\left( \Vert w\Vert _{L^2(I_n,H^2(\varOmega ))}^2+\Vert w_t\Vert _{L^2(I_n,L^2(\varOmega ))}^2\right) \Big )^{{1\over 2}}\nonumber \\\le & {} C\left( h+h^2k^{-{1\over 2}}+k^{{1\over 2}}\right) \big (\Vert w\Vert _{L^2(H^2(\varOmega ))}+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\big ). \end{aligned}$$
(4.11)

Using the stability estimate (3.20) of Lemma 3 we conclude

$$\begin{aligned} |E_2|\le & {} C(h^{-{1\over 2}}+h^{1\over 2}k^{-{1\over 2}})(h+h^2k^{-{1\over 2}}+k^{{1\over 2}})\nonumber \\&\big (\Vert w\Vert _{L^2(H^2(\varOmega ))}+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\big )\Vert u\Vert _{L^2(L^2(\varGamma ))}\nonumber \\\le & {} C(h^{{1\over 2}}+h^{3\over 2}k^{-{1\over 2}}+h^{-{1\over 2}}k^{{1\over 2}}+h^{{5\over 2}}k^{-1})\nonumber \\&\big (\Vert w\Vert _{L^2(H^2(\varOmega ))}+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\big )\Vert u\Vert _{L^2(L^2(\varGamma ))}. \end{aligned}$$
(4.12)
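The second inequality in (4.12) follows by multiplying out the two factors term by term:

$$\begin{aligned} (h^{-{1\over 2}}+h^{1\over 2}k^{-{1\over 2}})(h+h^2k^{-{1\over 2}}+k^{{1\over 2}})&=h^{1\over 2}+h^{3\over 2}k^{-{1\over 2}}+h^{-{1\over 2}}k^{1\over 2}+h^{3\over 2}k^{-{1\over 2}}+h^{5\over 2}k^{-1}+h^{1\over 2}\\&=2h^{1\over 2}+2h^{3\over 2}k^{-{1\over 2}}+h^{-{1\over 2}}k^{1\over 2}+h^{5\over 2}k^{-1}. \end{aligned}$$

The factors 2 are absorbed into the generic constant C.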

From the estimates (4.3)–(4.12) we obtain the desired result in the case \(f\equiv 0, \,y_0\equiv 0\) and \(u\in L^2(L^2(\varGamma ))\):

$$\begin{aligned}&\Vert y-Y_{hk}(u)\Vert _{L^2(L^2(\varOmega ))}\nonumber \\&\le C(h^{{1\over 2}}+k^{{1\over 4}}+h^{3\over 2}k^{-{1\over 2}}+h^{-{1\over 2}}k^{{1\over 2}}+h^{{5\over 2}}k^{-1})\Vert u\Vert _{L^2(L^2(\varGamma ))}. \end{aligned}$$
(4.13)

If \(u\in L^2(H^{1\over 2}(\varGamma ))\cap H^{1\over 4}(L^2(\varGamma ))\) we can estimate \(E_1\) as

$$\begin{aligned} |E_1|\le & {} Ck^{1/2}\Vert u\Vert _{H^{1\over 4}(L^2(\varGamma ))}\big (\Vert w\Vert _{L^2(H^2(\varOmega ))}+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\big )\nonumber \\&+\;Ch\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))}\Vert w\Vert _{L^2(H^2(\varOmega ))}\nonumber \\\le & {} C(h+k^{1/2})\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))\cap H^{1\over 4}(L^2(\varGamma ))}\Vert g\Vert _{L^2(L^2(\varOmega ))}. \end{aligned}$$
(4.14)

Combining the estimate (4.11) of \(F_2\) and the stability result (3.21) in Lemma 3 we are led to

$$\begin{aligned} |E_2|\le & {} C(1+hk^{-{1\over 2}})(h+h^2k^{-{1\over 2}}+k^{{1\over 2}})\nonumber \\&\big (\Vert w\Vert _{L^2(H^2(\varOmega ))}+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\big )\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))}\nonumber \\\le & {} C(h+k^{1\over 2}+h^2k^{-{1\over 2}}+h^{3}k^{-1})\nonumber \\&\big (\Vert w\Vert _{L^2(H^2(\varOmega ))}+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\big )\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))}. \end{aligned}$$
(4.15)

Combining this with (4.3) and (4.14) gives

$$\begin{aligned}&\Vert y-Y_{hk}(u)\Vert _{L^2(L^2(\varOmega ))}\nonumber \\&\le C(h+k^{1\over 2}+h^2k^{-{1\over 2}}+h^{3}k^{-1})\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))\cap H^{1\over 4}(L^2(\varGamma ))}. \end{aligned}$$
(4.16)

If \(u\equiv 0, \,f\in L^2(L^2(\varOmega ))\) and \(y_0\in L^2(\varOmega )\) we have \(y\in L^2(H_0^1(\varOmega ))\cap H^1(H^{-1}(\varOmega ))\) (see, e.g. [25]). Then similar to the above error estimate and using the stability estimate (3.22) of Lemma 3, it is straightforward to prove that (see also [12])

$$\begin{aligned} \Vert y- Y_{hk}(u)\Vert _{L^2(L^2(\varOmega ))}\le C(h+h^{-1}k)(\Vert f\Vert _{L^2(L^2(\varOmega ))}+\Vert y_0\Vert _{0,\varOmega }). \end{aligned}$$
(4.17)

Actually, by using the duality argument it follows from (4.3) and (4.5) that

$$\begin{aligned}&\Vert y-Y_{hk}(u)\Vert _{L^2(L^2(\varOmega ))}^2\nonumber \\&=\int _{\varOmega _T}(-w_t-\varDelta w)(y- Y_{hk}(u))dxdt\nonumber \\&=\int _{\varOmega _T}\big (w_tY_{hk}(u)-\nabla w\nabla Y_{hk}(u)\big )dxdt+(y_0-Y_{hk}^0(u),w(\cdot ,0))\nonumber \\&=-\sum \limits _{n=1}^N\Big (Y_{hk}^n(u)- Y_{hk}^{n-1}(u),w^{n-1}- R_hP_k^nw\Big )\nonumber \\&\quad +\;(y_0-Y_{hk}^0(u),w(\cdot ,0)), \end{aligned}$$
(4.18)

where we used the fact that the second term in (4.5) vanishes because of \(Y_{hk}^n(u)\in V_0^h\). Note that

$$\begin{aligned} (y_0-Y_{hk}^0(u),w(\cdot ,0))\le & {} \Vert y_0-\tilde{Q}_h y_0\Vert _{-1,\varOmega }\Vert w(\cdot ,0)\Vert _{1,\varOmega }\nonumber \\\le & {} Ch\Vert y_0\Vert _{0,\varOmega }\Vert g\Vert _{L^2(L^2(\varOmega ))}. \end{aligned}$$
(4.19)

It follows from (3.22), (4.7), (4.9), (4.10) that

$$\begin{aligned}&\Big |\sum \limits _{n=1}^N\Big (Y_{hk}^n(u)- Y_{hk}^{n-1}(u),w^{n-1}- R_hP_k^nw\Big )\Big |\\&\quad \le \Big (\sum \limits _{n=1}^N\Vert w^{n-1}- R_hP_k^nw\Vert _{0,\varOmega }^2\Big )^{{1\over 2}}\Big (\sum \limits _{n=1}^N\Vert Y_{hk}^n(u)- Y_{hk}^{n-1}(u)\Vert _{0,\varOmega }^2\Big )^{{1\over 2}}\\&\quad \le C\Big (\sum \limits _{n=1}^N\Big (h^4k_n^{-1}\Vert w\Vert _{L^2(I_n,H^2(\varOmega ))}^2+k_n\Vert w_t\Vert _{L^2(I_n,L^2(\varOmega ))}^2\Big )\Big )^{{1\over 2}}\\&\qquad \big (h^{-2}k\Vert y_0\Vert _{0,\varOmega }^2 + k\Vert f\Vert _{L^2(L^2(\varOmega ))}^2\big )^{{1\over 2}}\\&\quad \le C(h^2k^{-{1\over 2}}+k^{{1\over 2}})\Big (\sum \limits _{n=1}^N(\Vert w\Vert _{L^2(I_n,H^2(\varOmega ))}^2+\Vert w_t\Vert _{L^2(I_n,L^2(\varOmega ))}^2)\Big )^{{1\over 2}}\\&\qquad \big (h^{-2}k\Vert y_0\Vert _{0,\varOmega }^2 + k\Vert f\Vert _{L^2(L^2(\varOmega ))}^2\big )^{{1\over 2}}\\&\quad \le C(h+h^{-1}k)\Vert g\Vert _{L^2(L^2(\varOmega ))}\big (\Vert y_0\Vert _{0,\varOmega } + \Vert f\Vert _{L^2(L^2(\varOmega ))}\big ), \end{aligned}$$

which together with (4.18) and (4.19) gives (4.17). Combining both cases completes the proof. \(\square \)
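The last step of the preceding chain of inequalities can be checked explicitly. The following sketch assumes \(h\le 1\) and uses \((a^2+b^2)^{1\over 2}\le a+b\) for \(a,b\ge 0\) together with the bound \(\Vert w\Vert _{L^2(H^2(\varOmega ))}+\Vert w_t\Vert _{L^2(L^2(\varOmega ))}\le C\Vert g\Vert _{L^2(L^2(\varOmega ))}\) for the dual problem:

$$\begin{aligned} (h^2k^{-{1\over 2}}+k^{1\over 2})\,h^{-1}k^{1\over 2}&=h+h^{-1}k,\\ (h^2k^{-{1\over 2}}+k^{1\over 2})\,k^{1\over 2}&=h^2+k\le h+h^{-1}k. \end{aligned}$$

The first line is the factor multiplying \(\Vert y_0\Vert _{0,\varOmega }\) and the second the factor multiplying \(\Vert f\Vert _{L^2(L^2(\varOmega ))}\); in the second line \(h^2\le h\) and \(k\le h^{-1}k\) both follow from \(h\le 1\).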

Now we are in a position to derive our main result of this section: the a priori error estimates for optimal control problems. First we consider the fully discrete case, i.e., \(U_{ad}^{hk}=U_{ad}\cap V_{hk}(\varGamma )\).

Theorem 4

Let \((y,u,z)\in {L^2(L^2(\varOmega ))}\times {L^2(L^2(\varGamma ))}\times {L^2(H^2(\varOmega ))}\cap H^1(L^2(\varOmega ))\) and \((Y_{hk},U_{hk},Z_{hk})\in V_{hk}\times U_{ad}^{hk}\times V_{hk}^0\) be the solutions of problem (2.5)–(2.8) and (3.34)–(3.35) with \(U_{ad}^{hk}=U_{ad}\cap V_{hk}(\varGamma )\), respectively. Then we have the a priori error estimate

$$\begin{aligned}&\Vert u-U_{hk}\Vert _{L^2(L^2(\varGamma ))}+\Vert y-Y_{hk}\Vert _{L^2(L^2(\varOmega ))}\nonumber \\&\le C(h^{{1\over 2}}+k^{{1\over 4}}+h^{3\over 2}k^{-{1\over 2}}+h^{-{1\over 2}}k^{{1\over 2}}+h^{{5\over 2}}k^{-1})\nonumber \\&\big (\Vert f\Vert _{L^2(L^2(\varOmega ))}+\Vert y_0\Vert _{0,\varOmega }+ \Vert y_d\Vert _{L^2(L^2(\varOmega ))}+\Vert u\Vert _{L^2(L^2(\varGamma ))}\big ) \end{aligned}$$
(4.20)

with a constant \(C>0\) independent of h and k.

Proof

Let us recall the continuous and discrete optimality conditions

$$\begin{aligned} \int _{\varOmega _T}(y-y_d)(y(v)-y)dxdt+\alpha \int _{\varSigma } u(v-u)dsdt\ge 0,\ \ \forall \ v\in U_{ad} \end{aligned}$$
(4.21)

and

$$\begin{aligned}&\int _{\varOmega _T}(Y_{hk}- y_{d})(Y_{hk}(v_{hk})- Y_{hk})dxdt+\alpha \int _{\varSigma } U_{hk}(v_{hk}- U_{hk})dsdt\nonumber \\&\ge 0,\ \ \ \forall \ v_{hk}\in U_{ad}^{hk}. \end{aligned}$$
(4.22)

Setting \(v=U_{hk}\in U_{ad}\) and \(v_{hk}=Q(u)\in U_{ad}^{hk}\) we have

$$\begin{aligned}&\alpha \Vert u- U_{hk}\Vert _{L^2(L^2(\varGamma ))}^2=\alpha \int _{\varSigma }(u- U_{hk})^2dsdt\nonumber \\&\quad =\alpha \int _{\varSigma }u(u- U_{hk})dsdt-\alpha \int _{\varSigma } U_{hk}(u- U_{hk})dsdt\nonumber \\&\quad \le \int _{\varOmega _T}(y-y_d)(y(U_{hk})-y)dxdt-\alpha \int _{\varSigma } U_{hk}(u- Q(u))dsdt\nonumber \\&\qquad -\alpha \int _{\varSigma }U_{hk}(Q(u)-U_{hk})dsdt\nonumber \\&\quad \le \int _{\varOmega _T}(y-y_d)(y(U_{hk})-y)dxdt+\int _{\varOmega _T}( Y_{hk}-y_d)( Y_{hk}(Qu)-Y_{hk})dxdt\nonumber \\&\qquad -\alpha \int _{\varSigma }U_{hk}(u- Q(u))dsdt, \end{aligned}$$
(4.23)

where \(y(U_{hk})\in L^2(L^2(\varOmega ))\) with \(y(U_{hk})|_{\varSigma }= U_{hk}\) solves

$$\begin{aligned} \int _{\varOmega _T}y(U_{hk})(-v_t-\varDelta v)dxdt= & {} -\int _{\varSigma }U_{hk}\partial _{{n}}vdsdt+\int _{\varOmega _T}fvdxdt+\int _\varOmega y_0v(\cdot ,0)dx\nonumber \\&\forall \ v\in L^2(H^2(\varOmega )\cap H_0^1(\varOmega ))\cap H^1(L^2(\varOmega )) \end{aligned}$$
(4.24)

with \(v(\cdot ,T)=0\), and \(Y_{hk}(Qu)\in V_{hk}\) solves

$$\begin{aligned} \left\{ {\begin{array}{l@{\quad }l} A(Y_{hk}(Qu),\varPhi )=(f,\varPhi )_{\varOmega _T}+(y_0,\varPhi _+^0),\, \, &{}\forall \ \varPhi \in V_{hk}^0,\\ Y_{hk}(Qu)=Q(u)\, \, &{}\text{ on }\ \varSigma . \end{array}}\right. \end{aligned}$$
(4.25)

With Young’s inequality we deduce

$$\begin{aligned}&\int _{\varOmega _T}(y-y_d)(y(U_{hk})-y)dxdt+\int _{\varOmega _T}( Y_{hk}-y_d)( Y_{hk}(Qu)- Y_{hk})dxdt\nonumber \\&\quad =(y-y_d,y(U_{hk})-y)_{\varOmega _T}+(Y_{hk}-y_d,Y_{hk}(Qu)- Y_{hk})_{\varOmega _T}\nonumber \\&\quad =(y-y_d,y(U_{hk})-y)_{\varOmega _T}+( Y_{hk}-y,y-Y_{hk})_{\varOmega _T}\nonumber \\&\qquad +\;(Y_{hk}-y, Y_{hk}(Qu)-y)_{\varOmega _T}+(y-y_d, Y_{hk}(Qu)- Y_{hk})_{\varOmega _T}\nonumber \\&\quad =-\Vert y-Y_{hk}\Vert _{L^2(L^2(\varOmega ))}^2+(Y_{hk}-y, Y_{hk}(Qu)-y)_{\varOmega _T}\nonumber \\&\qquad +\big (y-y_d,y( U_{hk})-y-( Y_{hk}-Y_{hk}(Qu))\big )_{\varOmega _T}\nonumber \\&\quad \le -\Vert y-Y_{hk}\Vert _{L^2(L^2(\varOmega ))}^2+\big (y-y_d,y(U_{hk})-y-(Y_{hk}- Y_{hk}(Qu))\big )_{\varOmega _T}\nonumber \\&\qquad +\;\sigma \Vert y-Y_{hk}\Vert _{L^2(L^2(\varOmega ))}^2+C(\sigma )\Vert Y_{hk}(Qu)-y\Vert _{L^2(L^2(\varOmega ))}^2. \end{aligned}$$
(4.26)

Taking \(\sigma >0\) small enough, we obtain from (4.23)–(4.26)

$$\begin{aligned}&\alpha \Vert u-U_{hk}\Vert _{L^2(L^2(\varGamma ))}^2+\Vert y- Y_{hk}\Vert _{L^2(L^2(\varOmega ))}^2\nonumber \\&\quad \le -\alpha \int _{\varSigma } U_{hk}(u- Q(u))dsdt+C\Vert Y_{hk}(Qu)-y\Vert _{L^2(L^2(\varOmega ))}^2\nonumber \\&\qquad +\;(y-y_d,y(U_{hk})-y-( Y_{hk}- Y_{hk}(Qu)))_{\varOmega _T}\nonumber \\&\quad :=I_1+I_2+I_3. \end{aligned}$$
(4.27)

Note that from the standard error estimates for the \(L^2\)-projection and the regularity of u we have

$$\begin{aligned} \Vert Q(u)-u\Vert _{L^2(L^2(\varGamma ))}^2\le C(h+k^{1\over 2})\big (\Vert u\Vert _{L^2(H^{\frac{1}{2}}(\varGamma ))}^2+\Vert u\Vert _{H^{\frac{1}{4}} (L^2(\varGamma ))}^2\big ). \end{aligned}$$
(4.28)

Thus we are led to

$$\begin{aligned} |I_1|= & {} \big |-\alpha \int _{\varSigma } U_{hk}(u-Q(u))dsdt\big |\nonumber \\= & {} \big |\alpha \int _{\varSigma }u( Q(u)-u)dsdt+\alpha \int _{\varSigma }(U_{hk}-u)( Q(u)-u)dsdt\big |\nonumber \\= & {} \big |\alpha \int _{\varSigma }(u- Q(u))( Q(u)-u)dsdt +\alpha \int _{\varSigma }(U_{hk}-u)( Q(u)-u)dsdt\big |\nonumber \\\le & {} \sigma \Vert u- U_{hk}\Vert ^2_{L^2(L^2(\varGamma ))}+C\Vert u- Q(u)\Vert ^2_{L^2(L^2(\varGamma ))}\nonumber \\\le & {} \sigma \Vert u- U_{hk}\Vert ^2_{L^2(L^2(\varGamma ))}+C(h+k^{1\over 2})\left( \Vert u\Vert _{L^2(H^{\frac{1}{2}}(\varGamma ))}^2+\Vert u\Vert _{H^{\frac{1}{4}}(L^2(\varGamma ))}^2\right) . \end{aligned}$$
(4.29)

Since \(Y_{hk}(Qu)\) is the fully discrete finite element approximation of y, the error estimate (4.2) of Theorem 3 gives

$$\begin{aligned} I_2= & {} \Vert Y_{hk}(Qu)-y\Vert _{L^2(L^2(\varOmega ))}^2\le C(h^2+k+h^4k^{-1}+h^{6}k^{-2}+h^{-2}k^2)\nonumber \\&\left( \Vert f\Vert _{L^2(L^2(\varOmega ))}^2+\Vert y_0\Vert ^2_{0,\varOmega }+\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))\cap H^{1\over 4}(L^2(\varGamma ))}^2\right) . \end{aligned}$$
(4.30)

Then it remains to estimate \(I_3\). From (2.2), (2.6), (4.24) and (4.25) we have

$$\begin{aligned} I_3= & {} (y-y_d,y(U_{hk})-y-(Y_{hk}- Y_{hk}(Qu)))_{\varOmega _T}\nonumber \\= & {} \int _{\varOmega _T}(y(U_{hk})-y)\left( -\frac{\partial z}{\partial t}-\varDelta z\right) dxdt\nonumber \\&-\int _{\varOmega _T}(Y_{hk}- Y_{hk}(Qu))\left( -\frac{\partial z}{\partial t}-\varDelta z\right) dxdt\nonumber \\= & {} \int _{\varOmega _T}\big (-(y(U_{hk})-y)z_t-(y( U_{hk})-y)\varDelta z\big )dxdt\nonumber \\&+\int _{\varOmega _T}\big (z_t(Y_{hk}- Y_{hk}(Qu))-\nabla (Y_{hk}-Y_{hk}(Qu))\nabla z\big )dxdt\nonumber \\&+\int _{\varSigma }(U_{hk}-Q(u))\partial _{{ n}}zdsdt\nonumber \\= & {} -\int _{\varSigma }( U_{hk}-u)\partial _{{ n}}zdsdt+\int _{\varSigma }(U_{hk}-Q(u))\partial _{{n}}zdsdt\nonumber \\&+\int _{\varOmega _T}\big (z_t(Y_{hk}- Y_{hk}(Qu))-\nabla (Y_{hk}-Y_{hk}(Qu))\nabla z\big )dxdt\nonumber \\= & {} H_1+H_2+H_3. \end{aligned}$$
(4.31)

Note that

$$\begin{aligned} H_1+H_2= & {} -\int _{\varSigma }(U_{hk}-u)\partial _{{ n}}zdsdt+\int _{\varSigma }(U_{hk}-Q(u))\partial _{{ n}}zdsdt\\= & {} \int _{\varSigma }(u- Qu)\partial _{{ n}}zdsdt\\= & {} \int _{\varSigma }(u- Qu)\big (\partial _{{n}}z- Q(\partial _{{n}}z)\big )dsdt. \end{aligned}$$

It is straightforward to estimate

$$\begin{aligned} |H_1+H_2|\le & {} C(h+k^{1\over 2})\big (\Vert z\Vert _{L^2(H^2(\varOmega ))}+\Vert z_t\Vert _{L^2(L^2(\varOmega ))}\big )\nonumber \\&\big (\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))}+\Vert u\Vert _{H^{1\over 4}(L^2(\varGamma ))}\big ). \end{aligned}$$
(4.32)
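The estimate (4.32) can be traced back to the orthogonality of Q as follows. This is a sketch, assuming that the bounds for \(\frac{\partial w}{\partial n}\) used in the proof of Theorem 3 carry over to \(\partial _nz\), so that \(\Vert \partial _nz-Q(\partial _nz)\Vert _{L^2(L^2(\varGamma ))}\le C(h^{1\over 2}+k^{1\over 4})\big (\Vert z\Vert _{L^2(H^2(\varOmega ))}+\Vert z_t\Vert _{L^2(L^2(\varOmega ))}\big )\). By the Cauchy-Schwarz inequality and (4.28),

$$\begin{aligned} |H_1+H_2|&\le \Vert u-Qu\Vert _{L^2(L^2(\varGamma ))}\Vert \partial _nz-Q(\partial _nz)\Vert _{L^2(L^2(\varGamma ))}\\&\le C(h^{1\over 2}+k^{1\over 4})^2\big (\Vert u\Vert _{L^2(H^{1\over 2}(\varGamma ))}+\Vert u\Vert _{H^{1\over 4}(L^2(\varGamma ))}\big )\big (\Vert z\Vert _{L^2(H^2(\varOmega ))}+\Vert z_t\Vert _{L^2(L^2(\varOmega ))}\big ), \end{aligned}$$

and \((h^{1\over 2}+k^{1\over 4})^2\le 2(h+k^{1\over 2})\).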

Define \(E_{hk}:=Y_{hk}- Y_{hk}(Qu)\). Using the proof technique of Theorem 3 we obtain from (3.35) and (4.25)

$$\begin{aligned} H_3= & {} \int _{\varOmega _T}\big (z_tE_{hk}-\nabla E_{hk}\nabla z\big )dxdt\\= & {} \sum \limits _{n=1}^N\big (\big (E_{hk}^n,z^n-z^{n-1}\big )-k_n(\nabla E_{hk}^n,\nabla P_k^nz)\big )\\= & {} -\sum \limits _{n=1}^N\big (\big (E_{hk}^n-E_{hk}^{n-1},z^{n-1}\big )+k_n(\nabla E_{hk}^n,\nabla P_k^nz)\big )\\= & {} -\sum \limits _{n=1}^N\big ((E_{hk}^n-E_{hk}^{n-1},z^{n-1}-R_hP_k^nz)+k_n(\nabla E_{hk}^n,\nabla (P_k^nz-R_hP_k^nz))\big ). \end{aligned}$$

With the help of the projection error estimates and proceeding as in the estimate of (4.5) we have

$$\begin{aligned} |H_3|\le & {} C(h+h^2k^{-{1\over 2}}+k^{1\over 2})\big (\Vert z\Vert _{L^2(H^2(\varOmega ))}+\Vert z\Vert _{H^1(L^2(\varOmega ))}\big )\cdot \\&\left( \sum \limits _{n=1}^{N}\Vert E_{hk}^n- E_{hk}^{n-1}\Vert ^2_{0,\varOmega }+k_n\Vert E_{hk}^n\Vert ^2_{1,\varOmega }\right) ^{1\over 2}. \end{aligned}$$

From Lemma 3 we conclude

$$\begin{aligned} |H_3|\le & {} C(h+h^2k^{-{1\over 2}}+k^{1\over 2})\big (\Vert z\Vert _{L^2(H^2(\varOmega ))}+\Vert z_t\Vert _{L^2(L^2(\varOmega ))}\big )\cdot \nonumber \\&\quad (h^{-{1\over 2}}+h^{{1\over 2}}k^{-{1\over 2}})\Vert Q(u)-U_{hk}\Vert _{L^2(L^2(\varGamma ))}\nonumber \\&\le C(h^{{1\over 2}}+h^{{3\over 2}}k^{-{1\over 2}}+h^{-{1\over 2}}k^{{1\over 2}}+h^{{5\over 2}}k^{-1})\nonumber \\&\quad \big (\Vert z\Vert _{L^2(H^2(\varOmega ))}+\Vert z_t\Vert _{L^2(L^2(\varOmega ))}\big )\Vert u- U_{hk}\Vert _{L^2(L^2(\varGamma ))}. \end{aligned}$$
(4.33)

Since the projection operator \(P_{U_{ad}}\) is continuous on \(L^2(H^{\frac{1}{2}}(\varGamma ))\) and \(H^{\frac{1}{4}}(L^2(\varGamma ))\) (see, e.g. [23, Lm. 3.3]), we have from Theorem 2 that

$$\begin{aligned} \Vert u\Vert _{L^2(H^{\frac{1}{2}}(\varGamma ))}+\Vert u\Vert _{H^{\frac{1}{4}}(L^2(\varGamma ))}\le C(\Vert z\Vert _{L^2(H^2(\varOmega ))}+\Vert z_t\Vert _{L^2(L^2(\varOmega ))}). \end{aligned}$$
(4.34)

From the standard regularity result for parabolic equations [25] and Lemma 1 we have

$$\begin{aligned}&\Vert z\Vert _{L^2(H^2(\varOmega ))}+\Vert z_t\Vert _{L^2(L^2(\varOmega ))}\le C(\Vert y\Vert _{L^2(L^2(\varOmega ))}+\Vert y_d\Vert _{L^2(L^2(\varOmega ))})\nonumber \\\le & {} C(\Vert f\Vert _{L^2(L^2(\varOmega ))}+\Vert y_d\Vert _{L^2(L^2(\varOmega ))}+ \Vert y_0\Vert _{0,\varOmega }+\Vert u\Vert _{L^2(L^2(\varGamma ))}). \end{aligned}$$
(4.35)

Combining the above results and using the Cauchy-Schwarz inequality and Young’s inequality completes the proof. \(\square \)

If we use the variational discretization concept introduced in [19], i.e., \(U_{ad}^{hk}=U_{ad}\), we can prove the following error estimates in a similar way.

Theorem 5

Let \((y,u,z)\in {L^2(L^2(\varOmega ))}\times {L^2(L^2(\varGamma ))}\times {L^2(H^2(\varOmega ))}\cap H^1(L^2(\varOmega ))\) and \((Y_{hk},U_{hk},Z_{hk})\in V_{hk}\times U_{ad}\times V_{hk}^0\) be the solutions of problem (2.5)–(2.8) and (3.34)–(3.35) with \(U_{ad}^{hk}=U_{ad}\), respectively. Then we have the following a priori error estimate

$$\begin{aligned}&\Vert u-U_{hk}\Vert _{L^2(L^2(\varGamma ))}+\Vert y-Y_{hk}\Vert _{L^2(L^2(\varOmega ))}\nonumber \\&\quad \le C(h^{{1\over 2}}+k^{{1\over 4}}+h^{3\over 2}k^{-{1\over 2}}+h^{-{1\over 2}}k^{{1\over 2}}+h^{{5\over 2}}k^{-1})\nonumber \\&\quad \big (\Vert f\Vert _{L^2(L^2(\varOmega ))}+\Vert y_0\Vert _{0,\varOmega }+\Vert y_d\Vert _{L^2(L^2(\varOmega ))}+\Vert u\Vert _{L^2(L^2(\varGamma ))}\big ) \end{aligned}$$
(4.36)

with a constant \(C>0\) independent of h and k.

Proof

In the proof of Theorem 4 it suffices to set \(v=U_{hk}\) in (4.21) and \(v_{hk}=u\) in (4.22) and add the corresponding inequalities. This directly gives

$$\begin{aligned} \alpha \Vert u-U_{hk}\Vert _{L^2(L^2(\varGamma ))}^2\le & {} (y-y_d,y(U_{hk})-y)_{\varOmega _T}+( Y_{hk}-y_d,Y_{hk}(u)-Y_{hk})_{\varOmega _T}\\\le & {} -\Vert y-Y_{hk}\Vert _{L^2(L^2(\varOmega ))}^2+(y-Y_{hk},y-Y_{hk}(u))_{\varOmega _T}\\&+\;(y-y_d,y(U_{hk})-y-( Y_{hk}- Y_{hk}(u)))_{\varOmega _T}. \end{aligned}$$

Thus

$$\begin{aligned}&\alpha \Vert u-U_{hk}\Vert _{L^2(L^2(\varGamma ))}^2+\Vert y-Y_{hk}\Vert _{L^2(L^2(\varOmega ))}^2\\&\quad \le C\Vert y-Y_{hk}(u)\Vert _{L^2(L^2(\varOmega ))}^2+(y-y_d,y(U_{hk})-y-( Y_{hk}- Y_{hk}(u)))_{\varOmega _T}. \end{aligned}$$

The rest of the proof follows the lines of the estimation of the terms \(I_2\) and \(I_3\) in the proof of Theorem 4. \(\square \)

As the final result, we can conclude from Theorems 4 and 5 the explicit convergence rate with respect to h and k for the fully discrete finite element approximation of the optimal control problems under the assumption \(k=O(h^2)\).

Corollary 1

Assume that the spatial mesh size h and time step k satisfy the coupling \(k=O(h^2)\). Then we have the following a priori error estimate

$$\begin{aligned} \Vert u-U_{hk}\Vert _{L^2(L^2(\varGamma ))}+\Vert y-Y_{hk}\Vert _{L^2(L^2(\varOmega ))} \le Ch^{{1\over 2}} \end{aligned}$$
(4.37)

for both full control discretization and variational control discretization with a constant \(C>0\) independent of h and k.
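The rate in (4.37) can be verified by substituting the coupling into the five terms of (4.20) and (4.36). As a sketch, assume \(k=ch^2\) with a fixed constant \(c>0\); this two-sided reading of \(k=O(h^2)\) is an assumption, and it matters because the terms with negative powers of k also require a lower bound on k:

$$\begin{aligned} k^{1\over 4}=c^{1\over 4}h^{1\over 2},\quad h^{3\over 2}k^{-{1\over 2}}=c^{-{1\over 2}}h^{1\over 2},\quad h^{-{1\over 2}}k^{1\over 2}=c^{1\over 2}h^{1\over 2},\quad h^{5\over 2}k^{-1}=c^{-1}h^{1\over 2}. \end{aligned}$$

Together with the first term \(h^{1\over 2}\), all five contributions are of order \(h^{1\over 2}\).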

Remark 1

The error estimates we obtained in Theorems 4, 5 and Corollary 1 reflect the worst cases we can expect for parabolic Dirichlet boundary control problems defined on convex polygonal domains.

Since the regularity of parabolic equations depends on the maximum interior angle of the domain, the state admits the improved regularity \(y\in L^2(W^{1,p}(\varOmega ))\) for \(2\le p < p_{*}\) with \(p_{*}=\frac{2\omega }{2\omega -\pi }\) depending on the maximum interior angle \(\omega \) of the polygonal domain \(\varOmega \) and also the data (see [29] for more details). Moreover, for problems defined on curved domains with smooth boundary, we have higher regularity for the optimal control problems as stated in Theorem 3.4 of [23]. Improved space regularity leads to better approximation properties of the state and thus to better convergence rates for space discretization, as is reported in our numerical results. For the elliptic case with polygonal boundaries we refer to [7] where an approximation order for the controls of \(O(h^{1-\frac{1}{p}})\) is derived for some \(2<p\le p_{*}\) with \(p_{*}\) depending on the data and the maximum interior angle of the domain. For the error estimates of elliptic Dirichlet boundary control problems defined on curved domains we refer to [8], [10] and [16] for more details.
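To get a feeling for the exponent \(p_{*}=\frac{2\omega }{2\omega -\pi }\), it can be evaluated for concrete angles. The following is a minimal sketch; the function name `p_star` is hypothetical, and the treatment of angles \(\omega \le \pi /2\) (where the denominator is not positive, read here as "no finite upper bound") is an assumption of this illustration rather than a statement taken from [29].

```python
import math


def p_star(omega: float) -> float:
    """Evaluate p_* = 2*omega / (2*omega - pi) for a maximum interior angle omega (radians)."""
    if not 0.0 < omega < math.pi:
        raise ValueError("omega must lie in (0, pi) for a convex polygon")
    denom = 2.0 * omega - math.pi
    # Assumption of this sketch: a non-positive denominator (omega <= pi/2)
    # is interpreted as "no finite upper bound on p".
    if denom <= 0.0:
        return math.inf
    return 2.0 * omega / denom


# Maximum interior angle 5*pi/6, as in Example 2: p_* is about 2.5
print(p_star(5.0 * math.pi / 6.0))
# Maximum interior angle 2*pi/3: p_* is about 4
print(p_star(2.0 * math.pi / 3.0))
```

The closer \(\omega \) is to \(\pi \), the closer \(p_{*}\) is to 2, which matches the statement that the estimates above describe the worst case.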

Note that \(y\in L^2(H^1(\varOmega ))\cap H^{1\over 2}(L^2(\varOmega ))\). If in addition \(y_d\in L^2(H^1(\varOmega ))\cap H^{1\over 2}(L^2(\varOmega ))\), we may derive that \(z\in L^2(H^3(\varOmega ))\cap H^{3\over 2}(L^2(\varOmega ))\) and \(\partial _nz\in H^{3\over 4}(L^2(\varGamma ))\) under appropriate assumptions on the domain \(\varOmega \), and thus \(u\in H^{1\over 2}(L^2(\varGamma ))\) (see the proof of Theorem 2 and [23]). This improved time regularity may deliver higher order convergence for the time discretization than predicted by the estimates (4.29) and (4.32), which may explain the higher order convergence in time observed in the numerical experiments.

5 Numerical Experiments

In this section we will carry out some numerical experiments to support our theoretical findings. We consider the optimal control problems (1.1)–(1.2) of tracking type with control set \(U_{ad}\) defined as follows

$$\begin{aligned} U_{ad}:=\big \{u\in L^2(0,T;L^2(\varGamma )):\ 0\le u(x,t)\le 1,\ \ \text{ a.a. }\ (t,x)\in [0,T]\times \varGamma \big \}. \end{aligned}$$

In the numerical experiments we illustrate the convergence orders with respect to the spatial and time discretizations separately by choosing k and h, respectively, small enough, although we derived the a priori error estimates under the coupling \(k=O(h^2)\). The numerical tests indicate that such a coupling of k and h may not be needed. We expect that a corresponding analysis is possible by adapting the techniques of [30] and [31] to the present setting.

Although we do not consider problems defined on curved domains in our numerical analysis, we include some numerical examples on both polygonal and curved domains using full discretization and variational discretisation of the control. For the numerical approximations of Dirichlet boundary control problems defined on curved domains we refer to [8, 10] and [16]. We use \(\Vert \cdot \Vert _{L^2}\) to denote the \(L^2(L^2(\varGamma ))\)-norm error for the optimal control u and the \(L^2(L^2(\varOmega ))\)-norm errors for the state y and adjoint state z.

Example 1

The first example is an unconstrained problem defined on the unit square \(\varOmega =[0,1]\times [0,1], \,T=1\). The data is chosen as

$$\begin{aligned} f= & {} -\frac{4.0}{\alpha }\sin (\pi t) - \frac{\pi }{\alpha }(x_1(1-x_1)+x_2(1-x_2))\cos (\pi t),\\ y_d= & {} -(2+1.0/\alpha )(x_1(1-x_1)+x_2(1-x_2))\sin (\pi t) + \pi x_1x_2(1-x_1)(1-x_2)\cos (\pi t), \end{aligned}$$

with \(\alpha =1\), so that the optimal solution is given by

$$\begin{aligned} u= & {} -\frac{1}{\alpha }(x_1(1-x_1)+x_2(1-x_2))\sin (\pi t),\\ y= & {} -\frac{1}{\alpha }(x_1(1-x_1)+x_2(1-x_2))\sin (\pi t),\\ z= & {} x_1x_2(1-x_1)(1-x_2)\sin (\pi t). \end{aligned}$$
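The consistency of this data with the stated solution can be checked numerically. The sketch below evaluates the residual of the state equation (1.2) and of an adjoint equation at a sample point using central finite differences. It assumes the adjoint equation in the standard form \(-\partial _tz-\varDelta z=y-y_d\), which is not restated in this section; all function names and step sizes are hypothetical choices for this illustration.

```python
import math

ALPHA = 1.0  # regularization parameter used in Example 1


def q(x1, x2):
    return x1 * (1 - x1) + x2 * (1 - x2)


def b(x1, x2):
    return x1 * x2 * (1 - x1) * (1 - x2)


def y(x1, x2, t):
    return -q(x1, x2) * math.sin(math.pi * t) / ALPHA


def z(x1, x2, t):
    return b(x1, x2) * math.sin(math.pi * t)


def f(x1, x2, t):
    return (-4.0 / ALPHA * math.sin(math.pi * t)
            - math.pi / ALPHA * q(x1, x2) * math.cos(math.pi * t))


def y_d(x1, x2, t):
    return (-(2 + 1.0 / ALPHA) * q(x1, x2) * math.sin(math.pi * t)
            + math.pi * b(x1, x2) * math.cos(math.pi * t))


def dt(g, x1, x2, t, e=1e-4):
    # Central difference approximation of the time derivative
    return (g(x1, x2, t + e) - g(x1, x2, t - e)) / (2 * e)


def laplace(g, x1, x2, t, e=1e-3):
    # Five-point approximation of the Laplacian in space
    return (g(x1 + e, x2, t) + g(x1 - e, x2, t) + g(x1, x2 + e, t)
            + g(x1, x2 - e, t) - 4 * g(x1, x2, t)) / e ** 2


def residuals(x1, x2, t):
    # State equation: y_t - Laplace(y) - f
    state = dt(y, x1, x2, t) - laplace(y, x1, x2, t) - f(x1, x2, t)
    # Assumed adjoint equation: -z_t - Laplace(z) - (y - y_d)
    adjoint = (-dt(z, x1, x2, t) - laplace(z, x1, x2, t)
               - (y(x1, x2, t) - y_d(x1, x2, t)))
    return state, adjoint


# Both residuals should be small, up to finite difference and rounding errors
print(residuals(0.3, 0.7, 0.4))
```

Since u and y are given by the same expression, the boundary condition \(y=u\) on \(\varSigma \) holds by construction and is not checked separately.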

First we consider the error with respect to the spatial mesh size h. We fix the time step at \(k= \frac{1}{8192}\) and present the errors of the optimal control u, the state y and the adjoint state z in Table 1. Then we consider the convergence order of the error with respect to the time step size k. We fix the spatial mesh with \(DOF= 22785\) and present the errors of the optimal control u, the state y and the adjoint state z in Table 2. We observe an order of convergence \(\frac{3}{2}\) for the control and order 2 for the state and adjoint state for the spatial discretization, and order 1 convergence for all three quantities for the time discretization. This is the best result we can expect for linear finite elements and dG(0) approximations.
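One common way to read off such convergence orders from a table of errors is to compare consecutive refinement levels: if the error behaves like \(e(h)\approx Ch^p\), then \(p\approx \log (e_1/e_2)/\log (h_1/h_2)\). The following minimal sketch illustrates this. The function name `observed_orders` is hypothetical, and the errors are synthetic values generated from \(0.5\,h^{3/2}\), not entries of Tables 1 or 2.

```python
import math


def observed_orders(steps, errors):
    """Observed convergence order between consecutive refinement levels."""
    if len(steps) != len(errors):
        raise ValueError("steps and errors must have the same length")
    orders = []
    for i in range(1, len(steps)):
        ratio_err = errors[i - 1] / errors[i]
        ratio_step = steps[i - 1] / steps[i]
        orders.append(math.log(ratio_err) / math.log(ratio_step))
    return orders


# Synthetic illustration only: errors that decay exactly like h^(3/2)
hs = [2.0 ** -j for j in range(2, 7)]
errs = [0.5 * h ** 1.5 for h in hs]
print(observed_orders(hs, errs))  # each entry is approximately 1.5
```

The same function applies to the time discretization if the step sizes k are passed in place of h.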

Table 1 Error of control u, state y and adjoint state z for Example 1 with fixed time step \(N=8192\)
Table 2 Error of control u, state y and adjoint state z for Example 1 with fixed mesh \(DOF=22785\)

Example 2

The second example is an unconstrained problem defined on a polygonal domain with maximum interior angle \(\omega _{max}={5\over 6}\pi \) (see [29]), so that the optimal solution may have only reduced regularity. The data is chosen as

$$\begin{aligned} y_d= \left\{ \begin{aligned} t^2g(x)\, \,&0\le t<0.5, \\ -t^2g(x) \, \,&0.5\le t \le 1 \end{aligned} \right. \end{aligned}$$

with \(f=1, \,g=\frac{1}{(x_1^2+x_2^2)^{\frac{1}{3}}}\).

No exact solution is available for this example. We take the solution computed with \(k=\frac{1}{4096}\) and \(DOF=158561\) in the spatial discretization as the reference solution. As in the previous example, we investigate the convergence order with respect to the spatial and time discretization separately. Although the assumption \(k=O(h^2)\) is not satisfied in this example, the analysis and numerical results in [30] and [31] suggest \(O(h^{1\over 2})\) convergence for the spatial discretization and \(O(k^{1\over 4})\) convergence for the time discretization in our case. In Table 3 we observe nearly \(O(h^{1\over 2})\) convergence for the spatial discretization of the control. The convergence order for the time discretization reported in Table 4 is higher than \(O(k^{1\over 4})\), which might be caused by a higher regularity of the control w.r.t. time, compare Remark 1 and [23, Th. 3.4]. Owing to the jump of \(y_d\) at \(t=0.5\), we may expect a loss of regularity w.r.t. time there, which in our opinion might have only a mild influence on the convergence order of the numerical computations.

Table 3 Error of control u, state y and adjoint state z for Example 2 with fixed time step \(N=4096\)
Table 4 Error of control u, state y and adjoint state z for Example 2 with fixed mesh \(DOF=158561\)
Table 5 Error of control u, state y and adjoint state z for Example 3 with fixed time step \(N=4096\) and full discretisation
Table 6 Error of control u, state y and adjoint state z for Example 3 with fixed time step \(N=4096\) and variational discretisation
Table 7 Error of control u, state y and adjoint state z for Example 3 with fixed mesh \(DOF=16641\) and full discretization

Example 3

This example is a control-constrained problem defined on a domain with smooth boundary (see [10]). The domain is the unit disc \(\varOmega =B(0,1)\) centered at the origin with radius 1, and \(T=1\). The data is presented in polar coordinates. We set

$$\begin{aligned} f(r,\theta ;t)= & {} -6r\max (0,\cos {\theta }\sin ^3(\pi t))-{\pi \over 2} \sin (\pi t)r^3\max (0,\cos ^3{\theta }),\\ y_d(r,\theta ;t)= & {} (7r^2\cos ^2{\theta }+6r^2-6r)\cos {\theta }\sin ^3(\pi t)+y(r,\theta ;t)\\&-{\pi \over 2} \sin (\pi t)r^3(r-1)\max (0,\cos ^3{\theta }), \end{aligned}$$

so that the optimal solution is given by

$$\begin{aligned} u(r,\theta ;t)= & {} \max (0,\cos ^3{\theta }\sin ^3(\pi t)),\\ y(r,\theta ;t)= & {} r^3\max (0,\cos ^3{\theta }\sin ^3(\pi t)),\\ z(r,\theta ;t)= & {} r^3(r-1)\cos ^3{\theta }\sin ^3(\pi t). \end{aligned}$$

We set \(\alpha =1\).

First we consider the error with respect to the spatial mesh size h. We fix the time step \(k= \frac{1}{4096}\) and present the error of the optimal control u, the state y and the adjoint state z in Table 5 with full discretisation, and in Table 6 with variational discretisation. As expected, both approaches deliver similar results. Then we consider the convergence order of the time error. We fix the spatial mesh with \(DOF= 16641\) and present the error of the optimal control u, the state y and the adjoint state z in Table 7. We observe higher order convergence w.r.t. the spatial discretization for both the control u and the state y.