A Priori Error Estimates for State-Constrained Semilinear Parabolic Optimal Control Problems

Ludovici, Francesco; Neitzel, Ira; Wollner, Winnifried

doi:10.1007/s10957-018-1311-8

Published: 19 June 2018

Volume 178, pages 317–348, (2018)

Journal of Optimization Theory and Applications

Abstract

We consider the finite element discretization of semilinear parabolic optimization problems subject to pointwise in time constraints on mean values of the state variable. In order to control the feasibility violation induced by the discretization, error estimates for the semilinear partial differential equation are derived. Based upon these estimates, it can be shown that any local minimizer of the semilinear parabolic optimization problems satisfying a weak second-order sufficient condition can be approximated by the discretized problem. Rates for this convergence in terms of temporal and spatial discretization mesh sizes are provided. In contrast to other results in numerical analysis of optimization problems subject to semilinear parabolic equations, the analysis can work with a weak second-order condition, requiring growth of the Lagrangian in critical directions only. The analysis can then be conducted relying solely on the resulting quadratic growth condition of the continuous problem, without the need for similar assumptions on the discrete or time semidiscrete setting.

1 Introduction

This paper is concerned with optimal control problems governed by semilinear parabolic partial differential equations (PDEs) subject to pointwise in time constraints on mean values in space of the solution to the PDE. We derive convergence rates for a space-time discretization of the problem based on conforming finite elements in space and a discontinuous Galerkin discretization method in time. The control variable is a vector-valued function depending on time, acting distributed in the domain. The inequality constraint on the solution of the PDE is imposed pointwise in time and averaged in space. The precise problem formulation will be given in the next section.

The class of problems under consideration is a simplified model motivated by applications in industrial processes, such as cooling or heating in steel manufacturing, and in biomathematics, such as tumor therapy. For an extended overview of possible applications, we refer, e.g., to [1, 2]. In many of these applications, the control variable depends on finitely many parameters that have a fixed spatial influence but vary in time. Further, especially in cooling processes and material optimization, bounds on the state variable and its derivatives are prescribed to avoid material failure and to preserve product quality.

Despite these interesting applications, there are only a few contributions on a priori error estimates for semilinear parabolic optimal control, even without state constraints. This can be explained by the rather low regularity properties of parabolic equations compared to the more often discussed case of elliptic problems. The difficulty already becomes apparent when considering convex linear-quadratic examples, and it becomes even more severe when nonconvex problems, arising from nonlinear state equations, need to be addressed. To start with the latter issue, it is well known that, in contrast to convex problems, the first-order necessary conditions are no longer sufficient for optimality; hence, second-order sufficient optimality conditions (SSCs) are of interest, also for a priori discretization error estimates. These, or rather the resulting quadratic growth condition, can for instance be used to prove finite element error estimates. When discussing SSCs, one is often interested in proving or using conditions with minimal gap to the second-order necessary optimality conditions, as these are more likely to hold.

One of the main aims and novelties of this paper is to use weak SSCs for the continuous problem, as derived in [3], in order to prove discretization error estimates. This is not straightforward if, at the same time, a clear separation of the influence of spatial and temporal discretization is desired. For elliptic state-constrained problems, it is known that the proof of convergence requires only the quadratic growth condition for the continuous problem. Once quadratic growth for the continuous problem is given, it does not matter whether it is due to weak or strong SSCs; see [4] for the elliptic case with pointwise state constraints.

In the parabolic setting, if one wants to achieve a clear separation of the spatial and temporal errors, the numerical analysis of both convex and nonconvex problems has previously been done in two steps by introducing an intermediate time-discrete problem, cf. [5,6,7,8] for convex and [9] for nonconvex problems. As a consequence of this approach, a quadratic growth condition has to hold for a time-discrete version of a nonconvex problem in order to prove an error estimate for the fully discrete problem. Rather than relying on an additional assumption for each time-discrete problem, SSCs can be transferred from the continuous to the semidiscrete level if one uses a rather strong SSC; see [9]. In contrast, weak SSCs have not been shown to be stable with respect to time discretization, and it is not clear at all whether this is possible without further assumptions.

In this paper, in favor of the more general weak SSCs, we will derive error estimates without the use of an intermediate auxiliary problem. Our main result is the error estimate of Theorem 5.3, namely an error estimate which coincides with the orders obtained for convex problems in [10].

Our technique allows us to derive an estimate for the error between the continuous and semidiscrete solution depending only on the time step size. The price we pay for not transferring the SSC to the time-discrete level is that proving an analogous error estimate between the time-discrete and fully discrete problem is a difficult open problem. Let us note that error estimates for the control immediately imply an error estimate of the same order for the state, due to Lipschitz properties of the control-to-state mapping. However, numerical experiments often suggest a better order of convergence for the state. Proving this can be rather involved and has only been done in special settings, cf. [11, 12], even in the case of elliptic equations.

As already mentioned, there are rather few contributions to the error analysis for problems governed by semilinear parabolic equations. Among them is [9], discussing a setting including bilateral control constraints. The authors also discussed several control discretization approaches. Error estimates were obtained in [13, 14] for a problem without control and state constraints. Overall, the literature on state-constrained semilinear elliptic problems is less sparse, and we refer the reader to [4, 15, 16] and the references therein.

The lack of results for semilinear parabolic problems in the presence of state constraints is also explained by the sparsity of results for the corresponding linear theory, also due to rather low regularity properties of parabolic equations. Only recently, error estimates for the maximal error in time and different norms in space were derived for a space-time discretization of a linear parabolic state equation in [10] and [7]. Indeed, such estimates are necessary for the consideration of constraints pointwise in time on the mean value of the state variable and its first derivative. We would also like to point out the new result [17], where a pointwise (quasi)-best-approximation result for the maximal space-time error for the discretization of linear parabolic PDEs has been derived.

Confining ourselves to the linear parabolic setting, error estimates for pointwise in time and space state constraints are derived in [18], while [19] is concerned with the variational discretization approach. For control constraints, we refer to [5, 6].

Let us end this introduction with some remarks concerning second-order optimality conditions for state-constrained parabolic optimal control problems, which have recently attracted attention. A well-written survey on the state of the art can be found in [20].

For the case at hand, we will rely on second-order sufficient conditions (SSCs) that were introduced in [3]. The authors, inspired by techniques from nonlinear optimization in finite-dimensional spaces, obtained SSCs that are very close to the necessary ones. Their analysis was limited to the one-dimensional case and has been extended in [1] to domains of arbitrary dimensions considering, as in our case, vector-valued control functions depending on time only. Due to the nature of the problem, the resulting cone of critical directions can also be obtained from the theory of semi-infinite optimization [21].

Seminal papers for the theory of SSCs in the presence of integral state constraints are [22, 23]. The former deals with boundary controls and handles the state constraints using Ekeland’s principle. The latter considers a nonlinearity in the boundary conditions and uses concepts of semigroup theory to cope with the limitations on the dimension of the domain. More recently, [24] has overcome the limitation in the dimension using concepts of maximal parabolic regularity. For other contributions to the theory of SSCs in the presence of state constraints, we refer to [25,26,27]. Parts of this manuscript are considered in the PhD thesis of Francesco Ludovici [28].

The paper is organized as follows: in Sect. 2, we give a precise definition of the model problem sketched in (1), introduce the operators and functionals involved in the analysis, and state first- and second-order optimality conditions. Section 3 is devoted to the time and space discretization of the problem. After these preliminary sections, collecting ‘essentially known’ results, we derive new a priori estimates for the discretization error between the solution of the continuous, semidiscrete and discrete state equation, extending techniques from [10] for linear parabolic problems to the semilinear case at hand, in Sect. 4. The core of the paper is Sect. 5. Extending techniques for the elliptic case presented in [4] to the parabolic case, we derive the rate of convergence for the optimal control problem.

2 Problem Formulation, Assumptions, and Analytic Setting

In this section, we introduce our model problem, discuss its precise analytic setting, introduce the main assumptions, and fix the notation. Extending the result of [10] to the case of semilinear parabolic PDEs, we consider, for a time interval $ I=]0,T[ $ and a convex bounded domain $ \varOmega \subset {\mathbb {R}}^{2} $ with boundary $ \partial \varOmega $, the following problem

$$\begin{aligned} \min \; J(q,u):=\frac{1}{2}\int _{I}\int _{\varOmega }(u(t,x)-u_{\mathrm{d}}(t,x))^{2}\,\mathrm {d}x\,\mathrm {d}t+\frac{\alpha }{2}\int _{I}q(t)^Tq(t)\,\mathrm {d}t, \end{aligned}$$

(1a)

where the state u(t, x) and the control $q(t)=(q_{i}(t))_{i=1}^m$ are coupled by the semilinear parabolic PDE

$$\begin{aligned} \partial _{t}u(t,x)-\varDelta u(t,x) +d(t,x,u(t,x))&=\sum \limits _{i=1}^{m}q_{i}(t)g_{i}(x),\qquad \,\text {in} \;\,I \times \varOmega , \nonumber \\ u(t,x)&= 0,\qquad \qquad \qquad \,\text {on} \,I\times \partial \varOmega ,\nonumber \\ u(0,x)&= u_{0},\qquad \qquad \qquad \text {in}\;\lbrace 0 \rbrace \times \varOmega , \end{aligned}$$

(1b)

with a monotone and smooth nonlinearity d. Further, we consider control constraints

$$\begin{aligned} q_{\min }\le q(t)&\le \, q_{\max }, \qquad \text {a.e. in}\,I, \end{aligned}$$

(1c)

and, for a given weighting function $ \omega (x) $, pointwise in time state constraints

$$\begin{aligned} \int _{\varOmega }u(t,x)\omega (x)\mathrm {d}x&\le 0, \qquad \forall t \in [0,T]. \end{aligned}$$

(1d)

For $ i = 1,\ldots ,m $, we consider controls $ q_{i}\! \in \! L^{2}(I) $ and fixed functions $ g_{i} \!\in \! L^{\infty }(\varOmega ) $. We assume that the desired state satisfies $ u_{\mathrm{d}} \in L^{2}(I \times \varOmega ) $ and the initial data satisfies $ u_{0}\in H^{1}_{0}(\varOmega )\cap H^{2}(\varOmega )\hookrightarrow C({\text {cl}}(\varOmega )) $.

In the following, we set $ V:=H^{1}_{0}(\varOmega ) $, $ H:=L^{2}(\varOmega ) $; $ (\cdot ,\cdot )_{I} $ denotes the standard inner product in $ L^{2}(I,H) $, i.e., $ (\cdot ,\cdot )_{I} = \int _{I}(\cdot ,\cdot )\mathrm {d}t $ with associated norm $ \Vert \cdot \Vert _{I} $, while $ (\cdot , \cdot ) $ and $ \Vert \cdot \Vert $ are used for the inner product and norm of $ L^{2}(\varOmega ) $. The state constraint is denoted by $ F(u):=(u, \omega ) $, where $ \omega \in L^{\infty }(\varOmega ) $ is a weighting function. Throughout the paper, c will denote a generic constant, independent of the discretization parameters, that may take different values at each appearance.

With appropriate discretization of the problem, we will derive our main result, Theorem 5.3, namely the convergence

$$\begin{aligned} \Vert \bar{q}-\bar{q}_{kh}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}\le c\Bigg (k^\frac{1}{2}\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{4}}+h\Big (\log \frac{T}{k}+1\Big )^\frac{1}{2}\Bigg ) \end{aligned}$$

between a locally optimal control $\bar{q}$ of the model problem, and its discrete counterpart $\bar{q}_{kh}$. The semidiscrete analogue

$$\begin{aligned} \Vert \bar{q}-\bar{q}_{k}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \le ck^\frac{1}{2}\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{4}}, \end{aligned}$$

depending only on the time step size k, could be shown with the same technique. As mentioned in the introduction, the price we pay for not transferring the SSC to the time-discrete level is that proving an analogous error estimate between the time-discrete and fully discrete problem, i.e.,

$$\begin{aligned} \Vert \bar{q}_k-\bar{q}_{kh}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \le ch\Big (\log \frac{T}{k}+1\Big )^\frac{1}{2} \end{aligned}$$

is delicate, as it would require an SSC for the time-discrete problem, where all constants need to be independent of the time discretization.
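To get a feeling for how the two parts of the estimate interact, the right-hand sides above can be evaluated numerically. The following is only a sketch: the function names are hypothetical, and the generic constant c is set to 1 as a placeholder, since its actual value is not known. It illustrates that the coupling $ k = h^{2} $ balances the temporal and spatial contributions up to logarithmic factors.

```python
import math


def log_factor(k, T=1.0):
    """Return log(T/k) + 1 for a time step 0 < k < T."""
    if not 0.0 < k < T:
        raise ValueError("the estimate assumes 0 < k < T")
    return math.log(T / k) + 1.0


def temporal_rate(k, T=1.0):
    """k^(1/2) * (log(T/k) + 1)^(1/4), the semidiscrete part of the bound."""
    return math.sqrt(k) * log_factor(k, T) ** 0.25


def spatial_rate(k, h, T=1.0):
    """h * (log(T/k) + 1)^(1/2), the spatial part of the bound."""
    return h * math.sqrt(log_factor(k, T))


def control_error_bound(k, h, T=1.0, c=1.0):
    """Right-hand side of the fully discrete estimate; c = 1 is a placeholder."""
    return c * (temporal_rate(k, T) + spatial_rate(k, h, T))


# Coupling k = h^2 balances both parts up to logarithmic factors.
for level in range(2, 6):
    h = 2.0 ** (-level)
    k = h * h
    print(f"h={h:.4f}  k={k:.6f}  bound={control_error_bound(k, h):.4f}")
```

The printed values only describe the shape of the bound, not an observed error; any measured error would additionally depend on the unknown constant c.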

Before discussing the problem in detail, we impose the following usual assumptions on the nonlinearity; see, e.g., [29, Chapter 5, Assumption 5.6].

Assumption 2.1

The nonlinearity $ d :I \times \varOmega \times {\mathbb {R}}\rightarrow {\mathbb {R}}$, $ (t,x,u) \mapsto d(t,x,u) $, is assumed to satisfy the following:

(i):

For all $u \in {\mathbb {R}}$, the nonlinearity is measurable with respect to $ (t, x) \in I \times \varOmega $. Further, for almost every $ (t, x) \in I \times \varOmega $ it is twice continuously differentiable with respect to u.

(ii):

For $ u = 0 $, there is $ c > 0 $ such that d(t, x, u) satisfies, together with its derivatives up to order two, the boundedness condition

$$\begin{aligned} \Vert d(\cdot ,\cdot ,0)\Vert _{L^\infty (I \times \varOmega )}+\Vert \partial _{u} d(\cdot ,\cdot ,0)\Vert _{L^\infty (I \times \varOmega )} + \Vert \partial _{u}^2d(\cdot ,\cdot ,0)\Vert _{L^\infty (I \times \varOmega )} \le c. \end{aligned}$$

Further, each of these derivatives satisfies a local Lipschitz condition with respect to u, i.e., for any $ M>0 $ there exists a constant $ L(M)>0 $ such that for any $|u_{j}| \le M$, $j=1,2$, there holds

$$\begin{aligned} \Vert \partial _{u}^id(\cdot , \cdot , u_{1})-\partial _{u}^id(\cdot , \cdot , u_{2})\Vert _{L^\infty (I \times \varOmega )} \le L(M)\vert u_{1} - u_{2} \vert , \end{aligned}$$

for every $i = 0,1,2$.

(iii):

For all $ u \in {\mathbb {R}}$ and for almost every $ (t, x) \in I \times \varOmega $, there holds the monotonicity condition

$$\begin{aligned} \partial _{u}d(t, x, u) \ge 0. \end{aligned}$$
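As an illustration of Assumption 2.1, consider the hypothetical choice $ d(u) = u^{3} $, which is not taken from the text but is a common model nonlinearity. The sketch below samples the interval $ [-M, M] $ and checks monotonicity as well as the local Lipschitz bounds for $ d $, $ \partial _{u}d $, and $ \partial _{u}^{2}d $; the expected constants are $ 3M^{2} $, $ 6M $, and 6, respectively. All function names are chosen for this example only.

```python
def d(u):
    """Hypothetical model nonlinearity d(u) = u^3 (independent of t and x)."""
    return u ** 3


def du(u):
    """First derivative of d with respect to u."""
    return 3.0 * u * u


def duu(u):
    """Second derivative of d with respect to u."""
    return 6.0 * u


def sample_points(M, samples=201):
    """Equidistant sample points in [-M, M]."""
    return [-M + 2.0 * M * i / (samples - 1) for i in range(samples)]


def lipschitz_estimate(f, M, samples=201):
    """Largest difference quotient of f over neighbouring sample points."""
    pts = sample_points(M, samples)
    return max(abs(f(b) - f(a)) / (b - a) for a, b in zip(pts, pts[1:]))


def is_monotone(M, samples=201):
    """Check the monotonicity condition du >= 0 on the sample points."""
    return all(du(u) >= 0.0 for u in sample_points(M, samples))


for M in (1.0, 2.0, 4.0):
    print(M, is_monotone(M),
          lipschitz_estimate(d, M),
          lipschitz_estimate(du, M),
          lipschitz_estimate(duu, M))
```

The estimates grow with M, which is why the assumption only asks for a local Lipschitz condition with a constant $ L(M) $.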

When no confusion arises, we shorten the notation for the semilinearity from $ d(\cdot , \cdot , u) $ to d(u) . We now focus on the well-posedness of the state equation (1b). We introduce the Hilbert space

$$\begin{aligned} W(0,T)= \lbrace u \in L^{2}(I, V):\partial _{t} u \in L^{2}(I, V^{*}) \rbrace , \end{aligned}$$

and the space of admissible controls

$$\begin{aligned} Q_{\mathrm{ad}}= \left\{ q \in L^{2}(I, {\mathbb {R}}^{m}):q_{\min }\le q(t) \le q_{\max },\, \text {a.e. in}\,I \right\} , \end{aligned}$$

with $ q_{\min } < q_{\max } \in \mathbb {R}^{m} $.

Denoting with $ V^{*} $ the dual space of V, we recall that the triplet

$$\begin{aligned} V \hookrightarrow H \hookrightarrow V^{*} \end{aligned}$$

forms a Gelfand triple. Then, for $ u, \varphi \in W(0,T) $, we define a bilinear form

$$\begin{aligned} b(u, \varphi ) := (\partial _{t}u, \varphi )_{I}+(\nabla u, \nabla \varphi )_{I}+(u(0),\varphi (0)) \end{aligned}$$

and the weak formulation of (1b) reads: for given $ q \in L^{2}(I,{\mathbb {R}}^{m})$ and initial data $ u_{0} \in V \cap H^{2}(\varOmega ) \hookrightarrow C({\text {cl}}(\varOmega )) $, find $ u \in W(0,T) $ satisfying

$$\begin{aligned} b(u, \varphi ) + (d(\cdot ,\cdot ,u),\varphi )_{I} = (qg,\varphi )_{I} + (u_{0},\varphi (0)),\,\,\forall \varphi \in W(0,T). \end{aligned}$$

(2)

It is well known that the PDE (2) admits a unique solution $ u \in W(0,T)$ satisfying the additional regularity $ u \in C({\text {cl}}(I) \times {\text {cl}}(\varOmega )) $; see, e.g., [29, Theorem 5.5]. Further, thanks to the monotonicity assumption on $ d(\cdot ,\cdot ,u) $, the solution u of (2) satisfies the additional regularity

$$\begin{aligned} u \in L^{2}(I,V \cap H^{2}(\varOmega ))\cap H^{1}(I,H)\hookrightarrow L^{\infty }(I\times \varOmega ), \end{aligned}$$

and the following stability estimates hold, cf., [9, Proposition 2.1] for a weak formulation with explicit initial condition which is equivalent to the form considered here, justifying the use of the $L^2$ inner product in the notation of (2).

Proposition 2.1

Let $ u \in W(0,T) $ be the solution of (2) for given data q, g, $u_0$, and d. Then, there holds

$$\begin{aligned} \begin{aligned}&\Vert u\Vert _{L^{\infty }(I\times \varOmega )}\le c\big (\Vert qg\Vert _{L^{\infty }(I\times \varOmega )} +\Vert u_{0}\Vert _{L^{\infty }(\varOmega )}+\Vert d(0)\Vert _{L^{\infty }(I\times \varOmega )}\big ),\\&\Vert u\Vert _{L^{2}(I,V\cap H^2(\varOmega ))}+\Vert u\Vert _{L^{\infty }(I,V)}+\Vert \partial _{t}u\Vert _{I}\le c\big (\Vert qg\Vert _{I}+\Vert u_{0}\Vert _{V}+\Vert d(0)\Vert _{I} \big ). \end{aligned} \end{aligned}$$

Remark 2.1

We observe that the regularity of $ u \in W(0,T) $ is enough to treat the state constraint. Indeed, there holds the embedding $ W(0,T) \hookrightarrow C(I, H) $ and we have $ F :W(0,T)\rightarrow C({\text {cl}}(I)) $, where

$$\begin{aligned} F(u)(t) := \int _\varOmega u(t,x)\omega (x)\mathrm {d}x. \end{aligned}$$

On the other hand, we require more regularity for the solution of (2), because stability estimates in the norms of $ L^{\infty }(I \times \varOmega )$, $L^{\infty }(I,H) $ will come into play to ensure Lipschitz continuity for the control-to-state map. Further, we note that $ u_{0} \in V \cap C({\text {cl}}(\varOmega )) $ is enough to ensure well-posedness of the problem. The assumption $ u_{0} \in H^{2}(\varOmega ) $ is posed to use results from [6, 10], where this regularity is required to fully exploit the approximation property of the discontinuous Galerkin method.
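The state constraint functional $ F(u)(t) $ is a weighted spatial mean and can be approximated by any quadrature rule. The following sketch uses the midpoint rule on the unit square; the domain $ \varOmega = (0,1)^{2} $, the state $ u(t,x,y) = (t-\tfrac{1}{2})\sin (\pi x)\sin (\pi y) $, and the weight $ \omega \equiv 1 $ are assumptions made purely for illustration. For this choice, $ F(u)(t) = 4(t-\tfrac{1}{2})/\pi ^{2} $, so the constraint (1d) holds for $ t \le \tfrac{1}{2} $ and is violated afterwards.

```python
import math


def mean_value_constraint(u, omega, t, n=64):
    """Midpoint-rule approximation of F(u)(t) on the unit square (0,1)^2."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hx
            total += u(t, x, y) * omega(x, y)
    return total * hx * hx


def max_violation(u, omega, times, n=64):
    """Largest positive part of F(u)(t) over the given time points."""
    return max(max(mean_value_constraint(u, omega, t, n), 0.0) for t in times)


def u_example(t, x, y):
    """Hypothetical state used only for this illustration."""
    return (t - 0.5) * math.sin(math.pi * x) * math.sin(math.pi * y)


def omega_example(x, y):
    """Hypothetical weighting function, constant one."""
    return 1.0


for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, mean_value_constraint(u_example, omega_example, t))
```

Checking the constraint only at finitely many time points, as `max_violation` does, is a simplification; the constraint in (1d) is required for all $ t \in [0,T] $.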

Thanks to (1c), we can regard the control variable q as an element of $ L^{\infty }(I, {\mathbb {R}}^{m}) $. Then, the following definitions are justified. We introduce the control-to-state map

$$\begin{aligned} S :L^{\infty }(I, {\mathbb {R}}^{m}) \rightarrow W(0,T)\cap C(I \times \varOmega ), \end{aligned}$$

associating with any given q the solution $ u(q):=S(q) $ of (2). We denote the concatenation of the control-to-state map and the state constraint F by

$$\begin{aligned} G = (F \circ S) :L^{\infty }(I, \mathbb {R}^{m})\rightarrow C({\text {cl}}(I)) . \end{aligned}$$

In the subsequent analysis, we will need G to be of class $ C^{2} $. This is indeed the case; see [3].

In order to formulate the optimal control problem in reduced form, we introduce the set of feasible controls

$$\begin{aligned} Q_{\mathrm{feas}} := \lbrace q \in Q_{\text {ad}} :G(q) \le 0\rbrace . \end{aligned}$$

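Then, (1) reads

$$\begin{aligned} \min \; j(q) := J(q,S(q)) \qquad \text {subject to}\,\, q \in Q_{\mathrm{feas}}. \end{aligned}$$

($\mathbb {P}$)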

We remark that the problem at hand is nonconvex, due to the presence of the nonlinear term in the state equation. As a consequence, it is suitable to consider local solutions, as defined below.

Definition 2.1

A control $ \bar{q} \in Q_{\mathrm{feas}} $ is a local solution, in the sense of $ L^{2}(I,{\mathbb {R}}^{m}) $, if there exists some $ \epsilon > 0 $ such that there holds

$$\begin{aligned} j(\bar{q}) \le j(q) \end{aligned}$$

for all $ q \in Q_{\mathrm{feas}} $ with $ \Vert q-\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \le \epsilon $.

The existence of a local solution follows by standard arguments; see, e.g., [29, Theorem 5.7].

Proposition 2.2

Assuming the existence of a feasible point, problem ($\mathbb {P}$) admits at least one solution

$$\begin{aligned} (\bar{q}, \bar{u}) \in L^{\infty }(I, \mathbb {R}^{m}) \times \big (W(0,T)\cap C({\text {cl}}(I) \times {\text {cl}}(\varOmega ))\cap H^1(I,H)\cap L^2(I,H^2(\varOmega ))\big ), \end{aligned}$$

where $ \bar{u} = S(\bar{q}) $.

We conclude the section with well-known differentiability properties of the operators and functionals involved in the analysis, referring to [29, Chapter 5] for details.

Lemma 2.1

The map $ S :L^{\infty }(I, {\mathbb {R}}^{m}) \rightarrow W(0,T)\cap L^{\infty }(I \times \varOmega ) $ is of class $ C^{2} $ from $ L^{\infty }(I, \mathbb {R}^{m})$ to W(0, T) . For $ p \in L^{\infty }(I, \mathbb {R}^{m})$ its first derivative $S^{'}(q)p =: v_{p}$, in direction p, is the solution of

$$\begin{aligned} b(v_{p}, \varphi ) + (\partial _{u}d(\cdot ,\cdot ,u(q))v_{p}, \varphi )_{I} = (pg, \varphi )_{I}, \qquad \forall \;\varphi \in W(0,T), \end{aligned}$$

(3)

with zero-initial condition. For $ p_{1}, p_{2} \in L^{\infty }(I, \mathbb {R}^{m})$ the second derivative $ S^{''} (q)p_{1}p_{2} =: v_{p_{1}p_{2}} $, in the directions $p_1, p_2$, solves

$$\begin{aligned} b(v_{p_{1}p_{2}},\varphi )+(\partial _{u}d(\cdot ,\cdot ,u_{q})v_{p_{1}p_{2}}, \varphi )_{I}=-(\partial _{uu}d(\cdot ,\cdot ,u_{q})v_{p_{1}}v_{p_{2}}, \varphi )_{I}, \end{aligned}$$

for all $\varphi \in W(0,T)$, again with zero-initial condition, where $v_{p_{1}}, v_{p_{2}}$ are given by (3).
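The structure of the linearized equation (3) can be checked numerically in a strongly simplified setting. The sketch below replaces the PDE by the scalar ordinary differential equation $ u' + u^{3} = q $, $ u(0) = u_{0} $, discretized by the implicit Euler method; this spatially trivial analogue, the nonlinearity $ d(u) = u^{3} $, and all function names are assumptions for illustration and not part of the analysis above. The tangent $ v_{p} $ computed from the linearized scheme is compared with a finite difference quotient of the discrete control-to-state map.

```python
import math


def solve_state(q, k, u0=0.0):
    """Implicit Euler for u' + u^3 = q, one Newton solve per time step."""
    u, states = u0, []
    for qn in q:
        rhs = u + k * qn
        w = u  # Newton start value: previous time level
        for _ in range(50):
            residual = w + k * w ** 3 - rhs
            if abs(residual) < 1e-15:
                break
            w -= residual / (1.0 + 3.0 * k * w * w)
        u = w
        states.append(u)
    return states


def solve_linearized(states, p, k):
    """Implicit Euler for v' + 3 u^2 v = p with v(0) = 0, cf. (3)."""
    v, tangent = 0.0, []
    for un, pn in zip(states, p):
        v = (v + k * pn) / (1.0 + 3.0 * k * un * un)
        tangent.append(v)
    return tangent


N, T = 100, 1.0
k = T / N
times = [(n + 1) * k for n in range(N)]
q = [1.0 + math.sin(2.0 * math.pi * t) for t in times]
p = [math.cos(t) for t in times]
eps = 1e-6

u = solve_state(q, k)
u_eps = solve_state([qn + eps * pn for qn, pn in zip(q, p)], k)
v = solve_linearized(u, p, k)
fd_error = max(abs((ue - un) / eps - vn) for ue, un, vn in zip(u_eps, u, v))
print(f"max |finite difference - tangent| = {fd_error:.2e}")
```

The linearized scheme is the exact derivative of the discrete state equation, so the remaining discrepancy is expected to be of the order of the finite difference step.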

For S and its first derivative, the following Lipschitz properties hold.

Lemma 2.2

For $p, q_{1}, q_{2} \in Q_\mathrm{ad}$, there exists a constant $ c > 0 $ such that

$$\begin{aligned} \Vert S(q_{1})-S(q_{2})\Vert _{L^{\infty }(I,V)}&\le c \Vert q_{1}-q_{2}\Vert _{L^{2}(I, {\mathbb {R}}^{m})}, \end{aligned}$$

(4a)

$$\begin{aligned} \Vert S^{'}(q_{1})p-S^{'}(q_{2})p\Vert _{I}&\le c \Vert q_{1}-q_{2}\Vert _{L^{2}(I, {\mathbb {R}}^{m})}\Vert p\Vert _{L^{2}(I, {\mathbb {R}}^{m})}, \end{aligned}$$

(4b)

$$\begin{aligned} \Vert S^{'}(q_{1})p-S^{'}(q_{2})p\Vert _{L^{\infty }(I,H)}&\le c \Vert q_{1}-q_{2}\Vert _{L^{2}(I, {\mathbb {R}}^{m})}\Vert p\Vert _{L^{2}(I, {\mathbb {R}}^{m})}. \end{aligned}$$

(4c)

Proof

The claim of (4a)–(4b) is given in [9, Lemma 2.3]. To show (4c), we consider $ \xi :=S^{'}(q_{1})p-S^{'}(q_{2})p $ and define $ \tilde{u}:=S^{'}(q_{2})p $. We note that, for any $ \varphi \in W(0,T)$, $ \xi $ fulfills

$$\begin{aligned} b(\xi , \varphi )+\big (\partial _{u}d(u(q_{1}))\xi ,\varphi \big )_{I}= -\big (\partial _{u}d(u(q_{1}))\tilde{u}-\partial _{u}d(u(q_{2}))\tilde{u} ,\varphi \big )_{I}. \end{aligned}$$

(5)

Clearly, due to the boundedness of $ \partial _{u}d(\cdot ) $, for $ S^{'}(q)p $ there hold the same stability estimates as for S(q) , compare with Lemma 2.1. Then, by means of such a stability estimate in $ L^{\infty }(I,H) $ in combination with the Lipschitz continuity of $ \partial _{u}d(\cdot ) $, we obtain

$$\begin{aligned} \Vert \xi \Vert _{L^{\infty }(I,H)}&\le c\left\| \big (\partial _{u}d(u(q_{1}))-\partial _{u}d(u(q_{2}))\big )\tilde{u}\right\| _{I}\\&\le c\left\| u(q_{1})-u(q_{2})\right\| _{L^{4}(I\times \varOmega )}\left\| \tilde{u}\right\| _{L^{4}(I\times \varOmega )}\\&\le c\left\| u(q_{1})-u(q_{2})\right\| _{L^{\infty }(I, V)}\left\| \tilde{u}\right\| _{L^{\infty }(I,V)}\\&\le c\Vert q_{1}-q_{2}\Vert _{I}\Vert p\Vert _{I}, \end{aligned}$$

where we used the embedding $ L^{\infty }(I,V) \hookrightarrow L^{4}(I\times \varOmega ) $. $\square $

Corollary 2.1

The functional $ j(q):L^{\infty }(I,{\mathbb {R}}^{m}) \rightarrow {\mathbb {R}}$ is of class $ C^{2} $ in the topology of $ L^{\infty }(I, \mathbb {R}^{m})$ and for $ q, p, p_{1}, p_{2} \in L^{\infty }(I,{\mathbb {R}}^{m}) $ there holds

$$\begin{aligned} j^{'}(q)(p)&= \int _{I}\sum _{i=1}^{m}\Big (\alpha q_{i}(t)+\int _{\varOmega }z_{0}(q)(t,x)g_{i}(x)\,\mathrm {d}x\Big )p_{i}(t)\,\mathrm {d}t, \\ j^{''}(q)p_{1}p_{2}&= \int _{I}\int _{\varOmega }\big (v_{p_{1}}v_{p_{2}}-z_{0}(q)\partial _{uu}d(t,x,u(q))v_{p_{1}}v_{p_{2}}\big )\,\mathrm {d}x\,\mathrm {d}t+\alpha \int _{I}p_{1}(t)^Tp_{2}(t)\,\mathrm {d}t, \end{aligned}$$

where $ z_{0}(q) \in W(0,T) $ is the adjoint state associated with q and j, defined, for all $ \varphi \in W(0,T) $, as the unique solution of

$$\begin{aligned} b(\varphi , z) + (\partial _{u}d(\cdot ,\cdot ,u(q))z, \varphi )_{I} = (u_{q}-u_{\mathrm{d}}, \varphi )_{I}, \end{aligned}$$

(6)

and $ v_{p_{i}}$, $i=1,2 $, is defined as in (3).
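The adjoint representation of the first derivative can be illustrated in the same simplified scalar setting, i.e., for $ u' + u^{3} = q $ with implicit Euler time stepping and $ g \equiv 1 $; this analogue, the data, and all names below are assumptions for illustration only. The discrete adjoint runs backward in time, mirroring (6), and the resulting directional derivative $ \sum _{n} k(\alpha q_{n} + z_{n})p_{n} $ is compared with a central difference quotient of the discrete reduced cost.

```python
import math


def solve_state(q, k, u0=0.0):
    """Implicit Euler for u' + u^3 = q, one Newton solve per time step."""
    u, states = u0, []
    for qn in q:
        rhs = u + k * qn
        w = u
        for _ in range(50):
            residual = w + k * w ** 3 - rhs
            if abs(residual) < 1e-15:
                break
            w -= residual / (1.0 + 3.0 * k * w * w)
        u = w
        states.append(u)
    return states


def reduced_cost(q, ud, k, alpha, u0=0.0):
    """Discrete analogue of j(q): tracking term plus control cost."""
    u = solve_state(q, k, u0)
    tracking = sum(0.5 * k * (un - dn) ** 2 for un, dn in zip(u, ud))
    return tracking + sum(0.5 * alpha * k * qn * qn for qn in q)


def adjoint_gradient(q, ud, k, alpha, u0=0.0):
    """Return alpha*q_n + z_n, with z from the backward adjoint recursion."""
    u = solve_state(q, k, u0)
    z, grad = 0.0, [0.0] * len(q)
    for n in reversed(range(len(q))):
        z = (z + k * (u[n] - ud[n])) / (1.0 + 3.0 * k * u[n] ** 2)
        grad[n] = alpha * q[n] + z
    return grad


N, T, alpha = 50, 1.0, 0.1
k = T / N
times = [(n + 1) * k for n in range(N)]
q = [1.0 for _ in times]
ud = [0.5 for _ in times]
p = [math.sin(math.pi * t) for t in times]

grad = adjoint_gradient(q, ud, k, alpha)
derivative_adjoint = sum(k * gn * pn for gn, pn in zip(grad, p))

eps = 1e-5
q_plus = [qn + eps * pn for qn, pn in zip(q, p)]
q_minus = [qn - eps * pn for qn, pn in zip(q, p)]
derivative_fd = (reduced_cost(q_plus, ud, k, alpha)
                 - reduced_cost(q_minus, ud, k, alpha)) / (2.0 * eps)
print(derivative_adjoint, derivative_fd)
```

One adjoint solve yields the derivative in every direction p, whereas the difference quotient needs two state solves per direction.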

Remark 2.2

As observed in [29, Section 5.7.4], when the control appears quadratically in the cost functional and linearly in the state equation, then the reduced cost functional is of class $ C^{2} $ not only in $ L^{\infty }(I, {\mathbb {R}}^{m}) $ but also in $ L^{2} (I,{\mathbb {R}}^{m}) $; see also [3, Remark 2.8]. In particular, this allows the introduction of a quadratic growth condition without two-norm discrepancy as it is stated later in Theorem 2.5.

2.1 Optimality Conditions

In this section, we discuss the optimality conditions for our optimal control problem. In a first step, we state standard first-order necessary conditions in KKT form.

For the rest of the paper, we rely on the following linearized Slater’s regularity condition.

Assumption 2.2

Given a local solution $ \bar{q} $ of ($\mathbb {P}$), we assume the existence of $ q_{\gamma } \in Q_{\mathrm{ad}} $ such that

$$\begin{aligned} G(\bar{q}) + G^{'}(\bar{q})(q_{\gamma }-\bar{q}) \le -\gamma < 0, \end{aligned}$$

(7)

for some $ \gamma \in \mathbb {R}_{+} $.

Based on the Slater condition, we obtain first-order necessary optimality conditions in KKT form; see, e.g., [25].

Theorem 2.3

Let $ \bar{q} \in Q_{\mathrm{feas}} $ be a local solution of ($\mathbb {P}$) such that Assumption 2.2 is satisfied, and let $ \bar{u} $ be the associated state. Then, there exists a Lagrange multiplier $ \bar{\mu } \in C({\text {cl}}(I))^{*} $ and an adjoint state $ \bar{z} \in L^{2}(I, V) $ such that

$$\begin{aligned} \begin{aligned}&b(\bar{u},\varphi )+(d(\cdot ,\cdot ,\bar{u}), \varphi )_{I} = (\bar{q}g,\varphi )_{I} + (u_{0}, \varphi (0)),&\forall \varphi \in W(0,T),\\&b(\varphi ,\bar{z})+(\varphi , \partial _{u}d(\cdot ,\cdot ,\bar{u})\bar{z})_{I} = (\bar{u}-u_{\mathrm{d}},\varphi )_{I} + \langle \bar{\mu }, F(\varphi ) \rangle ,&\forall \varphi \in W(0,T), \\&\alpha (\bar{q}, q-\bar{q})_{L^{2}(I)} + (\bar{z},(q-\bar{q})g)_{I} \ge 0,&\forall q \in Q_{\mathrm{ad}}, \\&\langle F(\bar{u}),\bar{\mu } \rangle = 0,\,\, \bar{\mu } \ge 0,\,\,F(\bar{u})\le 0, \end{aligned} \end{aligned}$$

(8)

where $\langle \cdot , \cdot \rangle $ denotes the duality pairing between $ C({\text {cl}}(I)) $ and $ C({\text {cl}}(I))^{*} $.

Since the problem at hand is nonconvex, we introduce second-order sufficient conditions. The following results can be obtained by combining several ideas from the literature. As it is not the purpose of this paper to show these second-order conditions, we will refrain from providing the lengthy details and refer to [28]. We only remark that the SSCs stated here can be obtained using the approach developed in [30] for semilinear elliptic problems. Their analysis was extended to semilinear parabolic problems in [3], for the one-dimensional case. In higher dimensions, the control-to-state map is, in general, not twice continuously differentiable from $ L^{2}(I \times \varOmega ) $ to $C({\text {cl}}(I) \times {\text {cl}}(\varOmega ))$. This restriction to one space dimension has been circumvented in [1], considering, as in this paper, controls depending on time only.

To discuss SSCs, we introduce the Hamiltonian $H :{\mathbb {R}}\times \varOmega \times {\mathbb {R}}\times {\mathbb {R}}\times {\mathbb {R}}\rightarrow {\mathbb {R}}$ given by

$$\begin{aligned} H(q,u,z) = H(t,x,q,u,z) = \frac{1}{2}(u-u_{\mathrm{d}})^{2}+\frac{\alpha }{2}q^{2}+z\left( \sum _{i=1}^{m}q_{i}g_{i}-d(u)\right) , \end{aligned}$$

suppressing the first two arguments t, x in the exposition. Moreover, the reduced Lagrangian function is given by

$$\begin{aligned} \mathcal {L}(q,\mu )= j(q) + \langle \mu , G(q) \rangle . \end{aligned}$$

Remark 2.3

For better readability, at each $ (t,x) \in I\times \varOmega $, we denote by $ \bar{H}, \bar{\mathcal {L}} $ the Hamiltonian and Lagrangian function, when evaluated at $ (\bar{q}, \bar{u}, \bar{z}) $. We note that $\frac{\partial H}{\partial q}, \frac{\partial ^{2} H}{\partial q^{2}} $ are, respectively, an $ {\mathbb {R}}^{m} $-vector and an $ {\mathbb {R}}^{m\times m} $-matrix. When referring to the ith component and the (i, j)-entry, we abbreviate $ \partial _{q}H_{i}, \partial _{q}^{2}H_{i,j} $, respectively.

We now give the cone of critical directions associated with $ \bar{q} \in Q_{\mathrm{feas}} $, following [3]. Introducing the conditions

$$\begin{aligned}&p_{i}(t) = \left\{ \begin{array}{l l l} \ge 0, &{} \quad \text {if } \bar{q}_{i}=q_{\min } ,\\ \le 0, &{} \quad \text {if } \bar{q}_{i}=q_{\max } ,\\ =0, &{} \quad \text {if } \int _{\varOmega }\partial _q\bar{H}_{i}\,\mathrm {d}x \not = 0 , \end{array} \right. \text {for all}\,\, i=1,\ldots ,m \end{aligned}$$

(9)

$$\begin{aligned}&F(v_{p})=\frac{\partial F}{\partial u}(\bar{u})v_{p} \le 0 \,\,\text {if}\,\, F(\bar{u}) =0, \end{aligned}$$

(10)

$$\begin{aligned}&\int _{{\text {cl}}(I)}F(v_{p})\,\mathrm {d}\bar{\mu }=0, \end{aligned}$$

(11)

where $ v_{p} $ is defined by (3), the cone of critical directions is given by

$$\begin{aligned} C_{\bar{q}} = \{ p \in L^{2}(I, {\mathbb {R}}^{m}):p\,\,\text {satisfies }(9),~(10),~(11)\}. \end{aligned}$$

(12)

After this preparation, we postulate the following second-order sufficient condition.

Assumption 2.4

Let $ \bar{q} \in Q_{\mathrm{feas}} $ fulfill, together with the associated state $ \bar{u} $, adjoint state $ \bar{z} $, and Lagrange multipliers $ \bar{\mu } $, the first-order optimality conditions (8). Then, we assume

$$\begin{aligned} \frac{\partial ^{2}\bar{\mathcal {L}}}{\partial q^{2}}p^{2} >0 \qquad \forall p \in C_{\bar{q}} \setminus \lbrace 0 \rbrace . \end{aligned}$$

(13)

Remark 2.4

Comparing the second-order sufficient condition of Assumption 2.4 with the one of [3], we observe that the assumption

$$\begin{aligned} \partial _{q}^{2}\bar{H}_{i,i} \ge \xi , \qquad \forall t \in I\setminus E_{i}^{\nu },\,\forall i =1,\ldots ,m, \end{aligned}$$

where

$$\begin{aligned} E_{i}^{\nu } = \Big \lbrace t \in I:\Big \vert \int _{\varOmega }\partial _{q}\bar{H}_{i}\,\mathrm {d}x \Big \vert \ge \nu \Big \rbrace \end{aligned}$$

is the set of sufficiently active control constraints and $ \xi , \nu $ are positive constants, is implicitly satisfied in our setting. Indeed, since the control appears quadratically in the cost functional and linearly in the state equation, it trivially follows that $ \partial _{q}^{2}\bar{H}_{i,i} = \alpha \mathbb {I} > 0 $, where $ \mathbb {I} $ denotes the identity operator.

With the second-order conditions at hand, we obtain the following quadratic growth condition; see [1, Theorem 5] and Remark 2.2.

Theorem 2.5

Let $ \bar{q} \in Q_{\mathrm{feas}} $ satisfy the first-order necessary optimality conditions (8) and let Assumption 2.4 hold. Then, there exist constants $ \delta ,\eta > 0 $ such that

$$\begin{aligned} j(q) \ge j(\bar{q}) + \delta \Vert q-\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}^{2}, \end{aligned}$$

(14)

for any $ q \in Q_{\mathrm{feas}} $ with $ \Vert q-\bar{q}\Vert _{L^{2}(I, {\mathbb {R}}^{m})} \le \eta $.

3 Discretization

We briefly describe the discretization of our problem in time and space. We use the dG(0)cG(1) method, i.e., a Galerkin method that is discontinuous in time and continuous in space, and refer to [31] for additional details.

The control variable is discretized implicitly by the optimality conditions through the variational discretization approach, attributed to [32].

3.1 Time Discretization

We consider a partitioning of ${\text {cl}}(I)$ consisting of time intervals $ I_{n}=]t_{n-1},t_{n} ] $, for $ n=1,\ldots ,N $ and $ I_{0}=\lbrace 0 \rbrace $, where the times $ t_{i} $ are such that

$$\begin{aligned} 0=t_{0}<t_{1}<\cdots<t_{N-1}<t_{N}=T . \end{aligned}$$

The length of the interval $ I_{n} $ is $ k_{n} $ and we set $ k=\max _{n}k_{n} $ imposing that $ k<T $. Further, we assume the existence of strictly positive constants $ a,b,\tilde{k} $ such that the following technical conditions hold:

$$\begin{aligned} \min _{n> 0}k_{n} \ge ak^{b},&\tilde{k}^{-1} \le \frac{k_{n}}{k_{n+1}}\le \tilde{k} \quad \forall n > 0. \end{aligned}$$

We denote with $ \mathcal {P}_{0}(I_{n},V) $ the space of piecewise constant polynomials on $ I_{n} $ with values in V. The semidiscrete state and trial space is

$$\begin{aligned} U_{k}=U_{k}(V)=\big \lbrace \varphi _{k} \in L^{2}(I,V):\varphi _{k,n} = \varphi _{k}\vert _{I_{n}} \in \mathcal {P}_{0}(I_{n},V), \,n=1,\ldots ,N \big \rbrace , \end{aligned}$$

with inner product $ (\cdot ,\cdot )_{I_{n}} $ and norm $ \Vert \cdot \Vert _{I_{n}} $ given by the restriction of the usual inner product and norm of $ L^{2}(I,H) $ onto the interval $I_{n}$, i.e., we have $ (\cdot , \cdot )_{I_{n}} = \int _{I_{n}}(\cdot , \cdot )\,\mathrm {d}t $.

Our functions are piecewise constant on each interval. Thus, we can simplify standard notation and, for functions $ \varphi _{k} \in U_{k}$, we write

$$\begin{aligned} \varphi _{k,n+1}=\varphi _{k,n}^{+} = \lim _{t \rightarrow 0^{+}} \varphi _{k}(t_{n}+t) =\lim _{t \rightarrow 0^{+}} \varphi _{k}(t_{n+1}-t),\quad [\varphi _{k}]_{n} = \varphi _{k,n+1}-\varphi _{k,n}. \end{aligned}$$

For $ u_{k},\varphi _{k} \in U_{k}+ W(0,T) $, the semidiscrete bilinear form is defined, in general, as

$$\begin{aligned} B(u_{k},\varphi _k):=\sum _{n=1}^{N}(\partial _{t}u_{k},\varphi _k)_{I_{n}}\!+\!\, (\nabla u_{k}, \nabla \varphi _k)_{I}\!+\!\sum _{n=2}^{N}([u_{k}]_{n-1},\varphi _{k,n})\!\,+\!\,(u_{k,1}, \varphi _{k,1}). \end{aligned}$$

As long as only piecewise constants in time are considered, the bilinear form can be simplified, noting that $\partial _t u_k\bigl |_{I_{n}} \equiv 0$ for any $u_k \in U_{k}$. Indeed, for any $u_k,\varphi _k \in U_{k}$ we have

$$\begin{aligned} B(u_{k},\varphi _k)&= (\nabla u_{k}, \nabla \varphi _k)_{I}+\sum _{n=2}^{N}\left( [u_{k}]_{n-1},\varphi _{k,n}\right) +(u_{k,1}, \varphi _{k,1}), \end{aligned}$$

and the semidiscrete state equation reads: given $ q \in L^{2}(I,{\mathbb {R}}^{m}) $ as well as $ u_{0}\in H^{2}(\varOmega )\cap V $, find $ u_{k} = u_{k}(q) \in U_{k}$ such that

$$\begin{aligned} B(u_{k}, \varphi _{k}) + (d(\cdot , \cdot ,u_{k}),\varphi _{k})_{I}=(qg,\varphi _{k})_{I}+(u_{0},\varphi _{k,1}),\quad \forall \varphi _{k} \in U_{k}. \end{aligned}$$

(15)
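To make the time-stepping structure of (15) explicit, one can test with functions that vanish outside a single interval. As a sketch, take $ \varphi _{k} = \psi \,\chi _{I_{n}} $ with an arbitrary $ \psi \in V $, where $ \chi _{I_{n}} $ (a notation introduced only for this remark) denotes the indicator function of $ I_{n} $, and set $ u_{k,0} := u_{0} $ for convenience. Using the simplified form of B for piecewise constant functions, (15) then decouples into

$$\begin{aligned} (u_{k,n}-u_{k,n-1},\psi ) + k_{n}(\nabla u_{k,n},\nabla \psi ) + \int _{I_{n}}(d(t,\cdot ,u_{k,n}),\psi )\,\mathrm {d}t = \int _{I_{n}}(q(t)g,\psi )\,\mathrm {d}t \quad \forall \psi \in V,\; n=1,\ldots ,N, \end{aligned}$$

so that $ u_{k,n} $ is determined from $ u_{k,n-1} $ by one semilinear elliptic problem per time interval.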

Note that this is a variant of the implicit Euler scheme with averaging on the right-hand side, where the partial derivatives with respect to time are piecewise zero. For the existence and regularity of a unique solution for (15), the following proposition, from [9, Theorem 3.1 and 3.2], holds true.

Proposition 3.1

For the solution $ u_{k} \in U_{k}$ of (15), the following stability estimates hold

$$\begin{aligned}&\Vert u_{k}\Vert _{L^{\infty }(I \times \varOmega )} \le c\big (\Vert qg\Vert _{L^{p}(I \times \varOmega )}+ \Vert u_{0}\Vert _{L^{\infty }} + \Vert d(\cdot , \cdot ,0)\Vert _{L^{p}(I \times \varOmega )}\big ),\nonumber \\&\Vert u_{k}\Vert _{L^{\infty }(I,V)} \le c\big (\Vert qg\Vert _{I}^{2} + \Vert u_{0}\Vert _{V} + \Vert d(\cdot , \cdot ,0)\Vert _{I} \big ), \end{aligned}$$

(16)

where $ p > 2 $.

As for the continuous case, we now introduce the semidiscrete control-to-state map

$$\begin{aligned} S_{k} :L^{\infty }(I, {\mathbb {R}}^{m}) \rightarrow U_{k}, \end{aligned}$$

associating with any given q the solution $ u_{k}(q):=S_{k}(q) $ of (15). As in the continuous case, we have that $ S_{k} $ is of class $ C^{2} $.

Lemma 3.1

The operator $ S_{k} :L^{\infty }(I, {\mathbb {R}}^{m}) \rightarrow U_{k}$ is of class $ C^{2} $. Further, for $ u_{k}=S_{k}(q) $ and $ p~\in ~L^{\infty }(I, \mathbb {R}^{m})$, its first derivative $ S_{k}^{'}(q)p:=v_{k,p} $, in direction p, is the solution of

$$\begin{aligned} B(v_{k,p},\varphi _{k}) +(\partial _{u}d(\cdot , \cdot , u_{k})v_{k,p}, \varphi _{k})_{I} = (pg,\varphi _{k})_{I},\quad \forall \varphi _{k}\in U_{k}. \end{aligned}$$

For $ p_{1}, p_{2} \in L^{\infty }(I, \mathbb {R}^{m})$, its second derivative $ S^{''}_{k}(q)p_{1}p_{2}=v_{k,p_{1}p_{2}} $, in the directions $p_1, p_2$, is the solution of

$$\begin{aligned} B(v_{k,p_{1}p_{2}},\varphi _{k})+(\partial _{u}d(\cdot , \cdot , u_{k})v_{k,p_{1}p_{2}}, \varphi _{k})_{I} =-(\partial _{uu}d(\cdot ,\cdot ,u_{k})v_{k,p_{1}}v_{k,p_{2}},\varphi _{k})_{I}, \end{aligned}$$

for all $\varphi _{k}\in U_{k}$.

As for S, the following Lipschitz properties hold for $ S_{k} $ and its first derivative; compare [9, Lemma 3.1] and Lemma 2.2.

Lemma 3.2

For $ q_{1}, q_{2}, p \in L^{\infty }(I, \mathbb {R}^{m})$ there holds

$$\begin{aligned} \begin{aligned} \Vert S_{k}(q_{1})-S_{k}(q_{2})\Vert _{I}&\le c \Vert q_{1}-q_{2}\Vert _{L^{2}(I, {\mathbb {R}}^{m})},\\ \Vert S^{'}_{k}(q_{1})p-S^{'}_{k}(q_{2})p\Vert _{I}&\le c \Vert q_{1}-q_{2}\Vert _{L^{2}(I, {\mathbb {R}}^{m})}\Vert p\Vert _{L^{2}(I, {\mathbb {R}}^{m})}, \\ \Vert S^{'}_{k}(q_{1})p-S^{'}_{k}(q_{2})p\Vert _{L^{\infty }(I,H)}&\le c \Vert q_{1}-q_{2}\Vert _{L^{2}(I, {\mathbb {R}}^{m})}\Vert p\Vert _{L^{2}(I, {\mathbb {R}}^{m})}. \end{aligned} \end{aligned}$$

(17)

3.2 Space Discretization

We consider a family $\mathcal {T}_{h}$ of subdivisions of $ \varOmega $ consisting of closed triangles or quadrilaterals (tetrahedra or hexahedra in dimension three) T which are affine equivalent to their reference elements. The union $\varOmega _h = \text {int}\bigl ( \bigcup _{T \in \mathcal T_h} T\bigr )$ of these elements is assumed to be such that the vertices on $\partial \varOmega _h$ are located on $\partial \varOmega $. We assume the family $\mathcal T_h$ to be quasi-uniform and shape regular in the sense of [33], denoting by $ h_{T} $ the diameter of T and $ h:=\max _{T\in \mathcal {T}_{h}} h_{T}$. Then, we define the conforming finite element space $ V_{h} \subset V$ as the space of piecewise linear functions with respect to $ \mathcal {T}_{h}$ with the canonical extension $v\bigl |_{\varOmega \setminus \varOmega _h} \equiv 0$ for any $v \in V_h$. Moreover, we assume that the sequence of spatial meshes is such that the $L^2$-projection $\varPi _h$ onto $V_h$ is stable with respect to the $H^1$-norm; for conditions ensuring this stability, see, e.g., [34]. Then, the discrete state and trial spaces are given by

$$\begin{aligned} U_{kh}=U_{kh}(V_{h})=\big \lbrace \varphi _{kh} \in L^{2}(I,V_{h}):\varphi _{kh,n}=\varphi _{kh}\vert _{I_{n}} \in \mathcal {P}_{0}(I_{n},V_{h}), n=1,\ldots ,N \big \rbrace , \end{aligned}$$

and the discrete state equation reads: for given $ q \in L^{\infty }(I, \mathbb {R}^{m})$, find the state $ u_{kh} = u_{kh}(q) \in U_{kh}$ such that

$$\begin{aligned} B(u_{kh}, \varphi _{kh}) + (d(\cdot , \cdot ,u_{kh}),\varphi _{kh})_{I}=(qg,\varphi _{kh})_{I}+(u_{0},\varphi _{kh,1}),\,\, \forall \varphi _{kh} \in U_{kh}. \end{aligned}$$

(18)
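For orientation, the following Python sketch solves a one-dimensional instance of (18) by time stepping with a Newton iteration on each interval. Everything specific here is an assumption made for illustration and is not taken from the text: $ \varOmega = ]0,1[ $ with homogeneous Dirichlet conditions, a uniform mesh and uniform time steps, a single control ($ m=1 $) given by its interval means, the nonlinearity $ d(u)=u^{3} $, nodal quadrature for the nonlinear term, and nodal interpolation of $ u_{0} $ and g in place of the exact $ L^{2} $ inner products. The names `p1_matrices` and `solve_state` are hypothetical.

```python
import numpy as np

def p1_matrices(n_el):
    """Mass and stiffness matrices of P1 elements on a uniform mesh of ]0,1[
    (interior nodes only, i.e. homogeneous Dirichlet conditions)."""
    h = 1.0 / n_el
    n = n_el - 1
    main, off = np.ones(n), np.ones(n - 1)
    M = h / 6.0 * (4.0 * np.diag(main) + np.diag(off, 1) + np.diag(off, -1))
    K = 1.0 / h * (2.0 * np.diag(main) - np.diag(off, 1) - np.diag(off, -1))
    return h, M, K

def solve_state(q_avg, g, u0, n_el=20, T=1.0,
                d=lambda u: u**3, dd=lambda u: 3.0 * u**2,
                tol=1e-12, max_newton=50):
    """Time stepping for a 1D model of (18).

    q_avg : mean value of the scalar control on each of the N time intervals
    g, u0 : vectorized callables on [0,1] (assumed, illustrative data)
    Returns the array of nodal states (one row per interval) and the mass matrix.
    """
    h, M, K = p1_matrices(n_el)
    x = np.linspace(0.0, 1.0, n_el + 1)[1:-1]   # interior nodes
    k = T / len(q_avg)                          # uniform step size
    g_vec = M @ g(x)                            # (g, phi_i) with g interpolated
    u_prev = u0(x)                              # nodal interpolant of u_0
    states = []
    for q_n in q_avg:
        u = u_prev.copy()
        for _ in range(max_newton):
            # residual of one implicit step; nonlinear term by nodal quadrature
            res = M @ (u - u_prev) + k * (K @ u) + k * h * d(u) - k * q_n * g_vec
            if np.linalg.norm(res) < tol:
                break
            jac = M + k * K + k * h * np.diag(dd(u))
            u = u - np.linalg.solve(jac, res)
        states.append(u)
        u_prev = u
    return np.array(states), M
```

With $ q=0 $, testing one step with the new iterate shows that the mass-matrix norm of the iterates cannot increase, which offers a simple sanity check for such an implementation.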

Just as in the semidiscrete case, we have the following stability estimates; see [9, Theorem 4.1]. We remark again that the uniform boundedness of $ u_{kh} $, independent of the discretization parameters k, h, will play a crucial role.

Proposition 3.2

For the solution $ u_{kh} \in U_{kh}$ of (18), the following stability estimates hold

$$\begin{aligned} \Vert u_{kh}\Vert _{L^{\infty }(I \times \varOmega )}&\le c\big (\Vert qg\Vert _{L^{p}(I \times \varOmega )}+ \Vert \varPi _{h}u_{0}\Vert _{L^{\infty }(\varOmega )} + \Vert d(\cdot , \cdot ,0)\Vert _{L^{p}(I \times \varOmega )}\big ),\nonumber \\ \Vert u_{kh}\Vert _{L^{\infty }(I,V)}&\le c\big (\Vert qg\Vert _{I}^{2} + \Vert \varPi _{h}u_{0}\Vert _{V} + \Vert d(\cdot , \cdot ,0)\Vert _{I} \big ), \end{aligned}$$

(19)

where $ p > 2 $ and $ \varPi _{h} :V \rightarrow V_{h} $ is the $L^{2}$-projection in space.

Next, we introduce the discrete control-to-state map $ S_{kh}:L^{\infty }(I, \mathbb {R}^{m})\rightarrow U_{kh}$, the discrete state constraint

$$\begin{aligned} F_{kh}:=(\cdot , w) :U_{kh}\rightarrow U_{kh}(\mathbb {R}), \end{aligned}$$

where $U_{kh}(\mathbb {R})$ denotes the space of piecewise constant functions $]0,T[ \rightarrow {\mathbb {R}}$. Further, we introduce the $ C^{2} $-functional $ G_{kh}=(F_{kh} \circ S_{kh} ) $, and the set of feasible controls $ Q_{kh, \mathrm{feas}} := \lbrace q \in Q_{\mathrm{ad}}:G_{kh}(q) \le 0 \rbrace $.

The discrete problem reads

Similar to the semidiscrete case, first and second derivatives of the discrete control-to-state map $ S_{kh} $ are defined via Lemma 3.1, with test functions from $ U_{kh}$ instead of $U_{k}$. Further, for $ S_{kh} $ and its first derivative there holds a Lipschitz property analogous to Lemma 3.2; compare with [9, Lemma 4.1].

We formulate standard KKT optimality conditions for problem ($\mathbb {P}_{kh}$). These conditions will be justified after the introduction of an auxiliary problem in Sect. 5. In particular, we will show in Lemma 5.1 that, for k, h small enough, the Slater point for (1) is also a Slater point for ($\mathbb {P}_{kh}$).

Theorem 3.1

Let $ \bar{q}_{kh} \in Q_{kh,\mathrm{feas}} $ be a local solution of ($\mathbb {P}_{kh}$) with $ \bar{u}_{kh} \in U_{kh}$ the associated state. Then, under Assumption 2.2, for k, h sufficiently small there exist a Lagrange multiplier $ \bar{\mu }_{kh} \in U_{kh}(\mathbb {R})^{*}\cap C({\text {cl}}(I))^{*} $ and an adjoint state $ \bar{z}_{kh} \in U_{kh}$ such that

$$\begin{aligned}&B(\bar{u}_{kh},\varphi ) + (d(\cdot ,\cdot ,\bar{u}_{kh}), \varphi )_{I} = (\bar{q}_{kh}g,\varphi )_{I} + (u_{0}, \varphi _{1}),&\forall \varphi \in U_{kh}, \\&B(\varphi , \bar{z}_{kh}) + (\varphi , \partial _{u}d(\cdot ,\cdot ,\bar{u}_{kh})\bar{z}_{kh})_{I} = (\bar{u}_{kh}-u_{\mathrm{d}},\varphi )_{I} + \langle \bar{\mu }_{kh}, F_{kh}(\varphi ) \rangle ,&\forall \varphi \in U_{kh}, \\&\alpha (\bar{q}_{kh}, q-\bar{q}_{kh})_{L^{2}(I)} + (\bar{z}_{kh},(q-\bar{q}_{kh})g)_{I} \ge 0,&\forall q \in Q_{kh,\mathrm{feas}},\\&\langle F_{kh}(\bar{u}_{kh}),\bar{\mu }_{kh} \rangle = 0,\,\, \bar{\mu }_{kh} \ge 0, \end{aligned}$$

where $\langle \cdot , \cdot \rangle $ denotes the duality pairing between $ U_{kh}({\mathbb {R}}) $ and $ U_{kh}({\mathbb {R}})^{*} $. Further, the Lagrange multiplier can be represented as an element of $ C({\text {cl}}(I))^{*} $ by

$$\begin{aligned} \langle v, \bar{\mu }_{kh} \rangle = \sum _{n=1}^{N}\frac{\mu _{kh,n}}{k_{n}}\int _{I_{n}}v(t)\mathrm {d}t,\,\,\forall v \in C({\text {cl}}(I))\cup U_{kh}(\mathbb {R}). \end{aligned}$$
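The representation above says that the discrete multiplier acts on a function through its mean values on the time intervals. As a small illustration, the following Python sketch evaluates this pairing with Gauss-Legendre quadrature on each interval. The function name `apply_multiplier`, its arguments, and the sample data in the usage note are hypothetical and serve only to illustrate the formula.

```python
import numpy as np

def apply_multiplier(mu, t, v, n_quad=4):
    """Evaluate sum_n mu[n] / k_n * int_{I_n} v(t) dt.

    mu : coefficients mu_{kh,n}, n = 1..N (length N)
    t  : time nodes t_0 < t_1 < ... < t_N (length N + 1)
    v  : vectorized callable on [t_0, t_N]
    """
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)
    total = 0.0
    for n in range(len(mu)):
        a, b = t[n], t[n + 1]
        pts = 0.5 * (b - a) * nodes + 0.5 * (a + b)   # map [-1,1] to I_n
        mean = 0.5 * np.dot(weights, v(pts))          # (1/k_n) * integral over I_n
        total += mu[n] * mean
    return total
```

For a function that is constant on each interval, with values $ v_{n} $, the pairing reduces to $ \sum _{n}\mu _{kh,n}v_{n} $; for instance, `apply_multiplier([1.0, 2.0], [0.0, 0.5, 1.0], lambda t: t)` averages $ v(t)=t $ over the two halves of $ [0,1] $ before weighting.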

4 The State Equation

In this section, we are interested in the derivation of $ L^{\infty }(I,H) $ error estimates for the solutions of the continuous, semidiscrete and discrete state equation, which are not available for semilinear parabolic equations and are required for our final result in Sect. 5. The technique behind these estimates is based on a duality argument requiring, at any level of discretization, the introduction of auxiliary linearized problems. This approach has been used in [10] for a linear parabolic state equation. We now intend to extend it to the semilinear parabolic case adapting an idea of [35] for semilinear elliptic equations.

4.1 Error Estimates for the Temporal Discretization

In a first step, we introduce the backward uncontrolled linearized counterpart of the state equation. For a given fixed $ q \in L^{\infty }(I, \mathbb {R}^{m})$, we consider u and $ u_{k} $ solutions of (2) and (15), respectively, and we define

$$\begin{aligned} \tilde{d}={\left\{ \begin{array}{ll} \frac{d(u(t,x))-d(u_{k}(t,x))}{u(t,x)-u_{k}(t,x)}, &{} \text {if}\,\,u(t,x) \ne u_{k}(t,x), \\ 0,&{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
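The difference quotient $ \tilde{d} $ is straightforward to evaluate pointwise. The Python sketch below does so for arrays of sampled values; the function name `difference_quotient` and the sample nonlinearity $ d(u)=u^{3} $ are illustrative assumptions, not taken from the text. For this particular d, one has $ \tilde{d} = u^{2}+uu_{k}+u_{k}^{2} $ wherever $ u \ne u_{k} $, which makes the boundedness of $ \tilde{d} $ for bounded $ u, u_{k} $ visible.

```python
import numpy as np

def difference_quotient(d, u, u_k):
    """Pointwise (d(u) - d(u_k)) / (u - u_k), set to 0 where u == u_k."""
    u = np.asarray(u, dtype=float)
    u_k = np.asarray(u_k, dtype=float)
    diff = u - u_k
    out = np.zeros_like(diff)
    mask = diff != 0.0                     # avoid division by zero
    out[mask] = (d(u[mask]) - d(u_k[mask])) / diff[mask]
    return out

# Assumed sample nonlinearity d(u) = u^3 and sampled values of u and u_k.
d_tilde = difference_quotient(lambda s: s**3, [1.0, 2.0, 0.5], [1.0, 1.0, -0.5])
```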

Then, we consider the linear backward problem

$$\begin{aligned} \begin{aligned} -(\varphi , \partial _{t}w)_{I} +(\nabla \varphi , \nabla w )_{I} + (\varphi , \tilde{d}w)_{I}&= 0, \\ w(T)&= w_{T}, \end{aligned} \end{aligned}$$

(20)

for any $ \varphi \in W(0,T) \cap H^1(I,H)$, with $ w_{T}\in H $.

Denoting by $ \hat{I}= ]0, \hat{t}[ $, $ \hat{t} \in ]0,T[ $, a truncated time interval, we introduce

$$\begin{aligned} \begin{aligned} -(\varphi , \partial _{t}\hat{w})_{\hat{I}} +(\nabla \varphi , \nabla \hat{w} )_{\hat{I}} + (\varphi , \tilde{d}\hat{w})_{\hat{I}}&= 0, \\ \hat{w}(\hat{t})&= w_{T}. \end{aligned} \end{aligned}$$

(21)

Further, the semidiscrete counterpart of (20), for any $ \varphi _{k}\in U_{k}$, reads

$$\begin{aligned} B(\varphi _{k}, w_{k}) + (\varphi _{k}, \tilde{d}w_{k})_{I} = (\varphi _{k,N}, w_{T}). \end{aligned}$$

(22)

Before starting, we observe that, for any $ \varphi _{k} \in U_{k} $, the following relations hold

$$\begin{aligned} B(u-u_{k}, \varphi _{k})= & {} -(d(u)-d(u_{k}), \varphi _{k})_{I} = -((u-u_{k})\tilde{d}, \varphi _{k})_{I}, \end{aligned}$$

(23)

$$\begin{aligned} B(\varphi _{k}, w-w_{k})= & {} -(\varphi _{k}, (w-w_{k})\tilde{d})_{I}. \end{aligned}$$

(24)

In the following analysis, we will need negative norm estimates for the error between the solutions of (20), (21), and (22). These estimates will be used to derive the error at the time nodal points and inside the time intervals $ I_{n} $. Their derivation follows exactly as in [10, Lemma 5.1, Lemma 5.2], with minor changes due to the presence of the linearization $ \tilde{d} $ of the semilinear term, and therefore it is omitted. The crucial point is the boundedness of $ \tilde{d} $ in $ L^{\infty }(I\times \varOmega ) $ which follows from the Lipschitz continuity of $ d(\cdot ) $ and the regularity of $ u, u_{k} \in L^{\infty }(I \times \varOmega ) $.

For the convenience of the reader, the analog to [10, Lemma 5.1, Lemma 5.2] in our case reads as follows.

Lemma 4.1

For the error between the solutions w, $ \hat{w} $, and $ w_{k} $ of (20), (21), and (22), respectively, there holds

$$\begin{aligned} \Vert w-\hat{w}\Vert _{L^{1}(\hat{I},H)}+\Vert w(0)-\hat{w}(0)\Vert _{H^{-2}(\varOmega )}&\le ck\Big (\log \frac{T}{k}\Big )^{\frac{1}{2}}\Vert w_{T}\Vert ,\\ \Vert w-w_{k}\Vert _{L^{1}(I,H)}+\Vert w(0)-w_{k,1}\Vert _{H^{-2}(\varOmega )}&\le ck\Big (\log \frac{T}{k}\Big )^{\frac{1}{2}}\Vert w_{T}\Vert . \end{aligned}$$

With these estimates at hand, we are ready to derive the main result of the section.

Theorem 4.1

For given $ qg \in L^{\infty }(I,H) $ and $ u_{0} \in V \cap H^{2}(\varOmega ) $, let $ u \in U $ and $ u_{k} \in U_{k} $ be the solutions of (2) and (15), respectively. Then, there holds

$$\begin{aligned} \Vert u-u_{k}\Vert _{L^{\infty }(I,H)}\le ck\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{2}}\Big (\Vert qg\Vert _{L^{\infty }(I,H)} +\Vert u_{0}\Vert _{H^{2}(\varOmega )}+\Vert d(0)\Vert _{L^{\infty }(I\times \varOmega )}\Big ). \end{aligned}$$

Proof

Let $ e_{k}=u-u_{k} $ denote the error arising from the dG(0)-time discretization. In every time interval, we split the error into

$$\begin{aligned} \Vert e_{k}\Vert _{L^{\infty }(I_{n},H)} \le \underbrace{\Vert u(\cdot )-u(t_{n})\Vert _{L^{\infty }(I_{n},H)}}_{(a_{1})} +\underbrace{\Vert u(t_{n})-u_{k}(\cdot )\Vert _{L^{\infty }(I_{n},H)}}_{(a_{2})}, \end{aligned}$$

and we analyze the two terms $ (a_{1}), (a_{2}) $ separately. Then, taking the maximum over all $ n=1,\ldots ,N $, we obtain the assertion. Without loss of generality, we consider the last time interval $ I_{N} $. For an arbitrary time interval $ I_{n} $, we consider (20) on $ I=]0,t_{n}[ $ and (21) on $ \hat{I}=]0,\hat{t}[ $ for $ \hat{t}\in ]t_{n-1},t_{n}] $, and the proof follows mutatis mutandis, observing that $ 0\le \log (t_{n}/k)\le \log (T/k) $.

$(a_{1})$ :

For a generic fixed time $ \hat{t} \in I_{N} $, we start the derivation considering the interpolation error $ u(\hat{t})- u(t_{N}) $.

Consider the solutions w and $ \hat{w} $ to (20) and (21) on $\hat{I}=]0,\hat{t}[$, respectively, with terminal value $ w_{T} = u(\hat{t}) - u(t_N)$. Integration by parts in time of (20) and (21) gives

$$\begin{aligned} -(\varphi (T), w(T)) + (\varphi (0), w(0)) +(\partial _{t}\varphi ,w)_{I}+(\nabla \varphi , \nabla w)_{I} + (\varphi , \tilde{d}w)_{I}&= 0, \\ -(\varphi (\hat{t}),\hat{w}(\hat{t}))+(\varphi (0),\hat{w}(0))+(\partial _{t}\varphi ,\hat{w})_{\hat{I}}+(\nabla \varphi , \nabla \hat{w})_{\hat{I}}+ (\varphi , \tilde{d}\hat{w})_{\hat{I}}&=0, \end{aligned}$$

for any $ \varphi \in W(0,T) \cap H^1(I,H)$.

In particular, setting $ \varphi = u $, the state equation (2) yields

$$\begin{aligned} -(u(T), w(T)) + (u(0), w(0)) +(qg,w)_{I} -(d(u), w)_{I}+ (u, \tilde{d}w)_{I}&= 0, \\ -(u(\hat{t}),\hat{w}(\hat{t})) + (u(0),\hat{w}(0))+ (qg,\hat{w})_{\hat{I}} -(d(u), \hat{w})_{\hat{I}}+ (u, \tilde{d}\hat{w})_{\hat{I}}&= 0. \end{aligned}$$

By definition, $ w(T)=\hat{w}(\hat{t})=w_{T} $. Subtracting the equalities above, we get

$$\begin{aligned} \begin{aligned} (u(\hat{t})-u(T),w_{T})&= (u(0), \hat{w}(0)-w(0))+(qg,\hat{w}-w)_{\hat{I}}-(qg,w)_{I \setminus \hat{I}} \\&\quad \, +\underbrace{\big (u,(\hat{w}-w)\tilde{d}\big )_{\hat{I}}}_{(b_{1})} \underbrace{-(u, \tilde{d}w)_{I\setminus \hat{I}}}_{(b_{2})}\\&\quad \, +\underbrace{(d(u), w-\hat{w})_{\hat{I}}}_{(b_{3})} +\underbrace{(d(u), w)_{I\setminus \hat{I}}}_{(b_{4})}. \end{aligned} \end{aligned}$$

(25)

We abbreviate $\hat{e}^w = w-\hat{w}$ and analyze the terms separately.

$(b_{1})$ :: Due to the stability in $ L^{\infty }(I \times \varOmega ) $ of the solutions of (2) and (15) and the Lipschitz continuity of d, we observe that $ \Vert \tilde{d}\Vert _{L^{\infty } (I\times \varOmega )}\le c $. Therefore,
$$\begin{aligned} |(u,\hat{e}^w\tilde{d})_{\hat{I}}| \le c\Vert u\Vert _{L^{\infty }(I,H)}\Vert \hat{e}^w\Vert _{L^{1}(\hat{I},H)}. \end{aligned}$$
$(b_{2})$ :: Exploiting again the boundedness of $ \tilde{d} $ in $ L^{\infty }(I\times \varOmega ) $, and $ \vert T-\hat{t} \vert \le k $, we have
$$\begin{aligned} -(u,\tilde{d}w)_{I\setminus \hat{I}}&\le \Big \vert \int _{\hat{t}}^{T}(u,\tilde{d}w)\mathrm {d}t\Big \vert \\&\le ck\Vert u\Vert _{L^{\infty }(I,H)}\Vert w\Vert _{L^{\infty }(I,H)}. \end{aligned}$$
$(b_{3})$ :: The Lipschitz property of d and the boundedness of d(0) in $ L^{\infty }(\hat{I},H) $ yield
$$\begin{aligned} (d(u), \hat{e}^w)_{\hat{I}}&=(d(u)-d(0), \hat{e}^w)_{\hat{I}}+(d(0), \hat{e}^w)_{\hat{I}}\\&\le \Vert d(u)-d(0)\Vert _{L^{\infty }(\hat{I},H)}\Vert \hat{e}^w\Vert _{L^{1}(\hat{I},H)}\\&\quad +\Vert d(0)\Vert _{L^{\infty }(\hat{I},H)}\Vert \hat{e}^w\Vert _{L^{1}(\hat{I},H)}\\&\le c\big (\Vert u\Vert _{L^{\infty }(\hat{I},H)} +\Vert d(0)\Vert _{L^{\infty }(\hat{I},H)} \big )\Vert \hat{e}^w\Vert _{L^{1}(\hat{I},H)}. \end{aligned}$$
$(b_{4})$ :: Using the same argument as for $ (b_{3}) $, we conclude
$$\begin{aligned} (d(u), w)_{I\setminus \hat{I}}&= (d(u)-d(0), w)_{I\setminus \hat{I}}+(d(0), w)_{I\setminus \hat{I}}\\&\le c k\big (\Vert u\Vert _{L^{\infty }(I\times \varOmega )}+\Vert d(0)\Vert _{L^{\infty }(I\times \varOmega )} \big )\Vert w\Vert _{L^{\infty }(I,H)}. \end{aligned}$$

Going back to (25), but now utilizing that $ w_{T}=u(\hat{t})-u(T) $, we obtain

$$\begin{aligned} \Vert u(\hat{t})-u(T) \Vert ^{2}&\le c\Big (\Vert \hat{e}^w\Vert _{L^{1}(\hat{I},H)}+\Vert \hat{e}^w(0)\Vert _{H^{-2}(\varOmega )} +k\Vert w\Vert _{L^{\infty }(I,H)}\Big )\\&\quad \cdot \Big (\Vert qg\Vert _{L^{\infty }(I,H)}+\Vert u_{0}\Vert _{H^{2}(\varOmega )} +\Vert d(0)\Vert _{L^{\infty }(I\times \varOmega )}\\&\quad \quad +\Vert u\Vert _{L^{\infty }(I,H)} +\Vert u\Vert _{L^{\infty }(I\times \varOmega )}\Big ). \end{aligned}$$

Using the stability of the solution w of (20), i.e., $ \Vert w\Vert _{L^{\infty }(I,H)}\le c\Vert w_{T}\Vert $; see, e.g., [10, Theorem 5.3], Proposition 2.1, Lemma 4.1, and division by $ \Vert w_{T}\Vert = \Vert u(\hat{t})-u(T)\Vert $, we conclude

$$\begin{aligned} \begin{aligned} \Vert u(\hat{t})-u(T) \Vert \le ck\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{2}}\Big (&\Vert q\Vert _{L^{\infty }(I, \mathbb {R}^{m})}\Vert g\Vert _{H} +\Vert u_{0}\Vert _{H^{2}(\varOmega )}\\&+\Vert d(0)\Vert _{L^{\infty }(I\times \varOmega )}\Big ). \end{aligned} \end{aligned}$$

(26)

$(a_{2})$ :

To obtain the error of the dG(0)-discretization inside the time interval $ I_{N} $, we set $ w_{T}=u(t_{N})-u_{k,N}=u(T)-u_{k,N} $ in (20) and in (22). Then, for any $\varphi \in U_k + (L^2(I,V)\cap H^1(I,H))$ it holds

$$\begin{aligned} B(\varphi ,w)+(\varphi ,\tilde{d}w)_{I}=(\varphi _{N},u(T)-u_{k,N}). \end{aligned}$$

In particular, testing the relation above with $ \varphi =u-u_{k} $ and making use of (23) and (24), we have

$$\begin{aligned} \Vert u(T)-u_{k,N}\Vert ^{2}&= B(u-u_{k},w)+(u-u_{k},\tilde{d}w)_{I}\\&=B(u-u_{k},w-w_{k})-((u-u_{k})\tilde{d},w_{k})_{I}+((u-u_{k})\tilde{d},w)_{I}\\&=B(u,w-w_{k})+(u_{k},(w-w_{k})\tilde{d})_{I}+((u-u_{k})\tilde{d},w-w_{k})_{I}\\&=(qg,w-w_{k})_{I}+(u_{0},w(0)-w_{k}(0))\underbrace{-(d(u),w-w_{k})_{I}}_{(c_{1})} \\&\quad +\underbrace{(u_{k},(w-w_{k})\tilde{d})_{I}}_{(c_{2})} + \underbrace{((u-u_{k})\tilde{d},w-w_{k})_{I}}_{(c_{3})}, \end{aligned}$$

where, in the last step, we used (2).

We abbreviate $e^w_k = w-w_{k}$ and consider the three terms $ (c_{1})-(c_{3}) $ separately.

$(c_{1})$ :

Observing that $ L^{\infty }(I,V) \hookrightarrow L^{\infty }(I,H) $, the stability result in Lemma 2.1 of the solution u of (2), the Lipschitz continuity of $ d(\cdot ) $, and the boundedness of d(0) in $ L^{\infty }(I,H)$ yield

$$\begin{aligned} -(d(u),e^w_k)_{I}&\le \Big (\Vert d(u)-d(0)\Vert _{L^{\infty }(I,H)} + \Vert d(0)\Vert _{L^{\infty }(I,H)}\Big ) \Vert e^w_k\Vert _{L^{1}(I,H)}\\&\le c\Big (\Vert u\Vert _{L^{\infty }(I,H)}+\Vert d(0)\Vert _{L^{\infty }(I,H)}\Big )\Vert e^w_k\Vert _{L^{1}(I,H)}\\&\le c\Big (\Vert qg\Vert _{I}+ \Vert u_{0}\Vert _{V}+\Vert d(0)\Vert _{L^{\infty }(I,H)}\Big )\Vert e^w_k\Vert _{L^{1}(I,H)}. \end{aligned}$$

$(c_{2})$ :

The boundedness of $ \tilde{d} $ in $ L^{\infty }(I\times \varOmega ) $ and the stability result of the semidiscrete equation of Proposition 3.1 yield

$$\begin{aligned} (u_{k},\tilde{d}e^w_k)_{I}&\le c\Vert u_{k}\Vert _{L^{\infty }(I,H)}\Vert e^w_k\Vert _{L^{1}(I,H)}\\&\le c\Big (\Vert qg\Vert _{I}+ \Vert u_{0}\Vert _{V} +\Vert d(0)\Vert _{I} \Big ) \Vert e^w_k\Vert _{L^{1}(I,H)}. \end{aligned}$$

$(c_{3})$ :

From the Lipschitz continuity of $ d(\cdot ) $, as well as the definition and boundedness of $ \tilde{d} $, it follows

$$\begin{aligned} (\tilde{d}(u-u_{k}),e^w_k)_{I}&= (d(u)-d(u_{k}),e^w_k)_{I}\\&=(d(u)-d(0),e^w_k)_{I}+(d(0)-d(u_{k}),e^w_k)_{I}\\&\le c\Big (\Vert u\Vert _{L^{\infty }(I,H)}+\Vert u_{k}\Vert _{L^{\infty }(I,H)}\Big )\Vert e^w_k\Vert _{L^{1}(I,H)}\\&\le c\Big (\Vert qg\Vert _{I}+ \Vert u_{0}\Vert _{V} +\Vert d(0)\Vert _{I} \Big ) \Vert e^w_k\Vert _{L^{1}(I,H)}, \end{aligned}$$

where, in the last step, we used the stability of the solutions u and $ u_{k} $ of (2) and (15), from Proposition 2.1 and Proposition 3.1, respectively.

Summing up, for the error inside the time interval, we obtain

$$\begin{aligned} \begin{aligned} \Vert u(T)-u_{k,N}\Vert ^{2}\le c&\Big (\Vert e^w_k\Vert _{L^{1}(I,H)}+\Vert e^w_k(0)\Vert _{H^{-2}(\varOmega )}\Big )\\&\cdot \Big (\Vert qg\Vert _{L^{\infty }(I,H)}+\Vert u_{0}\Vert _{H^{2}(\varOmega )} +\Vert d(0)\Vert _{L^{\infty }(I,H)}\Big ). \end{aligned} \end{aligned}$$

(27)

In conclusion, combining (26) with (27) and thanks to Lemma 4.1, we obtain the assertion dividing by $\Vert w_T\Vert = \Vert u(T)-u_{k,N}\Vert $. $\square $
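The first-order rate in k can be observed numerically in a drastically simplified setting. The Python sketch below is only an illustration under assumptions not made in the text: the spatial operator is dropped, so that the model is the scalar equation $ u' + u^{3} = 0 $, $ u(0)=u_{0} $, with exact solution $ u(t)=u_{0}(1+2u_{0}^{2}t)^{-1/2} $, and the dG(0) scheme reduces to the implicit Euler method on a uniform grid. The function names `implicit_euler_cubic` and `sup_error` are hypothetical. The error is measured in the supremum norm over the whole interval, i.e., the piecewise constant approximation is compared with the exact solution inside each $ I_{n} $ as well.

```python
import numpy as np

def implicit_euler_cubic(u0, T, N):
    """Implicit Euler (dG(0)) for u' + u^3 = 0 on a uniform grid with N steps."""
    k = T / N
    u = np.empty(N)
    prev = u0
    for n in range(N):
        x = prev
        for _ in range(50):                       # Newton for x + k x^3 = prev
            step = (x + k * x**3 - prev) / (1.0 + 3.0 * k * x**2)
            x -= step
            if abs(step) < 1e-14:
                break
        u[n] = x
        prev = x
    return u

def sup_error(u0, T, N, samples=20):
    """Sampled sup-norm error between the exact solution and the
    piecewise constant approximation (value u[n] on the n-th interval)."""
    k = T / N
    u = implicit_euler_cubic(u0, T, N)
    err = 0.0
    for n in range(N):
        t = np.linspace(n * k, (n + 1) * k, samples)
        exact = u0 / np.sqrt(1.0 + 2.0 * u0**2 * t)
        err = max(err, float(np.max(np.abs(exact - u[n]))))
    return err

# Halving the step size should roughly halve the error.
ratio = sup_error(1.0, 1.0, 50) / sup_error(1.0, 1.0, 100)
```

In this scalar model no logarithmic factor is expected to be visible; the sketch only illustrates the linear dependence on k.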

4.2 Error Estimates for the Spatial Discretization

We develop error estimates for the spatial discretization of the problem using similar steps as in the semidiscrete case. The linearization of d now reads

$$\begin{aligned} \hat{d}= {\left\{ \begin{array}{ll} \frac{d(u_{k}(t,x))-d(u_{kh}(t,x))}{u_{k}(t,x)-u_{kh}(t,x)}, &{} \text {if}\,\,u_{k}(t,x) \ne u_{kh}(t,x), \\ 0,&{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

We remark that, thanks to the Lipschitz continuity of $ d(\cdot ) $, the linearized term $ \hat{d} $ is bounded in $ L^{\infty }(I\times \varOmega ) $.

We introduce the discrete counterpart of (20) with $\hat{d}$ instead of $\tilde{d}$. Find $ w_{kh}\in U_{kh}$ such that

$$\begin{aligned} B(\varphi _{kh}, w_{kh})+(\varphi _{kh},\hat{d}w_{kh})_{I}= (\varphi _{kh,N}, w_{T}), \end{aligned}$$

(28)

for any $ \varphi _{kh} \in U_{kh}$, with $ w_{T} \in H $.

We also consider the auxiliary problem (22) with $ \hat{d} $ instead of $ \tilde{d} $. Namely, find $ w_{k}\in U_{k}$ such that

$$\begin{aligned} B(\varphi _{k}, w_{k})+(\varphi _{k},\hat{d}w_{k})_{I}= (\varphi _{k,N}, w_{T}), \end{aligned}$$

(29)

for any $ \varphi _{k}\in U_{k}$.

We observe that for any $ \varphi _{kh} \in U_{kh} $ the following relations hold

$$\begin{aligned} B(u_{k}-u_{kh}, \varphi _{kh})= & {} -(d(u_{k})-d(u_{kh}), \varphi _{kh})_{I} = -((u_{k}-u_{kh})\hat{d}, \varphi _{kh})_{I}, \end{aligned}$$

(30)

$$\begin{aligned} B(\varphi _{kh}, w_{k}-w_{kh})= & {} -(\varphi _{kh}, (w_{k}-w_{kh})\hat{d})_{I}. \end{aligned}$$

(31)

As for the error in the dG(0)-semidiscretization, we again employ a duality argument requiring estimates for the error between the solutions of (29) and (28). Again, the proof is analogous to [10, Lemma 5.8 and Lemma 5.9] with the obvious modifications due to the presence of $ \hat{d} $. Consequently, we only state the following lemma regarding the required error estimates.

Lemma 4.2

For the error between the solutions $ w_{k} $ and $ w_{kh} $ of (29) and (28), respectively, there holds

$$\begin{aligned} \Vert w_{k,1}-w_{kh,1}\Vert _{H^{-2}(\varOmega )} + T\Vert w_{k,1}-w_{kh,1}\Vert \le ch^{2}\Vert w_{T}\Vert . \end{aligned}$$

(32)

Theorem 4.2

For given $ qg \in L^{\infty }(I,H) $ and $ u_{0} \in H^{2}(\varOmega ) \cap V $, let $ u_{k} \in U_{k} $ and $ u_{kh} \in U_{kh} $ be the solutions of (15) and (18), respectively. Then, there holds

$$\begin{aligned} \Vert u_{k}-u_{kh}\Vert _{L^{\infty }(I,H)} \le ch^{2}\Big (\log \frac{T}{k}+1\Big )\Big (\Vert qg\Vert _{L^{\infty }(I,H)}+\Vert u_{0}\Vert _{H^{2}(\varOmega )} + \Vert d(0)\Vert _{L^{\infty }(I \times \varOmega )} \Big ). \end{aligned}$$

Proof

Since both $ u_{k} $, $ u_{kh} $ are constant on each time interval $ I_{n} $, we can equivalently show the estimate on a single time interval $ I_{n} $ and, with no loss of generality, we consider the last time interval only. For an arbitrary time interval $ I_{n} $, we consider (28) and (29) on $ I=]0,t_{n}[ $ and, noting that

$$\begin{aligned} 0\le \log (t_{n}/k)\le \log (T/k), \end{aligned}$$

the proof follows mutatis mutandis.

Proceeding as in the proof of Theorem 4.1, we set $w_{T} = u_{k,N}-u_{kh,N} $ in (28) and (29). Then, using (30) and (31), we have

$$\begin{aligned} \Vert u_{k,N}&-u_{kh,N}\Vert ^{2} = B(u_{k}-u_{kh}, w_{k}) + (u_{k}-u_{kh}, \hat{d}w_{k})_{I}\\&= B(u_{k}-u_{kh}, w_{k}-w_{kh})-(\hat{d}(u_{k}- u_{kh}), w_{kh})_{I} + (\hat{d}(u_{k}-u_{kh}), w_{k})_{I}\\&= B(u_{k}, w_{k}-w_{kh}) + (u_{kh}, \hat{d}(w_{k}-w_{kh}))_{I} +(\hat{d}(u_{k}-u_{kh}),w_{k}- w_{kh})_{I}\\&= (qg,w_{k}-w_{kh})_{I}+(u_{0}, w_{k,1}-w_{kh,1})\underbrace{-(d(u_{k}),w_{k}-w_{kh})_{I}}_{(a_{1})}\\&\quad + \underbrace{(u_{kh}, \hat{d}(w_{k}-w_{kh}))_{I}}_{(a_{2})} +\underbrace{(\hat{d}(u_{k}-u_{kh}),w_{k}-w_{kh})_{I}}_{(a_{3})}, \end{aligned}$$

where, in the last step, we used (15). We analyze the three terms separately, abbreviating $e^w_{kh} = w_k - w_{kh}$.

$(a_{1})$ :: The Lipschitz continuity of $ d(\cdot ) $ and the boundedness of d(0) in $ L^{\infty }(I,H) $, give
$$\begin{aligned} -(d(u_{k}),e^w_{kh})_{I}&\le c\Big (\Vert d(u_{k})-d(0)\Vert _{L^{\infty }(I,H)}+\Vert d(0)\Vert _{L^{\infty }(I,H)}\Big )\cdot \Vert e^w_{kh}\Vert _{L^{1}(I,H)}\\&\le c\Big (\Vert u_{k}\Vert _{L^{\infty }(I,H)}+\Vert d(0)\Vert _{L^{\infty }(I,H)}\Big )\Vert e^w_{kh}\Vert _{L^{1}(I,H)}. \end{aligned}$$
$(a_{2})$ :: Recalling that $ \hat{d} $ is bounded, we have
$$\begin{aligned} (u_{kh}, \hat{d}e^w_{kh})_{I} \le c\Vert u_{kh}\Vert _{L^{\infty }(I,H)} \Vert e^w_{kh}\Vert _{L^{1}(I,H)}. \end{aligned}$$
$(a_{3})$ :: For the last term, we rely again on the Lipschitz continuity of $ d(\cdot ) $ to conclude
$$\begin{aligned} (\hat{d}(u_{k}-u_{kh}),e^w_{kh})_{I}&=(d(u_{k})-d(u_{kh}),e^w_{kh})_{I}\\&\le c\Big (\Vert u_{k}\Vert _{L^{\infty }(I,H)}+\Vert u_{kh}\Vert _{L^{\infty }(I,H)}\Big )\Vert e^w_{kh}\Vert _{L^{1}(I,H)}. \end{aligned}$$

We now combine the previous inequalities and, thanks to the stability estimates (16) and (19), we obtain

$$\begin{aligned} \Vert u_{k,N}-u_{kh,N}\Vert ^{2}&\le c\Big (\Vert e^w_{kh}\Vert _{L^{1}(I,H)} + \Vert w_{k,1}-w_{kh,1}\Vert _{H^{-2}(\varOmega )} \Big )\\&\quad \cdot \Big (\Vert qg\Vert _{L^{\infty }(I,H)}+\Vert u_{0}\Vert _{H^{2}(\varOmega )}+\Vert d(0)\Vert _{L^{\infty }(I\times \varOmega )}\Big ). \end{aligned}$$

Noting that the $L^2$-estimate in (32) remains true on shorter intervals, it follows with $\tau _{k,n} = T-t_{n-1}$

$$\begin{aligned} \Vert w_{k}-w_{kh}\Vert _{L^{1}(I,H)}&\le \sum _{n=1}^{N}k_{n}\tau _{k,n}^{-1}\max _{n=1,\ldots ,N} \big (\tau _{k,n}\Vert w_{k,n}-w_{kh,n}\Vert \big )\\&\le ch^{2}\Big (\log \frac{T}{k}+1\Big )\Vert w_{T}\Vert . \end{aligned}$$

Using Lemma 4.2, dividing by $ \Vert w_{T}\Vert = \Vert u_{k,N}-u_{kh,N}\Vert $, we obtain the assertion. $\square $
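The logarithmic factor in the estimate of $ \Vert w_{k}-w_{kh}\Vert _{L^{1}(I,H)} $ can be traced to a harmonic sum. As an illustration, assume uniform time steps $ k_{n}=k=T/N $ (an assumption made only for this remark). Then $ T-t_{n-1}=(N-n+1)k $, and

$$\begin{aligned} \sum _{n=1}^{N}\frac{k_{n}}{T-t_{n-1}} = \sum _{n=1}^{N}\frac{1}{N-n+1} = \sum _{j=1}^{N}\frac{1}{j} \le 1+\log N = 1+\log \frac{T}{k}. \end{aligned}$$

For nonuniform steps, a bound of the same type is plausible under the technical conditions on $ k_{n} $ imposed in Sect. 3.1, with a constant depending on a, b, and T.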

5 Convergence Analysis

In this section, we focus on the main result of this paper. We show that for any local solution $ \bar{q} $ of the continuous problem satisfying KKT-conditions and SSCs, there exists a sequence of local solutions $ \bar{q}_{kh} $ of ($\mathbb {P}_{kh}$) converging to $ \bar{q} $. To analyze the errors induced by the discretization, we use the so-called two-way feasibility argument; see, e.g., [36, 37]. In this method, the linearized Slater point $ q_{\gamma } $ from Assumption 2.2 is used to construct sequences of controls (competitors) which are feasible for the continuous and discrete problem, respectively. If the problem is linear, these sequences of feasible competitors can be used in the first-order necessary and sufficient conditions to obtain convergence of the discrete problem. In the semilinear case, due to the presence of the linearized term, the complementary slackness condition cannot be used as in the linear setting. Therefore, the feasible controls have to be used in combination with second-order information, in particular in the quadratic growth condition (14) arising from the second-order sufficient conditions. This approach has been used in the recent paper [4] for the semilinear elliptic case in combination with a localization argument, as in [38]. We now intend to extend that approach to our semilinear parabolic optimal control problem with state constraints.

In the following analysis, we will introduce auxiliary problems in a neighborhood of the optimal local solution $ \bar{q} $. To this end, we denote with $ r > 0 $ a radius, to be chosen conveniently later, and we define

$$\begin{aligned} Q^{r}&:=\lbrace q \in Q_{\mathrm{ad}}:\Vert q-\bar{q}\Vert _{L^{2}(I, {\mathbb {R}}^{m})} \le r \rbrace ,\\ Q^{r}_\mathrm{feas}&:=\lbrace q \in Q^{r}:G(q)\le 0 \rbrace . \end{aligned}$$

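Then, the continuous auxiliary problem reads

$$\begin{aligned} \min \, j(q) \quad \text {s.t. } q \in Q^{r}_\mathrm{feas}. \qquad (\mathbb {P}^{r}) \end{aligned}$$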

Due to the SSCs, for $r$ sufficiently small, the unique global solution of ($\mathbb {P}^{r}$) coincides with the selected local solution $\bar{q}$ of ($\mathbb {P}$). The value of introducing the auxiliary problem lies in the fact that a discretization of ($\mathbb {P}^{r}$), defined below, will provide a sequence of solutions converging to the selected local optimum. For $ G_{kh} = F_{kh} \circ S_{kh } $, we introduce the discrete auxiliary problem

$$\begin{aligned} \min \, j_{kh}(q_{kh}) \quad \text {s.t. } q_{kh} \in Q^{r}_{kh,\mathrm{feas}} := \lbrace q \in Q^{r}:G_{kh}(q)\le 0 \rbrace . \qquad (\mathbb {P}_{kh}^{r}) \end{aligned}$$

We remark again that the control is not discretized; the index $kh$ only indicates the association with the problem ($\mathbb {P}_{kh}^{r}$), i.e., the use of the discretized state equation.

Assumption 5.1

We assume that $ q_{\gamma } $ satisfying Slater’s condition (7) is close enough to $ \bar{q} \in Q_{\mathrm{feas}} $, meaning

$$\begin{aligned} \Vert q_{\gamma } - \bar{q}\Vert _{L^{2}(I, {\mathbb {R}}^{m})}\le \frac{r}{2}. \end{aligned}$$

(34)

The fact that $ q_{\gamma } $ is in a neighborhood of $ \bar{q} $ is a reasonable assumption. Indeed, as observed in [4, Section 2], given any Slater point $q_\gamma $ with parameter $\gamma $, one can construct a Slater point $q_\gamma ^r=\bar{q}+t(q_{\gamma }-\bar{q})$ close to $\bar{q}$ with parameter $ \gamma (r) = t\gamma \simeq r\gamma $, where $ t=\min \{1,r/(2\Vert q_{\gamma }-\bar{q}\Vert )\} $. Hence, (7) holds with $ \gamma $ replaced by $ t\gamma $. Further, after showing that ($\mathbb {P}_{kh}$) admits local solutions, and thanks to Lemma 5.1 below, we will see that it is reasonable to assume that (34) also holds for the discrete problem ($\mathbb {P}_{kh}$), namely

$$\begin{aligned} \Vert q_{\gamma } - \bar{q}_{kh}\Vert _{L^{2}(I, {\mathbb {R}}^{m})} \le \frac{r}{2}. \end{aligned}$$
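The rescaling of a Slater point toward $\bar{q}$ described above can be sketched numerically. The snippet below is only an illustration: it assumes scalar controls ($m=1$) that are piecewise constant on a uniform time grid with step `k`, and the helper names `l2_norm` and `rescale_slater_point` are hypothetical.

```python
import math


def l2_norm(v, k):
    """Discrete L^2(I) norm of a piecewise constant function with step k."""
    return math.sqrt(k * sum(x * x for x in v))


def rescale_slater_point(q_bar, q_gamma, gamma, r, k):
    """Return (q_gamma_r, gamma_r) with q_gamma_r = q_bar + t*(q_gamma - q_bar).

    t = min(1, r / (2*||q_gamma - q_bar||)) as in the text, and
    gamma_r = t*gamma is the correspondingly reduced Slater parameter.
    """
    diff = [g - b for g, b in zip(q_gamma, q_bar)]
    dist = l2_norm(diff, k)
    # If the Slater point coincides with q_bar, no rescaling is needed.
    t = 1.0 if dist == 0.0 else min(1.0, r / (2.0 * dist))
    q_gamma_r = [b + t * d for b, d in zip(q_bar, diff)]
    return q_gamma_r, t * gamma


if __name__ == "__main__":
    k = 0.1
    q_bar = [0.0] * 10
    q_gamma = [1.0] * 10
    q_r, gamma_r = rescale_slater_point(q_bar, q_gamma, gamma=1.0, r=0.5, k=k)
    dist_r = l2_norm([a - b for a, b in zip(q_r, q_bar)], k)
    print(dist_r, gamma_r)
```

By construction the rescaled point lies within distance $r/2$ of $\bar{q}$, at the price of a Slater parameter reduced by the same factor $t$.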

In what follows, we abbreviate the derived convergence rate for the space-time discretization by

$$\begin{aligned} c(k,h) := k\Bigl (\ln \frac{T}{k}+1\Bigr )^{\frac{1}{2}} + h^2 \Bigl (\ln \frac{T}{k}+1\Bigr ). \end{aligned}$$
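To get a feeling for how the abbreviated rate behaves under refinement, the following minimal sketch evaluates $c(k,h)$ directly from its definition. The function name `rate` and the default $T=1$ are assumptions for illustration only.

```python
import math


def rate(k: float, h: float, T: float = 1.0) -> float:
    """Evaluate c(k,h) = k*(ln(T/k)+1)^(1/2) + h^2*(ln(T/k)+1)."""
    if not (0.0 < k <= T) or h <= 0.0:
        raise ValueError("expected 0 < k <= T and h > 0")
    log_term = math.log(T / k) + 1.0
    return k * math.sqrt(log_term) + h * h * log_term


if __name__ == "__main__":
    # Halve k and h simultaneously and watch the bound decrease.
    k, h = 0.1, 0.1
    for _ in range(5):
        print(f"k={k:.5f}  h={h:.5f}  c(k,h)={rate(k, h):.6e}")
        k, h = k / 2, h / 2
```

Apart from the logarithmic factors, the quantity behaves like $k+h^{2}$, so balancing both contributions suggests a coupling of the form $k\sim h^{2}$.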

We will now define three constants $ c_{1}, c_{2}, c_{3} $, independent of the discretization parameters k, h and of the Tikhonov parameter $\alpha $. These constants are given by

$$\begin{aligned}&\sup _{q\in B_{\frac{r}{2}}(\bar{q})} \Vert (\omega ,u_{kh}(q)-u(q))\Vert _{L^\infty (I)} \le c_1 c(k,h),\\&\sup _{q\in B_{\frac{r}{2}}(\bar{q})}\Vert G^{''}(q)\Vert _{\mathcal L (L^2(I,{\mathbb {R}}^m)^2;L^\infty (I))}, \sup _{q\in B_{\frac{r}{2}}(\bar{q})}\Vert G^{''}_{kh}(q)\Vert _{\mathcal L (L^2(I,{\mathbb {R}}^m)^2;L^\infty (I))}\le c_{2},\\&\sup _{q\in B_{\frac{r}{2}}(\bar{q})} \Vert (G'_{kh}(q)-G'(\bar{q}))(q_\gamma -\bar{q})\Vert _{L^\infty (I)} \le c_3 \Bigl (c(k,h) + \frac{r^2}{2}\Bigr ), \end{aligned}$$

where $ B_{\frac{r}{2}}(\bar{q}) $ denotes an $ L^{2}(I, {\mathbb {R}}^{m}) $ ball centered in $ \bar{q} $ with radius $\frac{r}{2}$.

Remark 5.1

To see that these constants are independent of k, h, we proceed as follows:

For the constant $c_1$, we notice that this error can be estimated by the discretization errors obtained by Theorems 4.1 and 4.2, noting that by the proof of these theorems the constant in the error estimates remains bounded on $B_{\frac{r}{2}}(\bar{q})$.
The constant $c_2$ is a consequence of G being a $C^2$ functional together with a discretization error bound for $G''_{kh}$.
For the constant $c_3$, we notice that
$$\begin{aligned} F(\varphi )=F_{kh}(\varphi )=\int _{\varOmega }\varphi (t,x)\omega (x) \mathrm {d}x, \quad \varphi \in W(0,T) \cup U_{kh} \end{aligned}$$
is linear and consequently the error satisfies
$$\begin{aligned} (G'_{kh}(q)&-G'(\bar{q}))(q_\gamma -\bar{q})=F_{kh}\big (S^{'}_{kh}(q)(q_{\gamma }-\bar{q})\big ) -F\big (S^{'}(\bar{q})(q_{\gamma }-\bar{q})\big )\\&=\Big (\omega ,\big (S^{'}_{kh}(q) - S^{'}(\bar{q})\big )(q_{\gamma }-\bar{q})\Big )\\&= \Big (\omega ,\big (S^{'}_{kh}(q)-S^{'}(q)+S^{'}(q)-S^{'}(\bar{q})\big )(q_{\gamma }-\bar{q})\Big )\\&\le c\Big (\Vert \big (S^{'}_{kh}(q)-S^{'}(q)\big )(q_{\gamma }-\bar{q})\Vert _{L^{\infty }(I,H)}\\&\quad +\Vert q-\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}\Vert q_{\gamma }-\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}\Big ), \end{aligned}$$
where, in the last step, we used the stability of $S'$, i.e., (4c). The remaining term is a discretization error that can be estimated by [10, Corollary 5.5 and 5.11]. Namely, we have
$$\begin{aligned} \Vert \big (S^{'}_{kh}(q)-S^{'}(q) \big )(q_{\gamma }-\bar{q})\Vert _{L^{\infty }(I,H)}&\le c\cdot c(k,h)\cdot \\&\quad \cdot \big (\Vert g\Vert _{L^{\infty }(\varOmega )}\Vert q_{\gamma }-\bar{q}\Vert _{L^{\infty }(I,{\mathbb {R}}^{m})}\big ). \end{aligned}$$
By virtue of the control constraints, we have
$$\begin{aligned} \Vert q_{\gamma }-\bar{q}\Vert _{L^{\infty }(I,{\mathbb {R}}^{m})}\le |q_{\max }-q_{\min } |. \end{aligned}$$
Then, thanks to (34) and $\Vert q-\bar{q}\Vert _{L^2(I,{\mathbb {R}}^m)} \le \frac{r}{2}$, we conclude
$$\begin{aligned} |(G'_{kh}(q)-G'(\bar{q}))(q_\gamma -\bar{q})|\le c_{3}\Bigg (c(k,h)+\frac{r^{2}}{2}\Bigg ). \end{aligned}$$

Moreover, by the above arguments, clearly, $c_1, c_2, c_3$ remain bounded as $r \rightarrow 0$.

As we have seen in the discussion after Assumption 5.1, we have $ \gamma (r)\simeq r\gamma $. Hence, there exists $ \tilde{r}\le r$ such that

$$\begin{aligned} -\gamma (\tilde{r})+\Big (c_{2}+\frac{c_{3}}{2}\Big )\tilde{r}^{2} \le -\frac{3}{4}\gamma (\tilde{r}). \end{aligned}$$

(35)
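To make the choice of $\tilde{r}$ concrete, the following worked step may help. It is a sketch under the assumption that $\tilde{r}\le 2\Vert q_{\gamma }-\bar{q}\Vert $, so that $t=\tilde{r}/(2\Vert q_{\gamma }-\bar{q}\Vert )$ and $\gamma (\tilde{r})=t\gamma $ exactly; the abbreviation $d$ is introduced here only for illustration.

```latex
\begin{aligned}
-\gamma(\tilde r) + \Big(c_{2} + \frac{c_{3}}{2}\Big)\tilde r^{2} \le -\frac{3}{4}\gamma(\tilde r)
 &\iff \Big(c_{2} + \frac{c_{3}}{2}\Big)\tilde r^{2} \le \frac{1}{4}\gamma(\tilde r)
   = \frac{\tilde r\,\gamma}{8d},
   \qquad d := \Vert q_{\gamma} - \bar q\Vert_{L^{2}(I,\mathbb{R}^{m})},\\
 &\iff \tilde r \le \frac{\gamma}{8d\,\big(c_{2} + \frac{c_{3}}{2}\big)}.
\end{aligned}
```

Since $c_{2}$ and $c_{3}$ remain bounded as $r\rightarrow 0$, the right-hand side stays bounded away from zero, and any sufficiently small $\tilde{r}$ is admissible.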

We can now summarize our requirements on r. Throughout the rest of the paper we rely on the following.

Assumption 5.2

Let the radius $ r > 0 $ be small enough, such that (35) holds and the quadratic growth condition (14) holds for elements in $ Q^{r}_{\mathrm{feas}} $. Namely,

$$\begin{aligned} j(q) \ge j(\bar{q}) + \delta \Vert q-\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}^{2}, \end{aligned}$$

for any $ q \in Q^{r}_{\mathrm{feas}} $.

After this preparation, we construct feasible competitors for ($\mathbb {P}_{kh}^{r}$).

Proposition 5.1

Let $ \bar{q} $ be a local solution of ($\mathbb {P}$) and $ q_{\gamma } $ be the Slater point from Assumption 2.2. Let

$$\begin{aligned} t(k,h)=\frac{c_{1} \cdot c(k,h)}{c_{4}r^{2}-\gamma } \end{aligned}$$

be given with $ c_{4} $ such that $ c_{4}r^{2}-\gamma = \gamma /4$. Then, the sequence of controls defined by

$$\begin{aligned} q_{t(k,h)}= \bar{q}+t(k,h)(q_{\gamma }-\bar{q}) \end{aligned}$$

is feasible for ($\mathbb {P}_{kh}^{r}$), for k, h sufficiently small, such that $ 0< t(k,h) < 1 $.

Proof

To verify the feasibility of $q_{t(k,h)}$, we use a Taylor expansion argument. The definition of $ q_{t(k,h)} $ suggests to expand $ G(q_{t(k,h)}) $ at $ \bar{q} $, obtaining

$$\begin{aligned} G(q_{t(k,h)}) = G(\bar{q})+G^{'}(\bar{q})(q_{t(k,h)}-\bar{q})+\frac{1}{2}G^{''}(q_{\zeta })(q_{t(k,h)}-\bar{q})^{2}, \end{aligned}$$

where $ q_{\zeta } $ is a convex combination of $ q_{t(k,h)}$ and $ \bar{q}$.

We insert this expansion in the following calculations

$$\begin{aligned} G_{kh}(q_{t(k,h)})&= G_{kh}(q_{t(k,h)})-G(q_{t(k,h)})+G(q_{t(k,h)})\\&=G_{kh}(q_{t(k,h)})-G(q_{t(k,h)})+G(\bar{q})+G^{'}(\bar{q})(q_{t(k,h)}-\bar{q})\\&\quad +\frac{1}{2}G^{''}(q_{\zeta })(q_{t(k,h)}-\bar{q})^{2}\\&=G_{kh}(q_{t(k,h)})-G(q_{t(k,h)})+G(\bar{q})+t(k,h)G(\bar{q})-t(k,h)G(\bar{q})\\&\quad +t(k,h)G^{'}(\bar{q})(q_{\gamma }-\bar{q})+\frac{1}{2}G^{''}(q_{\zeta })(q_{t(k,h)}-\bar{q})^{2}\\&=\underbrace{G_{kh}(q_{t(k,h)})-G(q_{t(k,h)})}_{(a_{1})}\\&\quad +\underbrace{(1-t(k,h))G(\bar{q})+t(k,h)(G(\bar{q})+G^{'}(\bar{q})(q_{\gamma }-\bar{q}))}_{(a_{2})}\\&\quad +\underbrace{\frac{1}{2}G^{''}(q_{\zeta })(q_{t(k,h)}-\bar{q})^{2}}_{(a_{3})}. \end{aligned}$$

$(a_{1})$ :

By definition of $c_1$, it holds

$$\begin{aligned} G_{kh}(q_{t(k,h)})-G(q_{t(k,h)})&= (u_{kh}(q_{t(k,h)})-u(q_{t(k,h)}), \omega (x))_{I}\\&\le c_{1}\cdot c(k,h). \end{aligned}$$

$(a_{2})$ :

This part is handled thanks to the feasibility of $ \bar{q}$ for ($\mathbb {P}$) and Slater’s regularity condition of Assumption 2.2. Indeed, for k, h sufficiently small, such that $ 0<t(k,h)<1 $, we have

$$\begin{aligned} (1-t(k,h))G(\bar{q})&\le 0,\\ t(k,h)(G(\bar{q})+G^{'}(\bar{q})(q_{\gamma }-\bar{q}))&\le -t(k,h)\gamma , \end{aligned}$$

from which we obtain

$$\begin{aligned} (a_{2}) \le -t(k,h)\gamma . \end{aligned}$$

$(a_{3})$ :

By definition of $c_2$, it follows

$$\begin{aligned} G^{''}(q_{\zeta })(q_{t(k,h)}-\bar{q})^{2}&\le c_{2}t(k,h)^{2}\Vert q_{\gamma }-\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}^{2}\le c_{2}t(k,h)^{2}\frac{r^{2}}{4}. \end{aligned}$$

Combining the three parts and using the definition of $t(k,h)$, we have

$$\begin{aligned} G_{kh}(q_{t(k,h)})&\le c_{1}\cdot c(k,h)+t(k,h)\left( c_{2}t(k,h)\frac{r^{2}}{4}-\gamma \right) \\&= t(k,h)(c_4r^2-\gamma )+t(k,h)\left( c_{2}t(k,h)\frac{r^{2}}{4}-\gamma \right) \\&\le t(k,h) \Bigl ( c_{4} r^2 - 2\gamma + c_{2} t(k,h) r^2\Bigr ). \end{aligned}$$

Hence, for h, k sufficiently small, such that $ 0<t(k,h)<1 $, we obtain from (35) and the definition of $c_{4}$ that

$$\begin{aligned} G_{kh}(q_{t(k,h)})&\le t(k,h) \Bigl ( c_{4} r^2 - 2\gamma + c_{2} r^2\Bigr )\\&= t(k,h) \Bigl ( (c_{4}r^2 - \gamma ) + (c_{2}r^2 - \gamma )\Bigr )\\&\le t(k,h) \Bigl ( \frac{\gamma }{4} - \frac{3}{4} \gamma \Bigr )\\&= -\frac{1}{2}\, t(k,h)\, \gamma < 0, \end{aligned}$$

and the feasibility of $ q_{t(k,h)}$ is verified. $\square $
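The bookkeeping of the three parts $(a_{1})$, $(a_{2})$, $(a_{3})$ can be checked numerically. The sketch below plugs sample values into the bounds used in the proof; the constants are made-up illustration values, and the function name `feasibility_bound` is hypothetical.

```python
def feasibility_bound(c1, c2, gamma, r, c_kh):
    """Upper bound for G_kh(q_t) assembled from (a1), (a2), (a3).

    Uses t = c1*c_kh / (c4*r^2 - gamma) with c4*r^2 - gamma = gamma/4,
    as in Proposition 5.1. Returns the pair (t, bound).
    """
    t = c1 * c_kh / (gamma / 4.0)
    if not 0.0 < t < 1.0:
        raise ValueError("discretization too coarse: need 0 < t(k,h) < 1")
    a1 = c1 * c_kh                  # discretization error of the constraint
    a2 = -t * gamma                 # feasibility of q_bar plus Slater condition
    a3 = c2 * t * t * r * r / 4.0   # bound on the second-order remainder
    return t, a1 + a2 + a3


if __name__ == "__main__":
    t, bound = feasibility_bound(c1=1.0, c2=1.0, gamma=1.0, r=0.5, c_kh=0.1)
    print(f"t = {t:.3f}, bound on G_kh(q_t) = {bound:.3f}")
```

A negative bound confirms feasibility of the competitor for the chosen sample values; as $c(k,h)\rightarrow 0$ the step $t(k,h)$ shrinks, so the competitor approaches $\bar{q}$.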

The proposition above in particular ensures that $ Q^{r}_{kh,\mathrm{feas}} $ is not empty once k, h are small enough, thus we assert:

Corollary 5.1

For k, h sufficiently small, there exists at least one global solution $ \bar{q}_{kh}^{r}\in Q^{r}_{kh,\mathrm{feas}} $ of ($\mathbb {P}_{kh}^{r}$).

In a second step, we show that the linearized regularity condition of Assumption 2.2 continues to hold in the discrete setting.

Lemma 5.1

Under Assumption 2.2, for k, h small enough, it holds

$$\begin{aligned} G_{kh}(\bar{q}_{kh}^{r})+G^{'}_{kh}(\bar{q}_{kh}^{r})(q_{\gamma }-\bar{q}_{kh}^{r}) \le -\frac{1}{2}\gamma < 0\quad \text {on }{\text {cl}}(I). \end{aligned}$$

(36)

Proof

In view of Assumption 2.2, we add and subtract $ G(\bar{q}), G_{kh}(\bar{q})$, as well as $G^{'}(\bar{q})(q_{\gamma }-\bar{q}) $, to obtain

$$\begin{aligned} G_{kh}(\bar{q}_{kh}^{r})+G^{'}_{kh}(\bar{q}_{kh}^{r})(q_{\gamma }-\bar{q}_{kh}^{r})&= G(\bar{q}) + G^{'}(\bar{q})(q_{\gamma }-\bar{q}) +G_{kh}(\bar{q}_{kh}^{r})\\&\quad +G^{'}_{kh}(\bar{q}_{kh}^{r})(q_{\gamma }-\bar{q}_{kh}^{r})-G(\bar{q})-G^{'}(\bar{q})(q_{\gamma }-\bar{q})\\&\le -\gamma + \underbrace{G_{kh}(\bar{q}_{kh}^{r})+G^{'}_{kh}(\bar{q}_{kh}^{r}) (\bar{q}-\bar{q}_{kh}^{r}) - G_{kh}(\bar{q})}_{(b_{1})} \\&\quad + \underbrace{G_{kh}(\bar{q})-G(\bar{q})}_{(b_{2})} + \underbrace{\big (G^{'}_{kh}(\bar{q}_{kh}^{r})-G^{'}(\bar{q})\big )(q_{\gamma }-\bar{q})}_{(b_{3})}. \end{aligned}$$

$(b_{1})$ :

Taylor expansion of $ G_{kh}(\bar{q}) $ at $ \bar{q}_{kh}^{r}$ reads

$$\begin{aligned} G_{kh}(\bar{q})=G_{kh}(\bar{q}_{kh}^{r})+G^{'}_{kh}(\bar{q}_{kh}^{r})(\bar{q}-\bar{q}_{kh}^{r})+\frac{1}{2}G^{''}_{kh}(q_{\zeta })(\bar{q}-\bar{q}_{kh}^{r})^{2}, \end{aligned}$$

with $ q_{\zeta } $ a convex combination of $\bar{q}$ and $\bar{q}_{kh}^{r}$, yielding

$$\begin{aligned} (b_{1}) = -\frac{1}{2}G_{kh}^{''}(q_{\zeta })(\bar{q}-\bar{q}_{kh}^{r})^{2}\le c_{2}\Vert \bar{q}-\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}^{2} \le c_{2}r^{2}, \end{aligned}$$

where we used $ G_{kh} $ being a $ C^{2} $-functional together with the feasibility of $ \bar{q}_{kh}^{r}$ for ($\mathbb {P}_{kh}^{r}$).

$ (b_{2}) $ :

By definition of $c_1$ it holds

$$\begin{aligned} G_{kh}(\bar{q})-G(\bar{q})&= \int _{\varOmega }\big (u_{kh}(\bar{q})-u(\bar{q}) \big )\omega (x)\mathrm {d}x \le c_{1}\cdot c(k,h). \end{aligned}$$

$ (b_{3}) $ :

By definition of $c_3$ it follows

$$\begin{aligned} (G'_{kh}(\bar{q}^r_{kh})-G'(\bar{q}))(q_\gamma -\bar{q}) \le c_3 \Bigl ( c(k,h)+ \frac{r^2}{2}\Bigr ). \end{aligned}$$

In conclusion, for k, h sufficiently small and thanks to (35), the three estimates for $ (b_{1}), (b_{2}), (b_{3}) $ yield

$$\begin{aligned} G_{kh}(\bar{q}_{kh}^{r})+G^{'}_{kh}(\bar{q}_{kh}^{r})(q_{\gamma }-\bar{q}_{kh}^{r})&\le -\gamma +c_{2}r^{2} +c_{1}\cdot c(k,h)+ c_{3}\Big (c(k,h)+\frac{r^{2}}{2}\Big )\\&\le -\gamma +\Big (c_{2}+\frac{c_{3}}{2}\Big )r^{2}+(c_{1}+c_{3})\cdot c(k,h)\\&\le -\frac{3}{4}\gamma +(c_{1}+c_{3}) c(k,h)\\&\le -\frac{1}{2}\gamma . \end{aligned}$$

$\square $
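The last inequality in the proof implicitly fixes how fine the discretization has to be. Spelled out as a short worked step, using only the constants $c_{1}$, $c_{3}$ and $\gamma $ from the text:

```latex
-\frac{3}{4}\gamma + (c_{1}+c_{3})\, c(k,h) \le -\frac{1}{2}\gamma
\quad \Longleftrightarrow \quad
c(k,h) \le \frac{\gamma}{4\,(c_{1}+c_{3})}.
```

Since $c(k,h)\rightarrow 0$ as $k,h\rightarrow 0$, this threshold is met for all sufficiently small k, h.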

We now introduce the feasible competitors for the continuous auxiliary problem ($\mathbb {P}^{r}$).

Proposition 5.2

Let $ \bar{q}_{kh}^{r} $ be a global optimum for ($\mathbb {P}_{kh}^{r}$) and $ q_{\gamma } $ be the Slater point from Assumption 2.2. Further, let

$$\begin{aligned} \tau (k,h)=\frac{c_{1}\cdot c(k,h)}{c_{4}r^{2}-\gamma } \end{aligned}$$

be given with a constant $ c_{4} $ such that $ 0< c_{4}r^{2}-\gamma < \gamma /2 $. Then, the sequence of controls defined by

$$\begin{aligned} q_{\tau (k,h)}= \bar{q}_{kh}^{r}+ \tau (k,h)(q_{\gamma }-\bar{q}_{kh}^{r}) \end{aligned}$$

is feasible for ($\mathbb {P}^{r}$), for k, h sufficiently small.

Proof

The proof is analogous to the one of Proposition 5.1. $\square $

With these results at hand, we now show that global solutions of ($\mathbb {P}_{kh}^{r}$) converge to the considered local solution of ($\mathbb {P}$).

Proposition 5.3

Let k, h be small enough, such that Propositions 5.1 and 5.2 hold. Let $ \bar{q} $ be a local solution of ($\mathbb {P}$) satisfying the assumptions of Theorem 2.3 and Assumption 2.4, and let $ \bar{q}_{kh}^{r}$ be a global solution of ($\mathbb {P}_{kh}^{r}$). Then the following error estimate holds:

$$\begin{aligned} \Vert \bar{q}-\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}^{2}\le c\Big (k\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{2}}+h^{2}\Big (\log \frac{T}{k}+1\Big )\Big ). \end{aligned}$$

(37)

Proof

Let $ q_{t(k,h)}$ and $ q_{\tau (k,h)}$ be defined as in Propositions 5.1 and 5.2, respectively, and let k, h be small enough, such that $ 0<t(k,h), \tau (k,h)<1 $. We have

$$\begin{aligned} \Vert \bar{q}-\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \le \Vert \bar{q}-q_{\tau (k,h)}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}+\Vert q_{\tau (k,h)}-\bar{q}_{kh}^{r}\Vert _{L^{2} (I,{\mathbb {R}}^{m})}. \end{aligned}$$

For the second term, we have

$$\begin{aligned} \Vert q_{\tau (k,h)}-\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \le c\Big (k\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{2}}+h^{2}\Big (\log \frac{T}{k}+1\Big )\Big ), \end{aligned}$$

since, by the definition in Proposition 5.2, $ q_{\tau (k,h)}= \bar{q}_{kh}^{r}+ \tau (k,h)(q_{\gamma }-\bar{q}_{kh}^{r})$. Consequently, $q_{\tau (k,h)}- \bar{q}_{kh}^{r}= \tau (k,h)(q_{\gamma }-\bar{q}_{kh}^{r})$, and convergence with order $ \tau (k,h) $ follows because $\Vert q_{\gamma } -\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \le \Vert q_{\gamma } -\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} + \Vert \bar{q} -\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \le \frac{r}{2}+r \le 2r$ by (34) and the definition of $Q^{r}_{kh,\mathrm{feas}} $. Therefore, we are left with the first term.

The competitor $ q_{\tau (k,h)}$ is feasible for ($\mathbb {P}^{r}$) and, using the quadratic growth condition (14), we obtain

$$\begin{aligned} \delta \Vert \bar{q}-q_{\tau (k,h)}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}^{2}&\le j(q_{\tau (k,h)})-j(\bar{q})\\&= j(q_{\tau (k,h)})-j_{kh}(\bar{q}_{kh}^{r})+j_{kh}(\bar{q}_{kh}^{r})-j_{kh}(q_{t(k,h)})\\ {}&\;\;\;\;+j_{kh}(q_{t(k,h)})-j(\bar{q})\\&\le \underbrace{j(q_{\tau (k,h)})-j_{kh}(\bar{q}_{kh}^{r})}_{(d_{1})} + \underbrace{j_{kh}(q_{t(k,h)})-j(\bar{q})}_{(d_{2})}, \end{aligned}$$

where, in the last step, we have used that $ q_{t(k,h)}\in Q_{kh,\mathrm{feas}}^{r} $ and $ \bar{q}_{kh}^{r}$ is a global optimum for ($\mathbb {P}_{kh}^{r}$).

We now analyze the two terms separately.

$(d_{1})$ :

With simple algebraic manipulations, using the definition of the objective function j, its discrete counterpart $j_{kh}$, the Cauchy-Schwarz inequality and binomial formulas, we have

$$\begin{aligned} j(q_{\tau (k,h)})-j_{kh}(\bar{q}_{kh}^{r})&\le \frac{1}{2}\Vert u(q_{\tau (k,h)})+u_{kh}(\bar{q}_{kh}^{r})-2u_{\mathrm{d}}\Vert _{I}\Vert u(q_{\tau (k,h)})-u_{kh}(\bar{q}_{kh}^{r})\Vert _{I}\\&\quad \, +\frac{\alpha }{2}\Vert q_{\tau (k,h)}+\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}\Vert q_{\tau (k,h)}-\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}. \end{aligned}$$
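This inequality rests on a difference-of-squares identity. As a sketch, assuming the reduced objective has the tracking-type form $j(q)=\frac{1}{2}\Vert u(q)-u_{\mathrm{d}}\Vert _{I}^{2}+\frac{\alpha }{2}\Vert q\Vert _{L^{2}(I,{\mathbb {R}}^{m})}^{2}$ suggested by the right-hand side above (the symbols $a$, $b$ are introduced only for this illustration), the state part reads:

```latex
\begin{aligned}
\tfrac{1}{2}\Vert a\Vert_{I}^{2} - \tfrac{1}{2}\Vert b\Vert_{I}^{2}
 &= \tfrac{1}{2}\,(a+b,\, a-b)_{I}
 \le \tfrac{1}{2}\,\Vert a+b\Vert_{I}\,\Vert a-b\Vert_{I},\\
a &= u(q_{\tau(k,h)}) - u_{\mathrm{d}}, \qquad
b = u_{kh}(\bar q_{kh}^{r}) - u_{\mathrm{d}},\\
a+b &= u(q_{\tau(k,h)}) + u_{kh}(\bar q_{kh}^{r}) - 2u_{\mathrm{d}}, \qquad
a-b = u(q_{\tau(k,h)}) - u_{kh}(\bar q_{kh}^{r}).
\end{aligned}
```

The control term is treated in the same way with $a=q_{\tau (k,h)}$, $b=\bar{q}_{kh}^{r}$ and the factor $\frac{\alpha }{2}$.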

Then, by means of the stability of the solution u and $ u_{kh} $ of (2) and (18), respectively, together with the boundedness of $ Q_{\mathrm{ad}} $, and with the help of the Cauchy-Schwarz inequality, we get

$$\begin{aligned} j(q_{\tau (k,h)})-j_{kh}(\bar{q}_{kh}^{r})&\le c\Big (\Vert u(q_{\tau (k,h)})-u(\bar{q}_{kh}^{r})\Vert _{I}+\Vert u(\bar{q}_{kh}^{r})-u_{kh}(\bar{q}_{kh}^{r})\Vert _{I}\\&\quad +\Vert q_{\tau (k,h)}-\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \Big )\\&\le c \Big (\Vert u(\bar{q}_{kh}^{r})-u_{kh}(\bar{q}_{kh}^{r})\Vert _{I}+\Vert q_{\tau (k,h)}-\bar{q}_{kh}^{r}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \Big ), \end{aligned}$$

where, in the last step, we have used (4a).

The first term is a discretization error that can be estimated by [9, Theorems 3.3 and 4.2] together with the regularity of the solution of (2), obtaining

$$\begin{aligned} \Vert u(\bar{q}_{kh}^{r})-u_{kh}(\bar{q}_{kh}^{r})\Vert _{I} \le c(k+h^{2}). \end{aligned}$$

The estimate for the second term, $\Vert q_{\tau (k,h)}-\bar{q}_{kh}^{r}\Vert $, follows directly from Proposition 5.2. Summing up, we conclude

$$\begin{aligned} j(q_{\tau (k,h)})-j_{kh}(\bar{q}_{kh}^{r})&\le c\Big (k+h^{2} + k\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{2}}+h^{2}\Big (\log \frac{T}{k}+1\Big )\Big )\\&\le c\Big (k\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{2}}+h^{2}\Big (\log \frac{T}{k}+1\Big )\Big ). \end{aligned}$$

$ (d_{2}) $ :

We proceed exactly as for $ (d_{1}) $ and obtain

$$\begin{aligned} j_{kh}(q_{t(k,h)})-j(\bar{q})&\le \frac{1}{2}\Vert u_{kh}(q_{t(k,h)})+u(\bar{q})-2u_{\mathrm{d}}\Vert _{I}\Vert u_{kh}(q_{t(k,h)})-u(\bar{q})\Vert _{I}\\&\quad +\frac{\alpha }{2}\Vert q_{t(k,h)}+\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}\Vert q_{t(k,h)}-\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}\\&\le c\Big (\Vert u_{kh}(q_{t(k,h)})-u(q_{t(k,h)})\Vert _{I}+\Vert q_{t(k,h)}-\bar{q}\Vert _{L^{2}(I,{\mathbb {R}}^{m})} \Big )\\&\le c\Big (k\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{2}}+h^{2}\Big (\log \frac{T}{k}+1\Big )\Big ). \end{aligned}$$

Combining $ (d_{1}) $ with $ (d_{2}) $, we have the assertion. $\square $

It is readily seen that, for k, h small enough, global solutions of ($\mathbb {P}_{kh}^{r}$) are local solutions of ($\mathbb {P}_{kh}$), as the constraint $ \Vert \bar{q}-\bar{q}_{kh}^{r}\Vert _{L^{2}(I, {\mathbb {R}}^{m})}\le r $ is not active. In particular, this ensures the existence of a sequence $ \bar{q}_{kh} $ of local solutions to ($\mathbb {P}_{kh}$) converging to $ \bar{q} $. We formalize this in the main result of the paper.

Theorem 5.3

Let $ \bar{q}$ be a local solution of ($\mathbb {P}$) satisfying the assumptions of Theorem 2.3 and Assumption 2.4. Then, for k, h sufficiently small, there exists a sequence $ (\bar{q}_{kh}) $ of local solutions of ($\mathbb {P}_{kh}$) converging to $ \bar{q} $ as $k,h \rightarrow 0$. Further, the following error estimate holds:

$$\begin{aligned} \Vert \bar{q}-\bar{q}_{kh}\Vert _{L^{2}(I,{\mathbb {R}}^{m})}^{2} \le c\Bigg (k\Big (\log \frac{T}{k}+1\Big )^{\frac{1}{2}}+h^{2}\Big (\log \frac{T}{k}+1\Big )\Bigg ). \end{aligned}$$
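Since the estimate bounds the squared error, it may help to read off the rate for the error itself. Taking square roots and using $\sqrt{a+b}\le \sqrt{a}+\sqrt{b}$ for $a,b\ge 0$ gives, as a direct consequence:

```latex
\Vert \bar q - \bar q_{kh}\Vert_{L^{2}(I,\mathbb{R}^{m})}
 \le \sqrt{c}\,\Big( k^{\frac{1}{2}}\Big(\log\frac{T}{k}+1\Big)^{\frac{1}{4}}
 + h\,\Big(\log\frac{T}{k}+1\Big)^{\frac{1}{2}}\Big).
```

That is, up to logarithmic factors, the control converges with order $\frac{1}{2}$ in $k$ and order $1$ in $h$.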

6 Conclusions

Within this paper, we have shown that a weak second-order sufficient condition, and its implied quadratic growth condition, holding in a local minimizer of a quadratic optimization problem constrained by a semilinear parabolic equation, is sufficient to assert that this minimizer can be approximated by a space-time finite element approximation. The result heavily relies on the fact that for such problems no two-norm discrepancy is present—the extension to more general optimization problems, where this discrepancy cannot be avoided, has to be considered an open problem.

References

1. de Los Reyes, J.C., Merino, P., Rehberg, J., Tröltzsch, F.: Optimality conditions for state-constrained PDE control problems with time-dependent controls. Control Cybern. 37(1), 5–38 (2008)
2. Hinze, M., Pinnau, R., Ulbrich, M., Ulbrich, S.: Optimization with PDE Constraints. Mathematical Modelling: Theory and Applications. Springer, Netherlands (2010)
3. Casas, E., de los Reyes, J.C., Tröltzsch, F.: Sufficient second-order optimality conditions for semilinear control problems with pointwise state constraints. SIAM J. Optim. 19(2), 616–643 (2008)
4. Neitzel, I., Pfefferer, J., Rösch, A.: Finite element discretization of state-constrained elliptic optimal control problems with semilinear state equation. SIAM J. Control Optim. 53(2), 874–904 (2015)
5. Meidner, D., Vexler, B.: A priori error estimates for space-time finite element discretization of parabolic optimal control problems. I. Problems without control constraints. SIAM J. Control Optim. 47(3), 1150–1177 (2008)
6. Meidner, D., Vexler, B.: A priori error estimates for space-time finite element discretization of parabolic optimal control problems. II. Problems with control constraints. SIAM J. Control Optim. 47(3), 1301–1329 (2008)
7. Ludovici, F., Wollner, W.: A priori error estimates for a finite element discretization of parabolic optimization problems with pointwise constraints in time on mean values of the gradient of the state. SIAM J. Control Optim. 53(2), 745–770 (2015)
8. Leykekhman, D., Vexler, B.: Optimal a priori error estimates of parabolic optimal control problems with pointwise control. SIAM J. Numer. Anal. 51(5), 2797–2821 (2013)
9. Neitzel, I., Vexler, B.: A priori error estimates for space-time finite element discretization of semilinear parabolic optimal control problems. Numer. Math. 120(2), 345–386 (2012)
10. Meidner, D., Rannacher, R., Vexler, B.: A priori error estimates for finite element discretizations of parabolic optimization problems with pointwise state constraints in time. SIAM J. Control Optim. 49(5), 1961–1997 (2011)
11. Leykekhman, D., Meidner, D., Vexler, B.: Optimal error estimates for finite element discretization of elliptic optimal control problems with finitely many pointwise state constraints. Comput. Optim. Appl. 55(3), 769–802 (2013)
12. Neitzel, I., Wollner, W.: A priori $L^2$-discretization error estimates for the state in elliptic optimization problems with pointwise inequality state constraints. Numer. Math. 138(2), 273–299 (2018). https://doi.org/10.1007/s00211-017-0906-6
13. Chrysafinos, K., Karatzas, E.N.: Symmetric error estimates for discontinuous Galerkin approximations for an optimal control problem associated to semilinear parabolic PDE’s. Discrete Contin. Dyn. Syst. Ser. B 17(5), 1473–1506 (2012)
14. Chrysafinos, K.: Convergence of discontinuous Galerkin approximations of an optimal control problem associated to semilinear parabolic PDE’s. M2AN Math. Model. Numer. Anal. 44(1), 189–206 (2010)
15. Casas, E., Mateos, M.: Uniform convergence of the FEM. Applications to state constrained control problems. Comput. Appl. Math. 21(1), 67–100 (2002)
16. Hinze, M., Meyer, C.: Stability of semilinear elliptic optimal control problems with pointwise state constraints. Comput. Optim. Appl. 52(1), 87–114 (2012)
17. Leykekhman, D., Vexler, B.: Pointwise best approximation results for Galerkin finite element solutions of parabolic problems. SIAM J. Numer. Anal. 54(3), 1365–1384 (2016)
18. Gong, W., Hinze, M.: Error estimates for parabolic optimal control problems with control and state constraints. Comput. Optim. Appl. 56(1), 131–151 (2013)
19. Deckelnick, K., Hinze, M.: Variational discretization of parabolic control problems in the presence of pointwise state constraints. J. Comput. Math. 29(1), 1–15 (2011)
20. Casas, E., Tröltzsch, F.: Second order optimality conditions and their role in PDE control. Jahresber. Dtsch. Math. Ver. 117(1), 3–44 (2015)
21. Bonnans, J., Shapiro, A.: Perturbation Analysis of Optimization Problems, Springer Series Operations Research Financial Engineering, 1st edn. Springer, New York (2010)
22. Goldberg, H., Tröltzsch, F.: Second-order sufficient optimality conditions for a class of nonlinear parabolic boundary control problems. SIAM J. Control Optim. 31(4), 1007–1025 (1993)
23. Casas, E.: Pontryagin’s principle for state-constrained boundary control problems of semilinear parabolic equations. SIAM J. Control Optim. 35(4), 1297–1327 (1997)
24. Krumbiegel, K., Rehberg, J.: Second order sufficient optimality conditions for parabolic optimal control problems with pointwise state constraints. SIAM J. Control Optim. 51(1), 304–331 (2013)
25. Bonnans, J.F., Jaisson, P.: Optimal control of a parabolic equation with time-dependent state constraints. SIAM J. Control Optim. 48(7), 4550–4571 (2010)
26. Casas, E., Raymond, J.P., Zidani, H.: Pontryagin’s principle for local solutions of control problems with mixed control-state constraints. SIAM J. Control Optim. 39(4), 1182–1203 (2000)
27. Raymond, J.P., Tröltzsch, F.: Second order sufficient optimality conditions for nonlinear parabolic control problems with state constraints. Discrete Contin. Dyn. Syst. 6(2), 431–450 (2000)
28. Ludovici, F.: Numerical analysis of parabolic optimal control problems with restrictions on the state and its first derivative. Ph.D. thesis, Technische Universität Darmstadt (2017). http://tuprints.ulb.tu-darmstadt.de/6781/
29. Tröltzsch, F.: Optimal Control of Partial Differential Equations. Theory, Methods and Applications. Graduate Studies in Mathematics, vol. 112. AMS, Providence (2010)
30. Casas, E., Mateos, M.: Second order optimality conditions for semilinear elliptic control problems with finitely many state constraints. SIAM J. Control Optim. 40(5), 1431–1454 (2002). (electronic)
31. Thomée, V.: Galerkin Finite Element Methods for Parabolic Problems, Springer Ser. Comput. Math., 2nd edn. Springer, Berlin (2006)
32. Hinze, M.: A variational discretization concept in control constrained optimization: the linear-quadratic case. Comput. Optim. Appl. 30(1), 45–61 (2005)
33. Brenner, S.C., Scott, R.L.: The Mathematical Theory of Finite Element Methods, Texts in Applied Mathematics, 3rd edn. Springer, New York (2008)
34. Bank, R.E., Yserentant, H.: On the $H^1$-stability of the $L_2$-projection onto finite element spaces. Numer. Math. 126(2), 361–381 (2014)
35. Nochetto, R.H.: Sharp $L^\infty $-error estimates for semilinear elliptic problems with free boundaries. Numer. Math. 54(3), 243–255 (1988)
36. Falk, R.S.: Approximation of a class of optimal control problems with order of convergence estimates. J. Math. Anal. Appl. 44, 28–47 (1973)
37. Meyer, C.: Error estimates for the finite-element approximation of an elliptic control problem with pointwise state and control constraints. Control Cybern. 37(1), 51–83 (2008)
38. Casas, E., Tröltzsch, F.: Error estimates for the finite-element approximation of a semilinear elliptic control problem. Control Cybern. 31(3), 695–712 (2002)

Acknowledgements

The authors are grateful for the support of their former host institutions. To this end, I. Neitzel acknowledges the support of the Technische Universität München and F. Ludovici and W. Wollner the support of the Universität Hamburg.

Author information

Authors and Affiliations

Fachbereich Mathematik, Technische Universität Darmstadt, Dolivostr. 15, 64293, Darmstadt, Germany
Francesco Ludovici & Winnifried Wollner
Institut für Numerische Simulation, Rheinische Friedrich-Wilhelms-Universität Bonn, Wegelerstr. 6, 53115, Bonn, Germany
Ira Neitzel

Corresponding author

Correspondence to Winnifried Wollner.

About this article

Cite this article

Ludovici, F., Neitzel, I. & Wollner, W. A Priori Error Estimates for State-Constrained Semilinear Parabolic Optimal Control Problems. J Optim Theory Appl 178, 317–348 (2018). https://doi.org/10.1007/s10957-018-1311-8

Received: 13 November 2017
Accepted: 14 May 2018
Published: 19 June 2018
Issue Date: August 2018
DOI: https://doi.org/10.1007/s10957-018-1311-8
