A priori error analysis of the upwind symmetric interior penalty Galerkin (SIPG) method for the optimal control problems governed by unsteady convection diffusion equations

Akman, Tuğba; Yücel, Hamdullah; Karasözen, Bülent

doi:10.1007/s10589-013-9601-4

Published: 25 September 2013

Volume 57, pages 703–729, (2014)
Computational Optimization and Applications

Abstract

In this paper, we analyze the symmetric interior penalty Galerkin (SIPG) method for distributed optimal control problems governed by unsteady convection diffusion equations with control constraints. A priori error estimates are derived for the semi- and fully-discrete schemes by using piecewise linear functions. Numerical results are presented, which verify the theoretical results.

1 Introduction

Optimal control problems (OCPs) governed by convection diffusion partial differential equations (PDEs) arise in environmental modeling, petroleum reservoir simulation and many other applications. Hence, efficient numerical methods are essential to obtain accurate solutions of such optimal control problems.

It is well known that the standard Galerkin finite element method produces nonphysical oscillating solutions for mesh sizes larger than a critical value depending on the ratio between the diffusion and convection terms. To enhance the stability and accuracy of approximations of optimal control problems governed by steady convection diffusion equations, several effective stabilization techniques have been used, e.g., the streamline upwind/Petrov Galerkin (SUPG) finite element method [6], the local projection stabilization [1], and the edge stabilization [12, 25]. Recently, discontinuous Galerkin (DG) methods have become popular for optimal control problems governed by convection diffusion equations due to their better convergence behavior, local mass conservation, flexibility in approximating rough solutions on complicated meshes, and mesh adaptation, see, e.g., [15, 16, 26–28].

However, to the best of our knowledge, only a few papers have been published so far on unsteady optimal control problems governed by convection diffusion equations. A characteristic finite element approximation in space and the backward Euler method in time are used in [9, 10]. Zhou et al. [29] used the local discontinuous Galerkin (LDG) discretization in space, whereas Sun [24] used the nonsymmetric interior penalty Galerkin (NIPG) discretization. In [24], a priori error estimates are given only for the semi-discrete scheme, whereas in [29] they are derived for both the semi- and fully-discrete schemes with the backward Euler method. Neither paper reports numerical results.

In this paper, we carry out an a priori error analysis of optimal control problems governed by unsteady convection diffusion equations using the symmetric interior penalty Galerkin (SIPG) method for the semi- and fully-discrete schemes. For the time discretization, we apply the backward Euler method. We also present numerical results for the DG discretization of unsteady optimal control problems.

The rest of the paper is organized as follows: In Sect. 2, we introduce the control constrained optimal control problems governed by the unsteady convection diffusion equations. The upwind symmetric interior penalty Galerkin (SIPG) discretization and semi-discrete scheme are given in Sect. 3. A priori error estimates of the semi-discrete scheme are derived in Sect. 4. In Sect. 5, we give the fully-discrete scheme of the optimal control problems by using the backward Euler discretization in time. We derive a priori error estimates of the fully-discrete scheme in Sect. 6. Finally, we present the numerical results in Sect. 7.

2 The optimal control problem

We adopt the standard notation for Sobolev spaces on computational domains and their norms. Ω and Ω _U are bounded convex polygonal domains in $\mathbb{R}^{2}$ with Lipschitz boundaries ∂Ω and ∂Ω _U, respectively. The inner products in L ²(Ω _U) and L ²(Ω) are denoted by (⋅,⋅)_U and (⋅,⋅), respectively. Further, we consider spaces of functions mapping the time interval (0,T) to a normed space X in which the norm ∥⋅∥_X is defined. For r≥1, we define

$$L^{r}(0,T;X)= \biggl\{ z:[0,T] \rightarrow X \ \hbox{measurable} : \int_{0}^{T} \big\|z(t)\big\|_{X}^{r} \,dt < \infty\biggr\} $$

with

$$\begin{aligned} \|z\|_{L^{r}(0,T;X)}=\left \{ \begin{array}{l@{\quad}l} ( \int_{0}^{T} \|z(t)\|_{X}^{r} \, dt )^{1/r}, & \hbox{if} \ 1 \leq r < \infty, \\ \hbox{ess} \sup_{t \in(0,T]} \|z(t)\|_{X}, & \hbox{if} \ r=\infty. \end{array} \right . \end{aligned}$$
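The norm above can be approximated from time samples. The following is a minimal sketch, assuming the spatial norms ‖z(t_k)‖_X have already been computed at time points t_k; the function name `bochner_norm` and the use of the trapezoidal rule are illustrative choices, not part of the paper.

```python
import numpy as np

def bochner_norm(t, x_norms, r):
    """Approximate the L^r(0,T;X) norm from samples of ||z(t_k)||_X.

    t       : increasing time points t_0 < ... < t_N (assumed to cover [0, T])
    x_norms : values ||z(t_k)||_X at those time points
    r       : exponent, 1 <= r <= infinity (use np.inf for the sup norm)
    """
    t = np.asarray(t, dtype=float)
    g = np.asarray(x_norms, dtype=float)
    if np.isinf(r):
        # sup over the samples stands in for the essential supremum
        return float(np.max(g))
    integrand = g ** r
    # composite trapezoidal rule for the time integral
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))
    return float(integral ** (1.0 / r))
```

For example, with ‖z(t)‖_X = t on (0,1), the sketch approximates (1/3)^{1/2} for r=2 and returns the largest sample for r=∞.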

In this paper, we are interested in the following distributed optimal control problem governed by the unsteady diffusion convection reaction equation with control constraints

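$$\begin{aligned} \underset{u \in U_{ad}}{\hbox{minimize}} \quad J(y,u):=\int _{0}^{T} \biggl(\frac{1}{2} \|y-y_{d}\|^{2}_{L^2(\varOmega)} + \frac{\alpha}{2} \| u \|^{2}_{L^2(\varOmega_U)} \biggr)\, dt \end{aligned}$$

(2.1)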

subject to

$$\begin{aligned} &\partial_t y-\varepsilon \varDelta y+\beta\cdot\nabla y+r y = f + B u \quad x\in\varOmega, \ t \in(0,T], \end{aligned}$$

(2.2a)

$$\begin{aligned} &y(x,t)=0 \quad x \in\partial\varOmega, \ t \in (0,T], \end{aligned}$$

(2.2b)

$$\begin{aligned} &y(x,0)=y_{0}(x) \quad x\in\varOmega, \end{aligned}$$

(2.2c)

where the admissible set of controls is given by

$$\begin{aligned} U_{ad} = \bigl\{ u \in L^{2} \bigl(0,T;L^2(\varOmega_U)\bigr): u_a \le u \le u_b, \hbox{ a.e. in } \varOmega_U \times(0,T] \bigr\} \end{aligned}$$

(2.3)

with the constant bounds $u_{a}, u_{b} \in\mathbb{R} \cup\{\pm\infty\} $ satisfying u _a<u _b. B is a bounded linear operator that maps controls defined on Ω _U to functions on Ω. In general, Ω _U can be a subset of Ω. In the special case Ω _U=Ω, B=I is the identity operator.

We make the following assumptions for the functions and parameters on the optimal control problem (2.1), (2.2a)–(2.2c):

(i)
The source function f and the desired state y _d belong to H ¹(0,T;L ²(Ω)) with $f(0), y_{d}(T) \in H_{0}^{1}(\varOmega)$.
(ii)
The initial condition is defined as $y_{0}(x) \in H^{1}_{0}(\varOmega)$ with $\varDelta y_{0} \in H^{1}_{0}(\varOmega)$.
(iii)
The diffusion and reaction parameters are denoted by ε>0 and r∈L ^∞(Ω), respectively.
(iv)
β denotes a velocity field. It belongs to (W ^1,∞(Ω))² and satisfies the incompressibility condition, i.e. ∇⋅β=0.

Further, we assume the existence of a constant c ₀≡c ₀(x)≥0 such that

$$\begin{aligned} r(x) \geq c_{0} \geq0 \quad\hbox{a.e. in}\ \varOmega \end{aligned}$$

(2.4)

to ensure the well-posedness of the optimal control problem (2.1), (2.2a)–(2.2c).

Using the assumptions defined above, the following result on regularity of the state solution can be proved.

Proposition 1

Under the assumptions defined above and for a given control u∈H ¹(0,T;L ²(Ω _U)), the state y satisfies the following regularity condition

$$y \in H^1\bigl(0,T; H^2(\varOmega) \cap H^1_0(\varOmega)\bigr) \cap H^2\bigl(0,T; L^2(\varOmega)\bigr) $$

and the weak formulation

$$\begin{aligned} &(\partial_t y,v)+ a(y,v)+b(u,v)=(f,v) \quad\forall v \in V=H^1_0(\varOmega), \ t\in(0,T], \end{aligned}$$

(2.5)

$$\begin{aligned} &y(x,0)=y_0, \end{aligned}$$

(2.6)

where the (bi)-linear forms are defined by

$$\begin{aligned} &a(y,v)=\int_{\varOmega} (\varepsilon \nabla y \cdot\nabla v + \beta \cdot\nabla y v + r y v)\, dx, \qquad b(u,v)=-\int_{\varOmega} Bu v \, dx, \\ &(f,v)= \int_{\varOmega} f v \, dx. \end{aligned}$$

Proof

The regularity of the state $y \in H^{1}(0,T; H^{2}(\varOmega) \cap H^{1}_{0}(\varOmega)) \cap H^{2}(0,T; L^{2}(\varOmega))$ can be proved as done in [7] provided that f+Bu∈H ¹(0,T;L ²(Ω)) with $(f + Bu)(0) \in H^{1}_{0}(\varOmega)$ is satisfied. This condition is ensured by our assumptions. See, e.g., [20] for details. □

Then, the variational formulation corresponding to (2.1), (2.2a)–(2.2c) can be written as

$$\begin{aligned} &\underset{u \in U_{ad}}{\hbox{minimize}} \quad J(y,u):=\int _{0}^{T} \biggl(\frac{1}{2} \|y-y_{d}\|^{2}_{L^2(\varOmega)} + \frac{\alpha}{2} \| u \|^{2}_{L^2(\varOmega_U)} \biggr)\, dt \end{aligned}$$

(2.7a)

$$\begin{aligned} &\hbox{subject to} \quad (\partial_t y,v)+ a(y,v)+b(u,v)=(f,v) \quad\forall v \in V, \ t \in(0,T], \\ &\phantom{\hbox{subject to}}\quad y(x,0)=y_0, \\ & \phantom{\hbox{subject to}}\quad(y,u) \in Y \times U_{ad}. \end{aligned}$$

(2.7b)

It is well known that the pair (y,u) is the unique solution of (2.7a), (2.7b) if and only if there is an adjoint $p \in H^{1}(0,T; H^{2}(\varOmega) \cap H^{1}_{0}(\varOmega)) \cap H^{2}(0,T; L^{2}(\varOmega))$ such that (y,u,p) satisfies the following optimality system

$$\begin{aligned} &(\partial_t y,v)+a(y,v)+b(u,v)=(f,v) \quad \forall v \in V,\ y(x,0)=y_0, \end{aligned}$$

(2.8a)

$$\begin{aligned} &-(\partial_t p,\psi)+a(\psi,p)=-(y-y_{d},\psi) \quad \forall\psi\in V,\ p(x,T)=0, \end{aligned}$$

(2.8b)

$$\begin{aligned} &\int_{0}^{T} \bigl(\alpha u- B^* p, w -u \bigr)_U\, dt \geq0 \quad \forall w \in U_{ad}, \end{aligned}$$

(2.8c)

where B ^∗ denotes the adjoint of B [18, 20].
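A variational inequality of the form (2.8c) over a box-constrained set is, by the standard characterization of the L ² projection, equivalent to the pointwise formula u = max(u _a, min(u _b, B ^∗p/α)). The sketch below illustrates this at a single time level on sampled values; the function names, the sample arrays and the quadrature weights are hypothetical and only serve to check the inequality numerically.

```python
import numpy as np

def project_control(bstar_p, alpha, u_a, u_b):
    """Pointwise projection of B^*p / alpha onto the box [u_a, u_b]."""
    return np.clip(np.asarray(bstar_p, dtype=float) / alpha, u_a, u_b)

def variational_inequality_lhs(u, bstar_p, alpha, w, weights):
    """Quadrature approximation of (alpha*u - B^*p, w - u)_U at one time level.

    weights are assumed to be positive quadrature weights on Omega_U.
    """
    return float(np.sum(weights * (alpha * u - bstar_p) * (w - u)))
```

With u computed by `project_control`, each pointwise product (αu − B ^∗p)(w − u) is nonnegative for any admissible w, so the quadrature sum is nonnegative as well.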

3 Discontinuous Galerkin (DG) scheme for optimal control problem

3.1 Discontinuous Galerkin discretization

Let $\{ \mathcal{T}_{h}\}_{h}$ be a family of shape regular meshes such that $\overline{\varOmega} = \cup_{K \in\mathcal{T}_{h}} \overline{K}$, K _i∩K _j=∅ for $K_{i}, K_{j} \in\mathcal{T}_{h}$, i≠j. The diameter of an element K and the length of an edge E are denoted by h _K and h _E, respectively. Further, the maximum value of element diameter is denoted by $h=\max_{K \in\mathcal{T}_{h}} h_{K}$.

We only consider discontinuous piecewise linear finite element spaces to define the discrete spaces of the state and test functions

$$\begin{aligned} V_h = Y_h &= \bigl \{{y \in L^2( \varOmega)}\,:~{ y\mid_{K}\in\mathbb{P}^1(K) \ \forall K \in\mathcal{T}_h}\bigr \}. \end{aligned}$$

(3.1)

Remark 1

When the state equation (2.2a)–(2.2c) contains nonhomogeneous Dirichlet boundary conditions, the space of discrete states Y _h and the space of test functions V _h can still be taken the same due to the weak treatment of boundary conditions in DG methods. See [16] for details.

We split the set of all edges $\mathcal{E}_{h}$ into the set $\mathcal {E}^{0}_{h}$ of interior edges and the set $\mathcal{E}^{\partial}_{h}$ of boundary edges so that $\mathcal{E}_{h}=\mathcal{E}^{\partial}_{h}\cup \mathcal{E}^{0}_{h}$. Let n denote the unit outward normal to ∂Ω. We define the inflow boundary

$$\varGamma^- = \bigl \{{x \in\partial\varOmega}\,:~{ \beta\cdot\mathbf{n}(x) < 0}\bigr \} $$

and the outflow boundary Γ ⁺=∂Ω∖Γ ⁻. The boundary edges are decomposed into edges $\mathcal{E}^{-}_{h} = \{{E \in\mathcal{E}^{\partial}_{h}}\,:~{ E \subset \varGamma^{-} }\}$ that correspond to inflow boundary and edges $\mathcal{E}^{+}_{h} = \mathcal{E}^{\partial}_{h} \setminus\mathcal {E}^{-}_{h}$ that correspond to outflow boundary. The inflow and outflow boundaries of an element $K \in\mathcal{T}_{h}$ are defined by

$$\begin{aligned} \partial K^-=\bigl \{{x \in\partial K}\,:~{\beta\cdot\mathbf{n}_{K}(x) <0}\bigr \}, \qquad\partial K^{+} = \partial K \setminus\partial K^{-}, \end{aligned}$$

where n _K is the unit normal vector on the boundary ∂K of an element K.

Let the edge E be a common edge for two elements K and K ^e. For a piecewise continuous scalar function y, there are two traces of y along E, denoted by y|_E from inside K and y ^e|_E from inside K ^e. Then, the jump and average of y across the edge E are defined by:

$$\begin{aligned}{} [\![ y ]\!]=y\big|_E\mathbf{n}_{K}+y^e\big|_E \mathbf{n}_{K^e}, \qquad \left \{\!\left \{ y \right \}\!\right \}=\frac{1}{2} \bigl( y\big|_E+y^e\big|_E \bigr). \end{aligned}$$

(3.2)

Similarly, for a piecewise continuous vector field ∇y, the jump and average across an edge E are given by

$$\begin{aligned}{} [\![ \nabla y ]\!]=\nabla y\big|_E \cdot\mathbf{n}_{K}+\nabla y^e\big|_E \cdot\mathbf{n}_{K^e}, \qquad \left \{\!\left \{ \nabla y \right \}\!\right \}=\frac{1}{2} \bigl(\nabla y\big|_E+\nabla y^e\big|_E \bigr). \end{aligned}$$

(3.3)

For a boundary edge E⊂∂K∩∂Ω, we set $\left \{\!\left \{ \nabla y \right \}\!\right \}=\nabla y$ and [[y]]=y n, where n is the outward unit normal vector on ∂Ω.
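The jump and average operators in (3.2) and (3.3) can be evaluated at a single point of an interior edge as follows. This is a minimal sketch; the function names are illustrative, and the only fact used beyond the definitions is that the two outward normals on a shared edge satisfy n _{K^e} = −n _K.

```python
import numpy as np

def scalar_jump(y_in, y_ext, n_K):
    """[[y]] = y|_E n_K + y^e|_E n_{K^e}, a vector, using n_{K^e} = -n_K."""
    n_K = np.asarray(n_K, dtype=float)
    return y_in * n_K + y_ext * (-n_K)

def scalar_average(y_in, y_ext):
    """{{y}} = (y|_E + y^e|_E) / 2."""
    return 0.5 * (y_in + y_ext)

def gradient_jump(grad_in, grad_ext, n_K):
    """[[grad y]] = grad y|_E . n_K + grad y^e|_E . n_{K^e}, a scalar."""
    n_K = np.asarray(n_K, dtype=float)
    return float(np.dot(grad_in, n_K) + np.dot(grad_ext, -n_K))

def gradient_average(grad_in, grad_ext):
    """{{grad y}} = (grad y|_E + grad y^e|_E) / 2, a vector."""
    return 0.5 * (np.asarray(grad_in, dtype=float) + np.asarray(grad_ext, dtype=float))
```

Note that the jump of a scalar does not depend on which of the two elements is labeled K: swapping the traces and reversing the normal gives the same vector, and the jump vanishes when the two traces agree.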

We now consider the discretization of the control variable. Let $\{ \mathcal{T}_{h}^{U}\}_{h}$ also be a family of shape regular meshes of Ω _U such that $\overline{\varOmega}_{U} = \bigcup_{K_{U} \in\mathcal {T}_{h}^{U}} \overline{K}_{U}$, $K^{i}_{U} \cap K^{j}_{U} = \emptyset$ for $K^{i}_{U}, K^{j}_{U} \in\mathcal{T}_{h}^{U}$, i≠j. The maximum diameter is defined by $h_{U}=\max_{K_{U} \in\mathcal{T}_{h}^{U}} h_{K_{U}}$, where $h_{K_{U}}$ denotes the diameter of an element K _U. The discrete space of the control variable associated with $\{ \mathcal{T}_{h}^{U}\}_{h}$ is also the piecewise linear finite element space

$$\begin{aligned} U_{h}= \bigl \{{u \in L^2( \varOmega_U)}\,:~{ u\mid_{K_U} \in\mathbb{P}^1(K_U) \ \forall K_U \in\mathcal{T}_h^U}\bigr \}. \end{aligned}$$

(3.4)

Note that in general, the sizes of the elements in $\{ \mathcal{T}_{h}^{U}\} _{h}$ are smaller than those in $\{ \mathcal{T}_{h}\}_{h}$, so we assume that h _U/h≤C throughout this paper.

We can now give the DG discretization of the state equation (2.2a)–(2.2c) in space for a fixed control u. The DG method proposed here is based on the upwind discretization of the convection term and on the SIPG discretization of the diffusion term [22]. This leads to the following formulation with (bi-)linear forms applied to y _h∈H ¹(0,T;Y _h) for all t∈(0,T]

$$\begin{aligned} (\partial_t y_h, v_h) + a_h(y_h,v_h)+b_h(u_h,v_h)=(f_h,v_h) \quad\forall v_h \in V_h, \ t \in(0,T], \end{aligned}$$

(3.5)

where

$$\begin{aligned} &a_h(y,v) \\ &\quad = \underbrace{\sum _{K \in\mathcal{T}_h} \int _{K} \varepsilon \nabla y \cdot\nabla v \, dx - \sum _{ E \in\mathcal{E}_h} \int _E \biggl( \{\!\{ \varepsilon \nabla y \}\!\} \cdot [\![ v ]\!] + \{\!\{ \varepsilon \nabla v \}\!\} \cdot [\![ y ]\!] - \frac{\sigma \varepsilon }{h_E} [\![ y ]\!] \cdot [\![ v ]\!] \biggr) \, ds }_{a^{d}(y,v)} \\ &\qquad {}+ \!\underbrace{ \sum _{K \in\mathcal{T}_h} \int _{K} ( \beta\cdot\nabla y v + r y v ) \, dx + \!\!\sum _{K \in\mathcal{T}_h} \int _{\partial K^{-} \backslash\varGamma^-} \beta\cdot \mathbf{n} \bigl(y^e-y\bigr)v \, ds - \!\!\sum _{K \in\mathcal{T}_h} \int _{\partial K^{-} \cap\varGamma^{-}} \beta\cdot\mathbf{n} y v \, ds}_{a^{cr}(y,v)}, \end{aligned}$$

(3.6a)

$$\begin{aligned} &b_h(u, v) = - \sum _{K \in\mathcal{T}_h} \int _{K} Buv \, dx \end{aligned}$$

(3.6b)

with a constant interior penalty parameter σ>0. We choose σ to be sufficiently large, independent of the mesh size h and the diffusion coefficient ε to ensure the stability of the DG discretization as described in [21, Sect. 2.7.1] with a lower bound depending only on the polynomial degree. Large penalty parameters decrease the jumps across element interfaces, which can affect the numerical approximation. Further, the DG approximation can converge to the continuous Galerkin approximation as the penalty parameter goes to infinity. See, e.g., [5] for details.
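To make the role of the penalty parameter concrete, the diffusion part a ^d of (3.6a) can be assembled in one space dimension. The sketch below is a simplified illustration under stated assumptions, not the discretization used in the paper: it treats −εy″ on (0,1) with homogeneous Dirichlet data, uses discontinuous piecewise linear functions on a uniform mesh, replaces h _E by the element size h (an edge is a single point in 1D), and omits the convection and reaction terms. The function name `assemble_sipg_1d` is hypothetical.

```python
import numpy as np

def assemble_sipg_1d(n_elems, eps=1.0, sigma=10.0):
    """SIPG matrix of the diffusion form a^d for -eps*y'' on (0,1), DG-P1.

    Each element carries two unknowns: its left and right endpoint values.
    """
    h = 1.0 / n_elems
    n_dofs = 2 * n_elems
    A = np.zeros((n_dofs, n_dofs))

    # volume term: sum_K int_K eps * y' * v' dx
    k_loc = (eps / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    for k in range(n_elems):
        A[2 * k:2 * k + 2, 2 * k:2 * k + 2] += k_loc

    # face terms: -{{eps y'}}[[v]] - {{eps v'}}[[y]] + (sigma*eps/h) [[y]][[v]]
    for f in range(n_elems + 1):
        jump = np.zeros(n_dofs)   # coefficients of [[.]] at face f
        avg = np.zeros(n_dofs)    # coefficients of {{ d/dx . }} at face f
        if f > 0:                 # element on the left, outward normal +1
            kl = f - 1
            jump[2 * kl + 1] += 1.0
            avg[2 * kl:2 * kl + 2] += np.array([-1.0, 1.0]) / h
        if f < n_elems:           # element on the right, outward normal -1
            kr = f
            jump[2 * kr] -= 1.0
            avg[2 * kr:2 * kr + 2] += np.array([-1.0, 1.0]) / h
        if 0 < f < n_elems:       # interior face: average of the two sides
            avg *= 0.5
        A += -eps * (np.outer(jump, avg) + np.outer(avg, jump))
        A += (sigma * eps / h) * np.outer(jump, jump)
    return A
```

The resulting matrix is symmetric by construction. For a continuous piecewise linear function vanishing at both endpoints all jumps are zero, so the quadratic form reduces to ε∫(y′)² dx, which gives a simple consistency check.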

To make the notation easier for the readers, we introduce the L ² inner product on the inflow or outflow boundaries as follows

$$(w, v)_{\varGamma^{-}} = \int_{\varGamma^{-}}|\beta\cdot\mathbf{n}| w v \, ds $$

with an analogous definition of $(\cdot, \cdot)_{\varGamma^{+}}$ and associated norms $\|\cdot\|_{\varGamma^{-}}$ and $\|\cdot\|_{\varGamma ^{+}}$. Further, the standard notation W ^m,q(Ω) is used for the Sobolev space with norm $\|\cdot\|_{W^{m,q}(\varOmega)}$, and the norm on the broken Sobolev space $W^{m,q}(\mathcal{T}_h)$ used in the DG discretization is given by

$$\|\!| v |\!\|_{W^{m,q}(\mathcal{T}_h)}= \biggl( \sum _{K \in\mathcal{T}_h} \|v \|_{W^{m,q}(K)}^{2} \biggr)^{1/2}. $$

3.2 Semi-discrete formulation of optimal control problem

The discretization of admissible set (2.3) is defined by

$$\begin{aligned} U_h^{ad}=\bigl\{u_h \in L^{2}(0,T;U_h) : u_a \leq u_h \leq u_b \ \hbox{a.e. in}\ \varOmega_U \times(0,T] \bigr\}. \end{aligned}$$

(3.7)

Let $f_{h}, y_{h}^{d}$ and $y_{h}^{0}$ be approximations of the source function f, the desired state function y _d and initial condition y ₀, respectively. Then, the semi-discrete approximation of the optimal control problem (2.8a)–(2.8c) can be defined as follows:

$$\begin{aligned} &\underset{u_h \in U_h^{ad}}{\hbox{minimize}} \quad \int_{0}^{T} \biggl( \frac{1}{2} \sum_{K \in\mathcal{T}_h} \big\|y_h-y_{h}^d \big\|^{2}_{L^{2}(K)} + \frac{\alpha}{2} \sum_{K_U \in\mathcal{T}_h^U} \| u_h\|^{2}_{L^{2}(K_U)} \biggr) \, dt, \end{aligned}$$

(3.8a)

$$\begin{aligned} &\hbox{subject to} \quad (\partial_t y_h, v_h) + a_h(y_h,v_h)+b_h(u_h,v_h)=(f_h,v_h) \quad\forall v_h \in V_h, \ t \in(0,T], \\ &\hphantom{\hbox{subject to} \quad} y_h(x,0)=y_h^0,\qquad (y_h,u_h) \in Y_h \times U_h^{ad}. \end{aligned}$$

(3.8b)

4 A priori error analysis of semi-discrete scheme

In this section, we derive a priori error estimates for the semi-discrete scheme of the optimal control problem (2.1), (2.2a)–(2.2c) by using the upwind symmetric interior penalty Galerkin (SIPG) discretization for the space. By introducing the following norm [21]

$$\begin{aligned} \|v\|^{2}_{\varepsilon} = \sum _{K \in\mathcal{T}_{h}} \varepsilon \| \nabla v \|^{2}_{L^{2}(K)} + \sum_{E \in\mathcal{E}_{h}} \frac{\sigma\varepsilon}{h_{E}} \big\|[\![ v ]\!]\big\|^{2}_{L^{2}(E)}, \end{aligned}$$

we obtain the following coercivity result for some positive constant κ>0 independent of the mesh size h and the diffusion parameter ε provided that a sufficiently large penalty parameter σ is chosen based on the polynomial degree as described in [21]:

$$\begin{aligned} \forall t >0, \ \forall v_{h} \in V_{h}, \quad\kappa\|v_{h}\|^{2}_{\varepsilon} \leq a^{d}_{h}(v_{h}, v_{h}). \end{aligned}$$

(4.1)

We also need the following trace inequality in the rest of the paper:

$$\begin{aligned} \|v\|_{L^{2}(E)} \leq C |h_{E}|^{1/2} |h_{K}|^{-1/2} \|v\|_{L^{2}(K)}, \quad\forall E \subset \partial K, \end{aligned}$$

(4.2)

where the constant C is independent of the mesh size h, but depends on the polynomial degree. In addition, the generalization of the Poincaré-Friedrichs inequality to the broken Sobolev space $H^{1}(\mathcal {T}_{h})$ [4]

$$\begin{aligned} \|v\|^{2}_{L^{2}(\varOmega)} \leq C \biggl( \|\!| \nabla v |\!\|^{2}_{H^{0}(\mathcal{T}_h)} + \sum _{E \in\mathcal{E}_h} \frac{1}{h_E} \big\| \left [\!\left [ v \right ]\!\right ] \big\|^{2} _{L^{2}(E)} \biggr), \quad \forall v \in H^{1}(\mathcal{T}_h) \end{aligned}$$

(4.3)

is needed for some of the following proofs.

We now derive a semi-discrete stability estimate for the state variable in Lemma 1.

Lemma 1

Let y _h be the solution of (3.8b) and let c ₀ be a positive constant such that (2.4) holds. Then, there exists a positive constant C independent of the mesh size h such that, for all t∈(0,T],

$$\begin{aligned} &\sum_{K \in\mathcal{T}_{h}} \big\|y_h(t)\big\|^{2}_{L^{2}(K)} + \int _{0}^{t} \|y_{h}\|^{2}_{\varepsilon} \, dt \\ &\qquad{} + \int _{0}^{t} \biggl( \sum _{K \in\mathcal{T}_{h}} c_{0}\|y_{h} \|^{2}_{L^{2}(K)} + \sum _{K \in\mathcal{T}_{h}} \|y_{h}\|^{2}_{L^{2}(\partial K^{-} \cap\varGamma^{-})} \biggr) \, dt \\ &\qquad{}+ \int _{0}^{t} \biggl( \sum _{K \in\mathcal{T}_{h}} \big\|y_{h} - y_{h}^{e}\big\|^{2}_{L^{2}(\partial K^{-} \backslash\varGamma^{-})} + \sum _{K \in\mathcal{T}_{h}} \|y_{h}\|^{2}_{L^{2}(\partial K^{+} \cap \varGamma^{+})} \biggr) \, dt \\ &\quad \leq C \biggl( \sum _{K \in\mathcal{T}_{h}} \big\|y_{h}^{0} \big\|^{2}_{L^{2}(K)} + \int _{0}^{t} \biggl( \sum_{K \in\mathcal{T}_{h}} \|f\|^{2}_{L^{2}(K)} + \sum _{K \in\mathcal{T}_{h}} \|Bu_{h}\|^{2}_{L^{2}(K)} \biggr) \, dt \biggr). \end{aligned}$$

(4.4)

Proof

The proof follows that of [24, Lemma 3.1]. □

Let J(⋅) be a continuous functional on L ²(Ω). Then, there exists at least one solution of the minimization problem (3.8a), (3.8b) since $\int_{0}^{T} \sum_{K \in\mathcal{T}_{h}} \|y(u_{h})\|_{H^{1}(K)}^{2}$ is bounded as proven in Lemma 1 (see, e.g., [24] for details). Then, we can deduce that the semi-discrete optimal control problem (3.8a), (3.8b) has a unique solution $(y_{h},u_{h}) \in H^{1}(0,T;Y_{h}) \times U^{ad}_{h}$. See, e.g., [18]. The functions $(y_{h},u_{h}) \in H^{1}(0,T;Y_{h}) \times U^{ad}_{h}$ solve (3.8a), (3.8b) if and only if $(y_{h},u_{h},p_{h}) \in H^{1}(0,T;Y_{h}) \times U^{ad}_{h} \times H^{1}(0,T;Y_{h})$ is the unique solution of the following optimality system:

$$\begin{aligned} &(\partial_t y_h, v_h) + a_{h}(y_{h},v_{h})+b_{h}(u_{h},v_{h})=(f_h,v_{h}) \quad \forall v_{h} \in V_{h}, \ y_h(x,0)=y_h^0, \end{aligned}$$

(4.5a)

$$\begin{aligned} &-(\partial_t p_h, \psi_{h}) + a_{h}(\psi_{h},p_{h})=-\bigl(y_{h}-y_h^{d}, \psi_{h}\bigr)\quad \forall\psi_{h} \in V_{h}, \ p_h(x,T)=0, \end{aligned}$$

(4.5b)

$$\begin{aligned} &\int_{0}^{T} \bigl(\alpha u_{h}- B^*p_{h}, w_{h} -u_{h}\bigr)_U\, dt \geq0 \quad \forall w_{h} \in U^{ad}_{h}. \end{aligned}$$

(4.5c)

Similar to Lemma 1, we can obtain the following semi-discrete stability estimate for the adjoint variable in Lemma 2.

Lemma 2

Let p _h be the solution of (4.5b) and let c ₀ be a positive constant such that (2.4) holds. Then, there exists a positive constant C independent of h such that

$$\begin{aligned} &\sum_{K \in\mathcal{T}_{h}} \big\|p_h(t)\big\|^{2}_{L^{2}(K)} + \int_{t}^{T} \|p_{h}\|^{2}_{\varepsilon}\, dt \\ &\qquad{}+ \int _{t}^{T} \biggl( \sum _{K \in\mathcal{T}_{h}} c_0\|p_{h}\|^{2}_{L^{2}(K)} + \sum _{K \in\mathcal{T}_{h}} \|p_{h}\|^{2}_{L^{2}(\partial K^{-} \cap \varGamma^{-})} \biggr) \, dt \\ &\qquad{}+ \int _{t}^{T} \biggl( \sum _{K \in\mathcal{T}_{h}} \big\|p_{h} - p_{h}^{e}\big\|^{2}_{L^{2}(\partial K^{+} \backslash\varGamma^{+})} + \sum _{K \in\mathcal{T}_{h}} \|p_{h}\|^{2}_{L^{2}(\partial K^{+} \cap \varGamma^{+})} \biggr) \, dt \\ &\quad \leq C \int _{t}^{T} \sum _{K \in\mathcal{T}_{h}} \big\|y_{h} - y^{d}_{h}\big\|^{2}_{L^{2}(K)}\, dt. \end{aligned}$$

(4.6)

Proof

The proof is similar to that of (4.4), using the terminal condition p _h(x,T)=0. □

In order to derive a priori error estimates for the semi-discrete scheme, we make use of the following definitions and estimates. Firstly, we define an elliptic projection $\tilde{y}$ of y onto Y _h satisfying the Galerkin orthogonality

$$\begin{aligned} a^{d}_{h} \bigl(y(t) - \tilde{y}(t), v\bigr) = 0 \quad\forall t \geq0, \ \forall v\in V_{h} \end{aligned}$$

(4.7)

to derive an error estimate for y−y _h(u). Then, we use the following estimates that are given in [21]:

$$\begin{aligned} \big\|y(t) - \tilde{y}(t) \big\|_{\varepsilon} &\leq Ch\big |\!\big |\!\big | y(t) \big |\!\big |\!\big |_{H^{2}(\mathcal{T}_{h})} \; \quad\forall t \geq0, \end{aligned}$$

(4.8a)

$$\begin{aligned} \big\|y(t) - \tilde{y}(t) \big\|_{L^{2}(\varOmega)} & \leq Ch^2 \big |\!\big |\!\big | y(t) \big |\!\big |\!\big |_{H^{2}(\mathcal{T}_{h})} \quad\forall t \geq0. \end{aligned}$$

(4.8b)

Moreover, the domain Ω _U is partitioned into the active and inactive regions of the control u for each time interval, as first introduced in [17]

$$\begin{aligned} \varOmega^{*}_U &= \bigcup \bigl\{ K_U : K_U \subset\varOmega_U, \ u_{a} < u|_{K_U} < u_{b} \bigr\}, \\ \varOmega^{c}_U &= \bigcup \bigl\{ K_U : K_U \subset\varOmega_U, \ u|_{K_U} = u_{a} \hbox{ or } u|_{K_U} = u_{b} \bigr\}, \\ \varOmega^{b}_U &= \varOmega_U \backslash \bigl( \varOmega^{*}_U \cup\varOmega^{c}_U \bigr). \end{aligned}$$

It is assumed that the three sets are pairwise disjoint, i.e., $\varOmega^{i}_{U} \cap\varOmega^{j}_{U}= \emptyset$ for i≠j, and that $\varOmega_{U} = \varOmega^{*}_{U} \cup\varOmega^{c}_{U} \cup\varOmega ^{b}_{U}$. The set $\varOmega^{b}_{U}$ consists of the elements which lie close to the free boundary between the active and the inactive sets for each time interval. We also make the following assumption

$$\begin{aligned} \hbox{meas}\bigl(\varOmega^{b}_U \bigr) \leq Ch_U \end{aligned}$$

(4.9)

on the regularity of u and $\mathcal{T}_{h}^{U}$. This assumption is valid if the boundary of the level set $\varOmega^{c}_{U}$ consists of a finite number of rectifiable curves [19]. In addition, we set

$$\varOmega^* = \bigl\{ x\in\varOmega_U: u_a < u(x) <u_b \bigr\}, $$

so that $\varOmega_{U}^{*} \subset\varOmega^{*}$ [25].
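The splitting of the control mesh into $\varOmega^{*}_{U}$, $\varOmega^{c}_{U}$ and $\varOmega^{b}_{U}$ can be sketched element by element. The code below assumes that the control is represented on each element by its three vertex values, so that the test is exact for piecewise linear functions and only an approximation otherwise; the function name, the labels and the tolerance are illustrative choices.

```python
import numpy as np

def classify_elements(u_vertices, u_a, u_b, tol=1e-12):
    """Label each element as 'inactive', 'active' or 'boundary'.

    u_vertices : array of shape (n_elems, 3) with the vertex values of u
    'inactive' : u_a < u < u_b on the element            (Omega^*_U)
    'active'   : u = u_a or u = u_b on the whole element (Omega^c_U)
    'boundary' : every remaining element                 (Omega^b_U)
    """
    u_vertices = np.asarray(u_vertices, dtype=float)
    inactive = np.all((u_vertices > u_a + tol) & (u_vertices < u_b - tol), axis=1)
    at_lower = np.all(np.abs(u_vertices - u_a) <= tol, axis=1)
    at_upper = np.all(np.abs(u_vertices - u_b) <= tol, axis=1)
    labels = np.full(u_vertices.shape[0], "boundary", dtype=object)
    labels[inactive] = "inactive"
    labels[at_lower | at_upper] = "active"
    return labels
```

Summing the areas of the elements labeled 'boundary' then gives a computable proxy for meas($\varOmega^{b}_{U}$), which can be compared with C h _U on a sequence of meshes.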

We finally define

$$\begin{aligned} \bigl(J_{h}^{\prime}(u),v-u\bigr)_U= \int_{0}^{T} \bigl(\alpha u- B^*p_{h}(u),v-u\bigr)_U \, dt, \end{aligned}$$

(4.10)

in which the auxiliary solution p _h(u)∈H ¹(0,T;Y _h) is the solution of the following system

$$\begin{aligned} &\bigl(\partial_t y_h(u), v_h\bigr) + a_{h}\bigl(y_{h}(u),v_{h}\bigr)+b_{h}(u,v_{h})=(f_h,v_{h}) \\ &\quad\forall v_{h} \in V_{h}, \ y_h(u) (x,0)=y_h^0, \end{aligned}$$

(4.11a)

$$\begin{aligned} &{-}\bigl(\partial_t p_h(u), q_{h}\bigr) + a_{h}\bigl(q_{h},p_{h}(u)\bigr)=- \bigl(y_{h}(u)-y_h^{d}, q_{h}\bigr) \\ &\quad \forall q_{h} \in V_{h}, \ p_h(u) (x,T)=0, \end{aligned}$$

(4.11b)

where y _h(u)∈H ¹(0,T;Y _h) is also an auxiliary solution for given $u \in U^{ad}_{h}$.

To complete the a priori error estimate of semi-discrete scheme, we firstly derive convergence estimates between the approximate solutions (y _h,p _h) and the auxiliary solutions (y _h(u),p _h(u)).

Lemma 3

Let (y _h,p _h) and (y _h(u),p _h(u)) be the solutions of (4.5a), (4.5b) and (4.11a), (4.11b), respectively. Then, there are positive constants C ₁ and C ₂ independent of h such that

$$\begin{aligned} \big\|y_{h} -y_{h}(u)\big\|_{L^{\infty}(0,T; L^{2}(\varOmega))} \leq C_{1} \| u-u_{h} \|_{L^{2}(0,T; L^{2}(\varOmega_U))} \end{aligned}$$

(4.12a)

and

$$\begin{aligned} \big\|p_{h} - p_{h}(u)\big\|_{L^{\infty}(0,T; L^{2}(\varOmega))} \leq C_{2} \|u-u_{h} \|_{L^{2}(0,T; L^{2}(\varOmega_U))}. \end{aligned}$$

(4.12b)

Proof

By subtracting (4.11a) (respectively, (4.11b)) from (4.5a) (respectively, (4.5b)), taking v _h=y _h−y _h(u) (respectively, v _h=p _h−p _h(u)) and following the approach in the stability estimates of the semi-discrete state equation (respectively, the semi-discrete adjoint equation), the desired results are obtained. □

Now, we will derive an estimate for the control u using the discontinuous piecewise linear finite element space by following the approach in [25, 29].

Lemma 4

Let (y,p,u) and (y _h,p _h,u _h) be the solutions of (2.8a)–(2.8c) and (4.5a)–(4.5c), respectively. Assume that $u \in L^{2}(0,T;W^{1, \infty}(\varOmega_{U}))$ and $u|_{\varOmega^{*}} \in L^{2}(0,T; H^{2}(\varOmega^{*}))$. Then, we have

$$\begin{aligned} \| u - u_{h} \|_{L^{2}(0,T; L^{2}(\varOmega_U))} \leq C \bigl( h^{3/2}_U + \big\|p - p_{h}(u) \big\|_{L^{2}(0,T; L^{2}(\varOmega))} \bigr). \end{aligned}$$

(4.13)

Proof

Let $(J_{h}^{\prime}(u),v-u)_{U}= \int_{0}^{T}(\alpha u-B^{*} p_{h}(u),v-u)_{U} \, dt$, where p _h(u) is the solution of the auxiliary equation (4.11b). Then, we have

$$\begin{aligned} \bigl(J_{h}^{\prime}(v)-J_{h}^{\prime}(u),v-u \bigr)_U =& \int _{0}^{T} \bigl(\alpha(v-u),v-u \bigr)_U \, dt\\ &{} + \int _{0}^{T} \bigl(B^* p_h(u)- B^*p_h(v),v-u\bigr)_U \, dt. \end{aligned}$$

By using the auxiliary equations (4.11a) and (4.11b), we obtain

$$\begin{aligned} &\int _{0}^{T} \bigl( Bv-Bu,p_h(u)-p_h(v) \bigr)_U \, dt \\ &\quad = \int _{0}^{T} \bigl( \partial_t \bigl( y_h(v)-y_h(u)\bigr),p_h(u)-p_h(v) \bigr) \,dt\\ &\qquad{} + \int _{0}^{T} \bigl( a_h \bigl(y_h(v)-y_h(u),p_h(u)-p_h(v) \bigr) \bigr) \, dt \\ &\quad = \int _{0}^{T} \bigl( \partial_t \bigl( y_h(v)-y_h(u)\bigr),p_h(u)-p_h(v) \bigr) \,dt\\ &\qquad{} +\int _{0}^{T} \bigl( \partial_t \bigl(p_h(u)-p_h(v)\bigr),y_h(v)-y_h(u) \bigr) \, dt \\ &\qquad{} + \int _{0}^{T} \bigl( y_h(v)-y_h(u),y_h(v)-y_h(u) \bigr) \, dt. \end{aligned}$$

Integrating the first term by parts in time and using $(y_h(v)-y_h(u))|_{t=0}=0$ and $(p_h(v)-p_h(u))|_{t=T}=0$ yields

$$\begin{aligned} \int_{0}^{T} \bigl( v-u,B^*p_h(u)-B^*p_h(v) \bigr)_U \, dt=\int _{0}^{T} \bigl(y_h(v)-y_h(u),y_h(v)-y_h(u) \bigr) \, dt \geq0. \end{aligned}$$

(4.14)

By using (4.14), we obtain

$$\begin{aligned} \bigl(J_{h}^{\prime}(v)-J_{h}^{\prime}(u),v-u \bigr)_U \geq\alpha\int _{0}^{T} \|v-u \|^{2}_{L^{2}(\varOmega_U)} \, dt. \end{aligned}$$

(4.15)

With the help of the inequalities (4.15), (4.14), (2.8c), (4.5c), the standard Lagrangian interpolation Πu with Young’s inequality and the notation p _h=p _h(u _h), we obtain

$$\begin{aligned} &\alpha\|u-u_h\|^{2}_{L^{2}(0,T;L^{2}(\varOmega_U))} \\ &\quad \leq \underbrace{\int _{0}^{T} \bigl(\alpha u-B^*p,u-u_{h} \bigr)_U\,dt}_{\leq0} +\int _{0}^{T} \bigl(B^*p-B^*p_{h}(u),u-u_{h}\bigr)_U\,dt \\ &\qquad {}+ \underbrace{\int _{0}^{T} \bigl(\alpha u_{h}-B^*p_{h},u_{h}- \varPi u\bigr)_U\,dt}_{\leq0} +\int _{0}^{T} \bigl( \alpha u_{h}-B^*p_{h},\varPi u-u\bigr)_U\,dt \\ &\quad \leq \int _{0}^{T} \bigl(B^*p-B^*p_{h}(u),u-u_{h} \bigr)_U\,dt \\ &\qquad{} + \int _{0}^{T} \bigl(\alpha u_{h}-B^*p_{h}-\alpha u+B^*p,\varPi u-u\bigr)_U \,dt \\ &\qquad {}+ \int _{0}^{T} \bigl(\alpha u-B^*p,\varPi u-u \bigr)_U\,dt \\ &\quad =\int_{0}^{T} \bigl(B^*p-B^*p_{h}(u),u-u_{h} \bigr)_U\,dt +\int _{0}^{T} (\alpha u_{h}-\alpha u,\varPi u-u)_U\,dt \\ &\qquad {}+ \int _{0}^{T} \bigl(B^*p-B^*p_{h}(u),\varPi u-u\bigr)_U\,dt +\int _{0}^{T} \bigl(B^*p_{h}(u)-B^*p_{h}, \varPi u-u\bigr)_U\,dt \\ &\qquad {}+\int _{0}^{T} \bigl(\alpha u-B^*p,\varPi u-u \bigr)_U\,dt \\ &\quad \leq\int _{0}^{T} \bigl(\alpha u-B^*p,\varPi u-u\bigr)_U\,dt + C_{1} \big\|B^*p_{h}(u)-B^*p_{h}\big\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))} \\ &\qquad{} + C_{2}\|u-\varPi u\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))}+ C_{3}\big\|B^*p-B^*p_{h}(u)\big\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))} \\ &\qquad{} + C_{4} \|\alpha u-\alpha u_{h}\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))} + C_{5} \|u-u_{h}\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))}. \end{aligned}$$

(4.16)

As described in (3.4), we use the discontinuous piecewise linear finite element space for the control variable. Let Πu be the standard Lagrangian interpolation of u satisfying Πu(x)=u(x) at every vertex x. Then, Πu belongs to $U_{h}^{ad}$, and we get

$$\begin{aligned} \| u - \varPi u \|_{L^{2}(\varOmega^{*}_U)} \leq C h^{2}_U \|u \|_{H^{2}(\varOmega^{*}_U)}, \qquad\| u - \varPi u \|_{W^{0, \infty} (\varOmega^{b}_U)} \leq C h_U \|u\|_{W^{1, \infty} (\varOmega^{b}_U)} \end{aligned}$$

for u∈W ^1,∞(Ω _U) and $u|_{\varOmega^{*}} \in H^{2}(\varOmega^{*})$. Hence,

$$\begin{aligned} \|u-\varPi u\|^{2}_{L^{2}(\varOmega_U)}&= \int_{\varOmega^{*}_U} (u-\varPi u)^{2} + \int _{\varOmega^{c}_U} (u-\varPi u)^{2} +\int_{\varOmega^{b}_U} (u-\varPi u)^{2} \\ & \leq C h^{4}_U \|u\|^{2}_{H^{2}(\varOmega^{*}_U)}+0+ C h^{2}_U \|u\|^{2}_{W^{1, \infty} (\varOmega^{b}_U)} \hbox{ meas }\bigl(\varOmega^{b}_U\bigr) \\ & \leq C h^{3}_U \bigl( h_U \|u \|^{2}_{H^{2}(\varOmega^{*}_U)} + \|u\|^{2}_{W^{1, \infty} (\varOmega ^{b}_U)}\bigr) \\ &\leq C h^{3}_U \bigl( \|u\|^{2}_{H^{2}(\varOmega^{*})} + \|u\|^{2}_{W^{1, \infty} (\varOmega_U)}\bigr). \end{aligned}$$

(4.17)

By the variational inequality (2.8c), we have

$$\alpha u-B^*p=0 \quad\hbox{on}\ \varOmega^{*}_U \quad \hbox{and} \quad\varPi u-u=0 \quad\hbox{on}\ \varOmega^{c}_U. $$

In addition, there exists $x_{0} \in K_{U} \subset\varOmega^{b}_{U}$ with u _a<u(x ₀)<u _b satisfying (αu−B ^∗ p)(x ₀)=0. Then, the following estimate by [25]

$$\begin{aligned} \big\| \alpha u - B^*p \big\|_{W^{0, \infty}(\varOmega^{b}_U)} =& \big\| \alpha u - B^*p - \bigl(\alpha u - B^*p\bigr) (x_{0}) \big\|_{W^{0, \infty} (\varOmega^{b}_U)}\\ \leq& Ch_U \big\| \alpha u - B^*p \big\|_{W^{1, \infty} (\varOmega^{b}_U)} \end{aligned}$$

results in

$$\begin{aligned} \bigl(\alpha u - B^*p, \varPi u-u \bigr)_U & = \int _{\varOmega^{*}_U} \bigl(\alpha u - B^*p\bigr) (\varPi u - u) + \int _{\varOmega^{c}_U} \bigl(\alpha u - B^*p\bigr) (\varPi u - u) \\ &\quad {} + \int _{\varOmega^{b}_U} \bigl(\alpha u - B^*p\bigr) (\varPi u - u) \\ & = 0+0 +\int _{\varOmega^{b}_U} \bigl(\alpha u - B^*p\bigr) (\varPi u - u) \\ & \leq\big\| \alpha u - B^*p \big\|_{W^{0, \infty} (\varOmega^{b}_U)} \|u - \varPi u \|_{W^{0, \infty} (\varOmega^{b}_U)} \hbox{meas}\bigl(\varOmega^{b}_U\bigr) \\ & \leq C h^{3}_U \big\| \alpha u-B^*p\big\|_{W^{1, \infty}( \varOmega^{b}_U)} \|u \|_{W^{1, \infty}( \varOmega^{b}_U)}. \end{aligned}$$

(4.18)

Finally, the desired result is obtained by inserting (4.17), (4.18) and (4.12b) into (4.16). □

Remark 2

In Lemma 4, we assume that $u \in W^{1,\infty}(\varOmega_U)$ and $u \in H^{2}(\varOmega^{*})$ in space, instead of $u \in H^{2}(\varOmega_U)$, because of the regularity issues on the boundary of the control, as done in [24, 25, 29]. The control variable u has lower regularity due to the discontinuity of the derivative of u on the free boundary $\varOmega^{b}$. Hence, the convergence rate of the control u is about $h^{3/2}$. However, in numerical experiments, the optimal convergence rate can be obtained if the initial mesh is generated properly, that is, if the initial grid aligns with the points where the bounds of the control and the values of the adjoint coincide. In that case, no kink occurs.
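The effect described in Remark 2 can be illustrated with a small experiment. The following is a minimal sketch in a one-dimensional analogue, not the two-dimensional setting of the paper: the functions `aligned` and `shifted` are hypothetical stand-ins for a control of the form max(0, p), and the free-boundary region reduces to the single element containing the kink. It measures the $L^2$ error of the piecewise linear nodal interpolant on uniform meshes.

```python
import numpy as np

def interp_l2_error(u, n_elem, n_sub=64):
    """L2(0,1) error of the piecewise linear nodal interpolant of u on n_elem uniform cells."""
    nodes = np.linspace(0.0, 1.0, n_elem + 1)
    # Midpoint quadrature on a much finer grid, so that the kink is resolved.
    m = n_elem * n_sub
    xq = (np.arange(m) + 0.5) / m
    err = u(xq) - np.interp(xq, nodes, u(nodes))
    return np.sqrt(np.sum(err**2) / m)

def observed_rates(u, levels):
    """Rates log2(e_N / e_2N) between consecutive mesh levels (each level doubles N)."""
    errs = np.array([interp_l2_error(u, n) for n in levels])
    return np.log2(errs[:-1] / errs[1:])

# Hypothetical 1D stand-ins for a control u = max(0, p).
def aligned(x):
    # Kink at x = 1/2, which is a mesh node for every even number of cells.
    return np.maximum(0.0, np.sin(2.0 * np.pi * x))

def shifted(x):
    # Kinks at x = 1/3 and x = 5/6, which are never nodes when N is a power of two.
    return np.maximum(0.0, np.sin(2.0 * np.pi * (x - 1.0 / 3.0)))

levels = [32, 64, 128, 256]
rates_aligned = observed_rates(aligned, levels)
rates_shifted = observed_rates(shifted, levels)
print("kink on a node:      ", rates_aligned)
print("kink inside elements:", rates_shifted)
```

Under these assumptions, the observed rate should be close to 2 when the kink sits on a node and close to 3/2 when it falls inside an element, in line with the bound (4.17).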

In the following lemma, the connection between the exact solution of the state y (respectively, the adjoint p) and the auxiliary state solution $y_h(u)$ (respectively, the auxiliary adjoint solution $p_h(u)$) will be established.

Lemma 5

Let (y,p) be the solutions of (2.8a), (2.8b), respectively, and let $(y_h(u),p_h(u))$ be the solutions of the auxiliary equations (4.11a), (4.11b), respectively. Then, there is a constant C independent of h such that

$$\begin{aligned} \big\|y - y_{h}(u)\big\|_{L^{\infty}(0,T; L^{2}(\varOmega))} \leq Ch^2\| y \|_{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))} \end{aligned}$$

(4.19)

and

$$\begin{aligned} \big\|p - p_{h}(u)\big\|_{L^{\infty}(0,T; L^{2}(\varOmega))} \leq Ch^2 \bigl( \| p \|_{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))} + \| y \|_{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))} \bigr). \end{aligned}$$

(4.20)

Proof

To show the estimate of the state (4.19), we begin by subtracting (4.11a) from (2.8a),

$$\begin{aligned} \biggl( \frac{\partial(y - y_{h}(u))}{\partial t}, v_{h} \biggr) + a_{h}^{d} \bigl(y - y_{h}(u), v_{h}\bigr) + a_{h}^{cr} \bigl(y - y_{h}(u), v_{h}\bigr) = 0, \quad\forall v_{h} \in V_{h}. \end{aligned}$$

By writing

$$\begin{aligned} y - y_{h}(u) = (y - \tilde{y}) - \bigl(y_{h}(u) - \tilde{y}\bigr) = \eta- \xi, \end{aligned}$$

where $\tilde{y}$ is the elliptic projection of y, and taking $v_h=\xi$, we obtain

$$\begin{aligned} \biggl( \frac{\partial\xi}{\partial t}, \xi\biggr) + a_{h}^{d}(\xi, \xi) + a_{h}^{cr}(\xi, \xi) = \biggl( \frac{\partial\eta}{\partial t}, \xi \biggr) + a_{h}^{d}(\eta, \xi) + a_{h}^{cr}( \eta, \xi), \quad\forall t>0. \end{aligned}$$

Coercivity of $a_{h}^{d}(\cdot, \cdot)$ (4.1) and the Galerkin orthogonality (4.7) yield

$$\begin{aligned} &\frac{1}{2} \frac{d}{dt} \|\xi\|^{2}_{L^{2}(\varOmega)} + \kappa\|\xi\|^{2}_{\varepsilon} + \sum _{K \in\mathcal{T}_{h}} c_{0}\|\xi\|^{2}_{K} + \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \|\xi\|^{2}_{\partial K^{-} \cap\varGamma ^{-}} \\ & \qquad{}+ \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \big\|\xi- \xi^{e} \big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \big\|\xi(t)\big\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \\ &\quad \leq\biggl \vert \biggl(\frac{\partial\eta}{\partial t}, \xi\biggr) + a_{h}^{cr}(\eta, \xi) \biggr \vert . \end{aligned}$$

(4.21)

The bounds in [21] for the first term on the right-hand side of (4.21), i.e., $(\frac{\partial\eta}{\partial t}, \xi )$, and the bounds in [8] for the second term, i.e., $a_{h}^{cr}(\eta, \xi)$, give us

$$\begin{aligned} & \frac{1}{2} \frac{d}{dt} \|\xi \|^{2}_{L^{2}(\varOmega)} + \kappa\|\xi\|^{2}_{\varepsilon} + \sum _{K \in\mathcal{T}_{h}} c_{0}\|\xi\|^{2}_{K} + \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \|\xi\|^{2}_{\partial K^{-} \cap\varGamma^{-}} \\ &\qquad{}+ \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \big\|\xi- \xi^{e} \big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \|\xi\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \\ &\quad \leq \frac{\kappa}{2}\|\xi\|^{2}_{\varepsilon} + Ch^4 \bigg |\!\bigg |\!\bigg | \frac{\partial y}{\partial t} \bigg |\!\bigg |\!\bigg |^{2}_{H^{2}(\mathcal{T}_{h})} + \frac{\kappa}{8}\|\xi\|^{2}_{\varepsilon} + C \|\eta \|^{2}_{L^{2}(\varOmega)} + C \|\xi\|^{2}_{L^{2}(\varOmega)} \\ &\qquad{} +\frac{1}{4} \sum _{K \in\mathcal{T}_{h}} \big\|\xi- \xi^{e} \big\|^{2}_{{\partial K^{-} \backslash\varGamma^{-}}} + \sum _{K \in\mathcal{T}_{h}} \big\| \eta^{e}\big\|^{2}_{{\partial K^{-} \backslash\varGamma^{-}}} + \frac {1}{4} \sum _{K \in\mathcal{T}_{h}} \|\xi\|^{2}_{{\partial K^{+} \cap\varGamma ^{+}}} \\ &\qquad{} + \sum _{K \in\mathcal{T}_{h}} \|\eta\|^{2}_{{\partial K^{+} \cap\varGamma^{+}}}. \end{aligned}$$

(4.22)

Now, we eliminate the terms involving η by using the estimates for $\|\eta\|_{\partial K^{-}}$ or $\|\eta\|_{\partial K^{+}}$ in [13], the trace inequality (4.2) and the elliptic projection (4.7):

$$\begin{aligned} \sum_{K \in\mathcal{T}_{h}} \big\| \eta^{e}\big\|^{2}_{{\partial K^{-} \backslash\varGamma^{-}}} &\leq\sum _{K \in\mathcal{T}_{h}} \|\beta\|_{L^{\infty}(K)}\|\eta\|^{2}_{{\partial K}} \leq\sum _{K \in\mathcal{T}_{h}} C \|\eta\|^{2}_{K} = C \|\eta \|^{2}_{L^{2}(\varOmega)} \\ & \leq Ch^4\|\!| y |\!\|^{2}_{H^{2}(\mathcal{T}_{h})}. \end{aligned}$$

(4.23)

A bound for $\|\xi\|^{2}_{L^{2}(\varOmega)}$ is derived by multiplying (4.22) by 2 and integrating from 0 to t. Using the continuous Gronwall inequality for ξ, we complete the proof of (4.19) by noting ξ(0)=0.

The proof of (4.20) starts from the equation

$$\begin{aligned} \biggl(-\frac{\partial(p - p_{h}(u))}{\partial t},q_{h} \biggr) + a_{h} \bigl(q_{h}, p - p_{h}(u)\bigr) = -\bigl(y - y_{h}(u), q_{h}\bigr), \quad\forall q_{h} \in V_{h}, \end{aligned}$$

and then proceeds as in the proof of (4.19). □

Now, we complete the a priori error estimate of the semi-discrete scheme by combining Lemmas 3–5 with the triangle inequality.

Theorem 1

Let (y,u,p) and $(y_h,u_h,p_h)$ be the solutions of (2.8a)–(2.8c) and (4.5a)–(4.5c), respectively. Suppose that the conditions of Proposition 1 and Lemma 4 are valid. Assume that the regularity condition (4.9) is also satisfied. Then, the following estimate holds

$$\begin{aligned} &\|y - y_{h}\|_{L^{\infty}(0,T; L^{2}(\varOmega))} + \|p - p_{h} \|_{L^{\infty}(0,T; L^{2}(\varOmega))} + \|u - u_{h}\|_{L^{2}(0,T; L^{2}(\varOmega_U))} \\ & \quad \leq Ch^2 \bigl( \| p \|_{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))} + \| y \| _{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))}\bigr) + Ch^{3/2}_U. \end{aligned}$$

(4.24)

5 Fully-discrete formulation of optimal control problem

We use the standard backward Euler method to discretize the optimal control problem (2.1), (2.2a)–(2.2c) in time. Let $N_T$ be a positive integer. The time interval $\bar{I}=[0,T]$ is partitioned as

$$0=t_0<t_1< \cdots<t_{N_T-1}<t_{N_T}=T $$

with step sizes $k_n = t_n - t_{n-1}$ for $n=1,\ldots,N_T$ and $k= \max _{n=1, \ldots, N_{T}} k_{n}$.

To prove the a priori error estimate of the fully-discrete scheme, we need the discrete time-dependent norm, defined for 1≤q<∞ as in [9],

$$\begin{aligned} \|\!| v |\!\|_{L^{q}(0,T; L^{2}(\varOmega))} = \Biggl( \sum_{n=1}^{N_{T}} k_{n} \|v_{n} \|^{q}_{L^{2}(\varOmega)} \Biggr)^{1/q}. \end{aligned}$$

(5.1)
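The norm (5.1) is a weighted sum over the time levels. A minimal sketch of its evaluation is given below, assuming that the spatial norms $\|v_n\|_{L^2(\varOmega)}$ have already been computed; the function name and its arguments are illustrative and not part of the original text.

```python
import numpy as np

def discrete_time_norm(space_norms, steps, q=2.0):
    """Evaluate (sum_n k_n * ||v_n||^q)^(1/q) for n = 1, ..., N_T, as in (5.1).

    space_norms[n-1] holds ||v_n||_{L^2(Omega)} and steps[n-1] holds k_n.
    """
    space_norms = np.asarray(space_norms, dtype=float)
    steps = np.asarray(steps, dtype=float)
    if space_norms.shape != steps.shape:
        raise ValueError("one spatial norm per time step is required")
    return float(np.sum(steps * space_norms**q) ** (1.0 / q))

# Illustrative data: four uniform steps on [0, 1] and a constant spatial norm of 2.
steps = np.full(4, 0.25)
norms = np.full(4, 2.0)
value = discrete_time_norm(norms, steps, q=2)
print(value)
```

For a function whose spatial norm is constant in time, the discrete norm with q=2 reduces to that constant times $\sqrt{T}$.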

Let $f_{h,n}$ and $y_{h,n}^{d}$ be approximations of the source function $f_h$ and the desired state function $y_{h}^{d}$ at time $t_n$. Then, the fully-discrete approximation of the semi-discrete problem (3.8a), (3.8b) is

$$\begin{aligned} &\underset{u_{h,n} \in U_{h}^{ad}}{\hbox{minimize }} \quad \sum_{n=1}^{N_T} k_{n} \biggl( \frac{1}{2} \sum_{K \in\mathcal{T}_h} \big\|y_{h,n}-y_{h,n}^{d}\big\|^{2}_{L^{2}(K)} + \frac{\alpha}{2} \sum _{K_U \in\mathcal{T}_h^U} \| u_{h,n} \|^{2}_{L^{2}(K_U)} \biggr), \end{aligned}$$

(5.2a)

$$\begin{aligned} & \hbox{subject to}\quad \biggl(\displaystyle\frac{y_{h,n}-y_{h,n-1}}{k_{n}}, v \biggr) + a_h(y_{h,n},v)+b_h(u_{h,n},v) = (f_{h,n},v) \quad\forall v \in V_{h}, \\ &\phantom{\hbox{subject to}}\quad y_{h,0}(x,0)=y_h^0. \end{aligned}$$

(5.2b)
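In matrix form, one backward Euler step of a constraint like (5.2b) amounts to a linear solve. The sketch below assumes generic matrices `M` (mass) and `A` (stiffness) and a load function `rhs`; these names and the 2×2 test system are hypothetical and are not the SIPG matrices of the paper. It also checks the first-order accuracy in time on a problem with a known solution.

```python
import numpy as np

def backward_euler(M, A, rhs, y0, times):
    """March M y' + A y = rhs(t) over the grid `times` with backward Euler."""
    y = np.array(y0, dtype=float)
    history = [y.copy()]
    for t_prev, t_next in zip(times[:-1], times[1:]):
        k = t_next - t_prev
        # (M + k A) y_n = M y_{n-1} + k rhs(t_n)
        y = np.linalg.solve(M + k * A, M @ y + k * rhs(t_next))
        history.append(y.copy())
    return np.array(history)

# Hypothetical test system: y0 is an eigenvector of A with eigenvalue 1,
# so the exact solution of y' + A y = 0 is y(t) = exp(-t) * y0.
M = np.eye(2)
A = np.array([[2.0, -1.0], [-1.0, 2.0]])
y0 = np.array([1.0, 1.0])

def final_error(n_steps):
    times = np.linspace(0.0, 1.0, n_steps + 1)
    ys = backward_euler(M, A, lambda t: np.zeros(2), y0, times)
    return np.max(np.abs(ys[-1] - np.exp(-1.0) * y0))

errors = [final_error(n) for n in (20, 40, 80)]
rates = [np.log2(errors[i] / errors[i + 1]) for i in range(2)]
print(errors, rates)
```

Halving the step size should roughly halve the error, which is the O(k) behaviour appearing in the fully-discrete estimates.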

6 A priori error analysis of fully-discrete scheme

As for the semi-discrete scheme, we first give a stability result for the state variable in Lemma 6.

Lemma 6

Let $y_{h,n}$ be the solution of (5.2b) and let $c_0$ be a positive constant such that (2.4) holds. Then, there exists a positive constant C independent of h and k such that, for $m=1,2,\ldots,N_T$,

$$\begin{aligned} &\|y_{h,m}\|^{2}_{L^{2}(\varOmega)} + \sum _{n=1}^{m} k_{n} \biggl(\kappa\|y_{h,n}\|^{2}_{\varepsilon} + \sum _{K} 2c_{0}\|y_{h,n} \|^{2}_{K} + \sum _{K} \|y_{h,n}\|^{2}_{\partial K^{-} \cap\varGamma^{-}} \biggr) \\ &\qquad{} + \sum _{n=1}^{m} k_{n} \biggl(\sum _{K} \big\| y_{h,n} - y_{h,n}^{e}\big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \sum _{K} \| y_{h,n}\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ & \quad \leq C \big\|y_{h}^{0}\big\|^{2}_{L^{2}(\varOmega)} + C \sum _{n=1}^{m} k_{n} \bigl( \|f_{h,n}\|^{2}_{L^{2}(\varOmega)} + \|B u_{h,n} \|^{2}_{L^{2}(\varOmega)} \bigr). \end{aligned}$$

(6.1)

Proof

Choose $v_h = y_{h,n}$ in (6.2a). By using the algebraic inequality $\frac{x^{2}-y^{2}}{2} \leq (x-y)x$ for all $x,y \in \mathbb{R}$, and following the steps in Lemma 1, we obtain the desired result. □
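For completeness, the algebraic inequality used in the proof of Lemma 6 follows from expanding a square, and the same computation in $L^{2}(\varOmega)$ produces the difference of norms that telescopes after summation over n. A short derivation, written in the notation of (6.2a):

$$\begin{aligned} (x-y)x - \frac{x^{2}-y^{2}}{2} &= \frac{x^{2}-2xy+y^{2}}{2} = \frac{(x-y)^{2}}{2} \geq 0, \\ \biggl(\frac{y_{h,n}-y_{h,n-1}}{k_{n}}, y_{h,n} \biggr) &= \frac{1}{2k_{n}} \bigl( \|y_{h,n}\|^{2}_{L^{2}(\varOmega)} - \|y_{h,n-1}\|^{2}_{L^{2}(\varOmega)} + \|y_{h,n}-y_{h,n-1}\|^{2}_{L^{2}(\varOmega)} \bigr) \\ &\geq \frac{1}{2k_{n}} \bigl( \|y_{h,n}\|^{2}_{L^{2}(\varOmega)} - \|y_{h,n-1}\|^{2}_{L^{2}(\varOmega)} \bigr). \end{aligned}$$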

The minimization problem (5.2a), (5.2b) has at least one solution due to the boundedness of the solution $y_{h,n}$ proven in Lemma 6. Moreover, the fully discretized control problem (5.2a), (5.2b) obtained by using the backward Euler method has a unique solution $(y_{h,n},u_{h,n})$, $n=1,2,\ldots,N_T$, and $(y_{h,n},u_{h,n}) \in Y_{h} \times U_{h}^{ad}$, $n=1,2,\ldots,N_T$, is the solution of (5.2a), (5.2b) if and only if $(y_{h,n},u_{h,n},p_{h,n-1}) \in Y_{h} \times U^{ad}_{h} \times Y_{h}$ is the unique solution of the following optimality system:

$$\begin{aligned} & \begin{array}{l} \biggl(\displaystyle\frac{y_{h,n}-y_{h,n-1}}{k_{n}}, v \biggr) + a_h(y_{h,n},v)+b_h(u_{h,n},v)= (f_{h,n},v) \quad\forall v \in V_{h}, \\ y_{h,0}=y_h^0, \quad n=1,2, \ldots, N_{T}, \end{array} \end{aligned}$$

(6.2a)

$$\begin{aligned} &\begin{array}{l} \biggl(\displaystyle\frac{p_{h,n-1}-p_{h,n}}{k_{n}}, q\biggr) + a_h(q,p_{h,n-1})=- \bigl(y_{h,n}-y_{h,n}^{d},q\bigr) \quad\forall q \in V_{h}, \\ p_{h,T}=0, \quad n=N_{T}, \ldots, 2,1, \end{array} \end{aligned}$$

(6.2b)

$$\begin{aligned} &\bigl(\alpha u_{h,n}- B^*p_{h,n-1}, w-u_{h,n} \bigr)_U \geq0 \quad\forall w \in U_{h}^{ad}, \ n=1,2, \ldots, N_{T}. \end{aligned}$$

(6.2c)
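The time structure of (6.2a), (6.2b) is a forward sweep for the state followed by a backward sweep for the adjoint. The following sketch shows this structure for generic matrices, assuming a symmetric mass matrix `M`, a matrix `A` with $a_h(y,v) = v^{T} A y$, and load vectors `F[n-1]` that collect the source and control contributions at $t_n$. All names and data are hypothetical; the block does not assemble the SIPG forms.

```python
import numpy as np

def state_sweep(M, A, F, y0, k):
    """Forward sweep (M + k A) y_n = M y_{n-1} + k F_n for n = 1, ..., N."""
    N = F.shape[0]
    y = np.zeros((N + 1, y0.size))
    y[0] = y0
    for n in range(1, N + 1):
        y[n] = np.linalg.solve(M + k * A, M @ y[n - 1] + k * F[n - 1])
    return y

def adjoint_sweep(M, A, y, y_d, k):
    """Backward sweep (M + k A^T) p_{n-1} = M p_n - k M (y_n - y_d_n), with p_N = 0."""
    N = y.shape[0] - 1
    p = np.zeros_like(y)
    for n in range(N, 0, -1):
        p[n - 1] = np.linalg.solve(M + k * A.T, M @ p[n] - k * M @ (y[n] - y_d[n]))
    return p

# Hypothetical data: 3 unknowns, nonsymmetric A mimicking a convective part.
M = np.eye(3)
A = np.array([[2.0, -1.0, 0.0], [-0.5, 2.0, -1.0], [0.0, -0.5, 2.0]])
N, k = 8, 0.125
F = np.ones((N, 3))          # F[n-1] is the load at t_n
y_d = np.zeros((N + 1, 3))   # desired state at t_0, ..., t_N
y = state_sweep(M, A, F, np.zeros(3), k)
p = adjoint_sweep(M, A, y, y_d, k)
print(p[0])
```

With this indexing, a perturbation of the load at $t_n$ changes the tracking term through $p_{n-1}$, which is consistent with the pairing of $u_{h,n}$ and $p_{h,n-1}$ in (6.2c).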

The stability result for the adjoint variable in the fully-discrete scheme is given in Lemma 7.

Lemma 7

Let $p_{h,n}$ be the solution of (6.2b) and let $c_0$ be a positive constant such that (2.4) holds. Then, there exists a positive constant C independent of h and k such that, for $m=N_T-1,\ldots,2,1$,

$$\begin{aligned} &\|p_{h,m}\|^{2}_{L^{2}(\varOmega)} + \sum _{n = 1}^{m} k_{n} \kappa\|p_{h,n-1}\|^{2}_{\varepsilon} \\ &\qquad{} + \sum _{n = 1}^{m} k_{n} \biggl( 2\sum _{K} c_{0}\| p_{h,n-1}\|^{2}_{K} + \sum _{K} \| p_{h,n-1} \|^{2}_{\partial K^{-} \cap\varGamma^{-}} \biggr) \\ &\qquad{} + \sum _{n = 1}^{m} k_{n} \biggl( \sum _{K} \big\|p_{h,n-1} - p_{h,n-1}^{e} \big\|^{2}_{\partial K^{+} \backslash\varGamma^{+}} + \sum_{K} \| p_{h,n-1} \|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ &\quad \leq C \sum _{n = 1}^{m} k_{n} \big\|y_{h,n} - y^{d}_{h,n}\big\|^{2}_{L^{2}(\varOmega)}. \end{aligned}$$

(6.3)

Now, we derive the a priori error estimates of the fully-discrete scheme by introducing the following auxiliary equations as for the semi-discrete scheme. Let

$$\begin{aligned} \bigl(J_{h}^{\prime}(u),v-u\bigr)_U=\sum _{n=1}^{N_{T}} k_{n} \bigl(\alpha u_{n}- B^*p_{h,n-1}(u),v_{n}-u_{n} \bigr)_U, \end{aligned}$$

(6.4)

where $p_{h,n-1}(u)$ is the solution of the following system

$$\begin{aligned} & \begin{array}{l} \biggl(\displaystyle\frac{y_{h,n}(u)-y_{h,n-1}(u)}{k_{n}}, v \biggr) + a_h\bigl (y_{h,n}(u),v \bigr)+b_h(u_{n},v)= (f_{h,n},v) \quad \forall v \in V_{h}, \\ y_{h,0}(u)=y_h^0, \quad n=1,2, \ldots, N_{T}, \end{array} \end{aligned}$$

(6.5a)

$$\begin{aligned} & \begin{array}{l} \biggl(\displaystyle\frac{p_{h,n-1}(u)-p_{h,n}(u)}{k_{n}}, q \biggr) + a_h\bigl (q,p_{h,n-1}(u) \bigr)=-\bigl(y_{h,n}(u)-y_{h,n}^{d},q\bigr) \quad \forall q \in V_{h}, \\ p_{h,T}(u)=0, \quad n=N_{T}, \ldots, 2,1. \end{array} \end{aligned}$$

(6.5b)

For simplicity, we use the following notation:

$$\begin{aligned} \zeta_{n} &= y_{h,n} - y_{h,n}(u), \quad n=0,1, \ldots, N_T, \\ \chi_{n} &= p_{h,n} - p_{h,n}(u), \quad n=N_T, \ldots, 1,0. \end{aligned}$$

First, we establish a connection between the approximate solutions $(y_h,p_h)$ and the auxiliary solutions $(y_h(u),p_h(u))$ in the following lemma.

Lemma 8

Let $(y_h,p_h)$ and $(y_h(u),p_h(u))$ be the solutions of (6.2a), (6.2b) and (6.5a), (6.5b), respectively. Then, there are positive constants $C_1$ and $C_2$ independent of h and k such that

$$\begin{aligned} \big |\!\big |\!\big | y_{h} - y_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} & \leq C_{1 }\|\!| u - u_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}, \end{aligned}$$

(6.6)

$$\begin{aligned} \big |\!\big |\!\big | p_{h} - p_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} &\leq C_{2}\|\!| u - u_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}. \end{aligned}$$

(6.7)

Proof

We start the proof of (6.6) by subtracting (6.5a) from (6.2a) to obtain the following equality

$$\begin{aligned} \biggl(\frac{\zeta_{n} - \zeta_{n-1}}{k_n}, v_{h} \biggr) + a_{h}( \zeta_{n}, v_{h})= - b_h(u_{h,n} - u_{n}, v_{h}). \end{aligned}$$

By choosing $v_h=\zeta_n$ and following the steps in Lemma 6, we obtain

$$\begin{aligned} &\frac{1}{2 k_{n}} \bigl( \|\zeta_{n}\|^{2}_{L^{2}(\varOmega)} - \|\zeta_{n-1}\|^{2}_{L^{2}(\varOmega)} \bigr) + \kappa\| \zeta_{n}\|^{2}_{\varepsilon} + \sum _{K} c_{0}\|\zeta_{n}\|^{2}_{L^{2}(K)} + \frac{1}{2}\sum _{K} \| \zeta_{n} \|^{2}_{\partial K^{-} \cap\varGamma^{-}} \\ &\qquad{} + \frac{1}{2}\sum _{K} \big\| \zeta_{n}- \zeta_{n}^{e}\big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \frac{1}{2}\sum _{K} \|\zeta_{n} \|^{2}_{\partial K^{+} \cap\varGamma^{+}} \\ &\quad \leq\frac{1}{2} \|u_{h,n} - u_{n} \|^{2}_{L^{2}(\varOmega)} + \frac{1}{2} \|\zeta_{n} \|^{2}_{L^{2}(\varOmega)}. \end{aligned}$$

Multiplying the above inequality by $2k_n$ and summing from $n=1$ to $n=N_T$, we derive

$$\begin{aligned} &\bigl( \|\zeta_{N_{T}}\|^{2}_{L^{2}(\varOmega)} - \| \zeta_{0}\|^{2}_{L^{2}(\varOmega)} \bigr) + 2 \sum _{n=1}^{N_{T}} k_{n} \biggl( \kappa\| \zeta_{n}\|^{2}_{\varepsilon} + \sum _{K}c_{0} \| \zeta_{n}\|^{2}_{L^{2}(K)} \biggr) \\ &\qquad{} + \sum _{n=1}^{N_{T}} k_{n} \biggl( \sum _{K} \| \zeta_{n}\|^{2}_{\partial K^{-} \cap\varGamma^{-}} + \sum _{K} \big\| \zeta_{n}- \zeta_{n}^{e} \big\|^{2}_{\partial K^{-}\backslash\varGamma^{-}} + \sum _{K} \| \zeta_{n}\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ &\quad \leq\sum _{n=1}^{N_{T}} k_{n} \bigl( \|u_{h,n} - u_{n}\|^{2}_{L^{2}(\varOmega)} + \|\zeta_{n}\|^{2}_{L^{2}(\varOmega)} \bigr). \end{aligned}$$

Then, we apply the discrete Gronwall inequality to the terms involving ζ, use (4.3), which leads to the inequality $\| \cdot\|_{L^{2}(\varOmega)} \leq C\| \cdot\| _{\varepsilon}$ for some positive constant C, and finally use the definition of the norm in (5.1) to obtain (6.6).

To show the second part of Lemma 8, we subtract (6.5b) from (6.2b) to obtain

$$\begin{aligned} \biggl(\frac{ \chi_{n-1} - \chi_{n}}{k_{n}}, q_{h} \biggr) + a_{h}(q_{h}, \chi_{n-1}) = -(\zeta_{n}, q_{h}). \end{aligned}$$

By choosing $q_h=\chi_{n-1}$ and proceeding as in the first part, we obtain the following inequality

$$\begin{aligned} \big |\!\big |\!\big | p_{h} - p_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} \leq C \big |\!\big |\!\big | y_{h} - y_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))}. \end{aligned}$$

This inequality gives us the desired result (6.7). □

To derive an estimate for the control u in the fully-discrete scheme, we use the discontinuous piecewise linear finite element space by following the approach in [29].

Lemma 9

Let (y,p,u) and $(y_h,p_h,u_h)$ be the solutions of (2.8a)–(2.8c) and (6.2a)–(6.2c), respectively. Under the assumptions $u \in L^{2}(0,T;W^{1,\infty}(\varOmega_U))$, $u|_{\varOmega^{*}} \in L^{2}(0,T; H^{2}(\varOmega^{*}))$ and $p \in L^{2}(0,T;W^{1,\infty}(\varOmega))$, we have

$$\begin{aligned} \|\!| u - u_{h} |\!\|_{L^{2}(0,T; L^{2}(\varOmega_U))}& \leq C \biggl( k \biggl \Vert \frac{\partial p}{\partial t}\biggr \Vert _{L^{2}(0,T; L^{2}(\varOmega ))} + \big |\!\big |\!\big | p_{h}(u) - p \big |\!\big |\!\big |_{L^{2}(0,T; L^{2}(\varOmega))} \biggr) \\ &\quad {} + Ch^{3/2}_U. \end{aligned}$$

(6.8)

Proof

Let

$$\bigl(J_{h}^{\prime}(u),v-u\bigr)_U=\sum _{n=1}^{N_{T}} k_{n} \bigl(\alpha u_{n}- B^*p_{h,n-1}(u),v_{n}-u_{n} \bigr)_U, $$

where $p_{h,n-1}(u)$ is the solution of the auxiliary problem (6.5b). Then,

$$\begin{aligned} \bigl(J_{h}^{\prime}(v)-J_{h}^{\prime}(u),v-u \bigr)_U &=\sum _{n=1}^{N_{T}}k_{n}( \alpha v_{n}-\alpha u_{n},v_{n}-u_{n})_U\\ &\quad {}+ \sum _{n=1}^{N_{T}}k_{n}\bigl( B^*p_{h,n-1}(u)- B^*p_{h,n-1}(v),v_{n}-u_{n} \bigr)_U \\ &=\alpha \|\!| v-u |\!\|^{2}_{L^{2}(0,T;L^{2}(\varOmega_U))}\\ &\quad {} + \sum _{n=1}^{N_{T}}k_{n} \bigl( B^*p_{h,n-1}(u)- B^*p_{h,n-1}(v),v_{n}-u_{n} \bigr)_U. \end{aligned}$$

By using the auxiliary solutions (6.5a), (6.5b) as done for the semi-discrete scheme, we obtain

$$\sum_{n=1}^{N_{T}}k_{n}\bigl( B^*p_{h,n-1}(u)- B^*p_{h,n-1}(v),v_{n}-u_{n}\bigr)_U \geq0. $$

Hence,

$$\begin{aligned} \bigl(J_{h}^{\prime}(v)-J_{h}^{\prime}(u),v-u \bigr)_U \geq\alpha \|\!| v-u |\!\|^{2}_{L^{2}(0,T;L^{2}(\varOmega_U))}. \end{aligned}$$

(6.9)

Let $\varPi u_n \in U_h$ be the standard Lagrange interpolant of u at time $t_n$ such that $\varPi u_n(x)=u_n(x)$ for all vertices x. Then, $\varPi u_n$ belongs to $U_{ad}^{h}$ at time $t_n$. With the help of the inequalities (6.9), (2.8c), (6.2c) and an approximation of u at time $t_n$, i.e., $\varPi u_n$, we obtain

$$\begin{aligned} &\alpha \|\!| u-u_h |\!\|^{2}_{L^{2}(0,T;L^{2}(\varOmega_U))} \\ &\quad \leq \bigl(J_{h}^{\prime}(u)-J_{h}^{\prime}(u_h),u-u_h \bigr)_U \\ &\quad = \sum _{n=1}^{N_{T}}k_{n}\bigl(\alpha u_{n}- B^*p_n,u_{n}-u_{h,n} \bigr)_U + \sum _{n=1}^{N_{T}}k_{n} \bigl( B^*p_{n}- B^*p_{h,n-1}(u),u_{n}-u_{h,n} \bigr)_U \\ &\qquad{}+ \sum _{n=1}^{N_{T}}k_{n}\bigl(\alpha u_{h,n}- B^*p_{h,n-1}, \varPi u_{n}-u_{n} \bigr)_U \\ &\qquad{} + \sum _{n=1}^{N_{T}}k_{n} \bigl(\alpha u_{h,n}- B^*p_{h,n-1}(u), u_{h,n}-\varPi u_{n}\bigr)_U \\ &\quad \leq \sum _{n=1}^{N_{T}}k_{n} \bigl( B^*p_{n}- B^*p_{h,n-1}(u),u_{n}-u_{h,n} \bigr)_U \\ &\qquad{} + \sum _{n=1}^{N_{T}}k_{n} \bigl(\alpha u_{h,n}- B^*p_{h,n-1}, \varPi u_{n}-u_{n} \bigr)_U \\ &\quad = \underbrace{\sum _{n=1}^{N_{T}}k_{n}\bigl( B^*p_{n}- B^*p_{n-1},u_{n}-u_{h,n} \bigr)_U}_{T_{1}} \\ &\qquad {} + \underbrace{\sum _{n=1}^{N_{T}}k_{n} \bigl( B^*p_{n-1}- B^*p_{h,n-1}(u),u_{n}-u_{h,n} \bigr)_U}_{T_{2}} \\ &\qquad {} + \underbrace{\sum _{n=1}^{N_{T}}k_{n} \bigl(\alpha u_{h,n}- B^*p_{h,n-1}, \varPi u_{n}-u_{n} \bigr)_U}_{T_{3}}. \end{aligned}$$

(6.10)

The following estimates of $T_1$ and $T_2$ are derived by using Young’s inequality,

$$\begin{aligned} T_{1} &\leq C_{1} \sum _{n=1}^{N_{T}}k_{n} \|p_{n}-p_{n-1}\|_{L^{2}(\varOmega)}^{2} + C_{2} \sum _{n=1}^{N_{T}}k_{n} \|u_{n}-u_{h,n}\|_{L^{2}(\varOmega_U)}^{2} \\ &\leq C_{1} k^{2} \bigg\| \frac{\partial p}{\partial t} \bigg\|_{L^{2}(0,T;L^{2}(\varOmega))}^{2} + C_{2} \|\!| u-u_h |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}^{2}, \\ T_{2} &\leq C_{1} \sum _{n=1}^{N_{T}}k_{n} \big\|p_{n-1}-p_{h,n-1}(u)\big\|_{L^{2}(\varOmega)}^{2} + C_{2}\sum _{n=1}^{N_{T}}k_{n} \|u_{n}-u_{h,n}\|_{L^{2}(\varOmega_U)}^{2} \\ &\leq C_{1} \big |\!\big |\!\big | p-p_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))}^{2}+ C_{2} \|\!| u-u_h |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}^{2}. \end{aligned}$$
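The factor $k^{2}$ in the bound of $T_{1}$ comes from the fundamental theorem of calculus and the Cauchy–Schwarz inequality in time. A short sketch of this step, assuming $p \in H^{1}(0,T;L^{2}(\varOmega))$, the boundedness of $B^*$, and writing $p_n = p(t_n)$:

$$\begin{aligned} \|p_{n}-p_{n-1}\|^{2}_{L^{2}(\varOmega)} &= \biggl\| \int_{t_{n-1}}^{t_{n}} \frac{\partial p}{\partial t}\,dt \biggr\|^{2}_{L^{2}(\varOmega)} \leq k_{n} \int_{t_{n-1}}^{t_{n}} \biggl\| \frac{\partial p}{\partial t} \biggr\|^{2}_{L^{2}(\varOmega)}\,dt, \\ \sum_{n=1}^{N_{T}} k_{n} \|p_{n}-p_{n-1}\|^{2}_{L^{2}(\varOmega)} &\leq \sum_{n=1}^{N_{T}} k_{n}^{2} \int_{t_{n-1}}^{t_{n}} \biggl\| \frac{\partial p}{\partial t} \biggr\|^{2}_{L^{2}(\varOmega)}\,dt \leq k^{2} \biggl\| \frac{\partial p}{\partial t} \biggr\|^{2}_{L^{2}(0,T;L^{2}(\varOmega))}. \end{aligned}$$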

By considering the discontinuous piecewise linear finite element space for the control u and following the steps in Lemma 4, we obtain

$$\begin{aligned} T_{3} &\leq C k^{2} \bigg\| \frac{\partial p}{\partial t} \bigg\|_{L^{2}(0,T;L^{2}(\varOmega))}^{2} + C\big |\!\big |\!\big | p_{h}(u)-p \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))}^{2}\\ &\quad {}+ \|\!| u-u_h |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}^{2}+ Ch^{3}_U. \end{aligned}$$

Summing up the estimates of $T_1$, $T_2$ and $T_3$, we complete the proof. □

Now, we establish the connection between the exact solutions and auxiliary solutions.

Lemma 10

Let (y,p) be the solutions of (2.8a), (2.8b) and $(y_h(u),p_h(u))$ be the solutions of (6.5a), (6.5b), respectively. Suppose that the conditions of Proposition 1 and Lemma 9 are valid. Then, the following estimates hold

$$\begin{aligned} \big |\!\big |\!\big | y - y_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} & \leq C h^2 \|\!| y |\!\| _{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} \\ &\quad {} + C h^2 \biggl \Vert \frac{\partial y }{\partial t} \biggr \Vert _{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} + C k \biggl \Vert \frac{\partial ^{2} y}{\partial t^{2}} \biggr \Vert _{L^{2}(0,T;L^{2}(\varOmega))}, \end{aligned}$$

(6.11)

and

$$\begin{aligned} &\big |\!\big |\!\big | p - p_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} \\ &\quad \leq C h^2 \sum _{v=y, p} \|\!| v |\!\| _{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} + C h^2 \sum _{v=y, p} \biggl \Vert \frac{\partial v}{\partial t} \biggr \Vert _{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} \\ &\qquad {} + C k \sum _{v=y, p} \biggl \Vert \frac{\partial^{2} v}{\partial t^{2}}\biggr \Vert _{L^{2}(0,T;L^{2}(\varOmega))} + C k \sum _{v=y, y_{d}} \biggl \Vert \frac{\partial v}{\partial t} \biggr \Vert _{L^{2}(0,T;L^{2}(\varOmega))}. \end{aligned}$$

(6.12)

Proof

Here, we only prove (6.11) since both cases follow the same procedure. We start with the following equation obtained by using (6.5a) and (2.8a)

$$\begin{aligned} (\partial_t y_{n}, v_h) + a_h(y_{n},v_h)- \biggl( \frac{y_{h,n}(u) - y_{h,n-1}(u)}{k_{n}}, v_{h} \biggr) - a_h \bigl(y_{h,n}(u), v_h\bigr) = 0. \end{aligned}$$

(6.13)

We decompose $y_n - y_{h,n}(u)$ as

$$\begin{aligned} y_{n} - y_{h,n}(u) = y_n - \tilde{y}_{n} - \bigl(y_{h,n}(u) - \tilde{y}_{n} \bigr) = \eta_{n} - \xi_{n}, \end{aligned}$$

where $\tilde{y}$ is an elliptic projection of y. We only need to estimate $\xi_n$ since the estimate of $\eta_n$ is given in (4.8b). Hence, we write (6.13) as

$$\begin{aligned} \biggl(\frac{\xi_{n} - \xi_{n-1}}{k_{n}}, v_{h} \biggr) + a_{h}( \xi_{n}, v_{h})& = \biggl( \frac{\partial y_{n}}{\partial t} - \frac{y_{n} - y_{n-1}}{k_{n}}, v_{h} \biggr) + \biggl(\frac{\eta_{n} - \eta_{n-1}}{k_{n}}, v_{h} \biggr)\\ &\quad {} + a_{h}(\eta_{n}, v_{h}). \end{aligned}$$

By choosing $v_h=\xi_n$ and applying the steps in Lemma 5 to bound the terms on the inflow and outflow boundaries, we obtain

$$\begin{aligned} &\|\xi_{N_{T}}\|^{2}_{L^{2}(\varOmega)} - \|\xi_{0} \|^{2}_{L^{2}(\varOmega)} + \frac{3 \kappa}{4} \sum _{n=1}^{N_{T}} k_{n} \|\xi_{n}\|^{2}_{\varepsilon} + 2 \sum _{n=1}^{N_{T}} k_{n} \sum _{K} c_{0}\| \xi_{n}\|^{2}_{L^{2}(K)} \\ &\qquad{} + \frac{1}{2} \sum _{n=1}^{N_{T}} k_{n} \biggl( 2\sum _{K} \| \xi_{n}\|^{2}_{\partial K^{-} \cap\varGamma^{-}} + \sum _{K} \big\| \xi_{n}- \xi_{n}^{e}\big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \sum _{K} \|\xi_{n}\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ & \quad \leq2 \sum _{n=1}^{N_{T}} k_{n} \biggl( \sum _{K} \big\| \eta^{e}_{n}\big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \sum_{K} \|\eta_{n}\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ &\qquad{} + C \sum _{n=1}^{N_{T}} k_{n} \| \xi_{n}\|^{2}_{L^{2}(\varOmega)} + C k_{n}^{2} \int _{0}^{T} \biggl \vert \!\biggl \vert \frac{\partial^{2} y }{\partial t^{2}} \biggr \vert \!\biggr \vert ^{2}_{L^{2}(\varOmega)}\,dt + C h^4 \int _{0}^{T} \biggl \vert \!\biggl \vert \!\biggl \vert \frac{\partial y }{\partial t} \biggr \vert \!\biggr \vert \!\biggr \vert _{H^{2}(\mathcal{T}_{h})} \, dt. \end{aligned}$$

Then, applying Gronwall’s inequality to the terms involving ξ in the above inequality, we obtain the desired result (6.11). □

Now, we finalize the a priori error estimate of the fully-discrete scheme by combining Lemmas 8–10 with the triangle inequality.

Theorem 2

Let (y,p,u) be the solutions of (2.8a)–(2.8c) and $(y_h,p_h,u_h)$ be the solutions of (6.2a)–(6.2c), respectively. Suppose that the conditions of Proposition 1 and Lemma 9 are valid. Assume further that the regularity condition (4.9) is satisfied. Then, the following estimate holds

$$\begin{aligned} &\|\!| y - y_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega))} + \|\!| p - p_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega))} + \|\!| u - u_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}\\ &\quad \leq Ch^2\sum _{v=y, p} \|\!| v |\!\|_{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} + C h^2 \sum _{v=y, p} \biggl \Vert \frac{\partial v}{\partial t} \biggr \Vert _{L^{2}(0,T;H^{2}(\mathcal {T}_{h}))} + Ch^{3/2}_U\\ &\qquad {}+ C k \sum _{v=y, p} \biggl \Vert \frac{\partial^{2} v}{\partial t^{2}}\biggr \Vert _{L^{2}(0,T;L^{2}(\varOmega))} + C k \sum _{v=y, p, y_{d}} \biggl \vert \!\biggl \vert \frac{\partial v}{\partial t} \biggr \vert \!\biggr \vert _{L^{2}(0,T;L^{2}(\varOmega))}. \end{aligned}$$

7 Numerical results

In this section, we present numerical results for the unsteady control-constrained optimal control problem governed by the convection diffusion equation (2.1), (2.2a)–(2.2c). We take $\varOmega=\varOmega_U$ and $B=I$, and consider the problem in [10] with the following parameters

$$\begin{array}{l} Q=(0,1] \times\varOmega, \quad\varOmega=(0,1)^{2}, \quad \varepsilon =10^{-5}, \quad\beta=(1,0)^T, \\ r=0, \quad \alpha=1 \quad\text{and} \quad u_a=0. \end{array} $$

The source function f, the desired state $y_d$ and the initial condition $y_0$ are computed from (2.8a)–(2.8c) using the following exact solutions of the state, adjoint and control, respectively,

$$\begin{aligned} y(x,t) &= \exp(-t) \sin(2 \pi x_1) \sin(2 \pi x_2), \\ p(x,t) &= \exp(-t) (1-t) \sin(2 \pi x_1) \sin(2 \pi x_2), \\ u(x,t) &= \max(0,p). \end{aligned}$$
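As a sanity check of the test problem, the exact control should satisfy the variational inequality pointwise. The sketch below evaluates the exact solutions above with α=1, $u_a=0$ and B=I, and verifies $(\alpha u - p)(w-u) \geq 0$ at random sample points for arbitrary admissible values $w \geq u_a$. The function names, the sampling and the range of w are illustrative choices, not part of the original experiment.

```python
import numpy as np

ALPHA, U_A = 1.0, 0.0  # parameters of the test problem: alpha = 1, u_a = 0

def y_exact(x1, x2, t):
    return np.exp(-t) * np.sin(2.0 * np.pi * x1) * np.sin(2.0 * np.pi * x2)

def p_exact(x1, x2, t):
    return np.exp(-t) * (1.0 - t) * np.sin(2.0 * np.pi * x1) * np.sin(2.0 * np.pi * x2)

def u_exact(x1, x2, t):
    # Pointwise projection of p / alpha onto the admissible set {u >= u_a}.
    return np.maximum(U_A, p_exact(x1, x2, t) / ALPHA)

rng = np.random.default_rng(0)
x1, x2, t = rng.random(1000), rng.random(1000), rng.random(1000)
u = u_exact(x1, x2, t)
p = p_exact(x1, x2, t)
w = U_A + 5.0 * rng.random(1000)   # arbitrary admissible values w >= u_a

# Pointwise variational inequality (alpha*u - p) * (w - u) >= 0.
vi = (ALPHA * u - p) * (w - u)
print("smallest value of the integrand:", vi.min())
```

Where p is positive the first factor vanishes, and where p is nonpositive both factors are nonnegative, so the integrand is nonnegative everywhere.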

In our numerical example, the control variable is only bounded from below, i.e., $u_a=0$. The state, the adjoint, and the control variables are discretized by using piecewise linear polynomials, i.e., (x,y,1−x−y). The discretized control-constrained problems are solved by the primal-dual active set (PDAS) algorithm, applied as a semi-smooth Newton step, see, e.g., [2, 3]. The algorithm is terminated when two consecutive active sets coincide. The initial guess for the control variable is taken to be zero for all discretization levels.
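The PDAS iteration and its stopping rule can be sketched on a generic finite-dimensional problem. The block below is a simplified stand-in, not the solver used for the experiments: it minimizes $\frac{1}{2}u^{T}Hu - g^{T}u$ subject to $u \geq a$, where the matrix `H`, the vector `g`, the bound `a` and the parameter `c` are hypothetical data chosen for illustration.

```python
import numpy as np

def pdas_lower_bound(H, g, a, c=1.0, max_iter=50):
    """Minimise 0.5*u^T H u - g^T u subject to u >= a with a primal-dual active set loop."""
    n = g.size
    u = np.zeros(n)      # zero initial guess, as in the text
    lam = np.zeros(n)    # multiplier of the bound u >= a
    previous = None
    for iteration in range(max_iter):
        # Predict the active set from the current primal-dual pair.
        active = lam + c * (a - u) > 0
        if previous is not None and np.array_equal(active, previous):
            return u, lam, iteration  # two consecutive active sets coincide
        free = ~active
        u = np.empty(n)
        u[active] = a[active]
        # Reduced linear system on the inactive (free) indices.
        rhs = g[free] - H[np.ix_(free, active)] @ a[active]
        u[free] = np.linalg.solve(H[np.ix_(free, free)], rhs)
        # Multiplier from the stationarity condition H u - g - lam = 0.
        lam = np.zeros(n)
        lam[active] = (H @ u - g)[active]
        previous = active
    raise RuntimeError("active sets did not settle within max_iter iterations")

# Hypothetical test problem: H is a small M-matrix and the bound is u >= 0.
n = 40
x = np.linspace(0.0, 1.0, n)
H = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
g = 2.0 * np.sin(2.0 * np.pi * x)
a = np.zeros(n)

u, lam, iterations = pdas_lower_bound(H, g, a)
print("iterations:", iterations, "active bounds:", int(np.sum(lam > 0)))
```

When the loop stops, the pair (u, lam) satisfies the optimality conditions of the bound-constrained problem: stationarity, feasibility, nonnegative multipliers and complementarity.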

There are two approaches to solving the optimization problem (2.1), (2.2a)–(2.2c) numerically: discretize-then-optimize (DO) and optimize-then-discretize (OD). It is desirable that both approaches lead to the same discrete optimality system. In the DO approach, one first discretizes the optimal control problem with the objective function (2.1) and the state equation (2.2a)–(2.2c) and then forms the optimality system. In the OD approach, one first derives the optimality conditions consisting of the state and adjoint PDEs and the algebraic equation that links the control and the adjoint variable. Afterwards, the infinite-dimensional optimality system is discretized to form the discrete optimality system. It is known that, for optimal control problems governed by convection diffusion PDEs, the two approaches lead to the same linear optimality system for some discretization schemes. Although this commutative property is preserved in the steady case for upwind SIPG methods [15, 28], it is not preserved for unsteady problems when the backward Euler discretization is used in time. A straightforward time discretization will usually not lead to the same optimality system for the DO and OD approaches; the difference arises from the initial condition of the adjoint PDE. This was shown in [23] for the Stokes equation. By adjusting the time discretization for the initial condition of the forward problem, it was possible to show that the OD and DO approaches commute. The same technique was also used in [11]. Further, both approaches lead to the same discrete optimality system when the discontinuous Galerkin discretization dG(0) is used in time [14].

Tables 1 and 2 show the errors and convergence rates with respect to the discrete time-dependent norm (5.1) for a fixed spatial mesh size for OD and DO, respectively. The order of convergence in time is k, as expected from the a priori error estimates. For fixed time steps, the errors and convergence rates in space are given in Tables 3 and 4 for OD and DO, respectively. Again, the results confirm the a priori error estimates. Although the theoretical rate for the control is $h^{3/2}$, it is observed to be $h^{2}$ since there is no kink in the control. This means that our initial grid aligns with the points where the lower bound of the control and the value of the adjoint coincide, i.e., $x_1=x_2=0.5$.
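Convergence rates of the kind reported in such tables are usually computed from errors on consecutive refinement levels as $\log(e_i/e_{i+1})/\log(s_i/s_{i+1})$, where s denotes the time step or the mesh size. A minimal sketch follows; the error values are synthetic and chosen only to exercise the formula, and they are not the values of Tables 1–4.

```python
import numpy as np

def observed_orders(sizes, errors):
    """Observed order between consecutive levels: log(e_i/e_{i+1}) / log(s_i/s_{i+1})."""
    sizes = np.asarray(sizes, dtype=float)
    errors = np.asarray(errors, dtype=float)
    return np.log(errors[:-1] / errors[1:]) / np.log(sizes[:-1] / sizes[1:])

# Synthetic errors (NOT the data of the paper): first order in k, second order in h.
k = np.array([1 / 16, 1 / 32, 1 / 64, 1 / 128])
err_time = 0.7 * k
h = np.array([1 / 4, 1 / 8, 1 / 16, 1 / 32])
err_space = 3.0 * h**2

order_time = observed_orders(k, err_time)
order_space = observed_orders(h, err_space)
print(order_time, order_space)
```

The formula does not require the refinement ratio to be 2, so it also applies to nonuniform sequences of step sizes.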

Table 1 $h/\sqrt{2}=1/32$ via the OD approach

Table 2 $h/\sqrt{2}=1/32$ via the DO approach

Table 3 k=1/2048 via the OD approach

Table 4 k=1/2048 via the DO approach

Figure 1 shows the computed solutions of the state and adjoint at t=0.5 with $h/\sqrt{2}=1/32$, k=1/128 by using the OD approach. In addition, the exact and computed solutions of the control are given in Fig. 2 for t=0.5 with $h/\sqrt{2}=1/32$, k=1/128 by using the OD approach.

Discontinuous Galerkin (DG) discretizations exhibit better convergence behavior for convection dominated optimal control problems since errors in boundary layers are not propagated into the entire domain [16]. Therefore, DG discretization with mesh adaptivity yields better results than stabilized finite element methods such as those in [26, 27] for steady convection dominated optimal control problems. Although our example is smooth, we can still see the effect of the DG discretization. The same example was solved in [10] with a characteristic finite element method. Comparing the results in Table 1 with those obtained in [10], we find that the upwind SIPG discretization yields more accurate results.

8 Conclusions

We have derived a priori error estimates for optimal control problems governed by the unsteady convection diffusion equation using the upwind SIPG discretization in space and the standard backward Euler method in time. Although the OD and DO approaches lead to two different optimality systems under the backward Euler discretization, there are no notable differences in the numerical results and convergence rates. Numerical experiments are given to confirm the theoretical results. With adaptive meshes, the DG methods resolve the boundary and/or interior layers for convection dominated problems better and more efficiently than the continuous finite elements in [26, 27] for steady optimal control problems. This issue will be addressed in future work with space-time adaptivity for unsteady convection dominated optimal control problems.

References

Becker, R., Vexler, B.: Optimal control of the convection-diffusion equation using stabilized finite element methods. Numer. Math. 106, 349–367 (2007)
Bergounioux, M., Ito, K., Kunisch, K.: Primal-dual strategy for constrained optimal control problems. SIAM J. Control Optim. 37(4), 1176–1194 (1999)
Article MATH MathSciNet Google Scholar
Bergounioux, M., Haddou, M., Hintermueller, M., Kunisch, K.: A comparison of interior–point methods and a Moreau–Yosida based active set strategy for constrained optimal control problems. SIAM J. Optim. 11, 495–521 (2001)
Article Google Scholar
Brenner, S.: Poincaré-Friedrichs inequalities for piecewise H ¹ functions. SIAM J. Numer. Anal. 41(1), 306–324 (2003)
Article MATH MathSciNet Google Scholar
Cangiani, A., Chapman, J., Georgoulis, E.H., Jensen, M.: On Local Super-Penalization of Interior Penalty Discontinuous Galerkin Methods. CoRR 1205.5672 [abs] (2012)
Collis, S.S., Heinkenschloss, M.: Analysis of the streamline upwind/Petrov Galerkin method applied to the solution of optimal control problems. Tech. Rep. TR02–01, Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005-1892 (2002)
Evans, L.C.: Partial Differential Equations. Grad. Stud. Math., vol. 19. AMS, Providence (2002)
Google Scholar
Feistauer, M., S̆vadlenka, K.: Discontinuous Galerkin method of lines for solving nonstationary singularly perturbed linear problems. J. Numer. Math. 12, 97–117 (2004)
Article MATH MathSciNet Google Scholar
Fu, H.: A characteristic finite element method for optimal control problems governed by convection-diffusion equations. J. Comput. Appl. Math. 235, 825–836 (2010)
Article MATH MathSciNet Google Scholar
Fu, H., Rui, H.: A priori error estimates for optimal control problems governed by transient advection-diffusion equations. J. Sci. Comput. 38(3), 290–315 (2009)
Article MATH MathSciNet Google Scholar
Hinze, M.K., Turek, S.: A Space-Time Multigrid Method for Optimal Flow Control. International Series of Numerical Mathematics, vol. 160 (2011)
Google Scholar
Hinze, M., Yan, N., Zhou, Z.: Variational discretization for optimal control governed by convection dominated diffusion equations. J. Comput. Math. 27, 237–253 (2009)
MATH MathSciNet Google Scholar
Houston, P., Schwab, C., Süli, E.: Discontinuous hp-finite element methods for advection-diffusion-reaction problems. SIAM J. Numer. Anal. 39(6), 2133–2163 (2002)
Article MATH MathSciNet Google Scholar
Kunisch, K., Vexler, B.: Constrained Dirichlet boundary control in L ² for a class of evolution equations. SIAM J. Control Optim. 46(5), 1726–1753 (2007)
Article MATH MathSciNet Google Scholar
Leykekhman, D.: Investigation of commutative properties of discontinuous Galerkin methods in PDE constrained optimal control problems. J. Sci. Comput. 53, 483–511 (2012)
Article MATH MathSciNet Google Scholar
Leykekhman, D., Heinkenschloss, M.: Local error analysis of discontinuous Galerkin methods for advection-dominated elliptic linear-quadratic optimal control problems. SIAM J. Numer. Anal. 50, 2012–2038 (2012)
Article MATH MathSciNet Google Scholar
Li, R., Liu, W., Ma, H., Tang, T.: Adaptive finite element approximation for distributed elliptic optimal control problems. SIAM J. Control Optim. 41, 1321–1349 (2002)
Article MATH MathSciNet Google Scholar
Lions, J.L.: Optimal Control of Systems Governed by Partial Differential Equations. Springer, Berlin (1971)
Book MATH Google Scholar
Meidner, D., Vexler, B.: A priori error estimates for space-time finite element discretization of parabolic optimal control problems. II. Problems with control constraints. SIAM J. Control Optim. 47(3), 1301–1329 (2008)
Article MATH MathSciNet Google Scholar
Meidner, D., Vexler, B.: A priori error analysis of the Petrov–Galerkin Crank–Nicolson scheme for parabolic optimal control problems. SIAM J. Control Optim. 49(5), 2183–2211 (2011)
Article MATH MathSciNet Google Scholar
Rivìere, B.: Discontinuous Galerkin Methods for Solving Elliptic and Parabolic Equations. Theory and Implementation. Frontiers in Applied Mathematics, vol. 35. Society for Industrial and Applied Mathematics, Philadelphia (2008)
Book MATH Google Scholar
Schötzau, D., Zhu, L.: A robust a-posteriori error estimator for discontinuous Galerkin methods for convection-diffusion equations. Appl. Numer. Math. 59, 2236–2255 (2009)
Article MATH MathSciNet Google Scholar
Stoll, M., Wathen, A.: All-at-once solution of time-dependent Stokes control. J. Comput. Phys. 232, 498–515 (2013)
Article MathSciNet Google Scholar
Sun, T.: Discontinuous Galerkin finite element method with interior penalties for convection diffusion optimal control problem. Int. J. Numer. Anal. Model. 7, 87–107 (2010)
MathSciNet Google Scholar
Yan, N., Zhou, Z.: A priori and a posteriori error analysis of edge stabilization Galerkin method for the optimal control problem governed by convection-dominated diffusion equation. J. Comput. Appl. Math. 223, 198–217 (2009)
Article MATH MathSciNet Google Scholar
Yücel, H., Karasözen, B.: Adaptive Symmetric Interior Penalty Galerkin (SIPG) method for optimal control of convection diffusion equations with control constraints (2013). Optimization (electronic)
Yücel, H., Heinkenschloss, M., Karasözen, B.: An adaptive discontinuous Galerkin method for convection dominated distributed optimal control problems. Tech. Rep., Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005–1892 (2012)
Yücel, H., Heinkenschloss, M., Karasözen, B.: Distributed optimal control of diffusion-convection-reaction equations using discontinuous Galerkin methods. In: Numerical Mathematics and Advanced Applications, vol. 2011, pp. 389–397. Springer, Berlin (2013)
Google Scholar
Zhou, Z., Yan, N.: The local discontinuous Galerkin method for optimal control problem governed by convection diffusion equations. Int. J. Numer. Anal. Model. 7, 681–699 (2010)
MATH MathSciNet Google Scholar

Acknowledgements

The authors wish to thank the referees for their helpful suggestions and comments. This research was supported by the Middle East Technical University Scientific Research Fund (Project: BAP-07-05-2012-102).

Author information

Hamdullah Yücel
Present address: Computational Methods in Systems and Control Theory, Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstr. 1, 39106, Magdeburg, Germany

Authors and Affiliations

Department of Mathematics & Institute of Applied Mathematics, Middle East Technical University, 06800, Ankara, Turkey
Tuğba Akman, Hamdullah Yücel & Bülent Karasözen

Corresponding author

Correspondence to Hamdullah Yücel.

About this article

Cite this article

Akman, T., Yücel, H. & Karasözen, B. A priori error analysis of the upwind symmetric interior penalty Galerkin (SIPG) method for the optimal control problems governed by unsteady convection diffusion equations. Comput Optim Appl 57, 703–729 (2014). https://doi.org/10.1007/s10589-013-9601-4

Received: 12 July 2012
Published: 25 September 2013
Issue Date: April 2014
DOI: https://doi.org/10.1007/s10589-013-9601-4
