1 Introduction

Optimal control problems (OCPs) governed by convection diffusion partial differential equations (PDEs) arise in environmental modeling, petroleum reservoir simulation and many other applications. Hence, efficient numerical methods are essential for solving such optimal control problems.

It is well known that the standard Galerkin finite element method produces nonphysical oscillations when the mesh size is larger than a critical value that depends on the ratio between the diffusion and convection terms. To enhance the stability and accuracy of discretizations of optimal control problems governed by steady convection diffusion equations, several stabilization techniques have been used, e.g., the streamline upwind/Petrov Galerkin (SUPG) finite element method [6], the local projection stabilization [1] and the edge stabilization [12, 25]. Recently, discontinuous Galerkin (DG) methods have become popular for optimal control problems governed by convection diffusion equations due to their better convergence behavior, local mass conservation, and flexibility in approximating rough solutions on complicated meshes and in mesh adaptation; see, e.g., [15, 16, 26–28].

However, to the best of our knowledge, only a few papers have been published so far on unsteady optimal control problems governed by convection diffusion equations. A characteristic finite element approximation in space and the backward Euler method in time are used in [9, 10]. Zhou et al. [29] used a local discontinuous Galerkin (LDG) discretization in space, whereas Sun [24] used the nonsymmetric interior penalty Galerkin (NIPG) discretization. In [24], a priori error estimates are given only for the semi-discrete scheme, whereas in [29] they are derived for both the semi- and fully-discrete schemes with the backward Euler method. Neither paper reports numerical results.

In this paper, we carry out an a priori error analysis of optimal control problems governed by unsteady convection diffusion equations using the symmetric interior penalty Galerkin (SIPG) method, for both the semi- and fully-discrete schemes. For the time discretization, we apply the backward Euler method. We also present numerical results for the DG discretization of the unsteady optimal control problems.

The rest of the paper is organized as follows: In Sect. 2, we introduce the control constrained optimal control problems governed by the unsteady convection diffusion equations. The upwind symmetric interior penalty Galerkin (SIPG) discretization and semi-discrete scheme are given in Sect. 3. A priori error estimates of the semi-discrete scheme are derived in Sect. 4. In Sect. 5, we give the fully-discrete scheme of the optimal control problems by using the backward Euler discretization in time. We derive a priori error estimates of the fully-discrete scheme in Sect. 6. Finally, we present the numerical results in Sect. 7.

2 The optimal control problem

We adopt the standard notation for Sobolev spaces on computational domains and their norms. \(\varOmega\) and \(\varOmega_{U}\) are bounded convex polygonal domains in \(\mathbb{R}^{2}\) with Lipschitz boundaries \(\partial\varOmega\) and \(\partial\varOmega_{U}\), respectively. The inner products in \(L^{2}(\varOmega_{U})\) and \(L^{2}(\varOmega)\) are denoted by \((\cdot,\cdot)_{U}\) and \((\cdot,\cdot)\), respectively. Further, we consider spaces of functions mapping the time interval \((0,T)\) to a normed space \(X\) equipped with the norm \(\|\cdot\|_{X}\). For \(r \geq 1\), we define

$$L^{r}(0,T;X)= \biggl\{ z:[0,T] \rightarrow X \ \hbox{measurable} : \int_{0}^{T} \big\|z(t)\big\|_{X}^{r} \,dt < \infty\biggr\} $$

with

$$\begin{aligned} \big\|z(t)\big\|_{L^{r}(0,T;X)}=\left \{ \begin{array}{l@{\quad}l} ( \int_{0}^{T} \|z(t)\|_{X}^{r} \, dt )^{1/r}, & \hbox{if} \ 1 \leq r < \infty, \\ \hbox{ess} \sup_{t \in(0,T]} \|z(t)\|_{X}, & \hbox{if} \ r=\infty. \end{array} \right . \end{aligned}$$
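The norm above can be approximated from sampled data. The following is a minimal sketch, not part of the analysis, assuming a one-dimensional spatial domain (0,1), \(X = L^{2}(0,1)\), uniform grids and the trapezoidal rule; the helper names `trapezoid` and `bochner_norm` and the test function \(z(t,x)=e^{-t}\sin(\pi x)\) are illustrative choices.

```python
import numpy as np

def trapezoid(values, grid):
    """Composite trapezoidal rule along the last axis."""
    steps = np.diff(grid)
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * steps, axis=-1)

def bochner_norm(z, t, x, r=2.0):
    """Approximate ||z||_{L^r(0,T;L^2(0,1))} from samples z[i, j] = z(t_i, x_j)."""
    spatial = np.sqrt(trapezoid(z**2, x))   # ||z(t_i)||_{L^2(0,1)} for every t_i
    if np.isinf(r):
        return spatial.max()                # supremum over the sampled times
    return trapezoid(spatial**r, t) ** (1.0 / r)

# Illustrative data: z(t, x) = exp(-t) * sin(pi * x) on (0, T) x (0, 1).
T = 1.0
t = np.linspace(0.0, T, 2001)
x = np.linspace(0.0, 1.0, 2001)
z = np.exp(-t)[:, None] * np.sin(np.pi * x)[None, :]

norm_l2 = bochner_norm(z, t, x, r=2.0)
norm_inf = bochner_norm(z, t, x, r=np.inf)
```

For this choice of \(z\), the exact values are \(\sqrt{(1-e^{-2T})/4}\) for \(r=2\) and \(1/\sqrt{2}\) for \(r=\infty\), which the sketch reproduces up to quadrature error.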

In this paper, we are interested in the following distributed optimal control problem governed by the unsteady diffusion convection reaction equation with control constraints

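$$\begin{aligned} \underset{u \in U_{ad}}{\hbox{minimize}} \quad J(y,u):=\int _{0}^{T} \biggl(\frac{1}{2} \|y-y_{d}\|^{2}_{L^2(\varOmega)} + \frac{\alpha}{2} \| u \|^{2}_{L^2(\varOmega_U)} \biggr)\, dt \end{aligned}$$
(2.1)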

subject to

$$\begin{aligned} &\partial_t y-\varepsilon \varDelta y+\beta\cdot\nabla y+r y = f + B u \quad x\in\varOmega, \ t \in(0,T], \end{aligned}$$
(2.2a)
$$\begin{aligned} &y(x,t)=0 \quad x \in\partial\varOmega, \ t \in (0,T], \end{aligned}$$
(2.2b)
$$\begin{aligned} &y(x,0)=y_{0}(x) \quad x\in\varOmega, \end{aligned}$$
(2.2c)

where the admissible space of control constraints is given by

$$\begin{aligned} U_{ad} = \bigl\{ u \in L^{2} \bigl(0,T;L^2(\varOmega_U)\bigr): u_a \le u \le u_b, \hbox{ a.e. in } \varOmega_U \times(0,T] \bigr\} \end{aligned}$$
(2.3)

with the constant bounds \(u_{a}, u_{b} \in\mathbb{R} \cup\{\pm\infty\} \) satisfying \(u_{a} < u_{b}\). \(B\) is a bounded linear operator that maps control functions defined on \(\varOmega_{U}\) to functions defined on \(\varOmega\). In general, \(\varOmega_{U}\) can be a subset of \(\varOmega\). In the special case \(\varOmega_{U} = \varOmega\), \(B=I\) is the identity operator.

We make the following assumptions for the functions and parameters on the optimal control problem (2.1), (2.2a)–(2.2c):

  (i) The source function \(f\) and the desired state \(y_{d}\) belong to \(H^{1}(0,T;L^{2}(\varOmega))\) with \(f(0), y_{d}(T) \in H_{0}^{1}(\varOmega)\).

  (ii) The initial condition is defined as \(y_{0}(x) \in H^{1}_{0}(\varOmega)\) with \(\varDelta y_{0} \in H^{1}_{0}(\varOmega)\).

  (iii) The diffusion and reaction parameters are denoted by \(\varepsilon > 0\) and \(r \in L^{\infty}(\varOmega)\), respectively.

  (iv) \(\beta\) denotes a velocity field. It belongs to \((W^{1,\infty}(\varOmega))^{2}\) and satisfies the incompressibility condition, i.e., \(\nabla\cdot\beta=0\).

Further, we assume the existence of a constant \(c_{0} \geq 0\) such that

$$\begin{aligned} r(x) \geq c_{0} \geq0 \quad\hbox{a.e. in}\ \varOmega \end{aligned}$$
(2.4)

to ensure the well-posedness of the optimal control problem (2.1), (2.2a)–(2.2c).

Using the assumptions defined above, the following result on regularity of the state solution can be proved.

Proposition 1

Under the assumptions defined above and for a given control \(u \in H^{1}(0,T;L^{2}(\varOmega_{U}))\), the state \(y\) satisfies the following regularity condition

$$y \in H^1\bigl(0,T; H^2(\varOmega) \cap H^1_0(\varOmega)\bigr) \cap H^2\bigl(0,T; L^2(\varOmega)\bigr) $$

and the weak formulation

$$\begin{aligned} &(\partial_t y,v)+ a(y,v)+b(u,v)=(f,v) \quad\forall v \in V=H^1_0(\varOmega), \ t\in(0,T], \end{aligned}$$
(2.5)
$$\begin{aligned} &y(x,0)=y_0, \end{aligned}$$
(2.6)

where the (bi)-linear forms are defined by

$$\begin{aligned} &a(y,v)=\int_{\varOmega} (\varepsilon \nabla y \cdot\nabla v + \beta \cdot\nabla y v + r y v)\, dx, \qquad b(u,v)=-\int_{\varOmega} Bu v \, dx, \\ &(f,v)= \int_{\varOmega} f v \, dx. \end{aligned}$$

Proof

The regularity of the state \(y \in H^{1}(0,T; H^{2}(\varOmega) \cap H^{1}_{0}(\varOmega)) \cap H^{2}(0,T; L^{2}(\varOmega))\) can be proved as in [7] provided that \(f + Bu \in H^{1}(0,T;L^{2}(\varOmega))\) with \((f + Bu)(0) \in H^{1}_{0}(\varOmega)\). This condition is ensured by our assumptions. See, e.g., [20] for details. □

Then, the variational formulation corresponding to (2.1), (2.2a)–(2.2c) can be written as

$$\begin{aligned} &\underset{u \in U_{ad}}{\hbox{minimize}} \quad J(y,u):=\int _{0}^{T} \biggl(\frac{1}{2} \|y-y_{d}\|^{2}_{L^2(\varOmega)} + \frac{\alpha}{2} \| u \|^{2}_{L^2(\varOmega_U)} \biggr)\, dt \end{aligned}$$
(2.7a)
$$\begin{aligned} &\hbox{subject to} \quad (\partial_t y,v)+ a(y,v)+b(u,v)=(f,v) \quad\forall v \in V, \ t \in(0,T], \\ &\phantom{\hbox{subject to}}\quad y(x,0)=y_0, \\ & \phantom{\hbox{subject to}}\quad(y,u) \in Y \times U_{ad}. \end{aligned}$$
(2.7b)

It is well known that the pair \((y,u)\) is the unique solution of (2.7a), (2.7b) if and only if there is an adjoint \(p \in H^{1}(0,T; H^{2}(\varOmega) \cap H^{1}_{0}(\varOmega)) \cap H^{2}(0,T; L^{2}(\varOmega))\) such that \((y,u,p)\) satisfies the following optimality system

$$\begin{aligned} &(\partial_t y,v)+a(y,v)+b(u,v)=(f,v) \quad \forall v \in V,\ y(x,0)=y_0, \end{aligned}$$
(2.8a)
$$\begin{aligned} &-(\partial_t p,\psi)+a(\psi,p)=-(y-y_{d},\psi) \quad \forall\psi\in V,\ p(x,T)=0, \end{aligned}$$
(2.8b)
$$\begin{aligned} &\int_{0}^{T} \bigl(\alpha u- B^* p, w -u \bigr)_U\, dt \geq0 \quad \forall w \in U_{ad}, \end{aligned}$$
(2.8c)

where \(B^{*}\) denotes the adjoint of \(B\) [18, 20].
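A standard consequence of a variational inequality of the form (2.8c) with box constraints, not stated explicitly above, is the pointwise projection formula \(u = \max\{u_{a}, \min\{u_{b}, B^{*}p/\alpha\}\}\). The following minimal sketch checks a discrete analogue of this fact. It assumes a plain vector `g` standing in for nodal values of \(B^{*}p\) and the Euclidean inner product in place of \((\cdot,\cdot)_{U}\); the names `project_control` and `variational_residual` and all parameter values are hypothetical.

```python
import numpy as np

def project_control(adjoint_term, alpha, u_a, u_b):
    """Pointwise projection of adjoint_term / alpha onto the box [u_a, u_b]."""
    return np.clip(adjoint_term / alpha, u_a, u_b)

def variational_residual(u, w, adjoint_term, alpha):
    """Discrete analogue of (alpha*u - B^*p, w - u)_U with the Euclidean inner product."""
    return float(np.sum((alpha * u - adjoint_term) * (w - u)))

rng = np.random.default_rng(0)
alpha, u_a, u_b = 0.1, -1.0, 2.0
g = rng.normal(scale=0.5, size=200)      # stand-in for nodal values of B^*p
u = project_control(g, alpha, u_a, u_b)

# Evaluate the residual against many admissible test controls w.
residuals = [
    variational_residual(u, rng.uniform(u_a, u_b, size=g.size), g, alpha)
    for _ in range(100)
]
```

Each entry of `residuals` is nonnegative up to rounding: where the bound is inactive the factor \(\alpha u - g\) vanishes, and where a bound is active the two factors have the same sign.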

3 Discontinuous Galerkin (DG) scheme for optimal control problem

3.1 Discontinuous Galerkin discretization

Let \(\{ \mathcal{T}_{h}\}_{h}\) be a family of shape regular meshes such that \(\overline{\varOmega} = \cup_{K \in\mathcal{T}_{h}} \overline{K}\), \(K_{i} \cap K_{j} = \emptyset\) for \(K_{i}, K_{j} \in\mathcal{T}_{h}\), \(i \neq j\). The diameter of an element \(K\) and the length of an edge \(E\) are denoted by \(h_{K}\) and \(h_{E}\), respectively. Further, the maximum element diameter is denoted by \(h=\max_{K \in\mathcal{T}_{h}} h_{K}\).

We only consider discontinuous piecewise linear finite element spaces to define the discrete spaces of the state and test functions

$$\begin{aligned} V_h = Y_h &= \bigl \{{y \in L^2( \varOmega)}\,:~{ y\mid_{K}\in\mathbb{P}^1(K) \ \forall K \in\mathcal{T}_h}\bigr \}. \end{aligned}$$
(3.1)

Remark 1

When the state equation (2.2a)–(2.2c) contains nonhomogeneous Dirichlet boundary conditions, the space of discrete states \(Y_{h}\) and the space of test functions \(V_{h}\) can still be taken to be the same due to the weak treatment of boundary conditions in DG methods. See [16] for details.

We split the set of all edges \(\mathcal{E}_{h}\) into the set \(\mathcal {E}^{0}_{h}\) of interior edges and the set \(\mathcal{E}^{\partial}_{h}\) of boundary edges so that \(\mathcal{E}_{h}=\mathcal{E}^{\partial}_{h}\cup \mathcal{E}^{0}_{h}\). Let n denote the unit outward normal to ∂Ω. We define the inflow boundary

$$\varGamma^- = \bigl \{{x \in\partial\varOmega}\,:~{ \beta\cdot\mathbf{n}(x) < 0}\bigr \} $$

and the outflow boundary \(\varGamma^{+} = \partial\varOmega \setminus \varGamma^{-}\). The boundary edges are decomposed into edges \(\mathcal{E}^{-}_{h} = \{{E \in\mathcal{E}^{\partial}_{h}}\,:~{ E \subset \varGamma^{-} }\}\) that correspond to the inflow boundary and edges \(\mathcal{E}^{+}_{h} = \mathcal{E}^{\partial}_{h} \setminus\mathcal {E}^{-}_{h}\) that correspond to the outflow boundary. The inflow and outflow boundaries of an element \(K \in\mathcal{T}_{h}\) are defined by

$$\begin{aligned} \partial K^-=\bigl \{{x \in\partial K}\,:~{\beta\cdot\mathbf{n}_{K}(x) <0}\bigr \}, \qquad\partial K^{+} = \partial K \setminus\partial K^{-}, \end{aligned}$$

where \(\mathbf{n}_{K}\) is the unit outward normal vector on the boundary \(\partial K\) of an element \(K\).
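The splitting of \(\partial K\) into inflow and outflow parts can be illustrated for a single triangle. The sketch below assumes a constant velocity \(\beta\) and counterclockwise vertex ordering; the function names are hypothetical. Edges with \(\beta\cdot\mathbf{n}_{K} = 0\) are assigned to the outflow part, in line with the definition \(\partial K^{+} = \partial K \setminus \partial K^{-}\).

```python
import numpy as np

def outward_edge_normals(vertices):
    """Unit outward normals of a triangle given as a 3x2 array of CCW vertices."""
    normals = []
    for i in range(3):
        a, b = vertices[i], vertices[(i + 1) % 3]
        tangent = b - a
        # Rotating the tangent clockwise by 90 degrees points outward for CCW ordering.
        normal = np.array([tangent[1], -tangent[0]])
        normals.append(normal / np.linalg.norm(normal))
    return np.array(normals)

def classify_edges(vertices, beta):
    """Label edge i (from vertex i to vertex i+1) as 'inflow' or 'outflow'."""
    flux = outward_edge_normals(vertices) @ beta
    return ["inflow" if f < 0.0 else "outflow" for f in flux]

triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
labels = classify_edges(triangle, np.array([1.0, 0.5]))
```

For this triangle and \(\beta = (1, 0.5)\), the bottom and left edges are inflow edges and the hypotenuse is an outflow edge.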

Let the edge \(E\) be a common edge of two elements \(K\) and \(K^{e}\). For a piecewise continuous scalar function \(y\), there are two traces of \(y\) along \(E\), denoted by \(y|_{E}\) from inside \(K\) and \(y^{e}|_{E}\) from inside \(K^{e}\). Then, the jump and average of \(y\) across the edge \(E\) are defined by:

$$\begin{aligned}{} [\![ y ]\!]=y\big|_E\mathbf{n}_{K}+y^e\big|_E \mathbf{n}_{K^e}, \qquad \left \{\!\left \{ y \right \}\!\right \}=\frac{1}{2} \bigl( y\big|_E+y^e\big|_E \bigr). \end{aligned}$$
(3.2)

Similarly, for a piecewise continuous vector field ∇y, the jump and average across an edge E are given by

$$\begin{aligned}{} [\![ \nabla y ]\!]=\nabla y\big|_E \cdot\mathbf{n}_{K}+\nabla y^e\big|_E \cdot\mathbf{n}_{K^e}, \qquad \left \{\!\left \{ \nabla y \right \}\!\right \}=\frac{1}{2} \bigl(\nabla y\big|_E+\nabla y^e\big|_E \bigr). \end{aligned}$$
(3.3)

For a boundary edge \(E \in K \cap \varGamma\), we set \(\left \{\!\left \{ \nabla y \right \}\!\right \}=\nabla y\) and \([\![ y ]\!] = y\mathbf{n}\), where \(\mathbf{n}\) is the outward unit normal vector on \(\varGamma\).
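The definitions (3.2) and (3.3) can be written out at a single point of an interior edge. This is a minimal sketch assuming that the two traces are already available as numbers or vectors and that \(\mathbf{n}_{K^{e}} = -\mathbf{n}_{K}\); the function names are illustrative.

```python
import numpy as np

def scalar_jump(y_in, y_ext, n_K):
    """[[y]] = y|_E n_K + y^e|_E n_{K^e}, a vector, using n_{K^e} = -n_K."""
    return y_in * n_K + y_ext * (-n_K)

def scalar_average(y_in, y_ext):
    """{{y}} = (y|_E + y^e|_E) / 2, a scalar."""
    return 0.5 * (y_in + y_ext)

def gradient_jump(grad_in, grad_ext, n_K):
    """[[grad y]] = grad y|_E . n_K + grad y^e|_E . n_{K^e}, a scalar."""
    return float(np.dot(grad_in, n_K) + np.dot(grad_ext, -n_K))

def gradient_average(grad_in, grad_ext):
    """{{grad y}} = (grad y|_E + grad y^e|_E) / 2, a vector."""
    return 0.5 * (grad_in + grad_ext)

n_K = np.array([1.0, 0.0])                       # unit normal pointing from K into K^e
jump_y = scalar_jump(2.0, 0.5, n_K)
avg_y = scalar_average(2.0, 0.5)
jump_grad = gradient_jump(np.array([1.0, 2.0]), np.array([3.0, -1.0]), n_K)
avg_grad = gradient_average(np.array([1.0, 2.0]), np.array([3.0, -1.0]))
```

Note that the jump of a scalar is a vector while the jump of a vector field is a scalar, and that both jumps vanish when the two traces coincide.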

We now consider the discretization of the control variable. Let \(\{ \mathcal{T}_{h}^{U}\}_{h}\) also be a family of shape regular meshes of \(\varOmega_{U}\) such that \(\overline{\varOmega}_{U} = \bigcup_{K_{U} \in\mathcal {T}_{h}^{U}} \overline{K}_{U}\), \(K^{i}_{U} \cap K^{j}_{U} = \emptyset\) for \(K^{i}_{U}, K^{j}_{U} \in\mathcal{T}_{h}^{U}\), \(i \neq j\). The maximum diameter is defined by \(h_{U}=\max_{K_{U} \in\mathcal{T}_{h}^{U}} h_{K_{U}}\), where \(h_{K_{U}}\) denotes the diameter of an element \(K_{U}\). The discrete space of the control variable associated with \(\{ \mathcal{T}_{h}^{U}\}_{h}\) is also a piecewise linear finite element space

$$\begin{aligned} U_{h}= \bigl \{{u \in L^2( \varOmega_U)}\,:~{ u\mid_{K_U} \in\mathbb{P}^1(K_U) \ \forall K_U \in\mathcal{T}_h^U}\bigr \}. \end{aligned}$$
(3.4)

Note that in general, the sizes of the elements in \(\{ \mathcal{T}_{h}^{U}\} _{h}\) are smaller than those in \(\{ \mathcal{T}_{h}\}_{h}\), so we assume that \(h_{U}/h \leq C\) throughout this paper.

We can now give the DG discretization of the state equation (2.2a)–(2.2c) in space for a fixed control \(u\). The DG method proposed here is based on the upwind discretization of the convection term and on the SIPG discretization of the diffusion term [22]. This leads to the following (bi-)linear forms applied to \(y_{h} \in H^{1}(0,T;Y_{h})\) for all \(t \in (0,T]\)

$$\begin{aligned} (\partial_t y_h, v_h) + a_h(y_h,v_h)+b_h(u_h,v_h)=(f_h,v_h) \quad\forall v_h \in V_h, \ t \in(0,T], \end{aligned}$$
(3.5)

where

$$\begin{aligned} &a_h(y,v) \\ &\quad = \underbrace{\sum _{K \in\mathcal{T}_h} \int _{K} \varepsilon \nabla y \cdot\nabla v \, dx - \sum _{ E \in\mathcal{E}_h} \int _E \biggl( \{\!\{ \varepsilon \nabla y \}\!\} \cdot [\![ v ]\!] + \{\!\{ \varepsilon \nabla v \}\!\} \cdot [\![ y ]\!] - \frac{\sigma \varepsilon }{h_E} [\![ y ]\!] \cdot [\![ v ]\!] \biggr) \, ds }_{a^{d}(y,v)} \\ &\qquad {}+ \!\underbrace{ \sum _{K \in\mathcal{T}_h} \int _{K} ( \beta\cdot\nabla y v + r y v ) \, dx + \!\!\sum _{K \in\mathcal{T}_h} \int _{\partial K^{-} \backslash\varGamma^-} \beta\cdot \mathbf{n} \bigl(y^e-y\bigr)v \, ds - \!\!\sum _{K \in\mathcal{T}_h} \int _{\partial K^{-} \cap\varGamma^{-}} \beta\cdot\mathbf{n} y v \, ds}_{a^{cr}(y,v)}, \end{aligned}$$
(3.6a)
$$\begin{aligned} &b_h(u, v) = - \sum _{K \in\mathcal{T}_h} \int _{K} Buv \, dx \end{aligned}$$
(3.6b)

with a constant interior penalty parameter σ>0. We choose σ to be sufficiently large, independent of the mesh size h and the diffusion coefficient ε to ensure the stability of the DG discretization as described in [21, Sect. 2.7.1] with a lower bound depending only on the polynomial degree. Large penalty parameters decrease the jumps across element interfaces, which can affect the numerical approximation. Further, the DG approximation can converge to the continuous Galerkin approximation as the penalty parameter goes to infinity. See, e.g., [5] for details.
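The role of the penalty parameter can be observed on a small model problem. The following sketch is a simplification and not the discretization analyzed here: it assembles only the diffusion part of the SIPG form for \(-\varepsilon y''\) on (0,1) with discontinuous piecewise linear functions on a uniform mesh, takes \(h_{E}=h\) at every node, and imposes homogeneous Dirichlet data weakly. All names and parameter values are illustrative.

```python
import numpy as np

def sipg_diffusion_matrix_1d(n_elems, eps, sigma):
    """SIPG matrix for -eps*y'' on (0,1) with discontinuous P1 elements.

    Unknowns are the left and right endpoint values on each element.
    """
    h = 1.0 / n_elems
    ndof = 2 * n_elems
    A = np.zeros((ndof, ndof))

    # Volume term: sum over elements of int_K eps * y' * v' dx.
    local = (eps / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    for k in range(n_elems):
        A[2 * k:2 * k + 2, 2 * k:2 * k + 2] += local

    def slope(k):
        """Coefficient vector of the constant derivative on element k."""
        g = np.zeros(ndof)
        g[2 * k], g[2 * k + 1] = -1.0 / h, 1.0 / h
        return g

    # Node terms: -{{eps y'}}[[v]] - {{eps v'}}[[y]] + (sigma*eps/h) [[y]][[v]].
    for node in range(n_elems + 1):
        jump = np.zeros(ndof)
        if node == 0:                    # left boundary, outward normal -1
            jump[0] = -1.0
            avg = slope(0)
        elif node == n_elems:            # right boundary, outward normal +1
            jump[-1] = 1.0
            avg = slope(n_elems - 1)
        else:                            # interior node: left trace minus right trace
            jump[2 * node - 1], jump[2 * node] = 1.0, -1.0
            avg = 0.5 * (slope(node - 1) + slope(node))
        A -= eps * (np.outer(jump, avg) + np.outer(avg, jump))
        A += (sigma * eps / h) * np.outer(jump, jump)
    return A

def smallest_eigenvalue(A):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(A).min())

A_large = sipg_diffusion_matrix_1d(16, eps=1e-2, sigma=10.0)
A_small = sipg_diffusion_matrix_1d(16, eps=1e-2, sigma=0.1)
```

In this model setting, the matrix is symmetric for every \(\sigma\), positive definite for \(\sigma = 10\), and indefinite for \(\sigma = 0.1\), which illustrates why the penalty parameter has to exceed a threshold that does not depend on \(h\) or \(\varepsilon\).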

To simplify the notation, we introduce the \(L^{2}\) inner product on the inflow and outflow boundaries as follows

$$(w, v)_{\varGamma^{-}} = \int_{\varGamma^{-}}|\beta\cdot \mathbf{n}| w v \, ds $$

with an analogous definition of \((\cdot, \cdot)_{\varGamma^{+}}\) and associated norms \(\|\cdot\|_{\varGamma^{-}}\) and \(\|\cdot\|_{\varGamma ^{+}}\). Further, the standard notation \(W^{m,q}(\varOmega)\) is used for the Sobolev space with norm \(\|\cdot\|_{W^{m,q}(\varOmega)}\), and the norms of the broken Sobolev spaces used in the DG discretization are given by

$$\|\!| v |\!\|_{W^{m,q}(\mathcal{T}_h)}= \biggl( \sum _{K \in\mathcal{T}_h} \|v \|_{W^{m,q}(K)}^{2} \biggr)^{1/2}. $$

3.2 Semi-discrete formulation of optimal control problem

The discretization of admissible set (2.3) is defined by

$$\begin{aligned} U_h^{ad}=\bigl\{u_h \in L^{2}(0,T;U_h) : u_a \leq u_h \leq u_b \ \hbox{a.e. in}\ \varOmega_U \times(0,T] \bigr\}. \end{aligned}$$
(3.7)

Let \(f_{h}, y_{h}^{d}\) and \(y_{h}^{0}\) be approximations of the source function \(f\), the desired state function \(y_{d}\) and the initial condition \(y_{0}\), respectively. Then, the semi-discrete approximation of the optimal control problem (2.8a)–(2.8c) can be defined as follows:

$$\begin{aligned} &\underset{u_h \in U_h^{ad}}{\hbox{minimize}} \quad \int_{0}^{T} \biggl( \frac{1}{2} \sum_{K \in\mathcal{T}_h} \big\|y_h-y_{h}^d \big\|^{2}_{L^{2}(K)} + \frac{\alpha}{2} \sum_{K_U \in\mathcal{T}_h^U} \| u_h\|^{2}_{L^{2}(K_U)} \biggr) \, dt, \end{aligned}$$
(3.8a)
$$\begin{aligned} &\hbox{subject to} \quad (\partial_t y_h, v_h) + a_h(y_h,v_h)+b_h(u_h,v_h)=(f_h,v_h) \quad\forall v_h \in V_h, \ t \in(0,T], \\ &\hphantom{\hbox{subject to} \quad} y_h(x,0)=y_h^0,\qquad (y_h,u_h) \in Y_h \times U_h^{ad}. \end{aligned}$$
(3.8b)

4 A priori error analysis of semi-discrete scheme

In this section, we derive a priori error estimates for the semi-discrete scheme of the optimal control problem (2.1), (2.2a)–(2.2c) by using the upwind symmetric interior penalty Galerkin (SIPG) discretization for the space. By introducing the following norm [21]

$$\begin{aligned} \|v\|^{2}_{\varepsilon} = \sum _{K \in\mathcal{T}_{h}} \varepsilon \| \nabla v \|^{2}_{L^{2}(K)} + \sum_{E \in\mathcal{E}_{h}} \frac{\sigma\varepsilon}{h_{E}} \big\|[\![ v ]\!]\big\|^{2}_{L^{2}(E)}, \end{aligned}$$

we obtain the following coercivity result for some positive constant κ>0 independent of the mesh size h and the diffusion parameter ε provided that a sufficiently large penalty parameter σ is chosen based on the polynomial degree as described in [21]:

$$\begin{aligned} \forall t >0, \ \forall v_{h} \in V_{h}, \quad\kappa\|v\|^{2}_{\varepsilon} \leq a^{d}_{h}(v, v). \end{aligned}$$
(4.1)

We also need the following trace inequality in the rest of the paper:

$$\begin{aligned} \|v\|_{L^{2}(E)} \leq C |h_{E}|^{1/2} |h_{K}|^{-1/2} \|v\|_{L^{2}(K)}, \quad\forall E \subset \partial K, \end{aligned}$$
(4.2)

where the constant C is independent of mesh size h, but depends on polynomial degree. In addition, the generalization of Poincaré-Friedrichs inequality to the broken Sobolev space \(H^{1}(\mathcal {T}_{h})\) [4]

$$\begin{aligned} \|v\|^{2}_{L^{2}(\varOmega)} \leq C \biggl( \|\!| \nabla v |\!\|^{2}_{H^{0}(\mathcal{T}_h)} + \sum _{E \in\mathcal{E}_h} \frac{1}{h_E} \big\| \left [\!\left [ v \right ]\!\right ] \big\|^{2} _{L^{2}(E)} \biggr), \quad \forall v \in H^{1}(\mathcal{T}_h) \end{aligned}$$
(4.3)

is needed for some of the following proofs.

We now derive a semi-discrete stability estimate for the state variable in Lemma 1.

Lemma 1

Let \(y_{h}\) be the solution of (3.8b) and let \(c_{0}\) be a positive constant such that (2.4) holds. Then, there exists a positive constant \(C\) independent of the mesh size \(h\) such that, for all \(t \in (0,T]\),

$$\begin{aligned} &\sum_{K \in\mathcal{T}_{h}} \big\|y_h(t)\big\|^{2}_{L^{2}(K)} + \int _{0}^{t} \|y_{h}\|^{2}_{\varepsilon} \, dt \\ &\qquad{} + \int _{0}^{t} \biggl( \sum _{K \in\mathcal{T}_{h}} c_{0}\|y_{h} \|^{2}_{L^{2}(K)} + \sum _{K \in\mathcal{T}_{h}} \|y_{h}\|^{2}_{L^{2}(\partial K^{-} \cap\varGamma^{-})} \biggr) \, dt \\ &\qquad{}+ \int _{0}^{t} \biggl( \sum _{K \in\mathcal{T}_{h}} \big\|y_{h} - y_{h}^{e}\big\|^{2}_{L^{2}(\partial K^{-} \backslash\varGamma^{-})} + \sum _{K \in\mathcal{T}_{h}} \|y_{h}\|^{2}_{L^{2}(\partial K^{+} \cap \varGamma^{+})} \biggr) \, dt \\ &\quad \leq C \biggl( \sum _{K \in\mathcal{T}_{h}} \big\|y_{h}^{0} \big\|^{2}_{L^{2}(K)} + \int _{0}^{t} \biggl( \sum_{K \in\mathcal{T}_{h}} \|f\|^{2}_{L^{2}(K)} + \sum _{K \in\mathcal{T}_{h}} \|Bu_{h}\|^{2}_{L^{2}(K)} \biggr) \, dt \biggr). \end{aligned}$$
(4.4)

Proof

The proof follows that of [24, Lemma 3.1]. □

Let \(J(\cdot)\) be a continuous functional in \(L^{2}(\varOmega)\). Then, there exists at least one solution of the minimization problem (3.8a), (3.8b) since \(\int_{0}^{T} \sum_{K \in\mathcal{T}_{h}} \|y(u_{h})\|_{H^{1}(K)}^{2}\) is bounded as proven in Lemma 1 (see, e.g., [24] for details). Then, we can deduce that the semi-discrete optimal control problem (3.8a), (3.8b) has a unique solution \((y_{h},u_{h}) \in H^{1}(0,T;Y_{h}) \times U^{ad}_{h}\). See, e.g., [18]. The functions \((y_{h},u_{h}) \in H^{1}(0,T;Y_{h}) \times U^{ad}_{h}\) solve (3.8a), (3.8b) if and only if \((y_{h},u_{h},p_{h}) \in H^{1}(0,T;Y_{h}) \times U^{ad}_{h} \times H^{1}(0,T;Y_{h})\) is the unique solution of the following optimality system:

$$\begin{aligned} &(\partial_t y_h, v_h) + a_{h}(y_{h},v_{h})+b_{h}(u_{h},v_{h})=(f_h,v_{h}) \quad \forall v_{h} \in V_{h}, \ y_h(x,0)=y_h^0, \end{aligned}$$
(4.5a)
$$\begin{aligned} &-(\partial_t p_h, \psi_{h}) + a_{h}(\psi_{h},p_{h})=-\bigl(y_{h}-y_h^{d}, \psi_{h}\bigr)\quad \forall\psi_{h} \in V_{h}, \ p_h(x,T)=0, \end{aligned}$$
(4.5b)
$$\begin{aligned} &\int_{0}^{T} \bigl(\alpha u_{h}- B^*p_{h}, w_{h} -u_{h}\bigr)_U\, dt \geq0 \quad \forall w_{h} \in U^{ad}_{h}. \end{aligned}$$
(4.5c)

Similar to Lemma 1, we can obtain the following semi-discrete stability estimate for the adjoint variable in Lemma 2.

Lemma 2

Let \(p_{h}\) be the solution of (4.5b) and let \(c_{0}\) be a positive constant such that (2.4) holds. Then, there exists a positive constant \(C\) independent of \(h\) such that

$$\begin{aligned} &\sum_{K \in\mathcal{T}_{h}} \big\|p_h(t)\big\|^{2}_{L^{2}(K)} + \int_{t}^{T} \|p_{h}\|^{2}_{\varepsilon}\, dt \\ &\qquad{}+ \int _{t}^{T} \biggl( \sum _{K \in\mathcal{T}_{h}} c_0\|p_{h}\|^{2}_{L^{2}(K)} + \sum _{K \in\mathcal{T}_{h}} \|p_{h}\|^{2}_{L^{2}(\partial K^{-} \cap \varGamma^{-})} \biggr) \, dt \\ &\qquad{}+ \int _{t}^{T} \biggl( \sum _{K \in\mathcal{T}_{h}} \big\|p_{h} - p_{h}^{e}\big\|^{2}_{L^{2}(\partial K^{+} \backslash\varGamma^{+})} + \sum _{K \in\mathcal{T}_{h}} \|p_{h}\|^{2}_{L^{2}(\partial K^{+} \cap \varGamma^{+})} \biggr) \, dt \\ &\quad \leq C \int _{t}^{T} \sum _{K \in\mathcal{T}_{h}} \big\|y_{h} - y^{d}_{h}\big\|^{2}_{L^{2}(K)}\, dt. \end{aligned}$$
(4.6)

Proof

The proof is similar to that of (4.4), using \(p_{h}(x,T)=0\). □

In order to derive a priori error estimates for the semi-discrete scheme, we make use of the following definitions and estimates. Firstly, we define an elliptic projection \(\tilde{y}\) of \(y\) onto \(Y_{h}\) satisfying the Galerkin orthogonality

$$\begin{aligned} a^{d}_{h} \bigl(y(t) - \tilde{y}(t), v\bigr) = 0 \quad\forall t \geq0, \ \forall v\in V_{h} \end{aligned}$$
(4.7)

to derive an error estimate for \(y - y_{h}(u)\). Then, we use the following estimates that are given in [21]:

$$\begin{aligned} \big\|y(t) - \tilde{y}(t) \big\|_{\varepsilon} &\leq Ch\big |\!\big |\!\big | y(t) \big |\!\big |\!\big |_{H^{2}(\mathcal{T}_{h})} \; \quad\forall t \geq0, \end{aligned}$$
(4.8a)
$$\begin{aligned} \big\|y(t) - \tilde{y}(t) \big\|_{L^{2}(\varOmega)} & \leq Ch^2 \big |\!\big |\!\big | y(t) \big |\!\big |\!\big |_{H^{2}(\mathcal{T}_{h})} \quad\forall t \geq0. \end{aligned}$$
(4.8b)

Moreover, the domain \(\varOmega_{U}\) is divided into the active and inactive regions of the control \(u\) for each time interval, as first introduced in [17]

$$\begin{aligned} \varOmega^{*}_U &= \biggl\{ \bigcup _{K_U}: K_U \subset\varOmega_U, u_{a} < u|_{K_U} < u_{b} \biggr\}, \\ \varOmega^{c}_U &= \biggl\{ \bigcup _{K_U}: K_U \subset\varOmega_U, u|_{K_U} = u_{a} \hbox{ or } u|_{K_U} = u_{b} \biggr\}, \\ \varOmega^{b}_U &= \varOmega_U \backslash \bigl( \varOmega^{*}_U \cup\varOmega^{c}_U \bigr). \end{aligned}$$

It is assumed that the three sets are pairwise disjoint, i.e., \(\varOmega^{i}_{U} \cap\varOmega^{j}_{U}= \emptyset\) for \(i \neq j\), and \(\varOmega_{U} = \varOmega^{*}_{U} \cup\varOmega^{c}_{U} \cup\varOmega ^{b}_{U}\). \(\varOmega^{b}_{U}\) consists of the elements which lie close to the free boundary between the active and the inactive sets for each time interval. We also make the following assumption

$$\begin{aligned} \hbox{meas}\bigl(\varOmega^{b}_U \bigr) \leq Ch_U \end{aligned}$$
(4.9)

on the regularity of u and \(\mathcal{T}_{h}^{U}\). This assumption is valid if the boundary of the level set \(\varOmega^{c}_{U}\) consists of a finite number of rectifiable curves [19]. In addition, we set

$$\varOmega^* = \bigl\{ x\in\varOmega_U: u_a < u(x) <u_b \bigr\}, $$

which includes \(\varOmega_{U}^{*} \subset\varOmega^{*}\) [25].
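Assumption (4.9) can be made plausible with a simple count. The sketch below is a simplification of the setting above: it assumes a uniform mesh of square cells on the unit square instead of a triangulation, a circular free boundary of radius 0.3, and a vertex-based sign test to detect the cells cut by the free boundary. The function name and all parameter values are hypothetical.

```python
import numpy as np

def cut_cell_measure(n, center=(0.5, 0.5), radius=0.3):
    """Total area of the cells of an n-by-n uniform mesh cut by a circle."""
    h = 1.0 / n
    grid = np.linspace(0.0, 1.0, n + 1)
    X, Y = np.meshgrid(grid, grid, indexing="ij")
    # Level set function of the free boundary, evaluated at the mesh vertices.
    phi = np.hypot(X - center[0], Y - center[1]) - radius
    # Values at the four corners of every cell.
    corners = np.stack([phi[:-1, :-1], phi[1:, :-1], phi[:-1, 1:], phi[1:, 1:]])
    # A cell counts as cut if the level set changes sign among its corners.
    cut = (corners.min(axis=0) < 0.0) & (corners.max(axis=0) > 0.0)
    return cut.sum() * h**2

measures = {n: cut_cell_measure(n) for n in (32, 64, 128)}
```

Halving the mesh size roughly doubles the number of cut cells while each cell has a quarter of the area, so the total measure of the cut cells is roughly halved, in agreement with a bound of the form \(Ch_{U}\).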

We finally define

$$\begin{aligned} \bigl(J_{h}^{\prime}(u),v-u\bigr)_U= \int_{0}^{T} \bigl(\alpha u- B^*p_{h}(u),v-u\bigr)_U \, dt, \end{aligned}$$
(4.10)

in which the auxiliary solution \(p_{h}(u) \in H^{1}(0,T;Y_{h})\) is the solution of the following system

$$\begin{aligned} &\bigl(\partial_t y_h(u), v_h\bigr) + a_{h}\bigl(y_{h}(u),v_{h}\bigr)+b_{h}(u,v_{h})=(f_h,v_{h}) \\ &\quad\forall v_{h} \in V_{h}, \ y_h(u) (x,0)=y_h^0, \end{aligned}$$
(4.11a)
$$\begin{aligned} &{-}\bigl(\partial_t p_h(u), q_{h}\bigr) + a_{h}\bigl(q_{h},p_{h}(u)\bigr)=- \bigl(y_{h}(u)-y_h^{d}, q_{h}\bigr) \\ &\quad \forall q_{h} \in V_{h}, \ p_h(u) (x,T)=0, \end{aligned}$$
(4.11b)

where \(y_{h}(u) \in H^{1}(0,T;Y_{h})\) is also an auxiliary solution for given \(u \in U^{ad}_{h}\).

To complete the a priori error estimate of the semi-discrete scheme, we first derive estimates for the difference between the approximate solutions \((y_{h},p_{h})\) and the auxiliary solutions \((y_{h}(u),p_{h}(u))\).

Lemma 3

Let \((y_{h},p_{h})\) and \((y_{h}(u),p_{h}(u))\) be the solutions of (4.5a), (4.5b) and (4.11a), (4.11b), respectively. Then, there are positive constants \(C_{1}\) and \(C_{2}\) independent of \(h\) such that

$$\begin{aligned} \big\|y_{h} -y_{h}(u)\big\|_{L^{\infty}(0,T; L^{2}(\varOmega))} \leq C_{1} \| u-u_{h} \|_{L^{2}(0,T; L^{2}(\varOmega_U))} \end{aligned}$$
(4.12a)

and

$$\begin{aligned} \big\|p_{h} - p_{h}(u)\big\|_{L^{\infty}(0,T; L^{2}(\varOmega))} \leq C_{2} \|u-u_{h} \|_{L^{2}(0,T; L^{2}(\varOmega_U))}. \end{aligned}$$
(4.12b)

Proof

By subtracting (4.11a) (respectively, (4.11b)) from (4.5a) (respectively, (4.5b)), taking \(v_{h} = y_{h} - y_{h}(u)\) (respectively, \(v_{h} = p_{h} - p_{h}(u)\)) and following the approach in the stability estimates of the semi-discrete state equation (respectively, the semi-discrete adjoint equation), the desired results are obtained. □

Now, we will derive an estimate for the control u using the discontinuous piecewise linear finite element space by following the approach in [25, 29].

Lemma 4

Let \((y,p,u)\) and \((y_{h},p_{h},u_{h})\) be the solutions of (2.8a)–(2.8c) and (4.5a)–(4.5c), respectively. Assume that \(u \in L^{2}(0,T;W^{1, \infty}(\varOmega_{U}))\) and \(u|_{\varOmega^{*}} \in L^{2}(0,T; H^{2}(\varOmega^{*}))\). Then, we have

$$\begin{aligned} \| u - u_{h} \|_{L^{2}(0,T; L^{2}(\varOmega_U))} \leq C \bigl( h^{3/2}_U + \big\|p - p_{h}(u) \big\|_{L^{2}(0,T; L^{2}(\varOmega))} \bigr). \end{aligned}$$
(4.13)

Proof

Let \((J_{h}^{\prime}(u),v-u)_{U}= \int_{0}^{T}(\alpha u-B^{*} p_{h}(u),v-u)_{U} \, dt\), where p h (u) is the solution of the auxiliary equation (4.11b). Then, we have

$$\begin{aligned} \bigl(J_{h}^{\prime}(v)-J_{h}^{\prime}(u),v-u \bigr)_U =& \int _{0}^{T} \bigl(\alpha(v-u),v-u \bigr)_U \, dt\\ &{} + \int _{0}^{T} \bigl(B^* p_h(u)- B^*p_h(v),v-u\bigr)_U \, dt. \end{aligned}$$

By using the auxiliary equations (4.11a) and (4.11b), we obtain

$$\begin{aligned} &\int _{0}^{T} \bigl( Bv-Bu,p_h(u)-p_h(v) \bigr)_U \, dt \\ &\quad = \int _{0}^{T} \bigl( \partial_t \bigl( y_h(v)-y_h(u)\bigr),p_h(u)-p_h(v) \bigr) \,dt\\ &\qquad{} + \int _{0}^{T} \bigl( a_h \bigl(y_h(v)-y_h(u),p_h(u)-p_h(v) \bigr) \bigr) \, dt \\ &\quad = \int _{0}^{T} \bigl( \partial_t \bigl( y_h(v)-y_h(u)\bigr),p_h(u)-p_h(v) \bigr) \,dt\\ &\qquad{} +\int _{0}^{T} \bigl( \partial_t \bigl(p_h(u)-p_h(v)\bigr),y_h(v)-y_h(u) \bigr) \, dt \\ &\qquad{} + \int _{0}^{T} \bigl( y_h(v)-y_h(u),y_h(v)-y_h(u) \bigr) \, dt. \end{aligned}$$

Integrating the first term by parts in time and using \((y_{h}(v)-y_{h}(u))|_{t=0}=0\) and \((p_{h}(v)-p_{h}(u))|_{t=T}=0\) yields

$$\begin{aligned} \int_{0}^{T} \bigl( v-u,B^*p_h(u)-B^*p_h(v) \bigr)_U \, dt=\int _{0}^{T} \bigl(y_h(v)-y_h(u),y_h(v)-y_h(u) \bigr) \, dt \geq0. \end{aligned}$$
(4.14)

By using (4.14), we obtain

$$\begin{aligned} \bigl(J_{h}^{\prime}(v)-J_{h}^{\prime}(u),v-u \bigr)_U \geq\alpha\int _{0}^{T} \|v-u \|^{2}_{L^{2}(\varOmega_U)} \, dt. \end{aligned}$$
(4.15)

With the help of the inequalities (4.15), (4.14), (2.8c), (4.5c), the standard Lagrangian interpolation \(\varPi u\), Young’s inequality and the notation \(p_{h}=p_{h}(u_{h})\), we obtain

$$\begin{aligned} &\alpha\|u-u_h\|^{2}_{L^{2}(0,T;L^{2}(\varOmega_U))} \\ &\quad \leq \underbrace{\int _{0}^{T} \bigl(\alpha u-B^*p,u-u_{h} \bigr)_U\,dt}_{\leq0} +\int _{0}^{T} \bigl(B^*p-B^*p_{h}(u),u-u_{h}\bigr)_U\,dt \\ &\qquad {}+ \underbrace{\int _{0}^{T} \bigl(\alpha u_{h}-B^*p_{h},u_{h}- \varPi u\bigr)_U\,dt}_{\leq0} +\int _{0}^{T} \bigl( \alpha u_{h}-B^*p_{h},\varPi u-u\bigr)_U\,dt \\ &\quad \leq \int _{0}^{T} \bigl(B^*p-B^*p_{h}(u),u-u_{h} \bigr)_U\,dt \\ &\qquad{} + \int _{0}^{T} \bigl(\alpha u_{h}-B^*p_{h}-\alpha u+B^*p,\varPi u-u\bigr)_U \,dt \\ &\qquad {}+ \int _{0}^{T} \bigl(\alpha u-B^*p,\varPi u-u \bigr)_U\,dt \\ &\quad =\int_{0}^{T} \bigl(B^*p-B^*p_{h}(u),u-u_{h} \bigr)_U\,dt +\int _{0}^{T} (\alpha u_{h}-\alpha u,\varPi u-u)_U\,dt \\ &\qquad {}+ \int _{0}^{T} \bigl(B^*p-B^*p_{h}(u),\varPi u-u\bigr)_U\,dt +\int _{0}^{T} \bigl(B^*p_{h}(u)-B^*p_{h}, \varPi u-u\bigr)_U\,dt \\ &\qquad {}+\int _{0}^{T} \bigl(\alpha u-B^*p,\varPi u-u \bigr)_U\,dt \\ &\quad \leq\int _{0}^{T} \bigl(\alpha u-B^*p,\varPi u-u\bigr)_U\,dt + C_{1} \big\|B^*p_{h}(u)-B^*p_{h}\big\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))} \\ &\qquad{} + C_{2}\|u-\varPi u\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))}+ C_{3}\big\|B^*p-B^*p_{h}(u)\big\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))} \\ &\qquad{} + C_{4} \|\alpha u-\alpha u_{h}\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))} + C_{5} \|u-u_{h}\|^{2}_{L^{2}(0,T; L^{2}(\varOmega_U))}. \end{aligned}$$
(4.16)

As described in (3.4), we use the discontinuous piecewise linear finite element space for the control variable. Let \(\varPi u\) be the standard Lagrangian interpolation of \(u\) satisfying \(\varPi u(x)=u(x)\) for any vertex \(x\). Then, \(\varPi u\) belongs to \(U_{h}^{ad}\). We get

$$\begin{aligned} \| u - \varPi u \|_{L^{2}(\varOmega^{*}_U)} \leq C h^{2}_U \|u \|_{H^{2}(\varOmega^{*}_U)}, \qquad\| u - \varPi u \|_{W^{0, \infty} (\varOmega^{b}_U)} \leq C h_U \|u\|_{W^{1, \infty} (\varOmega^{b}_U)} \end{aligned}$$

for \(u \in W^{1,\infty}(\varOmega_{U})\) and \(u|_{\varOmega^{*}} \in H^{2}(\varOmega^{*})\). Hence,

$$\begin{aligned} \|u-\varPi u\|^{2}_{L^{2}(\varOmega_U)}&= \int_{\varOmega^{*}_U} (u-\varPi u)^{2} + \int _{\varOmega^{c}_U} (u-\varPi u)^{2} +\int_{\varOmega^{b}_U} (u-\varPi u)^{2} \\ & \leq C h^{4}_U \|u\|^{2}_{H^{2}(\varOmega^{*}_U)}+0+ C h^{2}_U \|u\|^{2}_{W^{1, \infty} (\varOmega^{b}_U)} \hbox{ meas }\bigl(\varOmega^{b}_U\bigr) \\ & \leq C h^{3}_U \bigl( h_U \|u \|^{2}_{H^{2}(\varOmega^{*}_U)} + \|u\|^{2}_{W^{1, \infty} (\varOmega ^{b}_U)}\bigr) \\ &\leq C h^{3}_U \bigl( \|u\|^{2}_{H^{2}(\varOmega^{*})} + \|u\|^{2}_{W^{1, \infty} (\varOmega_U)}\bigr). \end{aligned}$$
(4.17)

By the variational inequality (2.8c), we have

$$\alpha u-B^*p=0 \quad\hbox{on}\ \varOmega^{*}_U \quad \hbox{and} \quad\varPi u-u=0 \quad\hbox{on}\ \varOmega^{c}_U. $$

In addition, there exists \(x_{0} \in K_{U} \subset\varOmega^{b}_{U}\) with \(u_{a} < u(x_{0}) < u_{b}\) satisfying \((\alpha u - B^{*}p)(x_{0})=0\). Then, the following estimate from [25]

$$\begin{aligned} \big\| \alpha u - B^*p \big\|_{W^{0, \infty}(\varOmega^{b}_U)} =& \big\| \alpha u - B^*p - \bigl(\alpha u - B^*p\bigr) (x_{0}) \big\|_{W^{0, \infty} (\varOmega^{b}_U)}\\ \leq& Ch_U \big\| \alpha u - B^*p \big\|_{W^{1, \infty} (\varOmega^{b}_U)} \end{aligned}$$

results in

$$\begin{aligned} \bigl(\alpha u - B^*p, \varPi u-u \bigr)_U & = \int _{\varOmega^{*}_U} \bigl(\alpha u - B^*p\bigr) (\varPi u - u) + \int _{\varOmega^{c}_U} \bigl(\alpha u - B^*p\bigr) (\varPi u - u) \\ &\quad {} + \int _{\varOmega^{b}_U} \bigl(\alpha u - B^*p\bigr) (\varPi u - u) \\ & = 0+0 +\int _{\varOmega^{b}_U} \bigl(\alpha u - B^*p\bigr) (\varPi u - u) \\ & \leq\big\| \alpha u - B^*p \big\|_{W^{0, \infty} (\varOmega^{b}_U)} \|u - \varPi u \|_{W^{0, \infty} (\varOmega^{b}_U)} \hbox{meas}\bigl(\varOmega^{b}_U\bigr) \\ & \leq C h^{3}_U \big\| \alpha u-B^*p\big\|_{W^{1, \infty}( \varOmega^{b}_U)} \|u \|_{W^{1, \infty}( \varOmega^{b}_U)}. \end{aligned}$$
(4.18)

Finally, the desired result is obtained by inserting (4.17), (4.18) and (4.12b) into (4.16). □

Remark 2

In Lemma 4, we assume that \(u \in W^{1,\infty}(\varOmega_{U})\) and \(u \in H^{2}(\varOmega^{*})\) in space, instead of \(u \in H^{2}(\varOmega_{U})\), due to the regularity issues on the boundary of the control, as done in [24, 25, 29]. The control variable \(u\) has lower regularity due to the discontinuity of the derivative of \(u\) across the free boundary lying in \(\varOmega^{b}_{U}\). Hence, the convergence rate of the control \(u\) is around \(h^{3/2}\). However, in numerical experiments, the optimal convergence rate can be obtained if the initial mesh is generated properly, i.e., if the initial grid is aligned with the points where the bounds of the control and the values of the adjoint coincide, so that no kink occurs.
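The effect described in this remark can be reproduced in a one-dimensional model. The sketch below is an illustration under simplifying assumptions and not one of the numerical experiments of this paper: it interpolates the kinked function \(u(x)=\max\{0, x-a\}\) on (0,1) by continuous piecewise linear functions on uniform meshes and measures the \(L^{2}\) error with a two-point Gauss rule on a fine grid that resolves both the mesh nodes and the kink. The function names and the choices \(a=1/3\) and \(a=1/2\) are hypothetical.

```python
import numpy as np

def interpolation_error_l2(n_elems, kink, n_fine=3 * 2**10):
    """L2 error of the nodal P1 interpolant of u(x) = max(0, x - kink) on (0, 1)."""
    u = lambda x: np.maximum(0.0, x - kink)
    nodes = np.linspace(0.0, 1.0, n_elems + 1)
    # Fine quadrature cells; mesh nodes and kinks at 1/3 or 1/2 fall on cell edges.
    edges = np.linspace(0.0, 1.0, n_fine + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    half = 0.5 * np.diff(edges)
    total = 0.0
    for s in (-1.0 / np.sqrt(3.0), 1.0 / np.sqrt(3.0)):   # two-point Gauss rule
        x = mid + s * half
        err = u(x) - np.interp(x, nodes, u(nodes))
        total += np.sum(half * err**2)
    return np.sqrt(total)

def observed_rates(kink, sizes=(8, 16, 32, 64, 128)):
    """Observed convergence orders between consecutive mesh refinements."""
    errors = [interpolation_error_l2(n, kink) for n in sizes]
    return [np.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]

rates_unaligned = observed_rates(1.0 / 3.0)        # kink never coincides with a node
error_aligned = interpolation_error_l2(8, 0.5)     # kink coincides with a mesh node
```

When the kink lies strictly inside an element, the pointwise error is of order \(h\) on a set of measure \(h\), which gives the order \(h^{3/2}\) in \(L^{2}\). When a mesh node sits exactly at the kink, the interpolant reproduces this particular \(u\) and the error vanishes up to rounding.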

In the following lemma, the connection between the exact solution of the state \(y\) (respectively, the adjoint \(p\)) and the auxiliary state solution \(y_{h}(u)\) (respectively, the auxiliary adjoint solution \(p_{h}(u)\)) will be established.

Lemma 5

Let \((y,p)\) be the solutions of (2.8a), (2.8b), respectively, and \((y_{h}(u),p_{h}(u))\) be the solutions of the auxiliary equations (4.11a), (4.11b), respectively. Then, there is a constant \(C\) independent of \(h\) such that

$$\begin{aligned} \big\|y - y_{h}(u)\big\|_{L^{\infty}(0,T; L^{2}(\varOmega))} \leq Ch^2\| y \|_{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))} \end{aligned}$$
(4.19)

and

$$\begin{aligned} \big\|p - p_{h}(u)\big\|_{L^{\infty}(0,T; L^{2}(\varOmega))} \leq Ch^2 \bigl( \| p \|_{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))} + \| y \|_{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))} \bigr). \end{aligned}$$
(4.20)

Proof

To show the estimate of the state (4.19), we begin with subtracting (4.11a) from (2.8a),

$$\begin{aligned} \biggl( \frac{\partial(y - y_{h}(u))}{\partial t}, v_{h} \biggr) + a_{h}^{d} \bigl(y - y_{h}(u), v_{h}\bigr) + a_{h}^{cr} \bigl(y - y_{h}(u), v_{h}\bigr) = 0, \quad\forall v_{h} \in V_{h}. \end{aligned}$$

By writing

$$\begin{aligned} y - y_{h}(u) = (y - \tilde{y}) - \bigl(y_{h}(u) - \tilde{y}\bigr) = \eta- \xi, \end{aligned}$$

where \(\tilde{y}\) is the elliptic projection of \(y\), and taking \(v_{h}=\xi\), we obtain

$$\begin{aligned} \biggl( \frac{\partial\xi}{\partial t}, \xi\biggr) + a_{h}^{d}(\xi, \xi) + a_{h}^{cr}(\xi, \xi) = \biggl( \frac{\partial\eta}{\partial t}, \xi \biggr) + a_{h}^{d}(\eta, \xi) + a_{h}^{cr}( \eta, \xi), \quad\forall t>0. \end{aligned}$$

Coercivity of \(a_{h}^{d}(\cdot, \cdot)\) (4.1) and the Galerkin orthogonality (4.7) yield

$$\begin{aligned} &\frac{1}{2} \frac{d}{dt} \|\xi\|^{2}_{L^{2}(\varOmega)} + \kappa\|\xi\|^{2}_{\varepsilon} + \sum _{K \in\mathcal{T}_{h}} c_{0}\|\xi\|^{2}_{K} + \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \|\xi\|^{2}_{\partial K^{-} \cap\varGamma ^{-}} \\ & \qquad{}+ \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \big\|\xi- \xi^{e} \big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \big\|\xi(t)\big\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \\ &\quad \leq\biggl \vert \biggl(\frac{\partial\eta}{\partial t}, \xi\biggr) + a_{h}^{cr}(\eta, \xi) \biggr \vert . \end{aligned}$$
(4.21)

The bounds in [21] for the first term on the right-hand side of (4.21), i.e., \((\frac{\partial\eta}{\partial t}, \xi )\), and the bounds in [8] for the second term, i.e., \(a_{h}^{cr}(\eta, \xi)\), give us

$$\begin{aligned} & \frac{1}{2} \frac{d}{dt} \|\xi \|^{2}_{L^{2}(\varOmega)} + \kappa\|\xi\|^{2}_{\varepsilon} + \sum _{K \in\mathcal{T}_{h}} c_{0}\|\xi\|^{2}_{K} + \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \|\xi\|^{2}_{\partial K^{-} \cap\varGamma^{-}} \\ &\qquad{}+ \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \big\|\xi- \xi^{e} \big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \frac{1}{2} \sum _{K \in\mathcal{T}_{h}} \|\xi\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \\ &\quad \leq \frac{\kappa}{2}\|\xi\|^{2}_{\varepsilon} + Ch^4 \bigg |\!\bigg |\!\bigg | \frac{\partial y}{\partial t} \bigg |\!\bigg |\!\bigg |^{2}_{H^{2}(\mathcal{T}_{h})} + \frac{\kappa}{8}\|\xi\|^{2}_{\varepsilon} + C \|\eta \|^{2}_{L^{2}(\varOmega)} + C \|\xi\|^{2}_{L^{2}(\varOmega)} \\ &\qquad{} +\frac{1}{4} \sum _{K \in\mathcal{T}_{h}} \big\|\xi- \xi^{e} \big\|^{2}_{{\partial K^{-} \backslash\varGamma^{-}}} + \sum _{K \in\mathcal{T}_{h}} \big\| \eta^{e}\big\|^{2}_{{\partial K^{-} \backslash\varGamma^{-}}} + \frac {1}{4} \sum _{K \in\mathcal{T}_{h}} \|\xi\|^{2}_{{\partial K^{+} \cap\varGamma ^{+}}} \\ &\qquad{} + \sum _{K \in\mathcal{T}_{h}} \|\eta\|^{2}_{{\partial K^{+} \cap\varGamma^{+}}}. \end{aligned}$$
(4.22)

Now, we eliminate the terms related to η by using the estimate related to \(\|\eta\|_{\partial K^{-}}\) or \(\|\eta\|_{\partial K^{+}}\) in [13], trace inequality (4.2) and elliptic projection (4.7)

$$\begin{aligned} \sum_{K \in\mathcal{T}_{h}} \big\| \eta^{e}\big\|^{2}_{{\partial K^{-} \backslash\varGamma^{-}}} &\leq\sum _{K \in\mathcal{T}_{h}} \|\beta\|_{L^{\infty}(K)}\|\eta\|^{2}_{{\partial K}} \leq\sum _{K \in\mathcal{T}_{h}} C \|\eta\|^{2}_{K} = C \|\eta \|^{2}_{L^{2}(\varOmega)} \\ & \leq Ch^4\|\!| y |\!\|^{2}_{H^{2}(\mathcal{T}_{h})}. \end{aligned}$$
(4.23)

A bound for \(\|\xi\|^{2}_{L^{2}(\varOmega)}\) is derived by multiplying (4.22) by 2 and integrating from 0 to t. Using the continuous Gronwall inequality for ξ, we complete the proof of (4.19) by noting ξ(0)=0.

The proof of (4.20) starts from the following equation

$$\begin{aligned} \biggl(-\frac{\partial(p - p_{h}(u))}{\partial t},q_{h} \biggr) + a_{h} \bigl(q_{h}, p - p_{h}(u)\bigr) = -\bigl(y - y_{h}(u), q_{h}\bigr), \quad\forall q_{h} \in V_{h}, \end{aligned}$$

and then proceeds as in the proof of (4.19). □

Now, we complete the a priori error estimate of the semi-discrete scheme by combining Lemmas 3–5 with the triangle inequality.

Theorem 1

Let \((y,u,p)\) and \((y_{h},u_{h},p_{h})\) be the solutions of (2.8a)–(2.8c) and (4.5a)–(4.5c), respectively. Suppose that the conditions of Proposition 1 and Lemma 4 are valid. Assume that the regularity condition (4.9) is also satisfied. Then, the following estimate holds

$$\begin{aligned} &\|y - y_{h}\|_{L^{\infty}(0,T; L^{2}(\varOmega))} + \|p - p_{h} \|_{L^{\infty}(0,T; L^{2}(\varOmega))} + \|u - u_{h}\|_{L^{2}(0,T; L^{2}(\varOmega_U))} \\ & \quad \leq Ch^2 \bigl( \| p \|_{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))} + \| y \| _{H^{1}(0,T; H^{2}(\mathcal{T}_{h}))}\bigr) + Ch^{3/2}_U. \end{aligned}$$
(4.24)

5 Fully-discrete formulation of optimal control problem

We use the standard backward Euler method to discretize the optimal control problem (2.1), (2.2a)–(2.2c) in time. Let \(N_{T}\) be a positive integer. The discrete time interval \(\bar{I}=[0,T]\) is defined as

$$0=t_0<t_1< \cdots<t_{N_T-1}<t_{N_T}=T $$

with size \(k_{n}=t_{n}-t_{n-1}\) for \(n=1,\ldots,N_{T}\) and \(k= \max _{n=1, \ldots, N_{T}} k_{n}\).

To prove the a priori error estimate of the fully-discrete scheme, we need the discrete time-dependent norm for 1≤q<∞ by [9],

$$\begin{aligned} \|\!| v |\!\|_{L^{q}(0,T; L^{2}(\varOmega))} = \Biggl( \sum_{n=1}^{N_{T}} k_{n} \|v_{n} \|^{q}_{L^{2}(\varOmega)} \Biggr)^{1/q}. \end{aligned}$$
(5.1)
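As a small illustration of the definition (5.1), the following sketch evaluates the discrete time-dependent norm from a list of spatial \(L^{2}\) norms \(\|v_{n}\|_{L^{2}(\varOmega)}\) and the time steps \(k_{n}\). The function name and its arguments are hypothetical, and the spatial norms are assumed to have been computed elsewhere.

```python
import numpy as np

def discrete_time_norm(space_norms, time_steps, q=2.0):
    """Discrete L^q(0,T; L^2) norm: (sum_n k_n * ||v_n||^q)^(1/q).

    space_norms[n-1] holds ||v_n||_{L^2(Omega)} and time_steps[n-1] holds k_n
    for n = 1, ..., N_T.
    """
    space_norms = np.asarray(space_norms, dtype=float)
    time_steps = np.asarray(time_steps, dtype=float)
    if space_norms.shape != time_steps.shape:
        raise ValueError("one spatial norm is needed per time step")
    return float(np.sum(time_steps * space_norms ** q) ** (1.0 / q))

# Uniform steps on [0, 1] and a function with unit spatial norm at every t_n.
k = np.full(8, 0.125)
print(discrete_time_norm(np.ones(8), k))
```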

Let \(f_{h,n}\) and \(y_{h,n}^{d}\) be approximations of the source function \(f_{h}\) and the desired state function \(y_{h}^{d}\) at time \(t_{n}\). Then, the fully-discrete approximate scheme of the semi-discrete problem (3.8a), (3.8b) is

$$\begin{aligned} &\underset{u_{h,n} \in U_{h}^{ad}}{\hbox{minimize }} \quad \sum_{n=1}^{N_T} k_{n} \biggl( \frac{1}{2} \sum_{K \in\mathcal{T}_h} \big\|y_{h,n}-y_{h,n}^{d}\big\|^{2}_{L^{2}(K)} + \frac{\alpha}{2} \sum _{K_U \in\mathcal{T}_h^U} \| u_{h,n} \|^{2}_{L^{2}(K_U)} \biggr), \end{aligned}$$
(5.2a)
$$\begin{aligned} & \hbox{subject to}\quad \biggl(\displaystyle\frac{y_{h,n}-y_{h,n-1}}{k_{n}}, v \biggr) + a_h(y_{h,n},v)+b_h(u_{h,n},v) = (f_{h,n},v) \quad\forall v \in V_{h}, \\ &\phantom{\hbox{subject to}}\quad y_{h,0}(x,0)=y_h^0. \end{aligned}$$
(5.2b)
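In matrix form, each step of the state equation (5.2b) amounts to one linear solve. The following is a minimal sketch under stated assumptions: `M` and `A` stand for a mass matrix and the matrix of \(a_{h}(\cdot,\cdot)\) assembled elsewhere, and the callable `rhs` is assumed to collect the contributions of \(f_{h,n}\) and of the control term. These names are hypothetical and do not come from the paper.

```python
import numpy as np

def backward_euler(M, A, rhs, y0, time_steps):
    """Solve M (y_n - y_{n-1}) / k_n + A y_n = rhs(t_n), n = 1, ..., N_T.

    Returns an array whose row n holds the coefficient vector y_n.
    """
    y = np.asarray(y0, dtype=float)
    history = [y.copy()]
    t = 0.0
    for k_n in time_steps:
        t += k_n
        # Implicit step: (M + k_n A) y_n = M y_{n-1} + k_n rhs(t_n).
        y = np.linalg.solve(M + k_n * A, M @ y + k_n * rhs(t))
        history.append(y.copy())
    return np.array(history)

# Scalar model problem y' + y = 0, y(0) = 1 on [0, 1] with 10 uniform steps.
steps = np.full(10, 0.1)
trajectory = backward_euler(np.eye(1), np.eye(1), lambda t: np.zeros(1), [1.0], steps)
print(trajectory[-1, 0], np.exp(-1.0))
```

For the scalar model problem the error at the final time roughly halves when the number of steps is doubled, which is the first-order behavior in \(k\) expected from the backward Euler method.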

6 A priori error analysis of fully-discrete scheme

As for the semi-discrete scheme, we first give the stability result of the state variable in the following Lemma 6.

Lemma 6

Let \(y_{h,n}\) be the solution of (5.2b) and let \(c_{0}\) be a positive constant such that (2.4) holds. Then, there exists a positive constant \(C\) independent of \(h\) and \(k_{m}\) for \(m=1,2,\ldots,N_{T}\) such that

$$\begin{aligned} &\|y_{h,m}\|^{2}_{L^{2}(\varOmega)} + \sum _{n=1}^{m} k_{n} \biggl(\kappa\|y_{h,n}\|^{2}_{\varepsilon} + \sum _{K} 2c_{0}\|y_{h,n} \|^{2}_{K} + \sum _{K} \|y_{h,n}\|^{2}_{\partial K^{-} \cap\varGamma^{-}} \biggr) \\ &\qquad{} + \sum _{n=1}^{m} k_{n} \biggl(\sum _{K} \big\| y_{h,n} - y_{h,n}^{e}\big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \sum _{K} \| y_{h,n}\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ & \quad \leq C \big\|y_{h}^{0}\big\|^{2}_{L^{2}(\varOmega)} + C \sum _{n=1}^{m} k_{n} \bigl( \|f_{h,n}\|^{2}_{L^{2}(\varOmega)} + \|B u_{h,n} \|^{2}_{L^{2}(\varOmega)} \bigr). \end{aligned}$$
(6.1)

Proof

Choose \(v_{h}=y_{h,n}\) in (6.2a). By using the algebraic inequality \(\frac{x^{2}-y^{2}}{2} \leq (x-y)x\), \(\forall x,y \in \mathbb{R}\) and following the steps in Lemma 1, we obtain the desired result. □
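The algebraic inequality used in the proof of Lemma 6 follows from an elementary identity, recalled here only as a reminder:

$$\begin{aligned} (x-y)x = \frac{x^{2}-y^{2}}{2} + \frac{(x-y)^{2}}{2} \geq \frac{x^{2}-y^{2}}{2}, \quad\forall x,y \in \mathbb{R}. \end{aligned}$$

The same computation in the \(L^{2}(\varOmega)\) inner product gives the lower bound for the discrete time derivative,

$$\begin{aligned} \biggl(\frac{y_{h,n}-y_{h,n-1}}{k_{n}}, y_{h,n} \biggr) \geq\frac{1}{2k_{n}} \bigl( \|y_{h,n}\|^{2}_{L^{2}(\varOmega)} - \|y_{h,n-1}\|^{2}_{L^{2}(\varOmega)} \bigr), \end{aligned}$$

which telescopes after multiplication by \(2k_{n}\) and summation over \(n\).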

The minimization problem (5.2a), (5.2b) has at least one solution due to the boundedness of the solution \(y_{h,n}\) as proven in Lemma 6. Then, the fully discretized control problem (5.2a), (5.2b) obtained by using the backward Euler method has a unique solution \((y_{h,n},u_{h,n})\), \(n=1,2,\ldots,N_{T}\), and \((y_{h,n},u_{h,n}) \in Y_{h} \times U_{h}^{ad}\), \(n=1,2,\ldots,N_{T}\), is the solution of (5.2a), (5.2b) if and only if \((y_{h,n},u_{h,n},p_{h,n-1}) \in Y_{h} \times U^{ad}_{h} \times Y_{h}\) is the unique solution of the following optimality system:

$$\begin{aligned} & \begin{array}{l} \biggl(\displaystyle\frac{y_{h,n}-y_{h,n-1}}{k_{n}}, v \biggr) + a_h(y_{h,n},v)+b_h(u_{h,n},v)= (f_{h,n},v) \quad\forall v \in V_{h}, \\ y_{h,0}=y_h^0, \quad n=1,2, \ldots, N_{T}, \end{array} \end{aligned}$$
(6.2a)
$$\begin{aligned} &\begin{array}{l} \biggl(\displaystyle\frac{p_{h,n-1}-p_{h,n}}{k_{n}}, q\biggr) + a_h(q,p_{h,n-1})=- \bigl(y_{h,n}-y_{h,n}^{d},q\bigr) \quad\forall q \in V_{h}, \\ p_{h,T}=0, \quad n=N_{T}, \ldots, 2,1, \end{array} \end{aligned}$$
(6.2b)
$$\begin{aligned} &\bigl(\alpha u_{h,n}- B^*p_{h,n-1}, w-u_{h,n} \bigr)_U \geq0 \quad\forall w \in U_{h}^{ad}, \ n=1,2, \ldots, N_{T}. \end{aligned}$$
(6.2c)

The stability result of the adjoint variable for the fully-discrete scheme is given in the following Lemma 7.

Lemma 7

Let \(p_{h,n}\) be the solution of (6.2b) and let \(c_{0}\) be a positive constant such that (2.4) holds. Then, there exists a positive constant \(C\) independent of \(h\) and \(k_{m}\) for \(m=N_{T}-1,\ldots,2,1\) such that

$$\begin{aligned} &\|p_{h,m}\|^{2}_{L^{2}(\varOmega)} + \sum _{n = 1}^{m} k_{n} \kappa\|p_{h,n-1}\|^{2}_{\varepsilon} \\ &\qquad{} + \sum _{n = 1}^{m} k_{n} \biggl( 2\sum _{K} c_{0}\| p_{h,n-1}\|^{2}_{K} + \sum _{K} \| p_{h,n-1} \|^{2}_{\partial K^{-} \cap\varGamma^{-}} \biggr) \\ &\qquad{} + \sum _{n = 1}^{m} k_{n} \biggl( \sum _{K} \big\|p_{h,n-1} - p_{h,n-1}^{e} \big\|^{2}_{\partial K^{+} \backslash\varGamma^{+}} + \sum_{K} \| p_{h,n-1} \|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ &\quad \leq C \sum _{n = 1}^{m} k_{n} \big\|y_{h,n} - y^{d}_{h,n}\big\|^{2}_{L^{2}(\varOmega)}. \end{aligned}$$
(6.3)

Now, we derive the a priori error estimates of the fully-discrete scheme by introducing the following auxiliary equations as for the semi-discrete scheme. Let

$$\begin{aligned} \bigl(J_{h}^{\prime}(u),v-u\bigr)_U=\sum _{n=1}^{N_{T}} k_{n} \bigl(\alpha u_{n}- B^*p_{h,n-1}(u),v_{n}-u_{n} \bigr)_U, \end{aligned}$$
(6.4)

where \(p_{h,n-1}(u)\) is the solution of the following system

$$\begin{aligned} & \begin{array}{l} \biggl(\displaystyle\frac{y_{h,n}(u)-y_{h,n-1}(u)}{k_{n}}, v \biggr) + a_h\bigl (y_{h,n}(u),v \bigr)+b_h(u_{n},v)= (f_{h,n},v) \quad \forall v \in V_{h}, \\ y_{h,0}(u)=y_h^0, \quad n=1,2, \ldots, N_{T}, \end{array} \end{aligned}$$
(6.5a)
$$\begin{aligned} & \begin{array}{l} \biggl(\displaystyle\frac{p_{h,n-1}(u)-p_{h,n}(u)}{k_{n}}, q \biggr) + a_h\bigl (q,p_{h,n-1}(u) \bigr)=-\bigl(y_{h,n}(u)-y_{h,n}^{d},q\bigr) \quad \forall q \in V_{h}, \\ p_{h,T}(u)=0, \quad n=N_{T}, \ldots, 2,1. \end{array} \end{aligned}$$
(6.5b)

For simplicity, we use the following notation:

$$\begin{aligned} \zeta_{n} &= y_{h,n} - y_{h,n}(u), \quad n=0,1, \ldots, N_T, \\ \chi_{n} &= p_{h,n} - p_{h,n}(u), \quad n=N_T, \ldots, 1,0. \end{aligned}$$

Firstly, we establish a connection between the approximate solutions \((y_{h},p_{h})\) and the auxiliary solutions \((y_{h}(u),p_{h}(u))\), as described in the following lemma.

Lemma 8

Let \((y_{h},p_{h})\) and \((y_{h}(u),p_{h}(u))\) be the solutions of (6.2a), (6.2b) and (6.5a), (6.5b), respectively. Then, there are positive constants \(C_{1}\) and \(C_{2}\) independent of \(h\) and \(k\) such that

$$\begin{aligned} \big |\!\big |\!\big | y_{h} - y_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} & \leq C_{1 }\|\!| u - u_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}, \end{aligned}$$
(6.6)
$$\begin{aligned} \big |\!\big |\!\big | p_{h} - p_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} &\leq C_{2}\|\!| u - u_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}. \end{aligned}$$
(6.7)

Proof

We start the proof of (6.6) by subtracting (6.5a) from (6.2a) to obtain the following equality

$$\begin{aligned} \biggl(\frac{\zeta_{n} - \zeta_{n-1}}{k_n}, v_{h} \biggr) + a_{h}( \zeta_{n}, v_{h})= - b_h(u_{h,n} - u_{n}, v_{h}). \end{aligned}$$

By choosing \(v_{h}=\zeta_{n}\) and following the steps in Lemma 6, we obtain

$$\begin{aligned} &\frac{1}{2 k_{n}} \bigl( \|\zeta_{n}\|^{2}_{L^{2}(\varOmega)} - \|\zeta_{n-1}\|^{2}_{L^{2}(\varOmega)} \bigr) + \kappa\| \zeta_{n}\|^{2}_{\varepsilon} + \sum _{K} c_{0}\|\zeta_{n}\|^{2}_{L^{2}(K)} + \frac{1}{2}\sum _{K} \| \zeta_{n} \|^{2}_{\partial K^{-} \cap\varGamma^{-}} \\ &\qquad{} + \frac{1}{2}\sum _{K} \big\| \zeta_{n}- \zeta_{n}^{e}\big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \frac{1}{2}\sum _{K} \|\zeta_{n} \|^{2}_{\partial K^{+} \cap\varGamma^{+}} \\ &\quad \leq\frac{1}{2} \|u_{h,n} - u_{n} \|^{2}_{L^{2}(\varOmega)} + \frac{1}{2} \|\zeta_{n} \|^{2}_{L^{2}(\varOmega)}. \end{aligned}$$

Multiplying the above inequality by \(2k_{n}\) and summing from \(n=1\) to \(n=N_{T}\), we derive

$$\begin{aligned} &\bigl( \|\zeta_{N_{T}}\|^{2}_{L^{2}(\varOmega)} - \| \zeta_{0}\|^{2}_{L^{2}(\varOmega)} \bigr) + 2 \sum _{n=1}^{N_{T}} k_{n} \biggl( \kappa\| \zeta_{n}\|^{2}_{\varepsilon} + \sum _{K}c_{0} \| \zeta_{n}\|^{2}_{L^{2}(K)} \biggr) \\ &\qquad{} + \sum _{n=1}^{N_{T}} k_{n} \biggl( \sum _{K} \| \zeta_{n}\|^{2}_{\partial K^{-} \cap\varGamma^{-}} + \sum _{K} \big\| \zeta_{n}- \zeta_{n}^{e} \big\|^{2}_{\partial K^{-}\backslash\varGamma^{-}} + \sum _{K} \| \zeta_{n}\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ &\quad \leq\sum _{n=1}^{N_{T}} k_{n} \bigl( \|u_{h,n} - u_{n}\|^{2}_{L^{2}(\varOmega)} + \|\zeta_{n}\|^{2}_{L^{2}(\varOmega)} \bigr). \end{aligned}$$

Then, we apply the discrete Gronwall inequality to the terms related to \(\zeta\), use (4.3), which leads to the inequality \(\| \cdot\|_{L^{2}(\varOmega)} \leq C\| \cdot\| _{\varepsilon}\) for some positive constant \(C\), and finally use the definition of the norm in (5.1) to obtain (6.6).
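For reference, one standard form of the discrete Gronwall argument can be sketched as follows. The nonnegative sequence \(\varphi_{n}\) and the constants \(g \geq0\), \(c>0\) are generic placeholders rather than notation from the paper, and the step-size restriction \(c k_{n} \leq1/2\) is an assumption of this sketch. If

$$\begin{aligned} \varphi_{m} \leq g + c \sum_{n=1}^{m} k_{n} \varphi_{n}, \quad m=1,\ldots,N_{T}, \end{aligned}$$

then absorbing the term with \(n=m\) into the left-hand side gives \(\varphi_{m} \leq2g + 2c \sum_{n=1}^{m-1} k_{n} \varphi_{n}\), and induction over \(m\) yields

$$\begin{aligned} \varphi_{m} \leq2g \prod_{n=1}^{m-1} (1+2ck_{n}) \leq2g \exp(2cT). \end{aligned}$$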

To show the second part of Lemma 8, we subtract (6.5b) from (6.2b) to obtain

$$\begin{aligned} \biggl(\frac{ \chi_{n-1} - \chi_{n}}{k_{n}}, q_{h} \biggr) + a_{h}(q_{h}, \chi_{n-1}) = -(\zeta_{n}, q_{h}). \end{aligned}$$

By choosing \(q_{h}=\chi_{n-1}\) and proceeding as in the first part, we obtain the following inequality

$$\begin{aligned} \big |\!\big |\!\big | p_{h} - p_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} \leq C \big |\!\big |\!\big | y_{h} - y_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))}. \end{aligned}$$

This inequality gives us the desired result (6.7). □

To derive an estimate for the control u in the fully-discrete scheme, we use the discontinuous piecewise linear finite element space by following the approach in [29].

Lemma 9

Let \((y,p,u)\) and \((y_{h},p_{h},u_{h})\) be the solutions of (2.8a)–(2.8c) and (6.2a)–(6.2c), respectively. Under the assumptions \(u \in L^{2}(0,T;W^{1,\infty}(\varOmega_{U}))\), \(u|_{\varOmega^{*}} \in L^{2}(0,T; H^{2}(\varOmega^{*}))\), \(p \in L^{2}(0,T;W^{1,\infty}(\varOmega))\), we have

$$\begin{aligned} \|\!| u - u_{h} |\!\|_{L^{2}(0,T; L^{2}(\varOmega_U))}& \leq C \biggl( k \biggl \Vert \frac{\partial p}{\partial t}\biggr \Vert _{L^{2}(0,T; L^{2}(\varOmega ))} + \big |\!\big |\!\big | p_{h}(u) - p \big |\!\big |\!\big |_{L^{2}(0,T; L^{2}(\varOmega))} \biggr) \\ &\quad {} + Ch^{3/2}_U. \end{aligned}$$
(6.8)

Proof

Let

$$\bigl(J_{h}^{\prime}(u),v-u\bigr)_U=\sum _{n=1}^{N_{T}} k_{n} \bigl(\alpha u_{n}- B^*p_{h,n-1}(u),v_{n}-u_{n} \bigr)_U, $$

where \(p_{h,n-1}(u)\) is the solution of the auxiliary equation (6.5b). Then,

$$\begin{aligned} \bigl(J_{h}^{\prime}(v)-J_{h}^{\prime}(u),v-u \bigr)_U &=\sum _{n=1}^{N_{T}}k_{n}( \alpha v_{n}-\alpha u_{n},v_{n}-u_{n})_U\\ &\quad {}+ \sum _{n=1}^{N_{T}}k_{n}\bigl( B^*p_{h,n-1}(u)- B^*p_{h,n-1}(v),v_{n}-u_{n} \bigr)_U \\ &=\alpha \|\!| v-u |\!\|^{2}_{L^{2}(0,T;L^{2}(\varOmega_U))}\\ &\quad {} + \sum _{n=1}^{N_{T}}k_{n} \bigl( B^*p_{h,n-1}(u)- B^*p_{h,n-1}(v),v_{n}-u_{n} \bigr)_U. \end{aligned}$$

By using the auxiliary solutions (6.5a), (6.5b) as done for the semi-discrete scheme, we obtain

$$\sum_{n=1}^{N_{T}}k_{n}\bigl( B^*p_{h,n-1}(u)- B^*p_{h,n-1}(v),v_{n}-u_{n}\bigr)_U \geq0. $$

Hence,

$$\begin{aligned} \bigl(J_{h}^{\prime}(v)-J_{h}^{\prime}(u),v-u \bigr)_U \geq\alpha \|\!| v-u |\!\|^{2}_{L^{2}(0,T;L^{2}(\varOmega_U))}. \end{aligned}$$
(6.9)

Let \(\varPi u_{n} \in U_{h}\) be the standard Lagrange interpolation of \(u\) at time \(t_{n}\) such that \(\varPi u_{n}(x)=u_{n}(x)\) for all vertices \(x\). Then, \(\varPi u_{n}\) belongs to \(U_{h}^{ad}\) at time \(t_{n}\). With the help of the inequalities (6.9), (2.8c), (6.2c) and the approximation \(\varPi u_{n}\) of \(u\) at time \(t_{n}\), we obtain

$$\begin{aligned} &\alpha \|\!| u-u_h |\!\|^{2}_{L^{2}(0,T;L^{2}(\varOmega_U))} \\ &\quad \leq \bigl(J_{h}^{\prime}(u)-J_{h}^{\prime}(u_h),u-u_h \bigr)_U \\ &\quad = \sum _{n=1}^{N_{T}}k_{n}\bigl(\alpha u_{n}- B^*p_n,u_{n}-u_{h,n} \bigr)_U + \sum _{n=1}^{N_{T}}k_{n} \bigl( B^*p_{n}- B^*p_{h,n-1}(u),u_{n}-u_{h,n} \bigr)_U \\ &\qquad{}+ \sum _{n=1}^{N_{T}}k_{n}\bigl(\alpha u_{h,n}- B^*p_{h,n-1}, \varPi u_{n}-u_{n} \bigr)_U \\ &\qquad{} + \sum _{n=1}^{N_{T}}k_{n} \bigl(\alpha u_{h,n}- B^*p_{h,n-1}(u), u_{h,n}-\varPi u_{n}\bigr)_U \\ &\quad \leq \sum _{n=1}^{N_{T}}k_{n} \bigl( B^*p_{n}- B^*p_{h,n-1}(u),u_{n}-u_{h,n} \bigr)_U \\ &\qquad{} + \sum _{n=1}^{N_{T}}k_{n} \bigl(\alpha u_{h,n}- B^*p_{h,n-1}, \varPi u_{n}-u_{n} \bigr)_U \\ &\quad = \underbrace{\sum _{n=1}^{N_{T}}k_{n}\bigl( B^*p_{n}- B^*p_{n-1},u_{n}-u_{h,n} \bigr)_U}_{T_{1}} \\ &\qquad {} + \underbrace{\sum _{n=1}^{N_{T}}k_{n} \bigl( B^*p_{n-1}- B^*p_{h,n-1}(u),u_{n}-u_{h,n} \bigr)_U}_{T_{2}} \\ &\qquad {} + \underbrace{\sum _{n=1}^{N_{T}}k_{n} \bigl(\alpha u_{h,n}- B^*p_{h,n-1}, \varPi u_{n}-u_{n} \bigr)_U}_{T_{3}}. \end{aligned}$$
(6.10)

The following estimates of \(T_{1}\) and \(T_{2}\) are derived by using Young’s inequality,

$$\begin{aligned} T_{1} &\leq C_{1} \sum _{n=1}^{N_{T}}k_{n} \|p_{n}-p_{n-1}\|_{L^{2}(\varOmega)}^{2} + C_{2} \sum _{n=1}^{N_{T}}k_{n} \|u_{n}-u_{h,n}\|_{L^{2}(\varOmega_U)}^{2} \\ &\leq C_{1} k^{2} \bigg\| \frac{\partial p}{\partial t} \bigg\|_{L^{2}(0,T;L^{2}(\varOmega))}^{2} + C_{2} \|\!| u-u_h |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}^{2}, \\ T_{2} &\leq C_{1} \sum _{n=1}^{N_{T}}k_{n} \big\|p_{n-1}-p_{h,n-1}(u)\big\|_{L^{2}(\varOmega)}^{2} + C_{2}\sum _{n=1}^{N_{T}}k_{n} \|u_{n}-u_{h,n}\|_{L^{2}(\varOmega_U)}^{2} \\ &\leq C_{1} \big |\!\big |\!\big | p-p_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))}^{2}+ C_{2} \|\!| u-u_h |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}^{2}. \end{aligned}$$
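The passage from the first to the second line in the bound for \(T_{1}\) rests on the fundamental theorem of calculus and the Cauchy–Schwarz inequality in time. As a brief sketch, assuming \(\partial p/\partial t \in L^{2}(0,T;L^{2}(\varOmega))\),

$$\begin{aligned} \|p_{n}-p_{n-1}\|^{2}_{L^{2}(\varOmega)} = \biggl \Vert \int _{t_{n-1}}^{t_{n}} \frac{\partial p}{\partial t} \,dt \biggr \Vert ^{2}_{L^{2}(\varOmega)} \leq k_{n} \int _{t_{n-1}}^{t_{n}} \biggl \Vert \frac{\partial p}{\partial t} \biggr \Vert ^{2}_{L^{2}(\varOmega)} \,dt, \end{aligned}$$

so that, since \(k_{n} \leq k\),

$$\begin{aligned} \sum_{n=1}^{N_{T}} k_{n} \|p_{n}-p_{n-1}\|^{2}_{L^{2}(\varOmega)} \leq k^{2} \biggl \Vert \frac{\partial p}{\partial t} \biggr \Vert ^{2}_{L^{2}(0,T;L^{2}(\varOmega))}. \end{aligned}$$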

By considering the discontinuous piecewise linear finite element space for the control u and following the steps in Lemma 4, we obtain

$$\begin{aligned} T_{3} &\leq C k^{2} \bigg\| \frac{\partial p}{\partial t} \bigg\|_{L^{2}(0,T;L^{2}(\varOmega))}^{2} + C\big |\!\big |\!\big | p_{h}(u)-p \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))}^{2}\\ &\quad {}+ \|\!| u-u_h |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}^{2}+ Ch^{3}_U. \end{aligned}$$

Summing up the estimates of \(T_{1}\)–\(T_{3}\), we complete the proof. □

Now, we establish the connection between the exact solutions and auxiliary solutions.

Lemma 10

Let (y,p) be the solutions of (2.8a), (2.8b) and (y h (u),p h (u)) be the solutions of (6.5a), (6.5b), respectively. Suppose that the conditions of Proposition 1 and Lemma 9 are valid. Then, the following estimates hold

$$\begin{aligned} \big |\!\big |\!\big | y - y_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} & \leq C h^2 \|\!| y |\!\| _{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} \\ &\quad {} + C h^2 \biggl \Vert \frac{\partial y }{\partial t} \biggr \Vert _{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} + C k \biggl \Vert \frac{\partial ^{2} y}{\partial t^{2}} \biggr \Vert _{L^{2}(0,T;L^{2}(\varOmega))}, \end{aligned}$$
(6.11)

and

$$\begin{aligned} &\big |\!\big |\!\big | p - p_{h}(u) \big |\!\big |\!\big |_{L^{2}(0,T;L^{2}(\varOmega))} \\ &\quad \leq C h^2 \sum _{v=y, p} \|\!| v |\!\| _{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} + C h^2 \sum _{v=y, p} \biggl \Vert \frac{\partial v}{\partial t} \biggr \Vert _{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} \\ &\qquad {} + C k \sum _{v=y, p} \biggl \Vert \frac{\partial^{2} v}{\partial t^{2}}\biggr \Vert _{L^{2}(0,T;L^{2}(\varOmega))} + C k \sum _{v=y, y_{d}} \biggl \Vert \frac{\partial v}{\partial t} \biggr \Vert _{L^{2}(0,T;L^{2}(\varOmega))}. \end{aligned}$$
(6.12)

Proof

Here, we only prove (6.11) since both cases follow the same procedure. We start with the following equation obtained by using (6.5a) and (2.8a)

$$\begin{aligned} (\partial_t y_{n}, v_h) + a_h(y_{n},v_h)- \biggl( \frac{y_{h,n}(u) - y_{h,n-1}(u)}{k_{n}}, v_{h} \biggr) - a_h \bigl(y_{h,n}(u), v_h\bigr) = 0. \end{aligned}$$
(6.13)

We decompose \(y - y_{h}(u)\) as

$$\begin{aligned} y_{n} - y_{h,n}(u) = y_n - \tilde{y}_{n} - \bigl(y_{h,n}(u) - \tilde{y}_{n} \bigr) = \eta_{n} - \xi_{n}, \end{aligned}$$

where \(\tilde{y}\) is an elliptic projection of \(y\). We only need to estimate \(\xi_{n}\) since the estimate of \(\eta_{n}\) is given in (4.8b). Hence, we write (6.13) as

$$\begin{aligned} \biggl(\frac{\xi_{n} - \xi_{n-1}}{k_{n}}, v_{h} \biggr) + a_{h}( \xi_{n}, v_{h})& = \biggl( \frac{\partial y_{n}}{\partial t} - \frac{y_{n} - y_{n-1}}{k_{n}}, v_{h} \biggr) + \biggl(\frac{\eta_{n} - \eta_{n-1}}{k_{n}}, v_{h} \biggr)\\ &\quad {} + a_{h}(\eta_{n}, v_{h}). \end{aligned}$$
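The first term on the right-hand side is the consistency error of the backward Euler method. As a brief sketch, assuming \(\partial^{2} y/\partial t^{2} \in L^{2}(0,T;L^{2}(\varOmega))\), Taylor's formula with integral remainder gives

$$\begin{aligned} \frac{\partial y_{n}}{\partial t} - \frac{y_{n} - y_{n-1}}{k_{n}} = \frac{1}{k_{n}} \int _{t_{n-1}}^{t_{n}} (t-t_{n-1}) \frac{\partial^{2} y}{\partial t^{2}} \,dt, \end{aligned}$$

and the Cauchy–Schwarz inequality in time yields

$$\begin{aligned} \biggl \Vert \frac{\partial y_{n}}{\partial t} - \frac{y_{n} - y_{n-1}}{k_{n}} \biggr \Vert ^{2}_{L^{2}(\varOmega)} \leq\frac{k_{n}}{3} \int _{t_{n-1}}^{t_{n}} \biggl \Vert \frac{\partial^{2} y}{\partial t^{2}} \biggr \Vert ^{2}_{L^{2}(\varOmega)} \,dt. \end{aligned}$$

After multiplication by \(k_{n}\) and summation over \(n\), this contribution is bounded by \(\frac{k^{2}}{3} \Vert \partial^{2} y/\partial t^{2} \Vert ^{2}_{L^{2}(0,T;L^{2}(\varOmega))}\), which is the source of the first-order term in \(k\) in (6.11).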

By choosing \(v_{h}=\xi_{n}\) and applying the steps in Lemma 5 to bound the terms on the inflow and outflow boundaries, we obtain

$$\begin{aligned} &\|\xi_{N_{T}}\|^{2}_{L^{2}(\varOmega)} - \|\xi_{0} \|^{2}_{L^{2}(\varOmega)} + \frac{3 \kappa}{4} \sum _{n=1}^{N_{T}} k_{n} \|\xi_{n}\|^{2}_{\varepsilon} + 2 \sum _{n=1}^{N_{T}} k_{n} \sum _{K} c_{0}\| \xi_{n}\|^{2}_{L^{2}(K)} \\ &\qquad{} + \frac{1}{2} \sum _{n=1}^{N_{T}} k_{n} \biggl( 2\sum _{K} \| \xi_{n}\|^{2}_{\partial K^{-} \cap\varGamma^{-}} + \sum _{K} \big\| \xi_{n}- \xi_{n}^{e}\big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \sum _{K} \|\xi_{n}\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ & \quad \leq2 \sum _{n=1}^{N_{T}} k_{n} \biggl( \sum _{K} \big\| \eta^{e}_{n}\big\|^{2}_{\partial K^{-} \backslash\varGamma^{-}} + \sum_{K} \|\eta_{n}\|^{2}_{\partial K^{+} \cap\varGamma^{+}} \biggr) \\ &\qquad{} + C \sum _{n=1}^{N_{T}} k_{n} \| \xi_{n}\|^{2}_{L^{2}(\varOmega)} + C k^{2} \int _{0}^{T} \biggl \vert \!\biggl \vert \frac{\partial^{2} y }{\partial t^{2}} \biggr \vert \!\biggr \vert ^{2}_{L^{2}(\varOmega)}\,dt + C h^4 \int _{0}^{T} \biggl \vert \!\biggl \vert \!\biggl \vert \frac{\partial y }{\partial t} \biggr \vert \!\biggr \vert \!\biggr \vert ^{2}_{H^{2}(\mathcal{T}_{h})} \, dt. \end{aligned}$$

Then, applying the discrete Gronwall inequality to the terms related to \(\xi\) in the above inequality, the desired result (6.11) is obtained. □

Now, we finalize the a priori error estimate of the fully-discrete scheme by combining Lemmas 8–10 with the triangle inequality.

Theorem 2

Let \((y,p,u)\) be the solutions of (2.8a)–(2.8c) and \((y_{h},p_{h},u_{h})\) be the solutions of (6.2a)–(6.2c), respectively. Suppose that the conditions of Proposition 1 and Lemma 9 are valid. Further, assume that the regularity condition (4.9) is satisfied. Then, the following estimate holds

$$\begin{aligned} &\|\!| y - y_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega))} + \|\!| p - p_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega))} + \|\!| u - u_{h} |\!\|_{L^{2}(0,T;L^{2}(\varOmega_U))}\\ &\quad \leq Ch^2\sum _{v=y, p} \|\!| v |\!\|_{L^{2}(0,T;H^{2}(\mathcal{T}_{h}))} + C h^2 \sum _{v=y, p} \biggl \Vert \frac{\partial v}{\partial t} \biggr \Vert _{L^{2}(0,T;H^{2}(\mathcal {T}_{h}))} + Ch^{3/2}_U\\ &\qquad {}+ C k \sum _{v=y, p} \biggl \Vert \frac{\partial^{2} v}{\partial t^{2}}\biggr \Vert _{L^{2}(0,T;L^{2}(\varOmega))} + C k \sum _{v=y, p, y_{d}} \biggl \vert \!\biggl \vert \frac{\partial v}{\partial t} \biggr \vert \!\biggr \vert _{L^{2}(0,T;L^{2}(\varOmega))}. \end{aligned}$$

7 Numerical results

In this section, we present numerical results for the unsteady control constrained optimal control problems governed by the convection diffusion equation (2.1), (2.2a)–(2.2c). We take \(\varOmega=\varOmega_{U}\) and \(B=I\), and consider the problem in [10] with the following parameters

$$\begin{array}{l} Q=(0,1] \times\varOmega, \quad\varOmega=(0,1)^{2}, \quad \varepsilon =10^{-5}, \quad\beta=(1,0)^T, \\ r=0, \quad \alpha=1 \quad\text{and} \quad u_a=0. \end{array} $$

The source function \(f\), the desired state \(y_{d}\) and the initial condition \(y_{0}\) are computed from (2.8a)–(2.8c) using the following exact solutions of the state, adjoint and control, respectively,

$$\begin{aligned} y(x,t) &= \exp(-t) \sin(2 \pi x_1) \sin(2 \pi x_2), \\ p(x,t) &= \exp(-t) (1-t) \sin(2 \pi x_1) \sin(2 \pi x_2), \\ u(x,t) &= \max(0,p). \end{aligned}$$
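The exact solutions above are straightforward to evaluate pointwise. The sketch below is our own illustration, with hypothetical function names and an arbitrarily chosen uniform grid; it evaluates the state, the adjoint and the control and shows that the control vanishes wherever the adjoint is negative.

```python
import numpy as np

def exact_state(x1, x2, t):
    """y(x,t) = exp(-t) sin(2 pi x1) sin(2 pi x2)."""
    return np.exp(-t) * np.sin(2.0 * np.pi * x1) * np.sin(2.0 * np.pi * x2)

def exact_adjoint(x1, x2, t):
    """p(x,t) = exp(-t) (1 - t) sin(2 pi x1) sin(2 pi x2)."""
    return (1.0 - t) * exact_state(x1, x2, t)

def exact_control(x1, x2, t, lower=0.0):
    """u(x,t) = max(u_a, p) with the lower bound u_a = 0."""
    return np.maximum(lower, exact_adjoint(x1, x2, t))

# Evaluate on a uniform 33 x 33 grid of the unit square at t = 0.5.
grid = np.linspace(0.0, 1.0, 33)
X1, X2 = np.meshgrid(grid, grid, indexing="ij")
P = exact_adjoint(X1, X2, 0.5)
U = exact_control(X1, X2, 0.5)
print(U.min(), U.max(), np.count_nonzero(P < 0.0))
```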

In our numerical example, the control variable is only bounded from below, i.e., \(u_{a}=0\). The state, the adjoint, and the control variables are discretized by using piecewise linear polynomials, i.e., \((x,y,1-x-y)\). The discretized control constrained problems are solved by the primal dual active set (PDAS) algorithm as a semi-smooth Newton step, see, e.g., [2, 3]. The algorithm is terminated when two consecutive active sets coincide. The initial guess for the control variable is taken to be zero for all discretization levels.
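To make the PDAS iteration and its stopping rule concrete, the following minimal sketch applies it to a generic finite-dimensional model problem, \(\min\frac{1}{2} u^{T} H u - g^{T} u\) subject to \(u \geq u_{a}\). This is not the discrete optimality system of the paper: the matrix `H`, the vector `g`, the parameter `c` and the function name are hypothetical, and the finite termination of the loop is only assumed for well-behaved `H` such as M-matrices.

```python
import numpy as np

def pdas_lower_bound(H, g, lower, c=1.0, max_iter=50):
    """Primal dual active set method for min 0.5 u'Hu - g'u subject to u >= lower.

    Returns the control u, the multiplier lam and the number of iterations.
    """
    n = len(g)
    u = np.zeros(n)            # zero initial guess for the control
    lam = np.zeros(n)          # multiplier of the bound constraint
    active_old = None
    for it in range(max_iter):
        # Predicted active set: indices where the bound is expected to hold.
        active = lam - c * (u - lower) > 0
        if active_old is not None and np.array_equal(active, active_old):
            return u, lam, it  # two consecutive active sets coincide
        inactive = ~active
        u = np.empty(n)
        u[active] = lower[active]
        if inactive.any():
            rhs = g[inactive] - H[np.ix_(inactive, active)] @ lower[active]
            u[inactive] = np.linalg.solve(H[np.ix_(inactive, inactive)], rhs)
        lam = np.zeros(n)
        lam[active] = (H @ u - g)[active]
        active_old = active
    raise RuntimeError("PDAS did not converge within max_iter iterations")

# With H = I the solution is the pointwise projection max(lower, g).
g = np.array([1.0, -2.0, 0.5])
u, lam, iterations = pdas_lower_bound(np.eye(3), g, np.zeros(3))
print(u, lam, iterations)
```

With \(H=I\) and \(u_{a}=0\), the computed solution is the componentwise maximum of \(0\) and \(g\), the finite-dimensional analogue of the relation \(u=\max(0,p)\) used for the exact control.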

There are two approaches to solve the optimization problem (2.1), (2.2a)–(2.2c) numerically, i.e., the discretize-then-optimize (DO) and the optimize-then-discretize (OD) approaches. It is desirable that both approaches lead to the same discrete optimality system. In the DO approach, one first discretizes the optimal control problem with the objective function (2.1) and the state equation (2.2a)–(2.2c) and then forms the optimality system. In the OD approach, one first derives the optimality conditions consisting of the state and adjoint PDEs and the algebraic equation that links the control and the adjoint variable. Afterwards, the infinite-dimensional optimality system is discretized to form the discrete optimality system. It is known that, for some discretization schemes, the two approaches lead to the same linear optimality system for optimal control problems governed by convection diffusion PDEs. Although this commutative property is preserved for the steady case using upwind SIPG methods [15, 28], it is not preserved for unsteady problems when the backward Euler discretization is used in time. A straightforward time discretization will usually not lead to the same optimality system for the DO and OD approaches; the initial condition of the adjoint PDE makes the difference between the two approaches. This was shown in [23] for the Stokes equation, where, by adjusting the time discretization for the initial condition of the forward problem, it was possible to show that the OD and DO approaches commute. The same technique was also used in [11]. Further, both approaches lead to the same discrete optimality system when the discontinuous Galerkin discretization dG(0) is used in time [14].

Tables 1 and 2 show the errors and convergence rates with respect to the discrete time-dependent norm (5.1) for a fixed mesh size in space for the OD and DO approaches, respectively. The order of convergence in time is \(O(k)\), as expected from the a priori error estimates. For fixed time steps, the errors and convergence rates in space are given in Tables 3 and 4 for the OD and DO approaches, respectively. Again, the results confirm the a priori error estimates. Although the theoretical rate of the control is \(h^{3/2}\), the observed rate is \(h^{2}\) since there is no kink in the control. It means that our initial grid aligns with the points where the lower bound of the control and the value of the adjoint coincide, i.e., \(x_{1}=x_{2}=0.5\).

Table 1 \(h/\sqrt{2}=1/32\) via the OD approach
Table 2 \(h/\sqrt{2}=1/32\) via the DO approach
Table 3 k=1/2048 via the OD approach
Table 4 k=1/2048 via the DO approach
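Convergence rates of the kind reported in such tables are commonly computed from errors on successive refinement levels as \(\log(e_{i}/e_{i+1})/\log(s_{i}/s_{i+1})\), where \(s_{i}\) denotes the mesh size or the time step. The following sketch shows this computation. The function name is hypothetical and the error values are synthetic first-order data generated inside the example, not values taken from Tables 1–4.

```python
import numpy as np

def convergence_rates(step_sizes, errors):
    """Experimental order of convergence between consecutive refinement levels."""
    step_sizes = np.asarray(step_sizes, dtype=float)
    errors = np.asarray(errors, dtype=float)
    return np.log(errors[:-1] / errors[1:]) / np.log(step_sizes[:-1] / step_sizes[1:])

# Synthetic first-order data e = 3 k, used only to exercise the formula.
steps = np.array([1.0 / 8, 1.0 / 16, 1.0 / 32, 1.0 / 64])
synthetic_errors = 3.0 * steps
rates = convergence_rates(steps, synthetic_errors)
print(rates)
```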

Figure 1 shows the computed solutions of the state and adjoint at \(t=0.5\) with \(h/\sqrt{2}=1/32\), \(k=1/128\) by using the OD approach. In addition, the exact and computed solutions of the control are given in Fig. 2 for \(t=0.5\) with \(h/\sqrt{2}=1/32\), \(k=1/128\) by using the OD approach.

Fig. 1 The computed solutions of the state (left) and adjoint (right) at t=0.5 with \(h/\sqrt{2}=1/32\), k=1/128 by using the OD approach

Fig. 2 The exact and computed solutions of the control at t=0.5 with \(h/\sqrt{2}=1/32\), k=1/128 by using the OD approach

Discontinuous Galerkin (DG) discretizations exhibit a better convergence behavior for convection dominated optimal control problems since errors in boundary layers are not propagated into the entire domain [16]. Therefore, the DG discretization with mesh adaptivity gives better results than stabilized finite element methods, as in [26, 27], for steady convection dominated optimal control problems. Although our example is smooth, we can still see the effect of the DG discretization. The same example was solved in [10] with the characteristic finite element method. Comparing the results in Table 1 with the ones obtained in [10], it turns out that the upwind SIPG discretization yields more accurate results.

8 Conclusions

We have derived a priori error estimates for the optimal control problems governed by the unsteady convection diffusion equation using the upwind SIPG discretization in space and the standard backward Euler method in time. Although the OD and DO approaches lead to two different optimality systems under the backward Euler discretization, there are no remarkable differences in the numerical results and convergence rates. Numerical experiments are given to confirm the theoretical results. With adaptive meshes, the DG methods resolve the boundary and/or interior layers for convection dominated problems better and more efficiently than the continuous finite elements in [26, 27] for steady optimal control problems. This issue will be addressed in future work with space-time adaptivity for unsteady convection dominated optimal control problems.