1 Introduction

PDE-constrained optimal control problems play an important role in many real-world applications. There has been extensive research on numerical methods and algorithms for optimal control problems in the literature (see [6, 9, 11, 18]), most of which has focused on control problems governed by integer-order PDEs. In recent years, many researchers have started to study the numerical approximation of optimal control problems governed by fractional differential equations, which arise frequently in applications such as anomalous processes ([3, 14, 20]).

Based on an optimal control framework, an initial value inverse problem for the time fractional diffusion equation was studied in [23] by a spectral Galerkin method. The spectral Galerkin approximation of a time fractional optimal control problem with a state integral constraint was discussed in [24]. In [17], a Legendre pseudo-spectral method combined with the L1 scheme was used to discretize a control constrained optimal control problem governed by a time-fractional diffusion equation. In [25], a finite element method combined with the L1 scheme was applied to discretize a time fractional optimal control problem with Caputo derivative, and an a priori error estimate for the semidiscrete case was proved. The regularity of a time fractional optimal control problem and fully discrete error estimates for the L1 and backward Euler schemes were presented in [13]. A time adaptive algorithm was developed in [27] for the time-stepping discontinuous Galerkin approximation of a time fractional optimal control problem. In [1, 2], the finite element approximation of optimal control problems governed by the fractional Laplacian was investigated. The finite element approximation of an optimal control problem governed by a one-dimensional space fractional diffusion equation was investigated in [26], where an a priori error estimate was derived and a fast primal-dual active set algorithm was developed. A fast gradient projection algorithm for the finite difference approximation of a space fractional optimal control problem was proposed in [7].

This paper is devoted to the error analysis of, and a fast algorithm for, the time-stepping discontinuous Galerkin finite element approximation of a control constrained optimal control problem governed by a time fractional diffusion equation. A piecewise linear finite element method combined with a piecewise constant time-stepping discontinuous Galerkin method is applied to discretize the state equation. The control variable is discretized by the variational discretization approach (see [10]). The regularity of the optimal control problem is discussed, and a priori error estimates for the state, adjoint state and control variables are derived. Due to the nonlocal property of the time fractional derivative, numerical schemes for time fractional equations give rise to long tails in time, which leads to higher memory requirements and longer computational time compared with integer-order differential equations. For optimal control problems this difficulty is more serious, since the discrete first order optimality condition, which consists of the discrete state equation, the adjoint state equation and the variational inequality, is usually solved iteratively. To reduce the computational cost, a fast gradient projection algorithm is designed for the control problem based on the block triangular Toeplitz structure of the coefficient matrix of the discretized system. The total computational cost for solving the state and adjoint state equations is \(O(MJ\log J)\) operations, where M and J denote the numbers of degrees of freedom in space and time, respectively. Finally, numerical examples are presented to illustrate the theoretical findings and the fast algorithm.
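To indicate why a triangular Toeplitz structure helps, the following minimal sketch multiplies a lower triangular Toeplitz matrix by a vector in \(O(J\log J)\) operations by zero-padding and using the fast Fourier transform. It is not the algorithm of Sect. 6; the function names and the sample first column are illustrative assumptions, and a small radix-2 FFT is written out only to keep the example self-contained.

```python
import cmath


def fft(x, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], invert)
    odd = fft(x[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def lower_toeplitz_matvec(col, x):
    """Product of the lower triangular Toeplitz matrix with first column col and x."""
    J = len(col)
    n = 1
    while n < 2 * J:  # pad so that the circular convolution equals the linear one
        n *= 2
    a = fft([complex(c) for c in col] + [0j] * (n - J))
    b = fft([complex(v) for v in x] + [0j] * (n - J))
    prod = fft([p * q for p, q in zip(a, b)], invert=True)
    return [prod[i].real / n for i in range(J)]  # divide by n to normalize the inverse


def direct_matvec(col, x):
    """Reference O(J^2) product: y_i = sum_{j<=i} col[i-j] * x[j]."""
    return [sum(col[i - j] * x[j] for j in range(i + 1)) for i in range(len(col))]


# Illustrative data (not taken from the paper).
col = [(k + 1) ** -1.5 for k in range(6)]
x = [1.0, -2.0, 3.0, 0.5, 0.0, 4.0]
fast = lower_toeplitz_matvec(col, x)
slow = direct_matvec(col, x)
```

Only the first column of the matrix is stored, so the memory requirement is \(O(J)\) instead of \(O(J^2)\).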

The paper is organized as follows. In Sect. 2, we recall some basic facts about fractional calculus. The regularity of the optimality system is discussed in Sect. 3. In Sect. 4, the time-stepping discontinuous Galerkin approximation of the optimal control problem is given, and a priori error estimates for the state, adjoint state and control variables are derived in Sect. 5. A fast gradient projection algorithm is developed in Sect. 6 based on the block triangular Toeplitz structure of the coefficient matrix of the discretized state and adjoint state equations. Numerical results are given in Sect. 7 to verify the theoretical findings and the fast algorithm.

2 Preliminary

In this section we recall some definitions and properties of fractional derivatives and Sobolev spaces.

For \(0<\beta <1\), the left Caputo and Riemann-Liouville fractional derivatives are defined as follows ([21]):

$$\begin{aligned} {}_0^C\partial _t^{\beta }v=\frac{1}{\varGamma (1-\beta )}\int _0^t\frac{v'(s)}{(t-s)^\beta }ds \end{aligned}$$

and

$$\begin{aligned} {}_0^R\partial _t^{\beta }v=\frac{1}{\varGamma (1-\beta )}\frac{d}{dt}\int _0^t\frac{v(s)}{(t-s)^\beta }ds. \end{aligned}$$

Here \(\varGamma (\cdot )\) denotes the Gamma function. Similarly, the right Caputo and Riemann-Liouville fractional derivatives of order \(\beta \) are defined by

$$\begin{aligned} {}_t^C\partial _T^{\beta }v=-\frac{1}{\varGamma (1-\beta )}\int _t^T\frac{v'(s)}{(s-t)^\beta }ds \end{aligned}$$

and

$$\begin{aligned} {}_t^R\partial _T^{\beta }v=-\frac{1}{\varGamma (1-\beta )}\frac{d}{dt}\int _t^T\frac{v(s)}{(s-t)^\beta }ds. \end{aligned}$$

According to [21], the following relations hold:

$$\begin{aligned} {}_0^R\!\partial _t^{\beta }v(t)={}_0^C\!\partial _t^{\beta }v+\frac{v(0)t^{-\beta }}{\varGamma (1-\beta )} \end{aligned}$$

and

$$\begin{aligned} {}_t^R\!\partial _T^{\beta }v(t)={}_t^C\!\partial _T^{\beta }v+\frac{v(T)(T-t)^{-\beta }}{\varGamma (1-\beta )}. \end{aligned}$$

We observe that the Riemann-Liouville and Caputo fractional derivatives coincide under a homogeneous initial condition (for the left derivatives) or a homogeneous terminal condition (for the right derivatives).
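The relation between the two left derivatives can be checked numerically. The sketch below is only an illustration under our own assumptions: the quadrature (a midpoint rule after the substitution \(u=(t-s)^{1-\beta }\), which removes the weak singularity at \(s=t\)) and the function names are not taken from the paper. For \(v(s)=1+s^2\) the Caputo derivative has the closed form \(2t^{2-\beta }/\varGamma (3-\beta )\), which serves as the reference value.

```python
import math


def caputo_numeric(dv, beta, t, n=2000):
    """Left Caputo derivative of order beta at time t, given the classical derivative dv."""
    # With u = (t - s)**(1 - beta) the definition becomes
    # 1/Gamma(2 - beta) * int_0^{t**(1 - beta)} dv(t - u**(1/(1 - beta))) du.
    upper = t ** (1 - beta)
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += dv(t - u ** (1.0 / (1 - beta)))
    return total * h / math.gamma(2 - beta)


def riemann_liouville_from_caputo(v0, dv, beta, t):
    """Left Riemann-Liouville derivative via the relation: Caputo plus v(0) t^(-beta)/Gamma(1-beta)."""
    return caputo_numeric(dv, beta, t) + v0 * t ** (-beta) / math.gamma(1 - beta)


beta, t = 0.5, 1.0
exact_caputo = 2.0 * t ** (2 - beta) / math.gamma(3 - beta)  # closed form for v(s) = 1 + s**2
approx_caputo = caputo_numeric(lambda s: 2.0 * s, beta, t)
approx_rl = riemann_liouville_from_caputo(1.0, lambda s: 2.0 * s, beta, t)
```

The two values differ exactly by the term \(v(0)t^{-\beta }/\varGamma (1-\beta )\), which vanishes when \(v(0)=0\).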

For a Lebesgue measurable subset \(\omega \) of \(R^\sigma (1\le \sigma \le 4)\), the symbol \((p,q)_{\omega }\) means \(\int _{\omega }pq\). According to [8], the following properties hold.

Lemma 2.1

Assume \(\beta \in (0,1) \setminus \{0.5\}\) and \(\varLambda =(a,b)\). If \(v\in H^{\beta }_0({\varLambda })\) , we have

$$\begin{aligned} ( {}_a^R\!\partial _t^{\beta }v,{}_t^R\!\partial _b^{\beta }v)_{{\varLambda }}=\cos (\beta \pi )|v|^2_{H^{\beta }({\varLambda })}. \end{aligned}$$

Further, if \(0<\beta <1/2\) and \(v,w\in H^{\beta }({\varLambda })\), we have

$$\begin{aligned} ({}_a^R\!\partial _t^{\beta }v,{}_t^R\!\partial _b^{\beta }w)_{{\varLambda }}\le & {} |v|_{H^{\beta } ({\varLambda })}|w|_{H^{\beta }({\varLambda })},\\ ( {}_a^R\!\partial _t^{2\beta }v,w)_{{\varLambda }}= & {} ( {}_a^R\!\partial _t^{\beta }v,{}_t^R\!\partial _b^{\beta }w)_{{\varLambda }}=(w,{}_t^R\! \partial _b^{2\beta }v)_{{\varLambda }}. \end{aligned}$$

According to [22], we introduce the space \({\dot{H}}^{s}(\varOmega )\). Let \(\{\lambda _k\}_{k=1}^{\infty }\) and \(\{\phi _k\}_{k=1}^{\infty }\) denote the eigenvalues (ordered nondecreasingly with multiplicity counted) and the \(L^2(\varOmega )\)-orthonormal eigenfunctions of the operator \(-\varDelta \) on the domain \(\varOmega \) with a zero Dirichlet boundary condition. For \(s\ge 0\), let \({\dot{H}}^{s}(\varOmega )\subset L^2(\varOmega )\) be defined by

$$\begin{aligned} {\dot{H}}^{s}(\varOmega ):=\left\{ v=\sum _{k=1}^{\infty }c_k\phi _k:\Big (\sum _{k=1}^{\infty } c_k^2\lambda _k^{s}\Big )^{\frac{1}{2}}<\infty \right\} . \end{aligned}$$

By Lemma 3.1 (see [22], Chap. 3, p. 38) we have \({\dot{H}}^{1}(\varOmega )=H^1_0(\varOmega )\) and \({\dot{H}}^{2}(\varOmega )=H^2(\varOmega )\cap H^1_0(\varOmega ).\)
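As a concrete instance, consider the one-dimensional case \(\varOmega =(0,1)\), which we assume here purely for illustration. Then \(\lambda _k=(k\pi )^2\) and \(\phi _k(x)=\sqrt{2}\sin (k\pi x)\), and the \({\dot{H}}^{s}\) norm can be evaluated directly from the expansion coefficients. The function name below is hypothetical.

```python
import math


def dot_h_norm(coeffs, s):
    """Norm in dot-H^s(0,1) of v = sum_k coeffs[k-1] * phi_k with lambda_k = (k*pi)**2."""
    return math.sqrt(
        sum(c * c * (k * math.pi) ** (2 * s) for k, c in enumerate(coeffs, start=1))
    )


# v = phi_1: the L^2 norm is 1, the dot-H^1 norm is pi, the dot-H^2 norm is pi**2.
norms = [dot_h_norm([1.0], s) for s in (0, 1, 2)]
```

For \(s=0\) the expression reduces to the \(L^2(\varOmega )\) norm by orthonormality of the eigenfunctions.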

3 Optimal Control Problem

Let \(\varOmega \) be a bounded domain of \(R^d(1\le d\le 3)\) with sufficiently smooth boundary \(\partial \varOmega \), and \(\varOmega _T=\varOmega \times (0,T), \ \varGamma _T=\partial \varOmega \times (0,T)\). We study the following distributed optimal control problem governed by time fractional diffusion equation

$$\begin{aligned} \min \limits _{q\in U_{ad}}{\mathcal {J}}(u,q):= \frac{1}{2}\Vert u(x,t)-u_d(x,t)\Vert ^2_{L^2(\varOmega _T)} +\frac{\gamma }{2}\Vert q(x,t)\Vert ^2_{L^2(\varOmega _T)} \end{aligned}$$
(3.1)

subject to

$$\begin{aligned} \left\{ \begin{aligned}&\frac{\partial u}{\partial t}-{}_0^R\!\partial _t^{\beta }\varDelta u=f(x,t)+q(x,t),\ \ \ (x,t)\in \varOmega _T, \\&u(x,t)=0, \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (x,t)\in \varGamma _T, \\&u(x,0)=0, \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ x\in \varOmega . \end{aligned}\right. \end{aligned}$$
(3.2)

Here \(U_{ad}\) is the admissible set defined by

$$\begin{aligned} U_{ad}=\{q\in L^{2}(\varOmega _T):\; a\le q(x,t)\le b\ \ \text{ a.e. } \text{ in }\; \varOmega _T \ \ \text{ with }\ \ a, b\in \text{ R }\ \ \text{ and } \ \ a\le b\}. \end{aligned}$$

Since the functional \({\mathcal {J}}\) is strictly convex and the admissible set \(U_{ad}\) is bounded, closed and convex, the control problem (3.1)–(3.2) admits a unique solution.

Assumption 3.1

In the following we suppose that \(\gamma >0\), \(f\in L^2(0,T;L^2(\varOmega ))\) and \(u_d \in L^2(0,T;L^2(\varOmega ))\) are fixed data.

According to [16], the state variable satisfies the following estimate.

Theorem 3.2

Suppose that u is the solution of the state equation with right-hand side f. Then we have

$$\begin{aligned} \begin{aligned} ( u_t,v) _{\varOmega _T} + ( {}_0^R\!\partial _t^{\beta }\nabla u,\nabla v) _{\varOmega _T} =( f,v) _{\varOmega _T}, \forall {v\in L^2(0,T; {\dot{H}}^1(\varOmega ))} \end{aligned} \end{aligned}$$
(3.3)

and

$$\begin{aligned} \Vert u\Vert _{H^1(0,T; L^2(\varOmega ))}+\Vert u\Vert _{H^{(1+\beta )/2}(0,T;{\dot{H}}^1(\varOmega ))} +\Vert u\Vert _{H^{\beta }(0,T;{\dot{H}}^2(\varOmega ))} \le C\Vert f\Vert _{L^2(0,T; L^2(\varOmega ))}. \end{aligned}$$

For the above optimal control problem we can derive the following first order optimality conditions ([27]).

Theorem 3.3

Assume that \(q\in U_{ad}\) is the solution to the optimal control problem (3.1)–(3.2) and u is the corresponding state variable given by (3.2). Then there exists an adjoint state z such that \((u,z,q)\) satisfies the following optimality conditions:

$$\begin{aligned} \left\{ \begin{aligned}&\frac{\partial u}{\partial t}-{}_0^R\!\partial _t^{\beta }\varDelta u=f+q,\ \ \ \ \ \ (x,t)\in \varOmega _T, \\&u(x,t)=0, \ \ \ \ \ \ \ \ \ \ \ \ (x,t)\in \varGamma _T, \\&u(x,0)=0, \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ x\in \varOmega , \end{aligned}\right. \end{aligned}$$
(3.4)
$$\begin{aligned} \left\{ \begin{aligned}&-\frac{\partial z}{\partial t}-{}_t^R\!\partial _T^{\beta }\varDelta {z}={u}-u_d,\ \ \ (x,t)\in \varOmega _T, \\&z(x,t)=0, \ \ \ \ \ \ \ \ \ \ \ (x,t)\in \varGamma _T, \\&z(x,T)=0, \ \ \ \ \ \ \ \ \ \ \ \ \ x \in \varOmega \end{aligned}\right. \end{aligned}$$
(3.5)

and

$$\begin{aligned} \int _{\varOmega _T}(\gamma q+z)(v-q)\ge 0, \ \ \ \forall v\in U_{ad}. \end{aligned}$$
(3.6)

Let

$$\begin{aligned} P_{U_{ad}}(q(x,t))=\max \{a,\min (q(x,t),b)\} \end{aligned}$$

denote the pointwise projection onto the admissible set \(U_{ad}\). Then (3.6) is equivalent to

$$\begin{aligned} q=P_{U_{ad}}\left( -\frac{1}{\gamma }z\right) . \end{aligned}$$
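The projection formula acts pointwise, so on sampled values of the adjoint state it amounts to clipping. The following minimal sketch (the function names and sample values are ours) evaluates \(q=P_{U_{ad}}(-z/\gamma )\) and the integrand \((\gamma q+z)(v-q)\) of (3.6), which should be nonnegative for every admissible value v.

```python
def project(value, a, b):
    """Pointwise projection onto [a, b]: max{a, min(value, b)}."""
    return max(a, min(value, b))


def control_from_adjoint(z_values, gamma, a, b):
    """Evaluate q = P(-z / gamma) at each sample of the adjoint state."""
    return [project(-z / gamma, a, b) for z in z_values]


def vi_integrand(q, z, v, gamma):
    """Pointwise integrand of the variational inequality (3.6)."""
    return (gamma * q + z) * (v - q)


# Illustrative samples (not data from the paper).
z_samples = [-3.0, -0.5, 0.0, 0.4, 2.0]
gamma, a, b = 0.5, -1.0, 1.0
q_samples = control_from_adjoint(z_samples, gamma, a, b)
```

Where the bounds are inactive, \(\gamma q+z=0\); where a bound is active, the sign of \(\gamma q+z\) matches the only admissible direction \(v-q\).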

Similarly to the state equation, the adjoint state variable admits the following estimate ([16]).

Theorem 3.4

Let z be the solution of the adjoint state equation with right-hand side g. Then we have

$$\begin{aligned} \begin{aligned} -( z_t,v) _{\varOmega _T} +( {}_t^R\!\partial _T^{\beta } \nabla z,\nabla v) _{\varOmega _T} =(g,v ) _{\varOmega _T} { ,\forall v\in L^2(0,T; {\dot{H}}^1(\varOmega ))} \end{aligned} \end{aligned}$$
(3.7)

and

$$\begin{aligned} \Vert z\Vert _{H^1(0,T;L^2(\varOmega ))}+\Vert z\Vert _{H^{(1+\beta )/2}(0,T;{\dot{H}}^1(\varOmega ))} +\Vert z\Vert _{H^{\beta }(0,T;{\dot{H}}^2(\varOmega ))} \le C \Vert g\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned}$$

For the optimality system (3.4)–(3.6) we have the following regularity result.

Theorem 3.5

Suppose that \((u,z,q)\) is the solution of the optimality system (3.4)–(3.6). Then we have

$$\begin{aligned} u(x), z(x)\in {H}^{1}(0,T;L^2(\varOmega ))\cap H^{\beta }(0,T;{\dot{H}}^2(\varOmega ))\cap H^{(1+\beta )/2}(0,T;{\dot{H}}^1(\varOmega )) \end{aligned}$$

and

$$\begin{aligned} q(x)\in {H}^{1}(0,T;L^2(\varOmega )). \end{aligned}$$

Proof

Note that \(f\in L^2(0,T;L^2(\varOmega ))\), \(u_d \in L^2(0,T;L^2(\varOmega ))\) and \(q\in U_{ad}\). Then by Theorems 3.2 and 3.4 we have

$$\begin{aligned} u(x), z(x)\in {H}^{1}(0,T;L^2(\varOmega ))\cap H^{\beta }(0,T;{\dot{H}}^2(\varOmega ))\cap H^{(1+\beta )/2}(0,T;{\dot{H}}^1(\varOmega )). \end{aligned}$$

For the projection operator \(P_{U_{ad}}\) the following property holds (see, [15])

$$\begin{aligned} \Vert P_{U_{ad}}(v)\Vert _{W^{s,p}(0,T;L^2(\varOmega ))}\le C\Vert v\Vert _{W^{s,p}(0,T;L^2(\varOmega ))}, \ \ 0\le s\le 1, \ \ 1\le p\le \infty . \end{aligned}$$

Since \(q=P_{U_{ad}}(-z/\gamma )\) and \(z\in H^1(0,T;L^2(\varOmega ))\), this implies that

$$\begin{aligned} q(x)\in {H}^{1}(0,T;L^2(\varOmega )). \end{aligned}$$

\(\square \)

4 Time Stepping Discontinuous Galerkin Discrete Scheme

Let \(V_h\) be the finite element space consisting of continuous piecewise linear functions over the triangulation \(T_h\):

$$\begin{aligned} V_h=\{v_h\in H^1_0(\varOmega )\cap C(\varOmega );\ v_h \ \hbox { is a linear function over}\ K, \forall K\in T_h\}. \end{aligned}$$

Let \(\varDelta _{\tau }:0=t_0<t_1<\cdots<t_{J-1}<t_J=T\) be a time grid with \(\tau =T/J\). Set \(I_j=(t_{j-1},t_j)\) for each \(1\le j\le J\). Define the fully discrete finite element space

$$\begin{aligned} V_{hk}=\{\phi : \varOmega \times [0,T]\rightarrow R;\ \phi (\cdot ,t)\in V_h,\ \phi (x,\cdot )|_{I_j}\in P_0,\ j=1,2,\ldots , J\}. \end{aligned}$$

This implies that \(\phi \) is piecewise constant with respect to time. For \(\phi \in V_{hk}\), we define

$$\begin{aligned} \phi _j^{\pm }=\lim \limits _{t\rightarrow t_j^{\pm }}\phi (t), \end{aligned}$$

and

$$\begin{aligned}{}[[\phi _j]]=\phi _j^+ -\phi _j^- ,\ \ [[\phi _0]]=\phi _0^+ . \end{aligned}$$

Then the time-stepping discontinuous Galerkin approximation of the state equation can be written as

$$\begin{aligned}\begin{aligned} \sum _{j=0}^{J-1}( [[ U_j]],w_j^+) _{\varOmega } +({}_0^R\!\partial _t^{\beta }\nabla U,\nabla w) _{\varOmega _T} =( f+q,w) _{\varOmega _T}, \forall w\in V_{hk}. \end{aligned} \end{aligned}$$

The time-stepping discontinuous Galerkin discrete scheme for the control problem (3.1)–(3.2) is to find \((U,Q)\in V_{hk}\times U_{ad}\), such that

$$\begin{aligned} \min \limits _{ Q\in U_{ad}}{\mathcal {J}}(U,Q) \end{aligned}$$

subject to

$$\begin{aligned} \sum _{j=0}^{J-1}( [[ U_j]],w_j^+) _{\varOmega } +({}_0^R\!\partial _t^{\beta }\nabla U,\nabla w) _{\varOmega _T} =(f+Q,w)_{\varOmega _T},\forall w\in V_{hk}. \end{aligned}$$
(4.1)

Here the control variable is implicitly discretized by the variational discretization concept (see [10]).

In order to obtain the discrete first order optimality conditions for the optimal control problem, we define the Lagrange functional

$$\begin{aligned} {\mathcal {L}}(U,Z,Q)= & {} {\mathcal {J}}(U,Q)+( f+Q,Z)_{\varOmega _T}\nonumber \\&- \sum _{j=0}^{J-1}( [[ U_j]],Z_j^+) _{\varOmega } -({}_0^R\!\partial _t^{\beta }\nabla U,\nabla Z) _{\varOmega _T}, { Z \in V_{hk}}. \end{aligned}$$
(4.2)

Then the discrete first order optimality condition can be deduced by computing

$$\begin{aligned}\begin{aligned} \frac{\partial {\mathcal {L}}(U,Z,Q)}{\partial U}(w)=0, \frac{\partial {\mathcal {L}}(U,Z,Q)}{\partial Z}(w)=0, \frac{\partial {\mathcal {L}}(U,Z,Q)}{ \partial Q}(v-Q)\ge 0. \end{aligned} \end{aligned}$$
  • Discrete adjoint state equation: Note that

    $$\begin{aligned}&\frac{\partial {\mathcal {L}}(U,Z,Q)}{\partial U}(w)\\= & {} \lim \limits _{t\rightarrow 0^+}\frac{1}{t}\left( {\mathcal {L}}(U+tw,Z,Q)-{\mathcal {L}}(U,Z,Q)\right) \\= & {} \lim \limits _{t\rightarrow 0^+}\frac{1}{t}\left( {-} \sum _{j=0}^{J-1}( [[ (U{+}tw)_j]],Z_j^+) _{\varOmega } {-}({}_0^R\!\partial _t^{\beta }\nabla ( U{+}tw),\nabla Z) _{\varOmega _T}+\frac{1}{2}\int _{\varOmega _T}(U+tw-u_d)^2dxdt\right. \\&\qquad \left. + \sum _{j=0}^{J-1}( [[ U_j]],Z_j^+) _{\varOmega } +({}_0^R\!\partial _t^{\beta }\nabla U,\nabla Z)_{\varOmega _T}-\frac{1}{2}\int _{\varOmega _T}(U-u_d)^2dxdt \right) \\= & {} \lim \limits _{t\rightarrow 0^+}\frac{1}{t}\left( - \sum _{j=0}^{J-1}( [[ (tw)_j]],Z_j^+) _{\varOmega } -({}_0^R\!\partial _t^{\beta }\nabla (tw),\nabla Z) _{\varOmega _T} +\frac{1}{2}\int _{\varOmega _T}(2U+tw-2u_d)twdxdt\right) \\= & {} - \sum _{j=0}^{J-1}([[w_j]],Z_j^+) _{\varOmega } -({}_t^R\!\partial _T^{\beta }\nabla Z,\nabla w) _{\varOmega _T} +\int _{\varOmega _T}(U-u_d)wdxdt. \end{aligned}$$

    Then we have

    $$\begin{aligned} \sum _{j=0}^{J-1}( [[w_j]],Z_j^+) _{\varOmega } +({}_t^R\!\partial _T^{\beta }\nabla Z,\nabla w)_{\varOmega _T} =( U-u_d,w) _{\varOmega _T},\forall w\in V_{hk}. \end{aligned}$$
    (4.3)
  • Discrete variational inequality: Note that

    $$\begin{aligned}\begin{aligned} \frac{\partial {\mathcal {L}}(U,Z,Q)}{ \partial Q}(v-Q) =&\lim \limits _{t\rightarrow 0^+}\frac{1}{t}\Big ({\mathcal {L}}(U,Z,Q+t(v-Q))-{\mathcal {L}}(U,Z,Q)\Big )\\ =&\lim \limits _{t\rightarrow 0^+}\frac{1}{t}\left( ( f+Q+t(v-Q),Z)_{\varOmega _T}+\frac{\gamma }{2} \int _{\varOmega _T}(Q+t(v-Q))^2dxdt\right. \\&\left. \qquad -( f+Q,Z)_{\varOmega _T}-\frac{\gamma }{2} \int _{\varOmega _T}Q^2dxdt\right) \\ =&\lim \limits _{t\rightarrow 0^+}\frac{1}{t}\left( ( t(v-Q),Z)_{\varOmega _T}+\frac{\gamma }{2} \int _{\varOmega _T}(2Q+t(v-Q))t(v-Q)dxdt\right) \\ =&\int _{\varOmega _T}( v-Q)Zdxdt+\gamma \int _{\varOmega _T}Q(v-Q)dxdt. \end{aligned} \end{aligned}$$

    This gives

    $$\begin{aligned} \begin{aligned} ( \gamma Q +Z , v-Q)_{\varOmega _T}\ge 0. \end{aligned} \end{aligned}$$
    (4.4)

Then the first order optimality condition for the discrete optimal control problem can be written as

$$\begin{aligned} \left\{ \begin{aligned} \sum \limits _{j=0}^{J-1}( [[ U_j]],w_j^+) _{\varOmega } +({}_0^R\!\partial _t^{\beta }\nabla U,\nabla w)_{\varOmega _T}&=( f+Q,w)_{\varOmega _T}, \forall w\in V_{hk},\\ \sum \limits _{j=0}^{J-1}( [[w_j]],Z_j^+) _{\varOmega } +({}_t^R\!\partial _T^{\beta }\nabla Z,\nabla w)_{\varOmega _T}&=( U-u_d,w) _{\varOmega _T}, \forall w\in V_{hk},\\ \qquad \qquad ( \gamma Q +Z , v-Q)_{\varOmega _T}&\ge 0, \forall v\in U_{ad}. \end{aligned}\right. \end{aligned}$$
(4.5)
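The structure of (4.5) suggests an iteration that alternates a state solve, an adjoint solve and a projected update of the control. The sketch below illustrates this on a toy problem of our own choosing, not on the scheme of this paper: the discrete solution operator S is assumed to be the lower triangular Toeplitz matrix with entries \(\tau \) (the piecewise constant discontinuous Galerkin scheme for \(u'=f+q\), \(u(0)=0\)), and the step size `rho`, the data and all function names are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def state_and_adjoint(S, St, f, q, u_d):
    """Toy analogues of the first two equations of (4.5)."""
    u = matvec(S, [fi + qi for fi, qi in zip(f, q)])      # discrete state
    z = matvec(St, [ui - di for ui, di in zip(u, u_d)])   # discrete adjoint state
    return u, z


def projected_gradient(S, f, u_d, gamma, a, b, rho=1.0, iters=500):
    """Iterate q <- P(q - rho * (gamma * q + z)) for a fixed number of steps."""
    St = transpose(S)
    q = [0.0] * len(f)
    for _ in range(iters):
        _, z = state_and_adjoint(S, St, f, q, u_d)
        q = [max(a, min(qi - rho * (gamma * qi + zi), b)) for qi, zi in zip(q, z)]
    u, z = state_and_adjoint(S, St, f, q, u_d)  # recompute for the final control
    return q, u, z


# Toy data: J = 8 time steps on (0, 1), target state 1, no source term.
J = 8
tau = 1.0 / J
S = [[tau if j <= i else 0.0 for j in range(J)] for i in range(J)]
gamma, a, b = 0.1, 0.0, 2.0
q, u, z = projected_gradient(S, [0.0] * J, [1.0] * J, gamma, a, b)
```

At a fixed point of this iteration the control satisfies the discrete counterpart of the projection formula, \(Q=P_{U_{ad}}(-Z/\gamma )\). The dense products used here cost \(O(J^2)\) operations each, which is the cost a Toeplitz-based fast solver is meant to reduce.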

5 Error Analysis

In this section we derive error estimates for \(\Vert u-U\Vert _{L^2(0,T;L^{2}(\varOmega ))}\), \(\Vert z-Z\Vert _{L^2(0,T;L^{2}(\varOmega ))}\) and \(\Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))}\). Note that the state variable and the control variable are coupled. We first sketch the overall proof strategy.

To decouple the estimate we firstly decompose \(u-U\) and \(z-Z\) as follows

$$\begin{aligned} u-U= & {} u-U(q)+U(q)-U,\\ z-Z= & {} z-Z(u)+Z(u)-Z, \end{aligned}$$

where U(q) and Z(u) are auxiliary variables defined by

$$\begin{aligned} \sum _{j=0}^{J-1}( [[ U(q)_j]],w_j^+) _{\varOmega } +( {}_0^R\!\partial _t^{\beta }\nabla U(q),\nabla w)_{\varOmega _T} =( f+q,w) _{\varOmega _T},\ \ \forall w\in V_{hk} \end{aligned}$$
(5.1)

and

$$\begin{aligned} \sum _{j=0}^{J-1}([[w_j]], Z(u)_j^+) _{\varOmega } +({}_t^R\!\partial _T^{\beta } \nabla Z(u),\nabla w) _{\varOmega _T} =( u-u_d,w) _{\varOmega _T}, \ \ \forall w\in V_{hk}. \end{aligned}$$
(5.2)

In Sects. 5.1 and 5.2 we will prove that \(\Vert u-U\Vert _{L^2(0,T;L^{2}(\varOmega ))}\) and \(\Vert z-Z\Vert _{L^2(0,T;L^{2}(\varOmega ))}\) are controlled by \(\Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))}\). To complete the analysis it therefore suffices to derive the error estimate for \(\Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))}\). By introducing another auxiliary variable Z(q), we obtain this estimate in Sect. 5.3. Combining the results of Sects. 5.1–5.3 then yields the final error estimates for \(\Vert u-U\Vert _{L^2(0,T;L^{2}(\varOmega ))}\), \(\Vert z-Z\Vert _{L^2(0,T;L^{2}(\varOmega ))}\) and \(\Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))}\), i.e., Theorem 5.13.

To this end we need to introduce some interpolation and projection operators. Let X be a Banach space. For \(v\in C((0,T];X)\) and \(v\in C([0,T);X)\), respectively, we define

$$\begin{aligned} (P_{\tau }v)|_{I_j}=v(t_j),\ \ \forall 1\le j\le J, \end{aligned}$$

and

$$\begin{aligned} (G_{\tau }v)|_{I_j}=v(t_{j-1}),\ \ \forall 1\le j\le J. \end{aligned}$$

According to [16], the following estimates hold.

Lemma 5.1

If \(v\in H^{\sigma }(0,T),0\le \alpha<1/2<\sigma <1,\) we have

$$\begin{aligned}&|(I-P_{\tau })v|_{H^{\alpha }(0,T)}\le C\tau ^{\sigma -\alpha }\Vert v\Vert _{ H^{\sigma }(0,T)},\\&|(I-G_{\tau })v|_{H^{\alpha }(0,T)}\le C\tau ^{\sigma -\alpha }\Vert v\Vert _{ H^{\sigma }(0,T)}. \end{aligned}$$

Let \(P_h: L^2(\varOmega )\rightarrow V_h\) denote the \(L^2(\varOmega )\)-orthogonal projection operator defined by

$$\begin{aligned} (P_h\varphi , \chi )=(\varphi , \chi ), \forall \chi \in V_h, \end{aligned}$$

which satisfies the following estimates ([12]): if \(v\in {\dot{H}}^{\sigma }(\varOmega )\) with \(\sigma =1, 2\),

$$\begin{aligned} \parallel v-P_hv \parallel _{L^2(\varOmega )}+h\parallel v-P_hv \parallel _{{\dot{H}}^1(\varOmega )} \le h^{\sigma }\Vert v \Vert _{{\dot{H}}^{\sigma }(\varOmega )}. \end{aligned}$$

The next two lemmas can be found in [22] and will be used in the subsequent error analysis.

Lemma 5.2

Assume that \(v\in L^2(0,T;L^2(\varOmega ))\), and \(v'\in L^2(0,T;L^2(\varOmega ))\), then we have

$$\begin{aligned} ( v',V)_{\varOmega _T} =( v(T),V_J^-)_{\varOmega }-\sum _{i=0}^{J-1}([[V_i]],(G_{\tau }P_hv)_i^+)_{\varOmega }, \\ ( v',V)_{\varOmega \times (0,t_j)} =\sum _{i=0}^{j-1}( [[(P_{\tau }P_hv)_i]],V_i^+)_{\varOmega }-( v(0),V_0^+)_{\varOmega }, \end{aligned}$$

for all \(V\in V_{hk}\), and \(\ 1\le j\le J\).

Lemma 5.3

If \(V\in V_{hk}\), then the following estimate holds

$$\begin{aligned} \frac{1}{2}\left( \Vert V_j^-\Vert ^2_{L^2(\varOmega )}+\Vert V_0^+\Vert ^2_{L^2(\varOmega )}\right) \le \sum _{i=0}^{j-1}( [[V_i]],V_i^+)_{\varOmega }, \end{aligned}$$

for all \(1\le j\le J\) .
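The inequality of Lemma 5.3 can be verified directly in the scalar case, which we use here only as an illustration (the spatial inner product is replaced by a product of numbers, and the function names are ours). If V takes the value `values[i]` on \(I_{i+1}\), then \(V_i^+\) is `values[i]`, \(V_i^-\) is `values[i-1]` and \([[V_0]]=V_0^+\). The elementary identity \((a-b)a=\frac{1}{2}(a^2-b^2+(a-b)^2)\) shows that the jump sum exceeds the lower bound by exactly half the sum of the squared interior jumps.

```python
def jump_sum(values, j):
    """sum_{i=0}^{j-1} [[V_i]] * V_i^+ for a scalar piecewise constant function V."""
    total = values[0] * values[0]  # i = 0 term, since [[V_0]] = V_0^+
    for i in range(1, j):
        total += (values[i] - values[i - 1]) * values[i]
    return total


def lower_bound(values, j):
    """Left-hand side of Lemma 5.3: (|V_j^-|^2 + |V_0^+|^2) / 2."""
    return 0.5 * (values[j - 1] ** 2 + values[0] ** 2)


def half_squared_jumps(values, j):
    """Half the sum of the squared interior jumps [[V_i]], i = 1, ..., j-1."""
    return 0.5 * sum((values[i] - values[i - 1]) ** 2 for i in range(1, j))


# Illustrative values of V on I_1, ..., I_5.
vals = [0.3, -1.2, 2.0, 0.7, -0.4]
gaps = [jump_sum(vals, j) - lower_bound(vals, j) for j in range(1, len(vals) + 1)]
```

Each entry of `gaps` is nonnegative and equals the corresponding value of `half_squared_jumps`.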

5.1 Error Estimate for u-U

It is easy to see that U(q) is the finite element approximation of u. Then according to [16], we can get the following error estimate.

Theorem 5.4

If \(f+q\in L^2(0,T;L^2(\varOmega )) \), then we have

$$\begin{aligned}\begin{aligned} \Vert u-U(q)\Vert _{L^2(0,T;L^2(\varOmega ))}\le C(h^2+\tau )\Vert f+q\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned} \end{aligned}$$

Next we turn to the estimate of \(\Vert U(q)-U\Vert _{L^2(0,T;L^2(\varOmega ))}\).

Theorem 5.5

Assume that U(q) and U are the solutions of Eqs. (5.1) and (4.1), respectively. Then we have

$$\begin{aligned}\begin{aligned} \Vert U(q)-U\Vert _{L^2(0,T;L^2(\varOmega ))} \le C\Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))}. \end{aligned} \end{aligned}$$

Proof

Subtracting (4.1) from (5.1) and choosing \(w=U(q)-U\), we deduce

$$\begin{aligned}&\sum _{j=0}^{J-1}([[(U(q)-U)_j]],(U(q)-U)_j^+)_{\varOmega } +( {}_0^R\!\partial _t^{\beta } \nabla (U(q)-U),\nabla (U(q)-U)) _{\varOmega _T}\\&\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ =(q-Q, U(q)-U)_{\varOmega _T}. \end{aligned}$$

By Lemma 5.3, we deduce

$$\begin{aligned} \frac{1}{2}\Vert (U(q)-U)_J^-\Vert ^2_{L^2(\varOmega )}\le \sum _{j=0}^{J-1} ([[(U(q)-U)_j]],(U(q)-U)_j^+)_{\varOmega }. \end{aligned}$$

By Lemma 2.1, we obtain

$$\begin{aligned} \begin{aligned} ( {}_0^R\!\partial _t^{\beta } \nabla (U(q)-U),\nabla ( U(q)-U)) _{\varOmega _T} =\cos (\frac{\beta }{2}\pi )|U(q)-U|^2_{H^{\beta /2}(0,T;H^1(\varOmega ))}. \end{aligned} \end{aligned}$$

Note that \(0<\beta <1\) and \(U(q)-U\) vanishes on \(\partial \varOmega \). Then we obtain

$$\begin{aligned} |U(q)-U|^2_{H^{\beta /2}(0,T;H^1(\varOmega ))}\sim \parallel U(q)-U\parallel ^2_{H^{\beta /2}(0,T;H^1(\varOmega ))}. \end{aligned}$$

This implies that

$$\begin{aligned} \parallel U(q)-U\parallel _{L^2(0,T;L^2(\varOmega ))}\le C | U(q)-U|_{H^{\beta /2}(0,T;H^1(\varOmega ))}. \end{aligned}$$

Collecting the above inequalities and using Young's inequality leads to

$$\begin{aligned}&\Vert (U(q)-U)_J^-\Vert ^2_{L^2(\varOmega )} +\cos \left( \frac{\beta }{2}\pi \right) |U(q)-U|^2_{H^{\beta /2}(0,T;H^1(\varOmega ))}\\&\quad \le C \Vert q-Q\Vert ^2_{L^2(0,T;L^{2}(\varOmega ))}. \end{aligned}$$

The theorem follows from the above estimate. \(\square \)

By Theorems 5.4 and 5.5, we obtain the estimate of \(\Vert u-U\Vert _{L^2(0,T;L^2(\varOmega ))}\).

Theorem 5.6

Assume that u and U are the solutions of Eqs. (3.4) and (4.1), respectively. Then the following estimate

$$\begin{aligned} \Vert u-U\Vert _{L^2(0,T;L^2(\varOmega ))} \le C (h^2+\tau )\Vert f+q\Vert _{L^2(0,T;L^{2}(\varOmega ))}+ C\Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))} \end{aligned}$$

holds.

5.2 Error Estimates for z-Z

Note that Z(u) is the finite element approximation of z. In the following we prove the estimate of \(\Vert z-Z(u)\Vert _{L^2(0,T;L^2(\varOmega ))}\) by a duality argument. To this end we first prove some auxiliary results.

Lemma 5.7

Assume that z and Z(u) are solutions of (3.5) and (5.2). Then we have

$$\begin{aligned} \Vert (Z(u){-}G_{\tau }P_hz)_j^+\Vert _{L^2(\varOmega )}{+}|Z(u){-}G_{\tau }P_hz|_{H^{\beta /2} (0,t_j;{\dot{H}}^{1}(\varOmega ))} \le C|(I{-}G_{\tau }P_h)z|_{H^{\beta /2}(0,T;{\dot{H}}^{1}(\varOmega ))}. \end{aligned}$$

Proof

Let \(\theta =Z(u)-G_{\tau }P_hz\). Choosing \(v=\theta \chi _{(0,t_j)}\) in (3.7) with \(g=u-u_d\), where \(\chi _{(0,t_j)}\) denotes the characteristic function of \((0,t_j)\), we obtain

$$\begin{aligned} -( z',\theta )_{\varOmega \times (0,t_j)} +( {}_t^R\!\partial _T^{\beta }\nabla z,\nabla \theta )_{\varOmega \times (0,t_j)} =( u-u_d,\theta )_{\varOmega \times (0,t_j)}. \end{aligned}$$

Setting \(w=\theta \chi _{(0,t_j)}\) in (5.2) we obtain

$$\begin{aligned} \sum _{i=0}^{j-1}( [[ \theta _i]],Z(u)_i^+) _{\varOmega } +( {}_t^R\!\partial _T^{\beta }\nabla Z(u),\nabla \theta )_{\varOmega \times (0,t_j)} =( u-u_d,\theta )_{\varOmega \times (0,t_j)}. \end{aligned}$$

Subtracting the above two formulas leads to

$$\begin{aligned} -( z',\theta )_{\varOmega \times (0,t_j)} +( {}_t^R\!\partial _T^{\beta }\nabla (z-Z(u)),\nabla \theta )_{\varOmega \times (0,t_j)} -\sum _{i=0}^{j-1}([[ \theta _i]],Z(u)_i^+)_{\varOmega }=0. \end{aligned}$$

By Lemma 5.2 and \(z(T)=0\) we have

$$\begin{aligned} -( z',\theta )_{\varOmega \times (0,t_j)} =\sum _{i=0}^{j-1}([[\theta _i]],(G_{\tau }P_hz)_i^+)_{\varOmega }. \end{aligned}$$

A simple calculation then yields

$$\begin{aligned} ( {}_t^R\!\partial _T^{\beta }\nabla (z-Z(u)),\nabla \theta )_{\varOmega \times (0,t_j)} -\sum _{i=0}^{j-1}([[\theta _i]],\theta _i^+)_{\varOmega }=0. \end{aligned}$$

Note that \(z-Z(u)=z-G_{\tau }P_hz+G_{\tau }P_hz-Z(u)\). Then we obtain

$$\begin{aligned} ({}_t^R\!\partial _T^{\beta }\nabla (z-G_{\tau }P_hz),\nabla \theta )_{\varOmega \times (0,t_j)}=({}_t^R\!\partial _T^{\beta } \nabla \theta ,\nabla \theta )_{\varOmega \times (0,t_j)} +\sum _{i=0}^{j-1}( [[\theta _i]],\theta _i^+)_{\varOmega }. \end{aligned}$$

According to Lemma 5.3, we have

$$\begin{aligned} \frac{1}{2}(\Vert \theta _j^-\Vert ^2_{L^2(\varOmega )}+\Vert \theta _0^+\Vert ^2_{L^2(\varOmega )}) \le \sum _{i=0}^{j-1}( [[\theta _i]],\theta _i^+)_{\varOmega }. \end{aligned}$$

This implies

$$\begin{aligned}&\frac{1}{2}\left( \Vert \theta _j^-\Vert ^2_{L^2(\varOmega )}+\Vert \theta _0^+\Vert ^2_{L^2(\varOmega )}\right) + |\theta |^2_{H^{\beta /2}(0,t_j;{\dot{H}}^{1}(\varOmega ))} \\&\qquad \le C |z-G_{\tau }P_hz|_{H^{\beta /2}(0,T;{\dot{H}}^{1}(\varOmega ))} |\theta |_{H^{\beta /2}(0,t_j;{\dot{H}}^{1}(\varOmega ))}. \end{aligned}$$

By Young's inequality, the lemma follows. \(\square \)

Lemma 5.8

Assume that z and Z(u) are solutions of (3.5) and (5.2). Then we have

$$\begin{aligned} \begin{aligned}&\max _{0\le j\le J-1}\Vert z(t_j)-Z(u)_j^+\Vert _{L^2(\varOmega )}+|z-Z(u)|_{H^{\beta /2}(0,T;{\dot{H}}^{1}(\varOmega ))}\\&~~\qquad \le C(h+\tau ^{1/2})\Vert u-u_d\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned} \end{aligned}$$
(5.3)

Proof

By the estimate of \(L^2\) projection and the definition of \(G_{\tau }\), we have

$$\begin{aligned}&\quad \Vert z(t_j)-(G_{\tau }P_hz)_j^+\Vert _{L^2(\varOmega )}= \Vert (I-P_h)z(t_j)\Vert _{L^2(\varOmega )}\nonumber \\&\le h\Vert z(t_j)\Vert _{{\dot{H}}^1(\varOmega )}\le h\Vert z\Vert _{H^{(1+\beta )/2} (0,T;{\dot{H}}^1(\varOmega ))}\nonumber \\&\le h\Vert u-u_d\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned}$$
(5.4)

Let \(z-G_{\tau }P_hz=z-P_hz+P_hz-G_{\tau }P_hz\). Using Lemma 5.1 gives

$$\begin{aligned}&|z-G_{\tau }P_hz|_{H^{\beta / 2}(0,T;{\dot{H}}^{1}(\varOmega ))}\nonumber \\&\le |z-P_hz|_{H^{\beta / 2}(0,T;{\dot{H}}^{1}(\varOmega ))}+ |(I-G_{\tau })P_hz|_{H^{\beta / 2}(0,T;{\dot{H}}^{1}(\varOmega ))}\nonumber \\&\le h|z|_{H^{\beta / 2}(0,T;{\dot{H}}^2(\varOmega ))} +|(I-G_{\tau })z|_{H^{\beta / 2}(0,T;{\dot{H}}^{1}(\varOmega ))}\nonumber \\&\le h|z|_{H^{\beta / 2}(0,T;{\dot{H}}^2(\varOmega ))} +\tau ^{1/2}|z|_{H^{(1+\beta )/ 2}(0,T;{\dot{H}}^{1}(\varOmega ))}\nonumber \\&\le C(h+\tau ^{1/2})\Vert u-u_d\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned}$$
(5.5)

Combining (5.4) and (5.5) as well as Lemma  5.7 leads to

$$\begin{aligned}\begin{aligned}&\Vert z(t_j)-Z(u)_j^+\Vert _{L^2(\varOmega )}+|z-Z(u)|_{H^{\beta /2} (0,T;{\dot{H}}^{1}(\varOmega ))}\\&\quad \le \Vert z(t_j)-(G_{\tau }P_hz)_j^+\Vert _{L^2(\varOmega )} +\Vert (G_{\tau }P_hz)_j^+-Z(u)_j^+\Vert _{L^2(\varOmega )}\\&\quad \quad +|z-G_{\tau }P_hz|_{H^{\beta /2}(0,T;{\dot{H}}^{1}(\varOmega ))} +|G_{\tau }P_hz-Z(u)|_{H^{\beta /2}(0,T;{\dot{H}}^{1}(\varOmega ))}\\&\quad \le \Vert z(t_j)-(G_{\tau }P_hz)_j^+\Vert _{L^2(\varOmega )} +|z-G_{\tau }P_hz|_{H^{\beta /2}(0,T;{\dot{H}}^{1}(\varOmega ))}\\&\quad \le C(h+\tau ^{1/2})\Vert u-u_d\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned} \end{aligned}$$

\(\square \)

Theorem 5.9

Let z and Z(u) be the solutions of (3.5) and (5.2), respectively. Then the following estimate holds

$$\begin{aligned} \qquad \Vert z-Z(u)\Vert _{L^2(0,T;L^{2}(\varOmega ))} \le C(h^2+\tau )\Vert u-u_d\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned}$$

Proof

To derive the estimate of \(\Vert z-Z(u)\Vert _{L^2(0,T;L^{2}(\varOmega ))}\), we introduce the following auxiliary problem:

$$\begin{aligned} \left\{ \begin{aligned}&( y',w) _{\varOmega _T} +({}_0^R\!\partial _t^{\beta }\nabla y,\nabla w) _{\varOmega _T} =( z-Z(u),w) _{\varOmega _T}, w\in L^2(0,T;\dot{H}^1(\varOmega )),\\&y(\cdot ,0)=0. \end{aligned}\right. \end{aligned}$$

It is easy to see that y satisfies the stability estimate presented in Theorem 3.2.

Choosing \(w=z-Z(u)\) yields

$$\begin{aligned} \Vert z-Z(u)\Vert ^2_{L^2(0,T;L^{2}(\varOmega ))}=( y',z-Z(u)) _{\varOmega _T} +( {}_0^R\!\partial _t^{\beta } \nabla y,\nabla (z-Z(u))) _{\varOmega _T}. \end{aligned}$$

Using integration by parts we obtain

$$\begin{aligned} \Vert z-Z(u)\Vert ^2_{L^2(0,T;L^{2}(\varOmega ))}=-( z',y) _{\varOmega _T} -( y',Z(u)) _{\varOmega _T}+( {}_t^R\!\partial _T^{\beta }\nabla (z-Z(u)) ,\nabla y ) _{\varOmega _T}. \end{aligned}$$

Choosing \(w=Y=P_{\tau }P_hy\), (5.2) reduces to

$$\begin{aligned} \qquad \sum _{j=0}^{J-1}( [[ Y_j]],Z(u)_j^+) _{\varOmega }+ ( {}_t^R\!\partial _T^{\beta } \nabla Z(u),\nabla Y) _{\varOmega _T}=( u-u_d,Y) _{\varOmega _T}. \end{aligned}$$

Taking \(v=Y\) in (3.7) with \(g=u-u_d\), we have

$$\begin{aligned} -( z',Y) _{\varOmega _T}+ ( {}_t^R\!\partial _T^{\beta }\nabla z,\nabla Y) _{\varOmega _T} =( u-u_d,Y) _{\varOmega _T}. \end{aligned}$$

Subtracting the above two equations leads to

$$\begin{aligned} ( {}_t^R\!\partial _T^{\beta }\nabla (z-Z(u)),\nabla Y) _{\varOmega _T}= ( z',Y) _{\varOmega _T}+\sum _{j=0}^{J-1}( [[ Y_j]],Z(u)_j^+)_{\varOmega }. \end{aligned}$$

Note that

$$\begin{aligned}\begin{aligned}&( {}_t^R\!\partial _T^{\beta }\nabla (z-Z(u)),\nabla y) _{\varOmega _T}\\&=( {}_t^R\!\partial _T^{\beta } \nabla (z-Z(u)),\nabla (y-Y)) _{\varOmega _T} +( {}_t^R\!\partial _T^{\beta }\nabla (z-Z(u)),\nabla Y) _{\varOmega _T}\\&=( {}_t^R\!\partial _T^{\beta } \nabla (z-Z(u)),\nabla (y-Y)) _{\varOmega _T} +( z',Y) _{\varOmega _T}+\sum _{j=0}^{J-1}( [[ Y_j]],Z(u)_j^+) _{\varOmega }. \end{aligned} \end{aligned}$$

By Lemma 5.2 and \(y(\cdot ,0)=0\), we have

$$\begin{aligned} \sum \limits _{j=0}^{J-1}( [[ Y_j]],Z(u)_j^+) _{\varOmega } =( y',Z(u)) _{\varOmega _T}. \end{aligned}$$

Therefore we further have

$$\begin{aligned} \begin{aligned}&\Vert z-Z(u)\Vert ^2_{L^2(0,T;L^{2}(\varOmega ))} =( {}_t^R\!\partial _T^{\beta } \nabla (z-Z(u)),\nabla ( y-Y)) _{\varOmega _T} -(z',y-Y) _{\varOmega _T}\\&\le |z-Z(u)|_{H^{\beta /2}(0,T;{\dot{H}}^{1}(\varOmega ))}|y-Y|_{H^{\beta /2}(0,T;{\dot{H}}^{1}(\varOmega ))}\\&\quad +\Vert z'\Vert _{L^2(0,T;L^2(\varOmega ))}\Vert y-Y\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned} \end{aligned}$$
(5.6)

Using a proof similar to that of Lemma 5.8, we have

$$\begin{aligned} \begin{aligned} |y-Y|_{H^{\beta /2}(0,T;{\dot{H}}^{1}(\varOmega ))}\le C(h+\tau ^{1/2})\Vert z-Z(u)\Vert _{L^2(0,T;L^2(\varOmega ))} \end{aligned} \end{aligned}$$
(5.7)

and, according to [5], we have

$$\begin{aligned} \begin{aligned} \Vert y-Y\Vert _{L^2(0,T;L^2(\varOmega ))}\le C(h^2+\tau )\Vert z-Z(u)\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned} \end{aligned}$$
(5.8)

Inserting (5.3), (5.7) and (5.8) into (5.6) yields the desired estimate. \(\square \)

Now it remains to estimate \(\parallel Z(u)-Z\parallel _{L^2(0,T;L^2(\varOmega ))}\).

Theorem 5.10

Let Z(u) and Z be the solutions of (5.2) and (4.3), respectively. Then the following estimate holds

$$\begin{aligned}\begin{aligned} \parallel Z(u)-Z\parallel _{L^2(0,T;L^2(\varOmega ))} \le C\Vert u-U\Vert _{L^2(0,T;L^{2}(\varOmega ))}. \end{aligned} \end{aligned}$$

Proof

By (4.3) and (5.2), we obtain

$$\begin{aligned} \sum _{j=0}^{J-1}( [[w_j]],(Z(u)-Z)_j^+) _{\varOmega } +( {}_t^R\!\partial _T^{\beta }\nabla (Z(u)-Z),\nabla w) _{\varOmega _T} =( u-U,w)_{\varOmega _T}. \end{aligned}$$

Setting \(w=Z(u)-Z\) and using Lemma 5.3, we have

$$\begin{aligned} \begin{aligned} \frac{1}{2}\Vert (Z(u)-Z)_0^+\Vert ^2_{L^2(\varOmega )}\le \sum _{j=0}^{J-1} ( [[(Z(u)-Z)_j]],(Z(u)-Z)_j^+)_{\varOmega }. \end{aligned} \end{aligned}$$
(5.9)

By Lemma 2.1, we have

$$\begin{aligned} ( {}_t^R\!\partial _T^{\beta }\nabla (Z(u)-Z),\nabla (Z(u)-Z))_{\varOmega _T} =\cos \left( \frac{\beta }{2}\pi \right) |Z(u)-Z|^2_{H^{\beta /2}(0,T;H^1(\varOmega ))}. \end{aligned}$$

Note that

$$\begin{aligned} \begin{aligned} \parallel Z(u)-Z\parallel _{L^2(0,T;L^2(\varOmega ))}\le C| Z(u)-Z|_{H^{\beta /2}(0,T;H^1(\varOmega ))}. \end{aligned} \end{aligned}$$
(5.10)

Analogously to the proof of Theorem 5.5, using (5.9) and (5.10) together with Young's inequality yields

$$\begin{aligned}\begin{aligned}&\Vert (Z(u)-Z)_0^+\Vert ^2_{L^2(\varOmega )} +\cos \left( \frac{\beta }{2}\pi \right) |Z(u)-Z|^2_{H^{\beta /2}(0,T;H^1(\varOmega ))}\\&\quad \le C\Vert u-U\Vert ^2_{L^2(0,T;L^{2}(\varOmega ))}. \end{aligned} \end{aligned}$$

Again by (5.10) we obtain

$$\begin{aligned}\begin{aligned} \Vert (Z(u)-Z)_0^+\Vert ^2_{L^2(\varOmega )}+\parallel Z(u)-Z\parallel ^2 _{L^2(0,T;L^2(\varOmega ))} \le C\Vert u-U\Vert ^2_{L^2(0,T;L^{2}(\varOmega ))}. \end{aligned} \end{aligned}$$

\(\square \)

Combining Theorems 5.9 and 5.10 yields the error estimate of the adjoint state variable \(\Vert z-Z\Vert _{L^2(0,T;L^2(\varOmega ))}\).

Theorem 5.11

Let z and Z be the solutions of (3.7) and (4.3), respectively. Then we have

$$\begin{aligned} \begin{aligned} \Vert z-Z\Vert _{L^2(0,T;L^2(\varOmega ))} \le C(h^2+\tau )\Vert u-u_d\Vert _{L^2(0,T;L^2(\varOmega ))}+ C\Vert u-U\Vert _{L^2(0,T;L^{2}(\varOmega ))}. \end{aligned} \end{aligned}$$

5.3 Error Estimates for q-Q

Note that the estimates of \(u-U\) and \(z-Z\) depend on \(q-Q\). In the following analysis we will derive the estimate of \(q-Q\) to finish the whole error analysis.

Theorem 5.12

Let q and Q be the solutions of problems (3.6) and (4.4), respectively. Then the following estimate holds

$$\begin{aligned} \Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))}\le C( h^2+\tau ). \end{aligned}$$

Proof

It follows from (3.6) and (4.4) that

$$\begin{aligned}\begin{aligned} \gamma \Vert q-Q\Vert ^2_{L^2(0,T;L^{2}(\varOmega ))}=&\int _{\varOmega _T}\gamma q(q-Q)-\int _{\varOmega _T}\gamma Q(q-Q)\\ \le&\int _{\varOmega _T}z(Q-q)+\int _{\varOmega _T}Z(q-Q)\\ =&\int _{\varOmega _T}(z-Z(q))(Q-q)+\int _{\varOmega _T}(Z(q)-Z)(Q-q). \end{aligned} \end{aligned}$$

Here Z(q) is an auxiliary variable defined by

$$\begin{aligned} \sum _{j=0}^{J-1}( [[w_j]],Z(q)_j^+) _{\varOmega } +( {}_t^R\!\partial _T^{\beta }\nabla Z(q),\nabla w) _{\varOmega _T} =( U(q)-u_d,w) _{\varOmega _T},\forall w\in V_{hk}.\ \ \ \end{aligned}$$
(5.11)

By (4.1) and (5.1), we have

$$\begin{aligned} (Q-q,w)_{\varOmega _T}=\sum _{j=0}^{J-1}( [[(U-U(q))_j]],w_j^+) _{\varOmega } +({}_0^R\!\partial _t^{\beta } \nabla (U-U(q)),\nabla w) _{\varOmega _T}. \end{aligned}$$

Choosing \(w=Z(q)-Z\), and using integration by parts yields

$$\begin{aligned}\begin{aligned}&\int _{\varOmega _T}(Q-q)(Z(q)-Z)\\&\quad =\sum _{j=0}^{J-1}( [[(U-U(q))_j]],(Z(q)-Z)_j^+) _{\varOmega } +( {}_0^R\!\partial _t^{\beta }\nabla (U-U(q)),\nabla (Z(q)-Z)) _{\varOmega _T}\\&\quad =\sum _{j=0}^{J-1}( [[(U-U(q))_j]],(Z(q)-Z)_j^+) _{\varOmega } +({}_t^R\!\partial _T^{\beta }\nabla (Z(q)-Z),\nabla (U-U(q)) ) _{\varOmega _T}. \end{aligned} \end{aligned}$$

By (4.3) and (5.11), we obtain

$$\begin{aligned}\begin{aligned} \sum _{j=0}^{J-1}([[w_j]],(Z(q)-Z)_j^+) _{\varOmega } +( {}_t^R\!\partial _T^{\beta } \nabla (Z(q)-Z),\nabla w) _{\varOmega _T} =( U(q)-U,w) _{\varOmega _T}. \end{aligned} \end{aligned}$$

Setting \(w= U- U(q)\) gives

$$\begin{aligned} \int _{\varOmega _T}(Q-q)(Z(q)-Z)=-\int _{\varOmega _T}( U(q)-U)^2\le 0. \end{aligned}$$

Then we deduce

$$\begin{aligned}\begin{aligned} \gamma \Vert q-Q\Vert ^2_{L^2(0,T;L^{2}(\varOmega ))}&\le \int _{\varOmega _T}(z-Z(q))(Q-q)\\&\le \Vert z-Z(q)\Vert _{L^2(0,T;L^{2}(\varOmega ))}\Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))}. \end{aligned} \end{aligned}$$

Further we have

$$\begin{aligned} \Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))}\le C \Vert z-Z(q)\Vert _{L^2(0,T;L^{2}(\varOmega ))}. \end{aligned}$$

Now it remains to estimate \(\Vert z-Z(q)\Vert _{L^2(0,T;L^{2}(\varOmega ))}\). Note that \(z-Z(q)=z-Z(u)+Z(u)-Z(q)\), thus we only need to estimate \(\Vert Z(u)-Z(q)\Vert _{L^2(0,T;L^{2}(\varOmega ))}\). By (5.2) and (5.11), we have

$$\begin{aligned} \sum _{j=0}^{J-1}( [[w_j]],(Z(u)-Z(q))_j^+) _{\varOmega } +({}_t^R\!\partial _T^{\beta } \nabla (Z(u)-Z(q)),\nabla w) _{\varOmega _T} =( u-U(q),w) _{\varOmega _T}. \end{aligned}$$

Setting \(w=Z(u)-Z(q)\) and according to Lemma 5.3, we have

$$\begin{aligned} \frac{1}{2}\Vert (Z(u)-Z(q))_0^+\Vert ^2_{L^2(\varOmega )}\le \sum _{j=0}^{J-1}( [[(Z(u)-Z(q))_j]],(Z(u)-Z(q))_j^+)_{\varOmega }. \end{aligned}$$

By Lemma 2.1, we have

$$\begin{aligned} ( {}_t^R\!\partial _T^{\beta }\nabla (Z(u)-Z(q)),\nabla (Z(u)-Z(q))) _{\varOmega _T} =\cos (\frac{\beta }{2}\pi ) |Z(u)-Z(q) |^2_{H^{\beta /2}(0,T;H^1(\varOmega ))}. \end{aligned}$$

Collecting the above equations leads to

$$\begin{aligned} \begin{aligned}&\frac{1}{2}\Vert (Z(u)-Z(q))_0^+\Vert ^2_{L^2(\varOmega )} +\cos (\frac{\beta }{2}\pi ) |Z(u)-Z(q) |^2_{H^{\beta /2}(0,T;H^1(\varOmega ))}\\&\quad \le C\Vert u-U(q)\Vert _{L^2(0,T;L^{2}(\varOmega ))}\Vert Z(u)-Z(q)\Vert _{L^2(0,T;L^2(\varOmega ))}. \end{aligned} \end{aligned}$$

Since

$$\begin{aligned} \Vert Z(u)-Z(q)\Vert _{L^2(0,T;L^2(\varOmega ))}\le C|Z(u)-Z(q) |_{H^{\beta /2}(0,T;H^1(\varOmega ))}, \end{aligned}$$

by Young's inequality we obtain

$$\begin{aligned} \begin{aligned} \Vert (Z(u)-Z(q))_0^+\Vert ^2_{L^2(\varOmega )}+\Vert Z(u)-Z(q)\Vert ^2_{L^2(0,T;L^2(\varOmega ))} \le C\Vert u-U(q)\Vert ^2_{L^2(0,T;L^{2}(\varOmega ))}. \end{aligned} \end{aligned}$$
(5.12)

By Theorems 5.4 and 5.9 together with (5.12), we obtain the error estimate of the control variable

$$\begin{aligned}\begin{aligned} \Vert q-Q\Vert _{L^2(0,T;L^2(\varOmega ))}&\le C \Vert z-Z(q)\Vert _{L^2(0,T;L^{2}(\varOmega ))}\\&\le C (\Vert z-Z(u)\Vert _{L^2(0,T;L^{2}(\varOmega ))}+\Vert Z(u)-Z(q)\Vert _{L^2(0,T;L^{2}(\varOmega ))})\\&\le C(h^2+\tau )\Vert u-u_d\Vert _{L^2(0,T;L^2(\varOmega ))}+ C\Vert u-U(q)\Vert _{L^2(0,T;L^{2}(\varOmega ))}\\&\le C(h^2+\tau ). \end{aligned} \end{aligned}$$

\(\square \)

5.4 Main Result

Combining Theorems 5.6, 5.11 and 5.12, we obtain the following main result.

Theorem 5.13

Let \((u,z,q)\) and \((U,Z,Q)\) be the solutions of (3.4)–(3.6) and (4.5), respectively. Then the following error estimate holds

$$\begin{aligned} \parallel u-U\parallel _{L^2(0,T;L^2(\varOmega ))} +\parallel z-Z\parallel _{L^2(0,T;L^2(\varOmega ))} +\parallel q-Q\parallel _{L^2(0,T;L^2(\varOmega ))} \le C( h^2+\tau ). \end{aligned}$$

Proof

By Theorems 5.6 and 5.12, we obtain

$$\begin{aligned}\begin{aligned} \Vert u-U\Vert _{L^2(0,T;L^2(\varOmega ))}&\le C(h^2+\tau )\Vert f+q\Vert _{L^2(0,T;L^{2}(\varOmega ))}+ C\Vert q-Q\Vert _{L^2(0,T;L^{2}(\varOmega ))}\\&\le C(h^2+\tau ). \end{aligned} \end{aligned}$$

Hence Theorem 5.11 implies

$$\begin{aligned}\begin{aligned} \Vert z-Z\Vert _{L^2(0,T;L^2(\varOmega ))}&\le C(h^2+\tau )\Vert u-u_d\Vert _{L^2(0,T;L^2(\varOmega ))}+ C\Vert u-U\Vert _{L^2(0,T;L^{2}(\varOmega ))}\\&\le C(h^2+\tau ). \end{aligned} \end{aligned}$$

\(\square \)

6 Fast Algorithm

In this section, a fast gradient projection algorithm is designed to solve the optimal control problem. In the following \(F_J\) and \(F_J^*\) denote the Fourier matrix and the inverse Fourier matrix. The symbol \(\otimes \) denotes the Kronecker tensor product.

6.1 Gradient Projection Algorithm

The following gradient projection algorithm is used to solve the discrete optimal control problem.

[Algorithm 1: gradient projection algorithm]

In the above algorithm the discrete optimality system is solved iteratively. The main computational cost comes from solving the discrete state and adjoint state equations, both of which can be solved in a time-marching fashion. When the numbers of time steps and spatial unknowns are large, the computational cost becomes very high because of the nonlocal nature of the time fractional derivative. Therefore, a fast algorithm is necessary.
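As a rough illustration of the iteration just described, the following Python sketch implements a generic projected gradient loop for a box-constrained control. Since Algorithm 1 is not reproduced in the text, the update \(Q\leftarrow P_{U_{ad}}(Q-\rho (\gamma Q+Z))\), the fixed step size `rho`, the stopping tolerance `eta`, the bounds `lo` and `hi`, and the callables `solve_state` and `solve_adjoint` are all assumptions made here for illustration.

```python
import numpy as np

def gradient_projection(solve_state, solve_adjoint, q0, gamma, lo, hi,
                        rho=0.5, eta=1e-5, max_iter=500):
    """Projected gradient iteration for a box-constrained control (sketch).

    solve_state(q)   -> discrete state U for a control q   (hypothetical callable)
    solve_adjoint(U) -> discrete adjoint Z for a state U   (hypothetical callable)
    """
    q = np.clip(q0, lo, hi)
    for it in range(1, max_iter + 1):
        U = solve_state(q)
        Z = solve_adjoint(U)
        grad = gamma * q + Z                      # reduced gradient gamma*Q + Z
        q_new = np.clip(q - rho * grad, lo, hi)   # projection onto the box [lo, hi]
        if np.linalg.norm(q_new - q) <= eta:      # stop when the update stagnates
            return q_new, it
        q = q_new
    return q, max_iter


# Toy usage: the "state equation" is U = q and the "adjoint equation" is
# Z = U - u_d, so the minimizer is clip(u_d / (1 + gamma), lo, hi).
u_d = np.array([-2.0, -0.6, 0.4])
q_opt, n_iter = gradient_projection(lambda q: q, lambda U: U - u_d,
                                    q0=np.zeros(3), gamma=1.0, lo=-0.5, hi=-0.1)
```

In the actual scheme each call to `solve_state` and `solve_adjoint` is a full space-time solve, which is why the cost of these two solves dominates the iteration.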

6.2 The Coefficient Matrix of the Optimal Control Problem

In this part we investigate the structure of the coefficient matrices of the discrete state and adjoint state equations. Let \(V_h\) be the finite element space consisting of continuous piecewise linear functions

$$\begin{aligned} V_h= \mathrm{span}\{\varphi _{1}, \varphi _{2},\ldots , \varphi _M\}. \end{aligned}$$

According to [27], by choosing w to vanish outside \(I_j\) the discrete first order optimality system is reduced to

$$\begin{aligned}\left\{ \begin{aligned} (U^j-U^{j-1},w_h)+\sum \limits _{k=1}^j C_{j,k}(\nabla U^k,\nabla w_h)&=\tau ({\bar{f}}^j+Q^j,w_h),\ \ j=1,2,\ldots ,J,\\ ( Z^{j-1}-Z^j,w_h) +\sum \limits _{k=j}^J C_{k,j}(\nabla Z^{k-1},\nabla w_h)&=\tau ( U^j-{\bar{u}}_d^j,w_h),\ \ j=J,J-1,\ldots ,1 ,\\ ( \gamma Q^j+ Z^{j-1} ,v_h-Q^j)&\ge 0,\ \ \forall v_h\in U_{ad},\ \ j=1,2,\ldots ,J,\\ U^0=0,Z^J=0.\\ \end{aligned}\right. \end{aligned}$$

Here \(\displaystyle {{\bar{f}}^j=\frac{1}{\tau }\int _{I_j} fdt},\ \ {\bar{u}}^j_d=\frac{1}{\tau }\int _{I_j}u_ddt\) and \(C_{j,k}\) are defined as follows

$$\begin{aligned} C_{j,k}=\left\{ \begin{aligned}&\frac{\tau ^{1-\beta }}{\varGamma (2-\beta )}\Big ((j{-}k{+}1)^{1-\beta }{-}2(j-k)^{1-\beta } {+}(j{-}k{-}1)^{1{-}\beta }\Big ),k=1,2,\ldots ,j-1,\\&\frac{\tau ^{1-\beta }}{\varGamma (2-\beta )},\qquad \qquad \qquad k=j. \end{aligned}\right. \end{aligned}$$

We can observe that the time-stepping discontinuous Galerkin discrete scheme can be viewed as a modified backward Euler scheme, since the discrete state and adjoint state are piecewise constant in time.
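Because the weights \(C_{j,k}\) depend on j and k only through the difference \(j-k\), they can be tabulated once for a uniform time grid. The following Python sketch does this; the function name `dg_weights` and the zero-based storage `c[m]` with \(m=j-k\) are conventions chosen here for illustration, not notation from the paper.

```python
import math
import numpy as np

def dg_weights(J, tau, beta):
    """Return c with c[m] = C_{j,k} for m = j - k, m = 0, ..., J-1."""
    scale = tau ** (1.0 - beta) / math.gamma(2.0 - beta)
    m = np.arange(1, J, dtype=float)
    c = np.empty(J)
    c[0] = scale                                   # diagonal weight, k = j
    c[1:] = scale * ((m + 1) ** (1 - beta)         # second difference of m^(1-beta)
                     - 2 * m ** (1 - beta)
                     + (m - 1) ** (1 - beta))
    return c

c = dg_weights(J=8, tau=1.0 / 8, beta=0.4)
```

The off-diagonal weights are second differences of the concave function \(m^{1-\beta }\), so they are negative, and the sum of the first \(n+1\) weights telescopes to \(\frac{\tau ^{1-\beta }}{\varGamma (2-\beta )}\big ((n+1)^{1-\beta }-n^{1-\beta }\big )\), which gives a quick sanity check.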

We denote the mass matrix and the stiffness matrix by \( {\mathcal {M}}\) and \({\mathcal {S}}\), respectively, whose entries are \( {\mathcal {M}}_{ij}=(\varphi _i,\varphi _j)\) and \({\mathcal {S}}_{ij}=(\nabla \varphi _i,\nabla \varphi _j)\), \(i,j=1,2,\ldots , M\). Note that

$$\begin{aligned}\begin{aligned}&C_{11}(\nabla U^1,\nabla w_h)+(U^1,w_h)=\tau ({\bar{f}}^1+Q^1,w_h),\\&C_{21}(\nabla U^1,\nabla w_h)-(U^1,w_h)+C_{22}(\nabla U^2,\nabla w_h) +(U^2,w_h)=\tau ({\bar{f}}^2+Q^2,w_h),\\&C_{31}(\nabla U^1,\nabla w_h){+}C_{32}(\nabla U^2,\nabla w_h){-}(U^2,w_h)+C_{33}(\nabla U^3,\nabla w_h)+(U^3,w_h) =\tau ({\bar{f}}^3+Q^3,w_h),\\&\quad \quad \vdots \\&C_{J1}(\nabla U^1,\nabla w_h)+C_{J2}(\nabla U^2,\nabla w_h)+\cdots +C_{J,J-1}(\nabla U^{J-1},\nabla w_h)-(U^{J-1},w_h)\\&\qquad \qquad +C_{JJ}(\nabla U^J,\nabla w_h)+(U^J,w_h) =\tau ({\bar{f}}^J+Q^J,w_h).\\ \end{aligned} \end{aligned}$$

Then the discrete state equation can be rewritten as follows

$$\begin{aligned} \begin{aligned} A {\mathbb {U}} =\mathbb {b}. \end{aligned} \end{aligned}$$
(6.1)

Here

$$\begin{aligned}\begin{aligned} {\mathbb {U}}&=(\mathbf{U }^1,\mathbf{U }^2,\ldots ,\mathbf{U }^J)^T,\\ \mathbb {b}&=(\mathbf{b }^1,\mathbf{b }^2,\ldots ,\mathbf{b }^J)^T, \\ \end{aligned}\end{aligned}$$

with \(\mathbf{b }^j=(\tau ({\bar{f}}^j+Q^j,\varphi _i))_{M\times 1},j=1, \ldots ,J,i=1,2,\ldots ,M\) and \( \mathbf{U }^j\in R^{M}\). Since the time subdivision is uniform, by the definition of \(C_{jk}\) we have

$$\begin{aligned}\left\{ \begin{aligned}&C_{11}=C_{22}=\cdots =C_{JJ},\\&C_{21}=C_{32}=\cdots =C_{J,J-1},\\&C_{31}=C_{42}=\cdots =C_{J,J-2},\\&\qquad \vdots&\\&C_{J-2,1}=C_{J-1,2}=C_{J,3},\\&C_{J-1,1}=C_{J,2}. \end{aligned}\right. \end{aligned}$$

For convenience, we define \(C_1=C_{11},C_2=C_{21},\ldots ,C_J=C_{J1}.\) Then the coefficient matrix A takes the following form

$$\begin{aligned} A= \left( \begin{array}{cccccc} {\mathcal {M}}+C_1{\mathcal {S}} &{} &{} &{} &{} &{} \\ C_2{\mathcal {S}}- {\mathcal {M}} &{} {\mathcal {M}}+C_1{\mathcal {S}} &{} &{} &{} &{} \\ C_3 {\mathcal {S}} &{} C_2{\mathcal {S}}- {\mathcal {M}} &{} {\mathcal {M}}+C_1{\mathcal {S}} &{} &{} &{} \\ \vdots &{}\vdots &{} \vdots &{} \ddots &{} &{}\\ C_{J-1}{\mathcal {S}} &{} C_{J-2}{\mathcal {S}} &{} C_{J-3}{\mathcal {S}} &{} \cdots &{} {\mathcal {M}}+C_1{\mathcal {S}} &{} \\ C_{J} {\mathcal {S}}&{}C_{J-1}{\mathcal {S}} &{}C_{J-2} {\mathcal {S}}&{} \cdots &{} C_2{\mathcal {S}}- {\mathcal {M}} &{} {\mathcal {M}}+C_1{\mathcal {S}} \\ \end{array} \right) . \end{aligned}$$

In an analogous way, we can rewrite the discrete adjoint state equation as follows

$$\begin{aligned} \begin{aligned} A{\mathbb {Z}}=\mathbb {c},\\ \end{aligned} \end{aligned}$$
(6.2)

where

$$\begin{aligned}\begin{aligned} {\mathbb {Z}}&=(\mathbf{Z }^{J-1},\mathbf{Z }^{J-2},\ldots ,\mathbf{Z }^0)^T,\\ \mathbb {c}&=(\mathbf{c }^1,\mathbf{c }^2,\ldots ,\mathbf{c }^J)^T,\\ \mathbf{c }^j&=(\tau (U^{J-j+1}-{\bar{u}}_d^{J-j+1},\varphi _i))_{M\times 1},j=1,\ldots ,J,i=1,\ldots , M. \end{aligned} \end{aligned}$$

It is easy to see that the coefficient matrices of the state equation and the adjoint state equation are both block lower triangular Toeplitz-like matrices with tridiagonal blocks (BL3TB).
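To make the block structure concrete, the Python sketch below assembles A for a one-dimensional uniform mesh and solves \(A{\mathbb {U}}=\mathbb {b}\) by block forward substitution, that is, by marching in time. The helper names, the choice of M interior nodes with \(h=1/(M+1)\) and homogeneous Dirichlet conditions, and the illustrative weight values in the usage part are assumptions made here; `C[m]` stands for \(C_{m+1}\).

```python
import numpy as np

def fe_matrices_1d(M):
    """P1 mass and stiffness matrices on (0, 1) with M interior nodes."""
    h = 1.0 / (M + 1)
    I, E = np.eye(M), np.eye(M, k=1) + np.eye(M, k=-1)
    mass = h / 6.0 * (4.0 * I + E)
    stiff = (2.0 * I - E) / h
    return mass, stiff

def assemble_A(mass, stiff, C):
    """Dense block lower triangular Toeplitz matrix; C[m] plays the role of C_{m+1}."""
    J, M = len(C), mass.shape[0]
    A = np.zeros((J * M, J * M))
    for j in range(J):
        for k in range(j + 1):
            blk = C[j - k] * stiff
            if j == k:
                blk = blk + mass          # diagonal block  M + C_1 S
            elif j == k + 1:
                blk = blk - mass          # first subdiagonal block  C_2 S - M
            A[j*M:(j+1)*M, k*M:(k+1)*M] = blk
    return A

def solve_bfsm(mass, stiff, C, b):
    """Block forward substitution (time marching); b has shape (J, M)."""
    J, M = b.shape
    X = np.zeros((J, M))
    A0 = mass + C[0] * stiff
    for j in range(J):
        rhs = b[j].copy()
        if j > 0:
            rhs += mass @ X[j - 1]
        for k in range(j):                # history term of the fractional derivative
            rhs -= C[j - k] * (stiff @ X[k])
        X[j] = np.linalg.solve(A0, rhs)
    return X

# Usage with illustrative weights (not values from the paper)
mass, stiff = fe_matrices_1d(5)
C = np.array([0.3, -0.17, -0.03, -0.01])
b = np.random.default_rng(0).standard_normal((len(C), 5))
X = solve_bfsm(mass, stiff, C, b)
X_ref = np.linalg.solve(assemble_A(mass, stiff, C), b.ravel()).reshape(b.shape)
```

The inner loop over the history makes the marching cost grow quadratically in J, which is the bottleneck that motivates a faster solver.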

6.3 Fast Algorithm

Note that the computational effort of solving the optimal control problem is dominated by the solution of the state equation and the adjoint state equation. The remaining question is therefore how to solve the following linear system efficiently

$$\begin{aligned} \begin{aligned} A{\mathcal {X}}=\mathbf{y }. \end{aligned} \end{aligned}$$
(6.3)

According to [19], the main idea is to replace the exact matrix A with an approximation \(A_{\epsilon }\) of the following form

$$\begin{aligned} A_{\epsilon }= \left[ \begin{array}{ccccc} A_0 &{}\epsilon A_{J-1} &{} \cdots &{} \epsilon A_2 &{} \epsilon A_1 \\ A_1 &{}A_0 &{} \epsilon A_{J-1} &{} \cdots &{} \epsilon A_2 \\ \vdots &{} A_1 &{} A_0 &{} \ddots &{} \vdots \\ A_{J-2} &{} \cdots &{}\ddots &{} \ddots &{} \epsilon A_{J-1} \\ A_{J-1} &{} A_{J-2} &{} \cdots &{} A_1 &{} A_0 \\ \end{array} \right] . \end{aligned}$$

Therefore the solution of the Eq. (6.3) can be represented as

$$\begin{aligned} \begin{aligned} {\mathcal {X}}=A^{-1}\mathbf{y }\approx A_{\epsilon }^{-1}\mathbf{y }={\mathcal {X}}_{\epsilon }. \end{aligned} \end{aligned}$$

Note that \(A_{\epsilon }\) is a block \(\epsilon \)-circulant matrix (see [4]). This allows us to use the fast Fourier transform to reduce the computational cost.

Theorem 6.1

(see [4], Theorem 2.10) Let \(A_{\epsilon }\) be the block \(\epsilon \)-circulant matrix with first column \([A_0,A_1,\ldots ,A_{J-1}]^T\). Then we have

$$\begin{aligned} \begin{aligned} A_{\epsilon }^{-1}=[(D_{\delta }^{-1}F_J^*)\otimes I_M]\,\mathrm{diag}(\varLambda ^{-1}_0,\varLambda ^{-1}_1,\ldots ,\varLambda ^{-1}_{J-1})\,[(F_JD_{\delta })\otimes I_M], \end{aligned} \end{aligned}$$
(6.4)

in which \(D_{\delta }=\mathrm{diag}(1,\delta ,\ldots ,\delta ^{J-1})\) is a diagonal matrix with \(\delta =\root J \of {\epsilon }\). The matrices \(\varLambda _k\), \(k=0,1,\ldots ,J-1\), are of size \(M\times M\) and satisfy

$$\begin{aligned} \left[ \begin{array}{c} \varLambda _0\\ \varLambda _1\\ \vdots \\ \varLambda _{J-1}\\ \end{array} \right] = [(\sqrt{J}F_JD_{\delta })\otimes I_M] \left[ \begin{array}{c} A_0\\ A_1\\ \vdots \\ A_{J-1}\\ \end{array} \right] . \end{aligned}$$
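The factorization in Theorem 6.1 can be checked numerically in the scalar case \(M=1\), where each \(\varLambda _k\) is a number. The Python sketch below does so under the assumption that \(F_J\) is the unitary Fourier matrix with entries \(e^{-2\pi \mathrm {i}jk/J}/\sqrt{J}\); the helper `eps_circulant` and the test vector `a` are made up for illustration.

```python
import numpy as np

def eps_circulant(a, eps):
    """Scalar eps-circulant matrix with first column a (block size M = 1)."""
    J = len(a)
    A = np.empty((J, J))
    for j in range(J):
        for k in range(J):
            # lower triangle: a[j-k]; wrapped upper triangle is scaled by eps
            A[j, k] = a[j - k] if j >= k else eps * a[J + j - k]
    return A

J, eps = 6, 1e-3
a = np.array([4.0, -1.0, 0.5, 0.3, -0.2, 0.1])   # illustrative first column
delta = eps ** (1.0 / J)
d = delta ** np.arange(J)                        # diagonal of D_delta
F = np.fft.fft(np.eye(J)) / np.sqrt(J)           # unitary Fourier matrix F_J (assumed convention)
lam = np.sqrt(J) * F @ (d * a)                   # Lambda_k, scalars when M = 1
A_inv = (np.diag(1.0 / d) @ F.conj().T) @ np.diag(1.0 / lam) @ (F @ np.diag(d))
residual = np.linalg.norm(A_inv @ eps_circulant(a, eps) - np.eye(J))
```

The scaling by \(D_{\delta }\) turns the \(\epsilon \)-circulant matrix into an ordinary circulant one with first column \(\delta ^k a_k\), which the Fourier matrix then diagonalizes; `residual` should be at the level of rounding errors.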

According to [19], Eq. (6.4) can be implemented by the following algorithm, which is called the approximate inversion method (AIM).

[Algorithm 2: approximate inversion method (AIM)]

Applying Algorithm 2 within Algorithm 1 yields a fast gradient projection method for the optimal control problem. Table 1 displays the computational cost required for each step, which implies that the total computational cost is \( O(MJ\log J)\).
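A minimal Python sketch of how formula (6.4) might be applied to the BL3TB system is given below. It is not the authors' Algorithm 2: the function name `solve_aim`, the storage of the weights as `C[m]` for \(C_{m+1}\), and the one-dimensional test problem are assumptions, and dense solves are used for brevity where tridiagonal solvers would be needed to reach the stated complexity.

```python
import math
import numpy as np

def solve_aim(mass, stiff, C, y, eps=0.5e-8):
    """Approximate solve of A X = y via the block eps-circulant A_eps (sketch).

    Blocks: A_0 = mass + C[0]*stiff, A_1 = C[1]*stiff - mass, A_m = C[m]*stiff (m >= 2).
    y has shape (J, M); the result has the same shape.
    """
    J, M = y.shape
    d = (eps ** (1.0 / J)) ** np.arange(J)         # diagonal of D_delta
    mu = np.zeros(J)
    mu[0] = 1.0
    if J > 1:
        mu[1] = -1.0                               # mass coefficients of A_0 and A_1
    mu_hat = np.fft.fft(d * mu)                    # Lambda_k = mu_hat[k]*mass + nu_hat[k]*stiff
    nu_hat = np.fft.fft(d * np.asarray(C, dtype=float))
    y_hat = np.fft.fft(d[:, None] * y, axis=0)     # scaled FFT in time of the right-hand side
    Lam = mu_hat[:, None, None] * mass + nu_hat[:, None, None] * stiff
    x_hat = np.linalg.solve(Lam, y_hat[..., None])[..., 0]   # J independent M x M solves
    x = np.fft.ifft(x_hat, axis=0) / d[:, None]    # inverse FFT, then undo the scaling
    return x.real

# Small test problem: 1D P1 elements, M interior nodes, J time steps
M, J, beta = 7, 8, 0.5
h, tau = 1.0 / (M + 1), 1.0 / J
E = np.eye(M, k=1) + np.eye(M, k=-1)
mass, stiff = h / 6 * (4 * np.eye(M) + E), (2 * np.eye(M) - E) / h
m = np.arange(1, J)
C = np.r_[1.0, (m + 1.0) ** (1 - beta) - 2.0 * m ** (1 - beta) + (m - 1.0) ** (1 - beta)]
C *= tau ** (1 - beta) / math.gamma(2 - beta)

A = np.zeros((J * M, J * M))                       # exact block Toeplitz matrix
for j in range(J):
    for k in range(j + 1):
        blk = C[j - k] * stiff + (mass if j == k else -mass if j == k + 1 else 0.0)
        A[j*M:(j+1)*M, k*M:(k+1)*M] = blk
y = np.random.default_rng(0).standard_normal((J, M))
x_exact = np.linalg.solve(A, y.ravel()).reshape(J, M)
rel_err = np.linalg.norm(solve_aim(mass, stiff, C, y) - x_exact) / np.linalg.norm(x_exact)
```

The choice of \(\epsilon \) is a trade-off: the perturbation \(A_{\epsilon }-A\) shrinks with \(\epsilon \), while the factor \(\delta ^{-(J-1)}\) in \(D_{\delta }^{-1}\) amplifies rounding errors as \(\epsilon \) decreases.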

Table 1 Analysis of computation

Fig. 1 Discrete state U, adjoint state Z and control Q

Table 2 Errors of state, adjoint state and control in \(L^2(0,T;L^2(\varOmega ))\) for different \(\beta \) and fixed time partition \(J{=}M^2\)

Fig. 2 Space convergence rates of state, adjoint state and control for \(\beta =0.4\)

Fig. 3 Space convergence rates of state, adjoint state and control for \(\beta =0.8\)

Table 3 Errors of state, adjoint state and control for different \(\beta \) and \(M{=}J\)

7 Numerical Examples

In this section numerical experiments will be carried out to illustrate the error analysis and algorithm presented in Sects. 5 and 6.

7.1 Example 1

We consider the control problem with \( \varOmega =[0,1],\ \gamma =1, \ T=1, \ \eta = 1.0\times 10^{-5}\) and \(\epsilon =0.5\times 10^{-8}\). The exact solutions are given by

$$\begin{aligned} \begin{aligned} u&=t^{2}\sin (2\pi x),\\ z&=(1-t)^{2}\sin (\pi x),\\ q&=\max (-0.5,\min (-z,-0.1)).\\ \end{aligned} \end{aligned}$$

The right-hand side f and the desired state \(u_d\) can be calculated from the exact solutions and the governing equations.
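For illustration, the data of Example 1 can be evaluated as in the Python sketch below. The strong forms used here, \(u_t-{}_0^R\!\partial _t^{\beta }u_{xx}=f+q\) and \(-z_t-{}_t^R\!\partial _T^{\beta }z_{xx}=u-u_d\), are inferred from the weak formulations used in the analysis and should be read as an assumption; the function name `example1_data` is hypothetical.

```python
import math
import numpy as np

def example1_data(x, t, beta):
    """Exact u, z, q of Example 1 and the induced f, u_d (sketch, T = 1).

    Assumed strong forms (inferred, not quoted from the paper):
        u_t - D^beta u_xx  = f + q,      -z_t - D*^beta z_xx = u - u_d,
    with left/right Riemann-Liouville derivatives D^beta and D*^beta.
    """
    k = 2.0 / math.gamma(3.0 - beta)             # D^beta t^2 = k * t^(2 - beta)
    sx2, sx1 = np.sin(2 * np.pi * x), np.sin(np.pi * x)
    u = t**2 * sx2
    z = (1 - t)**2 * sx1
    q = np.maximum(-0.5, np.minimum(-z, -0.1))   # projection given in the text
    # u_xx = -(2 pi)^2 u, so -D^beta u_xx = (2 pi)^2 * k * t^(2-beta) * sin(2 pi x)
    f = 2 * t * sx2 + (2 * np.pi)**2 * k * t**(2 - beta) * sx2 - q
    # z_xx = -pi^2 z and D*^beta (1-t)^2 = k * (1-t)^(2-beta)
    u_d = u - 2 * (1 - t) * sx1 - np.pi**2 * k * (1 - t)**(2 - beta) * sx1
    return u, z, q, f, u_d

u0, z0, q0, f0, ud0 = example1_data(0.5, 0.0, beta=0.4)    # initial time, midpoint
u1, z1, q1, f1, ud1 = example1_data(0.25, 1.0, beta=0.4)   # final time
```

At \(t=0\) and \(x=0.5\) the adjoint equals 1, so the lower bound of the control is active, while at \(t=1\) the adjoint vanishes and the upper bound is active.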

The space-time surfaces of the discrete state variable, the adjoint state variable and the control variable are displayed in Fig. 1.

Fig. 4 Time convergence rates of state, adjoint state and control for \(\beta =0.4\)

Fig. 5 Time convergence rates of state, adjoint state and control for \(\beta =0.8\)

Table 4 Comparison of computational time

To test the spatial convergence rate, we fix the number of time steps as \(J=M^{2}\). The errors of the state, adjoint state and control variables are presented in Table 2. The corresponding spatial convergence rates are shown in Figs. 2 and 3, which are in agreement with the theoretical result.

Similarly, to test the temporal convergence rate we set \(M=J\). The errors of the state, adjoint state and control variables are listed in Table 3 and the corresponding convergence rates are given in Figs. 4 and 5. It is easy to see that the convergence rate is consistent with the theoretical prediction.
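Convergence rates of this kind are usually read off from errors on successively refined grids. The Python sketch below computes such observed orders; the function name `observed_orders` is hypothetical, and the input is synthetic data behaving like \(3h^2\), not values from the tables of this paper.

```python
import numpy as np

def observed_orders(errors, steps):
    """Observed convergence orders log(e_i / e_{i+1}) / log(s_i / s_{i+1})."""
    e = np.asarray(errors, dtype=float)
    s = np.asarray(steps, dtype=float)
    return np.log(e[:-1] / e[1:]) / np.log(s[:-1] / s[1:])

# Synthetic data (not taken from the paper's tables): errors behaving like 3*h^2
h = 1.0 / np.array([8, 16, 32, 64])
orders = observed_orders(3.0 * h**2, h)
```

With \(J=M^2\) the temporal contribution \(\tau \) is of the same order as \(h^2\), so the estimate \(C(h^2+\tau )\) predicts second order in h; with \(M=J\) the term \(\tau \) dominates and first order in \(\tau \) is expected.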

The computational times of the fast algorithm (AIM) and of the traditional block forward substitution method (BFSM) are compared in Table 4 for \(M=2^7\). We observe that the fast algorithm effectively reduces the computational time. Figs. 6 and 7 show that the computational time of the fast algorithm (AIM) increases linearly, in contrast to that of the traditional block forward substitution method (BFSM).

7.2 Example 2

We consider the control problem with \( \varOmega =[0,1],\ \gamma =1, \ T=1,\) and \(\eta = 1.0\times 10^{-5}\). The functions f and \(u_d\) are given by

$$\begin{aligned} \begin{aligned} f&=10,\\ u_d&=-20x(1-x)e^t. \end{aligned} \end{aligned}$$

In this example, no exact solution is available. The control variable is chosen as \(q=\max (-0.5,\min (-z,-0.1)).\) We use the numerical solution on a much finer grid as the reference solution. To test the spatial convergence rate, we fix \(J=2^{10}\) and \(M=2^{10}\). The errors of the state, adjoint state and control variables are given in Table 5. The corresponding spatial convergence rates are shown in Figs. 8 and 9 for different \(\beta \), which are in agreement with the theoretical result.

Fig. 6 Computational time of fast algorithm (AIM)

Fig. 7 Computational time of BFSM

Table 5 Errors of state, adjoint state and control for \(M=2^{10}\)

Fig. 8 Space convergence rates of state, adjoint state and control for \(\beta =0.4\)

Fig. 9 Space convergence rates of state, adjoint state and control for \(\beta =0.8\)

Similarly, we set \(J=2^{11}\) and \(M=2^{11}\) to verify the convergence rate of time. The errors of state, adjoint state and control variable are listed in Table 6 and the corresponding convergence rate is given in Figs. 10 and 11 for different \(\beta \). It is easy to observe that the convergence rate is consistent with the theoretical prediction.

Table 6 Errors of state, adjoint state and control for \(J=2^{11}\)

Fig. 10 Time convergence rates of state, adjoint state and control for \(\beta =0.4\)

Fig. 11 Time convergence rates of state, adjoint state and control for \(\beta =0.8\)