1 Introduction

Our main goal in this paper is to study finite element discretization of optimal control problem governed by 1-D space fractional diffusion equation. We consider the following control problem:

$$\begin{aligned} \min \limits _{u\in U_{ad}} J(y,u):= \frac{1}{2} \Vert y-y_d\Vert _{0}^2 +\frac{\gamma }{2} \Vert u\Vert _{0}^{2} \end{aligned}$$
(1.1)

subject to

$$\begin{aligned} \left\{ \begin{array}{llr} {\mathcal {L}}_{r}^{\alpha }y =f+u \quad &{}\text{ in }\ \varOmega , \\ y=0 \quad &{}\text{ on }\ \varGamma , \end{array}\right. \end{aligned}$$
(1.2)

where \(\varOmega =(0,1)\), \(\varGamma =\partial \varOmega \), \(y_{d}\in L^2(\varOmega )\) is the desired state, \(\gamma >0\) is the regularization parameter, \(f\in L^2(\varOmega )\) is a given function and the control constraint

$$\begin{aligned} U_{ad}=\{u\in L^{2}(\varOmega ):\; u_a\le u(x)\le u_b\ \; \text{ a.e. } \text{ in }\; \varOmega \ \ \text{ with }\ \ u_a, u_b\in {\mathbb {R}}\ \ \text{ and } \ \ u_a\le u_b\}. \end{aligned}$$

Here the fractional differential operator \({\mathcal {L}}_{r}^{\alpha }\) is defined by

$$\begin{aligned} {\mathcal {L}}_{r}^{\alpha }y:=-D(r\ {}_0\!D_x^{-(2-\alpha )}+(1-r){}_x\!D_1^{-(2-\alpha )}) Dy. \end{aligned}$$

The parameters \(r\in [0,1]\) and \(\alpha \in (1,2)\). More details about fractional derivative and integral will be specified later.

In the past decades, lots of researches show that anomalous diffusion phenomena widely exists in real world applications, for example, contaminant transport in groundwater flow. In [1, 2], it was shown that solutes moving through aquifers do not generally follow a classical second-order Fickian diffusion equation. The heavy tail behavior of the transport processes can be accurately described by Levy distribution, which can be viewed as a probability description of fractional diffusion equations. Due to the self-similarity the plume spreads faster than a traditional Brownian motion. The traditional dispersion equation would seriously underestimate the risk of downstream contamination if the plume represent a pollutant heading to a drinking water well. The stable density that solves the fractional diffusion equation can capture the super-diffusive spreading observed in the data. Motivated by above facts, in this paper we mainly focus on optimal control problem governed by a space fractional diffusion equation described in (1.2).

In recent years, the research of optimal control problem governed by fractional PDEs forms a hot topic not only in model problems, but also in numerical methods. For the model problems, a distributed optimal control problem governed by time fractional diffusion equation with Riemann–Liouville derivative was discussed in [3] and the first order optimality system was derived there. In [4], the authors studied the optimal control problem governed by time fractional diffusion equation with state constraints and proved the well-posedness of the control problem. The controllability of the time fractional diffusion equation was discussed in [5]. A new type of identification problems was studied in [6], where the fractional order in a nonlocal evolution equation was identified and the well-posedness of the identification problem was proved.

On the numerical method aspect, spectral approximation of time fractional optimal control problems were studied in [7,8,9] and error estimates were derived under the assumption that both state and control variables are sufficiently smooth. In [10], Legendre pseudo-spectral method combined with L1 scheme was applied to approximate optimal control problem governed by a time-fractional diffusion equation. Numerical simulation of distributed-order fractional optimal control problems was investigated in [11]. A fully spectral collocation scheme was proposed. Besides, finite element approximation and finite difference approximation of fractional optimal control problems are also widely studied. In [12,13,14,15], optimal control problems governed by fractional Laplacian were investigated, where the fractional Laplacian operator was characterized as fractional powers of the Dirichlet Laplace operator in the sense of spectral theory. By using the Caffarelli–Silvestre extension, the fractional Laplacian equation was realized as the Dirichlet-to-Neumann map for a nonuniformly elliptic problem posed on a semi-infinite cylinder in one more spatial dimension, which overcame the nonlocality of the fractional Laplacian operator. Finite element discretization of the control problem was investigated, and a priori as well as a posteriori error estimate were derived. In [16], the authors investigated the controllability of a one-dimensional heat equation involving the fractional Laplacian operator both from theoretical and numerical aspects. Finite element approximation of time fractional optimal control problems was studied in [17], where the well-posedness of control problem and optimal-order error estimates for the space semidiscrete approximation were proved. Pointwise-in-time error estimates for an optimal control problem with subdiffusion constraint was derived in [18] for L1 and back Euler discretization of time. A fast projected gradient algorithm for optimal control problem governed by space fractional diffusion equation was developed in [19] based on finite difference discretization of the state equation.

The present work is devoted to develop a rigorous error estimate for finite element approximation of optimal control problem governed by space fractional diffusion equation. The control space is approximated by the piecewise constant finite element space. The regularity estimates for the state, the adjoint state and the control variables are derived based on the first order optimality system. Note that the fractional derivative is nonlocal, which leads to a full matrix in finite element or finite difference discretization of the state equation as well as the adjoint equation. To reduce the computational cost, we develop a fast primal dual active set algorithm based on the Toeplitz structure of the coefficient matrix ([20]) in the discrete state and adjoint state equations. Finally, numerical examples are given to illustrate the theoretical results.

Our paper is organized as follows. In next section, we present some preliminary knowledge about the fractional derivative and integral, first. Then we derive the first order optimality system and prove some results with respect to the regularity of the solutions. In Sect. 3, we consider the finite element approximation of control problem and prove a priori error estimates for the state variable, the adjoint state variable and the control variable. To save computational cost, a fast algorithm based on the primal dual active set strategy is developed in Sect. 4. In Sect. 5, numerical example is given to confirm our theoretical findings. Finally, we draw some concluding remarks in Sect. 6.

2 Optimal Control Problem

In this section, we begin with a brief review of the definition of fractional integral, derivative and the related Sobolev sapce.

For a function u defined on the interval \(\varOmega \) and \(\beta >0\), we have the left and right fractional integrals of order \(\beta \) defined by:

$$\begin{aligned} {_0D_x^{-\beta }} u(x):= & {} \frac{1}{\varGamma (\beta )}\int _0^x\frac{u(s)}{(x-s)^{1-\beta }}ds, \\ {_xD_1^{-\beta }} u(x):= & {} \frac{1}{\varGamma (\beta )}\int _x^1\frac{u(s)}{(s-x)^{1-\beta }}ds. \end{aligned}$$

The left and right Caputo fractional derivatives of order \(1-\sigma \) with \(0\le \sigma <1\) are defined by:

$$\begin{aligned} {_0^CD_x^{1-\sigma }} u(x)= & {} \frac{1}{\varGamma (\sigma )}\int _0^x\frac{u'(s)}{(x-s)^{1-\sigma }}ds={_0D_x^{-\sigma }}Du(x), \\ {_x^CD_1^{1-\sigma }} u(x)= & {} -\frac{1}{\varGamma (\sigma )}\int _x^1\frac{u'(s)}{(s-x)^{1-\sigma }}ds=-{_xD_1^{-\sigma }}Du(x). \end{aligned}$$

Similarly, the left and right Riemann–Liouville fractional derivatives of order \(1-\sigma \) are defined by:

$$\begin{aligned} {_0D_x^{1-\sigma }} u(x)= & {} \frac{1}{\varGamma (\sigma )}{\frac{d}{dx}}\int _0^x\frac{u(s)}{(x-s)^{1-\sigma }}ds=D{_0D_x^{-\sigma }}u(x), \\ {_xD_1^{1-\sigma }} u(x)= & {} -\frac{1}{\varGamma (\sigma )}{\frac{d}{dx}}\int _x^1\frac{u(s)}{(s-x)^{1-\sigma }}ds=-D{_xD_1^{-\sigma }}u(x). \end{aligned}$$

For \(s\ge 0\), let \(H^s(\varOmega )\) denote the Sobolev space of order s on the interval \(\varOmega \) and \({\widetilde{H}}^s(\varOmega )\) denote the set of functions in \(H^s(\varOmega )\) whose extension by 0 are in \(H^s(R)\). For u defined on \(\varOmega \) and \({\widetilde{u}}\) its extension by zero, \({\widetilde{H}}^s(\varOmega )\) is the closure of \(C_0^{\infty }(\varOmega )\) with the norm \(\Vert u\Vert _{{H}^s(\varOmega )}:=\Vert {\widetilde{u}}\Vert _{{\widetilde{H}}^s(R)}\).

Let \({}_2\!F_1\) denote the Gaussian three-parameter hypergeometric function, which is defined by

$$\begin{aligned} {}_2\!F_1(a;b;c;x)= & {} \frac{\varGamma (c)}{\varGamma (b)\varGamma (c-b)}\int _0^1 z^{b-1}(1-z)^{c-b-1}(1-zx)^{-a}dz\\= & {} \sum \limits _{n=0}^{\infty } \frac{(a)_n (b)_n x^n}{(c)_n n!}. \end{aligned}$$

Here \((\cdot )_n\) denotes the rising Pochhammer symbol. This function will be used in the kernel function of fractional operator.

Theorem 2.1

Let (yu) be the solution of control problem (1.1)–(1.2). Then the following first order optimality system holds

$$\begin{aligned}&\left\{ \begin{array}{ll} -\,D(r{}_0\!D_x^{-(2-\alpha )}+(1-r){}_x\!D_1^{-(2-\alpha )}) D y=f+u\quad &{}\text{ in }\ \varOmega ,\\ y =0\quad &{}\text{ on }\ \varGamma , \end{array} \right. \end{aligned}$$
(2.1)
$$\begin{aligned}&\left\{ \begin{array}{ll} -\,D(r{}_x\!D_1^{-(2-\alpha )}+(1-r){}_0\!D_x^{-(2-\alpha )}) D z=y-y_d\quad &{}\text{ in }\ \varOmega ,\\ z =0\quad &{}\text{ on }\ \varGamma \end{array} \right. \end{aligned}$$
(2.2)

and

$$\begin{aligned} \int _{\varOmega }(\gamma u+z)({\chi }-u)dx&\ge 0, {\forall } {\chi }\in U_{ad}. \end{aligned}$$
(2.3)

Proof

To derive the first order optimality system, we introduce the following reduced optimization problem:

$$\begin{aligned} \min \limits _{u\in U_{ad}} {\hat{J}}(u):=J(y(u),u), \end{aligned}$$

where y(u) is the solution of the state equation. Then the following optimality condition holds

$$\begin{aligned} {\hat{J}}'(u)({\chi }-u)\ge 0, \forall {\chi }\in U_{ad}. \end{aligned}$$

Note that the objective functional is strictly convex, so the above condition is sufficient and necessary.

By simple calculation, we have

$$\begin{aligned} {\hat{J}}'(u)({\chi }-u)= & {} \int _{\varOmega }(y(u)-y_d)[y'(u)({\chi }-u)]dx+\gamma \int _{\varOmega }u({\chi }-u)dx. \end{aligned}$$

In order to simplify the above inequality, we need to calculate \(y'(u)({\chi }-u)\), According to the state equation, we have

$$\begin{aligned} \left\{ \begin{array}{ll} -\,D(r{}_0\!D_x^{-(2-\alpha )}+(1-r){}_x\!D_1^{-(2-\alpha )}) D[y'(u)({\chi }-u)]={\chi }-u\quad &{}\text{ in }\ \varOmega ,\\ y'(u)({\chi }-u)=0\quad &{}\text{ on }\ \varGamma . \end{array} \right. \end{aligned}$$

We introduce the adjoint state equation:

$$\begin{aligned} \left\{ \begin{array}{ll} -D(r{}_x\!D_1^{-(2-\alpha )}+(1-r){}_0\!D_x^{-(2-\alpha }) D z=y(u)-y_d\quad &{}\text{ in }\ \varOmega ,\\ z=0\quad &{}\text{ on }\ \varGamma . \end{array} \right. \end{aligned}$$

Then by using integration by parts, we deduce

$$\begin{aligned} \int _{\varOmega }(y(u)-y_d)[y'(u)({\chi }-u)] dx= & {} \int _{\varOmega }\Big (-D(r{}_x\!D_1^{-(2-\alpha )}+(1-r){}_0\!D_x^{-(2-\alpha )}) D z\Big ) \cdot [y'(u)({\chi }-u)]dx \\= & {} \int _{\varOmega }(r{}_x\!D_1^{-(2-\alpha )}+(1-r){}_0\!D_x^{-(2-\alpha )}) Dz\cdot D[y'(u)({\chi }-u)]dx\\= & {} \int _{\varOmega } Dz(x)\cdot \Big ((r{}_0\!D_x^{-(2-\alpha )}+(1-r){}_x\!D_1^{-(2-\alpha )})D[y'(u)({\chi }-u)]\Big )dx\\= & {} -\int _{\varOmega } z(x)\cdot D(r{}_0\!D_x^{-(2-\alpha )}+(1-r){}_x\!D_1^{-(2-\alpha ) })D[y'(u)({\chi }-u)]dx\\= & {} \int _{\varOmega }z({\chi }-u)dx. \end{aligned}$$

Combing the above equations leads to

$$\begin{aligned} {\hat{J}}'(u)(\chi -u)= & {} \int _{\varOmega }(\gamma u+z(x))({\chi }-u)dx\ge 0. \end{aligned}$$

\(\square \)

Let

$$\begin{aligned} P_{U_{ad}}(u)=\max \{u_a,\min \{u,u_b\}\} \end{aligned}$$

denote the pointwise projection onto the admissible set \(U_{ad}\). Then (2.3) is equivalent to

$$\begin{aligned} u=P_{U_{ad}}(-\frac{1}{\gamma }z). \end{aligned}$$

In the following, we are going to show the regularity of the control problem.

Theorem 2.2

Suppose that (yzu) is the solution of optimality system (2.1)–(2.3). Then we have the following regularity results

$$\begin{aligned} y, z\in {H}^{\nu +\frac{3}{2}-\epsilon }(\varOmega ),\ u\in \left\{ \begin{array}{ll} {H}^{\nu +\frac{3}{2}-\epsilon }(\varOmega ), \ \ \ \nu +\frac{3}{2}\le 1,\\ {H}^{1}(\varOmega ),\ \ \ \ \ \ \ \ \ \nu +\frac{3}{2}> 1. \end{array} \right. \end{aligned}$$

Here \(\nu =\min \{p,q\}\), \(\forall \epsilon >0\) and pq are constants satifying \(\alpha -2\le p, q<0\) and the following relation:

$$\begin{aligned} p+q=\alpha -2, \ \ \ r sin(\pi (-q))=(1-r) sin(\pi (-p)). \end{aligned}$$

Proof

According to [21], the kernel function of operator \({\mathcal {L}}_{r}^{\alpha }\) is given by \(\text{ ker }({\mathcal {L}}_{r}^{\alpha })=\text{ Span }\{1,K(x)\}\), where \(K(x)=\frac{1}{p+1}x^{p+1} {}_2\!F_1(-q,p+1,p+2;x)\) with \({}_2\!F_1\) being the Gaussian three-parameter hypergeometric function and \(\alpha -2\le p, q<0\) satisfying the following relation:

$$\begin{aligned} p+q=\alpha -2, \ \ \ r sin(\pi (-q))=(1-r) sin(\pi (-p)). \end{aligned}$$

This implies that the state y belongs to the space \({H}^{\nu +\frac{3}{2}-\epsilon }(\varOmega )\) with \(\nu =\min \{p,q\}\) and any constant \(\epsilon >0\).

By similar argument, we can obtain that the adjoint state z has the same regularity as that of the state y. According to [22], the projection operator \(P_{U_{ad}}\) satisfies the following property: if \(v\in H^s(\varOmega ), \forall s\in [0,1]\), then \(P_{U_{ad}}(v)\in H^s(\varOmega ).\) Combining this property with the regularity of z implies that \(u\in \left\{ \begin{array}{ll} {H}^{\nu +\frac{3}{2}-\epsilon }(\varOmega ), \ \ \ \nu +\frac{3}{2}\le 1,\\ {H}^{1}(\varOmega ),\ \ \ \ \ \ \ \ \ \nu +\frac{3}{2}>1. \end{array} \right. \)\(\square \)

Remark 2.3

In the case \(r=\frac{1}{2}\), we have \(p=q=\frac{\alpha }{2}-1\). This leads to \(y, z\in {H}^{\frac{\alpha }{2}+\frac{1}{2}-\epsilon }(\varOmega ),\ u\in H^1(\varOmega )\).

In the case \(r\longrightarrow 1\), we have \(q\longrightarrow 0, p\longrightarrow {\alpha }-2\). This gives \(y, z\in {H}^{{\alpha }-\frac{1}{2}-\epsilon }(\varOmega )\), which implies that \(\ u\in H^1(\varOmega )\), for \(\alpha \in \left( \frac{3}{2},2\right) \) and \(\ u\in {H}^{{\alpha }-\frac{1}{2}-\epsilon }(\varOmega )\), for \(\alpha \in \left( 1, \frac{3}{2}\right] \).

3 Finite Element Approximation

In this section, we will investigate finite element approximation of the control problem. For simplicity, we set \(\beta =1-\frac{\alpha }{2}\).

Following [21, 23], let

$$\begin{aligned} A(y,v):=r({}_0\!D_x^{-\beta }Dy,{}_x\!D_1^{-\beta }D v)+(1-r)({}_x\!D_1^{-\beta }Dy,{}_0\!D_x^{-\beta }Dv) \end{aligned}$$

denote the the bilinear form, which satisfies

$$\begin{aligned} A(y,y)\ge C_{0}\Vert y\Vert _{{\widetilde{H}}^{\frac{\alpha }{2}}(\varOmega )}^2 \end{aligned}$$

and

$$\begin{aligned} A(y,v)\le C_{1}\Vert y\Vert _{{\widetilde{H}}^{\frac{\alpha }{2}}(\varOmega )}\Vert v\Vert _{{\widetilde{H}}^{\frac{\alpha }{2}}(\varOmega )}, \end{aligned}$$

where \(C_0\) and \(C_1\) are positive constants.

Then the weak formulation of the control problem reads as:

$$\begin{aligned} \min \limits _{u\in U_{ad}} J(y,u) \end{aligned}$$
(3.1)

subject to

$$\begin{aligned} A(y,v)=(f+u,v), \forall v\in {\widetilde{H}}^{\frac{\alpha }{2}}(\varOmega ). \end{aligned}$$
(3.2)

Define the Lagrange functional

$$\begin{aligned} {\mathcal {L}}(y,z,u)=J(y,u)+(f+u,z)-A(y,z). \end{aligned}$$

Then we can derive the first order optimality system of the above variational problem:

$$\begin{aligned} A(y,v)= & {} (f+u,v), \forall v\in {\widetilde{H}}^{\frac{\alpha }{2}}(\varOmega ), \end{aligned}$$
(3.3)
$$\begin{aligned} A(w,z)= & {} (y-y_d,w), \forall w\in {\widetilde{H}}^{\frac{\alpha }{2}}(\varOmega ), \end{aligned}$$
(3.4)

and

$$\begin{aligned} \int _{\varOmega }(\gamma u+z)(\chi -u)\ge 0, \forall \chi \in U_{ad}. \end{aligned}$$
(3.5)

To define the finite element scheme, we introduce an uniform partition of the interval \(\varOmega \) with the mesh parameter \(h=\frac{1}{N}\), where \(N>0\) is an integer. Let \(V_h\) denote the continuous finite element space consisting of piecewise linear polynomial on each interval \(I_i=[x_{i},x_{i+1}],\) where the nodes \(x_{i}=ih, i=0, 1, 2, \ldots , N.\) Since \(\frac{\alpha }{2}\in \left( \frac{1}{2},1\right) \), we have \(V_h\subset {\widetilde{H}}^{\frac{\alpha }{2}}(\varOmega )\). Following [24], the space \(V_h\) satisfies the approximation property: if the partition is quasi-uniform and \(\frac{\alpha }{2}\le \kappa \le 2,\) for \(v\in H^{\kappa }(\varOmega )\cap {\widetilde{H}}^{\frac{\alpha }{2}}(\varOmega )\), then

$$\begin{aligned} \inf \limits _{v_h\in V_h}\Vert v-v_h\Vert _{{H}^{\frac{\alpha }{2}}(\varOmega )}\le Ch^{\kappa -\frac{\alpha }{2}}\Vert v\Vert _{{H}^{\kappa }(\varOmega )}. \end{aligned}$$
(3.6)

For the discretization of control variable, we introduce a piecewise constant finite element space \(U_h\subset L^2(\varOmega ).\) Let \(U_{ad}^h=U_h\cap U_{ad}.\)

Then the finite element approximation of control problem (3.1)–(3.2) is to find \((y_h,u_h)\in V_h\times {U}_{ad}^h\) such that

$$\begin{aligned} \min \limits _{u_h\in {U}_{ad}^h} J(y_h,u_h) \end{aligned}$$
(3.7)

subject to

$$\begin{aligned} A(y_h,v_h)=(f+u_h,v_h), \forall v_h\in V_h. \end{aligned}$$
(3.8)

Analogous to the continuous case, we can derive the discrete first order optimality system

$$\begin{aligned} A(y_h,v_h)= & {} (f+u_h,v_h), \forall v_h\in V_h, \end{aligned}$$
(3.9)
$$\begin{aligned} A(w_h,z_h)= & {} (y_h-y_d,w_h), \forall w_h\in V_h, \end{aligned}$$
(3.10)

and

$$\begin{aligned} \int _{\varOmega }(\gamma u_h+z_h)({\chi }_h-u_h)dx\ge 0, \forall {\chi }_h\in {U}_{ad}^h. \end{aligned}$$
(3.11)

In the following analysis, we are going to derive a priori error estimate for the control problem. For this purpose, we introduce some auxiliary problems:

$$\begin{aligned} A(y_h(u),v_h)= & {} (f+u,w_h), \ \ \ \ \ \ \ \forall w_h\in V_h, \end{aligned}$$
(3.12)
$$\begin{aligned} A(w_h,z_h(u))= & {} (y_h(u)-y_d,w_h), \forall w_h\in V_h, \end{aligned}$$
(3.13)
$$\begin{aligned} A(w_h,z_h(y))= & {} (y-y_d,w_h), \ \ \ \ \ \ \ \forall w_h\in V_h. \end{aligned}$$
(3.14)

It is easy to see that \(y_h(u)\) and \(z_h(y)\) are the finite element approximations of the state y and the adjoint state z, respectively. Therefore, following [21] and the regularity of the state and the adjoint state variables, we have

$$\begin{aligned} \Vert y-y_h(u)\Vert _{{H}^{\frac{\alpha }{2}}(\varOmega )}\le & {} Ch^{\nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon }\Vert y\Vert _{{H}^{\nu +\frac{3}{2}-\epsilon }(\varOmega )}, \end{aligned}$$
(3.15)
$$\begin{aligned} \Vert y-y_h(u)\Vert _{L^2(\varOmega )}\le & {} Ch^{2(\nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon )}\Vert y\Vert _{{H}^{\nu +\frac{3}{2}-\epsilon }(\varOmega )} \end{aligned}$$
(3.16)

and

$$\begin{aligned} \Vert z-z_h(y)\Vert _{{H}^{\frac{\alpha }{2}}(\varOmega )}\le & {} Ch^{\nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon }\Vert z\Vert _{{H}^{\nu +\frac{3}{2}-\epsilon }(\varOmega ) }, \ \end{aligned}$$
(3.17)
$$\begin{aligned} \Vert z-z_h(y)\Vert _{L^2(\varOmega )}\le & {} Ch^{2(\nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon )}\Vert z\Vert _{{H}^{\nu +\frac{3}{2}-\epsilon }(\varOmega ) }. \end{aligned}$$
(3.18)

By the coercivity of the bilinear form \(A(\cdot ,\cdot )\), we can derive

$$\begin{aligned} \Vert z_h(y)-z_h(u)\Vert _{{H}^{\frac{\alpha }{2}}(\varOmega )}\le & {} C\Vert y_h(u)-y\Vert _{L^2(\varOmega )}, \end{aligned}$$
(3.19)
$$\begin{aligned} \Vert z_h(y)-z_h\Vert _{{H}^{\frac{\alpha }{2}}(\varOmega )}\le & {} C\Vert y_h-y\Vert _{L^2(\varOmega )} \end{aligned}$$
(3.20)

and

$$\begin{aligned} \Vert y_h-y_h(u)\Vert _{{H}^{\frac{\alpha }{2}}(\varOmega )}\le & {} C\Vert u-u_h\Vert _{L^2(\varOmega )}, \end{aligned}$$
(3.21)
$$\begin{aligned} \Vert z_h-z_h(u)\Vert _{{H}^{\frac{\alpha }{2}}(\varOmega )}\le & {} C\Vert u-u_h\Vert _{L^2(\varOmega )}. \end{aligned}$$
(3.22)

Note that the estimate of the state and the adjoint state depends on the estimate of the control. Therefore, we show the error estimate of the control firstly.

Lemma 3.1

Let (yzu) and \((y_h, z_h, u_h)\) be the solutions of (2.1)–(2.3) and (3.9)–(3.11), respectively. Then the following estimate holds

$$\begin{aligned} \parallel u- u_h\parallel _{L^2(\varOmega )}\le & {} Ch^{2(\nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon )}. \end{aligned}$$

Proof

Let

$$\begin{aligned} ({\widehat{J}}_h'(u),{\chi })=(\gamma u+z_h(u),{\chi }). \end{aligned}$$

Then we can derive

$$\begin{aligned} ({\widehat{J}}_h'(u),u-u_h)-({\widehat{J}}_h'(u_h),u-u_h)\ge \gamma \parallel u-u_h\parallel ^2_{L^2(\varOmega )} +(z_h(u)-z_h,u-u_h). \end{aligned}$$

By (3.8) and (3.12), we obtain

$$\begin{aligned} (z_h(u)-z_h,u-u_h)= & {} A(y_h(u)-y_h,z_h(u)-z_h)\\= & {} (y_h(u)-y_h,y_h(u)-y_h)\ge 0. \end{aligned}$$

This implies

$$\begin{aligned} ({\widehat{J}}_h'(u),u-u_h)-({\widehat{J}}_h'(u_h),u-u_h)\ge \gamma \parallel u-u_h\parallel ^2_{L^2(\varOmega )}. \end{aligned}$$

Then we have

$$\begin{aligned}&\gamma \parallel u- u_h\parallel _{L^2(\varOmega )}^2\\&\quad \quad \le ({\widehat{J}}_h'(u),u-u_h)-({\widehat{J}}_h'(u_h),u-u_h)\\&\quad \quad =\int _{\varOmega }(\gamma u+z_h(u))(u-u_h)dx-\int _{\varOmega }(\gamma u_h+z_h)(u- u_h)dx\\&\quad \quad = \int _{\varOmega }(\gamma u+z)(u-u_h)dx+\int _{\varOmega }(z_h(u)-z)(u-u_h)dx-\int _{\varOmega }(\gamma u_h+z_h)(u- u_h)dx. \end{aligned}$$

Let \(\varPi _h u\in U^h_{ad}\) be the \(L^2\) projection of u defined by

$$\begin{aligned} \varPi _h{u}|_{I_i}=\frac{\int _{I_i}{u}}{\int _{I_i}1}, \ \ \forall I_i. \end{aligned}$$

It is easy to see that

$$\begin{aligned} \parallel {u}-\varPi _h{u}\parallel _{L^2(\varOmega )} \le Ch^s\Vert {u}\Vert _{H^s(\varOmega )}, \ \ 0<s\le 1. \end{aligned}$$
(3.23)

Therefore, it follows from (2.3), (3.11), (3.22) and the Young’s inequality that

$$\begin{aligned}&\gamma \parallel u- u_h\parallel _{L^2(\varOmega )}^2\\= & {} \int _{\varOmega }(\gamma u+z)(u-u_h)dx+\int _{\varOmega }(z_h(u)-z)(u-u_h)dx-\int _{\varOmega }(\gamma u_h+z_h)(u-\varPi _h u)dx\\&-\,\int _{\varOmega }(\gamma u_h+z_h)(\varPi _h u-u_h)dx \\\le & {} 0+\int _{\varOmega }(z_h(u)-z)(u-u_h)dx-\int _{\varOmega }(\gamma u_h+z_h)(u-\varPi _h u)dx+0 \\= & {} \int _{\varOmega }(z_h(u)-z)(u-u_h)dx-\int _{\varOmega }(\gamma u\\&+\,z-\gamma u_h-z_h)(u-\varPi _h u)dx+\int _{\varOmega }(\gamma u+z)(u-\varPi _h u)dx\\= & {} \int _{\varOmega }(z_h(u)-z)(u-u_h)dx+\gamma \int _{\varOmega }(u_h-u)(u-\varPi _h u)dx\\&+\,\int _{\varOmega }(\gamma u+z)(u-\varPi _h u)dx\\&+\,\int _{\varOmega }(z_h(u)-z)(u-\varPi _h u)dx+\int _{\varOmega }(z_h(u)-z_h)(\varPi _h u-u)dx\\\le & {} \delta \Vert u-u_h\Vert _{L^2(\varOmega )}^2+C(\delta )\Vert u-\varPi _h u\Vert _{L^2(\varOmega )}^2\\&+\,C(\delta )\Vert z_h(u)-z\Vert _{L^2(\varOmega )}^2+\int _{\varOmega }(\gamma u+z)(u-\varPi _h u)dx. \end{aligned}$$

Here \(\delta \) is an arbitrary positive constant. Furthermore, by the definition of \(\varPi _h u\) and the regularity of u and z, we have

$$\begin{aligned} \int _{\varOmega }(\gamma u+z)(u-\varPi _h u)= & {} \int _{\varOmega }(\gamma u+z-\varPi _h(\gamma u+z))(u-\varPi _h u)dx\\\le & {} \Vert \gamma u+z-\varPi _h(\gamma u+z)\Vert _{L^2(\varOmega )} \Vert u-\varPi _h u\Vert _{L^2(\varOmega )}\\\le & {} Ch^{2 \min \{1, \nu +\frac{3}{2}-\epsilon \}}. \end{aligned}$$

Combining the above estimates and setting \(\delta \) small enough lead to

$$\begin{aligned} \parallel u- u_h\parallel _{L^2(\varOmega )}^2\le & {} Ch^{2 \min \{1, \nu +\frac{3}{2}-\epsilon \}}+Ch^{4( \nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon )} \\\le & {} Ch^{4(\nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon )}. \end{aligned}$$

where the following fact is used

$$\begin{aligned} 2\left( \nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon \right) \le 2\left( \frac{\alpha }{2}-1 +\frac{3}{2}-\frac{\alpha }{2}-\epsilon \right) =2\left( \frac{1}{2}-\epsilon \right) <1. \end{aligned}$$

\(\square \)

Finally, by using the above estimate for the control variable, we can derive a priori error estimates for the state and the adjoint state variables.

Theorem 3.2

Assume that (yzu) and \((y_h,z_h,u_h)\) are the solutions of the optimality system (2.1)–(2.3) and the discrete counterpart. Then the following error estimates hold

$$\begin{aligned} \Vert y-y_h\Vert _{{H}^{\frac{\alpha }{2}}(\varOmega )}+\Vert z-z_h\Vert _{{H}^{\frac{\alpha }{2}}(\varOmega )} \le Ch^{ \nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon } \end{aligned}$$

and

$$\begin{aligned} \Vert y-y_h\Vert _{L^2(\varOmega )}+\Vert z-z_h\Vert _{L^2(\varOmega )} \le Ch^{2 (\nu +\frac{3}{2}-\frac{\alpha }{2}-\epsilon )}. \end{aligned}$$

Proof

Note that

$$\begin{aligned} \parallel y- y_h\parallel _{{H^{\frac{\alpha }{2}}(\varOmega )}}\le & {} \Vert y-y_h(u)\Vert _{H^{\frac{\alpha }{2}}(\varOmega )}+\Vert y_h(u)-y_h\Vert _{H^{\frac{\alpha }{2}}(\varOmega )} \\\le & {} \Vert y-y_h(u)\Vert _{H^{\frac{\alpha }{2}}(\varOmega )}+\Vert u-u_h\Vert _{L^2(\varOmega )} \end{aligned}$$

and

$$\begin{aligned} \parallel y- y_h\parallel _{L^2(\varOmega )}\le & {} \Vert y-y_h(u)\Vert _{L^2(\varOmega )}+\Vert y_h(u)-y_h\Vert _{L^2(\varOmega )}\\\le & {} \Vert y-y_h(u)\Vert _{L^2(\varOmega )}+\Vert u-u_h\Vert _{L^2(\varOmega )}. \end{aligned}$$

Then the estimate of the state follows from (3.15), (3.16) and Lemma 3.1. In an analogue way,

$$\begin{aligned} \parallel z- z_h\parallel _{{H^{\frac{\alpha }{2}}(\varOmega )}}\le & {} \Vert z-z_h(y)\Vert _{H^{\frac{\alpha }{2}}(\varOmega )}+\Vert z_h(y)-z_h\Vert _{H^{\frac{\alpha }{2}}(\varOmega )} \\\le & {} \Vert z-z_h(y)\Vert _{H^{\frac{\alpha }{2}}(\varOmega )}+\Vert y-y_h\Vert _{L^2(\varOmega )} \end{aligned}$$

and

$$\begin{aligned} \parallel z- z_h\parallel _{L^2(\varOmega )}\le & {} \Vert z-z_h(y)\Vert _{L^2(\varOmega )}+\Vert z_h(y)-z_h\Vert _{L^2(\varOmega )}\\\le & {} \Vert z-z_h(y)\Vert _{L^2(\varOmega )}+\Vert y-y_h\Vert _{L^2(\varOmega )}. \end{aligned}$$

Then the estimate of the adjoint state follows from (3.17), (3.18) and the estimate of the state. \(\square \)

4 Numerical Algorithm

In this section, we will present a numerical algorithm for the above control problem based on the primal dual active set strategy. The primal dual active set strategy has been developed in [25] for the elliptic optimal control problem with control constraints.

4.1 Primal–Dual Active Set Algorithm

In order to present the primal–dual active set algorithm, we reformulate the optimality system (2.1)–(2.3) into the following equivalent formulation

$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l} -\,D(r{}_0\!D_x^{-(2-\alpha )}+(1-r){}_x\!D_1^{-(2-\alpha )}) D y=f+u\quad &{}\text{ in }\ \varOmega ,\\ -\,D(r{}_x\!D_1^{-(2-\alpha )}+(1-r){}_0\!D_x^{-(2-\alpha )}) D z=y-y_d\quad &{}\text{ in }\ \varOmega ,\\ \gamma u+ z=-\lambda \quad &{}\text{ in }\ \varOmega . \end{array} \right. \end{aligned}$$
(4.1)

Here \(\lambda \) is defined as follows, for \(\mu >0\)

$$\begin{aligned} \lambda =\mu \max \left\{ 0,u+\frac{\lambda }{\mu }-u_b\right\} -\mu \min \left\{ 0,u+\frac{\lambda }{\mu }-u_b\right\} \end{aligned}$$
(4.2)

or

$$\begin{aligned} \lambda =\max \{0,{\lambda }+{\mu }(u -u_b)\}-\min \{0,{\lambda }+{\mu }(u -u_a)\}. \end{aligned}$$
(4.3)

The primal dual active set algorithm is based on utilizing (4.2) or (4.3) as a prediction strategy. For given \((u^{n-1},\lambda ^{n-1})\), the active set and the inactive set for the next iteration are chosen as

$$\begin{aligned} {\mathscr {A}}_a^{n}= & {} \{x\in \varOmega \mid \lambda ^{n-1}(x)+{\mu }(u^{n-1}(x) -u_a)< 0 \} \\ {\mathscr {A}}_b^{n}= & {} \{x\in \varOmega \mid {\lambda ^{n-1}(x)}+{\mu }(u^{n-1}(x) -u_b)>0 \}\\ {\mathscr {I}}^{n}= & {} \varOmega \backslash ({\mathscr {A}}_a^{n}\cup {\mathscr {A}}_b^{n} ). \end{aligned}$$

Then the primal–dual active set algorithm for the continuous problem is specified as follows

figure a

In the following, we are going to discuss the algorithm in the discrete level. Let

$$\begin{aligned} V_h= & {} \text{ Span }\{\phi _{1},\ldots , \phi _{N-1}\}, U_h=\text{ Span }\{ \psi _{1},\ldots , \psi _{N}\}, \ \mathbf{y}=(y_1,\ldots ,y_{N-1})^{T}, \\ \mathbf{z}= & {} (z_1,\ldots ,z_{N-1})^{T}. \end{aligned}$$

Then the finite element scheme for the state equation with right hand term g can be expressed as the following matrix form

$$\begin{aligned} {\mathcal {A}}\mathbf{y}=\mathbf{g}, \end{aligned}$$
(4.5)

where \(\mathbf{g}=(g_1,\ldots ,g_{N-1})^{T}\) with \(g_k=(g,\phi _k(x)), \ k=1, \ldots , N-1\). The entries of matrix \({\mathcal {A}}\) are calculated by

$$\begin{aligned} a_{ij}= & {} A(\phi _{j},\phi _i)\\= & {} r({}_0\!D_x^{-\beta }D\phi _{j}(x),{}_x\!D_1^{-\beta }D \phi _i(x))+(1-r)({}_x\!D_1^{-\beta }D\phi _{j}(x),{}_0\!D_x^{-\beta }D\phi _i(x)). \end{aligned}$$

Note that the finite element space \(V_h\) consists of piecewise linear polynomial on each interval \([x_{i},x_{i+1}].\) Then by simple calculation, matrix \({\mathcal {A}}\) takes the following form ([26, 27])

$$\begin{aligned} {\mathcal {A}} ={\left( {\begin{array}{ccccccc} {{a _0}}&{}{{a _{-1}}}&{}{{a _{-2}}}&{} {\ddot{s}} &{}{{a _{3 - N}}}&{}{{a _{2-N}}}\\ {{a _1}}&{}{{a _0}}&{}{{a _{-1}}}&{}{{a _{-2}}}&{} \ddots &{}{{a _{3 - N}}}\\ \vdots &{} \ddots &{} \ddots &{} \ddots &{} \ddots &{} \vdots \\ \vdots &{} \ddots &{}{{a _1}}&{}{{a _0}}&{}{{a _{-1}}}&{}{{a _{-2}}}\\ {{a_{N - 3}}}&{} \ddots &{} \ddots &{} \ddots &{}{{a _0}}&{}{{a _{-1}}}\\ {{a_{N - 2}}}&{}{{a _{N - 3}}}&{} \ddots &{} \ddots &{}{{a _1}}&{}{{a _0}} \end{array}} \right) } \end{aligned}$$
(4.6)

where

$$\begin{aligned} \left\{ \begin{array}{lcl} a_0=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}\Big (r(-4+2^{3-\alpha })+(1-r)(-4+2^{3-\alpha })\Big ),\\ a_1=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}\Big (r(3^{3-\alpha }-4\cdot 2^{3-\alpha }+6)+(1-r)\Big ),\\ a_{i}=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}r\Big ( (i+2)^{3-\alpha }-4\cdot (i+1)^{3-\alpha }+6\cdot i^{3-\alpha }-4\cdot (i-1)^{3-\alpha }+(i-2)^{3-\alpha } \Big ),\\ [0.15in] \quad \quad i=2,3,4,\ldots ,N-2,\\ a_{-1}=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}\Big (r+(1-r)(3^{3-\alpha }-4\cdot 2^{3-\alpha }+6)\Big ),\\ a_{-i}=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}(1-r)\Big ( (i+2)^{3-\alpha }-4\cdot (i+1)^{3-\alpha }+6\cdot i^{3-\alpha }-4\cdot (i-1)^{3-\alpha }+(i-2)^{3-\alpha } \Big ),\\ [0.15in] \quad \quad i=2,3,4,\ldots ,N-2. \end{array} \right. \end{aligned}$$

In an analogous way, the coefficient matrix for the adjoint state equation can be expressed as follows

$$\begin{aligned} {\mathcal {B}} ={\left( {\begin{array}{ccccccc} {{{\widetilde{a}}_0}}&{}{{{\widetilde{a}}_{-1}}}&{}{{{\widetilde{a}}_{-2}}}&{} \ddots &{}{{{\widetilde{a}}_{3 - N}}}&{}{{\widetilde{a} _{2-N}}}\\ {{\widetilde{a} _1}}&{}{{\widetilde{a}_0}}&{}{{\widetilde{a}_{-1}}}&{}{{\widetilde{a}_{-2}}}&{} \ddots &{}{{\widetilde{a}_{3 - N}}}\\ \vdots &{} \ddots &{} \ddots &{} \ddots &{} \ddots &{} \vdots \\ \vdots &{} \ddots &{}{{\widetilde{a} _1}}&{}{{\widetilde{a}_0}}&{}{{\widetilde{a}_{-1}}}&{}{{\widetilde{a} _{-2}}}\\ {{\widetilde{a}_{N - 3}}}&{} \ddots &{} \ddots &{} \ddots &{}{{\widetilde{a}_0}}&{}{{\widetilde{a} _{-1}}}\\ {{\widetilde{a}_{N - 2}}}&{}{{\widetilde{a}_{N - 3}}}&{} \ddots &{} \ddots &{}{{\widetilde{a}_1}}&{}{{\widetilde{a} _0}} \end{array}} \right) } \end{aligned}$$
(4.7)

where

$$\begin{aligned} \left\{ \begin{array}{lcl} \widetilde{a}_0=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}\Big ((1-r)(-4+2^{3-\alpha })+r(-4+2^{3-\alpha })\Big ),\\ \widetilde{a}_1=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}\Big ((1-r)(3^{3-\alpha }-4\cdot 2^{3-\alpha }+6)+r\Big ),\\ \widetilde{a}_{i}=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}(1-r)\Big ((i+2)^{3-\alpha }-4\cdot (i+1)^{3-\alpha }+6\cdot i^{3-\alpha }-4\cdot (i-1)^{3-\alpha }+(i-2)^{3-\alpha } \Big ),\\ \quad \quad i=2,3,4,\ldots ,N-2,\\ \widetilde{a}_{-1}=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}\Big ((1-r)+r(3^{3-\alpha }-4\cdot 2^{3-\alpha }+6)\Big ),\\ \widetilde{a}_{-i}=-\frac{h^{1-\alpha }}{\varGamma (4-\alpha )}r\Big ((i+2)^{3-\alpha }-4\cdot (i+1)^{3-\alpha }+6\cdot i^{3-\alpha }-4\cdot (i-1)^{3-\alpha }+(i-2)^{3-\alpha } \Big ),\\ \quad \quad i=2,3,4,\ldots ,N-2. \end{array} \right. \end{aligned}$$

Let \(N_h=\{1,2,\ldots , N\}\). \(\mathbf{F}, Y_d\in {\mathbb {R}}^{N-1}\), where \(\mathbf{F}=((f,\phi _{i}))_{(N-1)\times 1}\) and \(Y_d=((y_{d},\phi _{i}))_{(N-1)\times 1}\). \(\varvec{\Lambda },\mathbf{u}, \mathbf{u}_a, \mathbf{u}_b\in {\mathbb {R}}^{N}\) are vectors corresponding to the discrete counterpart. \(M\in {\mathbb {R}}^{(N-1)\times (N-1)}\) and \(M_0\in {\mathbb {R}}^{N\times N}\) are the mass matrix for space \(V_h\) and \(U_h\), respectively. \(M_1\in {\mathbb {R}}^{(N-1)\times N}\) denotes the matrix whose entry is calculated by \(m_{ij}=(\psi _j(x),\phi _i(x)), i=1,\ldots , N-1, j=1,2,\ldots ,N.\) We should point out that \(M_{0}\) is a diagonal matrix in this case.

Similar to the continuous case, we can define the active set and the inactive set of the n-th step as follows

$$\begin{aligned} \widetilde{{\mathscr {A}}}_a^{n}= & {} \{i\in N_h\mid ({\varvec{\Lambda }}^{n-1}+{\mu }(\mathbf{u}^{n-1} -\mathbf{u}_a))_i< 0 \}, \\ \widetilde{{\mathscr {A}}}_b^{n}= & {} \{i\in N_h \mid ({{\varvec{\Lambda }}^{n-1}}+{\mu }(\mathbf{u}^{n-1} -\mathbf{u}_b))_i>0 \},\\ \widetilde{{\mathscr {I}}}^{n}= & {} N_h\backslash ( \widetilde{{\mathscr {A}}}_a^{n}\cup \widetilde{{\mathscr {A}}}_b^{n} ), \end{aligned}$$

where \({\varvec{\Lambda }}^n=\gamma M_{0}\mathbf{u}^n+M_{1}^{T}\mathbf{z}^n\).

figure b

It is easy to see that in each iteration of the above algorithm, we need to solve the following equation

$$\begin{aligned} \left( \begin{array}{cc} {\mathcal {A}} &{} \frac{1}{\gamma } \widetilde{M}_1M_0^{-1}\widetilde{M}_1^T \\ {-\,M} &{} {\mathcal {B}} \end{array}\right) \left( \begin{array}{cc} {Y} \\ {Z} \end{array}\right) =\left( \begin{array}{cc} \mathbf{F} + M_1({\widetilde{\mathbf{u}}}_a+{\widetilde{\mathbf{u}}}_b) \\ -Y_d \end{array}\right) \end{aligned}$$
(4.9)

where \( \widetilde{M}_1\) consists of the \(\widetilde{\mathscr {I}}_n\) columns of the matrix \(M_{1}\), the \(\widetilde{\mathscr {A}}_a^n\) entries of \({\widetilde{\mathbf{u}}}_a\) is the same as that of \(\mathbf{u}_a\) and the other entries are zero, the \(\widetilde{\mathscr {A}}_b^{n}\) entries of \({\widetilde{\mathbf{u}}}_b\) is the same as that of \(\mathbf{u}_b\) and the other entries are zero.

4.2 Fast Algorithm

Note that the fractional differential operator is nonlocal, which leads to a full coefficient matrix. In the following part, we will develop a fast algorithm based on the Toeplitz structure of the coefficient matrix.

The Eq. (4.9) can be solved by some Krylov subspace methods, for example, BiCGStab method. In the above algorithm, the main computational cost comes from the matrix-vector multiplication. Note that M is tri-diagonal sparse matrix, which needs O(N) operations and O(N) storage. However, \({\mathcal {A}} \) and \({\mathcal {B}} \) are full, which usually cost \(O(N^2)\) operations and \(O(N^2)\) storage.

It is easy to observe that the stiff matrices \({\mathcal {A}}\) and \({\mathcal {B}}\) are two \((N-1)\)-by-\((N-1)\) Toeplitz matrices. In the computation, we only need to store the first row vector and the first column vector which requires O(N) storage. Note that a Toeplitz matrix \(T_N\) can be embedded into a circulant matrix \(C_{2N}\). According to [28, 29], a circulant matrix \(C_{2N}\) can be decomposed as follows

$$\begin{aligned} C_{2N}=F_{2N}^{-1}\text{ diag }(F_{2N}{} \mathbf d )F_{2N}, \end{aligned}$$
(4.10)

where \(\mathbf{d }\) is the first column vector of \(C_{2N}\) and \(F_{2N}\) is the \(2N\times 2N\) discrete Fourier transform matrix in which the (jl)-entry \(F_{2N}(j,l)\) of the matrix \(F_{2N}\) is defined by

$$\begin{aligned} F_{2N}(j,l)=\frac{1}{\sqrt{2N}}exp\left( -\frac{2\pi ijl}{2N}\right) ,\ \ \ 0\le j,\ l\le 2N-1 \end{aligned}$$

with \(i=\sqrt{-1}\). It is well known that the matrix-vector multiplication \(F_{2N} W_{2N}\) for \( W_{2N}\in R^{2N}\) can be carried out by fast Fourier transform (FFT). Thus the matrix-vector multiplication \(C_{2N} W_{2N}\) can be calculated by \(O(N\log N)\) operations.

Indeed, the \((N-1)\times (N-1)\) Toeplitz matrix \({\mathcal {A}}\) can be embedded into a \((2N-2)\times (2N-2)\) circulant matrix \(C_{2N-2}\) as follows

$$\begin{aligned} C_{2N-2}= \left( \begin{array}{cc} {{\mathcal {A}}} &{} \widehat{{\mathcal {A}}} \\ \widehat{{\mathcal {A}}} &{} {{\mathcal {A}}} \\ \end{array} \right) , \end{aligned}$$

where

$$\begin{aligned} \widehat{\mathcal {A}} = {\left( {\begin{array}{cccccc} {{0}}&{}{{a _{N-2}}}&{}{{a _{N-3}}}&{} \ddots &{}{{a _{2}}}&{}{{a _{1}}}\\ {{a _{2-N}}}&{}{{0}}&{}{{a _{N-2}}}&{}{{a _{N-3}}}&{} \ddots &{}{{a _{2}}}\\ \vdots &{} \ddots &{} \ddots &{} \ddots &{} \ddots &{} \vdots \\ \vdots &{} \ddots &{}{{a _{2-N}}}&{}{{0}}&{}{{a _{N-2}}}&{}{{a _{N-3}}}\\ {{a _{-2}}}&{} \ddots &{} \ddots &{} \ddots &{}{{0}}&{}{{a _{N-2}}}\\ {{a _{-1}}}&{}{{a _{-2}}}&{} \ddots &{} \ddots &{}{{a _{2-N}}}&{}{{0}} \end{array}} \right) }. \end{aligned}$$

Define a \(2N-2\) vector \( \mathbf{v }_{2N-2}=( \mathbf{v }_{N-1},\mathbf{0 })^{T}\), then we have

$$\begin{aligned} C_{2N-2} \mathbf{v }_{2N-2}= \left( \begin{array}{c} \mathcal {A} \mathbf{v }_{N-1}\\ \widehat{\mathcal {A}}\ \mathbf{v }_{N-1} \end{array} \right) . \end{aligned}$$

Thus, the first part of matrix-vector products \(C_{2N-2} \mathbf{v }_{2N-2}\) leads to the matrix-vector products \( \mathcal {A}\mathbf{v }_{N-1}\). Therefore, the computational cost of \( \mathcal {A}\mathbf{v }_{N-1}\) is \(O(N\log N)\) operations. For \( \mathcal {B}\mathbf{v }_{N-1}\) we can treat in the same way. Applying above skill to the BiCGStab method leads to a fast algorithm for the optimal control problem.

Table 1 \(L^2\) error and \(H^{\frac{\alpha }{2}}\) error versus mesh size h and orders of convergence for Example 1
Table 2 \(L^2\) error of u, y and z versus mesh size h and orders of convergence for Example 1
Table 3 \(H^{\frac{\alpha }{2}}\) error of y and z versus mesh size h and orders of convergence for Example 1

5 Numerical Example

In this section, some numerical experiments are carried out to show the performance of our finite element scheme and the iterative algorithm.

In the following part, we denote \(e_{0,h}=\gamma \Vert u-u_{h}\Vert _{0}+\Vert y-y_{h}\Vert _{0}+\Vert z-z_{h}\Vert _{0}\), \(e_{\frac{\alpha }{2},h}=\Vert y-y_{h}\Vert _{\frac{\alpha }{2}}+\Vert z-z_{h}\Vert _{\frac{\alpha }{2}}\). The \(\Vert w\Vert _{\frac{\alpha }{2}}\) data presented in the tables denotes the norm

$$\begin{aligned} \Vert w\Vert _{\frac{\alpha }{2}}=\left( \Vert w\Vert _{0}^2+\int _{0}^{1}\int _{0}^{1}\frac{|w(x)-w(y)|^2}{|x-y|^{1+\alpha }}dxdy\right) ^{\frac{1}{2}}. \end{aligned}$$

Example 1

In this example, we consider problem (1.1)–(1.2) with the exact solutions defined by

$$\begin{aligned} y= & {} x^{\frac{\alpha }{2}}(1-x)^{\frac{\alpha }{2}}, \\ z= & {} sx^{\frac{\alpha }{2}}(1-x)^{\frac{\alpha }{2}} ,\\ u= & {} \max \left\{ u_a,\min \left\{ u_{b},-\frac{1}{\gamma }z\right\} \right\} . \end{aligned}$$

where \(s\in {\mathbb {R}}\) is a constant and if we take \(\alpha =1.6\), \(r=0.5\), the corresponding f and \(y_{d}\) are

$$\begin{aligned} f= & {} -\varGamma (1+\alpha )\cos (\pi \alpha /2)-u, \\ y_{d}= & {} y+s\varGamma (1+\alpha )\cos (\pi \alpha /2). \end{aligned}$$

In the first numerical test, we take \(\gamma =1\), \(s=10\), \(u_a=-2.5\) and \(u_b = -0.5\). We refine the mesh uniformly in the test. The \(L^2\) error, the \(H^{\frac{\alpha }{2}}\) error and the orders of convergence with respect to the mesh size are listed in Tables 1, 2 and 3, respectively. The Figs. 1 and 2 show the convergence rate. The Fig. 3 shows the numerical solutions and the exact solutions of the optimal control problem with the mesh size \(h=1/64\). According to these results, we know that the orders of convergence of the \(L^2\) error and the \(H^{\frac{\alpha }{2}}\) error are 1 and 0.5, respectively, which are just the same as the theoretical analysis.

Fig. 1
figure 1

The convergence rate of Example 1

Fig. 2
figure 2

The convergence rate of Example 1

Fig. 3
figure 3

The numerical solutions and the exact solutions of Example 1

Table 4 \(L^2\) error and \(H^{\frac{\alpha }{2}}\) error versus mesh size h and orders of convergence for Example 1
Table 5 \(L^2\) error of u, y and z versus mesh size h and orders of convergence for Example 1
Table 6 \(H^{\frac{\alpha }{2}}\) error of y and z versus mesh size h and orders of convergence for Example 1

In the second numerical test, we take \(\gamma =10^{-4}\), \(s=0.001\), \(u_a=-2.5\) and \(u_b = -0.5\). We do the same test and list the results in Tables 4, 5 and 6. The Figs. 4 and 5 show the convergence rate. The Fig. 6 shows the numerical solutions and the exact solutions of the optimal control problem with the mesh size \(h=1/64\). These results are in accordance with the theoretical analysis.

Fig. 4
figure 4

The convergence rate of Example 1

Fig. 5
figure 5

The convergence rate of Example 1

Fig. 6
figure 6

The numerical solution and the exact solution of Example 1

Meanwhile, we also test the performance of our fast algorithm combing with the BiCGStab method for solving the linear system. We denote the original one by “PDAS” and the fast algorithm by “FFT PDAS”. We compare the time consuming between these two algorithms. We list the test results in Table 7, where N is the number of the element, “T” is the total time consuming of the algorithm. We list the time consuming and the iteration number for solving the corresponding linear system, in each iteration of the primal dual active set method, by the BiCGStab method. For example, “1.054E\(-\)1(115.5)” means that the time consuming of solving the linear system is “1.054E\(-\)1s” and the iteration number of BiCGStab method is “115.5”. According to these results, we can see that the fast algorithm can speed up the primal dual active method for solving the constraint optimal control problem combining with the BiCGStab method.

Table 7 The time consuming of PDAS method for \(\gamma =1.0\)

6 Conclusion

In this paper, finite element approximation of optimal control governed by space fractional equation is investigated. A priori error estimate is derived. A fast primal dual active set algorithm based on the Toeplitz structure of coefficient matrix of discrete equations is designed. Numerical experiments are carried out to show the performance of the numerical method and the fast algorithm. There are many issues in this field can be addressed, for example, optimal control problem governed by fractional Laplacian operator, identification of the fractional order in a nonlocal equation and other type optimal control problem ([30]).