INTRODUCTION

Mathematical models involving fractional differentiation have become common for describing various phenomena in physics, mechanics and economics, such as thermal diffusion in fractal domains and flow in highly heterogeneous aquifers [1–5]. Recently, different generalizations and modifications of the traditional Caputo and Riemann–Liouville fractional derivatives have been used for mathematical modeling of these processes [6–8].

In the articles [9–12] some control problems for fractional diffusion equations have been studied. The optimal control problems solved in these papers involve a control on the right-hand side of the state equation with Dirichlet boundary conditions and a quadratic objective functional with an observation function distributed in the domain. For these problems the existence of a unique solution has been proved and first-order optimality conditions have been derived.

Considerable attention is paid to the numerical analysis of boundary value problems for partial differential equations of fractional order (cf., e.g. [13–19] and the bibliography therein). Various approximations of equations with fractional derivatives were constructed, and error estimates were proved under the assumption of sufficient smoothness of the solution. The convergence and rate of convergence of an approximate solution to a regular weak solution of Dirichlet boundary value problems with Caputo fractional derivative in time were also studied.

The study of numerical methods for the optimal control problems governed by PDEs with fractional derivatives has only begun to attract attention. We highlight the articles [20–22] in this direction. In [21] finite element approximation of the unconstrained optimal control problems governed by time fractional diffusion equations is investigated. The stability and truncation error of the fully discrete scheme are analyzed. In [22] the numerical analysis for a distributed optimal control problem, with box constraint on the control, governed by a subdiffusion equation which involves a fractional derivative in time was presented. The rate of convergence for the numerical solutions of the optimal control problem constructed by applying finite element method in space and \(L1\)-backward Euler scheme in time was established.

In this article, we consider constrained optimal control problems governed by linear parabolic equations with a fractional time derivative in various definitions and with mixed boundary conditions. The control acts on the right-hand side of the equation and in the Neumann boundary condition, and the objective functional is convex and lower semicontinuous. Specifically, it can contain a quadratic part and the indicator functions of the constraint sets for the control and state functions. The differential optimal control problem is approximated by an implicit scheme using the \(L1\)-approximation in time and the finite element method with quadrature formulas in the spatial variables. We prove the existence of a unique solution of the differential and discrete optimal control problems. The investigation of the convergence of the constructed mesh approximations is beyond the scope of our research. The main aim is to develop effective iterative solution methods for the discrete problems. In the investigations we rely on the results of our research on constructing iterative methods for finite-dimensional saddle point problems with applications to mesh approximations of optimal control problems for equations with integer derivatives [23–25].

An essential point in constructing effective iterative methods is the proof of suitable stability estimates for mesh approximations of the state equation. In this article, we are considering a problem with an objective function containing \(L^{2}\)-norms of control and state functions, so, a stability estimate is required for the solution of the state equation in the \(L^{2}\)-norm through \(L^{2}\)-norms of control functions. Note that estimates known from the literature contain maximum norms in time of the right side of the equation. To get the required estimate, we need a uniform mesh in time.

The rest of the article is organized as follows. First, in Section 1, we formulate the mixed boundary value problem for the diffusion equation with fractional time derivatives in the sense of Caputo, generalized Caputo and Caputo–Fabrizio, and the optimal control problem. The existence of a unique solution of the optimal control problem is proved.

Then, in Section 2, we construct a finite-dimensional approximation of the differential problem and investigate the properties of the constructed discrete problem. In particular, we prove the stability estimate, which is necessary when analyzing the convergence of iterative solution methods.

In Section 3 we study iterative solution methods for the mesh optimal control problems. We use the Lagrange function for constructing iterative methods, and prove the existence of its saddle point and the convergence of the iterative methods. In the particular case when there are no constraints on the state of the problem, the rate of convergence is established.

Some possible generalizations of the results, including their extension to an optimal control problem with a space-time fractional state equation, are discussed in Section 4.

The results of numerical experiments are presented in Section 5.

1 PROBLEM FORMULATION

First of all, we recall the definitions of several fractional order derivatives.

  • The classical Caputo fractional derivative

    $$\displaystyle\mathcal{D}^{\alpha}_{t}y(t)=\dfrac{1}{\Gamma(1-\alpha)}\int\limits_{0}^{t}(t-s)^{-\alpha}\dfrac{\partial y}{\partial s}(s)ds,\quad 0<\alpha<1,$$
    (1)

    where \(\Gamma(x)\) is the gamma function.

  • The generalized Caputo fractional derivative [7]

    $$\displaystyle\mathcal{D}^{\alpha,g}_{t}y(t)=\dfrac{1}{\Gamma(1-\alpha)}\int\limits_{0}^{t}\frac{g(t-s)}{(t-s)^{\alpha}}\frac{\partial y}{\partial s}(s)ds,\quad 0<\alpha<1,$$
    (2)

    with a weighting function \(g(t)\in C^{2}[0,T]\), such that \(g(t)>0\) and \(g^{\prime}(t)\leqslant 0\) for all \(t\in[0,T]\).

  • Caputo–Fabrizio derivative [6]

    $${}_{CF}\mathcal{D}^{\alpha}_{t}y(t)=\dfrac{M(\alpha)}{1-\alpha}\int\limits_{0}^{t}\exp\left(-\alpha\frac{t-s}{1-\alpha}\right)\dfrac{\partial y}{\partial s}(s)ds,\quad 0<\alpha<1,$$
    (3)

where \(M(\alpha)\) is a smooth positive function such that \(M(0)=M(1)=1\).

All of these fractional differentiation operators in (1)–(3) are particular cases of the operator

$$\displaystyle\mathcal{D}_{t}y(t)=\int\limits_{0}^{t}G(t-s)\dfrac{\partial y}{\partial s}(s)ds$$
(4)

with a kernel \(G(t)\) satisfying the following properties:

$$\displaystyle G(t)\ \text{is continuous, positive, convex, and strictly decreasing on}\ (0,+\infty),$$
(5)

$$\displaystyle\int\limits_{0}^{+\infty}G(t)dt<\infty.$$
(6)

In what follows we use the notation \(\mathcal{D}_{t}\) for a fractional order differential operator (4) with a kernel satisfying the properties (5), (6).
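The three kernels behind (1)–(3) can be written down and checked numerically against the stated kernel properties (positivity, strict monotonicity, convexity). The following is a minimal sketch, not part of the original method: the function names, the weight \(g(t)=e^{-t}\) and the choice \(M(\alpha)=1\) are illustrative assumptions, and convexity is tested only through second differences on a sample grid.

```python
import math

def caputo_kernel(t, alpha):
    """Kernel of the classical Caputo derivative (1): t^(-alpha) / Gamma(1 - alpha)."""
    return t ** (-alpha) / math.gamma(1.0 - alpha)

def generalized_caputo_kernel(t, alpha, g):
    """Kernel of the generalized Caputo derivative (2) with a weighting function g."""
    return g(t) * caputo_kernel(t, alpha)

def caputo_fabrizio_kernel(t, alpha, M_alpha=1.0):
    """Kernel of the Caputo-Fabrizio derivative (3); M_alpha stands for M(alpha)."""
    return M_alpha / (1.0 - alpha) * math.exp(-alpha * t / (1.0 - alpha))

def is_positive_decreasing_convex(G, ts):
    """Check positivity, strict decrease and discrete convexity of G on a uniform grid ts."""
    v = [G(t) for t in ts]
    positive = all(x > 0 for x in v)
    decreasing = all(a > b for a, b in zip(v, v[1:]))
    convex = all(v[i - 1] - 2 * v[i] + v[i + 1] >= -1e-12 for i in range(1, len(v) - 1))
    return positive and decreasing and convex

alpha = 0.5
ts = [0.01 * i for i in range(1, 201)]      # sample points in (0, 2]
weight = lambda t: math.exp(-t)              # hypothetical weight: g > 0, g' <= 0
kernels = {
    "Caputo": lambda t: caputo_kernel(t, alpha),
    "generalized Caputo": lambda t: generalized_caputo_kernel(t, alpha, weight),
    "Caputo-Fabrizio": lambda t: caputo_fabrizio_kernel(t, alpha),
}
checks = {name: is_positive_decreasing_convex(G, ts) for name, G in kernels.items()}
```

Here `checks` maps each kernel name to the outcome of the grid test; a grid test of this kind can only refute the properties, it does not prove them.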

Let \(\Omega\subset\mathbb{R}^{2}\) be a bounded domain with Lipschitz continuous boundary \(\partial\Omega=\Gamma_{D}\cup\Gamma_{N},\) \({\textrm{meas}}\Gamma_{D}>0\), \(Q_{T}=\Omega\times(0,T]\), \(\Sigma_{D}=\Gamma_{D}\times(0,T]\) and \(\Sigma_{N}=\Gamma_{N}\times(0,T]\). We consider a parabolic initial-boundary value problem

$$\displaystyle\mathcal{D}_{t}y-\Delta y=f\quad\textrm{in}\quad Q_{T},$$
$$\displaystyle y=0\quad\textrm{on}\quad\Sigma_{D},\quad\frac{\partial y}{\partial n}=q\quad\textrm{on}\quad\Sigma_{N},$$
$$y=0\quad\textrm{for}\quad t=0,\quad x\in\Omega,$$
(7)

where \(\Delta\) is the Laplace operator, and the functions \(f(x,t)\) and \(q(x,t)\) are defined in \(Q_{T}\) and on \(\Sigma_{N}\), respectively. In what follows these functions play the role of controls, while \(y\) is the state function.

Now we briefly discuss the question of the existence of weak solutions to problem (7). Multiplying the differential equation in (7) by an infinitely differentiable function \(v(x,t)\) with \(v|_{\Sigma_{D}}=0\), integrating over \(Q_{T}\) and applying Green's formula together with the boundary conditions, we obtain a variational equation that can serve as a basis for defining a weak solution:

$$\int\limits_{Q_{T}}\mathcal{D}_{t}yvdxdt+\int\limits_{Q_{T}}\nabla y\cdot\nabla vdxdt=\int\limits_{Q_{T}}fvdxdt+\int\limits_{\Sigma_{N}}qvd\Gamma dt.$$

Let \(V=\{y\in H^{1}(\Omega):y=0\ \textrm{a.e.}\ x\in\Gamma_{D}\}\) and \({}_{0}H^{\gamma}(0,T)=\{u\in H^{\gamma}(0,T):u(0)=0\}\) if \(1/2<\gamma\leqslant 1\), \({}_{0}H^{\gamma}(0,T)=H^{\gamma}(0,T)\) for \(0\leqslant\gamma<1/2\). We use also the spaces \(L^{2}(0,T;X)\) and \(H^{\gamma}(0,T;X)\) with \(\gamma>0\) and a Banach space \(X\) (cf., e.g. [26, 27] for more details).

For the case of the Dirichlet boundary value problem (i.e. \(\Gamma_{N}=\emptyset\) in our problem) and the classical Caputo derivative \(\mathcal{D}_{t}\), the existence of a unique solution to problem (7) from \(H^{\alpha}(0,T;L^{2}(\Omega))\cap L^{2}(0,T;H_{0}^{1}(\Omega)\cap H^{2}(\Omega))\) is proved in [28], and the corresponding a priori estimate through the \(L^{2}(Q_{T})\)-norm of the right-hand side \(f\) is given. In [29] the existence and uniqueness of a solution from \(B^{\alpha/2}(Q_{T})=H^{\alpha/2}(0,T;L^{2}(\Omega))\cap L^{2}(0,T;V)\) is established for (7) with mixed boundary conditions and the classical Caputo derivative. The corresponding a priori estimate is as follows:

$$||y||_{B^{\alpha/2}(Q_{T})}\leqslant C_{1}||f||_{L^{2}(Q_{T})}+C_{2}||q||_{L^{2}(\Sigma_{N})},\quad C_{1},C_{2}={\textrm{const}}.$$

As for the general case of the problem (7), we assume that it has a unique weak solution, and for our future needs it suffices that the following estimate holds:

$$||y||_{L^{2}(0,T;V)}\leqslant C_{1}||f||_{L^{2}(Q_{T})}+C_{2}||q||_{L^{2}(\Sigma_{N})},\quad C_{1},C_{2}={\textrm{const}}.$$
(8)

Now we formulate the optimal control problem in which the functions \(f\) and \(q\) are the controls, and \(y\) is the state function. We introduce the objective functional

$$\displaystyle J(y,f,q)=J_{0}(y,f,q)+\psi(y)+\varphi_{f}(f)+\varphi_{q}(q),$$
(9)

where

$$\displaystyle J_{0}(y,f,q)=\frac{1}{2}\int\limits_{Q_{T}}(y(x,t)-y_{d}(x,t))^{2}dxdt+\frac{1}{2}\int\limits_{Q_{T}}f^{2}dxdt+\frac{1}{2}\int\limits_{\Sigma_{N}}q^{2}d\Gamma dt,$$
$$\psi:L^{2}(Q_{T})\to\bar{\mathbb{R}},\varphi_{f}:L^{2}(Q_{T})\to\bar{\mathbb{R}},\varphi_{q}:L^{2}(\Sigma_{N})\to\bar{\mathbb{R}}\quad\text{are convex,}$$
$$\text{proper and lower semicontinuous functions}\quad(\bar{\mathbb{R}}={\mathbb{R}}\cup\{+\infty\}).$$
(10)

Above, \(y_{d}(x,t)\in L^{2}(Q_{T})\) is a given observation function, while the functions \(\psi,\) \(\varphi_{f}\) and \(\varphi_{q}\) account for the control and state constraints.

The optimal control problem we solve is as follows:

$$\textrm{find}\quad\min\limits_{(y,f,q)\in W}J(y,f,q),$$
$$W=\{(y,f,q)\in L^{2}(Q_{T})\times L^{2}(Q_{T})\times L^{2}(\Sigma_{N})\quad\text{satisfy state problem (7)}\}.$$
(11)

Theorem 1. Let the assumptions (8), (10) be fulfilled and

$$K=\{(y,f,q)\in{\textrm{dom}}\psi\times{\textrm{dom}}\varphi_{f}\times{\textrm{dom}}\varphi_{q}:\quad\text{equation (7) holds}\}\neq\emptyset.$$
(12)

Then optimal control problem (11) has a unique solution.

Proof. The functional \(J(y,f,q)\) can attain its finite minimum only on the set \(K\subset W\), so we will prove the existence of a unique solution to problem

$$\min\limits_{(y,f,q)\in K}J(y,f,q).$$
(13)

Next, the assumptions (10) ensure that the functional \(J(y,f,q)\) is convex and lower semicontinuous on the space \(L^{2}(Q_{T})\times L^{2}(Q_{T})\times L^{2}(\Sigma_{N})\). The quadratic functional \(J_{0}(y,f,q)\) is coercive, so the functional \(J(y,f,q)\) is also coercive. Since the sets \({\textrm{dom}}\psi\), \({\textrm{dom}}\varphi_{f}\) and \({\textrm{dom}}\varphi_{q}\) are convex and the state equation (7) is linear, the set \(K\) is also convex. Due to the closedness of these sets and to the stability inequality (8), the set \(K\) is closed. The listed properties of \(K\) and \(J(y,f,q)\) ensure (cf., e.g., [30], p. 44) the existence of a solution to problem (13), which is equivalent to (11). Moreover, because of the strict convexity of \(J(y,f,q)\) with respect to \(f\) and \(q\) and the linearity of the state equation (7), it is strictly convex with respect to \(y\). Thus, the solution of the problem is unique. \(\Box\)

Examples of the constraint functions are the indicator functions of closed and convex sets:

$$\displaystyle\varphi_{f}(f)=I_{U_{ad}},\quad U_{ad}=\{f\in L^{2}(Q_{T}):|f(x,t)|\leqslant\bar{f}\quad\textrm{in}\quad Q_{T}\},$$
$$\displaystyle\psi(y)=I_{Y_{ad}},\quad Y_{ad}=\{y\in L^{2}(Q_{T}):y_{\textrm{min}}\leqslant y(x,t)\leqslant y_{\textrm{max}}\quad\textrm{in}\quad Q_{T}\}.$$
(14)

If \(-\infty\leqslant y_{\textrm{min}}<0<y_{\textrm{max}}\leqslant+\infty,\) then the condition (12) is satisfied, because the null function \((0,0,0)\in{\textrm{dom}}\psi\times{\textrm{dom}}\varphi_{f}\times{\textrm{dom}}\varphi_{q}\) satisfies the state equation (7).

2 APPROXIMATION

2.1 Approximation of the State Equation

First, we approximate the time fractional derivative. Let \(\omega_{\tau}=\{t_{j}=j\tau,\ j=0,1,\ldots M;\ M\tau=T\}\) be a uniform mesh on the segment \([0,T]\) and \(y^{j}=y(t_{j})\) for a continuous function \(y(t)\). The approximation of the derivative \(\mathcal{D}_{t}y(t)=\int\limits_{0}^{t}G(t-s)\dfrac{\partial y}{\partial s}(s)ds\) of a continuous function \(y(t)\) with \(y(0)=0\) at a point \(t_{k}=k\tau\in\omega_{\tau}\) is as follows:

$$\displaystyle\mathcal{D}_{t}y(t_{k})\approx\partial_{t}y(t_{k})=d_{1}y^{k}+\sum\limits_{j=1}^{k-1}(d_{j+1}-d_{j})y^{k-j},\quad\displaystyle d_{j}=\dfrac{1}{\tau}\int\limits_{t_{k-j}}^{t_{k-j+1}}G(t_{k}-s)ds=\dfrac{1}{\tau}\int\limits_{(j-1)\tau}^{j\tau}G(u)du.$$

Due to (5) the coefficients satisfy the inequalities

$$d_{1}>d_{2}>\cdots>d_{M}>0.$$
(15)

We emphasize one property that is important for further research, namely, the independence of the coefficients \(d_{j}\) from the number \(k\) of the time level. This is a consequence of using a uniform mesh in the time variable.
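For the classical Caputo kernel \(G(t)=t^{-\alpha}/\Gamma(1-\alpha)\) the integrals defining \(d_{j}\) can be evaluated in closed form, \(d_{j}=\tau^{-\alpha}\big(j^{1-\alpha}-(j-1)^{1-\alpha}\big)/\Gamma(2-\alpha)\). The sketch below, with illustrative function names and parameter values, computes these coefficients, lets one inspect the ordering (15), and applies the difference operator \(\partial_{t}\) to \(y(t)=t\), for which the approximation reproduces the exact Caputo derivative \(t^{1-\alpha}/\Gamma(2-\alpha)\) up to round-off.

```python
import math

def l1_coefficients_caputo(alpha, tau, M):
    """d_j = (1/tau) * integral of G over [(j-1)tau, j*tau], G(t) = t^(-alpha)/Gamma(1-alpha)."""
    c = tau ** (-alpha) / math.gamma(2.0 - alpha)
    return [c * (j ** (1.0 - alpha) - (j - 1) ** (1.0 - alpha)) for j in range(1, M + 1)]

def discrete_derivative(d, y, k):
    """d_1 y^k + sum_{j=1}^{k-1} (d_{j+1} - d_j) y^{k-j}, assuming y^0 = 0.

    d[j-1] stores d_j and y[k] stores y^k.
    """
    value = d[0] * y[k]
    for j in range(1, k):
        value += (d[j] - d[j - 1]) * y[k - j]
    return value

alpha, T, M = 0.5, 1.0, 64          # illustrative parameter values
tau = T / M
d = l1_coefficients_caputo(alpha, tau, M)

# Test function y(t) = t with Caputo derivative t^(1-alpha) / Gamma(2-alpha).
y = [k * tau for k in range(M + 1)]
errors = [
    abs(discrete_derivative(d, y, k) - (k * tau) ** (1.0 - alpha) / math.gamma(2.0 - alpha))
    for k in range(1, M + 1)
]
```

The list `d` does not depend on the time level `k`, which is exactly the property used later to obtain a Toeplitz matrix.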

The semidiscrete implicit scheme approximating problem (7) is as follows: find \((y(x,t_{1}),y(x,t_{2}),\ldots,y(x,t_{M}))\) with \(y(x,t_{k})\in V\) and \(y(x,0)=0\), such that for all \(k\) and all \(v\in V\)

$$\int\limits_{\Omega}\partial_{t}y(x,t_{k})v(x)dx+\int\limits_{\Omega}\nabla y(x,t_{k})\cdot\nabla v(x)dx=\int\limits_{\Omega}f(x,t_{k})v(x)dx+\int\limits_{\Gamma_{N}}q(x,t_{k})v(x)d\Gamma.$$
(16)

Since \(\partial_{t}y(x,t_{k})=d_{1}y(x,t_{k})+\sum\limits_{j=1}^{k-1}(d_{j+1}-d_{j})y(x,t_{k-j}),\) then a \(k\)-th equation of system (16) is a variational equation

$$a_{k}(y,v)=\int\limits_{\Omega}d_{1}y(x,t_{k})v(x)dx+\int\limits_{\Omega}\nabla y(x,t_{k})\cdot\nabla v(x)dx=F_{k}(v),$$
$$F_{k}(v)=\int\limits_{\Omega}f(x,t_{k})v(x)dx+\int\limits_{\Gamma_{N}}q(x,t_{k})v(x)d\Gamma-\sum\limits_{j=1}^{k-1}(d_{j+1}-d_{j})\int\limits_{\Omega}y(x,t_{k-j})v(x)dx$$
(17)

with a bounded and coercive bilinear form \(a_{k}:V\times V\to{\mathbb{R}}\) and a linear functional \(F_{k}\in V^{*}\). By the Lax–Milgram theorem, (17) has a unique solution.

A fully discrete approximation of problem (7) is constructed using the approximation with respect to the spatial variables of the elliptic problem (17). For definiteness, we construct a finite element scheme with first-order finite elements in the polygonal domain \(\Omega\) and simple quadrature formulas on triangles and their sides. When solving test problems, we also consider the case of a rectangular domain \(\Omega\), and approximate (7) using the bilinear elements and trapezoidal quadrature formulas.

Let \(T_{h}\) be a family of non-overlapping closed triangles \(e\) (finite elements) with maximal diameter \(h\). We suppose that \(T_{h}\) is a conforming and regular triangulation \(\overline{\Omega}=\bigcup\limits_{e\in T_{h}}e\) of \(\overline{\Omega}\) ([31], p. 124) and that \(T_{h}\) generates the triangulation \(\partial T_{h}\) of \(\overline{\Gamma}_{N}\), i.e. \(\overline{\Gamma}_{N}\) consists of a whole number of sides \(\partial e\) of elements \(e\in T_{h}\). We define the finite element space \(V_{h}\subset V\) of continuous and piecewise linear functions (linear on each \(e\)) that vanish on the boundary \(\Gamma_{D}\) and the finite element space \(Q_{h}\) of piecewise linear functions on \(\Gamma_{N}\) (linear on each \(\partial e\subset\Gamma_{N}\)), which are the traces on \(\Gamma_{N}\) of the functions from \(V_{h}\). We denote by \(y_{h}\) with subscript \(h\) a mesh function from the space \(V_{h}\) or \(Q_{h}\). Let \(V_{h\tau}\) be the linear space of the mesh functions \(y_{h}(t):\omega_{\tau}\to V_{h}\), and \(Q_{h\tau}\) be the linear space of the mesh functions \(q_{h}(t):\omega_{\tau}\to Q_{h}\). By \(y_{h}^{k}\) we mean the value of a mesh function from \(V_{h\tau}\) or \(Q_{h\tau}\) at a time level \(t_{k}\in\omega_{\tau}\).

We use quadrature formulas approximating the integrals of the continuous function \(g(x)\):

$$\int\limits_{e}g(x)dx\approx S_{e}(g)=\frac{{\textrm{meas}}(e)}{3}\sum_{\alpha=1}^{3}g(x_{\alpha}),\quad\int\limits_{\partial e}g(x)d\Gamma\approx S_{\partial e}(g)=\frac{{\textrm{meas}}(\partial e)}{2}\sum_{\alpha=1}^{2}g(x_{\alpha}),$$

where \(x_{\alpha}\) are the vertices of \(e\) and \(\partial e\), respectively, and the composite quadrature formulas \(S_{\Omega}(g)=\sum\limits_{e\in T_{h}}S_{e}(g)\) and \(S_{\Gamma}(g)=\sum\limits_{\partial e\in\partial T_{h}}S_{\partial e}(g)\) approximating the integrals over the domain \(\Omega\) and the boundary \(\Gamma\), respectively.
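The vertex quadrature rules \(S_{e}\) and \(S_{\partial e}\) are exact for functions that are linear on the element. A minimal sketch, assuming a unit square split into two triangles and a hypothetical linear integrand \(g(x_{1},x_{2})=1+2x_{1}+3x_{2}\) (both chosen only for illustration):

```python
import math

def triangle_area(p1, p2, p3):
    """Area of the triangle with vertices p1, p2, p3."""
    return 0.5 * abs((p2[0] - p1[0]) * (p3[1] - p1[1]) - (p3[0] - p1[0]) * (p2[1] - p1[1]))

def S_e(g, vertices):
    """Vertex quadrature on a triangle: meas(e)/3 times the sum of vertex values."""
    p1, p2, p3 = vertices
    return triangle_area(p1, p2, p3) / 3.0 * sum(g(p) for p in vertices)

def S_side(g, a, b):
    """Trapezoidal quadrature on a side: meas(side)/2 times the sum of end-point values."""
    length = math.hypot(b[0] - a[0], b[1] - a[1])
    return length / 2.0 * (g(a) + g(b))

g = lambda p: 1.0 + 2.0 * p[0] + 3.0 * p[1]          # hypothetical linear integrand

# Unit square split into two triangles (an illustrative mesh).
triangles = [((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)),
             ((0.0, 0.0), (1.0, 1.0), (0.0, 1.0))]
S_Omega = sum(S_e(g, e) for e in triangles)           # exact integral over the square is 3.5
S_right = S_side(g, (1.0, 0.0), (1.0, 1.0))           # exact integral over the side x1 = 1 is 4.5
```

Since the product of two different nodal basis functions vanishes at every vertex, these rules produce diagonal mass matrices, which is what makes pointwise constraints easy to handle in the algebraic problem.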

The fully discrete scheme approximating problem (7) has the following form:

$$\textrm{find}\quad y_{h}(t)\in V_{h\tau}\quad\text{such that}\quad y^{0}_{h}=0\quad\textrm{and}$$
$$S_{\Omega}\Big{(}\partial_{t}y^{k}_{h}v_{h}\Big{)}+S_{\Omega}\Big{(}\nabla y^{k}_{h}\cdot\nabla v_{h}\Big{)}=S_{\Omega}\Big{(}f^{k}_{h}\ v_{h}\Big{)}+S_{\Gamma}\Big{(}q^{k}_{h}\ v_{h}\Big{)}\quad\forall v_{h}\in V_{h}\quad\textrm{for}\quad k=1,2,\ldots,M.$$
(18)

Similar to (17), problem (18) has a unique solution.

2.2 Stability Estimate

Let us introduce the lower triangular Toeplitz \(M\times M\) matrix

$$B=\begin{pmatrix}d_{1}&0&0&0&\cdots&0&0\\ d_{2}-d_{1}&d_{1}&0&0&\cdots&0&0\\ d_{3}-d_{2}&d_{2}-d_{1}&d_{1}&0&\cdots&0&0\\ ...&...&...&...&\cdots&...&...\\ d_{M}-d_{M-1}&d_{M-1}-d_{M-2}&d_{M-2}-d_{M-3}&...&\cdots&d_{2}-d_{1}&d_{1}\end{pmatrix},$$
(19)

where \(d_{j}\) are the coefficients of the approximation \(\partial_{t}y\) of Caputo derivative. Using this matrix, this approximation at a point \(t_{k}=k\tau\in\omega_{\tau}\) can be written as \(\partial_{t}y(t_{k})=\Big{(}By(t)\Big{)}^{k}.\)

Lemma 1. The matrix \(\frac{1}{2}(B+B^{T})\) is positive definite:

$$\left(\frac{1}{2}(B+B^{T})z,z\right)_{t}\geqslant\chi_{0}||z||_{t}^{2}\quad\forall z\in{\mathbb{R}^{M}},$$
(20)

where \(\Big{(}.,.\Big{)}_{t}\) and \(||.||_{t}\) are the inner product and Euclidean norm in \({\mathbb{R}^{M}}\), and \(\chi_{0}=G(T/2).\)

Proof. The basic properties of the matrix \(B\) follow from the properties (15):

  • \(B\) has positive diagonal elements and non-positive off-diagonal elements,

  • \(B\) is strictly diagonally dominant both in rows and columns.

Based on these properties, the article [32] proves the positive definiteness of the matrix, as well as the inequality

$$\Big{(}\frac{1}{2}(B+B^{T})z,z\Big{)}_{t}\geqslant\min\limits_{1\leqslant k\leqslant M}\frac{1}{2}(d_{k}+d_{M-k+1})||z||_{t}^{2}.$$

The estimate

$$\min\limits_{1\leqslant k\leqslant M}\frac{1}{2}(d_{k}+d_{M-k+1})\geqslant\min\limits_{0\leqslant t\leqslant T}\frac{1}{2}\Big{(}G(t)+G(T-t)\Big{)}=G(T/2)$$

can be proved by direct calculations using the property of strict monotonicity and convexity of the kernel function \(G(t)\). \(\Box\)
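Lemma 1 can be probed numerically. The sketch below assumes the classical Caputo kernel and illustrative values of \(\alpha\), \(T\) and \(M\); it assembles the matrix (19), evaluates the row-sum bound \(\min_{k}\frac{1}{2}(d_{k}+d_{M-k+1})\), and samples the Rayleigh quotient \((Bz,z)_{t}/||z||_{t}^{2}\) on random vectors. Random sampling only gives an upper estimate of the smallest eigenvalue of \(\frac{1}{2}(B+B^{T})\), so it is a consistency check rather than a proof.

```python
import math
import random

def l1_coefficients_caputo(alpha, tau, M):
    """Closed-form coefficients d_j for the kernel t^(-alpha)/Gamma(1-alpha)."""
    c = tau ** (-alpha) / math.gamma(2.0 - alpha)
    return [c * (j ** (1.0 - alpha) - (j - 1) ** (1.0 - alpha)) for j in range(1, M + 1)]

def toeplitz_B(d):
    """Lower triangular Toeplitz matrix (19): d_1 on the diagonal, d_{m+1} - d_m on subdiagonal m."""
    M = len(d)
    B = [[0.0] * M for _ in range(M)]
    for i in range(M):
        B[i][i] = d[0]
        for j in range(i):
            B[i][j] = d[i - j] - d[i - j - 1]
    return B

def rayleigh_quotient(B, z):
    """(Bz, z) / (z, z); it coincides with the quotient for the symmetric part of B."""
    Bz = [sum(bij * zj for bij, zj in zip(row, z)) for row in B]
    return sum(a * b for a, b in zip(Bz, z)) / sum(v * v for v in z)

alpha, T, M = 0.5, 1.0, 40                                  # illustrative values
tau = T / M
d = l1_coefficients_caputo(alpha, tau, M)
B = toeplitz_B(d)
chi0 = (T / 2.0) ** (-alpha) / math.gamma(1.0 - alpha)      # G(T/2)

row_sum_bound = min(0.5 * (d[k] + d[M - 1 - k]) for k in range(M))
random.seed(0)
sampled_min = min(
    rayleigh_quotient(B, [random.uniform(-1.0, 1.0) for _ in range(M)])
    for _ in range(200)
)
```

In this setting one expects `chi0 <= row_sum_bound <= sampled_min`, in agreement with the chain of inequalities in the proof.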

Below we use the following notation for the scalar products and norms in the spaces of mesh functions \(V_{h}\) and \(Q_{h}\):

$$(v_{h},z_{h})_{0,\Omega}=S_{\Omega}(v_{h}z_{h}),\quad\displaystyle||v_{h}||_{0,\Omega}=(v_{h},v_{h})_{0,\Omega}^{1/2},$$
$$\displaystyle||v_{h}||_{1,\Omega}=||\nabla v_{h}||_{0,\Omega},\quad||v_{h}||_{-1,\Omega}=\sup\limits_{z_{h}\neq 0}\frac{(v_{h},z_{h})_{0,\Omega}}{||z_{h}||_{1,\Omega}},$$
$$\displaystyle(v_{h},z_{h})_{0,\Gamma}=S_{\Gamma}(v_{h}z_{h}),\quad||v_{h}||_{0,\Gamma}=(v_{h},v_{h})_{0,\Gamma}^{1/2},\quad||v_{h}||_{-1/2,\Gamma}=\sup\limits_{z_{h}\neq 0}\frac{(v_{h},z_{h})_{0,\Gamma}}{||z_{h}||_{1,\Omega}}.$$

The following analogs of embedding and trace inequalities are well-known (cf., e.g. [31]):

$$||v_{h}||_{0,\Omega}\leqslant c_{\Omega}||v_{h}||_{1,\Omega},\quad||v_{h}||_{0,\Gamma}\leqslant c_{\Gamma}||v_{h}||_{1,\Omega}\quad\forall v_{h}\in V_{h}.$$
(21)

Theorem 2. For the solution of problem (18) the following stability estimate holds:

$$\sum\limits_{k=1}^{M}\left(\chi_{0}||y_{h}^{k}||^{2}_{0,\Omega}+\frac{1}{2}||y_{h}^{k}||^{2}_{1,\Omega}\right)\leqslant\sum\limits_{k=1}^{M}\left(||f_{h}^{k}||^{2}_{-1,\Omega}+||q^{k}_{h}||^{2}_{-1/2,\Gamma}\right).$$
(22)

Proof. Taking \(v_{h}=y^{k}_{h}\) in (18), after summation over \(k\) we get

$$\sum\limits_{k=1}^{M}S_{\Omega}\Big{(}\partial_{t}y^{k}_{h}\ y^{k}_{h}\Big{)}+\sum\limits_{k=1}^{M}S_{\Omega}\Big{(}\nabla y^{k}_{h}\cdot\nabla y^{k}_{h}\Big{)}=\sum\limits_{k=1}^{M}\Big{(}S_{\Omega}\Big{(}f^{k}_{h}\ y^{k}_{h}\Big{)}+S_{\Gamma}\Big{(}q^{k}_{h}\ y^{k}_{h}\Big{)}\Big{)}.$$

Due to Lemma 1

$$\displaystyle\sum\limits_{k=1}^{M}S_{\Omega}\Big{(}\partial_{t}y^{k}_{h}\ y^{k}_{h}\Big{)}=S_{\Omega}\Big{(}(By,y)_{t}\Big{)}\geqslant\chi_{0}\sum\limits_{k=1}^{M}S_{\Omega}\Big{(}(y^{k}_{h})^{2}\Big{)}=\chi_{0}\sum\limits_{k=1}^{M}||y_{h}^{k}||^{2}_{0,\Omega},$$

and therefore

$$\sum\limits_{k=1}^{M}\Big{(}\chi_{0}||y_{h}^{k}||^{2}_{0,\Omega}+||y_{h}^{k}||^{2}_{1,\Omega}\Big{)}\leqslant\sum\limits_{k=1}^{M}\Big{(}\Big{|}\Big{(}f^{k}_{h},y^{k}_{h}\Big{)}_{0,\Omega}\Big{|}+\Big{|}\Big{(}q^{k}_{h},y^{k}_{h}\Big{)}_{0,\Gamma}\Big{|}\Big{)}.$$

Using the inequalities

$$\Big{|}\Big{(}f^{k}_{h},y^{k}_{h}\Big{)}_{0,\Omega}\Big{|}\leqslant||f_{h}^{k}||_{-1,\Omega}||y_{h}^{k}||_{1,\Omega},\quad\Big{|}\Big{(}q^{k}_{h},y^{k}_{h}\Big{)}_{0,\Gamma}\Big{|}\leqslant||q^{k}_{h}||_{-1/2,\Gamma}||y_{h}^{k}||_{1,\Omega},$$

to evaluate the right-hand side, we easily get the estimate (22). \(\Box\)
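The last step of the proof can be spelled out as follows. Applying the elementary inequalities \(ab\leqslant\frac{1}{2}a^{2}+\frac{1}{2}b^{2}\) and \((a+b)^{2}\leqslant 2a^{2}+2b^{2}\) at every time level gives

$$\Big{(}||f_{h}^{k}||_{-1,\Omega}+||q_{h}^{k}||_{-1/2,\Gamma}\Big{)}||y_{h}^{k}||_{1,\Omega}\leqslant||f_{h}^{k}||^{2}_{-1,\Omega}+||q_{h}^{k}||^{2}_{-1/2,\Gamma}+\frac{1}{2}||y_{h}^{k}||^{2}_{1,\Omega},$$

and moving the term \(\frac{1}{2}||y_{h}^{k}||^{2}_{1,\Omega}\) to the left-hand side after summation over \(k\) yields (22).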

Corollary 1. The following stability estimate holds in the norm of the mesh space \(L^{2}(\omega_{t};L^{2}(\omega_{x}))\) for the solution of the state equation in terms of the corresponding \(L^{2}\)-norms of the control functions:

$$\sum\limits_{k=1}^{M}\tau||y_{h}^{k}||^{2}_{0,\Omega}\leqslant C_{0}\left(\sum\limits_{k=1}^{M}\tau||f_{h}^{k}||^{2}_{0,\Omega}+\sum\limits_{k=1}^{M}\tau||q^{k}_{h}||^{2}_{0,\Gamma}\right)$$
(23)

with a constant \(C_{0}\) which does not depend on the mesh parameters. To prove (23), it suffices to use (22) and the following inequalities arising from (21): \(||v_{h}||_{-1,\Omega}\leqslant c_{\Omega}||v_{h}||_{0,\Omega},\) \(||v_{h}||_{-1/2,\Gamma}\leqslant c_{\Gamma}||v_{h}||_{0,\Gamma}.\)
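One admissible value of the constant can be read off directly. Dropping the nonnegative term with the \(||\cdot||_{1,\Omega}\)-norm in the stability estimate of Theorem 2, bounding the negative norms as indicated and multiplying by \(\tau\), we get

$$\chi_{0}\sum\limits_{k=1}^{M}\tau||y_{h}^{k}||^{2}_{0,\Omega}\leqslant\sum\limits_{k=1}^{M}\tau\Big{(}c_{\Omega}^{2}||f_{h}^{k}||^{2}_{0,\Omega}+c_{\Gamma}^{2}||q_{h}^{k}||^{2}_{0,\Gamma}\Big{)},\quad\text{so that one may take}\quad C_{0}=\frac{\max\{c_{\Omega}^{2},c_{\Gamma}^{2}\}}{\chi_{0}}.$$

This value depends only on \(G(T/2)\) and on the constants in (21).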

2.3 Approximation of Optimal Control Problem

Let a function \(y_{dh}(t)\in V_{h\tau}\) be an approximation of the observation function \(y_{d}(x,t)\). The approximation of the quadratic functional \(J_{0}(y,f,q)\) has the following form:

$$\displaystyle J_{0h}(y_{h},f_{h},q_{h})=\frac{\tau}{2}\sum\limits_{k=1}^{M}\Big{(}||y_{h}^{k}-y_{dh}^{k}||^{2}_{0,\Omega}+||f_{h}^{k}||^{2}_{0,\Omega}+||q_{h}^{k}||^{2}_{0,\Gamma}\Big{)}.$$
(24)

Let also \(\psi_{h}:V_{h\tau}\to\bar{\mathbb{R}}\), \(\varphi_{fh}:V_{h\tau}\to\bar{\mathbb{R}}\) and \(\varphi_{qh}:Q_{h\tau}\to\bar{\mathbb{R}}\) be convex, proper and lower semicontinuous functions approximating, in some sense, the corresponding functions \(\psi\), \(\varphi_{f}\) and \(\varphi_{q}.\)

In the particular case when \(\psi\), \(\varphi_{f}\) and \(\varphi_{q}\) are indicator functions of sets of the type defined by (14), \(\varphi_{fh}\), \(\varphi_{qh}\), and \(\psi_{h}\) are the indicator functions of the sets \(U^{h}_{ad}=\{|f_{h}^{k}(x)|\leqslant\bar{f}\ \forall x\in\Omega,\ k=1,2,\ldots,M\},\) \(Q^{h}_{ad}=\{|q_{h}^{k}(x)|\leqslant\bar{q}\ \forall x\in\Gamma_{N},\ k=1,2,\ldots,M\}\) and \(Y^{h}_{ad}=\{y_{h}^{k}(x):\displaystyle y_{\textrm{min}}\leqslant y_{h}^{k}(x)\leqslant y_{\textrm{max}},\ \forall x\in\Omega,\ k=1,2,\ldots,M\}.\)

Now, the mesh optimal control problem is as follows:

$$\displaystyle\textrm{find}\quad\min\limits_{(y_{h},f_{h},q_{h})\in W_{h}}\{J_{h}(y_{h},f_{h},q_{h})=J_{0h}(y_{h},f_{h},q_{h})+\psi_{h}(y_{h})+\varphi_{fh}(f_{h})+\varphi_{qh}(q_{h})\}$$
$$W_{h}=\{(y_{h},f_{h},q_{h})\in V_{h\tau}\times V_{h\tau}\times Q_{h\tau}\quad\text{satisfy state equation }\ (18)\}.$$
(25)

Theorem 3. Let the set

$$\displaystyle K_{h}=\{(y_{h},f_{h},q_{h})\in{\textrm{dom}}\psi_{h}\times{\textrm{dom}}\varphi_{fh}\times\quad{\textrm{dom}}\varphi_{qh}\quad\text{satisfy state equation }\ (18)\}$$

be nonempty. Then the mesh optimal control problem (25) has a unique solution \((y_{h},f_{h},q_{h})\).

Proof. Since mesh state equation is linear and stability estimate (23) holds, then the set \(K_{h}\) is convex and closed. Next, mesh objective function \(J_{h}\) is strictly convex, lower semicontinuous and coercive. The listed properties of \(K_{h}\) and \(J_{h}\) ensure the existence of a unique solution to problem

$$\min\limits_{(y_{h},f_{h},q_{h})\in K_{h}}J_{h}(y_{h},f_{h},q_{h})=\min\limits_{(y_{h},f_{h},q_{h})\in W_{h}}J_{h}(y_{h},f_{h},q_{h}).$$

\(\Box\)

3 ITERATIVE SOLUTION METHOD

In what follows, we will use algebraic forms of the mesh problems, namely discrete problems for vectors of nodal values of mesh functions.

We denote by \(y\in{\mathbb{R}}^{N_{x}}\) the vector of nodal values of a function \(y_{h}\in V_{h}\) (\(N_{x}={\textrm{dim}}V_{h}\)). Then we get the “onto” correspondence \(y\Leftrightarrow y_{h}\). By \((.,.)_{x}\) and \(||.||_{x}\) we mean the inner product and Euclidean norm in \({\mathbb{R}}^{N_{x}}\). Similarly, a vector \(q\in{\mathbb{R}}^{N_{q}}\) corresponds to \(q_{h}\in Q_{h}\) (\(N_{q}={\textrm{dim}}Q_{h}\)), and \((.,.)_{q}\) and \(||.||_{q}\) are the inner product and Euclidean norm in \({\mathbb{R}}^{N_{q}}\). The dimensions of \(V_{h\tau}\) and \(Q_{h\tau}\) equal \(N_{x}M\) and \(N_{q}M\), respectively. By \((.,.)_{xt}\) and \(||.||_{xt}\) we denote the inner product and Euclidean norm in \({\mathbb{R}}^{N_{x}M}\). The notations \((.,.)_{qt}\) and \(||.||_{qt}\) have a similar meaning for the space \({\mathbb{R}}^{N_{q}M}\).

Define the stiffness matrix \(A\in{\mathbb{R}}^{N_{x}\times N_{x}}\), diagonal mass matrices \({\mathcal{M}}_{x}\in{\mathbb{R}}^{N_{x}\times N_{x}}\) and \({\mathcal{M}}_{q}\in{\mathbb{R}}^{N_{q}\times N_{q}}\), and rectangular matrix \(S_{q}\in{\mathbb{R}}^{N_{x}\times N_{q}}\), by the following equalities:

$$\displaystyle(Ay,z)_{x}=S_{\Omega}\left(\nabla y_{h}\cdot\nabla z_{h}\right),\quad({\mathcal{M}}_{x}y,z)_{x}=\Big{(}y_{h},z_{h}\Big{)}_{0,\Omega},$$
$$({\mathcal{M}}_{q}q,p)_{q}=\Big{(}q_{h},p_{h}\Big{)}_{0,\Gamma},\quad(S_{q}q,z)_{x}=\Big{(}q_{h},z_{h}\Big{)}_{0,\Gamma}.$$

Below we omit the index \(h\) and use the same notations for the mesh functions and for the vectors of their nodal values with a chosen ordering. In this regard, the notation \(y^{k}\) means a value \(y(x,t_{k})\in V_{h}\) of a mesh function from the space \(V_{h\tau}\) on a time level \(t_{k}\in\omega_{\tau}\), as well as the corresponding vector \(y^{k}\in{\mathbb{R}}^{N_{x}}\). The notation \(q^{k}=q(x,t_{k})\in{\mathbb{R}}^{N_{q}}\) has a similar meaning.

Using the introduced notations we can write mesh state equations (18) as the following system of linear algebraic equations:

$$\displaystyle y^{0}=0,\quad{\mathcal{M}}_{x}(By)^{k}+Ay^{k}={\mathcal{M}}_{x}f^{k}+S_{q}q^{k},\quad k=1,2,\ldots,M.$$
(26)

Let \(I_{t}\) be the \(M\times M\) identity matrix and let \(\otimes\) denote the Kronecker (tensor) product of matrices. We define the matrices

$${\mathcal{L}}=B\otimes{\mathcal{M}}_{x}+I_{t}\otimes A,\quad{\mathcal{M}}=I_{t}\otimes{\mathcal{M}}_{x},\quad{\mathcal{S}}=I_{t}\otimes S_{q},\quad{\mathcal{R}}=I_{t}\otimes{\mathcal{M}}_{q}.$$

Then the system of the equations (26) can be written in a brief form

$${\mathcal{L}}y={\mathcal{M}}f+{\mathcal{S}}q.$$
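Because \(B\) is lower triangular, the block system \({\mathcal{L}}y={\mathcal{M}}f+{\mathcal{S}}q\) can be solved level by level: at each time level one system with the matrix \(d_{1}{\mathcal{M}}_{x}+A\) is solved, with the history terms moved to the right-hand side. The sketch below illustrates this under simplifying assumptions that are not taken from the article: a hypothetical one-dimensional mesh with three interior nodes, homogeneous Dirichlet conditions at both ends (so there is no boundary control), the classical Caputo kernel, and a small dense Gaussian elimination in place of a production solver.

```python
import math

def l1_coefficients_caputo(alpha, tau, M):
    """Closed-form coefficients d_j for the kernel t^(-alpha)/Gamma(1-alpha)."""
    c = tau ** (-alpha) / math.gamma(2.0 - alpha)
    return [c * (j ** (1.0 - alpha) - (j - 1) ** (1.0 - alpha)) for j in range(1, M + 1)]

def solve(K, b):
    """Solve K x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    a = [row[:] + [bi] for row, bi in zip(K, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            m = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= m * a[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][j] * x[j] for j in range(r + 1, n))) / a[r][r]
    return x

def march_in_time(d, mass, A, rhs):
    """Solve (26) level by level; mass is the diagonal of M_x, rhs[k-1] = M_x f^k + S_q q^k."""
    n = len(mass)
    K = [[A[i][j] + (d[0] * mass[i] if i == j else 0.0) for j in range(n)] for i in range(n)]
    ys = []                                    # ys[k-1] stores y^k
    for k in range(1, len(rhs) + 1):
        b = rhs[k - 1][:]
        for j in range(1, k):                  # history terms go to the right-hand side
            w = d[j] - d[j - 1]                # d_{j+1} - d_j
            for i in range(n):
                b[i] -= w * mass[i] * ys[k - j - 1][i]
        ys.append(solve(K, b))
    return ys

alpha, T, M, h = 0.5, 1.0, 8, 0.25             # illustrative values
d = l1_coefficients_caputo(alpha, T / M, M)
mass = [h, h, h]                                # lumped (diagonal) mass matrix
A = [[2 / h, -1 / h, 0.0], [-1 / h, 2 / h, -1 / h], [0.0, -1 / h, 2 / h]]
rhs = [[h * 1.0] * 3 for _ in range(M)]         # M_x f^k with f = 1
ys = march_in_time(d, mass, A, rhs)

def residual(k):
    """Max-norm residual of the k-th block equation of (26)."""
    By = [d[0] * ys[k - 1][i] + sum((d[j] - d[j - 1]) * ys[k - j - 1][i] for j in range(1, k))
          for i in range(3)]
    Ay = [sum(A[i][j] * ys[k - 1][j] for j in range(3)) for i in range(3)]
    return max(abs(mass[i] * By[i] + Ay[i] - rhs[k - 1][i]) for i in range(3))

max_residual = max(residual(k) for k in range(1, M + 1))
```

The same marching idea applies to systems with \({\mathcal{L}}^{T}\), which is block upper triangular and is therefore solved backward in time.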

The stability estimate (23) written in terms of nodal vectors becomes:

$$({\mathcal{M}}y,y)_{xt}\leqslant C_{0}\Big{(}({\mathcal{M}}f,f)_{xt}+({\mathcal{R}}q,q)_{qt}\Big{)}.$$
(27)

The algebraic form of mesh objective function (24) is as follows:

$$\displaystyle I_{0}(y,f,q)=\displaystyle\frac{\tau}{2}({\mathcal{M}}(y-y_{d}),y-y_{d})_{xt}+\displaystyle\frac{\tau}{2}({\mathcal{M}}f,f)_{xt}+\frac{\tau}{2}({\mathcal{R}}q,q)_{qt}.$$

Let us define the functions \(\varphi_{f}:{\mathbb{R}}^{N_{x}M}\to\bar{\mathbb{R}}\), \(\varphi_{q}:{\mathbb{R}}^{N_{q}M}\to\bar{\mathbb{R}}\) and \(\psi:{\mathbb{R}}^{N_{x}M}\to\bar{\mathbb{R}}\) by the equalities:

$$\varphi_{f}(f)=\varphi_{fh}(f_{h})\quad\textrm{for}\quad f\Leftrightarrow f_{h},\quad\varphi_{q}(q)=\varphi_{qh}(q_{h})\quad\textrm{for}\quad q\Leftrightarrow q_{h},\quad\psi(y)=\psi_{h}(y_{h})\quad\textrm{for}\quad y\Leftrightarrow y_{h}.$$

Now, after introducing all these notations, we can formulate the algebraic form of the mesh optimal control problem (25):

$$\min\limits_{{\mathcal{L}}y={\mathcal{M}}f+{\mathcal{S}}q}\bigg{\{}\frac{1}{2}({\mathcal{M}}(y-y_{d}),y-y_{d})_{xt}+\displaystyle\frac{1}{2}({\mathcal{M}}f,f)_{xt}$$
$${}+\frac{1}{2}({\mathcal{R}}q,q)_{qt}+\psi(y)+\varphi_{f}(f)+\varphi_{q}(q)\bigg{\}}.$$
(28)

To construct the iterative methods for solving problem (28), we introduce the Lagrange function

$${\mathbb{L}}(y,f,q,\lambda)=\frac{1}{2}({\mathcal{M}}(y-y_{d}),y-y_{d})_{xt}+\displaystyle\frac{1}{2}({\mathcal{M}}f,f)_{xt}$$
$${}+\frac{1}{2}({\mathcal{R}}q,q)_{qt}+\psi(y)+\varphi_{f}(f)+\varphi_{q}(q)+(\lambda,{\mathcal{L}}y-{\mathcal{M}}f-{\mathcal{S}}q)_{xt}.$$
(29)

Using well-known results on the Lagrange functions (cf. [30], p. 169) we obtain that a saddle point of (29) satisfies the following saddle point problem:

$$\begin{pmatrix}{\mathcal{M}}&0&0&{\mathcal{L}}^{T}\\ 0&{\mathcal{M}}&0&-{\mathcal{M}}\\ 0&0&{\mathcal{R}}&-{\mathcal{S}}^{T}\\ {\mathcal{L}}&-{\mathcal{M}}&-{\mathcal{S}}&0\end{pmatrix}\begin{pmatrix}y\\ f\\ q\\ \lambda\end{pmatrix}+\begin{pmatrix}\partial\psi(y)\\ \partial\varphi_{f}(f)\\ \partial\varphi_{q}(q)\\ 0\end{pmatrix}\ni\begin{pmatrix}{\mathcal{M}}y_{d}\\ 0\\ 0\\ 0\end{pmatrix}$$
(30)

where \(\partial\psi,\partial\varphi_{f}\) and \(\partial\varphi_{q}\) are the subdifferentials of the corresponding functions. Using the notations \(z=(y,f,q)^{T}\), \(f=({\mathcal{M}}y_{d},0,0)^{T}\), \(\Psi(z)=\psi(y)+\varphi_{f}(f)+\varphi_{q}(q)\), \({\mathcal{A}}={\textrm{diag}}\begin{pmatrix}{\mathcal{M}}&{\mathcal{M}}&{\mathcal{R}}\end{pmatrix}\) and \({\mathcal{B}}=\begin{pmatrix}{\mathcal{L}}&-{\mathcal{M}}&-{\mathcal{S}}\end{pmatrix},\) problem (30) can be rewritten in a compact form

$$\begin{pmatrix}{\mathcal{A}}&{\mathcal{B}}^{T}\\ {\mathcal{B}}&0\end{pmatrix}\begin{pmatrix}z\\ \lambda\end{pmatrix}+\begin{pmatrix}\partial\Psi(z)\\ 0\end{pmatrix}\ni\begin{pmatrix}f\\ 0\end{pmatrix}.$$

To solve this saddle point problem we apply a preconditioned Uzawa-type iterative method: for a given initial guess \(\lambda^{(0)}\) solve for \(s=0,1,\ldots\)

$${\mathcal{A}}z^{(s+1)}+\partial\Psi(z^{(s+1)})\ni f-{\mathcal{B}}^{T}\lambda^{(s)},\quad\displaystyle\dfrac{1}{\rho}{\mathcal{D}}(\lambda^{(s+1)}-\lambda^{(s)})-{\mathcal{B}}z^{(s+1)}=0,$$
(31)

where \({\mathcal{D}}\) is a symmetric and positive definite matrix (preconditioner), \(\rho>0\) is an iterative parameter.
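To make the structure of one Uzawa sweep concrete, here is a minimal sketch on a deliberately reduced model, written in the sign convention of the saddle point problem (30), in which the multiplier enters the first relation as \(+{\mathcal{B}}^{T}\lambda\). The reductions are assumptions made only for illustration: a single spatial degree of freedom with lumped mass `m` and stiffness `a` (hypothetical values), no boundary control, no state constraint, the box constraint \(|f|\leqslant\bar{f}\), the classical Caputo kernel, and the preconditioner \({\mathcal{D}}={\mathcal{L}}{\mathcal{M}}^{-1}{\mathcal{L}}^{T}\). With a diagonal mass matrix the inclusion for \(z\) decouples into an explicit formula for \(y\) and a pointwise projection for \(f\).

```python
import math

def l1_coefficients_caputo(alpha, tau, M):
    """Closed-form coefficients d_j for the kernel t^(-alpha)/Gamma(1-alpha)."""
    c = tau ** (-alpha) / math.gamma(2.0 - alpha)
    return [c * (j ** (1.0 - alpha) - (j - 1) ** (1.0 - alpha)) for j in range(1, M + 1)]

def state_matrix(d, m, a):
    """L = m*B + a*I for one spatial degree of freedom (lower triangular Toeplitz)."""
    M = len(d)
    L = [[0.0] * M for _ in range(M)]
    for i in range(M):
        L[i][i] = m * d[0] + a
        for j in range(i):
            L[i][j] = m * (d[i - j] - d[i - j - 1])
    return L

def matvec(L, x):
    return [sum(lij * xj for lij, xj in zip(row, x)) for row in L]

def matvec_T(L, x):
    n = len(x)
    return [sum(L[i][j] * x[i] for i in range(n)) for j in range(n)]

def solve_lower(L, b):
    """Forward substitution for L x = b."""
    x = []
    for i, row in enumerate(L):
        x.append((b[i] - sum(row[j] * x[j] for j in range(i))) / row[i])
    return x

def solve_lower_T(L, b):
    """Back substitution for L^T x = b."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(L[j][i] * x[j] for j in range(i + 1, n))) / L[i][i]
    return x

def uzawa(L, m, y_d, f_bar, rho=1.0, iterations=30):
    """Preconditioned Uzawa iteration for the reduced problem; returns (y, f, lam, residual)."""
    lam = [0.0] * len(y_d)
    for _ in range(iterations):
        Lt_lam = matvec_T(L, lam)
        y = [yd - v / m for yd, v in zip(y_d, Lt_lam)]            # m*y + L^T lam = m*y_d
        f = [min(f_bar, max(-f_bar, l)) for l in lam]             # projection of lam onto the box
        residual = [ly - m * fi for ly, fi in zip(matvec(L, y), f)]   # B z = L y - m f
        # D = L (m I)^{-1} L^T, so D w = residual means w = L^{-T} (m L^{-1} residual)
        w = solve_lower_T(L, [m * v for v in solve_lower(L, residual)])
        lam = [l + rho * wi for l, wi in zip(lam, w)]
    return y, f, lam, residual

alpha, T, M = 0.5, 1.0, 16                     # illustrative values
m, a, f_bar = 1.0, 10.0, 0.05                  # hypothetical mass, stiffness and control bound
L = state_matrix(l1_coefficients_caputo(alpha, T / M, M), m, a)
y, f, lam, residual = uzawa(L, m, [1.0] * M, f_bar)
```

The returned `residual` is the state-equation defect \({\mathcal{L}}y-{\mathcal{M}}f\) of the last computed pair and should be close to zero once the iteration has converged; the choices `rho=1.0` and 30 sweeps are ad hoc for this toy setting and carry no claim about the general problem.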

Lemma 2. The matrix \({\mathcal{D}}={\mathcal{L}}{\mathcal{M}}^{-1}{\mathcal{L}}^{T}\) is spectrally equivalent to \({\mathcal{B}}{\mathcal{A}}^{-1}{\mathcal{B}}^{T}\):

$${\mathcal{D}}\leqslant{\mathcal{B}}{\mathcal{A}}^{-1}{\mathcal{B}}^{T}\leqslant(1+2C_{0}){\mathcal{D}}$$
(32)

with constant \(C_{0}\) from (27).

Proof. Direct calculations give:

$${\mathcal{B}}{\mathcal{A}}^{-1}{\mathcal{B}}^{T}={\mathcal{L}}{\mathcal{M}}^{-1}{\mathcal{L}}^{T}+{\mathcal{M}}+{\mathcal{S}}{\mathcal{R}}^{-1}{\mathcal{S}}^{T}.$$

The left inequality in (32) is obvious because of the positive definiteness of \({\mathcal{M}}\) and \({\mathcal{S}}{\mathcal{R}}^{-1}{\mathcal{S}}^{T}\). Let us prove the right one.

Taking into account the equality \({\mathcal{L}}y={\mathcal{M}}f+{\mathcal{S}}q\), the stability estimate (27) can be written as

$$||{\mathcal{M}}^{1/2}{\mathcal{L}}^{-1}({\mathcal{M}}f+{\mathcal{S}}q)||^{2}_{xt}\leqslant C_{0}\Big{(}||{\mathcal{M}}^{1/2}f||^{2}_{xt}+||{\mathcal{R}}^{1/2}q||^{2}_{qt}\Big{)}.$$
(33)

We estimate the right-hand side of the equality

$$({\mathcal{B}}{\mathcal{A}}^{-1}{\mathcal{B}}^{T}\lambda,\lambda)_{xt}=||{\mathcal{M}}^{-1/2}{\mathcal{L}}^{T}\lambda||_{xt}^{2}+||{\mathcal{M}}^{1/2}\lambda||_{xt}^{2}+||{\mathcal{R}}^{-1/2}{\mathcal{S}}^{T}\lambda||_{qt}^{2}$$
(34)

by \(||{\mathcal{M}}^{-1/2}{\mathcal{L}}^{T}\lambda||_{xt}^{2}\).

We use (33) in the following chain of the inequalities:

$$||{\mathcal{M}}^{-1/2}{\mathcal{L}}^{T}\lambda||_{xt}=\sup\limits_{v}\frac{({\mathcal{M}}^{-1/2}{\mathcal{L}}^{T}\lambda,v)_{xt}}{||v||_{xt}}=\sup\limits_{y}\frac{(\lambda,{\mathcal{L}}y)_{xt}}{||{\mathcal{M}}^{1/2}y||_{xt}}$$
$${}\geqslant\sup\limits_{f,q}\frac{(\lambda,{\mathcal{M}}f+{\mathcal{S}}q)_{xt}}{||{\mathcal{M}}^{1/2}{\mathcal{L}}^{-1}({\mathcal{M}}f+{\mathcal{S}}q)||_{xt}}\geqslant\frac{1}{C^{1/2}_{0}}\sup\limits_{f,q}\frac{(\lambda,{\mathcal{M}}f+{\mathcal{S}}q)_{xt}}{\Big{(}||{\mathcal{M}}^{1/2}f||^{2}_{xt}+||{\mathcal{R}}^{1/2}q||^{2}_{qt}\Big{)}^{1/2}}.$$

Choosing subsequently \(q=0\) and \(f=0\) in this inequality we have

$$\displaystyle||{\mathcal{M}}^{-1/2}{\mathcal{L}}^{T}\lambda||_{xt}\geqslant\frac{1}{C^{1/2}_{0}}\sup\limits_{f}\frac{(\lambda,{\mathcal{M}}f)_{xt}}{||{\mathcal{M}}^{1/2}f||_{xt}}=\frac{1}{C^{1/2}_{0}}||{\mathcal{M}}^{1/2}\lambda||_{xt},$$
$$\displaystyle||{\mathcal{M}}^{-1/2}{\mathcal{L}}^{T}\lambda||_{xt}\geqslant\frac{1}{C^{1/2}_{0}}\sup\limits_{q}\frac{(\lambda,{\mathcal{S}}q)_{xt}}{||{\mathcal{R}}^{1/2}q||_{qt}}=\frac{1}{C^{1/2}_{0}}||{\mathcal{R}}^{-1/2}{\mathcal{S}}^{T}\lambda||_{qt}.$$
(35)

Estimates (34) and (35) yield (32). \(\Box\)
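The identity behind the "direct calculations" and the left inequality of (32) can be checked numerically on small random matrices. The following Python sketch is only an illustration: the helper name `schur_terms` and the random test data are hypothetical, \({\mathcal{M}}\) and \({\mathcal{R}}\) are assumed symmetric positive definite, and the right inequality is not tested because it requires the constant \(C_{0}\) of the actual state equation.

```python
import numpy as np

def schur_terms(L, M, R, S):
    """Return B A^{-1} B^T, D = L M^{-1} L^T and the remainder M + S R^{-1} S^T.

    A = diag(M, M, R) and B = (L, -M, -S) as in the compact saddle point form.
    """
    n, nq = M.shape[0], R.shape[0]
    A = np.block([[M, np.zeros((n, n)), np.zeros((n, nq))],
                  [np.zeros((n, n)), M, np.zeros((n, nq))],
                  [np.zeros((nq, n)), np.zeros((nq, n)), R]])
    B = np.hstack([L, -M, -S])
    schur = B @ np.linalg.solve(A, B.T)      # B A^{-1} B^T
    D = L @ np.linalg.solve(M, L.T)          # preconditioner of Lemma 2
    rest = M + S @ np.linalg.solve(R, S.T)   # terms dropped in D
    return schur, D, rest
```

Since the remainder is positive definite, the difference between the Schur complement and \({\mathcal{D}}\) has only positive eigenvalues, which is the left inequality of (32).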

Theorem 4.

  1. Problem (30) has a solution \((y,f,q,\lambda)\) with unique \(y\), \(f\), \(q\), which coincide with the solution of problem (28).

  2. Iterative method (31) with the preconditioner \({\mathcal{D}}={\mathcal{L}}{\mathcal{M}}^{-1}{\mathcal{L}}^{T}\), applied to saddle point problem (30), converges if \(0<\rho<2/(1+2C_{0}).\)

  3. If the optimal control problem does not contain a state constraint, i.e. \(\psi=0\), then

    • the Lagrange multiplier \(\lambda\) is uniquely determined;

    • iterative method (31) with the preconditioner \({\mathcal{D}}={\mathcal{L}}{\mathcal{\mathcal{M}}}^{-1}{\mathcal{L}}^{T}\) and the iterative parameter \(\rho=\frac{1}{1+C_{0}}\) has the rate of convergence defined by the following estimate:

      $$||\lambda^{k+1}-\lambda||^{2}_{\mathcal{D}}\leqslant\frac{C_{0}}{1+C_{0}}||\lambda^{k}-\lambda||^{2}_{\mathcal{D}}\quad\forall k.$$

The proof of this theorem is based on the general results from [23, 25] and Lemma 2, so we omit it.
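The range of admissible \(\rho\) and the value \(\rho=1/(1+C_{0})\) can be motivated by a short computation in the purely linear case, i.e. under the additional assumption (made here only for illustration) that \(\Psi=0\). Then \(z^{(s+1)}\) depends affinely on \(\lambda^{(s)}\), and the error \(e^{(s)}=\lambda^{(s)}-\lambda\) satisfies

$$e^{(s+1)}=\big(I-\rho\,{\mathcal{D}}^{-1}{\mathcal{B}}{\mathcal{A}}^{-1}{\mathcal{B}}^{T}\big)e^{(s)}.$$

By (32) the eigenvalues of \({\mathcal{D}}^{-1}{\mathcal{B}}{\mathcal{A}}^{-1}{\mathcal{B}}^{T}\) lie in \([1,1+2C_{0}]\), and this matrix is self-adjoint in the inner product generated by \({\mathcal{D}}\), so

$$||e^{(s+1)}||_{\mathcal{D}}\leqslant\max\big\{|1-\rho|,\ |1-\rho(1+2C_{0})|\big\}\,||e^{(s)}||_{\mathcal{D}}.$$

The factor on the right is smaller than one exactly when \(0<\rho<2/(1+2C_{0})\), and it is minimal when the two terms coincide, i.e. for \(\rho=2/(2+2C_{0})=1/(1+C_{0})\), where it equals \(C_{0}/(1+C_{0})\). The general case with constraints is not covered by this linear argument and relies on the results of [23, 25].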

When implementing method (31) with the preconditioner \({\mathcal{D}}={\mathcal{L}}{\mathcal{M}}^{-1}{\mathcal{L}}^{T}\) for problem (30) it is reasonable to replace the variable \(\lambda\) by \(\eta={\mathcal{L}}^{T}\lambda\) and to write the resulting system for finding the \((s+1)\)-th iterate in the following form:

$$\begin{cases}y^{(s+1)}+{\mathcal{M}}^{-1}\partial\psi(y^{(s+1)})\ni y_{d}-{\mathcal{M}}^{-1}\eta^{(s)},\\ f^{(s+1)}+{\mathcal{M}}^{-1}\partial\varphi_{f}(f^{(s+1)})\ni{\mathcal{L}}^{-T}\eta^{(s)},\\ q^{(s+1)}+{\mathcal{R}}^{-1}\partial\varphi_{q}(q^{(s+1)})\ni{\mathcal{R}}^{-1}{\mathcal{S}}^{T}{\mathcal{L}}^{-T}\eta^{(s)},\end{cases}$$
$$\frac{\eta^{(s+1)}-\eta^{(s)}}{\rho}={\mathcal{M}}y^{(s+1)}-{\mathcal{M}}{\mathcal{L}}^{-1}\Big{(}{\mathcal{M}}f^{(s+1)}+{\mathcal{S}}q^{(s+1)}\Big{)}.$$

On every step of the iterative method we have to solve three inclusions with diagonal operators \(I+{\mathcal{M}}^{-1}\partial\psi\), \(I+{\mathcal{M}}^{-1}\partial\varphi_{f}\) and \(I+{\mathcal{R}}^{-1}\partial\varphi_{q}\), where \(I\) is the identity matrix. The solution of these inclusions is reduced to a simple procedure of point projections (for all coordinates of nodal vectors at each time level) on the corresponding sets of constraints. The most time-consuming part of the implementation consists of solving two mesh parabolic equations, namely, state problem with the matrix \({\mathcal{L}}\) and adjoint problem with the matrix \({\mathcal{L}}^{T}\).
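One step of this procedure can be sketched as follows, written in the variable \(\eta={\mathcal{L}}^{T}\lambda\). The sketch assumes diagonal matrices \({\mathcal{M}}\) and \({\mathcal{R}}\) (stored as vectors `m` and `r`), box constraints on \(y\), \(f\) and \(q\), and dense solves in place of the time-stepping solvers for the state and adjoint problems; the function name `uzawa_step` and its signature are hypothetical.

```python
import numpy as np

def uzawa_step(eta, L, m, r, S, y_d, rho, y_box, f_box, q_box):
    """One iteration in the variable eta = L^T lambda.

    m, r  : positive diagonals of M and R (assumed diagonal here)
    *_box : (lower, upper) bounds; use (-np.inf, np.inf) for no constraint
    """
    # The three inclusions with diagonal operators reduce to pointwise projections.
    y = np.clip(y_d - eta / m, *y_box)
    p = np.linalg.solve(L.T, eta)           # adjoint problem with the matrix L^T
    f = np.clip(p, *f_box)
    q = np.clip((S.T @ p) / r, *q_box)
    w = np.linalg.solve(L, m * f + S @ q)   # state problem with the matrix L
    # Update of the transformed multiplier.
    eta_new = eta + rho * m * (y - w)
    return y, f, q, eta_new
```

The two calls to `np.linalg.solve` mark the expensive part of the step; in an actual implementation they would be replaced by forward and backward sweeps over the time levels.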

4 THE SPACE-TIME FRACTIONAL DERIVATIVE PROBLEM

In the previous sections we constructed easily implementable iterative methods that converge for parameters independent of the mesh steps \(h\) and \(\tau\). This was achieved by constructing suitable preconditioners, for which stability estimates for the state equation in the corresponding norms play a fundamental role. In the considered examples, the stability estimate (23), which bounds the \(L^{2}\)-norm of the state by the \(L^{2}\)-norms of the control functions, was sufficient for this purpose. Since the matrix \(B\) is positive definite, to obtain this stability estimate it is sufficient that the matrix \(A\) be positive semidefinite. This means that all the previous results on the convergence of the iterative methods for the mesh optimal control problems can be extended to the case when the Laplace operator is replaced by another positive definite (or positive semidefinite) elliptic operator.

Below we briefly discuss the numerical solution of the problem with fractional derivatives in both time and space. For simplicity, we consider a problem in which the state function satisfies a homogeneous initial-boundary-value Dirichlet problem in \(Q_{T}=\Omega\times(0,T]\) and approximate it by a finite difference scheme on a uniform mesh.

First, let us define the Riemann–Liouville space-fractional derivatives of order \(\beta\in(1,2)\) with respect to the variable \(x\):

$$\displaystyle\dfrac{\partial^{\beta}y}{\partial x^{\beta}}=\dfrac{1}{\Gamma(2-\beta)}\dfrac{\partial^{2}}{\partial x^{2}}\int\limits_{0}^{x}\dfrac{y(\xi,t)}{(x-\xi)^{\beta-1}}d\xi,\quad\displaystyle\dfrac{\partial^{\beta}y}{\partial(-x)^{\beta}}=\dfrac{1}{\Gamma(2-\beta)}\dfrac{\partial^{2}}{\partial x^{2}}\int\limits_{1}^{x}\dfrac{y(\xi,t)}{(\xi-x)^{\beta-1}}d\xi.$$
(36)

Next, define the state problem as follows:

$$\mathcal{D}_{t}y-\dfrac{1+\gamma_{1}}{2}\dfrac{\partial^{\beta_{1}}y}{\partial x_{1}^{\beta_{1}}}-\dfrac{1-\gamma_{1}}{2}\dfrac{\partial^{\beta_{1}}y}{\partial(-x_{1})^{\beta_{1}}}-\dfrac{1+\gamma_{2}}{2}\dfrac{\partial^{\beta_{2}}y}{\partial x_{2}^{\beta_{2}}}-\dfrac{1-\gamma_{2}}{2}\dfrac{\partial^{\beta_{2}}y}{\partial(-x_{2})^{\beta_{2}}}=f\quad\textrm{in}\quad Q_{T},$$
$$\displaystyle y=0\quad\textrm{on }\quad\partial\Omega,\quad y=0\quad\textrm{for}\quad t=0,\quad x\in\Omega.$$
(37)

Above \(\mathcal{D}_{t}y\) is a fractional derivative, defined in (4), (5), and the constants \(\beta_{i}\in(1,2),\gamma_{i}\in[-1,1]\).

We approximate equation (37) by a finite difference scheme on the uniform time-space mesh with steps \(h\) and \(\tau\). The approximation of the time-fractional derivative \(\mathcal{D}_{t}y\) is the same as in the previous sections. To approximate the space fractional derivatives \(\dfrac{\partial^{\beta}y}{\partial x^{\beta}}\) and \(\dfrac{\partial^{\beta}y}{\partial(-x)^{\beta}}\) at a mesh point \(x_{j}\) we use their so-called flux representation and the Grünwald–Letnikov approach to approximation (see [33]). This approach leads to the following formulas:

$$\dfrac{\partial^{\beta}y}{\partial x^{\beta}}(x_{j})=\dfrac{\partial}{\partial x}\left(\dfrac{\partial^{\beta-1}y}{\partial x^{\beta-1}}\right)(x_{j})\approx\dfrac{1}{h}({}^{+\beta}F_{j+1/2}-{}^{+\beta}F_{j-1/2}),$$
$$\dfrac{\partial^{\beta}y}{\partial(-x)^{\beta}}(x_{j})=\dfrac{\partial}{\partial x}\Big{(}\dfrac{\partial^{\beta-1}y}{\partial(-x)^{\beta-1}}\Big{)}(x_{j})\approx\dfrac{1}{h}({}^{-\beta}F_{j+1/2}-{}^{-\beta}F_{j-1/2}),$$

where

$$\displaystyle{}^{+\beta}F_{j+1/2}=\left\{\dfrac{\Delta^{\beta-1}y}{\Delta x^{\beta-1}}\right\}_{j+1}=\dfrac{1}{h^{\beta-1}}\sum_{k=0}^{[(x)/h]}\tilde{\theta}_{k}y(x_{j+1}-kh),$$
$$\displaystyle{}^{-\beta}F_{j+1/2}=\left\{\dfrac{\Delta^{\beta-1}y}{(-\Delta x)^{\beta-1}}\right\}_{j}=\dfrac{-1}{h^{\beta-1}}\sum_{k=0}^{[(1-x)/h]}\tilde{\theta}_{k}y(x_{j}+kh)$$

and the constants in the approximations of the fractional derivatives are defined by the recurrent formulas: \(\tilde{\theta}_{0}=1\), \(\tilde{\theta}_{k+1}=\dfrac{-\tilde{\theta}_{k}(\beta-k-1)}{k+1}.\)

Let \(N\) be the number of mesh points in the directions \(x_{1}\) and \(x_{2}\). Define for \(i=1,2\) the matrices

$$L_{i}=\dfrac{1}{h^{\beta_{i}}}\begin{pmatrix}\theta_{i1}&-1&0&\ldots&\ldots&0&0\\ \theta_{i2}&\theta_{i1}&-1&\ldots&\ldots&0&0\\ \theta_{i3}&\theta_{i2}&\theta_{i1}&\ldots&\ldots&0&0\\ \ldots&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots\\ \theta_{i,N-1}&\theta_{i,N-2}&\theta_{i,N-3}&\ldots&\ldots&\theta_{i1}&-1\\ \theta_{i,N}&\theta_{i,N-1}&\theta_{i,N-2}&\ldots&\ldots&\theta_{i2}&\theta_{i1}\end{pmatrix},$$

where \(\theta_{i1}=\beta_{i}\), \(\theta_{i,k+1}=-\dfrac{\theta_{ik}\cdot(\beta_{i}-k)}{k+1}\) for \(k=1,2,\ldots N-1.\) Note that the entries of the matrices \(L_{i},i=1,2,\) satisfy the following properties:

  • diagonal elements are positive, while off-diagonal elements are non-positive;

  • \(L_{i}\) is a strictly diagonally dominant Toeplitz matrix.
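The construction of \(L_{i}\) and the two listed properties can be illustrated by a short Python sketch. It follows the recurrence for \(\theta_{ik}\) stated above; the function name `fractional_toeplitz` and the test values of \(\beta\), \(N\) and \(h\) are hypothetical choices made only for this example.

```python
import numpy as np

def fractional_toeplitz(beta, N, h):
    """Assemble the N x N matrix L_i for an order beta in (1, 2) and mesh step h."""
    theta = np.zeros(N + 1)          # theta[k] stores theta_{ik}; theta[0] is unused
    theta[1] = beta
    for k in range(1, N):
        theta[k + 1] = -theta[k] * (beta - k) / (k + 1)
    mat = np.zeros((N, N))
    for row in range(N):
        for col in range(row + 1):
            mat[row, col] = theta[row - col + 1]   # diagonal and lower triangle
        if row + 1 < N:
            mat[row, row + 1] = -1.0               # superdiagonal
    return mat / h**beta
```

For \(\beta\in(1,2)\) all \(\theta_{ik}\) with \(k\geqslant 2\) are negative, and strict diagonal dominance holds both by rows and by columns, so the symmetric part of the matrix is strictly diagonally dominant with a positive diagonal.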

These properties ensure that the matrices \(\frac{1}{2}(L_{i}+L_{i}^{T})\) are positive definite (compare with Lemma 1). Now let \(I_{x}\) be the \(N\times N\) identity matrix and

$$A_{1}=I_{x}\otimes\left(\dfrac{1+\gamma_{1}}{2}L_{1}+\dfrac{1-\gamma_{1}}{2}L^{T}_{1}\right),\quad A_{2}=\left(\dfrac{1+\gamma_{2}}{2}L_{2}+\dfrac{1-\gamma_{2}}{2}L^{T}_{2}\right)\otimes I_{x},\quad A=A_{1}+A_{2}.$$

Then the constructed mesh equation can be written as follows:

$$\displaystyle y^{0}=0,\quad(By)^{k}+Ay^{k}=f^{k},\quad k=1,2,\ldots,M.$$
(38)

Due to the properties of \(L_{i}\), the matrix \(A_{s}=\frac{1}{2}(A+A^{T})\) is positive definite. Since the matrix \(B_{s}=\frac{1}{2}(B+B^{T})\) is also positive definite with the constant of positive definiteness \(\chi_{0}\), the following stability estimate holds for the mesh equation (38): \(||y||_{xt}\leqslant\chi_{0}^{-1/2}||f||_{xt}.\) It can be used to estimate the equivalence constants of the corresponding matrices in Lemma 2, so the results of Theorem 4 remain valid up to the constants in the estimates.
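The assembly of \(A\) from the one-dimensional matrices and the computation of the smallest eigenvalue of \(A_{s}\) can be sketched with Kronecker products. The code below is an illustration only: `assemble_A` and `min_eig_sym` are hypothetical helper names, and the input matrices are assumed to be square and of the same size.

```python
import numpy as np

def assemble_A(L1, L2, gamma1, gamma2):
    """Return A = A_1 + A_2 built from the 1D matrices L1, L2 (both N x N)."""
    N = L1.shape[0]
    I = np.eye(N)
    # Weighted combinations of the left- and right-sided operators.
    K1 = 0.5 * (1 + gamma1) * L1 + 0.5 * (1 - gamma1) * L1.T
    K2 = 0.5 * (1 + gamma2) * L2 + 0.5 * (1 - gamma2) * L2.T
    return np.kron(I, K1) + np.kron(K2, I)

def min_eig_sym(A):
    """Smallest eigenvalue of the symmetric part (A + A^T) / 2."""
    return np.linalg.eigvalsh(0.5 * (A + A.T)).min()
```

Because the symmetric part of \(\frac{1+\gamma_{i}}{2}L_{i}+\frac{1-\gamma_{i}}{2}L_{i}^{T}\) equals \(\frac{1}{2}(L_{i}+L_{i}^{T})\) for any \(\gamma_{i}\), the smallest eigenvalue of \(A_{s}\) in this sketch is the sum of the smallest eigenvalues of the two one-dimensional symmetric parts and does not depend on \(\gamma_{1}\), \(\gamma_{2}\).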

5 NUMERICAL RESULTS

The main goal of the computational experiments was to check the estimates for the iterative parameters and the convergence rate in the mesh problems without state constraints. We performed calculations for a 1D problem, taking the space-time fractional equation with Dirichlet boundary condition as the state equation, and approximated the problem by a finite-difference scheme on a uniform mesh. The control function was the right-hand side \(f\). In the absence of constraints on \(y\), the iterative method can be formulated not with respect to the dual variable \(\lambda\) but with respect to the primal variable, in our case \(f\). In this case, the theoretical estimates of the optimal iterative parameters and the convergence rate remain unchanged.

For simplicity, the domain was \(Q_{T}=(0,1)\times(0,1)\). We varied the mesh steps and the indices of the fractional operators \(\alpha\in(0,1]\) and \(\beta\in[1,2]\). As follows from the theoretical estimates, the rate of convergence of the iterative methods in the absence of state constraints depends on the constant \(C_{0}\) in the stability inequality. In turn, \(C_{0}=\left(\xi_{0}+\chi_{0}\right)^{-1}\), where \(\chi_{0}=\chi_{0}(\alpha,\tau)\) and \(\xi_{0}=\xi_{0}(\beta,h)\) are the minimum eigenvalues of the matrices \(B_{s}\) and \(A_{s}\), respectively. Figure 1 shows the dependence of \(\chi_{0}\) on \(\alpha\) and of \(\xi_{0}\) on \(\beta\) for fixed mesh steps. Note that for \(\alpha=1\) the fractional-time finite-difference operator turns into a classical first-order operator that is positive definite with \(\chi_{0}=O(\tau)\). Correspondingly, for \(\beta=1\) the fractional-spatial finite-difference operator turns into a mesh diffusion operator with the factor \(h\) and the constant of positive definiteness \(\xi_{0}=O(h)\).

Fig. 1. Minimum eigenvalues \(\chi_{0}(\alpha)\) of the matrix \(B_{s}\) (left) and \(\xi_{0}(\beta)\) of the matrix \(A_{s}\) (right) for \(h=\tau=0.01\).

The iterative method was implemented with the theoretically optimal parameter \(\rho_{opt}\) and with the arbitrarily chosen values \(\rho=0.1\), \(\rho=0.5\) and \(\rho=0.9\). The stopping criterion for the iterative process was \(||f^{(k)}-f_{opt}||<\varepsilon=10^{-6}\), where \(f_{opt}\) was the "exact" solution obtained by performing a large number of iterations, and \(||\cdot||\) was the mesh analog of the \(L^{2}(Q_{T})\)-norm.
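The relation between the eigenvalue bounds and the iterative parameter can be summarized in a few lines of Python. The sketch below only restates the formulas \(C_{0}=(\xi_{0}+\chi_{0})^{-1}\), \(\rho=1/(1+C_{0})\) and the contraction factor \(C_{0}/(1+C_{0})\) of Theorem 4; the function names are hypothetical, and the iteration count is a rough bound derived from the estimate for the squared \({\mathcal{D}}\)-norm of the multiplier error, which is an assumption and not the stopping criterion on \(f\) used in the experiments.

```python
import math

def uzawa_parameters(xi0, chi0):
    """Return (C0, rho_opt, factor) from the minimum eigenvalues xi0 and chi0."""
    c0 = 1.0 / (xi0 + chi0)
    rho_opt = 1.0 / (1.0 + c0)
    factor = c0 / (1.0 + c0)   # bound on the contraction of the squared error norm
    return c0, rho_opt, factor

def iterations_bound(factor, eps, err0=1.0):
    """Smallest k with err0**2 * factor**k <= eps**2 (a priori bound only)."""
    return max(0, math.ceil(2.0 * math.log(eps / err0) / math.log(factor)))
```

Smaller values of \(\xi_{0}+\chi_{0}\) give a larger \(C_{0}\), a smaller \(\rho_{opt}\) and a contraction factor closer to one, which is the mechanism behind the dependence of the iteration counts on the mesh steps and on \(\alpha\), \(\beta\).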

Table 1 contains the calculated results for the limit case of the parameters \(\alpha=1\) and \(\beta=1\); the observation function \(y_{d}=t^{1/2}\sin(0.5\pi xN_{x})\). Numerical results demonstrate the dependence of the number of iterations to achieve a given fixed accuracy on the mesh parameters \(h\) and \(\tau\).

Table 1 Number of iterations \(N_{it}\) to achieve accuracy \(||f^{(k)}-f_{opt}||<\varepsilon=10^{-6}\); unconstrained optimization problem; \(\alpha=1\), \(\beta=1\), \(y_{d}=t^{1/2}\sin(0.5\pi xN_{x})\)

The dependence of the number of iterations to achieve a given fixed accuracy on the indices \(\alpha\) and \(\beta\) is demonstrated in Table 2. The input data were as follows: \(\tau=h=0.01\), observation function \(y_{d}=t\sin(\pi x)\), stopping criterion \(||f^{(k)}-f_{opt}||<\varepsilon=10^{-6}\). The calculations were performed for a problem with no constraints and for a problem with the control constraint \(U_{ad}=\{f\in{\mathbb{R}}^{N_{x}}:|f_{i}|\leqslant 0.006\}\). The small bound \(0.006\) ensures that the constraint is active at a large number of mesh points. We denote by \(N_{it}\) the number of iterations for the unconstrained problem and by \(\tilde{N}_{it}\) the number of iterations for the constrained problem. We see that a large number of mesh points at which the constraint is active significantly reduces the number of iterations.

Table 2 Dependence of the number of iterations on the parameters \(\alpha\), \(\beta\) and the iterative parameter \(\rho\)

We included in the article only some of the results of computational experiments. It should be noted that the number of iterations to achieve a given accuracy was almost insensitive to the observation function \(y_{d}\). This is quite expected, since in calculations it appears in the form \({\mathcal{L}}^{-T}y_{d}\) on the right side of the corresponding equation (or variational inequality in the control constraint problem).

The results of the computational experiments generally confirmed the theoretical estimates. An unexpected observation was the high rate of convergence of the iterations with an iterative parameter significantly exceeding the theoretically optimal one in the case \(\alpha=\beta=1\).

6 CONCLUSIONS

We investigated the convergence of iterative methods for solving mesh approximations of optimal control problems governed by parabolic equations with fractional derivatives. Convergence and estimates of the convergence rate of the iterative methods are obtained on the basis of stability estimates for the mesh state equations. In deriving these estimates, an essential property was the positive definiteness of the mesh operator of fractional time differentiation. This property is possessed by mesh operators on a time-uniform mesh for various definitions of the fractional derivative. We limited ourselves to the consideration of objective functionals with \(L^{2}\)-norms of the state and control functions. The presented results can be generalized in various directions, including more general linear state equations, other quadratic objective functionals, and other constraints on the control and state functions.