1 Introduction

The numerical approximation of optimal control problems has long been an important topic in engineering design, and extensive research has been devoted to developing fast numerical algorithms for such problems. Some recent progress in this area can be found in [5, 8, 12–17, 21–23, 25, 29] and the references cited therein.

In this paper we consider a pointwise-constrained time-dependent optimal control problem posed on the space interval (0, 1) over the time period [0, T]: Find a control \(u \in K := \{ v \in L^2\big (0,T;L^2(0,1)\big ) : v \ge u_0\}\), where \(u_0(x,t)\) is a prescribed obstacle function, that minimizes the objective functional

$$\begin{aligned} J(u) := \displaystyle \frac{1}{2}\displaystyle \int _0^T\displaystyle \int _0^1 \big (y - \tilde{y} \big )^2 dx dt + \displaystyle \frac{1}{2} \displaystyle \int _0^T\displaystyle \int _0^1 u^2 dx dt, \end{aligned}$$
(1)

which is governed by the initial-boundary value problem of a diffusion equation for the state variable y(x, t)

$$\begin{aligned}&L y = f + u, ~(x,t) \in (0,1) \times (0,T], \nonumber \\&y(0,t) = y(1,t) = 0, ~ t \in [0,T], ~~y(x,0) = y_0(x), ~x \in [0,1]. \end{aligned}$$
(2)

Here \(\tilde{y}(x,t)\) represents a prescribed or observed value of the state variable y, f(x, t) is a prescribed source and sink term, u is the control variable, and L represents a diffusion operator.

This type of problem occurs widely in engineering design and applications. For instance, in the temperature control of a heat conduction process, y represents the temperature in a medium and u the heating or cooling source. The goal of the control problem is to optimize the input of the heating or cooling source u such that the temperature distribution y is as close as possible to the prescribed temperature distribution \(\tilde{y}\). Due to limitations of the surrounding environment or equipment, the input control u may be subject to constraints, which leads to a constrained control problem such as the one with a pointwise obstacle constraint described by the admissible set K. Similar models also arise in the optimal control of mass transport by diffusion in porous media, and in photocopiers and laser printers in the modern office, which rely on the transport of electrons through amorphous semiconductors in an electric field. The optimality conditions resulting from these models consist of non-smooth nonlinear systems of partial differential equations.

The canonical optimal control problems of this type, in which the differential operator L is a classical second-order Fickian diffusion operator, have been studied extensively in the literature, and many efficient and reliable methods have been developed [5, 12–15, 17, 21, 25]. There exist at least three general classes of computational approaches to solving the nonlinear systems of partial differential equations derived from the optimality conditions. The first is the primal-dual active set method, which employs the primal-dual active set strategy based on the augmented Lagrangian method [12, 13]. This method has been widely used for the numerical solution of many kinds of optimal control problems in the literature. However, in some cases, e.g. when the Lagrange multipliers are irregular, the active sets are difficult to compute numerically. The second class applies the semismooth Newton algorithm to the resulting optimality conditions, which involves computations of second derivatives; see, e.g., [14, 15]. There exists such an extensive body of research on these two methods that it is impossible to give even a brief review here. The third class consists of gradient projection methods, which use a line search to minimize the objective functional, guided by the gradient whose computation requires repeatedly solving the state and costate equations. This class of methods has been found robust in practice [17], but its effectiveness relies on very efficient solvers for the state and costate equations.

In the last few decades a growing body of evidence has suggested that the classical second-order Fickian diffusion or heat conduction equation does not necessarily provide a proper description of diffusion or heat conduction processes in heterogeneous media. A microscopic explanation is as follows. In a normal diffusion process modeled by a classical second-order Fickian diffusion or heat conduction equation, a particle’s motion in the underlying transport or conduction process has virtually no spatial correlation and long walks in the same direction are rare, so the variance of a particle’s excursion distance is finite. The central limit theorem then ensures that the probability of finding a particle somewhere in space satisfies a normal distribution, which gives a probabilistic description of normal diffusion. However, in a diffusion or heat conduction process in a heterogeneous medium, a particle’s motion is better described by steps that are not independent and that can take vastly different times to perform. A particle’s long walk may have a long correlation length, so the process can deviate substantially from Brownian motion or even have an infinite variance. Thus, the assumptions of (i) the existence of a mean free path and (ii) the existence of a mean time taken to perform a step, which were made by Einstein in his derivation of the diffusion equation for Brownian motion [9], are violated. In fact, it was shown [3, 20] that the heavy-tailed behavior of transport processes in heterogeneous media can be described accurately by Lévy distributions, which can be viewed as a probabilistic description of fractional diffusion equations. This is the fundamental reason why fractional diffusion or heat conduction equations provide a better description of transport processes in heterogeneous media than classical Fickian diffusion or heat conduction equations do.

Motivated by these facts, in this paper we consider the optimal control problem of (1)–(2) except that L is a fractional diffusion operator given by

$$\begin{aligned} Ly := \displaystyle \frac{\partial {y}}{\partial {t}} - d(x,t) \bigg ( r \; {}_0D_x^{2-\beta } + (1-r)\; {}_xD_1^{2-\beta } \bigg ) y, ~(x,t) \in (0,1) \times (0,T] \end{aligned}$$
(3)

where d(x, t) is the diffusivity coefficient with positive lower and upper bounds, \(0 \le r \le 1\) indicates the relative weight of forward versus backward transition probability, and \(2-\beta \) with \(0< \beta <1\) is the order of the anomalous diffusion in the governing equation. The left and right fractional integrals are defined following Podlubny [24] by

$$\begin{aligned}&^{}_0D_x^{-\beta }w(x) \displaystyle := \frac{1}{\varGamma (\beta )} \int _0^x \frac{w(s)}{(x-s)^{1-\beta }}ds, \\&^{}_xD^{-\beta }_1w(x) \displaystyle := \frac{1}{\varGamma (\beta )} \int _x^1 \frac{w(s)}{(s-x)^{1-\beta }}ds. \end{aligned}$$

For any positive integer m, the left and right Riemann–Liouville fractional derivatives of order \(m-\beta \) are defined by

$$\begin{aligned}&{}_0D_x^{m-\beta }w(x) \displaystyle := D^m {}^{}_0D_x^{-\beta } w(x) = \frac{1}{\varGamma (\beta )} \frac{d^m}{dx^m} \int _0^x \frac{w(s)}{(x-s)^{1-\beta }} ds,\\&{}_xD^{m-\beta }_1w(x) \displaystyle := (- D)^m {}^{}_xD_1^{-\beta } w(x) = \frac{(-1)^m}{\varGamma (\beta )} \frac{d^m}{dx^m} \int _x^1 \frac{w(s)}{(s-x)^{1-\beta }} ds. \end{aligned}$$
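As a concrete illustration of these definitions, take \(w(x) = x\) and \(m = 2\). The substitution \(s = xt\) in the definition of the left fractional integral gives \({}_0D_x^{-\beta } x = x^{\beta +1}/\varGamma (\beta +2)\), and hence

$$\begin{aligned} {}_0D_x^{2-\beta } x = \frac{d^2}{dx^2} \, \frac{x^{\beta +1}}{\varGamma (\beta +2)} = \frac{x^{\beta -1}}{\varGamma (\beta )}, \end{aligned}$$

which agrees with the classical formula \({}_0D_x^{\mu } x^p = \frac{\varGamma (p+1)}{\varGamma (p+1-\mu )} x^{p-\mu }\) and shows that the Riemann–Liouville derivative of a smooth function may be singular at the left endpoint of the interval.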

Because of the improved modeling capability of fractional differential equations, there has been a growing amount of research on both the mathematical analysis and the numerical approximation of fractional control problems [1, 7, 10]. However, because of the nonlocal nature of fractional differential operators, the corresponding numerical methods for the fractional state and costate equations generate full stiffness matrices. Consequently, the resulting computational complexity is of \(O(N^3)\) and the memory requirement is of \(O(N^2)\) for a problem of size N, in sharp contrast to the O(N) computational work and memory requirement of numerical methods for second-order Fickian diffusion equations. Recall that the fractional state and costate equations have to be solved repeatedly in an iterative procedure, so this computational bottleneck becomes a severe burden. To the best of our knowledge, no work in the literature has addressed this significantly increased computational cost.

The goal of this paper is to develop a fast gradient projection method for the fractional optimal control problem (1)–(3), which significantly reduces the computational cost of solving the fractional state and costate equations from \(O(M N^3)\) to \(O(M N \log N)\) per iterative solve of the two equations, with M being the number of time steps, and the memory requirement from \(O(N^2)\) to O(N). Numerical experiments substantiate these reductions. The rest of the paper is organized as follows. In Sect. 2 we derive the optimality condition for the fractional optimal control problem. In Sect. 3 we present a fractional gradient projection method. In Sect. 4 we develop a fast and faithful solution method. In Sect. 5 we carry out numerical experiments to validate the fast gradient projection method.

2 The Optimality Condition of the Fractional Control Problem

We frequently use the following properties in this section. The left and right Riemann–Liouville fractional integral operators are adjoints in the \(L^2\) sense [24, 26]:

$$\begin{aligned} \displaystyle \int _0^1 \left( {}_0D_x^{-\beta } w\right) v dx = \displaystyle \int _0^1 w \left( {}_xD_1^{-\beta } v\right) dx, \quad \forall v, w \in L^2(0,1). \end{aligned}$$
(4)

For any v, w with \(v(0)=0\) and \(w(1)=0\), the fractional integral operators and the classical first-order differential operator D commute [33]:

$$\begin{aligned} {}^{}_0D^{-\beta }_x D v = D {}_0D^{-\beta }_x v , \qquad {}^{}_xD^{-\beta }_1 Dw = D {}^{}_xD^{-\beta }_1 w. \end{aligned}$$
(5)

To derive the adjoint costate equation for the costate function we rewrite the state equation (2) as

$$\begin{aligned}&d^{-1}\displaystyle \frac{\partial {y}}{\partial {t}} - \big ( r\; {}_0D_x^{2-\beta } + (1-r)\; {}_xD_1^{2-\beta } \big ) y = d^{-1}(f+u),\nonumber \\&\quad y(0,t) = y(1,t) = 0, \quad t \in [0,T], \nonumber \\&\quad y(x,0) = y_0(x), \quad x \in [0,1]. \end{aligned}$$
(6)

We then multiply the equation by any test function p with appropriately specified boundary and terminal conditions [as in (7)] and utilize the adjoint property (4) to obtain the following terminal-boundary value problem of the adjoint costate equation for the costate function p(x, t)

$$\begin{aligned}&-\displaystyle \frac{\partial {(d^{-1}p)}}{\partial {t}} - \big ( (1-r)\; {}_0D_x^{2-\beta } + r\; {}_xD_1^{2-\beta } \big ) p = y-\tilde{y}, \nonumber \\&\quad p(0,t) = p(1,t) = 0, \quad t \in [0,T], \nonumber \\&\quad p(x,T) = 0, \quad x \in [0,1]. \end{aligned}$$
(7)

To derive the optimality condition for (1), we calculate the Gâteaux derivative of the objective functional J(u). Let \(u \in K\) be the minimizer of the objective functional J(u). Let \(v \in K\) be any admissible function and \(\delta u = v-u\). For any arbitrarily small positive number \(\varepsilon \), we note that

$$\begin{aligned} \delta y := \lim \limits _{\varepsilon \rightarrow 0^+} \displaystyle \frac{y(u+\varepsilon \delta u)-y(u)}{\varepsilon } = y'(u)\delta u \end{aligned}$$
(8)

exists, provided that y is differentiable with respect to u. Consequently,

$$\begin{aligned} y(u+\varepsilon \delta u)= & {} y(u) + \varepsilon y'(u) \delta u + o(|\varepsilon \delta u|) \\= & {} y(u) + \varepsilon \delta y + o(|\varepsilon \delta u|) \end{aligned}$$

as \(\varepsilon \) tends to zero. Therefore, we expand \(J(u+\varepsilon \delta u)\) as follows

$$\begin{aligned} J(u+\varepsilon \delta u)= & {} \displaystyle \frac{1}{2}\displaystyle \int _0^T\displaystyle \int _0^1 (y(u+\varepsilon \delta u) -\tilde{y})^2 dx dt\\&+\, \displaystyle \frac{1}{2} \displaystyle \int _0^T\displaystyle \int _0^1 (u+\varepsilon \delta u)^2 dx dt \\= & {} \displaystyle \frac{1}{2}\displaystyle \int _0^T\displaystyle \int _0^1 (y(u) -\tilde{y})^2 + 2 \varepsilon (y(u) -\tilde{y}) \delta y dx dt \\&+\, \displaystyle \frac{1}{2} \displaystyle \int _0^T\displaystyle \int _0^1 u^2 + 2 \varepsilon \; u \; \delta u dx dt + o(\varepsilon ) . \end{aligned}$$

We accordingly find that

$$\begin{aligned} J'(u)\delta u= & {} \displaystyle \lim \limits _{\varepsilon \rightarrow 0^+} \displaystyle \frac{J(u+\varepsilon \delta u) -J(u)}{\varepsilon } \nonumber \\= & {} \displaystyle \int _0^T\displaystyle \int _0^1 (y-\tilde{y}) \delta y ~dxdt + \displaystyle \int _0^T\displaystyle \int _0^1 u \delta u ~dxdt\nonumber \\= & {} \displaystyle \int _0^T \displaystyle \int _0^1 \Big [-\displaystyle \frac{\partial {(d^{-1}p)}}{\partial {t}} - \Big ( (1-r)\; {}_0D_x^{2-\beta } + r\; {}_xD_1^{2-\beta } \Big ) p\Big ] \delta y \;dxdt \nonumber \\&+\, \displaystyle \int _0^T\displaystyle \int _0^1 u \delta u ~dxdt. \end{aligned}$$
(9)

In the last step we have replaced \(y-\tilde{y}\) in the integrand of the first integral by the left-hand side of the costate equation (7).

We see from (8) that \(\delta y\) satisfies the homogeneous initial and boundary conditions. We further utilize the fact that \(p(x,T) = 0\) and integrate the first integrand in the first integral on the right-hand side with respect to time by parts to get

$$\begin{aligned} \displaystyle \int _0^T \displaystyle \int _0^1 \displaystyle \frac{\partial {(d^{-1}p)}}{\partial {t}} \delta y dxdt = -\displaystyle \int _0^T \displaystyle \int _0^1 p d^{-1} \displaystyle \frac{\partial {\delta y}}{\partial {t}} dxdt. \end{aligned}$$
(10)

We intend to utilize the facts that \(\delta y\) and p both vanish at the boundary to integrate the second integrand in the first integral on the right-hand side with respect to x by parts in a similar fashion. However because the fractional integral operators and the classical differential operators do not commute in general, we have to take extra care by utilizing (4) and (5) repeatedly to obtain

$$\begin{aligned}&\displaystyle \int _0^T \displaystyle \int _0^1 \left[ \Big ( (1-r)\; {}_0D_x^{2-\beta } + r\; {}_xD_1^{2-\beta } \Big ) p\right] \delta y dxdt\nonumber \\&\quad = \displaystyle \int _0^T \displaystyle \int _0^1 \left[ \Big ( (1-r)\; D^2~{}_0D_x^{-\beta } + r\; D^2 {}_xD_1^{-\beta } \Big ) p\right] \delta y dxdt \nonumber \\&\quad = \displaystyle \int _0^T \displaystyle \int _0^1 \left[ \Big ( (1-r)\; D ~{}_0D_x^{-\beta } D + r\; D {}_xD_1^{-\beta } D \Big ) p\right] \delta y dxdt \nonumber \\&\quad = -\displaystyle \int _0^T \displaystyle \int _0^1 \left[ \Big ( (1-r)\; ~{}_0D_x^{-\beta } D + r\; {}_xD_1^{-\beta } D \Big ) p\right] (D \delta y) dxdt \nonumber \\&\quad = -\displaystyle \int _0^T \displaystyle \int _0^1 \left[ \Big ( r\; {}_0D_x^{-\beta } + (1-r)\; {}_xD_1^{-\beta } \Big ) D \delta y\right] Dp dxdt \nonumber \\&\quad = -\displaystyle \int _0^T \displaystyle \int _0^1 \left[ \Big ( rD\; {}_0D_x^{-\beta } + (1-r)D\; {}_xD_1^{-\beta } \Big ) \delta y\right] Dp dxdt \nonumber \\&\quad = \displaystyle \int _0^T \displaystyle \int _0^1 \left[ \Big ( r D^2\; {}_0D_x^{-\beta } + (1-r) D^2 \; {}_xD_1^{-\beta } \Big ) \delta y\right] p dxdt \nonumber \\&\quad = \displaystyle \int _0^T \displaystyle \int _0^1 \left[ \Big ( r\; {}_0D_x^{2-\beta } + (1-r)\; {}_xD_1^{2-\beta } \Big ) \delta y\right] p dxdt. \end{aligned}$$
(11)

We incorporate (10) and (11) into (9) to obtain

$$\begin{aligned} J'(u)\delta u= & {} \displaystyle \int _0^T \displaystyle \int _0^1 \Big [ d^{-1} \displaystyle \frac{\partial {\delta y}}{\partial {t}} - \Big ( r\; {}_0D_x^{2-\beta } + (1-r)\; {}_xD_1^{2-\beta } \Big ) \delta y\Big ] p dxdt \\&+ \displaystyle \int _0^T\displaystyle \int _0^1 u \delta u ~dxdt. \end{aligned}$$

We observe from (6) and (8) that

$$\begin{aligned} \displaystyle d^{-1}\displaystyle \frac{\partial {\delta y}}{\partial {t}} -\bigg ( r\; {}_0D_x^{2-\beta } + (1-r)\; {}_xD_1^{2-\beta } \bigg ) \delta y =d^{-1} \delta u. \end{aligned}$$

Since u is the minimizer of the objective functional \(J(\cdot )\) and \(\delta u = v-u\) with \(v \in K\), the first-order condition \(J'(u)\delta u \ge 0\) leads to the following variational inequality

$$\begin{aligned} J'(u)\delta u= & {} \displaystyle \int _0^T\displaystyle \int _0^1 (d^{-1} p +u) \delta u dxdt \nonumber \\= & {} \displaystyle \int _0^T\displaystyle \int _0^1 (d^{-1}p+u) (v-u) dxdt \ge 0 \end{aligned}$$
(12)

for all \(v \in K\). This variational inequality has the following closed-form solution [17]

$$\begin{aligned} u = \max \left\{ u_0,-d^{-1}p\right\} . \end{aligned}$$
(13)

As a matter of fact, if \(-d^{-1}p > u_0\) on the entire interval (0, 1), then (13) implies \(u=-d^{-1}p\), and so \(d^{-1}p+u=0\) on the entire interval (0, 1); the integrand in the variational inequality (12) vanishes identically for all \(v \in K\). If \(-d^{-1} p \le u_0\) on the entire interval (0, 1), then (13) implies \(u=u_0\), and thus \(d^{-1}p + u = d^{-1}p + u_0 \ge 0\). Since \(v-u = v-u_0 \ge 0\) for all \(v\in K\), the variational inequality (12) holds.

In the general case, the maximum in (13) may be attained by either argument on different subdomains of \([0,1] \times [0,T]\). Suppose that \(u = -d^{-1}p\) on \(\varOmega _1\) and \(u = u_0\) on \(\varOmega _2\) with \(\varOmega _1 \cup \varOmega _2 = [0,1] \times [0,T]\). It follows that the integral \(\displaystyle \int _{\varOmega _1} (d^{-1}p+u) (v-u) dxdt = 0\). Hence, the integral in (12) reduces to an integral over \(\varOmega _2\), on which \(u=u_0\). The same argument as in the preceding paragraph shows that this integral remains nonnegative for all \(v \in K\). Therefore, (12) still holds.

3 A Fractional Gradient Projection Method

The optimality condition of the constrained optimal control problem (1) and (2) reduces the problem to the solution of a coupled system consisting of the state equation (6), the costate equation (7), and the optimal control condition (13). A commonly used method for the efficient solution of this system is the following sequentially decoupled gradient projection method [17] (a code sketch of this loop follows the listing):

(i) Pick an initial guess \(u^0\) of the control.

For \(n=0,1,\ldots \), do

(ii) With the known control \(u^n\), solve the state equation forward in time for the state variable \(y^n\)

$$\begin{aligned} \displaystyle d^{-1}\displaystyle \frac{\partial {y^n}}{\partial {t}} - \bigg ( r\; {}_0D_x^{2-\beta } + (1-r)\; {}_xD_1^{2-\beta } \bigg ) y^n = d^{-1} \bigg ( f + u^n \bigg ). \end{aligned}$$

(iii) With the known state variable \(y^n\), solve the costate equation backward in time for the costate variable \(p^n\)

$$\begin{aligned} \displaystyle -\displaystyle \frac{\partial {(d^{-1}p^n)}}{\partial {t}} - \bigg ( (1-r)\; {}_0D_x^{2-\beta } + r\; {}_xD_1^{2-\beta } \bigg ) p^n = y^n-\tilde{y}. \end{aligned}$$

(iv) Update the optimal control

$$\begin{aligned} \displaystyle u^{n+1} = \max \big \{u_0,-d^{-1} p^n\big \}. \end{aligned}$$

(v) If \(\Vert u^{n+1}-u^n\Vert _{\infty }\) is within the tolerance, stop.

end do
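For orientation, the outer loop can be sketched in a few lines of Python. This is a sketch only: the callbacks `solve_state_forward` and `solve_costate_backward` are hypothetical placeholders for any discretization of (6) and (7), such as the finite difference schemes below, and the arrays live on the full space-time grid.

```python
import numpy as np

def gradient_projection(u, f, y_tilde, d_inv, u0,
                        solve_state_forward, solve_costate_backward,
                        tol=1e-8, max_iter=100):
    """Sketch of the outer gradient projection loop, steps (i)-(v)."""
    for n in range(max_iter):
        y = solve_state_forward(f + u, d_inv)            # step (ii)
        p = solve_costate_backward(y - y_tilde, d_inv)   # step (iii)
        u_new = np.maximum(u0, -d_inv * p)               # step (iv), i.e. (13)
        if np.max(np.abs(u_new - u)) < tol:              # step (v)
            return u_new, y, p
        u = u_new
    return u, y, p
```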

It is clear that in the gradient projection method, the major computational cost comes from the repeated solution of the state equation (6) and the costate equation (7). In the rest of this section, we present two finite difference methods for these equations and discuss related computational issues.

Let M and N be positive integers. Define the time step size \(\tau := T/M\) and the spatial mesh size \(h := 1/(N+1)\), respectively. We define the temporal partition \(t^k := k \tau \) for \(k=0,1,\ldots ,M\) and the spatial partition \(x_i := i h\) for \(i =0, 1,\ldots ,N+1\). Let \(y_i^k, p_i^k, u_i^k\) be the finite difference approximations to \(y(x_i,t^k)\), \(p(x_i,t^k)\), \(u(x_i,t^k)\), respectively, and denote \((d^{-1})_i^k := d^{-1}(x_i,t^k)\), \(f_i^k := f(x_i,t^k)\), \(\tilde{y}_i^k := \tilde{y}(x_i,t^k)\). Then the Meerschaert finite difference scheme for the state equation can be formulated as follows [19]: For \(k=1,2,\ldots ,M\), solve the following equation forward in time to find \(\{ y^k_i\}_{i=1}^N\) such that

$$\begin{aligned}&\big ( d^{-1} \big )_i^k\displaystyle \frac{y_i^k-y_i^{k-1}}{\tau } - \left[ r\displaystyle \sum _{j=0}^{i+1}g_j^{(2-\beta )} y_{i-j+1}^k + (1-r)\displaystyle \sum _{j=0}^{N-i+1}g_{j}^{(2-\beta )} y_{i+j-1}^k \right] h^{\beta -2} \nonumber \\&\quad = \big (d^{-1}\big )_i^k \big (f_i^k + u_i^k \big ). \end{aligned}$$
(14)

Here the fractional binomial coefficients \(g_{j}^{(\alpha )}\) can be evaluated recursively

$$\begin{aligned} \displaystyle g_{0}^{(\alpha )}=1, \qquad g_{j}^{(\alpha )}=\left( 1-\frac{\alpha +1}{j}\right) g_{j-1}^{(\alpha )} \quad {\mathrm {for}} \quad j \ge 1. \end{aligned}$$
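A direct Python transcription of this recursion reads as follows (a sketch; the weights are exact up to floating-point rounding):

```python
import numpy as np

def grunwald_weights(alpha, n):
    """Fractional binomial coefficients g_j^(alpha), j = 0..n."""
    g = np.empty(n + 1)
    g[0] = 1.0
    for j in range(1, n + 1):
        g[j] = (1.0 - (alpha + 1.0) / j) * g[j - 1]
    return g
```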

We can similarly formulate the Meerschaert finite difference scheme for the costate equation: For \(k=M, M-1,\ldots ,1\), solve the following equation backward in time to find \(\{ p^k_i\}_{i=1}^N\) such that

$$\begin{aligned}&\displaystyle \frac{(d^{-1})_i^{k-1} p_i^{k-1}-(d^{-1})_i^k p_i^k}{\tau } \displaystyle - \left[ (1-r)\displaystyle \sum _{j=0}^{i+1}g_j^{(2-\beta )} p_{i-j+1}^k +r\displaystyle \sum _{j=0}^{N-i+1}g_{j}^{(2-\beta )} p_{i+j-1}^k \right] h^{\beta -2} \nonumber \\&\quad \quad = y_i^{k-1} - \tilde{y}_i^{k-1}. \end{aligned}$$
(15)

Because a shifted Grünwald approximation was used, the Meerschaert scheme has only first-order accuracy in both space and time [19].

Second-order finite difference methods were developed to improve the computational efficiency. A Crank–Nicolson temporal discretization combined with a Richardson extrapolation of the spatial discretization in (14) results in a scheme with second-order accuracy in both space and time [18]. In this paper we use an alternative second-order scheme, the CN-WSGD scheme, developed in [28]: For \(k=1,2,\ldots ,M\), solve the following equation forward in time to find \(\{ y^k_i\}_{i=1}^N\) such that

$$\begin{aligned}&\big ( d^{-1} \big )_i^{k-\frac{1}{2}}\displaystyle \frac{y_i^k-y_i^{k-1}}{\tau } - \left[ r\displaystyle \sum _{j=0}^{i+1}w_j^{(2-\beta )} y_{i-j+1}^{k-\frac{1}{2}} + (1-r)\displaystyle \sum _{j=0}^{N-i+1}w_{j}^{(2-\beta )} y_{i+j-1}^{k-\frac{1}{2}} \right] h^{\beta -2} \nonumber \\&\quad = \big (d^{-1}\big )_i^{k-\frac{1}{2}} \big (f_i^{k-\frac{1}{2}} + u_i^{k-\frac{1}{2}} \big ). \end{aligned}$$
(16)

Here the superscript \(k-\frac{1}{2}\) denotes the mean value of the function at time levels k and \(k-1\), and \(w_{j}^{(\alpha )}\) can be evaluated by

$$\begin{aligned} \displaystyle w_{0}^{(\alpha )} =\frac{\alpha }{2}g_{0}^{(\alpha )}, \quad w_{j}^{(\alpha )} =\frac{\alpha }{2}g_{j}^{(\alpha )} +\frac{2-\alpha }{2}g_{j-1}^{(\alpha )} \quad {\mathrm {for}} \quad j\ge 1. \end{aligned}$$
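Building on the `grunwald_weights` helper above, these weights can be sketched as:

```python
def wsgd_weights(alpha, n):
    """WSGD weights w_j^(alpha) built from the Grunwald weights g_j^(alpha)."""
    g = grunwald_weights(alpha, n)
    w = np.empty(n + 1)
    w[0] = 0.5 * alpha * g[0]
    w[1:] = 0.5 * alpha * g[1:] + 0.5 * (2.0 - alpha) * g[:-1]
    return w
```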

We can similarly formulate the CN-WSGD scheme for the costate equation as follows: For \(k=M, M-1,\ldots ,1\), solve the following equation backward in time to find \(\{ p^k_i\}_{i=1}^N\) such that

$$\begin{aligned}&\displaystyle \frac{(d^{-1})_i^{k-1} p_i^{k-1}-(d^{-1})_i^k p_i^k}{\tau } \displaystyle - \Bigg [(1-r)\displaystyle \sum _{j=0}^{i+1}w_j^{(2-\beta )} p_{i-j+1}^{k-\frac{1}{2}} +r\displaystyle \sum _{j=0}^{N-i+1}w_{j}^{(2-\beta )} p_{i+j-1}^{k-\frac{1}{2}} \Bigg ] h^{\beta -2} \nonumber \\&\qquad = y_i^{k-\frac{1}{2}} - \tilde{y}_i^{k-\frac{1}{2}}. \end{aligned}$$
(17)

The nonlocal nature of fractional differential operators results in a significant increase in the already expensive computational and storage cost of the gradient projection method for the constrained optimal control problem. Although the optimal control problem constrained by space-fractional diffusion equations provides a more appropriate description than its analogue constrained by second-order diffusion equations, the significantly increased computational and storage cost poses a severe computational challenge for the new model. More precisely, within each outer loop of the gradient projection method, the computational cost of solving the space-fractional state and costate equations is of order \(O(M N^3)\), a sharp increase over the O(M N) cost of solving second-order state and costate equations. The situation is aggravated by the fact that the state and costate equations have to be solved repeatedly in the gradient projection iteration. Therefore, the development of a fast and faithful numerical method is of fundamental importance.

4 A Fast and Faithful Solution Method

The goal of this section is to develop a fast and faithful gradient projection method for the optimal control problem (1) and (2). As the major computational complexity of the gradient projection method comes from the numerical solution of the state and costate equations, we focus on the development of fast and faithful solution methods for these equations.

4.1 Fast Implementation of the Finite Difference Methods for the State and Costate Equations

Let \(\mathbf {y}^k := (y_1^k,y_2^k,\ldots ,y_N^k)^T\), and \(\mathbf {y}^{k-1}\), \(\mathbf {f}^k\), \(\mathbf {u}^k\) be introduced in a similar fashion. Then the Meerschaert scheme for the state equation (14) can be expressed in a matrix form [19]

$$\begin{aligned} \big (\mathbf {D}^k+\gamma \mathbf {A}\big )\mathbf {y}^k =\mathbf {D}^k\mathbf {y}^{k-1}+\tau \mathbf {D}^k\big (\mathbf {f}^k +\mathbf {u}^k\big ), \end{aligned}$$
(18)

where \(\gamma ={\tau }/{h^{2-\beta }}\), \(\mathbf {D}^k={\mathrm {diag}}((d^{-1})_i^k)_{i=1}^N\), and \(\mathbf {A}\) is a full stiffness matrix. Traditionally, the linear system (18) has been solved numerically by Gaussian elimination [19], which requires computational work of \(O(N^3)\) per time step. It was reported in [32] that the stiffness matrix \(\mathbf {A}\) can be decomposed as

$$\begin{aligned} \mathbf {A}= r \mathbf {G}+ (1-r) \mathbf {G}^T \end{aligned}$$
(19)

where \(\mathbf {G}\) is a Toeplitz matrix of the form

$$\begin{aligned} \mathbf {G}= -\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} g_1 &{} g_0 &{} &{} &{} \\ g_2 &{} g_1 &{} \ddots &{} &{} \\ \vdots &{} \ddots &{} \ddots &{} \ddots &{} \\ \vdots &{} \ddots &{} \ddots &{} g_1 &{} g_0 \\ g_N &{} \cdots &{} \cdots &{} g_2 &{} g_1 \end{array}\right) \end{aligned}$$
(20)

with \(g_i := g_i^{(2-\beta )}\).
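To make this structure concrete, the first column and first row of \(\mathbf {A}= r \mathbf {G}+ (1-r) \mathbf {G}^T\) can be assembled from the weights. The following Python helper is our own illustrative sketch; it relies on the `grunwald_weights` routine above.

```python
def stiffness_first_col_row(beta, r, N):
    """First column and first row of the Toeplitz matrix A = r*G + (1-r)*G^T."""
    g = grunwald_weights(2.0 - beta, N)
    col = -r * g[1:]              # A[i,0] = -r*g_{i+1} for i >= 2, and ...
    row = -(1.0 - r) * g[1:]      # A[0,j] = -(1-r)*g_{j+1} for j >= 2, and ...
    col[0] = row[0] = -g[1]       # ... diagonal entry -(r + (1-r))*g_1
    col[1] -= (1.0 - r) * g[0]    # ... first subdiagonal -(r*g_2 + (1-r)*g_0)
    row[1] -= r * g[0]            # ... first superdiagonal -(r*g_0 + (1-r)*g_2)
    return col, row
```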

Fast finite difference methods were developed by exploiting this decomposition [31, 32], which reduce the computational cost from \(O(N^3)\) per time step to \(O(N \log N)\) per inner iteration. It is known that an \(N \times N\) Toeplitz matrix \(\mathbf {T}=(t_{i-j})_{0\le i,j\le N-1}\) can be embedded into a \(2N \times 2N\) circulant matrix \(\mathbf {C}_{2N}\) [6, 11]

$$\begin{aligned} \begin{array}{l} \mathbf {C}_{2N} = \left[ \begin{array}{c@{\quad }c} \mathbf {T}&{} \hat{\mathbf {T}} \\ \hat{\mathbf {T}} &{} \mathbf {T}\end{array}\right] , \quad \hat{\mathbf {T}} = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0 &{} t_{N-1} &{} \ldots &{} t_{2} &{} t_{1}\\ t_{1-N} &{} 0 &{} t_{N-1} &{} \ldots &{} t_{2} \\ \vdots &{} t_{1-N} &{} 0 &{} \ddots &{} \vdots \\ t_{-2} &{} \vdots &{} \ddots &{} \ddots &{} t_{N-1} \\ t_{-1} &{} t_{-2} &{} \ldots &{} t_{1-N} &{} 0 \end{array}\right] . \end{array}\end{aligned}$$
(21)

The circulant matrix \(\mathbf {C}_{2N}\) can be decomposed as

$$\begin{aligned} \mathbf {C}_{2N} = \mathbf {F}_{2N}^{-1} {\mathrm {diag}}(\mathbf {F}_{2N} \mathbf {c}) \mathbf {F}_{2N}, \end{aligned}$$
(22)

where \(\mathbf {c}\) is the first column vector of \(\mathbf {C}_{2N}\) and \(\mathbf {F}_{2N}\) is the \(2N \times 2N\) discrete Fourier transform matrix. It is well known that the matrix-vector multiplication \(\mathbf {F}_{2N}\mathbf {v}_{2N}\) for \(\mathbf {v}_{2N} \in R^{2N}\) can be carried out in \(O(2N \log (2N)) = O(N \log N)\) operations via the fast Fourier transform (FFT). Therefore, \(\mathbf {C}_{2N} \mathbf {v}_{2N}\) can be evaluated in \(O(N \log N)\) operations by (22).
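A minimal Python sketch of this FFT-based Toeplitz product follows; `first_col` and `first_row` hold \((t_0,\ldots ,t_{N-1})\) and \((t_0,t_{-1},\ldots ,t_{1-N})\), e.g. those of \(\mathbf {A}\) returned by the illustrative `stiffness_first_col_row` helper above.

```python
import numpy as np

def toeplitz_matvec(first_col, first_row, v):
    """Compute T @ v in O(N log N) via the circulant embedding (21)-(22)."""
    n = len(v)
    # First column of the 2N x 2N circulant matrix C_{2N} in (21)
    c = np.concatenate((first_col, [0.0], first_row[:0:-1]))
    # Apply C_{2N} = F^{-1} diag(F c) F to v padded with N zeros
    prod = np.fft.ifft(np.fft.fft(c) * np.fft.fft(np.concatenate((v, np.zeros(n)))))
    return prod[:n].real        # first N entries give T @ v
```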

The same idea can be applied to other schemes in the previous section. The Meerschaert scheme (15) for the costate equation can be expressed in the matrix form

$$\begin{aligned} \Bigg (\mathbf {D}^{k-1}+\gamma \mathbf {A}^T\Bigg )\mathbf {p}^{k-1} =\mathbf {D}^k\mathbf {p}^{k}+\tau \Bigg (\mathbf {y}^{k-1} -\tilde{\mathbf {y}}^{k-1}\Bigg ). \end{aligned}$$
(23)

It was shown in [30, 32] that the matrix \(\mathbf {G}\) is diagonally dominant and positive-definite for \(0< \beta <1\), and so is \(\mathbf {A}\). Clearly the matrix \(\mathbf {A}\) is Toeplitz, and the coefficient matrices \(\mathbf {D}^k+\gamma \mathbf {A}\) and \(\mathbf {D}^{k-1}+\gamma \mathbf {A}^T\) are of diagonal-plus-Toeplitz type. Thus, the memory requirement to store them is O(N), instead of the \(O(N^2)\) required to store them as full matrices. Furthermore, if \(r = 1/2\), \(\mathbf {A}\) is symmetric and positive-definite, so the conjugate gradient method can be used to solve the linear systems (18) and (23) iteratively. For \(r \ne 1/2\), \(\mathbf {A}\) is nonsymmetric, and conjugate gradient squared type methods can be employed to solve the systems.

Similarly, the CN-WSGD schemes (16) and (17) can be expressed in the matrix forms

$$\begin{aligned} \Bigg (\mathbf {D}^{k-\frac{1}{2}} + \frac{1}{2}\gamma \mathbf {B}\Bigg )\mathbf {y}^k = \Bigg (\mathbf {D}^{k-\frac{1}{2}} - \frac{1}{2}\gamma \mathbf {B}\Bigg ) \mathbf {y}^{k-1} +\tau \mathbf {D}^{k-\frac{1}{2}} \bigg (\mathbf {f}^{k-\frac{1}{2}} +\mathbf {u}^{k-\frac{1}{2}} \bigg ) \end{aligned}$$
(24)

and

$$\begin{aligned} \Bigg (\mathbf {D}^{k-1}+\frac{1}{2}\gamma \mathbf {B}^T \Bigg )\mathbf {p}^{k-1} = \Bigg ( \mathbf {D}^k - \frac{1}{2}\gamma \mathbf {B}^T \Bigg )\mathbf {p}^{k} +\tau \bigg (\mathbf {y}^{k-\frac{1}{2}} -\tilde{\mathbf {y}}^{k-\frac{1}{2}} \bigg ). \end{aligned}$$
(25)

Here \(\mathbf {B}= r \bar{\mathbf {G}} + (1-r)\bar{\mathbf {G}}^T\), and \(\bar{\mathbf {G}}\) is given by the expression in (20) with \(g_i\) replaced by \(w_i^{(2-\beta )}\). \(\mathbf {D}^{k-\frac{1}{2}}\), \(\mathbf {f}^{k-\frac{1}{2}}\), \(\mathbf {u}^{k-\frac{1}{2}}\), \(\mathbf {y}^{k-\frac{1}{2}}\), and \(\tilde{\mathbf {y}}^{k-\frac{1}{2}}\) denote the corresponding averages at time steps \(t^{k-1}\) and \(t^k\). The matrix \(\bar{\mathbf {G}}\) was shown to be positive definite [28], and so is \(\mathbf {B}\). Hence, the fast method described earlier in this subsection still applies.

4.2 A Fast Preconditioned Conjugate Gradient Algorithm

Although the fast Krylov subspace iterative solvers described in the previous subsection significantly reduce the computational cost to \(O(N \log N)\) per iteration, a large number of iterations may be needed if the condition number of the coefficient matrix is large. In addition, the round-off errors accumulated over a large number of iterations may deteriorate the convergence behavior of the iterative solvers [30]. This motivates us to use preconditioned Krylov subspace solvers.

For \(r = 1/2\), the coefficient matrices of the systems (18), (23), (24), and (25) are of diagonal-plus-symmetric-positive-definite-Toeplitz type. We modify the Strang preconditioner as follows [27]: For the Toeplitz matrix \(\mathbf {T}\) given in the previous subsection, the entries \(s_l\) of the Strang circulant preconditioner \(\mathbf {S}(\mathbf {T})=(s_{i-j})_{0\le i,j\le N-1}\) are given by

$$\begin{aligned} s_l=\left\{ \begin{array}{ll} t_l, &{} \quad 0\le l\le N/2, \\ t_{l-N}, &{}\quad N/2 < l < N, \\ t_{l+N}, &{}\quad -N < l < 0. \end{array}\right. \end{aligned}$$
(26)

Then, the diagonal-plus-symmetric-positive-definite Toeplitz coefficient matrices of the systems (18), (23), (24), and (25) are preconditioned by the following circulant preconditioner

$$\begin{aligned} \mathbf {M}^k = D^k\mathbf {I}+\gamma \mathbf {S}(\mathbf {T}), \end{aligned}$$
(27)

where \(D^k=\frac{1}{N}\sum \nolimits _{i=1}^N (d^{-1})_i^k\) is the mean value of the diagonal elements of \(\mathbf {D}^k\), \(\mathbf {I}\) is the identity matrix of order N, and \(\mathbf {T}= \mathbf {A}\) for the systems (18) and (23) or \(\mathbf {T}= \mathbf {B}\) for (24) and (25). We use the system (18) as an example to formulate the fast preconditioned conjugate gradient method as follows [2]:

[Figure a: the fast preconditioned conjugate gradient (PCG) algorithm for (18)]
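Since the algorithm appears in the figure above, we also give a minimal Python sketch of the same structure: a standard PCG iteration in which the preconditioner solve \(\mathbf {M}^k \mathbf {q}^{(i)} = \mathbf {r}^{(i)}\) is carried out by FFT. Here `matvec` stands for the \(O(N \log N)\) Toeplitz product of Sect. 4.1, `strang_column` assembles the first column of \(\mathbf {S}(\mathbf {T})\) from (26) assuming N is even, and all names are our own illustrative choices rather than those of the figure.

```python
import numpy as np

def strang_column(t_col, t_row):
    """First column s of the Strang circulant preconditioner S(T) in (26).

    t_col = (t_0, ..., t_{N-1}), t_row = (t_0, t_{-1}, ..., t_{1-N}); N even.
    """
    n = len(t_col)
    s = np.empty(n)
    s[: n // 2 + 1] = t_col[: n // 2 + 1]       # s_l = t_l      for 0 <= l <= N/2
    s[n // 2 + 1:] = t_row[n // 2 - 1:0:-1]     # s_l = t_{l-N}  for N/2 < l < N
    return s

def pcg(matvec, b, m_col, x0, tol=1e-10, max_iter=1000):
    """Sketch of the fast PCG iteration for (D^k + gamma*A) y = b, r = 1/2.

    matvec applies the coefficient matrix in O(N log N) (Sect. 4.1);
    m_col is the first column of the circulant preconditioner M^k in (27),
    e.g. m_col = gamma * strang_column(...) with m_col[0] += mean of 1/d.
    """
    m_eig = np.fft.fft(m_col)                   # eigenvalues of M^k, cf. (22)
    solve_M = lambda r: np.fft.ifft(np.fft.fft(r) / m_eig).real
    x = x0.copy()
    r = b - matvec(x)
    q = solve_M(r)                              # M^k q = r solved by FFT
    d = q.copy()
    rho = r @ q
    for _ in range(max_iter):
        Ad = matvec(d)
        alpha = rho / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        q = solve_M(r)
        rho, rho_old = r @ q, rho
        d = q + (rho / rho_old) * d
    return x
```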

In the above algorithm, all steps need O(N) operations except for the matrix-vector multiplications \(\mathbf {A}\mathbf {y}^{(0)}\) and \(\mathbf {A}\mathbf {p}^{(i-1)}\) and the solution of the system \(\mathbf {M}^k \mathbf {q}^{(i)} = \mathbf {r}^{(i)}\). The matrix-vector multiplications can be carried out in \(O(N \log N)\) operations as shown in Sect. 4.1. The preconditioner \(\mathbf {M}^k\) is an \(N\times N\) circulant matrix and so can be decomposed similarly to (22)

$$\begin{aligned} \mathbf {M}^k = \mathbf {F}_{N}^{-1} {\mathrm {diag}}(\mathbf {F}_{N} \mathbf {s}) \mathbf {F}_{N}, \end{aligned}$$

where \(\mathbf {s}\) is the first column vector of \(\mathbf {M}^k\) and \(\mathbf {F}_{N}\) is the \(N \times N\) discrete Fourier transform matrix. It can be inverted in \(O(N \log N)\) operations.

4.3 A Fast Preconditioned Conjugate Gradient Squared Algorithm

For \(r \ne 1/2\), the stiffness matrices \(\mathbf {A}\) and \(\mathbf {B}\) in the systems (18), (23), (24), and (25) are positive-definite but nonsymmetric Toeplitz matrices. We again use the Meerschaert scheme (18) as an example to demonstrate the idea. We still use a circulant preconditioner of the form \(\mathbf {M}^k = D^k\mathbf {I}+\gamma \mathbf {S}(\mathbf {A})\); note, however, that \(\mathbf {S}(\mathbf {A})\) differs from that of the symmetric case. Alternatively, we can replace \(\mathbf {S}(\mathbf {A})\) by a T. Chan preconditioner [4]. A fast preconditioned conjugate gradient squared algorithm for (18) can then be formulated as follows [2]:

[Figure b: the fast preconditioned conjugate gradient squared (PCGS) algorithm for (18)]
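As in the PCG case, the algorithm is given in the figure above; a compact sketch of such a preconditioned CGS solve, using SciPy's `cgs` with the FFT-based circulant preconditioner wrapped as a `LinearOperator` (the routine and variable names are our own), might read:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cgs

def solve_pcgs(matvec, b, m_col):
    """Sketch: preconditioned CGS for the nonsymmetric systems (r != 1/2).

    matvec applies the coefficient matrix, e.g. D^k + gamma*A, in O(N log N);
    m_col is the first column of the circulant preconditioner M^k.
    """
    n = len(b)
    m_eig = np.fft.fft(m_col)
    A = LinearOperator((n, n), matvec=matvec)
    # Preconditioner solve M^{-1} r carried out by FFT in O(N log N)
    M = LinearOperator((n, n),
                       matvec=lambda r: np.fft.ifft(np.fft.fft(r) / m_eig).real)
    y, info = cgs(A, b, M=M)
    if info != 0:
        raise RuntimeError(f"CGS failed to converge (info = {info})")
    return y
```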

5 Numerical Experiments

In this section we carry out numerical experiments to investigate the performance of the fast gradient projection method. These methods were implemented using Compaq Visual Fortran 6.6 on a ThinkPad T410 laptop.

5.1 An Optimal Control Problem with \(r = 0.5\)

We consider the optimal control problem (1) and (2) with the following data

$$\begin{aligned} \beta= & {} 0.2, \quad r = 0.5, \quad T = 1, \quad u_0 = 1,\nonumber \\ \tilde{y}= & {} \displaystyle \frac{100 x^3(1-x)^2 \sin (T-t)}{(1+xt)^2} +\displaystyle \frac{100 x^2(1-x)^2 \cos (T-t)}{1+xt} \nonumber \\&-\displaystyle \frac{\sin (T-t)}{\varGamma (1.2)}\Big [(x^{0.2}+(1-x)^{0.2}) -5(x^{1.2}+(1-x)^{1.2}) +\displaystyle \frac{50}{11}(x^{2.2}+(1-x)^{2.2})\Big ], \nonumber \\ f= & {} -\max \Big \{ \displaystyle \frac{100 x^2(1-x)^2 \sin (T-t)}{1+xt},1\Big \}, \nonumber \\ d= & {} \displaystyle \frac{1+xt}{100}. \end{aligned}$$
(28)

The analytical solution of this problem is given by

$$\begin{aligned} y= & {} 0, \nonumber \\ p= & {} -x^2(1-x)^2 \sin (T-t), \nonumber \\ u= & {} \max \Big \{\displaystyle \frac{100 x^2(1-x)^2 \sin (T-t)}{1+xt},1\Big \}. \end{aligned}$$
(29)

Since \(r = 0.5\), the coefficient matrices for the state and costate equations are symmetric. We solve the Meerschaert schemes (18) and (23) as well as the CN-WSGD schemes (24) and (25) by Gaussian elimination (Gauss), the fast conjugate gradient (CG) method, and the fast preconditioned conjugate gradient (PCG) method. We present the \(l^2\) errors of the approximations to y, p, and u, and the convergence rates of the PCG scheme, in Tables 1 and 2, respectively. Since Gaussian elimination consumes too much CPU time, we skip the Gaussian elimination solve for the finest mesh \(N = 2^{10}\). We present the CPU times consumed by the two schemes in Table 3.

Table 1 Performance of the Meerschaert scheme solved by Gauss, the CG, and the PCG solvers for the example in Sect. 5.1
Table 2 Performance of the CN-WSGD scheme solved by Gauss, the CG, and the PCG solvers for the example in Sect. 5.1
Table 3 The consumed CPU times of the two schemes solved by Gauss, the CG, and the PCG solvers for the example in Sect. 5.1
Table 4 Performance of the Meerschaert scheme solved by Gauss, the CGS, and the PCGS solvers for the example in Sect. 5.2
Table 5 Performance of the CN-WSGD scheme solved by Gauss, the CGS, and the PCGS solvers for the example in Sect. 5.2

We observe from Tables 1 and 2 that Gauss, the CG, and the PCG solvers generate almost identical numerical solutions. Thus, we only report the convergence rates of the PCG solutions as an indicator. The fast gradient projection method, as an iterative method for the coupled nonlinear system, retains the first-order convergence rate of the Meerschaert scheme and the second-order convergence rate of the CN-WSGD scheme, respectively. These results, together with those in Table 3, show the utility of the fast preconditioned conjugate gradient method.

5.2 An Optimal Control Problem with \(r = 0.8\)

We consider the same example as in Sect. 5.1 except that \(r = 0.8\) and

$$\begin{aligned} \tilde{y}= & {} \displaystyle \frac{100 x^3(1-x)^2 \sin (T-t)}{(1+xt)^2} +\displaystyle \frac{100 x^2(1-x)^2 \cos (T-t)}{1+xt} \nonumber \\&-\displaystyle \frac{2\sin (T-t)}{5\varGamma (1.2)}\Big [\big (x^{0.2}+4(1-x)^{0.2}\big ) \nonumber \\&\qquad -5\big (x^{1.2}+4(1-x)^{1.2}\big ) +\displaystyle \frac{50}{11}\big (x^{2.2}+4(1-x)^{2.2}\big )\Big ]. \end{aligned}$$
(30)

The true solution of this problem is the same as in Sect. 5.1.

As \(r = 0.8\), the stiffness matrices of the numerical schemes are nonsymmetric, so we solve the Meerschaert schemes (18) and (23) and the CN-WSGD schemes (24) and (25) by Gauss elimination, the fast conjugate gradient squared (CGS) method, and the fast preconditioned conjugate gradient squared (PCGS) method. We present the corresponding results in Tables 4, 5 and 6, respectively.

Table 6 The consumed CPU times of the two schemes solved by Gauss, the CGS, and the PCGS solvers for the example in Sect. 5.2
Table 7 Performance of the Meerschaert scheme solved by Gauss, the CGS, and the PCGS solvers for the example in Sect. 5.3
Table 8 Performance of the CN-WSGD scheme solved by Gauss, the CGS, and the PCGS solvers for the example in Sect. 5.3
Table 9 The consumed CPU times of the two schemes solved by Gauss, the CGS, and the PCGS solvers for the example in Sect. 5.3

We observe roughly the same convergence behavior as in Sect. 5.1. One important exception is that for the Meerschaert scheme on the finest mesh \(M = N = 2^{10}\) the CGS solver diverged; as a matter of fact, it did not converge within 4 hours of CPU time. In contrast, the PCGS solver converged in 6 minutes and 15 seconds of CPU time. In other words, the PCGS solver not only reduced the CPU time of the CGS solver, but also improved its convergence behavior. Furthermore, the CGS solver did converge for the CN-WSGD scheme at \(M = N = 2^{10}\). This shows an additional computational benefit of the second-order CN-WSGD scheme over the first-order Meerschaert scheme.

5.3 An Optimal Control Problem with \(r = 1\)

We consider the optimal control problem (1) and (2) with the following data

$$\begin{aligned} T= & {} 1, \quad u_0 = 1, \quad \beta =0.2, \quad r=1, \nonumber \\ \tilde{y}= & {} x^2(1-x)\sin (T-t) +\displaystyle \frac{100 x^2(1-x)^2 \sin (T-t)}{(1+xt)^2} \nonumber \\&+\displaystyle \frac{100 x(1-x)^2 \cos (T-t)}{1+xt} -\displaystyle \frac{\sin (T-t)}{\varGamma (1.2)}\Big [2(1-x)^{0.2} -5(1-x)^{1.2}\Big ], \nonumber \\ f= & {} -x^2(1-x)\cos (T-t) -\displaystyle \frac{(1+xt)\sin (T-t)}{100\varGamma (1.2)}\Big [2x^{0.2} -5x^{1.2}\Big ] \nonumber \\&-\max \Big \{ \displaystyle \frac{100 x(1-x)^2 \sin (T-t)}{1+xt},1\Big \}, \nonumber \\ d= & {} \displaystyle \frac{1+xt}{100}. \end{aligned}$$
(31)

The true solution to the problem is

$$\begin{aligned} y= & {} x^2(1-x)\sin (T-t) \nonumber \\ p= & {} -x(1-x)^2\sin (T-t) \nonumber \\ u= & {} \max \Big \{ \displaystyle \frac{100 x(1-x)^2 \sin (T-t)}{1+xt},1 \Big \}. \end{aligned}$$
(32)

This example exhibits even stronger nonsymmetry, as the governing fractional diffusion equations are one-sided: the state equation is a left-sided fractional diffusion equation, while the costate equation is a right-sided one. We solve the Meerschaert schemes (18) and (23) and the CN-WSGD schemes (24) and (25) by Gaussian elimination, the CGS method, and the PCGS method. We present the corresponding results in Tables 7, 8 and 9, respectively.

We observe similar convergence behavior to that in Sect. 5.2, except that the CGS solver diverged even for the CN-WSGD scheme at \(M=N=2^{10}\). In contrast, the PCGS method still converged for both the Meerschaert scheme and the CN-WSGD scheme. This again shows the utility of the PCGS method.