Abstract
Optimal control problems governed by a fractional diffusion equation tends to provide a better description than one by a classical second-order Fickian diffusion equation in the context of transport or conduction processes in heterogeneous media. However, the fractional control problem introduces significantly increased computational complexity and storage requirement than the corresponding classical control problem, due to the nonlocal nature of fractional differential operators. We develop a fast gradient projection method for a pointwise constrained optimal control problem governed by a time-dependent space-fractional diffusion equation, which requires the computational cost from \(O(M N^3)\) of a conventional solver to \(O(M N\log N)\) and memory requirement from \(O(N^2)\) to O(N) for a problem of size N and of M time steps. Numerical experiments show the utility of the method.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
It is well known that numerical approximation of optimal control problems has long been an important topic in engineering design work. There has been extensive researches on developing fast numerical algorithms for these optimal control problems. Some recent progress in this area can be found in [5, 8, 12–17, 21–23, 25, 29] and the references cited therein.
In this paper we consider the model of a pointwise-contrained time-dependent optimal control problem imposed on the space interval (0, 1) over the time period [0, T]: Find a control \(u \in K := \{ v \in L^2(0,T;(0,1)), v \ge u_0\}\) with a prescribed function \(u_0(x,t)\) to minimize the objective functional
which is governed by the initial-boundary value problem of a diffusion equation for the state variable y(x, t)
Here \(\tilde{y}(x,t)\) represents a prescribed or observed value of the state variable y, f(x, t) is a prescribed source and sink term, u is the control variable, and L represents a diffusion operator.
This type of problems occurs widely in engineering design and applications. For instance, in the temperature control in a thermo conduction process, y represents the temperature in a medium and u the heating or cooling source. The goal of the control problem is to optimize the input of the heating or cooling source u such that the temperature distribution u is as close as to the prescribed temperature distribution \(\tilde{y}\). Due to the limitation of the surrounding environments or equipments, the input control u might be subject to certain constraint which leads to a constrained control problem such as that with a pointwise obstacle constraint as described by the admissible set K. Similar models also occur in the optimal control in mass diffusive transport process in porous media and in the photocopier and the laser printer in the modern office that rely on the transport of electrons in amorphous semiconductors in an electric field. The resulting optimality conditions from these models consist of non-smooth nonlinear systems of partial differential equations.
The canonical optimal control problems of this type, in which the differential operator L is a classical second-order Fickian diffusion operator, have been studied extensively in the literature and many efficient and reliable methods have been developed [5, 12–15, 17, 21, 25]. There exist at least three general classes of computational approaches to solving the resulting nonlinear systems of partial differential equations derived from the optimality conditions. The primal-dual active set method employs the primal-dual active set strategy based on the augmented Lagrangian method in [12, 13]. This method has widely been used in solving numerical solutions of different kinds of optimal control in the literature. However in some cases, e.g. when the Lagrange multipliers are irregular, the active sets are not easy to be computed numerically. The second class is to apply the semi-Newton algorithm to solve the resulting optimality conditions, which involves computations of second derivatives, see, e.g., [14, 15]. There exists an extensive body of research on the above two methods in the literature, and it is even impossible to give a brief review here. The third kind is the class of gradient projection methods that use a line search method to minimize the objective functional, guided by the gradient whose computation needs repeatedly solving the state and the co-state equations. This class of methods has been found robust in practice [17], but its effectiveness relies on very efficient solvers for the state and the co-state equations.
In the last few decades an increasingly more number of evidence suggests that the classical second-order Fickian diffusion equation or heat conduction equation does not necessarily provides a proper description of the diffusion or heat conduction process in a heterogeneous medium. A microscopic explanation is as follows: In a normal diffusion process modeled by a classical second-order Fickian diffusion or heat conduction equation, a particle’s motion in the underlying transport or conduction process has virtually no spatial correlation and long walks in the same direction are rare, so the variance of a particle excursion distance is finite. The central limit theorem ensures that the probability of finding a particle somewhere in space satisfies a normal distribution, which gives a probabilistic description of a normal diffusion. However, in a diffusion or heat conduction process in a heterogeneous medium, a particle’s motion is better described by steps that are not independent and that can take vastly different times to perform. A particle’s long walk may have long correlation length so the processes can have large deviations from the stochastic process of Brownian motion or even have an infinite variance. Thus, the assumptions of (i) the existence of a mean free path and (ii) the existence of a mean time taken to perform a step, which were made by Einstein in his derivation of diffusion equation or Brownian motion [9], are violated. As a matter of fact, it was shown [3, 20] that the heavy tail behavior of the transport processes in heterogeneous media can be described accurately by Levy distribution, which can be viewed as a probability description of fractional diffusion equations. This is the fundamental reason why fractional diffusion or heat conduction equations provide a better description of transport processes in heterogeneous media than classical Fickian diffusion or heat conduction equations do.
Motivated by these facts, in this paper we consider the optimal control problem of (1)–(2) except that L is a fractional diffusion operator given by
where d(x, t) is the diffusivity coefficient with positive lower and upper bounds, \(0 \le r \le 1\) indicates the relative weight of forward versus backward transition probability, \(2-\beta \) with \(0 < \beta <1\) is the order of the anomalous diffusion in the governing equation. The left and right fractional integrals are defined by Podlubny [24]
For any positive integer m, the left and right Riemann–Liouville fractional derivatives of order \(m-\beta \) are defined by
Because of the improved modeling capability of the fractional differential equations, there has been increasingly more research on both mathematical analysis and numerical approximation of the fractional control problems [1, 7, 10]. However, because of the nonlocal nature of fractional differential operators, the corresponding numerical methods for solving the fractional state and costate equations generate full stiffness matrices. Consequently, the resulting computational complexity is of \(O(N^3)\) and the memory requirement is of \(O(N^2)\) for a problem of size N, which is in sharp contrast to the computational work and memory requirement of O(N) for numerical methods for second-order Fickian diffusion equations. Recall that the fractional state and costate equations have to be iteratively solved for many times. Hence, the computational bottleneck becomes a really severe burden. To our best knowledge, there is no work in the literature that has addressed the significantly increased computational cost issue.
The goal of this paper is to develop a fast gradient projection method for the fractional optimal control problem (1)–(3), which significantly reduces the computational cost of the solution of the fractional state and costate equations to \(O(M N \log N)\) from \(O(M N^3)\), with M being the number of time steps, per iterative solve of the two equations, and the memory requirement to O(N) from \(O(N^2)\). Numerical experiments justify these. The rest of the paper is organized as follows. In Sect. 2 we derive the optimality condition for the fractional optimal control problem. In Sect. 3 we present a fractional gradient projection method. In Sect. 4 we present a fast and faithful solution method. In Sect. 5 we carry out numerical experiments to validate the fast gradient projection method developed.
2 The Optimality Condition of the Fractional Control Problem
We frequently use the following properties in this section. The left and right Riemann–Liouville fractional integral operators are adjoints in the \(L^2\) sense [24, 26]:
For any v, w with \(v(0)=0\) and \(w(1)=0\), the fractional integral operator and the classical first-order differential operators commute [33]:
To derive the adjoint costate equation for the costate function we rewrite the state equation (2) as
We then multiply the equation by any test function p with appropriately specified boundary condition and terminal conditions [as in (7)] and utilize the adjoint property (4) to obtain the following terminal-boundary value problem of the adjoint costate equation for the costate function p(x, t)
To derive the optimality condition for (1), we calculate the Gâteaux derivative of the objective functional J(u). Let \(u \in K\) be the minimizer of the objective functional J(u). Let \(v \in K\) be any admissible function and \(\delta u = v-u\). For any arbitrarily small positive number \(\varepsilon \), we note that
exists, provided that y is differentiable with respect to u. Consequently,
as \(\varepsilon \) tends to zero. Therefore, we expand \(J(u+\varepsilon \delta u)\) as follows
We accordingly find that
In the last step we have replaced \(y-\tilde{y}\) in the integrand of the first integral by the left-hand side of the costate equation (7).
We see from (8) that \(\delta y\) satisfies the homogeneous initial and boundary conditions. We further utilize the fact that \(p(x,T) = 0\) and integrate the first integrand in the first integral on the right-hand side with respect to time by parts to get
We intend to utilize the facts that \(\delta y\) and p both vanish at the boundary to integrate the second integrand in the first integral on the right-hand side with respect to x by parts in a similar fashion. However because the fractional integral operators and the classical differential operators do not commute in general, we have to take extra care by utilizing (4) and (5) repeatedly to obtain
We incorporate (10) and (11) into (9) to obtain
We observe from (6) and (8) that
Recall that u is the minimizer of the objective functional \(J(\cdot )\) and \(\delta u = v-u\), the fact that \(J'(u)\delta u \ge 0\) leads to the following variational inequality
for all \(v \in K\). This variational inequality has the following closed form solution [17]
As a matter of fact, if \(-d^{-1}p > u_0\) on the entire interval (0, 1), then (13) implies \(u=-d^{-1}p\) and so \(d^{-1}p+u=0\) on the entire interval (0, 1). The variational inequality (12) becomes identically zero for all \(v \in K\). If \(-d^{-1} p \le u_0\) on the entire interval (0, 1), (13) implies \(u=u_0\). Thus, \(d^{-1}p + u = d^{-1}p + u_0 \ge 0\). Since \(v-u = v-u_0 \ge 0\) for all \(v\in K\), the variational inequality (12) holds.
In the general case, the maximum in (13) may be reached by either on different subdomains of \([0,1] \times [0,T]\). Suppose that \(u = -d^{-1}p\) on \(\varOmega _1\) and \(u = u_0\) on \(\varOmega _2\) with \(\varOmega _1 \cup \varOmega _2 = [0,1] \times [0,T]\). It follows that the integral \(\displaystyle \int _{\varOmega _1} (d^{-1}p+u) (v-u) dxdt = 0\). Hence, the integral in (12) reduces to an integral on \(\varOmega _2\), on which \(u=u_0\). Then the same argument in the preceding paragraph shows that this integral remains nonnegative for all \(v \in K\). Therefore, (12) still holds.
3 A Fractional Gradient Projection Method
The optimality condition of the constrained optimal control problem (1) and (2) reduces the problem to the solution of a coupled system of the state equation (6), the costate equation (7), and the optimal control condition (13). A commonly used method for the efficient solution of the system is the following sequentially decoupled gradient projection method [17]:
-
(i)
Pick up an guess of the control \(u^0\).
For \(n=0,1,\ldots \), do
-
(ii)
With the known control \(u^n\), solve the state equation forward in time for the state variable \(y^n\)
$$\begin{aligned} \displaystyle d^{-1}\displaystyle \frac{\partial {y^n}}{\partial {t}} - \bigg ( r\; {}_0D_x^{2-\beta } + (1-r)\; {}_xD_1^{2-\beta } \bigg ) y^n = d^{-1} \bigg ( f + u^n \bigg ). \end{aligned}$$ -
(iii)
With the known state variable \(y^n\), solve the costate equation backward in time for the costate variable \(p^n\)
$$\begin{aligned} \displaystyle -\displaystyle \frac{\partial {(d^{-1}p^n)}}{\partial {t}} - \bigg ( (1-r)\; {}_0D_x^{2-\beta } + r\; {}_xD_1^{2-\beta } \bigg ) p^n = y^n-\tilde{y}. \end{aligned}$$ -
(iv)
Update the optimal control
$$\begin{aligned} \displaystyle u^{n+1} = \max \big \{u_0,-d^{-1} p^n\big \}. \end{aligned}$$ -
(v)
If \(\Vert u^{n+1}-u^n\Vert _{\infty }\) is within the tolerance, then stop
end do
It is clear that in the gradient projection method, the major computational cost comes from the repeated solution of the state equation (6) and the costate equation (7). In the rest of this section, we present two finite difference methods for these equations and discuss related computational issues.
Let M and N be positive integers. Define the time step size \(\tau := T/M\) and the spatial mesh size \(h := 1/(N+1)\), respectively. We define the temporal partition \(t^k := k \tau \) for \(k=0,1,\ldots ,M\) and the spatial partition \(x_i := i h\) for \(i =0, 1,\ldots ,N+1\). Let \(y_i^k, p_i^k, u_i^k\) be the finite difference approximations to \(y(x_i,t^k)\), \(p(x_i,t^k)\), \(u(x_i,t^k)\), respectively, and denote \((d^{-1})_i^k := d^{-1}(x_i,t^k)\), \(f_i^k := f(x_i,t^k)\), \(\tilde{y}_i^k := \tilde{y}(x_i,t^k)\). Then the Meerschaert finite difference scheme for the state equation can be formulated as follows [19]: For \(k=1,2,\ldots ,M\), solve the following equation forward in time to find \(\{ y^k_i\}_{i=1}^N\) such that
Here the fractional binomial coefficients \(g_{j}^{(\alpha )}\) can be evaluated recursively
We can similarly formulate the Meerschaert finite difference scheme for the costate equation: For \(k=M, M-1,\ldots ,1\), solve the following equation backward in time to find \(\{ p^k_i\}_{i=1}^N\) such that
Because a shifted Grünwald approximation was used, the Meerschaert scheme has only first-order accuracy in both space and time [19].
Second-order finite difference methods were developed to improve the computational efficiency. A Crank-Nicolson temporal discretization plus a Richardson extrapolation of the spatial discretization in (14) results in a scheme that has second-order accuracy in both space and time [18]. In this paper we use an alternative second-order scheme, the CN-WSGD scheme, developed in [28]: For \(k=1,2,\ldots ,M\), solve the following equation forward in time to find \(\{ y^k_i\}_{i=1}^N\) such that
Here the superscript \(k-\frac{1}{2}\) denotes the mean value of the function on k and \(k-1\) levels and \(w_{j}^{(\alpha )}\) can be evaluated by
We can similarly formulate the CN-WSGD scheme for the costate equation as follows: For \(k=M, M-1,\ldots ,1\), solve the following equation backward in time to find \(\{ p^k_i\}_{i=1}^N\) such that
The nonlocal property of fractional differential operators results in a significant increase in the already expensive computational and storage cost in the gradient projection method for the constrained optimal control problem. Despite that the optimal control constrained by space-fractional diffusion equations provides a more appropriate description than its analogue constrained by second-order diffusion equations, the significantly increased computational and storage cost poses a severe computational challenge to the new model. More precisely, within each outer loop in the gradient projection method, the computational cost of the space-fractional state and costate equations is of order \(O(M N^3)\), which is a very sharp increase of the computational cost of second-order state and costate equations that has a computational cost of O(M N). This significantly increased computational cost gets even worse as the state and costate equations have to be solved iteratively in the gradient projection method. Therefore, the development of a fast and faithful numerical method is of fundamental importance.
4 A Fast and Faithful Solution Method
The goal of this section is to develop a fast and faithful gradient projection method for the optimal control problem (1) and (2). As the major computational complexity of the gradient projection method comes from the numerical solution of the state and costate equations, we focus on the development of fast and faithful solution methods for these equations.
4.1 Fast Implementation of the Finite Difference Methods for the State and Costate Equations
Let \(\mathbf {y}^k := (y_1^k,y_2^k,\ldots ,y_N^k)^T\), and \(\mathbf {y}^{k-1}\), \(\mathbf {f}^k\), \(\mathbf {u}^k\) be introduced in a similar fashion. Then the Meerschaert scheme for the state equation (14) can be expressed in a matrix form [19]
where \(\gamma ={\tau }/{h^{2-\beta }}\), \(\mathbf {D}^k={\mathrm {diag}}((d^{-1})_i^k)_{i=1}^N\), and \(\mathbf {A}\) is a full stiffness matrix. Traditionally, the linear system (18) has been solved numerically by Gaussian elimination [19], which requires computational work of \(O(N^3)\) per time step. It was reported in [32] that the stiffness matrix \(\mathbf {A}\) can be decomposed as
where \(\mathbf {G}\) is a Toeplitz matrix of the form
with \(g_i := g_i^{(2-\beta )}\).
Fast finite difference methods were developed by exploring the decomposition [31, 32] which reduce the computational cost from \(O(N^3)\) at each time step to \(O(N \log N)\) per inner iteration. It is known that an \(N \times N\) Toeplitz matrix \(\mathbf {T}=(t_{i-j})_{0\le i,j\le N}\) can be embedded into a \(2N \times 2N\) circulant matrix \(\mathbf {C}_{2N}\) [6, 11]
The circulant matrix \(\mathbf {C}_{2N}\) can be decomposed as
where \(\mathbf {c}\) is the first column vector of \(\mathbf {C}_{2N}\) and \(\mathbf {F}_{2N}\) is the \(2N \times 2N\) discrete Fourier transform matrix. It is well known that the matrix-vector multiplication \(\mathbf {F}_{2N}\mathbf {v}_{2N}\) for \(\mathbf {v}_{2N} \in R^{2N}\) can be carried out in \(O(2N \log (2N)) = O(N \log N)\) operations via the fast Fourier transform (FFT). Therefore, \(\mathbf {C}_{2N} \mathbf {v}_{2N}\) can be evaluated in \(O(N \log N)\) operations by (22).
The same idea can be applied to other schemes in the previous section. The Meerschaert scheme (15) for the costate equation can be expressed in the matrix form
It was shown in [30, 32] that the matrix \(\mathbf {G}\) is diagonal-dominant and positive-definite for \(0< \beta <1\), and so does \(\mathbf {A}\). Clearly the matrix \(\mathbf {A}\) is Toeplitz, and the coefficient matrices \(\mathbf {D}^k+\gamma \mathbf {A}\), \(\mathbf {D}^{k-1}+\gamma \mathbf {A}^T\) are of diagonal-plus-Toeplitz type. Thus, the memory requirements to store them are O(N), instead of a traditional storage for them that uses \(O(N^2)\) of storage. Furthermore, if \(r = 1/2\), \(\mathbf {A}\) is symmetric and positive-definite, so the conjugate gradient can be used to solve the linear systems (18) and (23) iteratively. For \(r \ne 1/2\), \(\mathbf {A}\) is nonsymmetric, the conjugate gradient squared type of methods can be employed to solve the systems.
Similarly, the CN-WSGD schemes (16) and (17) can be expressed in the matrix forms
and
Here \(\mathbf {B}= r \bar{\mathbf {G}} + (1-r)\bar{\mathbf {G}}^T\) and \(\bar{\mathbf {G}}\) is also given by the expression in (20) with \(g_i\) being replaced by \(w_i^{(2-\beta )}\). \(\mathbf {D}^{k-\frac{1}{2}}\), \(\mathbf {f}^{k-\frac{1}{2}}\), \(\mathbf {u}^{k-\frac{1}{2}}\), \(\mathbf {y}^{k-\frac{1}{2}}\), and \(\tilde{\mathbf {y}}^{k-\frac{1}{2}}\) are their corresponding average at time step \(t^{k-1}\) and \(t^k\). The matrix \(\bar{\mathbf {G}}\) was shown to be positive definite [28], and so is \(\mathbf {B}\). Hence, the fast method described earlier in the subsection still applies.
4.2 A Fast Preconditioned Conjugate Gradient Algorithm
Although the fast Krylov subspace iterative solvers described in the previous subsection significantly reduced the computational cost to \(O(N \log N)\) per iteration, large number of iterations might be needed if the condition number of the coefficient matrix is large. In addition, the round off errors accumulated due to large number of iterations might deteriorate the convergence behavior of the iterative solvers [30]. This motivates us to use preconditioned Krylov subspace solvers.
For \(r = 1/2\), the coefficient matrices of the systems (18), (23), (24), and (25) are of diagonal-plus-symmetric and positive-definite Toeplitz type. We modify the Strang preconditioner as follows [27]: For the Toeplitz matrix \(\mathbf {T}\) given in the previous subsection, the diagonals \(s_j\) of the Strang circulant preconditioner \(\mathbf {S}(\mathbf {T})=(s_{i-j})_{0\le i,j\le N}\) are given by
Then, the diagonal-plus-symmetric and positive-definite Toeplitz coefficient matrices in the systems in (18), (23), (24), and (25) are preconditioned by the following circulant preconditioner
where \(D^k=\frac{1}{N}\sum \nolimits _{i=1}^N (d^{-1})_i^k\) is the mean value of the diagonal elements of \(\mathbf {D}^k\), \(\mathbf {I}\) is the identity matrix of order N, and \(\mathbf {T}= \mathbf {A}\) for the systems (18) and (23) or \(\mathbf {T}= \mathbf {B}\) for (24) and (25). We use the system (18) as an example to formulate the fast preconditioned conjugate gradient method as follows [2]:
In the above algorithm, all steps need O(N) operations except for two matrix-vector multiplications \(\mathbf {A}\mathbf {y}^{(0)}\), \(\mathbf {A}\mathbf {p}^{(i-1)}\), and the solution of the system \(\mathbf {M}^k \mathbf {q}^{(i)} = \mathbf {r}^{(i)}\). The matrix-vector multiplications can be carried out in \(O(N \log N)\) operations as shown in Sect. 4.1. The preconditioner \(\mathbf {M}^k\) is an \(N\times N\) circulant matrix and so can be decomposed similarly as (22)
where \(\mathbf {s}\) is the first column vector of \(\mathbf {M}^k\) and \(\mathbf {F}_{N}\) is the \(N \times N\) discrete Fourier transform matrix. It can be inverted in \(O(N \log N)\) operations.
4.3 A Fast Preconditioned Conjugate Gradient Squared Algorithm
For \(r \ne 1/2\), the stiffness matrices \(\mathbf {A}\) and \(\mathbf {B}\) in the systems (18), (23), (24), and (25) are positive-definite but nonsymmatric Toeplitz matrices. We again use the Meerschaert scheme (18) as an example to demonstrate the idea. We still use a circulant preconditioner of the structure \(\mathbf {M}^k = D^k\mathbf {I}+\gamma \mathbf {S}(\mathbf {A})\). However, note that \(\mathbf {S}(\mathbf {A})\) is different from that of the symmetric case. Of course, we can also replace \(\mathbf {S}(\mathbf {A})\) by a T. Chan preconditioner [4]. Then a fast preconditioned conjugate gradient squared algorithm for (18) can be formulated as follows [2]:
5 Numerical Experiments
In this section we carry out numerical experiments to investigate the performance of the fast gradient projection method. These methods were implemented using Compaq Visual Fortran 6.6 on a ThinkPad T410 Laptop.
5.1 An Optimal Control Problem with \(r = 0.5\)
We consider the optimal control problem (1) and (2) with the following data
The analytical solution of this problem is given by
Since \(r = 0.5\), the coefficient matrices for the states equations are symmetric. We solve the Meerschaert schemes (18) and (23) as well as the CN-WSGD systems (24) and (25) by Gaussian elimination (Gauss), the fast conjugate gradient (CG) method, and the fast preconditioned conjugate gradient (PCG) method. We present the \(l^2\) errors of the approximations to y, p, and u, and the convergence rates of the PCG scheme in Tables 1 and 2, respectively. Since Gauss elimination consumes too much CPU time, we skipped the numerical solution by Gauss elimination for the finest mesh \(N = 2^{10}\). We present the consumed CPU times of the two schemes in Table 3.
We observe from Tables 1 and 2 that Gauss, the CG and the PCG solvers generate almost identical numerical solutions. Thus, we only compute the convergence rates of the PCG solutions as an indicator. The fast gradient projection method, as an iterated method for the coupled nonlinear system, retains the first order convergence rate of Meerschaert scheme and the second order convergence rate of the CN-WSGD scheme, respectively. These results, together with the results in Table 3, show the utility of the fast preconditioned conjugate gradient method.
5.2 An Optimal Control Problem with \(r = 0.8\)
We consider the same example as in Sect. 5.1 except that \(r = 0.8\) and
The true solution of this problems is the same as the one in Sect. 5.1.
As \(r = 0.8\), the stiffness matrices of the numerical schemes are nonsymmetric, so we solve the Meerschaert schemes (18) and (23) and the CN-WSGD schemes (24) and (25) by Gauss elimination, the fast conjugate gradient squared (CGS) method, and the fast preconditioned conjugate gradient squared (PCGS) method. We present the corresponding results in Tables 4, 5 and 6, respectively.
We observe roughly the same convergence behavior as those in Sect. 5.1. One important exception is that for the Meerschaert scheme with the finest mesh \(M = N = 2^{10}\) the CGS solver diverged. As a matter of fact, the CGS solver does not converge within 4 hours of CPU time. In contrast, the PCGS solver still converged in 6 minutes and 15 seconds of CPU time. In other words, the PCGS solver not only reduced the CPU time of the CGS solver, but also improved its convergence behavior. Furthermore, the CGS solver converged for the CN-WSGD scheme at \(M = N = 2^{10}\). This showed an additional computational benefit of the second-order CN-WSGD scheme over the first-order Meerschaert scheme.
5.3 An Optimal Control Problem with \(r = 1\)
We consider the optimal control problem (1) and (2) with the following data
The true solution to the problem is
This example has even stronger nonsymmetry, as the containing fractional diffusion equations are one-sided equations. Namely, the state equation is a left-sided fractional diffusion equation while the costate equation is a right-sided equation. We solve the Meerschaert schemes (18) and (23) and the CN-WSGD schemes (24) and (25) by Gauss elimination, the CGS method, and the PCGS method. We present the corresponding results in Tables 7, 8 and 9, respectively.
We observe similar convergence behavior as in Sect. 5.2 except that the CGS solver diverged even for the CN-WSGD scheme at \(M=N=2^{10}\). In contrast, the PCGS method still converges for both the Meerschaert scheme and the CN-WSGD scheme. This again shows the utility of the PCGS method.
References
Agrawal, O.: A general formulation and solution scheme for fractional optimal control problems. Nonlinear Dyn. 38, 323–337 (2004)
Barrett, R., Berry, M., Chan, T.F., Demmel, J., Donato, J.M., Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., Van der Vorst, H.: Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia (1994)
Benson, D., Wheatcraft, S.W., Meerschaert, M.M.: The fractional-order governing equation of Lévy motion. Water Resour. Res. 36, 1413–1423 (2000)
Chan, T.: An optimal circulant preconditioner for Toeplitz systems. SIAM J. Sci. Stat. Comput. 9, 766–771 (1988)
Chen, Y., Lu, Z.: Error estimates of fully discrete mixed finite element methods for semilinear quadratic parabolic optimal control problem. Comput. Method. Appl. Mech. Eng. 199(23–24), 1415–1423 (2010)
Davis, P.J.: Circulant Matrices. American Mathematical Society, Province (1979)
Dorville, R., Mophou, G.M., Valmorin, V.S.: Optimal control of a nonhomogeneous Dirichlet boundary fractional diffusion equation. Comput. Math. Appl. 62(3), 1472–1481 (2011)
Du, N., Ge, L., Liu, W.: Adaptive finite element approximation for an elliptic optimal control problem with both pointwise and integral control constraints. J. Sci. Comput. 60(1), 160–183 (2014)
Einstein, A.E.: Investigations on the Theory of the Brownian Movement, Translation. Dover, Mineola (1956)
Frederico, G., Torres, D.: Fractional optimal control in the sense of caputo and the fractional Noethers theorem. Int. Math. Forum 3, 479–493 (2008)
Gray, R.M.: Toeplitz and Circulant Matrices: A Review. Now Publishers Inc, Hanover (2006)
Ito, K., Kunisch, K.: Augmented Lagrangian methods for nonsmooth convex optimization in Hilbert spaces. Nonlinear Anal. 41, 573–589 (2000)
Ito, K., Kunisch, K.: The primal-dual active set method for nonlinear optimal control problems with bilateral constraints. SIAM J. Control Optim. 43, 357–376 (2004)
Ito, K., Kunisch, K.: Semismooth Newton methods for time-optimal control for a class of ODEs. SIAM J. Control Optim. 48, 3997–4013 (2010)
Ito, K., Kunisch, K.: Minimal effort problems and their treatment by semismooth Newton methods. SIAM J. Control Optim. 49, 2083–2100 (2011)
Li, R., Liu, W., Ma, H., Tang, T.: Adaptive finite element approximation for distributed elliptic optimal control problems. SIAM J. Control Optim. 41(5), 1321–1349 (2002)
Liu, W., Yan, N.: Adaptive Finite Element Methods for Optimal Control Governed by PDEs: C Series in Information and Computational Science 41. Science Press, Beijing (2008)
Meerschaert, M.M., Scheffler, H.P., Tadjeran, C.: Finite difference methods for two-dimensional fractional dispersion equation. J. Comput. Phys. 211, 249–261 (2006)
Meerschaert, M.M., Tadjeran, C.: Finite difference approximations for two-sided space-fractional partial differential equations. Appl. Numer. Math. 56, 80–90 (2006)
Metzler, R., Klafter, J.: The random walk’s guide to anomalous diffusion: a fractional dynamics approach. Phys. Rep. 339, 1–77 (2000)
Neittaanmaki, P., Tiba, D., Dekker, M.: Optimal Control of Nonlinear Parabolic Systems: Theory: Algorithms and Applications. Marcel Dekker, New York (1994)
Niu, H., Yang, D.: Finite element analysis of optimal control problem governed by Stokes equations with \(L^2\)-norm state-constraints. J. Comput. Math. 29, 589–604 (2011)
Pironneau, O.: Optimal Shape Design for Elliptic Systems. Springer, Berlin (1984)
Podlubny, I.: Fractional Differential Equations. Academic Press, New York (1999)
Roos, H., Reibiger, C.: Numerical analysis of a system of singularly perturbed convection–diffusion equations related to optimal control. Numer. Math. Theor. Meth. Appl. 4, 562–575 (2011)
Samko, S., Kilbas, A., Marichev, O.: Fractional Integrals and Derivatives: Theory and Applications. Gordon and Breach, London (1993)
Strang, G.: A proposal for Toeplitz matrix calculations. Stud. Appl. Math. 74(2), 171–176 (1986)
Tian W., Zhou H., Deng W.: A class of second order difference approximations for solving space fractional diffusion equations. arXiv:1201.5949 [math.NA]
Vallejos, M.: Multigrid methods for elliptic optimal control problems with pointwise state constraints. Numer. Math. Theor. Meth. Appl. 5, 99–109 (2012)
Wang, H., Du, N.: A superfast-preconditioned iterative method for steady-state space-fractional diffusion equations. J. Comput. Phys. 240, 49–57 (2013)
Wang, H., Du, N.: A fast finite difference method for three-dimensional time-dependent space-fractional diffusion equations and its efficient implementation. J. Comput. Phys. 253, 50–63 (2013)
Wang, H., Wang, K., Sircar, T.: A direct \(O(N log^2 N)\) finite difference method for fractional diffusion equations. J. Comput. Phys. 229, 8095–8104 (2010)
Wang, H., Yang, D.: Wellposedness of variable-coefficient conservative fractional elliptic differential equations. SIAM J. Numer. Anal. 51, 1088–1107 (2013)
Author information
Authors and Affiliations
Corresponding author
Additional information
This work was supported by the National Natural Science Foundation of China under Grants 11371229, 91130010, and 11471194, by the National Science Foundation under Grants EAR-0934747 and DMS-1216923 and, by the China Scholarship Council (File No. 201308370102).
Rights and permissions
About this article
Cite this article
Du, N., Wang, H. & Liu, W. A Fast Gradient Projection Method for a Constrained Fractional Optimal Control. J Sci Comput 68, 1–20 (2016). https://doi.org/10.1007/s10915-015-0125-1
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10915-015-0125-1