Abstract
In this paper we consider the convergence analysis of adaptive finite element method for elliptic optimal control problems with pointwise control constraints. We use variational discretization concept to discretize the control variable and piecewise linear and continuous finite elements to approximate the state variable. Based on the well-established convergence theory of AFEM for elliptic boundary value problems, we rigorously prove the convergence and quasi-optimality of AFEM for optimal control problems with respect to the state and adjoint state variables, by using the so-called perturbation argument. Numerical experiments confirm our theoretical analysis.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Adaptive finite element method (AFEM for short), contributed to the pioneer work of Babuška and Rheinboldt [2], becomes nowadays a popular approach in the community of engineering and scientific computing. It aims at distributing more mesh nodes around the area where the singularities happen to save the computational cost. Various types of reliable and efficient a posteriori error estimators, which are used to detect the location of singularity and essential for the success of AFEM, have been developed in the last decades for different kind of problems, we refer to [36] for an overview.
Although AFEM has been successfully applied for more than three decades, the convergence analysis is rather recent which started with Dörfler [13] and was further studied in [6, 7, 31–33]. Besides convergence, optimality is another important issue in AFEM which was firstly addressed by Binev et al. [6] and further studied by Stevenson [34, 35]. The so-called Dörfler’s marking proposed in [13] and quasi-error introduced in [7] consisting of the sum of the energy error and the scaled estimator are crucial to prove the contraction of the errors and quasi-optimal cardinality of the standard AFEM which avoids marking for oscillation [13] and circumvents the interior node property of mesh refinement [32, 33].
AFEM also finds successful application in optimal control problems governed by partial differential equations, starting from Liu, Yan [26] and Becker, Kapp, Rannacher [3]. In [3] the authors proposed a dual-weighted goal-oriented adaptivity for optimal control problems while in [26] residual type a posteriori error estimates were derived. We refer to [17, 18, 24, 27–30] for more details of recent advance. Recently, Kohls, Rösch and Siebert derived in [22] an error equivalence property which enables one to derive reliable and efficient a posteriori error estimators for optimal control problems with either variational discretization or full control discretization.
There also exist some attempts to prove the convergence of AFEM for optimal control problems. In [14] the authors considered the piecewise constant approximation of the control variable and gave an error reduction property for the quadruplet \((u,y,p,\sigma )\), where u, y, p denote the optimal control, state, and adjoint state variables and \(\sigma \) the associated co-control variable. However, additional requirements on the strict complementarity of the continuous problem and non-degeneracy property of the discrete control problem are assumed and the marking strategy is extended to include the discrete free boundary between the active and inactive control sets. In [4] the authors viewed the control problems as a nonlinear elliptic system of the state and adjoint variables and gave a convergence proof for the adaptive algorithm involving the marking of data oscillation. In [23] the authors proved that the sequence of adaptively generated discrete solutions converged to the true solutions for optimal control problems, but obtained only the plain convergence of the adaptive algorithm without convergence rate and optimality. In this paper we intend to give a rigorous convergence proof for the adaptive finite element algorithm of elliptic optimal control problem in an optimal control framework. Compared to [4], the AFEM adopted in current paper uses Dörfler’s marking [13] and is a standard algorithm in that it employs only the error indicators and does not use the oscillation indicators. Moreover, for the convergence analysis of AFEM we use the standard convergence results of AFEM for elliptic boundary value problems so that the proof is more clear and rigorous.
Inspired by the work [11] of Dai, Xu and Zhou where the convergence and optimality of AFEM for elliptic eigenvalue problem are proved by exploiting a certain relationship between the finite element eigenvalue approximation and the associated finite element boundary value approximation, in this paper we will provide a rigorous convergence analysis of the adaptive finite element algorithm for optimal control problems governed by a linear elliptic equation. Under mild assumptions on the initial mesh from which the adaptive algorithm starts, we show that the energy norm errors of the state and adjoint state variables are equivalent to the boundary value approximations of the state and adjoint state equations up to a higher order term. Then based on the well-known convergence result of AFEM for elliptic boundary value problems, we are able to prove the convergence of AFEM for the optimal control problems (OCPs for short). To be more specific, the AFEM for OCPs is a contraction for the sum of the energy errors and the scaled error estimators of the state y and the adjoint state p, between two consecutive adaptive loops. We also show that the AFEM yields a decay rate of the energy errors of the state y and the adjoint state p plus oscillations of the state and adjoint state equations in terms of the number of degrees of freedom. This result is an improvement over the plain convergence result presented in [23].
We remark that we study the AFEM for OCPs under energy norm errors for the state and adjoint state. Compared to a priori error estimates for optimal control problems [20], it seems to be more suitable to work with \(L^2\)-norm errors for the control, the state and the adjoint state, including a posteriori error estimates and the convergence analysis. However, the motivation to study AFEM in energy norm in this paper is two folds. Firstly, up to now almost all AFEMs with guaranteed convergence are based on energy norm error, the only contribution of convergent AFEM under \(L^2\)-norm to our knowledge is [12] by Demlow and Stevenson, where the convergence result of AFEM under energy norm is used to prove the convergence of AFEM with \(L^2\) norm by establishing certain equivalence property between the \(L^2\)-norm error and the weighted energy norm error. Secondly, to work with \(L^2\)-norm we have to assume \(H^2\)-regularity for the solution of elliptic equation, this excludes the most interesting case where the domain may be non-convex or the coefficient of the elliptic operator may be discontinuous. However, if we have \(H^2\)-regularity for the solution of elliptic equation we can already achieve optimal a priori error estimates for the optimal control problems and AFEM is thus not attractive.
The rest of the paper is organised as follows. In Sect. 2 we recall some well-known results on the convergence analysis of AFEM for elliptic boundary values problems. In Sect. 3 we introduce the finite element approximation of the optimal control problems and derive a posteriori error estimates. We also present Dörfler’s marking strategy and the adaptive algorithm for the optimal control problems. In Sect. 4 we give a rigorous convergence analysis of the AFEM for optimal control problems and the quasi-optimal cardinality is proved in Sect. 5. Numerical experiments are carried out in Sect. 6 to validate our theoretical result. Finally, we give a conclusion in Sect. 7 and outlook the possible extensions and future work.
Let \(\Omega \subset \mathbb {R}^d\) (\(d=2,3\)) be a bounded polygonal or polyhedral domain which is not necessarily convex. We denote by \(W^{m,q}(\Omega )\) the usual Sobolev space of order \(m\geqslant 0\), \(1\leqslant q<\infty \) with norm \(\Vert \cdot \Vert _{m,q,\Omega }\) and seminorm \(|\cdot |_{m,q,\Omega }\). For \(q=2\) we denote \(W^{m,q}(\Omega )\) by \(H^m(\Omega )\) and \(\Vert \cdot \Vert _{m,\Omega }=\Vert \cdot \Vert _{m,2,\Omega }\), which is a Hilbert space. Note that \(H^0(\Omega )=L^2(\Omega )\) and \(H_0^1(\Omega )=\{v\in H^1(\Omega ):\ v=0\ \mathrm{on}\ \partial \Omega \}\). We denote C a generic positive constant which may stand for different values at its different occurrences but does not depend on mesh size. We use the symbol \(A\lesssim B\) to denote \(A\leqslant CB\) for some constant C that is independent of mesh size.
2 Preliminaries
In this section, we recall some well-known results on the adaptive finite element approximation to a linear elliptic boundary value problem, which are then used for the convergence analysis of AFEM for optimal control problems. Some of the results are collected from [7, 11], see also [16].
Consider the following second order elliptic equation
where \(f\in L^2(\Omega )\) and L is a linear second order elliptic operator of the following form:
We denote \(L^*\) the adjoint operator of L
Here \(0\leqslant a_0<\infty \), \(a_{ij}\in W^{1,\infty }(\Omega )\) \((i,j=1,\ldots ,d)\) and \((a_{ij})_{d\times d}\) is symmetric and positive definite. We denote \(A=(a_{ij})_{d\times d}\) and \(A^*\) its adjoint. Let
It is clear that \(a(\cdot ,\cdot )\) is a bounded bilinear form over \(H_0^1(\Omega )\) and defines a norm \(\Vert \cdot \Vert _{a,\Omega }=\sqrt{a(\cdot ,\cdot )}\) which is equivalent to \(\Vert \cdot \Vert _{1,\Omega }\).
The standard weak form of (2.1) reads as follows: Find \(y\in H_0^1(\Omega )\) such that
For each \(f\in H^{-1}(\Omega )\) the above problem admits a unique solution by the well-known Lax–Milgram theorem. Since the elliptic equation (2.2) is linear with respect to the right-hand side f, we can define a linear and bounded solution operator \(S : L^2(\Omega ) \rightarrow H_0^1(\Omega )\) such that \(y = Sf\).
Let \(\mathcal {T}_h\) be a regular triangulation of \(\Omega \) such that \(\bar{\Omega }=\cup _{T\in \mathcal {T}_h}\bar{T}\). We assume that \(\mathcal {T}_h\) is shape regular in the sense that: There exists a constant \(\gamma ^*>0\) such that \(\frac{h_T}{\rho _T}\leqslant \gamma ^{*}\) for all \(T\in \mathcal {T}_h\), where \(h_T\) denotes the diameter of T and \(\rho _T\) is the diameter of the biggest ball contained in T. We set \(h=\max _{T\in \mathcal {T}_h}h_T\). In this paper, we use \(\mathcal {E}_h\) to denote the set of interior faces (edges or sides) of \(\mathcal {T}_h\) and \(\# \mathcal {T}_h\) to denote the number of elements of \(\mathcal {T}_h\).
On \(\mathcal {T}_h\) we construct a family of nested finite element spaces \(V_{h}\) consisting of piecewise linear and continuous polynomials such that \(V_h\subset C(\bar{\Omega })\cap H_0^1(\Omega )\). We define the standard Galerkin projection operator \(\mathcal {R}_h: H_0^1(\Omega )\rightarrow V_h\) by [9]
which satisfies the following stability result
A standard finite element approximation to (2.2) can then be formulated as: Find \(y_h\in V_h\) such that
Similarly, we can define a discrete solution operator \(S_h : L^2(\Omega ) \rightarrow V_h\) such that \(y_h = S_hf\). Thus, we have \(y_h=\mathcal {R}_hy=\mathcal {R}_hSf\).
For the following purpose, we follow the idea of [11] to introduce the quantity \(\kappa (h)\) as follows
We note that the quantity \(\kappa (h)\) is determined by the regularity of Sf which is further induced by properties of the domain \(\Omega \). Indeed, if the boundary of \(\Omega \) is smooth, like \(\mathcal {C}^1\), the additional regularity \(Sf\in H^2(\Omega )\) holds and thus \(\kappa (h)=O(h)\). This is still true for polygonal or polyhedral boundaries if the domain is convex. The regularity is reduced, however, in the vicinity of non-convex portions of polygonal or polyhedral boundaries. Grisvard proved in [15] the precise regularity results (Theorem 2.4.3 for the two-dimensional case and Corollary 2.6.7 for the three-dimensional case): There exists an \(\varepsilon \in (0,{1\over 2}]\), which depends on the shape of the domain, such that \(Sf\in H^{{3\over 2}+\varepsilon }(\Omega )\) for each \(f\in L^2(\Omega )\). Obviously, \(\kappa (h)\ll 1\) for \(h\in (0,h_0)\) if \(h_0\ll 1\).
The following results are standard and can be found in, e.g., [9, 11]
Proposition 2.1
For each \(f\in L^2(\Omega )\), there hold
and
Now we are in the position to review the residual type a posteriori error estimator for the finite element approximation of an elliptic boundary value problem. We define the element residual \({\tilde{r}}_T(y_h)\) and the jump residual \({\tilde{j}}_E(y_h)\) by
where \([A\nabla y_h]_E\cdot n_E\) denotes the jump of \(A\nabla y_h\) across the common side E of elements \(T^+\) and \(T^-\), \(n_E\) denotes the outward normal oriented to \(T^-\). For each element \(T\in \mathcal {T}_h\), we define the local error indicator \({\tilde{ \eta }_{h}}(y_{h},T)\) by
Then on a subset \(\omega \subset \Omega \), we define the error estimator \({\tilde{\eta }}_{h}(y_h,\omega )\) by
Thus, \({\tilde{\eta }}_{h}(y_h,\Omega )\) constitutes the error estimator on \(\Omega \) with respect to \(\mathcal {T}_h\).
For \(f\in L^2(\Omega )\) we also need to define the data oscillation as (see [32, 33])
where \(\bar{f}_T\) denotes the \(L^2\)-projection of f onto piecewise constant space on T. It is easy to see that
For the above defined data oscillation we have the following lemma whose proof can be found in [11, Lemma 2.4].
Lemma 2.2
There exists a constant \(C_*\) which depends on A, the mesh regularity constant \(\gamma ^*\) and the coefficient c, such that
Now we can formulate the following global upper and lower bounds for the a posteriori error estimators of elliptic boundary value problems (see, e.g., [13, 36]):
For our following purpose we also need to study the adjoint equation of the elliptic boundary value problem (2.1). For each \(g\in L^2(\Omega )\), let \(p\in H_{0}^{1}(\Omega )\) be the solution of the following adjoint equation
with its finite element approximation
We can also give the a posteriori global upper and lower error bounds:
To analyze the adaptive finite element approximation for the optimal control problem, we introduce the AFEM for a system of two source problems associated with the state and adjoint state equations, which is some trivial extension for the existing results of the adaptive finite element approximation of the scalar problem (see [7]). Specifically, we introduce the adaptive finite element algorithm to solve a system of elliptic boundary value problems (2.2) and (2.18) based on the error estimators \({\tilde{\eta }}_{h}^{2}(y_{h},\Omega )+{\tilde{\eta }}_h^{2}(p_{h},\Omega )\).
The adaptive finite element procedure consists of the following loop
The ESTIMATE step is based on the a posteriori error estimators derived above, while the step REFINE can be done by using iterative or recursive bisection of elements with the minimal refinement condition (see [34, 36]). Due to [7], the procedure REFINE here is not required to satisfy the interior node property of [32].
There are different kinds of adaptive algorithms which differ from the marking strategies (see [31–33]). Here we apply Dörfler’s marking strategy introduced in [13], which marks only the error estimator \({\tilde{\eta }}_h^{2}(y_{h},\Omega )+{\tilde{\eta }}_h^2(p_h,\Omega )\) to obtain the set of marked elements \({{\mathcal {M}}}_{h}\subset \mathcal {T}_{h}\) and avoids the marking for oscillation (compare Algorithm 3.7).
Then the adaptive algorithm for solving elliptic boundary value problems is also standard, see e.g. [7, Section 2.7], except that we solve instead a system of elliptic boundary value problems (2.2) and (2.18) (compare Algorithm 3.8).
We denote \(\mathbb {T}\) the class of all conforming refinements by bisection of \(\mathcal {T}_{h_0}\) (see [7] for more details). Given a fixed number \(b\geqslant 1\), for any \(\mathcal {T}_{h_{k}}\in \mathbb {T}\) and \(\mathcal {M}_{h_{k}}\subset \mathcal {T}_{h_{k}}\) of marked elements,
outputs a conforming triangulation \(\mathcal {T}_{h_{k+1}}\in \mathbb {T}\), where at least all elements of \(\mathcal {M}_{h_k}\) are bisected b times. We define \(R_{\mathcal {T}_{h_k}\rightarrow \mathcal {T}_{h_{k+1}}}=\mathcal {T}_{h_k}{\backslash }(\mathcal {T}_{h_k}\cap \mathcal {T}_{h_{k+1}})\) as set of refined elements, which satisfies \(\mathcal {M}_{h_k}\subset R_{\mathcal {T}_{h_k}\rightarrow \mathcal {T}_{h_{k+1}}}\).
Then we can formulate the following standard result on the complexity of the refinement, see [7, Lemma 2.3] and [35] for more details.
Lemma 2.3
Assume that \(\mathcal {T}_{h_0}\) verifies condition (b) of Sect. 4 in [35]. Let \(\mathcal {T}_{h_k}\) (\(k\geqslant 0\)) be a sequence of conforming and nested triangulations of \(\Omega \) generated by REFINE starting from the initial mesh \(\mathcal {T}_{h_0}\). Assume that \(\mathcal {T}_{h_{k+1}}\) is generated from \(\mathcal {T}_{h_k}\) by \(\mathcal {T}_{h_{k+1}}=\mathrm{REFINE}(\mathcal {T}_{h_k},\mathcal {M}_{h_k})\) with a subset \(\mathcal {M}_{h_k}\subset \mathcal {T}_{h_k}\). Then there exists a constant \({\hat{C}}_0\) depending on \(\mathcal {T}_{h_0}\) and b such that
We define
The convergence of adaptive algorithm based on Dörfler’s marking strategy is proven in [7] and the techniques are then used extensively for the convergence analysis of AFEM for a different kind of boundary value problems. The following Theorem 2.4, Lemma 2.5 and Lemma 2.6 are extensions of corresponding results for the single elliptic equation in [7] by some primary operations. We remark that in [10] the authors used a similar idea to prove the convergence of adaptive finite element computations for multiple eigenvalues.
Theorem 2.4
Let \((y_{h_k},p_{h_k})\in V_{h_k}\times V_{h_k}\) be a sequence of finite element solutions of problems (2.2) and (2.18) based on the adaptively refined mesh \(\mathcal {T}_{h_k}\) produced by AFEM. Then there exist constants \({\tilde{\gamma }}>0\) and \({\tilde{\beta }}\in (0,1)\), depending only on the shape regularity of meshes, the data and the parameters used in Dörfler’s marking algorithm, such that for any two consecutive iterates k and \(k+1\) we have
Here
with some constant \(\delta \in (0,1)\).
To prove the optimal complexity of the adaptive algorithm we need further results. The following lemma presents a localized upper bound estimate for the distance between two nested solutions of the elliptic boundary value problems (2.2) and (2.18) (see [7, Lemma 3.6] and [11, Lemma 6.2]).
Lemma 2.5
Let \((y_{h_k},p_{h_k})\in V_{h_k}\times V_{h_k}\) and \((y_{h_{k+1}},p_{h_{k+1}})\in V_{h_{k+1}}\times V_{h_{k+1}}\) be the discrete solutions of problems (2.2) and (2.18) over a mesh \(\mathcal {T}_{h_k}\) and its refinement \(\mathcal {T}_{h_{k+1}}\) with marked element \(\mathcal {M}_{h_{k}}\subset \mathcal {T}_{h_k}\). Let \(R_{\mathcal {T}_{h_k}\rightarrow \mathcal {T}_{h_{k+1}}}\) be the set of refined elements. Then the following localised upper bound is valid
Consequently, we can show the optimality of Dörfler’s marking strategy in the following lemma (see [7, Lemma 5.9] and [11, Proposition 6.3] for the proof).
Lemma 2.6
Let \((y_{h_k},p_{h_k})\in V_{h_k}\times V_{h_k}\) and \((y_{h_{k+1}},p_{h_{k+1}})\in V_{h_{k+1}}\times V_{h_{k+1}}\) be the discrete solutions of problems (2.2) and (2.18) over a mesh \(\mathcal {T}_{h_k}\) and its refinement \(\mathcal {T}_{h_{k+1}}\) with marked element \(\mathcal {M}_{h_{k}}\subset \mathcal {T}_{h_k}\). Suppose that they satisfy the energy decrease property
with \({\tilde{\gamma }}_{0}>0\) a constant and \({\tilde{\beta }}_{0}^{2}\in (0,{1\over 2})\). Then the set \(R_{\mathcal {T}_{h_k}\rightarrow \mathcal {T}_{h_{k+1}}}\) of marked elements satisfies the Dörfler property
with \({\tilde{\theta }}=\frac{{\tilde{C}}_{2}(1-2{\tilde{\beta }}_{0}^{2})}{{\tilde{C}}_{0}({\tilde{C}}_{1}+(1+2C_{*}^{2}{\tilde{C}}_{1}){\tilde{\gamma }}_{0})}\), where \({\tilde{C}}_{0}=\max (1,\frac{{\tilde{C}}_{3}}{{\tilde{\gamma }}_{0}})\).
3 Adaptive finite element method for the optimal control problem
In this section we consider the following elliptic optimal control problem:
subject to
where \(\alpha >0\) is a fixed parameter, \(y_d\in L^2(\Omega )\) is the desired state and \(U_{ad}\) is the admissible control set with bilateral control constraints:
where \(a,b\in \mathbb {R}\) and \(a<b\).
Remark 3.1
We remark that all the theories presented below can be generalized to the case that the control acts on a subdomain \(\omega \subset \Omega \). In this case the control operator \(B:L^2(\omega )\rightarrow L^2(\Omega )\) is an extension by zero operator and the governing equation reads \(Ly=Bu\).
With the solution operator S of the elliptic equation (3.2) introduced in the last section, we can formulate the reduced optimization problem
Since the above optimization problem is linear and strictly convex, there exists a unique solution \(u\in U_{ad}\) by standard arguments (see [25]). Moreover, the first order necessary and sufficient optimality condition can be stated as follows:
where \(S^*\) is the adjoint of S [21]. Introducing the adjoint state \(p:=S^*(Su-y_d)\in H_0^1(\Omega )\), we are led to the following optimality system:
Hereafter, we call u, y and p the optimal control, state and adjoint state, respectively. From the last inequality of (3.5) we have the pointwise representation of u (see [25]):
where \(P_{[a,b]}\) is the orthogonal projection operator from \(L^2(\Omega )\) to \(U_{ad}\).
Next, let us consider the finite element approximation of (3.1)–(3.2). In this paper, we use the piecewise linear finite elements to approximate the state y, and variational discretization for the optimal control u (see [20]). Based on the finite element space \(V_h\), we can define the finite dimensional approximation to the optimal control problem (3.1)–(3.2) as follows: Find \((u_h,y_h)\in U_{ad}\times V_h\) such that
subject to
Similar to the continuous case we have \(y_h=S_hu_h\). With this notation we can formulate a reduced discrete optimization problem
We note that the above optimization problem can be solved by the projected gradient method or the semi-smooth Newton method, see [5, 19, 21, 30] for more details.
Similar to the continuous problem (3.1)–(3.2), the above discretized optimization problem also admits a unique solution \({u}_h\in U_{ad}\). Moreover, the first order necessary and sufficient optimality condition can be stated as follows:
where \(S^*_h\) is the adjoint of \(S_h\). Introducing the adjoint state \(p_h:=S^*_h(S_hu_h-y_d)\in V_h\), the discretized first order necessary and sufficient optimality condition is equivalent to:
Hereafter, we call \(u_h\), \(y_h\) and \(p_h\) the discrete optimal control, state and adjoint state, respectively. Similar to the continuous case (3.6) we have
It should be noticed that \(u_h\) is not generally a finite element function in \(V_h\).
For convenience we define \(y^h:=Su_h\) and \(p^h:=S^*(S_hu_h-y_d)\). It is obvious that \(y_h\) and \(p_h\) are the standard Galerkin projections of \(y^h\) and \(p^h\), i.e., \(y_h=\mathcal {R}_hy^h\) and \(p_h=\mathcal {R}_hp^h\). The following equivalence property is established in [22].
Theorem 3.2
Let \((u,y,p)\in U_{ad}\times H_0^1(\Omega )\times H_0^1(\Omega )\) and \(( u_h, y_h, p_h)\in U_{ad}\times V_h\times V_h\) be the solutions of problems (3.1)–(3.2) and (3.7)–(3.8), respectively. Then the following an equivalence property holds:
Proof
For completeness we include a brief proof. Setting \(v= u_h\) in (3.4) and \(v_h = u\) in (3.10) we are led to
Adding the above two inequalities, we conclude from (3.5) that
It follows from Young’s inequality that
Moreover, from (3.5) we have
and
Combining the above estimates we prove the upper bound.
Now we prove the lower bound. Note that
Similarly, we can derive that
Thus, we can conclude from the above estimates the lower bound. This completes the proof. \(\square \)
Next, we will prove a compact equivalence property which shows the certain relationship between the finite element optimal control approximation and the associated finite element boundary value approximation.
Theorem 3.3
Let \(h\in (0,h_0)\), \((u,y,p)\in U_{ad}\times H_0^1(\Omega )\times H_0^1(\Omega )\) and \((u_h,y_h,p_h)\in U_{ad}\times V_h\times V_h\) be the solutions of problems (3.1)–(3.2) and (3.7)–(3.8), respectively. Then the following equivalence properties hold
provided \(h_0\ll 1\).
Proof
It is obvious that
Moreover, it follows from the stability results of the elliptic equation that
In the following we estimate \(\Vert y-y_h\Vert _{0,\Omega }\). Let \(\psi \in H_0^1(\Omega )\) be the solution of the following auxiliary problem
Let \(\psi _h\in V_h\) be the finite element approximation of \(\psi \). In the following proof we use the duality argument (see, e.g., [9]). Multiplying by \(y-y_h\) on both sides of (3.24) and integration by parts, we can conclude from (2.7) that
where \(\kappa (h)\) is defined in (2.6). This in turn implies
Considering (3.23) we have
It remains to estimate \(\Vert u-u_h\Vert _{0,\Omega }\). Note that it follows from (3.14), (3.15) and the definition of \(S^*_h\) that
which yields
Let \(\phi \in H_0^1(\Omega )\) be the solution of the following auxiliary problem
Now we use the duality argument again. It follows from the continuous and discrete adjoint state equations that
where \(\phi _h\in V_h\) is the finite element approximation of \(\phi \). We can conclude from (2.7), (2.8) that
and
Using the fact that \(\kappa ^2(h)< \kappa (h)\ll 1\) when \(h_0\ll 1\), we are able to derive that
Combining (3.27) and (3.33) we are led to
If \(h_0\ll 1\) then \(\kappa (h)\ll 1\) for all \(h\in (0,h_0)\), and we arrive at
Inserting the above estimate into (3.23) and (3.26), we can conclude from (3.22) the desired results (3.20), (3.21). This completes the proof. \(\square \)
Now we are in the position to consider the adaptive finite element method for the optimal control problem (3.1)–(3.2). At first we will derive a posteriori error estimates for above optimal control problems. To begin with, we firstly introduce some notations. Similar to the definitions (2.9) and (2.10) we define the element residuals \(r_{y,T}(y_h)\), \(r_{p,T}(p_h)\) and the jump residuals \(j_{y,E}(y_h)\), \(j_{p,E}(p_h)\) by
For each element \(T\in \mathcal {T}_h\), we define the local error indicators \(\eta _{y,h}(y_h,T)\) and \(\eta _{p,h}(p_h,T)\) by
Then on a subset \(\omega \subset \Omega \), we define the error estimators \(\eta _{y,h} (y_h,\omega )\) and \(\eta _{p,h}(p_h,\omega )\) by
Thus, \(\eta _{y,h}(y_h,\Omega )\) and \(\eta _{p,h}(p_h,\Omega )\) constitute the error estimators for the state equation and the adjoint state equation on \(\Omega \) with respect to \(\mathcal {T}_h\).
Note that \(S_hu_h\) and \(S^*_h(S_hu_h-y_d)\) are the standard Galerkin projections of \(Su_h\) and \(S^*(S_hu_h-y_d)\), respectively. Similar to (2.16)–(2.17), standard a posterior error estimates for elliptic boundary value problems give the following upper bounds (see, e.g., [36]) which show the reliability of the error estimators.
Lemma 3.4
Let S and \(S_h\) be the continuous and discrete solution operators defined above. Then the following a posteriori error estimates hold
Then we can also derive the following global a posteriori error lower bounds, i.e., the global efficiency of the error estimators.
Lemma 3.5
Let S and \(S_h\) be the continuous and discrete solution operators defined above. Then the following a posteriori error lower bounds hold
Let \(h_0\in (0,1)\) be the mesh size of the initial mesh \(\mathcal {T}_{h_0}\) and define
It is obvious that \({\tilde{\kappa }}(h_0)\ll 1\) if \(h_0\ll 1\). For ease of exposition we also define the following quantities:
and the straightforward modifications for \(\eta _h^2((y_h,p_h),\Omega )\) and \(\mathrm{osc}^2((y_h,p_h),\mathcal {T}_h)\).
Now we state the following a posteriori error estimates for the finite element approximation of the optimal control problem.
Theorem 3.6
Let \(h\in (0,h_0)\). Assume that \((u,y,p)\in U_{ad}\times H_0^1(\Omega )\times H_0^1(\Omega )\) and \((u_h,y_h,p_h)\in U_{ad}\times V_h\times V_h\) are the solutions of problems (3.1)–(3.2) and (3.7)–(3.8), respectively. Then there exist positive constants \(C_1\), \(C_2\) and \(C_3\), independent of the mesh size h, such that
and
provided \(h_0\ll 1\).
Proof
Note that \(y^h = Su_h\), \(y_h=S_hu_h\), \(p^h=S^*(S_h u_h-y_d)\) and \(p_h=S^*_h(S_h u_h-y_d)\). From the estimates (3.20), (3.21) and Lemmas 3.4 and 3.5 we have
and
We obtain the desired results by choosing
\(\square \)
Now we present the adaptive algorithm for solving optimal control problems. Note that there are two error estimators \(\eta _{y,h}(y_h,T)\) and \(\eta _{p,h}(p_h,T)\) contributed to the state approximation and adjoint state approximation, respectively. We use the sum of the two estimators as our indicators for the marking strategy. The marking algorithm based on Dörfler’s strategy for optimal control problems can be described as follows
Algorithm 3.7
Dörfler’s marking strategy for OCPs
-
(1)
Given a parameter \(0<\theta <1\);
-
(2)
Construct a minimal subset \({{\mathcal {M}}}_{h}\subset \mathcal {T}_h\) such that
$$\begin{aligned} \sum \limits _{T\in {{\mathcal {M}}}_{h}}\eta _{h}^2((y_h,p_h),T)\geqslant \theta \eta _{h}^2((y_h,p_h),\Omega ). \end{aligned}$$ -
(3)
Mark all the elements in \({{\mathcal {M}}}_{h}\).
Then we can present the adaptive finite element algorithm for the optimal control problem (3.7)–(3.8) as follows:
Algorithm 3.8
Adaptive finite element algorithm for OCPs:
-
(1)
Given an initial mesh \(\mathcal {T}_{h_0}\) with mesh size \(h_0\), construct the finite element space \(V_{h_0}\).
-
(2)
Set \(k=0\) and solve the optimal control problem (3.7)–(3.8) to obtain \((u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}\).
-
(3)
Compute the local error indicator \(\eta _{h_k}((y_{h_k},p_{h_k}),T)\).
-
(4)
Construct \({{\mathcal {M}}}_{h_k}\subset \mathcal {T}_{h_k}\) by the marking Algorithm 3.7.
-
(5)
Refine \({{ \mathcal {M}}}_{h_k}\) to get a new conforming mesh \(\mathcal {T}_{h_{k+1}}\) by procedure REFINE.
-
(6)
Construct the finite element space \(V_{h_{k+1}}\) and solve the optimal control problem (3.7)–(3.8) to obtain \((u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})\in U_{ad}\times V_{h_{k+1}}\times V_{h_{k+1}}\).
-
(7)
Set \(k=k+1\) and go to Step (3).
4 Convergence of AFEM for the optimal control problem
In this section we intend to prove the convergence of the adaptive Algorithm 3.8. The proof uses some ideas of [11, 16] and some results of [7]. Following Theorem 3.3, we may firstly establish some relationships between the two level approximations, which will be used in our analysis for both convergence and optimal complexity.
Theorem 4.1
Let \(h,H\in (0,h_0)\) and \((u,y,p)\in U_{ad}\times H_0^1(\Omega )\times H_0^1(\Omega )\) be the solution of problem (3.1)–(3.2). Assume that \((u_h, y_h, p_h)\in U_{ad}\times V_h\times V_h\) and \((u_H, y_H, p_H)\in U_{ad}\times V_H\times V_H\) are the solutions of problem (3.7)–(3.8), respectively. Define \(y^H:=Su_H\) and \(p^H:=S^*(S_Hu_H-y_d)\). Then the following properties hold
and
provided \(h_0\ll 1\).
Proof
Note that
and
On the other hand, it follows from (2.4), the triangle inequality and the stability results for elliptic equation that
and
where in the last inequality we used (3.25). It follows from (3.35) that
provided \(h_0\ll 1\). Combining this with (4.7), (4.8) yields (4.1) and (4.2).
Then we prove (4.3), (4.4). Note that
From Lemma 2.2 we have
which together with (4.11) imply
Moreover, since \(\bar{f}_T\) is the \(L^2\)-projection of f onto piecewise polynomials on T, there holds
By using the triangle inequality and (3.25) we thus have
which together with (3.35) yield
We can conclude the desired results (4.3), (4.4) from the definition of the data oscillation and (4.12)–(4.16).
Now it remains to prove (4.5) and (4.6). From the definition of \(y^H\) and \(y^h\) we know that \(y^h-y^H\) is the solution of an elliptic boundary value problem with right hand side \(u_h-u_H\). It follows from (2.17) and (4.9) that
From (2.14), (3.35), (4.14) and (4.15) we are led to
Note that
Combining this with (3.35), (4.17) and (4.18) gives
which proves (4.5). Similarly we can prove (4.6). Thus, we complete the proof of the theorem. \(\square \)
Now we are ready to prove the error reduction for the sum of the energy errors and the scaled error estimators of the state y and the adjoint state p, between two consecutive adaptive loops.
Theorem 4.2
Let \((u,y,p)\in U_{ad}\times H_0^1(\Omega )\times H_0^1(\Omega )\) be the solution of problem (3.1)–(3.2) and \((u_{h_k},y_{h_k}, p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}\) be a sequence of solutions to problem (3.7)–(3.8) produced by Algorithm 3.8. Then there exist constants \(\gamma >0\) and \(\beta \in (0,1)\) depending only on the shape regularity of meshes and the parameter \(\theta \) used by Algorithm 3.7, such that for any two consecutive iterates k and \(k+1\), we have
provided \(h_0\ll 1\). Therefore, Algorithm 3.8 converges with a linear rate \(\beta \), namely, the kth iterate solution \((u_{h_k},y_{h_k},p_{h_k})\) of Algorithm 3.8 satisfies
where \(C_0=\Vert (y-y_{h_{0}},p-p_{h_{0}})\Vert _{a}^2+\gamma \eta _{h_{0}}^2((y_{h_{0}},p_{h_{0}}),\Omega )\).
Proof
For convenience, we use \((u_H,y_H,p_H)\) and \((u_h,y_h,p_h)\) to denote \((u_{h_k},y_{h_k},p_{h_k})\) and \((u_{h_{k+1}}, y_{h_{k+1}},p_{h_{k+1}})\), respectively. So it suffices to prove that for \((u_H,y_H,p_H)\) and \((u_h,y_h,p_h)\), there holds
Recall that \(y^H:=Su_H\), \(y^h:=Su_h\) and \(p^H:=S^*(S_Hu_H-y_d)\), \(p^h:=S^*(S_hu_h-y_d)\). It follows from Algorithm 3.7 that Dörfler’s marking strategy is satisfied for \((y^H,p^H)\). So we conclude from Theorem 2.4 that there exist constants \({\tilde{\gamma }}\) as defined in (2.24) and \({\tilde{\beta }}\in (0,1)\) satisfying
Note that \(\mathcal {R}_Hy^H=y_H\) and \(\mathcal {R}_Hp^H=p_H\), we thus have
We conclude from (4.1), (4.2) and (4.5), (4.6) that there exists a constant \({\hat{C}}_4>0\) such that
where Young’s inequality with \(\delta _1\) is used and \(\delta _1\in (0,1)\) satisfies
Thus, there exists a positive constant \({\hat{C}}_5\) depending on \({\hat{C}}_4\) and \({\tilde{\gamma }}\) such that \({\hat{C}}_4(1+\delta _1^{-1})(1+{\tilde{\gamma }})\leqslant {\hat{C}}_{5}\delta _1^{-1}\) and
It follows from (4.23) and (4.25) that
Then using Theorem 3.3 we arrive at
and thus
where \(C_4\) is a positive constant depending on \({\hat{C}}_5\) and \({\hat{C}}_6\) when \(h_0\ll 1\). So we can derive
or equivalently,
Since \({\tilde{\kappa }}(h_0)\ll 1\) provided that \(h_0\ll 1\), we can define the constant \(\beta \) as
which satisfies \(\beta \in (0,1)\) if \(h_0\ll 1\) in view of (4.24). Then
Now we choose
it is obvious that
Then we obtain (4.21), this completes the proof. \(\square \)
Remark 4.3
We remark that the requirement \(h_0\ll 1\) on the initial mesh \(\mathcal {T}_{h_0}\) is not restrictive for the convergence analysis of AFEM for nonlinear problems, such as optimal control problems studied in this paper, see, e.g., [14]. For a similar requirement we refer to [10, 11] for the convergence analysis of adaptive finite element eigenvalue computations and to [31] for the adaptive finite element computations for nonsymmetric boundary value problems, we should also mention [16] for an adaptive finite element method of semilinear elliptic equations.
Remark 4.4
In the adaptive Algorithm 3.8 we use the sum of the error estimators \(\eta _{y,h}(y_h,T)\) contributed to the state approximation and \(\eta _{p,h}(p_h,T)\) contributed to the adjoint state approximation as an indicator to select the subset \({{\mathcal {M}}}_h\) for refinement. This marking strategy enables us to prove the convergence and quasi-optimality (see Sect. 5) of AFEM for optimal control problems. We remark that it is also possible to apply Dörfler’s strategy to the contributions \(\eta _{y,h}(y_h,T)\) and \(\eta _{p,h}(p_h,T)\) as follows:
-
Construct a minimal subset \({{\mathcal {M}}}_{h,1}\subset \mathcal {T}_h\) such that \(\sum \nolimits _{T\in {{\mathcal {M}}}_{h,1}} \eta _{y,h}^2(y_h,T)\geqslant \theta \eta _{y,h}^2(y_h,\Omega )\).
-
Construct another minimal subset \({{\mathcal {M}}}_{h,2}\subset \mathcal {T}_h\) such that \(\sum \nolimits _{T\in {{\mathcal {M}}}_{h,2}} \eta _{p,h}^2(p_h,T)\geqslant \theta \eta _{p,h}^2(p_h,\Omega )\).
-
Set \({{\mathcal {M}}}_{h}:={{\mathcal {M}}}_{h,1}\cup {{\mathcal {M}}}_{h,2}\) and mark all the elements in \({{\mathcal {M}}}_{h}\).
With this marking strategy we can also prove the convergence of AFEM for optimal control problems by using the results of [7, 11] for single boundary value problems. To be more specific, the error reduction (4.22) can be derived separately for the state and adjoint state approximations. However, the resulting over-refinement for this marking strategy prevents us to prove the quasi-optimality of the adaptive algorithm.
We also would like to point out that in our convergence analysis we used the same mesh for the state and adjoint state approximations, another possibility is to use different meshes for the state and adjoint state. Therefore, starting from the same initial mesh we obtain different adaptive meshes for the state and adjoint state approximations where the refinement of each mesh can be generated by the error indicators corresponding to the respective state and adjoint state equations. It will be an interesting topic to prove the convergence of AFME based on the above different discretizations for primal and dual variables.
5 Complexity of AFEM for the optimal control problem
In this section we intend to analyze the complexity of the adaptive finite element algorithm for optimal control problems based on known results on the complexity for elliptic boundary value problems. The proof uses some ideas of [11, 16] and some results of [7].
Similar to [7, 11], for our purpose to analyse the complexity of AFEM for optimal control problems we need to introduce a function approximation class as follows
where \(\gamma >0\) is some constant and
Here \(\mathcal {T}\subset \mathcal {T}_{h_0}\) means that \(\mathcal {T}\) is a refinement of \(\mathcal {T}_{h_0}\), \(y_{\mathcal {T}}\) and \(p_{\mathcal {T}}\) are elements of the finite element space corresponding to the partition \(\mathcal {T}\). It is seen from the definition that \(\mathcal {A}_\gamma ^s=\mathcal {A}_1^s\) for all \(\gamma >0\), thus we use \(\mathcal {A}^s\) throughout the paper with corresponding norm \(|\cdot |_{s}\). So \(\mathcal {A}^s\) is the class of functions that can be approximated with a given tolerance \(\varepsilon \) by continuous piecewise linear polynomial functions over a partition \(\mathcal {T}\) with number of degrees of freedom \(\#{\mathcal {T}}-\#{\mathcal {T}_{h_0}}\lesssim \varepsilon ^{-1/s}|v|_s^{1/s}\).
Now we are in the position to prepare for the proof of optimal complexity of Algorithm 3.8 for the optimal control problem (3.1)–(3.2). At first, we define \(y^{h_k}:=Su_{h_k}\) and \(p^{h_k}:=S^*(S_{h_k}u_{h_k}-y_d)\). Then we have the following result.
Lemma 5.1
Let \((u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}\) and \((u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})\in U_{ad}\times V_{h_{k+1}}\times V_{h_{k+1}}\) be discrete solutions of problem (3.7)–(3.8) over mesh \(\mathcal {T}_{h_k}\) and its refinement \(\mathcal {T}_{h_{k+1}}\) with marked elements \(\mathcal {M}_{h_{k}}\). Suppose they satisfy the following property
with \(\gamma _*\) and \(\beta _*\) some positive constants. Then for the associated state and adjoint state approximations we have
with
where \(C_5\) is some constant depending on \(C_*\), \({\hat{C}}_5\) and \({\hat{C}}_6\). \({\hat{C}}_5\), \({\hat{C}}_6\) and \(\delta _1\in (0,1)\) are some constants as in the proof of Theorem 4.2.
Proof
The proof follows along the lines of the proof of Theorem 4.2 when (4.5), (4.6) are replaced by (4.3), (4.4). Specifically, in the proof of Theorem 4.2 we use (4.22), Theorems 3.3 and 4.1 to prove (4.21). Conversely, here we need to prove (4.22) from (4.21), Theorems 3.3 and 4.1. The definitions of \({\tilde{\beta }}_{*}\) and \({\tilde{\gamma }}_{*}\) are very similar to (4.30) and (4.32). \(\square \)
Next, we are able to derive a result similar to Lemma 2.6 concerning the optimality of Dörfler’s marking strategy for the optimal control problems.
Corollary 5.2
Let \((u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}\) and \((u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})\in U_{ad}\times V_{h_{k+1}}\times V_{h_{k+1}}\) be discrete solutions of problem (3.7)–(3.8) over the mesh \(\mathcal {T}_{h_k}\) and its refinement \(\mathcal {T}_{h_{k+1}}\) with marked elements \(\mathcal {M}_{h_k}\). Suppose they satisfy the following property
with constants \(\gamma _*>0\) and \(\beta _*\in (0,\sqrt{1\over 2})\). Then the set \(R_{\mathcal {T}_{h_k}\rightarrow \mathcal {T}_{h_{k+1}}}\) of refined elements satisfies the Dörfler property
with \({\hat{\theta }}=\frac{{\tilde{C}}_{2}(1-2{\tilde{\beta }}_{*}^{2})}{{\tilde{C}}_{0}(\tilde{C}_1+(1+2C_*^2{\tilde{C}}_{1}){\tilde{\gamma }}_{*})}\) and \({\tilde{C}}_{0}=\max (1,\frac{{\tilde{C}}_{3}}{{\tilde{\gamma }}_{*}})\).
Proof
From the statement of this corollary we know that the assumption (5.1) in Lemma 5.1 is satisfied, due to Lemma 5.1 we can conclude (5.2). Note that \(y_{h_k}=\mathcal {R}_{h_k}y^{h_k}\) and \(p_{h_k}=\mathcal {R}_{h_k}p^{h_k}\). By the lower bounds in Lemma 3.5 and the definition of \({\tilde{C}}_{0}\) we have
Thus, it follows from (5.2) that
Note that \(y_{h_k}\) and \(\mathcal {R}_{h_{k+1}}y^{h_k}\) are the Galerkin projections of \(y^{h_k}\) on \(V_{h_k}\) and \(V_{h_{k+1}}\), respectively. From the standard Galerkin orthogonality we have
By (2.15), the triangle and Young’s inequalities we have
which together with the dominance of the indicator over oscillation (see [7, Remark 2.1])
implies
where we used (2.25) in the last inequality. Combining (5.4)–(5.8) and (2.25) we obtain
By choosing
we complete the proof. \(\square \)
Lemma 5.3
Let \((y,p,y_d)\in \mathcal {A}^s\) and \(\mathcal {T}_{h_k}\) (\(k\geqslant 0\)) be a sequence of meshes generated by Algorithm 3.8 starting from the initial mesh \(\mathcal {T}_{h_0}\). Let \(\mathcal {T}_{h_{k+1}}=\mathrm{REFINE}(\mathcal {T}_{h_k},\mathcal {M}_{h_k})\) where \(\mathcal {M}_{h_k}\) is produced by Algorithm 3.7 with \(\theta \) satisfying \(\theta \in (0,\frac{ C_2\gamma }{C_3(C_1+(1+2C_*^2 C_1)\gamma )})\). Then
where the constant \(C_5\) depends on the discrepancy between \(\theta \) and \(\frac{ C_2\gamma }{C_3(C_1+(1+2C_*^2 C_1)\gamma )}\).
Proof
Let \(\rho ,\rho _1\in (0,1)\) satisfy \(\rho _1\in (0,\rho )\) and
Choose \(\delta _1\in (0,1)\) to satisfy (4.24) and
which implies
Set
and let \(\mathcal {T}_{h_\varepsilon }\) be a refinement of \(\mathcal {T}_{h_0}\) with minimal degrees of freedom satisfying
We can conclude from the definition of \(\mathcal {A}^s\) that
Let \(\mathcal {T}_{h_*}:=\mathcal {T}_{h_\varepsilon }\oplus \mathcal {T}_{h_k}\) be the smallest common refinement of \(\mathcal {T}_{h_\varepsilon }\) and \(\mathcal {T}_{h_k}\). Let \(V_{h_\varepsilon }\subset H_0^1(\Omega )\) and \(V_{h_*}\subset H_0^1(\Omega )\) be the finite element spaces defined on \(\mathcal {T}_{h_\varepsilon }\) and \(\mathcal {T}_{h_*}\), respectively. Assume that \((u_{h_\varepsilon },y_{h_\varepsilon },p_{h_\varepsilon })\in U_{ad}\times V_{h_\varepsilon }\times V_{h_\varepsilon }\) is the solution of problem (3.7)–(3.8).
Define \(y^{h_\varepsilon }:=Su_{h_\varepsilon }\) and \(p^{h_\varepsilon }:=S^*(S_{h_\varepsilon }u_{h_\varepsilon }-y_d)\). From the definition of oscillation we can conclude from Lemma 2.2 that
and
Then from Young’s inequality we have
Due to the orthogonality
we arrive at
From (2.24) of Theorem 2.4 we can see that \({\tilde{\gamma }}\leqslant \frac{1}{2C_*^2}\), which implies
with \(\sigma = \frac{1}{C_*^2}-{\tilde{\gamma }}\in (0,1)\). Following the similar procedure as in the proof of Theorem 4.2 when (4.5), (4.6) are replaced by (4.3), (4.4), we are led to
where
and \(C_4\) is the constant that appeared in the proof of Theorem 4.2. Thus, it follows from (5.13), (5.14) and the definition of \(\varepsilon \) that
with \(\beta _1=\frac{1}{\sqrt{2}}\beta _0\rho _1\).
In view of (5.12) and the definitions of \(\beta _0\) and \(\beta _1\) we have \(\beta _1^2\in (0,{1\over 2})\) provided \(h_0\ll 1\). It follows from Corollary 5.2 that
where \(\theta _1=\frac{{\tilde{C}}_2(1-2{\tilde{\beta }}_1^2)}{{\tilde{C}}_5({\tilde{C}}_1+(1+2C_{*}^2{\tilde{C}}_1){\tilde{\gamma }}_1)}\), \({\tilde{\gamma }}_1=\frac{\gamma }{1-C_5\delta _1^{-1}{\tilde{\kappa }}^2(h_0)}\), \({\tilde{C}}_5=\max (1,\frac{{\tilde{C}}_3}{{\tilde{\gamma }}_1})\) and
It follows from the definition of \(\gamma \) in (2.24) and \({\tilde{\gamma }}\) in (4.32) that \({\tilde{\gamma }}_1<1\), which together with \({\tilde{C}}_3>1\) (see [11]) implies \({\tilde{C}}_5=\frac{{\tilde{C}}_3}{{\tilde{\gamma }}_1}\). Since \(h_0\ll 1\), we obtain that \({\tilde{\gamma }}_1>\gamma \) and \({\tilde{\beta }}_1\in (0,\frac{1}{\sqrt{2}}\rho )\) from (5.11). It is easy to see from (3.50) and \({\tilde{\gamma }}_1>\gamma \) that
provided \(h_0\ll 1\). This implies
Note that Algorithm 3.7 selects a minimal set \(\mathcal {M}_{h_k}\) satisfying
Thus,
which is the desired result with an explicit dependance on the discrepancy between \(\theta \) and \(\frac{ C_2\gamma }{C_3(C_1+(1+2C_*^2 C_1)\gamma )}\). \(\square \)
We are now ready to prove that Algorithm 3.8 possesses optimal complexity for the state and adjoint state approximations.
Theorem 5.4
Let \((u,y,p)\in U_{ad}\times H_0^1(\Omega )\times H_0^1(\Omega )\) be the solution of problem (3.1)–3.2 and \((u_{h_n},y_{h_n},p_{h_n})\in U_{ad}\times V_{h_n}\times V_{h_n}\) be a sequence of solutions of problem (3.7)–(3.8) corresponding to a sequence of finite element spaces \(V_{h_n}\) with partitions \(\mathcal {T}_{h_n}\) produced by Algorithm 3.8. Then the nth iterate solution \((y_{h_n},p_{h_n})\) of Algorithm 3.8 satisfies the optimal bound
where the hidden constant depends on the exact solution (u, y, p) and the discrepancy between \(\theta \) and \(\frac{ C_2\gamma }{C_3(C_1+(1+2C_*^2 C_1)\gamma )}\).
Proof
It follows from (2.22) and (5.10) that
From the lower bound (3.49) we have
where \(C_6=\max (1+\frac{\gamma }{C_2},\frac{ C_3}{C_2})\). Then we arrive at
Due to (4.19), we obtain for \(0\leqslant k< n\) that
Thus,
where the last inequality holds due to the fact that \(\beta <1\).
which together with (5.21) yields
this completes the proof. \(\square \)
Remark 5.5
From (3.35) and the equivalence property (3.13) we can conclude that Theorem 4.2 also implies the convergence of \(\Vert u-u_{h_n}\Vert _{0,\Omega }\), namely, for the n-th iterate solution \(u_{h_n}\) of Algorithm 3.8 there holds
We remark that the control variable can also be included into the complexity analysis of AFEM for optimal control problems to obtain
However, the above results are sub-optimal for the optimal control as illustrated by the numerical results in Sect. 6. To prove the optimality of the AFEM for the control variable, it seems that we need to work with an AFEM based on \(L^2\)-norm error estimators. We refer to [20] for an optimal a priori error estimate. We expect that the results in [12] will enable us to prove the optimal convergence of the AFEM for the optimal control u, this will be postponed to future work.
6 Numerical experiments
In this section we carry out some numerical tests in two dimensions to support our theoretical results obtained in this paper. We take the elliptic operator L as \(-\Delta \) with homogeneous Dirichlet boundary conditions for all the examples.
Example 6.1
This example is taken from [1]. The domain \(\Omega \) can be described in polar coordinates by
We take the exact solutions as
with \(\lambda ={2\over 3}\) and \(\nu _1=\nu _2={5\over 2}\). We set \(\alpha = 0.1\), \(a=-0.3\) and \(b=1\). We assume the additional right hand side f for the state equation.
We give the numerical results for the optimal control approximation by Algorithm 3.8 with parameter \(\theta = 0.4\) and \(\theta = 0.5\). Figure 1 shows the profiles of the numerically computed optimal state and adjoint state. We present in Fig. 2 the triangulations by Algorithm 3.8 after 8 and 10 adaptive iterations. We can see that the meshes are concentrated on the reentrant corner where the singularities are located. We also illustrate the active sets of the continuous solution, the discrete solutions with variational control discretization and piecewise linear control discretization after 9 adaptive iterations with \(\theta =0.4\). In this example only the lower bound \(u\geqslant -0.3\) is active. Figure 3 clearly shows that the active set crosses element edges and is not restricted to finite element edges by our variational discretization for control u, and is much closer to that of the continuous solution compared with full control discretization.
To illustrate the efficiency of adaptive finite element method for solving optimal control problems, we show in the left plot of Fig. 4 the error histories of the optimal control, state and adjoint state with uniform refinement. We can only observe the reduced orders of convergence which are less than one for the energy norms of the state and adjoint state, and less than two for the \(L^2\)-norm of the control. In the right plot of Fig. 4 we present the convergence behaviours of the optimal control, state and adjoint state, as well as the error estimators \(\eta _{y,h}(y_h,\Omega )\) and \(\eta _{p,y}(p_h,\Omega )\) for the state and adjoint state equations with adaptive refinement. In Fig. 5 we present the convergence of the error \(\Vert (y-y_h,p-p_h)\Vert _a\) and error indicator \(\eta _h((y_h,p_h),\Omega )\) with \(\theta = 0.4\) and \(\theta = 0.5\), respectively. It is shown from Fig. 5 that the error \(\Vert (y-y_h,p-p_h)\Vert _a\) is proportional to the a posteriori error estimators, which implies the efficiency of the a posteriori error estimators given in Sect. 3. Moreover, we can also observe that the convergence order of the error \(\Vert (y-y_h,p-p_h)\Vert _a\) is approximately parallel to the line with slope \(-1/2\) which is the optimal convergence rate we can expect by using linear finite elements, this coincides with our theory in Sect. 5. For the error \(\Vert u-u_h\Vert _{0,\Omega }\) we can observe a reduction with slope \(-1\), which is better than the results presented in Remark 5.5, and strongly suggests that the convergence rate for the optimal control is not optimal.
Example 6.2
In the second example we consider an optimal control problem without explicit solutions. We set \(\Omega =(-1,1)^2\), \(\alpha = 10^{-3}\), \(a=-10\) and \(b=10\). The desired state \(y_d\) is chosen as 10, 1, \(-10\) and \(-1\) in the first, second, third and fourth quadrant, respectively.
Similar to the above example Fig. 6 shows the profiles of the numerically computed optimal state and adjoint state. We present in the left plot of Fig. 7 the triangulation generated by Algorithm 3.8 after 8 adaptive iterations with parameter \(\theta =0.5\). From Fig. 6 we can see that the state and adjoint state are very smooth, so the AFEM should produce a quasi-uniform mesh in this case and indeed we can observe this phenomena from the left plot of Fig. 7. Since there are no explicit solutions we can not show the convergence of the error \(\Vert (y-y_h,p-p_h)\Vert _a\) as in Example 6.1. Instead we show in the right plot of Fig. 7 the convergence of the error indicator \(\eta _h((y_h,p_h),\Omega )\), the error estimators \(\eta _{y,h}(y_h,\Omega )\) and \(\eta _{p,y}(p_h,\Omega )\) for the state and adjoint state equations. We can observe an error reduction with slope \(-1/2\). We also plot in Fig. 8 the boarders of the active sets when using variational control discretization and piecewise linear control discretization. In this example, both the upper and lower bounds are active and we can observe a very sharp boundary between the active sets of the upper and lower bounds.
Example 6.3
In this example we consider an optimal control problem without explicit solutions defined on domain \(\Omega =(-1,1)\times (-1,1){\backslash } [0,1)\times (x_1,0]\). We set \(\alpha = 10^{-2}\), \(a=0\) and \(b=8\). We take the desired state \(y_d=2\).
We show in Fig. 9 the profiles of the numerically computed optimal state and adjoint state, singularities for both the state and adjoint state can be observed around the reentrant corner. We present in the left plot of Fig. 10 the triangulation generated by Algorithm 3.8 after 8 adaptive iterations with \(\theta =0.5\) which is locally refined around the corner. Similar to Examples 6.1 and 6.2 we also illustrate the active sets of the discrete solutions with both variational discretization and piecewise linear discretization after 9 adaptive iterations with \(\theta =0.4\). In this example only the upper bound \(u\leqslant 8\) is active. Figure 11 clearly shows the advantage of variational dicretization over full discretization for control u. Since there are no explicit solutions we show in the right plot of Fig. 10 the convergence of the error indicator \(\eta _h((y_h,p_h),\Omega )\), the error estimators \(\eta _{y,h}(y_h,\Omega )\) and \(\eta _{p,y}(p_h,\Omega )\) for the state and adjoint state equations. We can also observe an error reduction with slope \(-1/2\).
7 Conclusion and outlook
In this paper we give a rigorous convergence analysis of the adaptive finite element algorithm for optimal control problems governed by linear elliptic equations. We prove that the AFEM is a contraction, for the sum of the energy errors and the scaled error estimators of the state y and the adjoint state p, between two consecutive adaptive loops. We also show that the AFEM yields a decay rate of the energy errors of the state y and the adjoint state p plus oscillations of the state and adjoint state equations in terms of the number of degrees of freedom.
We expect that the results should also be valid for optimal Neumann boundary control problems (see [27]) by the following observations. The key point for the convergence analysis is the equivalence property presented in Theorem 3.3 where the relation between the finite element optimal control approximation and the standard finite element boundary value approximation is established. Consider the governing equation of the Neumann boundary control problem:
Similar to the proof of Theorem 3.3 we can conclude from the trace theorem that
where \(u_h\) is the discrete optimal control. Then we can obtain the counterpart of (3.20), (3.21) for Neumann boundary control problems
provided \(h_0\ll 1\). Thus, the convergence and complexity analysis of AFEM carries out to the Neumann boundary control problems.
There are many important issues remained unsolved for the convergence analysis of AFEM for optimal control problems compared to AFEM for boundary value problems. Firstly, at this moment we only prove the optimality of AFEM for energy errors of the state and adjoint state variables, the convergence for the optimal control u is sub-optimal. To prove the optimality of AFEM for the optimal control u it seems that we should work on the optimality of AFEM for boundary value problems under \(L^2\)-norms, as done in [12]. This complicates the convergence analysis with additional restrictions to the adaptive algorithms and will be postponed to future work.
Secondly, the convergence analysis of the adaptive finite element algorithm for other kinds of optimal control problems like Stokes control problems (see [28]), and non-standard finite element algorithm such as mixed finite element methods (see [8]) remains open and will be addressed in forthcoming papers.
Thirdly, we only prove the convergence of AFEM for optimal control problems with control constraints using a variational control discretization. The full control discretization concept by using piecewise constant or piecewise linear finite elements is also very important among the numerical methods for control problems. This kind of control discretization results in an additional discretised control space and an additional contribution to the a posteriori error estimators (see [22]) which should be incorporated within the adaptive algorithm and the corresponding convergence analysis. We also intend to generalise our approach in this paper to analyse the convergence of AFEM for optimal control problems with full control discretization in the future.
References
Apel, T., Rösch, A., Winkler, G.: Optimal control in non-convex domains: a priori discretization error estimates. Calcolo 44, 137–158 (2007)
Babuška, I., Rheinboldt, W.C.: Error estimates for adaptive finite element computations. SIAM J. Numer. Anal. 15, 736–754 (1978)
Becker, R., Kapp, H., Rannacher, R.: Adaptive finite element methods for optimal control of partial differential equations: basic concept. SIAM J. Control Optim. 39, 113–132 (2000)
Becker, R., Mao, S.P.: Quasi-optimality of an adaptive finite element method for an optimal control problem. Comput. Methods Appl. Math. 11, 107–128 (2011)
Bergounioux, M., Ito, K., Kunisch, K.: Primal-dual strategy for constrained optimal control problems. SIAM J. Control Optim. 37, 1176–1194 (1999)
Binev, P., Dahmen, W., DeVore, R.: Adaptive finite element methods with convergence rates. Numer. Math. 97, 219–268 (2004)
Cascon, J.M., Kreuzer, C., Nochetto, R.H., Siebert, K.G.: Quasi-optimal convergence rate for an adaptive finite element method. SIAM J. Numer. Anal. 46(5), 2524–2550 (2008)
Chen, Y.P., Liu, W.B.: A posteriori error estimates for mixed finite element solutions of convex optimal control problems. J. Comput. Appl. Math. 211(1), 76–89 (2008)
Ciarlet, P.G.: The Finite Element Methods for Elliptic Problems. North-Holland, Amsterdam (1978)
Dai, X.Y., He, L.H., Zhou, A.H.: Convergence and quasi-optimal complexity of adaptive finite element computations for multiple eigenvalues. IMA J. Numer. Anal. 35(4), 1934–1977 (2015)
Dai, X.Y., Xu, J.C., Zhou, A.H.: Convergence and optimal complexity of adaptive finite element eigenvalue computations. Numer. Math. 110, 313–355 (2008)
Demlow, A., Stevenson, R.: Convergence and quasi-optimality of an adaptive finite element method for controlling \(L^2\) errors. Numer. Math. 117(2), 185–218 (2011)
Dörfler, W.: A convergent adaptive algorithm for Poisson’s equation. SIAM J. Numer. Anal. 33, 1106–1124 (1996)
Gaevskaya, A., Hoppe, R.H.W., Iliash, Y., Kieweg, M.: Convergence analysis of an adaptive finite element method for distributed control problems with control constraints. In: Control of coupled partial differential equations, Internat. Ser. Numer. Math., vol. 155, pp. 47–68. Birkhäuser, Basel (2007)
Grisvard, P.: Singularities in Boundary Value Problems. Masson, Paris and Springer-Verlag, Berlin (1992)
He, L.H., Zhou, A.H.: Convergence and complexity of adaptive finite element methods for elliptic partial differential equations. Int. J. Numer. Anal. Model. 8(4), 615–640 (2011)
Hintermüller, M., Hoppe, R.H.W.: Goal-oriented adaptivity in control constrained optimal control of partial differential equations. SIAM J. Control Optim. 47(4), 1721–1743 (2008)
Hintermüller, M., Hoppe, R.H.W., Iliash, Y., Kieweg, M.: An a posteriori error analysis of adaptive finite element methods for distributed elliptic control problems with control constraints, ESAIM: Control Optim. Calc. Var. 14, 540–560 (2008)
Hintermüller, M., Ito, K., Kunisch, K.: The primal-dual active set strategy as a semismooth Newton method. SIAM J. Optim. 13, 865–888 (2003)
Hinze, M.: A variational discretization concept in control constrained optimization: the linear-quadratic case. Comput. Optim. Appl. 30, 45–63 (2005)
Hinze, M., Pinnau, R., Ulbrich, M., Ulbrich, S.: Optimization with PDE constraints. In: Math. Model. Theo. Appl., vol. 23. Springer, New York (2009)
Kohls, K., Rösch, A., Siebert, K.G.: A posteriori error analysis of optimal control problems with control constraints. SIAM J. Control Optim. 52, 1832–1861 (2014)
Kohls, K., Rösch, A., Siebert, K.G.: Convergence of adaptive finite elements for optimal control problems with control constraints. In: Leugering, G. et al. (eds.) Trends in PDE Constrained Optimization, Internat. Ser. Numer. Math., vol. 165, pp. 403–419 (2015)
Li, R., Liu, W.B., Ma, H.P., Tang, T.: Adaptive finite element approximation for distributed elliptic optimal control problems. SIAM J. Control Optim. 41(5), 1321–1349 (2002)
Lions, J.L.: Optimal Control of Systems Governed by Partial Differential Equations. Springer-Verlag, Berlin (1971)
Liu, W.B., Yan, N.N.: A posteriori error analysis for convex distributed optimal control problems. Adv. Comput. Math. 15(1–4), 285–309 (2001)
Liu, W.B., Yan, N.N.: A posteriori error estimates for convex boundary control problems. SIAM J. Numer. Anal. 39(1), 73–99 (2001)
Liu, W.B., Yan, N.N.: A posteriori error estimates for optimal problems governed by Stokes equations. SIAM J. Numer. Anal. 40, 1850–1869 (2003)
Liu, W.B., Yan, N.N.: A posteriori error estimates for optimal control problems governed by parabolic equations. Numer. Math. 93, 497–521 (2003)
Liu, W.B., Yan, N.N.: Adaptive Finite Element Methods for Optimal Control Governed by PDEs. Science Press, Beijing (2008)
Mekchay, K., Nochetto, R.H.: Convergence of adaptive finite element methods for general second order linear elliplic PDEs. SIAM J. Numer. Anal. 43, 1803–1827 (2005)
Morin, P., Nochetto, R.H., Siebert, K.G.: Data oscillation and convergence of adaptive FEM. SIAM J. Numer. Anal. 38, 466–488 (2000)
Morin, P., Nochetto, R.H., Siebert, K.G.: Convergence of adaptive finite element methods. SIAM Rev. 44, 631–658 (2002)
Stevenson, R.: Optimality of a standard adaptive finite element method. Found. Comput. Math. 7, 245–269 (2007)
Stevenson, R.: The completion of locally refined simplicial partitions created by bisection. Math. Comput. 77, 227–241 (2008)
Verfürth, R.: A Review of a Posteriori Error Estimates and Adaptive Mesh Refinement Techniques. Wiley-Teubner, New York (1996)
Acknowledgments
The authors would like to thank two anonymous referees for their careful reviews and many helpful suggestions. The first author was supported by the National Basic Research Program of China under Grant 2012CB821204 and the National Natural Science Foundation of China under Grants 11201464 and 91530204. The second author acknowledged the support of the National Natural Science Foundation of China under Grants 11171337 and 91530204.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Gong, W., Yan, N. Adaptive finite element method for elliptic optimal control problems: convergence and optimality. Numer. Math. 135, 1121–1170 (2017). https://doi.org/10.1007/s00211-016-0827-9
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00211-016-0827-9