Abstract
In this paper, a space-time discontinuous Galerkin finite element method for distributed optimal control problems governed by the unsteady diffusion-convection-reaction equation without control constraints is studied. Time discretization is performed by the discontinuous Galerkin method with piecewise constant and piecewise linear polynomials, while the symmetric interior penalty Galerkin method with upwinding is used for space discretization. We present numerical results to evaluate the performance of the method.
Keywords
- Optimal Control Problem
- Discontinuous Galerkin Method
- Adjoint Equation
- Multiple Shooting Method
- Symmetric Interior Penalty
1 Introduction
Optimal control problems (OCPs) governed by diffusion-convection-reaction equations arise in environmental control problems, optimal control of fluid flow and in many other applications. It is well known that the standard Galerkin finite element discretization produces non-physical oscillations in the solution when convection dominates. For steady state OCPs, stable and accurate numerical solutions can be achieved by various effective stabilization techniques such as the streamline upwind/Petrov Galerkin (SUPG) finite element method [10], the local projection stabilization [4] and the edge stabilization [19]. Recently, discontinuous Galerkin (dG) methods have gained importance due to their better convergence behaviour, local mass conservation, flexibility in approximating rough solutions on complicated meshes, mesh adaptation and weak imposition of the boundary conditions in OCPs (see, e.g., [21, 22, 36, 37]).
In recent years, much effort has been spent on parabolic OCPs (see, e.g., [2, 24]). However, there are few publications dealing with OCPs governed by nonstationary diffusion-convection-reaction equations. In many publications, conforming finite elements are used for space discretization. In [16, 17], a priori error estimates are derived for the characteristic finite element method, whereas for time discretization, the backward Euler method is used. Crank-Nicolson time discretization is applied to the OCP of the diffusion-convection equation in [5]. In the study of Chrysafinos [7], a priori error estimates for an unconstrained parabolic OCP, where conforming finite elements are combined with dG time discretization, are presented by decoupling the optimality system. In [17], error analysis concerning the characteristic finite element solution of the OCP with control constraints is discussed. The local dG approximation of the OCP which is discretized by backward Euler in time is derived in [38] and a priori error estimates for the semi-discrete OCP are provided in [30]. We present an a priori error analysis for the SIPG discretization combined with the backward Euler method in [1].
In this paper, we solve the OCP governed by the diffusion-convection-reaction equation without control constraints by applying the symmetric interior penalty Galerkin (SIPG) method in space and dG discretization in time [14]. We adapt the error analysis of [7] to the space-time dG discretization. For this purpose, we divide the error analysis into three parts as in [17] and use the error estimates for dG bilinear forms.
Discontinuous Galerkin discretization schemes have the pleasant property that discretization and dualization interchange, i.e. discretization and optimization commute. There are two different approaches for solving OCPs: optimize-then-discretize (OD) and discretize-then-optimize (DO). In the OD approach, the infinite dimensional optimality system, containing the state equation, the adjoint equation and the variational inequality, is derived first. Then, the optimality system is discretized by using a suitable discretization method in space and time. In the DO approach, the infinite dimensional OCP is discretized and then the finite-dimensional optimality system is derived. The OD and DO approaches do not commute in general for OCPs governed by diffusion-convection-reaction equations [10]. However, commutativity is achieved in the case of SIPG discretization for steady state problems [21, 36]. Recently, dG time discretization has been applied to PDE constrained optimization problems, which are solved by the indirect multiple shooting method [18]. The multiple shooting method was formulated in function spaces and discretized afterwards, where dG time discretization commutes for both approaches. The spatial mesh was adapted at each constant time step using a dual weighted residual error estimate.
The rest of the paper is organized as follows. In Sect. 2, we define the model problem and then derive the optimality system. In Sect. 3, dG discretization and the semi-discrete optimality system follow. In Sect. 4, the fully discrete optimality systems, obtained by the space-time dG method and the \(\theta\)-scheme, are presented. Under dG time discretization, we show that the OD and DO approaches commute for time-dependent problems, too. In Sect. 5, some auxiliary results accompanied by a priori error estimates for the fully discrete optimality system follow. In Sect. 6, numerical results are shown in order to evaluate the performance of the suggested method. Additionally, we give some numerical results for the \(\theta\)-method and compare them with the dG time discretization. The paper ends with some conclusions.
2 The Optimal Control Problem
We consider the following distributed optimal control problem governed by the unsteady diffusion-convection-reaction equation
We adopt the standard notations for Sobolev spaces on computational domains and their norms. Let \(\varOmega\) be a bounded convex polygonal domain in \(\mathbb{R}^{2}\) with Lipschitz boundary \(\partial \varOmega\). The inner product in \(L^{2}(\varOmega )\) is denoted by \((\cdot,\cdot )\). The source function and the desired state are denoted by \(f \in L^{2}(0,T;L^{2}(\varOmega ))\) and \(y_{d} \in L^{2}(0,T;L^{2}(\varOmega ))\), respectively. The initial condition is \(y_{0}(x) \in H_{0}^{1}(\varOmega )\). The diffusion and the reaction coefficients are \(\epsilon > 0\) and \(r \in L^{\infty }(\varOmega )\), respectively. The velocity field \(\boldsymbol{\beta }\in (W^{1,\infty }(\varOmega ))^{2}\) satisfies the incompressibility condition, i.e. \(\nabla \cdot \boldsymbol{\beta } = 0\). Furthermore, we assume the existence of a constant \(C_{0}\) such that \(r \geq C_{0}\) a.e. in \(\varOmega\) so that the well-posedness of the OCP (1) is guaranteed. The trial and the test spaces are \(Y = V = H_{0}^{1}(\varOmega ),\;\forall t \in (0,T]\).
It is well known that the functions \((y,u) \in H^{1}(0,T;L^{2}(\varOmega )) \cap L^{2}(0,T;Y ) \times L^{2}(0,T;L^{2}(\varOmega ))\) solve (1) if and only if there is an adjoint \(p \in H^{1}(0,T;L^{2}(\varOmega )) \cap L^{2}(0,T;Y )\) such that (y, u, p) is the unique solution of the following optimality system [23, 32],
with the bilinear form
where \((\cdot,\cdot )\) is the inner product in \(L^{2}(\varOmega )\).
3 Discontinuous Galerkin Semidiscretization
Let \(\{\mathcal{T}_{h}\}_{h}\) be a family of shape regular meshes such that \(\overline{\varOmega } = \cup _{K\in \mathcal{T}_{h}}\overline{K}\), \(K_{i} \cap K_{j} =\emptyset\) for \(K_{i},K_{j} \in \mathcal{T}_{h}\), \(i\neq j\). The diameter of an element \(K\) is denoted by \(h_{K}\). The maximum diameter is \(h =\max \limits _{K\in \mathcal{T}_{h}}h_{K}\). In addition, the length of an edge \(E\) is denoted by \(h_{E}\).
In this paper, we consider discontinuous piecewise finite element spaces to define the discrete test, state and control spaces
Here, \(\mathbb{P}^{p}(K)\) denotes the set of all polynomials on \(K \in \mathcal{T}_{h}\) of degree at most \(p\).
We split the set of all edges \(\mathcal{E}_{h}\) into the set \(\mathcal{E}_{h}^{0}\) of interior edges and the set \(\mathcal{E}_{h}^{\partial }\) of boundary edges so that \(\mathcal{E}_{h} = \mathcal{E}_{h}^{\partial } \cup \mathcal{E}_{h}^{0}\). Let n denote the unit outward normal to ∂ Ω. We define the inflow boundary
and the outflow boundary \(\varGamma ^{+} = \partial \varOmega \setminus \varGamma ^{-}\). The boundary edges are decomposed into edges \(\mathcal{E}_{h}^{-} = \left \{E \in \mathcal{E}_{h}^{\partial }\,:\ E \subset \varGamma ^{-}\right \}\) that correspond to inflow boundary and edges \(\mathcal{E}_{h}^{+} = \mathcal{E}_{h}^{\partial }\setminus \mathcal{E}_{h}^{-}\) that correspond to outflow boundary. The inflow and outflow boundaries of an element \(K \in \mathcal{T}_{h}\) are defined by
where n K is the unit normal vector on the boundary ∂ K of an element K.
Let the edge \(E\) be a common edge of two elements \(K\) and \(K^{e}\). For a piecewise continuous scalar function \(y\), there are two traces of \(y\) along \(E\), denoted by \(y\vert _{E}\) from the interior of \(K\) and \(y^{e}\vert _{E}\) from the interior of \(K^{e}\). Then, the jump and average of \(y\) across the edge \(E\) are defined by:
Similarly, for a piecewise continuous vector field ∇y, the jump and average across an edge E are given by
For a boundary edge \(E \in K\cap \varGamma\), we set \(\left \{\!\!\left \{\nabla y\right \}\!\!\right \} = \nabla y\) and \(\left [\!\left [y\right ]\!\right ] = y\mathbf{n}\) where n is the outward normal unit vector on Γ.
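For orientation, the convention most commonly used in the SIPG literature is sketched here. It is stated as an assumption rather than as a quotation, chosen because it is consistent with the boundary-edge rule above; \(\mathbf{n}_{K}\) and \(\mathbf{n}_{K^{e}}\) denote the unit outward normals of \(K\) and \(K^{e}\) on \(E\).

\[
\left [\!\left [y\right ]\!\right ] = y\vert _{E}\,\mathbf{n}_{K} + y^{e}\vert _{E}\,\mathbf{n}_{K^{e}},\qquad
\left \{\!\!\left \{y\right \}\!\!\right \} = \tfrac{1}{2}\left (y\vert _{E} + y^{e}\vert _{E}\right ),
\]
\[
\left [\!\left [\nabla y\right ]\!\right ] = \nabla y\vert _{E}\cdot \mathbf{n}_{K} + \nabla y^{e}\vert _{E}\cdot \mathbf{n}_{K^{e}},\qquad
\left \{\!\!\left \{\nabla y\right \}\!\!\right \} = \tfrac{1}{2}\left (\nabla y\vert _{E} + \nabla y^{e}\vert _{E}\right ).
\]

With this convention the jump of a scalar is a vector, the jump of a vector field is a scalar, and both jumps vanish for functions that are continuous across \(E\), since \(\mathbf{n}_{K} = -\mathbf{n}_{K^{e}}\) on an interior edge.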
The state equation (1) in space for fixed control u is discretized by the symmetric interior penalty method (SIPG). The convective term is discretized by the upwind method [3]. This leads to the following semi-discrete state equation
with the (bi-)linear forms
and
The penalty parameter \(\sigma > 0\) should be sufficiently large to ensure the stability of the dG discretization [26, Sect. 2.7.1] with a lower bound depending only on the polynomial degree.
Let \(f_{h}\), \(y_{h}^{d}\) and \(y_{h}^{0}\) be approximations of the source function \(f\), the desired state function \(y_{d}\) and the initial condition \(y_{0}\), respectively. Then, the semi-discrete approximation of the OCP (2) can be defined as follows:
The semi-discrete optimality system is written as follows:
where
4 Time Discretization of the Optimal Control Problem
In this section, we derive the fully-discrete optimality system using \(\theta\)-method and dG time stepping method [14]. The fully discrete optimality systems are compared for optimize-then-discretize (OD) and discretize-then-optimize (DO) approaches.
4.1 Time Discretization Using θ-Method
Let \(0 = t_{0} < t_{1} < \cdots < t_{N_{T}} = T\) be a subdivision of \(I = (0,T)\) with time intervals \(I_{m} = (t_{m-1},t_{m}]\) and time steps \(k_{m} = t_{m} - t_{m-1}\) for \(m = 1,\ldots,N_{T}\) and \(k =\max _{1\leq m\leq N_{T}}k_{m}\).
We start with OD approach, so we discretize the semi-discrete optimality system (11) using \(\theta\)-method as follows:
In DO approach, the first and the second parts of the cost functional are discretized by the rectangle rule and the trapezoidal rule, respectively, so that the value of the adjoint at the final time becomes zero as in [29]. Then, we have the following fully-discrete OCP:
where M is the mass matrix.
Now, we construct the discrete Lagrangian
By differentiating the Lagrangian (13), we derive the fully-discrete optimality system
In the case of the backward Euler method (\(\theta = 1\)), the value \(u_{h,0}\) is not needed, as we observe from (14). As we mentioned before, approximating the first integral in the cost functional by using the rectangle rule leads to \(p_{h,N} = 0\), \(u_{h,N} = 0\), as we see from (14). Due to SIPG, we obtain \(a_{h}^{s}(\psi _{\delta },p_{\delta }) = a_{h}^{a}(p_{\delta },\psi _{\delta })\) [36]. Therefore, variational formulations (12) and (14) are the same.
In the case of Crank-Nicolson method (\(\theta = 1/2\)), we observe that some differences occur in the adjoint equation. In (12), the right-hand side of the adjoint equation is evaluated at two successive points, while it is evaluated at just one point in (14). Additional differences are seen in the variational inequalities (12) and (14), too. Thus, OD and DO approaches lead to different weak forms. Several variants of Crank-Nicolson method are used for optimal control of heat equation in [2]. For DO approach, the cost functional is discretized by using the midpoint rule. On the other hand, for OD approach, the semi-discrete state equation is discretized by using the midpoint rule and a variant of the trapezoidal rule is applied to the semi-discrete adjoint equation to obtain the fully-discrete optimality system. Then, the resulting optimality systems commute.
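The role of \(\theta\) can be illustrated on a scalar model problem. The following is a minimal sketch, not the scheme (12): it applies the \(\theta\)-method to the hypothetical test equation \(y'(t) + a\,y(t) = f(t)\), where the scalar `a` stands in for the spatial operator, and estimates the observed order by halving the step. The function names are illustrative.

```python
import math


def theta_step(y_prev, t_prev, k, a, f, theta):
    """Advance y' + a*y = f(t) by one step of size k with the theta-method."""
    t_new = t_prev + k
    # Explicit part carries weight (1 - theta), implicit part carries theta.
    rhs = (1.0 - (1.0 - theta) * k * a) * y_prev
    rhs += k * (theta * f(t_new) + (1.0 - theta) * f(t_prev))
    return rhs / (1.0 + theta * k * a)


def theta_solve(y0, t_end, n_steps, a, f, theta):
    """Integrate from t = 0 to t_end with n_steps uniform steps."""
    k = t_end / n_steps
    y = y0
    for m in range(n_steps):
        y = theta_step(y, m * k, k, a, f, theta)
    return y


def observed_rate(a, theta, n_coarse=20):
    """Observed order from halving k for y' + a*y = 0, y(0) = 1, T = 1."""
    exact = math.exp(-a)
    zero = lambda t: 0.0
    e_coarse = abs(theta_solve(1.0, 1.0, n_coarse, a, zero, theta) - exact)
    e_fine = abs(theta_solve(1.0, 1.0, 2 * n_coarse, a, zero, theta) - exact)
    return math.log(e_coarse / e_fine, 2)


print("backward Euler (theta = 1):  ", observed_rate(2.0, 1.0))
print("Crank-Nicolson (theta = 1/2):", observed_rate(2.0, 0.5))
```

On this test problem the observed rates are close to one for \(\theta = 1\) and close to two for \(\theta = 1/2\). The sketch covers only the forward (state) equation; it says nothing about how the adjoint equation or the cost functional is discretized, which is where the OD and DO approaches differ.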
4.2 Discontinuous Galerkin Time Discretization
We define the space-time finite element space of piecewise discontinuous functions for test function, state and control as
We define the temporal jump of \(v \in V_{h,p}^{k,q}\) as \([v]_{m} = v_{+}^{m} - v_{-}^{m}\), where \(v_{\pm }^{m} =\lim \limits _{\varepsilon \rightarrow 0\pm }v(t_{m}+\varepsilon )\).
Let \(f_{\delta }\) and \(y_{\delta }^{d}\) be approximations of the source function \(f\) and the desired state function \(y_{d}\) on each interval \(I_{m}\). Then, the fully-discrete OCP is written as
The OCP (15) has a unique solution \((y_{\delta },u_{\delta })\), and the pair \((y_{\delta },u_{\delta }) \in V_{h,p}^{k,q} \times U_{h,p}^{k,q}\) is the solution of (15) if and only if there is an adjoint \(p_{\delta } \in V_{h,p}^{k,q}\) such that \((y_{\delta },u_{\delta },p_{\delta }) \in V_{h,p}^{k,q} \times U_{h,p}^{k,q} \times V_{h,p}^{k,q}\) is the unique solution of the fully-discrete optimality system
In DO approach, firstly, we construct the discrete Lagrangian
Differentiating \(\mathcal{L}\) with respect to y δ and applying integration by parts, we obtain
Now, we add and subtract \((\psi _{\delta,N_{T}}^{-},p_{\delta,N_{T}}^{+})\) to (17) and obtain
On each subinterval I m , the adjoint equation reads as
However, the term \((\psi _{\delta,N_{T}}^{-},p_{\delta,N_{T}}^{+})\) does not match the right-hand side of (18), so it is set to zero, i.e. \(p_{\delta,N_{T}}^{+} = 0\). Now, we use \(a_{h}^{s}(\psi _{\delta },p_{\delta }) = a_{h}^{a}(p_{\delta },\psi _{\delta })\). Thus, we arrive at (16). We note that the OD and DO approaches lead to the same optimality conditions, which can be observed by differentiating the discrete Lagrangian with respect to \(u_{\delta }\). Therefore, both approaches commute.
5 Error Estimates
In this section, firstly, we give the norms used in the analysis and mention some estimates in the literature. Secondly, the discrete characteristic function which enables us to provide error estimates at arbitrary time points is explained. Then, we prove some useful lemmas and state the main estimate of this study.
We introduce the \(L^{2}\) inner product on the inflow or outflow boundaries as follows
with analogous definition of \((\cdot,\cdot )_{\varGamma ^{+}}\) and associated norms \(\|\cdot \|_{\varGamma ^{-}}\) and \(\|\cdot \|_{\varGamma ^{+}}\).
The broken Sobolev space is defined as
with the semi-norm defined by
The Bochner space of functions whose kth time derivative is bounded almost everywhere on (0, T) with values in X is denoted by \(W^{k,\infty }(0,T;X)\). We use the dG energy norm in [33, Sect. 4]
We give the multiplicative trace inequality for all \(K \in \mathcal{T}_{h}\) and for all \(v \in H^{1}(K)\) as follows:
where \(C_{M}\) is a positive constant independent of \(v\), \(h\) and \(K\). We refer the reader to [12, Lemma 3.1] for the proof.
In addition, the generalization of Poincaré inequality to the broken Sobolev space \(H^{1}(\varOmega,\mathcal{T}_{h})\) is given as [26, Sect. 3.1.4]
We proceed with the standard estimates derived for finite element methods [9]. Consider the \(L^{2}\)-projection \(\varPi _{h}: L^{2}(\varOmega ) \rightarrow V _{h,p}\) so that
for all \(v \in H^{p+1}(K)\), \(K \in \mathcal{T}_{h}\), where \(C_{\varPi }\) is a positive constant independent of \(v\) and \(h\). In addition, as suggested in [33, Sect. 4], using the study [13], the following estimate holds for all \(v \in H^{p+1}(\varOmega,\mathcal{T}_{h})\)
where \(C_{M}\) and \(C_{\varPi }\) are the positive constants from (20) and (22), respectively. In the following, we introduce the parabolic projection for \(m = 0,\ldots,N_{T}\) and mention the properties given in [33]. Suppose that \(X \subset L^{2}(\varOmega )\) is a Hilbert space. Let us denote the space of polynomial functions depending on time as follows:
A space-time projection \(\pi\) of \(y \in C(0,T;H^{1}(\varOmega ))\) into \(V_{h,p}^{k,q}\) is employed for the convergence estimates. The time projection \(P\) of \(y \in C(0,T;H^{1}(\varOmega ))\) is defined as
In addition, for \(m = 0,\ldots,N_{T}\), with \(y \in C(0,T;H^{1}(\varOmega ))\), \(\pi y \in V_{h,p}^{k,q}\) is defined as
We note that the projection \(\pi\) is defined analogously to the one in [28].
We give some estimates from [33, Lemmas 4.3, 4.5], which we need in the proofs.
Lemma 1
Suppose that \(y \in W^{q+1,\infty }(I_{m},H^{1}(\varOmega ))\) such that y = 0 on ∂Ω. Then,
Lemma 2
Suppose that \(y \in W^{q+1,\infty }(I_{m},H^{1}(\varOmega )) \cap L^{\infty }(I_{m},H^{p+1}(\varOmega ))\) such that y = 0 on ∂Ω. Then,
where \(\|y\|_{R} =\max (\vert y\vert _{W^{q+1,\infty }(I_{m},H^{1}(\varOmega ))},\vert y\vert _{L^{\infty }(I_{m},H^{p+1}(\varOmega ))})\) and \(C_{\pi }\) is a positive constant independent of \(h\), \(k_{m}\), \(m\) and \(y\).
Lemma 3
There exists a positive constant \(C_{A}\) which is independent of \(h\), \(v_{h}\), \(w_{h}\) and \(\epsilon\) such that
Proof
The proof in [11, Lemma 3.8] is adapted to the bilinear form (7) using the estimate (23). ⊓⊔
Remark 1
A similar estimate for the bilinear form arising from the nonsymmetric interior penalty Galerkin method can be found in [33, Lemma 4.2].
Lemma 4
The bilinear form \(a_{d}(\cdot,\cdot )\) satisfies the coercivity inequality
Proof
The proof in [11, Corollary 3.10] is adapted to the bilinear form (7) using the norm (19). ⊓⊔
5.1 Discrete Characteristic Function
We use the discrete characteristic function in order to provide error estimates at arbitrary time points as suggested in [8]. We can work on \([0,k)\) instead of \(I_{m}\), since the construction of the discrete characteristic function is invariant under translation. We consider polynomials \(s \in \mathcal{P}_{q}(0,k)\) and the discrete approximation of \(\chi _{[0,t)}s\), which is a polynomial
This definition can be extended from \(\mathcal{P}_{q}(0,k)\) to \(V_{h,p}^{k,q}\). The discrete approximation of \(\chi _{[0,t)}v\) for \(v \in V_{h,p}^{k,q}\) is written as \(\tilde{v} =\sum _{ i=0}^{q}\tilde{s}_{i}(t)v_{i}\). On account of these inequalities, the following estimate is given in [33]
We mention that a suitable discrete approximation \(\chi _{(t,t^{n}]}v_{h}\) must be constructed for the adjoint problem, as it is noted in the proof of [7, Theorem 3.8]. The discrete approximation of \(\chi _{ (t,t^{\,N_{T}}]}s\) is a polynomial
\(\forall z \in \mathcal{P}_{q-1}(t^{\,N_{T}-1},t^{\,N_{T}})\). This definition can be extended from \(\mathcal{P}_{q}(t^{\,N_{T}-1},t^{\,N_{T}})\) to \(V_{h,p}^{k,q}\) and the estimates above can be modified for the adjoint [7, Theorem 3.8].
5.2 A Priori Error Estimates
We proceed with the derivation of the convergence estimates for the optimality system and its space-time dG approximation. We define the auxiliary state and adjoint equation which are needed for a priori error analysis
Following [15], we assume that the reaction term satisfies \(\vert r\vert \leq C_{r}\) a.e. in \(\varOmega\) and that the velocity field is bounded by a constant \(C_{\boldsymbol{\beta }}\) a.e. in \(\varOmega\).
We prove some useful lemmas before stating the main theorem of this study.
Lemma 5
Let \((y_{\delta },p_{\delta })\) and \((y_{\delta }^{u},p_{\delta }^{u})\) be the solutions of (16) and (30), respectively. Then, there exists a constant \(C\) independent of \(h\) and \(k\) such that
Proof
Firstly, we study the fully discrete state equation on each subinterval \(I_{m}\). We subtract (16) from (30) to obtain
where \(\theta = y_{\delta }^{u} - y_{\delta }\). We substitute \(v_{\delta } = 2\theta\) in (32). Then,
is achieved. For the right-hand side, we employ the Cauchy-Schwarz and Young inequalities, the Poincaré inequality (21) and the definition of the dG norm (19). For the left-hand side, we use (28) for the diffusion term and follow the technique in [15, Theorem 5.1] for the convection and reaction terms. Then, we derive the following estimate in the middle of (34)
We note that the lower bound on the left-hand side of (34) has been added after deriving the estimate in the middle for clarity of the proof and will be used later. Now, we proceed by substituting \(v_{\delta } = 2\tilde{\theta }\) into (32). We employ the discrete characteristic function as in the proof of [33, Theorem 5.2] to obtain an estimate at arbitrary points and use the properties given there. With \(z =\arg \sup _{\bar{I}_{m}}\|\theta (t)\|\), the discrete characteristic function defined in Sect. 5.1 leads to
We use (35) and (36) and the inequality \(\|\theta _{-}^{m-1}\| \leq \sup _{t\in I_{m-1}}\|\theta (t)\|\) to bound the terms arising in the time derivative. We proceed by moving \(2\int _{I_{m}}a_{h}(\theta,\tilde{\theta })\;dt\) to the right-hand side. We employ (27) for the diffusion term and the proof of [15, Theorem 5.1] for the convection term. The reaction term and the control on the right-hand side are bounded by using the Cauchy-Schwarz and Young inequalities together with (21) and (19), so that \(\|\cdot \|^{2} \leq C\vert \vert \vert \cdot \vert \vert \vert _{DG}^{2}\) is satisfied for a positive constant \(C\). We eliminate the term \(\vert \vert \vert \tilde{\theta }\vert \vert \vert _{DG}^{2}\) on the right-hand side by using (29). Then, we obtain the following inequality
where \(C_{b} = C(1 + C_{D})(\epsilon C_{A} + C_{S}(C_{r} + C_{\boldsymbol{\beta }})),C_{b}^{{\prime}} =\max \{ 1,C_{b}\}\). In order to eliminate the terms \(\theta\) on the right-hand side of (37), we use (34) multiplying it by \(C_{b}^{{\prime\prime}} = \frac{2} {\epsilon } C_{b}^{{\prime}}\). By adding these inequalities and denoting \(\varTheta _{m}\ =\ \sup _{t\in I_{m}}\|\theta (t)\|^{2} + C_{b}^{{\prime\prime}}\|\theta _{-}^{m}\|^{2}\), we arrive at
We sum (38) over \(m = 1,\ldots,n \leq N_{T}\) and use \(\theta = 0\) at \(t = 0\) to derive the estimate
Secondly, we proceed with the adjoint equation, subtracting (16) from (30) and using \(\zeta = p_{\delta }^{u} - p_{\delta }\). A discrete approximation to \(\chi _{(t,t_{m}]}v_{h}\) specified for the adjoint problem must be used, as we discussed in Sect. 5.1. Then, this leads to
where \(z =\arg \sup _{\bar{I}_{m}}\|\zeta (t)\|\). In addition, the inequalities \(\|\zeta ^{m}\|^{2} \leq \sup _{I_{N_{ T}-m+2}}\|\zeta (t)\|^{2}\) and \(\|\zeta (z)\|^{2} =\sup _{I_{N_{ T}-m+1}}\|\zeta (t)\|^{2}\) are needed. Then, we follow the same idea used to derive (39) to reach the inequality
We sum (41) over \(m = N_{T},\ldots,n \geq 1\) and use \(\zeta = 0\) at \(t = t_{N_{T}}\). The final result (31) follows from standard algebra, (39) and (41). ⊓⊔
We proceed with the estimate between the exact and the approximate control.
Lemma 6
Let \((y,p,u)\) and \((y_{\delta },p_{\delta },u_{\delta })\) be the solutions of (2) and (16), respectively. Then, we have
Proof
We apply the technique used for the steady-state optimal control problem in [21, Sect. 4.2]. We start using the continuous and the fully-discrete optimality conditions (2)–(16) to obtain the following equation
We use Cauchy-Schwarz and Young inequalities to show that
We proceed with \(J_{2}\) and use the auxiliary state equation (30) to obtain
We proceed by applying integration by parts in time and use the auxiliary adjoint equation (30) to arrive at
Then, using (43)–(45), we derive the final result (42). ⊓⊔
Lemma 7
Let \((y,p)\) and \((y_{\delta }^{u},p_{\delta }^{u})\) be the solutions of (2) and (30), respectively. Assume that \(y,p \in W^{q+1,\infty }(0,T;H^{1}(\varOmega )) \cap L^{\infty }(0,T;H^{p+1}(\varOmega ))\). Then, there exists a constant \(C\) independent of \(h\) and \(k\) such that
Proof
Firstly, we integrate (2) over \(I_{m}\) and subtract the result from (30) in order to obtain the following equation
where \(y - y_{\delta }^{u} = (y -\pi y) + (\pi y - y_{\delta }^{u}) =\eta +\xi\).
Since we use the same mesh on each time interval, (24) leads to the following identity.
We proceed as in the proof of Lemma 5 and the proof of [15, Theorem 5.1] by inserting the estimate (26) to obtain
where \(\vert y\vert _{R} =\max (\vert y\vert _{W^{q+1,\infty }(I_{m};H^{1}(\varOmega ))},\vert y\vert _{L^{\infty }(I_{m};H^{p+1}(\varOmega ))})\).
Firstly, we shall substitute \(v_{\delta } = 2\xi\) into (49) to obtain
where \(C_{b} =\max \{ C_{A}C_{\pi },2C_{\boldsymbol{\beta }}C_{\pi }C_{M},C_{\pi }\frac{C_{\boldsymbol{\beta }}C_{r}} {C_{0}} \}\).
Secondly, we substitute \(v_{\delta } = 2\tilde{\xi }\) into (49) to obtain
where \(C_{b}^{{\prime}} = C(1 + C_{D})(\epsilon C_{A} + C_{S}(C_{\boldsymbol{\beta }} + C_{r})),C_{b}^{{\prime\prime}} =\max \{ 1,C_{b}^{{\prime}}\}\). Now, we proceed as in the proof of Lemma 5. We multiply (50) by \(C_{b}^{{\prime\prime\prime}} = \frac{2} {\epsilon } C_{b}^{{\prime\prime}}\) in order to eliminate the terms \(\xi\) on the right-hand side of (51). Then, we add it to (51) and denote \(\varTheta _{m} =\sup _{t\in I_{m}}\|\xi (t)\|^{2} + C_{b}^{{\prime\prime\prime}}\|\xi _{-}^{m}\|^{2}\) in order to obtain
We sum (52) over \(m = 1,\ldots,n \leq N_{T}\) to obtain
Thirdly, we integrate (2) over \(I_{m}\), subtract it from (30) and denote \(p - p_{\delta }^{u} = (p -\pi p) + (\pi p - p_{\delta }^{u}) =\varphi +\mu\). Then, we use the idea in the proof of (53) in order to derive
for \(C > 0\). The resulting inequality is summed over \(m = N_{T},\ldots,n \geq 1\). Then, it is combined with (53) to derive the final result (46). ⊓⊔
Remark 2
For guaranteeing the assumptions on the exact solution, it is necessary to require a higher regularity of the data of the problem.
We state the main estimate of this study by combining Lemmas 5, 6, and 7.
Theorem 1
Suppose that \((y,p,u)\) and \((y_{\delta },p_{\delta },u_{\delta })\) are the solutions of (2) and (16), respectively. We assume that all conditions of Lemmas 5, 6 and 7 are satisfied. Then, there exists a constant \(C\) independent of \(h\) and \(k\) such that
In Theorem 1, the errors in the state and the control are measured with respect to the norms of \(L^{\infty }(0,T;L^{2}(\varOmega ))\) and \(L^{2}(0,T;L^{2}(\varOmega ))\), respectively. The same norms are used, for example, in the study of Fu [16]. The former norm is due to the discrete characteristic function, which is used to provide error estimates at arbitrary time points. The latter norm arises from the optimality condition, as shown in Lemma 6. On the other hand, we observe that the estimate in Theorem 1 is optimal in time and suboptimal in space in the \(L^{\infty }(0,T;L^{2}(\varOmega ))\) norm for the state and the \(L^{2}(0,T;L^{2}(\varOmega ))\) norm for the control, i.e. \(\mathcal{O}(h^{p},k^{q+1})\), using p-degree spatial, q-degree temporal polynomial approximation. However, for example, the optimal spatial convergence rate for the SIPG discretization combined with backward Euler is achieved using an elliptic projection in [1]. The first reason behind the order reduction in this study is the estimate (26) for the space-time projection, which is employed to bound the continuity estimate of the bilinear form in Lemma 3. The convection term also has an influence on the spatial order reduction since we follow the proof of [15, Theorem 5.1]. After eliminating the effect of the space-time projection in the bilinear form of the diffusion term, this suboptimal estimate can be improved as in [1].
6 Numerical Results
In this section, we present some numerical results. We measure the error in the state and the control in terms of the \(L^{\infty }(0,1;L^{2}(\varOmega ))\) and \(L^{2}(0,1;L^{2}(\varOmega ))\) norms, respectively. We have used discontinuous piecewise linear polynomials in space. In all numerical examples, we have taken \(h = k\).
We note that, in the case of dG(0) method, the approximating polynomials are piecewise constant in time and the resulting scheme is a version of the backward Euler method with a modified right-hand side [31, Chap. 12]:
For dG(1) method, we use piecewise linear polynomials in time. The resulting linear system for the state on each time interval is given as follows [31, Chap. 12]:
where \(A^{s}\) and \(M\) are the stiffness and the mass matrices of the state equation, respectively. We derive the solution at the time step \(t_{m}\) as \(y_{h,m} = Y_{0} + Y_{1}\). For the adjoint equation, we have the following linear system:
where \(A^{a}\) is the stiffness matrix for the adjoint equation. We obtain the adjoint at the time step \(t_{m-1}\) as \(p_{h,m-1} = P_{0} + P_{1}\).
The main drawback of dG time discretization is the solution of large coupled linear systems in block form. Because we are using constant time steps, the coupled matrices on the right-hand side of (56) and (57) have to be decomposed (LU block factorization) at the beginning of the integration. Then, the state and the adjoint equations are solved at each time step by forward elimination and back substitution using the block factorized matrices.
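The structure of the dG(0) and dG(1) steps can be seen on a scalar model problem. The sketch below is an illustration under stated assumptions, not the systems (56) and (57): it treats the hypothetical test equation \(y'(t) + a\,y(t) = f(t)\), where the scalar `a` plays the role of the stiffness matrix and the mass matrix is 1. On each interval the dG(1) solution is written as \(Y_{0} + Y_{1}\tau\) with \(\tau = (t - t_{m-1})/k\), and testing with \(1\) and \(\tau\) gives a 2×2 system whose solution at \(t_{m}\) is \(Y_{0} + Y_{1}\). The load integrals are approximated by a two-point Gauss rule, which is a choice made here for brevity.

```python
import math


def load_moments(f, t0, k):
    """Two-point Gauss approximation of int f dt and int tau*f dt on (t0, t0+k]."""
    g = 0.5 / math.sqrt(3.0)
    taus = (0.5 - g, 0.5 + g)          # Gauss nodes mapped to [0, 1]
    vals = [f(t0 + tau * k) for tau in taus]
    m0 = 0.5 * k * (vals[0] + vals[1])
    m1 = 0.5 * k * (taus[0] * vals[0] + taus[1] * vals[1])
    return m0, m1


def dg0_step(y_prev, t0, k, a, f):
    """dG(0): backward Euler with the load replaced by its integral over the step."""
    m0, _ = load_moments(f, t0, k)
    return (y_prev + m0) / (1.0 + k * a)


def dg1_step(y_prev, t0, k, a, f):
    """dG(1): solve the coupled 2x2 system for (Y0, Y1) and return Y0 + Y1."""
    m0, m1 = load_moments(f, t0, k)
    a11, a12 = 1.0 + k * a, 1.0 + 0.5 * k * a      # test function 1
    a21, a22 = 0.5 * k * a, 0.5 + k * a / 3.0      # test function tau
    b1, b2 = y_prev + m0, m1
    det = a11 * a22 - a12 * a21                    # constant if k and a are fixed
    y0c = (b1 * a22 - a12 * b2) / det              # Cramer's rule
    y1c = (a11 * b2 - a21 * b1) / det
    return y0c + y1c


def integrate(step, y0, t_end, n_steps, a, f):
    """Apply `step` on n_steps uniform intervals of (0, t_end]."""
    k = t_end / n_steps
    y = y0
    for m in range(n_steps):
        y = step(y, m * k, k, a, f)
    return y


# Manufactured solution y(t) = sin(t) for a = 1.
rhs = lambda t: math.cos(t) + math.sin(t)
for n in (10, 20, 40):
    e0 = abs(integrate(dg0_step, 0.0, 1.0, n, 1.0, rhs) - math.sin(1.0))
    e1 = abs(integrate(dg1_step, 0.0, 1.0, n, 1.0, rhs) - math.sin(1.0))
    print(n, e0, e1)
```

In the scalar case the coupled system is solved in closed form. With constant time steps the coefficient matrix does not change from step to step, which is the scalar counterpart of factorizing the block matrix once and reusing the factors.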
Example 1
The first example is a convection dominated OCP with smooth solutions. It is converted to an unconstrained optimal control problem [17, Ex. 1] by adding the reaction term with
The source function f, the desired state y d and the initial condition y 0 are computed from (2) using the following exact solutions of the state and the control, respectively,
In Table 1, errors and convergence rates for the dG(0) and backward Euler methods are shown. We observe that the first order convergence rate is achieved in time, due to the dominance of temporal errors.
In Table 2, errors and convergence rates for the Crank-Nicolson method obtained by the OD and DO approaches are shown. For the Crank-Nicolson method with the OD approach, the second order convergence rate is achieved. However, for the DO approach, the discretization of the right-hand side of the adjoint equation (14) by a one-step method is reflected in the numerical results, and the quadratic order of convergence is not observed.
In Table 3, we present numerical results for dG(1) time discretization. The numerical results indicate a higher experimental order of convergence, namely \(\mathcal{O}(h^{2})\), than the one shown in Theorem 1, which is \(\mathcal{O}(h)\) with \(h = k\). The error in the state is smaller than for the Crank-Nicolson method with the OD approach, while the error in the control is close to that for the Crank-Nicolson method with the OD approach.
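Convergence rates of this kind are usually reported as experimental orders of convergence computed from the errors on two successive refinement levels. A minimal sketch of that computation follows; the helper name and the error values are hypothetical and are not taken from the tables of this paper.

```python
import math


def experimental_orders(mesh_sizes, errors):
    """Return log(e_i / e_{i+1}) / log(h_i / h_{i+1}) for successive levels."""
    if len(mesh_sizes) != len(errors):
        raise ValueError("mesh_sizes and errors must have equal length")
    orders = []
    for i in range(len(errors) - 1):
        ratio_e = errors[i] / errors[i + 1]
        ratio_h = mesh_sizes[i] / mesh_sizes[i + 1]
        orders.append(math.log(ratio_e) / math.log(ratio_h))
    return orders


# Made-up data: the error drops by a factor of 4 each time h = k is halved.
h = [1 / 8, 1 / 16, 1 / 32]
err = [4.0e-2, 1.0e-2, 2.5e-3]
print(experimental_orders(h, err))
```

Because \(h = k\) in all examples, a single refinement parameter is enough and the computed order mixes the spatial and temporal contributions.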
Example 2
The second example is a convection dominated OCP adapted from [16, Ex. 2] with
The source function f, the desired state y d and the initial condition y 0 are computed from (2) using the following exact solutions of the control and state, respectively,
where \(t_{x} = t - 0.5(x_{1} + x_{2})\). As opposed to the previous example, the exact solution of the PDE constraint depends on the diffusion explicitly and the problem is highly convection dominated. This example cannot be solved properly by using dG(0) and the backward Euler method. Therefore, we present numerical results for the Crank-Nicolson method in Table 4, where the differences between OD and DO can be seen clearly. The DO approach causes order reduction in the control. However, due to the convection dominated nature of the problem, the quadratic convergence rate cannot be achieved with the OD approach, in contrast to Example 1. The orders of convergence correspond to those in [5].
In Table 5, we present numerical results for dG(1) discretization. As opposed to the results in Table 4, the error in the state and the control are smaller than in the case of Crank-Nicolson. Numerical results indicate a better experimental order of convergence, namely \(\mathcal{O}(h^{2})\), than the theoretical error estimate in Theorem 1. Similar observations are made for nonstationary non-linear diffusion-convection equations for the SIPG spatial discretization in [20].
In Figs. 1 and 2, we present the error between the exact and the approximate solution at \(t = 0.5\) obtained using the Crank-Nicolson-DO approach and dG(1) discretization. These figures also show that dG(1) discretization solves the problem well.
7 Conclusion
For dG time discretization, the numerical results show that linear and quadratic convergence rates are achieved using piecewise discontinuous constant and linear polynomials in time, respectively, and that the DO and OD approaches commute. In future work, we will study control-constrained problems and derive the optimal convergence rates under lower regularity assumptions.
8 Outlook: Efficient Solvers for DG Time Discretization
Discontinuous Galerkin time stepping is used for solving linear and nonlinear OCPs by multiple shooting methods in [6, 18] because of the commutativity property of discretization and optimization. At each subinterval of multiple shooting, a very large system of linear or nonlinear equations has to be solved, which can be handled by iterative methods, such as Krylov subspace methods. In the references mentioned above, the first order dG(0) method is used, where for nonlinear problems at each Newton iteration step, a linear system of equations with the same structure as for the implicit Euler method has to be solved. Higher order dG methods lead to coupled block systems and the number of unknowns grows linearly with increasing order. Therefore, for OCPs constrained by linear and nonlinear parabolic PDEs in several space dimensions, efficient solution techniques are needed. In the following, we will give an overview of the existing approaches by narrowing our discussion to 2×2 coupled block systems arising from different dG discretizations.
In the last decade, several variational time discretization methods have been developed. The test spaces always consist of piecewise discontinuous polynomials. When the solution space consists of continuous piecewise polynomials of degree k and the test functions are piecewise discontinuous polynomials of degree k − 1, the resulting method is called the continuous Galerkin discretization cGP(k). For the discontinuous Galerkin dG(k) method, both test and trial spaces consist of piecewise discontinuous polynomials of degree k. The advantages of variational time discretization are stability, convergence and space-time adaptivity. Both continuous and discontinuous Galerkin methods are A-stable; the discontinuous Galerkin methods are even L-stable (strongly stable). The convergence order of the cGP(k) methods is one higher than that of the dG(k) methods. Both of these methods are super-convergent at the nodal points, namely of order 2r + 1, when the order of the method is r and the solution of the problem is sufficiently regular [31, Chap. 12]. Space-time adaptivity can be implemented easily, because time is discretized with finite elements in the same way as space. Using a posteriori error estimates, adaptive hp time stepping and dynamic meshes (the use of a different spatial discretization for each time step) can be incorporated directly in the discrete formulation [25]. We mention that dynamic meshes (meshes changing with time) were used by combining dG(0) time discretization with the multiple shooting method for linear and nonlinear OCPs in [18], whereas Carraro et al. [6] use fixed meshes for all discrete time levels.
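The stability and nodal super-convergence statements can be checked numerically on the scalar test equation u' = λu, which is a standard model and not an example from the paper. In this setting the nodal values of dG(0) and dG(1) are propagated by the rational functions 1/(1 − z) and (1 + z/3)/(1 − 2z/3 + z²/6) with z = λk, where k is the step size. The following sketch, with hypothetical function names, evaluates these functions for large negative z and estimates the nodal convergence order by halving the step size.

```python
import math

def stability_dg0(z):
    """Nodal propagation factor of dG(0); identical to implicit Euler."""
    return 1.0 / (1.0 - z)

def stability_dg1(z):
    """Nodal propagation factor of dG(1): the (1,2) Pade approximant of exp(z)."""
    return (1.0 + z / 3.0) / (1.0 - 2.0 * z / 3.0 + z * z / 6.0)

def nodal_error(stability, lam, t_end, n_steps):
    """Error at t_end for u' = lam * u, u(0) = 1, using n_steps uniform steps."""
    k = t_end / n_steps
    return abs(stability(lam * k) ** n_steps - math.exp(lam * t_end))

def observed_order(stability, lam=-1.0, t_end=1.0, n_steps=20):
    """Estimate the nodal convergence order by halving the step size once."""
    coarse = nodal_error(stability, lam, t_end, n_steps)
    fine = nodal_error(stability, lam, t_end, 2 * n_steps)
    return math.log2(coarse / fine)

order_dg0 = observed_order(stability_dg0)
order_dg1 = observed_order(stability_dg1)

# L-stability: the propagation factor tends to zero for very stiff modes.
stiff_dg0 = abs(stability_dg0(-1.0e8))
stiff_dg1 = abs(stability_dg1(-1.0e8))

print(order_dg0, order_dg1, stiff_dg0, stiff_dg1)
```

The estimated nodal orders are close to 1 for dG(0) and close to 3 for dG(1), in line with the order 2r + 1 for r = 0 and r = 1, and both propagation factors decay for large negative z.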
As we have mentioned, the main disadvantage of variational time discretization is the large system of coupled equations as a result of space-time discretization. To illustrate this, we consider the semilinear parabolic initial value problem
where A is a linear second order elliptic differential operator and f(u) is locally Lipschitz continuous and monotone.
The 2×2 block system associated to dG discretization of (58) can be written in the following form:
where M is the mass matrix and F(⋅ )’s are dGFEM semi-discretized nonlinear terms of the right hand side of (58).
One step of the Newton iteration for solving the coupled system in (59) corresponds to solving the following 2 × 2 block system:
where the vectors \(W_{n}^{i}\) and \(R_{n}^{i}\), for i = 1, 2, denote the Newton correction and residual for a temporal basis function, respectively [25].
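To make the 2 × 2 block structure concrete, the following sketch assembles and solves one dG(1) step for the linear model problem M u' + A u = 0. Several choices here are assumptions made for illustration and are not taken from the paper: the temporal basis consists of the two linear Lagrange functions attached to the end points of the time interval (so the block coefficients differ from those of other bases), the stiffness matrix is a one-dimensional finite difference Laplacian standing in for a dG stiffness matrix, and the function names are hypothetical. In a Newton iteration for a nonlinear problem, the role of A would be played by a Jacobian and the right-hand side by the residuals.

```python
import numpy as np

def dg1_step(M, A, u_prev, k):
    """One dG(1) step for M u' + A u = 0 on an interval of length k.

    Returns (U1, U2): the values of the discrete solution at the left end
    (after the jump) and at the right end of the time interval.
    """
    n = M.shape[0]
    block = np.block([
        [0.5 * M + (k / 3.0) * A, 0.5 * M + (k / 6.0) * A],
        [-0.5 * M + (k / 6.0) * A, 0.5 * M + (k / 3.0) * A],
    ])
    # The jump term couples only the first test function to the previous value.
    rhs = np.concatenate([M @ u_prev, np.zeros(n)])
    sol = np.linalg.solve(block, rhs)
    return sol[:n], sol[n:]

def integrate(M, A, u0, t_end, n_steps):
    """Apply n_steps uniform dG(1) steps and return the nodal value at t_end."""
    k = t_end / n_steps
    u = u0.copy()
    for _ in range(n_steps):
        _, u = dg1_step(M, A, u, k)
    return u

# Stand-in spatial discretization (assumption): identity mass matrix and
# 1D finite difference Laplacian with homogeneous Dirichlet conditions.
n = 8
h = 1.0 / (n + 1)
M = np.eye(n)
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

# sin(pi x) sampled on the grid is an eigenvector of A with eigenvalue mu,
# so the exact solution of the semi-discrete problem is known in closed form.
x = h * np.arange(1, n + 1)
u0 = np.sin(np.pi * x)
mu = 4.0 / h**2 * np.sin(0.5 * np.pi * h) ** 2
t_end = 0.1
exact = np.exp(-mu * t_end) * u0

err_coarse = np.linalg.norm(integrate(M, A, u0, t_end, 10) - exact)
err_fine = np.linalg.norm(integrate(M, A, u0, t_end, 20) - exact)
observed = np.log2(err_coarse / err_fine)
print(err_coarse, err_fine, observed)
```

The block matrix has twice as many rows as the spatial system, which illustrates why the cost of a direct solve grows quickly with the polynomial degree in time.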
In [35], the linear system of equations associated with the dG(k) method, derived from the solution of linear parabolic equations, is decoupled into complex valued linear systems having the same structure as the implicit Euler discretization. Because existing finite element codes do not support complex arithmetic, the implementation would be difficult and costly. In order to avoid the use of complex arithmetic, Richter et al. [25] developed an inexact Newton method for solving nonlinear parabolic PDEs discretized by dG(k) methods. At each time step, several linear systems of equations are solved with the same structure as for the implicit Euler discretization. Weller and Basting [34] suggest a different solution strategy for linear parabolic PDEs under the dG(2) method approximated at Gauss-Radau points. The essential component \(U_{n}^{2}\), which is the solution of the problem at the next time step, can be obtained by an inexact factorization of the Schur complement, due to the property \(\beta_{1,2} = \beta_{2,1} = 0\) in (59) and (60). Because the Schur complement is of fourth order, its condition number is worse than that of the original system. They apply a symmetric preconditioned conjugate gradient method, so that a number of linear systems with the same structure as those arising from the implicit Euler discretization must be solved at each step. A nice property of the method is that it can be applied to linear parabolic PDEs with non-self-adjoint operators, like the diffusion-convection-reaction equation, because the Schur complement is symmetric. The efficiency of this solution technique for nonlinear parabolic problems has yet to be tested. Schieweck [27] introduced a continuous dG method where the solution space consists of piecewise continuous polynomials of degree k ≥ 1 and the test space of piecewise discontinuous polynomials of degree k − 1, approximated at Gauss-Lobatto nodes. This technique is called the discontinuous Galerkin-Petrov dGP(k) method.
Because the time derivative of the discrete solution is contained in the discrete test space, the method has the energy decreasing property, so that it can be applied to gradient systems like the Allen-Cahn and Cahn-Hilliard equations. Again, the essential unknown is \(U_{n}^{2}\) for the dGP(2) method due to \(\beta_{1,1} = 0\) in (59) and (60), and the solution can be determined by fixed point iteration. However, the linear system which must be solved at each time level consists of powers of mass and stiffness matrices, which could be difficult to solve. Instead, a defect correction algorithm was introduced in [27], so that at each defect correction step, linear systems like those in the implicit Euler discretization have to be solved again.
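The idea of reducing a 2 × 2 block system to a single equation for the essential unknown at the end of the time step can be illustrated by exact block elimination. The following sketch is not the inexact factorization of [34] or the defect correction of [27]. It uses a dG(1) step for M u' + A u = 0 with the linear Lagrange basis at the interval end points, and a small nonsymmetric finite difference matrix (diffusion plus first-order upwind convection) as a stand-in for a dG stiffness matrix. All names and parameter values are hypothetical.

```python
import numpy as np

def dg1_blocks(M, A, k):
    """Blocks of the dG(1) system in the end-point Lagrange basis (assumption)."""
    B11 = 0.5 * M + (k / 3.0) * A
    B12 = 0.5 * M + (k / 6.0) * A
    B21 = -0.5 * M + (k / 6.0) * A
    B22 = 0.5 * M + (k / 3.0) * A
    return B11, B12, B21, B22

def end_value_via_schur(M, A, u_prev, k):
    """Eliminate U1 and solve the Schur complement system for U2 only."""
    B11, B12, B21, B22 = dg1_blocks(M, A, k)
    g = M @ u_prev  # right-hand side of the first block row
    schur = B22 - B21 @ np.linalg.solve(B11, B12)
    rhs = -B21 @ np.linalg.solve(B11, g)
    return np.linalg.solve(schur, rhs)

def end_value_monolithic(M, A, u_prev, k):
    """Solve the full 2 x 2 block system and return the second component."""
    n = M.shape[0]
    B11, B12, B21, B22 = dg1_blocks(M, A, k)
    block = np.block([[B11, B12], [B21, B22]])
    rhs = np.concatenate([M @ u_prev, np.zeros(n)])
    return np.linalg.solve(block, rhs)[n:]

# Stand-in convection-diffusion matrix (assumption, for illustration only).
n = 6
h = 1.0 / (n + 1)
diffusion = 0.01 * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
convection = (np.eye(n) - np.eye(n, k=-1)) / h  # first-order upwind difference
A = diffusion + convection
M = np.eye(n)
u_prev = np.ones(n)
k = 0.05

u_schur = end_value_via_schur(M, A, u_prev, k)
u_mono = end_value_monolithic(M, A, u_prev, k)
print(np.max(np.abs(u_schur - u_mono)))
```

Forming the Schur complement explicitly requires the inverse of a stiffness-type block, so products of such matrices appear; this is why practical methods replace the exact elimination by approximations that only need solves of implicit Euler type.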
References
Akman, T., Yücel, H., Karasözen, B.: A priori error analysis of the upwind symmetric interior penalty Galerkin (SIPG) method for the optimal control problems governed by unsteady convection diffusion equations. Comput. Optim. Appl. 57(3), 703–729 (2014)
Apel, T., Flaig, T.G.: Crank-Nicolson schemes for optimal control problems with evolution equations. SIAM J. Numer. Anal. 50(3), 1484–1512 (2012)
Ayuso, B., Marini, L.D.: Discontinuous Galerkin methods for advection-diffusion-reaction problems. SIAM J. Numer. Anal. 47(2), 1391–1420 (2009)
Becker, R., Vexler, B.: Optimal control of the convection-diffusion equation using stabilized finite element methods. Numer. Math. 106(3), 349–367 (2007)
Burman, E.: Crank-Nicolson finite element methods using symmetric stabilization with an application to optimal control problems subject to transient advection-diffusion equations. Commun. Math. Sci. 9(1), 319–329 (2011)
Carraro, T., Geiger, M., Rannacher, R.: Indirect multiple shooting for nonlinear parabolic optimal control problems with control constraints. SIAM J. Sci. Comput. 36(2), A452–A481 (2014)
Chrysafinos, K.: Discontinuous Galerkin approximations for distributed optimal control problems constrained by parabolic PDE’s. Int. J. Numer. Anal. Model. 4(3–4), 690–712 (2007)
Chrysafinos, K., Walkington, N.J.: Error estimates for the discontinuous Galerkin methods for parabolic equations. SIAM J. Numer. Anal. 44(1), 349–366 (electronic) (2006)
Ciarlet, P.G.: The Finite Element Method for Elliptic Problems. North-Holland, Amsterdam, New York (1978)
Collis, S.S., Heinkenschloss, M.: Analysis of the streamline upwind/Petrov Galerkin method applied to the solution of optimal control problems. Tech. Rep. TR02–01, Department of Computational and Applied Mathematics, Rice University, Houston, TX (2002)
Dolejší, V., Feistauer, M.: Error estimates of the discontinuous Galerkin method for nonlinear nonstationary convection-diffusion problems. Numer. Funct. Anal. Optim. 26(3), 349–383 (2005)
Dolejší, V., Feistauer, M., Schwab, C.: A finite volume discontinuous Galerkin scheme for nonlinear convection-diffusion problems. Calcolo 39, 1–40 (2002)
Dolejší, V., Feistauer, M., Sobotíková, V.: Analysis of the discontinuous Galerkin method for nonlinear convection-diffusion problems. Comput. Methods Appl. Mech. Eng. 194(25–26), 2709–2733 (2005)
Eriksson, K., Johnson, C., Thomée, V.: Time discretization of parabolic problems by the discontinuous Galerkin method. RAIRO Modél. Math. Anal. Numér. 19(4), 611–643 (1985)
Feistauer, M., Švadlenka, K.: Discontinuous Galerkin method of lines for solving nonstationary singularly perturbed linear problems. J. Numer. Math. 12(2), 97–117 (2004)
Fu, H.: A characteristic finite element method for optimal control problems governed by convection-diffusion equations. J. Comput. Appl. Math. 235(3), 825–836 (2010)
Fu, H., Rui, H.: A priori error estimates for optimal control problems governed by transient advection-diffusion equations. J. Sci. Comput. 38(3), 290–315 (2009)
Hesse, H.K., Kanschat, G.: Mesh adaptive multiple shooting for partial differential equations. I. Linear quadratic optimal control problems. J. Numer. Math. 17(3), 195–217 (2009)
Hinze, M., Yan, N., Zhou, Z.: Variational discretization for optimal control governed by convection dominated diffusion equations. J. Comput. Math. 27(2–3), 237–253 (2009)
Hozman, J., Dolejší, V.: A priori error estimates for DGFEM applied to nonstationary nonlinear convection-diffusion equation. In: Kreiss, G., et al. (eds.) Numerical Mathematics and Advanced Applications. ENUMATH 2009, pp. 459–467. Springer, Heidelberg (2010)
Leykekhman, D.: Investigation of commutative properties of discontinuous Galerkin methods in PDE constrained optimal control problems. J. Sci. Comput. 53(3), 483–511 (2012)
Leykekhman, D., Heinkenschloss, M.: Local error analysis of discontinuous Galerkin methods for advection-dominated elliptic linear-quadratic optimal control problems. SIAM J. Numer. Anal. 50(4), 2012–2038 (2012)
Lions, J.L.: Optimal control of systems governed by partial differential equations. Translated from the French by S. K. Mitter. Die Grundlehren der Mathematischen Wissenschaften, Band 170. Springer, New York (1971)
Meidner, D., Vexler, B.: A priori error estimates for space-time finite element discretization of parabolic optimal control problems. I. Problems without control constraints. SIAM J. Control Optim. 47(3), 1150–1177 (2008)
Richter, T., Springer, A., Vexler, B.: Efficient numerical realization of discontinuous Galerkin methods for temporal discretization of parabolic problems. Numer. Math. 124(1), 151–182 (2013)
Rivière, B.: Discontinuous Galerkin Methods for Solving Elliptic and Parabolic Equations: Theory and Implementation. Frontiers in Applied Mathematics, vol. 35. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (2008)
Schieweck, F.: A-stable discontinuous Galerkin-Petrov time discretization of higher order. J. Numer. Math. 18(1), 25–57 (2010)
Schötzau, D., Schwab, C.: Time discretization of parabolic problems by the hp-version of the discontinuous Galerkin finite element method. SIAM J. Numer. Anal. 38(3), 837–875 (2000)
Stoll, M., Wathen, A.: All-at-once solution of time-dependent PDE-constrained optimization problems. Tech. rep., Computational Methods in Systems and Control Theory, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg (2010). http://www.eprints.maths.ox.ac.uk/1017/1/NA-10-13.pdf
Sun, T.: Discontinuous Galerkin finite element method with interior penalties for convection diffusion optimal control problem. Int. J. Numer. Anal. Model. 7(1), 87–107 (2010)
Thomée, V.: Galerkin Finite Element Methods for Parabolic Problems, 2nd edn. Springer Series in Computational Mathematics, vol. 25. Springer, Berlin (2006)
Tröltzsch, F.: Optimal Control of Partial Differential Equations: Theory, Methods and Applications. Graduate Studies in Mathematics, vol. 112. American Mathematical Society, Providence, RI (2010). Translated from the 2005 German original by Jürgen Sprekels
Vlasák, M., Dolejší, V., Hájek, J.: A priori error estimates of an extrapolated space-time discontinuous Galerkin method for nonlinear convection-diffusion problems. Numer. Methods Partial Differ. Equ. 27(6), 1456–1482 (2011)
Weller, S., Basting, S.: Efficient preconditioning of variational time discretization methods for parabolic partial differential equations. ESAIM Math. Model. Numer. Anal. 49(2), 331–347 (2015)
Werder, T., Gerdes, K., Schötzau, D., Schwab, C.: hp-discontinuous Galerkin time stepping for parabolic problems. Comput. Methods Appl. Mech. Eng. 190(49–50), 6685–6708 (2001)
Yücel, H., Heinkenschloss, M., Karasözen, B.: Distributed optimal control of diffusion-convection-reaction equations using discontinuous Galerkin methods. In: Cangiani, A., Davidchack, R.L., Georgoulis, E., Gorban, A.N., Levesley, J., Tretyakov, M.V. (eds.) Numerical Mathematics and Advanced Applications 2011, pp. 389–397. Springer, Berlin/Heidelberg (2013)
Yücel, H., Karasözen, B.: Adaptive symmetric interior penalty Galerkin (SIPG) method for optimal control of convection diffusion equations with control constraints. Optimization 63(1), 145–166 (2014)
Zhou, Z., Yan, N.: The local discontinuous Galerkin method for optimal control problem governed by convection diffusion equations. Int. J. Numer. Anal. Model. 7(4), 681–699 (2010)
Acknowledgements
The authors thank Konstantinos Chrysafinos for his explanations regarding error estimates and references. This research was supported by the Middle East Technical University Research Fund Project (BAP-07-05-2012-102).
© 2015 Springer International Publishing Switzerland
Cite this paper
Akman, T., Karasözen, B. (2015). Space-Time Discontinuous Galerkin Methods for Optimal Control Problems Governed by Time Dependent Diffusion-Convection-Reaction Equations. In: Carraro, T., Geiger, M., Körkel, S., Rannacher, R. (eds) Multiple Shooting and Time Domain Decomposition Methods. Contributions in Mathematical and Computational Sciences, vol 9. Springer, Cham. https://doi.org/10.1007/978-3-319-23321-5_9
DOI: https://doi.org/10.1007/978-3-319-23321-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23320-8
Online ISBN: 978-3-319-23321-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)