1 Introduction

The goal of four-dimensional variational (4D-Var) data assimilation is to estimate unknown control variables of a dynamical system—classically the initial condition of the system—that provide the best fit of the system outputs with observation data over a specific time interval (Courtier 1997; Dimet and Talagrand 1986; Lorenc 1981, 1986; Sasaki 1970). The use of 4D-Var data assimilation is prevalent in oceanography (Bennett 1993) and meteorology (Lynch 2015), where the dynamical system is described by partial differential equations (PDEs); see the recent texts (Law et al. 2015; Reich and Cotter 2015) and references therein for variational data assimilation in general.

We consider two variants of the 4D-Var problem. In the traditional strong-constraint 4D-Var formulation, the model is assumed to be “perfect” and only the initial condition serves as the (unknown) control variable. The weak-constraint 4D-Var formulation additionally accounts for an imperfect model by introducing a forcing term that represents the model error. In the weak-constraint case, the unknown initial condition and the unknown model-error forcing term thus serve as control variables; for various weak-constraint formulations see e.g. Trémolet (2006).

The 4D-Var problem is usually cast as an optimization problem and has very close connections to optimal control theory (Vermeulen and Heemink 2006). In the classical strong-constraint formulation, a cost functional consisting of two terms is introduced: the first term penalizes the misfit between the (unknown) initial condition and its prior background information and the second term penalizes the distance between the predicted system outputs and the observation data. In the weak-constraint case, another term is added which penalizes the model-error forcing. The optimal estimate of the initial condition is then found by minimizing the cost functional subject to the governing equations of the dynamical system, i.e., the PDE. After discretization of the PDE using classical techniques such as finite elements or finite volumes, the 4D-Var problem becomes a large-scale optimization problem which is typically very expensive to solve due to the high-dimensional state and control variable spaces and the associated computation of the cost functional, gradient, and possibly Hessian. Note that in the discretized weak-constraint formulation, the model-error forcing is also assumed to be spatially distributed and thus has approximately the same dimension as the state and initial condition. To lower the tremendous computational cost of solving the problem, an incremental approach has been proposed in Courtier et al. (1994).

Another way to speed up the solution process is a reduced-order approach; such approaches have been proposed successfully for the strong-constraint 4D-Var formulation in, for example, Cao et al. (2007), Daescu and Navon (2008), Dimitriu et al. (2010), Hoteit and Köhl (2006), Robert et al. (2005), Vermeulen and Heemink (2006) and Ştefănescu et al. (2015). There are two kinds of 4D-Var reduced-order approaches in the literature: In the first approach (Hoteit and Köhl 2006; Robert et al. 2005; Vermeulen and Heemink 2006), a reduced basis space is introduced, e.g. using empirical orthogonal functions, for only the control variable (initial condition). By limiting the search space to the reduced space, the optimization cost per iteration decreases and the convergence improves (at least during the first few iterations). In the second approach (Cao et al. 2007; Daescu and Navon 2008; Dimitriu et al. 2010), a reduced-order model for the system dynamics using proper orthogonal decomposition (POD) is additionally introduced. This leads to an additional speed-up and significant overall computational savings compared to reducing only the control space. All of these approaches also consider adapting the basis during the optimization. However, to the best of our knowledge, a posteriori error bounds to assess the sub-optimality of the reduced-order 4D-Var solutions have not yet been developed.

In this paper, we develop efficiently evaluable a posteriori error bounds for reduced order solutions of the strong- and weak-constraint 4D-Var data assimilation problem. We consider the standard quadratic 4D-Var cost functional constrained by parametrized linear parabolic PDEs involving noisy observations in time. Our final goal is not only to recover the “usual” 4D-Var control variables, i.e., the initial condition and model-error forcing, but also the model parameters. A preliminary improvement of the model itself before estimating the state can result in an improved state estimate, see e.g. the application in Habert et al. (2016). We thus obtain a bilevel optimization problem where the outer optimization stage is performed over the model parameters after an inner optimization stage identical to the standard 4D-Var setting, i.e., an optimization over control variables for given fixed model parameters. In this paper, we focus mainly on the inner optimization stage and propose a posteriori error bounds for the control variable. Our main contributions are as follows:

  • In Sect. 3, we consider the strong-constraint 4D-Var formulation. We employ the reduced basis method to generate reduced order approximations for the solution of the parametrized 4D-Var problem, i.e., the state, adjoint, and control variables (here, the initial condition). We then propose an a posteriori error bound for the control variable that allows us to assess the error between the reduced-order 4D-Var solution and the 4D-Var solution of the underlying high-dimensional FE approximation.

  • In Sect. 4, we extend the reduced basis approximation and a posteriori error estimation procedure from the strong- to the weak-constraint case. For simplicity of exposition, we consider the model-error forcing as the only unknown control variable in this section.

  • In Sect. 5, we combine the results from the two previous sections and consider problems with unknown initial condition and model-error forcing.

With the assumption of affine parameter dependence, the reduced-order 4D-Var problems and the a posteriori error bounds can be efficiently evaluated using an offline-online computational decomposition. Problems involving material parameters often naturally satisfy an affine parameter dependence, and even geometric parameters can often be treated after introducing suitable affine mappings onto a reference domain (Rozza et al. 2008). Furthermore, the dimension reduction as well as the a posteriori error bound formulation presented in this paper still hold even for non-affine problems. However, for non-affine problems the computations can no longer be decomposed into offline-online stages, and the online computational efficiency thus suffers. To address this issue, the non-affine case can be treated using the empirical interpolation method (EIM), which replaces the non-affine terms with an affine approximation and thus allows us to regain the online computational efficiency; we refer the interested reader e.g. to Barrault et al. (2004), Grepl et al. (2007) and Maday et al. (2007).

We present numerical results for the strong- and weak-constraint setting in Sect. 6. We consider the dispersion of a pollutant governed by a convection-diffusion equation with a Taylor–Green vortex velocity field. Our goal is to recover the initial condition (in the strong-constraint case) or the model-error forcing (in the weak-constraint case) given noisy measurements of the pollutant concentration at five spatial locations over time. Since we focus on the solution of the inverse problem here, we limit our test case to low Péclet numbers up to 50. The reason is that high Péclet numbers pose significant challenges for model reduction even for the forward problem itself: the high Péclet regime may require stabilization and faces a growing Kolmogorov N-width with an associated growth of the dimension of the reduced-order spaces. However, even Péclet numbers below 50 are still practically relevant and do appear in realistic scenarios, see e.g., Marshall et al. (2006).

We note that there is a close connection between the 4D-Var problem formulation and optimal control and that a posteriori error bounds for reduced order solutions to optimal control problems have been developed previously. However, rigorous and efficiently evaluable error bounds have been proposed mainly for elliptic problems (Kärcher and Grepl 2012; Kärcher et al. 2017; Negri et al. 2012), whereas error bounds for parabolic optimal control problems are either not rigorous (Dedè 2010) or not (online-)efficient (Tröltzsch and Volkwein 2009). The only exception for parabolic problems is Kärcher and Grepl (2014), which considers only scalar time-dependent controls and is based on a perturbation argument, often resulting in a more conservative error bound (Kärcher et al. 2017).

Finally, we note that the reduced basis method has already been used in a parameterized-background data-weak approach to variational data assimilation in Maday et al. (2015a, b). However, this previous work considers the elliptic case and presents a relaxation of the 3D-Var setting, whereas we consider the time-dependent case using the classical 4D-Var formulation. Before introducing some preliminary definitions and assumptions in the following section, we note that although we consider the 4D-Var problem here, our approach directly applies to the 3D-Var setting since the two are formally similar (Lynch 2015).

2 Preliminaries

In this section, we introduce the necessary ingredients and definitions for the subsequent discussion. The 4D-Var problem is usually cast in a fully discrete setting; we thus directly consider a spatial finite element (FE) and temporal finite difference (FD) discretization using the weak variational formulation. We summarize the continuous formulation of the 4D-Var problem in “Appendix 1”.

Let \(Y_\mathrm {e}\) with \(H^1_0({\varOmega }) \subset Y_\mathrm {e}\subset H^1({\varOmega })\) be a Hilbert space of functions over the bounded Lipschitz domain \({\varOmega }\subset \mathbb {R}^d,\) \(d \in \mathbb {N},\) with boundary \({\varGamma }.\) The inner product and induced norm associated with \(Y_\mathrm {e}\) are given by \((\cdot ,\cdot )_Y\) and \(\left||\cdot \right||_Y = \sqrt{(\cdot ,\cdot )_Y},\) respectively. We assume that the norm \(\left||\cdot \right||_Y\) is equivalent to the \(H^1({\varOmega })\)-norm and denote the dual space of \(Y_\mathrm {e}\) by \(Y_\mathrm {e}'.\) We also introduce the Hilbert space for the control, \(U_\mathrm {e}= L^2({\varOmega }),\) together with its inner product \((\cdot ,\cdot )_U,\) induced norm \(\left||\cdot \right||_U = \sqrt{(\cdot ,\cdot )_U},\) and associated dual space \(U_\mathrm {e}'.\) Furthermore, let \(\mathcal {D}\subset \mathbb {R}^P\) be a prescribed P-dimensional compact set in which our P-tuple input parameter \(\mu = (\mu _1,\ldots ,\mu _{P})\) resides.

We divide the time interval [0, T] with fixed final time T into K subintervals of equal length \(\tau = \frac{T}{K}\) and define \(t^k = k \, \tau , \ 0 \le k \le K,\) and \(\mathbb {K}= \{ 1, \dots , K \}.\) We also introduce two conforming finite element approximation spaces \(Y \subset Y_\mathrm {e}\) and \(U \subset U_\mathrm {e}\) of typically large dimension \(\mathcal {N}_Y = \dim (Y)\) and \(\mathcal {N}_U= \dim (U);\) note that Y and U shall inherit the inner product and norm from \(Y_\mathrm {e}\) and \(U_\mathrm {e},\) respectively. We shall assume that the spaces \(Y,U\) and the number of timesteps K are large enough – i.e. Y and U are sufficiently rich and the time-discretization sufficiently fine – such that the FE-FD approximation guarantees a desired accuracy over the whole parameter domain \(\mathcal {D}.\)

We next introduce the (for the sake of simplicity) parameter-independent bilinear forms \(m(w,v) = (w,v)_{L^2({\varOmega })}\) for all \(w,v \in L^2({\varOmega })\) and \(b(\cdot ,\cdot ): U \times Y \rightarrow \mathbb {R}.\) We assume that \(b(\cdot ,\cdot )\) is continuous, i.e.

$$\begin{aligned} \gamma _b= \sup _{w \in U \setminus \{0\}} \sup _{v \in Y \setminus \{0\}} \frac{b(w,v)}{\left||w\right||_{U} \left||v\right||_{Y}} < \infty . \end{aligned}$$
(1)

We also introduce the parameter-dependent bilinear form \(a(\cdot ,\cdot ;\mu ): Y \times Y \rightarrow \mathbb {R},\) which we assume to be continuous, coercive,

$$\begin{aligned} \alpha (\mu )= \inf _{v \in Y \setminus \{0\}} \frac{a(v,v;\mu )}{\left||v\right||_Y^2} \ge \underline{\alpha } > 0 \quad \forall \mu \in \mathcal {D}, \end{aligned}$$
(2)

and affinely parameter-dependent,

$$\begin{aligned} a(w,v;\mu ) = \sum _{q=1}^{Q_a} {\varTheta }_a^q(\mu ) \, a^q(w,v) \quad \forall w,v \in Y, \quad \forall \mu \in \mathcal {D}, \end{aligned}$$
(3)

for some (preferably) small integer \(Q_a.\) Here, the coefficient functions \({\varTheta }_a^q: \mathcal {D}\rightarrow \mathbb {R}\) are continuous and depend on \(\mu ,\) but the continuous bilinear forms \(a^q: Y \times Y \rightarrow \mathbb {R}\) do not depend on \(\mu .\)
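To illustrate how the affine decomposition (3) is exploited computationally, the following minimal Python sketch assembles the parameter-dependent operator from precomputed parameter-independent components; the matrices and coefficient functions are purely illustrative stand-ins, not the actual FE operators of Sect. 6.

```python
import numpy as np

def assemble_A(mu, Theta, A_q):
    """Evaluate A(mu) = sum_q Theta_q(mu) * A_q, cf. (3). The A_q are
    assembled once offline; forming A(mu) online costs only Q_a scalings."""
    return sum(theta(mu) * Aq for theta, Aq in zip(Theta, A_q))

# Toy example with Q_a = 2 (cf. the model problem in Sect. 6):
n = 4
A_diff = np.eye(n)                   # stand-in for the diffusion component
A_conv = np.diag(np.ones(n - 1), 1)  # stand-in for the convection component
Theta = [lambda mu: 1.0 / mu, lambda mu: 1.0]
A = assemble_A(25.0, Theta, [A_diff, A_conv])
```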

We also require the continuous linear functional \(f(\cdot ): Y \rightarrow \mathbb {R}\) and the continuous and linear (observation) operator \(C: Y \rightarrow D,\) where D is a suitable Hilbert space of observations with inner product \((\cdot ,\cdot )_D\) and norm \(\left||\cdot \right||_D.\) Although a more general setting is possible, we consider here the observation space \(D = \mathbb {R}^{\ell }\) and the observation operator given by \(C \phi = (h_1(\phi ), \dots , h_{\ell }(\phi ))^T,\) where \(h_i \in Y'\) are linear output functionals. The continuity constant of the operator C is given by

$$\begin{aligned} \gamma _c= \sup _{v \in Y \setminus \{0\}} \frac{\left||C v\right||_{D}}{\left||v\right||_Y}. \end{aligned}$$
(4)

For the development of the a posteriori error bounds we assume that we have access to a positive lower bound \(\alpha _{\mathrm{LB}}(\mu ): \mathcal {D}\rightarrow \mathbb {R}_{+}\) for the coercivity constant \(\alpha (\mu )\) defined in (2) such that

$$\begin{aligned} 0 < \underline{\alpha } \le \alpha _{\mathrm{LB}}(\mu )\le \alpha (\mu )\quad \forall \mu \in \mathcal {D}. \end{aligned}$$
(5)

We note that \(\alpha _{\mathrm{LB}}(\mu )\) is used in the a posteriori error bound formulation to replace the actual coercivity constant. Whereas the constants \(\gamma _b\) and \(\gamma _c\) are parameter-independent and can thus be computed once offline, we require that the coercivity lower bound can be efficiently evaluated online, i.e., that its computational cost is independent of the FE dimension \(\mathcal {N}.\) Various recipes exist to obtain such bounds (Huynh et al. 2007; Rozza et al. 2008).
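One such recipe is the min-\({\varTheta }\) approach (Rozza et al. 2008), which applies when all coefficient functions are positive and all component forms \(a^q\) are symmetric positive semidefinite; a minimal sketch under these assumptions (not necessarily the recipe used for the numerical tests in Sect. 6):

```python
def alpha_LB(mu, mu_bar, alpha_bar, Theta):
    """Min-Theta coercivity lower bound (Rozza et al. 2008):
    alpha_LB(mu) = alpha(mu_bar) * min_q Theta_q(mu) / Theta_q(mu_bar),
    valid if Theta_q > 0 on D and each a^q is symmetric positive
    semidefinite. alpha(mu_bar) is computed once offline for a fixed
    reference mu_bar; the online cost is O(Q_a), independent of the
    FE dimension."""
    return alpha_bar * min(theta(mu) / theta(mu_bar) for theta in Theta)
```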

3 Strong-constraint 4D-Var

In this section, we consider the strong-constraint 4D-Var data assimilation problem. The extension to the weak-constraint case is considered in Sect. 4.

3.1 Problem statement

For a given parameter \(\mu \in \mathcal {D},\) the classical 4D-Var problem can be stated as the minimization problem

$$\begin{aligned} \begin{aligned}&\min _{y \in Y^K, \, u \in U} J(y,u;\mu ) \quad \text {s.t.} \quad y \in Y^K \quad \text {solves} \\&m(y^k,v) + \tau \, a(y^k,v;\mu ) = m(y^{k-1},v) + \tau f(v) \quad \forall v \in Y, \ \forall k \in \mathbb {K}, \end{aligned} \end{aligned}$$
(6)

with initial condition \(m(y^0,v) = m(u,v)\) for all \(v \in Y,\) and cost functional \(J(\cdot ,\cdot ;\mu ): Y^K \times U \rightarrow \mathbb {R}\) given by

$$\begin{aligned} J(y,u;\mu ) = \frac{1}{2} \left||u - u_d\right||_{U}^2 + \frac{\tau }{2} \sum _{k=1}^K \left||C y^k - z_d^k\right||^2_{D}. \end{aligned}$$
(7)

Here \(u_d \in U\) is the background state (also referred to as the prior), i.e., the best estimate of the true initial condition \(u \in U\) prior to measurements being available, and \(z_d^k \in D,\) \(k \in \mathbb {K},\) is the given data, e.g., observed outputs. The first term in the cost functional penalizes the deviation of the initial condition from the background state; the second term penalizes the deviation of the predicted outputs from the given data/observed outputs. The relative weight of the two terms is determined by the choice of the \((\cdot ,\cdot )_U\) and \((\cdot ,\cdot )_D\) inner products. Note that we use u for the unknown control/initial condition to signify the similarity to optimal control and the notation \(J(\cdot ,\cdot ;\mu )\) to indicate the implicit dependence of the cost functional J on the parameter \(\mu\) through the state y. However, to simplify the notation we often do not explicitly state the dependence of the state and control on the parameter, i.e., we use \(y^k\) and u instead of \(y^k(\mu )\) and \(u(\mu ),\) respectively.

We would like to point out that the first term in (7) represents a Tikhonov regularization of the cost functional and that the regularization parameter is “hidden” in the choice of the inner product. We refer to Engl et al. (1996) for regularization of inverse problems in general and to Puel (2009) for Tikhonov regularization in data assimilation. Furthermore, we note that the choice of the norm for the data misfit term depends on the characteristics of the noise and is inspired by Gaussian noise in this paper. Different noise characteristics may require a different choice of norm; we refer e.g. to Rao et al. (2017) for a discussion using \(L_1\) and Huber norms instead of the \(L_2\) norm. The approach presented in the following is restricted to the case of Gaussian noise.

Employing a Lagrangian approach, we obtain the associated necessary, and in our setting sufficient, first-order optimality conditions: Given \(\mu \in \mathcal {D},\) the optimal solution \((y^*,p^*,u^*) \in Y^K \times Y^K \times U\) satisfies

$$m(y^{*,k} - y^{*,k-1},\phi ) + \tau \, a ( y^{*,k},\phi ;\mu )= \tau \, f(\phi ) \quad \forall \phi \in Y, \ \forall k \in \mathbb {K},$$
(8a)
$$m(y^{*,0}, \phi )= m(u^*,\phi ) \quad \forall \phi \in Y,$$
(8b)
$$m(\varphi , p^{*,k} - p^{*,k+1}) + \tau \, a ( \varphi ,p^{*,k};\mu )= \tau \, \left( z_d^k - C y^{*,k}, C \varphi \right) _{D} \quad \forall \varphi \in Y, \ \forall k \in \mathbb {K},$$
(8c)
$$\left( u^{*} - u_d,\psi \right) _U - m(\psi , p^{*,1})= 0 \quad \forall \psi \in U,$$
(8d)

where the final condition of the adjoint is given by \(p^{*,K+1} = 0.\) Concerning the existence and uniqueness of solutions of the 4D-Var problem specifically and of saddle point problems in general, we refer to Bröcker (2017) and Benzi et al. (2005).

3.1.1 Algebraic formulation

The 4D-Var problem is usually stated using an algebraic formulation (Ide et al. 1997). We thus briefly outline the algebraic equivalent of (6) by introducing a basis for the finite element spaces Y and U such that \(Y = {{\mathrm{span}}}\{\phi _i^y, \, i = 1, \ldots , \mathcal {N}_Y\}\) and \(U = {{\mathrm{span}}}\{\phi _i^u, \, i = 1, \ldots , \mathcal {N}_U\},\) respectively. We express the state, adjoint, and control, respectively, as

$$\begin{aligned} y^k = \textstyle \sum \limits _{i = 1}^{\mathcal {N}_Y} y^k_i \phi _i^y, \qquad p^k = \textstyle \sum \limits _{i = 1}^{\mathcal {N}_Y} p^k_i \phi _i^y, \qquad u = \textstyle \sum \limits _{i = 1}^{\mathcal {N}_U} u_i \phi _i^u, \end{aligned}$$

and denote the corresponding coefficient vectors by \(\mathrm {y}^k = [y_1^k, \ldots , y_{\mathcal {N}_Y}^k]^T \in \mathbb {R}^{\mathcal {N}_Y},\) \(\mathrm {p}^k = [p_1^k, \ldots , p_{\mathcal {N}_Y}^k]^T \in \mathbb {R}^{\mathcal {N}_Y},\) and \(\mathrm {u}= [u_1, \ldots , u_{\mathcal {N}_U}]^T \in \mathbb {R}^{\mathcal {N}_U}.\) We thus obtain the algebraic formulation of the classical 4D-Var minimization problem

$$\begin{aligned} \begin{aligned}&\min J(\mathrm {y},\mathrm {u};\mu ) = \frac{1}{2} (\mathrm {u}- \mathrm {u}_d)^T \mathrm {U}(\mathrm {u}- \mathrm {u}_d) + \frac{\tau }{2} \sum _{k=1}^K \left( \mathrm {C}\mathrm {y}^k - \mathrm {z}_d^k\right) ^T \mathrm {D}\left( \mathrm {C}\mathrm {y}^k - \mathrm {z}_d^k\right) , \\&\text {s.t.} \quad \mathrm {y}^k \in \mathbb {R}^{\mathcal {N}_Y} \quad \text {solves}\quad \mathrm {M}\mathrm {y}^k + \tau \, \mathrm {A}(\mu ) \mathrm {y}^k = \mathrm {M}\mathrm {y}^{k-1} + \tau \mathrm {F}\quad \forall k \in \mathbb {K}, \\&\text {with\, initial\, condition}\quad \mathrm {M}\mathrm {y}^0 = \mathrm {M}_u \mathrm {u}. \end{aligned} \end{aligned}$$
(9)

Here \(\mathrm {M}\in \mathbb {R}^{\mathcal {N}_Y \times \mathcal {N}_Y},\) \(\mathrm {A}(\mu ) \in \mathbb {R}^{\mathcal {N}_Y \times \mathcal {N}_Y},\) \(\mathrm {F}\in \mathbb {R}^{\mathcal {N}_Y},\) and \(\mathrm {C}\in \mathbb {R}^{\ell \times \mathcal {N}_Y}\) are the usual finite element mass matrix, stiffness matrix, load vector, and state-to-output matrix with entries \(\mathrm {M}_{ij} = m(\phi _j^y,\phi _i^y),\) \(\mathrm {A}_{ij}(\mu ) = a(\phi _j^y,\phi _i^y;\mu ),\) \(\mathrm {F}_{i} = f(\phi _i^y),\) and \(\mathrm {C}_{ij} = h_i(\phi _j^y),\) respectively. The matrix \(\mathrm {M}_u \in \mathbb {R}^{\mathcal {N}_Y \times \mathcal {N}_U}\) is given by \((\mathrm {M}_u)_{ij} = m(\phi _j^u,\phi _i^y),\) and \(\mathrm {u}_d \in \mathbb {R}^{\mathcal {N}_U}\) denotes the coefficient vector of the background state \(u_d.\) Furthermore, the matrices \(\mathrm {U}\in \mathbb {R}^{\mathcal {N}_U \times \mathcal {N}_U}\) with entries \(\mathrm {U}_{ij} = (\phi _j^u,\phi _i^u)_U\) and \(\mathrm {D}\in \mathbb {R}^{\ell \times \ell }\) with entries \(\mathrm {D}_{ij} = (e_j,e_i)_D\) can be identified as the inverses of the background and observation error covariance matrices, respectively. Here, \(e_i\) denotes the ith unit vector in \(\mathbb {R}^{\ell }.\)
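To make the algebraic formulation (9) concrete, the following sketch implements the backward-Euler time stepping and the cost functional with dense NumPy matrices; names and data layout are hypothetical, and sparsity and factorization reuse are omitted for brevity.

```python
import numpy as np

def solve_forward(M, A, F, M_u, u, K, tau):
    """Time stepping of (9): (M + tau*A) y^k = M y^{k-1} + tau*F,
    with initial condition M y^0 = M_u u. Returns [y^1, ..., y^K]."""
    y = np.linalg.solve(M, M_u @ u)
    lhs = M + tau * A          # in practice: factorize once and reuse
    ys = []
    for _ in range(K):
        y = np.linalg.solve(lhs, M @ y + tau * F)
        ys.append(y)
    return ys

def cost(ys, u, u_d, z_d, C, U, D, tau):
    """4D-Var cost functional J from (9)."""
    du = u - u_d
    J = 0.5 * du @ (U @ du)                  # background misfit
    for y_k, z_k in zip(ys, z_d):
        r = C @ y_k - z_k
        J += 0.5 * tau * r @ (D @ r)         # data misfit
    return J
```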

The derivation and algebraic formulation of the optimality system (8) is standard and thus omitted for brevity. Further, in our problem setting the first-discretize-then-optimize and first-optimize-then-discretize strategies lead to the same algebraic formulation of the first-order optimality system. For more details on these two approaches, we refer to Hinze et al. (2009) and for time-dependent problems specifically to Stoll and Wathen (2013).
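For completeness, a sketch of the adjoint-based gradient of the reduced cost functional \(j(u;\mu ) = J(y(u),u;\mu )\) used by the Newton-CG method of Sect. 3.4: one backward sweep of the discrete adjoint equation, cf. (8c), followed by the gradient identity from (8d). This is an illustrative reconstruction consistent with the matrices defined above, not the authors' code.

```python
import numpy as np

def gradient(M, A, C, D, U, M_u, ys, z_d, u, u_d, K, tau):
    """Gradient of j(u) = J(y(u), u): backward adjoint sweep
    (M + tau*A)^T p^k = M^T p^{k+1} + tau * C^T D (z^k - C y^k), cf. (8c),
    then grad j(u) = U (u - u_d) - M_u^T p^1, cf. (8d)."""
    p = np.zeros(M.shape[0])          # final condition p^{K+1} = 0
    lhsT = (M + tau * A).T
    for k in reversed(range(K)):      # time steps K, ..., 1
        rhs = M.T @ p + tau * C.T @ (D @ (z_d[k] - C @ ys[k]))
        p = np.linalg.solve(lhsT, rhs)
    return U @ (u - u_d) - M_u.T @ p  # p now holds p^1
```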

3.2 Reduced basis approximation

We first assume that we are given the reduced basis spaces \(Y_N \subset Y\) for the state and adjoint, and \(U_N^0 \subset U\) for the control. Here, \(1 \le N \le N_{\mathrm{max}}\) is the number of iterations of the POD-Greedy sampling procedure used to construct the spaces \(Y_N\) and \(U_N^0\), discussed in Sect. 3.5. Note that the dimensions \(N_Y(N) := \dim (Y_N)\) and \(N_U^0(N) := \dim (U_N^0)\) of the reduced basis spaces depend on N but are in general not equal to N. Furthermore, the basis functions of \(Y_N\) and \(U_N^0\) are orthogonalized with respect to the \((\cdot ,\cdot )_Y\) and \((\cdot ,\cdot )_U\) inner products, respectively.

We next replace the finite element approximation of the PDE constraint in the 4D-Var problem statement (6) with its reduced basis approximation. For a given parameter \(\mu \in \mathcal {D}\), the reduced-order 4D-Var data assimilation problem can thus be stated as

$$\begin{aligned} \begin{aligned}&\min _{y_N \in Y_N^K, \, u_N \in U_N^0} J(y_N,u_N;\mu ) \quad \text {s.t.} \quad y_N \in Y_N^K \quad \text {solves} \\&m(y_N^k,v) + \tau \, a(y_N^k,v;\mu ) = m(y_N^{k-1},v) + \tau f(v) \quad \forall v \in Y_N, \ \forall k \in \mathbb {K}, \end{aligned} \end{aligned}$$
(10)

with initial condition \(m(y_N^0,v) = m(u_N,v)\) for all \(v \in Y_N\).

We can again employ a Lagrangian approach to obtain the reduced-order optimality system: Given any \(\mu \in \mathcal {D}\), the optimal solution \((y_N^*,p_N^*,u_N^*) \in Y_N^K \times Y_N^K \times U_N^0\) satisfies

$$m\left( y_N^{*,k} - y_N^{*,k-1},\phi \right) + \tau \, a \left( y_N^{*,k},\phi ;\mu \right)= \tau \, f(\phi )\quad\forall \phi \in Y_N, \ \forall k \in \mathbb {K},$$
(11a)
$$m\left( y_N^{*,0}, \phi \right)= m\left( u_N^*,\phi \right)\quad \forall \phi \in Y_N,$$
(11b)
$$m\left( \varphi , p_N^{*,k} - p_N^{*,k+1}\right) + \tau \, a \left( \varphi ,p_N^{*,k};\mu \right)= \tau \, \left( z_d^k - C y_N^{*,k}, C \varphi \right) _{D} \quad \forall \varphi \in Y_N, \ \forall k \in \mathbb {K},$$
(11c)
$$\left( u_N^{*} - u_d,\psi \right) _U - m\left( \psi , p_N^{*,1}\right)= 0\quad \forall \psi \in U_N^0,$$
(11d)

where the final condition of the adjoint is given by \(p_N^{*,K+1} = 0\). The reduced-order optimality system can be solved efficiently using an offline-online computational procedure which is briefly discussed in Sect. 3.4.

Note that we use a single reduced basis ansatz and test space for the state and adjoint equations for two reasons: first, a single space for state and adjoint guarantees the stability of the reduced-order optimality system (Gerner and Veroy 2012); and second, the reduced-order optimality system (11) reflects the reduced-order 4D-Var problem (10) only if the spaces of the state and adjoint equations are identical. Since the state and adjoint solutions need to be well-approximated using the single space \(Y_N,\) we combine snapshots of both the state and adjoint equations into the reduced basis space \(Y_N.\)

We also note that the dynamics of the state and adjoint often differ, so separate spaces for the state and adjoint would be beneficial for computational efficiency: the dimension of the state/adjoint reduced basis space, and thus the overall dimension of the reduced-order optimality system, would be considerably smaller. However, this requires a Petrov–Galerkin projection for the state and adjoint, with an associated detriment to stability.
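In matrix terms, the reduced-order operators are obtained by a one-time Galerkin projection of the full-order matrices onto basis matrices \(V_y\) (columns spanning \(Y_N\)) and \(V_u\) (columns spanning \(U_N^0\)); a hypothetical sketch of this offline step:

```python
def project_operators(Vy, Vu, M, A_q, F, C, U, M_u):
    """Offline Galerkin projection onto the reduced spaces. The affine
    components A^q are projected individually so that
    A_N(mu) = sum_q Theta_q(mu) * A_N^q can be formed online, cf. (3);
    all reduced quantities are independent of the FE dimension."""
    return {
        "M":   Vy.T @ M @ Vy,
        "A_q": [Vy.T @ Aq @ Vy for Aq in A_q],
        "F":   Vy.T @ F,
        "C":   C @ Vy,
        "U":   Vu.T @ U @ Vu,
        "M_u": Vy.T @ M_u @ Vu,
    }
```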

3.3 A posteriori error estimation

We turn to the a posteriori error estimation procedure. Although we consider a parametrized problem here, we note that the error bounds proposed below can also be used in the non-parametrized reduced-order setting and are independent of how the reduced-order spaces are constructed, i.e., the bound directly applies to reduced-order approaches where the spaces are constructed e.g. using empirical orthogonal functions, POD, or dual-weighted POD (Daescu and Navon 2008).

As mentioned above, our main goal is to rigorously bound the error in the optimal control, \(u^* - u_N^*\). This will allow us to confirm the fidelity of the reduced-order 4D-Var solution efficiently during the online stage. Our a posteriori error bounds are also crucial in the construction of the reduced basis spaces by the POD-Greedy algorithm (see Sect. 3.5).

To begin, we require the residuals

$$\begin{aligned} r_y^k(\phi ;\mu )&= f(\phi ) - a\left( y_N^{*,k},\phi ;\mu \right) - \frac{1}{\tau } m\left( y_N^{*,k}- y_N^{*,k-1},\phi \right) \quad \forall \phi \in Y, \ k \in \mathbb {K}, \end{aligned}$$
(12)
$$\begin{aligned} r_p^k(\varphi ;\mu )&= \left( z_d^k - C y_N^{*,k}, C \varphi \right) _D - a\left( \varphi ,p_N^{*,k};\mu \right) - \frac{1}{\tau } m\left( \varphi ,p_N^{*,k}-p_N^{*,k+1}\right) \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \forall \varphi \in Y, \ k \in \mathbb {K}, \end{aligned}$$
(13)
$$\begin{aligned} r_u(\psi ;\mu )&= m\left( \psi ,p_N^{*,1}\right) - \left( u_N^* - u_d, \psi \right) _U\quad \forall \psi \in U. \end{aligned}$$
(14)

We also define

$$\begin{aligned} R_y = \left( \tau \sum _{k=1}^K \left||r_y^k\right||_{Y'}^2 \right) ^{1/2}, \qquad R_p = \left( \tau \sum _{k=1}^K \left||r_p^k\right||_{Y'}^2 \right) ^{1/2}, \end{aligned}$$
(15)

and the errors \(e_y^k= y^{*,k} - y_N^{*,k}\), \(e_p^k= p^{*,k} - p_N^{*,k}\), and \(e_u = u^* - u_N^*\). Note that we use \(\left||r_{y,p}^k\right||_{Y'}\) and \(\left||r_u\right||_{U'}\) as a shorthand notation for \(\left||r_{y,p}^k(\cdot ;\mu )\right||_{Y'}\) and \(\left||r_u(\cdot ;\mu )\right||_{U'}\), respectively. We can now state our main result:

Proposition 1

Let \(u^*\) and \(u_N^*\) be the optimal solutions of the full-order and reduced-order 4D-Var problems, (6) and (10), respectively. The error satisfies

$$\begin{aligned} \left||u^* - u_N^*\right||_U \le {\varDelta }_N^u(\mu ) := c_1(\mu ) + \sqrt{c_1(\mu )^2 + c_2(\mu )} \quad \forall \mu \in \mathcal {D}, \end{aligned}$$
(16)

where \(c_1(\mu )\) and \(c_2(\mu )\) are given by

$$\begin{aligned} c_1(\mu )&= \frac{1}{2} \left( \left||r_u(\cdot ;\mu )\right||_{U'} + \frac{1}{\sqrt{\alpha _{\mathrm{LB}}(\mu )}} R_p \right) , \quad \text {and} \end{aligned}$$
(17)
$$\begin{aligned} \quad c_2(\mu )&= \left( \frac{\sqrt{2} + 1}{\alpha _{\mathrm{LB}}(\mu )} R_y R_p + \frac{\gamma _c^2}{2 (\alpha _{\mathrm{LB}}(\mu ))^2} R_y^2 \right) . \end{aligned}$$
(18)

Proof

We start from the error-residual equations obtained from (8) and the definitions of the residuals

$$\begin{aligned} m\left( e_y^k- e_y^{k-1},\phi \right) + \tau \, a\left( e_y^k,\phi ;\mu \right)&= \tau \, r_y^k(\phi ;\mu ), \qquad \qquad \forall \phi \in Y, \; k \in \mathbb {K}, \end{aligned}$$
(19)
$$\begin{aligned} m\left( \varphi ,e_p^k- e_p^{k+1}\right) + \tau \, a\left( \varphi ,e_p^k;\mu \right)&= \tau \, r_p^k(\varphi ;\mu ) -\tau \, \left( C e_y^k, C \varphi \right) _D, \nonumber \\&\quad \quad \,\,\,\,\qquad \qquad \qquad \qquad \qquad \forall \varphi \in Y, \; k \in \mathbb {K}, \end{aligned}$$
(20)
$$\begin{aligned} (e_u,\psi )_U- m\left( \psi ,e_p^1\right)&= r_u(\psi ;\mu ), \qquad \qquad \qquad \forall \psi \in U, \end{aligned}$$
(21)

where \(e_p^{K+1} = 0\) and \(e_y^0 = e_u\). We first choose \(\phi = e_p^k\) in (19) and take the sum from \(k=1\) to K to get

$$\begin{aligned} \sum _{k=1}^K m\left( e_y^k-e_y^{k-1},e_p^k\right) + \tau \sum _{k=1}^K a\left( e_y^k,e_p^k;\mu \right) = \tau \sum _{k=1}^K r_y^k\left( e_p^k;\mu \right) . \end{aligned}$$
(22)

Similarly, choosing \(\varphi = e_y^k\) in (20) and summing from \(k=1\) to K we obtain

$$\begin{aligned} \sum _{k=1}^K m\left( e_y^k,e_p^k-e_p^{k+1}\right) + \tau \sum _{k=1}^K a\left( e_y^k,e_p^k;\mu \right) = \tau \sum _{k=1}^K r_p^k\left( e_y^k;\mu \right) - \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2. \end{aligned}$$
(23)

Finally, from (21) with \(\psi = e_u\) we have

$$\begin{aligned} \left||e_u\right||_U^2 - m\left( e_u,e_p^1\right) = r_u(e_u;\mu ). \end{aligned}$$
(24)

By adding Eqs. (23) and (24), and then subtracting (22) we get

$$\begin{aligned}&\sum _{k=1}^K m\left( e_y^{k-1},e_p^k\right) - \sum _{k=1}^K m\left( e_y^k,e_p^{k+1}\right) - m(e_u,e_p^1) + \left||e_u\right||_U^2 \nonumber \\&\quad = -\,\tau \sum _{k=1}^K r_y^k\left( e_p^k;\mu \right) + \tau \sum _{k=1}^K r_p^k\left( e_y^k;\mu \right) +r_u(e_u;\mu ) - \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2. \end{aligned}$$
(25)

Since \(e_y^0=e_u\), and \(e_p^{K+1}=0\), the left-hand side of (25) reduces to \(\left||e_u\right||_U^2\) and we thus obtain

$$\begin{aligned} \left||e_u\right||_U^2 + \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2&= -\,\tau \sum _{k=1}^K r_y^k\left( e_p^k;\mu \right) + \tau \sum _{k=1}^K r_p^k\left( e_y^k;\mu \right) +r_u(e_u;\mu ) \nonumber \\&\le \left( \tau \sum _{k=1}^K \left||r_y^k\right||_{Y'}^2 \right) ^{1/2} \left( \tau \sum _{k=1}^K \left||e_p^k\right||_Y^2 \right) ^{1/2} \nonumber \\&\quad + \,\left( \tau \sum _{k=1}^K \left||r_p^k\right||_{Y'}^2 \right) ^{1/2} \left( \tau \sum _{k=1}^K \left||e_y^k\right||_Y^2 \right) ^{1/2} + \left||r_u\right||_{U'} \left||e_u\right||_U. \end{aligned}$$
(26)

From the proof for the spatio-temporal energy norm bound in Grepl and Patera (2005) and Kärcher and Grepl (2014) we know that

$$\begin{aligned} \tau \sum _{k=1}^K \left||e_y^k\right||_Y^2 \le \frac{\tau }{(\alpha _{\mathrm{LB}}(\mu ))^2} \sum _{k=1}^K \left||r_y^k\right||_{Y'}^2 + \frac{1}{\alpha _{\mathrm{LB}}(\mu )} \underbrace{m(e_u,e_u)}_{= \left||e_u\right||_U^2}. \end{aligned}$$
(27)

We need an analogous result for the adjoint. To this end, we first choose \(\varphi = e_p^k\) in (20) to obtain

$$\begin{aligned} m\left( e_p^k,e_p^k- e_p^{k+1}\right) + \tau \, a\left( e_p^k,e_p^k;\mu \right) = \tau \, r_p^k\left( e_p^k;\mu \right) - \tau \, \left( C e_y^k, C e_p^k\right) _D. \end{aligned}$$
(28)

We next note from the Cauchy–Schwarz inequality and Young’s inequality that

$$\begin{aligned} 2 \, m\left( e_p^k,e_p^{k+1}\right) \le m\left( e_p^k,e_p^k\right) + m\left( e_p^{k+1},e_p^{k+1}\right) , \end{aligned}$$
(29)

and also that

$$\begin{aligned} 2 \, \tau \, \left( C e_y^k, C e_p^k\right) _D&\le 2 \, \tau \, \left||C e_y^k\right||_D \, \left||C e_p^k\right||_D \le 2 \, \tau \, \left||C e_y^k\right||_D \, \gamma _c\, \left||e_p^k\right||_Y \nonumber \\&\le \frac{2 \, \tau \, \gamma _c^2}{\alpha _{\mathrm{LB}}(\mu )} \left||C e_y^k\right||_D^2 + \frac{\tau \, \alpha _{\mathrm{LB}}(\mu )}{2} \left||e_p^k\right||_Y^2, \end{aligned}$$
(30)

where we also used the definition of the constant \(\gamma _c\). Finally, again from Young’s inequality we obtain

$$\begin{aligned} 2 \, \tau \, r_p^k\left( e_p^k;\mu \right) \le \frac{2 \, \tau }{\alpha _{\mathrm{LB}}(\mu )} \left||r_p^k\right||_{Y'}^2 + \frac{\tau \, \alpha _{\mathrm{LB}}(\mu )}{2} \left||e_p^k\right||_Y^2. \end{aligned}$$
(31)

Multiplying (28) by two, summing from \(k=1\) to K, and invoking (29), (30), and (31), we obtain

$$\begin{aligned} m\left( e_p^1,e_p^1\right) + \tau \sum _{k=1}^K a\left( e_p^k,e_p^k;\mu \right) \le \frac{2 \, \tau }{\alpha _{\mathrm{LB}}(\mu )} \sum _{k=1}^K \left||r_p^k\right||_{Y'}^2 + \frac{2 \, \tau \, \gamma _c^2}{\alpha _{\mathrm{LB}}(\mu )} \sum _{k=1}^K \left||C e_y^k\right||_D^2, \end{aligned}$$
(32)

and hence

$$\begin{aligned} \tau \sum _{k=1}^K \left||e_p^k\right||_Y^2 \le \frac{2 \, \tau }{(\alpha _{\mathrm{LB}}(\mu ))^2} \sum _{k=1}^K \left||r_p^k\right||_{Y'}^2 + 2 \left( \frac{ \gamma _c}{\alpha _{\mathrm{LB}}(\mu )} \right) ^2 \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2. \end{aligned}$$
(33)

Using the inequalities (27) and (33) in (26), invoking the definitions (15), and noting that \((a^2 + b^2)^{1/2} \le |a| + |b|\), it follows that

$$\begin{aligned} \left||e_u\right||_U^2 + \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2&\le \left||r_u\right||_{U'} \left||e_u\right||_U+ R_p \left[ \frac{1}{(\alpha _{\mathrm{LB}}(\mu ))^2} R_y^2 + \frac{1}{\alpha _{\mathrm{LB}}(\mu )} \left||e_u\right||_U^2 \right] ^{1/2} \nonumber \\&\quad + R_y \left[ \frac{2}{(\alpha _{\mathrm{LB}}(\mu ))^2} R_p^2 + 2 \left( \frac{ \gamma _c}{\alpha _{\mathrm{LB}}(\mu )} \right) ^2 \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2 \right] ^{1/2} \nonumber \\&\le \left||r_u\right||_{U'} \left||e_u\right||_U+ R_p \left[ \frac{1}{\alpha _{\mathrm{LB}}(\mu )} R_y + \frac{1}{\sqrt{\alpha _{\mathrm{LB}}(\mu )}} \left||e_u\right||_U\right] \nonumber \\&\quad + R_y \left[ \frac{\sqrt{2}}{\alpha _{\mathrm{LB}}(\mu )} R_p + \frac{\sqrt{2} \, \gamma _c}{\alpha _{\mathrm{LB}}(\mu )} \left( \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2 \right) ^{1/2} \right] . \end{aligned}$$
(34)

We now use Young’s inequality to bound

$$\begin{aligned} R_y \frac{\sqrt{2} \, \gamma _c}{\alpha _{\mathrm{LB}}(\mu )} \left( \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2 \right) ^{1/2} \le \frac{\gamma _c^2}{2 (\alpha _{\mathrm{LB}}(\mu ))^2} R_y^2 + \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2, \end{aligned}$$
(35)

and thereby eliminate the second term on the left-hand side of the inequality (34) to obtain

$$\begin{aligned}&\left||e_u\right||_U^2 \le \left||r_u\right||_{U'} \left||e_u\right||_U+ \frac{1}{\sqrt{\alpha _{\mathrm{LB}}(\mu )}} R_p \left||e_u\right||_U\nonumber \\&\quad + \frac{\sqrt{2} + 1}{\alpha _{\mathrm{LB}}(\mu )} R_y R_p + \frac{\gamma _c^2}{2 (\alpha _{\mathrm{LB}}(\mu ))^2} R_y^2. \end{aligned}$$
(36)

Using the definitions of \(c_1(\mu )\) and \(c_2(\mu )\) in (17) and (18), respectively, (36) simplifies to

$$\begin{aligned} \left||e_u\right||_U^2 - 2 \, c_1(\mu ) \, \left||e_u\right||_U- c_2(\mu ) \le 0. \end{aligned}$$
(37)

We obtain the desired result by bounding the error \(\left||e_u\right||_{U}\) by the larger root of the quadratic inequality. \(\square\)

We note that we currently cannot assess the tightness of the error bound (16) by providing an a priori upper bound for the effectivity, i.e., the ratio of the bound to the error. We present numerical results for the effectivity in Sect. 6.2.

3.4 Computational procedure

We briefly comment on the computational procedure to solve the reduced-order 4D-Var problem and to evaluate the error bound. Given the affine parameter dependence, the offline-online decomposition for the reduced basis approximation is already quite standard in the reduced basis literature (Rozza et al. 2008); for the parabolic case considered in this paper, we also specifically refer to Grepl and Patera (2005) and Kärcher and Grepl (2014). The evaluation of the a posteriori error bounds requires the following ingredients:

  • The dual norm of the residuals \(\left||r_y^k\right||_{Y'}\), \(\left||r_p^k\right||_{Y'}\), and \(\left||r_u\right||_{U'}\);

  • The coercivity lower bound \(\alpha _{\mathrm{LB}}(\mu )\) and the constant \(\gamma _c\).

For the construction of the coercivity lower bound, \(\alpha _{\mathrm{LB}}(\mu )\), various recipes exist (Huynh et al. 2007; Prud’homme et al. 2002; Veroy et al. 2002). The specific choices for our numerical tests are stated in Sect. 6. The constant \(\gamma _c\) is parameter-independent and can be computed by solving a generalized eigenproblem. The offline-online evaluation of the dual norms of the residuals is standard and hence omitted (Rozza et al. 2008). For a summary of the computational cost in the parabolic optimal control context, we refer to Kärcher and Grepl (2014).
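For instance, with \(\mathrm {Y}\) denoting the Gram matrix of the \((\cdot ,\cdot )_Y\) inner product, \(\gamma _c\) is the square root of the largest eigenvalue of the generalized eigenproblem \(\mathrm {C}^T \mathrm {D}\mathrm {C}\, v = \lambda \, \mathrm {Y}v\); a dense sketch with hypothetical names:

```python
import numpy as np
from scipy.linalg import eigh

def continuity_constant(C, D, Ygram):
    """gamma_c = sup_v ||C v||_D / ||v||_Y, cf. (4): square root of the
    largest eigenvalue of C^T D C v = lambda * Ygram v."""
    lam = eigh(C.T @ D @ C, Ygram, eigvals_only=True)  # ascending order
    return np.sqrt(lam[-1])
```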

We solve the full-order and reduced-order 4D-Var problems with a preconditioned Newton-CG method on the “reduced” cost functional \(j(u;\mu ) := J(y(u),u;\mu )\), i.e., we eliminate the PDE-constraint in the minimization problem. The control mass matrix is used as a preconditioner. We present results for the number of CG iterations in Sect. 6. Overall, the online computational cost to solve the reduced-order 4D-Var problem and to evaluate the a posteriori error bound depends only on the reduced basis dimensions \(N_Y\) and \(N_U^0\), and is independent of \(\mathcal {N}\).
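Once the residual dual norms and constants are available, evaluating the bound of Proposition 1 reduces to a few scalar operations; a minimal sketch of (16)-(18):

```python
import math

def delta_N_u(Ru, Ry, Rp, alpha_lb, gamma_c):
    """Control error bound of Proposition 1: Ru = ||r_u||_{U'}, and
    Ry, Rp are the aggregated residual norms from (15)."""
    c1 = 0.5 * (Ru + Rp / math.sqrt(alpha_lb))
    c2 = ((math.sqrt(2) + 1) / alpha_lb * Ry * Rp
          + gamma_c**2 / (2 * alpha_lb**2) * Ry**2)
    return c1 + math.sqrt(c1**2 + c2)   # larger root of (37)
```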

3.5 Greedy algorithm

To construct the reduced basis spaces \(Y_N\) and \(U_N^0\), we use the POD-Greedy sampling procedure in Algorithm 1. Here, \({\varXi }_\mathrm {train}\subset \mathcal {D}\) is a finite but suitably large training sample, \(\mu ^1 \in {\varXi }_\mathrm {train}\) is the initial parameter value, \(N_{\mathrm{max}}\) the maximum number of greedy iterations, and \(\epsilon _\mathrm {tol, min}> 0\) a prescribed error tolerance. We also define the relative error bound \({\varDelta }_{N,\mathrm{rel}}^u(\mu ) = {\varDelta }_N^u(\mu )/\left||u_N^*(\mu )\right||_U\). Furthermore, for a given time history \(v_k \in Y, \ k \in \mathbb {K}\), the operator \(\text {POD}_Y(\{ v_k: k \in \mathbb {K}\})\) returns the largest POD-mode with respect to the \((\cdot ,\cdot )_Y\) inner product (normalized with respect to the Y-norm), and \(v^k_{{\text {proj}},N}(\mu )\) denotes the Y-orthogonal projection of \(v^k(\mu )\) onto the reduced basis space \(Y_N\).

In steps 6 and 7 of Algorithm 1 we expand the reduced basis space \(Y_N\) with the largest POD mode of both the state and the adjoint solution. Note that we apply the POD in these two steps to the time history of the optimal state and adjoint projection errors, i.e., \(e^{y,k}_{\text {proj},N}(\mu ) = y^{*,k}(\mu ) - y^{*,k}_{\text {proj},N}(\mu )\) and \(e^{p,k}_{\text {proj},N}(\mu ) = p^{*,k}(\mu ) - p^{*,k}_{\text {proj},N}(\mu ), \ k \in \mathbb {K}\), and not to the solutions \(y^k(\mu ),\ k \in \mathbb {K}\), and \(p^k(\mu ),\ k \in \mathbb {K}\). This ensures that the POD modes are already orthogonal with respect to the \((\cdot ,\cdot )_Y\) inner product and that we add only new information to \(Y_N\) which is not yet captured in the reduced basis.

In step 8 we expand the reduced basis space \(U_N^0\) with the optimal control at \(\mu ^*\). Due to the time-dependence of the state and adjoint, it is possible that a specific parameter \(\tilde{\mu }\) is picked several times by the greedy search in step 9. Before expanding \(U_N^0\), we thus need to check if the new snapshot is already contained in the reduced basis space \(U_{N-1}^0\), and consequently discard linearly dependent snapshots. By construction, we thus have \(\dim (U_N^0) \le N\) and \(\dim (Y_N) = 2N\) (although it is theoretically possible that \(\dim (Y_N) < 2N\), we did not observe this case in the numerical results). Finally, we note that information from the data assimilation cost functional enters \(Y_N\) through the adjoint equation and the adjoint snapshots.

[Algorithm 1: POD-Greedy sampling procedure for the strong-constraint 4D-Var problem; the listing is not reproduced here.]
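The core operation in steps 6 and 7, namely the largest POD mode of the projection-error time history with respect to a given inner product, can be sketched via the method of snapshots as follows; the same routine with the U Gram matrix serves as the operator \(\text {POD}_U\) of Sect. 4.4. All names are hypothetical.

```python
import numpy as np

def largest_pod_mode(snapshots, basis, gram):
    """Return the dominant POD mode (w.r.t. the gram inner product) of the
    projection errors of the snapshot time history, cf. steps 6-7 of
    Algorithm 1. snapshots: (n, K) array; basis: (n, m) array with
    gram-orthonormal columns; assumes the projection error is nonzero."""
    if basis.shape[1] > 0:   # subtract the gram-orthogonal projection
        snapshots = snapshots - basis @ (basis.T @ (gram @ snapshots))
    corr = snapshots.T @ (gram @ snapshots)      # K x K correlation matrix
    lam, V = np.linalg.eigh(corr)                # ascending eigenvalues
    mode = snapshots @ V[:, -1]
    return mode / np.sqrt(mode @ (gram @ mode))  # normalize in the gram norm
```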

4 Weak-constraint 4D-Var

We next consider the weak-constraint 4D-Var data assimilation problem, thus accounting for possible model errors in the dynamical system. For simplicity, we assume in this section that the initial condition is known and that we are interested only in bounding the model error. We consider the combined problem (unknown initial condition and model error) in the next section.

4.1 Problem statement

To emphasize the relation between the weak-constraint 4D-Var problem and the optimal control setting, we denote in this section the model error by u. However, the model error is now time-dependent, i.e., \(u = u^k, \, k \in \mathbb {K}\), and appears in every time step of the dynamical system. For a given parameter \(\mu \in \mathcal {D}\), the weak-constraint 4D-Var problem is then given by the minimization problem

$$\begin{aligned} \begin{aligned}&\min _{y \in Y^K, \, u \in U^K} J(y,u;\mu ) \quad \text {s.t.} \quad y \in Y^K \quad \text {solves} \\&m(y^k,v) + \tau \, a(y^k,v;\mu ) = m(y^{k-1},v) + \tau \, b(u^k,v) + \tau \, f(v) \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \forall v \in Y, \ \forall k \in \mathbb {K}, \end{aligned} \end{aligned}$$
(38)

with initial condition \(m(y^0,v) = m(y_0,v)\) for all \(v \in Y,\) and cost functional \(J(\cdot ,\cdot ;\mu ): Y^K \times U^K \rightarrow \mathbb {R}\) given by

$$\begin{aligned} J(y,u;\mu ) = \frac{\tau }{2} \sum _{k=1}^K \left||u^k - u_d^k\right||_{U}^2 + \frac{\tau }{2} \sum _{k=1}^K \left||C y^k - z_d^k\right||^2_{D}. \end{aligned}$$
(39)

We note that the cost functional now contains the contribution of the model error \(u^k\) as a sum over all time steps. In the optimal control setting, \(u_d^k \in U, \, k \in \mathbb {K}\) denotes the desired optimal control. In the data assimilation setting, however, \(u_d^k\) is usually set to zero since the model error is generally assumed to be unbiased (Law et al. 2015). We also note that a constant (known) bias can be taken into account by adjusting the right-hand side f(v). Similar to the strong-constraint formulation, \(z_d^k \in D\), \(k \in \mathbb {K}\), are the observed outputs.
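Compared to the strong-constraint forward solve sketched in Sect. 3.1.1, only the right-hand side of each time step changes; a minimal sketch of the constraint in (38), with a hypothetical matrix B representing the bilinear form \(b(\cdot ,\cdot )\):

```python
import numpy as np

def solve_forward_weak(M, A, B, F, y0, us, tau):
    """Weak-constraint time stepping, cf. (38):
    (M + tau*A) y^k = M y^{k-1} + tau*B u^k + tau*F,
    with known initial state y0 and model-error trajectory us = [u^1..u^K]."""
    lhs = M + tau * A
    y, ys = y0, []
    for u_k in us:
        y = np.linalg.solve(lhs, M @ y + tau * (B @ u_k) + tau * F)
        ys.append(y)
    return ys
```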

We again obtain the associated necessary and sufficient first-order optimality conditions using a Lagrangian approach: Given \(\mu \in \mathcal {D}\), the optimal solution \((y^*,p^*,u^*) \in Y^K \times Y^K \times U^K\) satisfies

$$m\left( y^{*,k} - y^{*,k-1},\phi \right) + \tau \, a ( y^{*,k},\phi ;\mu )= \tau \, b(u^{*,k},\phi ) + \tau \, f(\phi ) \quad \forall \phi \in Y, \ \forall k \in \mathbb {K},$$
(40a)
$$m(y^{*,0},\phi )= m(y_0,\phi )\quad \forall \phi \in Y,$$
(40b)
$$m(\varphi , p^{*,k} - p^{*,k+1}) + \tau \, a (\varphi ,p^{*,k};\mu )= \tau \, \left( z_d^k - C y^{*,k}, C \varphi \right) _{D} \quad \forall \varphi \in Y, \ \forall k \in \mathbb {K},$$
(40c)
$$\tau \, \left( u^{*,k} - u_d^k,\psi \right) _U - \tau \, b(\psi , p^{*,k})= 0\quad \forall \psi \in U, \ \forall k \in \mathbb {K},$$
(40d)

where the final condition of the adjoint is given by \(p^{*,K+1} = 0\). We note that the adjoint equation of the weak-constraint formulation (40c) is identical to the adjoint of the strong constraint formulation (8c).

4.2 Reduced basis approximation

We again assume that we are given the reduced basis spaces \(Y_N \subset Y\) for the state and adjoint and \(U_N \subset U\) for the control. Whereas the construction of the space \(Y_N\) directly follows from the discussion in Sect. 3.5 for the strong-constraint case, the construction of \(U_N\) needs to be adjusted to account for the time-dependence of the model error. We briefly outline the procedure in Sect. 4.4.

For a given parameter \(\mu \in \mathcal {D}\), we can now state the weak-constraint reduced-order 4D-Var data assimilation problem as follows

$$\begin{aligned} \begin{aligned}&\min _{y_N \in Y_N^K, \, u_N \in U_N^K} J(y_N,u_N;\mu ) \quad \text {s.t.} \quad y_N \in Y_N^K \quad \text {solves} \\&m\left( y_N^k,v\right) + \tau \, a\left( y_N^k,v;\mu \right) = m\left( y_N^{k-1},v\right) + \tau \, b\left( u_N^k,v\right) + \tau \, f(v) \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \forall v \in Y_N, \ \forall k \in \mathbb {K}, \end{aligned} \end{aligned}$$
(41)

with initial condition \(m(y_N^0,v) = m(y_0,v)\) for all \(v \in Y_N\). The reduced-order optimality system directly follows from (40) and is thus omitted.

4.3 A posteriori error estimation

We first introduce the residuals for the weak-constraint case

$$\begin{aligned} \tilde{r}_y^k(\phi ;\mu )&= f(\phi ) + b\left( u_N^{*,k},\phi \right) - a\left( y_N^{*,k},\phi ;\mu \right) - \frac{1}{\tau } m\left( y_N^{*,k}- y_N^{*,k-1},\phi \right) \nonumber \\&\qquad \qquad \qquad \qquad \qquad \forall \phi \in Y, \ k \in \mathbb {K}, \end{aligned}$$
(42)
$$\begin{aligned} \tilde{r}_p^k(\varphi ;\mu )&= \left( z_d^k - C y_N^{*,k}, C \varphi \right) _D - a\left( \varphi ,p_N^{*,k};\mu \right) - \frac{1}{\tau } m\left( \varphi ,p_N^{*,k}-p_N^{*,k+1}\right) \nonumber \\&\qquad \qquad \qquad \forall \varphi \in Y, \ k \in \mathbb {K}, \end{aligned}$$
(43)
$$\begin{aligned} \tilde{r}_u^k(\psi ;\mu )&= m\left( \psi ,p_N^{*,k}\right) - \left( u_N^{*,k}- u_d, \psi \right) _U\quad \forall \psi \in U, \ k \in \mathbb {K}. \end{aligned}$$
(44)

Since the adjoint equations (40c) and (8c) are identical, the adjoint residual is actually equivalent to the strong-constraint case, i.e., \(r_p^k= \tilde{r}_p^k\). Similar to (15), we introduce the sums from \(k =1\) to K of the dual norms of the residuals as

$$\begin{aligned} \tilde{R}_{y,p} = \left( \tau \sum _{k=1}^K \left||\tilde{r}_{y,p}^k(\cdot ;\mu )\right||_{Y'}^2 \right) ^{1/2}, \qquad \tilde{R}_u = \left( \tau \sum _{k=1}^K \left||\tilde{r}_u^k(\cdot ;\mu )\right||_{U'}^2 \right) ^{1/2}, \end{aligned}$$
(45)

and the time-dependent model error \(e_u^k= u^{*,k} - u_N^{*,k}\). We may now state our main result:

Proposition 2

Let \(u^{*,k}\) and \(u_N^{*,k}\), \(k \in \mathbb {K}\), be the optimal solutions of the full-order and reduced-order 4D-Var problems (38) and (41), respectively. The error satisfies

$$\begin{aligned} \left( \tau \sum _{k=1}^{K} \left||u^{*,k} - u_N^{*,k}\right||_U^2 \right) ^{1/2} \le \tilde{{\varDelta }}_N^u(\mu ) := c_1(\mu ) + \sqrt{c_1(\mu )^2 + c_2(\mu )} \quad \forall \mu \in \mathcal {D}, \end{aligned}$$
(46)

where \(c_1(\mu )\) and \(c_2(\mu )\) are given by

$$\begin{aligned} c_1(\mu )&= \frac{1}{2} \left( \tilde{R}_u + \frac{\sqrt{2} \, \gamma _b}{\alpha _{\mathrm{LB}}(\mu )} \tilde{R}_p \right) , \quad \text {and} \end{aligned}$$
(47)
$$\begin{aligned} \quad c_2(\mu )&= \left( \frac{2\sqrt{2}}{\alpha _{\mathrm{LB}}(\mu )} \tilde{R}_y \tilde{R}_p + \frac{\gamma _c^2}{2 (\alpha _{\mathrm{LB}}(\mu ))^2} \tilde{R}_y^2 \right) . \end{aligned}$$
(48)

Proof

The proof follows partly from the proof of Proposition 1; we thus stress the differences and refer to the previous proof whenever possible. We again start from the error-residual equations which are now given by

$$m\left( e_y^k- e_y^{k-1},\phi \right) + \tau \, a\left( e_y^k,\phi ;\mu \right)= \tau \, \tilde{r}_y^k(\phi ;\mu ) + \tau \, b(e_u^k,\phi ), \quad\forall \phi \in Y, \; k \in \mathbb {K},$$
(49)
$$m\left( \varphi ,e_p^k- e_p^{k+1}\right) + \tau \, a\left( \varphi ,e_p^k;\mu \right)= \tau \, \tilde{r}_p^k(\varphi ;\mu ) -\tau \, \left( C e_y^k, C \varphi \right) _D, \quad \forall \varphi \in Y, \; k \in \mathbb {K},$$
(50)
$$\tau \, \left( e_u^k,\psi \right) _U- \tau \, b\left( \psi ,e_p^k\right)= \tau \, \tilde{r}_u^k(\psi ;\mu ),\quad \forall \psi \in U, \; k \in \mathbb {K},$$
(51)

where \(e_p^{K+1} = 0\) and \(e_y^0 = 0\), since we guarantee that \(y_0 \in Y_N\). We now choose \(\phi = e_p^k\) in (49), \(\varphi = e_y^k\) in (50), and \(\psi = e_u^k\) in (51), sum all equations from \(k=1\) to K, and combine them following the proof of Proposition 1 to obtain

$$\begin{aligned} \tau \sum _{k=1}^{K}&\left||e_u^k\right||_U^2 + \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2 \nonumber \\&= -\,\tau \sum _{k=1}^K \tilde{r}_y^k\left( e_p^k;\mu \right) + \tau \sum _{k=1}^K \tilde{r}_p^k\left( e_y^k;\mu \right) + \tau \sum _{k=1}^K \tilde{r}_u^k\left( e_u^k;\mu \right) \nonumber \\&\le \tilde{R}_y \left( \tau \sum _{k=1}^K \left||e_p^k\right||_Y^2 \right) ^{1/2} + \tilde{R}_p \left( \tau \sum _{k=1}^K \left||e_y^k\right||_Y^2 \right) ^{1/2} + \tilde{R}_u \left( \tau \sum _{k=1}^K \left||e_u^k\right||_U^2 \right) ^{1/2}. \end{aligned}$$
(52)

We next bound the primal error. Since the primal equation contains the model error on the right-hand side, we need to extend the proof from Grepl and Patera (2005) for the spatio-temporal energy norm bound to include the extra term on the right-hand side. The derivation is similar to the one for the bound of the adjoint in the proof of Proposition 1 [cf. (28)–(33)], but instead of bounding the \((\cdot ,\cdot )_D\) inner product using Cauchy–Schwarz and the constant \(\gamma _c\), we invoke the continuity of the bilinear form \(b(\cdot ,\cdot )\). We can thus derive the bound

$$\begin{aligned} \tau \sum _{k=1}^K \left||e_y^k\right||_Y^2 \le \frac{2 \, \tau }{(\alpha _{\mathrm{LB}}(\mu ))^2} \sum _{k=1}^K \left||\tilde{r}_y^k\right||_{Y'}^2 + 2 \left( \frac{ \gamma _b}{\alpha _{\mathrm{LB}}(\mu )} \right) ^2 \tau \sum _{k=1}^K \left||e_u^k\right||_U^2. \end{aligned}$$
(53)

Furthermore, since the adjoint of the strong- and weak-constraint case are equivalent, we can directly use the bound (33). Using the inequalities (53) and (33) in (52), invoking the definitions (45), and noting that \((a^2 + b^2)^{1/2} \le |a| + |b|\), it follows that

$$\begin{aligned} \tau \sum _{k=1}^{K} \left||e_u^k\right||_U^2&+ \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2 \nonumber \\&\le \left[ \tilde{R}_u + \frac{\sqrt{2} \, \gamma _b}{\alpha _{\mathrm{LB}}(\mu )} \tilde{R}_p \right] \Big ( \tau \sum _{k=1}^K \left||e_u^k\right||_U^2 \Big )^{1/2} \nonumber \\&\quad +\frac{2 \sqrt{2}}{\alpha _{\mathrm{LB}}(\mu )} \tilde{R}_y \tilde{R}_p + \frac{\sqrt{2} \, \gamma _c}{\alpha _{\mathrm{LB}}(\mu )} \tilde{R}_y \left( \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2 \right) ^{1/2}. \end{aligned}$$
(54)

We again use Young’s inequality to bound

$$\begin{aligned} \tilde{R}_y \frac{\sqrt{2} \, \gamma _c}{\alpha _{\mathrm{LB}}(\mu )} \left( \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2 \right) ^{1/2} \le \frac{\gamma _c^2}{2 (\alpha _{\mathrm{LB}}(\mu ))^2} \tilde{R}_y^2 + \tau \sum _{k=1}^K \left||C e_y^k\right||_D^2, \end{aligned}$$
(55)

and thereby eliminate the second term on the left-hand side of (54) to obtain

$$\begin{aligned}&\tau \sum _{k=1}^{K} \left||e_u^k\right||_U^2 \le \left[ \tilde{R}_u + \frac{\sqrt{2} \, \gamma _b}{\alpha _{\mathrm{LB}}(\mu )} \tilde{R}_p \right] \left( \tau \sum _{k=1}^K \left||e_u^k\right||_U^2 \right) ^{1/2} \nonumber \\&\quad + \frac{2 \sqrt{2}}{\alpha _{\mathrm{LB}}(\mu )} \tilde{R}_y \tilde{R}_p + \frac{\gamma _c^2}{2 (\alpha _{\mathrm{LB}}(\mu ))^2} \tilde{R}_y^2. \end{aligned}$$
(56)

Using the definitions of \(c_1(\mu )\) and \(c_2(\mu )\) in (47) and (48), respectively, we obtain

$$\begin{aligned} \tau \sum _{k=1}^{K} \left||e_u^k\right||_U^2 - 2 \, c_1(\mu ) \, \left( \tau \sum _{k=1}^K \left||e_u^k\right||_U^2 \right) ^{1/2} - c_2(\mu ) \le 0. \end{aligned}$$
(57)

The desired result follows again by using the larger root of the quadratic inequality as a bound for the error. \(\square\)

The offline-online computational procedure in the weak-constraint case is analogous to the strong-constraint case discussed in Sect. 3.4 and therefore omitted. Note that we additionally require the constant \(\gamma _b\) now, which is parameter-independent and can be computed by solving a generalized eigenproblem (similar to \(\gamma _c\)). For the Newton–CG method, we use the block-diagonal matrix \(\mathrm{blkdiag}(\tau \mathrm {M}, \ldots , \tau \mathrm {M})\) as a preconditioner.

Similar to the strong-constraint case, we again cannot assess the tightness of the error bound (46) by providing an a priori upper bound for the associated effectivity. Instead, we present numerical results for the weak-constraint case also in Sect. 6.2.

4.4 Greedy algorithm

The POD-Greedy sampling procedure to construct the reduced basis spaces \(Y_N\) and \(U_N\) in the weak-constraint case is very similar to the strong-constraint case. We summarize the procedure in Algorithm 2 and comment only on the differences.

First, since we assume in this section that the initial condition \(y_0\) is known, we initialize the reduced basis space \(Y_N\) with \(y_0/\Vert y_0\Vert _Y\). Second, we additionally require the operator \(\text {POD}_U(\{ v_k: k \in \mathbb {K}\})\), which returns the largest POD mode with respect to the \((\cdot ,\cdot )_U\) inner product (and normalized with respect to the U-norm). Also, \(v^k_{{\text {proj}_U},N}(\mu )\) denotes the U-orthogonal projection of \(v^k(\mu )\) onto the reduced basis space \(U_N\), and \(e^{u,k}_{\text {proj}_U,N}(\mu ) = u^{*,k}(\mu ) - u^{*,k}_{\text {proj}_U,N}(\mu ), \ k \in \mathbb {K}\), denotes the corresponding projection error of the optimal model-error forcing. Since the model-error forcing is time-dependent, we simply replace step 8 in Algorithm 1 with a POD step and add only the largest POD mode \(\zeta\) to \(U_N\). We note that the POD modes \(\zeta\) are orthogonal with respect to the \((\cdot ,\cdot )_U\) inner product and that we now usually have \(\dim (U_N) = N\) and \(\dim (Y_N) = 2N + 1\) (due to the initial condition), i.e., the reduced basis space \(U_N\) is enriched in every greedy step. Again, it is theoretically possible that \(\dim (U_N) < N\) and \(\dim (Y_N) < 2N + 1\), although we did not observe this case in the numerical results.

[Algorithm 2: POD-Greedy sampling procedure for the weak-constraint 4D-Var problem; the listing is not reproduced here.]

5 Combined 4D-Var formulation

We now combine the results from the previous two sections and consider the classical 4D-Var data assimilation problem including model error.

5.1 Problem statement

For a given parameter \(\mu \in \mathcal {D}\), we now consider the minimization problem

$$\begin{aligned} \begin{aligned}&\min _{y \in Y^K, \, u \in U^{K+1}} J(y,u;\mu ) \quad \text {s.t.} \quad y \in Y^K \quad \text {solves} \\&m(y^k,v) + \tau \, a(y^k,v;\mu ) = m(y^{k-1},v) + \tau \, b(u^k,v) + \tau \, f(v) \\&\qquad \qquad \qquad \qquad \qquad \qquad \forall v \in Y, \ \forall k \in \mathbb {K}, \end{aligned} \end{aligned}$$
(58)

with initial condition \(m(y^0,v) = m(u^0,v)\) for all \(v \in Y,\) and cost functional \(J(\cdot ,\cdot ;\mu ): Y^K \times U^{K+1} \rightarrow \mathbb {R}\) given by

$$\begin{aligned} J(y,u;\mu ) = \frac{1}{2} \left||u^0 - u_d^0\right||_{U}^2 + \frac{\tau }{2} \sum _{k=1}^K \left||u^k - u_d^k\right||_{U}^2 + \frac{\tau }{2} \sum _{k=1}^K \left||C y^k - z_d^k\right||^2_{D}. \end{aligned}$$
(59)

In addition to the error between the predicted and observed outputs, the cost functional now contains the deviation of the initial condition from the background state, \(u_d^0 \in U,\) as well as the model error for all time steps. As mentioned earlier, in the data assimilation context we usually have \(u_d^0 \ne 0\) and \(u_d^k = 0, \, 1 \le k \le K\), i.e., the background state is nonzero whereas the model error is assumed to have zero mean.

The associated necessary and sufficient first-order optimality conditions are thus: Given \(\mu \in \mathcal {D}\), the optimal solution \((y^*,p^*,u^*) \in Y^K \times Y^K \times U^{K+1}\) satisfies

$$m\left( y^{*,k} - y^{*,k-1},\phi \right) + \tau \, a ( y^{*,k},\phi ;\mu )= \tau \, b(u^{*,k},\phi ) + \tau \, f(\phi ) \quad \forall \phi \in Y, \ \forall k \in \mathbb {K},$$
(60a)
$$m(y^{*,0},\phi )= m(u^{*,0},\phi )\quad \forall \phi \in Y,$$
(60b)
$$m(\varphi , p^{*,k} - p^{*,k+1}) + \tau \, a (\varphi ,p^{*,k};\mu )= \tau \, \left( z_d^k - C y^{*,k}, C \varphi \right) _{D} \quad\forall \varphi \in Y, \ \forall k \in \mathbb {K},$$
(60c)
$$\tau \, \left( u^{*,k} - u_d^k,\psi \right) _U - \tau \, b(\psi , p^{*,k})= 0\quad\forall \psi \in U, \ \forall k \in \mathbb {K},$$
(60d)
$$\left( u^{*,0} - u_d^0,\psi \right) _U - m(\psi , p^{*,1})= 0\quad\forall \psi \in U,$$
(60e)

where the final condition of the adjoint is given by \(p^{*,K+1} = 0\).

5.2 Reduced basis approximation and error estimation

The reduced-order problem follows directly from (58) and (59) by restricting the state, adjoint, and control spaces to their respective reduced basis spaces. We again introduce an integrated space \(Y_N\) for the state and adjoint, and two separate spaces for the “control,” i.e., \(U_N^0\) for the initial condition \(u_N^0\) and \(U_N\) for the model error \(u_N^k, \, k \in \mathbb {K}\). The greedy procedure to generate these spaces simply combines the algorithms introduced in Sects. 3.5 and 4.4.

For any given \(\mu \in \mathcal {D}\), we can now state the reduced-order minimization problem as follows

$$\begin{aligned} \begin{aligned}&\min _{y_N \in Y_N^K, \, u_N \in U_N^0 \times U_N^K} J(y_N,u_N;\mu ) \quad \text {s.t.} \quad y_N \in Y_N^K \quad \text {solves} \\&m\left( y_N^k,v\right) + \tau \, a\left( y_N^k,v;\mu \right) = m\left( y_N^{k-1},v\right) + \tau \, b\left( u_N^k,v\right) + \tau \, f(v) \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \forall v \in Y_N, \ \forall k \in \mathbb {K}, \end{aligned} \end{aligned}$$
(61)

with initial condition \(m(y_N^0,v) = m(u_N^0,v)\) for all \(v \in Y_N\). The reduced-order optimality system directly follows from (60) and is thus omitted.
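The reduced-order operators follow by standard Galerkin projection onto the reduced bases; a minimal sketch, where `Zy` and `Zu` are hypothetical basis matrices whose columns span \(Y_N\) and \(U_N\):

```python
def project_operators(Zy, Zu, A, M, B):
    """Offline Galerkin projection of the full-order operators.
    In practice this is applied to each term of the affine decomposition
    of a(.,.;mu), so the reduced operators can be assembled online."""
    A_N = Zy.T @ (A @ Zy)  # reduced stiffness (per affine term)
    M_N = Zy.T @ (M @ Zy)  # reduced mass matrix
    B_N = Zy.T @ (B @ Zu)  # reduced control-to-state coupling
    return A_N, M_N, B_N
```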

The a posteriori error bound result is a combination of the strong- and weak-constraint case. In addition to the residuals of the state \(\tilde{r}_y^k\), adjoint \(\tilde{r}_p^k\), and model error \(\tilde{r}_u^k\) defined in (42), (43), and (44), we also require the residual

$$\begin{aligned} r_u^0(\psi ;\mu ) = m\left( \psi ,p_N^{*,1}\right) - \left( u_N^{*,0} - u_d^0, \psi \right) _U\quad \forall \psi \in U. \end{aligned}$$
(62)
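The dual norm \(\left||r_u^0(\cdot ;\mu )\right||_{U'}\) (and likewise the dual norms of the other residuals) is computed via its Riesz representative; a minimal sketch with hypothetical names, where `r` is the coefficient vector of the residual functional:

```python
import numpy as np
from scipy.sparse.linalg import spsolve

def dual_norm(r, M_U):
    """||r||_{U'} via the Riesz representative: solve M_U rho = r,
    then ||r||_{U'} = ||rho||_U = sqrt(r^T rho). In the offline-online
    setting this solve is precomputed, and the online evaluation
    reduces to a small quadratic form."""
    rho = spsolve(M_U.tocsc(), r)
    return np.sqrt(r @ rho)
```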

The a posteriori error bound is given in the following proposition.

Proposition 3

Let \(u^{*,k}\) and \(u_N^{*,k}\) be the optimal solutions of the full-order and reduced-order 4D-Var problems (58) and (61), respectively. The error satisfies

$$\begin{aligned}&\left( \left||u^{*,0} - u_N^{*,0}\right||_U^2 + \tau \, \sum _{k=1}^{K} \left||u^{*,k} - u_N^{*,k}\right||_U^2 \right) ^{1/2} \nonumber \\&\quad \le \hat{{\varDelta }}_N^u(\mu ) := c_1(\mu ) + \sqrt{c_1(\mu )^2 + c_2(\mu )} \quad \forall \mu \in \mathcal {D}, \end{aligned}$$
(63)

where \(c_1(\mu )\) and \(c_2(\mu )\) are given by

$$\begin{aligned} c_1(\mu ) = \frac{1}{2} \left( \left( \left||r_u^0(\cdot ;\mu )\right||_{U'}^2 + \tilde{R}_u^2 \right) ^{1/2} + \left( \frac{2 \, \gamma _b^2}{(\alpha _{\mathrm{LB}}(\mu ))^2} + \frac{1}{\alpha _{\mathrm{LB}}(\mu )} \right) ^{1/2} \tilde{R}_p \right) \end{aligned}$$
(64)

and

$$\begin{aligned} c_2(\mu ) = \left( \frac{2\sqrt{2}}{\alpha _{\mathrm{LB}}(\mu )} \tilde{R}_y \tilde{R}_p + \frac{\gamma _c^2}{2 (\alpha _{\mathrm{LB}}(\mu ))^2} \tilde{R}_y^2 \right) . \end{aligned}$$
(65)

The proof follows from the proofs of Propositions 1 and 2 and is thus omitted. The offline-online decomposition is analogous to our previous discussion in Sect. 3.4.
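Given the dual norms of the residuals, the online evaluation of the bound (63) requires only a few scalar operations; a sketch in the notation of (64) and (65), with all inputs assumed to be precomputed scalars:

```python
import numpy as np

def error_bound(ru0_norm, R_u, R_p, R_y, alpha_LB, gamma_b, gamma_c):
    """A posteriori bound (63): Delta = c1 + sqrt(c1^2 + c2),
    with c1 and c2 from (64) and (65)."""
    c1 = 0.5 * (np.sqrt(ru0_norm**2 + R_u**2)
                + np.sqrt(2.0 * gamma_b**2 / alpha_LB**2 + 1.0 / alpha_LB) * R_p)
    c2 = (2.0 * np.sqrt(2.0) / alpha_LB) * R_y * R_p \
         + gamma_c**2 / (2.0 * alpha_LB**2) * R_y**2
    return c1 + np.sqrt(c1**2 + c2)
```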

6 Numerical results

6.1 Problem description

We consider the dispersion of a pollutant governed by a convection-diffusion equation with a Taylor–Green vortex velocity field. The concentration of the pollutant is measured at five spatial locations over time. The computational domain is \({\varOmega }= (-\,1,1)^2\) and we assume homogeneous Dirichlet boundary conditions on the lower boundary \({\varGamma }_D\) and homogeneous Neumann boundary conditions on the remaining boundary \({\varGamma }_N\). The Péclet number serves as our parameter, i.e., we have \(\mu = \mathrm{Pe} \in \mathcal {D}= [10,50]\). The bilinear form a is thus given by

$$\begin{aligned} a(w,v;\mu ) = \frac{1}{\mu } \int _{\varOmega }\nabla w \cdot \nabla v \; \mathrm{d}x+ \int _{\varOmega }(\beta \cdot \nabla w) v \; \mathrm{d}x, \end{aligned}$$
(66)

and the velocity field is \(\beta (x) = ( \sin (\pi x_1) \cos (\pi x_2), - \cos (\pi x_1) \sin (\pi x_2) )^T\). The domain \({\varOmega }\) with measurement sites as well as the velocity field are sketched in Fig. 1. Our model problem is motivated by the source reconstruction of a (possibly) accidental release of an agent, where the velocity field is known (Krysta et al. 2006; Krysta and Bocquet 2007). Although we consider a fixed velocity field here, our problem formulation also directly applies to (affinely) parametrized velocity fields.

We do not consider an additional forcing term and thus set \(f \equiv 0\). The inner product on \(Y_\mathrm {e}= \{ v \in H^1({\varOmega }): v|_{{\varGamma }_D} \equiv 0 \}\) is defined as \((w,v)_Y = \frac{1}{2} a(w,v;\mu ^{\mathrm {ref}}) + \frac{1}{2} a(v,w;\mu ^{\mathrm {ref}})\) for the reference parameter \(\mu ^{\mathrm {ref}}= 30\). Since \(\beta\) is divergence-free and \(\beta \cdot n \equiv 0\) on \({\varGamma }\), one can show that a is coercive and that the symmetric part of a is given by \(1 / \mu \int _{\varOmega }\nabla w \cdot \nabla v \; \mathrm{d}x\). Hence we can use the min-theta approach to construct a coercivity lower bound: \(\alpha _{\mathrm{LB}}(\mu ):= \mu ^{\mathrm {ref}}/ \mu\). For details, we refer to Appendix B.3 of Kärcher (2017).
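For this particular problem the min-theta bound can be verified directly. Since the convective term vanishes in the quadratic form,

$$\begin{aligned} a(w,w;\mu ) = \frac{1}{\mu } \int _{\varOmega }|\nabla w|^2 \; \mathrm{d}x = \frac{\mu ^{\mathrm {ref}}}{\mu } \, a(w,w;\mu ^{\mathrm {ref}}) = \frac{\mu ^{\mathrm {ref}}}{\mu } \, \Vert w\Vert _Y^2, \end{aligned}$$

so that \(\alpha _{\mathrm{LB}}(\mu ) = \mu ^{\mathrm {ref}}/\mu\) is in fact the exact coercivity constant for this example; for instance, \(\alpha _{\mathrm{LB}}(50) = 30/50 = 0.6\) at the upper end of \(\mathcal {D}\).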

Fig. 1

Left: sketch of the computational domain with measurement locations \({\varOmega }_1,\dots ,{\varOmega }_5\). The centers of the sensors are located at \((\pm\, 0.6,\pm\, 0.6)^T\) and \((0,0)^T\); their width and height are 0.1. The colors match those in Fig. 3. Right: plot of the Taylor–Green vortex velocity field. The blue dot indicates the center \((-\,0.1,0.8)^T\) of the Gaussian serving as initial condition. (Color figure online)

We choose the time interval \(I = [0,8]\) and a time step size \(\tau = 0.04\), resulting in \(K = 200\) time steps. For the space discretization we introduce a spatial mesh with an element size of \(h = 0.04\) and corresponding linear finite element approximation spaces \(Y=U\) with \(\mathcal {N}_Y = \mathcal {N}_U = 13{,}131\) degrees of freedom. We assume that the (unknown true) initial condition \(y_0^\text {true}\) is given by a spatial Gaussian function with mean \((-0.1,0.8)^T\) and covariance matrix \(\sigma ^2 \mathbb{I}\), where \(\sigma = 0.1\) and \(\mathbb {I}\) is the identity matrix (the center of the Gaussian is shown as a blue dot in Fig. 1). The average concentrations over the measurement domains shown in Fig. 1 serve as our five outputs \(h_i(\phi ) = |{\varOmega }_i|^{-1} \int _{{\varOmega }_i} \phi \; \mathrm{d}x\), \(i=1,\dots ,5\). We then generate noisy measurements by adding white noise to the outputs computed from the full-order model for the (unknown true) parameter \(\mu ^\text {true} = 30\) with initial condition \(y_0^\text {true}\), such that \(z_d^k = C y^{k,\text {true}} + \eta ^k\), where \(\eta ^k \in \mathbb {R}^5, \ k \in \mathbb {K},\) is a vector containing uncorrelated Gaussian noise in each entry, i.e., \(\eta _i^k \sim N(0,0.05^2), \ i = 1,\ldots ,5, \ k \in \mathbb {K}\). The inverse observation covariance matrix is given by \(\mathrm {D}= 10 \, \mathbb {I}\); in practice, the value 10 produces acceptable results for the 4D-Var problem (a thorough discussion of the impact of Tikhonov regularization on 4D-Var is beyond the scope of this paper; we refer to Puel (2009) for more details). In the strong-constraint case, we assume an optimal prior and set the prior mean \(u_d\) equal to the true initial condition. In the weak-constraint case, we set \(b(\cdot ,\cdot ) = m(\cdot ,\cdot )\) to account for the model-error forcing and \(u_d^k = 0, \ k \in \mathbb {K},\) i.e., the model-error forcing is assumed to have zero mean. In both cases, the inverse prior covariance matrix \(\mathrm {U}\) is given by the mass matrix.
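For illustration, a short Python sketch of this synthetic-data step; the Gaussian normalization and the random seed are not specified above and are chosen here purely for illustration.

```python
import numpy as np

def y0_true(x, sigma=0.1, center=(-0.1, 0.8)):
    """Gaussian initial condition with mean (-0.1, 0.8) and covariance
    sigma^2 I, evaluated at points x of shape (n, 2); unnormalized."""
    d2 = (x[:, 0] - center[0])**2 + (x[:, 1] - center[1])**2
    return np.exp(-d2 / (2.0 * sigma**2))

def make_noisy_data(Z_true, noise_std=0.05, seed=0):
    """z_d^k = C y^{k,true} + eta^k with eta_i^k ~ N(0, noise_std^2);
    Z_true is the K x 5 array of true outputs from the full-order solve."""
    rng = np.random.default_rng(seed)
    return Z_true + rng.normal(0.0, noise_std, size=Z_true.shape)
```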

Fig. 2

State solution for the true initial condition \(y_0^\text {true}\) and for three different parameters \(\mu\)

A preconditioned Newton–CG method takes between 30 s for \(\mu = 10\) (requiring 31 CG iterations) and 54 s for \(\mu = 50\) (requiring 56 CG iterations) to solve the full-order strong-constraint 4D-Var problem. For the weak-constraint case, the solution time ranges from 114 s (\(\mu = 10\), 81 CG iterations) to 189 s (\(\mu = 50\), 137 CG iterations). In Fig. 2, we plot the concentration of the pollutant for three different parameter values and various time steps. The influence of the Taylor–Green vortex and of the Péclet number on the solutions is clearly visible. In Fig. 3 on the left, we plot the five true outputs \(C y^{k,\text {true}}\) over time (the numbering and colors of the curves correspond to the sketch in Fig. 1). The corresponding noisy measurements \(z_d^k\) used for the data assimilation are shown on the right. We note that all computations were performed in Matlab on a computer with a 2.6 GHz Intel Core i7 processor and 16 GB of RAM.

Fig. 3

Outputs \(C y^k(\mu ^\text {true})\) and associated noisy output measurements \(z_d^k\) over time

6.2 Reduced-order 4D-Var approach

We consider the strong- and weak-constraint 4D-Var data assimilation problems separately and present results for the performance of the reduced-order approach in each setting. We thus build separate reduced basis spaces for the strong- and weak-constraint cases by employing the Greedy sampling procedures described in Sects. 3.5 and 4.4, respectively. For both, we choose \(\mu ^\text {start} = 10\) and a training set consisting of 40 equidistant parameters over the parameter domain \(\mathcal {D}\). We set the number of Greedy iterations to \(N_{\max } = 80\) (strong) and \(N_{\max } = 100\) (weak), which yields a maximum relative error bound of approximately \(10^{-2}\).

In Fig. 4 we plot the maximum relative error and error bound over a test sample consisting of 20 randomly chosen parameters in \(\mathcal {D}\) versus the number of Greedy iterations N. The relative error and bound are defined as \(\left||u^*(\mu ) - u_N^*(\mu )\right||_U/\left||u^*(\mu )\right||_U\) and \({\varDelta }_N^u(\mu )/\left||u^*(\mu )\right||_U\) in the strong-constraint case, and as \(\Big ( \tau \sum _{k=1}^{K} \left||u^{*,k}(\mu ) - u_N^{*,k}(\mu )\right||_U^2 \Big )^{1/2} /\Big ( \tau \sum _{k=1}^{K} \left||u^{*,k}(\mu )\right||_U^2 \Big )^{1/2}\) and \(\tilde{{\varDelta }}_N^u(\mu )/ \Big ( \tau \sum _{k=1}^{K} \left||u^{*,k}(\mu )\right||_U^2 \Big )^{1/2}\) in the weak-constraint case. We observe that the error and bound converge at the same rate and that the effectivities, i.e., the ratios of bound to error, thus remain almost constant over N. The mean effectivities over the test sample for \(N_{\mathrm{max}}\) are 480 in the strong-constraint case and 40 in the weak-constraint case. We note that the maximum dimensions of the reduced basis state/adjoint and control spaces are \(N_{Y,{\mathrm{max}}} = 2 N_{\mathrm{max}} = 160\) and \(N_{U,\mathrm{max}}^0 = 21\) (strong-constraint), and \(N_{Y,{\mathrm{max}}} = 2 N_{\mathrm{max}} + 1 = 201\) and \(N_{U,{\mathrm{max}}} = N_{{\mathrm{max}}} = 100\) (weak-constraint). Especially in the strong-constraint case, we thus obtain a considerable reduction in the dimension of the control space, from \(\mathcal {N}_U = 13{,}131\) to \(N_{U,{\mathrm{max}}}^0 = 21\). This reduction is also reflected in the number of CG iterations required to solve the reduced-order 4D-Var problem (see below).
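For reference, the weak-constraint relative error, relative bound, and effectivity for a single \(\mu\) can be computed as in the following sketch (hypothetical names; `u_full` and `u_N` hold the time histories \(u^{*,k}(\mu )\) and \(u_N^{*,k}(\mu )\) as rows):

```python
import numpy as np

def error_bound_effectivity(u_full, u_N, bound, M_U, tau):
    """Relative control error, relative bound, and effectivity
    (bound divided by error) in the time-discrete U-norm."""
    def u_norm(W):
        return np.sqrt(tau * sum(w @ (M_U @ w) for w in W))
    rel_err = u_norm(u_full - u_N) / u_norm(u_full)
    rel_bound = bound / u_norm(u_full)
    return rel_err, rel_bound, rel_bound / rel_err
```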

Fig. 4

Maximum relative control error and error bound over number of Greedy iterations N for strong-constraint case (left) and weak-constraint case (right)

We next report on the online computational times of our reduced-order approach. As for the full-order approach, the reduced-order solution times depend on \(\mu\) (shorter for \(\mu = 10\), longer for \(\mu = 50\)) and, of course, strongly on N. We first consider the strong-constraint case: the solution times for the reduced-order 4D-Var problem range from 10 ms to 1.37 s, and the evaluation of the a posteriori error bound \({\varDelta }_N^u(\mu )\) takes between 2.8 and 29 ms. We note that the computation of the error bound is much faster than the solution of the 4D-Var problem itself. Furthermore, the computational time to evaluate the error bound depends only on N and not on \(\mu\) (i.e., evaluating the bound for fixed N at \(\mu = 10\) or \(\mu = 50\) takes the same time). The overall online speed-up for \(N = N_{\mathrm{max}}\) thus ranges from approximately 23 to 40.

In the weak-constraint case, the solution times for the reduced-order 4D-Var problem range from 99 ms to 12.6 s, and the evaluation of the a posteriori error bound \(\tilde{{\varDelta }}_N^u(\mu )\) takes between 4.8 and 71 ms. Again, the evaluation of the error bound is much faster than the solution of the 4D-Var problem itself. The online speed-up for \(N = N_{\mathrm{max}}\) is now approximately 15.

In order to illustrate the connection between the approximation error and the online solution time, we plot the average online solution time of the reduced-order 4D-Var problem versus the average relative error over the test sample in Fig. 5. Recall that the full-order solution takes approximately 30–54 s for the strong-constraint case and 114–189 s for the weak-constraint case.

Fig. 5

Average online solution time of the reduced-order 4D-Var problem over the average relative error for strong-constraint case (left) and weak-constraint case (right)

We next show results for the number of CG iterations required to solve the reduced-order 4D-Var problem. In Fig. 6, we plot the number of CG iterations as a function of the parameter \(\mu\) for various values of N and \(N_U\), on the left for the strong-constraint case and on the right for the weak-constraint case. In the same plots, we also show the number of CG iterations required to solve the full-order problem. We observe a different behavior in the strong- and weak-constraint cases: in the weak-constraint case, the number of reduced-order CG iterations converges to the number of full-order CG iterations with increasing N; in the strong-constraint case, however, the number of reduced-order CG iterations is bounded by \(N_U^0\), which is significantly smaller than N. The number of reduced-order CG iterations is thus almost constant over \(\mu\) for given N and considerably smaller than the number of full-order CG iterations, even for \(N = N_{\mathrm{max}}\).

Fig. 6

Required number of CG iterations for solving the full- and reduced-order 4D-Var problems as a function of the parameter \(\mu\) and the number of Greedy iterations N. Strong-constraint case (left) and weak-constraint case (right)

Finally, we consider the outer minimization problem and estimate the unknown true parameter \(\mu ^\text {true} = 30\) underlying the noisy measurements. To this end, we define the “optimal” parameters \(\mu ^*\) and \(\mu _N^*\), which minimize the full-order and reduced-order cost functionals

$$\begin{aligned} \mu ^* = \mathop {\hbox {arg min}}\limits _{\mu \in \mathcal {D}} J^*(\mu ) \quad \text {and} \quad \mu _N^*= \mathop {\hbox {arg min}}\limits _{\mu \in \mathcal {D}} J_N^*(\mu ), \end{aligned}$$
(67)

respectively. We compute the optimal estimated parameters \(\mu ^*\) and \(\mu _N^*\) using the Matlab routine fminbnd, which requires only evaluations of the full-order and reduced-order cost functionals, respectively. We also define the maximum relative cost functional error \(e_{J,N}^{\max } = \max _{\mu \in \mathcal {D}} |J^*(\mu ) - J_N^*(\mu )|/|J^*(\mu )|\) and the relative parameter error \(e_{\mu ,N} := |\mu ^* - \mu _N^*|/|\mu ^*|\). We present these errors for the strong- and weak-constraint case as a function of N in Table 1. We observe that in both cases the cost functional error and the parameter error converge very quickly, i.e., the reduced-order approach allows us to recover the optimal parameter \(\mu ^*\). We also note that the (full-order) optimal parameter is close to the true parameter in the strong-constraint case (\(\mu ^* = 29.67\) vs. \(\mu ^\text {true} = 30\)), but not in the weak-constraint case (\(\mu ^* = 45.36\) vs. \(\mu ^\text {true} = 30\)). Since \(\mu _N^* \rightarrow \mu ^*\) with increasing N, the reduced-order optimal parameters inherit this behavior; recovering \(\mu ^*\) is also the best we can expect of them.
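In a Python setting the analogous bounded one-dimensional search can be carried out with `scipy.optimize.minimize_scalar`; this sketch mirrors the idea behind (67), with `J_N` a hypothetical callable evaluating the reduced-order cost at a given \(\mu\).

```python
from scipy.optimize import minimize_scalar

def estimate_parameter(J_N, mu_min=10.0, mu_max=50.0):
    """Bounded minimization of the reduced-order cost J_N^*(mu) over D."""
    res = minimize_scalar(J_N, bounds=(mu_min, mu_max), method="bounded")
    return res.x  # reduced-order estimate mu_N^*
```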

Table 1 Error in cost functional and estimated parameter over number of Greedy iterations N

7 Conclusion

In this paper, we considered the strong- and weak-constraint 4D-Var data assimilation problem. We presented a reduced-order approach to the 4D-Var problem based on the reduced basis method and proposed rigorous and efficiently evaluable a posteriori error bounds for the optimal control, i.e., the initial condition in the strong-constraint setting and the model-error forcing in the weak-constraint setting. For both instances we showed numerical results confirming the validity of the proposed approach. We also presented theoretical results for the combined case with unknown initial condition and model-error forcing.

We note that although we consider a parametrized problem here, the error bounds can also be used in the non-parametrized reduced-order setting and are independent of how the reduced-order spaces are constructed. The bound thus directly applies to reduced-order approaches where the spaces are constructed, e.g., using empirical orthogonal functions, POD, or dual-weighted POD (Daescu and Navon 2008). We also believe that the error bounds can be gainfully applied in a multi-fidelity approach to solve the 4D-Var problem, e.g., in a trust-region approach as proposed in Chen et al. (2011) and Du et al. (2013).

Although we also presented results for the error in the cost functional and for estimating the unknown model parameter, we currently cannot provide rigorous and sharp a posteriori error bounds for these quantities. Furthermore, we considered only a fixed setting for the noise level and regularization parameter; a detailed analysis of the influence of these parameters on the performance of the reduced-order model remains to be performed. These are topics of current and future research in our groups.