
3.1 Introduction

This paper deals with the economical representation of dedicated sets of data that are increasingly available, stemming from various experiments or given by formal expressions. The amount of pertinent information that can be derived from a given massive data set is far smaller than the size of the data itself; therefore, in parallel with the increasing capacity for data acquisition and storage on computer architectures, a sustained effort has been made over the last century to post-process the data and to represent, analyze and extract the relevant information economically. The main idea starts from the fact that the data are dedicated to some phenomenon and thus contain a certain amount of coherence, which can be of two kinds: deterministic or statistical. Among the notions proposed to quantify this coherence are regularity, sparsity, small n-width, etc., which can be either assumed, verified or proven.

The data themselves can be known in different ways: either (i) completely explicitly, for instance (i-1) from an analytic representation or at least access to the values at every point, (i-2) only on a large set of points, or (i-3) through various global measures such as moments; or (ii) implicitly, through a model such as a partial differential equation (PDE). The range of applications is huge; examples can be found in statistics, image and information processing, learning, experiments in mechanics, meteorology, earth sciences, medicine, biology, etc., and the challenge is to computationally process such a large amount of high-dimensional data so as to obtain low-dimensional descriptions that capture much of the phenomena of interest.

We consider the following problem formulation: Let us assume that we are given a (presumably large) set ℱ of functions ϕ ∈ ℱ defined over Ωx ⊂ ℝdx (with dx ≥ 1). Our aim is to find some functions h1, h2,…, hQ : Ωx → ℝ such that every ϕ ∈ ℱ can be well approximated as follows

$$\varphi (x) \approx \sum\limits_{q = 1}^Q {{{\hat \varphi }_q}{h_q}(x),} $$

where Q ≪ dim(span{ℱ}). As said above, the ability of ℱ to possess this property is an assumption. It is precisely stated under the notion of small Kolmogorov n-width, defined as follows:

Let ℱ be a subset of some Banach space \(X\) and \({\mathbb{V}_Q}\) be a generic Q-dimensional subspace of \(X\). The angle between ℱ and \({\mathbb{V}_Q}\) is

$$E(F;{\mathbb{V}_Q}): = \mathop {\sup }\limits_{\varphi \in F} \;\mathop {\inf }\limits_{{v_Q} \in {\mathbb{V}_Q}} {\left\| {\varphi - {v_Q}} \right\|_X}.$$

The Kolmogorov n-width of ℱ in \(X\) is given by

$${d_Q}(F,X): = \inf \{ E(F;{\mathbb{V}_Q})\;|\;{\mathbb{V}_Q}\,a\,Q{\text{-}}dimensional\,subspace\,of\;X\} .$$

The n-width of ℱ thus measures the extent to which the set ℱ can be approximated by a Q-dimensional subspace of \(X\).

This assumption of a small Kolmogorov n-width can be taken for granted, but there are also properties of the elements of ℱ that can lead to such smallness, such as regularity of the functions ϕ ∈ ℱ. As an example, we can quote, in the periodic setting, the well-known Fourier series. Truncated Fourier series are good approximations of the full expansion if the decay rate of the Fourier coefficients is fast enough, i.e. if the functions ϕ have enough continuous derivatives. In this case, the basis is actually multipurpose since it is not dedicated to the particular set ℱ: Fourier series are adapted to any set of regular enough functions, and the more regular the functions are, the better the approximation is. Another property leading ℱ to have a small Kolmogorov n-width is transform sparsity, i.e., we assume that the functions ϕ ∈ ℱ have a sparse expression when written in some orthonormal basis set {ψi}, e.g. an orthonormal wavelet basis, a Fourier basis, or a local Fourier basis, depending on the application: this means that the coefficients \({\hat \varphi _i} = \left\langle {\varphi ,{\psi _i}} \right\rangle \) satisfy, for some p, 0 < p < 2, and some R:

$${\left\| \varphi\right\|_{{\ell ^p}}} = {\left( {\sum\limits_i {{{\left| {{{\hat \varphi }_i}} \right|}^p}} } \right)^{1/p}} \leqslant R.$$

A key implication of this assumption is that, if we denote by ϕ N the sum of the N largest contributions, then

$$\exists C(R,p),\;\forall \varphi\in \mathcal{F},\quad {\left\| {\varphi- {\varphi _N}} \right\|_{{\ell ^2}}} \leqslant C(R,p){(N + 1)^{1/2 - 1/p}},$$

i.e. there exists a contracted representation of such a ϕ. Note that the representation is adaptive and tuned to each ϕ (this is what is called a nonlinear approximation). However, under these assumptions, and if ℱ is finite dimensional (with a dimension that is much larger than N), the theory of compressed sensing (see [29]) allows, at the price of a slight logarithmic degradation of the convergence rate, a non-adaptive recovery of ℓp functions, with p ≤ 1, that is almost optimal. We refer to [29] and the references therein for more details on this question. In any case, these are situations where the set of basis functions {h i } does not constitute a multipurpose approximation set, quite the contrary: it is tuned to that choice of ℱ and will not have any good property for another one.
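
To fix ideas, the following small Python sketch illustrates this best N-term (nonlinear) truncation on a synthetic coefficient sequence; the decay exponent, the array length and the helper name best_n_term are illustration choices, not part of the theory above.

```python
import numpy as np

def best_n_term(coeffs, N):
    """Keep only the N largest (in modulus) coefficients of an orthonormal
    expansion; all other entries are set to zero (adaptive truncation)."""
    idx = np.argsort(np.abs(coeffs))[::-1][:N]   # indices of the N largest entries
    truncated = np.zeros_like(coeffs)
    truncated[idx] = coeffs[idx]
    return truncated

# Toy example: coefficients with algebraic decay (an l^p-type sequence, p < 2).
rng = np.random.default_rng(0)
coeffs = rng.standard_normal(1000) / (1.0 + np.arange(1000)) ** 1.5
for N in (10, 50, 200):
    err = np.linalg.norm(coeffs - best_n_term(coeffs, N))   # l^2 error (Parseval)
    print(f"N = {N:4d}   l2 error of best N-term truncation: {err:.2e}")
```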

The difficulty is of course to find the basis set {h i }. Note additionally that, from the definition of the small Kolmogorov n-width, except in a Hilbertian framework, the optimal elements need not even be in span{ℱ}.

Let us proceed and propose a way to better identify the various elements in ℱ: we consider that they are parametrized with y ∈ Ωy ⊂ ℝdy (with d y ≥ 1), so that ℱ consists of the parametrized functions f : Ωx × Ωy → ℝ. In what follows, we denote the function f as a function of x for some fixed parameter value y as f y := f(·,y). However, the role of x and y could be interchanged and both x and y will be considered equally as variables of the same level or as variable and parameter in all what follows.

In this paper, we present a survey of algorithms that search for an affine decomposition of the form

$$f\left( {x,y} \right) \approx \sum\limits_{q = 1}^Q {{g_q}\left( y \right){h_q}\left( x \right)} .$$
(3.1)

We focus on the case where the decomposition is chosen in an optimal way (in terms of sparse representation) and additionally we focus on methods with minimal computational complexity. It is assumed that we have a priori some or all the knowledge on functions f in ℱ, i.e. they are not implicitly defined by a PDE. In that “implicit” case there exists a family of reduced modeling approaches such as the reduced basis method; see e.g. [62].

Note that the domains Ωx and Ωy can be of finite cardinality M and N, in which case the functions can be written as matrices. The above algorithms can then often be stated as a low-rank approximation: given a matrix M ∈ ℝM × N, find a decomposition of the matrix M:

$$M \approx U{V^T}$$

where U is of size M × Q and V of size N × Q.
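
As a simple illustration of such a factorization, the following hedged Python sketch builds U and V from a truncated SVD, which (as recalled next) is the optimal choice in the ℓ2 sense; the sample kernel 1/(1 + x + y) and all identifiers are arbitrary choices made for the example.

```python
import numpy as np

def truncated_svd_factors(M, Q):
    """Return factors U (M x Q) and V (N x Q) with M ~ U @ V.T,
    obtained from the Q leading singular triplets of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    U_Q = U[:, :Q] * s[:Q]          # absorb the singular values into U
    V_Q = Vt[:Q, :].T
    return U_Q, V_Q

# Toy example: samples of a smooth bivariate function on a grid.
x = np.linspace(0.0, 1.0, 200)
y = np.linspace(0.0, 1.0, 150)
M = 1.0 / (1.0 + np.add.outer(x, y))            # f(x, y) = 1 / (1 + x + y)
U_Q, V_Q = truncated_svd_factors(M, Q=5)
print("rank-5 Frobenius error:", np.linalg.norm(M - U_Q @ V_Q.T))
```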

In this completely discrete setting, the Singular Value Decomposition (SVD), or the related Proper Orthogonal Decomposition (POD), yields an optimal solution (in terms of approximability with respect to the \({\ell ^2}\)-norm), but is rather expensive to compute. After presenting the POD in a general setting in Sect. 3.2, we present two alternatives, the Adaptive Cross Approximation (ACA) in Sect. 3.3 and the Empirical Interpolation Method (EIM) in Sect. 3.4, which originate from completely different backgrounds. We give a comparative overview of features and existing results for these approaches, which are computationally much cheaper and yield in practice similar approximation results. The relation between ACA and the EIM is studied in Sect. 3.5. Section 3.6 is devoted to a projection method based on incomplete data, known as Gappy POD or Missing Point Estimation, which in some cases can be interpreted as an interpolation scheme.

3.2 Proper Orthogonal Decomposition

Let us start by assuming that we have unlimited knowledge of the data set and unlimited computer resources, returning at the end of this section to more realistic considerations. The first approach is known under the generic concept of Proper Orthogonal Decomposition (POD), a mathematical technique that stands at the intersection of various horizons, has been developed independently and concomitantly in various disciplines, and is thus known under various names, including:

  • Proper Orthogonal Decomposition (POD): a term used in turbulence;

  • Singular Value Decomposition (SVD): a term used in algebra;

  • Principal Component Analysis (PCA): a term used in statistics for discrete random processes;

  • the discrete Karhunen-Loève transform (KLT): a term used in statistics for continuous random processes;

  • the Hotelling transform: a term used in image processing;

  • Principal Orthogonal Direction (POD): a term used in geophysics;

  • Empirical Orthogonal Functions (EOFs): a term used in meteorology and geophysics.

All these somewhat equivalent approaches aim at obtaining low-dimensional approximate descriptions of high-dimensional processes, therefore eliminating information which has little impact on the overall understanding.

3.2.1 Historical Overview

As stated above, the POD is present under various forms in many contributions.

The original SVD was established for real square matrices in the 1870s by Beltrami and Jordan, for complex square matrices in 1902 by Autonne, and for general rectangular matrices in 1936 by Eckart and Young; see also the generalization to unitarily invariant norms by Mirsky [58]. The SVD can be viewed as the extension of the eigenvalue decomposition to non-symmetric and non-square matrices.

The PCA is a statistical technique. The earliest descriptions of the technique were given by Pearson [63] and Hotelling [44]. The purpose of the PCA is to identify the dependence structure behind a multivariate stochastic observation in order to obtain a compact description of it.

Lumley [51] traced the idea of the POD back to independent investigations by Kosambi [47], Loève [50], Karhunen [46], Pougachev [64] and Obukhov [59].

These methods aim at providing a set of orthonormal basis functions that allow one to express approximately and optimally any function in the data set. The equivalence between all these approaches has also been investigated by many authors, among them [48, 56, 71].

3.2.2 Algorithm

Let us now present the POD algorithm in a semi-discrete framework, that is, we consider a finite family of functions \({\{ {f_y}\} _{y \in \Omega _y^{train}}}\) where f y : Ωx → ℝ for each y ∈ \(\Omega _y^{train}\), and where \(\Omega _y^{train} \subset {\Omega _y}\) is finite with cardinality N. In this context, the goal is to define an approximation P Q [f y ] to f y of the form

$${P_Q}\left[ {{f_y}} \right]\left( x \right) = \sum\limits_{q = 1}^Q {{g_q}\left( y \right){h_q}\left( x \right)}$$
(3.2)

with Q ≪ N. The POD actually incorporates a scalar product for functions depending on x ∈ Ωx, and the above projection is then an orthogonal projection onto the Q-dimensional vector space span{h q , q = 1,…, Q}.

The question is now how to select the functions h q properly. With a scalar product at hand, orthonormality is useful, since we would like these modes to be selected such that they carry as much as possible of the information contained in \({\{ {f_y}\} _{y \in \Omega _y^{train}}}\): the first function h 1 should be selected such that it provides the best one-term approximation; then, similarly, h q should be selected so that, together with h 1, h 2,…, h q-1, it gives the best q-term approximation. “Best q-term” is understood here in the sense that the mean square error over all y ∈ \(\Omega _y^{train}\) is smallest. Such specially ordered orthonormal functions are called the proper orthogonal modes of the function f(x, y). With these functions, the expression (3.2) is called the POD of f, and the algorithm is given in Table 3.1.

Scheme 3.1. Proper orthogonal decomposition (POD)

  a.

    Let \(\Omega _y^{train} = \{ {{\hat y}_1}, \ldots ,{{\hat y}_N}\} \) be a discrete representation of Ωy consisting of N points.

  b.

    Construct the correlation matrix

    $${C_{mn}} = \frac{1}{N}{({f_{{{\hat y}_m}}},{f_{{{\hat y}_n}}})_{{\Omega _x}}},\quad 1 \leqslant m,n \leqslant N,$$

    where \({( \cdot , \cdot )_{{\Omega _x}}}\) denotes a scalar product of functions depending on Ωx.

  c.

    Then, solve for the Q largest eigenvalue-eigenvector pairs (λ q , v q ) such that

    $$C{v_q} = {\lambda _q}{v_q},\quad 1 \leqslant q \leqslant Q.$$
    (3.3)
  d.

    The orthogonal POD basis functions {h 1,…,hQ} such that \({\mathbb{V}_Q} = span\{ {h_1}, \ldots ,{h_Q}\} \) are then given by the linear combinations

    $${h_q}(x) = \sum\limits_{n = 1}^N {{{({v_q})}_n}f(x,{{\hat y}_n}),\quad 1 \leqslant q \leqslant Q,\quad x \in {\Omega _x},} $$

    and where (v q ) n denotes the n-th coefficient of the eigenvector v q .

Approximation. The approximation P Q [f y ] to f y : Ωx → ℝ, for any y ∈ Ωy, is then given by

$${P_Q}[{f_y}](x) = \sum\limits_{q = 1}^Q {{g_q}(y){h_q}(x),\quad x \in {\Omega _x},} $$

with \({g_q}(y) = \frac{{{{({f_y},{h_q})}_{{\Omega _X}}}}}{{{{({h_q},{h_q})}_{{\Omega _X}}}}}.\)
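
A minimal Python sketch of Scheme 3.1 in a fully discrete setting (the scalar product on Ωx being replaced by a simple quadrature, in the spirit of Remark 3.1 below) could look as follows; the function names, the test kernel and the grid sizes are illustrative assumptions.

```python
import numpy as np

def pod_basis(snapshots, Q, weight):
    """Scheme 3.1 in a discrete setting.

    snapshots : array (M, N); column n holds f(., y_hat_n) sampled on M points of Omega_x.
    weight    : quadrature weight |Omega_x| / M approximating the scalar product on Omega_x.
    Returns the POD basis H (M x Q) and all eigenvalues of the correlation matrix.
    """
    N = snapshots.shape[1]
    C = weight * (snapshots.T @ snapshots) / N      # C_mn = (1/N)(f_{y_m}, f_{y_n})
    lam, V = np.linalg.eigh(C)                      # eigh returns ascending eigenvalues
    lam, V = lam[::-1], V[:, ::-1]                  # reorder: largest eigenvalues first
    H = snapshots @ V[:, :Q]                        # h_q = sum_n (v_q)_n f(., y_hat_n)
    return H, lam

def pod_projection(H, f):
    """P_Q[f] = sum_q g_q h_q with g_q = (f, h_q)/(h_q, h_q);
    the quadrature weight cancels in the ratio."""
    g = (H.T @ f) / np.sum(H * H, axis=0)
    return H @ g

# Toy example: f(x, y) = 1 / (1 + x + y) on [0, 1] x [0, 1].
x = np.linspace(0.0, 1.0, 300)
y_train = np.linspace(0.0, 1.0, 100)
S = 1.0 / (1.0 + np.add.outer(x, y_train))
Q, w = 5, 1.0 / len(x)
H, lam = pod_basis(S, Q, w)
mse = np.mean([w * np.sum((S[:, n] - pod_projection(H, S[:, n])) ** 2)
               for n in range(S.shape[1])])
print("mean-square POD error:", np.sqrt(mse))
print("sqrt of discarded eigenvalues:", np.sqrt(np.clip(lam[Q:], 0.0, None).sum()))
```

The two printed quantities should agree, which is exactly the content of Proposition 3.1 below.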

Proposition 3.1 The approximation error

$$d_2^{POD}(Q) = \sqrt {\frac{1}{N}\sum\limits_{y \in \Omega _y^{train}} {\left\| {{f_y} - {P_Q}[{f_y}]} \right\|_{{\Omega _x}}^2} } $$

minimizes the mean square error over all projection operators \({P_Q}\) onto a space of dimension Q. It is given by

$$d_2^{POD}\left( Q \right) = \sqrt {\sum\limits_{q = Q + 1}^N {{\lambda _q}} } ,$$
(3.4)

where \(\{ {\lambda _{Q + 1}}, \ldots ,{\lambda _N}\} \) denotes the set of the N – Q smallest eigenvalues of the eigenvalue problem (3.3).

Remark 3.1 (Relation to SVD) If the scalar product \({( \cdot , \cdot )_{{\Omega _x}}}\) is approximated in the sense of ℓ2 on a discrete set of points \(\Omega _x^{train} = \{ {{\hat x}_1}, \ldots ,{{\hat x}_M}\} \subset {\Omega _x}\), i.e.

$${(v,w)_{\Omega _x^{train}}} = \frac{{\left| {{\Omega _x}} \right|}}{M}\sum\limits_{i = 1}^M {v({{\hat x}_i})w({{\hat x}_i}),} $$

then we see that \(C = {A^T}A\) where A is the matrix defined by \({A_{i,j}} = \sqrt {\frac{{\left| {{\Omega _x}} \right|}}{{NM}}} {f_{{{\hat y}_j}}}({{\hat x}_i})\). Thus, the square roots of the eigenvalues in (3.3) are singular values of A.

Remark 3.2 (Infinite dimensional version) In the case where the POD is processed by leaving the parameter y continuous in Ωy, the correlation matrix becomes an operator C : L2(Ωy) → L2(Ωy) with kernel \(C({y_1},{y_2}) = {({f_{{y_1}}},{f_{{y_2}}})_{{\Omega _x}}}\) that acts on functions of y ∈ Ωy as follows

$$(C\phi )(y) = {(C(y, \cdot ),\phi )_{{\Omega _y}}},\quad \phi\in {L^2}({\Omega _y}).$$

Assuming that f ∈ L2(Ωx × Ωy), by the results obtained in [67] (which generalize Mercer’s theorem to more general domains) there exists a sequence of positive real eigenvalues (that can be ranked in decreasing order) and associated orthonormal eigenvectors, which can be used to construct best L2-approximations (3.1).

The infinite dimensional version is important to understand the generality of the approach, e.g. how the various POD algorithms are linked together. In essence, this boils down to the spectral theory of self-adjoint operators, either finite dimensional (in the matrix case) or infinite dimensional (for integral operators defined with symmetric kernels). Such operators have positive real eigenvalues, and the corresponding eigenvectors can be ranked in decreasing order of the eigenvalues. The approximation is based on considering only the eigenmodes that correspond to the largest eigenvalues; they are those that carry the maximum information.

In practice though, both in the x and the y variables, sample sets \(\Omega _x^{train}\) and \(\Omega _y^{train}\) are devised. Depending on the size of N, the solution of the eigenvalue problem (3.3) can be prohibitively expensive. Most of the time though, there is not much of a hint on how these training points should be chosen, and they generally form quite large sets with N ≫ Q.

We finally recall that the original goal is to approximate any function f(x, y) for all x ∈ Ωx and y ∈ Ωy. In this regard, the error bound (3.4) only provides an upper error estimate for functions f y with y ∈ \(\Omega _y^{train}\); no certified error bound for functions f y with y ∈ Ωy ∖ \(\Omega _y^{train}\) can be provided.

3.3 Adaptive Cross Approximation

In order to cope with the implementation difficulties and computational cost of the POD algorithm, let us present here the Adaptive Cross Approximation (ACA). The approximation leading to (3.1) is

$$f\left( {x,y} \right) \approx {\mathfrak{J}_Q}\left[ {{f_y}} \right]\left( x \right): = {\left[ {\begin{array}{*{20}{c}} {f\left( {x,{y_1}} \right)} \\ \vdots\\ {f\left( {x,{y_Q}} \right)} \end{array}} \right]^T}M_Q^{ - 1}\left[ {\begin{array}{*{20}{c}} {f\left( {{x_1},y} \right)} \\ \vdots\\ {f\left( {{x_Q},y} \right)} \end{array}} \right]$$
(3.5)

with points x q , y q , q = 1,…, Q, chosen such that the matrix

$${M_Q}: = \left[ \begin{gathered} f({x_1},{y_1})\, \cdots \;f({x_1},{y_Q}) \hfill \\ \vdots \quad \quad \quad \quad \quad \quad \quad\vdots\hfill \\ f({x_Q},{y_1})\, \cdots \;f({x_Q},{y_Q}) \hfill \\ \end{gathered}\right] \in {\mathbb{R}^{Q \times Q}}$$

is invertible. Notice that while P Q used in the construction of the POD is an orthogonal projector, \({\Im _Q}:{C^0}({\Omega _x}) \to {\mathbb{V}_Q}\) is an interpolation operator from the space of continuous functions C 0x) onto the system \({\Im _Q}:{C^0}({\Omega _x}) \to {\mathbb{V}_Q}\), i.e.

$${\Im _Q}[{f_y}]({x_q}) = f({x_q},y)\quad for\,all\,y\,and\,q = 1, \ldots ,Q.$$

Due to the symmetry of x and y in (3.5), we also have \({\Im _Q}[{f_{{y_q}}}](x) = f(x,{y_q})\) for all x and q = 1,…,Q.
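
For concreteness, here is a small Python sketch that evaluates the interpolant (3.5) for given nodes x q , y q , assuming only pointwise access to f; the chosen nodes and the test function are arbitrary illustration choices.

```python
import numpy as np

def cross_interpolant(f, x_nodes, y_nodes):
    """Return a callable (x, y) -> value of the interpolant (3.5).

    f        : callable f(x, y) returning a scalar.
    x_nodes, y_nodes : the Q interpolation nodes; f evaluated on the node grid
                       must give an invertible matrix M_Q.
    """
    x_nodes = np.asarray(x_nodes, dtype=float)
    y_nodes = np.asarray(y_nodes, dtype=float)
    M_Q = np.array([[f(xq, yq) for yq in y_nodes] for xq in x_nodes])

    def interpolant(x, y):
        row = np.array([f(x, yq) for yq in y_nodes])      # [f(x, y_1), ..., f(x, y_Q)]
        col = np.array([f(xq, y) for xq in x_nodes])      # [f(x_1, y), ..., f(x_Q, y)]
        return row @ np.linalg.solve(M_Q, col)
    return interpolant

# Toy example: the interpolant reproduces f exactly on the "cross" x = x_q or y = y_q.
f = lambda x, y: np.exp(-x * y)
I = cross_interpolant(f, x_nodes=[0.1, 0.5, 0.9], y_nodes=[0.2, 0.6, 0.8])
print(abs(I(0.5, 0.33) - f(0.5, 0.33)))   # small interpolation error
print(abs(I(0.37, 0.6) - f(0.37, 0.6)))   # ~0, since y = y_2 lies on the cross
```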

3.3.1 Historical Overview

Approximations of type (3.5) were first considered by Micchelli and Pinkus in [57]. There, it was proved for so-called totally positive functions f, i.e. continuous functions f : [0,1] × [0,1] → ℝ with non-negative determinants

$$\left| {\left[ \begin{gathered} f({\xi _1},{\upsilon _1})\, \cdots \,f({\xi _1},{\upsilon _q}) \hfill \\ \vdots \quad \quad \quad \quad \quad \quad\vdots\hfill \\ f({\xi _q},{\upsilon _1})\, \cdots \,f({\xi _q},{\upsilon _q}) \hfill \\ \end{gathered}\right]} \right|$$

for all 0 ≤ ξ1 < … < ξ q ≤ 1, 0 ≤ υ1 < … < υ q ≤ 1 and q = 1,…, Q, that such approximations are optimal with respect to the L 1-norm, i.e.

where \({\Im _Q}\) is defined at implicitly known nodes x 1,…,x Q and y 1,…,y Q ; see [57] for an additional technical assumption.

Instead of L 1-estimates, it is usually required to obtain L ∞-estimates. The obvious estimate

$${\left\| {{f_y} - {\Im _Q}[{f_y}]} \right\|_{{L^\infty }({\Omega _x})}} \leqslant (1 + {\sigma _1}[f])\mathop {\inf }\limits_{\upsilon\in {\mathbb{V}_Q}} {\left\| {{f_y} - \upsilon } \right\|_{{L^\infty }({\Omega _x})}}$$

contains the expression

$${\sigma _1}[f]: = \mathop {\sup }\limits_{x \in {\Omega _x}} {\left\| {M_Q^{ - T}\left[ \begin{gathered} f(x,{y_1}) \hfill \\ \vdots\hfill \\ f(x,{y_Q}) \hfill \\ \end{gathered}\right]} \right\|_{{\ell ^1}}}.$$

Since there is usually no estimate on the previous infimum (note that \({\mathbb{V}_Q}\) also depends on \(\mathcal{F} = {\{ {f_y}\} _{y \in {\Omega _y}}}\)), one tries to relate \({f_y} - {\Im _Q}[{f_y}]\) to the interpolation error in another system \({\mathbb{W}_Q} = span\{ {w_1}, \ldots ,{w_Q}\} \) of functions (e.g. polynomials, spherical harmonics, etc.); cf. [6, 12]. Assume that the determinant of the Vandermonde matrix \({W_Q}: = {[{w_i}({x_j})]_{i,j = 1, \ldots ,Q}}\) does not vanish and let L : Ωx → ℝQ be the vector consisting of the Lagrange functions \({L_i} \in {\mathbb{W}_Q}\), i.e. \({L_i}({x_j}) = {\delta _{ij}},\;i,j = 1, \ldots ,Q\). Then, the interpolation operator \({\Im '_Q}\) defined over C0(Ωx) with values in \({\mathbb{W}_Q}\) can be represented as

$${\Im '_Q}[\varphi ](x) = {\left[ \begin{gathered} \varphi ({x_1}) \hfill \\ \vdots\hfill \\ \varphi ({x_Q}) \hfill \\ \end{gathered}\right]^T}L(x),\quad \varphi\in {C^0}({\Omega _x}),$$

and we obtain

$${f_y}(x) - {\Im _Q}[{f_y}](x) = {f_y}(x) - {\left[ \begin{gathered} f({x_1},y) \hfill \\ \vdots\hfill \\ f({x_Q},y) \hfill \\ \end{gathered}\right]^T}L(x) - {\left( {\left[ \begin{gathered} f(x,{y_1}) \hfill \\ \vdots\hfill \\ f(x,{y_Q}) \hfill \\ \end{gathered}\right] - M_Q^TL(x)} \right)^T}M_Q^{ - 1}\left[ \begin{gathered} f({x_1},y) \hfill \\ \vdots\hfill \\ f({x_Q},y) \hfill \\ \end{gathered}\right] = {f_y}(x) - {\Im '_Q}[{f_y}](x) - {\left[ \begin{gathered} {f_{{y_1}}}(x) - {{\Im '}_Q}[{f_{{y_1}}}](x) \hfill \\ \vdots\hfill \\ {f_{{y_Q}}}(x) - {{\Im '}_Q}[{f_{{y_Q}}}](x) \hfill \\ \end{gathered}\right]^T}M_Q^{ - 1}\left[ \begin{gathered} f({x_1},y) \hfill \\ \vdots\hfill \\ f({x_Q},y) \hfill \\ \end{gathered}\right].$$

Hence, for any y ∈ Ωy

$${\left\| {{f_y} - {\mathfrak{J}_Q}\left[ {{f_y}} \right]} \right\|_{{L^\infty }\left( {{\Omega _x}} \right)}} \leqslant \left( {1 + {\sigma _2}\left[ f \right]} \right)\mathop {\max }\limits_{z \in \left\{ {y,{y_1}, \ldots {y_Q}} \right\}} {\left\| {{f_z} - {{\mathfrak{J}'}_Q}\left[ {{f_z}} \right]} \right\|_{L\infty \left( {{\Omega _x}} \right)}},$$
(3.6)

where

$${\sigma _2}[f]: = \mathop {\sup }\limits_{y \in {\Omega _y}} {\left\| {M_Q^{ - 1}\left[ \begin{gathered} f({x_1},y) \hfill \\ \vdots\hfill \\ f({x_Q},y) \hfill \\ \end{gathered}\right]} \right\|_{{\ell ^1}}}.$$

3.3.2 Construction of Interpolation Nodes

The assumption that the determinant of the Vandermonde matrix W Q does not vanish can be guaranteed by the choice of x 1,…,x Q . To this end, let Q linearly independent functions w 1,…,w Q be given as above. As in [8], we construct linearly independent functions ℓ1,…,ℓ Q satisfying ℓ q (x p ) = 0, p < q, and \(span\{ {\ell _1}, \ldots ,{\ell _q}\} = span\{ {w_1}, \ldots ,{w_q}\} ,\;q \leqslant Q,\) in the following way. Let ℓ1 = w 1 and x 1 ∈ Ωx be a maximum of |ℓ1|. Assume that ℓ1,…,ℓ Q-1 have already been constructed. For the construction of ℓ Q , define ℓ Q,0 := w Q and

$${\ell _{Q,q}}: = {\ell _{Q,q - 1}} - {\ell _{Q,q - 1}}({x_q})\frac{{{\ell _q}}}{{{\ell _q}({x_q})}},\quad q = 1, \ldots ,Q - 1.$$

Then ℓ Q,Q-1(x q ) = 0, q < Q, and span{ℓ Q,0,…,ℓ Q,Q-1} = span{ℓ1,…,ℓ Q-1, w Q }. Hence, we set ℓ Q := ℓ Q,Q-1 and choose

$${x_Q}: = \mathop {\arg \;\sup }\limits_{x \in {\Omega _x}} \left| {{\ell _Q}\left( x \right)} \right|.$$
(3.7)

The previous construction guarantees unisolvency at the nodes x q , q = 1,…,Q.

Lemma 3.1 It holds that det W Q ≠ 0.

Proof Since span{ℓ1,…,ℓ Q } = span{w 1,…,w Q } it follows that there is a non-singular matrix T ∈ ℝQ×Q such that

$$\left[ \begin{gathered} {\ell _1} \hfill \\ \vdots\hfill \\ {\ell _Q} \hfill \\ \end{gathered}\right] = T\left[ \begin{gathered} {w_1} \hfill \\ \vdots\hfill \\ {w_Q} \hfill \\ \end{gathered}\right].$$

Hence, R Q = TW Q where R Q := [ℓ i (x j )] Q i,j=1 is upper triangular. The assertion follows from

$$\det \,{R_Q} = {\ell _1}({x_1}) \cdot\ldots\cdot {\ell _Q}({x_Q}) \ne 0.$$

As an example, we choose \({\mathbb{W}_Q} = {\Pi _{Q - 1}}\), the space of polynomials of degree at most Q – 1. Then, it follows from (3.6) that ACA converges if, e.g., f is analytic with respect to x, and the speed of convergence is determined by the decay of f’s derivatives or by the elliptical radius of the ellipse in which f has a holomorphic extension. Furthermore, it can be seen that

$${\ell _Q}(x) = \prod\limits_{q = 1}^{Q - 1} {(x - {x_q}).} $$

Hence, the choice (3.7) of x Q is a generalization of a construction that is due to Leja [49]. Leja recursively defines a sequence of nodes {x 1,…,x Q } for polynomial interpolation in a compact set K ⊂ ℂ as follows. Let x 1 ∈ K be arbitrary. Once x 1,…,x Q-1 have been found, choose x Q ∈ K so that

$$\prod\limits_{q = 1}^{Q - 1} {\left| {{x_Q} - {x_q}} \right|}= \mathop {\max }\limits_{x \in K} \prod\limits_{q = 1}^{Q - 1} {\left| {x - {x_q}} \right|.} $$

In [68] it is proved that Lebesgue constants associated with Leja points are subexponential for fairly general compact sets in ℂ; see also [65]. Hence, analyticity is required in general for the convergence of the interpolation process.
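
A hedged Python sketch of the Leja recursion on a discretized candidate set (here a uniform grid in [-1,1]) reads as follows; the grid resolution and the starting point are illustrative choices.

```python
import numpy as np

def leja_points(candidates, Q, x1=None):
    """Greedy Leja construction on a discrete candidate set:
    x_Q maximizes prod_{q < Q} |x - x_q| over the candidates."""
    candidates = np.asarray(candidates, dtype=float)
    points = [candidates[0] if x1 is None else x1]
    for _ in range(Q - 1):
        # product of distances to the already selected points
        dist = np.prod(np.abs(candidates[:, None] - np.array(points)[None, :]), axis=1)
        points.append(candidates[np.argmax(dist)])
    return np.array(points)

print(leja_points(np.linspace(-1.0, 1.0, 2001), Q=6, x1=1.0))
# starts with 1, then -1, then a point close to 0, and so on
```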

The expression σ2[f] on the right-hand side of (3.6) can be controlled by the choice of the points y 1,…,y Q ∈ Ω y . Due to Laplace’s theorem

$${\left( {M_Q^{ - 1}\left[ \begin{gathered} f({x_1},y) \hfill \\ \vdots\hfill \\ f({x_Q},y) \hfill \\ \end{gathered}\right]} \right)_q} = \frac{{\det {M_q}(y)}}{{\det {M_Q}}},\quad q = 1, \ldots ,Q,$$

where M q (y) arises from replacing the q-th column of M Q by the vector [f(x 1,y),…,f(x Q ,y)]T, we obtain that σ2[f] ≤ Q if y 1,…,y Q are chosen such that

$$\left| {\det {M_Q}} \right| \geqslant \left| {\det {M_q}\left( y \right)} \right|,\quad q = 1, \ldots ,Q,y \in {\Omega _y}.$$
(3.8)

In connection with the so-called maximum volume condition (3.8), we also refer to the error estimates in [66], which are based on the technique of exact annihilators (see [2, 3]) in order to provide results similar to (3.6).

3.3.3 Incremental Construction

The maximum volume condition (3.8) is difficult to satisfy by an a-priori choice of y 1,…,y Q . Therefore, the following incremental construction of approximations (3.5), which is called Adaptive Cross Approximation (ACA) [6], has turned out to be practically more relevant. Let r 0(x,y) := f(x,y) and define the sequence of remainders as

$${r_q}\left( {x,y} \right): = {r_{q - 1}}\left( {x,y} \right) - \frac{{{r_{q - 1}}\left( {x,{y_q}} \right){r_{q - 1}}\left( {{x_q},y} \right)}}{{{r_{q - 1}}\left( {{x_q},{y_q}} \right)}},\quad q = 1, \ldots ,Q,$$
(3.9)

where x q and y q are chosen such that r q-1(x q ,y q ) ≠ 0. Then, the algorithm is summarized in Table 3.2.

Since r q-1(x q ,y q ) coincides with the q-th diagonal entry of the upper triangular factor of the LU decomposition of M Q , we obtain that det M Q ≠ 0. In [12], it is shown that

$$f\left( {x,y} \right) = {\mathfrak{J}_Q}\left[ {{f_y}} \right]\left( x \right) + {r_Q}\left( {x,y} \right)$$
(3.10)

and

$${\Im _Q}[{f_y}](x) = \sum\limits_{q = 1}^Q {{r_{q - 1}}(x,{y_q})\frac{{{r_{q - 1}}({x_q},y)}}{{{r_{q - 1}}({x_q},{y_q})}}.} $$

This method is used in [21] (see also [23]) under the name Geddes-Newton series expansion for the numerical integration of bivariate functions, where, instead of the maximum volume condition (3.8), (x q , y q ) is found by maximizing |r q-1|. This choice of (x q , y q ) is usually referred to as global pivoting. Another pivoting strategy is the so-called partial pivoting, i.e., y q is chosen in the q-th step such that

$$\left| {{r_{q - 1}}({x_q},{y_q})} \right| \geqslant \left| {{r_{q - 1}}({x_q},y)} \right|\quad for\,all\,y \in {\Omega _y}$$

for x q ∈ Ωx chosen by (3.7). For the latter condition (and in particular for the stronger global pivoting) the conservative bound \({\sigma _2}[f] \leqslant {2^{Q - 1}}\) can be guaranteed; see [6]. The actual growth of σ2[f] with respect to Q is, however, typically significantly weaker.

Scheme 3.2. Bivariate Adaptive Cross Approximation (ACA2)

Set q := 1.

While err > tol

  a.

    Define the remainder \({r_{q - 1}} = f - \sum {_{i = 1}^{q - 1}{c_i}} \) and choose (x q ,y q ) ∈ Ω x × Ω y such that

    $${r_{q - 1}}({x_q},{y_q}) \ne 0.$$


  b.

    Define the next tensor product by

    $${c_q}(x,y) = \frac{{{r_{q - 1}}(x,{y_q}){r_{q - 1}}({x_q},y)}}{{{r_{q - 1}}({x_q},{y_q})}}.$$
  c.

    Define the error level by

    $$err = {\left\| {{r_{q - 1}}} \right\|_{{L^\infty }({\Omega _x} \times {\Omega _y})}}$$

    and set q := q + 1.
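
A minimal Python sketch of Scheme 3.2 on discrete candidate sets, using the global pivoting mentioned above (the maximum of |r q-1| over the sampled grid) rather than a true supremum over Ωx × Ωy, might look as follows; the test function, the grid sizes and the tolerance are illustrative assumptions.

```python
import numpy as np

def aca_bivariate(f, xs, ys, tol=1e-10, max_terms=50):
    """Scheme 3.2 on discrete candidate sets xs, ys: build tensor-product terms
    c_q(x, y) = r_{q-1}(x, y_q) r_{q-1}(x_q, y) / r_{q-1}(x_q, y_q)
    until the remainder on the grid drops below tol."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    R = f(xs[:, None], ys[None, :])          # remainder r_0 sampled on the grid
    cols, rows = [], []                      # r_{q-1}(., y_q) and r_{q-1}(x_q, .)/pivot
    for _ in range(max_terms):
        i, j = np.unravel_index(np.argmax(np.abs(R)), R.shape)   # global pivoting
        pivot = R[i, j]                      # |pivot| = max_grid |r_{q-1}| = err
        if abs(pivot) < tol:
            break
        u, v = R[:, j].copy(), R[i, :].copy() / pivot
        cols.append(u); rows.append(v)
        R = R - np.outer(u, v)               # r_q = r_{q-1} - c_q, cf. (3.9)
    return np.array(cols).T, np.array(rows), R   # approximation = cols @ rows

f = lambda x, y: 1.0 / (1.0 + x + y)
U, V, R = aca_bivariate(f, np.linspace(0, 1, 200), np.linspace(0, 1, 150))
print("terms:", U.shape[1], "  max remainder on grid:", np.abs(R).max())
```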

3.3.4 Application to Matrices

Approximations of the form (3.5) are particularly useful when they are applied to large-scale matrices A ∈ ℝM × N. In this case, (3.5) becomes

$$A \approx \tilde A: = {A_{:,\sigma }}A_{\tau ,\sigma }^{ - 1}{A_{\tau ,:}},$$
(3.11)

where τ := {i 1,…,i Q } and σ := {j 1,…,j Q } are sets of row and column indices, respectively, such that Aτ,σ ∈ ℝQ×Q is invertible. Here and in the following, we use the notation Aτ,: for the rows τ and A:,σ for the columns σ of A. Notice that the approximation à has rank at most Q and is constructed from few of the original matrix entries. Such kinds of approximations were investigated by Eisenstat and Gu [37] and Tyrtyshnikov et al. [35] in the context of the maximum volume condition. Again, the approximation can be constructed incrementally by the sequence of remainders R(0) := A and

$${R^{(q)}}: = {R^{(q - 1)}} - \frac{{R_{:,{j_q}}^{(q - 1)}R_{{i_q},:}^{(q - 1)}}}{{R_{{i_q},{j_q}}^{(q - 1)}}},\quad q = 1, \ldots ,Q,$$

where the index pair (i q ,j q ) is chosen such that \(R_{{i_q},{j_q}}^{(q - 1)} \ne 0\). The previous condition guarantees that Aτ,σ is invertible, and we obtain

$$\tilde A = \sum\limits_{q = 1}^Q {\frac{{R_{:,{j_q}}^{(q - 1)}R_{{i_q},:}^{(q - 1)}}}{{R_{{i_q},{j_q}}^{(q - 1)}}}} .$$

If A arises from evaluating a smooth function at given points, then R(q) can be estimated using (3.6).

In order to avoid the computation of each entry of the remainders R(q), it is important to notice that only the entries in the i q -th row and the j q -th column of R(q-1) are required for the construction of Ã. Therefore, the following algorithm computes the column vectors \({u_q}: = R_{:,{j_q}}^{(q - 1)}\) and row vectors \({v_q}: = R_{{i_q},:}^{\left( {q - 1} \right)}\), resulting in

$$\tilde A = \sum\limits_{q = 1}^Q {\frac{{{u_q}v_q^T}}{{{{\left( {{v_q}} \right)}_{{j_q}}}}}.}$$
(3.12)

The iteration stops after Q steps if the error satisfies

$${\left\| {A - \tilde A} \right\|_{{\ell ^2}}} = {\left\| {{R^{\left( Q \right)}}} \right\|_{{\ell ^2}}} < \varepsilon$$
(3.13)

with given accuracy ε > 0. The previous condition cannot be evaluated with linear complexity. Since the next rank-1 term \(({v_{Q + 1}})_{{j_{Q + 1}}}^{ - 1}{u_{Q + 1}}v_{Q + 1}^T\) approximates R(Q), we replace (3.13) with the error indicator

$$\frac{{{{\left\| {{u_{Q + 1}}v_{Q + 1}^T} \right\|}_{{\ell ^2}}}}}{{\left| {{{({v_{Q + 1}})}_{{j_{Q + 1}}}}} \right|}} = \frac{{{{\left\| {{u_{Q + 1}}} \right\|}_{{\ell ^2}}}{{\left\| {{v_{Q + 1}}} \right\|}_{{\ell ^2}}}}}{{\left| {{{({v_{Q + 1}})}_{{j_{Q + 1}}}}} \right|}} < \varepsilon .$$

The algorithm is presented in Table 3.3.

Remark 3.3 Notice that almost no condition has been imposed on the row index i q . The following three methods are commonly used to choose i q . In addition to choosing i q randomly, i q can be found as

$${i_q}: = \mathop {\arg \,\max }\limits_{i = 1, \ldots ,M} \left| {{{({u_{q - 1}})}_i}} \right|,$$

which leads to a cyclic pivoting strategy. If A stems from the evaluation of a function at given nodes, then the construction of Sect. 3.3.2 should be used in order to guarantee the well-posedness of the interpolation operator \({\Im '_Q}\) and exploit the error estimate (3.6).

In some cases (see [15]), it is required to put more effort in the choice of i q to guarantee a well-suited approximation space \(span\{ {A_{{i_1},:}}, \ldots ,{A_{{i_Q},:}}\} \); cf. [7].

Instead of the M · N entries of A, we only have to compute Q(M + N) entries of A for the approximation by Ã. The construction of (3.12) requires \(O({Q^2}(M + N))\) arithmetic operations, and à can be stored with Q(M + N) units of storage. Possible redundancies among the vectors u q , v q , q = 1,…,Q, can be removed via orthogonalization.

Scheme 3.3. Adaptive Cross Matrix Approximation

Set q := 1.

While err > tol

  a.

    Choose i q such that

    $${v_q}: = A_{{i_q},:}^T - \sum\limits_{\ell= 1}^{q - 1} {\frac{{{{({u_\ell })}_{{i_q}}}}}{{{{({v_\ell })}_{{j_\ell }}}}}{v_\ell }} $$

    is nonzero and j q such that \(\left| {{{({v_q})}_{{j_q}}}} \right| = {\max _{j = 1, \ldots ,N}}\left| {{{({v_q})}_j}} \right|.\)

  b.

    Compute the vector

    $${u_q}: = {A_{:,{j_q}}} - \sum\limits_{\ell= 1}^{q - 1} {\frac{{{{({v_\ell })}_{{j_q}}}}}{{{{({v_\ell })}_{{j_\ell }}}}}{u_\ell }.} $$
  c.

    Compute the error indicator

    $$err = {\left| {{{({v_q})}_{{j_q}}}} \right|^{ - 1}}{\left\| {{u_q}} \right\|_{{\ell ^2}}}{\left\| {{v_q}} \right\|_{{\ell ^2}}}$$

    and set q := q + 1.
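
The following Python sketch implements Scheme 3.3, accessing A only through callables that return single rows and columns; the next row index is chosen as the largest entry of |u q| among the rows not used yet (a common practical variant of the rule in Remark 3.3), and all names as well as the test matrix are illustrative assumptions.

```python
import numpy as np

def aca_matrix(get_row, get_col, shape, tol=1e-8, max_rank=100):
    """Scheme 3.3: build A ~ sum_q u_q v_q^T / (v_q)_{j_q} from a few rows and
    columns of A.  get_row(i) returns A[i, :] and get_col(j) returns A[:, j],
    so only about Q(M + N) entries of A are ever evaluated."""
    M, N = shape
    us, vs, pivots = [], [], []
    i_q, used_rows = 0, {0}                          # start with the first row
    for _ in range(min(max_rank, M, N)):
        # v_q := A_{i_q,:} minus the contribution of the previous rank-1 terms
        v = get_row(i_q) - sum((uu[i_q] / pp) * vv for uu, vv, pp in zip(us, vs, pivots))
        j_q = int(np.argmax(np.abs(v)))              # column pivot (partial pivoting)
        if abs(v[j_q]) == 0.0:
            break                                    # numerically zero row: stop (simplification)
        u = get_col(j_q) - sum((vv[j_q] / pp) * uu for uu, vv, pp in zip(us, vs, pivots))
        us.append(u); vs.append(v); pivots.append(v[j_q])
        # error indicator ||u_q||_2 ||v_q||_2 / |(v_q)_{j_q}|
        if np.linalg.norm(u) * np.linalg.norm(v) / abs(v[j_q]) < tol:
            break
        # next row: largest entry of |u_q| among rows not used yet
        candidates = np.abs(u)
        candidates[list(used_rows)] = -1.0
        i_q = int(np.argmax(candidates))
        used_rows.add(i_q)
    U = np.array(us).T / np.array(pivots)            # absorb the pivots into U
    return U, np.array(vs)                           # A ~ U @ V

# Toy example: the matrix is only accessed row- and column-wise.
M, N = 300, 200
A = 1.0 / (1.0 + np.add.outer(np.arange(M), np.arange(N)))
U, V = aca_matrix(lambda i: A[i, :].copy(), lambda j: A[:, j].copy(), (M, N))
print("rank:", U.shape[1], "   error:", np.linalg.norm(A - U @ V))
```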

The origin of this matrix version of ACA is the construction of so-called hierarchical matrices [7,39,40] for the efficient treatment of integral formulations of elliptic boundary value problems. Hierarchical matrices allow one to treat discretizations of such non-local operators with logarithmic-linear complexity. To this end, subblocks A t,s from a suitable partition of large-scale matrices A are approximated by low-rank matrices.

A form that is slightly different from (3.11) and which looks more complicated at first glance is

$${A_{t,s}} \approx {\hat A_{t,s}}: = {A_{:,{\sigma _t}}}A_{{\tau _{t,}}{\sigma _t}}^{ - 1}{A_{{\tau _{t,}}{\sigma _s}}}A_{{\tau _{s,}}{\sigma _s}}^{ - 1}{A_{{\tau _s},:}}$$

with suitable index sets τ t , σ t , τ s , and σ s depending on the respective index t or s only. Notice that in contrast to Ã, Â does not interpolate A on the “cross” but rather at single points specified by the indices τ t , σ s , i.e. \({\hat A_{{\tau _t},{\sigma _s}}} = {A_{{\tau _t},{\sigma _s}}}\). The advantage of this approach is the fact that the large parts \({A_{:,{\sigma _t}}}A_{{\tau _t},{\sigma _t}}^{ - 1}\) and \(A_{{\tau _s},{\sigma _s}}^{ - 1}{A_{{\tau _s},:}}\) depend only on one of the two indices t or s, while only the small matrix \({A_{{\tau _t},{\sigma _s}}}\) depends on both. This allows one to further reduce the complexity of hierarchical matrix approximations by constructing so-called nested bases approximations [13], which are mandatory to efficiently treat high-frequency Helmholtz problems; see [11].

3.3.5 Relation with Gaussian Elimination

Without loss of generality, we may assume for the moment that i q = j q = q, q = 1,…,Q. Otherwise, interchange the rows and columns of the original matrix R (0). Then

$${R^{(q)}} = \left( {I - \frac{{{R^{(q - 1)}}{e_q}e_q^T}}{{e_q^T{R^{(q - 1)}}{e_q}}}} \right){R^{(q - 1)}} = {L^{(q)}}{R^{(q - 1)}},$$

where L(q) ∈ ℝM×M is the matrix

$${L^{(q)}} = I - \frac{{{R^{(q - 1)}}{e_q}e_q^T}}{{e_q^T{R^{(q - 1)}}{e_q}}},$$

which differs from a Gaussian matrix only in the position (q,q); cf. [6]. This relation was exploited in [41] for the convergence analysis of ACA in the case of positive definite matrices A.

Furthermore, it is an interesting observation that ACA reduces the rank of the remainder in each step, i.e. rank R(q) = rank R(q-1) − 1. This was first discovered by Wedderburn in [69, p. 69]; see also [6, 26]. Hence, ACA may be regarded as a rank revealing LU factorization [22,45]. As is well known, the elements may grow in the LU decomposition algorithm; cf. [34]. Thus, the exponential bound \({2^{Q - 1}}\) on σ2[f] is not a result of overestimation.

3.3.6 Generalizations of ACA

The Adaptive Cross Approximation can easily be generalized to a linear functional setting. Instead of evaluating the remainders at the chosen points x q , y q , q = 1,…,Q, one considers the recursive construction

$${r_q}(x,y): = {r_{q - 1}}(x,y) - \frac{{\left\langle {{r_{q - 1}}(x, \cdot ),{\psi _q}} \right\rangle \left\langle {{\varphi _q},{r_{q - 1}}( \cdot ,y)} \right\rangle }}{{\left\langle {{\varphi _q},{r_{q - 1}},{\psi _q}} \right\rangle }},\quad q = 1, \ldots ,Q.$$

Here, ϕ q and ψ q denote given linear functionals acting on x and y, respectively. It is easy to show (see [10]) that

$$\left\langle {{\varphi _i},{r_q}\left( { \cdot ,y} \right)} \right\rangle= 0 = \left\langle {{r_q}\left( {x, \cdot } \right),{\psi _i}} \right\rangle \quad for\,all\,i \leqslant q,x \in {\Omega _x}\;and\;y \in {\Omega _y}.$$
(3.14)

Hence, r q vanishes for an increasing number of functionals and

$${\Im ''_Q}[{f_y}](x): = \sum\limits_{q = 1}^Q {\left\langle {{r_{q - 1}}(x, \cdot ),{\psi _q}} \right\rangle } \frac{{\left\langle {{\varphi _q},{r_{q - 1}}( \cdot ,y)} \right\rangle }}{{\left\langle {{\varphi _q},{r_{q - 1}},{\psi _q}} \right\rangle }}$$

gradually interpolates f y (in the sense of functionals). The Adaptive Cross Approximation (3.9) is obtained from choosing the Dirac functionals \({\varphi _q}: = {\delta _{{x_q}}}\) and \({\psi _q}: = {\delta _{{y_q}}}\).

The benefits of the separation of variables resulting from (3.5) are even more important for multivariate functions f. We present two ways to generalize (3.9) to functions depending on d variables. An obvious idea is to group the set of variables into two parts, each containing d/2 variables; see [10] for a method that uses the covariance of f to construct this separation. Each of the two parts can be treated as a single new variable. Then, the application of (3.9) results in a sequence of lower-dimensional functions which inherit the smoothness of f. Hence, (3.9) can be applied again until only univariate functions are left. Due to the nestedness of the construction, the constructed approximation cannot be regarded as an interpolation. Error estimates for this approximation were derived in [8] for d = 3, 4. The application to tensors of order d > 2 was presented in [4, 60, 61].

A more sophisticated way to generalize ACA to multivariate functions is presented in [9]. For the case d = 3, the sequence of remainders is constructed as

$${r_q}(x,y,z): = {r_{q - 1}}(x,y,z) - \frac{{{r_{q - 1}}(x,y,{z_q}){r_{q - 1}}(x,{y_q},z){r_{q - 1}}({x_q},y,z){r_{q - 1}}({x_q},{y_q},{z_q})}}{{{r_{q - 1}}(x,{y_q},{z_q}){r_{q - 1}}({x_q},y,{z_q}){r_{q - 1}}({x_q},{y_q},z)}}$$

instead of (3.9). Notice that this kind of approximation requires that x q , y q , z q can be found such that the denominator r q-1(x,y q ,z q ) r q-1(x q ,y,z q ) r q-1(x q ,y q ,z) ≠ 0. On the other hand, the advantage of this generalization is that it is equi-directional in contrast to the aforementioned idea, i.e., none of the variables is preferred to the others. Hence, similar to (3.14) we obtain for all x,y,z

$${r_q}(x,y,{z_i}) = {r_q}(x,{y_i},z) = {r_q}({x_i},y,z) = 0,\quad i \leqslant q.$$

3.4 Empirical Interpolation Method

3.4.1 Historical Overview

The Empirical Interpolation Method (EIM) [5] originates from reduced order modeling and its application to the resolution of parameter dependent partial differential equations. We are thus in the context where the set of solutions u(·,y) to the PDE generates a manifold, parametrized by y (the parameter is generally called μ in these applications), that possesses a small Kolmogorov n-width. In the construction stage of the reduced basis method, the reduced basis is built by a greedy approach where each new basis function, that is, a solution to the PDE associated to an optimally chosen parameter, is incorporated recursively. The selection criterion for the parameter is based on maximal (a posteriori) error estimates over the parameter space. This construction stage can be expensive: indeed, it requires an initial accurate classical discretization method of finite element, spectral or finite volume type, and every solution associated to an optimally selected parameter needs to be approximated during this stage by the classical method. Once this preliminary stage is performed off-line, all the approximations of solutions corresponding to a new parameter are computed as a linear combination of the (few) basis functions constructed during the first phase. This second, on-line, stage is very cheap. This is due to two facts. The first one is that the greedy approach is proven to be quite optimal [14, 16, 28]: for exponential or polynomial decay of the Kolmogorov n-width, the greedy method provides a basis set that has the same feature.

The second fact is related to the approximation process. A Galerkin approximation in this reduced space indeed provides very good approximations, and if Q modes are used, a linear PDE can be simulated by inverting Q × Q matrices only, i.e. much smaller complexity than the classical approaches.

In order that the same remains true for nonlinear PDEs, a strategy similar to the pseudo-spectral approach used for high-order Fourier or polynomial approximations has been sought. This involves the use of an interpolation operator. In order to be coherent, an approximation u Q (·,y) = ∑ Q i=1 α i (y)u(·,y i ) being given (where the y i are the parameters that define the reduced basis snapshots), we want to approximate G(u Q (·,y)) (G being a nonlinear functional) as a linear combination

$$G({u_Q}( \cdot ,y)) \approx \sum\limits_{i = 1}^Q {{\beta _i}} (y){\kern 1pt} G(u( \cdot ,{y_i})).$$

The derivation of the set {β i } i from {α i } i needs to be very fast; it is done by interpolation through the Empirical Interpolation Method defined in the following section. This has been extensively used for different types of equations in [36] and has led to the definition of general interpolation techniques and the rapid derivation of the associated points.

The approach having a broader scope than only its use in reduced basis approximation, a dedicated analysis of the approximation properties for sets with small Kolmogorov n-width has been presented in [54]. This approach for nonlinear problems has actually also been used for problems where the dependency on the parameter is not affine (the so-called “non-affine” problems) and has boosted the domain of application of reduced order approximations.

3.4.2 Motivation

As said above and in the introduction, we are in a situation where the set \(\mathcal{F} = {\{ f( \cdot ,y)\} _{y \in {\Omega _y}}}\) denotes a family of parametrized functions with small Kolmogorov n-width. We therefore do not identify Ω x with Ω y . In addition, for a given parameter y, f(·,y) is supposed to be accessible at all values in Ω x .

The EIM is designed to find approximations to members of ℱ through an interpolation operator I q that interpolates the function f y = f(·,y) at some particular points in Ω x . That is, given an interpolatory system defined by a set of basis functions {h 1,…,h q } (linear combinations of particular “snapshots” \({f_{{y_1}}}, \ldots ,{f_{{y_q}}}\)) and interpolation points {x 1,…,x q }, the interpolant I q [f y ] of f y with y ∈ Ω y , written as

$${I_q}\left[ {{f_y}} \right]\left( x \right) = \sum\limits_{j = 1}^q {{g_j}\left( y \right){h_j}\left( x \right)} ,\quad x \in {\Omega _x},$$
(3.15)

is defined by

$${I_q}\left[ {{f_y}} \right]\left( {{x_i}} \right) = {f_y}\left( {{x_i}} \right),\quad i = 1, \ldots ,q.$$
(3.16)

Thus, (3.16) is equivalent to the following linear system

$$\sum\limits_{j = 1}^q {{g_j}\left( y \right){h_j}\left( {{x_i}} \right)}= {f_y}\left( {{x_i}} \right),\quad i = 1, \ldots ,q.$$
(3.17)

One of the problems is to ensure that the system above is uniquely solvable, i.e. that the matrix (h j (x i )) i,j is invertible, which will be considered in the design of the interpolation scheme.

Scheme 3.4. Empirical Interpolation Method

Set q = 1. Do while err > tol:

  a.

    Pick the sample point

    $${y_q} = \mathop {\arg \,\sup }\limits_{y \in {\Omega _y}} {\left\| {{f_y} - {I_{q - 1}}\left[ {{f_y}} \right]} \right\|_{{L^p}\left( {{\Omega _x}} \right)}},$$
    (3.18)

    and the corresponding interpolation point

    $${x_q} = \mathop {\arg \,\sup }\limits_{x \in {\Omega _x}} \left| {{f_{{y_q}}}\left( x \right) - {I_{q - 1}}\left[ {{f_{{y_q}}}} \right]\left( x \right)} \right|.$$
    (3.19)
  b.

    Define the next basis function as

    $${h_q} = \frac{{{f_{{y_q}}} - {I_{q - 1}}\left[ {{f_{{y_q}}}} \right]}}{{{f_{{y_q}}}\left( {{x_q}} \right) - {I_{q - 1}}\left[ {{f_{{y_q}}}} \right]\left( {{x_q}} \right)}}.$$
    (3.20)
  c.

    Define the error level by

    $$err = {\left\| {er{r_p}} \right\|_{{L^\infty }({\Omega _y})}}\quad with\quad er{r_p}(y) = {\left\| {{f_y} - {I_{q - 1}}[{f_y}]} \right\|_{{L^p}({\Omega _x})}},$$

    and set q := q + 1.

3.4.3 Algorithm

The construction of the basis functions and interpolation points is based on a greedy algorithm. Note that the EIM is defined with respect to a given norm on Ω x ; we consider here the L p(Ωx)-norms for 1 ≤ p ≤ ∞. The algorithm is given in Table 3.4.

Remark 3.4 Note that whenever dim(span{ℱ}) = q⋆ the algorithm finishes for q = q⋆.

As long as q ≤ q⋆, note that the basis functions {h 1,…,h q } and the snapshots \(\{ {f_{{y_1}}}, \ldots ,{f_{{y_q}}}\} \) span the same space, i.e.,

$${\mathbb{V}_q} = span\{ {h_1}, \ldots ,{h_q}\}= span\{ {f_{{y_1}}}, \ldots ,{f_{{y_q}}}\} .$$

The former are preferred to the latter due to the following properties

$${h_i}\left( {{x_i}} \right) = 1,\quad \forall i = 1, \ldots ,q\quad and\quad {h_j}\left( {{x_i}} \right) = 0,\quad 1 \leqslant i < j \leqslant q.$$
(3.21)

Remark 3.5 It is easy to show that the interpolation operator I q is the identity if restricted to the space \({\mathbb{V}_q}\), i.e.,

$${I_q}[{f_{{y_i}}}](x) = {f_{{y_i}}}(x),\quad i = 1, \ldots ,q,\quad x \in {\Omega _x}.$$

Remark 3.6 The construction of the interpolating functions and the associated interpolation points follows a greedy approach: we add the function in ℱ that is the worst approximated by the current interpolation operator, and the interpolation point is where the error is largest. The construction is thus recursive which, in turn, means that it is of low computational cost.

Remark 3.7 As explained in [5], the algorithm can be reduced to the selection of the interpolation points only, in the case where the family of interpolating functions \(\{ {f_{{y_1}}}, \ldots ,{f_{{y_q}}}, \ldots \} \) is preexisting. This can be the case for instance if a POD strategy has been used previously or when one considers a set that has a canonical basis and ordering (like the set of polynomials).

Note that solving the interpolation system (3.17) can be written as a linear system B gy = fy with q unknowns and equations where

$${B_{i,j}} = {h_j}({x_i}),\quad {({f_y})_i} = {f_y}({x_i}),\quad i,j = 1, \ldots ,q,$$

such that the interpolant is defined by

$${I_q}[{f_y}](x) = \sum\limits_{j = 1}^q {{{({g_y})}_j}{h_j}(x),\quad x \in {\Omega _x}.} $$

This construction of the basis functions and interpolation points satisfies the following theoretical properties (see [5]):

  • the basis functions {h 1,…,h q } consist of linearly independent functions;

  • the interpolation matrix B, with entries B i,j = h j (x i ), is lower triangular with unit diagonal by (3.21), and hence invertible; the remaining entries belong to [-1,1];

  • the empirical interpolation procedure is well-posed in L p(Ωx), as long as q ≤ q⋆.

If the L ∞(Ωx)-norm (p = ∞) is considered, the error analysis of the interpolation procedure classically involves the Lebesgue constant \({\Lambda _q} = {\sup _{x \in {\Omega _x}}}\sum {_{i = 1}^q} \left| {{L_i}(x)} \right|\), where \({L_i} \in {\mathbb{V}_q}\) are the Lagrange functions satisfying L i (x j ) = δ ij . The following bound holds [5]:

$${\left\| {{f_y} - {I_q}[{f_y}]} \right\|_{{L^\infty }({\Omega _x})}} \leqslant (1 + {\Lambda _q})\mathop {\inf }\limits_{{v_q} \in {\mathbb{V}_q}} {\left\| {{f_y} - {v_q}} \right\|_{{L^\infty }({\Omega _x})}}.$$

An (in practice very pessimistic) upper bound (cf. [54]) of the Lebesgue constant is given by

$${\Lambda _q} \leqslant {2^q} - 1,$$

which in turn results in the following estimate. Assume that \(F \subset X \subset {L^\infty }({\Omega _x})\) and that there exists a sequence of finite dimensional spaces

$${\mathbb{Z}_1} \subset {\mathbb{Z}_2} \subset \ldots ,\quad \dim ({\mathbb{Z}_q}) = q,\quad and\quad {\mathbb{Z}_q} \subset F,$$

such that there exist c > 0 and α > log(4) with

$$\mathop {\inf }\limits_{{v_q} \in {\mathbb{Z}_q}} {\left\| {{f_y} - {v_q}} \right\|_X} \leqslant c{e^{ - \alpha q}},\quad y \in {\Omega _y},$$

then

$${\left\| {{f_y} - {I_q}[{f_y}]} \right\|_{{L^\infty }({\Omega _x})}} \leqslant c{e^{ - (\alpha - \log (4))q}}.$$

Remark 3.8 The worst-case situation, where the Lebesgue constant indeed grows like the bound above, is rather artificial; in all implementations we have done so far involving functions belonging to some reasonable set with small Kolmogorov n-width, the growth of the Lebesgue constant is much more reasonable, and most of the time a linear growth is observed. Note that the points generated by the EIM using polynomial basis functions (in increasing order of degree) on [-1,1] are exactly the Leja points, as indicated in the frame of the EIM by A. Chkifa and the discussion in Sect. 3.3.2 in the case of ACA. On the other hand, if one considers the Leja points on the unit circle and then projects them onto the interval [-1,1], a linear growth is shown in [25].

3.4.4 Practical Implementation

In the practical implementation of the EIM one encounters the following problem: finding the supremum, respectively the arg sup, in (3.18) and (3.19) is not feasible unless some kind of approximation is effected. The least difficult way, but not the only one, is to consider representative point-sets \(\Omega _x^{train} = \left\{ {{{\hat x}_1},{{\hat x}_2}, \ldots ,{{\hat x}_M}} \right\}\) of Ωx and \(\Omega _y^{train} = \left\{ {{{\hat y}_1},{{\hat y}_2}, \ldots ,{{\hat y}_N}} \right\}\) of Ωy. Then, the EIM is written as in Table 3.5.

This possible implementation of the EIM is sometimes referred to as the Discrete Empirical Interpolation Method (DEIM) [24].

Remark 3.9 Different strategies have been reported in [38,55] to successively enrich the training set Ω trainy . The main idea is to start with a small number of training points and enrich the set during the iterations of the algorithm and obtain a very fine discretization only towards the end of the algorithm. One can also think of enriching the training set Ω trainx simultaneously.

Remark 3.10 Using representative pointsets Ω trainx and Ω trainy is only one way to discretize the problem. Alternatively, one can think of using optimization methods to find the maximum over Ωx and Ωy. Such a strategy has been reported in [18, 19] in the context of the reduced basis method, which, as well as the EIM, is based on a greedy algorithm.

Scheme 3.5. Empirical Interpolation Method (possible implementation of EIM)

Set q = 1. Do while err > tol:

  a.

    Pick the sample point

    $${y_q} = \mathop {\arg \,\max }\limits_{y \in \Omega _y^{train}} {\left\| {{f_y} - {I_{q - 1}}\left[ {{f_y}} \right]} \right\|_{{L^p}\left( {{\Omega _x}} \right)}},$$
    (3.22)

    and the corresponding interpolation point

    $${x_q} = \mathop {\arg \,\max }\limits_{x \in \Omega _x^{train}} \left| {{f_{{y_q}}}(x) - {I_{q - 1}}[{f_{{y_q}}}](x)} \right|.$$
  b.

    Define the next basis function as

    $${h_q} = \frac{{{f_{{y_q}}} - {I_{q - 1}}[{f_{{y_q}}}]}}{{{f_{{y_q}}}({x_q}) - {I_{q - 1}}[{f_{{y_q}}}]({x_q})}}.$$
  c.

    Define the error level by

    $$err = {\left\| {er{r_p}} \right\|_{{L^\infty }({\Omega _y})}}\quad with\quad er{r_p}(y) = {\left\| {{f_y} - {I_{q - 1}}[{f_y}]} \right\|_{{L^p}({\Omega _x})}}$$

    and set q := q + 1.

3.4.5 Practical Implementation Using the Matrix Representation of the Function

One can define an implementation of the EIM in a completely discrete setting using the representative matrix of f defined by M i,j = f(x i , y j ) for 1 ≤ i ≤ M and 1 ≤ j ≤ N. For the sake of short notation we recall the notation M:,j used for the j-th column of M.

Assume that we are given a set of basis vectors {h1,…,h q } and interpolation indices i 1,…,i q ; the discrete interpolation operator I q : ℝM → ℝM acting on column vectors is given in the span of the basis vectors {h j } q j=1 , i.e. by I q [r] = ∑ j=1 q g j (r)h j for some scalars g j (r), such that

$${({I_q}[r])_{{i_k}}} = \sum\limits_{j = 1}^q {{g_j}(r){{({h_j})}_{{i_k}}}}  = {r_{{i_k}}},\quad r \in {\mathbb{R}^M},\quad k = 1, \ldots ,q.$$

Using this notation, we then present the matrix version of the EIM in Table 3.6.

This procedure allows one to define an approximation of any coefficient of the matrix M. In some cases, however, one would like to obtain an approximation of f(x, y) for any (x,y) ∈ Ωx × Ωy. After running the implementation, one can still construct the continuous interpolant I Q [f](x,y) for any (x,y) ∈ Ωx × Ωy. Indeed, the interpolation points x 1,…,x Q are provided by \({x_q} = {\hat x_{{i_q}}}\). The construction of the (continuous) basis functions h q is based on mimicking part b of the discrete algorithm in a continuous context. Therefore, during the discrete version one saves the following data

$$\begin{gathered} {s_{q,j}} = {g_j}({M_{:,{j_q}}}),\quad \quad \quad \quad \quad \quad \quad from\quad {I_{q - 1}}[{M_{:,{j_q}}}] = \sum\limits_{j = 1}^{q - 1} {{g_j}\left( {{M_{:,{j_q}}}} \right){h_j}} , \hfill \\ {s_{q,q}} = {M_{{i_q},{j_q}}} - {({I_{q - 1}}[{M_{:,{j_q}}}])_{{i_q}}}. \hfill \\ \end{gathered} $$

Then, the continuous basis functions can be recovered by the following recursive formula

$${h_q} = \frac{{{f_{{y_q}}} - \Sigma _{j = 1}^{q - 1}{s_{q,j}}{h_j}}}{{{s_{q,q}}}}$$

using the notation \({y_q} = {\hat y_{{j_q}}}\).

Scheme 3.6. Empirical Interpolation Method (implementation based on representative matrix M of f)

Set q = 1. Do while err > tol

  a.

    Pick the sample index

    $${j_q} = \mathop {\arg \,\max }\limits_{j = 1, \ldots ,N} {\left\| {{M_{:,j}} - {I_{q - 1}}[{M_{:,j}}]} \right\|_{{\ell ^p}}},$$

    and the corresponding interpolation index

    $${i_q} = \mathop {\arg \,\max }\limits_{i = 1, \ldots ,M} \left| {{M_{i,{j_q}}} - {{({I_{q - 1}}[{M_{:,{j_q}}}])}_i}} \right|.$$
  b.

    Define the next approximation column by

    $${h_q} = \frac{{{M_{:,{j_q}}} - {I_{q - 1}}[{M_{:,{j_q}}}]}}{{{M_{{i_q},{j_q}}} - {{({I_{q - 1}}[{M_{:,{j_q}}}])}_{{i_q}}}}}.$$
  c.

    Define the error level by

    $$err = \mathop {\max }\limits_{j = 1, \ldots ,N} {\left\| {{M_{:,j}} - {I_{q - 1}}[{M_{:,j}}]} \right\|_{{\ell ^p}}}$$

    and set q : = q + 1.
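
A compact Python sketch of Scheme 3.6, using the ℓ2-norm (p = 2) for the column errors, is given below; the representative matrix, the tolerance and all identifiers are illustrative assumptions of the example.

```python
import numpy as np

def eim_matrix(Mmat, tol=1e-10, max_terms=50):
    """Scheme 3.6: greedy EIM on the representative matrix Mmat[i, j] = f(x_i, y_j).
    Returns the basis columns H, the interpolation row indices and the
    selected column indices."""
    Mrows, Ncols = Mmat.shape
    H = np.zeros((Mrows, 0))
    rows, cols = [], []
    while H.shape[1] < max_terms:
        if rows:
            # apply I_{q-1} to every column: solve B g = (column restricted to selected rows)
            B = H[rows, :]                              # B_{k,l} = h_l(x_{i_k}), lower triangular
            G = np.linalg.solve(B, Mmat[rows, :])
            Residual = Mmat - H @ G
        else:
            Residual = Mmat.copy()
        errs = np.linalg.norm(Residual, axis=0)         # l2 error per column (p = 2)
        j_q = int(np.argmax(errs))                      # sample index
        if errs[j_q] < tol:
            break
        i_q = int(np.argmax(np.abs(Residual[:, j_q])))  # interpolation index
        h_q = Residual[:, j_q] / Residual[i_q, j_q]     # normalized so (h_q)_{i_q} = 1
        H = np.column_stack([H, h_q])
        rows.append(i_q); cols.append(j_q)
    return H, rows, cols

# Toy example.
x = np.linspace(0.0, 1.0, 200); y = np.linspace(0.0, 1.0, 100)
Mmat = np.exp(-np.outer(x, y))
H, rows, cols = eim_matrix(Mmat)
B = H[rows, :]
approx = H @ np.linalg.solve(B, Mmat[rows, :])          # I_Q applied to every column
print("Q =", H.shape[1], "  max entrywise error:", np.abs(Mmat - approx).max())
```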

3.4.6 Generalizations of the EIM

In the following, we present some generalizations of the core concept behind the EIM.

3.4.6.1 Generalized Empirical Interpolation Method (gEIM)

We have seen that the EIM interpolation operator I q [f y ] interpolates the function f y at some empirically constructed points x 1,…,x q . The EIM can be generalized in the following sense, as proposed in [52]. Let Σ be a dictionary of linear continuous forms (say, for the L 2(Ωx)-norm) acting on the functions f y , y ∈ Ωy. Then, the gEIM consists in providing a set of basis functions h 1,…,h q , such that \({\mathbb{V}_q} = span\{ {h_1}, \ldots ,{h_q}\} = span\{ {f_{{y_1}}}, \ldots ,{f_{{y_q}}}\} \) for some empirically chosen {y 1,…,y q } ⊂ Ωy, and a set of linear forms, or moments, {σ1,…,σ q } ⊂ Σ. The generalized interpolant then takes the form

$${J_q}[{f_y}] = \sum\limits_{j = 1}^q {{g_j}(y){h_j}(x),\quad x \in {\Omega _x},\quad y \in {\Omega _y},} $$

and is defined in the following way

$${\sigma _i}({J_q}[{f_y}]) = {\sigma _i}({f_y}),\quad i = 1, \ldots ,q,$$

which will define the coefficients g j (y) for each y ∈ Ωy. We note that if the linear forms are Dirac functionals δ x with x ∈ Ωx, then the gEIM reduces to the plain EIM. The algorithm is given in Table 3.7. This constructive algorithm satisfies the following theoretical properties (see [52]):

  • the set {h 1,…,h q } consists of linearly independent functions;

  • the generalized interpolation matrix (B) ij = σ i (h j ) is lower triangular with unity diagonal (hence invertible), with the other entries in [−1, 1];

  • the generalized empirical interpolation procedure is well-posed in L 2x).

Scheme 3.7. Generalized Empirical Interpolation Method (gEIM)

Set q = 1. Do while err > tol:

  a.

    Pick the sample point

    $${y_q} = \mathop {\arg \,\sup }\limits_{y \in {\Omega _y}} {\left\| {{f_y} - {J_{q - 1}}[{f_y}]} \right\|_{{L^p}({\Omega _x})}},$$

    and the corresponding interpolation moment

    $${\sigma _q} = \mathop {\arg \,\sup }\limits_{\sigma \in \Sigma } \left| {\sigma ({f_{{y_q}}} - {J_{q - 1}}[{f_{{y_q}}}])} \right|.$$
  b.

    Define the next basis function as

    $${h_q} = \frac{{{f_{{y_q}}} - {J_{q - 1}}[{f_{{y_q}}}]}}{{{\sigma _q}({f_{{y_q}}} - {J_{q - 1}}[{f_{{y_q}}}])}}.$$
  c.

    Define the error level by

    $$err = {\left\| {er{r_p}} \right\|_{{L^\infty }({\Omega _y})}}\,with\,er{r_p}(y) = {\left\| {{f_y} - {J_{q - 1}}[{f_y}]} \right\|_{{L^p}({\Omega _x})}}$$

    and set q := q + 1.
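
A hedged Python sketch of Scheme 3.7 in a fully discrete setting is given below; here the dictionary Σ is, purely for illustration, a set of local averages over a sampling grid, the errors are measured in the discrete ℓ2-norm of the samples, and all identifiers are assumptions of the example.

```python
import numpy as np

def geim(snapshots, Sigma, tol=1e-8, max_terms=30):
    """Scheme 3.7 in a discrete setting.

    snapshots : array (M, N); column n samples f(., y_n) on M points of Omega_x.
    Sigma     : array (K, M); row k applies the linear form sigma_k to such a sample.
    Returns the basis H, the indices of the selected forms and snapshots."""
    H = np.zeros((snapshots.shape[0], 0))
    forms, params = [], []
    while H.shape[1] < max_terms:
        if forms:
            B = Sigma[forms, :] @ H                      # B_{kl} = sigma_k(h_l)
            G = np.linalg.solve(B, Sigma[forms, :] @ snapshots)
            Residual = snapshots - H @ G                 # f_y - J_{q-1}[f_y]
        else:
            Residual = snapshots.copy()
        errs = np.linalg.norm(Residual, axis=0)
        n_q = int(np.argmax(errs))                       # worst approximated snapshot
        if errs[n_q] < tol:
            break
        moments = Sigma @ Residual[:, n_q]
        k_q = int(np.argmax(np.abs(moments)))            # most informative linear form
        H = np.column_stack([H, Residual[:, n_q] / moments[k_q]])
        forms.append(k_q); params.append(n_q)
    return H, forms, params

# Toy example: Sigma = local averages over sliding windows (a simple dictionary).
x = np.linspace(0.0, 1.0, 400); y = np.linspace(0.5, 3.0, 60)
S = np.sin(np.outer(x, y))
Sigma = np.zeros((40, 400))
for k in range(40):
    Sigma[k, 10 * k:10 * (k + 1)] = 1.0 / 10.0           # mean over 10 consecutive points
H, forms, params = geim(S, Sigma, tol=1e-6)
print("q =", H.shape[1])
```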

In order to quantify the error of the interpolation procedure, like in the standard interpolation procedure, we introduce the Lebesgue constant in the L 2-norm:

$${\Lambda _q} = \mathop {\sup }\limits_{y \in {\Omega _y}} \frac{{{{\left\| {{J_q}[{f_y}]} \right\|}_{{L^2}({\Omega _x})}}}}{{{{\left\| {{f_y}} \right\|}_{{L^2}({\Omega _x})}}}},$$

i.e. the L 2-operator norm of J q . Thus, the interpolation error satisfies:

$${\left\| {{f_y} - {J_q}[{f_y}]} \right\|_{{L^2}(\Omega )}} \leqslant (1 + {\Lambda _q})\quad \mathop {\inf }\limits_{{v_q} \in {\mathbb{V}_q}} {\left\| {{f_y} - {v_q}} \right\|_{{L^2}(\Omega )}}.$$

Again, a (very pessimistic) upper bound for Λ q is:

$${\Lambda _q} \leqslant {2^{q - 1}}\mathop {\max }\limits_{i = 1, \ldots ,q} {\left\| {{h_i}} \right\|_{{L^2}(\Omega )}},$$

Indeed, the Lebesgue constant is, in many situations, uniformly bounded in this generalized setting. The following result shows that the greedy construction is close to optimal [53].

  1.

    Assume that the Kolmogorov n-width of ℱ in L 2(Ωx) is upper bounded by \({C_0}{n^{ - \alpha }}\) for some α > 0 and any n ≥ 1, then the interpolation error of the gEIM greedy selection process satisfies for any f ∈ ℱ the inequality \({\left\| {f( \cdot ,y) - {J_Q}[f( \cdot ,y)]} \right\|_{{L^2}({\Omega _x})}} \leqslant {C_0}{(1 + {\Lambda _Q})^3}{Q^{ - \alpha }}\).

  2.

    Assume that the Kolmogorov n-width of ℱ in L 2(Ωx) is upper bounded by \({C_0}{e^{ - {c_1}{n^\alpha }}}\) for any n ≥ 1, then the interpolation error of the gEIM greedy selection process satisfies for any f ∈ ℱ the inequality \({\left\| {f( \cdot ,y) - {J_Q}[f( \cdot ,y)]} \right\|_{{L^2}({\Omega _x})}} \leqslant {C_0}{(1 + {\Lambda _Q})^3}{e^{ - {c_2}{Q^\alpha }}}\) for a positive constant c 2 slightly smaller than c 1.

3.4.6.2 hp-EIM

If the Kolmogorov n-width decays only slowly with respect to n and the resulting number of basis functions and associated integration points is larger than desired, a remedy consists of partitioning the space Ωy into different elements Ω 1y ,…,Ω Py on which separate interpolation operators \({I_{{q_p}}}:{\{ {f_y}\} _{y \in \Omega _y^p}} \to {\mathbb{V}_{{q_p}}}\), p = 1,…,P, are constructed. That is, for each element Ω Py a standard EIM as described above is performed. The construction of the partition leaves some freedom, and different approaches have been presented in [30, 32].
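
Under the assumption that the partition of Ωy (and hence a grouping of the snapshot columns) is already given, the hp-EIM amounts to running the matrix EIM of Scheme 3.6 once per element; the helper below is a minimal sketch of that loop, reusing the illustrative eim_matrix function given after Scheme 3.6.

```python
def hp_eim(snapshots_per_element, tol=1e-8):
    """One independent EIM per element of the partition of Omega_y (hp-EIM sketch).

    snapshots_per_element : list of (N, m_p) matrices, one snapshot matrix per
                            parameter subdomain Omega_y^p, p = 1,...,P.
    """
    return [eim_matrix(M_p, tol=tol) for M_p in snapshots_per_element]
```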

A somewhat different approach is presented in [55], although in the framework of a projection method, where the idea of a strict partition of the space Ωy is abandoned. Instead, given a set of sample points y 1,…,y K for which the basis functions f(·,y 1),…,f(·,y K ) are known (or have been computed), a local approximation space for any y ∈ Ωy is constructed by considering the N basis functions whose parameter values are closest to y. In addition, the distance function measuring the distance between two points in Ωy can be built empirically in order to represent local anisotropies in the parameter space Ωy. Further, the distance function can also be used to define the training set Ω trainy, which can be sampled uniformly with respect to the problem-dependent distance function.
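
A minimal sketch of the local selection step of [55], under the assumption that the snapshots f(·,y k ) are stored column-wise and that the (possibly anisotropic) distance is supplied as a callable; all names are illustrative.

```python
import numpy as np

def local_basis(y, y_samples, snapshots, n_local, distance=None):
    """Return the n_local snapshots whose parameters are closest to y.

    y_samples : (K, d) array of sample parameters y_1,...,y_K.
    snapshots : (N, K) matrix whose k-th column is f(., y_k).
    distance  : optional callable (y, y_k) -> float, e.g. an empirically built
                anisotropic distance; defaults to the Euclidean distance.
    """
    if distance is None:
        d = np.linalg.norm(y_samples - y, axis=1)
    else:
        d = np.array([distance(y, yk) for yk in y_samples])
    nearest = np.argsort(d)[:n_local]
    return snapshots[:, nearest], nearest
```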

Several approaches have been presented in cases where Ωy is high-dimensional (dim(Ωy) ≈ 10). In such cases, finding the maximizer in (3.22) becomes a challenge. Since the discrete set Ω trainy should be representative of Ωy, we require that Ω trainy consists of a very large number of training points. Finding the maximum over this huge set is therefore prohibitively expensive as a result of the curse of dimensionality.

In [42], the authors propose a computational approach that randomly samples the space Ωy with a feasible number of training points, the sample being renewed at each iteration. The total number of training points tested over all iterations is therefore still very large, while finding the maximum at each individual iteration remains a feasible task.
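
This idea can be sketched as follows: at every greedy iteration a fresh, moderately sized random training set is drawn and the maximization is carried out only over that set. The skeleton below is purely illustrative; error_estimator, sample_parameter and update_model stand for problem-specific ingredients and are not defined under these names in [42].

```python
import numpy as np

def greedy_with_random_training(error_estimator, sample_parameter, update_model,
                                n_random, max_iter, tol, seed=0):
    """Greedy loop maximizing over a fresh random training set at each iteration."""
    rng = np.random.default_rng(seed)
    selected = []
    for _ in range(max_iter):
        candidates = [sample_parameter(rng) for _ in range(n_random)]
        errors = [error_estimator(y) for y in candidates]
        j = int(np.argmax(errors))
        if errors[j] < tol:
            break
        selected.append(candidates[j])
        update_model(candidates[j])   # enrich the approximation with the selected parameter
    return selected
```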

In [43], the authors use, in the framework of the reduced basis method, an ANOVA expansion based on sparse grid quadrature in order to identify the sensitivity of each dimension in Ωy. Once the unimportant dimensions in Ωy are identified, their values are fixed to some reference value and the variation of y in Ωy is restricted to the important dimensions. Finally, a greedy-based algorithm is used to construct a low-order approximation.

3.5 Comparison of ACA versus EIM

In the previous sections, we have given independent presentations of the basics of the ACA and the EIM type methods. As was explained, their backgrounds and applications are different. In addition, we have also presented the results of the convergence analysis of these approximations, which reveals another fundamental difference between the two approaches. The convergence of the ACA is established by comparison with any other interpolation system, such as polynomial approximation, and the existence of derivatives of the family of functions f y , y ∈ Ωy is then the reason for convergence. The convergence of the EIM, on the other hand, is measured against the n-width of the family, expressed by the Kolmogorov small dimension.

Nevertheless, despite their different origins, it is clear that some link exists between these two constructive approximation methods. We now show the relation between the ACA and the EIM in a particular case.

Theorem 3.1 The Bivariate Adaptive Cross Approximation with global pivoting is equivalent to the Empirical Interpolation Method using the L ∞(Ωx)-norm.

Proof We proceed by induction. Our claim A q at the q-th step is:

(A q )1: the interpolation points {x 1,…,x q } and {y 1,…,y q } of the EIM and ACA are identical;

(A q )2: g q (y) = r q-1(x q , y), y ∈ Ωy;

(A q )3: I q [f y ](x) = ℑ q [f y ](x), (x,y) ∈ Ωx × Ωy.

Induction base (q = 1): First, we note that r 0 = f and thus

$$({x_1},{y_1}) = \mathop {\arg \,\sup }\limits_{(x,y) \in {\Omega _x} \times {\Omega _y}} \left| {{r_0}(x,y)} \right| = \quad \mathop {\arg \,\sup }\limits_{(x,y) \in {\Omega _x} \times {\Omega _y}} \left| {f(x,y)} \right|.$$

Then, from (3.20) we conclude that \({h_1}(x) = \frac{{f(x,{y_1})}}{{f({x_1},{y_1})}}\) and by (3.17) we obtain that \({g_1}(y) = \frac{{f({x_1},y)}}{{{h_1}({x_1})}} = f({x_1},y) = {r_0}({x_1},y)\) since h 1(x 1) = 1. Further, using additionally (3.15), we get

$${I_1}[{f_y}](x) = {g_1}(y){h_1}(x) = {r_0}({x_1},y)\frac{{f(x,{y_1})}}{{f({x_1},{y_1})}} = \frac{{{r_0}({x_1},y){r_0}(x,{y_1})}}{{{r_0}({x_1},{y_1})}} = {\Im _1}[{f_y}](x),$$

for all (x, y) ∈ Ωx × Ωy and A1 holds in consequence.

Induction step (q > 1): Let us assume A q-1 to be true and we first note that

$${r_{q - 1}}\left( {x,y} \right) = f\left( {x,y} \right) - {I_{q - 1}}\left[ {{f_y}} \right]\left( x \right)$$
(3.23)

by (3.10) and (A q-1)3. Therefore, the selection criteria for the points (x q , y q ) are identical for the EIM with p = ∞ and the ACA with global pivoting. In consequence, the chosen sample points (x q , y q ) are identical. Further, combining (3.20) and (3.23) yields

$${h_q}\left( x \right) = \frac{{{f_{{y_q}}}\left( x \right) - {I_{q - 1}}\left[ {{f_{{y_q}}}} \right]\left( x \right)}}{{{f_{{y_q}}}\left( {{x_q}} \right) - {I_{q - 1}}\left[ {{f_{{y_q}}}} \right]\left( {{x_q}} \right)}} = \frac{{{r_{q - 1}}\left( {x,{y_q}} \right)}}{{{r_{q - 1}}\left( {{x_q},{y_q}} \right)}}.$$
(3.24)

By (3.17) for i = q, using that h q (x q ) = 1 and (3.23), we obtain (A q )2:

$${g_q}\left( y \right) = f\left( {{x_q},y} \right) - \sum\limits_{j = 1}^{q - 1} {{g_j}\left( y \right){h_j}\left( {{x_q}} \right)} = f\left( {{x_q},y} \right) - {I_{q - 1}}\left[ {{f_y}} \right]\left( {{x_q}} \right) = {r_{q - 1}}\left( {{x_q},y} \right).$$
(3.25)

Finally, combining (3.24) and (3.25) in addition to (A q-1)3, we conclude that

$${I_q}[{f_y}](x) = {I_{q - 1}}[{f_y}](x) + {g_q}(y){h_q}(x) = {\Im _{q - 1}}[{f_y}](x) + {r_{q - 1}}({x_q},y)\frac{{{r_{q - 1}}(x,{y_q})}}{{{r_{q - 1}}({x_q},{y_q})}} = {\Im _q}[{f_y}](x)$$

and the proof is complete.
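
The equivalence can also be checked numerically on a small example; the sketch below implements the ACA with global pivoting directly on a matrix and compares its pivots with those returned by the illustrative eim_matrix sketch given after Scheme 3.6 (the kernel 1/(1 + x + y) is an arbitrary test case, not taken from the text).

```python
import numpy as np

def aca_global(M, rank):
    """Bivariate ACA with global pivoting on the representative matrix M."""
    R = np.array(M, dtype=float)
    pivots = []
    for _ in range(rank):
        i, j = np.unravel_index(np.argmax(np.abs(R)), R.shape)
        pivots.append((int(i), int(j)))
        R -= np.outer(R[:, j], R[i, :]) / R[i, j]   # cross (rank-one) update
    return pivots

x = np.linspace(0.0, 1.0, 200)
y = np.linspace(0.0, 1.0, 150)
M = 1.0 / (1.0 + x[:, None] + y[None, :])
rows, cols, _ = eim_matrix(M, tol=0.0, max_rank=5, p=np.inf)
print(aca_global(M, 5))                            # same pivot pairs (up to ties) ...
print(list(zip(rows.tolist(), cols.tolist())))     # ... as the EIM with the sup-norm
```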

3.6 Gappy POD

In the following, we present a complement to the POD method called Gappy POD [17, 31, 70] or Missing Point Estimation [1]; we refer to it as the Gappy POD. It is a projection-based method (thus not an interpolation-based method, although in some particular cases it can be interpreted as an interpolation scheme). However, the projection is computed from a low-rank approximation that is based on partial or incomplete ("gappy") data of the functions under consideration. We first present the method as introduced in [17, 70] and then generalize it.

3.6.1 The Gappy POD Algorithm

We start from the conceptual idea that a set of basis functions {h 1,…,h Q } is given; it can, but does not need to, be obtained through a POD procedure. We first introduce the idea of Gappy POD in the context of Remark 3.1, where functions are represented by a vector containing their pointwise values on a given grid \(\Omega _x^{train} = \{ {\hat x_1}, \ldots ,{\hat x_M}\} \). We recall that the projection P Q [f y ] of f y with y ∈ Ωy onto the space spanned by {h 1,…,h Q } is defined by

$${({P_Q}[{f_y}],{h_q})_{\Omega _x^{train}}} = {({f_y},{h_q})_{\Omega _x^{train}}},\quad q = 1, \ldots ,Q.$$

Next, assume that we only have access to some incomplete data of f y . That is, we are given, say, L (< M) distinct points {x 1,…,x L } among Ω trainx at which f y (x i ) is available. Then, we define the gappy scalar product by

$${(v,w)_{L,\Omega _x^{train}}} = \frac{{|\Omega |}}{L}\sum\limits_{i = 1}^L {v({x_i})w({x_i}),} $$

which only takes into account available data of f y . We can compute the gappy projection defined by

$${({P_{Q,L}}[{f_y}],{h_q})_{L,\Omega _x^{train}}} = {({f_y},{h_q})_{L,\Omega _x^{train}}},\quad q = 1, \ldots ,Q.$$

Observe that the basis functions {h 1,…,h Q } are no longer orthonormal for the gappy scalar product and that the stability of the method mainly depends on the properties of the mass matrix M h,L defined by

$${({M_{h,L}})_{i,j}} = {({h_j},{h_i})_{L,\Omega _x^{train}}}.$$

To summarize, in the above presentation we assumed that the data of f y at some given points was available and then defined a "best approximation" with respect to the available but incomplete data. For instance, the data can be acquired by physical measurements, and the Gappy POD then allows one to reconstruct the solution on the whole domain Ω trainx, assuming that it can be accurately represented by the basis functions {h 1,…,h Q }.
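
In matrix form, the gappy projection amounts to solving the small normal equations obtained by restricting the basis to the available points (the factor |Ω|/L cancels). The following is a minimal sketch; gappy_projection and its argument names are illustrative.

```python
import numpy as np

def gappy_projection(H, f_at_sensors, idx):
    """Gappy POD projection of f_y onto span{h_1,...,h_Q}.

    H            : (M, Q) matrix, columns are the basis functions on the training grid.
    f_at_sensors : (L,) available values f_y(x_i) at the sensor locations.
    idx          : (L,) indices of the sensor locations within the training grid.
    """
    H_L = H[idx, :]                                      # basis restricted to the gappy data
    coeffs, *_ = np.linalg.lstsq(H_L, f_at_sensors, rcond=None)
    return H @ coeffs, coeffs                            # reconstruction on the full grid
```

The least-squares solve is equivalent to the normal equations with the gappy mass matrix M h,L = H Lᵀ H L (up to the factor |Ω|/L), whose conditioning governs the stability mentioned above.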

We now change the viewpoint and ask the following question: if we can place L sensors at locations {x i } L i=1 ⊂ Ωx at which we have access to the data f y (x i ) (through measurements), where should the points {x i } L i=1 be placed?

One might consider different criteria to choose the sensors. In [70], the placement of L sensors is stated as the minimization problem

$$\min \,\kappa ({M_{h,L}})\quad {\text{where }}{M_{h,L}}{\text{ is based on the }}L{\text{ points }}\{ {x_1}, \ldots ,{x_L}\} $$

and κ(M h,L ) denotes the condition number of M h,L . We report in Table 3.8 a slight modification of the algorithm presented in [1, 70] to construct a sequence of sensor placements {x 1,…,x L } (with L ≥ Q) based on an incremental greedy algorithm.

Scheme 3.8. Sensor placement algorithm with Gappy POD and minimal condition number

For 1 ≤ l ≤ L:

$${x_l} = \mathop {\arg \,\min }\limits_{x \in {\Omega _x}} \kappa ({M_{h,l}}(x))$$

where

$${({M_{h,l}}(x))_{i,j}} = \frac{{|{\Omega _x}|}}{l}\left[ {\sum\limits_{k = 1}^{l - 1} {{h_i}({x_k}){h_j}({x_k}) + {h_i}(x){h_j}(x)} } \right],\quad 1 \leqslant i,j \leqslant \min (Q,l).$$
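
A direct transcription of Scheme 3.8 into the NumPy setting used above reads as follows; place_sensors_cond and the optional initial argument (reused further below) are illustrative, and the |Ωx|/l scaling is dropped since it does not affect the condition number.

```python
import numpy as np

def place_sensors_cond(H, candidates, L, initial=None):
    """Sketch of Scheme 3.8: greedy sensor placement minimizing cond(M_{h,l})."""
    Q = H.shape[1]
    chosen = list(initial) if initial is not None else []
    while len(chosen) < L:
        q_act = min(Q, len(chosen) + 1)        # only the first min(Q, l) basis functions enter
        best, best_kappa = None, np.inf
        for c in candidates:
            if c in chosen:
                continue
            H_l = H[chosen + [c], :q_act]
            kappa = np.linalg.cond(H_l.T @ H_l)
            if kappa < best_kappa:
                best, best_kappa = c, kappa
        chosen.append(best)
    return chosen
```

Note that for the very first sensor the 1 × 1 matrix has condition number one for any location where h 1 does not vanish, so the selection is essentially arbitrary at that stage.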

Scheme 3.9. Sensor placement algorithm with Gappy POD and minimal error

For 1 ≤ l ≤ L:

$${x_l} = \mathop {\arg \,\max }\limits_{x \in {\Omega _x}} \left| {{P_{Q,l - 1}}[{f_y}](x) - {f_y}(x)} \right|$$

where P Q,l-1[f y] is the gappy projection of f y onto the span of {h 1,…,h min(Q,l-1)} based on the pointwise information at {x 1,…,x l-1}.

This natural algorithm of Scheme 3.8, based on the condition number, actually seems to have some difficulties at the beginning, for small values of l (for a single sensor the condition number equals one for any admissible location). It is thus recommended to start with the algorithm presented in Table 3.9.

This criterion is actually the one used in the Gappy POD method presented in [20] within the framework of the GNAT approach, which provides a stabilized implementation of the gappy method for a challenging CFD problem. Further, we have the following link between the gappy projection and the EIM, as noticed in [33].

Lemma 3.2 Let {h 1,…,h Q } and {x 1,…,x Q } be given basis functions and interpolation nodes. If the interpolation is well-defined (i.e. the interpolation matrix is invertible), then the interpolatory system based on the basis functions {h 1,…,h Q } and the interpolation nodes {x 1,…,x Q } is equivalent to the gappy projection system based on the basis functions {h 1,…,h Q } with available data at the points {x 1,…,x Q }; that is, for any y ∈ Ωy the unique interpolant I Q [f y ] ∈ span{h 1,…,h Q } such that

$${I_Q}\left[ {{f_y}} \right]\left( {{x_q}} \right) = {f_y}\left( {{x_q}} \right),\quad q = 1, \ldots ,Q,$$
(3.26)

is equivalent to the unique gappy projection P Q,L [f y ] defined by

$${\left( {{P_{Q,L}}\left[ {{f_y}} \right],{h_q}} \right)_{Q,\Omega _x^{train}}} = {\left( {{f_y},{h_q}} \right)_{Q,\Omega _x^{train}}},\quad q = 1, \ldots ,Q.$$
(3.27)

Proof Multiply (3.26) by \(\frac{{|{\Omega _x}|}}{Q}{h_i}({x_q})\) and take the sum over all q = 1,…,Q to obtain

$$\frac{{|{\Omega _x}|}}{Q}\sum\limits_{q = 1}^Q {{I_Q}[{f_y}]({x_q}){h_i}({x_q}) = } \frac{{|{\Omega _x}|}}{Q}\sum\limits_{q = 1}^Q {{f_y}({x_q}){h_i}({x_q}),\quad i = 1, \ldots ,Q,} $$

which is equivalent to \({({I_Q}[{f_y}],{h_i})_{L,\Omega _x^{train}}} = {({f_y},{h_i})_{L,\Omega _x^{train}}}\) for all i = 1,…,Q. On the other hand, if P Q,L [f y ] is the solution of (3.27), then there holds that

$$\sum\limits_{q = 1}^Q {{P_{Q,L}}\left[ {{f_y}} \right]\left( {{x_q}} \right){h_i}\left( {{x_q}} \right)} = \sum\limits_{q = 1}^Q {{f_y}\left( {{x_q}} \right){h_i}\left( {{x_q}} \right)} ,\quad i = 1, \ldots ,Q.$$
(3.28)

Since the interpolating system is well-posed, the interpolation matrix B i,j = h j (x i ) is invertible and thus, for each j = 1,…,Q, there exists a vector u j such that Bu j = e j , where e j is the j-th canonical basis vector. Then, multiply (3.28) by (u j ) i and sum over all i:

$$\sum\limits_{i,q = 1}^Q {{P_{Q,L}}[{f_y}]({x_q}){{({u_j})}_i}{B_{qi}} = \sum\limits_{i,q = 1}^Q {{f_y}({x_q}){{({u_j})}_i}{B_{q,i}},\quad j = 1, \ldots ,Q,} } $$

to get

$${P_{Q,L}}[{f_y}]({x_j}) = {f_y}({x_j}),\quad j = 1, \ldots ,Q.$$

Thus, the gappy projection satisfies the interpolation scheme.
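
The lemma can be illustrated with the gappy_projection sketch from above: when the number of sensors equals the number of basis functions and the restricted matrix is invertible, the gappy projection reproduces the data at the sensor locations. The matrices below are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((50, 4))           # Q = 4 basis functions on a grid of 50 points
f = rng.standard_normal(50)                # data vector representing some f_y
idx = np.array([3, 11, 27, 42])            # L = Q = 4 sensor locations
P, _ = gappy_projection(H, f[idx], idx)
assert np.allclose(P[idx], f[idx])         # the gappy projection interpolates at the sensors
```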

One feature of the sensor placement algorithm based on the Gappy POD framework is that the basis functions {h 1,…,h q } are given and the sensors are chosen accordingly. As a consequence of the interpretation of the gappy projection as an interpolation scheme when the number of basis functions and sensors coincide, one might combine the Gappy POD approach with the EIM in the following way in order to construct basis functions and redundant sensor locations simultaneously (a minimal sketch combining the two steps is given after the list):

  1.

    use the EIM to construct simultaneously Q basis functions {h q } Q q=1 and interpolation points {x q } Q q=1 until a sufficiently small error is achieved;

  2.

    use the gappy projection framework as outlined above to add interpolation points (sensors) to enhance the stability of the scheme.
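
Under the naming conventions of the earlier sketches (eim_matrix, place_sensors_cond and gappy_projection are all illustrative, as is the synthetic kernel used here), the two steps can be combined as follows.

```python
import numpy as np

# Synthetic data, purely for illustration.
x = np.linspace(0.0, 1.0, 100)
y = np.linspace(0.0, 1.0, 80)
M = np.exp(-(x[:, None] - y[None, :])**2 / 0.1)     # snapshots f_y on the grid

rows, cols, H = eim_matrix(M, tol=1e-6)             # step 1: EIM basis and points
sensors = place_sensors_cond(H, candidates=range(M.shape[0]),
                             L=len(rows) + 5,       # step 2: add, e.g., 5 extra sensors
                             initial=list(rows))
f_measured = M[:, 10]                               # pretend the 11th snapshot was measured
reconstruction, _ = gappy_projection(H, f_measured[sensors], np.array(sensors))
```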

3.6.2 Generalization of Gappy POD

In the previous algorithm the functions were represented by their nodal values at some points \({\hat x_1}, \ldots ,{\hat x_M}\). That is, we can introduce for each point \({\hat x_i}\) a functional \({\hat \sigma _i} = {\delta _{{{\hat x}_i}}}\) (\({\delta _x}\) denoting the Dirac functional associated with the point x) such that the interpolant of any continuous function f onto the space \({\mathbb{V}_M}\) of piecewise linear and globally continuous functions can be written as

$$\sum\limits_{m = 1}^M {{{\hat \sigma }_m}(f){{\hat \varphi }_m},} $$

where \(\{ {\hat \varphi _m}\} _{m = 1}^M\) denotes the Lagrange basis of \({\mathbb{V}_M}\) with respect to the points \({\hat x_1}, \ldots ,{\hat x_M}.\)

We present a generalization where we allow a more general discrete space \({\mathbb{V}_M}\). Therefore, let \({\mathbb{V}_M}\) be an M-dimensional discrete space spanned by a set of basis functions \(\{ {\hat \varphi _i}\} _{i = 1}^M\), such as, for example, finite element hat-functions, a Fourier basis or polynomial basis functions. In the context of the theory of finite elements, cf. [27], we are given M functionals \(\{ {\hat \sigma _m}\} _{m = 1}^M\), associated with the basis set \(\{ {\hat \varphi _i}\} _{i = 1}^M\), which determine the degrees of freedom of a function. That is, for f regular enough such that all degrees of freedom \({\hat \sigma _m}(f)\) are well-defined, the following interpolation scheme

$$f \to \sum\limits_{m = 1}^M {{{\hat \sigma }_m}(f){{\hat \varphi }_m}} $$

defines a function in \({\mathbb{V}_M}\) that interpolates the degrees of freedom.

We start by noting that the scalar product between two functions f, g in \({\mathbb{V}_M}\) is given by

$${(f,g)_{{\Omega _x}}} = \sum\limits_{n,m = 1}^M {{{\hat \sigma }_n}(f){{\hat \sigma }_m}(g){{({{\hat \varphi }_n},{{\hat \varphi }_m})}_{{\Omega _x}}}.} $$

In this framework, the meaning of "gappy" data is generalized. We speak of gappy data if only partial data of the degrees of freedom, i.e. of the \({\hat \sigma _m}(f)\), is available. Thus, in this generalized context, the degrees of freedom are not necessarily nodal values, i.e. the functionals need not be Dirac functionals, and they depend on the choice of the basis functions.

Assume that we are given Q basis functions h 1,…,h Q that describe a subspace of \({\mathbb{V}_M}\) and L ≥ Q degrees of freedom \({\sigma _l} = {\hat \sigma _{{i_l}}}\), for l = 1,…,L (chosen among all M degrees of freedom \({\hat \sigma _1}, \ldots ,{\hat \sigma _M}\)). Denoting by \({\varphi _l} = {\hat \varphi _{{i_l}}}\) the corresponding L basis functions, we then define the gappy scalar product

$${(f,g)_{L,{\Omega _x}}} = \frac{M}{L}\sum\limits_{l,k = 1}^L {{\sigma _l}(f){\sigma _k}(g){{({\varphi _l},{\varphi _k})}_{{\Omega _x}}}.} $$

Given any f y , y ∈ Ωy, the gappy projection P Q,L [f y ] ∈ span{h 1,…,h Q } is defined by

$${({P_{Q,L}}[{f_y}],{h_q})_{L,{\Omega _x}}} = {({f_y},{h_q})_{L,{\Omega _x}}},\quad q = 1, \ldots ,Q.$$

Then, the sensor placement algorithm introduced in the previous section can easily be generalized to this setting.
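
In coefficient form, and under the standard finite element convention that the degrees of freedom \({\hat \sigma _m}(f)\) coincide with the coefficients of f in the basis \(\{ {\hat \varphi _i}\} \) (dual basis), the generalized gappy projection again reduces to a small linear system. The sketch below is illustrative; the names and this dual-basis assumption are not part of the original presentation.

```python
import numpy as np

def generalized_gappy_projection(H_coeff, f_dofs, sel, mass):
    """Generalized gappy projection with arbitrary (non-nodal) degrees of freedom.

    H_coeff : (M, Q) coefficients of h_1,...,h_Q in the basis {phi_hat_i}.
    f_dofs  : (L,) available degrees of freedom sigma_l(f_y), l = 1,...,L.
    sel     : (L,) indices i_l of the selected degrees of freedom.
    mass    : (M, M) mass matrix (phi_hat_j, phi_hat_i)_{Omega_x}.
    """
    H_sel = H_coeff[sel, :]              # sigma_l(h_q), the selected dofs of the basis
    G = mass[np.ix_(sel, sel)]           # (phi_l, phi_k)_{Omega_x}
    A = H_sel.T @ G @ H_sel              # gappy Gram matrix of h_1,...,h_Q
    b = H_sel.T @ (G @ f_dofs)           # gappy right-hand side
    coeffs = np.linalg.solve(A, b)
    return H_coeff @ coeffs, coeffs      # coefficients of P_{Q,L}[f_y] in {phi_hat_i}
```

If mass is (|Ωx|/M) times the identity and the basis is nodal, this reduces to the gappy_projection sketch of the previous subsection, in line with Remark 3.11 below.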

Remark 3.11 If the mass matrix \({\hat M_{i,j}} = {({\hat \varphi _j},{\hat \varphi _i})_{{\Omega _x}}}\) associated with the basis set \(\{ {\hat \varphi _i}\} _{i = 1}^M\) satisfies the orthogonality property \({\hat M_{i,j}} = {({\hat \varphi _j},{\hat \varphi _i})_{{\Omega _x}}} = \frac{{|{\Omega _x}|}}{M}{\delta _{ij}}\), either by construction of the basis functions or by mass lumping (in the case of finite elements), and if the basis functions \(\{ {\hat \varphi _i}\} _{i = 1}^M\) are nodal basis functions associated with the set of points \(\Omega _x^{train} = \left\{ {{{\hat x}_1}, \ldots ,{{\hat x}_M}} \right\}\), then the original Gappy POD method is recovered.

Remark 3.12 If the selected basis functions \(\{ {\varphi _l}\} _{l = 1}^L\) are orthogonal with equal norms, i.e. the restricted mass matrix \({({\varphi _k},{\varphi _l})_{{\Omega _x}}}\) is proportional to the identity, then the gappy projection P Q,L [f y ] is the solution of the following quadratic minimization problem

$$\mathop {\min }\limits_{f \in {\mathbb{V}_Q}} \sum\limits_{l = 1}^L {|{\sigma _l}({f_y}) - {\sigma _l}(f){|^2}.} $$

Since L ≥ Q in a general setting, this means that the gappy projection fits the selected degrees of freedom optimally in a least-squares sense. In the general case, P Q,L [f y ] is the solution of the following minimization problem

$$\mathop {\min }\limits_{f \in {\mathbb{V}_Q}} \sum\limits_{l,k = 1}^L {({\sigma _l}({f_y}) - {\sigma _l}(f)){{({\varphi _l},{\varphi _k})}_{{\Omega _x}}}({\sigma _k}({f_y}) - {\sigma _k}(f)).} $$

Acknowledgements This work was supported by the research grant ApProCEM-FP7-PEOPLE-PIEF-GA-2010-276487.