1 Introduction

In the modelling of complex physical systems, uncertainties are inevitably encountered from various sources; they can be broadly categorized into epistemic and aleatory uncertainties. The former can be reduced by more precise measurements or more advanced noise filtering techniques, while the latter are very difficult, if not impossible, to capture accurately, due to possible multiscale properties and the intrinsic randomness of the physical systems. When the latter uncertainties are taken into account in the mathematical models, we are faced with stochastic problems. To solve them, various stochastic computational methods have been developed, such as perturbation methods, Neumann expansion, Monte Carlo, stochastic Galerkin, stochastic collocation, and the reduced basis method [7, 18, 32, 33, 45]. In particular, we are interested in the comparison of the stochastic collocation method and the reduced basis method.

In the early years, the stochastic collocation method was developed from the non-intrusive deterministic spectral collocation method [9, 10] to address applications in a variety of fields, for instance chemical and environmental engineering [30] and multibody dynamic systems [25]. Nevertheless, only in recent years [1, 44] has a complete analysis been carried out and new extensions outlined [4, 17, 22, 27, 31, 32]. In principle, the stochastic collocation method employs multivariate polynomial interpolation to approximate the stochastic solution at any given realization of the random inputs, based on deterministic solutions computed at the collocation points [1]. Since a deterministic system has to be solved at each collocation point, which becomes expensive in high dimensional spaces, isotropic and anisotropic sparse grids with suitable cubature rules [31, 32] have been successfully analysed and applied to reduce the computational burden of the stochastic collocation method. This method is attractive for practical applications because its non-intrusive structure allows the reuse of available deterministic solvers, as in the Monte Carlo method, and because it achieves a fast convergence rate, as the stochastic Galerkin method does; see the numerical comparison of the two in [3].

The reduced basis method, on the other hand, is a model reduction technique originally developed to solve parametric problems arising in the fields of structural mechanics, fluid dynamics, etc. [19, 20, 29, 34, 36, 37, 41]. For its application to stochastic problems, we first parametrize the random variables into a parameter space; we then select the most representative points in this parameter space by greedy sampling based on a posteriori error estimation [6, 7]. A landmark feature of the reduced basis method is the separation of the whole procedure into an offline computational stage and an online computational stage [34, 37]. During the former, the most computationally demanding ingredients are computed and stored once and for all: the parameters are sampled, the corresponding snapshots of the solution are computed and collected, and the associated matrices and vectors are assembled. During the online stage, only the parameter-dependent quantities are left to be computed and a small Galerkin approximation problem has to be solved. The reduced basis method is similar to the stochastic collocation method but uses a posteriori error estimation for sampling, and is thus potentially more efficient provided that the a posteriori error bound is cheap to obtain [6]. How to compute rigorous, sharp and inexpensive a posteriori error bounds remains an open and challenging task for the application of the reduced basis method to more general stochastic problems.

When it comes to solving a realistic stochastic problem in practice, we need to choose between different stochastic computational methods. It is crucial to know the properties of each method and, especially, how they compare in terms of complexity of formulation and implementation, convergence properties and computational cost for a specific problem. In this paper, our target is the comparison of the stochastic collocation method and the reduced basis method on a rather simple benchmark, a stochastic elliptic problem, in order to shed light on the advantages and disadvantages of each method. We hope to provide some insightful indications on how to choose the proper method for different problems. Generally speaking, for small scale and low dimensional problems the stochastic collocation method is preferred, while the reduced basis method performs better for large scale and high dimensional problems, as supported by our computational comparison. Moreover, our numerical experiments demonstrate that an efficient combination of the reduced basis method and the stochastic collocation method features a fast evaluation of the statistics of the stochastic solution.

In Sect. 2, a stochastic elliptic problem is set up with affine assumptions on the random coefficient field. The weak formulation and the regularity properties of this problem are provided. The general formulations of the stochastic collocation method and the reduced basis method are introduced in Sects. 3 and 4, respectively. A theoretical comparison of the convergence results in both the univariate and the multivariate case, as well as a direct comparison of the approximation errors, is carried out in Sect. 5, and a detailed comparison of the computational costs of the two methods is provided in Sect. 6 by evaluating the cost of each step of the algorithms. In Sect. 7, we perform a family of numerical experiments aimed at the assessment of the convergence rates and computational costs of the two methods. Finally, concluding remarks about the limitations of our work as well as possible extensions to more general stochastic problems are given in Sect. 8.

2 Problem Setting

Let \((\Omega , \mathcal{{F}}, P)\) be a complete probability space, where \(\Omega \) is a set of outcomes \(\omega \in \Omega \), \(\mathcal{{F}}\) is a \(\sigma \)-algebra of events and \(P:\mathcal{{F}} \rightarrow [0,1]\) with \(P(\Omega ) = 1\) assigns probability to the events. Let \(D\) be a convex, open and bounded physical domain in \(\mathbb R ^d\) (\(d=2,3\)) with Lipschitz continuous boundary \(\partial D\). We consider the following stochastic elliptic problem: find \(u: \Omega \times \bar{D} \rightarrow \mathbb R \) such that, almost surely,

$$\begin{aligned} -\nabla \cdot (a(\cdot , \omega ) \nabla u(\cdot , \omega ))&= f \quad \text { in } D\nonumber \\ u(\cdot , \omega )&= 0 \quad \text { on } \partial D \end{aligned}$$
(2.1)

where \(f\) is a deterministic forcing term defined in the physical domain \(D\), and a homogeneous Dirichlet boundary condition is prescribed on the whole boundary \(\partial D\) for simplicity. For the random coefficient \(a(\cdot ,\omega )\), we make the following assumptions:

Assumption 1

The random coefficient \(a(\cdot , \omega )\) is assumed to be uniformly bounded from below and from above, i.e. there exist constants \(0< a_{min} < a_{max} < \infty \) such that

$$\begin{aligned} P(\omega \in \Omega : a_{min} < a(x, \omega ) < a_{max} \quad \forall x \in \bar{D})= 1. \end{aligned}$$
(2.2)

Assumption 2

For the sake of simplicity, we assume that the random coefficient \(a(\cdot , \omega )\) depends only on a finite dimensional noise, in the following linear form

$$\begin{aligned} a(x, y) = a_0(x) + \sum \limits _{k=1}^K a_k(x)y_k(\omega ), \end{aligned}$$
(2.3)

where the leading term is assumed to be dominating and uniformly bounded away from 0, i.e.

$$\begin{aligned} \exists \delta > 0, a_{\min } \text { the same as in (2.2)}\, \text { s.t. } a_0(x)&\ge \delta , \, \forall x \in D \text { and } ||a_{k}||_{L^{\infty }(D)} < 2 a_{\min },\nonumber \\&1\le k \le K, \end{aligned}$$
(2.4)

and \(\{y_k\}_{k=1}^K\) are real valued random variables with joint probability density function \(\rho (y)\), where \(y= (y_1,\ldots ,y_K)\). Denoting \(\Gamma _k = y_k(\Omega ), k=1, \ldots , K\), and \(\Gamma = \Pi _{k=1}^K \Gamma _k\), we can also view \(y\) as a parameter in the parameter space \(\Gamma \), endowed with the measure \(\rho (y)dy\).

Remark 2.1

The expression (2.3) is widely used in practice and may arise, e.g., from the piecewise constant thermal conductivity of a heat conduction field, where the functions \(a_k, k=1, \ldots , K\), are characteristic functions; or from the truncation of the Karhunen–Loève expansion [42] of the correlation kernel of a porosity field with continuous and bounded second order moment, when modeling fluid flow in porous media, in which case \(a_k = \sqrt{\lambda _k}\phi _k\), \(k =1, \ldots , K\), with \(\lambda _k\) and \(\phi _k\) denoting the \(k\)th eigenvalue and eigenfunction of the expansion.

Under the above assumptions, the weak formulation of the stochastic elliptic problem reads: find \(u(y)\in H^1_0(D)\) such that the following equation holds for all \(y \in \Gamma \)

$$\begin{aligned} A(u,v;y) = F(v) \quad \forall v \in H^1_0(D), \end{aligned}$$
(2.5)

where \(H^1_0(D):=\{v: D^{\varvec{\beta }} v \in L^2(D), v|_{\partial D}=0\}\) with non-negative multi-index \({\varvec{\beta }} = (\beta _1, \ldots , \beta _d)\) such that \(|{\varvec{\beta }}|\le 1\) is a Hilbert space equipped with the norm \(||v||_{H^1_0(D)} = (\sum _{|\varvec{\beta }|\le 1}||D^{\varvec{\beta }}v||^2_{L^2(D)})^{1/2}\), where the derivative is defined as \(D^{\varvec{\beta }}v:=\partial ^{|\varvec{\beta }|}v/\partial x_1^{\beta _1}\cdots \partial x_d^{\beta _d}\); \(F(\cdot )\) is a linear functional defined as \(F(v) := (f,v)\) with \(f \in L^2(D)\); and \(A(\cdot ,\cdot ;y)\) is a bilinear form affinely expanded following (2.3)

$$\begin{aligned} A(u,v;y) = A_0(u,v) + \sum \limits _{k=1}^K A_k(u,v) y_k(\omega ), \end{aligned}$$
(2.6)

where the deterministic bilinear forms \(A_k(u,v)\) are given by \(A_k(u,v) := (a_k\nabla u, \nabla v), k = 0, 1, \ldots , K\). From assumption (2.2) we have that the bilinear form is coercive and continuous, and thus the existence of a unique solution \(u(y) \in H^1_0(D)\) to problem (2.5) for every \(y\in \Gamma \) is guaranteed by the Lax–Milgram theorem [39]. In fact, we are interested in a related quantity \(s(u;y)\), e.g., the linear functional \( F(u)\), as well as its statistics, e.g. the expectation \(\mathbb E [s]\), defined as

$$\begin{aligned} \mathbb E [s] = \int \limits _{\Gamma } s(u;y) \rho (y)dy. \end{aligned}$$
(2.7)

Since any approach for the approximation of the solution in the stochastic/parameter space depends on the regularity of the solution with respect to the random vector or parameter \(y \in \Gamma \), we briefly summarize in Lemma 2.1 the regularity results of [15], stated there for infinite dimensional problems (\(K = \infty \)).

Lemma 2.1

The following estimate for the solution of the problem (2.5) holds

$$\begin{aligned} ||\partial _y^{\nu } u||_{L^{\infty }(\Gamma ;X)} \le B |\nu | ! b^{\nu }, \end{aligned}$$
(2.8)

where \(\nu = (\nu _1, \ldots , \nu _K) \in \mathbb N ^K\), \(|\nu | = \nu _1 + \cdots + \nu _K\), \(H^1_0(D) \subset X \subset H^1(D)\), \(B = ||u||_{L^{\infty }(\Gamma ;X)}\) and

$$\begin{aligned} b_k = \frac{||a_k||_{L^{\infty }(D)}}{a_{min}} \text { and } b^{\nu } = \prod \limits _{k=1}^K b_k^{\nu _k}. \end{aligned}$$
(2.9)

Furthermore, by Taylor expansion, Lemma 2.1 implies the following analytic regularity, which generalizes the result in [4] from \(\mathbb{R }^K\) to \(\mathbb C ^K\).

Corollary 2.2

The solution \(u: \Gamma \rightarrow X\) is analytic and can be analytically extended to the set

$$\begin{aligned} \Sigma = \left\{ z \in \mathbb C ^K: \sum \limits _{k=1}^K |z_k-y_k| b_k < 1, \, \forall \, y \in \Gamma \right\} , \end{aligned}$$
(2.10)

In particular, for \(\tau _k \le 1/(Kb_k), 1\le k \le K\), we may also consider the region

$$\begin{aligned} \Sigma _{\tau } = \left\{ z \in \mathbb{C }^K: \text {dist}(z_k, \Gamma _k) \le \tau _k, \, \forall 1 \le k \le K\right\} . \end{aligned}$$
(2.11)

Remark 2.2

Problem (2.1) is a stochastic linear elliptic problem, coercive and affine, with a linear functional output. Without loss of generality, it serves as our reference benchmark for the comparison of approximation quality and computational costs between the reduced basis method [41] and the stochastic collocation method [1]. We remark that the comparison depends essentially on the following factors: the regularity of the stochastic solution in the parameter space, the dimension of the parameter space, and the complexity of solving a deterministic system at one stochastic realization. Therefore, the conclusions drawn from the comparison for the linear elliptic problem are expected to hold similarly for more general problems, as far as the above factors are concerned.

3 Stochastic Collocation Method

Given any realization \(y \in \Gamma \), the stochastic collocation method [1] essentially adopts Lagrangian interpolation to approximate the solution \(u(y)\) based on a set of deterministic solutions at collocation points chosen according to the probability distribution function of the random variables. Therefore, we have to solve one deterministic problem at each collocation point. In order to achieve an accurate and inexpensive collocation approximation of the stochastic solution as well as of its statistics, it remains only to select the collocation points efficiently. Let us first introduce univariate stochastic collocation.

3.1 Univariate Interpolation

Given the collocation points in \(\Gamma \), e.g., \( y^0 < y^1 < y^2 < \cdots < y^N \) as well as the corresponding solutions \(u(y^n), 0\le n \le N\), we define the univariate \(N\)th order Lagrangian interpolation operator as

$$\begin{aligned} \mathcal{{U}}_Nu(y) = \sum \limits _{n=0}^N u(y^n) l^n(y), \end{aligned}$$
(3.1)

where \(l^n(y), 0\le n \le N\) are the Lagrangian characteristic polynomials of order \(N\) given in the form

$$\begin{aligned} l^n(y) = {\prod \limits _{m\ne n}} \frac{y-y^m}{y^n-y^m} \quad 0 \le n \le N. \end{aligned}$$
(3.2)

One evaluation of \(\mathcal{{U}}_Nu(y)\) at a new realization \(y \in \Gamma \) requires \(O(N^2)\) operations by formula (3.1). For an efficient and stable polynomial interpolation, we use the barycentric formula [38] and rewrite the characteristic polynomials as

$$\begin{aligned} l^n(y) = \underbrace{\frac{1}{\prod _{m\ne n}(y^n - y^m)}}_{\bar{w}^n}\cdot \frac{1}{y-y^n} \underbrace{\prod \limits _{m=0}^N(y-y^m)}_{l(y)} = l(y) \frac{\bar{w}^n}{y-y^n} \quad 0 \le n \le N, \end{aligned}$$
(3.3)

where \(\bar{w}^n, 0\le n \le N\) are barycentric weights, so that the interpolation operator (3.1) becomes

$$\begin{aligned} \mathcal{{U}}_Nu(y) = {\sum \limits _{n=0}^N \frac{\bar{w}^n}{y-y^n} u(y^n)}\Big /{\sum \limits _{n=0}^N \frac{\bar{w}^n}{y-y^n}}, \text { since } \sum \limits _{n=0}^N l^n(y) = l(y) \sum \limits _{n=0}^N \frac{\bar{w}^n}{y-y^n} = 1, \end{aligned}$$
(3.4)

which instead needs only \(O(N)\) operations for one evaluation, provided that the barycentric weights are precomputed and stored. The statistics of the solution or of the output can therefore be evaluated, e.g.

$$\begin{aligned} \mathbb E [u] \approx \mathbb E [\mathcal{{U}}_Nu] = \sum \limits _{n=0}^N \left( \int \limits _{\Gamma } \left( \frac{\bar{w}^n}{y-y^n}\Big /\sum \limits _{n=0}^N \frac{\bar{w}^n}{y-y^n} \right) \rho (y) dy \right) u(y^n) = \sum \limits _{n=0}^N w^n u(y^n),\nonumber \\ \end{aligned}$$
(3.5)

where \({w}^n, 0\le n \le N\), are quadrature weights. In order to improve the accuracy of the numerical integration in (3.5) and of the numerical interpolation in (3.4), it is favourable to select the collocation points as the quadrature abscissas. Available quadrature rules include the Clenshaw–Curtis quadrature, Gaussian quadratures based on various orthogonal polynomials, and so on [38].
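For illustration, a minimal Python sketch of (3.3)–(3.4) is given below: the barycentric weights are precomputed once and each evaluation then costs \(O(N)\) operations. The function names, the use of NumPy and the toy integrand are our own choices and not part of the original implementation, and the solution values are treated as plain numbers rather than finite element fields.

```python
import numpy as np

def barycentric_weights(nodes):
    """Precompute the barycentric weights w_bar^n = 1 / prod_{m != n} (y^n - y^m), cf. (3.3)."""
    nodes = np.asarray(nodes, dtype=float)
    diff = nodes[:, None] - nodes[None, :]      # entry (n, m) is y^n - y^m
    np.fill_diagonal(diff, 1.0)                 # exclude the m = n factor from the product
    return 1.0 / np.prod(diff, axis=1)

def barycentric_interpolate(y, nodes, values, w_bar):
    """Evaluate U_N u(y) by the barycentric formula (3.4) in O(N) operations."""
    nodes = np.asarray(nodes, dtype=float)
    values = np.asarray(values, dtype=float)
    d = y - nodes
    hit = np.isclose(d, 0.0)
    if hit.any():                               # y coincides with a collocation point
        return values[int(np.argmax(hit))]
    ratios = w_bar / d
    return ratios.dot(values) / ratios.sum()

# usage on a toy function u(y) = 1 / (1 + y^2), sampled at Clenshaw-Curtis nodes (N = 8)
nodes = -np.cos(np.pi * np.arange(9) / 8)
w_bar = barycentric_weights(nodes)
u_hat = barycentric_interpolate(0.3, nodes, 1.0 / (1.0 + nodes**2), w_bar)
```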

3.2 Multivariate Tensor Product Interpolation

We rewrite the univariate interpolation formula (3.1) with the index \(k\) for the \(k\)th dimension as

$$\begin{aligned} \mathcal{{U}}_{N_k}u(y_k) \!=\! \sum \limits _{y_k^{n_k} \!\in \! \Theta ^k} u(y^{n_k}_k) l^{n_k}_k(y_k), \text { where } \Theta ^k \!=\! \{y_k^{n_k} \in \Gamma _k, {n_k} \!=\! 0, \ldots , N_k \} \text { for some } N_k \!\ge \! 1\nonumber \\ \end{aligned}$$
(3.6)

and the multivariate interpolation is then given as the tensor product of the univariate interpolations

$$\begin{aligned} \left( \mathcal{{U}}_{N_1} \otimes \cdots \otimes \mathcal{{U}}_{N_K}\right) u(y) = \sum \limits _{y_1^{n_1} \in \Theta ^1} \cdots \sum \limits _{y_K^{n_K} \in \Theta ^K} u(y^{n_1}_1, \ldots , y_K^{n_K}) \big (l_1^{n_1}(y_1) \otimes \cdots \otimes l_K^{n_K}(y_K)\big ).\nonumber \\ \end{aligned}$$
(3.7)

The corresponding barycentric formula for the multivariate interpolation is given as

$$\begin{aligned}&\left( \mathcal{{U}}_{N_1} \otimes \cdots \otimes \mathcal{{U}}_{N_K}\right) u(y)\nonumber \\&\quad = \sum \limits _{y_1^{n_1} \in \Theta ^1} \frac{b_{n_1}^1(y_1)}{\displaystyle \sum \limits _{y_1^{n_1} \in \Theta ^1} b_{n_1}^1(y_1)} \cdots \sum \limits _{y_K^{n_K} \in \Theta ^K} \frac{b_{n_K}^K(y_K)}{\displaystyle \sum \limits _{y_K^{n_K} \in \Theta ^K} b_{n_K}^K(y_K)} u(y^{n_1}_1, \ldots , y_K^{n_K}), \end{aligned}$$
(3.8)

where \(b_{n_k}^k(y_k) = \bar{w}^k_{n_k}/(y_k - y_k^{n_k})\), with the barycentric weights \( \bar{w}^k_{n_k} , 1 \le k \le K\), precomputed and stored. The multivariate barycentric formula reduces the cost of evaluating the tensor product interpolation from \(O(N_1^2\times \cdots \times N_K^2)\) operations by (3.7) to \(O(N_1 \times \cdots \times N_K)\) operations by (3.8). In analogy with the univariate case, the expectation of the solution approximated by multivariate interpolation is given as

$$\begin{aligned} \mathbb E [u] \approx \mathbb E [\left( \mathcal{{U}}_{N_1} \otimes \cdots \otimes \mathcal{{U}}_{N_K}\right) u] \!=\! \sum \limits _{y_1^{n_1} \in \Theta ^1} \cdots \sum \limits _{y_K^{n_K} \in \Theta ^K} u(y^{n_1}_1, \ldots , y_K^{n_K}) \big (w_1^{n_1} \times \cdots \!\times \! w_K^{n_K}\big ),\nonumber \\ \end{aligned}$$
(3.9)

where the quadrature weights \(w_k^{n_k}, 1 \le k \le K\) can be pre-computed and stored by

$$\begin{aligned} w_k^{n_k} = \int \limits _{\Gamma _{k}} \left( b_{n_k}^k(y_k)\Big /\sum \limits _{y_k^{n_k} \in \Theta ^k} b_{n_k}^k(y_k)\right) \rho (y_k) dy_k. \end{aligned}$$
(3.10)

We remark that the number of collocation points or quadrature abscissas grows exponentially fast, as \((N_1+1) \times \cdots \times (N_K+1), \text { or } (N_1+1)^K \text { if } N_1 = \cdots = N_K\), which prohibits the application of the multivariate tensor product interpolation to high dimensional stochastic problems (when \(K\) is large).
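The tensor product formula (3.8) can be evaluated dimension by dimension, as in the following sketch; it assumes, as a simplification of ours, that the solution values on the full tensor grid are stored in a \(K\)-dimensional array.

```python
import numpy as np

def tensor_barycentric_interpolate(y, nodes_1d, weights_1d, values):
    """Evaluate the tensor-product interpolant (3.8) at y = (y_1, ..., y_K).

    nodes_1d[k], weights_1d[k] : 1D collocation nodes and barycentric weights in direction k
    values : array of shape (N_1+1, ..., N_K+1) with u sampled on the full tensor grid
    """
    result = np.asarray(values, dtype=float)
    for y_k, nodes, w_bar in zip(y, nodes_1d, weights_1d):
        d = y_k - np.asarray(nodes, dtype=float)
        hit = np.isclose(d, 0.0)
        ratios = hit.astype(float) if hit.any() else np.asarray(w_bar) / d
        ratios = ratios / ratios.sum()               # normalized 1D barycentric factors
        # contract the current leading axis of the value array with the 1D factors
        result = np.tensordot(ratios, result, axes=(0, 0))
    return result
```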

3.3 Sparse Grid Interpolation

In order to alleviate the “curse of dimensionality” of the interpolation on the full tensor product grid, various sparse grid techniques [8] have been developed, among which the Smolyak construction [32] is one of the most popular. For isotropic interpolation, with the same treatment of the one dimensional polynomial space in each direction and a total level \(q\ge K\), we have the Smolyak interpolation operator

$$\begin{aligned} \mathcal{{S}}_q u(y) = \sum \limits _{{q-K+1}\le |{i}|\le q} (-1)^{q-|{i}|} \left( \begin{array}{c} K-1\\ q-|{i}| \end{array} \right) \left( \mathcal{{U}}^{i_1}\otimes \cdots \otimes \mathcal{{U}}^{i_K}\right) u(y), \end{aligned}$$
(3.11)

where \(|{i}|=i_1+\cdots +i_K\) with the multivariate index \({i} = (i_1, \ldots , i_K)\) defined via the index set

$$\begin{aligned} X(q,K):= \left\{ i\in N_+^K, \forall \, i_k \ge 1: \sum \limits _{k=1}^K i_k\le q \right\} , \end{aligned}$$
(3.12)

and the set of collocation nodes for the sparse grid (see the middle of Fig. 1) is thus collected as

$$\begin{aligned} {{H}}(q,K) = \bigcup _{{q-K+1}\le |{i}|\le q} \left( \Theta ^{i_1}\times \cdots \times \Theta ^{i_K}\right) , \end{aligned}$$
(3.13)

where the number of collocation nodes is \(\# \Theta ^{i_k} = 1\) if \(i_k = 1\), and \(\# \Theta ^{i_k} = 2^{i_k-1} + 1\) when \(i_k > 1\), in a nested structure. Note that we denote by \( \mathcal{{U}}^{i_k} \equiv \mathcal{{U}}_{N_k}\) the operator defined in (3.6) with \(N_k = 2^{i_k-1}\). Defining the difference operator \(\Delta ^{i_k} = \mathcal{{U}}^{i_k} - \mathcal{{U}}^{i_k-1}, k =1, \ldots , K\), with \(\mathcal{{U}}^0 = 0\), we have an equivalent expression of the Smolyak interpolation [1]

$$\begin{aligned} \mathcal{{S}}_q u(y)&= \sum \limits _{i \in X(q,K)} \left( \Delta ^{i_1}\otimes \cdots \otimes \Delta ^{i_K}\right) u (y)\nonumber \\&= \mathcal{{S}}_{q-1} u(y) + \sum \limits _{|i|= q}\left( \Delta ^{i_1}\otimes \cdots \otimes \Delta ^{i_K}\right) u (y). \end{aligned}$$
(3.14)

The above formula allows us to discretize the stochastic space in a hierarchical structure based on nested collocation nodes, such as the extrema of Chebyshev polynomials or the Gauss–Patterson nodes, leading to the Clenshaw–Curtis or the Gauss–Patterson cubature rule, respectively [24, 32].
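For illustration, the index range used in (3.11) and (3.13) and the nested Clenshaw–Curtis nodes on the reference interval \([-1,1]\) can be enumerated as in the following sketch; the rounding used to merge nested nodes is an implementation detail of ours.

```python
import itertools
import numpy as np

def smolyak_indices(q, K):
    """Multi-indices i >= 1 with q-K+1 <= |i| <= q, as in the Smolyak sum (3.11) and in (3.13)."""
    return [i for i in itertools.product(range(1, q - K + 2), repeat=K)
            if q - K + 1 <= sum(i) <= q]

def cc_nodes(level):
    """Nested Clenshaw-Curtis nodes on [-1, 1]: one node at level 1, 2^(i-1)+1 nodes for i > 1."""
    if level == 1:
        return np.array([0.0])
    m = 2 ** (level - 1) + 1
    return -np.cos(np.pi * np.arange(m) / (m - 1))

def sparse_grid(q, K):
    """Collect the sparse grid nodes H(q, K) of (3.13) as a set of K-tuples."""
    nodes = set()
    for i in smolyak_indices(q, K):
        for point in itertools.product(*(cc_nodes(ik) for ik in i)):
            nodes.add(tuple(round(float(x), 14) for x in point))   # rounding merges nested nodes
    return nodes
```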

Fig. 1

Two dimensional collocation nodes by the Clenshaw–Curtis cubature rule: tensor product grid, \(q=8\) (left); sparse grid, \(q=8\) (middle); anisotropic sparse grid, \(q=8\) and \(\alpha = (1,1.5)\) (right)

The Smolyak sparse grid [44] was originally developed to be isotropic in every one-dimensional polynomial space. However, the convergence rate in each direction may vary due to the different importance of each random variable; this can be exploited to further reduce the computational effort by an anisotropic sparse grid [31], written as

$$\begin{aligned} \mathcal{{S}}_q^{\alpha } u(y) = \sum \limits _{i \in X_{\alpha }(q,K)} \left( \Delta ^{i_1}\otimes \cdots \otimes \Delta ^{i_K}\right) u (y), \end{aligned}$$
(3.15)

with the weighted index

$$\begin{aligned} X_{\alpha }(q,K):= \left\{ i\in N_+^K, i \ge 1: \sum \limits _{k=1}^K i_k \alpha _k \le \min (\alpha ) q \right\} , \end{aligned}$$
(3.16)

where \(\alpha = (\alpha _1, \ldots , \alpha _K)\) represents the weights in the different directions, estimated either from a priori or from a posteriori error estimates, see [31]. Figure 1 displays the full tensor product grid, the sparse grid and the anisotropic sparse grid based on the Clenshaw–Curtis cubature rule. We can observe that the isotropic and anisotropic sparse grids are far coarser than the full tensor product grid, leading to a considerable reduction of the stochastic computation without much loss of accuracy, as we shall see in the convergence analysis and the numerical experiments in the following sections.
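A small variant of the previous sketch enumerates the weighted index set (3.16); the crude per-direction bound used to limit the search is our own simplification.

```python
import itertools

def anisotropic_indices(q, K, alpha):
    """Weighted index set X_alpha(q, K) of (3.16): i >= 1 and sum_k i_k * alpha_k <= min(alpha) * q."""
    alpha = list(alpha)
    bound = min(alpha) * q
    per_dir = [max(int(bound // a), 0) for a in alpha]     # crude per-direction upper bounds
    return [i for i in itertools.product(*(range(1, m + 1) for m in per_dir))
            if sum(ik * ak for ik, ak in zip(i, alpha)) <= bound]
```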

Remark 3.1

For certain specific problems, some other advanced techniques turn out to be more efficient than both the isotropic and the anisotropic Smolyak sparse grid techniques. For example, the quasi-optimal sparse grid [4] is assembled in a greedy manner to deal with the “accuracy-work” trade-off; adaptive hierarchical sparse grids [17, 27] are constructed adaptively over hierarchical levels with local refinement or domain decomposition in the stochastic space, which is more suitable for problems of low regularity; and the combination of analysis of variance (ANOVA) and sparse grid techniques [21, 22] addresses high dimensional problems.

4 Reduced Basis Method

In contrast to the interpolation approach used by the stochastic collocation method, the reduced basis method employs a Galerkin projection onto a reduced basis space spanned by a set of deterministic solutions [34, 35, 41]. Given any space \(X\) of dimension \(\mathcal{{N}}\) for the approximation of the solution of problem (2.5) (for instance, a finite element space), we build the \(N\) dimensional reduced basis spaces \(X_N\), \(N =1, \ldots , N_{max}\), hierarchically, until a prescribed tolerance is met at \(N_{max}\ll \mathcal{{N}}\), as

$$\begin{aligned} X_N = \text {span}\{u(y^n), 1\le n \le N\} \end{aligned}$$
(4.1)

based on suitably chosen samples \(S_N=\{y^1, \ldots , y^N\}\) from a training set \(\Xi _{train} \subset \Gamma \). The solutions \(\{u(y^n), n=1, \ldots , N\}\) are called “snapshots” corresponding to the samples \(\{y^n, n=1, \ldots , N\}\). Note that \(X_1 \subset X_2 \subset \cdots \subset X_{N_{max}}\). Given any realization \(y \in \Gamma \), we seek the solution \(u_N(y)\) in the reduced basis space \(X_N\) by solving the following Galerkin projection problem

$$\begin{aligned} A(u_N,v;y) = F(v) \quad \forall v \in X_N. \end{aligned}$$
(4.2)

With \(u_N(y)\) we can evaluate the output \(s_N(y) = s(u_N(y))\) as well as compute its statistics, e.g. the expectation \(\mathbb E [s_N]\), by using, e.g., the Monte Carlo method or the same quadrature formula used in the stochastic collocation method. Four specific ingredients of the reduced basis method play a key role in selecting the most representative samples, hierarchically building the reduced basis space, and efficiently evaluating the outputs. They are the training set, the greedy algorithm, the a posteriori error estimate and the offline–online computational decomposition, which are addressed in turn as follows.

4.1 Training Set

Two criteria should be fulfilled in the choice of the training set: it should be cheap, without too many ineffectual samples, in order to avoid unnecessary computation with little gain, and sufficiently rich to capture the most representative snapshots so as to build an accurate reduced basis space. In practice, the training set is usually chosen as randomly distributed or log-equidistantly distributed in the parameter space [34, 41]. For stochastic problems with random variables obeying a probability distribution other than the uniform one, we propose to choose the samples in the training set according to that probability distribution. Furthermore, for the sake of comparison with the stochastic collocation method, we take the training set such that it contains all the collocation points used by the stochastic collocation method. Adaptive approaches for building the training set, starting from a small number of samples and progressively enriching it in the space \(\Gamma \), have also been well explored, see [46].

4.2 Greedy Algorithm

Given a training set \(\Xi _{train}\subset \Gamma \) and a first sample set \(S_1 = \{y^1\}\), as well as its associated reduced basis space \(X_1 = \text {span}\{u(y^1)\}\), we seek a sub-optimal solution to the \(L^{\infty }(\Xi _{train};X)\) optimization problem in a greedy way as [41]: for \(N=2, \ldots , N_{max}\), find \(y^N = \arg \max _{y \in \Xi _{train}} \triangle _{N-1}(y)\), where \(\triangle _{N-1}\) is a sharp and inexpensive a posteriori error bound constructed in the current \(N-1\) dimensional reduced basis space (specified later). Subsequently, the sample set and the reduced basis space are enriched by \(S_N = S_{N-1} \cup \{y^N\}\) and \(X_N = X_{N-1} \oplus \text {span}\{u(y^N)\}\), respectively. For the sake of an efficient computation of the Galerkin projection and of the offline–online decomposition, we can orthonormalize the snapshots by the Gram–Schmidt process to obtain an orthonormal basis of \(X_N = \text {span}\{\zeta _1, \ldots , \zeta _N\}\) such that \((\zeta _m, \zeta _n)_X = \delta _{mn}, \, 1 \le m,n \le N\). We remark that another algorithm that might be used for the sampling procedure is the proper orthogonal decomposition, POD for short [41], which is rather expensive since it deals with the \(L^{2}(\Xi _{train};X)\) optimization problem, and is thus more suitable for low dimensional problems.
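The greedy sampling loop can be summarized by the following sketch, in which error_bound, solve_snapshot, orthonormalize and tol are placeholders for the problem-specific ingredients (the a posteriori error bound, the high-fidelity solver, a Gram–Schmidt step and a stopping tolerance); it is a schematic outline rather than a complete implementation.

```python
import numpy as np

def greedy_sampling(train_set, error_bound, solve_snapshot, orthonormalize, N_max, tol):
    """Greedy construction of the reduced basis space.

    train_set             : list of parameter values (the training set Xi_train)
    error_bound(y, basis) : a posteriori error bound Delta_N(y) for the current basis
    solve_snapshot(y)     : high-fidelity solve returning the snapshot u(y)
    orthonormalize(u, basis) : Gram-Schmidt step of u against the current basis
    """
    y0 = train_set[0]                        # arbitrary first sample y^1
    samples = [y0]
    basis = [orthonormalize(solve_snapshot(y0), [])]
    for N in range(2, N_max + 1):
        bounds = [error_bound(y, basis) for y in train_set]
        n_star = int(np.argmax(bounds))
        if bounds[n_star] < tol:             # stop when the worst-case bound meets the tolerance
            break
        y_star = train_set[n_star]
        samples.append(y_star)
        basis.append(orthonormalize(solve_snapshot(y_star), basis))
    return samples, basis
```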

4.3 A Posteriori Error Estimate

The efficiency and reliability of the reduced basis approximation built by the greedy algorithm rely critically on the availability of an inexpensive and sharp a posteriori error bound \(\triangle _{N}\), which can be constructed as follows: for every \(y\in \Gamma \), let \(R(v;y) \in X'\) be the residual in the dual space of \(X\), defined as

$$\begin{aligned} R(v;y) := F(v) - A(u_N(y),v;y) \quad \forall v \in X. \end{aligned}$$
(4.3)

By the Riesz representation theorem [16], there exists a unique function \(\hat{e}(y) \in X\) such that \((\hat{e}(y),v)_X = R(v;y) \, \forall v \in X\) and \(||\hat{e}(y)||_X = ||R(\cdot ;y)||_{X'}\), where the \(X\)-norm is defined as \(||v||_X = A(v,v;\bar{y})^{1/2}\) at some reference value \(\bar{y} \in \Gamma \) (we choose \(\bar{y}\) as the center of \(\Gamma \) by convention). Defining the error \(e(y) := u(y) - u_N(y)\), we have by (2.5), (4.2) and (4.3) the following error equation

$$\begin{aligned} A(e(y),v;y) = R(v;y) \quad \forall v \in X. \end{aligned}$$
(4.4)

By choosing \(v = e(y)\) in (4.4), recalling the coercivity constant \(\alpha (y)\) of the bilinear form \(A(\cdot , \cdot ; y)\) together with a computable lower bound \(\alpha _{LB}(y)\le \alpha (y)\), and using the Cauchy–Schwarz inequality, we have

$$\begin{aligned} \alpha _{LB}(y) ||e(y)||^2_X&\le A(e(y),e(y);y) = R(e(y);y)\nonumber \\&\le ||R(\cdot ,y)||_{X'}||e(y)||_X = ||\hat{e}(y)||_X ||e(y)||_X, \end{aligned}$$
(4.5)

so that we can define the a posteriori error bound \(\triangle _N\) for the solution \(u\) as \(\triangle _N:={||\hat{e}(y)||_X}/{\alpha _{LB}(y)}\), yielding \(||u(y) - u_N(y)||_X \le \triangle _N\) by (4.5). As for the output in the compliant case, i.e. \(s(\cdot ) = F(\cdot )\), we have the following error bound

$$\begin{aligned} |s(y) - s_N(y)| = |s(u(y)) - s(u_N(y))| = A(e(y),e(y);y) \le ||\hat{e}(y)||^2_X/{\alpha _{LB}(y)}.\quad \end{aligned}$$
(4.6)

For more general outputs, where \(s\ne F\), an adjoint problem of (2.5) can be employed to achieve a faster convergence of the approximation error \(|s-s_N|\) [37]. The efficient computation of a sharp and accurate a posteriori error bound thus relies on the computation of a lower bound \(\alpha _{LB}(y)\) of the coercivity constant as well as of the value \(||\hat{e}(y)||_X\) for any given \(y\in \Gamma \). For the former, we apply the successive constraint linear optimization method (SCM) [23] to compute a lower bound \(\alpha _{LB}(y)\) of \(\alpha (y)\). For the latter, we turn to an offline–online computational decomposition procedure.
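In algebraic terms, the bound \(\triangle _N\) can be assembled as in the following sketch, which computes the Riesz representative directly from the full-order residual via \(||\hat{e}(y)||_X^2 = r^T \mathbf{X}^{-1} r\), with \(\mathbf{X}\) the Gram matrix of the \(X\)-inner product; this is only an illustration of the definition, not the offline–online evaluation of Sect. 4.4, and the lower bound alpha_LB is assumed to be provided, e.g. by the SCM.

```python
import numpy as np

def rb_error_bound(y, u_N_coeffs, A_blocks, f_vec, X_mat, alpha_LB, basis):
    """Residual-based a posteriori bound Delta_N = ||e_hat(y)||_X / alpha_LB(y).

    A_blocks : list [A_0, ..., A_K] of full-order matrices of the bilinear forms A_k
    f_vec    : full-order vector representing F(.)
    X_mat    : Gram matrix of the X-inner product (defines the Riesz representative)
    basis    : matrix whose columns are the reduced basis functions (full-order coefficients)
    alpha_LB : callable returning a lower bound of the coercivity constant at y
    """
    theta = np.concatenate(([1.0], np.asarray(y, dtype=float).ravel()))  # y_0 = 1 as in (4.9)
    A_y = sum(t * A for t, A in zip(theta, A_blocks))                    # assemble A(.,.;y)
    residual = f_vec - A_y @ (basis @ u_N_coeffs)                        # algebraic residual of (4.3)
    e_hat = np.linalg.solve(X_mat, residual)                             # Riesz representative
    norm_e_hat = np.sqrt(residual @ e_hat)                               # ||e_hat||_X^2 = r^T X^{-1} r
    return norm_e_hat / alpha_LB(y)
```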

4.4 Offline–Online Computational Decomposition

The evaluation of the expectation \(\mathbb E [s_N]\) and of the a posteriori error bound \(\triangle _N\) requires computing the output \(s_N\) and the solution \(u_N\) many times. Similar situations are encountered in other applications in the context of many-query (optimal design, control) and real-time computational problems. One of the key ingredients that makes the reduced basis method stand out in this setting is the offline–online computational decomposition, which becomes possible thanks to an affine assumption such as that made in (2.3). To start, we express the reduced basis solution in the form [41]

$$\begin{aligned} u_N(y) = \sum \limits _{m=1}^N u_{Nm}(y) \zeta _m. \end{aligned}$$
(4.7)

Upon replacing it in (4.2) and choosing \(v = \zeta _n, 1\le n \le N\), we obtain the problem of finding \(u_{Nm}(y), 1 \le m \le N\) such that

$$\begin{aligned} \sum \limits _{m=1}^N\left( A_0(\zeta _m,\zeta _n) + \sum \limits _{k=1}^Ky_kA_k(\zeta _m,\zeta _n)\right) u_{Nm}(y) = F(\zeta _n) \quad 1\le n \le N. \end{aligned}$$
(4.8)

From (4.8) we can see that the values \(A_k(\zeta _m,\zeta _n), k = 0, 1, \ldots , K, 1 \le m,n \le N_{max}\), and \(F(\zeta _n), 1 \le n\le N_{max}\), are independent of \(y\); we may thus pre-compute and store them in the offline procedure. In the online procedure, we only need to assemble the stiffness matrix in (4.8) and solve the resulting \(N \times N\) system, with much less computational effort than solving the original \(\mathcal{{N}} \times \mathcal{{N}}\) system. As for the computation of the error bound \(\triangle _{N}(y)\), we need to compute \(||\hat{e}(y)||_X\) at the values of \(y\) selected in the course of the sampling procedure. We expand the residual (4.3) as

$$\begin{aligned} R(v;y) = F(v) - A(u_N,v;y) = F(v) - \sum \limits _{n=1}^N u_{Nn} \left( \sum \limits _{k=0}^K y_k A_k(\zeta _n,v) \right) , \text { where } y_0 = 1.\nonumber \\ \end{aligned}$$
(4.9)

Set \((\mathcal{{C}},v)_X = F(v)\) and \((\mathcal{{L}}^k_n,v)_X = -A_k(\zeta _{n},v) \,\forall v \in X, 1\le n \le N, 0\le k \le K\), where \(\mathcal{{C}}\) and \(\mathcal{{L}}^k_n\) are the representatives in \(X\) whose existence is secured by the Riesz representation theorem. By recalling \((\hat{e}(y),v)_X = R(v;y)\), we obtain

$$\begin{aligned} ||\hat{e}(y)||_X^2 = (\mathcal{{C}},\mathcal{{C}})_X + \sum \limits _{k=0}^K \sum \limits _{n=1}^N y_k u_{Nn}(y) \left( 2(\mathcal{{C}},\mathcal{{L}}^k_n)_X + \sum \limits _{k'=0}^K \sum \limits _{n'=1}^N y_{k'} u_{Nn'}(y) (\mathcal{{L}}^k_n,\mathcal{{L}}^{k'}_{n'})_X\right) .\nonumber \\ \end{aligned}$$
(4.10)

Therefore, we can compute and store \( (\mathcal{{C}},\mathcal{{C}})_X, (\mathcal{{C}},\mathcal{{L}}^k_n)_X, (\mathcal{{L}}^k_n,\mathcal{{L}}^{k'}_{n'})_X, 1 \le n,n' \le N_{max}, 0\le k, k' \le K\), in the offline procedure, and evaluate \(||\hat{e}(y)||_X\) in the online procedure by assembling (4.10).
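In matrix form, the decomposition may be sketched as follows, where basis collects the (orthonormalized) snapshot coefficient vectors as columns; this is a minimal illustration of (4.8) under the assumption that the full-order matrices of the forms \(A_k(\cdot ,\cdot )\) and the load vector are available.

```python
import numpy as np

def offline_reduced_blocks(A_blocks, f_vec, basis):
    """Offline stage: project the full-order operators once and for all, cf. (4.8)."""
    A_N = [basis.T @ A @ basis for A in A_blocks]    # reduced matrices A_k(zeta_m, zeta_n)
    f_N = basis.T @ f_vec                            # reduced right-hand side F(zeta_n)
    return A_N, f_N

def online_solve(y, A_N, f_N):
    """Online stage: assemble and solve the small N x N system (4.8) at a new parameter y."""
    theta = np.concatenate(([1.0], np.asarray(y, dtype=float).ravel()))   # y_0 = 1
    A_y = sum(t * A for t, A in zip(theta, A_N))
    return np.linalg.solve(A_y, f_N)                 # coefficients u_{Nm}(y) of (4.7)
```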

Remark 4.1

Unlike the stochastic collocation method, which was presented regardless of the underlying system, the reduced basis method has been introduced here for a linear, coercive and affine elliptic problem. In fact, the same approach can be extended to more general problems [37, 41], e.g. time-dependent, non-linear, non-coercive and non-affine problems, as long as the a posteriori error bound is cheap to obtain and the offline construction and the online evaluation can be efficiently decomposed using proper techniques [2, 19, 20, 28]; see, e.g., [14, 26, 36, 37, 40], where many different kinds of applications are addressed.

5 Comparison of Convergence Analysis

In this section, we provide a comparison of the theoretical convergence results for the stochastic collocation method and the reduced basis method. In the first part, a preliminary comparison is carried out based on the convergence results available in the literature, to the best of our knowledge. Then we perform a direct comparison of the approximation errors of the two methods.

5.1 Preliminary Comparison of Convergence Results

Let us first consider the a priori error estimate for one dimensional Lagrangian interpolation with \(y \in \Gamma = [-1,1]\), without loss of generality; indeed, any bounded interval \(\Gamma \) can be mapped into \([-1,1]\) by shifting and rescaling. The convergence result for the univariate stochastic collocation approximation is given as follows:

Proposition 5.1

Thanks to the analytic regularity in Corollary 2.2, we have the following exponential convergence rate for the one dimensional stochastic collocation approximation error in the \(L^{\infty }(\Gamma ;X)\) norm

$$\begin{aligned} ||u - \mathcal{{U}}_Nu||_{L^{\infty }(\Gamma ;X)} \le C_N r^{-N} = C_N e^{-(\ln r) N}, \end{aligned}$$
(5.1)

with \(r = a+b = \sqrt{1+\tau ^2} + \tau \ge (\sqrt{5}+1)/2 \approx 1.6\) owing to (2.11) and assumption (2.4). The constant \(C_N\) is bounded in a logarithmic rescaling \(C_N \le C\ln (N+1)\), where \(C\) is a constant independent of \(N\).

Remark 5.1

The same result has been obtained in \({L^{2}(\Gamma ;X)}\) norm in [1] except that the constant \(C_N\) in (5.1) is independent of \(N\). For the sake of comparison with the convergence rate of reduced basis method, we consider (5.1) in the norm of \({L^{\infty }(\Gamma ;X)}\) with the constant \(C_N\) depending on \(N\).

Proof

Firstly, we demonstrate that the operator \(\mathcal{{U}}_N: C^0(\Gamma ;X) \rightarrow L^{\infty }(\Gamma ;X)\) is continuous. In fact, by the definition of \(\mathcal{{U}}_N\) in (3.1), we have the following estimate

$$\begin{aligned} ||\mathcal{{U}}_Nu||_{L^{\infty }(\Gamma ;X)}&= \sup _{y\in \Gamma } \left| \left| \sum \limits _{n=0}^N u(y^n) l^n(y)\right| \right| _X\nonumber \\&\le \sup _{y\in \Gamma }\left( \sum \limits _{n=0}^N |l^n(y)|\right) \max _{n = 0, 1, \ldots , N} ||u(y^n)||_X \le \Lambda (N) ||u||_{C^0(\Gamma ; X)}, \end{aligned}$$
(5.2)

where \(\Lambda (N)\) is the optimal Lebesgue constant bounded by (see [36])

$$\begin{aligned} \Lambda (N): = \sup _{y\in \Gamma }\left( \sum \limits _{n=0}^N |l^n(y)|\right) \le \frac{3}{4} + \frac{2}{\pi }\ln (N+1). \end{aligned}$$
(5.3)

Therefore, using the fact that \(\mathcal{{U}}_Nw = w, \forall w \in \mathcal{{P}}_N(\Gamma )\otimes X\) (where \(\mathcal{{P}}_N(\Gamma )\) is the space of polynomials of degree less than or equal to \(N\)), we have that for every function \(u\in C^0(\Gamma ;X)\) and every \(w \in \mathcal{{P}}_N(\Gamma )\otimes X\),

$$\begin{aligned} ||u - \mathcal{{U}}_Nu||_{L^{\infty }(\Gamma ;X)}&\le ||u - w||_{L^{\infty }(\Gamma ;X)} + ||\mathcal{{U}}_N(w-u)||_{L^{\infty }(\Gamma ;X)}\nonumber \\&\le (1+\Lambda (N))||u - w||_{C^0(\Gamma ;X)}. \end{aligned}$$
(5.4)

Moreover, the following best approximation error estimate holds for every function \(u\in C^0(\Gamma ;X)\) admitting an analytic extension to \(\Sigma \) (see [1])

$$\begin{aligned} \inf _{w\in \mathcal{{P}}_N(\Gamma )\otimes X}||u - w||_{C^0(\Gamma ;X)} \le \frac{2}{r-1} r^{-N} \max _{z\in \Sigma }||u(z)||_X. \end{aligned}$$
(5.5)

A combination of (5.2), (5.3), (5.4) and (5.5) leads to the result stated in (5.1) with the constant \(C_N\) such that \(C_N \le C \ln (N+1)\), with \(C\) depending only on \( \max _{z\in \Sigma }||u(z)||_X \) and \(r\).\(\square \)

For the same one dimensional parametric problem, an a priori error estimate has been well established for the reduced basis approximation [29, 41]. Note that, in the context of the reduced basis approximation, the result is based on the assumption that the parameter \(y\) is positive, with \(0< y_{\min }\le y \le y_{max}<\infty \). For the sake of a consistent comparison with the stochastic collocation method, we still take the same parameter range \( \Gamma = [-1,1]\) and introduce a new parameter \(\mu = y + (1+\delta )\) with \(\delta > 0\), so that \(\mu \in [\delta , 2+\delta ]\) with \(\mu _{min} = \delta > 0\) and \(\mu _{max} = 2+\delta \). Correspondingly, the problem coefficient becomes \(a(x,y) = a_0(x) + a_1(x)y = (a_0(x) - (1+\delta )a_1(x)) + a_1(x)\mu \) and will be denoted by \(\hat{a}_0(x) + a_1(x) \mu \) for convenience. We state the convergence result for the one dimensional reduced basis approximation given in [34, 41] in the following proposition:

Proposition 5.2

Suppose that \(\ln \mu _r = \ln (\mu _{max}/\mu _{min}) > 1/(2e)\) and \(N \ge N_{crit} \equiv 1 + [2e\ln \mu _r]_+\) (here \([s]_+\) denotes the largest integer smaller than \(s\)); then

$$\begin{aligned} ||u - u_N||_{L^{\infty }(\Gamma ;X)} \le C e^{-(N-1)/(N_{crit}-1)}, \end{aligned}$$
(5.6)

where \(u_N\) is the reduced basis approximation of the solution in the reduced basis space spanned by \(N\) “snapshots”, and \(C\) is independent of \(N\). Note that the samples \(\mu _1, \ldots , \mu _N\) are taken logarithmically equidistant, i.e. \(\ln (\mu _n)-\ln (\mu _{n-1}) = \ln (\mu _r)/(N-1), 2 \le n \le N\).

To our knowledge, the a priori error estimates in Proposition 5.1 for the stochastic collocation approximation and in Proposition 5.2 for the reduced basis approximation are the best results available in the literature. Both show an exponential convergence rate for the approximation of the analytic solution with respect to the parameter \(y\in \Gamma \). In order to guarantee the positiveness of \(\hat{a}_0(x)\) in Proposition 5.2, we require \(\delta \le 1/2\) by assumption (2.4). Therefore, the minimal value of \(N_{crit}\) is \(1 + [2e\ln (\mu _r)]_+ = 9\), so that the convergence rate in (5.6) becomes \(e^{-(N-1)/8} \approx 1.13^{-(N-1)}\) for the \(N\) dimensional reduced basis approximation, which is larger than \( r^{-(N-1)}\) (\(r>1.6\)) in the stochastic collocation approximation (5.1) using \(N\) collocation nodes, corresponding to \(\mathcal U _{N-1}\). From this closer look, it seems that, as far as a priori error estimates are concerned, the stochastic collocation approximation is better than the reduced basis approximation in the univariate case under the above specific assumptions.

In the multivariate case, the convergence rate inherits that of the univariate case thanks to the full tensor product structure of the multivariate Lagrangian interpolation (3.7) in the stochastic collocation approximation. An a priori error estimate is given in the following proposition.

Proposition 5.3

Under assumption (2.4) and the analytic regularity of the solution in Corollary 2.2, with \(\Gamma = [-1,1]^K\) for simplicity, the following convergence result is a consequence of Proposition 5.1, obtained by recursively using the triangle inequality

$$\begin{aligned} ||u - \mathcal{{U}}_Nu||_{L^{\infty }(\Gamma ;X)} \le \sum \limits _{k=1}^K C_{N_k} e^{-\ln (r_k)N_k}, \end{aligned}$$
(5.7)

where \(r_k = a+b = \sqrt{1+\tau _k^2}+\tau _k > 1, 1 \le k \le K\) from (2.11) and \(N = (N_1, \ldots , N_K)\) is the interpolation order corresponding to the interpolation operator \((\mathcal U _{N_1} \otimes \cdots \otimes \mathcal U _{N_K})\).

Remark 5.2

Suppose that \(C_{N_k} = C_{N_1}\) and \(r_k = r > 1, 1 \le k \le K\), and that \(N_k = N_1, 2 \le k \le K\); then the total number of collocation nodes is \(N = K^{N_1}\) and the error estimate in Proposition 5.3 becomes

$$\begin{aligned} ||u - \mathcal{{U}}_Nu||_{L^{\infty }(\Gamma ;X)} \le C_{N_1} K N^{-\frac{\ln (r)}{\ln (K)}}, \end{aligned}$$
(5.8)

which decays very slowly when \(K\) is large and the analyticity parameter \(r\) is small. For instance, when \(K = 10\) and \(r = 1.6\), as in Proposition 5.1, we need at least \(N = 10^{10}\) collocation nodes in order to have \(KN^{-\frac{\ln (r)}{\ln (K)}} \le 0.1\).

The convergence of the isotropic and anisotropic Smolyak sparse grid stochastic collocation methods has been studied in [32] and [31] in the \({L^2(\Gamma ; X)}\) norm. Using the same argument as in the proof of Proposition 5.1, the following results in the \({L^{\infty }(\Gamma ; X)}\) norm are straightforward.

Proposition 5.4

Suppose that the function \( u \) can be analytically extended to a complex domain \(\Sigma (\Gamma ;\tau )\). Using the isotropic Smolyak sparse grid with Clenshaw–Curtis collocation nodes, we have

$$\begin{aligned} ||u - \mathcal{{S}}_q u||_{L^{\infty }(\Gamma ; X)} \le C_{q-K+1} N_q^{-r}, \end{aligned}$$
(5.9)

where: \(C_{q-K+1}\) is a constant depending on \(q-K+1\) and \(r\) s.t. \(C_{q-K+1} \le C(r) \ln (2^{q-K+1}+2)\); \(N_q = \# H(q,K)\) is the number of collocation nodes; and \(r\) is defined as \(r = \min (\ln (\sqrt{r_1}), \ldots ,\ln (\sqrt{r_K}))/(1+\ln (2K))\) with \(r_1, \ldots , r_K\) defined in (5.7). Using the anisotropic Smolyak sparse grid with Clenshaw–Curtis collocation nodes, we have

$$\begin{aligned} ||u - \mathcal{{S}}_q^{\alpha } u||_{L^{\infty }(\Gamma ; X)} \le C_{q-K+1} N_q^{-r(\alpha )}, \end{aligned}$$
(5.10)

where \(r(\alpha ) = \min (\alpha )(\ln (2)e - 1/2)/\left( \ln (2)+ \sum _{k=1}^K \min (\alpha )/\alpha _k\right) \) and \(\alpha _k = \ln (\sqrt{r_k}), k =1, \ldots , K\).

As for the reduced basis approximation of multivariate problems, there is unfortunately, to our knowledge, no a priori error estimate in the literature. However, there is indeed a comparison between the Kolmogorov width,

$$\begin{aligned} d_N(\Gamma ) := \inf _{\# S_N = N} \sup _{y \in \Gamma } \inf _{w_N \in X_N} ||u(y) - w_N||_X, \end{aligned}$$
(5.11)

which defines the error of the optimal approximation, and the convergence rate of the \(N\) dimensional reduced basis approximation generated by the greedy algorithm [5]. In (5.11), the notation is the same as in Sect. 4: \(S_N\) is a subset of samples with cardinality \(N\), and \(X_N = \text {span}\{u(y), y \in S_N\}\) is the function space spanned by the corresponding “snapshots”. Essentially, the Kolmogorov width measures the error of the best, or optimal, \(N\) dimensional approximation over all possible \(N\) dimensional approximations. We define the error of the \(N\) dimensional approximation in the subspace \(X^g_N\) constructed by the greedy algorithm as:

$$\begin{aligned} \sigma _N(\Gamma ) = \sup _{y \in \Gamma } \inf _{w_N\in X^g_N} ||u(y) - w_N||_X. \end{aligned}$$
(5.12)

In practice, we use the a posteriori error estimator \(\triangle _N\) introduced in Sect. 4, instead of the true error \(\inf _{w_N\in X^g_N} ||u(y) - w_N||_X\), for the greedy selection of quasi-optimal samples; the estimator satisfies

$$\begin{aligned} c \triangle _N \le \inf _{w_N\in X^g_N} ||u(y) - w_N||_X \le C \triangle _N, \text { where } 0 < \gamma \equiv \frac{c}{C} \le 1. \end{aligned}$$
(5.13)

A recent result [5] established a relation between the Kolmogorov width \(d_N\) and the reduced basis approximation error \(\sigma _N\), which is summarized in the following proposition.

Proposition 5.5

Suppose that \( \exists \, M > 0\) s.t. \(d_0(\Gamma )\le M\). Then, for any \(r > 0\),

$$\begin{aligned} \text {if } d_N(\Gamma ) \le M N^{-r} \text { then } \sigma _N(\Gamma ) \le C M N^{-r} \quad \forall N > 0, \end{aligned}$$
(5.14)

where the constant \(C\) depends only on \(r\) and \(\gamma \). Moreover, for any \(a>0\) and \(r>0\),

$$\begin{aligned} \text {if } d_N(\Gamma ) \le M e^{-aN^{r}} \text { then } \sigma _N(\Gamma ) \le C M e^{-cN^{s}} \quad \forall N \ge 0, \end{aligned}$$
(5.15)

where \(s=r/(r+1)\) and the constants \(c,C\) depend only on \(a, r\) and \(\gamma \).

This proposition basically states that whenever the Kolmogorov width decays at either an algebraic or an exponential rate, the greedy algorithm generates a quasi-optimal approximation space whose error decays in a similar way. Since the Kolmogorov width measures the error of the optimal approximation among all possible approximations, the stochastic collocation approximation error cannot be smaller than, or decay faster than, the Kolmogorov width. In particular, the Kolmogorov width is smaller than both the isotropic and the anisotropic sparse grid collocation approximation errors, i.e.

$$\begin{aligned} d_{N_q}(\Gamma ) \le \min \{||u - \mathcal{{S}}_q u||_{L^{\infty }(\Gamma ; X)}, ||u - \mathcal{{S}}_q^{\alpha } u||_{L^{\infty }(\Gamma ; X)}\} \end{aligned}$$
(5.16)

and if \(d_{N_q}(\Gamma ) \le M N_q^{-\tilde{r}}\) with the largest possible rate \(\tilde{r}\), then \(\tilde{r} \ge \max \{r, r(\alpha )\}\), where \(r\) and \(r(\alpha )\) are the algebraic convergence rates in (5.9) and (5.10), respectively. We then have the following a priori error estimate: the reduced basis approximation error satisfies \(\sigma _{N_q}(\Gamma ) \le CM N_q^{-\tilde{r}}\), which decays faster than the stochastic collocation approximation error. Moreover, if the stochastic solution is analytic in the probability/parameter space, as is the case for the elliptic problem (2.1) by Corollary 2.2, the Kolmogorov width can achieve an exponential convergence rate in practice [5], so that the reduced basis approximation error also decays exponentially, and much faster than the stochastic collocation approximation error, as demonstrated by our numerical experiments in Sect. 7.

Both the Kolmogorov width \(d_N(\Gamma )\) and the greedy error \(\sigma _N(\Gamma )\) are defined on the whole region \(\Gamma \). In practice, however, they are computed over the training set \(\Xi _{train} \subset \Gamma \). When the training set is dense enough, i.e. when \(d_N(\Gamma )\) and \(\sigma _N(\Gamma )\) are indistinguishable from \(d_N(\Xi _{train})\) and \(\sigma _N(\Xi _{train})\), the comparison above is valid. On the other hand, if the training set \(\Xi _{train}\) is rather sparse in \(\Gamma \), which is usually the case in high dimensional problems, the comparison might be invalid. In order to have a more rigorous and fair comparison of the reduced basis approximation and the stochastic collocation approximation, we perform a direct comparison of their approximation errors in the next section.

5.2 Direct Comparison of Approximation Errors

As mentioned above, selecting an appropriate training set \(\Xi _{train}\) for the reduced basis approximation is crucial. For an effective comparison with the stochastic collocation approximation, we choose the training set as the set of collocation points used in the latter approximation, which we denote by \(\Xi _{sc}\), in general for both the full tensor product grid and the sparse grid. Let us also denote the interpolation operator on \(\Xi _{sc}\) by \(\mathcal{{I}}_{sc}: C^0(\Gamma ;X) \rightarrow L^{\infty }(\Gamma ;X)\). We have the following proposition for a direct comparison:

Proposition 5.6

Provided that the training set \(\Xi _{train}\) for the reduced basis approximation is taken as the collocation set \(\Xi _{sc}\) of the stochastic collocation approximation, we have

$$\begin{aligned} ||u - u_N||_{L^{\infty }(\Gamma ;X)} \le C||u - \mathcal{{I}}_{sc}u||_{L^{\infty }(\Gamma ;X)}, \end{aligned}$$
(5.17)

where \(C = 3a_{max}/a_{min}\) (with \(a_{max},a_{min}\) defined in (2.2)) is a constant independent of \(N\).

Proof

By definition of the reduced basis approximation \(u_N\) in (4.7), we have

$$\begin{aligned} ||u - u_N||_{L^{\infty }(\Gamma ;X)}&= \sup _{y\in \Gamma } ||u(y)-u_N(y)||_X \nonumber \\&\le \frac{C}{3} \sup _{y \in \Gamma } \inf _{w\in X_N}||u(y)-w||_X \nonumber \\&\le \frac{C}{3} \sup _{y \in \Xi _{sc} / S_N}\inf _{w\in X_N}||u(y)-w||_X + \frac{C}{3} \sup _{ y \in \Gamma / \Xi _{sc}}\inf _{w\in X_N}||u(y) - w||_X,\nonumber \\ \end{aligned}$$
(5.18)

where \(C = 3a_{max}/a_{min}\) is a constant independent of \(N\), according to Céa's lemma [39]; the first inequality is due to the quasi-optimality of the Galerkin projection onto the space \(X_N\), while the second one comes from the fact that \(\Gamma = S_N \cup (\Xi _{sc}/S_N)\cup (\Gamma / \Xi _{sc})\) and that the reduced basis approximation error vanishes for any \(y \in S_N\), so that only the above two terms remain. For the second term of (5.18), we have

$$\begin{aligned} \sup _{ y \in \Gamma / \Xi _{sc}}\inf _{w\in X_N}||u(y) - w||_X \le \sup _{ y \in \Gamma / \Xi _{sc}}\inf _{v\in X_{sc}}||u(y) - v||_X + \inf _{w\in X_{N}}||v - w||_X, \end{aligned}$$
(5.19)

where the function \(v\) belongs to the space \(X_{sc}\), which is spanned by the solutions at the collocation points in \(\Xi _{sc}\). Therefore, the first term of (5.19) satisfies

$$\begin{aligned} \sup _{ y \in \Gamma / \Xi _{sc}}\inf _{v\in X_{sc}}||u(y) - v||_X \le \sup _{ y \in \Gamma / \Xi _{sc}}||u(y) - \mathcal{{I}}_{sc}u(y)||_X = \sup _{ y \in \Gamma }||u(y) - \mathcal{{I}}_{sc}u(y)||_X\nonumber \\ \end{aligned}$$
(5.20)

and (noting that \(v\) is one solution at some \(y\in \Xi _{sc}\)) the second term of (5.19) can be bounded by

$$\begin{aligned} \inf _{w\in X_{N}}||v - w||_X \le \sup _{y \in \Xi _{sc}}\inf _{w\in X_N}||u(y)-w||_X = \sup _{y \in \Xi _{sc} / S_N}\inf _{w\in X_N}||u(y)-w||_X.\nonumber \\ \end{aligned}$$
(5.21)

A combination of (5.18), (5.19), (5.20) and (5.21) leads to the following error bound

$$\begin{aligned} ||u - u_N||_{L^{\infty }(\Gamma ;X)} \le \frac{2C}{3} \sup _{y \in \Xi _{sc} / S_N}\inf _{w\in X_N}||u(y)-w||_X + \frac{C}{3} \sup _{ y \in \Gamma }||u(y) - \mathcal{{I}}_{sc}u(y)||_X.\nonumber \\ \end{aligned}$$
(5.22)

Moreover, we can construct the reduced basis space in such a way that the reduced basis approximation error in the collocation/training set \(\Xi _{sc}\) [the first term of (5.22)] is smaller than the stochastic collocation approximation error over \(\Gamma \) [the second term of (5.22)], i.e.

$$\begin{aligned} \sup _{y \in \Xi _{sc} / S_N}\inf _{w\in X_N}||u(y)-w||_X \le \sup _{ y \in \Gamma }||u(y) - \mathcal{{I}}_{sc}u(y)||_X, \end{aligned}$$
(5.23)

which is always possible; an extreme case is when all the collocation points are included in the sample set, i.e. \(\Xi _{sc} = S_N\), so that the first term of (5.22) vanishes. Therefore, substituting (5.23) into (5.22) we obtain (5.17).\(\square \)

Since the evaluation of statistics by the Monte Carlo algorithm converges very slowly, we propose to evaluate the solution by the reduced basis method at all the collocation nodes first, and then to apply a quadrature formula to approximate the statistics (2.7). To improve the accuracy of this approach, we build the training set as the set of collocation/quadrature nodes, \(\Xi _{train} = \Xi _{sc}\). In fact, we have the following error estimate between the expectation \(\mathbb E [s]\) and the value \(\mathbb E [s_{rb}]\) approximated by the reduced basis method (\(\mathbb E [s_{sc}]\) being the value approximated by the stochastic collocation method)

$$\begin{aligned} |\mathbb E [s] - \mathbb E [s_{rb}]| \le |\mathbb E [s] - \mathbb E [s_{sc}]| + |\mathbb E [s_{sc}] - \mathbb E [s_{rb}]|, \end{aligned}$$
(5.24)

where the first term represents the quadrature error, while the second term can be bounded as

$$\begin{aligned} |\mathbb E [s_{sc}] - \mathbb E [s_{rb}]|&\le \sum \limits _{y^i \in \Xi _{sc}} w^i |s(y^i) - s_{rb}(y^i)|\nonumber \\&\le \max _{y \in \Xi _{sc}} |s(y) - s_{rb}(y)| \nonumber \\&\le \max _{y \in \Xi _{sc}} ||s||_{X'} ||u(y) - u_N(y)||_X, \end{aligned}$$
(5.25)

where \(w^i > 0\) are quadrature weights with \(\sum _{y^i \in \Xi _{sc}} w^i = 1\). As long as the reduced basis approximation error is smaller than the quadrature error, (5.24) is dominated by the first term, i.e. the quadrature error.
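A minimal sketch of the proposed combination is given below: the reduced basis solution is evaluated at the collocation/quadrature nodes and the statistics are then obtained with the quadrature weights. The arguments reduced_solve and output are placeholders for the online reduced basis solver (e.g. the online_solve sketch of Sect. 4.4) and for the output functional \(s_N\).

```python
import numpy as np

def expectation_by_rb_quadrature(quad_nodes, quad_weights, reduced_solve, output):
    """Approximate E[s] by reduced basis evaluations at the quadrature nodes, cf. (5.24)-(5.25).

    reduced_solve(y) : online reduced basis solve returning the coefficients u_N(y)
    output(u_N, y)   : output functional s_N, e.g. f_N . u_N in the compliant case
    """
    values = [output(reduced_solve(y), y) for y in quad_nodes]
    return float(np.dot(quad_weights, values))
```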

6 Comparison of Computational Costs

In this section, we aim at a detailed comparison of the computational costs, in terms of operations count and storage, of the reduced basis method and of the stochastic collocation method. Let us begin with the computational costs (\(C(\cdot )\) stands for operations count and \(S(\cdot )\) for storage) of the stochastic collocation method, which are listed alongside Algorithm 1. The major computational costs of the reduced basis method are listed alongside Algorithm 2.

We use the following notation: \(N_{sc} = \#\Theta = (N_1+1) \times \cdots \times (N_K+1)\), \(N_{t} = \# \Xi _{train}\), \(N_{rb} = N_{max}\); \(W_{\alpha }\) is the average work to evaluate the lower bound \(\alpha _{LB}\) over the training set; \(W_s\) is the work to solve the linear system arising from (2.5) once, with \(C(\mathcal{{N}}^2) \le W_s \le C(\mathcal{{N}}^3)\); and \(W_m\) is the work to evaluate one inner product \((\mathcal{{L}},\mathcal{{L}})_X\) in (4.10), with \(C(\mathcal{{N}})\le W_m \le C(\mathcal{{N}}^2)\). The total computational costs (apart from that of the common initialization) of the reduced basis method and of the stochastic collocation method are calculated from Algorithms 1 and 2 and presented in Table 1.

More in detail, the offline cost of the stochastic collocation method is dominated by solving problem (2.5) \(N_{sc}\) times, with total work \(C(N_{sc}(W_s+K))\). Its online cost scales as \(C(N_{sc})\) by the multivariate barycentric formula or the quadrature formula. The total storage is dominated by that of all the solutions, \(S(N_{sc}\mathcal{{N}})\). As for the reduced basis method, the offline cost is the sum of pre-computing the lower bound, \(C(N_t W_{\alpha })\), solving the system \(N_{rb}\) times with total work \(C(N_{rb}W_s + KN_{rb}^2 \mathcal{{N}})\), computing the error bound with work \(C(K^2N_{rb}^2 W_m)\), and searching in the training set with work \(C(N_t K^2 N_{rb}^3)\). The online cost is the sum of assembling (4.8) with work \(C(KN_{rb}^2)\), solving it with work \(C(N_{rb}^3)\), and evaluating the error bound with work \(C(K^2N_{rb}^2)\); for the statistics by the quadrature formula, we need work \(C(N_{sc}(N_{rb}^3 + KN_{rb}^2))\). The total storage for the reduced basis method is \(S(N_{rb}\mathcal{{N}} + K^2N^2_{rb} + KN_t)\), for storing the snapshots, the reduced matrices and vectors, as well as the training set.

Algorithm 1 (stochastic collocation method) and Algorithm 2 (reduced basis method)

From Table 1 we can observe that an explicit comparison of the computational costs of the reduced basis method and of the stochastic collocation method depends crucially on the number of collocation points \(N_{sc}\), the size of the training set \(N_{t}\), the dimension of the reduced basis \(N_{rb}\) and the number of parameters \(K\), as well as on the work \(W_{\alpha }\) of computing the lower bound. In general, provided that the problem is computationally demanding, in the sense that \(\mathcal{{N}}\) is very large, and provided that \(N_{sc} \approx N_t\), we have \(N_{rb} \ll N_{sc}\), so that the reduced basis method is much more efficient in the offline procedure under the condition that \(W_{\alpha } \ll W_s\) thanks to the SCM optimization algorithm. As for the online evaluation of the solution at a new \(y \in \Gamma \), this advantage becomes even more evident, especially in high dimensions, since the online operations count for the reduced basis method is much smaller than that for the stochastic collocation method, i.e. \(C(N_{rb}^3 + KN_{rb}^2 + K^2N_{rb}^2) \ll C(N_{sc})\). However, for the evaluation of the statistics, e.g. the expectation \(\mathbb E [u]\), the online operations count \(C(N_{sc}(N_{rb}^3 + KN_{rb}^2))\) of the reduced basis method is larger than the online operations count \(C(N_{sc})\) of the stochastic collocation method. Moreover, if we choose the size of the training set much larger than the number of collocation points, \(N_t \gg N_{sc}\), which is usually the case in practice for low dimensional problems (\(K =1 ,2,3\)), or if the work \(W_{\alpha }\) for the computation of the lower bound \(\alpha _{LB}\) is not significantly smaller than \(W_s\), the stochastic collocation method could perform as well as or even better than the reduced basis method.
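As a back-of-the-envelope aid, the leading-order counts listed above can be collected as in the following sketch; the arguments correspond to the symbols of this section, and any numbers plugged in are purely illustrative.

```python
def sc_costs(N_sc, W_s, K):
    """Leading-order offline/online operation counts for stochastic collocation (cf. Table 1)."""
    return {"offline": N_sc * (W_s + K), "online_statistics": N_sc}

def rb_costs(N_t, N_rb, N_sc, K, W_s, W_alpha, W_m, N_h):
    """Leading-order operation counts for the reduced basis method (cf. Table 1); N_h = calN."""
    offline = (N_t * W_alpha + N_rb * W_s + K * N_rb**2 * N_h
               + K**2 * N_rb**2 * W_m + N_t * K**2 * N_rb**3)
    online_statistics = N_sc * (N_rb**3 + K * N_rb**2)
    return {"offline": offline, "online_statistics": online_statistics}
```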

Table 1 Computational costs of stochastic collocation method (SC) and reduced basis method (RB)

7 Numerical Experiments

In this section, we provide numerical substantiation of our previous analysis of the convergence rate and the computational costs by comparing the reduced basis method and the stochastic collocation method numerically. More precisely, we consider a stochastic elliptic problem on the two dimensional unit square \(x = (x_1,x_2) \in D = (0,1)^2\). The deterministic forcing term is fixed as \(f = 1\). The coefficient \(a(x,\omega )\) is a random field with finite second moment, whose expectation and covariance are given as

$$\begin{aligned} \mathbb E [a](x) = \frac{c}{100}, \text { for a suitable } c > 0; \qquad \mathbb C ov[a](x,x') = \frac{1}{100^2}\exp \left( -\frac{(x_1 - x_1')^2}{L^2}\right) , \quad x, x' \in D, \end{aligned}$$
(7.1)

where \(L\) is the correlation length. The Karhunen–Loève expansion of the random field \(a\) is

$$\begin{aligned} a(x,\omega ) = \frac{1}{100} \left( c + \left( \frac{\sqrt{\pi }L}{2}\right) ^{1/2} y_1(\omega ) + \sum _{n = 1}^{\infty } \sqrt{\lambda _n} \left( \sin (n \pi x_1)\, y_{2n}(\omega ) + \cos (n \pi x_1)\, y_{2n+1}(\omega )\right) \right) , \end{aligned}$$
(7.2)

where the uncorrelated random variables \(y_n, n \ge 1\), have zero mean and unit variance, and the eigenvalues \(\lambda _n, n \ge 1\), satisfy

$$\begin{aligned} \sqrt{\lambda _n} = \left( \sqrt{\pi }L\right) ^{1/2} \exp \left( -\frac{(n\pi L)^2}{8}\right) , \quad \forall \, n \ge 1. \end{aligned}$$
(7.3)

The random field \(a(x,\omega )\) will be chosen as in (7.4) and (7.7) below. All numerical computations are performed in MATLAB on an Intel Core i7-2620M processor at 2.70 GHz.
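As a concrete illustration of (7.2)–(7.3), the following Python sketch computes the square roots of the eigenvalues and evaluates a truncated coefficient; the truncation order and the evaluation point are arbitrary choices made only for illustration.

```python
import numpy as np

def kl_sqrt_eigenvalues(L, n_max):
    """sqrt(lambda_n), n = 1..n_max, from (7.3)."""
    n = np.arange(1, n_max + 1)
    return (np.sqrt(np.pi) * L) ** 0.5 * np.exp(-(n * np.pi * L) ** 2 / 8.0)

def coefficient(x1, y, c, L):
    """Truncated Karhunen-Loeve expansion (7.2); y = (y_1, ..., y_{2M+1})."""
    M = (len(y) - 1) // 2                        # number of sin/cos pairs retained
    sqrt_lam = kl_sqrt_eigenvalues(L, M)
    a = c + (np.sqrt(np.pi) * L / 2.0) ** 0.5 * y[0]
    for n in range(1, M + 1):
        a += sqrt_lam[n - 1] * (np.sin(n * np.pi * x1) * y[2 * n - 1]
                                + np.cos(n * np.pi * x1) * y[2 * n])
    return a / 100.0

# For L = 1/8 the first eigenvalues reproduce the values quoted in Sect. 7.2:
print(kl_sqrt_eigenvalues(1.0 / 8.0, 2) ** 2)    # approx [0.2132, 0.1899]
```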

7.1 Numerical Experiments for a Univariate Problem

For the univariate test problem, we take

$$\begin{aligned} a(x,\omega ) = \frac{1}{100} \left( 1 + \left( \frac{\sqrt{\pi }L}{2}\right) ^{1/2} \sin (2\pi x_1) y_1(\omega ) \right) , \end{aligned}$$
(7.4)

where \(y_1(\omega )\) follows a uniform distribution with zero mean and unit variance, \(y_1(\omega ) \sim \mathcal U (-\sqrt{3},\sqrt{3})\). We implement Algorithm 1 for the stochastic collocation approximation with Clenshaw–Curtis nodes (the same as Chebyshev–Gauss–Lobatto nodes [9, 43]), defined for \(y_1 \in \Gamma _1 = [-\sqrt{3},\sqrt{3}]\) as

$$\begin{aligned} y_1^{n} = -\sqrt{3} \cos \left( \frac{n\pi }{N}\right) , \quad n = 0, \ldots , N. \end{aligned}$$
(7.5)
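A minimal sketch of the node formula (7.5), written here in Python for illustration:

```python
import numpy as np

def clenshaw_curtis_nodes(N, a=-np.sqrt(3.0), b=np.sqrt(3.0)):
    """Clenshaw-Curtis (Chebyshev-Gauss-Lobatto) nodes on [a, b], cf. (7.5); N >= 1."""
    n = np.arange(N + 1)
    ref = -np.cos(n * np.pi / N)                 # nodes on the reference interval [-1, 1]
    return 0.5 * (a + b) + 0.5 * (b - a) * ref   # affine map to [a, b]

print(clenshaw_curtis_nodes(4))                  # 5 nodes in [-sqrt(3), sqrt(3)]
```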

We also implement Algorithm 2 for the reduced basis approximation with an equidistant training set \(\Xi _{train}\) of cardinality \(N_t = 1000\), which is rather dense in the interval \([-\sqrt{3},\sqrt{3}]\). We randomly select a testing set \(\Xi _{test}\) with \(N_{test} = 1000\) samples and define the \(L^{\infty }(\Gamma )\) error between the true solution \(u\) (the finite element solution) and the approximate solution \(u_{approx}\) as

$$\begin{aligned} ||u-u_{approx}||_{L^{\infty }(\Gamma ;X)} \approx \max _{y\in \Xi _{test}} ||u(y) - u_{approx}(y)||_X. \end{aligned}$$
(7.6)

We also compute the statistical error \(|\mathbb E [||u||_X] - \mathbb E [||u_{approx}||_X]|\) with the expectation defined in (3.5).
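The two error measures can be estimated, for instance, as in the following sketch; solve_fe, solve_approx and the norm matrix X_mat are hypothetical stand-ins for the finite element solver, the surrogate solver and the \(X\)-inner-product matrix used in the experiments.

```python
import numpy as np

def x_norm(v, X_mat):
    """Norm ||v||_X induced by the (hypothetical) inner-product matrix X_mat."""
    return float(np.sqrt(v @ (X_mat @ v)))

def linf_error(test_set, solve_fe, solve_approx, X_mat):
    """Approximation of ||u - u_approx||_{L^inf(Gamma;X)} as in (7.6)."""
    return max(x_norm(solve_fe(y) - solve_approx(y), X_mat) for y in test_set)

def statistical_error(quad_nodes, quad_weights, solve_fe, solve_approx, X_mat):
    """|E[||u||_X] - E[||u_approx||_X]| approximated by a quadrature rule."""
    e_true = sum(w * x_norm(solve_fe(y), X_mat) for y, w in zip(quad_nodes, quad_weights))
    e_appr = sum(w * x_norm(solve_approx(y), X_mat) for y, w in zip(quad_nodes, quad_weights))
    return abs(e_true - e_appr)
```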

Figure 2 illustrates the convergence of the error against the number of collocation nodes and the number of reduced bases for the stochastic collocation approximation and the reduced basis approximation, respectively. From the left of Fig. 2, we observe that both approximations achieve exponential convergence, in accordance with Propositions 5.1 and 5.2. The reduced basis approximation (convergence rate \(\approx \exp (-1.8N)\)) turns out to be slightly better than the stochastic collocation approximation (convergence rate \(\approx \exp (-1.3N)\)). As for the computation of the expectation \(\mathbb E [||u-u_{approx}||_X]\), we apply the Clenshaw–Curtis quadrature rule [43] for the stochastic collocation approximation and the Monte Carlo algorithm for the reduced basis approximation. The right of Fig. 2 shows that the quadrature rule, with exponential convergence rate \(\approx \exp (-1.6N)\), is clearly superior to the Monte Carlo algorithm, with algebraic convergence rate \(\approx N^{-1/2}\), for the univariate problem.

Fig. 2 Comparison of the convergence rate of the error \(||u - u_{approx}||_{L^{\infty }(\Gamma ;X)}\) (left) and of the expectation error \(|\mathbb E [||u||_X] - \mathbb E [||u_{approx}||_X]|\) (right) between the true and the approximate solutions in 1D

As for the computational costs, although the reduced basis approximation needs slightly fewer “snapshots” than the stochastic collocation approximation, it spends more on the computation of the a posteriori error estimator during greedy sampling over a large training set in the offline construction. From Table 2 for the univariate (1D) problem, we observe that for small scale problems, i.e. when the mesh size \(h\) is large, the offline construction of the reduced basis approximation is clearly more expensive than that of the stochastic collocation approximation. When the problem grows to large scale, i.e. when the mesh size \(h\) is small, the computational time is dominated by the solution of the finite element problem, and the reduced basis approximation becomes as efficient as the stochastic collocation approximation, or even more so. Moreover, the online evaluation of the solution \(u(y)\) for any given \(y\in \Gamma \) takes \(C(N_{SC})=C(28)\) operations by the stochastic collocation method, while the reduced basis method needs more computation, \(C(N_{RB}^3)=C(8000) > C(N_{SC})=C(28)\). From Table 2 we can also see that the online computational cost of the reduced basis approximation increases with the scale of the problem and takes more time than that of the stochastic collocation approximation, which depends only on the number of collocation points \(N_{SC}\). In the computation of statistics, the reduced basis–Monte Carlo approximation is much more expensive than the stochastic collocation approximation with the corresponding quadrature rule for the univariate problem. In order to alleviate the computational cost, we can first evaluate the solution at the collocation nodes by the reduced basis approximation and then use the quadrature formula to compute the statistics. However, this is not very useful if the number of collocation nodes is comparable to the number of reduced bases, as in the univariate case; we will compare this approach with the stochastic collocation approach for the multivariate case later. From the univariate experiment we conclude that the stochastic collocation approximation is more efficient than the reduced basis approximation for small scale problems in terms of computational costs, and becomes less efficient as the problem grows to large scale and becomes expensive to solve.

Table 2 1D offline (online in brackets) computational costs measured in CPU time for the reduced basis approximation (RB) and the stochastic collocation approximation (SC) achieving the same accuracy

Figure 3 depicts the procedure of reduced basis construction by the greedy sampling algorithm and the hierarchical stochastic collocation construction based on Clenshaw–Curtis nodes. At the top of Fig. 3, larger dots indicate samples selected earlier by the greedy algorithm; the pattern is very similar to the hierarchical collocation construction shown at the bottom of Fig. 3 in terms of the position and the order of selection of the nodes. This effect can be observed more closely in the middle figure, where the greedy samples are in close agreement with the Clenshaw–Curtis nodes. In fact, the maximum distance between corresponding greedy samples and Clenshaw–Curtis nodes (CC) is \(0.074\), and the mean distance is \(0.023\). For comparison, we also test Chebyshev–Gauss nodes (CG), Legendre–Gauss nodes (LG) and Legendre–Gauss–Lobatto nodes (LGL) (see [9]); the results are listed in Table 3, from which we see that the Clenshaw–Curtis nodes are the closest match, followed by the Legendre–Gauss–Lobatto nodes. Note that the average spacings of the samples in the training set are \(2\sqrt{3}/1000 = 0.0035\) and \(2\sqrt{3}/10000 = 0.00035\), which are much smaller than the quantities in Table 3, so that the remaining discrepancy between the samples selected by the greedy algorithm and the collocation nodes is intrinsic rather than an artifact of the training set resolution. A similar numerical coincidence has also been observed for the empirical interpolation method (EIM) [2, 28], which is used to efficiently construct affine approximations of nonlinear functions in the framework of the reduced basis approximation. This fact sheds light on the similarity between projection and interpolation in the common framework of nonlinear approximation, in the sense that the greedy algorithm for the reduced basis projection tends to select points at which the Lebesgue constant arising in stochastic collocation/interpolation is minimized.
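The distances reported in Table 3 can be computed, for example, by matching the two sorted point sets, as in this sketch; the greedy samples below are synthetic placeholders, not the ones from our computation.

```python
import numpy as np

def node_distances(greedy_samples, collocation_nodes):
    """Max and mean distance between corresponding points of two 1D point sets
    of equal size, matched after sorting."""
    g = np.sort(np.asarray(greedy_samples))
    c = np.sort(np.asarray(collocation_nodes))
    d = np.abs(g - c)
    return d.max(), d.mean()

# Illustrative call with hypothetical data: 9 greedy samples vs. 9 CC nodes.
cc = -np.sqrt(3.0) * np.cos(np.arange(9) * np.pi / 8)
greedy = cc + 0.02 * np.random.default_rng(0).standard_normal(9)   # fake greedy samples
print(node_distances(greedy, cc))
```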

Fig. 3 Comparison of greedy sampling (top) and the hierarchical Clenshaw–Curtis rule (bottom). The size of the nodes indicates the order in which they are selected in the hierarchical approximation. The correspondence between reduced basis samples (\(\circ \)) and Clenshaw–Curtis nodes (\(\cdot \)) is highlighted (middle)

Table 3 Comparison of the maximum distance (average distance in brackets) between greedy samples in reduced basis approximation and different collocation nodes for stochastic collocation approximation

7.2 Numerical Experiments for Multivariate Problems

For the multivariate test problem, we truncate the random field \(a(x,\omega )\) from the Karhunen–Loève expansion (7.2) with five uniformly distributed random variables \(y = (y_1,\ldots ,y_5)\) taking values in \(\Gamma = [-\sqrt{3},\sqrt{3}]^5\), and correlation length \(L = 1/8\), so that the first two eigenvalues are \(\lambda _1 \approx 0.2132\), \(\lambda _2 \approx 0.1899\). The truncated field reads

$$\begin{aligned} a(x,\omega ) = \frac{1}{100} \left( 4 + \left( \frac{\sqrt{\pi }L}{2}\right) ^{1/2} y_1(\omega ) + \sum _{n = 1}^{2} \sqrt{\lambda _n} \left( \sin (n \pi x_1)\, y_{2n}(\omega ) + \cos (n \pi x_1)\, y_{2n+1}(\omega )\right) \right) . \end{aligned}$$
(7.7)

Tensor product grids of the one dimensional Clenshaw–Curtis nodes (7.5) with \(N = 1,2,3,4,5,6,7\), together with the single node \([0,0,0,0,0]\), are used for the tensor product stochastic collocation approximation, while Smolyak sparse grids with levels \(q-5 = 1,2,3,4,5,6,7\) are used for the sparse grid stochastic collocation approximation. For the reduced basis approximation, we select the training set as the same \(7^5\) samples used as tensor product stochastic collocation nodes. The convergence results for the \(L^{\infty }(\Gamma )\) error and the expectation are displayed in Fig. 4. From the left of Fig. 4, we observe a clearly better convergence rate for the reduced basis approximation (still exponential, \(\approx \exp (-0.2N)\)) than for the stochastic collocation approximation (only \(\approx \exp (-0.0002N)\), or rather an algebraic convergence rate \(\approx N^{-1.5}\)). The sparse grid collocation achieves a better approximation than the tensor product collocation at the beginning, but loses this advantage to the latter due to slower convergence in our specific five dimensional experiment.

Fig. 4 Comparison of the convergence rate of the error \(||u - u_{approx}||_{L^{\infty }(\Gamma ;X)}\) (left) and of the expectation error \(|\mathbb E [||u||_X] - \mathbb E [||u_{approx}||_X]|\) (right) between the true and the approximate solutions in 5D

As for the convergence of the expectation \(\mathbb E [||u||_X]\), seen on the right of Fig. 4, the highest (exponential) convergence rate is still achieved by the reduced basis–collocation approximation, which consists in constructing the reduced basis first and then evaluating the solution at the collocation/quadrature points by the reduced basis approximation. Similar convergence behaviour can be observed for the tensor product and sparse grid collocation approximations, which are still better than the reduced basis–Monte Carlo approximation, though this advantage is less evident than in the univariate case.

For the comparison of computational cost, besides the same \(7^5\) training samples used as tensor product stochastic collocation nodes, we also use \(N_t = 1000 \ll 7^5\) randomly generated samples as the training set and obtain the same number of reduced bases to achieve the same accuracy, thanks to the smoothness of the solution in the parameter space. From Table 4, we see that the offline computational cost of the stochastic collocation approximation grows exponentially fast with the complexity of the problem, while for the reduced basis approximation it increases only mildly and is dominated linearly by the cardinality of the training set \(\Xi _{train}\): the contrast of \(7^5 \approx 1.7\times 10^4\) to \(10^3\) is almost the same as the ratio of the CPU times, \(839/50 \approx 17\). In comparison, the reduced basis approximation becomes much more efficient than the stochastic collocation approximation in the offline construction for large scale problems, while it loses moderately to the latter in the online computational cost. In the computation of statistics, the reduced basis–collocation approximation is much faster than the stochastic collocation approximation for large scale problems (\(h = 1/128\)), since \(949(\text {offline})+0.05\times 7^5(\text {online}) \approx 1789 \ll 17252\), while the opposite holds for small scale problems (\(h=1/8\)), since \(839+0.0005\times 7^5 \approx 847 \gg 17\).

Table 4 5D offline (online in brackets) computational costs measured in CPU time for the reduced basis approximation (RB) and the stochastic collocation approximation (SC) achieving the same accuracy

When numerically solving the five dimensional stochastic problem, we see that both the collocation and the reduced basis approximations achieve better convergence properties than the Monte Carlo algorithm. However, when the number of random variables or parameters becomes very large, the tensor product stochastic collocation approximation would need too many collocation points, so that the quadrature formula loses its advantage over the Monte Carlo algorithm. Meanwhile, the size of the training set for the reduced basis construction also grows exponentially with the dimension of the problem. It is therefore necessary to alleviate the computational cost. When the random variables \(y_k, 1\le k \le K\), have different importance for the stochastic problem, it would be wasteful to put the same weight on the ones with little importance as on those with much larger influence. For instance, the first few eigenvalues \(\lambda _1 \approx 0.4782, \lambda _2 \approx 0.0752, \lambda _3 \approx 0.0034, \lambda _4 \approx 0.000045\) decay so fast for a large correlation length (\(L = 1/2\)) in the Karhunen–Loève expansion (7.2) that the random variables carry very distinct weights in determining the value of the coefficient \(a(x,y_1,\ldots , y_K)\).

The key idea behind the anisotropic sparse grid is to take advantage of the anisotropic weights, placing more collocation points in the dimensions that suffer from slower convergence in order to balance and minimize the global error [31]. Obtaining a sharp estimate of the importance, or weight, of the different dimensions is crucial for the use of the anisotropic sparse grid. One way is to derive an a priori error estimate with a convergence rate, e.g. the rate \(\exp (-\ln (r_k)N), 1\le k \le K\), in (5.7), that is as accurate as possible. However, deriving an analytical estimate of the convergence rate for general problems is rather difficult. Alternatively, we may perform an empirical estimation by fitting the convergence rate from numerical evaluations in each dimension, see Fig. 5, and use the estimated convergence rates as \(\alpha \) in (3.15) for the anisotropic sparse grid construction [31]; a minimal sketch of such a fit is given after Fig. 5. To test the efficiency of the anisotropic grid, we take the correlation length \(L = 1/2\) and \(c = 5\) for the coefficient \(a(x,\omega )\) in (7.2) and truncate it with nine random variables \(y = (y_1, \ldots , y_9) \in \Gamma = [-\sqrt{3}, \sqrt{3}]^9\). Instead of the norm \(||u-u_{approx}||_{L^{\infty }(\Gamma ,X)}\), we use \(||\,||u||_X-||u_{approx}||_X\,||_{L^{\infty }(\Gamma )}\) to reduce the evaluation cost.

Fig. 5 Empirical convergence (left) and fitted convergence rate (right) in dimensions \(1\le k \le 9\)
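A minimal sketch of this empirical fitting, assuming that for each dimension \(k\) the error has been recorded against the number of nodes \(N\); the error histories below are synthetic placeholders.

```python
import numpy as np

def fit_exponential_rate(N_values, errors):
    """Fit errors ~ C * exp(-rate * N) by linear least squares on log(errors);
    the returned rate can be used as the weight alpha_k in (3.15)."""
    slope, _ = np.polyfit(np.asarray(N_values, dtype=float),
                          np.log(np.asarray(errors, dtype=float)), 1)
    return -slope

# Hypothetical per-dimension error histories (faster decay = more important dimension).
N = np.arange(1, 8)
errors_dim1 = 1e-2 * np.exp(-1.5 * N)
errors_dim2 = 1e-2 * np.exp(-0.3 * N)
alpha = [fit_exponential_rate(N, e) for e in (errors_dim1, errors_dim2)]
print(alpha)    # approx [1.5, 0.3]
```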

We use the isotropic sparse grid and the anisotropic sparse grid at interpolation levels \(q-9 = 1,2,3,4,5,6\) for the stochastic collocation approximations in (3.11) and (3.15), and choose the training samples for the reduced basis approximation as the collocation nodes of the sparse grid at the deepest interpolation level \(q-9 = 6\) (\(100897\approx 10^5\) nodes). From Fig. 6 we see that the reduced basis approximation converges much faster than the stochastic collocation approximation in both the \(L^{\infty }(\Gamma )\) norm and the statistical error. The offline computational cost of the reduced basis approximation for small scale problems (\(h=1/8, 1/16, 1/32\)) is larger than that of the stochastic collocation approximation, while for large scale problems (\(h=1/64, 1/128\)) the opposite holds, see Table 5. Besides, we also use \(10^3\) randomly generated training samples for the reduced basis approximation and still obtain high accuracy in both the \(L^{\infty }(\Gamma )\) norm and the statistical error, owing to the fact that the solution is very smooth in the parameter space. We can see from Table 5 that the computational costs with \(10^3\) samples are far lower than those of the sparse grid stochastic collocation approximation for both the offline construction and the online evaluation. In fact, the online cost of the reduced basis approximation stays the same, dominated by the number of reduced bases \(N_{rb}\) as \(O(N^3_{rb}+KN^2_{rb}+K^2N^2_{rb})\), while the online cost of the stochastic collocation approximation grows approximately linearly with the number of collocation points, \(O(N_{sc})\) (compare \(10^5/7^5 \approx 6\) with the online CPU time ratio \(0.13/0.02 = 6.5\)). Figure 6 also shows that the anisotropic sparse grid is more efficient than the isotropic sparse grid for anisotropic problems. Meanwhile, we can see that the stochastic collocation approximation based on the tensor product grid starts to converge more slowly than \(N^{-1/2}\), which is the typical convergence rate of the Monte Carlo method.

Fig. 6 Comparison of the convergence rate of the error \(||\,||u||_X-||u_{approx}||_X\,||_{L^{\infty }(\Gamma )}\) (left) and of the expectation error \(|\mathbb E [||u||_X] - \mathbb E [||u_{approx}||_X]|\) (right) between the true and the approximate solutions in 9D

Table 5 9D offline (online in brackets) computational costs measured in CPU time for the reduced basis approximation (RB) and the stochastic collocation approximation (SC) achieving the same accuracy

7.3 Numerical Experiments for Higher Dimensional Problems

In this numerical experiment, we deal with high dimensional stochastic problems, pushing the dimension from 9 to 21, 51 and up to 101, and compare the performance of the reduced basis approximation and the stochastic collocation approximation. Note that in high dimensions, e.g. \(K = 101\), it is prohibitive to use stochastic collocation on a tensor product grid (we would need \(3^{101} \approx 1.5\times 10^{48}\) collocation points in total with only 3 collocation points in each dimension), so we instead use an anisotropic sparse grid to reduce the computational cost. The correlation length is \(L = 1/128\), which yields an anisotropic problem whose eigenvalues decay very slowly (\(\lambda _1 = 0.0138\), \(\lambda _{50}= 0.0095\)). The constant in (7.2) is chosen as \(c = 20\) to guarantee that the stochastic problem is well posed with a coercive elliptic operator. For the reduced basis approximation, we use 1,000 samples randomly selected in \(\Gamma = [-\sqrt{3},\sqrt{3}]^K\), \(K = 9, 21, 51, 101\), owing to the rather smooth dependence of the solution on the parameters, and for the stochastic collocation approximation, we adaptively construct an anisotropic sparse grid with \(10^1, 10^2, 10^3, 10^4, 10^5, 10^6\) collocation nodes in a hierarchical way governed by the hierarchical surpluses [24]. To evaluate the error \(||\,||u||_X-||u_{approx}||_X\,||_{L^{\infty }(\Gamma )}\), we randomly select 100 samples in \(\Gamma \). For the computation of the expectation as well as the error \(|\mathbb E [||u||_X] - \mathbb E [||u_{approx}||_X]|\), we apply the reduced basis–collocation approximation with \(10^5\) collocation nodes constructed from the anisotropic grid. The error \(|\mathbb E [||u||_X] - \mathbb E [||u_{approx}||_X]|\) is evaluated a posteriori by taking the best stochastic collocation approximation as the reference (true) value.

The results for the high dimensional stochastic problems are displayed in Fig. 7, from which we observe an exponential decay rate of both the \(L^{\infty }(\Gamma )\) error and the statistical error for the reduced basis approximation, which is much faster than that of the stochastic collocation approximation. As the dimension increases from 9 to 101, the convergence rate deteriorates quickly for both the reduced basis approximation and the stochastic collocation approximation. As for the computational costs of the reduced basis method, the offline construction with mesh size \(h = 1/8\) takes \(86\,(K = 9), 424\,(K=21), 2479\,(K=51), 8986\,(K=101)\) CPU seconds, respectively, growing as \(t_{RB} \propto O(K^2)\), which verifies the cost formula in Table 1 obtained from Algorithm 2. In contrast, the stochastic collocation method would take \(t_{SC} \propto O(K^w)\), where \(w = q - K = 0, 1, 2, \ldots \) is the interpolation level in the isotropic Smolyak formula (3.11), which prevents the use of large \(w\) for high dimensional problems. We remark that although our numerical results are very promising for the reduced basis approximation, the sizes of the training set \(\#\Xi _{train} = 1000\) and the testing set \(\#\Xi _{test} = 100\) are rather small for the high dimensional problems, which may render the approximation insufficiently accurate. In order to increase the accuracy of the approximation, we may construct the training set adaptively by replacing it with a new set once the reduced basis approximation is accurate enough on the current one, see [46]. We also remark that the work of the offline construction is linear with respect to the cardinality of the training set, \(t_{RB} \propto N_t\), as seen in Table 1.
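The quadratic growth in \(K\) claimed above can be checked by a log–log fit of the reported offline timings; a minimal sketch:

```python
import numpy as np

# Offline CPU times (seconds) for the reduced basis construction at h = 1/8,
# as reported above for K = 9, 21, 51, 101.
K = np.array([9.0, 21.0, 51.0, 101.0])
t_rb = np.array([86.0, 424.0, 2479.0, 8986.0])

# Fit t_rb ~ C * K^p: the slope of log(t) vs. log(K) estimates the exponent p.
p, logC = np.polyfit(np.log(K), np.log(t_rb), 1)
print(f"fitted exponent p = {p:.2f}")   # close to 2, consistent with t_RB ~ O(K^2)
```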

Fig. 7 Comparison of the convergence rate of the error \(||\,||u||_X-||u_{approx}||_X\,||_{L^{\infty }(\Gamma )}\) (left) and of the expectation error \(|\mathbb E [||u||_X] - \mathbb E [||u_{approx}||_X]|\) (right) between the anisotropic sparse grid stochastic collocation (SC) approximation and the reduced basis (RB) approximation in high dimensions 9D, 21D, 51D and 101D

8 Concluding Remarks

In this work, we carried out a detailed comparison between the reduced basis method and the stochastic collocation method for linear stochastic elliptic problems, in terms of convergence analysis and computational costs. The reduced basis method adopts a Galerkin projection onto the reduced basis space constructed by a greedy algorithm governed by an a posteriori error estimate. It takes advantage of the affine structure of the stochastic problem to decompose the computation into an offline procedure and an online procedure. The stochastic collocation method, on the other hand, is essentially a Lagrangian interpolation on the collocation nodes, which are taken as quadrature abscissae in order to achieve high order interpolation as well as integration for statistical computation.

As for the convergence analysis, the reduced basis method achieves an exponential convergence rate for analytic/smooth problems regardless of the dimension in our test cases. The stochastic collocation method also attains exponential convergence in the low dimensional case, though with a slower rate than that featured by the reduced basis method; in contrast, in the multivariate case, and especially for high dimensional problems, it only achieves an algebraic convergence rate. The stochastic collocation method costs less effort than the reduced basis method for small scale and low dimensional problems, while its cost grows much faster for large scale and high dimensional problems, resulting in a much higher computational effort than that of the reduced basis method. Note that the comparison depends essentially on the regularity of the stochastic solution, the dimension of the parameter space and the complexity of solving the underlying deterministic system, so we presume that similar comparison results hold beyond the case of the linear stochastic elliptic problems considered here.

We succeeded in applying the reduced basis method and the anisotropic sparse grid stochastic collocation method to high dimensional problems with dimension up to the order of \(100\). Nevertheless, these experiments are admittedly limited, since the number of samples and collocation nodes is rather small. More advanced techniques for both methods, such as sensitivity analysis and adaptive construction [21, 22], are being actively developed by the research community, specifically to deal with high dimensional stochastic systems.

Moreover, the comparison has been carried out only for problems whose solutions are smooth with respect to the parameters. For non-smooth or low regularity stochastic problems, we expect that the reduced basis method, by solving a reduced problem (4.2) with the same mathematical structure as the original problem (2.5), can avoid the Gibbs phenomenon encountered by the stochastic collocation method based on a dictionary basis (here, Lagrange basis functions), thus gaining a further benefit in convergence; see examples in [11]. More research on both theoretical and computational aspects is still needed for the reduced basis method and the stochastic collocation method, as well as their efficient combination, when solving low regularity and high dimensional stochastic problems with random variables featuring more general probability distributions [12, 13].