
1 Introduction

Simulating real-world processes through computer experiments [17] yields many benefits: lower costs compared to real experiments, many executions in parallel, and no risk to humans or the environment, to name just a few. However, computer experiments can never simulate the real world comprehensively and must always be a compromise between precision and complexity.

The field of uncertainty quantification deals with this inevitably limited knowledge of the real world and allows for more realistic assessments of computer experiment results. This is done by introducing uncertainty to the input parameters and observing how it propagates through the model and influences the results [29]. To increase the accuracy of the predictions for the underlying process, more uncertain input parameters can be added, so that the computer experiment takes more aspects of the real world into account. However, the run-times and the necessary computational resources increase with the complexity of the model.

This problem can be dealt with by creating a surrogate that is a sufficiently accurate approximation of the original model, but much faster to evaluate. For several years, B-spline basis functions [13] have been used for the creation of surrogate models. However, the number of grid points of classical uniform isotropic tensor product grids increases exponentially with the number of input parameters. This is known as the curse of dimensionality [2]. Sparse Grids [3, 31] are an established technique to mitigate this curse, in particular when created spatially adaptively [19]. Sparse Grids have successfully been applied in combination with B-splines for interpolation, optimization, regression and uncertainty quantification [15, 19, 21, 28]. When the dimensionality of the parameter space is increased further, the boundary points of sparse grids again introduce exponential growth rates and thus must be omitted. The B-spline basis must compensate for this to prevent a dramatic loss in approximation quality.

So far, only a heuristic boundary treatment has been used [19, 28]: the left-most and right-most splines were modified to enforce vanishing second derivatives at the boundary of the parameter domain. However, this is disadvantageous whenever the objective function does not meet this requirement. In particular, modified not-a-knot B-splines do not preserve the ability of the original not-a-knot B-spline basis, including the boundary, to represent polynomials exactly, and therefore lack one of the most important spline properties.

Recently we introduced hierarchical extended not-a-knot B-splines for use on spatially adaptive sparse grids [20], based on the extension concept [14, 18]. This extended basis follows the premise of preserving the polynomial representation property. In this work, we apply the new basis for the first time to a subsurface flow benchmark from the field of uncertainty quantification [12]. With this benchmark we demonstrate that the new basis not only represents polynomials exactly, but also improves the approximation of general objective functions and quantities of interest. We compare our results with a simple Monte Carlo approach and with the widely used polynomial chaos expansion [9, 30].

2 Sparse Grids

Full uniform isotropic tensor product grids are one of the most widely used discretization approaches. However, their number of grid points increases like \(\mathcal {O}(h^{-D})\), where h is the grid width and D is the dimensionality of the underlying space. This exponential growth prevents calculations already for moderately high-dimensional applications.

Sparse Grids are a discretization scheme designed to mitigate this curse and enable higher-dimensional approximations. The number of grid points of nonboundary regular sparse grids of level l with grid width h l only increases like \(\mathcal {O}(h_l^{-1}(\log _2 h_l^{-1})^{D-1})\). At the same time, the L 2-interpolation error of interpolations with B-splines of degree n on regular sparse grids of level l still decays asymptotically like \(\mathcal {O}(h_l^{n+1} (\log _2 h_l^{-1})^{D-1})\) [26], if the objective function is sufficiently smooth. This is only slightly worse than the full grid error convergence rate of \(\mathcal {O}(h_l^{n+1})\).

In contrast to the widely used combination technique, also known as the Smolyak scheme [27], we use spatially adaptive sparse grids [19]. These can automatically be customized for the quantity of interest, resolving locally finer in more important regions and coarser in less important ones. By doing so, the number of grid points is potentially reduced even further. This is important because every grid point means an expensive evaluation of the original model.

The definition of sparse grids is based on arbitrary hierarchical basis functions φ l,i of level l and index i. We now introduce sparse grids in this general form, but later will only use hierarchical spline functions as bases.

2.1 Regular Sparse Grids

Without loss of generality, throughout this work, we restrict ourselves to parameters in the unit hypercube [0, 1]D. Let I l be the hierarchical index set of level \(l\in \mathbb {N}_0\),

$$\displaystyle \begin{aligned} I_l := \begin{cases} \lbrace 0,1\rbrace, &l=0,\\ \lbrace 0<i<2^l \mid i \text{ odd}\rbrace, &\text{else}. \end{cases} \end{aligned} $$
(1)

Given univariate hierarchical basis functions φ l,i of level \(l \in \mathbb {N}_0\) and index \(i\in \mathbb {N}_0\), we define multivariate basis functions φ l,i via tensor products,

$$\displaystyle \begin{aligned} \mathbf{\varphi}_{\mathbf{l},\mathbf{i}} = \prod_{d=1}^D \varphi_{l_d,i_d}, \mathbf{l}\in \mathbb{N}_0^D, \mathbf{i} \in I_{\mathbf{l}} := I_{l_1} \times \dots \times I_{l_D}, \end{aligned} $$
(2)

where l and i are multi-indices. Let now \(\mathcal {H}_{\mathbf {l}} := \lbrace {\mathbf {x}}_{\mathbf {l},\mathbf {i}} = (x_{l_1,i_1},\dots ,x_{l_D,i_D}) \mid \mathbf {i}\in I_{\mathbf {l}} \rbrace \) for \(x_{l_d,i_d} := i_d h_{l_d}\) be the anisotropic grid of level l with grid widths \(h_{l_d} := 2^{-l_d}\). We define the hierarchical subspaces W l of level l through the basis functions corresponding to \(\mathcal {H}_{\mathbf {l}}\),

$$\displaystyle \begin{aligned} W_{\mathbf{l}}:=\text{span}\lbrace \mathbf{\varphi}_{\mathbf{l},\mathbf{i}} \mid \mathbf{i} \in I_{\mathbf{l}} \rbrace. {} \end{aligned} $$
(3)

Regular boundary sparse grids \(V^b_l\) of level \(l\in \mathbb {N}_0\) in D dimensions are defined as the direct sum of these hierarchical subspaces,

$$\displaystyle \begin{aligned} V^b_l := \bigoplus_{\vert\mathbf{l}'\vert_1\leq l}W_{\mathbf{l}'}, \end{aligned} $$
(4)

where \(\vert \mathbf {l}'\vert _1 := \sum _{d=1}^D l_d^{\prime }\) is the discrete \(\ell ^1\) norm of \(\mathbf {l}'\). Unfortunately, the number of boundary points of a boundary sparse grid grows like \(\mathcal {O}(2^D)\). This growth is exponential, still preventing discretization for higher-dimensional applications. Therefore the boundary points must be omitted. The D-dimensional nonboundary sparse grid \(V^s_l\) of level \(l\in \mathbb {N}\) is defined as

$$\displaystyle \begin{aligned} V^s_l := \bigoplus_{\vert\mathbf{l}'\vert_1\leq l,~ l_d^{\prime}\geq 1 \forall d \in \lbrace1,\dots,D\rbrace} W_{\mathbf{l}'}. \end{aligned} $$
(5)
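As a concrete illustration of eqs. (1)–(5), the following Python sketch (a minimal toy, not the SG++ implementation; all function names are ours) enumerates the points of a regular sparse grid by collecting the grids of all admissible hierarchical subspaces:

```python
from itertools import product

def hierarchical_indices(level):
    """Hierarchical index set I_l of eq. (1): {0, 1} on level 0, odd indices else."""
    return [0, 1] if level == 0 else list(range(1, 2 ** level, 2))

def regular_sparse_grid(dim, level, boundary=True):
    """Points of the regular sparse grid of eq. (4) (boundary=True) or eq. (5).

    Collects the grid points x_{l,i} = i * 2^-l of every hierarchical
    subspace W_l' with |l'|_1 <= level; the nonboundary grid of eq. (5)
    additionally requires l'_d >= 1 in every dimension.
    """
    low = 0 if boundary else 1
    points = set()
    for l in product(range(low, level + 1), repeat=dim):
        if sum(l) > level:
            continue
        for i in product(*(hierarchical_indices(ld) for ld in l)):
            points.add(tuple(id_ * 2.0 ** -ld for ld, id_ in zip(l, i)))
    return sorted(points)
```

For D = 1 and level 3, this yields the expected 2^3 + 1 points with boundary and 2^3 − 1 points without.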

Figure 1 shows an illustration of the hierarchical subspace scheme, the corresponding regular boundary sparse grid and the corresponding regular nonboundary sparse grid.

Fig. 1

(a) Hierarchical subspace scheme of level l = 3, (b) corresponding regular boundary sparse grid \(V^b_3\) and (c) corresponding regular nonboundary sparse grid \(V^s_3\)

2.2 Spatial Adaptivity

Regular sparse grids uniformly discretize the objective domain, spending too few grid points in regions of interest and too many grid points in regions of little significance. Spatially adaptive sparse grids [19] can automatically be adapted to the objective function. Given an initial sparse grid approximation, each basis function’s benefit to the quantity of interest is estimated. Depending on this estimate, the grid points corresponding to the most significant basis functions are refined. This approach is more selective than classical dimensional adaptivity [11] and therefore allows the employment of even fewer grid points.

Let x l,i be a sparse grid point. We define its hierarchical children C(l, i) as all grid points \({\mathbf {x}}_{\mathbf {l}',\mathbf {i}'}\), for which there exists r ∈{1, …, D}, s.t.

$$\displaystyle \begin{aligned} l_d &= l_d^{\prime}, i_d = i_d^{\prime} \quad \forall d \in \lbrace 1,\dots , D\rbrace \setminus \lbrace r\rbrace,\\ l_r^{\prime} &= l_r + 1,\\ i_r^{\prime} &\in\lbrace 2 i_r - 1, 2 i_r + 1\rbrace. \end{aligned} $$
(6)

Let now \(\mathcal {G}\) be a spatially refined grid,

$$\displaystyle \begin{aligned} \mathcal{G} : =\lbrace {\mathbf{x}}_{\mathbf{l},\mathbf{i}} \mid (\mathbf{l},\mathbf{i})\in L\rbrace, \end{aligned} $$
(7)

where \(L\subset \lbrace (\mathbf {l},\mathbf {i})\mid \mathbf {l} \in \mathbb {N}_0^D, \mathbf {i} \in I_{\mathbf {l}}\rbrace \) is some finite level-index set. Note that this includes regular sparse grids as a special case. The set of all level-index pairs of refinable grid points, \(L^{\text{ref}} \subseteq L\), is defined as

$$\displaystyle \begin{aligned} L^{\text{ref}} := \lbrace (\mathbf{l}, \mathbf{i})\in L \mid C(\mathbf{l},\mathbf{i})\not\subset \mathcal{G} \rbrace . \end{aligned} $$
(8)

The sparse grid \(\mathcal {G}\) can now be refined by iterating the following two steps until a given threshold for the total number of grid points is exceeded. First, identify the level-index pair \(({\mathbf {l}}^*,{\mathbf {i}}^*) \in L^{\text{ref}}\) of the grid point \({\mathbf {x}}_{{\mathbf {l}}^*,{\mathbf {i}}^*}\) and corresponding basis function \(\mathbf {\varphi }_{{\mathbf {l}}^*,{\mathbf {i}}^*}\) with the most influence on the quantity of interest. Second, add all its hierarchical children C(l , i ) to the grid.

Many criteria for the identification of \(({\mathbf {l}}^*,{\mathbf {i}}^*)\) exist. In this work we apply the standard surplus criterion [19]. It is based on the hierarchy of the basis: larger absolute interpolation coefficients (surpluses) |α l,i| imply a worse local approximation. Consequently, we use

$$\displaystyle \begin{aligned} ({\mathbf{l}}^*,{\mathbf{i}}^*) := \text{argmax}_{(\mathbf{l},\mathbf{i})\in L^{\text{ref}}} \vert \alpha_{\mathbf{l},\mathbf{i}}\vert. \end{aligned} $$
(9)
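The refinement loop described above can be sketched as follows; the plain set/dict data structures and helper names are illustrative choices of ours, not the SG++ interface:

```python
def hierarchical_children(l, i):
    """All children C(l, i) of eq. (6): refine one dimension r at a time."""
    children = []
    for r in range(len(l)):
        for offset in (-1, +1):
            lc, ic = list(l), list(i)
            lc[r] = l[r] + 1
            ic[r] = 2 * i[r] + offset
            children.append((tuple(lc), tuple(ic)))
    return children

def refine_most_significant(grid, surpluses):
    """One step of surplus-based refinement, eq. (9).

    grid: set of (l, i) level-index pairs; surpluses: dict (l, i) -> alpha.
    Picks the refinable pair with the largest |alpha| and adds its children.
    """
    refinable = [p for p in grid
                 if not all(c in grid for c in hierarchical_children(*p))]
    target = max(refinable, key=lambda p: abs(surpluses[p]))
    grid.update(hierarchical_children(*target))
    return target
```

In a full implementation, each newly added child requires one evaluation of the original model before the surpluses can be updated.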

3 Basis Functions

Sparse grids are widely used in combination with the popular linear hat functions, i.e. B-splines of degree one. But if the objective function admits a certain smoothness, an approximation should preserve it, as it would otherwise lose valuable information. Therefore, in recent years B-splines have been used increasingly often on (spatially adaptive) sparse grids [15, 19, 28]. Their local support and arbitrarily choosable degree result in their well-known approximation quality, while the underlying sparse grid keeps the number of necessary function evaluations small.

Before we define the new extended not-a-knot B-spline basis, we must introduce the underlying classical not-a-knot B-splines. Furthermore, we define modified not-a-knot B-splines to motivate the new basis. As is common, throughout this paper we only define and use splines of odd degree.

3.1 B-Splines

Let ξ := (ξ 0, …, ξ q+n) be a knot-sequence, i.e. a non-decreasing sequence of real numbers ξ k for k ∈{0, …, q + n} and some \(q\in \mathbb {N}_0\). The B-spline \(b^n_{k,\xi }\) of index k and degree n is defined by the Cox-de-Boor recursion [4, 6],

$$\displaystyle \begin{aligned} b^n_{k,\xi} (x) = \begin{cases} \dfrac{x-\xi_k}{\xi_{k+n} - \xi_k} b^{n-1}_{k,\xi}(x) + \dfrac{\xi_{k+n+1} -x}{\xi_{k+n+1} - \xi_{k+1}} b^{n-1}_{k+1,\xi}(x) & n\geq 1, \\ \chi_{[\xi_k,\xi_{k+1}]}(x) & n=0, \end{cases} \end{aligned} $$
(10)

where \(\chi _{[\xi _k,\xi _{k+1}]} (x)\) evaluates to one in the interval [ξ k, ξ k+1] and zero elsewhere.
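The Cox-de-Boor recursion of eq. (10) translates directly into code. The sketch below uses a half-open knot interval in the degree-0 case (instead of the closed interval of eq. (10)) so that evaluation exactly on an interior knot is not counted twice:

```python
def bspline(k, n, xi, x):
    """Evaluate the B-spline b^n_{k,xi}(x) via the Cox-de-Boor recursion.

    xi is the non-decreasing knot sequence; terms with a zero-length knot
    span are skipped to avoid division by zero for repeated knots.
    """
    if n == 0:
        # half-open interval avoids double counting at interior knots
        return 1.0 if xi[k] <= x < xi[k + 1] else 0.0
    left = 0.0
    if xi[k + n] > xi[k]:
        left = (x - xi[k]) / (xi[k + n] - xi[k]) * bspline(k, n - 1, xi, x)
    right = 0.0
    if xi[k + n + 1] > xi[k + 1]:
        right = ((xi[k + n + 1] - x) / (xi[k + n + 1] - xi[k + 1])
                 * bspline(k + 1, n - 1, xi, x))
    return left + right
```

On a uniform knot sequence, the resulting splines form a partition of unity on the inner knot intervals, which is a convenient sanity check.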

Originally, Schoenberg introduced B-splines with an infinite and uniform knot sequence \(\xi ^\infty _h = (\dots , \xi ^\infty _{h,-1}, \xi ^\infty _{h,0}, \xi ^\infty _{h,1}, \dots )\), where \(\xi ^\infty _{h,k} = kh\) for grid width \(h\in \mathbb {R}\) and index \(k\in \mathbb {Z}\) [23]. The corresponding B-splines \(b^n_{k,{\xi ^\infty _h}}\) form a basis of \(S^n_{\xi ^\infty _h}\), the spline space of n times continuously differentiable piecewise polynomials on the knot intervals.

When using a finite knot sequence, this desirable basis property no longer holds, because the Schoenberg-Whitney conditions [13, 24] are violated at the left-most and right-most knot intervals. A common approach to restore these conditions is the use of not-a-knot B-splines [5, 28].

3.2 Not-a-Knot B-Splines

Not-a-knot B-splines are motivated by requiring continuity of the n-th derivatives at the \(\frac {n-1}{2}\) left-most and \(\frac {n-1}{2}\) right-most inner knots. This requirement is equivalent to excluding the corresponding n − 1 knots from the B-spline-defining knot sequence ξ while keeping them in the set of interpolation nodes.

Without loss of generality, we restrict ourselves to uniform B-splines of level \(l\in \mathbb {N}_0\) on the unit interval [0, 1], using the uniform knot sequence \(\xi ^{n,\mathrm {u}}_l := (\xi ^{n,\mathrm {u}}_{l,0}, \dots , \xi ^{n,\mathrm {u}}_{l,2^l+2 n})\), where \(\xi ^{n,\mathrm {u}}_{l,k} := (k-n)h_l\) with grid width \(h_l := 2^{-l}\). Consequently, we derive \(\xi ^{n,\mathrm {nak}}_l: = (\xi ^{n,\mathrm {nak}}_{l,0},\dots ,\xi ^{n,\mathrm {nak}}_{l,2^l+n+1})\), the uniform not-a-knot sequence of level l and degree n, as

$$\displaystyle \begin{aligned} \xi^{n,\mathrm{nak}}_{l,k} := \begin{cases} \xi^{n,\mathrm{u}}_{l,k}, & k=0,\dots,n, \\ \xi^{n,\mathrm{u}}_{l,k+(n-1)/2}, & k=n+1,\dots , 2^l, \\ \xi^{n,\mathrm{u}}_{l,k+n-1}, & k=2^l+1,\dots,2^l+n+1. \end{cases} \end{aligned} $$
(11)

The definition of \(\xi ^{n,\mathrm {nak}}_{l,k}\) is only applicable if l ≥⌈log2(n + 1)⌉. Otherwise we cannot exclude n − 1 knots from the sequence. Therefore, if l < ⌈log2(n + 1)⌉, we use \(\xi ^{n,\mathrm {nak}}_{l,k} := \xi ^{n,\mathrm {u}}_{l,k}\) and Lagrange polynomials

$$\displaystyle \begin{aligned} L_{l,k}(x) := \prod_{\substack{0\leq m \leq 2^l, \\ m\neq k}} \frac{x-\xi^{n,\mathrm{u}}_{l,m}}{\xi^{n,\mathrm{u}}_{l,k}-\xi^{n,\mathrm{u}}_{l,m}}, \quad k=0,\dots,2^l \end{aligned} $$
(12)

as basis functions. This ensures a basis for the polynomial space on the first levels.
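The case distinction of eq. (11) is short to implement; `nak_knots` is a hypothetical helper name of ours:

```python
def nak_knots(n, l):
    """Uniform not-a-knot sequence of eq. (11); requires l >= ceil(log2(n+1)).

    Built from the uniform knots xi^{n,u}_{l,k} = (k - n) * 2^-l by skipping
    the n - 1 knots closest to (but not on) the boundary.
    """
    h = 2.0 ** -l
    uniform = lambda k: (k - n) * h          # xi^{n,u}_{l,k}
    knots = []
    for k in range(2 ** l + n + 2):          # indices 0, ..., 2^l + n + 1
        if k <= n:
            knots.append(uniform(k))
        elif k <= 2 ** l:
            knots.append(uniform(k + (n - 1) // 2))
        else:
            knots.append(uniform(k + n - 1))
    return knots
```

For n = 3 and l = 3, the sequence skips exactly the knots 1/8 and 7/8, i.e. the points x l,1 and \(x_{l,2^l-1}\) marked with crosses in Fig. 3.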

Finally, the not-a-knot B-spline basis \(b^{n,\mathrm {nak}}_{l,k}\) of degree n, level l and index k is given by

$$\displaystyle \begin{aligned} b^{n,\mathrm{nak}}_{l,k} (x) := \begin{cases} b^n_{k,\xi^{n,\mathrm{nak}}_{l}}(x) & l \geq \lceil \log_2(n+1)\rceil, \\ L_{l,k}(x) & l <\lceil \log_2(n+1)\rceil. \end{cases} \end{aligned} $$
(13)

The knot sequence \(\xi ^{n,\mathrm {nak}}_l\) still includes the boundary points 0 and 1 of the domain. Because the number of boundary points of higher-dimensional sparse grids dominates the total number of grid points, the boundary points must be omitted. However, simply excluding them, and thus the corresponding B-spline basis functions, impairs the approximation quality near the boundary. Therefore an appropriate boundary treatment is necessary.

3.3 Modified Not-a-Knot B-Splines

So far, modified not-a-knot B-splines [28] have been used to compensate for the missing boundary points. Motivated by an application with natural boundary conditions, they were defined to enforce vanishing second derivatives at the domain's boundaries. The resulting basis functions extrapolate towards the boundaries, as can be seen in Fig. 3. Consequently, the modified not-a-knot B-spline \(b^{n,\mathrm{mod}}_{l,k}\) of degree n, level l and index k is defined as

$$\displaystyle \begin{aligned} b^{n, \mathrm{mod}}_{l,k}(x) := \begin{cases} 1 & l=1, k=1, \\ b^{n,\mathrm{nak}}_{l,k}(x) + b^{n,\mathrm{nak}}_{l,k-1}(x) & l\geq 2, k=1, n=1, \\ b^{n,\mathrm{nak}}_{l,k}(x) - \frac{\frac{d^2}{dx^2} b^{n,\mathrm{nak}}_{l,k}(0)}{\frac{d^2}{dx^2} b^{n,\mathrm{nak}}_{l,k-1}(0)} b^{n,\mathrm{nak}}_{l,k-1}(x) & l \geq 2, k\in \lbrace 1,\dots,\frac{n+1}{2} \rbrace, n>1, \\ b^{n,\mathrm{mod}}_{l,2^l-k}(1-x) & l \geq 2, k \in \lbrace 2^l-\frac{n+1}{2},\dots, 2^l-1 \rbrace, \\ b^{n,\mathrm{nak}}_{l,k}(x) & \mathrm{otherwise}. \end{cases} \end{aligned} $$
(14)

Note that for linear splines of degree n = 1 the second derivatives always vanish. Therefore, the modification is instead defined as the linear continuation of the left-most and right-most inner splines.

Some applications feature vanishing second derivatives at the boundary and thus are accurately representable by modified not-a-knot B-splines. However, this condition does not hold in general, and modified not-a-knot B-splines are not capable of representing arbitrary functions. In particular, the standard monomial basis {x m∣0 ≤ m ≤ n} of the polynomial space \(\mathbb {P}^n\) has non-vanishing second derivatives for n ≥ 2. The modified not-a-knot B-spline basis is thus not even capable of exactly representing polynomials, which is one of the most important properties of spline bases.

3.4 Extended Not-a-Knot B-Splines

The extension of B-splines was originally introduced in the context of WEB-splines [14] and later generalized for hierarchical subdivision schemes [18]. Recently, we introduced hierarchical extended not-a-knot B-splines for use on sparse grids [20].

The idea of the extension is to add the omitted splines \(b_j\), \(j \in J_l := \lbrace 0, 2^l\rbrace \), to the remaining splines in such a way that their contribution to the capability of representing polynomials is preserved. In a first step, we interpolate a set of polynomials \(\lbrace P_m \mid m \in M := \lbrace 0, \dots , n+1\rbrace \rbrace \) spanning the polynomial space \(\mathbb {P}^{n}\) with the regular not-a-knot B-spline basis, including the boundary splines. Let l ≥⌈log2(n + 2)⌉; then the polynomials are represented exactly by definition of the not-a-knot B-splines. This results in interpolation coefficients α m,k, such that

$$\displaystyle \begin{aligned} P_m = \sum_{k=0}^{2^l} \alpha_{m,k} b^{n,nak}_{l,k} \quad \forall m \in M. {} \end{aligned} $$
(15)

In practice we use the monomials P m = x m, but the theory is independent of this particular choice.

In the next step, we identify for each index \(j \in J_l\) the n + 1 closest inner indices \(I_l(j)\). The coefficients \(\alpha _j\), \(j \in J_l\), are then represented as linear combinations of the coefficients \(\alpha _i\), \(i \in I_l(j)\), i.e.

$$\displaystyle \begin{aligned} \alpha_j = \sum_{i \in I_l(j)} e_{i,j} \alpha_i, {} \end{aligned} $$
(16)

where \(e_{i,j} \in \mathbb {R}\) are the extension coefficients. See Fig. 2 for an illustration.
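A minimal sketch of computing extension coefficients for degree n = 1, where the hat basis is nodal, so the interpolation coefficients of eq. (15) are simply α m,k = P m(x l,k); for this degree-1 illustration we use only the monomials up to degree n, which determine the n + 1 unknowns uniquely:

```python
import numpy as np

# Extension coefficients e_{i,0} for degree n = 1 on level l = 2 (points k/4).
# For hat functions the basis is nodal, so the coefficients of eq. (15) are
# alpha_{m,k} = P_m(x_{l,k}) with P_m(x) = x^m.
n, l = 1, 2
x = np.array([k / 2 ** l for k in range(2 ** l + 1)])
inner = [1, 2]                 # I_l(0): the n + 1 inner indices closest to j = 0
A = np.array([[x[i] ** m for i in inner] for m in range(n + 1)])
rhs = np.array([x[0] ** m for m in range(n + 1)])    # alpha_{m,0}
e = np.linalg.solve(A, rhs)    # extension coefficients e_{i,0} of eq. (16)
print(e)
```

The result reproduces the linear extrapolation weights one expects for hat functions: the boundary spline is distributed with weights 2 and −1 onto the two nearest inner splines.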

Fig. 2

Schematic visualization of the extension of not-a-knot B-splines of degree n = 3 on a one-dimensional regular grid of level l = 4. The boundary splines with indices J l = {0, 16} are added to the n + 1 next inner splines I l(0) = {1, 2, 3, 4} and I l(16) = {12, 13, 14, 15}, indicated with arrows

Fig. 3

(a) Hierarchical not-a-knot B-splines with boundary basis functions, (b) hierarchical modified not-a-knot B-splines and (c) hierarchical extended not-a-knot B-splines of degree 3 and levels 0, 1, 2, 3 and 4, respectively. The not-a-knot change in the knot sequence is illustrated with crosses at x l,1 and \(x_{l,2^l-1}\)

Let \(J_l(i) := \lbrace j \in J_l \mid i \in I_l(j)\rbrace \) be the dual of \(I_l(j)\) and let \(P\in \mathbb {P}^n\) be an arbitrary polynomial. Following eq. (15), it holds that

$$\displaystyle \begin{aligned} P = \sum_{m\in M} p_m P_m = \sum_{m\in M} \sum_{i\in I_l} p_m \alpha_{m,i} b^{n,\mathrm{nak}}_{l,i} + \sum_{m\in M} \sum_{j\in J_l} p_m \alpha_{m,j}b^{n,\mathrm{nak}}_{l,j} \end{aligned} $$
(17)

for uniquely defined coefficients \(p_m,\alpha _{m,i},\alpha _{m,j}\in \mathbb {R}\). Exploiting the finiteness of the sets M, I l and J l, we interchange the sums,

$$\displaystyle \begin{aligned} P = \sum_{i\in I_l} \left(\sum_{m\in M} p_m \alpha_{m,i}\right) b^{n,\mathrm{nak}}_{l,i}+ \sum_{j\in J_l} \left(\sum_{m\in M} p_m \alpha_{m,j}\right) b^{n,\mathrm{nak}}_{l,j} . \end{aligned} $$
(18)

Because J l(i) is the dual of I l(j), and by the definition of the extension coefficients e i,j in eq. (16), it holds

$$\displaystyle \begin{aligned} P & = \sum_{i \in I_l} \underbrace{\left( \sum_{m\in M}p_m\alpha_{m,i} \right)}_{=: \beta_i} \underbrace{\left(b^{n,\mathrm{nak}}_{l,i} + \sum_{j \in J_l(i)} e_{i,j} b^{n,\mathrm{nak}}_{l,j} \right)}_{=: b^{n,e}_{l,i}}{} \end{aligned} $$
(19)
$$\displaystyle \begin{aligned} & = \sum_{i\in I_l} \beta_i b^{n,e}_{l,i}. \end{aligned} $$
(20)

Consequently, the extended not-a-knot B-spline \(b^{n,e}_{l,i}\) of degree n, level l and index i is defined, following eq. (19), as

$$\displaystyle \begin{aligned} b^{n,e}_{l,i} := \begin{cases} b^{n,\mathrm{nak}}_{l,i} + \sum_{j\in J_l(i)}e_{i,j}b^{n,\mathrm{nak}}_{l,j} & l \geq \lceil \log_2(n+2)\rceil, \\ L_{l,i}(x) & l <\lceil \log_2(n+2)\rceil, \end{cases} \end{aligned} $$
(21)

where Lagrange polynomials are again employed on the lower levels, on which there are not enough inner knots for the extension, to ensure the polynomial basis property.

For use on sparse grids, all presented B-spline basis functions are applied in the hierarchical manner introduced in Eq. (3). See Fig. 3 for an illustration. Recently we showed that the hierarchical extended not-a-knot B-spline basis fulfills the desired polynomial representation property [20].
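The hierarchical interpolation itself, i.e. computing the surpluses α l,i, can be sketched for the simplest case of hierarchical hat functions (degree n = 1) on a one-dimensional nonboundary grid; the helper names are ours:

```python
def hat(l, i, x):
    """Hierarchical hat function (B-spline of degree 1) of level l, index i."""
    return max(0.0, 1.0 - abs(2.0 ** l * x - i))

def hierarchize(f, level):
    """Hierarchical surpluses alpha_{l,i} of f on a 1D nonboundary grid.

    Proceeds level by level: each surplus is the residual of f at the grid
    point after subtracting the contributions of all coarser basis functions.
    """
    alpha = {}
    for l in range(1, level + 1):
        for i in range(1, 2 ** l, 2):
            x = i * 2.0 ** -l
            coarse = sum(a * hat(lc, ic, x) for (lc, ic), a in alpha.items())
            alpha[(l, i)] = f(x) - coarse
    return alpha
```

For the parabola f(x) = 4x(1 − x), the surpluses decay like 4^(1−l) with the level, which reflects the hierarchy of the basis exploited by the surplus refinement criterion of Sect. 2.2.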

4 Expansion Methods

The field of uncertainty quantification generalizes the concept of numerical modeling by introducing nondeterminism, thereby allowing more accurate simulations of the real world. Instead of fixed real values, the input parameters are random variables obeying probability density functions. The uncertainty of the input parameters is then propagated through the model, resulting in uncertain outputs. In order to estimate likely outcomes of the model, stochastic quantities such as the mean and the standard deviation can be calculated. Two of the most widely used techniques to calculate these quantities are stochastic collocation and polynomial chaos expansion (PCE).

Formally, let \((\varOmega , \mathcal {F}, P)\) be a complete probability space with \(\varOmega \subset \mathbb {R}^D\) the D-dimensional sample space of all possible outcomes, \(\mathcal {F}\) the σ-algebra of events and \(P:\mathcal {F}\rightarrow [0,1]\) the probability measure. Without loss of generality we assume Ω ⊆ [0, 1]D. Let X := (X 1, …, X D) ∈ Ω be a random vector consisting of D random variables. We assume that these random variables are statistically independent with probability density functions ϱ 1, …, ϱ D, so that the random vector is distributed according to the product density \(\mathbf {\varrho }:= \prod _{d=1}^D \varrho _d\).

4.1 Stochastic Collocation

Stochastic collocation is based on replacing the original objective function f by a surrogate \(\tilde {f}\) and performing the stochastic analysis on the surrogate. We create the surrogate as a linear combination of B-splines b l,i on an adaptively created sparse grid \(\mathcal {G}\) with level-index set L,

$$\displaystyle \begin{aligned} f \approx \tilde{f} := \sum_{(\mathbf{l},\mathbf{i}) \in L} \mathbf{\alpha}_{\mathbf{l},\mathbf{i}} {\mathbf{b}}_{\mathbf{l},\mathbf{i}}, \end{aligned} $$
(22)

where the coefficients α l,i are computed via interpolation at the sparse grid points. From this we approximate the mean \(\mathbb {E}(f)\) and variance \(\mathbb {V}(f)\) of the objective function using Gauss-Legendre quadrature,

$$\displaystyle \begin{aligned} \mathbb{E}(f) \approx \mathbb{E}(\tilde{f}) &= \int_{[0,1]^D} \tilde{f}(\mathbf{X}) \mathbf{\varrho}(\mathbf{X}) d\mathbf{X} \end{aligned} $$
(23)
$$\displaystyle \begin{aligned} &\approx \sum_{k} \tilde{f}(x_k)\mathbf{\varrho}(x_k)\omega_k, \end{aligned} $$
(24)
$$\displaystyle \begin{aligned} \mathbb{V}(f) \approx \mathbb{V}(\tilde{f}) &= \mathbb{E}(\tilde{f}^2) - \mathbb{E}(\tilde{f})^2, \end{aligned} $$
(25)

where x k are the points and ω k the weights of the quadrature rule. The order of the quadrature rule is chosen depending on the distribution ϱ. Being piecewise polynomials, splines of degree n are integrated exactly by a Gauss-Legendre quadrature rule of order (n + 1)∕2 with respect to a uniform probability density function. Therefore, if the density functions ϱ d are uniform, the quality of the approximation \(\tilde {f}\) directly propagates to the quality of the stochastic quantities.
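In one dimension with a uniform density, eqs. (23)-(25) reduce to a few lines; `mean_and_variance` is an illustrative helper of ours, with the surrogate passed in as a plain callable:

```python
import numpy as np

def mean_and_variance(surrogate, order):
    """Mean and variance of a 1D surrogate on [0, 1] with uniform density,
    via Gauss-Legendre quadrature as in eqs. (23)-(25)."""
    t, w = np.polynomial.legendre.leggauss(order)
    x = 0.5 * (t + 1.0)     # map nodes from [-1, 1] to [0, 1]
    w = 0.5 * w             # rescale weights; the uniform density is 1 on [0, 1]
    fx = surrogate(x)
    mean = np.dot(w, fx)
    variance = np.dot(w, fx ** 2) - mean ** 2
    return mean, variance
```

An order-3 rule is exact up to polynomial degree 5 and therefore integrates, e.g., the square of a quadratic surrogate exactly.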

4.2 Polynomial Chaos Expansion

Generalized polynomial chaos is based on the Wiener-Askey scheme [30], where Hermite, Legendre, Laguerre, Jacobi and generalized Laguerre polynomials are used to model the effects of uncertainties of normal, uniform, exponential, beta and gamma distributed random variables, respectively. These polynomials are optimal for the corresponding distribution in the sense that they are orthogonal with respect to the corresponding inner product [9].

If other distribution types are required, nonlinear variable transformations like Rosenblatt [22] and Nataf [7] can be applied, but this typically decreases the convergence rates [9]. Alternatively, orthogonal polynomials matching the given distribution can be generated numerically [8]. For a fair comparison, our numerical examples all obey the distributions from the Wiener-Askey scheme. Note, however, that stochastic collocation with B-splines on sparse grids is not limited to particular distribution types and can be applied directly for any given distribution.

The actual chaos expansion takes the form

$$\displaystyle \begin{aligned} f(\mathbf{X}) = \gamma_0 \varPhi_0 + \sum_{d=1}^D \gamma_{d} \varPhi_1(X_{d}) + \sum_{d=1}^D\sum_{t=1}^{d} \gamma_{d,t}\varPhi_2(X_{d},X_{t}) + \dots, \end{aligned} $$
(26)

where Φ 0, Φ 1, Φ 2, … are the basis functions of increasing order from the Wiener-Askey scheme and each additional level of nested summation introduces an additional polynomial order. Usually the order-based indexing is replaced by term-based indexing to simplify the representation. Consequently,

$$\displaystyle \begin{aligned} f(\mathbf{X}) = \sum_{\mathbf{k} = \mathbf{0}}^{\mathbf{\infty}}\mathbf{\gamma}_{\mathbf{k}}\varPsi_{\mathbf{k}}(\mathbf{X}), \end{aligned} $$
(27)

where there is a direct correspondence between the coefficients γ d,t,… and γ k, and between the products Φ(X d, X t, … ) and the multivariate polynomials Ψ k(X).

The PCE coefficients γ k are calculated via spectral projection, taking advantage of the orthogonality of the polynomials to extract each coefficient,

$$\displaystyle \begin{aligned} \mathbf{\gamma}_{\mathbf{k}} = \frac{\langle f,\varPsi_{\mathbf{k}} \rangle}{\langle \varPsi^2_{\mathbf{k}} \rangle} = \frac{1}{\langle \varPsi^2_{\mathbf{k}}\rangle} \int_{[0,1]^D} f(\mathbf{X}) \varPsi_{\mathbf{k}} \mathbf{\varrho} (\mathbf{X}) d\mathbf{X}. {} \end{aligned} $$
(28)

The integral in eq. (28) must be calculated numerically. In high-dimensional settings, usually regular sparse grids based on the combination technique are used [9]; we follow this approach as well.

Once the expansion coefficients have been calculated, the desired stochastic quantities follow directly, because of the orthogonality of the polynomials,

$$\displaystyle \begin{aligned} \mathbb{E}(f) &= \mathbf{\gamma}_0,{} \end{aligned} $$
(29)
$$\displaystyle \begin{aligned} \mathbb{V}(f) &= \sum_{\mathbf{k}\neq\mathbf{0}} \mathbf{\gamma}_{\mathbf{k}}^2 \langle\varPsi^2_{\mathbf{k}} \rangle_{\mathbf{\varrho}}.{} \end{aligned} $$
(30)

In practice the expansion representation of the variance must be truncated, thus PCE tends to underestimate the variance.
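For a single uniform variable on [0, 1], spectral projection with shifted Legendre polynomials can be sketched as follows (a one-dimensional illustration of ours, not the DAKOTA implementation):

```python
import numpy as np

def shifted_legendre(k, x):
    """Legendre polynomial P_k mapped to [0, 1] (uniform density)."""
    return np.polynomial.legendre.Legendre.basis(k)(2.0 * x - 1.0)

def pce_1d(f, max_order, quad_order=20):
    """PCE coefficients via spectral projection (eq. (28)), plus the mean and
    variance of eqs. (29) and (30), for a uniform variable on [0, 1]."""
    t, w = np.polynomial.legendre.leggauss(quad_order)
    x, w = 0.5 * (t + 1.0), 0.5 * w          # quadrature on [0, 1]
    gamma = []
    for k in range(max_order + 1):
        psi = shifted_legendre(k, x)
        norm = 1.0 / (2 * k + 1)             # <Psi_k^2> for shifted Legendre
        gamma.append(np.dot(w, f(x) * psi) / norm)
    mean = gamma[0]                                      # eq. (29)
    variance = sum(g ** 2 / (2 * k + 1)                  # eq. (30), truncated
                   for k, g in enumerate(gamma) if k > 0)
    return gamma, mean, variance
```

Because f(x) = x² lies in the span of the first three shifted Legendre polynomials, truncating at order 2 already recovers mean and variance exactly; for general functions the truncation causes the variance underestimation discussed above.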

5 Numerical Results

We will measure the interpolation error between an objective function \(f:\varOmega \rightarrow \mathbb {R}\) and a surrogate \(\tilde {f} :\varOmega \rightarrow \mathbb {R}\) with the normalized root-mean-square error (NRMSE). For \(R\in \mathbb {N}\) given samples {x r ∈ Ωr = 1, …, R}, the NRMSE is defined as

$$\displaystyle \begin{aligned} \frac{1}{f_{\mathrm{max}} - f_{\mathrm{min}} }\sqrt{\frac{\sum_{r=1}^{R} (f({\mathbf{x}}_r)-\tilde{f}({\mathbf{x}}_r))^2 }{R}}, \end{aligned} $$
(31)

where f max :=maxr=1,…,R f(x r) and f min :=minr=1,…,R f(x r). In our examples we used R = 100000. Mean and variance errors are measured relatively,

$$\displaystyle \begin{aligned} \varepsilon_{\mathbb{E}} = \frac{\vert \mathbb{E}(f) - \mathbb{E}(\tilde{f}) \vert}{\mathbb{E}(f)}, \quad \varepsilon_{\mathbb{V}} = \frac{\vert \mathbb{V}(f) - \mathbb{V}(\tilde{f}) \vert}{\mathbb{V}(f)}. \end{aligned} $$
(32)
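Both error measures translate directly into code (the helper names are ours):

```python
import numpy as np

def nrmse(f_vals, s_vals):
    """Normalized root-mean-square error of eq. (31).

    f_vals: objective function values at the R samples; s_vals: surrogate
    values at the same samples.
    """
    rmse = np.sqrt(np.mean((f_vals - s_vals) ** 2))
    return rmse / (f_vals.max() - f_vals.min())

def relative_error(exact, approx):
    """Relative error of eq. (32), used for both mean and variance."""
    return abs(exact - approx) / abs(exact)
```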

All results in this chapter, except for polynomial chaos expansion, were calculated with our software SG++ [19], a general toolbox for regular and spatially adaptive sparse grids. It is available open-source for usage and comparison [25]. Our spatial adaptivity algorithm was set up to refine up to 25 points in each refinement step, starting with a regular sparse grid of level 0 for not-a-knot B-splines on boundary sparse grids and level 1 otherwise.

In practice the extension coefficients must be calculated only once. This allows an efficient implementation of the new basis. The precalculated extension coefficients we used are listed in Table 1.

Table 1 Extension coefficients e i,j for the degrees n ∈{1, 3, 5} based on P m = x m, m ∈ M. Only the coefficients for the extension at the left boundary are shown, i.e. j = 0. The right boundary is treated symmetrically. For degree 5 and level 3, the left and right extensions overlap, resulting in a special case

5.1 Exponential Objective Function

We first verify the improved convergence rates of extended not-a-knot B-splines in a simple setup that illustrates why the new basis functions are necessary. We interpolate the one-dimensional exponential function with the spline functions commonly used in the sparse grid context and measure the NRMSE; see Fig. 4.

Fig. 4

Normalized root mean square error for the interpolation of \(f(x)=\exp (x)\) with not-a-knot B-splines with and without boundary points, modified not-a-knot B-splines and extended not-a-knot B-splines on regular sparse grids for degrees n ∈{1, 3, 5}

The not-a-knot B-splines without boundary points or any boundary treatment converge very slowly, clearly showing the need for an appropriate boundary treatment. The modified not-a-knot B-splines converge faster, but are still far away from the optimal convergence rate of \(\mathcal {O}(h^{n+1})\). Only not-a-knot B-splines with boundary points and extended not-a-knot B-splines reach the optimal convergence rates.

In this one-dimensional example the additional cost of the two boundary points is negligible. However, in higher dimensions the 2D boundary points of a level-0 sparse grid can already exceed the computational limits, leaving extended not-a-knot B-splines as the only viable alternative.

5.2 Borehole Model

The next example is a real-world application, modeled in 1983 by Harper and Gupta for the Office of Nuclear Waste Isolation [12]. Since then, it has been used many times for testing new approximation methods, e.g. in [16, 32]. A borehole is drilled through an aquifer above a nuclear waste repository, through the repository, and to an aquifer below. The input parameter ranges are defined in Table 2; the response \(Q\in \mathbb {R}\) is the water flow in m3/yr and is given by

$$\displaystyle \begin{aligned} Q = \frac{2\pi T_u(H_u-H_l)}{\ln(r/r_w)\left(1+\frac{2LT_u}{\ln(r/r_w)r_w^2K_w} + \frac{T_u}{T_l} \right)}. \end{aligned} $$
(33)
Table 2 The input variables and corresponding distributions for the borehole model
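Eq. (33) translates directly into code; the parameter names follow the symbols in the formula, and the values in the test below are illustrative nominal values from the parameter ranges, not prescribed by the benchmark:

```python
import math

def borehole(rw, r, Tu, Hu, Tl, Hl, L, Kw):
    """Water flow Q in m^3/yr through the borehole, eq. (33).

    rw: borehole radius, r: radius of influence, Tu/Tl: transmissivity of the
    upper/lower aquifer, Hu/Hl: potentiometric head of the upper/lower
    aquifer, L: borehole length, Kw: hydraulic conductivity of the borehole.
    """
    log_rrw = math.log(r / rw)
    numerator = 2.0 * math.pi * Tu * (Hu - Hl)
    denominator = log_rrw * (1.0
                             + 2.0 * L * Tu / (log_rrw * rw ** 2 * Kw)
                             + Tu / Tl)
    return numerator / denominator
```

Each sparse grid point corresponds to one such evaluation, which is cheap here but stands in for an expensive simulation in the general setting.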

In terms of calculating mean and variance, we compare our method to the polynomial chaos expansion implementation of the DAKOTA library [1] and to Monte Carlo. We compare calculated means and variances to a reference solution computed with extended not-a-knot B-splines of degree 5 on a spatially adaptive sparse grid with 35,000 grid points. We verified this reference solution by calculating another reference solution using DAKOTA's polynomial chaos expansion based on a sparse grid of level 5 with 34,290 grid points. The difference between both results for mean and variance is smaller than 10−11.

Figure 5 shows the NRMSE, the relative mean error and the relative variance error for all introduced B-splines on regular and spatially adaptive sparse grids, for simple Monte Carlo and for polynomial chaos expansion. For this problem, B-splines of degree n = 5 performed best, and the plots show only these results. However, the free choice of the B-spline degree makes the approach very flexible and makes it possible to react to local features of general objective functions. While higher-degree approximations are in general better for smooth functions, they can start to oscillate, making lower degrees advantageous.

Fig. 5

Normalized root mean square error for the approximation of the borehole model and calculation of its mean and variance with not-a-knot B-splines with and without boundary points, modified not-a-knot B-splines and extended not-a-knot B-splines of degree 5 on regular and adaptive sparse grids, polynomial chaos expansion and Monte Carlo

B-splines without boundary points or any boundary treatment barely converge, again demonstrating the urgent need for compensation when omitting the boundary points. For B-splines with boundary points the errors do converge, but more slowly than for modified or extended not-a-knot B-splines, which can resolve the inner domain much finer. Of these two, the extended not-a-knot B-splines perform significantly better. In all cases, spatial adaptivity increases the convergence rate significantly over regular sparse grids.

The polynomial chaos expansion’s NRMSE is worse than that of the modified and extended not-a-knot B-splines. This is because the underlying global polynomials cannot react to local features as the spline bases can. However, its approximation of the mean is the best among all shown methods. This can be explained by Eq. (29): the mean of a polynomial chaos approximation is directly given by its first coefficient γ 0 and is independent of all other terms, so it can be disproportionately better than the overall approximation quality. The variance approximation, on the other hand, which is calculated according to Eq. (30), theoretically relies on all, infinitely many, coefficients. In practice the sum must be truncated. Consequently, the polynomial chaos expansion tends to underestimate the variance, and it can be seen that the extended not-a-knot B-splines on spatially adaptive sparse grids approximate the variance better.

As expected the simple Monte Carlo approach is easily outperformed by almost all other techniques.

6 Conclusions and Outlook

In this article we have demonstrated the need for a proper boundary treatment when creating surrogates with B-splines on sparse grids for moderately high-dimensional problems. We have shown that modified not-a-knot B-splines are not sufficient if the objective function does not have vanishing second derivatives at the boundary. Our recently introduced extended not-a-knot B-splines performed significantly better in a real-world uncertainty quantification benchmark: not only is the overall approximation improved, but also the derived stochastic quantities of interest. The results of our new method are comparable to, and for some quantities of interest even outperform, the widely used polynomial chaos expansion. This makes the technique an interesting alternative, in particular for objective functions with local features that can hardly be resolved by global polynomial approaches.

For this work we used the standard surplus-based refinement criterion. However, other refinement criteria based on means or variances have successfully been used in the context of uncertainty quantification and sparse grids [10]. These might improve the results of our technique even further.