1 Introduction

The question of high-dimensional model representation (HDMR) is of increasing importance in computational mathematics and science. We introduced this subject in Chaps. 7 and 8, where we invoked the Karhunen-Loève expansion to reduce the number of random parameters that are required to define the stochastic model. We continue the discussion of HDMR in this chapter by turning our attention to the question of determining a suitable surrogate model for computing the response of the forward problem via VIC-3D ®. This surrogate takes the form of an interpolation table, which is then transformed into the conventional table used in NLSE for solving inverse problems. The surrogate model that we seek falls under the rubric of sparse grids, which has been the subject of intensive research in a number of areas in recent years [15, 22, 43, 44, 46, 57, 58, 73, 121, 123, 146]. We will apply it to solving problems of model-based inversion as developed in [111]. Sparse grids can also be used to calculate high-dimensional integrals of the form (7.2) efficiently.

2 Mathematical Structure of the Problem

The problems in this set are based on Fig. 6.3, and required 81 VIC-3D ® runs to establish the interpolating grid. Thus, speaking abstractly, we say that the grid has 81 nodes in four-dimensional space. Each variable (a slab depth in Fig. 6.3) defines a dimension of the grid. The fifth problem introduced another variable, the width, with three possible values, making the overall grid a hypercube of 243 nodes in five-dimensional space. As we add dimensions (variables), we soon encounter the ‘curse of dimensionality,’ because each node requires a VIC-3D ® run to produce the corresponding blending function. The question arises as to whether we can reduce the number of runs by reducing the number of nodes in the interpolating grid. The answer lies in the notion of ‘sparse grids.’ This is an important area of study in numerical methods, and we refer the reader to [43] for a brief tutorial. We will follow the presentation and nomenclature of [43], using the complex-flaw model of Fig. 6.3 as an example. Further interesting applications of the sparse grid approach can be found in [44] and [146].

Referring to Fig. 6.3, we can define the mathematical structure of the problem by the abstract formula (0, 10, 20) ⊗ (0, 10, 20) ⊗ (0, 10, 20) ⊗ (0, 10, 20), where ⊗ represents the Cartesian, ‘direct,’ or ‘tensor’ product. Thus, we have a problem that is defined on a four-dimensional hypercube with 16 corner nodes, given by (0, 20) ⊗ (0, 20) ⊗ (0, 20) ⊗ (0, 20), and 65 interior nodes, given by the ‘direct difference’ between the 81 total nodes and the corner nodes. The interior nodes involve the intermediate 10-mil levels in Fig. 6.3. The question then becomes: are all of the interior nodes required for an accurate representation of the function, and if not, how do we choose which ones to keep? The answer is given by the sparse grid algorithm.
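These node counts follow directly from the tensor-product construction, and can be checked with a few lines of Python (a sketch for illustration only; the actual nodes correspond to VIC-3D ® model parameters):

```python
from itertools import product

levels = (0, 10, 20)                     # test depths of each slab, in mils
grid = list(product(levels, repeat=4))   # full tensor-product grid
corners = [p for p in grid if all(c in (0, 20) for c in p)]
print(len(grid), len(corners), len(grid) - len(corners))   # 81 16 65
```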

The sparse grid algorithm of [43] relies on a refinement of the interval of interest through successive halving of the previous interval, and then using the ‘hierarchical’ basis system of Fig. 9.1 as the interpolants. This system comprises, of course, our famous one-dimensional, first-order spline tent functions:

$$\displaystyle \begin{aligned} \phi_{l,j}(x)= \begin{cases} 1-\vert x/h_l-j\vert, & x\in [(j-1)h_l,\ (j+1)h_l]\cap [0,1]; \\ 0, & \text{otherwise,} \end{cases} {} \end{aligned} $$
(9.1)

where \(h_l=2^{-l}\) is the length of an interval at the lth level, and \(j=0,\ldots,2^l\) determines the position of a node. It is assumed in [43] that the grid is defined on the unit cube, which is the reason for the appearance of the interval [0, 1]. For our problem, we are interested only in levels 0 and 1 of the hierarchy, because we use at most two intervals for each dimension (variable) of the problem, as shown in Fig. 6.3.
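A minimal numerical sketch of (9.1), assuming the unit-interval scaling of [43], makes the hierarchy concrete:

```python
import numpy as np

def phi(l, j, x):
    """Hierarchical tent function of Eq. (9.1): level l has 2**l intervals
    of length h = 2**(-l), and node j sits at x = j*h."""
    h = 2.0 ** (-l)
    y = 1.0 - np.abs(x / h - j)
    # Zero outside the support [(j-1)h, (j+1)h] intersected with [0, 1].
    return np.where((y > 0.0) & (x >= 0.0) & (x <= 1.0), y, 0.0)

x = np.linspace(0.0, 1.0, 5)
print(phi(0, 0, x))   # slide of Fig. 9.1: [1.   0.75 0.5  0.25 0.  ]
print(phi(1, 1, x))   # level-1 tent:      [0.   0.5  1.   0.5  0.  ]
```

Level 0 reproduces the ‘slide’ and ‘ramp’ of Fig. 9.1, and level 1 the tent peaking at the midpoint.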

Fig. 9.1
figure 1

Illustrating the one-dimensional hierarchical basis system for the function space, V 3. Each level, l, has \(2^l\) intervals and \(2^l+1\) nodes. Note that in our current problem, we are working in V 1, so that we are only interested in l = 0, 1. These are the usual tent functions with which we are well familiar from VIC-3D ®. ‘R’ and ‘S’ denote ‘ramp’ and ‘slide’, respectively. The numbers along the abscissa refer to the values of the test depths of Fig. 6.3. Multidimensional functions are obtained by taking products of these functions

Multidimensional functions are obtained by taking products of the one-dimensional splines:

$$\displaystyle \begin{aligned} \phi_{\mathbf{l},\mathbf{j}}(\mathbf{x})=\prod_{t=1}^d\phi_{l_t,j_t}(x_t)\ , {} \end{aligned} $$
(9.2)

where l = (l 1, …, l d) is a multi-index of integers whose entry l t gives the refinement level, and hence the number of intervals, \(2^{l_t}\), in dimension t. Similarly, j = (j 1, …, j d), with \(j_t=0,\ldots ,2^{l_t}\), gives the nodal position at that level in each dimension. Associated with the multidimensional functions is the index set

$$\displaystyle \begin{aligned} {\mathbf{B}}_{\mathbf{l}}=\left\{\mathbf{j}:\ \begin{cases} j_t=1,3,5,\ldots,2^{l_t}-1,&\text{if}\ l_t>0,\\ j_t=0,1,&\text{if}\ l_t=0, \end{cases}\quad t=1,\ldots,d\right\}\ . {} \end{aligned} $$
(9.3)

We define two norms for discrete multi-index vectors: \(\vert\mathbf{l}\vert_\infty=\max_{1\le t\le d}l_t\), and \(\vert \mathbf {l}\vert _1=\sum _{t=1}^dl_t\). Using these norms, together with (9.3), we can define a full-grid function as an expansion in terms of the basis system (9.2),

$$\displaystyle \begin{aligned} f(\mathbf{x})=\sum_{\vert{\mathbf{l}}\vert_\infty\le n}\sum_{\mathbf{j}\in{\mathbf{B}}_{\mathbf{l}}}\alpha_{\mathbf{l},\mathbf{j}}\phi_{\mathbf{l},\mathbf{j}}(\mathbf{x})\ , {} \end{aligned} $$
(9.4)

and a sparse-grid function similarly,

$$\displaystyle \begin{aligned} f(\mathbf{x})=\sum_{\vert{\mathbf{l}}\vert_1\le n}\sum_{\mathbf{j}\in{\mathbf{B}}_{\mathbf{l}}}\alpha_{\mathbf{l},\mathbf{j}}\phi_{\mathbf{l},\mathbf{j}}(\mathbf{x})\ . {} \end{aligned} $$
(9.5)

Before we discuss either expansion in detail, let’s take a look at the distinction between them, which is tied up with the definition of the norms above. In the complex-flaw case that we are considering, we have d = 4 and n = 1. Hence, the condition \(\vert\mathbf{l}\vert_\infty\le 1\) is satisfied 16 ways, as shown in Table 9.1. The resulting total number of nodes agrees with what we knew before for the Cartesian product of three nodes in each of four variables, \(3^4=81\). Contrast this with the vector condition for sparse grids, \(\vert\mathbf{l}\vert_1\le 1\), shown in Table 9.2. It is clear that the sparse grid algorithm gives a significant reduction in the number of nodes in the interpolation grid. The difference is even more striking in the case of the five-dimensional complex flaw of Test Problem No. 5. In that case the full grid has 243 nodes, whereas the sparse grid has only \(2^5+5\times 16=112\), which is less than half the full-grid complement.

Table 9.1 Four-dimensional vectors satisfying \(\vert\mathbf{l}\vert_\infty\le 1\). The last column gives the number of nodes associated with each vector
Table 9.2 Four-dimensional vectors satisfying \(\vert\mathbf{l}\vert_1\le 1\). The last column gives the number of nodes associated with each vector
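The node counts in Tables 9.1 and 9.2 can be reproduced by enumerating the admissible multi-indices and summing the sizes of the index sets (9.3). The following sketch does this for the present case, d = 4, n = 1:

```python
from itertools import product

def nodes_per_index(l):
    """Size of the index set B_l of Eq. (9.3)."""
    n = 1
    for lt in l:
        n *= 2 if lt == 0 else 2 ** (lt - 1)   # number of odd j's at level lt
    return n

d, n = 4, 1
full   = [l for l in product(range(n + 1), repeat=d) if max(l) <= n]
sparse = [l for l in product(range(n + 1), repeat=d) if sum(l) <= n]
print(sum(nodes_per_index(l) for l in full))     # 81 (Table 9.1)
print(sum(nodes_per_index(l) for l in sparse))   # 48 (Table 9.2)
```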

We’ll interpret Table 9.2 geometrically, using Fig. 6.3 to motivate the development. The entries in Tables 9.1 and 9.2 are the exponents, l, that yield the number of intervals, \(2^l\), in each slab of Fig. 6.3. Hence, the first entry in Table 9.2 corresponds to the situation in which the midpoint, 10 mil, is missing in each of the slabs (yielding a single interval in each slab) and we are working with the Cartesian product (0, 20) ⊗ (0, 20) ⊗ (0, 20) ⊗ (0, 20), which gives the 16 corner points of a four-dimensional hypercube of length 20 on a side.

The next entry in Table 9.2 indicates that we have introduced the 10-mil midpoint in the first slab, yielding a slab with two intervals. Geometrically, this corresponds to the intersection of the hyperplane, x 1 = 10, with the four-dimensional hypercube, resulting in eight nodes at the boundaries of the hypercube. Similarly, the third through fifth entries correspond to the intersection of hyperplanes with the hypercube, resulting in eight nodes for each entry and yielding a grand total of 16 + 4 × 8 = 48 nodes. Each node carries a blending function that VIC-3D ® must compute using the appropriate parameters of Fig. 6.3. For example, the blending function corresponding to the second entry in the table would have the first slab of Fig. 6.3 fixed at 10 mils depth, and the other three cycling through 0 and 20 mils each, giving a VIC-3D ® problem with eight range values.

We turn our attention, now, to the hierarchical structure of the algorithm, which lies at the heart of (9.5). Using the format of Table 9.2, we expand (9.5) as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} f(\mathbf{x})& =&\displaystyle \sum_{\vert \mathbf{l}\vert_1\le 1}\sum_{\mathbf{j}\in{\mathbf{B}}_{\mathbf{l}}} \alpha_{\mathbf{l},\mathbf{j}}\phi_{\mathbf{l},\mathbf{j}}(\mathbf{x}) \\ & =&\displaystyle \sum_{\mathbf{j}\in B_{0000}}\alpha_{0000,\mathbf{j}}\phi_{0000,\mathbf{j}}(\mathbf{x}) +\sum_{\mathbf{j}\in B_{1000}}\alpha_{1000,\mathbf{j}}\phi_{1000,\mathbf{j}}(\mathbf{x}) +\sum_{\mathbf{j}\in B_{0100}}\alpha_{0100,\mathbf{j}}\phi_{0100,\mathbf{j}}(\mathbf{x}) \\ & +&\displaystyle \sum_{\mathbf{j}\in B_{0010}}\alpha_{0010,\mathbf{j}}\phi_{0010,\mathbf{j}}(\mathbf{x}) +\sum_{\mathbf{j}\in B_{0001}}\alpha_{0001,\mathbf{j}}\phi_{0001,\mathbf{j}}(\mathbf{x})\ . {} \end{array} \end{aligned} $$
(9.6)

Because the expansion functions, {ϕ l,j(x)}, are nonoverlapping for a given level, l, and have unit amplitude, the expansion coefficients, {α l,j}, are simply equal to the blending-function value associated with the node of the appropriate function at level l.

Figure 9.2 illustrates the situation in one dimension at levels 0 and 1. In this example, we have α 0,0 = BF(0), α 0,20 = BF(20), α 1,10 = SURPLUS, where SURPLUS = BF(10) − 1∕2(BF(20) + BF(0)). Hence, the expansion shown in Fig. 9.2 is given by

$$\displaystyle \begin{aligned} f(x)=BF(0)\phi_{0,0}(x)+BF(20)\phi_{0,20}(x)+\mathrm{SURPLUS}\phi_{1,10}(x)\ , {} \end{aligned} $$
(9.7)
Fig. 9.2
figure 2

Illustrating a one-dimensional example of the hierarchical expansion at levels 0 (a) and 1 (b). The blending functions, BF(), are labeled at the nodes of the one-dimensional simplex. The ‘Surplus’ value defines the magnitude of the level-1 spline

where ϕ 0,0(x) is the slide function, S(x), and ϕ 0,20(x) is the ramp function, R(x), in Fig. 9.2.

It is clear that the name ‘SURPLUS’ (called the hierarchical surplus in [43]) denotes the excess in function value that the higher levels are supposed to accommodate. Its evaluation at level l requires only one additional blending function to be computed, while reusing two computed previously at level l − 1. This is an advantage of the hierarchical structure of the algorithm. If SURPLUS = 0, then the interpolator would treat the function as linear, instead of piecewise linear, in this dimension. Thus, the expansion, (9.6), is reminiscent of a Taylor series, in which higher levels of the hierarchy play the role of higher-order polynomial terms.
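A one-dimensional sketch of (9.7), with scalar placeholders standing in for the impedance-valued blending functions, shows how the surplus enters:

```python
import numpy as np

def interp_1d(bf0, bf10, bf20, x):
    """Level-0/1 hierarchical interpolant of Eq. (9.7) on [0, 20] mils.
    bf0, bf10, bf20 stand in for the blending functions at depths 0, 10, 20."""
    S = 1.0 - x / 20.0                                   # slide, phi_{0,0}
    R = x / 20.0                                         # ramp,  phi_{0,20}
    T = np.maximum(1.0 - np.abs(x - 10.0) / 10.0, 0.0)   # level-1 tent
    surplus = bf10 - 0.5 * (bf0 + bf20)                  # hierarchical surplus
    return bf0 * S + bf20 * R + surplus * T

x = np.array([0.0, 5.0, 10.0, 20.0])
print(interp_1d(1.0, 2.5, 3.0, x))   # [1.   1.75 2.5  3.  ]: hits each node
```

At x = 5 the value 1.75 is the level-0 linear value 1.5 plus one-half of the surplus 0.5, exactly as the hierarchy prescribes.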

The full four-dimensional expansion of (9.6) is given next. The first term is:

$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle BF(0,0,0,0)S(x_1)S(x_2)S(x_3)S(x_4)+BF(20,0,0,0)R(x_1)S(x_2)S(x_3)S(x_4)+ \\ & &\displaystyle BF(0,20,0,0)S(x_1)R(x_2)S(x_3)S(x_4)+BF(20,20,0,0)R(x_1)R(x_2)S(x_3)S(x_4)+ \\ & &\displaystyle BF(0,0,20,0)S(x_1)S(x_2)R(x_3)S(x_4)+BF(20,0,20,0)R(x_1)S(x_2)R(x_3)S(x_4)+ \\ & &\displaystyle BF(0,20,20,0)S(x_1)R(x_2)R(x_3)S(x_4)+BF(20,20,20,0)R(x_1)R(x_2)R(x_3)S(x_4)+ \\ & &\displaystyle BF(0,0,0,20)S(x_1)S(x_2)S(x_3)R(x_4)+BF(20,0,0,20)R(x_1)S(x_2)S(x_3)R(x_4)+ \\ & &\displaystyle BF(0,20,0,20)S(x_1)R(x_2)S(x_3)R(x_4)+BF(20,20,0,20)R(x_1)R(x_2)S(x_3)R(x_4)+ \\ & &\displaystyle BF(0,0,20,20)S(x_1)S(x_2)R(x_3)R(x_4)+BF(20,0,20,20)R(x_1)S(x_2)R(x_3)R(x_4)+ \\ & &\displaystyle BF(0,20,20,20)S(x_1)R(x_2)R(x_3)R(x_4)+BF(20,20,20,20)R(x_1)R(x_2)R(x_3)R(x_4)\ ; \\ {} \end{array} \end{aligned} $$
(9.8)

the second:

$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \mathrm{SURPLUS}(10,0,0,0)\phi_{1,10}(x_1)S(x_2)S(x_3)S(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(10,20,0,0)\phi_{1,10}(x_1)R(x_2)S(x_3)S(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(10,0,20,0)\phi_{1,10}(x_1)S(x_2)R(x_3)S(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(10,20,20,0)\phi_{1,10}(x_1)R(x_2)R(x_3)S(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(10,0,0,20)\phi_{1,10}(x_1)S(x_2)S(x_3)R(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(10,20,0,20)\phi_{1,10}(x_1)R(x_2)S(x_3)R(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(10,0,20,20)\phi_{1,10}(x_1)S(x_2)R(x_3)R(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(10,20,20,20)\phi_{1,10}(x_1)R(x_2)R(x_3)R(x_4)\ ; \\ {} \end{array} \end{aligned} $$
(9.9)

the third:

$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \mathrm{SURPLUS}(0,10,0,0)S(x_1)\phi_{1,10}(x_2)S(x_3)S(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,10,0,0)R(x_1)\phi_{1,10}(x_2)S(x_3)S(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(0,10,20,0)S(x_1)\phi_{1,10}(x_2)R(x_3)S(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,10,20,0)R(x_1)\phi_{1,10}(x_2)R(x_3)S(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(0,10,0,20)S(x_1)\phi_{1,10}(x_2)S(x_3)R(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,10,0,20)R(x_1)\phi_{1,10}(x_2)S(x_3)R(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(0,10,20,20)S(x_1)\phi_{1,10}(x_2)R(x_3)R(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,10,20,20)R(x_1)\phi_{1,10}(x_2)R(x_3)R(x_4)\ ; \\ {} \end{array} \end{aligned} $$
(9.10)

the fourth:

$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \mathrm{SURPLUS}(0,0,10,0)S(x_1)S(x_2)\phi_{1,10}(x_3)S(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,0,10,0)R(x_1)S(x_2)\phi_{1,10}(x_3)S(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(0,20,10,0)S(x_1)R(x_2)\phi_{1,10}(x_3)S(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,20,10,0)R(x_1)R(x_2)\phi_{1,10}(x_3)S(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(0,0,10,20)S(x_1)S(x_2)\phi_{1,10}(x_3)R(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,0,10,20)R(x_1)S(x_2)\phi_{1,10}(x_3)R(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(0,20,10,20)S(x_1)R(x_2)\phi_{1,10}(x_3)R(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,20,10,20)R(x_1)R(x_2)\phi_{1,10}(x_3)R(x_4)\ ; \\ {} \end{array} \end{aligned} $$
(9.11)

and the fifth:

$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \mathrm{SURPLUS}(0,0,0,10)S(x_1)S(x_2)S(x_3)\phi_{1,10}(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,0,0,10)R(x_1)S(x_2)S(x_3)\phi_{1,10}(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(0,20,0,10)S(x_1)R(x_2)S(x_3)\phi_{1,10}(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,20,0,10)R(x_1)R(x_2)S(x_3)\phi_{1,10}(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(0,0,20,10)S(x_1)S(x_2)R(x_3)\phi_{1,10}(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,0,20,10)R(x_1)S(x_2)R(x_3)\phi_{1,10}(x_4)+ \\ & &\displaystyle \mathrm{SURPLUS}(0,20,20,10)S(x_1)R(x_2)R(x_3)\phi_{1,10}(x_4)\\ & &\displaystyle \quad +\mathrm{SURPLUS}(20,20,20,10)R(x_1)R(x_2)R(x_3)\phi_{1,10}(x_4)\ . \\ {} \end{array} \end{aligned} $$
(9.12)

The arguments of the BF and SURPLUS functions are the depth parameters of the corresponding slabs in Fig. 6.3. Thus, BF(0, 20, 20, 20) is the blending function computed by VIC-3D ® when slab 1 has a depth of 0, and the other three slabs are at a full depth of 20. In (9.9)–(9.12), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathrm{SURPLUS}(10,a,b,c)& =&\displaystyle BF(10,a,b,c)-1/2(BF(20,a,b,c)+BF(0,a,b,c)) \\ \mathrm{SURPLUS}(a,10,b,c)& =&\displaystyle BF(a,10,b,c)-1/2(BF(a,20,b,c)+BF(a,0,b,c)) \\ \mathrm{SURPLUS}(a,b,10,c)& =&\displaystyle BF(a,b,10,c)-1/2(BF(a,b,20,c)+BF(a,b,0,c)) \\ \mathrm{SURPLUS}(a,b,c,10)& =&\displaystyle BF(a,b,c,10)-1/2(BF(a,b,c,20)+BF(a,b,c,0))\\ {} \end{array} \end{aligned} $$
(9.13)
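The complete level-1 sparse interpolant (9.6), with terms (9.8)–(9.12) and surpluses (9.13), can be assembled generically. In the sketch below, BF is a hypothetical dictionary of scalar stand-ins for the VIC-3D ® blending functions, keyed by depth 4-tuples:

```python
from itertools import product
import numpy as np

S = lambda x: 1.0 - x / 20.0                          # slide
R = lambda x: x / 20.0                                # ramp
T = lambda x: max(1.0 - abs(x - 10.0) / 10.0, 0.0)    # level-1 tent

def sparse_interp(BF, x):
    """Level-1 sparse interpolant of Eqs. (9.6)-(9.13) on [0, 20]^4."""
    val = 0.0
    for c in product((0, 20), repeat=4):              # Eq. (9.8): 16 corners
        val += BF[c] * np.prod([S(x[t]) if c[t] == 0 else R(x[t])
                                for t in range(4)])
    for t in range(4):                                # Eqs. (9.9)-(9.12)
        for c in product((0, 20), repeat=3):
            node = c[:t] + (10,) + c[t:]              # midpoint in direction t
            hi = node[:t] + (20,) + node[t + 1:]
            lo = node[:t] + (0,) + node[t + 1:]
            surplus = BF[node] - 0.5 * (BF[hi] + BF[lo])   # Eq. (9.13)
            val += surplus * T(x[t]) * np.prod(
                [S(x[s]) if node[s] == 0 else R(x[s])
                 for s in range(4) if s != t])
    return val

# 48 sparse-grid nodes: corners plus the nodes with exactly one 10-mil entry.
BF = {nd: sum(nd) / 10.0 for nd in product((0, 10, 20), repeat=4)
      if list(nd).count(10) <= 1}
print(sparse_interp(BF, (10, 0, 0, 0)), BF[(10, 0, 0, 0)])   # 1.0 1.0
```

The interpolant reproduces BF exactly at each of the 48 nodes; for this linear test function all of the surpluses vanish.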

3 Clenshaw-Curtis Grids

The Clenshaw-Curtis family of grids [57, 58] is, for the most part, superior to the others of the genre that we have just described, in the sense that the number of points in C-C grids increases more slowly with the level of refinement, while retaining the same asymptotic error decay rate. Our presentation follows [58].

The points, \(x_j^i\), of the C-C grid in one dimension are defined as

$$\displaystyle \begin{aligned} \begin{array}{rcl} m_i& =&\displaystyle \begin{cases}1,&\displaystyle \text{if}\ i=1,\\ 2^{i-1}+1,& \text{if}\ i>1,\end{cases} \\ x^i_j& =&\displaystyle \begin{cases}(j-1)/(m_i-1)&\displaystyle \text{for}\ j=1,\ldots,m_i\ \text{if}\ m_i>1,\\ 0.5& \text{for}\ j=1\ \text{if}\ m_i=1.\end{cases} {} \end{array} \end{aligned} $$
(9.14)

The index, i, is used to indicate a level of refinement of the grid in the appropriate dimension. It is clear from (9.14) that the set of points, X i, generated at the ith level, is a subset of X i+1: X i ⊂ X i+1. Furthermore, this implies that the multidimensional sparse grid generated by the tensor product of the one-dimensional grids

$$\displaystyle \begin{aligned} H_{q,d}=\cup_{q-d+1\le\vert \mathbf{i}\vert_1\le q}(X^{i_1}\otimes\cdots\otimes X^{i_d})\ , {} \end{aligned} $$
(9.15)

also satisfies the inclusion property: H q,d ⊂ H q+1,d. The multi-index, i = (i 1, …, i d), collects the one-dimensional refinement levels, and the scalar index, q, sets the overall refinement of the sparse grid; the grid level is n = q − d. The importance of this last relationship is that the blending functions that have already been computed on the set H q,d can be reused in H q+1,d, so that only the surplus blending functions need to be computed in H q+1,d. When these surpluses are smaller than a threshold, we can stop refining the grid. Figure 9.3 illustrates two point sets of the C-C grid, X 1 ⊗ X 4 and X 1 ⊗ X 5. The former is a subset of the latter, which is at the next level of refinement.
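A short sketch of (9.14), on the unit interval, confirms the nesting property:

```python
def cc_points(i):
    """One-dimensional Clenshaw-Curtis point set X^i of Eq. (9.14)."""
    if i == 1:
        return {0.5}
    m = 2 ** (i - 1) + 1
    return {(j - 1) / (m - 1) for j in range(1, m + 1)}

for i in range(1, 6):
    assert cc_points(i) <= cc_points(i + 1)   # X^i is a subset of X^(i+1)
print(sorted(cc_points(3)))                   # [0.0, 0.25, 0.5, 0.75, 1.0]
```

(The points are exact binary fractions, so the set comparison is safe in floating point.)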

Fig. 9.3
figure 3

Example of two point sets of the C-C grid. Left: X 1 ⊗ X 4. Right: X 1 ⊗ X 5. Note that X 1 ⊗ X 4 is a subset of X 1 ⊗ X 5

An example of (9.15) for the case q = 6, d = 2 is given here:

$$\displaystyle \begin{aligned} \begin{array}{rcl} H_{6,2}& =&\displaystyle \cup_{5\le i_1+i_2\le6}X^{i_1}\otimes X^{i_2} \\ & =&\displaystyle X^1\otimes X^4\cup X^1\otimes X^5\cup X^2\otimes X^3\cup X^2\otimes X^4 \cup X^3\otimes X^2\cup X^3\otimes X^3 \\ & \cup&\displaystyle X^4\otimes X^1\cup X^4\otimes X^2\cup X^5\otimes X^0\cup X^5\otimes X^1\ , {} \end{array} \end{aligned} $$
(9.16)

with X 0 = ∅. The C-C grid corresponding to (9.16) is shown in Fig. 9.4.

Fig. 9.4
figure 4

Sparse grid \(H_{6,2}^{\mathrm {CC}}\) at level 4. There are a total of 65 points

With \(X^i_\varDelta =X^i\backslash X^{i-1}\), namely those points in X i that are not in X i−1 (the ‘excess’ points), (9.15) can be expanded as

$$\displaystyle \begin{aligned} H_{q,d}=\cup_{\vert \mathbf{i}\vert_1\le q}(X^{i_1}_\varDelta\otimes\cdots\otimes X^{i_d}_\varDelta)=H_{q-1,d}\cup \varDelta H_{q,d}\ , {} \end{aligned} $$
(9.17)

where

$$\displaystyle \begin{aligned} \varDelta H_{q,d}=\cup_{\vert\mathbf{i}\vert_1=q}(X^{i_1}_\varDelta\otimes\cdots\otimes X^{i_d}_\varDelta)\ , {} \end{aligned} $$
(9.18)

and H d−1,d = ∅. This is more convenient for expansion of the interpolant using successive refinements of the grid with increasing parameter, q.
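The decomposition (9.17)–(9.18) is easy to exercise numerically. The sketch below assembles H q,d from the disjoint ‘excess’ products and reproduces the point counts quoted for Fig. 9.4 and Table 9.4:

```python
from itertools import product

def cc(i):
    """One-dimensional C-C point set X^i of Eq. (9.14), on [0, 1]."""
    return {0.5} if i == 1 else {j / 2 ** (i - 1) for j in range(2 ** (i - 1) + 1)}

def delta(i):
    """Excess points added at level i: X^i minus X^(i-1)."""
    return cc(i) - cc(i - 1) if i > 1 else cc(i)

def sparse_cc(q, d):
    """Sparse grid H_{q,d}, assembled from the disjoint products of Eq. (9.17)."""
    H = set()
    for i in product(range(1, q - d + 2), repeat=d):
        if sum(i) <= q:
            H |= set(product(*(sorted(delta(t)) for t in i)))
    return H

print(len(sparse_cc(6, 2)))   # 65 points, as in Fig. 9.4
print(len(sparse_cc(6, 4)))   # 41 points for d = 4, n = q - d = 2 (Table 9.4)
```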

Table 9.3 shows the number of grid points at level n = q − d for Clenshaw-Curtis grids of up to 16 dimensions. We see that the increase in grid points with dimension is rather slow, which is a significant advantage of C-C grids. If we argue, as we did earlier, that the complex flaw of Fig. 6.3 requires only three grid points per dimension, then it follows from Table 9.3 that we may be able to get by with a total of 33 points for 16 dimensions at level 1! This is because the midpoint is used at the 0th level and the two end points at the first level, an incredible savings. If we are conservative, and decide that we need to go to level 2, we can go up to eight dimensions and require only 145 blending functions. A full grid for an eight-dimensional problem with three points per dimension would require \(3^8=6561\) points! The curse of dimensionality strikes again.

Table 9.3 Number of grid points at level n = q − d for Clenshaw-Curtis grids

The software of Klimke [57, 58] allows one to compute the coordinates of a C-C grid, which then allows the user to determine a priori which blending functions to compute for an interpolation table. Table 9.4 lists the coordinates of hierarchical C-C grid points for d = 4, n = 0 : 2. The table is arranged to show the grid points that are added to the interpolant at level n. The total number of points is 41, which agrees with the corresponding entry in Table 9.3.

Table 9.4 Coordinates of hierarchical C-C grid points for d = 4, n = 0 : 2

Now that we have discussed the nature of C-C grids, it is time to address the expansion of the interpolant in function space. Figure 9.5 illustrates the typical piecewise nodal basis functions used in conventional interpolation schemes, as well as the hierarchical basis functions used with Clenshaw-Curtis grids. Note that in the C-C basis, the lowest-level function in the hierarchy is simply a constant, whereas the basis system at the next level consists of a slide (left) and a ramp (right), each of which spans only one-half of the line. Finally, the basis functions at level 2 are the usual disjoint tent functions that we have seen before in Fig. 9.1.

Fig. 9.5
figure 5

Illustrating nodal basis functions, ϕ 3,j, x 3,j ∈ X 3 (left) and hierarchical basis functions, ϕ i,j, with their support nodes, \(x_{i,j}\in X^i_\varDelta ,i = 1, \ldots , 3\) (right) for the Clenshaw-Curtis grid (see [58])

Figure 9.6 illustrates how the two sets of basis functions are used in interpolations. Our interest is in the hierarchical system shown at the bottom of the figure. The w i,j denote the hierarchical surpluses that are the expansion coefficients for the hierarchical interpolation formula:

$$\displaystyle \begin{aligned} U(x)=w_{1,1}\phi_{1,1}(x)+w_{2,1}\phi_{2,1}(x)+w_{2,2}\phi_{2,2}(x)+w_{3,1}\phi_{3,1}(x)+w_{3,2}\phi_{3,2}(x)\ . {} \end{aligned} $$
(9.19)
Fig. 9.6
figure 6

Illustrating nodal (top) versus hierarchical (bottom) interpolation in one dimension (see [58]). The w i,j denote the hierarchical surpluses that are the expansion coefficients for the hierarchical interpolation formula: U(x) = w 1,1 ϕ 1,1(x) + w 2,1 ϕ 2,1(x) + w 2,2 ϕ 2,2(x) + w 3,1 ϕ 3,1(x) + w 3,2 ϕ 3,2(x). They are given by: w 1,1 = f(0.5), w 2,1 = f(0) − f(0.5), w 2,2 = f(1) − f(0.5), w 3,1 = f(0.25) − 1∕2(f(0) + f(0.5)), and w 3,2 = f(0.75) − 1∕2(f(1) + f(0.5)), where f(x) is shown as the dashed curve in the figure

They are given by: w 1,1 = f(0.5), w 2,1 = f(0) − f(0.5), w 2,2 = f(1) − f(0.5), w 3,1 = f(0.25) − 1∕2(f(0) + f(0.5)), and w 3,2 = f(0.75) − 1∕2(f(1) + f(0.5)), where f(x) is shown as the dashed curve in the figure. (The surpluses are the expansion coefficients because the basis functions all have unit amplitude, and do not overlap with each other at a given level.) Note the efficiency in the hierarchical algorithm, in that function values computed at one level are reused in the next higher level.
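These surplus formulas are easy to verify; the sketch below evaluates them for a sample function (f(x) = x², purely for illustration):

```python
def surpluses(f):
    """Hierarchical surpluses w_{i,j} of Eq. (9.19), levels 1 through 3."""
    w11 = f(0.5)
    w21, w22 = f(0.0) - w11, f(1.0) - w11
    w31 = f(0.25) - 0.5 * (f(0.0) + f(0.5))
    w32 = f(0.75) - 0.5 * (f(1.0) + f(0.5))
    return w11, w21, w22, w31, w32

print(surpluses(lambda x: x * x))   # (0.25, -0.25, 0.75, -0.0625, -0.0625)
```

The two level-3 surpluses are equal here because the second difference of a quadratic is constant.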

In reality, the nodal values in Fig. 9.6 are blending functions, comprising an entire 1- or 2-D scan impedance response. Here’s how we use the coordinate information in Table 9.4 to determine the computation of the associated blending function in Fig. 6.3. For n = 0, we set the boundary of each slab in Fig. 6.3 to be 10 mils (recall that we are scaling the physical dimensions to fit into a unit hypercube in 4-space), and use VIC-3D ® to compute the response. This, then, is the blending function associated with the midpoint of the hypercube.

The problem configuration for computing the blending function associated with the first entry at n = 1 of Table 9.4 would have the depth of the first slab of Fig. 6.3 equal to zero, and the other three fixed at 10 mils. For the first entry at n = 2, the depth of the first slab would be 5 mils, and the other three depths would be 10 mils, and so on.

4 The TASMANIAN Sparse Grids Module

The Toolkit for Adaptive Stochastic Modeling and Non-Intrusive ApproximatioN is a robust library for high-dimensional integration and interpolation, as well as parameter calibration. The code consists of several modules that can be used individually or conjointly. The project is sponsored by Oak Ridge National Laboratory Directed Research and Development as well as the Department of Energy Office for Advanced Scientific Computing Research (see tasmanian.ornl.org/about.html).

Sparse Grids is a family of algorithms for constructing multidimensional quadrature and interpolation rules from tensor products of one-dimensional rules. The TASMANIAN Sparse Grid code implements a number of different quadrature rules and basis functions (see [123] for details). The rules are grouped into three categories:

  • Global Grids: suitable for globally smooth functions. Quadrature is based on a number of rules, including Clenshaw-Curtis, and interpolation is based on global Lagrange polynomials. Nodal point selection follows the same rules as quadrature. These grids are most suitable for our use, and will be discussed in more detail below.

  • Local Polynomial Grids: suitable for non-smooth functions with locally sharp behavior. Interpolation is based on hierarchical piecewise polynomials with local support and user-specified order. These grids are suitable for local refinement.

  • Wavelet Grids: similar to local polynomial grids, except that the order is restricted to either 1 or 3. When coupled with local refinement, these grids often provide the same accuracy with fewer abscissas.

Lagrange Polynomial Interpolation

Given nodal values, f(x i), the LP interpolator is given by

$$\displaystyle \begin{aligned} f(x)=\sum_{i=0}^Nl_i(x)f(x_i)\ , {} \end{aligned} $$
(9.20)

where

$$\displaystyle \begin{aligned} l_i(x)=\prod_{\substack{j=0 \\ j\ne i}}^N{\displaystyle\frac{x-x_j}{x_i-x_j}}\ . {} \end{aligned} $$

(9.21)

The interpolating polynomials, {l i(x)}, satisfy l i(x j) = δ ij. An example for N = 3 is given here:

$$\displaystyle \begin{aligned} \begin{array}{rcl} l_0(x)& =&\displaystyle {\displaystyle\frac{x-x_1}{x_0-x_1}}{\displaystyle\frac{x-x_2}{x_0-x_2}}{\displaystyle\frac{x-x_3}{x_0-x_3}} \\ l_1(x)& =&\displaystyle {\displaystyle\frac{x-x_0}{x_1-x_0}}{\displaystyle\frac{x-x_2}{x_1-x_2}}{\displaystyle\frac{x-x_3}{x_1-x_3}} \\ l_2(x)& =&\displaystyle {\displaystyle\frac{x-x_0}{x_2-x_0}}{\displaystyle\frac{x-x_1}{x_2-x_1}}{\displaystyle\frac{x-x_3}{x_2-x_3}} \\ l_3(x)& =&\displaystyle {\displaystyle\frac{x-x_0}{x_3-x_0}}{\displaystyle\frac{x-x_1}{x_3-x_1}}{\displaystyle\frac{x-x_2}{x_3-x_2}}\ . {} \end{array} \end{aligned} $$
(9.22)

The nodal points, or knots, are located at the extrema (maxima or minima) of Chebyshev polynomials. If m i > 1 is the number of knots at the ith level of approximation in a given dimension, then the knots, over the interval [−1, +1], are given by

$$\displaystyle \begin{aligned} x_j^i=-\cos{\displaystyle\frac{\pi(j-1)}{m_i-1}},\ \ j=1,\ldots,m_i\ , {} \end{aligned} $$
(9.23)

with \(x_1^i=0\) if \(m_i=1\). In order for the knots to be nested at the next level of approximation, we choose \(m_1=1\) and \(m_i=2^{i-1}+1\) for i > 1 [15]. The Lagrange interpolator is an example of Eq. (1.3) in the TASMANIAN User Manual.
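A direct sketch of (9.20)–(9.23) combines the Chebyshev-extrema knots with the Lagrange basis (this is an illustration, not the TASMANIAN implementation itself):

```python
import numpy as np

def cheb_knots(i):
    """Chebyshev-extrema knots of Eq. (9.23) on [-1, +1]."""
    m = 1 if i == 1 else 2 ** (i - 1) + 1
    return np.array([0.0]) if m == 1 else -np.cos(np.pi * np.arange(m) / (m - 1))

def lagrange(knots, values, x):
    """Lagrange interpolant of Eqs. (9.20)-(9.22), evaluated at x."""
    f = 0.0
    for i, xi in enumerate(knots):
        li = np.prod([(x - xj) / (xi - xj)          # l_i(x_j) = delta_ij
                      for j, xj in enumerate(knots) if j != i])
        f += values[i] * li
    return f

knots = cheb_knots(3)                        # m_3 = 5 knots
print(lagrange(knots, np.cos(knots), 0.3))   # approx. cos(0.3) = 0.95534
```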

5 First TASMANIAN Results

In order to test the accuracy of interpolating high-dimensional models at various levels, we start with a simple example in which we compare the interpolated results with original data that are computed directly with VIC-3D ® for the complex flaw with ‘coordinates’ (20,0,0,0) shown in Fig. 9.7. These results are shown in Figs. 9.8 and 9.9. Additional test results are shown in Figs. 9.10, 9.11, and 9.12. It is clear from these tests that TASMANIAN works well as long as the level is chosen correctly. It seems likely that one could determine a suitable level for problems of a given dimension by using theoretical rates of convergence, but in these tests we used an empirical approach to determine such levels.

Fig. 9.7
figure 7

Showing a complex flaw extending over one-half of the first block of Fig. 6.3 and vanishing elsewhere

Fig. 9.8
figure 8

Comparison of interpolated result with original data for the complex flaw with ‘coordinates’ (20,0,0,0). (See Fig. 9.7.) The four-dimensional TASMANIAN sparse grid was generated at level 2, and required 33 Chebyshev points

Fig. 9.9
figure 9

Comparison of interpolated result with original data for the complex flaw with ‘coordinates’ (20,0,0,0). (See Fig. 9.7.) The four-dimensional TASMANIAN sparse grid was generated at level 4, with 145 points. Note the significant improvement over Fig. 9.8

Fig. 9.10
figure 10

Comparison of interpolated result with original data for the complex flaw with ‘coordinates’ (20,40,30,10). (See Fig. 6.4.) The four-dimensional TASMANIAN sparse grid was generated at level 2 (33 points)

Fig. 9.11
figure 11

Comparison of interpolated result with original data for the complex flaw with ‘coordinates’ (20,40,30,10). (See Fig. 6.4.) The four-dimensional TASMANIAN sparse grid was generated at level 4 (145 points). Note the improvement over Fig. 9.10. The relative errors are less than 7% in the real part (resistance) and less than 3% in the imaginary part (reactance). The major part of the relative error occurs when the real and imaginary parts, especially the imaginary part, are both small. The relative error is calculated as the ratio of the difference of the ‘experimental data’ and the interpolated data to the experimental data. The absolute error is the absolute value of the difference between the experimental and interpolated data

Fig. 9.12
figure 12

Comparison of interpolated results with original data for the complex flaw with ‘coordinates’ (20,40,30,10). (See Fig. 6.4.) The coordinates of the flaw in eight dimensions are (20,20,40,40,30,30,10,10), because the length of each of the eight-dimensional slabs is one-half that of the four-dimensional slabs. The eight-dimensional TASMANIAN sparse grid was generated at level 3, with 593 Chebyshev points

6 Results for 4D-Level 8

We have done a number of numerical experiments to test TASMANIAN and to learn more about the relationship between the number of dimensions in a grid and the level of the grid. For example, we found that there was reasonable convergence from 8D-Level 3 (593 points) to 8D-Level 4 (1953 points), but using an eight-dimensional grid accurately would require even higher levels. Since the 8D-Level 4 run took 12.5 h, we decided not to go to higher dimensions until we were absolutely forced to. In fact, it is probably wiser to use voxel-based inverse methods when the number of parameters needed to accurately model the geometry of the problem exceeds, say, 5 or 6.

We also studied the convergence of 4D-Level 6 to 4D-Level 8, and found excellent results. Hence, the remainder of our study concentrates on the 4D-Level 8 situation. As a start, we show in Figs. 9.13, 9.14, 9.15, 9.16, 9.17, 9.18, 9.19, 9.20, 9.21, 9.22, and 9.23 a comparison of interpolated results at 81 arbitrarily selected points with the TASMANIAN grid at level 8, with 1857 Chebyshev points.

Fig. 9.13
figure 13

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8, with 1857 Chebyshev points

Fig. 9.14
figure 14

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Fig. 9.15
figure 15

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Fig. 9.16
figure 16

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Fig. 9.17
figure 17

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Fig. 9.18
figure 18

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Fig. 9.19
figure 19

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Fig. 9.20
figure 20

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Fig. 9.21
figure 21

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Fig. 9.22
figure 22

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Fig. 9.23
figure 23

Comparison of interpolated results at 81 arbitrarily selected points with the four-dimensional TASMANIAN grid at level 8 (continued)

Scaling may contribute to the occasionally poor fit of the real part, since the real part is much smaller than the imaginary part, but it is also likely that the real and imaginary parts vary differently over this rather large range of the physical variables, which adds to the challenge of accurately interpolating each component.

Figure 9.24 shows the VIC-3D ® model response of Fig. 6.4 when block 2 is varied in depth from 0 to 20 mils in four equal intervals, and the other blocks remain fixed at the values shown. Figure 9.25 shows a fourth-order polynomial fit to the peak values of the impedance data of Fig. 9.24, and Table 9.5 lists the coefficients of these fits. Clearly, there is a difference in the way that the real and imaginary parts vary with this particular geometric feature of the model. In both Figs. 9.24 and 9.25, the legend indicates depths of 0–40, which is due to the fact that the slabs are represented in VIC-3D ® as canonical ‘blocks,’ whose reference coordinate system is at the center of the block. The blocks are defined in VIC-3D ® by the total length of the unclipped block, so that when the clip plane passes through the origin of the block, as in this case, the ‘length’ parameter that VIC-3D ® uses, and which is shown in the figures, is always twice the depth parameter shown in Fig. 6.4.

Fig. 9.24
figure 24

VIC-3D ® model response of Fig. 6.4 when block 2 is varied in depth from 0 to 20 mils in four equal intervals, and the other blocks remain fixed at the values shown

Fig. 9.25
figure 25

Showing a fourth-order polynomial fit to the peak values of the impedance data of Fig. 9.24

Table 9.5 Coefficients of the fourth-order polynomial fits of Fig. 9.25

7 The Geometry of the 4D-Level 8 Chebyshev Sparse Grid

We show in Figs. 9.26, 9.27, 9.28, 9.29, and 9.30 the distribution of points in a number of two-dimensional subspaces of the original four-dimensional grid.

Fig. 9.26
figure 26

Showing the Chebyshev distribution of points in various two-dimensional subsets of the four-dimensional grid at level 8

Fig. 9.27
figure 27

Showing the Chebyshev distribution of points in various two-dimensional subsets of the four-dimensional grid at level 8 (continued)

Fig. 9.28
figure 28

Showing the Chebyshev distribution of points in various two-dimensional subsets of the four-dimensional grid at level 8 (continued)

Fig. 9.29
figure 29

Showing the Chebyshev distribution of points in various two-dimensional subsets of the four-dimensional grid at level 8 (continued)

Fig. 9.30
figure 30

Showing the Chebyshev distribution of points in various two-dimensional subsets of the four-dimensional grid at level 8 (continued)

8 Searching the Sparse Grid for a Starting Point for Inversion

One of the applications of the sparse grid is as a surrogate for VIC-3D ® in choosing a starting point for inversion with NLSE. To demonstrate this, we took two ‘test data’ sets, one with coordinates (0,10,18,32), and the other with coordinates (0,10,5,12). The first corresponds to a ‘good’ interpolation, as shown in the first column of the third row of Fig. 9.13, and the second to a ‘not-so-good’ interpolation, as shown in the first column of the first row of Fig. 9.13. The algorithm for finding the starting point is to determine the ‘nearest neighbor’ to the test data among the 1857 sparse grid points by choosing that point with the smallest norm of the residual impedance vector. The result of the experiment is shown in Table 9.6. The second nearest point is also shown for each data vector. The nearest neighbor to (0,10,18,32) lies in the subspace (d 1, d 2, 20, 30) in Fig. 9.27, and the second nearest neighbor lies in the subspace (d 1, d 2, 20, 20) in the same figure. As for (0,10,5,12), its nearest neighbor lies in the subspace (d 1, d 2, 0, 10) shown in Fig. 9.26, and the second nearest neighbor lies in (d 1, d 2, 10, 10) in the same figure.

Table 9.6 Nearest sparse grid neighbors for two test data sets
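A sketch of the nearest-neighbor search is given below; the array names and shapes are hypothetical, since the grid responses are actually VIC-3D ® impedance scans:

```python
import numpy as np

def nearest_nodes(z_test, z_grid, k=2):
    """Indices of the k grid responses nearest to z_test, measured by the
    norm of the residual impedance vector."""
    residuals = np.linalg.norm(z_grid - z_test, axis=1)
    return np.argsort(residuals)[:k]

# Hypothetical usage: z_grid holds the responses at the 1857 grid points.
rng = np.random.default_rng(1)
z_grid = rng.normal(size=(1857, 100)) + 1j * rng.normal(size=(1857, 100))
print(nearest_nodes(z_grid[42], z_grid))   # [42 ...]: node 42 is its own nearest
```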

In this manner we can map the ‘most likely’ regions of the sparse grid in which the data vector lies, and then compute a compact uniform grid for interpolation in NLSE within these regions. If the uniform grid is much smaller than the sparse grid, we would expect to get tighter estimations of confidence intervals when we perform a stochastic inversion with NLSE.

9 A Five-Dimensional Inverse Problem

The proof of the pudding with sparse grids is their ability to simplify the solution of inverse problems that use internal interpolation tables, as NLSE does. Our hope is that we need to compute far fewer forward solutions with VIC-3D ®, and can then use the resulting sparse-grid solution to compute a much more refined full Cartesian grid, which NLSE will use to complete the inversion. This example will demonstrate the validity of this approach.

The problem consists of a split-D probe of the type shown in Fig. 9.31, which was analyzed in [111, Section 6.6], scanned past a rectangular slot whose dimensions are 1 mm × 2 mm × 3 mm. The probe is normal to the surface of the workpiece, but is rotated about its axis by 22°. The two parameters that define the orientation of the probe are the Euler angles shown in Fig. 9.32. These two angles, together with the liftoff of the probe and the length and depth of the flaw, are the five parameters that are to be determined in the inverse problem. The ‘true’ values of the parameters are: LO = 3.5, Θ = 0, Ψ = 22, L = 2, and D = 3.

Fig. 9.31
figure 31

The split-D coil configuration

Fig. 9.32
figure 32

Illustrating the two Euler angles that define the orientation of the split-D probe in the test problem. The probe axis is orthogonal to the plane of Fig. 9.31. The input model data for the inversion are Ψ = 22 and Θ = 0

The five-dimensional parameter space ranges over: LO = [0, 9.99], Θ = [0, 9.99], Ψ = [0, 90], L = [1.7, 2.3], D = [2.7, 3.3], which at level 4 will be covered by 311 Chebyshev points. These are the values that are presented to VIC-3D ® to generate the sparse interpolation table. The first NLSE full Cartesian table consists of four points in each dimension, uniformly distributed over its range, which yields 1024 nodes for NLSE. The results of the first inversion test are shown in Table 9.7. The final column in the table gives the number of local minima generated by the 500 random starting points that coalesce into the global minimum. In order to improve the accuracy of the inversion of Ψ, we increase the number of NLSE nodes in this parameter to 6, and get the result shown as Test 2 in Table 9.7. Clearly, this parameter benefits from a denser nodal distribution because it covers a large range of [0,90].

Table 9.7 Inversion results for the five-dimensional problem at level 4 with two different interpolation tables for NLSE

The five-dimensional, level 4 inversion is quite good, but the important thing to note is the ‘leverage-value’ of each test in Table 9.7, namely 1024∕311 = 3.3 for the first, and 4.94 for the second. This is significant because the time to compute the 311 points completely dominates the overall inversion process, whereas the computation of function values for NLSE using the full Cartesian grid interpolation table derived from these 311 points is much faster.

10 Noisy Data and Uncertainty Propagation

In developing the stochastic inverse model of Chap. 6, we assumed that the input data were given, and that the only stochastic feature of the problem was the random vector of unknown parameters that were to be determined. Each component of the vector was uniformly distributed over a certain range.

Now we want to extend the model to include the effects of Gaussian random noise superimposed on the noise-free input (shown in Fig. 9.33) to the five-dimensional inverse problem discussed in the preceding section. We consider two levels of noise, one with an RMS value of 1 × 10−5 and the other with an RMS value of 3 × 10−5. Figure 9.34 illustrates a sample function of the former process superimposed on the noiseless data, and Fig. 9.35 illustrates a sample function from the latter.

Fig. 9.33
figure 33

Illustrating the noise-free input to the five-dimensional inverse problem. Left: resistance, Right: reactance

Fig. 9.34
figure 34

Illustrating the noise-free input to the five-dimensional inverse problem with a sample function of noise at an RMS level of 1 × 10−5 superimposed. Left: resistance, Right: reactance

Fig. 9.35
figure 35

Illustrating the noise-free input to the five-dimensional inverse problem with a sample function of noise at an RMS level of 3 × 10−5 superimposed. Left: resistance, Right: reactance

Our interest is in determining how uncertainty in the input data is propagated through the nonlinear least-squares filter into uncertainty in the output parameters. To accomplish this, we do a Monte Carlo analysis, in which the data of Fig. 9.33 are corrupted by ten samples from each noise source, as in Figs. 9.34 and 9.35, and then applied to NLSE using the interpolation table shown as Test 2 in Table 9.7.
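The corruption step of this Monte Carlo procedure is simple to sketch; we assume here that independent Gaussian noise of the stated RMS value is added to the real and imaginary parts of the impedance separately:

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(z, rms, n_samples=10):
    """Return n_samples noisy copies of the complex impedance data z."""
    shape = (n_samples,) + np.shape(z)
    return z + rng.normal(0.0, rms, shape) + 1j * rng.normal(0.0, rms, shape)

# Noise levels quoted in Fig. 9.36: RMS over the 0.00025 peak resistance.
print(1e-5 / 0.00025, 3e-5 / 0.00025)   # 0.04 0.12
```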

The results are shown in Fig. 9.36, which depicts the relative error, defined to be the ratio of the computed value to the ‘true’ value, except for Θ, which uses an artificial value of 1 × 10−8 for zero. The ‘Noise Level’ in the figure is the ratio of the RMS value of noise to the peak value of the noiseless resistance, 0.00025, in Fig. 9.33. The ten sample points for each reconstructed parameter are shown as small dots, and the mean of the results is shown as the large red dot. The large black dot is the true value of the parameter. Note that many of the sample points are hidden behind either of the large dots.

Fig. 9.36
figure 36

Result of the Monte Carlo test to determine the error in inversion due to random noise at two different RMS values in the input data. Top: RMS = 1 × 10−5; Bottom: RMS = 3 × 10−5. The ‘Noise Level’ is the ratio of the RMS value of noise to the peak value of the noiseless resistance, 0.00025, in Fig. 9.33

The mean values are in reasonable agreement with the true values, which suggests that the inversions are reasonable at these levels of input noise. The results for depth, D, may appear strange, in that the error in the mean value for the 1 × 10−5 noise source is greater than that for the 3 × 10−5 source, but keep in mind that the sensitivity coefficient for D in Table 9.7 is large, which, following our discussion in Chap. 6, indicates that the inversion process alone will introduce significant uncertainty in the estimated value of D. Our intuition is restored, however, when we look at the distribution of the errors over the ten samples: it is much larger for all five parameters when the noise level is 0.12. (Note that there is a small dot at ±0.1 in D for Noise Level = 0.12.)