
1 Introduction and Generalized Interpolation Problems

1.1 Introduction

Freeden (1981b) and independently Wahba (1981) developed spherical splines for interpolation and smoothing/approximation problems on the sphere, and it soon became clear that their idea could be extended to the more general case of harmonic splines introduced by Freeden (1981a). Since then, harmonic splines have been suggested for interpolation/approximation on regular surfaces as well as for the solution of boundary value problems. Convergence results have been found for both spherical splines and harmonic splines (cf. Freeden 1984a,b, 1987a,b), many different types of kernels have been introduced and even categorized (cf. Freeden and Schreiner 2014 and the references therein), and these splines have become an important tool for the geosciences with many applications (see, e.g., Freeden 1999; Freeden and Michel 2004; Freeden and Gerhards 2013 and the references therein).

Since splines lead to a system of linear equations whose matrix, in the case of harmonic splines, is densely populated, a major obstacle has always been the solution of this system. Any iterative solver requires fast summation methods for the splines to be efficient. On the sphere, there are several possible choices depending on the problem at hand: spherical panel clustering (cf. Freeden et al. 1998b; Freeden 1999; Fengler 2005 and the references therein), spherical FFT for gridded data points, or spherical NFFT for non-equispaced data (cf. Potts and Steidl 2003; Keiner et al. 2006).

The fast multipole method (FMM), which was introduced first in two and then in three dimensions by Rokhlin (1985), Greengard and Rokhlin (1987, 1988), and Greengard (1988), can be combined with harmonic splines for fast summation and therefore also with spherical splines. Such a combination is used in Glockner (2002) to solve problems of satellite geodesy with harmonic splines of one specific type, namely, the singularity kernel. This kernel is also considered here, because it is the most obvious choice, but we also consider the Abel-Poisson kernel and use the accelerated version of the FMM that was first introduced in Greengard and Rokhlin (1997) and Cheng et al. (1999). This approach has also been applied to the oblique boundary value problem of potential theory in Gutting (2007, 2012).

The outline of this paper is as follows: In Sect. 1.2 the generalized interpolation problems are introduced together with the necessary basic definitions. Section 2 summarizes the theory of harmonic splines, spline smoothing as well as spherical splines. In Sect. 3 we establish the connection between harmonic splines and the sums that can be computed by the fast multipole method. We summarize the adaptive construction of the decomposition of the computational domain and the kernel expansions that are required. We provide our version of the fast multipole algorithm for harmonic splines and briefly show the acceleration by the exponential translation introduced by Greengard and Rokhlin (1997) and Cheng et al. (1999) as well as our approach to use recombination strategies to reduce the number of translations. The parameters of the algorithm are investigated and optimized in Sect. 4. For the iterative solution of the system of linear equations, it is advisable to use some form of preconditioning. In Sect. 5 we suggest an additive Schwarz preconditioner based on an overlapping domain decomposition that is closely related to the decomposition required for the FMM. The results using the topography of the Earth as boundary surface are presented in Sect. 6, and we give a short conclusion with some outlook to the future in Sect. 7.

1.2 Generalized Interpolation Problems on Regular Surfaces

With regard to boundary value problems, we define the properties of the boundary surfaces under consideration, in particular the regularity of the surface.

Definition 1.

A \(C^{\left (k\right )}\)-regular surface \(\Sigma \subset \mathbb{R}^{3}\) is a surface in \(\mathbb{R}^{3}\) which fulfills the following properties:

  1. (i)

    \(\Sigma \) divides \(\mathbb{R}^{3}\) into the interior \(\Sigma ^{\text{int}}\) and the exterior \(\Sigma ^{\text{ext}}\), where \(\Sigma ^{\text{int}}\) is a bounded region and \(\Sigma ^{\text{ext}}\) is an unbounded region.

  2. (ii)

    The origin is contained in \(\Sigma ^{\text{int}}\).

  3. (iii)

    \(\Sigma \) is closed (and therefore compact) and free of double points.

  4. (iv)

    \(\Sigma \) is a \(C^{\left (k\right )}\)-surface, i.e., for each \(x \in \Sigma \) there exists a neighborhood \(U \subset \mathbb{R}^{3}\) of x such that \(\Sigma \cap U\) possesses a \(C^{\left (k\right )}\)-parametrization.

A \(C^{\left (k,\lambda \right )}\)-regular surface \(\Sigma \,\,\subset \mathbb{R}^{3}\) with \(\lambda \in \left (0,\,1\right )\) is a \(C^{\left (k\right )}\)-regular surface where every point \(x \in \Sigma \) possesses a neighborhood U such that \(\Sigma \cap U\) can locally be parameterized by a k-times \(\lambda\)-Hölder continuously differentiable parametrization.

The extension to \(C^{\left (k,\lambda \right )}\)-regular surfaces is required for the consideration of oblique derivative boundary value problems as in Gutting (2012); we do not need it in this paper.

Now we introduce the interpolation problem on a regular surface such that interpolation on the sphere can be interpreted as a special case.

Problem 1 (Interpolation on a regular surface).

Let \(\Sigma \) be a \(C^{\left (0\right )}\)-regular surface. Let a finite set of points \(\{x_{1},\ldots,x_{N}\} \subset \Sigma \) on the surface and data \(F_{i}\), \(i = 1,\ldots,N\), corresponding to these points be given. The aim is to find a function \(F \in C^{\left (0\right )}(\Sigma )\) such that \(F(x_{i}) = F_{i}\), \(i = 1,\ldots,N\).

Of course, it is also possible to search for F in other function spaces if additional/other properties of the interpolating function are desired. Often the data \(F_{i}\) are affected by errors, and strict interpolation is less desirable than a good approximation that reduces the errors. Then, the interpolation conditions are relaxed to \(F(x_{i}) \approx F_{i}\), \(i = 1,\ldots,N\), and F has to minimize some functional that balances closeness to the data and smoothness of F, usually with one or several parameters.

The Dirichlet boundary value problem for the exterior space of a regular surface (of sufficient smoothness) in its classical form is described as follows (cf., e.g., Freeden and Gerhards 2013 and the references therein):

Problem 2.

Let \(\Sigma \) be a \(C^{\left (k\right )}\)-regular surface with k ≥ 2. Let \(F \in C^{\left (0\right )}\left (\Sigma \right )\) be a given boundary function. The task is to find a potential U with the following three properties:

  1. (i)

    \(U \in C^{\left (0\right )}\left (\overline{\Sigma ^{\text{ext}}}\right ) \cap C^{\left (2\right )}\left (\Sigma ^{\text{ext}}\right )\) is harmonic in \(\Sigma ^{\text{ext}}\),

  2. (ii)

    U is regular at infinity, i.e., for \(\vert x\vert \rightarrow \infty \),

    $$\displaystyle{ \vert U(x)\vert = \mathcal{O}\left (\vert x\vert ^{-1}\right ), }$$
    (1)
    $$\displaystyle{ \vert \nabla U(x)\vert = \mathcal{O}\left (\vert x\vert ^{-2}\right ) }$$
    (2)

    uniformly with respect to all directions \(\tfrac{x} {\vert x\vert }\).

  3. (iii)

    \(U^{+} = F\) on \(\Sigma \), i.e., for all \(x \in \Sigma \),

    $$\displaystyle{U^{+} =\lim \limits _{ \begin{array}{c}\tau \rightarrow \,0\\ \tau \,>\,0\end{array}}U\left (x +\tau \nu \left (x\right )\right ) = F\left (x\right ),}$$

    where ν is the unit normal vector field on \(\Sigma \) directed into \(\Sigma ^{\text{ext}}\).

It is well known that under these conditions, the Dirichlet boundary value problem possesses a unique solution (see, e.g., Freeden and Michel 2004; Freeden and Gerhards 2013). Note that the special case of \(\Sigma = \Omega _{r}\), i.e., a sphere of radius r around the origin, is very well known, and for such results, we refer to Freeden et al. (1998a), Freeden and Michel (2004), Freeden and Gerhards (2013), and the references therein.

In this paper we focus on the discrete version of Problem 2, which requires only the values of the boundary function at a finite set of points on the surface.

Problem 3.

Let \(\Sigma \) be a \(C^{\left (k\right )}\)-regular surface with k ≥ 2. Let \(\{x_{1},\ldots,x_{N}\} \subset \Sigma \) be a discrete set of N points on the surface. For each point \(x_{i}\), let \(F_{i} = U(x_{i})\) be given, where \(i = 1,\ldots,N\).

The task is to determine the potential \(U \in C^{\left (0\right )}\left (\overline{\Sigma ^{\text{ext}}}\right ) \cap C^{\left (2\right )}\left (\Sigma ^{\text{ext}}\right )\) which is harmonic in \(\Sigma ^{\text{ext}}\) and regular at infinity (i.e., (1) and (2) hold) or an approximation \(U_{N}\) to it which fits the data, i.e., for \(i = 1,\ldots,N\),

$$\displaystyle{ U_{N}(x_{i}) = F_{i} = U(x_{i}). }$$

This formulation reduces the Dirichlet boundary value problem to a (generalized) interpolation problem. We also want to consider regional problems on parts of the surface \(\Sigma \), i.e., in Problem 3 all points \(x_{i} \in \Gamma \subsetneq \Sigma \) where \(\Gamma \) denotes a region on \(\Sigma \). Therefore, our approximation needs to be localizing, and we use harmonic splines as first suggested in Freeden (1981a, 1982a,b) (see also Freeden and Michel 2004; Freeden and Gerhards 2013 and the references therein for the use of harmonic splines in boundary value problems of geomathematics).

2 Preliminaries

Spherical harmonics, which we denote by \(Y_{n,m}\) (with degree \(n \in \mathbb{N}_{0}\), order \(m = -n,\ldots,n\)), are known to form a complete orthonormal basis of the space \(L^{2}(\Omega )\) of square-integrable functions on the unit sphere \(\Omega \) (see, e.g., Edmonds 1964; Varšalovič et al. 1988; Freeden and Gutting 2013). From an approximation point of view, it is important to note that the spherical harmonics \(\{Y _{n,m}\}_{n\in \mathbb{N}_{0},m=-n,\ldots,n}\) form a closed system in \(C(\Omega )\) and are closed and complete in \(L^{2}(\Omega )\). This allows the representation of square-integrable functions on any sphere \(\Omega _{R}\) of radius R > 0 by their Fourier series, where the Fourier coefficients of \(F \in L^{2}(\Omega _{R})\) are denoted by

$$\displaystyle{ F^{\wedge }(n,m) =\int _{ \Omega _{R}}F\left (x\right ) \frac{1} {R}Y _{n,m}\left ( \tfrac{x} {R}\right )d\omega _{R}\left (x\right ). }$$
(3)

2.1 Harmonic Splines

Harmonic splines are constructed in such a way that they form subspaces of the space of harmonic functions on a sphere situated inside the Earth, the so-called Runge (or Krarup) sphere (see Moritz 2014). Due to the Runge-Walsh approximation theorem, we can use these functions, which possess a larger domain of harmonicity, to approximate the solution of the problem which requires harmonicity only outside the Earth's surface (see Freeden 1999; Freeden and Michel 2004 for an extensive introduction of this technique). Harmonic splines have been introduced in Freeden (1981a, 1982a,b) and Shure et al. (1982), and their convergence properties are shown in Freeden (1987a,b). They are closely related to spherical splines which we discuss in Sect. 2.3. We briefly summarize the main ideas. For more details, the reader is referred to Freeden (1999) and Freeden and Michel (2004) and the references in these two books.

At first, we define the Runge sphere \(\Omega _{R}\) which is a sphere of radius R around the origin such that the exterior of the Runge sphere, i.e., \(\Omega _{R}^{\text{ext}}\), contains the exterior of the regular surface \(\Sigma \), i.e., \(\overline{\Sigma ^{\text{ext}}} \subset \Omega _{R}^{\text{ext}}\). See also Moritz (2014) where it is called Krarup sphere.

We define Sobolev spaces of the form \(\mathcal{H} = \mathcal{H}\left (\{A_{n}\};\overline{\Omega _{R}^{\text{ext}}}\right )\) using the Runge sphere \(\Omega _{R}\) and a sequence \(\left \{A_{n}\right \}_{n\in \mathbb{N}_{0}} \subset \mathbb{R}\) which satisfies the summability condition

$$\displaystyle{ \sum \limits _{n=0}^{\infty }\frac{2n + 1} {4\pi A_{n}^{2}} < \infty. }$$
(4)

Definition 2.

The Sobolev space \(\mathcal{H} = \mathcal{H}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right )\) is defined by

$$\displaystyle{ \mathcal{H} = \mathcal{H}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right ) = \overline{\mathcal{E}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right )}^{\left \|\cdot \right \| _{ \mathcal{H}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right )} }\,, }$$

where \(\mathcal{E}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right ) \subset C^{(\infty )}\left (\overline{\Omega _{R}^{\text{ext}}}\right )\) is the set of all functions that are harmonic in \(\Omega _{R}^{\text{ext}}\), infinitely often differentiable on the Runge sphere \(\Omega _{R}\) and regular at infinity (i.e., (1) and (2) hold) and whose Fourier coefficients F (n, m) with respect to \(L^{2}(\Omega _{R})\) (as defined in (3)) fulfill

$$\displaystyle{ \left \|F\right \|_{ \mathcal{H}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right )}^{2} =\sum \limits _{ n=0}^{\infty }\sum \limits _{ m=-n}^{n}A_{ n}^{2}\left (F^{\wedge }(n,m)\right )^{2} < \infty. }$$

\(\mathcal{H}\) is a Hilbert space with the inner product defined by

$$\displaystyle{ \langle F,G\rangle _{ \mathcal{H}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right )} =\sum \limits _{ n=0}^{\infty }\sum \limits _{ m=-n}^{n}A_{ n}^{2}\,F^{\wedge }(n,m)G^{\wedge }(n,m) }$$

for \(F,G \in \mathcal{H}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right )\).

It is well known (cf. Freeden 1999 and the references therein) that such a space possesses a so-called reproducing kernel (see Aronszajn 1950 for an overview on reproducing kernels in general).

Definition 3.

Let U be a nonempty set and \(\left (X,\langle \cdot,\cdot \rangle _{X}\right )\) be a separable Hilbert space of real-valued functions on U. Let \(\left \{B_{n}\right \}_{n\in \mathbb{N}_{0}}\) be a complete orthonormal system in \(\left (X,\langle \cdot,\cdot \rangle _{X}\right )\). Any function \(K: U \times U\longrightarrow \mathbb{R}\) of the form

$$\displaystyle{ K\left (x,y\right ) =\sum _{ n=0}^{\infty }K^{\wedge }(n)B_{ n}\left (x\right )B_{n}\left (y\right ) }$$
(5)

with \(x,y \in U\) and \(K^{\wedge }(n) \in \mathbb{R}\) for \(n \in \mathbb{N}_{0}\) is called an X-product kernel (briefly an X-kernel).

An X-kernel \(K\left (\cdot,\cdot \right ): U \times U\longrightarrow \mathbb{R}\) is called a reproducing kernel (or shortly repro-kernel) for \(\left (X,\langle \cdot,\cdot \rangle _{X}\right )\) if:

  1. (i)

    \(K\left (x,\cdot \right ) \in X\) for all \(x \in U\).

  2. (ii)

    \(\langle K\left (x,\cdot \right ),F\rangle _{X} = F\left (x\right )\) for all \(x \in U\) and all \(F \in X\).

If there exists such a repro-kernel in X, then X is called a reproducing kernel Hilbert space.

It is well known that a reproducing kernel is always unique, and its existence is equivalent to the boundedness of all evaluation functionals (cf. Aronszajn 1950). In the space \(\mathcal{H} = \mathcal{H}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right )\) with a sequence \(\{A_{n}\}\) satisfying (4), the repro-kernel (5) can be represented by its expansion in Legendre polynomials due to the addition theorem for spherical harmonics:

$$\displaystyle{ K_{\mathcal{H}}(x,y) =\sum \limits _{ n=0}^{\infty }\frac{2n + 1} {4\pi A_{n}^{2}} \frac{1} {\vert x\vert \vert y\vert }\left ( \frac{R^{2}} {\vert x\vert \vert y\vert }\right )^{n}P_{ n}\left ( \frac{x} {\vert x\vert }\cdot \frac{y} {\vert y\vert }\right ). }$$
(6)

We use these reproducing kernels to define harmonic splines.

Definition 4.

Let \(\left \{\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\right \} \subset \mathcal{H}^{{\ast}}\) be a set of N linearly independent bounded linear functionals on the reproducing kernel Hilbert space \(\mathcal{H}\).

Then any function S of the form

$$\displaystyle{S =\sum \limits _{ i=1}^{N}a_{ i}\mathcal{L}_{i}K_{\mathcal{H}}(\cdot,\cdot )}$$

with a set of so-called spline coefficients \(\left \{a_{1},\ldots,a_{N}\right \} \subset \mathbb{R}\) is called an \(\mathcal{H}\)-spline relative to \(\left \{\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\right \}\). The function space of all \(\mathcal{H}\)-splines relative to \(\left \{\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\right \}\) is denoted by \(\mathcal{S}_{\mathcal{H}}\left (\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\right )\).

It should be noted that by construction, any \(\mathcal{H}\)-spline is harmonic. The interpolating spline \(S^{F}\) for the function \(F \in \mathcal{H}\) has to fulfill the interpolation conditions

$$\displaystyle{ \mathcal{L}_{i}S^{F} = \mathcal{L}_{ i}F\quad \text{for }i = 1,\ldots,N. }$$
(7)

The interpolation conditions (7) can be rewritten as a system of linear equations for the spline coefficients a i :

$$\displaystyle{ \sum \limits _{i=1}^{N}a_{ i}\mathcal{L}_{i}\mathcal{L}_{j}K_{\mathcal{H}}(\cdot,\cdot ) = \mathcal{L}_{j}F,\quad j = 1,\ldots,N, }$$
(8)

whose corresponding matrix possesses the entries \(\mathcal{L}_{i}\mathcal{L}_{j}K_{\mathcal{H}}(\cdot,\cdot )\) and is symmetric and positive definite (for linear functionals \(\mathcal{L}_{1},\ldots,\mathcal{L}_{N} \in \mathcal{H}^{{\ast}}\) which are linearly independent).

Note that in this paper, we consider only evaluation functionals \(\mathcal{L}_{x}\), i.e., \(\mathcal{L}_{x}F = F(x)\) where \(x \in \overline{\Sigma ^{\text{ext}}}\). Furthermore, \(\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\) are given by \(\mathcal{L}_{i}F = F(x_{i})\) where \(x_{i} \in \Sigma \). Obviously, this suffices to treat Problem 3 as well as any interpolation problems. The concept of splines can be used for a more general class of problems. In the following theorem, we summarize the properties of \(\mathcal{H}\)-splines.
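For point-evaluation functionals, the matrix of (8) simply has the entries \(K_{\mathcal{H}}(x_{i},x_{j})\). The following minimal Python sketch (our own illustration with made-up data, using the closed form (10) of the singularity kernel introduced below) assembles and solves this system directly; the FMM of Sect. 3 replaces exactly this \(\mathcal{O}(N^{2})\) work inside an iterative solver.

```python
import numpy as np

R, h = 1.0, 0.9                          # Runge sphere radius and kernel parameter

def K(x, y):
    # singularity kernel, closed form (10)
    return 1.0 / (2 * np.pi * np.sqrt(np.dot(x, x) * np.dot(y, y)
                                      + h**2 * R**4 - 2 * h * R**2 * np.dot(x, y)))

rng = np.random.default_rng(1)
N = 200
pts = rng.normal(size=(N, 3))
pts *= (1.05 / np.linalg.norm(pts, axis=1))[:, None]     # points outside Omega_R
F = rng.normal(size=N)                                   # data L_j F = F(x_j)

A = np.array([[K(xi, xj) for xj in pts] for xi in pts])  # symmetric positive definite
a = np.linalg.solve(A, F)                                # spline coefficients a_i
```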

Theorem 1.

Let \(F \in \mathcal{H}\) and let \(\left \{\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\right \} \subset \mathcal{H}^{{\ast}}\) . Then the \(\mathcal{H}\) -spline interpolation problem with the interpolation conditions (7) is uniquely solvable, and its solution \(S^{F} \in \mathcal{S}_{\mathcal{H}}\left (\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\right )\) possesses the following properties:

  1. (i)

    \(S^{F}\) is the \(\mathcal{H}\) -orthogonal projection of F onto \(\mathcal{S}_{\mathcal{H}}\left (\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\right )\) .

  2. (ii)

    \(\left \|S^{F}\right \|_{\mathcal{H}}\leq \left \|F\right \|_{\mathcal{H}}\) .

  3. (iii)

    If \(G \in \mathcal{H}\) also satisfies the interpolation conditions (7) , then the first minimum property holds:

    $$\displaystyle{\left \|G\right \|_{\mathcal{H}}^{2} = \left \|S^{F}\right \|_{ \mathcal{H}}^{2} + \left \|G - S^{F}\right \|_{ \mathcal{H}}^{2},}$$

    i.e., \(S^{F}\) is the interpolating function of F in \(\mathcal{H}\) with minimal norm.

  4. (iv)

    If \(G \in \mathcal{H}\) also satisfies (7) and \(S \in \mathcal{S}_{\mathcal{H}}\left (\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\right )\) , then the second minimum property holds:

    $$\displaystyle{\left \|S - G\right \|_{\mathcal{H}}^{2} = \left \|S^{F} - G\right \|_{ \mathcal{H}}^{2} + \left \|S - S^{F}\right \|_{ \mathcal{H}}^{2}.}$$

For the proof and for further details on splines, the reader is referred to Freeden (1981a, 1987a, 1999) and Freeden et al. (1998a) and the references therein.

The choice \(A_{n} = h^{-\tfrac{n} {2} }\), h ∈ (0, 1), fulfills (4) and provides us with the reproducing kernel called Abel-Poisson kernel (see Fig. 1) which is given by

$$\displaystyle{ K_{\mathcal{H}}(x,y) = \frac{1} {4\pi } \frac{\vert x\vert ^{2}\vert y\vert ^{2} - h^{2}R^{4}} {\left (\vert x\vert ^{2}\vert y\vert ^{2} + h^{2}R^{4} - 2hR^{2}x \cdot y\right )^{\tfrac{3} {2} }}. }$$
(9)
Fig. 1 The profiles of the Abel-Poisson kernel with h = 0.93 (red), h = 0.9 (green), h = 0.8 (blue), \(\vert x\vert = \vert y\vert = 1\), R = 0.999, the angle between \(\tfrac{x} {\vert x\vert }\) and \(\tfrac{y} {\vert y\vert }\) running from \(-\pi\) to \(\pi\) (right: a more detailed view)

The sequence \(A_{n} = \left (n + \tfrac{1} {2}\right )^{\tfrac{1} {2} }h^{-\tfrac{n} {2} }\), h ∈ (0, 1), also satisfies (4) and leads to the singularity kernel (see Fig. 2) given by

$$\displaystyle{ K_{\mathcal{H}}(x,y) = \frac{1} {2\pi } \frac{1} {\left (\vert x\vert ^{2}\vert y\vert ^{2} + h^{2}R^{4} - 2hR^{2}x \cdot y\right )^{\tfrac{1} {2} }}. }$$
(10)
Fig. 2 The profile of the singularity kernel with h = 0.93 (red), h = 0.9 (green), h = 0.8 (blue), \(\vert x\vert = \vert y\vert = 1\), R = 0.999, the angle between \(\tfrac{x} {\vert x\vert }\) and \(\tfrac{y} {\vert y\vert }\) running from \(-\pi\) to \(\pi\) (right: a more detailed view)
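As a sanity check of the closed forms, one can compare (9) and (10) numerically with the truncated Legendre expansion (6). The following hedged sketch (function names and parameters are ours) does this for both kernel choices.

```python
import numpy as np
from scipy.special import eval_legendre

def kernel_expansion(x, y, R, h, kernel, nmax=200):
    """Truncated Legendre series (6); kernel = 'singularity' or 'abel-poisson'."""
    rx, ry = np.linalg.norm(x), np.linalg.norm(y)
    t = np.dot(x, y) / (rx * ry)
    n = np.arange(nmax + 1)
    if kernel == 'singularity':          # A_n^2 = (n + 1/2) h^(-n)
        c = h**n / (2 * np.pi)
    else:                                # A_n^2 = h^(-n)  (Abel-Poisson)
        c = (2 * n + 1) * h**n / (4 * np.pi)
    return np.sum(c / (rx * ry) * (R**2 / (rx * ry))**n * eval_legendre(n, t))

def singularity(x, y, R, h):             # closed form (10)
    return 1 / (2 * np.pi * np.sqrt(np.dot(x, x) * np.dot(y, y)
                                    + h**2 * R**4 - 2 * h * R**2 * np.dot(x, y)))

def abel_poisson(x, y, R, h):            # closed form (9)
    d2 = np.dot(x, x) * np.dot(y, y) + h**2 * R**4 - 2 * h * R**2 * np.dot(x, y)
    return (np.dot(x, x) * np.dot(y, y) - h**2 * R**4) / (4 * np.pi * d2**1.5)

rng = np.random.default_rng(0)
x = rng.normal(size=3); x *= 1.05 / np.linalg.norm(x)
y = rng.normal(size=3); y *= 1.10 / np.linalg.norm(y)
print(kernel_expansion(x, y, 1.0, 0.9, 'singularity'), singularity(x, y, 1.0, 0.9))
print(kernel_expansion(x, y, 1.0, 0.9, 'abel-poisson'), abel_poisson(x, y, 1.0, 0.9))
```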

By the Runge-Walsh approximation theorem and an extension of Helly’s theorem (cf. Yamabe 1950), it is possible to prove the existence of approximations fulfilling an interpolation condition. Moreover, convergence results for harmonic splines can be derived that show the convergence to the solution of the continuous boundary value problem (Problem 2) for an increasing density of data points, i.e., if the largest data gap goes to zero (cf. Freeden 1987a). For a detailed analysis, the reader is referred to Freeden and Michel (2004) and the literature therein.

In this paper we focus on two specific types of splines (using Abel-Poisson and singularity kernels) and propose methods to quickly compute the sums \(\sum a_{i}K_{\mathcal{H}}\left (x_{i},y_{j}\right )\) for many points. This can be used to solve the systems of linear equations (8) that occur in the solution of the interpolation problems using harmonic splines.

2.2 Smoothing Splines

For noisy data, i.e., \(F_{i} = U(x_{i}) +\varepsilon _{i} \approx U(x_{i})\), \(i = 1,\ldots,N\), in Problem 3, it makes no sense to solve an interpolation problem; instead, an approximation to U which smoothes the data is required (see Freeden 1981b; Wahba 1990; Freeden et al. 1997, 1998a for the case of spherical splines and Freeden 1981a, 1999 for the case of harmonic splines). This smoothing is achieved by minimizing the following functional

$$\displaystyle{\mu (S) =\sum \limits _{ i=1}^{N}\sum \limits _{ j=1}^{N}\left (\mathcal{L}_{ i}S - F_{i}\right )\mathbf{C}_{ij}\left (\mathcal{L}_{j}S - F_{j}\right ) +\beta \left \|S\right \|_{\mathcal{H}}^{2}}$$

in the reproducing kernel Hilbert space \(\mathcal{H} = \mathcal{H}\left (\{A_{n}\};\,\overline{\Omega _{R}^{\text{ext}}}\right )\). Here \(\mathbf{C} = (\mathbf{C}_{ij}) \in \mathbb{R}^{N\times N}\) denotes a positive definite matrix, and β > 0 is a constant. The following theorem of Freeden (1999) summarizes the existence and uniqueness of smoothing splines.

Theorem 2.

Let \(F_{i}\), \(i = 1,\ldots,N\), correspond to a set of linearly independent bounded linear functionals \(\mathcal{L}_{1},\ldots,\mathcal{L}_{N} \in \mathcal{H}^{{\ast}}\) .

Then there exists a unique element \(S \in \mathcal{S}_{\mathcal{H}}\left (\mathcal{L}_{1},\ldots,\mathcal{L}_{N}\right )\) such that μ(S) ≤μ(F) for all \(F \in \mathcal{H}\) and μ(S) = μ(F) if and only if S = F. This element is called the smoothing spline. Its spline coefficients \(a_{i}\), \(i = 1,\ldots,N\), are uniquely determined by the system of linear equations

$$\displaystyle{ \sum \limits _{i=1}^{N}a_{ i}\left (\mathcal{L}_{i}\mathcal{L}_{j}K_{\mathcal{H}}(\cdot,\cdot ) +\beta (\mathbf{C}^{-1})_{ ij}\right ) = \mathcal{L}_{j}F,\quad j = 1,\ldots,N. }$$
(11)

The matrix in (11) is the one in (8) plus \(\beta \mathbf{C}^{-1}\) and is still positive definite. If \(\mathbf{C}\) is the unit matrix, there is only the single smoothing parameter β; using a diagonal matrix \(\mathbf{C}\), it is possible to introduce individual weights for the data \(F_{i}\). In the general case, \(\mathbf{C}\) can include covariance information on the data.
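In code, this is a one-line change to the hypothetical sketch of Sect. 2.1 (again our own illustration): the system matrix of (8) is shifted by \(\beta \mathbf{C}^{-1}\).

```python
# continuing the sketch after (8): smoothing spline system (11)
beta = 1e-3
C = np.eye(N)                 # unit matrix: a single smoothing parameter beta
a_smooth = np.linalg.solve(A + beta * np.linalg.inv(C), F)
```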

The choice of the smoothing parameter(s) can be interpreted as the application of a parameter choice method in the regularization theory of ill-posed problems. We refer to Bauer and Lukas (2011) and Bauer et al. (2014) for an up-to-date overview of methods and literature in the field.

2.3 Spherical Splines

Spherical splines have been introduced by Freeden (1981b) and independently by Wahba (1981) and can be embedded naturally in the harmonic setting of Sect. 2.1. On the sphere, i.e., \(\Sigma = \Omega _{R}\), no additional Runge sphere is required, and the surface itself can be used to define the corresponding Sobolev spaces \(\mathcal{H} = \mathcal{H}\left (\{A_{n}\};\,\Omega _{R}\right )\) with respect to a sequence \(\{A_{n}\}\). The summability condition (4) remains the same, and if it is satisfied by \(\{A_{n}\}\), we again obtain a reproducing kernel. It should be remarked that on the sphere, this reproducing kernel is a radial basis function depending only on the distance \(\vert x - y\vert\) of its two arguments \(x,y \in \Omega _{R}\). Moreover, it is possible to consider Sobolev spaces that possess a locally supported reproducing kernel, i.e., \(K_{\mathcal{H}}(x,y) = 0\) if \(\vert x - y\vert\) is larger than a threshold value (see Freeden et al. 1998a and also the categorization in Freeden and Schreiner 2014).

For our purposes, the problem of needing a fast method to evaluate sums \(\sum a_{i}K_{\mathcal{H}}\left (x_{i},y_{j}\right )\) and to solve the system (8) for the spline coefficients remains the same as before. However, there exists a wider spectrum of fast summation methods on the sphere such as panel clustering (see Freeden et al. 1998b; Freeden 1999; Fengler 2005 and the references therein), spherical FFT for gridded data points or spherical NFFT for non-equispaced data (cf., e.g., Potts and Steidl 2003; Keiner et al. 2006).

In Freeden et al. (1998a), a more general construction is introduced that is orthogonal to the spherical harmonics of degrees 0 to m and can be used for approximation by combining it with the corresponding spherical harmonics. A combined interpolation and smoothing approach is included as well as spline-exact numerical integration formulas on the sphere.

Convergence results for spherical splines are shown in Freeden (1984a,b) if the size of the largest data gap goes to zero. We refer the reader to Freeden et al. (1998a) and Michel (2013) and the references therein for a detailed introduction. It should be noted that the construction has also been carried over to the three-dimensional ball (see Michel 2013, 2014b and the references therein).

3 The Fast Multipole Method for Splines

In the case of harmonic splines, the interpolation conditions lead to a system of linear equations with a dense matrix whose dimension equals the number of data points. Thus, the matrix can be large, and the solution of the corresponding system of linear equations becomes difficult. For spherical splines, we are confronted with the same situation if we restrict ourselves to the Abel-Poisson kernel and the singularity kernel. Other (locally supported) spherical kernels can reduce this problem to some degree by leading to band matrices.

Each reproducing kernel corresponding to a space \(\mathcal{H}\) defined by a summable sequence \(\{A_{n}\}\) can be written as an infinite Legendre expansion as in (6). For certain special cases such as the singularity kernel (10) and the Abel-Poisson kernel (9), the expansion can be reduced to an elementary function. These two reproducing kernels are closely related to the single pole \(\tfrac{1} {\vert x-y\vert }\). Because of this connection, the fast multipole method (FMM), which has been introduced by Rokhlin (1985), Greengard and Rokhlin (1987, 1988), and Greengard (1988), allows the fast summation of harmonic splines, i.e., of the sum \(\sum a_{i}K_{\mathcal{H}}(x_{i},\cdot )\). Thus, the FMM also accelerates the matrix-vector products occurring in an iterative solver for (8).

The two main ideas of the FMM can be described as:

  • Subdivision of the computational domain into hierarchical sets of nested cubes that are organized in an octree data structure,

  • Interaction of cubes instead of single points by summarizing all information of a cube in coefficients of inner/outer harmonics expansions.

This means that cubes interact in two ways: directly, i.e., the kernel is evaluated for all points in one cube combined with all points in another cube, or approximately, i.e., by translation of the expansion coefficients. The key is to apply the approximation as often as possible and on the coarsest possible level of the tree data structure of cubes. Direct evaluation is used only for the closest cubes where the approximation is not applicable.

The FMM has seen various updates that increase its efficiency (cf., e.g., White and Head-Gordon 1996; Greengard and Rokhlin 1997; Cheng et al. 1999). We summarize our implementation and show the application of the FMM to harmonic and spherical splines. For a more detailed analysis, the reader is referred to Gutting (2007).

3.1 Kelvin Transform of Reproducing Kernels

The link between our reproducing kernels and the fundamental solution of the Laplace equation is the Kelvin transform, which can be interpreted as a reflection about a sphere of radius R around the origin. It is well known from textbooks on potential theory (cf., e.g., Kellogg 1967; Freeden and Gerhards 2013).

Definition 5.

Let \(\Gamma \subseteq \mathbb{R}^{3}\) be a domain, \(W: \Gamma \longrightarrow \mathbb{R}\) a function. Let the reflection of \(\Gamma \) on the sphere \(\Omega _{R}\) be given by

$$\displaystyle{ \Gamma ^{\text{KT}} = \left \{x^{\text{KT}} \in \mathbb{R}^{3}: \frac{R^{2}} {\vert x^{\text{KT}}\vert ^{2}}x^{\text{KT}} = x \in \Gamma \right \}. }$$

The function

$$\displaystyle\begin{array}{rcl} & & W^{\text{KT}}: \Gamma ^{\text{KT}}\longrightarrow \mathbb{R}, {}\\ & & \qquad \quad x^{\text{KT}}\mapsto W^{\text{KT}}\left (x^{\text{KT}}\right ) = \frac{R} {\vert x^{\text{KT}}\vert }W\left ( \frac{R^{2}} {\vert x^{\text{KT}}\vert ^{2}}x^{\text{KT}}\right ) = \frac{R} {\vert x^{\text{KT}}\vert }W(x), {}\\ \end{array}$$

is called the Kelvin transform of W with respect to the sphere of radius R.

The Kelvin transform is applied to the reproducing kernels with respect to one argument (the other is kept fixed). The Kelvin transform \(K_{\mathcal{H}}^{\text{KT}}\left (x,y^{\text{KT}}\right )\) of the singularity kernel (10) can be computed by its expansion

$$\begin{array}{lll} K_{\mathcal{H}}(x,y)& = \sum \limits _{n=0}^{\infty } \frac{h^{n}} {n + \tfrac{1} {2}} \frac{2n + 1} {4\pi \vert x\vert \vert y\vert }\left ( \frac{R^{2}} {\vert x\vert \vert y\vert }\right )^{n}P_{ n}\left ( \frac{x} {\vert x\vert }\cdot \frac{y} {\vert y\vert }\right ) \\ & = \frac{1} {2\pi \vert y\vert }\sum \limits _{n=0}^{\infty }\frac{\left (h\vert y^{\text{KT}}\vert \right )^{n}} {\vert x\vert ^{n+1}} P_{n}\left ( \frac{x} {\vert x\vert }\cdot \frac{y^{\text{KT}}} {\vert y^{\text{KT}}\vert }\right ) \\ & = \frac{1} {2\pi \vert y\vert } \frac{1} {\vert x - hy^{\text{KT}}\vert } = \frac{\vert y^{\text{KT}}\vert } {R} K_{\mathcal{H}}^{\text{KT}}\left (x,y^{\text{KT}}\right ),{}\end{array}$$
(12)

where \(y^{\text{KT}} = \frac{R^{2}} {\vert y\vert ^{2}} y\) and

$$\displaystyle{ K_{\mathcal{H}}^{\text{KT}}\left (x,y^{\text{KT}}\right ) = \frac{1} {2\pi R} \frac{1} {\vert x - hy^{\text{KT}}\vert }. }$$
(13)

The Kelvin transform \(K_{\mathcal{H}}^{\text{KT}}\left (x,y^{\text{KT}}\right )\) of the Abel-Poisson kernel (9) is given by

$$\begin{array}{lll} K_{\mathcal{H}}(x,y)& = \frac{1} {4\pi } \frac{\vert x\vert ^{2}\vert y\vert ^{2} - h^{2}R^{4}} {\left (\vert x\vert ^{2}\vert y\vert ^{2} + h^{2}R^{4} - 2hR^{2}x \cdot y\right )^{\tfrac{3} {2} }} {}\\ & = \frac{\vert y^{\text{KT}}\vert } {R} \frac{1} {4\pi R} \frac{\vert x\vert ^{2} - h^{2}\vert y^{\text{KT}}\vert ^{2}} {\vert x - hy^{\text{KT}}\vert ^{3}} = \frac{\vert y^{\text{KT}}\vert } {R} K_{\mathcal{H}}^{\text{KT}}\left (x,y^{\text{KT}}\right ), {}\\ \end{array}$$

which is related to (13) by

$$\displaystyle{ K_{\mathcal{H}}^{\text{KT}}\left (x,y^{\text{KT}}\right ) = \frac{1} {2\pi R}\left (-x \cdot \nabla _{x} -\tfrac{1} {2}\text{Id}\right ) \frac{1} {\vert x - hy^{\text{KT}}\vert }. }$$
(14)

We can summarize both (13) and (14) by the use of the operator D x such that

$$\displaystyle{ K_{\mathcal{H}}^{\text{KT}}\left (x,y^{\text{KT}}\right ) = \frac{1} {2\pi R}D_{x} \frac{1} {\vert x - hy^{\text{KT}}\vert }, }$$
(15)

where \(D_{x} = \text{Id}\) (singularity kernel) or \(D_{x} = -x \cdot \nabla _{x} -\tfrac{1} {2}\text{Id}\) (Abel-Poisson kernel).
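A quick numerical check (with made-up points; all names are ours) confirms that applying \(D_{x}\) to the single pole reproduces the closed form of the Abel-Poisson Kelvin transform, cf. (14) and (15).

```python
import numpy as np

R, h = 1.0, 0.9
rng = np.random.default_rng(2)
x = rng.normal(size=3); x *= 1.2 / np.linalg.norm(x)
y = rng.normal(size=3); y *= 1.3 / np.linalg.norm(y)
yKT = R**2 / np.dot(y, y) * y            # Kelvin-transformed point
r = np.linalg.norm(x - h * yKT)

# closed form of K^KT for the Abel-Poisson kernel
closed = (np.dot(x, x) - h**2 * np.dot(yKT, yKT)) / (4 * np.pi * R * r**3)

# (14): apply D_x = -x . grad_x - Id/2 to 1/|x - h y^KT|
grad = -(x - h * yKT) / r**3
via_operator = (-np.dot(x, grad) - 0.5 / r) / (2 * np.pi * R)

print(closed, via_operator)              # agree up to rounding
```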

3.2 Adaptive Decomposition of the Domain

At first, a bounding cube is determined which is large enough to contain all points \(x_{i}\) and targets \(hy_{j}^{\text{KT}} = h \frac{R^{2}} {\vert y_{j}\vert ^{2}} y_{j}\). This single bounding cube forms level 0 of the octree structure and is subdivided into eight equally sized cubes of half its edge length which then form level 1. Each cube is adaptively divided into nested cubes where a cube of level l has half the edge length of a cube of level l − 1 as proposed, e.g., by Cheng et al. (1999). The adaptive splitting is achieved by sorting the points and targets into the currently available cubes. If a cube contains more than the prescribed maximal number of points or targets m, it is split into eight cubes of the next level, and all its points/targets are redistributed into these eight cubes. All the cubes carry information about other surrounding cubes (their so-called neighbors) which is required for the application of the kernel expansion.
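The following minimal sketch (our own, deliberately simplified data structure) illustrates the adaptive splitting rule; neighbor information and the interaction lists of Definition 7 below are omitted.

```python
import numpy as np

class Cube:
    def __init__(self, center, edge, level):
        self.center = np.asarray(center, float)
        self.edge = edge                         # side length of the cube
        self.level = level
        self.points = []                         # entries stored in a leaf
        self.children = []                       # eight children after a split

def insert(cube, p, m):
    """Sort p into the octree; split a leaf once it holds more than m entries."""
    if cube.children:
        idx = int(np.dot((p >= cube.center).astype(int), [1, 2, 4]))
        insert(cube.children[idx], p, m)
        return
    cube.points.append(p)
    if len(cube.points) > m:                     # overfull leaf: split, redistribute
        offsets = [np.array([(i >> k) & 1 for k in range(3)]) - 0.5
                   for i in range(8)]
        cube.children = [Cube(cube.center + o * cube.edge / 2, cube.edge / 2,
                              cube.level + 1) for o in offsets]
        pts, cube.points = cube.points, []
        for q in pts:
            insert(cube, q, m)

root = Cube([0, 0, 0], 2.0, 0)                   # level-0 bounding cube
rng = np.random.default_rng(3)
for p in rng.uniform(-1, 1, size=(1000, 3)):
    insert(root, p, m=40)
```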

The necessary vocabulary (see also Greengard and Rokhlin 1997; Cheng et al. 1999) is summarized in the following definitions.

Definition 6.

  • A cube C is called a child of the cube B if C results from a single subdivision of B, which in turn is named the parent of C.

  • A cube that is not further subdivided is called childless or a leaf.

  • Cubes are said to be neighbors if they are of the same size (same level) and share at least one boundary point. Each cube is a neighbor of itself.

  • If two cubes are at the same level but are not neighbors, they are called well separated, i.e., there is at least one cube of their size between them.

Each cube carries the relevant information about other cubes in four lists as suggested by Cheng et al. (1999).

Definition 7.

  • List 1 of a childless cube X contains all childless cubes directly adjacent to X. List 1 is nonempty only if X is a leaf; in that case it always contains at least X itself.

  • List 2 of a cube X consists of all children of neighbors of the parent cube of X which are well separated from X. The cube X does not need to be childless.

  • Children of neighbors of the leaf X (or smaller cubes descending from neighbors of X) which do not have any point in common with X form list 3. Their parents have to be adjacent to X. If X is not childless, then list 3 is empty.

  • List 4 consists of childless cubes which are neighbors of the parent cube of X, but these childless cubes are not adjacent to X.

Notice the following observations:

  1. (i)

    List 1 is the list of all neighbors.

  2. (ii)

    All cubes in list 2 of a cube X are of the same size as X.

  3. (iii)

    The elements of list 3 are all smaller than X, and the distance between them and X is at least their side length and at most the side length of X.

  4. (iv)

    List 4 of a cube X only contains cubes that are larger than X. They are separated from X by a distance that is at least the side length of X and at most their own edge length.

  5. (v)

    All members of list 1 and list 4 are leaves, and list 1 and list 3 of a cube X remain empty if X is not childless.

After finishing the construction of the octree and sorting all points and targets into cubes, the algorithm removes childless cubes that contain neither points nor targets.

3.3 Single Pole Expansion

In addition to the decomposition of the domain, the other part of the FMM is the expansion of the single pole. This is achieved similarly to (12) by

$$\displaystyle\begin{array}{rcl} \frac{1} {\vert x - y\vert }& =& \frac{1} {\vert y - x_{0} - (x - x_{0})\vert } \\ & =& \sum \limits _{n=0}^{\infty } \frac{\vert x - x_{0}\vert ^{n}} {\vert y - x_{0}\vert ^{n+1}}P_{n}\left ( \frac{y - x_{0}} {\vert y - x_{0}\vert }\cdot \frac{x - x_{0}} {\vert x - x_{0}\vert }\right ) \\ & =& \sum \limits _{n=0}^{\infty }\sum \limits _{ m=-n}^{n}I_{ n,m}^{{\ast}}(x - x_{ 0})O_{n,m}(y - x_{0}),{}\end{array}$$
(16)

where \(\vert y - x_{0}\vert > \vert x - x_{0}\vert \) for the expansion center \(x_{0} \in \mathbb{R}^{3}\). The star ∗ in (16) denotes the complex conjugate. We use the (complex-valued) outer and inner harmonics for \(n \in \mathbb{N}_{0}\), \(m = -n,\ldots,n\):

$$\begin{array}{lll} O_{n,m}(x)& = \sqrt{ \frac{4\pi } {2n + 1}} \frac{\sqrt{(n + m)!(n - m)!}} {\vert x\vert ^{n+1}} Y _{n,m}\left ( \tfrac{x} {\vert x\vert }\right ) {}\\ & = (-1)^{m}\frac{(n - m)!} {\vert x\vert ^{n+1}} P_{n,m}(\cos \vartheta )e^{im\varphi }, {}\\ I_{n,m}(x)& = \sqrt{ \frac{4\pi } {2n + 1}} \frac{\vert x\vert ^{n}} {\sqrt{(n + m)!(n - m)!}}Y _{n,m}\left ( \tfrac{x} {\vert x\vert }\right ) {}\\ & = (-1)^{m} \frac{\vert x\vert ^{n}} {(n + m)!}P_{n,m}(\cos \vartheta )e^{im\varphi }, {}\\ \end{array}$$

where \(\vartheta,\varphi\) are the spherical coordinates of \(\tfrac{x} {\vert x\vert }\). Note that \(P_{n,m}\) are the associated Legendre functions

$$\displaystyle{P_{n,m}(t) = \left (1 - t^{2}\right )^{\frac{m} {2} } \frac{d^{m}} {dt^{m}}P_{n}(t),\quad t \in [-1,1],\quad n \geq m \geq 0.}$$

The symmetry relation \(P_{n,-m}(t) = (-1)^{m}\frac{(n-m)!} {(n+m)!}P_{n,m}(t)\) extends them for negative orders (cf., e.g., Edmonds 1964; Biedenharn and Louck 1981).
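A hedged numerical check of (16) (all function names are ours): we truncate the expansion at degree p and compare with the single pole. SciPy's complex spherical harmonics are used; the Condon-Shortley phase conventions cancel in the product \(I^{\ast}O\).

```python
import numpy as np
from math import factorial
from scipy.special import sph_harm

def _angles(v):
    r = np.linalg.norm(v)
    return r, np.arctan2(v[1], v[0]), np.arccos(v[2] / r)  # r, azimuth, polar

def inner_harmonic(n, m, v):
    r, az, pol = _angles(v)
    return (np.sqrt(4 * np.pi / (2 * n + 1)) * r**n
            / np.sqrt(factorial(n + m) * factorial(n - m)) * sph_harm(m, n, az, pol))

def outer_harmonic(n, m, v):
    r, az, pol = _angles(v)
    return (np.sqrt(4 * np.pi / (2 * n + 1))
            * np.sqrt(factorial(n + m) * factorial(n - m))
            / r**(n + 1) * sph_harm(m, n, az, pol))

x0 = np.zeros(3)
x = np.array([0.2, -0.1, 0.15])          # requires |x - x0| < |y - x0|
y = np.array([0.8, 0.5, -0.9])
p = 25
s = sum(np.conj(inner_harmonic(n, m, x - x0)) * outer_harmonic(n, m, y - x0)
        for n in range(p + 1) for m in range(-n, n + 1))
print(s.real, 1 / np.linalg.norm(x - y)) # agree up to truncation error
```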

It is well known that translation theorems for these outer and inner harmonics allow the expansion center to be shifted (see, e.g., Epton and Dembart 1995 for a detailed derivation).

Theorem 3 (Translation Theorem for Outer Harmonics).

Let \(x,y \in \mathbb{R}^{3}\) such that |x| > |y|. Then the outer harmonic of degree \(n \in \mathbb{N}_{0}\) and order \(m \in \mathbb{Z}\) , − n ≤ m ≤ n, at x − y can be expanded in terms of inner and outer harmonics as follows:

$$\displaystyle\begin{array}{rcl} O_{n,m}(x - y)& =& \sum \limits _{n'=0}^{\infty }\sum \limits _{ m'=-n'}^{n'}I_{ n',m'}^{{\ast}}(y)O_{ n+n',m+m'}(x) \\ & =& \sum \limits _{n'=n}^{\infty }\sum \limits _{ m'=-n'}^{n'}I_{ n'-n,m'-m}^{{\ast}}(y)O_{ n',m'}(x).{}\end{array}$$
(17)

Note that in (17), we make use of the convention that \(I_{n,m} = 0\) if \(\vert m\vert > n\).

Theorem 4 (Translation Theorem for Inner Harmonics).

Let \(x,y \in \mathbb{R}^{3}\) . Then the inner harmonic of degree \(n \in \mathbb{N}_{0}\) and order \(m \in \mathbb{Z}\) , − n ≤ m ≤ n, at x − y can be expanded in a finite sum of inner harmonics

$$\displaystyle{ I_{n,m}(x - y) =\sum \limits _{ n'=0}^{n}\sum \limits _{ m'=-n'}^{n'}(-1)^{n'}I_{ n',m'}(y)I_{n-n',m-m'}(x). }$$

For orders with \(\vert m\vert > n\), we again use the convention \(I_{n,m} = 0\).

By applying (17), Theorem 3 allows the translation of an outer harmonics expansion with expansion center \(x_{0}\), such as

$$\displaystyle{ F(x) =\sum \limits _{ n=0}^{\infty }\sum \limits _{ m=-n}^{n}F_{ x_{0}}^{\wedge,\text{O}}(n,m)O_{ n,m}(x - x_{0}) }$$
(18)

which converges uniformly for \(x \in \Omega _{r_{0}}^{\text{ext}}(x_{ 0})\) with some \(r_{0} > 0\); \(\Omega _{r_{0}}^{\text{ext}}(x_{ 0})\) denotes the exterior of the sphere of radius \(r_{0}\) around \(x_{0}\). The outer harmonics series resulting from the translation possesses the expansion center \(x_{1}\) and the coefficients

$$\displaystyle{ F_{x_{1}}^{\wedge,\text{O}}(n',m') =\sum \limits _{ n=0}^{n'}\sum \limits _{ m=-n}^{n}F_{ x_{0}}^{\wedge,\text{O}}(n,m)I_{ n'-n,m'-m}^{{\ast}}(x_{ 0} - x_{1}). }$$
(19)

The expansion converges uniformly for \(x \in \Omega _{r_{1}}^{\text{ext}}(x_{ 1})\) where \(\Omega _{r_{1}}^{\text{ext}}(x_{ 1}) \subset \Omega _{r_{0}}^{\text{ext}}(x_{ 0})\). This translation is called multipole-to-multipole translation (M2M).

By Theorem 3, we also find that the outer harmonics expansion can be translated into an inner harmonics series centered around x 2 which converges uniformly for \(x \in \Omega _{r_{2}}^{\text{int}}(x_{ 2})\) if the new ball of convergence is situated completely in \(\Omega _{r_{1}}^{\text{ext}}(x_{ 1})\), i.e., \(\overline{\Omega _{r_{1}}^{\text{int}}(x_{ 1})} \cap \overline{\Omega _{r_{2}}^{\text{int}}(x_{ 2})} = \varnothing \). The resulting coefficients of the inner harmonics expansion are

$$\displaystyle{ F_{x_{2}}^{\wedge,\text{I}}(n',m') =\sum \limits _{ n=0}^{\infty }\sum \limits _{ m=-n}^{n}F_{ x_{1}}^{\wedge,\text{O}}(n,m)(-1)^{n'+m}O_{ n+n',m'-m}^{{\ast}}(x_{ 2} - x_{1}) }$$
(20)

and this translation is named multipole-to-local translation (M2L).

Furthermore, Theorem 4 lets us shift the expansion center of such inner harmonics expansions to a new center \(x_{3}\); the resulting expansion possesses the coefficients

$$\displaystyle{ F_{x_{3}}^{\wedge,\text{I}}(n',m') =\sum \limits _{ n=n'}^{\infty }\sum \limits _{ m=-n}^{n}F_{ x_{2}}^{\wedge,\text{I}}(n,m)I_{ n-n',m-m'}(x_{3} - x_{2}). }$$
(21)

The resulting expansion converges uniformly for \(x \in \Omega _{r_{3}}^{\text{int}}(x_{ 3}) \subset \Omega _{r_{2}}^{\text{int}}(x_{ 2})\). This translation step is called the local-to-local translation (L2L). For further details, we refer to Gutting (2007) and the references therein, in particular Epton and Dembart (1995).
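As an illustration, the M2M shift (19) is a plain double sum over the coefficients. The sketch below (reusing the hypothetical inner_harmonic helper and NumPy import from the sketch in Sect. 3.3) transcribes it literally, without the rotation-based acceleration of Sect. 3.5.

```python
def m2m(F_x0, p, x0, x1):
    """Shift multipole coefficients F_x0[(n, m)] from center x0 to x1, cf. (19)."""
    return {(n_, m_): sum(F_x0[(n, m)]
                          * np.conj(inner_harmonic(n_ - n, m_ - m, x0 - x1))
                          for n in range(n_ + 1) for m in range(-n, n + 1)
                          if abs(m_ - m) <= n_ - n)  # convention: I_{n,m} = 0 for |m| > n
            for n_ in range(p + 1) for m_ in range(-n_, n_ + 1)}
```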

3.4 The Fast Multipole Algorithm

Before kernel expansions can be translated, we have to compute multipole expansions, more precisely a first set of expansion coefficients for each cube containing any points. Thus, only the part of the spline related to a single cube X is considered, i.e., the kernel functions \(K_{\mathcal{H}}(x_{i},\cdot )\), where \(x_{i} \in X\):

$$\displaystyle{ F =\sum \limits _{ \begin{array}{c}i=1 \\ x_{i}\in X\end{array}}^{N}a_{ i}K_{\mathcal{H}}(x_{i},\cdot ) =\sum \limits _{ \begin{array}{c}i=1 \\ x_{i}\in X\end{array}}^{N}a_{ i}\left (\frac{\vert y^{\text{KT}}\vert } {R} \frac{1} {2\pi R}D_{x} \frac{1} {\vert x - hy^{\text{KT}}\vert }\right )\Bigg\vert _{x=x_{i}}. }$$

We find the following expansion for \(\vert hy^{\text{KT}} - x_{0}\vert > \vert x_{i} - x_{0}\vert \), \(x_{i} \in X\); i.e., if \(x_{0}\) is the center of the cube X, the targets \(hy^{\text{KT}}\) and the cube X need to fulfill a distance requirement:

$$\displaystyle\begin{array}{rcl} F& =& \frac{\vert y^{\text{KT}}\vert } {R} \sum \limits _{\begin{array}{c}i=1 \\ x_{i}\in X\end{array}}^{N}a_{ i}\left ( \frac{1} {2\pi R}D_{x}\sum \limits _{n=0}^{\infty }\sum \limits _{ m=-n}^{n}I_{ n,m}^{{\ast}}(x - x_{ 0})O_{n,m}(hy^{\text{KT}} - x_{ 0})\right )\Bigg\vert _{x=x_{i}} \\ & =& \frac{\vert y^{\text{KT}}\vert } {R} \sum \limits _{n=0}^{\infty }\sum \limits _{ m=-n}^{n}F_{ x_{0}}^{\wedge,O}(n,m)O_{ n,m}(hy^{\text{KT}} - x_{ 0}) {}\end{array}$$
(22)

where the multipole coefficients \(F_{x_{0}}^{\wedge,O}(n,m)\) of the cube X are given by

$$\displaystyle{ F_{x_{0}}^{\wedge,O}(n,m) =\sum \limits _{ \begin{array}{c}i=1\\ x_{ i}\in X\end{array}}^{N}a_{ i}\left ( \frac{1} {2\pi R}D_{x}I_{n,m}^{{\ast}}(x - x_{ 0})\right )\Bigg\vert _{x=x_{i}}. }$$
(23)

These coefficients can be translated to other cubes via the relations (19), (20), and (21) as long as the distance requirements are fulfilled by the construction of the decomposition of the domain into nested cubes. This first step is called the point-to-multipole (P2M) step. Obviously, the infinite sum in (22) has to be truncated at degree p. This parameter essentially determines the accuracy of the algorithm.
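A sketch of the P2M step for the singularity kernel (\(D_{x} = \text{Id}\)), again with our hypothetical helpers from Sect. 3.3; the final lines check the truncated expansion (22) against direct summation of the kernel, using (12).

```python
def p2m(points, coeffs, x0, p, R):
    """Multipole coefficients (23) of one cube with center x0 (singularity kernel)."""
    return {(n, m): sum(a / (2 * np.pi * R) * np.conj(inner_harmonic(n, m, xi - x0))
                        for a, xi in zip(coeffs, points))
            for n in range(p + 1) for m in range(-n, n + 1)}

R, h, p = 1.0, 0.9, 20
rng = np.random.default_rng(4)
xs = rng.uniform(-0.1, 0.1, size=(5, 3))       # sources in a small cube at the origin
a = rng.normal(size=5)
y = np.array([1.0, 0.2, 0.1])
yKT = R**2 / np.dot(y, y) * y

F = p2m(xs, a, np.zeros(3), p, R)
val = np.linalg.norm(yKT) / R * sum(           # evaluate (22) at the target
    F[(n, m)] * outer_harmonic(n, m, h * yKT)
    for n in range(p + 1) for m in range(-n, n + 1))
direct = sum(ai / (2 * np.pi * np.linalg.norm(y) * np.linalg.norm(xi - h * yKT))
             for ai, xi in zip(a, xs))
print(val.real, direct)                        # agree up to truncation error
```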

At the end of the fast multipole cycle, i.e., after many M2M-, M2L-, and L2L-translations, each cube Y possesses an inner harmonics expansion centered around the center of the cube. This expansion has to be evaluated at the targets contained in Y, i.e., the local-to-targets (L2T) step is performed:

$$\displaystyle{ \mathcal{L}_{j}F = F(y_{j}) = \left (\frac{\vert y^{\text{KT}}\vert } {R} \sum \limits _{n=0}^{p}\sum \limits _{ m=-n}^{n}F_{ x_{0}}^{\wedge,I}(n,m)I_{ n,m}(hy^{\text{KT}} - x_{ 0})\right )\Bigg\vert _{y=y_{j}}, }$$
(24)

where the variable y enters via \(y^{\text{KT}} = \frac{R^{2}} {\vert y\vert ^{2}} y\).

In Algorithm 1 we briefly recapitulate the fast multipole algorithm (see, e.g., Carrier et al. 1988; Cheng et al. 1999 or Gutting 2007 for our specific implementation).

Algorithm 1 Fast Multipole Algorithm
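The listing itself is not reproduced here; the following outline is our reconstruction of the cycle from the text and the cited literature (cf. Carrier et al. 1988; Cheng et al. 1999), not the authors' original pseudocode.

```python
# Sketch of the accelerated fast multipole cycle (our reconstruction):
#
# 1. Setup: build the adaptive octree (Sect. 3.2), sort all points x_i and
#    targets h*y_j^KT into cubes, and build lists 1-4 for every cube.
# 2. Upward pass: P2M (23) on every leaf containing points, then M2M (19)
#    from children to parents up to the coarsest level.
# 3. Downward pass, per cube: translate the multipole expansions of all
#    list-2 cubes via rotation, M2X (27), X2X (28), X2L (30), and inverse
#    rotation (replacing M2L (20)); treat list-3/list-4 interactions with
#    the adaptive variants (cf. Carrier et al. 1988); pass the accumulated
#    local expansion down to the children via L2L (21).
# 4. L2T (24): evaluate the local expansion of each leaf at its targets.
# 5. Near field: evaluate K_H(x_i, y_j) directly for the cubes in list 1.
```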

For the computation of the spline coefficients of the smoothing splines of Sect. 2.2, we have to consider the system of linear equations (11) instead of (8). This means that we have to add \(\beta \sum \limits _{i=1}^{N}a_{i}(\mathbf{C}^{-1})_{ij}\) to the matrix-vector product that is computed by the FMM. In order to keep a fast algorithm, the matrix \(\mathbf{C}^{-1}\) has to allow a fast summation method or \(\mathbf{C}\) has to be a sparse matrix. The trivial case where \(\mathbf{C}\) is a diagonal matrix can also be included in the direct evaluation step of the fast multipole algorithm.

3.5 Acceleration of the Translations

Following the ideas of White and Head-Gordon (1996) (see also Greengard and Rokhlin 1997; Cheng et al. 1999), the multipole-to-multipole (M2M) and the local-to-local (L2L) translations can both be accelerated by using Wigner rotation matrices (cf., e.g., Edmonds 1964; Biedenharn and Louck 1981; Varšalovič et al. 1988; Choi et al. 1999). With these rotations, the shift direction becomes the \(\varepsilon ^{3}\)-axis; we shift along it and rotate back. This reduces the numerical costs from \(\mathcal{O}\left (p^{4}\right )\) in the M2M and L2L steps to \(\mathcal{O}\left (p^{3}\right )\), since each rotation requires an effort of \(\mathcal{O}\left (p^{3}\right )\) and the shift along the \(\varepsilon ^{3}\)-axis is given by

$$\displaystyle{\tilde{F}_{x_{1}}^{\wedge,\text{O}}\left (n',m'\right ) =\sum \limits _{ n=\vert m'\vert }^{n'}\tilde{F}_{ x_{0}}^{\wedge,\text{O}}\left (n,m'\right )\frac{\vert x_{0} - x_{1}\vert ^{n'-n}} {\left (n' - n\right )!},\quad m' = -n',\ldots,n',}$$

for M2M (the tilde indicates that we are dealing with rotated coefficients) and by

$$\displaystyle{\tilde{F}_{x_{3}}^{\wedge,\text{I}}\left (n',m'\right ) =\sum \limits _{ n=n'}^{p}\tilde{F}_{ x_{2}}^{\wedge,\text{I}}\left (n,m'\right )\frac{\vert x_{3} - x_{2}\vert ^{n-n'}} {\left (n - n'\right )!},\quad m' = -n',\ldots,n',}$$

for L2L, which also requires an effort of \(\mathcal{O}(p^{3})\). For a detailed description with all technical details, we refer to White and Head-Gordon (1996) or Gutting (2007).
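After the rotation, the z-axis shift decouples in the order m′; a literal transcription of the M2M version (names ours) reads:

```python
import numpy as np
from math import factorial

def m2m_shift_z(Ft, p, dist):
    """Shift rotated multipole coefficients Ft[(n, m)] by dist along the z-axis."""
    return {(n_, m_): sum(Ft[(n, m_)] * dist**(n_ - n) / factorial(n_ - n)
                          for n in range(abs(m_), n_ + 1))
            for n_ in range(p + 1) for m_ in range(-n_, n_ + 1)}
```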

For the M2L translation, we follow the idea of Greengard and Rokhlin (1997) and Cheng et al. (1999) by replacing it with exponential translations which are based on the representation

$$\displaystyle\begin{array}{rcl} \frac{1} {\vert x - y\vert }& =& \frac{1} {2\pi }\int _{0}^{\infty }e^{-\lambda (x_{3}-y_{3})}\int _{ 0}^{2\pi }e^{i\lambda ((x_{1}-y_{1})\cos \alpha +(x_{2}-y_{2})\sin \alpha )}d\alpha \ d\lambda \\ & =& \sum \limits _{k=1}^{s(\varepsilon )} \frac{w_{k}} {M_{k}}\sum \limits _{j=1}^{M_{k} }e^{-\lambda _{k}(x_{3}-y_{3})}e^{i\lambda _{k}((x_{1}-y_{1})\cos \alpha _{j,k}+(x_{2}-y_{2})\sin \alpha _{j,k})} + \mathcal{O}(\varepsilon ){}\end{array}$$
(25)

for points x, y whose Cartesian coordinates satisfy \(1 \leq x_{3} - y_{3} \leq 4\) as well as \(0 \leq \sqrt{(x_{1 } - y_{1 } )^{2 } + (x_{2 } - y_{2 } )^{2}} \leq 4\sqrt{2}\). The inner integral is discretized using the trapezoidal rule, i.e., \(\alpha _{j,k} = \tfrac{2\pi j} {M_{k}}\), \(j = 1,\ldots,M_{k}\), and the outer integral is treated with the integration weights \(w_{k}\) and the integration points \(\lambda _{k}\), \(k = 1,\ldots,s(\varepsilon )\), for a chosen accuracy \(\varepsilon\). The values of \(w_{k}\), \(\lambda _{k}\), and \(M_{k}\) can be found in Greengard and Rokhlin (1997), Yarvin and Rokhlin (1998), and Cheng et al. (1999). The total number of numerical integration points, i.e., the number of exponential functions, is \(S(\varepsilon ) =\sum \limits _{ k=1}^{s(\varepsilon )}M_{k} \in \mathcal{O}\left (p^{2}\right )\), whereas \(s(\varepsilon ) \in \mathcal{O}(p)\) takes the role of p in determining the accuracy of the integration.

Since outer harmonics are related to the single pole by Hobson's formula (cf. Hobson 1965), a multipole expansion of F with coefficients \(F_{c}^{\wedge,\text{O}}(n,m)\), center c, and accuracy \(\varepsilon\) can be transformed by (25) into a series of exponentials (multipole-to-exponential step, briefly M2X)

$$\displaystyle{ F(x) =\sum \limits _{ k=1}^{s(\varepsilon )}\sum \limits _{ j=1}^{M_{k} }W(k,j)e^{-\lambda _{k}\frac{x_{3}-c_{3}} {d} }e^{i\lambda _{k}\left (\frac{x_{1}-c_{1}} {d} \cos \alpha _{j,k}+\frac{x_{2}-c_{2}} {d} \sin \alpha _{j,k}\right )} + \mathcal{O}(\varepsilon ), }$$
(26)

with the coefficients given by

$$\displaystyle{ W(k,j) =\sum \limits _{ n=0}^{p}\sum \limits _{ m=-n}^{n}\frac{F_{c}^{\wedge,\text{O}}(n,m)} {d^{n+1}} \frac{w_{k}} {M_{k}}\lambda _{k}^{n}i^{m}e^{im\alpha _{j,k} } }$$
(27)

for \(k = 1,\ldots,s(\varepsilon )\), \(j = 1,\ldots,M_{k}\). It is required that c is the center of a box of edge length d containing the points x, and the series of exponentials is valid for points y with \(d \leq x_{3} - y_{3} \leq 4d\) and \(0 \leq \sqrt{(x_{1 } - y_{1 } )^{2 } + (x_{2 } - y_{2 } )^{2}} \leq 4\sqrt{2}d\). Such exponential expansions can be shifted very efficiently to a new center b, i.e., to

$$\displaystyle{ F(x) =\sum \limits _{ k=1}^{s(\varepsilon )}\sum \limits _{ j=1}^{M_{k} }V (k,j)e^{-\lambda _{k}\frac{x_{3}-b_{3}} {d} }e^{i\lambda _{k}\left (\frac{x_{1}-b_{1}} {d} \cos \alpha _{j,k}+\frac{x_{2}-b_{2}} {d} \sin \alpha _{j,k}\right )} }$$

with the new coefficients

$$\displaystyle{ V (k,j) = W(k,j)e^{-\lambda _{k}\frac{b_{3}-c_{3}} {d} }e^{i\lambda _{k}\left (\frac{b_{1}-c_{1}} {d} \cos \alpha _{j,k}+\frac{b_{2}-c_{2}} {d} \sin \alpha _{j,k}\right )} }$$
(28)

for \(k = 1,\ldots,s(\varepsilon )\), \(j = 1,\ldots,M_{k}\). This exponential-to-exponential shift is abbreviated by X2X.
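The X2X shift (28) is just a diagonal scaling, one complex multiplication per integration point. A minimal sketch (our naming; the quadrature data \(\lambda_{k}\), \(\alpha_{j,k}\) and the coefficients W are assumed given, flattened over k and j):

```python
import numpy as np

def x2x(W, lambdas, alphas, c, b, d):
    """Shift exponential coefficients from center c to center b, cf. (28)."""
    t = (b - c) / d                      # normalized shift vector
    phase = np.exp(-lambdas * t[2] + 1j * lambdas
                   * (t[0] * np.cos(alphas) + t[1] * np.sin(alphas)))
    return W * phase                     # O(S(eps)) operations per shift
```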

Afterwards, the exponential expansion is transformed back into an inner harmonics expansion, completing the M2L translation step:

$$\displaystyle{ F(x) =\sum \limits _{ n=0}^{p}\sum \limits _{ m=-n}^{n}F_{ c}^{\wedge,\text{I}}(n,m)I_{ n,m}(x - c) + \mathcal{O}(\varepsilon ), }$$
(29)

where the coefficients can be computed by the so-called exponential-to-local (X2L) step

$$\displaystyle{ F_{c}^{\wedge,\text{I}}(n,m) =\sum \limits _{ k=1}^{s(\varepsilon )}\sum \limits _{ j=1}^{M_{k} }W(k,j)\frac{(-\lambda _{k})^{n}} {d^{n}} i^{m}e^{-im\alpha _{j,k} } }$$
(30)

for \(n = 0,\ldots,p\), \(m = -n,\ldots,n\). Obviously, the same geometric restrictions hold as before.

The restrictions on the positions of x and y mean that the exponential translations are applicable only for cubes in list 2 that are situated above the current cube with another cube in between. However, by combining the idea with rotations of the multipole expansion using again the Wigner rotation matrices, the exponential translation can substitute the M2L translation for all cubes in list 2. Therefore, the list of all well-separated cubes (list 2) is split into six directional lists (up, down, North, South, East, and West), and instead of M2L, the following sequence of transformations is used: (rotation), M2X (27), X2X (28), X2L (30), (inverse rotation).

Each exponential shift requires numerical costs of \(\mathcal{O}(S(\varepsilon )) = \mathcal{O}\left (p^{2}\right )\), and the rotations can be applied using \(\mathcal{O}\left (p^{3}\right )\) operations (as do the M2X and X2L steps). Thus, this improves the performance compared to the M2L step’s \(\mathcal{O}\left (p^{4}\right )\) effort. Each cube needs to allocate some additional memory for an outgoing and an incoming set of coefficients of the sum of exponentials.

Moreover, we can save translations by recombination (see Greengard and Rokhlin 1997; Cheng et al. 1999; Gutting 2007). If several cubes have the same target cubes, intermediate stops are performed. No additional memory is required since we can use the available storage of cubes at a lower level.

In 3D we perform 8 translations from \(X_{1},\ldots,X_{8}\) to their common parent cube C and 9 translations from C to \(B_{1},\ldots,B_{9}\), which are the parent cubes of \(S_{1},\ldots,S_{36}\), instead of 288 translations from \(X_{1},\ldots,X_{8}\) to \(S_{1},\ldots,S_{36}\), as summarized in the following diagram:

(31) [diagram: \(X_{1},\ldots,X_{8} \rightarrow C \rightarrow B_{1},\ldots,B_{9} \rightarrow S_{1},\ldots,S_{36}\)]

Not only the expansions of the \(X_{j}\), but also those of other cubes that have the children of \(B_{k}\) as target cubes, are collected in the parent cubes \(B_{k}\) and are shifted to the cubes \(S_{j}\) after all contributions have been added. Thus, we only need to perform the translations from each \(B_{k}\) to its children once. This further reduction is the reason that we use two intermediate stations for the translation. We call this the first stage of recombination.

Similar considerations can be used in other situations where fewer cubes have common target cubes (second stage of recombination). The corresponding diagram for the 3D case is given by

(32) [diagram of the second recombination stage]

As before, the final translation is executed only after all contributions have arrived in the cubes \(B_{k}\). These two stages already cover the complete list for the directions up and down. For the other four lists, we introduce a third stage. Obviously, we only consider cubes that have not yet been treated by the first two stages.

If a cube X has a cube Y in one of its directional lists, we collect all other children of X’s parent C which also have Y in the corresponding directional list in the combination list (altogether M cubes). If M > 1, we collect the maximal number of common cubes in the corresponding directional lists of all cubes of the combination list in the target cube list (altogether N cubes). If N > 1, all expansions from cubes in the combination list are shifted to their parent cube C and then to the cubes of the target cube list. The number of translations is reduced from NM to N + M.

Finally, any remaining cubes are treated individually as without the recombination technique. This procedure can significantly reduce the number of necessary translations. The algorithm is described and analyzed in full detail in Gutting (2007).

It should also be noted that there are several symmetries in the coefficients of the exponential expansion since we are dealing with a real-valued function F in (26). They are used to further reduce the numerical costs (cf. Greengard and Rokhlin 1997; Cheng et al. 1999).

4 Numerical Tests and Results

At first, the effect of the recombination described at the end of Sect. 3.5 is investigated for a fully populated bounding box. Obviously, this is the ideal case maximizing the effect of the recombination. The increased efficiency of the X2X step for the various stages of recombination (see Sect. 3.5; by stage 0 we indicate the case without recombination) is documented by Tables 1 and 2.

Table 1 Number of exponential translations depending on the stages in the recombination technique for the fully populated octree with levels 2–5. In brackets in the first column, the number of cubes in this level is given; the average number of translations per cube is listed in brackets in the other columns
Table 2 Computing times (in seconds) corresponding to the exponential translations of Table 1

Note that the accuracy \(s(\varepsilon ) = 26\) and the corresponding values for \(M_{k}\), \(S(\varepsilon )\) from Cheng et al. (1999) are chosen for this test. The well-known symmetries in the exponential coefficients (see Gutting 2007 for a detailed derivation) are also used to reduce the computational costs.

For a rather sphere-like setting as in the discrete version of the Dirichlet boundary value problem (Problem 3) or in an interpolation/approximation problem on the Earth's surface, the savings reduce to about 30% because of the adaptive construction of the octree. This can be seen in Table 3 where an adaptive decomposition is computed for points randomly distributed between two spheres of radius 6,356 and 6,378 km. The targets are the Kelvin transforms of these points with respect to a sphere of radius R = 6,352 km and the parameter h = 0.95.

Table 3 Computing times (in seconds) for the exponential translation without and with recombination (all three stages). Note that the listed times include the M2X and X2L transformations in both cases

The adaptive FMM used the parameters p = 23, \(s(\varepsilon ) = 26\), \(S(\varepsilon ) = 670\), and m = 130.

Now the truncation degree p is investigated for different accuracies \(s(\varepsilon )\) of the exponential translation. We increase p while \(s(\varepsilon )\) is kept fixed and determine when the integration error of the exponential translation dominates the truncation error. This leads to the choices of p for the different levels of \(s(\varepsilon )\) given in Table 4.

Table 4 Resulting truncation degrees p for different \(s(\varepsilon )\) for the two types of kernels
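The calibration loop behind Table 4 can be sketched as follows; fmm_error is a hypothetical routine returning the maximal error of an FMM run against direct summation for given p and \(s(\varepsilon )\).

```python
def calibrate_p(fmm_error, s_eps, p_candidates, improvement=0.95):
    """For fixed s_eps, raise the truncation degree p until the quadrature
    error of the exponential translation dominates, i.e., until the total
    error stops improving significantly."""
    best_err, best_p = float("inf"), p_candidates[0]
    for p in p_candidates:
        err = fmm_error(p, s_eps)
        if err > improvement * best_err:   # no significant gain anymore
            return best_p
        best_err, best_p = err, p
    return best_p
```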

For the remaining tests in this section, we consider the following test scenario: We take latitudes and longitudes of the points of a so-called spiral grid (cf. Rakhmanov et al. 1994) and combine these with a random radius between 6,356.7 and 6,378.1 km. As radius of the Runge sphere, R = 6,356 km is chosen and h = 0.95. The spline coefficients \(a_{i}\) are just random numbers between − 1 and 1. We compute

$$\displaystyle{ S\left (x_{j}\right ) =\sum \limits _{i=1}^{N}a_{i}\,\mathcal{L}_{j}\mathcal{L}_{i}K_{\mathcal{H}}(\cdot,\cdot ) =\sum \limits _{i=1}^{N}a_{i}\,K_{\mathcal{H}}\left (x_{i},x_{j}\right ),\quad j = 1,\ldots,N, }$$

with exponential translations of different accuracies and without them (using standard M2L translations). We compare the results to the direct summation without FMM. The error behavior is shown in Fig. 3, and together with many further tests (cf. Gutting 2007), we obtain the values of Table 4.
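The reference computation is the plain O(N²) sum; a minimal sketch, with the kernel passed as a callable so that both the singularity kernel and the Abel-Poisson kernel can be plugged in without repeating their formulas:

```python
import numpy as np

def direct_sum(points, coeffs, kernel):
    """Evaluates S(x_j) = sum_i a_i K(x_i, x_j) for all j by direct summation.
    points: (N, 3) array, coeffs: (N,) array, kernel(x, y) -> float."""
    N = points.shape[0]
    S = np.zeros(N)
    for j in range(N):
        S[j] = sum(coeffs[i] * kernel(points[i], points[j]) for i in range(N))
    return S
```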

Fig. 3

Error induced by the FMM for the singularity kernel (left) and the Abel-Poisson kernel (right) with exponential translations of different accuracies: \(s(\varepsilon ) = 8\) (red line), \(s(\varepsilon ) = 17\) (cyan line), and \(s(\varepsilon ) = 26\) (green line). For comparison, we include the standard M2L translation (blue line); the truncation degree p forms the abscissa

Finally, the maximal number m of points or targets per cube has a strong influence on the adaptive octree construction and the performance of the FMM. If m is too small, there are many cubes, each containing only very few points; the kernel expansion coefficients then no longer combine the information of enough points to be efficient. If m is too large, there are only few cubes, each with a large number of points, which means that direct interaction is used far too often instead of the kernel expansions. After many empirical tests (cf. Gutting 2007), we came to the conclusion that the choices of m given in Table 5 lead to a good performance in our implementation (a minimal sketch of the subdivision criterion follows Table 5).

Table 5 Chosen maximal numbers of points m per cube for the singularity kernel and the Abel-Poisson kernel and the different error levels
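The following sketch shows how m drives the adaptive subdivision; the function names and data layout are ours and not from the paper, and the list construction, the targets, and the parent-child bookkeeping of the actual algorithm are omitted.

```python
import numpy as np

def build_octree(points, idx, center, half, m, level=0, max_level=10):
    """A cube is subdivided as long as it holds more than m points; empty
    children are discarded (adaptive construction). Returns the leaves as a
    list of (center, half edge length, point indices)."""
    if len(idx) <= m or level == max_level:
        return [(center, half, idx)]
    side = points[idx] >= center          # (n, 3) booleans: which half per axis
    leaves = []
    for code in range(8):
        bits = np.array([(code >> k) & 1 for k in range(3)], dtype=bool)
        sub = idx[np.all(side == bits, axis=1)]
        if len(sub):                      # adaptive: drop empty children
            child = center + 0.5 * half * np.where(bits, 1.0, -1.0)
            leaves += build_octree(points, sub, child, 0.5 * half,
                                   m, level + 1, max_level)
    return leaves

# Usage: start from the bounding cube of all points.
# leaves = build_octree(pts, np.arange(len(pts)), c0, h0, m=130)
```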

After these optimizations of the parameters of the fast multipole algorithm, we can compare its performance with the direct computation and find the break-even points of our implementation, i.e., the minimal number of points for which our algorithm is faster than the direct approach (see Table 6). Note that such results always depend strongly on the implementation.

Table 6 Break-even points for the singularity kernel and the Abel-Poisson kernel

Furthermore, the linear asymptotic behavior which we expect from the FMM becomes obvious in Fig. 4 compared to the quadratic behavior of the direct approach.

Fig. 4

Break-even points by comparison of computation times (in seconds) for direct (blue line) and FMM-accelerated (red line) computation (left: singularity kernel, right: Abel-Poisson kernel); the number of points forms the abscissa

Our implementation turns out to be efficient even for rather small problem sizes. In general, the Abel-Poisson kernel requires somewhat more computation time since it leads to a more complicated P2M step.

5 Overlapping Domain Decomposition Preconditioner

The fast multipole algorithm already provides us with an adaptive decomposition of the computational domain which we also use to improve the condition of the system of linear equations (8) by using a Schwarz algorithm as preconditioner (see Chan and Mathew 1994; Smith et al. 1996 and the references therein for an overview). The basic idea of the Schwarz algorithms is to split up the computational domain into several (often overlapping) parts, solve the resulting smaller problems for each part, and combine the partial solutions. We intend to use only one decomposition for both the FMM and the Schwarz algorithm.

There are two main variants that differ in the way the residual is updated: In the multiplicative variant, the residual is updated after each single solution of a subproblem. In the additive variant, the solutions of the subproblems are merged after all of them have been computed, and the merged result updates the residual (cf. Beatson et al. 2000; Zhou et al. 2003 for the use of these methods in radial basis function interpolation and Hesse 2002 for the use of the multiplicative variant in the context of harmonic splines). The multiplicative variant possesses the advantage of not requiring the full matrix for the update of the residuals, but this is less relevant in the presence of the FMM, which provides a very fast way to (approximately) multiply with the full matrix. Moreover, the total number of update steps is higher for the multiplicative variant. Thus, we choose the additive Schwarz algorithm together with a coarse grid correction and use a residual update of Hybrid-II type (in the categorization of Smith et al. 1996).

Since the decomposition of the domain and the sorting of the points are already part of the FMM, the only additional work in the initialization of the preconditioner is the determination of the points in the overlapping parts of the subdomains, which are exactly given by the leaves of the octree. The amount of overlap is controlled by the parameter \(\vartheta _{\text{ov}}\), and the overlap area is a part of the directly adjacent cubes, i.e., the cubes in list 1 as set in Definition 7 (white in Fig. 5). The overlap is the dark blue area in Fig. 5. Note that \(\vartheta _{\text{ov}} = 0\), i.e., no overlap at all, is a valid choice.

Fig. 5

Two-dimensional illustration of the overlap (dark blue area) for an adaptive decomposition of the domain. The red and light blue areas belong to lists 2 and 3, respectively

In our problems we usually do not assume a structured point distribution. The idea for establishing a coarse grid from our scattered data points is to choose from each subdomain the point that is closest to the center of the subdomain and add this point to the coarse grid. Note that for the preconditioner, we only consider points and ignore the so-called targets of Sect. 3.2. If a cube contains only targets and none of the points \(x_{1},\ldots,x_{N}\), neither overlap nor coarse grid points are determined for it. After these initialization steps, which need to be computed only once, we can investigate the preconditioning cycle, which yields \(v = \mathbf{M}^{-1}b\) for a given residual b.
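A minimal sketch of this coarse-grid selection, assuming the leaves are stored as (center, half edge length, point indices) as in the octree sketch of Sect. 4:

```python
import numpy as np

def coarse_grid(points, leaves):
    """From every leaf cube that contains data points (not only targets),
    take the point closest to the cube's center."""
    grid = []
    for center, half, idx in leaves:
        if len(idx) == 0:      # cube holds only targets: contributes nothing
            continue
        dist = np.linalg.norm(points[idx] - center, axis=1)
        grid.append(idx[np.argmin(dist)])
    return np.array(grid)
```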

At first, we introduce the restriction matrix for each subdomain. Let \(M \in \mathbb{N}\) be the total number of subdomains and let the subdomain \(X_{r}\), \(r = 1,\ldots,M\), contain \(N_{r}\) points in the cube and its corresponding overlap. By \(\sigma _{r}\), we denote the permutation of the numbers \(1,\ldots,N\) that yields the points \(x_{\sigma _{r}(i)}\) of \(X_{r}\) for the indices \(i = 1,\ldots,N_{r} \leq N\). For \(i = N_{r} + 1,\ldots,N\), the permutation \(\sigma _{r}\) is supposed to give the remaining indices between 1 and N such that \(\sigma _{r}(i) <\sigma _{r}(i + 1)\) for all \(i = N_{r} + 1,\ldots,N - 1\). With the help of the canonical basis vectors of \(\mathbb{R}^{N}\) and the index permutation \(\sigma _{r}\), the restriction matrix \(\mathbf{R}_{r}\) is defined as

$$\displaystyle{ \mathbf{R}_{r} = \left (\begin{array}{c} e_{\sigma _{r}(1)}^{T}\\ \vdots \\ e_{\sigma _{r}(N_{r})}^{T} \end{array} \right ) \in \mathbb{R}^{N_{r}\times N}\,. }$$
(33)

The restriction matrices reduce a vector \(x \in \mathbb{R}^{N}\) to a vector in \(\mathbb{R}^{N_{r}}\) consisting only of the components of x related to \(X_{r}\). The corresponding transposed matrix \(\mathbf{R}_{r}^{T} \in \mathbb{R}^{N\times N_{r}}\) is called the prolongation matrix in the multiplicative variant. The restriction/prolongation matrices \(\mathbf{R}_{0}\), \(\mathbf{R}_{0}^{T}\) for the coarse grid are defined analogously using the corresponding set of points. In the additive variant with overlapping subdomains, \(\mathbf{R}_{r}^{T}\), \(r = 1,\ldots,M\), stands for the operation of fitting together the solutions corresponding to the subdomains. Several possibilities are available (see Gutting 2007 and the references therein); we choose to update only those components that correspond to points of the cube in \(X_{r}\) without the overlap. The overlapping part still plays a role in solving the subproblems, i.e., in \(\mathbf{R}_{r}\) and \(\mathbf{A}_{r}\). Obviously, these restriction and prolongation operators are written as matrices only for the description of the algorithm; their effect is implemented in the form of index operations.
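As just stated, the operators reduce to index operations; a minimal sketch, where sigma_r and the inner/overlap bookkeeping are hypothetical index arrays:

```python
import numpy as np

def restrict(x, sigma_r, N_r):
    """R_r x: pick the N_r components of x belonging to subdomain X_r."""
    return x[sigma_r[:N_r]]

def prolongate_add(v, z_r, sigma_r, inner_r):
    """Additive update with R_r^T z_r, restricted to the inner points (the
    cube without its overlap); inner_r holds their positions within the
    first N_r entries of sigma_r."""
    v[sigma_r[inner_r]] += z_r[inner_r]
    return v
```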

Algorithm 2 Overlapping Additive Schwarz Preconditioner

Initialization step:

Find the points \(x_{i}\), \(i = 1,\ldots,N_{r}\), in each subdomain including its overlap and set up the matrices

$$\displaystyle{ \mathbf{A}_{r} = \left (K_{\mathcal{H}}(x_{i},x_{j})\right )_{i,j=1,\ldots,N_{r}},\quad r = 1,\ldots,M. }$$
(34)

Find the coarse grid and build the corresponding matrix \(\mathbf{A}_{0} = \left (K_{\mathcal{H}}(x_{i},x_{j})\right )\), where \(i,j = 1,\ldots,N_{0}\) correspond to the indices of the \(N_{0}\) points of the coarse grid. The residual b is given.

Preconditioning cycle:

For \(r = 1,\ldots,M\): Solve \(\mathbf{A}_{r}z_{r} = \mathbf{R}_{r}b\) to obtain \(z_{r}\).

Update at the inner points: \(v =\sum \limits _{ r=1}^{M}\mathbf{R}_{r}^{T}z_{r}\).

Update the coarse residual and perform coarse grid correction:

$$\displaystyle{ v_{\mathrm{final}} = v + \mathbf{R}_{0}^{T}\mathbf{A}_{ 0}^{-1}\mathbf{R}_{ 0}(b -\mathbf{A}v), }$$
(35)

where A denotes the full matrix corresponding to all N points.
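A compact sketch of Algorithm 2, under the assumption that the kernel matrices are symmetric positive definite so that a Cholesky factorization can be computed once per subdomain in the initialization step (all names are ours, not from the paper):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

class AdditiveSchwarz:
    """Preconditioning cycle v = M^{-1} b of Algorithm 2 (sketch)."""
    def __init__(self, A, sub_idx, inner_idx, coarse_idx):
        self.A = A                  # full N x N kernel matrix
        self.sub = sub_idx          # index arrays per subdomain, with overlap
        self.inner = inner_idx      # positions of the non-overlap points
        self.coarse = coarse_idx    # indices of the coarse grid points
        # initialization step: factorize A_r and A_0 once
        self.fac = [cho_factor(A[np.ix_(s, s)]) for s in sub_idx]
        self.fac0 = cho_factor(A[np.ix_(coarse_idx, coarse_idx)])

    def apply(self, b):
        v = np.zeros_like(b)
        for s, inner, fac in zip(self.sub, self.inner, self.fac):
            z = cho_solve(fac, b[s])        # solve A_r z_r = R_r b
            v[s[inner]] += z[inner]         # update only at the inner points
        # coarse grid correction (35); in the actual implementation R_0 A v
        # is formed directly as an N_0 x N product, cf. Remark 1
        r0 = b[self.coarse] - (self.A @ v)[self.coarse]
        v[self.coarse] += cho_solve(self.fac0, r0)
        return v
```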

Remark 1.

The computation of \(\mathbf{R}_{0}\mathbf{A}v\) in (35) stands for the product of an \(N_{0} \times N\) matrix with a vector in \(\mathbb{R}^{N}\). Since \(N_{0} \ll N\), this matrix-vector product is computed directly in our implementation, even though it is of the type for which the FMM works as a fast summation method.

Our implementation solves the subproblems directly, i.e., the matrices \(\mathbf{A}_{r}\) are factorized once in the initialization step.

Algorithm 2 is applied as preconditioner in the well-known preconditioned GMRES algorithm for the iterative solution of systems of linear equations (cf. Saad and Schultz 1986 for the first formulation of GMRES or, e.g., Saad 2003). The matrix-vector products with the full matrix A that occur in GMRES are accelerated with the fast multipole algorithm of Sect. 3.
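Hooking the preconditioner into an off-the-shelf GMRES could look as follows; fmm_matvec stands for the FMM-accelerated product with the full matrix A from Sect. 3, and precond for an object implementing the cycle of Algorithm 2 (both hypothetical names):

```python
from scipy.sparse.linalg import LinearOperator, gmres

# N: number of points, b: right-hand side of system (8)
A_op = LinearOperator((N, N), matvec=fmm_matvec)       # FMM-accelerated A v
M_op = LinearOperator((N, N), matvec=precond.apply)    # Algorithm 2
coeffs, info = gmres(A_op, b, M=M_op)                  # info == 0: converged
```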

The overlap parameter \(\vartheta _{\text{ov}}\), which controls the size of the overlapping parts of the subdomains (see Fig. 5), plays an important role in the effectiveness of the preconditioner. Note that the number of overlap points is not directly controlled this way. On the one hand, a larger overlap, i.e., more overlap points, generally reduces the number of iterations because the efficiency of the additive Schwarz preconditioner rises. On the other hand, the number of points related to each subdomain also grows, i.e., the effort per subdomain increases, and hence the iterations take more time.

We investigate the following test scenario to determine a good balance: The Earth’s surface is described by the TerrainBase model (see Hastings and Row 1997), and we choose the latitudes and longitudes of the data points from a so-called spiral grid (cf. Rakhmanov et al. 1994), i.e., the points are approximately equidistributed on the surface for this test. As gravitational data to interpolate, we use the EGM96 model (cf. Lemoine et al. 1998). The fast multipole algorithm is run with \(s(\varepsilon ) = 17\) and the corresponding values for p and m (see Sect. 4). The spline parameters are R = 6,352 km and h = 0.92 for both kernels.

Table 7 Splines using the singularity and the Abel-Poisson kernel with N = 8,800 or N = 11,200 points. Computation times (in seconds) and the number of iterations in brackets. The first line uses a direct solver for comparison; the second line uses GMRES without preconditioning. Note that we restrict the GMRES algorithm to a maximum of 300 iterations. If no convergence is achieved by then, we write > 300

From Table 7, we find that the differences for \(\vartheta _{\text{ov}} \geq 0.4\) are rather small. Moreover, the disadvantages of a too large overlap become more and more evident for more points, in particular more points per cube, e.g., resulting from \(s(\varepsilon ) = 26\) (see Table 5). Therefore, we use the preconditioner with \(\vartheta _{\text{ov}} = 0.4\) in our calculations in Sect. 6. Further parameter studies and tests of other update variants can be found in Gutting (2007).

For smoothing splines, restrictions similar to those for the FMM hold on the matrix C in (11) (see the end of Sect. 3.4). However, the system of linear equations (11) is typically much better conditioned than (8) because of the smoothing.

6 Numerical Tests and Results

At first, we consider a global test example that puts together all the previous parts. In this section, the FMM is used with \(s(\varepsilon ) = 26\) and the corresponding parameters (see Sect. 4), and the preconditioner is set up with the overlap parameter \(\vartheta _{\text{ov}} = 0.4\) as explained in Sect. 5.

To determine random points on the Earth’s surface, we compute a uniform distribution on a sphere. Since the Earth’s surface is close to a sphere, the distribution can be expected to remain close to uniform. Of course, even for uniformly distributed random points, data gaps can occur, leading to complications. As test data, we simply evaluate the EGM96 (using degrees 3–100 here) at N = 100,000 random points on the Earth’s surface. In Fig. 6, the spline of the gravitational potential and the distribution of the absolute error of the spline interpolation are shown. Figure 7 provides a detailed view of the errors at some data gaps.
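Such uniformly distributed directions can be obtained by normalizing Gaussian random vectors; a minimal sketch, where radius_of is a hypothetical lookup of the surface radius in the TerrainBase model:

```python
import numpy as np

# Normalized Gaussian vectors are uniformly distributed on the unit sphere.
rng = np.random.default_rng()
v = rng.standard_normal((100_000, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)
# points = radius_of(v)[:, None] * v   # hypothetical TerrainBase radius lookup
```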

Fig. 6

Spline interpolation of the gravitational potential (left) and absolute error distribution (right), both in m²/s²

Fig. 7

A closer look at some data gaps where the largest error (in m²/s²) occurs in Fig. 6 (right). The data points are added in magenta

The spline with the singularity kernel (h = 0.98 and R = 6,352 km) is computed in just 18 iterations and leads to a mean absolute error of \(\varepsilon _{ \text{mabs}} = 0.0413\:\mathrm{m}^{2}/\mathrm{s}^{2}\) and a maximal absolute error of \(\varepsilon _{\text{max}} = 4.1442\:\mathrm{m}^{2}/\mathrm{s}^{2}\).

For our further considerations, we no longer work on the whole surface of the Earth but restrict ourselves to a region containing South America (65°S to 30°N, 110°W to 10°W), and only the points within this region are interpolated. As test data, the EGM96 (using degrees 16–200) is evaluated at N = 48,749 almost uniformly distributed random points (constructed as before) in the region on the Earth’s surface (see Fig. 8 (left)).

Fig. 8

Local examples: N = 48,749 randomly distributed points (left) and N = 48,744 points based on a spiral grid (right) in the region on the Earth’s surface

The resulting spline with the singularity kernel with h = 0.9875 and R = 6,354.4 km required 32 iterations to obtain the coefficients, and its mean absolute error is \(\varepsilon _{ \text{mabs}} = 0.0353\:\mathrm{m}^{2}/\mathrm{s}^{2}\). The maximal error \(\varepsilon _{\text{max}} = 8.0492\:\mathrm{m}^{2}/\mathrm{s}^{2}\) occurs at the boundary of the region (see Fig. 9). Thus, we ignore 2.5 % of the region at each boundary (see Fig. 10), which reduces the mean and maximal errors to \(\tilde{\varepsilon }_{ \text{mabs}} = 0.0280\:\mathrm{m}^{2}/\mathrm{s}^{2}\) and \(\tilde{\varepsilon }_{\text{max}} = 3.6544\:\mathrm{m}^{2}/\mathrm{s}^{2}\), respectively (compare also Fig. 10 to Fig. 9). It should be noted that the error is naturally largest at the boundary since there are no more data points beyond it. Therefore, one can consider everything outside the region under consideration as one huge data gap.

Fig. 9

Spline interpolation of the gravitational potential (left) and absolute error distribution (right) using the singularity kernel and the interpolation points in Fig. 8 (left), both in m²/s²

Fig. 10

Same results as in Fig. 9, but 2.5 % of the area at the boundary of the region has been removed

Finally, we consider another regional example with the Abel-Poisson kernel (with h = 0.98 and R = 6,354 km) using the spiral grid of Rakhmanov et al. (1994) mapped to the Earth’s surface as before (see the test scenarios in Sect. 5). The points (N = 48,744) are illustrated in Fig. 8 (right), and as data we again evaluate the EGM96 (using degrees 16–360). Our algorithm requires 15 iterations, and the errors are \(\varepsilon _{ \text{mabs}} = 0.0199\:\mathrm{m}^{2}/\mathrm{s}^{2}\) and \(\varepsilon _{\text{max}} = 10.7790\:\mathrm{m}^{2}/\mathrm{s}^{2}\), where the maximal error again occurs at the boundary of the region (see Fig. 11). Again cutting off 2.5 % of the region at the boundary (see Fig. 12 and compare it to Fig. 11), these errors reduce to \(\tilde{\varepsilon }_{\text{mabs}} = 0.0046\:\mathrm{m}^{2}/\mathrm{s}^{2}\) and \(\tilde{\varepsilon }_{\text{max}} = 0.9984\:\mathrm{m}^{2}/\mathrm{s}^{2}\). Obviously, the more regular distribution of points removes the errors that result from data gaps and leads to better results.

Fig. 11

Spline interpolation of the gravitational potential (left) and absolute error distribution (right) using the Abel-Poisson kernel and the interpolation points in Fig. 8 (right), both in m²/s²

Fig. 12

Same results as in Fig. 11, but 2.5 % of the area at the boundary of the region has been removed

7 Conclusion

Combining the FMM with the additive Schwarz algorithm as preconditioner in the iterative solver GMRES proves to be an efficient solution strategy that can treat interpolation problems and Dirichlet boundary value problems with many data points on regular surfaces (e.g., the actual topography of the Earth). It should be pointed out that our approach is not restricted to a global treatment but also applies to regional domains, as shown by the numerical examples. This can lead to a local improvement of the gravitational field in areas of particular interest. A global model (e.g., in terms of spherical harmonics) should first be subtracted from the data and afterwards be recombined with the spline solution. In general, the singularity kernel leads to results slightly faster than the Abel-Poisson kernel.

The spline approach naturally includes spherical boundaries as a special case and can be extended to spline smoothing with some restrictions (see Sect. 2.2 and the end of Sect. 3.4). Note that for smoothing splines, it is quite possible to omit the preconditioner since the smoothing itself drastically improves the condition of the matrix. However, the smoothing parameter(s) play a crucial role in this approach and must be chosen very carefully, or a lot of detail information is lost to oversmoothing. The combination with parameter choice methods from the theory of ill-posed problems (cf. Bauer and Lukas 2011; Bauer et al. 2014 and the references therein) is an interesting challenge for the future.

For highly irregular distributions of data points, the spline approach reaches its limits: the largest data gap in the domain calls for a small value of the parameter h, whereas the closest data points require a larger value of h to avoid ill-conditioning. So far, even smoothing splines cannot completely bridge this gap, though further investigation is required. However, functional matching pursuit methods can result in better approximations (see Michel 2014a and the references therein), but so far these algorithms come at large numerical costs.