1 Introduction

It is well appreciated that functions cannot have finite support in the temporal (or spatial) and spectral domain at the same time (Slepian 1983). Finding and representing signals that are optimally concentrated in both is a fundamental problem in information theory which was solved in the early 1960s by Slepian, Landau, and Pollak (Slepian and Pollak 1961; Landau and Pollak 1961, 1962). The extensions and generalizations of this problem (Daubechies 1988, 1990; Daubechies and Paul 1988; Cohen 1989) have strong connections with the burgeoning field of wavelet analysis. In this contribution, however, we shall not talk about wavelets, the scaled translates of a “mother” wavelet with vanishing moments, the tool for multiresolution analysis (Daubechies 1992; Flandrin 1998; Mallat 1998). Rather, we devote our attention entirely to what we shall collectively refer to as “Slepian functions,” in multiple Cartesian dimensions and on the sphere.

These we understand to be orthogonal families of functions that are all defined on a common, e.g., geographical, domain, where they are either optimally concentrated or within which they are exactly limited, and which at the same time are exactly confined within a certain bandwidth or maximally concentrated therein. The measure of concentration is invariably a quadratic energy ratio, which, though only one choice out of many (Donoho and Stark 1989; Freeden and Windheuser 1997; Riedel and Sidorenko 1995; Freeden and Schreiner 2010; Michel 2010), is perfectly suited to the nature of the problems we are attempting to address. These are, for example: How do we make estimates of signals that are noisily and incompletely observed? How do we analyze the properties of such signals efficiently, and how can we represent them economically? How do we estimate the power spectrum of noisy and incomplete data? What are the particular constraints imposed by dealing with potential-field signals (gravity, magnetism, etc.) and how is the altitude of the observation point, e.g., from a satellite in orbit, taken into account? What are the statistical properties of the resulting signal and power spectral estimates?

These and other questions have been studied extensively in one dimension, that is, for time series, but until the twenty-first century, remarkably little work had been done in the Cartesian plane or on the surface of the sphere. For the geosciences, the latter two domains of application are nevertheless vital for the obvious reasons that they deal with information (measurement and modeling) that is geographically distributed on (a portion of) a planetary surface. In our own recent series of papers (Wieczorek and Simons 2005, 2007; Simons and Dahlen 2006, 2007; Simons et al. 2006, 2009; Dahlen and Simons 2008; Plattner and Simons 2013, 2014) we have dealt extensively with Slepian’s problem in spherical geometry. Asymptotic reductions to the plane (Simons et al. 2006; Simons and Wang 2011) then generalize Slepian’s early treatment of the multidimensional Cartesian case (Slepian 1964).

In this chapter we provide a framework for the analysis and representation of geoscientific data by means of Slepian functions defined for time series, on two-dimensional Cartesian, and spherical domains. We emphasize the common ground underlying the construction of all Slepian functions, discuss practical algorithms, and review the major findings of our own recent work on signal (Wieczorek and Simons 2005; Simons and Dahlen 2006) and power spectral estimation theory on the sphere (Wieczorek and Simons 2007; Dahlen and Simons 2008). Compared to the first edition of this work (Simons 2010), we now also include a section on vector-valued Slepian functions that brings the theory in line with the modern demands of (satellite) gravity, geomagnetic, or oceanographic data analysis (Freeden 2010; Olsen et al. 2010; Grafarend et al. 2010; Martinec 2010; Sabaka et al. 2010).

2 Theory of Slepian Functions

In this section we review the theory of Slepian functions in one dimension, in the Cartesian plane, and on the surface of the unit sphere. The one-dimensional theory is quite well known and perhaps most accessibly presented in the textbook by Percival and Walden (1993). It is briefly reformulated here for consistency and to establish some notation. The two-dimensional planar case formed the subject of one of Slepian’s lesser-known papers (Slepian 1964) and is reviewed here also. We are not discussing alternatives by which two-dimensional Slepian functions are constructed by forming the outer product of pairs of one-dimensional functions. While this approach has produced some useful results (Hanssen 1997; Simons et al. 2000), it does not solve the concentration problem sensu stricto. The spherical scalar case was treated in most detail, and for the first time, by ourselves elsewhere (Wieczorek and Simons 2005; Simons et al. 2006; Simons and Dahlen 2006), though two very important early studies by Slepian, Grünbaum, and others laid much of the foundation for the analytical treatment of the spherical concentration problem for cases with special symmetries (Gilbert and Slepian 1977; Grünbaum 1981). The spherical vector case was treated in its most general form by ourselves elsewhere (Plattner et al. 2012; Plattner and Simons 2013, 2014), but had also been studied in some, but less general, detail by researchers interested in medical imaging (Maniar and Mitra 2005; Mitra and Maniar 2006) and optics (Jahn and Bokor 2012, 2013). Finally, we recast the theory in the context of reproducing-kernel Hilbert spaces, through which the reader may appreciate some of the connections with radial basis functions, splines, and wavelet analysis, which are commonly formulated in such a framework (Freeden et al. 1998; Michel 2010).

2.1 Spatiospectral Concentration for Time Series

2.1.1 General Theory in One Dimension

We use t to denote time or one-dimensional space and ω for angular frequency, and adopt a normalization convention (Mallat 1998) in which a real-valued time-domain signal f(t) and its Fourier transform F(ω) are related by

$$\displaystyle{ f(t) = (2\pi )^{-1}\int _{ -\infty }^{\infty }F(\omega )e^{i\omega t}\,d\omega,\qquad F(\omega ) =\int _{ -\infty }^{\infty }f(t)e^{-i\omega t}\,dt. }$$
(1)

The problem of finding the strictly bandlimited signal

$$\displaystyle{ g(t) = (2\pi )^{-1}\int _{ -W}^{W}G(\omega )e^{i\omega t}\,d\omega, }$$
(2)

that is maximally, though by virtue of the Paley-Wiener theorem (Daubechies 1992; Mallat 1998) never completely, concentrated into a time interval | t | ≤ T, was first considered by Slepian, Landau, and Pollak (Slepian and Pollak 1961; Landau and Pollak 1961). The optimally concentrated signal is taken to be the one with the least energy outside of the interval:

$$\displaystyle{ \lambda = \frac{\int _{-T}^{T}g^{2}(t)\,dt} {\int _{-\infty }^{\infty }g^{2}(t)\,dt} = \mbox{ maximum}. }$$
(3)

Bandlimited functions g(t) satisfying the variational problem (3) have spectra G(ω) that satisfy the frequency-domain convolutional integral eigenvalue equation

$$\displaystyle{ \int _{-W}^{W}D(\omega,\omega ')\,G(\omega ')\,d\omega ' =\lambda \ G(\omega ),\quad \vert \omega \vert \leq W, }$$
(4a)
$$\displaystyle{ D(\omega,\omega ') = \frac{\sin T(\omega -\omega ')} {\pi (\omega -\omega ')}. }$$
(4b)

The corresponding time- or spatial-domain formulation is

$$\displaystyle{ \int _{-T}^{T}D(t,t')\,g(t')\,dt' =\lambda g(t),\quad t \in \mathbb{R}, }$$
(5a)
$$\displaystyle{ D(t,t') = \frac{\sin W(t - t')} {\pi (t - t')}. }$$
(5b)

The “prolate spheroidal eigentapers” \(g_{1}(t),g_{2}(t),\ldots\) that solve Eq. (5) form a doubly orthogonal set. When they are chosen to be orthonormal over infinite time \(\vert t\vert \leq \infty\), they are also orthogonal over the finite interval | t | ≤ T:

$$\displaystyle{ \int _{-\infty }^{\infty }g_{\alpha }g_{\beta }\,dt =\delta _{\alpha \beta },\qquad \int _{ -T}^{T}g_{\alpha }g_{\beta }\,dt =\lambda _{\alpha }\delta _{\alpha \beta }. }$$
(6)

A change of variables and a scaling of the eigenfunctions transforms Eq. (4) into the dimensionless eigenproblem

$$\displaystyle{ \int _{-1}^{1}D(x,x')\,\psi (x')\,dx' =\lambda \psi (x), }$$
(7a)
$$\displaystyle{ D(x,x') = \frac{\sin TW(x - x')} {\pi (x - x')}. }$$
(7b)

Equation (7) shows that the eigenvalues \(\lambda _{1}>\lambda _{2}>\ldots\) and suitably scaled eigenfunctions \(\psi _{1}(x),\psi _{2}(x),\ldots\) depend only upon the time-bandwidth product TW. The sum of the concentration eigenvalues \(\lambda\) relates to this product by

$$\displaystyle{ N^{\mathrm{1D}} =\sum _{ \alpha =1}^{\infty }\lambda _{ \alpha } =\int _{ -1}^{1}D(x,x)\,dx = \frac{(2T)(2W)} {2\pi } = \frac{2TW} {\pi }. }$$
(8)

The eigenvalue spectrum of Eq. (7) has a characteristic step shape, showing significant (\(\lambda \approx 1\)) and insignificant (\(\lambda \approx 0\)) eigenvalues separated by a narrow transition band (Landau 1965; Slepian and Sonnenblick 1965). Thus, this “Shannon number” is a good estimate of the number of significant eigenvalues or, roughly speaking, \(N^{\mathrm{1D}}\) is the number of signals f(t) that can be simultaneously well concentrated into a finite time interval | t | ≤ T and a finite frequency interval | ω | ≤ W. In other words (Landau and Pollak 1962), \(N^{\mathrm{1D}}\) is the approximate dimension of the space of signals that is “essentially” timelimited to T and bandlimited to W, and using the orthogonal set \(g_{1},g_{2},\ldots,g_{N^{\mathrm{1D}}}\) as its basis is parsimonious.
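
For readers who wish to experiment, the short Python sketch below (our own illustration, not part of the original development) solves the dimensionless eigenproblem (7) by the Nyström method with Gauss-Legendre quadrature; the time-bandwidth product TW and the number of quadrature nodes are arbitrary choices. The step-shaped eigenvalue spectrum and the Shannon number of Eq. (8) are immediately apparent.

```python
# Our own minimal illustration (not from the references): Nystrom solution of
# the dimensionless concentration problem, Eq. (7), by Gauss-Legendre
# quadrature on [-1, 1].  TW and nquad are arbitrary choices.
import numpy as np

def slepian_eigenvalues_1d(TW, nquad=200):
    x, w = np.polynomial.legendre.leggauss(nquad)         # nodes, weights on [-1, 1]
    dx = x[:, None] - x[None, :]
    D = np.sin(TW * dx) / (np.pi * (dx + np.eye(nquad)))  # Eq. (7b), off-diagonal
    np.fill_diagonal(D, TW / np.pi)                       # D(x, x) = TW / pi
    A = np.sqrt(w)[:, None] * D * np.sqrt(w)[None, :]     # symmetrized Nystrom matrix
    return np.linalg.eigvalsh(A)[::-1]                    # eigenvalues, best first

TW = 10.0
lam = slepian_eigenvalues_1d(TW)
print("Shannon number 2TW/pi :", 2 * TW / np.pi)          # Eq. (8)
print("sum of eigenvalues    :", lam.sum())
print("leading eigenvalues   :", np.round(lam[:10], 4))   # step-shaped spectrum
```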

2.1.2 Sturm-Liouville Character and Tridiagonal Matrix Formulation

The integral operator acting upon ψ in Eq. (7) commutes with a differential operator that arises in expressing the three-dimensional scalar wave equation in prolate spheroidal coordinates (Slepian and Pollak 1961; Slepian 1983), which makes it possible to find the scaled eigenfunctions ψ by solving the Sturm-Liouville equation

$$\displaystyle{ \frac{d} {dx}\left [(1 - x^{2}) \frac{d\psi } {dx}\right ] + \left [\chi -\frac{(N^{\mathrm{1D}})^{2}\pi ^{2}} {4} x^{2}\right ]\psi = 0, }$$
(9)

where \(\chi \neq \lambda\) is the associated eigenvalue. The eigenfunctions ψ(x) of Eq. (9) can be found at discrete values of x by diagonalization of a simple symmetric tridiagonal matrix (Slepian 1978; Grünbaum 1981; Percival and Walden 1993) with elements

$$\displaystyle\begin{array}{rcl} & & ([N - 1 - 2x]/2)^{2}\cos (2\pi W)\quad \mbox{ for}\quad x = 0,\cdots \,,N - 1, \\ & & \quad x(N - x)/2\quad \mbox{ for}\quad x = 1,\ldots,N - 1. {}\end{array}$$
(10)

The matching eigenvalues \(\lambda\) can then be obtained directly from Eq. (7). The discovery of the Sturm-Liouville formulation of the concentration problem posed in Eq. (3) proved to be a major impetus for the widespread adoption and practical applications of the “Slepian” basis in signal identification, spectral analysis, and numerical analysis. Compared to the sequence of eigenvalues \(\lambda\), the spectrum of the eigenvalues χ is extremely regular and thus the solution of Eq. (9) is without any problem amenable to finite-precision numerical computation (Percival and Walden 1993).
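
A minimal sketch of this route is given below, assuming example values for the number of samples N and the half-bandwidth W (in cycles per sample): it builds the tridiagonal matrix of Eq. (10), diagonalizes it, and recovers the concentration eigenvalues \(\lambda\) from the discrete analogue of the sinc kernel in Eq. (5b). The same tapers (up to sign) are also available from scipy.signal.windows.dpss.

```python
# Our own sketch: discrete prolate spheroidal tapers from the tridiagonal
# matrix of Eq. (10).  N (number of samples) and W (half-bandwidth in cycles
# per sample) are assumed example values; scipy.signal.windows.dpss offers an
# equivalent construction.
import numpy as np
from scipy.linalg import eigh_tridiagonal

N, W = 128, 0.04
n = np.arange(N)
diag = ((N - 1 - 2 * n) / 2.0) ** 2 * np.cos(2 * np.pi * W)  # x = 0, ..., N-1
off = n[1:] * (N - n[1:]) / 2.0                              # x = 1, ..., N-1
chi, tapers = eigh_tridiagonal(diag, off)      # well-behaved spectrum chi
tapers = tapers[:, ::-1]                       # best-concentrated taper first

# Concentration eigenvalues lambda from the discrete analogue of Eq. (5b)
dt = n[:, None] - n[None, :]
D = np.sin(2 * np.pi * W * dt) / (np.pi * (dt + np.eye(N)))
np.fill_diagonal(D, 2 * W)
lam = np.einsum('ia,ij,ja->a', tapers, D, tapers)
print("Shannon number 2NW  :", 2 * N * W)
print("leading eigenvalues :", np.round(lam[:12], 6))
```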

2.2 Spatiospectral Concentration in the Cartesian Plane

2.2.1 General Theory in Two Dimensions

A square-integrable function f(x) defined in the plane has the two-dimensional Fourier representation

$$\displaystyle{ f(\mathbf{x}) = (2\pi )^{-2}\int _{ -\infty }^{\infty }F(\mathbf{k})e^{i\mathbf{k}\cdot \mathbf{x}}\,d\mathbf{k},\qquad F(\mathbf{k}) =\int _{ -\infty }^{\infty }f(\mathbf{x})e^{-i\mathbf{k}\cdot \mathbf{x}}\,d\mathbf{x}. }$$
(11)

We use g(x) to denote a function that is bandlimited to \(\mathcal{K}\), an arbitrary subregion of spectral space,

$$\displaystyle{ g(\mathbf{x}) = (2\pi )^{-2}\int _{ \mathcal{K}}G(\mathbf{k})e^{i\mathbf{k}\cdot \mathbf{x}}\,d\mathbf{k}. }$$
(12)

Following Slepian (1964), we seek to concentrate the power of g(x) into a finite spatial region \(\mathcal{R}\subset \mathbb{R}^{2}\), of area A:

$$\displaystyle{ \lambda = \frac{\int _{\mathcal{R}}g^{2}(\mathbf{x})\,d\mathbf{x}} {\int _{-\infty }^{\infty }g^{2}(\mathbf{x})\,d\mathbf{x}} = \mbox{ maximum}. }$$
(13)

Bandlimited functions g(x) that maximize the Rayleigh quotient (13) solve the Fredholm integral equation  (Tricomi 1970)

$$\displaystyle{ \int _{\mathcal{K}}D(\mathbf{k},\mathbf{k}')\,G(\mathbf{k}')\,d\mathbf{k}' =\lambda G(\mathbf{k}),\qquad \mathbf{k} \in \mathcal{K}, }$$
(14a)
$$\displaystyle{ D(\mathbf{k},\mathbf{k}') = (2\pi )^{-2}\int _{ \mathcal{R}}e^{i(\mathbf{k}'-\mathbf{k})\cdot \mathbf{x}}\,d\mathbf{x}. }$$
(14b)

The corresponding problem in the spatial domain is

$$\displaystyle{ \int _{\mathcal{R}}D(\mathbf{x},\mathbf{x}')\,g(\mathbf{x}')\,d\mathbf{x}' =\lambda g(\mathbf{x}),\qquad \mathbf{x} \in \mathbb{R}^{2}, }$$
(15a)
$$\displaystyle{ D(\mathbf{x},\mathbf{x}') = (2\pi )^{-2}\int _{ \mathcal{K}}e^{i\mathbf{k}\cdot (\mathbf{x}-\mathbf{x}')}\,d\mathbf{k}. }$$
(15b)

The bandlimited spatial-domain eigenfunctions \(g_{1}(\mathbf{x}),g_{2}(\mathbf{x}),\ldots\) and eigenvalues \(\lambda _{1} \geq \lambda _{2} \geq \ldots\) that solve Eq. (15) may be chosen to be orthonormal over the whole plane \(\|\mathbf{x}\| \leq \infty\) in which case they are also orthogonal over \(\mathcal{R}\):

$$\displaystyle{ \int _{-\infty }^{\infty }g_{\alpha }g_{\beta }\,d\mathbf{x} =\delta _{\alpha \beta },\qquad \int _{ \mathcal{R}}g_{\alpha }g_{\beta }\,d\mathbf{x} =\lambda _{\alpha }\delta _{\alpha \beta }. }$$
(16)

Concentration to the disk-shaped spectral band \(\mathcal{K} =\{ \mathbf{k}:\| \mathbf{k}\| \leq K\}\) allows us to rewrite Eq. (15) after a change of variables and a scaling of the eigenfunctions as

$$\displaystyle{ \int _{\mathcal{R}_{{\ast}}}D(\boldsymbol{\xi },\boldsymbol{\xi }')\,\psi (\boldsymbol{\xi }')\,d\boldsymbol{\xi }' =\lambda \psi (\boldsymbol{\xi }), }$$
(17a)
$$\displaystyle{ D(\boldsymbol{\xi },\boldsymbol{\xi }') = \frac{K\sqrt{A/4\pi }} {2\pi } \frac{J_{1}\big(K\sqrt{A/4\pi }\,\,\|\boldsymbol{\xi } -\boldsymbol{\xi }'\|\big)} {\|\boldsymbol{\xi }-\boldsymbol{\xi }'\|}, }$$
(17b)

where the region \(\mathcal{R}_{{\ast}}\) is scaled to area 4π and \(J_{1}\) is the first-order Bessel function of the first kind. Equation (17) shows that, also in the two-dimensional case, the eigenvalues \(\lambda _{1},\lambda _{2},\ldots\) and the scaled eigenfunctions \(\psi _{1}(\boldsymbol{\xi }),\psi _{2}(\boldsymbol{\xi }),\ldots\) depend only on the combination of the circular bandwidth K and the spatial concentration area A, where the quantity \(K^{2}A/(4\pi )\) now plays the role of the time-bandwidth product TW in the one-dimensional case. The sum of the concentration eigenvalues \(\lambda\) defines the two-dimensional Shannon number \(N^{\mathrm{2D}}\) as

$$\displaystyle{ N^{\mathrm{2D}} =\sum _{ \alpha =1}^{\infty }\lambda _{ \alpha } =\int _{\mathcal{R}_{{\ast}}}D(\boldsymbol{\xi },\boldsymbol{\xi })\,d\boldsymbol{\xi } = \frac{(\pi K^{2})(A)} {(2\pi )^{2}} = K^{2}\,\frac{A} {4\pi }. }$$
(18)

Just as \(N^{\mathrm{1D}}\) in Eq. (8), \(N^{\mathrm{2D}}\) is the product of the spectral and spatial areas of concentration multiplied by the “Nyquist density” (Daubechies 1988, 1992). And, similarly, it is the effective dimension of the space of “essentially” space- and bandlimited functions in which the set of two-dimensional functions \(g_{1},g_{2},\ldots,g_{N^{\mathrm{2D}}}\) may act as a sparse orthogonal basis.

After a long hiatus since the work of Slepian (1964), the two-dimensional problem has recently been the focus of renewed attention in the applied mathematics community (de Villiers et al. 2003; Shkolnisky 2007), and applications to the geosciences are following suit (Simons and Wang 2011). Numerous numerical methods exist to use Eqs. (14) and (15) in solving the concentration problem (13) on two-dimensional Cartesian domains. An example of Slepian functions on a geographical domain in the Cartesian plane can be found in Fig. 1.
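
As a concrete, if schematic, illustration of such a method (ours, not reproduced from the references), the Python fragment below applies the Nyström method with a tensor Gauss-Legendre rule to the spatial-domain problem (15) for a square region of side a and a disk-shaped spectral band of radius K; for this band the kernel (15b) evaluates to \(K J_{1}(K\|\mathbf{x}-\mathbf{x}'\|)/(2\pi \|\mathbf{x}-\mathbf{x}'\|)\). The values of K and a are arbitrary, and the eigenvalue sum can be checked against the Shannon number of Eq. (18).

```python
# Our own construction: Nystrom solution of Eq. (15) for a square region of
# side `a` and a disk-shaped spectral band of radius K.  For this band the
# kernel (15b) is D(x, x') = K J_1(K r)/(2 pi r), r = |x - x'|, with the
# limiting value K^2/(4 pi) on the diagonal.  K, a, nquad are arbitrary.
import numpy as np
from scipy.special import j1

def slepian_eigenvalues_2d(K, a, nquad=40):
    t, w = np.polynomial.legendre.leggauss(nquad)         # 1-D rule, mapped below
    t, w = 0.5 * a * t, 0.5 * a * w
    X, Y = np.meshgrid(t, t, indexing='ij')
    pts = np.column_stack([X.ravel(), Y.ravel()])         # tensor-product nodes
    wts = np.outer(w, w).ravel()                          # tensor-product weights
    r = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    D = np.full_like(r, K ** 2 / (4 * np.pi))             # diagonal (r -> 0) limit
    nz = r > 0
    D[nz] = K * j1(K * r[nz]) / (2 * np.pi * r[nz])       # Eq. (15b) for the disk
    A = np.sqrt(wts)[:, None] * D * np.sqrt(wts)[None, :]
    return np.linalg.eigvalsh(A)[::-1]

K, a = 10.0, 2.0
lam = slepian_eigenvalues_2d(K, a)
print("Shannon number K^2 A/(4 pi):", K ** 2 * a ** 2 / (4 * np.pi))  # Eq. (18)
print("sum of eigenvalues         :", lam.sum())
print("leading eigenvalues        :", np.round(lam[:8], 4))
```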

Fig. 1

Bandlimited eigenfunctions \(g_{1},g_{2},\ldots,g_{4}\) that are optimally concentrated within the Columbia Plateau, a physiographic region in the United States centered on 116.02°W, 43.56°N (near Boise City, Idaho), of area \(A \approx 145 \times 10^{3}\,\mathrm{km}^{2}\). The concentration factors \(\lambda _{1},\lambda _{2},\ldots,\lambda _{4}\) are indicated; the Shannon number \(N^{\mathrm{2D}} = 10\). The top row shows a rendition of the eigenfunctions in space on a grid with 5 km resolution in both directions, with the convention that positive values are blue and negative values red, though the sign of the functions is arbitrary. The spatial concentration region is outlined in black. The bottom row shows the squared Fourier coefficients \(\vert G_{\alpha }(\mathbf{k})\vert ^{2}\) as calculated from the functions \(g_{\alpha }(\mathbf{x})\) shown, on a wavenumber scale that is expressed as a fraction of the Nyquist wavenumber. The spectral limitation region is shown by the black circle at wavenumber K = 0.0295 rad/km. All areas for which the absolute value of the functions plotted is less than one hundredth of the maximum value attained over the domain are left white. The calculations were performed by the Nyström method using Gauss-Legendre integration of Eq. (17) in the two-dimensional spatial domain (Simons and Wang 2011)

2.2.2 Sturm-Liouville Character and Tridiagonal Matrix Formulation

If in addition to the circular spectral limitation, space is also circularly limited, in other words, if the spatial region of concentration or limitation \(\mathcal{R}\) is a circle of radius R, then a polar coordinate, \(\mathbf{x} = (r,\theta )\), representation

$$\displaystyle{ g(r,\theta ) = \left \{\begin{array}{ll} \sqrt{2}\,g(r)\cos m\theta &\ \mbox{ if}\ m <0, \\ g(r) &\ \mbox{ if}\ m = 0, \\ \sqrt{2}\,g(r)\sin m\theta &\ \mbox{ if}\ m> 0,\\ \end{array} \right. }$$
(19)

may be used to decompose Eq. (17) into a series of nondegenerate fixed-order eigenvalue problems, after scaling,

$$\displaystyle{ \int _{0}^{1}D(\xi,\xi ')\psi (\xi ')\,\xi 'd\xi ' =\lambda \psi (\xi ), }$$
(20a)
$$\displaystyle{ D(\xi,\xi ') = 4N^{\mathrm{2D}}\int _{ 0}^{1}J_{ m}\big(2\sqrt{N^{\mathrm{2D }}}\,p\xi \big)\,J_{m}\big(2\sqrt{N^{\mathrm{2D }}}\,p\xi '\big)\,pdp. }$$
(20b)

The solutions to Eq. (20) also solve a Sturm-Liouville equation on \(0 \leq \xi \leq 1\). In terms of \(\varphi (\xi ) = \sqrt{\xi }\,\psi (\xi )\),

$$\displaystyle{ \frac{d} {d\xi }\left [(1 -\xi ^{2})\frac{d\varphi } {d\xi }\right ] + \left (\chi +\frac{1/4 - m^{2}} {\xi ^{2}} - 4N^{\mathrm{2D}}\xi ^{2}\right )\varphi = 0, }$$
(21)

for some \(\chi \neq \lambda\). When m = ±1∕2 Eq. (21) reduces to Eq. (9). By extension to \(\xi> 1\) the fixed-order “generalized prolate spheroidal functions” \(\varphi _{1}(\xi ),\varphi _{2}(\xi ),\ldots\) can be determined from the rapidly converging infinite series

$$\displaystyle{ \varphi (\xi ) = \frac{\sqrt{2}} {\gamma } \sum _{l=0}^{\infty }(2l + m + 1)^{1/2}d_{ l}\frac{J_{m+2l+1}(c\xi )} {\sqrt{c\xi }},\qquad \xi \in \mathbb{R}^{+}, }$$
(22)

where \(\varphi (0) = 0\) and the fixed-m expansion coefficients \(d_{l}\) are determined by recursion (Slepian 1964) or by diagonalization of a symmetric tridiagonal matrix (de Villiers et al. 2003; Shkolnisky 2007) with elements given by

$$\displaystyle\begin{array}{rcl} T_{ll}& =& \left (2l + m + \frac{1} {2}\right )\left (2l + m + \frac{3} {2}\right ) + \frac{c^{2}} {2} \left [1 + \frac{m^{2}} {(2l + m)(2l + m + 2)}\right ], \\ T_{l+1\,l}& =& - \frac{c^{2}\,(l + 1)(m + l + 1)} {\sqrt{2l + m + 1}\,(2l + m + 2)\,\sqrt{2l + m + 3}}, {}\end{array}$$
(23)

where the parameter l ranges from 0 to some large value that ensures convergence. The desired concentration eigenvalues \(\lambda\) can subsequently be obtained by direct integration of Eq. (17), or, alternatively, from

$$\displaystyle{ \lambda = 2\gamma ^{2}\sqrt{N^{\mathrm{2D }}},\quad \mbox{ with}\quad \gamma = \frac{c^{m+1/2}d_{ 0}} {2^{m+1}(m + 1)!}\left (\sum _{l=0}^{\infty }d_{ l}\right )^{-1}. }$$
(24)

An example of Slepian functions on a disk-shaped region in the Cartesian plane can be found in Fig. 2. The solutions were obtained by the Nyström method, using Gauss-Legendre integration of Eq. (17) in the two-dimensional spatial domain. These differ only very slightly from the results of computations carried out by diagonalization of Eq. (23) directly, as shown and discussed by us elsewhere (Simons and Wang 2011).
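
The fixed-order problem can also be attacked head-on. The sketch below (our own, with \(N^{\mathrm{2D}} = 42\) chosen to match Fig. 2 and m = 0 as an arbitrary example) evaluates the kernel (20b) by quadrature over p and solves Eq. (20a) by the Nyström method, with the radial measure \(\xi '\,d\xi '\) absorbed symmetrically.

```python
# Our own sketch: direct Nystrom solution of the fixed-order radial problem
# (20) for the disk, as a numerical alternative to Eqs. (22)-(24).  The kernel
# (20b) is evaluated by quadrature over p; N2D = 42 matches Fig. 2, m = 0 is
# an arbitrary example.
import numpy as np
from scipy.special import jv

def fixed_order_disk_eigenvalues(N2D, m, nquad=80):
    t, w = np.polynomial.legendre.leggauss(nquad)
    xi, wxi = 0.5 * (t + 1.0), 0.5 * w                   # nodes, weights on [0, 1]
    J = jv(m, 2.0 * np.sqrt(N2D) * np.outer(xi, xi))     # J_m(2 sqrt(N2D) p xi)
    D = 4.0 * N2D * np.einsum('p,p,pi,pj->ij', wxi, xi, J, J)   # Eq. (20b)
    s = np.sqrt(wxi * xi)                                # radial measure xi' d xi'
    A = s[:, None] * D * s[None, :]                      # symmetrized Nystrom matrix
    return np.linalg.eigvalsh(A)[::-1]

lam = fixed_order_disk_eigenvalues(N2D=42.0, m=0)
print("leading fixed-order (m = 0) eigenvalues:", np.round(lam[:8], 4))
```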

Fig. 2

Bandlimited eigenfunctions \(g_{\alpha }(r,\theta )\) that are optimally concentrated within a Cartesian disk of radius R = 1. The dashed circle denotes the region boundary. The Shannon number \(N^{\mathrm{2D}} = 42\). The eigenvalues \(\lambda _{\alpha }\) have been sorted to a global ranking with the best-concentrated eigenfunction plotted at the top left and the 30th best in the lower right. Blue is positive and red is negative and the color axis is symmetric, but the sign is arbitrary; regions in which the absolute value is less than one hundredth of the maximum value on the domain are left white. The calculations were performed by Gauss-Legendre integration in the two-dimensional spatial domain, which sometimes leads to slight differences in the last two digits of what should be identical eigenvalues for each pair of non-circularly-symmetric eigenfunctions

2.3 Spatiospectral Concentration on the Surface of a Sphere

2.3.1 General Theory in “Three” Dimensions

We denote the colatitude of a geographical point \(\mathbf{\hat{r}}\) on the unit sphere surface \(\Omega =\{ \mathbf{\hat{r}}:\| \mathbf{\hat{r}}\| = 1\}\) by \(0 \leq \theta \leq \pi\) and the longitude by 0 ≤ ϕ < 2π. We use R to denote a region of \(\Omega\), of area A, within which we seek to concentrate a bandlimited function of position \(\mathbf{\hat{r}}\). We express a square-integrable function \(f(\mathbf{\hat{r}})\) on the surface of the unit sphere as

$$\displaystyle{ f(\mathbf{\hat{r}}) =\sum \limits _{ l=0}^{\infty }\sum \limits _{ m=-l}^{l}f_{ lm}Y _{lm}(\mathbf{\hat{r}}),\qquad f_{lm} =\int _{\Omega }f(\mathbf{\hat{r}})Y _{lm}(\mathbf{\hat{r}})\,d\Omega, }$$
(25)

using orthonormalized real surface spherical harmonics (Edmonds 1996; Dahlen and Tromp 1998)

$$\displaystyle\begin{array}{rcl} Y _{lm}(\mathbf{\hat{r}}) = Y _{lm}(\theta,\phi ) = \left \{\begin{array}{ll} \sqrt{2}X_{l\vert m\vert }(\theta )\cos m\phi &\ \ \mbox{ if}\ \ - l \leq m <0, \\ X_{l0}(\theta ) &\ \ \mbox{ if}\ \ m = 0, \\ \sqrt{2}X_{lm}(\theta )\sin m\phi &\ \ \mbox{ if}\ \ 0 <m \leq l,\\ \end{array} \right.& &{}\end{array}$$
(26)
$$\displaystyle\begin{array}{rcl} X_{lm}(\theta ) = (-1)^{m}\left (\frac{2l + 1} {4\pi } \right )^{1/2}\left [\frac{(l - m)!} {(l + m)!}\right ]^{1/2}P_{ lm}(\cos \theta ),& &{}\end{array}$$
(27)
$$\displaystyle\begin{array}{rcl} P_{lm}(\mu ) = \frac{1} {2^{l}l!}(1 -\mu ^{2})^{m/2}\left (\frac{d} {d\mu }\right )^{l+m}(\mu ^{2} - 1)^{l}.& &{}\end{array}$$
(28)

The quantity \(0 \leq l \leq \infty\) is the angular degree of the spherical harmonic, and \(-l \leq m \leq l\) is its angular order. The function \(P_{lm}(\mu )\) defined in Eq. (28) is the associated Legendre function of integer degree l and order m. Our choice of the constants in Eqs. (26) and (27) orthonormalizes the harmonics on the unit sphere:

$$\displaystyle{ \int _{\Omega }Y _{lm}Y _{l'm'}\,d\Omega =\delta _{ll'}\delta _{mm'}, }$$
(29)

and leads to the addition theorem in terms of the Legendre functions \(P_{l}(\mu ) = P_{l0}(\mu )\) as

$$\displaystyle{ \sum \limits _{m=-l}^{l}Y _{ lm}(\mathbf{\hat{r}})Y _{lm}(\mathbf{\hat{r}}') = \left (\frac{2l + 1} {4\pi } \right )P_{l}(\mathbf{\hat{r}} \cdot \mathbf{\hat{r}}'). }$$
(30)

To maximize the spatial concentration of a bandlimited function

$$\displaystyle{ g(\mathbf{\hat{r}}) =\sum \limits _{ l=0}^{L}\sum \limits _{ m=-l}^{l}g_{ lm}Y _{lm}(\mathbf{\hat{r}}) }$$
(31)

within a region R, we maximize the energy ratio

$$\displaystyle{ \lambda = \frac{\int _{R}g^{2}(\mathbf{\hat{r}})\,d\Omega } {\int _{\Omega }^{}g^{2}(\mathbf{\hat{r}})\,d\Omega } = \mbox{ maximum}. }$$
(32)

Maximizing Eq. (32) leads to the positive-definite spectral-domain eigenvalue equation

$$\displaystyle{ \sum \limits _{l'=0}^{L}\sum \limits _{ m'=-l'}^{l'}D_{ lm,l'm'}g_{l'm'} =\lambda g_{lm},\qquad 0 \leq l \leq L, }$$
(33a)
$$\displaystyle{ D_{lm,l'm'} =\int _{R}Y _{lm}Y _{l'm'}\,d\Omega, }$$
(33b)

and we may equally well rewrite Eq. (33) as a spatial-domain eigenvalue equation:

$$\displaystyle{ \int _{R}D(\mathbf{\hat{r}},\mathbf{\hat{r}}')\,g(\mathbf{\hat{r}}')\,d\Omega ' =\lambda g(\mathbf{\hat{r}}),\qquad \mathbf{\hat{r}} \in \Omega, }$$
(34a)
$$\displaystyle{ D(\mathbf{\hat{r}},\mathbf{\hat{r}}') =\sum _{ l=0}^{L}\left (\frac{2l + 1} {4\pi } \right )P_{l}(\mathbf{\hat{r}} \cdot \mathbf{\hat{r}}'), }$$
(34b)

where \(P_{l}\) is the Legendre function of degree l. Equation (34) is a homogeneous Fredholm integral equation of the second kind, with a finite-rank, symmetric, Hermitian kernel. We choose the spectral eigenfunctions of the operator in Eq. (33b), whose elements are \(g_{lm\,\alpha }\), \(\alpha = 1,\ldots,(L + 1)^{2}\), to satisfy the orthonormality relations

$$\displaystyle{ \sum \limits _{lm}^{L}g_{ lm\alpha }g_{lm\beta } =\delta _{\alpha \beta },\qquad \sum \limits _{lm}^{L}g_{ lm\alpha }\sum \limits _{l'm'}^{L}D_{ lm,l'm'}g_{l'm'\beta } =\lambda _{\alpha }\delta _{\alpha \beta }. }$$
(35)

The finite set of bandlimited spatial eigensolutions \(g_{1}(\mathbf{\hat{r}}),g_{2}(\mathbf{\hat{r}}),\ldots,g_{(L+1)^{2}}(\mathbf{\hat{r}})\) can be made orthonormal over the whole sphere \(\Omega\) and orthogonal over the region R:

$$\displaystyle{ \int _{\Omega }g_{\alpha }g_{\beta }\,d\Omega =\delta _{\alpha \beta },\qquad \int _{R}g_{\alpha }g_{\beta }\,d\Omega =\lambda _{\alpha }\delta _{\alpha \beta }. }$$
(36)

In the limit of a small area \(A \rightarrow 0\) and a large bandwidth \(L \rightarrow \infty\) and after a change of variables, a scaled version of Eq. (34) will be given by

$$\displaystyle{ \int _{R_{{\ast}}}D(\boldsymbol{\xi },\boldsymbol{\xi }')\,\psi (\boldsymbol{\xi }')\,d\Omega '_{{\ast}} =\lambda \psi (\boldsymbol{\xi }), }$$
(37a)
$$\displaystyle{ D(\boldsymbol{\xi },\boldsymbol{\xi }') = \frac{(L + 1)\sqrt{A/4\pi }} {2\pi } \frac{J_{1}[(L + 1)\sqrt{A/4\pi }\,\,\|\boldsymbol{\xi } -\boldsymbol{\xi }'\|]} {\|\boldsymbol{\xi } -\boldsymbol{\xi }'\|}, }$$
(37b)

where the scaled region \(R_{{\ast}}\) now has area 4π and \(J_{1}\) again is the first-order Bessel function of the first kind. As in the one- and two-dimensional cases, the asymptotic, or “flat-Earth,” eigenvalues \(\lambda _{1} \geq \lambda _{2} \geq \ldots\) and scaled eigenfunctions \(\psi _{1}(\boldsymbol{\xi }),\psi _{2}(\boldsymbol{\xi }),\ldots\) depend upon the maximal degree L and the area A only through what is once again a space-bandwidth product, the “spherical Shannon number,” this time given by

$$\displaystyle\begin{array}{rcl} N^{\mathrm{3D}}& =& \sum _{\alpha =1}^{(L+1)^{2} }\lambda _{\alpha } =\sum _{ l=0}^{L}\sum \limits _{ m=-l}^{l}D_{ lm,lm} =\int _{R}D(\mathbf{\hat{r}},\mathbf{\hat{r}})\,d\Omega \\ & =& \int _{R_{{\ast}}}D(\boldsymbol{\xi },\boldsymbol{\xi })\,d\Omega _{{\ast}} = (L + 1)^{2}\,\frac{A} {4\pi }. {}\end{array}$$
(38)

Irrespective of the particular region of concentration that they were designed for, the complete set of bandlimited spatial Slepian eigenfunctions \(g_{1},g_{2},\ldots,g_{(L+1)^{2}}\) is a basis for bandlimited scalar processes anywhere on the surface of the unit sphere (Simons et al. 2006; Simons and Dahlen 2006). This follows directly from the fact that the spectral localization kernel (33b) is real, symmetric, and positive definite: its eigenvectors \(g_{1lm},g_{2lm},\ldots,g_{(L+1)^{2}lm}\) form an orthogonal set as we have seen. Thus the Slepian basis functions \(g_{\alpha }(\mathbf{\hat{r}})\), \(\alpha = 1,\ldots,(L + 1)^{2}\), given by Eq. (31) simply transform the same-sized limited set of spherical harmonics \(Y _{lm}(\mathbf{\hat{r}})\), \(0 \leq l \leq L\), \(-l \leq m \leq l\), that are a basis for the same space of bandlimited spherical functions with no power above the bandwidth L. The effect of this transformation is to order the resulting basis set such that the energy of the first \(N^{\mathrm{3D}}\) functions, \(g_{1}(\mathbf{\hat{r}}),\ldots,g_{N^{\mathrm{3D}}}(\mathbf{\hat{r}})\), with eigenvalues \(\lambda \approx 1\), is concentrated in the region R, whereas the remaining eigenfunctions, \(g_{N^{\mathrm{3D}}+1}(\mathbf{\hat{r}}),\ldots,g_{(L+1)^{2}}(\mathbf{\hat{r}})\), are concentrated in the complementary region \(\Omega \setminus R\). As in the one- and two-dimensional cases, therefore, the reduced set of basis functions \(g_{1},g_{2},\ldots,g_{N^{\mathrm{3D}}}\) can be regarded as a sparse, global basis suitable to approximate bandlimited processes that are primarily localized to the region R. The dimensionality reduction is dependent on the fractional area of the region of interest. In other words, the full dimension of the space \((L + 1)^{2}\) can be “sparsified” to an effective dimension of \(N^{\mathrm{3D}} = (L + 1)^{2}A/(4\pi )\) when the signal of interest resides in a particular geographic region.

Numerical methods for the solution of Eqs. (33) and (34) on completely general domains on the surface of the sphere were discussed by us elsewhere (Simons et al. 2006; Simons and Dahlen 2006, 2007). An example of Slepian functions on a geographical domain on the surface of the sphere is found in Fig. 3.

Fig. 3

Bandlimited L = 60 eigenfunctions \(g_{1},g_{2},\ldots,g_{12}\) that are optimally concentrated within Antarctica. The concentration factors \(\lambda _{1},\lambda _{2},\ldots,\lambda _{12}\) are indicated; the rounded Shannon number is \(N^{\mathrm{3D}} = 102\). The order of concentration is left to right, top to bottom. Positive values are blue and negative values are red; the sign of an eigenfunction is arbitrary. Regions in which the absolute value is less than one hundredth of the maximum value on the sphere are left white. We integrated Eq. (33b) over a splined high-resolution representation of the boundary, using Gauss-Legendre quadrature over the colatitudes, and analytically in the longitudinal dimension (Simons and Dahlen 2007)

2.3.2 Sturm-Liouville Character and Tridiagonal Matrix Formulation

In the special but important case in which the region of concentration is a circularly symmetric cap of colatitudinal radius \(\Theta\), centered on the North Pole, the colatitudinal parts \(g(\theta )\) of the separable functions

$$\displaystyle{ g(\theta,\phi ) = \left \{\begin{array}{ll} \sqrt{2}\,g(\theta )\cos m\phi &\quad \mbox{ if}\ \ - L \leq m <0, \\ g(\theta ) &\quad \mbox{ if}\ \ m = 0, \\ \sqrt{2}\,g(\theta )\sin m\phi &\quad \mbox{ if}\ \ 0 <m \leq L,\\ \end{array} \right. }$$
(39)

which solve Eq. (34), or, indeed, the fixed-order versions

$$\displaystyle{ \int _{0}^{\Theta }D(\theta,\theta ')\,g(\theta ')\sin \theta '\,d\theta ' = \lambda g(\theta ),\quad 0 \leq \theta \leq \pi, }$$
(40a)
$$\displaystyle{ D(\theta,\theta ') = 2\pi \sum \limits _{l=m}^{L}X_{ lm}(\theta )X_{lm}(\theta '), }$$
(40b)

are identical to those of a Sturm-Liouville equation for the \(g(\theta )\). In terms of \(\mu =\cos \theta\),

$$\displaystyle{ \frac{d} {d\mu }\left [(\mu -\cos \Theta )(1 -\mu ^{2})\frac{dg} {d\mu } \right ] + \left (\chi +L(L + 2)\mu -\frac{m^{2}(\mu -\cos \Theta )} {1 -\mu ^{2}} \right )g = 0, }$$
(41)

with \(\chi \neq \lambda\). This equation can be solved in the spectral domain by diagonalization of a simple symmetric tridiagonal matrix with a very well-behaved spectrum (Simons et al. 2006; Simons and Dahlen 2007). This matrix, whose eigenfunctions correspond to the \(g_{lm}\) of Eq. (31) at constant m, is given by

$$\displaystyle\begin{array}{rcl} T_{ll}& =& -l(l + 1)\cos \Theta, \\ T_{l\,l+1}& =& \big[l(l + 2) - L(L + 2)\big]\sqrt{ \frac{(l + 1)^{2 } - m^{2 } } {(2l + 1)(2l + 3)}}.{}\end{array}$$
(42)

Moreover, when the region of concentration is a pair of axisymmetric polar caps of common colatitudinal radius \(\Theta\) centered on the North and South Pole, the \(g(\theta )\) can be obtained by solving the Sturm-Liouville equation

$$\displaystyle\begin{array}{rcl} & & \frac{d} {d\mu }\left [(\mu ^{2} -\cos ^{2}\Theta )(1 -\mu ^{2})\frac{dg} {d\mu } \right ] \\ & & \quad + \left (\chi +L_{p}(L_{p} + 3)\mu ^{2} -\frac{m^{2}(\mu ^{2} -\cos ^{2}\Theta )} {1 -\mu ^{2}} \right )g = 0,{}\end{array}$$
(43)

where \(L_{p} = L\) or \(L_{p} = L - 1\), depending on whether the order m of the functions \(g(\theta,\phi )\) in Eq. (39) is odd or even and on whether the bandwidth L itself is odd or even. In their spectral form the coefficients of the optimally concentrated antipodal polar-cap eigenfunctions only require the numerical diagonalization of a symmetric tridiagonal matrix with analytically prescribed elements and a spectrum of eigenvalues that is guaranteed to be simple (Simons and Dahlen 2006, 2007), namely,

$$\displaystyle\begin{array}{rcl} T_{ll}^{p}& =& -l(l + 1)\cos ^{2}\Theta + \frac{2} {2l + 3}\left [(l + 1)^{2} - m^{2}\right ] \\ & & + [(l - 2)(l + 1) - L_{p}(L_{p} + 3)]\left [\frac{1} {3} -\frac{2} {3}\, \frac{3m^{2} - l(l + 1)} {(2l + 3)(2l - 1)}\right ], \\ T_{l\,l+2}^{p}& =& \frac{\big[l(l + 3) - L_{p}(L_{p} + 3)\big]} {2l + 3} \sqrt{\frac{\left [(l + 2)^{2 } - m^{2 } \right ] \left [(l + 1)^{2 } - m^{2 } \right ] } {(2l + 5)(2l + 1)}}.{}\end{array}$$
(44)

The concentration values \(\lambda\), in turn, can be determined from the defining Eqs. (33) or (34). The spectra of the eigenvalues χ of Eqs. (42) and (44) display roughly equant spacing, without the numerically troublesome plateaus of nearly equal values that characterize the eigenvalues \(\lambda\). Thus, for the special cases of symmetric single and double polar caps, the concentration problem posed in Eq. (32) is not only numerically feasible, also in circumstances where direct solution methods are bound to fail (Albertella et al. 1999), but essentially trivial in every situation. In practical applications, the eigenfunctions that are optimally concentrated within a polar cap can be rotated to an arbitrarily positioned circular cap on the unit sphere using standard spherical harmonic rotation formulae (Edmonds 1996; Blanco et al. 1997; Dahlen and Tromp 1998).
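
The following Python sketch (our own, with an arbitrarily chosen bandwidth, order, and cap radius) illustrates the procedure for a single polar cap: it diagonalizes the tridiagonal matrix of Eq. (42), recovers the concentration values \(\lambda\) from the localization matrix of Eq. (64) evaluated by Gauss-Legendre quadrature, and cross-checks them against the direct diagonalization of Eq. (33).

```python
# Our own implementation: fixed-order Slepian functions for a polar cap of
# radius Theta via the tridiagonal matrix of Eq. (42).  The concentration
# values lambda are recovered from the localization matrix of Eq. (64) by
# Gauss-Legendre quadrature and cross-checked against the direct
# diagonalization of Eq. (33).  L, m, Theta are arbitrary example parameters.
import numpy as np
from scipy.special import lpmv, gammaln

def Xlm(l, m, theta):
    # Normalized associated Legendre function of Eq. (27); scipy's lpmv already
    # carries the (-1)^m Condon-Shortley factor that appears in Eq. (27).
    norm = np.exp(0.5 * (np.log(2 * l + 1) - np.log(4 * np.pi)
                         + gammaln(l - m + 1) - gammaln(l + m + 1)))
    return norm * lpmv(m, l, np.cos(theta))

def polar_cap_slepians(L, m, Theta, nquad=200):
    ls = np.arange(m, L + 1)
    # tridiagonal matrix of Eq. (42)
    diag = -ls * (ls + 1.0) * np.cos(Theta)
    off = (ls[:-1] * (ls[:-1] + 2.0) - L * (L + 2.0)) * np.sqrt(
        ((ls[:-1] + 1.0) ** 2 - m ** 2) / ((2.0 * ls[:-1] + 1) * (2.0 * ls[:-1] + 3)))
    T = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    chi, G = np.linalg.eigh(T)                 # columns of G hold the g_lm
    # localization matrix of Eq. (64), by quadrature over the cap
    t, w = np.polynomial.legendre.leggauss(nquad)
    th, wth = 0.5 * Theta * (t + 1.0), 0.5 * Theta * w
    X = np.array([Xlm(l, m, th) for l in ls])
    D = 2 * np.pi * np.einsum('q,iq,jq->ij', wth * np.sin(th), X, X)
    lam_T = np.sort(np.einsum('la,lk,ka->a', G, D, G))[::-1]
    lam_D = np.sort(np.linalg.eigvalsh(D))[::-1]
    return lam_T, lam_D

lam_T, lam_D = polar_cap_slepians(L=18, m=0, Theta=np.radians(40.0))
print("lambda from Eq. (42) eigenvectors:", np.round(lam_T[:6], 8))
print("lambda from Eq. (33) directly    :", np.round(lam_D[:6], 8))
```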

2.4 Vectorial Slepian Functions on the Surface of a Sphere

2.4.1 General Theory in “Three” Vectorial Dimensions

The expansion of a real-valued square-integrable vector field \(\mathbf{f}(\mathbf{\hat{r}})\) on the unit sphere \(\Omega\) can be written as

$$\displaystyle{ \mathbf{f}(\mathbf{\hat{r}}) =\sum \limits _{ l=0}^{\infty }\sum \limits _{ m=-l}^{l}f_{ lm}^{P}\mathbf{P}_{ lm}(\mathbf{\hat{r}}) + f_{lm}^{B}\mathbf{B}_{ lm}(\mathbf{\hat{r}}) + f_{lm}^{C}\mathbf{C}_{ lm}(\mathbf{\hat{r}}), }$$
(45a)
$$\displaystyle\begin{array}{rcl} f_{lm}^{P}& =& \int _{ \Omega }\mathbf{P}_{lm}(\mathbf{\hat{r}}) \cdot \mathbf{f}(\mathbf{\hat{r}})\,d\Omega,\quad f_{lm}^{B} =\int _{ \Omega }\mathbf{B}_{lm}(\mathbf{\hat{r}}) \cdot \mathbf{f}(\mathbf{\hat{r}})\,d\Omega,\quad \mbox{ and}\,\, \\ f_{lm}^{C}& =& \int _{ \Omega }\mathbf{C}_{lm}(\mathbf{\hat{r}}) \cdot \mathbf{f}(\mathbf{\hat{r}})\,d\Omega, {}\end{array}$$
(45b)

using real vector surface spherical harmonics (Dahlen and Tromp 1998; Sabaka et al. 2010; Gerhards 2011) that are constructed from the scalar harmonics in Eq. (26), as follows. In the vector spherical coordinates \((\mathbf{\hat{r}},\boldsymbol{\hat{\theta }},\boldsymbol{\hat{\phi }})\) and using the surface gradient \(\boldsymbol{\nabla }_{1} =\boldsymbol{\hat{\theta }} \partial _{\theta } +\boldsymbol{\hat{\phi }} (\sin \theta )^{-1}\partial _{\phi }\), we write, for l > 0 and \(-l \leq m \leq l\),

$$\displaystyle\begin{array}{rcl} \mathbf{P}_{lm}(\mathbf{\hat{r}}) = \mathbf{\hat{r}}Y _{lm}(\mathbf{\hat{r}}),& &{}\end{array}$$
(46)
$$\displaystyle\begin{array}{rcl} \mathbf{B}_{lm}(\mathbf{\hat{r}}) = \frac{\boldsymbol{\nabla }_{1}Y _{lm}(\mathbf{\hat{r}})} {\sqrt{l(l + 1)}} = \frac{[\boldsymbol{\hat{\theta }}\partial _{\theta } +\boldsymbol{\hat{\phi }} (\sin \theta )^{-1}\partial _{\phi }]Y _{lm}(\mathbf{\hat{r}})} {\sqrt{l(l + 1)}},& &{}\end{array}$$
(47)
$$\displaystyle\begin{array}{rcl} \mathbf{C}_{lm}(\mathbf{\hat{r}}) = \frac{-\mathbf{\hat{r}} \times \boldsymbol{\nabla }_{1}Y _{lm}(\mathbf{\hat{r}})} {\sqrt{l(l + 1)}} = \frac{[\boldsymbol{\hat{\theta }}(\sin \theta )^{-1}\partial _{\phi } -\boldsymbol{\hat{\phi }} \partial _{\theta }]Y _{lm}(\mathbf{\hat{r}})} {\sqrt{l(l + 1)}},& &{}\end{array}$$
(48)

together with the purely radial \(\mathbf{P}_{00} = (4\pi )^{-1/2}\mathbf{\hat{r}}\), and setting \(f_{00}^{B} = f_{00}^{C} = 0\) for every vector field f. The remaining expansion coefficients (45b) are naturally obtained from Eq. (45a) through the orthonormality relationships

$$\displaystyle\begin{array}{rcl} \int _{\Omega }\mathbf{P}_{lm} \cdot \mathbf{P}_{l'm'}\,d\Omega & =& \int _{\Omega }\mathbf{B}_{lm} \cdot \mathbf{B}_{l'm'}\,d\Omega =\int _{\Omega }\mathbf{C}_{lm} \cdot \mathbf{C}_{l'm'}\,d\Omega =\delta _{ll'}\delta _{mm'},{}\end{array}$$
(49a)
$$\displaystyle\begin{array}{rcl} \int _{\Omega }\mathbf{P}_{lm} \cdot \mathbf{B}_{l'm'}\,d\Omega & =& \int _{\Omega }\mathbf{P}_{lm} \cdot \mathbf{C}_{l'm'}\,d\Omega =\int _{\Omega }\mathbf{B}_{lm} \cdot \mathbf{C}_{l'm'}\,d\Omega = 0.{}\end{array}$$
(49b)

The vector spherical-harmonic addition theorem  (Freeden and Schreiner 2009) implies the limited result

$$\displaystyle{ \sum \limits _{m=-l}^{l}\mathbf{P}_{ lm}(\mathbf{\hat{r}})\cdot \mathbf{P}_{lm}(\mathbf{\hat{r}}) = \left (\frac{2l + 1} {4\pi } \right ) =\sum \limits _{ m=-l}^{l}\mathbf{B}_{ lm}(\mathbf{\hat{r}})\cdot \mathbf{B}_{lm}(\mathbf{\hat{r}}) =\sum \limits _{ m=-l}^{l}\mathbf{C}_{ lm}(\mathbf{\hat{r}})\cdot \mathbf{C}_{lm}(\mathbf{\hat{r}}). }$$
(50)

As before we seek to maximize the spatial concentration of a bandlimited spherical vector function

$$\displaystyle{ \mathbf{g}(\mathbf{\hat{r}}) =\sum \limits _{ l=0}^{L}\sum \limits _{ m=-l}^{l}g_{ lm}^{P}\mathbf{P}_{ lm}(\mathbf{\hat{r}}) + g_{lm}^{B}\mathbf{B}_{ lm}(\mathbf{\hat{r}}) + g_{lm}^{C}\mathbf{C}_{ lm}(\mathbf{\hat{r}}) }$$
(51)

within a certain region R, in the vectorial case by maximizing the energy ratio

$$\displaystyle{ \lambda = \frac{\int _{R}\mathbf{g} \cdot \mathbf{g}\,d\Omega } {\int _{\Omega }\mathbf{g} \cdot \mathbf{g}\,d\Omega } = \text{maximum}. }$$
(52)

The maximization of Eq. (52) leads to a coupled system of positive-definite spectral-domain eigenvalue equations, for \(0 \leq l \leq L\) and \(-l \leq m \leq l\),

$$\displaystyle\begin{array}{rcl} \sum \limits _{l'=0}^{L}\sum \limits _{ m'=-l'}^{l'}D_{ lm,l'm'}g_{l'm'}^{P}& =& \lambda \ g_{ lm}^{P},{}\end{array}$$
(53a)
$$\displaystyle\begin{array}{rcl} \sum \limits _{l'=0}^{L}\sum \limits _{ m'=-l'}^{l'}B_{ lm,l'm'}g_{l'm'}^{B} + C_{ lm,l'm'}g_{l'm'}^{C}& =& \lambda \ g_{ lm}^{B},{}\end{array}$$
(53b)
$$\displaystyle\begin{array}{rcl} \sum \limits _{l'=0}^{L}\sum \limits _{ m'=-l'}^{l'}C_{ lm,l'm'}^{\mathit{T}}g_{ l'm'}^{B} + B_{ lm,l'm'}g_{l'm'}^{C}& =& \lambda \ g_{ lm}^{C}.{}\end{array}$$
(53c)

The matrix elements that complement the equations above are given below; of these, Eq. (54a) is identical to Eq. (33b),

$$\displaystyle\begin{array}{rcl} D_{lm,l'm'}& =& \int _{R}\mathbf{P}_{lm} \cdot \mathbf{P}_{l'm'}\,d\Omega =\int _{R}Y _{lm}Y _{l'm'}\,d\Omega,{}\end{array}$$
(54a)
$$\displaystyle\begin{array}{rcl} B_{lm,l'm'}& =& \int _{R}\mathbf{B}_{lm} \cdot \mathbf{B}_{l'm'}\,d\Omega =\int _{R}\mathbf{C}_{lm} \cdot \mathbf{C}_{l'm'}\,d\Omega,{}\end{array}$$
(54b)
$$\displaystyle\begin{array}{rcl} C_{lm,l'm'}& =& \int _{R}\mathbf{B}_{lm} \cdot \mathbf{C}_{l'm'}\,d\Omega,{}\end{array}$$
(54c)

and the transpose of Eq. (54c) switches its sign. The radial vectorial concentration problem (53a)–(54a) is identical to the corresponding scalar case (33) and can be solved separately from the tangential equations. Altogether, in the space domain, the equivalent eigenvalue equation is

$$\displaystyle{ \int _{R}\mathbf{D}(\mathbf{\hat{r}},\mathbf{\hat{r}}') \cdot \mathbf{g}(\mathbf{\hat{r}}')\,d\Omega =\lambda \mathbf{g}(\mathbf{\hat{r}}),\quad \mathbf{\hat{r}} \in \Omega, }$$
(55a)
$$\displaystyle{ \mathbf{D}(\mathbf{\hat{r}},\mathbf{\hat{r}}') =\sum \limits _{ l=0}^{L}\sum \limits _{ m=-l}^{l}\mathbf{P}_{ lm}(\mathbf{\hat{r}})\mathbf{P}_{lm}(\mathbf{\hat{r}}') + \mathbf{B}_{lm}(\mathbf{\hat{r}})\mathbf{B}_{lm}(\mathbf{\hat{r}}') + \mathbf{C}_{lm}(\mathbf{\hat{r}})\mathbf{C}_{lm}(\mathbf{\hat{r}}'), }$$
(55b)

a homogeneous Fredholm integral equation with a finite-rank, symmetric, separable, bandlimited kernel. Further reducing Eq. (55) using the full version of the vectorial addition theorem does not yield much additional insight.

After collecting the spheroidal (radial, consoidal) and toroidal expansion coefficients in a vector,

$$\displaystyle{ \boldsymbol{\mathsf{g}} = (\ldots,g_{lm}^{P},\ldots,g_{ lm}^{B},\ldots,g_{ lm}^{C},\ldots )^{\mathsf{T}} }$$
(56)

and the kernel elements \(D_{lm,l'm'}\), \(B_{lm,l'm'}\), and \(C_{lm,l'm'}\) of Eq. (54) into the submatrices \(\boldsymbol{\mathsf{D}}\), \(\boldsymbol{\mathsf{B}}\), and \(\boldsymbol{\mathsf{C}}\), we assemble

$$\displaystyle{ \boldsymbol{\mathsf{K}} = \left (\begin{array}{*{10}c} \boldsymbol{\mathsf{D}}& \boldsymbol{\mathsf{0}} & \boldsymbol{\mathsf{0}}\\ \boldsymbol{\mathsf{0} } & \boldsymbol{\mathsf{B} } & \boldsymbol{\mathsf{C}} \\ \boldsymbol{\mathsf{0}} &\boldsymbol{\mathsf{C}}^{\mathsf{T}} & \boldsymbol{\mathsf{B}} \end{array} \right ). }$$
(57)

In this new notation Eq. (53) reads as a \([3(L + 1)^{2} - 2] \times [3(L + 1)^{2} - 2]\)-dimensional algebraic eigenvalue problem

$$\displaystyle{ \boldsymbol{\mathsf{K}}\boldsymbol{\mathsf{g}} =\lambda \boldsymbol{ \mathsf{g}}, }$$
(58)

whose eigenvectors \(\boldsymbol{\mathsf{g}}_{1},\boldsymbol{\mathsf{g}}_{2},\ldots,\boldsymbol{\mathsf{g}}_{3(L+1)^{2}-2}\) are mutually orthogonal in the sense

$$\displaystyle{ \boldsymbol{\mathsf{g}}_{\alpha }^{\mathsf{T}}\boldsymbol{\mathsf{g}}_{\beta }^{} =\delta _{\alpha \beta },\qquad \boldsymbol{\mathsf{g}}_{\alpha }^{\mathsf{T}}\boldsymbol{\mathsf{K}}\boldsymbol{\mathsf{g}}_{\beta }^{} =\lambda _{\alpha }\delta _{\alpha \beta }. }$$
(59)

The associated eigenfields \(\mathbf{g}_{1}(\mathbf{\hat{r}}),\mathbf{g}_{2}(\mathbf{\hat{r}}),\ldots,\mathbf{g}_{3(L+1)^{2}-2}(\mathbf{\hat{r}})\) are orthogonal over the region R and orthonormal over the whole sphere \(\Omega\):

$$\displaystyle{ \int _{\Omega }\mathbf{g}_{\alpha } \cdot \mathbf{g}_{\beta }\,d\Omega =\delta _{\alpha \beta },\qquad \int _{R}\mathbf{g}_{\alpha } \cdot \mathbf{g}_{\beta }\,d\Omega =\lambda _{\alpha }\delta _{\alpha \beta }. }$$
(60)

The relations (60) for the spatial domain are equivalent to their matrix counterparts (59). The eigenfield \(\mathbf{g}_{1}(\mathbf{\hat{r}})\) with the largest eigenvalue \(\lambda _{1}\) is the element in the space of bandlimited vector fields with most of its spatial energy within region R; the eigenfield \(\mathbf{g}_{2}(\mathbf{\hat{r}})\) is the next best-concentrated bandlimited function orthogonal to \(\mathbf{g}_{1}(\mathbf{\hat{r}})\) over both \(\Omega\) and R; and so on. Finally, as in the scalar case, we can sum up the eigenvalues of the matrix \(\boldsymbol{\mathsf{K}}\) to define a vectorial spherical Shannon number

$$\displaystyle\begin{array}{rcl} N^{\mathrm{vec}}& =& \sum _{\alpha =1}^{3(L+1)^{2}-2 }\lambda _{\alpha } = \mathrm{tr}\ \boldsymbol{\mathsf{K}} =\sum \limits _{ l=0}^{L}\sum \limits _{ m=-l}^{l}(D_{ lm,lm} + B_{lm,lm} + C_{lm,lm}){}\end{array}$$
(61)
$$\displaystyle\begin{array}{rcl} & =& \int _{R}\left [\sum \limits _{l=0}^{L}\sum \limits _{ m=-l}^{l}\mathbf{P}_{ lm}(\mathbf{\hat{r}}) \cdot \mathbf{P}_{lm}(\mathbf{\hat{r}}) + \mathbf{B}_{lm}(\mathbf{\hat{r}}) \cdot \mathbf{B}_{lm}(\mathbf{\hat{r}}) + \mathbf{C}_{lm}(\mathbf{\hat{r}}) \cdot \mathbf{C}_{lm}(\mathbf{\hat{r}})\right ]d\Omega {}\end{array}$$
(62)
$$\displaystyle\begin{array}{rcl} & =& \left [3(L + 1)^{2} - 2\right ]\frac{A} {4\pi }.{}\end{array}$$
(63)

To establish the last equality we used the relation (50). Given the decoupling of the radial from the tangential solutions that is apparent from Eq. (57), we may subdivide the vectorial spherical Shannon number into a radial and a tangential one. These are \(N_{r} = (L + 1)^{2}A/(4\pi )\) and \(N_{t} = [2(L + 1)^{2} - 2]A/(4\pi )\), respectively.
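
For a polar cap of colatitudinal radius \(\Theta\), whose area is \(A = 2\pi (1 -\cos \Theta )\), these bookkeeping formulas are easily evaluated, as in the short illustration below (our own; the parameter values are arbitrary).

```python
# Quick bookkeeping (our own, arbitrary parameters): Shannon numbers for a
# polar cap of radius Theta, with area A = 2 pi (1 - cos Theta).
import numpy as np

L, Theta = 60, np.radians(30.0)
f = (1 - np.cos(Theta)) / 2.0            # fractional area A / (4 pi)
N3D  = (L + 1) ** 2 * f                  # scalar, Eq. (38)
Nr   = (L + 1) ** 2 * f                  # radial part
Nt   = (2 * (L + 1) ** 2 - 2) * f        # tangential part
Nvec = (3 * (L + 1) ** 2 - 2) * f        # vectorial total, Eq. (63)
print(f"N3D = {N3D:.1f}, Nr = {Nr:.1f}, Nt = {Nt:.1f}, Nvec = {Nvec:.1f}")
```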

Numerical solution methods were discussed by Plattner and Simons (2014). An example of tangential vectorial Slepian functions on a geographical domain on the surface of the sphere is found in Fig. 4.

Fig. 4

Twelve tangential Slepian functions \(\mathbf{g}_{1},\mathbf{g}_{2},\ldots,\mathbf{g}_{12}\), bandlimited to L = 60, optimally concentrated within Australia. The concentration factors \(\lambda _{1},\lambda _{2},\ldots,\lambda _{12}\) are indicated. The rounded tangential Shannon number is \(N_{t} = 112\). Order of concentration is left to right, top to bottom. Color is absolute value (red the maximum) and circles with strokes indicate the direction of the eigenfield on the tangential plane. Regions in which the absolute value is less than one hundredth of the maximum absolute value on the sphere are left white

2.4.2 Sturm-Liouville Character and Tridiagonal Matrix Formulation

When the region of concentration R is a symmetric polar cap with colatitudinal radius \(\Theta\) centered on the north pole, special rules apply that greatly facilitate the construction of the localization kernel (57). There are reductions of Eq. (54) to some very manageable integrals that can be carried out exactly by recursion. Solutions for the polar cap can be rotated anywhere on the unit sphere using the same transformations that apply in the rotation of scalar functions (Edmonds 1996; Blanco et al. 1997; Dahlen and Tromp 1998; Freeden and Schreiner 2009).

In the axisymmetric case the matrix elements (54a)–(54c) reduce to

$$\displaystyle\begin{array}{rcl} D_{lm,l'm'}& =& 2\pi \delta _{mm'}\int _{0}^{\Theta }X_{ lm}X_{l'm}\sin \theta \,d\theta,{}\end{array}$$
(64)
$$\displaystyle\begin{array}{rcl} B_{lm,l'm'}& =& \frac{2\pi \delta _{mm'}\int _{0}^{\Theta }\left [X'_{ lm}X'_{l'm} + m^{2}(\sin \theta )^{-2}X_{ lm}X_{l'm}\right ]\sin \theta \,d\theta } {\sqrt{l(l + 1)l'(l' + 1)}},{}\end{array}$$
(65)
$$\displaystyle\begin{array}{rcl} C_{lm,l'm'}& =& -\frac{2\pi \delta _{-mm'}m\int _{0}^{\Theta }\left [X'_{ lm}X_{l'm} + X_{lm}X'_{l'm}\right ]\,d\theta } {\sqrt{l(l + 1)l'(l' + 1)}} \\ & =& -\frac{2\pi \delta _{-mm'}mX_{lm}(\Theta )X_{l'm}(\Theta )} {\sqrt{l(l + 1)l'(l' + 1)}}, {}\end{array}$$
(66)

using the derivative notation \(X'_{lm} = dX_{lm}/d\theta\) for the normalized associated Legendre functions of Eq. (27). Equation (66) can be easily evaluated. The integrals over the product terms \(X_{lm}X_{l'm}\) in Eq. (64) can be rewritten using Wigner 3j symbols (Wieczorek and Simons 2005; Simons et al. 2006; Plattner and Simons 2014) as simple integrals over \(X_{l\,2m}\) or \(X_{l0}\), which can be handled recursively (Paul 1978). Finally, in Eq. (65), integrals of the type \(X'_{lm}X'_{l'm}\) and \(m^{2}(\sin \theta )^{-2}X_{lm}X_{l'm}\) can be rewritten as integrals over undifferentiated products of Legendre functions (Ilk 1983; Eshagh 2009; Plattner and Simons 2014). All in all, these computations are straightforward to carry out and lead to block-diagonal matrices at constant order m, which are relatively easily diagonalized.
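
As an indication of how little is involved at a fixed order, the sketch below (our own, for arbitrary L and \(\Theta\)) evaluates the m = 0 block of Eq. (65) by Gauss-Legendre quadrature; at m = 0 the coupling of Eq. (66) vanishes, so the consoidal and toroidal problems decouple and share this matrix, whose eigenvalues are the fixed-order tangential concentration values.

```python
# Our own sketch (arbitrary L and Theta): the m = 0 block of the tangential
# localization matrix, Eq. (65), for a polar cap, by Gauss-Legendre quadrature.
# At m = 0 the coupling matrix of Eq. (66) vanishes, so the consoidal and
# toroidal problems decouple and share this matrix.
import numpy as np
from numpy.polynomial.legendre import Legendre

def tangential_cap_matrix_m0(L, Theta, nquad=200):
    t, w = np.polynomial.legendre.leggauss(nquad)
    th, wth = 0.5 * Theta * (t + 1.0), 0.5 * Theta * w
    mu, s = np.cos(th), np.sin(th)
    ls = np.arange(1, L + 1)
    # X'_{l0}(theta) = -sqrt((2l+1)/(4 pi)) sin(theta) dP_l/dmu(cos theta)
    dX = np.array([-np.sqrt((2 * l + 1) / (4 * np.pi)) * s
                   * Legendre.basis(l).deriv()(mu) for l in ls])
    B = 2 * np.pi * np.einsum('q,iq,jq->ij', wth * s, dX, dX)
    return B / np.sqrt(np.outer(ls * (ls + 1), ls * (ls + 1)))

lam = np.sort(np.linalg.eigvalsh(tangential_cap_matrix_m0(18, np.radians(40.0))))[::-1]
print("leading m = 0 tangential concentration values:", np.round(lam[:6], 4))
```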

As this chapter went to press, Jahn and Bokor (2014) reported the exciting discovery of a differential operator that commutes with the tangential part of the concentration operator (55), and a tridiagonal matrix formulation for the tangential vectorial concentration problem on axisymmetric domains. They achieve this feat by a change of basis that reduces the vectorial problem to a scalar one separable in \(\theta\) and ϕ, using the special functions \(X'_{lm} \pm m(\sin \theta )^{-1}X_{lm}\) (Sheppard and Török 1997). Hence they derive a commuting differential operator and a corresponding spectral matrix for the concentration problem. By their approach, the solutions to the fixed-order tangential concentration problem are again solutions to a Sturm-Liouville problem with a very simple eigenvalue spectrum, and the calculations are always fast and stable, much as they are for the radial problem, which completes the construction of vectorial Slepian functions on the sphere.

2.5 Midterm Summary

It is interesting to reflect, however heuristically, on the commonality of all of the above aspects of spatiospectral localization, in the slightly expanded context of reproducing-kernel Hilbert spaces (Yao 1967; Nashed and Walter 1991; Daubechies 1992; Amirbekyan et al. 2008; Kennedy and Sadeghi 2013). In one dimension, the Fourier orthonormality relation and the “reproducing” properties of the spatial delta function are given by

$$\displaystyle{ \delta (t,t') = (2\pi )^{-1}\int _{ -\infty }^{\infty }e^{i\omega (t-t')}\,d\omega,\qquad \int _{ -\infty }^{\infty }f(t')\delta (t,t')\,dt' = f(t). }$$
(67)

In two Cartesian dimensions the equivalent relations are

$$\displaystyle{ \delta (\mathbf{x},\mathbf{x}') = (2\pi )^{-2}\int _{ -\infty }^{\infty }e^{i\mathbf{k}\cdot (\mathbf{x}-\mathbf{x}')}\,d\mathbf{k},\qquad \int _{ -\infty }^{\infty }f(\mathbf{x}')\delta (\mathbf{x},\mathbf{x}')\,d\mathbf{x}' = f(\mathbf{x}), }$$
(68)

and on the surface of the unit sphere we have, for the scalar case,

$$\displaystyle{ \delta (\mathbf{\hat{r}},\mathbf{\hat{r}}') =\sum _{ l=0}^{\infty }\left (\frac{2l + 1} {4\pi } \right )P_{l}(\mathbf{\hat{r}} \cdot \mathbf{\hat{r}}'),\qquad \int _{\Omega }f(\mathbf{\hat{r}}')\delta (\mathbf{\hat{r}},\mathbf{\hat{r}}')\,d\Omega ' = f(\mathbf{\hat{r}}), }$$
(69)

and for the vector case, we have the sum of dyads

$$\displaystyle{ \boldsymbol{\delta }(\mathbf{\hat{r}},\mathbf{\hat{r}}') =\sum \limits _{ l=0}^{\infty }\sum \limits _{ m=-l}^{l}\mathbf{P}_{ lm}(\mathbf{\hat{r}})\mathbf{P}_{lm}(\mathbf{\hat{r}}') + \mathbf{B}_{lm}(\mathbf{\hat{r}})\mathbf{B}_{lm}(\mathbf{\hat{r}}') + \mathbf{C}_{lm}(\mathbf{\hat{r}})\mathbf{C}_{lm}(\mathbf{\hat{r}}'), }$$
(70a)
$$\displaystyle{ \int _{\Omega }\mathbf{f}(\mathbf{\hat{r}}') \cdot \boldsymbol{\delta }(\mathbf{\hat{r}},\mathbf{\hat{r}}')\,d\Omega ' = \mathbf{f}(\mathbf{\hat{r}}). }$$
(70b)

The integral-equation kernels (5b), (15b), (34b), and (55b) are all bandlimited spatial delta functions which are reproducing kernels for bandlimited functions of the types in Eqs. (2), (12), (31), and (51):

$$\displaystyle\begin{array}{rcl} D(t,t') = (2\pi )^{-1}\int _{ -W}^{W}e^{i\omega (t-t')}\,d\omega,\ \ \int _{ -\infty }^{\infty }g(t')D(t,t')\,dt' = g(t),& &{}\end{array}$$
(71)
$$\displaystyle\begin{array}{rcl} D(\mathbf{x},\mathbf{x}') = (2\pi )^{-2}\int _{ \mathcal{K}}e^{i\mathbf{k}\cdot (\mathbf{x}-\mathbf{x}')}\,d\mathbf{k},\ \ \int _{ -\infty }^{\infty }g(\mathbf{x}')D(\mathbf{x},\mathbf{x}')\,d\mathbf{x}' = g(\mathbf{x}),& &{}\end{array}$$
(72)
$$\displaystyle\begin{array}{rcl} D(\mathbf{\hat{r}},\mathbf{\hat{r}}') =\sum _{ l=0}^{L}\left (\frac{2l + 1} {4\pi } \right )P_{l}(\mathbf{\hat{r}} \cdot \mathbf{\hat{r}}'),\qquad \int _{\Omega }g(\mathbf{\hat{r}}')D(\mathbf{\hat{r}},\mathbf{\hat{r}}')\,d\Omega = g(\mathbf{\hat{r}}),& &{}\end{array}$$
(73)
$$\displaystyle{ \mathbf{D}(\mathbf{\hat{r}},\mathbf{\hat{r}}') =\sum \limits _{ l=0}^{L}\sum \limits _{ m=-l}^{l}\mathbf{P}_{ lm}(\mathbf{\hat{r}})\mathbf{P}_{lm}(\mathbf{\hat{r}}') + \mathbf{B}_{lm}(\mathbf{\hat{r}})\mathbf{B}_{lm}(\mathbf{\hat{r}}') + \mathbf{C}_{lm}(\mathbf{\hat{r}})\mathbf{C}_{lm}(\mathbf{\hat{r}}'), }$$
(74a)
$$\displaystyle{ \int _{\Omega }\mathbf{g}(\mathbf{\hat{r}}') \cdot \mathbf{D}(\mathbf{\hat{r}},\mathbf{\hat{r}}')\,d\Omega ' = \mathbf{g}(\mathbf{\hat{r}}). }$$
(74b)

The equivalence of Eq. (71) with Eq. (5b) is through the Euler identity, and the reproducing properties follow from the spectral forms of the orthogonality relations (67) and (68), which are self-evident by change of variables, and from the spectral form of Eq. (69), which is Eq. (29). Much as the delta functions of Eqs. (67)–(70) set up the Hilbert spaces of all square-integrable functions on the real line, in two-dimensional Cartesian space and on the surface of the sphere (both scalar and vector functions), the kernels (71) and (74) induce the equivalent subspaces of bandlimited functions in their respective dimensions. Inasmuch as the Slepian functions are the integral eigenfunctions of these reproducing kernels in the sense of Eqs. (5a), (15a), (34a), and (55a), they are complete bases for their bandlimited subspaces (Slepian and Pollak 1961; Landau and Pollak 1961; Daubechies 1992; Flandrin 1998; Freeden et al. 1998; Plattner and Simons 2014). Therein, the \(N^{\mathrm{1D}}\), \(N^{\mathrm{2D}}\), \(N^{\mathrm{3D}}\), or \(N^{\mathrm{vec}}\) best time- or space-concentrated members allow for sparse, approximate expansions of signals that are spatially concentrated to the one-dimensional interval \(t \in [-T,T] \subset \mathbb{R}\), the Cartesian region \(\mathbf{x} \in \mathcal{R}\subset \mathbb{R}^{2}\), or the spherical surface patch \(\mathbf{\hat{r}} \in R \subset \Omega\).
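
The reproducing property (73) is easily verified numerically. In the small sketch below (our own; the bandwidth and the zonal test function are arbitrary), we take a zonal bandlimited g, for which the integral against the kernel, evaluated at the north pole, reduces to a one-dimensional integral over \(\mu =\cos \theta\) that Gauss-Legendre quadrature handles exactly.

```python
# Our own small check of the reproducing property in Eq. (73): for a zonal
# bandlimited g, the integral against the degree-L kernel, evaluated at the
# north pole, reduces to an integral over mu = cos(theta) that a Gauss-Legendre
# rule handles exactly.  L and the random coefficients are arbitrary.
import numpy as np
from numpy.polynomial.legendre import legval

L = 20
c = np.random.default_rng(0).standard_normal(L + 1)   # zonal coefficients c_l
d = (2 * np.arange(L + 1) + 1) / (4 * np.pi)          # kernel coefficients, Eq. (73)

mu, w = np.polynomial.legendre.leggauss(2 * L + 2)
integral = 2 * np.pi * np.sum(w * legval(mu, c) * legval(mu, d))
print("integral of g times D over the sphere:", integral)
print("g at the evaluation point (north pole):", legval(1.0, c))
```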

As a corollary to this behavior, the infinite sets of exactly time- or spacelimited (and thus band-concentrated) versions of the functions g and \(\mathbf{g}\), which are the eigenfunctions of Eqs. (5), (15), (34), and (55) with the domains appropriately restricted, are complete bases for square-integrable scalar or vector functions on the intervals to which they are confined (Slepian and Pollak 1961; Landau and Pollak 1961; Simons et al. 2006; Plattner and Simons 2013). Expansions of such wideband signals in the small subset of their \(N^{\mathrm{1D}}\), \(N^{\mathrm{2D}}\), \(N^{\mathrm{3D}}\), or \(N^{\mathrm{vec}}\) most band-concentrated members provide reconstructions which are constructive in the sense that they progressively capture all of the signal in the mean-squared sense, in the limit of letting their numbers grow to infinity. This second class of functions can be trivially obtained, up to a multiplicative constant, from the bandlimited Slepian functions g and \(\mathbf{g}\) by simple time- or space limitation. While Slepian (Slepian and Pollak 1961; Slepian 1983), for this reason perhaps, never gave them a name, we have been referring to those as h (and \(\mathbf{h}\)) in our own investigations of similar functions on the sphere (Simons et al. 2006; Simons and Dahlen 2006; Dahlen and Simons 2008; Plattner and Simons 2013).

3 Problems in the Geosciences and Beyond

Taking all of the above at face value but referring again to the literature cited thus far for proof and additional context, we return to considerations closer to home, namely, the estimation of geophysical (or cosmological) signals and/or their power spectra, from noisy and incomplete observations collected at or above the surface of the spheres “Earth” or “planet” (or from inside the sphere “sky”). We restrict ourselves to real-valued scalar measurements, contaminated by additive noise for which we shall adopt idealized models. We focus exclusively on data acquired and solutions expressed on the unit sphere. We have considered generalizations to problems involving satellite potential-field data collected at an altitude elsewhere (Simons and Dahlen 2006; Simons et al. 2009). We furthermore note that descriptions of the scalar gravitational and magnetic potential may be sufficient to capture the behavior of the corresponding gravity and magnetic vector fields, but that with vectorial Slepian functions, versatile and demanding satellite data analysis problems can be handled robustly even in the presence of noise that may be strongly heterogeneous spatially and/or over the individual vector components.

Speaking quite generally, the two different statistical problems that arise when geomathematical scalar spherical data are being studied are (i) how to find the “best” estimate of the signal given the data and (ii) how to construct from the data the “best” estimate of the power spectral density of the signal in question. There are problems intermediate between either case, for instance, those that utilize the solutions to problems of the kind (i) to make inference about the power spectral density without properly solving any problems of kind (ii). Mostly such scenarios, e.g., in localized geomagnetic field analysis (Beggan et al. 2013), are born out of necessity or convenience. We restrict our analysis to the “pure” end-member problems.

Thus, let there be some real-valued scalar data distributed on the unit sphere, consisting of “signal,” s, and “noise,” n, and let there be some region of interest \(R \subset \Omega\); in other words, let

$$\displaystyle\begin{array}{rcl} d(\mathbf{\hat{r}}) = \left \{\begin{array}{ll} s(\mathbf{\hat{r}}) + n(\mathbf{\hat{r}}) &\mbox{ if $\mathbf{\hat{r}} \in R$},\\ \mbox{ unknown/undesired} &\mbox{ if $\mathbf{\hat{r} } \in \Omega \setminus R$}. \end{array} \right.& &{}\end{array}$$
(75)

We assume that the signal of interest can be expressed by way of spherical harmonic expansion as in Eq. (25), and that it is, itself, a realization of a zero-mean, Gaussian, isotropic, random process, namely,

$$\displaystyle{ s(\mathbf{\hat{r}}) =\sum \limits _{ l=0}^{\infty }\sum \limits _{ m=-l}^{l}s_{ lm}Y _{lm}(\mathbf{\hat{r}}),\qquad \langle s_{lm}\rangle = 0\quad \mbox{ and}\quad \langle s_{lm}s_{l'm'}\rangle = S_{l}\,\delta _{ll'}\delta _{mm'}. }$$
(76)

For illustration we furthermore assume that the noise is a zero-mean stochastic process with an isotropic power spectrum, i.e., \(\langle n(\mathbf{\hat{r}})\rangle = 0\) and \(\langle n_{lm}n_{l'm'}\rangle = N_{l}\,\delta _{ll'}\delta _{mm'}\), and that it is statistically uncorrelated with the signal. We refer to power as white when \(S_{l} = S\) or \(N_{l} = N\), or, equivalently, when \(\langle n(\mathbf{\hat{r}})n(\mathbf{\hat{r}}')\rangle = N\delta (\mathbf{\hat{r}},\mathbf{\hat{r}}')\). Our objectives are thus (i) to determine the best estimate \(\hat{s}_{lm}\) of the spherical harmonic expansion coefficients \(s_{lm}\) of the signal and (ii) to find the best estimate \(\hat{S}_{l}\) for the isotropic power spectral density \(S_{l}\). While in the physical world there can be no limit on bandwidth, practical restrictions force any and all of our estimates to be bandlimited to some maximum spherical harmonic degree L, thus by necessity \(\hat{s}_{lm} = 0\) and \(\hat{S}_{l} = 0\) for \(l > L\):

$$\displaystyle{ \hat{s}(\mathbf{\hat{r}}) =\sum \limits _{ l=0}^{L}\sum \limits _{ m=-l}^{l}\hat{s}_{ lm}Y _{lm}(\mathbf{\hat{r}}). }$$
(77)

This limitation, combined with the statements of Eq. (75) on the data coverage or the study region of interest, naturally leads us back to the concept of “spatiospectral concentration,” and, as we shall see, solving either problem (i) or (ii) will gain from involving the “localized” scalar Slepian functions rather than, or in addition to, the “global” spherical harmonics basis.

This leaves us to clarify what we understand by “best” in this context. While we adopt the traditional statistical metrics of bias, variance, and mean-squared error to appraise the quality of our solutions (Cox and Hinkley 1974; Bendat and Piersol 2000), the resulting connections to sparsity will be real and immediate, owing to the Slepian functions being naturally instrumental in constructing efficient, consistent, and/or unbiased estimates of either \(\hat{s}_{lm}\) or \(\hat{S}_{l}\). Thus, we define

$$\displaystyle{ v =\langle \hat{ s}^{2}\rangle -\langle \hat{ s}\rangle ^{2},\qquad b =\langle \hat{ s}\rangle - s,\qquad \epsilon =\hat{ s} - s,\quad \mbox{ and}\quad \langle \epsilon ^{2}\rangle = v + b^{2} }$$
(78)

for problem (i), where the lack of subscript indicates that we can study the variance, bias, and mean-squared error not only of the estimate of the coefficients \(\hat{s}_{lm}\) but also of their spatial expansion \(\hat{s}(\mathbf{\hat{r}})\). For problem (ii), on the other hand, we focus on the estimate of the isotropic power spectrum at a given spherical harmonic degree l by identifying

$$\displaystyle{ v_{l} =\langle \hat{ S}_{l}^{2}\rangle -\langle \hat{ S}_{ l}\rangle ^{2},\qquad b_{ l} =\langle \hat{ S}_{l}\rangle - S_{l},\qquad \epsilon _{l} =\hat{ S}_{l} - S_{l},\quad \mbox{ and}\quad \langle \epsilon _{l}^{2}\rangle = v_{ l} + b_{l}^{2}. }$$
(79)

Depending on the application, the “best” estimate could mean the unbiased one with the lowest variance (Tegmark 1997; Tegmark et al. 1997; Bond et al. 1998; Oh et al. 1999; Hinshaw et al. 2003), it could be simply the minimum-variance estimate having some acceptable and quantifiable bias (Wieczorek and Simons 2007), or, as we would usually prefer, it would be the one with the minimum mean-squared error (Simons and Dahlen 2006; Dahlen and Simons 2008).

3.1 Problem (i): Signal Estimation from Spherical Data

3.1.1 Spherical Harmonic Solution

Paraphrasing results elaborated elsewhere (Simons and Dahlen 2006), we write the bandlimited solution to the damped inverse problem

$$\displaystyle{ \int _{R}(s - d)^{2}\,d\Omega +\eta \int _{\bar{ R}}s^{2}\,d\Omega = \mbox{ minimum}, }$$
(80)

where η ≥ 0 is a damping parameter, by straightforward algebraic manipulation, as

$$\displaystyle{ \hat{s}_{lm} =\sum \limits _{ l'=0}^{L}\sum \limits _{ m'=-l'}^{l'}\left (D_{ lm,l'm'} +\eta \bar{ D}_{lm,l'm'}\right )^{-1}\int _{ R}d\,Y _{l'm'}\,d\Omega, }$$
(81)

where \(\bar{D}_{lm,l'm'}\), the kernel that localizes to the region \(\bar{R} = \Omega \setminus R\), complements \(D_{lm,l'm'}\), given by Eq. (33b), which localizes to R. Given the eigenvalue spectrum of the latter, its inversion is inherently unstable; thus Eq. (80) is an ill-conditioned inverse problem unless η > 0, as is well known, e.g., in geodesy (Xu 1992; Sneeuw and van Gelderen 1997). Elsewhere (Simons and Dahlen 2006) we have derived exact expressions for the optimal value of the damping parameter η as a function of the signal-to-noise ratio under certain simplifying assumptions. As can be easily shown, without damping the estimate is unbiased but effectively incomputable; the introduction of the damping term stabilizes the solution at the cost of added bias. And of course when \(R = \Omega\), Eq. (81) is simply the spherical harmonic transform, as in that case, Eq. (33b) reduces to Eq. (29), in other words, then \(D_{lm,l'm'} =\delta _{ll'}\delta _{mm'}\).
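To make the structure of Eq. (81) concrete, the following minimal sketch (an illustration of ours, not the authors' published software) carries out the damped solve in matrix form. For bandlimited expansions the complementary kernel is simply \(\bar{\mathbf{D}} = \mathbf{I} -\mathbf{D}\), so the system matrix is \(\mathbf{D} +\eta (\mathbf{I} -\mathbf{D})\); the localization matrix below is a synthetic stand-in with a step-shaped eigenvalue spectrum, not one computed from Eq. (33b).

```python
# Hedged sketch of the damped estimate of Eq. (81). The matrix D below is a
# synthetic stand-in for the localization kernel of Eq. (33b): symmetric, with
# eigenvalues between 0 and 1 that plunge towards zero, as for a small region R.
import numpy as np

def damped_estimate(D, b, eta):
    """Solve (D + eta*(I - D)) s_hat = b, with Dbar = I - D for bandlimited fields."""
    return np.linalg.solve(D + eta * (np.eye(D.shape[0]) - D), b)

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((100, 100)))
lam = 1.0 / (1.0 + np.exp(np.linspace(-20.0, 20.0, 100)))  # step-shaped spectrum
D = (Q * lam) @ Q.T                                        # stand-in localization matrix
b = D @ rng.standard_normal(100)                           # noise-free data integrals over R
print(np.linalg.cond(D))                  # enormous: the undamped inversion is unstable
print(np.linalg.norm(damped_estimate(D, b, eta=1e-3)))     # damping yields a finite solution
```

With η = 0 the solve amounts to inverting D itself, which fails numerically for all but the largest regions; any η > 0 regularizes it, at the price of the bias quantified in Sect. 3.2.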

3.1.2 Slepian Basis Solution

The trial solution in the Slepian basis designed for this region of interest R, i.e.,

$$\displaystyle{ \hat{s}(\mathbf{\hat{r}}) =\sum _{ \alpha =1}^{(L+1)^{2} }\hat{s}_{\alpha }g_{\alpha }(\mathbf{\hat{r}}), }$$
(82)

would be completely equivalent to the expression in Eq. (77) by virtue of the completeness of the Slepian basis for bandlimited functions everywhere on the sphere and the unitarity of the transform (31) from the spherical harmonic to the Slepian basis. The solution to the undamped (η = 0) version of Eq. (80) would then be

$$\displaystyle{ \hat{s}_{\alpha } =\lambda _{ \alpha }^{-1}\int _{ R}dg_{\alpha }\,d\Omega, }$$
(83)

which, being completely equivalent to Eq. (81) for η = 0, would be computable, and biased, only when the expansion in Eq. (82) were to be truncated to some finite \(J < (L + 1)^{2}\) to prevent the blowup of the solution caused by the smallest eigenvalues \(\lambda_{\alpha}\). Assuming for simplicity of the argument that \(J = N^{\mathrm{3D}}\), the essence of the approach is now that the solution

$$\displaystyle{ \hat{s}(\mathbf{\hat{r}}) =\sum _{ \alpha =1}^{N^{\mathrm{3D}} }\hat{s}_{\alpha }g_{\alpha }(\mathbf{\hat{r}}) }$$
(84)

will be sparse (in achieving a bandwidth L using \(N^{\mathrm{3D}}\) Slepian instead of \((L + 1)^{2}\) spherical-harmonic expansion coefficients) yet good (in approximating the signal as well as possible in the mean-squared sense in the region of interest R) and of geophysical utility (assuming we are dealing with spatially localized processes that are to be extracted, e.g., from global satellite measurements) as shown by Han and Simons (2008), Simons et al. (2009), and Harig and Simons (2012).

3.2 Bias and Variance

In concluding this section let us illustrate another welcome by-product of our methodology, by writing the mean-squared error for the spherical harmonic solution (81) alongside the equivalent expression for the solution in the Slepian basis, Eq. (83). We do this as a function of the spatial coordinate, in the Slepian basis for both, and, for maximum clarity of the exposition, for the contrived case in which both signal and noise are white (with power S and N, respectively) as well as bandlimited (which is technically impossible). In the former case, we get

$$\displaystyle\begin{array}{rcl} \langle \epsilon ^{2}(\mathbf{\hat{r}})\rangle & =& N\sum _{\alpha =1}^{(L+1)^{2} }\lambda _{\alpha }[\lambda _{\alpha } +\eta (1 -\lambda _{\alpha })]^{-2}g_{\alpha }^{2}(\mathbf{\hat{r}}) \\ & & +\eta ^{2}S\sum _{\alpha =1}^{(L+1)^{2} }(1 -\lambda _{\alpha })^{2}[\lambda _{\alpha } +\eta (1 -\lambda _{\alpha })]^{-2}g_{\alpha }^{2}(\mathbf{\hat{r}}),{}\end{array}$$
(85)

while in the latter, we obtain

$$\displaystyle{ \langle \epsilon ^{2}(\mathbf{\hat{r}})\rangle = N\sum _{\alpha =1}^{N^{\mathrm{3D}} }\lambda _{\alpha }^{-1}g_{\alpha }^{2}(\mathbf{\hat{r}}) + S\sum _{\alpha>N^{\mathrm{3D}}}^{(L+1)^{2} }g_{\alpha }^{2}(\mathbf{\hat{r}}). }$$
(86)

All \((L + 1)^{2}\) basis functions are required to express the mean-squared estimation error, whether in Eq. (85) or in Eq. (86). The first term in both expressions is the variance, which depends on the measurement noise. Without damping or truncation the variance grows without bounds. Damping and truncation alleviate this at the expense of added bias, which depends on the characteristics of the signal, as given by the second term. In contrast to Eq. (85), however, the Slepian expression (86) has disentangled the contributions due to noise/variance and signal/bias by projecting them onto the sparse set of well-localized and the remaining set of poorly localized Slepian functions, respectively. The estimation variance is felt via the basis functions \(\alpha = 1 \rightarrow N^{\mathrm{3D}}\) that are well concentrated inside the measurement area, and the effect of the bias is relegated to those \(\alpha = N^{\mathrm{3D}} + 1 \rightarrow (L + 1)^{2}\) functions that are confined to the region of missing data.

When forming a solution to problem (i) in the Slepian basis by truncation according to Eq. (84), changing the truncation level J to values lower or higher than the Shannon number \(N^{\mathrm{3D}}\) amounts to navigating the trade-off space between variance, bias (or “resolution”), and sparsity in a manner that is captured with great clarity by Eq. (86). We refer the reader elsewhere (Simons and Dahlen 2006, 2007) for more details, and, in particular, for the case of potential fields estimated from data collected at satellite altitude, treated in detail in chapter Potential-Field Estimation Using Scalar and Vector Slepian Functions at Satellite Altitude of this book.
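The following sketch quantifies that navigation in the simplest possible setting. Integrated over the whole sphere, and using the unit normalization \(\int_{\Omega} g_{\alpha}^{2}\,d\Omega = 1\) implied by Eq. (36), the mean-squared error at an arbitrary truncation level J (Eq. (86) shows the case \(J = N^{\mathrm{3D}}\)) equals \(N\sum _{\alpha \leq J}\lambda _{\alpha }^{-1} + S\,[(L + 1)^{2} - J]\). The eigenvalue spectrum used below is a synthetic, step-shaped stand-in for the output of an actual concentration problem.

```python
# Whole-sphere integral of the truncated-Slepian mean-squared error, cf. Eq. (86),
# as a function of the truncation level J, for white signal power S and white
# noise power N. The concentration eigenvalues are synthetic; in practice they
# come from solving the concentration problem, Eq. (33), for the region R.
import numpy as np

K = 441                                     # (L+1)^2 for, say, L = 20
N3D = 60                                    # assumed Shannon number of the region
alpha = np.arange(1, K + 1)
lam = 1.0 / (1.0 + np.exp((alpha - N3D) / 3.0))   # eigenvalues step down near alpha = N3D

S, N = 1.0, 0.05                            # white signal and noise power
J = np.arange(1, K + 1)
variance = np.array([N * np.sum(1.0 / lam[:j]) for j in J])   # noise term of Eq. (86)
bias2 = S * (K - J)                                           # signal term of Eq. (86)
print("minimum-MSE truncation J =", J[np.argmin(variance + bias2)],
      "  Shannon number =", N3D)
```

For small noise the minimum sits near the Shannon number; as N grows relative to S, the optimal J drifts to lower values, trading resolution for variance reduction.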

3.3 Problem (ii): Power Spectrum Estimation from Spherical Data

Following Dahlen and Simons (2008) we will find it convenient to regard the data \(d(\mathbf{\hat{r}})\) given in Eq. (75) as having been multiplied by a unit-valued boxcar window function confined to the region R,

$$\displaystyle{ b(\mathbf{\hat{r}}) =\sum _{ p=0}^{\infty }\sum _{ q=-p}^{p}b_{ pq}Y _{pq}(\mathbf{\hat{r}}) = \left \{\begin{array}{ll} 1&\mbox{ if $\mathbf{\hat{r}} \in R$},\\ 0 &\mbox{ otherwise.} \end{array} \right. }$$
(87)

The power spectrum of the boxcar window (87) is

$$\displaystyle{ B_{p} = \frac{1} {2p + 1}\sum _{q=-p}^{p}b_{ pq}^{2}. }$$
(88)

3.3.1 The Spherical Periodogram

Should we decide that an acceptable estimate of the power spectral density of the available data is nothing more than the weighted average of the squares of its spherical harmonic expansion coefficients, we would be forming the spherical analogue of what Schuster (1898) named the “periodogram” in the context of time series analysis, namely,

$$\displaystyle{ \hat{S}_{l}^{\mathrm{SP}} = \left ( \frac{4\pi } {A}\right ) \frac{1} {2l + 1}\sum _{m=-l}^{l}\left [\int _{ R}d(\mathbf{\hat{r}})\,Y _{lm}(\mathbf{\hat{r}})\,d\Omega \right ]^{2}. }$$
(89)

3.3.2 Bias of the Periodogram

Upon doing so we would discover that the expected value of such an estimator would be the biased quantity

$$\displaystyle{ \langle \hat{S}_{l}^{\mathrm{SP}}\rangle =\sum _{ l'=0}^{\infty }K_{ ll'}(S_{l'} + N_{l'}), }$$
(90)

where, as is known in astrophysics and cosmology (Peebles 1973; Hauser and Peebles 1973; Hivon et al. 2002), the periodogram “coupling” matrix

$$\displaystyle{ K_{ll'} = \left ( \frac{4\pi } {A}\right ) \frac{1} {2l + 1}\sum _{m=-l}^{l}\sum _{ m'=-l'}^{l'}\left [D_{ lm,l'm'}\right ]^{2} }$$
(91)

governs the extent to which an estimate \(\hat{S}_{l}^{\mathrm{SP}}\) of \(S_{l}\) is influenced by spectral leakage from power in neighboring spherical harmonic degrees l′ = l ± 1, l ± 2, …, all the way down to 0 and up to \(\infty\). In the case of full data coverage, \(R = \Omega\), or of a perfectly white spectrum, \(S_{l} = S\), however, the estimate would be unbiased, provided the noise spectrum, if known, can be subtracted beforehand.

3.3.3 Variance of the Periodogram

The covariance of the periodogram estimator (89) moreover suffers from strong wideband coupling of the power spectral densities, being given by

$$\displaystyle{ \Sigma _{ll'}^{\mathrm{SP}} = \frac{2(4\pi /A)^{2}} {(2l + 1)(2l' + 1)}\sum _{m=-l}^{l}\sum _{ m'=-l'}^{l'}\left [\sum _{ p=0}^{\infty }\sum _{ q=-p}^{p}(S_{ p} + N_{p})D_{lm,pq}D_{pq,l'm'}\right ]^{2}. }$$
(92)

Even under the commonly made assumption that the power spectrum is slowly varying within the main lobe of the coupling matrix, such coupling is deleterious. In the “locally white” case we would have

$$\displaystyle{ \Sigma _{ll'}^{\mathrm{SP}} = \frac{2(4\pi /A)^{2}} {(2l + 1)(2l' + 1)}\,(S_{l} + N_{l})(S_{l'} + N_{l'})\sum _{m=-l}^{l}\sum _{ m'=-l'}^{l'}\left [D_{ lm,l'm'}\right ]^{2}. }$$
(93)

Only in the limit of whole-sphere data coverage will Eq. (92) or (93) reduce to

$$\displaystyle{ \Sigma _{ll'}^{\mathrm{WS}} = \frac{2} {2l + 1}\left (S_{l} + N_{l}\right )^{2}\delta _{ ll'}, }$$
(94)

which is the “planetary” or “cosmic” variance that can be understood on the basis of elementary statistical considerations (Jones 1963; Knox 1995; Grishchuk and Martin 1997). The strong spectral leakage for small regions (A ≪ 4π) is highly undesirable and makes the periodogram “hopelessly obsolete” (Thomson and Chave 1991), or, to put it kindly, “naive” (Percival and Walden 1993), just as it is for one-dimensional time series.

In principle it is possible, after subtraction of the noise bias, to eliminate the leakage bias in the periodogram estimate (89) by numerical inversion of the coupling matrix \(K_{ll'}\). Such a “deconvolved periodogram” estimator is unbiased. However, its covariance depends on inverting the periodogram coupling matrix, which is only feasible when the region R covers most of the sphere, A ≈ 4π. For any region whose area A is significantly smaller than 4π, the periodogram coupling matrix (91) will be too ill-conditioned to be invertible.

Thus, much like in problem (i), we are faced with substantial bias and variance, both of which are controlled by the lack of localization of the spherical harmonics and their non-orthogonality over incomplete subdomains of the unit sphere. Both effects are described by the spatiospectral localization kernel defined in (33b), which, in the quadratic estimation problem (ii), appears in “squared” form in Eq. (92). Undoing the effects of the wideband coupling between the degrees at which we seek to estimate the power spectral density by inversion of the coupling kernel is virtually impossible, and even if we could accomplish this to remove the estimation bias, doing so would greatly inflate the estimation variance (Dahlen and Simons 2008).

3.3.4 The Spherical Multitaper Estimate

We therefore take a page out of the one-dimensional power estimation playbook of Thomson (1982) by forming the “eigenvalue-weighted multitaper estimate.” We could weight single-taper estimates adaptively to minimize quality measures such as estimation variance or mean-squared error (Thomson 1982; Wieczorek and Simons 2007), but in practice, these methods tend to be rather computationally demanding. Instead we simply multiply the data \(d(\mathbf{\hat{r}})\) by the Slepian functions or “tapers” \(g_{\alpha }(\mathbf{\hat{r}})\) designed for the region of interest prior to computing power and then averaging:

$$\displaystyle{ \hat{S}_{l}^{\mathrm{MT}} =\sum _{ \alpha =1}^{(L+1)^{2} }\lambda _{\alpha }\left ( \frac{4\pi } {N^{\mathrm{3D}}}\right ) \frac{1} {2l + 1}\sum _{m=-l}^{l}\left [\int _{ \Omega }g_{\alpha }(\mathbf{\hat{r}})\,d(\mathbf{\hat{r}})\,Y _{lm}(\mathbf{\hat{r}})\,d\Omega \right ]^{2}. }$$
(95)

3.3.5 Bias of the Multitaper Estimate

The expected value of the estimate (95) is

$$\displaystyle{ \langle \hat{S}_{l}^{\mathrm{MT}}\rangle =\sum _{ l'=l-L}^{l+L}M_{ ll'}(S_{l'} + N_{l'}), }$$
(96)

where the eigenvalue-weighted multitaper coupling matrix, using Wigner 3-j functions (Varshalovich et al. 1988; Messiah 2000), is given by

$$\displaystyle{ M_{ll'} = \frac{2l' + 1} {(L + 1)^{2}}\sum _{p=0}^{L}(2p+1)\left (\begin{array}{ccc} l &p& l' \\ 0&0&0 \end{array} \right )^{2}. }$$
(97)

It is remarkable that this result depends only upon the chosen bandwidth L and is completely independent of the size, shape, or connectivity of the region R, even as \(R = \Omega\). Moreover, every row of the matrix in Eq. (97) sums to unity, which ensures that the multitaper spectral estimate \(\hat{S}_{l}^{\mathrm{MT}}\) has no leakage bias in the case of a perfectly white spectrum, provided the noise bias is subtracted as well: \(\langle \hat{S}_{l}^{\mathrm{MT}}\rangle -\sum M_{ll'}N_{l'} = S\) if \(S_{l} = S\).
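Equation (97) is straightforward to evaluate numerically; the following sketch (an assumed implementation of ours, using SymPy's exact Wigner 3-j symbols) builds \(M_{ll'}\) and verifies the unit row sums, which indeed require no information about the region R.

```python
# Multitaper coupling matrix M_{ll'} of Eq. (97) from exact Wigner 3-j symbols.
# Rows with l <= lmax - L sum to unity (no leakage bias for a white spectrum);
# rows closer to lmax appear truncated only because the matrix is finite.
import numpy as np
from sympy.physics.wigner import wigner_3j

def multitaper_coupling(L, lmax):
    """M_{ll'} for 0 <= l, l' <= lmax, given the taper bandwidth L."""
    M = np.zeros((lmax + 1, lmax + 1))
    for l in range(lmax + 1):
        for lp in range(lmax + 1):
            acc = sum((2 * p + 1) * float(wigner_3j(l, p, lp, 0, 0, 0)) ** 2
                      for p in range(L + 1))
            M[l, lp] = (2 * lp + 1) * acc / (L + 1) ** 2
    return M

M = multitaper_coupling(L=5, lmax=15)
print(M[:11].sum(axis=1))     # all ones, independently of the region R
```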

3.3.6 Variance of the Multitaper Estimate

Under the moderately colored approximation, which is more easily justified in this case because the coupling (97) is confined to a narrow band of width less than or equal to 2L + 1, with L the bandwidth of the tapers, the eigenvalue-weighted multitaper covariance is

$$\displaystyle{ \Sigma _{ll'}^{\mathrm{MT}} = \frac{1} {2\pi }(S_{l}+N_{l})(S_{l'}+N_{l'})\sum _{p=0}^{2L}(2p+1)\,\Gamma _{ p}\left (\begin{array}{ccc} l &p& l'\\ 0 &0 &0 \end{array} \right )^{2}, }$$
(98)

where, using Wigner 3-j and 6-j functions (Varshalovich et al. 1988; Messiah 2000),

$$\displaystyle\begin{array}{rcl} \Gamma _{p}& =& \frac{1} {(N^{\mathrm{3D}})^{2}}\sum _{s=0}^{L}\sum _{ s'=0}^{L}\sum _{ u=0}^{L}\sum _{ u'=0}^{L}(2s + 1)(2s' + 1)(2u + 1)(2u' + 1) \\ & & \times \sum _{e=0}^{2L}(-1)^{p+e}(2e + 1)B_{ e} \\ & & \times \left \{\begin{array}{ccc} s &e& s'\\ u &p &u' \end{array} \right \}\left (\begin{array}{ccc} s&e&s'\\ 0 &0 &0 \end{array} \right )\left (\begin{array}{ccc} u&e&u'\\ 0 &0 & 0 \end{array} \right )\left (\begin{array}{ccc} s&p&u'\\ 0 &0 & 0 \end{array} \right )\left (\begin{array}{ccc} u&p&s'\\ 0 &0 &0 \end{array} \right ).{}\end{array}$$
(99)

In this expression \(B_{e}\), the boxcar power (88), which we note does depend on the shape of the region of interest, appears again, summed over angular degrees limited by 3-j selection rules to 0 ≤ e ≤ 2L. The sum in Eq. (98) is likewise limited to degrees 0 ≤ p ≤ 2L. The effect of tapering with windows bandlimited to L is to introduce covariance between the estimates at any two different degrees l and l′ that are separated by fewer than 2L + 1 degrees. Equations (98) and (99) are very efficiently computable, which should make them competitive with, e.g., jackknifed estimates of the estimation variance (Chave et al. 1987; Thomson and Chave 1991; Thomson 2007).

The crux of the analysis lies in the fact that the matrix of the spectral covariances between single-tapered estimates is almost diagonal (Wieczorek and Simons 2007), showing that the individual estimates that enter the weighted formula (95) are almost uncorrelated statistically. This embodies the very essence of the multitaper method. It dramatically reduces the estimation variance at the cost of small increases of readily quantifiable bias.

4 Practical Considerations

In this section we now turn to the very practical context of sampled, e.g., geodetic, data on the sphere. We shall deal exclusively with bandlimited scalar functions, which are equally well expressed in the spherical harmonic as in the Slepian basis, namely:

$$\displaystyle{ f(\mathbf{\hat{r}}) =\sum \limits _{ l=0}^{L}\sum \limits _{ m=-l}^{l}f_{ lm}Y _{lm}(\mathbf{\hat{r}}) =\sum _{ \alpha =1}^{(L+1)^{2} }f_{\alpha }g_{\alpha }(\mathbf{\hat{r}}), }$$
(100)

whereby the Slepian-basis expansion coefficients are obtained as

$$\displaystyle{ f_{\alpha } =\int _{\Omega }f(\mathbf{\hat{r}})g_{\alpha }(\mathbf{\hat{r}})\,d\Omega. }$$
(101)

If the function of interest is spatially localized in the region R, a truncated reconstruction using Slepian functions built for the same region will constitute a very good, and sparse, local approximation to it (Simons et al. 2009):

$$\displaystyle{ f(\mathbf{\hat{r}}) \approx \sum _{\alpha =1}^{N^{\mathrm{3D}} }f_{\alpha }g_{\alpha }(\mathbf{\hat{r}}),\qquad \mathbf{\hat{r}} \in R. }$$
(102)

We represent any sampled, bandlimited function f by an M-dimensional column vector

$$\displaystyle{ \boldsymbol{\mathsf{f}} = (f_{1}\;\cdots \;f_{j}\;\cdots \;f_{M})^{\mathsf{T}}, }$$
(103)

where \(f_{j} = f(\mathbf{\hat{r}}_{j})\) is the value of f at pixel j, and M is the total number of sampling locations. In the most general case the distribution of pixel centers will be completely arbitrary (Hesse et al. 2010). The special case of equal-area pixelization of a 2-D function \(f(\mathbf{\hat{r}})\) on the unit sphere \(\Omega\) is analogous to the equispaced digitization of a 1-D time series. Integrals will then be assumed to be approximated with sufficient accuracy by a Riemann sum over a dense set of pixels,

$$\displaystyle{ \int f(\mathbf{\hat{r}})\,d\Omega \approx \Delta \Omega \sum _{j=1}^{M}f_{ j}\quad \mbox{ and}\quad \int f^{2}(\mathbf{\hat{r}})\,d\Omega \approx \Delta \Omega \,\boldsymbol{\mathsf{f}}^{\mathsf{T}}\boldsymbol{\mathsf{f}}. }$$
(104)

We have deliberately left the integration domain out of the above equations to cover both the case of sampling over the entire surface of the unit sphere \(\Omega\), in which the solid angle \(\Delta \Omega = 4\pi /M\) (case 1), and that of sampling over an incomplete subdomain \(R \subset \Omega\), in which \(\Delta \Omega = A/M\), with A the area of the region R (case 2). If we collect the real spherical harmonic basis functions \(Y_{lm}\) into an \((L + 1)^{2} \times M\)-dimensional matrix

$$\displaystyle{ \boldsymbol{\mathsf{Y}} = \left (\begin{array}{*{20}c} Y _{00}(\mathbf{\hat{r}}_{1}) &\cdots & Y _{00}(\mathbf{\hat{r}}_{j}) &\cdots & Y _{00}(\mathbf{\hat{r}}_{M})\\ & & \vdots & & \\ \cdots & & Y _{lm}(\mathbf{\hat{r}}_{j}) & & \cdots \\ & & \vdots & & \\ Y _{LL}(\mathbf{\hat{r}}_{1})&\cdots &Y _{LL}(\mathbf{\hat{r}}_{j})&\cdots &Y _{LL}(\mathbf{\hat{r}}_{M}) \end{array} \right ), }$$
(105)

and the spherical harmonic coefficients of the function into an \((L + 1)^{2} \times 1\)-dimensional vector

$$\displaystyle{ \mathbf{f} = (\;f_{00}\;\cdots \;f_{lm}\;\cdots \;f_{LL}\;)^{\mathit{T}}, }$$
(106)

we can write the spherical harmonic synthesis in Eq. (100) for sampled data without loss of generality as

$$\displaystyle{ \boldsymbol{\mathsf{f}} =\boldsymbol{ \mathsf{Y}}^{\mathsf{T}}\mathbf{f}. }$$
(107)

We will adhere to the notation convention of using sans-serif fonts (e.g., \(\boldsymbol{\mathsf{f}}\), \(\boldsymbol{\mathsf{Y}}\)) for vectors or matrices that depend on at least one spatial variable, and serifed fonts (e.g., f, D) for those that are entirely composed of “spectral” quantities. In the case of dense, equal-area, whole-sphere sampling, we have an approximation to Eq. (29):

$$\displaystyle{ \boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}} \approx \Delta \Omega ^{-1}\mathbf{I}\qquad \mbox{ (case 1)}, }$$
(108)

where the elements of the \((L + 1)^{2} \times (L + 1)^{2}\)-dimensional spectral identity matrix I are given by the Kronecker deltas \(\delta _{ll'}\delta _{mm'}\). In the case of dense, equal-area sampling over some closed region R, we find instead an approximation to the \((L + 1)^{2} \times (L + 1)^{2}\)-dimensional “spatiospectral localization matrix”:

$$\displaystyle{ \boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}} \approx \Delta \Omega ^{-1}\mathbf{D}\qquad \mbox{ (case 2)}, }$$
(109)

where the elements of D are those defined in Eq. (33b).
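The following sketch illustrates case 1 numerically. It is our own illustration under stated assumptions: the grid is a Fibonacci lattice, a convenient approximately equal-area point set, and the real harmonics are assembled from SciPy's complex, unit-normalized ones (scipy.special.sph_harm, whose arguments take the azimuth before the colatitude), consistent with the orthonormality expressed by Eq. (108).

```python
# Check of Eq. (108): dOmega * Y Y^T approaches the identity for dense,
# approximately equal-area (Fibonacci) sampling of the whole sphere.
import numpy as np
from scipy.special import sph_harm

def fibonacci_sphere(M):
    """Colatitude and longitude of M approximately equal-area points."""
    i = np.arange(M) + 0.5
    colat = np.arccos(1.0 - 2.0 * i / M)
    lon = (2.0 * np.pi * i / ((1.0 + np.sqrt(5.0)) / 2.0)) % (2.0 * np.pi)
    return colat, lon

def real_sph_harm_matrix(L, colat, lon):
    """(L+1)^2 x M matrix of real, orthonormal spherical harmonics, as in Eq. (105)."""
    rows = []
    for l in range(L + 1):
        for m in range(-l, l + 1):
            Ylm = sph_harm(abs(m), l, lon, colat)   # SciPy: azimuth first, colatitude second
            if m > 0:
                rows.append(np.sqrt(2.0) * (-1) ** m * Ylm.real)
            elif m < 0:
                rows.append(np.sqrt(2.0) * (-1) ** m * Ylm.imag)
            else:
                rows.append(Ylm.real)
    return np.array(rows)

L, M = 10, 20000
colat, lon = fibonacci_sphere(M)
Y = real_sph_harm_matrix(L, colat, lon)
dOmega = 4.0 * np.pi / M                            # equal-area solid angle (case 1)
err = dOmega * (Y @ Y.T) - np.eye((L + 1) ** 2)
print(np.abs(err).max())                            # small, and shrinking as M grows
```

The two helper functions defined here are reused in the regional sketches further below.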

Let us now introduce the \((L + 1)^{2} \times (L + 1)^{2}\)-dimensional matrix of spectral Slepian eigenfunctions by

$$\displaystyle{ \mathbf{G} = \left (\begin{array}{*{20}c} g_{001} & \cdots & g_{00\alpha } & \cdots & g_{00(L+1)^{2}}\\ & & \vdots & & \\ \cdots & & g_{lm\alpha } & & \cdots \\ & & \vdots & & \\ g_{ LL1} & \cdots &g_{LL\alpha }&\cdots &g_{LL(L+1)^{2}} \end{array} \right ). }$$
(110)

This is the matrix that contains the eigenfunctions of the problem defined in Eq. (33), which we rewrite as

$$\displaystyle{ \mathbf{D}\mathbf{G} = \mathbf{G}\boldsymbol{\Lambda }, }$$
(111)

where the diagonal matrix with the concentration eigenvalues is given by

$$\displaystyle{ \boldsymbol{\Lambda } = \text{diag}\left (\;\lambda _{1}\;\cdots \;\lambda _{\alpha }\;\cdots \;\lambda _{(L+1)^{2}}\;\right ). }$$
(112)

The spectral orthogonality relations of Eq. (35) are

$$\displaystyle{ \mathbf{G}^{\mathrm{T}}\mathbf{G} = \mathbf{I},\qquad \mathbf{G}^{\mathrm{T}}\mathbf{D}\mathbf{G} =\boldsymbol{ \Lambda }, }$$
(113)

where the elements of the \((L + 1)^{2} \times (L + 1)^{2}\)-dimensional Slepian identity matrix I are given by the Kronecker deltas \(\delta _{\alpha \beta }\). We write the Slepian functions of Eq. (31) as

$$\displaystyle{ \boldsymbol{\mathsf{G}} = \mathbf{G}^{\mathrm{T}}\boldsymbol{\mathsf{Y}}\quad \mbox{ and}\quad \boldsymbol{\mathsf{Y}} = \mathbf{G}\boldsymbol{\mathsf{G}}, }$$
(114)

where the \((L + 1)^{2} \times M\)-dimensional matrix holding the sampled spatial Slepian functions is given by

$$\displaystyle{ \boldsymbol{\mathsf{G}} = \left (\begin{array}{*{20}c} g_{1}(\mathbf{\hat{r}}_{1}) &\cdots & g_{1}(\mathbf{\hat{r}}_{j}) &\cdots & g_{1}(\mathbf{\hat{r}}_{M})\\ & & \vdots & & \\ \cdots & & g_{\alpha }(\mathbf{\hat{r}}_{j}) & & \cdots \\ & & \vdots & & \\ g_{(L+1)^{2}}(\mathbf{\hat{r}}_{1})&\cdots &g_{(L+1)^{2}}(\mathbf{\hat{r}}_{j})&\cdots &g_{(L+1)^{2}}(\mathbf{\hat{r}}_{M}) \end{array} \right ). }$$
(115)

Under a dense, equal-area, whole-sphere sampling, we will recover the spatial orthogonality of Eq. (36) approximately as

$$\displaystyle{ \boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{G}}^{\mathsf{T}} \approx \Delta \Omega ^{-1}\mathbf{I}\qquad \mbox{ (case 1)}, }$$
(116)

whereas for dense, equal-area sampling over a region R we will get, instead,

$$\displaystyle{ \boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{G}}^{\mathsf{T}} \approx \Delta \Omega ^{-1}\boldsymbol{\Lambda }\qquad \mbox{ (case 2)}. }$$
(117)

With this matrix notation we shall revisit both estimation problems of the previous section.

4.1 Problem (i), Revisited

4.1.1 Spherical Harmonic Solution

If we treat Eq. (107) as a noiseless inverse problem in which the sampled data \(\boldsymbol{\mathsf{f}}\) are given but from which the coefficients f are to be determined, we find that for dense, equal-area, whole-sphere sampling, the solution

$$\displaystyle{ \mathbf{\hat{f}} \approx \Delta \Omega \,\boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{f}}\qquad \mbox{ (case 1)} }$$
(118)

is simply the discrete approximation to the spherical harmonic analysis formula (25). For dense, equal-area, regional sampling we need to calculate

$$\displaystyle{ \mathbf{\hat{f}} \approx \Delta \Omega \,\mathbf{D}^{-1}\boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{f}}\qquad \mbox{ (case 2)}. }$$
(119)

Both of these cases are simply the relevant solutions to the familiar overdetermined spherical harmonic inversion problem (Kaula 1967; Menke 1989; Aster et al. 2005) for discretely sampled data, i.e., the least-squares solution to Eq. (107),

$$\displaystyle{ \mathbf{\hat{f}} = (\boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}})^{-1}\boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{f}}, }$$
(120)

for the particular cases described by Eqs. (108) and (109). In Eq. (119) we furthermore recognize the discrete version of Eq. (81) with η = 0, the undamped solution to the minimum mean-squared error inverse problem posed in continuous form in Eq. (80). From the continuous limiting case Eq. (81), we thus discover the general form that damping should take in regularizing the ill-conditioned inverse required in Eqs. (119) and (120). Notably, this form differs from the customary ad hoc practice of adding small values to the diagonal only. Finally, in the most general and admittedly most commonly encountered case of randomly scattered data, we require the Moore-Penrose pseudo-inverse

$$\displaystyle{ \mathbf{\hat{f}} = \mathrm{pinv}(\boldsymbol{\mathsf{Y}}^{\mathsf{T}})\boldsymbol{\mathsf{f}}, }$$
(121)

which is constructed by inverting the singular value decomposition (svd) of \(\boldsymbol{\mathsf{Y}}^{\mathsf{T}}\) with its singular values truncated beyond where they fall below a certain threshold (Xu 1998). Solving Eq. (121) by truncated svd is equivalent to inverting a truncated eigenvalue expansion of the normal matrix \(\boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}}\) as it appears in Eq. (120), as can be easily shown.
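The following sketch demonstrates this equivalence with a synthetic, deliberately ill-conditioned stand-in for \(\boldsymbol{\mathsf{Y}}\) (all names and sizes are illustrative): the truncated-SVD pseudo-inverse of Eq. (121) and the truncated eigenvalue expansion of the normal matrix in Eq. (120) yield the same coefficients.

```python
# Equivalence of the truncated-SVD pseudo-inverse, Eq. (121), and a truncated
# eigenvalue expansion of the normal matrix Y Y^T, Eq. (120). The matrix Y is
# synthetic, with a rapidly decaying singular spectrum mimicking an
# ill-conditioned, regionally sampled basis.
import numpy as np

rng = np.random.default_rng(0)
K, M = 36, 500                                  # play the roles of (L+1)^2 and the pixel count
U, _ = np.linalg.qr(rng.standard_normal((K, K)))
V, _ = np.linalg.qr(rng.standard_normal((M, K)))
s = np.logspace(0, -8, K)                       # decaying singular values
Y = U @ np.diag(s) @ V.T                        # K x M stand-in for the sampled basis
f = rng.standard_normal(M)                      # pixel data vector, as in Eq. (103)

rcond = 1e-4                                    # relative singular-value cutoff
f1 = np.linalg.pinv(Y.T, rcond=rcond) @ f       # Eq. (121)

w, Q = np.linalg.eigh(Y @ Y.T)                  # eigenvalues of the normal matrix = s**2
keep = w > (rcond * np.sqrt(w.max())) ** 2      # the same cutoff, expressed in eigenvalues
f2 = Q[:, keep] @ ((Q[:, keep].T @ (Y @ f)) / w[keep])
print(np.allclose(f1, f2))                      # True: the two truncations coincide
```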

4.1.2 Slepian Basis Solution

If we collect the Slepian expansion coefficients of the function f into the \((L + 1)^{2} \times 1\)-dimensional vector

$$\displaystyle{ \mathbf{t} = (\;f_{1}\;\cdots \;f_{\alpha }\;\cdots \;f_{(L+1)^{2}}\;)^{\mathit{T}}, }$$
(122)

the expansion (100) in the Slepian basis takes the form

$$\displaystyle{ \boldsymbol{\mathsf{f}} =\boldsymbol{ \mathsf{G}}^{\mathsf{T}}\mathbf{t} =\boldsymbol{ \mathsf{Y}}^{\mathsf{T}}\mathbf{G}\mathbf{t}, }$$
(123)

where we used Eqs. (113) and (114) to obtain the second equality. Comparing Eq. (123) with Eq. (107), we see that the Slepian expansion coefficients of a function transform to and from the spherical harmonic coefficients as

$$\displaystyle{ \mathbf{f} = \mathbf{G}\mathbf{t}\quad \mbox{ and}\quad \mathbf{t} = \mathbf{G}^{\mathrm{T}}\mathbf{f}. }$$
(124)

Under dense, equal-area sampling with complete coverage, the coefficients in Eq. (123) can be estimated from

$$\displaystyle{ \mathbf{\hat{t}} \approx \Delta \Omega \,\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{f}}\qquad \mbox{ (case 1)}, }$$
(125)

the discrete, approximate version of Eq. (101). For dense, equal-area sampling in a limited region R, we get

$$\displaystyle{ \mathbf{\hat{t}} \approx \Delta \Omega \,\boldsymbol{\Lambda }^{-1}\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{f}}\qquad \mbox{ (case 2)}. }$$
(126)

As expected, both of the solutions (125) and (126) are again special cases of the overdetermined least-squares solution

$$\displaystyle{ \mathbf{\hat{t}} = (\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{G}}^{\mathsf{T}})^{-1}\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{f}}, }$$
(127)

as applied to Eqs. (116) and (117). We encountered Eq. (126) before, in the continuous form of Eq. (83); it solves the undamped minimum mean-squared error problem (80). The regularization of this ill-conditioned inverse problem may be achieved by truncation of the concentration eigenvalues, e.g., by restricting the size of the \((L + 1)^{2} \times (L + 1)^{2}\)-dimensional operator \(\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{G}}^{\mathsf{T}}\) to its first J × J subblock. Finally, in the most general, scattered-data case, we would use an eigenvalue-truncated version of Eq. (127) or, equivalently, form the pseudo-inverse

$$\displaystyle{ \mathbf{\hat{t}} = \mathrm{pinv}(\boldsymbol{\mathsf{G}}^{\mathsf{T}})\boldsymbol{\mathsf{f}}. }$$
(128)

The solutions (118)–(120) and (125)–(127) are equivalent and differ only by the orthonormal change of basis from the spherical harmonics to the Slepian functions. Indeed, using Eqs. (114) and (124) to transform Eq. (127) into an equation for the spherical harmonic coefficients and comparing with Eq. (120) exposes the relation

$$\displaystyle{ \mathbf{G}(\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{G}}^{\mathsf{T}})^{-1}\mathbf{G}^{\mathrm{T}} = (\boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}})^{-1}, }$$
(129)

which is a trivial identity for case 1 (insert Eqs. 108, 116 and 113) and, after substituting Eqs. (109) and (117), entails

$$\displaystyle{ \mathbf{G}\boldsymbol{\Lambda }^{-1}\mathbf{G}^{\mathrm{T}} = \mathbf{D}^{-1} }$$
(130)

for case 2, which holds by virtue of Eq. (113). Equation (129) can also be verified directly from Eq. (114), which implies

$$\displaystyle{ \boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}} = \mathbf{G}(\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{G}}^{\mathsf{T}})\mathbf{G}^{\mathrm{T}}. }$$
(131)

The popular but labor-intensive procedure by which the unknown spherical harmonic expansion coefficients of a scattered data set are obtained by forming the Moore-Penrose pseudo-inverse as in Eq. (121) is thus equivalent to determining the truncated Slepian solution of Eq. (126) in the limit of continuous and equal-area, but incomplete data coverage. In that limit, the generic eigenvalue decomposition of the normal matrix becomes a specific statement of the Slepian problem as we encountered it before, namely,

$$\displaystyle{ \boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}}\Delta \Omega = \mathbf{U}\boldsymbol{\Sigma }^{2}\mathbf{U}^{\mathrm{T}}\quad \rightarrow \quad \mathbf{D} = \mathbf{G}\boldsymbol{\Lambda }\mathbf{G}^{\mathrm{T}}. }$$
(132)

Such a connection has been previously pointed out for time series (Wingham 1992) and leads to the notion of “generalized prolate spheroidal functions” (Bronez 1988) should the “Slepian” functions be computed from a formulation of the concentration problem in the scattered data space directly, rather than being determined by sampling those obtained from solving the corresponding continuous problem, as we have described here.
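As a concrete illustration of Eq. (132), the following sketch (reusing the fibonacci_sphere and real_sph_harm_matrix helpers of the earlier whole-sphere example; bandwidth, cap size, and grid density are arbitrary choices of ours) builds the sampled Slepian basis for a polar cap from the eigenvalue decomposition of the regional normal matrix and then applies the coefficient estimate of Eq. (126).

```python
# Sampled Slepian basis for a 30-degree polar cap via Eq. (132), followed by
# the regional coefficient estimate of Eq. (126). Reuses fibonacci_sphere and
# real_sph_harm_matrix from the earlier sketch.
import numpy as np

L, M = 15, 40000
colat, lon = fibonacci_sphere(M)
inR = colat <= np.radians(30.0)                 # the region R
YR = real_sph_harm_matrix(L, colat[inR], lon[inR])
dOmega = 4.0 * np.pi / M                        # per-pixel solid angle (~ A / M_R on this grid)

D = dOmega * (YR @ YR.T)                        # localization matrix, Eq. (109)
lam, G = np.linalg.eigh(D)
lam, G = lam[::-1], G[:, ::-1]                  # sort by decreasing concentration
N3D = int(round(lam.sum()))                     # Shannon number, about (A/4pi)(L+1)^2
GR = G.T @ YR                                   # sampled Slepian functions, Eq. (114)

flm = np.random.default_rng(1).standard_normal((L + 1) ** 2)   # a bandlimited test signal
fR = YR.T @ flm                                 # its samples inside R, cf. Eq. (107)
t_true = G.T @ flm                              # Slepian coefficients, Eq. (124)
t_hat = dOmega * (GR[:N3D] @ fR) / lam[:N3D]    # regional estimate, Eq. (126), truncated
print(N3D, np.abs(t_hat - t_true[:N3D]).max())  # well-concentrated coefficients are recovered
```

Beyond α = N3D the eigenvalues are essentially zero, and in the presence of noise the division in Eq. (126) would amplify it without bound, which is precisely why the expansion is truncated.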

Above, we showed how to stabilize the inverse problem of Eq. (120) by damping. We dealt with the case of continuously available data only; the form in which it appears in Eq. (81) makes it clear that damping is hardly practical for scattered data. Indeed, it requires knowledge of the complementary localization operator \(\bar{\mathbf{D}}\), in addition to being sensitive to the choice of η, whose optimal value depends implicitly on the unknown signal-to-noise ratio (Simons and Dahlen 2006). The data-driven approach taken in Eq. (121) is the more sensible one (Xu 1998). We have now seen that, in the limit of continuous partial coverage, this corresponds to the optimal solution of the problem formulated directly in the Slepian basis.

It is consequently advantageous to also work in the Slepian basis in case the data collected are scattered but closely collocated in some region of interest. Prior knowledge of the geometry of this region and a prior idea of the spherical harmonic bandwidth of the data to be inverted allow us to construct a Slepian basis for the situation at hand, and the problem of finding the Slepian expansion coefficients of the unknown underlying function can be solved using Eqs. (127) and (128). The extent to which this approach agrees with the theoretical form of Eq. (126) will depend on how regularly the data are distributed within the region of study, i.e., on the error in the approximation (117). But if indeed the scattered-data Slepian normal matrix \(\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{G}}^{\mathsf{T}}\) is nearly diagonal in its first J × J-dimensional block, because the collocated observations happen to be favorably, if irregularly, distributed, then Eq. (126), which, strictly speaking, requires no matrix inversion, can be applied directly. If this is not the case, but the data are still collocated or we are only interested in a local approximation to the unknown signal, we can restrict \(\boldsymbol{\mathsf{G}}\) to its first J rows prior to diagonalizing \(\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{G}}^{\mathsf{T}}\) or performing the svd of a partial \(\boldsymbol{\mathsf{G}}^{\mathsf{T}}\), as necessary to calculate Eqs. (127) and (128). Compared to solving Eqs. (120) and (121), the computational savings will still be substantial, as only when \(R \approx \Omega\) will the operator \(\boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}}\) be nearly diagonal. Truncation of the eigenvalues of \(\boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}}\) is akin to truncating the matrix \(\boldsymbol{\mathsf{G}}\boldsymbol{\mathsf{G}}^{\mathsf{T}}\) itself, which is diagonal or nearly so. With the theoretically, continuously determined, sampled Slepian functions as a parametrization, the truncated expansion is easy to obtain and the solution will be locally faithful within the region of interest R. In contrast, should we truncate \(\boldsymbol{\mathsf{Y}}\boldsymbol{\mathsf{Y}}^{\mathsf{T}}\) itself, without first diagonalizing it, we would be estimating a low-degree approximation of the signal, which would have poor resolution everywhere. See Slobbe et al. (2012) for a set of examples in a slightly expanded and numerically more challenging context.

4.1.3 Bias and Variance

For completeness we briefly return to the expressions for the mean-squared estimation error of the damped spherical-harmonic and the truncated Slepian function methods, Eqs. (85) and (86), which we quoted for the example of “white” signal and noise with power S and N, respectively. Introducing the \((L + 1)^{2} \times (L + 1)^{2}\)-dimensional spectral matrices

$$\displaystyle{ \mathbf{H} =\boldsymbol{ \Lambda } +\eta (\mathbf{I} -\boldsymbol{ \Lambda }), }$$
(133a)
$$\displaystyle{ \mathbf{V} = N\mathbf{H}^{-2}\boldsymbol{\Lambda },\quad \mbox{ and}\quad \mathbf{B} = \sqrt{S}\,\mathbf{H}^{-1}(\mathbf{I} -\boldsymbol{ \Lambda }), }$$
(133b)

we handily rewrite the “full” version of Eq. (85) in two spatial variables as the error covariance matrix

$$\displaystyle{ \langle \epsilon (\mathbf{\hat{r}})\epsilon (\mathbf{\hat{r}}')\rangle =\boldsymbol{ \mathsf{G}}^{\mathsf{T}}\left (\mathbf{V} +\eta ^{2}\mathbf{B}^{2}\right )\boldsymbol{\mathsf{G}}. }$$
(134)

We subdivide the matrix with Slepian functions into the truncated set of the best-concentrated \(\alpha = 1 \rightarrow J\) and the complementary set of remaining \(\alpha = J + 1 \rightarrow (L + 1)^{2}\) functions, as follows:

$$\displaystyle{ \boldsymbol{\mathsf{G}} =\big (\;\boldsymbol{\underline{\mathsf{G}}}^{\mathsf{T}}\;\;\boldsymbol{\bar{\mathsf{G}}}^{\mathsf{T}}\big)^{\mathsf{T}}, }$$
(135)

and similarly separate the eigenvalues, writing

$$\displaystyle\begin{array}{rcl} \boldsymbol{\underline{\Lambda }}& =& \text{diag}\left (\;\lambda _{1}\;\cdots \;\lambda _{J}\;\right ),{}\end{array}$$
(136a)
$$\displaystyle\begin{array}{rcl} \boldsymbol{\bar{\Lambda }}& =& \text{diag}\left (\;\lambda _{J+1}\;\cdots \;\lambda _{(L+1)^{2}}\right ).{}\end{array}$$
(136b)

Likewise, the identity matrix is split into two parts, \(\bar{\mathbf{I}}\) and \(\boldsymbol{\underline{\mathrm{I}}}\). If we now also redefine

$$\displaystyle{ \mathbf{\bar{V}} = N\boldsymbol{\bar{\Lambda }}^{-1},\quad \mbox{ and}\quad \mathbf{\bar{B}} = \sqrt{S}\,\bar{\mathbf{I}}, }$$
(137a)
$$\displaystyle{ \boldsymbol{\underline{\mathrm{V}}} = N\boldsymbol{\underline{\Lambda }}^{-1},\quad \mbox{ and}\quad \boldsymbol{\underline{\mathrm{B}}} = \sqrt{S}\,\boldsymbol{\underline{\mathrm{I}}}, }$$
(137b)

the equivalent version of Eq. (86) is readily transformed into the full spatial error covariance matrix

$$\displaystyle{ \langle \epsilon (\mathbf{\hat{r}})\epsilon (\mathbf{\hat{r}}')\rangle = \boldsymbol{\underline{\mathsf{G}}}^{\mathsf{T}}\boldsymbol{\underline{\mathrm{V}}}\boldsymbol{\underline{\mathsf{G}}} +\boldsymbol{ \bar{\mathsf{G}}}^{\mathsf{T}}\mathbf{\bar{B}}^{2}\boldsymbol{\bar{\mathsf{G}}}. }$$
(138)

In selecting the Slepian basis we have thus successfully separated the effect of the variance and the bias on the mean-squared reconstruction error of a noisily observed signal. If the region of observation is a contiguous closed domain \(R \subset \Omega\) and the truncation takes place at the Shannon number \(J = N^{\mathrm{3D}}\), we have thereby identified the variance as due to noise in the region where data are available and the bias as due to signal neglected in the truncated expansion, which, in the proper Slepian basis, corresponds to the regions over which no observations exist. In practice, the truncation will happen at some J that depends on the signal-to-noise ratio (Simons and Dahlen 2006) and/or on computational considerations (Slobbe et al. 2012).
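Continuing the numerical example of the polar cap (and reusing its variables lam, GR, and N3D), a few lines suffice to evaluate the diagonals of the error covariances (134) and (138) at the observation pixels; the white powers S and N and the damping η are illustrative values of ours.

```python
# Pointwise mean-squared error inside R: the damped spherical-harmonic route,
# diag of Eq. (134), versus the truncated Slepian route, diag of Eq. (138).
# Reuses lam, GR and N3D from the polar-cap sketch; S, N, eta are illustrative.
import numpy as np

S, N, eta = 1.0, 0.05, 1e-2
H = lam + eta * (1.0 - lam)                     # diagonal of H, Eq. (133a)
Vd = N * lam / H ** 2                           # diagonal of V, Eq. (133b)
B2 = S * (1.0 - lam) ** 2 / H ** 2              # diagonal of B^2

mse_damped = (Vd + eta ** 2 * B2) @ GR ** 2     # Eq. (134) evaluated at each pixel
mse_trunc = (N / lam[:N3D]) @ GR[:N3D] ** 2 + S * np.sum(GR[N3D:] ** 2, axis=0)  # Eq. (138)
print(mse_damped.mean(), mse_trunc.mean())      # average error over the cap pixels
```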

Finally, we shall also apply the notions of discretely acquired data to the solutions of problem (ii), below.

4.2 Problem (ii), Revisited

We need two more pieces of notation in order to rewrite the expressions for the spectral estimates (89) and (95) in the “pixel basis.” First we construct the M × M-dimensional symmetric spatial matrix collecting the fixed-degree Legendre polynomials evaluated at the angular distances between all pairs of observation points,

$$\displaystyle{ \boldsymbol{\mathsf{P}}_{l} = \left (\frac{2l + 1} {4\pi } \right )\left (\begin{array}{*{20}c} P_{l}(\mathbf{\hat{r}}_{1} \cdot \mathbf{\hat{r}}_{1}) &\cdots & P_{l}(\mathbf{\hat{r}}_{1} \cdot \mathbf{\hat{r}}_{j}) &\cdots & P_{l}(\mathbf{\hat{r}}_{1} \cdot \mathbf{\hat{r}}_{M})\\ & & \vdots & & \\ \cdots & & P_{l}(\mathbf{\hat{r}}_{i} \cdot \mathbf{\hat{r}}_{j}) & & \cdots \\ & & \vdots & & \\ P_{l}(\mathbf{\hat{r}}_{M} \cdot \mathbf{\hat{r}}_{1})&\cdots &P_{l}(\mathbf{\hat{r}}_{M} \cdot \mathbf{\hat{r}}_{j})&\cdots &P_{l}(\mathbf{\hat{r}}_{M} \cdot \mathbf{\hat{r}}_{M}) \end{array} \right ). }$$
(139)

The elements of \(\boldsymbol{\mathsf{P}}_{l}\) are thus \(\sum _{m=-l}^{l}Y _{lm}(\mathbf{\hat{r}}_{i})Y _{lm}(\mathbf{\hat{r}}_{j})\), by the addition theorem, Eq. (30). And finally, we define \(\boldsymbol{\mathsf{G}}_{l}^{\alpha }\), the M × M symmetric matrix with elements given by

$$\displaystyle{ \left (\boldsymbol{\mathsf{G}}_{l}^{\alpha }\right )_{ ij} = \left (\frac{2l + 1} {4\pi } \right )g_{\alpha }(\mathbf{\hat{r}}_{i})P_{l}(\mathbf{\hat{r}}_{i} \cdot \mathbf{\hat{r}}_{j})g_{\alpha }(\mathbf{\hat{r}}_{j}). }$$
(140)

4.2.1 The Spherical Periodogram

The expression equivalent to Eq. (89) is now written as

$$\displaystyle{ \hat{S}_{l}^{\mathrm{SP}} = \left ( \frac{4\pi } {A}\right )\frac{(\Delta \Omega )^{2}} {2l + 1} \,\boldsymbol{\mathsf{d}}^{\mathsf{T}}\boldsymbol{\mathsf{P}}_{ l}\,\boldsymbol{\mathsf{d}}, }$$
(141)

where the column vector \(\boldsymbol{\mathsf{d}}\) contains the sampled data, in the notation of Eq. (103). This lends itself easily to computation, and the statistics of Eqs. (90)–(93) hold, approximately, for sufficiently densely sampled data.
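In the same numerical setting as before (a sketch of ours, reusing colat, lon, inR, dOmega, YR, and the signal samples fR of the polar-cap example), the following lines evaluate Eq. (141) and check it against the spherical-harmonic form of Eq. (89); by the addition theorem, Eq. (30), the two quadratic forms agree to rounding error.

```python
# Pixel-basis spherical periodogram, Eq. (141), versus the harmonic form, Eq. (89).
# Reuses colat, lon, inR, dOmega, YR and fR from the polar-cap sketch; d plays
# the role of the observed data inside R.
import numpy as np
from scipy.special import eval_legendre

r = np.column_stack([np.sin(colat[inR]) * np.cos(lon[inR]),
                     np.sin(colat[inR]) * np.sin(lon[inR]),
                     np.cos(colat[inR])])              # unit vectors of the pixels in R
A = dOmega * r.shape[0]                                # area of R from its pixel count
d = fR                                                 # the sampled data inside R

def periodogram_pix(l):
    Pl = (2 * l + 1) / (4 * np.pi) * eval_legendre(l, r @ r.T)         # Eq. (139)
    return (4 * np.pi / A) * dOmega ** 2 / (2 * l + 1) * (d @ Pl @ d)  # Eq. (141)

def periodogram_sh(l):
    coeffs = dOmega * (YR[l ** 2:(l + 1) ** 2] @ d)    # Riemann sums for the integrals over R
    return (4 * np.pi / A) / (2 * l + 1) * np.sum(coeffs ** 2)         # Eq. (89)

ells = range(4)
print(np.allclose([periodogram_pix(l) for l in ells],
                  [periodogram_sh(l) for l in ells]))  # True, by the addition theorem
```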

4.2.2 The Spherical Multitaper Estimate

Finally, the expression equivalent to Eq. (95) becomes

$$\displaystyle{ \hat{S}_{l}^{\mathrm{MT}} =\sum _{ \alpha =1}^{(L+1)^{2} }\lambda _{\alpha }\left ( \frac{4\pi } {N^{\mathrm{3D}}}\right )\frac{(\Delta \Omega )^{2}} {2l + 1} \,\boldsymbol{\mathsf{d}}^{\mathsf{T}}\boldsymbol{\mathsf{G}}_{ l}^{\alpha }\boldsymbol{\mathsf{d}}. }$$
(142)

Both Eqs. (141) and (142) are quadratic forms, earning them the nickname “quadratic spectral estimators” (Mullis and Scharf 1991). The key difference with the maximum-likelihood estimator popular in cosmology (Bond et al. 1998; Oh et al. 1999; Hinshaw et al. 2003), which can also be written as a quadratic form (Tegmark 1997), is that neither \(\boldsymbol{\mathsf{P}}_{l}\) nor \(\boldsymbol{\mathsf{G}}_{l}^{\alpha }\) depends on the unknown spectrum itself, so both can easily be precomputed. In contrast, maximum-likelihood estimation is inherently nonlinear, requiring iteration to converge to the most probable estimate of the power spectral density (Dahlen and Simons 2008). As such, given a pixel grid, a region of interest R, and a bandwidth L, Eq. (142) produces a consistent localized multitaper power spectral estimate in one step.

The estimate (142) has the statistical properties that we listed earlier as Eqs. (96)–(99). These continue to hold when the data pixelization is fine enough for integral expressions of the kind (104) to be sufficiently accurate. As mentioned before, for completely irregularly and potentially sparsely distributed discrete data on the sphere, “generalized” Slepian functions (Bronez 1988) could be constructed specifically for the purpose of power spectral estimation and used to build the operator (140).
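A matching sketch for Eq. (142), reusing r, d, and dOmega from the periodogram example and the tapers GR, eigenvalues lam, and Shannon number N3D from the polar-cap sketch, is given below. The sum over tapers is truncated at N3D, since the remaining eigenvalue weights are negligible; for the white, unit-power test signal the estimates should scatter around one.

```python
# Pixel-basis eigenvalue-weighted multitaper estimate, Eq. (142), restricted to
# the pixels in R where the data exist (the well-concentrated tapers are
# negligible outside R anyway) and truncated to the first N3D tapers.
import numpy as np
from scipy.special import eval_legendre

def multitaper_pix(l, ntapers=N3D):
    Pl = (2 * l + 1) / (4 * np.pi) * eval_legendre(l, r @ r.T)   # Legendre part of Eq. (140)
    est = 0.0
    for a in range(ntapers):
        gd = GR[a] * d                                  # taper the data: g_alpha(r_j) d(r_j)
        est += lam[a] * (gd @ Pl @ gd)                  # lambda-weighted d^T G_l^alpha d
    return (4 * np.pi / N3D) * dOmega ** 2 / (2 * l + 1) * est   # Eq. (142)

print([round(multitaper_pix(l), 3) for l in range(4)])  # near the test signal's unit power,
                                                        # up to sampling scatter
```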

5 Conclusions

What is the information contained in a bandlimited set of scientific observations made over an incomplete, e.g., temporally or spatially limited, sampling domain? How can this “information,” e.g., an estimate of the signal itself, or of its energy density, be determined from noisy data, and how shall it be represented? These seemingly age-old fundamental questions, which have implications beyond the scientific (Slepian 1976), had been solved heuristically by engineers, some say by conveniently ignoring them, well before they received their first satisfactory theoretical answers from Slepian, Landau, and Pollak (Slepian and Pollak 1961; Landau and Pollak 1961, 1962), first for “continuous” time series and later generalized to the multidimensional and discrete cases (Slepian 1964; Slepian 1978; Bronez 1988). By the “Slepian functions” in the title of this contribution, we have lumped together all functions that are “spatiospectrally” concentrated, quadratically, in the original sense of Slepian. In one dimension, these are the “prolate spheroidal functions” whose popularity is as enduring as their utility. In two Cartesian dimensions and on the surface of the unit sphere, both scalar and vectorial, their time for applications in geomathematics has come.

The answers to the questions posed above are as relevant as ever for the geosciences of today. There, we often face the additional complications of irregularly shaped study domains, scattered observations of noise-contaminated potential fields, perhaps collected from an altitude above the source by airplanes or satellites, and an acquisition and model-space geometry that is rarely if ever symmetric. Thus the Slepian functions are especially suited for geoscientific applications and, more generally, for the study of any type of geographically distributed information.

Two problems that are of particular interest in the geosciences, but also further afield, are how to form a statistically “optimal” estimate of the signal giving rise to the data and how to estimate the power spectral density of such signal. The first, an inverse problem that is linear in the data, applies to forming mass flux estimates from time-variable gravity, e.g., by the GRACE mission (Harig and Simons 2012), or to the characterization of the terrestrial or planetary magnetic fields by satellites such as CHAMP, Swarm, or MGS. The second, which is quadratic in the data, is of interest in studying the statistics of the Earth’s or planetary topography and magnetic fields (Lewis and Simons 2012; Beggan et al. 2013) and especially for the cross-spectral analysis of gravity and topography (Wieczorek 2008), which can yield important clues about the internal structure of the planets. The second problem is also of great interest in cosmology, where missions such as WMAP and Planck are mapping the cosmic microwave background radiation, which is best modeled spectrally to constrain models of the evolution of our universe.

Slepian functions, as we have shown by focusing on the scalar case in spherical geometry, provide the mathematical framework to solve such problems. They are a convenient and easily obtained doubly orthogonal mathematical basis in which to express, and thus by which to recover, signals that are geographically localized or incompletely (and noisily) observed. For this they are much better suited than the traditional Fourier or spherical harmonic bases, and they are more “geologically intuitive” than wavelet bases in retaining a firm geographic footprint and preserving the traditional notions of frequency or spherical harmonic degree. They furthermore perform extremely well as data tapers to regularize the inverse problem of power spectral density determination from noisy and patchy observations, which can then be solved satisfactorily without costly iteration. Finally, because the Slepian functions arise as the continuous limits of the eigenfunctions of discrete, scattered-data concentration problems, much can be learned from them about the statistical nature of such inverse problems when the data provided are themselves scattered within a specific areal region of study.