Keywords

1 1 Introduction

The spherical and the oblate spheroidalFootnote 1 harmonic expansions are widely used in geodesy and geophysics (Heiskanen and Moritz 1967; Maus 2010; Pavlis et al. 2012; Lowes and Winch 2012). However, their numerical computations face with some obstacles when degree/order is high. One problem in the spherical case (Holmes and Featherstone 2002) is an underflow during the computation of \(\overline{P}_{nm}(t)\), the 4π fully-normalized Associated Legendre Function (ALF) of the first kind (Heiskanen and Moritz 1967, Sect. 1–14). Here \(t \equiv \cos \theta\) while θ is the geocentric colatitude. Additional issue in the oblate spheroidal case (Sona 1995) is the difficulty in computing Q nm (ix), the unnormalized ALF of the second kind with a pure imaginary argument (Olver et al. 2010, Sect. 14.2). This article reviews recent solutions to these problems and related issues (Fukushima 2012a,b,c,d20132014).

2 2 X-Number Formulation

An underflow occurs in the forward sectorial recursion of \(\overline{P}_{nm}(t)\) since it is a sequence of multiplications of small factors. For example, when θ = 60, an underflow happens when m > 1030 in the double precision environment of the IEEE 754 standard (IEEE 2008) where the minimum representable number is \(2^{-1023} \approx 1.1 \times 10^{-308}\). Once an underflow occurs, all the subsequent sectorial values are regarded as exact zeros in the computers. This ruins the non-sectorial recursions of \(\overline{P}_{nm}(t)\) starting from them, which would recover the diminished ALFs to the level of the order of unity if the sectorial values remain to be non-zero even if they are extremely tiny. Thus, the underflow results a significant loss of precision in the computation of not only the point values when the maximum order M is as high as 2700 (Fukushima 2012a, Fig. 5) but also the low-order derivatives and the integrals (Fukushima 2014, Fig. 1).

This trouble is solved by the so-called X-number formulation (Fukushima 2012a) or other similar devices (Wittwer et al. 2008; Nesvadba 2008). The X-number formulation represents a real number X by a pair of a floating point number (termed F-number) x and a signed integer i X such that \(X = xB^{i_{X}}\) where B is a power of 2. If (1) x is an IEEE 754 double precision F-number, (2) i X is a 32 bit signed integer, and (3) B ≡ 2960, the minimum representable number becomes as tiny as \(\approx 1.2 \times 10^{-6.2\times 10^{12} }\) (Fukushima 2014, §1). As a result, almost no underflow occurs in the computation of \(\overline{P}_{nm}(t)\). Thus, the formulation enables the correct computation of not only the point values of the ALF but also their low-order derivatives and integrals of high degree and order such as 216 000 or more (Fukushima 2012a,c2014).

As an illustration, some sample values of them for the case n = 216 000, m = 108 000, θ 1 = 60, and θ 2 = 30 are listed below:

$$\displaystyle{ \overline{P}_{\mathit{nm}}\left (t_{1}\right ) = -3.105\;584\;633\;08(1\;662), }$$
(1)
$$\displaystyle{ \left [d\overline{P}_{\mathit{nm}}/d\theta \right ]\left (t_{1}\right ) = -6.327\;420\;821\;20(2\;954) \times 10^{4}, }$$
(2)
$$\displaystyle{ \left [d^{2}\overline{P}_{\mathit{ nm}}/d\theta ^{2}\right ]\left (t_{ 1}\right ) = +6.287\;479\;684\;91(0\;149) \times 10^{9}, }$$
(3)
$$\displaystyle{ \overline{I}_{\mathit{nm}}\left (t_{1},t_{2}\right ) = +4.839\;192\;555\;12(2\;765) \times 10^{-3}, }$$
(4)

where the erroneous digits, determined from the comparison with the quadruple precision X-number computation, are shown in parentheses.

3 3 Acceleration of ALF Computation

In general, the X-number formulation results an increase in the CPU time of a factor 2–3. As already reported in Jekeli et al. (2007), \(\overline{P}_{\mathit{nm}}(t)\) starts to oscillate with respect to n when n is sufficiently larger than m (Fukushima 2014, Fig. 2). This suggests a possible switch from the X- to F-number computations during the recursion. The switch is dynamically conducted when both \(\overline{P}_{n-1,m}(t)\) and \(\overline{P}_{n-2,m}(t)\) can be regarded as F-numbers, namely when both i X of \(\overline{P}_{n-1,m}(t)\) and i X of \(\overline{P}_{n-2,m}(t)\) are 0. This device reduces significantly the CPU time increase (Fukushima 2014, Fig. 6).

Another technique to accelerate the computation of ALFs is the folded parallel computation (Fukushima 2012d). A pairing of the fixed-order increasing-degree recursions of order m and Mm, where M is the maximum order, equalizes the computational load of different-order recursions conducted in parallel at multiple processor units. Consequently, its simple implementation by the OpenMP architecture (OpenMP ARB 2011) achieves the acceleration factor being the same as the number of processor units (Fukushima 2012d, Fig. 1).

4 4 Effect of Underflow of Sectorial Integrals on Non-sectorial Integrals

During the investigation of the computation of the integral of the ALF of the first kind, \(\overline{I}_{\mathit{nm}}\left (t_{1},t_{2}\right ) \equiv \int _{t_{1}}^{t_{2}}\overline{P}_{nm}(t)dt\), it is noticed that an underflow, which might occur during the forward recursion of the sectorial integrals (Fukushima 2014, Eq. (A.10)), causes no problem in the subsequent non-sectorial computation (Fukushima 2014).

The reason of this phenomenon becomes clear by examining the dependence of the non-sectorial integral on the sectorial one. The fixed-order increasing-degree recurrence formulas of \(\overline{I}_{nm}\) are expressed (Fukushima 2014, Eqs. (A.2) and (A.8)) as

$$\displaystyle{ \overline{I}_{m+1,m} = -\overline{J}_{m+1,m},\;\overline{I}_{\mathit{nm}} = f_{nm}\overline{I}_{n-2,m} -\overline{J}_{nm},\;(n \geq m + 2) }$$
(5)

where (1) \(\overline{J}_{nm}\) is a partial integral computed from \(\overline{P}_{nm}\) at the two end points of the integration interval as

$$\displaystyle{ \overline{J}_{nm} \equiv \sqrt{\frac{(2n + 1)(2n - 1)} {(n + m)(n - m)}} \left (\frac{u_{2}^{2}\overline{P}_{nm}\left (t_{2}\right ) - u_{1}^{2}\overline{P}_{nm}\left (t_{1}\right )} {n + 1} \right ), }$$
(6)

while \(u \equiv \sin \theta\), (2) f nm is a numerical constant defined as

$$\displaystyle\begin{array}{rcl} f_{nm}& \equiv & \frac{n - 2} {n + 1}\sqrt{ \frac{(2n + 1)(n + m - 1)(n - m - 1)} {(2n - 3)(n + m)(n - m)}}, \\ & & (n \geq m + 2 \geq 2) {}\end{array}$$
(7)

and (3) the arguments t 1 and t 2 are omitted where no confusion is introduced.

Thus, \(\overline{I}_{nm}\) is a linear function of \(\overline{I}_{mm}\) and a group of \(\overline{P}_{\ell m}\left (t_{j}\right )\) where m + 1 ≤  ≤ n and j = 1 and 2. If nm is odd, \(\overline{I}_{nm}\) does not depend on \(\overline{I}_{mm}\) at all. Meanwhile, if nm is even, \(\overline{I}_{nm}\) linearly depends on \(\overline{I}_{mm}\) with a product of f ℓ m as its proportional coefficient. Namely

$$\displaystyle\begin{array}{rcl} & & \left (\frac{\partial \overline{I}_{m+2k-1,m}} {\partial \overline{I}_{mm}} \right )_{\overline{P}_{ nm}} = 0, \\ & & \left (\frac{\partial \overline{I}_{m+2k,m}} {\partial \overline{I}_{mm}} \right )_{\overline{P}_{ nm}} =\prod _{ j=1}^{k}f_{ m+2j,m}.\;\;(k \geq 1){}\end{array}$$
(8)

The coefficient is less than unity since 0 < f nm  < 1 when n ≥ m + 2 ≥ 2. This means that the absolute error of the sectorial integral contributes to a smaller absolute error of the non-sectorial one. In other words, not the relative error but the absolute error must be worried in the sectorial integral computation. For this purpose, the simple forward recursion (Fukushima 2014, Eq. (A.2)) is sufficient. At any rate, this fact enables one to avoid the existing complicated approach to obtain the sectorial integrals by the backward recursion starting from the two seed values computed by the hypergeometric series (Paul 1978; Gerstl 1980; Gleason 1985).

When \(\left \vert t_{2} - t_{1}\right \vert \) is small, the direct evaluation of \(\overline{J}_{nm}\) suffers from a heavy cancellation, and therefore results a precision loss of \(\overline{I}_{nm}\). This is eminent when n and m are rather small, say less than 30 or so (Fukushima 2012b, Figs. 1 and 2). In that case, a computing method based on the cancellation-error-free evaluation of the finite differences is effective (Fukushima 2012b).

5 5 Recursive Computation of Oblate Spheroidal Harmonics of the Second Kind

In addition to the problem of \(\overline{P}_{nm}(t)\) computation, another type of problem arises in the computation of oblate spheroidal harmonic expansion. The non-angular component of the expansion is a ratio of Q nm (ix), the ALF of the second kind with a pure imaginary argument (Heiskanen and Moritz 1967). The computational difficulty of Q nm (ix) is caused by the fact that it is the minimal solution of a second-order difference equation (Gil and Segura 1998). As a result, the increasing-degree recursion to obtain Q nm (ix) is fragile against the contamination of the initially tiny but rapidly inflating component, P nm (ix), and therefore becomes quite erroneous (Sona 1995).

Thus, a various forms of hypergeometric functions have been developed instead. However, except that used in Martinec and Graferend (1997), all other forms (Hobson 1931; Jekeli 1988; Petrovskaya and Vershkov 2000; Vershkov 2002; Sebera et al. 2012; Petrovskaya and Vershkov 2013) are inappropriate for high degree and/or order (Fukushima 2013, Fig. 1) due to the cancellation problems. Meanwhile, a method using the backward recursion already exists (Gil and Segura 1998). It uses the Wronskian relation to obtain the seed values of Q nm (ix) from the values of P nm (ix) computed by the forward recursion. However, this method faces the overflow problem in the recursive computation of P nm (ix) (Gil and Segura 1998, Tables 1 and 2).

In order to overcome this situation, a new method based on the backward recursion is developed (Fukushima 2013). The key point of the new method is the usage of a hypergeometric series to evaluate Q nm (ix) developed by Petrovskaya and Vershkov (2000). Although it is not suitable for general values of n and m, it rapidly converges when m is sufficiently small, say m = 0 and 1. Thus, it can be used in obtaining the three seed values of the backward recursion. Also, the derivative computation is achieved by recursion. As a result, the new method is sufficiently precise and yet much faster than the existing method in the computation of the point values and low-order derivatives of the ratio of Q nm (ix) (Fukushima 2013, Fig. 2).

Sample values of the ratio, \(q_{nm}(x) \equiv Q_{nm}(ix)/Q_{nm}(x_{0})\), and its low-order derivatives with respect to x for the case n = 216 000, m = 108 000, \(x_{0} = b/E\), and \(x = (b + 0.1\) km)∕E, while b and \(E \equiv ae\) are those of the GRS80 system, are listed below:

$$\displaystyle{ q_{\mathit{nm}}(x) = +3.392\;092\;979\;84(8\;901) \times 10^{-2}, }$$
(9)
$$\displaystyle{ dq_{\mathit{nm}}(x)/dx = -5.989\;743\;102\;88(4\;788) \times 10^{2}, }$$
(10)
$$\displaystyle{ d^{2}q_{\mathit{ nm}}(x)/dx^{2} = +1.057\;671\;120\;81(9\;206) \times 10^{7}. }$$
(11)

6 6 Summary and Future Issues

The so-called X-number formulation (Fukushima 2012a,c2014) resolves not only the underflow problem but also the overflow problem in any kind of computation. For example, the unnormalized ALF of the first kind, P nm (t) or P nm (ix), can be computed by this formulation without suffering from the overflow problem.

The formulation is significantly accelerated by the dynamic switch from X- to F-numbers during the fixed-order increased-degree recursions of \(\overline{P}_{nm}(t)\). Also, if a suitable parallel computing environment is available, a further speed-up is achieved by the folded parallel execution of fixed-order recursions (Fukushima 2012d).

Consequently, the formulation realizes an accurate, precise, and fast computation of the point values, the derivatives, and the integrals of the ALF of the first kind. This enables us to conduct not only the spherical harmonic synthesis but also the spherical harmonic analysis of high degree and order as 216 000 or more.

On the other hand, a backward recursive formulation computes the point values of Q nm (ix) precisely and quickly (Fukushima 2013). This has lowered the computational difficulty of the oblate spheroidal harmonic synthesis to the level of the popular spherical harmonic synthesis. Thus, the next problem to be investigated is the oblate spheroidal harmonic analysis (Wang and Yang 2013). If that is completed, there will be ‘nothing to fear from oblate spheroidal harmonics’ (Nesvadba 2011).

The Fortran 77/90 programs of these computations as well as output examples are available from the following website:

https://www.researchgate.net/profile/Toshio_Fukushima/

The author appreciates many valuable suggestions by Dr. J. Sebera and two anonymous referees.