1 Introduction

There has been an abundance of work around the world on geoid determination and gravity data evaluation in recent years. Many countries are interested in 1 cm-accurate geoid determination which involves as a first step the evaluation and cleansing of available gravity databases (e.g., Denker et al. 2008; Forsberg et al. 2003; Véronneau and Huang 2007; Ågren et al. 2006; Featherstone et al. 2011; Blitzkow 1999; Medvedev and Nepoklonov 2002; Junyong et al. 2001; Merry 2003; Kuroishi 2001; Hwang 1997; Bae et al. 2012; Wang et al. 2012).

In the USA, the NGS gravity data have been collected over more than half a century by different organizations using various hardware and software. Such an old and heterogeneous data compilation is likely to be contaminated with several types of random and systematic errors (Heck 1990). This digital gravity database was inherited and has been used to compute several US geoid models since 1990 (Milbert 1991; Smith and Milbert 1999; Smith and Roman 2001; Roman et al. 2004; Wang et al. 2012). The accuracy of these models has improved over the years, from a few decimeters to the sub-decimeter level, due to improvements in altimetry and other satellite-derived gravimetric and topographic models.

In recent years, the question has been asked whether the NGS surface gravity data are accurate enough to support the computation of a “centimeter-accurate” geoid. Jekeli (2009) presents an answer in the form of a general error analysis for deriving data requirements necessary to achieve 1-cm geoid accuracy. Wang et al. (2010); Wang et al. (2012) attempted to answer this question by exploring several geoid computation methods and comparing the resulting models to control data such as geoid heights computed from independent data at GPS-occupied bench marks and tide gauges and astro-geodetic deflections of the vertical. Unfortunately, most existing control data are old and likely to be inaccurate. Also unfortunate is the fact that modern and accurate control data are too expensive to collect and therefore still very scarce. Consequently, it has not been clear whether the disagreement between gravimetric geoid models and control data are due to shortcomings of the gravity or the control data.

In an attempt to contribute to the answer of the above question, this paper derives a few quality measures for the NGS’ surface gravity database and examines their geoid implications, without relying on any control data (with the exception of the freely available and accurate GRACE-derived models). Long and short-wavelength gravity errors are estimated separately, and their effects on the corresponding geoid are quantified. This work and its conclusions are based on the premise that if the geoid implications of the derived gravity accuracy measures are much larger than 1 cm, and if the effort involved in reducing these errors to the 1 cm level is formidable, then the surface gravity data alone are insufficient. Otherwise, they are sufficient.

Section 2 describes the data. In Sect. 3, the long-wavelength content of the surface anomalies is compared to GRACE-derived gravity for estimating regional gravity errors (Huang et al. 2008). The corresponding long-wavelength geoid errors are computed by differencing two geoid models based on the same NGS surface gravity anomalies. The first model is designed to be faithful to GRACE in its long-wavelengths in the spectral band between harmonic degrees 2 and 120, while the second is faithful to the surface gravity data over all wavelengths (Wang et al. 2012).

Section 4 estimates local gravity biases and high-frequency surface gravity anomaly errors based on crossover discrepancies of overlapping gravity surveys. The 1,489 individual gravity surveys in CONUS are separated into different files. Each survey is represented by a track-like linear feature. Crossover locations between all overlapping surveys are computed (Wessel and Watts 1988; Denker and Roland 2005) by detecting all intersections of all linear features. Crossover discrepancies at all intersections are then computed and used to estimate local survey biases of all surveys. Maps and statistics of the significant biases and their geoid implications are presented. These biases are removed from the crossover discrepancies of the respective surveys leading to crossover-derived high-frequency gravity anomaly errors.

Section 5 describes the estimation of another independent set of high-frequency errors by applying the \(K\)-nearest neighbors (KNN) prediction method (Tscherning et al. 1991) to bias-free residual gravity anomalies. In Sect. 6, the crossover- and KNN-derived high-frequency gravity errors are propagated in Stokes’ integral (see Heiskanen and Moritz 1967, section 7.8) to give high-frequency geoid errors. Finally, the findings of this work are summarized and explained in the discussion and conclusions section.

2 Data

The gravity surveys used in this work were submitted decades ago by many academic and commercial sources to half a dozen US and international agencies (Hittelman et al. 1982; Moose 1986). NGS acquired the gravity databases of these agencies, added them to its own data collection and validated and edited the data further. This validation effort was done mainly during the 1980s and early 1990s by a group of several people who have since either retired or left the gravity group. Data were validated based mainly on manual identification of spikes in 3-D Bouguer anomaly plots. The data causing the spikes were either corrected by better interpolating the heights from topographic maps or, if that did not correct the spikes, flagged for rejection. In addition, different overlapping data sources were compared and data deemed unreliable were flagged for rejection. This effort and additional editing done at the DMA produced the current NGS’ digital gravity database.

The NGS digital gravity database in CONUS (\(24^{\circ }\text{ N} \,{\le }\, \text{ latitude} \,{\le }\, {\sim } 50{^{\circ }}\text{ N}; 227{^{\circ }} \,{\le }\, \text{ longitude} \,{\le }\, 303{^{\circ }}\)) contains \({\sim }1.4\) million gravity observations comprising 1,489 land and ship surveys. The great majority of this data are “spot gravity observations”, i.e., observations at unmarked points on the ground. The position and height values of most of these points were interpolated from topographic maps. A small fraction of the heights were measured by barometric leveling. Some gravity observations were taken at controlled intersections or bridges where horizontal locations can be accurately pinpointed on the topographic map; thus heights could be interpolated more accurately. Our attempts to replace the elevations of the data records with interpolated digital elevation model (DEM) elevations led to worse geoid models because of the limited accuracy of available DEMs and limited knowledge of the point’s true latitude and longitude. The NGS gravity database also includes gravity observations at \({\sim }10{,}500\) leveled bench marks. These observations are not used in this study since their accuracy does not need any further evaluation.

The digital gravity data records contain information about geographic location with respect to NAD27, elevation with respect to NGVD29, depth under the surface in case the gravimeter was submerged, IGSN71 gravity value, observing agency (e.g., DMA, NGS, USGS, etc.), survey number and observation type (e.g., land, ship, airborne, submerged observations, etc.). The depth attribute for submerged observations refers to the depth of the gravimeter under water. However, the depth attribute for other types of surveys is questionable. In non-submerged ship data, for example, non-zero depth is interpreted as the bathymetry at that location. Other questionable information on the data records (which is not used) includes estimates of the accuracy of the position, height and gravity anomalies. The latter, for example, has a nominal value of 1, 2 or 3 mGal in 90 % of the database records. Although date and time information should be available in the original paper records, such information is absent in the NGS digital database except for the Canadian data (which are not used in this study).

In addition to our surface anomalies, a \(1^{\prime }\times 1^{\prime }\) grid of DNSC2008GRA altimetry-derived gravity anomalies (Andersen et al. 2010) is used to populate all \(1^{\prime }\times 1^{\prime }\) US territorial oceanic cells in all geoid computations. The SRTM-DTED1 (Slater et al. 2006) \(3^{\prime \prime }\times 3^{\prime \prime }\) elevations are used for all RTM computations (Forsberg 1984; Wang et al. 2012) and EGM2008 (Pavlis et al. 2012) is used as a reference model.

A free air gravity anomaly residual is computed for each gravity observation (Fig. 1):

$$\begin{aligned} \Delta g_\mathrm{res} =\Delta g-\Delta g_\mathrm{Synth} -\Delta g_\mathrm{RTM} \end{aligned}$$
(1)

where \(\Delta g\) is the database surface free air gravity anomaly, \(\Delta g_\mathrm{Synth}\) is a synthesized surface free air anomaly computed from the full power of EGM2008 and \(\Delta g_\mathrm{RTM}\) is the RTM gravity effect, computed using the program TC.f (Forsberg 1984). \(\Delta g_\mathrm{RTM}\) is computed using an inner radius of 50 km for the integration over the \(3^{\prime \prime }\) SRTM-DTED1 and an outer radius of 167 km for integration over a \(1^{\prime }\times 1^{\prime }\) grid of averaged SRTM-DTED1 values. The DEM is splined to fit local heights, and the reference DEM is a grid of \(5^{\prime }\times 5^{\prime }\) of averaged SRTM-DTED1 values (Wang et al. 2012).

Fig. 1
figure 1

Residual surface free air gravity anomaly in CONUS. \({\sim }2{,}700\) residuals with absolute value \({>}40\) mGal are excluded, most of which belong to an old airborne gravity survey in northeast North Carolina, mistakenly classified in the NGS database as a surface survey

The subtracted quantities in Eq. (1) remove from our gravity anomalies all modeled anomalies in the spectral band spanned by the harmonic degrees 2–216,000. Whatever is left in the residual, \(\Delta g_\mathrm{res}\), consists of data errors, any possible EGM2008 and RTM errors, in addition to very high-frequency (degrees \({>}216{,}000\)) signal.

3 Long-wavelength gravity errors and their effect on the geoid

The long-wavelength (\({>}{\sim }3^{\circ }\)) gravity errors are computed by comparing surface to independent synthetic GRACE-derived gravity anomalies (see also Huang et al. 2008). The residuals of Eq. (1) are used as input data, except that the spectral band from harmonic degrees 2 to 120 of the synthetic anomalies, \(\Delta g_\mathrm{Synth}\), are obtained from the GRACE-derived model ITG-Grace2010 (Mayer-Gürr et al. 2010) rather than EGM2008.

The content of these residuals could be viewed in the following way. The subtracted synthetic EGM2008-derived gravity anomalies in the spectral band 121–\(2{,}160\) remove all high-frequency content in this band from our surface gravity anomalies. The subtracted RTM anomalies remove all high-frequency content of higher degrees than \(2{,}160\). The remaining part mainly contains the surface gravity anomalies in the spectral band 2–120 minus the synthetic GRACE-derived surface anomalies in that same band. Thus, if the latter is considered as the absolute truth, these residuals mainly contain the long-wavelength errors of the surface point gravity anomalies in the spectral band 2–120. In addition, however, the residuals contain any existing EGM2008 and RTM errors of wavelengths shorter than \({\sim }3^{\circ }\) and very high-frequency signal and noise.

These residuals are then filtered to remove wavelengths shorter than \(3^{\circ }\). The filtering is done in two steps. First, the residuals are gridded on a global \(1^{\circ }\times 1^{\circ }\) grid using a moving-window average with exponential weights of the form \(\text{ e}^{{-r}/\lambda }\), where \(r\) is the distance of the weighted observation from the computed node and \(\lambda \) is a characteristic length of the filter. The exponential weights are used to taper steep slopes along the edges of the data (no altimetry data are used here; see Fig. 1), which also affect some land areas along the coastlines. The choice of exponential weights is based on the fact that 1-D data, including occasional step functions, are optimally smoothed (while step functions are tapered) by convolving the data with the following exponential kernel:

$$\begin{aligned} G(x,\xi )=\frac{1}{2\lambda }\text{ e}^{-{|x-\xi |}/\lambda } \end{aligned}$$
(2)

where \(x\) is the 1-D coordinate of the computation point, \(\xi \) the coordinate of the integration point and \(\lambda \) is a characteristic length of the smoothing operation (see Blake and Zisserman 1987, chapter 4). The corresponding kernel for 2-D data smoothing is a modified Bessel function of the second kind. However, if isotropy is assumed, exponential weights can still be used for smoothing 2-D data by replacing \(|x-\xi |\) by the distance \(r\) between the computation and integration points. The characteristic length, \(\lambda \), is taken as \(3^{\circ }\).

The second step passes the resulting \(1^{\circ }\times 1^{\circ }\) global smoothed error grid through a least-squares spherical harmonic analysis to degree and order 120. The computed harmonic coefficients are then used to synthesize the gravity error grid shown in Fig. 2. This figure still shows slopes along the eastern shore of the USA, mainly in the southeast. However, these slopes are believed to be related to the Gulf Stream rather than edge effects.

Fig. 2
figure 2

Long-wavelength (degrees 2–120) gravity errors

In summary, the purpose of the moving window filter is twofold: first, to remove most of the content of wavelengths shorter than \(3^{\circ }\) and, second, to minimize artificial slopes due to edge effects along the coastlines. The resulting error grid is smooth and tapered to zero in the oceans beyond the ship gravity data locations (Fig. 1). The magnitude of the long-wavelength errors of Fig. 2 is a fraction of an mGal, almost an order of magnitude smaller than those obtained by Huang et al. (2008), particularly in the Rocky Mountains. We believe that the difference is a consequence of our removal of high-frequency constituents prior to the low-pass filtering in order to prevent harmful aliasing effects.

Fig. 3
figure 3

Long-wavelength (degrees 2–120) geoid errors

The corresponding long-wavelength geoid errors were estimated by Wang et al. (2012), independent of Fig. 2, and will be summarized here due to their relevance to our work. These errors (Fig. 3) were estimated by differencing two geoid models, both based on the same NGS surface-gravity data and the DNSC2008GRA altimetry-derived gravity anomalies, and on EGM2008 as a reference model. The first model is computed with truncated Stokes’ kernel according to Wong and Gore (1969), which excludes any surface and altimetry-derived gravity contributions in the spectral range between harmonic degrees 2–120 (whatever is left is on the 1 mm level), and replaces them by those derived from EGM2008. Thus:

$$\begin{aligned} N_1&= N_\mathrm{EGM\,2008} +N_\mathrm{RTM} +\frac{R}{4\pi \gamma }\nonumber \\&\times \int \!\!\int {\Delta g_\mathrm{res} \cdot [S(\psi )-\Delta S(\psi )]\cdot \text{ d}\sigma } \end{aligned}$$
(3)

where \(\Delta g_\mathrm{res}\) is the gravity anomaly residual computed according to Eq. (1), \(S(\psi )\) is Stokes’ kernel and:

$$\begin{aligned} \Delta S(\psi ) = \sum _{n=2}^{120} {\frac{2n+1}{n-1}} P_n (\cos \psi ) \end{aligned}$$
(4)

is the portion of the kernel to be excluded.

The second geoid model is computed with the traditional Stokes’ kernel, which honors the surface and altimetry-derived gravity data over all wavelengths. Thus:

$$\begin{aligned} N_2&= N_\mathrm{EGM\,2008} +N_\mathrm{RTM} +\frac{R}{4\pi \gamma }\int \!\!\int {\Delta g_\mathrm{res} \cdot S(\psi )} \cdot \text{ d}\sigma \nonumber \\ \end{aligned}$$
(5)

The difference between the two models:

$$\begin{aligned} \delta N=N_2 -N_1 =\frac{R}{4\pi \gamma }\int \!\!\int {\Delta g_\mathrm{res} \cdot \Delta S(\psi )\cdot \text{ d}\sigma }. \end{aligned}$$
(6)

is reduced by the orthogonality of spherical harmonics (Heiskanen and Moritz 1967) to:

$$\begin{aligned} \delta N=\frac{R}{4\pi \gamma }\int \!\!\int {\Delta g_\mathrm{res}^{[2-120]} \cdot \Delta S(\psi )\cdot \text{ d}\sigma } \end{aligned}$$
(7)

where \(\Delta g_\mathrm{res}^{[2-120]}\) is the long-wavelength (\({>}3{^{\circ }}\)) portion of the residuals, bounded by degrees 2 and 120.

The difference given in Eq. (7) and Fig. 3, which reaches several decimeters in the interior of the USA, mainly represents the long-wavelength geoid errors caused by long-wavelength surface gravity data errors. In addition, however, this difference also contains the following smaller error sources:

  1. 1.

    A contribution from long-wavelength altimetry-derived gravity errors, which could reach several centimeters along the coastlines, but only a few centimeters in the interior of the USA (see e.g., Wang and Roman 2004).

  2. 2.

    The integral in (7) is not global, thus the orthogonality relations do not strictly apply, which introduces errors in the order of a few millimeters or less (Li and Wang 2009). We must mention here that whether we use the traditional Wong and Gore (1969) kernel (as in Eqs. (3) and (4)) or a tapered Wong and Gore kernel, the results are almost equal to within a few millimeters or less.

  3. 3.

    The long wavelengths (from degree 2 to 120) in Eqs. (37) are taken from EGM2008 rather than GRACE. The maximum difference between a GRACE- and an EGM2008-derived synthetic geoid in our CONUS window is 12 mm and the RMS of the difference is at the 1 mm level. Thus, if GRACE is considered as the absolute truth, the use of EGM2008 to represent GRACE introduces a maximal error of 12 mm in a few spots in Fig. 3.

These three error sources are minor as compared to the magnitude of the signal exposed in Fig. 3, which amount to several decimeters in the interior of the USA.

4 Crossover-derived local gravity biases and high-frequency errors

4.1 Crossover analysis: concept and definitions

Crossover analysis is routinely used for evaluating marine and airborne gravity surveys (Wessel and Watts 1988; Denker and Roland 2005). These survey types produce measurements taken along approximately linear or curved tracks, which occasionally intersect each other. A track intersects itself at “internal crossovers” (ICOs) and crosses other tracks at “external crossovers” (ECOs). The discrepancy in gravity value between two intersecting track sections at ICOs and ECOs are called ICO and ECO errors [ICOEs and ECOEs, collectively crossover error (COEs)]. The ICOEs of a track are measures of its internal consistency, while ECOEs between two tracks also reflect biases between them. In ship and airborne tracks, two consecutive points along the track are usually a few hundred meters apart at most. This keeps the gravity values at neighboring observations similar, which minimizes interpolation errors in crossovers computation.

The majority of the data considered in our study are surface land gravity data which, unlike airborne and ship gravity surveys, are usually two dimensional (see Fig. 4). In addition, most of our land data are as sparse as several kilometers and even sparser in mountains. Therefore, several modifications are necessary before crossover analysis can be applied to land gravity data.

Fig. 4
figure 4

Locations of the 2,603 observations of the NGS survey #9443

First, we replace the gravity anomalies by the residuals of Eq. (1) (Fig. 1) to remove all predictable gravity constituents from the longest wavelengths of thousands of kilometers to the shortest wavelength of about 90 m. The use of these residuals instead of gravity anomalies reduces the magnitude and roughness of the signal, which reduces any computation errors caused by the sparseness of the data.

Second, we fit a track-like linear feature through each survey. We choose this feature as the shortest path that visits each observation only once, also known as “the traveling salesman’s path” through the survey and hereafter called “the survey track”. To detect the traveling salesman’s path through each gravity survey, we implement the algorithm ofKirkpatric et al. (1983), which is based on simulated annealing. This algorithm was the first to practically solve the traveling salesman’s problem for large number of points (or traveled cities) (Press et al. 1990).

Once each survey is represented by its survey track, we intersect the survey track of each survey with itself to detect all ICOs, and with all other survey tracks to detect all ECOs. Figure 4 presents an example of a midsize survey and Fig. 5 presents its survey track, the traveling salesman path through this survey. As can be seen in Fig. 5, the survey track of most midsize (a few thousand observations) or larger surveys occasionally intersects itself creating ICOs. Survey #9443 (Fig. 4), which consists of 2,603 observations, has about three dozen ICOs, while the survey track of our largest survey of 68,580 observations creates 144,212 ICOs. Figure 6 presents an example of a pair of overlapping survey tracks, surveys #2094 and #4277, which cross each other 4,601 times.

Fig. 5
figure 5

The traveling salesman’s path through survey #9443 and the ICOs it creates (marked by red dots)

Fig. 6
figure 6

The survey tracks of survey #4277 (in green, 22,137 observations) and the part of survey 2,094, which overlaps with it (in red, 8,425 observations). They create 4,601 crossovers, but only 230 ECOs with \(|\text{ ECOEs}| > 5\) mGal are highlighted in blue. The mean of the ECOEs of this survey pair is 0.000 mGal (purely by chance), their RMS is 2.4 mGal, their minimum is \(-14.7\) mGal and maximum is 14.6 mGal. The mean of their height ECOEs is \(-0.2\) m, the RMS is 7.0 m, the minimum is \(-53.0\) m and maximum is 51.6 m

An advantage of the traveling salesman path is that each of its individual line segments (connecting two consecutive points) also tends to be short, which limits the magnitude of interpolation errors involved in the computation of the COEs. In case the traveling salesman’s path fails to ensure short intersecting sections, we limit COE computation to intersecting sections shorter than 20 km. If survey track sections longer than 20 km do intersect, the resulting COEs are not considered in our study.

We analyze residual gravity anomalies, and thus a COE consists of gravity and height errors. In addition, however, there is always an elevation difference at the crossover location between the two intersecting sections, which causes a small error component, usually on the micro-Gal level, unrelated to gravity anomaly errors. To minimize these data-unrelated errors, we also compute height COEs (or perhaps better called “height CO discrepancies”) at all crossover locations. Unlike in ship track analysis, height COEs are not necessarily small for land surveys. If the height crossover discrepancy is larger than 500 m in mountainous areas, the corresponding gravity COEs are not considered in our crossover adjustment (see Sect. 4.5), which excludes only \({\sim }570\) ECOEs located mostly along the Rocky Mountains. These height crossover discrepancies could also uncover blunders in the database heights. For example, if height crossover discrepancies are large in flat areas or in marine gravity records, this may reflect observed height errors. This helped us detect and correct thousands of marine gravity data where the height was confused with bathymetry in the NGS gravity database. Though these measurements are documented as non-submerged ship measurements, their heights were large reaching thousands of meters.

Finally, a crossover adjustment of all ECOEs is computed as described in Sect. 4.5. Since there are no time tags in our digital database, we cannot estimate gravimeter drifts. Even if time tags were available, past attempts to estimate gravimeter drifts from crossover adjustments of old and heterogeneous databases usually either failed or were found to be unnecessary, since drifts are commonly adjusted and removed prior to data dissemination and publication (e.g., Wessel and Watts 1988; Denker and Roland 2005). We therefore estimate only a single unknown bias per survey.

The ECOEs are used as input observations to this adjustment. The \(k\)th ECOE between surveys \(i\) and \(j\) contributes an observation equation of the form:

$$\begin{aligned} \text{ ECOE}_{{ij}_{k}} -e_{{ij}_{k}}&= \beta _j -\beta _i \nonumber \\ \\ P_{\mathrm{ECOE}_{{ij}_{k}}}&= 1/{\sigma _{ij}^{2}}\nonumber \end{aligned}$$
(8)

where \(\beta _i\) stands for the bias of survey \(i, e_{ij_k} \) the unbiased random error of \(\text{ ECOE}_{ij_k}, P_{\mathrm{ECOE}_{ij_k}}\) the weight of “observation” \(\text{ ECOE}_{{ij}_{k}}\) and \(\sigma _{ij}\) is the SD of all ECOEs between surveys \(i\) and \(j\). Although using ICOEs to form observation equations of the form (8) does no harm, it was not done to avoid making additional constraints that would imply more assumptions regarding gravimeter drifts (see discussion in Sect. 4.7). Instead, we gave more weight to surveys with small ICOEs as described in Sect. 4.5.

4.2 Efficient crossover detection

The geographic boundaries of each survey are found and used to infer whether two surveys roughly overlap. The observations of each survey are reordered such as to follow the traveling salesman’s path through the survey. To prevent duplicates, all survey names are ordered in a list and each survey is “intersected” only with itself and with all those that follow it in the list if their geographic boundaries suggest a possible overlap with it.

Two surveys are “intersected” by testing the possibility of intersection of all line segments of the first survey with all those of the second. Since the number of intersection tests is combinatorially large, it is essential to use an efficient technique for the detection of intersecting lines. The technique used in this work is based on computational geometric principles borrowed from chapters 1 and 7 of O’Rourke (1998).

The used technique searches for intersections of linear features by relying on the self-explanatory geometric relations: (1) collinearity, (2) “between-ness” and (3) “a point is to the left of a line”. To detect the possibility of intersection in Fig. 7a, for example, the technique tests if “either point \(A\) or \(B\)” is located “to the left of line” \(DC\) and “either point \(C\) or \(D\)” is “to the left of line” \(AB\). If the answer to both questions is positive, an intersection does occur. For the different type of intersection in Fig. 7b, the technique tests whether \(A, B\) and \(C\) are “collinear” and \(C\) is located “between” \(A\) and \(B\). To distinguish intersection types such as those of Fig. 7a from those of Fig. 7b, four collinearity tests must be made, one with points \(A, C\) and \(B\), another with points \(A, D\) and \(B,\) and then \(C, A\) and \(D\) and \(C, B\) and \(D\). Situations like the ones in Fig. 7c and d are specifically not recognized as intersections.

Fig. 7
figure 7

Crossover detection. a, b Examples for intersections to seek and c, d examples for intersections to be avoided

The key to the efficiency of this technique is in the fact that testing the necessary geometric relations relies on a single computation, namely the computation of the area of a triangle based on the coordinates of its nodes, which involves two multiplications:

$$\begin{aligned} 2\text{ Area}_\mathrm{ABC}&= (E_B -E_A )\cdot (N_\mathrm{C} -N_\mathrm{A} )-(E_\mathrm{C} -E_\mathrm{A} )\nonumber \\&\cdot (N_\mathrm{B} -N_\mathrm{A}) \end{aligned}$$
(9)

where \(N\) and \(E\) stand for the northing and easting coordinates. “A point is to the left of a line” and collinearity are then tested as follows:

$$\begin{aligned}&C \,\,{\text{ is} \text{ to} \text{ the} \text{ left} \text{ of} \text{ line} }\ \mathbf{A}B\quad {\text{ if}} \quad 2\text{ Area}_\mathrm{ABC} >0 \nonumber \\&A, B \,\,{\text{ and} }\ C {\text{ are} \text{ collinear}}\quad \;\;\! \;{\text{ if}} \quad 2\text{ Area}_\mathrm{ABC} =0 \end{aligned}$$
(10)

and “between-ness” of \(C\) between \(A\) and \(B\) occurs if \(A, B\) and \(C\) are collinear according to (10) and the coordinates of \(C\) fall between those of \(A\) and \(B\). In the example of triangle \(ABC\) of Fig. 7a, where the counterclockwise order of the nodes indicates that point \(C\) is to the left of line \(AB\), Eq. (9) returns a positive value while for triangle \(ABD\) it returns a negative value.

Notice that no straight line equations, slopes or number divisions are used in the detection of the intersecting linear segments. These are only used for computing the horizontal locations of all crossovers after all intersecting lines have been detected.

4.3 Internal and external crossover results

Only 526 surveys turned out to actually have ICOs. The number of ICOs per survey ranged from 1 in a typically small survey of \({<}1{,}000\) observations to 144,212 in the largest survey of 65,580 observations. The total number of ICOs is \({\sim }482{,}000\) (Fig. 8). It can be safely stated based on Fig. 8 that the internal consistency of these 526 individual surveys is better than 2 mGal in more than 95 % of the cases.

Fig. 8
figure 8

All ICOEs in CONUS. \({\sim }1{,}000\) ICOEs larger in absolute value than 40 mGal are excluded, which mostly belong to an old airborne survey in northeastern North Carolina, mistakenly classified as a surface survey in the NGS database. The corresponding height-ICOEs have a mean of 0.011 m, RMS of 20.619 m, and a minimum and maximum of \(-1{,}125.293\) and \(1{,}158.365\) m, respectively

Figure 9 presents the \({\sim }263{,}000\) detected ECOEs. As expected, the ECOEs are larger in magnitude than the ICOEs since they absorb any relative biases between overlapping surveys.

Fig. 9
figure 9

All ECOEs in CONUS. The mean of the corresponding height-ECOEs is \(-1.867\) m, their RMS is 45.220 m, their minimum is \(-1{,}376.050\) and their maximum is 1,619.046 m

Approximately, 6,000 pairs of overlapping surveys were found. This large and evenly distributed number of overlaps reflects a considerable redundancy in the NGS gravity database, which could be exploited for data editing and optimization. Only 123 surveys do not overlap with any others. These surveys do not have any ECOs and therefore could not be evaluated in the crossover adjustment. The rest of the surveys group themselves into 95 non-overlapping groups, where each group contains mutually overlapping surveys.

4.4 The effect of EGM2008 and RTM errors on crossovers of free air anomaly residuals

Since crossover analysis is applied to the residuals computed in Eq. (1), COEs may also contain any existing EGM2008 and RTM errors. COEs are differences between overlapping track sections at the same location. Therefore, they tend not to contain any long- or medium-wavelength errors. Such errors, whether those mentioned in Sect. 3 or those due to EGM2008 or RTM, tend to cancel out in the differencing. However, high-frequency, random-like, EGM2008 and RTM errors accumulate with the differencing and could affect the residuals.

To address the question of whether possible high-frequency EGM2008 and RTM errors significantly contaminate the statistics of Figs. 8 and 9, all ICOEs were re-computed using the simple Bouguer anomalies. In other words, the residuals of Eq. (1) are replaced by the simple Bouguer anomalies for the sake of this section only, and their ICOEs are computed and presented in Fig. 10.

Fig. 10
figure 10

All ICOEs computed using the simple Bouguer anomalies

Notice the almost identical values and statistics of the ICOEs of Figs. 8 and 10. The free air gravity anomaly residuals, which contain the total effects of any possible high-frequency EGM2008 and RTM errors, and the simple Bouguer anomalies which are independent of EGM2008 and SRTM, have almost identical ICOEs. Thus, it can be concluded that high-frequency EGM2008 and RTM errors are either negligible or do not affect our COEs.

4.5 Crossover adjustment

The simple crossover adjustment described in Sect. 4.1 has a datum defect of one since no reference survey was introduced. It turns out, however, that the adjustment’s normal matrix has 218 zero eigenvalues. This means that there are 218 groups of surveys, each of which contains mutually intersecting surveys, but no group intersects any other. Of these groups, 123 consist of a single survey (see Sect. 4.3) and the remaining 95 include several overlapping surveys per group.

Ideally, additional external information about the biases should be obtained from comparisons to satellite-derived gravity. However, this information is difficult to derive due to data sparseness, the small magnitude of most relative biases and the small horizontal extent of many surveys. Instead, the datum (and configuration) defects are removed by the following two-step method. First, a weight is added to each of the 1,489 diagonal elements of the normal matrix, which is inversely proportional to the square of the SD of the ICOEs of each survey. If a survey has no ICOEs (all but 526 surveys), its added weight is arbitrarily chosen as 1/100, corresponding to a fictitious ICOE SD of 10 mGal. Thus the smaller the spread of the ICOEs of an individual survey, the higher its weight is and the more influence it gets as a datum definer in its group. Unfortunately, this constraint alone could not remove all singularities.

Next, singular value decomposition is used to solve the normal system, where the inverses of zero singular values are set to zero (Press et al. 1990). In more detail, the normal matrix, \(N\), is decomposed into:

$$\begin{aligned} N=V\cdot W\cdot V^\mathrm{T} \end{aligned}$$
(11)

where \(W\) is a diagonal matrix with diagonal elements, \(w_{ii}\), equal to the singular values (and in this case also eigenvalues) of \(N\), and \(V\) is an orthogonal matrix of the eigenvectors. The inverse of the normal matrix is then:

$$\begin{aligned} N^{-1}=V\cdot W^{-1}\cdot V^\mathrm{T}. \end{aligned}$$
(12)

Since 218 of the 1,489 diagonal elements of \(W\) are zeros, \(W^{-1}\) does not exist. An alternative optimal and logical solution is obtained by replacing \(W^{-1}\) by its pseudo-inverse, \(W^{+}\), computed as follows:

$$\begin{aligned} W^{+} = \text{ diag}\{w_{ii}^+ \} = \text{ diag} \left\{ {\begin{array}{ll} w_{ii}^{-1}&\quad \text{ if} \quad w_{ii} \ne 0.0 \\ 0.0&\quad \text{ if} \quad w_{ii} =0.0 \\ \end{array}}\right.. \end{aligned}$$
(13)

The final solution is then given by:

$$\begin{aligned} N^{+}=V\cdot W^{+}\cdot V^\mathrm{T}; \quad {\hat{{\beta }}}=N^{+}\cdot U \end{aligned}$$
(14)

where \(U\) is the right-hand side of the normal equations. The least-squares residuals are computed by (see Eq. 8):

$$\begin{aligned} \hat{{e}}_{ij_k } = \text{ ECOE}_{ij_k } -(\hat{{\beta }}_j -\hat{{\beta }}_i) \end{aligned}$$
(15)

and grouped in the residual vector \(\hat{{e}}\). The covariance matrix of the estimated biases is given by:

$$\begin{aligned} \Sigma _{\hat{{\beta }}} = \hat{{\sigma }}_0^2 \cdot N^{+}; \quad \hat{{\sigma }}_0^2 = \frac{\hat{{e}}^\mathrm{T}\cdot P_\mathrm{ECOE} \cdot \hat{{e}}}{n-u} \end{aligned}$$
(16)

where \(\hat{{\sigma }}_0^2\) is the variance of unit weight, \(n\) is the number of used ECOEs, \(u=1{,}489-218=1{,}271\) is the number of unknown biases and \(P_\mathrm{ECOE} \) is the (diagonal) weight matrix of all observations.

This generalized inverse solution automatically fixes (i.e., constrains to a certain value) one survey per group. The biases of all other members of the group are automatically determined relative to the fixed survey. Since this is also a minimum-norm solution, i.e.:

$$\begin{aligned} \beta ^\mathrm{T}\beta = \min , \end{aligned}$$
(17)

it is guaranteed that the fixed biases are automatically chosen such as to keep the adjusted surveys as close to each other as possible. This method removes the datum (and configuration) defects without a need to know which surveys make up each group and which surveys are actually fixed. In this specific situation, the fixed biases are automatically chosen as zero or very close to zero (smaller in absolute value than a micro-Gal), since the mean of all surveys is very small (see Fig. 1).

4.6 Crossover-derived local gravity biases and high-frequency errors

The crossover adjustment results in 1,489 biases and their accuracy estimates. Of these, 244 are deemed significant since their magnitudes are \({>}2\) mGal and their normalized values (the ratio of the bias’ magnitude to its estimated SD) are \({>}2\). These significant biases are presented in Fig. 11, where (for display purposes) each observation of a significantly biased survey is replaced by the bias of that survey. The effect of these biases on the geoid (Fig. 12) reaches 20 cm even though all harmonic constituents from degree 2 to 120 were truncated out of Stokes’ kernel, assuming that this spectral content of the surface data are usually replaced by satellite-derived models. The geoid biases of Fig. 12 are computed using the integral term of Eq. (3), except that the residual is replaced by the gravity bias over the significantly biased surveys, zeros are padded everywhere else, and some tapering is done around the edges.

Fig. 11
figure 11

Biases of all 244 significantly biased surveys

Fig. 12
figure 12

The effect of significant gravity biases on the geoid

The purely high-frequency gravity errors are then computed by removing all significant relative biases from the ECOEs and combining the result with all ICOEs (Fig. 13). These unbiased crossover-derived high-frequency errors are then gridded (i.e., propagated onto the nodes of a grid) as follows. First, a fictitious random error of 1.5 mGal is added at the location of \(1^{\prime }\times 1^{\prime }\) altimetry-derived gravity grid all the way up to the coastlines and 3 mGal at the locations of all the Canadian gravity data holdings. The resulting random errors are propagated on a \(1^{\prime }\times 1^{\prime }\) grid using (a modified version of) program geogrid (Tscherning et al. 1991). The propagated value at a node of the grid is taken as the weighted geometric mean of the 20 nearest neighbor COEs, 5 in each quadrant around that node (Fig. 14), where the weight of a COE is inversely proportional to the square of its distance from the node.

Fig. 13
figure 13

All ICOEs and unbiased-ECOEs in CONUS

Fig. 14
figure 14

\(1^{\prime }\times 1^{\prime }\) grid of crossovers-derived high-frequency gravity anomaly errors

4.7 Gravimeter drifts and multiple biases per survey

It has been argued by one of the reviewers and the associate editor of this paper that the assumption of the presence of a single bias per survey is not justified. They argue that surveys may have trends due to gravimeter drifts in addition to any biases. A land gravity survey may involve several reference stations, they argue, thus multiple biases could be present within a single survey. Our response is as follows:

  1. 1.

    Gravimeter drifts are commonly estimated and removed before data dissemination and publication. Therefore, attempts to estimate trends from old heterogeneous gravity surveys usually failed to detect anything meaningful (e.g., Wessel and Watts 1988; Denker and Roland 2005).

  2. 2.

    Ship gravity surveys are compared to altimetry data as a common practice, and obvious biases of the different tracks of the same ship survey relative to altimetry-derived gravity anomalies are eliminated.

  3. 3.

    If the reviewers’ argument that a single bias per survey was really baseless, the crossover adjustment would have resulted in estimated biases that are noisy and hence statistically insignificant. This however is not the case for at least 244 out of 1,489 surveys. At least 40 of these, each represented by a large number (at least several dozens) of ECOEs, have estimated biases larger by an order of magnitude than their estimated SD.

  4. 4.

    For the remaining 1,245 surveys, perhaps multiple biases within a survey are possible. However, the effort of attempting to find out the exact number of biases to be modeled is formidable. The data in question were collected before the early 1970s. The details of this data are described in hundreds and thousands of reports, thesis and dissertations that are half-a-century old and that have to be found and read. Furthermore, these data have been edited and re-edited for over 50 years by many operators, at the NGS, DMA and other institutions. Consequently, much of the original contents of the above-mentioned reports are no longer applicable.

  5. 5.

    Even if 1,245 surveys with noisy biases do contain multiple biases per survey, it would not change our major findings and conclusions, but would only indicate that the data are worse than reported in this paper.

5 KNN-derived high-frequency gravity errors

The KNN method is well known to users of program geogrid (Tscherning et al. 1991). To estimate high-frequency gravity errors, first all significant biases, derived in the previous section, are removed from all residual gravity anomalies defined by Eq. (1). The \(K\) closest surrounding points of each observation, excluding the observation itself, are used to predict a value at the location of that observation. The discrepancy of the observation from the predicted value is considered as the KNN-derived high-frequency error of that residual surface gravity anomaly. As discussed in a previous section, all long- to medium-wavelength effects tend to cancel out with the differencing.

Again, a modified version of program geogrid is used to do the prediction, where \(K=39\): ten observations in each quadrant around the predicted location but excluding the observation at that location. The collocation prediction option of geogrid is used, with a correlation length of 25 km and a noise SD of 1.5 mGal. These parameters were selected in order to predict with minimal smoothing, given that the signal SD is 3.4 mGal as shown in Fig. 1. As usual, the derived high-frequency errors include any possible prediction errors.

The resulting errors (Fig. 15) are combined with 1.5 mGal random errors at the locations of \(1^{\prime }\times 1^{\prime }\) altimetry-derived data and 3 mGal errors at the locations of available Canadian gravity data and gridded (i.e., propagated onto the nodes of a grid) by a weighted geometric mean as described in Sect. 4.6 (Fig. 16). This grid is essentially similar to that obtained by crossover analysis (Fig. 14). However, it is clear from Figs. 14 and 16 that crossover analysis tends to highlight larger errors, while the KNN-derived errors tend to be more homogeneous due to the smoothing inherent in this method.

Fig. 15
figure 15

KNN-derived surface gravity anomaly high-frequency errors

Fig. 16
figure 16

\(1^{\prime }\times 1^{\prime }\) grid of the KNN-derived surface gravity anomaly high-frequency errors

6 High-frequency geoid errors

The gravity errors derived by crossover analysis (Fig. 14) and KNN predictions (Fig. 16) are mainly high-frequency errors. Long-wavelength errors, whether those discussed in Sect. 3 or those due to EGM2008 or SRTM, tend to cancel out as a result of the differencing involved in both methods, and all significant local gravity biases are directly removed.

These high-frequency gravity anomaly errors are squared and propagated in Stokes’ integral, leading to the corresponding high-frequency geoid variances. This is done using the same 1-D FFT (Haagmans et al. 1993) convolution program used for geoid computations (see also Wang et al. 2012), except that the gravity residual grid is replaced by the gravity variance grid and the traditional Stokes’ kernel and all other multiplying factors, including those of the inner zone, are replaced by their squares (Heiskanen and Moritz 1967, equation 7–74).

Figures 17 and 18 present the 2-sigma (SD) values of the resulting high-frequency geoid errors, which amount to a few centimeters in and to the west of the Rocky Mountains and on the Appalachians and only a few millimeters or less everywhere else. Notice the clear geographic correlation between the locations of mountains and hills and between the error maps of Figs. 17 and 18. This is due to the large uncertainty in heights in mountainous areas.

Fig. 17
figure 17

Crossover-derived high-frequency geoid errors (2-sigma SD)

Fig. 18
figure 18

KNN-derived high-frequency geoid errors (2-sigma SD)

7 Discussion and conclusions

Are the decade-old and heterogeneous NGS surface gravity data sufficient for supporting a 1 cm-accurate geoid? In an attempt to answer this question, the errors of this data in the conterminous USA are evaluated and their geoid implications are quantified. The answer to this question is extracted only from the surface gravity data (and GRACE-derived models), avoiding the use of old control data of questionable accuracy such as GPS-occupied bench marks and tide gauges or astro-geodetic deflections of the vertical.

Of the 1,489 surface gravity surveys, 244 are found to be significantly biased, with biases \({>}2\) mGal and normalized biases \({>}2\). The geoid errors caused by these biases can reach 20 cm. In addition, it is estimated that the data contains high-frequency gravity anomaly errors with an RMS of \({\sim }2.2\) mGal. When these errors are propagated in Stokes’ integral, assuming they represent unbiased random gravity anomaly errors, they lead to high-frequency geoid errors that reach several centimeters on and to the west of the Rocky Mountains and on the Appalachians and a few millimeters or less everywhere else. Finally, the long-wavelength surface gravity errors are found to be of the order of a sub-mGal, but their large horizontal extent magnifies their effect on the geoid to several decimeters.

The derived high-frequency geoid errors are too small to cause any concerns except in high mountains where it could reach several centimeters. The long-wavelength geoid errors can be greatly reduced by simple Stokes’ kernel truncation methods in conjunction with a satellite-derived gravity model. Thus, the only serious limitation of the surface gravity data is due to the local gravity biases, which can cause geoid errors as large as 20 cm. Most of these biases cannot be removed by satellite-derived models. Instead, they can be detected and corrected by comparing overlapping surveys as done in a very simple minded way in Sect. 4. However, such methods are somewhat limited. For example, 123 surveys do not overlap with any others. Therefore, crossover analysis cannot help in computing their biases unless additional data are collected to connect these surveys to their neighbors. Another example is the 95 groups of surveys that do not overlap with one another. We practically fix (i.e., constrain) one survey per group in order to estimate the biases of other members of the group relative to the fixed survey. However, the biases would be more correctly determined by collecting new and improved bias-free data that overlap with the different survey groups.

These limitations of existing surface gravity data have become clear at NGS in recent years and a nationwide airborne gravity project was launched for collecting new and improved airborne gravity (Smith 2007). The new data will be used, among other things, to remedy surface gravity problems such as biases and replace old surface surveys that cannot be salvaged.