1 Introduction

1.1 The SG at Apache Point

The original work in this paper was done on the Apache Point (AP, New Mexico, USA) SG–AG data from 2011 to 2015, and later applied to data from the J9 installation in Strasbourg, France. It is necessary to briefly discuss the site situation at AP, a first-class astronomical observatory that hosts one of the best lunar laser ranging (LLR) facilities in the world. In 2009 an SG was installed at the site to help constrain the displacement of the ground during the LLR experiments, which use a 3.5 m optical dish attached to a solid pier. Owing to logistical and financial considerations it was not possible to place the SG in its own isolated building, as is common at other geodetic sites; as a compromise the SG was located in the cone room, a small access room directly under the telescope housing. There are both advantages and disadvantages to this location, but the subsequent difficulty of providing a suitable environment for the AG instrument that calibrates the SG was not considered.

It was quickly discovered during the first calibration experiment in 2011 that the AG was subject to excessive disturbance during nighttime operations, when the telescope was in constant use, and these disturbances severely compromised the quality of the AG data. There are two effects associated with the telescope motion. The first is small self-correcting offsets in the SG data (at the level of 0.5 μGal or less) due to mass changes associated with the telescope position above the gravimeter. These can be removed by constructing a model using additional data from the telescope slews, but this is a time-consuming operation that has not been done systematically for all the AP data, nor for the SG data used in the calibrations. The second affects the AG: the cooling system beneath the telescope blows air directly into the cone room and onto the AG instrument itself, which causes data disturbances that are not damped by the FG5 superspring. Unfortunately, there is no possibility of avoiding this problem by moving the SG/AG to another location in the observatory complex, and the only remedy with the AG is to reject all the disturbed data.

Thus we are obliged to use the AG data as recorded, and cannot easily improve the situation. Coupled with this is the limitation on residence time for the AG. Site requirements permit the AG instrument to remain in the cone room for 5 days or less, except for the first experiment in 2011 when it was allowed to run over a weekend (and gave by far the best data), and this severely limits the amount of good data we can collect. Our site is, therefore, one of the noisiest and most challenging for an AG–SG calibration experiment. Ground accelerations induced by the movements of a nearby VLBI antenna have also been detected in the SG recordings at Ishigakijima, Japan (Imanishi et al. 2018), so calibration experiments at such a site might encounter similar problems.

1.2 Motivation

One of the motivations for this paper is to share our experience with the calibration experiments at AP, which were initially done without the collaboration of the Strasbourg group that later became available, and thus represent the situation that might face a less experienced team of SG–AG operators. For example, an initial assumption at AP was that the SG series and the AG series must be compared without any AG corrections (i.e., for tides, ocean-tide loading, local pressure or polar motion), and so all such corrections were turned off in the FG5 setup. Later it became clear that, as is done routinely at many observatories, the experiment can be done with the standard AG corrections, and the FG5 settings can then be changed to remove the corrections for the calibration and produce uncorrected files. At SG installations where there is no in-house or dedicated AG, one may need external assistance for the FG5 instrument, which in the case of AP is the National Geospatial-Intelligence Agency (NGA). Frequently, such FG5s are in heavy demand, which limits their availability. Further, as mentioned, site constraints at AP not only dictate a very small space for the SG and AG, in a room in the middle of a very complex building, but visits by the FG5 are disruptive to local operations, so only one or two AG measurements per year are preferred. This would be similar to an SG located remotely (e.g., Syowa, Antarctica), or in a special underground environment for hydrological purposes. Clearly, whenever we did an SG calibration we also needed to produce an AG site measurement, which is the normal 1–2 day occupation with all corrections turned on, unlike a calibration experiment, which normally takes at least 5 days.

Thus, a major goal of the paper is to process the SG–AG calibration data using only the FG5 text files, without access to the software supplied by Micro-g (denoted ‘g-software’; http://www.microglacoste.com/pdf/g9Help.pdf). The g-software gives complete user control over the processing of the AG fringe data, and produces a set of internal binary files and a set of 3 ASCII files—the drop data file, the set data file, and the project file—that summarize the results of the processing. For the AP station the binary files were available from NGA, as was some limited re-processing, but these were not useful to us at AP; the situation in Strasbourg is of course entirely different. This limitation was anticipated in Sect. 2 of the paper by Palinkas et al. (2012), who noted that some users of the gravity data have no access to the g-software (see the “Appendix” for further information). We acknowledge that the g-software is available independently of the instrument from Micro-g, but perhaps the suggestions in our paper may help some users to avoid that necessity.

Accurate calibration of a superconducting gravimeter (SG) is fundamental in many geophysical and physical applications, for instance in the search for time variability in the Earth’s response to tides induced by internal processes inside the Earth or by surface loading (Calvo et al. 2014), or the search for anisotropy in the Newtonian gravitational constant G (Warburton and Goodkind 1976). Many papers cite ocean tide loading as a prominent requirement for accurate SG scale factors (along with accurate phase calibration), e.g., Boy et al. (2003) and Baker and Bos (2003). Although much of the initial work at AP could have been avoided using the g-software, we hope some of the results are still of interest to those who contract out AG measurements, or perform only occasional calibration or drift checks on their SG.

There are numerous papers on the use of an AG to calibrate an SG, summarized in Hinderer et al. (2015). Here we investigate some small issues that arise in this type of comparison. In order of presentation, these are: (a) a discussion of the merits of various ways to use drop or set data, (b) the effect of adding the data acquisition time delay and a local trend to the solution, (c) combining multiple determinations of the scale factor for a particular station, and (d) comparing the AG offset from a calibration experiment to regular determinations of the AG site gravity. We use data from the Apache Point (AP) station in New Mexico, USA, and from the J9 station in Strasbourg (ST), France, to demonstrate the various points. As mentioned, AP is a site with especially high nighttime noise, perhaps at the extreme end of stations that have high cultural noise during some part of the day. This has been encountered at some older SG installations, e.g., Wuhan, China, or Vienna (Meurers 2012), and a recent one at Ishigaki (Imanishi et al. 2018), whereas ST is typical of a station with quiet and fairly constant site noise. Van Camp et al. (2016) make the useful suggestion that higher drop rates (e.g., every 5 s) should be used, and this would be beneficial in future AP measurements during the undisturbed daytime recording.

1.3 Basic Equations

To begin, we assume a simultaneous measurement of AG and SG gravity over a time period T ~ 5 days, to be assured of reaching a reasonable convergence in the scale factor (e.g., Francis 1997; Meurers 2012). All the SG data come from either the raw 1 s data, or the filtered 1 min files available at GGP/IGETS (Crossley and Hinderer 2010; Voigt et al. 2016). Very little of our SG data at AP required corrections for SG-specific problems such as He refills, disturbances, or offsets, but simple pre-processing was done where necessary. Likewise there were no problematic data (such as a large earthquake) that would have affected both instruments (in different ways) and would therefore have to be avoided. The common time period T was chosen to span the largest diurnal tides at the station, which recur fortnightly. Pre-processing of some SG data from ST was done to avoid data disturbances, as described in Rosat et al. (2009).

The FG5 data at both stations, denoted by y(t), were collected drop-by-drop every 10 s, and accumulated every 20 min as a set mean of 100 drops. In our original processing, the SG data x(t) were smoothed to 1 min from the 1 s raw data by applying a low-pass filter, which avoids aliasing of the 5–10 s microseismic noise (Van Camp et al. 2016). This noise is still present even at station AP in the middle of the North American continent, though it is weaker than at ST in Central Europe. The SG data are normally cubic-splined to the AG drop or set times, which are given at a sampling time t (Rosat et al. 2009). Later we also used the 1 s data for comparison.

The AG data are composed of a constant mean value y0 (over the time period T of the experiment) plus a time-varying part y1(t); similarly the SG data are composed of a constant part x0 plus a time-varying part x1(t). We perform a least-squares (LSQ) fit of y(t) (μGal, 10−8 m/s2) to the SG data x(t) (volt) using

$$ y(t) = \alpha + \beta\,x(t) + \gamma\,t + \varepsilon $$
(1)

where ε is assumed to be Gaussian random noise, and the sum of ε² is minimized. The parameters determined from the fit are α, the offset between the mean zero levels of the AG and SG data; β, the scale factor SF (or calibration constant) of the SG (μGal or nm s−2 per volt); and γ, a trend to account for possible differential instrument drifts (see e.g., Imanishi et al. 2002). If no mean values are subtracted, then

$$ x(t) = x_{0} + x_{1}(t) $$
$$ y(t) = y_{0} + y_{1}(t) $$
(2)

and after the LSQ fit (we use lfit from Numerical Recipes, but any similar code will do) for (α, β, γ) we can equate, within the errors, the constant and time-variable parts

$$ y_{0} = \alpha + \beta\,x_{0} $$
$$ y_{1}(t) = \beta\,x_{1}(t) + \gamma\,(t - t_{0}) $$
(3)

where t0 has been added to indicate the time of the first AG drop. We refer to the quantity y0 as the AG mean value, which depends on the fitted offset α, the SG mean x0 (which can be computed separately prior to the fit), and β—the SG scale factor. With Gaussian errors, if the standard deviations for (α, β) are σα and σβ, then the variance of y0 is

$$ \sigma_{0}^{2} = \sigma_{\alpha}^{2} + x_{0}^{2}\,\sigma_{\beta}^{2} + 2x_{0}\,\sigma_{\alpha\beta} $$
(4)

where σαβ is the covariance of (α, β).
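For readers implementing the fit themselves, a minimal Python sketch of Eqs. (1) and (4) follows; the function names and the inverse-variance weighting are illustrative choices on our part (our own processing used lfit), not a prescription.

```python
import numpy as np

def fit_ag_sg(t, x, y, sigma_y):
    """Weighted LSQ fit of Eq. (1): y(t) = alpha + beta*x(t) + gamma*t.

    t: AG drop or set times, x: SG data (volt), y: AG data (microGal),
    sigma_y: AG uncertainties (microGal). Returns the parameter vector
    p = (alpha, beta, gamma) and its covariance matrix.
    """
    A = np.column_stack([np.ones_like(t), x, t])   # design matrix [1, x, t]
    w = 1.0 / sigma_y**2                           # inverse-variance weights
    sw = np.sqrt(w)
    p, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    cov = np.linalg.inv(A.T @ (A * w[:, None]))    # (A^T W A)^{-1}
    return p, cov

def var_y0(cov, x0):
    """Variance of the AG mean value y0 = alpha + beta*x0, Eq. (4)."""
    return cov[0, 0] + x0**2 * cov[1, 1] + 2 * x0 * cov[0, 1]
```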

A few points need to be mentioned about these equations. First, many authors do not consider the offset α, nor the mean values x0 or y0, to be of sufficient interest to mention, and others ignore the trend γ, thus leaving β as the only parameter of interest. This is understandable if one chooses to get a regular AG site value by reprocessing the AG data using the g-software. Later we show another method to do this based on y0. As for the offset α, Imanishi et al. (2002) discussed in some detail its variations for a month-long series of AG–SG measurements, and ascribed the cause to possible AG instrumental drift during the experiments. The possibility of such an effect is one reason we include the term γ × t in (1). Unfortunately we could not repeat the experiment of Imanishi et al. because we could record at AP over only a few days, but any linear trend in α will appear in the γ × t term in (1). Short-term effects over the time T of the calibration are distinct from the classic long-term SG drift, but it is reasonable that the latter be removed first from the SG data, though it is unnecessary here.

Wziontek et al. (2006) explicitly add a drift function to the SG component, and an offset to the AG data, but no AG trend. Although arbitrarily adding a drift parameter to (1) without being able to identify the reason might seem unjustified, Meurers (2012) clearly showed that a linear trend can perturb the amplitude ratio between the AG and SG data (which is the goal of the calibration), so there is good reason to include it. Many other papers also advocate a drift parameter; for instance, Hinderer et al. (1991) included drift when using an earlier JILA-5 instrument with twin laser drift problems, and a drift is explicitly included by Tamura et al. (2005) and Van Camp et al. (2016). Meurers (2002) explored the effect of unmodeled drift on the calibration factor using synthetic and real datasets. We also received a suggestion that He gas from the SG might leak into the room and affect the AG; this could have a preferential effect on one instrument and not the other (B. Meurers, editorial comment). This phenomenon has also been reported by Mäkinen et al. (2015), but it is not usually a problem for closed-cycle SGs such as the current observatory SG or iGrav. Note that at the J9 station in Strasbourg, unlike at AP, the AG records in a separate room from that of the SG.

2 Drops or Sets?

Early calibration experiments in Strasbourg (Hinderer et al. 1991) lasted only 1 day but used both drop and set data, and also considered both the L1 (least absolute deviations) and L2 (LSQ) norms when solving for the constants in Eq. (1). It appears the L1 norm has not been widely used in recent years. Amalvict et al. (2001, 2002), however, showed that with good data there was little difference in the scale factor between drop and set methods, and noted that the errors (which they stated to be standard deviations) in both methods were similar despite the very large difference in the number of SG–AG pairs to be fitted (generally there are about 100 drops for each set), which should result in a smaller formal error using drop data. Although there has been a recent trend in SG–AG calibration processing towards the use of drop data rather than set data, obviously the less numerous AG set values are still less scattered than the drop values.

Several recent papers have covered ground similar to this study, and our results are consistent with them. Tamura et al. (2005), for example, used AG drops, and found no evidence for scale factor changes at Esashi (Japan). Wziontek et al. (2006) identified AG offsets at station Bad Homburg (Germany) from calibration experiments using different FG5 instruments, and also treated mainly drops. Meurers (2012) used drops in a comprehensive assessment of many of the factors in SG–AG processing, and Van Camp et al. (2016) also favored drops, emphasizing the need not just for many drops, by increasing the drop rate, but also for measuring at high tides to improve accuracy.

We need to be clear about the difference between the two types of measurement. An AG drop results in a trajectory of a falling corner cube, whose flight is sampled at the fringe zero-crossing times, of which there are many thousands per drop (see e.g., Kren et al. 2016). The variance–covariance matrix of the LSQ fit to the fringe crossings yields a statistically determined drop value and a scatter, or standard deviation, σd. When drops are processed in sets (often 100 drops, every 10 s) the set mean is the unweighted mean of the accepted drops averaged over a set. One can take the set sigma σs as the usual standard deviation of the drops about the set mean [Eq. (A2) in the “Appendix”], which reflects the drop-to-drop scatter (column labeled ‘Sigma’ in the set text files). This is our choice for most of this paper, except for the final two Figures. Alternatively one may choose the standard error of the set mean (SEM), which is σs/√N where N is the number of drops per set (unweighted), as in Tables 3 and 4. N is frequently close to 100, so the set SEM is about 10× smaller than σs, and is given by the column ‘Error’ in the set file. Drops are accepted or rejected by the g-software based on the usual 3-σ criterion, i.e., a drop outlier is rejected when more than 3-σ from the set mean. When using single drops, more flexibility is available to select drops in the solution, as we will see. Rather than using ‘set sigmas’ and ‘drop sigmas’, to avoid any ambiguity we frequently refer to the columns (Sigma, Error) in the drop files and (Sigma, Error) in the set files.
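As a concrete illustration of these definitions, the following sketch recomputes the set statistics from the drop values—the unweighted set mean, the Sigma column (drop-to-drop scatter), the Error column (SEM), and the iterated 3-σ rejection; the exact conventions inside the g-software are our assumption here.

```python
import numpy as np

def set_statistics(drops, n_sigma=3.0, max_iter=5):
    """Set mean of the accepted drops with iterated n-sigma rejection.

    Returns (mean, sigma, sem, keep): 'sigma' corresponds to the Sigma
    column of the set files and 'sem' = sigma/sqrt(N) to the Error column.
    """
    keep = np.ones(drops.size, dtype=bool)
    for _ in range(max_iter):
        m = drops[keep].mean()
        s = drops[keep].std(ddof=1)
        new = np.abs(drops - m) <= n_sigma * s
        if np.array_equal(new, keep):   # iterate until rejections stabilize
            break
        keep = new
    m, s = drops[keep].mean(), drops[keep].std(ddof=1)
    return m, s, s / np.sqrt(keep.sum()), keep
```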

2.1 Tests on Set Data

For reasons that will become clear later, we also wish to find y0 from the fit, and this requires an initial assessment of the SG mean value x0; certainly x0 can be ignored if the only goal is to find β. Various possibilities for the span of SG data were tried: (i) SG values starting at UT 0 on the first day and ending at midnight on the last day, (ii) SG values only at AG times, and (iii) SG values starting at the first AG drop time and ending at the last AG drop. The differences in the scale factor were (as expected) insignificant, but there was an effect at the μGal level on the AG mean value y0 in Eq. (4). Obviously option (iii) is the logical choice of the time span for the SG data.

A test was also made to quantify the effect of the SG instrument time delay on the experiment, as mentioned previously. For SG 046 at Apache Point, an observatory-style gravimeter, the nominal time delay (lag) of the system is predominantly that of the GGP1 filter (Hinderer et al. 2015), which is 8.16 s, so this delay has to be incorporated in any calculation that returns the SG value at the AG times. We tested the shift in the scale factor for various time delays in Table 1, showing that for most stations the effect is negligible even up to 30 s. This confirms similar results of Meurers (2012) and Van Camp et al. (2016).

Table 1 Effect of SG time delay on scale factor (μGal/V)
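Although Table 1 shows the effect is small, the lag is trivial to include when interpolating the SG data to the AG times; a sketch follows, where the sign convention (SG timestamps shifted back by the lag) is our assumption.

```python
from scipy.interpolate import CubicSpline

GGP1_LAG = 8.16  # s, nominal lag of the GGP1 filter for SG 046

def sg_at_ag_times(t_sg, x_sg, t_ag, lag=GGP1_LAG):
    """Cubic-spline the SG series (1 s or 1 min) to the AG drop or set
    times (all times in seconds); a sample stamped t is taken to
    reflect gravity at t - lag."""
    return CubicSpline(t_sg - lag, x_sg)(t_ag)
```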

Again using the AP 2011 data, another test was done to assess the effect of a relative drift between the SG and AG data, i.e., adding the term γ × t on the RHS of Eq. (1). This is not primarily to account for the known SG drift, but allows for other effects occurring preferentially in one of the instruments. The instrument drift of SG046 between 2009 and 2012 was 70 μGal/year, or + 0.192 μGal/day—unusually large for an SG. For this reason, the sensor was replaced in 2013 by the manufacturer GWR (Goodkind–Warburton–Reineman, San Diego, California) with a significant decrease in drift. The results are shown in Table 2, where we give the scale factor, the trend, and the AG mean value, all with a time lag of 8.16 s. We have included the errors σβ to show that although the trend can be larger than, and of opposite sign to, the known SG drift, its effect on β is always smaller than σβ. The same holds for the AG mean value y0. Note, however, that σβ is not determined very accurately by these set solutions at AP. The trend can also be regarded as a diagnostic for possible problems in the data; e.g., the trend for AP2016 is − 0.55 μGal/day, which is sufficient to perturb the offset and SF. In this case, we know the AG data quality for 2016 was rather low.

Table 2 Effect on the scale factor and base value of adding a trend to the SG–AG fit

The argument for using AG set values for SG–AG calibration probably arises because it is the natural choice when determining an AG site value, where the geophysical corrections are applied to get the site gravity. Among past papers, Rosat et al. (2009) used AG set values when doing SG–AG experiments, and there is some benefit in having a set average with a well-defined set sigma (σs) used in weighting the fit (as we will see). On the other hand, there are two arguments against using sets. The first is the inability of the set average to precisely track the top and bottom of the semidiurnal tides if the regular set averages are used. To combat this, Meurers (2012) suggested using a moving-window average of both AG and SG data to help reduce the AG scatter and yet track the tidal signal more precisely: at each drop time an average of the AG and SG data is taken over the length of a set spanning the drop point, thus keeping the high number of drop values but reducing the drop-to-drop scatter (a sketch is given below). The second point is that, in principle, set averages with rejection of drop outliers should work better on AG data from which the geophysical corrections (principally tides and atmospheric pressure) have been subtracted, so that the corrected signal has only a small scatter. The situation is different when doing SG–AG calibrations that require the full tidal signal; the AG set averages are biased by the changing level of the large time-varying signal. The importance of this procedure is one of the options we test in our processing. Recent authors have tended to recommend the use of AG drops to generate the scale factor, ignoring the trivial increase in computer time over the set method.
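The sketch below shows one way to implement the moving-window average of Meurers (2012) as we understand it, assuming a 20 min (1200 s) window; a naive O(N²) loop is used for clarity.

```python
import numpy as np

def moving_set_average(t, ag, sg, window=1200.0):
    """At each drop time, average the AG and SG data over a set-length
    window centred on that drop: the number of drop values is kept,
    but the drop-to-drop scatter is reduced."""
    ag_out, sg_out = np.empty_like(ag), np.empty_like(sg)
    for i, ti in enumerate(t):
        in_win = np.abs(t - ti) <= window / 2.0
        ag_out[i] = ag[in_win].mean()
        sg_out[i] = sg[in_win].mean()
    return ag_out, sg_out
```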

2.2 Initial Attempts at Set and Drop Processing

We start with Fig. 1 showing the fit of SG-to-AG data between July 28 and Aug 3, 2011, which was the span of the first AG measurement at Apache Point Observatory after installation of SG046 in February 2009.

Fig. 1 Fit of all Apache Point set data between July 28 (MJD 55771) and Aug 3, 2011. The noisiest data, indicated by large set standard deviations (the Sigma column in the set files), vary according to the telescope viewing schedule

The fit is based on AG set values (100 drops/set, set interval 20 min, drop interval 10 s), where it is seen that the σs’s are considerably larger during the nighttime hours when the LLR telescope is active for various sky surveys. The set mean can nonetheless be acceptable if poor drops are rejected and noisy sets are de-emphasized in the SF fit by their small inverse-variance weights (i.e., 1/σ²). As can be seen in Fig. 2, zoomed to the first night of the calibration, not only are the set errors large, but the set means deviate from the SG values.

Fig. 2 Zoom of Fig. 1 for the night of July 28–29 (about the first half day of the experiment), showing that the set means are also disturbed by the telescope activity. Labelled are the fitted SG and AG measurements x1(t) and y1(t), Eq. (3). Almost all such sets are discarded in the processing

There were four SG–AG calibrations done for AP between 2011 and 2016, and for comparison we processed four datasets from ST between 2008 and 2011. We first assessed the histograms of the drop and set σ’s (the columns marked ‘Sigma’ in the FG5 text files) for all datasets used in this study; these are shown in Fig. 3a for AP and Fig. 3b for ST. For AP the drop sigmas are clearly divided into two groups, for every experiment. There is a tight clustering of values around 16 μGal for AP2011 and AP2013, and similarly around 25 μGal for AP2014 and AP2016. Beyond 30 μGal, the drop sigmas extend to very high values (several hundred μGal) for badly disturbed data, and so we choose a maximum acceptable cutoff value of δgm = 30 μGal for all AP data. For the AP set data, however, the sigmas vary from a peak around 5–7 μGal to a scatter of values up to about 30 μGal, suggesting the latter is also a suitable set cutoff to avoid disturbed data. The situation is different for the ST data, where there is very little disturbed data, but the histogram peaks also occur at different values depending on the experiment. The drop sigmas for ST range from a low of about 10 μGal for ST2010 and ST2011 to between 25 and 30 μGal for the other two datasets. The ST set sigmas vary between 5 and about 27 μGal, so to include most data we choose for ST a rather large cutoff of δgm = 35 μGal, but this is not a critical value because there are very few sets above 25 μGal.
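In code, the cutoff amounts to a mask on the Sigma column plus inverse-variance weights for the survivors; the helper below is an illustrative sketch, with δgm read off the histograms of Fig. 3.

```python
import numpy as np

def select_by_sigma(values, sigma, dg_max=30.0):
    """Keep drops or sets whose Sigma is below the station cutoff
    (30 microGal at AP, 35 microGal at ST) and return 1/sigma^2
    weights for the surviving data."""
    keep = sigma <= dg_max
    return values[keep], 1.0 / sigma[keep]**2
```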

Fig. 3 Histograms of standard deviations of drop and set values for 4 datasets from a Apache Point and b J9 Strasbourg. The AP drop σ’s are divided into two groups, below about 30 μGal for the better data and > 30 μGal for the poor nighttime data. Note that the range of the σ’s for data < 30 μGal is similar for both drop and set errors at both stations. One could also plot the set histograms based on the ‘Error’ column in the set files, in which case the set σ’s are about a factor 10 smaller when there are ~ 100 acceptable drops per set

We processed the four SG–AG experiments at AP using both set and drop data. Initially we used all the set or drop data in the files, without prior selection, and computed the solution relying on the weights (inverse variances from the drop sigmas) to reject large set or drop σ’s. This initial processing is called JS0 (selection method 0) and yields the two rows ‘JS0’ in Table 3. To be explicit, this procedure consisted of:

Table 3 Set versus drop scale factors β (μGal/V) for AP experiments; drop and set δgm = 30 μGal
(a) selecting only those drops or sets that had sigmas below the sigma cutoff discussed above (shown in Fig. 3), and weighting the AG data according to their inverse variances, and

(b) for the SG data, using the 1 min GGP data and interpolating the SG values to the drop or set times that matched the AG times; this is the procedure discussed in Rosat et al. (2009).

The result is that much data was discarded, depending on the experiment; in the worst case, the AP2013 experiment, less than 30% of the data could be used. Notice in Table 3 that the drop SFs (scale factors) are sometimes quite different from the set SFs, whereas the latter (JS0) are generally similar to the other ‘better’ JS solutions (described below). The reason is that the drop σ’s are not necessarily indicative of which drops are far from the SG curve, and so the weighting does not automatically diminish the influence of drop outliers that may have acceptable σd’s. This is not true of the set σ’s, which are more robust against bad drop data, so sets with high σ’s are down-weighted in the solution relative to sets with small σ’s. Table 3 shows the results for JS1 and JS3 (see later) for passes 1 and 2, but note that pass 2 is not required for some of the set solutions if there are no set residuals outside the 3-σ criterion.

Considering the difficult AP data, namely the lack of agreement between the scale factors for sets and drops, we wondered whether it was possible to improve the fit by further selecting the AG drops. Because most drops had about the same σd, they all contribute equally to the SG fit, but in reality many drops are far from the SG curve; perhaps these drops degrade the scale factor fit and could be rejected? It is also clear that drops near the tidal peaks are more important in determining the scale factor than those midway between peaks. Rather than pursue such an approach, we changed strategy and decided to test a number of options presented in the AG–SG processing. For the moment we pass over sections (c) of Tables 3 and 4, and return to them later.

Table 4 As Table 3 but for 4 Strasbourg J9 calibrations

2.3 Improving the Algorithm for Rejecting Bad Data

When collaboration began with the Strasbourg group, and especially after the paper by Calvo et al. (2014), it became clear that better ways to reject bad data had been gaining acceptance, following the approach contained in Meurers (2012). We mention again that the goal here was to process the AG and SG data using only the information in the FG5 text files, and further to do this on files that come directly from the FG5 without any pre-processing of the AG data using the g-software. This was the case for the AP text files received by mail from NGA, whereas the ST data were already carefully pre-processed to reject bad drops before the text files were written—a significant difference (and improvement).

The following discussion addresses a number of options that we tried in rejecting bad drops, which is the key to getting SFs that are consistent when using both drops and sets. The options are summarized in Table 8 in the “Appendix” with the same abbreviations as here:

1. w1w0: weighting the data when computing means. When the g-software records each drop it is compared to an evolving mean value that eventually becomes the set mean, and there is no chance to weight the drops. But after the drops and sets are recorded one can re-compute the set means by weighting the drops with their σd’s, in principle a better approach. So we decided to base all subsequent processing on just the drop data files, and used the set file data only as a check. We could recover the set file data by computing the unweighted mean of the drops in each set, and also by adopting the 3-σ rejection of drop outliers. We verified that exactly the same drops were rejected as reported in the set files (columns ‘accept’, ‘reject’) and with exactly the same set means. The 3-σ rejection is iterated until the number of rejected drops does not change (a maximum of 5 iterations proved adequate). The unweighted solutions are designated ‘w0’ and the weighted solutions ‘w1’. The weighting (or not) is applied to every instance in the program where set means are required; our default preference is the weighted option.

2. s3s1: using a 3-σ or 1-σ criterion for rejecting outliers. Amalvict et al. (2002) reported on the choice of ‘n’ in using an ‘n-σ’ selection. They chose 3-σ for drops and 1-σ for sets, so we decided to test both options when rejecting outliers. It is expected that this choice depends on the noise in the experiment; for good data it should make little difference.

3. co–nc data: should we do drop selection on the uncorrected or corrected data? We use ‘TLBP’ to refer to the geophysical corrections (tide, ocean load, barometric pressure loading, polar motion) that are generally removed to get a regular AG site measurement, the ‘co’ option, as opposed to the ‘nc’ option that does not remove TLBP when rejecting outliers. This point has been emphasized in the papers by Meurers (2012), Calvo et al. (2014), and Van Camp et al. (2016). As discussed previously, if corrections are not made, the drop rejection is compromised by the inclusion of the tides (predominantly), which vary throughout the experiment. Once the drops are rejected, the TLBP corrections can be reapplied to the accepted AG data for use in the calibrations. At AP we asked the NGA operator to re-run the four calibration experiments with corrections applied, and based our rejections on those files rather than on the uncorrected data. In this case it is necessary to have exactly consistent drop and set files to transfer the accept/reject criteria between corrected and uncorrected versions of the same data. This worked well for the AP data, but unfortunately for the Strasbourg experiments the loss of a computer disk in 2014 with all the processed data meant that we did not have ready access to the corrected text files, although the raw AG files are archived. Up to the present we have not tested the ‘co–nc’ option for the ST data.

4. p1p2 processing: adding a drop/set rejection of outliers after the first fit is obtained (see the sketch after this list). This turned out to be a significant point. Once pass 1 is made, one has access to the residuals—the deviations between the SG curve and the AG drop or set values. These residuals have a more or less normal distribution about the SG curve, so it is easy to reject deviations on a 3-σ (or 1-σ) criterion; this is equivalent to refining the pass 1 solution by rejecting outliers, or equivalently choosing drop or set deviations that are close to the SG curve. For most datasets, even those that had been carefully screened manually using the g-software in Strasbourg, pass 2 found enough drops or sets to discard that the solution was noticeably improved. For some of the poorer AP sets up to 900 new drops were rejected, and up to 5 more sets. It should be said that all the processing for drop or set SFs up to pass 1 uses exactly the same accepted/rejected drops or sets, but at the last step the pass 2 solution for drops rejects only drops, and the pass 2 solution for sets rejects only sets, so a small difference appears in the data used for the two types of SF.
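A hedged sketch of the pass-2 refinement follows, reusing the fit_ag_sg helper sketched in Sect. 1.3; the single refinement pass and the residual-scatter threshold are our reading of the procedure, not the exact implementation.

```python
import numpy as np

def pass2_refine(t, x, y, sigma_y, n_sigma=3.0):
    """Pass 2: fit Eq. (1), reject drops or sets whose residual from the
    fitted SG curve exceeds n_sigma times the residual scatter, refit.
    fit_ag_sg is the Eq. (1) fit sketched in Sect. 1.3."""
    (alpha, beta, gamma), _ = fit_ag_sg(t, x, y, sigma_y)       # pass 1
    resid = y - (alpha + beta * x + gamma * t)
    keep = np.abs(resid) <= n_sigma * resid.std(ddof=1)
    return fit_ag_sg(t[keep], x[keep], y[keep], sigma_y[keep]), keep
```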

An illustration of the effectiveness of this procedure is shown in Fig. 4 for the AP2011 experiment. As seen in Table 3, most calibrations are improved using pass 2 in the sense that the drop and set SFs are brought closer.

Fig. 4 AG residuals, a drop and b set, after pass 1 and pass 2 of the AP 2011 calibration processed using the JS3 option (see text). The residuals show the departure of the AG values from the SG curve, but these are not related to the drop and set σ’s shown in Fig. 3a and b. The second pass successfully removes about 400 drops and 4 sets to improve the solution

In addition to the above options (1)–(4) that could be invoked, we chose 3 additional methods to discard drops/sets. Again these are summarized in Table 8 in the “Appendix”, and described below as a series of processing steps. Within each step we can choose any of the above options.

  • Step 1: From the drop files with TLBP corrections, compute the set means and implement rejection of drop outliers (with iteration), flag all drops as accept/reject, and save these flags as method ‘JS1’ (rejection based on set means). For the drop SFs only accepted drops are used, and for the set SFs the accepted drops are gathered into sets and the set means are used as the data (as usual). Complete sets are rejected if their σs > δgm; such sets, even with good drops, are discarded when doing set SFs.

  • Step 2: From the same drop files, reject drops based on the drop σ > δgm (the cut-off σ shown in Fig. 3), flag all drops as accept/reject, and save these flags as method ‘JS2’ (rejection based on drop σ). The procedure then follows JS1.

  • Step 3: Combine the accept/reject flags from the previous 2 steps, so drops are rejected as method ‘JS3’ (drop rejection based on both set means and drop σ’s).

  • Step 4: From the drop text files for the calibration with no corrections ‘nc’, we apply the reject flags on all drops identified from the previous three methods (JS1, JS2, and JS3). This completes the preselection of AG drops.

  • Step 5: Prepare the SG data in two ways. First, we use the old method (JS0) of interpolating the 1 min data to either AG drop times or AG set times, depending on whether we are using drop or set data; this method is combined with the JS1 and JS2 AG selection above. Second, we use the SG 1 s files, select all data spanning the AG drop times, then filter the data to reduce the effect of the microseismic noise; this was suggested explicitly by Van Camp et al. (2016) and proves valuable for improving the SFs.

Figure 5 shows a suite of filters, based on the Parzen data window, with lengths from 11 to 501 points and cutoff periods between 3 and 50 s. They are designated sf1–sf9, where the last filter sf9 is the original 1 s to 1 min GGP filter. The effect of applying these filters to the four ST datasets on the calibrations is shown in Fig. 6.
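For illustration, such a smoothing filter can be built from the Parzen window available in scipy; the unit-gain normalization is our choice, and the window length npts corresponds to the sf1–sf8 designations.

```python
import numpy as np
from scipy.signal.windows import parzen

def parzen_smooth(x_1s, npts=201):
    """Low-pass the 1 s SG series with a normalized Parzen window
    (lengths 11-501 points) before sampling at the AG drop times,
    suppressing the 5-10 s microseismic noise."""
    w = parzen(npts)
    w /= w.sum()                      # unit gain at DC
    return np.convolve(x_1s, w, mode="same")
```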

Fig. 5 Amplitude transfer function of filters applied to the SG 1 s data prior to sampling at AG drop times. The Parzen filter windows of lengths 11–501 points are simple smoothing filters with cut-off periods between 3 and 50 s. The standard GGP filter g1s1m, of length 1009 and with a Nyquist cutoff for 1 min sampling at 8.3 × 10−3 Hz (vertical dashed line), is used to decimate the 1 s data to 1 min

Fig. 6 Effect of filtering the 1 s SG data on the scale factor for the Strasbourg datasets. The error bars come from the Sigma column of the set data. The 2010 experiment in particular shows a pronounced bias if the data are not filtered, whereas using filters between sf6 and sf9 suppresses noise at periods < 25 s. The other datasets, ST2008 and ST2011, are affected to a lesser extent

It is clear that there can be a bias in the SF, depending on the data quality, if the SG data are inadequately smoothed. For one dataset, ST2010, the SF using no filter (sf0) shows a noticeable shift that can be corrected using a filter such as sf6 (or longer). For this experiment there was a large earthquake that had to be removed from the data; it was assumed the rest of the data was acceptable, but the effect of the earthquake carried over unseen in the data. All scale factors using the Calvo et al. (2014) method are obtained from the raw 1 s data without filtering. Here, filtering with sf6 is used in all the calibration solutions with the ‘JS3’ designation. It should be pointed out that Meurers (2012) uses the SG 1 min data resampled to the AG drop times in his solution, which is equivalent to the original JS0 processing here. Even though it might be technically better to use filtered raw 1 s data, the SF difference between sf6-filtered 1 s data and the standard 1 min GGP data is negligible, and there appears to be no bias introduced by using the 1 min SG data directly. One other advantage of filtering the SG data at AP is a noticeable reduction in the amplitude of the telescope glitches, which are at the level of 0.5 μGal or less.

The JS1 and JS2 methods combine their different preselections of drops with SG data interpolated from the 1 min data. The JS3 method uses the combined preselection of drops, indicated above, but the SG data are the filtered 1 s values sampled at the AG drop times and either used directly for the drop SF, or gathered into sets exactly as the AG drops are, for the set SF. JS3 is thus the only combination that follows the procedure of Meurers (2012) and Calvo et al. (2014), and so we consider it the ‘best’ method of getting the SFs, especially with a pass 2 refinement. This is borne out by the results presented below.

2.4 Results of Tests for the Scale Factors on AP Data

From the previous section we may, therefore, summarize the various tests as follows. For the AP data we have these multiple solutions:

  • Method JS0: 1 solution for drops, 1 for sets; uses fixed options [w1, s3, co (AP) or nc (ST), pass 1] for 2 types of solution

  • Methods JS1–JS3: There are 2 choices for each of (w0/w1, s3/s1, co/nc, and p1/p2); the number of solutions computed is therefore 16 option combinations × 3 methods × 2 types = 96 calculations of the scale factor. Together with JS0, this gives 98 computations of the SF for each experiment.

We then group all the solutions according to which factor we want to isolate (e.g., w1/w0) and compute the mean of the absolute difference between the solutions for w0 and the solutions for w1 (in units of μGal/V). This provides a metric to judge which of the various options have the most effect on the final scale factors; a sketch is given below. Some of the more important results are shown in Table 9 (“Appendix”) under the 4 methods and 2 types (drops or sets). In this table we have also graded the datasets according to whether they are of high, medium or low quality, based on both the amount of data discarded in Table 3 and on the results in Table 9 themselves. It is important to also evaluate our options on the higher quality datasets that most users will encounter, so we combine all the results with a weighting on a scale of (1 = low, 2 = med, 3 = high) for the AP and ST datasets combined. The result is shown schematically in Fig. 7, where the two types drops/sets are given with the combined score for the 4 options.
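Written out, the metric is simply a quality-weighted mean absolute difference; the function below is our reconstruction of it, with the 1/2/3 grading from Table 9.

```python
import numpy as np

def option_impact(sf_a, sf_b, quality):
    """Quality-weighted mean absolute SF difference between the two
    settings of one option (e.g., w0 vs w1) over all datasets;
    quality is 1 (low), 2 (med) or 3 (high)."""
    d = np.abs(np.asarray(sf_a) - np.asarray(sf_b))
    q = np.asarray(quality, dtype=float)
    return (q * d).sum() / q.sum()
```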

Fig. 7 Summary of drop and set tests on options for computing the scale factor, based on Table 9 (see text). The bars represent the mean amplitude of changes in the scale factor over all calculations for both stations AP and ST related to: w1/w0—option to weight the set averages, co/nc—option to use corrected or uncorrected AG data, s3/s1—option to use 3-σ versus 1-σ rejection of outliers, and p1/p2—option to choose a second pass rejecting residual fit outliers. A difference of 0.025 μGal/V is equivalent to a change of 0.03% in the SF

For the w0/w1 choice, the effect depends on whether we are doing drops or sets. For drops the effect is small, but for sets the choice is moderately important, and it seems better to weight the set means as a matter of principle. For the co/nc option, the mean differences are surprisingly small, despite the seeming theoretical advantage of rejecting drops on the corrected data (see the previous discussion) as opposed to the uncorrected data used for the calibration. It is unfortunate that we did not have the corrected ST data at hand to test this result, but it appears that selecting drops directly from the uncorrected files may not, in practice, give a large SF error. For the s3/s1 option the effect is modest for either drops or sets; therefore, retaining the 3-σ outlier criterion is acceptable for the better data. It is clear from Table 3, however, that for the lower quality data there is a difference (improvement) in choosing a tighter control of outliers using the 1-σ criterion. The final option p1/p2, whether or not to have a second pass, makes the biggest difference in determining the SF, especially for the drop solutions. The advantage for sets is much less obvious, as might be expected because the set solutions are weighted more robustly. The number of tests is halved for the ST datasets because there is no corrected data (‘co’); thus there are only 49 computations of the various SFs for ST.

2.5 Strasbourg Processing and Calibration Results

We turn from the problematic AG data of AP to station J9 in Strasbourg, which has a very long series of AG–SG calibrations, beginning in the early 1990s (Hinderer et al. 1991; Amalvict et al. 2001, 2002; Rosat et al. 2009; Calvo et al. 2014). We repeat the same calculations as for AP on data for SG CO26 in Strasbourg, using the 4 datasets from 2008, 2009, 2010, and 2011 as shown in Table 4.

Because of the much better site conditions, the AG data are much cleaner than at AP, and with the previously determined cutoff of 35 μGal for both the sets and drops, only a small percentage of the data are excluded. Comparing Table 4 with Table 3, we note that the ST scale factors are more consistently determined than at AP, but the errors in the SFs are not uniformly better. For example, experiment AP2011 in Table 3 has smaller drop and set errors than any of the data for ST, probably due to the larger number of sets used (283) compared to 139–164 sets for ST. Even so, with careful rejection of the bad data, a satisfactory SF can be obtained even at AP. As in Table 3, we compare the JS1 and JS3 methods for passes 1 and 2, and note that pass 2 is always required, even though the ST data are much better overall than those at AP. This is a good time to point out that, based on set standard deviations, the set SFs in Tables 3(b) and 4(b) have much larger uncertainties than the drop SFs in sections (a) of the Tables, especially for ST2008 and ST2011. The only way the drop and set errors can be comparable, as Amalvict et al. (2002) indicated, is if we use the standard error of the mean (that is to say, the standard deviation divided by √N) as the uncertainty in the set SFs. To show this we recomputed the set SFs using the Error column of the set files, shown in sections (c) of Tables 3 and 4 for the set errors. It is clear that the differences in SFs are very small, and the SF errors are almost exactly a factor 10 smaller than when using the Sigma column. Thus, one can get almost the same result by finding the standard deviation of the SF using the Sigma column (sections (b) of Tables 3 and 4) and dividing by √N to get the SEM. The SEM of a weighted mean has the more precise form indicated in the “Appendix” following Eq. (A2).

We see in Fig. 8 a comparison of 7 solutions (JS0, JS1, JS2, and JS3 for drops and sets) for two experiments at AP and two from ST, based on sections (a) and (b) of Tables 3 and 4.

Fig. 8 Comparison of scale factors for a two AP datasets (2011 left axis, 2016 right axis) and b two ST datasets (2008, 2009). All vertical axes are in μGal/V. For each, the drop and set SFs are given for the various methods on the x axis. Abbreviations JS0, JS1, etc., are given in Table 9. Note that the pass 2 solutions are often quite distinct from the pass 1 solutions, and there is a good final convergence between drop and set values for the best datasets (all but AP 2016). The final column in (b) shows the drop and set factors obtained by Calvo et al. (2014), diamonds for the drops and stars for the sets (with matching colors). All error bars arise from using the Sigma column in the set files

The x-axis gives the different solutions, comparing passes 1 and 2. Note that the SFs alternate high and low depending on the pass, but with the most important solutions (JS3 pass 2) there is a good convergence of the SFs from drops and sets. The original solutions (JS0) are quite different for the AP data, but with the improved processing the results for AP become almost as good as at ST. In Fig. 8b, we also show for comparison the solutions obtained by Calvo et al. (2014), where the drop and set values are more separated. Our improvement is probably due to the filtering of the SG 1 s data, as well as to using pass 2 to clean up the residuals.

3 Combining Different Scale Factors

Figure 9 shows the result of 51 AG set calibrations of instrument CO26 at J9 in Strasbourg, from 1997 to 2012 (Calvo et al. 2014). The SG sensor did not change over this period, but the data acquisition electronics changed in 1997/12 (time lag reduced from 36.0 s using the TIDE filter to 17.18 s using the GGP2 filter) and again in 2010/04 when a new GGP filter board (GGP1 filter) was installed with a time lag of 8.16 s. The evolution of the measurements indicates a reduction in scatter of the calibrations with time, but no clear convergence to a unique value. All these scale factors were computed using the drop and set methods in Calvo et al. (2014), similar to our pass 1 determination.

Fig. 9 Strasbourg scale factors from sets and drops, 1997–2012. We show set SFs (‘sfset’) based on the Error column of the set data. Four values denoted by blue triangles (‘this_paper’) are added from the drop SFs in Table 4(a) for 2008–2011

What is the best way to combine such different estimates of the SG scale factors? The answer partially depends on whether the scale factor should be treated as a constant from one calibration experiment to another. Physicists have long been faced with this problem in the determination of fundamental quantities, for example, the Newtonian gravitational constant G, and in such cases the proper procedure is conflation, see “Appendix” Eq. (A1). In the case of an SG, it is assumed (e.g., Hinderer et al. 2015) that the scale factor is determined by the factory magnetic field configuration, or, as stated by Goodkind (1999): “The calibration constant is fixed by the geometry of the coils and suspended mass so that it remains the same if the instrument is turned off and on again no matter how long the time between”.

Assuming this is true, the scatter in Fig. 9 must then be attributed to random (and probably also systematic) factors in the experimental setup and environmental noise rather than in the instrument, and this would suggest that conflation is appropriate. As discussed in the “Appendix”, the calculation of a weighted mean of a series of measurements is unique, but there are two ways to compute its variance, depending on the purpose. One can use the weighted sample variance Eq. (A2), which measures the spread of estimates about the weighted mean. This is appropriate to indicate the scatter of the measurements, but in the case of the SG scale factor it is assumed that the repeated measurements should converge to a unique value, which is the actual calibration. This is indeed the case for SGs, where numerous studies indicate that β can be quite stable. Even the relocation of an SG between two quite different sites does not change the scale factor, as documented by Meurers (2012) for the transition between Vienna and the Conrad Observatory. The SF errors are then appropriately combined by conflation, Eq. (A1).

In principle, set and drop scale factors are not independent; nonetheless they arise from different procedures, so we cannot strictly average or conflate them together, and they should be treated separately. We apply (A1) to set and drop data independently, and assume the scale factor is not influenced by the changes in electronics, to get the result in Fig. 10.
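For concreteness, the sketch below assumes Eq. (A1) is the standard conflation of Gaussian estimates (an inverse-variance weighted mean whose variance is 1/Σ 1/σi²) and Eq. (A2) the weighted scatter about that mean; the Appendix gives the exact forms used.

```python
import numpy as np

def conflate(sf, sigma):
    """Conflation of independent SF estimates (assumed form of Eq. (A1)):
    inverse-variance weighted mean with error sqrt(1/sum(1/sigma_i^2))."""
    sf, w = np.asarray(sf), 1.0 / np.asarray(sigma)**2
    return (w * sf).sum() / w.sum(), np.sqrt(1.0 / w.sum())

def weighted_scatter(sf, sigma):
    """Weighted spread of the estimates about the weighted mean
    (our reading of Eq. (A2))."""
    sf, w = np.asarray(sf), 1.0 / np.asarray(sigma)**2
    mean = (w * sf).sum() / w.sum()
    return mean, np.sqrt((w * (sf - mean)**2).sum() / w.sum())
```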

Fig. 10 Conflation of the CO26 set and drop scale factors assuming no effect of the time delay (changes at vertical dashed lines). Error bars on the set SFs arise from using the set file Error column, as in Table 4(c), but the drop SFs use the Sigma column of the drop files, as in Table 4(a). The current SF value in the IGETS database is − 792.0 ± 1.0 nm s−2/V (0.1%), close to the 2012 conflated drop mean of − 791.934 ± 0.195. For sets the SF is slightly reduced to − 790.527 ± 0.113

The initial scale factors prior to 2000 are quite divergent, but each scale factor has more or less stabilized between 2005 and the end of 2012, and the set values are somewhat higher than the value used in the Strasbourg data files, which agrees with the drop value at about − 792.00 nm s−2/V. The conflated values are much more revealing of the evolution of the calibration experiments than the scatter plot in Fig. 9, and are a useful way to assign a unique value to an SG in a database.

It is to be noted that Eq. (A1) implies that the error of a long series of scale factor measurements will eventually approach zero, which may seem ‘unrealistic’. To test this we artificially extended the calibrations at J9 by repeating the same data as shown in Fig. 9 between 2001 and 2012, adding this series as qualitatively representative of future (yet to be done) measurements. For the total of 107 calibration experiments (those beyond 2012 being repeats), the conflated set error would have dropped slowly from 0.11 nm s−2/V (Fig. 10) to 0.07 nm s−2/V, so the decrease for even a long series of measurements is quite slow.

To be complete, we compare the evolution of the two variances from Eqs. (A1) and (A2) in Fig. 11. The error in the weighted mean (A2) varies somewhat from the scatter of individual scale factor estimates, but does show the same overall downward trend as from conflation (A1). Assuming the actual SF is constant over long time periods, it seems plausible to expect an eventual convergence of the mean and error estimate from Eq. (A1). We also note that conflation can be used for SG scale factors determined using different AG instruments.

Fig. 11 Evolution of the error in the J9 scale factor (Fig. 10) from set data for conflation, Eq. (A1), versus the weighted variance, Eq. (A2). Errors are defined in the same way as for Fig. 10

4 The AG Mean Value

Returning to the basic Eqs. (1)–(3), we note that the AG mean value y0 includes certain AG static corrections, such as the transfer height and gradient effects, that are applied to standardize the absolute site level. During the processing of AG data, the operator sets the transfer height and gradient for the experiment. The former is simply the height at which the gravity value is desired for the particular site (frequently ground level), and it should be kept constant from experiment to experiment for consistency. It is computed from a combination of the actual height of the dropping chamber (in fact the sum of the setup height and an instrument-specific height close to 1.2 m, given by the manufacturer, at which g is determined from the trajectory over a distance of about 20 cm inside the dropping chamber) and a gradient to be used for the transfer. Ideally, the observed gravity gradient should replace the default − 3.0 μGal/cm (the standard free-air gradient).

For various reasons these static corrections (transfer height and gradient) were not always kept constant at Apache Point. In one experiment, the transfer height was set to 0 and the gradient to − 2.79 μGal/cm, whereas normally we had used 100 or 130 cm for the transfer height and the default gradient. For a long time we did not know the actual gradient below the telescope where the AG measurements were made. This was eventually measured in 2015 with a Scintrex CG5, both in the cone room (− 4.42 μGal/cm) and outside the telescope building at ground level (− 3.87 μGal/cm). The gradient resulting from the potential of a homogeneous ellipsoidal model at AP is − 3.08 μGal/cm, but this cannot be applied to a station at an elevation of 2788 m (referred to WGS84); the discrepancy with the measured value is most likely due to the assumption of a radially symmetric Earth model, i.e., it reflects local topography (including the building) and lateral crustal density anomalies. It then became necessary to adjust not only the transfer height but also the gradient from the values used in the experiment to consistent values. We refer to the discussion in the “Appendix” showing how this can be done without access to the g-software.
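A hedged sketch of this re-standardization is given below, under the convention that the reported value can be carried back to the instrument’s effective measurement height h_meas and then re-transferred with the new gradient; the exact recipe, including how h_meas is formed from the setup height and the manufacturer’s constant, is in the “Appendix”.

```python
def retransfer(g_rep, h_meas, h_old, grad_old, h_new, grad_new):
    """Move an AG value reported at transfer height h_old (cm) with
    gradient grad_old (microGal/cm, negative upward) to transfer height
    h_new with gradient grad_new; h_meas is the effective measurement
    height. Assumed convention: g(h) = g(h_meas) + grad*(h - h_meas)."""
    g_meas = g_rep - grad_old * (h_old - h_meas)   # undo the old transfer
    return g_meas + grad_new * (h_new - h_meas)    # apply the new transfer
```

For example, a value reported with (0 cm, − 2.79 μGal/cm) could be moved in this way to the standard AP choice (130 cm, − 4.42 μGal/cm).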

During a regular AG site measurement, assume y(t) is measured as previously, but this time the geophysical corrections are applied for the tides (from local gravimetric factors), barometric pressure, and polar motion (TLBP); we therefore write these as:

$$ gs(t) = g_{\text{tide}} + g_{\text{press}} + g_{\text{polar}} = gs_{0} + gs_{1}(t) $$
(5)

so that during a site measurement, the corrected AG measurements are

$$ y_{\text{c}}(t) = y(t) - gs(t). $$
(6)

Introducing the mean and time-varying parts from (3) and (5), and recognizing that the corrected gravity yc(t) should ideally be the site AG value g0, free of time-varying effects, we find for the constant and time-varying parts

$$ g_{0} = y_{0} - gs_{0} = \alpha + \beta\,x_{0} - gs_{0} $$
(7a)
$$ 0 = y_{1}(t) - gs_{1}(t) $$
(7b)

Equation (7b) ensures that all time-varying parts of the measured gravity field are accounted for by the time-varying part of gs(t), provided we ignore the errors in the model and other factors such as non-tidal ocean effects and hydrology (but see below). Equation (7a) shows how to use the mean value y0 to get g0, i.e., by subtracting the mean value (or zero level) of the geophysical corrections applied by the g-software. These are the tidal amplitudes with a non-zero mean level, the applied nominal pressure corrections with a specific reference pressure p0 (calculated for each AG site) and admittance (− 0.3 μGal/hPa), and the mean level of polar motion. It was for this reason that we kept track of the constants α, β, and x0 in the solution of (1).
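In code, Eq. (7a) with the error propagation of Eq. (4) reads as in the sketch below; var_gs0 is a placeholder for the separately estimated variance of the mean correction level gs0.

```python
import numpy as np

def site_gravity(alpha, beta, x0, gs0, cov, var_gs0=0.0):
    """Eq. (7a): g0 = alpha + beta*x0 - gs0, with the variance of
    alpha + beta*x0 from Eq. (4) plus the variance of gs0."""
    g0 = alpha + beta * x0 - gs0
    var = cov[0, 0] + x0**2 * cov[1, 1] + 2 * x0 * cov[0, 1] + var_gs0
    return g0, np.sqrt(var)
```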

Normally no other time-varying effects are explicitly involved in the site AG measurements, such as local hydrology attraction, non-local hydrology loading, non-tidal ocean effects, and tectonics (see e.g., Pálinkáš et al. 2010); indeed some of these are the target being measured. But many SG users routinely consider such further corrections to their data, so we could assume another model for these: gh(t) = gh0 + gh1(t), dominated by hydrology, similar to (5). Adding gh(t) to gs(t) then changes (7a) and (7b) to:

$$ g_{0} = y_{0} - gs_{0} - gh_{0} $$
(8a)
$$ y_{1}(t) = gs_{1}(t) + gh_{1}(t) $$
(8b)

Again, the time-varying part of the AG measurements can be accounted for in (8b), for which many models exist. It is Eq. (8a) that poses a problem for AG measurements because it is not easy to define the mean level of hydrology gh0. Unlike the other mean levels, there is no obvious reference level for hydrology; the relative level used for SG studies (e.g., supplied as a loading correction by the EOST/IGETS loading service) is arbitrary and may have no relevance for a particular site. One might use the hydrology levels expected from local environmental parameters (rain, snow, evapotranspiration …) which could define a mean hydrology based, for example, on decades-long averaging, or a reference hydrology level established after a prolonged drought, which might be empirically estimated. This is an interesting unresolved problem that may arise when considering further corrections to the AG site measurements.

Aside from the problem of hydrology, we can still find the site gravity g0 from Eq. (7a) and compare it with the AG site measurement obtained by re-running the g-software with the geophysical corrections turned on. To estimate gs0, we turn to the SG-derived version of gs(t) that is readily available at all SG stations due to the need for such modeling, noting that it may differ from the FG5 corrections. For example, the SG may provide a superior tidal model (using local gravimetric factors), and we can apply the polar motion as published by the IERS instead of the predicted polar motion used in the g-software.
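Replacing the predicted polar motion with published IERS pole coordinates amounts to evaluating the standard polar-motion gravity correction; a minimal sketch follows (our own function, with the conventional gravimetric factor 1.16, east-positive longitude, and illustrative pole coordinates and approximate AP site coordinates):

```python
import math

def polar_motion_ugal(x_arcsec, y_arcsec, lat_deg, lon_deg, delta=1.16):
    """Standard polar-motion gravity correction:
    dg = -delta * omega^2 * a * sin(2*lat) * (x*cos(lon) - y*sin(lon)),
    with pole coordinates x, y in radians; returned in microGal."""
    omega = 7.292115e-5                       # Earth rotation rate, rad/s
    a = 6378137.0                             # semi-major axis, m
    as2rad = math.pi / (180.0 * 3600.0)       # arcsec -> rad
    x, y = x_arcsec * as2rad, y_arcsec * as2rad
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    dg_ms2 = -delta * omega**2 * a * math.sin(2.0 * lat) * (
        x * math.cos(lon) - y * math.sin(lon))
    return dg_ms2 * 1e8                       # m/s^2 -> microGal

# Example with illustrative IERS pole coordinates (arcsec):
dg = polar_motion_ugal(0.05, 0.35, 32.78, -105.82)
```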

We applied the above method to finding the mean value y0 and the estimated AG gravity g0 for the 4 calibrations at AP, based on set-derived solutions from the calibration to be consistent with the set means used in the FG5 processing. Table 5 shows the AG mean values from set and drop estimates, g0 estimated from (7a) using the procedure above, and the site value from a usual FG5 measurement.

Table 5 Processing of Apache Point AG mean values y0 from SG–AG calibrations as AG site measurements; all units μGal

For AP, the standard transfer height and gradient are (130 cm, −4.42 μGal/cm). We note that the largest component of gs0 is not necessarily the tide, and that TLBP is quite variable over the 4 years. The mean value y0 comes directly from the fit (1), with x0 found before the fit; we then read in the corrections gs(t) and find the mean level of the tides, pressure, and polar motion from all the 1 min values coincident with the SG data. Next, gs(t) is splined to the AG set times and subtracted from the AG values y(t) as in Eq. (6). The corrected AG values yc(t) yield a weighted mean of the set values that leads to a ‘simulated’ FG5 site value, denoted g1 in Table 5. Alternatively, the mean levels of the components of gs0 are summed and subtracted from y0 using (7a), giving an AG site value g0 from y0, as advertised. Finally, the FG5 operator can reprocess the AG data using the g-software with corrections applied to get the actual FG5 site gravity. Some subtleties exist. For example, the SG mean value x0 and the AG mean value y0, which can also be obtained directly from the input AG data, must use the highest sampling available, either 1 s or 1 min for the SG, and drop data for the AG, even though the drop data may be noisy. The mean TLBP level, however, should be based on the set times, which include only good data, and g0 and g1 should match the set data as recorded in a site measurement. All the solutions in Tables 5 and 6 are based on the JS3, pass 2 method.
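A minimal sketch of the g1 chain just described follows (our own names; SciPy’s cubic spline stands in for whatever interpolant is preferred):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def simulated_site_value(t_gs, gs, t_set, y_set, sigma_set):
    """Sketch of the g1 computation: spline the 1 min TLBP series
    gs(t) to the AG set times, correct the set values as in Eq. (6),
    and take the inverse-variance weighted mean of the corrected sets."""
    gs_at_sets = CubicSpline(t_gs, gs)(t_set)   # gs splined to set times
    yc = y_set - gs_at_sets                     # Eq. (6), applied per set
    w = 1.0 / np.asarray(sigma_set) ** 2        # weights from set errors
    g1 = np.sum(w * yc) / np.sum(w)
    return g1, np.sqrt(1.0 / np.sum(w))         # weighted mean and its error
```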

Table 6 Processing of Strasbourg AG mean values y0 from SG–AG calibrations as AG site measurements; all units μGal

In Table 5, we note that some errors, e.g., on the site value g0 from y0, seem large, especially for the AP data where there are relatively few TLBP values at set times for finding gs0; quantities derived from gs0 therefore tend to have larger errors than one might expect. The final two lines of Table 5 show that the difference g1 − g0 is consistently less than 1 μGal, clearly indicating that Eqs. (7a) and (7b) work in practice for all datasets. The discrepancies between the g1 and FG5 site values are more variable but still generally within the error bars. The smallest error is on the value g1, which uses the external corrections directly on the AG data, but the other contributions to the total error, i.e., the ‘uncertainty’, are not added as is done within the g-software.

Confirmation of this procedure is provided by the 4 Strasbourg experiments (Table 6). We see that the tide mean value is significantly higher than at AP and clearly dominates gs0. The agreement g1 − g0 is very close and more consistent than at AP, owing to the better AG data, but there are discrepancies between g1 and the standard FG5 set measurement whose origin may lie in whether the corrections are FG5 or ‘SG-derived’ values.

5 Summary and Conclusions

We show in Table 7 a summary of the amplitudes of the various effects that influence the SG scale factor, according to our estimates. These are taken from the various tables and figures, with additional estimates for the differences between JS0, JS1, and JS3. Note that the effects are in units of the scale factor (μGal/V), but when translated into percentages the values are close to percentage errors, e.g., an error of 0.025 μGal/V in the scale factor is equivalent to 0.03%. Certain effects are more important than others, e.g., moving from JS0 to JS3 (using departures from set means, which is standard in FG5 processing) when processing drops, though this matters less for sets. Two factors have been improved over the processing of Calvo et al. (2014): smoothing the SG 1 s data, and doing a second pass, which is especially important for drop SFs. For set SFs the dominant effects are weighting the drops when finding set means, and the JS3 − JS2 difference, i.e., gathering the SG data at AG times into sets rather than interpolating SG 1 min data to AG set times. Any difference below 0.01 μGal/V (such as having to correct the AG data before selecting drops) is considered a minimal effect, but we still recommend processing using JS3.

Table 7 Summary of all factors influencing the SG scale factor

In addition, we have shown that the SG data should be selected to begin and end at the AG drop times, and that, especially for the AG mean value (if used), it can be important to include a trend to account for a drift in one of the instruments but not the other. The SG electronics time delay, which ideally should also be included, has an almost negligible effect. Another feature of our study is that we use only the drop text files, because we can compute everything from them, including all the set processing required for a set SF. We do not need special pre-processing of the recorded data using the g-software to reject drops, although this can of course be done by groups that have the facilities and manpower. As to whether to report drop or set SFs, both should be computed. Where there is a discrepancy, the set value is likely to be less affected by bad data. On the other hand, if the values are close, the drop SF is preferred because its error is statistically better defined, in the sense that one does not have to choose between the ‘Sigma’ and ‘Error’ columns of the set data as errors.

We also recommend the use of conflation to combine different estimates of the SF for a particular SG, as this is the best way to characterize the SF for stations in a database. Finally, for users who do not have the g-software, or are reluctant to spend the effort to use it for their calibrations, we have shown that it is possible to turn an SG calibration experiment into an AG site measurement by subtracting the geophysical TLBP corrections from the AG mean value. It may also be useful on occasion to determine the internal distance parameter D for an FG5, through Eq. (A5), to enable a precise conversion of an AG mean value from one gradient to another.
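As to the conflation just recommended, for independent normally distributed SF estimates it reduces to the inverse-variance weighted combination; a minimal sketch under that assumption (our own function, with illustrative numbers):

```python
import numpy as np

def conflate_normal(sf, sigma):
    """Conflation of independent normal SF estimates N(sf_i, sigma_i^2):
    the result is again normal, with inverse-variance weighted mean and
    combined standard error 1/sqrt(sum(1/sigma_i^2))."""
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    sf_c = np.sum(w * np.asarray(sf, dtype=float)) / np.sum(w)
    return sf_c, np.sqrt(1.0 / np.sum(w))

# Illustrative SF estimates (microGal/V) from separate experiments:
sf_comb, err_comb = conflate_normal([-77.10, -77.18, -77.05],
                                    [0.08, 0.05, 0.10])
```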