1 Introduction

Scalar gravimeters have been widely used for airborne gravimetry surveys and have been shown to provide accuracies better than 2 mGal for spatial resolutions down to 2 km (half-wavelength) (Bruton 2000; Forsberg and Olesen 2010; Li 2013). However, besides being expensive, these systems are not easy to install in a small aircraft due to their large dimensions. Furthermore, their operation is not straightforward (see Li 2007 for a detailed discussion). Combining a global navigation satellite system (GNSS) with a strapdown inertial measurement unit (IMU) can serve as an alternative or even as a complement to traditional spring gravimeters (Glennie et al. 2000). Strapdown airborne gravimetry typically requires high-quality navigation-grade inertial sensors, although it has been shown that tactical-grade inertial sensors are also able to observe the vertical component of the gravity vector (Bastos et al. 2002; Deurloo 2011). Navigation grade comprises inertial systems with a bias stability of 0.0001–0.1°/h for the gyros and 1–1000 μg for the accelerometers, while tactical grade includes inertial systems with a bias stability of 0.1–10,000°/h for the gyros and 50–10,000 μg for the accelerometers (Jekeli 2001). Early results with navigation-grade IMUs showed that accuracies obtained for these systems are similar to those of spring gravimeters (Bruton 2000; Deurloo 2011; Glennie et al. 2000; Kwon and Jekeli 2001).

The first airborne gravimetry flight test using a (navigation-grade) IMU was carried out over the Rocky Mountains in 1995 by researchers at the University of Calgary (Wei and Schwarz 1998). This test showed the ability of the IMU to retrieve the vertical component of the gravity vector (i.e., scalar gravimetry) with an accuracy ranging between 2 and 3 mGal and 5 km spatial resolution (half-wavelength). Following this remarkable development, Bruton (2000) further improved the extraction of the gravity signal from the IMU data, yielding accuracies of 2.5 and 1.5 mGal for spatial resolutions of 1.4 and 2 km (half-wavelength), respectively. Kwon and Jekeli (2001) from the Ohio State University proposed a method for retrieving the three components of the gravity vector (i.e., vector gravimetry). Using the dataset from the Rocky Mountains 1995 flight test, they reached accuracies of 6 and 3–4 mGal for the horizontal and vertical components, respectively. Around the same period, the Astronomical Observatory of the University of Porto (AOUP) started developing an airborne gravimetry system based on a low-cost tactical-grade IMU, a Litton LN-200. Bastos et al. (2002) presented the first results of the system using data acquired in the region of Azores in the scope of the Airborne Geoid Mapping System for Coastal Oceanography (AGMASCO) project (Forsberg et al. 1997). The vertical component of the gravity vector was estimated with an accuracy of 5–10 mGal for a spatial resolution of 10 km (Tomé 2002).

More than a decade after the initial developments, there have not been significant improvements, with the methods developed by Bruton (2000) and Kwon and Jekeli (2001) being still the most used for scalar and vector gravimetry, respectively. For example, Senobari (2010) presents similar accuracies to those mentioned by Kwon and Jekeli (2001). In addition, Li (2011) estimated the vertical component of gravity with 0.5–3.2 mGal of accuracy for a spatial resolution of 17 km. Moreover, Gerlach et al. (2010) retrieved gravity disturbance estimates with 3 mGal for a spatial resolution of 2 km, and Huang et al. (2012) showed that their system was able to retrieve scalar gravity estimates with an accuracy better than 2 mGal for a spatial resolution of 6 km.

Regarding the AOUP low-cost system, Deurloo (2011) modified the inertial sensor error model and the system’s configuration and thus improved the initial results to 8 mGal at 3 km of spatial resolution. Following his approach and introducing a crossover point-based serial tuning, this work aims to evaluate and compare the performance of three IMUs (two navigation and one tactical grade) for scalar airborne gravimetry. Data collected during a two-day airborne campaign were used to assess the performance of the different inertial systems. Due to the characteristics of the campaign flights, the results are analyzed with four different methods: assessment of the internal accuracy of each IMU by comparing two overlapping flight lines; evaluation of the system’s repeatability at crossover points; using the higher quality IMU as a reference to evaluate the performance of the other two; and comparison of each system with an upward continued reference.

2 Strapdown Airborne Gravimetry

According to Newton’s second law of motion, the gravity field of the Earth can be expressed with respect to a local-level reference frame (l-frame) as (Jekeli 2001):

$${\delta {\mathbf{g}}^{\text{l}} = {\dot{\mathbf{v}}}_{\text{e}}^{\text{l}} - {\mathbf{C}}_{\text{b}}^{\text{l}} {\mathbf{a}}^{\text{b}} + \left( {2{\varvec{\Omega}}_{\text{ie}}^{\text{l}} + {\varvec{\Omega}}_{\text{el}}^{\text{l}} } \right){\mathbf{v}}_{\text{e}}^{\text{l}} - {\varvec{\upgamma}}^{\text{l}} }$$
(1)

where the superscript l denotes the use of the aforementioned local frame and the subscripts e, b, and i denote an Earth-fixed frame (e-frame), body frame (b-frame), and inertial frame (i-frame), respectively; \(\delta {\mathbf{g}}^{\text{l}}\) is the gravity disturbance vector; \({\varvec{\upgamma}}^{\text{l}}\) is the normal gravity vector; \({\mathbf{a}}^{\text{b}}\) is the specific force acting on the vehicle; \({{\dot{\mathbf{v}}}_{\text{e}}^{\text{l}} }\) and \({{\mathbf{v}}_{\text{e}}^{\text{l}} }\) are the vehicle’s (kinematic) acceleration and velocity, respectively (both given with respect to the e-frame, but oriented along the l-frame); \({\mathbf{C}}_{\text{b}}^{\text{l}}\) is the direction cosine matrix (DCM) which transforms a vector from the b-frame to the l-frame; and \({\varvec{\Omega}}_{\text{ie}}^{\text{l}}\) and \({{\varvec{\Omega}}_{\text{el}}^{\text{l}} }\) are the skew-symmetric matrix forms of, respectively, the Earth-rate rotation vector and the rotation vector resulting from the vehicle’s motion over the Earth’s surface.

When considering scalar gravimetry, only the vertical component of Eq. (1) is required (Glennie and Schwarz 1999; Wei and Schwarz 1998):

$$\delta g = \dot{v}_{\text{d}} - a_{\text{d}} + \left( {2\omega_{\text{e}} \cos \varphi + \frac{{v_{\text{e}} }}{N + h}} \right)v_{\text{e}} + \frac{{v_{\text{n}}^{2} }}{M + h} - \gamma_{\text{d}}$$
(2)

where \({\dot{v}_{\text{d}} }\) is the vehicle’s kinematic acceleration in the down direction; \(a_{\text{d}}\) is the down component of the specific force; \(\omega_{\text{e}}\) is the Earth’s rotation rate; \(\varphi\) is the latitude; \(h\) is the height above the reference ellipsoid; \(v_{\text{e}}\) and \(v_{\text{n}}\) are the east and north components of the vehicle’s velocity; \(N\) and \(M\) are the radii of curvature in the prime vertical and in the meridian, respectively; and \(\gamma_{\text{d}}\) is the down component of the normal gravity.
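As an illustration of Eq. (2), the following minimal Python sketch evaluates the scalar gravity disturbance from its individual terms. The WGS84 ellipsoid constants and the function names are assumptions made for this example (the text does not name the reference ellipsoid), and the normal gravity \(\gamma_{\text{d}}\) is expected as an input rather than computed.

```python
import math

# WGS84 constants (an assumption; the text does not name the ellipsoid)
A = 6378137.0            # semi-major axis [m]
E2 = 6.69437999014e-3    # first eccentricity squared
OMEGA_E = 7.292115e-5    # Earth rotation rate [rad/s]


def radii_of_curvature(lat):
    """Return (M, N): meridian and prime-vertical radii of curvature [m]."""
    w = math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    M = A * (1.0 - E2) / w ** 3
    N = A / w
    return M, N


def gravity_disturbance(vdot_d, a_d, v_e, v_n, lat, h, gamma_d):
    """Evaluate Eq. (2). SI units throughout; lat in radians."""
    M, N = radii_of_curvature(lat)
    # Velocity-dependent (Eotvos-type) terms of Eq. (2)
    velocity_terms = (2.0 * OMEGA_E * math.cos(lat) + v_e / (N + h)) * v_e \
        + v_n ** 2 / (M + h)
    return vdot_d - a_d + velocity_terms - gamma_d
```

For a stationary vehicle the velocity terms vanish and the result reduces to the difference between the sensed gravity (\(-a_{\text{d}}\)) and the normal gravity. Multiplying the output by \(10^{5}\) converts it from m/s² to mGal.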

There are essentially two methods for determining the gravity disturbance with an IMU. The first one is the so-called accelerometry approach, typically employed by the University of Calgary group in the early 2000s (Bruton 2000; Glennie and Schwarz 1999). This approach starts by combining the IMU and differential GNSS (DGNSS) data using an extended Kalman filter (EKF), where the sensors’ errors are estimated as part of the state vector and used for correcting the IMU raw measurements. Then, the gravity disturbance is obtained by taking the difference between the DGNSS-derived acceleration and the corrected IMU specific force as in Eq. (2). Finally, the results are low-pass-filtered.

The second one is the so-called inertial navigation approach, where the gravity disturbance is determined from the navigation solution, either by extracting it from the acceleration residuals (Kwon and Jekeli 2001; Li 2011) or by stochastically modeling it as an additional system state (Bastos et al. 2002; Deurloo 2011; Tomé 2002). The latter method is considered in this work, with \(\delta g\) being modeled as a stochastic process.

In the navigation approach, the EKF is the core of the estimation process. The EKF has a predictor–corrector structure which can thus be divided in two distinct steps: propagation and update. For the propagation step, the system model (i.e., the inertial navigation equations) is fed by the accelerometer and gyro measurements and integrated over time to predict the navigation solution (position, velocity, and attitude).

The full set of inertial navigation equations is written in the l-frame:

$${\dot{\mathbf{r}}}_{\text{e}}^{\text{l}} = {\mathbf{D}}^{ - 1} {\mathbf{v}}_{\text{e}}^{\text{l}}$$
(3)
$${\dot{\mathbf{v}}}_{\text{e}}^{\text{l}} = {\mathbf{C}}_{\text{b}}^{\text{l}} {\mathbf{a}}^{\text{b}} + {\mathbf{g}}^{\text{l}} - \left( {2{\varvec{\Omega}}_{\text{ie}}^{\text{l}} + {\varvec{\Omega}}_{\text{el}}^{\text{l}} } \right){\mathbf{v}}_{\text{e}}^{\text{l}}$$
(4)
$${\dot{\mathbf{C}}}_{\text{b}}^{\text{l}} = {\mathbf{C}}_{\text{b}}^{\text{l}} {\varvec{\Omega}}_{\text{lb}}^{\text{b}}$$
(5)

where \({\mathbf{r}}_{\text{e}}^{\text{l}}\) is the position vector; D is a matrix that relates Cartesian and geodetic coordinates; \({\mathbf{g}}^{\text{l}} = {\varvec{\upgamma}}^{\text{l}} + \delta {\mathbf{g}}^{\text{l}}\) is the gravity vector; and \({\varvec{\Omega}}_{\text{lb}}^{\text{b}}\) is the skew-symmetric matrix form of the rotation vector of the b-frame with respect to the l-frame, expressed in the b-frame. All other elements have been defined before. Here, the accelerometer and gyro (3D) biases \({\mathbf{b}}_{\text{a}}\) and \({\mathbf{b}}_{\omega }\), respectively, are modeled as random constants and the gravity disturbance as a random walk:

$${\dot{\mathbf{b}}}_{\text{a}} = \mathbf{0}$$
(6)
$${\dot{\mathbf{b}}}_{\omega } = \mathbf{0}$$
(7)
$$\delta \dot{g} = w_{{\delta g }}$$
(8)

where \(w_{{\delta g }}\) is the white noise driving the random walk process. Although there are more complex stochastic models, such as the third-order Gauss–Markov (Jekeli 1994; Kwon and Jekeli 2001), it was shown by Deurloo (2011) that the random walk model produces similar or even better gravity disturbance estimates. In addition, the filter tuning becomes easier with this simpler model because only one instead of three noise parameters needs to be estimated.

The design of the EKF is based on the linearization of the system model (Eqs. 3–5 augmented with Eqs. 6–8) using a first-order Taylor series expansion. It is assumed that the state vector residuals follow a normal distribution and that a linear system model can be applied (Groves 2008):

$$\delta {\dot{\mathbf{x}}} = {\mathbf{F}}\delta {\mathbf{x}} + {\mathbf{Gw}}$$
(9)

where F is the system matrix; G is the system noise distribution matrix; and w is the system noise vector. The error state vector \(\delta {\mathbf{x}}\) is defined as:

$$\delta {\mathbf{x}} = \left[ {\begin{array}{*{20}c} {\delta {\mathbf{r}}_{\text{e}}^{\text{l}} } & {\delta {\mathbf{v}}_{\text{e}}^{\text{l}} } & {{\varvec{\uppsi}}^{\text{l}} } & {{\varvec{\upvarepsilon}}_{{{\text{b}}_{\text{a}} }} } & {{\varvec{\upvarepsilon}}_{{{\text{b}}_{\omega } }} } & {{{\upvarepsilon}}_{{\delta g}} } \\ \end{array} } \right]^{\text{T}}$$
(10)

where each element of \(\delta {\mathbf{x}}\) corresponds, respectively, to the errors in position, velocity, orientation, accelerometer bias, gyro bias, and gravity disturbance.

For the update step of the EKF, DGNSS-derived positions and velocities are used for correcting the navigation solution, sensor biases, and gravity disturbance (loosely coupled integration). The difference between the GNSS observation (position and velocity) and IMU prediction \(\delta {\mathbf{z}}\) is modeled as:

$$\delta {\mathbf{z}} = {\mathbf{H}}\delta {\mathbf{x}} + {\mathbf{v}}$$
(11)

where H is the measurement matrix describing how \(\delta {\mathbf{z}}\) varies with \(\delta {\mathbf{x}}\) and v is the observation white noise. Note that H includes the usual lever-arm compensation for the distance between the center of navigation of the inertial systems and the phase center of the GNSS antenna.

3 Data and Processing

GEOid over MADeira (GEOMAD) was an airborne gravimetry campaign carried out in the summer of 2010, aiming to compute a new local geoid for the island of Madeira with an accuracy better than 10 cm (Bos et al. 2011). During the campaign, two main flights were flown, covering the island with six roughly east–west flight lines and eight roughly north–south flight lines (see Fig. 1). The flight lines were spaced <10 km apart and had a combined length of around 1700 km. The east–west lines (lines 1–6) were flown on August 27 at a constant height of 3000 m, while the north–south lines (lines 8–14) were flown on August 31 at a constant height of 2600 m. The minimum height of these flights was set by the topography of Madeira Island, which has its highest point at around 1860 m. The flight speed was 110 m/s for both days.

Fig. 1 Flight lines of the GEOMAD campaign

The flight tests were performed with an ATR42 aircraft from the Service des Avions Français Instrumentés pour la Recherche en Environnement (SAFIRE). The GNSS/IMU equipment consisted of a dual-frequency GNSS receiver and three IMUs with fiber-optic gyros (FOG): the navigation-grade systems, iXSea AIRINS and iMAR iNAV-FMS, and the tactical-grade Litton LN-200. Table 1 presents the noise and bias stability specified by the manufacturer for the three IMUs. The systems were attached closely together to the same mounting plate near the center of mass of the aircraft. The sampling rate was 1 Hz for the GNSS receiver, 100 Hz for iXSea, 400 Hz for iMAR, and 200 Hz for Litton. GNSS reference data (sampled at 1 Hz) came from a station on the nearby island of Porto Santo, meaning that the baseline between station and aircraft was always <120 km. Due to an error in the data-logging system, flight lines 1 and 7 were not used in this comparison.

Table 1 Noise and bias stability of the three IMUs according to the manufacturer’s specifications

The data from the two main flights were processed following the Rauch–Tung–Striebel (RTS) method, where the EKF described in Sect. 2 is run forward in time and, once the end of the dataset is reached, a backward (smoothing) filter is applied (Groves 2008). Note that GNSS positions were derived from double-differenced carrier-phase pseudoranges, which were used to compute the GNSS velocities with a fifth-order central-difference differentiator (Deurloo 2011).
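The forward-filter and backward-smoother structure can be sketched on a deliberately reduced problem: a single scalar state modeled as a random walk and observed directly with white noise. This is not the full EKF of Sect. 2; the function name, the direct observation model, and the default initial values are assumptions chosen only to show the two passes.

```python
def kalman_rts(z, q, r, x0=0.0, p0=1.0e6):
    """Forward Kalman filter plus RTS backward pass for a scalar random walk.

    z : list of measurements
    q : process-noise variance added per step
    r : measurement-noise variance
    Returns (smoothed states, smoothed variances).
    """
    x_pred, p_pred, x_filt, p_filt = [], [], [], []
    x, p = x0, p0
    for zk in z:
        # Propagation: a random walk keeps the state and inflates the variance.
        p = p + q
        x_pred.append(x)
        p_pred.append(p)
        # Update with the new measurement.
        k = p / (p + r)
        x = x + k * (zk - x)
        p = (1.0 - k) * p
        x_filt.append(x)
        p_filt.append(p)

    # Backward (smoothing) pass, started from the last filtered estimate.
    x_s, p_s = x_filt[:], p_filt[:]
    for i in range(len(z) - 2, -1, -1):
        c = p_filt[i] / p_pred[i + 1]  # smoother gain (transition factor is 1)
        x_s[i] = x_filt[i] + c * (x_s[i + 1] - x_pred[i + 1])
        p_s[i] = p_filt[i] + c * c * (p_s[i + 1] - p_pred[i + 1])
    return x_s, p_s
```

Because the backward pass lets every epoch benefit from later measurements, the smoothed variance at the start of the series is smaller than the variance the forward filter alone would report there.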

The main challenge of modeling the gravity field as a stochastic process along with system biases lies in properly choosing the a priori estimates of the covariance matrices (i.e., error, system noise, and observation noise covariance matrices). Choosing wrong a priori information for these matrices will result in poor estimates of gravity (Kwon and Jekeli 2001; Schwarz and Li 1997).

Kalman filter designers usually rely on serial tuning to find the close to optimal EKF parameters for a specific sensor and/or scenario. In this type of tuning, the designer interrupts the GNSS update several times for a few seconds and takes discrete steps in the tuning parameters, while evaluating the position mean drift after the GNSS simulated outages (Goodall 2009). However, position drifts cannot be used as a tuning reference in airborne gravimetry because the effect of measurement errors on positioning and gravity determination is different. Gravity contains high-frequency noise which can be negligible for positioning because of the smoothing effect of the double integration (Schwarz 2006). Nevertheless, poorly chosen a priori parameters will have a significant negative effect on the crossover points. Therefore, the agreement at the crossover points will be used to evaluate the results of the serial tuning procedure. It is important to note, though, that this crossover point-based serial tuning cannot be applied to flight profiles that do not yield crossing points. However, if the gravity field and the accelerometer noise do not change too much between lines, the tuning parameters will practically be the same for each line. The latter requirement is obviously related to the quality of IMU and flight characteristics.
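A serial tuning loop of this kind can be sketched as a one-parameter-at-a-time search over discrete candidate values. The function below is only an illustration under stated assumptions: `cost` is a hypothetical user-supplied callable that would run the filter with the trial parameters and return the RMS agreement at the crossover points, and the candidate grids are chosen by the user.

```python
def serial_tune(initial, grids, cost):
    """Tune parameters one at a time, in the order given by `grids`.

    initial : dict of starting values
    grids   : list of (name, candidate_values) pairs, in tuning order
    cost    : callable mapping a parameter dict to a scalar to be minimized
              (hypothetical; e.g., RMS agreement at the crossover points)
    Returns (best parameter dict, best cost).
    """
    params = dict(initial)
    best = cost(params)
    for name, candidates in grids:
        for value in candidates:
            trial = dict(params, **{name: value})
            c = cost(trial)
            if c < best:
                # Keep the improved value before moving to the next parameter.
                best, params = c, trial
    return params, best
```

Each parameter is fixed at its best value before the next one is explored, so the result depends on the tuning order and is close to optimal rather than guaranteed optimal.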

The GNSS position noise and gyro bias standard deviation are normally the first and second parameters to be tuned (Goodall 2009). In airborne gravimetry, the most important parameters to tune are the accelerometer and gravity disturbance noises, as the balance between these quantities is governed by the behavior of the gravity field and flight characteristics. In this work, the gravity disturbance random walk standard deviation was fixed to 5.0 \({\text{mGal/}}\sqrt {\text{s}}\), whereas the accelerometer noise was allowed to change in the serial tuning process. The filter was initialized with the manufacturer’s specifications, and four parameters were tuned in order as follows: accelerometer noise, GNSS position noise, GNSS velocity noise, and gyro noise. Table 2 shows the EKF parameters resulting from the tuning procedure, except for iMAR (this will be discussed in Sect. 4). After tuning the EKF for each IMU, the results were low-pass-filtered using a second-order Butterworth filter with different cutoff lengths: 30, 60, 90, and 120 s.
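The final low-pass step can be illustrated with a second-order Butterworth filter designed through the bilinear transform. This sketch makes two assumptions that the text does not confirm: the cutoff frequency is taken as the reciprocal of the cutoff length, and the filter is applied once, forward in time. The sampling rate `fs` of the filtered series is a parameter of the example.

```python
import math


def butter2_lowpass_coeffs(cutoff_s, fs):
    """Second-order Butterworth low-pass via the bilinear transform.

    cutoff_s : cutoff length [s] (cutoff frequency assumed to be 1 / cutoff_s)
    fs       : sampling rate of the series [Hz]
    Returns (b, a) with a[0] == 1.
    """
    k = math.tan(math.pi * (1.0 / cutoff_s) / fs)  # pre-warped cutoff
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (1.0, 2.0 * (k * k - 1.0) * norm,
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)
    return b, a


def lowpass(x, b, a):
    """Apply the filter once, forward in time (direct form I)."""
    y = []
    for n, xn in enumerate(x):
        yn = b[0] * xn
        if n >= 1:
            yn += b[1] * x[n - 1] - a[1] * y[n - 1]
        if n >= 2:
            yn += b[2] * x[n - 2] - a[2] * y[n - 2]
        y.append(yn)
    return y
```

A call such as `lowpass(series, *butter2_lowpass_coeffs(30.0, 1.0))` would apply the 30-s filter to a 1 Hz series. A single forward pass introduces a phase delay; running the filter forward and then backward is a common way to cancel it, at the price of a steeper effective roll-off.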

Table 2 Crossover point-based serial tuning resulting parameters, except for iMAR (see Sect. 4)

For both flights, iMAR gravity disturbance estimates showed a clear drift with respect to the European Improved Gravity model of the Earth by New techniques 6C4 (EIGEN-6C4) developed by Förste et al. (2014), see Fig. 2a. The model has an expected accuracy of 5 mGal. By subtracting EIGEN-6C4 long wavelengths from iMAR’s estimates (Fig. 2b), it can be seen that the drift and bias change each time the aircraft makes a turn. Since the accelerometer bias and gravity disturbance are only separately observable when there is acceleration in the horizontal plane (which is not the case for a typical airborne gravimetry flight line, see, e.g., Schwarz and Li 1997), a time-varying accelerometer bias can be the cause of this behavior. This hypothesis is strengthened by an 8-h stationary test (Fig. 3), where the accelerometer z-axis readings appear to suffer from a temperature-dependent drift. Unfortunately, it was not possible to retrieve the temperature from this particular sensor and, thus, this assumption cannot be confirmed. Hence, a least squares method was used to determine a drift and bias for each flight line (as depicted in Fig. 2b) in order to correct the gravity disturbance estimates of iMAR. Note that the aircraft turns were excluded from the adjustment due to the inherent noise increase.
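The per-line correction can be sketched as an ordinary least-squares straight-line fit to the residuals of one flight line. The function names are hypothetical, and the inputs are assumed to contain only the samples of a single line with the turns already removed.

```python
def fit_drift_and_bias(t, residual):
    """Least-squares fit of residual ~ bias + drift * (t - t[0]).

    t        : times along one flight line [s]
    residual : IMU estimate minus reference at the same epochs [mGal]
    Returns (bias [mGal], drift [mGal/s]).
    """
    n = len(t)
    tau = [ti - t[0] for ti in t]
    mean_tau = sum(tau) / n
    mean_res = sum(residual) / n
    sxx = sum((x - mean_tau) ** 2 for x in tau)
    sxy = sum((x - mean_tau) * (y - mean_res) for x, y in zip(tau, residual))
    drift = sxy / sxx
    bias = mean_res - drift * mean_tau
    return bias, drift


def remove_drift_and_bias(t, estimate, bias, drift):
    """Subtract the fitted line from the gravity disturbance estimates."""
    return [g - (bias + drift * (ti - t[0])) for ti, g in zip(t, estimate)]
```

A straight line is the simplest model that absorbs both an offset and a slow drift over one line; it cannot represent a drift that bends within the line.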

Fig. 2 Day 27 flight profile results: (a) EIGEN-6C4 vs iMAR gravity disturbance estimates; (b) iMAR residuals (considering EIGEN-6C4 as a reference) with a straight line fitted to each flight line (aircraft turns were excluded)

Fig. 3 iMAR iNAV-FMS accelerometer Z-axis stationary test (with the mean subtracted)

4 Results and Analysis

4.1 Comparison of two overlapping flight lines

The internal accuracy of each IMU was assessed by comparing the two overlapping flight lines 4 and 5 (see Fig. 1). As pointed out by Wei and Schwarz (1998), this should be close to the actual accuracy of the system because most errors are only a function of time. Table 3 presents the bias-free RMS agreement between the two overlapping flight lines for different cutoff filter lengths (30, 60, 90, and 120 s). Assuming that both lines have the same accuracy, the standard deviation for a single line was computed by dividing the RMS by \(\sqrt 2\). As expected, iXSea performed much better than the other two IMUs and was able to retrieve high-resolution information (half-wavelength of 1.7 km) with an accuracy of 2.1 mGal. Surprisingly, the performance of iMAR was quite similar to Litton’s, with standard deviations of 4.3 and 4.8 mGal, respectively, for a filter length of 30 s. In general, the accuracy improved as the cutoff filter length increased and, thus, the best results were achieved with the 120-s filter (half-wavelength of 6.6 km). When considering this filter length, the standard deviation was 1.5 mGal for iXSea, 4.1 mGal for iMAR, and 4.6 mGal for Litton.
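The quantities used in this comparison can be written out explicitly. The sketch below computes a bias-free RMS between two lines sampled at matching positions, the single-line standard deviation under the equal-accuracy assumption, and the half-wavelength resolution implied by a cutoff length and the flight speed. The function names are hypothetical, and the two input lists are assumed to be already aligned.

```python
import math


def bias_free_rms(line_a, line_b):
    """RMS of the differences between two lines after removing their mean offset."""
    d = [a - b for a, b in zip(line_a, line_b)]
    mean_d = sum(d) / len(d)
    return math.sqrt(sum((x - mean_d) ** 2 for x in d) / len(d))


def single_line_std(rms):
    """Standard deviation of one line, assuming both lines are equally accurate."""
    return rms / math.sqrt(2.0)


def half_wavelength_km(cutoff_s, speed_ms):
    """Half-wavelength spatial resolution for a cutoff length and flight speed."""
    return 0.5 * cutoff_s * speed_ms / 1000.0
```

With the flight speed of 110 m/s, `half_wavelength_km(30, 110)` gives 1.65 km and `half_wavelength_km(120, 110)` gives 6.6 km, which matches the 1.7 and 6.6 km quoted for the 30- and 120-s filters.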

Table 3 Comparison between the gravity disturbance estimates of the overlapping flight lines 4 and 5 for different cutoff filter lengths (mGal)

4.2 RMS and standard deviation at the crossover points

The RMS and standard deviation at the crossover points were computed in order to evaluate the systems’ day-to-day repeatability. As described in Sect. 3, these statistics resulted from the crossover point-based serial tuning process. Table 4 shows the bias-free RMS agreement at the crossover points calculated for the same cutoff filter lengths as the previous comparison. Given that evaluating different flight profiles at the crossover points is also a method of internal accuracy assessment, the standard deviation for a single point was computed by dividing the RMS by \(\sqrt 2\). As can be seen, the results of iXSea and Litton for the 30-s filter are very similar to those obtained with the overlapping flight lines, showing that both systems are providing consistent gravity disturbance estimates. However, as opposed to the overlapping flight lines comparison, the RMS agreement at the crossover points decreases with the cutoff filter length. This may be explained by the fact that overlapping flight lines have, naturally, experienced the same spatial variation in gravity, which means that the frequency spectrum is similar for both lines. Since the difference between both spectra is only due to the system’s accuracy, the agreement between lines improves as high frequencies are removed. In the case of the crossover points, the crossing lines (most of the time) do not experience the same spatial variation in gravity. Thus, if the amplitudes of the lower frequencies differ too much between crossing lines and assuming that higher frequencies do not suffer from large inaccuracies (i.e., high levels of noise), the agreement at the crossover points will decrease as the cutoff filter length increases.

Table 4 Comparison between the gravity disturbance estimates at crossover points for different cutoff filter lengths (mGal)

Unfortunately, iMAR did not show the same consistency at the crossover points as the other two IMUs. The standard deviation for the 30-s filter was about 0.7 mGal higher than that obtained in the comparison of lines 4 and 5. As demonstrated in Sect. 4.3, the degradation of the accuracy is caused by several lines in which iMAR did not agree well with iXSea.

Furthermore, Table 4 shows that, as short wavelengths are removed from iMAR, the standard deviation at the crossover points does not increase. This means that the IMU is not able to provide much relevant information at short wavelengths. The agreement only starts to decrease if even higher cutoff filter lengths are considered (e.g., 180- and 200-s filter lengths yield accuracies of 5.0 and 5.4 mGal, respectively). Similarly, Litton does not seem to capture much gravity information in wavelengths shorter than 10 km (i.e., cutoff filter lengths lower than 90 s). This result was expected given that the signal-to-noise ratio of such a system typically drops below 1 at a wavelength of 5 km for mountainous areas (Deurloo et al. 2012).

Since the EKF has low-pass filter characteristics, one might think the results were over-smoothed by the choice of tuning parameters. However, this is unlikely to happen because, as discussed before, over-smoothing the results or, in other words, removing shorter wavelengths will have a negative effect on the crossovers. In addition, when the crossover point-based serial tuning was first performed on iMAR’s data, the resulting accelerometer noise was 130 and 100 \(\upmu{\text{g/}}\sqrt {\text{Hz}}\) for days 27 and 31, respectively. With these noise specifications, the results of iMAR not only followed the same pattern as those of iXSea and Litton (i.e., the standard deviation increases with the filter length) but also yielded more accurate estimates for shorter filter lengths (4.6 mGal for both 30- and 60-s filters). Therefore, choosing lower accelerometer noise values only makes the estimates noisier and that explains why the standard deviation decreases when higher cutoff filter lengths are considered. The accelerometer noise values presented for iMAR in Table 2 (80 \(\upmu{\text{g/}}\sqrt {\text{Hz}}\)) were kept in order to emphasize the fact that over-smoothing is not a major issue in this tuning procedure and to handle some inconsistencies observed when comparing iMAR with iXSea (see Sect. 4.3).

The crossover point-based serial tuning also improved the initial results presented in Deurloo (2011), where the data from iXSea and Litton (iMAR was not considered) were processed using the manufacturer’s specifications (Table 1). The agreement of Litton’s data at the crossover points was improved by 19 % when considering the 60-s filter. Although Deurloo (2011) did not show this comparison for iXSea, the crossover point-based serial tuning yielded an agreement improvement of about 35 % for this IMU. These improvements are interesting numerical examples of how gravity estimates are degraded when wrong a priori EKF parameters are considered.

4.3 Comparison of Litton and iMAR with iXSea

Both iMAR and Litton were further evaluated by taking iXSea as a reference. The underlying assumption is that, since the iXSea is more accurate, the resulting gravity estimates are closer to the actual Earth’s gravity values and can be used as a reference for the data obtained with the other two IMUs. The RMS of the residuals (with the bias removed) for each line is presented in Tables 5 and 6 for iMAR and Litton, respectively. The standard deviation of each IMU was computed by subtracting in quadrature the standard deviation of iXSea (Table 3) from the weighted mean RMS (propagation of errors). Once again, the results obtained for Litton are consistent with previous comparisons, highlighting the ability of this IMU to retrieve gravity estimates with an accuracy of better than 5 mGal.
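Subtraction in quadrature follows from error propagation: if the errors of the two systems are uncorrelated, the variance of their difference is the sum of the individual variances, so the variance of one system is the variance of the difference minus that of the reference. A minimal sketch, with a hypothetical function name:

```python
import math


def subtract_in_quadrature(rms_difference, sigma_reference):
    """Return sqrt(rms_difference**2 - sigma_reference**2).

    Assumes the errors of the two systems are uncorrelated, so that their
    variances add in the difference.
    """
    var = rms_difference ** 2 - sigma_reference ** 2
    if var < 0.0:
        raise ValueError("reference standard deviation exceeds the RMS difference")
    return math.sqrt(var)
```

With purely illustrative numbers (not taken from the tables), an RMS difference of 5.0 mGal and a reference standard deviation of 3.0 mGal give 4.0 mGal. The result is sensitive to the reference value: an optimistic reference standard deviation inflates the estimate for the system under test.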

Table 5 Comparison between iMAR and iXSea gravity disturbance estimates for different cutoff filter lengths (mGal)
Table 6 Comparison between Litton and iXSea gravity disturbance estimates for different cutoff filter lengths (mGal)

The comparison of iMAR with iXSea revealed mean standard deviations of 5.4–5.5 mGal for the various filter lengths. These are about 0.6 mGal higher than the previous standard deviation values using the crossover points, see Table 4. Although line 2 is clearly contaminating the overall accuracy of the IMU, there are several lines in which the RMS difference with respect to iXSea exceeds 6.0 mGal. A possible explanation for this lower accuracy could be that the drift in such lines is not well approximated by a linear function. In addition, as the cutoff frequency decreases, the accuracy of several lines degrades, which means that longer wavelengths contain the major part of the error.

As previously mentioned, after processing iMAR’s data with the information derived from the crossover point-based serial tuning, some inconsistencies were observed. Although the agreement at the crossover points was better (between 4.6 and 4.7 mGal) and decreased with higher cutoff filter lengths, the standard deviation computed in this comparison for iMAR was around 6.0 mGal. This may be further evidence that the drift affecting iMAR’s accelerometers is highly nonlinear. The crossover point-based tuning is basically adapting the filter parameters, and consequently the gravity estimates, to just 28 (crossover) points of a long time series (>8000 points) corrupted with a nonlinear function. This would not be an issue if all data points were drift-free or, at least, if more crossing points were available. In the latter case, the results would probably be more consistent but the accuracy would not improve.

4.4 Comparison of each IMU with an upward continued reference

Finally, Tables 7, 8, and 9 show the comparison between the IMUs and an upward continued reference computed from 85 terrestrial gravity observations on Madeira. As mentioned in Sect. 3, Madeira is very mountainous with the highest peak around 1860 m. These mountains create a large gravitational attraction and are the main cause for the 200-mGal signal that is shown in Fig. 2. In addition, the mountains are very steep, and, with a wavelength of 18 km, EIGEN-6C4 is unable to capture the high gradients. Therefore, using a digital terrain model based on the Shuttle Radar Topography Mission (Farr et al. 2007), the gravitational attraction caused by the short wavelengths due to the topography was computed and added as a terrain correction (TC) to the values of EIGEN-6C4. The terrestrial gravity observations were then corrected for EIGEN-6C4 + TC, a spatial auto-covariance function of the residual gravity field was estimated, and, using least squares collocation (LSC) (Moritz 1980), the residual gravity observations were upward continued to flight altitude.

Table 7 Comparison between EIGEN-6C4 + TC + LSC and iXSea gravity disturbance estimates for different cutoff filter lengths (mGal)
Table 8 Comparison between EIGEN-6C4 + TC + LSC and iMAR gravity disturbance estimates for different cutoff filter lengths (mGal)
Table 9 Comparison between EIGEN-6C4 + TC + LSC and Litton gravity disturbance estimates for different cutoff filter lengths (mGal)

The model’s standard deviation was computed by subtracting in quadrature the standard deviation of the IMU from the weighted mean RMS (propagation of errors). The standard deviation considered for iXSea was derived from the comparison of lines 4 and 5 (Table 3), while the standard deviations considered for iMAR and Litton were derived from the comparison with iXSea (Tables 5, 6, respectively). Figure 4 gives a visual impression of the agreement between EIGEN-6C4 + TC + LSC and the sensors’ estimates (without low-pass filtering).

Fig. 4 Gravity disturbance estimates (without low-pass filtering) of iXSea, iMAR, Litton, and EIGEN-6C4 + TC + LSC for (a) day 27 and (b) day 31 flight lines. The IMUs’ estimates are shifted on the vertical axis for clarity of representation

As presented in Tables 7, 8, and 9, the model’s estimated accuracy ranges between 3.0 and 4.5 mGal. These results are not only in accordance with the expected accuracy of EIGEN-6C4 + TC + LSC, but also show the consistency of the gravity information derived from the IMUs’ measurements. Applying short-wavelength geophysical corrections to EIGEN-6C4 also helped validate iXSea’s estimates, which were considered the reference in the previous comparison. Taking the 30-s filter as an example, the RMS agreement between EIGEN-6C4 and iXSea improved by 1.0 mGal after adding TC (from 6.2 to 5.2 mGal) and by 0.3 mGal after adding LSC (from 5.2 to 4.9 mGal).

It is important to mention, though, that the model’s standard deviation determined with iMAR and Litton is about 1–1.5 mGal lower than the one obtained with iXSea. This behavior suggests that the accuracy of iXSea may be a little overestimated, which would consequently lead to the underestimation of the other sensors’ standard deviation. As can be seen, e.g., in Tables 6 and 7, the RMS agreement of iXSea with Litton is very similar to the RMS agreement achieved with EIGEN-6C4 + TC + LSC for the filter lengths considered. Hence, the results of Tables 6, 7, and 9 should reflect the fact that Litton and EIGEN-6C4 + TC + LSC have an identical accuracy.

Moreover, since the Butterworth filter was not applied to the model, its standard deviation determined with the IMUs’ data was expected to remain constant as the filter length increased. The reduced agreement for lower cutoff filter lengths may thus be a further indicator that both iMAR and Litton are not able to retrieve viable information at shorter wavelengths.

5 Conclusions

Using the data obtained during an airborne gravimetry campaign over Madeira in 2010, three inertial systems were evaluated under the same flight conditions. The IMUs compared were the navigation-grade iXSea AIRINS and iMAR iNAV-FMS, and the tactical-grade Litton LN-200. This work also shows for the first time that iXSea and iMAR can be used for strapdown airborne gravimetry.

The high-performance iXSea provided gravity disturbance estimates with accuracies of 2.1 and 1.6 mGal for 1.7 and 5.0 km of spatial resolution (half-wavelength), respectively. These results are within the range expected for a high-quality navigation-grade inertial system (Bruton 2000; Kwon and Jekeli 2001; Schwarz 2006). Furthermore, iXSea’s gravity estimates were used to derive a new local geoid for Madeira. Large misfits of 15 cm were observed when comparing the new local geoid with GPS/levelling data over the island. Similar values were presented before by Catalão and Sevilla (2009) and, therefore, the problem may come from the GPS/levelling data; a detailed investigation falls outside the scope of the current research.

Extracting the gravity signal from iMAR was not straightforward, and a gravity reference was needed to model and remove the nonlinear drift present in the data. This somewhat limits the use of this IMU for airborne gravimetry. Nevertheless, after properly dealing with the nonlinear drift, iMAR should be able to provide gravity information with accuracies similar to those obtained in the overlapping flight lines comparison (around 4.0 mGal for a spatial resolution of 5–7 km). Remarkably, the tactical-grade Litton showed very consistent results and achieved an accuracy of about 4.5 mGal at 5 km of spatial resolution (half-wavelength). Due to its characteristics, such a tactical-grade system can be easily deployed around the world and can be considered more cost-effective for less demanding applications such as regional geoid definition and/or improvement. In fact, for the purpose of the GEOMAD campaign (determine a local geoid with an accuracy of 10 cm), the quality of the observations made using Litton is sufficient. Moreover, the possibility to apply this system in unmanned aerial vehicles (UAVs) can provide new, less expensive, airborne survey options (Deurloo et al. 2012).

The paper also introduced a new crossover point-based serial tuning. The method proved to be extremely useful for modeling the gravity field as a stochastic process in an extended Kalman filter. In this way, both IMU and GNSS datasets are optimally combined and large errors related to a poor choice of a priori information are avoided.