1 Introduction

In contemporary research of the Earth’s gravitational field, gravity measurement on the Earth’s surface plays a crucial role. However, this is a complex task influenced by various factors such as geodynamic processes, geological structure variability, and others. Earth’s gravity field studies are essential in geodesy, geophysics, mineral exploration, volcanic activity research, and more (Mulugeta et al. 2021; Greco and Krasnyy 2021; Bychkov et al. 2021).

In scientific and engineering practice, gravimetry provides valuable data for many research fields and applications. When measuring gravity acceleration, it is necessary to determine both absolute values and differences in this quantity. A vital step in using any measuring instrument is its standardization, which ensures the consistency of measurement results for the same quantity. Ensuring such consistency within a unified system of units is a task of metrology. The absence of a physical standard for gravity acceleration poses challenges for the metrological support of gravimeters.

When measuring gravity acceleration, either its total magnitude (absolute value) or gravity differences in space or time (relative values) are determined. Theoretically, an absolute gravimeter (AG) can be standardized by separately standardizing the device’s length and time standards. In practice, however, a comparison of a group of AGs is used (Wu et al. 2021; Li et al. 2022).

This is because an AG is a complex mechanism. In addition to the inherent uncertainties of the time and length standards, the total uncertainty is burdened with unknown systematic deviations arising at various points of the instrument. Moreover, different types of AGs may introduce their own systematic errors. It is impossible to accurately model all the systematic errors introduced by the components of an AG. Instead, the final measurement results, which already contain all systematic errors, are compared directly. Deviations from the mean value of all participating standard gravimeters are used to assess the reliability of the measurement results. This approach is a more reliable method of standardization. The metrological support strategy for AGs is described in the IAG Strategy for Metrology in Absolute Gravimetry (Marti et al. 2014).

Relative gravimeters (RG) are used for building national or regional gravity networks in some countries, and are also actively used for creating local microgravity networks and surveys. Therefore, calibrating instruments of this type remains a relevant task. Unlike AGs, there is no universally accepted methodology for calibrating modern relative instruments, but there are several well-known methods to determine their metrological characteristics. Calibration of RGs is necessary for several reasons:

  1. When using relative gravimeters to build national gravimetric networks, calibration ensures scaling of the national gravimetric system.

  2. Gravimeters are high-precision instruments subject to various measurement errors. Calibration helps minimize these errors and improve the accuracy of the data obtained with the gravimeter.

  3. Without calibration, it is impossible to compare measurement results obtained with different gravimeters. Calibration ensures data unification, making them comparable.

  4. Calibration allows verifying the functionality of the gravimeter and ensuring that it operates correctly. This helps to detect faults in a timely manner and to carry out the necessary repairs.

  5. In many countries, there are standards that gravimeters used in various fields must adhere to. Calibration guarantees that the gravimeter complies with these standards.

The article discusses the methodology for determining the calibration function and the possible methods for finding it. Next, implementations of the relevant laboratory methods and of the baseline calibration method are considered.

For the field calibration method on the baseline, the processing algorithm and an example of its implementation are discussed, and an overview is provided of open-source software products capable of adjusting the results of gravity observations with RGs.

Gravity observations are being actively developed in the Republic of Kazakhstan, especially in connection with the creation of a national gravity reference frame compliant with international standards. Work on establishing a national gravity reference frame that will meet the requirements of the International Gravity Reference System began in 2023 (Wziontek et al. 2021; MDDIAI 2023).

The number of RGs has increased in Kazakhstan, and an AG has been purchased for the first time. However, the lack of a national standard and calibration system for RGs that meets all modern requirements is a significant problem in developing this field. Therefore, the authors set the task of selecting a calibration method for RGs.

This paper addresses this issue by selecting and evaluating suitable calibration methods for RGs. It discusses the methodology for determining the calibration function, explores various laboratory and field methods, and analyzes data processing techniques. Additionally, it provides an overview of existing calibration systems in different countries and presents a case study on the calibration of RGs in Kazakhstan.

2 Calibration function and methods of its determination

2.1 Calibration function

Static gravimeters are instruments for relative gravity determination by observing the equilibrium position of a body subjected to gravity and compensating forces. Their principle of operation involves comparing gravity force with a constant compensating force and measuring their difference.

Land RGs use a sensor in which the gravity difference is determined through the deformation of a spring. Modern instruments detect this deformation using electrostatic force, producing an electrical voltage output. Therefore, calibration is required to obtain measured values in units of gravity acceleration.

In theory, various sensor and instrument parameters are required for this transition. Still, it is impossible to determine the calibration function \(F\left( z\right)\) based solely on this data. Therefore, \(F\left( z\right)\) is modeled, and the parameters of this model are determined based on reference gravity differences.

Currently, two types of sensors are used in global practice. The first is a lever-spring balance system. The advantage of such a system lies in the possibility of increasing sensitivity with a relatively low resolution of the reading device.

According to Torge (1989) and Timmen (2010), the full calibration function \(g = F(z)\) of such a system can be expressed through polynomial \(F_{poly}\) and periodic \(F_{per}\) terms:

$$\begin{aligned} g = F(z) = F_{poly}(z) + F_{per}(z), \end{aligned}$$
(1)

where \(F_{poly}(z) = N_0 + \sum _{k=1}^m Y_k z^k\), where \(N_0\) is the unknown instrument constant, \(Y_k\) are calibration coefficients of degree k, and \(F_{per}(z) = \sum _{l=1}^{n}A_l\cos (\omega _l z - \varphi _l)\), where \(A_l\) is the amplitude, \(\omega _l\) is the angular frequency, and \(\varphi _l\) is the initial phase of the harmonic of degree l.
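As an illustration, Eq. (1) can be evaluated numerically as in the following minimal Python sketch. The function name `calibration_function` and all numerical values are hypothetical and were chosen only to show how the polynomial and periodic terms combine; they are not parameters of any real instrument.

```python
import math


def calibration_function(z, n0, poly_coeffs, periodic_terms=()):
    """Evaluate g = F(z) = F_poly(z) + F_per(z) as written in Eq. (1).

    z              -- gravimeter reading in instrument units
    n0             -- instrument constant N_0
    poly_coeffs    -- [Y_1, ..., Y_m], coefficients of z**1 ... z**m
    periodic_terms -- iterable of (A_l, omega_l, phi_l) tuples
    """
    f_poly = n0 + sum(y_k * z ** k for k, y_k in enumerate(poly_coeffs, start=1))
    f_per = sum(a * math.cos(omega * z - phi) for a, omega, phi in periodic_terms)
    return f_poly + f_per


# Hypothetical example: linear and quadratic terms plus one periodic term
# with an assumed period of 70 counter units.
g = calibration_function(
    z=1000.0,
    n0=0.0,
    poly_coeffs=[1.0005, 1.0e-9],
    periodic_terms=[(0.005, 2.0 * math.pi / 70.0, 0.0)],
)
print(f"F(z) = {g:.4f}")
```

For an instrument that needs only a first-order polynomial, the same function can be called with a single coefficient and no periodic terms.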

As a rule, gravimeter manufacturers find an approximate calibration function \(F_0(z)\). When performing measurements with the highest precision, the user must determine its non-linear and periodic terms \(\Delta F(z)\) (Lederer 2009):

$$\begin{aligned} g &= F(z) = F_0(z) + N_0 + \Delta F(z) \nonumber \\ &=F_0(z) + N_0 + \sum _{k=1}^{m}{\bar{Y}}_k{\bar{z}}^k + \sum _{l=1}^{n}\left( A_l\cos (\omega _l {\bar{z}}-{\bar{\varphi }}_l)\right) , \end{aligned}$$
(2)

where \({\bar{z}}\) are the approximate readings obtained from the function \(F_0\), and \({\bar{Y}}_k\) and \({\bar{\varphi }}_l\) are corrections to the approximate calibration coefficients.

2.2 Determination of calibration function terms

The polynomial term of F(z) must be modeled because of the nonlinearity of gravimeter sensing systems, which results from the peculiarities of their construction, such as changes in spring diameter or in the lever system. The periodic term arises from errors in the scale calibration of the micrometer screw. Astatized instruments, such as LaCoste & Romberg (LCR) gravity meters and their successors, the ZLS Burris meters (Jentzsch 2008; Jentzsch et al. 2018), are examples of such instruments.

Another type of sensor is a balance with a vertical spring. For such systems, the instrument sensitivity depends directly on the resolution of the reading device. For example, to measure gravity at the level of 10 \(\mu\)Gal (1 Gal = 1 cm/\(\hbox {s}^{2}\)), the mechanical sensitivity to spring length changes should be about 0.1 nm. For a long time, this was an obstacle to using such sensors. However, Scintrex Ltd. has achieved significant success in overcoming this problem. Instruments such as the CG-3, CG-5, and CG-6 use a non-astatized system with a vertical quartz spring capable of performing measurements over the entire range of gravity on the Earth’s surface without any adjustments.

Such a simple sensor construction does not require modeling the periodic term of the calibration function, and only a first-order polynomial term is sufficient. Calibration measurements can be reduced to relative observations between two reliable absolute gravity stations (Seigel 1995; Timmen and Gitlein 2009).

2.3 Estimation of calibration coefficients

Budetta and Carbone (1997) and Jacob et al. (2009, 2010) assessed the stability of the calibration coefficient for CG-3 and CG-5 gravimeters. The change in the calibration coefficient for the CG-5 gravimeter amounted to 1 part per thousand over two years.

A similar assessment of the change in calibration coefficients for CG-3 gravimeters was conducted by Ukawa et al. (2010). Their results identified changes in the coefficients in the range of 100 parts per million. The calibration coefficients changed at a rate of more than 10 parts per million per year, even several years after production. Calibration shifts occur gradually over time, so interpolating calibration coefficients between calibrations is helpful. Significant calibration shifts during measurements of large gravity ranges were not detected.

A similar assessment for CG-3 and CG-5 gravimeters based on 12 years of measurements was carried out by Cheraghi et al. (2020). According to the results, the average value of the coefficients for CG-3 gravimeters is close to 1.000–1.0001, while for the CG-5, the coefficient decreases over time. However, a dependence on the instrument model is not apparent because of the small number of calibrated instruments.

Despite the manufacturers’ claims about the linearity of F(z) for the CG-5, Onizawa (2019) attempted to estimate the second-degree polynomial coefficients of F(z) at absolute gravity stations in the Japan Gravity Standardization Net. Data for calibration were obtained over a range of 1400 mGal from 2012 to 2017. The scale coefficients varied from 0.9991 to 1.0006 depending on the combination of stations.

All studies indicate changes in the calibration coefficients of Scintrex Ltd. gravimeters over time. Often, each instrument has individual characteristics. Nevertheless, variations in the scale coefficient cannot be ignored if high measurement accuracy is to be achieved. In high-precision gravimetric work, they need to be corrected through calibrations before and after the measurements. Since the coefficients change gradually over time, interpolation between calibrations can further improve accuracy. The importance of accounting for changes in gravimeter calibration coefficients is discussed by Yang et al. (2021), who used gravity monitoring to study earthquake mechanisms.
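The interpolation of a scale coefficient between two calibrations can be sketched as follows. This is a minimal illustration that assumes a linear change between the two calibration epochs; the function name `interpolate_scale`, the time unit, and the sample values are hypothetical.

```python
def interpolate_scale(t, t_before, k_before, t_after, k_after):
    """Linearly interpolate a scale coefficient between two calibration epochs.

    t        -- epoch of the survey measurement (any consistent time unit)
    t_before -- epoch of the calibration performed before the survey
    k_before -- scale coefficient obtained at t_before
    t_after  -- epoch of the calibration performed after the survey
    k_after  -- scale coefficient obtained at t_after
    """
    if t_after <= t_before:
        raise ValueError("t_after must be later than t_before")
    if not t_before <= t <= t_after:
        raise ValueError("t must lie between the two calibration epochs")
    fraction = (t - t_before) / (t_after - t_before)
    return k_before + fraction * (k_after - k_before)


# Hypothetical calibrations 10 days apart; survey on day 5.
k_survey = interpolate_scale(5.0, 0.0, 1.000000, 10.0, 1.000200)
print(f"interpolated scale coefficient: {k_survey:.6f}")
```

The function deliberately refuses to extrapolate outside the two calibration epochs, since the behavior of the coefficient there is not constrained by the calibrations.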

In their Earth tide study, Navarro et al. (2021) also clearly demonstrated the importance of carefully studying the CG-5 gravimeters used. In addition to classical calibration on the baseline, they calibrated the CG-5 gravimeters by comparing gravity time series with data from a superconducting gravimeter. The advantage of this approach is that it provides a complete determination of the transfer function of the CG-5 gravimeter by estimating tidal amplitude and phase coefficients in the tidal frequency spectrum. To determine the calibration coefficient, they used two approaches: one in the time domain and the other in the frequency domain. In the time domain, the calibration coefficient was obtained by linear regression between the CG-5 data and the superconducting gravimeter data.
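The time-domain idea can be illustrated with an ordinary least squares fit of the reference series against the relative gravimeter series. The sketch below is not the processing used by the cited authors: it uses a noise-free synthetic signal, ignores drift and phase lag, and the names `fit_scale`, `sg`, and `cg5` are hypothetical.

```python
import math


def fit_scale(relative, reference):
    """Fit reference = offset + scale * relative by ordinary least squares.

    Returns (scale, offset). Both series must be sampled at the same epochs.
    """
    n = len(relative)
    mean_x = sum(relative) / n
    mean_y = sum(reference) / n
    s_xx = sum((x - mean_x) ** 2 for x in relative)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(relative, reference))
    scale = s_xy / s_xx
    offset = mean_y - scale * mean_x
    return scale, offset


# Synthetic tide-like reference signal in microGal, sampled hourly for 3 days.
hours = range(72)
sg = [100.0 * math.sin(2.0 * math.pi * h / 12.42) for h in hours]

# Simulated relative gravimeter: assumed scale error and constant offset.
true_scale = 1.0004
cg5 = [value / true_scale + 5.0 for value in sg]

scale, offset = fit_scale(cg5, sg)
print(f"estimated scale = {scale:.6f}, offset = {offset:.3f} microGal")
```

With real data, instrumental drift and other disturbances would have to be removed or modeled before such a regression is meaningful.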

A similar approach to calibrating relative LCR gravimeters was used in measurements at gravity stations in Antarctica (Fukuda et al. 2021). The studies involved three gravimeters: the absolute FG5, the superconducting iGrav, and the spring LCR. First, using the absolute FG5 gravimeter, the scale coefficient of the superconducting gravimeter was determined through simultaneous measurements over 10 days, during which geophysical effects were not considered. Then, in the same way, based on a series of simultaneous measurements of the superconducting gravimeter and the LCR gravimeter, the scale coefficient of the LCR gravimeter was indirectly determined.

According to the proposed classification by Torge (1989), parameters F(z) can be determined using laboratory and field methods. In laboratory determinations of the calibration function, small increments of gravity are usually applied, and thus, the corresponding small change in gravimeter readings is found. These include the feedback method, the mass change method, and the tilt method.

The only field method for calibration is comparing measured gravity differences with known absolute values or increments at calibration baselines.

The following subsections examine in more detail the current laboratory and field methods for determining the calibration function of RGs.

2.4 Laboratory calibration methods

2.4.1 Feedback system

Using the feedback system, it is possible to quickly and easily determine the non-linearity and cyclic errors of astatized gravimeters by comparing the variable constants of the measuring screw with the feedback voltages at their different positions. Thus, the calibration task becomes the determination of the feedback calibration function (Meurers 1995). For the entire range of the feedback system, increments are obtained as calibration differences between readings of the feedback system and differences in readings on the counter scale (LaCoste 1991).

One of the earliest gravimeters to incorporate a feedback system, in the late 1980s, was the LCR. This technology is used in modern ZLS Burris gravimeters as part of the digital microprocessor-based UltraGrav automatic data acquisition and registration system. The control system reads the balance position and automatically zeroes it. The feedback range is approximately ±25 mGal. If the feedback system cannot zero the balance position, the measuring screw needs to be adjusted manually. Additionally, gravimeters with installed stepper motors for automatic balance adjustment are optionally available. In this case, the balance is directed towards a calibrated point on the scale (approximately every 50 mGal), thereby avoiding cyclic errors across the entire 7000 mGal range of the screw (Jentzsch 2008; Gerlach et al. 2018). Thus, the feedback system is now implemented in all modern gravimeters and is not considered a separate laboratory calibration method.

2.4.2 Mass change

The mass change method was previously used for LCR gravimeters of the D and G models (LCR D and LCR G). In this method, an apparent change in gravity is induced by changing the effective test mass of the gravimeter’s sensor by adding or removing mass (Valliant 1991). However, this method, like the feedback method, is neither universal nor relevant for modern RGs.

2.4.3 Tilt method

The calibration of gravimeters using the tilt method is based on the following relationship. When the gravimeter is tilted at an angle \(\beta\) relative to the horizontal, the counter reading changes proportionally to the apparent decrease in gravity (Kosticyn 2002; Burian et al. 1979).

The required tilt angle is small for quartz gravimeters with a gravity measurement range of 100–150 mGal. To increase the accuracy of the scale division, the error in measuring the tilt angle should be reduced. The gravimeter’s tilt can be determined using a leveling platform, an optical theodolite, a setting plate, or the instrument’s leveling screws.
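The relationship behind the tilt method can be made concrete with a short calculation. When the sensitive axis is tilted by an angle \(\beta\), it senses only the component \(g\cos \beta\), so the apparent decrease in gravity is \(g(1-\cos \beta )\). The Python sketch below applies this formula; the nominal gravity value of 980 Gal and the function names are assumptions made for illustration.

```python
import math

G_NOMINAL_MGAL = 980_000.0  # assumed nominal gravity (about 980 Gal), in mGal


def apparent_gravity_decrease(beta_deg, g_mgal=G_NOMINAL_MGAL):
    """Apparent decrease in gravity (mGal) for a tilt of beta_deg degrees."""
    return g_mgal * (1.0 - math.cos(math.radians(beta_deg)))


def tilt_for_decrease(delta_g_mgal, g_mgal=G_NOMINAL_MGAL):
    """Tilt angle (degrees) that produces a given apparent decrease (mGal)."""
    return math.degrees(math.acos(1.0 - delta_g_mgal / g_mgal))


# Tilt angles needed to cover measurement ranges of 100 and 150 mGal.
for delta_g in (100.0, 150.0):
    beta = tilt_for_decrease(delta_g)
    print(f"{delta_g:.0f} mGal corresponds to a tilt of about {beta:.2f} deg")
```

Under these assumptions, ranges of 100 to 150 mGal correspond to tilts of roughly 0.8 to 1.0 degrees, which is why the tilt angle must be measured very precisely.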

For gravimeters with a horizontal torsion thread (Kozyakova et al. 1979), angles of tilt and scale division of the instruments can be determined using a particular gravimeter calibration setup. These setups allow determining the scale division with a relative error of \(\pm 1\cdot 10^{-4}\), ensuring the determination of gravity increments with an accuracy of 0.01 mGal.

The gravimeter’s scale division is most accurately determined using a leveling platform. Budanov and Evdokimova (1965) proposed using an optical theodolite to determine the tilt angle. The relative error in determining the scale division is \((1-2)\cdot 10^{-3}\) with a maximum tilt angle of up to \(1^\circ\).

The following articles present the results of tilt calibration for Sharp (Kozyakova et al. 1979; Olejník and Träger 2019), Worden (Coutts et al. 1980), LCR (Nakagawa et al. 1985), Sodin (Murray and Tracey 2001) and Soviet GAK and GAG (Bulanzhe 1976) meters. All of these RGs have long been out of production and are now used for surveys only in rare cases. Currently, the tilt method can be applied to calibrate quartz spring gravimeters with a small measurement range, primarily those of Soviet manufacture. This method is not used for modern gravimeters such as the CG-5 or CG-6.

2.4.4 A moving mass

The idea of using a moving mass for calibration emerged in the 1970s. Such a device uses a moving mass (namely, a cylinder) to create gravitational variations, and several such devices have been built.

One of them was developed at the Eötvös Loránd Geophysical Institute’s Geodynamic Observatory in Hungary (Varga et al. 1995; Csapó and Szatmári 1995). The equipment can generate gravitational variations of up to 112 \(\mu\)Gal using a suspended cylindrical ring with a mass of 3200 kg.

Another implementation is located at the Mátyáshegy Observatory. It likewise uses a cylindrical test mass that is moved vertically around the gravimeter by a lifting device (Koppán et al. 2020; Papp et al. 2022). The movement of the 3100 kg cylindrical mass creates a sinusoidal calibration signal with a peak-to-peak amplitude of 110.2 \(\mu\)Gal. A careful analysis of the parameters of the cylindrical mass, combined with analytical modeling of the gravitational effect, ensures an accuracy of 0.3 \(\mu\)Gal. More than 400 experiments were conducted on 5 instruments, including LCR and CG-5 meters.

The moving mass system gives a very accurate result, but only within a small range. It could be quite relevant for meters used to record Earth tides, such as the gPhoneX or iGrav. In addition, such a device is difficult to build, since it requires very precisely manufactured parts, which are expensive.

2.4.5 Artificial acceleration

The first experiment using artificial acceleration for gravimeter calibration, in which an external sinusoidal acceleration of approximately 0.3 mGal was applied to the gravimeter, is described by Valliant (1973). The amplitude and frequency of the gravimeter displacement were used to calculate the applied acceleration and to compare it with the expected response of the gravimeter. The gravimeter displacement was measured using a photodetector. The measurement results obtained with the prototype device (intended only as a feasibility study) showed that the calibration accuracy was 0.1%. With a displacement of ±10 mm, artificial accelerations of up to 1 mGal can be generated.

Its principle is that a platform in sinusoidal motion generates an acceleration that depends linearly on the displacement amplitude of the platform and on the inverse square of the period. The difficulty of using an accelerating platform lies in ensuring its smooth movement, since any deviation from smooth motion can seriously affect the second derivative of the displacement, that is, the acceleration.
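For a purely sinusoidal vertical motion with displacement amplitude \(A\) and period \(T\), the peak acceleration is \(A(2\pi /T)^2\). The sketch below evaluates this expression; the function names are hypothetical, and interpreting the ±10 mm figure as the displacement amplitude is an assumption made for illustration.

```python
import math

MGAL_IN_SI = 1.0e-5  # 1 mGal = 1e-5 m/s^2


def peak_acceleration_mgal(amplitude_m, period_s):
    """Peak acceleration (mGal) of a sinusoidal motion A*sin(2*pi*t/T)."""
    omega = 2.0 * math.pi / period_s
    return amplitude_m * omega ** 2 / MGAL_IN_SI


def period_for_peak(amplitude_m, peak_mgal):
    """Period (s) at which a given amplitude yields a given peak acceleration."""
    return 2.0 * math.pi * math.sqrt(amplitude_m / (peak_mgal * MGAL_IN_SI))


# Assumed amplitude of 10 mm and target peak accelerations of 0.3 and 1 mGal.
for peak in (0.3, 1.0):
    period = period_for_peak(0.010, peak)
    print(f"{peak} mGal with a 10 mm amplitude needs a period of about {period:.0f} s")
```

Because the period enters as an inverse square, halving it quadruples the generated acceleration, which is why timing and smoothness of the motion are critical.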

Richter et al. (1995) review two types of platforms: that of the Institute of Applied Geodesy (IFAG) in Frankfurt and one specially designed for calibrating gravimeters by the Royal Belgian Observatory (ORB) in Brussels.

The Frankfurt system consists of three spindles (2 mm pitch, 30 mm range) driven by stepper motors using microstepping technology (10,000 steps/revolution). The position of the spindles is monitored by built-in glass sensors with a resolution of 0.05 \(\mu\)m, and motion imperfections can be compensated using a digital feedback loop. Each element of the system (spindle, stepper motor and glass sensor) is individually controlled by a computer. The Frankfurt system was used in international comparisons in Sèvres in 1994 (Becker et al. 1995).

The method of artificial acceleration using the “Frankfurt calibration platform” is proposed in the article Wilmes et al. (2006) as one of the options considered for calibrating the GWR R038 superconducting gravimeter installed at the TIGO geodetic fundamental station in Concepción, Chile.

The ORB mechanical eccentric calibration system, which uses the vibrating platform VRR 8601, is described by Ruymbeke (1989). The principle of operation of this platform consists of using a horizontal square platform on two long flat springs and rotating on three cylinders. The two cylinders at the edges have fixed axes, while the axis of the cylinder in the middle can be moved vertically using an eccentric mechanism. When the middle cylinder is lowered, the ends of the springs, which are attached to the platform legs, move upward.

The method of artificial acceleration is also used in the calibration system for RGs included in the Russian State Primary Special Standard for Acceleration in Gravimetry (Vitushkin et al. 2020).

This calibration system can also be used for superconducting gravimeters. However, such systems are very complex and difficult to build.

2.5 Calibration on the baseline

The most common and easily available calibration method is baseline calibration. It can be performed separately on calibration lines or during measurements in gravity networks where both relative and absolute stations are present. In the latter case, calibration coefficients are determined during the overall network adjustment. For this purpose, the calibration function parameters are included as unknowns in the observation equations alongside the drift parameters of the RGs.

This is the primary and most common method used in practice. The calibration lines (systems) parameters depend on the number of stations, the range of gravity, and the precision. Depending on this, linear, non-linear, and periodic terms of the calibration function can be determined, which are mostly computed as a result of adjustment based on the formulas from Sect. 2.6.

Reference differences can be defined on horizontal and vertical gravity calibration lines. Horizontal baselines use the change in gravity along the meridian. A change of 1 degree in latitude changes gravity by \(\sim\)90 mGal. Vertical baselines rely on the change in gravity with height and are usually located in mountainous areas. An elevation change of 1 km results in a change in gravity acceleration of \(\sim\)200 mGal. The drawback of horizontal baselines is that the instruments must be transported over long distances.
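The orders of magnitude quoted for horizontal and vertical baselines can be checked with a rough calculation. The sketch below is an approximation under stated assumptions: the latitude gradient comes from the first-order normal gravity formula with GRS80 constants, and the height gradient along the terrain combines the free-air gradient with a Bouguer plate of assumed density 2.67 g/cm\(^3\). The function names are hypothetical.

```python
import math

GAMMA_EQUATOR_MGAL = 978_032.7        # GRS80 normal gravity at the equator
GRAVITY_FLATTENING = 0.0053024        # GRS80 gravity flattening
FREE_AIR_MGAL_PER_M = 0.3086          # standard free-air gradient
BOUGUER_MGAL_PER_M = 0.04193 * 2.67   # plate term, assumed density 2.67 g/cm^3


def latitude_gradient_mgal_per_deg(lat_deg):
    """Approximate change of normal gravity per degree of latitude.

    Derived from gamma = gamma_e * (1 + f* * sin(lat)**2), whose derivative
    with respect to latitude is gamma_e * f* * sin(2 * lat).
    """
    per_radian = GAMMA_EQUATOR_MGAL * GRAVITY_FLATTENING * math.sin(
        2.0 * math.radians(lat_deg)
    )
    return per_radian * math.pi / 180.0


def surface_height_gradient_mgal_per_km():
    """Approximate magnitude of the gravity decrease per km of ascent on terrain."""
    return (FREE_AIR_MGAL_PER_M - BOUGUER_MGAL_PER_M) * 1000.0


print(f"latitude gradient at 45 deg: {latitude_gradient_mgal_per_deg(45.0):.1f} mGal/deg")
print(f"height gradient on terrain: {surface_height_gradient_mgal_per_km():.0f} mGal/km")
```

The latitude gradient is largest near 45 degrees and vanishes at the equator and the poles, so the suitability of a horizontal baseline depends on where it is located.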

There are more than ten calibration baselines in the world, created by various countries. For example, articles on baselines in the Polish Tatras (Sas et al. 2009), in the Swiss Alps (Marti et al. 2015), the Austrian Hochkar (Meurers and Ruess 2001; Ruess and Ullrich 2015), and the Zugspitze calibration system in Germany (Flury et al. 2007) describe vertical baseline calibration systems.

Examples of horizontal calibration baselines include the Orangeville Calibration Lines in Canada (CG-5 Operation Manual 2012), the calibration lines in Estonia (Oja et al. 2014), the Masala-Vihti calibration line in Finland (Poutanen et al. 2011), the Kazan calibration line (The Kazan calibration line 2016), the Irkutsk calibration line (The Irkutsk calibration line 2018), and the Moscow gravity network (Oshchepkov et al. 2016) in Russia, and the Emba, Almaty, and Zhetygen gravity baselines in Kazakhstan (Lapin et al. 2001; Baydin et al. 2007; Grebenchikova 2016).

Special attention should be paid to baseline calibration systems in China. There, six calibration systems were created, which are located in different parts of the country (Wang et al. 2014).

The mixed (part vertical and part horizontal) baseline calibration systems in Iran (Cheraghi et al. 2020) and Brazil (Sousa and Santos 2010) have large ranges and extend over long distances.

Table 1 summarizes the key indicators, advantages, and disadvantages of each of the listed methods.

Table 1 Comparison of laboratory and field methods for calibrating RGs

Laboratory calibration methods are less expensive for instrument studies, since they do not require field measurements, do not involve transporting meters between points, are not subject to external environmental influences, and allow studies to be performed at any interval within a given range with high accuracy. Typically, these methods are used by meter manufacturers at the factory. However, some of them, such as the tilt method or the feedback method, are not applicable to the most common modern meters, the CG-5 and CG-6. The moving mass and artificial acceleration methods are not available to most users. In addition, laboratory methods limit the range of gravity that can be measured.

Thus, as mentioned above, the most accessible method for calibrating relative gravimeters remains the classical baselines method.

We now briefly review the theory of processing measurements on calibration lines, which forms the basis of the methodology of our study. The observation data obtained during the calibration of CG-5 RGs on the Zhetygen calibration baseline are then processed in Sect. 3.

2.6 Adjustment measurements on the baseline

The baseline must have at least two stations with a significant gravity difference to obtain a linear calibration coefficient. In general, the number of stations depends on the degree of the polynomial, where for a degree k, the number of stations should be \(k+1\). Calibration coefficients can only be determined for the gravity interval of this baseline; extrapolation to other ranges is not permissible.

The equation for the correction scale factors \(\Delta\)GCAL1 and \(\Delta\)GCAL2 is as follows:

$$\begin{aligned} l_{i,k} + v_{i,k} = g_k - \Delta \text {GCAL1}\cdot z - \Delta \text {GCAL2}\cdot z^2. \end{aligned}$$
(3)

The linear term of the calibration function (scale factor) \(\Delta\)GCAL1 is determined by the ratio of the specified increment \(\Delta g\) to the difference in gravimeter readings \(\Delta z\):

$$\begin{aligned} \Delta \text {GCAL1} = \frac{\Delta g}{\Delta z} = \frac{\Delta g_{ref}}{\Delta g_{meas}}. \end{aligned}$$
(4)

For modern Scintrex Ltd. gravimeters, it is sufficient to refine the primary calibration coefficient GCAL1 through the scale correction coefficient \(\Delta\)GCAL1:

$$\begin{aligned} \text {GCAL1}' = \Delta \text {GCAL1} \cdot \text {GCAL1}. \end{aligned}$$
(5)
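Equations (4) and (5) amount to a simple ratio and a multiplication, as the following sketch shows. The function names and all numerical values, including the GCAL1 value, are hypothetical and serve only to illustrate the arithmetic.

```python
def scale_correction(dg_reference, dg_measured):
    """Scale correction factor of Eq. (4): reference over measured difference."""
    if dg_measured == 0.0:
        raise ValueError("measured gravity difference must be non-zero")
    return dg_reference / dg_measured


def refined_gcal1(gcal1, delta_gcal1):
    """Refined primary calibration coefficient of Eq. (5)."""
    return delta_gcal1 * gcal1


# Hypothetical baseline tie: reference 50.000 mGal, measured 49.980 mGal.
delta = scale_correction(50.000, 49.980)
new_gcal1 = refined_gcal1(8000.0, delta)  # 8000.0 is a made-up GCAL1 value
print(f"scale correction = {delta:.6f}, refined GCAL1 = {new_gcal1:.3f}")
```

In practice the correction would be estimated from many ties in a least squares adjustment rather than from a single difference.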

The stability of GCAL1 depends on the strength of the capacitive displacement sensor and the stability of the internal direct current reference voltage sensor. After an initial period of several months, during which stress relaxation in the new fused quartz system can change GCAL1 by up to 0.1%, the drift rate of the scale coefficient GCAL1 typically ranges from 1 to 2 parts per million per day. Therefore, GCAL1 should be refined at least once every few years for maximum accuracy.

As a rule, many redundant measurements are made when determining the parameters F(z) at calibration line stations. Then, the measurement results are adjusted by optimizing the solution using the least squares method.

In 1984, an algorithm and its implementation in the FORTRAN IV programming language were presented to adjust gravity observations (Lagios 1984). The adjustment program was tested on gravity networks in Scotland and Greece.

Subsequently, Torge (1989) and Hwang et al. (2002), building on this method, proposed using weighted adjustment in two variants: one in which the a priori values of some gravity stations are known, and a free adjustment model without initial data. In addition, a term containing the gravimeter drift coefficients was added to Eq. (1):

$$\begin{aligned} g + N_0 + \Delta F(z) + D(t) = l(t) + v, \end{aligned}$$
(6)

where t is the measurement time; l(t) is the measured gravity value obtained by multiplying the instrument reading at station s by the calibration coefficient, with corrections for environmental influences applied; v is the residual of l(t); g is the gravity value at the station; \(N_0\) is a constant coefficient; \(\Delta F(z)\) is the calibration function; z is the reading in instrument units; and D(t) is the gravimeter drift.

The functions \(\Delta F(z)\) and D(t) express systematic errors of the gravimeter. In this study, \(\Delta F(z)\) is modeled as in Eq. (2).

The gravimeter drift D(t) is modeled by a polynomial:

$$\begin{aligned} D(t) = \sum _{p=1}^{a}d_p\left( t-t_0\right) ^p, \end{aligned}$$
(7)

where \(t_0\) is the initial epoch, and a is the degree of the polynomial, which depends on the characteristics of the gravimeter but rarely exceeds 2.

The least squares method is used to find the unknowns in Eq. (6). This is described in detail in Torge (1989) and in Hwang et al. (2002), for example.
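The structure of such an adjustment can be sketched for a strongly simplified case of Eq. (6): \(\Delta F(z)\) is omitted, the drift of Eq. (7) is linear, all observations have equal weight, and the first station is fixed as the datum. This is an illustrative sketch, not the procedure of the cited works; the function names, station labels, and synthetic values are hypothetical.

```python
def solve_least_squares(rows, l_obs):
    """Solve min ||A x - l||^2 via normal equations and Gaussian elimination."""
    n = len(rows[0])
    # Normal matrix N = A^T A and right-hand side b = A^T l.
    N = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * l for r, l in zip(rows, l_obs)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(N[k][col]))
        N[col], N[pivot] = N[pivot], N[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = N[row][col] / N[col][col]
            for k in range(col, n):
                N[row][k] -= factor * N[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(N[i][k] * x[k] for k in range(i + 1, n))) / N[i][i]
    return x


def adjust_loop(observations, stations, t0=0.0):
    """Estimate N0, linear drift d1 and station gravity relative to stations[0].

    observations -- list of (station, time, reading) tuples
    stations     -- station labels; the first one is the datum (g = 0)
    """
    free = stations[1:]
    rows, l_obs = [], []
    for station, t, reading in observations:
        # Unknowns: N0, d1, then one gravity value per non-datum station.
        rows.append([1.0, t - t0] + [1.0 if station == s else 0.0 for s in free])
        l_obs.append(reading)
    x = solve_least_squares(rows, l_obs)
    return {"N0": x[0], "d1": x[1], "g": dict(zip(free, x[2:]))}


# Synthetic noise-free loop A-B-A-B-A with assumed values (mGal, hours).
true_dg, true_n0, true_drift = 25.0, 4000.0, 0.02
obs = [
    (st, 0.5 * i, (true_dg if st == "B" else 0.0) + true_n0 + true_drift * 0.5 * i)
    for i, st in enumerate(["A", "B", "A", "B", "A"])
]
result = adjust_loop(obs, ["A", "B"])
print(result)
```

Repeated occupation of the stations is what makes the drift separable from the gravity difference; a realistic adjustment would add observation weights, a scale term, and residual analysis.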

The described adjustment method was applied in the above-mentioned work of Jacob et al. (2010) and in the processing of the gravity network of the Tibetan Plateau (Xing et al. 2020). Least squares adjustment was also applied in the campaign of relative gravity observations at the international comparison of AGs in 2009 (Jiang et al. 2012). Given the small gravity increments, only the linear scale factor was included in the observation equation for the CG-5 and ZLS Burris gravimeters. Dias and Escobar (2001) proposed a model for adjusting combined gravity observations to determine calibration coefficients for LCR gravimeters.

Based on this adjustment, many software products have been created. For example, in 2003, interactive software for processing CG-3 gravimeter data, called CG3TOOL, was developed (Gabalda et al. 2003). Another approach, which took into account possible non-linear scale factors for both LCR and CG-5 gravimeters, was applied in the adjustment of the Brazilian national gravity network (Escobar et al. 2013). Cattin et al. (2015) present the GravProcess software package for processing gravity observations, with a graphical interface developed in MATLAB. Hector and Hinderer (2016) developed pyGrav, an open-source package with a graphical interface written in Python. The developers of pyGrav note the significant contribution of the work of Hwang et al. (2002), which was further extended by Beilin (2006) with the MCGravi software (provided with pyGrav). pyGrav was continued in the GSadjust project of the U.S. Geological Survey (Kennedy 2020). Depending on the research objectives and the features of the task, some working groups use other, more complex network adjustment schemes (Kennedy et al. 2014), which can be implemented in pyGrav as alternative options. The source code of these packages is available as git repositories, for pyGrav on GitHub.com and on the GitLab server of the company SGC Travaux Spéciaux, respectively, and for GSadjust in the USGS Official Source Code Archive.

Another option for processing gravity network data is the Gsolve program (McCubbine et al. 2018). A further Python software package, pyGABEUR, was presented by Wijaya et al. (2019). The pyGABEUR program uses weighted least squares adjustment with outlier rejection based on the \(\tau\)-criterion. Koymans et al. (2022) and Koymans (2022a) use the relative-gravity-adjustment software, which is also based on weighted least squares adjustment. This software is available as the relative-gravity-adjustment package and as a Web application (Koymans 2022b).

gTOOLS (Battaglia et al. 2022) and GravNetAdj (Zhao et al. 2024) are open-source software written in MATLAB and compiled into an executable file that is run using the MATLAB Runtime Compiler. The gTOOLS source code is published on the USGS git server, and the GravNetAdj source code on GitHub.com. The open-source GRAVS2 software package (Oja 2022) was created specifically to process relative gravity meter data. The detailed PDF manual explains the methods, parameters, input parameters, and examples (Oja 2021). The author distributes the source code and documentation via Google Drive.

Next, we present our adjustment of the gravimetric data obtained at the Zhetygen calibration baseline in Kazakhstan, which was mentioned above.

3 Analysis of CG-5 meters calibration function on Zhetygen baseline

3.1 Source data

The Zhetygen calibration line mentioned above belongs to Geoken LLP, which regularly tests its gravimeters there. The baseline runs from south to north along the highway north of Almaty. It consists of six stations with the following identifiers: 2, 1, 2A, 3, 4, 5 and 6. The point numbering and station count follow the conventions established by Geoken, which has conducted measurements here for many years. The total length of the baseline is 30.8 km. The calibration line map is shown in Fig. 1, and the coordinates and heights of the stations are given in Table 2.

Fig. 1 Zhetygen calibration line

Table 2 Stations of the Zhetygen calibration line (distances relative to St. 1, latitude and longitude in VGS)

All measurements we examined were performed with three CG-5 RGs, serial numbers 40823, 40824 and 49459, owned by Geoken LLP.

The data comprise the results of six measurement campaigns from 2019 to 2023. Measurements were carried out using a stepwise method (double loop) from the 2nd to the 6th station and back: 2–1–2–1–2–2A–2–2A–3–2A–3–...–3–2A–3–2A–2–2A–2–1–2–1–2 (see Fig. 2). All ties were calculated relative to station 1. At each point, 5–6 readings were recorded with an averaging time of 40 s.

Fig. 2 Sequence of steps when performing calibration measurements using CG-5 RGs on the Zhetygen calibration line

3.2 Definition of gravity reference values

Since no station on the Zhetygen baseline has a known absolute gravity value, reference gravity differences have to be selected. We can analyze the time series and calculate new reference differences that are optimal according to a chosen criterion. The lack of absolute reference values is not a major problem for this study, because its goal is to study the behavior of the calibration functions over time and over a range of gravity. Therefore, we first adjusted the measurements in accordance with Eq. (6) without the \(\Delta F\left( z\right)\) term. We performed a free adjustment with one station held fixed (we chose station 1).

The readings recorded by the CG-5 are corrected for drift, tides, etc. The CG-5 system uses the Longman (1959) tide model. Because this model is outdated and does not compensate for tides well, many researchers use third-party tide models that also account for ocean loading. In our study, however, for simplicity we use the readings as corrected by the CG-5 system, since the measurements were carried out over short time intervals and short distances. The error that the imperfect Longman model introduces into the reading differences is therefore insignificant.

Using Eq. (6), we calculated the gravity ties for each meter and each measurement cycle of every campaign. The distributions of residuals are presented in Fig. 3, where the residuals are grouped by campaign and the colors represent observation days.

Fig. 3 Residuals obtained after adjustment using the WLS method for each observation day in the campaigns

We used several approaches to adjust the measurements. The calculations were done with simple scripts written in Python in Jupyter Notebook. To solve the system of equations, we used the weighted least squares (WLS) linear regression model and the robust linear model (RLM) from the statsmodels package.
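A weighted least squares adjustment with one station held fixed can be sketched as follows. This is only an illustration under stated assumptions: the observation model (a constant offset, a linear residual drift and one gravity value per free station) is a simplified stand-in for Eq. (6), the function name `adjust_loop` is hypothetical, NumPy is used in place of statsmodels, and the input numbers are synthetic.

```python
import numpy as np

def adjust_loop(stations, times, readings, sigmas, fixed="1"):
    """Weighted least squares adjustment of one loop of relative readings.

    Assumed model (a simplification, not the paper's Eq. 6):
        reading = offset + drift * t + g[station],  with g[fixed] = 0,
    so every adjusted value is a tie relative to the fixed station.
    """
    free = sorted(set(stations) - {fixed})
    column = {name: 2 + k for k, name in enumerate(free)}

    # Design matrix: column 0 = offset, column 1 = drift, then one per free station.
    A = np.zeros((len(readings), 2 + len(free)))
    A[:, 0] = 1.0
    A[:, 1] = np.asarray(times, dtype=float)
    for row, name in enumerate(stations):
        if name != fixed:
            A[row, column[name]] = 1.0

    y = np.asarray(readings, dtype=float)
    # Scaling each row by sqrt(w) = 1/sigma turns WLS into ordinary least squares.
    sqrt_w = 1.0 / np.asarray(sigmas, dtype=float)
    x, *_ = np.linalg.lstsq(A * sqrt_w[:, None], y * sqrt_w, rcond=None)

    ties = {name: float(x[column[name]]) for name in free}
    residuals = y - A @ x
    return ties, float(x[1]), residuals

# Synthetic, noise-free example (hypothetical values, mGal and hours).
true_g = {"1": 0.0, "2": 10.0, "2A": 25.0}
stations = ["2", "1", "2", "1", "2", "2A", "2", "2A", "2"]
times = [0.25 * k for k in range(len(stations))]
readings = [5000.0 + 0.01 * t + true_g[s] for s, t in zip(stations, times)]
sigmas = [0.005] * len(stations)

ties, drift, residuals = adjust_loop(stations, times, readings, sigmas)
print(ties, drift)
```

The same design matrix, observations and weights (the inverse squared sigmas) could be passed to the WLS and RLM classes of statsmodels to obtain standard errors or a robust fit.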

Table 3 Total gravity values for all observation days for each campaign by WLS method

The residual plots clearly show how the RG readings behave at the stations and the overall quality of the measurements. Meter № 40823 gives the least stable readings, and the highest-quality measurements belong to meter № 40824.

We can obtain the total for all observation days for each measurement cycle using the WLS and RLM models, and also as a weighted mean (WM) using the formula

$$\begin{aligned} {\overline{g}} = \frac{\sum _{j=1}^{J} w_j\ g_j}{\sum _{j=1}^{J}w_j}, \end{aligned}$$
(8)

where the weights \(w_j\) are the inverse variances \(w_j=\sigma _j^{-2}\) obtained from the adjustment. The corresponding standard deviation of the weighted mean \({\overline{g}}\) is calculated as

$$\begin{aligned} {\overline{\sigma }} = \sqrt{\frac{\sum _{j=1}^{J}w_j\left( g_j-{\overline{g}}\right) ^2}{\left( J - 1\right) \sum _{j=1}^{J}w_j}}. \end{aligned}$$
(9)
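Equations (8) and (9) translate directly into a few lines of Python. The sketch below assumes that the count in the denominator of Eq. (9) is the number J of values being averaged; the function name and the input numbers are hypothetical.

```python
import math

def weighted_mean(values, sigmas):
    """Weighted mean (Eq. 8) and its standard deviation (Eq. 9).

    Weights are the inverse variances w_j = sigma_j**-2.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    w_sum = sum(weights)
    mean = sum(w * g for w, g in zip(weights, values)) / w_sum
    scatter = sum(w * (g - mean) ** 2 for w, g in zip(weights, values))
    std = math.sqrt(scatter / ((len(values) - 1) * w_sum))
    return mean, std

# Hypothetical daily ties (mGal) and their standard errors from the adjustment.
daily_g = [25.010, 25.014, 25.012]
daily_sigma = [0.002, 0.004, 0.002]
g_bar, sigma_bar = weighted_mean(daily_g, daily_sigma)
print(g_bar, sigma_bar)
```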

The results indicated no significant difference between the three models. Therefore, subsequent calculations will utilize the g values from the WLS model. These results are presented in Table 3. The columns display the calculated gravity values \({\overline{g}}\) relative to station number 1 and their corresponding standard errors \({\overline{\sigma }}\).

Figure 4 shows the distribution of the WLS g values from Table 3. The smallest deviations belong to meter № 40824, but this meter also shows the greatest variability of values at the same stations over the entire measurement period.

Fig. 4 Changes over time in the gravity differences measured by each RG

To trace the stability of the calibration function over time, we subtract the value of g obtained in the first campaign from the values of all campaigns:

$$\begin{aligned} g_k^m = g_j^m - g_{j=0}^m. \end{aligned}$$
(10)
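As a small illustration of Eq. (10), the snippet below subtracts the first-campaign value from a chronological series of ties for one meter and one station. The campaign labels and gravity values are hypothetical.

```python
def change_from_first(series):
    """Return (label, g - g_first) for a chronological list of (label, g)."""
    g_first = series[0][1]
    return [(label, g - g_first) for label, g in series]

# Hypothetical ties (mGal) of one meter at one station, oldest campaign first.
series = [
    ("campaign_1", 62.410),
    ("campaign_2", 62.404),
    ("campaign_3", 62.399),
]
for label, dg in change_from_first(series):
    print(f"{label}: {dg * 1000:+.1f} microGal")
```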

The resulting plots of year-to-year changes in gravity at each station k are presented in Fig. 5. For meter № 40824, which has small errors, the plots clearly show a decrease in g relative to the initial cycle; for ranges of more than 60 \(\mu\)Gal, the difference increased until 2022 and then stabilized. Meters № 40823 and № 49459, in contrast, have larger standard errors than № 40824, but their values of g remain stable across measurement cycles.

Fig. 5 Changes over time in the gravity differences measured by each RG, relative to the first campaign

To determine the \(\Delta g\) reference values from Table 3, we identified the measurement campaigns with the smallest deviations in gravity values between the RGs under study. On this basis, the period corresponding to the first half of 2022 was selected (marked by purple circles in Fig. 4). The total \(\Delta g\) values calculated using the WLS model for this period are given in Table 4, together with the reference increments used previously (labeled REF). As the table shows, the previously accepted REF values differ significantly from the calculated ones, especially in ranges greater than 40 mGal. Since the subsequent analysis concerns the dynamics of the calibration coefficients, we use the values obtained from WLS to calculate them.
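One way to automate such a selection is sketched below. The criterion used here (the mean over stations of the max-minus-min spread between meters) is an assumption made for illustration, not necessarily the criterion applied in this study; all names and numbers are hypothetical.

```python
def spread_between_meters(campaign):
    """Mean over stations of the (max - min) tie across meters.

    campaign: {meter: {station: g}} with g in mGal.
    """
    meters = list(campaign.values())
    common = sorted(set.intersection(*(set(m) for m in meters)))
    spreads = [max(m[s] for m in meters) - min(m[s] for m in meters)
               for s in common]
    return sum(spreads) / len(spreads)

def pick_reference(campaigns):
    """Label of the campaign in which the meters agree best."""
    return min(campaigns, key=lambda name: spread_between_meters(campaigns[name]))

# Hypothetical ties relative to station 1 (mGal) for two campaigns.
campaigns = {
    "campaign_1": {
        "meter_a": {"2": 10.020, "3": 40.050},
        "meter_b": {"2": 10.000, "3": 40.000},
    },
    "campaign_2": {
        "meter_a": {"2": 10.004, "3": 40.010},
        "meter_b": {"2": 10.000, "3": 40.002},
    },
}
print(pick_reference(campaigns))
```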

Table 4 Reference gravity difference values

3.3 Scale factors

We first analyzed the time course of the calibration coefficients for each meter, considering only \(\Delta\)GCAL1. To do this, using the same WLS method, we calculated the \(\Delta\)GCAL1 values for each meter for all measurement campaigns. The results are presented in Fig. 6. In this case, the values of \(\Delta\)GCAL1 for each day of measurements were first calculated (small points), and then the total value for the campaign (large points).
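A linear scale correction of this kind can be estimated by a weighted fit through the origin. The sketch below assumes the simple model reference = (1 + k) × measured, where k is a dimensionless stand-in for the linear correction; how k maps onto the instrument's \(\Delta\)GCAL1 value is not addressed here, and the numbers are synthetic.

```python
def linear_scale_correction(measured, reference, sigmas):
    """Weighted least squares estimate of k in: reference = (1 + k) * measured.

    Minimizing sum w * (reference - measured - k * measured)**2 gives
    k = sum(w * z * (ref - z)) / sum(w * z**2), with w = sigma**-2.
    """
    num = 0.0
    den = 0.0
    for z, ref, s in zip(measured, reference, sigmas):
        w = 1.0 / s ** 2
        num += w * z * (ref - z)
        den += w * z * z
    return num / den

# Synthetic ties (mGal): the meter under-reads by 2 parts in 10^4.
measured = [10.0, 40.0, 70.0, 100.0]
reference = [z * (1.0 + 2e-4) for z in measured]
sigmas = [0.003, 0.003, 0.004, 0.005]
k = linear_scale_correction(measured, reference, sigmas)
print(k)
```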

Fig. 6 Changes over time in the linear scale factors for each RG

After this, using formula (3), we obtained for each campaign and RG an approximation of the corrective calibration function as a function of the measurement range. The resulting coefficients \(\Delta\)GCAL1 and \(\Delta\)GCAL2 are presented in Table 5 and Fig. 7. The dotted lines in the plots show the fitted 2nd-degree polynomials. The plots show that almost all campaigns gave a nearly linear result, which indicates that the 2nd-degree calibration coefficient of the CG-5 is very small. For meter № 40823, however, the coefficient increases slightly with the measurement range, which is especially noticeable in the 2023 campaign. A general increase for all RGs can also be seen at station № 3 in the campaigns of 2021, the second half of 2022 and 2023.
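A second-degree fit of this kind can be sketched as follows, assuming the corrective function has the form \(\Delta F(z) = a z + b z^2\) with no constant term. This form is an assumption standing in for formula (3), with a and b playing the roles of the first- and second-degree coefficients; the data are synthetic.

```python
def fit_quadratic_through_origin(z, d):
    """Least squares fit of d = a*z + b*z**2 via the 2x2 normal equations."""
    s11 = sum(x ** 2 for x in z)
    s12 = sum(x ** 3 for x in z)
    s22 = sum(x ** 4 for x in z)
    t1 = sum(x * y for x, y in zip(z, d))
    t2 = sum(x ** 2 * y for x, y in zip(z, d))
    det = s11 * s22 - s12 ** 2
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b

# Synthetic example: measured ranges z (mGal) and the differences
# d = reference - measured (mGal) generated from known coefficients.
z = [10.0, 25.0, 50.0, 75.0, 100.0]
d = [2e-4 * x + 1e-6 * x ** 2 for x in z]
a, b = fit_quadratic_through_origin(z, d)
print(a, b)
```

A value of b close to zero relative to a corresponds to the nearly linear behaviour described above.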

Fig. 7 Corrective calibration function versus measurement range for each RG and campaign

Table 5 Coefficients of the second degree polynomial \(\Delta\)GCAL1 and \(\Delta\)GCAL2 of the corrective calibration function \(\Delta F\left( z\right)\), calculated for campaigns and RGs

In addition to the previously mentioned disadvantage of the Zhetygen calibration line, namely the lack of absolute reference gravity values, its range of 100 mGal does not cover the gravity differences associated with elevation differences across Kazakhstan. It may also be insufficient to reveal how the behavior of the sensing system depends on the magnitude of the measured gravity.

For this reason, a new baseline is planned for Almaty, making use of the mountainous terrain surrounding the city. The upper station is planned at an altitude of 2700 m on the territory of the Tien Shan Astronomical Observatory. The lower station will be located on the territory of the Kazakh National Research Technical University at an altitude of 680 m. The baseline range should thus be at least 300 mGal.
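The expected range can be checked with a rough estimate from the height difference alone. The sketch below combines the normal free-air gradient (0.3086 mGal/m) with the attraction of an infinite Bouguer plate of density 2670 kg/m³; both are standard textbook approximations assumed here, and latitude differences, terrain effects and local density variations are ignored.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
FREE_AIR = 0.3086    # normal free-air gradient, mGal per metre
DENSITY = 2670.0     # assumed mean crustal density, kg/m^3

def bouguer_plate_gradient(density):
    """Attraction of an infinite slab per metre of thickness, in mGal/m."""
    return 2.0 * math.pi * G * density * 1e5   # 1 m/s^2 = 1e5 mGal

def expected_range(h_low, h_high, density=DENSITY):
    """Approximate gravity difference (mGal) between two stations on terrain."""
    dh = h_high - h_low
    return (FREE_AIR - bouguer_plate_gradient(density)) * dh

range_mgal = expected_range(680.0, 2700.0)
print(f"{range_mgal:.0f} mGal")
```

Under these assumptions the estimate is roughly 400 mGal, which is consistent with a planned range of at least 300 mGal.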

4 Conclusions

Metrological support for relative gravimeters (RGs) is crucial for enhancing accuracy and reliability in geophysical surveys at mineral deposits, scientific research on geodynamic processes, and gravity surveys for geodetic purposes.

This study examined current laboratory and field calibration methods for terrestrial RGs, considering technical aspects, complexity, cost, economic efficiency, and other economic characteristics. The methods involving moving mass and artificial acceleration were identified as the most complex and costly regarding system production and personnel qualification, and most importantly, they do not provide the required calibration range.

Calibration on a gravity baseline was found to be the most suitable and promising method. Vertical baselines are preferred over horizontal ones for greater accuracy and efficiency in high-precision calibration for modern gravimetric research.

Mountainous areas with developed road infrastructure are ideal for designing vertical baselines. Based on international experience, several requirements for a vertical baseline were formulated:

  • At least 3 stations in the gravity calibration line.

  • A gravity range of at least 100 mGal.

  • Station locations arranged for a total observation time of under 4 h per program.

  • Securely anchored stations with pedestals for simultaneous installation of multiple RGs.

  • Accessibility to all users.

  • Reliable determination of reference gravity values by an AG, with periodic re-measurements.

  • Determination of a vertical gravity gradient model at each station.

  • Absence of external sources of systematic errors at the stations.

In addition, external factors like snow cover and adverse observation conditions in mountainous areas should be considered.

The research presented here, as well as numerous other studies, underscores the importance of regular RG calibration, as the calibration function can change over time. Kazakhstan already has several gravity baselines, but they lack absolute determinations and a range sufficient for the gravity differences encountered in mountainous surveys. Moreover, the proximity of some stations to a reservoir may introduce variability due to changes in water masses. A new gravity calibration system meeting modern requirements is needed to address these shortcomings. The Almaty region, with its proximity to mountain ranges and well-developed road network, is the most suitable location. Preliminary estimates suggest that a gravity baseline with a range exceeding 300 mGal and travel times between stations of no more than 2 h can be established in this area.

Future research may include the following directions:

  • Development of Kazakhstan’s national gravity reference frame.

  • Research and development of software for gravity data processing and analysis.

  • International cooperation and exchange of experience in gravimetry.

  • Study of the influence of external factors on gravity measurements.