Introduction

GNSS interferometric reflectometry (GNSS-IR) uses a geodetic Global Navigation Satellite System (GNSS) antenna to monitor soil moisture in the top 5–10 cm of the soil column. It has become a useful remote sensing technique that complements classical satellite or aircraft imagery. Its spatial resolution is higher (an area of approximately 1000 m² around the antenna), it supports continuous monitoring, and it is independent of weather conditions (the technique remains valid in rain and fog) and of illumination (day or night).

GNSS satellite signals are transmitted in the L-band (microwave frequency), so the signal reflected by nearby surfaces and recorded by the antenna contains information about the environment surrounding the antenna. This information can be obtained by processing the signal-to-noise ratio (SNR) recorded in the antenna as interferograms.

The GNSS-IR technique, first developed by Larson et al. (2008a, b), and based on the procedure detailed in Larson et al. (2010), Larson and Nievinski (2013), Chew et al. (2014, 2015, 2016), Vey et al. (2016), Small et al. (2016), Wan et al. (2015), Chen et al. (2016), Roussel et al. (2016), and Zhang et al. (2017), is summarized as follows:

  1. SNR observations are selected for satellites with elevation angles between 5 and 30 degrees. This restriction is one of the main drawbacks of the technique: only low-elevation satellites can be used, since this is the only range that yields a valid interferogram for soil moisture analysis. Data from satellite elevations below 5 degrees are discarded to avoid strong multipath effects in the SNR data.

  2. Rising and setting satellite tracks are separated (or tagged for post-processing), since the interferogram pattern can differ for rising and setting tracks of the same satellite.

  3. SNR data are converted from the observed dB-Hz units to a linear scale in volts by the expression \({\text{SNR}}_{{{\text{linear}}}} = 10^{{\text{SNR}}/20}\) (Vey et al. 2016).

  4. The reflected signal \({\text{SNR}}_{{{\text{linear}}}}^{{{\text{reflected}}}}\) is isolated by fitting a second-order polynomial to \({\text{SNR}}_{{{\text{linear}}}}\) and subtracting it, which removes the direct satellite signal (Wan et al. 2015; Chew et al. 2016).

  5. A Lomb-Scargle periodogram is computed from the \({\text{SNR}}_{{{\text{linear}}}}^{{{\text{reflected}}}}\) of each satellite track to check that only one clear primary wave is present. Tracks with multiple peaks or low maximum average power should be discarded.

  6. The selected tracks can be modeled as:

    $${\text{SNR}}_{{{\text{linear}}}}^{{{\text{reflected}}}} = A\cos \left( {\frac{4\pi h}{\lambda }\sin e + \phi } \right)$$
    (1)

    where A and ϕ are the amplitude and phase of the primary wave, λ is the GNSS signal wavelength, e is the satellite elevation, and h is the reflector height: the vertical distance between the GNSS antenna phase center and the horizontal reflecting surface, which is assumed to be the distance between the antenna and the ground because the signal penetrates only slightly into the ground (Chew et al. 2014; Roussel et al. 2016; Zhang et al. 2017). A and ϕ are estimated by a least-squares algorithm with initial values of 1 for A and 0 for ϕ.
  7. The final step is to derive the relationship between soil moisture variations (more specifically, volumetric water content, VWC, in m³/m³) and variations in the estimated phase of the primary reflected wave. This requires in situ observations as a reference data set. Those observations can be obtained from conventional water content reflectometer sensors (Vey et al. 2016; Larson et al. 2010) or from soil samples (Martín et al. 2020). Assuming a linear relationship between GNSS-IR estimated phase variations and reference VWC variations, several weeks (or months) of both types of data are needed to obtain a reliable relationship. The slope is fitted using the satellite tracks whose phase variations show the strongest linear correlation with the in situ soil moisture variations. For example, Zhang et al. (2017) set the correlation threshold at 0.9, so only the ascending tracks of GPS satellites 13, 21, 24 and 30 and the descending tracks of GPS satellites 05, 09, 10, 15 and 23 are used. Finally, the slope applied to all valid tracks should be the mean of the slopes obtained for the highly correlated satellite tracks.
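Steps 3 and 4 of the procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code distributed with the software: the function names are hypothetical, NumPy is assumed to be available, and the polynomial is fitted against the elevation angle (the text does not state which independent variable is used).

```python
import numpy as np


def snr_db_to_linear(snr_dbhz):
    """Step 3: convert SNR from dB-Hz to a linear scale (volts)."""
    return 10.0 ** (np.asarray(snr_dbhz, dtype=float) / 20.0)


def remove_direct_signal(x, snr_linear, order=2):
    """Step 4: subtract a low-order polynomial trend (the direct signal).

    x is the independent variable of the fit. Using the elevation angle
    here is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    snr_linear = np.asarray(snr_linear, dtype=float)
    coeffs = np.polyfit(x, snr_linear, order)
    return snr_linear - np.polyval(coeffs, x)


# Synthetic track: a smooth direct-signal trend plus a small oscillation
# that follows the argument of Eq. (1) with h = 1.8 m and the L1 wavelength.
elev_deg = np.linspace(5.0, 30.0, 300)
oscillation = 0.5 * np.cos(
    4.0 * np.pi * 1.8 / 0.1904 * np.sin(np.radians(elev_deg))
)
snr_dbhz = 35.0 + 0.3 * elev_deg + oscillation
reflected = remove_direct_signal(elev_deg, snr_db_to_linear(snr_dbhz))
```

The array `reflected` plays the role of the reflected signal in Eq. (1); because the fitted polynomial includes a constant term, its mean is zero up to rounding.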

Nievinski and Larson (2014) developed open-source Matlab/Octave code that produces simulated SNR, carrier phase and pseudorange GPS observations consistent with a multipath model for near-surface reflectometry and positioning applications.

Roesler and Larson (2018) present a free software tool that translates GPS (or GNSS) observations into a format usable for reflections research (written in Fortran 77), maps GNSS-IR reflection zones around the antenna (written in Matlab), and estimates dominant frequencies and reflector height from GNSS data (written in Matlab).

In this manuscript, we explain the main ideas and implementation decisions behind Python software tools that implement the first six steps of the procedure above (the last step requires in situ data to obtain an accurate linear relationship), and we share the software with the scientific community. The software runs under Python 2.7 and Python 3.

Software development

The proposed software tools have two main modules. The first module transforms the RINEX format observation and navigation files (Gurtner and Estey 2015) into a single file containing the epoch of each observation, the satellite identification, the SNR observation, and the azimuth and elevation of the satellite computed from the RINEX navigation file. A line is written to the output file only if the satellite elevation falls between 5 and 30 degrees. The input can be a single file (one observation day, for example) or several (one week or month of continuous observation data split into daily files), but the output is always a single file.

The input for the second module is the output of the first module; it generates an output file per satellite containing the reflection height and the adjusted values for phase and amplitude of the interferometric wave for each individual satellite track. The software also generates a graphical output for each track containing the direct SNR signal, the indirect SNR signal, the computed interferogram, and the wave adjusted to the indirect SNR signal.

First module: information extraction from RINEX observation and navigation files

The developed software works with the version 3 format of the RINEX observation and navigation files. The open source software GFZRNX can translate RINEX version 2 data to version 3 (Nischan 2016). Although we refer to GNSS in the manuscript, the first version of the software only works for the GPS satellite constellation.

The user has to decide which frequency to use before running the software; this can be done by opening a RINEX observation file and inspecting the header information. Once the frequency has been selected, the user should identify the observations associated with this frequency that are stored in the RINEX observation file and will be used in the process. Those observations are the SNR and the pseudorange (code observation) for the selected frequency.

The algorithm reads the observation and navigation files stored in a folder (\data\input\ by default). The names and extensions of those files should follow the RINEX standard format: The name is composed of four characters for the station name, three for the day of the year and one character to identify the session. The extension is composed of two characters corresponding with the last two digits of the year and an "o" for the observation files or "n" for the GPS navigation file. The algorithm orders the input files in chronological sequence and opens them in order.
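The file-naming rule and the chronological ordering described above can be illustrated with a short sketch. The function and class names are hypothetical, and the two-digit year is expanded with the usual RINEX 2 convention (80–99 map to 1980–1999, 00–79 to 2000–2079); the distributed software may handle these details differently.

```python
import re
from typing import NamedTuple


class RinexName(NamedTuple):
    station: str
    year: int
    doy: int
    session: str
    kind: str  # "o" for observation files, "n" for GPS navigation files


# ssssdddf.yyt: station (4), day of year (3), session (1), year (2), type (1)
_PATTERN = re.compile(r"^([A-Za-z0-9]{4})(\d{3})([A-Za-z0-9])\.(\d{2})([oOnN])$")


def parse_rinex_name(filename):
    """Split a short RINEX file name into its components."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError("not a short RINEX file name: %r" % filename)
    station, doy, session, yy, kind = match.groups()
    yy = int(yy)
    year = 1900 + yy if yy >= 80 else 2000 + yy
    return RinexName(station.lower(), year, int(doy), session.lower(), kind.lower())


def sort_chronologically(filenames):
    """Order file names by year, day of year and session."""
    def key(name):
        parsed = parse_rinex_name(name)
        return (parsed.year, parsed.doy, parsed.session)
    return sorted(filenames, key=key)
```

For example, `sort_chronologically(["abcd0150.19o", "abcd3650.18o"])` places the 2018 file first, because the key compares the year before the day of the year.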

The algorithm reads each GPS satellite observation for every epoch for every observation file and writes in the output file (located in the data\output1\ folder by default) the following data only from satellites with elevations between 5 and 30 degrees:

  1. A numerical identifier related to the epochs of the observation files. This integer identifier starts at zero for all observations of the first epoch of the first observation file, increases by one per epoch, and ends with the last epoch of the last observation file. It can be used as the identifier in the second software module. The software writes as many lines with the same identifier as there are observations from different satellites in the same epoch.

  2. The observation time: the year, month, day, and a float number containing the hour (calculated from the integer hour, integer minute, and float seconds in the RINEX observation file).

  3. The two-digit GPS satellite numerical identifier.

  4. The observed SNR.

  5. The azimuth and elevation of the satellite as seen from the antenna. The algorithm computes them from the observation epoch, the pseudorange (code observation) to the satellite, the station coordinates given in the header of the observation file, and the navigation file. The calculation requires the emission time and the satellite coordinates in the Earth-Centered, Earth-Fixed (ECEF) reference system. For the emission time, an iterative “pseudorange-based algorithm” is used (Sanz et al. 2013). The ECEF satellite coordinates are computed with the algorithm described in Leick et al. (2015), p. 240. Finally, the earth rotation during the signal travel time is taken into account to obtain the final satellite coordinates at the receiver time. From the ECEF coordinates of the antenna and the satellite, the azimuth and elevation are computed and stored in the output file. The module that computes the azimuth and elevation requires two extra modules: one computes the Julian and GPS time, and the other computes the spherical geodetic coordinates from the geocentric coordinates and then the azimuth and elevation from the geodetic coordinates of the antenna and the satellite.
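The last part of this computation, going from ECEF coordinates of the antenna and the satellite to azimuth and elevation, can be sketched as follows. This is an illustrative version that uses only the standard library: the function names are hypothetical, the WGS84 ellipsoid is an assumption, and the emission-time iteration and the earth-rotation correction mentioned above are not included.

```python
import math

WGS84_A = 6378137.0              # semi-major axis in meters (assumed ellipsoid)
WGS84_F = 1.0 / 298.257223563    # flattening


def ecef_to_geodetic(x, y, z, iterations=10):
    """Return geodetic latitude and longitude in radians (not valid at the poles)."""
    e2 = WGS84_F * (2.0 - WGS84_F)
    lon = math.atan2(y, x)
    p = math.hypot(x, y)
    lat = math.atan2(z, p * (1.0 - e2))
    for _ in range(iterations):
        n = WGS84_A / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
        h = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1.0 - e2 * n / (n + h)))
    return lat, lon


def azimuth_elevation(receiver, satellite):
    """Azimuth and elevation (degrees) of the satellite seen from the receiver.

    Both arguments are (x, y, z) ECEF coordinates in meters.
    """
    lat, lon = ecef_to_geodetic(*receiver)
    dx, dy, dz = (s - r for s, r in zip(satellite, receiver))
    # Rotate the ECEF difference vector into local east, north, up axes.
    east = -math.sin(lon) * dx + math.cos(lon) * dy
    north = (-math.sin(lat) * math.cos(lon) * dx
             - math.sin(lat) * math.sin(lon) * dy
             + math.cos(lat) * dz)
    up = (math.cos(lat) * math.cos(lon) * dx
          + math.cos(lat) * math.sin(lon) * dy
          + math.sin(lat) * dz)
    azimuth = math.degrees(math.atan2(east, north)) % 360.0
    elevation = math.degrees(math.atan2(up, math.hypot(east, north)))
    return azimuth, elevation
```

With azimuth and elevation in hand, the elevation filter of the first module reduces to keeping a line only when `5.0 <= elevation <= 30.0`.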

Second module: GPS-IR reflector height calculation and wave adjustment to observed SNR

The second software module uses the output of the first module as input. The program identifies all valid tracks for each satellite and performs steps 2 to 6 described in the introduction section.

First, the software creates an output folder structure: one folder for each satellite identified by its numerical identifier, creating 32 folders numbered from 1 to 32. If these folders already exist, the program will fail, since they are created during execution; this prevents previous results from being overwritten or generating more files within the folders with each successive execution.
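The fail-if-exists behavior described above is what plain `os.mkdir` gives in Python 3, as the following sketch shows. The function name and the flat layout inside a base directory are assumptions made for illustration.

```python
import os
import tempfile


def create_satellite_folders(base_dir, n_satellites=32):
    """Create one folder per GPS satellite number (1..n_satellites)."""
    paths = []
    for prn in range(1, n_satellites + 1):
        path = os.path.join(base_dir, str(prn))
        os.mkdir(path)  # raises FileExistsError if the folder already exists
        paths.append(path)
    return paths


# A second run over the same base directory fails instead of overwriting.
with tempfile.TemporaryDirectory() as base:
    created = create_satellite_folders(base)
    try:
        create_satellite_folders(base)
        second_run_failed = False
    except FileExistsError:
        second_run_failed = True
```

Moving or renaming the previous output folders before a new run avoids the error while keeping the earlier results.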

Before running the software, the user must set some internal input parameters:

  1. The time interval between observations: it can be obtained from the observation file header.

  2. The minimum and maximum satellite elevations: by default these are again set to 5 and 30 degrees.

  3. The minimum and maximum azimuth of the satellite tracks to be considered: this allows the user to focus on a certain area around the antenna. These parameters are set by default to 0 and 360 degrees.

  4. The antenna height: measured from the ground to the antenna reference mark, in meters.

  5. The wavelength of the working frequency: 0.1904 m for L1, 0.2443 m for L2.

  6. The minimum range of elevation angles that a satellite must cover for the track to be considered valid: set by default to 10 degrees.

  7. The threshold number of epochs without observations used to decide whether observations belong to the same satellite track or to a new one: in certain cases, some epochs are missing from the observation file because of signal interruptions or a malfunction, so the software treats satellite observations that are separated in time by less than the number of epochs set in this parameter as belonging to the same track. The value depends on the time interval between epochs; it can be set to 1 if the interval between observations is 30 s, 7 if the interval is 5 s, or 35 if it is 1 s.

The software selects each track and tags it as a rising or a setting satellite track; it then performs the processes described in the following paragraphs.
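Track separation and rising/setting tagging might look like the following sketch. It rests on one reading of the gap parameter, namely that two consecutive observations stay in the same track when they are no more than `gap_epochs` sampling intervals apart (about 30 to 35 s for the three suggested settings). The function name and the returned dictionary layout are hypothetical.

```python
def split_tracks(times_s, elevations_deg, interval_s, gap_epochs):
    """Group the observations of one satellite into tracks.

    times_s must be sorted in increasing order. Each returned track holds
    the sample indices it contains and a "rising" or "setting" tag.
    """
    if not times_s:
        return []
    groups = [[0]]
    for i in range(1, len(times_s)):
        separation = round((times_s[i] - times_s[i - 1]) / interval_s)
        if separation > gap_epochs:
            groups.append([i])      # gap too long: start a new track
        else:
            groups[-1].append(i)    # short gap: same track
    tracks = []
    for indices in groups:
        rising = elevations_deg[indices[-1]] > elevations_deg[indices[0]]
        tracks.append({
            "indices": indices,
            "direction": "rising" if rising else "setting",
        })
    return tracks


# 5 s data with one lost epoch, then a long gap before the setting pass.
times = [0, 5, 10, 20, 25, 3000, 3005, 3010, 3015]
elevations = [5.0, 6.0, 7.0, 9.0, 10.0, 30.0, 28.0, 26.0, 24.0]
tracks = split_tracks(times, elevations, interval_s=5.0, gap_epochs=7)
```

In this example the single lost epoch between 10 s and 20 s does not break the first track, while the long gap before 3000 s starts a second one.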

Steps 3, 4, 5 and 6 described in the introduction section are performed for every track; finally, A and ϕ parameters and their standard deviations are estimated by the least-squares algorithm.
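Once the reflector height h is fixed, Eq. (1) is linear in a = A cos ϕ and b = −A sin ϕ, so A, ϕ and their standard deviations can be obtained without iteration. The sketch below uses this equivalent linear formulation with NumPy rather than an iterative solver started at A = 1 and ϕ = 0; the function name is hypothetical and the standard deviations come from first-order error propagation.

```python
import numpy as np


def fit_amplitude_phase(sin_elev, reflected, height, wavelength):
    """Least-squares estimate of A and phi in Eq. (1) for a known height.

    Returns (A, phi, sigma_A, sigma_phi, residuals), with phi in radians.
    """
    x = np.asarray(sin_elev, dtype=float)
    y = np.asarray(reflected, dtype=float)
    omega = 4.0 * np.pi * height / wavelength
    # A*cos(omega*x + phi) = a*cos(omega*x) + b*sin(omega*x)
    design = np.column_stack([np.cos(omega * x), np.sin(omega * x)])
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ np.array([a, b])
    dof = max(len(y) - 2, 1)
    sigma2 = float(residuals @ residuals) / dof
    cov_ab = sigma2 * np.linalg.inv(design.T @ design)
    amplitude = float(np.hypot(a, b))
    phase = float(np.arctan2(-b, a))
    # Jacobian of (A, phi) with respect to (a, b) for error propagation.
    jac = np.array([[a / amplitude, b / amplitude],
                    [b / amplitude ** 2, -a / amplitude ** 2]])
    cov = jac @ cov_ab @ jac.T
    return (amplitude, phase,
            float(np.sqrt(cov[0, 0])), float(np.sqrt(cov[1, 1])), residuals)


# Synthetic check: A = 5, phi = 0.7 rad, h = 1.8 m, L1 wavelength from the text.
rng = np.random.default_rng(0)
sin_e = np.sin(np.radians(np.linspace(5.0, 30.0, 300)))
signal = 5.0 * np.cos(4.0 * np.pi * 1.8 / 0.1904 * sin_e + 0.7)
signal = signal + rng.normal(0.0, 0.1, sin_e.size)
A, phi, sigma_A, sigma_phi, res = fit_amplitude_phase(sin_e, signal, 1.8, 0.1904)
```

The residual vector returned here is the one whose mean and standard deviation enter the track selection criteria.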

One of the main advantages of Python is the large number of free, easily accessible scientific libraries. The most complex process in this case is computing the Lomb-Scargle periodogram (step 5 in the introduction section). For this, the LombScargle class of the Astropy library is used (https://docs.astropy.org/en/stable/timeseries/lombscargle.html). The inputs for computing the Lomb-Scargle periodogram of each track are the sine of the satellite elevation angle on the X-axis and the \({SNR}_{linear}^{reflected}\) on the Y-axis. With this configuration, the frequency on the output X-axis can be converted into the antenna reflector height in meters. However, because the sine of the satellite elevation angle is used, the sampling on the X-axis is irregular. To determine the appropriate frequency grid, the library provides options through keywords passed to the autopower() method. By default (after some experimental tests), the highest frequency is fixed to two times the average Nyquist frequency: Nyquist_factor = 2 in the code. Additionally, the maximum reflector height allowed in the output periodogram is fixed to 2.5 m, but this parameter can be changed if the antenna height is greater. A theoretical discussion of the frequency extraction from GNSS-IR can be found in Roesler and Larson (2018).
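The link between periodogram frequency and reflector height follows from Eq. (1): the argument 4πh/λ · sin e equals 2πf · sin e with f = 2h/λ, so h = fλ/2. The sketch below illustrates this with a plain NumPy implementation of the classical Lomb-Scargle periodogram evaluated directly on a grid of reflector heights. It is a stand-in for illustration only: the software itself relies on Astropy's LombScargle, and the function name, grid step and synthetic data here are assumptions.

```python
import numpy as np


def lomb_scargle_heights(sin_elev, reflected, wavelength,
                         max_height=2.5, step=0.005):
    """Classical Lomb-Scargle power on a grid of reflector heights (meters)."""
    x = np.asarray(sin_elev, dtype=float)
    y = np.asarray(reflected, dtype=float)
    y = y - y.mean()
    heights = np.arange(step, max_height + step / 2.0, step)
    power = np.empty_like(heights)
    for k, h in enumerate(heights):
        w = 4.0 * np.pi * h / wavelength  # angular frequency in sin(e)
        # Time offset that makes the sine and cosine terms orthogonal.
        tau = np.arctan2(np.sum(np.sin(2.0 * w * x)),
                         np.sum(np.cos(2.0 * w * x))) / (2.0 * w)
        c = np.cos(w * (x - tau))
        s = np.sin(w * (x - tau))
        power[k] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return heights, power


# Synthetic track with a 1.8 m reflector height and the L1 wavelength.
rng = np.random.default_rng(0)
wavelength = 0.1904
sin_e = np.sin(np.radians(np.linspace(5.0, 30.0, 300)))
signal = 5.0 * np.cos(4.0 * np.pi * 1.8 / wavelength * sin_e + 0.7)
signal = signal + rng.normal(0.0, 0.5, sin_e.size)
heights, power = lomb_scargle_heights(sin_e, signal, wavelength)
peak_height = heights[np.argmax(power)]
```

Here `peak_height` is the reflector height at which the power is largest; with Astropy, the same quantity would be obtained by multiplying the frequency of the highest peak by λ/2.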

The most important step in this module is to establish the conditions for selecting “good” tracks. Based on different experiments, a track is considered valid if all of the following hold: the track contains more than 30 min of observations; the elevation range covered by the satellite exceeds the minimum value set previously; the power of the dominant frequency is 6 times larger than the mean background noise; and the adjustment of the theoretical wave to the \({SNR}_{linear}^{reflected}\) signal gives a residual vector with a mean of less than 1.3 V/V and a standard deviation of less than 25 V/V.
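These conditions translate directly into a small predicate. The sketch below is illustrative: the function and argument names are hypothetical, the background noise is taken as a single number supplied by the caller, and the residual mean is compared in absolute value, which is an assumption.

```python
from statistics import mean, stdev


def is_valid_track(duration_min, elev_range_deg, peak_power,
                   background_power, residuals, min_elev_range_deg=10.0):
    """Apply the four track selection conditions described in the text."""
    return (
        duration_min > 30.0
        and elev_range_deg > min_elev_range_deg
        and peak_power >= 6.0 * background_power
        and abs(mean(residuals)) < 1.3
        and stdev(residuals) < 25.0
    )
```

A track lasting 45 min, covering 20 degrees of elevation, with a peak 12 times above the background and small residuals passes all four tests; shortening it to 20 min is enough to reject it.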

An output file is generated, in which each line corresponds with one track and contains the following columns:

  1. numerical identifier related to the epochs of the observed file

  2. yes/no text field related to the previous conditions (yes indicates “good or valid” tracks)

  3. year of the observation time

  4. month of the observation time

  5. day of the observation time

  6. satellite identification number

  7. initial satellite track azimuth

  8. final satellite track azimuth

  9. initial satellite track elevation

  10. final satellite track elevation

  11. adjusted A value

  12. standard deviation of the adjusted A value

  13. adjusted ϕ value (in degrees)

  14. standard deviation of the adjusted ϕ value (in degrees)

  15. computed reflector height

  16. satellite-falling/satellite-rising text field

A second output file is also created, containing the same information but only for the tracks considered “good or valid”. For every track, valid or invalid, a figure with four panels is generated: the complete signal (SNR), the reflected signal (\({SNR}_{linear}^{reflected}\)), the Lomb-Scargle periodogram of the reflected signal, and the reflected signal with the adjusted wave. The figure name includes the numerical identifier, year, month, day, satellite identifier, initial azimuth, and final azimuth, so that the user can reliably match each figure with the corresponding line in the output files.

GPS-IR example

The data provided with the software are part of an experiment performed at the Cajamar Center of Experiences, Paiporta, Valencia, Spain (Fig. 1; Martín et al. 2020). We provide 7 days (of a total of 66 in the experiment) of GPS observations from a geodetic GNSS receiver (Trimble R10, from the Department of Cartographic Engineering, Geodesy and Photogrammetry of the Universitat Politècnica de València). RINEX observation files (with a 5-second sampling interval) and navigation files are provided. The L1 frequency is used for the example.

Fig. 1 Geodetic-quality GNSS antenna located in the experiment zone

The experiment was performed from December 3, 2018, to February 6, 2019. The observation files included with the software are daily data files from January 14 to 20, 2019, and the antenna height above the ground was 1.8 m.

No rain occurred during the observation period.

In addition to the software, the output file of the first software module is provided, along with the figures and the two output files of the second module for satellite number 15, so that the user can validate the software operation. Figures 2–5 show the track with epoch identifier 88,518 (January 19, 2019), with a starting azimuth of 281 degrees and an ending azimuth of 300.1 degrees. Specifically, Fig. 2 shows the SNR data in dB-Hz, Fig. 3 the indirect SNR data in volts, Fig. 4 the Lomb-Scargle periodogram of the reflected SNR signal, and Fig. 5 the reflected SNR signal with the adjusted wave in volts.

Fig. 2 Observed SNR data in dB-Hz

Fig. 3 Reflected or indirect SNR data in Volts

Fig. 4 Lomb-Scargle periodogram for the SNR reflected signal

Fig. 5 SNR reflected signal with the adjusted wave in Volts

With the established criteria for selecting the correct tracks, the confusion matrix obtained for all satellites during the 7 observation days presents the following values: 45% of tracks during the seven days are valid and classified as such, 43% are invalid and are classified as such, 9% are invalid tracks classified as valid, and only 3% of the tracks are valid but classified as invalid.
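The four percentages can be condensed into the usual summary metrics. The short calculation below is only arithmetic on the values quoted above; the derived figures are not reported in the text and the variable names are chosen for illustration.

```python
# Percentages of all tracks, taken from the confusion matrix above.
valid_as_valid = 45.0      # valid tracks classified as valid
invalid_as_invalid = 43.0  # invalid tracks classified as invalid
invalid_as_valid = 9.0     # invalid tracks classified as valid
valid_as_invalid = 3.0     # valid tracks classified as invalid

total = valid_as_valid + invalid_as_invalid + invalid_as_valid + valid_as_invalid

# Share of tracks that the criteria classify correctly.
accuracy = (valid_as_valid + invalid_as_invalid) / total
# Share of tracks accepted as valid that really are valid.
precision = valid_as_valid / (valid_as_valid + invalid_as_valid)
# Share of truly valid tracks that the criteria accept.
recall = valid_as_valid / (valid_as_valid + valid_as_invalid)

print("accuracy=%.3f precision=%.3f recall=%.3f" % (accuracy, precision, recall))
```

Read this way, the criteria classify 88% of the tracks correctly, and they accept invalid tracks (9%) more often than they reject valid ones (3%).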

Conclusions and final remarks

We share the software with the scientific community to introduce new users to the GNSS-IR technique. This technique can be used not only for soil moisture monitoring but also, for example, for vegetation water content monitoring (Wan et al. 2015), snow depth measurements (Larson et al. 2009), or tide measurements (Roussel et al. 2015), and users can adapt the software to these other purposes.

The software can also be extended to work with Galileo, GLONASS and BeiDou satellite observations and frequencies by modifying only the first software module.

Availability of data and material

The software is available from the GPS Toolbox website at https://geodesy.noaa.gov/gps-toolbox/.