1 Introduction

In the last two decades, global navigation satellite system (GNSS) has seen tremendous advances in the precision of the observations, which allow researchers to perform geodynamics and geophysics studies through the analysis of daily GNSS position time series (He et al. 2017). GNSS observations have been used to study geophysical phenomena such as plate tectonics (e.g., Gordon and Stein 1992; Blewitt 1993; Fernandes et al. 2003), crustal deformation due to earthquakes (Montillet et al. 2015), tectonic strain and glacial isostatic adjustment (e.g., Lidberg et al. 2007; Tregoning and Watson 2009; Steffen and Wu 2011) and vertical land motion to study the sea level variations (Bos et al. 2013b; Santamaría-Gómez et al. 2017; Montillet et al. 2018).

These geophysical processes can be modelled by fitting a composite trend (composed of linear, sinusoidal, offsets and even nonlinear signals) to the observations. Johnson and Agnew (1995) pointed out that the noise in GNSS time series is temporally correlated and that the power spectral density P of the noise can be described by a power-law noise model (Williams 2003):

$$ P\left( f \right) = P_{0} (f/f_{0} )^{\kappa } $$
(1)

where f is the frequency and P0 and f0 are two constants representing the amplitude and reference frequency, respectively, and κ is called the spectral index. This type of noise should be taken into account to avoid an underestimation of the linear trend uncertainty by a factor of 5–11 (Mao et al. 1999). This noise behaviour is present in most GNSS solutions as demonstrated by Williams et al. (2004) who analysed a global set of 414 GNSS coordinate time series. They concluded that a combination of a power-law (PL) noise plus a white noise (WN) model provides an adequate representation for the noise that existed in most of the time series. Amiri-Simkooei et al. (2007), Langbein (2008), King and Williams (2009) and Santamaría-Gómez et al. (2011) reached similar conclusions with longer time series. Note that King and Williams (2009) investigated the temporal correlations of the GNSS time series spanning several years over short baselines (< 1 km) in order to characterise the stability of GNSS monuments. On the same topic, Hill et al. (2009) achieved similar results on the stability of braced monuments with a short baseline network of stations.

Presently, with the availability at some stations of time series with a span of more than 20 years, we are able to study the spectral power at even lower frequencies. This enables us to verify if the PL + WN model still dominates below 0.1–0.2 cpy in the power spectrum, or if we can detect a start of a flattening of the power spectral density \( P\left( f \right) \). This problem is summarised in Fig. 1 which shows the standard PL + WN model in black. One of our objectives is to investigate whether we can detect this flattening, represented by the blue line in Fig. 1. The Generalised Gauss–Markov (GGM) noise model of Langbein (2004) is suitable to model this effect. In fact, this flattening of the power spectrum implies that the noise is no longer a long-memory process which results in lower uncertainties of the estimated velocity.

Fig. 1

Schematic representation of our research objective which tries to detect flattening of the power-law noise at the low frequencies (blue line) or increase of the power due to random-walk noise (red line)

Our task is to find the best description of the noise at the lowest observed frequencies and to extrapolate that to even lower frequencies using a stochastic model. In addition, Langbein (2012) has emphasised that random-walk (RW) noise might exist in the time series which, even if small, can change the estimated trend uncertainty by a factor of 2 (cf. Fig. 1). This was further investigated by Dmitrieva et al. (2015) who, by stacking several GNSS time series, showed that RW noise was indeed present in their data. Recently, Langbein and Svarc (2019) analysed a set of 740 GNSS sites in the western USA. Their study supports the finding that long GNSS time series with at least 9.75 years of data display significant temporal correlations, and that the stochastic noise is best described by a combination of white, flicker, random-walk and bandpass-filtered noise. Studying very long time series also allows us to confirm these results.

Most previous studies used the log-likelihood value to select the optimal noise model. Both Santamaría-Gómez et al. (2011) and Langbein (2012) used synthetic time series to determine how well this criterion can be used for noise model selection. In our research, we extend this approach by using information criteria, which originate from signal processing applied to telecommunications (Akaike 1974). Unlike the conventional hypothesis testing-based approach, information criteria do not require any subjective threshold settings. For example, the number of signals buried in the receiver’s noise floor is obtained by minimising a nominated information criterion (Proakis 2001; Hacker and Hatemi 2018). Various criteria have been developed since the pioneering work of Akaike (1974) with applications in time series analysis (Burnham and Anderson 2002). Several studies (e.g., Bos et al. 2013b) have applied information criteria to study the impact of the stochastic model on estimated geophysical signals.

The next section defines the selection of the optimal stochastic model for a nominated GNSS time series based on various information criteria. We study in particular the selection of the stochastic noise model as a function of the time span of the time series.

2 Choosing the optimal noise model for GNSS time series

GNSS coordinate time series are assumed to consist of signal plus noise. Bevis and Brown (2014) introduced the term ‘trajectory model’ to describe the signal and give a detailed overview of the various components that are normally included in this model in addition to the linear trend, such as seasonal signals, offsets and post-seismic relaxation. The purpose of most GNSS time series analysis is to estimate accurately the parameters within the trajectory model.

We obtain residuals that represent the noise after subtracting the trajectory model from the observations. As mentioned in the previous section, a PL model is widely used to describe the noise at the low frequencies. Furthermore, the spectral index \( \kappa \) in Eq. 1 is normally around − 1 which is called flicker noise (FN). White noise is mostly present at the high frequencies. The combination of noise models FN + WN has two free parameters (v = 2) which are the two noise amplitudes for each model. For similar reasons v = 3 for the noise model combination FN + RW + WN. For PL + WN, the number of parameters is also three since the spectral index also needs to be estimated in addition to the two noise amplitudes. GGM + WN is similar to PL + WN, but it requires another parameter that controls the frequency where the flattening of the power spectrum at the low frequencies begins (Langbein 2004), so v = 4. To decide which combination of noise models is the most accurate representation of the real noise, one can use the \( w \)-test (Amiri-Simkooei et al. 2007). One can also compute the likelihood function \( L \), which depends on the chosen noise model through the covariance matrix, to select the most likely combination of noise models. This approach has been followed by Williams et al. (2004), Langbein (2004), and Amiri-Simkooei (2016). However, a combination of noise models with more parameters will normally fit the observed noise better, because the extra parameters provide more flexibility. To compensate for this overfitting, penalty functions can be added to the log-likelihood function. The most widely used are the Akaike information criterion (AIC; Akaike 1974) and the Bayesian information criterion (BIC; Schwarz 1978). These criteria are given by:

$$ AIC = - 2 { \log }\left( L \right) + 2v $$
(2)
$$ BIC = - 2 { \log }\left( L \right) + { \log }\left( n \right)v $$
(3)

where \( n \) is the number of data points. Due to the minus sign before the logarithm, the optimum stochastic model is chosen by minimising a nominated criterion (e.g., AIC, BIC). A larger number of parameters \( v \) increases the AIC and BIC values and thus serves as a penalty term. Since we are studying long time series (\( n \approx 6000 \to { \log }\left( n \right) = 8.7 \)), BIC penalises extra parameters in the noise models more than AIC, which will affect GGM + WN the most. The BIC is actually derived by assuming that \( n \) is very large (Burnham and Anderson 2002). In order to avoid this approximation, the original factor \( 2\pi \) in the derivation presented by Schwarz (1978) is reintroduced into the criterion, which we call here BIC_tp:

$$ BIC\_tp = -\, 2{ \log }\left( L \right) + { \log }\left( {\frac{n}{2\pi }} \right)v $$
(4)

This results in an information criterion whose penalty for adding more parameters to the noise model lies between those of AIC and BIC. All three information criteria have been implemented in the Hector software package (Bos et al. 2013a), which we used to estimate the parameters of the different combinations of noise models in this study.
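As an illustration of Eqs. 2–4, the following minimal Python sketch evaluates the three criteria and the penalty per extra parameter for \( n = 6000 \). The log-likelihood values are hypothetical numbers chosen only for illustration; they are not results from this study, and the function is not the Hector implementation.

```python
import math

def information_criteria(log_l, v, n):
    """Return (AIC, BIC, BIC_tp) following Eqs. 2-4.

    log_l : log-likelihood of the fitted noise model
    v     : number of free noise parameters
    n     : number of data points
    """
    aic = -2.0 * log_l + 2.0 * v
    bic = -2.0 * log_l + math.log(n) * v
    bic_tp = -2.0 * log_l + math.log(n / (2.0 * math.pi)) * v
    return aic, bic, bic_tp

n = 6000
# Penalty added per extra noise parameter by AIC, BIC and BIC_tp
penalties = (2.0, math.log(n), math.log(n / (2.0 * math.pi)))

# Hypothetical log-likelihoods (assumed values, for illustration only)
pl_wn = information_criteria(log_l=-8000.0, v=3, n=n)
ggm_wn = information_criteria(log_l=-7997.0, v=4, n=n)

for name, pl, ggm in zip(("AIC", "BIC", "BIC_tp"), pl_wn, ggm_wn):
    preferred = "GGM+WN" if ggm < pl else "PL+WN"
    print(f"{name}: PL+WN = {pl:.2f}, GGM+WN = {ggm:.2f}, preferred = {preferred}")
```

With these assumed values, the gain of 3 in log-likelihood is enough for AIC to prefer GGM + WN, but not for BIC or BIC_tp, whose penalties per parameter are about 8.70 and 6.86, respectively.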

3 Evaluation of information criteria and synthetic time series analysis

To verify the performance of the information criteria we created synthetic time series with a linear trend, an annual and a semi-annual signal but without data gaps or offsets. The exact values for this trajectory model varied randomly between each time series, but the standard deviations of the trend, annual and semi-annual signal amplitudes were 10 mm/year, 2 mm and 0.5 mm, respectively. To these time series, we added noise of various types. Note that flicker noise is defined as power-law noise with spectral index \( \kappa = - 1 \). To separate the two, we define power-law noise to have a spectral index of − 0.9. Another complication is that when the parameter \( \phi \) in the GGM is close enough to 1, it becomes equal to pure power-law noise. A value of \( \phi = 1 - 0.0017 = 0.9983 \) creates a flattening of the spectrum around a period of 10 years, see Fig. 2. Using the results of Dmitrieva et al. (2015), the standard deviation \( \sigma_{rw} \) of the random walk has been set to 2 mm/year\(^{0.5}\). For each type of noise, we created 500 time series, only noise, with a length of 3000, 6000 and 9000 days (8.2, 16.4 and 24.6 years), respectively. The amplitude of the FN/PL/GGM noise was 10 mm/year\(^{0.25}\) (for \( \kappa = - 1 \)).
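The text does not describe how the synthetic noise was generated. One common technique, sketched below as an assumption, is to convolve white noise with the impulse response of the fractional filter \( (1 - \phi B)^{-d} \), where \( B \) is the backshift operator and \( d = -\kappa /2 \); setting \( \phi = 1 \) gives pure power-law noise. The function names and the unit driving-noise amplitude are illustrative choices, and no conversion to the amplitude units used above is attempted.

```python
import numpy as np

def filter_coefficients(kappa, phi, n):
    """Impulse response of (1 - phi*B)^(-d) with d = -kappa/2.

    phi = 1 gives pure power-law noise; phi < 1 flattens the low frequencies.
    """
    d = -kappa / 2.0
    h = np.ones(n)
    for i in range(1, n):
        h[i] = h[i - 1] * phi * (d + i - 1.0) / i
    return h

def synthetic_noise(kappa, phi, n, sigma, rng):
    """Coloured noise: white noise (standard deviation sigma) convolved with the filter."""
    h = filter_coefficients(kappa, phi, n)
    w = rng.normal(0.0, sigma, n)
    return np.convolve(h, w)[:n]

rng = np.random.default_rng(42)
n = 3000  # days
flicker = synthetic_noise(-1.0, 1.0, n, 1.0, rng)      # FN, kappa = -1
power_law = synthetic_noise(-0.9, 1.0, n, 1.0, rng)    # PL, kappa = -0.9
ggm = synthetic_noise(-1.0, 0.9983, n, 1.0, rng)       # GGM, phi = 1 - 0.0017
random_walk = synthetic_noise(-2.0, 1.0, n, 1.0, rng)  # RW, kappa = -2
```

Two limiting cases serve as a sanity check of the recursion: for \( \kappa = -2 \) and \( \phi = 1 \) all coefficients equal 1, so the filter reduces to a cumulative sum (random walk), and for \( \kappa = 0 \) only the first coefficient is nonzero, which leaves white noise.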

Fig. 2

Power spectral density for GGM with \( \phi \) values of \( 1 - 0.0017 \) and \( 1 - 0.02 \)
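The link between \( \phi \) and the period at which the flattening starts can be checked with a short calculation. The sketch below assumes that GGM noise is modelled as white noise passed through the filter \( (1 - \phi B)^{-d} \) with \( d = -\kappa /2 \) and daily sampling; under that assumption the one-sided spectrum is proportional to \( (1 + \phi^{2} - 2\phi \cos \omega )^{\kappa /2} \), which flattens for angular frequencies below roughly \( 1 - \phi \) radians per sample. The function names are illustrative.

```python
import math

def ggm_psd(f, kappa, phi, sigma=1.0, fs=1.0):
    """One-sided PSD of white noise filtered by (1 - phi*B)^(kappa/2).

    f and fs (sampling frequency) are in cycles per day; sigma is the
    standard deviation of the driving white noise.
    """
    omega = 2.0 * math.pi * f / fs
    return 2.0 * sigma**2 / fs * (1.0 + phi**2 - 2.0 * phi * math.cos(omega)) ** (kappa / 2.0)

def crossover_period_years(phi, fs=1.0):
    """Approximate period (years) below which the spectrum follows a power law."""
    return 2.0 * math.pi / ((1.0 - phi) * fs) / 365.25

for one_minus_phi in (0.0017, 0.02):
    period = crossover_period_years(1.0 - one_minus_phi)
    print(f"1 - phi = {one_minus_phi}: flattening starts near {period:.2f} years")
```

This approximation gives cross-over periods of about 10.1 years and 0.86 years, consistent with the values of around 10 years and 1 year quoted for the two \( \phi \) values.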

Dmitrieva et al. (2017) showed that the estimation of the linear trend absorbs a part of the noise at the low frequencies. This creates a slight flattening of the power spectrum, favouring the GGM noise model. Furthermore, RW is not always detected by maximum likelihood estimation (Zhang et al. 1997), which converges to zero amplitude of RW, when the FN amplitude is too large.

The results of our analysis of synthetic time series are summarised in Tables 1 and 2 which, for two different values of \( \phi \), list the percentages of how many times the true noise model was selected (true positives, TP, in bold) and how many times another model was selected (false positives, FP).

Table 1 Results of finding the underlying noise model using various information criteria
Table 2 Same as Table 1 but \( \phi = 1 - 0.02 = 0.98 \) for GGM

From Tables 1 and 2, various conclusions can be drawn which we will also observe in real GNSS time series. First, Table 1 shows that the log-likelihood criterion, which was used by most studies, favours the GGM noise model when the cross-over period is around 10 years. The same applies to the AIC, while BIC and BIC_tp have trouble distinguishing between GGM and PL/FN noise models.

This situation is significantly improved if we use GGM time series with a flattening that starts at a period of 1 year (\( \phi = 1 - 0.02 = 0.98 \)). The influence of the \( \phi \) parameter on the flattening of the power spectral density is shown in Fig. 2. If we force a stronger flattening of the power spectrum, then it becomes easier to separate the GGM type of noise from the rest. Table 2 demonstrates this produces true positive percentages of around 90% for GGM, PL and FN for all criteria.

Santamaría-Gómez et al. (2011) also found various time series with GGM noise, but these were discarded in order to ensure the computed velocity uncertainties were conservative. Note that even for a cross-over of a year, i.e., Table 2, and for time series with a length of 8.2 years (n = 3000) around 40% of real PL noise is still classified wrongly as GGM. In this research, the length of the time series is closer to 16.4 years (n = 6000) and this helps to improve the separation between GGM and PL noise.

Secondly, both Tables 1 and 2 show that FN + RW has a low percentage of TP, but the FP percentage is zero. In other words, if one finds FN + RW noise, then one can be very confident that it is indeed FN + RW noise. The TP percentage depends on the ratio of the FN and RW noise amplitudes. The lower the FN noise, the easier it is to detect RW noise. Table 2 also supports the results established in Langbein (2012) where the log(L) criterion selects the PL noise model in 50–70% of the cases instead of the true FN + RW model. Tables 1 and 2 clearly show the increase in the TP percentage of detecting RW when the length of the time series is increased.

Last, Tables 1 and 2 also show that the log(L) cannot separate PL from FN noise. The reason is simply that FN is equal to PL when the spectral index is − 1. Without extra penalties for including an extra parameter in the noise model, the two noise models are identical in this particular case. The BIC_tp criterion can separate FN from PL noise, and to investigate its discriminating power, we created power-law noise with a spectral index \( \kappa \) ranging from − 1.3 to − 0.7 with synthetic time series of 6000 days. The TP and FP percentages (for GGM and FN) are shown in Fig. 3. It shows that for a spectral index between − 1.05 and − 0.95, pure flicker noise is the preferred noise model. This gives us an indication of when one can use a simple FN + WN noise model and when one should use a PL + WN noise model.

Fig. 3

The TP percentage for synthetic power-law time series of 6000 days for various values of \( \kappa \). Also shown are the FP percentages for the FN and GGM models. The dotted lines were computed using time series of 3000 days. Results are obtained using the BIC_tp criterion and demonstrate for which spectral index range FN can be separated from PL

We repeated these simulations using time series with a length of 3000 days (8.2 years), and the results are displayed using dotted lines. Now the range of the spectral index for which the flicker noise model is preferred is wider. Note that for these shorter time series we have a higher percentage of false positives of GGM, see also Tables 1 and 2.

4 The processing of GNSS daily position time series

We analysed daily time series from 110 stations of the International GNSS Service (IGS). The daily positions were computed using GIPSY-OASIS v6.3 (Bertiger et al. 2010) with the Precise Point Positioning (PPP) strategy (Zumberge et al. 1997). This approach, based on undifferenced data, makes it possible to compute the position of each station individually by using satellite orbit and clock parameters provided by the Jet Propulsion Laboratory (JPL). These parameters are kept fixed during the processing, so that the position of the station is computed by minimising the clock errors of the receiver. To maintain consistency, we also applied the daily transformation parameters estimated by JPL to align the solution with ITRF2008. We carried out a dedicated processing of all IGS stations using the same parameters and models described by Neres et al. (2016). The observations were taken between 1996 and 2017, and only time series spanning more than 12 years were used.

Before estimating the stochastic and trajectory models using the Hector software (Bos et al. 2013a, b), we include several steps to remove outliers and correct known offsets. Outliers are removed by first fitting the trajectory model to the observations using a WN model. Afterwards the misfit between observations and model is computed. As a rule of thumb, everything falling outside 3 times the interquartile range is considered to be an outlier (Langbein and Bock 2004). However, the GNSS time series may still contain various offsets from their nominal values due to either geophysical sources (earthquake ruptures) or non-geophysical errors (antenna height metadata errors, phase centre modelling errors, or other man-made and software-dependent errors). In this work, we use a new feature of the Hector software based on automatic offset detection (Fernandes and Bos 2016). The trajectory model used is a linear trend with an annual and semi-annual signal plus the aforementioned offsets.
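The outlier step can be sketched in a few lines of Python. This is not the Hector implementation: the function names are hypothetical, the white-noise fit is done with ordinary least squares, and "outside 3 times the interquartile range" is interpreted here as a residual lying more than 3 × IQR from the median residual, which is an assumption about the exact rule. The test data are simulated.

```python
import numpy as np

def design_matrix(t_years, offset_epochs=()):
    """Trajectory model: intercept, trend, annual and semi-annual terms, step offsets."""
    cols = [np.ones_like(t_years), t_years - t_years.mean()]
    for period in (1.0, 0.5):  # periods in years
        w = 2.0 * np.pi / period
        cols += [np.cos(w * t_years), np.sin(w * t_years)]
    for epoch in offset_epochs:  # known offset epochs in years
        cols.append((t_years >= epoch).astype(float))
    return np.column_stack(cols)

def flag_outliers(t_years, y, offset_epochs=(), factor=3.0):
    """Return a boolean mask that is True for observations to keep."""
    A = design_matrix(t_years, offset_epochs)
    x, *_ = np.linalg.lstsq(A, y, rcond=None)  # white-noise (ordinary least-squares) fit
    r = y - A @ x                              # misfit between observations and model
    q1, med, q3 = np.percentile(r, [25, 50, 75])
    return np.abs(r - med) <= factor * (q3 - q1)

# Simulated daily series (mm): trend, annual signal, one offset, three gross errors
rng = np.random.default_rng(0)
t = np.arange(3000) / 365.25
y = 5.0 * t + 2.0 * np.sin(2.0 * np.pi * t) + rng.normal(0.0, 1.0, t.size)
y[t >= 4.0] += 8.0
spikes = [100, 1500, 2500]
y[spikes] += 50.0

keep = flag_outliers(t, y, offset_epochs=[4.0])
print(f"removed {np.count_nonzero(~keep)} of {t.size} observations")
```

Using quartiles rather than the standard deviation keeps the threshold itself insensitive to the outliers being searched for.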

5 Influence of time span on selection of noise model

Before analysing our complete set of 110 stations, we first focus on the time series of 20 globally distributed permanent IGS stations with a time span of over 19.3 years (January 1996–October 2017) and few missing data (i.e., less than 6% of data gaps). Figure 4 and electronic supplement Table X1 show the selected stations. For the 20 IGS stations, the average rate of data gaps is 3.1% (maximum 5.7%) and the average time span is 21.7 years (minimum 19.3 years).

Fig. 4

Distribution of the analysed 20 IGS stations

From these long time series, we produce sub-time series of 6, 9, 12, 15, 18 and 20 years. Each time series is analysed with Hector, and using the various information criteria, the optimal noise model is selected (see Electronic Supplement Tables X2, X3 and X4). Using the results from the analysis of synthetic time series, we only consider the detection of GGM as the most probable noise model to be significant if the value of \( \phi < 0.98 \). The latter parameter is also estimated by Hector. If this condition is not met, then the second most likely noise model is chosen. Without this condition, AIC would detect a GGM noise model in around 90% of the stations, in agreement with our results shown in Table 1. Figure 5 displays the results for the sub-time series with length equal to 6 and 20 years.
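The selection rule described above can be written as a short function. This is a minimal sketch: the function name, the model labels and the criterion values are hypothetical and serve only to show the logic of falling back to the second-ranked model when the estimated \( \phi \) is not below 0.98.

```python
def select_noise_model(criterion, ggm_phi, phi_max=0.98, ggm_name="GGM+WN"):
    """Pick the model with the lowest criterion value.

    criterion : dict mapping model name to its AIC, BIC or BIC_tp value
    ggm_phi   : phi estimated for the GGM model
    GGM is accepted only if ggm_phi < phi_max; otherwise the
    second most likely model is returned.
    """
    ranked = sorted(criterion, key=criterion.get)
    if ranked[0] == ggm_name and ggm_phi >= phi_max and len(ranked) > 1:
        return ranked[1]
    return ranked[0]

# Hypothetical BIC_tp values for one coordinate time series
bic_tp = {"FN+WN": 16021.4, "PL+WN": 16020.6, "FN+RW+WN": 16027.9, "GGM+WN": 16018.2}

weak_flattening = select_noise_model(bic_tp, ggm_phi=0.9983)  # falls back to PL+WN
strong_flattening = select_noise_model(bic_tp, ggm_phi=0.95)  # keeps GGM+WN
print(weak_flattening, strong_flattening)
```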

Fig. 5

The influence of the length of the time series (6 and 20 years) on the preferred noise model using AIC, BIC and BIC_tp

It can be seen that there are two stations which have FN + RW + WN as their optimal noise model, but only for the longest time series, in agreement with our synthetic time series analysis results discussed in Sect. 3. Several studies, including Williams et al. (2004), showed that the detection of RW noise becomes easier with longer time series. Figure 5 demonstrates this for stations HOFN and MONP. Another example is the East component of the station GUAM; for time series with a length of 6 years, the Hector software estimates a zero RW noise amplitude. However, for time series with a length of 20 years, Hector estimates an RW amplitude of 1.3 mm/year\(^{0.5}\). As shown in Sect. 3, there is no false detection of RW noise for any of the information criteria, so this detection of RW is very likely to be correct.

Figure 6 displays the power spectral density (PSD) of the residual time series for one component of three selected GNSS stations. The left panel shows the standard PL + WN noise for the Up component of DRAO, which is present in most GNSS time series. The middle panel shows a distinct increase of the spectral index at the low frequencies for the East component of HOFN, due to the presence of RW noise. Finally, the right panel shows the flattening of the noise for the Up component of WSRT when the GGM noise model is used.

Fig. 6

Power spectral density plots of three stations: DRAO (21.7 years), HOFN (19.6 years) and WSRT (20.0 years) showing PL, FN + RW and GGM noise, respectively

6 Influence of stochastic noise model and time span on velocity uncertainties

In the previous section, we discussed the selection of the best noise model. In this section, we explore the influence of the noise model on the velocity uncertainties. Figure 7 displays the evolution of the velocity uncertainty for four different noise models as a function of the length of the time series for the three components of stations ALBH, DRAO and MAS1. The velocities were estimated using Hector by fitting a standard trajectory model to the observations which includes an annual and semi-annual signal. We also corrected for offsets.

Fig. 7

Evolution of velocity uncertainty of GNSS time series with different stochastic model and increasing time span

Figure 7 shows that in most cases the trend uncertainties for the noise models FN + WN and FN + WN + RW are in agreement. If the maximum likelihood estimation method used in Hector does not find the random-walk noise, then it sets the RW amplitude to zero. Table 3 displays the fraction, indicating the amount of RW noise, and the amplitude of the RW component (in mm/year\(^{0.5}\)) for the station coordinates with a nonzero value, under the assumption of the FN + RW + WN model for time series with a data span of approximately 20 years.

Table 3 Fraction and amplitude (mm/year\(^{0.5}\)) of RW component under the assumption of FN + RW + WN model for time series of stations with ~ 20 years data span

For some components, we can see that the ratio FN + WN + RW/FN + WN is around 1 which indicates that no RW component was found. However, when RW noise is present, there is a relatively large effect on the velocity uncertainty. This analysis implies that a small portion of RW noise can cause a significant effect on the velocity uncertainty, supporting previous studies (e.g., Langbein 2012). This factor can increase to 8.4 for stations with large RW noise such as ALBH. Note that the ALBH station records slow slip events due to silent earthquakes in the Cascadia range (Melbourne and Webb 2003). Thus, most of the RW noise found may result from the residual misfit between the functional model and the time series.

To further analyse the effect of different noise models on the GNSS station velocity uncertainty, the evolution of the ratio of the velocity uncertainty for the FN + RW + WN, GGM + WN and PL + WN models over that for FN + WN, from 6 to 20 years, is listed in Table X5 (Electronic supplement). For the analysed 20 IGS stations, we find that the ratio FN + WN + RW/FN + WN is sometimes around 1. Besides, it can be seen from Fig. 6 and Table X5 that the GGM model fits better than other models when increasing the length of the time series, thus validating the assumption of a flattening of the power spectrum over time. Table 4 shows a statistical analysis of the average evolution of these velocity uncertainty ratios for the 20 stations at both time scales. It can be seen that the trend uncertainty obtained with the GGM + WN model is approximately 0.55 times (between 0.5 and 0.6) that obtained with the FN + WN model.

Table 4 Evolution of the fraction of velocity uncertainty FN + RW + WN, GGM + WN, PL + WN over FN + WN from 6 to 20 years. \( Ratio_{{\left( {FN + RW + WN} \right)/\left( {FN + WN} \right)}} = A \), \( Ratio_{{\left( {GGM + WN} \right)/\left( {FN + WN} \right)}} = B \), \( Ratio_{{\left( {PL + WN} \right)/\left( {FN + WN} \right)}} = C \)
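The way the noise model drives the velocity uncertainty can be illustrated with a small generalised least-squares calculation. The sketch below is not how Hector computes uncertainties and does not reproduce the ratios reported here. It assumes that each noise type is white noise with unit variance passed through the filter \( (1 - \phi B)^{-d} \) with \( d = -\kappa /2 \), that only an intercept and a trend are estimated, and that all models share the same driving-noise amplitude, whereas in a real fit each amplitude is estimated separately. All function names are illustrative.

```python
import numpy as np

def filter_coefficients(kappa, phi, n):
    """Impulse response of (1 - phi*B)^(-d) with d = -kappa/2."""
    d = -kappa / 2.0
    h = np.ones(n)
    for i in range(1, n):
        h[i] = h[i - 1] * phi * (d + i - 1.0) / i
    return h

def trend_sigma(kappa, phi, n, dt_years=1.0 / 365.25):
    """Trend standard deviation from generalised least squares.

    The covariance is C = L L^T, where L is the lower-triangular Toeplitz
    matrix built from the filter coefficients (unit driving noise).
    """
    h = filter_coefficients(kappa, phi, n)
    idx = np.arange(n)
    L = np.tril(h[np.abs(idx[:, None] - idx[None, :])])
    t = idx * dt_years
    A = np.column_stack([np.ones(n), t - t.mean()])
    Z = np.linalg.solve(L, A)         # Z = L^{-1} A, so A^T C^{-1} A = Z^T Z
    cov = np.linalg.inv(Z.T @ Z)
    return float(np.sqrt(cov[1, 1]))  # per year, in units of the driving noise

n = 2000  # days
sigma_fn = trend_sigma(-1.0, 1.0, n)    # flicker noise
sigma_ggm = trend_sigma(-1.0, 0.98, n)  # GGM with a cross-over near 1 year
sigma_rw = trend_sigma(-2.0, 1.0, n)    # random walk
print(f"GGM/FN = {sigma_ggm / sigma_fn:.2f}, RW/FN = {sigma_rw / sigma_fn:.1f}")
```

Under these assumptions the flattened spectrum yields a smaller trend uncertainty than flicker noise, and random-walk noise a much larger one, which is the qualitative behaviour discussed in this section.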

7 Results of the analysis of 110 IGS stations

We now present the results using the 110 stations from the IGS core network with more than 12 years of observations. As mentioned in Sect. 4, the trajectory model used is a linear trend, an annual and semi-annual signal and offsets. Each time series was analysed using GGM + WN, PL + WN, FN + WN and RW + FN + WN noise models. The selection of the best noise model is based on BIC_tp and we used a minimum cross-over period of 1 year for the GGM model to ensure proper separation from the other noise models. The results are summarised in Table 5. Figure 8 shows the spatial distribution of selected noise models for the three components. It can be seen that the selected noise models show some diversity, without one particular model dominating. There is, however, a slight difference between the stochastic models selected for the horizontal (East, North) and Up components. This difference can be attributed to the fact that the vertical component is generally much noisier than the other two components, with large white noise amplitude (Williams et al. 2004; Montillet et al. 2013). Although Fig. 8 shows that there is no relationship between the spatial distribution of the stations and the optimal selected stochastic model (i.e., FN + WN, PL + WN, and GGM + WN), the stations for which the RW component is included in the FN + WN model seem to be mainly located close to the coastline (d < 10 km to the shore, such as ASPA). This result may be explained by the fact that those areas have high water content (soil water and ground water) and are hence associated with low-noise sites (Finnegan et al. 2008; Langbein and Svarc 2019). Note that the overall percentage of coordinate time series for which the RW component is included in the stochastic model is 6.1%. Moreover, Amiri-Simkooei et al. (2007) developed a method, based on the w-test, to detect RW noise and decide whether to include it in the stochastic noise model of the time series. Comparing our results with that study, their method is more sensitive to this type of noise, with RW detected on average in 47% of their stations, than our algorithm based on BIC_tp. Nevertheless, in Table 5, the FN + WN and PL + WN models appear to be the best noise models, accounting for 90.9%, 87.3% and 83.6% for the North, East and Up components, respectively. GGM + WN fits the time series mainly in the Up component (about 12.7%).

Table 5 Distribution of selected noise models per component using BIC_tp
Fig. 8

Spatial distribution of stochastic properties of the time series of 110 IGS GNSS stations by component. Stations with significant RW are also shown

Looking at the spectral index of the analysed 110 stations whose preferred noise model is PL + WN, we find that the spectral index is outside of the interval − 1.05 to − 0.95. Overall, the results show that the PL + WN and FN + WN are still the most likely selected noise models looking at all components in Table 5, with a combined percentage varying between 83 and 90%, when BIC_tp is used.

8 Conclusions

The most common model to describe the stochastic properties of the noise in GNSS time series is power-law plus white noise (PL + WN). Using 110 time series of IGS stations with a time span longer than 12 years, we have investigated if this is still the best choice. A property of power-law noise is that in the frequency domain the power increases for decreasing frequency. For very long GNSS time series, we investigated whether the power spectrum of the noise shows signs of flattening which could be described by the Generalised Gauss–Markov (GGM) noise model developed in Langbein (2004). To objectively select the best noise model, we have investigated various criteria such as the log-likelihood value, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Their performance was quantified by analysing various batches of 500 simulated time series with a length of 8.2, 16.4 and 24.6 years and with known noise characteristics. We found that when the flattening of the power spectrum at the low frequencies is small, both the log(L) and AIC are biased towards the selection of the GGM noise model. Slightly better results are obtained using the BIC. We also modified BIC by reintroducing the factor \( 2\pi \) in its derivation, which decreases the penalty of adding more parameters in the noise model, and called it BIC_tp. Its performance is very similar to that of BIC, although it has a 5% better chance of detecting random-walk noise (RW).

For the analysed 110 stations, the results show that the PL + WN and FN + WN are still the most common stochastic noise models with a combined percentage varying between 83 and 90%.

RW has been detected more frequently in the time series associated with the horizontal components than in the vertical one. In our set of 110 stations, we found that about 4.5–8.2% contained RW noise in the three components. The linear trend uncertainty for these stations is a factor of 1.5–8.4 larger than the one estimated with an FN + WN model, in agreement with previous studies such as Langbein (2012) and Langbein and Svarc (2019).

Previous studies have demonstrated that the linear trend fitted to the observations partially absorbs the noise at the low frequencies (Dmitrieva et al. 2017), creating a small flattening of the power spectrum. To minimise this problem and to be more realistic, we have taken a conservative approach, selecting only time series showing a strong flattening starting at a period of 1 year. Despite this conservative approach, our results show that GGM + WN is the optimal noise model for 3.6 and 5.5% of the stations for the horizontal components (i.e., East and North, respectively), and 12.7% for the vertical component. For these stations, the uncertainty associated with the estimated tectonic rate is around a factor of two smaller than when the standard PL + WN model is applied. Santamaría-Gómez et al. (2011) did not find any significant GGM noise, which is probably caused by their shorter time series which prevented a good separation of GGM from PL/FN and forced them to discard this type of noise. Our research has shown that this is no longer the case and that GGM should be included on a routine basis in the selection of a proper noise model in GNSS time series, especially for the vertical component.