1 Introduction

High-precision GNSS positioning and navigation data are mostly processed based on the well-known least-squares (LS) principle. In the absence of unmodeled systematic errors in the observations' functional model, the LS solution yields the best linear unbiased estimation (BLUE) of the unknown parameters (Grafarend and Schaffrin 1974). Nevertheless, a truly BLUE solution is not available unless a reliable stochastic model of the GNSS observations is in use (Koch 1988; Teunissen 1998b; Teunissen et al. 2008; Li et al. 2011). The stochastic model of the GNSS observations plays an important role in the LS procedure: while the functional model links the GNSS observables to the unknown parameters, such as baseline coordinates, carrier phase ambiguities and atmospheric delays, the stochastic model specifies the observation weights through their accuracies and mutual dependencies (Teunissen 1998b).

A proper choice of the stochastic model influences all subsequent stages of data processing, from the possibility of reaching 'minimum variance' in the LS solution to the reliability and quality control of the solution. The variance–covariance (VC) matrix, for example, is involved in the overall statistic for model validation and in the w-statistic for outlier detection, and both statistics are sensitive to the stochastic model (Koch 1988; Teunissen 2006; Teunissen et al. 2008). In GNSS applications, the stochastic model forms the basis for reliable ambiguity resolution. An adequate stochastic model is necessary to define the proper search area of both the integer least-squares and the integer bootstrapped ambiguity resolution methods, and to make a reliable prediction of their success rate (Teunissen 2000, 2007).

In the last two decades, a significant research effort has been made toward shaping and refining the GNSS stochastic model (Teunissen et al. 1998; Bona 2000; Tiberius and Kenselaar 2000; Liu 2002; Wang et al. 2002; Li et al. 2008; Amiri-Simkooei et al. 2009, 2013). Most studies agree that a realistic GNSS stochastic model has the following characteristics: (1) different variances for different observation types, together with satellite elevation dependence of the observations' precision; (2) cross-correlation between different observation types; and (3) temporal correlation of the observables. Nevertheless, most GNSS software packages [for instance, Bernese (Dach and Walser 2015) and Gamit (Herring et al. 2010)] consider only the elevation-dependent variance model and ignore correlation between observables, as an easy implementation of the stochastic model. As a result, the VC matrices estimated by GNSS processing software are very optimistic and lead to incorrect interpretations in the statistical analyses and incorrect quality assessment of the results (Han and Rizos 1995a, b; El-Rabbany and Kleusberg 2003; Erdogan and Dogan 2019). Several studies have suggested applying a scale factor (SF) to the basic VC matrix to achieve more realistic stochastic results (Geirsson 2003; Kashani et al. 2004; Cetin et al. 2018; Erdogan and Dogan 2019). For better performance, Li et al. (2008) suggested applying a pre-adjusted variance model according to the measurement types. Furthermore, to reach the most realistic VC matrix estimation, variance component estimation (VCE) procedures can be implemented with a fully structured GNSS stochastic model (Teunissen 1988; Koch 1986, 1999; Yang et al. 2005; Amiri-Simkooei 2007; Li et al. 2011).

Although the literature holds significant studies on the GNSS stochastic model, the vast majority have focused on the improvement in positioning precision brought by a refined stochastic model. Considering that the stochastic model is also used for statistical testing and quality assessment, even more important than its effect on positioning precision is the reliability of its statistical nature. A reliable stochastic model is also important as a planning tool, since it allows positioning accuracy to be simulated and predicted without real observations in hand. A few studies have analyzed the impact of the stochastic model on statistical procedures that involve GNSS positioning. Li et al. (2015) investigated GNSS elevation-dependent modeling and its impact on w-statistic testing. In Amiri-Simkooei et al. (2016), the effect of a realistic stochastic model on the ambiguity resolution success rate was evaluated. Li (2016) and Li et al. (2017) studied the influence of the stochastic model on statistical tests with triple-frequency BeiDou. Unfortunately, far too little attention has been paid to validating the statistical characteristics of the refined stochastic model in GNSS positioning, and no controlled research has been found that investigates how this statistical nature is affected by different baseline lengths.

This paper focuses on testing the reliability of the GNSS stochastic model through its statistical nature for different baseline lengths. Section 2 presents the general form of LS adjustment. The procedures for evaluating a general stochastic model using Chi-squared distribution and binomial distribution tests are presented in Sect. 3. A realistic GNSS stochastic model estimation using the least-squares variance component estimation (LS-VCE) is then presented in Sect. 4. Section 5 presents a novel procedure for evaluating the GNSS stochastic model based on the statistical nature of the GNSS ambiguity vector solution. In Sect. 6, a numerical analysis is carried out by implementing the proposed method in different baseline configurations. The conclusions are then summarized in Sect. 7.

2 Least-squares estimation

Consider the general formulation of linear(ized) observation equations

$$ y = Ax + \varepsilon , E( \varepsilon ) = 0; \quad D( \varepsilon ) = Q_{y} $$
(1)

where \(E(\cdot)\) and \(D(\cdot)\) are the mathematical operators for expectation and dispersion, \(y\) is the \(m \times 1\) observation vector, \(A\) is the \(m \times n\) design matrix of full column rank relating the observations to the \(n \times 1\) unknown parameter vector \(x\), and \(\varepsilon\) is the \(m \times 1\) random noise vector with VC matrix \(Q_{y}\).

If \(\varepsilon\) is normally distributed, we can use the inverse of the observations' variance–covariance matrix as the weight matrix to obtain the least-squares solution of Eq. (1) as follows (Koch 1988):

$$ \hat{x} = \left( {A^{T} Q_{y}^{ - 1} A} \right)^{ - 1} A^{T} Q_{y}^{ - 1} y, Q_{{\hat{x}}} = \left( {A^{T} Q_{y}^{ - 1} A} \right)^{ - 1} $$
(2)

Assuming both the functional and the stochastic models are free of misspecifications, \(\hat{x}\) is the best linear unbiased estimation (BLUE) of \(x\), with \(Q_{{\hat{x}}}\) as the corresponding estimated VC matrix resulting from the law of variance–covariance propagation. The estimated \(Q_{{\hat{x}}}\) has an important role in reflecting the quality of the estimated vector \(\hat{x}\). In survey applications, one can use it to assess the statistical confidence region in which the true value of \(x\) is located. In that manner, we should expect a realistic model of the observation VC matrix to lead to a \(Q_{{\hat{x}}}\) that reflects the true statistical characteristics of the estimated parameters in \(\hat{x}\) (Cooper 1987).
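As a minimal numerical sketch of Eq. (2), the weighted LS solution and its VC matrix can be computed as follows (the design matrix, weights and parameter values below are illustrative only, not taken from the paper's data):

```python
import numpy as np

# Weighted least-squares (BLUE) solution of y = A x + eps with D(eps) = Qy,
# following Eq. (2). All values are illustrative only.
rng = np.random.default_rng(1)
m, n = 8, 2
A = np.column_stack([np.ones(m), np.arange(m, dtype=float)])  # full-column-rank design
Qy = np.diag(rng.uniform(0.5, 2.0, m))                        # observation VC matrix
y = A @ np.array([1.0, 0.5]) + rng.multivariate_normal(np.zeros(m), Qy)

W = np.linalg.inv(Qy)                        # weight matrix Qy^{-1}
Nmat = A.T @ W @ A                           # normal matrix A^T Qy^{-1} A
x_hat = np.linalg.solve(Nmat, A.T @ W @ y)   # BLUE of x
Q_xhat = np.linalg.inv(Nmat)                 # VC matrix of x_hat, Eq. (2)
```

Note that \(Q_{\hat{x}}\) follows directly from variance–covariance propagation and is independent of the actual observation values.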

An inadequate stochastic model will result in poor quality assessment and unreliable statistical analysis of the solution. Nevertheless, a realistic VC matrix is not always available, and assumptions concerning the stochastic model are made in the data processing. Hence, a testing procedure is required to assess the reliability of the assumed VC matrix.

3 Stochastic model testing procedures

Two methods for testing the reliability of the stochastic model are presented here: (1) using the Chi-squared distribution to test whether a true reference is compatible with the estimated dataset under the assumption of normally distributed errors, and (2) using the binomial distribution to obtain the true coverage probability of a dataset, regardless of its error distribution, and compare it with that of a normally distributed one. These two approaches are at the heart of the GNSS stochastic model testing procedures that will be described in Sect. 5.

3.1 Testing stochastic model using Chi-squared distribution

The Chi-squared distribution is used for testing hypotheses about variances. In particular, it may be used to test whether or not a sample is compatible with a postulated probability density function in the so-called 'goodness-of-fit' test (Cooper 1987). This test is valid only under the premise that the observation noise is normally distributed and the functional model is free of biases (Tiberius and Borre 2000). In that case, \(\hat{x}\) in (2) follows a multivariate normal distribution with the following probability density function

$$ {\text{f}}\left( x \right) = \frac{1}{{\left( {2\pi } \right)^{\frac{n}{2}} \sqrt {\left| {Q_{{\hat{x}}} } \right|} }}{\text{e}}^{{ - \frac{1}{2}\left( {x - \hat{x}} \right)^{t} Q_{{\hat{x}}}^{ - 1} \left( {x - \hat{x}} \right)}} $$
(3)

Here, the expression

$$ \left( {x - \hat{x}} \right)^{t} \cdot Q_{{\hat{x}}}^{ - 1} \cdot \left( {x - \hat{x}} \right) \sim \chi_{n}^{2} $$
(4)

represents a family of hyper-ellipsoids centered at \(\hat{x}\), bounding the possible errors in \(\hat{x}\) at a given statistical significance level, and is distributed as Chi-squared with \(n\) degrees of freedom. In case \(x\) is known, Eq. (4) may be used to test through hypotheses whether \(\hat{x}\) and \(Q_{{\hat{x}}}\) meet the criterion of normal distribution.

Considering \(\mu = \left[ {\mu_{1} , \mu_{2} , \ldots , \mu_{n} } \right]^{{\text{T}}}\) as the true value vector of \(x\), we can use the following hypotheses to test whether \(\hat{x}\) and \(Q_{{\hat{x}}}\) are consistent with \(\mu\):

$$ H_{0} :x = \mu $$
$$ H_{1} :x \ne \mu $$
(5)

where the test statistic

$$ T = \left( {\mu - \hat{x}} \right)^{t} \cdot Q_{{\hat{x}}}^{ - 1} \cdot \left( {\mu - \hat{x}} \right) \sim \chi_{n}^{2} $$
(6)

leads to the rejection of \(H_{0}\) if \(T > \chi_{n,1 - \alpha }^{2}\) for a given significance level \(\alpha\).

Notice that using (6) to test an individual set of \(\hat{x}\) and \(Q_{{\hat{x}}}\) will not yield conclusions about the reliability of the stochastic model, but only about the possibility that the specific set is consistent with \(\mu\) as the true value of \(x\). A too pessimistic \(Q_{{\hat{x}}}\), for example, will result in a deceptively high probability of accepting \(H_{0}\), even though \(Q_{{\hat{x}}}\) is not realistic. Therefore, in order to test the reliability of the model, Eq. (6) should be applied to numerous independent sets of \(\hat{x}\) and \(Q_{{\hat{x}}}\). The histogram of the test results should then be compared with the \(\chi_{n}^{2}\) distribution to evaluate whether the current stochastic model yields the expected statistical characteristics of \(T\).
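This repeated-test procedure can be sketched numerically with synthetic data (the VC matrix and all values below are illustrative assumptions). Drawing many independent estimates from the correct model, the empirical rejection rate of \(T\) matches \(\alpha\), whereas an artificially pessimistic VC matrix (variances inflated by a factor of 4) accepts \(H_{0}\) far too often:

```python
import numpy as np

rng = np.random.default_rng(2)
n, M, alpha = 3, 5000, 0.05
mu = np.zeros(n)                               # "true" parameter values
Qx = np.array([[2.0, 0.3, 0.0],
               [0.3, 1.0, 0.1],
               [0.0, 0.1, 0.5]])               # assumed realistic VC matrix
Qinv = np.linalg.inv(Qx)
crit = 7.814727903251179                       # chi-squared 0.95 quantile, df = 3

# M independent estimates drawn with the correct stochastic model
x_hat = rng.multivariate_normal(mu, Qx, size=M)
d = mu - x_hat
T = np.einsum('ij,jk,ik->i', d, Qinv, d)       # Eq. (6) for each set
reject_rate = np.mean(T > crit)                # should be close to alpha = 0.05

# A pessimistic model (variances x4) shrinks every T by a factor of 4,
# so H0 is almost never rejected even though Qx is unrealistic
T_pess = np.einsum('ij,jk,ik->i', d, Qinv / 4.0, d)
reject_rate_pess = np.mean(T_pess > crit)
```

The histogram of `T` can then be compared directly with the \(\chi_{3}^{2}\) density, as described above.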

3.2 Testing stochastic model with binomial distribution

The binomial distribution is used here to obtain the empirical confidence interval (CI) of a binomial process. A binomial process, also referred to as a Bernoulli test, is a success–failure experiment with a certain probability of success. The CI aims to predict this probability for a given significance level by repeating the Bernoulli test with a series of truly independent datasets. A common approach to estimating the CI using the binomial distribution is the so-called Wald approach (Brown et al. 2001; Sauro and Lewis 2005). Using this approach, the CI is extracted as follows:

$$ CI = \hat{S} \pm \frac{{z_{\alpha /2} \sqrt {\hat{S}\left( {1 - \hat{S}} \right)} }}{\sqrt M } $$
(7)

where \(\hat{S}\) is the observed success rate in the Bernoulli tests, \(M\) is the number of tests and \(z_{\alpha /2}\) is the standard normal distribution coefficient defining the confidence level according to \(\alpha\) as follows

$$ z_{\alpha /2} = {\Phi }^{ - 1} \left( {1 - \frac{\alpha }{2}} \right) $$
(8)

with \({\Phi }\left( \cdot \right)\) being the standard normal cumulative distribution function (Brown et al. 2001).

In controlled survey applications, one can use the obtained CI to ratify the stochastic nature of a variable (Luo et al. 2011). This is done by implementing a series of Bernoulli tests that examine whether the region defined by \(Q_{{\hat{x}}}\) around an estimated variable \(\hat{x}\) indeed contains the true reference value at the expected rate (see Fig. 1 for a two-dimensional \(\hat{x}\) illustration). In GNSS applications, Bernoulli tests have been used through Monte Carlo simulation to study the empirical (realistic) integer ambiguity resolution success rate (Hou et al. 2016). To this end, \(\hat{S}\) in Eq. (7) equals the fix rate of the integer ambiguities from \(M\) independent baseline solutions, and \(\alpha\) is chosen as the tolerance measure of the CI. When the CI is close to the expected success rate, it implies that the GNSS stochastic model is reliable.
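A Monte Carlo sketch of this coverage check, under an assumed two-dimensional VC matrix (illustrative values), tests whether the 1-sigma error ellipse contains the true value at the theoretical 2-D rate of \(1 - e^{-1/2} \approx 39\%\), and attaches a Wald CI per Eqs. (7)–(8):

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)
M, alpha = 20000, 0.05
mu = np.array([0.0, 0.0])                      # true reference value
Qx = np.array([[1.0, 0.4],
               [0.4, 2.0]])                    # illustrative 2-D VC matrix
Qinv = np.linalg.inv(Qx)

# Bernoulli tests: does the 1-sigma error ellipse around x_hat contain mu?
x_hat = rng.multivariate_normal(mu, Qx, size=M)
d = mu - x_hat
inside = np.einsum('ij,jk,ik->i', d, Qinv, d) <= 1.0
S_hat = inside.mean()                          # empirical success rate

# Wald confidence interval, Eqs. (7)-(8)
z = NormalDist().inv_cdf(1 - alpha / 2)        # approx. 1.96 for alpha = 0.05
half = z * np.sqrt(S_hat * (1 - S_hat)) / np.sqrt(M)
ci_low, ci_high = S_hat - half, S_hat + half
# Theoretical 2-D coverage: 1 - exp(-0.5) ~ 0.3935, the 39% of Fig. 1
```

A reliable \(Q_{\hat{x}}\) yields an `S_hat` whose CI brackets the theoretical coverage; a too optimistic or pessimistic matrix shifts the empirical rate away from it.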

Fig. 1
figure 1

Stochastic model evaluation with Bernoulli tests. For a two-dimensional \( {\hat{x}} \), the region defined by \({Q_{{\hat{x}}}}\) should contain the true reference value at an expected rate of \({\text{CI}} = 39\%\)

4 GNSS stochastic model estimation

The geometry-based single-difference (SD) model is used here to evaluate the GNSS stochastic model (Liu 2002; Li et al. 2008, 2015). The SD model is free of mathematical correlation between different satellites, making it more suitable than the double-difference (DD) model for estimating the satellite-specific variances. Nevertheless, both models have been studied and successfully employed for estimating the stochastic model (Amiri-Simkooei et al. 2009; Li 2016). Here, an adjustment is added to the SD functional model of Li et al. (2015) in order to account for the ionospheric influence on medium- and long-range baseline observations. After presenting the functional model development in matrix form, the related stochastic model is given and the LS-VCE method is described for evaluating the stochastic model's components.

4.1 The functional model

The general SD phase and code observation equations for the single-epoch and single-frequency case, with an ionospheric element added to the model of Li et al. (2015), are formulated as

$$\begin{aligned} &\varphi_{k,j} = \rho_{k} + e_{m} \delta t_{k} - \lambda_{j} N_{j} - \mu_{j} \iota_{k} + T_{k} + \varepsilon_{{\varphi_{k,j} }} , E\left( {\varepsilon_{{\varphi_{k,j} }} } \right) = 0;\\ & D\left( {\varepsilon_{{\varphi_{k,j} }} } \right) = Q_{{\varphi_{k,j} }}\end{aligned} $$
(9)
$$ P_{k,j} = \rho_{k} + e_{m} \delta t_{k} + \mu_{j} \iota_{k} + T_{k} + \varepsilon_{{P_{k,j} }} , E\left( {\varepsilon_{{P_{k,j} }} } \right) = 0;\quad D\left( {\varepsilon_{{P_{k,j} }} } \right) = Q_{{P_{k,j} }} $$
(10)

where subscripts \(k\) and \(j\) represent the epoch number and the frequency, respectively; \(\varphi_{k,j} = \left[ {\varphi_{k,j}^{1} , \ldots ,\varphi_{k,j}^{m} } \right]^{{\text{T}}}\) and \(P_{k,j} = \left[ {P_{k,j}^{1} , \ldots ,P_{k,j}^{m} } \right]^{{\text{T}}}\) are the SD phase and code observations vectors with superscript of the satellite number; \(\rho_{k} = \left[ {\rho_{k}^{1} , \ldots ,\rho_{k}^{m} } \right]^{{\text{T}}}\) is the SD satellite distance vector; \(e_{m}\) is an m-column vector with all elements equal to 1 and \(\delta t_{k}\) is the SD receiver clock error; \(N_{j} = \left[ {N_{j}^{1} , \ldots ,N_{j}^{m} } \right]^{{\text{T}}}\) is the SD ambiguity vector with \(N_{j}^{i} = z_{j}^{i} + \varphi_{j}^{0}\), where \(z_{j}^{i}\) is an integer and \(\varphi_{j}^{0}\) is the initial phase bias, and \(\lambda_{j}\) is the corresponding wavelength; \(\iota_{k} = \left[ {\iota_{k}^{1} , \ldots ,\iota_{k}^{m} } \right]\) is the SD ionosphere delay vector in TEC units and \(\mu_{j} = 40.3 \cdot 10^{16} /f_{j}^{2}\) is the ionospheric factor for frequency \(f_{j}\); \(T_{k} = \left[ {T_{k}^{1} , \ldots ,T_{k}^{m} } \right]\) is the SD tropospheric delay vector; and \(\varepsilon_{{\varphi_{k,j} }} = \left[ {\varepsilon_{{\varphi_{k,j} }}^{1} , \ldots ,\varepsilon_{{\varphi_{k,j} }}^{m} } \right]\) and \(\varepsilon_{{P_{k,j} }} = \left[ {\varepsilon_{{P_{k,j} }}^{1} , \ldots ,\varepsilon_{{P_{k,j} }}^{m} } \right]\) are the phase and code observations noise vectors.

Following Liu (2002) and Li et al. (2008), Eq. (9) can be reparametrized to assimilate prior information on the DD integer ambiguities while preserving their integer nature. The observation equation then takes the form

$$ \varphi_{k,j} = \rho_{k} + e_{m} \overline{\delta t}_{k,j} - \lambda_{j} \overline{N}_{j} - \mu_{j} \iota_{k} + T_{k} + \varepsilon_{{\varphi_{k,j} }} $$
(11)

with

$$ \overline{\delta t}_{k,j} = \delta t_{k} - \lambda_{j} N_{j}^{1} $$
(12)
$$ \overline{N}_{j} = \left[ {0,a_{j}^{T} } \right]^{T} $$
(13)
$$ a_{j} = {\text{D}}^{T} N_{j} $$
(14)
$$ {\text{D}}^{T} = \left[ { - e_{m - 1} ,I_{m - 1} } \right] $$
(15)

where \(\overline{\delta t}_{k,j}\) serves as a new parameter equivalent to the receiver clock error; \(\overline{N}_{j}\) is the reparametrized ambiguity vector; \(a_{j}\) is the DD integer ambiguity vector; and \({\text{D}}\) is the differencing matrix with the first satellite as reference.
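For illustration, with \(m = 4\) satellites the differencing matrix of Eq. (15) and the SD-to-DD ambiguity mapping of Eq. (14) can be sketched as follows (the ambiguity values are arbitrary examples):

```python
import numpy as np

m = 4
# D^T = [-e_{m-1}, I_{m-1}], Eq. (15): satellite 1 serves as the reference
DT = np.hstack([-np.ones((m - 1, 1)), np.eye(m - 1)])

N = np.array([5.0, 7.0, 2.0, -3.0])    # SD ambiguities (arbitrary example)
a = DT @ N                              # DD ambiguities, Eq. (14)
# a = [N2 - N1, N3 - N1, N4 - N1] = [2, -3, -8]
```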

In order to maintain an adequate matrix rank in the multi-epoch solution, an element is added here to represent a common temporal change in the receiver clock error for both phase and code observations. Equations (10) and (11) are then reparametrized as

$$ P_{k,j} = \rho_{k} + e_{m} \left( {\delta t_{0} + \Delta t_{k} } \right) + \mu_{j} \iota_{k} + T_{k} + \varepsilon_{{P_{k,j} }} $$
(16)

and

$$ \varphi_{k,j} = \rho_{k} + e_{m} \left( {\overline{\delta t}_{0,j} + \Delta t_{k} } \right) - \lambda_{j} \overline{N}_{j} - \mu_{j} \iota_{k} + T_{k} + \varepsilon_{{\varphi_{k,j} }} $$
(17)

with \(\delta t_{0}\) and \(\overline{\delta t}_{0,j}\) being the time-constant parameters representing the receiver clock equivalents at the first epoch, and \(\Delta t_{k}\) being the temporal change in the receiver clock error.

Permanent stations with known coordinates, in addition to precise ephemerides, can be used to enhance the functional model's strength during the stochastic model estimation process. In this case, the geometric parameter \(\rho_{k}\) can be accurately calculated, and the DD ambiguities can be obtained from a baseline solution with fixed coordinates. The float ambiguities from this solution can be fixed to their integer values using the well-known LAMBDA method with a very high success rate (Teunissen 1995; Verhagen and Li 2012). In addition, the tropospheric delays can be obtained using commonly used tropospheric models, for example, the Vienna Mapping Function (VMF) of Landskron and Böhm (2018).

By shifting all the extracted values to the left side, Eqs. (16) and (17) become

$$ \overline{P}_{k,j} = P_{k,j} - \rho_{k} - T_{k} = e_{m} \left( {\delta t_{0} + \Delta t_{k} } \right) + \mu_{j} \iota_{k} + \varepsilon_{{P_{k,j} }} $$
(18)

and

$$ \overline{\varphi }_{k,j} = \varphi_{k,j} - \rho_{k} + \lambda_{j} \overline{N}_{j} - T_{k} = e_{m} \left( {\overline{\delta t}_{0,j} + \Delta t_{k} } \right) - \mu_{j} \iota_{k} + \varepsilon_{{\varphi_{k,j} }} $$
(19)

Arranging Eqs. (18) and (19) in matrix form, as in Eq. (1), results in the following

$$ \begin{aligned} \left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\overline{\varphi }_{{\left[ {0 \ldots k - 1} \right],1}} } \\ {\overline{\varphi }_{{\left[ {0 \ldots k - 1} \right],2}} } \\ \end{array} } \\ {\begin{array}{*{20}c} {\overline{P}_{{\left[ {0 \ldots k - 1} \right],1}} } \\ {\overline{P}_{{\left[ {0 \ldots k - 1} \right],2}} } \\ \end{array} } \\ \end{array} } \right] &= \left[ {\begin{array}{*{20}c} {e_{m \cdot k} } \\ {0_{3m \cdot k} } \\ \end{array} , \begin{array}{*{20}c} {\begin{array}{*{20}c} {0_{m \cdot k} } \\ {e_{m \cdot k} } \\ \end{array} } \\ {0_{2m \cdot k} } \\ \end{array} , \begin{array}{*{20}c} {0_{2m \cdot k} } \\ {e_{2m \cdot k} } \\ \end{array} , e_{4} \otimes \left[ {\begin{array}{*{20}c} {e_{k - 1}^{T} \otimes 0_{m} } \\ {I_{k - 1} \otimes e_{m} } \\ \end{array} } \right] , \left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} { - \mu_{1} } \\ { - \mu_{2} } \\ \end{array} } \\ {\begin{array}{*{20}c} {\mu_{1} } \\ {\mu_{2} } \\ \end{array} } \\ \end{array} } \right] \otimes I_{m \cdot k} } \right] \hfill \\ &\quad \left[ {\begin{array}{*{20}c} {\overline{\delta t}_{0,1} } \\ {\overline{\delta t}_{0,2} } \\ {\begin{array}{*{20}c} {\delta t_{0} } \\ {\Delta t_{{\left[ {1 \ldots k - 1} \right]}} } \\ {\iota_{{\left[ {0 \ldots k - 1} \right]}} } \\ \end{array} } \\ \end{array} } \right] + \varepsilon ; \quad E\left( \varepsilon \right) = 0, \quad D\left( \varepsilon \right) = Q_{y}^{SD} \hfill \\ \end{aligned} $$
(20)

where \(\otimes\) denotes the Kronecker product; \(0_{m}\) is an m-column vector with all elements equal to zero; and \(Q_{y}^{{{\text{SD}}}}\) is the VC matrix of the SD code and phase observations. The degrees of freedom, \(\left( {3m - 1} \right)k - 2\), follow from \(4mk\) observations minus three time-constant unknowns, \(k - 1\) receiver clock temporal-drift unknowns and \(mk\) ionospheric delay unknowns. The form of \(Q_{y}^{{{\text{SD}}}}\) and its estimation process are discussed next.

4.2 The stochastic model

In this contribution, the DD realistic stochastic model is considered as (Amiri-Simkooei et al. 2009, 2013, 2016)

$$ Q_{y}^{{{\text{DD}}}} = {\Sigma }_{{\text{C}}} \otimes {\Sigma }_{{\text{T}}} \otimes {\Sigma }_{{\text{E}}}^{{{\text{DD}}}} $$
(21)

where matrices \({\Sigma }_{{\text{C}}} , {\Sigma }_{{\text{T}}}\) and \({\Sigma }_{{\text{E}}}^{{{\text{DD}}}}\) are defined as

$$ {\Sigma }_{{\text{C}}} = \left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\sigma_{\varphi 1}^{2} } & {\sigma_{\varphi 1\varphi 2} } \\ {\sigma_{\varphi 1\varphi 2} } & {\sigma_{\varphi 2}^{2} } \\ \end{array} } & {\begin{array}{*{20}c} {\sigma_{\varphi 1P1} } & {\sigma_{\varphi 1P2} } \\ {\sigma_{\varphi 2P1} } & {\sigma_{\varphi 2P2} } \\ \end{array} } \\ {\begin{array}{*{20}c} {\sigma_{\varphi 1P1} } & {\sigma_{\varphi 2P1} } \\ {\sigma_{\varphi 1P2} } & {\sigma_{\varphi 2P2} } \\ \end{array} } & {\begin{array}{*{20}c} {\sigma_{P1}^{2} } & {\sigma_{P1P2} } \\ {\sigma_{P1P2} } & {\sigma_{P2}^{2} } \\ \end{array} } \\ \end{array} } \right] $$
(22)

and

$$ {\Sigma }_{{\text{T}}} = \left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\sigma_{\left( 0 \right)} } &\quad {\sigma_{\left( 1 \right)} } \\ {\sigma_{\left( 1 \right)} } &\quad {\sigma_{\left( 0 \right)} } \\ \end{array} } &\quad {\begin{array}{*{20}c} \ldots &\quad {\sigma_{{\left( {k - 1} \right)}} } \\ \ldots &\quad {\sigma_{{\left( {k - 2} \right)}} } \\ \end{array} } \\ {\begin{array}{*{20}c} \vdots &\quad \vdots \\ {\sigma_{{\left( {k - 1} \right)}} } & \quad {\sigma_{{\left( {k - 2} \right)}} } \\ \end{array} } &\quad {\begin{array}{*{20}c} \ddots & \vdots \\ \ldots &\quad {\sigma_{\left( 0 \right)} } \\ \end{array} } \\ \end{array} } \right] $$
(23)

and

$$ {\Sigma }_{{\text{E}}}^{{{\text{DD}}}} = 2\left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\sigma_{\left[ 1 \right]}^{2} + \sigma_{\left[ 2 \right]}^{2} } &\quad {\sigma_{\left[ 1 \right]}^{2} } \\ {\sigma_{\left[ 1 \right]}^{2} } &\quad {\sigma_{\left[ 1 \right]}^{2} + \sigma_{\left[ 3 \right]}^{2} } \\ \end{array} } &\quad {\begin{array}{*{20}c} { \ldots } &\quad {\sigma_{\left[ 1 \right]}^{2} } \\ { \ldots } &\quad {\sigma_{\left[ 1 \right]}^{2} } \\ \end{array} } \\ {\begin{array}{*{20}c} { \vdots } &\quad \vdots \\ {\sigma_{\left[ 1 \right]}^{2} } &\quad {\sigma_{\left[ 1 \right]}^{2} } \\ \end{array} } &\quad {\begin{array}{*{20}c} { \ddots } &\quad \vdots \\ { \ldots } &\quad {\sigma_{\left[ 1 \right]}^{2} + \sigma_{\left[ m \right]}^{2} } \\ \end{array} } \\ \end{array} } \right] $$
(24)

with \({\Sigma }_{{\text{C}}}\) the cross-correlation matrix, consisting of ten VC elements for the dual-frequency phase and code observation types; \({\Sigma }_{{\text{T}}}\) the \(k \times k\) time correlation matrix, consisting of \(k\) VC unknowns for \(k\) epochs, under the assumption that the correlation is a function of the time difference alone, i.e., \(\sigma_{{\left( {ij} \right)}} = \sigma_{\left( \tau \right)} = \sigma_{{\left( {i - j} \right)}}\); and \({\Sigma }_{{\text{E}}}^{{{\text{DD}}}}\) the \(\left( {m - 1} \right) \times \left( {m - 1} \right)\) elevation-dependent satellite weight matrix, where \(m\) is the number of satellites and satellite #1 is assumed to be the reference satellite.

In order to use (21) with the SD functional model presented here, a modification is needed. The \({\text{DD}}\) notation implies that \({\Sigma }_{{\text{E}}}^{{{\text{DD}}}}\) encapsulates the DD correlations between different satellites; it should therefore be modified to be integrated into the SD version of the stochastic model, whereas \({\Sigma }_{{\text{C}}}\) and \({\Sigma }_{{\text{T}}}\) may remain as in Eqs. (22) and (23). Since a single SD observation is a linear combination of two raw observations, its variance can easily be expressed using the error propagation law. Assuming both base and rover share the same variance \(\sigma_{\left[ m \right]}^{2}\) for a given satellite at a specific epoch, the variance of the related SD observation is \(\sigma_{{{\text{SD}}}}^{2} = 2\sigma_{\left[ m \right]}^{2}\). Applying this to all satellites, and assuming that correlation between channels is absent, the elevation-dependent satellite weight matrix for the SD model is then

$$ {\Sigma }_{{\text{E}}}^{{{\text{SD}}}} = 2 \cdot {\text{diag}}\left( {\sigma_{\left[ 1 \right]}^{2} , \ldots ,\sigma_{\left[ m \right]}^{2} } \right) $$
(25)

and the overall stochastic model of the SD observations reads

$$ Q_{y}^{{{\text{SD}}}} = {\Sigma }_{{\text{C}}} \otimes {\Sigma }_{{\text{T}}} \otimes {\Sigma }_{{\text{E}}}^{{{\text{SD}}}} $$
(26)

Notice that using the differential matrix \(D\) from Eq. (15), one can express the relation between \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}}\) and \({\Sigma }_{{\text{E}}}^{{{\text{DD}}}}\) as

$$ {\Sigma }_{{\text{E}}}^{{{\text{DD}}}} = D^{T} \cdot {\Sigma }_{{\text{E}}}^{{{\text{SD}}}} \cdot D $$
(27)
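The relation in Eq. (27) is easy to verify numerically: propagating the diagonal SD matrix of Eq. (25) through \(D\) reproduces exactly the DD structure of Eq. (24), with \(2(\sigma_{[1]}^{2} + \sigma_{[i]}^{2})\) on the diagonal and \(2\sigma_{[1]}^{2}\) off the diagonal (the per-satellite variances below are illustrative):

```python
import numpy as np

m = 4
sig2 = np.array([0.8, 1.1, 0.6, 1.5])                  # per-satellite variances
Sigma_E_SD = 2.0 * np.diag(sig2)                       # Eq. (25)

DT = np.hstack([-np.ones((m - 1, 1)), np.eye(m - 1)])  # D^T of Eq. (15)
Sigma_E_DD = DT @ Sigma_E_SD @ DT.T                    # Eq. (27)

# Diagonal: 2*(sig2[0] + sig2[i]); off-diagonal: 2*sig2[0], matching Eq. (24)
```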

The formulation of the stochastic model in Eqs. (21) and (26) is considered to be more realistic than the purely diagonal structure used by many researchers (Amiri-Simkooei 2007).

It should be mentioned, however, that this model assumes that the time correlation and the satellite elevation dependence are identical for all observation types, which may not always be the case (Amiri-Simkooei et al. 2009). For a more general stochastic model for the SD observation form, the reader is referred to Li (2016). Furthermore, due to the changes in time of the satellites' elevations, together with the code and phase variances, this structure should be restricted to a limited number of adjacent epochs. Therefore, over a long time span, the observations are to be divided into subgroups with a relatively small number of \(k\) epochs.

The unknown VC components in Eq. (26) may be estimated using the LS-VCE method, which is described in the following section. Once estimated, the VC components may be used to obtain the parameters of a given predefined covariance function. Here, commonly used elevation-dependent and time correlation covariance functions are implemented through the \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}}\) and \({\Sigma }_{{\text{T}}}\) estimations. The satellite elevation-dependent variance matrix, \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}}\), may employ the exponential elevation-dependent model (Euler and Goad 1991)

$$ \sigma_{{\left[ {s_{\theta } } \right]}} = f_{{{\text{elv}}}} \left( \theta \right) = a_{0} + a_{1} e^{{\left( { - \theta /\theta_{0} } \right)}} $$
(28)

where \(\sigma_{{\left[ {s_{\theta } } \right]}}\) is the standard deviation of satellite \(s\) at elevation angle \(\theta\), and \(a_{0}\), \(a_{1}\) and \(\theta_{0}\) are the model parameters to be estimated. The time correlation matrix, \({\Sigma }_{{\text{T}}}\), may employ a first-order Gauss–Markov process with the autocovariance function (Odolinski 2012)

$$ \sigma_{\left( \tau \right)} = f_{{{\text{time}}1}} \left( \tau \right) = b_{0} e^{{ - \tau /\tau_{0} }} $$
(29)

where \(\tau\) is the time interval, \(b_{0}\) is the covariance amplitude and \(\tau_{0}\) is the correlation length.

The predefined functions in (28) and (29) should be handled carefully, as they impose a specific pattern on the covariance behavior. An incompatible pattern, if chosen, will lead to a faulty stochastic model (Li et al. 2015). For comparison with (29), a logarithmic process is suggested here to handle the time correlation, with the autocovariance function

$$ \sigma_{\left( \tau \right)} = f_{{{\text{time}}2}} \left( \tau \right) = c_{0} + c_{1} \log \left( \tau \right) $$
(30)

where \(c_{0}\) and \(c_{1}\) are the unknown parameters for estimation.

The unknown parameters of the nonlinear models given in Eqs. (28)–(30) can be obtained by a nonlinear least-squares fit to the variance components estimated with LS-VCE.
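Such a fit can be sketched as follows for the exponential model of Eq. (28), applied to synthetic variance-component estimates (the parameter values and noise level are assumptions made purely for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

def f_elv(theta, a0, a1, theta0):
    """Elevation-dependent standard deviation model, Eq. (28)."""
    return a0 + a1 * np.exp(-theta / theta0)

# Synthetic stand-in for LS-VCE output: per-elevation standard deviations
theta = np.linspace(5.0, 85.0, 17)             # elevation angles in degrees
a0_t, a1_t, th0_t = 0.002, 0.006, 20.0         # assumed true parameters (m)
rng = np.random.default_rng(4)
sigma_est = f_elv(theta, a0_t, a1_t, th0_t) + rng.normal(0.0, 5e-5, theta.size)

# Nonlinear least-squares fit of the model parameters a0, a1, theta0
popt, pcov = curve_fit(f_elv, theta, sigma_est, p0=(0.001, 0.005, 15.0))
a0, a1, theta0 = popt
```

The logarithmic model of Eq. (30) can be fitted the same way by swapping in its model function.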

4.3 Least-squares variance component estimation (LS-VCE)

Several variance component estimation (VCE) methods may be implemented for estimating the unknown VC components (Rao 1971; Koch 1986, 1999; Yang et al. 2005; Li et al. 2011). Among them, the LS-VCE method utilizes a least-squares adjustment to estimate the variance–covariance (VC) matrix components (Teunissen 1988). Under the assumption of normally distributed observations, the LS-VCE method yields estimates identical to those of many of the existing VCE methods (Amiri-Simkooei 2007; Teunissen and Amiri-Simkooei 2008; Amiri-Simkooei et al. 2016).

In the LS-VCE method, the VC matrix of the linear(ized) observation equations in Eq. (1) is expressed as the linear combination

$$ D\left( \varepsilon \right) = Q_{y} = \mathop \sum \limits_{i = 1}^{p} \sigma_{i} Q_{i} $$
(31)

where \(\sigma_{i}\), \(i = 1, \ldots ,p\), are the unknown VC components and \(Q_{i}\), \(i = 1, \ldots ,p\), are the known cofactor matrices constructing the observation covariance matrix \(Q_{y}\).

The unknown VC components are then computed as

$$ \hat{\sigma } = N^{ - 1} l $$
(32)

where \(\hat{\sigma } = \left[ {\sigma_{1} , \ldots ,\sigma_{p} } \right]^{{\text{T}}}\) is a p-column vector with the estimated VC components; \(N\) is a p-squared matrix and \(l\) is a p-vector, both obtained as (Amiri-Simkooei 2007; Teunissen and Amiri-Simkooei 2008)

$$ n_{ij} = \frac{1}{2}{\text{tr}}\left( {Q_{i} Q_{y}^{ - 1} P_{A}^{ \bot } Q_{j} Q_{y}^{ - 1} P_{A}^{ \bot } } \right) $$
(33)

and

$$ l_{i} = \frac{1}{2}\hat{e}^{T} Q_{y}^{ - 1} Q_{i} Q_{y}^{ - 1} \hat{e} $$
(34)

where \(n_{ij}\) and \(l_{i}\) (\(i = 1, \ldots ,p\) and \(j = 1, \ldots ,p\)) are the components of \(N\) and \(l\), respectively; \({\text{tr}}\) denotes the trace of a matrix; and \(\hat{e} = P_{A}^{ \bot } y\) is the vector of least-squares residuals, with the orthogonal projector \(P_{A}^{ \bot } = I_{4km} - A\left( {A^{T} Q_{y}^{ - 1} A} \right)^{ - 1} A^{T} Q_{y}^{ - 1}\).
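Since \(Q_{y}\) itself depends on the unknown components, Eqs. (31)–(34) are evaluated iteratively. A sketch for a toy model with two variance components follows; the design matrix, cofactor matrices and values are illustrative only (real GNSS cofactor matrices follow the structure of Sect. 4.2):

```python
import numpy as np

rng = np.random.default_rng(5)
m = 60
A = np.column_stack([np.ones(m), np.linspace(0.0, 1.0, m)])

# Two cofactor matrices, Eq. (31): first and second half of the observations
Q1 = np.diag((np.arange(m) < m // 2).astype(float))
Q2 = np.eye(m) - Q1
s_true = np.array([2.0, 0.5])                  # true variance components
y = A @ np.array([1.0, -0.5]) + rng.multivariate_normal(
    np.zeros(m), s_true[0] * Q1 + s_true[1] * Q2)

s = np.array([1.0, 1.0])                       # initial variance components
for _ in range(20):                            # iterate because Qy depends on s
    Qy = s[0] * Q1 + s[1] * Q2
    Qinv = np.linalg.inv(Qy)
    # Orthogonal projector P_A^perp and least-squares residuals e_hat
    P = np.eye(m) - A @ np.linalg.solve(A.T @ Qinv @ A, A.T @ Qinv)
    e = P @ y
    cof = [Q1, Q2]
    N = np.array([[0.5 * np.trace(Qi @ Qinv @ P @ Qj @ Qinv @ P)
                   for Qj in cof] for Qi in cof])               # Eq. (33)
    l = np.array([0.5 * e @ Qinv @ Qi @ Qinv @ e for Qi in cof])  # Eq. (34)
    s = np.linalg.solve(N, l)                  # Eq. (32)
```

In this toy setup the iteration recovers estimates near the true components (2.0 and 0.5), up to the sampling noise of a single realization.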

For practical reasons, when using LS-VCE with a long observation time span, it has been suggested to divide the entire time span into K multi-epoch solution groups (Amiri-Simkooei et al. 2009). This strategy allows the estimation of time-variant components and overcomes the computational burden and memory problems of the VCE methods. Since the design matrix and the covariance matrix remain the same among the groups, the resulting variance–covariance components may be averaged over time or used in a time-variant model adjustment, as, for example, in Eqs. (28) and (29).

5 GNSS stochastic model testing

Once the GNSS stochastic model is estimated, a reliability testing procedure is suggested to ensure that the stochastic model properly encapsulates the statistical properties of the observations. As shown in Eq. (6), testing a stochastic model is a statistical procedure that relies on a true reference value for comparison with the estimated variables. Therefore, in a GNSS solution, one can utilize the deterministic nature of the integer ambiguity variables: after fixing the integer ambiguity vector with a considerably high confidence level, the fixed solution can serve as the true reference in the analysis.

5.1 Testing the integer ambiguity float solution

To test the stochastic model through the integer ambiguity vector, the estimated ambiguity float solution, \(\hat{\alpha }\), and its covariance matrix, \(Q_{{\hat{\alpha }}}\), should be extracted as a function of the observation VC matrix \(Q_{y}\). For this purpose, the general formulation of the observation equations given in Eq. (1) is partitioned as

$$ y = A_{1} a + A_{2} b + \varepsilon ,E\left( \varepsilon \right) = 0;\quad D\left( \varepsilon \right) = Q_{y} $$
(35)

where \(a \in Z^{n}\) is the n-vector of unknown DD integer ambiguities; \(b \in R^{q}\) is the q-vector of unknown baseline components, including atmospheric delays and clock errors; and \(A_{1}\) and \(A_{2}\) are their corresponding design matrices derived from the observation model. The above formulation may form either an SD or a DD observation model: for the SD case, when \(y\) represents SD code and phase observations as in Eqs. (16) and (17), the observation VC matrix is equivalent to the estimated one, thus \(Q_{y} = Q_{y}^{{{\text{SD}}}} = {\Sigma }_{{\text{C}}} \otimes {\Sigma }_{{\text{T}}} \otimes {\Sigma }_{{\text{E}}}^{{{\text{SD}}}}\). For the DD case, when \(y\) represents DD code and phase observations, as, for example, in the long-baseline implementation in Odijk et al. (2014), the observation VC matrix should be calculated using Eqs. (21) and (27) as

$$ Q_{y} = Q_{y}^{DD} = {\Sigma }_{{\text{C}}} \otimes {\Sigma }_{{\text{T}}} \otimes \left( {D^{T} \cdot {\Sigma }_{E}^{{{\text{SD}}}} \cdot D} \right) $$
(36)

Based on the model in Eq. (35), the float ambiguity vector estimate can be expressed as (Teunissen 1993)

$$ \hat{\alpha } = \left( {\overline{A}_{1}^{T} Q_{y}^{ - 1} \overline{A}_{1} } \right)^{ - 1} \overline{A}_{1}^{T} Q_{y}^{ - 1} y $$
(37)

with \(\overline{A}_{1} = P_{A2}^{ \bot } A_{1}\) and the orthogonal projector \(P_{A2}^{ \bot } = I - A_{2} \left( {A_{2}^{T} Q_{y}^{ - 1} A_{2} } \right)^{ - 1} A_{2}^{T} Q_{y}^{ - 1}\). Applying the error propagation law to Eq. (37) gives, after simplification, the following expression for the float ambiguity VC matrix

$$ Q_{{\hat{\alpha }}} = \left( {\overline{A}_{1}^{T} Q_{y}^{ - 1} \overline{A}_{1} } \right)^{ - 1} $$
(38)
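Eqs. (37) and (38) can be sketched compactly as follows (an illustrative Python sketch assuming a full-rank partitioned model; the names are hypothetical). A useful consistency check is that the ambiguity part of a joint least-squares solution over \([A_{1}\;A_{2}]\) gives the same result:

```python
import numpy as np

def float_ambiguity_solution(A1, A2, Qy, y):
    """Float ambiguity estimate (Eq. 37) and its VC matrix (Eq. 38)."""
    Qy_inv = np.linalg.inv(Qy)
    # orthogonal projector that eliminates the baseline parameters b
    P2 = np.eye(len(y)) - A2 @ np.linalg.solve(A2.T @ Qy_inv @ A2,
                                               A2.T @ Qy_inv)
    A1_bar = P2 @ A1
    Q_ahat = np.linalg.inv(A1_bar.T @ Qy_inv @ A1_bar)   # Eq. (38)
    a_hat = Q_ahat @ A1_bar.T @ Qy_inv @ y               # Eq. (37)
    return a_hat, Q_ahat
```

The reduced (projected) normal equations are algebraically identical to the ambiguity block of the full joint adjustment, which the test below exploits.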

Since the float ambiguity solution is assumed to be normally distributed, having both the float ambiguity vector \(\hat{\alpha }_{i}\) and its covariance matrix \(Q_{{\hat{\alpha }_{i} }}\) for \(i = 1 \ldots k\) independent epochs, together with the true reference fixed value \(a\), enables executing the hypothesis test in Eq. (6) \(k\) times as

$$ T_{i} = \left( {a - \hat{\alpha }_{i} } \right)^{T} Q_{{\hat{\alpha }_{i} }}^{ - 1} \left( {a - \hat{\alpha }_{i} } \right) \sim \chi_{n}^{2} ,\quad i = 1 \ldots k $$
(39)

where \(n\) is the number of ambiguity elements in \(a\). For large \(k\), the histogram of all the test values \(T_{i}\) (\(i = 1 \ldots k\)) is expected to follow the shape of the \(\chi_{n}^{2}\) distribution. Incompatibility between the histogram and the \(\chi_{n}^{2}\) distribution graph indicates poor reliability of the stochastic model. Figure 2, for example, shows two histograms of simulated test values compared with the \(\chi_{10}^{2}\) distribution graph. In the simulation, two sets of 1000 normally distributed float ambiguity vectors were generated with a 0.1-diagonal VC matrix, following Teunissen (1998a). Each vector consists of ten float ambiguities distributed as \(N\left( {0, 0.1 \cdot I_{10} } \right)\). The float ambiguity vectors from Set #1 were used with the original VC matrix to obtain the test values, whereas Set #2 used a more optimistic VC matrix, obtained by scaling the original by a factor of 0.5. It is clear that the histogram of Set #1 fits the \(\chi_{10}^{2}\) distribution graph, implying that the stochastic model is admissible. In contrast, the histogram of Set #2 shows relatively large test values, implying that the stochastic model is too optimistic, as expected.
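The simulated experiment can be reproduced in a few lines (an illustrative sketch; with the correct VC matrix the test values average near the \(\chi_{10}^{2}\) mean of 10, while the optimistic 0.5-scaled matrix roughly doubles them):

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 10, 1000
Q = 0.1 * np.eye(n)                       # true VC matrix of the floats
samples = rng.multivariate_normal(np.zeros(n), Q, size=k)

# Set #1: test values with the correct VC matrix -> ~ chi^2_10
T1 = np.einsum('ij,jk,ik->i', samples, np.linalg.inv(Q), samples)
# Set #2: an optimistic VC matrix (0.5 * Q) inflates the test values
T2 = np.einsum('ij,jk,ik->i', samples, np.linalg.inv(0.5 * Q), samples)
```

Plotting both histograms against the \(\chi_{10}^{2}\) density reproduces the qualitative picture of Fig. 2.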

Fig. 2
figure 2

Simulated histograms of the test value (T) compared with the \(\chi_{10}^{2}\) distribution graph. Dark-orange columns represent the results with a decent VC matrix; light-orange columns represent the results with a factorized VC matrix; and orange columns represent the shared area of both histograms

The procedure presented here relies on visual impression to test the reliability of the stochastic model. To enrich the procedure with a numerical score, a test of the ambiguity resolution success rate using the binomial distribution is suggested.

5.2 Testing the integer ambiguity success rate

A common tool to examine the fixed integer value is the ambiguity resolution success rate (SR). The SR is a statistical value that reflects the probability of a correct integer solution as a function of the integer estimation method and the float ambiguity VC matrix. Since the SR does not depend on real measurement data, a reliable stochastic observation model allows using the SR as a planning tool, as done, for example, by Jonkman et al. (2000) and Milbert (2005). Conversely, testing the validity of the SR prediction level can lead to a deduction about the reliability of the general stochastic model.

In Teunissen (1999), the SR is proved to be maximal for the integer least-squares (ILS) estimation methods like LAMBDA, although an approximation is needed to obtain its value. As an alternative approach, the success rate of the integer bootstrapping method is suggested (Verhagen 2005; Odijk et al. 2014). Integer bootstrapping is a simple method with near-optimal performance, in which the float ambiguities are conditionally rounded to the nearest integers. The bootstrapped success rate can be computed exactly with the simple expression

$$ {\text{SR}}_{{\text{B}}} = \mathop \prod \limits_{j = 1}^{n} \left( {2{\Phi }\left( {\frac{1}{{2\sigma_{{\hat{a}_{j|J} }} }}} \right) - 1} \right) $$
(40)

where \({\text{SR}}_{{\text{B}}}\) is the bootstrapped success rate that serves as a lower bound for \({\text{SR}}_{{{\text{ILS}}}}\) and \(\sigma_{{\hat{a}_{j|J} }}\) is the standard deviation of the \(j\)th ambiguity obtained through conditioning on the previous \(J = 1, \ldots ,\left( {j - 1} \right)\) ambiguities.
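Eq. (40) can be evaluated directly from \(Q_{\hat{a}}\): the conditional standard deviations \(\sigma_{\hat{a}_{j|J}}\) are the diagonal entries of the lower Cholesky factor of the VC matrix, conditioning in the given ambiguity order. The following is an illustrative sketch (in practice the decorrelating Z-transformation of LAMBDA is applied first to sharpen the bound):

```python
from math import erf, sqrt
import numpy as np

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bootstrapped_success_rate(Q_ahat):
    """Exact bootstrapped success rate (Eq. 40). The diagonal of the
    lower Cholesky factor of Q holds the conditional standard
    deviations sigma_{a_{j|J}} for conditioning in the given order."""
    L = np.linalg.cholesky(Q_ahat)
    return float(np.prod([2.0 * norm_cdf(1.0 / (2.0 * s)) - 1.0
                          for s in np.diag(L)]))
```

For a diagonal \(Q_{\hat a} = \sigma^2 I_n\) the expression collapses to \(\left(2\Phi(1/2\sigma) - 1\right)^n\), a convenient sanity check.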

The theoretical SR in Eq. (40) is computed from the stochastic model alone. Since this SR is defined as the probability of success, it is in fact equivalent to the CI definition of the Bernoulli test, with success defined as fixing the float ambiguity vector to the true integer solution. Hence, the CI value extracted from Eq. (7), based on repeated Bernoulli tests with true reference and empirical data, may be compared to the averaged SR value obtained from Eq. (40).
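Since Eq. (7) is not reproduced in this section, the empirical side of the comparison can be illustrated with a generic confidence interval for a Bernoulli proportion; the Wilson score interval below is one common choice and is assumed here only for illustration, not as the paper's Eq. (7):

```python
from math import sqrt

def wilson_interval(successes, trials, z=1.96):
    """Wilson score confidence interval for a Bernoulli success
    probability (illustrative stand-in; the paper's Eq. (7) may
    define the CI differently). z = 1.96 gives a ~95% interval."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p + z * z / (2.0 * trials)) / denom
    half = z * sqrt(p * (1.0 - p) / trials
                    + z * z / (4.0 * trials * trials)) / denom
    return centre - half, centre + half
```

A theoretical SR from Eq. (40) lying inside the interval computed from the repeated fixing trials would then support the stochastic model.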

6 Numerical test

6.1 Experiment description

A reliability test procedure is conducted to examine and compare the performance of the GNSS stochastic model for three baselines: ELAT-NRIF (60 km), ELAT-YOSH (289 km) and ELAT-MRAV (329 km). The reference stations, located in the Middle East region (see Fig. 3), are part of the CORS (Continuously Operating Reference Station) network in Israel maintained by the Survey of Israel (SOI). For each baseline, the stochastic model is estimated and tested using the reliability testing procedure flow demonstrated in Fig. 4 and implemented in MATLAB. The permanent stations' precise coordinates were obtained through a daily adjustment of the SOI's permanent network using the Bernese GNSS Software, version 5.2 (Dach and Walser 2015), which was also used to detect outliers in the observation files. The tropospheric delays were calculated using the Vienna Mapping Function (VMF) (Landskron and Böhm 2018) with gridded VMF3 files archived at http://vmf.geo.tuwien.ac.at/trop_products/GRID/. To enable comparison between the estimated stochastic models, the different permanent stations that form the rover in each baseline share the same configuration of receiver and antenna types: Javad Delta-3 receiver and LEIAT504 LEIS antenna. The common base station (ELAT) is equipped with a LEICA GRX1200Pro receiver and LEIAT504GG SCIS antenna. All observations were made on February 11, 2018 (00:00:00–23:59:55 UTC), with an epoch interval of 5 s and four observation types (P1-P2-L1-L2). For each hour, only fully observed satellites above a \(10^\circ\) elevation mask were used. The final number of satellites in the computations is presented in Fig. 5.

Fig. 3
figure 3

Permanent stations from SOI network used in the experiment

Fig. 4
figure 4

Experiment flow for estimating and testing the reliability of the stochastic model

Fig. 5
figure 5

Number of GPS satellites in the computations

6.2 Estimating the stochastic models

Following the previous theory, the stochastic model for each baseline was estimated. At first, a DD observation model adjustment was carried out with the maximum time span (24 groups of 1-h solutions) and fixed station coordinates to extract the most accurate DD float ambiguity vector. For each 1-h solution, the DD ambiguity vector consists of \(2\left( {m - 1} \right)\) float DD ambiguities from the dual-frequency observations to \(m\) satellites. The float ambiguities were then fixed to their integer values using the LAMBDA method implementation in MATLAB (Verhagen and Li 2012). These ambiguities, together with the precise station coordinates and precise ephemeris data, were used to obtain the SD fixed observations as in Eqs. (18) and (19). The unknown VC components were then obtained using the LS-VCE process with 1440 groups of 1-min observation time spans, divided from the total observation time (see Fig. 4).

The LS-VCE solution was implemented for each time group in a stepwise manner as follows (Amiri-Simkooei et al. 2013):

  1. Estimation of the VC components in \({\Sigma }_{{\text{C}}}\)

  2. Estimation of the satellite-elevation factors in \({\Sigma }_{E}^{{{\text{SD}}}}\)

  3. Estimation of the time covariance factors in \({\Sigma }_{{\text{T}}}\)

At first, the components of \({\Sigma }_{{\text{C}}}\) were estimated while the time correlation and the satellite-elevation-dependent factors were assumed to be absent. To this end, \({\Sigma }_{{\text{T}}}\) was considered an identity matrix (\({\Sigma }_{{\text{T}}} = {\text{I}}_{k}\)) and \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}}\) was simplified to \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}} = 2I_{m}\). To increase the number of degrees of freedom in the estimation procedure, the matrix \({\Sigma }_{{\text{C}}}\) is assumed here to contain only six components, instead of ten as in Eq. (22), as follows

$$ {\Sigma }_{{\text{C}}} = {\Sigma }_{O} \otimes {\Sigma }_{f} $$
(41)

with

$$ {\Sigma }_{O} = \left[ {\begin{array}{*{20}c} {\sigma_{\varphi }^{2} } & {\sigma_{\varphi P} } \\ {\sigma_{\varphi P} } & {\sigma_{P}^{2} } \\ \end{array} } \right] $$
(42)

and

$$ {\Sigma }_{f} = \left[ {\begin{array}{*{20}c} {\sigma_{{f_{1} }}^{2} } & {\sigma_{{f_{1} f_{2} }} } \\ {\sigma_{{f_{1} f_{2} }} } & {\sigma_{{f_{2} }}^{2} } \\ \end{array} } \right] $$
(43)

where \({\Sigma }_{{\text{O}}}\) is the core observation-type VC matrix and \({\Sigma }_{f}\) is the related matrix of signal-frequency factor components. The components in \({\Sigma }_{{\text{O}}}\) and \({\Sigma }_{f}\) were estimated using the LS-VCE routine in two steps: at first, the components in \({\Sigma }_{{\text{O}}}\) were estimated while \({\Sigma }_{f}\) was assumed to be an identity matrix (\({\Sigma }_{f} = {\text{I}}_{2}\)); then, the estimated \({\Sigma }_{{\text{O}}}\) was introduced into the model and the components in \({\Sigma }_{f}\) were estimated.
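The Kronecker composition of Eqs. (41)–(43), and of the full SD VC matrix with the first-step simplifications \({\Sigma }_{{\text{T}}} = I_{k}\) and \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}} = 2I_{m}\), can be sketched as follows (the numeric values are nominal placeholders, not the paper's estimates, and the group sizes are toy values):

```python
import numpy as np

# Placeholder nominal values: 3 mm phase, 30 cm code (in meters), no cross terms
sigma_phi, sigma_P, sigma_phiP = 0.003, 0.30, 0.0
Sigma_O = np.array([[sigma_phi**2, sigma_phiP],
                    [sigma_phiP,   sigma_P**2]])     # Eq. (42)
Sigma_f = np.eye(2)                                  # Eq. (43), first step: I_2
Sigma_C = np.kron(Sigma_O, Sigma_f)                  # Eq. (41): L1, L2, P1, P2

# Full SD observation VC matrix Q_y = Sigma_C (x) Sigma_T (x) Sigma_E^SD
k, m = 3, 4                                          # toy epochs / satellites
Sigma_T = np.eye(k)                                  # no time correlation yet
Sigma_E = 2.0 * np.eye(m)                            # simplified SD elevation term
Qy = np.kron(Sigma_C, np.kron(Sigma_T, Sigma_E))
```

The block structure makes the estimation separable, which is what the stepwise LS-VCE routine above exploits.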

Figure 6 presents the estimated VC components derived from \({\Sigma }_{{\text{O}}}\) for each time group in the different baseline solutions. The signal-related observations' standard deviation (STD) is considered using the estimated factor components in \({\Sigma }_{f}\) (for example, the variance for the L1 observation is obtained by \(\sigma_{\varphi 1}^{2} = \sigma_{\varphi }^{2} \cdot \sigma_{{f_{1} }}^{2}\)). This STD is relevant to the unique combination of the equipment in the baselines. The exact combination of receiver and antenna types is repeated for all baselines here, making it possible to compare the estimated components, in particular the observations' precision over time. The phase and code precision graphs resemble random noise with approximate mean values of 2.5 mm and 30 cm, respectively. All baselines share a similar precision trend, implying that the functional model is compatible and no significant systematic errors affect the observations. The covariance graphs, between phase and code and between frequencies, have approximately zero mean with varying noise levels. While the noise level of the covariance between frequencies remains largely consistent, the phase-code covariance graph is more sensitive due to the different scales of the parameters. Irregularities in the estimation process may result from either high multipath or high-order atmospheric effects in the 1-min observation time span solution.

Fig. 6
figure 6

Estimated VC components (\(\hat{\sigma }_{\varphi 1}\), \(\hat{\sigma }_{P1}\), \(\hat{\sigma }_{\varphi P}\), \(\hat{\sigma }_{{f_{1} f_{2} }}\)) from ELAT-NRIF (60 km, blue line), ELAT-YOSH (289 km, green line) and ELAT-MRAV (329 km, red line) baselines with 1440 groups of 1-min observations time span

To add insight regarding the ionospheric contribution to the observations' precision, the estimation of L1's and P1's STDs was repeated as in Fig. 6, however with fixed ionospheric delays that were extracted using the global ionospheric map (GIM) and considered as known values in Eq. (20). The GIM was obtained following the IONosphere map EXchange format (IONEX) (Schaer et al. 1998), with IONEX files archived at ftp://cddis.nasa.gov/gnss/products/ionex/. The resulting STDs are presented in Fig. 7. It is clear that the ionospheric residuals significantly degrade L1's accuracy, especially for longer baselines. This is reflected in both the STD values and their noise level. P1's accuracy, however, is less affected due to its originally lower accuracy. The results imply that the ionosphere components must be considered in the functional model when estimating the core observation variance for medium- and long-range baselines.

Fig. 7
figure 7

Estimated phase and code STDs (\(\hat{\sigma }_{\varphi 1}\) and \(\hat{\sigma }_{P1}\)) with fixed ionosphere value from GIM

Table 1 summarizes the mean and precision of the estimated VC components. The estimated STDs of the phase observations (L1 and L2) for all baselines in the float ionosphere instance are between 2.0 and 2.9 mm, and those of the code observations (P1 and P2) are between 27.0 and 30.8 cm. The precision of these estimates, calculated from the standard deviation of the groupwise samples, is at the millimeter level for the phase observations and at the sub-decimeter level for the code observations. In general, the results here are reasonable and resemble the commonly used nominal STD values (\(\sigma_{\varphi }\) = 3 mm, \(\sigma_{{\text{P}}}\) = 30 cm), with practically no correlations between observation types. The results also indicate that the standard deviation of the phase observations is affected by the baseline length: longer baselines show degraded accuracy and vice versa. This may sound trivial, but in fact no familiar weighting scheme addresses this parameter in the stochastic modeling. The difference here between the estimated phase standard deviations in the longest baseline (ELAT-MRAV) and in the shortest baseline (ELAT-NRIF) is up to 0.7 mm, which is about 70\% of this parameter's precision. The effect of the baseline length is obviously more noticeable in the fixed ionosphere instance due to the larger unmodeled ionospheric influence in the observations. In contrast, the code's precision appears not to be affected by the baseline length. This may relate to the large noise level of the code observations compared to the ionospheric impact. As a result, most of the ionospheric effect due to the baseline length is absorbed in the phase's variance estimation and is less noticeable in the code.

Table 1 Mean standard deviations and precisions from the overall groupwise LS-VCE in each baseline

In the second stage of the overall stochastic model estimation, the components of \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}}\), the satellites' elevation dependence covariance matrix, were estimated. The estimated components of the covariance matrix \({\Sigma }_{{\text{C}}}\) from the previous step were introduced into the stochastic model, while the autocorrelation matrix \({\Sigma }_{{\text{T}}}\) was still assumed to be an identity matrix. The components of the diagonal matrix \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}}\) were then estimated for each time group using the LS-VCE routine. Each component captures the variance for a specific satellite's mean elevation angle in the group. All the estimated components from all time groups were used together to adjust the three parameters of the elevation-dependent variance model in Eq. (28). The adjusted model parameters, together with the root-mean-square (RMS) errors for each of the tested baselines, are summarized in Table 2.
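The adjustment of a three-parameter elevation model can be sketched as follows. Since Eq. (28) is not reproduced in this excerpt, a common Euler-Goad-type form \(\sigma(E) = a_{0} + a_{1} e^{-E/E_{0}}\) is assumed here purely for illustration; the fit combines a grid search over the nonlinear parameter \(E_{0}\) with linear least squares for \(a_{0}, a_{1}\):

```python
import numpy as np

def fit_elev_model(E, sigma, E0_grid=None):
    """Fit sigma(E) = a0 + a1*exp(-E/E0): an assumed Euler-Goad-type
    elevation-dependent STD model (Eq. (28) is not reproduced here).
    Grid search over E0, linear LS for a0, a1 at each candidate."""
    if E0_grid is None:
        E0_grid = np.linspace(5.0, 60.0, 551)       # 0.1 deg steps
    best = None
    for E0 in E0_grid:
        B = np.column_stack([np.ones_like(E), np.exp(-E / E0)])
        coef, *_ = np.linalg.lstsq(B, sigma, rcond=None)
        rms = float(np.sqrt(np.mean((B @ coef - sigma) ** 2)))
        if best is None or rms < best[3]:
            best = (float(coef[0]), float(coef[1]), float(E0), rms)
    return best                                      # (a0, a1, E0, rms)
```

In the paper's workflow the inputs would be the groupwise elevation factors scaled by \(\sigma_{\varphi 1}\); here only the fitting mechanics are illustrated.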

Table 2 Adjusted elevation model parameters

Figure 8a presents L1's STDs for each satellite in all time groups, together with the adjusted elevation model function, as a function of the satellite elevation angle in the ELAT-NRIF baseline. The standard deviations are presented in mm units by multiplying \(\sigma_{\varphi 1}\) by the related factor for each satellite from \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}}\). Figure 8b presents the adjusted models for all the tested baselines together. Notice here also how the baseline length affects the STD graph. At the minimum elevation angle, the difference between ELAT-NRIF (60 km) and ELAT-MRAV (329 km) is approximately 2 mm, which is considerably large compared to the RMS of both models. Notice also how the RMS increases as the baseline length grows. This reinforces the statement that atmospheric residuals cause irregularities and degrade the estimation precision of the observations' STDs.

Fig. 8
figure 8

L1's STD as a function of the satellite elevation angle; a ELAT-NRIF estimated STDs and their adjusted model; b adjusted satellite elevation models for the ELAT-NRIF, ELAT-YOSH and ELAT-MRAV baselines

In the last stage of the stochastic model estimation, the autocovariance matrix \({\Sigma }_{{\text{T}}}\) was determined. Here, the pre-estimated components in \({\Sigma }_{{\text{C}}}\) and \({\Sigma }_{{\text{E}}}^{{{\text{SD}}}}\) are assumed to be known and were introduced into the model. The components of \({\Sigma }_{{\text{T}}}\) were then estimated using the LS-VCE routine and used to adjust the autocovariance function models in Eqs. (29) and (30). The adjusted parameters for both models, together with the RMS for each of the tested baselines, are summarized in Table 3.

Table 3 Adjusted autocovariance models parameters

In order to learn about the effect of the sampling interval on the autocovariance estimation process, the LS-VCE method was implemented in three sampling interval modes: 5, 10 and 20 s. In each mode, time groups were composed of a constant number of 12 epochs, making the total number of time groups in the 24-h data equal to 1440, 720 and 360 (for the 5, 10 and 20 s sampling intervals, respectively). Figure 9 presents the estimated autocovariance components for the different sampling interval modes in each baseline. The estimated autocovariance components are presented here as an average with noise for each time lag. Similar to the results in Li (2016), the autocovariance components are noisier for larger time lags within a single mode. This is due to the smaller number of correlated epochs that participate in the components' estimation for larger time lags. The results also show that the different sampling intervals have a minor influence on the autocovariance components.
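The lag-wise averaging behind Fig. 9, and the reason larger lags are noisier, can be illustrated with a simple empirical autocovariance over K equal-length groups (an illustrative sketch on zero-mean residual groups, not the paper's LS-VCE implementation):

```python
import numpy as np

def lagwise_autocovariance(groups):
    """Average normalized autocovariance per time lag over K zero-mean
    residual groups of equal length (illustrative sketch of the
    averaging behind Fig. 9)."""
    groups = np.asarray(groups)          # shape (K, n_epochs)
    K, n = groups.shape
    acov = []
    for lag in range(n):
        # fewer epoch pairs contribute at larger lags -> noisier estimates
        prods = groups[:, :n - lag] * groups[:, lag:]
        acov.append(prods.mean())
    acov = np.array(acov)
    return acov / acov[0]                # normalize to factor 1 at lag 0
```

For 12-epoch groups, lag 11 is estimated from a single epoch pair per group, which directly explains the growing scatter toward the right of the graphs.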

Fig. 9
figure 9

Estimated autocovariance factors for different sampling interval modes (5, 10 and 20 s) in ELAT-NRIF, ELAT-YOSH and ELAT-MRAV baselines

Figure 10a presents the comparison between the time-lag averaged autocovariance components for all the tested baselines. The general pattern of the autocovariance graph is similar for all baselines, indicating that the core behavior of the autocovariance due to the equipment configuration is preserved across the different baselines. The slight differences (up to a 0.1 autocovariance factor) may stem from external error sources such as multipath and atmospheric delays. Figure 10b presents the estimated autocovariance graph with both the exponential and logarithmic adjusted models for the ELAT-NRIF baseline. From the visual impression, it is clear that the logarithmic model better captures the pattern of the autocovariance graph. The RMS of the logarithmic model (0.01) is also considerably smaller than the RMS of the exponential model (0.04). This is common to all of the tested baselines, as reflected in Table 3.

Fig. 10
figure 10

Autocovariance factors and function model; a estimated autocovariance factors for ELAT-NRIF, ELAT-YOSH and ELAT-MRAV baselines; b ELAT-NRIF estimated autocovariance factors together with both exponential and logarithmic adjusted models

6.3 Testing the stochastic models

After estimating the stochastic model for each baseline, a testing procedure is conducted to explore the reliability and the contribution of the estimated VC matrix in practical use. An implementation of the DD observation model adjustment was performed to solve the rover's position in each baseline using four differently composed VC models (M1–M4). The different VC models, listed in Table 4, are intended to reflect the influence of each component of the realistic stochastic model in Eq. (36). To achieve a meaningful statistical sample in the analysis, the overall observation time span was divided into 1440 groups of 1-min observation data. In each group, the estimated float ambiguity vector \(\hat{\alpha }_{i}\) and its covariance matrix \(Q_{{\hat{\alpha }_{i} }}\) were obtained and used, together with the known fixed ambiguity vector \(a\), to calculate the test value \(T\) in Eq. (6). In this calculation, only six ambiguity elements were considered to maintain an adequate ambiguity vector size for all time groups. This number equals the minimum of six float DD ambiguities from the dual-frequency observations to four satellites.

Table 4 VC models used for testing. The composed VC models are calculated using Eq. (36)

Figure 11 presents the probability histograms of the \(T\) values together with the expected \(\chi_{6}^{2}\) distribution graph for comparison. The sets differ by the baseline configuration (ELAT-NRIF, ELAT-YOSH and ELAT-MRAV) and the VC model used in the adjustment. The visual impression from the results clearly indicates that M3 and M4 lead to a better compatibility with the expected \(\chi_{6}^{2}\) distribution graph, compared to M1 and M2. The histograms for all M1 and M2 cases exceed the boundaries of the \(\chi_{6}^{2}\) distribution graph and deceptively indicate an overly optimistic stochastic model. The results here add interesting insight into the importance of time correlation modeling, even more than satellite-elevation weighting. Both M1 and M2 ignore time correlation and fail to achieve the expected statistical behavior. In contrast, M3 and M4 consider time correlation and behave properly, even though M3 ignores satellite-elevation weighting. This would probably be even more significant if more epochs participated in the solution and introduced temporal covariance influence. From a baseline configuration perspective, only small and negligible differences appear in the histograms' compatibility level, implying that the estimation procedure of the VC matrix manages to capture the difference between the stochastic models.

Fig. 11
figure 11

Testing the reliability of the VC matrix for different VC models (M1-M4) through a comparison between the histogram of the test value (T) and the \(\chi_{6}^{2}\) distribution graph

The second testing procedure for the stochastic model reliability is done through the validity assessment of the integer ambiguity success rate. This test encapsulates the practical implication of using the different stochastic models. In this test, the reliability of the theoretical SR for a 1-min time span solution is examined. For each combination of baseline and VC model, an average of the theoretical SR, calculated using Eq. (40) for the 1440 time groups, was obtained. The prediction of the theoretical SR was tested against the real level of success, represented by the CI value from Eq. (7). The CI value was obtained by repeating the Bernoulli test with each one of the 1440 time groups and testing whether the bootstrapped fixed solution for the float ambiguity vector equals the known fixed solution.
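The Bernoulli success criterion, whether the bootstrapped integer vector equals the known fixed solution, requires the bootstrapped fix itself. A minimal sketch of integer bootstrapping in the given ambiguity order is shown below (illustrative; production implementations apply the decorrelating Z-transformation first, as in the LAMBDA toolbox used in the paper):

```python
import numpy as np

def bootstrap_fix(a_float, Q):
    """Integer bootstrapping: conditionally round the float ambiguities
    to integers in the given order (no decorrelation applied). The
    conditioning weights come from the lower Cholesky factor of Q."""
    L = np.linalg.cholesky(Q)
    n = len(a_float)
    resid = np.asarray(a_float, dtype=float).copy()
    a_fix = np.zeros(n)
    for j in range(n):
        a_fix[j] = np.round(resid[j])
        # shift the remaining floats by the conditional correction
        resid[j + 1:] -= (resid[j] - a_fix[j]) * L[j + 1:, j] / L[j, j]
    return a_fix.astype(int)
```

With a diagonal VC matrix the method reduces to plain rounding, while correlation between ambiguities can change the outcome, as the second test case below demonstrates.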

Figure 12 presents a comparison between the predicted and real ambiguity resolution SR values using the different stochastic models. Here also, the predicted SR in both M1 and M2 appears unreliable and deceptively optimistic, with an average value of 40\% for the different baselines, while the real level of success is significantly lower, with an average of 2\%. In contrast, M3 and M4 yield considerably more reliable stochastic models. The reliability is expressed in three aspects: first, the performance of the real ambiguity fixing rates is improved, especially with M4. Second, the predicted SR values are considerably closer to the average of the CI values, showing a high prediction level using the stochastic model. And third, the predicted SR values change dynamically between the different baselines, reflecting the expected varying influence of the atmospheric residuals on the ambiguity resolution. Nevertheless, there are still small differences between the real SR value and the one predicted with the fully estimated VC matrix (M4). These differences may result from the stochastic model's uncertainty and the influence of irregularities from external errors in the observations.

Fig. 12
figure 12

Comparison between the predicted and real (CI) ambiguity resolution SR values using the different stochastic models (M1–M4) in ELAT-NRIF, ELAT-YOSH and ELAT-MRAV baselines solution. Small differences between the predicted and real SR values indicate good compatibility of the stochastic model

7 Summary and conclusions

This paper suggested statistical tests to assess the reliability of stochastic models in GNSS baseline solutions. The Chi-squared and binomial process distributions were used to examine the float integer ambiguity VC matrix and its reliability in predicting the ambiguity resolution success rate. The presented approach was used to test a stochastic model estimation routine based on the SD observation model and the LS-VCE method. Although a specific stochastic model was chosen, using only GPS satellites, the presented approach may further be used for analyzing more models, including multi-constellation ones, in the future. In practical applications, the presented approach may be used to determine which stochastic model is adequate for the equipment in use. A reliable stochastic model is necessary to simulate positioning accuracy and to predict the ambiguity resolution SR without having real observations in hand. Hence, once a stochastic model is accepted using the presented approach, a trustworthy simulation (e.g., for mission planning) will be possible.

In the case study, stochastic models for a shared configuration of receiver types were estimated and tested in different baseline lengths using 24-h observation data. The results provide a broad impression of the nature of the GNSS stochastic model for general use. The following conclusions are summarized:

The GNSS observation precisions, and particularly the phase's standard deviation, are affected by the baseline length. In the current case study, the average phase standard deviation in the longest baseline (329 km) is 2.9 mm and in the shortest baseline (60 km) is 2.2 mm, a difference of about 70\% of this parameter's precision. Thus, a reliable mission planning tool should address this parameter in the stochastic modeling. Future work is needed here to formulate the influence of the baseline length on the SD observation precisions under different atmospheric conditions.

As the baseline length increases, the stochastic model becomes less predictable and more exposed to irregularities in the observations' precision. This probably results from the remaining high-order atmospheric effects in the observations.

A full realistic stochastic modeling using the LS-VCE procedure results in a decent VC matrix that manages to capture the expected multinormal-distribution nature of the DD adjustment results. To achieve a reliable stochastic model, time correlation modeling is mandatory. Unlike satellite-elevation weighting, ignoring time correlation will result in a deceptively optimistic stochastic model that does not reproduce the expected statistical behavior. A proper function to model the time correlation is the logarithmic-based function, which results in a better compatibility with the estimated autocorrelation graph than the exponential function.

And last, the use of the theoretical SR as a decision helper for the integer ambiguity resolution analysis should be restricted to the case of a reliable stochastic model. As shown here, a falsely captured stochastic model results in a very optimistic SR of around 40\% for the different baselines, while the real level of success is significantly lower, with an average of 2\%.