Introduction

In a data adjustment system, the functional model describes the relationship between observations and parameters, while the stochastic model describes the observation precisions and their mutual correlations. The stochastic model can be specified by a covariance matrix, i.e., the second-order central moments of the random observation errors. Although any positive-definite covariance matrix yields an unbiased estimator in least squares adjustment, the optimal estimate with minimal variance can be achieved only when the correct stochastic model is applied (Koch 1999; Li et al. 2011).

In global navigation satellite system (GNSS) applications, the stochastic model is very important for reliable integer ambiguity resolution and for precise positioning. Compared with the correct stochastic model, any approximate stochastic model will result in a smaller success rate of both integer least squares and integer bootstrapped ambiguity resolution (Teunissen 2007; Amiri-Simkooei et al. 2016). Hence, refining the GNSS stochastic model is a worthy aspiration, and significant research efforts have been made in the past two decades. The earlier studies were based on the elevation dependence of random observation errors (Euler and Goad 1991; Brunner et al. 1999), and later studies took into account the physical correlations, typically the between-frequency cross-correlation and time correlation, by post-analysis (Tiberius and Kenselaar 2003; Bona 2000; Li et al. 2008; Amiri-Simkooei et al. 2009), variance component estimation together with positioning (Wang et al. 2002; Li et al. 2011) and turbulence theory (Schön and Brunner 2008; Kermarrec and Schön 2014). Based on these studies, it is concluded that in general the observation precision is elevation-dependent and that cross-correlation and time correlation may exist. Moreover, these stochastic characteristics vary with both receiver and observation types.

Besides achieving the precise parameter estimator, the correct stochastic model is also required to retrieve the objective precision measures and the covariance matrix of the estimator. For short baselines and short observation sessions, the physical correlations have no significant effects on the baseline solutions, but significant effects on the covariance matrix of the baselines, as numerically shown in El-Rabbany (1994), Han and Rizos (1995), Howind et al. (1999) and Li (2016). Existing studies of refining GNSS stochastic models almost all focus on the improvement of positioning. In fact, the stochastic model is even more important for the reliability of quality control. The reliability measures the capability of an equation system to detect, identify and resist outlier(s). It consists of internal and external reliabilities, of which the internal reliability is more important for detecting and identifying the outlier with hypothesis testing (Teunissen 2006). Usually, the probabilities of type I and type II errors with a given significance in hypothesis testing, the separability of two outlier detection statistics as well as the minimal detectable bias (MDB) are applied as indicators. In reliability, the covariance matrix is involved in testing statistics, for instance, the overall statistic for model specification and the w statistic for outlier detection. Such statistics are known to be sensitive to the stochastic model (Baarda 1968; Teunissen 2006; Li et al. 2016).

However, in the GNSS community, the influence of the stochastic model on the statistical reliability tests has rarely been studied. Teunissen (1998) derived the analytical formulae of MDB for canonical forms of different GNSS application models. Li et al. (2016) numerically demonstrated the impact of the elevation-dependent model on the overall and w statistic tests. We study the influence of the GNSS stochastic model on the statistical tests involved in reliability with triple-frequency BeiDou as an example. We first apply the variance component estimation (VCE) method to achieve realistic elevation-dependent precision, cross-correlation and time correlation. By comparison with the empirical stochastic models, we numerically demonstrate the influence of these realistic stochastic properties on the overall and w statistic tests. In addition, the MDBs together with the separability defined by the correlation coefficient of two w-test statistics are analyzed. To the best of our knowledge, this is the first comprehensive study on the reliability influence of BeiDou stochastic modeling. The results should help users perform quality control in real applications.

The symbols and operators used below are described as follows. The matrix \(\varvec{I}_{n}\) denotes the \((n \times n)\) identity matrix and \(\varvec{e}_{n}\) is the n-column vector with all elements of ones. 0 denotes a matrix of all elements of 0’s with proper dimensions. The symbols \({\text{E}}\) and \({\text{D}}\) are the expectation and dispersion operators, while \({\text{tr}}\) is the trace of a matrix. \({\text{diag(}}\varvec{a} )\) is the operator to form a diagonal square matrix with elements a at the diagonal, while \({\text{blkdiag}}\) is the operator of block diagonal concatenation of matrices. The symbols ⊗ and \({\text{vec}}\) are Kronecker product and vectorization operator. The properties \((\varvec{AB}) \otimes (\varvec{CD}) = (\varvec{A} \otimes \varvec{C})(\varvec{B} \otimes \varvec{D})\), \({\text{vec(}}\varvec{ABC} )= (\varvec{C}^{T} \otimes \varvec{A}){\text{vec(}}\varvec{B} )\), and \({\text{vec(}}\varvec{A} )^{T} {\text{vec(}}\varvec{B} )= {\text{tr(}}\varvec{A}^{T} \varvec{B} )\) will be frequently applied in derivations. See Koch (1999) for more properties about these mathematical operators.
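The three Kronecker and vec properties listed above can be checked numerically. The following is a minimal sketch, assuming NumPy and arbitrary random test matrices (the helper `vec` and all matrix sizes are illustrative choices, not part of the original text); `vec` stacks columns, which is the convention under which the second identity holds.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 2))
D = rng.standard_normal((2, 5))

# (AB) kron (CD) = (A kron C)(B kron D)
mixed_product_ok = np.allclose(np.kron(A @ B, C @ D),
                               np.kron(A, C) @ np.kron(B, D))

# vec(ABC) = (C^T kron A) vec(B)
vec_ok = np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))

# vec(X)^T vec(Y) = tr(X^T Y) for two matrices of equal shape
X = rng.standard_normal((3, 4))
Y = rng.standard_normal((3, 4))
trace_ok = np.isclose(vec(X) @ vec(Y), np.trace(X.T @ Y))

print(mixed_product_ok, vec_ok, trace_ok)
```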

GNSS model and its solutions

In this section, we first give the SD phase and code observation model between two receivers, followed by derivation of the representation of the full-rank model. Based on the full-rank SD model, two types of solutions, namely the float solution and fixed solution, are presented.

Between-receiver SD observation model

We study the reliability statistics based on the between-receiver single-difference (SD) model of a short baseline, because it eliminates the satellite-dependent biases and thus allows the satellite-specific properties to be analyzed explicitly. For a short baseline, the remaining systematic biases, such as the atmospheric biases, are sufficiently reduced and can be ignored. Then, the single-epoch SD observation equations of \(f\)-frequency phase and code read

$${\text{E}}\left( {\left[ {\begin{array}{*{20}c}\varvec{\phi}\\ \varvec{p} \\ \end{array} } \right]} \right) = \left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\varvec{e}_{f} \otimes \varvec{G}} & {\varvec{I}_{f} \otimes \varvec{e}_{s} } \\ \end{array} } & {\begin{array}{*{20}c} {\mathbf{0}} & {\varvec{\varLambda}\otimes \varvec{I}_{s} } \\ \end{array} } \\ \end{array} } \\ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\varvec{e}_{f} \otimes \varvec{G}} & {\mathbf{0}} \\ \end{array} } & {\begin{array}{*{20}c} {\varvec{I}_{f} \otimes \varvec{e}_{s} } & {\mathbf{0}} \\ \end{array} } \\ \end{array} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} \varvec{x} \\ {\delta \varvec{t}} \\ \end{array} } \\ {\begin{array}{*{20}c} {d\varvec{t}} \\ \varvec{a} \\ \end{array} } \\ \end{array} } \right]$$
(1)

where \(\varvec{\phi}= [\varvec{\phi}_{1}^{T} , \ldots ,\varvec{\phi}_{f}^{T} ]^{T}\) and \(\varvec{p} = [\varvec{p}_{1}^{T} , \ldots ,\varvec{p}_{f}^{T} ]^{T}\) are the vectors of \(f\)-frequency SD phase and code observations, respectively, \(\varvec{\phi}_{j} = [\phi_{j}^{1} , \ldots ,\phi_{j}^{s} ]^{T}\) is the SD phase observation vector of s satellites on frequency j; \(\varvec{p}_{j}\) has the same structure as \(\varvec{\phi}_{j}\). G is the design matrix to the baseline vector x. \(\varvec{\varLambda}= {\text{diag([}}\lambda_{1} , \ldots ,\lambda_{f} ] )\) is the design matrix to the SD ambiguity vector \(\varvec{a} = [\varvec{a}_{1}^{T} , \ldots ,\varvec{a}_{f}^{T} ]^{T}\) with \(\varvec{a}_{j} = [a_{j}^{1} , \ldots ,a_{j}^{s} ]^{T}\) being the SD ambiguity vector of frequency j including the initial SD receiver phase bias. \(\lambda_{j}\) is the wavelength of the jth frequency. \(\delta \varvec{t} = [\delta t_{1} , \ldots ,\delta t_{f} ]^{T}\) and \(d\varvec{t} = [dt_{1} , \ldots ,dt_{f} ]^{T}\), with \(\delta t_{j}\) and \(dt_{j}\) being the SD receiver clock errors (hardware delays included) of the jth frequency in meters for phase and code, respectively.

Obviously, Eq. (1) is rank-deficient since the coefficients of \(\delta \varvec{t}\) and the SD ambiguities a satisfy

$$\left[ {\begin{array}{*{20}c} {\varvec{I}_{f} \otimes \varvec{e}_{s} } & {\varvec{\varLambda}\otimes \varvec{I}_{s} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c}\varvec{\varLambda}\\ { - \varvec{I}_{f} \otimes \varvec{e}_{s} } \\ \end{array} } \right] = {\mathbf{0}}_{fs \times f}$$
(2)

This equation indicates that the SD phase clocks \(\delta \varvec{t}\) are dependent on the SD ambiguities a with rank deficiency f. To eliminate this rank deficiency, we conduct the reparameterization as:

$$\begin{aligned} \delta \bar{\varvec{t}} & = \left[ {\delta t_{1} + a_{1}^{1} , \ldots ,\delta t_{f} + a_{f}^{1} } \right]^{T} \\ \varvec{z}_{j} & = \left[ {a_{j}^{2} - a_{j}^{1} , \ldots ,a_{j}^{s} - a_{j}^{1} } \right]^{T} \\ \end{aligned}$$
(3)

where \(\varvec{z}_{j}\) is the vector of double-difference (DD) ambiguities of frequency j. The full-rank version of model (1) reads

$${\text{E}}\left( {\left[ {\begin{array}{*{20}c}\varvec{\phi}\\ \varvec{p} \\ \end{array} } \right]} \right) = \left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\varvec{e}_{f} \otimes \varvec{G}} & {\varvec{I}_{f} \otimes \varvec{e}_{s} } \\ \end{array} } & {\begin{array}{*{20}c} {\mathbf{0}} & {\varvec{\varLambda}\otimes\varvec{\varGamma}} \\ \end{array} } \\ \end{array} } \\ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {\varvec{e}_{f} \otimes \varvec{G}} & {\mathbf{0}} \\ \end{array} } & {\begin{array}{*{20}c} {\varvec{I}_{f} \otimes \varvec{e}_{s} } & {\mathbf{0}} \\ \end{array} } \\ \end{array} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} \varvec{x} \\ {\delta \bar{\varvec{t}}} \\ \end{array} } \\ {\begin{array}{*{20}c} {d\varvec{t}} \\ \varvec{z} \\ \end{array} } \\ \end{array} } \right]$$
(4)

with \(\varvec{\varGamma}= [{\mathbf{0}}_{(s - 1) \times 1} ,\,\varvec{I}_{s - 1} ]^{T}\). The integer nature of the DD ambiguities is retained in (4), which is exactly equivalent to the DD equation system. The equations are rewritten in a compact form as

$$\varvec{y} = \varvec{Az} + \varvec{Bb} +\varvec{\epsilon}_{\varvec{y}} , \quad \varvec{Q}_{{\varvec{yy}}} = {\text{blkdiag([}}\varvec{Q}_{{\varvec{\phi \phi }}} ,\varvec{Q}_{{\varvec{pp}}} ] )$$
(5)

where \(\varvec{y} = [\varvec{\phi}^{T} , \varvec{p}^{T} ]^{T} \in {\mathbb{R}}^{m}\) with \(m = 2fs\); \(\varvec{A} = [1, 0]^{T} \otimes\varvec{\varLambda}\otimes\varvec{\varGamma}\) is the design matrix to the DD integer ambiguity vector \(\varvec{z} \in {\mathbb{Z}}^{n}\) with \(n = f(s - 1)\). \(\varvec{B} = [\varvec{e}_{2f} \otimes \varvec{G},\,\varvec{I}_{2f} \otimes \varvec{e}_{s} ]\) is the design matrix to the real parameter vector \(\varvec{b} = [\varvec{x}^{T} ,\delta \bar{\varvec{t}}^{T} ,d\varvec{t}^{T} ]^{T} \in {\mathbb{R}}^{p}\) with \(p = 2f + 3\). \(\varvec{\epsilon}_{\varvec{y}}\) is the random observation noise, assumed to be normally distributed with zero mean and covariance matrix \(\varvec{Q}_{{\varvec{yy}}}\), where \(\varvec{Q}_{{\varvec{\phi \phi }}}\) and \(\varvec{Q}_{{\varvec{pp}}}\) are the covariance matrices of SD phase and code observations. Here the cross-correlation between phase and code observations is ignored.
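The rank deficiency of model (1) and the full rank of model (4) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the geometry matrix G is replaced by a random matrix, the wavelengths are approximate BeiDou B1/B2/B3 values, and s = 6 satellites are chosen arbitrarily. The second block of B is built as \(\varvec{I}_{2f} \otimes \varvec{e}_{s}\), which is the dimension implied by \(p = 2f + 3\).

```python
import numpy as np

f, s = 3, 6                                  # frequencies, satellites (illustrative)
rng = np.random.default_rng(1)
G = rng.standard_normal((s, 3))              # random stand-in for the geometry matrix
Lam = np.diag([0.192, 0.248, 0.236])         # approximate B1/B2/B3 wavelengths in metres

e_f, e_s = np.ones((f, 1)), np.ones((s, 1))
geom = np.kron(e_f, G)                       # coefficients of x, size fs x 3
clk = np.kron(np.eye(f), e_s)                # coefficients of the clocks, size fs x f
Z = np.zeros((f * s, f))

# Design matrix of Eq. (1), parameter order [x, delta_t, dt, a]
M1 = np.block([[geom, clk, Z, np.kron(Lam, np.eye(s))],
               [geom, Z, clk, np.zeros((f * s, f * s))]])

# Design matrices A and B of Eq. (5)
Gamma = np.vstack([np.zeros((1, s - 1)), np.eye(s - 1)])   # s x (s-1)
A = np.kron(np.array([[1.0], [0.0]]), np.kron(Lam, Gamma))
B = np.hstack([np.kron(np.ones((2 * f, 1)), G),
               np.kron(np.eye(2 * f), e_s)])
M4 = np.hstack([A, B])

deficiency_eq1 = M1.shape[1] - np.linalg.matrix_rank(M1)
rank_eq4 = np.linalg.matrix_rank(M4)
print(deficiency_eq1, rank_eq4, M4.shape[1])
```

In this sketch the deficiency of the original model equals f, and the reparameterized model has as many independent columns as parameters, \(n + p = f(s - 1) + 2f + 3\).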

Float and fixed solutions

In general, a three-step procedure is employed to solve model (5) based on the least squares criterion.

Step 1: Float solution

The integer property of the ambiguities \(\varvec{z} \in {\mathbb{Z}}^{n}\) is disregarded, and the so-called float solution is computed,

$$\left[ {\begin{array}{*{20}c} {\hat{\varvec{z}}} \\ {\hat{\varvec{b}}} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {\varvec{Q}_{{\hat{\varvec{z}}\hat{\varvec{z}}}} } & {\varvec{Q}_{{\hat{\varvec{z}}\hat{\varvec{b}}}} } \\ {\varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{z}}}} } & {\varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{b}}}} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\varvec{A}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{y}} \\ {\varvec{B}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{y}} \\ \end{array} } \right]$$
(6)

where

$$\left[ {\begin{array}{*{20}c} {\varvec{Q}_{{\hat{\varvec{z}}\hat{\varvec{z}}}} } & {\varvec{Q}_{{\hat{\varvec{z}}\hat{\varvec{b}}}} } \\ {\varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{z}}}} } & {\varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{b}}}} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {\varvec{A}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{A}} & {\varvec{A}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{B}} \\ {\varvec{B}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{A}} & {\varvec{B}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{B}} \\ \end{array} } \right]^{ - 1}$$

Then, the residuals of float solution are computed as

$$\hat{\varvec{\epsilon}}_{\varvec{y}} = \varvec{y} - \varvec{A}\hat{\varvec{z}} - \varvec{B}\hat{\varvec{b}}$$
(7)
$$\varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} = \varvec{Q}_{{\varvec{yy}}} - \varvec{AQ}_{{\hat{\varvec{z}}\hat{\varvec{z}}}} \varvec{A}^{T} - \varvec{BQ}_{{\hat{\varvec{b}}\hat{\varvec{b}}}} \varvec{B}^{T} - \varvec{AQ}_{{\hat{\varvec{z}}\hat{\varvec{b}}}} \varvec{B}^{T} - \varvec{BQ}_{{\hat{\varvec{b}}\hat{\varvec{z}}}} \varvec{A}^{T}$$
(8)

Step 2: Integer estimation

The float ambiguity estimate \(\hat{\varvec{z}}\) is used to compute its integer counterpart, denoted as

$$\check{\varvec{z}} = I(\hat{\varvec{z}})$$
(9)

with \(I:{\mathbb{R}}^{n} \mapsto {\mathbb{Z}}^{n}\) the integer mapping from the reals to the integers in n-dimensional space. Different choices of the mapping function I are possible, which correspond to different integer estimation methods. Integer rounding, integer bootstrapping and integer least squares (ILS) are examples of such integer estimators. Of all choices, ILS is optimal as it achieves the largest success rate (Teunissen 1999). ILS is efficiently mechanized in the LAMBDA method (Teunissen 1995). Recently, a new version of the LAMBDA software (version 3.0) was released with a more efficient search strategy and more integer estimation methods (Verhagen and Li 2012).

Step 3: Fixed solution

The float solution of the baseline parameters is updated using the fixed integer parameters,

$$\check{\varvec{b}} = \hat{\varvec{b}} - \varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{z}}}} \varvec{Q}_{{\hat{\varvec{z}}\hat{\varvec{z}}}}^{ - 1} (\hat{\varvec{z}} - \check{\varvec{z}}),\quad \varvec{Q}_{{\check{\varvec{b}}}{\check{\varvec{b}}}} = \varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{b}}}} - \varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{z}}}} \varvec{Q}_{{\hat{\varvec{z}}\hat{\varvec{z}}}}^{ - 1} \varvec{Q}_{{\hat{\varvec{z}}\hat{\varvec{b}}}}$$
(10)

and the residuals of the fixed solution are

$$\check{\varvec{\epsilon}}_{\varvec{y}} = \varvec{y} - \varvec{A}\check{\varvec{z}} - \varvec{B}\check{\varvec{b}},\quad \varvec{Q}_{{\check{\varvec{\epsilon}}\check{\varvec{\epsilon}}}} = \varvec{Q}_{{\varvec{yy}}} - \varvec{BQ}_{{\check{\varvec{b}}}{\check{\varvec{b}}}} \varvec{B}^{T}$$
(11)

It is pointed out that the covariance matrix \(\varvec{Q}_{{\check{\varvec{b}}}{\check{\varvec{b}}}}\) is derived based on the error propagation law with assumption that the integer solution \(\check{\varvec{z}}\) is deterministic. This holds true only when the success rate is sufficiently close to 1. In that case, \(\varvec{Q}_{{\check{\varvec{b}}}{\check{\varvec{b}}}} \ll \varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{b}}}}\) since after successful ambiguity fixing, the phase measurements start to act as very precise pseudorange measurements. However, if the success rate is not sufficiently high, the fixed solution \(\check{\varvec{b}}\) is not necessarily more precise than the float solution \(\hat{\varvec{b}}\) (de Jonge et al. 2000; Verhagen et al. 2013; Li et al. 2014).
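The three-step procedure of Eqs. (6), (9) and (10) can be sketched as follows. This is a simplified illustration, not the LAMBDA method: plain integer rounding is used as a stand-in for ILS, the geometry matrix is random, the observations are simulated without noise so that the outcome is easy to verify, and the standard deviations of 3 mm (phase) and 0.3 m (code) are assumed values. The function names `build_design` and `float_and_fixed` are hypothetical.

```python
import numpy as np

def build_design(G, wavelengths):
    """Assemble A and B of Eq. (5) for s satellites and f frequencies."""
    s, f = G.shape[0], len(wavelengths)
    Gamma = np.vstack([np.zeros((1, s - 1)), np.eye(s - 1)])
    A = np.kron(np.array([[1.0], [0.0]]), np.kron(np.diag(wavelengths), Gamma))
    B = np.hstack([np.kron(np.ones((2 * f, 1)), G),
                   np.kron(np.eye(2 * f), np.ones((s, 1)))])
    return A, B

def float_and_fixed(y, A, B, Qyy):
    """Float solution (Eq. 6), rounding (stand-in for Eq. 9), fixed solution (Eq. 10)."""
    n = A.shape[1]
    W = np.linalg.inv(Qyy)
    M = np.hstack([A, B])
    Q = np.linalg.inv(M.T @ W @ M)            # joint covariance of [z_hat, b_hat]
    est = Q @ (M.T @ W @ y)
    z_hat, b_hat = est[:n], est[n:]
    Qzz, Qzb, Qbz, Qbb = Q[:n, :n], Q[:n, n:], Q[n:, :n], Q[n:, n:]
    z_fix = np.rint(z_hat)                    # integer rounding, NOT integer least squares
    gain = Qbz @ np.linalg.inv(Qzz)
    b_fix = b_hat - gain @ (z_hat - z_fix)
    Qbb_fix = Qbb - gain @ Qzb
    return {"z_hat": z_hat, "b_hat": b_hat, "z_fix": z_fix,
            "b_fix": b_fix, "Qbb": Qbb, "Qbb_fix": Qbb_fix}

rng = np.random.default_rng(2)
f, s = 3, 6
A, B = build_design(rng.standard_normal((s, 3)), [0.192, 0.248, 0.236])
Qyy = np.diag(np.r_[np.full(f * s, 0.003 ** 2), np.full(f * s, 0.3 ** 2)])
z_true = rng.integers(-10, 11, size=A.shape[1]).astype(float)
b_true = rng.standard_normal(B.shape[1])
y = A @ z_true + B @ b_true                   # noise-free simulated observations
sol = float_and_fixed(y, A, B, Qyy)
print(np.sqrt(np.diag(sol["Qbb"])[:3]), np.sqrt(np.diag(sol["Qbb_fix"])[:3]))
```

The printed formal standard deviations of the baseline components illustrate how much smaller \(\varvec{Q}_{{\check{\varvec{b}}}{\check{\varvec{b}}}}\) is than \(\varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{b}}}}\) when the ambiguities are treated as deterministic.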

Overall test, w-test and MDB

As is well known for the Gauss-Markov model, the least squares solution is optimal only when neither outliers nor any other misspecifications of the functional and stochastic model exist (Koch 1999). It is therefore important to validate this precondition by proper statistical testing. Two test statistics, the overall test and the w-test, are commonly applied to check the specification of the mathematical model. The overall test checks the overall discrepancy between the underlying observation model and the real observations, while the w-test checks whether outliers are present in individual observations. In GNSS applications, one can apply these two statistical tests to both float and fixed solutions.

Once the float solution is obtained in the first step of solving the mixed GNSS model, one can apply the overall test to check the compatibility of the mathematical model. The overall test statistic is (Koch 1999; Teunissen 2006)

$$T_{q} = \frac{{\hat{\varvec{\epsilon}}_{\varvec{y}}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \hat{\varvec{\epsilon}}_{\varvec{y}} }}{q}$$
(12)

For the null hypothesis that no misspecification exists, the overall statistic has a Fisher distribution with \(q = m - n - p = f(s - 1) - 3\) and \(\infty\) degrees of freedom, i.e., \(T_{q} \,\sim\,F(q,\infty )\). Given the correct stochastic model \(\varvec{Q}_{{\varvec{yy}}}\), it is emphasized that the expectation of \(T_{q}\) is equal to 1 if the functional model is overall well specified. Therefore, given a significance level α, if \(T_{q} < F_{1 - \alpha } (q,\infty )\), we accept the null hypothesis that there is no misspecification in the functional and/or stochastic model; otherwise, we accept the alternative hypothesis that a misspecification exists in the functional and/or stochastic model.

If the null hypothesis is rejected, one may then need to further identify the cause of the misspecification between model and data. Usually, one starts with testing for outliers in individual observations by using the w-test. The w-test statistic of the ith observation reads (Baarda 1968; Teunissen 2006)

$$w_{i} = \frac{{\varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \hat{\varvec{\epsilon}}_{\varvec{y}} }}{{\sqrt {\varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{c}_{i} } }}$$
(13)

where \(\varvec{c}_{i}\) is the m-column vector with all elements equal to 0 except the ith element, which is 1. The \(w_{i}\) is standard normally distributed with zero mean [i.e., \(w_{i} \,\sim\,N(0,1)\)] for null hypothesis \(H_{0}\) and with nonzero mean (i.e., non-centrality parameter \(\sqrt {\varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{c}_{i} } \nabla\) with \(\nabla\) an unknown scalar as expectation of bias) for alternative hypothesis \(H_{a}\). In any statistical hypothesis test, one encounters the type I error of false alarm and the type II error of missed detection, namely the error of rejecting a correct hypothesis and the error of accepting a wrong hypothesis (Neyman and Pearson 1933). In the w-test, with a significance level α, the null hypothesis that the ith observation is not an outlier will be accepted if \(\left| {w_{i} } \right| < N_{1 - \alpha /2}\); otherwise, the corresponding alternative hypothesis will be accepted if it has the largest \(\left| {w_{i} } \right|\) of all m alternatives. In such case, the corresponding detection power, \(\gamma = 1 - \beta\) with β being the probability of the type II error, can be computed under \(H_{a}\).
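A minimal sketch of Eqs. (12) and (13) is given below. For brevity it uses a generic linear model with a single design matrix B (in the float case one would stack \([\varvec{A},\varvec{B}]\)), random illustrative data, and a noise-free simulation with one injected bias so that the behaviour of the w-statistics is easy to inspect. The function name `overall_and_w` and all numerical values are assumptions made for illustration.

```python
import numpy as np

def overall_and_w(y, B, Qyy):
    """Return the overall statistic T_q, all w-statistics and the w-test scale factors."""
    m, p = B.shape
    W = np.linalg.inv(Qyy)
    Qbb = np.linalg.inv(B.T @ W @ B)
    res = y - B @ (Qbb @ (B.T @ W @ y))          # least squares residuals
    Qee = Qyy - B @ Qbb @ B.T                    # covariance matrix of the residuals
    T_q = float(res @ W @ res) / (m - p)         # Eq. (12) with q = m - p
    scale = np.sqrt(np.diag(W @ Qee @ W))        # sqrt(c_i^T Q^-1 Qee Q^-1 c_i) for all i
    w = (W @ res) / scale                        # Eq. (13) for all i at once
    return T_q, w, scale

rng = np.random.default_rng(3)
m, p = 12, 3
B = rng.standard_normal((m, p))
Qyy = np.diag(rng.uniform(0.5, 2.0, m))
bias_index, bias = 7, 5.0
y = B @ rng.standard_normal(p)                   # noise-free observations
y[bias_index] += bias                            # inject a single outlier
T_q, w, scale = overall_and_w(y, B, Qyy)
print(T_q, int(np.argmax(np.abs(w))))
```

Without noise, the w-statistic of the biased observation equals the bias times the scale factor, i.e., the non-centrality expression given above, and it is the largest in absolute value among all m statistics.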

In theory, with a significance level \(\alpha_{0}\), the largest \(\left| \nabla \right|\) will receive the largest detection power \(\gamma\). If the detection power is further controlled to a level \(\gamma_{0}\), the absolute non-centrality parameter \(\sqrt {\lambda_{0} }\) as a function of \(\alpha_{0}\) and \(\gamma_{0}\) can be obtained. For instance, for \(\alpha_{0} = 0.001\) and \(\gamma_{0} = 0.8\), it follows that \(\lambda_{0} = 17\). Once the non-centrality parameter is known, the corresponding size of the bias is (Baarda 1968; Teunissen 2006)

$$\left| \nabla \right| = \sqrt {\frac{{\lambda_{0} }}{{\varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{c}_{i} }}}$$
(14)

If the outlier is smaller than this size, the testing power will be smaller than \(\gamma_{0}\). Hence, this size is defined as the minimal detectable bias (MDB) related to the probabilities \(\alpha_{0}\) and \(\gamma_{0}\).

For the fixed solutions, one can apply the overall test and the w-test and compute the MDB exactly following (12), (13) and (14), respectively; but now \(\check{\varvec{\epsilon}}_{\varvec{y}}\) and \(\varvec{Q}_{{\check{\varvec{\epsilon}}\check{\varvec{\epsilon}}}}\) must be used instead of their float counterparts \(\hat{\varvec{\epsilon}}_{\varvec{y}}\) and \(\varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}}\). Note that in the overall test the degree of freedom becomes \(q = 2f(s - 1) - 3\) since the \(f(s - 1)\) DD ambiguities are fixed.
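The value \(\lambda_{0} = 17\) and the MDB of Eq. (14) can be reproduced with a short sketch. It assumes the usual approximation \(\sqrt{\lambda_{0}} \approx N_{1 - \alpha_{0}/2} + N_{\gamma_{0}}\), which neglects the contribution of the far rejection tail to the power, and it uses a toy model (m repeated observations of a single unknown with equal variance) for which the MDB has a closed form. The function names and the values m = 10 and σ = 0.3 are illustrative assumptions.

```python
from statistics import NormalDist
import numpy as np

def noncentrality(alpha0, gamma0):
    """Approximate lambda_0 for a two-sided w-test (far tail neglected)."""
    nd = NormalDist()
    return (nd.inv_cdf(1.0 - alpha0 / 2.0) + nd.inv_cdf(gamma0)) ** 2

def mdb(B, Qyy, lam0):
    """Minimal detectable bias of every observation following Eq. (14)."""
    W = np.linalg.inv(Qyy)
    Qee = Qyy - B @ np.linalg.inv(B.T @ W @ B) @ B.T
    return np.sqrt(lam0 / np.diag(W @ Qee @ W))

lam0 = noncentrality(0.001, 0.8)

# Toy model: m direct observations of one unknown, equal standard deviation sigma
m, sigma = 10, 0.3
B = np.ones((m, 1))
Qyy = sigma ** 2 * np.eye(m)
mdb_values = mdb(B, Qyy, lam0)

# Closed form for this toy model: sigma * sqrt(lambda_0 / (1 - 1/m))
mdb_closed_form = sigma * np.sqrt(lam0 / (1.0 - 1.0 / m))
print(round(lam0, 2), mdb_values[0], mdb_closed_form)
```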

To intuitively get some insight on how the stochastic model (covariance matrix \(\varvec{Q}_{{\varvec{yy}}}\)) affects the least squares solutions and the hypothesis testing statistics, we assume simply that the structure of \(\varvec{Q}_{{\varvec{yy}}}\) is correct but scaled by a factor κ, i.e., \(\varvec{Q}_{{\varvec{yy}}} \to \kappa \varvec{Q}_{{\varvec{yy}}}\). Then, the least squares float estimate, \(\hat{\varvec{b}}\), is invariant but its covariance matrix \(\varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{b}}}} \to \kappa \varvec{Q}_{{\hat{\varvec{b}}\hat{\varvec{b}}}}\), which is the case also for the fixed solution \(\check{\varvec{b}}\). The overall and w-test statistics as well as MDB become \(T_{q} \to T_{q} /\kappa\), \(w_{i} \to w_{i} /\sqrt \kappa\) and \(\left| \nabla \right| \to \sqrt \kappa \left| \nabla \right|\), respectively. It is obvious that the scaled stochastic model has immediate effect on both the overall and w-test statistics and MDB although it does not affect the parameter estimate \(\hat{\varvec{b}}\). In the following, we will numerically demonstrate how the elevation-dependent precisions, the cross-correlations and time correlations in the stochastic model affect the hypothesis tests by comparing between the realistic and empirical stochastic models.
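The scaling relations stated above can be verified numerically. The sketch below assumes a generic linear model with random illustrative data and a full (non-diagonal) positive-definite covariance matrix; the helper `reliability_stats` and the factor κ = 4 are hypothetical choices.

```python
import numpy as np

def reliability_stats(y, B, Qyy, lam0=17.0):
    """Return the estimate, overall statistic, w-statistics and MDBs for given Qyy."""
    m, p = B.shape
    W = np.linalg.inv(Qyy)
    Qbb = np.linalg.inv(B.T @ W @ B)
    b_hat = Qbb @ (B.T @ W @ y)
    res = y - B @ b_hat
    d = np.diag(W @ (Qyy - B @ Qbb @ B.T) @ W)
    T_q = float(res @ W @ res) / (m - p)
    return b_hat, T_q, (W @ res) / np.sqrt(d), np.sqrt(lam0 / d)

rng = np.random.default_rng(4)
m, p, kappa = 15, 4, 4.0
B = rng.standard_normal((m, p))
L = rng.standard_normal((m, m))
Qyy = L @ L.T + np.eye(m)                      # arbitrary positive-definite matrix
y = rng.standard_normal(m)

b1, T1, w1, mdb1 = reliability_stats(y, B, Qyy)
b2, T2, w2, mdb2 = reliability_stats(y, B, kappa * Qyy)

print(np.allclose(b1, b2),                     # estimate is invariant
      np.isclose(T2, T1 / kappa),              # T_q -> T_q / kappa
      np.allclose(w2, w1 / np.sqrt(kappa)),    # w_i -> w_i / sqrt(kappa)
      np.allclose(mdb2, np.sqrt(kappa) * mdb1))  # MDB -> sqrt(kappa) * MDB
```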

Realistic stochastic model estimation

In this section, we present the formulation of the stochastic model and its estimation. Here, the stochastic model refers to the SD functional model with unknown variance and covariance components. The LS VCE method will be employed for solving these unknown components.

Formulation of unknown stochastic model

The following issues are taken into account in formulating the unknown stochastic model. First, to address the satellite-specific variance and its elevation dependence, the unknown variances are assigned to individual satellites for each observation type and frequency type over a short period of K epochs during which the satellite elevation is nearly invariant. Second, the cross-correlations are assumed to be present between two arbitrary frequencies for phase and code, respectively, while absent between phase and code. In addition, the different time correlations are assigned for each frequency and each observation type.

Under the above assumptions on the stochastic model, the covariance matrix of SD observations for K consecutive epochs follows as

$$\varvec{Q}_{{\varvec{yy}}} = \varvec{Q}_{T} \otimes \varvec{Q}_{C} \otimes \varvec{Q}_{E}$$
(15)

where the matrices \(\varvec{Q}_{C} = {\text{blkdiag}}\, (\varvec{Q}_{{C,\varvec{\phi}}} ,\varvec{Q}_{{C,\varvec{p}}} )\), \(\varvec{Q}_{T}\) and \(\varvec{Q}_{E}\) are defined as

$$\varvec{Q}_{{C,\varvec{\phi}}} = \left[ {\begin{array}{*{20}c} {\sigma_{{\phi_{1} }}^{2} } & {\sigma_{{\phi_{1} \phi_{2} }} } & {\sigma_{{\phi_{1} \phi_{3} }} } \\ {\sigma_{{\phi_{1} \phi_{2} }} } & {\sigma_{{\phi_{2} }}^{2} } & {\sigma_{{\phi_{2} \phi_{3} }} } \\ {\sigma_{{\phi_{1} \phi_{3} }} } & {\sigma_{{\phi_{2} \phi_{3} }} } & {\sigma_{{\phi_{3} }}^{2} } \\ \end{array} } \right]$$
(16)

and

$$\varvec{Q}_{T} = \left[ {\begin{array}{*{20}c} 1 & {\sigma_{[1]} } & \cdots & {\sigma_{[K - 1]} } \\ {\sigma_{[1]} } & 1 & \cdots & {\sigma_{[K - 2]} } \\ \vdots & \vdots & \ddots & \vdots \\ {\sigma_{[K - 1]} } & {\sigma_{[K - 2]} } & \cdots & 1 \\ \end{array} } \right]$$
(17)

and

$$\varvec{Q}_{E} = 2 \times {\text{diag([}}\sigma_{1}^{2} , \ldots ,\sigma_{s}^{2} ] )$$
(18)

The diagonal and off-diagonal elements of \(\varvec{Q}_{{C,\varvec{\phi}}}\) denote the variances and the covariances between observation types, respectively. \(\varvec{Q}_{{C,\varvec{p}}}\) has the same structure as \(\varvec{Q}_{{C,\varvec{\phi}}}\). The \(K \times K\) Toeplitz matrix \(\varvec{Q}_{T}\) represents the possible time correlation of observations, where K is the number of observation epochs. Finally, \(\varvec{Q}_{E}\) is an \(s \times s\) matrix representing the dependence of observation precision on the satellite elevation, where s is the number of satellites. Here, the factor 2 is due to the single differencing of the observations; it relies on the fact that the satellite elevations at the two stations are very close, with a difference smaller than 0.5° for baselines up to 20 km.

One may express the observation precision as an elevation-dependent function \(\sigma_{\theta } = f(\varvec{c}|\theta )\) with unknown c and observation elevation \(\theta\). This elevation can be the satellite elevation of either baseline station or the average elevation of two baseline stations. In such case,

$$\varvec{Q}_{E} = 2 \times {\text{diag([}}f^{2} (\varvec{c}|\theta_{1} ), \ldots ,f^{2} (\varvec{c}|\theta_{s} ) ] )$$
(19)

Then, one estimates the unknown c instead of satellite variances. These unknown variance and covariance components in \(\varvec{Q}_{C}\), \(\varvec{Q}_{T}\) and \(\varvec{Q}_{E}\) will be estimated by using the variance component estimation (VCE) method.
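The Kronecker structure of Eq. (15) can be assembled as in the following sketch. All numerical values are assumptions for illustration only: an exponentially decaying time correlation, a common cross-correlation between frequencies, standard deviations of 3 mm (phase) and 0.3 m (code), and dimensionless elevation factors from a function of the form of Eq. (19) with made-up coefficients. The helper names are hypothetical.

```python
import numpy as np

def toeplitz_corr(rho):
    """K x K symmetric Toeplitz matrix with unit diagonal; rho[k-1] is the lag-k correlation."""
    first = np.concatenate([[1.0], np.asarray(rho, dtype=float)])
    lag = np.abs(np.subtract.outer(np.arange(first.size), np.arange(first.size)))
    return first[lag]

def cross_cov(std, corr):
    """Covariance matrix of the frequencies from standard deviations and one common correlation."""
    std = np.asarray(std, dtype=float)
    R = np.full((std.size, std.size), corr)
    np.fill_diagonal(R, 1.0)
    return R * np.outer(std, std)

def blkdiag(X, Y):
    """Block-diagonal concatenation of two matrices."""
    return np.block([[X, np.zeros((X.shape[0], Y.shape[1]))],
                     [np.zeros((Y.shape[0], X.shape[1])), Y]])

K, f, s = 4, 3, 5
Q_T = toeplitz_corr(0.5 ** np.arange(1, K))                  # assumed time correlation
Q_C = blkdiag(cross_cov([0.003] * f, 0.3),                   # phase block, Eq. (16)
              cross_cov([0.3] * f, 0.2))                     # code block, same structure
elev = np.radians([15.0, 30.0, 45.0, 60.0, 80.0])
c1, c2 = 1.0, 0.2                                            # made-up, dimensionless
Q_E = 2.0 * np.diag((c1 / (np.sin(elev) + c2)) ** 2)         # Eq. (19), factor 2 from SD

Q_yy = np.kron(Q_T, np.kron(Q_C, Q_E))                       # Eq. (15)
print(Q_yy.shape, np.linalg.eigvalsh(Q_yy).min() > 0)
```

The resulting matrix has dimension \(2fsK\), and it is positive definite whenever the three factors are.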

LS VCE for estimating realistic stochastic model

There are many VCE methods to solve the formulated VCE problem, such as Helmert-type VCE, MINQUE, RMLE and LS VCE, which are equivalent under certain conditions. For more information, one can refer to Amiri-Simkooei (2007). We will employ the LS VCE method due to its superior properties as elaborated in Teunissen and Amiri-Simkooei (2008). We implement the VCE based on fixed solutions. The observation model and stochastic model are organized in the general linear Gauss-Markov model,

$${\text{E(}}\varvec{y} - \varvec{A}\check{\varvec{z}} )= \varvec{Bb},\quad \varvec{Q}_{{\varvec{yy}}} = \varvec{Q}_{0} + \mathop \sum \limits_{k = 1}^{p} \sigma_{k} \varvec{U}_{k}$$
(20)

where \(\varvec{Q}_{{\varvec{yy}}}\) is decomposed into the known part \(\varvec{Q}_{0}\) and the unknown part specified by p unknown variance and covariance components \(\sigma_{k}\) and their associated cofactor matrices \(\varvec{U}_{k}\). The normal equations of LS VCE read (Amiri-Simkooei 2007)

$$\varvec{N}\hat{\varvec{\sigma}} =\varvec{\omega}$$
(21)

where \(\hat{\varvec{\sigma}} = [{\hat{\sigma }}_{1} , \ldots ,{\hat{\sigma }}_{p} ]^{T}\) and the entries of normal matrix N and vector ω are

$$n_{kl} = {\text{tr}}\left( {\varvec{U}_{k} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{P}_{\varvec{B}}^{ \bot } \varvec{U}_{l} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{P}_{\varvec{B}}^{ \bot } } \right)$$
(22)
$$\omega_{k} = \check{\varvec{\epsilon}}_{\varvec{y}}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{U}_{k} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \check{\varvec{\epsilon}}_{\varvec{y}} - {\text{tr}}\left( {\varvec{U}_{k} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{P}_{\varvec{B}}^{ \bot } \varvec{Q}_{0} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{P}_{\varvec{B}}^{ \bot } } \right)$$
(23)

with projector matrix \(\varvec{P}_{\varvec{B}}^{ \bot } = \varvec{I} - \varvec{B}(\varvec{B}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{B})^{ - 1} \varvec{B}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1}\). Obviously, iterative computations are needed since the covariance matrix \(\varvec{Q}_{{\varvec{yy}}}\) is involved in the normal equations. Usually, given the initial values for the unknown variance and covariance components, denoted by \(\sigma_{k}^{0} ,\,(k = 1, \ldots ,p)\), one computes the initial values for \(\varvec{Q}_{{\varvec{yy}}}\) and \(\varvec{P}_{\varvec{B}}^{ \bot }\) and then the updated unknowns. The iteration continues until the unknowns computed in two consecutive iterations no longer change. For more information on computational aspects, see Li (2016).
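The iteration of Eqs. (21) to (23) can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used for the results: y stands for the ambiguity-reduced observations \(\varvec{y} - \varvec{A}\check{\varvec{z}}\), \(\varvec{Q}_{0}\) defaults to zero, the data are simulated with two groups of uncorrelated observations (true variances 1 and 4), and the convergence threshold and the function name `ls_vce` are arbitrary choices.

```python
import numpy as np

def ls_vce(y, B, U, sigma0, Q0=None, max_iter=50, tol=1e-10):
    """Iterate N sigma = omega (Eqs. 21-23) until the components are stationary."""
    m = y.size
    Q0 = np.zeros((m, m)) if Q0 is None else Q0
    sigma = np.asarray(sigma0, dtype=float)
    for _ in range(max_iter):
        Qyy = Q0 + sum(s_k * U_k for s_k, U_k in zip(sigma, U))
        W = np.linalg.inv(Qyy)
        P = np.eye(m) - B @ np.linalg.solve(B.T @ W @ B, B.T @ W)   # projector P_B^perp
        res = P @ y                                                  # least squares residuals
        WP = W @ P
        N = np.array([[np.trace(Uk @ WP @ Ul @ WP) for Ul in U] for Uk in U])
        omega = np.array([res @ W @ Uk @ W @ res - np.trace(Uk @ WP @ Q0 @ WP)
                          for Uk in U])
        new = np.linalg.solve(N, omega)
        converged = np.max(np.abs(new - sigma)) < tol * max(1.0, np.max(np.abs(new)))
        sigma = new
        if converged:
            break
    return sigma

# Simulated example: two groups of 200 observations with variances 1 and 4
rng = np.random.default_rng(5)
m1 = m2 = 200
B = rng.standard_normal((m1 + m2, 3))
noise = np.concatenate([rng.normal(0.0, 1.0, m1), rng.normal(0.0, 2.0, m2)])
y = B @ np.array([1.0, -2.0, 0.5]) + noise
U1 = np.diag(np.r_[np.ones(m1), np.zeros(m2)])
U2 = np.diag(np.r_[np.zeros(m1), np.ones(m2)])
sigma_hat = ls_vce(y, B, [U1, U2], [1.0, 1.0])
print(sigma_hat)
```

With a single component and \(\varvec{Q}_{0} = {\mathbf{0}}\), the same iteration reduces to the familiar estimator of the variance of unit weight, \(\hat{\sigma } = \varvec{\epsilon}^{T} \varvec{U}^{ - 1} \varvec{\epsilon}/(m - p)\), which provides a simple consistency check of the formulae.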

Estimation of BDS stochastic model

The purpose of this research is to numerically investigate the impacts of a realistic stochastic model on the hypothesis tests compared with those with empirical stochastic model. In addition to the effect of receiver and antenna types, the GNSS stochastic model is in principle data-dependent due to the specific observation situations. Therefore, in real applications, the realistic stochastic model should be estimated together with parameter estimation unless the assumed stochastic model can indeed adequately capture the real observation randomness.

Data description

Two data sets of triple-frequency BeiDou observations are collected on a zero baseline by using Trimble receivers and on a short baseline of about 6 m by ComNav receivers. The total number of epochs is 86,400 for both baselines, i.e., a daily observation span with sampling interval of 1 s. In the computations, the cutoff elevation is 10°.

The DD integer ambiguity resolution is the precondition for analyzing the stochastic model. Because both the zero baseline and the 6-m short baseline are precisely known, this knowledge can be used to strengthen the model considerably, so that the ambiguities can be reliably resolved epoch by epoch with the LAMBDA method.

Result of estimated stochastic model

Given the data window K = 60 epochs, we estimate the precision of each satellite per frequency and observation type. Strictly speaking, the term precision here and in the following refers to the standard deviation of the observation, i.e., the square root of its estimated variance. Then, the precision estimates of all satellites for the same observation type are sorted in ascending order of elevation. For each elevation interval of 0.5° from 10° to 90°, we take the mean of the precision estimates in the interval as the precision of this elevation. The results of elevation-dependent precisions are shown in Fig. 1 for all three-frequency phase and code observations. Taking K = 60 can guarantee the precision of the estimated precision to better than 1%, as confirmed by many numerical data analyses, although they are not presented here.
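The binning step described above can be sketched as follows. The function name `bin_precision`, the handling of empty intervals (marked as NaN) and the small input sample are assumptions for illustration; the interval width of 0.5° and the range from 10° to 90° follow the text.

```python
import numpy as np

def bin_precision(elev_deg, sigma, lo=10.0, hi=90.0, width=0.5):
    """Average the precision estimates within elevation intervals of the given width."""
    elev = np.asarray(elev_deg, dtype=float)
    sig = np.asarray(sigma, dtype=float)
    keep = (elev >= lo) & (elev <= hi)                 # discard estimates outside the range
    n_bins = int(round((hi - lo) / width))
    idx = np.floor((elev[keep] - lo) / width).astype(int)
    idx = np.minimum(idx, n_bins - 1)                  # put elev == hi into the last interval
    sums = np.bincount(idx, weights=sig[keep], minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    means = np.full(n_bins, np.nan)                    # NaN marks intervals without data
    filled = counts > 0
    means[filled] = sums[filled] / counts[filled]
    centers = lo + (np.arange(n_bins) + 0.5) * width
    return centers, means

# Tiny made-up sample: elevations in degrees and window-wise standard deviations
centers, means = bin_precision([10.1, 10.4, 10.6, 45.2, 5.0],
                               [2.0, 4.0, 5.0, 1.0, 9.0])
print(centers[:2], means[:2], means[70])
```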

Fig. 1

Estimated elevation-dependent precisions (discrete points) and their fitting curves (solid lines) for triple-frequency phase and code observations of zero baseline with Trimble receivers (a–d) and of short baseline with ComNav receivers (e–h). The solid lines in subplots (a, b) and (c, d) are for model A (Eq. 24) and B (Eq. 25) of zero-baseline data, respectively. The solid lines in subplots (e, f) and (g, h) are for model A and B of short-baseline data, respectively

In addition, two elevation-dependent functions are analyzed, denoted by model A and B, respectively. We choose model A as (Li et al. 2016; Li 2016)

$$\sigma_{\theta } = f({\varvec{c}}|\theta ) = c_{1} /({ \sin }\theta + c_{2} )$$
(24)

and its reduced version model B:

$$\sigma_{\theta } = f({\varvec{c}}|\theta ) = c/{ \sin }\theta$$
(25)

It is noted that we choose these two elevation-dependent models just as a case study because of their simplicity. Moreover, these two models are representative: model (24) can fit the elevation-dependent observation precisions very well, while model (25) fits them poorly, particularly for the zero baseline with Trimble receivers, as seen in later results. One can of course choose other elevation-dependent models, for instance an exponential function (Euler and Goad 1991), which may yield different numerical results.
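As an illustration of how models (24) and (25) might be fitted to binned precisions, the sketch below uses plain least squares. It rests on assumptions not taken from the paper: model A is fitted in a linearized form, \(\sigma_\theta \sin\theta = c_1 - c_2 \sigma_\theta\), rather than by nonlinear least squares, and the data are synthetic, generated from hypothetical parameters.

```python
import math


def fit_model_b(elev_deg, sigma):
    """Least-squares estimate of c in sigma = c / sin(theta)."""
    g = [1.0 / math.sin(math.radians(e)) for e in elev_deg]
    return sum(gi * si for gi, si in zip(g, sigma)) / sum(gi * gi for gi in g)


def fit_model_a(elev_deg, sigma):
    """Linearized fit of sigma = c1 / (sin(theta) + c2).

    The model is rearranged to sigma*sin(theta) = c1 - c2*sigma, which is
    linear in (c1, c2). This is a simplification chosen for the sketch.
    """
    y = [s * math.sin(math.radians(e)) for e, s in zip(elev_deg, sigma)]
    n = len(sigma)
    s1 = sum(sigma)
    s2 = sum(s * s for s in sigma)
    sy = sum(y)
    ssy = sum(s * yi for s, yi in zip(sigma, y))
    # Normal equations with design rows [1, -sigma_i], solved in closed form.
    det = n * s2 - s1 * s1
    c1 = (s2 * sy - s1 * ssy) / det
    c2 = (s1 * sy - n * ssy) / det
    return c1, c2


# Synthetic, noise-free data from hypothetical parameters (not from Table 1).
true_c1, true_c2 = 0.3, 0.2
elev = list(range(10, 91, 10))
sigma = [true_c1 / (math.sin(math.radians(e)) + true_c2) for e in elev]

c1_hat, c2_hat = fit_model_a(elev, sigma)
c_hat = fit_model_b(elev, sigma)

sin_e = [math.sin(math.radians(e)) for e in elev]
rss_a = sum((s - c1_hat / (x + c2_hat)) ** 2 for s, x in zip(sigma, sin_e))
rss_b = sum((s - c_hat / x) ** 2 for s, x in zip(sigma, sin_e))
```

On data that follow model A, the one-parameter model B cannot reproduce the curve and leaves a clearly larger residual sum of squares, which mirrors the qualitative behaviour described in the text.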

The results show that the observation precisions are overall elevation-dependent for all triple-frequency phase and code observations, although the dependence patterns differ by observation and receiver types. For the short baseline using ComNav receivers, both phase and code of B3 are more precise than those of B1 and B2, especially the B3 code when the elevation angle is larger than 30°. For the zero baseline using Trimble receivers, we cannot see any obvious precision difference between frequencies for phase, while the code standard deviations of B1 are relatively larger at low elevations. Model A fits the precisions better overall than model B. Model B exaggerates the elevation dependence, referred to here as over-fitting: the precision values at low elevations are overly enlarged while those at high elevations are overly reduced. The fitting parameters of models A and B are presented in Table 1, and their associated fitting curves are shown in Fig. 1. In addition, an elevation-independent model, denoted model C, is analyzed, in which the average precision over all elevations is taken as the estimate of its single parameter c.

Table 1 Fitting parameters of elevation-dependent model A and B and elevation-independent model C for zero-baseline data using Trimble receivers and short-baseline data using ComNav receivers, respectively

The estimated cross-correlation coefficients are presented in Table 2. For each receiver, six cross-correlation coefficients are computed for phase and code observations among the three frequencies. For the ComNav receiver, all cross-correlation coefficients are close to 0, with values smaller than 0.2, especially for phase, which indicates that no significant cross-correlation exists. However, for the Trimble receiver, a very significant cross-correlation with a correlation coefficient of 0.76 exists between the B2 and B3 code observations.

Table 2 Estimated cross-correlation coefficients for all three-frequency phase and code observations of zero baseline and short baseline with two types of receivers

The estimated time correlation coefficients are shown in Fig. 2 as a function of time lag for both phase and code of all three frequencies. For the zero-baseline data of Trimble receivers, time correlations are already absent at a time lag of 1 s for both phase and code of all frequencies. For the short baseline using ComNav receivers, time correlations exist for all triple-frequency phase and code observations. The time correlations of the phase observations drop sharply to a small value at a time lag of 1 s. However, for code observations the time correlations remain significant over the first 5 s.

Fig. 2

Estimated time correlation coefficients as a function of time lags for phase and code observations of zero baseline with Trimble receivers (top) and short baseline with ComNav receivers (bottom)

Impact of BDS stochastic model on reliability

We analyze the impacts of the individual stochastic quantities, namely elevation-dependent precisions, cross-correlations and time correlations, on the reliability measures, including the overall test, the w-test and the MDB.

Impact of elevation-dependent models on reliability

We demonstrate the impact of observation precisions on reliability by comparing the two elevation-dependent models, A and B, and the elevation-independent model C. Hereafter, they are also called weighting models. As an example, the Trimble zero-baseline data were processed with each of these three models, with parameters taken from Table 1.

Figure 3 illustrates the statistics of the overall test for single-epoch float and fixed solutions with the three models. Given a significance level \(\alpha = 0.05\), the critical values are computed by \(F_{0.95} (q,\infty )\) for float and fixed solutions with \(q = 3s - 6\) and \(6s - 9\) for \(f = 3\), respectively. The results of models A and C are very close to each other, while they differ significantly from those of model B. Since the baseline data were collected in an ideal environment, very few outliers were found, and these were excluded in postprocessing. In other words, the observations used are free of outliers. In such a case, if the model specifies the observations very well, the expectation of the overall statistic is equal to 1. From the figure, the statistics of models A and C are indeed overall close to 1, but those of model B deviate significantly from 1. The mean of all epoch statistics can be deemed an empirical approximation of the expectation. Therefore, the smaller the difference of the computed mean from 1, the better the corresponding weighting model. The means of the overall statistics are shown in Table 3. The result indicates that model A is best, followed by model C and then model B. The deviations of the means of the overall statistics from 1 are only 0.03 and 0.02 for float and fixed solutions of model A, while they are 0.10 and 0.07 for model C and 1.33 and 1.16 for model B, respectively.

Fig. 3

Statistics of single-epoch overall tests with two elevation-dependent models, A and B, and elevation-independent model C for Trimble zero-baseline data. The subplots (top) and (bottom) are for the float and fixed solutions, respectively

Table 3 Means of overall test statistics and probabilities of false alarm with significance level \(\alpha = 0.05\) for two elevation-dependent models A and B, and elevation-independent model C

In the absence of outliers, the computed statistics should be statistically smaller than the critical values. A statistic larger than its associated critical value leads to a false alarm. The probabilities of false alarm are shown in Table 3 for both float and fixed solutions with the three models. Model A is clearly better than the other two models, and model C is better than model B. The probabilities of false alarm for model A are smaller than 4 and 7% for float and fixed solutions, respectively, while they increase to 7.1 and 11.23% for model C. For model B, they are worst and even reach about 67 and 80% for float and fixed solutions, respectively. Roughly, the probabilities of false alarm of model A are smaller than those of model C by a factor of about 2 and than those of model B by a factor of more than 10. In addition, the false alarm probabilities of model A are much closer to the given significance level of 5%, which makes sense statistically for a clean system with only random errors. Such performance reveals that an improperly specified elevation-dependent function can yield even worse results than the elevation-independent model.
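The two indicators used above, the mean of the overall statistics and the empirical false alarm probability, can be computed as in the following sketch. Two assumptions are made here: the overall statistic is already normalized by its degrees of freedom q, so that \(F_{1-\alpha}(q,\infty) = \chi^{2}_{1-\alpha}(q)/q\), and the chi-square quantile is approximated with the Wilson-Hilferty formula to avoid external libraries. The demo statistics are hypothetical.

```python
from statistics import NormalDist, mean


def f_critical_inf(q, alpha=0.05):
    """Approximate F_{1-alpha}(q, inf) = chi2_{1-alpha}(q) / q.

    Uses the Wilson-Hilferty approximation of the chi-square quantile;
    an exact quantile function could be substituted.
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    h = 2.0 / (9.0 * q)
    return (1.0 - h + z * h ** 0.5) ** 3


def overall_test_summary(stats, q, alpha=0.05):
    """Return (mean statistic, empirical false alarm rate, critical value)."""
    crit = f_critical_inf(q, alpha)
    false_alarm = sum(t > crit for t in stats) / len(stats)
    return mean(stats), false_alarm, crit


# Hypothetical per-epoch overall statistics for an outlier-free data set.
demo_stats = [0.8, 1.1, 0.9, 2.5, 1.0]
mean_stat, p_false_alarm, crit = overall_test_summary(demo_stats, q=10)
```

For a well-specified stochastic model one would expect the mean to be near 1 and the false alarm rate to be near the chosen significance level.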

In the single-epoch float solution model, ambiguity parameters are estimated for all three-frequency phase observations. Such a model formulation leads to zero denominators in (13) and (14), i.e., \(\varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{c}_{i} = 0\), for phase observations. This means that the w-test statistics and MDBs cannot be computed for phase observations with single-epoch float solutions. Therefore, we focus on analyzing the w-test statistics and MDBs for single-epoch fixed solutions.

Figure 4 shows the computed w-test statistics as a function of elevation for all triple-frequency code and phase observations with the two elevation-dependent models, A and B, and the elevation-independent model C. Recall the theoretical relation that \(w_{i} \to w_{i} /\sqrt \kappa\) if \(\varvec{Q}_{{\varvec{yy}}} \to \kappa \varvec{Q}_{{\varvec{yy}}}\). This means that downscaling the variance (\(\kappa < 1\)) yields a larger \(w_{i}\) statistic, and vice versa. In model C, a standard deviation smaller than the actual one is assigned to low-elevation observations, which enlarges their w-statistic values. The opposite holds for high-elevation observations; see the right column of Fig. 4. As observed in Fig. 1, model B overfits the elevation dependence of the observation precisions, which means that too large a κ is assigned to low-elevation observations and too small a κ to high-elevation observations. As a result, the w-test statistics at low elevations are smaller than those at high elevations, especially for the B2 and B3 code, as seen from the figure. Model A outperforms models B and C, with w-test statistics that are basically comparable across all elevations.
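The scaling relation can be verified in one line, assuming the w-statistic takes Baarda's standard form, with the denominator quoted above and the numerator \(\varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \hat{\varvec{\epsilon}}\) (an assumption here, since Eq. 13 is not repeated in this section). Under a uniform scaling \(\varvec{Q}_{{\varvec{yy}}} \to \kappa \varvec{Q}_{{\varvec{yy}}}\), the least squares residuals \(\hat{\varvec{\epsilon}}\) are unchanged and \(\varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} \to \kappa \varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}}\), so that

$$w_{i} = \frac{\varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \hat{\varvec{\epsilon}}}{\sqrt{\varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{c}_{i}}} \;\to\; \frac{\kappa^{-1}\, \varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \hat{\varvec{\epsilon}}}{\sqrt{\kappa^{-1}\, \varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{c}_{i}}} = \frac{w_{i}}{\sqrt \kappa}$$

The relation is exact only when all observations are scaled by the same κ. When different observations are mis-scaled by different factors, as with an inadequate elevation-dependent model, it should be read as a qualitative guide to the direction of the effect.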

Fig. 4

w-test statistics of single-epoch fixed solutions as a function of elevations with elevation-dependent model A (left) and B (middle), as well as elevation-independent model C (right) for Trimble zero-baseline data

For an outlier-free observation, \(w_{i}\) follows the standard normal distribution. Given the significance level \(\alpha = 0.05\), the empirical probability of false alarm is computed as the ratio between the number of w-test statistics outside the confidence region \(\left[ {N_{\alpha /2} , N_{1 - \alpha /2} } \right]\) and the total number of w-test statistics. With elevation intervals of 10° from 10° to 90°, there are 8 elevation intervals in total, and this empirical probability of false alarm is computed for each of them. The results are shown in Fig. 5 as a function of elevation interval for phase and code observations with the three weighting models. Again, model A is best, followed by model C and model B.
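The per-interval false alarm computation can be sketched as below. The function name, argument layout and demo values are illustrative assumptions; the two-sided bound is the standard normal quantile \(N_{1 - \alpha /2}\).

```python
from statistics import NormalDist


def w_false_alarm_by_elevation(elev_deg, w_stats, alpha=0.05,
                               lo=10, hi=90, width=10):
    """Empirical false alarm rate of w-test statistics per elevation interval.

    Returns one rate per interval, or None where an interval has no data.
    """
    bound = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 5%
    n_bins = (hi - lo) // width
    hits = [0] * n_bins
    totals = [0] * n_bins
    for e, w in zip(elev_deg, w_stats):
        if not (lo <= e <= hi):
            continue
        # An elevation of exactly `hi` falls into the last interval.
        k = min(int((e - lo) // width), n_bins - 1)
        totals[k] += 1
        hits[k] += abs(w) > bound
    return [h / t if t else None for h, t in zip(hits, totals)]


# Hypothetical elevations (degrees) and w-test statistics.
elev = [12, 15, 18, 47, 85, 90]
w = [0.3, -2.4, 1.0, 1.9, -0.2, 2.1]
rates = w_false_alarm_by_elevation(elev, w)
```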

Fig. 5

Probabilities of false alarm for w-test statistics of single-epoch fixed solutions as a function of elevation intervals with two elevation-dependent models, A (left) and B (middle) and elevation-independent model C (right) for Trimble zero-baseline data. The elevation interval is 10°, and eight elevation intervals are from 10° to 90°

We now analyze the MDB results with the three weighting models. The MDBs are computed with the single-epoch fixed solution following (14) for triple-frequency code and phase observations. The results are shown in Fig. 6. Again recall the theoretical relation that \(\left| \nabla \right| \to \sqrt \kappa \left| \nabla \right|\) if \(\varvec{Q}_{{\varvec{yy}}} \to \kappa \varvec{Q}_{{\varvec{yy}}}\). This means that the MDB scales with the square root of the variance factor, i.e., it is proportional to the observation standard deviation. A more precise observation receives a smaller MDB, and vice versa. In other words, with a given significance level α and detection power γ, the detectable outlier becomes smaller if the observation precision is improved. Model A yields realistic MDBs since it reflects the observation precisions realistically. However, models B and C yield unrealistic MDBs, which can be either too small or too large. Compared with the MDBs of model A, model B yields too large MDBs at low elevations and too small ones at high elevations. In other words, outliers at low elevations that can actually be detected appear non-detectable in terms of the MDB at the given reliability level. Conversely, some normal observations at high elevations may be wrongly excluded as outliers.
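The \(\sqrt \kappa\) scaling of the MDB can be checked numerically on a toy model. The sketch below is not the GNSS model of this study: it assumes uncorrelated repeated observations of a single unknown, the standard MDB form \(\left| \nabla \right| = \sqrt {\lambda_{0} /(\varvec{c}_{i}^{T} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{c}_{i} )}\), and the common approximation \(\lambda_{0} \approx (N_{1 - \alpha /2} + N_{\gamma } )^{2}\) for the non-centrality parameter.

```python
from statistics import NormalDist


def noncentrality(alpha=0.05, gamma=0.80):
    """lambda_0 approximated as (z_{1-alpha/2} + z_gamma)^2 (far tail neglected)."""
    nd = NormalDist()
    return (nd.inv_cdf(1.0 - alpha / 2.0) + nd.inv_cdf(gamma)) ** 2


def mdb_repeated_measurement(sigmas, alpha=0.05, gamma=0.80):
    """MDBs for uncorrelated observations of one unknown (toy model, A = ones).

    For this model, c_i^T Q^-1 Q_ee Q^-1 c_i
        = 1/s_i^2 - (1/s_i^4) / sum_j(1/s_j^2).
    """
    lam0 = noncentrality(alpha, gamma)
    wsum = sum(1.0 / s ** 2 for s in sigmas)
    return [(lam0 / (1.0 / s ** 2 - (1.0 / s ** 4) / wsum)) ** 0.5
            for s in sigmas]


# Hypothetical standard deviations in metres; variances scaled by kappa = 4.
sigmas = [0.3, 0.3, 0.3, 0.3]
kappa = 4.0
mdb_ref = mdb_repeated_measurement(sigmas)
mdb_scaled = mdb_repeated_measurement([s * kappa ** 0.5 for s in sigmas])
ratios = [b / a for a, b in zip(mdb_ref, mdb_scaled)]  # each close to sqrt(kappa)
lam0 = noncentrality()
```

Scaling every variance by κ = 4 doubles every MDB in this toy model, in line with the relation recalled in the text.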

Fig. 6

MDBs of single-epoch fixed solution as a function of elevations with two elevation-dependent models, A (left) and B (middle), and elevation-independent model C (right) for Trimble zero-baseline data

Impact of cross-correlation on reliability

To investigate the impact of cross-correlation on reliability, we use the B2 and B3 code observations of the Trimble zero baseline for the baseline solution. These observations are strongly correlated, with a correlation coefficient of 0.76 (Table 2). For simplicity, we do not incorporate B1 data, whose correlations with B2 and B3 data are minor. Referring to (1), the single-epoch SD model with only B2 and B3 code observations reads

$${\text{E}}\left( {\left[ {\begin{array}{*{20}c} {\varvec{p}_{2} } \\ {\varvec{p}_{3} } \\ \end{array} } \right]} \right) = \left[ {\begin{array}{*{20}c} \varvec{G} & {\varvec{e}_{s} } & {\mathbf{0}} \\ \varvec{G} & {\varvec{e}_{s} } & {\varvec{e}_{s} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} \varvec{x} \\ {dt_{2} } \\ {dt_{3} } \\ \end{array} } \right],\quad 2\left[ {\begin{array}{*{20}c} {\varvec{Q}_{{p_{2} p_{2} }} } & {\varrho_{\text{c}} \varvec{Q}_{\text{c}} } \\ {\varrho_{\text{c}} \varvec{Q}_{\text{c}} } & {\varvec{Q}_{{p_{3} p_{3} }} } \\ \end{array} } \right]$$
(26)

where \(\varvec{Q}_{{p_{2} p_{2} }} = {\text{diag([}}(\sigma_{{p_{2} }}^{1} )^{2} , \ldots ,(\sigma_{{p_{2} }}^{s} )^{2} ] )\), in which \(\sigma_{{p_{2} }}^{s}\) is the undifferenced observation precision of satellite s computed by the elevation-dependent model A (24) with fitting parameters from Table 1. \(\varvec{Q}_{{p_{3} p_{3} }}\) is defined analogously to \(\varvec{Q}_{{p_{2} p_{2} }}\) and computed with its own elevation-dependent parameters. The matrix \(\varvec{Q}_{\text{c}} = \varvec{Q}_{{p_{2} p_{2} }}^{1/2} \varvec{Q}_{{p_{3} p_{3} }}^{1/2}\) is scaled by the cross-correlation coefficient \(\varrho_{\text{c}}\) between the B2 and B3 code observations.
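The covariance matrix in (26) can be assembled as in the following sketch. Because \(\varvec{Q}_{{p_{2} p_{2} }}\) and \(\varvec{Q}_{{p_{3} p_{3} }}\) are diagonal, \(\varvec{Q}_{\text{c}}\) is diagonal with entries \(\sigma_{{p_{2} }}^{k} \sigma_{{p_{3} }}^{k}\). The function name and the demo standard deviations are hypothetical.

```python
def sd_code_covariance(sig2, sig3, rho_c):
    """Covariance of stacked single-differenced B2 and B3 code observations.

    sig2, sig3: undifferenced standard deviations per satellite (same order).
    rho_c:      cross-correlation coefficient between B2 and B3 code.
    The leading factor 2 accounts for single differencing, as in (26).
    """
    s = len(sig2)
    n = 2 * s
    Q = [[0.0] * n for _ in range(n)]
    for k in range(s):
        Q[k][k] = 2.0 * sig2[k] ** 2
        Q[s + k][s + k] = 2.0 * sig3[k] ** 2
        cross = 2.0 * rho_c * sig2[k] * sig3[k]  # entry of 2 * rho_c * Q_c
        Q[k][s + k] = cross
        Q[s + k][k] = cross
    return Q


# Hypothetical standard deviations (metres) for two satellites.
sig_b2 = [0.30, 0.20]
sig_b3 = [0.25, 0.15]
Q_real = sd_code_covariance(sig_b2, sig_b3, rho_c=0.76)  # realistic model
Q_emp = sd_code_covariance(sig_b2, sig_b3, rho_c=0.0)    # empirical model
implied_rho = Q_real[0][2] / (Q_real[0][0] * Q_real[2][2]) ** 0.5
```

Setting `rho_c=0.0` reproduces the empirical model without cross-correlation, so the two cases compared in the text differ only in the off-diagonal blocks.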

For the realistic stochastic model with \(\varrho_{\text{c}} = 0.76\) and the empirical model with \(\varrho_{\text{c}} = 0\), one can solve the model (26) to obtain the corresponding LS solutions epoch by epoch. The statistics of the overall tests are then computed for the two stochastic models, as shown in Fig. 7. They are very close to each other and their means are 1.0756 and 0.9176, respectively. With significance level α = 0.01, the probabilities of false alarm are 2.24 and 2.75% for the stochastic models with \(\varrho_{\text{c}} = 0.76\) and 0, respectively.

Fig. 7

Statistics of single-epoch overall tests for two stochastic models with \({\varrho }_{\text{c}} = 0.76\) and 0 for Trimble zero-baseline data

The w-test statistics of all observations are computed for these two stochastic models with and without cross-correlations. The histograms of the w-test statistics are shown in Fig. 8. For these two stochastic models, the means of the w-test statistics are \(5 \times 10^{-5}\) and 0.001 with standard deviations 1.0345 and 0.9544, respectively. Although the w-test statistics with the stochastic model of \(\varrho_{\text{c}}\) = 0.76 are slightly closer to the standard normal distribution, they are practically very similar to those with the stochastic model of \(\varrho_{\text{c}}\) = 0. For the significance level α = 0.01, the probabilities of false alarm are 0.79 and 1.34%, respectively. In summary, the cross-correlation in the stochastic model has very minor effects on the overall test and the w-test.

Fig. 8

Histogram of w-test statistics for two stochastic models with \(\varrho_{\text{c}} = 0\) (left) and 0.76 (right) for Trimble zero-baseline data

Let us now analyze the impact of cross-correlation on the MDBs. The MDBs are computed for all B2 and B3 code observations of all satellites, both considering and ignoring the cross-correlations in the stochastic model, and are shown in Fig. 9. Considering the cross-correlation decreases the MDBs, i.e., smaller outliers become detectable if the cross-correlations are properly taken into account. In principle, correlated B2 and B3 observations should contain less information than independent ones. Hence, with less information content, outlier detection should become more difficult and the MDBs should be larger. This theoretical expectation is opposite to the result obtained in Fig. 9. This apparent contradiction motivates further analysis. The correlation coefficient of the two w-test statistics of the B2 and B3 code observations for a satellite is defined as (Förstner 1983)

$$\varrho_{{w_{i} w_{j} }} = \frac{{\varvec{c}_{i}^{T} \varvec{\varOmega c}_{j} }}{{\sqrt {\varvec{c}_{i}^{T} \varvec{\varOmega c}_{i} } \sqrt {\varvec{c}_{j}^{T} \varvec{\varOmega c}_{j} } }} = \frac{{\varvec{\varOmega}\left( {i,j} \right)}}{{\sqrt {\varvec{\varOmega}\left( {i,i} \right)} \sqrt {\varvec{\varOmega}\left( {j,j} \right)} }}$$
(27)

where \(\varvec{\varOmega}= \varvec{Q}_{{\varvec{yy}}}^{ - 1} \varvec{Q}_{{\hat{\varvec{\epsilon}}\hat{\varvec{\epsilon}}}} \varvec{Q}_{{\varvec{yy}}}^{ - 1}\). The larger the magnitude of the correlation coefficient \(\varrho_{{w_{i} w_{j} }}\) between the ith and jth observations, the more difficult it is to decide whether the outlier lies in the ith or the jth observation. In other words, one may detect the outlier but assign it to the wrong observation, which results in a type III error (Förstner 1983; Yang et al. 2013). In real applications, if the w-test statistics of two observations are highly correlated and an outlier is statistically detected in either observation, an advisable strategy is to exclude both observations simultaneously to control the type III error.
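Equation (27) only needs three entries of \(\varvec{\varOmega}\). The sketch below evaluates it on a toy model that is an assumption of this example and not the GNSS model of the study: uncorrelated repeated observations of a single unknown, for which \(\varvec{\varOmega}\) has a closed form.

```python
def omega_repeated_measurement(sigmas):
    """Omega = Q^-1 Q_ee Q^-1 for uncorrelated observations of one unknown.

    With weights w_i = 1/sigma_i^2, this toy model gives
    Omega(i, j) = delta_ij * w_i - w_i * w_j / sum(w).
    """
    w = [1.0 / s ** 2 for s in sigmas]
    wsum = sum(w)
    n = len(w)
    return [[(w[i] if i == j else 0.0) - w[i] * w[j] / wsum
             for j in range(n)] for i in range(n)]


def w_correlation(omega, i, j):
    """Correlation coefficient of two w-test statistics, following Eq. (27)."""
    return omega[i][j] / (omega[i][i] * omega[j][j]) ** 0.5


# Five equally precise observations (hypothetical sigma in metres).
omega = omega_repeated_measurement([0.3] * 5)
rho_w = w_correlation(omega, 0, 1)
```

Even for uncorrelated observations the w-statistics are correlated through the shared estimate; in this toy model with n equally precise observations the coefficient equals −1/(n − 1).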

Fig. 9

MDBs of B2 and B3 code observations and their means for all satellites without cross-correlation, \(\varrho_{\text{c}} = 0\) (left), and with cross-correlation, \(\varrho_{\text{c}} = 0.76\) (right) for Trimble zero-baseline data. The lines in subplots (a–d) indicate the MDBs for individual satellites

The correlation coefficients \(\varrho_{{w_{i} w_{j} }}\) between the two w-test statistics of the B2 and B3 code observations of individual satellites are computed and illustrated in Fig. 10. The mean correlation coefficients over the whole observation span for all satellites are shown in the figure as well. The correlation strengthens in magnitude from about –0.2 (\(\varrho_{\text{c}}\) = 0) to –0.8 (\(\varrho_{\text{c}} = 0.76\)). This makes sense since the cross-correlation makes the B2 and B3 observations of one satellite correlated, and consequently their w-test statistics as well. Therefore, if an outlier is detected in the B2 or B3 observation, it is advisable to exclude both the B2 and B3 observations of this satellite to control the type III error for high reliability of the positioning solutions.

Fig. 10

Correlation of w-test statistics between B2 and B3 observations of each satellite without (left) and with (right) cross-correlation considered for Trimble zero baseline

Impact of time correlation on reliability

To demonstrate the impact of time correlation on reliability, we use the short-baseline code observations collected with ComNav receivers. We compute the baseline solutions based on the SD model by using triple-frequency code observations of two consecutive epochs, where all triple-frequency code observations are time correlated. The associated model reads

$${\text{E}}\left( {\left[ {\begin{array}{*{20}c} {\varvec{p}_{k} } \\ {\varvec{p}_{k + 1} } \\ \end{array} } \right]} \right) = \left[ {\begin{array}{*{20}c} {\varvec{e}_{3} \otimes \varvec{G}_{k} } & {\varvec{I}_{3} \otimes \varvec{e}_{s} } & {\mathbf{0}} \\ {\varvec{e}_{3} \otimes \varvec{G}_{k + 1} } & {\mathbf{0}} & {\varvec{I}_{3} \otimes \varvec{e}_{s} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} \varvec{x} \\ {dt_{k} } \\ {dt_{k + 1} } \\ \end{array} } \right]$$
(28)

with stochastic model

$${\text{D}}\left( {\left[ {\begin{array}{*{20}c} {\varvec{p}_{k} } \\ {\varvec{p}_{k + 1} } \\ \end{array} } \right]} \right) = \left( {\left[ {\begin{array}{*{20}c} {\varvec{I}_{3} } & {{\varvec{\varrho }}_{\text{t}} } \\ {{\varvec{\varrho }}_{\text{t}} } & {\varvec{I}_{3} } \\ \end{array} } \right] \otimes \varvec{I}_{s} } \right)\left( {\varvec{I}_{2} \otimes \varvec{Q}_{{\varvec{pp}}} } \right)$$
(29)

where \(\varvec{Q}_{{\varvec{pp}}} = {\text{blkdiag(}}\varvec{Q}_{{\varvec{p}_{1} }} ,\varvec{Q}_{{\varvec{p}_{2} }} ,\varvec{Q}_{{\varvec{p}_{3} }} )\) is the covariance matrix of single-epoch SD triple-frequency code observations. \(\varvec{Q}_{{\varvec{p}_{i} }}\) is the covariance matrix of single-epoch SD code observations of the ith frequency, which is computed by the elevation-dependent model A (24) with fitting parameters in Table 1. The diagonal matrix \({\varvec{\varrho }}_{\text{t}} = {\text{diag([}}\varrho_{{{\text{t}},1}} , \varrho_{{{\text{t}},2}} ,\varrho_{{{\text{t}},3}} ] )\) consists of the time correlation coefficients between two consecutive epochs for the triple-frequency code observations. Read from the bottom-right subplot of Fig. 2, they are equal to 0.7, 0.54 and 0.75 for the three frequencies, respectively.
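The Kronecker structure of (29) can be made concrete with the following sketch in plain Python. It assumes that \(\varvec{Q}_{{\varvec{pp}}}\) is diagonal and ordered frequency by frequency, so that the product in (29) is symmetric; the helper names and the demo variances are hypothetical.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]


def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def diag(v):
    return [[v[i] if i == j else 0.0 for j in range(len(v))]
            for i in range(len(v))]


def time_correlated_covariance(q_pp_diag, rho_t, s):
    """Two-epoch covariance ([I3 rho; rho I3] kron I_s)(I_2 kron Q_pp), Eq. (29).

    q_pp_diag: 3*s single-epoch SD variances, ordered frequency by frequency.
    rho_t:     three lag-one time correlation coefficients, one per frequency.
    """
    R = diag(rho_t)
    I3 = eye(3)
    C = [I3[i] + R[i] for i in range(3)] + [R[i] + I3[i] for i in range(3)]
    return matmul(kron(C, eye(s)), kron(eye(2), diag(q_pp_diag)))


# Hypothetical variances (m^2) for s = 2 satellites on three frequencies.
q_diag = [0.09, 0.04, 0.16, 0.09, 0.04, 0.01]
D = time_correlated_covariance(q_diag, [0.7, 0.54, 0.75], s=2)
```

Passing `[0.0, 0.0, 0.0]` for `rho_t` yields the block-diagonal model that ignores time correlation.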

For the two stochastic models specified by \({\varvec{\varrho }}_{\text{t}} = 0\) and \({\varvec{\varrho }}_{\text{t}} = {\text{diag([}}0.7,0.54,0.75 ] )\), i.e., ignoring and considering time correlation, the overall and w-test statistics and the MDBs are computed. The statistics of the overall test and the histograms of the w-test statistics are shown in Figs. 11 and 12, respectively. The means of the overall test statistics are 1.1670 and 1.1515 for \({\varvec{\varrho }}_{\text{t}} = 0\) and \({\varvec{\varrho }}_{\text{t}} \ne 0\), respectively. Here \({\varvec{\varrho }}_{\text{t}} \ne 0\) means that the realistic time correlations are taken into account. The corresponding probabilities of false alarm are 3.54 and 2.29% for significance level α = 0.01. The means of the w-test statistics are 0.0025 and 0.0014 for \({\varvec{\varrho }}_{\text{t}} = 0\) and \({\varvec{\varrho }}_{\text{t}} \ne 0\), respectively. The corresponding probabilities of false alarm are 2.26 and 2.12% for the significance level α = 0.01. Therefore, in general, the time correlation has a minor impact on the overall test and the w-test.

Fig. 11

Statistics of overall tests of single-epoch baseline solutions with two stochastic models specified by \({\varvec{\varrho }}_{\text{t}} = 0\) and \({\varvec{\varrho }}_{\text{t}} \ne 0\) for ComNav short-baseline data

Fig. 12

Histograms of w-test statistics of single-epoch baseline solutions with two stochastic models specified by \({\varvec{\varrho }}_{\text{t}} = 0\) (left) and \({\varvec{\varrho }}_{\text{t}} \ne 0\) (right) for ComNav short-baseline data

The MDB results of all triple-frequency code observations are shown in Fig. 13. For each baseline solution, triple-frequency code observations of two epochs are involved. The MDBs of the GEO satellites vary less because their elevations, and hence their precisions, are nearly constant. For a given satellite and frequency, the correlations of the two w-test statistics of the two-epoch observations are computed with (27). The results are shown in Fig. 14 for the whole observation span. The means of the MDBs and of the correlation coefficients of the w-test statistics are shown in Fig. 15 as a function of satellite PRN. The MDBs of B3 are the smallest, followed by B1 and B2. This is because the B3 code is the most precise, followed by B1 and B2, as shown in subplots (f) and (h) of Fig. 1. Similar to the impact of cross-correlation, the MDBs of all observations are indeed reduced when the time correlation is taken into account, while the correlations of the w-test statistics between the two epochs strengthen significantly in magnitude, from –0.2 (\({\varvec{\varrho }}_{\text{t}} = 0\)) to as much as –0.9 (\({\varvec{\varrho }}_{\text{t}} \ne 0\)). As a result, it is more difficult to determine exactly in which observation the outlier has occurred if the observations are time correlated. Again, an advisable strategy is to exclude the observations of both epochs simultaneously to control the type III error if an outlier is detected at either epoch.

Fig. 13

MDBs of all triple-frequency code observations for all satellites without time correlation, \({\varvec{\varrho }}_{\text{t}} = 0\) (left), and with time correlation, \({\varvec{\varrho }}_{\text{t}} \ne 0\) (right) for ComNav short-baseline data. The lines in all subplots indicate the MDB results for individual satellites

Fig. 14

Correlations of two w-test statistics for all baseline solutions of two epochs without, \({\varvec{\varrho }}_{\text{t}} = 0\) (left), and with time correlation, \({\varvec{\varrho }}_{\text{t}} \ne 0\) (right) for ComNav short-baseline data. The lines in all subplots denote the different satellites

Fig. 15

Means of MDBs and correlation coefficients for triple-frequency code observations as function of satellite PRN without, \({\varvec{\varrho }}_{\text{t}} = 0\) (left), and with time correlation, \({\varvec{\varrho }}_{\text{t}} \ne 0\) (right), for ComNav short-baseline data

Concluding remarks

The importance of the stochastic model for achieving an optimal parameter estimator and a realistic covariance matrix of the estimator has been well documented by GNSS researchers in the past. That is, ambiguity resolution and positioning can be improved by refining the stochastic model. However, the importance of the stochastic model for the reliability of quality control, where the covariance matrix is involved in the statistical reliability tests, has rarely been studied. We have studied the influence of the stochastic model on the statistical tests with triple-frequency BeiDou as an example. The influences of estimated realistic stochastic models on the overall and w statistical tests as well as on the MDBs have been numerically investigated and compared with those of empirical stochastic models. Based on our studies, the conclusions are summarized as follows:

  • The GNSS observation precision is in general elevation-dependent, and cross-correlations and time correlations may exist. These stochastic characteristics differ with receiver type, observation type and frequency, which should be taken into account when establishing a realistic stochastic model.

  • Comparison of elevation-dependent and elevation-independent models in the overall test and w-test reveals that a realistic elevation-dependent model can reduce the probabilities of both false alarm and wrong detection. Without a proper elevation-dependent model, the probabilities of false alarm and wrong detection can be even worse than those of the elevation-independent model.

  • The cross-correlations and time correlations have very marginal effects on the baseline (positioning) solutions (Li 2016). However, they affect the covariance matrix of the baseline solutions and thus the reliability test statistics significantly. In other words, one may not expect improved baseline solutions from properly considering the physical correlations, but one can expect more realistic reliability results. That is, when the physical correlations are taken into account, the probabilities of both false alarm and wrong detection are reduced in statistical reliability tests, and the MDBs become smaller, although it becomes more difficult to discriminate the outlier location. Hence, when physical correlations exist among observations, an advisable strategy is to exclude these observations simultaneously if either observation is detected as an outlier, to control the type III error for reliable positioning.

  • Considering the complexity of the stochastic model and its dependence on the receiver type, antenna type and observation environment, in real applications one should estimate the realistic stochastic model from the data set itself to capture its real stochastic characteristics.

To the best of our knowledge, this is the first comprehensive study analyzing the influence of the stochastic model on statistical reliability tests. With a realistic stochastic model, one can obtain reasonable reliability test results, which help users make objective decisions in the quality control of real GNSS applications.