Keywords

1 Introduction

In real-time positioning and speed measurement using Global Navigation Satellite System (GNSS), Receiver Autonomous Integrity Monitoring (RAIM) is important to ensure reliable and accurate positioning which is a method for real-time monitoring of positioning error. When the true positioning error (PE) is greater than the alarm limit (AL), it is necessary to do fault detection and exclusion (FDE) to reduce the effect of the fault observation and improve the availability.

At present, the least square method is widely used for positioning but still needs to be improved. Because when there is a fault, the parameter estimation will be offset [1], resulting in some misleading information in residual [2]. The integrity monitoring and fault exclusion methods have evolved from the earliest single fault method to the current multiple fault method.

For multiple faults, FDE methods mainly include three types, multiple solution separation methods [3, 4], weight adjustment methods, and robust estimation methods [1, 6,7,8]. Most of them are based on all possible subsets of all observations which make the computation burden unaffordable. Some of the above methods need the prior assumption about the number of faults which is not infeasible. At present, the robust methods such as M [6] or MM [1] methods have good performance, but the calculation of the Protection Level (PL) [3] is lacked, which make it hard to decide the error range of the real-time positioning. Due to the above reason on computation burden, the research on more than four faults is lacked. But with the increasing number of availably satellites and other observations such as INS or movement sensors, more faults may happen which ask for the algorithm and experiment for more faults.

This paper proposes a multi-faults adaptive integrity monitoring method based on the observations of non-zero mean, thick tail, and non-Gaussian error distribution when the faults happened [8,9,10]. We use Gaussian Mixture distribution to model the error. This distribution can well describe the characteristics of errors such as thick tail, asymmetric, and non-Gaussian distribution [8,9,10] under different parameters. Real-time maximum likelihood parameter estimation based on real-time observation can achieve self-adaptation. At the same time, the method proposed in this paper iteratively calculates the posterior probability of each satellite's failure mode and the range of the failure using EM algorithm, so the calculation complexity is reduced and does not increase with the number of failures.

Finally, this article designed three failure modes (fixed, random, and slowly changed) in the experiment. When the total number of satellites is 20, the number of failures gradually increased from zero to ten. The results show that in most cases, our method achieves lower positioning error and fast computation. Finally, this article also gives the calculation and derivation of the protection level, and analysis the false alarm (FA) and missed detection (MD) [11,12,13,14] under different experimental conditions, which proves that the PL calculation can effectively reflect the range of the true positioning error.

2 Gaussian Mixture Model

In this section, we propose the algorithm for multi faults which includes the Gaussian mixture model for error distribution and EM algorithm for positioning and FDE, the formula of protection level is also derived to make up the whole solution.

2.1 Convert Weighted Least Squares to Ordinary Least Squares

In order to simplify the calculations, suppose the least squares positioning equation is \(\tilde{Y} = \tilde{H}\beta + \tilde{\varepsilon }, \, \tilde{\varepsilon } \sim N(0, \, \sigma^{{2}} \sum )\), \(\tilde{Y}\) is the pseudo-range observation, \(\tilde{H}\) is the observation geometric matrix and the intercept, \(\beta\) is the positioning parameters need to be solved, \(\sum\) is a positive definite diagonal matrix which represents the priori variance of the pseudo-range. We apply a linear transformation to the equation: \(\sum^{{^{{ - \, \frac{1}{2}}} }} \tilde{Y} = \sum^{{^{{ - \, \frac{1}{2}}} }} \tilde{H}\beta + \sum^{{^{{ - \, \frac{1}{2}}} }} \tilde{\varepsilon }, \, \sum { = }\sum^{\frac{1}{2}} (\sum^{{^{\frac{1}{2}} }} )^{T}\).

Obviously, \(\sum^{{^{{ - \, \frac{1}{2}}} }} \tilde{\varepsilon } \sim N(0, \, \sigma^{{2}} I)\), \(I\) is the unit diagonal matrix.

\(\sum^{{ - \, \frac{1}{2}}} { = (}\sum^{{ - \, \frac{1}{2}}} {)}^{T} { = }diag\{ \frac{1}{{\sqrt {\sigma_{1}^{2} } }},...,\frac{1}{{\sqrt {\sigma_{n}^{2} } }}{\text{\} }}\). Then we get \(Y = H\beta + \varepsilon , \, \varepsilon \sim N(0, \, \sigma^{{2}} I)\),

\(Y = \sum^{{^{{ - \, \frac{1}{2}}} }} \tilde{Y} = (y_{1} ,...,y_{n} )\), \(H = \sum^{{^{{ - \, \frac{1}{2}}} }} \tilde{H} = (h_{1} ,...,h_{n} )^{T}\).

2.2 The Gaussian Mixture Model

The Gaussian mixture density [15] is composed of the weighted sum of K normal distributions. The weight is the prior probability of these K components, while each normal distribution is a component. The form is as follows:

$$ f_{m} (y_{i} ) = \sum\limits_{k = 1}^{K} {\gamma_{k} f_{k} (y_{i} {; }h_{i}^{T} \beta + \mu_{k} ,\sigma_{k}^{2} )} , \, \sum\limits_{k = 1}^{K} {\gamma_{k} = 1} $$
(1)
$$ f_{k} (y_{i} ) = \frac{1}{{\sigma_{k} \sqrt {2\pi } }}e^{{ - \frac{{(y_{i} - h_{i}^{T} \beta - \mu_{k} )^{{2}} }}{{2\sigma_{k}^{2} }}}} $$

\(f_{m} (y_{i} )\) represents the density of the Gaussian mixture, \(f_{k} (y_{i} )\) represents the density of the k-th component. K is the number of components. We let \(K = 3\) and \(\mu_{{1}} = {0}\), assume that the first component of the Gaussian mixture is a Gaussian distribution with zero mean and an unknown error. A schematic diagram of the Gaussian mixture density function under different parameters can be seen in Fig. 1. The expectation and variance can be computed as follows:

$$ E(y_{i} ) = \int {y_{i} f_{m} (y_{i} )dy_{i} } = h_{i}^{T} \beta { + }\sum\limits_{k = 1}^{K} {\gamma_{k} \mu_{k} } $$
(2)
$$ Var(y_{i} ) = \sum\limits_{k = 1}^{K} {\gamma_{k} (\mu_{k}^{2} + \sigma_{k}^{2} )} - (\sum\limits_{k = 1}^{K} {\gamma_{k} \mu_{k} } )^{2} $$
(3)
Fig. 1.
figure 1

Diagram of Gaussian mixture density function under varying parameters

2.3 Positioning Using EM Algorithm

We use the EM algorithm [15]. Add a hidden variable \(z_{i}\) for each satellite which means the type of faults, for \(z_{i} \ne {1}\), the fault exist, and the prior probability is \(P(z_{i} = k) = \gamma_{k} ,k = 1,...,K;K = 3\). We can calculate the posterior probability of \(z_{i}\) based on the observation data \(y_{i}\). (see Sect. 2.4).

Due to the limited space, we only give the derived formula:

By adding hidden variables, the posterior probability of pseudo-range residuals can be simplified for calculating the log likelihood:

$$ f_{m} (y_{i} |z_{i} ,\theta ) = \prod\limits_{k = 1}^{K} {f_{k} (y_{i} {; }h_{i}^{T} \beta + \mu_{k} ,\sigma_{k}^{2} )^{{I_{k} (z_{i} )}} } $$
(4)
$$I_{k} (z) = \left\{ \begin{gathered} 1, \, if \, z = k \hfill \\ 0, \, else \hfill \\ \end{gathered} \right. , \theta = (\beta^{T} ,\mu^{T} ,(\sigma^{2} )^{T} ,\gamma^{T} )^{T}.$$

The log likelihood is

$$ l_{c} (\theta ) = log \, f_{m} (Y,Z|\theta ) = \sum\limits_{i = 1}^{n} {log \, f_{m} (y_{i} ,z_{i} |\theta )} $$
(5)

2.3.1 E Step: Compute the Expected Log Likelihood \(Q(\theta |\theta^{(t)} ,Y)\)

$$ \begin{gathered} Q(\theta |\theta^{(t)} ,Y) \triangleq E_{Z} [l_{c} (\theta )|\theta^{(t)} ,Y] \hfill \\ { = }\sum\limits_{i = 1}^{n} {\sum\limits_{k = 1}^{K} {\gamma_{ik} log[\gamma_{k} ]} } - \sum\limits_{i = 1}^{n} {\sum\limits_{k = 1}^{K} {\gamma_{ik} \frac{{(y_{i} - h_{i}^{T} \beta - \mu_{k} )^{2} }}{{2\sigma_{k}^{2} }}} } - \frac{1}{2}\sum\limits_{i = 1}^{n} {\sum\limits_{k = 1}^{K} {\gamma_{ik} log(\sigma_{k}^{2} )} } - nlog\sqrt {{2}\pi } \hfill \\ \end{gathered} $$
(6)
$$ \gamma_{ik}^{(t)} { = }E[I_{k} (z_{i} )]{ = }\frac{{\gamma_{k}^{(t)} f_{k} (y_{i} {|}\theta^{(t)} )}}{{\sum\limits_{k = 1}^{K} {\gamma_{k}^{(t)} f_{k} (y_{i} {|}\theta^{(t)} )} }} $$
(7)

\(t\) is the number of iterations. See Sect. 2.4 for the selection of the initial value \(\theta^{(0)}\).

\(\gamma_{ik}^{(t)} { = }P(z_{i} = k{\text{|y}}_{i} ,\theta^{(t)} )\) represents the posterior probability of the failure state.

2.3.2 M Step: Maximize \(Q(\theta |\theta^{(t)} ,Y)\) and Update \(\theta^{(t + 1)}\)

$$ \gamma_{k}^{(t + 1)} = \mathop {\arg \max }\limits_{\gamma } Q(\theta |\theta^{(t)} ,Y) = \frac{{\sum\limits_{i = 1}^{n} {\gamma_{ik}^{(t)} } }}{n} $$
(8)
$$ \left( {\begin{array}{*{20}l} \beta \hfill \\ {\mu _{2} } \hfill \\ {\mu _{3} } \hfill \\ \end{array} } \right)^{{(t + 1)}} = (H_{E} ^{T} W_{E} H_{E} )^{{ - 1}} H_{E} ^{T} W_{E} Y_{E} $$
(9)
$$ Y_{E} = \left( \begin{gathered} Y \hfill \\ Y \hfill \\ Y \hfill \\ \end{gathered} \right){\text{, }}H_{E} = \left( {\begin{array}{*{20}l} H \hfill & {\text{0}} \hfill & {\text{0}} \hfill \\ H \hfill & {{\text{e}}_{n} } \hfill & {\text{0}} \hfill \\ H \hfill & {\text{0}} \hfill & {{\text{e}}_{n} } \hfill \\ \end{array} } \right),W_{E} = \left( {\begin{array}{*{20}l} {W_{1} } \hfill & {\text{0}} \hfill & 0 \hfill \\ {\text{0}} \hfill & {W_{2} } \hfill & 0 \hfill \\ {\text{0}} \hfill & {\text{0}} \hfill & {W_{3} } \hfill \\ \end{array} } \right),{\text{e}}_{n} = (1,...,1)^{T} $$
(10)
$$ W_{k} = diag\{ \frac{{\gamma_{1k}^{(t)} }}{{(\sigma_{k}^{2} )^{(t)} }},...,\frac{{\gamma_{nk}^{(t)} }}{{(\sigma_{k}^{2} )^{(t)} }}\} ,k = 1,2,3 $$
(11)
$$ (\sigma_{k}^{2} )^{(t + 1)} {\text{ = argmax}}Q(\theta |\theta^{(t)} ,Y) = \frac{{\sum\limits_{i = 1}^{n} {\gamma_{ik}^{(t)} (y_{i} - h_{i}^{T} \beta^{(t + 1)} - \mu_{k}^{(t + 1)} )^{2} } }}{{\sum\limits_{i = 1}^{n} {\gamma_{ik}^{(t)} } }} $$
(12)

Complete an iteration \(\theta^{(t)} \to \theta^{(t + 1)}\) as described in Sects. 2.3.1 and 2.3.2, and repeat until \(\left\| {\theta^{(t + 1)} - \theta^{(t)} } \right\|\) is small to obtain the final estimation of all unknown parameters \(\hat{\theta } = \theta^{(t + 1)}\) and the posterior of the fault state \(\gamma_{ik}^{(t + 1)}\).

2.4 Initial Value Selection

The MM method is selected as the method for initial value for iteration, and some improvements have been made. The FAST-S [7] algorithm based on optimal subsampling is used instead of the LTS algorithm to makes the estimation efficiency and reduce the time consumption for initial value [7]. The other initial values of the parameters in the Gaussian mixture distribution can be set to make the distribution as more flat, thick-tailed and symmetrical as possible based on experience.

2.5 Calculation of the Protection Level

The covariance matrix of \(\hat{\beta }\) can be calculated using (9) as follows:

$$ \text{cov} \left( \begin{gathered} \beta \hfill \\ \mu _{2} \hfill \\ \mu _{3} \hfill \\ \end{gathered} \right)^{{(t + 1)}} {\text{ = }}A_{E} \text{cov} (Y_{E} )A_{E}^{T} {\text{, }}A_{E} = {\text{(}}H_{E} ^{T} W_{E} H_{E} )^{{ - 1}} H_{E} ^{T} W_{E} ,{\text{ }}\text{cov} (Y_{E} ) = \left( {\begin{array}{*{20}l} {\Sigma _{Y} } \hfill & {\Sigma _{Y} } \hfill & {\Sigma _{Y} } \hfill \\ {\Sigma _{Y} } \hfill & {\Sigma _{Y} } \hfill & {\Sigma _{Y} } \hfill \\ {\Sigma _{Y} } \hfill & {\Sigma _{Y} } \hfill & {\Sigma _{Y} } \hfill \\ \end{array} } \right) $$

According to (3) and (7), the posterior covariance matrix \(\Sigma_{Y}\) of \(Y\), is a diagonal matrix, which can be calculated according to formula (3), and replaces \(\gamma_{k}\) with the posterior probability \(\gamma_{ik}\) in (7).

We let \(C_{ii}^{{}}\) denote the \(i\) th diagonal element of \({\text{cov}} \left( \begin{gathered} \beta \hfill \\ \mu_{2} \hfill \\ \mu_{3} \hfill \\ \end{gathered} \right)^{(t + 1)}\), then

$$ VPL = \kappa_{\alpha } \sqrt {C_{33} } , \, HPL = \kappa_{\alpha } \sqrt {C_{11}^{{}} + C_{22}^{{}} } $$
(13)

\(\kappa_{\alpha }\) can be set according to the required false alarm rate.

3 Experiments

In this section, the experimental results of our algorithm and three other algorithms are compared. Our algorithm achieved a good performance on almost all the error modes and error ranges.

3.1 Experimental Design

We collected 24 h of GPS (L1) and Beidou (B1C) measured data for simulation in October 15, 2020. The fixed antenna is located on the top floor of the office building, both GPS (L1) or Beidou (B1C) have 9–12 visible stars, the total number of satellites is about 20–22, the elevation threshold is 5°, and the carrier-to-noise threshold is 25, using pseudo-range for positioning. The simulation environment is as follows: laptop Thinkpad-T450, operating system win10, Intel Core i5-5200U CPU@2.20 GHz 2.19 GHz, 12.0 GB RAM, and simulation software R-3.5.3 and gcc-4.9.3. We randomly selected 3000 samples from the 24-h observation data with a uniform distribution for the experiment. Then we randomly selected 20 satellites each epoch. We calculate the Root Mean Square Error (RMSE) of the three-dimensional ENU direction under each experimental condition. The long-term RTK measurements are used as the reference true position. Due to limited space, this article only uses the U direction as an example.

We chose four calculation methods for comparison:

Method 1 (denoted as LS): Least square estimation.

Method 2 (denoted as M): M estimation method (Huber loss [5]).

Method 3 (denoted as MM): MM estimation method [1].

Method 4 (denoted as GM3): The Gaussian mixture model with 3 components proposed in this paper.

The following strategies are adopted for the failure modes, and at each mode, the fault satellites are randomly selected:

Failure mode 1: random failure, with a deviation of 5–30 m in a uniform distribution added on the pseudorange measurement.

Failure mode 2: Fixed deviation of 20 m is added on the pseudorange.

Failure mode 3: Slop biased failure, adding deviation at the speed of 0.03 m/s (6 faults) and 0.02 m/s (8 faults) after 200 s on the pseudorange measurement.

3.2 Positioning Error Analysis

We use the failure mode 1 and 2 in Sect. 3.1 and randomly selected the number of faulty satellites from 0 to 10 respectively. In Fig. 2 and 3, it can be seen that the positioning error of the GM3 method is lower than the other three methods in most cases. The positioning error of failure mode 3 are shown in Fig. 4 and 5. It can be seen that with the delay of time, the increase of the deviation causes the positioning error to become larger. The increase speed of the positioning error obtained by the GM3 method is slower than the other three methods, and the average positioning error is the lowest. When there are 8 faulty satellites, the positioning error is reduced by at least 6 m compared with the other three methods.

Fig. 2.
figure 2

Positioning error on 4 methods at random failures and different numbers of faults

Fig. 3.
figure 3

Comparison of the positioning error (height) between 4 methods at 20 m bias

Fig. 4.
figure 4

Positioning error (height) between 4 methods at slope bias (0.03 m/s) and 6 faults

Fig. 5.
figure 5

Positioning error (height) between 4 methods at slope bias (0.02 m/s) and 8 faults

3.3 Protection Level Analysis

It can be seen from the histograms in Fig. 7 that in most cases, \(\frac{{\left| {VPE} \right|}}{VPL} < 1\) indicates that the VPL can effectively measure the range of the real time positioning error [3, 16,17,18]. From Fig. 6 we can see that a low missed detection and false alarm rate is achieved.

Fig. 6.
figure 6

The comparison between VPL and VPE at slope bias (0.01 m/s) and 6 faults

Fig. 7.
figure 7

The timing diagram of VPL and VPE, and the histogram of VPE/VPL at slope bias (0.01 m/s) and 6 faults

3.4 Time Complexity

It can be seen from Table 1 that the computational complexity of our method is basically independent of the number of failures after the subsampling improvement of the MM method and the EM algorithm described in Sects. 2.3 and 2.4.

Table 1. Running time under each umber of faults

4 Conclusions

This paper proposes a RAIM algorithm based on Gaussian mixture distribution, which can describe the characteristics of non-central, thick-tailed, non-Gaussian residuals when there are faults, and also has good adaptability to the residual distribution when there is no fault. The parameters of the distribution are determined based on maximum likelihood and real time observation, and can adapt to changes in errors distribution over time. Using EM algorithm and subsampling-based initial value determination can make the calculation complexity do not change with the increase of the number of failures and ensure the real-time performance of the algorithm. Experiments with multiple failure modes (random, fixed, and slowly varying) are performed. When the total number of satellites is 20 and the number of faulty satellites gradually increases from 0 to 10, the performance of four methods are compared. It is verified that the method in this paper can effectively reduce the positioning error when multiple faults occur, the PL calculation is effective and the computation is fast. This method can be widely used in multi scenarios such as city valley, cycle slip and so on, we will do more experiments on more scenarios and more complex error modes in the future.