Introduction

With the deployment of the Galileo and BeiDou GNSS constellations and the ongoing modernization of GPS and GLONASS, the number of satellites in view will increase to well over 40 most of the time at many places around the world. This significant increase in the number of satellites from multiple GNSS constellations will greatly improve navigation performance in terms of positioning accuracy, reliability, and availability. However, it may not always be possible for a standalone GNSS receiver to process all visible satellites in real time, especially in the case of low-cost receivers with limited tracking channels, insufficient bandwidth for augmentation message channels, or tight power budgets (Walter et al. 2016). Even if all visible satellites are used for positioning, the positioning accuracy may not necessarily improve, and real-time performance can deteriorate due to the high computational burden (Blanco-Delgado et al. 2017). The challenge, then, is how to select the best subset of visible satellites.

The most straightforward algorithm for satellite selection is a brute force approach that aims to minimize either the Geometric Dilution of Precision (GDOP) or the weighted GDOP (WGDOP) (Zhang and Zhang 2009). The optimal satellite subset is determined by computing GDOP/WGDOP values for all possible satellite subset combinations and selecting the one with the minimum value. The number of GDOP/WGDOP calculation cycles can reach into the billions for a single epoch, and each GDOP or WGDOP evaluation requires matrix multiplication and inversion operations. This makes the brute force approach difficult to implement in real-time applications. To reduce the number of GDOP/WGDOP calculation cycles, a number of sub-optimal satellite selection methods have been proposed. For example, a quasi-optimal subset of satellites is selected by recursively removing the satellite whose removal causes the smallest increase in GDOP (Liu et al. 2009), by sequentially adding the least redundant satellite with respect to previously selected satellites (Peng et al. 2014; Roongpiboonsopit and Karimi 2009), or simply by removing the satellites whose GDOP contribution values are smaller than a predefined threshold (Li et al. 2012). However, these algorithms are very likely to select a globally sub-optimal set of satellites. To track the optimal subset of satellites over time, a temporal algorithm has been proposed that evolves the best subset by swapping one or two satellites at each step (Swaszek et al. 2017). This algorithm shows great promise in finding the optimal subset, but it still suffers from the computational burden of hundreds, even thousands, of calculation cycles. To avoid GDOP/WGDOP calculations, alternative satellite selection measures have been proposed, such as maximizing the volume of the polytope formed by the satellites (Blanco-Delgado and Nunes 2010; Kong et al. 2014) or exploiting the satellites' comparability and distribution characteristics (Li et al. 2016; Wei et al. 2012). These methods cannot guarantee optimal satellite selection and need to take the geometric satellite distribution into account.

Furthermore, instead of directly computing GDOP using matrix multiplication and inversion operations, there are alternative methods. These include closed-form formulas (Doong 2009; Teng and Wang 2016) and machine learning (ML) based methods, such as the Genetic Algorithm (GA) (Mosavi 2011; Zhu 2018), Support Vector Machine (SVM) (Wu et al. 2011), and Neural Network (NN) approaches (Simon and El-Sherief 1995; Jwo and Lai 2007; Azami and Sanei 2014; Zarei 2014). All these ML-based methods treat satellite selection as a regression problem of GDOP calculation and focus on improving the performance of GDOP approximation and/or classification. For example, Wu et al. (2011) compared the performance of several ML methods with different types of training patterns that describe the input–output relationships for GDOP. They found that SVM gives the best performance, and that the choice of training pattern also affects the quality of the GDOP approximation. However, only around 65% of the approximated GDOP values differed from the true values by less than 0.5. Moreover, these methods suffer from application locality: they can only be used near the location where they were trained. The performance of some of these methods also depends on the training time and the size of the training data.
To reduce the training time of the back propagation (BP) NN, many methods have been proposed, using either different NNs, such as the probabilistic NN and the general regression NN (Jwo and Lai 2007), or improved BP algorithms, including resilient BP and conjugate gradient algorithms (Azami et al. 2013). Even with the GDOP approximation/classification methods, satellite selection still requires a brute force procedure to identify the subset with the smallest GDOP value among all possible subsets. Moreover, GDOP classification can only approximately assign the GDOP value of each satellite subset to a predefined range (Jwo and Lai 2007; Azami et al. 2013).

The authors propose an end-to-end deep learning network to select the optimal subset from the set of all visible satellites. Instead of treating satellite selection as a regression problem for GDOP calculation, it can be considered as a segmentation problem of a small point cloud (Grilli et al. 2017), partitioning all visible satellites into two classes indicating whether each satellite is selected or not. There is no need to design a specific training pattern to describe the input–output relationships for the GDOP, nor to implement a brute force selection procedure. To design such an end-to-end deep learning network for satellite selection with GDOP/WGDOP criteria, the interactions among all the input satellites need to be captured by the network. Since different satellite feeding orders do not change the satellite segmentation results, the network can be designed to be invariant to all permutations of the input, as inspired by PointNet, via a simple symmetric function, max pooling (Qi et al. 2016). Furthermore, max pooling enables the network to learn the global features of the satellite input. To output the per-satellite segmentation, the local features of each satellite also have to be learned. PointNet has been shown to be effective in per point segmentation by combining local point features and global input features, learned respectively from a multi-layer perceptron (MLP) on each point and from max pooling across points. Therefore, the PointNet module that combines local and global information is also adopted in the end-to-end deep learning network for satellite selection. To learn the features of the selected subset from all satellites in view, the input channels used for each satellite should characterize the features relevant to satellite segmentation with respect to the GDOP/WGDOP criteria. They can be represented by the receiver-to-satellite unit vector or by the elevation and azimuth angles. Each visible satellite is processed identically and independently at the beginning, using a few fully connected (FC) layers on each satellite to obtain the local satellite features. Then, new per-satellite features are obtained by concatenating the local satellite features and the global satellite input features from max pooling, and these are used as input for the subsequent segmentation layers. To learn more complex features for the selected satellites with GDOP/WGDOP criteria, the architecture of stacked voxel feature encoding (VFE) layers used in VoxelNet is also employed (Zhou and Tuzel 2017). However, to obtain better trained models for satellite segmentation, the stacked VFE architecture is modified with reduced output sizes to achieve compact internal satellite representations. One problem with this network is that the number of satellites in the predicted subset may not always equal the required number. Since the output scores of the network represent the probabilities of the inputs being the predicted labels, this problem can be solved by selecting the required number of satellites according to their output scores.

Following an introduction of the architectures of PointNet and VoxelNet, the end-to-end network for satellite segmentation is described in detail, including the training and test data generation, architecture design, and training details. Models for GDOP and WGDOP with different numbers of satellites are then tested with GNSS observations from 220 IGS stations. Finally, some concluding remarks are given.

PointNet and VoxelNet networks

PointNet is a unified architecture for tasks with irregular input format, such as point cloud-based 3D classification and segmentation (Qi et al. 2016). It directly takes point sets as input and outputs either class labels for the entire input or per point labels for each point of the input. To realize satellite selection, each satellite has to be identified as selected or not. Since the segmentation network of PointNet is capable of partitioning input point sets into multiple homogeneous segments, i.e., per point segment labels for each point of the input, this network is of more interest here. Each point in the point cloud can be represented by its \((x, y, z)\) coordinates as well as other feature channels. Assuming that there are \(n\) unordered input points \(\{x_1, x_2, \ldots, x_n\}\) with \(x_i \in \mathbb{R}^d\) and \(m\) sub-categories, the model trained from the segmentation network outputs \(n \times m\) scores, one for each point and sub-category. The basic idea of PointNet is to approximate a continuous function \(f:\mathcal{X} \to \mathbb{R}\) that maps the unordered input points to a vector by applying a symmetric function, max pooling, to transformed elements of the set:

$$f\left( \{ x_1, x_2, \ldots, x_n \} \right) \approx \gamma \left( \mathop{\text{MAX}}\limits_{i=1,2,\ldots,n} \{ p(x_i) \} \right)$$
(1)

where \(\gamma\) and \(p\) are usually multi-layer perceptron (MLP) networks. The output vector \(\mathbf{f}\) represents the global signature of the input set. Each input point is processed identically and independently at the initial stages using the MLP with different hidden layers to encode the statistical properties of the points. The MLP can also be seen as several FC layers operating on each point, i.e., \(1 \times 1\) convolution layers. To achieve per point segmentation, both local and global features of the input points are extracted to learn the interactions among the points. The local features of each point \(\mathbf{p}_i \in \mathbb{R}^{l}\) are obtained from the initial stages with the MLP, and the global features \(\mathbf{f} \in \mathbb{R}^{g}\) are obtained by max pooling across points. The new per point features \(\mathbf{p}_i^{\text{new}} \in \mathbb{R}^{l+g}\) are then formed by concatenating the local point features with the global features:

$$\mathbf{p}_{i}^{\text{new}}=\left[ \begin{array}{cc} \mathbf{p}_{i}^{\mathsf{T}} & \mathbf{f}^{\mathsf{T}} \end{array} \right]^{\mathsf{T}}$$
(2)

Finally, the combined new per point features are used for point label prediction with the MLP. Figure 1 illustrates the basic segmentation network architecture.
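
For illustration, the following is a minimal NumPy sketch (not the authors' released implementation) of this permutation-invariant segmentation scheme; the helper names and the use of plain ReLU MLPs for \(p\) and \(\gamma\) are assumptions:

```python
import numpy as np

def shared_mlp(x, layers):
    # Apply the same FC layers (1x1 convolutions) to every point
    # independently; ReLU activation after each layer.
    for W, b in layers:
        x = np.maximum(x @ W + b, 0.0)
    return x

def pointnet_segment(points, feat_layers, seg_layers):
    # points: (n, d) unordered set; permuting its rows permutes the
    # output rows identically, because max pooling is symmetric.
    local = shared_mlp(points, feat_layers)        # (n, l) local features p_i
    global_feat = local.max(axis=0)                # (l,) global signature, eq. (1)
    combined = np.hstack([local,                   # (n, l+g) per-point features, eq. (2)
                          np.tile(global_feat, (len(points), 1))])
    hidden = shared_mlp(combined, seg_layers[:-1])
    W, b = seg_layers[-1]
    return hidden @ W + b                          # (n, m) per-point class scores
```

Because the only interaction across points is the symmetric max pooling, reordering the input rows simply reorders the output rows, which is exactly the invariance required for unordered input.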

Fig. 1 PointNet segmentation network architecture

VoxelNet is an end-to-end network for point cloud-based 3D object detection (Zhou and Tuzel 2017). A voxel in VoxelNet refers to a 3D grid cell containing a number of points partitioned from the point cloud input. One key innovation of VoxelNet is its architecture of stacked VFE layers to learn voxel-wise features, as shown in Fig. 2. As with the feature aggregation in PointNet, each VFE layer obtains new point features by combining the point-wise features from one FC layer with the locally aggregated features from max pooling, i.e., the local point features and the global voxel features. The main difference between the VoxelNet feature learning architecture and PointNet is that several global and local feature concatenations are performed across the stacked VFE layers. To illustrate the detailed processing procedure for one voxel, all the feature sizes involved in the different layers are shown in Fig. 2. It is assumed that the network also takes \(n\) points with \(d\) input channels as input and outputs the voxel feature \(f_{\mathrm{v}} \in \mathbb{R}^{v}\). Since each VFE layer contains one FC layer and one max pooling, the output feature size of each VFE layer is twice the local feature size. The stacked VFE layers directly consume the points within a voxel as input and output new concatenated per point features. This encodes the point interactions within a voxel and enables learning complex voxel-wise features that characterize the 3D shape information of each voxel.

Fig. 2 Architecture of stacked VFE layers
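
A minimal NumPy sketch of one such VFE layer (an illustration, not the VoxelNet code; the function name is an assumption) might look as follows:

```python
import numpy as np

def vfe_layer(points, W, b):
    # points: (n, c_in) features of the points in one voxel.
    pointwise = np.maximum(points @ W + b, 0.0)   # (n, c_out) FC + ReLU
    aggregated = pointwise.max(axis=0)            # (c_out,) max pool over points
    return np.hstack([pointwise,                  # (n, 2*c_out): twice the local size
                      np.tile(aggregated, (len(points), 1))])
```

Stacking is then simply function composition, e.g., `vfe_layer(vfe_layer(points, W1, b1), W2, b2)`, with each stage doubling its local feature size through the concatenation.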

An end-to-end satellite segmentation network

In contrast to the GDOP approximation/classification methods mentioned in the introduction, which treat satellite selection as a regression problem and classify the GDOP value of the satellite input into a predefined range, the proposed end-to-end satellite segmentation network directly segments all input satellites into selected and not-selected classes, with no need for the brute force procedure required by those methods. One key requirement for satellite segmentation using GDOP/WGDOP criteria is that the segmentation results should be invariant to all permutations of the input satellites. Inspired by the PointNet segmentation network, which uses max pooling to deal with unordered input and predicts per point class labels by combining the global feature with local point features, this architecture can be used for the satellite segmentation network by treating the satellites in view at one instant as a point cloud input, with two class labels, one for selecting the satellite and one for not selecting it, as output. In addition, the architecture of stacked VFE layers from VoxelNet is employed to improve the performance of satellite segmentation using GDOP/WGDOP criteria.

Training and test data generation

Assume the selection of \(m\) satellites from \(n\) satellites in view from multiple constellations, where the inter-system time offsets have been accounted for. The criterion used for satellite selection is to minimize either the GDOP or the WGDOP value. These values are calculated as:

$$\text{GDOP}=\sqrt{\operatorname{tr}\left[ \left( \mathbf{H}^{\mathsf{T}}\mathbf{H} \right)^{-1} \right]} \qquad \text{WGDOP}=\sqrt{\operatorname{tr}\left[ \left( \mathbf{H}^{\mathsf{T}}\mathbf{W}\mathbf{H} \right)^{-1} \right]}$$
(3)

where \(\operatorname{tr}(\cdot)\) denotes the trace of a matrix, \(\mathbf{H}\) is the design matrix, and \(\mathbf{W}\) is the weight matrix. \(\mathbf{H}\) and \(\mathbf{W}\) can be represented by:

$$\mathbf{H}=\left[ \begin{array}{cccc} x_1 & y_1 & z_1 & 1 \\ x_2 & y_2 & z_2 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_n & y_n & z_n & 1 \end{array} \right] \qquad \mathbf{W}=\left[ \begin{array}{cccc} w_1 & & & \mathbf{0} \\ & w_2 & & \\ & & \ddots & \\ \mathbf{0} & & & w_n \end{array} \right]$$
(4)

where \(h_i=[\,x_i \;\; y_i \;\; z_i\,]\) is the unit vector from the receiver to the ith satellite, the column of ones in \(\mathbf{H}\) represents the parameter for the receiver clock bias in units of meters, and \(w_i\) is the weight of the ith satellite. Using the brute force approach, the subset with the minimum GDOP or WGDOP value is selected from the \(n!/[m!(n - m)!]\) possible combinations. There are therefore two classes in the output data: one for the \(m\) selected satellites and the other for the \(n - m\) not-selected satellites. To classify the \(n\) satellites in an end-to-end fashion, the input data of each satellite should characterize its local features. The receiver-to-satellite unit vector \([\,x_i \;\; y_i \;\; z_i\,]\) can be used as the input channel, since it contains direct information for optimal subset selection using the GDOP/WGDOP criteria. Since it can also be converted to elevation and azimuth \([\,\mathrm{el}_i \;\; \mathrm{az}_i\,]\), an input channel represented by \([\,\mathrm{el}_i \;\; \mathrm{az}_i\,]\) is another option. In the following experiment, the satellite segmentation performance using both forms of input channel is compared.
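
As a concrete illustration of how such training labels can be generated, the following is a minimal NumPy sketch of eq. (3)-(4) and the brute force search; the helper names are hypothetical, not taken from the released code:

```python
import numpy as np
from itertools import combinations

def gdop(unit_vectors, weights=None):
    # GDOP/WGDOP per eq. (3)-(4): unit_vectors is (n, 3), weights (n,) or None.
    H = np.hstack([unit_vectors, np.ones((len(unit_vectors), 1))])
    W = np.eye(len(H)) if weights is None else np.diag(weights)
    return np.sqrt(np.trace(np.linalg.inv(H.T @ W @ H)))

def brute_force_select(unit_vectors, m, weights=None):
    # Exhaustive search over all C(n, m) subsets; returns 0/1 labels
    # (1 = selected) for use as training targets.
    n = len(unit_vectors)
    def subset_dop(idx):
        w = None if weights is None else weights[list(idx)]
        return gdop(unit_vectors[list(idx)], w)
    best = min(combinations(range(n), m), key=subset_dop)
    labels = np.zeros(n, dtype=int)
    labels[list(best)] = 1
    return labels
```

With \(n = 20\) and \(m = 9\), `combinations` enumerates 167,960 subsets, which is tractable offline for label generation but, as noted in the introduction, too slow for real-time use.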

Satellite segmentation network architecture

To segment all the satellites in view, both local and global features of each satellite are needed. The local features of each satellite are obtained by applying input and feature transformations through a few FC layers operating on each satellite. Since the satellite order does not change the final segmentation result, max pooling is used in the satellite segmentation network; max pooling also yields the global features of the satellite input. New satellite-wise features are then extracted by combining the global features with the local satellite features. To learn multi-level features that characterize the pattern of the selected satellites, stacked feature encoding (FE) layers based on the architecture of stacked VFE layers are used. However, instead of concatenating local and global features after every FC layer, as in the VFE layers, an MLP with a few FC layers is used. As with the MLP in PointNet, the FC layers operate on each satellite. The global feature obtained from the last layer of the MLP is concatenated with the local feature from the same or a preceding layer, as shown in Fig. 3. All the feature sizes derived from the different layers are also illustrated in Fig. 3. This architecture reduces inference time by avoiding the doubling of feature size that concatenation after every FC layer causes in the stacked VFE layers.

Fig. 3 Satellite segmentation network architecture

Training details

Since the number of satellites in view changes with time, zero-padding is needed. The size of the zero-padding is determined by fixing the number of observed satellites to the maximum possible number of visible satellites, e.g., \(n\). Therefore, the size of each input training sample is \(n \times d\), where \(d\) is the dimension of the satellite input channel. All training data are randomized before processing. The network design trades off computational complexity against satellite segmentation performance. It is composed of seven stacked FE layers and two FC layers for the final segmentation. In each FE layer, there are three FC layers and a max pooling. The max pooling, performed across the local features learned from the last FC layer, is used to obtain the global feature. The global feature is then concatenated with the local feature obtained from the second FC layer, as shown in Fig. 3; a sketch of one such FE layer is given below. Overall, there are 23 layers in the proposed network architecture. The detailed layer output sizes are listed in Table 1. The number of layers and their sizes were determined empirically.
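
A minimal NumPy sketch of one FE layer as just described (three per-satellite FC layers, max pooling over the third layer's output, concatenation with the second layer's local feature); function and variable names are hypothetical:

```python
import numpy as np

def fe_layer(sats, fc_layers):
    # sats: (n, c_in) per-satellite features; fc_layers: three (W, b) pairs.
    h = sats
    outputs = []
    for W, b in fc_layers:                  # three per-satellite FC layers
        h = np.maximum(h @ W + b, 0.0)
        outputs.append(h)
    global_feat = outputs[-1].max(axis=0)   # max pool over the 3rd layer's output
    return np.hstack([outputs[-2],          # local feature from the 2nd FC layer
                      np.tile(global_feat, (len(sats), 1))])
```

The full network would stack seven such layers with the output sizes of Table 1 and append two FC layers for the final per-satellite scores.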

Table 1 Satellite segmentation architecture with stacked FE layers

The gradual increase in channel size makes it possible for the network to learn higher-level features. Due to the redundancy in the concatenated feature of each FE layer, the channel size of the FC layer in the next FE layer is reduced to keep the satellite features compact. All FC layers use rectified linear units (ReLU) as the activation function, with batch normalization. Since satellite segmentation is a satellite-wise classification, all of the input data, including both satellite inputs and zero-padding inputs, have to be segmented. Zero-padding inputs are assigned the same label as not-selected satellites. The loss function for the satellite segmentation network is based on the binary cross entropy between the target and predicted classes (De Boer et al. 2005):

$$L= - \alpha \frac{1}{N_1}\sum\limits_{i=1}^{N_1} \log \left( \hat{l}^i \right) - \beta \frac{1}{N_0}\sum\limits_{i=1}^{N_0} \log \left( 1 - \hat{l}^i \right) - \gamma \frac{1}{N_{\mathrm{e}}}\sum\limits_{i=1}^{N_{\mathrm{e}}} \log \left( 1 - \hat{l}^i \right)$$
(5)

where \(N_1\), \(N_0\), and \(N_{\mathrm{e}}\) denote the number of satellite-input training data with label 1, with label 0, and of zero-padding input data with label 0, respectively; \(\hat{l}^i\) denotes the predicted value of the softmax output; and \(\alpha\), \(\beta\), and \(\gamma\) are loss weight constants that balance the relative importance of the different input data. Minimizing this loss function also indirectly minimizes the difference in GDOP or WGDOP values between the predicted selected satellites and the targeted ones. The loss function is optimized with the Adam algorithm (Kingma and Ba 2014), a commonly used stochastic gradient descent method for deep learning networks. Adam uses first-order gradients to optimize stochastic objective functions with adaptive estimates of first and second moments; it has modest memory requirements and is well suited to neural networks with large training datasets and many parameters. All the computations in the proposed network can be implemented in parallel on a CPU/GPU. One problem when using this network is that the number of output predictions for selected satellites may not equal the predefined number. As the output scores of the network can be interpreted as probabilities of the inputs being the predicted labels, the number of selected satellites can be fixed by taking the satellites with the highest output scores for label 1.
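
For illustration, the following is a minimal NumPy sketch of eq. (5) and of the top-score rule for fixing the subset size; the names and the exact masking scheme are assumptions, not the released implementation:

```python
import numpy as np

def segmentation_loss(scores, labels, is_pad, alpha=1.0, beta=1.2, gamma=0.05):
    # Weighted binary cross entropy of eq. (5). `scores` holds the softmax
    # outputs for label 1; `is_pad` marks zero-padding rows. The default
    # weights match the experiment described below.
    sel = (labels == 1) & ~is_pad        # selected satellites, N1 terms
    not_sel = (labels == 0) & ~is_pad    # not-selected satellites, N0 terms
    return (-alpha * np.mean(np.log(scores[sel]))
            - beta * np.mean(np.log(1.0 - scores[not_sel]))
            - gamma * np.mean(np.log(1.0 - scores[is_pad])))

def select_fixed_number(scores, m):
    # Fix the subset size: indices of the m highest label-1 scores.
    return np.argsort(scores)[::-1][:m]
```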

Experiment

Based on the brute force approach, training data with the GDOP and WGDOP criteria were generated using one day of GNSS observations (at 1-min intervals) from 200 IGS stations, resulting in around 288,000 training samples in total. Both GPS and GLONASS observations were used, and it was assumed that the inter-frequency biases of the GLONASS satellites had been accounted for. The elevation cutoff angle for each receiver was set to 15°. Test data were generated in a similar way for another 20 IGS stations. The maximum number of satellites in view was 20, i.e., \(n=20\). Selected satellites were labeled as class 1 and not-selected satellites as class 0. Due to the slightly unbalanced training data, different loss weight values \(\alpha =1\), \(\beta =1.2\), and \(\gamma =0.05\) were assigned. The Adam optimizer was used with an initial learning rate of 0.1, momentum of 0.9, and a batch size of 128. The decay rate for batch normalization was 0.7, starting from 0.5 and gradually increasing to 0.99, and the learning rate was reduced at the same rate as the batch normalization decay. The models converged in around 50 min using TensorFlow (Abadi et al. 2016), an open-source machine learning library, running on a GTX1080TI GPU. The code and trained models used in the following experiment can be found at https://github.com/PanUnsw/satellite_selection.git.

Satellite segmentation with different input channels and architectures

To compare the segmentation performance with input channels represented by \(c_i=[\,x_i \;\; y_i \;\; z_i\,]\) and \(c_i=[\,\mathrm{el}_i \;\; \mathrm{az}_i\,]\), one example of selecting 9 satellites based on the GDOP criterion is shown in Fig. 4, and Table 2 compares the training and testing accuracies. The accuracy represents the percentage of satellite inputs segmented correctly, i.e., for which the predicted subset is identical to the optimal subset. Although \([\,x_i \;\; y_i \;\; z_i\,]\) and \([\,\mathrm{el}_i \;\; \mathrm{az}_i\,]\) are mathematically equivalent, the input channel with \([\,x_i \;\; y_i \;\; z_i\,]\) shows slower training convergence and worse converged training and testing accuracies than that based on \([\,\mathrm{el}_i \;\; \mathrm{az}_i\,]\). This can be intuitively explained by the saddle points encountered when optimizing the loss function, which can considerably slow down training (Dauphin et al. 2014); escaping them may take longer for the \([\,x_i \;\; y_i \;\; z_i\,]\) input channel than for \([\,\mathrm{el}_i \;\; \mathrm{az}_i\,]\) at the beginning of training. The input channels represented by \([\,\mathrm{el}_i \;\; \mathrm{az}_i\,]\) were used in the following experiments.
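
The conversion between the two input channel forms is straightforward; a small sketch, assuming the unit vector is expressed in the local east-north-up frame (the frame is not stated here, so this is an assumption):

```python
import numpy as np

def unit_vector_to_el_az(e, n, u):
    # Elevation and azimuth (radians) from a receiver-to-satellite
    # unit vector with east, north, and up components e, n, u.
    el = np.arcsin(u)        # angle above the local horizon
    az = np.arctan2(e, n)    # clockwise from north
    return el, az
```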

Fig. 4 Performance comparison with different input channels

Table 2 Accuracy comparison with different input channels

To assess the benefit of the stacked FE layers, another model, with an architecture without stacked FE layers, was also trained with elevation and azimuth input channels for selecting nine satellites based on the GDOP criterion. This architecture, consisting of six FC layers, one max pooling layer and one segmentation layer, was likewise designed by trading off computational complexity against satellite segmentation performance. The output sizes of these layers are shown in Table 3. Local features obtained from the third FC layer were concatenated with the global features from the max pooling layer. The number of layers and their sizes for the architecture without stacked FE layers are also empirical values (Table 3).

Table 3 Satellite segmentation architecture without stacked FE layers

The initial parameters for the architecture without stacked FE layers were the same as those with stacked FE layers, except that the decay rates for batch normalization and the learning rate were set to 0.3 to reduce the tendency to overfit. Figure 5 and Table 4 compare the two trained models, model 1 with and model 2 without the stacked FE architecture. Both the training and test accuracies of model 2 are worse than those of model 1. Therefore, the architecture with stacked FE layers is preferable for satellite segmentation.

Fig. 5 Performance comparison with different architectures

Table 4 Accuracy comparison with different architectures

Satellite segmentation with GDOP

Models for selecting 9 and 12 satellites from the same set of observed satellites using the GDOP criterion were trained and evaluated. Using the same network architecture with stacked FE layers, separate models were trained for selecting 9 and for selecting 12 satellites. Table 5 compares the segmentation performance. The testing accuracy is slightly worse than the training accuracy, and the model for selecting 12 satellites performs better than that for 9 satellites.

Table 5 Satellite segmentation performance using the GDOP criterion

To validate the trained models, Table 6 lists the test results. Compared with the means of the optimal GDOP values, the means of the test GDOP values increased by only 0.08 and 0.04% when selecting 9 or 12 satellites, respectively. For more than 80% of the epochs the optimal subset of the satellite input was correctly predicted, while for around 20% one satellite in the optimal subset was missed. The missed satellite was usually replaced by one producing a slightly larger GDOP value than the minimum.

Table 6 Test evaluation performance with GDOP-trained model

One example of a predicted subset with one wrongly selected satellite is shown in Fig. 6. The detailed inputs and output labels for the targets and predictions are listed in Table 7. The satellite PRN G9 in the optimal subset was predicted with label 0 and was replaced by satellite PRN R18, increasing the GDOP value from the optimal 1.617 to 1.619.

Fig. 6 One example of a wrongly predicted subset with GDOP-trained model. Asterisk, diamond and circle markers identify the satellites optimally selected, not selected and wrongly selected, respectively

Table 7 One example of satellite segmentation with GDOP-trained model

Figure 7 plots the percentage increase in GDOP between the predicted and the optimal subsets. The GDOP increase is limited to 8% and 3% when selecting 9 or 12 satellites, respectively, as shown in the top and bottom panels.

Fig. 7 Percentage of GDOP increase

The histogram of GDOP value differences between the test and the optimal GDOP values is plotted in Fig. 8. For about 99 and 100% of the time, the GDOP difference is smaller than 0.03 when selecting 9 or 12 satellites, respectively.

Fig. 8 GDOP value difference comparison

Satellite segmentation with WGDOP

In this section, satellite segmentation results are presented for two models trained using the WGDOP criterion. The weight for each satellite is calculated from its elevation angle as \(w = \sin^2(\mathrm{el})\). With the same network architecture as above, the best performance results of the two trained models, for selecting 9 and 12 satellites, are listed in Table 8. As with the models trained using the GDOP criterion, the model for selecting 12 satellites performs better than that for selecting 9. However, both models trained with WGDOP have lower accuracies than those trained with GDOP, which can be attributed to the more complex features introduced by the different target satellites.
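
For completeness, a short sketch of this weighting, reusing the hypothetical `brute_force_select` helper from the earlier sketch to generate WGDOP training labels:

```python
import numpy as np

def elevation_weights(el_rad):
    # Elevation-dependent weights w = sin^2(el) for the WGDOP criterion.
    return np.sin(el_rad) ** 2

# e.g., labels = brute_force_select(unit_vectors, m=9,
#                                   weights=elevation_weights(elevations))
```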

Table 8 Satellite segmentation performance with the WGDOP criterion

Even with the larger gap between the testing and training accuracies, around 95% of the time the predicted subsets either match the optimal ones or differ by just one satellite, as listed in Table 9. The means of the test WGDOP values increased by only 0.31 and 0.15% when selecting 9 or 12 satellites, respectively. As in the GDOP case, a subset with one wrongly selected satellite usually has a slightly larger WGDOP value than the optimal subset. One typical example of a wrongly predicted subset is shown in Fig. 9. Table 10 lists the elevation and azimuth inputs and the output labels for the targets and predictions. The difference in WGDOP value between the optimal and predicted subsets is approximately 0.012, with the WGDOP increasing from 3.418 to 3.430.

Table 9 Test evaluation performance with WGDOP-trained model

Fig. 9 One example of a wrongly predicted subset with WGDOP-trained model. Asterisk, diamond and circle markers identify the satellites optimally selected, not selected and wrongly selected, respectively

Table 10 One example of satellite segmentation with WGDOP-trained model

The percentage of WGDOP increase between the predicted and the optimal subsets is mostly limited to 8 and 4% when selecting 9 or 12 satellites, respectively, as shown in the top and bottom panels of Fig. 10.

Fig. 10 Percentage of WGDOP increase

Figure 11 shows the histogram of WGDOP value differences between the test and the optimal WGDOP values. More than 99% of the differences in WGDOP value are less than 0.2, for both cases of selecting 9 or 12 satellites. This is adequate for most applications.

Fig. 11 WGDOP value comparison

Table 11 summarizes the average computation times, under both the GDOP and WGDOP criteria, of the brute force approach for selecting one optimal subset and of the proposed satellite segmentation network for predicting one selected subset with the trained models. The satellite segmentation network is about 90 times faster than the brute force approach.

Table 11 Performance comparison of computation time

Concluding remarks

The authors presented an end-to-end deep learning network for satellite selection, invariant to satellite input permutations, based on the PointNet and VoxelNet networks. The satellite selection procedure was converted into a satellite segmentation procedure, with a specified input channel for each satellite and two class labels representing selected and not-selected satellites. The proposed satellite segmentation network was composed of several simple stacked FE layers and one segmentation layer. An experiment was conducted to evaluate the proposed approach with training and test data from 220 IGS stations. The satellite segmentation performance was compared with respect to different input channels, namely the receiver-to-satellite unit vector and the elevation and azimuth angles, as well as different architectures, i.e., with and without stacked FE layers. The experiment showed that the architecture with stacked FE layers and the input channel represented by elevation and azimuth are preferable, due to their faster training convergence and better converged accuracy. Cases of selecting 9 or 12 satellites, with the GDOP and WGDOP criteria, were investigated. The satellite segmentation network approach was around 90 times faster than the brute force selection approach. From the test results, it can be concluded that the trained models were capable of selecting the satellites contributing most to the GDOP or WGDOP value and, unlike the GDOP approximation/classification methods, had no application locality limitation. In addition, the models trained with GDOP performed better than those trained with WGDOP, and the models for selecting 12 satellites were more accurate than those for nine satellites. The proposed satellite segmentation model selects only a single fixed number of satellites, and the training and test data are all from static receivers. In the future, a model capable of selecting a variable number of satellites will be trained, with the required number of satellites added to the input channel. Kinematic data will be analyzed, and the robustness of the trained models will be investigated.