Introduction

The Global Navigation Satellite System (GNSS) is widely used in cars, ships, and aircraft because it can provide a position solution regardless of time and location. In urban canyons, however, buildings and other structures often block the line of sight (LOS) to visible satellites, which makes it difficult to obtain a navigation solution. GNSS availability simulations (Yoo et al. 2009; Lee et al. 2008) indicate that fewer than four satellites may sometimes be visible in urban canyons. Furthermore, applying GNSS fault detection and exclusion may reduce the number of usable satellites even further, because erroneous satellites are excluded (Lee et al. 2011).

Research into alternative navigation systems has been conducted because of these constraints on GNSS availability in urban canyons. Zhao (1997) and Fu et al. (2007) estimated position and heading from the number of revolutions of each wheel using an odometer. In odometer applications, navigation error arises from wheel slip and inaccuracies in the odometer scale factor, which results in a drift error because no external information is available for error correction. An inertial navigation system (INS) can also be used as an alternative navigation system that is independent of GNSS (Titterton and Weston 1997). An INS uses dead reckoning to calculate the vehicle position, velocity, and attitude; as the operational time passes, however, the navigation solution becomes increasingly inaccurate owing to cumulative error. Numerous studies have integrated INS with vision sensors, such as cameras, lidar, and laser range finders, to mitigate this drift error. Durrant-Whyte and Bailey (2006a, b) described simultaneous localization and mapping (SLAM) with visual measurements, a technique that estimates positions in an unknown environment while building a map. Such a Vision/INS integrated navigation system (i.e., SLAM) operates without the aid of absolute information (premeasured feature points or exact error correction). Vision/INS integrated navigation has been applied actively to mobile robots and unmanned aerial vehicles (UAVs) using nonlinear estimators including the extended Kalman filter (EKF), the unscented Kalman filter (UKF), and the particle filter (PF). However, these methods still experience continuous drift errors. According to Bryson and Sukkarieh (2008) and Bryson et al. (2005), Vision/INS integrated navigation always lacks observability of three states in the Kalman filter if additional absolute position information is not available. Therefore, conventional approaches inevitably incur drift errors.

Previous studies were conducted under the assumption that absolute information is not available. In urban canyons, however, signals from some visible satellites can still be received; although these are insufficient to derive a position solution, they can be used as absolute information to correct navigation errors (Soloviev 2008). Soloviev and Venable (2010) improved the performance of Vision/INS integrated navigation in urban canyons using limited GPS measurements. Their method utilized the GPS carrier phase and improved the accuracy of the change in position (delta position). However, the error in the delta position generated a drift error in the position, which could not be removed.

We propose a method that improves navigation performance by combining measurements from fewer than four visible GNSS satellites with Vision/INS integrated navigation. An integrated model is proposed to utilize GNSS range and range rate measurements. The feasibility of these measurements was assessed in an observability analysis of the system. The decrease in the position drift error when using GNSS measurements was verified based on the position estimated in a computer simulation as a function of the number of available satellites; an experimental test was also conducted. The results show that limited GNSS measurements can improve positioning performance in urban canyons.

Section "GNSS" describes the GNSS observation equations. Section "INS/Vision integrated navigation system" introduces the process model and the observation model for Vision/INS integrated navigation. Section "GNSS integration with Vision/INS integrated navigation" explains the GNSS and vision integrated model. Section "Observability analysis" analyzes the observability of the proposed system, and section "Test and results" verifies the improvement in the observability via a simulation. Finally, the conclusions are presented in sect. "Conclusion".

GNSS

The number of available signals has increased as various GNSSs, such as GPS, GLONASS, Galileo, and BeiDou, have become operational. As explained above, however, it is difficult to receive signals from four or more satellites in GNSS-blocked or GNSS-challenged environments. In these situations, a three-dimensional position solution from the navigation filter is unavailable, whereas range and range rate measurements from the visible satellites remain available. This section describes a linearized observation model for pseudorange and range rate measurements that can be used with Vision/INS integrated navigation.

Pseudorange observation model

The pseudorange is the distance between a satellite and a receiver, measured from the signal propagation time. The observation model comprises the distance between the satellite position and the user position plus the receiver clock error. The satellite clock error can be estimated from the broadcast navigation messages. The pseudorange model is as follows:

$$ \rho = r + cb_{\rm u} $$
(1)
$$ \rho - \bar{\rho } = \left[ {\begin{array}{*{20}c} {h_{\text{GNSS}} } & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\delta {\mathbf{x}}_{\rm u} } \\ {c\delta b_{\rm u} } \\ \end{array} } \right] $$
(2)

where ρ is the pseudorange, r is the distance between the satellite and the user, c is the speed of light, \( b_{\rm u} \) is the receiver clock bias, \( \bar{\rho } \) is the pseudorange at the nominal user position (\( {\mathbf{x}}_{\rm u} \)), and \( h_{\text{GNSS}} \) is the LOS unit vector between the satellite and the user. The subscript u denotes the user. Equation (1) assumes that only a receiver clock error (\( cb_{\rm u} \)) remains after the satellite clock correction. This equation is linearized around a nominal point as (2).
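
To make the linearization concrete, the following minimal numpy sketch (illustrative only, not from the original study) evaluates (1) and (2). It assumes ECEF coordinates in meters, a clock state expressed in meters (\( cb_{\rm u} \)), and a user-minus-satellite sign convention for the LOS vector:

```python
import numpy as np

C_LIGHT = 299_792_458.0  # speed of light (m/s)

def pseudorange_row(x_sat, x_user_nom):
    """Nominal pseudorange and the linearized observation row [h_GNSS, 1] of (2).

    x_sat, x_user_nom: ECEF positions (m). The clock state is taken as c*b_u,
    so the row entry for the clock is simply 1 (an assumption consistent with
    the units in (1)-(2))."""
    diff = x_user_nom - x_sat
    rho_bar = np.linalg.norm(diff)    # range at the nominal point
    h_gnss = diff / rho_bar           # LOS unit vector (sign convention assumed)
    return rho_bar, np.append(h_gnss, 1.0)

# filter residual: delta_z = rho_measured - rho_bar - c*b_u_nominal
```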

Range rate observation model

The range rate can be derived from the Doppler shift measured by the GNSS receiver or from the time-differenced carrier phase (TDCP). The two values are obtained by the receiver through different processes, but they have an identical physical meaning (Ding and Wang 2011). The expressions are as follows:

$$ \lambda \Upphi \left( t \right) = \rho \left( t \right) + \lambda N + cb_{\rm u} \left( t \right) $$
(3)
$$ \lambda \dot{\Upphi }\left( t \right) \approx \frac{{\lambda \left[ {\Upphi \left( t \right) - \Upphi \left( {t - 1} \right)} \right]}}{\Updelta t} = \dot{\rho }\left( t \right) + c\dot{b}_{\rm u} \left( t \right) $$
(4)
$$ \lambda \dot{\Upphi }\left( t \right) = h_{\text{GNSS}} \left( {{\dot{\mathbf{x}}}_{\rm s} \left( t \right) - {\dot{\mathbf{x}}}_{\rm u} \left( t \right)} \right) + c\dot{b}_{\rm u} \left( t \right) $$
(5)
$$ \lambda \dot{\Upphi } - h_{\text{GNSS}} {\dot{\mathbf{x}}}_{\rm s} = \left[ {\begin{array}{*{20}c} { - h_{\text{GNSS}} } & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {{\dot{\mathbf{x}}}_{\rm u} } \\ {c\dot{b}_{\rm u} } \\ \end{array} } \right] $$
(6)

where Φ is the carrier phase, λ is the wavelength, N is the integer ambiguity, and Δt is the time interval. Equation (3) is the observation equation for the carrier phase (Hofmann-Wellenhof and Wasle 2007). Unlike the code-based pseudorange, the carrier phase includes an integer ambiguity, which is cancelled by taking the time difference in (4), provided no cycle slip occurs between epochs. The change in the carrier phase in (4) depends on the velocity of the satellite (\( {\dot{\mathbf{x}}}_{\rm s} \)) and the velocity of the user (\( {\dot{\mathbf{x}}}_{\rm u} \)), as shown in (5). Equation (5) is expressed in matrix form in (6) (Ding and Wang 2011).
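
As an illustrative sketch of (4) and (6) (assumed numpy conventions, not the authors' code), the known satellite velocity term is moved to the measurement side and the row maps the user velocity and clock drift; the signs follow (6) as printed:

```python
import numpy as np

def tdcp_range_rate(phi_t, phi_prev, wavelength, dt):
    """Range rate from the time-differenced carrier phase, following (4).

    phi_t, phi_prev: accumulated carrier phase (cycles) at t and t-1.
    The integer ambiguity N cancels in the difference as long as no cycle
    slip occurs; the result still contains the clock-drift term c*b_u_dot."""
    return wavelength * (phi_t - phi_prev) / dt   # m/s

def range_rate_row(h_gnss, v_sat, lam_phi_dot):
    """Residual and observation row of (6): the known term h_GNSS * x_s_dot
    is moved to the measurement side; the row maps [v_u, c*b_u_dot]."""
    z = lam_phi_dot - h_gnss @ v_sat
    return z, np.append(-h_gnss, 1.0)
```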

The range and range rate equations were derived to utilize the available satellites. As previously mentioned, a three-dimensional solution cannot be calculated with fewer than four available satellites, although the measurements from each satellite can aid other navigation systems. From the GNSS perspective, the problem of insufficient measurements can be solved by combining GNSS with Vision/INS integrated navigation.

INS/Vision integrated navigation system

Vision/INS integrated navigation calculates navigation solutions (such as the position, velocity, and attitude) by integrating INS and feature points from a vision sensor. Feature points are fixed points that can be tracked using methods such as Kanade–Lucas–Tomasi (KLT), the Scale-Invariant Feature Transform (SIFT), and Speeded-Up Robust Features (SURF) (Tomasi and Kanade 1991; Lowe 1999; Bay et al. 2006); a sketch of KLT-based tracking is given below. The image coordinates of the feature points are determined by the position and attitude of the sensor installed in the vehicle. The navigation solution can be calculated using an appropriate process model and an observation model that describes this relationship.
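
One common open-source route to KLT tracking is OpenCV's pyramidal Lucas–Kanade implementation; the calls below are real OpenCV functions, but the parameter values are assumptions for illustration, not the configuration used in this study:

```python
import cv2

def track_features(prev_gray, next_gray, prev_pts):
    """Track feature points between consecutive grayscale frames with the
    pyramidal KLT tracker and keep only successfully tracked points."""
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, prev_pts, None)
    ok = status.ravel() == 1
    return prev_pts[ok], next_pts[ok]

# initial corners (illustrative parameters):
# prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=7,
#                                    qualityLevel=0.01, minDistance=10)
```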

Process model

Vision/INS integrated navigation requires state vectors for the vehicle (position, velocity, and attitude) and for the positions of the feature points. The state vector is expressed with respect to the local North–East–Down frame (navigation frame, n). A linearized INS model is used as the vehicle's process model. The inertial sensor is assumed to be of low grade, so the rotation of the earth is neglected. The system is formulated as follows:

$$ {\mathbf{x}}_{\text{vehicle}}^{n} = \left[ {\begin{array}{*{20}c} {p^{n} } & {v^{n} } & {\Uppsi^{n} } \\ \end{array} } \right]^{T} $$
(7)
$$ \left[ {\begin{array}{*{20}l} {p^{n} \left( {t + 1} \right)} \\ {v^{n} \left( {t + 1} \right)} \\ {\Uppsi^{n} \left( {t + 1} \right)} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {p^{n} \left( t \right) + v^{n} \left( t \right)\Updelta t} \\ {v^{n} \left( t \right) + \left[ {C_{\rm b}^{n} \left( t \right)f^{\rm b} \left( t \right) + g^{n} } \right]\Updelta t} \\ {\Uppsi^{n} \left( t \right) + E_{\rm b}^{n} \left( t \right)\omega^{\rm b} \left( t \right)\Updelta t} \\ \end{array} } \right] + w_{\text{vehicle}} \left( t \right) $$
(8)

where \( p^{n} \) is the position, \( v^{n} \) is the velocity, \( \Uppsi^{n} \) is the attitude, \( C_{\rm b}^{n} \) is the direction cosine matrix, \( E_{\rm b}^{n} \) is the rotation rate transformation matrix from the body to the navigation frame, \( f^{\rm b} \) is the specific force in the body frame, \( g^{n} \) is the gravity vector, \( \omega^{\rm b} \) is the angular velocity in the body frame, and \( w_{\text{vehicle}} \) is the process noise (Kim and Sukkarieh 2007; Won et al. 2010; Chun et al. 2012; Bryson and Sukkarieh 2008). Equation (8) represents the process model for the vehicle using INS. The process model of Vision/INS integrated navigation is as follows:

$$ {\mathbf{x}}_{\text{feature}}^{n} = \left[ {\begin{array}{*{20}c} {m_{1}^{n} } & {m_{2}^{n} } & \cdots & {m_{j}^{n} } \\ \end{array} } \right]^{T} $$
(9)
$$ {\mathbf{x}}_{\text{feature}}^{n} \left( {t + 1} \right) = {\mathbf{x}}_{\text{feature}}^{n} \left( t \right) + w_{\text{feature}} \left( t \right) $$
(10)
$$ \begin{gathered} \delta {\mathbf{x}}_{{{\text{Vision}}/{\text{INS}}}}^{n} \left( {t + 1} \right) = F_{{{\text{Vision}}/{\text{INS}}}} \left( t \right)\delta {\mathbf{x}}_{{{\text{Vision}}/{\text{INS}}}}^{n} \left( t \right) + w_{{{\text{Vision}}/{\text{INS}}}} \left( t \right) \hfill \\ \delta {\mathbf{x}}_{{{\text{Vision}}/{\text{INS}}}}^{n} = \left[ {\begin{array}{*{20}c} {\delta {\mathbf{x}}_{\text{vehicle}}^{n} } & {\delta {\mathbf{x}}_{\text{feature}}^{n} } \\ \end{array} } \right]^{T} \hfill \\ w_{{{\text{Vision}}/{\text{INS}}}} = \left[ {\begin{array}{*{20}c} {w_{\text{vehicle}} } & {w_{\text{feature}} } \\ \end{array} } \right]^{T} \hfill \\ F_{{{\text{Vision}}/{\text{INS}}}} = \left[ {\begin{array}{*{20}c} {{\mathbf{I}}_{3 \times 3} } & {\Updelta t{\mathbf{I}}_{3 \times 3} } & {0_{3 \times 3} } & {0_{3 \times 3j} } \\ {0_{3 \times 3} } & {{\mathbf{I}}_{3 \times 3} } & {\Updelta t\left[ {f^{n} \times } \right]} & {0_{3 \times 3j} } \\ {0_{3 \times 3} } & {0_{3 \times 3} } & {{\mathbf{I}}_{3 \times 3} } & {0_{3 \times 3j} } \\ {0_{3j \times 3} } & {0_{3j \times 3} } & {0_{3j \times 3} } & {{\mathbf{I}}_{3j \times 3j} } \\ \end{array} } \right] \hfill \\ \end{gathered} $$
(11)

where \( \left[ {f^{n} \times } \right] \) is the skew-symmetric matrix of \( f^{n} \) and \( {\mathbf{I}} \) is the identity matrix. The subscript \( a \times b \) denotes the number of matrix rows (a) and columns (b), and j is the total number of feature points. Each feature point (\( m^{n} \)) is a fixed point and is modeled as (10). A perturbation is applied to (8) and (10), and the total process model is expressed as (11); a numerical sketch of this propagation is given below.
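
For illustration, the following numpy sketch (an assumption-laden reading of (8) and (11), not the authors' implementation) propagates the vehicle states one step and builds the error-state transition matrix; \( C_{\rm b}^{n} \) and \( E_{\rm b}^{n} \) are assumed to be recomputed from the attitude at each step:

```python
import numpy as np

def skew(x):
    """Skew-symmetric matrix [x ×], so that skew(a) @ b equals the cross product a × b."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

def ins_step(p, v, psi, f_b, w_b, C_bn, E_bn, g_n, dt):
    """One discrete propagation of the vehicle states, following (8).

    C_bn: body-to-navigation DCM; E_bn: body-rate to Euler-rate matrix;
    f_b: specific force (m/s^2); w_b: angular rate (rad/s); NED frame."""
    p_next = p + v * dt
    v_next = v + (C_bn @ f_b + g_n) * dt
    psi_next = psi + (E_bn @ w_b) * dt
    return p_next, v_next, psi_next

def F_vision_ins(f_n, dt, j):
    """Error-state transition matrix of (11): 9 vehicle states plus
    3j static feature-point states (identity block)."""
    F = np.eye(9 + 3 * j)
    F[0:3, 3:6] = dt * np.eye(3)    # position error <- velocity error
    F[3:6, 6:9] = dt * skew(f_n)    # velocity error <- attitude error
    return F
```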

Vision observation model

Several coordinate systems are required to derive the observation model: the navigation frame, the body frame, and the sensor frame. To simplify the model, the body frame and the sensor frame are not distinguished, under the assumption that the sensor is rigidly mounted on the vehicle.

Figure 1 represents the relationship between the feature point (m) and the position of the sensor (p). This relationship can be expressed as:

$$ p^{n} = m^{n} - C_{c}^{n} m^{c} $$
(12)

where \( C_{c}^{n} \) is the transformation matrix from the sensor frame to the navigation frame (Kim and Sukkarieh 2007; Chun et al. 2012).

Fig. 1 Configuration of the coordinates used for Vision/INS integrated navigation, where m is the feature point position, p is the sensor (camera) position, and O is the origin of the frame, whereas the superscripts c and n represent the sensor frame and the navigation frame, respectively

The feature point is measured in pixel units on the image plane. Figure 2 shows the feature point position in terms of the relationship between the sensor frame and the image plane (i). The coordinates in the image plane (\( m^{i} \)) are conventionally measured as two-dimensional coordinates (\( [ m_{x}^{i} ,\;m_{y}^{i} ] \)). However, they can be expressed as three-dimensional coordinates if the focal length (\( m_{z}^{i} \)) is appended. The focal length is the distance between the origin (\( O^{c} \)) and the image plane, and it can be estimated during the sensor calibration process. With the focal length, the vector \( {\mathbf{m}}^{i} = \left[ {\begin{array}{*{20}c} {m_{x}^{i} } & {m_{y}^{i} } & {m_{z}^{i} } \\ \end{array} } \right]^{T} \) is expressed in three dimensions. It has the same direction as the feature point position in the sensor frame (\( m^{c} \)), which is expressed as a unit vector using (12). The expressions are as follows:

$$ \begin{aligned} \frac{{m^{i} }}{{\left\| {m^{i} } \right\|}} & = \frac{{m^{c} }}{{\left\| {m^{c} } \right\|}} \\ & = \frac{1}{{\left\| {m^{n} - p^{n} } \right\|}}C_{n}^{c} \left[ {m^{n} - p^{n} } \right] \\ \end{aligned} $$
(13)

where the final term is a normalized unit vector. The vision measurements (\( {\mathbf{z}}_{\text{vision}}^{n} \)) are expressed as a conventional observation function of the states (\( p^{n} ,v^{n} ,\Uppsi ,m^{n} \)) and the measurement noise (υ). The observation model is as follows:

$$ \begin{aligned} z_{\text{vision}}^{n} \left( t \right) & = \frac{{m^{c} \left( t \right)}}{{\left\| {m^{c} \left( t \right)} \right\|}} \\ & = h\left( {p^{n} \left( t \right),v^{n} \left( t \right),\Uppsi \left( t \right),m^{n} \left( t \right)} \right) + \upsilon \left( t \right) \\ \end{aligned} $$
(14)
$$ \begin{gathered} z_{\text{vision}}^{n} \left( t \right) - h\left( {\bar{p}^{n} \left( t \right),\bar{v}^{n} \left( t \right),\bar{\Uppsi }\left( t \right),\bar{m}^{n} \left( t \right)} \right) \hfill \\ = \left[ {\begin{array}{*{20}c} {h_{{{\text{vision}},p}} } & {{\mathbf{0}}_{3 \times 3} } & {h_{{{\text{vision}},\Uppsi }} } & {h_{{{\text{vision}},m}} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\delta p^{n} \left( t \right)} \\ {\delta v^{n} \left( t \right)} \\ {\delta \Uppsi \left( t \right)} \\ {\delta m^{n} \left( t \right)} \\ \end{array} } \right] + \upsilon \left( t \right) \hfill \\ \end{gathered} $$
(15)
Fig. 2 Feature point position on the sensor frame

Equation (14) is linearized to first order about the nominal points (\( \bar{p}^{n} ,\bar{v}^{n} ,\bar{\Uppsi },\bar{m}^{n} \)), which yields (15). A sketch of the measurement prediction and its numerical Jacobian is given below.
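
The following numpy sketch (illustrative, with assumed frame conventions) predicts the unit-vector measurement of (13)-(14) and obtains the Jacobian blocks of (15) by finite differences; an analytic Jacobian would normally be used instead:

```python
import numpy as np

def predict_vision(m_n, p_n, C_nc):
    """Predicted unit-vector measurement of (13)-(14).

    C_nc: rotation from the navigation frame to the sensor frame,
    i.e., the transpose of C_c^n in (12)."""
    d = C_nc @ (m_n - p_n)
    return d / np.linalg.norm(d)

def numerical_jacobian(h, x0, eps=1e-6):
    """Finite-difference Jacobian; one way to form the h_vision blocks
    of (15) when an analytic derivation is not at hand."""
    y0 = h(x0)
    J = np.zeros((y0.size, x0.size))
    for k in range(x0.size):
        dx = np.zeros_like(x0)
        dx[k] = eps
        J[:, k] = (h(x0 + dx) - y0) / eps
    return J
```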

GNSS integration with Vision/INS integrated navigation

The systems in sections "GNSS" and "INS/Vision integrated navigation system" use different state vectors and different coordinate systems. A common state vector and coordinate system are needed to integrate them. This section describes the resulting process model and an observation model under these common conditions.

Process model

To integrate GNSS into Vision/INS integrated navigation, the receiver clock bias (\( b_{\rm u} \)) and drift (\( \dot{b}_{\rm u} \)) must be added as states to (11). The receiver clock model is expressed as:

$$ \delta {\mathbf{x}}_{\text{clock}} \left( {t + 1} \right) = \left[ {\begin{array}{*{20}c} 1 & {\Updelta t} \\ 0 & 1 \\ \end{array} } \right]\delta {\mathbf{x}}_{\text{clock}} \left( t \right) + \eta_{\text{clock}} \left( t \right) $$
(16)

where \( \delta {\mathbf{x}}_{\text{clock}} \) is \( \left[ {\begin{array}{*{20}c} {\delta b_{\rm u} } & {\delta \dot{b}_{\rm u} } \\ \end{array} } \right]^{T} \). Adding the receiver clock error states of (16) to (11) yields the integrated process model:

$$ \begin{gathered} \delta {\mathbf{x}}\left( {t + 1} \right) = F\left( t \right)\delta {\mathbf{x}}\left( t \right) + w\left( t \right) \hfill \\ \delta {\mathbf{x}} = \left[ {\begin{array}{*{20}c} {\delta {\mathbf{x}}_{\text{vehicle}} } & {\delta {\mathbf{x}}_{\text{clock}} } & {\delta {\mathbf{x}}_{\text{feature}} } \\ \end{array} } \right]^{T} \hfill \\ w = \left[ {\begin{array}{*{20}c} {w_{\text{vehicle}} } & {w_{\text{clock}} } & {w_{\text{feature}} } \\ \end{array} } \right]^{T} \hfill \\ F = \left[ {\begin{array}{*{20}c} {{\mathbf{I}}_{3 \times 3} } & {\Updelta t{\mathbf{I}}_{3 \times 3} } & {0_{3 \times 3} } &\vline & {{\mathbf{0}}_{3 \times 2} } &\vline & {{\mathbf{0}}_{3 \times 3j} } \\ {{\mathbf{0}}_{3 \times 3} } & {{\mathbf{I}}_{3 \times 3} } & {\Updelta t\left[ {f^{n} \times } \right]} &\vline & {{\mathbf{0}}_{3 \times 2} } &\vline & {{\mathbf{0}}_{3 \times 3j} } \\ {{\mathbf{0}}_{3 \times 3} } & {{\mathbf{0}}_{3 \times 3} } & {{\mathbf{I}}_{3 \times 3} } &\vline & {{\mathbf{0}}_{3 \times 2} } &\vline & {{\mathbf{0}}_{3 \times 3j} } \\ \hline {{\mathbf{0}}_{2 \times 3} } & {{\mathbf{0}}_{2 \times 3} } & {{\mathbf{0}}_{2 \times 3} } &\vline & {\begin{array}{*{20}c} 1 & {\Updelta t} \\ 0 & 1 \\ \end{array} } &\vline & {{\mathbf{0}}_{2 \times 3j} } \\ \hline {{\mathbf{0}}_{3j \times 3} } & {{\mathbf{0}}_{3j \times 3} } & {{\mathbf{0}}_{3j \times 3} } &\vline & {{\mathbf{0}}_{3j \times 2} } &\vline & {{\mathbf{I}}_{3j \times 3j} } \\ \end{array} } \right] \hfill \\ \end{gathered} $$
(17)

Equation (17) is the final process model, which includes all of the INS, GNSS, and vision states; a sketch of its construction is given below.
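
As a minimal sketch (assuming the state ordering of (17) and the F matrix from the earlier Vision/INS sketch), the clock block of (16) is inserted between the vehicle and feature-point blocks:

```python
import numpy as np

def F_integrated(F_vis_ins, dt):
    """Build the F matrix of (17) by inserting the two receiver clock
    states of (16) between the vehicle and feature-point blocks.

    F_vis_ins: the (9 + 3j) x (9 + 3j) matrix from (11)."""
    n = F_vis_ins.shape[0]
    F = np.zeros((n + 2, n + 2))
    F[0:9, 0:9] = F_vis_ins[0:9, 0:9]          # vehicle block
    F[9:11, 9:11] = np.array([[1.0, dt],
                              [0.0, 1.0]])     # clock bias and drift
    F[11:, 11:] = F_vis_ins[9:, 9:]            # feature points (identity)
    return F
```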

Observation model

The GNSS and vision observation models are expressed in different coordinate systems. The state vector is defined with respect to the navigation frame. The GNSS measurements are calculated in the WGS-84 (ECEF) coordinate system and require the receiver clock bias and drift, whereas the vision-related states use the navigation frame and can be applied directly in the INS model. To integrate the two measurement types, therefore, the total observation model must account for the coordinate transformation. The observation model of the two sensors is as follows:

$$ \delta z = \left[ {\begin{array}{*{20}c} {\delta z_{{{\text{GNSS}},{\text{PR}}}} } \\ {\delta z_{{{\text{GNSS}},{\text{CP}}}} } \\ \hline {\delta z_{\text{vision}} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {h_{\text{GNSS}} C_{n}^{e} } & {{\mathbf{0}}_{1 \times 3} } & {{\mathbf{0}}_{1 \times 3} } &\vline & 1 & 0 &\vline & {{\mathbf{0}}_{1 \times 3} } \\ {{\mathbf{0}}_{1 \times 3} } & { - h_{\text{GNSS}} C_{n}^{e} } & {{\mathbf{0}}_{1 \times 3} } &\vline & 0 & 1 &\vline & {{\mathbf{0}}_{1 \times 3} } \\ \hline {h_{{{\text{vision}},p}} } & {{\mathbf{0}}_{3 \times 3} } & {h_{{{\text{vision}},\Uppsi }} } &\vline & {{\mathbf{0}}_{3 \times 1} } & {{\mathbf{0}}_{3 \times 1} } &\vline & {h_{{{\text{vision}},m}} } \\ \end{array} } \right]\delta {\mathbf{x}}^{n} $$
(18)

where \( \delta {\mathbf{z}}_{{{\text{GNSS}},{\text{PR}}}} \) is the pseudorange residual, \( \delta {\mathbf{z}}_{{{\text{GNSS}},{\text{CP}}}} \) is the carrier phase (range rate) residual, \( \delta {\mathbf{z}}_{\text{vision}} \) is the vision measurement residual, and \( C_{n}^{e} \) is the transformation matrix from the navigation frame to the ECEF frame (WGS-84). Each measurement represents the residual from the nominal point.
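{"type": "text"}
As an illustrative assembly of (18) (assumed state ordering and a single satellite and feature point; not the authors' code), the pseudorange row, range-rate row, and vision rows are stacked into one H matrix:

```python
import numpy as np

def H_integrated(h_gnss, C_ne, h_vis_p, h_vis_psi, h_vis_m, j, idx):
    """Stack one pseudorange row, one range-rate row, and one feature
    point's vision rows as in (18).

    Assumed state ordering: [p(3) v(3) psi(3) | b bdot | m_1 .. m_j].
    h_gnss: 1x3 LOS vector in ECEF; C_ne: navigation-to-ECEF rotation."""
    H = np.zeros((5, 11 + 3 * j))
    hg = h_gnss @ C_ne                   # LOS mapped onto n-frame errors
    H[0, 0:3] = hg                       # pseudorange: position
    H[0, 9] = 1.0                        # pseudorange: clock bias
    H[1, 3:6] = -hg                      # range rate: velocity
    H[1, 10] = 1.0                       # range rate: clock drift
    H[2:5, 0:3] = h_vis_p                # vision: position block
    H[2:5, 6:9] = h_vis_psi              # vision: attitude block
    H[2:5, 11 + 3 * idx:14 + 3 * idx] = h_vis_m  # vision: feature idx
    return H
```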

Observability analysis

An observability test was conducted to verify the feasibility of the proposed navigation system. Several simplifications were applied. The coordinate transformations were assumed to be \( C_b^{n} = C_b^{c} = {\mathbf{I}}_{3 \times 3} \), and the vision observation components in (15) satisfy \( h_{{{\text{vision}},p}} = - h_{{{\text{vision}},m}} \). Multiplying an observation equation by a nonsingular matrix does not affect its rank, so the vision observation equation was multiplied by \( h_{{{\text{vision}},p}}^{ - 1} \) to organize the observation matrix (H). The simplified observation matrix is as follows:

$$ H = \left[ {\begin{array}{*{20}c} {h_{\rm G} } & {{\mathbf{0}}_{1 \times 3} } & {{\mathbf{0}}_{1 \times 3} } &\vline & 1 & 0 &\vline & {{\mathbf{0}}_{1 \times 3} } \\ {{\mathbf{0}}_{1 \times 3} } & { - h_{\rm G} } & {{\mathbf{0}}_{1 \times 3} } &\vline & 0 & 1 &\vline & {{\mathbf{0}}_{1 \times 3} } \\ \hline { - {\mathbf{I}}_{3 \times 3} } & {{\mathbf{0}}_{3 \times 3} } & {h_{\rm V} } &\vline & {{\mathbf{0}}_{3 \times 1} } & {{\mathbf{0}}_{3 \times 1} } &\vline & {{\mathbf{I}}_{3 \times 3} } \\ \end{array} } \right] $$
(19)

where \( h_{\rm G} \) is \( h_{\text{GNSS}} C_{n}^{e} \) and \( h_{\rm V} \) is \( \left( {h_{{{\text{vision}},p}} } \right)^{ - 1} \cdot h_{{{\text{vision}},\Uppsi }} \). The observability test was conducted based on (19).

Observability of the Vision/INS integrated navigation

The observability of the Vision/INS integrated navigation was analyzed prior to its integration with GNSS. Before the receiver clock bias and drift states were added, Eq. (11) was used to check the observability of the Vision/INS integrated system with the state vector \( \delta {\mathbf{x}} = \left[ {\begin{array}{*{20}c} {\delta {\mathbf{p}}^{n} } & {\delta {\mathbf{v}}^{n} } & {\delta {\varvec{\psi}}^{n} } & {\delta {\mathbf{M}}^{n} } \\ \end{array} } \right]^{T} \). According to Bryson and Sukkarieh (2008) and Bryson et al. (2005), the observability matrix of Vision/INS integrated navigation always has a rank deficiency of three. A single feature point is used to form the local observability matrix (LOM, \( O_{\text{LOM}} \)) over a single time segment (Goshen-Meskin and Bar-Itzhack 1992). The expression is as follows:

$$ O_{\text{LOM}} = \left[ {\begin{array}{*{20}c} { - {\mathbf{I}}_{3 \times 3} } & {{\mathbf{0}}_{3 \times 3} } & {h_{\rm V} } & {{\mathbf{I}}_{3 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - \Updelta t{\mathbf{I}}_{3 \times 3} } & {h_{\rm V} } & {{\mathbf{I}}_{3 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - 2\Updelta t{\mathbf{I}}_{3 \times 3} } & { - \Updelta t^{2} \left[ {f^{n} \times } \right] + h_{\rm V} } & {{\mathbf{I}}_{3 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - 3\Updelta t{\mathbf{I}}_{3 \times 3} } & {\underbrace {{ - 3\Updelta t^{2} \left[ {f^{n} \times } \right] + h_{\rm V} }}_{{{\text{Rank}} = 2}}} & {{\mathbf{I}}_{3 \times 3} } \\ \end{array} } \right] $$
(20)

where \( O_{\text{LOM}} \) in (20) involves 12 states, four of which are unobservable, so its rank is 8. If an additional feature point is added within the single time segment, there are 15 states in total and the rank is 11, again with four unobservable states. Three of the unobservable states arise from the correlation between the vehicle position and the feature point positions, whereas the remaining rank deficiency derives from the attitude term, which is constituted by the skew-symmetric matrix.

The total observability matrix (TOM, \( O_{\text{TOM}} \)) is composed of multiple time segments for testing the observability (Goshen-Meskin and Bar-Itzhack 1992). The matrix over two time segments is as follows:

$$ O_{\text{TOM}} = \left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} { - {\mathbf{I}}_{3 \times 3} } & {{\mathbf{0}}_{3 \times 3} } & {h_{\rm V,1} } & {{\mathbf{I}}_{3 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - \Updelta t{\mathbf{I}}_{3 \times 3} } & {h_{\rm V,1} } & {{\mathbf{I}}_{3 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - 2\Updelta t{\mathbf{I}}_{3 \times 3} } & {\left[ { - \Updelta t^{2} \left[ {f^{n} \times } \right] + h_{\rm V,1} } \right]} & {{\mathbf{I}}_{3 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - 3\Updelta t{\mathbf{I}}_{3 \times 3} } & {\left[ { - 3\Updelta t^{2} \left[ {f^{n} \times } \right] + h_{\rm V,1} } \right]} & {{\mathbf{I}}_{3 \times 3} } \\ \end{array} } \\ \hline {\begin{array}{*{20}c} { - {\mathbf{I}}_{3 \times 3} } & {{\mathbf{0}}_{3 \times 3} } & {h_{\rm V,2} } & {{\mathbf{I}}_{3 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - \Updelta t{\mathbf{I}}_{3 \times 3} } & {h_{\rm V,2} } & {{\mathbf{I}}_{3 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - 2\Updelta t{\mathbf{I}}_{3 \times 3} } & {\left[ { - \Updelta t^{2} \left[ {f^{n} \times } \right] + h_{\rm V,2} } \right]} & {{\mathbf{I}}_{3 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - 3\Updelta t{\mathbf{I}}_{3 \times 3} } & {\underbrace {{\left[ { - 3\Updelta t^{2} \left[ {f^{n} \times } \right] + h_{\rm V,2} } \right]}}_{{{\text{Rank}} = 3}}} & {{\mathbf{I}}_{3 \times 3} } \\ \end{array} } \\ \end{array} } \right] $$
(21)

As the number of time segments increases, the attitude-related rows of the skew-symmetric matrix become independent, which increases the rank by one and leaves three unobservable states. As in the LOM analysis, however, increasing the number of feature points does not raise the rank further. Thus, the total number of unobservable states is never less than three: the positions of the vehicle and the feature points remain unobservable, and only their relative positioning can be determined by Vision/INS integrated navigation. A numerical check of these rank counts is sketched below.
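
The rank counts in (20)-(22) can be checked numerically. The sketch below (illustrative; F and the per-segment H matrices are assumed to come from the earlier sketches) stacks \( H_k \Phi(k,0) \) over time segments in the spirit of the TOM of Goshen-Meskin and Bar-Itzhack (1992):

```python
import numpy as np

def observability_rank(F, H_list):
    """Numerical rank of the stacked observability matrix
    [H_0; H_1 F; H_2 F^2; ...] over successive time segments."""
    rows, Phi = [], np.eye(F.shape[0])
    for H in H_list:              # one observation matrix per segment
        rows.append(H @ Phi)
        Phi = F @ Phi
    return np.linalg.matrix_rank(np.vstack(rows))

# n_unobservable = n_states - observability_rank(F, H_list); the analysis
# above predicts this is never less than three for Vision/INS alone.
```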

Observability of the GNSS/Vision/INS integrated navigation

The number of states for GNSS/Vision/INS integrated navigation is 11 + 3j, that is, nine for the vehicle, two for the GNSS receiver clock, and 3j for the feature points. The state vector for GNSS/Vision/INS integrated navigation is set to \( \delta {\mathbf{x}} = \left[ {\begin{array}{*{20}c} {\delta p^{n} } & {\delta v^{n} } & {\delta \Uppsi } & {\delta b_{\rm u} } & {\delta \dot{b}_{\rm u} } & {\delta m^{n} } \\ \end{array} } \right]^{T} \). Although each sensor has a different update rate, the observability test assumed that all measurements (GNSS, vision, and INS) were obtained during the same epoch. The LOM is as follows:

$$ O_{\text{LOM}} = \left[ {\begin{array}{*{20}c} {h_{\rm G} } & {{\mathbf{0}}_{1 \times 3} } & {{\mathbf{0}}_{1 \times 3} } & 1 & 0 & {{\mathbf{0}}_{1 \times 3} } \\ {{\mathbf{0}}_{1 \times 3} } & { - h_{\rm G} } & {{\mathbf{0}}_{1 \times 3} } & 0 & 1 & {{\mathbf{0}}_{1 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & {{\mathbf{0}}_{3 \times 3} } & {h_{\rm V} } & {{\mathbf{0}}_{3 \times 1} } & {{\mathbf{0}}_{3 \times 1} } & {{\mathbf{I}}_{3 \times 3} } \\ {h_{\rm G} } & {\Updelta th_{\rm G} } & {{\mathbf{0}}_{1 \times 3} } & 1 & {\Updelta t} & {{\mathbf{0}}_{1 \times 3} } \\ {{\mathbf{0}}_{1 \times 3} } & { - h_{\rm G} } & { - \Updelta t\left[ {f^{n} \times } \right]h_{\rm G} } & 0 & 1 & {{\mathbf{0}}_{1 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - \Updelta t{\mathbf{I}}_{3 \times 3} } & {h_{\rm V} } & {{\mathbf{0}}_{3 \times 1} } & {{\mathbf{0}}_{3 \times 1} } & {{\mathbf{I}}_{3 \times 3} } \\ {h_{\rm G} } & {2\Updelta th_{\rm G} } & {\Updelta t^{2} \left[ {f^{n} \times } \right]h_{\rm G} } & 1 & {2\Updelta t} & {{\mathbf{0}}_{1 \times 3} } \\ {{\mathbf{0}}_{1 \times 3} } & { - h_{\rm G} } & { - 2\Updelta t\left[ {f^{n} \times } \right]h_{\rm G} } & 0 & 1 & {{\mathbf{0}}_{1 \times 3} } \\ { - {\mathbf{I}}_{3 \times 3} } & { - 2\Updelta t{\mathbf{I}}_{3 \times 3} } & { - \Updelta t^{2} \left[ {f^{n} \times } \right] + h_{\rm V} } & {{\mathbf{0}}_{3 \times 1} } & {{\mathbf{0}}_{3 \times 1} } & {{\mathbf{I}}_{3 \times 3} } \\ \end{array} } \right] $$
(22)

If only one feature point is used, the total dimension of the state vector is 14. Equation (22) is the \( O_{\text{LOM}} \) formed from the process model in (17) and the observation model in (18), and its rank is 10. When the TOM is formulated as in (21) so that the attitude term attains full rank, \( O_{\text{TOM}} \) has rank 11, leaving three unobservable states. The unobservable states relate to the vehicle position, the feature point position, and the GNSS receiver clock.

If the number of time segments is increased, the rank deficiency in the attitude term is reduced by one. With GNSS, however, the LOS vector changes little over a few seconds because of the large distance between the satellite and the user. Despite multiple time segments, therefore, the change in the GNSS LOS matrix (\( h_{\rm G} \)) is not significant and does not improve the observability of the position. Each additional GNSS satellite, by contrast, increases the rank by one.

Table 1 shows the number of observable states according to the number of GNSS satellites. Using fewer than four GNSS satellites improves the rank deficiency, although it does not make the observation matrix full rank. If the GNSS receiver clock error is assumed to be known or well calibrated in (22), the receiver clock bias and drift states can be removed. The dimension of the state vector is then 12, and each available satellite increases the rank by one; thus, three available satellites yield full rank, and the vehicle and feature point positions may become observable.

Table 1 Observable states with respect to the number of GNSS satellites

According to the observability analysis in this section, Vision/INS integrated navigation always has three unobservable states. With the proposed method, however, each additional satellite decreases the number of unobservable states by one, which enhances the estimation performance for the vehicle position and the feature point positions. If the GNSS receiver clock is also known, three satellites provide full rank and thus better performance. The unobservable states are related to the vehicle and feature point positions, and it is the pseudorange, as a function of position, that provides the additional observability. The range rate, in contrast, is a function of the vehicle velocity, which is already observable, so it does not improve the rank; instead, improved precision can be expected because of the precise carrier phase.

Test and results

The performance of the proposed system was verified using a simulation and an experimental test. This section describes the test conditions and results.

Simulation setting

The simulation considered the flight of a UAV that followed the route shown in Fig. 3 twice at a height of 10 m; the simulation lasted 300 s. The IMU modeled in this simulation was a Honeywell HG-1700, the vision sensor was an Allied Vision Technologies Guppy F-080, and the GPS receiver was a Novatel OEM-6; the other specifications are listed in Table 2. The vision sensor was assumed to be installed pointing 45° downward with respect to the direction of travel. The feature points were assumed to be distributed randomly on the ground surface, and a maximum of seven could be utilized at a time. We assumed monocular vision, and the feature points were initialized via delayed initialization (Bailey 2003). The GPS constellation for January 15, 2012, was generated from the YUMA almanac provided by the U.S. Coast Guard Navigation Center, and Konkuk University, Seoul, Korea, was set as the origin. Figure 4 shows the GPS constellation used in the simulation. All simulation results were generated using Matlab®; the GPS and IMU measurements were produced using the satellite navigation toolbox and the inertial navigation toolbox (©GPSoft LLC).

Fig. 3 Simulation trajectory

Table 2 Sensor specifications
Fig. 4 Simulated GPS constellation

Simulation results

Figure 5 shows the true trajectory and the positions estimated by INS only and by Vision/INS integrated navigation. The INS-only position exhibited relatively large drift errors and diverged without the aid of vision, whereas Vision/INS integrated navigation slowed the divergence. However, the divergence was only reduced, not removed.

Fig. 5 Estimated positions

Figure 6 shows the positions estimated using GPS and Vision/INS integrated navigation, as described in section "GNSS integration with Vision/INS integrated navigation". Satellites PRN 13, 16, and 19 were used, and the GPS receiver clock error was included in the state vector. We analyzed the estimated position error as the number of available satellites increased: the true and estimated positions were compared, and the position error was analyzed quantitatively. Figure 7 shows the decrease in the position error with an increasing number of available satellites. As shown in the observability analysis, increasing the number of available satellites decreases the number of unobservable states, which improves the position estimation performance. When three satellites were used, the maximum error was 16 m over the entire simulation period, and the error exhibited relatively low drift compared with the other two cases. When two satellites were used, the positions estimated along the LOS directions of PRN 13 and 16 were excellent, whereas the estimates were relatively poor in the other directions. Some position errors were greater than those of Vision/INS integrated navigation because of the simulation path and the satellite geometry, although the position error divergence was small. When only one satellite was used, the position error was larger than that of Vision/INS integrated navigation: a single satellite in one specific direction allowed the drift error to increase continuously.

Fig. 6 Estimated positions depending on the number of satellites (with the GPS receiver clock error): one satellite (top), two satellites (middle), and three satellites (bottom)

Fig. 7 Estimated position errors depending on the number of satellites (with the GPS receiver clock error)

In the cases analyzed above, the receiver clock error was included in the state vector. Figure 8 shows the case in which the clock error was assumed to be known or well calibrated. The general trend is similar to that in Figs. 6 and 7. However, the use of two available satellites produced relatively better position estimates: when the receiver clock error term is removed from the model, the rank increases by one and the number of observable states also increases by one (Fig. 9).

Fig. 8 Estimated positions depending on the number of satellites (without the GPS receiver clock error): one satellite (top), two satellites (middle), and three satellites (bottom)

Fig. 9 Estimated position errors depending on the number of satellites (without the GPS receiver clock error)

According to the observability analysis, one more state was observable than with Vision/INS integrated navigation, even with a single available satellite. However, the position error was greater than that of Vision/INS integrated navigation in both cases, because the satellite geometry and the direction of the vehicle's movement must be taken into account when only one satellite is available.

Experimental results

An experimental (driving) test was conducted to verify the proposed navigation system. The test was conducted at Konkuk University, Korea, on November 2, 2012. The IMU was an ADIS 16364, the vision sensor was an LG LVC-A903HM, and the GPS receiver was a Novatel OEM-5; a Novatel SPAN system was used as the reference. The feature points were tracked using the KLT method (Tomasi and Kanade 1991). All data were recorded and postprocessed. Figure 10 shows the test site and trajectory, and Fig. 11 shows the satellite geometry and the number of available satellites during the test. At this site, buildings and other structures often blocked the LOS to visible satellites, and satellites at lower elevation angles were blocked first. The site therefore resembled an urban canyon and was considered suitable for testing the proposed algorithm.

Fig. 10 Test site and trajectory

Fig. 11 Deployment of satellites (left) and the number of available satellites during the experimental test (right)

Figure 12 shows the estimated positions, and Fig. 13 shows the estimated position errors. The navigation errors were not corrected in the INS-only case, so the other two methods performed better than INS, and the INS/Vision/GPS integrated system delivered the best performance. The number of satellites decreased from three to two at 18 s, and the position error increased until 32 s. After 33 s, the number of satellites returned to three and the error decreased temporarily. After 35 s, the number of satellites dropped to zero, then recovered to only one, and the errors began to increase again. All of these effects are related to the unobservable states described in section "Observability analysis". This test verified that the INS/Vision/GPS integrated system estimated the position more accurately as the number of available satellites increased.

Fig. 12 Estimated positions

Fig. 13 Estimated position errors

The simulations and the experimental test described in this section verified the proposed navigation system using a limited number of GNSS satellites.

Conclusion

In urban canyons, it is difficult to receive four or more satellites to provide a three-dimensional navigation solution, and conventional alternative navigation systems assume that GNSS is unavailable when there are not enough satellites. Although limited GNSS measurements cannot provide a navigation solution on their own, they can be integrated with Vision/INS integrated navigation to improve navigation performance. In this paper, we proposed a Vision/INS integrated navigation system that uses GNSS measurements when fewer than four satellites are available. We developed an integrated model that combines the GNSS range and range rate with Vision/INS integrated navigation and verified the improvement in navigation performance via an observability analysis. The analysis confirmed that Vision/INS integrated navigation always has three unobservable states, but that the observability improves with each additional available satellite. We conducted simulations and an experimental test to analyze the position estimation performance as a function of the number of available satellites, and an improvement in the drift error was confirmed when GNSS measurements were utilized in Vision/INS integrated navigation. In urban canyons, only a few satellites are visible and the signal quality is low, so the number of available satellites often drops below four; improved navigation solutions can be expected in such environments by integrating limited GNSS measurements with Vision/INS integrated navigation.