
1 Introduction

In the last decade there has been increasing use of Autonomous Ground Vehicles (AGVs) in many navigation applications. One representative civil application is in intelligent transport systems (ITSs), more specifically in autonomous cars or driverless cars [1]. In the vehicle’s Guidance, Navigation, and Control (GNC) system, navigation state estimation is an essential component, not only for the vehicle’s autonomous navigation but also for cooperative safety communications among vehicles [2,3,4]. Among the different navigation systems, the inertial navigation system (INS) is the only one capable of producing a complete and continuous set of navigation state data with high precision over a short time span. However, a low-cost INS has to be integrated with other aiding sensors to reduce its error growth. INS and Global Navigation Satellite System (GNSS) integration is commonly used for outdoor vehicle navigation. However, GNSS signals can suffer from obstruction and multipath errors in city centers and mountainous regions, and are prone to being jammed or spoofed [5].

To mitigate the error growth of INS in GNSS-denied environments, aiding sensors such as wheel odometers and motion constraints (e.g., Non-holonomic Constraints (NHC) as a “virtual” sensor) have been utilized for land vehicles [6, 7]. Recently, researchers have proposed to use vision sensor aiding in the navigation system, either in loosely coupled form with a visual odometry (VO) module [8, 9] or in tightly coupled form [10,11,12,13]. Most research has used point features in tightly coupled VINS, including simultaneous localization and mapping (SLAM) based methods [10, 14] and structure-less methods, which focus on the ego-motion only [12, 15]. The representative filter-based algorithm is the multi-state constraint Kalman filter (MSCKF) [12], which augments camera poses (a sliding window of poses) into the state vector to make the most use of feature point measurements in the geometry constraint. Apart from point-based VINS, there are approaches using a vanishing point (VP) module [16, 17], another attitude aiding source for INS operating in environments with structured buildings or in corridors. These methods rely on the observation of parallel lines on structured buildings.

In this paper, we present the MSCKF in the local-level frame (LLF), which is common in the navigation community. To improve the performance of the point-based MSCKF, we propose to add a VP module, which observes the vertical lines of building blocks, for land vehicle navigation. Based on that, a roll angle measurement model is derived. A mixed VINS using point and line observations is thus developed, and its performance is verified by real-data experiments.

2 MSCKF Algorithm in Local-Level Frame

The MSCKF is a structure-less visual-inertial navigation system: the main interest is vehicle localization rather than recovering the structure of the environment. It takes into account all camera poses at which observations of the same 3D point occurred. The MSCKF augments camera poses (a sliding window of poses) into the state vector to make the most use of feature point measurements in the geometry constraint. Every feature has a corresponding stack of misclosure vectors (residuals). Feature position error states are cancelled out by projecting the residual stack onto the left null space of the design matrix corresponding to the feature error state [12]. Here we present the MSCKF algorithm in the LLF with notations common to navigation engineers and researchers, in order to bridge the navigation and robotics communities. It should be mentioned that some components of the filtering formulation must be modified with respect to the original MSCKF.

2.1 MSCKF States and Covariance

For near-Earth navigation, the navigation frame is usually the local-level frame (n-frame; east-north-up is used in this paper). The navigation state is represented by the position (latitude L, longitude \(\lambda\), and height h), velocity (east, north, and up velocities), and attitude (pitch, roll, and azimuth). The differential equations of the position P, velocity \(\varvec{V}^{n}\), and attitude matrix \(\varvec{C}_{b}^{n}\) are given by

$$\dot{\varvec{P}} = \varvec{D}^{ - 1} \varvec{V}^{n} = \left[ {\begin{array}{*{20}c} 0 & {\frac{1}{{R_{M} + h}}} & 0 \\ {\frac{1}{{(R_{N} + h)\cos L}}} & 0 & 0 \\ 0 & 0 & 1 \\ \end{array} } \right]\varvec{V}^{n}$$
(2.1)
$$\dot{\varvec{V}}^{n} = \varvec{f}^{n} - \left( {2\varvec{\omega}_{ie}^{n} +\varvec{\omega}_{en}^{n} } \right) \times \varvec{V}^{n} + \varvec{g}^{n}$$
(2.2)
$$\dot{\varvec{C}}_{b}^{n} = \varvec{C}_{b}^{n} \left[ {\varvec{\omega}_{ib}^{b} \times } \right] - \left[ {\left( {\varvec{\omega}_{ie}^{n} +\varvec{\omega}_{en}^{n} } \right) \times } \right]\varvec{C}_{b}^{n}$$
(2.3)

where \(R_{M}\) and \(R_{N}\) are the meridian radius of curvature and the prime vertical radius of curvature respectively, \(\varvec{f}^{n}\) is the specific force in the n-frame, \(\varvec{\omega}_{ie}^{n}\) is the earth rotation angular rate vector, \(\varvec{\omega}_{en}^{n}\) is the angular rate of the navigation frame with respect to (w.r.t.) the ECEF frame, and \(\varvec{\omega}_{ib}^{b}\) is the angular rate of the body frame (b) w.r.t. the inertial frame (i). The notation \((\varvec{a}) \times\) is the skew-symmetric matrix of a vector a. The navigation states are propagated using Eqs. (2.1)–(2.3), which is called the INS mechanization, where the IMU measurement is the input signal.
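
To make the propagation step concrete, the following is a minimal single-step mechanization sketch using first-order Euler integration. The function name, argument layout, and the assumption of pre-computed gravity and earth-rate vectors are ours for illustration; production INS code would use higher-order integration with coning/sculling corrections.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix (a x) of a 3-vector a."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def ins_mechanization_step(P, V_n, C_b2n, f_b, w_ib_b, w_ie_n, w_en_n, g_n,
                           R_M, R_N, dt):
    """One first-order step of Eqs. (2.1)-(2.3), east-north-up n-frame.

    P = [L, lam, h] (rad, rad, m); V_n = [V_E, V_N, V_U] (m/s); C_b2n = C_b^n.
    """
    L, lam, h = P
    D_inv = np.array([[0.0, 1.0 / (R_M + h), 0.0],
                      [1.0 / ((R_N + h) * np.cos(L)), 0.0, 0.0],
                      [0.0, 0.0, 1.0]])
    f_n = C_b2n @ f_b                                             # specific force in n-frame
    V_dot = f_n - np.cross(2.0 * w_ie_n + w_en_n, V_n) + g_n      # Eq. (2.2)
    C_dot = C_b2n @ skew(w_ib_b) - skew(w_ie_n + w_en_n) @ C_b2n  # Eq. (2.3)
    P_new = P + D_inv @ V_n * dt                                  # Eq. (2.1)
    V_new = V_n + V_dot * dt
    C_new = C_b2n + C_dot * dt  # re-orthonormalize periodically in practice
    return P_new, V_new, C_new
```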

The navigation states can be considered as the full states, and the associated errors are called error states. For convenience, let X denote the error states; the full states are denoted by their corresponding quantities. The state vector of the MSCKF is:

$$\varvec{X} = \left[ {\begin{array}{*{20}c} {\varvec{X}_{I}^{T} } & {\varvec{X}_{C_{1}}^{T} } & {\varvec{X}_{C_{2}}^{T} } & \cdots & {\varvec{X}_{C_{m}}^{T} } \\ \end{array} } \right]^{T}$$
(2.4)

where \(\varvec{X}_{I}\) is the current INS error state, and \(\varvec{X}_{C_{i}}, i = 1 \ldots m\) are the camera pose errors at the times the last m images were recorded. The INS error states are defined as

$$\varvec{X}_{I} = \left[ {\begin{array}{*{20}c} {\delta \varvec{V}^{nT} } & {\varvec{\phi}^{nT} } & {\delta \varvec{P}^{T} } & {\varvec{b}_{g}^{T} } & {\varvec{b}_{a}^{T} } \\ \end{array} } \right]^{T}$$
(2.5)

where \(\delta \varvec{V}^{n}\), \(\varvec{\phi}^{n}\), and \(\delta \varvec{P}\) are the velocity error, misalignment error, and position error (latitude, longitude, and height errors). \(\varvec{b}_{g}\) and \(\varvec{b}_{a}\) are the biases of the gyroscopes and accelerometers. In this case, the dimension of the state vector is 15 + 6m. The misalignment error \(\varvec{\phi}^{n}\) (assumed to be small angles) is defined by

$$\hat{\varvec{C}}_{b}^{n} = \left[ {\varvec{I} -\varvec{\phi}^{n} \times } \right]\varvec{C}_{b}^{n}$$
(2.6)

For the EKF, the INS error state propagation is based on the INS error dynamic model. The IMU sensor biases are modelled as random constants, random walks, or Gauss-Markov processes. The errors of the ith camera pose are defined as:

$$\varvec{X}_{{C_{i} }} = \left[ {\begin{array}{*{20}c} {\varvec{\phi}_{{C_{i} }}^{nT} } & {\delta \varvec{P}_{{C_{i} }}^{T} } \\ \end{array} } \right]^{T}$$
(2.7)

where \(\varvec{\phi}_{{C_{i} }}^{n}\) and \(\delta \varvec{P}_{{C_{i} }}\) are the camera’s attitude error and position error at the time the ith image was recorded. The augmented camera pose errors are modeled as random constants, since previous camera poses do not change over time.

The camera pose is augmented into the state when a new image is recorded. It is calculated from the current IMU pose and the known relative pose between the IMU and the camera using Eqs. (2.8)–(2.9).

$$\hat{\varvec{C}}_{C}^{n} = \hat{\varvec{C}}_{b}^{n} \varvec{C}_{C}^{b}$$
(2.8)
$$\hat{\varvec{P}}_{C} = \hat{\varvec{P}} + \varvec{D}^{ - 1} \hat{\varvec{C}}_{b}^{n} \varvec{P}_{bC}^{b}$$
(2.9)

where \(\varvec{C}_{C}^{b}\) and \(\varvec{P}_{bC}^{b}\) are the rotation matrix and lever arm between the camera and the IMU; \(\hat{\varvec{P}}_{C}\) and \(\hat{\varvec{C}}_{C}^{n}\) are the position and attitude matrix of the camera. The camera pose estimate is then appended to the state vector, and the covariance matrix is augmented accordingly:

$$\varvec{X}_{\left( {15 + 6(m + 1)} \right) \times 1} \leftarrow \left[ {\varvec{X}_{\left( {15 + 6m} \right) \times 1} ;\;\varvec{X}_{C(6 \times 1)} } \right]$$
(2.10)
$$\varvec{P}_{k|k} \leftarrow \left[ {\begin{array}{*{20}c} {\varvec{I}_{15 + 6m} } \\ {\varvec{J}_{6 \times (15 + 6m)} } \\ \end{array} } \right]\varvec{P}_{k|k} \left[ {\begin{array}{*{20}c} {\varvec{I}_{15 + 6m} } \\ {\varvec{J}_{6 \times (15 + 6m)} } \\ \end{array} } \right]^{T}$$
(2.11)

where

$$\varvec{J}_{6 \times (15 + 6m)} = \left[ {\begin{array}{*{20}c} {\varvec{O}_{3 \times 3} } & {\varvec{I}_{3 \times 3} } & {\varvec{O}_{3 \times 3} } & {\varvec{O}_{3 \times 6} } & {\varvec{O}_{3 \times 6m} } \\ {\varvec{O}_{3 \times 3} } & {\varvec{D}^{ - 1} \left( {\widehat{\varvec{C}}_{b}^{n} \varvec{P}_{bC}^{b} } \right) \times } & {\varvec{I}_{3 \times 3} } & {\varvec{O}_{3 \times 6} } & {\varvec{O}_{3 \times 6m} } \\ \end{array} } \right]$$
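
A sketch of the pose-augmentation step of Eq. (2.11) might look as follows; the function name and argument conventions are assumptions for illustration, with `skew` as in the mechanization sketch above.

```python
import numpy as np

def skew(a):
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def augment_camera_pose_covariance(P_cov, C_b2n, P_bC_b, D_inv, m):
    """Augment the covariance with a new camera pose (Eq. 2.11).

    P_cov  : (15+6m, 15+6m) error-state covariance before augmentation
    P_bC_b : IMU-to-camera lever arm in the body frame
    """
    n = 15 + 6 * m
    J = np.zeros((6, n))
    J[0:3, 3:6] = np.eye(3)                     # camera attitude error <- misalignment
    J[3:6, 3:6] = D_inv @ skew(C_b2n @ P_bC_b)  # lever-arm coupling into position error
    J[3:6, 6:9] = np.eye(3)                     # camera position error <- position error
    T = np.vstack([np.eye(n), J])
    return T @ P_cov @ T.T                      # new (15+6(m+1))-dimensional covariance
```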

2.2 MSCKF Procedures

Assume that the ith feature, which has been observed in a set of \(M_{i}\) images, is no longer detected. An EKF measurement update is triggered after forming the residuals corresponding to this feature. Suppose we have the measurements \(\tilde{\varvec{z}}_{i}^{j}\) for \(j = 1 \ldots M_{i}\). The predicted measurement of the ith feature in the jth image is

$$\hat{\varvec{z}}_{i}^{j} = h\left({\widehat{\varvec{P}}_{{C_{j}}},\widehat{\varvec{C}}_{{C_{j}}}^{n},\widehat{\varvec{P}}_{{f_{i}}}} \right)$$
(2.12)

where \(h( \cdot )\) is the nonlinear camera projection equation, and \(\widehat{\varvec{P}}_{{f_{i} }}\) is the computed feature position. The residuals of the ith feature are then

$$\delta \varvec{z}_{i}^{j} \varvec{ } = \hat{\varvec{z}}_{i}^{j} - \tilde{\varvec{z}}_{i}^{j} \approx \varvec{H}_{{X_{i} }}^{j} \varvec{X} + \varvec{H}_{{f_{i} }}^{j} \delta \varvec{P}_{{f_{i} }} + n_{i}^{j}$$
(2.13)

where the Jacobian matrix is

$$\varvec{H}_{X_{i}}^{j} = \left[ {\begin{array}{*{20}c} {\varvec{O}_{2 \times 15} } & {\varvec{O}_{2 \times 6} } & \cdots & {\varvec{J}_{i}^{j} \left( {\widehat{\varvec{P}}_{f_{i}}^{C_{j}} \times } \right)\widehat{\varvec{C}}_{n}^{C_{j}} } & { - \varvec{J}_{i}^{j} \widehat{\varvec{C}}_{n}^{C_{j}} \varvec{D}} & \cdots & {\varvec{O}_{2 \times 6} } \\ \end{array} } \right]_{2 \times (15 + 6m)}$$
$$\varvec{H}_{f_{i}}^{j} = \varvec{J}_{i}^{j} \widehat{\varvec{C}}_{n}^{C_{j}} \varvec{D}$$

where

$$\varvec{J}_{i}^{j} = \frac{1}{{\widehat{Z}_{i}^{{C_{j} }} }}\left[ {\begin{array}{*{20}c} 1 & 0 & { - \frac{{\widehat{X}_{i}^{{C_{j} }} }}{{\widehat{Z}_{i}^{{C_{j} }} }}} \\ 0 & 1 & { - \frac{{\widehat{Y}_{i}^{{C_{j} }} }}{{\widehat{Z}_{i}^{{C_{j} }} }}} \\ \end{array} } \right],\quad \left[ {\begin{array}{*{20}c} {\widehat{X}_{i}^{{C_{j} }} } \\ {\widehat{Y}_{i}^{{C_{j} }} } \\ {\widehat{Z}_{i}^{{C_{j} }} } \\ \end{array} } \right] = \widehat{\varvec{P}}_{{f_{i} }}^{{C_{j} }} = \widehat{\varvec{C}}_{n}^{{C_{j} }} \varvec{D}\left( {\widehat{\varvec{P}}_{{f_{i} }} - \widehat{\varvec{P}}_{{C_{j} }} } \right)$$
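
As a sketch, the per-observation Jacobian above can be computed as follows (function and variable names are ours for illustration):

```python
import numpy as np

def feature_observation_jacobian(C_n2c_j, D, P_f_hat, P_Cj_hat):
    """J_i^j and the feature position in the jth camera frame.

    C_n2c_j : C_n^{C_j}, rotation from the n-frame to camera frame j
    D       : matrix converting (dL, dlam, dh) differences to metres
    """
    P_c = C_n2c_j @ D @ (P_f_hat - P_Cj_hat)   # feature position in camera frame j
    X, Y, Z = P_c
    J = (1.0 / Z) * np.array([[1.0, 0.0, -X / Z],
                              [0.0, 1.0, -Y / Z]])
    return J, P_c
```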

Stack the \(\delta \varvec{z}_{i}^{j}\) of all \(M_{i}\) measurements of the ith feature to form the block residual vector:

$$\delta \varvec{z}_{i} = \varvec{H}_{X_{i}} \varvec{X} + \varvec{H}_{{f_{i} }} \delta \varvec{P}_{{f_{i} }} + \varvec{n}_{i}$$
(2.14)

To remove the effect of the feature position error, define a residual vector \(\delta \varvec{z}_{i}^{O} = \varvec{A}^{\text{T}} \delta \varvec{z}_{i}\), where \(\varvec{A}\) is a matrix whose columns form a basis of the left null space of \(\varvec{H}_{{f_{i} }}\). The new residual will be

$$\delta \varvec{z}_{i}^{O} = \varvec{A}^{\text{T}} \delta \varvec{z}_{i} = \varvec{A}^{\text{T}} \varvec{H}_{X_{i}} \varvec{X} + \varvec{A}^{\text{T}} \varvec{n}_{i} = \varvec{H}_{i}^{O} \varvec{X} + \varvec{n}_{i}^{O}$$
(2.15)

After the measurement update in the Kalman filter, feedback corrections are applied to the navigation states and camera poses.
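
The null-space projection of Eq. (2.15) takes only a few lines of NumPy/SciPy; the following is a minimal sketch with names of our choosing:

```python
import numpy as np
from scipy.linalg import null_space

def project_out_feature_error(dz_i, H_Xi, H_fi):
    """Form the residual of Eq. (2.15), independent of the feature position error.

    dz_i : (2*M_i,) stacked residuals of feature i
    H_Xi : (2*M_i, 15+6m) stacked state Jacobian
    H_fi : (2*M_i, 3) stacked feature-position Jacobian
    """
    A = null_space(H_fi.T)   # columns span the left null space: A^T H_fi = 0
    dz_o = A.T @ dz_i        # (2*M_i - 3,) projected residual
    H_o = A.T @ H_Xi         # projected design matrix for the EKF update
    return dz_o, H_o
```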

The MSCKF can be implemented and operated in different ways [18]. For example, a minimum feature track length can be set to trigger an update, as sketched below. A large minimum track length means that landmarks are observed over a longer baseline, while some nearby, briefly tracked landmarks are neglected. A large minimum track length can therefore achieve better accuracy than a small one, but it also risks discarding useful information, and some information is often better than none. In general, a large minimum feature track length requires more detected features in order to achieve superior accuracy over a small one.
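
A toy illustration of such an update trigger (parameter and function names are ours):

```python
MIN_TRACK_LEN = 20  # "minimum feature track length" discussed above

def should_update(track_len, track_lost, window_full):
    """Trigger an EKF update when a feature track ends (or the sliding window
    is full) and the track is long enough to provide a useful baseline."""
    return (track_lost or window_full) and track_len >= MIN_TRACK_LEN
```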

3 Roll Angle Aiding with Vertical Line Observations

Most VINS approaches have focused on point feature observations. Line features, however, are omnipresent and can be reliably extracted and tracked in structured urban environments. Here we propose to use vanishing point coordinates derived from buildings’ vertical line observations as attitude aiding to improve point-based VINS on board land vehicles.

3.1 Roll Measurement from Vertical Line Observations

The advantage of using buildings’ vertical line observations is that the measurement is an absolute roll, since the directions of these vertical lines are known. A vanishing point is the intersection of the image projections of a set of parallel 3D lines in the scene. Each set of parallel lines is associated with a VP in an image [19].

The vertical lines in the world frame (the local-level frame here) intersect at the point at infinity, which can be represented as \(\varvec{U} = \left[ {\begin{array}{*{20}c} 0 & 0 & 1 & 0 \\ \end{array} } \right]^{\varvec{T}}\). The corresponding vanishing point u in the image plane is the projection of the point at infinity U. The projection equation is written as:

$$\varvec{u} = \varvec{K}\left[ {\varvec{C}_{n}^{C} \ \ \ \varvec{T}} \right]\varvec{ U}$$
(3.1)

where K is the camera calibration matrix, \(\varvec{C}_{n}^{C}\) is the camera rotation matrix, and T is the translation vector of the camera. Equation (3.1) can be rewritten as

$$\varvec{u} = \varvec{KC}_{b}^{C} \varvec{C}_{n}^{b} \left[ {\begin{array}{*{20}c} 0 & 0 & 1 \\ \end{array} } \right]^{T}$$
(3.2)

Then INS roll (γ) and pitch (θ) can be represented as

$$\left[ {\begin{array}{*{20}c} { - \sin \gamma \,\cos \theta } \\ {\sin \theta } \\ {\cos \gamma \,\cos \theta } \\ \end{array} } \right] = \varvec{C}_{C}^{b} \varvec{K}^{ - 1} \varvec{u}$$
(3.3)

The roll angle can be calculated as

$$\gamma = \tan^{ - 1} \left( { - \frac{{\varvec{e}_{1}^{T} \varvec{C}_{C}^{b} \varvec{K}^{ - 1} \varvec{u}}}{{\varvec{e}_{3}^{T} \varvec{C}_{C}^{b} \varvec{K}^{ - 1} \varvec{u}}}} \right)$$
(3.4)

where \(\varvec{e}_{1} = \left[ {\begin{array}{*{20}c} 1 & 0 & 0 \\ \end{array} } \right]^{T}\) and \(\varvec{e}_{3} = \left[ {\begin{array}{*{20}c} 0 & 0 & 1 \\ \end{array} } \right]^{T}\).

Suppose the camera has been calibrated, so \(\varvec{C}_{C}^{b} \varvec{K}^{ - 1}\) is known. The vanishing point u is determined by the coordinates of the intersection of multiple detected vertical lines in the image. In this paper, we use the Line Segment Detector (LSD) [20] to extract line segments, which are then clustered into multiple VP classes based on the J-Linkage model [21]. The vertical VP u is one of the dominant VP classes, and can be easily selected and estimated [22]. Note that some erroneous line segments exist in the vertical VP class. These can be further rejected by imposing additional assumptions, for example on the orientation or length of the line segments.
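
Given the estimated vertical vanishing point, the roll extraction of Eq. (3.4) reduces to a few lines. This is a sketch under the stated calibration assumptions; the function and variable names are ours:

```python
import numpy as np

def roll_from_vertical_vp(u, K, C_c2b):
    """Roll angle from the vertical vanishing point (Eq. 3.4).

    u     : homogeneous image coordinates [u1, u2, 1] of the vertical VP
    K     : 3x3 camera calibration matrix
    C_c2b : C_C^b, rotation from the camera frame to the body frame
    """
    d = C_c2b @ np.linalg.solve(K, u)   # ~ [-sin(roll)cos(pitch), sin(pitch), cos(roll)cos(pitch)]
    return np.arctan(-d[0] / d[2])      # the ratio cancels the homogeneous scale
```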

3.2 Roll Angle Measurement Model

In the INS mechanization, the roll angle \(\gamma\) can be calculated by

$$\hat{\gamma } = \tan^{ - 1} \frac{{ - \hat{c}_{31} }}{{\hat{c}_{33} }}$$
(3.5)

where \(\hat{c}_{31}\) and \(\hat{c}_{33}\) are elements of attitude matrix \(\widehat{\varvec{C}}_{b}^{n}\), which can be written as

$$\widehat{\varvec{C}}_{b}^{n} = \left[ {\varvec{I} -\varvec{\phi}^{n} \times } \right]\varvec{C}_{b}^{n} = \left[ {\begin{array}{*{20}c} 1 & {\phi_{U} } & { - \phi_{N} } \\ { - \phi_{U} } & 1 & {\phi_{E} } \\ {\phi_{N} } & { - \phi_{E} } & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {c_{11} } & {c_{12} } & {c_{13} } \\ {c_{21} } & {c_{22} } & {c_{23} } \\ {c_{31} } & {c_{32} } & {c_{33} } \\ \end{array} } \right]$$
(3.6)

So the roll can be expressed as

$$\hat{\gamma } = \tan^{ - 1} \frac{{ - \hat{c}_{31} }}{{\hat{c}_{33} }} = \tan^{ - 1} \frac{{ - (c_{11} \phi_{N} - c_{21} \phi_{E} + c_{31} )}}{{c_{13} \phi_{N} - c_{23} \phi_{E} + c_{33} }}$$
(3.7)

Then the INS roll error \(\delta \gamma\) can be expressed as

$$\delta \gamma = \frac{{\partial \hat{\gamma }}}{{\partial \phi_{E} }}\phi_{E} + \frac{{\partial \hat{\gamma }}}{{\partial \phi_{N} }}\phi_{N} + \frac{{\partial \hat{\gamma }}}{{\partial \phi_{U} }}\phi_{U}$$
(3.8)

where

$$\frac{{\partial \hat{\gamma }}}{{\partial \phi_{E} }} \approx \frac{{\hat{c}_{21} \hat{c}_{33} - \hat{c}_{31} \hat{c}_{23} }}{{\hat{c}_{31}^{2} + \hat{c}_{33}^{2} }},\ \ \ \frac{{\partial \hat{\gamma }}}{{\partial \phi_{N} }} \approx \frac{{\hat{c}_{31} \hat{c}_{13} - \hat{c}_{11} \hat{c}_{33} }}{{\hat{c}_{31}^{2} + \hat{c}_{33}^{2} }}, \ \ \frac{{\partial \hat{\gamma }}}{{\partial \phi_{U} }} = 0$$

Subtracting the VP-derived roll angle from the INS-derived roll angle, we have the residual

$$Z_{\gamma } = \hat{\gamma } - \tilde{\gamma } = \delta \gamma - v_{\gamma }$$
(3.9)

Therefore, the design matrix of the roll measurement is

$$\varvec{H}_{\gamma } = \left[ {\begin{array}{*{20}c} {\varvec{O}_{1 \times 3} } & {\frac{{\partial \hat{\gamma }}}{{\partial \phi_{E} }}} & {\frac{{\partial \hat{\gamma }}}{{\partial \phi_{N} }}} & {\frac{{\partial \hat{\gamma }}}{{\partial \phi_{U} }}} & {\varvec{O}_{1 \times 9} } & {\varvec{O}_{1 \times 6m} } \\ \end{array} } \right]$$
(3.10)

As can be seen from the Jacobian, the roll measurement contributes to the estimation of both horizontal attitude errors, but not the azimuth error.
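
A sketch of building the one-row design matrix of Eq. (3.10) from the current attitude estimate, with the state ordering of Eq. (2.5); the function name is assumed for illustration:

```python
import numpy as np

def roll_design_matrix(C_b2n, m):
    """H_gamma of Eqs. (3.8)-(3.10) for a state of dimension 15+6m."""
    c11, c13 = C_b2n[0, 0], C_b2n[0, 2]
    c21, c23 = C_b2n[1, 0], C_b2n[1, 2]
    c31, c33 = C_b2n[2, 0], C_b2n[2, 2]
    den = c31**2 + c33**2
    H = np.zeros((1, 15 + 6 * m))
    H[0, 3] = (c21 * c33 - c31 * c23) / den   # d(gamma)/d(phi_E)
    H[0, 4] = (c31 * c13 - c11 * c33) / den   # d(gamma)/d(phi_N)
    # H[0, 5] stays 0: the roll measurement does not observe the azimuth error
    return H
```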

3.3 Navigation Procedure

The navigation flow chart is illustrated in Fig. 1. When a new image is recorded, the key functions of the MSCKF and the VP aiding are executed: state augmentation plus feature detection and matching, and vertical line detection, respectively. If a measurement update is triggered in either vision module, the states are corrected.

Fig. 1. Navigation flowchart

4 Results and Analysis

The point-based MSCKF and the vertical-line vanishing point aiding algorithm described in the previous sections were tested using the ‘2011_09_26_drive_0036’ dataset from the KITTI benchmark [23]. The synchronized IMU data (10 Hz) and rectified grayscale images (10 Hz, global shutter) were used to verify the algorithm. The biases of the gyroscopes and accelerometers are 0.01°/s and 1 mg (see more details in [24]). The reference was the IMU/GPS solution (OXTS RT3003) with L1/L2 RTK positioning accuracy of 0.02 m and pitch/roll accuracy of 0.03°. We used 250 images with a traveled distance of about 180 m, as shown in Fig. 2.

Fig. 2. Reference and calculated trajectory

To illustrate the performance of the proposed algorithms, the GPS data was blocked over the whole trajectory, except for providing the initial values of the navigation states. We compared the performance of the following navigation schemes.

  • Free INS: only INS mechanization is performed to calculate the navigation states.

  • MSCKF-VINS with a minimum of 5 tracking poses: at least 5 poses in a feature track are required to trigger a measurement update.

  • MSCKF-VINS with a minimum of 20 tracking poses: at least 20 poses in a feature track are required to trigger a measurement update.

  • MSCKF-VINS with a minimum of 20 tracking poses and vertical line aiding: the vanishing point aiding module is added to the system, with at least 20 poses in a track required.

The trajectory results of the four navigation schemes are shown in Fig. 2. The horizontal attitude errors and position errors are presented in Figs. 3 and 4. We can see that there is only a minor difference between MSCKF-VINS with a minimum of 5 and of 20 tracking poses. A slightly better pitch estimation for MSCKF-VINS with 20 tracking poses can be seen in Fig. 3 when there are abundant features (before 14 s). One reason may be that longer tracks observe more distant landmarks, which is beneficial for pitch angle estimation.

Fig. 3. Attitude errors (pitch and roll)

Fig. 4. Horizontal position errors

An example of vertical line clustering is shown in Fig. 5, where the green line segments correspond to the vertical VP. When the vertical-line vanishing point aiding is added to the system, both pitch and roll estimates improve considerably (by up to 0.2°) compared with the MSCKF-only methods. The horizontal position error is also generally smaller than that of the MSCKF-only methods, remaining within 1.2 m. The position improvement results from the better attitude estimation, because attitude errors project gravity into the horizontal plane and thereby produce horizontal acceleration errors.

Fig. 5. Line segment groups in different colors

5 Conclusion

In this paper, we presented the MSCKF in the local-level frame, in a form common to the navigation community. To improve the performance of the point-based MSCKF, we proposed adding a VP module that observes the vertical lines of building blocks for land vehicle navigation. Based on that, a roll angle measurement model was derived. A mixed VINS using point and line observations was thus developed, and its performance was verified by real-data experiments. Horizontal attitude and position estimation are improved by using the vertical-line vanishing point aiding module.