1 Introduction

With the development of intelligent vehicle assistant driving technology, video ranging technology is rapidly entering the field and is used to measure the distance between the vehicle and surrounding objects. At present, machine vision ranging includes monocular [1], binocular [2], and multi-view [3] vision ranging technology. For binocular and multi-view vision technology, feature matching between images is the most important obstacle hindering rapid development; other limiting factors include real-time requirements and the cost of the technology. Zhao [4] put forward a real-time ranging algorithm based on image depth information. Wu [5] proposed an improved distance measurement method based on vehicle shadow. Liu [6] proposed a fast image segmentation method based on a difference image stack, together with a corresponding ranging method.

This paper proposes a ranging algorithm based on the information in license plate images. In China, the size of most license plates is \(440*140\) mm\(^2\). Thus, we can measure the distance according to this characteristic. In this paper, a Kalman predictor is used to predict the region of the target license plate, which reduces the number of license plate detections and the recognition time. The experimental results show that the method satisfies stability and real-time requirements.

2 Ranging Model and Algorithm Design

The general frame of the algorithm is shown in Fig. 1 and includes five modules. The video capture module obtains pictures from the car DVR. The image preprocessing module provides more effective information for distance ranging. The positioning module locates the license plate and marks its size with a rectangle. The ranging module calculates the distance and outputs the result. The prediction module predicts the position of the license plate in the next frame.

Fig. 1. General frame of ranging

2.1 Analysis of Algorithm

The imaging geometric model of the camera describes the relationship between the positions of all image information and their spatial coordinates. The model is generally described by the pixel coordinate system (u-v), the image coordinate system (x-y), the camera coordinate system (\(X_{c},Y_{c},Z_{c}\)) and the world coordinate system (\(X_{w}, Y_{w}, Z_{w}\)). Assume that the physical dimensions of each pixel in the u-axis and v-axis directions are \(d_{x}\) and \(d_{y}\); the image coordinate system is shown in Fig. 2.

Fig. 2. Image coordinate system

The intersection of the camera optical axis and the image plane is defined as the origin \(O_{1}\) of the image coordinate system, in which the x-axis is parallel to the u-axis and the y-axis to the v-axis. The point (\(u_{0},v_{0}\)) is the coordinate of point \(O_{1}\) in the u-v coordinate system. Here \(d_{x}\) represents the physical size of each pixel along the horizontal x-axis and \(d_{y}\) the physical size of each pixel along the vertical y-axis. The relationship between the pixel coordinate system and the image coordinate system is as follows:

$$\begin{aligned} u=\frac{x}{d_{x}}+u_{0} \end{aligned}$$
(1)
$$\begin{aligned} v=\frac{y}{d_{y}}+v_{0} \end{aligned}$$
(2)

In homogeneous form, the relationship can be expressed as follows:

$$\begin{aligned} \left[ \begin{array}{c} u\\ v\\ 1 \end{array} \right] = \left[ \begin{array}{ccc} \frac{1}{d_{x}} &{} 0 &{} u_{0}\\ 0 &{} \frac{1}{d_{y}} &{} v_{0}\\ 0 &{} 0 &{} 1 \end{array} \right] \left[ \begin{array}{c} x\\ y\\ 1 \end{array} \right] \end{aligned}$$
(3)
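For illustration, a minimal Python sketch of this conversion follows; the pixel size and principal point are assumed example values, not calibrated ones.

```python
import numpy as np

# Illustrative values only (not calibrated parameters).
dx, dy = 0.0028, 0.0028   # physical size of one pixel (mm), assumed
u0, v0 = 640, 360         # principal point in pixel coordinates, assumed

# Homogeneous transform from image coordinates (x, y) in mm
# to pixel coordinates (u, v), as in Eq. (3).
K = np.array([[1 / dx, 0,      u0],
              [0,      1 / dy, v0],
              [0,      0,      1]])

x, y = 1.12, -0.56                   # example image-plane point (mm)
u, v, _ = K @ np.array([x, y, 1.0])  # apply Eq. (3)
print(u, v)
```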

2.2 The Linear Model of Camera Imaging

The relationship between the camera coordinate system and the world coordinate system is shown in Fig. 3. The point O is the optical center. The \(X_{c}\)-axis and \(Y_{c}\)-axis are parallel to the x-axis and y-axis of the image coordinate system, respectively, and \(Z_{c}\) is the optical axis of the camera. The intersection of the optical axis and the image plane is point \(O_{1}\). The camera coordinate system is made up of the point O and the \(X_{c}\)-, \(Y_{c}\)- and \(Z_{c}\)-axes.

Fig. 3. Camera coordinate system and world coordinate system

The world coordinate system can be converted to the camera coordinate system; the transformation is shown in (4):

$$\begin{aligned} \left[ \begin{array}{c} X_{c}\\ Y_{c}\\ Z_{c}\\ 1 \end{array} \right] = \left[ \begin{array}{cc} {{\varvec{R}}} &{} {{\varvec{T}}}\\ 0^{T} &{} 1 \end{array} \right] \left[ \begin{array}{c} X_{w}\\ Y_{w}\\ Z_{w}\\ 1 \end{array} \right] = M_{1}\left[ \begin{array}{c} X_{w}\\ Y_{w}\\ Z_{w}\\ 1 \end{array} \right] \end{aligned}$$
(4)

R is an orthogonal rotation matrix and T is a translation vector; together they form \(M_{1}\), the external (extrinsic) parameter matrix of the camera. The conversion between the camera coordinate system and the image coordinate system can be expressed as follows:

$$\begin{aligned} Z_{c}\left[ \begin{array}{c} x\\ y\\ 1 \end{array} \right] = \left[ \begin{array}{cccc} f &{} 0 &{} 0 &{} 0\\ 0 &{} f &{} 0 &{} 0\\ 0 &{} 0 &{} 1 &{} 0 \end{array} \right] \left[ \begin{array}{c} X_{c}\\ Y_{c}\\ Z_{c}\\ 1 \end{array} \right] \end{aligned}$$
(5)

Substituting (4) and (5) into (3), we obtain the complete model:

$$\begin{aligned} Z_{c}\left[ \begin{array}{c} u\\ v\\ 1 \end{array} \right] = \left[ \begin{array}{cccc} \alpha _{x} &{} 0 &{} u_{0} &{} 0\\ 0 &{} \alpha _{y} &{} v_{0} &{} 0\\ 0 &{} 0 &{} 1 &{} 0 \end{array} \right] \left[ \begin{array}{cc} R &{} T\\ 0^{T} &{} 1 \end{array} \right] \left[ \begin{array}{c} X_{w}\\ Y_{w}\\ Z_{w}\\ 1 \end{array} \right] = M_{2}M_{1}\left[ \begin{array}{c} X_{w}\\ Y_{w}\\ Z_{w}\\ 1 \end{array} \right] \end{aligned}$$
(6)

Here \( \alpha _{x}=f/d_{x}\) and \( \alpha _{y}=f/d_{y}\), and \(M_{2}\) is the internal (intrinsic) parameter matrix of the camera. The product \(M_{2}M_{1}\) can be obtained by camera calibration.
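To make the linear model of (6) concrete, a minimal numerical sketch follows; the focal length, pixel size, principal point, and camera pose are illustrative assumptions, not calibrated values.

```python
import numpy as np

# Assumed parameters for illustration only.
f = 4.0                    # focal length (mm), assumed
dx = dy = 0.0028           # pixel size (mm), assumed
ax, ay = f / dx, f / dy    # alpha_x, alpha_y
u0, v0 = 640, 360          # principal point, assumed

# M2: internal (intrinsic) parameter matrix, 3x4.
M2 = np.array([[ax, 0,  u0, 0],
               [0,  ay, v0, 0],
               [0,  0,  1,  0]])

# M1: external (extrinsic) parameters; here an identity rotation and a
# translation placing the world origin 2 m in front of the camera (assumed).
R = np.eye(3)
T = np.array([[0.0], [0.0], [2000.0]])   # mm
M1 = np.vstack([np.hstack([R, T]), [0, 0, 0, 1]])

Pw = np.array([220.0, 70.0, 0.0, 1.0])   # a world point (mm), e.g. a plate corner
uvw = M2 @ M1 @ Pw                       # Z_c * [u, v, 1]^T, as in Eq. (6)
u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
print(u, v)
```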

2.3 Camera Calibration

Through camera calibration, we can obtain the three-dimensional position and orientation of the camera in the world coordinate system [7]. As a general rule, the size of an object in the image is inversely proportional to its distance from the camera. This paper uses the checkerboard target calibration method to calibrate the camera. The checkerboard calibration template is shown in Fig. 4. The size of each black or white square is \(10*10\) mm\(^2\).

Fig. 4. Checkerboard calibration template

We photograph the checkerboard calibration template at different distances, ensuring that only the template appears in each shot, and then calculate the image area occupied by the template at each distance. Based on (6) and following [8], the camera linear model can be simplified into (7):

$$\begin{aligned} d=m\sqrt{s} \end{aligned}$$
(7)

In (7), s is the cross-sectional area covered by the image at the target distance (see \(S_{a}\) below), and m is a coefficient obtained by tests at different distances, such as 100 mm, 150 mm, 200 mm and 250 mm. According to the results of the camera calibration test, the arithmetic mean of m is 1.298 (Table 1).

Table 1. Test results of camera calibration
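As an illustration of this calibration step, a minimal Python sketch follows. The board size and the measured pixel areas below are placeholders chosen to reproduce m = 1.298, not the paper's measured data; a 7 x 7 board of 10 x 10 mm squares is assumed.

```python
import numpy as np

TEMPLATE_AREA = 7 * 7 * 100.0   # physical template area in mm^2 (assumed)
N_A = 1280 * 720                # pixel area of the whole image

# (distance in mm, template pixel area N_b) -- placeholder measurements.
shots = [(100.0, 760760.0), (150.0, 338100.0),
         (200.0, 190190.0), (250.0, 121730.0)]

ms = []
for d, n_b in shots:
    s_a = TEMPLATE_AREA * N_A / n_b   # cross-section covered by the image (mm^2)
    ms.append(d / np.sqrt(s_a))       # invert Eq. (7): m = d / sqrt(s)

print(f"m = {np.mean(ms):.3f}")       # ~1.298 with these placeholder numbers
```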

We define \(N_{a}\) as the pixel area of the entire image and \(N_{b}\) as the pixel area of the rectangle identified by the positioning module. At the same time, \(S_{a}\) represents the cross-sectional area of the plane where the license plate is located, and \(S_{b}\) is the actual plate area, \(440*140\) mm\(^2\). The relationship is given in (8):

$$\begin{aligned} \frac{N_{a}}{N_{b}}=\frac{S_{a}}{S_{b}} \end{aligned}$$
(8)

Hence,

$$\begin{aligned} S_{a}=61600*\frac{N_{a}}{N_{b}}(mm^{2}) \end{aligned}$$
(9)

Based on (9) and the value of m, the actual distance from the camera to the front license plate can be calculated by (10):

$$\begin{aligned} d=1.298*\sqrt{61600*\frac{N_{a}}{N_{b}}}\,(mm) \end{aligned}$$
(10)
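A short Python sketch of (10) follows; the example plate rectangle and frame size are assumed values for illustration.

```python
import math

PLATE_AREA_MM2 = 440 * 140   # 61600 mm^2, standard Chinese license plate
M = 1.298                    # calibration coefficient from Sect. 2.3

def plate_distance_mm(frame_w, frame_h, plate_w, plate_h):
    """Distance to the plate via Eq. (10); all arguments are in pixels."""
    n_a = frame_w * frame_h  # N_a: pixel area of the whole image
    n_b = plate_w * plate_h  # N_b: pixel area of the plate rectangle
    return M * math.sqrt(PLATE_AREA_MM2 * n_a / n_b)

# Example: a hypothetical 110 x 35 px plate rectangle in a 1280 x 720 frame.
print(plate_distance_mm(1280, 720, 110, 35) / 1000.0, "m")  # ~5.0 m
```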

2.4 Kalman Predictor

The Kalman predictor is a common method of target state estimation. It further improves the efficiency of the algorithm by estimating the license plate position, which reduces the search range for detection, recognition and matching. The basic principle of the Kalman predictor is to use the measured value to correct the estimated state and finally obtain a reliable state estimate. The state and observation equations are:

$$\begin{aligned} {{\varvec{X}}}(k+1)={{\varvec{A}}}{{\varvec{X}}}(k)+{{\varvec{BW}}}(k) \end{aligned}$$
(11)
$$\begin{aligned} {{\varvec{Y}}}(k)={{\varvec{H}}}{{\varvec{X}}}(k)+{{\varvec{V}}}(k) \end{aligned}$$
(12)

In this paper, k is the sequence number of each frame in the video. X is the system state vector, W is the system noise vector, Y is the observation vector, V is the observation noise vector, A is the state transition matrix, B is the noise driving matrix, and H is the observation matrix. The prediction and update equations can be obtained from Kalman prediction theory. The prediction equations are as follows:

$$\begin{aligned} {{\varvec{X}}}^{'}(k+1|k)={{\varvec{A}}}(k+1,k){{\varvec{X}}}^{'}(k|k) \end{aligned}$$
(13)
$$\begin{aligned} {{\varvec{P}}}(k+1|k)={{\varvec{A}}}(k+1,k){{\varvec{P}}}(k|k){{\varvec{A}}}^{T}(k+1,k)+{{\varvec{Q}}}(k) \end{aligned}$$
(14)

The update equations are as follows:

$$\begin{aligned} {{\varvec{K}}}(k+1)={{\varvec{P}}}(k+1|k){{\varvec{H}}}^{T}(k+1)\left[ {{\varvec{H}}}(k+1){{\varvec{P}}}(k+1|k){{\varvec{H}}}^{T}(k+1)+{{\varvec{R}}}(k+1) \right] ^{-1} \end{aligned}$$
(15)
$$\begin{aligned} {{\varvec{X}}}^{'}(k+1|k+1)={{\varvec{X}}}^{'}(k+1|k)+{{\varvec{K}}}(k+1)\left[ {{\varvec{Y}}}(k+1)-{{\varvec{H}}}(k+1){{\varvec{X}}}^{'}(k+1|k) \right] \end{aligned}$$
(16)
$$\begin{aligned} {{\varvec{P}}}(k+1|k+1)=\left[ {{\varvec{I}}}-{{\varvec{K}}}(k+1){{\varvec{H}}}(k+1)\right] {{\varvec{P}}}(k+1|k) \end{aligned}$$
(17)

Here, I is the identity matrix and P is the estimate covariance matrix. When a violation occurs, the speed of the vehicle is relatively low. Therefore, in the experiments referred to in this paper, the vehicle in the video can be assumed to move almost uniformly. The horizontal and vertical coordinates of the center of the identified rectangle are defined as the initial measurements of the Kalman predictor. The relevant parameters in this experiment are defined as follows: A = 1, B = 1, H = 1, Q = 0, R = [4,0;0,1], P(0) = [2,0;0,1].
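A minimal sketch of the predictor with these parameters follows. The state is the (u, v) center of the identified rectangle; A = B = H = 1 are taken per axis (identity matrices), Q = 0 means the BW noise term drops out of the prediction, and the initial center is an assumed value.

```python
import numpy as np

A = np.eye(2)
H = np.eye(2)
Q = np.zeros((2, 2))
R = np.array([[4.0, 0.0], [0.0, 1.0]])
P = np.array([[2.0, 0.0], [0.0, 1.0]])   # P(0)
x = np.array([640.0, 360.0])             # assumed first measured center

def kalman_step(z):
    """One predict/correct cycle; z is the measured rectangle center (u, v)."""
    global x, P
    x_pred = A @ x                            # Eq. (13): state prediction
    P_pred = A @ P @ A.T + Q                  # Eq. (14): covariance prediction
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)  # Eq. (15): gain
    x = x_pred + K @ (z - H @ x_pred)         # Eq. (16): state update
    P = (np.eye(2) - K @ H) @ P_pred          # Eq. (17): covariance update
    return x_pred                             # predicted center for the next frame

print(kalman_step(np.array([652.0, 364.0])))
```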

3 Ranging Experiment and Result Analysis

In order to verify the accuracy and real-time performance of the algorithm, vehicle experiments were arranged in both static and dynamic conditions. The experimental procedure is as follows (a code sketch of this loop is given below):

Step 1: Get each frame image from the video.
Step 2: Preprocess the image.
Step 3: Match the license plate in the given area and mark the license plate region with a rectangle.
Step 4: Calculate the distance and output the result.
Step 5: Based on the result of Step 3, estimate the area where the license plate may appear in the next frame. The area is centered on the centroid of the rectangle identified in the previous frame, while the side lengths are 1.5 times the previous ones.
Step 6: If the license plate can be matched in the area from Step 5, proceed to Step 4; otherwise, return to Step 3.
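The sketch below outlines Steps 1-6 under assumptions: `detect_plate` is a hypothetical caller-supplied function (e.g. a cascade or template matcher) that takes a grayscale image and an optional ROI (x, y, w, h) and returns the plate rectangle in full-frame coordinates or None, and `plate_distance_mm` is the helper sketched after Eq. (10).

```python
import cv2

def run(video_path, detect_plate):
    cap = cv2.VideoCapture(video_path)
    roi = None                                  # None -> search the whole frame
    while True:
        ok, frame = cap.read()                  # Step 1: get the frame image
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # Step 2: preprocess
        rect = detect_plate(gray, roi)          # Step 3 (or Step 6 retry)
        if rect is None:
            roi = None                          # Step 6: fall back to full frame
            continue
        x, y, w, h = rect
        d = plate_distance_mm(frame.shape[1], frame.shape[0], w, h)
        print(f"distance: {d / 1000:.2f} m")    # Step 4: output the result
        # Step 5: predict the next search area, centered on the centroid of
        # the current rectangle, with side lengths enlarged 1.5 times.
        cx, cy = x + w / 2, y + h / 2
        roi = (int(cx - 0.75 * w), int(cy - 0.75 * h), int(1.5 * w), int(1.5 * h))
    cap.release()
```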

3.1 Static Vehicle Experiment and Result Analysis

The experimental environment is shown in Fig. 5. The distance between the license plate and the rear of the vehicle is 5 cm. Positions at distances of 1 m, 1.5 m, 2 m, 2.5 m, and so on from the rear are then marked. The camera used in the experiment is a Logitech C270 with a maximum resolution of \(1280\,*\,720\); thus, the value of \(N_{a}\) is 921600. We installed the camera at a fixed height of 1.2 m and an inclination angle of 87\(^{\circ }\), and conducted two groups of experiments. The ranging algorithm is implemented with the OpenCV-3.1.0 visual processing library. According to Table 2, the algorithm satisfies the requirements of static ranging.

Fig. 5. Experiment environment

Table 2. Experiment results of static distance ranging

3.2 Dynamic Vehicle Experiment

To further verify the algorithm, we conducted a dynamic vehicle experiment. The video runs at about 25 frames per second. Figure 6 shows a test on a campus road, and Fig. 7 shows a test on an urban center road. Compared with the urban center road, the speed of vehicles on campus is slower and the surrounding environment is relatively simple. In Figs. 6 and 7, the license plate has been marked, and the results are displayed in the image.

Fig. 6. Distance ranging on campus road

Fig. 7. Distance ranging on urban center road

The dynamic experiment shows that the monocular ranging algorithm proposed in this paper can be used for dynamic vehicle distance measurement. The red figures in the corner of each image indicate the time consumed by the distance measurement, which is in the range of 0.1 to 0.3 s. The algorithm can therefore effectively meet the real-time requirements of ranging.

4 Conclusion

This paper proposes a monocular ranging algorithm that measures the distance to the vehicle ahead based on license plate images. The efficiency of the algorithm is improved by a Kalman predictor. In the intended scenarios, the speed of the vehicle and the distance to be measured are relatively small. According to the results of the static and dynamic experiments, the proposed algorithm meets the requirements of vehicle distance detection and has good robustness and accuracy.