1 Introduction

Intelligent transportation has aroused broad social concern in recent decades; research can be divided by application scenario, such as ground [1,2,3], underwater [4,5,6] and aerial [7] ones. Traversable road detection is a core task in the field of autonomous driving. Numerous implementations have been proposed based on various sensors, among which vision-based and lidar-based methods are the most popular. Regarding vision-based methods, cameras are sensitive to the surroundings and are significantly affected by illumination. Although infrared cameras handle light disturbance much better than RGB ones, they fail at environmental perception when rain, smoke or sand covers the lens. Furthermore, it is widely agreed that range data is necessary to construct an accurate driving environment model. Therefore, the information provided by a monocular camera, one branch of vision-based methods, is not adequate. Another branch, stereo vision, provides range information, but not accurately enough for vehicle safety. Worse still, its error accumulates along depth, which can lead to meter-scale deviation at a distance ahead of the vehicle.

LiDAR performs much better at evaluating the distance from the vehicle to obstacles and can construct an excellent world model with a dense point cloud. However, the laser emitted by lidar is easily disturbed: rain, fog and sand corrupt the output with heavy noise or even blocked regions, which means it cannot tolerate many natural environments. To make matters worse, lidar is quite pricey; it is normally used on experimental platforms and has not spread widely in the car industry. In contrast, millimeter-wave (MMW) radar, which is stable and inexpensive, becomes a suitable choice for environmental perception in autonomous driving. In Fig. 1, red pixels in the color and infrared images indicate detected road boundaries, and blue pixels mark candidate negative obstacles. The grid map is the two-dimensional projection of the LiDAR and MMW radar outputs, in which white and red pixels indicate obstacles higher and lower than 1 meter respectively detected by LiDAR, while blue blocks surrounded by red squares are objects detected by MMW radar. As shown in Fig. 1, in the heavy smoke scenario, vision-based perception methods are completely invalid, whether using the color or the infrared camera. The laser cannot pierce the heavy smoke, producing a heap of false obstacles ahead of the experimental platform. However, the MMW radar still works well: the obstacles behind the heavy smoke are detected stably, shown as blue blocks surrounded by red squares.

Fig. 1 Performance of sensors in heavy smoke scenario

2 Related work

Researchers have studied applications of MMW radar in the field of precision navigation for decades, especially military applications, such as missiles mounted with radar-infrared integrated seekers or satellites equipped with radar detectors for space observation. In recent years, with decreasing price and shrinking size, MMW radar as an automotive radar is playing an increasingly important role in active safety and autonomous navigation systems. MMW radar can immediately provide the distance and speed of obstacles ahead of the vehicle. Compared to lidar, it offers all-weather operation and satisfactory detection capability. To our knowledge, an extensive investigation of automotive radar on vehicles was presented by Grimes [8,9,10], which spurred the development of radar applications in automobiles. Park [11] performed data association and moving-object tracking by nearest-neighbor clustering based on a MASRAU0025 24 GHz MMW radar. To improve the detection efficiency of vehicles and guard-rails, Alessandretti [12] fused vision sensors with two 24 GHz scanning radars to obtain image and range data simultaneously. [13] fused stereo vision with a 24 GHz radar and adopted an extended Kalman filter (EKF) to track vehicles and other obstacle contours. Bertozzi et al. [14] integrated an inertial sensor after fusing camera and radar information to accomplish road obstacle detection and classification. Xiao [15, 16], inspired by the cooperation between cone cells and rod cells in human vision, presented a method fusing MMW radar and a monocular vision sensor, which provide coarse position and precise contour respectively, just as the two cell types do in human vision. In Feng’s paper [17], a new type of road marking is introduced, which is much easier for radar to observe.

Road boundaries are the demarcation line between road and non-road regions. They contain salient features that can be used in road detection tasks. Precise road boundary localization can ensure highly accurate road segmentation, so methods based on multiple sensors for road boundary detection [18, 19] have been developed over the past decades, especially methods based on images and LiDAR.

Although there is already some related research on road detection and boundary localization, MMW radar application to autonomous driving is still insufficient. Therefore, this paper provides an approach for detecting on-road obstacles as well as road boundary positions in autonomous driving scenes. In recent years, commercial MMW radars have become increasingly available and affordable. They generally output measurement information of detected targets directly: the original radar data is already clustered and output in the form of point targets for convenient processing. In this paper, a commercial MMW radar (Continental ARS 408-21, 77 GHz) is used to detect obstacles and determine road boundaries. The Continental ARS 408-21 provides information about detected targets instead of raw wave data. The main contributions of this paper are as follows: First, a radar-based road model construction procedure is presented in detail, including static and dynamic obstacle separation, data filtering and multi-frame fusion based on a modified Bayesian prediction. Second, based on the constructed scene model, a modified RANSAC method is introduced to locate road boundaries, which meets the real-time requirement of unmanned ground vehicles (UGVs).

The rest of this paper is organized as follows: Our radar-based road detection approach is introduced in Section 3. Section 4 shows the experiments and corresponding results of our approach, along with a discussion of these results. We draw conclusions and outline future work in Section 5.

3 Methods

3.1 Preprocessing of radar data

In order to construct a radar-based road model for subsequent road detection, we first need to choose a description model. Since MMW radar can only evaluate the environment on a two-dimensional plane, an occupancy grid map is a good choice, as it represents the surrounding environment specifically and intuitively. The ARS 408-21 sensor uses radar radiation to analyze its surroundings. The reflected signals are processed and, after multiple steps, are made available in the form of clusters and objects. Clusters are radar reflections with information such as position, velocity and signal strength; they are re-evaluated every cycle. In contrast, objects have a history and dimensions; they consist of tracked clusters. Here, we choose the cluster mode of the ARS 408-21 radar for richer and more real-time information. In Fig. 2, a road image (Fig. 2a) and the Continental radar detection output using the official software (Fig. 2b) are illustrated. Clusters detected by the radar are labeled in different colors, which indicate their corresponding RCS characteristics.

Fig. 2 Single-frame illustration

In this paper, to make full use of the radar information, we apply a preprocessing procedure at the very beginning. We adopt the cluster ambiguous state (ClusterAmbigState) and invalid state (ClusterInvalidState) provided by the Continental ARS 408-21 to filter the detected clusters. ClusterAmbigState returns 5 kinds of values: invalid, ambiguous, staggered ramp, unambiguous and stationary candidates. We keep the clusters with the last 2 characteristics and abandon the others. Similarly, ClusterInvalidState has 12 kinds of states, and we keep the clusters with the characteristics: valid, valid cluster with low RCS, valid cluster but no local maximum, or valid cluster with high multi-target probability. As shown in Fig. 3, some invalid clusters, colored in red, are eliminated and the rest are kept. Next, based on the inertial measurement unit (IMU), the vehicle's running state can be measured, from which the absolute speed of each cluster can be determined. As shown in Eq. 1, a threshold is set to separate dynamic and static clusters. The blue points in Fig. 3 indicate static clusters, while the green ones represent dynamic targets.

$$ \begin{array}{@{}rcl@{}} \left\{ \begin{array}{lll} |V_{Cluster} - V_{Vehicle}| &\!>\! Thre \rightarrow Cluster \in Dynamic \\ |V_{Cluster} - V_{Vehicle}| &\!\leq\! Thre \rightarrow Cluster \in Static \end{array} \right. \end{array} $$
(1)
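For concreteness, a minimal Python sketch of this preprocessing step (state filtering plus the speed threshold of Eq. 1) is given below; the cluster field names, state labels and the threshold value are illustrative assumptions rather than the exact interface of the ARS 408-21 driver.

```python
# Hypothetical cluster record: a dict with keys "x", "y", "speed" (m/s, as used in Eq. 1),
# "ambig_state" and "invalid_state"; the field names are assumptions, not the actual
# signal names of the ARS 408-21 driver.
KEEP_AMBIG = {"unambiguous", "stationary_candidate"}
KEEP_INVALID = {"valid", "valid_low_rcs",
                "valid_no_local_max", "valid_high_multitarget_prob"}

def preprocess_clusters(clusters, v_vehicle, speed_thre=0.5):
    """Filter clusters by their state flags, then split static/dynamic by Eq. 1.

    v_vehicle: ego speed from the IMU (m/s); speed_thre: assumed threshold (m/s)."""
    static, dynamic = [], []
    for c in clusters:
        if c["ambig_state"] not in KEEP_AMBIG or c["invalid_state"] not in KEEP_INVALID:
            continue  # discard invalid or ambiguous clusters
        if abs(c["speed"] - v_vehicle) > speed_thre:
            dynamic.append(c)  # moving target (Eq. 1, upper branch)
        else:
            static.append(c)   # static target (Eq. 1, lower branch)
    return static, dynamic
```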
Fig. 3 Preprocessing of radar data

Then, the static clusters are projected onto an occupancy grid map. Because static and dynamic clusters are split, smearing can be largely avoided when fusing multiple static occupancy grid map frames. A grid is considered occupied if an echo is received there, and the gap region between an occupied grid and the sensor is considered free. In the other branch, considering the impact of noise, the dynamic points colored in green in Fig. 3 are clustered using the DBSCAN algorithm, a density-based method, as shown in Algorithm 1.

Algorithm 1 DBSCAN clustering of dynamic clusters
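Since the listing of Algorithm 1 is not reproduced here, the following sketch shows how the dynamic points could be grouped with the standard DBSCAN implementation from scikit-learn; the eps and min_samples values are assumed parameters, not those used in the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_dynamic_points(points_xy, eps=1.5, min_samples=2):
    """Group dynamic radar points (N x 2 array of x, y in meters) into moving objects.

    eps (neighborhood radius, m) and min_samples are assumed values."""
    if len(points_xy) == 0:
        return {}
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xy)
    clusters = {}
    for label, point in zip(labels, points_xy):
        if label == -1:
            continue  # label -1 marks noise points rejected by DBSCAN
        clusters.setdefault(label, []).append(point)
    # Represent each moving object by the centroid of its cluster.
    return {k: np.asarray(v).mean(axis=0) for k, v in clusters.items()}
```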

3.2 Multi-frame fusion

Due to the sparsity of radar detections, we need to fuse a temporal sequence of static occupancy grid maps to construct a complete environment model. Here, to continuously locate the vehicle position, we combine the radar with a GPS (Global Positioning System) receiver and an IMU (Inertial Measurement Unit). We treat each static OGM as a plane with 6 degrees of freedom in world space. First, a global origin and the 3 positive directions of the space are initialized. Then, all OGMs are transformed into world space and projected onto the ground surface coordinate system. Finally, the current vehicle surroundings model is generated by mapping the world-coordinate OGM into the vehicle coordinate system. We choose a recorded fixed location not too far from the vehicle as the global origin, so that the ground plane assumption holds. East and North are the positive directions of axes \(X^{\prime }\) and \(Y^{\prime }\) in the world coordinate system. The vehicle coordinate system is established by setting the origin at the vehicle center, with right and forward being the positive directions of axes X and Y respectively. The offsets from the global origin in the \(X^{\prime }\) and \(Y^{\prime }\) directions, Δx and Δy, can be calculated from the longitude and latitude measured by GPS. The Euler angles (pitch α, yaw β and roll γ) are measured by the IMU; the transformation of a point (x, y) from vehicle to world coordinates \((x^{\prime },y^{\prime })\) is then defined as follows. Details are illustrated in Fig. 4, where three views of the vehicle are presented. In the front and side views, pitch α and roll γ are used to project the vehicle-to-obstacle distance onto the ground plane. In the top view, yaw β and the offset between the two coordinate systems are shown.

$$ \begin{array}{@{}rcl@{}} \left\{ \begin{array}{lll} x^{\prime} &= x\cos\gamma\cos\beta + y\cos\alpha\sin\beta + {\varDelta} x\\ y^{\prime} &= -x\cos\gamma\sin\beta + y\cos\alpha\cos\beta + {\varDelta} y \end{array} \right. \end{array} $$
(2)
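A minimal sketch of the projection in Eq. 2 is given below, assuming angles in radians and GPS offsets already converted to meters; it illustrates the formula rather than the exact on-board implementation.

```python
import numpy as np

def vehicle_to_world(x, y, pitch, yaw, roll, dx, dy):
    """Project a vehicle-frame point (x, y) onto the world ground plane (Eq. 2).

    pitch/yaw/roll: IMU Euler angles (rad); dx, dy: GPS offsets of the vehicle
    from the global origin along X' (East) and Y' (North)."""
    xw = x * np.cos(roll) * np.cos(yaw) + y * np.cos(pitch) * np.sin(yaw) + dx
    yw = -x * np.cos(roll) * np.sin(yaw) + y * np.cos(pitch) * np.cos(yaw) + dy
    return xw, yw
```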
Fig. 4 Transformation between coordinate systems

Next, after a series of frames has been projected, we must decide whether each grid cell is occupied. Bayesian estimation is widely used to predict the occupancy status of a grid. In the OGM, we use p(s = 1) to represent the probability that a grid is free, and p(s = 0) that it is occupied. The grid status is defined as \(Odd(s)=\frac {p(s=1)}{p(s=0)}\). When a new measurement z ∈ {0,1} arrives, we need to update the status of this grid. Denoting the status value before the new measurement as Odd(s), the updated status value is defined as \(Odd(s|z)=\frac {p(s=1|z)}{p(s=0|z)}\). Bayesian estimation is then applied as follows.

$$ \left\{ \begin{array}{lll} p(s=1|z)=\frac{p(z|s=1)p(s=1)}{p(z)} \\ p(s=0|z)=\frac{p(z|s=0)p(s=0)}{p(z)} \end{array} \right. \rightarrow Odd(s|z)=\frac{p(z|s=1)}{p(z|s=0)}Odd(s) $$
(3)

Then, a logarithm procedure is attached as follow.

$$ \log Odd(s|z)=\log \frac{p(z|s=1)}{p(z|s=0)}+\log Odd(s) $$
(4)

After taking the logarithm, the only measurement term in this model is \(\log \frac {p(z|s=1)}{p(z|s=0)}\), a preset value named Vmeas; so we define \(Vfree = \log \frac {p(z=1|s=1)}{p(z=1|s=0)}\) and \(Voccu = \log \frac {p(z=0|s=1)}{p(z=0|s=0)}\) as the two constant terms of this model. However, standard Bayesian estimation weights all measurements equally over time, which is not consistent with reality. In real life, the latest events are the most likely to influence the current status. We believe that the effect of each measurement decays exponentially over time, so we introduce an exponential distribution to give each previous Vmeas a weight, shown as:

$$ \log Odd(s|z) = \omega_{z} Vmeas(z)+\sum\limits_{n=1}^{s}\omega_{n} Vmeas(n) $$
(5)

in which \(\omega_{z}, \omega_{s}, \omega_{s-1}, \ldots, \omega_{0}\) conform to an exponential distribution. To further simplify our modified Bayesian model, the update can be written as:

$$ \log Odd(s|z) = Vmeas(z) + \omega\log Odd(s) $$
(6)

where ω is a manually set attenuation coefficient that controls the influence of former measurements. Using the update function above, we can fuse multiple frames in the world coordinate system. Finally, the accumulated world grid is transformed back into the local vehicle coordinate system as follows. Here, x, y still stand for vehicle coordinates, while \(x^{\prime },y^{\prime }\) represent the world coordinates of each point.

$$ \begin{array}{@{}rcl@{}} \left\{ \begin{array}{lll} x = (x^{\prime}-{\varDelta} x)\cos\beta-(y^{\prime}-{\varDelta} y)\sin\beta\\ y = (x^{\prime}-{\varDelta} x)\sin\beta+(y^{\prime}-{\varDelta} y)\cos\beta \end{array} \right. \end{array} $$
(7)
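A compact sketch of the per-frame fusion step is given below: the modified log-odds update of Eq. 6 followed by the back-projection of Eq. 7. The values of Vfree, Voccu and ω are assumed, and the world grid is held as a NumPy array of log-odds values; this is an illustration, not the on-board implementation.

```python
import numpy as np

V_FREE, V_OCCU = 0.4, -0.85   # assumed measurement terms Vfree / Voccu (log-likelihood ratios)
OMEGA = 0.9                   # assumed attenuation coefficient, 0 < omega <= 1

def update_log_odds(log_odds, meas):
    """Eq. 6: attenuate former evidence by omega, then add the new measurement term.

    log_odds: 2-D array of log Odd(s) per world grid cell.
    meas:     2-D array with 1 for cells observed free, 0 for cells observed occupied,
              NaN for cells not observed in this frame."""
    observed = ~np.isnan(meas)
    v_meas = np.where(meas == 1, V_FREE, V_OCCU)
    log_odds[observed] = v_meas[observed] + OMEGA * log_odds[observed]
    return log_odds

def world_to_vehicle(xw, yw, yaw, dx, dy):
    """Eq. 7: map a world-frame point (x', y') back into the vehicle frame."""
    x = (xw - dx) * np.cos(yaw) - (yw - dy) * np.sin(yaw)
    y = (xw - dx) * np.sin(yaw) + (yw - dy) * np.cos(yaw)
    return x, y
```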

In Fig. 5, a road image (Fig. 5a) and the corresponding radar occupancy grid map after multi-frame fusion and post-processing (Fig. 5b) are illustrated.

Fig. 5 Multi-frame fusion

3.3 Road boundaries detection based on modified RANSAC

Now that the driving scene has been constructed, the road boundaries can be vaguely observed in Fig. 5b. To further emphasize the road boundaries for subsequent detection, an edge detection algorithm is applied to the fused occupancy grid map. The boundaries are visible but surrounded by considerable interference, which makes them difficult to locate. To extract structured information from such a noisy input, the RANSAC (Random Sample Consensus) model is one of the best choices.

RANSAC can be applied as a line-fitting model that rejects outliers and estimates from inliers. Traditional RANSAC line detection selects 2 points randomly and forms a line through these 2 sample points. Next, every other observed point is evaluated by computing its distance to the formed line, to determine whether it is an outlier or an inlier, and the number of inliers belonging to this line model is counted. If this number is large enough and greater than that of the former best line model, the line is taken as the new best prediction. These steps are repeated until the best-scoring line is found with sufficiently high probability.
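For reference, a minimal sketch of this traditional RANSAC line fit follows; the iteration count and the inlier distance threshold are assumed values.

```python
import numpy as np

def ransac_line(points, n_iters=200, dist_thre=0.5, seed=0):
    """Classic RANSAC line fit on an N x 2 point set.

    Returns the two sample points of the best line and its inlier count."""
    rng = np.random.default_rng(seed)
    best_pair, best_score = None, -1
    for _ in range(n_iters):
        p1, p2 = points[rng.choice(len(points), size=2, replace=False)]
        dx, dy = p2 - p1
        norm = np.hypot(dx, dy)
        if norm < 1e-6:
            continue  # degenerate sample, skip
        # Perpendicular distance of every point to the line through p1 and p2.
        dist = np.abs(dx * (points[:, 1] - p1[1]) - dy * (points[:, 0] - p1[0])) / norm
        score = int(np.sum(dist < dist_thre))  # number of inliers
        if score > best_score:
            best_pair, best_score = (p1, p2), score
    return best_pair, best_score
```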

To improve the efficiency of the RANSAC method and adapt it to the driving scene, we modify it for road boundary detection. There are 3 main differences between traditional RANSAC line detection and our modified version. First, an edge-detection step is added to reduce algorithmic complexity, and instead of selecting 2 points randomly from the whole map, we select points from an ROI (Region of Interest); furthermore, pairs with unreasonable slopes are abandoned. Second, we keep the top-scoring lines of the left and right boundaries simultaneously, based on different ROIs. Third, instead of the time-consuming distance calculation used to separate outliers from inliers, we use band-template accumulation to count the inliers. The improved RANSAC boundary detection is detailed in Algorithm 2 and illustrated in Fig. 6. In Fig. 6a, the left-bottom quarter of the edge-detection image is the ROI used to extract the left boundary Lleft, while the right-bottom quarter is used for Lright. The black line in Fig. 6a indicates a candidate right boundary waiting to be evaluated, and the yellow band is the counting area of this line. All points within the yellow band are considered inliers of the candidate boundary, and their total number is taken as the score of that line.

Fig. 6 Road boundaries detection based on modified RANSAC

Algorithm 2 Road boundary detection based on modified RANSAC
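The following sketch illustrates the spirit of the modified procedure for one (right) boundary; the ROI split, the slope limit, the band half-width and the score threshold are all assumed values, and the band-template accumulation is approximated here by a simple perpendicular-distance band rather than the authors' exact template from Algorithm 2.

```python
import numpy as np

def detect_right_boundary(edge_points, grid_w, grid_h, n_iters=300,
                          band_half_width=2.0, min_slope=0.5, score_thre=30, seed=0):
    """Modified RANSAC for the right road boundary on an edge-detected OGM.

    edge_points: N x 2 array of (x, y) edge cells with the vehicle at the bottom
    center, x to the right and y forward. All numeric parameters are assumptions."""
    rng = np.random.default_rng(seed)
    # ROI: right-bottom quarter of the map (the left boundary uses the left-bottom quarter).
    roi = edge_points[(edge_points[:, 0] >= grid_w / 2) & (edge_points[:, 1] <= grid_h / 2)]
    if len(roi) < 2:
        return None
    best_line, best_score = None, score_thre  # keep only lines scoring above the preset threshold
    for _ in range(n_iters):
        p1, p2 = roi[rng.choice(len(roi), size=2, replace=False)]
        dx, dy = p2 - p1
        norm = np.hypot(dx, dy)
        if norm < 1e-6:
            continue
        if abs(dx) > 1e-6 and abs(dy / dx) < min_slope:
            continue  # nearly horizontal pair: unreasonable slope for a road boundary
        # Band-style scoring: count ROI points inside a band around the candidate line.
        dist = np.abs(dx * (roi[:, 1] - p1[1]) - dy * (roi[:, 0] - p1[0])) / norm
        score = int(np.sum(dist < band_half_width))
        if score > best_score:
            best_line, best_score = (p1, p2), score
    return best_line  # None when no candidate exceeds the threshold (cf. the failure cases)
```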

4 Results and discussions

To test and verify the efficiency of the model, we evaluate it on a data sequence of 15546 frames collected by our own UGV platform in different driving scenes. To further compute the precision of our outcome, we choose 100 valid frames out of the data sequence and manually label them. We use the traditional measurements F-measure, Recall and Precision to evaluate our method on this dataset, obtaining 79.67%, 77.56% and 81.89% respectively. Here, to ensure that sufficient radar sampling points are accumulated on the road, we only evaluate road regions less than 30 meters away from our vehicle. Moreover, part of the visual results are shown in Fig. 7.
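These are the standard metrics computed against the manual labels; a short sketch is given below, with the cell-wise comparison over the evaluated region being an assumed granularity.

```python
import numpy as np

def evaluate(pred_mask, label_mask):
    """Precision, recall and F-measure of a predicted road mask against a labeled one.

    Both masks are boolean arrays over the evaluated cells (road regions < 30 m ahead)."""
    tp = np.sum(pred_mask & label_mask)    # true positives
    fp = np.sum(pred_mask & ~label_mask)   # false positives
    fn = np.sum(~pred_mask & label_mask)   # false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

As a consistency check, a precision of 81.89% and a recall of 77.56% yield an F-measure of 2·0.8189·0.7756/(0.8189 + 0.7756) ≈ 79.67%, matching the reported value.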

Fig. 7 Some visual results

Though our model performs well in most cases, it fails in some instances. In Fig. 8, two failure cases are shown, in which no boundary is detected. For case (a) in Fig. 8, the left border is vaguely visible but lies outside the ROI of the left boundary. The right boundary is also undetected, because no detected line scores above the preset threshold. The same situation occurs in case (b): the straight-line model no longer fits the road boundaries at bends.

Fig. 8 Fail instances

5 Conclusions

Throughout this paper, we put forward a practical road detection method based on millimeter-wave radar, which caters to the realistic autonomous driving requirements of low price and low computational complexity. There are two main contributions in this paper. First, based on temporal validity, we introduce a modified occupancy grid map generation procedure, in which the influence of former Bayesian predictions attenuates chronologically. Second, a modified RANSAC method for straight-line detection is put forward, customized for driving scenes and satisfying real-time requirements.

With regard to future work, multi-sensor fusion is imperative. Even though MMW radar has great advantages, such as detecting moving objects quickly and providing long-range detection and exact velocity measurement, it cannot recognize the shapes and sizes of the detected targets. On the other hand, vision systems can easily obtain the contour of targets within the short-distance sensing region of a visual sensor. Furthermore, a high-accuracy global positioning service is always required when constructing the occupancy grid map, because of the sparsity of radar data. To ensure reliable environment perception in all circumstances, information redundancy is needed, and how to choose and fuse effective data from multiple sensors in different scenes is the focus of our future work.