1 Introduction

According to the U.S. Department of Transportation, about 22% of vehicle crashes each year, and about 16% of the resulting casualties, are weather-related. Most of these accidents occur on wet roads (73%), and a further 17% occur in snow or sleet [1]. Road conditions therefore greatly affect driving safety, and real-time perception of road surface conditions is critical to the safe driving of cars. Human drivers perceive current road conditions through their visual system and then adjust the speed of the car according to the current road surface friction, road curvature, and road slope. To ensure the driving safety of an intelligent vehicle on various road surfaces and in various weather conditions, the intelligent driving system must obtain real-time information about the road through its sensors, e.g. predict basic parameters such as the friction coefficient, curvature, and slope of the current road, so as to calculate the maximum driving speed and angular speed of the vehicle on the current road surface and provide a safety guarantee for the path planning and vehicle control systems. This paper first introduces methods of road parameter estimation based on visual sensors, then summarizes methods of road surface perception and parameter estimation, and finally discusses current scientific progress in friction coefficient estimation, road curvature estimation, and road slope estimation.

2 Road Friction Estimation

Road friction, as a traffic parameter that directly influences vehicle braking distance, has been extensively researched in the fields of transportation and autonomous driving. Autonomous driving and assisted driving systems can benefit significantly from real-time prediction of road friction, since the driving style of the vehicle can be adjusted in time according to the road surface condition, thereby avoiding potential traffic accidents. Up to now, traditional methods of estimating road friction have mainly been based on vehicle response and tire dynamics [2]. Such methods calculate the road adhesion coefficient directly from tire deformation, noise response, and vehicle slip rate during braking, and they have a cost advantage because they reuse the vehicle’s inherent sensors. In practical applications, however, such methods lack predictive ability: they can only calculate the road friction of the area already driven over and cannot provide upcoming road surface information to the vehicle’s decision-making system.

On the other hand, although road friction estimation methods based on visual sensors require additional hardware, they have stronger predictive ability and can thus serve as a basic module of advanced autonomous driving applications. Such methods estimate road friction by establishing a model between road friction and related road parameters. Specifically, this process can be divided into two stages. First, the type of road surface is predicted from the image taken by a front camera on the vehicle. Second, a mapping function from road surface type to friction parameter values is established based on prior knowledge. In this way, a specific friction value is obtained.
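The two-stage process can be sketched in a few lines of Python. This is only an illustration: the class names, the classifier scores, and the snow and ice friction values are hypothetical, and only the 0.85 and 0.6 values for dry and wet asphalt are taken from the statistics quoted for [11] in Sect. 2.2.

```python
# Stage 2: map a predicted surface type to a friction value via a prior table.
# Only the dry/wet asphalt values come from the text; the rest are assumed.
FRICTION_PRIOR = {
    "dry_asphalt": 0.85,
    "wet_asphalt": 0.60,
    "snow": 0.30,  # assumed value, for illustration only
    "ice": 0.10,   # assumed value, for illustration only
}


def friction_from_probs(class_probs):
    """Return (most likely surface, its prior friction, probability-weighted friction)."""
    total = sum(class_probs.values())
    probs = {name: p / total for name, p in class_probs.items()}
    best = max(probs, key=probs.get)
    expected = sum(p * FRICTION_PRIOR[name] for name, p in probs.items())
    return best, FRICTION_PRIOR[best], expected


# Stage 1 output: hypothetical classifier scores for one front-camera image.
surface, mu_map, mu_expected = friction_from_probs(
    {"dry_asphalt": 0.10, "wet_asphalt": 0.70, "snow": 0.15, "ice": 0.05}
)
```

Taking the most likely class gives a single table value, while the probability-weighted variant degrades more gracefully when the classifier is uncertain; which of the two a given system uses is a design choice not specified here.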

This section mainly introduces road friction estimation methods based on visual sensors. According to the stages of constructing a model, existing works in this field are divided into road surface prediction and friction parameter prediction. An overview of the methods is shown in Fig. 1. Road surface prediction includes road material classification, road state classification, and mixed classification; friction parameter prediction includes adhesion coefficient estimation and the estimation of other friction-related parameters. The specific models used include traditional machine learning models and deep learning models, usually combined with dynamics methods to achieve higher prediction accuracy.

Fig. 1. Overview of vision-based road friction estimation methods.

2.1 Road Surface Prediction

Road Material Classification.

Road material is closely related to the road friction value. Common road materials (including asphalt, soil, gravel, cobblestones, and grass) can be classified directly from an input image. Therefore, classifying the upcoming road material can be the first step of road friction estimation.

Kim et al. proposed a road material classification model [3] based on hand-crafted features and Bayesian classifiers in 2009, which divides the road surface into four different materials. The model first uses an over-segmentation algorithm to segment areas of the same material in an image and then enhances the contrast of the image to reduce the similarity of adjacent areas. Finally, a Bayesian classifier is used to classify the road surface, achieving 95.38% classification accuracy. In contrast, preprocessing the input images without contrast enhancement reduces the accuracy to 90.55%.

Similarly, Rateke et al. proposed a CNN-based model [4] in 2019 to classify the road surface into asphalt, paved, and unpaved material. Due to the lack of an appropriate public dataset, the authors constructed a new dataset, the RTK dataset, by collecting images from suburbs of Brazil using low-cost cameras, so it contains more unpaved roads than previous datasets. In terms of model design, the authors chose a simple 6-layer CNN for classification, and the lower part of the image, which potentially contains more road information, is used as the RoI input. The model is trained on the RTK, KITTI, and CaRINA datasets, and the final accuracy on the three datasets reaches 95.7%, 93.1%, and 98.3%, respectively.
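Using the lower part of the image as the RoI amounts to a simple array crop. The sketch below assumes an image stored as a NumPy array of shape height x width x channels; the `keep_fraction` parameter and the frame size are hypothetical, since the exact crop used in [4] is not stated here.

```python
import numpy as np


def lower_roi(image, keep_fraction=0.5):
    """Crop the bottom `keep_fraction` of an H x W x C image as the road RoI."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    height = image.shape[0]
    top = height - int(round(height * keep_fraction))
    return image[top:, ...]


# Dummy frame standing in for a camera image; the size is arbitrary.
frame = np.zeros((288, 352, 3), dtype=np.uint8)
roi = lower_roi(frame, keep_fraction=0.5)  # keeps rows 144..287
```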

Road State Classification.

Road surface state is another important factor that greatly affects road friction. Unlike road material, road surface state tends to change easily with weather conditions. Common road states such as dry, wet, water, ice, and snow can result in significantly different road friction and vehicle braking distance. Therefore, predicting the road surface state correctly can lead to precise road friction estimation.

Qian et al. proposed a road state classification method [5] based on traditional machine learning algorithms in 2016. This method first learns the distribution of the road area in the dataset and then fixes that area as the RoI input of the model. The model first uses an MR8 filter bank and K-means clustering to extract luminance-invariant features and then augments them with luminance-relative features based on a pixel deviation histogram. A naive Bayes classifier is used to complete the classification. This method achieves 80%, 68%, and 46% accuracy in two-, three-, and five-class classification, respectively. The ablation experiment shows that manual selection of the RoI can improve the accuracy by nearly 20%.

To obtain higher accuracy, Almazan et al. improved this work [6]. With the other parts of the model left basically unchanged, they mainly enhance the RoI prediction module. An additional geometric constraint is introduced by calculating the vanishing point and the horizon line. Combined with the spatial priors learned from the dataset, this yields a better RoI prediction. With the same hand-crafted features, a simpler RBF SVM is chosen for classification, and higher accuracies of 86%, 80%, and 52% are achieved in two-, three-, and five-class classification, respectively, which is a clear overall improvement.

Zhao et al. proposed another road state classification algorithm based on traditional machine learning models [7] in 2017. The feature extraction part of the model also uses hand-crafted features, with an SVM as the classifier. What makes it different is the parameter optimization algorithm and the evaluation metric. Grid search and PSO are used to optimize the SVM parameters, and model accuracy is evaluated as the ratio of correctly classified image grids to the total of 5x9 image grids. The single-state accuracy exceeds 90%, while the multi-state accuracy exceeds 85%. The SVM model optimized with PSO is significantly better than the one optimized with grid search, with an accuracy increase of more than 10%.
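The grid-based accuracy metric can be illustrated as follows. This is a minimal sketch that assumes each image is split into a 5 x 9 array of cells with one integer state label per cell; the row/column orientation and the label encoding are assumptions, not details taken from [7].

```python
import numpy as np


def grid_accuracy(pred_grid, true_grid):
    """Ratio of correctly classified grid cells to the total number of cells."""
    pred_grid = np.asarray(pred_grid)
    true_grid = np.asarray(true_grid)
    if pred_grid.shape != true_grid.shape:
        raise ValueError("prediction and ground truth must have the same shape")
    return float(np.mean(pred_grid == true_grid))


# Toy labels: 0 = dry in every cell of a 5 x 9 grid.
true_grid = np.zeros((5, 9), dtype=int)
pred_grid = true_grid.copy()
pred_grid[0, :3] = 1  # three cells misclassified as another state
acc = grid_accuracy(pred_grid, true_grid)  # 42 correct cells out of 45
```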

For winter road states with more snow, Pan et al. collected data on a highway in Canada and constructed a new dataset to estimate the amount of snow [8]. The dataset divides the road state into 5 categories according to snow coverage. The model uses VGG16 pre-trained on ImageNet as a baseline classifier with the full image as input, and it is compared with traditional machine learning models and a VGG16 model without pre-training. The results show that the pre-trained VGG16 achieves the highest accuracy of 90.4%, 87.3%, and 78.5% in two-, three-, and five-class classification, respectively. Pre-training on ImageNet brings a 2% accuracy improvement.

Mixed Classification.

Since the material and the state of the road surface both significantly affect road friction value, an intuitive idea is to construct a dataset containing both of them to train a mixed classification model.

Nolte et al. proposed a CNN-based mixed road surface classification model [9] in 2018. Considering that public traffic datasets contain a large amount of reusable data, they select and label images from multiple public datasets in a mixed way. An appropriate RoI is then manually selected and resized to 224x224 as the input of the model. The paper compares the performance of ResNet50 and InceptionV3 in road classification. Experiments show that ResNet50 achieves 92% accuracy, 2% higher than InceptionV3. RoI selection is also very important: it improves accuracy by 10% compared with simply using the whole image as input.

Similarly, Busch et al. select road images from existing traffic datasets to form a mixed dataset [10]. InceptionV3, GoogLeNet, and SqueezeNet are used to compare the influence of different architectures on classification accuracy. In addition, considering the importance of the RoI selection module in the previous work, this paper compares the effect of different RoI shapes on accuracy. Unexpectedly, the SqueezeNet architecture with the whole image as input reaches the highest F1 score of 95.36%, while the choice of network architecture has little influence on accuracy.

2.2 Friction Parameter Prediction

Road Adhesion Coefficient Estimation.

The road adhesion coefficient is defined as the ratio between the ground adhesion force and the tire normal force, and it approximately equals the road friction coefficient when adhesion reaches its maximum. As it directly changes a car’s braking distance, the adhesion coefficient has been modelled in many different ways. The traditional dynamics-based method estimates the adhesion coefficient from the longitudinal response of the tire. Since no additional hardware is required and high accuracy can be ensured, such methods have always been the mainstream of adhesion coefficient estimation and have been widely studied. However, such methods also have inherent shortcomings, such as lacking predictive ability. Camera-based adhesion coefficient estimation, on the other hand, is not as accurate, but it has more predictive ability and better real-time performance. Therefore, an intuitive idea is to combine both methods to resolve the trade-off between speed and accuracy and improve the reliability of road adhesion coefficient estimation. In camera-based methods, dynamic models are therefore often combined with vision predictions to achieve faster adhesion coefficient estimation with higher accuracy.
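As a rough illustration of the definition and of why the adhesion coefficient matters for braking, the sketch below computes the coefficient as a force ratio and an idealized stopping distance v²/(2μg). The constant-deceleration formula (flat road, no reaction time, full use of peak adhesion) and all numeric inputs are simplifying assumptions for illustration, not part of the surveyed methods.

```python
G = 9.81  # gravitational acceleration in m/s^2


def adhesion_coefficient(adhesion_force_n, normal_force_n):
    """Ratio of ground adhesion force to tire normal force (both in newtons)."""
    return adhesion_force_n / normal_force_n


def braking_distance(speed_mps, mu):
    """Idealized stopping distance v^2 / (2 * mu * g) on a flat road."""
    return speed_mps ** 2 / (2.0 * mu * G)


mu = adhesion_coefficient(3000.0, 5000.0)  # hypothetical forces
d_dry = braking_distance(20.0, 0.85)       # dry-asphalt value quoted for [11]
d_wet = braking_distance(20.0, 0.60)       # wet-asphalt value quoted for [11]
```

Under these assumptions the stopping distance scales with 1/μ, so the wet-asphalt case is longer than the dry one by the factor 0.85/0.6.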

Xiong et al. proposed a method for calculating the adhesion coefficient with dynamics aided by visual information [11] in 2019. This model uses color moments and GLCM to extract features and then uses an SVM to classify the road as dry or wet asphalt, achieving 92.47% and 88.39% classification accuracy on dry and wet asphalt, respectively. The adhesion coefficients of dry and wet asphalt are then set to 0.85 and 0.6 based on statistical data and used as approximate initial values of the dynamic model to calculate the final adhesion coefficient. As a result, the hybrid estimator converges noticeably faster than the pure dynamics model, and its prediction is also closer to the ground truth, especially on roads where the dynamics pattern is not obvious.
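The benefit of a vision-based initial value can be illustrated with a toy recursive estimator. This is only a sketch: the first-order update below stands in for the dynamics-based estimator of [11], whose actual form is not given here, and the true coefficient, gain, tolerance, and default initial value are all hypothetical.

```python
def steps_to_converge(mu_init, mu_true, gain=0.1, tol=0.02, max_steps=1000):
    """Count updates of a first-order recursive estimator until |error| < tol."""
    mu_hat = mu_init
    for step in range(max_steps):
        if abs(mu_hat - mu_true) < tol:
            return step
        # Noise-free measurement of mu_true, used here only for clarity.
        mu_hat += gain * (mu_true - mu_hat)
    return max_steps


# Initial values per vision class, as quoted in the text for [11].
VISION_PRIOR = {"dry_asphalt": 0.85, "wet_asphalt": 0.60}

mu_true = 0.55  # hypothetical true adhesion coefficient of a wet road
steps_with_vision = steps_to_converge(VISION_PRIOR["wet_asphalt"], mu_true)
steps_without_vision = steps_to_converge(1.0, mu_true)  # arbitrary default start
```

Because the error of this toy estimator shrinks by a fixed factor per step, starting closer to the true value directly reduces the number of updates needed, which mirrors the faster convergence reported for the hybrid estimator.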

Sabanovic et al. proposed a method to estimate the adhesion coefficient based on a similar idea [12] in 2020. What makes the method different is that the dynamic model is only needed during training. The road surface is first classified into six categories with AlexNet. After that, an adhesion coefficient-slip ratio curve is fitted for each road surface based on data collected by the vehicle in real time. In this way, end-to-end adhesion coefficient estimation is achieved with only visual information. In addition, the paper combines the system with an ABS model, which uses the adhesion coefficient predicted in advance to reduce the vehicle braking distance by up to 18%.

Other Friction-Related Parameter Estimation.

In addition to road adhesion coefficient estimation, some works estimate other friction-related parameters to achieve a similar prediction effect. Road friction estimate RFE, friction level μ, and anti-skid level BPN all belong to this category. With only visual information, prediction models can only roughly estimate their values in a coarse-grained manner. Therefore, this type of method has many similarities with the previous road surface classification methods in data collection and implementation details.

Roychowdhury et al. proposed a multi-stage RFE estimation method [1] in 2018. First, a CNN-based method is used to classify roads into four categories: dry, wet, slush, and snow. Among them, dry roads can be directly considered to have high RFE. Then, manual segmentation is applied to the other images to divide them into 15 trapezoid blocks, which are stretched into rectangular bird’s-eye blocks according to the perspective projection. Finally, the probability of each image block being dry or wet is predicted separately by the model, and the average value is calculated to estimate the RFE. This method uses SqueezeNet to achieve the best accuracy of 97.36% in road classification and 89.5% in RFE prediction.
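The final averaging step can be sketched as below. The per-block probabilities are made-up numbers, and the threshold that turns the averaged probability into a coarse RFE label is a hypothetical choice; the actual decision rule of [1] is not reproduced here.

```python
import numpy as np


def estimate_rfe(block_dry_probs, high_threshold=0.5):
    """Average per-block dry probabilities and derive a coarse RFE label.

    `high_threshold` is an assumed cut-off, used only for illustration.
    """
    probs = np.asarray(block_dry_probs, dtype=float)
    mean_dry = float(probs.mean())
    label = "high" if mean_dry >= high_threshold else "low"
    return mean_dry, label


# Hypothetical model outputs for the 15 bird's-eye blocks of one image.
block_probs = [0.9] * 10 + [0.2] * 5
mean_dry, rfe_label = estimate_rfe(block_probs)
```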

The paper published by Jonnarth in 2018 explores the effect of network architecture, data distribution, and the use of simulated data on the estimation of friction level μ [13]. VGG, ResNet, and DenseNet are used to classify the road surface into a high friction (0.2 ≤ μ < 0.6) or low friction (μ < 0.2) level. The dataset used in this project consists of 37,000 real images and 54,029 simulated images, which is quite sufficient for a classification task. The trained model finally reaches a prediction accuracy of up to 90%. The following conclusions are drawn: 1) prediction accuracy differs little between network architectures; 2) the span of the dataset is more important than its size and significantly affects model performance; 3) there is a certain gap between simulated and real images, and the use of simulated images does not lead to a noticeable performance improvement.

Du et al. published a paper on the rapid estimation of the road anti-skid level BPN from the perspective of the anti-skid performance of autonomous vehicles [14] in 2019. The authors combine a CNN with hand-crafted features to propose TLDKNet, a deep convolutional neural network based on domain knowledge. The domain knowledge mentioned here refers to LBP (Local Binary Pattern), GMM (Gaussian Mixture Model), and GLCM (Gray-Level Co-occurrence Matrix), three texture features that have been shown to be strongly related to road anti-skid performance. The model combines the convolutional layers of VGG16 with the three texture features to form a 4-branch feature extraction network. The features of the different branches are merged to classify the anti-skid level BPN. This model divides BPN into three levels: high (BPN > 57), medium (47 < BPN < 57), and low (BPN < 47), and achieves final accuracies of 90.67% and 80% in two- and three-class classification, respectively.
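The three BPN levels translate into a simple binning rule. In the sketch below, the thresholds 47 and 57 come from the text, while the handling of the exact boundary values (57 counted as medium, 47 as low) is an assumption, because the strict inequalities above leave those two values unassigned.

```python
def bpn_level(bpn):
    """Map a BPN value to the anti-skid level used in the three-class setting."""
    if bpn > 57:
        return "high"
    if bpn > 47:
        return "medium"  # assumption: BPN == 57 falls in this bin
    return "low"         # assumption: BPN == 47 falls in this bin


levels = [bpn_level(value) for value in (62.0, 50.0, 40.0)]
```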

2.3 Existing Problems in Road Friction Estimation

Despite these many excellent works, current vision-based road friction estimation methods still have several problems.

  1. There are no unified public benchmarks for road surface classification and road friction estimation. Many studies use data collected by the researchers themselves for model training and accuracy evaluation, which makes comparison between models very difficult. A unified and effective accuracy metric is needed to evaluate the performance of different models.

  2. An efficient image pre-processing method is needed to eliminate redundant background information and extract accurate road features. Although context information can help to identify the state of the road surface, it undoubtedly has a negative impact on the estimation of road material and friction parameters.

  3. Vision-based road friction estimation is conducted in a coarse-grained manner: friction can only be roughly classified into several value levels. Moreover, the most accurate road friction estimation methods rely heavily on dynamics assistance, which greatly limits the use of the model.

3 Road Curvature Estimation

Intelligent vehicles need to perceive and predict information about the surrounding environment in real time during autonomous driving. The curvature of the road helps to predict the direction of the lane and thus provides important support to the automatic control system of the vehicle. At present, the mainstream road curvature prediction algorithms are based on lane detection [15,16,17,18,19], GPS trajectories [20], or vehicle dynamics [21]. Since the purpose of this article is to introduce methods of estimating road parameters based on visual sensors, this section mainly introduces road curvature estimation based on lane detection.

3.1 Road Curvature Estimation Based on Lane Detection

Tsai et al. first proposed using road image information to calculate the curvature of the road [15]. Their algorithm is divided into four steps: 1) obtain the road image in front of the vehicle with the vehicle camera and extract the road curve edge through image processing; 2) convert the curve edge from image coordinates to the world coordinate system through inverse perspective mapping (IPM) [22]; 3) calibrate the camera parameters; 4) calculate the radius and center point of the curve from the curve points in the world coordinate system. The road curvature estimation method proposed by Seo et al. also consists of four steps [17, 18]. Inspired by the prior that the lane lines of a road are parallel in the world coordinate system, they adjusted the image processing steps: 1) convert the forward view of the road from the image coordinate system to the world coordinate system; 2) detect the lane lines in the resulting bird’s-eye view; 3) fit the lanes using the random sample consensus (RANSAC) algorithm [23]; 4) calculate the lane curvature by sampling data points from the lane. Hu et al. used the continuity of the images collected by the vehicle camera to estimate the error variance of the lane and filtered out data points with high variance, thereby improving the accuracy and robustness of the road curvature estimate. The main technical details involved in road curvature estimation are described below [18].

Inverse Perspective Transformation.

Because the camera’s optical axis intersects the road surface, the road surface information projected onto the image plane is distorted; for example, two parallel lanes on the road surface intersect in the image plane. The inverse perspective transformation projects pixels from the image coordinate system into the world coordinate system.

In practice, by collecting the world coordinates of four points on the road plane and the coordinates of the corresponding four points in the image plane, the homography matrix H from the road plane to the image plane can be calculated. Its inverse transfers image pixels from the image coordinate system to the world coordinate system, up to the homogeneous scale factor:

$$ \left[ X \;\; Y \;\; 1 \right]^{T} = H^{-1} \left[ u \;\; v \;\; 1 \right]^{T} $$
(1)
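The four-point computation can be sketched with NumPy using the direct linear transform. The point coordinates below are made up for illustration, and H is taken to map road-plane coordinates to pixels so that Eq. (1) applies its inverse; this sketch is not the calibration procedure of any specific cited work.

```python
import numpy as np


def homography_from_points(world_xy, image_uv):
    """Estimate H with [u v 1]^T ~ H [X Y 1]^T from four or more point pairs."""
    rows = []
    for (X, Y), (u, v) in zip(world_xy, image_uv):
        rows.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        rows.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    A = np.asarray(rows, dtype=float)
    # The null vector of A (last right singular vector) holds the entries of H.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 3)  # defined only up to scale


def image_to_world(H, uv):
    """Apply Eq. (1): map pixel (u, v) to road-plane coordinates (X, Y)."""
    p = np.linalg.inv(H) @ np.array([uv[0], uv[1], 1.0])
    return p[:2] / p[2]  # divide out the homogeneous scale


# Hypothetical correspondences: a lane rectangle (metres) and its image (pixels).
world = [(-1.8, 5.0), (1.8, 5.0), (1.8, 20.0), (-1.8, 20.0)]
image = [(200.0, 700.0), (1080.0, 700.0), (740.0, 450.0), (540.0, 450.0)]
H = homography_from_points(world, image)
recovered = [image_to_world(H, uv) for uv in image]
```

With exactly four correspondences the homography is determined uniquely up to scale; with more points the same code returns a least-squares estimate, which is usually preferable when the point coordinates are noisy.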

Lane Detection.

Because lane markings are usually white or yellow, color can be used as a prior, and the location of the lane can be captured with a color threshold or a gradient threshold. Because lane pixels are concentrated within a limited range along the x-axis, the peaks of the pixel distribution along the x-axis are likely to be the base points of the lane lines. The peak points are therefore taken as base points, sliding windows are then used to detect lane points, and finally a straight or arc-shaped lane line is fitted with the Hough transform or the RANSAC algorithm. The effect is shown in Fig. 2.

Fig. 2. Lane detection [18].
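The base-point search and the sliding-window step can be sketched on a binary bird’s-eye mask as follows. The window count, margin, and minimum pixel count are hypothetical parameters, the mask is synthetic, and the final Hough or RANSAC fit is not shown.

```python
import numpy as np


def lane_base_columns(mask):
    """Return the column-histogram peaks of the lower half, one per image half."""
    height, width = mask.shape
    hist = mask[height // 2:, :].sum(axis=0)
    mid = width // 2
    return int(np.argmax(hist[:mid])), int(np.argmax(hist[mid:]) + mid)


def sliding_window_points(mask, base_col, n_windows=8, margin=10, min_pixels=5):
    """Follow one lane line upward from its base column; return (rows, cols)."""
    height, width = mask.shape
    win_h = height // n_windows
    rows_all, cols_all = [], []
    center = base_col
    for i in range(n_windows):
        r_hi = height - i * win_h
        r_lo = r_hi - win_h
        c_lo = max(center - margin, 0)
        c_hi = min(center + margin + 1, width)
        rows, cols = np.nonzero(mask[r_lo:r_hi, c_lo:c_hi])
        if len(cols) >= min_pixels:
            rows_all.append(rows + r_lo)
            cols_all.append(cols + c_lo)
            center = int(round(float(cols.mean()))) + c_lo  # re-center next window
    if not rows_all:
        return np.array([], dtype=int), np.array([], dtype=int)
    return np.concatenate(rows_all), np.concatenate(cols_all)


# Synthetic thresholded bird's-eye mask with two straight lane lines.
mask = np.zeros((80, 100), dtype=np.uint8)
mask[:, 30] = 1
mask[:, 70] = 1
left_base, right_base = lane_base_columns(mask)
lane_rows, lane_cols = sliding_window_points(mask, left_base)
```

Re-centering each window on the mean column of the pixels found in the previous one lets the search follow a curved lane; the collected points would then be passed to the line or arc fitting step.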

Road Curvature Calculation.

The starting point, the end point, and the center point of the lane line are sampled in the image, and the circle through these points is fitted to obtain the radius R of the arc.

The curvature of the road \( \kappa \) is defined as the inverse of the arc radius, so the curvature of the road can be calculated directly.

$$ \kappa = \frac{1}{R} $$
(2)
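For three sampled points, the radius follows from the circumscribed circle of the triangle they form, R = abc / (4A), where a, b, c are the side lengths and A is the triangle area. The sketch below applies this standard formula and then Eq. (2); the sample points are made up, and returning zero curvature for collinear points is a convention chosen here.

```python
import math


def curvature_from_three_points(p1, p2, p3):
    """Curvature 1/R of the circle through three points (0.0 if collinear)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a = math.dist(p1, p2)
    b = math.dist(p2, p3)
    c = math.dist(p3, p1)
    # Absolute cross product equals twice the triangle area.
    cross = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    if cross == 0.0:
        return 0.0  # straight lane: infinite radius
    radius = (a * b * c) / (2.0 * cross)  # R = abc / (4 * area)
    return 1.0 / radius


# Three points on a circle of radius 50 m (world coordinates, hypothetical).
kappa = curvature_from_three_points((50.0, 0.0), (0.0, 50.0), (-50.0, 0.0))
```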

3.2 Problems of Current Visual-Based Road Curvature Method

Current vision-based road curvature estimation relies on robust lane detection. However, lane detection results are affected by many conditions, such as poor image quality in bad weather, lanes covered by snow, worn lane markings, or lanes severely occluded by other cars. Current lane detection methods perform poorly in these environments, so establishing a large-scale road dataset covering these situations would greatly promote the development of related methods.

4 Road Slope Estimation

As key information for ensuring driving safety and an important parameter of the electronic control systems of the powertrain and chassis, road slope, if accurately estimated in real time, can significantly improve vehicle motion control performance. Road slope can be divided into lateral road slope and longitudinal road slope. Lateral road slope refers to the slope across the road, while longitudinal road slope refers to the slope along the direction of travel. There are two types of longitudinal road slope: uphill and downhill. Most studies on road slope estimation focus on the longitudinal road slope [25, 26, 28, 29]. Methods of longitudinal road slope estimation are mainly divided into sensor-based [25] and model-based [26]. Sensor-based methods use additional sensors on the vehicle, such as inclination displacement sensors, accelerometers, and GPS. These methods achieve good accuracy but are limited by the sensors used. For example, the inclination displacement sensor is susceptible to the longitudinal acceleration of the vehicle body and to road bumps, and GPS may suffer from positioning error and signal loss. Model-based methods estimate the road slope by establishing a dynamic model and using the data available on the CAN bus. However, decoupling the vehicle state parameters from changes in road resistance is a difficult problem. Most lateral road slope estimation methods depend on the accuracy of the tire model and the road adhesion coefficient [27].

4.1 Vision-Based Road Slope Estimation

Vision-based road slope estimation provides slope information before the vehicle enters a road section with a different slope, which gives the driver or the intelligent control system enough time to adopt the correct control strategy. However, the study of road slope estimation based on visual sensors is still at a preliminary stage. In [28], researchers combine forward and backward monocular cameras to classify longitudinal road slopes into uphill and downhill using geometric cues in the images. Figure 3 shows the images taken by the forward and backward cameras when going uphill and downhill. If \( \phi > \psi \), the image represents an uphill; if \( \phi < \psi \), it represents a downhill; and if \( \phi = \psi \), the road is flat. In [29], researchers present three methods to estimate road slope from instantaneous road images obtained from a front monocular camera. In the geometry-based method, they estimate the road slope using the 2D road lines derived from the 3D road lines and the pinhole camera model. In the local-features-based method, they use SIFT (scale-invariant feature transform) local features extracted from two consecutive images. In the covariance-based method, they use the 2D road lines as features to train a multi-layer perceptron.
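The comparison rule of [28] can be written as a small decision function. The sketch takes the two image-derived quantities ϕ and ψ as given inputs, since their extraction from the images is not described here, and it adds a tolerance `tol` for the flat case; the tolerance and the example values are assumptions.

```python
def classify_longitudinal_slope(phi, psi, tol=1e-3):
    """Classify the road as 'uphill', 'downhill', or 'flat' by comparing phi and psi."""
    if abs(phi - psi) <= tol:
        return "flat"  # assumed tolerance; exact equality is rare with real data
    return "uphill" if phi > psi else "downhill"


# Hypothetical values in the same unit for both inputs.
labels = [
    classify_longitudinal_slope(0.60, 0.45),
    classify_longitudinal_slope(0.40, 0.55),
    classify_longitudinal_slope(0.50, 0.50),
]
```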

Fig. 3. Uphill and downhill. (a) and (b) are the images taken by the forward and backward cameras when going downhill; (c) and (d) are the images taken by the forward and backward cameras when going uphill [28].

4.2 Limitation of Research on Vision-Based Road Slope Estimation

There are few studies on road slope estimation using visual sensors. Relevant works are mainly conducted in good weather and high visibility, so the robustness of the models cannot be guaranteed in snow and ice, in bad weather, or at night. In addition, these studies use only road lines or local features as inputs rather than whole images, so less information is available. Currently, there is no publicly available road slope image dataset, which hinders the development of road slope estimation using deep convolutional neural networks. In conclusion, the road slope information provided by visual sensors is currently insufficient to estimate the road slope accurately in real time, and progress is limited by the lack of relevant datasets.

5 Conclusion

This paper reviews recent literature on road parameter estimation based on visual sensors and classifies it into road friction estimation, road curvature estimation, and road slope estimation methods. Road friction estimation includes two parts: road surface prediction and friction parameter estimation. In addition, the progress of road slope estimation and road curvature estimation methods is summarized in detail. This paper provides a reference for the design of intelligent vehicle planning modules for various road surfaces and weather conditions.

Human drivers and intelligent vehicles are more prone to accidents in snowy or icy environments, so road parameter estimation is particularly important in these situations. However, not much attention has been paid to this field. Future research can therefore be expanded in the following directions:

  1. Collect and label large-scale road parameter datasets for snowy and icy roads. One reason for the lack of research on road parameters on snow and ice surfaces is the lack of relevant open datasets, so collecting and labeling road parameter datasets in snow and ice environments can promote scientific progress in this area.

  2. Road parameter prediction under multi-task constraints. An intelligent vehicle usually requires simultaneous sensing of lanes, friction coefficient, drivable area, and road curvature. These tasks constrain one another, so multi-task constrained road parameter prediction is worth studying.

  3. Estimating the vehicle’s motion state, especially the current vehicle speed and angular speed, from redundant sensor data and the dynamic model, so as to improve the robustness of road parameter prediction and support the online update of the prediction model, is a very promising direction.