Abstract
Driver fatigue is one of the leading causes of traffic accidents. Existing fatigue driving detection methods suffer from poor performance in practical use and high equipment requirements. This paper proposes a multi-feature-point, non-invasive fatigue monitoring system based on a support vector machine (SVM) with a hybrid kernel function. The system detects facial feature points with a gradient descent tree algorithm based on cascaded regression and calculates the eye aspect ratio (EAR) and mouth aspect ratio (MAR). The heart rate is obtained through RGB image analysis combined with the Eulerian video magnification algorithm. These features are then classified to determine whether the driver is fatigued. The SVM is improved with a hybrid of the Logistic kernel and the Radial Basis Polynomial Kernel (RBPK), which gives it better learning and generalization ability. Finally, the system is tested on the Driver Drowsiness Detection Dataset and the authors' own dataset. The classification accuracy for a single picture is 96.92%. In summary, the proposed system achieves a better recognition rate for fatigue driving detection.
1 Introduction
Traffic accidents injure or kill thousands of people every year. According to statistics from the World Health Organization (WHO), fatigued driving accounts for a considerable share of them. Fatigue driving detection technology is therefore developing rapidly [1]. According to the dimensionality of the acquired feature data, current fatigue detection methods can be divided into single-dimensional detection and multi-dimensional detection. Researchers [2, 3] proposed improved methods for detecting driver fatigue by calculating the eyelid movement parameter PERCLOS. However, PERCLOS-based detection is only applicable under certain conditions: uncertain factors such as indoor lighting, changes in light, and head motion cause detection errors. Researchers have found that tired drivers show many facial cues, including frequent blinking, yawning, and head shaking. Sahayadhas et al. [4] pointed out that hybrid fatigue driving detection methods that combine multiple measures are considerably more reliable and accurate than methods using a single sensor. Multi-dimensional fatigue driving detection classifies multiple data items and involves convex quadratic programming problems. C. Buchheim et al. [5] studied ellipsoid bounds for convex quadratic integer programming. Tao Cai et al. [6] designed a Newton-CG augmented Lagrangian algorithm for convex quadratically constrained quadratic semidefinite programming, assuming the Robinson constraint qualification, the strong second-order sufficient condition, and three other conditions; these assumptions are relatively strong.
Researchers [7, 8] studied support vector machines for classifying data items and for feature selection. To classify data items, a support vector machine maps the input vectors into a high-dimensional space through a nonlinear transformation and finds the optimal hyperplane that separates the classes. Mingze Xia et al. [9] proposed using genetic algorithms to optimize the RBF kernel parameter and the error penalty parameter C, thus achieving better classification performance.
Qingshuo Zhang et al. [10] proposed multiple-kernel support vector machines based on kernel alignment, which significantly improved training efficiency. The kernel function is an essential part of the support vector machine; common examples include the linear kernel and the polynomial kernel. Different kernel functions give the support vector machine different characteristics. M. Tanveer et al. [11] proposed a novel exact 1-norm linear programming formulation of the twin support vector machine (TWSVM) with a linear kernel, which generalizes well but lacks strong learning and predictive ability. G. Sideratos et al. [12] proposed a probabilistic wind power prediction model based on a radial basis function neural network (RBFNN), which learns well but lacks good generalization and predictive ability. V. H. Moghaddam [13] proposed a new kernel based on Hermite orthogonal polynomials, which predicts well but lacks good generalization and learning ability. In addition, the emergence of federated learning has greatly improved the accuracy of fatigue driving models [14, 15].
According to previous research, single-dimensional fatigue driving detection is easy to implement, but the data obtained are more susceptible to interference from indoor lighting and head movement. Multi-dimensional fatigue driving detection, which reasonably combines several single-dimensional detection methods, can significantly improve detection accuracy in real environments.
However, a single kernel function often cannot offer good learning, generalization, and predictive ability simultaneously, and a kernel function that has all of these characteristics is currently lacking. Few researchers have combined multi-dimensional feature data with a support vector machine that merges the characteristics of several kernel functions to build a complete fatigue detection device. To fill this research gap, this paper designs a fatigue driving detection device. The device's processor is a Raspberry Pi 4 Model B, a microcomputer motherboard with an open-source software architecture. The imaging device is an infrared camera built on the OV5647 sensor chip. Because the camera requires no contact with the driver, it does not interfere with normal driving, and it captures the driver's image well even in low light or no light. To meet various needs, the fatigue driving detection system is also equipped with a Weixue Global Navigation Satellite System (GNSS) module for speed measurement using the Global Positioning System (GPS), the BeiDou Navigation Satellite System (BDS), and the Quasi-Zenith Satellite System (QZSS), along with other sensors. The core of the device is its algorithm [16]. Based on the concept of multi-dimensional detection, this paper uses a face location algorithm based on a cascaded gradient descent tree to locate the face in the driver image and obtain the eye aspect ratio (EAR) and mouth aspect ratio (MAR). This paper uses the Eulerian video magnification algorithm to process the face video and obtains the driver's heart rate signal without touching the driver, which greatly reduces the system's intrusiveness (Table 1).
Finally, the collected data are loaded into a pre-designed multi-dimensional dataset. To address the problem that a support vector machine with a single kernel cannot have multiple characteristics simultaneously, this paper proposes a hybrid kernel function that combines the Logistic kernel function, which generalizes well, with the radial basis polynomial kernel, which has excellent learning and predictive capabilities [17]. The resulting support vector machine has robust learning and prediction capabilities as well as good generalization. The improved support vector machine then classifies the mixed dataset, and the Raspberry Pi 4 Model B produces the corresponding output locally [18]. In the experiments, the average total accuracy of fatigue driving detection reached 96.92%. The fatigue detection system can detect the driver's fatigue state efficiently and accurately in real time without contacting the driver.
2 Fatigue Driving Detection Method Based on Improved Kernel Function Support Vector Machine
The method obtains a video stream from the camera. After frames are captured from the video stream, the driver's facial feature points are detected with the gradient descent tree algorithm of cascade regression, and the eye aspect ratio (EAR) and mouth aspect ratio (MAR) are calculated. The driver's heart rate is obtained by analyzing the RGB images in combination with the Eulerian video magnification algorithm. A support vector machine improved with a hybrid of the Logistic and RBPK kernel functions then classifies these features to determine whether the driver is fatigued. The system can run on low-end development boards such as the Raspberry Pi 4 Model B (Fig. 1).
2.1 Multi-dimensional Fatigue Driving Feature Extraction Based on Gradient Descent Cascade Regression Model
Face Location Based on Gradient Descent Tree Algorithm of Cascade Regression.
This paper locates the face based on the Gradient Boosting Decision Tree (GBDT) [19]. This algorithm can align a face in about one millisecond, significantly improving detection efficiency.
The algorithm represents the coordinates of all 68 facial landmarks as a shape vector and uses the gradient descent tree algorithm to learn each regressor in the cascade. Given the image and the current landmark estimate, each regressor predicts an update vector that is added to the current shape estimate, moving the estimate closer to the true value. Repeating this step through the cascade completes face alignment and yields the coordinates of the 68 facial landmarks.
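In the notation of Kazemi and Sullivan [19], the cascade update described above is commonly written as shown below. The symbols \(\hat{S}^{(t)}\) (current shape estimate), \(I\) (input image), \(r_t\) (the \(t\)-th regressor in the cascade), and \(p\) (number of landmarks) are taken from that reference and are an assumption about how this paper's display equation read.

```latex
\hat{S}^{(t+1)} = \hat{S}^{(t)} + r_t\!\left(I, \hat{S}^{(t)}\right),
\qquad \hat{S}^{(t)} \in \mathbb{R}^{2p}, \quad p = 68
```

Each \(r_t\) is itself learned by gradient boosting over regression trees, so the final estimate is the initial shape plus the sum of all stage updates.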
After achieving face alignment and acquiring the coordinates of the 68 facial landmarks, this article selects 32 of them to calculate the eye aspect ratio (EAR) and mouth aspect ratio (MAR) and so determine whether the driver is fatigued.
Calculation of EAR and MAR.
In this paper, 12 eye landmarks (six per eye) and 20 mouth landmarks are selected to calculate the degree of opening and closing. The EAR is calculated following [20], where \({P}_{1}\) to \({P}_{6}\) denote the left-eye landmarks and \({P}_{7}\) to \({P}_{12}\) denote the right-eye landmarks; the EAR values of the two eyes are calculated separately.
The MAR is calculated from the mouth landmarks \({P}_{13}\) to \({P}_{32}\) (Fig. 2).
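As an illustration of the two ratios, the sketch below computes the EAR with the six-point formula of Soukupová and Čech [20], \(\mathrm{EAR} = (\lVert p_2 - p_6\rVert + \lVert p_3 - p_5\rVert) / (2\lVert p_1 - p_4\rVert)\). The MAR function is only an assumed analogue (vertical lip gap over mouth width); this paper's own 20-point MAR formula is not reproduced here, and all coordinates are hypothetical.

```python
from math import dist


def eye_aspect_ratio(eye):
    """EAR for six (x, y) eye landmarks ordered p1..p6 as in [20].

    p1 and p4 are the horizontal eye corners, p2 and p3 lie on the upper
    eyelid, and p6 and p5 lie on the lower eyelid.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)
    horizontal = dist(p1, p4)
    return vertical / (2.0 * horizontal)


def mouth_aspect_ratio(top, bottom, left, right):
    """Assumed MAR: lip gap divided by mouth width (not the paper's formula)."""
    return dist(top, bottom) / dist(left, right)


# Hypothetical pixel coordinates for an open eye and a nearly closed eye.
open_eye = [(0, 0), (2, 2), (4, 2), (6, 0), (4, -2), (2, -2)]
closed_eye = [(0, 0), (2, 0.3), (4, 0.3), (6, 0), (4, -0.3), (2, -0.3)]

print(round(eye_aspect_ratio(open_eye), 3))
print(round(eye_aspect_ratio(closed_eye), 3))
```

The EAR drops sharply when the eyelids approach each other while the eye width stays constant, which is what makes it usable as a closed-eye indicator.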
Heart Rate Detection Based on Eulerian Video Magnification.
In addition to calculating EAR and MAR, this article also obtains heart rate as an index for comprehensively judging fatigue driving.
This paper uses the Eulerian video magnification algorithm to process face video images. Compared with the independent component analysis algorithm, it does not require the source signals to be non-Gaussian and independent, and it has lower time complexity, which shortens fatigue driving detection time. The algorithm processes video images in the spatial and temporal domains, thereby magnifying subtle changes in the video that are usually invisible or difficult to detect with the naked eye.
This paper analyzes the G channel, which carries the strongest pulse wave signal of the three RGB channels, in the magnified forehead region of each frame. The mean pixel value of the G channel within the region of interest forms a signal sequence, and the frequency at the maximum of its power spectrum is taken as the heart rate estimate. The processed output \(\tilde{I }(x,t)\) is calculated as in [21]: after time-domain band-pass filtering, the original small translational motion \(\delta (t)\) is amplified to \((1+\alpha )\delta (t)\) (Fig. 3).
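The spectral-peak step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-frame mean G values of the forehead region have already been extracted, it uses a naive discrete Fourier transform in place of an FFT, and the 0.75 to 4 Hz search band is an assumed plausible heart-rate range that does not come from the paper.

```python
import cmath
import math


def estimate_heart_rate_bpm(green_means, fps, low_hz=0.75, high_hz=4.0):
    """Return the dominant frequency of the signal in beats per minute.

    green_means: per-frame mean G value inside the region of interest.
    low_hz, high_hz: assumed search band (45 to 240 bpm), not from the paper.
    """
    n = len(green_means)
    mean = sum(green_means) / n
    centered = [g - mean for g in green_means]  # remove the DC component

    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fps / n
        if not (low_hz <= freq <= high_hz):
            continue
        # Naive DFT coefficient for bin k; an FFT would be used in practice.
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * i / n)
            for i, c in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0


# Synthetic 10 s clip at 30 fps: a 1.2 Hz (72 bpm) pulse on a constant level.
fps = 30
signal = [120 + 0.5 * math.sin(2 * math.pi * 1.2 * t / fps) for t in range(10 * fps)]
print(estimate_heart_rate_bpm(signal, fps))
```

The frequency resolution is `fps / n`, so a 10 s window at 30 fps resolves the heart rate in steps of 0.1 Hz (6 bpm); longer windows give finer estimates at the cost of slower response.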
2.2 Improved Logistic Kernel Function
The kernel function is the core of the support vector machine. Different kernel functions have different strengths and weaknesses, and the performance of the support vector machine varies accordingly. Some kernel functions are global and therefore generalize well; others have good learning and predictive ability. Generally speaking, a single kernel function may not offer both good learning and good generalization. Therefore, this paper combines the Logistic kernel function, which generalizes well, with the radial basis polynomial kernel function (RBPK), which has excellent learning and predictive ability, to construct a support vector machine with a hybrid kernel function. The improved support vector machine has robust learning and prediction ability as well as good generalization ability.
Logistic Kernel Function.
The expression of the Logistic function is:
The expression of the Logistic kernel function is:
As long as the kernel function satisfies the Mercer condition, the dot product operation in the high-dimensional space can be converted into the kernel function operation in the input space, thereby avoiding direct calculation in the high-dimensional space and solving the problem of high algorithm complexity.
Reference [22] proves that the Logistic kernel function is a valid kernel function for the support vector machine (Fig. 4).
Radial Basis Polynomial Kernel Function (RBPK).
The paper [23] defines a kernel function called Radial Basis Polynomial Kernel (RBPK):
RBPK combines two kernel functions, making full use of the good predictive ability of the polynomial kernel function and the learning ability of the RBF kernel function.
LRBPK Hybrid Kernel Function.
By analyzing the plots of the Logistic kernel function and the radial basis polynomial kernel function, we conclude that the Logistic kernel function generalizes well and that a support vector machine whose kernel function is RBPK learns and predicts well. Therefore, to obtain a support vector machine with robust learning, predictive, and generalization capabilities, this paper constructs the Logistic and Radial Basis Polynomial Kernel (LRBPK), a hybrid of the Logistic and RBPK kernel functions. LRBPK is used as the kernel function of the improved support vector machine in this system.
According to the lemma, if \({K}_{1}\) and \({K}_{2}\) are kernel functions on \(X \times X\) with \(X \subseteq {\mathbb{R}}^{n}\), and the constant \(a \ge 0\), then \(K\left(x,y\right)={K}_{1}\left(x,y\right)+{K}_{2}(x,y)\) and \(K\left(x,y\right)=a{K}_{1}(x,y)\) are also kernel functions.
Therefore, the LRBPK hybrid kernel function expression is as follows:
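The closure lemma can be illustrated with a small sketch. Because the Logistic and RBPK expressions are not reproduced here, the code below uses a Gaussian RBF kernel and a polynomial kernel as stand-ins; the mixing weight, kernel parameters, and feature vectors are all hypothetical, and the result is not the paper's LRBPK.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel, used here as a stand-in local kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def poly_kernel(x, y, degree=2, c=1.0):
    """Polynomial kernel, used here as a stand-in global kernel."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree


def hybrid_kernel(x, y, weight=0.6):
    """Non-negative weighted sum of two kernels, itself a kernel by the lemma."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * rbf_kernel(x, y) + (1.0 - weight) * poly_kernel(x, y)


def gram_matrix(points, kernel):
    """Matrix of pairwise kernel values for a list of feature vectors."""
    return [[kernel(p, q) for q in points] for p in points]


# Hypothetical (EAR, MAR, normalized heart rate) feature vectors.
samples = [(0.30, 0.20, 0.70), (0.12, 0.65, 0.55), (0.28, 0.25, 0.72)]
gram = gram_matrix(samples, hybrid_kernel)
for row in gram:
    print([round(v, 4) for v in row])
```

A Gram matrix built this way is symmetric and positive semidefinite, so it could be passed to an SVM implementation that accepts precomputed kernels, such as scikit-learn's `SVC(kernel="precomputed")`.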
3 Experiment and Result Analysis
3.1 Experimental Background
Dataset Description.
The dataset in this paper comprises the Driver Drowsiness Detection Dataset [24] and a dataset established by the authors. In the Driver Drowsiness Detection Dataset, subjects play a driving game to produce different states and, under the guidance of the experimenters, show a series of facial expressions. The total duration of this dataset is about nine and a half hours.
A self-built database was constructed using the experimental device described below. The OV5647 infrared camera captures the video stream as 320 × 240 color images at 30 frames per second. The total recording time is 5 h. The database covers 12 testers, six female and six male, aged between 18 and 40. The testers simulated everyday driving, yawning, squinting, and sleepiness, and were filmed from four different directions. They were also recorded without glasses, wearing black-rimmed glasses, and wearing sunglasses.
We collect data under different lighting conditions and from different angles to simulate real driving scenes. The lighting conditions are intense light, normal light, low light, and no light. The angles are front, left side, and right side. The camera's built-in infrared fill light produces a usable picture even when there is no light, although the picture differs from one taken under normal lighting (Fig. 5).
Description of Experimental Device.
This paper designs and builds a fatigue driving detection device composed of a Raspberry Pi 4 Model B, an OV5647 infrared camera, and various sensors to test actual driving situations. The device runs the fatigue driving detection algorithm proposed in this article and was used to record the self-built dataset (Fig. 6).
Raspberry Pi 4 Model B.
The Raspberry Pi 4 Model B selected in this article is a microcomputer motherboard that provides an open-source software architecture. It has a quad-core ARM processor clocked at 1.5 GHz and 4 GB of memory, which is sufficient to run the algorithm model proposed in this article. Its low price makes it easy to mass-produce similar low-cost devices. Its 40-pin GPIO header can connect a variety of sensors to acquire various data. Its built-in Wi-Fi module can transmit data to a server during the experiment, reducing the local storage and cost required for fatigue driving data.
Infrared Camera.
This article uses an infrared camera with the OV5647 sensor chip. It has a 160-degree viewing angle, which captures a wider scene, and an adjustable focus. Equipped with an infrared fill light that senses ambient light, the camera reaches a visual distance of 2 m at night. It adapts well to the in-car environment, including the low-light and no-light conditions that are common when driving (Fig. 7).
Other Modules.
Fatigue driving detection in real scenes must take many factors into account. Because fatigue needs to be detected only while the vehicle is moving, a Weixue GNSS module is installed for speed measurement using GPS, the BeiDou Navigation Satellite System (BDS), and QZSS. A DHT11 sensor is installed for temperature and humidity detection. To display data intuitively, a 0.96-inch OLED screen is also added.
3.2 Experimental Process
First, this article extracts clips from the dataset. A total of 240 video clips of 30 s each are extracted. By subjective judgment, they are labeled as 113 fatigue clips and 127 non-fatigue clips, with category labels (0, 1). From these, 95 fatigue and 95 non-fatigue clips are randomly sampled so that the four different angles and the three glasses-wearing conditions are included. The sampled clips are divided equally into a training set and a test set (Fig. 8).
A total of 1800 pictures were captured from the training and test clips and subjectively classified, with 900 pictures each in the training set and the test set. Because the dataset is small, this article applies data augmentation, using OpenCV to batch flip, adjust brightness, blur, and otherwise process the pictures, yielding 5400 pictures (Fig. 9).
The gradient descent tree algorithm of cascaded regression detects the face in each picture, obtains the 68 facial feature points, and extracts the EAR and MAR feature values. The heart rate feature value is extracted from each video clip and matched to the corresponding pictures. The three sets of feature values from the training set are then provided to the SVM classifier for training.
The trained parameters are then used to evaluate the test set. The test images were divided into four groups of 675 pictures each. To balance the false alarm rate and the missed alarm rate, this article defines the accuracy as \(Total\;accuracy = 1 - \left( {false\;alarm\;rate + missed\;alarm\;rate} \right)\) (Table 2).
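A worked example of the accuracy definition is sketched below. It reads the second term as the missed alarm (false negative) rate and assumes both rates are fractions of all test images in a group; the counts are hypothetical and are not results from this paper.

```python
def total_accuracy(false_alarms, misses, total):
    """Total accuracy = 1 - (false alarm rate + missed alarm rate).

    Assumes both rates are expressed as fractions of all test images.
    """
    false_alarm_rate = false_alarms / total
    miss_rate = misses / total
    return 1.0 - (false_alarm_rate + miss_rate)


# Hypothetical counts for one test group of 675 images.
print(f"{total_accuracy(false_alarms=10, misses=11, total=675):.4f}")
```

Under this reading the measure coincides with ordinary classification accuracy, since every misclassified image is either a false alarm or a miss.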
The SVM with the improved kernel function proposed in this paper achieves a better false positive rate and effectively reduces the false negative rate.
Because judgments on single pictures do not directly reflect performance in an actual driving environment, this article uses video clips for a continuous monitoring test. Since the video has a rate of 30 frames per second, this paper sets 0.5 s as the interval between fatigue driving judgments. We also made related calculations based on the blinking frequency of human eyes. According to Sakai's research [25], the average number of blinks per minute (Nob) is about 25, and the blink time (BT) is about 0.2 s. From these values, one can estimate the probability that a judgment recognizes closed eyes merely because of a blink, as well as the probability that this happens five consecutive times.
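One way to work out these two probabilities from the figures quoted from [25] is sketched below. It treats the chance of sampling a blink as the fraction of time the eyes are closed, and it assumes successive judgments are independent; both are simplifying assumptions, and the paper's own omitted formulas may differ.

```python
NOB_PER_MINUTE = 25  # average number of blinks per minute, from [25]
BLINK_TIME_S = 0.2   # duration of one blink in seconds, from [25]

# Fraction of each minute during which the eyes are closed by blinking.
p_blink = NOB_PER_MINUTE * BLINK_TIME_S / 60.0

# Chance that five judgments in a row all coincide with a blink,
# assuming the judgments are independent.
p_five = p_blink ** 5

print(f"{p_blink:.4f}")
print(f"{p_five:.2e}")
```

With these inputs the eyes are closed for 5 s out of every 60 s, so a single judgment coincides with a blink roughly one time in twelve, while five consecutive coincidences are several orders of magnitude less likely.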
Studies have shown that people blink less frequently when focusing on driving [26], so the probability of five consecutive closed-eye judgments caused by blinking is even lower. In many comparative experiments, this article also found that requiring the fatigue judgment to occur five times in a row gives better accuracy and a lower miss rate. Finally, the method is tested on the 95 non-fatigue video clips of the test set and compared with other fatigue driving detection methods (Table 3).
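The five-in-a-row rule can be sketched as a simple run counter. The function name and the boolean per-judgment input are hypothetical; the sketch assumes one classifier decision every 0.5 s, as described above.

```python
def fatigue_alarm(judgments, required_run=5):
    """Return True once `required_run` consecutive judgments are flagged.

    judgments: iterable of booleans, one per 0.5 s judgment interval,
    where True means the classifier labeled that interval as fatigued.
    """
    run = 0
    for flagged in judgments:
        run = run + 1 if flagged else 0
        if run >= required_run:
            return True
    return False


# Isolated flagged judgments (for example, blinks) do not raise the alarm.
print(fatigue_alarm([False, True, False, False, True, False]))
# Five consecutive flagged judgments (2.5 s) do.
print(fatigue_alarm([False, True, True, True, True, True, False]))
```

Resetting the counter on every unflagged judgment is what filters out ordinary blinks, at the cost of a detection delay of at least 2.5 s.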
The test shows that the accuracy of this method reaches 98.95%, which is 1.06 percentage points higher than that of the fatigue driving detection method based on the YCbCr color space. In actual driving, this article used the device described above to conduct multiple tests. Compared with similar commercial products, our fatigue driving detection has a lower false alarm rate and a lower missed alarm rate (Fig. 10).
4 Conclusion and Future Directions
This paper proposes a multi-dimensional fatigue driving detection system based on a support vector machine with an improved kernel function, using the EAR, MAR, and heart rate extracted from the face. A new fatigue driving detection framework is constructed by improving the kernel function of the support vector machine. A new dataset is built and combined with a public dataset for training and testing, and a better recognition rate is obtained. Compared with single-feature detection and with detection using a classic support vector machine, the method achieves an absolute improvement in accuracy. However, heart rate acquisition remains limited by the lack of a dedicated camera. In the future, we will address this shortcoming and use better prediction methods. The authors will collect more data and strive to build a complete fatigue driving detection system. The streamlined system can be deployed on mid-range IoT devices. More features, such as body pressure, steering wheel behavior, and head tracking, can be combined to develop a more accurate fatigue driving detection system. Integration with L1-L3 automated driving systems is also a direction for follow-up research.
References
1. Liu, B., Wang, L., Liu, M.: Lifelong federated reinforcement learning: a learning architecture for navigation in cloud robotic systems. IEEE Robot. Autom. Lett. 4(4), 4555–4562 (2019)
2. Li, L., Xie, M., Dong, H.: A method of driving fatigue detection based on eye location. In: 2011 IEEE 3rd International Conference on Communication Software and Networks, pp. 480–484. IEEE (2011)
3. Shuze, G.: Research on driving fatigue detection system based on ARM platform. Ph.D. thesis (2016)
4. Sahayadhas, A., Sundaraj, K., Murugappan, M.: Detecting driver drowsiness based on sensors: a review. Sensors 12(12), 16937–16953 (2012)
5. Buchheim, C., Hubner, R., Schobel, A.: Ellipsoid bounds for convex quadratic integer programming. SIAM J. Optim. 25(2), 741–769 (2015)
6. Tao, C.: Newton-CG augmented Lagrangian algorithm for convex quadratic constrained quadratic semidefinite programming. Ph.D. thesis, Beijing University of Technology (2012)
7. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
8. Ghaddar, B., Naoum-Sawaya, J.: High dimensional data classification and feature selection using support vector machines. Eur. J. Oper. Res. 265(3), 993–1004 (2018)
9. Xia, M.: Research on product quality prediction system based on improved support vector machine. General Institute of Mechanical Science Research (2020)
10. Zhang, Q.: Method research of multi-core support vector machines. Ph.D. thesis, Beijing Jianzhu University (2020)
11. Tanveer, M.: Robust and sparse linear programming twin support vector machines. Cogn. Comput. 7(1), 137–149 (2015)
12. Sideratos, G., Hatziargyriou, N.D.: Probabilistic wind power forecasting using radial basis function neural networks. IEEE Trans. Power Syst. 27(4), 1788–1796 (2012)
13. Moghaddam, V.H., Hamidzadeh, J.: New Hermite orthogonal polynomial kernel and combined kernels in support vector machine classifier. Pattern Recognit. 60, 921–935 (2016)
14. Liu, B., et al.: A real-time contribution measurement method for participants in federated learning. arXiv preprint arXiv:2009.03510 (2020)
15. Liu, B., Wang, L., Chen, X., Huang, L., Xu, C.Z.: Peer-assisted robotic learning: a data-driven collaborative learning approach for cloud robotic systems. arXiv preprint arXiv:2010.08303 (2020)
16. Cheng, J., Zheng, J., Yu, X.: An ensemble framework for interpretable malicious code detection. Int. J. Intell. Syst. (2020)
17. Liu, J., et al.: A novel robust watermarking algorithm for encrypted medical image based on DTCWT-DCT and chaotic map. Comput. Mater. Con. 61(2), 889–910 (2019)
18. Tang, X., Wang, L., Cheng, J., Chen, J.: Forecasting model based on information-granulated GA-SVR and ARIMA for producer price index. arXiv preprint arXiv:1903.12012 (2019)
19. Kazemi, V., Sullivan, J.: One millisecond face alignment with an ensemble of regression trees. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1867–1874 (2014)
20. Soukupova, T., Cech, J.: Eye blink detection using facial landmarks. In: 21st Computer Vision Winter Workshop, Rimske Toplice, Slovenia (2016)
21. Wan, Z.: Research on heart rate detection based on face video images. Ph.D. thesis (2014)
22. Yang, X.: A support vector machine-based image multi-feature fatigue driving detection method. Ph.D. thesis, Xi’an University of Technology
23. Bhavsar, M.H., Ganatra, A.: Radial basis polynomial kernel (RBPK): a generalized kernel for support vector machine. Int. J. Comput. Sci. Inf. Secur. (IJCSIS) 14(4) (2016)
24. Weng, C.-H., Lai, Y.-H., Lai, S.-H.: Driver drowsiness detection via a hierarchical temporal deep belief network. In: Chen, C.-S., Lu, J., Ma, K.-K. (eds.) ACCV 2016. LNCS, vol. 10118, pp. 117–133. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54526-4_9
25. Sakai, T., et al.: EDA-based estimation of visual attention by observation of eye blink frequency. Int. J. Smart Sens. Intell. Syst. 10(2), 296–307 (2017)
26. Ingre, M., Åkerstedt, T., Peters, B., Anund, A., Kecklund, G.: Subjective sleepiness, simulated driving performance and blink duration: examining individual differences. J. Sleep Res. 15(1), 47–53 (2006)
27. Liu, C., Zhang, X.: Research on fatigue driving warning based on image processing. Appl. Electron. Tech. (8) (2019)
28. Wenteng, K., Kuancheng, M., Jiacai, H., Haibin, L.: Fatigue driving detection based on Gaussian white eye model. Chin. J. Image Graph. 21(011), 1515–1522 (2016)
29. Xu, Z., He, F., Hua, X., Li, J.: Research on fatigue driving detection system based on AdaBoost algorithm. Automob. Technol. (005), 17–21 (2019)
Acknowledgement
This work was supported by the Hainan Provincial Natural Science Foundation of China (Grant No. 2019RC041 and 2019RC098), Research and Application Project of Key Technologies for Blockchain Cross-chain Collaborative Monitoring and Traceability for Large-scale Distributed Denial of Service Attacks, National Natural Science Foundation of China (Grant No. 61762033), Opening Project of Shanghai Trusted Industrial Control Platform (Grant No. TICPSH202003005-ZC), and Education and Teaching Reform Research Project of Hainan University (Grant No. hdjy1970).
© 2021 Springer Nature Switzerland AG
Cite this paper
Sun, Y. et al. (2021). Multi-dimensional Fatigue Driving Detection Method Based on SVM Improved by the Kernel Function. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Advances in Artificial Intelligence and Security. ICAIS 2021. Communications in Computer and Information Science, vol 1423. Springer, Cham. https://doi.org/10.1007/978-3-030-78618-2_3
DOI: https://doi.org/10.1007/978-3-030-78618-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78617-5
Online ISBN: 978-3-030-78618-2
eBook Packages: Computer Science, Computer Science (R0)