1 Introduction

Statistics show that 57% of traffic accidents are related to fatigue driving [1]. The study of driving fatigue monitoring methods is therefore of great significance for reducing traffic accidents and improving road traffic safety. In recent years, scholars at home and abroad have proposed driving fatigue monitoring methods from different research perspectives. Each of these methods offers its own way of describing and evaluating the driver’s fatigue state, but each also has shortcomings. Using physiological signals as the fatigue evaluation index [2, 3] requires the measuring instrument to be in direct contact with the driver, which can interfere with driving behavior and affect the judgement of the fatigue state [4, 5]. Choosing physiological response characteristics as the evaluation index avoids the interference of direct contact, but the instruments used can still have a negative impact on the driver’s psychological mood, which is not conducive to an accurate evaluation of the driver’s fatigue state.

This paper proposes a driving fatigue detection method based on the characteristics of the steering wheel angle. The steering wheel angle characteristics are chosen as the basis for fatigue evaluation, and collecting them has no influence on the driver’s driving behavior or psychological state. The detection model is established with support vector machines, and the cross-validation method is introduced to optimize the parameters, which effectively avoids the shortcomings of the above methods.

2 Basic Theory Introduction

2.1 Support Vector Machines—Introduction

The support vector machine (SVM) was first proposed by Vapnik in the 1990s and is based on statistical learning theory [6]. The algorithm implements the idea of structural risk minimization: the input vectors of the samples are mapped to a high-dimensional feature space by a kernel function, so that an optimal classification surface can be constructed in that feature space and the samples can be classified linearly [6, 7]. SVM theory has become a research hot spot and is widely used in engineering [8, 9]. The principles and steps of solving the linearly separable problem with SVM are as follows:

(1) Assume a known training set \(T = \left\{ {\left( {x_{1} ,y_{1} } \right), \cdots ,\left( {x_{l} ,y_{l} } \right)} \right\} \in \left( {X \times Y} \right)^{l}\), where \(x_{i} \in X = R^{n}\), \(y_{i} \in Y = \left\{ {1, - 1} \right\}\) \(\left( {i = 1,2, \ldots ,l} \right)\), and \(x_{i}\) is the eigenvector. Select a kernel function \(K\left( {x,x^{\prime}} \right)\) and an appropriate parameter \(C\), then construct and solve the optimization problem

$$\min_{\alpha} \; \frac{1}{2}\sum\limits_{i = 1}^{l} {\sum\limits_{j = 1}^{l} {y_{i} y_{j} \alpha_{i} \alpha_{j} K\left( {x_{i} ,x_{j} } \right)} } - \sum\limits_{j = 1}^{l} {\alpha_{j} }$$

$$s.t.\quad \sum\limits_{i = 1}^{l} {y_{i} \alpha_{i} = 0} ,\quad 0 \le \alpha_{i} \le C\left( {i = 1, \ldots ,l} \right)$$

to obtain the optimal solution \(\alpha^{ * } = \left( {\alpha_{1}^{ * } , \ldots ,\alpha_{l}^{ * } } \right)^{T}\).

(2) Select a positive component \(\alpha_{j}^{ * }\) of \(\alpha^{ * }\), and calculate the threshold: \(b^{ * } = y_{j} - \sum\limits_{i = 1}^{l} {y_{i} \alpha_{i}^{ * } K\left( {x_{i} ,x_{j} } \right)}\)

(3) Construct the decision function: \(f\left( x \right) = \text{sgn} \left( {\sum\limits_{i = 1}^{l} {\alpha_{i}^{ * } y_{i} K\left( {x,x_{i} } \right)} + b^{ * } } \right)\)

For nonlinearly separable problems, this paper uses the Gaussian radial basis function as the kernel: \(K\left( {x,x_{i} } \right) = \exp \left( { - \frac{{\left\| {x - x_{i} } \right\|^{2} }}{{2\sigma^{2} }}} \right)\).
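The threshold, decision function, and Gaussian kernel above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's MATLAB implementation: the function names, the toy points, and the multipliers are hypothetical, and the multipliers are simply assumed rather than obtained by solving the optimization problem.

```python
import math

def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def threshold(j, train_x, train_y, alphas, sigma):
    """b = y_j - sum_i y_i * alpha_i * K(x_i, x_j) for a chosen index j."""
    return train_y[j] - sum(
        y * a * rbf_kernel(xi, train_x[j], sigma)
        for xi, y, a in zip(train_x, train_y, alphas)
    )

def decision_function(x, train_x, train_y, alphas, b, sigma):
    """f(x) = sgn(sum_i alpha_i * y_i * K(x, x_i) + b), returned as +1 or -1."""
    score = sum(
        a * y * rbf_kernel(x, xi, sigma)
        for xi, y, a in zip(train_x, train_y, alphas)
    ) + b
    return 1 if score >= 0 else -1

# Hypothetical toy values: one training point per class.
support_x = [[0.0, 0.0], [2.0, 2.0]]
support_y = [1, -1]
alphas = [1.0, 1.0]  # assumed multipliers; they satisfy sum_i y_i * alpha_i = 0
b = threshold(0, support_x, support_y, alphas, sigma=1.0)

print(decision_function([0.1, 0.2], support_x, support_y, alphas, b, sigma=1.0))
print(decision_function([1.9, 2.1], support_x, support_y, alphas, b, sigma=1.0))
```

A point close to the first training sample is assigned to class +1 and a point close to the second to class -1, because the kernel value decays quickly with distance.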

2.2 Cross-validation Method Introduction

Cross-validation (CV) is a statistical analysis method used to verify the performance of classifiers [10, 11]. Three cross-validation methods are commonly used: the hold-out method, K-fold cross-validation (K-CV), and leave-one-out cross-validation (LOO-CV). In this paper, the K-CV method is used to select the best parameters of the support vector machine, so as to improve the recognition rate of the recognition model.

In the K-CV method, the original data are divided into K groups (usually of equal size). Each group is used once as the validation set while the remaining K − 1 groups form the training set, so K models are obtained. The average classification accuracy of the K models on their validation sets is used as the performance index of the classifier under K-CV. K must be at least 2; in practice, K usually starts from 3, and K = 2 is used only when the amount of original data is small. The K-CV method effectively avoids both over-learning (overfitting) and under-learning (underfitting), so its results are more reasonable.
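The K-CV procedure can be sketched as follows. This is an illustrative Python outline under stated assumptions: `train_fn` and `predict_fn` are hypothetical placeholders for any classifier, and the folds are consecutive blocks, so the data are assumed to have been shuffled beforehand.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k consecutive folds of nearly equal size."""
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def k_fold_accuracy(samples, labels, k, train_fn, predict_fn):
    """Average validation accuracy over k folds.

    train_fn(train_x, train_y) -> model and predict_fn(model, x) -> label
    are supplied by the caller (placeholders for any classifier).
    """
    accuracies = []
    for held_out in k_fold_indices(len(samples), k):
        held = set(held_out)
        train_x = [samples[i] for i in range(len(samples)) if i not in held]
        train_y = [labels[i] for i in range(len(samples)) if i not in held]
        model = train_fn(train_x, train_y)
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

# Ten samples split into three folds of sizes 4, 3, and 3.
print([len(fold) for fold in k_fold_indices(10, 3)])
```

Each sample is used for validation exactly once, which is what distinguishes K-CV from a single hold-out split.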

3 Driving Fatigue Detection Model Establishment Based on SVM

In this paper, the driving fatigue detection model uses SVM as the classifier, the steering wheel angle characteristic as the characteristic vector (input vector), and the driver’s fatigue state as the output vector.

3.1 Classification of Driving Fatigue Status

The driver’s subjective fatigue experience can provide a reference for classifying driving fatigue in the test. This paper uses the Stanford sleepiness scale [12] to divide driving fatigue into three states: the normal driving state, the quasi-fatigue driving state, and the fatigue driving state. Table 1 shows the contents of the Stanford sleepiness scale. In the table, levels 1–2 represent the normal driving state, levels 3–4 the quasi-fatigue driving state, and levels 5–7 the fatigue driving state. The testers, two drivers and two testing engineers, agreed to take part in the test and to the publication of the test data.

Table 1 Stanford sleepiness scale
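The mapping from scale level to fatigue state described above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and the returned codes 1, 2, and 3 follow the class labels used for the output vector later in the paper.

```python
def fatigue_state(sss_level):
    """Map a Stanford sleepiness scale level (1-7) to a fatigue state code.

    1 = normal driving, 2 = quasi-fatigue driving, 3 = fatigue driving.
    """
    if not 1 <= sss_level <= 7:
        raise ValueError("Stanford sleepiness scale levels run from 1 to 7")
    if sss_level <= 2:
        return 1
    if sss_level <= 4:
        return 2
    return 3

print([fatigue_state(level) for level in range(1, 8)])  # [1, 1, 2, 2, 3, 3, 3]
```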

3.2 Steering Wheel Angle Analysis Based on Timing

Figure 1 shows how the steering wheel angle data from the actual road test change over time. Fig. 1a represents the normal driving state, Fig. 1b the quasi-fatigue driving state, and Fig. 1c the fatigue driving state. It is difficult to observe the internal relationship between steering wheel angle and driving fatigue from these raw waveforms alone. Therefore, information processing techniques are needed to extract features that reveal the underlying changes.

Fig. 1 Steering wheel angle timing diagram of three driving fatigue states

In this paper, an autoregressive (AR) model is used for the timing analysis of the steering wheel angle in the three states. The fourteen AR model parameters are used as the eigenvector of the steering wheel angle in the normal, quasi-fatigue, and fatigue driving states, and this eigenvector serves as the mode vector for driving fatigue detection based on the steering wheel angle characteristic. It can be expressed as \(X_{i} = \left\{ {\varphi_{i1} ,\varphi_{i2} , \ldots ,\varphi_{i14} } \right\}\), where \(X_{i}\) is the eigenvector of the \(i\)th signal and \(\varphi_{ij}\) \(\left( {j = 1,2, \ldots ,14} \right)\) is the \(j\)th parameter of that eigenvector. The AR model order of 14 is determined by the final prediction error (FPE) criterion [13].
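The paper does not state how the AR parameters were estimated, so the following Python sketch makes its own assumptions: it fits the model by the Yule-Walker equations with the Levinson-Durbin recursion, scores each order with FPE(p) = s_p^2 (N + p) / (N − p), where s_p^2 is the estimated noise variance, and uses a synthetic signal in place of real steering wheel angle data. All function names are hypothetical.

```python
import random

def autocovariance(x, max_lag):
    """Biased sample autocovariance r[0..max_lag] of the mean-removed series."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [sum(centered[t] * centered[t + lag] for t in range(n - lag)) / n
            for lag in range(max_lag + 1)]

def levinson_durbin(r, max_order):
    """Solve the Yule-Walker equations for orders 1..max_order.

    Returns one (phi, noise_variance) pair per order for the model
    x[t] = phi[0]*x[t-1] + ... + phi[p-1]*x[t-p] + e[t].
    """
    phi, err, results = [], r[0], []
    for k in range(1, max_order + 1):
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        reflection = acc / err
        # Update the lower-order coefficients, then append the new one.
        phi = [phi[j] - reflection * phi[k - 2 - j] for j in range(k - 1)] + [reflection]
        err *= 1.0 - reflection ** 2
        results.append((list(phi), err))
    return results

def ar_features(signal, order):
    """AR coefficients of the given order, used as the feature vector."""
    return levinson_durbin(autocovariance(signal, order), order)[-1][0]

def fpe_order(signal, max_order):
    """Order in 1..max_order with the smallest final prediction error."""
    n = len(signal)
    fits = levinson_durbin(autocovariance(signal, max_order), max_order)
    scores = [(var * (n + p) / (n - p), p)
              for p, (_, var) in enumerate(fits, start=1)]
    return min(scores)[1]

# Synthetic stand-in for one steering wheel angle record (not real data).
rng = random.Random(0)
signal = [0.0, 0.0]
for _ in range(500):
    signal.append(0.6 * signal[-1] - 0.3 * signal[-2] + rng.gauss(0.0, 1.0))

features = ar_features(signal, order=14)
print(len(features))                    # 14 coefficients per record
print(fpe_order(signal, max_order=20))  # order suggested by FPE for this signal
```

Applying `ar_features` with order 14 to each record yields one fourteen-dimensional vector per signal, which matches the shape of the eigenvector described above.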

3.3 The Driving Fatigue Detection Model Establishment

3.3.1 Design of Input and Output Vectors

The fourteen-dimensional eigenvector of the steering wheel angle extracted by the AR model is used as the input vector of the detection model. The driving fatigue state class is used as the output vector: the normal driving state is labeled 1, the quasi-fatigue driving state 2, and the fatigue driving state 3.

Because the fourteen-dimensional eigenvector has a high dimension and a large data volume, this paper uses the PCA method [14] to reduce the dimension of the sample data, eliminate these adverse effects, and improve the accuracy of the model. The percentage of principal components retained was determined by simulation experiments. The process flowchart is shown in Fig. 2.

Fig. 2 PCA dimensionality reduction flowchart

3.3.2 Selection and Optimization of Classifiers

As mentioned above, this paper uses SVM as the classifier of the detection model. Different kernel functions yield different SVM algorithms [6]. The most common kernel functions in practical problems are the linear kernel, the polynomial kernel, the Gaussian radial basis function, and the sigmoid kernel. The kernel function used in this paper was determined by the results of simulation experiments.

As mentioned above, this paper also uses the cross-validation method to optimize the SVM. The main steps of optimal parameter selection based on cross-validation are as follows: (1) the penalty parameter \(c\) and the kernel function parameter \(g\) take values within a given range; (2) for each pair of \(c\) and \(g\), the sample set is used as the original data set and the K-CV method described above gives the classification accuracy of the classifier, that is, its accuracy in the CV sense; (3) the pair of \(c\) and \(g\) with the highest classification accuracy is taken as the best parameters.
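The three selection steps can be outlined in Python as below. This is a sketch under stated assumptions: `cv_accuracy` is a hypothetical callback standing in for a real K-CV run of the SVM, `toy_accuracy` is an invented scoring function used only to make the example runnable, and ties are resolved by keeping the first pair encountered.

```python
import math

def grid_search(c_values, g_values, cv_accuracy):
    """Return (best_accuracy, best_c, best_g) over all (c, g) pairs.

    cv_accuracy(c, g) must return the K-CV accuracy of the classifier
    trained with penalty parameter c and kernel parameter g.
    """
    best = None
    for c in c_values:
        for g in g_values:
            accuracy = cv_accuracy(c, g)
            if best is None or accuracy > best[0]:  # strict '>' keeps the first tie
                best = (accuracy, c, g)
    return best

def toy_accuracy(c, g):
    """Invented stand-in for a K-CV run; it peaks at c = 4 and g = 0.5."""
    return 1.0 / (1.0 + abs(math.log2(c) - 2.0) + abs(math.log2(g) + 1.0))

c_values = [2.0 ** e for e in range(-2, 5)]
g_values = [2.0 ** e for e in range(-2, 5)]
print(grid_search(c_values, g_values, toy_accuracy))  # (1.0, 4.0, 0.5)
```

In a real run, `cv_accuracy` would train and validate the SVM K times per pair, so the cost grows with the number of grid points times K.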

3.3.3 Establishment of SVM-Based Model

In summary, building the SVM-based driving fatigue model involves the following steps: ① design of input and output vectors; ② sample collection and preparation, kernel function selection, and dimension reduction; ③ SVM parameter optimization based on the K-CV method; ④ training the SVM, identifying the test set, computing the recognition accuracy, and choosing the best parameters. The modeling process flowchart is shown in Fig. 3.

Fig. 3 SVM-based model flowchart

4 Application Examples

4.1 Experimental Design

The testers included a driver and three assistants. The driver had more than 10 years of driving experience and was in good physical condition. The experiment ran from 9:00 to 12:00 and from 13:00 to 17:00, with a 1 h interval in between. The test was divided into groups A and B. From 12:00 to 13:00, the driver in group A was allowed to sleep and rest, whereas the driver in group B was not allowed to sleep.

4.2 Collection and Preprocessing of Sample Data

Using the above timing analysis method, 60 sets of steering wheel angle signals were extracted for each of the three states, and each feature vector together with its corresponding state number was taken as a sample of the model, giving 180 groups in total. Following the general principle for dividing samples (two-thirds of the data as training samples and one-third as test samples), 120 groups of training samples and 60 groups of test samples were randomly selected. Part of the sample data is shown in Table 2.

Table 2 Example of sample data
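The random division of the 180 samples into 120 training and 60 test samples can be sketched as follows. The sample tuples, the function name, and the seed are hypothetical, and the split is a plain shuffle; whether the original selection was balanced across the three states is not stated in the paper.

```python
import random

def split_samples(samples, n_train, seed):
    """Shuffle a copy of the samples and split it into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder samples: (state label, record index), 60 records per state.
samples = [(state, index) for state in (1, 2, 3) for index in range(60)]
train, test = split_samples(samples, n_train=120, seed=0)
print(len(train), len(test))  # 120 60
```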

The preprocessing of the samples described above refers to dimension reduction with the PCA method. Principal component percentages from 85% to 99% were tested. When the percentage was 95%, the accuracy of the model reached 70.00%. The cumulative contribution for the 95% principal component percentage is shown in Fig. 4.

Fig. 4 Cumulative contribution of 95% of the principal component percentage
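Choosing how many principal components to keep for a given percentage comes down to a cumulative sum over the eigenvalues of the sample covariance matrix. The Python sketch below shows only that selection step; the eigenvalues are hypothetical and are not taken from the experiment.

```python
def components_for_contribution(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative contribution
    (sum of kept eigenvalues / sum of all eigenvalues) reaches the threshold.

    Returns (count, cumulative_contribution).
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count, cumulative / total
    return len(ordered), 1.0

# Hypothetical eigenvalues of a covariance matrix (not the paper's data).
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.25, 0.25]
for threshold in (0.85, 0.95, 0.99):
    print(threshold, components_for_contribution(eigenvalues, threshold))
```

Raising the threshold keeps more components, which trades a lower input dimension against a more faithful representation of the original feature vectors.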

4.3 Simulation Results and Analysis

Based on the above method and data, the SVM classifier was implemented with a support vector machine toolbox in MATLAB, and the Gaussian radial basis function was selected as the kernel through simulation on the samples. A grid search was used to select the penalty parameter \(c\) and the kernel function parameter \(g\) of the support vector machine, with K = 5 for cross-validation, meaning that the data used for cross-validation were divided into five parts. The best penalty parameter \(c\) was 64.00, the best kernel function parameter \(g\) was 0.0625, and the resulting classification accuracy in the CV sense was 69.17%. The result of the parameter selection is shown in Fig. 5.

Fig. 5 3D view of parameter selection results

In the figure, the \(x\)-axis is \(\log_{2} c\) and the \(y\)-axis is \(\log_{2} g\); the contours of the three-dimensional graph show the K-CV accuracy obtained with the corresponding \(c\) and \(g\). The values of \(c\) and \(g\) range from \(2^{-8}\) to \(2^{8}\), and the grid step is 0.5 in the exponent, which means that \(c\) and \(g\) take the values \(2^{-8}, 2^{-7.5}, \ldots, 2^{8}\).

The driving fatigue detection model was simulated with the best parameter combination, and the detection results for the driver’s fatigue state are shown in Table 3. The detection rate was 85.00% for the normal driving state, 100.00% for the quasi-fatigue driving state, and 80.00% for the fatigue driving state, and the total detection rate for driving fatigue was 90.00%.
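As a quick check of the grid described above, the candidate values can be generated directly. The helper name in this Python sketch is hypothetical. It confirms that each parameter takes 33 values and that the reported optimum, 64 = 2^6 and 0.0625 = 2^-4, lies on the grid.

```python
def power_of_two_grid(low_exp, high_exp, step):
    """Values 2**e for e = low_exp, low_exp + step, ..., high_exp."""
    count = int(round((high_exp - low_exp) / step)) + 1
    return [2.0 ** (low_exp + i * step) for i in range(count)]

grid = power_of_two_grid(-8, 8, 0.5)
print(len(grid))                     # 33 candidate values per parameter
print(len(grid) ** 2)                # 1089 (c, g) pairs in total
print(64.0 in grid, 0.0625 in grid)  # True True
```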

Table 3 Test results of driving fatigue
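Per-state and total detection rates of the kind reported here can be computed from true and predicted labels as sketched below in Python. The label lists are invented toy data, not the experiment's results, and the function name is hypothetical.

```python
def detection_rates(true_labels, predicted_labels):
    """Return ({state: detection rate}, total detection rate)."""
    per_state = {}
    for state in sorted(set(true_labels)):
        indices = [i for i, t in enumerate(true_labels) if t == state]
        hits = sum(predicted_labels[i] == state for i in indices)
        per_state[state] = hits / len(indices)
    total = sum(t == p for t, p in zip(true_labels, predicted_labels)) / len(true_labels)
    return per_state, total

# Toy labels for illustration only (1 = normal, 2 = quasi-fatigue, 3 = fatigue).
true_labels = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
predicted = [1, 1, 1, 2, 2, 2, 2, 2, 3, 2]
per_state, total = detection_rates(true_labels, predicted)
print(per_state)  # {1: 0.75, 2: 1.0, 3: 0.5}
print(total)      # 0.8
```

The total rate is the per-state rates weighted by the number of test samples in each state, so it depends on how the test set is distributed across the three states.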

5 Summary

Aiming at the problem of driving fatigue detection, this paper applies the support vector machine to detect the driving fatigue state. First, the steering wheel angle eigenvector obtained by timing analysis was proposed as the input vector, and the driving fatigue state was divided into three classes according to the Stanford sleepiness scale, which served as the output vector. Second, the program was written in MATLAB with a support vector machine toolbox; after reasonable selection of the kernel function and the degree of dimension reduction, the best parameters of the support vector machine were chosen by cross-validation, and the model was then simulated with the best parameter combination. The results show that the detection model has a high detection rate, which indicates that it can effectively detect driver fatigue and has good prospects for application and promotion.