1 Introduction

In recent years, a growing number of researchers have worked on fatigue driving recognition, and many important results have been achieved, such as lane-tracking alarm systems [1], EEG-based methods [2], and heart rate detectors [3]. Although these methods can detect driver fatigue to a certain extent, their performance varies. They can be divided into three categories.

The first category detects fatigue by analyzing the vehicle status. For example, American Ellison Research Labs monitored the lane track to detect fatigue in 2004, while other researchers used other vehicle status signals such as the running speed and the turning of the steering wheel. However, it is hard to derive from these signals a standard for judging whether the driver is dozing.

The second category is based on physiological properties such as EEG and heart rate. According to research by the Japan Pioneer Company, the heart rate slows down when drivers are drowsy, so physiological characteristics such as heart rate and EEG can provide accurate detection. However, these techniques are usually more expensive and intrusive, so drivers do not accept them easily. In general, a driver fatigue monitoring system should take into account driver acceptance, timeliness, reliability, scalability, and cost.

The third category uses digital image processing to detect fatigue in time and then prevent it. With the development of digital image processing and computer technology, more mature techniques are available for analyzing videos and images. Fatigue recognition based on facial features has the following advantages: real-time operation, reliability, and little interference with the driver. It is therefore a trend, and many researchers have proposed different methods in this area [4, 5]. The eye carries a lot of information, so the analysis of the human eye has become a very active research topic.

The paper is organized as follows. The system design is presented in Sect. 2. Section 3 describes the design of the fatigue detection algorithm. Section 4 reports the experimental results. The paper is concluded in Sect. 5.

2 System Design

The driver fatigue monitoring system consists of pre-treatment (video acquisition and image preprocessing), face detection, eye detection, eye feature extraction, and driving fatigue recognition. The system framework is shown in Fig. 1, and each component is described in detail in the following sections.

Fig. 1. The system framework

3 Fatigue Detection Algorithm Design

The driver fatigue monitoring system detects driver drowsiness based on eye features. Images are taken from the video, and the face is detected using skin color. Skin-color-based face detection has many advantages, but it is influenced by many factors (lighting, clothes similar to skin color, etc.), so it is necessary to choose the YCbCr color space and add restricting conditions. The gray-scale projection method is then combined with the circular Hough transform to extract the eye region accurately. Finally, driving fatigue is recognized from the eye state.

3.1 Face Detection

Face detection must be done before eye detection, so it is the first step. Face detection methods currently in use include Eigenface [6], neural networks [7], the Gabor transform [8], and skin color [9–11]. Compared with the other methods mentioned above, the skin-color method can extract the face from a complex background quickly and accurately. It achieves good real-time performance and is highly practical.

Establishing a skin-color model requires choosing an appropriate color space. RGB, NTSC, YCbCr, and HSV are commonly used color spaces. The YCbCr color space divides the RGB color space into three components: a brightness component (Y) and two color components (Cb and Cr). The YCbCr model is commonly used in digital color video. In this color space, Y carries the brightness information, and the chromaticity information is stored in Cb and Cr. Cb represents the blue component relative to a reference value, Cr represents the red component relative to a reference value, and the two are independent of each other. YCbCr data can be represented in double precision. Because the brightness and color components are separated in the YCbCr color space, the effect of lighting is reduced. Cb and Cr contain the chrominance information and are well suited to establishing a skin-color model [14]. We therefore chose the YCbCr color space. The RGB components can be converted to YCbCr components by the following formulas.

$$\begin{aligned} {\left\{ \begin{array}{ll} Y=0.299R+0.587G+0.114B \\ Cb=0.1482(B-R)+0.2910(B-G)+128 \\ Cr=0.3678(R-G)-0.0714(B-R)+128\\ \end{array}\right. } \end{aligned}$$
(1)
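As an illustration only, the conversion can be sketched per pixel in Python. The function name `rgb_to_ycbcr` is hypothetical, and the sketch assumes 8-bit RGB inputs, a luma row of 0.299R + 0.587G + 0.114B, and a Cr row whose first term is (R − G). A library routine such as rgb2ycbcr may apply a different scaling and offset to Y, so its output need not match these values exactly.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr) as floats.

    Assumed coefficients: luma weights 0.299/0.587/0.114, chroma terms
    written as colour differences with an offset of 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.1482 * (b - r) + 0.2910 * (b - g) + 128
    cr = 0.3678 * (r - g) - 0.0714 * (b - r) + 128
    return y, cb, cr


# A neutral gray pixel has no colour difference, so Cb and Cr stay at the offset.
gray_pixel = rgb_to_ycbcr(128, 128, 128)
# A pure red pixel pushes Cr above the offset and Cb below it.
red_pixel = rgb_to_ycbcr(255, 0, 0)
```

Writing the chroma rows as color differences makes it easy to see that any gray input (R = G = B) maps to Cb = Cr = 128 regardless of brightness, which is the separation of brightness and chrominance that the skin-color model relies on.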

The distribution of skin color is similar to a Gaussian distribution, and the Gaussian model is proposed on this theoretical basis. Using a two-dimensional Gaussian, the probability that each pixel of the color image belongs to skin is calculated. These values form a skin probability map. Regions of the map whose pixels have a higher skin probability are the candidate skin regions.

The pixel probability values are calculated according to the two-dimensional Gaussian model of skin color. A threshold is then selected based on these probability values: if the probability of a pixel is greater than the threshold, the pixel belongs to skin; otherwise, it does not. Most researchers choose the Gaussian model as the skin-color model under normal conditions. However, this method also needs to consider samples that do not belong to skin color.

In the color-based model, the Cb and Cr values of each pixel of the image are obtained. If the Cr value of a pixel lies between 140 and 160 and its Cb value lies between 140 and 195 (skin color differs from person to person; these values are based on this experiment), the pixel is judged to be skin-colored and its gray value is set to 255. Otherwise, its gray value is set to 0. In this way, we obtain a binary image. The binary image contains noise, and the conversion to a binary image inevitably adds noise, so the binary image needs to be smoothed. The techniques used are erosion and dilation of the image. Figure 2 shows an example of the face detection procedure.

Fig. 2. Illustration of face detection algorithm
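The thresholding rule on Cb and Cr can be sketched as follows. This is a minimal illustration, not the system's implementation: the function name `skin_mask`, the list-of-lists image layout, and the choice to treat both range bounds as inclusive are assumptions.

```python
def skin_mask(cb_plane, cr_plane, cb_range=(140, 195), cr_range=(140, 160)):
    """Return a binary image: 255 where (Cb, Cr) falls in the skin range, else 0."""
    mask = []
    for cb_row, cr_row in zip(cb_plane, cr_plane):
        row = []
        for cb, cr in zip(cb_row, cr_row):
            # Bounds are treated as inclusive (an assumption of this sketch).
            is_skin = (cb_range[0] <= cb <= cb_range[1]
                       and cr_range[0] <= cr <= cr_range[1])
            row.append(255 if is_skin else 0)
        mask.append(row)
    return mask


# Tiny 2x2 example: only the top-left pixel satisfies both ranges.
cb = [[150, 100], [190, 196]]
cr = [[150, 150], [165, 150]]
mask = skin_mask(cb, cr)
```

Because the ranges were tuned on the experimental data, they are exposed as parameters here so that they can be adjusted for other subjects or lighting conditions.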

If the image background is complex, we may obtain multiple candidate face regions, which need to be judged. The result is also affected by the real environment, such as clothes and skin-colored background. Some conditions must therefore be added to restrict the candidate skin regions:

  1. Limits on the length and width of the skin region. These values are set according to the proportion of the face in the image.

  2. Height-to-width ratio. In practice, the ratio is restricted to the range from 0.6 to 2.

  3. The region should contain the eyes; in other words, there are two black areas in a real face region.

The face region is determined accurately by adding these three restrictions. Figure 3 illustrates face detection against a complicated background with the restrictions mentioned above.

Fig. 3. Illustration of face detection with restrictions
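The three restrictions can be combined into a single filter over candidate regions, sketched below. The function name `is_face_candidate` and the size fractions `min_fraction` and `max_fraction` are hypothetical, since the text only states that the size limits follow the proportion of the face in the image. The count of dark areas is taken as an input, and accepting two or more of them is also an assumption.

```python
def is_face_candidate(width, height, dark_blobs, image_width, image_height,
                      min_fraction=0.1, max_fraction=0.9):
    """Apply the three restrictions to one candidate skin region."""
    if width <= 0 or height <= 0:
        return False
    # Restriction 1: region size relative to the image (fractions are assumed).
    if not (min_fraction * image_width <= width <= max_fraction * image_width):
        return False
    if not (min_fraction * image_height <= height <= max_fraction * image_height):
        return False
    # Restriction 2: height-to-width ratio between 0.6 and 2.
    if not (0.6 <= height / width <= 2.0):
        return False
    # Restriction 3: at least two dark areas (the eyes) inside the region.
    return dark_blobs >= 2


face_like = is_face_candidate(200, 260, 2, 640, 480)   # plausible face
too_flat = is_face_candidate(300, 100, 2, 640, 480)    # ratio below 0.6
no_eyes = is_face_candidate(200, 260, 0, 640, 480)     # no dark areas found
```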

The procedure for recognizing the face based on skin color is as follows:

  • Step 1: Convert the color image from the RGB color space to the YCbCr color space. The function we used is YCbCr = rgb2ycbcr(RGB).

  • Step 2: Convert the color image into a binary image based on the Cb and Cr values, using the range over which skin is distributed in the YCbCr model.

  • Step 3: The binary image contains noise, and the conversion to a binary image inevitably adds noise, so in this step we eliminate the noise.

  • Step 4: Apply erosion and hole filling.

  • Step 5: Several candidate skin regions are obtained, and the true face region is chosen from among them. The face region is determined accurately according to the three restrictions already mentioned.
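The noise removal in Steps 3 and 4 can be illustrated with a morphological opening (erosion followed by dilation). The sketch below assumes a 3 × 3 square structuring element and clips the window at the image border; both choices, and all function names, are assumptions. Hole filling is not shown.

```python
def _morph(image, keep_if):
    """Apply a 3x3 neighbourhood rule to a binary image (values 0 or 255)."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # The window is clipped at the border (an assumption of this sketch).
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = 255 if keep_if(v == 255 for v in window) else 0
    return out


def erode(image):
    """Keep a pixel only if every pixel in its window is foreground."""
    return _morph(image, all)


def dilate(image):
    """Set a pixel if any pixel in its window is foreground."""
    return _morph(image, any)


def opening(image):
    """Erosion followed by dilation: removes small isolated specks."""
    return dilate(erode(image))


# A 3x3 skin block plus one isolated noise pixel in the corner.
image = [[0] * 5 for _ in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = 255
image[0][4] = 255
cleaned = opening(image)
```

In this example the isolated pixel disappears during erosion and is not restored by dilation, while the 3 × 3 block is recovered.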

3.2 Extraction of the Eye Area

Extracting the face region is the basis for recognizing fatigue driving, which is done by means of the eye state. The human eye is one of the vital organs that reflect whether the driver is fatigued, and it carries a lot of information. Research shows that fatigue and eye state are closely related: when people are tired, their eyes stay closed or almost closed for a long time, the blink rate increases, and the closure time becomes longer. This system therefore recognizes fatigue driving through the eye state.

The gray level around the eyes is lower than in other parts of the face, so the human eye can be detected using this characteristic. The gray-scale projection method is widely used in image processing. When this method is applied directly to the original image, the noise is relatively large and it is difficult to achieve the desired results. With the development of technology, however, the gray-scale projection method has been improved, and it is widely applied in human eye feature extraction [12, 13]. Our system builds on these previous studies in combination with digital image processing. To improve efficiency and reduce computation, we use the binary image instead of the original image to obtain the horizontal and vertical integral projections. Eyebrows, eyes, nose, and mouth can be found clearly in the binary image of the face, so gray-scale projection on the binary image locates the eye area more accurately.

Applying the gray-scale projection algorithm to the binary image of the face gives the rough position of the eyes. G(x, y) represents the gray value of the pixel at (x, y), H(y) stands for the horizontal integral projection of the binary image, and V(x) stands for its vertical integral projection, which are given as follows:

$$\begin{aligned} H(y)= & {} \frac{1}{x_2-x_1}\sum _{x=x_1}^{x_2}G(x,y) \end{aligned}$$
(2)
$$\begin{aligned} V(x)= & {} \frac{1}{y_2-y_1}\sum _{y=y_1}^{y_2}G(x,y) \end{aligned}$$
(3)
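The two integral projections can be sketched as follows, with the horizontal projection indexed by row and the vertical projection indexed by column. The function names are hypothetical, the image is assumed to be a list of rows, and the bounds \(x_1, x_2\) and \(y_1, y_2\) are taken to be the inclusive limits of the face region, which the text does not state explicitly. The sums are divided by \(x_2-x_1\) and \(y_2-y_1\) to follow the formulas, so the upper bound must exceed the lower one.

```python
def horizontal_projection(image, x1, x2):
    """For each row y, sum G(x, y) over x1..x2 and divide by (x2 - x1)."""
    return [sum(row[x1:x2 + 1]) / (x2 - x1) for row in image]


def vertical_projection(image, y1, y2):
    """For each column x, sum G(x, y) over y1..y2 and divide by (y2 - y1)."""
    width = len(image[0])
    return [sum(image[y][x] for y in range(y1, y2 + 1)) / (y2 - y1)
            for x in range(width)]


# A 3x3 binary patch whose middle row is mostly dark.
patch = [[255, 255, 255],
         [0,   0,   255],
         [255, 255, 255]]
h_curve = horizontal_projection(patch, 0, 2)
v_curve = vertical_projection(patch, 0, 2)
darkest_row = h_curve.index(min(h_curve))
```

A dark horizontal band, such as the eyes and eyebrows, shows up as a low value in the row-indexed curve, which is what the extraction step looks for.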

The coordinates of the face area were obtained in Sect. 3.1, and the face area is extracted based on these coordinates. In Fig. 4, (a) shows the extracted face region, (b) shows the binary image of the face, and (c) shows the extracted eye region, which is obtained by gray-scale projection.

Fig. 4. Extraction of the eye area

We obtain the gray values from the binary image of the face with horizontal and vertical integral projection. Figure 5 shows the gray values of the horizontal integral projection, and Fig. 6 shows those of the vertical integral projection.

Fig. 5. The gray value of horizontal integral

Fig. 6. The gray value of vertical integral

The black region close to the forehead corresponds to the eyebrows and eyes. On the horizontal integral projection curve, it corresponds to two minima.

It is easy to read the approximate ordinates of the eyebrows and eyes from the figure, so that we can choose an appropriate width. In the same way, the two valleys in the vertical gray projection curve represent the left and right eyes. The eye region is extracted accordingly.
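Locating the valleys of a projection curve can be sketched with a simple local-minimum search. The function name `find_valleys` is hypothetical, and the sketch assumes a reasonably smooth curve; a real projection curve is noisy and would probably need smoothing first.

```python
def find_valleys(curve, count=2):
    """Return the indices of the `count` deepest local minima, left to right."""
    # A local minimum is lower than its left neighbour and not higher than
    # its right neighbour; the two end points are never reported.
    minima = [i for i in range(1, len(curve) - 1)
              if curve[i] < curve[i - 1] and curve[i] <= curve[i + 1]]
    minima.sort(key=lambda i: curve[i])   # deepest first
    return sorted(minima[:count])         # back to left-to-right order


# A curve with two clear valleys, e.g. left and right eye columns.
two_valleys = find_valleys([9, 7, 3, 6, 8, 4, 2, 5, 9])
# With three local minima, only the two deepest are kept.
deepest_two = find_valleys([9, 8, 9, 3, 9, 2, 9])
```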

3.3 Eye Detection

The Hough transform, which describes the boundary of a region, is often used to detect geometric shapes such as circles, ellipses, or lines in an image. In this paper, we use it to detect the eye accurately. Before applying the Hough transform, the edges of the image must be detected. By comparing different edge detection algorithms, we found that the Prewitt operator performs best in this system. The eye looks similar to an ellipse in the binary image. However, using an elliptical model for the human eye requires determining the center, the major axis length, the minor axis length, and the orientation of the ellipse. Detecting an ellipse thus needs five parameters [14], which involves a large amount of computation. The human iris, in contrast, is circular. We therefore locate the iris with a circle model using the Hough transform to determine whether the eye is open or closed; if the eye is closed, no circle can be detected [15]. A circle requires only three parameters: the center \((x_0, y_0)\) and the radius r.

The basic idea of the Hough transform is to determine a curve from the majority of the points on a boundary. By describing the boundary as a curve, the image space is mapped into a parameter space of curves. The method is therefore tolerant of, and robust to, possible noise on the region boundary.

Fig. 7. Eye model

The eye model is shown in Fig. 7. A circle can be expressed with the following formula:

$$\begin{aligned} (x-x_0)^2+(y-y_0)^2=r^2 \end{aligned}$$
(4)

From Eq. 4 we can see that a circle requires three parameters. Determining a circle from three parameters is still somewhat difficult, so researchers restrict the center \((x_0, y_0)\) and the radius r to a certain range and then calculate the parameters. In this way, the parameter space is reduced, which clearly reduces the amount of computation. The radius of the circle is first calculated for a very small, enclosed area, and then all the points on the edge are judged. Consequently, the edge of the circle can be identified quickly.

We set the step of the circle parameters to one pixel, the angular step to 0.2, the minimum radius to \(r_{min} =5\), the maximum radius to \(r_{max} =8\), and the threshold to 0.685. These parameters are set according to the system. Figure 8 illustrates the Hough transform.

Fig. 8. Illustration of Hough transform
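A circular Hough transform restricted to the radius range from 5 to 8 can be sketched as a voting procedure over candidate triples \((x_0, y_0, r)\). This is an illustration under stated assumptions rather than the system's implementation: the function name `hough_circle` is hypothetical, the angular step of 0.2 is interpreted as radians, and the threshold 0.685 is interpreted as the fraction of sampled perimeter pixels that must be edge pixels for a circle to be accepted.

```python
import math


def hough_circle(edge_points, width, height, r_min=5, r_max=8,
                 angle_step=0.2, threshold=0.685):
    """Return (x0, y0, r) of the best-supported circle, or None."""
    angles = [k * angle_step for k in range(int(2 * math.pi / angle_step))]
    votes = {}
    perimeter = {}
    edges = set(edge_points)
    for r in range(r_min, r_max + 1):
        # Pixel offsets from the centre to the sampled perimeter of radius r.
        offsets = {(round(r * math.cos(a)), round(r * math.sin(a)))
                   for a in angles}
        perimeter[r] = len(offsets)
        for (x, y) in edges:
            # Each edge pixel votes for every centre it could belong to.
            for (dx, dy) in offsets:
                cx, cy = x - dx, y - dy
                if 0 <= cx < width and 0 <= cy < height:
                    votes[(cx, cy, r)] = votes.get((cx, cy, r), 0) + 1
    best, best_score = None, 0.0
    for (cx, cy, r), count in votes.items():
        score = count / perimeter[r]   # supported fraction of the perimeter
        if score > best_score:
            best, best_score = (cx, cy, r), score
    return best if best_score >= threshold else None


# Synthetic edge map: a digital circle of radius 6 centred at (20, 20).
sample_angles = [k * 0.2 for k in range(int(2 * math.pi / 0.2))]
circle = {(20 + round(6 * math.cos(a)), 20 + round(6 * math.sin(a)))
          for a in sample_angles}
found = hough_circle(circle, width=40, height=40)
# Two unrelated edge pixels do not support any circle well enough.
not_found = hough_circle({(3, 3), (10, 12)}, width=40, height=40)
```

Limiting the radius to a narrow range keeps the accumulator small, which reflects the reduction in computation described above. Returning `None` when no candidate reaches the threshold corresponds to the closed-eye case, in which no circle is detected.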

This paper combines gray-scale projection and the Hough transform to detect the eye quickly. The detection includes two steps: extracting the eye area and detecting the eye accurately. This part is very important for the driver fatigue monitoring system; the two algorithms are described in Sects. 3.2 and 3.3. The procedure for recognizing the eye is as follows:

  • Step 1: Segment the face region and convert it to a binary image.

  • Step 2: Apply horizontal integral projection to the binary image of the face to obtain the projection curve along the vertical direction. Find the value of y at which the gray value is minimal, which corresponds to the eye and eyebrow.

  • Step 3: Take a band of width 2d centered on this y value and extract the region.

  • Step 4: Apply vertical integral projection to this region and find the two valleys in the curve, which determine the eye positions. The eye area can then be extracted.

  • Step 5: Detect the edges of the image with the Prewitt operator.

  • Step 6: Set the circle parameters \(r_{min}=5\) and \(r_{max}=8\) and the threshold to 0.685, then detect the eye with the Hough transform.
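The Prewitt edge detection in Step 5 can be sketched as below. The function name `prewitt_edges` and the edge threshold of 100 are hypothetical; the sketch approximates the gradient magnitude by the sum of the absolute horizontal and vertical responses and leaves the one-pixel border unmarked.

```python
def prewitt_edges(gray, threshold=100):
    """Return a binary edge map (255 = edge) using the 3x3 Prewitt operator."""
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal response: right column minus left column.
            gx = sum(gray[y + j][x + 1] - gray[y + j][x - 1] for j in (-1, 0, 1))
            # Vertical response: bottom row minus top row.
            gy = sum(gray[y + 1][x + i] - gray[y - 1][x + i] for i in (-1, 0, 1))
            # |gx| + |gy| approximates the gradient magnitude.
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 255
    return edges


# A vertical step edge: three dark columns followed by three bright columns.
step = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edge_map = prewitt_edges(step)
```

The resulting edge pixels are the points that would be passed to the Hough transform in Step 6.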

3.4 Eye State Analysis

We count the number of eye contour pixels after dilation. Under the same conditions, an open eye certainly has more contour pixels than a closed eye. On this basis we can judge whether the eyes are open or closed, and how far they are open.

PERCLOS is a classical measure for determining whether a person is fatigued; our fatigue recognition model is based on the P80 criterion [16]. The 100 % open state is defined as the largest eye area over all images in a period of time. If the degree of eye closure exceeds 80 %, the eye is considered closed. In this paper, the largest number of eye pixels is taken as the largest eye area.

Each frame is processed to extract the eye features. The number of frames with closed eyes is \(CloseFrame\underline{\ } Num\), and the total number of processed frames is \(SumFrame\underline{\ }Num\). The value of PERCLOS is calculated by the following formula.

$$\begin{aligned} PERCLOS=\frac{CloseFrame \underline{\ }Num}{SumFrame \underline{\ }Num}\times 100\,\% \end{aligned}$$
(5)

If the value of PERCLOS in an experiment is greater than the threshold, which we set to 20 %, the driver is considered fatigued and the alarm system issues a warning.
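The PERCLOS decision can be sketched from a list of per-frame eye pixel counts. The function names are hypothetical. The sketch takes the largest count in the window as the fully open eye, marks a frame as closed when the closure degree exceeds 80 %, and treats a window with no eye pixels at all as fully closed, which is an assumption.

```python
def perclos(eye_pixel_counts, closed_degree=0.8):
    """Fraction of frames in which the eye is judged closed (P80 criterion)."""
    fully_open = max(eye_pixel_counts)
    if fully_open == 0:
        return 1.0  # no eye pixels in any frame: treated as closed (assumption)
    close_frame_num = sum(
        1 for count in eye_pixel_counts
        if 1 - count / fully_open > closed_degree
    )
    return close_frame_num / len(eye_pixel_counts)


def is_fatigued(eye_pixel_counts, perclos_threshold=0.20):
    """Raise the alarm when PERCLOS exceeds the 20 % threshold."""
    return perclos(eye_pixel_counts) > perclos_threshold


# Ten frames; three of them are more than 80 % closed.
drowsy = [100, 95, 10, 15, 90, 100, 5, 98, 97, 99]
# Ten frames; a single blink only.
alert = [100, 98, 10, 97, 99, 96, 100, 95, 94, 93]
drowsy_value = perclos(drowsy)
alert_value = perclos(alert)
```

In the first sequence, three of ten frames count as closed, so PERCLOS is 30 % and the alarm condition is met; in the second, a single closed frame gives 10 %, which stays below the threshold.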

4 Experimental Results

In this section, we take as an example video acquired by a camera under simulated driving conditions. The video is converted to images using MATLAB. The video has 30 frames/s and \(640 \times 480\) pixels per frame. Because movement is continuous and not too fast, there is no need to test every frame. We therefore extract one image every 5 frames, i.e., 6 frames per second [17], which keeps the judgment accurate while improving the efficiency of the system.
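The sampling arithmetic can be checked with a short sketch. The function name `sampled_frame_indices` is hypothetical, and starting the sampling at frame 0 is an assumption.

```python
def sampled_frame_indices(duration_s, fps=30, step=5):
    """Indices of the frames kept when taking one frame out of every `step`."""
    total_frames = int(duration_s * fps)
    return list(range(0, total_frames, step))


one_second = sampled_frame_indices(1)     # frames kept from one second of video
thirty_seconds = sampled_frame_indices(30)
```

One second of video yields 6 sampled frames, and a 30 s video yields 180.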

Extensive experiments show that this method compares favorably with other methods: it identifies the face accurately and quickly, which is the foundation for later processing. Figures 9 and 10 show original images with the detected face framed by this method.

Fig. 9. Detected face 1

Fig. 10. Detected face 2

The experiments show that the face detection method detects faces accurately even at different distances and positions, with an accuracy of over 95 %. It also meets the real-time requirement for face detection. The face detection method thus provides the basis for judging the fatigue state.

After the face is extracted, the eyes are extracted from the face image using the eye detection algorithms described above. Some examples of detected eyes are shown in Figs. 11 and 12.

Fig. 11. Eye detection 1

Fig. 12. Eye detection 2

Numerous experiments demonstrate the viability of the proposed method, which extracts the eye region and detects the eye well.

Fig. 13. Results when person not in drowsy state

Fig. 14. Results when person is in drowsy state

Figure 13 shows the eye edge after dilation for a person who is not drowsy, and Fig. 14 shows the eye edge of a sleepy driver. The area of the eye edge in Fig. 13 is larger than that in Fig. 14. Two testers recorded four 30 s videos simulating driving situations; at 6 sampled frames per second, each video yields 180 frames. Table 1 shows the experimental results.

Table 1 shows that the system detects faces accurately and extracts the eye region precisely, but eye detection with the Hough transform is imperfect. The experimental results show that the proposed algorithm is fast and accurate, reaching a correct detection rate of 86.4 % on average. On the other hand, the correct rate for video 4 is lower than for video 1, mainly because the system is easily influenced by illumination.

Table 1. Experimental results

5 Conclusion

In this paper, we propose a method for fatigue driving detection based on eye features. As the experimental results show, the system detects the face and extracts the eye region with a high accuracy of more than 95 %. The final stage, fatigue detection, does not perform as well as expected, but the system still meets the real-time and accuracy requirements. In future work, we will implement the driver fatigue monitoring system on a hardware platform and improve its accuracy. We will also study more evaluation indexes and compare its performance with other similar methods to demonstrate the effectiveness of the proposed approach.