
1 Introduction

Drowsiness of drivers and pilots is a major cause of fatal road traffic and aviation accidents. Drowsiness decreases an individual's attention and alertness toward the task being carried out. Extended working hours, use of medication, sleeplessness, and continuous driving are some of the major causes of drowsiness. Amir et al. report that the US National Sleep Foundation (NSF) conducted a survey in 2009 which showed that 54% of adult drivers had driven a vehicle while feeling drowsy and 28% of them had actually fallen asleep [1]. Reviews of different sensing techniques to detect drowsiness are given by Sahayadhas et al. [2] and Jianfeng et al. [3]. These sensing techniques fall under three categories. The first uses sensors mounted on vehicle parts such as the steering wheel, or video cameras [4,5,6]. The second senses drowsiness from the behavior of the driver, usually by computer vision and imaging techniques [7, 3, 8] that detect facial movements such as a drooped head, yawning, and eye closures. The third uses physiological parameters of the driver, which include eye activity by electrooculogram (EOG), muscle activity by electromyogram (EMG), heart rate variability by electrocardiogram (ECG), and brain states by electroencephalogram (EEG). Physiological changes related to the state of brain functions may help in determining drowsiness in an objective manner. The other characteristics are subject-specific and hence less reliable. They are also affected by the illumination of the surroundings as well as the driver's posture.

EEG data analysis is one of the accepted methods for drowsiness detection. EOG-based techniques can also serve the same purpose, but the EOG sensors, placed in and around the eyes, may disturb the driver if used for a long time. These sensors also have a much lower signal-to-noise ratio (SNR) and can give false alarms. EEG, the electrical activity of the neurons inside the brain, can provide information regarding both the muscular movements of the eye and the sleep stages. Hence, EEG-based drowsiness detection was considered in this work.

Several methods have been proposed to identify drowsiness from EEG. Dajeong Kim et al. [9] analyzed changes in the alpha power spectrum of EEG signals while subjects were feeling drowsy with their eyes open. Such a state cannot be detected by image processing methods. Their experiments were conducted on three subjects (age group of 24–25). The power spectra of the alpha, beta, theta, and delta bands were analyzed using the fast Fourier transform. The experiments showed that the alpha band power changed significantly during the drowsiness period even when the eyes were open. In a similar study, Shaoda Yu et al. [10] observed EEG variations during sleep onset transitions. They divided the EEG spectral bands into 1 Hz bins and used these as input features to a support vector machine (SVM) classifier. From a set of optimized features, an accuracy of 98.01% and a precision of 97.91% were achieved. Roman Rosipal et al. [11] used an EEG-based probabilistic model to detect drowsiness from the spectral contents of four-second-long EEG segments. They modeled a real-time system using a hierarchical Gaussian mixture model. Labeling was done using the Karolinska drowsiness scoring method. Yabo Yin et al. [12] used k-means clustering and linear discriminant analysis (LDA) to find the correlation of alpha and beta waves with the states of drowsiness and consciousness.

In most of the available literature, the features for drowsiness detection were based either on blink-related information or on spectral characteristics. In this paper, we integrate both kinds of features and evaluate them using supervised learning techniques. PCA-based preprocessing is used to combine the suitable channels from the headset. The performance of the system is analyzed in terms of accuracy, sensitivity, specificity, and precision. The paper is organized as follows: Sects. 2 and 3 discuss the methodology of feature extraction and preprocessing in detail. Section 4 briefly explains the classification methods, and Sect. 5 gives the results obtained. The paper is concluded in Sect. 6 with further information on possible future work.

2 Methodology

This section deals with the acquisition of the EEG data and the preprocessing methods. We begin with a brief discussion of the human brain and conclude the section with feature extraction techniques.

2.1 Data Acquisition

EEG data was acquired using the Emotiv EPOC device, a wireless EEG signal acquisition and processing headset that monitors 14 channels of EEG data. The device has a resolution of 12 bits and a sampling frequency of 128 Hz, and it sends the EEG data to the computer via Bluetooth. EEG data of 18 subjects (nine males and nine females) in the age group 20–22 was collected. Noise during recording was reduced by switching off other electronic devices nearby. The subjects were asked to play a virtual driving game for 2 h, and the data was recorded while they were virtually driving the vehicle. The subjects were slipping into a drowsy state when the data was taken. A VR box was used to give a 3D virtual image and so provide a realistic driving scenario. A saline solution was used to improve the contact between the electrodes and the scalp. The 14 electrodes (AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, and AF4) were placed according to the international 10–20 system, forming 7 pairs of symmetric channels. The recordings were highly noisy, as shown in Fig. 1.

Fig. 1 EEG output plot of 14 channels

Even though the recordings were taken from all the channels, the features pertaining to ocular pulses were dominant only in the electrodes placed in the frontal and occipital regions. Hence, only 9 channels (AF3, F7, F3, F6, O2, FC6, F4, F8, and AF4) were considered for this work. A video channel was also used to note the presence and behavior of eye blinks. The video was taken from the instant the EEG recording started. The onset of blinks was identified from the video and verified against the EEG pulses obtained. EEG pulses that did not match the video record were not considered for analysis. The identified EEG portions for analysis varied between 6 and 10 min. Because the data was noisy, the parallel video recording was necessary to identify the actual blink pulses.

3 Feature Extraction

3.1 Principal Component Analysis and Preprocessing

3.1.1 Preprocessing for Blink-Dependent Features

All of the 9 selected channels give information regarding the state of the subject, and hence, the most relevant data should be selected from the available channels. To combine the channels appropriately, principal component analysis (PCA) was used. The EEG signals of channels AF4 and AF5 are given in Fig. 2a and b, respectively. The PCA output for the channels is shown in Fig. 2c. Since eye blink-related features were considered, the PCA output with maximum energy (the output with the largest eigenvalue), i.e., the first principal component, was used. Along with this output, the coefficients (weights) and the latent values (eigenvalues) were also obtained from PCA. The signal was further normalized as shown in Fig. 2d, by subtracting its mean and dividing the result by the maximum of the signal.

Fig. 2 a EEG data of channel AF4. b EEG data of channel AF5. c PCA output of selected channels. d Normalized PCA output
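The PCA and normalization step of Sect. 3.1.1 can be illustrated with a minimal sketch. It assumes the selected channels are stored as a NumPy array of shape (samples, channels). The function names are hypothetical, and dividing by the maximum absolute value of the mean-removed signal is one possible reading of the normalization described above, not necessarily the exact procedure used in this work.

```python
import numpy as np

def pca_channels(eeg):
    """PCA over channels. eeg has shape (n_samples, n_channels).

    Returns (scores, weights, eigenvalues), with components sorted by
    decreasing eigenvalue so that scores[:, 0] is the maximum-energy output.
    """
    centered = eeg - eeg.mean(axis=0)
    cov = np.cov(centered, rowvar=False)       # channel covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                # project samples onto components
    return scores, eigvecs, eigvals

def normalize(signal):
    """Subtract the mean, then divide by the maximum absolute value.

    Assumption: the text only says 'maximum of that signal'; the absolute
    maximum is used here so the result lies in [-1, 1].
    """
    centered = signal - signal.mean()
    return centered / np.max(np.abs(centered))
```

Here `normalize(pca_channels(eeg)[0][:, 0])` would give the normalized first principal component used for blink analysis.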

A normal human blinks once every 2 to 10 s. Hence, for the extraction of eye blink-dependent features, the EEG waveform was segmented into windows of 10 s duration. An overlap of 2 s between consecutive windows was provided to prevent loss of data at the window boundaries. The EEG recordings had baseline noise as shown in Fig. 3a. To remove the baseline variations, a polynomial was fitted to the normalized signal and subtracted from it. The result was a smooth signal with peaks corresponding to the eye blinks, as shown in Fig. 3b.

Fig. 3 a Segmented EEG before preprocessing. b Baseline removed and smoothed signal with detected eye blinks
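The segmentation and baseline removal described above can be sketched as follows. The window length (10 s), overlap (2 s), and sampling rate (128 Hz) come from the text. The polynomial order is not stated in this paper, so the value 5 below is purely an assumption, as are the function names.

```python
import numpy as np

FS = 128  # sampling frequency of the headset in Hz (from Sect. 2.1)

def segment(signal, win_s=10, overlap_s=2, fs=FS):
    """Split a 1-D signal into 10 s windows that overlap by 2 s."""
    win = win_s * fs
    step = (win_s - overlap_s) * fs            # hop of 8 s between windows
    return [signal[s:s + win] for s in range(0, len(signal) - win + 1, step)]

def remove_baseline(window, order=5):
    """Fit a polynomial to the window and subtract it.

    Assumption: order=5 is illustrative; the paper does not give the order.
    The time axis is scaled to [-1, 1] to keep the fit well conditioned.
    """
    t = np.linspace(-1.0, 1.0, len(window))
    coeffs = np.polyfit(t, window, order)
    return window - np.polyval(coeffs, t)
```

With these settings, a 60 s recording yields seven windows of 1280 samples each, starting every 1024 samples.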

3.1.2 Preprocessing for Spectral Features

The output of PCA was also used to obtain the eye blink-independent features, which are discussed below. All EEG channels are affected by blinks. To remove the blink-related signal, the first PCA output, which has the maximum blink variability, was set to zero. The inverse PCA operation was then applied in order to recover EEG data for the 9 channels that is free from the effect of blinks. This data was used for further power analysis in different blocks.
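A minimal sketch of this blink-removal idea, assuming the 9 channels are held in an array of shape (samples, channels): the data is projected onto its principal components, the first component is zeroed, and the result is projected back to channel space. The function name and the `n_remove` parameter are hypothetical, and the SVD is used here only as a convenient way to obtain the principal directions.

```python
import numpy as np

def remove_blink_component(eeg, n_remove=1):
    """Zero the leading principal component(s) and invert the PCA.

    eeg has shape (n_samples, n_channels); returns an array of the same shape.
    """
    mean = eeg.mean(axis=0)
    centered = eeg - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt.T                   # forward PCA
    scores[:, :n_remove] = 0.0                 # discard the blink-dominated output
    return scores @ vt + mean                  # inverse PCA back to channels
```

Calling the function with `n_remove=0` returns the input unchanged, which is a simple check that the forward and inverse transforms are consistent.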

3.2 Feature Extraction

The brain waves gamma, beta, alpha, theta, and delta play a significant role in determining the alert, drowsy, and sleep stages. Two types of feature vectors were used for detecting drowsiness. The first type depended on the eye blink information and included the blink rate, the sum of blink heights inside a particular window, and the sum of blink widths (durations) in that window. The second type was obtained after removing the effects of eye blinks and included the mean and standard deviation of the energy in the five frequency bands mentioned above. The first type therefore comes from a time-domain analysis, while the second comes from a frequency-domain analysis.

3.2.1 Blink Features

Blink rate was calculated as the number of times the subject blinks in a 10 s window. Blink width and blink height were also calculated from this window. The characteristics of blinks are known to differ significantly between the awake and drowsy states, which motivated the choice of these features. Thus, three blink-dependent features (rate, height, and width/duration) were obtained for each subject.
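One way the three blink features could be computed from a preprocessed 10 s window is sketched below. The paper does not specify how blinks are detected, so the threshold-crossing rule and the threshold value 0.3 are assumptions made only for illustration. Each contiguous run of samples above the threshold is treated as one blink.

```python
import numpy as np

def blink_features(window, fs=128, threshold=0.3):
    """Return (blink count, sum of blink heights, sum of blink widths in s).

    Assumption: a blink is a contiguous run of samples above `threshold`
    in the normalized, baseline-removed window.
    """
    above = window > threshold
    # Pad with False so runs touching the window edges are still closed.
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)        # first sample of each run
    ends = np.flatnonzero(edges == -1)         # one past the last sample
    heights = [window[s:e].max() for s, e in zip(starts, ends)]
    widths = (ends - starts) / fs
    return len(starts), float(np.sum(heights)), float(np.sum(widths))
```

For example, a window with one pulse of height 1.0 lasting 32 samples and one of height 0.8 lasting 64 samples gives a count of 2, a height sum of 1.8, and a width sum of 0.75 s.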

3.2.2 Spectral Features

The signal without blink data was further filtered using a set of FIR filters of order 60, tuned to the frequency bands of the EEG. A total of 10 spectral features were obtained: the mean and the standard deviation of the energy in each of the 5 frequency bands. The 3 blink features and 10 spectral features were combined into a 13-dimensional feature vector that was used for classification. Three-dimensional plots of combinations of these features were taken to visualize the feature space. A sample three-dimensional plot of the feature space comprising the mean energy of gamma and theta versus the blink rate is shown in Fig. 4.

Fig. 4 Three-dimensional plot of the feature space comprising the features: mean energy of gamma and theta versus the blink rate
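The spectral feature computation of Sect. 3.2.2 can be sketched with SciPy as follows. The filter order (60) and sampling rate (128 Hz) are taken from the text. The band edges and the use of 1 s blocks to obtain a mean and standard deviation of energy are assumptions, since the paper does not list them, and the function name is hypothetical.

```python
import numpy as np
from scipy.signal import firwin, lfilter

FS = 128  # Hz
# Assumed band edges in Hz; the paper does not state the exact limits.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}

def band_energy_features(signal, fs=FS, order=60, block_s=1):
    """Return {band: (mean energy, std of energy)} over blocks of block_s seconds."""
    block = block_s * fs
    n_blocks = len(signal) // block
    features = {}
    for name, (lo, hi) in BANDS.items():
        # An FIR filter of order 60 has 61 taps; pass_zero=False gives a band-pass.
        taps = firwin(order + 1, [lo, hi], pass_zero=False, fs=fs)
        filtered = lfilter(taps, 1.0, signal)
        blocks = filtered[:n_blocks * block].reshape(n_blocks, block)
        energy = np.sum(blocks ** 2, axis=1)   # energy of each block
        features[name] = (float(energy.mean()), float(energy.std()))
    return features
```

Flattening the returned dictionary gives the 10 spectral values (two per band) that are combined with the 3 blink features.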

4 Classification

Two supervised classification methods, K-nearest neighbor (KNN) and artificial neural network (ANN) [13, 14], were used to analyze the feature vectors. The classification was done for three cases: (i) using the eye blink features, (ii) using the features without eye blink, and (iii) using the combination of both. Eighty percent of the data was used to train the classifiers, while twenty percent was used to test them. The number of nearest neighbors is denoted by k. The value of k was chosen iteratively to maximize classification accuracy; for our work, it was chosen as 11.
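The KNN setup described above (80/20 split, k = 11) can be illustrated with a small NumPy sketch. The Euclidean distance, the majority vote, and the random shuffling before the split are assumptions, as the paper does not state them. The function names are hypothetical, and the ANN is not sketched here.

```python
import numpy as np
from collections import Counter

def split_80_20(X, y, seed=0):
    """Shuffle the samples and split them 80% train / 20% test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(0.8 * len(X))
    return X[idx[:cut]], X[idx[cut:]], y[idx[:cut]], y[idx[cut:]]

def knn_predict(X_train, y_train, X_test, k=11):
    """Classify each test vector by majority vote among its k nearest neighbors.

    Assumption: Euclidean distance on the raw feature vectors.
    """
    predictions = []
    for x in X_test:
        dist = np.linalg.norm(X_train - x, axis=1)
        nearest = y_train[np.argsort(dist)[:k]]
        predictions.append(Counter(nearest.tolist()).most_common(1)[0][0])
    return np.array(predictions)
```

In practice the 13 features have very different scales, so standardizing each feature before computing distances would likely be needed; that step is omitted here for brevity.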

5 Results

Our results show that the three stages (awake, onset of drowsiness, and sleep) can be classified using the PCA-processed EEG signals. Classification was done using the KNN and ANN algorithms, of which ANN gave better performance in each individual case. The results were analyzed for three cases: (i) using blink-related features, (ii) using spectral features, and (iii) using the combination of both. The classifiers were evaluated using the performance measures accuracy, sensitivity, specificity, and precision.
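For reference, the four performance measures can be computed from the counts of true and false positives and negatives, as in the sketch below. Treating one stage as the positive class and the remaining stages as negative is an assumption, since the paper does not say how the measures were obtained for three classes. The function name is hypothetical.

```python
def performance_measures(y_true, y_pred, positive):
    """Accuracy, sensitivity, specificity, and precision for one class vs. the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),
    }
```

As a worked case, with 3 positive and 5 negative samples, of which 2 positives and 4 negatives are classified correctly, the accuracy is 6/8 = 0.75, the sensitivity is 2/3, the specificity is 4/5, and the precision is 2/3.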

From the analysis given in Table 1, it can be seen that for both individual cases, i.e., blink features and spectral features, ANN gave better values of accuracy, sensitivity, specificity, and precision. Using KNN, the spectral features gave better performance than the blink-based features, i.e., 73.33% accuracy, 54.67% sensitivity, 77.65% specificity, and 29.43% precision. Using ANN, the accuracy for cases (i) and (ii) was the same, i.e., 75%. Table 2 shows the performance measures when the combination of blink and spectral features was used. The combination of features was found to give the highest accuracy of the three cases. In case (iii) also, ANN classified the stages (awake, onset of drowsiness, and sleep) with much better performance measures than in the individual cases.

Table 1 Results for case (i) and case (ii)
Table 2 Results for case (iii)

6 Conclusion

Amrita Vishwa Vidyapeetham has developed a system for forewarning drowsiness of pilots and drivers to prevent impending accidents. The analysis of this system has shown that it is possible to detect drowsiness in pilots and drivers using PCA-processed EEG signals with reasonable accuracy. The accuracy obtained for drowsiness detection when the two sets of features were used together was 80% using the KNN classifier and 85% using the ANN classifier. In all the cases, ANN gave the best results in the performance measures considered. It is expected that eye tracking and heart rate variability may give a higher level of accuracy.