1 Introduction

Sleep is a naturally occurring biological phenomenon that affects the human body. People spend about one-third of their lives sleeping [1]. Sleep plays a vital role in individual performance, learning ability and day-to-day activities. Sleep-related problems such as insomnia and obstructive sleep apnea have a major impact on the physical health of an individual [2]. According to the National Sleep Foundation (NSF), sleep disorders are found in 60% of adults [3]. The American Sleep Association (ASA) estimates that around 50–70 million adults in the USA suffer from sleep-related problems [4]. Sleep apnea is found in 2–4% of adults and 1–3% of children [3, 4]. About 33% of the world's population suffers from sleep issues [3, 4]. According to the National Highway Traffic Safety Administration (NHTSA) in the United States, more than 100,000 automobile crashes occur annually because drivers fall asleep while driving [4]. 20% of road accidents in the United Kingdom and one out of four accidents in Germany are due to sleep-related issues [4]. Approximately 3% of road traffic accidents are caused by sleep-related problems [4, 5].

Sleep experts suggest that EEG gives better results for sleep stage classification because it reduces the disturbance caused by the wiring and other equipment of Polysomnography (PSG) recordings [4]. PSG signals are acquired from patients in hospital during night-time sleep. PSG recordings include the Electrooculogram (EOG), Electrocardiogram (ECG), Electromyogram (EMG) and Electroencephalogram (EEG) [4, 6, 7]. These signals are divided into epochs and analyzed visually by sleep experts and physicians according to the American Academy of Sleep Medicine (AASM) sleep scoring standard [6]. Classifying sleep stages from PSG signals by visual inspection is a time-consuming and problematic process [4, 6]. Visual examination of multi-channel EEG recordings by experts makes sleep scoring expensive, tedious and subject to human error. EEG signals carry more significant and prominent information than ECG, EMG and EOG signals [4, 6].

Sleep stages are divided into Rapid Eye Movement (REM) and Non-Rapid Eye Movement (NREM) sleep. NREM is further divided into stage 1, stage 2, stage 3 and stage 4 under the guidelines of the Rechtschaffen and Kales method [4, 6, 8, 9]. More recently, the American Academy of Sleep Medicine (AASM) introduced modified sleep stages in which stage 1 becomes N1, stage 2 becomes N2, and stages 3 and 4 are combined into N3 [4, 6]. The feature extraction methods used in most previous work on sleep classification are listed in Table 1 [4].

Table 1 Different feature extraction methods of EEG signals

Some of the machine learning techniques commonly used for sleep stage classification are listed in Table 2 [4]. Sleep classification based on single-channel EEG and a multi-class SVM has achieved an accuracy of 87.5% [4]. An accuracy of 92.7% was achieved using the Fuzzy C-means algorithm [4], whereas 93.13% was obtained using a single-channel EEG signal [4].

Table 2 Machine learning techniques

In this paper, EEG signals are obtained from the Sleep-EDF database on the PhysioNet website and from Dr. Chandrashekhar’s clinic. The EEG dataset is processed to extract statistical features from each 10 s epoch by representing the EEG signals in the time–frequency domain. Decision Tree (DT), Support Vector Machine (SVM) and Random Forest (RF) algorithms are trained on the extracted features using different testing data set percentages. Sleep is classified into three stages, viz. stage 1 (8 Hz and above), stage 2 (4–8 Hz) and stage 3 (0–4 Hz), which helps to identify and treat issues such as fatigue, drowsiness, sleep apnea, insomnia and other sleep-related disorders. Table 3 lists the three EEG frequency bands and the corresponding amplitude ranges used for sleep stage classification in the proposed work.

Table 3 Amplitude and Frequency ranges of EEG signal

2 Proposed Method and Procedure

The proposed method identifies sleep stages as an aid to understanding sleep-related problems. Figure 1 shows the flow chart of the proposed sleep stage classification study.

Fig. 1 Proposed flowchart of sleep stage classification

2.1 EEG Data Set

The EEG dataset used in this work is obtained from the Sleep-EDF database and from Dr. Chandrasekhar’s clinic. It covers 125 patients aged between 20 and 50. The EEG signals are overnight sleep recordings taken without any medication and are sampled at 128 Hz. These signals are processed and divided into epochs of 10 s, producing a total of 68,956 epochs for the classification of sleep stages.
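The epoching step can be sketched as follows. This is a minimal illustration, assuming non-overlapping epochs and that trailing samples which do not fill a whole epoch are discarded (the text does not say how remainders are handled); the function name and the synthetic recording are hypothetical.

```python
FS_HZ = 128          # sampling rate stated in the text
EPOCH_SECONDS = 10   # epoch length stated in the text


def split_into_epochs(signal, fs_hz=FS_HZ, epoch_seconds=EPOCH_SECONDS):
    """Split a 1-D sequence of samples into non-overlapping epochs.

    Assumption: trailing samples that do not fill a whole epoch are dropped.
    """
    samples_per_epoch = fs_hz * epoch_seconds
    n_epochs = len(signal) // samples_per_epoch
    return [
        signal[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        for i in range(n_epochs)
    ]


# Synthetic stand-in for a recording: 35 s of samples gives 3 full epochs.
recording = [0.0] * (FS_HZ * 35)
epochs = split_into_epochs(recording)
```

At 128 Hz, each 10 s epoch holds 1280 samples.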

2.2 Preprocessing

EEG de-noising is performed using the wavelet transform, which localizes the signal features and preserves them while noise is filtered out. The original signal contains many high-frequency and noise components, so de-noising is essential to remove them.

Detailed information in the EEG is then determined using filtering functions. Figure 2 depicts the de-noised EEG signal.

Fig. 2 EEG signal after wavelet de-noising

Infinite Impulse Response (IIR) filters are considered among the simplest filters to realize on digital signal processors and embedded systems [4]. Higher-order filters give a steeper roll-off and therefore a more accurate separation of frequency bands. The EEG dataset is processed using a 12th-order IIR digital Butterworth filter with a pass band of 0–15 Hz. The frequency response of the Butterworth filter has no ripples and its pass band remains almost flat [4]. The EEG signals are decomposed into the delta, theta, alpha and beta frequency sub-bands using the Fourier Transform and IIR Butterworth band-pass filters, as shown in Fig. 3.

Fig. 3 EEG signals decomposed into delta, theta, alpha and beta frequency sub-bands
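To illustrate the idea of splitting a signal into frequency sub-bands, the sketch below sums the power of a naive discrete Fourier transform within each band. It is not the IIR Butterworth band-pass filtering described above, only a simpler stand-in that needs no external library. The band edges are conventional values assumed for illustration and are not taken from the paper; the function name and test tone are hypothetical.

```python
import cmath
import math

FS_HZ = 128  # sampling rate stated in the text

# Assumed band edges in Hz (conventional values, not taken from the paper).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def band_powers(signal, fs_hz=FS_HZ, bands=BANDS):
    """Sum the DFT power per band (low edge inclusive, high edge exclusive)."""
    n = len(signal)
    powers = {name: 0.0 for name in bands}
    for k in range(1, n // 2 + 1):            # skip DC, positive frequencies only
        freq = k * fs_hz / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        for name, (low, high) in bands.items():
            if low <= freq < high:
                powers[name] += abs(coeff) ** 2
    return powers


# One second of a synthetic 10 Hz sine, which should land in the alpha band.
tone = [math.sin(2 * math.pi * 10 * t / FS_HZ) for t in range(FS_HZ)]
powers = band_powers(tone)
dominant = max(powers, key=powers.get)
```

The naive transform costs O(n²) per signal, so a real pipeline would use an FFT routine or the band-pass filters themselves.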

2.3 Feature Extraction

A significant amount of information is obtained by extracting features from the EEG signals. Time–frequency analysis of EEG signals provides the insight needed to extract features by separating the different frequency rhythms [13]. The signals are further processed to obtain their time–frequency representation, which is shown in Fig. 4 [22].

Fig. 4 Time–frequency representation of EEG for sleep stage classification

Statistical features are extracted from each frequency sub-band [4]. These features play a vital role in sleep stage classification, yielding good accuracy and improving system performance. The features extracted from each frequency sub-band are explained below.

2.3.1 Mean

The mean is the central value of a set of numbers. For a discrete random variable, it is the sum of the possible values weighted by their probabilities.

$$\mu = x_{1} p_{1} + x_{2} p_{2} + \cdots + x_{n} p_{n} ,$$
(1)

where µ is the mean, x_i is the i-th value of the random variable and p_i is its probability.
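Equation (1) can be checked on a small example. The sketch below estimates each probability as the relative frequency of a value, which is an assumption, since the paper does not state how the probabilities are obtained; the function name and toy values are hypothetical.

```python
from collections import Counter


def probability_weighted_mean(samples):
    """Mean as the sum of x_i * p_i, with p_i estimated as relative frequency."""
    counts = Counter(samples)
    total = len(samples)
    return sum(value * count / total for value, count in counts.items())


samples = [2, 2, 4, 8]   # toy values, not EEG data
mu = probability_weighted_mean(samples)
```

With relative frequencies as probabilities, the result coincides with the ordinary arithmetic mean of the samples.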

2.3.2 Entropy

Quantifying information and entropy is useful in machine learning, so practitioners benefit from a solid understanding of and intuition for both. When a model is estimated from data, assumptions have to be made about the process that generated the data. Entropy is the average information content of a given dataset. For a given dataset X, entropy is calculated using:

$$E\left( X \right) = - \mathop \sum \limits_{i} p_{i} \log_{2} p_{i}$$
(2)

where p_i is the probability of the i-th sample value in the data set.
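A minimal sketch of the entropy feature follows, using the standard Shannon definition with a leading minus sign so that the result is non-negative. Real-valued EEG samples must first be mapped to discrete symbols. The equal-width binning and the number of bins used here are assumptions made for illustration, not the paper's procedure.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits of a sequence of discrete symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def quantize(samples, n_bins=4):
    """Map real-valued samples to integer bin indices (equal-width bins)."""
    low, high = min(samples), max(samples)
    width = (high - low) / n_bins or 1.0   # avoid zero width for flat signals
    return [min(int((x - low) / width), n_bins - 1) for x in samples]


uniform = shannon_entropy([0, 1, 2, 3])    # four equally likely symbols
constant = shannon_entropy([5, 5, 5, 5])   # a single repeated symbol
```

Four equally likely symbols give 2 bits, the maximum for four outcomes, while a constant sequence gives zero.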

2.3.3 Power Spectral Density (PSD) and Event-Related Potential (ERP)

PSD and ERP are well-established methods for analyzing EEG signals to classify sleep stages. In this paper, PSD and ERP plots are derived using EEGLAB, which provides a graphical user interface that lets users process the data intuitively for better sleep stage classification.

Figure 5 shows the PSD of the selected EEG channel. ERPs provide insight into the EEG and the timing of neuronal events, and ERPs extracted from EEG have received more attention in recent years [10]. ERPs have both positive and negative voltage deflections, as indicated in Fig. 6. To obtain them, the signal is averaged over the epochs. Random activity in the EEG signal cancels out in this average, approaching zero as the number of averaged samples increases. The parts of the EEG signal that survive the averaging process are the ERP components, which play a significant role in sleep stage classification.

Fig. 5 PSD of the selected channel

Fig. 6 ERP of channel
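The averaging that yields the ERP can be sketched in a few lines. This is only an illustration of point-by-point averaging over equally long epochs, not of EEGLAB's implementation. The function name and the toy epochs, whose noise terms have opposite sign, are hypothetical.

```python
def average_epochs(epochs):
    """Point-by-point average of equally long epochs."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


# Toy data: a shared component [1.0, 2.0] plus noise of opposite sign.
toy_epochs = [[1.0 + 0.5, 2.0 - 0.5],
              [1.0 - 0.5, 2.0 + 0.5]]
erp = average_epochs(toy_epochs)
```

The noise cancels in the average and only the component common to both epochs remains.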

2.3.4 Energy sis (Esis)

Esis is an effective technique for analyzing EEG signals in 10 s epochs. Automatic sleep stage classification systems are affected by channel redundancy and noisy data, which reduce classification accuracy. Analyzing the Esis parameters makes it possible to find discriminative features that improve performance and boost classification accuracy. Energy sis is a time-domain feature that captures the speed (velocity) and energy of the EEG signal. The speed is calculated as v = f λ, where f is frequency and λ is wavelength [4]. Esis is computed using Eq. (3).

$${\text{Esis}} = \mathop \sum \limits_{i = 1}^{N} \left| {x_{i}^{2} } \right| \times v,\quad N\;{\text{is}}\;{\text{the}}\;{\text{length}}\;{\text{of}}\;{\text{the}}\;{\text{epoch}}.$$
(3)
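Equation (3) can be sketched directly. The paper does not state how f and λ are chosen for a given epoch or sub-band, so the values below are placeholders, and the function name is hypothetical.

```python
def esis(epoch, frequency_hz, wavelength):
    """Esis = sum(|x_i^2|) * v, with v = f * wavelength as in the text."""
    velocity = frequency_hz * wavelength
    energy = sum(abs(x * x) for x in epoch)
    return energy * velocity


# Toy epoch; frequency_hz and wavelength are placeholder values.
value = esis([1.0, -2.0, 3.0], frequency_hz=10.0, wavelength=0.5)
```

Here the energy term is 1 + 4 + 9 = 14 and v = 5, so the toy epoch gives 70.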

3 Machine Learning Algorithms for Sleep Stage Classification

Machine learning is a technique in which a model learns to produce an outcome by being trained on an input dataset. In the proposed study, EEG data samples are used for sleep stage analysis. Various machine learning algorithms have been proposed to classify sleep stages in the past few years [3, 4]. Here, the extracted features are fed to the training model with different testing data set percentages and classified using the Decision Tree, Support Vector Machine and Random Forest algorithms.

3.1 Decision Tree (DT) for Sleep Stage Classification

A Decision Tree separates the EEG samples into subsets such that each subset contains similar information content, so each sleep stage is distinguished at the nodes. In a decision tree, the root node is at the top and the other nodes are connected to it through branches or links [11]. At each node one branch is followed, and the node it leads to is the root of the next sub-tree; this continues until a final decision is made. For the training dataset X with N samples, X = {x1, x2, …, xN}, each sample xi has q features, xi = {vi1, vi2, …, viq}, i = 1, 2, …, N. Each attribute Ak consists of N values, Ak = {v1k, v2k, …, vNk}, k = 1, 2, …, q. Tree formation starts with the training dataset X, and the internal nodes contain different test attributes. Split attributes are the ones that divide the samples into their classes. Let At be the split attribute that divides internal node t into w sub-branches {X1, X2, …, Xw}. For a given attribute Ak, the information gain of dataset X is given by Eq. (4) and the information gain ratio is computed using Eq. (5).

$${\text{Gain}}\left( {X,A_{k}} \right) = {\text{Entropy}}\left( X \right) - \mathop \sum \limits_{{v \in {\text{Values}}\left( {A_{k}} \right)}} \frac{{\left| {X_{v}} \right|}}{\left| X \right|}{\text{Entropy}}\left( {X_{v}} \right)$$
(4)
$$\begin{aligned}{\text{Gain}}\;{\text{Ratio}}\left( {X,A_{k}} \right) &= \frac{{{\text{Gain}}\left( {X,A_{k}} \right)}}{{{\text{SplitI}}\left( {X,A_{k}} \right)}},\\&\quad {\text{SplitI}}\;{\text{is}}\;{\text{Split}}\;{\text{Information}}\;{\text{of}}\;{\text{dataset}}\;X. \end{aligned}$$
(5)
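Equations (4) and (5) can be sketched for a single discrete attribute. The paper does not define the split information, so the code assumes the usual C4.5 definition, namely the entropy of the partition induced by the attribute. The function names and toy labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_and_ratio(attribute_values, labels):
    """Information gain and gain ratio of one discrete attribute."""
    total = len(labels)
    subsets = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - remainder
    split_info = entropy(attribute_values)   # assumed C4.5 split information
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Toy data: the attribute separates the two classes perfectly.
attribute = ["low", "low", "high", "high"]
stages = ["stage 3", "stage 3", "stage 1", "stage 1"]
gain, ratio = gain_and_ratio(attribute, stages)
```

A perfectly separating binary attribute on two balanced classes gives a gain of 1 bit and a gain ratio of 1.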

3.2 Support Vector Machine (SVM) for Sleep Stage Classification

The sleep stage classification model is developed by training on the EEG data set, so that it can predict the target values of the test data. The leave-one-out training method is used: one element is removed from the EEG dataset, a predictive model is built from the remaining elements, and this model is used to assign the removed element to a class. The procedure is repeated for all elements.
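The leave-one-out procedure can be sketched as an index generator. This shows only the splitting logic, with model training and prediction left out. The function name is hypothetical.

```python
def leave_one_out(n_samples):
    """Yield (train_indices, test_index) pairs, holding out one sample each time."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out


splits = list(leave_one_out(4))
```

Each of the n samples is held out exactly once, so n models are trained in total.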

SVM is most often used together with a kernel function for classification [12]. Different kernel functions can be used in its implementation, such as Linear, Polynomial, Radial Basis Function (RBF) and Sigmoid. RBF is the most commonly used kernel function for EEG processing [13]. With the RBF kernel, the extracted EEG features are mapped to a space in which they can be separated linearly; the separating boundary is called the decision hyperplane. SVM finds the optimal hyperplane, which results in better generalization of the classifier. SVM is generally used for both linear and non-linear classification problems [11, 14].
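For reference, the RBF kernel has the standard form K(x, y) = exp(−γ‖x − y‖²). The sketch below evaluates it for two feature vectors; the value of γ is an assumption, since the paper does not report its kernel parameters.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # identical vectors
far = rbf_kernel([0.0, 0.0], [3.0, 4.0])    # squared distance 25
```

The kernel equals 1 for identical vectors and decays towards 0 as the vectors move apart, which is what lets the SVM treat nearby feature vectors as similar.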

3.3 Random Forest (RF) for Sleep Stage Classification

Bootstrap aggregating (bagging) is used while training the RF algorithm on the data set to classify sleep stages. The RF algorithm builds many decision trees, each from randomly selected data, which together constitute the forest. For regression, the overall prediction is the average of the values predicted by the individual trees; for classification, the votes of the individual trees are counted to decide the final class of an object.

Random forest is an ensemble learning technique that combines uncorrelated decision trees. The step-by-step RF working model is explained below and illustrated in Fig. 7.

(i) Random samples are drawn from the data set.

(ii) A decision tree is built for each sample data set.

(iii) A prediction is obtained from each decision tree.

(iv) Votes are counted for each predicted result.

(v) The final decision is made by selecting the result with the most votes.

Fig. 7 Random forest working model for sleep stage classification
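The sampling and voting steps of the working model can be sketched as below. Tree construction (step ii) is omitted. The "trees" are hypothetical threshold rules on a single dominant-frequency feature, used only so that the voting logic has something to combine. They are not the trained trees of the proposed method.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (step i)."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Return the most frequently predicted class (steps iv and v)."""
    return Counter(predictions).most_common(1)[0][0]


def forest_predict(trees, features):
    """Collect one prediction per tree (step iii) and vote on them."""
    return majority_vote([tree(features) for tree in trees])


def make_stump(cut_hz):
    """Hypothetical stand-in for a trained tree: one frequency threshold."""
    return lambda features: "stage 1" if features["freq_hz"] >= cut_hz else "stage 2"


trees = [make_stump(cut) for cut in (7.5, 8.0, 9.5)]
decision = forest_predict(trees, {"freq_hz": 9.0})
sample = bootstrap_sample([1, 2, 3, 4, 5], random.Random(0))
```

Two of the three stand-in trees vote for stage 1 at 9 Hz, so the forest returns stage 1.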

4 Results and Discussions

In the proposed study, sleep is classified into stage 1, stage 2 and stage 3 based on the entropy of the determined values, as shown in Fig. 8. Stage 1 is the REM sleep stage, which consists of frequencies of 8 Hz and above, whereas stage 2 and stage 3 are parts of NREM sleep, with frequency ranges of 4–8 Hz for stage 2 and 0–4 Hz for stage 3.

Fig. 8 Entropy-based sleep stage classification into stage 1, stage 2 and stage 3

The proposed study uses the EEG recordings of 125 patients, comprising 68,956 epochs of 10 s each. Statistical features are extracted from the EEG signals by representing them in the time–frequency domain, and these features are used to train the RF, SVM and DT algorithms to classify sleep stages with ten-fold cross-validation. An evaluation matrix is then determined, from which performance parameters such as the sensitivity, specificity and accuracy of the classification are computed as follows.

$$\begin{aligned} &{\text{Sensitivity }}\left( {{\text{Se}}} \right) = \frac{{\left( {{\text{TP}}} \right)}}{{\left( {{\text{TP}} + {\text{FN}}} \right)}}\\& {\text{Specificity}}\left( {{\text{Sp}}} \right) = \frac{{\left( {{\text{TN}}} \right)}}{{\left( {{\text{TN}} + {\text{FP}}} \right)}}\\& {\text{Accuracy}}\left( {{\text{Acc}}} \right) = \frac{{\left( {{\text{TP}} + {\text{TN}}} \right)}}{{\left( {{\text{TP}} + {\text{FP}} + {\text{TN}} + {\text{FN}}} \right)}}, \end{aligned}$$

where TP = True Positive, TN = True Negative, FP = False Positive and FN = False Negative.
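The three measures follow directly from the four counts. The counts in this sketch are made up for illustration and are not results from the paper; the function name is hypothetical.

```python
def evaluate(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Made-up counts for illustration only; these are not results from the paper.
se, sp, acc = evaluate(tp=90, tn=80, fp=20, fn=10)
```

For a multi-class problem such as three sleep stages, these counts are typically computed per stage by treating that stage as positive and the others as negative.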

The EEG data are split for training and testing the three algorithms in four ways. First, 10% of the data is used for testing and 90% for training; then 20% for testing and 80% for training; then 30% for testing and 70% for training; and finally the data are split 50–50 between testing and training.
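The four hold-out splits can be sketched as follows. The shuffling, the fixed seed and the use of plain indices in place of feature vectors are assumptions; the paper does not describe how samples are assigned to each partition. The function name is hypothetical.

```python
import random

TEST_FRACTIONS = (0.10, 0.20, 0.30, 0.50)   # testing percentages from the text


def train_test_split(items, test_fraction, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


epoch_ids = range(100)   # stand-in for indices of extracted feature vectors
sizes = {frac: len(train_test_split(epoch_ids, frac)[1])
         for frac in TEST_FRACTIONS}
```

Each classifier would then be trained on the first list and evaluated on the second for every fraction in turn.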

The analysis in Table 4 shows that the Random Forest algorithm achieves the highest accuracy, 97.80%, compared with the SVM and DT algorithms.

Table 4 Sensitivity, specificity and accuracy for sleep stage classification with different testing percentages

5 Conclusion

In the proposed method for classifying sleep stages, RF provides better accuracy than the SVM and DT algorithms. The proposed method is also compared with techniques implemented by other researchers in Table 5.

Table 5 Comparative analysis of proposed method with other sleep classification techniques

Most of these techniques considered a limited number of subjects when determining sleep stage classification accuracy, whereas our method is feasible and easy to implement. The proposed method has an advantage in accuracy and feasibility over recently published studies on sleep stage classification [21, 23, 24, 25, 26].