1 Introduction

Sleep is one of the most important physiological activities of the human body. It directly affects memory consolidation and performance in daily activities, and it reflects primary functions of the human brain. Humans spend about one-third of their lives asleep. Good-quality sleep maintains physical and mental fitness, which in turn helps people perform well at work, regulate emotions, and make sound decisions [1, 2]. Sleep diseases (SD) are becoming one of the major causes of death across the world. The main reason for this serious health issue is disrupted sleep patterns, caused by job pressure and rapid lifestyle changes across the globe, and the prevalence of sleep diseases has increased significantly over the past years. According to a report of the Centers for Disease Control and Prevention (CDC) of the US Government, around 9 million people have difficulty maintaining good-quality sleep [3]. According to a survey by the National Highway Traffic Safety Administration in the USA, drowsiness causes around 56,000–100,000 car accidents annually, resulting in more than 1500 deaths and 71,000 injuries [4]. Sleep diseases are considered among the most predominant causes of death across age groups worldwide. In general, sleep disorders are categorized into types such as obstructive sleep apnea, insomnia, hypersomnia, narcolepsy, breathing-related disorders, stroke, stress, and cardiovascular diseases [5]. All of these become progressively more common with age.
Early diagnosis therefore helps to limit the severity of these diseases and to improve the subject's quality of life. The first and most important step in diagnosing sleep diseases is sleep scoring. The most popular test for analyzing sleep quality is polysomnography (PSG), which records signals such as the electroencephalogram (EEG), electrocardiogram (ECG), electromyogram (EMG), and electrooculogram (EOG). Sleep staging is performed according to one of two available standards: the Rechtschaffen and Kales (R&K) rules [6] and the American Academy of Sleep Medicine (AASM) manual [7]. According to the R&K guidelines, the sleep cycle is divided into six stages: wake (W), four non-rapid eye movement (NREM) stages (N1, N2, N3, and N4), and rapid eye movement (REM). The AASM manual differs from the R&K standard only in the NREM stages: it combines NREM stage 3 (N3) and NREM stage 4 (N4) into a single stage called N3, giving five sleep stages in total. Traditionally, sleep scoring was conducted by visual inspection, with a clinician monitoring the subject's sleep behavior over 6–8 h of sleep. This traditional method requires considerable human resources to monitor the whole recording, is time-consuming, and, because it relies on human interpretation, sometimes produces erroneous results [6]. This is also one of the main reasons why higher accuracy is not achieved in the classification of sleep stages. For these reasons, automated sleep scoring has gained a lot of attention in recent research [7, 8]. Automated sleep scoring not only improves accuracy but also provides a quicker diagnosis [9].
The PSG test is costly, and it is uncomfortable for subjects because of the many wires attached to different parts of the body [10, 11]. Hence, instead of full PSG recordings, most researchers have preferred the EEG signal, because it directly reflects brain activity during sleep. This helps considerably in analyzing sleep abnormalities, and EEG is also popular because it is easier to record. In general, EEG signals are combinations of different waveforms, which help to characterize the sleep stages through frequency bands and patterns such as delta (0–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), spindles (12–14 Hz), sawtooth waves (2–6 Hz), and K-complexes (0.5–1.5 Hz). The final scoring decisions are made by sleep experts through quantitative and visual analysis of the collected sleep recordings. In some cases, the experts use an algorithm to pre-score the entire recording. The resulting sequence of sleep stages, called a hypnogram, is essential for diagnosing the different types of sleep disorders. Sleep staging is generally a tedious job that requires highly experienced technicians and experts. Another limitation of manual sleep staging is the variation in scoring from expert to expert, which complicates the diagnosis of sleep diseases [12, 13].

In this paper, we use a single-channel EEG signal for sleep staging analysis. This approach is attractive because it is easy to deploy on mobile devices, and it is more comfortable for patients because fewer cables are used during recording. Most contributions based on single-channel EEG signals follow a two-step methodology. In the first step, hand-engineered features are extracted from the different waveforms; in the second step, the extracted features are passed to a classifier, which assigns the sleep stages based on the feature characteristics. Most authors draw features from one of three domains [14]: (a) time-domain features, (b) frequency-domain features, and (c) non-linear features. The classification models most commonly used by researchers are support vector machine (SVM) [15], decision trees [16], k-nearest neighbor (KNN) [17], k-means clustering [18], bootstrap aggregating [19], random forest (RF) [20], naïve Bayes [21], Gaussian mixture model (GMM) [22], AdaBoost [23], sparse auto encoders (SAE) [24], and artificial neural networks (ANNs) [25]. In [26], the authors extracted multiscale entropy and autoregressive features and used linear discriminant analysis to classify the sleep stages.

Zhu et al. [27] proposed automated sleep scoring based on the EEG signal. The authors extracted features from the visibility graph and used an SVM classification model to classify the sleep stages.

In [28], the authors extracted time–frequency features from the raw EEG signal and fed them into a random forest classification model.

Hassan et al. [29] extracted features from an empirical mode decomposition of the signal and used bootstrap-aggregating techniques for multi-class sleep stage classification.

In [30], the authors extracted spectral features with the tunable Q-factor wavelet transform and used a random forest classifier to classify the sleep stages.

In [31], the authors considered multiple signals (EEG, EOG, and EMG) for automated sleep scoring, extracted features such as skewness, kurtosis, variance, and entropy, and used a dendrogram-based SVM (DSVM) classifier to classify the sleep stages, reporting an accuracy of 88%.

Hassan et al. [32] applied the ensemble empirical mode decomposition (EEMD) algorithm to enhance a single-channel EEG signal and forwarded the extracted statistical features to boosting techniques. The reported accuracies for two- to six-stage classification are 98.15%, 94.23%, 92.66%, 83.49%, and 88.07%, respectively.

Silveria et al. [33] presented a six-state sleep staging approach based on the discrete wavelet transform and a random forest classifier; the model achieved 90% accuracy.

Rahman et al. [34] introduced a single-channel EOG sleep scoring approach and extracted statistical features by applying the discrete wavelet transform. The average accuracies reported for six-state classification with RUSBoost, RF, and SVM are 90%, 91%, and 91.7%, respectively.

Memar et al. [35] proposed a two-state sleep staging method in which the acquired signal is decomposed into eight sub-bands and 13 features are extracted from each sub-band epoch. Suitable features are identified with the minimum-redundancy maximum-relevance (mRMR) feature selection algorithm. The model achieved an overall accuracy of 95.31% with a random forest classifier.

Imtiaz et al. [36] presented automated sleep staging from home-based polysomnography signals using a decision tree classification algorithm; the model reported accuracies of 89% and 72% on the training and testing datasets, respectively.

Dimitriadis et al. [37] proposed a single-channel EEG automatic sleep stage classification (ASSC) technique that estimates cross-frequency coupling (CFC) from each epoch; the system achieved an overall accuracy of 94% with a multi-class naïve Bayes classifier.

Most existing state-of-the-art works are based on EEG signals. However, other physiological behavior, such as muscle movements and rapid eye movements, is also associated with sleep irregularities, so it is necessary to consider muscle activity and eye movements in a sleep scoring system. In this research work, we propose an automated sleep staging system based on polysomnography signals, in which sleep behavior is captured using single channels of the EEG, EOG, and EMG signals. The work is carried out through four individual experiments: the first three perform sleep staging using a single channel of EEG, EOG, or EMG, and the fourth uses the combination of the EEG, EOG, and EMG signals. We use the ISRUC-Sleep subgroup-I (SG-I) data, and all experiments follow the AASM scoring rules. The rest of the paper is organized as follows: Sect. 2 describes the dataset and methodology used in this work, Sect. 3 presents the experimental results of the proposed model, and Sect. 4 concludes the paper.

2 Methodology

In this paper, we propose an efficient and reliable automatic sleep stage classification system based on polysomnography signals using machine learning techniques. Figure 1 presents the steps of the proposed sleep staging system, and the following subsections describe each step in detail. The proposed system follows four basic phases. In the first phase, the recorded signals are preprocessed. In the second phase, time- and frequency-domain properties are extracted from the preprocessed signals. In the third phase, a feature selection technique is applied to assess the relevance of the features to the classification model. Finally, in the fourth phase, the screened features are forwarded to the classification model. The entire experimental work was coded and executed in MATLAB.

Fig. 1 The proposed research work framework

2.1 Experimental Data

In this study, we use data from subjects who are either completely healthy or have different medical conditions. The data come from an open-access comprehensive sleep dataset called ISRUC-Sleep, which covers human adults and contains recordings of both healthy subjects and subjects with suspected sleep disorders. The data were collected at the Sleep Medicine Centre of the Hospital of Coimbra University (CHUC) [38]. The dataset is organized into three subgroups. The first subgroup includes 100 subjects, with one recording session per subject. The second subgroup consists of eight subjects with two recording sessions per subject. The third subgroup includes ten healthy subjects with one recording session per subject. In this study, we use the ISRUC-Sleep subgroup-I (SG-I) dataset. The signals are sampled at 200 Hz, and the length of each epoch is 30 s according to the AASM standard. As per AASM, the sleep stages are labeled as awake (W), NREM (N1, N2, and N3), and REM (R). This dataset contains bio-signal recordings of EEG, EOG, and EMG signals collected using 11 electrodes. Table 1 shows the distribution of sleep stages in the ISRUC-Sleep subgroup-I dataset.

Table 1 Description of distribution of sleep stages

2.2 Preprocessing

The signals recorded from the subjects are contaminated with different types of artifacts, such as muscle twitching, motion, and eye blinks, which could limit the analysis of how sleep characteristics change across the sleep stages. We therefore removed these noises and artifacts using a 10th-order Butterworth bandpass filter with a passband of 0.5–49.5 Hz. The sleep recordings are segmented into epochs of 30 s each.
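
The preprocessing step can be sketched as follows. This is a minimal illustration in Python with NumPy and SciPy, not the MATLAB code used in this work. The function names `bandpass` and `to_epochs`, the synthetic test signal, the zero-phase application of the filter, and the reading of "10th order" as a band-pass design built from a 5th-order prototype are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 200         # sampling rate in Hz, as stated for ISRUC-Sleep
EPOCH_SEC = 30   # epoch length in seconds (AASM)


def bandpass(signal, low=0.5, high=49.5, fs=FS, half_order=5):
    """Butterworth band-pass filter (hypothetical helper).

    butter() doubles the order for band-pass designs, so half_order=5
    gives a 10th-order filter. How the paper counts the order is an assumption.
    """
    sos = butter(half_order, [low, high], btype="bandpass", fs=fs, output="sos")
    # Forward-backward (zero-phase) filtering is an assumption; the paper
    # does not say how the filter was applied.
    return sosfiltfilt(sos, signal)


def to_epochs(signal, fs=FS, epoch_sec=EPOCH_SEC):
    """Cut a 1-D recording into non-overlapping epochs, dropping the remainder."""
    samples_per_epoch = fs * epoch_sec
    n_epochs = len(signal) // samples_per_epoch
    trimmed = np.asarray(signal)[: n_epochs * samples_per_epoch]
    return trimmed.reshape(n_epochs, samples_per_epoch)


# Synthetic 65 s recording: a 10 Hz component inside the passband
# plus an 80 Hz component outside it (illustrative data only).
t = np.arange(65 * FS) / FS
raw = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 80 * t)

filtered = bandpass(raw)
epochs = to_epochs(filtered)   # two complete 30 s epochs of 6000 samples each
```

Second-order sections are used here because high-order Butterworth designs with a very low cut-off can be numerically fragile in transfer-function form.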

2.3 Features Extraction

It is difficult to analyze the sleep behavior of the subjects directly from the preprocessed signals, because the recorded signals are highly random and their behavior changes continuously in both time and frequency. Features that properly capture the sleep characteristics are therefore essential for sleep scoring. In this study, we extract both time- and frequency-domain features to discriminate the sleep characteristics of the subjects. Because human sleep is highly variable in nature, it is also important to study the signal with non-linear measures. In total, we extracted 29 features: 12 time-domain features, 15 frequency-domain features, and 2 non-linear features. The 12 time-domain features are mean, median, mode, minimum, maximum, standard deviation, variance, skewness, kurtosis, percentile, and the Hjorth parameters. The 15 frequency-domain features are relative spectral power, band power for the δ, θ, α, and β frequency sub-bands, and seven power ratios. The two non-linear features are zero-crossing rate and spectral entropy.
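
A few of the listed features can be computed per epoch as in the sketch below. It is written in plain Python for illustration and is not the MATLAB implementation used in this work. It covers only a subset of the 29 features (basic statistics, skewness, kurtosis, Hjorth parameters, and zero-crossing rate), and the function names and dictionary keys are hypothetical. The Hjorth parameters follow their usual definitions based on the variance of the signal and of its first and second differences.

```python
import math
import statistics


def first_difference(x):
    """Discrete approximation of the derivative."""
    return [b - a for a, b in zip(x, x[1:])]


def hjorth_parameters(x):
    """Return (activity, mobility, complexity) of one epoch."""
    dx = first_difference(x)
    ddx = first_difference(dx)
    activity = statistics.pvariance(x)
    mobility = math.sqrt(statistics.pvariance(dx) / activity)
    mobility_dx = math.sqrt(statistics.pvariance(ddx) / statistics.pvariance(dx))
    return activity, mobility, mobility_dx / mobility


def zero_crossing_rate(x):
    """Fraction of consecutive sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a < 0) != (b < 0))
    return crossings / (len(x) - 1)


def time_domain_features(x):
    """Subset of the time-domain features for one epoch (keys are illustrative)."""
    mean = statistics.fmean(x)
    std = statistics.pstdev(x)
    n = len(x)
    activity, mobility, complexity = hjorth_parameters(x)
    return {
        "mean": mean,
        "median": statistics.median(x),
        "minimum": min(x),
        "maximum": max(x),
        "std": std,
        "variance": std ** 2,
        "skewness": sum((v - mean) ** 3 for v in x) / (n * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in x) / (n * std ** 4),
        "hjorth_activity": activity,
        "hjorth_mobility": mobility,
        "hjorth_complexity": complexity,
        "zero_crossing_rate": zero_crossing_rate(x),
    }


# Illustrative epoch: 30 s of a 10 Hz sine sampled at 200 Hz.
epoch = [math.sin(2 * math.pi * 10 * k / 200) for k in range(6000)]
features = time_domain_features(epoch)
```

For a pure sine wave the Hjorth complexity is close to 1, which gives a quick sanity check of the implementation.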

2.4 Feature Selection

Not every extracted feature is suitable for every subject, and unsuitable features may bias the performance of the model. It is therefore important to screen the features before forwarding them to the classification model. Here, we use the ReliefF feature selection algorithm, which assigns a weight to each individual feature and uses these weights to rank the features by relevance [39].
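
The weighting idea behind ReliefF can be illustrated with the compact sketch below. It is a simplified plain-Python version written for this explanation, not the routine used in this work: the function name `relieff`, its parameters, the Manhattan distance on range-normalized features, and the toy data are all assumptions. For each sampled instance, the weight of a feature decreases with its difference to the nearest neighbors of the same class (hits) and increases with its difference to the nearest neighbors of other classes (misses), with misses weighted by class prior.

```python
import random
from collections import Counter


def relieff(X, y, n_neighbors=3, n_iterations=None, seed=0):
    """Return one relevance weight per feature (simplified ReliefF sketch)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    lo = [min(row[f] for row in X) for f in range(d)]
    hi = [max(row[f] for row in X) for f in range(d)]
    span = [(hi[f] - lo[f]) or 1.0 for f in range(d)]  # avoid division by zero
    priors = {c: count / n for c, count in Counter(y).items()}

    def diff(f, a, b):
        # Feature difference normalized to [0, 1].
        return abs(X[a][f] - X[b][f]) / span[f]

    def distance(a, b):
        return sum(diff(f, a, b) for f in range(d))

    m = n_iterations or n
    weights = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)
        for c, p_c in priors.items():
            candidates = [j for j in range(n) if j != i and y[j] == c]
            nearest = sorted(candidates, key=lambda j: distance(i, j))[:n_neighbors]
            if not nearest:
                continue
            # Hits lower the weight; misses raise it, scaled by class prior.
            scale = -1.0 if c == y[i] else p_c / (1.0 - priors[y[i]])
            for j in nearest:
                for f in range(d):
                    weights[f] += scale * diff(f, i, j) / (m * len(nearest))
    return weights


# Toy data: feature 0 separates three classes, feature 1 is pure noise.
rng = random.Random(1)
X, y = [], []
for label in (0, 1, 2):
    for _ in range(20):
        X.append([label + rng.gauss(0, 0.1), rng.random()])
        y.append(label)

weights = relieff(X, y)
ranking = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
```

Features are then kept in order of decreasing weight; the cut-off (a threshold or a fixed number of features) is a design choice that is not specified here.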

2.5 Classification

To distinguish the different characteristics of sleep stages, we employ one machine learning classification algorithm, random forest (RF).

Random Forest (RF): This algorithm was proposed by Breiman and is one of the popular classification techniques; it trains multiple decision trees on the data and combines them to predict new samples [40]. Each tree is a separate classifier trained on randomly sampled data. The major difference between RF and other classification techniques is that the input for each tree is selected at random using bootstrap sampling. Combining many such trees makes the model less sensitive to noisy samples and outliers, and the final output is computed by majority voting.
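
The two ingredients named above, bootstrap sampling and majority voting, can be sketched in a few lines. The example below is a deliberately reduced illustration in plain Python: one-level decision trees (stumps) stand in for the full trees of a real random forest, and all function names, parameters, and data are hypothetical. It is not the classifier used in this work.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y, features):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= threshold]
            right = [lab for row, lab in zip(X, y) if row[f] > threshold]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(lab != l_lab for lab in left)
            errors += sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, l_lab, r_lab)
    if best is None:  # no valid split: always predict the majority class
        lab = majority(y)
        return (features[0], float("inf"), lab, lab)
    return best[1:]


def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        feats = rng.sample(range(d), n_features)     # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Toy two-class data with two informative features (illustrative only).
X = [[0.1 * i, 0.1 * (i % 7)] for i in range(10)]
X += [[3 + 0.1 * i, 3 + 0.1 * (i % 7)] for i in range(10)]
y = [0] * 10 + [1] * 10

forest = fit_forest(X, y)
```

Calling `predict(forest, [-0.5, -0.5])` and `predict(forest, [5.0, 5.0])` returns class 0 and class 1, respectively, for this toy data.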

2.6 Model Performance Evaluation

In this study, we validate the performance of the proposed system using the following metrics: accuracy [41], sensitivity [42], specificity [43], precision [44], and F1-score [45].
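
For a multi-class problem, these metrics are commonly computed per stage from the confusion matrix in a one-vs-rest manner. The sketch below shows this under stated assumptions: rows are taken as true stages and columns as predicted stages, the function name `per_class_metrics` is hypothetical, and the 3 × 3 matrix contains made-up counts for illustration, not results from this work.

```python
def per_class_metrics(cm, labels):
    """One-vs-rest metrics per class from a square confusion matrix.

    cm[i][j] = number of epochs of true class i predicted as class j
    (this orientation is an assumption).
    """
    total = sum(sum(row) for row in cm)
    metrics = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                   # true class k, predicted otherwise
        fp = sum(row[k] for row in cm) - tp    # predicted k, true class otherwise
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + sensitivity
        metrics[label] = {
            "accuracy": (tp + tn) / total,
            "precision": precision,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": 2 * precision * sensitivity / denom if denom else 0.0,
        }
    return metrics


def overall_accuracy(cm):
    """Fraction of all epochs on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


# Made-up confusion matrix for three stages (illustrative counts only).
labels = ["W", "N2", "REM"]
cm = [
    [50, 2, 3],
    [5, 40, 5],
    [0, 10, 35],
]
metrics = per_class_metrics(cm, labels)
```

Per-stage accuracy computed this way counts true negatives as correct, which is why it can be much higher than the overall accuracy returned by `overall_accuracy`.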

3 Results and Discussion

The research work is conducted with two different categories of subjects (SG-I and SG-III) of the ISRUC-Sleep dataset. The SG-I data contain the sleep behavior of subjects affected by different sleep syndromes, whereas the SG-III data contain the sleep behavior of healthy control subjects. The sleep recordings were annotated according to the AASM sleep standards, and each epoch is 30 s long. The work is executed as four individual experiments: the first three use individual channels of the EEG (C3-A2), EOG (ROC-A1), and EMG (X1) signals, and the final experiment uses the combination of the EEG, EOG, and EMG signals. First, we applied preprocessing to remove irrelevant noise, muscle movement, and eye blink artifacts from the acquired channels using the 10th-order Butterworth bandpass filter. Because brain behavior is highly complex, we extracted signal characteristics in both the time and frequency domains to analyze the sleep behavior properly, which directly helps to recognize disturbances during the sleep period. In total, 29 features are extracted from the input signals. Another advantage of this study is the inclusion of a feature screening algorithm, which selects from the feature pool the features that best discriminate the sleep characteristics of the individual sleep stages. We used the ReliefF feature selection algorithm, which rates the importance of each feature by generating a weight for it and thereby identifies the most suitable features for the classification task. Finally, the selected features are fed into the classifier to classify the multi-class sleep stages. In this study, we consider classification into five sleep stages.
The recordings are split into training and testing portions. For all experiments in this work, 70% of the data is used for training and the remaining 30% for testing. All code was written and executed in MATLAB (version 2017a) on a system with an i7-7700HQ 2.81 GHz CPU and 8 GB of RAM. Finally, the proposed model is evaluated using performance metrics such as accuracy, precision, sensitivity, specificity, and F1-score.
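
A 70/30 split of this kind can be sketched as follows. The function name `train_test_indices`, the fixed seed, and the epoch-wise random shuffling are assumptions made for the example; whether the split in this work was made over epochs or over subjects is not specified.

```python
import random


def train_test_indices(n_epochs, train_fraction=0.7, seed=0):
    """Shuffle epoch indices and split them into training and testing sets."""
    indices = list(range(n_epochs))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    cut = int(round(train_fraction * n_epochs))
    return indices[:cut], indices[cut:]


# Example with 1000 epochs (an arbitrary number chosen for illustration).
train_idx, test_idx = train_test_indices(1000)
```

Splitting by subject instead of by epoch would keep all epochs of one person on the same side of the split, which gives a stricter estimate of how the model generalizes to unseen subjects.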

3.1 Results with Input of ISRUC-Sleep Subgroup-I Dataset

Experiment-1 (Single-channel EEG signal)

The first experiment is based on EEG signals. The confusion matrix obtained on the testing data is shown in Table 2, and the performance metrics are presented in Table 3.

Table 2 Confusion matrix obtained using single-channel EEG
Table 3 Classification results of the sleep stages with C3-A2 channel of EEG signal

Table 3 shows that the highest accuracy, precision, sensitivity, specificity, and F1-score are reported for the wake stage (99.09%), N2 stage (98.88%), N2 stage (98.19%), REM stage (99.70%), and N1 stage (96.98%), respectively.

Experiment-2 (Single-channel EMG signal)

In this experiment, we used the X1 (chin) channel of the EMG signal as input. The confusion matrix for this experiment is shown in Table 4, and the performance metrics are described in Table 5.

Table 4 Performance values obtained using input of single-channel EMG
Table 5 Performance metrics results using single-channel EMG signal

From Table 5, the highest accuracy, precision, sensitivity, specificity, and F1-score are 99% (W stage), 98.99% (N1 stage), 98% (N2 stage), 99.45% (N1 stage), and 98.49% (N2 stage), respectively.

Experiment-3 (Single-channel EOG signal)

The third experiment is conducted with the ROC-A1 channel of the EOG signal as input. The confusion matrix for this experiment is presented in Table 6, and the performance of the model on the different evaluation metrics is presented in Table 7.

Table 6 Performance values obtained using input of single-channel EOG
Table 7 Performance evaluation results using single-channel EOG channel

Table 7 shows that accuracy, precision, and specificity are highest for the N1 sleep stage, whereas sensitivity and F1-score are highest for the N2 sleep stage.

Experiment-4 (using EEG+EMG+EOG signals)

In the fourth and final experiment, the input to the model is the combination of the EEG, EMG, and EOG channels. The confusion matrix for this experiment is presented in Table 8, and the performance metrics are presented in Table 9.

Table 8 Performance values obtained using input of single-channel EEG+EMG+EOG
Table 9 Performance values obtained using input of single-channel EEG+EMG+EOG

Table 9 shows that the model performs better with the combined input channels than in the three individual-channel experiments. The highest accuracy is achieved for the N1 stage (99.40%), precision for the N2 stage (99.34%), sensitivity for the N2 stage (99.26%), specificity for the N3 stage (99.71%), and F1-score for the N2 stage (99.30%).

Table 10 shows that the proposed sleep staging study using PSG signals gives the best classification performance, 99.14%. To validate these classification results, we compare the results of the proposed model with existing state-of-the-art works in Table 11.

Table 10 Overall accuracy results for Experiment-1 to Experiment-4
Table 11 Performance comparison of state-of-the-art works results with the proposed model performance results

4 Conclusion

In this research work, we proposed an automated sleep staging system using PSG signals, the ReliefF feature selection algorithm, and the RF classification model. To analyze the sleep behavior of the subject, a set of linear and non-linear features was extracted from the PSG signal segments. The proposed methodology analyzes the changes in sleep behavior across the different sleep stages. The proposed work reported higher sleep staging performance than the existing studies considered. The proposed model could be used to support the diagnosis of sleep-related disorders in real-time applications. In future work, we plan to extend this work by using different epoch lengths as input and by applying deep learning.