1 Introduction

Sleep is one of the most important physiological activities of the human body. It plays a central role in memory consolidation and strongly influences performance in daily activities, because it reflects core functions of the human brain. A person spends about one-third of their life asleep. Good-quality sleep maintains physical and mental fitness, which in turn helps people perform well in the workplace, control their emotions, and make sound decisions [1, 2]. Sleep diseases (SD) are now regarded as one of the major causes of death across the world. The main reason for this serious health issue is an imbalance of sleep patterns caused by job pressure and rapid changes in lifestyle, and the prevalence of sleep diseases has increased significantly across the globe over the past years. According to a report of the Centers for Disease Control and Prevention (CDC) of the US Government, around 9 million people have difficulty maintaining good-quality sleep [3]. According to a survey by the National Highway Traffic Safety Administration in the USA, drowsiness is a factor in around 56,000 to 100,000 car accidents annually, which result in more than 1500 deaths and 71,000 injuries [4]. Sleep diseases are therefore considered a predominant cause of death in different age groups across the globe. Sleep-related conditions are commonly categorized as obstructive sleep apnea, insomnia, hypersomnia, narcolepsy, and breathing-related disorders, along with associated conditions such as stroke, stress, and cardiovascular diseases [5]. All of these conditions become progressively more common with age.
Early diagnosis therefore helps to limit the severity of these diseases and to improve the subject's quality of life. The first and most important step in diagnosing sleep diseases is sleep scoring. The most popular test for analyzing sleep quality is polysomnography (PSG), which records signals such as the electroencephalogram (EEG), electrocardiogram (ECG), electromyogram (EMG), and electrooculogram (EOG). Sleep staging is carried out according to one of two available standards: the Rechtschaffen and Kales (R&K) rules [6] and the American Academy of Sleep Medicine (AASM) manual [7]. According to the R&K guidelines, the sleep cycle is divided into six stages: the wake stage (W), four non-rapid eye movement stages (NREM stage 1 (N1), NREM stage 2 (N2), NREM stage 3 (N3), and NREM stage 4 (N4)), and the rapid eye movement (REM) stage. The change reflected in the AASM manual in comparison with the R&K standard concerns the NREM stages: NREM stage 3 (N3) and NREM stage 4 (N4) are combined into a single stage called NREM stage 3, which gives five sleep stages in total. Traditionally, sleep scoring was performed by visual inspection, with a clinician monitoring the subject's sleep behavior over 6–8 h of sleep. This manual method requires considerable human resources and time to review the whole recording, and because it depends heavily on human interpretation, the results are sometimes erroneous [6]. Such errors are also one of the major reasons why high classification accuracy is not achieved in sleep staging. For these reasons, automated sleep scoring has gained a lot of attention in recent research [7, 8]. Automated sleep scoring not only improves accuracy but also enables quicker diagnosis [9].
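
The relation between the two standards described above can be sketched as a simple label mapping. This is only an illustration: the label strings are assumptions chosen to match the abbreviations used in this paper, and `to_aasm` is a hypothetical helper, not part of any scoring software.

```python
# Map R&K stage labels to AASM labels: N3 and N4 merge into N3.
# The label strings ("W", "N1", ..., "REM") are illustrative assumptions.
RK_TO_AASM = {
    "W": "W",
    "N1": "N1",
    "N2": "N2",
    "N3": "N3",
    "N4": "N3",  # R&K stage 4 is folded into AASM stage N3
    "REM": "REM",
}


def to_aasm(rk_hypnogram):
    """Convert a sequence of R&K stage labels to AASM stage labels."""
    return [RK_TO_AASM[stage] for stage in rk_hypnogram]


rk_stages = ["W", "N1", "N2", "N3", "N4", "N4", "REM"]
aasm_stages = to_aasm(rk_stages)
print(aasm_stages)

# Six R&K stages collapse to five AASM stages.
print(len(RK_TO_AASM), "R&K stages ->", len(set(RK_TO_AASM.values())), "AASM stages")
```

A recording scored under R&K can be compared with AASM-based results after such a conversion, but not the other way round, because the merge of N3 and N4 cannot be undone.
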
The PSG test is costly, and it is also uncomfortable for subjects because of the many wires attached to different parts of the body [10, 11]. For this reason, most researchers prefer the EEG signal to the full set of PSG signals, because it directly reflects brain activity during sleep. This is very helpful for analyzing sleep abnormalities, and the EEG is also popular because it is easier to record. In general, EEG signals are combinations of different waveforms, which help to characterize the sleep stages through different frequency bands and patterns, such as delta (0–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), spindles (12–14 Hz), sawtooth waves (2–6 Hz), and K-complexes (0.5–1.5 Hz). The final scoring decisions are made by sleep experts through quantitative and visual analysis of the collected sleep recordings. In some cases, the sleep experts use an algorithm to pre-score the entire recording. The resulting sequence of sleep stages is called a hypnogram, which is essential for the diagnosis of different types of sleep disorders. Sleep staging is generally a tedious job that requires highly experienced technicians and experts. A further limitation of sleep staging is the variation in scoring from one expert to another, which is also one of the major sources of error in diagnosing sleep diseases [12, 13].
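
To illustrate how the frequency bands listed above can be used, the following minimal sketch sums the spectral power that falls into the delta, theta, alpha, and beta bands. It is an assumption-laden example rather than the procedure of this study: it uses a naive discrete Fourier transform for self-containment, a synthetic sine wave instead of EEG data, and an assumed sampling rate of 100 Hz. The function name `band_powers` is hypothetical.

```python
import cmath
import math

# Band edges in Hz, taken from the text above (half-open intervals [low, high)).
BANDS = {
    "delta": (0.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def band_powers(signal, fs):
    """Sum squared DFT magnitudes per band using a naive O(n^2) DFT."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):  # skip the DC bin and stay below Nyquist
        freq = k * fs / n
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                powers[name] += abs(coeff) ** 2
    return powers


# Synthetic 2 s test signal: a 10 Hz sine sampled at an assumed 100 Hz.
fs = 100
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(2 * fs)]
powers = band_powers(signal, fs)
dominant = max(powers, key=powers.get)
print(dominant)  # the 10 Hz component lies in the alpha band
```

In practice a fast Fourier transform or a Welch estimate would replace the naive transform; the band bookkeeping stays the same.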

In this paper, we use a single-channel EEG signal for sleep staging analysis. This approach is attractive because it is easy to deploy on mobile devices, and it is more comfortable for patients because less cabling is used during recording. Most previous contributions based on single-channel EEG signals follow a two-step methodology. In the first step, hand-engineered features are extracted from the different waveforms; in the second step, the extracted features are forwarded to a classifier, which assigns sleep stages based on the feature characteristics. Most authors use features from one of the following three domains [14]: (a) time-domain features, (b) frequency-domain features, and (c) nonlinear features. Manually monitoring recorded EEG signals is very difficult for sleep experts and leads to many errors, because during long 7–8 h EEG recordings the expert has to inspect every 30 s epoch and assign a sleep stage label to it. This approach consumes a lot of time and manpower for each recording. To overcome the difficulties of the manual approach, automated sleep stage classification is now used to analyze sleep-related disorders and to support real-time diagnosis, and designing the sleep stage classifier is its most important step. Currently, overnight sleep study through polysomnography is one of the standard procedures for measuring sleep irregularities [14].
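
The 30 s scoring window mentioned above implies that a recording is first cut into fixed-length epochs. A minimal sketch of this segmentation follows. The sampling rate and the placeholder samples are assumptions made only to keep the example small, and `split_into_epochs` is a hypothetical helper.

```python
def split_into_epochs(samples, fs, epoch_seconds=30):
    """Split a 1-D recording into non-overlapping epochs.

    Any trailing samples that do not fill a complete epoch are dropped.
    """
    epoch_len = int(fs * epoch_seconds)
    n_epochs = len(samples) // epoch_len
    return [samples[i * epoch_len:(i + 1) * epoch_len] for i in range(n_epochs)]


fs = 10  # assumed toy sampling rate in Hz, not the rate of the dataset
recording = [0.0] * (fs * 95)  # 95 s of placeholder samples
epochs = split_into_epochs(recording, fs)
print(len(epochs), len(epochs[0]))  # 3 complete epochs of 300 samples each
```

Each resulting epoch is then the unit to which one sleep stage label is assigned.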

1.1 Related Work

Several sleep analysis studies have been proposed for characterizing sleep-related abnormalities based on the sleep standards recommended by the R&K and AASM manuals. Researchers have proposed various computational methodologies to assist sleep experts with sleep staging. These methodologies cover information extraction (polysomnography channel selection), preprocessing (removing data artifacts and normalizing the data), feature extraction (computing time- and frequency-domain features), feature selection (identifying the most relevant features), and finally the classification algorithm. Here we present some representative studies on sleep staging. In [15], the authors used wavelet-based techniques for feature extraction and classified the selected features using a fuzzy algorithm; the classification model provided 85% accuracy. Güneş, K et al. [16] used K-means clustering and feature weighting techniques to design an automated sleep stage classification (ASSC) system. The Welch spectral transform was used for feature extraction, and the selected features were forwarded to a decision tree (DT), which obtained an overall accuracy of 83%. Aboalayon [17] filtered the EEG signal with Butterworth band-pass filters, used support vector machine (SVM) classifiers, and reported 90% classification accuracy.

In [18], features were extracted from an empirical mode decomposition of the signal, and bootstrap-aggregating techniques were used for multi-class sleep stage classification.

In [19], the authors applied the ensemble empirical mode decomposition (EEMD) algorithm for signal enhancement of a single-channel EEG signal and forwarded the extracted statistical features to boosting techniques. The reported accuracies for the two- to six-class sleep stage problems are 98.15%, 94.23%, 92.66%, 83.49%, and 88.07%, respectively.

Kristin M. Gunnarsdottir et al. designed an automated sleep stage scoring system using overnight PSG data. The authors extracted both time- and frequency-domain properties from the PSG signals of healthy subjects with no prior sleep diseases and classified the extracted properties with DT classifiers. The overall accuracy on the test set was reported as 80.70% [20].

Sriraam, N. et al. used a multi-channel EEG signal from ten healthy subjects and proposed automated sleep stage scoring to distinguish between wake and stage 1 of sleep. Spectral entropy features were extracted from the input channels to distinguish the irregularities among the sleep states. The extracted features were processed by a multilayer perceptron (MLP) feedforward neural network. The overall accuracy with 20 hidden units was reported as 92.9%, and with 40, 60, 80, and 100 hidden units it was reported as 94.6%, 97.2%, 98.8%, and 99.2%, respectively [21].

In [22], the authors proposed two-state sleep staging in which the acquired signal was decomposed into eight sub-bands and 13 features were extracted from each sub-band epoch. Suitable features were identified through the mRMR feature selection algorithm, and the model achieved an overall accuracy of 95.31% with a random forest classifier. Da Silveira et al. used the discrete wavelet transform (DWT) for signal segmentation. Skewness, kurtosis, and variance features were extracted from the respective input channels and applied to a random forest classifier, and the overall accuracy was reported as 90% [23]. Prochaska et al. [24] used polysomnography data features to identify sleep abnormalities in subjects with three different medical conditions and used an SVM classifier for two two-state classifications, one between the Wake and NREM stages and another between the Wake and REM stages. The study achieved overall accuracies of 85.6% and 97.5% for the Wake-NREM and Wake-REM classifications, respectively. Xiaojin Li et al. introduced a hybrid model for identifying the irregularities that occur in different stages of sleep during the night, in which the extracted features were forwarded to random forest classifiers; the overall classification accuracy reached 85.95% [25].

2 Experimental Data

To analyze the effectiveness of the proposed methodology, we used the session-1 and session-2 sleep recordings of subjects who were already affected by different types of sleep-related disorders. These recordings were retrieved from subgroups of the ISRUC-Sleep dataset, a public database made available specifically for sleep research. The dataset contains subjects from different age groups, genders, and medical conditions. All recordings were acquired and prepared by sleep experts in the sleep laboratory of the Hospital of Coimbra University (CHUC). In line with our research objective, the first category of subjects was taken from Subgroup-I of the ISRUC-Sleep repository, and the second category was taken from Subgroup-II of the ISRUC-Sleep database. The distribution of sleep stage epochs per subject is presented in Table 1.

Table 1 Detailed information of each subject sleep dataset records used in this study

3 Methodology

In this work, we propose a machine learning-based sleep scoring system that uses a single EEG channel and subjects with different medical conditions. The main objective is to study the sleep stage behavior of subjects who already had symptoms of some type of sleep disease. Additionally, we analyzed the sleep quality of the subjects using session recordings obtained on two different dates. The complete layout of the proposed work is shown in Fig. 1.

Fig. 1 Complete layout of the proposed sleep staging system

3.1 Feature Extraction

The selection of inputs for the classifier is the most important factor in identifying sleep pattern abnormalities. Even a highly effective classification model performs poorly if proper inputs are not identified. Different classifiers produce different results for the same set of features, which indicates that the classifier and the feature set must be matched to obtain good results. On the other hand, a well-chosen set of features can favor the classification process. The sleep behavior of the subjects is highly unstable and non-stationary, because the changes in signal characteristics are directly linked to the time and frequency ranges. To discriminate the sleep stages properly, we therefore analyze the signals using time- and frequency-based parameters. In this study, we extracted a total of 28 features from the input signal, of which 13 are time-based and the other 15 are frequency-based [26–28]. The extracted features are described in Table 2.

Table 2 Short explanation of the extracted features for this proposed study
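
As an illustration of the time-based part of the feature extraction in Section 3.1, the sketch below computes a handful of common time-domain statistics for one epoch. The particular statistics are assumptions chosen for the example and should not be read as the 13 time-based features of Table 2; `time_domain_features` is a hypothetical helper, and the epoch is assumed not to be constant.

```python
import math


def time_domain_features(epoch):
    """Compute a few common time-domain statistics for one epoch.

    Assumes the epoch is not constant, so the standard deviation is non-zero.
    """
    n = len(epoch)
    mean = sum(epoch) / n
    centered = [x - mean for x in epoch]
    variance = sum(c * c for c in centered) / n  # population variance
    std = math.sqrt(variance)
    skewness = sum(c ** 3 for c in centered) / (n * std ** 3)
    kurtosis = sum(c ** 4 for c in centered) / (n * std ** 4)
    # Count sign changes between neighbouring samples.
    zero_crossings = sum(1 for a, b in zip(epoch, epoch[1:]) if a * b < 0)
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "zero_crossings": zero_crossings,
    }


# Tiny alternating toy epoch, used only to make the values easy to check by hand.
features = time_domain_features([1.0, -1.0, 1.0, -1.0])
print(features)
```

Applying such a function to every epoch and stacking the results yields one column of the feature matrix per epoch.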

3.2 Feature Selection

After feature extraction, the other important task in a classification problem is screening the relevant parameters that help the model classify the sleep stages properly. Not all of the extracted features are equally effective for analyzing sleep behavior, and ineffective features directly affect the classification performance of the models. In our study, we adopted online streaming feature selection (OSFS) as the feature screening technique to select suitable features from the pool of extracted features [29]. The features selected for each individual subject are presented in Table 3.

Table 3 Screened Feature Lists
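
The idea of screening features as they arrive one at a time can be sketched as follows. This is a deliberately simplified stand-in and not the OSFS algorithm of [29]: OSFS is usually described in terms of conditional-independence tests for relevance and redundancy, whereas this example uses plain Pearson correlation with two assumed thresholds. The names `pearson` and `streaming_select`, the thresholds, and the toy data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def streaming_select(feature_stream, labels, relevance=0.3, redundancy=0.9):
    """Process features one at a time and keep relevant, non-redundant ones."""
    selected = {}
    for name, values in feature_stream:
        if abs(pearson(values, labels)) < relevance:
            continue  # weakly related to the labels: discard
        if any(abs(pearson(values, kept)) > redundancy for kept in selected.values()):
            continue  # nearly duplicates an already selected feature: discard
        selected[name] = values
    return list(selected)


labels = [0, 0, 0, 1, 1, 1]  # toy two-class labels
stream = [
    ("f1", [0.1, 0.2, 0.1, 0.9, 0.8, 1.0]),  # follows the labels: relevant
    ("f2", [0.2, 0.4, 0.2, 1.8, 1.6, 2.0]),  # exact multiple of f1: redundant
    ("f3", [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]),  # unrelated to the labels
]
chosen = streaming_select(stream, labels)
print(chosen)  # only f1 survives both checks
```

The order in which features arrive matters in such a scheme, since a feature is only compared with those selected before it.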

4 Experimental Results and Discussion

The main intention of this research is to analyze the changes in sleep stages and to classify the sleep stages using machine learning classification models. This entire procedure is called sleep scoring. All experiments were performed on two different subgroups of sleep recordings, one from the ISRUC-Sleep subgroup-I dataset and the other from the ISRUC-Sleep subgroup-II dataset, and all sleep staging experiments followed the AASM sleep standards. The proposed sleep scoring methodology is executed in four basic steps: signal preprocessing, feature extraction, feature screening, and classification. We considered only a single EEG channel for signal acquisition. After acquisition, the raw signals were processed to eliminate the irrelevant noise and artifacts that contaminate them during recording; muscle artifacts and noisy portions were removed with a Butterworth band-pass filter. In the next phase, a set of experiments was conducted to extract features from both the time and frequency domains. In total, 28 features were extracted from the recorded signals of the subjects; they are listed in Table 2. The feature matrix has dimensions (number of features) × (number of epochs), and for all enrolled subjects its size for a 30 s epoch length is 28 × 750. The next task is to select the most effective features from the feature vector, for which we applied the OSFS feature selection technique. The matrix of selected features has dimensions (number of selected features) × (number of epochs). These matrices are 16 × 750, 17 × 750, 15 × 750, and 15 × 750 for subject-16, subject-23, subject-03 (session-1 recording), and subject-03 (session-2 recording), respectively, for an epoch length of 30 s.
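
The matrix layout described above (features as rows, epochs as columns) can be made concrete with a short sketch. The zero-filled matrix and the choice of which 16 rows are kept are placeholders, since the actual selected features are those of Table 3; `select_rows` is a hypothetical helper.

```python
N_FEATURES, N_EPOCHS = 28, 750  # sizes reported in the text

# Placeholder matrix: one row per feature, one column per 30 s epoch.
feature_matrix = [[0.0] * N_EPOCHS for _ in range(N_FEATURES)]


def select_rows(matrix, keep):
    """Return the sub-matrix holding only the feature rows listed in `keep`."""
    return [matrix[i] for i in keep]


# Hypothetical indices standing in for the 16 features kept for subject-16.
kept_indices = list(range(16))
selected = select_rows(feature_matrix, kept_indices)

shape = (len(selected), len(selected[0]))
hours = N_EPOCHS * 30 / 3600  # 750 epochs of 30 s each
print(shape, hours)  # (16, 750) 6.25
```

The arithmetic also shows that 750 epochs of 30 s correspond to 6.25 h of recording per subject.
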
The selected best features were fed as input to the SVM [30, 31] and DT [32] classifiers, which were evaluated with tenfold cross-validation. We also conducted a comparative analysis across all enrolled subjects and their session recordings. The results are presented for a single EEG channel and two sleep classes (wake versus sleep). We considered six metrics for measuring the performance of the proposed methodology: classification accuracy [33], recall [34], specificity [35], precision [36], F1-score [37], and Kappa score [38]. The comparative results of the conducted experiments are presented below.
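
The tenfold cross-validation mentioned above can be sketched as an index split over the 750 epochs of one recording. The example assumes a plain shuffled split with a fixed seed; whether the study used shuffling or stratification is not stated, so this is only one possible reading. `k_fold_indices` is a hypothetical helper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k shuffled, near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(750, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 675 75
```

Each epoch appears in exactly one test fold, so every epoch contributes once to the reported performance.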

4.1 Classification Accuracy of Category-I Subject ISRUC-Sleep Database

In this experiment, we used two subjects affected by some kind of sleep-related disorder, taking for each subject the session-1 recording made by sleep experts for diagnosing the irregularities that occur during sleep hours. Table 4 presents the confusion matrix of the two-state sleep stage classification problem for subjects 16 and 23 with an epoch length of 30 s. For subject-16, the SVM classifier achieved an overall classification accuracy of 95.62% and the DT classifier achieved 91.20%. For subject-23, the SVM and DT classifiers reached overall accuracies of 91.46% and 87.73%, respectively, for the same 30 s epoch length.

Table 4 Confusion matrix of subjects 16 and 23 according to AASM guidelines
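
For reference, the sketch below shows how accuracy, recall, specificity, precision, and F1-score follow from the four cells of a two-state confusion matrix. The counts are made up for illustration and are not the values of Table 4; which class is treated as positive is also an assumption, and `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute standard metrics from the cells of a 2 x 2 confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn)        # also called sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Made-up counts for illustration only; they are not the values of Table 4.
metrics = binary_metrics(tp=90, fn=10, fp=20, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A high recall combined with a low specificity, as seen for some recordings in this study, typically indicates that the minority class is often assigned to the majority class.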

The results achieved with 30 s epochs for subject-16 and subject-23 are given in Table 5. Figure 2 displays the performance statistics for the 30 s epoch length for subject-16 and subject-23.

Table 5 Performance of the proposed SleepEEG study using SVM and DT classifiers

Fig. 2 Performance statistics of the model with 30 s epoch duration for subject-16 and subject-23 using SVM and DT classifiers

The overall performance for the Category-I subjects of the ISRUC-Sleep database was measured through recall, specificity, precision, and F1-score. For subject-16, these were 99.59%, 81.64%, 95.14%, and 97.29% with SVM and 93.70%, 82.21%, 94.99%, and 94.34% with DT, respectively. For subject-23, the same measures were 95.17%, 82.08%, 93.09%, and 94.12% with SVM and 91.45%, 78.30%, 87.73%, and 91.45% with DT. The kappa score is interpreted on six levels of agreement: 0.81–1.00, 0.61–0.80, 0.41–0.60, 0.21–0.40, 0.00–0.20, and less than 0 correspond to excellent, substantial, moderate, fair, slight, and poor agreement, respectively [38]. Table 6 gives the kappa coefficient for each classification technique for subjects 16 and 23. The results show that all classification techniques reach excellent agreement, which supports their use in investigating sleep irregularities.

Table 6 Performance of the accuracy and Kappa score based on the two-state sleep classification problem for subjects 16 and 23
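
Cohen's kappa and the six agreement levels quoted above can be sketched as follows. The confusion-matrix counts are made up for illustration and do not come from Table 6. The function names are hypothetical, and values falling between two quoted ranges (for example 0.205) are assigned to the lower level, which is an assumption.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa for a 2 x 2 confusion matrix."""
    total = tp + fn + fp + tn
    observed = (tp + tn) / total
    # Chance agreement from the row and column totals of each class.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    return (observed - expected) / (1 - expected)


def agreement_level(kappa):
    """Map a kappa value to the agreement levels listed in the text."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "excellent"


# Made-up counts for illustration only.
kappa = cohen_kappa(tp=90, fn=10, fp=20, tn=80)
print(round(kappa, 2), agreement_level(kappa))  # 0.7 substantial
```

Unlike accuracy, kappa corrects for the agreement expected by chance, which matters when one class (here sleep) dominates the recording.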

4.2 Classification Accuracy of Category-II Subject ISRUC-Sleep Database

In the ISRUC-Sleep subgroup-II experiment, the proposed sleep stage classification model was evaluated on a single channel with two different session recordings from one enrolled subject with suspected symptoms of a sleep-related disorder. Table 7 presents the confusion matrix for both session recordings of subject-03 with an epoch duration of 30 s.

Table 7 Performance of the proposed SleepEEG study using SVM, and DT Classifiers for Subject-03 (session-2 Recordings)

The results achieved for subject-03 in both sessions are shown in Table 8. The performance results for both session recordings of subject-03 with 30 s epochs are displayed in Fig. 3.

Table 8 Performance of the proposed SleepEEG study using SVM and DT classifiers for Subject-03 (session-2 recordings)

Fig. 3 Performance measures using SVM and DT classification techniques for the two-state sleep classification model with session-2 recordings for subject-03 (30 s epoch length)

Figure 3 presents the reported performances for the two-state sleep classification model, Subgroup-II with session-2 recordings for subject-03.

For this subject, we acquired two different session recordings. For the session-1 recording of subject-03, the SVM classification model achieved an overall accuracy of 91.06% and the DT model achieved 84.26%. Similarly, for the session-2 recording of subject-03, the accuracies of SVM and DT were 89.46% and 84.2%, respectively. For the session-1 recording of subject-03 from the ISRUC-Sleep Subgroup-II database, recall, specificity, precision, and F1-score were 97.07%, 29.85%, 93.38%, and 95.19% with SVM and 93.70%, 82.21%, 94.99%, and 94.34% with the DT classifier. For the session-2 recording, the corresponding values were 98.49%, 22.47%, 90.42%, and 94.28% with SVM and 91.45%, 78.30%, 91.45%, and 91.45% with DT. The kappa coefficients for both session recordings of subject-03 are presented in Table 9.

Table 9 Performance of the accuracy and Kappa score based on the two-state sleep classification problem for subject with mild sleep problem having session-2 recording for subject-03

To measure the impact of the session recordings on sleep stage classification, we computed Cohen's kappa coefficient. For the session-1 recording of subject-03, the kappa scores of SVM and DT are 0.87 and 0.67, respectively, which indicates that DT does not perform as well as SVM. Similarly, for the session-2 recording, the kappa scores of SVM and DT are 0.86 and 0.74, respectively.

4.3 Comparative Analysis Between the Proposed Study and State-of-the-Art Works

Here we compare the proposed work with similar contributions to measure its effectiveness in identifying sleep disorders. Table 10 compares the performance of the proposed work with five previously published works based on single-channel EEG acquisition.

Table 10 Comparison of performances of the proposed work with previously published works

5 Conclusion and Future Directions

The proposed approach proved effective for sleep stage scoring using a single channel of EEG signal. The proposed SleepEEG study provides an effective mechanism for handling subjects with different health conditions, with high accuracy in identifying sleep abnormalities from sleep recordings. The main objective of this application is to analyze the irregularities that occur during sleep hours across various session recordings; it also deals successfully with subjects in specific age categories and with various disease conditions. A major part of this research work is to find suitable solutions based on the accuracy with which irregularities during sleep are identified. Another important feature of the proposed SleepEEG study is that, to the best of our knowledge, it is novel in considering different session recordings from the participating subjects in the experimental process.

This experimental study provides new directions for scoring sleep stages to identify sleep abnormalities through the extraction of features from both the frequency and time domains. Major changes were observed between the two session recordings of sleep stages from two different days, and in the general sleep stage classification problem the annotations of sleep stages are another important source of information. These observations support new lines of investigation into sleep irregularities and may help in planning a proper diagnosis and treatment of the disorder.

The proposed automated sleep stage classification scheme, based on a single channel of EEG signal, benefits from the inclusion of different session recordings and of subjects with different health conditions. The experimental results show that the proposed sleep analysis reaches excellent agreement between automated sleep staging and the gold standard.

The present research work has certain limitations: (1) the data used from the ISRUC-Sleep repository for statistical evaluation was relatively small, (2) only a single channel of EEG signal was used for classification, and (3) we did not consider subjects affected by diseases such as narcolepsy and insomnia.