1 Introduction

Nowadays it has been observed that the neurocognitive system directly decides the mental and cognitive performance in a particular task. It’s very difficult to determine the subject’s sleep behavior very accurately either numerically or any standard evaluation procedures [8]. Currently, sleep related diseases are an open challenge in the medical domain concerning different diseases such as neurology disorder, rehabilitation, and psychology-related disorders. It’s also very difficult to assess with a scenario of changes sleep characteristics in a known predictable manner. These types of diseases are more challenging concerning analysis and getting proper diagnosis solutions [23]. Sleep is one of the important ingredients for good human health and also responsible for maintaining the fitness and functioning of the different core systems of our body. It also put an impact on our proper functioning of mental and cognitive systems [12].

For human life, a total of one-third of its duration is constituted of the sleep cycle. It has been observed from several studies that sleep deficiency causes so many consequences like inability to solve the problem, not able to make proper decisions, not controlling the emotions, and reflected several changes in people [11, 70]. Sometimes the improper quality of sleep influenced different types of sleep-related disorders such as sleep apnea, insomnia, depression, narcolepsy, hypersomnia, breathing-related disorders, and circadian rhythm disorders [77]. Sometimes it has been seen that sleep deprivation is considered a stress-related disorder or sleep pathology, which causes high risk in performing some common cognitive risks such as workplace incidents, road accidents happened [25]. According to a report of the National Highway Traffic Administration of USA, due to drowsiness around one lakh, car accidents happened, as consequence more than 1500 death cases resulted and injuries cases reported around 71,000 annually [22].In this scenario, proper analysis of sleep stages is very important for identifying sleep-related irregularities. So that it is very essential to analyze the sleep stages behavior and accurate scoring of sleep states is a very crucial segment of the sleep staging process [27].

The polysomnography test is the primary step for any type of sleep-related disorder. It is a combination of different physiological signal which is useful during analyzing the sleep patterns of an individual subject. Several polysomnographic recording is recorded for sleep scoring: the EEG signal are used for monitoring the brain-behavior, the EOG signal used for monitoring eye movements and the EMG signals are tracked the changes behavior muscle tone. The entire sleep staging process is generally conducted through visualizing the sleep patterns of the subject during sleep periods by well-trained sleep experts according to two available standard sleep guidelines: the Rechtschaffen and Kales (R&K) [36] and the American Academy of Sleep Medicine (AASM) standards [78].As per AASM rules, the whole sleep stages are divided into five sleep stages: wake stage (W), non-rapid eye movement (NREM) sleep stage1 (N1), NREM sleep stage2 (N2), NREM sleep stage3 (N3), and rapid eye movement (REM) sleep stage. The main revision that occurred with AASM is, the merge of two R&K-defined sleep stages S3 and S4 into one single sleep stage, called N3.

Each defined sleep stage in AASM behaves differently during sleep periods. In this study, we are using an automated sleep stage classification system for these five sleep stages. Generally, one subject can go through all these sleep stages during its sleep cycle periods. One person can cover 3 to 5 sleep cycles, each of the time duration around 90–110 minutes during a full sleep [41]. During the first phase of the sleep cycle, the period of NREM stages is more and later part of the sleep cycles, periods REM stage increases. Earlier the sleep staging is done through visual inspection for a period of 2 to 5 hrs. in 8 hrs. of sleep for one subject. Traditionally, the sleep experts have segmented the entire sleep recordings into the 30-s interval, called one epoch, and each epoch labeled with one of the sleep stages through visualizing of its frequency and amplitude ranges, the characteristics of EEG waveforms, blinking of eye movements (EOG) and muscle movements (EMG) [76].

This traditional way of monitoring sleep stages methods has so many disadvantages such as requires more sleep experts to monitor the sleep recordings, time-consuming, and erroneous [10]. Due to more human interpretations during recording and it may not report good classification accuracy in the diagnosis of sleep stage classification [72]. Based on the above-mentioned drawbacks, automated classification of sleep stages is introduced, which ultimately gives benefits for quick diagnosis and also reported with increases of high classification accuracy [5255].

Sleep staging analysis and its scoring is a complicated procedure because of changes in sleep characteristics related to different sleep stages and also the non-stationary nature of the signal information [57].

The rest of the paper is presented as follows: In Section 2, the authors present the research contributions with related to sleep staging. Section 3 describes a brief overview on CNN model and its parameters. Section 4 describes briefly on the proposed methodology, which includes descriptions of the dataset used, proposed 9 L-1D-CNN-SSC model, model training and testing of 9 L-1D-CNN-SSC model are discussed. Section 5 presents the brief descriptions of experimental results of the proposed model. Section 6 discusses about the results and compares them with those by the state-of-the-art methods.Finally in Section 7 the concludes the paper and present the future directions.

2 Related work

Most of the authors are proposed an automatic sleep stage classification system for identifying the sleep patterns and diagnosis of several types of sleep-related disorders [29, 33, 34, 44]. In general sleep, staging procedures are conducted mainly on two strategies, one with single-channel input recording, and the other is multi-channel input recordings [60, 62, 63]. In the first approach, only one channel is considered for extracting the informative features about the sleep characteristics of the subjects. Similarly in a multi-channel system of recordings, a number PSG signal is used that is more than one EEG channel, EOG channel, and EMG channel [6566]. There is a standard procedure obtained for an automated sleep stage classification by most of the authors to their sleep staging experiments through five basic stages: 1) Signal the acquisition, 2) Pre-processing, 3) feature extraction,4) feature reduction, and 5) classification [7982]. The 3)feature extraction step is used for extracting the different characteristics parameter from preprocessed signal stage 2).These feature values can be extracted in frequency,time,time-frequency and non-linear domains [2].It has been seen that some of the s, one additional step used by authors that is feature reduction or dimensionality reduction stage. It is very helpful in screening the relevant features for the classification model. From our survey, it has reported that some of the feature selection algorithms used by different authors in their ASSC study are: principal component analysis (PCA) [46], relief algorithm [50], linear discriminate analysis (LDA) [58], minimum redundancy maximal relevance (mRMR) [71], and sequential forward and backward selection method [7374].Similarly in case of classification model used in recent automated sleep stage classification includes: support vector machine (SVM) [38, 53, 80], k-nearest neighbor (KNN) [104], k-means clustering [84], decision tree (DT) [96], bootstrap aggregating [7],random forest (RF) [20],naïve bayes [3],Gaussian mixture model (GMM) [97],adaboost [6], sparse auto encoders (SAE) [61],and artificial neural networks (ANNs) [93].

In recent research developments, deep learning techniques are becoming more popular in machine learning research applications and also used so many applications such as human-brain computing, computer vision, natural language processing, and speech recognition. Recently it has been found that deep learning concepts such as CNN [87, 99], RNN [28, 37], and LSTM [35] applied to the sleep staging approach. Currently, research on sleep staging plays an important role in NCP and Human-Machine interaction (HMI). There are several studies related to automated sleep staging using various physiological datasets and multimedia data, such as EEG, EMG, EOG, ECG, and audio, etc. One of the most popular contributions of sleep stage classification is the study of sleep behavior through human brain-computer- interaction (BCI) [102, 105].Till now also traditional machine learning techniques used for sleep staging, and recently in this research deep learning methods used in several contributions of automated recognition of sleep stages. We now look upon some of the recent contributions presented by different authors related to sleep staging using machine learning and deep learning concepts.

2.1 Polysomnography (PSG) based sleep staging using machine learning approaches

Most of the research contributions until now depend on the machine learning techniques for the recognition of sleep stages in an automated system. Krakovska et al. [35] used 6 EEG channels, 2 EOG channels, and 1 EMG channel used for recordings of the sleep behaviour and obtained features like variance, average amplitude, and spectral power. For classification of sleep stages obtained using quadratic discriminate analysis and the accuracy result reported about 74%.

In [54] the author considered multiple signals such as EEG, EOG, and EMG for the automated sleep scoring through the extraction of features like skewness, kurtosis, variance, entropy and used a dendrogram-based SVM (DSVM) classifier for classifying the sleep stages and reported accuracy for the model as 88%.

Zhu et al. [106] obtained graph-oriented features from single-channel EEG and used SVM classification techniques and the accuracy result was reported for six-state classification as 87.5%.

Hassan et al. [32] applied the EEMD algorithm for signal enhancement from single-channel EEG signal and extracted statistical features are forwarded into boosting techniques and the reported accuracy for two-six sleep stages is reported as 98.15%, 94.23%, 92.66%, 83.49%, and 88.07% respectively.

Silveria et al. [85] presented a six-state sleep staging approach using a discrete wavelet concept and obtained a random forest classifier, the model achieved 90% accuracy.

Rahman et al. [75] introduced a single-channel EOG sleep scoring approach and extracted statistical features by applying discrete wavelet transform techniques. The average accuracy reported for six state classifications through RUSBoost, RF and SVM is 90%, 91%, and 91.7%.

Memar et al. [59] proposed two-state sleep staging and the acquired signal decomposed into eight sub-bands, finally 13 features are extracted from each sub-band epoch. The suitable features are identified through the mRMR feature selection algorithm. The model achieved an overall accuracy of 95.31% through a random forest classifier.

Imtiaz et al. [42] presented automated sleep staging through home-based polysomnography signal and the model reported accuracy for training and testing dataset are 89% and 72% respectively through decision tree classification algorithm.

Dimitriadis et al. [16] proposed one channel EEG sensor ASSC techniques and estimated cross-coupling frequency (CFC) from each epoch and the system achieved an overall accuracy of 94% through multi-class Naïve Bayes classification techniques.

Sen et al. [81] proposed a sleep staging system using a single-channel EEG signal and obtained 41 multiple feature parameters. The relevant features were selected through different feature selection algorithms such as minimal redundancy maximal relevance (mRMR), ReliefF feature selection algorithm, fast correlation-based feature selection algorithm (FCBF), and Fisher score algorithm. Finally, the selected features were fed into five different classification algorithms such as support vector machine(SVM), decision tree(DT), random forest(RF), feed-forward neural network (FFNN), and radial basis function neural network (RBF) and the RF classification model reported accuracy of 97.03% for six sleep states classification problems.

Dikhya et al. [17] presented automated sleep staging system by obtaining the statistical time-domain features, structural graph similarity feature based on single-channel of EEG signal under R&K sleep scoring rules and the proposed model performed best using SVM classifier, an average accuracy of 95.53%.

T. Zhang et al. [103] proposed a novel mechanism for feature selection using the filter method with pairwise constraints. The author has obtained two different categories, one section of data having affected with mild sleep problem and the other category section, healthy controlled subjects. The whole recordings were collected from the S-EDF dataset. Finally, the model reported an accuracy of 97.66% and 93.57% with consideration of category-1 and category-2 data respectively.

Basha, A. J et al. [9] obtained the fuzzy-kernel SVM for classifying the sleep stages and extracted the statistical features and the selected features were fed into the recurrent neural network and the model reported an overall accuracy of 90.2%.

Shen, H et al. [83] proposed an improved model based on features with a combination of locality energy and state-space model for automated classifying the sleep stages based on single-channel electroencephalogram signals under the R&K and AASM sleep scoring guidelines. The model reached an overall classification accuracy of 92.04% and 78.92% using the S-EDF and Dreams dataset respectively. Similarly, the same model reported accuracy of 79.90% and 81.65% using Dreams and ISRUC-Sleep dataset.

Wang, Q. et al. [98] proposed a high-accuracy and high-efficiency automated sleep staging system using single-channel EEG data and extracted 30 features of types time, frequency, time-frequency features, and non-linear parameters. The selected features were classified through the ensemble learning stacking model and reported an overall accuracy of 96.67% for the five-class sleep stages classification task. It has been also seen that the multi-modal signals analysis also takes an important role during the diagnosis of different types of sleep-related diseases. Therefore several sleep studies were conducted by the different researchers using multi-channel signals.

Yan, R. et al. [100] develop an automated sleep staging system based on the eight combinations of the four multi-modality channels of PSG signals and obtained a total of 232 features of statistical, time, frequency, time-frequency entropy, fractal and non-linear parameters were extracted. The model reported an accuracy of 86.24% using a random forest classifier with the ReliefF selected features.

Ghimatgar, H. et al. [24] introduced a multi-modality approach using a deep learn-ing model and Hidden Markov Model (HMM) to improve the sleep staging performance using multi-channel EEG data. The experimental data collected from sixteen neonates having an age range of 38–40 weeks. The relevant features were screened using the MGCACO algorithm. The model is trained using bi-directional long-short time memory and post-processing done using HMM model. The proposed model reported an accuracy of 78.9% and 82.4% using the K-Fold cross-validation and LOOCV techniques respectively.

Cooray, N. et al. [14] presented a sleep staging system followed by Rapid Eye Movement sleep behavior disorder (RBD) detection. The extracted 156 features from EEG, EOG and EMG channels and forwarded them into the RF classifier. The model reported an accuracy of 92%.

Diykh, M et al. [17] presented the new sleep staging system based on the statistical features and weighted brain networks using multiple-channel of EEG signals under both R&K and AASM sleep scoring guidelines. The proposed model has been per-formed on the two most popular public datasets namely ISRUC-Sleep and S-EDF dataset. The model reported an average accuracy of 96.74% with C3-A2 channel under the AASM scoring standards and 96% with Pz-Oz channel under the R&K standards.

H. Korkalainen et al. [51] presented the deep learning approach for automatic sleep staging system and analyzed the severity of obstructive sleep apnea (OSA). The overnight polysomnography recordings obtained from S-EDF public dataset, both healthy and sleep apnea subjects were considered for analysis of the sleep behavior. The model reported an overall accuracy of 83.7% with single-channel EEG and 83.9% with single-channel EOG. Similarly, for the clinical data, the model achieved an accuracy of 82.9 with EEG signal and 83.8% with combinations of EEG and EOG data.

2.2 Polysomnography (PSG) based sleep staging using deep learning approaches

Nowadays the researchers are majorly focused on deep learning techniques for sleep staging because of its robustness, scalability, and adaptability with related to handle large amounts of signal recordings and it’s processing. Another important advantage related to deep learning models is, no need to require any explicit features for discriminating the subject’s sleep behaviour [30]. It has witnessed that deep learning techniques working well in different applications like image segmentation, recognition, detection, and natural language processing. It has also been observed that deep neural models are widely used in different fields of the biomedical research area. In recent research developments, it has found that notable increases happened with the use of the deep neural network in the field of biomedical signals (EEG, ECG, EMG, and EOG) [18].

Recently deep learning concepts proposed in many challenging applications by the different researchers with the input of biomedical signal data which includes epileptic seizures [18, 39, 45, 89], neurological disorders using CNN models [21, 40, 67, 94] and heart diseases using ECH channel [18, 30, 39, 51].some of the recent contributions conducted by different researchers using deep learning models for classifying sleep stages are described below here.

In [45], the author used a deep convolution neural network for automated sleep staging with the input of single-channel EEG. The model achieved an overall accuracy of 74%.

Sors A et al. [89] presented automatic sleep stages scoring for five-sleep states based on one-channel of EEG using the CNN model and the results reported for the proposed model are 87%.

Chambon et al. [94] introduced a deep learning model with the concept of multivariate signal analysis such as EEG, EOG, and EMG using KNN. The proposed model reached an overall accuracy of 80% with combinations of EEG + EOG + EMG.

In [40] the authors obtained five-layer convolution layers for classifying the sleep stages based on two-channels of EEG and EOG signal and one-channel of EMG signal and achieved result for the model is 83%.

Tripathy et al. [67] introduced a novel approach of sleep scoring based on coupling features of EEG data and RR time-series information using deep neural networks. The model resulted in an average accuracy of 95.71%, 94.03%, and 85.51% for the classification in between NREM vs REM, deep sleep vs light sleep, and sleep vs wake respectively.

Zhihong Cui et al. [21] proposed a sleep scoring system with input of 30s multi-channel signal information based on CNN and fine-grained properties, the model reported an average accuracy of 92.2% with the ISRUC-Sleep public dataset.

Supra Tk A et al. [56] designed a system of sleep scoring through extracted time-invariant information using CNN and find sleep stages transition information from the bidirectional LSTM network. The reported classification accuracy performance reached to 86.2%.

Akyol, K. et al. [5] presented a stacking ensemble learning model for analysis of the single-channel EEG signal and identifying the epileptic seizure detection. The clinical dataset was collected from the Bonn University. Finally, the author has com-pared the performance of the proposed model with the deep neural network (DNN) model. The model achieved an average accuracy value of 97.17%.

Yildirim, O. et al. [101] developed a deep learning model using a one-dimensional convolutional neural network (1D-CNN) using combinations of EEG, EOG, and EMG signal for classifying the six-two sleep stages classification problems. The model reported an overall accuracy of 91.00%, 91.22%, 92.36%, 94.64%, and 98.06% using the S-EDF dataset and similarly, the same model achieved an accuracy of 89.54%, 90.98%, 92.33%, 94.34%, and 97.62% using SE-EDF dataset for classifica-tion of six-two sleep classes problems.

Zhu, T. et al. [107] proposed a sleep staging system using a neural network with the implementation of the CNN concept and obtained inter and intra epoch features from the input signal. The model reached overall accuracies of 93.7% with S-EDF and 82.8% with SE-EDF datasets respectively.

Fernandez-Blanco, E et al. [19] proposed an ensemble technique for an automatic sleep scoring system using multiple-channel of EEG signals, and the model was trained using the convolutional neural network. The entire experiment work was done through a widely accepted bench-mark dataset as SE-EDF dataset and the test was carried out with help of leave-one-out cross-validation technique. The model achieved an accuracy of 92.67%.

C. Sun et al. [90] presented multi-class sleep stages classification using PSG signals based on a hierarchical neural network model. The model functioned in two phases, in the first phase using through feature learning stage and the second phase is the sequence learning stage. This study performed on 147 patients’ sleep recordings, which were obtained from the MASS dataset. Finally, the model reported an overall accuracy of 87.8% and an F1score of 81.8 respectively.

Chenglu Sun et al. [91] developed an automated sleep staging system based on a two-stage neural network model. During the first stage, the model learning hand-crafted features, and in the second stage the model is learned. The author also intro-duced the data augmentation techniques to resolve the class imbalance problem. The whole work was executed through two public datasets such as S-EDF and Sleep apnea dataset. The proposed model reported the result of F1score and Kappa score as 80.6% and 80% with healthy subjects, and 79% and 74% with sleep-disordered subjects.

Antoine Guillot et al. [26] proposed an automated sleep staging system using a deep learning model called as SimpleSleepNet, where the author retrieved two differ-ent categories of data from the Dreams dataset, one completely healthy controlled and other section of data are collected from the subjects who were affected by the sleep apnea subjects. The required data prepared by the five different sleep experts in the different sleep centers. Finally, the SimpleSleepNet framework achieved an average F1Score 89.9% with healthy controlled subjects and 88.3% with apnea subjects.

Mehdi Abdollahpour et al. [1] proposed the automated sleep staging system using the combinations of EEG and EOG signals. The extracted features from both the signals were separated into two different sets, one set contained the EEG features and the other set contained the fused feature of EEG + EOG. Each feature set is transformed into a horizontal visibility graph (HVG). The images of the HVG are classified by the convolutional neural network with the concept of transfer learning. The model has been performed on the two most popular datasets such as S-EDF and SE-EDF datasets. The model achieved an overall accuracy of 93.58% using the S-EDF dataset.

According to the existing contribution to sleep scoring, major challenges found that choosing the correct features which helps to distinguish the sleep stages. It has found that the maximum researchers extracted the time, frequency, and time-frequency features, then after finalizing the relevant features either manually or applied some conventional feature selection algorithm. In some cases, this selection algorithm increases the complexity factor and consumes more time. Another challenge related to feature selection is that some features are well fitted for some of the subjects but the same may not apply for another one.

The next challenge with earlier contribution is that, improper distributions of sleep epochs for all the sleep stages. This imbalance of sleep information may produce biased results with conventional machine learning algorithms. From the literature survey, it has found that the maximum researchers extracted the time, frequency, and time-frequency features, then the relevant features are selected either in manually or using some conventional feature selection algorithm, which takes the computational time and also increases the complexity factor, Another limitation regarding selected features, some of the features well suitable for classification for some of the subject cases may be the same many not applicable for other categories of subjects. This may create a problem to achieve higher classification accuracy.

Another challenge with subjects to sleep staging is that sometimes in the recorded data, it may see that the sleep epochs are not distributed equally for all the sleep stages. This imbalance of information may produce biased results with traditional machine learning algorithms. In most of the previous studies, these may not be properly addressed, so that reported classification accuracy performance is not up to the mark level. With consideration of all these issues, the authors obtained deep learning techniques for automated analysis and classification of sleep stages using polysomnography signals. Though our input data is in the form one-dimensional size. In this study, we propose a 9 L-1D-CNN-SSC for automated sleep stages classification. To recognize the sleep behavior, here the authors have proposed an end-to-end structure, without use any type of handcrafted features for learning. The proposed model learning the features automatically from the obtained layers of the model. Apart from this, other advantages of this architecture and addressing the multi-class sleep stage classification problems without changing any of its layers and its parameters for two to five sleep classes.

2.3 Contribution

The main contributions of our proposed research works are explained below:

1. The authors propose a 9 L-1D-CNN-SSC architecture for classifying multiple sleep classes based on multi-modality signal fusions under the AASM sleep scoring rules using two different categories of subjects’ sleep recordings.

2. The proposed architecture of 9 L-1D-CNN-SSC consists of a convolutional layer, pooling layer, batch normalization layer, and fully connected layer. The performance of the proposed model is also compared with the existing pre-trained model. The obtained features from the last fully connected layers of the 9 L-1D-CNN-SSC model were fed into the softmax activation function.

3. The complete sleep staging process was analyzed with the three different combinations of signals and each one executed in the individual experiment. The first three experiments were executed with the input of single-channel EEG, EMG, and EOG signal, and the final experiment is performed with combinations of the three signals, EEG + EMG + EOG.

4. The proposed methodology uses fewer parameters to train the model and extracting the prominent features from the input signal data automatically, which supports achieving the high classification accuracy incomparable to the earlier contributions. Concerning the earlier contribution of sleep staging using the CNN model by different researchers, our proposed model is well competitive with the results of existing sleep studies, which use even complex structure CNN architecture.

3 Background of CNN model

In recent years, deep learning techniques attempt excellent performance to learn the highly complicated behavior from the input biomedical signals through a designed hierarchical architecture model. Among the different modes in deep learning methods, the CNN model is a more effective technique in biomedical signals and image analysis and classification problems compared to traditional machine learning techniques. The network styles of the CNN model are quite similar to the conventional ANN model structure, a CNN model framed with compositions o the input and output layers, and a set of hidden layers. The hidden layers of the CNN model comprise a set of convolution, pooling, and fully connected layers, which extracts the highly commendable features from the input data automatically, which are more feasible in concerns to each and individual neuron representing in each layer. Like ANN, CNN also depends on the previous layer’s weight and bias information to get the final result. The typical structure of the CNN model for a one-dimensional signal is shown in Fig. 1.

Fig. 1
figure 1

Compositions of one-dimensional convolutional neural network

The entire working procedures in CNN are implemented through two basic stages 1.feedforward stage, 2.backpropagation stage.

In the feedforward stage, the given input data are fed into the designed model, each input data are multiplied with the layer’s parameters of each layer and finally, the achieved feature map values are forwarded into the network output. During the backpropagation stage, the values of the weights are adjusted in each successive layer in the network to control and reduce the error values in the obtained model by implementing the proper loss function. Finally, the error rate determines by comparing actual and desired outcomes.

The CNN model consists of several network layers such as convolution layers (CONV), rectified linear unit (ReLU), batch normalization layers (BN), pooling layers, and fully connected layer, each layers description is described in detail below:

3.1 Convolution layer

In the CNN model, the convolution (CONV) layer is the basic core section of a CNN model. Generally, in one CNN architecture, there are more than convolutions layers are there, in this layer the given inputs are processed through the set of learnable parameters and with the number of filters with different dimension sizes. From each filter as an output, we generate a set of feature maps values, which are computed from dot product small unit of given input data and its weight values. It helps to learn the features from the input signal by learning the weight parameters of the filters. The given size of input data for a convolution layer is ShXSwXSd, the required output volume size of the layer is ShnewXSwnewXSdnew.The required output computed using four hyper parameters 1. Number of filters (NF), 2. Size of filter (Fs), 3.Padding size ( Ps), 4.Stride information (SI).The general form of computation the output volume from convolution layers as follows:

$$ {S_h}^{new}=\frac{S_h-{F}_S+2\ast {P}_S}{S_I}+1 $$
(1)
$$ {S_w}^{new}=\frac{S_w-{F}_S+2\ast {P}_S}{S_I}+1 $$
(2)
$$ {S_d}^{new}={N}_F $$
(3)

3.2 Batch normalization layer

The importance uses of batch normalization (BN) layer in the CNN model are normalizing the data present inside the network [4].It also supports to increase the training speed and reducing the internal changes of covariance values. The main intention to include BN layer in the model is to make confirm that how best the activation function distributes in a stable manner throughout the training procedure.

BNn = {B1, B2, B3……………Bm} Presents a minibatch size of m.

The batch normalization of BNn is computed using this set of mathematical equations.

$$ \mu {BN}_n=\frac{1}{m}\sum \limits_{i=1}^m{B}_i $$
(4)
$$ {\sigma}^2{BN}_n=\frac{1}{m}\sum \limits_{i=1}^m{\left({B}_i-\mu {BN}_n\right)}^2 $$
(5)
$$ \hat{B_i}=\frac{\left({B}_I-\mu {BN}_n\right)}{\sqrt{\sigma^2{BN}_n+\in }} $$
(6)
$$ {B_I}^{BN_n}=\tau \ast {\hat{B}}_i+\vartheta $$
(7)
$$ \hat{\kern0.5em {B}_i}\ represent\ normalized\ input $$
$$ {B_I}^{BN}\ Batch\ normalization\ output\ for\ a\ minibatch\ {B}_n $$

Majorly BNn layer deployed in between convolution layers and ReLU layers, which permits the users to fix the higher learning rates. The most important advantage with batch normalization layer is, it mainly controls over fit issue and increase the training speed.

3.3 ReLU layer

It is a general trend in CNN model to deploy ReLU layer after each successive convolution layers, the main intention to use of this activation function is establishes the nonlinear impact in the network. In the proposed work, we have obtained two different convolution functions are used 1.ReLU and 2.Softmax.

In general, the ReLU activation function converts the negative map values into zeros and maintains the positive values. The mathematically form of ReLU function defined as

$$ \varnothing (x)=\left\{\begin{array}{c}x, if\ the\ value\ of\ x\ge 0\\ {}0, if\ the\ value\ of\ x<0\end{array}\right\} $$
(8)

The role of softmax activation function is decides the probable classification of the output classes. So that softmax function used in the final fully connected layer, for predicting which input signal is belong to wake, N1, N2, N3 and REM sleep stages. The mathematically the equation defined as:

$$ {P}_i=\frac{e^{S_j}}{\sum \limits_1^k{e}^{S_k}}\kern0.5em for\ j=1,2,3\dots \dots ..k $$
(9)

Where S is the input to the network model.

Pi is represents the output value.

The output values are lies in between 0 to 1.

3.4 Pooling layer

The main purposes of using pooling techniques in CNN model is reducing the number of trainable parameters which helps indirectly reduce in complexity in execution. Finally it helps to control the overfitting problem. After each layer of convolution, we applied the max-pooling techniques for down sampling the feature map volume size and during this layer, no parameters are to be trained. In general architecture of CNN model, the pooling layer placed in between two convolution layers, for the purpose of down sampling the feature map size. From many of studies, it has found that maxpooling techniques are more effective with concern to CNN model. So that in our proposed model obtained max-pooling techniques.

3.5 Fully connected layer

Generally in a CNN model, there is one or more fully connected layers (FC) are used after successive convolution, ReLU and max-pooling layers. The operational procedure of FC layer is same with accordance to conventional neural network, where each neuron is associated to all the preceding layer neurons. The most important concern with related to FC layer is, it holds number of learnable parameters which indirectly leading computational overhead during training. In the current research work, used only one FC layer in our proposed model.

4 Methodology

The main intention of the proposed research work is to develop an artificial intelligence-based deep learning classification model to automatically classifying the sleep stages, which alternatively help for diagnosing the major types of sleep diseases This paper proposed a novel nine-layer one-dimensional convolutional neural network-based automated classification model for multi-class sleep staging classification (9 L-1D-CNN-SSC) using polysomnography (PSG) signals. The main intention is to make use of the effective information of multi-modality combinations of the signals to improve the sleep staging classification performance. The concept of the proposed research work is shown in Figure 2 by block diagram. The proposed methodology involves signal processing concepts and proposed 9 L-1D-CNN-SSC architecture followed by finding the discriminatory features to distinguish these multi-modal signals for training and testing. Mainly this research work executed through these four parts (i) input module, which takes PSG signals,(ii) pre-processing is performed for eliminating the irrelevant noise compositions which are contaminated in the recorded signals and passes them into the 9 L-1D-CNN-SSC model (iii) final decision, the output of the 9 L-1D-CNN-SSC model are fed into the fully connected layer to take the final predictions,(iv) validated the results of the proposed 9 L-1D-CNN-SSC architecture for two to five sleep stages classification problems with the existing state-of-the-art works.

Fig. 2
figure 2

The block diagram of the proposed automated sleep scoring

4.1 Sleep stages classes

According to the AASM sleep standards, the sleep classes can be divided into two to five sleep stage classes. The only changes with the AASM sleep standards are N3 and N4 stages of the R&K standards are merged into one stage called as N3 stage. The proposed sleep staging procedure is executed under the AASM sleep scoring rules. The brief description of sleep classes considered in this proposed research work is shown in Table 1.

Table 1 The sleep class description considered in this proposed research work under the AASM standard

4.2 Experimental data

In this proposed study, we have obtained ISRUC-Sleep (ISR) datasets to evaluating the proposed sleep staging method. It contained the sleep recordings from different medical conditioned subjects who were affected by different types of sleep problems. The whole database is divided into three sections such as ISR-SG-I, ISR-SG-II, and ISR-SG-III [5].In ISR-SG-I, 100 subjects one-session sleep recordings were available, all were affected with different types of sleep-related disorder and the average age of subjects in this section is 51 years, similarly, in the ISR-SG-II, 8 subjects sleep recordings were contained, among them 6 were male and 2 were female gender. In this section, two different recordings were collected from subjects on two different dates; maximum subjects in this section were affected with sleep apnea events. It has been found that the average age of participated subjects for this section was 47. Finally, the ISR-SG-III section collected 10 subject’s sleep records, which were completely healthy, controlled, and no prior symptoms with any type of sleep problems in earlier periods. All these sleep recordings are done by the set of sleep exports at the center of sleep in the Hospital of Coimbra University. Each sleep recordings were a collection of whole-night PSG recording containing 6 EEG channel, 2 EOG channel, 3 EMG channel, 1 ECG channel, airflow, abdominal efforts, pulse oximetry, and body position information. Each recorded signal is segmented into 30-s fragments called epochs, and each epoch is represented to one particular sleep stage and labeled according to AASM sleep manuals.

In the present work, two different categories of subjects, one set belong affected with sleep-related disorders and other groups are utterly healthy category. Both the subjects’ recordings were collected from ISRUC-Sleep public repository, which was used explicitly for sleep scoring. Sleep-disordered subjects were obtained from the ISR-SG-I section; similarly, healthy subjects were acquired from ISR-SG-III. In this experiment, the authors have randomly selected five sleep-disordered subjects and five healthy controlled subjects from ISR-SG-I and ISR-SG-III respectively. The subjects’ sleep recordings concerning different stages of sleep are briefly described in Table 2.

Table 2 Description of distribution of sleep stages

The proposed model used three different channels of EEG, EMG, and EOG signals as inputs for measuring the sleep stages scoring. In Figure 2, the 30s sample of C3-A2 channel of EEG signal, ROC-A2 channel of EOG signal, and X1(Chin) channel of EMG signal sample of the sleep-disordered subject-77 record is shown for five sleep stages, similarly, the sleep behavior sample of the healthy subject-2 are shown in Figure 3. Each collection of records from the subjects was different from other individuals due to different sleep patterns. The resulting distribution of sleep stages also distinct differences in their observed sleep behavior, and it may sometimes make the model challenging to train the model.

Fig. 3
figure 3

Samples for each labeled stage of a PSG signal from ISR-SG-I a) C3-A2 channel of EEG signal) ROC-A1 channel of EOG signal) X1(Chin) of EMG signal

4.3 Preprocessing

In this section, the author has to discard non-sleep epochs from each PSG signal recordings, because sleep stages classifications of certain epochs depend on information about its neighboring epochs. Sometimes it has been observed that recorded signals are contaminated with different types of artifacts and noise information. To remove these irrelevant portions from the raw signal by implementing a Butterworth bandpass filter with the order of 10 at the frequency ranges of 0.1–45 Hz. After that, all polysomnographic signals are segmented into 30s epochs, and each epoch is labeled into a particular sleep stage. Sometimes it has been seen that more than one sleep stage information represented to one epoch, in that case, the author assigned the sleep stages to that particular epoch, which had maximum periods of appearance. Similarly scoring for N2 sleep stages, considered the previous epochs last part and current epochs first part, if there are containing more than one sleep spindles and k-complexes then that stage to be considered as N2. Similarly, if the same trends occurred in between N2 and REM sleep stages, then the middle of the epochs are labeled as N2 otherwise it is annotated as REM stage. Sleep spindles are appeared in between N2 and REM stages. Finally, we have applied normalization techniques and here we use zero mean and unit variance and a normalized feature vector is generated. Generally, it supports increasing the performance of the system.

4.4 Proposed 9 L-1D-CNN-SSC model architecture

The CNN concepts are frequently used for two-dimensional image processing [68].But it is not restricted to two or three dimensional image classifications problems only, but it also applicable for one-dimensional input data according to recent research developments.

1D-CNN model shares the same properties like other CNN models, but only the difference of 1D-CNN is with subject to convolution operation, only 1D convolution operation is performed.The known input data in the form of 1D-CNN is biomedical signal data [69].In a 1D-CNN model, with input of input signal SI, and kernel Kw, then the convolution operation is defined in Eq. (10):

$$ {\left({S}_I\ast {K}_W\right)}_n=\sum \limits_{i=1}^{\left|{K}_w\right|}{K}_W(i)\ast {S}_I\left(i+n-1\right) $$
(10)

From the above equation, the * operator determines the convolution operations in between the kernel (weights) which slides over the given input data. The resulted outcome from the convolution operation is called as feature map. The general formula for computing the output from the convolution operation is defined in Eq. (11):

$$ {O^L}_n={\left({S}_{W\left(i,j\right)}\ast W\left(i,j\right)\right)}_n $$
(11)

The operation of the convolution layer is similar to the feature extraction stage, as the outcome from this layer generated a feature map of the input matrix. The resulted feature map values down-sampled further in the network using pooling layers, further, the values are processed with a set of convolution operations. The final layer of the CNN model contains the neural network layer, which is also called as a fully connected layer, mainly it performs classification operation. Figure 4 presents the graphical representation of the proposed 9 L-1D-CNN-SSC model for the automated classification of sleep stages.

Fig. 4
figure 4

Sample sleep stages behaviour of healthy controlled subjects from ISR-SG-III a1) C3-A2 channel of EEG signal b1) ROC-A1 channel of EOG signal c1) X1(Chin) of EMG signal

The proposed 9 L-1D-CNN-SSC model comprises twenty layers: nine convolution layers (CONV-1 to CONV-9), nine max-pooling layers (Polling-1 to Polling-9), and one fully connected layer. Apart from these layers, the proposed model considered nine batch normalization and ReLU layers. The proposed network extracted hierarchical feature information automatically using a set of hidden layers. As we have mentioned earlier the use of more FC layers may increase the computational overhead. Therefore in this research work, we have only considered one FC layer, and finally, the model resulted in a vector of size five corresponding to five different classes of sleep stages, and at last, a softmax activation function is applied in the FC layer to determine the final class label.

The main objective of designing such a custom model is to increase the classification accuracy is compared to the preexisting trained model. The first convolution layer of the proposed model takes an input of preprocessed polysomnography signals of size 3000 × 1 sample points and the first layer model convolves it with 16 × 8 filters and with four stride ratios to produce the resulting feature map of size 750 × 16.The second layer of the proposed model is the pooling layer, in this work the authors have considered max-pooling techniques with filter size 2 × 2 and a stride 1 to produce a lower dimension size of output volume of size 375 × 16. The BN layer and ReLU activation applied over the output of each max-pooling layer of the models. Next, to the pooling layer, we have again applied the second convolution layer and it carried the previous convolution layer output information and convolves it with 32 × 3 kernel of size with two stride ratios, it provides the output volume of size 352 × 32.

The same procedure is repeated for seven times again in this proposed model. In the proposed model, the first convolution layer kernel of size 8 × 1 and the stride ratio set as four, further convolution layer kernel size is 3 × 1 and for the remaining layers, the stride size is 1.In this proposed 9 L-1D-CNN-SSC model the total nine max-pooling operations performed for down sampling the feature map volume size.BN and ReLU activation layers are performed over the output of the max-pooling layer.

Finally, the output result of the convolution layer is fully connected to five neurons of a fully connected layer and applied softmax function to determine the probability distribution over each class label and finally, the output decided upon the neurons having the maximum probability score. All the layers of the proposed model and its parameter description representations are given in Table 3. Algorithm 1 shows the pseudo code for basic block of the proposed model with input vector V and number of neurons N. Similarly the Algorithm 2 presents the pseudo code of the proposed 9 L-1D-CNN-SSC model with input of PSG signals for multi-class sleep stages classification.

Table 3 The configuration details of the proposed 9 L-1D-CNN-SSC architecture
figure a
figure b

The final parameterized structure of the -SSC model for multi-classification sleep stages is described as follows:

1. H1: The first hidden layers contained 16 filters with a kernel size of 8. This is followed by 4 strides with valid padding, and also obtained a rectified linear unit(ReLU) activation function and batch normalization with kernel size 2.Finally the output shape of the layer is (750,16).

2. H2: 32 filters with the kernel of size 3. This is followed by 2 strides, a ReLU activation, and batch normalization with kernel size 2. The output shape is (375, 32).

3. H3: 64 filters with the kernel of size 3. This is followed by 1 stride, a ReLU activation, and batch normalization with kernel size 2. The output shape is (187, 64).

4. H4: 64 filters with the kernel of size 3. This is followed by 1 stride, a ReLU activation, and batch normalization with kernel size 2. The output shape is (93, 64).

5. H5: 64 filters with kernel size 3. This is followed by 1 stride, a ReLU activation, and batch normalization with kernel size 2. The output shape is (46, 64).

6. H6:64 filters with kernel size 3. This is followed by1 strides, a ReLU activation, and batch normalization with kernel size 2. The output shape is (23, 64).

7. H7:64 filters with kernel size 3. This is followed by 1 stride, a ReLU activation, and batch normalization with kernel size 2. The output shape is (11, 64).

8. H8:64 filters with kernel size 3. This is followed by 1 stride, a ReLU activation, and batch normalization with kernel size 2. The output shape is (5, 64).

9. H9:64 filters with kernel size 3. This is followed by 1 stride, a ReLU activation, and batch normalization with kernel size 2. The output shape is (2, 64).

Output: 5 output classes followed by a softmax activation function. The output stage is 5 i.e., 5 classes of 5 sleep stages (Wake, NREM N1 Stage, N2 Stage, N3 Stage, and REM stage).

4.5 Model training and testing

The main important step in the deep learning structure is the training of the 9 L-1D-CNN-SSC model because it directly enhances the impact on the resulting accuracy of the entire sleep staging system. The proposed research work has used two different percentages of data for training, and testing. For experimental purposes, the author considered 70% training data, and the rest of 30% is using testing data. The model performance is validated with unseen test data. First of all, the PSG signal is presented to the semantic vector of 3000 bins. The training process is optimized using categorical cross-entropy loss. For proper memory management, split the training samples into several mini-batches that are processed during the train of the 9 L-1D-CNN-SSC model in each iteration. In each batch, 100 signals are selected randomly from the training samples (without repetition). The proposed model is trained for 100 epochs. Finally, the model is tested by using unseen test data. The hyperparameter used for training the proposed 9 L-1D-CNN-SSC model is described in Table 4 for the multi-class classification of the sleep stages. The proposed 9 L-1D-CNN-SSC model is trained by using 70% training data of the database, and finally, the 30% unused samples are used for testing the model performances. The trained 9 L-1D-CNN-SSC model classifying the sleep stages by studying the behavior of PSG signals for two to five sleep state classification problems (wake, NREM (N1), NREM (N2), NREM (N3), and REM). Those classification results are compared with the ground truth values, which provided by the sleep experts.

Table 4 Hyper parameters of the proposed model

5 Experiments and results

The whole experiments were executed on the most popular and widely used ISR-SG-I and ISR-SG-III datasets. In this study,the authors have considered both subjects affected with different types of sleep-related disorders and healthy controlled subjects.In both the datasets five sleep stages recordings information are available. The detailed information with regarding to sleep staging analysis and classification are presented in this section. Besides, both the subgroups datasets are split into 70% training data and 30% testing data.

5.1 Experimental setup

The proposed model considered PSG signals as input, which is a combination of the EEG, EMG and EOG signals. Collected PSG signal are segmented into 30-s time duration. The acquired signals are sampled with the sampling rate of 100 Hz. Each epoch of PSG data hold 3000 sample points. We have conducted eight individual experiments for both the dataset ISR-SG-I and ISR-SG-III respectively and the brief details of the experiments are presented in Table 5. The complete illustration of the experimental steps is presented in Figure 5.

Table 5 Proposed experiments and its signal combinations
Fig. 5
figure 5

Architecture of the proposed 9 L-1D-CNN-SSC model for multi-class sleep stage classification

Hardware Requirements: To execute the entire experiments and analyze the performance of the model, the authors establish an experimental set-up on a personal computer system with an Intel Quad-Core i7 4th generation processor with a Level1 cache memory of 32 KB, Level2 cache memory of 256 KB, and 4 MB size of cache memory, and with 1 TB hard disk.

Software Requirements: This research work executed Microsoft Windows 10 as the base operating system. MAT LAB 2015b version.

5.2 Performance evaluation metrics

To evaluate the performance of the proposed 9 L-1D-CNN-SSC architecture, the authors considered confusion matrix outcome with subject to sleep scoring using PSG signals.The confusion matrix four terms True Positive (TP),False Positive (FP),True Negative (TN), and False Negative (FN) are used for measuring the sensitivity [19], precision [90], F1Score [91] and classification accuracy [26]. The mathematical exp- ression of these statistical indices is defined in Eqs. (12)–(15).

$$ Sensitivity=\frac{TP}{TP+ FN}\mathrm{X}\ 100\% $$
(12)
$$ Precision=\frac{TP}{TP+ FP}\mathrm{X}100\% $$
(13)
$$ Accuracy=\frac{Sensitivity+ Specificity}{2}\mathrm{X}\ 100\% $$
(14)
$$ F1 Score=\frac{2\mathrm{X} TP}{\left(2 TP+ FP+ FN\right)} $$
(15)

5.3 Results with the input of ISRUC-sleep subgroup-I dataset

5.3.1 Experiment-1

Using EEG signal

In this experiment, the authors considered with single-channel C3-A2 of EEG signal re-cordings of five sleep-disordered subjects. For this experiment also, the ratio for training and testing is 70:30 with same number features are used. Table 6 presents the confusion-matrix generated from the training set and testing set for five-state sleep staging.

Table 6 Confusion matrix of EEG for five state sleep staging

The average accuracy resulted for training set data as 98.67% and for testing it re-ported as 98.49%. Similarly, the performances of sensitivity, precision and F1-score reported more than 96% for all the five sleep stages. The same performance results for both training and testing data are presented in Table 7.The achieved results from both the data are closely near.

Table 7 Performance values obtained using input of EEG

Similarly, with the same layer structure and hyper parameters values the proposed 9 L-1D-CNN-SSC model applied for two to four classes of sleep stages classification. The reported classification accuracy results for two-five sleep classes using single-channel EEG data are presented in Table 8.The training and testing accuracy for two-five sleep classes for 100 epochs iterations are presents in Figure 6.

Table 8 Reported Performance Results for two-five sleep class based on single-channel EEG
Fig. 6
figure 6

Detailed illustration of experimental steps

5.3.2 Experiment-2

Using EOG signal

It has observed sometimes that rapid eye movements occurred during different stages of sleep, it is also one of the sleep characteristics which may create disturb-ances in sleep patterns of the subjects. Here we extracted ROC-A1 channel of EOG signal. To explore the sleep staging behaviour with selected features, the author considered single-channel EOG information from subjects for being analysis the sleep behaviour. The ratio of training data and testing data are same kept like for other two experiment.The confusion matrix for this experiment is presented in Table 9.It can be observed that the average classification accuracy achieved with training set as 98.32% and for testing test, the result reported as 98.31%. The evaluation performance results are presented in Table 10.

Table 9 Performance values obtained using an input of single-channel EOG
Table 10 Performance values obtained using an input of single-channel EOG

Figure 7 presents the classification accuracy performance results with 70% training and 30% testing data sets using input of single-channel EOG. It has observed from the accuracy results, there is no overfitting issue occurred found from all the sleep classes. The performance graph signifies that both training and testing accuracy gain positive learning. The highest sleep staging accuracy achieved from two class and lowest from five sleep class classification.The training and resting stages achieved accuracy for two-five sleep class are presented in Table 11.

Fig. 7
figure 7

Classification accuracy performance graphs of the proposed 9 L-1D-CNN-SSC architecture using single-channel EEG with ISR-SG-I data: (a) training accuracy (b) testing accuracy for the two-five sleep classes

Table 11 Reported accuracy results using single-channel EOG for two-five sleep class

5.3.3 Experiment-3

Using EMG signal

The third experiment carried with input of X1(Chin) channel of EMG signal.The reported results of confusion matrix from both training and testing input data are presented in Table 12.

Table 12 Performance values obtained using input of single-channel EMG

The average accuracy resulted for training set data as 98.67% and for testing it re-ported as 98.49%. Similarly, the performances of sensitivity, precision and F1-score reported more than 96% for all the five sleep stages. The same performance results for both training and testing data are presented in Table 13.The accuracy performance graph representation using training and testing data with input of single-channel EMG are shown in Figure 8 and the same information present in Table 14 for two-five sleep stages classification.

Table 13 Performance values obtained using an input of single-channel EMG
Fig. 8
figure 8

Accuracy performance achieved using single-channel EOG: a1) with training data, b1) with testing data

Table 14 Reported accuracy results using single-channel EMG for two-five sleep class

5.3.4 Experiment-4

Using combinations of EEG + EMG + EOG signals

In our final experiment of this proposed study was conducted with combinations of EEG, EMG and EOG channels. These combinations provide sleep characteristics information which may give more accurate identification of the sleep irregularities and the results of the confusion matrix are tabulated in Table 15 and the reported performance results using EEG+EMG+EOG signals are presented in Table 16.

Table 15 Performance values obtained using input of single-channel EEG + EMG + EOG
Table 16 Performance values obtained using input of single-channel EEG + EMG + EOG

The average accuracy achieved with training set as 98.44%, similarly the results reported with testing data as 98.52%.The results exceeded above 95% for sensitivity, precision and F1-Score. For these combinations of multi-variate signals, the performance of classification accuracy achieved using the proposed 9 L-1D-CNN-SSC model from both training and testing data are presented in Table 17 and the same result information are presented in graphical forms in Figure 9.

Table 17 Reported accuracy results using single-channel EEG + EMG + EOG for two-five sleep class
Fig. 9
figure 9

Accuracy performance results using single-channel EMG: a2) using training data, b2) testing data

5.4 Results with the input of ISR-SG-III data

To analysis the effectiveness of the proposed methodology, the authors applied with the ISR-SG-III data, which contained only healthy controlled subjects.The experimental setup and input channel information is same as Experiment-1 to Experiment-4.The same channel C3-A2 of EEG, ROC-A1 of EOG, Chin(X1) of EMG signal used for sleep staging classification with input of healthy controlled subjects sleep recordings. Experiment-5 to Experiment-8 is executed with input of single-channel EEG, EOG, EMG signal and combinations of the EEG + EOG + EMG signals respectively. The detailed accuracy results for all the four experiments using ISRUC-Sleep (subgroup-III) dataset are given in Table 18.The graphical representation of resulted accuracy with both training and testing data using single-channel EEG,EOG,EMG, and EEG + EOG + EMG are shown in Fig. 10, Fig. 11, Fig. 12, Fig. 13 and Fig. 14 respectively.

Table 18 Accuracy performances values for ISR-SG-III data
Fig. 10
figure 10

Accuracy performance graph representation using single-channel of EEG + EOG + EMG for two to five sleep class: a3) with training data, b3) with testing data

Fig. 11
figure 11

Accuracy performance results using ISR-SG-III from single-channel EEG: a4) training accuracy, b4) testing accuracy

Fig. 12
figure 12

Accuracy performance results using ISR-SG-III from single-channel EOG: a5) training accuracy, b5) testing accuracy

Fig. 13
figure 13

Accuracy performance results using ISR-SG-III from single-channel EOG: a6) training accuracy, b6) testing accuracy

Fig. 14
figure 14

Accuracy performance results using ISR-SG-III from single-channel EOG: a6) training accuracy, b6) testing accuracy

5.5 Summary of experimental results

In the current research work, we have conducted sleep staging studies with considering two subgroups sleep recordings (ISR-SG-I/ISR-SG-III).For all the individual experiments we used only proposed 9 L-1D-CNN-SSC model and with same training and testing dataset size (70:30).The proposed model is an effective with subjects to sleep scoring without any manual feature extraction or feature screening. The reported summary results obtained from both the datasets with input of single and multi-variate signals using the proposed 9 L-1D-CNN-SSC model presents in Table 19.

Table 19 Overall accuracy performances of the proposed model for ISR-SG-I and ISR-SG-III

6 Discussion

Many similar research works have been conducted by different researchers on multiple sleep staging using different methodologies through machine learning techniques. The proposed 9 L-1D-CNN-SSC architecture can automatically be learning high-level features from the input signals directly. The obtained results indicate that the proposed scheme achieved improved sleep stages classification accuracy compared to earlier similar contributed works using the DL approach. Table 20 presents the performance comparisons results between the proposed methodology and the existing similar multiple sleep staging studies using different deep learning techniques. Similarly the performance of the proposed model also compared with the recent contributions with subject to sleep staging using different traditional machine learning classification models and the same achieved results are presented in Table 21. It has been found that the proposed framework achieved higher classification accuracy in comparison to the other state art of the contributions.

Table 20 Performance comparisons among the proposed model with different fine-tuned CNN model
Table 21 Performance comparisons of the proposed 9 L-1D-CNN-SSC model with the existing contributed automated sleep stages classification using different machine learning techniques

The performance of the proposed model is evaluated from two different subgroups of ISR dataset. The proposed model yielded the higher classification accuracy of 99.03%,99.50%, and 99.03% for three to five sleep states classification using single-channel EEG signals, similarly, the reported results for two-state sleep stages classification as 98.93% with single-channel EMG and combinations of single-channel EEG + EMG + EOG respectively using SG-I dataset. Further, the proposed model obtained a higher classification accuracy of 98.88%, 98.76%, and 98.67% using single-channel EMG for the same three to five sleep classes, similarly, the highest classification accuracy performance for two-state sleep stages as 99.26% with single-channel EOG using ISR-SG-III dataset. Furthermore, it has been observed that the obtained results from the proposed 9 L-1D-CNN-SSC model give the best classification accuracy performance for multiple sleep stages classification incomparable to the existing literature works.

The proposed 9 L-1D-CNN-SSC model results indicated that the model is performed excellent with concern to two-five sleep stages classifications incomparable to the existing literature works. It has been seen that several similar works are completely based on handcrafted features and shallow classifiers. From this point, the proposed model completely differs from the rest of the work and provides good potential for sleep staging analysis. Despite the improved performance on classification accuracy, the proposed model reported some more advantages related to the other works: (i) the proposed model eliminates the traditional techniques for classification using multi-stage pipeline architecture, which creates a lot of complexity during execution and raises a lot of errors. (ii) the proposed model learned the features automatically from the polysomnography signals during the model training, therefore it does not require any types of handcrafted features (iii) the proposed model used less number of learnable parameters for training model incomparable to the existing some of the pre-trained DL models, (iv) with the same hyperparameter values, the sleep staging performance is significantly improved for two to five sleep states classification using both the subgroups of a dataset and (v) the proposed architecture obtains higher classification accuracy performance in comparisons with the existing state-of-the-artworks.

7 Conclusion

In this paper, the authors proposed a 9 L-1D-CNN-SSC model for automated sleep stages classification using PSG signals. The proposed architecture contains 11 learnable layers, which helps to learn the features automatically from the multi-modality signals. The main objective of designing such architecture is to improve the sleep staging classification accuracy results with better learnable parameters compared to the traditional shallow learning models.The entire experiments executed in eight individual experiments based on single-channel EEG, EMG, and EOG signals and the final experiments are combinations of EEG, EMG, and EOG signals with the ISR-SG-I and ISR-SG-III data respectively.The proposed model obtained excellent performance for two to five sleep states classification for both the subgroups of the ISRUC-Sleep dataset. The results obtained from the 9 L-1D-CNN-SSC model exhibited the superiority of other existing CNN models. The proposed model architecture achieved the highest classification accuracy of 99.07%,99.50%, and 99.03% with single-channel EEG signals using the SG-I dataset, similarly, the same model reported accuracy using the SG-III dataset of 98.88%,98.76%, and 98.67% using single-channel EMG for three-five sleep stages classification. The same architecture performed accuracy of 98.91% with ISR-SG-I and 99.24% with ISR-SG-III using single-channel EOG signals for two sleep states classification tasks.

The main difference between the performance of the proposed model and state-of-the-art works is, the existing sleep staging systems are dependent upon the hand-engineered feature extraction approaches, which require tuning of parameters, their performance highly depends upon the selection of the hyperparameter and the sleep data. As such the model does not generalize well across the different types of sleep-related datasets. On the other hand, the existing models also involve laborious designs such as first extracted the features and selected the suitable features, and then input into a classifier, all these steps need hyper-parameters whose joint effort is quite laborious. On the other hand, the proposed 9 L-1D-CNN-SSC model is an end-to-end system, which is completely based on the deep learning concept, which takes input signals and gives the decision without any kind of feature extraction, selection, and laborious parameter tuning. The model learns the discriminative information automatically from the input signal data and the learning and classification process is completed automatically.

The proposed model not required any types of handcrafted features and shallow learning classification models for the classification of polysomnography signals and it can assist clinicians during the sleep scoring. To date, there is no pre-trained CNN models exist for biomedical signals. Therefore it is one of the major challenges lies here to get large amounts of the biomedical signal dataset. In the future; the authors will collect the biomedical signal data from different medical institutes and test the efficiency of the model. Furthermore, the proposed work to be extended for different types of sleep related diseases. The authors will also include the data augmentation techniques to overcome the data imbalance issues. Finally, it has been observed that the proposed framework improves the existing state of the art and achieves better classification results for the two-five sleep classification tasks. The proposed fully automated sleep staging classification systems could replace the traditional error-prone classification models.