Abstract
In recent years, artificial intelligence (AI) has developed rapidly, and a growing number of modern technologies are being applied to surveillance and monitoring. Healthcare monitoring is becoming ubiquitous in wearable devices such as smart watches, electrocardiogram (ECG) necklaces, and smart bands. Sensors attached to these devices record physiological signals produced by the wearer's activities and forward the recorded electrical data for further processing, which can support health diagnosis, disease prevention, or automatic distress calls. Obstructive sleep apnea (OSA) is a sleep disorder with a high occurrence in adults and is regarded as an independent risk factor for circulatory problems such as ischemic heart attacks and stroke. Numerous traditional neural-network-based methods have been developed to detect OSA, but they have not delivered the intended results because they rely on shallow networks. In this paper, we propose an effective OSA detection method based on a convolutional neural network (CNN). Our method first extracts features from Apnea-ECG recordings using RR-intervals (the time interval from one R-wave to the next R-wave in an ECG signal); a CNN model with three convolution layers and three fully connected layers is then trained on the extracted features and applied to OSA detection. The first two convolution layers are each followed by batch normalization and a pooling layer, and a softmax is connected to the last fully connected layer to give the final decision. Experimental results on features extracted from Apnea-ECG signals show that our model achieves better sensitivity, specificity, and accuracy. We expect that the technology can be applied in smart sensors, especially wearable devices.
1 Introduction
Fifteen years ago, the body area network (BAN) was proposed as a key infrastructure element for patient-centered medical applications. At that time, it was expected that by the year 2010 technology would enable people to carry a personal BAN providing medical, lifestyle, assisted-living, sports, or entertainment functions [18]; this expectation remains relevant today. Since then, sensor technologies, embedded systems, wireless communication, miniaturization, and AI have advanced greatly and have brought many smart elements into everyday life. Many sensors and devices have been invented in recent years, and wearable devices have become popular [24]. Healthcare devices play an important role in health monitoring in modern societies [5, 11, 32], especially for the aging population. These devices make continuous monitoring possible even without hospitalization. Moreover, many technologies are applied in wearable devices [30]. With the help of wearable sensors, the physiological signals as well as the physical activities of a patient can be monitored [1, 10, 31]. Sensors themselves are relatively easy to produce; the difficult part is the key analysis technology. In this paper, deep learning is used to detect OSA from the ECG signal. We believe that wearable technologies will influence future medical technology, affecting our health and fitness decisions, redefining the doctor-patient relationship, and greatly reducing healthcare costs. Figure 1 gives a potential application scenario, in which family members monitor the patient through wireless communication. Many kinds of sensors can be used; in this paper, we focus on the analysis and recognition of the ECG signal.
In medical science, a partial reduction in breathing during sleep is called hypopnea, whereas a complete cessation of breathing is called apnea. OSA refers to episodes, varying in duration and frequency, in which a person has either difficulty breathing or a complete cessation of breathing during sleep. These two forms of sleep-disordered breathing have various causes. One is pharyngeal collapse during sleep, which leads to choking, intense snoring, sudden and frequent awakening, and disrupted sleep.
Recent studies suggest that 4% of men and 2% of women over 50 years of age suffer from symptomatic OSA [11, 29]. Additionally, 2% to 4% of middle-aged adults and 1% to 3% of children suffer from OSA [2]. Despite its prevalence, most cases go undetected, and OSA has been credited with 70 billion dollars in losses, 11.1 billion in damages, and 980 deaths each year [2]. The traditional way to detect sleep disorders is polysomnography (PSG) at a sleep laboratory. PSG records breathing airflow, respiratory movement, oxygen saturation, body position, electromyography (EMG), electroencephalography (EEG), and electrocardiography (ECG) [9] for detection and treatment. However, this technique is expensive, the equipment is not always available, and testing is inconvenient because a technician must supervise the recording overnight.
Over the past few years, several methods have been suggested for the detection of sleep apnea. Using the mean absolute amplitude (MAA), [22] suggested that thoracic and abdominal signals are good indicators for detecting sleep apnea; the method achieves 80% accuracy and 74% sensitivity on the designated dataset. OSA detection using speech signals was investigated by [3]. The designed model uses prospective patients' speech recordings to diagnose OSA automatically and does not rely on phoneme recognition or segmentation. The classification scheme uses the non-silent segments of the patient's speech signal, and thus better captures the hidden information in the speech signal and reveals the dynamics of the vocal tract.
Related to our work, various traditional neural network methods [3, 20, 25, 37] for obstructive sleep apnea detection have been studied extensively. For example, [3] proposed OSA detection using neural network (NN) classification of time-frequency plots of the heart rate variability. The method uses texture features extracted from normalized gray-level co-occurrence matrices of the image obtained by the short-time discrete Fourier transform. The extracted features are used as input to a three-layer multilayer perceptron for detection. [20] proposed NN-based feature selection and identification of OSA and reached up to 70% classification accuracy. In this scheme, the NN serves two purposes: choosing the optimal frequency bands for identification during feature extraction, and performing detection during the feature matching stage. Also, [2] applied a support vector machine to ECG signals to detect OSA. The model was trained and tested on both OSA and non-OSA subjects and obtained up to 96.5% classification accuracy. All the aforementioned NN-based methods consider only shallow networks, especially for the classification of OSA from ECG recordings. However, deep networks [8, 14, 15, 19] have been shown to provide better classification results than shallow networks.
In this paper, taking advantage of the strength of CNNs in image recognition and classification, we introduce an efficient framework for OSA detection based on a convolutional neural network applied to sleep Apnea-ECG recordings. First, we extract features from the Apnea-ECG recordings using RR-intervals, and the extracted RR-intervals are then used as input to the designed CNN model. The model has three convolution layers, the first two of which are each followed by batch normalization and max-pooling layers. The third convolution layer is followed by three fully connected layers, and the last fully connected layer is connected to a softmax classifier for the final decision. Details of our method are explained in Section 3. Figure 2 shows the details of our model architecture.
The rest of this paper is organized as follows. Section 2 gives a brief description of neural networks, with CNNs as the primary concept. The proposed model is explained in detail in Section 3. Section 4 presents the experimental results. Finally, Section 5 concludes the paper.
2 Neural network
Neural networks are a computational model [4, 6, 7, 36] employed in computer science and other research areas. They are built from a large group of simple neural elements (artificial neurons), loosely analogous to the observed behavior of a biological brain's axons. Each neural element is connected to several others, and these connections can enhance the activation state of adjoining neural elements. The objective of a neural network is to solve problems in a way similar to the human brain. For input training samples \(x_1, x_2, \ldots, x_N\), the output of each individual neural unit is computed as
\[ y = f\left(\sum_{i=1}^{N} w_i x_i + b\right) \]
where \(w_i\) and \(b\) denote the weights and the bias, respectively, which may initially be random numbers and are later learned by the model itself, \(f\) represents an activation function such as the sigmoid or ReLU function, and \(N\) denotes the number of training samples. A threshold, called a bias, may exist on each connection and on the element itself, such that the signal must exceed the threshold before propagating to other neurons. Such systems are self-learning and trained rather than explicitly programmed, and they excel in areas where feature detection is hard to express in a conventional computer program.
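The computation of a single neural unit described above can be illustrated with a minimal sketch. It assumes the standard weighted-sum form, in which the activation function is applied to the sum of the weighted inputs plus the bias; the function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear activation, max(0, z)."""
    return max(0.0, z)


def neuron_output(inputs, weights, bias, activation):
    """Apply the activation to the weighted sum of the inputs plus the bias."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(pre_activation)


# Hypothetical example values (not taken from the paper).
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1

print(neuron_output(x, w, b, sigmoid))
print(neuron_output(x, w, b, relu))
```

In this example the weighted sum is negative, so the ReLU unit is inactive while the sigmoid unit returns a value below 0.5.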
2.1 Convolutional neural networks
A feed-forward neural network can be understood as a composition of several functions,
\[ f(x) = f_K(\cdots f_2(f_1(x; w_1); w_2) \cdots ; w_K) \]
Each function \(f_k\) takes \(x_k\) as input (\(x_k\) can be an image or a sound) together with a parameter \(w_k\) and produces an output \(x_{k+1}\). Although the type and sequence of the functions are usually handcrafted, the parameters \(w = (w_1, \ldots, w_K)\) are learned from data to solve the target problem. The parameters can be initialized from a normal distribution with zero mean, and their optimal values are then learned by the model. Each function \(f_k\) is a non-linear transformation applied to its input that is local and translation invariant.
2.2 Back propagation in CNNs
The parameters of a CNN, \(w = (w_1, \ldots, w_K)\), should be learned in such a way that the overall CNN function \(\hat z = f(x; w)\) attains the desired objective.
In simple terms, given input-output pairs \((x_1, z_1), \ldots, (x_n, z_n)\), where \(x_i\) is an input and \(z_i\) is the corresponding desired output, and a loss \(l(z,\hat z)\) that expresses the penalty for estimating \(\hat z\) instead of \(z\), the goal is to minimize the penalty function
\[ L(w) = \frac{1}{n}\sum_{i=1}^{n} l\bigl(z_i, f(x_i; w)\bigr) \]
This function can be minimized by gradient descent: compute the gradient of the objective \(L\) at the current solution \(w_t\) and then update the solution along the direction of steepest descent of \(L\),
\[ w_{t+1} = w_t - \eta_t \frac{\partial L}{\partial w}(w_t) \]
where \(\eta_t \in \mathbb{R}_+\) is the learning rate.
Similarly, given an initial bias corresponding to the weights, the bias is updated as
\[ b_{t+1} = b_t - \eta_t \frac{\partial L}{\partial b}(b_t) \]
where \(\eta_t \in \mathbb{R}_+\) is again the learning rate.
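The weight and bias updates can be made concrete with a small sketch of gradient descent. As an assumption for illustration only, the model here is a one-parameter linear map rather than a CNN, the loss is the squared error averaged over the samples, and the learning rate is held constant; all names and data are hypothetical.

```python
def gradients(w, b, xs, zs):
    """Gradients of the mean squared loss with respect to w and b."""
    n = len(xs)
    grad_w = sum(2.0 * (w * x + b - z) * x for x, z in zip(xs, zs)) / n
    grad_b = sum(2.0 * (w * x + b - z) for x, z in zip(xs, zs)) / n
    return grad_w, grad_b


def gradient_descent(xs, zs, learning_rate=0.05, steps=2000):
    """Repeatedly step the weight and bias against their gradients."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w, grad_b = gradients(w, b, xs, zs)
        w -= learning_rate * grad_w  # weight update
        b -= learning_rate * grad_b  # bias update
    return w, b


# Hypothetical data generated from z = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
zs = [1.0, 3.0, 5.0, 7.0]

w_fit, b_fit = gradient_descent(xs, zs)
print(w_fit, b_fit)
```

In a CNN the same two updates are applied to every weight and bias, with the gradients supplied by backpropagation instead of the closed-form expressions used here.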
2.3 ECG data in sleep apnea
Heart rate and other features of the ECG vary in characteristic ways in conjunction with sleep-related breathing disorders [17]. Earlier work made use of cyclical variations of heart rate but did not find effective algorithms for quantifying sleep-related breathing disorders from heart rate alone [17, 27]. An estimate of heart rate clearly cannot produce an apnea index or a hypopnea index, since both values are obtained by evaluating airflow, respiratory effort, and oxygen saturation. Despite this limitation, it has been shown that evaluation of the ECG can provide an approximation of disturbed breathing during the night that should correspond to the results obtained by standardized apnea scoring [28]. The Apnea-ECG database was collected to support the detection of sleep-related breathing disorders from a single-channel ECG recording. According to this database, all polysomnographic recordings were scored by one expert in a diverse way. Figure 3 shows an illustration of an ECG signal of sleep apnea.
3 Proposed model
Our proposed CNN-based OSA detection model has several basic components. In this section, we explain each component in detail, together with the overall topology of the model.
3.1 Proposed OSA detection model
The general topology of our OSA detection model is illustrated in Fig. 2. CNNs have been widely employed in various applications of pattern recognition [12, 33, 34] and detection [3]. The number of layers and nodes in a network depends on the size and structure of the input data. Our proposed model is based on a feed-forward CNN, which learns from a predefined set of input-output example pairs. As shown in Fig. 5, our end-to-end model consists of the following basic parts: feature extraction, convolution layers, pooling layers, batch normalization layers, ReLU layers, fully connected layers, and a softmax layer, which together provide better detection results.
Given the features extracted from the ECG signal, the first convolution layer applies 64 filters of size 3x3x1 and outputs 64 feature maps (with stride 1, as in all convolution layers); this layer is followed by batch normalization and then 2x2 max pooling. The second convolution layer takes the output of the first as input, filters it with 64 kernels of size 3x3x64, and is likewise followed by batch normalization and 2x2 max pooling. The third convolution layer also has 64 kernels of size 3x3x64, connected to the (normalized and pooled) outputs of the second convolution layer. Finally, the last convolution layer is followed by three fully connected (FC) layers with 100, 10, and 2 neurons, respectively, with a ReLU activation function inserted between FC100 and FC10 and between FC10 and FC2. The softmax layer is applied to the final output. These basic layers are described in detail in the following subsections.
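The layer sizes above determine the parameter counts of the convolution layers and, once a padding scheme is fixed, the spatial size of each feature map. The following sketch works through this arithmetic. The padding is not stated in the text, so it is left as a parameter here, and a pooling stride of 2 is assumed; the 240x240 input size is the one given in Section 4.1, and all function names are hypothetical.

```python
def conv_output_size(size, kernel=3, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, window=2, stride=2):
    """Spatial output size of a pooling layer (stride 2 is an assumption)."""
    return (size - window) // stride + 1


def conv_parameter_count(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def fc_parameter_count(inputs, outputs):
    """Weights plus one bias per output neuron."""
    return inputs * outputs + outputs


def trace_spatial_sizes(input_size, padding):
    """Sizes after: input, conv1, pool1, conv2, pool2, conv3."""
    sizes = [input_size]
    size = input_size
    for has_pool in (True, True, False):
        size = conv_output_size(size, padding=padding)
        sizes.append(size)
        if has_pool:
            size = pool_output_size(size)
            sizes.append(size)
    return sizes


print(trace_spatial_sizes(240, padding=0))  # no padding (assumption)
print(trace_spatial_sizes(240, padding=1))  # size-preserving padding (assumption)
print(conv_parameter_count(3, 1, 64), conv_parameter_count(3, 64, 64))
print(fc_parameter_count(100, 10), fc_parameter_count(10, 2))
```

The parameter count of the first FC layer depends on the flattened size of the third convolution output and therefore on the padding choice, which is why it is not computed here.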
3.1.1 Feature extraction
It is common to extract a set of features from recorded signals and to carry out detection on these features instead of on the original signals themselves. In our feature extraction stage, the Apnea-ECG recordings used for training pass through a feature extraction process based on RR-intervals (the time interval from one R-wave to the next R-wave, as shown in Fig. 4). More specifically, an RR-interval is the time interval between two successive R peaks [2], which can be written as
\[ RR(i) = R(i+1) - R(i), \quad i = 1, 2, \ldots, n-1 \]
where \(R(i)\) denotes the time of the \(i\)-th R peak and \(n\) is the number of R peaks in the recording.
The extracted features are then used as input to the remaining layers of the model.
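As a minimal sketch of the RR-interval computation, the code below takes a list of R-peak times and returns the differences between successive peaks. It assumes that the R peaks have already been located by a separate QRS detector, which is not shown; the peak times are hypothetical, and the heart-rate conversion is included only as a familiar interpretation of the intervals.

```python
def rr_intervals(r_peak_times):
    """Time differences between successive R peaks, in the unit of the input."""
    return [later - earlier
            for earlier, later in zip(r_peak_times, r_peak_times[1:])]


def heart_rate_bpm(rr_seconds):
    """Instantaneous heart rate in beats per minute for RR-intervals in seconds."""
    return [60.0 / interval for interval in rr_seconds]


# Hypothetical R-peak times in seconds (not taken from the Apnea-ECG database).
r_peaks = [0.0, 1.0, 1.5, 2.5]

rr = rr_intervals(r_peaks)
print(rr)
print(heart_rate_bpm(rr))
```

A recording with n detected R peaks yields n - 1 RR-intervals, so the feature sequence is one element shorter than the peak list.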
3.1.2 Convolution layer
Given the extracted feature \(x_i\) from the input data, each convolution layer produces an outcome \(y_i\) computed as
\[ y_i = w * x_i + b, \quad i = 1, \ldots, N \]
where \(*\) denotes convolution, \(N\) is the number of samples, and \(w\) and \(b\) denote the weight and bias of the current layer, respectively. Parameter initialization and optimization are explained in Section 3.2. Figure 5 shows the architecture of our OSA detection model in detail.
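To illustrate what a single convolution filter computes, the sketch below slides a 3x3 kernel over a small two-dimensional input with stride 1 and adds a bias. Two assumptions are made: no padding is applied, and the operation is implemented as cross-correlation (no kernel flip), which is the usual convention in CNN libraries. The input, kernel, and bias values are hypothetical.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Stride-1 cross-correlation of a 2-D input with a 2-D kernel, no padding."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = bias
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += kernel[i][j] * image[r + i][c + j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 input and a 3x3 kernel of ones.
image = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
kernel = [[1.0, 1.0, 1.0] for _ in range(3)]

feature_map = conv2d_valid(image, kernel, bias=1.0)
print(feature_map)
```

A layer with 64 filters repeats this computation 64 times with different kernels and, for inputs with several channels, sums the per-channel results before adding the bias.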
3.1.3 Batch normalization
Assume that \(x = \{x_1, \ldots, x_d\}\) is the input to a layer with dimension \(d\). Each dimension of \(x\) is normalized by
\[ \hat x_k = \frac{x_k - E[x_k]}{\sqrt{\mathrm{Var}[x_k]}} \]
where \(E[x_k]\) is the expectation of \(x_k\) and \(\mathrm{Var}[x_k]\) is its variance, both computed over the training data. This type of normalization speeds up convergence [28] even when the features are not decorrelated.
Simply normalizing each input of a layer may change what the layer can represent. To address this problem, the transformation inserted in the network must be able to represent the identity transform. To this end, a pair of parameters \(\gamma_k\) and \(\beta_k\) is introduced for each \(x_k\) to scale and shift the normalized value as
\[ y_k = \gamma_k \hat x_k + \beta_k \]
The parameters \(\gamma_k\) and \(\beta_k\) are learned along with the other model parameters. In this formulation, the convolutional neural network benefits from data normalization, which helps the network train faster and reach higher accuracy.
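The normalization and the scale-and-shift step can be sketched for a single dimension as follows. The small constant `epsilon` added to the variance is an assumption made here for numerical stability and is not mentioned in the text; the sample values are hypothetical. The last lines show that a suitable choice of the scale and shift parameters recovers the original values, which is the identity transform discussed above.

```python
import math


def batch_normalize(values, gamma=1.0, beta=0.0, epsilon=1e-5):
    """Normalize one dimension over a batch, then scale by gamma and shift by beta."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    scale = math.sqrt(variance + epsilon)  # epsilon is an added assumption
    return [gamma * (v - mean) / scale + beta for v in values]


# Hypothetical activations of one dimension over a batch of four samples.
batch = [1.0, 2.0, 3.0, 4.0]

normalized = batch_normalize(batch)
print(normalized)

# Choosing gamma and beta to undo the normalization recovers the input.
batch_mean = sum(batch) / len(batch)
batch_var = sum((v - batch_mean) ** 2 for v in batch) / len(batch)
recovered = batch_normalize(batch,
                            gamma=math.sqrt(batch_var + 1e-5),
                            beta=batch_mean)
print(recovered)
```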
3.1.4 Pooling layer
It is common to periodically insert a pooling layer between consecutive convolution layers in a convolutional network architecture. Its function is to progressively reduce the spatial size of the representation, which reduces the number of parameters and the amount of computation in the network and hence also helps to control overfitting. The pooling layer operates independently on every depth slice of the input and resizes it spatially. There are various pooling strategies; in our case we employ max pooling with size 2x2.
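A 2x2 max pooling step on a single depth slice can be sketched as below. A stride of 2 (non-overlapping windows) is assumed, since the stride of the pooling layers is not stated; the feature map values are hypothetical.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on one depth slice (stride is an assumption)."""
    rows = len(feature_map) // 2
    cols = len(feature_map[0]) // 2
    return [
        [
            max(
                feature_map[2 * r][2 * c],
                feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c],
                feature_map[2 * r + 1][2 * c + 1],
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [7, 8, 0, 1],
    [6, 5, 3, 2],
]

pooled = max_pool_2x2(feature_map)
print(pooled)
```

Each output value keeps only the strongest response in its window, so the slice shrinks to half its width and height while the number of depth slices is unchanged.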
3.1.5 Softmax layer
When the ECG signal has been propagated through to the output layer, the final outcome is compared with the desired value, and an error value is computed for every output unit using MSE or another measure. These errors are then back-propagated to each node in the intermediate layers. Once every node in the network has received an error value that reflects its relative contribution to the overall error, the weights and biases are updated using the weight and bias update rules of Section 2.2. In our method, the final decision of the model is made by applying softmax to the final output. Between the fully connected layers, the ReLU activation function (i.e., max(0, x)) is applied, and the last FC layer is followed by a softmax classifier. The parameters of the model are estimated by the stochastic gradient descent algorithm, with the gradient computed by the backpropagation algorithm, to maximize the log-likelihood.
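The softmax decision on the two outputs of the last FC layer can be sketched as follows. The subtraction of the maximum before exponentiation is a common numerical-stability measure added here as an assumption, and the ordering of the class labels is hypothetical, since the text does not say which output corresponds to which class.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, labels=("normal", "OSA")):
    """Return the label with the highest softmax probability.

    The label order is an assumption made for illustration.
    """
    probabilities = softmax(logits)
    return labels[probabilities.index(max(probabilities))]


# Hypothetical outputs of the final two-neuron FC layer.
scores = [0.2, 1.7]
print(softmax(scores))
print(predict_label(scores))
```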
3.2 Training
To obtain the optimal parameters of the model, various cost functions (error minimization criteria), such as the mean square error [13], are used in image recognition. In our case, we adopt the mean square error, a cost function widely used for image detection.
Given an apnea ECG signal \(f\), our aim is to train a mapping \(M\) that produces an estimate \(\hat x=M(f)\) of the target signal \(y\) from \(f\). Hence, for a training dataset of \(N\) input-target signal pairs \(\{f_{i},y_{i} \}_{i = 1}^{N}\), the optimization objective is
\[ \min_{\lambda} \frac{1}{N} \sum_{i=1}^{N} \left\| M(f_i; \lambda) - y_i \right\|^2 \]
where \(\lambda\) denotes the network parameters to be learned and \(M(f_i;\lambda)\) is the estimated signal corresponding to \(f_i\). Several activation functions, such as ReLU and its variants PReLU and RReLU, as well as ELU, have been reported to give relatively good performance. For our application, we employ the ReLU activation function, and the network parameters are optimized by the Adam algorithm [21].
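As a minimal sketch of the mean square error cost, the code below averages the squared Euclidean distance between each estimated output and its target. Averaging over the number of training pairs is an assumption about the normalization, and the two-element targets are a hypothetical one-hot encoding of the two classes.

```python
def mean_squared_error(estimates, targets):
    """Average squared Euclidean distance between estimates and targets."""
    if len(estimates) != len(targets):
        raise ValueError("estimates and targets must have the same length")
    total = 0.0
    for estimate, target in zip(estimates, targets):
        total += sum((e - t) ** 2 for e, t in zip(estimate, target))
    return total / len(estimates)


# Hypothetical network outputs and one-hot targets for two samples.
estimates = [[1.0, 0.0], [0.0, 1.0]]
targets = [[1.0, 0.0], [1.0, 0.0]]

print(mean_squared_error(estimates, targets))
```

The first sample is estimated exactly and contributes nothing, while the second is assigned to the wrong class and contributes a squared distance of 2, giving an average of 1.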
4 Experimental results
In this section, we describe the dataset used for training and testing, how the model parameters are set and learned, and finally the performance of our model in detail.
4.1 Training and testing data
To train and test the model, we used data from the Apnea-ECG database, which is freely available at (https://physionet.org/physiobank/database/#ecg/). The database [28] was assembled for the PhysioNet/Computers in Cardiology Challenge 2000. It consists of 70 ECG recordings from individuals, each typically 8 hours long, of which only 35 are annotated.
For model training, we divide the 35 annotated ECG recordings into two sets. The first is a training set of 20 ECG recordings from individuals, covering both normal and non-normal cases. The second is a testing set of 10 ECG recordings from individuals, which also covers normal and non-normal cases and is disjoint from the training set.
After feature extraction is carried out for each recording as described in Section 3.1.1, all ECG recordings are arranged into matrices of 240x240 values for training and testing. Because deep learning models benefit from large training datasets, we perform data augmentation as in [38]; the augmented data are then the input to the subsequent layers of the model for the overall training process.
4.2 Training parameters
To capture sufficient spatial information from the ECG recordings, we initialize the weights and biases by the method in [16] and use the Adam algorithm [21] with \(\alpha = 0.01\), \(\beta_1 = 0.9\), \(\beta_2 = 0.999\), and \(\epsilon = 10^{-8}\). The batch size is set to 64. We train the model for 50 epochs, with the learning rate decayed exponentially from 0.01 to 0.0001 over the 50 epochs. We use the MatConvNet package [35] to train the proposed network. All experiments are carried out in Matlab (R2015b) on a computer with an Intel(R) Xeon(R) CPU E3-1230v3 at 3.30GHz and an NVIDIA Tesla K40c GPU; training takes four hours.
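The training settings above can be illustrated with a short sketch of the learning rate schedule and of a single Adam update for one scalar parameter. The schedule assumes that "decayed exponentially" means geometric interpolation between the start and end values, one value per epoch; the experiments themselves were run in MatConvNet, so this Python code is only an illustration, and the parameter and gradient values are hypothetical.

```python
import math


def exponential_schedule(start=0.01, end=0.0001, epochs=50):
    """One learning rate per epoch, spaced geometrically from start to end.

    Geometric spacing is an assumption about the decay described in the text.
    """
    ratio = end / start
    return [start * ratio ** (epoch / (epochs - 1)) for epoch in range(epochs)]


def adam_step(param, grad, m, v, t, lr, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step index."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v


rates = exponential_schedule()
print(rates[0], rates[-1])

# Hypothetical first update of a parameter equal to 1.0 with gradient 2.0.
new_param, m1, v1 = adam_step(1.0, 2.0, m=0.0, v=0.0, t=1, lr=rates[0])
print(new_param)
```

On the first step the bias-corrected moments make the update size approximately equal to the learning rate, regardless of the gradient magnitude.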
4.3 Performance evaluation
We evaluated the effectiveness of our model on different records of ECG signals. As detection measures, we compute three commonly used performance measures [22], i.e., sensitivity (se), specificity (sp), and accuracy (ac), which are defined as
\[ se = \frac{OSA_s}{(OSA)_t}, \qquad sp = \frac{NOR_s}{(NOR)_t}, \qquad ac = \frac{OSA_s + NOR_s}{(OSA)_t + (NOR)_t} \]
where \(OSA_s\) is the number of correctly detected OSA signals, \(NOR_s\) is the number of correctly detected normal (NOR) signals, \((OSA)_t\) is the total number of OSA signals tested, and \((NOR)_t\) is the total number of NOR signals tested [12]. Our experimental results show that the CNN-based model detects OSA effectively and provides up to 97.80% accuracy at the 50th training epoch. As shown in Table 1, the detection accuracy increases as the number of training epochs is increased from 20 to 50 in increments of 10. However, the model does not provide better accuracy beyond 50 training epochs. Table 1 shows the performance of our model on the ECG dataset at different training epochs.
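The three performance measures can be computed from the four counts defined above, as in the following sketch. The measures are returned as fractions rather than percentages, and the counts in the example are hypothetical and are not results from this paper.

```python
def detection_metrics(osa_detected, osa_total, nor_detected, nor_total):
    """Return (sensitivity, specificity, accuracy) as fractions in [0, 1]."""
    if osa_total <= 0 or nor_total <= 0:
        raise ValueError("both classes need at least one tested signal")
    sensitivity = osa_detected / osa_total
    specificity = nor_detected / nor_total
    accuracy = (osa_detected + nor_detected) / (osa_total + nor_total)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 45 of 50 OSA signals and 40 of 50 normal signals detected.
se, sp, ac = detection_metrics(45, 50, 40, 50)
print(f"se={se:.2%} sp={sp:.2%} ac={ac:.2%}")
```

Accuracy weights the two classes by how many signals of each were tested, so with unbalanced test sets it can differ noticeably from the average of sensitivity and specificity.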
5 Conclusion and future work
An effective and efficient obstructive sleep apnea detection model is proposed in this paper. The model builds on the recent success of convolutional neural networks in image and audio recognition problems. We proposed an efficient deep-learning-based architecture with ten layers. The Apnea-ECG recordings are used for training and testing. Our experimental results show that OSA detection based on a convolutional neural network is more suitable than detection based on traditional neural networks. The accuracy, sensitivity, and specificity of the model also demonstrate its effectiveness.
In future work, we plan to investigate the OSA detection problem with recurrent neural networks, taking accuracy, sensitivity, and specificity into account.
References
[1] Acharya UR, Fujita H, Lih OS, Hagiwara Y, Tan JH, Adam M (2017) Automated detection of arrhythmias using different intervals of tachycardia ECG segments with convolutional neural network. Inf Sci 405:81–90
[2] Almazaydeh L, Elleithy K, Faezipour M (2012) Obstructive sleep apnea detection using SVM-based classification of ECG signal features, pp 4938–4941
[3] Al-Abed M, Manry M, Burk JR, Lucas EA, Behbehani K (2007) A method to detect obstructive sleep apnea using neural network classification of time-frequency plots of the heart rate variability. In: IEEE Engineering in Medicine and Biology Society, pp 6101–6104
[4] Chen B-W, Ji W (2017) Geo-conquesting based on graph analysis for crowd sourced meta trails from mobile sensing. IEEE Commun Mag 55(1):92–97
[5] Chen B-W, He X, Kung S-Y (2015) Support vector analysis of large-scale data based on kernels with iteratively increasing order. J Supercomput 72(9):3297–3311
[6] Chen B-W, Imran M, Guizani M (2016) Cognitive sensors based on ridge phase-smoothing localization and multi regional histograms of oriented gradients. IEEE Transactions on Emerging Topics in Computing
[7] Chen B-W, Yang L, Gu Y (2018) Privacy-preserved big data analysis based on asymmetric imputation kernels. Futur Gener Comput Syst 78(2):859–866
[8] Cheng M, Sori WJ, Jiang F, Khan A, Liu S (2017) Recurrent neural network based classification of ECG signal features for obstruction of sleep apnea detection. In: IEEE International Conference on Computational Science and Engineering and IEEE International Conference on Embedded and Ubiquitous Computing, pp 199–202
[9] De Chazal P, Penzel T, Heneghan C (2004) Automated detection of obstructive sleep apnoea at different time scales using the electrocardiogram. Physiol Meas 25(4):967–983
[10] Deepu C, Zhang X, Heng C, Lian Y (2016) An ECG-on-chip with QRS detection lossless compression for low power wireless sensors. Sensors 2:3
[11] Dempsey JA, Veasey SC, Morgan BJ, Donnell CPO (2008) Pathophysiology of sleep apnea. Physiol Rev 90(1):47–112
[12] Feng J, Bo-Wei C, Kun L, Debin Z (2014) Big data driven decision making and multi-prior models collaboration for media restoration. Multimedia Tools and Applications, pp 1–15
[13] Feng J, Shen W, Yang G, Debin Z (2014) Viewpoint-independent hand gesture recognition with Kinect. Signal Image Video Process 8(1):163–172
[14] Feng J, Shengping Z, Shen W, Yang G, Debin Z (2015) Multi-layered hand gesture recognition with Kinect. J Mach Learn Res 16:2
[15] Feng J, Seungmin R, Bo-Wei C, Xiaodan D, Debin Z (2015) Face hallucination and recognition in social network services. J Supercomput 71(6):2035–2049
[16] Feng J, Yang G, Shaohui L, Debin Z (2015) Discriminating features learning in gesture classification. IET Comput Vis 9(5):673–680
[17] Guilleminault C, Connolly SJ, Winkle R, Melvin K, Tilkian A (1984) Cyclical variation of the heart rate in sleep apnea syndrome, mechanisms and usefulness of 24 h electrocardiography as a screening technique. Lancet, pp 126–13
[18] Gyselinckx B, Van Hoof C, Ryckaert J, Yazicioglu RF, Fiorini P, Leonov V (2005) Human++: autonomous wireless sensors for body area networks. In: Proceedings of the IEEE 2005 Custom Integrated Circuits Conference. IEEE, pp 13–19
[19] Hinton G et al (2012) Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Process Mag 29(6):82–97
[20] Hossen A (2013) A neural network technique for feature selection and identification of obstructive sleep apnea. In: 6th International Conference on Biomedical Engineering and Informatics (BMEI). IEEE, pp 182–186
[21] Kingma DP, Ba JL (2015) Adam: a method for stochastic optimization. Int Conf Learn Represent
[22] Kriboy M, Tarasiuk A, Zigel Y (2013) Obstructive sleep apnea detection using speech signals. In: Proceedings of the annual conference of the Afeka-AVIOS in Speech Processing, pp 1–5
[23] Morgan BJ, Donnell CPO (2008) Pathophysiology of sleep apnea. Physiol Rev 90(1):47–112
[24] Mukhopadhyay SC (2015) Wearable sensors for human activity monitoring: a review. IEEE Sens J 15(3):1321–1330
[25] Olson JE, Wendy RW, Morgenthaler IT, Timothy IG, Peter C, Staats AB (2003) Obstructive sleep apnea-hypopnea syndrome. Mayo Clin Proc 78(12):1545–1552
[26] Otto C, Milenkovic A, Sanders C, Jovanov E (2006) System architecture of a wireless body area sensor network for ubiquitous health monitoring. J Mobile Multimed 1(4):307–326
[27] Penzel T, Amend G, Meinzer K, Peter JH (1990) A heart rate and snoring recorder for detection of obstructive sleep apnea. Sleep 13(2):175–182
[28] Penzel T, Moody GB, Mark RG, Goldberger AL, Peter JH (2000) The Apnea-ECG database, pp 255–258
[29] Pham LV, Schwartz AR (2015) The pathogenesis of obstructive sleep apnea 7(8):2072–1439
[30] Phan D, Siong LY, Pathirana PN, Seneviratne A (2015) Smartwatch: performance evaluation for long-term heart rate monitoring. In: 2015 IEEE International Symposium on Bio-electronics and Bio-informatics (ISBB), pp 144–147
[31] Saini I, Singh D, Khosla A (2013) QRS detection using k-nearest neighbor algorithm (KNN) and evaluation on standard ECG databases. J Adv Res 4(4):331–344
[32] Shen B, Zhou X, Kim M (2016) Mixed scheduling with heterogeneous delay constraints in cyber-physical systems. Futur Gener Comput Syst 61:108–117
[33] Shen B, Zhou X, Wang R (2017) A delay-aware schedule method for distributed information fusion with elastic and inelastic traffic. Inf Fusion 36:68–79
[34] Shen B, Chilamkurti N, Wang R, Zhou X, Wang S (2017) Deadline-aware rate allocation for IoT services in data center network. Journal of Parallel and Distributed Computing
[35] Vedaldi A, Lenc K (2014) MatConvNet - Convolutional neural networks for MATLAB. arXiv
[36] Worku J, Feng J, Seungmin R, Maowei C, Shaohui L (2017) Medical image denoising using convolutional neural network: a residual learning approach. J Supercomput. https://doi.org/10.1007/s11227-017-2080-0
[37] Yadollahi A, Zhara M (2009) Acoustic obstructive sleep apnea detection. In: Annual International Conference of the IEEE in Engineering in Medicine and Biology Society, pp 7110–7113
[38] Zhan S, Yang X, Hu C, Liang Z, Xie D (2016) Super-resolution of medical image using representation learning. Int Conf Wirel Commun Signal Process
Acknowledgements
This work is partially funded by the MOE–Microsoft Key Laboratory of Natural Language Processing and Speech, and the National Natural Science Foundation of China under Grant No. 61572155, 61672188 and 61272386. We would also like to acknowledge NVIDIA Corporation, which kindly provided two sets of GPUs. We would like to thank the editors and the anonymous reviewers, whose important comments and suggestions greatly improved the manuscript.
Cite this article
Wang, X., Cheng, M., Wang, Y. et al. Obstructive sleep apnea detection using ecg-sensor with convolutional neural networks. Multimed Tools Appl 79, 15813–15827 (2020). https://doi.org/10.1007/s11042-018-6161-8
DOI: https://doi.org/10.1007/s11042-018-6161-8