1 Introduction

Cardiovascular diseases (CVDs) are the leading cause of death worldwide. In 2012, an estimated 17.5 million CVD-related deaths were reported, representing 31% of all global deaths [34]. CVDs are also a leading cause of disability, particularly among myocardial infarction (MI) and stroke survivors. The burden of CVDs is commonly measured in disability-adjusted life years (DALYs), where one DALY is equivalent to the loss of one year of healthy life. The CVD burden is considerably greater for developing countries (DALY > 5100 per 100 000) than for developed countries (DALY < 3000 per 100 000) [48].

Coronary artery disease (CAD) is caused by coronary arteriosclerosis, an inflammatory condition of the arterial wall. Atherosclerotic plaques build up on the walls of the coronary arteries [12, 13, 47], and this progressive buildup eventually obstructs and reduces the blood flow to a region of the myocardium [14]. The resulting oxygen deprivation of the myocardium can lead to a fatal myocardial infarction (MI) [46].

MI is the irreversible death of heart tissue; the infarct can expand in size and damage left ventricular (LV) function, causing LV dysfunction. Improper management and treatment of LV dysfunction, together with other cardiac abnormalities, lead to a catastrophic stage called congestive heart failure (CHF) [14, 19].

CHF is characterized by impaired ventricles and by changes to neurohormonal regulation. It is a terminal stage of CVD in which the heart fails to circulate sufficient blood throughout the body. This complex clinical syndrome causes hypoxia, congestion, and even death [11, 15].

The electrocardiogram (ECG) is a commonly preferred clinical diagnostic tool for most cardiac conditions, including CAD, MI, and CHF. The ECG is comparatively inexpensive and noninvasive, and it contains vital information about the functioning of the heart. The morphology of the ECG signal is diagnostically important during an episode of CAD, MI, or CHF. During CAD and MI, the T waves are abnormally tall, the QT intervals are longer, and the ST segment is elevated or depressed [5]. However, visually examining these morphological changes in voluminous ECG recordings is tedious and highly prone to error. Hence, an automated computer-aided technique is necessary to overcome the drawbacks of manual ECG analysis.

Researchers have developed several methodologies for the automated detection of CAD, MI, and CHF using ECG or heart rate (HR) signals. Analysis of ST segment variation using methods such as Radial Basis Function (RBF) neural networks [25], principal component analysis (PCA) [7], Binary Particle Swarm Optimization (BPSO), genetic algorithms [8], and the Discrete Wavelet Transform (DWT) [23] has shown good results in separating CAD-affected ECG signals from normal ones.

For the classification of normal and MI-affected ECG signals, researchers have employed DWT [21], linear [6, 26, 42], and nonlinear [1, 20, 43, 45] methods to evaluate signal characteristics such as the QRS complex [7], ST segment [6, 44], and T wave amplitude [6]. Evaluation of ECG signal variation using detrended fluctuation analysis (DFA) [22], entropies [22], and the autoregressive Burg method [26, 28] has also been shown to be useful for differentiating normal and CHF classes.

In the study of normal and CAD ECG signals, Acharya et al. [4] proposed the application of Higher Order Spectra (HOS). In addition, they developed a Coronary Artery Disease Index (CADI) that could automatically characterize normal and CAD ECG signals using a single number.

Furthermore, Acharya et al. [2] studied the classification of normal, CAD, and MI ECG signals. They compared three techniques, namely the Discrete Wavelet Transform (DWT), empirical mode decomposition (EMD), and the discrete cosine transform (DCT), for differentiating among the three classes. The DCT technique was reported to yield the highest accuracy, 98.50%, with only seven features.

It is evident from the literature that the majority of the proposed studies address the automated classification of two classes (normal and one of CAD, MI, or CHF). To date, no study has published an algorithm for the characterization of four classes (normal, CAD, MI, and CHF) using ECG beats. This work therefore proposes a novel algorithm for the automated classification of normal, CAD, MI, and CHF using ECG beats. A block diagram of the proposed method is shown in Fig. 1.

Fig. 1. Block diagram of the proposed methodology.

2 Methodology

2.1 Materials

In this study, the ECG signals of normal, CAD, MI, and CHF subjects are acquired from several PhysioNet databases, namely the PTB Diagnostic ECG Database (for normal and MI), the St.-Petersburg Institute of Cardiology Technics 12-lead Arrhythmia Database (for CAD), and the BIDMC Congestive Heart Failure Database (for CHF) [18].

2.2 Pre-processing

The ECG signals from the BIDMC Congestive Heart Failure and St.-Petersburg Institute of Cardiology Technics 12-lead Arrhythmia databases are sampled at 250 Hz and 257 Hz, respectively, whereas the PTB Diagnostic ECG database is sampled at 1000 Hz. To maintain uniformity across the databases, a common sampling frequency of 1000 Hz is selected for all signals. In addition, baseline wander and noise are removed from the ECG signals using the Daubechies 6 (db6) wavelet [33].
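The step of bringing the 250 Hz and 257 Hz recordings to 1000 Hz can be sketched as follows. The text does not say which resampling method is used, so this minimal example assumes plain linear interpolation; the function name `resample_linear` and its parameters are illustrative, and the db6 wavelet denoising step is not shown.

```python
def resample_linear(signal, fs_in, fs_out=1000.0):
    """Resample `signal` from fs_in to fs_out (Hz) by linear interpolation.

    Assumption: linear interpolation stands in for whatever resampling
    method the original pipeline uses, which the text does not specify.
    """
    if len(signal) < 2:
        return list(signal)
    # Number of output samples covering the same time span as the input.
    n_out = int((len(signal) - 1) * fs_out / fs_in + 1e-9) + 1
    out = []
    for i in range(n_out):
        pos = i * fs_in / fs_out              # fractional index into the input
        lo = min(int(pos), len(signal) - 2)   # left neighbour, clamped at the end
        frac = pos - lo
        out.append(signal[lo] * (1.0 - frac) + signal[lo + 1] * frac)
    return out


# Usage: a short 250 Hz ramp becomes a 1000 Hz ramp with four times the density.
ramp_1000 = resample_linear([0.0, 4.0, 8.0, 12.0, 16.0], fs_in=250.0)
```

A production pipeline would more likely use a polyphase or band-limited resampler to avoid interpolation artifacts; linear interpolation is chosen here only to keep the sketch dependency-free.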

2.3 Beat Segmentation

The pre-processed ECG signals are segmented into ECG beats by first detecting the R peaks using the Pan-Tompkins algorithm [33, 37]. The R peak is chosen as the reference point for segmentation because of its visibly tall amplitude. Each ECG beat is segmented by taking 250 samples before and 400 samples after the R peak.
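The windowing around each R peak can be illustrated with a short sketch. It assumes the R-peak indices have already been detected (the Pan-Tompkins detector itself is not reproduced here), that the R-peak sample is included in the window (giving 651 samples per beat), and that peaks too close to either end of the record are skipped. These are assumptions, as is the helper name `segment_beats`.

```python
def segment_beats(signal, r_peaks, pre=250, post=400):
    """Cut one fixed-length window per R peak.

    Each window holds `pre` samples before the peak, the peak itself,
    and `post` samples after it (an assumed convention: 651 samples).
    Peaks whose window would run past the record boundaries are skipped.
    """
    beats = []
    for r in r_peaks:
        start, stop = r - pre, r + post + 1
        if start < 0 or stop > len(signal):
            continue  # incomplete beat at the edge of the recording
        beats.append(signal[start:stop])
    return beats


# Usage with a synthetic record and hypothetical R-peak positions.
record = list(range(2000))
beats = segment_beats(record, r_peaks=[100, 500, 1200, 1900])
```

With these inputs only the peaks at 500 and 1200 produce complete beats; the first and last peaks are dropped because their windows extend beyond the record.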

2.4 Wavelet Packet Decomposition (WPD)

The wavelet packet decomposition (WPD) is a wavelet transform technique that uses wavelets to transform the ECG signals. Wavelets are translated and scaled versions of a basic mother wavelet. Mother wavelets are localized in the time-frequency domain and have fluctuating amplitudes within a finite time [30]. Whereas the DWT further decomposes only the low-frequency components (approximations) at each level, WPD decomposes both the low-frequency components (approximations) and the high-frequency components (details) at every level [30]. Hence, WPD provides more information than DWT.
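To make the difference from DWT concrete, the sketch below builds a full wavelet packet tree in which both the approximation and the detail branch are split again at every level. The Haar wavelet is used purely for simplicity; the mother wavelet used for WPD is not stated in the text, and the function names are illustrative. Counting the nodes of all four levels gives 2 + 4 + 8 + 16 = 30, which matches the 30 WPD coefficients reported in the Results if "coefficients" is read as sub-band nodes (an interpretation, not something the text confirms).

```python
import math


def haar_split(x):
    """One Haar analysis step: return (approximation, detail).

    Samples are paired off; a trailing unpaired sample is dropped.
    """
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_nodes(x, levels=4):
    """Return {path: coefficients} for every node at levels 1..levels.

    Paths are strings of 'a' (low-pass) and 'd' (high-pass); for example
    'ad' is the detail branch of the level-1 approximation.
    """
    nodes = {}
    current = {"": list(x)}
    for _ in range(levels):
        nxt = {}
        for path, coeffs in current.items():
            a, d = haar_split(coeffs)  # unlike DWT, detail nodes are split too
            nxt[path + "a"] = a
            nxt[path + "d"] = d
        nodes.update(nxt)
        current = nxt
    return nodes


# Usage: a four-level tree on a toy signal of 64 samples.
tree = wavelet_packet_nodes([float((i * 7) % 11) for i in range(64)], levels=4)
node_count = len(tree)  # 2 + 4 + 8 + 16 nodes
```

A DWT with the same depth would keep only the paths 'd', 'ad', 'aad', 'aaad', and 'aaaa', which is why WPD retains more sub-band information.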

2.5 Feature Extraction

In this study, twelve nonlinear features are obtained from thirty WPD coefficients. The twelve features are approximate entropy [38], sample entropy [40], fuzzy entropy [24], Kolmogorov-Sinai entropy [17], Renyi entropy [39], Tsallis entropy [10], fractal dimension [27], wavelet entropy [41], signal energy [35], permutation entropy [9], recurrence quantification analysis [16], and bispectrum [36].
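As an illustration of two of the listed features, the sketch below computes the Renyi and Tsallis entropies of a coefficient sequence from a histogram estimate of its amplitude distribution. The entropy order (2), the number of bins, and the histogram estimator are assumptions made for this example; the text does not give the settings actually used.

```python
import math


def histogram_probs(values, bins=8):
    """Estimate a discrete distribution by equal-width binning (assumed estimator)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0]  # constant signal: all mass in one bin
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(values)
    return [c / n for c in counts if c > 0]  # drop empty bins


def renyi_entropy(probs, alpha=2.0):
    """Renyi entropy H_alpha = ln(sum p_i^alpha) / (1 - alpha), for alpha != 1."""
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def tsallis_entropy(probs, q=2.0):
    """Tsallis entropy S_q = (1 - sum p_i^q) / (q - 1), for q != 1."""
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


# Usage on a toy coefficient sequence.
p = histogram_probs([0.0, 1.0, 2.0, 3.0], bins=4)
features = (renyi_entropy(p), tsallis_entropy(p))
```

For a uniform distribution over n bins the Renyi entropy equals ln(n) for any order, which gives a quick sanity check on the implementation.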

2.6 Feature Selection – Sequential Forward Feature Selection (SFS)

Feature selection is a technique used to select a subset of significant features that yields minimal classification error [29]. By eliminating redundant and insignificant features, this step can significantly enhance the performance of the classifier. In this study, sequential forward feature selection (SFS) is used for the feature selection process.
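The greedy search behind SFS can be sketched in a few lines. The example below is generic: `score` is any function that rates a candidate subset (in practice this might be a cross-validated classifier accuracy, which is an assumption here, since the criterion is described only as minimal classification error). The names `sequential_forward_selection` and `toy_score` are illustrative.

```python
def sequential_forward_selection(n_features, score, n_select):
    """Greedy SFS: start from the empty set and repeatedly add the single
    feature whose inclusion gives the highest score for the enlarged subset."""
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


def toy_score(subset):
    """Hypothetical criterion: features 2, 0, and 3 are useful, the rest are not."""
    useful = {2: 0.5, 0: 0.3, 3: 0.1}
    return sum(useful.get(f, 0.0) for f in subset) - 0.01 * len(subset)


# Usage: pick three of five candidate features.
chosen = sequential_forward_selection(5, toy_score, n_select=3)
```

Because the search never removes a feature once it is added, SFS evaluates far fewer subsets than an exhaustive search, at the cost of possibly missing the globally best subset.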

2.7 Feature Ranking

In this study, the ReliefF ranking technique is implemented. ReliefF estimates the significance of features by randomly sampling an instance and then considering the weighted feature values of its nearest instances from each class [32].
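A simplified ReliefF-style weight estimate is sketched below to show the idea: for each sampled instance, a feature is penalised when it differs among the nearest same-class neighbours and rewarded when it differs among the nearest neighbours from other classes, with the other classes weighted by their prior probabilities. The neighbour count, the Manhattan distance on range-normalised features, and the function name `relieff` are assumptions of this sketch rather than the settings used in the study.

```python
import random


def relieff(X, y, n_neighbors=3, n_samples=None, seed=0):
    """Return one relevance weight per feature (higher = more relevant)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Feature ranges, used to scale differences into [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(d))

    priors = {c: list(y).count(c) / n for c in set(y)}
    m = n_samples or n
    weights = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)  # randomly sampled instance
        by_class = {}
        for j in range(n):
            if j != i:
                by_class.setdefault(y[j], []).append(j)
        for c, idxs in by_class.items():
            nearest = sorted(idxs, key=lambda j: dist(X[i], X[j]))[:n_neighbors]
            # Hits lower the weight; misses raise it, scaled by class prior.
            scale = -1.0 if c == y[i] else priors[c] / (1.0 - priors[y[i]])
            for f in range(d):
                mean_diff = sum(diff(f, X[i], X[j]) for j in nearest) / len(nearest)
                weights[f] += scale * mean_diff / m
    return weights


# Usage: feature 0 separates the two classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
w = relieff(X, y)
```

Sorting the features by descending weight gives the ranking; in this toy data the discriminative feature 0 receives a larger weight than the noisy feature 1.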

2.8 Classification – K-Nearest Neighbor (KNN)

KNN is an instance-based classification technique in which an unknown sample is classified according to a similarity or distance criterion [19]. In this work, the number of nearest neighbors, k, is varied from 5 to 10, and the maximum accuracy is obtained with k = 5.
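A minimal sketch of the KNN decision rule follows. It assumes Euclidean distance and a simple majority vote among the k nearest training samples; the text does not state which distance or tie-breaking rule is used, and the function name `knn_predict` and the toy data are illustrative.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=5):
    """Label `query` by majority vote among its k nearest training samples.

    Assumptions: Euclidean distance; ties in the vote go to the class whose
    member is encountered first when scanning neighbours nearest-first.
    """
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Usage with one hypothetical feature per beat and two classes.
train_X = [[0.0], [0.1], [0.2], [0.3], [1.0], [1.1], [1.2], [1.3]]
train_y = ["normal"] * 4 + ["MI"] * 4
label = knn_predict(train_X, train_y, query=[0.15], k=5)
```

With k = 5, four of the five nearest neighbours of the query are labelled "normal", so the query is assigned to that class.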

3 Results

In total, 181,510 ECG beats from four classes are segmented from three separate ECG databases covering 222 subjects. For each ECG beat, a four-level WPD is performed, resulting in 30 coefficients. Twelve types of nonlinear features are computed from the 30 coefficients, yielding 1050 features per beat. Of these 1050 features, 10 are selected by SFS. The selected features have p-values below 0.0001, which indicates that they are statistically significant.

The confusion matrix of the four classes is shown in Table 1.

Table 1. Confusion matrix of the 4 classes.

4 Discussion

In this study, a novel technique for the automated characterization of various CVDs is proposed using WPD and nonlinear analysis of ECG beats. A total of 181,510 ECG beats from normal, CAD, MI, and CHF conditions are individually decomposed into 30 WPD coefficients. Twelve types of nonlinear features are then extracted from the coefficients. Overall, the proposed methodology achieved its best classification results of 97.98% accuracy, 99.61% sensitivity, and 94.84% specificity with 8 ReliefF-ranked features using the KNN classifier.
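For readers who want to see how accuracy, sensitivity, and specificity relate to a multi-class confusion matrix, a small sketch follows. It computes one-vs-rest sensitivity and specificity for each class; how the study aggregates these into single figures for four classes is not stated, so this convention is an assumption, and the counts in the usage example are made up for illustration only.

```python
def confusion_metrics(matrix):
    """Compute overall accuracy and per-class one-vs-rest metrics.

    matrix[i][j] = number of beats of true class i predicted as class j.
    Assumes every class has at least one true and one non-member sample.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(n)) / total
    per_class = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c beats missed
        fp = sum(matrix[r][c] for r in range(n)) - tp   # other beats called c
        tn = total - tp - fn - fp
        per_class.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        })
    return accuracy, per_class


# Usage with hypothetical counts (not the study's results).
toy_matrix = [
    [9, 1, 0, 0],
    [0, 10, 0, 0],
    [0, 0, 8, 2],
    [0, 0, 0, 10],
]
accuracy, per_class = confusion_metrics(toy_matrix)
```

In this toy matrix 37 of 40 beats lie on the diagonal, so the overall accuracy is 0.925.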

The various nonlinear analysis techniques are used to measure the degree of complexity in healthy and CVD ECG signals. This complexity relates to the presence of inherent patterns in the dynamics of the nonlinear ECG signals [44]. Indeed, nonlinear techniques are highly sensitive to subtle, sudden changes in the ECG signals [3].

The main novelty of this work is the integration of WPD and nonlinear techniques into a computer-aided diagnosis system, which enhances the efficiency of the decision-making and diagnosis process. Thus, clinicians can expeditiously prescribe the relevant treatments to prevent the conditions from deteriorating further.

5 Conclusion

In this study, a novel technique is proposed for the identification and diagnosis of CAD, MI, and CHF using nonlinear features obtained from segmented ECG beats. The proposed automated diagnostic support system can assist clinical staff in detecting and diagnosing cardiac abnormalities reliably and efficiently, thereby reducing the workload and the likelihood of manual errors when interpreting vital information during ECG assessment. Integrating the proposed methodology with an ECG system is cost-effective and yields instantaneous results compared with other conventional cardiac diagnostic modalities. Hence, the proposed cardiac diagnostic support system offers an inexpensive alternative for cardiac screening, especially in developing countries, where the majority of CVD deaths occur. In future work, the authors aim to explore other types of nonlinear feature extraction techniques and a larger database that may yield better accuracy with fewer features. The work can subsequently be extended to the various stages of CAD, MI, and CHF. This would help in identifying early indicators of cardiac disease from ECG signals, so that the necessary clinical medication and treatment can promptly prevent the conditions from deteriorating further.