1 Introduction

Hypertension (HPT) is a critical health issue. It can severely affect human health, and its prevalence is increasing globally, yet the rates of HPT awareness, treatment, and control improve only slowly [11]. The World Health Organization (WHO) has therefore become more vigilant about the understanding, diagnosis, and treatment of HPT [11]. HPT is diagnosed when the systolic and diastolic blood pressure exceed 140/80 mmHg in more than three clinical measurements.

Hypertension is a serious condition that can indicate many severe diseases, such as stroke (STR), syncope (SYN), myocardial infarction (MI), and heart disease [6]. Smoking, excess weight, lack of exercise, a high-salt diet, stress, age, family history, kidney disease, and thyroid disease are among the factors that contribute to hypertension [24].

An electrocardiogram (ECG) records the electrical activity of the heart. The ECG signal is therefore well suited to detecting HPT and the diseases associated with it [10]. Simjanoska et al. [22] describe the relationship between the ECG and blood pressure, and how the ECG changes as blood pressure changes.

The primary motivation of this research is to detect hypertension-associated diseases from the ECG signal, because early detection of hypertension can save many lives and improve quality of life.

Various devices, methods, and algorithms have been developed to detect hypertension. The existing work on HPT detection in the literature is summarized below.

Rajput et al. [8] discriminated the severity of hypertension from ECG signals using a hypertension diagnosis index (HDI). The HDI model separated low-risk and high-risk hypertension ECG signals with a reported classification accuracy of 100%.

In another study [10], the authors classified low-risk hypertension, high-risk hypertension, and normal ECG signals using signal processing and machine learning methods, and obtained a classification accuracy of 99.95% with the ensemble bagged tree classifier.

In a subsequent study [18], a classification accuracy of 98.05% was obtained without wavelet-based methods. The classification covered hypertension severity levels and normal ECG signals.

In another study [9], hypertension and normal ballistocardiogram signals were classified using empirical mode decomposition and wavelet transform methods, with a highest classification accuracy of 87%.

Quachtran et al. [7] extracted intracranial pressure (ICP) from the ECG signal and developed a deep learning model for the detection of intracranial hypertension. The model achieved an accuracy of \(92.0\pm 2.25\%\).

Sau et al. [12] studied depression and anxiety among seafarers using machine learning. The precision and accuracy of their model are 82.6% and 84%, respectively.

Melillo et al. [5] designed an algorithm for the automated detection of high-risk hypertension from heart rate variability (HRV) signals using machine learning methods. The sensitivity and specificity of the HRV-based model are 71.4% and 87.8%, respectively.

Ni et al. [6] employed an HRV-based multi-scale fine-grained model to detect the severity of hypertension, reporting 95% accuracy with a machine learning algorithm.

Simjanoska et al. [22] estimated systolic blood pressure (SBP) and diastolic blood pressure (DBP) from the ECG signal using machine learning methods, with mean absolute errors of 9.45 mmHg and 8.13 mmHg, respectively.

Song et al. [23] distinguished hypertension and heart disease from the HRV signal; their Naive Bayes classifier gave a classification accuracy of 92.3%.

It is apparent from the literature that the diseases associated with hypertension (STR, SYN, and MI), together with low-risk hypertension (LHT), have not yet been studied. Hypertension is attracting researchers globally, and a large amount of recorded data is available in hospitals and online databases for diagnosing and predicting hypertension and its associated diseases. Accordingly, we have developed a hypertension-associated disease detection system based on signal processing and machine learning methods. In the proposed work, we use an orthogonal wavelet filter bank (OGWFB) to discriminate STR, SYN, MI, and LHT ECG signals. The OGWFB produces six sub-bands (SBs) for each ECG signal through five-level wavelet decomposition. The LOGE and SLFD features are then calculated for all SBs. With these features, the k-nearest neighbor (KNN) classifier gives the highest classification accuracy, 98.4%.

Section 2 provides the details of the dataset. The methodology is explained in Sect. 3. The performance of the developed model is discussed in Sect. 4. Finally, the concluding remarks on the proposed algorithm are given in Sect. 5.

2 Dataset

The dataset for this research was obtained from PhysioNet's online SHAREE database. The study was approved by the Ethics Committee of the Federico II University Hospital Trust. A total of 139 hypertensive recordings were used, from 49 female and 90 male patients with an average age of 55 years. Each ECG recording is 2 h 10 min 12 s long, contains leads III, V3, and V5, and has approximately one million samples per signal. Leads III, V3, and V5 are denoted CH1 (channel 1), CH2 (channel 2), and CH3 (channel 3), respectively. The sampling frequency, bit resolution, and sampling interval are 128 Hz, 8 bits, and 0.0078125 s, respectively. Of the 139 subjects, three have SYN, three have STR, and 11 have MI, while 122 are low-risk hypertension (LHT) subjects. We segmented each ECG recording into 5-minute signals, which yields 3172 LHT, 78 STR, 78 SYN, and 286 MI ECG signals. Figures 1, 2, 3, and 4 show 5-min ECG signals from the four classes.
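The segment counts quoted above follow from the recording length and the 5-minute window. The short check below reproduces them; it is only an arithmetic sketch, and it assumes non-overlapping segments with any trailing partial segment discarded (the variable names are illustrative).

```python
# Sanity check of the segment counts reported for the SHAREE recordings.
# Assumption: segments do not overlap and a trailing partial segment is dropped.
FS_HZ = 128                                  # sampling frequency
RECORD_SECONDS = 2 * 3600 + 10 * 60 + 12     # 2 h 10 min 12 s
SEGMENT_SECONDS = 5 * 60                     # 5-minute segments

samples_per_record = FS_HZ * RECORD_SECONDS              # 999936, about one million
segments_per_record = RECORD_SECONDS // SEGMENT_SECONDS  # 26 full segments

subjects = {"LHT": 122, "MI": 11, "STR": 3, "SYN": 3}
segments = {name: n * segments_per_record for name, n in subjects.items()}

print(samples_per_record, segments_per_record, segments)
```

Under these assumptions the per-class totals match the 3172, 286, 78, and 78 segments stated in the text.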

Fig. 1 Low-risk hypertension ECG signal (5-min segment; amplitude in microvolts versus samples)

Fig. 2 Myocardial infarction ECG signal (5-min segment; amplitude in microvolts versus samples)

Fig. 3 Stroke ECG signal (5-min segment; amplitude in microvolts versus samples)

Fig. 4 Syncope ECG signal (5-min segment; amplitude in microvolts versus samples)

3 Methodology

The optimally designed OGWFB is used to discriminate the LHT, MI, STR, and SYN classes of ECG signals. Each ECG signal is decomposed into six sub-bands (SBs) using the filter bank, and the LOGE and SLFD features are computed for each SB. This gives a total of 12 features (six LOGE and six SLFD) per ECG signal. We then applied various machine learning algorithms to the extracted features; the KNN classifier gives the highest accuracy. The outline of the developed algorithm is presented in Fig. 5.

Fig. 5 Layout of the proposed work: hypertension ECG signal, ECG pre-processing, filter bank, wavelet decomposition, feature selection and extraction, and machine-learning classification of the ECG signal into LHT, MI, syncope, and stroke

3.1 Preprocessing of ECG Signal

Five-minute ECG signals (epochs) are generated by segmenting the long ECG recordings. Z-score normalization is then performed on each epoch to eliminate the amplitude scaling problem [1, 3, 4].
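The preprocessing step can be sketched as follows. This is a minimal illustration, assuming non-overlapping 5-minute epochs at 128 Hz and a per-epoch z-score; the function name `segment_and_normalize` and the synthetic input are hypothetical and not taken from the original implementation.

```python
import numpy as np

FS_HZ = 128
SEGMENT_SAMPLES = 5 * 60 * FS_HZ   # 38400 samples per 5-minute epoch

def segment_and_normalize(ecg, segment_samples=SEGMENT_SAMPLES):
    """Split a 1-D ECG record into non-overlapping epochs and z-score each one."""
    ecg = np.asarray(ecg, dtype=float)
    n_segments = len(ecg) // segment_samples           # drop the trailing partial epoch
    epochs = ecg[: n_segments * segment_samples].reshape(n_segments, segment_samples)
    mean = epochs.mean(axis=1, keepdims=True)
    std = epochs.std(axis=1, keepdims=True)
    std[std == 0] = 1.0                                # guard against flat epochs
    return (epochs - mean) / std

# Synthetic stand-in for one recorded channel (not real ECG data).
rng = np.random.default_rng(0)
record = 130.0 + 15.0 * rng.standard_normal(999_936)
epochs = segment_and_normalize(record)
print(epochs.shape)
```

After normalization every epoch has zero mean and unit standard deviation, so amplitude differences between recordings no longer influence the features.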

3.2 OGWFB Filter Bank

The two-band filter bank, shown in Fig. 6, consists of an analysis filter bank (decomposition) and a synthesis filter bank (reconstruction). The analysis filter bank has a low-pass filter \(P_{0}(z)\) and a high-pass filter \(P_{1}(z)\). The outputs of the analysis filters are down-sampled by a factor of 2, while the inputs to the synthesis filters are up-sampled by a factor of 2. In the synthesis filter bank, \(Q_{0}(z)\) is the low-pass filter and \(Q_{1}(z)\) is the high-pass filter. In the proposed work, we used the orthogonal wavelet filter bank developed in [15]. With perfect reconstruction, the output of the synthesis filter bank matches the input to the analysis filter bank, up to a delay.

The two-channel filter bank achieves perfect reconstruction when its filters satisfy the orthogonality condition [2, 16, 26, 28, 30, 31]. The design of the orthogonal filter bank therefore reduces to the design of a single finite impulse response (FIR) analysis low-pass filter \(p_{0}(n)\), which must fulfill the orthogonality condition (equivalent here to perfect reconstruction) and the zero-moment condition [21, 29]. The high-pass filter \(p_{1}(n)\) is obtained by alternating the signs of the coefficients of the flipped (time-reversed) low-pass filter. Finally, the synthesis filters are obtained by time-reversing the analysis filters.
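The construction described above can be checked numerically. The sketch below is illustrative only: the optimized low-pass filter of [15] is not given in this section, so the standard 4-tap Daubechies (db2) filter is used as a stand-in for \(p_{0}(n)\). The names `p0`, `p1`, `q0`, and `q1` mirror the notation of the text; `analysis` and `synthesis` are hypothetical helper names.

```python
import numpy as np

# Stand-in low-pass filter: 4-tap Daubechies (db2), NOT the optimized filter of [15].
s3 = np.sqrt(3.0)
p0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
L = len(p0)

# High-pass analysis filter: flip p0 and alternate the signs of its coefficients.
p1 = ((-1.0) ** np.arange(L)) * p0[::-1]
# Synthesis filters: time-reversed analysis filters.
q0, q1 = p0[::-1], p1[::-1]

def analysis(x):
    """Filter with p0 and p1, then keep every second sample (down-sample by 2)."""
    return np.convolve(x, p0)[::2], np.convolve(x, p1)[::2]

def synthesis(low, high):
    """Up-sample by 2 (insert zeros), filter with q0 and q1, and add the branches."""
    up_low = np.zeros(2 * len(low))
    up_high = np.zeros(2 * len(high))
    up_low[::2] = low
    up_high[::2] = high
    return np.convolve(up_low, q0) + np.convolve(up_high, q1)

# Orthogonality: unit energy and zero correlation with the shift-by-2 copy.
unit_energy = np.dot(p0, p0)
shift2 = np.dot(p0[:-2], p0[2:])

# Perfect reconstruction: the output equals the input delayed by L - 1 samples.
rng = np.random.default_rng(1)
x = rng.standard_normal(64)
x_hat = synthesis(*analysis(x))
max_err = np.max(np.abs(x_hat[L - 1 : L - 1 + len(x)] - x))
print(unit_energy, shift2, max_err)
```

For any low-pass filter that satisfies the orthogonality condition, the reconstruction error should be at the level of floating-point round-off once the delay of \(L-1\) samples is accounted for.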

Fig. 6 Orthogonal wavelet filter bank diagram: the input X(z) feeds the analysis filters \(P_{0}(z)\) and \(P_{1}(z)\), followed by down-sampling and up-sampling by 2 and the synthesis filters \(Q_{0}(z)\) and \(Q_{1}(z)\), whose outputs are combined to give Y(z)

3.3 Wavelet Decomposition

The ECG signal is non-stationary, so conventional methods (Fourier, Laplace, and short-time Fourier transforms) are not well suited to its analysis [16, 19, 21]. Instead, we used the optimal orthogonal wavelet filter bank (OGWFB) to decompose ECG signals into sub-bands with a five-level wavelet decomposition [14, 21, 27, 34], which provides localized time-frequency information about the ECG signals. An N-level wavelet decomposition produces \(N+1\) sub-bands [14, 17]; here, SB1-SB5 are detail sub-bands and SB6 is the approximation sub-band.
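A five-level decomposition into six sub-bands can be sketched by repeatedly splitting the low-pass branch. As before, this is an assumption-laden illustration: the db2 filter stands in for the optimized OGWFB filter, the ordering (SB1 as the first-level detail, SB6 as the final approximation) is assumed because the text does not specify it, and `wavelet_subbands` is a hypothetical name.

```python
import numpy as np

# Stand-in db2 analysis filters (the optimized filter of [15] is not reproduced here).
s3 = np.sqrt(3.0)
p0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
p1 = ((-1.0) ** np.arange(len(p0))) * p0[::-1]

def wavelet_subbands(x, levels=5):
    """Return levels + 1 sub-bands: detail bands first, approximation band last."""
    approx = np.asarray(x, dtype=float)
    subbands = []
    for _ in range(levels):
        detail = np.convolve(approx, p1)[::2]   # high-pass branch, down-sampled by 2
        approx = np.convolve(approx, p0)[::2]   # low-pass branch, split again next level
        subbands.append(detail)
    subbands.append(approx)
    return subbands

# Synthetic 5-minute epoch at 128 Hz (not real ECG data).
rng = np.random.default_rng(2)
epoch = rng.standard_normal(38_400)
sbs = wavelet_subbands(epoch, levels=5)
print(len(sbs), [len(sb) for sb in sbs])
```

Because the filters are orthogonal, the total energy of the six sub-bands equals the energy of the input epoch, which is a convenient check on an implementation.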

3.4 Features Used in Proposed Work

An important part of this work is calculating and selecting suitable features, because the performance of a classifier depends on the nature of the extracted features. It is not known a priori which feature will best discriminate each class of ECG signal. The LOGE and SLFD features were computed for all six SBs of each ECG signal [33], and the resulting feature vectors were fed to the machine learning classifiers to classify the LHT, MI, SYN, and STR ECG signals.
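The layout of the 12-element feature vector (six LOGE values followed by six SLFD values) can be sketched as below. Two assumptions are made loudly here: LOGE is taken to be a log-energy of the form \(\sum_i \log(x_i^2)\), and SLFD is not defined in this section, so it is passed in as a function. The usage example substitutes the standard deviation purely to make the sketch runnable; it is not the SLFD feature.

```python
import numpy as np

def log_energy(sb, eps=1e-12):
    """Assumed LOGE definition: sum of log of squared samples (eps avoids log(0))."""
    sb = np.asarray(sb, dtype=float)
    return float(np.sum(np.log(sb ** 2 + eps)))

def feature_vector(subbands, second_feature):
    """Concatenate one LOGE value and one second-feature value per sub-band."""
    loge = [log_energy(sb) for sb in subbands]
    other = [float(second_feature(sb)) for sb in subbands]
    return np.array(loge + other)

# Six synthetic sub-bands of decreasing length (not real ECG sub-bands).
rng = np.random.default_rng(3)
demo_subbands = [rng.standard_normal(n) for n in (1200, 600, 300, 150, 75, 75)]

# np.std is a placeholder for the SLFD feature, which the text does not define.
fv = feature_vector(demo_subbands, second_feature=np.std)
print(fv.shape)
```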

3.5 Classification and Performance Evaluation

In the proposed work, we used several supervised machine learning algorithms for the automated classification of hypertension ECG signals: support vector machines (SVM), k-nearest neighbor (KNN), decision trees (DT), and ensemble bagged trees (EBT). Among these, the KNN classifier gave the highest classification performance.

Table 1 Performance summary of the filters obtained using CH1, CH2, and CH3
Table 2 Comparison of AUC obtained by all filters for CH1, CH2 and CH3
Table 3 Confusion matrix of CH1 for all classes obtained using KNN classifier
Table 4 Confusion matrix of CH2 for all classes obtained using KNN classifier
Table 5 Confusion matrix of CH3 for all classes obtained using KNN classifier
Table 6 Classification accuracy of various classifiers for CH1, CH2, and CH3

KNN classifies a test sample according to the classes of the k training samples that are its nearest neighbors. It has also been applied in dimensionality reduction and feature selection settings [13, 20, 25]. In this work, the KNN classifier showed the lowest error probability and the least overfitting among the classifiers tested [13, 20, 25].
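The nearest-neighbor rule can be illustrated with a few lines of NumPy. This is a generic sketch, not the MATLAB classifier used in the experiments: the Euclidean distance, the value of k, and the toy two-dimensional data are all assumptions chosen for illustration.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=3):
    """Label each test row by majority vote among its k nearest training rows."""
    train_x = np.asarray(train_x, dtype=float)
    test_x = np.asarray(test_x, dtype=float)
    train_y = np.asarray(train_y)
    # Squared Euclidean distance between every test row and every training row.
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    preds = []
    for row in train_y[nearest]:
        labels, counts = np.unique(row, return_counts=True)
        preds.append(labels[np.argmax(counts)])   # ties go to the first label in sorted order
    return np.array(preds)

# Toy 2-D feature vectors with two well-separated classes.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["LHT", "LHT", "LHT", "MI", "MI", "MI"]
predicted = knn_predict(train_x, train_y, [[0.2, 0.1], [5.5, 5.2]], k=3)
print(predicted)
```

In the setting of this paper, each row would be a 12-element feature vector and the labels would be the four classes LHT, MI, STR, and SYN.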

Fig. 7 ROC curve obtained for CH1 using KNN classifier (true positive rate versus false-positive rate; AUC = 0.99)

4 Results and Discussion

The experimental work was performed in MATLAB (version 9.1.0) on an Intel Xeon 3.5 GHz machine with 16 GB RAM. Three OGWFB filters (F1, F2, and F3) were used to improve the performance of the proposed work. Filter F2 gives the highest classification accuracy of the three. The classification accuracy of each filter is shown in Table 1. Filter F2 produced the highest area under the curve (AUC), 0.99, for CH3, as reported in Table 2. Tables 3, 4, and 5 give the confusion matrices for the LHT, MI, STR, and SYN classes for CH1, CH2, and CH3, respectively, using the KNN classifier. The classification performance of each classifier is presented in Table 6, which shows that the KNN classifier gives a classification accuracy of 98.4% over the LHT, MI, STR, and SYN classes. To test model performance and reduce the risk of overfitting, we used ten-fold cross-validation. The receiver operating characteristic (ROC) curves and AUCs of the KNN classifier are shown in Figs. 7, 8, and 9 for CH1, CH2, and CH3, respectively.
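The ten-fold cross-validation mentioned above can be sketched as follows. This is a generic illustration under stated assumptions: the folds are a plain random partition (no stratification), a 1-nearest-neighbor rule stands in for the classifier, and the data are synthetic. The names `ten_fold_accuracy` and `one_nn` are hypothetical.

```python
import numpy as np

def ten_fold_accuracy(x, y, predict, n_folds=10, seed=0):
    """Mean held-out accuracy; predict(train_x, train_y, test_x) returns labels."""
    x, y = np.asarray(x, dtype=float), np.asarray(y)
    order = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(order, n_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on the other nine folds, test on the held-out fold.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = predict(x[train_idx], y[train_idx], x[test_idx])
        scores.append(np.mean(pred == y[test_idx]))
    return float(np.mean(scores))

def one_nn(train_x, train_y, test_x):
    """1-nearest-neighbor stand-in classifier using Euclidean distance."""
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    return train_y[np.argmin(d2, axis=1)]

# Synthetic, well-separated two-class data with 12 features (not the paper's data).
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 0.5, (50, 12)), rng.normal(5.0, 0.5, (50, 12))])
y = np.array([0] * 50 + [1] * 50)
acc = ten_fold_accuracy(x, y, one_nn)
print(acc)
```

Each sample is used for testing exactly once, and the reported accuracy is the average over the ten held-out folds. With strongly imbalanced classes, as in this dataset, a stratified split may be preferable.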

Fig. 8 ROC curve obtained for CH2 using KNN classifier (true positive rate versus false-positive rate; AUC = 0.98)

Fig. 9 ROC curve obtained for CH3 using KNN classifier (true positive rate versus false-positive rate; AUC = 0.99)

5 Conclusion

This study used an optimal OGWFB to separate LHT, MI, STR, and SYN ECG signals. The LOGE and SLFD features were calculated for all six sub-bands, and the model was evaluated with ten-fold cross-validation. To check the performance of the filter banks, we applied various classifiers; the KNN classifier achieved an accuracy of 98.4% and an AUC of 0.99. The results indicate that the proposed model is accurate on this dataset, and the approach could potentially be employed for the identification of heart, brain, and kidney disease. System performance may be enhanced further by extracting other features such as sample entropy, wavelet entropy, and higher-order spectra. These findings suggest that the proposed method compares favorably with previously reported models. In future work, we plan to test the technique on a larger dataset for automatic detection of the severity of hypertension.