1 Introduction

Cardiovascular diseases are the main cause of heart attacks worldwide, and their diagnosis from the electrocardiogram (ECG) is of great importance in medical applications. Myocardial ischaemia and cardiac arrhythmias are two common heart conditions. In myocardial ischaemia, reduced blood supply alters the morphology of the ECG signal, whereas cardiac arrhythmias carry critical information about conditions that can lead to sudden cardiac death [8, 20]. Hence, arrhythmic beat classification provides important information about the cardiac condition of a patient. ECG morphological analysis has been used in the past for the assessment of arrhythmias, at the expense of more computational elements. To overcome the complexity involved in morphology based signal processing, heart rate variability (HRV) analysis has become popular in ECG processing applications [7]. HRV measures the variation in the R-R interval, or inter-beat interval (IBI), which is defined as the duration in milliseconds between consecutive R waves of an electrocardiogram. Heart rate in beats per minute (bpm) is computed as HR = 60000/(R-R interval), with the R-R interval in milliseconds.

The preprocessing of ECG signals is usually carried out using adaptive filters, wavelets and empirical mode decomposition (EMD) based techniques. Adaptive finite impulse response (FIR) filters are widely preferred in many preprocessing applications, but their use has been limited by their high computational cost. However, advances in very large scale integration (VLSI) technology make it possible to perform the adaptive filtering computations on a single chip with fewer elements. Among the various types of adaptive filtering methods, least mean square (LMS) based adaptive filtering is commonly used for a wide range of signal processing applications [24].

In the standard normalized LMS (NLMS) algorithm, a step size parameter controls the convergence performance of the filter [3]. A proper step size improves convergence while reducing the mean square error (MSE). Variable step size LMS algorithms have been applied in the weight update of adaptive filters to improve performance with low computational power [16]. To improve convergence with low steady state error, the error normalized LMS (ENLMS) algorithm has been introduced [13]. The NLMS algorithm normalizes the step size by the instantaneous data vector, whereas the ENLMS algorithm normalizes it by the error vector. The critical path is generally reduced by pipelined structures with delay elements so that the desired sample period can be attained [11, 19]. In this work, a delayed error normalized LMS (DENLMS) adaptive filter is used for ECG noise removal with less computational complexity [5, 21].

Wavelet transform based signal processing is preferred over filtering methods due to its capability of extracting time-frequency domain information from non-stationary signals. Since the ECG is non-stationary in nature, a wide variety of wavelets have been applied in ECG signal preprocessing and feature extraction [17]. Wavelet decomposition and reconstruction are performed to extract the important features of the ECG signal. The number of coefficients and the level of decomposition decide the performance of any wavelet. The Coiflet is a popular wavelet that is useful in finding R-peaks in the ECG signal [15]. R-peaks are used in finding the HRV of ECG signals. Though Coiflets are derived from the Daubechies wavelet, they are more symmetric and have nearly linear phase, and their characteristics are much better than those of the Daubechies and Spline wavelets [9]. Discrete wavelet transform (DWT) based HRV feature extraction is utilized in this work as it reduces the required number of computational elements [14]. Though many wavelets are available for R-peak detection and HRV feature extraction, the Coiflet is chosen in this work because it extracts the QRS complex better and detects the maximum possible number of R-peaks [2, 6].

Machine learning techniques are useful in many complex applications where conventional algorithms are infeasible [18]. Since medical expert systems are part of the portable and smart healthcare monitoring devices used in day to day life, a large amount of clinical data is generated, and it has to be processed and classified appropriately. The K-nearest neighbor (KNN) algorithm, which is simple compared to other machine learning techniques such as the support vector machine (SVM), decision tree classifiers and deep learning, is used to classify the feature extracted ECG data [4].

Time domain and frequency domain parameters are used in many machine learning based ECG classification applications to identify sympathetic and parasympathetic activities [22]. In general, the low frequency range is affected by both sympathetic and parasympathetic activities, whereas the high frequency range largely reflects parasympathetic activity [1]. The extracted time domain and frequency domain features are applied to the machine learning classifier. In this work, a KNN classifier performs the classification using the time domain and frequency domain features extracted with the DWT.

This paper focuses on DENLMS adaptive filter based ECG signal preprocessing and DWT based HRV feature extraction. Arrhythmic beat classification is performed by a KNN classifier on the HRV features and HRV parameters extracted by the DWT. The performance is compared with that of similar machine learning based classifiers. The paper is organized into five sections, including this introductory section. Section 2 focuses on ECG signal preprocessing using the DENLMS algorithm. Section 3 explains DWT based feature extraction and describes KNN classifier based arrhythmia classification on HRV. Section 4 discusses the obtained results and Section 5 concludes the proposed work.

2 ECG signal preprocessing using DENLMS algorithm

The LMS algorithm is the most preferred choice in adaptive filters due to its computational simplicity. In each iteration of the standard LMS algorithm, the FIR filter coefficients are modified based on the error output and the step size. In this work, the delayed error normalized LMS (DENLMS) algorithm is used with the weight update equation

$$ w\left(n+1\right)=w(n)+{\mu}_{en}e(n)x(n) $$
(1)

where x(n), e(n), w(n) and w(n + 1) represent the input, error, old weight and updated weight, respectively. The error normalized step size μ_en is written in terms of the squared norm of the error vector instead of the input vector as

$$ {\mu}_{en}=\left[\frac{\mu }{p+{e}^t(n)e(n)}\right] $$
(2)

where p is a positive constant that keeps the denominator away from zero, so that the step size remains bounded when the error is small.

From Eq. (2), a large error gives a small step size, which may slow the convergence. The algorithm therefore converges slowly at first, when the error is large and the effective step size is small. As the algorithm approaches convergence and the error e(n) decreases in magnitude, the step size becomes larger, which leads to faster convergence of the filter.
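The update of Eqs. (1) and (2) can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function name `enlms_filter`, the use of a scalar instantaneous error for the term e^t(n)e(n), the value p = 1.0 and the `delay` argument (a stand-in for the pipelining delay of the delayed variant) are all assumptions made for illustration. The filter is exercised on a toy system identification task in which a known 3-tap response is recovered from noise-free data.

```python
import random


def enlms_filter(x, d, num_taps=21, mu=0.01, p=1.0, delay=0):
    """Adaptive FIR filter with the error normalized update of Eqs. (1)-(2).

    x, d     : input and desired sequences (lists of floats, same length)
    num_taps : FIR filter length
    mu       : base step size
    p        : positive constant in the denominator of Eq. (2) (assumed value)
    delay    : samples by which the update terms are delayed (hypothetical
               parameter; 0 gives the undelayed update of Eq. (1))
    """
    w = [0.0] * num_taps
    errors = [0.0] * len(x)
    for n in range(num_taps - 1, len(x)):
        # Filter output for the tap vector [x(n), x(n-1), ..., x(n-M+1)].
        y = sum(w[k] * x[n - k] for k in range(num_taps))
        errors[n] = d[n] - y
        m = n - delay  # time index of the terms used in the weight update
        if m < num_taps - 1:
            continue  # delayed terms are not available yet
        e_m = errors[m]
        mu_en = mu / (p + e_m * e_m)  # Eq. (2), scalar error assumed
        w = [wk + mu_en * e_m * x[m - k] for k, wk in enumerate(w)]  # Eq. (1)
    return w, errors


# Toy system identification: recover a known (hypothetical) 3-tap response.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
h = [0.5, -0.3, 0.2]
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]
w, errors = enlms_filter(x, d, num_taps=3)
w_delayed, _ = enlms_filter(x, d, num_taps=3, delay=2)
```

With p = 1.0 the effective step size never exceeds μ, which keeps this toy example stable; a much smaller p would allow a far larger step size once the error becomes small, so p has to be chosen together with μ.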

3 DWT based feature extraction and abnormality detection using KNN classifier

In DWT based feature extraction, the R-peaks are detected to determine the HRV signal features. Arrhythmic beat classification is then performed with the KNN classifier to detect abnormalities in the ECG signal. R-peak detection techniques are mainly based on the heart rate function, which is widely used to calculate the RR interval. Dividing one minute by the instantaneous heart rate gives the RR interval of the given ECG signal. Consecutive RR intervals of the ECG signal are calculated from the starting interval of the heart rate function. Among the Meyer, Biorthogonal, reverse Biorthogonal and Coiflet wavelets, the Coiflet is chosen to extract R-peaks. In a practical RR interval measurement system, a correlation technique with a timing resolution of ± 1 ms is used. Accurate RR interval measurement can be obtained with a high performance digital signal processor or a customized processor. A considerable amount of the baseline trend in the ECG signal is removed by the adaptive filtering based preprocessing.
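The relation between R-peak positions, RR intervals and heart rate can be sketched as follows. The helper names and the example peak positions are hypothetical; the sketch assumes that R-peaks are available as sample indices and uses HR = 60000/RR with RR in milliseconds.

```python
def rr_intervals_ms(r_peak_samples, fs_hz):
    """RR intervals in milliseconds from R-peak positions (sample indices)."""
    return [(b - a) * 1000.0 / fs_hz
            for a, b in zip(r_peak_samples, r_peak_samples[1:])]


def heart_rate_bpm(rr_ms):
    """Instantaneous heart rate in bpm, HR = 60000 / RR with RR in ms."""
    return [60000.0 / rr for rr in rr_ms]


# Hypothetical R-peak positions for a signal sampled at 540 Hz.
peaks = [0, 432, 864, 1296]
rr = rr_intervals_ms(peaks, fs_hz=540.0)
hr = heart_rate_bpm(rr)
```

In this example the peaks are 432 samples apart, which corresponds to an RR interval of 800 ms and a heart rate of 75 bpm.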

Time domain and frequency domain features can be derived from the extracted HRV features. Figure 1 depicts the components involved in arrhythmic beat classification based on preprocessing and HRV feature extraction. The proposed evaluation consists of (i) collection of raw electrocardiogram (ECG) signals from the MIT-BIH database, (ii) preprocessing of the ECG signal using an adaptive filter, (iii) R-peak detection, (iv) frequency domain feature extraction from the HRV signal and (v) KNN classification into normal and abnormal.

Fig. 1 Steps involved in abnormality detection

There are 14 well known time domain and frequency domain HRV features. The time domain HRV parameters used are RR mean (ms), RR Std (ms), HR mean (bpm), HR Std (bpm), RMSSD (ms), NN50, pNN50, RR triangular index and TINN (ms). In this work, six frequency domain features have been utilized: VLF power (ms²), LF power (ms²), HF power (ms²), LF norm, HF norm and the LF/HF ratio [12]. The three frequency bands used are VLF, LF and HF, together with the frequency ratio LF/HF. In addition to these parameters, LF norm and HF norm are calculated in normalized units.
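Several of these parameters follow directly from the RR interval series and the band powers, as the sketch below illustrates. The function names are hypothetical, and the definitions are the commonly used ones rather than ones confirmed by this work: NN50 counts successive RR differences larger than 50 ms, and the normalized units are taken relative to LF + HF, that is, total power minus VLF when only these three bands are considered.

```python
import math
import statistics


def time_domain_features(rr_ms):
    """A subset of the time domain HRV parameters from RR intervals in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    nn50 = sum(1 for diff in diffs if abs(diff) > 50.0)
    return {
        "rr_mean_ms": statistics.mean(rr_ms),
        "rr_std_ms": statistics.stdev(rr_ms),
        "hr_mean_bpm": statistics.mean([60000.0 / rr for rr in rr_ms]),
        "rmssd_ms": math.sqrt(statistics.mean([diff * diff for diff in diffs])),
        "nn50": nn50,
        "pnn50_percent": 100.0 * nn50 / len(diffs),
    }


def normalized_band_features(lf_power, hf_power):
    """LF norm, HF norm (normalized units) and the LF/HF ratio."""
    band_sum = lf_power + hf_power  # total power minus VLF (assumed definition)
    return {
        "lf_norm": 100.0 * lf_power / band_sum,
        "hf_norm": 100.0 * hf_power / band_sum,
        "lf_hf_ratio": lf_power / hf_power,
    }


# Hypothetical RR series (ms) and band powers (ms^2) for illustration only.
td = time_domain_features([800.0, 810.0, 790.0, 860.0, 800.0])
fd = normalized_band_features(lf_power=300.0, hf_power=100.0)
```

The triangular index and TINN are omitted here because they require a histogram of the RR intervals with a chosen bin width.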

The K-nearest neighbor (KNN) algorithm is a simple machine learning algorithm compared to similar machine learning approaches [23]. The KNN classifier groups a query point with the closest training points in the considered feature space. The class is assigned by majority vote among the nearest neighbor points. In KNN, K denotes the number of nearest neighbors considered, with closeness measured by the Euclidean distance; K takes values from 2 to 7 in this work. The MIT-BIH database has been chosen for arrhythmic beat classification and abnormality detection. The detected R waves and RR intervals are used for HRV frequency domain analysis. The frequency domain parameters listed in Table 1 are calculated from the preprocessed, HRV extracted data for arrhythmic beat classification.
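The majority vote among the K nearest training points can be sketched in a few lines of Python. This is a generic illustration and not the classifier configuration used in this work; the function name, the two-dimensional feature vectors and the labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label of `query` by majority vote among its k nearest training points."""
    # Sort the training points by Euclidean distance to the query.
    neighbors = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in neighbors[:k])
    # If two classes tie, the class of the closer neighbor is returned.
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional feature vectors, for example (LF norm, HF norm).
train_x = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
train_y = ["normal", "normal", "normal", "abnormal", "abnormal", "abnormal"]
label_a = knn_predict(train_x, train_y, [1.0, 1.05], k=3)
label_b = knn_predict(train_x, train_y, [5.1, 5.0], k=3)
```

Even values of K, such as 2, 4 and 6, can produce tied votes in a two-class problem, so a tie-breaking rule such as the one above is needed.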

Table 1 Comparison of various classification techniques

4 Results and discussion

The simulation experiments are conducted on ECG signals from the MIT-BIH arrhythmia database and on ECG signals recorded in real time. For the real-time recordings, ECG data has been collected from both healthy persons and people with heart related problems. Subjects aged between 18 and 40 with different backgrounds were chosen for this study. The database ECG signal is originally obtained by placing the electrodes on the chest in the first channel V1, which is the standard practice in ECG recording. The recorded signal has been digitized at a 540 Hz sampling rate with 11-bit resolution over a 10 mV range. All the simulation experiments were conducted using MATLAB® v. 7.10. In the simulation scenario, μ is set to 0.01 for all the filters and the filter length to 21 in order to reduce the simulation time. The number of iterations for the experiment is 1000. Five sample records of the database (ECG data records 105, 106, 107, 108 and 109) have been used. Moreover, a speech signal is considered in the simulation to test the performance of the DENLMS algorithm on a different type of signal. Figure 2 shows the MSE comparison of the different algorithms for noise with an SNR of 2.5 dB.

Fig. 2 MSE comparison of LMS algorithms for noise with SNR = 2.5 dB

Filter convergence varies with the step size; the filter converges faster for a larger step size. The MSE decreases as the SNR of the noise increases. An MSE of 19.5 dB has been achieved at iteration 110 with the DENLMS algorithm, which is superior to the NLMS, transform domain LMS (TDLMS) and delayed NLMS (DNLMS) algorithms.
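For reference, a noisy test signal at a prescribed SNR and an MSE value in decibels can be produced as in the following sketch. The helper names, the white Gaussian noise model and the sinusoidal stand-in for the clean signal are assumptions for illustration; the noise model and the exact MSE convention behind the reported values are not restated here.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, seed=0):
    """Add white Gaussian noise scaled so that the SNR equals `snr_db`."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum(v * v for v in noise) / len(noise)
    # Scale the noise so that signal_power / scaled_noise_power = 10^(snr_db / 10).
    scale = math.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + scale * v for s, v in zip(signal, noise)]


def mse_db(reference, estimate):
    """Mean square error between two sequences, expressed in dB."""
    mse = sum((r - y) ** 2 for r, y in zip(reference, estimate)) / len(reference)
    return 10.0 * math.log10(mse)


# One second of a 5 Hz sinusoid sampled at 540 Hz as a stand-in clean signal.
clean = [math.sin(2.0 * math.pi * 5.0 * n / 540.0) for n in range(540)]
noisy = add_noise_at_snr(clean, snr_db=2.5)
```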

The filtering performance of this algorithm is observed using five MIT-BIH database ECG signals. Figure 3 shows the extraction of a clean ECG from corrupted ECG data (record 104).

Fig. 3 Original and denoised ECG signals using DENLMS algorithm

The denoised ECG is processed using the R-peak detection algorithm to determine the number of beats per minute. Beat rates of 136, 129 and 74 bpm are detected using the Meyer, Biorthogonal and Coiflet wavelets, respectively. Since the normal resting human heart rate lies between 60 and 100 bpm, the Coiflet wavelet provides the correct beat rate of 74 bpm. The obtained result is shown in Fig. 4.

Fig. 4 R-peak detection using Coiflet

The frequency domain parameters VLF (ms²), LF (ms²), HF (ms²), the frequency ratio LF/HF, LF norm and HF norm are noted for all the feature extracted ECG data. A National Instruments (NI) biomedical kit has been utilized for the computation of the frequency domain values. The obtained values are compared between normal and abnormal persons in Fig. 5. These values are averaged to analyze the risk using arrhythmic beat classification.

Fig. 5 Comparison of normal and abnormal subjects based on (a) LF power (b) HF power

In KNN classification, frequency domain features have been computed from a total of 160 HRV data. Different K values from 2 to 7 are applied over the different frequency band values for better classification results. Threshold values are fixed for the frequency domain features VLF (ms²), LF (ms²), HF (ms²), the frequency ratio LF/HF, LF norm and HF norm when categorizing subjects as normal or abnormal. One hundred twenty-eight training data (80%) with a signal length of 30 s have been used to train the KNN classifier over 1000 epochs. After training, 32 testing data (20%) were used to validate the accuracy of the classifier.
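The 80%/20% partition and the accuracy computation can be sketched as follows. The helper names, the random shuffling and the placeholder data are assumptions for illustration; how the 160 HRV data were actually partitioned is not stated.

```python
import random


def split_train_test(samples, labels, train_fraction=0.8, seed=0):
    """Shuffle the data and split it into training and testing subsets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * len(samples))
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([samples[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


def accuracy_percent(true_labels, predicted_labels):
    """Percentage of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)


# Placeholder data: 160 items with alternating (hypothetical) class labels.
samples = list(range(160))
labels = ["normal" if i % 2 == 0 else "abnormal" for i in samples]
(train_x, train_y), (test_x, test_y) = split_train_test(samples, labels)
```

With 160 items and a training fraction of 0.8, the split yields 128 training and 32 testing items, matching the counts given above.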

In the proposed KNN classification based abnormality detection method, only R-peak detection is carried out after ECG signal preprocessing. Several existing techniques are time-consuming and require complex computations. In addition, morphological ECG features are not feasible when dealing with noisy data. Various techniques for categorizing subjects as normal or abnormal with arrhythmic risk are compared in Table 1. The existing classification techniques chosen for comparison are based on ANN, neuro fuzzy and conventional KNN classifiers [10]. In these techniques, ECG, blood volume and HRV parameters were used, and a maximum classification accuracy of 96% has been achieved. In contrast, the proposed KNN based classifier gives a maximum accuracy of 97.5% in classifying normal and arrhythmic risk abnormal subjects.

5 Conclusion

ECG signal preprocessing is carried out by a DENLMS algorithm based adaptive filter. It is observed that the MSE decreases as the SNR of the noise increases. An MSE of 19.5 dB has been achieved at iteration 110 with the DENLMS algorithm, which is a performance improvement of 24.35% compared with the TDLMS algorithm. In R-peak detection, the Coiflet wavelet is used for better HRV feature extraction with the maximum number of R-peaks. Time domain and frequency domain features are applied to the KNN classifier, which is simpler than other machine learning approaches, for arrhythmic beat classification. The existing classification techniques are based on ANN, neuro fuzzy and conventional KNN classifiers. In these techniques, ECG, blood volume and HRV parameters were used, and a maximum classification accuracy of 96% has been achieved. In contrast, the KNN based classifier with DWT gives a maximum accuracy of 97.5% in classifying normal and arrhythmic risk abnormal subjects.