1 Introduction

Arrhythmia is a critical condition that adversely affects the human heart. There are numerous types of arrhythmia, and each type is associated with a specific pattern, which makes the identification and classification of arrhythmia possible. Some types result in severe cardiac disease, while others do not produce any symptoms. Generally, arrhythmia results in an irregular heartbeat, a condition in which the heart beats too fast (tachycardia), too slow (bradycardia), too early (premature contraction), or too erratically (fibrillation) [1]. Arrhythmia is a common symptom of cardiovascular diseases such as heart attack or stroke, so its detection is important. The American Heart Association has reported that more than four million Americans suffer from different types of arrhythmia [2].

Abnormalities in the human heart are detected by an ECG test. A specialist may prescribe an ECG for an individual who may be at risk of coronary artery disease because of high cholesterol, high blood sugar level, family history, or hypertension. The human heart produces tiny electrical impulses that spread through the heart muscle. These impulses can be detected by an ECG machine [3]. The ECG of a normal, healthy heart has a characteristic shape. Any irregularity in the heart rhythm changes the electrical activity of the heart [4]. Identifying and classifying arrhythmia using ECG is a very tedious task because, in some conditions, it is necessary to analyse every heartbeat of every signal [5].

Support Vector Machine (SVM) is a commonly used machine learning algorithm that solves classification as well as regression problems. It is used for identifying heart arrhythmia because of its good classification and generalisation properties, and it has given very successful results in this area [2]. Classification of ECG signals has three main steps: pre-processing, feature extraction, and classification.

This paper is organised as follows. Sect. 2 gives a basic introduction to the ECG medical test and ECG processing techniques, Sect. 3 describes current challenges in ECG signal classification, Sect. 4 surveys ECG classification, Sect. 5 analyses the survey, and Sect. 6 concludes. The paper gives an overview of existing studies on ECG signal processing and arrhythmia detection, and discusses the challenges in ECG classification. It covers different pre-processing and feature extraction techniques as well as the different types of kernel used in Support Vector Machines. This paper will help beginners choose a proper dataset pre-processing technique as well as proper feature extraction techniques for classification.

2 Background Knowledge

The main function of the heart is to pump blood to the whole body. To do this, the myocardial muscles contract and expand, and in response an electrical current is produced in the heart. This electrical current can be detected and measured by an electrocardiogram (ECG) [7]. The ECG signal is a repetition of P-QRS-T waves, which have specific magnitudes and intervals, as shown in Fig. 1 [8].

Fig. 1. Sample ECG signal [7]

The magnitudes and amplitudes of the ECG waves depend on the fundamental features of the ECG, which are described in Table 1. In arrhythmia diagnosis, feature extraction is the most important step, because finding a set of relevant features makes better accuracy achievable.

Table 1. Feature extraction table

In feature extraction, the magnitudes of these waves and the intervals between them are calculated to detect abnormalities in the rhythm [4]. Feature extraction is based on the time domain and the frequency domain. Because time-domain methods alone are not adequate for feature extraction, frequency-domain methods have been used recently [8]. The main methods for feature extraction are the Discrete Wavelet Transform (DWT), the Discrete Cosine Transform (DCT), and the Continuous Wavelet Transform (CWT). For feature extraction to give better results, the ECG signal is first pre-processed so that noise is removed and missing values are replaced or removed [2].

Various digital filters, which can be implemented on modern micro-controllers and micro-processors, are used to remove noise. A commonly used one is the Finite Impulse Response (FIR) filter [4]. Recently, wavelet-based methods such as the Discrete Wavelet Transform and the Continuous Wavelet Transform have been used to remove noise; these are readily available as MATLAB toolbox functions [5].

Pre-processing techniques depend heavily on the methods chosen for feature extraction and classification. After features are derived from the ECG signal, a model should be built using a machine learning algorithm for arrhythmia beat classification [5]. For classification, the Support Vector Machine is very efficient in signal processing. From the training dataset, the Support Vector Machine computes a hyperplane that divides the data into different classes with maximum margin between the hyperplane and the nearest samples. If the feature space is not linearly separable, a kernel function is used. A kernel is a mathematical function that transforms the feature space into a higher dimension. Both linear and non-linear kernels are available, such as the polynomial, Radial Basis Function (RBF), and sigmoid kernels; among them, the most popular is the RBF kernel because it has a localized and finite response. The Support Vector Machine is a binary classifier, so classifying more than two classes requires a strategy such as one-against-one or one-against-all. The main sequential steps for arrhythmia classification are signal pre-processing, feature extraction, and classification [10].

3 Challenges in ECG Signal Classification

3.1 ECG Signal Acquisition

ECG is a non-stationary signal, and it sometimes also picks up unwanted noise such as electromyographic noise, motion artefacts, and muscle contraction. Eliminating this noise from the ECG signal can lead to loss of information because noise and information overlap. This loss of information directly affects the classification accuracy [11].

3.2 ECG Data-Set Challenges

Most large clinical studies still record ECG on paper printouts, so some amount of digitization is required before computational techniques can be applied. In addition, most of the studies in this paper are based on publicly available ECG datasets such as MIT-BIH, which were originally captured in analogue format and later converted to digital form. Moreover, the databases contain a small number of patients and require considerable pre-processing to denoise the signal, and many ECG databases are not publicly available [12].

3.3 Lack of Standardization of the ECG Features

Features of the ECG signal are based on the chronological selection of wave boundaries in the time and amplitude domains. ECG is analysed at different scales based on the local maxima, minima, and zero crossings of the signal. An approximate localization of the maxima or minima is therefore not sufficient for clinical diagnosis of any disease, because a small change in localization may lead to misclassification [4, 13].

3.4 Variability of ECG Features

ECG is based on cardiac rhythm and heart rate. However, heart rate varies with an individual's physical and mental condition, such as stress, anxiety, exercise, and excitement. Changes in the heart rate subsequently change segments (features) such as the RR interval, QT interval, and PR interval. These varying features may lead to misclassification, so heart-rate variation should be removed from the ECG signal [13].
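To make the dependence of interval features on heart rate concrete, the following minimal Python sketch computes RR intervals and the corresponding instantaneous heart rate from a list of R-peak times. The R-peak times are hypothetical values chosen for illustration and are not taken from any dataset discussed in this survey.

```python
def rr_intervals(r_peak_times):
    """Return successive RR intervals (in seconds) from R-peak times (in seconds)."""
    return [t1 - t0 for t0, t1 in zip(r_peak_times, r_peak_times[1:])]


def heart_rate_bpm(rr):
    """Return the instantaneous heart rate (beats per minute) for each RR interval."""
    return [60.0 / interval for interval in rr]


# Hypothetical R-peak times: the heart speeds up during the recording,
# so the RR-interval feature shrinks even though the rhythm is regular.
r_peaks = [0.0, 1.0, 2.0, 2.75, 3.5, 4.0]
rr = rr_intervals(r_peaks)
print("RR intervals (s):", rr)
print("Heart rate (bpm):", heart_rate_bpm(rr))
```

In this sketch the same subject yields RR intervals between 0.5 s and 1.0 s, which is the kind of variation a classifier could mistake for a change of beat class if the rate is not compensated.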

3.5 Uniqueness of ECG Patterns

Uniqueness of ECG refers to the inter-class variability and intra-class similarity of testing patterns in the target signal. To address the individuality of ECG, methods should be tested on large datasets, but currently they are tested on limited datasets [4].

3.6 Finding Out Appropriate Feature Selection and Pre-processing Method

Feature selection and pre-processing techniques directly affect the accuracy of classification. The feature extraction technique should be highly accurate and should ensure fast extraction of features from the ECG signal [7].

4 Literature Survey of ECG Classification

Many researchers have worked on ECG classification for arrhythmia detection. They have used different pre-processing methods, feature extraction and selection techniques, and classification techniques for diagnosis and classification of arrhythmia. Many investigators use the MIT-BIH dataset for ECG classification.

Xu et al. [2] used PCA (Principal Component Analysis) and FDR (Fisher Discriminant Ratio) for feature reduction. They obtained 76.97% and 78.23% accuracy, respectively. They concluded that, because PCA is a linear transformation method, it is not suitable for a non-linear manifold feature space, so FDR is the better transformation technique for non-linear manifolds.

Nasiri et al. [29] proposed a novel approach for arrhythmia classification. They combined a Genetic Algorithm (GA) with a Support Vector Machine and obtained 93.46% overall classification accuracy. They used the Genetic Algorithm for feature extraction. They also tried a PCA-only approach for feature extraction, but GA provided better classification accuracy than the PCA-only method [29] (Table 2).

Table 2. Literature survey of ECG classification

Salam [14] performed QRS complex detection, ST segment detection, and R-peak detection using the Discrete Wavelet Transform and an adaptive least-squares method, and achieved 98.67% accuracy, a better result than analysis using features such as the P wave, QRS complex, and T wave.

Halil et al. [5] used a hybrid of DWT, DCT, and CWT for signal transformation to improve overall classification performance. In this study, the hybrid approach achieved better accuracy than the individual transformation methods.

Subramanian and Lakshmi et al. [27] detected the QRS complex from the ECG signal with 93% accuracy and 90% specificity. They used three types of algorithm: the Pan-Tompkins algorithm, a derivative-based algorithm, and a Discrete Wavelet Transform (DWT) based algorithm. Among them, the DWT-based multi-wavelet algorithm provided better accuracy and specificity. According to Lakshmi et al. (2011), the Daubechies wavelet gives better accuracy in the wavelet transform because it picks up minute details that can be lost by other wavelets such as the Haar wavelet. Although the Daubechies algorithm is conceptually more intricate and computationally more complex, it gives efficient results. Eaurodo et al. also stated that Daubechies wavelets are the most promising of all wavelets for QRS detection, and that among them Daubechies order 2 provides better accuracy.

Faziludeen and Sabiq [30] used the Pan-Tompkins algorithm for detecting the QRS complex, and the Discrete Wavelet Transform and a Support Vector Machine for classification of ECG. They extracted 25 features from wavelet analysis, such as the mean, variance, standard deviation, minimum, and maximum of the detail coefficients, classified 3 classes of arrhythmia, and obtained very high accuracy.
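As an illustration of this kind of statistical wavelet feature, the sketch below computes the five statistics named above for one list of detail coefficients. It is not the implementation of [30]: the function name, the use of the population variance, and the sample coefficients are assumptions made for the example, and the survey does not state which sub-bands the 25 features were drawn from.

```python
import statistics


def coefficient_statistics(coeffs):
    """Summarise one wavelet sub-band by five statistics.

    Population variance and standard deviation are used here; this is an
    assumption, since the surveyed work does not specify the estimator.
    """
    return {
        "mean": statistics.fmean(coeffs),
        "variance": statistics.pvariance(coeffs),
        "std": statistics.pstdev(coeffs),
        "min": min(coeffs),
        "max": max(coeffs),
    }


# Hypothetical detail coefficients of a single heartbeat at one decomposition level.
detail = [1.0, -1.0, 3.0, -3.0]
print(coefficient_statistics(detail))
```

Applying the same function to each sub-band of a multi-level decomposition and concatenating the results gives a fixed-length feature vector that can be passed to a classifier.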

5 Analysis of Literature Survey

5.1 ECG Dataset Present

Several datasets are now publicly available for biomedical signal processing. Many of them are hosted by PhysioNet, and others have been abandoned by their owners. The AHA database was created by the American Heart Association for diagnosis of arrhythmias. Two versions of this dataset are available: the shorter version has five minutes of unannotated ECG signal prior to the 30 min annotated segment of each recording, and the longer version has 2.5 h of unannotated signal prior to each annotated segment. The data contain 154 recordings sampled at 250 Hz. It is composed of 3 signals and contains 7 types of arrhythmia [7].

Another dataset is the UCI cardiac dataset, which has 245 samples. It contains 16 classes of heart arrhythmia in total and has 76 attributes [2]. The CU database has 35 recordings, each 8 min long, sampled at 250 Hz with 12-bit resolution. This database contains data about different types of ventricular arrhythmia. Three files of this dataset are annotated [14].

The most widely used ECG dataset is the MIT-BIH Arrhythmia dataset, which has 48 heartbeat recordings sampled at 360 Hz. Each recording is 30 min long, and the recordings come from 47 different subjects. Each record consists of 2 leads, lead A and lead B. Generally, lead A is used for analysis of heartbeats and lead B is used for arrhythmia classification. The database contains 3 types of file: a signal file, an annotation file, and a header file [4].

5.2 Pre-processing of ECG Signal

The ECG signal contains different types of noise, which degrade classification performance. Pre-processing techniques are required to remove this noise from the signal. Which types of noise are removed depends entirely on the methods chosen for subsequent feature extraction and classification [5]. The major types of noise present in the ECG signal are listed below.

1. Powerline Interference

It is high-frequency noise, normally between 48 Hz and 50 Hz. It is caused by the sinusoidal alternating current that powers the ECG acquisition equipment [7].

2. Baseline Wander

It is low-frequency noise, normally below 1 Hz, which is caused by breathing and by the movement of different organs during signal acquisition [4].

3. Electromyographic Interference

It is high-frequency, high-amplitude noise. It is caused by electrical impulses from parts of the human body other than the heart [5].

4. Lead Reversal

This type of noise is caused by misplacement of the electrodes, which reverses the amplitude of the heartbeat waveforms [9].

5. Electrode Movement

This is high-frequency, high-amplitude noise caused by changes in skin impedance as well as by movement of the subject or of the electrode [3].
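As a concrete example of noise removal, the sketch below suppresses baseline wander by subtracting a centred moving average from the signal, which acts as a simple FIR high-pass filter. This technique is chosen here only for brevity; the surveyed works use other filters and wavelet-based denoising. The sampling rate, window length, and synthetic signal are assumptions made for the illustration.

```python
import math


def moving_average(signal, window):
    """Centred moving average; near the edges only the available samples are averaged."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def remove_baseline(signal, window):
    """Estimate the slow baseline with a moving average and subtract it."""
    baseline = moving_average(signal, window)
    return [s - b for s, b in zip(signal, baseline)]


# Synthetic test signal (assumed values): a 0.2 Hz drift plus one sharp spike per second.
fs = 50  # sampling rate in Hz, assumed for this example
n = 4 * fs
drift = [0.5 * math.sin(2 * math.pi * 0.2 * i / fs) for i in range(n)]
spikes = [1.0 if i % fs == fs // 2 else 0.0 for i in range(n)]
noisy = [d + s for d, s in zip(drift, spikes)]

# A window of about one second follows the drift but largely ignores the narrow spikes.
cleaned = remove_baseline(noisy, window=fs + 1)
print(len(cleaned), "samples after baseline removal")
```

The window length trades off the two goals: a longer window leaves the QRS-like spikes less distorted but tracks the drift less closely.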

5.3 ECG Signal Preparation

Before features are extracted from the pre-processed ECG signal, a preparation step is proposed to maximize the performance of feature extraction [3]. In this step, the signal is prepared for the feature extraction phase by reducing variability and several inconsistencies in the ECG signal [8].

1. Length Inconsistency

If the ECG signal is acquired starting from the middle of a heartbeat, the extent of the signal may vary, which may cause wrong features to be extracted from the ECG signal [3].

2. Amplitude Variation

This variation is caused by wrong placement of the electrodes at signal acquisition time. If the amplitude of the heartbeats changes, appropriate features cannot be extracted [15].

3. Heart Rate Variability

If the heart rate varies over time during ECG signal acquisition, the shapes of the different segments and waves change [8].

Removing this variability requires steps such as fiducial point detection, signal normalization, and outlier detection. Signal segmentation can segment the signal using a fixed signal span [16]. Fiducial point detection generally means QRS complex detection and R-peak detection for further feature extraction. There are a number of ways to detect the QRS complex, such as the Pan-Tompkins algorithm, the signal squaring method, and the wavelet transform method. Another signal preparation step is normalization of the amplitude and time of the signal, for which z-score normalization and min-max normalization are used [14]. Outlier detection is generally used to discard unwanted waves from the ECG signal, using methods such as the Gaussian Mixture Model and normalized cross-correlation between candidate heartbeats.
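The two normalization schemes and a basic fiducial point detector can be sketched in a few lines of Python. The peak detector below is a plain threshold-and-refractory-period rule, not the Pan-Tompkins algorithm; the function names, threshold, refractory length, and sample beat are all assumptions chosen for illustration.

```python
import statistics


def z_score(signal):
    """Z-score normalization: zero mean and unit (population) standard deviation.

    A completely flat signal has zero standard deviation and would need a guard.
    """
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    return [(s - mean) / std for s in signal]


def min_max(signal):
    """Min-max normalization: rescale the amplitudes to the range [0, 1]."""
    lo, hi = min(signal), max(signal)
    return [(s - lo) / (hi - lo) for s in signal]


def detect_r_peaks(signal, threshold, refractory):
    """Return indices of local maxima above `threshold`.

    Peaks closer than `refractory` samples to the previously accepted peak
    are skipped, which mimics the refractory period of the heart.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] > threshold:
            if not peaks or i - peaks[-1] >= refractory:
                peaks.append(i)
    return peaks


# Hypothetical samples containing two beats with slightly different amplitudes.
ecg = [0.0, 0.1, 1.0, 0.1, 0.0, 0.0, 0.2, 0.0, 0.0, 0.1, 1.1, 0.1, 0.0]
normalized = min_max(ecg)
print("R-peak indices:", detect_r_peaks(normalized, threshold=0.5, refractory=3))
```

Normalizing before thresholding makes a single threshold usable across recordings whose raw amplitudes differ because of electrode placement.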

5.4 Feature Extraction of ECG Signal

Any information extracted from the heartbeat that is used to discriminate its type is called a feature of the ECG. This stage is the most important for good classification of heartbeat arrhythmia. There are two approaches to obtaining features [20]: feature selection and feature extraction. In feature selection, features are selected from an existing set of features [15], so the selected features are a subset of the existing features of the ECG signal. In feature extraction, new features are derived using different methodologies, so the extracted features are not a subset of the original features [21].

1. Feature Extraction Using Morphological Features

In temporal analysis, features are extracted in the time domain. These features are based on the heartbeat waveforms as well as their onset and offset points. They include the P, Q, R, S, and T waveforms, together with the amplitude, duration, interval, level, and area measurements of the fiducials. However, this approach does not perform well on off-the-person and seamless ECG signals [19]. Noise and variability in the heartbeat can change the amplitude, which in turn changes the features of the ECG signal, so these methods are less efficient [3].

2. Feature Extraction Using Segments of the Signal

In spectral analysis, features are extracted from fixed-size segments of the ECG. These features include the QRS complex, ST segment, and QT segment, which can be analysed using the Fourier Transform and wavelet transforms such as the Discrete Wavelet Transform and the Continuous Wavelet Transform [15]. After the Fourier transform, the features become more pronounced, which makes analysis easier. However, spectral analysis (Fourier Transform) has a limitation: it cannot capture abnormalities that are localized in the time domain [21].

3. Feature Extraction Using a Hybrid Approach

This type of analysis was introduced to capture abnormalities in both the time and frequency domains. The features extracted are the QRS complex, QT interval, ST segment, and RR interval. A wavelet transform is applied to every feature [8]. In the wavelet transform method, the original signal is convolved with a wavelet, which yields two sets of wavelet coefficients: the A coefficients and the D coefficients [7]. The A coefficients are the approximation output, which contains the low-frequency part of the original input signal. The D coefficients are the detail output, which contains the high-frequency components. These output coefficients can be represented in the time domain as well as the frequency domain [11]. Classification of the signal is therefore very good, because no parameter of the ECG is missed [8].
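The split into approximation (A) and detail (D) coefficients can be illustrated with a single level of the Haar wavelet transform, written here in plain Python. The Haar wavelet is used only because it is the shortest to write out; the studies surveyed above prefer Daubechies wavelets. The function names and the input samples are assumptions made for the example.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT.

    Returns (approximation, detail): scaled pairwise sums carry the
    low-frequency content, scaled pairwise differences the high-frequency content.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from its A and D coefficients."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / scale, (a - d) / scale])
    return signal


# Hypothetical samples: a flat stretch followed by a sharp step.
samples = [4.0, 4.0, 2.0, 0.0]
a_coeffs, d_coeffs = haar_dwt(samples)
print("A coefficients:", a_coeffs)
print("D coefficients:", d_coeffs)
print("Reconstructed :", haar_idwt(a_coeffs, d_coeffs))
```

The flat stretch produces a zero detail coefficient while the step produces a non-zero one, which is how the D coefficients localize abrupt changes such as the QRS complex. Repeating the transform on the A coefficients gives a multi-level decomposition.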

5.5 Classification of ECG Using SVM Algorithm

Support Vector Machine is a classification algorithm that separates a labelled input dataset by finding the optimal hyperplane between the classes, that is, the hyperplane with maximum margin to the nearest samples. It has very good generalization capability. Suppose that we have a training dataset \( (x_i, y_i) \), where \( x_i \) is an input and \( y_i \in \{ +1, -1 \} \) is its corresponding output. The hyperplane should then satisfy

$$ w \cdot x_i + b \ge 0 \quad \text{where } y_i = +1 $$
(1)
$$ w \cdot x_i + b < 0 \quad \text{where } y_i = -1 $$
(2)

Here w is the coefficient vector of the hyperplane and b is its offset. The optimization problem of the Support Vector Machine states that the margin between the hyperplane and the nearest points should be maximized. In mathematical terms:

$$ \min_{w, b} \; \frac{1}{2} \left( w \cdot w^{t} \right) $$
(3)

subject to \( y_i ( w \cdot x_i + b ) \ge 1 \) for every training sample.

Here, w and b are the parameters of the SVM, and b is a scalar. The total margin is \( \frac{2}{\left\| w \right\|} \). The margin is the distance between the support vectors of the two classes, measured perpendicular to the hyperplane. The main aim of the classifier is to maximize the margin and minimize the error rate. Using Lagrange multipliers \( \alpha_i \ge 0 \), the solution is given by

$$ w = \sum_{i} y_i \alpha_i x_i $$
(4)

Here, only a small fraction of the coefficients \( \alpha_i \) are non-zero. The corresponding \( x_i \) are the support vectors, and they define the decision boundary. All other input patterns, which correspond to zero \( \alpha_i \), are irrelevant and can be removed. The decision function for an input pattern x is then:

$$ f(x) = \text{sgn} \left( \sum_{i} y_i \alpha_i \left( x_i x^{t} \right) + b \right) $$
(5)
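A small worked example may help to connect Eq. (4), Eq. (5), and the margin. The sketch below uses a hypothetical two-point training set with class labels +1 and -1; the multiplier values 0.25 and the offset 0 were derived by hand for this toy set and are not the output of any solver or library.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weight_vector(alphas, labels, points):
    """Eq. (4): w is the sum over samples of label * alpha * x."""
    dim = len(points[0])
    w = [0.0] * dim
    for alpha, label, x in zip(alphas, labels, points):
        for k in range(dim):
            w[k] += alpha * label * x[k]
    return w


def decide(x, alphas, labels, points, b):
    """Eq. (5): sign of the weighted sum of inner products plus the offset."""
    score = sum(alpha * label * dot(xi, x)
                for alpha, label, xi in zip(alphas, labels, points)) + b
    return 1 if score >= 0 else -1


# Toy training set (assumed): one sample per class, both are support vectors.
points = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]  # hand-derived multipliers for this toy set
b = 0.0

w = weight_vector(alphas, labels, points)
margin = 2.0 / math.sqrt(dot(w, w))
print("w =", w, "margin =", margin)
print("class of (2.0, 0.5):", decide((2.0, 0.5), alphas, labels, points, b))
```

For this set both samples lie exactly on the margin, so the constraint of Eq. (3) holds with equality and the margin equals the distance between the two points.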

By replacing the term \( \left( {x_{i } x^{t} } \right) \) with a kernel function \( k(x, x_i) \), the input patterns are mapped into a higher-dimensional space. Data that are not linearly separable are thus mapped to a higher-dimensional space and separated there by a hyperplane, and the resulting decision boundary is expressed back in the original feature space. The function that performs this mapping implicitly is called the kernel function. Many types of kernel are available for the Support Vector Machine classifier, such as the polynomial kernel, Gaussian kernel, Gaussian Radial Basis Function (RBF) kernel, Laplace RBF kernel, sigmoid kernel, and linear spline kernel. Among them, radial basis functions are the most commonly used with Support Vector Machines. The Support Vector Machine is mainly designed for binary classification problems, but it can also solve multi-class problems using different methods [21]. Different methods are used for multi-class classification of ECG using SVM.

1. Hierarchical Support Vector Machine

In this method, several binary SVM classifiers are arranged in a binary tree structure. To train each SVM classifier, the hierarchy of binary decision subtasks should be designed carefully so that appropriate decisions can be taken. One binary classifier, trained using 2 classes, is present at each node of the tree. This method provides high classification accuracy and is computationally efficient [23].

2. One vs One Method

In this method, \( {\text{n}}\left( {{\text{n}} - 1} \right)/ 2 \) binary classifiers are required to solve an n-class problem. Every classifier is trained to separate the data of one class from the data of one other class. The final decision is the class that receives the maximum number of votes. This method works better than the One vs All method (Fig. 2).

Fig. 2. One vs one method [12]

3. One vs All Method

The One vs All method differs from the One vs One method in that it requires only n binary classifiers for an n-class problem, one per class. Each classifier is trained to separate its own class from all the other classes, so that for a given sample only the SVM of the matching class accepts it and the SVMs of the other classes reject it (Fig. 3).

Fig. 3. One vs all method [12]
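The decision rules of the two flat multi-class strategies can be sketched as follows, assuming the standard formulation in which one-vs-one trains one classifier per pair of classes and one-vs-all trains one classifier per class. The binary classifiers here are stand-in distance rules on a one-dimensional feature rather than trained SVMs, and the class names and centres are hypothetical.

```python
from collections import Counter
from itertools import combinations


def n_classifiers_one_vs_one(n):
    """Number of pairwise binary classifiers for n classes."""
    return n * (n - 1) // 2


def n_classifiers_one_vs_all(n):
    """Number of binary classifiers when each class is separated from the rest."""
    return n


def one_vs_one_predict(x, classes, pairwise):
    """Majority vote: pairwise[(a, b)](x) returns either a or b."""
    votes = Counter(pairwise[(a, b)](x) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


def one_vs_all_predict(x, scorers):
    """Pick the class whose own-vs-rest scorer gives the highest score."""
    return max(scorers, key=lambda c: scorers[c](x))


# Stand-in classifiers (assumed): each class has a centre on a 1-D feature axis.
centres = {"N": 0.0, "V": 5.0, "A": 10.0}  # hypothetical beat classes
classes = list(centres)


def make_pair(a, b):
    """Toy binary rule: choose whichever of the two class centres is nearer."""
    return lambda x: a if abs(x - centres[a]) <= abs(x - centres[b]) else b


pairwise = {(a, b): make_pair(a, b) for a, b in combinations(classes, 2)}
scorers = {c: (lambda x, c=c: -abs(x - centres[c])) for c in classes}

print("pairwise classifiers needed:", n_classifiers_one_vs_one(len(classes)))
print("one-vs-one prediction for 4.0:", one_vs_one_predict(4.0, classes, pairwise))
print("one-vs-all prediction for 9.0:", one_vs_all_predict(9.0, scorers))
```

With trained SVMs, each entry of `pairwise` would wrap the sign of a binary decision function and each entry of `scorers` would return the decision value of the corresponding class-vs-rest SVM.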

Sometimes the training data are not linearly separable. In that case, the data are mapped into a higher-dimensional feature space and classified there, and the result is expressed back in the original space. This procedure is carried out with the help of a kernel. Among the kernels listed above, radial basis functions are the most commonly used with Support Vector Machines.
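For reference, four of the commonly cited kernels can be written as short Python functions. The parameter names `gamma`, `degree`, and `coef0` and their default values are assumptions that follow common usage; the survey itself does not fix any kernel parameters.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(x, z):
    """Plain inner product: no mapping to a higher dimension."""
    return dot(x, z)


def polynomial_kernel(x, z, degree=3, coef0=1.0):
    """(x . z + coef0) ** degree"""
    return (dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF: exp(-gamma * squared Euclidean distance).

    The value is 1 for identical inputs and decays towards 0 with distance,
    which is the localized response mentioned in the text.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, z, gamma=0.5, coef0=0.0):
    """tanh(gamma * x . z + coef0)"""
    return math.tanh(gamma * dot(x, z) + coef0)


# Hypothetical two-dimensional feature vectors.
u, v = (1.0, 2.0), (3.0, 4.0)
for name, kernel in [("linear", linear_kernel), ("polynomial", polynomial_kernel),
                     ("rbf", rbf_kernel), ("sigmoid", sigmoid_kernel)]:
    print(name, kernel(u, v))
```

Any of these functions can take the place of the inner product in the SVM decision function, which is what allows a linear separator in the mapped space to act as a non-linear boundary in the original feature space.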

6 Conclusion

In this survey, through a detailed presentation and discussion of arrhythmia detection using ECG classification, we conclude that the Support Vector Machine with the RBF kernel provides better accuracy than the other kernels.