1 Introduction

Cardiovascular diseases (CVDs) are a major health concern worldwide and remain the leading cause of death. According to a World Health Organization (WHO) report, 17.9 million people died from CVDs in 2016, and 85% of these deaths were due to stroke and heart attack. Low- and middle-income countries bear a disproportionate share of this mortality [1]. Economically, CVDs place an enormous burden on societal resources. Risk factors such as stress from lifestyle changes, hypertension, an unhealthy diet, obesity, physical inactivity, diabetes, hyperlipidemia, and the consumption of harmful substances like tobacco and alcohol can all contribute to cardiovascular disease. Major cardiovascular diseases include congenital heart defects, coronary artery disease, cardiomyopathy, myocarditis, and myocardial infarction [2]. Early detection and diagnosis are vital to prevent and treat CVDs, and accurate detection requires real-time monitoring of heart activity. The electrocardiogram (ECG), echocardiogram, cardiac catheterization, cardiac computerized tomography scan, and cardiac magnetic resonance imaging are some popular methods to detect CVDs. The ECG is a graphic pattern that represents the electrical and muscular function of the heart. Being non-invasive, the ECG avoids the perils of invasive methods. An ECG measured from the skin of a subject captures the electrical activity of the different cardiac tissues within the heart, as shown in Fig. 1. Because it is a reliable, popular, and inexpensive way to trace and study the electrical activity of the heart, the ECG has become ubiquitous. A careful analysis of each heartbeat is required to detect CVDs [3, 4].

Fig. 1 The electrical activity of various cardiac tissues [5]

Nowadays, the ECG is used not only to detect CVDs but also for various other purposes, such as biometric identification [6], identification of other diseases like pneumonia [7], and estimation of the respiratory frequency [8]. The ECG waveform mainly consists of the P, Q, R, S, and T waves, as shown in Fig. 2; a detailed description of the ECG waveform is found in [9]. The amplitude, frequency, duration, and origin of each wave are summarized in Table 1 [5, 10,11,12]. The amplitude values in Table 1 are measured across lead II, with reference to the baseline level of the ECG signal defined by the isoelectric line immediately preceding the QRS complex. The duration of a wave is determined by the two time instants at which the wave either crosses the baseline or deviates significantly from the zero reference line [5]. Since researchers use the ECG for cardiac monitoring and many other applications, it is essential to analyze and classify ECG signals sensibly and precisely. During continuous monitoring, manual analysis of an ECG signal is a tedious and error-prone task; hence, an automatic system to analyze and classify the ECG signal is in great demand.

Fig. 2 Basic ECG waveform

Table 1 Summary of the various ECG waves

A fully automatic system to analyze and categorize an ECG signal includes steps such as pre-processing, data transformation, fiducial point detection, and feature extraction. A block diagram of the steps involved in ECG signal analysis is shown in Fig. 3. Pre-processing, or noise filtering, is vital because it directly influences the performance of the whole system; removing the various kinds of noise present in the ECG signal is therefore both essential and challenging.

Fig. 3 ECG signal processing algorithm

The pre-processed signal is subjected to data transformation, which includes processes like differentiation and squaring. The purpose of this transformation is to compute the slope and width of the QRS complex. The derivative stage provides the slope information of the QRS complex, while the squaring operation converts the bipolar signal into a unipolar one and non-linearly amplifies the output of the derivative stage. Squaring helps minimize the false positives caused by T waves whose energies are higher than the usual spectral energies [13]. Moving-window integration then produces a signal that contains both the slope and width information of the QRS complex [13]. This vital information (the slope and width of the QRS complex) is used to detect the R peak (the peak of the QRS complex). The feature extraction unit extracts various statistical and morphological features of the ECG signal, including the P-wave, QRS complex, and T wave. With these features, the ECG signal is analyzed to identify various heart conditions.
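To make this stage concrete, the following minimal NumPy sketch chains the derivative, squaring, and moving-window integration steps in the spirit of the classic Pan–Tompkins approach [13]; the 360 Hz sampling rate and the 150 ms integration window are illustrative assumptions, not values mandated by the cited work.

```python
import numpy as np

def transform_stage(ecg, fs=360):
    """Derivative, squaring, and moving-window integration of a
    pre-processed ECG signal (Pan-Tompkins-style sketch)."""
    # Derivative: approximates the slope of the QRS complex.
    deriv = np.diff(ecg, prepend=ecg[0])
    # Squaring: makes the signal unipolar and amplifies large slopes.
    squared = deriv ** 2
    # Moving-window integration: a ~150 ms window matches the typical
    # QRS width and yields the combined slope-plus-width information.
    win = int(0.150 * fs)
    return np.convolve(squared, np.ones(win) / win, mode='same')
```

The integrated output of such a sketch is the signal that the decision stage discussed in Sect. 4 thresholds to locate R peaks.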

The organization of the paper is as follows. In Sect. 2, different types of noises are discussed. Pre-processing or denoising techniques are presented in Sect. 3. Section 4 includes data transformation and detection techniques. Various databases are discussed in Sect. 5. Evaluation parameters are introduced in Sect. 6. Section 7 includes discussion and challenges. Finally, conclusions are presented in Sect. 8.

2 Noises in ECG Signal

Noise, the most critical obstacle in ECG signal analysis, must be given full attention in order to detect QRS complexes accurately and precisely. The various noises present in an input ECG signal must be filtered out before further processing. The noise sources and artifacts that contaminate an ECG signal are briefly discussed below.

(a) Power line interference (PLI), the most common noise in the ECG signal, is caused by inductive interference between the power line and the electronics of the ECG recording equipment. PLI has an amplitude of up to 50% of the peak-to-peak ECG amplitude. Its frequency is 60 Hz and its harmonics (or 50 ± 0.2 Hz in some data sets). Both the signal amplitude and frequency vary with the power line interference, and the bandwidth of this noise is below 1 Hz [14]. Figure 4a shows the reference ECG signal (record 100) taken from the MIT-BIH arrhythmia database, while the PLI-contaminated signal is shown in Fig. 4b.

(b) Electrode contact noise is a randomly occurring noise with an amplitude equal to the maximum recorder output. The loss of contact between skin and electrode results in a transient interference of about one second. The contact break can be permanent, or it can be spasmodic, as when a loose electrode moves into and out of contact with the skin due to vibration and movement. The consequence of this noise is a baseline transition that decays exponentially to the baseline value [14]. The effect of electrode contact noise on MIT-BIH record 100 is shown in Fig. 4c.

(c) Baseline wander, or baseline shift, is a low-frequency noise in the range of 0.15–0.3 Hz with an amplitude of about 15% of the peak-to-peak ECG amplitude. It arises from the respiration of the subject and shifts the baseline of the ECG signal; the wander increases with increasing breathing rate. Several factors, such as skin impedance, the subject's movement, electrode characteristics, and electrolyte properties, influence the amplitude and duration of the baseline shift [15, 16]. Figure 4d represents the effect of baseline wander on MIT-BIH ECG record 100.

(d) Motion artifacts result from electrode movement over the skin caused by movements of the subject. The moving electrodes change the skin–electrode impedance, which varies the baseline of the ECG signal. The amplitude of a motion artifact is about 500% of the peak-to-peak ECG amplitude, with a duration of 100–500 ms, so motion artifacts produce transient baseline changes. Motion artifacts and baseline wander can corrupt the low-frequency components and the ST-segment of the ECG signal; a distorted ST-segment may lead to false prediction of diseases such as myocarditis, ischemia, infarction, Brugada syndrome, and infiltrative or myopathic processes [15, 16]. Motion artifacts are shown in Fig. 4e.

(e) Electromyographic (EMG) noise is due to muscle contraction, which generates potentials in the millivolt range. EMG bursts last about 50 ms and contain frequency components in the range of 0–10 kHz, with an average amplitude of about 10% of the peak-to-peak ECG amplitude. EMG noise overlaps with the ECG signal in the 0.01–0.1 kHz frequency range [15, 16], so removing it without disturbing the features of the ECG signal is quite difficult. The effect of EMG noise on MIT-BIH ECG record 100 is shown in Fig. 4f.

(f) Instrumentation noise is produced by improper usage of the electronic components used to record ECG signals. An example is amplitude saturation of the ECG signal due to improper biasing of the input amplifier, which corrupts the recording [15, 16]. Instrumentation noise is shown in Fig. 4g.

(g) Electrosurgical noise is produced by other medical equipment in the patient monitoring environment and significantly corrupts the ECG signal. It is modeled as a large-amplitude sinusoid with frequencies between 0.1 and 1 MHz lasting 1–10 s; its amplitude is approximately 200% of the peak-to-peak ECG amplitude.

Fig. 4 Effect of various noises on ECG signal: a MIT-BIH arrhythmia record 100, b powerline interference, c electrode contact noise, d baseline drift, e motion artifacts, f muscle contraction, g instrumentation noise

3 Pre-processing of ECG Signal

Until the 1970s, the direct-writing electrocardiograph was prominent, and the recorded signals were analog. Today, before further processing, nearly all ECG machines digitize the analog ECG signal at a particular sampling rate [17] using analog-to-digital (A/D) conversion. At the front end, the initial A/D sampling rate is significantly higher than the target sampling rate of the subsequent ECG signal processing; this is known as oversampling. Oversampling is required to detect pacemaker output stimuli with durations shorter than 0.5 ms. It has other advantages as well, such as reduced quantization error relative to the precision of the least significant bit and the possibility of a lower-order analog anti-aliasing filter [17, 18]. In the sampling process, aliasing is a common problem that must be removed, which is done with a low-pass filter (LPF) known as an anti-aliasing filter. In ECG signal processing, two LPFs are used to avoid aliasing: an analog LPF and a digital decimation LPF. The analog anti-aliasing LPF is placed before the A/D converter for the oversampling process, while the digital decimation LPF is located after digitization for the down-sampling process. If both filters have weak stopband attenuation, aliasing can appear [17].

The analog LPF at the front of an A/D converter avoids aliasing by limiting the spectrum of the input ECG signal to the limit set by the Nyquist criterion. The analog LPF performs three more functions: it offers a flat frequency response in the passband, minimizes non-linear phase response, and places the oversampling cut-off much higher than 150 Hz. Similarly, the decimation filter provides a flat passband response and sets the upper cut-off frequency at 150 Hz with 3-dB attenuation [17]. However, implementing an analog filter requires low-tolerance resistors and capacitors. When realized in VLSI, such filters require considerable chip area and increase process and circuit complexity. Resistors implemented in VLSI circuits show vast process variation, and because the quality factor and resonant frequency of a filter depend on component values, process variations result in large circuit variations. Parasitic capacitances can also affect circuit performance; they significantly affect high-impedance nodes due to their small values [19].

Various noise removal techniques have been proposed over the last few decades to suppress noises and artifacts in the ECG signal; a detailed description of these techniques is provided in Table 2. Filters are attractive tools for ECG signal pre-processing and denoising. Digital filters are preferred over analog filters for removing noise and artifacts [20, 21] because they offer design flexibility. A digital filter can be implemented in a software environment before being realized in hardware; any change in filter characteristics can be achieved by merely changing the filter coefficients, i.e., by tweaking the program code. Once the performance is satisfactory, the digital filter can be realized in hardware. Designing an analog filter, by contrast, requires physically rebuilding the circuit, which demands more time and cost. Unlike analog filters, digital filters are immune to environmental conditions and aging effects because their operation depends on numerical computation rather than the electrical characteristics of components [22]. The cutoff frequency of a digital filter can be specified with excellent precision, whereas in analog filtering a 5% deviation in cutoff frequency is accepted [22, 23]. These virtues make digital filters suitable for analyzing very low-frequency signals. Researchers use many types of filters for noise removal, such as the low-pass filter (LPF), high-pass filter (HPF), band-pass filter (BPF), and median, notch, adaptive, Savitzky–Golay (S–G), and moving average (MA) filters.

Table 2 Overview of studies conducted on ECG signal denoising

The LPF is a universally accepted method for ECG denoising [24]. It removes the high-frequency components of an ECG waveform and leaves a significant portion of the waveform for further processing. Popular cut-off frequencies of low-pass filters in ECG signal analysis are 11 Hz, 30 Hz, 35 Hz, 50 Hz, and 90 Hz. A low-pass filter removes high-frequency noises like PLI and EMG but, more importantly, also attenuates high-frequency components of the ECG signal such as the QRS complex, pacemaker spikes, and the J-wave [62,63,64]. To preserve these useful high-frequency components, the American Heart Association (AHA) has changed its recommended low-pass cut-off frequency from 35 Hz to 150 Hz for adults and 250 Hz for children [17, 65]. The effect of an LPF on ECG record 203, taken from the MIT-BIH arrhythmia database, is shown in Fig. 5: Fig. 5a shows the original record and its low-pass filtered version, while Fig. 5b shows the spectra of the original and filtered signals. The power is normalized in the figures, and the cut-off frequency of the LPF is about 11 Hz. Although the LPF removes high-frequency noises like PLI, it also attenuates the ECG amplitude and distorts some significant ECG characteristics.

Fig. 5 Effect of LPF on ECG signal: a input ECG signal and its low-pass filtered version, b spectrum of the input ECG signal and the low-pass filtered ECG signal

Because the low-pass filter significantly affects the ECG signal, various researchers proposed the high-pass filter as an alternative for noise elimination. Being simple and easily implementable, the HPF is used to remove baseline wander, DC offset, and drift [66]. HPFs with cut-off frequencies of 0.05 Hz, 0.5 Hz, 1 Hz, and 2.2 Hz are frequently used for this purpose. The effect of an HPF on MIT-BIH record 203, in both the time domain and the frequency domain, is demonstrated in Fig. 6; the cut-off frequency for this high-pass filtering operation is 5 Hz. Figure 6 shows that the HPF removes the DC offset and minimizes the baseline wander noise. Since the heart's information lies within a limited frequency range, researchers are also interested in band-pass filtering for pre-processing. The BPF eliminates different kinds of noise, such as baseline wander, EMG, PLI, and other low- and high-frequency components. Various authors use different frequency ranges for the BPF: 0.5–40 Hz, 1–30 Hz, 0.05–40 Hz, and 1–100 Hz [67,68,69,70]. The response of a typical BPF with a 5–15 Hz passband for MIT-BIH record 203 is shown in Fig. 7. The time-domain and frequency-domain responses in Fig. 7 show that the BPF enhances the QRS complex characteristics by eliminating low- and high-frequency noises.
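As an illustration, all three filters discussed above can be realized with zero-phase Butterworth designs from SciPy; the cut-offs below follow the examples in the text (11 Hz LPF, 5 Hz HPF, 5–15 Hz BPF), while the filter order and the 360 Hz sampling rate are assumptions.

```python
from scipy.signal import butter, filtfilt

FS = 360.0  # MIT-BIH arrhythmia database sampling rate

def lowpass(x, fc=11.0, order=4):
    b, a = butter(order, fc / (FS / 2), btype='low')
    return filtfilt(b, a, x)  # filtfilt gives zero-phase filtering

def highpass(x, fc=5.0, order=4):
    b, a = butter(order, fc / (FS / 2), btype='high')
    return filtfilt(b, a, x)

def bandpass(x, f1=5.0, f2=15.0, order=4):
    b, a = butter(order, [f1 / (FS / 2), f2 / (FS / 2)], btype='band')
    return filtfilt(b, a, x)
```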

Fig. 6 Effect of HPF on ECG signal

Fig. 7 Effect of BPF on ECG signal

Along with many advantageous features, the band-pass filter also suffers from some disadvantages. Two limitations are important. First, the BPF output may contain artifacts and ripple due to low-frequency components. Second, it is challenging to select cut-off frequencies that do not overlap with the desired ECG signal [71]. In [28], Mourad combined a group sparsity and singular spectrum algorithm (GSSSA) with a BPF for ECG signal denoising. The block diagram of a typical filtering method, based on low-pass and high-pass filtering for ECG signal analysis, is shown in Fig. 8.

Fig. 8 The filtering method of ECG signal analysis

The filters mentioned above remove a range of frequencies, but sometimes it is necessary to remove a single frequency. A notch filter, which is a band-stop filter with a very narrow bandwidth, is popularly used to remove a single frequency component such as PLI (50 or 60 Hz) [7, 72]. A good notch filter should attenuate the targeted frequency and preserve the rest, and various research groups use notch filters with a 50 or 60 Hz center frequency to remove power line interference. Although a notch filter preserves the other frequencies, it produces unusual ringing on rapidly changing waveforms.
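A notch of this kind can be sketched with SciPy's iirnotch design; the quality factor below, which sets the notch bandwidth, is an assumed value.

```python
from scipy.signal import iirnotch, filtfilt

def remove_pli(x, fs=360.0, f0=60.0, q=30.0):
    """Suppress a single power-line component with a narrow notch."""
    b, a = iirnotch(f0 / (fs / 2), q)  # notch frequency normalized to Nyquist
    return filtfilt(b, a, x)
```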

Notch filters are suitable for removing single-frequency noise, but they cannot remove other noises simultaneously [73]. The effect of the notch filter on MIT-BIH arrhythmia record 203 is shown in Fig. 9. The ECG signal also contains impulsive noise, whose presence may lead to false detection of the QRS complex or R-peak. Conventional filters like the LPF and HPF cannot remove impulsive noise, so many researchers propose the median filter, a non-linear digital filter used to remove noise from images and signals, as an attractive tool for removing impulsive noise while preserving signal edges.

Fig. 9 Effect of notch filter on ECG signal

Median filtering has unique advantages over linear filtering techniques: a linear filter cannot handle impulsive noise (sharp discontinuities of small duration) without altering the signal characteristics [74]. The median filter operates on the signal entry by entry, replacing each entry with the median of its neighboring entries. The neighborhood pattern, called a window, slides over the entire signal. Generally, median filters with a window size of 200–600 ms are used to restore the baseline of an ECG signal. In the median filter, the output at the current point depends on both past and future values. A median filter removes baseline wander by assuming that the ECG signal and the baseline wander have different amplitude distributions within the window. The main drawbacks of the median filter are its long computational time for wide windows and its complex behavior.
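A common realization of this idea, sketched below, estimates the baseline with cascaded median filters and subtracts it; the 200 ms and 600 ms windows are assumptions within the range quoted above.

```python
from scipy.signal import medfilt

def remove_baseline_median(ecg, fs=360):
    """Estimate the baseline with cascaded median filters and subtract it."""
    w1 = int(0.200 * fs) | 1  # ~200 ms window, forced to be odd
    w2 = int(0.600 * fs) | 1  # ~600 ms window, forced to be odd
    baseline = medfilt(medfilt(ecg, w1), w2)
    return ecg - baseline
```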

During filtering, preserving the shape of the ECG waveform is essential because the shape carries crucial information about cardiac health. The Savitzky–Golay (S–G) filtering scheme removes noise while preserving the waveform shape. This scheme, based on least-squares polynomial approximation, has drawn the attention of many researchers: by preserving the peak height and width of the waveform in a noisy environment, S–G filtering is an attractive choice for ECG signal analysis. S–G filters are low-pass filters obtained by fitting a polynomial to a sequence of input samples and evaluating that polynomial at a point within the selected interval. An extremely flat passband and moderate stopband attenuation help the S–G filter achieve excellent results. The computational time is proportional to the window width, so the window width must be selected appropriately. S–G filters are useful in applications where the signal spectrum and noise overlap, which makes them well suited to removing baseline drift in the ECG signal. The denoising accuracy of an S–G filter depends on the frame length and the order of the polynomial; both are determined by experimentation, which is a disadvantage of the S–G filter [75].
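SciPy's savgol_filter implements this scheme directly; as noted above, the frame length and polynomial order must be tuned experimentally, so the values below are only illustrative assumptions.

```python
from scipy.signal import savgol_filter

def sg_denoise(ecg, window_length=15, polyorder=3):
    """Least-squares polynomial smoothing that preserves peak
    height and width; window_length must be odd and > polyorder."""
    return savgol_filter(ecg, window_length, polyorder)
```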

All the filters mentioned above require some prior knowledge of the signal or noise. Based on this knowledge, each filter is designed for a particular task and is thus categorized as a fixed filter; a fixed filter requires a new design whenever the input or any other condition changes. The adaptive filter resolves this problem by automatically adjusting its coefficients according to the current requirement. Unlike a fixed filter, an adaptive filter requires little or no prior knowledge of the input or the noise [76]. The adaptive filter minimizes the mean squared error between the primary input and a reference signal. Generally, the primary input is the noisy ECG signal, and the reference is either the noise, a signal correlated with the noise, or a signal correlated with the ECG in the primary input. Easy implementation on modern hardware or microcontrollers with digital numerical capabilities makes adaptive filters suitable for the digital environment [77]. Depending on the requirements, algorithms such as least mean square (LMS), normalized least mean square (NLMS), recursive least square (RLS), sign-LMS, and sign-sign LMS are used to design adaptive filters. The LMS algorithm has the advantages of implementation simplicity and the ability to track the statistical changes of non-stationary signals, whereas the RLS algorithm offers a faster convergence rate at the cost of increased computational complexity. A minimal LMS noise canceller is sketched below.
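The sketch assumes a reference input correlated with the noise; the filter order and the step size mu are illustrative assumptions.

```python
import numpy as np

def lms_cancel(primary, reference, order=8, mu=0.01):
    """LMS adaptive noise canceller: 'primary' is the noisy ECG,
    'reference' is correlated with the noise; the error signal
    approximates the cleaned ECG."""
    w = np.zeros(order)                   # adaptive filter weights
    out = np.zeros(len(primary))
    for i in range(order, len(primary)):
        x = reference[i - order:i][::-1]  # most recent samples first
        e = primary[i] - np.dot(w, x)     # error = cleaned ECG sample
        w += 2 * mu * e * x               # LMS weight update
        out[i] = e
    return out
```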

Based on the computational unit used to implement them, adaptive filters are categorized as linear or non-linear. In a linear adaptive filter, the output is a linear combination of the observations applied to the filter input, with a single computational unit per output; linear adaptive filtering cannot exploit the higher-order statistics of the input data. Non-linear adaptive filtering, on the other hand, uses non-linear computational units to explore the complete information contained in the input signal, but these non-linear components make the mathematical analysis much more complicated than for linear methods. Adaptive filtering is useful for removing motion artifacts, PLI, baseline wander, and EMG noise [77]. Although it has numerous advantages, its main drawback is the need for a reference signal, whose choice may significantly affect the efficiency of the method [78, 79].

Many researchers use the moving average filter in ECG signal processing due to its simplicity and ease of use. The moving average filter is a finite impulse response filter that reduces random noise while maintaining a sharp step response. The idea is to take samples from the input at predefined intervals and average them to produce each output; repeating this over the entire record traces out a line known as the moving average. The moving average filter loses interbeat information due to averaging. It is well suited to time-domain encoded signals but unsuitable for frequency-domain encoded signals, as it cannot separate one frequency band from another [80]. While denoising, preserving the edges of the ECG signal is essential, so some researchers found non-local mean (NLM) filtering suitable for denoising the ECG signal. NLM filtering was initially used for 2D image processing; later, various researchers used it to denoise biomedical signals like ECG and EEG [24, 81]. Tracey and Miller first used the non-local mean denoising technique for ECG signal denoising [46].

Non-local mean filtering is a patch-based method that computes each output as a weighted sum over patches, using both neighboring and non-neighboring patches to calculate the weights; the noise is filtered out based on this weighted sum. Although NLM filtering provides good denoising results and preserves the edges of the ECG signal, its computational complexity increases the computational cost [46]. Lee and Hwang [24] combined patch-based NLM filtering with an HPF and LPF to denoise selected ECG records from the MIT-BIH database and achieved an average SNR improvement of 7.67 dB.

The filtering techniques mentioned above are off-line methods in which the ECG signal is recorded first and then denoised to improve its quality. In real-time applications, however, where ECG data are provided by wearable sensors and transmitted to mobile devices, these methods are inefficient; real-time ECG signal processing requires computationally efficient filtering schemes. A recursive filter (RF) is efficient in both computing time and memory usage [27]. A recursive filter is an infinite impulse response (IIR) filter whose output is a linear combination of the present input and previous inputs and outputs. Recursive filtering provides low computational cost, fast operation, steep selectivity, and large gain with few sections. The main drawback of this method is its practical implementation, which poses a nonconvex optimization problem [82]. Cuomo et al. [27, 29] used recursive filtering for denoising the ECG signal.

Another edge-preserving denoising method is the Kalman filter (KF), a powerful tool for estimating the hidden state of a system from a dynamic model and measured data. The KF deals effectively with noisy data and random external disturbances, but it assumes a linear relationship between the system dynamics and the measured data. As most systems are non-linear, variants of the KF are used to analyze them. The extended Kalman filter (EKF) extends the KF to non-linear systems by linearizing the non-linear system model around the previously estimated points; unlike the KF, the EKF is not an optimal filter. Its popular variants are the unscented Kalman filter (UKF) and the extended Kalman smoother (EKS). The UKF uses the unscented transform (UT) to denoise an ECG signal; however, the covariance and sensitivity matrices estimated with the UT are only semidefinite, so it is difficult to realize a numerically stable UKF system for noise cancellation in an ECG signal.

The EKS denoising method is a non-causal approach, as it utilizes future observations to estimate the present state. EKS consists of two stages: a forward EKF stage and a backward recursive smoothing stage. Its non-causal nature gives the EKS better performance than the EKF [83]. Denoising techniques based on the EKF, UKF, and EKS provide good denoising results; however, these methods sometimes require operator interaction to initialize parameters such as the amplitude, phase, and width used to estimate the ECG signal [30]. Panigrahy and Sahu [30] proposed EKS combined with differential evolution (DE) for denoising, but this approach may perform worse at low ECG sampling rates. Using the various filtering techniques above, researchers have achieved ECG detection accuracies greater than 99%.
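For intuition, the predict/update recursion underlying all these variants can be illustrated with a scalar Kalman filter on a random-walk state model; this is a toy sketch, not the ECG-specific EKF/EKS of [30], and the process and measurement variances q and r are assumed values.

```python
import numpy as np

def kalman_1d(z, q=1e-4, r=1e-2):
    """Scalar Kalman filter with a random-walk state model."""
    x, p = z[0], 1.0                  # state estimate and its variance
    out = np.empty(len(z))
    for i, meas in enumerate(z):
        p = p + q                     # predict: variance grows by q
        k = p / (p + r)               # Kalman gain
        x = x + k * (meas - x)        # update with the measurement
        p = (1 - k) * p
        out[i] = x
    return out
```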

Generally, Butterworth filters are preferred for medical applications due to their maximally flat magnitude response, low computational cost, and accuracy; an IIR Butterworth filter has a better frequency response than a finite impulse response (FIR) filter [84]. Filtering techniques nevertheless suffer from drawbacks such as the ringing effect and the lack of information about the signal's frequency content, and they can distort the ECG waveform. Frequency-domain techniques remove some of these drawbacks. Popular frequency-domain techniques, which provide the frequency content of the signal, are the discrete Fourier transform (DFT), the fast Fourier transform (FFT), and the discrete cosine transform (DCT). The earliest is the DFT, which converts time-domain samples into the frequency domain; the DFT is not a function of a continuous variable but a sequence. It decomposes the signal into orthogonal sine and cosine functions of different frequencies, and these individual components can be analyzed and processed more easily than the original signal [85]. For N samples x(0), x(1), x(2), …, x(k), …, x(N−1) of a signal x(n), the N-point DFT of the signal is given by (1).

$$X\left( k \right) = \sum\limits_{n = 0}^{N - 1} {x\left( n \right)} W_{N}^{kn} ,\,0 \le k \le N - 1$$
(1)

Here \(W_{N} = e^{{\frac{ - j2\pi }{N}}}\) is known as the twiddle factor.

The DFT is a powerful tool for obtaining the spectral information of a signal, and because it deals with finite data points, it is easy to implement on computers with numerical algorithms. However, the DFT involves many calculations, so a fast computational algorithm, the fast Fourier transform (FFT), was developed. The FFT is a tool to perform the DFT efficiently [86]. The DFT requires N² multiplications and N(N−1) additions for an N-point data sequence, whereas the FFT requires only \(\frac{N}{2}\log_{2} N\) multiplications and \(N\log_{2} N\) additions [29, 87, 88]. Using the FFT algorithm, Kumar et al. [88] achieved a sensitivity of 99.65%, and Noor et al. [89] achieved low energy consumption.
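A short NumPy sketch of this FFT-based spectral analysis (the 360 Hz sampling rate is an assumption appropriate for MIT-BIH records):

```python
import numpy as np

def spectrum(ecg, fs=360.0):
    """Magnitude spectrum of an ECG segment via the FFT,
    O(N log N) instead of the O(N^2) direct DFT."""
    n = len(ecg)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    mag = np.abs(np.fft.rfft(ecg)) / n
    return freqs, mag
```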

Figure 10 shows a frequency-domain method of analyzing ECG signals in which an analysis filter bank pre-processes the signal. When operated on non-stationary signals, the DFT/FFT fails to provide information about instantaneous frequency; furthermore, the DFT/FFT cannot be applied to a multichannel signal. Another frequency-domain technique, the discrete cosine transform (DCT), expresses a discrete time-domain signal as a sum of cosine functions of different frequencies. Due to its energy compaction property, the DCT is used for ECG compression [90,91,92]. The major disadvantage of the DCT is the quantization step required to obtain an integer-valued output [93].

Fig. 10 ECG signal analysis using filter bank method

Although these frequency-domain techniques provide spectral information about an ECG signal, they do not provide temporal information. The ECG signal, being non-stationary, possesses highly complex time–frequency characteristics and cannot be analyzed with time-domain or frequency-domain techniques alone. The short-time Fourier transform (STFT), introduced by Gabor in 1946, combines time and frequency analysis and thus enables a comprehensive analysis of an ECG signal [94]. The STFT provides time-localized frequency information for situations in which the frequency content of a signal varies with time [94, 95]. In the STFT, a moving, fixed-width time window extracts multiple frames of the signal, and the Fourier transform is then applied to obtain the frequency information of each frame; the window is kept narrow so that each frame appears stationary. The STFT thus eliminates the limitation of the Fourier transform by providing both time and frequency information. Xie et al. [96] achieved a classification accuracy of 98.4% using the STFT. The main limitation of the STFT is the trade-off between time and frequency resolution: a narrow time window provides good time resolution but degrades the frequency resolution.

Similarly, a broader time window degrades the time resolution but improves the frequency resolution, and a fixed window length limits the non-stationary information that can be extracted from the signal. The wavelet transform reduces these limitations of the STFT by varying the window length, improving the time–frequency resolution. The flexible window length makes it possible to capture long, low-frequency information and short, high-frequency information simultaneously. In the wavelet transform, the signal is represented by a set of basis functions known as wavelets. A wavelet is a waveform of limited duration and zero average value; a function satisfying the following criteria can be used as a wavelet [97]:

(i) It must have finite energy.

(ii) It must have no zero-frequency component; that is, it must have zero mean.

(iii) For a complex function, the Fourier transform must be real and vanish for negative frequencies.

Various researchers use different wavelets, such as the Daubechies, Symlet, Haar, biorthogonal, Coiflet, and Meyer wavelets, to decompose an ECG signal; the choice of wavelet is determined by the application [40]. The wavelet transform can be classified into the discrete wavelet transform (DWT) and the continuous wavelet transform (CWT). The DWT decomposes the signal into a set of functions that are orthogonal to their translations and scalings. The CWT, on the other hand, produces an output one dimension larger than the input signal; it uses non-orthogonal wavelets, which yield highly correlated output values. The non-orthogonal wavelets of the CWT improve the visualization of signals in higher dimensions but are not very useful for classification. Sabherwal et al. [98], Sahoo et al. [99], and Banerjee et al. [100] used Daubechies-6 (db6); Rakshit and Das [101] used the db10 wavelet; Park et al. [102] used the Symlet wavelet (sym5); and Li et al. [103] used a quadratic spline wavelet as the mother wavelet for ECG signal denoising. Sabherwal et al. [98], Sahoo et al. [99], Banerjee et al. [100], and Li et al. [104] achieved sensitivities greater than 99.5% using the wavelet transform, while Rakshit and Das [101], Park et al. [102], Yochum et al. [105], and Sabherwal et al. [106] attained sensitivities of 99.93%, 99.93%, 99.87%, and 99.99%, respectively.
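A typical DWT denoising pass, assuming the PyWavelets package, soft-thresholds the detail coefficients with the universal threshold; db6 follows the studies above, while the decomposition level and the threshold rule are assumptions.

```python
import numpy as np
import pywt  # PyWavelets is assumed to be installed

def dwt_denoise(ecg, wavelet='db6', level=8):
    """Soft-threshold the detail coefficients and reconstruct;
    'level' must respect the record length and wavelet support."""
    coeffs = pywt.wavedec(ecg, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745  # noise estimate from finest scale
    thr = sigma * np.sqrt(2.0 * np.log(len(ecg)))   # universal threshold
    coeffs[1:] = [pywt.threshold(c, thr, mode='soft') for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(ecg)]
```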

The principle of the wavelet transform for ECG signal analysis is represented in Fig. 11. The wavelet decomposer splits the ECG signal into wavelet coefficients, from which the denoised ECG signal is reconstructed for further analysis, such as heart rate variability. The effect of the wavelet transform on a typical MIT-BIH record (record 109) is shown in Fig. 12, which demonstrates that the wavelet transform successfully removes baseline wander, EMG, and other noise. Although the wavelet transform has many advantages over techniques like filtering and Fourier transforms, it still has drawbacks. First, it cannot capture edges adequately. Second, there is a trade-off between accuracy and computational time. A significant drawback is its low directional selectivity, and the selection of the basis function is also a rigorous task. Some drawbacks of the wavelet transform are removed by empirical mode decomposition (EMD), which decomposes the signal into intrinsic mode functions (IMFs) [49,50,51, 107]. The basic concept of EMD is to identify the proper time scales that reveal the physical characteristics of the signal and then decompose the signal into modes intrinsic to the function, referred to as IMFs. An IMF is a signal that satisfies the following criteria:

(i) Over the whole data set, the number of extrema and the number of zero crossings must be equal or differ by at most one.

(ii) At every point, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero.

Fig. 11 ECG signal analysis using the wavelet transform

Fig. 12 a Input ECG signal, b denoised ECG signal

The number of IMFs depends on the length of the ECG segment: a long segment produces a large number of IMFs. EMD is an iterative process whose iterations can be made to converge by imposing conditions on quantities such as the standard deviation, the amplitude of the remaining signal, the mean value of the envelope, and the cross-correlation coefficient between the original and remaining signals. EMD is a model-free, entirely data-driven method that naturally copes with non-stationarity and non-linearity. EMD-based algorithms are helpful for removing baseline wander and high-frequency noise. A typical EMD algorithm is a multi-step process that is simple to implement and relies on a set of simple equations to extract various features of an ECG signal; EMD is also one of the most relevant techniques for removing respiratory signals from single-channel ECG recordings [108, 109]. Rakshit and Das [51] and Kabir and Shahnaz [52] utilized EMD for ECG denoising and achieved SNR improvements of 9.29 dB and greater than 6 dB, respectively. The drawbacks of the EMD technique are its lack of theoretical background and mode mixing.
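A minimal EMD-based denoising sketch, assuming the PyEMD package; discarding the first two IMFs is an illustrative choice, exploiting the fact noted below that the noise concentrates in the lowest-order (highest-frequency) IMFs.

```python
import numpy as np
from PyEMD import EMD  # the PyEMD package is assumed here

def emd_denoise(ecg, drop=2):
    """Decompose into IMFs and rebuild the signal without the
    first (highest-frequency, noise-dominated) modes."""
    imfs = EMD().emd(ecg)
    return np.sum(imfs[drop:], axis=0)  # partial reconstruction
```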

The block diagram of an EMD-based ECG denoising system is shown in Fig. 13. The EMD method decomposes the ECG signal into many IMFs; the IMFs corresponding to noise are discarded to obtain a clean ECG signal, which is then used to detect events such as the P-wave, QRS complex, R peak, and T wave. A typical ECG record taken from the MIT-BIH database and the IMFs obtained from the EMD operation are presented in Fig. 14: the ECG signal is decomposed into eight IMFs and a residue signal, and from IMF1 to the residue the oscillatory behavior decreases continuously. The lower-order IMFs represent the high-frequency components of the signal, over which the noise is spread. EMD suffers from a significant drawback known as mode mixing, in which oscillations from different time scales appear in a single IMF, or oscillations of the same scale appear in different IMFs [110]. Like the wavelet transform, EMD also fails to preserve edges, and its lack of a theoretical framework is another major problem. Wu and Huang [111] introduced ensemble empirical mode decomposition (EEMD), a noise-assisted EMD algorithm, to eliminate the mode-mixing problem. In EEMD, different realizations of white noise are added to the original signal over many trials. Since the noise differs in each trial, the resulting IMFs differ from trial to trial and exhibit no correlation; if the number of trials is adequate, the added noise can be eliminated by averaging the IMFs across trials [112, 113]. The number of ensembles and the noise amplitude must be specified to define an EEMD. Jain et al. [54] achieved an SNR improvement of 10.08 dB for MIT-BIH record 115, and Chang [110], Rajesh and Dhuli [114], and Jebaraj and Arumugam [115] demonstrated EEMD to be a powerful tool for denoising an ECG signal. Although EEMD significantly improves on EMD, it still suffers from some problems [111]: (i) each trial produces a set of IMFs, and the sum of these IMFs need not be a true IMF; (ii) the IMFs do not provide any information on handling multi-mode distributions; and (iii) it has higher computational complexity.

Fig. 13 ECG signal analysis with the help of the EMD method

Fig. 14 Input ECG signal and the eight IMFs after EMD decomposition

Variational mode decomposition (VMD) is an enhanced alternative to EMD for analyzing non-stationary and non-linear signals. Like EMD, VMD decomposes the signal into a set of band-limited, amplitude- and frequency-modulated oscillations known as modes; all the modes have specific sparsity properties that allow the signal to be reproduced, where the bandwidth of each mode in the spectral domain is regarded as its sparsity. The high operational efficiency of VMD rests on its robust mathematical theory, and VMD avoids information loss because it reconstructs the signal well from its decomposed modes. Maji et al. [116] used VMD for QRS detection and achieved a sensitivity of 99.46% on the MIT-BIH database. VMD has advantages over EMD: the EMD algorithm involves extrema finding, interpolation, and a stopping criterion, so any false maxima detection may produce a wrong decomposition, whereas VMD decomposes the signal around the center frequencies of the modes. EMD may also decompose the ECG signal into unnecessary modes because the decomposition level is not under the user's control, while VMD offers a controlled decomposition.

The most advantageous feature of VMD is the center frequency of each mode, which helps characterize the modes; VMD also provides features like the phase angle, which helps categorize abnormal heart rhythms [116, 117]. Various researchers use statistical techniques like principal component analysis (PCA) and independent component analysis (ICA) to denoise the ECG signal. PCA and ICA remove the in-band noise of the ECG signal by discarding the dimensions that correspond to noise [118]. They do not perform well on single-lead ECG recordings because they rely on correlation and uncorrelatedness across channels [119]. Although the techniques mentioned above provide excellent results, their reliability for real-time applications requires extensive validation.
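As an illustration of this dimension-reduction idea, the sketch below denoises a matrix of time-aligned beats with scikit-learn's PCA; the number of retained components is an assumption, and real use would require careful beat segmentation and alignment first.

```python
from sklearn.decomposition import PCA  # scikit-learn is assumed here

def pca_denoise_beats(beats, n_components=3):
    """'beats' is an (n_beats, beat_length) array of time-aligned
    beats; keeping only the leading components discards the
    dimensions dominated by in-band noise."""
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(beats)
    return pca.inverse_transform(reduced)
```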

4 Detection Techniques

In an ECG, a complete heart cycle comprises three main events: the P-wave, the QRS complex, and the T wave, each with its characteristic peak amplitude, duration, and frequency content. These events must be detected properly and accurately to diagnose CVDs and arrhythmias. Once an event is detected, the corresponding signal can be analyzed for peak amplitude, QRS complex width, frequency content, energy, and inter-event intervals; accurate detection of the P-wave, QRS complex, and T wave thus enables accurate analysis of the ECG signal. Over the years, researchers have focused mainly on QRS complex detection, as detecting the P-wave and the T wave is considerably harder: both have low amplitude, variable morphology and amplitude, and low SNR, and the P-wave and T wave sometimes overlap. From the clinical point of view, detecting the QRS complex is also crucial because multiple premature QRS complexes indicate cardiac dysfunction. Researchers have proposed many methods for QRS complex detection [88, 120,121,122,123,124,125,126,127,128,129,130,131,132]; however, to date, no algorithm can detect all possible variations of the QRS complex, owing to its complex morphology and the noise in ECG signals.

QRS complex detection is a two-step process comprising a pre-processing stage and a decision stage. The pre-processing stage, described in Sect. 3, removes noise; the denoised signal is then passed to the decision stage. As shown in Table 3, many techniques have been proposed in the literature to detect the QRS complex and R-peak, including thresholding, zero-crossing detection (ZCD), syntactic methods, matched filters, mathematical morphology, hidden Markov models, and singularity techniques. To reduce false-positive detections, almost all algorithms add further decision rules. Two essential criteria, complexity and performance, guide the selection of a detection technique: relatively simple algorithms are used in practice, and the performance criterion rejects techniques that produce a large number of false positives even at low noise levels [14].

Table 3 Overview of studies conducted on ECG signal detection

The simplest and most widely used detection method is thresholding, in which a feature of the pre-processed signal is compared with a fixed or adaptive threshold to detect the QRS complex. Thresholding can be applied in the time domain as well as the time–frequency domain. Fixed thresholding is simple and gives good results for stationary ECG signals whose beat-to-beat morphology does not change.

However, noise and baseline wander change the beat-to-beat morphology of the ECG signal, and the probability of accurately detecting the QRS complex with a fixed threshold decreases. An adaptive threshold [94, 123, 134,135,136, 145] increases the probability of accurate QRS complex detection.

Various researchers [13, 133,134,135,136] apply filtering before thresholding to attenuate noises like PLI, baseline wander, and motion artifacts, as well as other signal components like the P-wave and T wave. Usually, a band-pass filter is used in the pre-processing stage, but other filters, such as low-pass [94, 144], high-pass, median, and moving average filters [122, 146], are also used. In [123], a high-pass filter called the MaMeMi filter removes noise from the ECG signal; using the MaMeMi filter and an adaptive threshold, Rufas and Carrabina [123] obtained a sensitivity of 99.43%. Christov [122] employed a moving average filter and a combined threshold (adaptive steep-slope threshold + adaptive integrating threshold + adaptive beat expectation threshold) for QRS complex detection. Bajaj and Kumar [162] proposed a QRS detection algorithm that combines the Stockwell transform (ST) and the fractional Fourier transform with thresholding. Although the thresholding technique is simple, setting multiple empirical thresholds is its main drawback, and when no beat appears for a long time, the search-back mechanism of a threshold-based detector is activated and can produce many false beats.
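A simplified adaptive-threshold decision stage, operating on a transformed signal such as the moving-window-integrated output of Sect. 3, might look as follows; the refractory period, initial threshold, and update weights are illustrative assumptions rather than the exact rules of any cited detector.

```python
import numpy as np

def detect_r_peaks(integrated, fs=360):
    """Adaptive-threshold peak picking on the transformed ECG."""
    refractory = int(0.200 * fs)             # ~200 ms physiological refractory period
    thr = 0.5 * np.max(integrated[:2 * fs])  # initial threshold from the first 2 s
    peaks, last = [], -refractory
    for i in range(1, len(integrated) - 1):
        is_peak = integrated[i - 1] < integrated[i] >= integrated[i + 1]
        if is_peak and integrated[i] > thr and i - last > refractory:
            peaks.append(i)
            last = i
            thr = 0.125 * integrated[i] + 0.875 * thr  # track the running peak level
    return np.array(peaks)
```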

In the wavelet-based detection methods used by various researchers [103, 105, 154], the raw ECG signal is decomposed into many coefficients, and only the coefficients that coincide with the QRS complex are selected. Wavelet-based detection has two limitations: (a) there is no universal rule for selecting the mother wavelet, and (b) the effectiveness of the method depends on the decomposition level. A wavelet-based algorithm has been realized as an integrated circuit with a detection accuracy of 99%. In [167], Coast et al. proposed a hidden Markov model for QRS complex detection. In hidden Markov modeling, the observed data sequence is characterized by a probability density function that varies with the state of an underlying Markov chain, which models the observed sequence, the waveform durations, and the intervals within each beat. The Markov chain preserves the structural properties of the underlying process, while the state parameters represent the probabilistic nature of the observed data. Hidden Markov modeling offers excellent flexibility in the selection of observation sequences; its problems are computational complexity, the many parameters that must be set, and the manual segmentation required for training before a record can be analyzed [168].

Artificial neural networks (ANNs) are useful for detecting and classifying QRS complexes. As the name suggests, an ANN is a computational algorithm inspired by biological neural networks. The virtues of ANNs, such as learning complex non-linear decision surfaces between classes, make them suitable for ECG beat detection and classification [169]. Multilayer perceptrons, self-organizing feature maps (SOFM), learning vector quantization, and radial basis function networks have all been used to process ECG signals. Xie et al. [96] proposed a convolutional neural network (CNN) for classification and achieved a detection accuracy of 98.4%. Kohler et al. [170] proposed zero-crossing detection of QRS complexes, in which a feature is obtained by counting the number of zero crossings per segment. The zero-crossing feature is largely unaffected by sudden amplitude changes in the ECG signal, which makes it robust to noise: amplitude fluctuations in problematic sections of an ECG signal do not affect the zero-crossing count, significantly improving detection performance. The zero-crossing detection method also has the advantages of simplicity and low computational cost.

In [144], mathematical morphology (MM) is proposed to extract the QRS fiducial points. MM extracts topological information from the analysis of geometrical structure; an MM operator non-linearly transforms the signal by means of a second signal called the structuring element (SE), which is used to detect QRS complexes. R peak detection using zero-crossing concepts was reported in [138, 139, 142], where the properties of the Hilbert transform (HT) and zero-crossing locations are used to detect the R peak: the odd-symmetry property of the HT yields a zero at each maximum of the input, i.e., wherever the input signal crosses the zero axis, and these zero-crossing locations are used to locate the true position of the R peak. Li et al. [161] proposed a QRS detection method based on coordinate mapping with phase space reconstruction, and Tang et al. [160] employed a delta modulator with local maximum point (LMaP) and local minimum point (LMiP) algorithms to detect the QRS complex, P-wave, and T wave. Many QRS and R peak detection techniques have been proposed, but none achieves 100% accuracy across all records of all the standard datasets, and the clinical applicability of many remains doubtful.

Another important factor affecting the performance of QRS complex detection algorithms is the sampling rate and sampling resolution of the ECG data. Sampling converts a continuous-time signal into a discrete-time signal; according to the Nyquist theorem [171], the sampling frequency must be at least twice the maximum frequency of the signal being digitized. If the sampling rate is too low, high-frequency components are superimposed on low-frequency components, introducing error into the reconstructed signal, so a proper sampling frequency must be chosen to avoid degrading the accuracy of QRS complex detection algorithms [172]. The maximum heart rate is at most about 220 beats per minute (bpm), corresponding to a frequency below 4 Hz, but the spectrum of the heartbeat signal extends up to 15 Hz, with some features beyond 25 Hz; hence, the Nyquist criterion requires a minimum sampling frequency of 50 Hz. In practice, real ECG recorders sample at 100–500 Hz, and laboratory systems go up to 1000 Hz [172]. In 1975, the American Heart Association (AHA) recommended a minimum sampling rate of 500 Hz with uniform sampling, generally two to three times the theoretical minimum [173]. According to the AHA task force [174], a low sampling rate may introduce jitter into R peak detection, significantly altering the spectrum; the optimal sampling frequency for R peak detection is 250–500 Hz or more. The sampling resolution (bit depth) also affects the accuracy of heartbeat detection [172], since finite resolution produces quantization noise that leads to beat detection errors. Ajdaraga et al. [172] showed that a resolution of 8 bits or less is unacceptable for QRS complex detection, especially at high sampling frequencies; a 10-bit resolution gives acceptable results, but accurate analysis requires 12 bits [172].
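The resolution effect can be probed by re-quantizing a record to a given bit depth and re-running a detector; a minimal sketch, in which the full-scale range is an assumption derived from the record itself:

```python
import numpy as np

def quantize(ecg, bits=10, vmax=None):
    """Simulate an ADC of the given bit depth over a +/- vmax range."""
    vmax = np.max(np.abs(ecg)) if vmax is None else vmax
    step = 2.0 * vmax / (2 ** bits)  # quantization step size
    return np.round(ecg / step) * step
```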

5 Databases

ECG databases play a significant role in the development of any ECG signal analysis algorithm. These databases provide a wide range of annotated ECG signals recorded under different conditions, and researchers select among them, depending on the application, to evaluate the performance of proposed algorithms. Details of some important databases are available in [175]; Table 4 summarizes these databases along with their salient features.

(a) Massachusetts Institute of Technology–Beth Israel Hospital Arrhythmia database (MIT-BIH) is a popular database consisting of 48 half-hour, two-channel ECG recordings from 47 subjects, recorded at Boston's Beth Israel Hospital between 1975 and 1979. Of the 47 subjects, 25 were men aged 32 to 89 years and 22 were women aged 23 to 89 years; about 60% of the recordings are from inpatients and 40% from outpatients. The sampling frequency of 360 samples per second facilitates the implementation of 60 Hz notch filters [24,25,26, 138].

(b) The American Heart Association database (AHA)—The American Heart Association, with funding from the National Heart, Lung, and Blood Institute (NHLBI), motivated the development of a new ECG database at Washington University, known as the AHA database. It includes 80 two-channel recordings digitized at 250 Hz per channel over a 10 mV range with 12-bit resolution. The database comes in two versions: in the short version, each 30-min annotated recording is preceded by 5 min of unannotated recording, while in the long version, 2.5 h of unannotated recording precedes each annotated part [118, 134, 176, 177].

(c) Creighton University (CU) Ventricular Tachyarrhythmia database consists of 35 eight-minute ECG recordings of subjects suffering from ventricular tachycardia, ventricular flutter, or ventricular fibrillation. All records are digitized at 250 Hz with 12-bit resolution over a 10 mV range [176, 177].

(d) BIDMC Congestive Heart Failure database—It includes long-term ECG records from 15 subjects: 11 men aged 22 to 71 years and 4 women aged 54 to 63 years. All records are digitized at 250 Hz with 12-bit resolution over a ± 10 mV range [40, 149].

(e) QT database—This database is a collection of ECG recordings drawn from existing databases: thirteen records from the MIT-BIH Supraventricular Arrhythmia database, fifteen from the MIT-BIH Arrhythmia database, thirty-three from the European ST-T database, six from the MIT-BIH ST Change database, ten from the MIT-BIH Normal Sinus Rhythm database, four from the MIT-BIH Long-Term database, and twenty-four records from sudden death patients at Boston's Beth Israel Deaconess Medical Center. In total, the database contains 105 two-channel recordings, chosen to avoid baseline fluctuations and other artifacts [135, 141, 147, 156, 158, 160, 163].

(f) European ST-T Database—Ninety annotated ECG records from seventy-nine subjects are present in this database: seventy men aged 30 to 84 years and eight women aged 55 to 71 years. Each record consists of two signals of 2 h duration, digitized at 250 samples per second with 12-bit resolution over a 20 mV input range [120, 135].

(g) Fantasia Database consists of forty ECG recordings: twenty from young subjects aged 21 to 34 years and twenty from elderly subjects aged 68 to 85 years. Each recording is 120 min long, captured while the subjects were at rest watching the movie Fantasia. The sampling rate for digitization is 250 Hz [41, 120, 121, 135].

(h) PTB Diagnostic Database—This database is a collection of 549 ECG records from 290 subjects: 209 men with a mean age of 55.5 years and 81 women with a mean age of 61.6 years. Signals are sampled at 1000 Hz with 16-bit resolution over a ± 16.384 mV range [40, 43, 45].

(i) Long-Term ST Database—This database includes 86 long-term ECG records from 80 subjects. The records are 21 to 24 h long and contain two or three ECG signals, sampled at 250 samples per second over a ± 10 mV range with 12-bit resolution [47, 161].

(j) Non-Invasive Fetal ECG Arrhythmia Database (NIFEA DB) consists of 55 recordings taken from a single pregnant subject over a period of twenty weeks.

(k) ECG-ID Database—It includes 310 recordings from 90 subjects: 44 men and 46 women drawn from various groups (students, colleagues, and friends of the author). The ECG signals are sampled at 500 Hz with 12-bit resolution over ± 10 mV [37].

  12. (l)

St. Petersburg Institute of Cardiological Technics (INCART) Database—It includes 75 annotated recordings. Each record is 30 min long and covers the 12 standard leads. Seventeen men and fifteen women aged between 18 and 80 years were selected for recording. The sampling rate for each ECG record is 257 Hz [146].

  13. (m)

Apnea-ECG database—It is a dataset of 70 records whose lengths vary from just under 7 h to nearly 10 h. Each record includes a continuous digitized ECG signal, a set of apnea annotations, and a set of machine-generated QRS annotations. The seventy recordings are divided equally into a learning set and a test set of 35 recordings each [39, 121].

Table 4 Overview of various databases used in ECG signal processing
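Most of the databases summarized above are distributed through PhysioNet and can be read programmatically. The following is a minimal sketch using the open-source `wfdb` Python package (assumed installed, e.g., via `pip install wfdb`); record `'100'` and the `'mitdb'` directory refer to the MIT-BIH Arrhythmia database as hosted on PhysioNet.

```python
# A minimal sketch of reading a PhysioNet record with the wfdb package.
import wfdb

# Fetch record 100 of the MIT-BIH Arrhythmia database directly from PhysioNet,
# together with its reference beat annotations (extension 'atr').
record = wfdb.rdrecord('100', pn_dir='mitdb')
annotation = wfdb.rdann('100', 'atr', pn_dir='mitdb')

ecg = record.p_signal[:, 0]         # first channel, in physical units (mV)
fs = record.fs                      # sampling frequency (360 Hz for this database)
r_locations = annotation.sample     # annotated beat positions (sample indices)

print(f"{len(ecg)} samples at {fs} Hz, {len(r_locations)} annotated beats")
```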

6 Parameters to Evaluate the Performance of an ECG Algorithm

Various parameters are used in the literature to evaluate the ECG algorithms proposed by different researchers. These success measures are an integral part of ECG signal analysis: they not only evaluate the algorithms but also provide a medium for comparing the various proposed algorithms. The different evaluation parameters are presented in this section. In the following equations, S(n), X(n), X′(n), and Y(n) represent a corrupted signal, the original signal, a pre-processed signal, and a denoised signal, respectively. N is the length of the ECG signal.

  1. (i)
$${\text{Input signal to noise ratio}},\;SNR_{input} = 10\log_{10} \left( {\frac{{\sum\nolimits_{n = 1}^{N} {X^{2} \left[ n \right]} }}{{\sum\nolimits_{n = 1}^{N} {\left( {S\left[ n \right] - X\left[ n \right]} \right)^{2} } }}} \right)$$
    (2)
  2. (ii)
$${\text{Output signal to noise ratio}},\;SNR_{out} = 10\log_{10} \left( {\frac{{\sum\nolimits_{n = 1}^{N} {X^{2} \left[ n \right]} }}{{\sum\nolimits_{n = 1}^{N} {\left( {X\left[ n \right] - Y\left[ n \right]} \right)^{2} } }}} \right)$$
    (3)
  3. (iii)
    $${\text{Improvement in signal to noise ratio}},SNR_{imp} = SNR_{out} - SNR_{input} = 10\log_{10} \left( {\frac{{\sum\nolimits_{n = 1}^{N} {\left( {S\left[ n \right] - X\left[ n \right]} \right)^{2} } }}{{\sum\nolimits_{n = 1}^{N} {\left( {Y\left[ n \right] - X\left[ n \right]} \right)^{2} } }}} \right)$$
    (4)
  4. (iv)
    $${\text{Mean square error}} \, = MSE = \frac{1}{N}\sum\limits_{n = 1}^{N} {\left( {X\left[ n \right] - Y\left[ n \right]} \right)}^{2}$$
    (5)
  5. (v)
$${\text{Percentage root mean square difference}},\;PRD = 100 \times \sqrt {\frac{{\sum\nolimits_{n = 1}^{N} {\left( {X\left[ n \right] - Y\left[ n \right]} \right)^{2} } }}{{\sum\nolimits_{n = 1}^{N} {X^{2} \left[ n \right]} }}}$$
    (6)
  6. (vi)
$$Sensitivity\,\left( {Se} \right) = \frac{{True\,Positive}}{{True\,Positive + False\,Negative}}$$
    (7)
  7. (vii)
$$Specificity\,\left( {Sp} \right) = \frac{{True\,Negative}}{{True\,Negative + False\,Positive}}$$
    (8)
  8. (viii)
    $$Positive \, predictivity\left( { + P} \right) = \, \frac{{\left( {True \, positive} \right)}}{{\left( {True \, Positive + \, False \, Positive} \right)}}$$
    (9)
  9. (ix)
    $$Accuracy\left( {Ac} \right) = \, \frac{{\left( {True \, Positive + \, True \, negative} \right)}}{{\left( {True \, Positive + \, True \, negative \, + False \, Positive + \, False \, Negative} \right)}}$$
    (10)
  10. (x)
    $$Detection \, Accuracy\left( {Da} \right) = \frac{{\left( {True \, Positive} \right)}}{{\left( {True \, Positive + \, False \, Positive + \, False \, Negative} \right)}} \,$$
    (11)
  11. (xi)
$$Error\,Rate\left( {Er} \right) = \frac{{False\,Positive + False\,Negative}}{{Total\,beats}}$$
    (12)
  12. (xii)
$$QRS\,Detection\,Rate\left( {QDR} \right) = \min \left( {Sensitivity,\,Positive\,predictivity} \right)$$
    (13)
  13. (xiii)
$$F{\text{-}}score\left( {Fs} \right) = \frac{{2 \times Sensitivity \times Positive\,predictivity}}{{Sensitivity + Positive\,predictivity}}$$
    (14)
  14. (xiv)
    $$Youden \, index\left( {Y_{W} } \right) = Se + Sp - 1$$
    (15)

True positive (TP) is the number of true QRS complexes that are correctly detected. False negative (FN) is the number of QRS complexes that are not detected as QRS complexes. False positive (FP) is the number of non-QRS events that are detected as QRS complexes. Sensitivity represents the percentage of true beats that an algorithm can correctly detect. Positive predictivity is the percentage of detected beats that were true beats. Accuracy represents the ratio of correctly detected beats to the total number of beats. The QRS detection rate is defined as the minimum of sensitivity and positive predictivity. A minimal sketch of computing these measures is given below.
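To make the definitions concrete, the following Python sketch computes the beat-detection measures of Eqs. (7)–(15) and the signal-quality measures of Eqs. (2)–(6). It assumes the TP/TN/FP/FN counts and the original, corrupted, and denoised signals are already available from an experiment, and it takes the error-rate denominator of Eq. (12) as the total number of true beats.

```python
import numpy as np

def detection_metrics(tp, tn, fp, fn):
    """Beat-detection measures of Eqs. (7)-(15) from TP/TN/FP/FN counts."""
    se = tp / (tp + fn)                    # sensitivity, Eq. (7)
    sp = tn / (tn + fp)                    # specificity, Eq. (8)
    ppv = tp / (tp + fp)                   # positive predictivity, Eq. (9)
    ac = (tp + tn) / (tp + tn + fp + fn)   # accuracy, Eq. (10)
    da = tp / (tp + fp + fn)               # detection accuracy, Eq. (11)
    er = (fp + fn) / (tp + fn)             # error rate over total true beats, Eq. (12)
    qdr = min(se, ppv)                     # QRS detection rate, Eq. (13)
    f_score = 2 * se * ppv / (se + ppv)    # F-score, Eq. (14)
    yw = se + sp - 1                       # Youden index, Eq. (15)
    return {"Se": se, "Sp": sp, "+P": ppv, "Ac": ac, "Da": da,
            "Er": er, "QDR": qdr, "Fs": f_score, "Yw": yw}

def denoising_metrics(x, s, y):
    """Signal-quality measures of Eqs. (2)-(6): x = original, s = corrupted, y = denoised."""
    x, s, y = map(np.asarray, (x, s, y))
    snr_in = 10 * np.log10(np.sum(x ** 2) / np.sum((s - x) ** 2))   # Eq. (2)
    snr_out = 10 * np.log10(np.sum(x ** 2) / np.sum((x - y) ** 2))  # Eq. (3)
    mse = np.mean((x - y) ** 2)                                     # Eq. (5)
    prd = 100 * np.sqrt(np.sum((x - y) ** 2) / np.sum(x ** 2))      # Eq. (6)
    return {"SNR_in": snr_in, "SNR_out": snr_out,
            "SNR_imp": snr_out - snr_in,                            # Eq. (4)
            "MSE": mse, "PRD": prd}
```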

7 Discussion and Challenges

Detection of cardiovascular diseases requires accurate and precise detection of the QRS complex. As ECG signals suffer from various noises and artifacts, pre-processing of the ECG signal is necessary. The pre-processing step, which removes noises and artifacts, directly influences the outcome of the analysis. This paper presents almost all of the existing ECG signal pre-processing techniques and QRS complex detection techniques. Pre-processing methods based on filtering, transforms, and empirical decomposition are used by different researchers. The study showed that a single technique is not enough to remove the different kinds of artifacts and noises.

Moreover, each denoising technique produces some impact on the ECG signal. It is essential to identify the nature of the noise first; only then should a denoising scheme be selected for those noises and artifacts [177]. In the literature, hybrid techniques are used to improve the overall performance of ECG detection algorithms. Hybrid techniques combine two techniques, yielding good denoising results at the cost of increased computational complexity and processing delay.

The presence of noises and artifacts in the ECG signal produces errors in QRS and R-peak detection. Hence, the denoising step is a crucial part of cardiovascular disease detection. Our survey showed that many proposed algorithms achieved good denoising results, but only on a few ECG records; no single denoising scheme has been validated on all the records of all the standard datasets. Among the various denoising techniques, filtering is quite useful when the noise occupies a known frequency range beyond the frequency range of the ECG signal. When the noise lies within the ECG signal's frequency range, transform techniques such as the DFT and DCT have shown good denoising performance without distorting the signal. Line-fitting techniques such as Savitzky-Golay filtering can be a powerful tool when the frequency range of the noise is unpredictable and widespread, since they smooth the signal without destroying its original properties; a minimal sketch follows. Other transform techniques such as the STFT and the wavelet transform have also been used for denoising, but they are not suitable for long-term ECG signals.
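As an illustration of the line-fitting approach, the sketch below applies SciPy's Savitzky-Golay filter to a toy noisy signal. The window length and polynomial order are illustrative assumptions, not values recommended by the surveyed papers; in practice the window must stay shorter than the QRS duration so the complex is not flattened.

```python
import numpy as np
from scipy.signal import savgol_filter

fs = 360                                   # assumed sampling rate (Hz)
t = np.arange(2 * fs) / fs

# Toy ECG-like signal: a narrow Gaussian "QRS" once per second plus wideband noise.
clean = np.exp(-((t % 1.0 - 0.5) ** 2) / (2 * 0.01 ** 2))
noisy = clean + 0.05 * np.random.randn(t.size)

# Window length is odd and, at ~47 ms (17 samples at 360 Hz), shorter than a
# typical QRS (~100 ms), so the local polynomial fit smooths noise without
# flattening the peak.
smooth = savgol_filter(noisy, window_length=17, polyorder=3)
```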

Decomposition techniques such as EMD, EEMD, VMD, and Fourier decomposition methods are useful for specific noises such as baseline wander, power line interference, and electromyographic noise. These techniques are capable of removing some of the noises and artifacts present in the ECG signal; however, their performance has been evaluated on only a few ECG records. Most denoising techniques have successfully overcome BW and PLI noise, but removing the other noises is still a challenging task. In several studies, beat averaging is used to remove noise, but its usefulness and its effect on the specificity and sensitivity of detection algorithms require further study.
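For the two noises most algorithms handle well, BW and PLI, even a classical filter-based scheme is straightforward; the sketch below uses such filters (rather than the decomposition methods above) purely for illustration. The 0.5 Hz cutoff, 50 Hz mains frequency, and notch quality factor are common but illustrative choices.

```python
from scipy.signal import butter, filtfilt, iirnotch

def remove_baseline_wander(ecg, fs, cutoff=0.5):
    # Zero-phase second-order Butterworth high-pass: removes slow drift (BW)
    # below the assumed 0.5 Hz cutoff without shifting the ECG waves in time.
    b, a = butter(2, cutoff / (fs / 2), btype='highpass')
    return filtfilt(b, a, ecg)

def remove_powerline(ecg, fs, f0=50.0, q=30.0):
    # Narrow IIR notch centred on the mains frequency (50 or 60 Hz by region),
    # applied forward and backward for zero phase distortion.
    b, a = iirnotch(f0, q, fs)
    return filtfilt(b, a, ecg)
```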

Detection of the QRS complex or R peak depends on the quality of the ECG signal. In the presence of artifacts and noises, the detection accuracy may decrease, and misidentification of CVDs becomes possible. Hence, denoising is essential before detection. Various methods are available in the literature for detecting QRS complexes and R peaks, and almost all of them apply a pre-processing or denoising step before detection to improve accuracy. Recent studies reported excellent detection accuracies by excluding the noisy records of the standard datasets. The detection accuracies of some proposed ECG detectors are very high when applied on a limited dataset under specific conditions, and these algorithms are not validated over all the datasets. Further, the few proposed algorithms that have been implemented in real-time systems suffered from low detection accuracies.
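To show what such a detection pipeline typically involves, here is a deliberately simplified sketch in the spirit of the classical derivative-square-integrate (Pan-Tompkins-style) detectors. The band edges, threshold, and refractory period are illustrative assumptions, not values taken from any specific surveyed algorithm; real detectors adapt the threshold beat by beat.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def detect_r_peaks(ecg, fs):
    """Simplified derivative-square-integrate R-peak detector (Pan-Tompkins style)."""
    # 1. Band-pass around the dominant QRS energy band (roughly 5-15 Hz).
    b, a = butter(2, [5 / (fs / 2), 15 / (fs / 2)], btype='bandpass')
    filtered = filtfilt(b, a, ecg)
    # 2. Differentiate and square to emphasise the steep QRS slopes.
    squared = np.diff(filtered) ** 2
    # 3. Moving-window integration over ~150 ms.
    win = int(0.15 * fs)
    integrated = np.convolve(squared, np.ones(win) / win, mode='same')
    # 4. Peak picking with a fixed relative threshold and a 200 ms refractory period.
    peaks, _ = find_peaks(integrated,
                          height=0.3 * np.max(integrated),
                          distance=int(0.2 * fs))
    return peaks
```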

Although different researchers have proposed various QRS denoising and detection techniques, many are not tested against all the standard databases, so it is difficult to compare and evaluate their results. Some algorithms in the literature achieved high sensitivity and accuracy but excluded the noisy records of the database; therefore, these algorithms may not be useful for clinical purposes. Most detection techniques use different pre-processing and detection methods, which increases complexity and time consumption. Still, no single technique proposed in the literature can handle both denoising and detection while providing good performance. Although the wavelet transform has been used for both denoising and detection, its performance is not very good, and the selection of the mother wavelet is a difficult task. The practical implementation of the proposed algorithms is a big challenge in ECG signal analysis. Recent studies show that, along with denoising and detection, lossless compression of the ECG signal is also essential, because it reduces storage, transmission bandwidth, and computational load. In the literature, very few algorithms have compression capability.

Our studies reveal that evaluation parameters like accuracy and sensitivity represent the overall performance of an algorithm; however, the portion of the algorithm accountable for the improvement in these parameters is not identified. Assessing algorithms based on features is very important, as it would help in secondary analysis and in explaining the cause of heart abnormalities. In the literature, the outcomes of the proposed algorithms are not compared with the output of the ECG recorders that medical professionals use. Such a comparison would help validate the proposed algorithms and demonstrate their efficacy over the methods currently used by medical professionals. Most of the proposed algorithms are based on QRS complex or R-peak detection while ignoring other characteristics like the ST segment and the P wave. Nevertheless, clinically, these characteristics are as important as the QRS complex or R peak.

Most of the reviewed papers have used standard databases such as MIT-BIH, AHA, and PTB. Although the records in these databases are long and information-rich, they are obtained from a small number of subjects, which limits their clinical significance. Moreover, most proposed algorithms are validated over the same databases, limiting the reliability of the research findings. Also, the labels and annotations of these standard database records are too old to reflect the current definitions of CVDs with great reliability. Hence, there is a demand for new databases created with modern and precise devices, whose annotations and labels match the recent guidelines for CVD detection. Advancing the technology requires an ECG database with many records so that statistical significance can be evaluated.

Another major issue in the analysis of the ECG signal is ignoring the background of the subject. Apart from patient age, no other information is used by the proposed algorithms; the subject's medical history, symptoms, and other signs are not considered in the literature. However, including this information in the proposed algorithms could significantly improve performance. Nowadays, the wide availability of computing technology offers significant improvement in ECG signal analysis, and increasingly low-cost, high-performance computing can provide reliable automatic ECG diagnosis solutions. Various machine learning algorithms have been proposed in the literature to denoise and classify ECG signals; a sketch of this workflow follows. The performance of these machine learning algorithms is limited by the amount of training data: large training datasets are required to develop an accurate system, which in turn increases system complexity. For early and accurate detection of CVDs, a technique that can effectively perform both denoising and detection is required. The technique must be capable of providing the characteristic information of the ECG signal, and it should be practically implementable, cost-effective, non-invasive, and fast. It must have compression capability so that it can easily be used in portable and smart ECG detector devices.
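As a sketch of the machine learning route mentioned above, the following trains a generic off-the-shelf classifier on per-beat features. The feature matrix here is a random placeholder standing in for real features (e.g., RR intervals, QRS width, amplitudes) produced by the detection stage, so the printed accuracy is meaningless except as a demonstration of the workflow.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder per-beat features and labels; in a real system these would come
# from the denoising and detection stages described earlier.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 4))   # hypothetical 4-feature beat descriptors
y = rng.integers(0, 2, 1000)         # hypothetical labels: 0 = normal, 1 = abnormal

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring='accuracy')
print(f"cross-validated accuracy: {scores.mean():.3f}")
```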

8 Conclusions

CVDs pose a severe health risk worldwide and can be predicted by accurately analyzing ECG signals. This paper consolidates the information on various existing algorithms for analyzing an ECG signal. Because the performance of an algorithm also depends on factors like noise and the database used, a summary of noises and databases is also provided.

The problems associated with the techniques for analyzing the ECG signal are discussed. The primary step in ECG signal analysis is the removal of the various noises present in the signal. Denoising is the most critical step in ECG signal processing because it directly influences the outcome of the detection process. Many researchers suggest techniques for denoising that minimize or remove the effect of noises; these are based on the time, frequency, time-frequency, and sparsity domains. Some hybrid techniques, obtained by combining two different approaches, are also used to remove noise from an ECG signal. Each method has its benefits and drawbacks, which are discussed in this paper. These denoising techniques are summarized and compiled here, and their results on different databases are also presented.

CVDs can be detected by identifying the morphological features of a denoised ECG signal. In the literature, the R peak and the QRS complex are the most commonly detected ECG characteristics, and almost all algorithms apply denoising or noise removal before detection. Different techniques have been proposed for the detection of ECG characteristics, and various proposed detection techniques have achieved commendable peak-detection accuracies on the standard databases. These detection techniques, along with their performance parameters, are tabulated in this paper.

The proposed techniques for denoising and detection are validated by the researchers on various standard databases, most of which are discussed in this paper, along with the different success parameters used to compare ECG signal analysis techniques. Some of the existing techniques have shown promising results, but their hardware realization is questionable: studies show that very few methods have been implemented in hardware. Also, the modern health care system requires wearable devices for continuous monitoring of cardiac health, yet only a few existing techniques are implemented in wearable devices, and with limited accuracy. Indeed, a computational technique that can analyze the ECG signal with high accuracy under all conditions is still needed.