1 Introduction

The electrocardiogram (ECG) signal is the foremost tool used by cardiologists for diagnosis, prognosis and survival analysis of heart diseases. It is a record of the magnitude and direction of the electrical activity generated by the heart muscles during depolarization and repolarization of the atria and ventricles. A single heartbeat in the ECG signal consists of temporally distinct wave shapes: (i) the P wave, (ii) the QRS complex, (iii) the T wave and (iv) the U wave. The QRS complex corresponds to depolarization of the ventricles and has an amplitude of approximately 1 mV. QRS detectors are an integral part of modern computerized ECG monitoring systems [7]. Algorithms for ECG beat detection focus on the QRS complex because of its short duration and the high amplitude of its R peak. The performance of an automatic ECG analysis system depends on accurate delineation of the QRS complex even in the presence of artifacts and of inter- and intrasubject variations in ECG signal morphology. Any cardiac dysfunction related to excitation from ectopic centers in the myocardium may result in premature complexes (atrial or ventricular), which change the morphology of the waveform [7]. The occurrence of multiple premature complexes is used for diagnosis of congestive heart failure and various other heart disorders. Therefore, the delineation of the QRS complex is clinically important, and attempts are being made to find a universal solution for the accurate detection of R peaks.

Because the R peak has the maximum amplitude, algorithms in the literature aim to enhance the amplitude of the R wave with respect to noise and the other waves. Ahlstrom et al. [1] formed a linear combination of first and second derivatives to accentuate the R peak, followed by a fixed threshold for detection. Okada [16] introduced a digital filter algorithm to extract the QRS complex, followed by a fixed threshold. Benitez et al. [3] used a band-pass filter to remove noise and emphasize the QRS complex, followed by the Hilbert transform and fixed thresholding. Template matching and morphological filtering [22] and empirical mode decomposition [9] have also been used for detection of the QRS complex. Wavelet transform-based QRS detectors have been reported in [2, 6, 13, 18, 21]. Zidemal et al. [23] reported the detection of R peaks using the S-transform and Shannon energy. The lifting wavelet is used in [14] for detection of R peaks. The R peak has been localized using heuristic rules [7], matched filters [11], hidden Markov models [5], neural networks [10], and fixed and adaptive thresholding [1]. Although many algorithms are available in the literature, the precise detection and localization of the R peak is still a challenge, as various artifacts and intersubject variations of ECG signal morphology affect the estimation adversely.

In this paper, an algorithm consisting of a preprocessing stage and a decision stage is proposed that simultaneously reduces noise and gives precise detection of the R peaks even in the presence of artifacts. Because the ECG signal consists of different waves with different frequency contents occurring at different intervals, wavelet-based time–frequency analysis is used to obtain an accurate representation of the signal. A combination of first and second derivatives is applied to the signal obtained from the wavelet transform to accentuate the QRS complex and suppress the P and T waves. Further, the Hilbert transform is applied to obtain an envelope of the R peaks for a single-sided threshold mechanism. The peak of the envelope is then detected by adding a high-frequency signal whose amplitude depends on the QRS complex.

The algorithm was validated with the MIT-BIH arrhythmia database, the QT database and the MIT-BIH noise stress database, taken from physionet.org [15]. Because real-time ECG signals contain various types of noise, various artifacts were also simulated and added linearly to the ECG signals of the MIT-BIH arrhythmia database to check the performance of the algorithm. The algorithm was also validated on ECG signals corrupted with artificial noise at varying SNR. The algorithm attains 99.9% sensitivity and positive predictivity, an improvement over the results reported in the literature [7]. It also works well for signals with small-amplitude QRS complexes, broad QRS complexes, large and pointed P and T waves, and varying QRS morphology.

2 The Proposed Method

2.1 Preprocessing and De-noising the ECG Signal

Mathematically, the observed ECG signal y[n] can be written as,

$$\begin{aligned} y[n]=\alpha s[n]+w[n] \end{aligned}$$
(1)

where s[n] is the ECG signal, w[n] denotes all the artifacts present, such as baseline wandering noise, motion artifacts, electrosurgical noise, muscle contraction noise, power line interference and electrode contact noise, and \(\alpha \) is the attenuation parameter, with \(\alpha \in R\). Though the QRS complex is generally the strongest component, its time–frequency-varying morphology degrades its amplitude and shape. It is assumed that the artifacts are added linearly to the observed ECG signal.

Table 1 Frequency distribution of detail coefficients

The QRS complex in the ECG signal occupies the frequency band from 5 to 25 Hz [7], and all other waves of the ECG signal lie in bands below that of the QRS complex. Therefore, the QRS complex x[n] is estimated from the observed ECG signal y[n] by time–frequency analysis, i.e., the discrete wavelet transform (DWT). The db6 wavelet is used as the mother wavelet for R peak delineation, as it resembles the QRS complex and its energy spectrum is concentrated around low frequencies [2, 17]. The DWT analyzes the signal at different resolutions by decomposing it into several successive frequency bands. The resolution of the signal, which is a measure of the amount of detail information in the signal, is changed by the filtering operations, and the scale is changed by upsampling and downsampling operations [18]. The DWT employs two sets of functions, called scaling functions and wavelet functions, which are associated with low-pass and high-pass filters, respectively. The decomposition of the signal into different frequency bands is obtained by successive high-pass and low-pass filtering of the time-domain signal. The original signal y[n] is first passed through a half-band high-pass filter g[n] and a half-band low-pass filter h[n], each followed by downsampling by a factor of 2. One level of decomposition can be expressed as follows:

$$\begin{aligned} d_i[n]= & {} {}Y_\mathrm {high}[n]= \sum _{k}y[k] \cdot g[2n-k] \end{aligned}$$
(2)
$$\begin{aligned} a_i[n]= & {} {}Y_\mathrm {low}[n]= \sum _{k}y[k] \cdot h[2n-k] \end{aligned}$$
(3)

where \(i \in I\), \( Y_\mathrm{high}[n]\) and \(Y_\mathrm{low}[n]\) are the outputs of the high-pass and low-pass filters, g[n] and h[n], respectively. \(d_i[n]\) are known as the detail coefficients, and \(a_i[n]\) are known as the approximation coefficients.
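As a minimal sketch of Eqs. (2) and (3), the following Python snippet filters a signal and keeps every second output sample. The Haar filter pair is used only as a short stand-in for the db6 filters (an assumption made for brevity), samples outside the signal are treated as zero (also an assumption), and the function name `dwt_level` is hypothetical.

```python
import math

def dwt_level(y, h, g):
    """One analysis level per Eqs. (2)-(3): returns (approximation, detail)."""
    def filter_and_downsample(f):
        out = []
        for n in range(len(y) // 2):
            acc = 0.0
            for j, fj in enumerate(f):      # j = 2n - k, so k = 2n - j
                k = 2 * n - j
                if 0 <= k < len(y):         # assumption: zero padding at the edges
                    acc += y[k] * fj
            out.append(acc)
        return out
    return filter_and_downsample(h), filter_and_downsample(g)

# Haar pair, standing in for the db6 filters used in the paper
s = 1.0 / math.sqrt(2.0)
h_haar = [s, s]     # low-pass h[n]
g_haar = [s, -s]    # high-pass g[n]

a1, d1 = dwt_level([3.0] * 8, h_haar, g_haar)
```

For a constant input, the detail coefficients vanish away from the left edge, while the approximation coefficients carry the scaled signal level. Applying the same step again to the approximation output gives the next decomposition level.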

Fig. 1 Energy distribution of detail coefficients

The observed ECG signal, sampled at 360 Hz as in the MIT-BIH arrhythmia database [15], is decomposed up to level 10. The approximation and detail coefficients obtained up to level 10 are labeled \(a_1[n]\) to \(a_{10}[n]\) and \(d_1[n]\) to \(d_{10}[n]\), respectively. The frequency band corresponding to each detail scale after the wavelet decomposition of the observed ECG is shown in Table 1. With knowledge of the frequency band of each detail scale, the QRS complex can be easily estimated.
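The nominal band of each detail scale follows from the repeated halving of the spectrum. The short Python sketch below assumes ideal half-band filters, so that detail \(d_i[n]\) covers \(f_s/2^{i+1}\) to \(f_s/2^{i}\); the function name `detail_band` is hypothetical, and the values of a practical db6 decomposition may differ slightly because real filters overlap.

```python
def detail_band(level, fs=360.0):
    """Nominal (low, high) band in Hz of detail d_level for sampling rate fs."""
    return fs / 2 ** (level + 1), fs / 2 ** level

# List the idealized bands for the ten detail scales at fs = 360 Hz
for level in range(1, 11):
    low, high = detail_band(level)
    print(f"d{level}: {low:.3f} to {high:.3f} Hz")
```

Under this idealized model, \(d_4[n]\) and \(d_5[n]\) together span about 5.6 to 22.5 Hz, and \(d_9[n]\) and \(d_{10}[n]\) span about 0.18 to 0.70 Hz.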

The electrosurgical noise (100 kHz–1 MHz), muscle contraction noise (dc–10 kHz) and power line interference (50–60 Hz) are removed by discarding the details \(d_1[n]\) and \(d_2[n]\). The motion artifacts (transient baseline changes) and baseline wandering noise are removed by discarding the lowest-frequency components, i.e., the coefficients \(d_{9}[n]\) and \(d_{10}[n]\). It is clear from Table 1 that the coefficients \(d_{9}[n]\) and \(d_{10}[n]\) cover frequencies from 0.17 to 0.7 Hz, which is the frequency band of motion artifacts and baseline wandering noise. Also, in the ECG signal, most of the energy is concentrated at the QRS complex [2, 21]. For the energy analysis, the average energy content of the detail coefficients over all signals of the MIT-BIH arrhythmia database is calculated, and the plot is shown in Fig. 1. The plot shows that the energy is highest at scale 4. Therefore, \(d_4[n]\) and its neighboring scales are considered to carry the dominant details of the QRS complex. As stated in [17], the power spectra of the ECG signal indicate that the energy of the QRS complex is present at the \(d_3[n]\), \(d_4[n]\) and \(d_5[n]\) scales. Therefore, all other approximations and details are discarded, and the signal can be estimated from the details \(d_3[n]\), \(d_4[n]\) and \(d_5[n]\) only. However, it is observed both in the literature [2] and experimentally that \(d_3[n]\) also contains some noise, which interferes with accurate beat detection. Therefore, \(d_3[n]\) is also discarded, and the noise-free signal is estimated from \(d_4[n]\) and \(d_5[n]\) only, which together correspond to the QRS frequency band of 5–25 Hz.

$$\begin{aligned} \hat{x}[n]=d_4[n]+ d_5[n] \end{aligned}$$
(4)

where \(d_4[n]\) and \(d_5[n]\) are the details at levels 4 and 5, respectively, obtained from the DWT of the signal y[n]. Figures 2 and 3 show the observed ECG signal y[n] and the signal \(\hat{x}[n]\) estimated from the detail coefficients of the wavelet transform. The record in Fig. 2 contains high-grade noise and baseline drift, which are removed in the estimated signal obtained after the wavelet transform. As shown in Fig. 3, the observed signal has premature ventricular contractions (PVCs), and the QRS morphology changes due to axis shift. This signal also contains various noises such as muscle artifact and baseline shifts, which are removed in the estimated signal.

Fig. 2 (i) Observed ECG signal. (ii) Signal estimated after wavelet transform

Fig. 3 (i) Observed ECG signal. (ii) Signal estimated after wavelet transform

2.2 Enhancement of QRS Complex

In this stage, a linear combination of first and second derivatives is constructed to accentuate the higher frequencies that are characteristic of the QRS complex and attenuate the lower frequencies that are characteristic of the P and T waves. The derivatives are computed on the reconstructed signal obtained after the DWT. The first and second derivatives are approximated by the difference equations (5) and (6), respectively.

$$\begin{aligned} \hat{y}_0[n]= & {} \hat{x}[n]-\hat{x}[n-2] \end{aligned}$$
(5)
$$\begin{aligned} \hat{y}_1[n]= & {} \hat{x}[n]-2\hat{x}[n-2]+\hat{x}[n-4] \end{aligned}$$
(6)

As pointed out by Rabanni [19], the outputs of these derivatives are linearly combined as

$$\begin{aligned} \hat{y}_2[n]= 1.3\hat{y}_0[n]+1.1\hat{y}_1[n] \end{aligned}$$
(7)

Differentiation of the signal also amplifies high-frequency noise. This noise is smoothed by a moving-window average filter as

$$\begin{aligned} \hat{y}_3[n]= \frac{1}{8}\sum _{k=0}^{7}\hat{y}_2[n-k] \end{aligned}$$
(8)
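Equations (5) to (8) can be chained in a few lines. The following Python sketch is one possible reading, assuming that samples before n = 0 are zero; the function name `enhance_qrs` and the helper `at` are hypothetical.

```python
def enhance_qrs(x):
    """Apply Eqs. (5)-(8) to the list x; samples before n = 0 are taken as zero."""
    def at(seq, n):
        return seq[n] if n >= 0 else 0.0

    N = len(x)
    y0 = [at(x, n) - at(x, n - 2) for n in range(N)]                      # Eq. (5)
    y1 = [at(x, n) - 2 * at(x, n - 2) + at(x, n - 4) for n in range(N)]   # Eq. (6)
    y2 = [1.3 * a + 1.1 * b for a, b in zip(y0, y1)]                      # Eq. (7)
    y3 = [sum(at(y2, n - k) for k in range(8)) / 8.0 for n in range(N)]   # Eq. (8)
    return y3

# A linear ramp has constant slope 1 per sample and zero curvature
ramp = [float(n) for n in range(20)]
y3 = enhance_qrs(ramp)
```

On the ramp, the first difference of Eq. (5) settles at 2 and the second difference of Eq. (6) at 0, so the smoothed output settles at 1.3 × 2 = 2.6 once the start-up transient has passed. A constant input settles at zero, which illustrates how slowly varying components are suppressed.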

The smoothed signal represents the true QRS complexes. It is then passed through the Hilbert transform to form the envelope of the QRS complex.

$$\begin{aligned} \hat{y}_h[n]= H(\hat{y}_3[n]) \end{aligned}$$
(9)

Here, the Hilbert transform is used so that a single-sided threshold can be used for detecting the R peaks. The peak of the envelope formed after the Hilbert transform represents the location of the R peak. This also reduces the complexity of finding the local maxima. Figure 4 shows the signals obtained at each step.
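One common way to obtain an envelope from the Hilbert transform is to take the magnitude of the analytic signal \(x[n] + jH(x[n])\). The Python sketch below follows that convention, which is an assumption, since the text does not state exactly how the envelope is formed from \(\hat{y}_h[n]\). It uses a direct \(O(N^2)\) DFT from the standard library to stay self-contained; the function name `analytic_signal` is hypothetical, and an FFT-based routine would be preferable for full-length records.

```python
import cmath
import math

def analytic_signal(x):
    """Return x[n] + j*H{x}[n] via a direct O(N^2) DFT (fine for short demos)."""
    N = len(x)
    # Forward DFT
    X = [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
         for k in range(N)]
    # Keep DC (and Nyquist for even N), double positive frequencies, zero negative ones
    for k in range(N):
        if k == 0 or (N % 2 == 0 and k == N // 2):
            continue
        X[k] = 2 * X[k] if k < (N + 1) // 2 else 0
    # Inverse DFT
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

# A cosine with 4 cycles in 64 samples: its Hilbert transform is the matching sine
N = 64
x = [math.cos(2 * math.pi * 4 * n / N) for n in range(N)]
z = analytic_signal(x)
envelope = [abs(v) for v in z]
```

For this test tone, the imaginary part of `z` is the sine of the same frequency and the envelope is constant at 1, so the envelope is non-negative and a single-sided threshold suffices.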

Fig. 4 Signal obtained after (i) Wavelet transform. (ii) Differentiation. (iii) Hilbert transform

2.3 Peak-Finding Logic

To accurately detect the R peaks, the envelope formed after the Hilbert transform (\(\hat{y}_h[n]\)) is further accentuated by a nonlinear transform of the signal as

$$\begin{aligned} \hat{y}_4[n]= \frac{1}{|\hat{y}_h[n]|}\hat{y}_h^3[n] \end{aligned}$$
(10)

where \(\hat{y}_4[n]\) denotes the cubic nonlinear transformed signal, shown in Fig. 5. The temporal location of the R peaks is determined from the signal \(\hat{y}_4[n]\). The cubic nonlinear transformation maintains the sign and amplifies the peaks of the signal \(\hat{y}_h[n]\) obtained after the Hilbert transform. The technique used here to detect the QRS complex from the transformed signal is based on a feature signal obtained by counting the number of zero crossings per segment. The feature signal must have a low value during the QRS complex and a high value otherwise. Therefore, a signal is needed that has many zero crossings during non-QRS segments and few zero crossings during the QRS complex. The nonlinear transformed signal \(\hat{y}_4[n]\) has high amplitude during the QRS complex and low amplitude otherwise. In order to increase the number of zero crossings during non-QRS segments, a high-frequency sequence \(\hat{h}[n]\) is added to the signal \(\hat{y}_4[n]\).

$$\begin{aligned} \hat{z}[n]= \hat{h}[n]+\hat{y}_4[n] \end{aligned}$$
(11)

The high-frequency sequence is calculated as,

$$\begin{aligned} \hat{h}[n]= (-1)^n\hat{e}[n] \end{aligned}$$
(12)

The amplitude of the high-frequency sequence \(\hat{h}[n]\) is calculated from the amplitude of \(\hat{y}_4[n]\) and is given as

$$\begin{aligned} \hat{e}[n]= \lambda \hat{e}[n-1]+c(1-\lambda )\cdot {\hbox {abs}}(\hat{y}_4[n]) \end{aligned}$$
(13)

where \(\lambda \) is the forgetting factor, \(c \in [1,4]\) and abs(\(\cdot \)) is the absolute value operation. A frequency range of 0–25 Hz is required; therefore, \(\lambda \in (0;1)\).
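Equations (10) to (13) can be sketched in Python as follows. The initial condition \(\hat{e}[-1]=0\), the default \(c = 4\) (one value from the stated range) and the function names are assumptions made for illustration; \(\lambda = 0.99\) is the value reported later in the paper.

```python
def cubic_transform(yh):
    """Eq. (10): yh^3 / |yh|, i.e. sign(yh) * yh^2, with 0 mapped to 0."""
    return [v * abs(v) for v in yh]

def add_high_frequency(y4, lam=0.99, c=4.0):
    """Eqs. (11)-(13): returns z[n] = h[n] + y4[n], assuming e[-1] = 0."""
    z, e = [], 0.0
    for n, v in enumerate(y4):
        e = lam * e + c * (1.0 - lam) * abs(v)   # Eq. (13): recursive amplitude estimate
        h = e if n % 2 == 0 else -e              # Eq. (12): (-1)^n * e[n]
        z.append(h + v)                          # Eq. (11)
    return z

y4 = cubic_transform([2.0, -3.0, 0.0])
z = add_high_frequency([1.0, 1.0])
```

Writing Eq. (10) as `v * abs(v)` avoids the division by zero that the literal form would produce when \(\hat{y}_h[n]=0\). Because \(\hat{h}[n]\) alternates in sign every sample, it dominates \(\hat{z}[n]\) wherever \(\hat{y}_4[n]\) is small and so forces sign changes outside the QRS complex.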

The zero crossings in the signal \(\hat{z}[n]\) are determined as

$$\begin{aligned} \hat{k}[n]= \left| \frac{\hbox {sign}(\hat{z}[n])-\hbox {sign}(\hat{z}[n-1])}{2}\right| \end{aligned}$$
(14)

The number of zero crossings per 120 ms segment is counted with a moving window. This yields the feature signal

$$\begin{aligned} \hat{f}[n]= \lambda _d\hat{f}[n-1]+(1-\lambda _d)\hat{k}[n] \end{aligned}$$
(15)
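A minimal Python sketch of Eqs. (14) and (15) is given below. It assumes \(\hbox {sign}(0)=+1\) so that \(\hat{k}[n]\) is always 0 or 1, sets \(\hat{k}[0]=0\) and \(\hat{f}[-1]=0\), and uses hypothetical function names; \(\lambda _d = 0.97\) is the value reported later in the paper.

```python
def zero_crossings(z):
    """Eq. (14): 1 where the sign of z changes between n-1 and n, else 0."""
    if not z:
        return []

    def sign(v):
        return 1 if v >= 0 else -1      # assumption: sign(0) = +1

    # k[0] is set to 0 because z[-1] is not available
    return [0] + [abs(sign(z[n]) - sign(z[n - 1])) // 2 for n in range(1, len(z))]

def feature_signal(k, lam_d=0.97):
    """Eq. (15): recursive smoothing of k[n], assuming f[-1] = 0."""
    f, prev = [], 0.0
    for v in k:
        prev = lam_d * prev + (1.0 - lam_d) * v
        f.append(prev)
    return f

k = zero_crossings([1.0, -1.0, 1.0, -1.0])
f = feature_signal(k)
```

A signal that changes sign at every sample drives the feature signal toward 1, whereas a stretch without sign changes, as expected during a QRS complex, lets it decay toward 0.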

QRS detection uses an adaptive threshold \(\hat{\theta }[n]\), which is calculated as in (16) by applying a first-order recursive filter to the feature signal.

$$\begin{aligned} \hat{\theta }[n]= \lambda _\theta \hat{\theta }[n-1]+(1-\lambda _\theta )\hat{f}[n] \end{aligned}$$
(16)

where \(\lambda _\theta \) and \(\lambda _d \) are forgetting factors.

$$\begin{aligned} \lambda _\theta \in (0;1),\quad \lambda _d\in (0;1) \end{aligned}$$
(17)

The beginning point \(n_1\) and the end point \(n_2\) of an event (the search interval for locating the QRS complex) are taken where the feature signal falls below the adaptive threshold \(\hat{\theta }[n]\) and where it rises above the threshold again, respectively [12]. These \(n_1\) and \(n_2\) are the boundaries of the search interval p[n] for the temporal location of the QRS complex.

$$\begin{aligned} p[n]= \left\{ \begin{array}{ll} \hat{y}_4[n], &{}\,\, n\in [n_1,n_2]\\ 0,&{}\,\, \hbox {otherwise} \end{array}\right. \end{aligned}$$

A noisy signal may produce multiple events; therefore, the distance between two detected events is considered. If the distance between detected events is too small, i.e., around 80 ms, they are combined into one. The beginning of the combined event is the beginning of the first event, and the end of the combined event is the end of the last event. The R peak is then taken at the location of the maximum in the search interval. Figures 5 and 6 show the observed ECG signal and the R peaks detected by the proposed algorithm for records 100 and 203 m, respectively, of the MIT-BIH arrhythmia database. In Fig. 6, it can be seen that the R peaks are accurately detected for a signal with high-grade noise and premature ventricular complexes (PVCs).
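The decision logic described above can be sketched in Python as follows. The sketch reads Eq. (16) as a first-order recursion for \(\hat{\theta }[n]\) with \(\hat{\theta }[-1]=0\) (an assumption), measures the distance between events from the end of one to the start of the next (also an assumption), and uses hypothetical function names throughout.

```python
def adaptive_threshold(f, lam_theta=0.99):
    """Eq. (16) read as theta[n] = lam*theta[n-1] + (1-lam)*f[n], with theta[-1] = 0."""
    theta, out = 0.0, []
    for v in f:
        theta = lam_theta * theta + (1.0 - lam_theta) * v
        out.append(theta)
    return out

def find_events(f, theta):
    """Return (n1, n2) index pairs of the runs where f stays below theta."""
    events, start = [], None
    for n, (fv, tv) in enumerate(zip(f, theta)):
        if fv < tv and start is None:
            start = n                       # feature falls below the threshold
        elif fv >= tv and start is not None:
            events.append((start, n - 1))   # feature rises above it again
            start = None
    if start is not None:
        events.append((start, len(f) - 1))  # event still open at the end of the record
    return events

def merge_events(events, min_gap):
    """Combine events whose gap is at most min_gap samples."""
    merged = []
    for n1, n2 in events:
        if merged and n1 - merged[-1][1] <= min_gap:
            merged[-1] = (merged[-1][0], n2)
        else:
            merged.append((n1, n2))
    return merged

def locate_peaks(y4, events):
    """R peak index = location of the maximum of y4 inside each search interval."""
    return [max(range(n1, n2 + 1), key=lambda n: y4[n]) for n1, n2 in events]

# 80 ms expressed in samples at the 360 Hz rate of the MIT-BIH arrhythmia database
min_gap = round(0.080 * 360)
```

A typical call sequence would be `theta = adaptive_threshold(f)`, then `events = merge_events(find_events(f, theta), min_gap)`, then `locate_peaks(y4, events)`.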

Fig. 5 (i) Observed ECG signal. (ii) Output of nonlinear transform. (iii) Detected R peaks

Fig. 6 (i) Observed ECG signal. (ii) Output of nonlinear transform. (iii) Detected R peaks

3 Results

The proposed R peak detection algorithm is validated using the MIT-BIH arrhythmia database, the QT database and the noise stress database. It yields a significant improvement in R peak detection for ECG signals with various artifacts and with inter- and intrasubject variations in QRS complex morphology. For evaluation, the onset of each QRS candidate is compared with a key file containing the locations of all the valid QRS onsets. If the estimated onset falls within an 88 ms window of the actual onset, it is counted as a true positive; if it falls outside this range, it is counted as a false positive. The QRS complexes missed by the algorithm are counted as false negatives. The performance of the algorithm is assessed using four statistical measures:

$$\begin{aligned} \hbox {Sensitivity\,SE}= & {} \frac{\hbox {TP}}{\hbox {TP}+\hbox {FN}} \end{aligned}$$
(18)
$$\begin{aligned} \hbox {Positive\,predictivity} +P= & {} \frac{\hbox {TP}}{\hbox {TP}+\hbox {FP}} \end{aligned}$$
(19)
$$\begin{aligned} \hbox {Detection\,accuracy}= & {} \frac{\hbox {TP}}{\hbox {TP}+\hbox {FN}+\hbox {FP}} \end{aligned}$$
(20)
$$\begin{aligned} \hbox {Error\,rate}= & {} \frac{\hbox {FP}+\hbox {FN}}{\hbox {Total\,detected\,beats}} \end{aligned}$$
(21)

where \(\hbox {TP}\) is the number of true positives (QRS complexes detected as QRS complexes), \(\hbox {FN}\) is the number of false negatives (QRS complexes that have not been detected) and \(\hbox {FP}\) is the number of false positives (non-QRS complexes detected as QRS complexes).
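The four measures of Eqs. (18) to (21) translate directly into code. In the Python sketch below, the function name `detection_metrics` is hypothetical, the denominator of the error rate is passed in explicitly as `total_beats`, and the counts in the usage line are made up for illustration only; they are not results from the paper.

```python
def detection_metrics(tp, fn, fp, total_beats):
    """Eqs. (18)-(21) as fractions; multiply by 100 for percentages."""
    return {
        "sensitivity": tp / (tp + fn),                 # Eq. (18)
        "positive_predictivity": tp / (tp + fp),       # Eq. (19)
        "detection_accuracy": tp / (tp + fn + fp),     # Eq. (20)
        "error_rate": (fp + fn) / total_beats,         # Eq. (21)
    }

# Illustrative counts only
m = detection_metrics(tp=90, fn=10, fp=10, total_beats=100)
```

With these counts, the sensitivity and positive predictivity are both 0.9, the detection accuracy is 90/110 and the error rate is 0.2.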

Table 2 Performance of proposed algorithm on MIT-BIH arrhythmia Database

These four parameters are proposed as standard parameters to assess the performance of the algorithm [7]. Sensitivity gives the percentage of true beats that are correctly detected by the algorithm, and positive predictivity gives the percentage of detections that are true beats [7].

The algorithm has few parameters, i.e., the forgetting factors \(\lambda \), \(\lambda _d\) and \(\lambda _\theta \), which are set to 0.99, 0.97 and 0.99, respectively. An average sensitivity of 99.9%, an average positive predictivity of 99.9% and an average detection accuracy of 99.8% have been achieved. The average error rate is 0.2%. The results obtained for the first-channel ECG signals of all records of the MIT-BIH arrhythmia database are summarized in Table 2. Table 2 shows that R peak detection for records 104, 105 and 108, which are very noisy signals, has been significantly improved by the proposed algorithm. Record 108 m has baseline wandering noise and abrupt changes in QRS complex morphology. Hence, the performance for record 108 m is compared with other detectors reported in the literature [6, 8, 9, 12, 13, 20, 22, 23] in Table 3. It is clear from Table 3 that 100% accuracy is achieved for record 108 m. The comparison of false negatives and false positives against the earlier work reported in the literature [7] for specific records is shown in Tables 4 and 5, respectively. The proposed method yields fewer false positives and false negatives than the other methods reported in the literature.

Table 3 Comparison of record 108 m with detectors available in the literature
Table 4 Comparison of the number of false negative for records of MIT-BIH arrhythmia database
Table 5 Comparison of the number of false positive for records of MIT-BIH arrhythmia database

To study performance under noise, the proposed algorithm is also validated by adding noise to the ECG signals of the MIT-BIH arrhythmia database and on the signals of the noise stress database of physionet.org [15]. This was done to observe the behavior of the proposed algorithm on signals highly corrupted with noise. The four types of noise studied are power line interference, electrode motion artifact, baseline wandering interference and muscle artifact. The power line interference consists of 60 Hz pickup and harmonics, which are modeled as a sinusoid and a combination of sinusoids. The amplitude of the modeled power line interference is taken as 50% of the peak-to-peak ECG amplitude. Electrode motion noise is transient interference that occurs due to loss of contact between the electrode and the skin. Baseline wandering occurs due to changes in the electrode–skin impedance with electrode motion. Muscle artifact occurs due to muscle contractions, which result in millivolt-level potentials. Here, the power line interference is modeled, and the other three interferences are taken from the noise stress database of physionet.org and added linearly to the ECG signal. Each of the four types of noise is added to an ECG signal of the MIT-BIH arrhythmia database after scaling at ten different levels. The signal-to-noise ratio at each scaling level is calculated as

$$\begin{aligned} \hbox {SNR}_\mathrm{dB}=10\log _{10}\left( \frac{P_\mathrm{signal}}{P_\mathrm{noise}}\right) \end{aligned}$$
(22)

where \(P_\mathrm{signal}\) is the signal power and \(P_\mathrm{noise}\) is the noise power. The sensitivity and positive predictivity are calculated for all the signals corrupted with noise scaled at different levels. Further, the average sensitivity and positive predictivity are calculated. Figures 7 and 8 show the plots of average sensitivity and positive predictivity with respect to SNR.
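Scaling a noise record to reach a chosen SNR follows from Eq. (22) by solving for the noise gain. The Python sketch below assumes that power is the mean squared sample value and that signal and noise have equal length; the function names are hypothetical, and the short sequences are toy data rather than ECG records.

```python
import math

def power(x):
    """Mean squared value of the sequence (assumed definition of power)."""
    return sum(v * v for v in x) / len(x)

def snr_db(signal, noise):
    """Eq. (22): SNR in dB from signal and noise power."""
    return 10.0 * math.log10(power(signal) / power(noise))

def add_noise_at_snr(signal, noise, target_db):
    """Scale noise so that snr_db(signal, scaled) equals target_db, then add it linearly."""
    gain = math.sqrt(power(signal) / (power(noise) * 10.0 ** (target_db / 10.0)))
    scaled = [gain * v for v in noise]
    return [s + w for s, w in zip(signal, scaled)], scaled

# Toy data: unit-power signal, noise with power 4, target SNR of 0 dB
signal = [1.0, -1.0, 1.0, -1.0]
noise = [2.0, 2.0, 2.0, 2.0]
noisy, scaled = add_noise_at_snr(signal, noise, 0.0)
```

Here the gain is 0.5, so the scaled noise has the same power as the signal. Repeating the call over a list of target values gives a set of test signals at different noise levels.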

Figure 7(i) shows the average sensitivity and positive predictivity obtained for all signals of the noise stress database. It is clear from Fig. 7 that sensitivity and positive predictivity increase with increasing SNR. The sensitivity and positive predictivity obtained for the noise stress database even at an SNR of 0 dB are 97.15 and 95.2%, respectively. Figure 7(ii) shows the results when the baseline wandering artifact is added. At an SNR of 2 dB, the sensitivity and positive predictivity obtained are 98.30 and 97.03%, respectively.

Fig. 7 Results with (i) noise stress database. (ii) Baseline wandering artifact at varying SNR

Fig. 8 Results with (i) electrode motion artifact. (ii) Muscle artifact at varying SNR

Figure 8(i) shows the behavior of the algorithm when the electrode motion artifact is added. The sensitivity and positive predictivity obtained with the electrode motion artifact at an SNR of 2.5 dB are 89.42 and 96.79%, respectively, and both increase with increasing SNR. At an SNR of 36.5 dB, the sensitivity and positive predictivity obtained are 99.38 and 99.84%, respectively. Similarly, Fig. 8(ii) shows that when the muscle artifact is added to signals of the MIT-BIH arrhythmia database, the sensitivity increases from 89.65 to 99.89% and the positive predictivity increases from 91.88 to 99.86% as the SNR increases from 9 to 48.03 dB. Even at 0 dB, the results obtained for detection of R peaks remain good.

4 Discussion

The approach used in this paper for detecting R peaks gives better results than those available in the literature [2, 3, 6, 7, 13, 14, 18, 21]. Both the sensitivity and the positive predictivity of R peak detection are improved. The algorithm is able to remove a high level of noise, which was tested by adding artificial noise: power line interference, electrode motion artifact, baseline wandering interference and muscle artifact. The algorithm works well for both normal and diseased signals and is robust against amplitude variations of the ECG signal and noise present in the signal.

5 Conclusion

This paper presents a novel technique for detecting R peaks by combining the wavelet transform, derivatives, the Hilbert transform and adaptive thresholding based on the zero-crossing method. The algorithm does not need any learning from previously detected R peaks. It detects R peaks with high accuracy and is robust against noise, various artifacts and variations in the morphology of the ECG signal. It also works well for signals with wide QRS complexes, negative R peaks and a high level of noise. The algorithm achieves a sensitivity of 99.9% and a positive predictivity of 99.9%, which is above the reported results.