Introduction

According to the World Health Organization (WHO), cardiovascular diseases (CVDs) are the leading cause of mortality worldwide and are expected to remain so in the near future [1]. In 2008, CVDs caused 17.3 million deaths, representing 30% of all deaths worldwide [1]. In 2018, CVD-related deaths in the United States alone reached 836,546 (an average of one death every 38 s) [1, 2]. By 2030, the number of deaths is expected to rise to 23.3 million globally [3]. As a consequence of the increasing mortality due to CVDs, cardiac health research has gained significant importance. The most common clinical technique for analyzing cardiac disorders is the electrocardiogram (ECG). The ECG is a simple, reliable, low-cost and non-invasive tool commonly used to diagnose cardiac disorders [4]. It records the electrical activity originating from the myocardium through electrodes placed on the surface of the body, and the recording is later analyzed by a cardiologist. The term ECG was introduced by Augustus D. Waller (a British physiologist) in 1887, when he recorded the first human ECG using a capillary electrometer [5]. In 1893, Einthoven used an improved electrometer and a correction formula to distinguish five deflections, later named the P, Q, R, S and T waves [5]. These waves are generated as the heart undergoes three processes, namely atrial depolarization, ventricular depolarization and ventricular repolarization, which produce the P wave, QRS complex and T wave, respectively. The different waves comprising the standard ECG cycle are depicted in Fig. 1, and their clinical significance is summarized in Table 1. The U wave shown in Fig. 1 is seen only occasionally. It is a positive wave occurring after the T wave, with an amplitude of about one-fourth of the T wave, and is found in subjects with prominent T waves and slow heart rates (most frequently in leads V2–V4); its genesis remains elusive. All of these waves exhibit specific characteristics (in time, amplitude and morphology) and carry sufficient information for diagnosing CVDs. Further, various other features such as frequency, entropy distribution, energy and event intervals (like the RR-interval) are also extracted for reliable diagnosis. Any change in the features of the P, QRS or T waves may indicate cardiac abnormalities or arrhythmias leading to stroke or sudden cardiac death. Therefore, efficient detection of these waves is clinically essential for reliable analysis of heart health, such as arrhythmia classification [6,7,8,9,10,11,12,13,14,15], diagnosis of breathing disorders [16, 17], study of cardiac functioning during sleep and hypertension [18, 19], epilepsy [20] and examination of various other heart disorders [21].

Fig. 1
figure 1

Cardiac cycle representing the waves

Table 1 Significance of different waves in ECG signal [5]

Among these waves, the QRS complex exhibits the most striking features in terms of morphology, amplitude and time of occurrence and therefore plays a significant role in the efficient analysis of ECG recordings. In the past few decades, the detection of QRS complexes has been studied thoroughly by various researchers. In contrast, P and T wave detection has been explored far less, owing to factors including low signal-to-noise ratio (SNR), variation in amplitude and morphology, and the overlapping nature of the P and T waves. In spite of the numerous studies in the domain of QRS complex detection, the development of a reliable universal solution remains a challenge. The challenges mainly arise from low SNR, inter- and intra-subject variability in the morphology of the QRS complex and the remaining waves, and the artifacts inherent in the ECG signal. The most commonly used database for validating such work is the benchmark Massachusetts Institute of Technology—Boston's Beth Israel Hospital (MIT-BIH) arrhythmia database [22] (described in detail in the “MIT-BIH arrhythmia database” section). Detailed analysis of cardiac events is possible only if the QRS event is detected efficiently. Therefore, the development of a fast, robust, efficient and reliable QRS detector is clinically important for timely diagnosis of CVDs.

The objective of this review article is a thorough analysis of the QRS detection algorithms available in the literature. To prepare this manuscript, the literature was searched using databases such as Google Scholar, Scopus and Web of Science with the keywords “QRS” and “detection”. The analysis is divided into two stages, i.e., the pre-processing stage and the QRS detection stage. In the pre-processing stage, the QRS complex is made more prominent, or emphasized, with respect to the rest of the waves. The output of this stage is passed to the QRS detection stage, where the onset and offset points are demarcated and the corresponding R-peak is located in the ECG signal. The contribution of this study is to evaluate the performance of the algorithms in both stages against three criteria: (a) sensitivity to noise, (b) computational load and (c) accuracy. Further, the study also provides suggestions for developing a fast, efficient and robust QRS detection methodology for real-time applications.

The rest of the article is organized as follows. The “MIT-BIH arrhythmia database” section briefly describes the most commonly used ECG database in the literature. The “Pre-processing stage” section presents the evaluation of the pre-processing algorithms, while the “QRS detection techniques” section presents the evaluation of the QRS detection algorithms. The “Performance evaluation and discussion” section discusses the challenges of the evaluated pre-processing and QRS detection stages taken together, followed by suggestions for developing an efficient QRS methodology. The final section presents the conclusions of the study.

MIT-BIH arrhythmia database

Most research works are evaluated and validated on the benchmark Massachusetts Institute of Technology—Boston's Beth Israel Hospital (MIT-BIH) arrhythmia database [22], developed between 1975 and 1979 by the BIH arrhythmia laboratory. The database contains two-channel ambulatory ECG recordings of 47 different subjects comprising 48 records. In 45 recordings, the lead A signal is a modified limb lead II (MLII), and the lead B signal is a modified precordial lead (usually V1, occasionally V2 or V5, and V4 in one record). In the remaining three records, lead A is V5, while lead B is V2 (two records) or MLII (one record). The heartbeats in the lead A signal generally have more prominent peaks than those in lead B. The database includes 110,109 beat labels, and the data are band-pass filtered at 0.1–100 Hz. The records are digitized at a sampling frequency of 360 samples per second with 11-bit resolution over a 10 mV range. The database is open access and available online for performing experiments.

Pre-processing stage

In this stage, the acquired raw ECG signal is pre-processed to remove the various kinds of noise and artifacts [23] associated with it. These include baseline wander, artifacts due to muscle contraction and electrode movement, and power-line interference. The pre-processing stage improves the SNR of the ECG signal and is therefore highly instrumental for efficient QRS detection; otherwise, false alarms are generated and the performance of the QRS detector is degraded. This section evaluates the various pre-processing techniques on the basis of two factors, i.e. computational load and robustness to noise. A summary of this evaluation is presented in Table 2 at the end of this section.

For the reader's convenience, this section uses two variables: A[m], which refers to the raw input ECG signal, and B[m], which refers to the filtered output signal.

Amplitude technique

The amplitude technique is one of the commonly used algorithms for R-peak detection within ECG signals. Initially, a differentiation step is applied to suppress the influence of the P and T waves and highlight the QRS complex, and this is followed by an amplitude threshold. This algorithm was later used by Sufi et al. [24] to detect the heart rate on mobile phones. Alternatively, the amplitude threshold may be followed by the first derivative to make the slope of the QRS complex more prominent. The amplitude threshold for a fragment of the ECG signal is determined as:

$$A{\text{th}} = \alpha \times \max \{ A[m]\}$$
(1)

where \(\alpha\) denotes the fraction of the ECG signal amplitude that is eliminated, with \(0< \alpha < 1\). The value of \(\alpha\) is optimized once before pre-processing, while the thresholds are kept constant throughout the analysis. Various amplitude thresholds have been employed for subsequent QRS detection. Morizet et al. [25] introduced a QRS scheme using \(A{\text{th}} = 0.3\max \{ A[m]\},\) where everything below 30% of the peak amplitude of A[m] is truncated, whereas Fraden [26] employed \(A{\text{th}}=0.4\max \{A[m]\}\).
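As an illustration, a minimal sketch of this amplitude-threshold step is given below (NumPy assumed; the value of \(\alpha\) and the short synthetic fragment are illustrative choices, not values prescribed by [25, 26]):

```python
import numpy as np

def amplitude_threshold(ecg_segment, alpha=0.3):
    """Return the threshold of Eq. (1) and a mask of samples exceeding it.

    ecg_segment : 1-D array holding one fragment A[m] of the ECG signal.
    alpha       : fraction of the peak amplitude (0 < alpha < 1), e.g. 0.3
                  as in Morizet et al. [25] or 0.4 as in Fraden [26].
    """
    a_th = alpha * np.max(ecg_segment)          # Eq. (1)
    return a_th, ecg_segment > a_th

# Example on a synthetic fragment: only samples above 30% of the peak survive.
segment = np.array([0.05, 0.1, 0.9, 0.2, -0.1, 0.05])
threshold, mask = amplitude_threshold(segment, alpha=0.3)
print(threshold, mask)
```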

The main advantage of this technique is that it involves the least computational load among the existing pre-processing techniques, owing to the short length of the ECG segments processed. The disadvantage is that the length of the processed ECG segments is fixed and determined empirically [25,26,27]. If a longer ECG recording is analyzed, the performance may degrade unless it is divided into shorter segments, but such segmentation may cut through the start or end of ECG beats.

First order derivative

Generally, a first-order differentiator is utilized as a high-pass filter that removes unwanted low-frequency noise and the baseline wander of A[m]. Moreover, it creates zero crossings at the R-peak locations and modifies the phase of the ECG signal. Several algorithms implement the first derivative using the following equation [28]:

$$\begin{aligned} B[m]= -2A[m-2]-A[m-1] + A[m+1] + 2A[m+2] \end{aligned}$$
(2)

Moreover, Holsinger et al. [29] used a central finite-difference approach as:

$$\begin{aligned} B[m]= A[m+1] - A[m-1] \end{aligned}$$
(3)

while Okada et al. used a backward difference scheme [30]:

$$\begin{aligned} B[m]= A[m] - A[m-1] \end{aligned}$$
(4)

An optimal threshold is chosen and applied to B[m] together with the first-order derivative technique for subsequent QRS complex detection. The length of the processed ECG signal and the thresholds applied during the analysis are fixed. The main advantage is that this technique involves little computational load; the disadvantage is that it cannot remove high-frequency noise completely.
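For illustration, the three difference equations above can be realized directly as short convolutions; the sketch below (NumPy assumed) encodes Eqs. (2)–(4) as convolution kernels:

```python
import numpy as np

def derivative_eq2(a):
    """Five-point derivative of Eq. (2): B[m] = -2A[m-2] - A[m-1] + A[m+1] + 2A[m+2]."""
    # np.convolve flips the kernel, so it is written with the most advanced tap first.
    return np.convolve(a, np.array([2, 1, 0, -1, -2]), mode="same")

def derivative_eq3(a):
    """Central difference of Eq. (3): B[m] = A[m+1] - A[m-1]."""
    return np.convolve(a, np.array([1, 0, -1]), mode="same")

def derivative_eq4(a):
    """Backward difference of Eq. (4): B[m] = A[m] - A[m-1]."""
    return np.convolve(a, np.array([1, -1]), mode="same")
```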

First and second derivative

The first and second order derivatives are calculated separately for A[m] (i.e. the input ECG signal) in these QRS enhancement algorithms. The magnitudes of the derivatives are linearly combined to highlight the QRS region with respect to the other ECG features. The first and second order derivatives computed by Balda et al. [31] are defined as:

$$B_{0} [m] = \left| {A[m + 1] - A[m - 1]} \right|$$
(5)
$$B_{1} [m] = \left| {A[m + 2] - 2A[m] + A[m - 2]} \right|$$
(6)

Here, \(B_{0}[m]\) is the first and \(B_{1}[m]\) the second order derivative of A[m]. Both derivatives are then linearly combined as:

$$\begin{aligned} B_{2}[m]= 1.3B_{0}[m]+1.1B_{1}[m] \end{aligned}$$
(7)

In [32], Ahlstrom et al. computed the first derivative as:

$$B_{0} [m] = \left| {A[m + 1] - A[m - 1]} \right|$$
(8)

Thereafter, the rectified first derivative is smoothed as:

$$\begin{aligned} B_{1}[m]= \frac{1}{4}(B_{0}[m-1]+2B_{0}[m]+B_{0}[m+1]) \end{aligned}$$
(9)

A rectified second derivative is then calculated:

$$B_{2} [m] = \left| {A[m + 2] - 2A[m] + A[m - 2]} \right|$$
(10)

Finally, the smoothed rectified first derivative and the rectified second derivative are combined as:

$$\begin{aligned} B_{3}[m]= B_{1}[m]+B_{2}[m] \end{aligned}$$
(11)

These linear combinations of derivatives are followed by a proper threshold criterion for the subsequent QRS detection.

The advantage of this technique is its low computational load, although the load is higher than that of the first-derivative algorithms. The disadvantage is that noise is not reduced significantly. The length of the processed ECG signal and the parameters utilized are fixed. Moreover, the benefit of using several differentiators (i.e. the advantage contributed by each step) for pre-processing is not justified in the literature.
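A minimal sketch of the Balda et al. combination of Eqs. (5)–(7) is given below (NumPy assumed; the zero padding at the segment edges is an implementation choice, not part of [31]):

```python
import numpy as np

def balda_combination(a):
    """Linear combination of rectified first and second differences, Eqs. (5)-(7)."""
    a = np.asarray(a, dtype=float)
    pad = np.pad(a, 2)                       # zero-pad so A[m±1], A[m±2] exist at the edges
    m = np.arange(2, 2 + len(a))
    b0 = np.abs(pad[m + 1] - pad[m - 1])                 # Eq. (5): |A[m+1] - A[m-1]|
    b1 = np.abs(pad[m + 2] - 2 * pad[m] + pad[m - 2])    # Eq. (6): |A[m+2] - 2A[m] + A[m-2]|
    return 1.3 * b0 + 1.1 * b1               # Eq. (7)
```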

Digital filters

In the literature, digital filter techniques have been used efficiently for pre-processing ECG signals [33,34,35,36,37,38,39,40,41,42]. Several algorithms have been implemented to realize complex digital filters [30, 33, 43,44,45,46,47,48,49,50,51,52,53,54]. Among them, the most cited are highlighted here.

Engelse et al. [33] applied a differentiator initially to process the input ECG A[m] as

$$\begin{aligned} B_{0}[m]= A[m]-A[m-4] \end{aligned}$$
(12)

Further, a digital low-pass filter (LPF) is applied to \(B_{0}[m]\) as:

$$\begin{aligned} B_{1}[m]= (B_{0}[m]+4B_{0}[m-1]+6B_{0}[m-2]+4B_{0}[m-3]+B_{0}[m-4]) \end{aligned}$$
(13)

Another technique based on the digital filters has been proposed by Okada [30] in which a three-point moving average filter is used to smoothen A[m] as:

$$\begin{aligned} B_{0}[m]= \frac{1}{4}(A[m-1]+2A[m]+A[m+1]) \end{aligned}$$
(14)

Further, an LPF is applied to \(B_{0}[m]\):

$$\begin{aligned} B_{1}[m]= \frac{1}{2n+1} \sum _{p=m-n}^{m+n} B_{0}[p] \end{aligned}$$
(15)

The input and output of this LPF are subtracted and the difference is squared, to attenuate waves of low amplitude relative to the R peak:

$$\begin{aligned} B_{2}[m]= (B_{0}[m]-B_{1}[m])^{2} \end{aligned}$$
(16)

Thereafter, filtering is applied to this squared difference, which enlarges the QRS area relative to the other ECG features:

$$B_{3} [m] = B_{2} [m]\left\{ {\sum\limits_{{p = m - n}}^{{m + n}} {B_{2} [p]^{2} } } \right\}$$
(17)

Suppappola [55] proposed another digital filter based on the multiplication of backward differences (MOBD) [6, 55, 56], which consists of an AND-combination (i.e. a multiplication) of adjacent derivative values. The MOBD of order M is defined as:

$$\begin{aligned} C[m]= \prod _{p=0}^{M-1}(X[m-p]-X[m-p-1]) \end{aligned}$$
(18)

where C[m] represents the extracted QRS feature signal, in which the QRS complexes can be detected by applying a proper threshold.
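A compact sketch of the MOBD operator of Eq. (18) is given below (NumPy assumed; the order M = 4 is only an illustrative choice):

```python
import numpy as np

def mobd(x, order=4):
    """Multiplication of backward differences (MOBD), Eq. (18).

    C[m] = prod_{p=0}^{order-1} (X[m-p] - X[m-p-1]); the leading samples for
    which the product is not fully defined are left at zero.
    """
    x = np.asarray(x, dtype=float)
    diff = np.diff(x)                 # diff[m] = X[m+1] - X[m]
    c = np.zeros_like(x)
    for m in range(order, len(x)):
        # the factors (X[m-p] - X[m-p-1]) are diff[m-order], ..., diff[m-1]
        c[m] = np.prod(diff[m - order:m])
    return c
```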

Dokur et al. [36] used two different band-pass filters and multiplied the outputs X[m] and Y[m] as:

$$\begin{aligned} C[m]=X[m] \times Y[m] \end{aligned}$$
(19)

where C[m] carries the extracted features of the QRS complex. This procedure assumes that the occurrence of frequency components within the pass-band of each filter characterizes a QRS wave. Here, the AND-combination is executed by the multiplication operation: a QRS event is detected only if the outputs of both band-pass filters are ‘high’ (i.e. the AND combination is ‘true’), and the location of the maximum amplitude marks the R wave.

In fact, Pan and Tompkins [34] applied a band-pass digital filter followed by a derivative to filter the ECG signal and measure its slope. The band-pass filter is realized as a cascade of a low-pass filter (\(B_{1}[m]\)) and a high-pass filter (\(B_{2}[m]\)). The low-pass filter is given by:

$$\begin{aligned} B_{1}[m]= 2B_{1}[m-1]-B_{1}[m-2]+A[m]-2A[m-6]+A[m-12] \end{aligned}$$
(20)

The low-pass filter output is then passed through the high-pass filter:

$$\begin{aligned} B_{2}[m]= 32B_{1}[m-16]-(B_{2}[m-1]+B_{1}[m]-B_{1}[m-32]) \end{aligned}$$
(21)

The band-pass filtered signal (\(B_{2}[m]\)) is then differentiated to emphasize the QRS slope and suppress the remaining baseline wander:

$$\begin{aligned} B_{3}[m]= \frac{1}{8}(-B_{2}[m-2]-2B_{2}[m-1]+2B_{2}[m+1]+B_{2}[m+2]) \end{aligned}$$
(22)

In the Pan–Tompkins scheme, the derivative output is subsequently squared point by point; this squaring operation amplifies large differences far more than small ones, further emphasizing the steep QRS slopes. Among the digital-filter approaches, the MOBD algorithm is particularly suitable for real-time implementation because of its favourable trade-off between computational load and accuracy.

The advantage of the digital-filter-based techniques is that they reduce noise effectively; however, their computational load is higher than that of the derivative-based algorithms. The length of the processed ECG and the parameters utilized are fixed.

Wavelet transform (WT)

The wavelet transform (WT) [57] is a mathematical tool for analyzing non-stationary signals with localization in both time and scale. The continuous wavelet transform (CWT) of a signal x(t) with respect to a mother wavelet \(\psi(t)\), scale a and translation b is defined as:

$$CWT_{(a,b)} = \frac{1}{\sqrt{a}}\int\limits_{-\infty}^{+\infty} x(t)\,\psi^{*}\!\left(\frac{t-b}{a}\right)dt,\quad a> 0$$
(23)

The CWT provides variable resolution in both the time and frequency domains for different frequency bands by using a set of analyzing functions, which is an advantage over the Fourier transform (FT) and the short-time Fourier transform (STFT).

However, the CWT is more redundant than the discrete wavelet transform (DWT), which is obtained by discretizing the scale and translation parameters. The DWT is usually implemented using high-pass and low-pass filters, as shown in Fig. 2. The choice of scale (\(a=2^k\)) and translation (\(b=m(2^k)\)) leads to the dyadic WT (DyWT):

$$W_{f} (2^{k} ,b) = \int\limits_{{ - \infty }}^{\infty } {f(t)\Psi *_{{2^{k} ,b}} (t)\,dt}$$
(24)
$$\Psi _{{2^{k} ,b}} (t) = \frac{1}{{2^{{k/2}} }}\Psi \left( {\frac{{t - b}}{{2^{k} }}} \right)$$
(25)
$$\Psi _{{2^{k} ,b}} (t) = \frac{1}{{2^{{k/2}} }}\Psi \left( {\frac{t}{{2^{k} }} - m} \right)$$
(26)

where k and m are integers. The DyWT is implemented using a dyadic filter bank, in which the filter coefficients are obtained from the mother wavelet function employed for the analysis of non-stationary signals [58,59,60,61,62,63,64,65,66] such as the ECG.

Fig. 2
figure 2

Block diagram of DWT implementation

The choice of the mother wavelet (e.g. Haar, Daubechies, Mexican hat and many more), the length of the processed ECG segment and the wavelet scales vary across the literature [68,69,70,71]. However, the selection of the mother wavelet depends on its similarity to the QRS complex. The ECG signals are divided into 2.4 s and 11 s segments by Ahmed et al. [68] and Xiuyu et al. [69], respectively. Scales from \(2^3\) to \(2^4\) and from \(2^2\) to \(2^4\) are used by Szilagyi [70] and Xu et al. [71], respectively, to detect the QRS complexes. Moreover, the input ECG signal is re-sampled at 250 Hz by Martinez et al. [72].
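For illustration, a dyadic decomposition of this kind can be obtained with the third-party PyWavelets package; in the sketch below, the 'db4' mother wavelet, the 4-level depth, the scales retained and the synthetic input are all illustrative assumptions, not prescriptions from the cited works:

```python
import numpy as np
import pywt   # PyWavelets, a third-party package

fs = 360                                   # MIT-BIH sampling frequency (Hz)
t = np.arange(0, 10, 1 / fs)
ecg = np.sin(2 * np.pi * 1.2 * t)          # placeholder signal; replace with a real ECG record

# 4-level discrete wavelet decomposition (detail scales 2^1 ... 2^4)
cA4, cD4, cD3, cD2, cD1 = pywt.wavedec(ecg, wavelet="db4", level=4)

# Keep only the detail scales assumed to carry most QRS energy (here 2^3 and 2^4)
# and reconstruct an emphasized signal.
kept = [np.zeros_like(cA4), cD4, cD3, np.zeros_like(cD2), np.zeros_like(cD1)]
qrs_emphasized = pywt.waverec(kept, wavelet="db4")
```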

The advantage of the WT is that it improves the signal quality by retaining the coefficients of high amplitude. The disadvantage of this technique is its high computational load.

Matched filters

The matched filter provides an optimal SNR and, more importantly, a symmetrical output pulse waveform. Digital filtering is applied prior to matched filtering [73, 74]. For a filter impulse response of length \(M=128\), the matched filter output is computed as

$$\begin{aligned} q(t)= \sum _{j=0} ^{127} p_j \times r(t-j) \end{aligned}$$
(27)

where q(t) is an output sample and \(r(t-j)\) are the input samples of the matched filter. For every patient, the filter coefficients \(p_j\) are selected to optimize the matched filter impulse response; they are chosen to resemble the bandpass-filtered QRS complex. The dc component of the sampled QRS complex is removed, and the template is windowed and normalized so that the matched filter has unit gain. In essence, the matched filter impulse response is a time-reversed version of a template QRS complex. The length of the processed template is fixed, while the type of filter utilized and the template length are determined empirically. An efficient implementation is available in [75]. The disadvantage of this technique is that its computational load is higher than that of the derivative-based algorithms.
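A sketch of the matched-filtering step of Eq. (27) is given below (NumPy assumed); in practice the QRS template would be extracted from the band-pass-filtered beats of the individual patient:

```python
import numpy as np

def matched_filter(ecg, qrs_template):
    """Correlate the ECG with a zero-mean, energy-normalized QRS template.

    Since the matched filter's impulse response is the time-reversed template,
    filtering the ECG with it is equivalent to cross-correlating with the template.
    """
    h = np.asarray(qrs_template, dtype=float)
    h = h - h.mean()                       # remove the dc component of the template
    h = h / np.linalg.norm(h)              # one possible gain normalization
    return np.convolve(ecg, h[::-1], mode="same")
```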

Filter banks (FB)

A FB typically contains a set of analysis filters that decompose the signal bandwidth into sub-band signals of uniform, constant-width frequency bands. These sub-bands provide time- and frequency-domain information about the input signal from different frequency ranges [76]. The analysis filters \(H_{j}(z)\) bandpass the input ECG signal A(z) [76] to produce the subband signals \(v_{j}(z)\) as:

$$\begin{aligned} v_{j}(z)= H_{j}(z)A(z) \end{aligned}$$
(28)

where \(j=0,\,1, \ldots ,\,N-1\). Since the effective bandwidth of each subband \(v_{j}(z)\) is \(\pi /N\), it can be downsampled by a factor of N to decrease the total rate; the downsampling operation \(N \downarrow\) keeps one sample out of every N. The downsampled signal \(d_{j}(z)\) is given by:

$$\begin{aligned} d_{j}(z)= \frac{1}{N} \sum _{p=0}^{N-1} v_{j}(z^{1/N}X^{p}) \end{aligned}$$
(29)

where \(X=e^{-k(2\pi /N)}.\) The sub-band \(v_{j}(z)\) has a higher sampling rate than \(d_{j}(z)\); filtering is effectively carried out at 1/N of the input rate. This reduces the computational load of filter bank algorithms [76] and is referred to as a polyphase implementation. The sub-bands of interest are combined to form a variety of features that represent the QRS complexes [76]. For example, a sum-of-absolute-values feature can be computed using sub-bands \(j=1, \ldots , 4\) in the range [5.6, 22.5] Hz. Six features (\(a_1\), \(a_2\), \(a_3\), \(a_4\), \(a_5\), and \(a_6\)) are derived from these sub-bands as:

$$a_{1}[m] = \sum\limits_{j=1}^{3}\left|d_{j}(z)\right|,\quad a_{2}[m] = \sum\limits_{j=1}^{4}\left|d_{j}(z)\right|,\quad a_{3}[m] = \sum\limits_{j=2}^{4}\left|d_{j}(z)\right|$$
(30)
$$\begin{aligned} a_{4}[m]= & {} \sum _{j=1}^{3}(d_{j}(z))^2, \quad a_{5}[m]=\sum _{j=1}^{4}(d_{j}(z))^2, \quad a_{6}[m]=\sum _{j=2}^{j=4}(d_{j}(z))^2 \end{aligned}$$
(31)

These features take values proportional to the QRS wave energy. Finally, heuristic beat-detection logic [76] combines several of these features to detect the QRS wave.

This technique significantly increases the SNR, which is an advantage. The computational load of a filter bank depends on four parameters, i.e. the filter length, the transition-band width, the number of sub-bands and the stop-band attenuation, all of which are fixed and determined experimentally [77]. Its computational load is high, exceeding that of the derivative-based techniques. Afonso et al. [78] introduced finite impulse response (FIR) filters of fixed length (32 taps) to decompose the noisy input ECG into eight uniform sub-bands. The sub-band signal in the range 0–12.5 Hz is left unchanged, while in the range 12.5–25 Hz the sub-band signal is removed outside the QRS region, since high-frequency components outside the QRS region are considered noise. The sub-band signals in the remaining six bands (25–100 Hz) are set to zero. The main challenge is the selection of an optimal combination of filter banks to highlight the QRS wave.
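A minimal sketch of the subband feature computation of Eqs. (28)–(30) is shown below (SciPy assumed). The 32-tap FIR length follows Afonso et al. [78], while the number of sub-bands, the filter design method and the plain decimation used for downsampling are illustrative simplifications:

```python
import numpy as np
from scipy.signal import firwin, lfilter

fs = 360                          # MIT-BIH sampling frequency (Hz)
n_bands = 32                      # number N of uniform sub-bands (illustrative)
band_width = (fs / 2) / n_bands   # each sub-band is fs/(2N) Hz wide
numtaps = 32                      # fixed FIR length, as in Afonso et al. [78]

def subband(ecg, j):
    """Band-pass filter the ECG into sub-band j (Eq. 28) and downsample by N (Eq. 29)."""
    low, high = j * band_width, (j + 1) * band_width
    h = firwin(numtaps, [low, high], pass_zero=False, fs=fs)
    v = lfilter(h, 1.0, ecg)      # analysis filter H_j(z)
    return v[::n_bands]           # keep one sample out of every N

def qrs_features(ecg):
    """Sum-of-absolute-value features of Eq. (30) over the low-order sub-bands."""
    d = {j: subband(ecg, j) for j in range(1, 5)}
    a1 = sum(np.abs(d[j]) for j in (1, 2, 3))
    a2 = sum(np.abs(d[j]) for j in (1, 2, 3, 4))
    a3 = sum(np.abs(d[j]) for j in (2, 3, 4))
    return a1, a2, a3
```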

Hilbert transform (HT)

Zhou et al. [79] and Nygards et al. [80] used the Hilbert transform (HT) for QRS detection. In the time domain, the HT of the input signal A(t) is:

$$A_{H}(t) = H\{ A\} = \frac{1}{\pi }\int\limits_{-\infty}^{+\infty} \frac{A(\tau)}{t-\tau}\,d\tau$$
(32)

In the frequency domain, the transform corresponds to filtering the input signal with the frequency response:

$$\begin{aligned} A_{H}(j\omega )= A(j\omega ) \times H(j\omega ) \end{aligned}$$
(33)

where the transfer function of the Hilbert transformer \(H(j\omega )\) is given by (with k denoting the imaginary unit):

$$\begin{aligned} H(j\omega ) = {\left\{ \begin{array}{ll} -k, &{} 0\le \omega<\pi \\ +k, &{} -\pi \le \omega <0 \end{array}\right. } \end{aligned}$$
(34)

The use of the fast Fourier transform (FFT) reduces the computational load of the Hilbert transform. The HT of the ECG signal A[m], denoted \(A_{H}[m]\), is used to compute the signal envelope [80] for band-limited signals, given by:

$$\begin{aligned} B_{e}[m] \approx \sqrt{A^{2}[m]+A^{2}_{H}[m]} \end{aligned}$$
(35)

Further, the envelope [80] can be approximated with less computational load as:

$$B_{e} [m] \approx \left| {A[m]} \right| + \left| {A_{H} [m]} \right|$$
(36)

Thereafter, the envelope is low-pass filtered [80] to eliminate ripples and to avoid ambiguity in peak detection. Moreover, a waveform-adaptive scheme is proposed to remove low-frequency ECG components. Zhou et al. [79] proposed a method to approximate the envelope of the input signal based on the HT, given as:

$$B_{e} [m] \approx \left| {B_{1} [m]} \right| + \left| {B_{2} [m]} \right|$$
(37)

where \(B_{1}[m]\) and \(B_{2}[m]\) are orthogonal filter outputs given as:

$$\begin{aligned} B_{1}[m]= & {} A[m]-A[m-6], \end{aligned}$$
(38)
$$\begin{aligned} B_{2}[m]= & {} A[m]-A[m-2]-A[m-6]-A[m-8] \end{aligned}$$
(39)

Further, noise is removed from the envelope signal \(B_{e}[m]\) by using a four-tap moving-average filter. A few works [81,82,83] have reported applying a first derivative before the HT. Differentiating the ECG modifies the phase and creates zero crossings at the R-peak locations; hence, a transformation that rectifies the phase is required to mark the true R-peak locations. In [84], the output of the Hilbert transform is followed by adaptive Fourier decomposition to enhance the QRS complex in the ECG signal.

This technique involves a high computational load and is not able to reduce noise by itself. The use of the FFT for computing the HT makes the envelope independent of the frame width. In the reported experiments, the moving-average and digital filters are selected empirically, while the length of the processed ECG signal is kept constant.
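The envelope computation of Eqs. (35)–(36) can be sketched with SciPy's FFT-based Hilbert transform as follows; the optional differentiation pre-step follows [81,82,83] and the four-tap smoothing follows [79], while the remaining details are illustrative:

```python
import numpy as np
from scipy.signal import hilbert

def ht_envelope(ecg, differentiate=True):
    """Envelope of the ECG based on the Hilbert transform (Eq. 35)."""
    a = np.asarray(ecg, dtype=float)
    if differentiate:
        a = np.gradient(a)                 # first derivative, as reported in [81-83]
    analytic = hilbert(a)                  # FFT-based analytic signal a + j*H{a}
    envelope = np.sqrt(a**2 + np.imag(analytic)**2)             # Eq. (35)
    # cheaper approximation of Eq. (36): np.abs(a) + np.abs(np.imag(analytic))
    return np.convolve(envelope, np.ones(4) / 4, mode="same")   # 4-tap smoothing, as in [79]
```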

Empirical mode decomposition (EMD)

The EMD technique is widely used for nonlinear and non-stationary signal analysis [85]. It decomposes a signal into a sum of intrinsic mode functions (IMFs) and can also be utilized for adaptive filtering: combining a subset of the IMFs obtained after decomposing the ECG signal yields a signal with a more prominent QRS wave. The EMD is realized through a sifting process, which yields J modes \(w_{p}[m]\) and a residual term g[m] [86,87,88]:

$$\begin{aligned} A[m]= \sum _{p=1}^{J} w_{p}[m]+g[m] \end{aligned}$$
(40)

The various steps involved in EMD algorithm are as follows:

1. Given a signal, set \(w_{p=1}[m]=r[m]\) and initialize the sifting with \(r_k[m]=w_{p}[m]\), \(k=0\).

2. Detect all extrema of the input \(r_{k}[m]\).

3. Calculate the lower and upper envelopes from the minima and maxima using cubic spline interpolation.

4. Compute the mean of the upper and lower envelopes, \(n[m]=\frac{1}{2}(EnvMax[m]+EnvMin[m])\).

5. Extract the detail \(r_{k+1}[m]=r_{k}[m]-n[m]\).

6. If \(r_{k+1}[m]\) is an IMF, go to step 7; otherwise, iterate steps 2–5 on the signal \(r_{k+1}[m]\) with \(k=k+1\). (An IMF satisfies two conditions: (a) the number of extrema and the number of zero crossings are equal or differ by at most one, and (b) the mean of the upper and lower envelopes is zero at every point.)

7. Extract the mode \(w_{p}[m]=r_{k+1}[m]\).

8. Calculate the residual \(g_{p}[m]=r[m]-w_{p}[m]\).

9. The extraction is finished, with \(g[m]=g_{p}[m]\), if \(g_{p}[m]\) has fewer than two extrema; otherwise, the algorithm is iterated from step 1 on the residual \(g_{p}[m]\) with \(p=p+1\).

The length of the processed ECG segment is fixed, and the number of IMFs generated is proportional to this length; the number of IMFs retained is selected empirically. The ensemble empirical mode decomposition (EEMD), an extension of EMD, is also used to pre-process the ECG signal. This technique reduces noise significantly but involves a high computational load.
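In practice the sifting procedure is rarely re-implemented from scratch; the sketch below uses the third-party PyEMD package (an assumption of this example, not a tool used in the cited works), and the choice of which IMFs to combine for QRS emphasis is illustrative:

```python
import numpy as np
from PyEMD import EMD   # third-party package implementing the sifting procedure

def emd_qrs_enhance(ecg, imf_indices=(1, 2)):
    """Decompose the ECG into IMFs (Eq. 40) and sum a few of them to emphasize the QRS.

    imf_indices : indices of the IMFs to combine; the lower-order IMFs carry the
                  high-frequency content where most QRS energy is expected.
    """
    imfs = EMD().emd(np.asarray(ecg, dtype=float))   # rows: w_1[m], w_2[m], ... (last row may be the residual)
    selected = [imfs[i] for i in imf_indices if i < len(imfs)]
    return np.sum(selected, axis=0) if selected else np.zeros_like(ecg)
```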

Mathematical morphology

Chu et al. [89] proposed an enhancement technique based on mathematical morphology for removing the noise associated with the ECG signal, later used by Trahanias et al. [90] for QRS detection. It relies on the operations of dilation and erosion. Assume that u : \(U \rightarrow K\) and \(p : P \rightarrow K\) represent discrete functions, where \(U=\{0,\,1, \ldots ,\,M-1\}\), \(P=\{0,\,1, \ldots ,\,N-1\}\) and K is a set of integers. The erosion of a function u [89] by a function p is defined as:

$$\begin{aligned} (u\ominus p)[n]= \min_{m=0, \ldots ,\,N-1} (u[n+m]-p[m]) \end{aligned}$$
(41)

where p is also referred to as the structuring element and the operation is defined for \(n=0, \ldots , M-N\). The values of (\(u\ominus p\)) never exceed those of u. The dilation of a function u [89] by a function p is defined as:

$$\begin{aligned} (u\oplus p)[n]= \max_{m=0, \ldots ,\,N-1} (u[n-m]+p[m]) \end{aligned}$$
(42)

where in this case \(n=N-1,\,N, \ldots ,\,M-1\). The values of (\(u\oplus p\)) are never smaller than those of u. Additional operations are obtained by combining dilation and erosion: closing (denoted \(\bullet\)) is dilation followed by erosion, while opening (denoted \(\circ\)) is erosion followed by dilation. For a sequence u, opening suppresses positive peaks while closing suppresses negative peaks (pits) with respect to the structuring element p. Chu and Delp [89] used these opening and closing operations [90] to suppress noise as:

$$\begin{aligned} {\tilde{r}}=\frac{[(r\circ p)\bullet p]+[(r\bullet p)\circ p]}{2} \end{aligned}$$
(43)

where p is the structuring element. The feature signal for the QRS wave is then generated as:

$$d = \tilde{r} - \left( {\frac{{[(\tilde{r}\circ p) \bullet p] + [(\tilde{r} \bullet p)\circ p]}}{2}} \right)$$
(44)

Zhang et al. [91] utilized the first derivative (Okada's equation [30]) after multi-scale mathematical morphology filtering to remove baseline drift and artifacts associated with A[m].

In the reported experiments, the lengths of the processed ECG segments are fixed and equal [25, 26, 31,32,33, 92]. A fixed structuring-element length of 3 is used for the analysis of A[m]; this length is determined empirically and is much shorter than the product of the sampling frequency and the length of A[m] [93]. The benefit of the multiplication operations used in the literature [25, 26, 31,32,33, 92] is not discussed. The use of a low-pass filter together with this approach increases the SNR significantly.
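The opening/closing noise suppression of Eq. (43) and the peak-and-valley feature of Eq. (44) can be sketched with SciPy's grayscale morphology operators (assumed available); the flat structuring element of length 3 follows the fixed length reported above:

```python
import numpy as np
from scipy.ndimage import grey_opening, grey_closing

def morphology_filter(ecg, size=3):
    """Noise suppression by averaged opening-closing / closing-opening, Eq. (43)."""
    oc = grey_closing(grey_opening(ecg, size=size), size=size)   # (r o p) . p
    co = grey_opening(grey_closing(ecg, size=size), size=size)   # (r . p) o p
    return (oc + co) / 2.0

def morphology_feature(ecg, size=3):
    """Peak/valley feature of Eq. (44): the residue removed by a second filtering pass."""
    r_tilde = morphology_filter(ecg, size)
    return r_tilde - morphology_filter(r_tilde, size)
```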

Table 2 A brief comparison of various pre-processing techniques

Sparsity filtering

The sparse representation (SR) model approximates a time-domain input signal \(a\in \mathfrak {R}^n\) as \(a\approx D\alpha\). Here, \(D\in \mathfrak {R}^{n\times m}\) is a dictionary matrix and \(\alpha \in \mathfrak {R}^m\) is a vector of coefficients representing the input signal. In SR, the input signal is approximated as a weighted sum of the columns of the dictionary matrix \({{\varvec{D}}}\), known as atoms, with weights given by the coefficients in \(\alpha\). Generally, the dictionary \({{\varvec{D}}}\) is redundant, i.e. the number of atoms in the dictionary is greater than the length of the input signal. The coefficient vector \(\alpha\) is sparse, i.e. it contains only a few non-zero weights. Hence, the SR of the input signal a is built from only a few atoms (those with non-zero weights) of the dictionary matrix \({\varvec{{D}}}\).

In [94], the second- and third-order derivatives of the input signal were used to smooth the ECG signal: the artifacts are reduced by solving a convex \(\ell _1\) optimization problem in which the clean signal is modeled as the sum of two components whose second- and third-order differences, respectively, are sparse. In [95], an \(\ell _1\)-sparsity filter with overcomplete hybrid dictionaries is used to emphasize the QRS complex and suppress baseline drift, power-line interference and large P/T waves. In [96], the input signal is modeled as a superposition of atoms learned from a training set plus additive random noise, in order to remove noise and other artifacts such as baseline wander.

Among all the pre-processing algorithms discussed in this section, none is completely efficient when all kinds of noise are considered [27] in the analysis of ECG signals. The amplitude- and slope-based techniques have a significant advantage in the presence of electromyogram (EMG, i.e. muscle) noise but are sensitive to changes in the baseline of A[m], and their performance degrades when they are applied to composite noise. In contrast, higher performance is reported for the high-pass filtering and cubic-spline approaches used to correct the baseline wander. Filtering out EMG noise is more difficult because its frequency spectrum overlaps that of the QRS wave. As such, filtering-based pre-processing algorithms are sensitive to high-frequency noise but insensitive to baseline changes. Further, the amplitude- and derivative-based techniques involve little computational load, despite being noise sensitive. However, the various stages involved in the amplitude- and derivative-based algorithms are not justified for pre-processing the ECG signal in their validation on the MIT-BIH arrhythmia database; the parameters of these techniques are purely data dependent and may yield varying results when applied to a different database (or to data from different patients). A brief comparison of these pre-processing techniques in terms of computational load and noise sensitivity is summarized in Table 2, from which it is concluded that the amplitude- and derivative-based techniques should be developed further for pre-processing ECG signals. Once A[m] is filtered, it is passed to the detection stage for reliable QRS detection.

QRS detection techniques

The filtered ECG signal is passed to the QRS detection stage. This section presents a brief description of the QRS detection techniques used for localizing the R-peak in the input ECG signal. The detection algorithms include thresholding [50, 117,118,119,120], syntactic methods [121,122,123], neural networks [105, 124,125,126], zero-crossing [127], hidden Markov models [128], matched filters [129, 130], and singularity techniques [131,132,133]. These detection algorithms are also evaluated on the basis of two parameters, i.e. computational load and robustness to noise, as summarized in Table 3.

Thresholding

Thresholds define a boundary above which an R-peak is detected in A[m]; they may be fixed or adaptive depending on the approach employed. Numerous works use thresholds determined experimentally [28, 31, 33, 34] to detect the R-peak. A peak is defined as a local maximum at which the signal changes direction within a pre-defined time interval; to qualify as a signal peak, it should exceed the threshold. The approach is simple, but the choice of an optimal threshold is quite difficult. If the input ECG signal has a high SNR, lower thresholds can be used. Pan and Tompkins [34] improved the SNR using a bandpass filter and applied adaptive thresholds that are allowed to float over the noise. Two thresholds are applied: the higher threshold is used in the first analysis of the signal, while the lower threshold is used when no QRS complex is detected within a certain time, triggering a search-back procedure that looks for QRS complexes back in time. The main advantage of this approach is that it involves the least computational load among the detection techniques considered. However, the method requires the specification and adjustment of numerous parameters, and achieving adequate detection accuracy remains a challenge.
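A simplified sketch of such a dual adaptive-threshold rule is given below (NumPy and SciPy assumed). The update weights follow the spirit of Pan and Tompkins [34], but the initialization constants are illustrative, and the search-back with the lower threshold is omitted for brevity:

```python
import numpy as np
from scipy.signal import find_peaks

def adaptive_threshold_detect(feature, fs=360):
    """Detect R-peaks in a pre-processed feature signal using an adaptive threshold.

    feature : output of the pre-processing stage (e.g. integrated, squared slope).
    """
    peaks, _ = find_peaks(feature, distance=int(0.2 * fs))        # 200 ms refractory period
    spki, npki = 0.25 * np.max(feature), 0.5 * np.mean(feature)   # running signal/noise levels
    threshold = npki + 0.25 * (spki - npki)
    qrs = []
    for p in peaks:
        if feature[p] > threshold:                    # classified as a signal (QRS) peak
            spki = 0.125 * feature[p] + 0.875 * spki
            qrs.append(p)
        else:                                         # classified as a noise peak
            npki = 0.125 * feature[p] + 0.875 * npki
        threshold = npki + 0.25 * (spki - npki)       # update the floating threshold
    return np.array(qrs)
```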

Neural networks (NN’s)

Neural networks have the ability to learn patterns in response to new input patterns. These learning and self-organizing abilities are appropriate for QRS-wave recognition [134], because the QRS wave changes its shape according to the patient's physical condition. Suzuki et al. [135] used an ART2 (adaptive resonance theory) network in a self-organizing neural-network system to detect the QRS complex. In this approach, the type of neural network must be selected and modified during the analysis. The architecture of an ART2 network is shown in Fig. 3, where LTM is the long-term memory, F1 and F2 are layers connecting the neurons, and \(w_i\), \(x_i\), \(u_i\), \(v_i\), \(p_i\) and \(q_i\) are the nodes that characterize the F1 layer. A neural network with N inputs is developed, where a fixed number of samples taken from a window is fed to each input [136]. Garcia-Berdones et al. [136] utilized 20 samples as input, emphasizing that the input to the network should be chosen within a limited range of samples. In the hidden layer, the choice of the optimal number of neurons is difficult and is determined empirically.

The disadvantage of this technique is that it involves a high computational load and is highly noise sensitive. Its average accuracy is also lower than that of the thresholding-based techniques.

Fig. 3
figure 3

Neural network models for QRS detection

Hidden Markov model (HMM’s)

A hidden Markov model (HMM) characterizes an observed data sequence by a probability density function that varies according to the state of an underlying Markov chain. In this approach, the output functions, the number of states and the transition probabilities are determined empirically. The HMM parameters cannot be computed directly from the training data because the underlying state sequence that produced the data remains unknown; instead, they are estimated iteratively with maximum-likelihood methods [137]. A hidden Markov model is depicted in Fig. 4, where \(q_1,\,q_2,\,q_3, \ldots ,\,q_6\) are the states. The model consists of two Markov sub-sources, one for non-QRS segments and one for QRS segments.

The advantage of this technique is that it provides automatic estimation of all the parameters of the decision rule from each ECG record undergoing analysis. However, the search for the parameters involves a huge computational load. The reported results show that a simple HMM detector achieves an accuracy very close to that of adaptive-threshold-based detection techniques.

Fig. 4
figure 4

HMM model for QRS detection [137]

Syntactic techniques

The syntactic approach is applied after a digital derivative operator [121] and utilizes a very simple look-up table for coding. The sequences of energy peaks of the derivatives of the ECG waveforms corresponding to the different leads are coded into strings of messages. For each lead waveform, the strings corresponding to QRS complexes are considered positive samples and processed by a grammatical inference algorithm; analogously, the strings corresponding to non-QRS complexes are saved as negative samples, to be eventually used in a further generalization step. Consequently, two grammars are built [121]: the first generates only positive sequences (corresponding to QRS events), while the second generates sequences corresponding to hypotheses that may or may not correspond to a QRS complex. The learning algorithm infers linear grammars based upon formal derivatives.

The syntactic method enables detection of the QRS wave of an ECG signal on its own [121,122,123]. The lengths of the processed ECG fragments are uniform throughout the analysis; Belforte et al. [121] used 30-s segments. The syntactic method of [122] utilizes four attributes, i.e. the chord length, arc symmetry, arc length and degree of curvature, which are computed empirically.

The disadvantage of this technique is that it involves a high computational load and is highly noise sensitive. However, it yields an accuracy comparable to that of the other detection techniques.

Singularity methods

Most of the information in the ECG signal is carried by its irregular morphology and its singular (fiducial) points. In mathematics, a singularity is often considered the opposite of smoothness and can be measured by the Lipschitz exponent. When the mother wavelet is the nth derivative of a smoothing function, the singular points can be detected as the modulus maxima of the wavelet transform. In this approach, thresholding is applied to the individual modulus maxima of the WT to reduce the white noise in the ECG signal. The wavelet scales used to search for the singular points are chosen experimentally [138, 139]. The thresholds are constant per ECG fragment [138] and computed empirically.

The disadvantage of this technique is that it involves a high computational load, due to the search for singular points, and is highly noise sensitive. However, it yields an accuracy (99.22% [139]) that approaches that of the thresholding-based techniques.

Zero-crossing (ZC)

Zero-crossing methods are robust against noise and are particularly suitable for finite-precision arithmetic, providing a high degree of detection performance even in very noisy ECG signals. In this technique, the beginning of an event is identified when a feature of the signal (the number of zero crossings per segment) falls below a signal-adaptive threshold, and the end is identified when the feature rises above the threshold again [127, 140]. This beginning and end determine the boundaries of the search interval for the temporal localization of the R wave. If adjacent events are temporally very close (multiple events), they are combined into a single event whose beginning is that of the first event and whose end is that of the last event. The threshold per segment employed for counting the zero crossings is fixed [127] and calculated empirically. In the literature [127, 141], the search for zero crossings depends on the choice of the wavelet scale.

The disadvantage of this technique is that it involves a high computational load and is highly noise sensitive. However, it yields an accuracy (99.70%) approximately equal to that achieved by the thresholding-based techniques.

Table 3 A brief comparison of QRS detection techniques described in literature

The different algorithms involved in the detection of QRS complexes are summarized in Table 3. Among all the algorithms presented, the thresholding approach involves the lowest computational load. Since this study aims to highlight the development of efficient algorithms for a robust and reliable QRS detector, the threshold-based technique is suggested because of its simplicity and efficiency. Thresholds have been used both in the time domain [25, 26, 153] and in the time–frequency domain [154,155,156]. Two types of thresholds are used to detect QRS complexes: fixed and adaptive. A fixed threshold is simple and efficient only for stationary input signals with similar morphologies; patient movement, baseline drift and variation in the morphology of the ECG signal result in highly inaccurate QRS detection with fixed thresholds. The use of adaptive thresholding [71, 157,158,159] increases the correct detection of QRS complexes; however, the adjustment of multiple thresholds chosen empirically is a drawback. Most of the QRS detection techniques presented in Table 3 perform well on clean or filtered signals, but their performance degrades in noisy environments or on signals containing arrhythmias. Therefore, these QRS detection techniques fall short of providing a generalized solution.

Performance evaluation and discussion

The R-peak detection is followed by the performance evaluation of the corresponding algorithms. The performance of the various algorithms discussed in the earlier sections is estimated on the basis of two statistical parameters, i.e. sensitivity and positive predictivity. Sensitivity is the proportion of actual events that are correctly detected by the algorithm, while positive predictivity is the proportion of detected events that are true events. They are given by:

$$\begin{aligned} Sensitivity\,(S_e)&= \frac{TP}{TP + FN} \times 100 \end{aligned}$$
(45a)
$$\begin{aligned} Positive\,predictivity\,(P_p)&= \frac{TP}{TP + FP} \times 100 \end{aligned}$$
(45b)

where TP (true positives) is the number of correctly classified events of a particular class, FN (false negatives) is the number of events of that class that were not detected, and FP (false positives) is the number of events of another class wrongly detected as belonging to the class under consideration. The overall performance of the existing QRS detection algorithms has not previously been analyzed with respect to computational load and noise sensitivity. Further, a standard database is not always used for testing these algorithms, which makes them difficult to evaluate and compare, i.e. some works utilized different databases or patient signals; this motivates the development of an efficient QRS detector algorithm. An algorithm or technique can be termed efficient if it combines a low computational load, evaluation on a common standard dataset and high accuracy. The QRS detection algorithms reporting high classification performance, together with their computational load and the other relevant factors, are therefore summarized in Table 4 and discussed subsequently with a view to developing a fast and robust QRS detector.
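As a concrete illustration, the two metrics of Eq. (45) can be computed from beat-by-beat counts as follows (a trivial sketch that assumes the matching of detected beats against the reference annotations has already been performed; the counts used are illustrative, not results from any cited work):

```python
def detection_metrics(tp, fn, fp):
    """Sensitivity and positive predictivity of Eq. (45), in percent."""
    sensitivity = 100.0 * tp / (tp + fn)
    positive_predictivity = 100.0 * tp / (tp + fp)
    return sensitivity, positive_predictivity

# Example with illustrative counts: 109,208 correctly detected beats,
# 300 missed beats and 250 false detections.
se, pp = detection_metrics(tp=109208, fn=300, fp=250)
print(f"Se = {se:.2f}%, +P = {pp:.2f}%")
```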

Table 4 Overall performance analysis of QRS detection algorithms

A high overall performance is reported by Li et al. [160] (with records 214 and 215 excluded), who achieved a sensitivity of 99.89% and a specificity of 99.94% on the benchmark MIT-BIH arrhythmia database [22]. The features of the different waves are extracted using a wavelet-based approach and classified using a singularity technique. However, this technique involves a high computational load and hence cannot be considered superior in overall terms. Moreover, the experiments exclude some records of the MIT-BIH arrhythmia database [22] to reduce the noise in the processed ECG signals, and the improved QRS detection performance is reported on this reduced set. Several other investigators likewise excluded paced beats [139] or ventricular flutter beats [72] from the patient data. Such evaluation of algorithms, based on variable subsets of the data, cannot be justified. Thus, a reliable algorithm is needed that yields good overall performance on the complete dataset (i.e. without excluding any fragments of the ECG).

In Table 4, each QRS detection algorithm is categorized as low, medium or high in terms of its computational load. The computational load is determined by counting the total number of operations involved (additions, multiplications and differentiations) and the number of iterations; algorithms with more operations are categorized as high, while algorithms with fewer operations are termed low. Algorithms with a low computational load are faster, and a faster algorithm is more suitable for hardware implementation and real-time monitoring of ECG signals. Table 4 shows that the algorithms of Christov [157], Chiarugi et al. [162] and Elgendi [166] involve a low computational load. In the pre-processing stage, the application of a first-order derivative is promising, particularly if it is followed by a suitable detection stage [167] such as a dynamic and/or moving-average threshold. The computational load of a first-order derivative is O(m), i.e. for an ECG record of length m, a number of operations linear in m is required; a second-order derivative requires an additional O(m) operations. In fact, the sole application of a first-order derivative in the pre-processing stage is noise sensitive and must therefore be followed by an efficient detection scheme [27]. The first- and second-order derivative schemes are slower than the amplitude-based schemes in the pre-processing stage; however, a faster (or simpler) technique alone cannot be considered efficient for QRS detection.

Prior to the development of a fast and robust QRS detector, these algorithms are evaluated on the performance parameters mentioned in the earlier sections, namely noise sensitivity, computational load and accuracy. In addition, such algorithms need to be implemented on suitable hardware platforms such as microcontrollers or field-programmable gate arrays (FPGAs). The processing speed of the algorithms depends on the operating frequency of these hardware platforms: the higher the operating frequency, the faster the processing. Some works report the use of mobile phones [24] to evaluate the performance of three QRS detection techniques, where the pre-processing stage consists of the amplitude, first- and second-order derivative algorithms and the detection stage consists of thresholding only. The simplicity of this combination can be confirmed from Table 4 in terms of computational load. From Table 4, it is concluded that the combination of a first derivative with a threshold can be considered efficient, in terms of computational load, for detecting QRS complexes.

When processing ECG signals, power consumption [166] is a limitation in battery-operated devices. The classical Pan–Tompkins technique [34] is an example with significant power consumption [167], even though it uses a first-order derivative. The total computational load of the Pan–Tompkins algorithm is O(mkn), where ‘n’ is the number of stages through which the ECG signal is passed, ‘k’ is the order of the individual filters (here 1) and m is the length of the ECG signal; when n and k are very small compared to m, the total complexity is O(m). Because of the larger number of stages involved, the Pan–Tompkins algorithm requires more power for the detection of QRS complexes. In this study, the standard Pan–Tompkins algorithm is suggested as a ready-made solution that can be implemented on suitable hardware platforms to develop an efficient QRS detector. The experiments are validated on the benchmark MIT-BIH arrhythmia database and performed using the MATLAB software package on an Intel Core i5 CPU at 3.30 GHz with 4.00 GB of RAM. The different stages involved in the Pan–Tompkins algorithm are depicted in Fig. 5.

Fig. 5
figure 5

Plot of R-peak detection within B[m] using Pan–Tompkins algorithm [34]

The complete analysis of the QRS detection algorithms with respect to noise sensitivity, computational load and accuracy is presented in this study prior to their implementation. Algorithms employed in real-time analysis should be simple (in terms of computational load) without suffering degraded accuracy: if the algorithm is simple, larger databases can be processed faster with less hardware, leading to lower power consumption and reduced cost. It is also suggested to process the input data at higher operating frequencies, which helps to process larger databases in less time. From Table 4, it can be concluded that the combination of a first derivative and a threshold is efficient if developed properly. Moreover, the Pan–Tompkins algorithm can be considered a complete ready-made solution for efficient QRS detection that satisfies all the factors of noise sensitivity, computational load and accuracy, and it is evaluated on all the records (without excluding any segment) of the MIT-BIH arrhythmia database. An efficient QRS detector can be integrated with feature extraction and classification algorithms for arrhythmia classification [12, 168, 169]. Moreover, a fast and robust detector can readily be employed for the study of breathing disorders and various other cardiac disorders, to enhance the lifestyle of patients with CVDs.

Despite the several algorithms reported in the literature, their clinical utility is rarely discussed, and it is difficult to find an algorithm whose significance and utility are stated from a clinical point of view. To the best of the authors' knowledge, none of the discussed algorithms has been implemented and verified in a clinical environment or hospital. Hence, it is also suggested that new algorithms developed for a robust and reliable QRS detector, based on the factors mentioned in the aim of this study, should be implemented and verified in a clinical environment.

Conclusion

This article presents a study of QRS complex detection algorithms reported in the literature, to identify the best-suited algorithm for cardiac analysis based on factors such as robustness to noise, computational load and sensitivity. For pre-processing the raw ECG, the first-order derivative is suggested because it involves little computational load and yields high accuracy. However, this approach is noise sensitive and should therefore be followed by a suitable detection algorithm such as adaptive thresholding. Both techniques can be developed further for detecting the QRS waves, because their combination involves little computational load and achieves an accuracy suitable for real-time applications. The classical Pan–Tompkins approach is also a good ready-made alternative, employed in most works on arrhythmia classification and implemented in this study. QRS detectors based on the suggested algorithms can be helpful for detecting several cardiac disorders and thereby support a healthy and secure lifestyle.