1 Introduction

1.1 ECG data overview

Electrocardiography (ECG) is a noninvasive technique for recording the heart’s electrical activity. Cardiac activity measurement is now one of the most important indicators of a patient's structural and functional heart state. ECG signals are generally pseudo-periodic and exhibit five characteristic waves on the electrocardiograph trace (P-QRS-T). Analyzing this signal and identifying its important features is an essential step toward a reliable diagnosis. However, during recording in a clinical environment, ECG signals are corrupted by various noise sources such as Baseline Wandering (BLW), Power Line Interference (PLI), Motion Artifact (MA), muscular noise (EMG), White Gaussian Noise (WGN) and other artifacts [8], which can severely degrade the visual quality of the ECG. The denoising step is therefore very important in any ECG signal processing system. It consists of extracting the clinical information from the noisy ECG signal without losing valid information.

1.2 Survey of related work on ECG signal denoising

Signal denoising remains one of the most common problems in biomedical signal processing. Various ECG denoising algorithms have been reported in the literature, starting with conventional approaches based on Finite Impulse Response (FIR) and Infinite Impulse Response (IIR) filters [32]. However, filters with fixed coefficients are difficult to apply because the ECG signal is non-stationary and its spectrum overlaps the noise frequency band. Adaptive filtering methods based on Least Mean Square (LMS) and Recursive Least Squares (RLS) filters [27], and some of their variants, have therefore been proposed to filter out different types of noise. The principal limitation of these filters is that they require an externally recorded reference signal that represents the noise source. In [34], an efficient framework based on nonlinear Bayesian filters was developed to enhance realistic synthetic ECG signals.

Statistical approaches, including Independent Component Analysis (ICA) [16] and Principal Component Analysis (PCA) [7], have also been suggested to reduce noise in ECG signals. Nevertheless, these methods require multichannel data. In particular, one difficulty in applying ICA is correctly identifying the noise sources to be eliminated.

During the last two decades, filtering in transform domains, such as the Wavelet Transform (WT) [1] and its variants, including the Stationary Wavelet Transform (SWT) [10] and the Bionic Wavelet Transform (BWT) [35], has become more popular and has achieved better performance, owing to its efficiency in processing non-stationary signals. Conventionally, it is carried out by applying traditional soft or hard shrinkage to the detail coefficients [9]. However, classical wavelet filtering can distort the shape of the filtered ECG signal. In addition, wavelet denoising schemes are sensitive to several design choices: the mother wavelet, the decomposition level, the thresholding function and the threshold computation rule.

In addition to being non-stationary, the ECG signal is nonlinear. Therefore, Empirical Mode Decomposition (EMD), a method developed for analyzing non-stationary and nonlinear data by decomposing them into Intrinsic Mode Functions (IMFs), has attracted many researchers in the field of ECG signal enhancement [38]. Other studies exploit improved EMD versions, such as Ensemble EMD (EEMD) [6, 11], and hybrid schemes, including EMD-wavelet thresholding [19] and EMD/EEMD combined with adaptive filters [37], to eliminate WGN by discarding or filtering the first few IMFs. However, the denoising capability of these algorithms rests on the assumption that the noise is distributed over some predefined IMFs. Recently, Nguyen et al. [30] proposed an approach based on Complete Ensemble EMD with Adaptive Noise (CEEMDAN) and the Kullback-Leibler divergence similarity measure, with a genetic algorithm to select the optimal threshold parameters. Nevertheless, the use of classical thresholding methods may introduce distortions in the QRS patterns.

Although the above-mentioned methods, based on the EMD algorithm or its variants, have been shown to outperform conventional ECG filtering methods, discriminating between noisy IMFs and signal-dominant IMFs remains an unresolved problem in the EMD domain. To deal with this problem, we propose in this work a novel scheme for ECG signal denoising based on CEEMDAN and Higher Order Statistics (HOS). The key idea of this paper is to use the fourth-order cumulant (i.e. kurtosis) to classify the IMFs into three groups, referred to as Higher Frequency Noisy (HFN) IMFs, Lower Frequency Noisy (LFN) IMFs and Noiseless Relevant (NR) IMFs. Then, we adaptively apply a novel shrinkage method based on modified interval thresholding to each noisy IMF. Finally, the denoised ECG signal is obtained by summing the thresholded IMFs and the unfiltered ones. The performance of the proposed method is evaluated using different real ECG signals corrupted by additive white Gaussian noise. The obtained results indicate that the proposed approach outperforms some methods reported in the literature.

This paper is organized in five sections. Section 2 gives a brief review of the EMD method and its variants and introduces the problem of EMD-based denoising. Section 3 explains the details of our proposed approach. The simulation results and a comparison with some existing methods are presented in Section 4. The paper concludes with some final remarks and perspectives in Section 5.

2 Materials and methods

This section is divided into two sub-sections: the conventional EMD method and its extension techniques are briefly described in the first sub-section, while in the second sub-section, we explain the classical problem of EMD based denoising.

2.1 Overviews of EMD, EEMD and CEEMDAN algorithms

2.1.1 Empirical mode decomposition

The Empirical Mode Decomposition (EMD) method was introduced by Huang et al. [17] as a tool for analyzing non-stationary and nonlinear time series. In contrast with traditional methods such as the Fourier Transform and the Wavelet Transform, the main advantage of EMD is that it is a data-driven approach that requires neither prior knowledge nor a predefined basis for the decomposition. Briefly, EMD decomposes any complex signal x(t) into oscillatory components modulated in both amplitude and frequency (AM-FM), called Intrinsic Mode Functions (IMFs). These modes are ordered from high to low frequency (i.e. from lower to higher scale). By definition, each extracted IMF must satisfy two conditions: (1) the number of extrema and the number of zero crossings must be equal or differ by at most one; (2) the mean of the upper envelope defined by the local maxima and the lower envelope defined by the local minima must be zero. The IMFs are obtained by an iterative algorithm called the sifting process [17]. Finally, the original signal x(t) can be reconstructed by a linear superposition of all its components as follows

$$ x(t)=\sum\limits_{i = 1}^{N}IMF_{i}(t)+r_{N}(t) $$
(1)

where N is the number of IMFs, IMFi(t) is the ith IMF and rN(t) is the residue signal.
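The first IMF condition can be checked numerically. The following is a minimal sketch in standard-library Python, not the sifting process itself; the function names and the two synthetic test signals are illustrative assumptions, and the second (envelope-mean) condition is not tested here.

```python
import math

def count_zero_crossings(x):
    """Count sign changes between consecutive non-zero samples."""
    signs = [1 if v > 0 else -1 for v in x if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def count_extrema(x):
    """Count strict interior local maxima and minima."""
    return sum(
        1
        for i in range(1, len(x) - 1)
        if (x[i] > x[i - 1] and x[i] > x[i + 1])
        or (x[i] < x[i - 1] and x[i] < x[i + 1])
    )

def meets_first_imf_condition(x):
    """Condition 1: extrema and zero-crossing counts differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1

# Illustrative synthetic signals (not ECG data).
pure_tone = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
offset_tone = [2.0 + v for v in pure_tone]  # same oscillation riding on an offset

print(meets_first_imf_condition(pure_tone))    # zero-mean oscillation
print(meets_first_imf_condition(offset_tone))  # never crosses zero
```

A zero-mean tone passes the test, whereas the same tone shifted away from zero has extrema but no zero crossings and therefore cannot be an IMF.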

2.1.2 Ensemble empirical mode decomposition

EMD has been successfully applied in different fields such as speech enhancement [21, 22], audio watermarking [12, 20] and ECG signal denoising [4]. However, it suffers from the mode mixing phenomenon, which appears when a single IMF contains two modes or, conversely, when a single mode is spread over several IMFs. To avoid this problem, Wu and Huang proposed the Ensemble Empirical Mode Decomposition (EEMD) [41], following a study of the statistical characteristics of white noise [40]. This method performs EMD on many noisy versions of the original signal, each obtained by adding white Gaussian noise. The general aim is to make EMD behave as a dyadic filter bank, as it does for realizations of Gaussian noise [33]. Because the added noise is canceled by ensemble averaging, it resolves the mode mixing problem provided that the number of realizations is sufficiently large. The method can be summarized in the following four steps:

  1. Add a white noise series n(t) to the target data x(t): xn(t) = x(t) + n(t);

  2. Decompose the noisy version xn(t) into IMFs using the EMD algorithm;

  3. Repeat steps 1 and 2 for Nt realizations, using a different white noise series in each trial, to obtain \(IM{F^{i}_{k}}(t)\), where \(IM{F^{i}_{k}}(t)\) is the kth mode of the ith trial;

  4. Obtain the final IMFs by averaging the ensemble of IMFs over all trials as follows:

    $$ IMF_{k}(t)=\frac{1}{Nt}\sum\limits_{i = 1}^{Nt}IM{F^{i}_{k}}(t) $$
    (2)

    where Nt is the number of trials.
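The four steps above can be sketched as a short averaging loop. This is a minimal illustration under stated assumptions: `emd` is a hypothetical callable (not implemented here) that returns a list of modes, `toy_emd` is a trivial stand-in used only to exercise the loop, and the parameter defaults are arbitrary.

```python
import math
import random

def eemd(x, emd, n_trials=100, noise_std=0.2, seed=0):
    """Average the modes returned by `emd` over noisy copies of x (Eq. 2).

    `emd` is a hypothetical callable mapping a list of samples to a list of
    modes of the same length. Every trial must return the same number of modes.
    """
    rng = random.Random(seed)
    acc = None
    for _ in range(n_trials):
        noisy = [v + rng.gauss(0.0, noise_std) for v in x]  # step 1
        modes = emd(noisy)                                   # step 2
        if acc is None:
            acc = [[0.0] * len(x) for _ in modes]
        if len(modes) != len(acc):
            raise ValueError("trials produced different numbers of modes")
        for k, mode in enumerate(modes):                     # step 3: accumulate
            for t, v in enumerate(mode):
                acc[k][t] += v
    return [[v / n_trials for v in mode] for mode in acc]    # step 4: average

def toy_emd(s):
    """Stand-in decomposer (NOT real EMD): splits the signal into two halves."""
    return [[0.5 * v for v in s], [0.5 * v for v in s]]

x = [math.sin(0.2 * t) for t in range(50)]
modes = eemd(x, toy_emd, n_trials=400, noise_std=0.2, seed=1)
```

With the stand-in decomposer, each averaged mode stays close to half of the clean signal, which illustrates how the added noise cancels as the number of trials grows. The explicit length check makes visible the practical difficulty that arises when trials yield different numbers of modes.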

Although the EEMD algorithm has proved useful in a wide range of signal processing applications, it suffers from some shortcomings. First, the resulting IMFs generally contain residual noise; as a consequence, the sum of the IMFs differs from the original signal. Second, different noise realizations can produce different numbers of modes, which makes the averaging operation difficult. To resolve these problems, a complete EMD variant called Complete Ensemble EMD with Adaptive Noise (CEEMDAN) was proposed [39]. The details of this method are given below.

2.1.3 Complete ensemble EMD with adaptive noise

The complete procedure of CEEMDAN algorithm is described as follows [39]

  1. Generate Nt noisy versions of the original signal as xi(t) = x(t) + 𝜖0ni(t), where ni(t) is a zero-mean, unit-variance white Gaussian noise N(0,1) and 𝜖0 is the noise amplitude;

  2. Obtain the first mode IMF1(t), as in the EEMD method, by

    $$\begin{array}{@{}rcl@{}} IMF_{1}(t)=\frac{1}{Nt}\sum\limits_{i = 1}^{Nt}IM{F_{1}^{i}}(t) \end{array} $$
    (3)

  3. At the first stage (k = 1), compute the first residue signal: r1(t) = x(t) − IMF1(t);

  4. Apply EMD to the set of signals obtained by adding different realizations of white noise to r1(t), keep the first mode of each, and obtain the second mode IMF2(t) by averaging as

    $$ IMF_{2}(t)=\frac{1}{Nt}\sum\limits_{i = 1}^{Nt}E_{1}(r_{1}(t)+\epsilon_{1}E_{1}(n_{i}(t))) $$
    (4)

    where Ej(.) is the operator that produces the jth mode obtained by EMD;

  5. For k = 2,..., K, calculate the kth residue rk(t) = rk−1(t) − IMFk(t), decompose each realization rk(t) + 𝜖kEk(ni(t)) by EMD until its first mode is obtained, and define the (k + 1)th mode as

    $$ IMF_{k + 1}(t)=\frac{1}{Nt}\sum\limits_{i = 1}^{Nt}E_{1}(r_{k}(t)+\epsilon_{k}E_{k}(n_{i}(t))) $$
    (5)

  6. Repeat the last step until the obtained residue is a monotone function. The final residue is then defined as

    $$ R(t)=x(t)-\sum\limits_{k = 1}^{K}IMF_{k}(t) $$
    (6)

  7. The signal x(t) can therefore be reconstructed as

    $$ x(t)=\sum\limits_{k = 1}^{K}IMF_{k}(t)+R(t) $$
    (7)

    where K is the final number of modes.
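The procedure above can be outlined as a loop that peels off one averaged mode per stage and keeps the residue consistent with Eq. (6). The sketch below is only a skeleton under stated assumptions: `emd` is a hypothetical callable returning a non-empty list of modes, `toy_emd` is a crude stand-in (not real EMD) used to exercise the loop, a single noise amplitude `eps` replaces the stage-dependent 𝜖k, and `max_modes` is an added safety cap.

```python
import math
import random

def is_monotone(x):
    """True if the sequence never increases or never decreases."""
    diffs = [b - a for a, b in zip(x, x[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)

def ceemdan(x, emd, n_trials=50, eps=0.2, max_modes=8, seed=0):
    """CEEMDAN skeleton; returns (imfs, residue) with x = sum(imfs) + residue."""
    rng = random.Random(seed)
    n = len(x)
    noises = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_trials)]
    noise_modes = [emd(w) for w in noises]       # E_k(n_i) for every available k
    imfs, residue = [], list(x)
    for k in range(max_modes):                   # stage k produces IMF_{k+1}
        if is_monotone(residue):
            break
        acc = [0.0] * n
        for w, w_modes in zip(noises, noise_modes):
            if k == 0:
                pert = w                         # first stage adds the raw noise
            elif k - 1 < len(w_modes):
                pert = w_modes[k - 1]            # E_k(n_i), as in Eqs. (4)-(5)
            else:
                pert = [0.0] * n                 # this noise has no kth mode
            first = emd([r + eps * p for r, p in zip(residue, pert)])[0]
            acc = [a + f for a, f in zip(acc, first)]
        imf = [a / n_trials for a in acc]
        imfs.append(imf)
        residue = [r - m for r, m in zip(residue, imf)]
    return imfs, residue

def toy_emd(s):
    """Stand-in splitter (NOT real EMD): 3-point detail and smooth remainder."""
    n = len(s)
    smooth = [(s[max(t - 1, 0)] + s[t] + s[min(t + 1, n - 1)]) / 3.0
              for t in range(n)]
    return [[a - b for a, b in zip(s, smooth)], smooth]

x = [math.sin(0.3 * t) + 0.5 * math.sin(2.0 * t) for t in range(64)]
imfs, residue = ceemdan(x, toy_emd, n_trials=10)
rebuilt = [sum(m[t] for m in imfs) + residue[t] for t in range(len(x))]
```

Because the residue is updated by subtracting each averaged mode, the sum of the modes and the final residue reproduces the input up to rounding error, whatever decomposer is plugged in. This is the completeness property that distinguishes CEEMDAN from EEMD.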

To illustrate the effectiveness of this method, Fig. 1 depicts the IMFs obtained by EMD, EEMD and CEEMDAN, respectively, for the ECG record 123.dat taken from the MIT-BIH arrhythmia database (MITDB). The reconstruction error between the ECG signal and the sum of its IMFs for each method is illustrated in Fig. 2. These figures show that CEEMDAN gives a good mode separation and a negligible reconstruction error compared with the EMD and EEMD algorithms. They also demonstrate the incomplete decomposition of the EEMD algorithm. Therefore, to deal with the mode mixing produced by EMD (IMF5 and IMF6 in Fig. 1b), we adopt CEEMDAN as the decomposition best suited to the ECG denoising algorithm discussed in the next section.

Fig. 1

ECG signal decomposition using EMD, EEMD and CEEMDAN algorithms and their corresponding IMFs (for 200 white noise realizations with a 0.2 noise standard deviation)

Fig. 2

Error of reconstruction using EMD, EEMD and CEEMDAN algorithms for 123.dat ECG signal decomposition (for 200 white noise realizations with a 0.2 noise standard deviation)

2.2 Problem statement of EMD/EEMD/CEEMDAN based denoising

2.2.1 Problem formulation

Noise removal is one of the most commonly encountered problems in signal processing applications. In ECG signal processing, consider a noiseless ECG signal x(t) corrupted by a random process n(t), termed noise. The resulting noisy ECG signal y(t) is expressed as

$$ y(t)=x(t) + n(t) $$
(8)

The ultimate goal of any ECG filtering method is to estimate, from the observed signal y(t), a denoised version \(\tilde {x}(t)\) of x(t), while generally meeting three objectives:

  1. Improvement of the Signal to Noise Ratio (SNR, in dB) after denoising;

  2. Minimization of the reconstruction error between the desired signal x(t) and the estimate \(\tilde {x}(t)\);

  3. Preservation of the ECG signal features, especially the QRS complex.
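The noise model of (8) and the first objective can be made concrete with a short sketch. It is an illustration under stated assumptions: the helper names are hypothetical, the test signal is a synthetic sinusoid rather than an ECG record, and the noise is rescaled so that the realized (not only the expected) SNR equals the target.

```python
import math
import random

def mean_power(x):
    """Average squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)

def add_wgn(x, snr_db, seed=0):
    """Return y = x + n with white Gaussian n scaled to the requested SNR."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    target_noise_power = mean_power(x) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    return [xv + scale * nv for xv, nv in zip(x, noise)]

def snr_db(clean, estimate):
    """10*log10(||clean||^2 / ||clean - estimate||^2); error must be non-zero."""
    err = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10(sum(c * c for c in clean) / err)

x = [math.sin(2 * math.pi * t / 50) for t in range(500)]  # synthetic stand-in
y = add_wgn(x, snr_db=5.0)
snr_in = snr_db(x, y)
```

A denoiser meets the first objective when `snr_db(x, x_denoised)` exceeds `snr_in`; the second objective corresponds to the denominator of the same ratio.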

2.2.2 Review of existing EMD/EEMD/CEEMDAN denoising methods

In EMD (resp. EEMD and CEEMDAN) domain filtering, the basic principle of denoising is a partial reconstruction of the signal after its decomposition into IMFs. This simple procedure removes the low-order IMFs, which mainly contain high-frequency noise, and retains the high-order modes, which contain most of the useful information. The denoised signal is then expressed as follows:

$$ \tilde{x}(t)=\sum\limits_{k=j_{s}}^{N}IMF_{k}(t) + r_{N}(t); $$
(9)

Defining the index js that separates the relevant modes from the irrelevant ones is a critical task in EMD-based denoising. Flandrin et al. [14] first suggested identifying the noiseless IMFs by comparing the actual energy density of each IMF with the energy density estimated from a noise-only model, which follows from the behavior of the EMD algorithm as a dyadic filter bank for WGN signals [33]. A confidence interval is then employed to discard the noisy IMFs. In [5], Boudraa et al. proposed a consecutive mean squared error (CMSE) criterion to select relevant modes, but in some cases this criterion can be trapped in a local minimum [42]. More recently, a series of statistical approaches based on similarity measures between the probability density function (pdf) of the noisy signal and that of each extracted mode has been developed by Komaty et al. [24]. The central idea is to find the first local maximum of a similarity measure such as the Kullback-Leibler Divergence (KLD) [24] or the Hausdorff Distance (HD) [23].

Although these methods have been successfully tested on many types of signals, in the case of ECG signals the first IMFs contain not only high-frequency noise but also useful clinical information about the QRS complex. Consequently, discarding these IMFs is not reasonable: the reconstructed signal may still retain considerable noise, and the R waves can be distorted.

To avoid these shortcomings, a second family of EMD denoising methods, termed EMD-thresholding, was proposed by Boudraa et al. [22]. It applies classical wavelet thresholding to some or all IMFs, using a threshold that depends on the noise level in each IMF. However, it was found that the direct application of wavelet thresholding to the IMFs is not advisable and can have catastrophic consequences for the continuity of the denoised signal [25]. To deal with this problem, Kopsinis et al. [25] introduced a series of EMD-based denoising techniques inspired by classical wavelet thresholding, called EMD Interval Thresholding (EMD-IT). In this method, the thresholding function (hard or soft) is applied to the samples of each zero-crossing interval as a whole, which preserves the continuity of the reconstructed signal. Hence, the filtered signal is given as

$$ \tilde{x}(t)=\sum\limits_{k=M_{1}}^{M_{2}-1}\tilde{IMF}_{k}(t) + \sum\limits_{k=M_{2}}^{N}IMF_{k}(t) + r_{N}(t) $$
(10)

where \(\tilde {IMF}_{k}(t)\) is the version of IMFk(t) thresholded by the EMD-IT method, and M1 and M2 are parameters used to discard the noisy IMFs and to select the IMFs that need to be thresholded. However, to the best of our knowledge, there is no robust mechanism that determines these parameters (M1 and M2) automatically in the case of ECG signals. To address this question, we proposed in a previous study [11] to exploit higher order statistics, especially the kurtosis, to define these parameters in the EEMD domain. The experimental results showed that the method achieves better performance than some earlier reported methods. Nevertheless, the algorithm proposed in [11] suffers from some significant limitations:

  1. The incomplete decomposition of the EEMD method when used for ECG noise reduction;

  2. The threshold estimation function is not adapted to each noisy IMF;

  3. Noise still remains in the denoised signal after Interval Thresholding (IT), and the Iterative IT has a high computational complexity.

To address these difficulties, we propose in this work a novel method based on HOS and the CEEMDAN algorithm with a Hybrid Interval Thresholding method. The main novelties of this paper compared with our earlier work can be summarized as follows: (i) CEEMDAN is used as the decomposition tool, which avoids the limitations of EEMD; (ii) a new hybrid thresholding scheme is proposed that overcomes the drawbacks of the classical hard and soft interval thresholding techniques; (iii) a novel adaptive threshold computation method is investigated, based on an optimization process and depending only on the noise energy extracted by applying wavelet denoising to the first noisy IMF; and (iv) the kurtosis is now used to classify the extracted IMFs into three groups, which refines the criterion of [11]. The details of our approach are given in the following section.

3 Proposed filtering method

Although EMD and EEMD have found various applications in biomedical signal processing, the use of CEEMDAN and its advantages for ECG signal denoising has rarely been investigated. Hence, we propose in this section an efficient method for ECG noise reduction based on HOS, CEEMDAN and a new Hybrid Interval Thresholding function. The steps of the proposed filtering method, and its novelties and contributions relative to our previous work based on EEMD and HOS [11], are explained in the following subsections.

3.1 CEEMDAN based on hybrid interval thresholding

As mentioned above, in order to overcome the shortcomings of the classical hard and soft thresholding functions, Kopsinis et al. developed a method termed Interval Thresholding (IT) [25, 26]. The basic idea is to threshold each IMF by considering only the amplitude of the extremum Extj (i.e. maximum or minimum) of each zero-crossing interval Zj. More precisely, for each IMFi(t), the IT method compares the extremum Extj of the temporal interval between two successive zero crossings \(Z_{j}=[{z^{i}_{j}}\quad z^{i}_{j + 1}]\) with an estimated threshold τi. If the extremum exceeds the threshold, the interval is considered signal-dominant; otherwise it is considered noise-dominant. The two interval thresholding functions are defined as follows

$$ \tilde{IMF}_{i}(Z_{j})=\left\{ \begin{array}{r c l} 0\qquad ; \mid Ext_{j} \mid < \tau_{i}\\ IMF_{i}(Z_{j}) \qquad; \mid Ext_{j} \mid \geq \tau_{i} \end {array} \right. $$
(11)

for hard thresholding and

$$ \tilde{IMF}_{i}(Z_{j})=\left\{ \begin{array}{r c l} 0\qquad ; \mid Ext_{j} \mid < \tau_{i}\\ IMF_{i}(Z_{j})\left[1-\frac{\tau_{i}}{\mid Ext_{j} \mid}\right] \qquad; \mid Ext_{j} \mid \geq \tau_{i} \end {array} \right. $$
(12)

for soft thresholding, where τi is the threshold value for the ith IMF, \(\tilde {IMF_{i}}(Z_{j})\) is the thresholded version of the noisy interval IMFi(Zj), Extj is the single extremum of the corresponding zero-crossing interval and IMFi(Zj) represents the samples from instant \({z^{i}_{j}}\) to \(z^{i}_{j + 1}\) of the ith IMF. The difference between classical and interval hard thresholding is illustrated in Fig. 3, which shows that interval thresholding maintains the continuity of the thresholded IMF.
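Equations (11) and (12) can be illustrated with a minimal sketch. It assumes a discrete IMF stored as a list, treats each maximal run of samples with constant sign as one zero-crossing interval, and takes the extremum as the largest absolute sample in the run; the function names and the short test sequence are illustrative, not taken from [25].

```python
def zero_crossing_intervals(imf):
    """Return (start, end) index pairs of runs between successive sign changes."""
    runs, start, last_sign = [], 0, 0
    for t, v in enumerate(imf):
        sign = (v > 0) - (v < 0)
        if sign and last_sign and sign != last_sign:
            runs.append((start, t))   # a zero crossing closes the current run
            start = t
        if sign:
            last_sign = sign
    if imf:
        runs.append((start, len(imf)))
    return runs

def interval_threshold(imf, tau, mode="hard"):
    """Hard (Eq. 11) or soft (Eq. 12) interval thresholding of one IMF."""
    out = [0.0] * len(imf)
    for a, b in zero_crossing_intervals(imf):
        ext = max(abs(v) for v in imf[a:b])      # extremum of the interval
        if ext == 0.0 or ext < tau:
            continue                              # noise-dominant: set to zero
        gain = 1.0 if mode == "hard" else 1.0 - tau / ext
        out[a:b] = [v * gain for v in imf[a:b]]   # whole interval scaled together
    return out

# Toy IMF with three intervals whose extrema are 0.2, 2.0 and 0.3.
toy_imf = [0.1, 0.2, 0.1, -1.0, -2.0, -1.0, 0.3, 0.1]
hard = interval_threshold(toy_imf, tau=0.5, mode="hard")
soft = interval_threshold(toy_imf, tau=0.5, mode="soft")
```

Only the middle interval survives in both cases. Hard thresholding keeps its samples unchanged, including those below the threshold, which is what preserves continuity; soft thresholding scales the whole interval by 1 − τ/|Ext| = 0.75.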

Fig. 3

Difference between classical and hard-interval thresholding: (a) Results of classical hard thresholding (top), (b) Results of hard interval thresholding(bottom) (Inspired from [42])

However, the thresholding formulations (11) and (12) have the following limitations:

  1. With hard interval thresholding, noise remains in the final thresholded signal, because the samples of the intervals whose extremum exceeds the estimated threshold τi are also corrupted by noise;

  2. The soft interval thresholding function reduces the amplitude of all retained samples, which also reduces the R peak magnitude in the same way as classical soft thresholding.

To overcome these drawbacks, we propose a novel thresholding method, referred to hereafter as Hybrid Interval Thresholding (HIT), which provides a good trade-off between the conventional hard and soft interval thresholding functions. This method is expressed as follows

$$ \tilde{IMF}_{i}(Z_{j})=\left\{ \begin{array}{r c l} IMF_{i}(Z_{j})\left[1-\left( \frac{\tau_{i}}{\mid Ext_{j} \mid}\right)^{\alpha}\right] \qquad; \mid Ext_{j} \mid \geq \tau_{i}\\ 0\qquad ; \mid Ext_{j} \mid < \tau_{i} \end {array} \right. $$
(13)

where α = 1, …, ∞. If α = 1, the equation is equivalent to soft interval thresholding; if α → ∞, the factor (τi/|Extj|)^α tends to zero for |Extj| > τi and the equation is equivalent to hard interval thresholding.
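The behavior of (13) at the two ends of the α range can be checked with a small sketch. The helper names and the toy sequence are illustrative assumptions; intervals are taken as maximal runs of constant sign and the extremum as the largest absolute sample of each run.

```python
def sign_runs(imf):
    """Return (start, end) index pairs of runs between successive sign changes."""
    runs, start, last_sign = [], 0, 0
    for t, v in enumerate(imf):
        sign = (v > 0) - (v < 0)
        if sign and last_sign and sign != last_sign:
            runs.append((start, t))
            start = t
        if sign:
            last_sign = sign
    if imf:
        runs.append((start, len(imf)))
    return runs

def hybrid_interval_threshold(imf, tau, alpha=2.0):
    """Hybrid interval thresholding (Eq. 13) of one IMF."""
    out = [0.0] * len(imf)
    for a, b in sign_runs(imf):
        ext = max(abs(v) for v in imf[a:b])
        if ext == 0.0 or ext < tau:
            continue                               # interval set to zero
        gain = 1.0 - (tau / ext) ** alpha          # between soft and hard gains
        out[a:b] = [v * gain for v in imf[a:b]]
    return out

toy_imf = [0.1, 0.2, 0.1, -1.0, -2.0, -1.0, 0.3, 0.1]
as_soft = hybrid_interval_threshold(toy_imf, tau=0.5, alpha=1)    # gain 0.75
hybrid = hybrid_interval_threshold(toy_imf, tau=0.5, alpha=2)     # gain 0.9375
near_hard = hybrid_interval_threshold(toy_imf, tau=0.5, alpha=200)
```

For the retained interval (extremum 2.0, τ = 0.5) the gain is 0.75 for α = 1, 0.9375 for α = 2 and numerically 1 for a very large α, so α = 2 attenuates the R-peak region far less than soft thresholding while still shrinking it slightly.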

Once the new thresholding function is defined, the main and most challenging question is how to differentiate between the noisy IMFs that need to be filtered and those containing only useful information. To answer this question robustly, we propose a novel criterion based on HOS, explained below.

3.2 Fourth order cumulant as a novel criterion to select the relevant modes

As discussed in Section 2, many approaches have been developed to identify whether a specific IMF contains useful information or only noise. However, their performance is generally not satisfactory when they are applied directly to ECG signal denoising. In our approach, summarized in Fig. 4, we design a new strategy that classifies the resulting IMFs into three sets based on the fourth-order cumulant, namely the kurtosis. The proposed strategy adaptively defines:

  1. First set: the IMFs that contain the high frequency noise (HFN);

  2. Second set: the IMFs that contain the low frequency noise (LFN);

  3. Third set: the IMFs that represent the noiseless relevant (NR) modes.

Fig. 4

Flowchart of CEEMDAN-HIT proposed filtering method

The central idea of this work rests on the assumption that the first IMFs (i.e. the first set) contain mostly high-frequency noise, but also some useful information about the QRS complex. The following IMFs (i.e. the second set) are dominated by the QRS complex together with a significant amount of low-frequency noise, and the last IMFs (i.e. the third set) contain mainly useful information about the ECG signal, especially the low-frequency P and T waves. The QRS complex can normally be treated as a peak in the signal [36]. Therefore, the kurtosis, which measures the peakedness of a distribution, can be used to determine the IMFs over which the QRS complexes are distributed. By definition, the fourth-order cumulant, or kurtosis, is a measure of the peakedness of the probability distribution of a real random variable. Assuming a zero-mean signal x(t), the normalized kurtosis is expressed as follows [15]

$$ K_{4}=\frac{E[x(t)^{4}]}{(E[x(t)^{2}])^{2}}-3=M\frac{{\sum}_{t = 1}^{t=M}x(t)^{4}}{({\sum}_{t = 1}^{t=M}x(t)^{2})^{2}}-3 $$
(14)

where M is the number of signal samples.

Figure 5 illustrates the kurtosis values of an ECG signal for different QRS complex densities, computed using the MATLAB function kurtosis (Footnote 1). The higher kurtosis values are clearly obtained when the signal is dominated by the QRS components, whereas lower kurtosis values are obtained for lower spike (QRS complex) densities. The idea behind using the kurtosis in our work is that signals can be classified by their kurtosis values as follows (refer to [18]): sub-Gaussian signal (kurtosis < 3, no QRS complex of interest), super-Gaussian signal (kurtosis > 3, QRS components are more probably present) and Gaussian noise (kurtosis = 3, only noise components). These thresholds refer to the kurtosis without the −3 offset of (14), for which a Gaussian signal gives a value of 3.

Fig. 5

Kurtosis values for a real ECG signal with different QRS spikes densities

In the biomedical signal processing literature, the kurtosis is widely used to detect spike components in many types of biomedical data. Indeed, the ECG signal is super-Gaussian (K4 > 3) because of the presence of the QRS components. Therefore, the kurtosis can detect physiological activity such as spikes (QRS components) in specific frequency bands. More precisely, the kurtosis can characterize each IMF obtained by the EMD, EEMD or CEEMDAN algorithms. Hence, the IMFs in which the QRS complexes are concentrated can be detected by simply computing the kurtosis of each IMF.

To illustrate the proposed method, Fig. 6 shows the IMFs produced by the CEEMDAN algorithm for the ECG record 103.dat taken from MITDB, corrupted with white Gaussian noise at an input SNR of 5 dB.

Fig. 6

CEEMDAN decomposition for noisy ECG signal with 5 dB white gaussian noise level

The probability density functions (pdf), estimated using a normal kernel density estimator, of the noisy ECG signal, denoted ECGb(t), and of its IMFs obtained by the CEEMDAN algorithm are presented in Fig. 7. Figure 8 presents the evolution of the kurtosis measured for each IMF.

Fig. 7

Superposition of the pdf of noisy ECG signal and those of its corresponding modes

Fig. 8

Curve of Kurtosis versus IMFs index of the noisy ECG signal

These figures show that the first and second IMFs are noise-dominated and have lower kurtosis values. However, these IMFs may contain some useful information about the ECG signal, so eliminating them can introduce undesired effects at the QRS complex. The third IMF captures most of the information about the QRS components and has the maximum kurtosis value K4jmax; however, it also contains a significant amount of noise, as shown in Fig. 6. In our scheme, this IMF is considered to be the boundary between the first set of IMFs (HFN) and the second set of IMFs (LFN), which contains fewer noise components.

Figure 8 also shows that the kurtosis curve decreases as the amplitude and density of the QRS components decrease, until it reaches the first local minimum below the proposed threshold (K4 < 3), at the IMF that best corresponds to the first noise-free IMF (i.e. IMF7(t) in the example). This IMF (K4js) is considered to be the boundary between the LFN IMFs and the noiseless IMFs.

To remove the noise components while preserving the QRS complex, thresholding the two sets of noisy IMFs is a better solution than discarding them completely. Therefore, in our approach, thresholding follows a three-stage procedure. In stage 1, the first set of HFN IMFs (of levels j ≤ jmax), dominated by noise, is filtered using hard interval thresholding [25]. In stage 2, the LFN modes (of levels jmax < j < js), which contain the major structures of the ECG signal, are denoised by the proposed hybrid interval thresholding (with α = 2 in this work). In stage 3, the last IMFs (of levels js ≤ j ≤ N) mainly carry useful information and are therefore not thresholded (Footnote 2). The denoised ECG signal is then reconstructed as

$$ \tilde{ECG}_{d}(t)=\sum\limits_{j = 1}^{j_{max}}imf_{j}(t)+\sum\limits_{j=j_{max}+ 1}^{js-1}imf_{j}(t)+\sum\limits_{j=js}^{N}IMF_{j}(t)+R(t) $$
(15)

where imfj(t) is the thresholded version of the noisy mode IMFj(t).
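The selection of jmax and js from the kurtosis curve, and the resulting three index sets of (15), can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the kurtosis values in the example are invented for demonstration (they are not the values of Fig. 8), the Gaussian level of 3 follows the non-offset convention, and when no local minimum below that level exists the sketch simply leaves the noiseless set empty.

```python
def classify_imfs(kurt, gaussian_level=3.0):
    """Split IMF indices (1-based) into HFN, LFN and NR sets.

    kurt[j - 1] is the kurtosis of IMF j.
    jmax: index of the maximum kurtosis.
    js:   first index after jmax where the kurtosis is a local minimum
          below `gaussian_level`.
    """
    n = len(kurt)
    jmax = max(range(n), key=lambda j: kurt[j]) + 1
    js = n + 1                                    # fallback: no noiseless IMF
    for j in range(jmax, n):                      # 0-based j is IMF j + 1
        local_min = kurt[j] < kurt[j - 1] and (j == n - 1 or kurt[j] <= kurt[j + 1])
        if kurt[j] < gaussian_level and local_min:
            js = j + 1
            break
    hfn = list(range(1, jmax + 1))                # hard interval thresholding
    lfn = list(range(jmax + 1, js))               # hybrid interval thresholding
    nr = list(range(js, n + 1))                   # left untouched
    return hfn, lfn, nr

# Invented kurtosis curve with a peak at IMF 3 and a dip below 3 at IMF 7.
example_kurt = [2.5, 2.8, 12.0, 8.0, 5.0, 3.5, 2.0, 2.4]
hfn, lfn, nr = classify_imfs(example_kurt)
```

For this invented curve the sketch returns IMFs 1 to 3 as HFN, 4 to 6 as LFN and 7 to 8 as NR, which mirrors the three sums of (15).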

3.3 Adapted threshold computation

After separating the noisy IMFs from the noiseless ones, it remains to determine the threshold for each noisy IMF. Threshold selection is a classical problem for thresholding-based denoising in any transform domain. In EMD-based thresholding, the classical method for threshold estimation assumes that the first IMF contains only noise; a statistical threshold is then estimated for each IMF from the noise-only model established over many realizations of WGN signals [14]. However, the first IMF also contains some useful information about the ECG signal, especially at higher values of the Signal to Noise Ratio (SNR). Hence, we propose a simple correction of the noise estimate of the first IMF that extracts the useful information contained in this IMF. This is done by filtering the first mode with the following wavelet denoising scheme:

  1. Apply the Wavelet Transform (Footnote 3) to the first IMF at level 3 using the db4 wavelet (Footnote 4);

  2. Apply the hard thresholding function to the wavelet coefficients using the universal threshold [9];

  3. Reconstruct the filtered mode IMFd1 using the Inverse Wavelet Transform (Footnote 5).

Then, the new noise estimate is obtained as IMFb = IMF1 − IMFd1. The threshold for each IMF is determined using the following steps:

  1. Estimate the noise energy in the first mode as

    $$ E_{1}=\left( \frac{MAD(IMF_{b})}{0.6745}\right)^{2} $$
    (16)

    where MAD(.) is the Median Absolute Deviation.

  2. Estimate the noise energy in the remaining IMFs as

    $$ E_{i}=\frac{E_{1}}{\beta}*\rho^{-i}, i = 2,\dots,N $$
    (17)

    where N is the number of IMFs, β = 0.719 and ρ = 2.01 for WGN [14].

  3. Estimate the threshold τi corresponding to the ith IMF as

    $$ \tau_{i}=\lambda\sqrt{2E_{i}logM}, i = 1,\dots,N $$
    (18)

    where λ is a constant taking a value between 0.1 and 1, M is the length of the signal and Ei is the estimated noise energy in the ith IMF.
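Steps (16) to (18) translate directly into a few lines of code. The sketch below is illustrative: the function names are hypothetical, the input `noise_imf1` stands for the noise estimate IMFb (its wavelet-based extraction is not reproduced here), the MAD is taken about the median, and the logarithm is assumed to be natural, as is usual for the universal threshold.

```python
import math
import statistics

def mad(x):
    """Median absolute deviation about the median."""
    med = statistics.median(x)
    return statistics.median([abs(v - med) for v in x])

def imf_thresholds(noise_imf1, n_imfs, n_samples, lam=1.0, beta=0.719, rho=2.01):
    """Return (thresholds, energies) for IMFs 1..n_imfs, Eqs. (16)-(18)."""
    e1 = (mad(noise_imf1) / 0.6745) ** 2                       # Eq. (16)
    energies = [e1] + [e1 / beta * rho ** (-i)                 # Eq. (17)
                       for i in range(2, n_imfs + 1)]
    thresholds = [lam * math.sqrt(2.0 * e * math.log(n_samples))  # Eq. (18)
                  for e in energies]
    return thresholds, energies

# Tiny invented noise estimate whose MAD equals 1.
noise_estimate = [-2.0, -1.0, 0.0, 1.0, 2.0]
taus, energies = imf_thresholds(noise_estimate, n_imfs=4, n_samples=1000)
```

Under the noise-only model, the estimated energy decays by a factor ρ from one IMF to the next for i ≥ 2, so the thresholds of the higher-order IMFs shrink accordingly.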

However, applying the same fixed tuning parameter λ in (18) to all noisy IMFs does not seem adequate. Extra attention should be paid to the threshold values of each set of IMFs, because the two noisy sets differ in terms of noise and ECG signal content. To account for this, we adapt the threshold selection to each set of IMFs by using two tuning parameters λ1 and λ2, corresponding to the HFN and LFN groups respectively. The threshold for each set is therefore defined as

$$ {\tau^{1}_{j}}=\lambda_{1}\sqrt{2E_{j}logM}, \quad j = 1,\dots,jmax $$
(19)
$$ {\tau^{2}_{j}}=\lambda_{2}\sqrt{2E_{j}logM}, \quad j=jmax+ 1,\dots,js-1 $$
(20)

where \({\tau ^{1}_{j}}\) and \({\tau ^{2}_{j}}\) are the adapted thresholds for the first and second sets of IMFs respectively, and Ej is the noise energy in the jth IMF estimated by the noise model of (17).

Extensive computer simulations were carried out to determine the best values of the tuning parameters λ1 (0.5 ≤ λ1 ≤ 1) and λ2 (0.1 ≤ λ2 ≤ 0.5). The optimal values of these parameters can be obtained by solving an optimization problem in which the objective function is the Signal to Noise Ratio after denoising (SNRout (dB)), defined as

$$ SNR_{out}= 10\log_{10}\frac{||ECG(t)||^{2}_{2}}{||ECG(t)-ECG_{d}(t)||^{2}_{2}} $$
(21)

where ECG(t) and ECGd(t) denote the original and denoised ECG signals, respectively. That is, the main objective is to find the parameters λ1opt and λ2opt that maximize SNRout. These parameters can be obtained as

$$ (\lambda_{1opt},\lambda_{2opt})= \underset{\lambda_{1},\lambda_{2}}{\arg\max} \left[SNR_{out}\right] $$
(22)

In our simulations, λ1 and λ2 are selected by fixing λ1 (resp. λ2) and varying λ2 (resp. λ1) with a fixed step of 0.05, which covers all possible combinations on the grid. As an example, Fig. 9 depicts the evolution of SNRout (dB) over the ranges of λ1 and λ2 for the noisy ECG signal under consideration. This figure shows that the optimal values of λ1 and λ2 in this case are 0.65 and 0.5, respectively. The result of the denoising operation using the proposed approach is illustrated in Fig. 10. Our method preserves the important structures of the ECG signal with minimal distortion, as shown in Fig. 10c. To evaluate the performance of our method, extensive simulations and comparisons with some state-of-the-art methods are presented in the next section.
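The exhaustive search described above can be sketched as a simple grid search over (λ1, λ2) with a step of 0.05. This is an illustrative outline under stated assumptions: `denoise` is a hypothetical callable standing in for the full denoising pipeline, and the clean reference signal is assumed to be available, as it is in simulation.

```python
import math


def snr_out_db(clean, denoised):
    """Eq. (21): 10 log10(||ECG||^2 / ||ECG - ECG_d||^2)."""
    num = sum(c * c for c in clean)
    den = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    return math.inf if den == 0 else 10.0 * math.log10(num / den)


def grid(lo, hi, step=0.05):
    """Evenly spaced values from lo to hi inclusive, rounded to avoid drift."""
    n = round((hi - lo) / step)
    return [round(lo + k * step, 10) for k in range(n + 1)]


def search_lambdas(clean, denoise):
    """Return (best SNR_out, lam1, lam2) over the ranges used in the text.

    `denoise(lam1, lam2)` is a hypothetical callable returning the
    denoised signal for a given pair of tuning parameters.
    """
    best = (-math.inf, None, None)
    for lam1 in grid(0.5, 1.0):
        for lam2 in grid(0.1, 0.5):
            score = snr_out_db(clean, denoise(lam1, lam2))
            if score > best[0]:
                best = (score, lam1, lam2)
    return best
```

With these ranges the grid contains 11 × 9 = 99 parameter pairs, so one full denoising run per pair remains affordable for a single record.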

Fig. 9 Surface plot of SNRout(dB) values versus λ1 and λ2 for a noisy ECG signal with 5 dB white Gaussian noise

Fig. 10 Denoising results for a noisy ECG signal at SNRin = 5 dB using the proposed CEEMDAN-HIT approach: (a) Original ECG signal, (b) Denoised ECG signal and (c) Error between original and denoised ECG signals

4 Experimental results and discussion

In this section, we present the simulation results obtained with the proposed approach and compare them against several established methods for ECG signal enhancement, focusing on the white Gaussian noise case.

4.1 MIT-BIH arrhythmia database

The MIT-BIH Arrhythmia database, obtained from PhysioBank [28], is an international standard database that is frequently used as a reference dataset to evaluate the performance and robustness of algorithms in the field of ECG signal processing, including ECG data compression, ECG filtering, feature extraction and ECG signal classification. This database contains 48 real two-lead ECG recordings (with normal and abnormal beats), each 30 min long, sampled at 360 Hz with 11-bit resolution. We test our approach on time segments from 10 ECG records: 103.dat, 105.dat, 111.dat, 116.dat, 122.dat, 205.dat, 213.dat, 219.dat, 223.dat and 230.dat. These ECG data are taken from the Modified Limb Lead II (ML-II).

4.2 Experimental protocol and performance evaluation

To evaluate the performance and robustness of the proposed method, we carried out many simulations in the MATLAB environment on real ECG signals taken from MITDB [28]. For experimental purposes, additive white Gaussian noise is artificially added to the 10 ECG recordings at three SNR levels: 0 dB, 1.25 dB and 5 dB. Quantitative performance is assessed using two standard metrics: the output Signal to Noise Ratio (SNRout (dB)) defined in (21), considered relative to the input SNRin (dB) before filtering, which is given by

$$ SNR_{in}(dB)= 10\log_{10}\frac{|| ECG(t) ||^{2}_{2}}{|| ECG_{b}(t)-ECG(t) ||^{2}_{2}} $$
(23)

and the Root Mean Square Error (RMSE), computed as

$$ RMSE=\sqrt{\frac{1}{M}|| ECG(t)-ECG_{d}(t) ||^{2}_{2}} $$
(24)

where ECG(t) denotes the original ECG signal, ECGb(t) is the noisy ECG signal, ECGd(t) is the reconstructed signal and M is the ECG signal length.
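The evaluation protocol can be sketched as follows. This is an illustration rather than the authors' MATLAB code: the function names are hypothetical, and the noise is rescaled so that the realized SNRin matches the target value exactly, which is one common convention and an assumption here.

```python
import math
import random


def snr_db(reference, other):
    """10 log10(||reference||^2 / ||reference - other||^2).

    With other = denoised signal this is Eq. (21); with other = noisy
    signal it is Eq. (23).
    """
    num = sum(r * r for r in reference)
    den = sum((r - o) ** 2 for r, o in zip(reference, other))
    return 10.0 * math.log10(num / den)


def rmse(reference, denoised):
    """Eq. (24): square root of the mean squared error over M samples."""
    m = len(reference)
    return math.sqrt(sum((r - d) ** 2 for r, d in zip(reference, denoised)) / m)


def add_awgn(clean, snr_in_db, seed=0):
    """Add white Gaussian noise scaled to hit the requested SNR_in (dB)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    p_sig = sum(c * c for c in clean)
    p_noise = sum(n * n for n in noise)
    scale = math.sqrt(p_sig / (p_noise * 10.0 ** (snr_in_db / 10.0)))
    return [c + scale * n for c, n in zip(clean, noise)]


# Synthetic stand-in for an ECG segment (not real data).
clean = [math.sin(2 * math.pi * k / 50) for k in range(500)]
noisy = add_awgn(clean, snr_in_db=5.0)
```

The same `snr_db` function serves both equations because (21) and (23) differ only in which signal is compared with the original.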

4.3 Simulations and discussion

First, we compare the proposed method based on CEEMDAN (resp. EMD or EEMD) (Footnote 6) with our earlier approach based on EEMD and HOS. The results in terms of SNRout for record 230.dat, taken as an example, are presented in Table 1. This table indicates that the proposed approach, especially when based on CEEMDAN, performs better than the previous work [11] and than the two other variants based on EMD and EEMD. This suggests that CEEMDAN is more suitable for noise reduction than the two other decompositions. Further, to assess the effectiveness of our approach, we evaluated the CEEMDAN-HIT method by comparing it with techniques reported in the literature, namely WT soft thresholding [13], WT sub-band soft thresholding [31] and the Stockwell Transform based method [3].

Table 1 Performance comparison of the proposed method against the earlier works [11]

Table 2 presents the results in terms of SNRout(dB) and RMSE for all 10 ECG signals under consideration. All values used for comparison (i.e. SNRout and RMSE) are taken from [3]. In this table, the best results are shown in boldface. In addition, to check the reliability of the presented method, a statistical evaluation of the four methods in terms of SNRout(dB), using analysis of variance (ANOVA) [2] on the results of Table 2 at the different noise levels, is depicted in Fig. 11. Note that, after many simulations, we found that the best CEEMDAN parameters in this case are 200 realizations and a white Gaussian noise standard deviation of 0.1.

Table 2 Performance comparison of the proposed method against some published results

Fig. 11 Boxplot of SNRout(dB) vs. different denoising methods using ANOVA analysis at SNRin = 0 dB, 1.25 dB, 5 dB

Fig. 11 and the results in Table 2 show that our approach performs better than the other techniques in terms of both SNRout(dB) and RMSE, which confirms that CEEMDAN-HIT is well suited to the ECG signal denoising task, especially in the very noisy environments (i.e. very low SNRin values) considered in this paper.

Moreover, to further validate the proposed denoising method, we compare it in terms of RMSE with some recent denoising schemes developed specifically in the EMD domain, such as EMD and EEMD based Direct Subtraction (EMD-DS, EEMD-DS) (Footnote 7) [6] and EMD with Kullback-Leibler Divergence (EMD-KLD) [24]. The performances of the different ECG denoising methods for four ECG signals (Footnote 8) at an SNR level of 10 dB are shown in Table 3. As seen in this table, the lowest RMSE values are given by the CEEMDAN-HIT method. We also note from this table that the EMD-DS and EMD-KLD methods give the same performance for some ECG records. Indeed, the EMD-KLD technique removes the first noisy IMFs, identified using the KLD similarity measure, much as the EMD-DS approach does, where the first noisy IMFs are discarded directly without any criterion.

Table 3 RMSE values for the proposed CEEMDAN-HIT method versus the EMD-DS, EEMD-DS and EMD-KLD methods

Finally, for qualitative evaluation, Fig. 12 shows the denoising results in the time-frequency domain obtained by all the compared methods applied to ECG record 105.dat at an SNR level of 10 dB. This figure shows that the proposed method achieves the highest visual quality and the best preservation of the ECG signal components among the compared methods. Additionally, the CEEMDAN-HIT method preserves the energy distribution of the original ECG signal in the time-frequency domain, especially in the QRS regions, as shown in Fig. 12f. A significant reduction of noise is also clearly visible in the waveform of Fig. 12f (top).

Fig. 12 Comparison in the time-frequency (TF) domain for different denoising methods: waveform (top) and spectrogram (bottom) for (a) Original ECG signal, (b) Noisy ECG signal, (c) Denoised ECG signal by EMD-DS, (d) Denoised ECG signal by EEMD-DS, (e) Denoised ECG signal by EMD-KLD, (f) Denoised ECG signal by the proposed CEEMDAN-HIT method

5 Conclusion and perspectives

This paper introduces a novel method for ECG signal enhancement that combines the CEEMDAN algorithm and higher order statistics. In our method, an efficient criterion based on kurtosis is developed to discriminate between useful ECG signal information and noise components. A hybrid interval thresholding is then applied adaptively to each noisy IMF. The quantitative evaluation using the SNR and RMSE criteria has shown that the developed method outperforms some existing methods. Moreover, the statistical evaluation based on analysis of variance and the results obtained in the time-frequency domain indicate that our method can be an efficient tool for ECG denoising. However, while the proposed method has been successfully used to filter all ECG signals under consideration, it still needs to be tested under realistic noise conditions, such as muscle artifacts and motion artifacts, with multiple types of ECG signals. We plan to study these cases in future work.