1 Introduction

Heart auscultation, i.e., listening to heart sounds with a stethoscope, is a well-known non-invasive tool for diagnosing many cardiac anomalies. It provides valuable information about the rate, rhythm and valve function of the heart. However, the low sensitivity of the human ear in the low-frequency range makes cardiac examination difficult. Recent developments in auscultation help practitioners extract more information by processing the PCG (Phonocardiogram) signal, i.e., the heart sound recorded with a digital stethoscope, which converts acoustic waves into electrical signals. Nowadays, the predominant methods of cardiac examination are the ECG (Electrocardiogram) and ultrasound, but compared with PCG-based auscultation they are more complex and require more hardware. Thus, PCG-based auscultation is the most suitable cardiac examination technique for primary health care, since heart sounds can be captured and analyzed using a smartphone and an electronic stethoscope.

The acoustic signal produced by the heart can be displayed visually as a PCG. Generally, the PCG consists of two types of acoustic sounds: heart sounds and heart murmurs. In a cardiac cycle (heartbeat), up to four heart sounds can be found: \({\varvec{s}} _1\), \({\varvec{s}} _2\), \({\varvec{s}} _3\) and \({\varvec{s}} _4\) [1,2,3,4]. Cardiac murmurs are usually divided into two types according to their timing in the cardiac cycle: systolic murmurs, occurring between \({\varvec{s}} _1\) and \({\varvec{s}} _2\), and diastolic murmurs, occurring between \({\varvec{s}} _2\) and the following \({\varvec{s}} _1\). During systole the heart chambers eject blood, whereas during diastole they fill with blood. The normal heart sounds in a PCG signal are the first and the second sounds (\({\varvec{s}} _1\), \({\varvec{s}} _2\)). The first sound \({\varvec{s}} _1\) results from the closure of the mitral valve, followed closely by the closure of the tricuspid valve, at each cardiac cycle. In the same way, \({\varvec{s}} _2\) results from the closure of the aortic valve followed closely by the closure of the pulmonary valve. Nevertheless, several heart problems can cause additional sounds in a cardiac cycle, such as \({\varvec{s}} _3\) and \({\varvec{s}} _4\), which are associated with valve dysfunctions.

For both \({\varvec{s}} _1\) and \({\varvec{s}} _2\), the time difference between the valve closures is critical for diagnosing some pathologies (\(<30 \, \hbox {ms}\) for normal cases). Hence the need for an accurate detection of the time difference between the closure instants of the heart valves. During the last few years, several approaches and tools have been proposed in this field of research. Many known techniques are based on WT (Wavelet Transform) analysis, which decomposes a signal into its low- and high-frequency characteristics through the use of a basis function [5, 6]. In [7], PCG signals were decomposed and reconstructed using the DWT (Discrete WT) to separate the peak values of \({\varvec{s}} _1\) and \({\varvec{s}} _2\). MP (Matching Pursuit) has been used to decompose the PCG signal into a series of time-frequency atoms and separate the first heart sound components [8]. In [9], MP was applied successfully to the segmentation of \({\varvec{s}} _1\) and \({\varvec{s}} _2\). Many envelope extraction methods have been used for the analysis of heart sounds and the detection of \({\varvec{s}} _3\) and \({\varvec{s}} _4\), namely the Shannon energy, the Hilbert transform, the characteristic waveform [10] and the Hilbert–Huang transform [11]. Recently, EEMD (Ensemble Empirical Mode Decomposition) based approaches [12, 13] have proved effective in analyzing and denoising heart sounds. The EEMD is derived from the EMD (Empirical Mode Decomposition), whose principle is to decompose a signal into a set of IMFs (Intrinsic Mode Functions) representing different simple intrinsic modes of oscillation [14]. This makes it possible to use the EEMD combined with kurtosis as a segmentation method, as was done in [15].

Estimating the time split from the PCG signal is a problem that has been raised only recently. Nigame and Priemer [16] proposed a technique based on blind source separation to estimate the time split of the second heart sound \({\varvec{s}} _2\). However, this method assumes that the components A2 and P2 of the second heart sound are statistically independent and requires the measurement of two simultaneous PCG signals. More recently, the authors of [17] proposed a method for estimating the time split of the second heart sound from PCG signals by solving multiple fitting problems.

Up to now, detecting the valve closure instants remains challenging. The difficulty mainly comes from the structure of the PCG signal. A PCG signal can be modeled as the convolution between the impacts generated by the valve closures and different Impulse Response Functions (IRFs), one for each valve closure (each IRF depends on several factors such as breathing, noise generated by the digestive system, mood, ...). We have already addressed this problem in [3] by proposing a new mathematical model that takes into account most of the phenomena influencing heart sounds, from the generation of the impacts to the electrode. The valve closure impact signal is in reality sparse, as only a few of its coefficients are non-zero. There has been a growing interest in the study of sparse representations of signals in recent years. The main idea is that a signal can be very well approximated with only a few prototype signal-atoms taken from a redundant family, while its projection onto a basis of elementary signals may lead to a larger number of non-zero coefficients. This basic idea is at the origin of recent theoretical developments and many practical applications in denoising, compression, blind source separation, inverse problems, feature extraction, image restoration, and stock market analysis [18,19,20,21]. However, representing a signal of interest using the minimum number of vectors from an overcomplete dictionary has been shown to be an NP-hard problem [22]. To address this issue, several methods and algorithms have been proposed in the literature [23,24,25,26]. Some of these are greedy pursuit algorithms, which iteratively improve the approximation by selecting one additional elementary signal at each step.

The present study builds on our previous works [3, 4] and proposes a novel greedy-based two-stage strategy to detect the instants of heart valve closure occurring during \({\varvec{s}} _1\) and \({\varvec{s}} _2\). The first stage is dedicated to constructing the dictionary from the estimated impulse response functions. In the second stage, the amplitudes and the instants of the valve closures are recovered by applying a greedy algorithm to the dictionaries built from the estimated IRFs. The performance of the proposed two-stage strategy is validated on synthetic and real PCG signals.

The remainder of this paper is organized as follows. The problem formulation and the PCG model are given in Sect. 2. Section 3 describes the two-stage strategy for detecting the impact instants. To validate the effectiveness of the proposed approach, tests on synthetic and real PCG signals are reported and discussed in Sect. 4. Finally, conclusions are drawn in Sect. 5.

2 Problem Statement

2.1 Problem Formulation

Each heartbeat involves a series of events referred to as the cardiac cycle. In healthy subjects, there are two dominant heart sounds, known as the first heart sound \({\varvec{s}} _1\) and the second heart sound \({\varvec{s}} _2\), which are always present and occur in each cardiac cycle. It is broadly acknowledged that the asynchronous closures of the mitral and tricuspid valves are the main contributors to \({\varvec{s}} _1\). Therefore, \({\varvec{s}} _1\) can be decomposed into two components, namely the mitral component (M1) and the tricuspid component (T1), as illustrated in Fig. 1. In healthy subjects, M1 and T1 are usually up to 30 ms apart because of the asynchronous closure of the two corresponding valves. In the same way, the closures of the aortic and pulmonary valves produce the two components that mainly make up \({\varvec{s}} _2\). As for M1 and T1, the interval between the aortic component (A2) and the pulmonary component (P2) must be less than 30 ms during exhalation, and is around 50–60 ms at the end of inhalation. Hence, the time splits of \({\varvec{s}} _1\) and \({\varvec{s}} _2\) may serve as indicators of several heart problems.

Fig. 1 Normal phonocardiogram signal for a single cardiac cycle

2.2 PCG Signal Modeling

Let us first recall the PCG signal model proposed in [3, 4], as it provides the framework for the proposed method. In order to generate a reliable synthetic PCG, the model must take into consideration the asynchrony between the valve closures and the low-frequency components, since the frequency range of heart sounds is 20–200 Hz. Because of the distance between the digital stethoscope and the heart valves, the measured PCG signal corresponds to the convolution between the valve closure impacts and the low-frequency IRFs. Generally, the IRFs \({{\mathcal {H}}} _{i,n}(t)\) of the components of \({\varvec{s}} _1\) and \({\varvec{s}} _2\) are assumed to be exponentially decaying sinusoids and can be very well approximated by a Gaussian kernel. Let y(t) be the observed PCG signal. The PCG model can be expressed as follows:

$$\begin{aligned} \begin{aligned} y(t)&= \displaystyle \sum _{i\in \{M1,T1,A2,P2\},n} a_{i,n} e^{-(t-\mu _{i,n}-nT)^2/ 2 \sigma _i^2} \cos ( 2\pi f_i (t-\mu _{i,n}-nT)-\varphi _{i,n} ) + n(t) \\&=\displaystyle \sum _{i\in \{M1,T1,A2,P2\},n} a_{i,n}\delta (t-\mu _{i,n}-nT) *e^{-t^2 / 2 \sigma _i^2} \cos ( 2\pi f_i t-\varphi _{i,n} ) + n(t)\\&=\displaystyle \sum _{i\in \{M1,T1\},n} a_{i,n}\delta (t-\mu _{i,n}-nT) *{{\mathcal {H}}} _{i,n}(t) +\displaystyle \sum _{i\in \{A2,P2\},n} a_{i,n}\delta (t-\mu _{i,n}-nT) *{{\mathcal {H}}} _{i,n}(t) + n(t)\\&= {\varvec{s}} _1(t)+{\varvec{s}} _2(t)+ n(t) \end{aligned} \end{aligned}$$
(1)

where i indexes the impacts generated by each valve closure (M1, T1, A2, and P2); \(*\) stands for the convolution operator; n is the cardiac cycle (heartbeat) index; \(\delta (t)\) stands for the Dirac distribution; \(f_i\), \(\sigma _i\) and \(\varphi _{i,n}\) are the parameters of the Gaussian kernel; \(\mu _{i,n}\) corresponds to the instants of the heart valve closures in each cycle; T stands for the cardiac cycle period; n(t) represents an independent and identically distributed (i.i.d.) additive noise.

The random nature of the model comes from the parameters \(a_{i,n}\) and \(\varphi _{i,n}\), which represent respectively the normally distributed random amplitude \({\mathcal {N}}(\mu _{ai},\sigma _{ai}^2)\) and the random phase, uniformly distributed over the interval \([\varphi _{i,0}-\Delta \varphi ,~\varphi _{i,0}+\Delta \varphi ]\) with \(\Delta \varphi \in [0,~\pi ]\), where \(\varphi _{i,0}\) is the initial phase of the \(i{\mathrm{th}}\) component. An example of a simulated PCG signal is shown in Fig. 2.
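As an illustration, the first line of Eq. 1 can be sampled directly. The sketch below is a minimal NumPy version under stated assumptions: the function name `synth_pcg`, the layout of the `comps` dictionary, and every numerical value a caller passes in are hypothetical, and the kernel is taken to be the decaying Gaussian \(e^{-t^2/2\sigma_i^2}\) that appears in the convolution form of Eq. 1.

```python
import numpy as np

def synth_pcg(fs, n_cycles, T, comps, dphi=0.0, noise_std=0.0, seed=0):
    """Sample Eq. (1): a sum of Gaussian-windowed cosines, one per valve and cycle.

    comps maps a label (e.g. "M1") to (mu_a, sigma_a, mu, sigma, f, phi0).
    All values supplied by the caller are illustrative placeholders.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(int(round(n_cycles * T * fs))) / fs
    y = np.zeros_like(t)
    for mu_a, sigma_a, mu, sigma, f, phi0 in comps.values():
        for n in range(n_cycles):
            a = rng.normal(mu_a, sigma_a)                # random amplitude a_{i,n}
            phi = rng.uniform(phi0 - dphi, phi0 + dphi)  # random phase phi_{i,n}
            tau = t - mu - n * T                         # time relative to the closure
            y += a * np.exp(-tau**2 / (2 * sigma**2)) * np.cos(2 * np.pi * f * tau - phi)
    # Additive i.i.d. Gaussian noise n(t)
    y += noise_std * rng.standard_normal(t.size)
    return t, y
```

Here the closure instant \(\mu_{i,n}\) is kept constant across cycles for simplicity; drawing it per cycle would be a one-line change inside the inner loop.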

Fig. 2 Synthetic PCG signal generated from Eq. 1. a Over a single cardiac cycle. b Over several cardiac cycles

As mentioned before, the aim of this paper is to restore the valve closure impacts, which correspond to the term \(\sum _{i,n} a_{i,n}\delta (t-\mu _{i,n}-nT)\), from an experimentally observed PCG signal. The purpose is to detect the time split of \({\varvec{s}} _1\) and \({\varvec{s}} _2\). The resulting diagnosis may allow early detection of diseases and symptoms, which was our main motivation in designing the proposed strategy.

2.3 Sparsity of the Impacts of Valves Closures Signal

Let x(t) be the sparse signal containing the valve closure impacts, i.e., \(x(t)=\sum _{i,n} a_{i,n}\delta (t-\mu _{i,n}-nT)\). The model of Eq. (1) can be approximately written in a standard matrix form as,

$$\begin{aligned} \begin{aligned} {\varvec{y}}&\simeq \Phi _\Omega {\varvec{x}} _\Omega + {\varvec{n}} \\&\simeq {\sum }_j x_j {\varvec{\phi }} _j + {\varvec{n}} ~, ~~~~ j \in \Omega \end{aligned} \end{aligned}$$
(2)

where \(x_j\) corresponds to the impact amplitude \(a_{i,n}\), the column vector \({\varvec{\phi }} _j\) is formed by the samples of \(\phi (\mu ,\sigma ,f,\varphi ,t)=e^{-(t-\mu )^2 / 2 \sigma ^2} \cos ( 2\pi f (t-\mu )-\varphi )\), and \(\Omega\) denotes the set of indexes of the sparse coefficients. Equation (2) corresponds to a sparse approximation problem. The key idea of sparse approximation is that a signal can be very well approximated with only a few elementary signals (hereinafter referred to as atoms) taken from a redundant family (often referred to as a dictionary), while its projection onto a basis of elementary signals may lead to a larger number of nonzero coefficients.

Of course, the results of the sparse approximation depend mainly on the dictionary constructed from the IRFs. Building this dictionary requires a correct estimation of (\(\sigma _i\), \(f_i\)) for each heart valve and of (\(\mu _{i,n}\), \(\varphi _{i,n}\)) for each heart valve and each cardiac cycle. The four parameters are estimated using different techniques. First, \(\sigma _i\) and \(f_i\) are estimated using the synchronous mean envelope and the EEMD method, respectively. Then, \(\mu _{i,n}\) and \(\varphi _{i,n}\) can be estimated simultaneously through any sparse approximation algorithm.

2.3.1 The Orthogonal Matching Pursuit

The OMP [24] is one of the earliest methods for sparse approximation and belongs to the family of greedy algorithms. At each step, it selects the atom that maximizes the absolute scalar product with the residual. The update corresponds to an orthogonal projection of the data onto the span of all the selected atoms. This avoids selecting an already chosen atom again, but increases the computational cost, since the amplitudes associated with all the selected atoms are updated.

Let \(\Phi _\Lambda\) denote the sub-matrix built from the columns of \(\Phi\) whose indexes are in \(\Lambda\), with \({\varvec{\phi }} _i=\Phi _{\{i\}}\), and let \(\Lambda ^{(k)}\) be the set of indexes selected up to iteration k. The vectors are defined as follows: \({\varvec{x}} =[x_1,\ldots , x_{L_x}]^{\mathtt {T}}\), \({\varvec{y}} =[y_1,\ldots , y_{L_y}]^{\mathtt {T}}\) and \({\varvec{r}} =[r_1,\ldots , r_{L_y}]^{\mathtt {T}}\), where \({\varvec{r}}\) denotes the residual. Finally, \(L_y\) and \(L_x\) stand respectively for the lengths of \({\varvec{y}}\) and \({\varvec{x}}\).

  • Selection: \(\Lambda ^{(k)} = \Lambda ^{(k-1)}\cup {\{i^{(k)}\}}\)

    $$\begin{aligned} i^{(k)} = \arg \max _i |{\varvec{\phi }} _i^{\mathtt {T}} \; {\varvec{r}} ^{(k-1)}| \end{aligned}$$
    (3)
  • Update:

    $$\begin{aligned} {\text {solution:}}\,&{\varvec{x}} ^{(k)}_{\Lambda ^{(k)} } = (\Phi _{\Lambda ^{(k)} }^{\mathtt {T}}\Phi _{\Lambda ^{(k)}} )^{-1}\Phi _{\Lambda ^{(k)} }^{\mathtt {T}}{\varvec{y}} \nonumber \\ {\text {residual:}}\,&{\varvec{r}} ^{(k)} = {\varvec{y}}- \Phi _{\Lambda ^{(k)} }{\varvec{x}} ^{(k)}_{\Lambda ^{(k)} } \end{aligned}$$
    (4)
  • Stopping criterion:
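The selection and update steps of Eqs. (3) and (4) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `omp` is hypothetical, the atoms are assumed to have unit norm, and the stopping rule used here (a fixed number of atoms, or a residual norm below `tol`) is an assumption made only so that the loop terminates.

```python
import numpy as np

def omp(Phi, y, n_atoms, tol=1e-10):
    """Orthogonal matching pursuit following Eqs. (3)-(4).

    Phi: dictionary with (assumed) unit-norm columns, y: observed signal.
    Stops after n_atoms selections or once ||residual|| < tol (assumed rule).
    """
    residual = y.astype(float).copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(n_atoms):
        # Selection: atom most correlated with the current residual, Eq. (3)
        i = int(np.argmax(np.abs(Phi.T @ residual)))
        if i in support:
            break
        support.append(i)
        # Update: least-squares fit of y on all selected atoms, Eq. (4)
        coeffs, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coeffs
        if np.linalg.norm(residual) < tol:
            break
    x = np.zeros(Phi.shape[1])
    x[np.array(support, dtype=int)] = coeffs
    return x, residual
```

Solving the least-squares problem with `np.linalg.lstsq` rather than forming \((\Phi^{\mathtt{T}}\Phi)^{-1}\) explicitly gives the same solution when the selected atoms are linearly independent and is numerically safer when they are strongly correlated.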

The next section introduces the proposed two-stage strategy for the restoration of the valve closure impacts.

3 Greedy Based Two-Stage Strategy

3.1 Flowchart of the Proposed Strategy

This section gives a detailed description of the proposed two-stage strategy for the restoration of the valve closure impacts. The flowchart in Fig. 3 summarizes its different steps.

Fig. 3 Flowchart of the proposed greedy based two-stage strategy

In order to estimate the different IRFs accurately, we include a segmentation step so that the two heart sounds are analyzed separately. For simplicity, we denote by \({\varvec{s}}\) a heart sound representing either \({\varvec{s}} _1\) or \({\varvec{s}} _2\), since the same procedure is applied to both. The proposed strategy runs twice. In the first iteration, the IRF parameters of the most energetic component in \({\varvec{s}}\) are estimated. At the end of the first iteration, the impacts of the first heart valve are restored, and the residue corresponds to the noise and the remaining component. In the second iteration, the input signal of the algorithm is this residue, so that the IRF parameters of the remaining component are estimated. As a result, the sparse representation algorithm restores the remaining impacts of the second valve, while the residue corresponds to the noise. At the end of the strategy, all the impacts contained in both heart sounds are restored. More details about the proposed two-stage strategy are provided below.

3.2 First Stage of the Strategy

The first stage of the strategy is entirely dedicated to the estimation of the parameters \(\sigma _i\) and \(f_i\) for each heart valve.

3.2.1 PCG Signal Segmentation

The segmentation step aims to separate the PCG signal into two signals, each representing one heart sound, \({\varvec{s}} _1\) or \({\varvec{s}} _2\). The major benefit of the segmentation step is to reduce the interference between heart sound components, which allows a better estimation of the IRF parameters. The segmentation of the PCG signal is carried out using the envelope and an estimate of the cardiac cycle period. The first step is the time-localization of the different heart sounds present in the signal using the Hilbert envelope of the signal (Fig. 4). In the second step, the estimation of the cardiac cycle period through envelope analysis helps identify which of the detected heart sounds are \({\varvec{s}} _1\) and which are \({\varvec{s}} _2\). After this classification, the detected heart sounds are grouped into two different signals \({\varvec{s}} _1(t)\) and \({\varvec{s}} _2(t)\) for further analysis. More details about the segmentation technique can be found in [10, 27].
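The time-localization step can be sketched as follows. This minimal example computes the Hilbert envelope through the FFT and thresholds it to obtain a binary activity signal; the relative threshold `rel_threshold` and the function names are assumptions made for illustration, and the subsequent labelling of each detected sound as \(s_1\) or \(s_2\) from the cardiac cycle period is not covered.

```python
import numpy as np

def hilbert_envelope(x):
    """Magnitude of the analytic signal of x, computed with the FFT."""
    n = x.size
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep the DC bin
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist bin
        gain[1:n // 2] = 2.0           # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * gain))

def locate_sounds(x, rel_threshold=0.3):
    """Return (start, end) index pairs where the envelope exceeds the threshold.

    The end index is exclusive. rel_threshold is an illustrative choice.
    """
    env = hilbert_envelope(x)
    active = env > rel_threshold * env.max()           # binary signal
    edges = np.diff(np.concatenate(([0], active.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))
```

On noisy recordings the envelope would normally be smoothed before thresholding; that step is omitted here to keep the sketch short.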

Fig. 4 Segmentation of a synthetic PCG signal generated from Eq. 1: a Estimated binary signal. b Heart sounds separation

3.2.2 Estimation of the Damping Coefficient \(\sigma _i\)

Consider the model of Eq. 1. This model corresponds to a linear combination of amplitude-modulated impacts. It is well known that the envelope allows the shape of the modulating signal to be extracted. In our case, the heart sound components in the model are weighted by Gaussian terms, \(a_{i,n} e^{-(t-\mu _{i,n}-nT)^2 / 2 \sigma _i^2}\). In this step, the damping ratio \(\sigma _i\) for \(i \in \{ M1, T1, A2, P2 \}\) is calculated from the envelope of the synchronous mean of the PCG signal [28]. In the previous step, we localized the different heart sounds using the envelope. Using this information, we can correct the heart sound positions and remove the cardiac cycle fluctuation. Next, we perform synchronous averaging to extract the deterministic periodic part, i.e., the synchronous mean.
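A minimal sketch of these two operations is given below. `synchronous_mean` averages equal-length segments aligned on the detected onsets. `estimate_sigma` is only one possible way to read \(\sigma_i\) from an envelope: it assumes a Gaussian-shaped, interference-free envelope and fits a parabola to its logarithm near the peak. The actual procedure is the one described in [28]; both function names and the `rel_level` parameter are hypothetical.

```python
import numpy as np

def synchronous_mean(x, onsets, length):
    """Average equal-length segments of x starting at the given onsets."""
    segments = [x[s:s + length] for s in onsets if s + length <= x.size]
    return np.mean(segments, axis=0)

def estimate_sigma(envelope, fs, rel_level=0.5):
    """Estimate sigma from a Gaussian-shaped envelope (assumed shape).

    log(envelope) = c - (t - mu)^2 / (2 sigma^2), so the quadratic
    coefficient p of a parabola fit gives sigma = sqrt(-1 / (2 p)).
    """
    idx = np.flatnonzero(envelope >= rel_level * envelope.max())
    t = idx / fs
    t = t - t.mean()                   # centre the time axis for conditioning
    curvature = np.polyfit(t, np.log(envelope[idx]), 2)[0]
    return float(np.sqrt(-1.0 / (2.0 * curvature)))
```

When two components overlap, the fit would have to be restricted to the side of the envelope that is least affected by the other component.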

As shown in Fig. 5, the envelope of the synchronous mean is not symmetric because of the interference between components. Hence, we choose the highest damping ratio, obtained where the interference between the components (M1-T1 and A2-P2) is weakest. In the second iteration of the proposed strategy, this problem is not encountered, since only one component remains for each heart sound.

Fig. 5 The synchronous mean envelope of a synthetic PCG signal

3.2.3 Estimation of the Components Frequency \(f_i\)

This step consists in estimating the unknown low frequencies \(f_i\) with the self-driven EEMD method, applied to \({\varvec{s}} _1\) and \({\varvec{s}} _2\) in turn. The EEMD estimates the different oscillation modes existing in the PCG signal by decomposing it into several IMFs. By calculating the power of the resulting IMFs, we can determine the most energetic ones and deduce their frequencies, which correspond to the component frequencies \(f_i\). Figure 6 illustrates the estimation of the component frequencies \(f_i\) of the first heart sound \({\varvec{s}} _1\) through the EEMD technique. The EEMD decomposes \({\varvec{s}} _1\) into several IMFs, where the second and the third IMFs correspond to the oscillation modes of the two components of \({\varvec{s}} _1\). Next, by calculating the fundamental frequency of these two IMFs, we estimate the frequencies of the two components at 80.02 Hz and 52.00 Hz. The advantage of this method is its ability to estimate the component frequencies in one run and without prior information about the other IRF parameters, namely \(\mu _{i,n}\) and \(\sigma _i\).
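The ranking step that follows the decomposition can be sketched as below. The EEMD itself is not implemented here: the IMFs are assumed to be already available as the rows of an array produced by any EEMD implementation, and the main frequency of each IMF is taken as the location of its largest FFT peak, which is one simple estimator among others. The function name is hypothetical.

```python
import numpy as np

def imf_power_and_frequency(imfs, fs):
    """For each row of imfs, return its mean power and dominant frequency.

    imfs: array of shape (n_imfs, n_samples), assumed to come from an EEMD.
    The dominant frequency is the position of the largest FFT magnitude.
    """
    imfs = np.atleast_2d(imfs)
    freqs = np.fft.rfftfreq(imfs.shape[1], d=1.0 / fs)
    spectra = np.abs(np.fft.rfft(imfs, axis=1))
    power = np.mean(imfs**2, axis=1)
    peak = freqs[np.argmax(spectra, axis=1)]
    return power, peak
```

Sorting the IMFs by `power` and reading `peak` for the most energetic ones gives the candidate values of \(f_i\); the frequency resolution is limited to `fs / n_samples` unless the peak is interpolated.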

Fig. 6 Application of EEMD on the first heart sound \({\varvec{s}} _1\): a IMFs of \({\varvec{s}} _1\). b Power and main frequency of each IMF

3.3 Second Stage of the Strategy

3.3.1 Estimation of the Instants \(\mu _{i,n}\) and the Phase \(\varphi _{i,n}\)

The second stage focuses on recovering the valve closure impacts, \(a_{i,n}\delta (t-\mu _{i,n}-nT)\), as well as estimating the phase \(\varphi _{i,n}\). The redundant dictionary is built from the kernel \(e^{-t^2 / 2 \sigma _i^2} \cos ( 2\pi f_i t-\varphi )\). Let \(\Psi _{i}\) be this dictionary. \(\Psi _{i}\) gathers several sub-dictionaries, each associated with several delay values \(\tau\) and a single phase value \(\varphi\); the phase takes values in the range 0–\(\pi /4\) with a step of \(\pi /20\). \(\Psi _{i}\) is simply the union of these sub-dictionaries and is given by \(\Psi _{i} = [\Psi _{i;1},\ldots ,\Psi _{i;M}]\), where M is the number of sub-dictionaries.
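A possible construction of such a dictionary is sketched below. Each column samples the kernel shifted by one delay \(\tau\) with one phase \(\varphi\), and the columns are grouped into one sub-dictionary per phase. The function name, the unit-norm scaling of the atoms and the choice of delay grid are assumptions made for this illustration.

```python
import numpy as np

def build_dictionary(n_samples, fs, sigma, f, delays, phases):
    """Stack atoms exp(-(t-tau)^2 / (2 sigma^2)) * cos(2 pi f (t-tau) - phi).

    Columns are ordered phase-major: all delays for the first phase, then
    all delays for the second phase, and so on. Atoms are scaled to unit
    norm (an assumption, convenient for greedy selection).
    """
    t = np.arange(n_samples) / fs
    atoms, params = [], []
    for phi in phases:                 # one sub-dictionary per phase value
        for tau in delays:
            d = t - tau
            atom = np.exp(-d**2 / (2 * sigma**2)) * np.cos(2 * np.pi * f * d - phi)
            atoms.append(atom / np.linalg.norm(atom))
            params.append((tau, phi))
    return np.column_stack(atoms), params

# Phase grid from the text: 0 to pi/4 in steps of pi/20 (6 values)
phases = np.pi / 20 * np.arange(6)
```

The `params` list maps each column index back to its \((\tau, \varphi)\) pair, so the index selected by a greedy algorithm directly yields the estimated instant and phase.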

Once the dictionary \(\Psi _{i}\) has been built, we can apply the previously described greedy algorithm OMP to retrieve the valve closure impacts \(a_{i,n}\delta (t-\mu _{i,n}-nT)\), and hence the instants \(\mu _{i,n}\) and the phases \(\varphi _{i,n}\) corresponding to the selected atoms. It is important to note that the OMP can be replaced by any other greedy algorithm, such as OLS [29].

At the end of this first pass, the residual includes the noise and the contribution of the remaining impact signal, which has different parameters \(\sigma _i\), \(f_i\), \(\mu _{i,n}\) and \(\varphi _{i,n}\). Hence the need to run the whole strategy once again on the residual to restore the remaining impacts. After that, the residual corresponds to noise. Finally, the procedure described above for a generic heart sound \({\varvec{s}}\), say \({\varvec{s}} ={\varvec{s}} _1\), is applied in the same way to the heart sound signal \({\varvec{s}} _2\) in order to recover the impacts with \(i \in \{ A2, P2 \}\). Thus, all the valve closure impacts are retrieved, together with the instants \(\mu _{i,n}\).

The limitations and the performance of the proposed method are investigated in the next section, where detailed simulation results are provided.

4 Evaluation Results and Discussion

4.1 Tests on Simulated Data

A simulation study is performed to illustrate and assess the effectiveness of the proposed strategy. For this purpose, a synthetic PCG signal is generated from Eq. 1 (Fig. 7). The generated signal represents a realistic PCG signal for a healthy subject, and the model parameters are listed in Table 1. It should be noted that the valve closure instants \(\mu _{i,n}\) are randomly generated for each cycle and follow the normal distribution \({\mathcal {N}}(\mu _{\mu _{i}},\sigma _{\mu _{i}}^2)\). Since the heart sound energy is localized in time, it is difficult to assess the influence of the additive Gaussian noise with a global measure. For this reason, we introduce another indicator of the noise level, called the Localized Signal to Noise Ratio (LSNR). The LSNR measures the noise level in a limited interval where the signal energy is localized [30]. Mathematically, the LSNR is expressed by:

$$\begin{aligned} {\mathrm {LSNR}}({\mathrm {dB}})=10 \log _{10}\left( \frac{P_{{\text{ limited-signal }}}}{P_{\text {noise}}}\right) \end{aligned}$$
(5)

with

$$\begin{aligned} P_{\text {limited-signal}}=\frac{1}{n_2-n_1} \sum _{i=n_1}^{n_2}({{\text {signal}}}(i))^{2} \end{aligned}$$

The interval \([n_1,n_2]\) represents the event localization, where the signal energy is concentrated. Furthermore, the sampling frequency is set to \(Fs = 1500\) Hz, and Gaussian noise is added to the signal such that the LSNR is equal to 20 dB.
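Equation (5) and the noise scaling it implies can be sketched as follows. The signal power follows the definition above, including the \(1/(n_2-n_1)\) normalization. The noise power \(P_{\text{noise}}\) is not defined explicitly in the text; it is assumed here to be the mean square of the noise over the whole record. Function names are hypothetical.

```python
import numpy as np

def lsnr_db(signal, noise, n1, n2):
    """LSNR of Eq. (5): localized signal power over noise power, in dB.

    Signal power uses samples n1..n2 inclusive and the 1/(n2 - n1) factor
    of the text; noise power is the mean square of noise (an assumption).
    """
    p_signal = np.sum(signal[n1:n2 + 1] ** 2) / (n2 - n1)
    p_noise = np.mean(noise ** 2)
    return 10.0 * np.log10(p_signal / p_noise)

def scale_noise_to_lsnr(signal, noise, n1, n2, target_db):
    """Rescale noise so that the resulting LSNR equals target_db."""
    current_db = lsnr_db(signal, noise, n1, n2)
    # Multiplying the noise by g lowers the LSNR by 20*log10(g) dB
    gain = 10.0 ** ((current_db - target_db) / 20.0)
    return noise * gain
```

For the setting described above, one would draw white Gaussian noise of arbitrary variance, call `scale_noise_to_lsnr(..., target_db=20.0)`, and add the result to the clean signal.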

Table 1 Model parameter values for the synthetic PCG signal

Fig. 7 An example of a synthetic PCG signal with the estimated valves closure impacts

The first stage of the strategy consists of approximating the different IRFs in the PCG signal, which provides an adequate dictionary for the sparse representation. Following the flowchart, after the segmentation step, the IRFs are estimated iteratively by calculating the main parameters at each iteration, namely the damping ratio \(\sigma _i\) and the heart sound frequency \(f_i\). The remaining parameters, the time of occurrence \(\mu _{i,n}\) and the phase \(\varphi _{i,n}\), are estimated in the second stage.

Fig. 8 The restored heart valves closure impacts instants regarding the actual ones

Figure 8 reports the resulting sparse signal, i.e., the detected valve closure impacts. Usually, it is hard to visually distinguish between M1 and T1 or between A2 and P2. However, the proposed technique detects the two valve closure instants in \({\varvec{s}} _1\) and \({\varvec{s}} _2\) despite the small time separation between them. It should be noted that the difficulty of this task increases as the separation between impacts decreases.

Fig. 9 Averaged histogram of the restored impacts regarding the actual ones over one cardiac cycle: a For \({\varvec{s}} _1\). b For \({\varvec{s}} _2\)

In order to evaluate the robustness and the precision of the proposed method, we performed Monte Carlo (MC) simulations, with over 100 MC runs for each LSNR value, on signals with varied configurations. The objective of this simulation is to assess the performance of the proposed strategy under different situations. For this reason, we generate random model parameter values at each iteration, except for the impact positions. As the amplitude and the phase are already random, we only need to vary the component frequencies \(f_i\) and the damping ratios \(\sigma _i\). In this simulation, the component frequencies follow a normal distribution \({\mathcal {N}}(f_i,2.25)\) and the damping ratios follow a normal distribution \({\mathcal {N}}(\sigma _i, 0.0005)\), where \(f_i\) and \(\sigma _i\) are provided in Table 1. The remaining model parameter values are the same as in the previous simulation (Table 1), with different LSNR values.

Fig. 10 The time difference MSE for different LSNR values

The averaged histogram reported in Fig. 9 shows the distribution of the detected impact instants with respect to the actual ones for several LSNR values. We note from the same figure that correct detections are more frequent for T1 and A2 than for M1 and P2. This can be explained by the relatively strong amplitude of T1 compared with M1 in \({\varvec{s}} _1\), and of A2 compared with P2 in \({\varvec{s}} _2\). Thus, T1 (resp. A2) is the first component to be selected by the greedy algorithm. Because the sparse approximation relies on dictionaries built from approximated IRFs, which do not exactly match the actual IRFs, the update of the residue introduces errors that interfere with the detection of the remaining impacts, namely M1 for \({\varvec{s}} _1\) and P2 for \({\varvec{s}} _2\).

Moreover, the behavior of the strategy regarding noise is globally the same with a slight increase in the number of false detections, especially for LSNR = 8 dB. However, the histogram does not completely inform us about the estimation quality of the time split in the sense that the same time split can be obtained from either two correct detections of M1 and T1 (resp. A2 and P2) or two false detections with the same translation.

To assess the effect of the LSNR on the estimation of the time split, a second MC simulation is performed, measuring the MSE of the time difference between impacts for different LSNR values. In this simulation, we keep the PCG model parameters listed in Table 1, while the LSNR varies over the interval [6, 24] dB. The MSE is averaged over 50 MC iterations. As shown in Fig. 10, the proposed strategy is robust under normal LSNR values. However, the error increases rapidly when the LSNR is below 10 dB.
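The error measure used in this simulation can be sketched as below. It compares estimated and true time splits, so a common shift of both detected instants leaves it unchanged, which is exactly why it complements the histogram of Fig. 9. The function name and argument layout are hypothetical; inputs are per-cycle closure instants in seconds.

```python
import numpy as np

def time_split_mse(mu_first_hat, mu_second_hat, mu_first, mu_second):
    """MSE between estimated and true closure-time differences.

    Each argument is an array of per-cycle instants in seconds, e.g. the
    M1 and T1 instants for s1 (estimated, then true).
    """
    split_hat = np.asarray(mu_second_hat) - np.asarray(mu_first_hat)
    split = np.asarray(mu_second) - np.asarray(mu_first)
    return float(np.mean((split_hat - split) ** 2))
```

Averaging this quantity over the MC runs at a given LSNR gives one point of a curve such as the one in Fig. 10.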

Fig. 11 Averaged time difference NMSE display: a For \({\varvec{s}} _1\). b For \({\varvec{s}} _2\)

Another simulation was performed in order to evaluate the limitations of our strategy. As is commonly known, the frequency difference \(\Delta f\) and the time difference \(\Delta \mu\) between the heart sound components are small. This represents a major challenge for the proposed strategy. In this simulation, the aim is to evaluate how the strategy performs in different situations, especially when \(\Delta f\) and \(\Delta \mu\) are small. Here \(\Delta f =|f_{T1}-f_{M1}|=|f_{A2}-f_{P2}|\) and \(\Delta \mu =\mu _{T1,n}-\mu _{M1,n}=\mu _{P2,n}-\mu _{A2,n}\), where \(f_{M1}\), \(f_{A2}\), \(\mu _{M1,n}\), and \(\mu _{A2,n}\) are listed in Table 1, as are the remaining model parameters.

Figure 11 shows the Normalized Mean Square Error (NMSE) distribution of the time difference for different values of \(\Delta f\) and \(\Delta \mu\). In order to make the simulation easily presentable we choose to set the number of MC iterations at 50.

The simulation results show how the proposed strategy behaves when \(f_i\) and \(\mu _{i,n}\) vary. The performance of the strategy decreases as \(\Delta f\) and \(\Delta \mu\) become smaller, and improves as they become larger. According to the results, the performance of the strategy is limited only when \(\Delta f < 20\) Hz and \(\Delta \mu < 0.03\) s. Consequently, the patient diagnosis will be more accurate at the end of inhalation, as the normal time difference between heart valve closures is then around 50–60 ms. Moreover, the difference between the two subfigures suggests that the performance of the strategy also depends on other parameters, as the two heart sounds have different kernel parameters. Unfortunately, an in-depth performance analysis requires heavy simulations to evaluate how the strategy performs under different combinations of model parameter values.

The simulations on synthetic signals have revealed the effectiveness and the performance limitations of the proposed strategy in various situations and conditions. However, to assess its robustness in real life, the approach needs to be evaluated on real PCG signals.

4.2 Tests on Experimental Datasets

4.2.1 Description of the Heart Sound Database

In this section, the proposed approach is tested on a database collected from a clinical trial in hospitals using the digital stethoscope DigiScope in order to validate its effectiveness. This database contains two datasets published in the Classifying Heart Sounds Pascal Challenge competition [31]. The real-life PCG signal used in this test is taken from Dataset-B. It includes 656 heart sound signals in WAV format recorded from children using the Littmann Model 3100 electronic stethoscope with a sampling frequency of 4000 Hz. Information regarding the gender, age or condition of the subjects is not available.

4.2.2 Experimental Results

The real-life PCG signal selected for this test is listed as normal, and the two heart sounds \({\varvec{s}} _1\) and \({\varvec{s}} _2\) can be distinguished visually (Fig. 12). Furthermore, the studied signal is filtered with a low-pass filter (0–200 Hz), since heart sounds have a limited frequency range under 195 Hz.

Fig. 12 Real-life PCG signal

Figure 13 reports the sparse signal obtained after applying the proposed strategy. The detected valve closure instants in \({\varvec{s}} _1\) and \({\varvec{s}} _2\) are visually satisfying despite the cardiac cycle fluctuations. However, these results still need to be reviewed by a clinical expert. For the first heart sound, the heart valve closure impacts were restored with their corresponding amplitudes, and the mean and standard deviation of the time difference between the M1 and T1 components are respectively equal to 0.0372 s and 0.0043 s. The validity of the result can be checked visually. For the second heart sound, the restoration of some impacts is less accurate, as this sound is more difficult to process. However, the proposed approach manages to overcome the difficulties and restores several impacts. This allows us to measure the time difference between the A2 and P2 components, whose mean and standard deviation are respectively equal to 0.0192 s and 0.0035 s. According to these results, the studied PCG signal can be classified as normal, although a clinical expert could say more. Finally, the results on the experimental data support the validity and effectiveness of the proposed sparsity-based approach in detecting valve closure instants in PCG signals.

Fig. 13 The restored heart valves closure impacts for the real PCG signal

5 Conclusion

In this paper, we introduced a new greedy-based two-stage strategy for detecting valve closure instants in normal PCG signals. The first stage of the proposed approach identifies the IRFs of both heart sounds \({\varvec{s}} _1\) and \({\varvec{s}} _2\) by estimating their main parameters using techniques such as the EEMD and the synchronous mean envelope. In the second stage, the greedy algorithm OMP, applied with the IRF-based dictionary, detects the valve closure instants. Finally, the results on both synthetic and real PCG signals show promising performance even under considerable noise, which is encouraging since the approach requires minimal equipment.

Future work will focus on refining the estimation of the different heart sounds by exploiting the cyclostationarity of the PCG signal. Improving the approximation of the IRFs will allow better detection of valve closure instants, since the distances between heart sound components carry essential information for cardiac diagnosis. This approach would be beneficial for developing countries and for rural health management, since it requires only an electronic stethoscope connected to a smartphone.