1 Introduction

Breathing sounds (BS) offer relevant information on the physiology and pathology of the pulmonary parenchyma and airways. As a result, the methods used for analyzing BS components are of special importance [1, 2]. The acoustic characteristics of BS differ significantly between normal and abnormal sounds. Normal breathing sounds (NBS) are produced by the respiratory activity in healthy subjects and are differentiated according to their location on the thoracic surface, the respiratory cycle phase, and the square of the air flow [2]. On the other hand, abnormal sounds have acoustic characteristics that indicate the presence, severity, and location of a disease in the respiratory system [2].

Discontinuous abnormal sounds, also known as crackles, have an explosive and transient character. In terms of their temporal morphology, crackles can be classified as fine (FC) and coarse (CC) and can be produced in different lung diseases, such as pulmonary fibrosis or asbestosis. It is clinically relevant to detect crackles and establish their characteristics and localization in both the respiratory cycle and the thoracic surface [2].

One of the most recent techniques for crackle extraction from BS is independent component analysis (ICA). For ICA, the relevant assumptions are that crackles are additive to BS and that these sources are independent because they are generated through different biological processes. In other words, BS are generated by turbulence in the bronchial tree, whereas crackles are produced by the sudden opening of abnormally closed airways [3, 4]. Recently, the viability of different ICA algorithms has been evaluated for the extraction of independent components (IC) associated with crackles [1]. Nonetheless, these previous efforts did not consider the possible temporal overlap of crackles caused by simultaneous openings of different airways. In [3], clustering of sources via nonlinear indexes is used in the automated selection of IC associated with crackles in multichannel recordings. However, only FC sounds in the inspiratory phase were simulated to test the proposed automated selection, thus ignoring CC sounds and the expiratory phase.

This study presents an additional step in multichannel BS processing via ICA, considering simulations that include 1) different types of crackles, 2) temporal overlap of crackle sources, and 3) both inspiratory and expiratory phases. The elements incorporated in this study better reflect the real conditions in which BS are acquired; hence, this work may represent a more robust blind source separation (BSS) technique for crackles in BS. To this end, we propose two simulated scenarios. In the first one, we evaluated the performance of three popular ICA algorithms, FastICA, Information-Maximization (Infomax), and Temporal Decorrelation Source Separation (TDSEP), to select the optimal algorithm for extracting crackle sounds. Once the optimal algorithm was selected, the second simulated scenario allowed the detection and automated selection of IC associated with crackles.

1.1 Theoretical background

1.1.1 Independent component analysis

ICA is a statistical technique that seeks to recover a group of unobserved independent signals from their linear mixture in measured signals. In the noise-free ICA model, the measured signals x(t) are represented by x(t) = As(t), where s(t) are the unobserved source signals, and A is the mixing matrix, which is invertible and has linearly independent columns. By estimating the inverse of A, i.e., the separation matrix W, the source estimations or independent components \( \hat{\mathbf{s}}(t) \) are calculated as \( \hat{\mathbf{s}}(t)=\mathbf{Wx}(t) \). FastICA, Infomax, and TDSEP are among the most popular ICA algorithms. FastICA maximizes the negentropy (a measure of non-Gaussianity) of the components, Infomax maximizes the joint entropy of the components, which minimizes the mutual information between them, and TDSEP minimizes the temporal cross-correlation between components [5].
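The noise-free model above can be illustrated with a minimal sketch. It assumes two toy sources and a hypothetical 2 × 2 mixing matrix that is known in advance, which is not the case in practice; the helper names are illustrative and are not taken from the cited works.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    """Invert a 2 x 2 matrix; raises if it is (numerically) singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("mixing matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


n = 1000
# Two toy sources, one per row: a slow sinusoid and a sparse impulse train.
s = [
    [math.sin(2 * math.pi * 5 * t / n) for t in range(n)],
    [1.0 if t % 100 == 0 else 0.0 for t in range(n)],
]
A = [[1.0, 0.5], [0.3, 1.0]]   # hypothetical mixing matrix (assumed known here)
x = matmul(A, s)               # measured signals, x(t) = A s(t)
W = inverse_2x2(A)             # separation matrix, W = A^-1
s_hat = matmul(W, x)           # recovered sources, s_hat(t) = W x(t)

# Largest absolute difference between recovered and original sources.
max_error = max(abs(u - v) for est, ref in zip(s_hat, s) for u, v in zip(est, ref))
```

In a real recording A is unknown, so an ICA algorithm has to estimate W from x(t) alone, and the sources are then recovered only up to a permutation and a scaling.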

1.1.2 Skewness and kurtosis

Skewness and kurtosis are representative indexes of the probability density function of a random variable based on the third and fourth central moments, respectively. Skewness describes the degree of asymmetry of a distribution, where zero represents a symmetrical distribution. On the other hand, kurtosis describes the degree of data concentration around the mean value; expressed as excess kurtosis, a Gaussian distribution has a value of zero [6].
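As a minimal sketch, both indexes can be computed from standardized central moments. The excess-kurtosis convention (subtracting 3) is assumed here so that a Gaussian sample gives a value near zero, in line with the definition above; the function names are illustrative.

```python
import random


def central_moment(x, k):
    """k-th central moment of the samples in x."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** k for v in x) / len(x)


def skewness(x):
    """Third standardized moment; zero for a symmetric distribution."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5


def excess_kurtosis(x):
    """Fourth standardized moment minus 3; close to zero for Gaussian data."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2 - 3.0


random.seed(0)
gaussian = [random.gauss(0.0, 1.0) for _ in range(20000)]
impulsive = [1.0] + [0.0] * 99   # one spike in an otherwise silent record

print(skewness(gaussian), excess_kurtosis(gaussian))
print(skewness(impulsive), excess_kurtosis(impulsive))
```

The impulsive record yields large positive values for both indexes, which is the property exploited later to flag components that contain crackles.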

1.1.3 Spectrogram

Time-frequency analysis of a non-stationary signal can be achieved by the squared magnitude of the short-time Fourier transform. In this paper, the spectrogram is used to characterize the crackle sources.
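A minimal sketch of the spectrogram as the squared magnitude of the short-time Fourier transform follows. The Hann window, the frame length of 64 samples, and the hop of 32 samples are assumptions made for illustration; the text does not state the analysis parameters. A direct DFT is used to keep the example dependency-free.

```python
import cmath
import math


def spectrogram(x, frame_len, hop):
    """Return |STFT|^2 as a list of frames, each holding frame_len // 2 + 1 bins."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        segment = [x[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Direct DFT of the windowed segment at bin k.
            coeff = sum(
                segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(coeff) ** 2)
        frames.append(row)
    return frames


fs = 10_000                                   # sampling frequency in Hz
tone = [math.sin(2 * math.pi * 625 * n / fs) for n in range(512)]
S = spectrogram(tone, frame_len=64, hop=32)

# Frequency of the strongest bin in the first frame.
peak_bin = max(range(len(S[0])), key=lambda k: S[0][k])
peak_hz = peak_bin * fs / 64
```

With these settings the frequency resolution is fs / 64, so a longer frame would be needed to resolve narrow bands, at the cost of temporal resolution.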

2 Methods

2.1 Simulated scenarios

2.1.1 Breathing sounds from healthy subjects

To generate the proposed scenarios, we used NBS data from a healthy subject, into which simulated crackles were inserted. The participant included in the study gave his informed consent. The BS signals were acquired on the back of the subject with a 5 × 5 microphone array. For further details on the microphone array, sensor nomenclature, and the breathing maneuver performed by the subject, refer to [1, 3, 4]. To digitize the multichannel BS and air flow signals, we used a 12-bit A/D card with a sampling frequency of 10 kHz. The signals were filtered with a 201st-order FIR band-pass filter in the 75 to 1500 Hz band.

2.1.2 Simulated crackles – Single channel scenario

To create the first simulated scenario (i.e., scenario 1), we considered five sources and thus five mixtures. This scenario was built by linearly combining one NBS channel with two simulated FC sources and two simulated CC sources using a random mixing matrix A, whose coefficients guaranteed a signal-to-noise ratio (SNR) low enough that the crackles were not visible. We simulated both FC and CC using the mathematical model proposed by Kiyokawa et al. [7]. This model maintains the temporal characteristics of FC, with an initial deflection width (IDW) of 0.9 ms and a two-cycle duration (2CD) of 3 ms, and those of CC, with an IDW of 2.2 ms and a 2CD of 10 ms [8]. The crackles were inserted in the inspiratory (IP) and expiratory (EP) phases with temporal overlap. Source 1 is the NBS channel, source 2 contains six FC in the IP, source 3 contains five CC in the IP, source 4 contains five FC in the EP, and source 5 contains six CC in the EP. For BSS and to select the optimal algorithm, we used the three ICA algorithms mentioned above (i.e., FastICA, Infomax, and TDSEP). In FastICA, we estimated the IC symmetrically and used the hyperbolic tangent as the nonlinear function. In TDSEP, we used time delays in the range τ = 0, 1, …, 50. Finally, in Infomax, we used the extended version.

2.1.3 Simulated crackles – Multichannel scenario

In the second scenario (i.e., scenario 2), we used a multichannel NBS recording (25 channels) and four types of insertion: ten FC in the IP, nine CC in the IP, nine FC in the EP, and ten CC in the EP. These four simulated crackle sources were inserted in the PRC4, PM4, PRX3, and PLC3 channels, respectively, considering crackle transmission between channels as well as temporal overlap. We made the insertions by adding the crackle sources to the NBS, maintaining the same SNR restriction achieved in scenario 1.
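The additive insertion under an SNR restriction can be sketched as follows. The sketch assumes that the crackle train plays the role of the signal and the NBS that of the background, and the target of -10 dB is a hypothetical value chosen for illustration, since the text does not report the SNR used. The sinusoidal background and the impulse train are placeholders, not real NBS or the crackle model of [7].

```python
import math


def mean_power(x):
    """Average power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def snr_db(signal, background):
    """SNR in decibels between two equally long sequences."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(background))


def insert_at_snr(background, crackles, target_snr_db):
    """Scale the crackle train to the target SNR and add it to the background."""
    gain = math.sqrt(
        mean_power(background) * 10.0 ** (target_snr_db / 10.0) / mean_power(crackles)
    )
    scaled = [gain * c for c in crackles]
    mixture = [b + c for b, c in zip(background, scaled)]
    return mixture, scaled


fs = 10_000
background = [math.sin(2 * math.pi * 150 * n / fs) for n in range(fs)]   # placeholder NBS
crackles = [1.0 if n % 2000 == 500 else 0.0 for n in range(fs)]          # placeholder crackles
mixture, scaled = insert_at_snr(background, crackles, target_snr_db=-10.0)
```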

2.2 Optimal ICA algorithm selection

The Amari index (Am) and the signal-to-interference ratio (SIR) were two of the indexes used to evaluate the separation achieved by the three ICA algorithms in scenario 1. Further information on these indexes can be found in [1, 9]. As the third index, we calculated the total relative distortion (TRD), defined as follows:

$$ TRD=\frac{{||{\hat{\mathbf{s}}}_{\mathrm{i}}||}^2-{\left|\left\langle {\hat{\mathbf{s}}}_{\mathrm{i}},{\mathbf{s}}_{\mathrm{i}}\right\rangle \right|}^2}{{\left|\left\langle {\hat{\mathbf{s}}}_{\mathrm{i}},{\mathbf{s}}_{\mathrm{i}}\right\rangle \right|}^2} \tag{1} $$

where \( {\hat{\mathbf{s}}}_{\mathrm{i}} \) represents the estimated sources and \( {\mathbf{s}}_{\mathrm{i}} \) the reference signals. TRD represents the total error in the source separation, i.e., it is associated with different types of errors in the estimations (e.g., interferences, noise, and artefacts). Ideally, TRD should equal zero [10].
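Eq. (1) can be evaluated with a few lines of code. The sketch below assumes that the reference signal is scaled to unit norm before the inner product is taken, which is what makes TRD equal zero for an estimate that is a scaled copy of the reference; the text does not state this normalization, so it should be read as an assumption.

```python
import math


def dot(u, v):
    """Inner product of two equally long real sequences."""
    return sum(a * b for a, b in zip(u, v))


def trd(s_hat, s_ref):
    """Total relative distortion of one estimated source, following Eq. (1)."""
    norm = math.sqrt(dot(s_ref, s_ref))
    if norm == 0.0:
        raise ValueError("reference signal has zero energy")
    s_unit = [v / norm for v in s_ref]      # assumed unit-norm reference
    proj_sq = dot(s_hat, s_unit) ** 2       # |<s_hat, s>|^2
    if proj_sq == 0.0:
        return math.inf                     # estimate orthogonal to the reference
    return (dot(s_hat, s_hat) - proj_sq) / proj_sq


reference = [1.0, 0.0, -1.0, 0.0]
scaled_copy = [3.0 * v for v in reference]                 # perfect up to scale
distorted = [r + e for r, e in zip(reference, [0.0, 1.0, 0.0, 1.0])]

print(trd(scaled_copy, reference))
print(trd(distorted, reference))
```

In the second call the added error is orthogonal to the reference and has the same energy, so the distortion equals the energy ratio between error and projected estimate.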

2.3 Automated crackle detection

For the second scenario we used the algorithm with the best performance obtained during the evaluation made in scenario 1. Once the IC were calculated, we performed the automated selection and characterization of the ones containing crackle information according to the following steps:

  1. We determined crackle presence in the IC by calculating the kurtosis and skewness of each squared component during IP and EP. If both values exceed the established thresholds in either IP or EP, that IC is considered to contain crackles in that respiratory phase. After a series of preliminary tests involving IC with and without crackles, the kurtosis and skewness thresholds were set at 35 and 4.5, respectively.

  2. We determined the crackle types (i.e., fine or coarse) with the spectrogram of the IC selected in step 1. Knowing in which phase of the respiratory cycle the crackles are located, we calculated the spectral power in the frequency bands given by the 2CD of FC and CC. The frequency band for CC is between 195 and 205 Hz, whereas for FC it is between 660 and 670 Hz. The type of crackle was determined according to the frequency band with the highest spectral power.

  3. We determined the number of crackles by obtaining the local maxima of the envelope of the frequency band, in the corresponding respiratory phase, of the crackle type present in the selected IC.
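The three steps above can be sketched as follows. A 2CD of 10 ms corresponds to about 2 / 0.010 s = 200 Hz and a 2CD of 3 ms to about 2 / 0.003 s ≈ 667 Hz, which is consistent with the bands quoted in step 2. The thresholds and bands are taken from the text; the excess-kurtosis convention, the frame length, hop, Hann window, and the relative peak threshold are assumptions, and the synthetic bursts are a toy stand-in rather than the crackle model of [7].

```python
import cmath
import math

KURTOSIS_THRESHOLD = 35.0       # from the text
SKEWNESS_THRESHOLD = 4.5        # from the text
CC_BAND_HZ = (195.0, 205.0)     # from the text, around 200 Hz
FC_BAND_HZ = (660.0, 670.0)     # from the text, around 667 Hz


def _central_moment(x, k):
    mean = sum(x) / len(x)
    return sum((v - mean) ** k for v in x) / len(x)


def has_crackles(component):
    """Step 1: compare kurtosis and skewness of the squared component with the thresholds."""
    squared = [v * v for v in component]
    m2 = _central_moment(squared, 2)
    if m2 == 0.0:
        return False
    skew = _central_moment(squared, 3) / m2 ** 1.5
    kurt = _central_moment(squared, 4) / m2 ** 2 - 3.0   # excess kurtosis (assumed)
    return kurt > KURTOSIS_THRESHOLD and skew > SKEWNESS_THRESHOLD


def band_envelope(x, fs, band, frame_len=128, hop=32, n_freqs=3):
    """Spectral power inside `band` for each Hann-windowed frame (one spectrogram band)."""
    lo, hi = band
    freqs = [lo + (hi - lo) * i / (n_freqs - 1) for i in range(n_freqs)]
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    kernels = [
        [window[n] * cmath.exp(-2j * math.pi * f * n / fs) for n in range(frame_len)]
        for f in freqs
    ]
    envelope = []
    for start in range(0, len(x) - frame_len + 1, hop):
        segment = x[start:start + frame_len]
        power = sum(abs(sum(v * k for v, k in zip(segment, ker))) ** 2 for ker in kernels)
        envelope.append(power / n_freqs)
    return envelope


def crackle_type(x, fs):
    """Step 2: choose the band with the larger total spectral power."""
    fine = sum(band_envelope(x, fs, FC_BAND_HZ))
    coarse = sum(band_envelope(x, fs, CC_BAND_HZ))
    return "FC" if fine > coarse else "CC"


def count_peaks(envelope, rel_threshold=0.1):
    """Step 3: count local maxima above a fraction (assumed) of the largest value."""
    if not envelope:
        return 0
    floor = rel_threshold * max(envelope)
    padded = [0.0] + list(envelope) + [0.0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > floor and padded[i] > padded[i - 1] and padded[i] >= padded[i + 1]
    )


def synthetic_component(n, fs, burst_hz, burst_ms, onsets):
    """Toy crackle train: Hann-windowed sinusoidal bursts on a silent background."""
    x = [0.0] * n
    length = int(round(burst_ms * 1e-3 * fs))
    for onset in onsets:
        for i in range(length):
            hann = 0.5 - 0.5 * math.cos(2 * math.pi * i / length)
            x[onset + i] += hann * math.sin(2 * math.pi * burst_hz * i / fs)
    return x


fs = 10_000
onsets = [1000, 2700, 4100, 6300, 8200]
coarse = synthetic_component(10_000, fs, 200.0, 10.0, onsets)
fine = synthetic_component(10_000, fs, 2000.0 / 3.0, 3.0, onsets)

for name, component in (("coarse", coarse), ("fine", fine)):
    if has_crackles(component):
        kind = crackle_type(component, fs)
        band = FC_BAND_HZ if kind == "FC" else CC_BAND_HZ
        print(name, kind, count_peaks(band_envelope(component, fs, band)))
```

The per-frame band power doubles as the envelope needed in step 3, so the same function serves both the type decision and the crackle count.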

3 Results

3.1 Optimal ICA algorithm selection

Table 1 shows the calculated performance indexes for the three ICA algorithms. According to the Am index, Infomax had the best performance, obtaining the lowest value of the three algorithms, followed by FastICA. SIR indicates that Infomax achieved the best source extraction in three of the four crackle sources (sources 2–4), whereas TRD shows that, for the same sources, Infomax achieved the lowest degree of distortion. For source 5, the SIR and TRD values indicate that Infomax still achieved an acceptable extraction performance. Given these results, we chose Infomax as the best algorithm to separate crackles from NBS.

Table 1 Performance indexes of ICA algorithms

3.2 Automated crackle identification in the IC

Figure 1 depicts the 25 IC obtained by Infomax from a respiratory cycle. Estimated crackle sources can be observed in the 11th, 17th, 18th, and 24th IC, whereas the rest of the IC resemble respiratory sound information. Table 2 shows the kurtosis and skewness values calculated in a respiratory cycle for these four IC. These indexes allowed us to automatically detect the 11th and 17th IC as sources with crackles in the IP, and the 18th and 24th IC as sources with crackles in the EP.

Fig. 1 IC obtained by Infomax in scenario 2

Table 2 Calculated kurtosis and skewness in scenario 2

The spectral power in the FC and CC frequency bands during the IP for the 11th and 17th IC and during the EP for the 18th and 24th IC indicated that both the 11th and 24th IC contained CC, whereas the 17th and 18th IC contained FC. Finally, we calculated the number of crackles in each of these IC: the 11th and 18th IC included nine crackles each, whereas the 17th and 24th IC had ten crackles apiece. Together, these results show that nine CC in the IP were present in the 11th IC, ten FC in the IP in the 17th IC, nine FC in the EP in the 18th IC, and ten CC in the EP in the 24th IC. These results indicate that the 11th, 17th, 18th, and 24th IC correspond to the four simulated crackle insertions described in scenario 2. The difference in the spectral power of the frequency bands between FC and CC can be seen in the spectrograms shown in Fig. 2.

Fig. 2 Spectrograms of simulated CC (left) and FC (right)

4 Discussion

We proposed a methodology for the automated extraction and identification of the type and number of crackles in NBS by ICA. Through simulated crackles, we generated a controlled scenario to find the optimal ICA algorithm to separate NBS from FC and CC inserted in both the IP and EP of the respiratory cycle. The results show that our methodology is effective in detecting and characterizing IC that contain crackles. Similarly, we demonstrated that it is possible to determine the components containing crackles, as well as the phase of the respiratory cycle in which they appear, by calculating kurtosis and skewness values. The spectral power in the frequency bands of FC and CC provided a viable method for differentiating crackle types. Also, the envelope of the spectral power of the crackle type enabled us to calculate the number of crackles in the IC by obtaining local maxima. As future work, we plan to validate the methodology with both a larger number of simulated scenarios and BS recordings from pulmonary patients. Note also that real crackles may differ in temporal morphology from the ones simulated in this work, mainly because the crackle generation mechanism varies with the mechanical and dynamic characteristics of the airways. Finally, different time-frequency representations will be assessed to improve the temporal and frequency localization of crackles.

5 Conclusions

Infomax achieved a good source separation of the simulated crackles and the NBS in both scenarios, which allowed the methodology to properly identify the IC containing crackles, the phase of the respiratory cycle in which they appear, crackle type, and number of existent crackles. Obtaining this information by an automated process allows a quantitative analysis that may provide relevant clinical information on lung conditions in patients.