Introduction

Respiratory diseases impose an immense health, economic, and social burden: they are the third leading cause of death worldwide [1] and place a significant load on public health systems [2]. Significant research effort has therefore been dedicated to improving the early diagnosis and routine monitoring of patients with respiratory diseases, so that interventions can be made in time [3]. Much of this research has focused on auscultation and on the characteristics of respiratory sounds (RS), which are directly related to the movement of air, changes within the lung tissue, and the position of secretions within the tracheobronchial tree, and are therefore valuable indicators of respiratory health and respiratory disorders [4].

Respiratory sounds are generally classified as normal or adventitious. Auscultation-based diagnosis and monitoring of respiratory conditions rely heavily on the presence of adventitious sounds and on the altered transmission characteristics of the chest wall. Adventitious sounds are RS superimposed on normal respiratory sounds; they can be discontinuous (crackles) or continuous (wheezes). Crackles are discontinuous, explosive, and non-musical adventitious RS that occur frequently in cardiorespiratory diseases [5]. They are usually classified as fine or coarse based on their duration, loudness, pitch, timing in the respiratory cycle, and relationship to coughing and changes in body position [6]. Wheezes are musical RS that usually last more than 250 ms. They are a common clinical sign in patients with obstructive airway diseases such as asthma and chronic obstructive pulmonary disease (COPD) [7].

In the first edition of the ICBHI Scientific Challenge, participants were asked to develop algorithms that characterize sound recordings collected in clinical and non-clinical (e.g., home visit) environments. The goal was to determine, for each respiratory cycle of a short recording (10–90 s) acquired at a single location, whether the cycle contained crackles, wheezes, or both.
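The per-cycle task can be read as a labelling problem over two presence flags. The sketch below is purely illustrative: the `Cycle` record, its field names, and the class names are assumptions made here, not the challenge's file format, and the "normal" label for cycles with neither sound is included for completeness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cycle:
    """One annotated respiratory cycle (hypothetical structure)."""
    start_s: float       # cycle start time in seconds
    end_s: float         # cycle end time in seconds
    has_crackles: bool   # at least one crackle annotated in the cycle
    has_wheezes: bool    # at least one wheeze annotated in the cycle


def cycle_label(cycle: Cycle) -> str:
    """Map the two presence flags to a single class name."""
    if cycle.has_crackles and cycle.has_wheezes:
        return "both"
    if cycle.has_crackles:
        return "crackles"
    if cycle.has_wheezes:
        return "wheezes"
    return "normal"


# Made-up example: three consecutive cycles from one recording.
recording = [
    Cycle(0.0, 2.1, has_crackles=False, has_wheezes=True),
    Cycle(2.1, 4.5, has_crackles=True, has_wheezes=False),
    Cycle(4.5, 6.8, has_crackles=False, has_wheezes=False),
]
labels = [cycle_label(c) for c in recording]
print(labels)
```

An algorithm for the task would then be scored on how well its predicted label for each cycle agrees with the label derived from the expert annotation.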

To develop solutions for the challenge, participants had access to a respiratory sound database containing various events (e.g., noise, cough, wheezes, crackles) collected from healthy people and from patients with different respiratory conditions (e.g., COPD, asthma), providing a variety of signal sources. The data included not only clean respiratory sounds but also noisy recordings, which made the challenge more realistic. Data were recorded from different chest locations, depending on the protocol used for each data set. The database was annotated by health professionals. The ICBHI challenge process was supported by a dedicated web application. Through this platform, users registered for the contest, accessed the provided datasets, submitted their source code, and communicated in forums. The platform also automatically informed users of their submissions’ ranking and evaluation, and it actively enforced the rules of the contest, e.g., the number of submissions allowed in each contest phase.

The automatic detection or classification of adventitious RS has been the subject of many studies in the last decades. Pramono et al. [8] summarized the most relevant methods employed in those studies. Algorithms developed to detect or classify events usually involve two steps; adventitious RS are no exception. The first step is to extract the relevant features that will be used as detection or classification variables. The second step is to use detection or classification techniques on the data, based on the features extracted. The most common features employed in the literature include Mel-frequency cepstral coefficients (MFCCs), spectral features, energy, entropy, and wavelet coefficients. Machine learning algorithms proposed in the literature include empirical rule-based methods, support vector machines (SVMs), artificial neural networks (ANNs), Gaussian mixture models (GMMs), k-nearest neighbors (k-NNs), and logistic regression models.
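The two-step structure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method from the cited studies: the two features (mean energy and the entropy of an amplitude histogram), the synthetic "wheeze-like" and "crackle-like" signals, and all function names are hypothetical stand-ins, and a plain k-NN vote is used as the classifier.

```python
import math
import random


def extract_features(signal):
    """Step 1: reduce a signal to (mean energy, amplitude entropy)."""
    n = len(signal)
    energy = sum(x * x for x in signal) / n
    # Shannon entropy (bits) of a 16-bin histogram of absolute amplitudes.
    peak = max(abs(x) for x in signal) or 1.0
    bins = [0] * 16
    for x in signal:
        bins[min(int(abs(x) / peak * 16), 15)] += 1
    entropy = -sum((c / n) * math.log2(c / n) for c in bins if c)
    return (energy, entropy)


def knn_predict(train, query, k=3):
    """Step 2: majority vote among the k nearest training feature vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def wheeze_like(rng, n=2000, fs=4000):
    """Synthetic continuous tone plus faint noise (not real data)."""
    f = rng.uniform(200, 600)
    return [math.sin(2 * math.pi * f * i / fs) + rng.gauss(0, 0.01)
            for i in range(n)]


def crackle_like(rng, n=2000):
    """Synthetic faint noise with a few short decaying bursts (not real data)."""
    sig = [rng.gauss(0, 0.01) for _ in range(n)]
    for _ in range(5):
        start = rng.randrange(n - 20)
        for j in range(20):
            sig[start + j] += math.exp(-j / 4) * (-1) ** j
    return sig


rng = random.Random(0)
train = [(extract_features(wheeze_like(rng)), "wheeze") for _ in range(5)]
train += [(extract_features(crackle_like(rng)), "crackle") for _ in range(5)]

prediction = knn_predict(train, extract_features(crackle_like(rng)))
print(prediction)
```

The toy signals are trivially separable; real recordings are not, which is why the literature relies on richer features such as MFCCs and wavelet coefficients.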

Most prior attempts at automated classification of respiratory sounds have been limited by the small number of patients included in the studies. With few patients, very good classification results can be achieved because the algorithm can be custom designed and carefully fitted to the data and features collected from those patients. However, as the number of patients increases to several dozen or several hundred, the features learned from small datasets typically fail to generalize [9].

This paper is structured as follows: in the Challenge data section, we describe the data collection process, as well as the structure of the challenge; future uses of the database are discussed in the Conclusion section.

Challenge Data

Data Collection

The ICBHI Scientific Challenge database contains audio samples, collected independently by two research teams in two different countries, over several years. The database consists of a total of 5.5 h of recordings containing 6898 respiratory cycles, of which 1864 contain crackles, 886 contain wheezes, and 506 contain both crackles and wheezes, in 920 annotated audio samples from 126 subjects.
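The class balance implied by these figures can be worked out directly. The short calculation below assumes that the three reported counts are disjoint (crackles only, wheezes only, both), so that the remaining cycles contain no adventitious sound; that reading is an assumption, as is treating the 5.5 h total as exact when averaging over recordings.

```python
total_cycles = 6898
counts = {"crackles": 1864, "wheezes": 886, "both": 506}

# Assumption: the three counts are disjoint, so the remainder is "normal".
counts["normal"] = total_cycles - sum(counts.values())

# Share of each class as a percentage of all respiratory cycles.
shares = {name: 100 * n / total_cycles for name, n in counts.items()}
for name, pct in shares.items():
    print(f"{name:9s}{counts[name]:6d}{pct:7.1f}%")

# Mean recording length, taking the reported 5.5 h over 920 samples at face value.
mean_duration_s = 5.5 * 3600 / 920
print(f"mean recording length: {mean_duration_s:.1f} s")
```

Under that assumption, roughly half of the cycles are normal, and cycles containing both sounds form the smallest class, so a classifier evaluated on this database has to cope with a marked class imbalance.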

School of Health Sciences, University of Aveiro (ESSUA)

Most of the database consists of audio samples recorded by the ESSUA research team at Respiratory Research and Rehabilitation Laboratory (Lab3R), ESSUA and at Hospital Infante D. Pedro, Aveiro. Sounds from several studies conducted by this research team were included in the database. All the recordings followed the computerized RS analysis guidelines for short-term acquisitions [10], collecting sounds from seven chest locations: trachea; left and right anterior, posterior, and lateral. Sounds were collected in clinical and non-clinical (home) settings. The acquisition of RS was performed on subjects of all ages, from infants to adults and elderly people. Subjects included patients with lower respiratory tract infections, upper respiratory tract infections, COPD, asthma, and bronchiectasis.

In some studies, the sounds were collected sequentially with a digital stethoscope (Welch Allyn Master Elite Plus Stethoscope Model 5079-400). In other studies, the sounds were collected simultaneously using either seven stethoscopes (3M Littmann Classic II SE) with a microphone in the main tube or seven air-coupled electret microphones (C 417 PP, AKG Acoustics) placed in capsules made of Teflon. Respiratory sounds were annotated using the Computerised Lung Auscultation – Sound System (CLASS) [11].

Aristotle University of Thessaloniki (AUTH)

Respiratory sounds were acquired at the Papanikolaou General Hospital, Thessaloniki and at the General Hospital of Imathia (Health Unit of Naousa), Greece. Sounds were collected sequentially from six chest locations, as shown in Fig. 1. The acquisition of RS was performed on adult and elderly patients. All patients had COPD with comorbidities (e.g. heart failure, diabetes, hypertension).

Fig. 1 Chest locations for the recording of respiratory sounds

These recordings were acquired as part of the European project WELCOME (Wearable Sensing and Smart Cloud Computing for Integrated Care to COPD Patients with Comorbidities) and were annotated using Audacity 2.0.6, a free, open-source, cross-platform application for recording and editing sounds.

Data Annotation and Curation

ESSUA

Sound annotation by respiratory experts is the most common and reliable reference against which to assess the robustness of algorithms that detect adventitious RS [12]. Two respiratory physiotherapists and one medical doctor, all experienced in the visual-auditory recognition of crackles and wheezes, independently annotated the sound files for the presence or absence of adventitious sounds and identified the breathing phases. However, because annotation is time-consuming and difficult to carry out on a large number of sound files, part of the ESSUA database was annotated by only one respiratory physiotherapist. The Respiratory Sound Annotation Software was used for the annotation (Fig. 2) [13].

Fig. 2 Respiratory sound annotation software

AUTH

Respiratory sound annotations were performed by three experienced physicians: two pulmonologists and one cardiologist. The annotations distinguished the following sounds: normal respiratory sound, fine crackles, coarse crackles, wheezing, speech, cough, and artifact. Figure 3 shows a sample of the annotation process. Figure 4 shows an example of an annotated sound recording.

Fig. 3 A sample of the respiratory sound annotation process

Fig. 4 A segment including three respiratory cycles: the first contains wheezes (green), the second contains crackles (blue), and the third is normal (black). Respiratory cycle boundaries are represented by vertical lines (red)

Previous Uses of Data for Classification

Part of the ESSUA database has been used previously for the detection of crackles. Pinho et al. [14] developed an algorithm for automatic crackle detection and characterization and evaluated its performance and accuracy against a multi-annotator gold standard. The developed algorithm was based on three main procedures: (i) extraction of a window of interest of a potential crackle (based on fractal dimension and box filtering techniques); (ii) verification of the validity of the potential crackle considering computerized RS analysis established criteria; and (iii) characterization and extraction of crackle parameters. The paper reported a performance of 89% sensitivity and 95% precision.

Part of the AUTH database has been used previously for the detection of wheezes, crackles, and cough. Mendes et al. proposed a method for the detection of wheezes based on their distinct signature in the spectrogram space (WS-SS). In addition to this feature, 29 musical features were computed using the MIR Toolbox [15]. The paper reported a performance of 91% sensitivity and 99% specificity.

Mendes et al. [16] proposed a method for the detection of crackles using a multi-feature approach. Thirty-five features were extracted, including 31 musical features, a wavelet-based feature, entropy, and Teager energy. WS-SS was also extracted to improve the robustness of the method against the presence of wheezes. The paper reported a performance of 76% sensitivity and 77% precision. Rocha et al. [17] proposed a method for the detection of explosive cough events based on a combination of spectral content descriptors and pitch-related features. The paper reported a performance of 92% sensitivity and 85% specificity.
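The sensitivity, precision, and specificity figures quoted in this section follow the usual definitions in terms of true and false positives and negatives. The sketch below restates those definitions; the counts are hypothetical and chosen only to illustrate the formulas, and how each cited study matched detected events to annotated events is not described here.

```python
def sensitivity(tp, fn):
    """Fraction of annotated events that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of detections that were correct: TP / (TP + FP)."""
    return tp / (tp + fp)


def specificity(tn, fp):
    """Fraction of non-events correctly rejected: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical confusion counts, for illustration only.
tp, fp, fn, tn = 80, 20, 20, 180

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"precision   = {precision(tp, fp):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

Precision is often reported instead of specificity for event detection because the number of true negatives is not well defined when events can occur anywhere in a continuous recording.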

Preparation of the Database for the ICBHI Challenge

The challenge was structured in two phases: unofficial and official. During each phase, data from the two aforementioned databases were divided into training (60%) and testing (40%) sets.

The data included in the training and testing sets were derived from mutually exclusive populations, so recordings from the same subject could not be present in both sets. Furthermore, the data included in the database were anonymized and no personal information was provided.
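A subject-wise split of this kind can be sketched as follows. This is not the organizers' procedure, which is not detailed here: the function name, the greedy assignment of shuffled subjects, and the choice to apply the 60% target to the number of recordings are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_subject(recordings, train_fraction=0.6, seed=0):
    """Split (subject_id, recording_id) pairs so no subject spans both sets.

    Subjects are shuffled, then added to the training set until it holds
    at least `train_fraction` of all recordings; the rest form the test set.
    """
    by_subject = defaultdict(list)
    for subject_id, recording_id in recordings:
        by_subject[subject_id].append(recording_id)

    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    target = train_fraction * len(recordings)
    train, test = [], []
    for subject in subjects:
        bucket = train if len(train) < target else test
        bucket.extend((subject, rec) for rec in by_subject[subject])
    return train, test


# Made-up example: 10 subjects with 1 to 4 recordings each.
recordings = [(f"S{s:02d}", f"S{s:02d}_rec{r}")
              for s in range(10) for r in range(1 + s % 4)]
train, test = split_by_subject(recordings)

train_subjects = {s for s, _ in train}
test_subjects = {s for s, _ in test}
print(len(train), len(test), train_subjects & test_subjects)
```

Because whole subjects are assigned at once, the realized proportions only approximate the target, which is the price of keeping the two populations disjoint.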

During the official phase of the challenge, the training set included 2063 respiratory cycles from 539 recordings derived from 79 subjects, while the testing set included 1579 respiratory cycles from 381 recordings derived from 49 patients. Table 1 provides further details about the distribution of the adventitious RS between the datasets.

Table 1 Summary of the training and testing sets used in the official phase of the ICBHI challenge

Conclusion

The creation of this database and the related scientific challenge constitute an initial but decisive step towards leveraging computational lung auscultation, and also towards highlighting the complexity of the RS classification problem. The availability of the database after the challenge (details will be posted on the challenge’s website), along with the challenge’s approaches and results, will set the basis to ensure the continuation of efforts, hopefully inspiring and facilitating future relevant competitions.