Introduction

Cough is a common symptom, and one of the earliest, in a range of respiratory diseases such as bronchitis, pneumonia, asthma and pertussis. It is a natural protective mechanism that helps clear secretions from the respiratory tract and prevents noxious particles from entering the respiratory system. It is generally defined as the sudden expulsion of air accompanied by a “typical sound” [10]. The prevalence of cough in communities in Europe and the USA varies between 9 and 33% [6] and is likely higher in the developing world. Even though cough is common in respiratory diseases and is considered an important clinical symptom, there is no objective gold standard for assessing cough quality. Subjective assessment of the dryness or wetness of cough sounds is the reference method used by clinicians around the globe [3,19]. Cough carries vital information on the state of the airway [17], but the field of cough analysis is in its infancy.

Based on the perceived presence of sounds related to secretions in the airways, cough is classified into two categories, ‘wet cough’ and ‘dry cough’. A cough is characterized as wet when its sounds carry features indicative of mucus, and as dry in the absence of perceivable wetness. This is essentially a subjective process. Medically, wet and dry coughs have different causes, and identifying them aids the differential diagnosis of diseases such as bronchiectasis, asthma, chronic bronchitis and bronchiolitis [3]. The dry/wet classification is often used in epidemiological studies [21,22] and clinical research [3,26]. In children, wet cough is generally associated with lower respiratory tract infections [26]. Conditions such as asthma and post-infectious states can cause dry cough. In some conditions, a dry cough as perceived by a clinician indicates an early stage of the disease, and it may later become a wet cough as the disease progresses and more secretions accumulate in the airway.

In current clinical practice, cough quality is generally evaluated by asking patients or their caregivers during a clinical assessment. When the patient’s medical condition allows, clinicians assess cough quality by listening while the patient coughs voluntarily. In doing so, however, significant temporal information about the frequency of coughs and the variation in their wetness is lost, information that may be useful both for making a differential diagnosis and for assessing the efficacy of treatment. In addition, manual evaluation of the wetness of a cough is subjective, and the outcome depends on the experience of the clinician [9,19]. The process also suffers from the difficulty humans have in discerning low levels of mucus in the airways from cough sounds; even trained clinicians under-score wet coughs when compared with bronchoscopic findings [3].

Researchers have rarely attempted to develop technology for the automated, objective classification of cough into dry and wet categories. To the best of our knowledge, only two prior works exist in this area [5,14]. Murata et al. [14] argued that cough sound frequencies can be used to discriminate between wet and dry coughs. Chatrzarrin et al. [5] proposed peaks of the energy envelope and spectral features of the cough sounds for the same purpose. These studies opened up a new branch of research in respiratory sound analysis. However, they were limited to descriptive studies of some characteristic features of coughs. No definitive classification algorithm or results were presented for wet/dry differentiation. The amount of data analyzed was also small, 30 cough samples from 10 subjects (5 healthy and 5 bronchitis patients) in Murata et al. [14] and a total of 16 coughs in Chatrzarrin et al. [5], making the results difficult to interpret.

All existing work used cough sounds from adult subjects only, and the techniques used duration, magnitude and frequency features to characterize coughs as dry or wet. Coughs in adults differ from those in children in many ways. While ‘wet cough’ is the term used for children, the term used for adults is ‘productive cough’, as adults are able to expectorate airway secretions. Further, the same amount of secretions would produce a different sound in a large airway (i.e., in adults) than in a small airway (i.e., in children). The production of cough sound is also a complex physiological process involving several anatomical structures in the lower and upper respiratory system. Its acoustic properties vary significantly [17] with individual differences, age and gender, and also depend heavily on the state of the airways [10]. In disease, cough sound characteristics may change, making it necessary to develop robust methods to identify dryness or wetness. Intensity- and duration-dependent methods will not be sufficient to capture the rich information hidden in cough sounds.

Cough can be a symptom of serious diseases such as childhood pneumonia, which kills over 1 million [23] children in the world. The clinical community recognizes the importance of cough in assessing the health of children. However, researchers have rarely attempted to develop objective, automated cough analysis systems for children. In particular, no prior work exists on wet/dry classification for children. Cough assessment technology developed for adults cannot be extrapolated to children [2]. There is an urgent need to develop an automated, objective cough assessment method for children.

In this paper we address these issues and propose an automated, objective classification model to categorize cough sounds into wet and dry classes. The method uses 1st, 2nd and 3rd order statistical features of the cough sounds (e.g., formant frequencies, mel-cepstrum, non-Gaussianity and bispectrum). The model is trained and tested on a comprehensive database of 536 coughs from 78 subjects (41 male, 37 female) aged from 1 month to 15 years. The subjects included in the study have a range of respiratory illnesses such as asthma, pneumonia, bronchitis and rhinopharyngitis.

Materials and Methods

Figure 1 shows the block diagram of the automated cough classification algorithm proposed in this paper. It is divided into four stages: (A) data acquisition, (B) creation of a cough sound database and classification into wet/dry classes by expert scorers, (C) design of the automatic classifier and (D) testing of the classifier on a prospective cough sound dataset. Sections “Data acquisition” to “Testing of selected LRM ℜ” describe the details of the method.

Figure 1 Block diagram of the proposed method for wet/dry cough sound classification

Data Acquisition

The clinical data acquisition environment for this work was the Respiratory Medicine Unit of the Sardjito Hospital, Gadjah Mada University, Indonesia. Table 1 lists the inclusion and exclusion criteria for subjects. All patients fulfilling the inclusion criteria were approached. Informed consent was obtained using a form approved by the human ethics committees of Gadjah Mada University and The University of Queensland. Patients were recruited within the first 12 h of their admission. After the initial medical assessment, sound recordings were made for the next 4–6 h in the natural environment of the respiratory ward.

Table 1 Inclusion and exclusion criteria used in the study

Sound recordings were made using two systems:

  (i) Computerized data acquisition system: A high-fidelity system with a professional-quality pre-amplifier and A/D converter unit (Model Mobile-Pre USB, M-Audio, California, USA) and a matched pair of low-noise microphones with a hypercardioid beam pattern (Model NT3, RODE, Sydney, Australia). Adobe Audition software version 2 was used to record the sound data onto a laptop computer.

  (ii) Portable recording system: A high-end, lightweight, portable audio recorder powered by two AA batteries (Olympus LS-11) with two precision condenser microphones.

In both sound recording systems we used a sampling rate of 44.1 kHz with a 16 bit resolution (CD-quality recording). The nominal distance from the microphone to the mouth of the patient was 50 cm, but could vary from 40 to 70 cm due to patient movements. For each patient, we also received the final diagnosis as well as all the laboratory and clinical examination results.

Cough Sound Dataset and Classification into Wet or Dry by Expert Human Scorers

Let N be the number of patients whose sound recordings are used in this paper and C be the total number of cough events from the N patients. These C cough events were manually segmented after screening through 6–8 h of sound data from each patient. There is no accepted method for automatically marking the start and end of a cough event, and manual marking is still considered the gold standard. The start and end of every cough event were therefore marked manually after careful listening.

We divided N patients with C cough events into two datasets, (i) DS1 (model design dataset) and (ii) DS2 (prospective study dataset). The patients were divided into DS1 and DS2 based on the order of presentation to the respiratory clinic of the hospital. Patients in datasets DS1 and DS2 were mutually exclusive.

  (i) DS1: consisted of C1 cough events from N1 patients. Cough events from this dataset were used to design the optimal model.

  (ii) DS2: consisted of C2 cough events from N2 patients. Cough events from this dataset were used to test the designed model. The cough events in DS2 were not used at any stage of model design.

Two expert scorers, each with 15–20 years of experience in pediatric respiratory diseases, then scored the cough events from the two datasets into two classes, wet or dry. The scorers were blinded to the subjects’ history and diagnosis. This manual classification is taken as the reference standard against which the results of automatic classification are compared.

Design of Cough Sound Classifier

To design a system for the automatic classification of cough sounds we used cough events from DS1. Let DS11 be the subset of DS1 containing those cough events on which both scorers agreed on the class of the cough sound, and let C11 be the number of cough events in DS11. The cough events in DS11 were used to design the automatic classifier model in a three-step process.

[Step 1] Cough Event Feature Matrix Computation

In this step, a feature vector containing F mathematical features is computed from each of the C11 cough events, and a cough event feature matrix MDS11 of size C11 × F is formed. The F features are computed from a cough event as follows.

  (i) Let x denote a discrete-time sound signal from a cough event.

  (ii) Normalize x by dividing it by its absolute maximum value.

  (iii) Segment x into n equal-size, non-overlapping sub-segments. Let x_i represent the ith sub-segment of x, where i = 1, 2, 3, …, n.

  (iv) Compute the following features for each sub-segment and form a feature vector containing F features: bispectrum score (BGS), non-Gaussianity score (NGS), formant frequencies (FF), pitch (P), log energy (LogE), zero crossing (ZCR), kurtosis (Kurt) and twelve mel-frequency cepstral coefficients (MFCC). Please see “Appendix 1” for a detailed explanation of these features.

  (v) Repeat steps (i)–(iv) for all C11 cough events and form the cough event feature matrix MDS11 of size C11 × F.
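The per-event procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: it covers normalization, segmentation and three of the simpler features (log energy, zero crossings, kurtosis), the exact feature definitions are assumed here because they are given in the appendix rather than in this section, and all function names are hypothetical. The bispectrum, non-Gaussianity, formant, pitch and MFCC features are omitted because they need considerably more machinery.

```python
import math

def normalize(x):
    """Step (ii): divide the signal by its absolute maximum value."""
    peak = max(abs(v) for v in x)
    return [v / peak for v in x] if peak > 0 else list(x)

def split_equal(x, n):
    """Step (iii): n equal-size, non-overlapping sub-segments.
    Assumption: trailing samples that do not fill a sub-segment are dropped."""
    size = len(x) // n
    return [x[i * size:(i + 1) * size] for i in range(n)]

def log_energy(seg, eps=1e-12):
    """Assumed definition: log10 of the mean squared amplitude (eps avoids log(0))."""
    return math.log10(sum(v * v for v in seg) / len(seg) + eps)

def zero_crossings(seg):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(seg, seg[1:]) if (a >= 0) != (b >= 0))

def kurtosis(seg):
    """Fourth standardized moment (non-excess kurtosis); 0.0 for a constant segment."""
    m = sum(seg) / len(seg)
    var = sum((v - m) ** 2 for v in seg) / len(seg)
    if var == 0:
        return 0.0
    return (sum((v - m) ** 4 for v in seg) / len(seg)) / var ** 2

def simple_features(x, n=3):
    """Step (iv), restricted to three features per sub-segment."""
    feats = []
    for seg in split_equal(normalize(x), n):
        feats.extend([log_energy(seg), zero_crossings(seg), kurtosis(seg)])
    return feats

# Toy signal standing in for one cough event (hypothetical values).
cough = [0, 2, -4, 2, 0, -2, 4, -2, 0]
row = simple_features(cough, n=3)   # one row of the feature matrix
```

Stacking one such row per cough event gives a matrix with one row per event, which is the shape of MDS11 described in step (v).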

[Step 2] Automatic Classifier Design

In this paper we used a logistic regression model (LRM) as the pattern classifier. An LRM is a generalized linear model that uses several independent predictors to estimate the probability of a categorical event (the dependent variable). In this work, the dependent variable Y is set to one (Y = 1) for a wet cough and zero (Y = 0) for a dry cough. A model is derived using the regression function to estimate the probability that Y = 1 (i.e., that the cough event belongs to the ‘wet cough’ category) given the independent variables (i.e., the F features) as follows:

$$ \text{Prob}\left( Y = 1 \mid f_{1}, f_{2}, f_{3}, \ldots, f_{F} \right) = \frac{e^{z}}{e^{z} + 1} $$
(1)
$$ z = \beta_{0} + \beta_{1} f_{1} + \beta_{2} f_{2} + \cdots + \beta_{F} f_{F} $$
(2)

In Eqs. (1) and (2), f_1, f_2, …, f_F are the elements of the feature vector (independent variables), β_0 is called the intercept and β_1, β_2 and so on are called the regression coefficients of the independent variables. To select the optimal decision threshold λ on the model output (the cough is classified as wet if the output is at or above λ and dry otherwise), we used receiver operating characteristic (ROC) curve analysis.
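As an illustration of Eqs. (1) and (2) and of the thresholding step, the following Python sketch evaluates the wet-cough probability for one feature vector and picks a threshold by sweeping candidate values. The selection criterion used here, maximizing sensitivity + specificity − 1 (Youden's index), is an assumption made for the example; the text only states that ROC analysis was used. Function names and the toy numbers are hypothetical.

```python
import math

def wet_probability(features, beta0, betas):
    """Eqs. (1)-(2): z = beta0 + sum(beta_j * f_j), Prob = e^z / (e^z + 1)."""
    z = beta0 + sum(b * f for b, f in zip(betas, features))
    # Numerically stable evaluation of e^z / (e^z + 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (ez + 1.0)

def pick_threshold(probs, labels):
    """Sweep candidate thresholds and keep the one with the largest
    sensitivity + specificity - 1 (assumed criterion).
    Requires at least one wet (1) and one dry (0) label."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_lam, best_j = 0.5, -1.0
    for lam in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= lam and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < lam and y == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_lam, best_j = lam, j
    return best_lam

def classify(prob, lam):
    """Wet if the model output is at or above the threshold, dry otherwise."""
    return "wet" if prob >= lam else "dry"

# Toy model outputs and reference labels (1 = wet, 0 = dry), hypothetical values.
probs = [0.1, 0.4, 0.6, 0.9]
labels = [0, 0, 1, 1]
lam = pick_threshold(probs, labels)
```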

We used the data in matrix MDS11 (C11 observations of F independent variables) and adopted the leave-one-out cross-validation (LOV) technique for LRM design. As the name suggests, the LOV technique uses the data from all cough events except one to train the model and the remaining cough event to validate it. This process was systematically repeated C11 times such that each cough event in DS11 was used as the validation data exactly once. This resulted in L_C11 LRMs, one per held-out cough event.
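The leave-one-out loop can be sketched as follows. The loop itself follows the description above; the classifier plugged into it is a deliberately trivial stand-in (a midpoint threshold on a single feature), used only so that the example runs without a logistic regression fitting routine. All names are hypothetical.

```python
def leave_one_out(rows, labels, train, predict):
    """Train one model per held-out event and validate it on that event.
    train(rows, labels) -> model; predict(model, row) -> 0 or 1."""
    models, predictions = [], []
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]        # all events except event i
        train_labels = labels[:i] + labels[i + 1:]
        model = train(train_rows, train_labels)
        models.append(model)
        predictions.append(predict(model, rows[i]))  # validate on event i only
    return models, predictions

# Stand-in classifier (NOT a logistic regression): threshold at the midpoint
# between the class means of the first feature.
def train_midpoint(rows, labels):
    wet = [r[0] for r, y in zip(rows, labels) if y == 1]
    dry = [r[0] for r, y in zip(rows, labels) if y == 0]
    return (sum(wet) / len(wet) + sum(dry) / len(dry)) / 2.0

def predict_midpoint(model, row):
    return 1 if row[0] >= model else 0

rows = [[0.1], [0.2], [0.8], [0.9]]   # toy one-feature events
labels = [0, 0, 1, 1]                 # 1 = wet, 0 = dry
models, predictions = leave_one_out(rows, labels, train_midpoint, predict_midpoint)
```

With C11 events the loop yields C11 models and C11 validation predictions, which is why the number of LRMs equals the number of cough events in DS11.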

To evaluate the performance of the L_C11 designed LRMs, we computed performance measures such as sensitivity, specificity, accuracy, positive predictive value (PPV), negative predictive value (NPV) and Cohen’s kappa (K) statistic. Please see “Appendix 2” on how to interpret K values.
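For reference, these measures can all be derived from a 2 × 2 contingency table with wet as the positive class. The sketch below uses the standard textbook definitions; the counts in the usage line are hypothetical and are not taken from the study's tables, and the function assumes every denominator is non-zero.

```python
def performance(tp, fp, fn, tn):
    """Standard measures from a 2x2 table (wet = positive class).
    tp/fn: reference-wet events called wet/dry; tn/fp: reference-dry events called dry/wet."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Chance agreement for Cohen's kappa, from the row and column marginals.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (accuracy - p_e) / (1 - p_e),
    }

# Hypothetical counts, for illustration only.
measures = performance(tp=40, fp=10, fn=10, tn=40)
```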

The LRMs were designed in two stages:

  (i) Feature selection: Feature selection is the technique of selecting a subset of relevant features for building a robust learning model. In theory, optimal feature selection requires an exhaustive search of all possible subsets of features. For a large number of features, however, this is computationally intensive and impractical. We therefore searched for a satisfactory set of features using the p value. In LRM design a p value is computed for each feature, and it indicates how significantly that feature contributed to the development of the model. Important features have a low p value. We used this property of the LRM to select, during the training phase, a reasonable combination of features (independent variables with low p values) that facilitates the classification. We computed the mean p value of each of the F features over the L_C11 LRMs and selected the features with a mean p value less than a threshold p_ths. Let F_s be the subset of features selected from F.

  (ii) Robust LRM design: We created a matrix M′DS11 of size C11 × F_s from MDS11. Matrix M′DS11 is the cough event feature matrix containing only the selected features F_s from the C11 cough events in DS11. Using M′DS11 and adopting LOV, we retrained the C11 LRMs.
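The two stages above amount to averaging each feature's p value across the leave-one-out models, keeping the features whose mean falls below p_ths, and then dropping the other columns from the feature matrix. A minimal sketch follows, assuming the per-model p values are already available as a list of lists (how they are obtained from the regression fit is not shown); the function names and the numbers are hypothetical.

```python
def select_features(p_values_per_model, p_ths):
    """p_values_per_model[m][j] is the p value of feature j in model m.
    Returns the indices of features whose mean p value is below p_ths,
    together with the mean p values themselves."""
    n_models = len(p_values_per_model)
    n_features = len(p_values_per_model[0])
    mean_p = [sum(model[j] for model in p_values_per_model) / n_models
              for j in range(n_features)]
    selected = [j for j, p in enumerate(mean_p) if p < p_ths]
    return selected, mean_p

def reduce_matrix(matrix, selected):
    """Keep only the selected feature columns of an events-by-features matrix."""
    return [[row[j] for j in selected] for row in matrix]

# Two toy models and three features (hypothetical p values).
p_values = [[0.01, 0.50, 0.05],
            [0.03, 0.70, 0.09]]
selected, mean_p = select_features(p_values, p_ths=0.06)
reduced = reduce_matrix([[1, 2, 3], [4, 5, 6]], selected)
```

The reduced matrix plays the role of M′DS11, on which the leave-one-out training is then repeated.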

[Step 3] Selecting a Good Model from the L_C11 LRMs

From the L_C11 LRMs we selected one model as the best, using the k-means clustering algorithm [8], to test on the prospective study dataset DS2. In the k-means clustering algorithm, the goal is to divide q data points in d-dimensional space into k clusters so that the within-cluster sum of squared distances from the centroid is minimized.

The problem at hand is to select a good model from the L_C11 models available to us. To do so, we divided the L_C11 models in d-dimensional space into k = 2 clusters, i.e., a high-performance model cluster and a low-performance model cluster. We set the space dimension d equal to the number of model parameters plus three performance measures (sensitivity, specificity and kappa). Then, from the cluster of high-performance models, we selected the model with the lowest mean square error with respect to the centroid. Let ℜ represent the selected LRM and λ the corresponding probability decision threshold (a value determined using ROC curves such that the classifier performance is maximized). Once ℜ is chosen, we fix all the parameters of the model and use it to classify cough sounds in the prospective dataset DS2.
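The selection step can be sketched as a two-cluster k-means followed by a nearest-to-centroid pick. Several details below are assumptions made for the example and are not stated in the text: each model is represented as a vector whose last three entries are sensitivity, specificity and kappa; the centroids are seeded deterministically with the models of lowest and highest summed performance; the high-performance cluster is taken to be the one whose centroid has the larger summed performance; and no feature scaling is applied. Function names and numbers are hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def two_means(points, iters=100):
    """Plain k-means with k = 2 and a deterministic seeding (assumption)."""
    perf = lambda p: sum(p[-3:])          # sensitivity + specificity + kappa
    cents = [min(points, key=perf), max(points, key=perf)]
    assign = None
    for _ in range(iters):
        new_assign = [0 if sq_dist(p, cents[0]) <= sq_dist(p, cents[1]) else 1
                      for p in points]
        if new_assign == assign:          # converged
            break
        assign = new_assign
        for c in (0, 1):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                cents[c] = centroid(members)
    return assign, cents

def select_model(points):
    """Index of the model closest to the centroid of the high-performance cluster."""
    assign, cents = two_means(points)
    nonempty = [c for c in (0, 1) if c in assign]
    high = max(nonempty, key=lambda c: sum(cents[c][-3:]))
    candidates = [i for i, a in enumerate(assign) if a == high]
    return min(candidates, key=lambda i: sq_dist(points[i], cents[high]))

# Toy models: [one model parameter, sensitivity, specificity, kappa].
models = [[0.00, 0.60, 0.60, 0.30],
          [0.10, 0.62, 0.58, 0.32],
          [1.00, 0.90, 0.90, 0.70],
          [1.10, 0.92, 0.88, 0.72],
          [1.05, 0.91, 0.89, 0.71]]
best = select_model(models)
```

Picking the member nearest the centroid favours a model that is typical of the high-performance group rather than an outlier that happened to score well on its single held-out event.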

Testing of Selected LRM ℜ

Following the procedure described in section “Design of cough sound classifier” [Step 1] and using the cough events from dataset DS2, we computed the cough event feature matrix MDS2 of size C2 × F, where C2 is the total number of cough events in DS2 and F is the number of features. We formed M′DS2 from MDS2 by keeping only the F_s selected features, and used the selected LRM ℜ to classify the data in M′DS2 into the classes wet or dry. The wet/dry decision is made from the output of ℜ as follows:

Let Y be the output of ℜ for a given cough input. The cough is then classified as wet if Y ≥ λ and dry otherwise.

We compared the results of automatic classification by ℜ with those of the expert scorers and computed the performance measures described in section “Design of cough sound classifier” [Step 2]. All the algorithms were developed in MATLAB version 7.14.0.739 (R2012a).

Results

Cough Sound Datasets and Agreement Between Expert Scorers

In this paper we used sound recording data from N = 78 patients (41 male and 37 female). The mean age of the subjects was 2 years and 11 months. The subjects ranged in age from 1 month to 15 years and had diseases such as asthma, pneumonia, bronchitis and rhinopharyngitis. Table 2 gives the demographic and clinical details of the patients.

Table 2 Demographic and clinical details of the subjects

From the N = 78 patients, a total of C = 536 cough events were analyzed. On average, 7 cough events per patient were analyzed (minimum = 2 and maximum = 13). Dataset DS1 has C1 = 385 cough events from N1 = 60 patients and dataset DS2 has C2 = 151 cough events from N2 = 18 patients.

Table 3 shows the contingency table between the two scorers in classifying cough sounds from DS1 and DS2 into the two classes, wet and dry. In DS1, the scorers agreed on the class of C11 = 310 of the 385 cough events (80.5%), and these events formed subset DS11. In dataset DS2 they agreed on 117 of 151 events (77.5%). The kappa agreement between Scorer 1 and Scorer 2 is 0.55 for DS1 and 0.54 for DS2. Of the 310 cough events in DS11, 82 belonged to the wet class and 228 to the dry class. The DS11 cough events were then used to design the LRMs described in section “Design of cough sound classifier”.

Table 3 Contingency table between human scorers for classifying coughs into wet/dry classes

Cough Sound Characteristics in Our Databases

The mean duration of dry coughs in DS11 was 260 ± 77 ms (computed using 228 dry coughs) and that of wet coughs was 238 ± 54 ms (computed using 82 wet coughs). Figure 2 shows typical examples of a dry cough waveform and a wet cough waveform from two patients, ids #35 and #38 respectively. The cough sound waveforms were generally clean, with a high signal-to-noise ratio (SNR). The mean SNR for DS11 was 15.2 ± 5.5 dB (maximum = 28.65 dB and minimum = 2.9 dB) and that for DS2 was 18.6 ± 4.5 dB (maximum = 27.8 dB and minimum = 11.1 dB). Figure 3 shows the histograms of SNR for the cough sounds in DS11 and DS2.

Figure 2 Typical examples of a dry cough waveform and a wet cough waveform from two patients, ids #35 and #38 respectively, in DS1. The start and end of each cough were manually marked after listening by a single person, the first author of the paper. We used n = 3 to divide each cough segment into 3 sub-segments

Figure 3 Histogram of signal-to-noise ratio (SNR) for the cough sounds in DS11 and DS2. The mean SNR for the cough sounds in DS11 was 15.2 ± 5.5 dB (maximum = 28.65 dB and minimum = 2.9 dB) and that for DS2 was 18.6 ± 4.5 dB (maximum = 27.8 dB and minimum = 11.1 dB)

The start and end of each cough were carefully marked after listening to the cough sounds, as shown in Fig. 2. All the markings were done by a single person, the first author of the paper. Following the method given in section “Design of cough sound classifier” [Step 1], we computed the feature matrix MDS11. We used n = 3 to divide each cough segment into 3 sub-segments. In the literature, clinicians and scientists alike have described cough sounds as consisting of 3 phases: (i) an initial opening burst, (ii) noisy airflow and (iii) glottal closure [24,25]. It has been shown that these phases carry different significant information specific to the quality of the cough, wet or dry. On this basis we divided each cough segment into 3 sub-segments. Setting n = 3 led to a feature vector of length F = 66 consisting of the following features: (n × 12 MFCC) + (n × 4 FF) + (n × [BGS, NGS, P, LogE, ZCR, Kurt]). From the C11 = 310 cough events and F = 66 features, the cough event feature matrix MDS11 was created.
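As a check on the feature count, each of the n = 3 sub-segments contributes 12 MFCCs, 4 formant frequencies and 6 scalar features (BGS, NGS, P, LogE, ZCR, Kurt), so that

$$ F = n \times (12 + 4 + 6) = 3 \times 22 = 66 $$

that is, 36 MFCC features, 12 formant frequency features and 18 scalar features.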

Automatic Classification using LRM

Feature Matrix and LRM Performance During Training Stage

Following the LOV technique, L_C11 = 310 LRMs were designed. The mean training sensitivity and specificity for the 310 LRMs were 92 ± 1% and 93 ± 0.5% respectively. The validation sensitivity and specificity for these models were 62 and 84% respectively. Table 4(A) gives the detailed classification results when all F = 66 features were used to train the LRMs.

Table 4 LRM performance before and after the feature selection

Following the process described in section “Design of cough sound classifier” [Step 2] and using p_ths = 0.06, we selected F_s = 31 features. Figure 4 shows the mean p value associated with each of the F = 66 features, computed over the C11 = 310 LRMs. All features with a mean p value less than p_ths = 0.06 were selected. The selected features were 1 each from the bispectrum score, kurtosis and number of zero crossings, 2 each from the non-Gaussianity score and log energy, 5 from the formant frequencies and 19 from the mel-frequency cepstral coefficients. Table 5 gives details of the features selected for designing the final LRM. According to this table, MFCC-based features were the most dominant: of the 31 selected features, 19 came from different MFCC components. Formant frequencies made the second largest contribution, with 5 features. Moreover, except for the 4th formant frequency and the pitch-based features, which were omitted completely, every feature type contributed features from at least one sub-segment to the final LRM.

Figure 4 Mean p value, with standard deviation as error bar, associated with the F = 66 features computed over the 310 trained LRMs. The p value indicates the significance level of a feature in developing the model

Table 5 F = 66 features were computed from each cough segment by using n = 3 at section “Design of cough sound classifier” [Step 1]

When only the selected features F_s were used to retrain the LRMs, the mean training sensitivity and specificity were 87 ± 1% and 88 ± 0.5% respectively, and the validation sensitivity and specificity were 81 and 83%. The validation kappa agreement between the LRM and the scorers was 0.46 when all the features were used to train the LRM, and it increased to 0.58 when only the selected features were used. Table 4(B) gives the detailed training and validation results after feature selection.

Selection of LRM (ℜ)

From the L_C11 = 310 LRMs designed using data from DS11, the model ℜ was selected using the k-means clustering method as discussed in section “Design of cough sound classifier” [Step 3]. The models were clustered into two groups, high-performance and low-performance models, based on the model parameters and performance measures. Of the 310 models, 202 were clustered into the high-performance group and 108 into the low-performance group. LRM #26 had the lowest mean square error with respect to the centroid of the high-performance models. This model was chosen as ℜ and all its parameters were fixed for future use. ℜ was then tested on the prospective dataset DS2.

Performance of ℜ on Prospective Dataset DS2

Table 6 gives the classification results of ℜ against the expert scorers. When the wet/dry classification of Scorer 1 was used as the reference standard, ℜ had a sensitivity of 77.5%, a specificity of 76% and a kappa agreement of 0.47. With Scorer 2 as the reference, the results were a sensitivity of 75%, a specificity of 64% and a kappa of 0.31. When ℜ was tested only on those events on which Scorer 1 and Scorer 2 agreed (117 cough events), the sensitivity rose to 84% and the kappa value to 0.51. Table 7 shows the contingency table.

Table 6 Performance of ℜ on the prospective study dataset DS2
Table 7 Contingency table for selected LRM tested on dataset DS2

LRM Results When Matched for Age and Gender

Table 8 shows the performance of the LRM on DS11 and DS2 when matched for age and gender. Due to the limited availability of data we considered only 4 divisions: (i) male with age ≤60 months, (ii) female with age ≤60 months, (iii) male with age >60 months and (iv) female with age >60 months. According to this table, during the model design stage there was generally no significant difference in the model validation performance across the four divisions compared with when no division was considered (Tables 4 and 8(A)). Similarly, on the prospective dataset DS2 the selected model performed well across all divisions (Tables 6 and 8(B)) except the 3rd (male with age >60 months), where performance was very poor.

Table 8 LRM validation results for dataset DS11 and prospective dataset DS2 with age and gender matched

Discussion

In this paper we proposed an automated, objective method to classify cough sounds into wet and dry categories. As far as we know, this is the first attempt to develop objective technology for the dry/wet classification of pediatric cough sounds, especially in diseases such as pneumonia. Our work is also unique in that we proposed and validated methods to classify a given cough event into dry/wet groups, in contrast to existing work [5,14], which is limited to qualitatively describing characteristics of cough events pre-classified by a human observer. The results presented in this paper are based on 536 cough events from 78 subjects, whereas existing work uses no more than 30 coughs in its descriptive analyses. For these reasons we have no other work against which to compare our results directly.

The reference method used for the assessment of our technology is the subjective classification of cough sounds into wet/dry classes by two pediatric respiratory physicians from different countries. These scorers were blinded to the actual clinical diagnosis of the subjects. In an event-by-event cough classification, the two experts agreed with each other at a moderate level (κ = 0.54). In Chang et al. [3], inter-clinician agreement for wet/dry cough is reported as κ = 0.88. It should be noted, however, that in Chang et al. [3] clinicians assessed the wetness of cough at the patient level rather than at the level of individual coughs. When we computed the agreement between scorers at the patient level, the kappa value increased to κ = 0.66 (substantial agreement). These numbers further illustrate the subjective nature of dry/wet classification.

Our classifier was trained on coughs from the training set (DS1) using only events on which both scorers reached consensus. As the output of the training process we identified a good logistic regression model (ℜ) and fixed its parameters. The model was then tested on the prospective set (DS2) in several different ways. The highest sensitivity and specificity (84 and 76%) were achieved when we tested ℜ against consensus events within DS2. It is interesting to note that these numbers were at least as high as those obtained by testing against the individual classifications of each scorer.

Another salient feature of our method is its high negative predictive value (NPV = 93%) when scorer consensus data are used as the ground truth. This means that if the model classifies a cough as non-wet (dry), it is very likely that the two expert scorers would independently reach the same conclusion. However, the positive predictive value of our method compared with human scorers is lower (PPV = 55%). Thus, a sizable fraction of the coughs classified by the model as wet end up being consensus-classified as dry by the human scorers. This phenomenon appears to be explained by the results of Chang et al. [3], who found that expert human scorers under-score wet coughs. Chang et al. [3] systematically compared the subjective dry/wet classifications of expert clinicians with bronchoscopic indications of airway mucus. They reported that a clinician’s classification of a cough as dry does not necessarily indicate the absence of secretions. Certain situations in the airways, for instance small amounts of secretions, may not be reflected in cough sounds at a magnitude sufficient to be detected by a human observer. One possible reason for the lower PPV of our method is this weakness in the gold standard, the human scorers, used to generate our performance statistics. This hypothesis needs to be carefully validated against bronchoscopic findings in the future.

The ability to correctly detect airway mucus can be particularly important in the management of suppurative lung diseases [3,4]. Cough is an early symptom of diseases such as pneumonia, bronchitis and bronchiolitis. The accurate assessment of this symptom is crucial in diagnosing acute diseases and in monitoring chronic symptoms and treatment efficacy. It is known that in children, wet coughs are more likely to be associated with lower respiratory tract infections [4]. The subjective classification of wet coughs has low sensitivity as a method of detecting airway mucus, even in the hands of expert clinicians. Accurate, objective technology for the classification of dry/wet coughs is currently unavailable at either the commercial or the research level. To the best of our knowledge, this work is the first attempt to develop such technology.

We present the first approach to automating the dry/wet classification of coughs. The results presented in this paper can be improved by systematically optimizing the parameters and fine-tuning the training process of our classifier. Our heuristic model selection process makes the reported results pessimistic estimates. We also believe that the feature set can be improved and the classification accuracy of the method further increased. Before any optimization attempt, however, the issue we need to resolve is the improvement of the ‘gold standard’ used in the clinical diagnosis. A carefully controlled bronchoscopy study would be best suited as the gold standard. We recognize that optimization work is needed before taking the technology into clinical practice, but we consider it outside the scope of this paper for the reasons explained above.

Another possible limitation of this study is the bias of the cough sound database towards dry coughs; almost 70% of the cough sounds are dry as perceived by the expert human scorers. Even with all these factors, our method can currently classify wet and dry coughs with high sensitivity (84%) and specificity (76%) and with good agreement (κ = 0.51) with the expert human scorers.

The results presented in this paper used manual identification of cough segments from long sound recordings. Once the sounds were identified, the dry/wet classification was fully automated. We are currently developing an automated cough identification technique, and the results will be published elsewhere.

Conclusion

The method proposed in this paper can classify cough sounds into dry and wet classes with high accuracy and good agreement with pediatricians. This is the first known method for wet/dry classification presented with complete training and testing results on a large number of cough samples. It is also the first effort to automate wet/dry classification in a pediatric population with a range of respiratory infectious diseases. It has the potential to develop into a useful clinical tool for long-term cough monitoring, for the assessment of treatment efficacy and for characterizing lower respiratory tract infections. It will be especially useful in clinical or research studies where temporal patterns of cough quality (wet/dry) on an hour-to-hour basis are needed.

The methods proposed in this paper should be suitable for implementation alongside other potential technologies, such as microwave imaging and ultrasound imaging, that may be capable of detecting consolidations and mucus in the lungs.