Abstract
Background and objectives
Pulmonary obstruction diseases produce adventitious sounds in the breathing cycle. With the increasing impact of lung diseases, it has become essential for medical professionals to leverage artificial intelligence for faster and more accurate lung auscultation. Early biomedical signal processing techniques focused on features based on signal amplitude, so detection accuracy depended on signal amplitude. The adventitious sounds heard in the respiratory cycle, however, have non-linear characteristics. The present research proposes features based on the non-linearity of the adventitious sounds. In addition, an SVM-LSTM model with Bayesian optimization is applied for the first time to test features of adventitious sounds.
Methods
The characteristics of adventitious sounds contain non-linearities. Targeting these, the research proposes two feature sets based on the wavelet bi-spectrum and bi-phase (eight features each). These features are analyzed with an SVM-LSTM model combined with the Bayesian optimization algorithm. The research employs the RALE\(^{{\circledR }}\) database, the most comprehensive public database of lung sounds.
Results
The results are presented in a 3×10 matrix with the parameters MSE, PSNR, R-value, RMSE, and NRMSE derived from the confusion matrix for SVM, SVM-LSTM, and SVM-LSTM with BO for each class, i.e., wheeze, crackle, and normal. The results are evaluated using Matlab\(^{{\circledR }}\) 2021b (MathWorks\(^{{\circledR }}\), Inc.). Results reveal that the feature sets achieved an accuracy of 94.086% for SVM, 94.684% for SVM-LSTM, and 95.699% (WBS) and 95.161% (WBP) for SVM-LSTM with Bayesian optimization.
Conclusion
The research supports the hypothesis that adventitious sounds have non-linear properties. The new features are more effective in detecting lung sounds. Moreover, combining the LSTM with Bayesian optimization improved each class’s accuracy and statistical parameters. The above model design achieved accurate AI-aided detection of lung diseases for lightweight edge devices.
Introduction
One of the leading causes of death worldwide is respiratory disease (WHO 2019). Patients suffering from these diseases produce adventitious sounds in their breathing cycle. The World Health Organization (WHO) declared COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a global pandemic as it spread rapidly across more than 200 countries (Sanders et al. 2020). COVID-19 presents with symptoms such as fever, sore throat, dry cough, dyspnea, fatigue, and headache; in critical cases, it can progress to multiple organ failure (Kujawski et al. 2020; Chang et al. 2020; Sun et al. 2020). The disease also affects the voice, causing shortness of breath and congestion in the upper airway, and the repetitive dry coughs degrade voice sound quality. Researchers have reported that COVID-19, through inadequate airflow in the vocal tract, results in pulmonary and laryngological involvement (Asiaee et al. 2020). Consequently, the stated symptoms leave an identifiable acoustic signature in the patient’s lung sounds. There are two main categories of adventitious sounds: continuous and discontinuous. Continuous adventitious breath sounds (wheezes, stridor, and rhonchi) last longer than 250 ms, whereas discontinuous sounds (crackles) last about 25 ms (Islama et al. 2018). Proper assessment of COVID-19 patients is critical to mitigate and halt the rapid spread of the disease across nations, and the severity and death rate of COVID-19 increase in the presence of lung/pulmonary diseases. TB and COVID-19 differ in that TB is curable, whereas COVID-19 still lacks effective anti-viral agents and drugs (Pan et al. 2020; Cantini et al. 2020).
Both COVID-19 and TB burden health systems, since both are airborne transmissible diseases that can be diagnosed rapidly. Both carry stigma and require public awareness and cooperation so that they can be prevented, diagnosed, and treated effectively. Most countries still lack information about COVID-19, and, compared with TB, few clinical and immunological parameters are available for understanding how the two diseases differ and interact. Besides, the COVID-19 pandemic led to a notable fall in TB notifications (Migliori et al. 2020). The dominant symptom of COVID-19 is coughing, which is also a symptom of more than 100 other diseases; however, its effect on the respiratory system differs, since lung diseases restrict or obstruct the airway, which influences cough acoustics. Because the glottis behaves differently under different pathological conditions, coughs due to TB, asthma, bronchitis, and pertussis can be distinguished (Pahar et al. 2021). The ongoing flare-up of COVID-19 requires well-planned testing of individuals to limit and arrest the disease as it spreads globally. Chronic pulmonary diseases have been observed to be a main cause of the severity and mortality of COVID-19-affected patients. One of the most feasible assessment approaches for pulmonary disorders, including COVID-19, is radiographic examination exploiting chest X-ray images; researchers have developed DL classifiers over nine classes of chest X-ray images to predict pulmonary diseases alongside COVID-19 (Bhosale and Patnaik 2022b). Chest X-ray imaging has proved one of the most successful radiographic tools for testing COVID-19 cases.
For analyzing diseases, we use SVM-LSTM-BO-based artificial intelligence on various lung-disease sounds and study how much improvement this machine learning algorithm offers for obstructive pulmonary diseases (Bhosale et al. 2022; Bhosale and Sridhar Patnaik 2022). The main aim of DL methodology is to learn hierarchical features from data, allowing complex patterns to be handled skilfully. GGO, consolidation, pleural effusion, and bilateral lung involvement are the patterns specific to COVID-19 infection in radiological images (Carotti et al. 2020), and they can be identified using different DL architectures (Khan et al. 2021). DL models have been reported to offer higher sensitivity and specificity and more accurate predictions for COVID-19 detection. DL methods reduce false negative and false positive rates and provide medical specialists and radiologists with a quick, economical, and accurate diagnostic. A DL model built and trained from scratch can be finely tuned, saving much time on analysis-related tasks. The present research aims to uncover the non-linear characteristics of the adventitious sounds heard during the respiratory cycle. Wheezing sounds differ between expiration and inspiration in intensity, pitch, position, and duration; depending on the narrowing of the airway blockage, the pitch can be either high or low (Swapna et al. 2020; Taplidou and Hadjileontiadis 2007). Kevat et al. (2020) noted that manual auscultation detects adventitious breath sounds with low inter-observer reliability. Their study gathered 192 auscultation recordings of children with two digital stethoscopes (Clinicloud and Littmann), categorized as wheezes and crackles, and used spectrogram and waveform analysis of the Clinicloud recordings to detect them.
The above study had a positive percent agreement (PPA) of 0.95 and a negative percent agreement (NPA) of 0.99 for the Clinicloud recordings, while the Littmann-collected sounds had a PPA of 0.95; in the remaining comparisons, the PPA and NPA were both 0.82 (Fig. 1).
Shi et al. (2019a) use temporal and Mel-spectrogram features with a Bi-GRU VGGish classifier combination on a database of 384 subjects to achieve an accuracy of 87.41%. Aykanat et al. (2017) employ MFCC spectrograms with a CNN classifier to achieve an accuracy of 80%.
Niu (2019) describes a system for detecting the presence of sputum from acquired inhale and exhale respiratory sounds. The research extracted audio features and fed them to a tenfold cross-validation experiment (logistic classifier), achieving a sensitivity of 92.26% and a specificity of 92.26%.
In the research conducted by Shi et al. (2019b), they chose WCC features and combined them with the BPNN classifier to achieve an accuracy of 92.5% with a database of 64 subjects.
Bardou et al. (2018) obtained the features in the form of spectrograms and fed these to a CNN-based classifier to achieve an accuracy of 95.56%.
Demir et al. (2020) clubbed time frequency-based features with convolution neural networks to achieve an accuracy of 65.5%.
The above researchers employed only one or two classifiers for testing their proposed features, and most features target the linear characteristics of adventitious sounds. The achieved accuracies are capped at about 95%, and the databases contain limited numbers of subjects.
Taplidou and Hadjileontiadis (2010) used higher-order spectral features and statistical attributes (SPSS tool) to classify adventitious sounds, with a 96% accuracy rate. The current research proposes sixteen features (two sets of eight each, two of which are carried forward from Taplidou and Hadjileontiadis (2010)) based on WBS and WBP. These features are fed to the proposed classification model, SVM-LSTM with BO. The SVM algorithm is trained with a loss function under k-fold cross-validation with k = 5. The findings suggest a gradual rise in accuracy from SVM to SVM-LSTM and then to the SVM-LSTM-Bayesian-optimization model for both feature types. Remote automated auscultation systems may play a crucial role in combating the shortage of expert physicians; hence, artificial intelligence can be leveraged to assist physicians in performing auscultation remotely and more accurately. This paper proposes a new hybrid framework for lung sound classification in biomedical engineering by combining feature engineering (FE), LSTM, and SVM with Bayesian optimization (BO). The FE module comprises feature selection and extraction phases, and the SVM-LSTM with BO algorithm fine-tunes the control parameters and provides more accurate results by avoiding local-optimum trapping. The proposed FE-SVM-LSTM-BO framework is designed to ensure stability, convergence, and accuracy. The model is tested on lung sound data for the categories wheeze, crackle, and normal, with error-calculation parameters. The results show that the proposed model significantly improves accuracy with a fast convergence rate and outperforms previous studies on all statistical and error parameters (Zulfiqar et al. 2022).
Our proposed work tested adventitious sounds, i.e., crackles, wheezes, and both together, but it cannot detect other sounds such as rhonchi and squawks. The ICBHI database on which our work is based contains only a limited number of respiratory cycles; recording respiratory sounds is challenging compared with other physiological signals such as the ECG, so fewer studies focus on them. Moreover, because DL strategies trained on noisy and small data suffer from significant deviation and generality failures, it is critical to assess the efficiency of DL systems, which are susceptible to noise, incorrect model interpretation, and the inductive implications inherent in cases of uncertainty (Bhosale and Patnaik 2022a). The rest of the paper explains the methods followed in the research, highlighting data acquisition and pre-processing; gives a broad overview of the feature set analysis; presents the experimental results, which are then detailed in the discussion; and concludes the paper, ending with the Acknowledgements.
Methods
The methodology adopted in this work is divided into the following points:
1. Data analysis, in two subsections: data acquisition and data pre-processing, using the RALE database.
2. Feature extraction: mathematical extraction of the features described in the feature analysis section.
3. Collation of the numerical feature values into an Excel sheet.
4. Classification of the categorical data (wheeze, crackle, and normal sounds) with SVM, LSTM, and LSTM with the BO algorithm.
5. Running the algorithms to generate the parameters that form the confusion matrix.
6. Calculation of the error parameters from the confusion matrix.
7. Analysis of the outcomes, discussed in the results and conclusion sections.
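As an illustration, steps 4–6 above can be sketched with scikit-learn. This is a sketch under assumptions: synthetic blob features stand in for the WBS/WBP feature sheet, the variable names are ours, and the actual study additionally stacks an LSTM and Bayesian optimization on top of the SVM stage:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

# Step 3 stand-in: synthetic 8-dimensional feature rows (one per recording),
# three classes standing in for wheeze / crackle / normal.
X, y = make_blobs(n_samples=300, centers=3, n_features=8,
                  cluster_std=1.0, random_state=0)

# Steps 4-6: SVM with k = 5 folds, a confusion matrix built from the
# pooled out-of-fold predictions, and accuracy as one error parameter.
clf = SVC(kernel='rbf', C=1.0)
y_hat = cross_val_predict(clf, X, y, cv=5)
cm = confusion_matrix(y, y_hat)
acc = accuracy_score(y, y_hat)
```

`cross_val_predict` pools the out-of-fold predictions from the five folds, so the confusion matrix covers every recording exactly once.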
Data acquisition
The data in this research mainly comprise the Respiration Acoustics Laboratory Environment RALE\(^{{\circledR }}\) (Pasternak 2008) lung sounds 3.2 (use of the data is permitted for academic research) and other resources (Huang 2005; Keroes 2018). The educational program RALE\(^{{\circledR }}\) aims to educate doctors, nurses, medical professionals, and students. It features about 50 recordings, a collection of adventitious sounds from people of various ages and conditions, plus a quiz area with 24 more instances. The Health Sciences Communications Association has given the collection a commendation award for computer-based products. Wheezes (normal, monophonic, and polyphonic) are represented by 252 recordings, crackles (coarse and fine) by 70, and normal sounds (bronchial, bronchovesicular, and tracheal) by 50.
Data pre-processing
The voltage range of the captured sound is −5 V to +5 V (−32,767 to +32,768). The sound is sampled at 4 kHz, 16 bits, with 1024 points per segment. Following the computerized respiratory sound analysis (CORSA) guidelines, a first-order Butterworth high-pass filter at 7.5 Hz removes the DC offset, and an eighth-order Butterworth low-pass filter at 2.5 kHz removes high-frequency noise. The system uses a band-pass filter (150 Hz–2 kHz) for heart sound cancellation. The signals are divided into segments of the waveform using Goldwave\(^{{\circledR }}\) software. A pulmonologist at a medical clinic in Indore, India, manually validated the database.
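A minimal scipy sketch of this pre-processing chain. Two assumptions are made for illustration only: a sampling rate of 8 kHz (a 2.5 kHz low-pass requires a sampling rate above 5 kHz), and a fourth-order design for the band-pass stage, whose order the text does not state:

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 8000                                             # illustrative sampling rate (assumption)
b_hp, a_hp = butter(1, 7.5 / (fs / 2), btype='high')  # 1st-order HPF: DC-offset removal
b_lp, a_lp = butter(8, 2500 / (fs / 2), btype='low')  # 8th-order LPF at 2.5 kHz
b_bp, a_bp = butter(4, [150 / (fs / 2), 2000 / (fs / 2)],
                    btype='band')                     # heart-sound band (order assumed)

t = np.arange(0, 1.0, 1 / fs)
x = 0.5 + np.sin(2 * np.pi * 400 * t)   # 400 Hz tone riding on a DC offset
y = filtfilt(b_hp, a_hp, x)             # DC offset removed
y = filtfilt(b_lp, a_lp, y)             # high-frequency noise removed
y = filtfilt(b_bp, a_bp, y)             # heart-sound cancellation band
```

`filtfilt` applies each filter forward and backward, so the in-band 400 Hz tone survives with zero phase distortion while the DC component is stripped.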
Feature analysis
A feature is a value drawn from a signal to give it a distinctive identity. This paper proposes a feature set that captures non-linearity in the time-frequency domain: adventitious sounds are non-stationary, and the quadratic phase coupling between their harmonic peaks is non-linear in nature. Higher-order spectra therefore offer a rich feature scope for non-linear signals.
Figures 2, 3, 4, 5, 6, and 7 show the higher-order spectra (wavelet bi-spectrum and bi-phase) of wheeze, crackle, and normal health sounds, marked with global max and min peaks along with their rise and fall times.
Wavelet bi-phase (WBP) and bi-spectrum (WBS)
In obstructive pulmonary disease, airway restriction introduces non-linearity in harmonic peak interactions, and wavelet analysis aids in detecting this non-linearity. In the wavelet transform, wave-like structures (wavelets) are convolved with the signal, revealing its transitory features. The CWT is defined as (Hadjileontiadis 2018)

\(W_{x}(a,b)=\frac{1}{\sqrt{a}}\int x(t)\,\psi ^{*}\left (\frac{t-b}{a}\right )dt \qquad \text{(1)}\)

where x(t) represents the signal in the time domain (x(t) ∈ L2(R)), * denotes the complex conjugate, and ψ(t) is the mother wavelet scaled by a factor a, a > 0, and translated by a factor b, with a and b continuous. The Morlet wavelet has the advantage of time and frequency localization; it helps identify measurable features in the time-frequency domain and is preferred as the mother wavelet:

\(\psi (t)=\frac{1}{\sqrt{\pi f_{b}}}\,e^{j2\pi f_{c}t}\,e^{-t^{2}/f_{b}} \qquad \text{(2)}\)

where fc and fb are the central wavelet frequency and bandwidth parameters, respectively. The wavelet bi-spectrum is defined as

\(WB_{x}(a_{1},a_{2})={\int }_{T}W_{x}(a_{1},\tau )\,W_{x}(a_{2},\tau )\,W_{x}^{*}(a,\tau )\,d\tau ,\qquad \frac{1}{a}=\frac{1}{a_{1}}+\frac{1}{a_{2}} \qquad \text{(3)}\)

The preceding integration takes place over a finite time interval T: τ0 < τ < τ1, and a, a1, and a2 are the wavelet component and signal scale lengths. The WBS captures quadratic phase coupling between wavelet components in the interval T. Wavelet bi-amplitude and bi-phase refer to the magnitude and phase of the complex WBS, respectively.
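The wavelet bi-spectrum computation can be sketched in numpy. This is a sketch under assumptions: a naive direct-correlation Morlet CWT with fc = fb = 1, a nearest-scale lookup for 1/a = 1/a1 + 1/a2, and function names of our own choosing. The demo shows that the bi-spectrum magnitude is much larger when an f1 + f2 component is quadratically phase-coupled to f1 and f2 than when it is absent:

```python
import numpy as np

def morlet(t, fc=1.0, fb=1.0):
    """Complex Morlet mother wavelet (central frequency fc, bandwidth fb)."""
    return (np.pi * fb) ** -0.5 * np.exp(2j * np.pi * fc * t - t ** 2 / fb)

def cwt(x, scales, fs, fc=1.0, fb=1.0):
    """Naive CWT: correlate x with the scaled, conjugated mother wavelet."""
    W = np.empty((len(scales), len(x)), dtype=complex)
    for i, a in enumerate(scales):
        t = np.arange(-4 * a, 4 * a, 1 / fs)          # truncated wavelet support
        psi = morlet(t / a, fc, fb) / np.sqrt(a)
        W[i] = np.convolve(x, np.conj(psi)[::-1], mode='same') / fs
    return W

def wavelet_bispectrum(W, scales, i1, i2):
    """WBS for the scale pair (a1, a2); the third scale a satisfies
    1/a = 1/a1 + 1/a2 (nearest available scale is used)."""
    a = 1.0 / (1.0 / scales[i1] + 1.0 / scales[i2])
    i3 = int(np.argmin(np.abs(np.asarray(scales) - a)))
    return np.sum(W[i1] * W[i2] * np.conj(W[i3]))

# Quadratic phase coupling demo: the f1 + f2 component present vs absent.
fs = 1000
t = np.arange(0, 2, 1 / fs)
f1, f2 = 40.0, 60.0
scales = [1 / f1, 1 / f2, 1 / (f1 + f2)]              # a = fc / f with fc = 1
coupled = (np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)
           + np.cos(2 * np.pi * (f1 + f2) * t))
uncoupled = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)
wbs_c = wavelet_bispectrum(cwt(coupled, scales, fs), scales, 0, 1)
wbs_u = wavelet_bispectrum(cwt(uncoupled, scales, fs), scales, 0, 1)
```

In the coupled case the product W(a1, t) W(a2, t) W*(a, t) has a constant phase and sums coherently over T; without the sum component the product oscillates and averages toward zero.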
Instantaneous wavelet bi-amplitude and bi-phase
The WBS defined in Eq. 3 corresponds to the time interval T; the instantaneous WBS (IWBS) is defined as follows:

\(IWB_{x}(a_{1},a_{2},t)=W_{x}(a_{1},t)\,W_{x}(a_{2},t)\,W_{x}^{*}(a,t)=A_{x}e^{j\phi _{x}} \qquad \text{(4)}\)
The IWBS is a complex quantity with a magnitude of bi-amplitude and a phase of bi-phase, as shown by the above equation.
Global peaks (GPs) maxima, minima, and Euclidean distance
The following features are based on GPs and Euclidean distance.
Feature 1: Global max value in the amplitude domain (wavelet bi-spectrum), GMaxWBx.
Feature 2: Global min value in the amplitude domain (wavelet bi-spectrum), GMinWBx.
Feature 9: Global max value in the amplitude domain (wavelet bi-phase), GMaxϕx.
Feature 10: Global min value in the amplitude domain (wavelet bi-phase), GMinϕx.
A peak is classified as a GP if its amplitude exceeds the average amplitude. A global maximum or minimum is the extreme value attained by a function in the positive or negative direction. The global peaks (GPs) appear throughout the signal’s lifespan (TTotal), and their features capture the bi-frequency-related qualities of the proposed feature set.
where Ax(ω1,ω2) is the instantaneous wavelet bi-amplitude of the IWBS over the area Δ that exceeds the statistical noise.
Ci = (ωc1, ωc2), i = 1, 2, ..., l denote the GP locations. A function f defined on a domain D has a global maximum at c ∈ D if f(x) ≤ f(c) for all x ∈ D, and a global minimum at c ∈ D if f(x) ≥ f(c) for all x ∈ D. As seen in Table 1, the GMax of both bi-phase and bi-spectrum has −45.8678, −12.276, and −13.3763 values for W, C, and N. The GMin of both bi-phase and bi-spectrum for crackle has −99.7943 values; the GMin for bi-spectrum has −102.6085 and −111.2704 values for W and N, and the GMin for bi-phase has −102.609 and −111.27 values for W and N.
The following section details features three and eleven. Feature 3: the distance of Ci from the contour S of the i-th GP in the bi-frequency domain for the wavelet bi-spectrum, DGPiWBx. Feature 11: the same distance for the wavelet bi-phase, \( D^{GP_{i}} \phi _{x}\).
The distance of the Ci from the contour S of the i th GP at the bi-frequency domain, considering the contour S of GPi is denoted by DGPi. The Euclidean distance (feature numbers 3 and 11) of Si from Ci can be defined as follows:
where m is the number of points on the contour S of GPi. As seen in Table 1, the DGPi of both bi-phase and bi-spectrum has 56.7406, 630.5659, and 106.9485 values for W, C, and N.
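As a sketch of features 3 and 11 (the function name is ours), the Euclidean distance can be computed as the mean distance from the GP location Ci to the m contour points of S:

```python
import numpy as np

def contour_distance(peak, contour):
    """Mean Euclidean distance from the GP location C_i = (w_c1, w_c2)
    to the m points of its contour S in the bi-frequency plane."""
    peak = np.asarray(peak, dtype=float)
    contour = np.asarray(contour, dtype=float)
    return float(np.mean(np.linalg.norm(contour - peak, axis=1)))

# Example: a circular contour of radius 2 around the peak gives distance 2.
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
S = np.column_stack([2 * np.cos(theta), 2 * np.sin(theta)])
d = contour_distance((0.0, 0.0), S)
```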
Amplitude above mean
Feature 4: Amplitude above mean in the wavelet bi-spectrum, AmeanWBx. Feature 12: Amplitude above mean in the wavelet bi-phase, Ameanϕx. This section discusses the fourth and twelfth features. The peak-to-peak (p-p) amplitude is the difference between the largest and smallest points; Fig. 8 depicts the signal amplitude measurement points, with the p-p amplitude denoted by the number “2”. The mean of the spectrum (MS) is half the peak-to-peak amplitude, and the amplitude above mean is Amean = peak amplitude (denoted as “1” in Fig. 8) − MS.
As seen in Table 1, the Amean of both bi-phase and bi-spectrum has 28.3703, 43.7591, and 48.9471 values for W, C, and N.
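A minimal numpy sketch of this definition (note that “mean of the spectrum” here follows the half peak-to-peak convention above, not the arithmetic mean; the function name and sample values are ours):

```python
import numpy as np

def amplitude_above_mean(spectrum):
    """A_mean = peak amplitude - MS, where MS = (max - min) / 2 (half of p-p)."""
    s = np.asarray(spectrum, dtype=float)
    ms = (s.max() - s.min()) / 2.0      # mean of the spectrum (MS)
    return float(s.max() - ms)

a_mean = amplitude_above_mean([0.0, 4.0, 10.0, 2.0])   # p-p = 10, MS = 5
```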
Average instantaneous WBS/WBP
Feature 6: Average instantaneous wavelet bi-spectrum across the examined total time interval T, mWBx(ω1,ω2). Feature 14: Average instantaneous wavelet bi-phase across the examined total time interval T, mϕx(ω1,ω2). This section elaborates features six and fourteen. The maximum instantaneous wavelet bi-phase of the LPs in the time interval t is denoted as mϕx(ω1,ω2), and for the WBS as mWBx(ω1,ω2). The frequencies \(\omega _{c_{1} }\) and \(\omega _{c_{2} }\) where LPi attains its maximum vary with time; their time dependence is represented as \(\omega _{c_{1} } (t)\) and \(\omega _{c_{2} } (t)\).
As seen in Table 1, the average instantaneous value of both bi-phase and bi-spectrum has 5.18E+04, 7.01E+04, and 5.11E+04 values for W, C, and N.
Maximum WBS/WBP across time
Feature 7: Maximum wavelet bi-spectrum across time related to LPs, \({\max \limits } WB_{x}^{LP}\). Feature 15: Maximum wavelet bi-phase across time related to LPs, \({\max \limits } \phi _{x}^{LP}\). This section covers features seven and fifteen. The local peaks (LPs) are seen in the signal’s detailed perspective based on the window-overlapping section Δ using IWBS analysis.
where l is the number of LPs, and i is the maximum peak position. The maximum WBS/WBP across time is related to local peaks as follows:
As seen in Table 1, the Max of both bi-phase and bi-spectrum has 1020, 1024, and 1021 values for W, C, and N.
Arithmetic mean (AM) and standard deviation (SD)
Feature 5: Mean wavelet bi-spectrum related to LPs, \(\text {mean} WB_{x}^{LP}\). Feature 8: Standard deviation of the wavelet bi-spectrum related to LPs, \( stdWB_{x}^{LP}\). Feature 13: Mean wavelet bi-phase related to LPs (Taplidou and Hadjileontiadis 2010), \( \text {mean} \phi _{x}^{LP}\). Feature 16: Standard deviation of the wavelet bi-phase related to LPs (Taplidou and Hadjileontiadis 2010), \( std \phi _{x}^{LP}\). This section discusses features five, eight, thirteen, and sixteen. The AM is the central value of a collection of data, and the SD measures the dispersion of the data from that mean.
As seen in Table 1, the mean of both bi-phase and bi-spectrum has 510.7143, 535.3702, and 511.34 values for W, C, and N. As seen in Table 1, the Std of both bi-phase and bi-spectrum has 290.4293, 289.4929, and 287.2834 values for W, C, and N.
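Computing features 5, 8, 13, and 16 is a direct mean/standard deviation over the LP values; a small numpy sketch (the LP values are illustrative, not taken from Table 1):

```python
import numpy as np

lp_values = np.array([510.0, 512.0, 508.0, 514.0])   # illustrative local-peak values
am = float(lp_values.mean())    # arithmetic mean: central value of the LPs
sd = float(lp_values.std())     # standard deviation: dispersion about the mean
```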
Results
The results section covers the confusion matrix, accuracy vs. iterations, loss vs. iterations, and the derivation of statistical measures.
As seen from Tables 2, 3, 4, 5, 6, and 7, the precision, recall, specificity, and F1 values all improve for WBS and WBP with LSTM with Bayesian optimization, with WBS under LSTM with Bayesian optimization giving the best values for each metric.
For both WBP and WBS, Tables 2, 3, 4, 5, 6, and 7 present the performance metrics for SVM, LSTM, and LSTM with Bayesian optimization for each class, i.e., wheeze, crackle, and normal sounds. The above model applied in the current study is a new model applied to lung sounds (Anderson et al. 2021), and also it is giving better results.
Table 9 shows the error calculation for SVM, LSTM, and LSTM with BO; these errors are calculated from Figs. 9, 10, and 11. The MSE values for WBS and WBP are 97.333 and 90.667 for SVM and LSTM, while for LSTM with BO they are 38.667 (WBS) and 44.000 (WBP); since a lower MSE is better, the bi-spectrum with LSTM-BO performs best. The PSNR values for WBS and WBP are 24.2482 and 28.5563 for SVM and LSTM, and 32.2579 (WBS) and 31.6983 (WBP) for LSTM with BO; since a higher PSNR is better, the bi-spectrum with LSTM-BO again performs best. The R-values for WBS and WBP are 0.9958 and 0.9962 for SVM and LSTM, and 0.9984 (WBS) and 0.9981 (WBP) for LSTM with BO; the higher the R-value, the better. The RMSE values for WBS and WBP are 9.8658 and 9.5219 for SVM and LSTM, and 6.2183 (WBS) and 6.6332 (WBP) for LSTM with BO (Table 8); the lower the RMSE, the better. The NRMSE values for WBS and WBP are 0.0453 and 0.0471 for SVM and LSTM, and 0.0308 (WBS) and 0.0328 (WBP) for LSTM with BO (Fig. 12); the lower the NRMSE, the better. Table 9 thus shows that the bi-spectrum with LSTM-BO is best on every error parameter.
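The error parameters in Table 9 follow standard definitions; a numpy sketch (the `peak` level and the input vectors are illustrative assumptions, not the study's data):

```python
import numpy as np

def error_metrics(y_true, y_pred, peak=255.0):
    """MSE, PSNR (dB), RMSE, range-normalized RMSE, and Pearson R."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = float(np.mean((y_true - y_pred) ** 2))        # lower is better
    psnr = float(10 * np.log10(peak ** 2 / mse))        # higher is better
    rmse = float(np.sqrt(mse))                          # lower is better
    nrmse = rmse / float(y_true.max() - y_true.min())   # lower is better
    r = float(np.corrcoef(y_true, y_pred)[0, 1])        # closer to 1 is better
    return mse, psnr, rmse, nrmse, r

mse, psnr, rmse, nrmse, r = error_metrics(
    [0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.1, 2.9])
```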
Table 8 shows the comparative analysis against researchers whose results are lower than those of our proposed work. Kevat et al. (2020), in their neural-network study, had a positive percent agreement (PPA) of 0.95 and a negative percent agreement (NPA) of 0.99, while the Littmann-collected sounds had a PPA of 0.95; in the remaining comparisons, the PPA and NPA were both 0.82. Shi et al. (2019a) use temporal and Mel-spectrogram features with a Bi-GRU VGGish classifier combination on a database of 384 subjects to achieve an accuracy of 87.41%. Aykanat et al. (2017) employ MFCC spectrograms with a CNN classifier to achieve an accuracy of 80%. Niu (2019) describes a system for detecting the presence of sputum from acquired inhale and exhale respiratory sounds; the research extracted audio features and fed them to a tenfold cross-validation experiment (logistic classifier), achieving a sensitivity of 92.26% and a specificity of 92.26%. Shi et al. (2019b) chose WCC features and combined them with the BPNN classifier to achieve an accuracy of 92.5% with a database of 64 subjects. Bardou et al. (2018) obtained features in the form of spectrograms and fed these to a CNN-based classifier to achieve an accuracy of 95.56%. Demir et al. (2020) combined time-frequency-based features with convolutional neural networks to achieve an accuracy of 65.5%.
It shows that all error parameters have better values for LSTM with the BO model. Also, Tables 2, 3, 4, 5, 6, and 7 reflect that LSTM with BO model performs best in all parameters, i.e., sensitivity, specificity, precision, F-measure, and accuracy.
Discussion
Automatic classification of adventitious sounds for identifying pulmonary obstruction is a challenge. Previous methods for detecting adventitious lung sounds mostly employed features based on linear characteristics; in this research, we propose features based on the non-linear characteristics of lung sounds. Table 8 compares the various research using the RALE\(^{{\circledR }}\) database and shows that the classifiers previously used for testing features are conventional. Here, we used the SVM-LSTM-BO ML combination for the first time to separate lung anomalies. The results compare the accuracy of the algorithm with and without Bayesian optimization and show that, with Bayesian optimization, the proposed model becomes more effective at detecting the targets. With Bayesian optimization, the algorithm benefits from prior knowledge of the problem’s structure, and the data yields a set of high-quality solutions; the prior information can be adjusted with information gathered during the run to produce new solutions. Tables 2, 3, 4, 5, 6, and 7 show that the accuracy of SVM for both WBS and WBP is 94.086% and of LSTM 94.624%, while LSTM with Bayesian optimization achieves 95.161% for WBP and 95.699% for WBS. We conclude that the major improvement comes with LSTM with Bayesian optimization, with the best improvement appearing for the wavelet bi-spectrum on the accuracy parameter. For the other parameters, as seen in Tables 2, 3, 4, 5, 6, and 7, LSTM with Bayesian optimization is efficient for the wavelet bi-spectrum on F-measure, sensitivity, specificity, and precision for each class, i.e., wheeze, crackle, and normal, with macro and micro averages, respectively. The TP value improves for WBS and WBP with LSTM with Bayesian optimization, with values of 118.67 and 118, showing that WBS with LSTM-BO has the best TP value.
Similarly, as seen from Tables 2, 3, 4, 5, 6, and 7, the TN value improves for WBS and WBP with LSTM with Bayesian optimization, with values of 242.67 and 242, and the FP value improves with values of 5.3333 and 6 (a lower FP means more improvement). The FN value likewise improves for WBS and WBP with LSTM with Bayesian optimization, with values of 5.3333 and 6 (a lower FN means more improvement), again favoring WBS with LSTM-BO.
The results show that SVM parameters such as the penalty and kernel parameters positively affect the SVM model’s correctness and complexity. The findings also reveal that the proposed method could be employed as an aid in diagnosing COVID-19 disease, showing good behavior in increasing classification accuracy and optimal feature selection. The presented strategy can be considered a useful clinical decision-making tool for clinicians. With the increasing popularity of LSTMs, various alterations have been tried on the conventional LSTM architecture to simplify the internal design of cells, make them work more efficiently, and reduce computational complexity.
Conclusion
Researchers proposed two sets of features based on WBS and WBP to detect adventitious lung sounds. Results reveal that the feature sets obtained an accuracy of 94.086% for SVM and 94.684% for LSTM, and 95.699% (WBS) and 95.161% (WBP) for LSTM with Bayesian optimization. The research supports the concept that adventitious sounds have distinct non-linear features. We conclude that combining LSTM with Bayesian optimization improved each class’s accuracy and all statistical parameters, and that the model achieved accurate AI-aided detection of lung diseases for lightweight devices. SVM with LSTM and Bayesian optimization improved all parameters, i.e., accuracy, specificity, sensitivity, precision, and recall, for each class (wheeze, crackle, and normal sounds), with WBS showing more improvement under LSTM with Bayesian optimization than WBP; the proposed combination thus improves on previous work. Future work will focus on increasing the dataset size to include more subjects and a wider range of diseases, such as COVID-19, which will improve the credibility of the proposed model. Although the proposed classification model achieves high performance metrics, it may be further improved by adjusting the pre-processing techniques and the training structure.
Code availability
Available with source.
References
Anderson, IJ, Berk J, Bertram A, Stein A, Azadi AN, Record J, King C, Garber A, Pahwa A. Supplementing the subinternship: effect of e-learning modules on subintern knowledge and confidence. Amer J Med 2021;134:1052–1057.
Asiaee, M, Vahedian Azimi A, Atashi S, Keramatfar A, Nourbakhsh M. 2020. Voice quality evaluation in patients with COVID-19: an acoustic analysis. J Voice, vol 36.
Aykanat, M, Kilic O, Kurt B. Classification of lung sounds using convolutional neural networks. J Image Video Proc 2017;65:195–203.
Bardou, D, Zhang K, Ahmad SM. Lung sounds classification using convolutional neural networks. Artif Intell Med 2018;88:58–69.
Bhosale, Y, Patnaik K. 2022a. Application of deep learning techniques in diagnosis of COVID-19 (coronavirus): a systematic review. Neural Process Lett:1–53.
Bhosale, Y, Patnaik KS. 2022b. Puldi-COVID: chronic obstructive pulmonary (lung) diseases with COVID-19 classification using ensemble deep convolutional neural network from chest x-ray images to minimize severity and mortality rates. Biomed Signal Process Control: 104445.
Bhosale, YH, Sridhar Patnaik K. 2022. IoT deployable lightweight deep learning application for COVID-19 detection with lung diseases using RaspberryPi, pp 1–6.
Bhosale, YH, Zanwar S, Ahmed Z, Nakrani M, Bhuyar D, Shinde U. 2022. Deep convolutional neural network based COVID-19 classification from radiology x-ray images for IoT enabled devices, vol 1, pp 1398–1402.
Cantini, F, Goletti D, Petrone L, Najafi-Fard S, Niccoli L, Foti R. 2020. Immune therapy, or antiviral therapy, or both for COVID-19: a systematic review. Drugs, vol 80.
Carotti, M, Salaffi F, Sarzi-Puttini P, Agostini A, Borgheresi A, Minorati D, Galli M, Marotto D, Giovagnoni A. 2020. Chest CT features of coronavirus disease 2019 (COVID-19) pneumonia: key points for radiologists. La Radiologia Medica, vol 125.
Chang, D, Lin M, Wei L, Xie L, Zhu G, Dela Cruz C, Sharma L. 2020. Epidemiologic and clinical characteristics of novel coronavirus infections involving 13 patients outside Wuhan, China. JAMA, vol 323.
Demir, F, Ismael AM, Şengur A. Classification of lung sounds with CNN model using parallel pooling structure. IEEE Access 2020;8:105376–105383.
Hadjileontiadis, LJ. Continuous wavelet transform and higher-order spectrum: combined potentialities in breath sound analysis and electroencephalogram-based pain characterization. Philos Trans A Math Phys Eng Sci 2018;376:20170249–58.
Huang, G. 2005. The journal of teaching and learning resources.
Islam, MA, Bandyopadhyaya I, Bhattacharyya P, Saha G. Multichannel lung sound analysis for asthma detection. Comput Methods Prog Biomed 2018;159:111–23.
Keroes, J. 2018. Medical Simulation and Training LLC.
Kevat, A, Kalirajah A, Rose R. Artificial intelligence accuracy in detecting pathological breath sounds in children using digital stethoscopes. Open Access 2020;253(21):1–6.
Khan, S, Sohail A, Khan A, Hassan M, Lee YS, Alam J, Basit A, Zubair S. COVID-19 detection in chest x-ray images using deep boosted hybrid learning. Comput Biol Med 2021;137:104816.
Kujawski, S, Wong K, Collins J. Nat Med 2020;26(6):861–868.
Migliori, GB, Thong PM, Akkerman O, Alffenaar J-W, Álvarez Navascués F, Assao-Neino M, Bernard P, Biala J, Blanc F-X, Bogorodskaya E, Borisov S, Buonsenso D, Calnan M, Castellotti P, Centis R, Chakaya J, Cho J-G, Codecasa L, D’Ambrosio L, Goletti D. 2020. Worldwide effects of coronavirus disease pandemic on tuberculosis services, January-April 2020. Emerging Infectious Diseases, vol 26.
Niu, J. A novel method for automatic identification of breathing state. Sci Rep 2019;9(3):1–13.
Pahar, M, Klopper M, Warren R, Niesler T. Covid-19 cough classification using machine learning and global smartphone recordings. Comput Biol Med 2021;135:104572.
Pan, H, Peto R, Henao-Restrepo A-M, Preziosi M-P, Moorthy V, Abdool Karim Q, Ale-Jandria M, García C, Malekzadeh R, Murthy S, Reddy K, Roses Periago M, Hanna P, Ader F, Al-Bader A, Alhasawi A, Al-Lum E, Alotaibi A, Baidya DK. 2020. Repurposed antiviral drugs for COVID-19 — interim who solidarity trial results. N Engl J Med, vol 384.
Pasternak, H. 2008. Rale\(^{{\circledR }}\) lung sound 3.2. http://www.rale.ca/Pricing.htm.
Sanders, JM, Monogue ML, Jodlowski TZ, Cutrell JB. Pharmacologic treatments for coronavirus disease 2019 (COVID-19): a review. JAMA 2020;323(18):1824–1836.
Shi, L, Du K, Zhang C, Ma H. Lung sound recognition algorithm based on vggish-bigru. IEEE Access 2019a;7:139438–49.
Shi, Y, Li Y, Cai M, Zhang XD. A lung sound category recognition method based on wavelet decomposition and BP neural network. Int J Biol Sci 2019b;15:195–207.
Sun, P, Lu X, Xu C, Sun W, Pan B. 2020. Understanding of COVID-19 based on current evidence. J Med Virol, vol 92.
Swapna, MS, Renjini A, Raj V, Sreejyothi S, Sankararaman S. Time series and fractal analyses of wheezing: a novel approach. Phys Eng Sci Med 2020;43:1339–47.
Taplidou, SA, Hadjileontiadis LJ. Wheeze detection based on time-frequency analysis of breath sounds. Comput Biol Med 2007;37(8):1073–83.
Taplidou, SA, Hadjileontiadis LJ. Analysis of wheezes using wavelet higher-order spectral features. IEEE Trans Biomed Eng 2010;57(7):1596–610.
WHO. 2019. World Health Organization. http://www.who.int/respiratory/asthma/en/.
Zulfiqar, M, Gamage KAA, Kamran M, Rasheed MB. 2022. Hyperparameter optimization of bayesian neural network using Bayesian optimization and intelligent feature engineering for load forecasting. Sensors, vol 22(12).
Acknowledgements
The authors acknowledge their indebtedness and extend their gratitude to their parent institutes and research centers.
Author information
Authors and Affiliations
Contributions
All the authors have equally contributed.
Corresponding author
Ethics declarations
Ethics approval
Data used in the present study is secondary data from open sources cited as references in the manuscript.
Consent to participate
Informed consent was obtained from each participant before participation in this research.
Consent for publication
Data used in the present study is secondary data from open sources cited as references in the manuscript.
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
R M Bodade and Divya Dubey contributed equally to this work.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Dubey, R., Bodade, R.M. & Dubey, D. Efficient classification of the adventitious sounds of the lung through a combination of SVM-LSTM-Bayesian optimization algorithm with features based on wavelet bi-phase and bi-spectrum. Res. Biomed. Eng. 39, 349–363 (2023). https://doi.org/10.1007/s42600-023-00270-2