Abstract
The COVID-19 pandemic has caused catastrophic suffering worldwide. One of the most reliable diagnostic methods for COVID-19 is RT-PCR (Reverse Transcription-Polymerase Chain Reaction) testing. This method, however, has its limitations: it is time-consuming and does not scale well. This research work carries out a preliminary diagnosis of COVID-19 that is scalable and less time-consuming.
The research carried out a comparative analysis of four machine learning models, namely Multilayer Perceptron, Convolutional Neural Networks, Recurrent Neural Networks with Long Short-Term Memory, and VGG-19 with Support Vector Machines. Of these models, the Multilayer Perceptron performed best, with a specificity of 94.5% and an accuracy of 96.8%. The results show that the Multilayer Perceptron was able to distinguish between positive and negative COVID-19 coughs by means of a robust feature embedding technique.
1 Introduction
Coronaviruses are a family of medium-sized RNA viruses that cause illness in both animals and humans; their RNA genome is the largest of all known RNA viruses. A novel coronavirus, SARS-CoV-2, the cause of COVID-19 disease, belongs to the same subgroup as MERS-CoV and SARS-CoV. The disease, commonly known as COVID-19, was declared a pandemic by the World Health Organization (WHO) on March 11, 2020 [1]. It has forced the world into a mandatory lockdown.
The spread of this virus had caused 3.35 million deaths worldwide as of May 2021 and has brought the economy to a standstill. It has also introduced several challenges worldwide. To date, the mode of transmission of SARS-CoV-2 is unresolved and remains a topic of debate among researchers. Most researchers believe that it might be similar to that of SARS, which is transmitted through in-person contact or unsanitized surroundings in the form of aerosols and droplets. Studies have emphasized that patients with pulmonary symptoms are at higher risk of transmission [2, 3]. However, studies have shown that transmission from asymptomatic patients is also possible [4]. It can therefore be concluded that COVID-19 can spread via symptomatic as well as asymptomatic patients. The major task in fighting COVID-19 in most countries is to find asymptomatic patients who might be potential carriers of the coronavirus. Currently, the widely used methods for the diagnosis of COVID-19 are RT-PCR (Reverse Transcription-Polymerase Chain Reaction) and X-ray or CT scans. Since X-rays require a chest scan at a well-equipped medical facility and are quite expensive, RT-PCR is more widely accepted. However, according to one study, this testing is not scalable and is sometimes inaccurate [5]. It is also costly, and most countries have faced difficulties buying more test kits. Thus, in the near future, there will be a need for an alternative testing method that is simpler, non-intrusive, lab-free, and less expensive. Such a method should address all the limitations of current preliminary diagnostic techniques. It must also be based on sound science and identify at-risk individuals effectively.
This research proposes a deep neural network that recognizes the differences between COVID-19 positive and negative coughs using audio classification techniques. It takes raw audio files as input and provides a diagnosis of whether the cough comes from a COVID-infected individual. More precisely, the contributions of this research paper are as follows:
- It provides a deep learning (AI) based pre-screening tool for the diagnosis of COVID-19 that can be made ubiquitously available. Its low cost, rapid results, and ease of access make it a unique solution that can be employed in offices and various institutions as a pre-screening step for entry. It can be used as an aiding tool to increase diagnostic capability and devise a treatment plan in areas where adequate supplies, healthcare facilities, and medical professionals are not available.
- It increases the dataset up to five times by leveraging data augmentation techniques on the open-source cough audio dataset by virufy, thus illustrating a potential way to overcome overfitting in machine learning models caused by a shortage of data.
- It uses features extracted from samples using sound processing techniques. Four models were constructed using two main approaches, i.e., the time series waveform approach and the amplitude waveform approach. In the time series waveform approach, MFCCs were extracted and fed to the MLP, CNN, and RNN with LSTM, whereas in the amplitude waveform approach, features were extracted from the flatten layer of VGG-19 and then fed to an SVM. The results show that, of these four models, the MLP was the most successful in classifying COVID-19 positive and negative coughs, with an accuracy of 96%. This shows that the time series waveform approach learned robust features and generalized better than the amplitude waveform approach.
- It fine-tunes the multilayer perceptron to such an extent that it outperformed some of the existing literature [6, 7].
- It portrays several future directions for this analysis and for voice-based diagnosis in the context of COVID-19, which could open the door to pre-screening of COVID-19 and tracking its impact.
2 Background
The primary reason behind the intractability of COVID-19 is the significant delay between infection and diagnosis. There are two main types of COVID-19 diagnostic techniques: laboratory-based testing and radiography testing.
2.1 Laboratory-Based Testing
Laboratory testing can be further categorized into two kinds: immunoassays and nucleic acid (molecular) tests. Immunoassay tests detect virus-associated proteins, whereas nucleic acid tests detect the genetic code of the virus. Compared with immunoassay tests, nucleic acid tests are more sensitive for early detection, and for that reason they are widely used during this pandemic. These tests often depend on classical technologies, one of which is RT-PCR (Reverse Transcription-Polymerase Chain Reaction) [18]. To perform laboratory-based testing, samples are obtained from throat swabs, nasopharyngeal swabs, deep airway material, or sputum. Even though this technique is quite sensitive in the early detection of COVID-19, it has certain limitations:
i. Geographical and temporal factors limit the availability of testing in various countries.
ii. The massive, time-sensitive demand leads to a scarcity of clinical tests and increases their cost.
iii. A personal visit to the medical facility is needed. Such a visit exposes many segments of the community to the coronavirus. This can be a major obstacle: according to one study, the stability of the virus ranges from three hours in aerosols up to one week on different surfaces, making it highly stable and hence contagious [8].
iv. Many reputed newspapers recently highlighted that the turnaround time stretched to 6–7 working days in a few countries because laboratories were overflowing with COVID tests. As a result, the virus might already have been transmitted to many others by the time a patient is diagnosed and treatment starts [9, 10].
v. Medical staff are often at higher risk of infection due to these in-person testing techniques. Failure to protect physicians can further lead to staff shortages and increase stress on the already distressed paramedical staff.
vi. To protect others from potential exposure, many countries such as India have also approved at-home sample collection under the guidelines of ICMR [11]. However, once a patient collects a nasal sample, they need to put it in a saline solution and ship it overnight to a certified lab authorized to run specific tests on the kit. Hence, this approach also introduces delays and could compromise the quality of samples if the sample is stored for too long.
2.2 Radiography Testing
Experts urge that more and faster testing is needed to control the coronavirus, and many have suggested that Artificial Intelligence (AI) is the solution. According to studies, several COVID-19 diagnostic tools in development use AI to quickly analyze X-ray or CT scans, and they have shown that radiographic tests provide higher sensitivity than laboratory tests [12, 13]. A thoracic CT scan, an optional imaging modality, can play a crucial role in managing the coronavirus. This type of CT scan is an important aspect of COVID-19 diagnosis as it has higher precision. To produce high-resolution medical images, X-rays passing through the patient's thoracic cavity are first picked up by radiation detectors, and the radiographs generated are then reconstructed to form the medical images. Certain patterns in the thoracic cavity can reveal different symptoms. The image is examined by a radiographer, and when this is combined with AI-based analysis of the image, COVID-19 may be detected with much higher specificity. This might be more efficient than a laboratory test such as rRT-PCR. One study reported promising results for a radiology-based diagnostic test: calculated at a 95% confidence interval, it had a high precision of 94% and a lower recall of 37% [14]. However, these techniques require scanning the chest in a well-equipped and expensive medical laboratory. Thus, this method does not solve the problems of in-person laboratory tests highlighted above either.
2.3 Cough-Based Testing
Many studies have presented automated prognostic tools for the examination of respiratory infections [15,16,17]. They have used various deep neural networks, such as Convolutional Neural Networks (CNNs), to recognize coughs within natural noise and to identify diseases such as bronchitis, bronchiolitis, asthma, and COPD from their distinctive cough sound features. Although cough is a frequent symptom of many pulmonary diseases, one study has demonstrated that coughs from different pulmonary diseases have unique characteristics, depending on the condition and location of the underlying irritants [7]. Several studies have shown that changes in the character of a coughing sound can indicate lung disease [19, 20]. Pathological situations arise as a result of conditions such as obstruction, restriction, and combined patterns. Researchers have made numerous efforts to improve the objective classification of coughs in order to classify different respiratory infections. Isolating the cough audio signal helps to distinguish between COVID-19 positive and negative coughs based on these features. Analysis of neurological symptoms recently observed in COVID-19 patients suggested a link between the brain and COVID-19. This led MIT researchers to evaluate their Alzheimer's biomarkers for COVID-19 diagnosis. To detect COVID-19 coughs, they primarily used biomarkers for vocal cord strength, lung performance, sentiment, and muscular degradation in the human body [21] (Fig. 1).
3 Methodology
3.1 Proposed Architecture
3.2 COVID-19 Cough Dataset
In medical research, finding data of the right quantity and quality is a difficult task. The dataset used in this study was combined from various sources; the COVID-19 cough samples were taken from the virufy open-source audio dataset [22]. The dataset consists of 121 sound segments, which are digital audio files in .mp3 format, of which 48 are COVID positive and 73 are negative. Of the three discrete attributes in the dataset, the two relevant to this domain were selected, as shown in Table 1. The cough audio samples were converted from .mp3 format to .wav format. To ensure consistency across the dataset, three major sound properties (audio channels, sample rate, and bit depth) were preprocessed. The audio channels of the cough samples were merged into a mono channel, and the sample rates were converted to the default sample rate of 22.05 kHz. In addition, to remove discrepancies in bit depth, the amplitude values of each audio file were scaled to range between −1 and 1.
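The three preprocessing steps described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the study's code: the function names are hypothetical, the linear-interpolation resampler is a crude stand-in with no anti-aliasing filter, and peak normalization is one possible way to bring amplitudes into the range −1 to 1. In practice, `librosa.load(path, sr=22050, mono=True)` performs the mono mixdown and resampling in a single call.

```python
import numpy as np

def to_mono(audio: np.ndarray) -> np.ndarray:
    """Average the channels; accepts shape (n_samples,) or (n_channels, n_samples)."""
    return audio if audio.ndim == 1 else audio.mean(axis=0)

def resample_linear(audio: np.ndarray, orig_sr: int, target_sr: int = 22050) -> np.ndarray:
    """Resample by linear interpolation (illustrative stand-in only)."""
    if orig_sr == target_sr:
        return audio
    n_out = int(round(len(audio) * target_sr / orig_sr))
    old_times = np.arange(len(audio)) / orig_sr
    new_times = np.arange(n_out) / target_sr
    return np.interp(new_times, old_times, audio)

def peak_normalize(audio: np.ndarray) -> np.ndarray:
    """Scale the signal so that every amplitude lies in [-1, 1]."""
    peak = np.max(np.abs(audio))
    return audio if peak == 0 else audio / peak

def preprocess(audio: np.ndarray, orig_sr: int, target_sr: int = 22050) -> np.ndarray:
    """Mono mixdown, resampling, then amplitude normalization."""
    mono = to_mono(audio)
    resampled = resample_linear(mono, orig_sr, target_sr)
    return peak_normalize(resampled).astype(np.float32)
```

For a one-second stereo recording at 44.1 kHz, `preprocess(stereo, 44100)` returns a one-dimensional array of 22050 samples.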
3.3 Data Augmentation
Some domains, such as medical image analysis or biomedical audio analysis, have limited access to large amounts of data. As a result, datasets are not readily available and are quite small. This can lead to a problem known as overfitting, in which a network learns a function with very high variance that fits the training data so closely that the model's performance on unseen data degrades. One of the methods to resolve this problem is data augmentation.
Data augmentation includes many strategies that improve the diversity and quality of the data available for training, so that deep learning models can be built on it without overfitting. Audio augmentation algorithms are used to generate synthetic audio data. In this study, noise injection, time shifting, pitch change, and speed change were applied to the dataset. The librosa library (library for Recognition and Organization of Speech and Audio) provides an easy way to manipulate pitch and speed, while the NumPy Python package was used to handle noise injection and time shifting. As a result, we were able to increase the dataset fivefold.
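The two NumPy-based augmentations mentioned above could look roughly like the sketch below. The function names, the noise factor of 0.005, and the maximum shift of 0.2 s are illustrative assumptions; the study does not report its parameter values.

```python
import numpy as np

def inject_noise(audio, noise_factor=0.005, rng=None):
    """Add Gaussian noise scaled by noise_factor (assumed value)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(len(audio))
    return (audio + noise_factor * noise).astype(audio.dtype)

def shift_time(audio, sr, max_shift_s=0.2, rng=None):
    """Shift the signal left or right by a random offset, filling the gap with silence."""
    rng = np.random.default_rng() if rng is None else rng
    max_shift = int(sr * max_shift_s)
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.roll(audio, shift)
    if shift > 0:
        shifted[:shift] = 0.0   # samples wrapped around to the front
    elif shift < 0:
        shifted[shift:] = 0.0   # samples wrapped around to the back
    return shifted
```

Pitch and speed changes are available in librosa as `librosa.effects.pitch_shift` and `librosa.effects.time_stretch`. Keeping each original clip alongside four augmented variants would be one way to reach a fivefold increase, although the exact recipe is not stated in the text.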
3.4 Feature Extraction
Past studies have shown that the acoustics of cough sounds may carry important information related to diseases [16]. Two approaches are used in this study to extract these features. The first is to extract MFCCs (Mel Frequency Cepstral Coefficients) from the audio samples. Humans are better at identifying small changes in speech at lower frequencies, and MFCCs leverage this property. The MFCC computation converts the standard frequency to the Mel scale using Eq. 1. It takes into account the sensitivity of human perception at different frequencies and is therefore suitable for audio classification and sound processing. The Mel scale equation, in its commonly used form, is given below:
\[m = 2595 \log_{10}\left(1 + \frac{f}{700}\right) \tag{1}\]
where \(f\) is the frequency in Hz and \(m\) is the corresponding value on the Mel scale.
The Mel frequency cepstrum (MFC) represents the short-term power spectrum of an audio signal. The first step in obtaining the MFC is a Fourier transformation. On taking the log of the magnitude of this Fourier spectrum, as shown in Fig. 2, and then performing a cosine transformation to obtain the spectrum of this log, we observe a peak wherever there is a periodic element in the original time signal [23]. MFCCs are derived from this cepstral representation of the sound samples; they are the coefficients that together form the MFC. The study used the librosa Python package to calculate a series of 40 MFCCs for each sample, as shown in Fig. 3, and stored them in a pandas data frame.
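The pipeline just described (Fourier transform, Mel mapping, logarithm, cosine transform) can be sketched in plain NumPy. The study itself used librosa, whose `librosa.feature.mfcc` applies a different default Mel variant and DCT normalization, so the values below will not match librosa's output. The frame size, hop length, number of Mel bands, and the averaging over time are all assumptions made for this illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the Mel scale."""
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mfcc(signal, sr=22050, n_mfcc=40, n_mels=64, n_fft=2048, hop=512):
    """Return an array of shape (n_frames, n_mfcc); requires len(signal) >= n_fft."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)            # windowed frames
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # short-term power spectrum
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sr).T
    log_mel = np.log(mel_energy + 1e-10)                # log of the Mel spectrum
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))  # unnormalized DCT-II
    return log_mel @ dct_basis.T

def mfcc_feature_vector(signal, sr=22050):
    """Average the coefficients over time to get one 40-value vector per clip (assumed)."""
    return mfcc(signal, sr).mean(axis=0)
```

Averaging over frames is a common way to obtain a fixed-length input for a dense network; whether the study averaged or used another aggregation is not specified.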
The second approach was to extract important features from the last flatten layer of the VGG-19 model. After constructing the VGG-19 model, ImageNet images of size 64 × 64 were fed for pre-training. A NumPy array of pixel values was then created by converting the PIL image object. Next, the 3D array was expanded to a 4D array with dimensions [samples, rows, columns, channels]. The pixel values were then preprocessed as required by the VGG19 model, after which the features were extracted.
In the VGG19 model, shown in Fig. 4, the last layer (1000-dimensional) is removed, and the flatten layer yields a 4096-dimensional feature vector representation of an input image. After extracting these features, a 60–40 train-test split was performed and the data were then fed into the models.
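The array handling described above (PIL image to NumPy array, expansion to 4D, VGG-specific pixel preprocessing) can be sketched without a deep learning framework. The snippet mirrors what Keras' `preprocess_input` for VGG19 does in its default mode, namely an RGB to BGR channel swap followed by subtraction of the ImageNet channel means. The function name is hypothetical, and the input is assumed to be an image already converted with `np.asarray(pil_image)`.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by the original VGG models.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def prepare_for_vgg(image_rgb: np.ndarray) -> np.ndarray:
    """Turn a (rows, columns, 3) RGB array into a (1, rows, columns, 3) VGG input."""
    x = image_rgb.astype(np.float32)
    x = np.expand_dims(x, axis=0)   # 3D -> 4D: [samples, rows, columns, channels]
    x = x[..., ::-1]                # RGB -> BGR
    return x - IMAGENET_BGR_MEAN    # zero-center each channel
```

The resulting batch can then be passed to a truncated VGG-19 network to read out the flatten-layer activations.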
4 Model Architecture
Since the introduction of Neural Networks (NN) for pattern recognition, they have often outperformed traditional algorithms. For instance, in a study on urban sound classification, the performance of an SVM was compared with different configurations of neural models, such as a deep neural network (DNN), a recurrent neural network (RNN), and a Convolutional Neural Network (CNN); better results were obtained with a CNN or a DNN than with an SVM or an RNN [25]. Keeping this in mind, this research used three different neural network configurations and an SVM. In the end, the results of each model were compared and the best model was chosen.
4.1 Multilayer Perceptron
The Multilayer Perceptron, or MLP for short, is a long-established neural network formed by combining multiple neurons. Data is fed in at the input layer and then processed by the hidden layers, which increase the level of abstraction. After the hidden layers have processed the data, the output layer gives the final predictions. The study used data augmentation (noise, shift, and stretch) to enlarge the audio dataset in order to overcome overfitting. The MLP was constructed using Keras with a TensorFlow backend. The model built in this research was sequential and consisted of four layers: an input layer, two hidden layers, and an output layer, all of the dense type, which is the standard type in most cases. The input layer and the two hidden layers comprised 256, 128, and 64 nodes respectively, each with the ReLU activation function and a dropout value of 25%. ReLU has proven to perform extremely well in neural network frameworks; it is explained further in Appendix A.2. Dropout, which randomly excludes nodes during training, is used for better generalization and decreases the chance of overfitting. Finally, the output layer has 2 nodes, matching the number of class labels, with softmax as the activation function (explained further in Appendix A.1). Softmax transforms the results into probabilities, which is why it is widely used in machine learning models. Based on the highest probability, the model then classifies the cough as COVID-19 positive or negative.
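To make the layer structure concrete, the following NumPy sketch runs a forward pass through dense layers of 256, 128, and 64 units with ReLU and 25% dropout, followed by a 2-unit softmax output. It is not the Keras model used in the study: the weights are random and untrained, the 40-dimensional input (one value per MFCC) is an assumption, and all names are illustrative.

```python
import numpy as np

LAYER_SIZES = [40, 256, 128, 64, 2]  # 40 input features is an assumption

def init_params(sizes, rng):
    """He-style random initialization; returns a list of (weights, bias) pairs."""
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row maximum for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, params, dropout=0.25, training=False, rng=None):
    """Dense + ReLU (+ dropout while training) for each layer, then softmax."""
    h = x
    for w, b in params[:-1]:
        h = relu(h @ w + b)
        if training:
            keep = rng.random(h.shape) >= dropout
            h = h * keep / (1.0 - dropout)  # inverted dropout keeps the expected scale
    w, b = params[-1]
    return softmax(h @ w + b)

rng = np.random.default_rng(0)
params = init_params(LAYER_SIZES, rng)
batch = rng.standard_normal((5, 40))   # five hypothetical feature vectors
probs = forward(batch, params)         # shape (5, 2), each row sums to 1
labels = probs.argmax(axis=1)          # class with the highest probability
```

With these sizes the network has 51,778 trainable parameters, which is small enough to train quickly on an augmented dataset of a few hundred clips.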
4.2 Convolutional Neural Networks
Another deep learning algorithm implemented in this study is the Convolutional Neural Network (CNN). It can take an image as input, assign significance to the various elements in the image, and distinguish one from another. As a precautionary measure, each input cough recording processed for MFCC extraction was divided into 6-second audio clips and padded as required. The CNN was again built with Keras and TensorFlow as a backend. It is a sequential model that comprises four Conv2D convolution layers, out of which two are dense layers. A pooling layer of the MaxPooling2D type is linked to the final convolutional layer. The pooling layer reduces the number of parameters as well as the requirements for subsequent computation, which in turn reduces the dimensionality of the model. As a result, it shortens the duration of training and reduces overfitting. Max pooling takes the largest value in every window. The ReLU activation function was used for the convolutional layers; it is explained further in Appendix A.2. A dropout value of 50% is applied after the final convolutional layer. The output layer has 2 nodes, matching the number of possible classifications (positive and negative). Softmax is the activation function used in the output layer, explained further in Appendix A.1. Softmax transforms the results into probabilities, which is why it is widely used in machine learning models. Based on the highest probability, the model then classifies the cough as COVID-19 positive or negative.
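Two of the operations mentioned above lend themselves to a short NumPy illustration: cutting a recording into padded 6-second clips, and 2 × 2 max pooling. Both functions are hypothetical sketches; zero padding of the final clip and a pooling window of 2 × 2 are assumptions, as the text does not specify them.

```python
import numpy as np

def split_into_clips(audio, sr=22050, seconds=6.0):
    """Cut audio into fixed-length clips, zero-padding the last one."""
    clip_len = int(sr * seconds)
    n_clips = max(1, -(-len(audio) // clip_len))   # ceiling division
    padded = np.pad(audio, (0, n_clips * clip_len - len(audio)))
    return padded.reshape(n_clips, clip_len)

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2 x 2 window."""
    rows, cols = feature_map.shape
    rows, cols = rows - rows % 2, cols - cols % 2  # drop an odd trailing row/column
    trimmed = feature_map[:rows, :cols]
    return trimmed.reshape(rows // 2, 2, cols // 2, 2).max(axis=(1, 3))
```

A 10-second recording at 22.05 kHz yields two clips of 132,300 samples each, the second of which ends in silence. Pooling halves both dimensions of a feature map, which is where the reduction in parameters and computation comes from.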
4.3 Recurrent Neural Networks with Long Short-Term Memory
A recurrent neural network (RNN) is a category of neural network designed for sequential data. Building on feedforward networks, RNNs loosely resemble the way the human brain processes sequences, and they are among the most effective algorithms for making predictions on sequential data. The model used here is sequential and consists of two LSTM layers followed by four time-distributed layers. Each LSTM layer consisted of 128 nodes. After the final LSTM layer, a dropout value of 50% was used. The four time-distributed layers are of the dense type, with 64, 32, 16, and 8 nodes respectively and ReLU (Rectified Linear Activation) as the activation function; it is explained further in Appendix A.2. The output layer has 2 nodes, matching the number of possible classifications (positive and negative). Softmax is the activation function used in the output layer, explained further in Appendix A.1. Softmax transforms the results into probabilities, which is why it is widely used in machine learning models. Based on the highest probability, the model then classifies the cough as COVID-19 positive or negative.
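For readers unfamiliar with LSTMs, the sketch below shows the standard update for a single LSTM layer of 128 units applied step by step to a sequence. It is a generic textbook formulation in NumPy, not the Keras layers used in the study; the input dimension of 40, the sequence length of 10, and the random weights are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (input_dim, 4*units), U: (units, 4*units), b: (4*units,)."""
    z = x_t @ W + h_prev @ U + b
    i, f, g, o = np.split(z, 4)            # input, forget, candidate, output parts
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c_t = f * c_prev + i * g               # update the long-term cell state
    h_t = o * np.tanh(c_t)                 # expose a gated view as the hidden state
    return h_t, c_t

units, input_dim, steps = 128, 40, 10      # input_dim and steps are assumed
rng = np.random.default_rng(0)
W = rng.standard_normal((input_dim, 4 * units)) * 0.1
U = rng.standard_normal((units, 4 * units)) * 0.1
b = np.zeros(4 * units)

h, c = np.zeros(units), np.zeros(units)
sequence = rng.standard_normal((steps, input_dim))   # e.g. one MFCC frame per step
outputs = []
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, W, U, b)
    outputs.append(h)
outputs = np.stack(outputs)                # shape (steps, units)
```

A time-distributed dense layer then applies the same dense transformation to each of the `steps` rows of `outputs` independently.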
4.4 Support Vector Machines
Support vector machines, also known as SVMs, are data mining techniques used for both classification and prediction. An SVM is able to separate two different classes. After being provided with a set of labelled training data for every category, the SVM model can classify new samples according to the hyperplane that distinguishes the two classes. After the features were extracted from the VGG-19 flatten layer as explained in Sect. 3.4, a 70–30 train-test split was performed and the data were then fed into a LinearSVM for classification.
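The idea of a separating hyperplane and the 70–30 split can be illustrated with a small NumPy sketch. It trains a linear SVM by batch subgradient descent on the hinge loss, which is a simple stand-in for a library solver rather than the study's implementation. The two-dimensional toy data replaces the VGG-19 features, and the function names, learning rate, and number of epochs are all assumptions.

```python
import numpy as np

def train_test_split(x, y, test_fraction=0.3, rng=None):
    """Shuffle the samples and hold out test_fraction of them."""
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(len(x))
    n_test = int(round(len(x) * test_fraction))
    test, train = order[:n_test], order[n_test:]
    return x[train], x[test], y[train], y[test]

def fit_linear_svm(x, y, c=1.0, lr=0.01, epochs=200):
    """Minimize 0.5*||w||^2 + c*mean(hinge loss); labels must be -1 or +1."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(epochs):
        margins = y * (x @ w + b)
        active = margins < 1                      # samples violating the margin
        grad_w = w - c * (y[active][:, None] * x[active]).sum(axis=0) / len(x)
        grad_b = -c * y[active].sum() / len(x)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(x, w, b):
    """Classify by the side of the hyperplane w.x + b = 0 a sample falls on."""
    return np.where(x @ w + b >= 0, 1, -1)

# Toy data: two well-separated clusters standing in for extracted features.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(2.0, 0.5, size=(100, 2)),
               rng.normal(-2.0, 0.5, size=(100, 2))])
y = np.array([1] * 100 + [-1] * 100)

x_train, x_test, y_train, y_test = train_test_split(x, y, 0.3, rng)
w, b = fit_linear_svm(x_train, y_train)
test_accuracy = np.mean(predict(x_test, w, b) == y_test)
```

On real features, a library implementation such as scikit-learn's `LinearSVC` would normally be preferred over a hand-written optimizer.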
5 Results
The models were expected to generalize well and to produce the appropriate category label for previously unseen data. The effectiveness of each classification model was assessed based on the number of correct and incorrect predictions it made on the unseen data. Accuracy, precision, and recall were the three evaluation metrics used to assess the predictions made by the machine learning models developed in this research.
5.1 Accuracy
Accuracy is the proportion of all predictions that are correct. It can be computed from the confusion matrix using the equation below (Table 2).
\[\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}\]
where \(TP\), \(TN\), \(FP\), and \(FN\) denote the numbers of true positives, true negatives, false positives, and false negatives respectively.
Figure 5 shows that the Multilayer Perceptron and the Convolutional Neural Network performed better than the rest of the models, with overall accuracies of 96% and 86% respectively. The SVM performed reasonably well with 81% accuracy, whereas the Recurrent Neural Network was not able to generalize well and had an accuracy of only 68%.
5.2 Precision
In pattern recognition, information retrieval, and classification (machine learning), precision is the fraction of relevant instances among the retrieved instances. Precision is also known as the positive predictive value. In this study, it is the proportion of patients who actually had COVID-19 among all patients the model identified as positive. It was computed using the equation given below.
\[\text{Precision} = \frac{TP}{TP + FP}\]
The precision achieved by each model for the negative and positive classes in this study is recorded in Table 3.
Higher precision corresponds to fewer false positives. Figure 6 shows that the Multilayer Perceptron and Convolutional Neural Networks have lower false-positive rates and are able to classify COVID-positive patients very well, with a precision of 93% and 87% respectively. The RNN has a higher false-positive rate and is prone to false alarms. All the models have a lower false-negative rate and are able to classify non-COVID patients very well.
5.3 Recall
Recall measures how well the model identifies true positives. It is also known as the sensitivity of the model. Thus, among all patients who actually have COVID-19, recall tells us how many the model correctly identified as COVID-19 positive. It can be computed using the following equation:
\[\text{Recall} = \frac{TP}{TP + FN}\]
The recall achieved by each model for the negative and positive classes in this study is recorded in Table 4.
Higher recall corresponds to higher true positive rates. Figure 7 shows that Convolutional Neural Networks and Support Vector Machines have higher true positive rates for the positive class: the CNN and SVM correctly identify 90% and 87% of all positive cases respectively. The Multilayer Perceptron and Convolutional Neural Networks have a higher specificity. The RNN can identify only 79% of all positive cases and 61% of all negative cases.
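The three metrics used in this section, together with specificity, follow directly from the four confusion-matrix counts. The sketch below computes them in plain Python. The function name is hypothetical, and the counts in the usage example are made up for illustration; they are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and specificity from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a class has no samples.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),    # how many flagged positives are real
        "recall": ratio(tp, tp + fn),       # how many real positives are found
        "specificity": ratio(tn, tn + fp),  # how many real negatives are found
    }

# Hypothetical counts, for illustration only.
example = classification_metrics(tp=45, fp=5, tn=40, fn=10)
```

With these counts, accuracy is 0.85 and precision is 0.90, while recall is lower at about 0.82, which shows how a model can look reliable on its positive predictions yet still miss some infected patients.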
6 Conclusion
The Trace, Test, and Treat strategy has shown that governments must be able to effectively track the spread of the disease and isolate infected people, which helps to flatten the curve of infection. However, most countries are not able to perform enough rapid tests, which is why the proposed alternative can be very helpful. This paper presents an ML model for the initial diagnosis of COVID-19 from cough samples. The models used in this study were analyzed on the basis of performance evaluation metrics. This analysis revealed that the Multi-Layer Perceptron performed best, with an accuracy of 96%. Convolutional Neural Networks and Support Vector Machines, on the other hand, performed fairly well in terms of accuracy. A model with high precision but low recall gives very reliable positive predictions, but it misses a large number of hard-to-classify instances, which cannot be ignored in COVID-19 diagnosis. Thus, there is a need for models that achieve high precision and high recall at the same time for improved generalized classification.
The results show that the precision and recall of the Multi-Layer Perceptron and the Convolutional Neural Network were somewhat comparable. On the other hand, the Recurrent Neural Network with Long Short-Term Memory was not able to generalize well on COVID-19 cough samples, owing to higher false-positive rates and lower true positive rates.
Overall, the Multi-Layer Perceptron was able to generalize well with a higher specificity, ensuring few false alarms. These results suggest that AI can be used in the clinic and at home as a support system for physicians and the general public in the early detection of COVID-19. It may play an important role in medical diagnosis. This achievement supports extensive testing for COVID-19 even in areas where health facilities are not readily available. As a result, it helps to reduce the burden on paramedical staff.
References
1. Laboratory testing for coronavirus disease (COVID-19) in suspected human cases. https://www.who.int/publications-detail/laboratory-testing-for-2019-novel-coronavirus-in-suspected-human-cases-20200117. Accessed 3 June 2021
2. Guo, G., et al.: New insights of emerging SARS-CoV-2: epidemiology, etiology, clinical features, clinical treatment, and prevention. J. Front Cell Dev. Biol. 8, 410 (2020). https://doi.org/10.3389/fcell.2020.00410
3. Yu, I.T., et al.: Evidence of airborne transmission of the severe acute respiratory syndrome virus. J. N. Engl. J. Med. 350(17), 1731–1739 (2004). https://doi.org/10.1056/NEJMoa032867
4. Yang, R., Gui, X., Xiong, Y.: Patients with respiratory symptoms are at greater risk of COVID-19 transmission. J. Respir. Med. 165, 105935 (2020). https://doi.org/10.1016/j.rmed.2020.105935
5. Kameswari, S., Brundha, M.P., Ezhilarasan, D.: Advantages and disadvantages of RT-PCR in COVID 19. Eur. J. Mol. Clin. Med. 7(1), 1174–1181 (2020). https://doi.org/10.1016/j.rmed.2020.105935
6. Brown, C., et al.: Exploring automatic diagnosis of covid-19 from crowdsourced respiratory sound data. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 3474–3484 (2020). https://arxiv.org/pdf/2006.05919.pdf
7. Imran, A., et al.: AI4COVID-19: AI enabled preliminary diagnosis for COVID-19 from cough samples via an app. J. Inf. Med. Unlocked 20, 100378 (2020). https://doi.org/10.1016/j.imu.2020.100378
8. Van Doremalen, N., et al.: Aerosol and surface stability of SARS-CoV-2 as compared with SARS-CoV-1. J. N. Engl. J. Med. 382(16), 1564–1567 (2020). https://doi.org/10.1056/NEJMc2004973
9. Delayed RT-PCR reports triggering Covid surge, high transmission rate in Lucknow. The Times of India. https://timesofindia.indiatimes.com/city/lucknow/delayed-rt-pcr-reports-triggering-covid-surge-high-transmission-rate/articleshow/82265365.cms. Accessed 2 June 2021
10. Gujarat: Why RT-PCR test reports ‘delayed by 5–7 days’; AG says many undergo tests unnecessarily. The Indian Express. https://indianexpress.com/article/cities/ahmedabad/gujarat-why-rt-pcr-test-reports-delayed-by-5-7-days-ag-says-many-undergo-tests-unnecessarily-7270655. Accessed 2 June 2021
11. Advisory for COVID-19 testing during the second wave of the pandemic. ICMR official advisory. https://www.icmr.gov.in/cteststrat.html. Accessed 2 June 2021
12. Ozsahin, I., Sekeroglu, B., Musa, S.M., Mubarak, T.M., Uzun Ozsahin, D.: Review on diagnosis of COVID-19 from chest CT Images using artificial intelligence. J. Comput. Math. Methods Med. 10 (2020). https://doi.org/10.1155/2020/9756518
13. Fang, Y., Zhang, H., Xie, J.: Sensitivity of chest CT for COVID-19: comparison to RT-PCR. J. Radiol. 296, 115–117 (2020). https://doi.org/10.1148/radiol.2020200432
14. Adams, H.J., Kwee, T.C., Kwee, R.M.: COVID-19 and chest CT do not put the sensitivity value in the isolation room and look beyond the numbers. Radiology 297(1), E236–E237 (2020). https://doi.org/10.1148/radiol.2020201709
15. Bales, C., et al.: Can machine learning be used to recognize and diagnose coughs? In: 2020 International Conference on e-Health and Bioengineering (EHB), 29, pp. 1–4 (2020). https://doi.org/10.1109/EHB50910.2020.9280115
16. Amrulloh, Y., Abeyratne, U., Swarnkar, V., Triasih, R.: Cough sound analysis for pneumonia and asthma classification in the pediatric population. In: IEEE 6th International Conference on Intelligent Systems, Modelling, and Simulation, pp. 127–131 (2015). https://doi.org/10.1109/ISMS.2015.41
17. Infante, C., Chamberlain, D., Fletcher, R., Thorat, Y., Kodgule, R.: Use of cough sounds for diagnosis and screening of pulmonary disease. In: IEEE Global Humanitarian Technology Conference, GHTC, pp. 1–10 (2015). https://doi.org/10.1109/EHB50910.2020.9280115
18. Waltz, E.: How do coronavirus tests work? IEEE Spectr. https://spectrum.ieee.org/the-human-os/biomedical/diagnostics/how-do-coronavirus-tests-work. Accessed 2 June 2021
19. Hirschberg, J., Szende, T.: Pathological cry, stridor and cough in infants. Akadémiai Kiadó, Budapest (1983). PMCID: PMC1627937
20. Maryam, Z., Fazel, Z.M.H., Mostafa, M.: Application of intelligent systems in asthma disease: designing a fuzzy rule-based system for evaluating level of asthma exacerbation. J. J. Med. Syst. 36, 2071–2083 (2012). https://doi.org/10.1007/s10916-011-9671-8
21. Laguarta, J., Hueto, F., Subirana, B.: COVID-19 artificial intelligence diagnosis using only cough recordings. IEEE Open J. Eng. Med. Biol. 1, 275–281 (2020). https://doi.org/10.1109/OJEMB.2020.3026928
22. Khanzada, A., Wilson, T.: Virufy COVID-19 open cough dataset, Github (2020). Accessed 2 Feb 2021
23. Nair, P.: The dummy’s guide to MFCC, Medium (2018). https://medium.com/prathena/the-dummys-guide-to-mfcc-aceab2450fd. Accessed 5 June 2021
24. Hewage, R.: Extract features, visualize filters and feature maps in VGG16 and VGG19 CNN models, Towards Data Science (2020). https://towardsdatascience.com/extract-features-visualize-filters-and-feature-maps-in-vgg16-and-vgg19-cnn-models-d2da6333edd0. Accessed 5 June 2021
25. Chang, C., Doran, B.: Urban sound classification: with random forest SVM DNN RNN and CNN classifiers. In: CSCI E-81 Machine Learning and Data Mining Final Project Fall 2016, Harvard University Cambridge (2016)
A Appendix
A.1 Softmax
Softmax is a mathematical function that converts a vector of numbers into a vector of probabilities, where the probability of each value is proportional to the relative scale of that value in the vector.
\[\sigma (x)_{i} = \frac{\exp (x_{i})}{\sum _{j=1}^{K} \exp (x_{j})}\]
where \(\exp (x_{i})\) is the standard exponential function applied to the i-th element of the input vector, K is the number of classes in the multi-class classifier, and the sum of \(\exp (x_{j})\) over all K elements normalizes the output so that the probabilities add up to one.
A.2 ReLU
The rectified linear activation function, or ReLU for short, is a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. It has become the default activation function for many types of neural networks because a model that uses it is easier to train and often achieves better performance.
\[f(x) = \max (0, x)\]
where x is the input to a neuron.
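Both activation functions described in this appendix fit in a few lines of plain Python. The sketch below is illustrative only; subtracting the maximum before exponentiating is a standard numerical-stability trick that does not change the result.

```python
import math

def softmax(values):
    """Convert a list of scores into probabilities that sum to 1."""
    shift = max(values)                       # avoids overflow in exp()
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def relu(x):
    """Return x if it is positive, otherwise 0."""
    return max(0.0, x)
```

For example, `softmax([1.0, 2.0, 3.0])` gives approximately `[0.090, 0.245, 0.665]`, so the largest score receives the largest probability, and `relu(-2.0)` returns `0.0` while `relu(3.5)` returns `3.5`.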
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Pandhi, T., Kapoor, T., Gupta, B. (2022). An Improved Technique for Preliminary Diagnosis of COVID-19 via Cough Audio Analysis. In: Santosh, K., Hegadi, R., Pal, U. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2021. Communications in Computer and Information Science, vol 1576. Springer, Cham. https://doi.org/10.1007/978-3-031-07005-1_30
DOI: https://doi.org/10.1007/978-3-031-07005-1_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07004-4
Online ISBN: 978-3-031-07005-1
eBook Packages: Computer Science, Computer Science (R0)