1 Introduction

Coronavirus Disease 2019 (COVID-19) is an infectious disease caused by the coronavirus SARS-CoV-2 [1]. Early detection of COVID-19 enables rapid decision-making and therefore plays an important role in reducing the spread of the disease and the damage it causes [2].

The RT-PCR test used to diagnose COVID-19 is costly, and obtaining the result takes a long time. Many studies also emphasize that the false-negative rate of this test is quite high; that is, the test frequently returns a negative result for people who carry the virus [3]. X-ray and Computed Tomography (CT) images are frequently used to support PCR tests in diagnosing COVID-19. However, the lung findings of diseases such as COVID-19, pneumonia and bacterial pneumonia are quite similar, so it is very difficult to distinguish between these diseases on X-ray images [4]. When chest X-ray images of COVID-19 patients are analyzed, COVID-19 findings peak 10–12 days after the onset of symptoms. CT images are still inadequate for identifying and differentiating specific viruses [5]. X-rays can also be harmful because they are classified as ionizing radiation [6, 7]. Furthermore, X-ray exposure is harmful to pregnant and lactating women and to children.

In this study, COVID-19 is detected using audio signals, which are low-cost, easy to use and harmless to human health. Most studies show that COVID-19 severely affects breathing and the voice and causes symptoms that make patients' voices distinctive [8, 9]. A person's audio recordings can therefore be analyzed to detect COVID-19. Audio signals are well suited to COVID-19 diagnosis because they are easy to obtain, can be analyzed quickly and effectively, and are cost-effective. Differentiating COVID-19 from other, less dangerous upper respiratory tract diseases is an important factor during the pandemic [10]. Coughing is among the most common and most important symptoms of COVID-19. However, cough sounds are associated with many diseases [11]. Differentiating COVID-19 from other symptomatic diseases is therefore important for accurate diagnosis and correct treatment. Although the symptoms of COVID-19 can resemble those of other diseases, COVID-19 is highly contagious and spreads quickly, so diagnosing it correctly is important for reducing the risk of transmission.

In this study, Mel frequency-based features were extracted from cough sounds. In the first stage, the features were classified into three classes using ML methods. In the second stage, the features were classified into the same three classes with a DL architecture, and different hyperparameters were tested, to determine whether an individual is healthy, has COVID-19 or is symptomatic. The contributions of this work are as follows:

  • A three-class detection and classification system was developed that automatically detects COVID-19 from cough sounds.

  • The hyperparameters of different ML and DL models were optimized for the audio-based system.

2 Related Works

Dash et al. found that the cough sounds used for COVID-19 detection have different frequencies than speech sounds. To improve detection efficiency, they developed a COVID-19 Coefficient feature by optimizing the frequency range and frequency conversion scale of the MFCC method. They used the Coswara and Crowdsourced Respiratory Sound Data datasets and tested various feature groups with a simple SVM-based classifier [12]. Grant et al. used breath, cough and speech sounds for COVID-19 detection. They extracted MFCC, delta-MFCC and RASTA-PLP features from the signal and reported an AUC of 79.38% [13]. Despotovic et al. used a dataset of breath, cough and voice recordings collected from COVID-19 infected and uninfected people. A total of 1103 recordings from their dataset were used in the study, 84 COVID-19 and 1019 healthy. They extracted the Geneva Minimalistic Acoustic Parameter Set (GeMAPS), extended GeMAPS and ComParE feature sets from the signal. In the classification phase, they used Random Forest, VGGish and OpenL3 models and achieved a performance of 88.52% [8]. Coppock et al. detected symptomatic and asymptomatic COVID-19 cases using breath and cough recordings. Their CNN model was tested on 355 samples and achieved an AUC of 84.6% [14]. Chaudhari et al. detected COVID-19 in crowd-sourced cough recordings collected on smartphones from various regions of the world. They fed MFCC and Mel-spectrogram features to a CNN model they developed. A total of 2883 recordings from the Coswara and Coughvid datasets were used, 539 COVID-19 and 2344 healthy, and the model reached a ROC-AUC of 77.1% [15]. Tena et al. extracted Autoencoder time-frequency features from cough sounds to detect COVID-19. They used various machine learning algorithms (RF, SVM, LR, NB, LDA) in the classification phase.
The Coswara, Virufy and Pertussis datasets, totaling 813 recordings, were used in the study, and the best accuracy, 90%, was obtained with the RF algorithm [16]. Han et al. generated a total of 828 audio samples from 343 patients, using the patients' audio signals and symptoms as input. The extracted features were Zero Crossing Rate (ZCR), Root Mean Square (RMS) frame energy, pitch frequency (F0), Harmonics-to-Noise Ratio and MFCCs. They used an SVM in the classification stage and obtained an AUC of 79%, a sensitivity of 68% and a specificity of 82% [17]. Alsabek et al. compared the success of cough, respiration and speech sounds in COVID-19 detection. The study examined the importance of speech signal processing in the MFCC extraction of COVID-19 and non-COVID-19 samples, and the relationship between these samples, using Pearson correlation coefficients (PCCs). A total of 42 recordings were used, 3 recordings from each of 7 COVID-19 patients and 3 recordings from each of 7 negative patients. CNN and LSTM algorithms were also used in the study [18]. Çelik et al. proposed a DL-based method, CovidCoughNet, for COVID-19 detection using cough, respiration and voice signals. They used Chroma features, RMS energy, spectral centroid, spectral bandwidth, spectral rolloff, ZCR and MFCC for feature extraction, and the InceptionFireNet algorithm in the classification stage. The method was evaluated on two datasets: on the COUGHVID dataset, the AUC was 98.44% for classification into healthy, COVID-19 and symptomatic, and on the voice recordings of the Coswara dataset, the AUC was 99.24% [19]. Ulukaya et al. proposed a deep neural network-based model that detects coronavirus from cough sounds only, is fast, can be used remotely and has no harmful side effects. The proposed model, named MSCCov19Net, takes MFCC, Spectrogram and Chromagram features as input.
The system was trained on the Coswara, Coughvid, Virufy and NoCoCoDa datasets and tested on two unseen (used only for testing) clinical and non-clinical datasets. The performance of MSCCov19Net was compared with various CNN architectures. The proposed system achieved 61.5% accuracy on Virufy and 90.4% accuracy on NoCoCoDa [20]. Kranthi Kumar et al. proposed a lightweight CNN with Modified-MFCC, using different parameters, to classify COVID-19 and other respiratory disease symptoms in voice recordings. They used the CU dataset in their study. With an accuracy of 93.65%, the proposed model outperformed traditional feature extraction models and existing DL models by 4% to 10% in COVID-19/SARS-CoV-2 classification [21].

3 Methods and Database

3.1. MFCCs:

MFCCs describe a signal in terms of how the human ear perceives changes in frequency. The human ear is generally more sensitive to lower frequencies, and MFCCs represent this perception [22]. MFCC is a widely used method for representing audio signals as feature vectors and was developed to mimic human sound perception. MFCC-based features have been proposed to minimize the difference between the perceptual space and the feature space and so avoid distortion [12, 23].
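
As an illustration of the perceptual scale underlying MFCCs, the following minimal Python sketch converts between hertz and mels and places filter centres evenly on the mel scale. It uses the common conversion mel = 2595 log10(1 + f/700), which is one of several conventions in use. The function names, the filter count and the 0 to 8000 Hz range are illustrative assumptions, not settings reported in this study.

```python
import math
from typing import List


def hz_to_mel(hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595*log10 convention)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> List[float]:
    """Centre frequencies (Hz) of filters spaced evenly on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]


# Assumed example values: 10 filters between 0 Hz and 8000 Hz.
centers = mel_center_frequencies(n_filters=10, f_min=0.0, f_max=8000.0)
# Spacing between neighbouring centres, in Hz.
gaps = [b - a for a, b in zip(centers, centers[1:])]
```

The gaps between neighbouring centres grow with frequency, so the filters are densest at low frequencies, where the ear resolves frequency changes best.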

3.2. SVM:

SVM is a classification method used in machine learning. It uses the attributes in the dataset to build a classification or regression model. An SVM generalizes well while taking errors into account [24]. Its advantages are that it works well on multidimensional data, reduces overfitting, is less sensitive to outlying values and noise in the dataset, and can separate datasets that are not linearly separable by transforming them into other spaces. The kernel options of the SVM algorithm are linear, Radial Basis Function (RBF), sigmoid and precomputed kernels; the kernel determines how the data points are separated in space and the shape of the decision boundary.
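
The kernels named above can be written out directly. The sketch below gives the usual textbook definitions of the linear, RBF and sigmoid kernels and builds the pairwise kernel matrix that a precomputed kernel would supply. The values of `gamma` and `coef0` and the toy feature vectors are assumptions chosen for illustration; the study does not report its kernel parameters.

```python
import math
from typing import Callable, List, Sequence


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 0.5) -> float:
    """exp(-gamma * ||x - y||^2); gamma = 0.5 is an assumed value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x: Sequence[float], y: Sequence[float],
                   gamma: float = 0.5, coef0: float = 0.0) -> float:
    """tanh(gamma * <x, y> + coef0); both parameters are assumed values."""
    return math.tanh(gamma * linear_kernel(x, y) + coef0)


def gram_matrix(samples: Sequence[Sequence[float]],
                kernel: Callable[..., float]) -> List[List[float]]:
    """Pairwise kernel values, i.e. the matrix a precomputed kernel provides."""
    return [[kernel(a, b) for b in samples] for a in samples]


# Toy feature vectors (hypothetical, not taken from COUGHVID).
samples = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
K = gram_matrix(samples, rbf_kernel)
```

The RBF matrix has ones on its diagonal and is symmetric, and its off-diagonal entries shrink as the samples move apart.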

3.3. KNN:

KNN is an algorithm used in machine learning and data mining for solving classification and regression problems [11]. KNN positions data points in a space and uses the labels of a new data point's nearest neighbors to predict its class or value. The algorithm can use different distance metrics to measure the difference between two samples [25].
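
A minimal sketch of the nearest-neighbor vote with interchangeable distance metrics is given below. The Minkowski distance with order `p` covers the Manhattan (`p = 1`) and Euclidean (`p = 2`) cases. The toy points, the labels attached to them and `k = 3` are assumptions for illustration only.

```python
from collections import Counter
from typing import List, Sequence


def minkowski(x: Sequence[float], y: Sequence[float], p: float) -> float:
    """Minkowski distance of order p between two feature vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def manhattan(x: Sequence[float], y: Sequence[float]) -> float:
    """Minkowski distance with p = 1."""
    return minkowski(x, y, 1)


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Minkowski distance with p = 2."""
    return minkowski(x, y, 2)


def knn_predict(train_x: List[Sequence[float]], train_y: List[str],
                query: Sequence[float], k: int = 3, p: float = 2) -> str:
    """Return the majority label among the k training points closest to query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: minkowski(pair[0], query, p))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional feature vectors and labels.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["healthy", "healthy", "healthy", "COVID-19", "COVID-19", "COVID-19"]
near_origin = knn_predict(train_x, train_y, [0.5, 0.5], k=3, p=1)
near_far_cluster = knn_predict(train_x, train_y, [5.5, 5.5], k=3, p=2)
```

Because the Minkowski distance with `p = 2` is the Euclidean distance, the two metrics give identical neighbors whenever that order is used.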

3.4. Logistic Regression:

Logistic Regression is a statistical method and machine learning algorithm for solving problems with two or more categorical classes. It estimates the probability of each class label, and the value of the dependent variable represents the class label. Logistic Regression calculates probabilities using the sigmoid function [26], so its outputs lie in the interval [0, 1]. Multiple logistic regression is used in problems with more than one independent variable.
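
The probability mapping can be sketched as follows. The sigmoid handles the two-class case; for three classes, the softmax function shown here is the usual multi-class generalization, and its use is an assumption, since the text names only the sigmoid. The example scores are made up.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1), avoiding overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn one score per class into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum keeps exp() stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical linear scores for (healthy, COVID-19, symptomatic).
class_probabilities = softmax([2.0, 1.0, 0.1])
predicted_index = class_probabilities.index(max(class_probabilities))
```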

3.5. Decision Trees:

Decision trees have been among the first statistical algorithms used for general-purpose prediction and classification since the late 20th century [27]. A decision tree analyzes a training set for which the class labels are known and can then classify previously unseen examples. With decision trees, high accuracy can be achieved on very high-quality data. A decision tree classifies or evaluates data samples using a set of decision rules or questions [28]. Its advantages are that it is easy to understand, provides high performance and produces interpretable results.
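
One way such a decision rule can be learned is sketched below: every candidate threshold on a single feature is scored by the weighted Gini impurity of the two groups it produces, and the purest split is kept. Gini impurity is an assumed criterion, since the text does not state which one was used, and the feature values and labels are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def gini(labels: Sequence[str]) -> float:
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values: List[float],
                   labels: List[str]) -> Tuple[Optional[float], float]:
    """Return (threshold, weighted impurity) of the best rule 'value <= threshold'."""
    best: Tuple[Optional[float], float] = (None, float("inf"))
    candidates = sorted(set(values))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2.0
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical single feature with two well-separated classes.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["healthy", "healthy", "healthy", "COVID-19", "COVID-19", "COVID-19"]
threshold, impurity = best_threshold(values, labels)
```

A full tree repeats this search recursively on each resulting group, which is what makes the final model readable as a sequence of questions.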

3.6. Random Forest:

RF is a classification algorithm widely used in ML. It combines multiple decision trees. It is a powerful nonparametric statistical method that allows regression problems, as well as two-class and multi-class classification problems, to be addressed in a single, versatile framework [29].
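
The combination step can be sketched in two parts: each tree is typically trained on a bootstrap sample of the training rows, and the forest returns the majority vote of the trees. The helper names, the seed and the example predictions are assumptions for illustration, and the trees themselves are omitted.

```python
import random
from collections import Counter
from typing import List, Sequence


def bootstrap_sample(rows: Sequence, rng: random.Random) -> List:
    """Draw len(rows) rows with replacement, as done for each tree."""
    return [rng.choice(rows) for _ in range(len(rows))]


def majority_vote(predictions: Sequence[str]) -> str:
    """Most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = list(range(10))  # stand-in for ten training rows
sample = bootstrap_sample(rows, rng)

# Hypothetical predictions of five trees for one cough recording.
tree_predictions = ["COVID-19", "healthy", "COVID-19", "symptomatic", "COVID-19"]
forest_prediction = majority_vote(tree_predictions)
```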

3.7. CNN:

CNN is an artificial neural network model frequently used in deep learning. The CNN structure is based on neurons organized in layers. The neurons in these layers are connected to each other through weights and biases [30]. Its layers include convolution layers, pooling layers and fully connected layers [31].
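
The two layer types that distinguish a CNN can be illustrated without any framework. The sketch below applies one convolution filter (computed as cross-correlation, as deep learning libraries do), a ReLU activation and 2x2 max pooling to a small grid. The input, the kernel and the pooling size are assumed toy values and do not reproduce the architecture used in this study.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(feature_map: Matrix) -> Matrix:
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map: Matrix, size: int = 2) -> Matrix:
    """Keep the maximum of each non-overlapping size x size window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [[max(feature_map[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in cols]
            for i in rows]


# Toy 5x5 input whose values increase along rows and columns.
image = [[float(5 * i + j) for j in range(5)] for i in range(5)]
kernel = [[-1.0, 0.0], [0.0, 1.0]]  # responds to the diagonal increase
feature_map = relu(conv2d_valid(image, kernel))
pooled = max_pool2d(feature_map, size=2)
```

In a full model, the pooled maps of the last convolution block are flattened and passed to the fully connected layers, which output one score per class.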

3.8. COUGHVID Database:

Detecting various diseases from coughs is becoming increasingly common, and coughing is one of the most common symptoms of COVID-19. The COUGHVID dataset is one of the most widely used datasets in this field, with more than 25000 cough recordings. In this dataset, recordings were labeled by expert physicians and made suitable for classification. Of the labeled data, 25% were labeled COVID-19, 35% symptomatic, 25% healthy and 15% condition-free. The COUGHVID dataset was collected through a web application deployed on a dedicated server at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland. During recording, information such as age, gender and geographical location is also collected from the user on request [32]. As such, COUGHVID is one of the largest and most reliable datasets.

4 Experimental Result

In this study, COVID-19 was detected on the COUGHVID audio dataset using ML and DL techniques. The results are analyzed comparatively in this section.

Fig. 1. Data distribution graph.

Figure 1 shows the data distribution graph and Fig. 2 shows a block diagram of the proposed method. The first step is data collection and preparation, in which the COUGHVID dataset was prepared for use in the study. Then, MFCC features were extracted from the data, which made the data more meaningful and easier to process. Next, models were trained using the SVM, KNN, Logistic Regression, Decision Tree, Random Forest and CNN algorithms. The trained models were used to classify the data into three classes: COVID-19, healthy and symptomatic. Finally, a performance analysis was carried out to evaluate the models.

Fig. 2. The flow diagram of the system.

4.1 COVID-19 Detection Using ML Techniques

In this study, COVID-19 was detected using five different machine learning algorithms. Pseudocode for the algorithms is given in Table 1.

Table 1. Pseudocodes of algorithms.

The most accurate model was KNN with Manhattan distance, with an accuracy of 78.56%. The KNN models built with Euclidean and Minkowski distances both reached 78.36% accuracy. For the SVM algorithm, three different kernels were tested on the COUGHVID dataset: the sigmoid kernel reached 70.69% accuracy, while the RBF and linear kernels both reached 77.04%. Logistic Regression matched the RBF and linear SVM models with an accuracy of 77.04%. The RF model achieved 76.88% accuracy. Decision Trees had the lowest accuracy of all models, 60.33%. The results of the ML algorithms are given in Table 2.
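
The accuracy values reported here are the fraction of test recordings whose predicted class equals the labeled class. A minimal sketch of that computation, together with a three-class confusion matrix, is given below. The six labels and predictions are invented for illustration and are not results from this study.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose prediction matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     classes: Sequence[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


classes = ["COVID-19", "healthy", "symptomatic"]
# Invented labels and predictions for six recordings.
y_true = ["COVID-19", "healthy", "symptomatic", "healthy", "COVID-19", "symptomatic"]
y_pred = ["COVID-19", "healthy", "healthy", "healthy", "symptomatic", "symptomatic"]

acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, classes)
```

The diagonal of the confusion matrix counts the correct predictions per class, so it shows which classes are confused with each other, which a single accuracy figure hides.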

Table 2. The results of the machine learning algorithms.

4.2 COVID-19 Detection Using DL Techniques

Three different CNN models with four layers were developed in the experimental studies. The models were built with layer filter numbers of “8,16”, “16,32” and “32,64”, respectively. Each model was trained with batch sizes of 4, 8 and 16, and each batch size was tested with 50 and 100 epochs. In all models, performance at 100 epochs was better than at 50 epochs. The highest performance, an accuracy of 77.04%, was obtained by the model with filter numbers “32,64”, a batch size of 8 and 100 epochs. The training, validation and test accuracies of the models are shown in Table 3. The accuracy and loss graphs of Model 3 with batch size 8 are shown in Fig. 3.
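
The search space described above can be enumerated as a small grid of three filter settings, three batch sizes and two epoch counts. The sketch below only lists the configurations; the dictionary keys and the `build_grid` helper are hypothetical names, and model training is omitted.

```python
from itertools import product
from typing import Dict, List

# Values taken from the text above.
FILTER_PAIRS = [(8, 16), (16, 32), (32, 64)]
BATCH_SIZES = [4, 8, 16]
EPOCH_COUNTS = [50, 100]


def build_grid() -> List[Dict[str, object]]:
    """Every combination of filter pair, batch size and epoch count."""
    return [
        {"filters": filters, "batch_size": batch_size, "epochs": epochs}
        for filters, batch_size, epochs in product(FILTER_PAIRS, BATCH_SIZES, EPOCH_COUNTS)
    ]


grid = build_grid()
# Configuration reported as the best performer in the text.
best_reported = {"filters": (32, 64), "batch_size": 8, "epochs": 100}
```

The grid holds 3 x 3 x 2 = 18 configurations, one training run for each row.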

Table 3. Training, validation and test accuracies of the models.

Fig. 3. The Model 3 accuracy and loss graphs with batch size 8.

5 Discussion and Conclusion

This study was conducted to evaluate the usability of both traditional ML and DL techniques for COVID-19 detection. In the experimental studies, three different CNN models with four layers were developed and different features were extracted. To evaluate the performance of the models, tests were performed with different batch sizes and numbers of epochs. The highest performance, an accuracy of 77.04%, was obtained by the model with filter numbers “32,64”, a batch size of 8 and 100 epochs. In addition, SVM, RF, Decision Tree, Logistic Regression and KNN were used as machine learning models for the classification stage. The highest accuracy, 78.56%, was recorded with KNN using Manhattan distance, and Decision Trees had the lowest accuracy, 60.33%. The different models and model configurations were compared with one another and the effect of each configuration on performance was measured. This study demonstrates the potential of artificial intelligence-based methods for COVID-19 diagnosis on the COUGHVID dataset.