Introduction

Dopamine deficiency is the most significant cause of Parkinson’s disease (PD) [1]. Dopamine, a chemical that transmits information between the brain regions controlling body movements, is produced by a subset of cells found in a specific part of the human brain [1]. In short, dopamine enables people to perform their movements fluently and harmoniously [1]. In humans, the number of cells that produce this chemical starts to decrease in later years. When this loss reaches 60 to 80%, dopamine cannot be produced in sufficient quantity and motor disorders, one of the symptoms of PD, occur [1]. Symptoms of the disease are more prominent in people between 40 and 70 years of age and mostly appear in the 60s [1]. The incidence of this disease is higher in males than in females, and it is accepted that one in every 100 males over the age of 65 in the community suffers from PD [1]. The first symptoms indicating a possible diagnosis of Parkinson’s disease are slow movements and, in addition, tremors during periods of rest [1]. Symptoms such as slow movements, mask-like expressions, cramped handwriting, tremors, muscle contraction, postural, gait, speech, and smelling disorders, kyphosis, a feeling of discomfort, restless leg syndrome, and forgetfulness are observed in this disease [2]. A previous study emphasized that early diagnosis of PD is possible with a simple blood test that can reveal the risk of developing the disease before the associated symptoms occur, so that the necessary medical measures can be taken accordingly [3].

In the field of engineering, voice or walking recordings are mainly used for automatic detection of PD [4]. A speech disorder is an early distinctive symptom of this disease and develops in the majority of people with PD (about 90%) [5, 6]. For this reason, sound-related features have been widely used in systems for automatic recognition of PD. The purpose of these studies was to automatically differentiate patients from healthy individuals by using the relevant audio features [7]. For instance, Sakar and Kursun designed a tele-diagnosis system for automatic recognition of PD [7]. During the testing of this system, various attributes were extracted from the recordings of patients and healthy people and then evaluated with the Support Vector Machine (SVM) method [7]. Considering the characteristics of the data set on which they were working, the authors attempted to obtain the maximum classification accuracy with the smallest set of features [7]. In another example, the effectiveness of vocal characteristics in PD diagnosis was analyzed using machine learning techniques [8]. In these analyses, the highest classification accuracy, 96.4%, was obtained with an SVM [8]. After collecting various audio recordings of people with and without PD and extracting the necessary features, Sakar et al. presented these recordings to several classifiers and analyzed the results [9]. Braga et al. conducted research on various available data sets in which audio signals were processed for the automatic detection of PD [10]. In another study, an auto-diagnostic system using the fuzzy k-nearest neighbors (FKNN) classification method for PD diagnosis was presented [11]. In a study by Parisi et al., a new hybrid artificial intelligence system was presented for early diagnosis of PD [12]. In a similar study, SVM based on bacterial foraging optimization (BFO-SVM), a new hybrid diagnostic method, was developed for PD recognition [13].
In addition, the Random Forest-BFO-SVM structure, in which a feature selection stage was performed, was applied to the data and a result of 97.42% was obtained [13]. Another researcher used four different classification systems, namely Neural Networks (NN), DMneural, Regression, and Decision Tree, for effective detection of PD. The best result (92.2%) was obtained with the NN [14]. Lahmiri et al. tested various classification systems, such as linear discriminant analysis (LDA), k-nearest neighbors (KNN), naive Bayes (NB), regression trees (RT), radial basis function neural networks (RBFNN), SVM, and Mahalanobis distance, and found that the best result in the automatic detection of PD was obtained with SVM at a rate of 92% [15]. In another study on classification systems, a parallel feed-forward neural network architecture was presented for the same purpose [16]. It was emphasized that a 9-parallel neural network system works better by 8.4% compared to a single network structure [16]. Eskidere et al. tested SVM, Least Square SVM (LS-SVM), Multilayer Perceptron NN (MLPNN), and General Regression NN (GRNN) on an available data set for the purpose of Parkinson’s follow-up and concluded that LS-SVM gave the best result [17]. In another study, Benba et al. first applied the technique of mel frequency cepstral coefficients (MFCCs) to multi-type audio recordings taken from healthy and PD subjects [18]. They then presented the resulting data to the SVM classifier and, in evaluating the results, emphasized that recordings of the letter /u/ contain more discriminatory information than other types of audio signals [18]. The authors of [19] presented a cloud-based framework that achieved a classification performance of 96.6%.
In another study conducted in this area, an FKNN system based on a particle swarm optimization (PSO), named PSO-FKNN, was used to automatically diagnose PD, and an average of 97.47% accuracy was obtained [20]. When studies in this area are examined, data registration, processing, feature extraction, selection, and classification processes were carried out in almost all of them. In studies conducted in this field of engineering, the focus has been on automatic recognition of PD with high classification performance.

In this study, a combined Information Gain Algorithm-based K-Nearest Neighbors (IGKNN) approach is proposed for automatically diagnosing PD with high accuracy from the audio signals of an individual. For the presented system, the attributes extracted from previously collected audio recordings of 252 people [21] were used as the data set. These data were taken from the University of California Irvine (UCI) Machine Learning Repository. The selected data were separated into training and test sets using the stratified cross-validation (CV) method. The KNN classifier, which exhibits high performance on noisy data such as audio signals, was used in the automatic PD diagnostic system. The performance results obtained from the selected algorithm were evaluated with several statistical criteria. This study aimed to investigate the effect of the Information Gain approach, which has not previously been used in PD diagnosis in the literature. Also, an expert system such as the IGKNN approach, which can diagnose PD and achieve maximum performance with fewer features from the audio signals, has not previously been encountered. Moreover, the stratified CV method, which was used as the data segmentation method, is also regarded as an innovation for PD studies. Considering the low number of subjects used in the studies so far, another purpose of this study was to report in detail the success rate obtained from these 252 subjects.

Methods

Speech data set

Speech disorders, when used in the diagnosis of PD, are a symptom that can be recognized by an expert or even by the people around the patient in nearly 90% of cases. Because the brain’s signals controlling speech and the muscles producing speech are affected by this disease, the voice of PD patients is generally softer and more monotonous. For this reason, changes in speech are symptoms that families should notice quickly. When the muscles of the face are stiff or take longer to move, people have difficulty speaking, and words can be slurred or mumbled [22]. It is therefore possible to distinguish a PD patient from a healthy person at an early stage using an expert system, although changes in the audio signals are among the secondary symptoms for the medical diagnosis of PD. For this purpose, the attributes extracted with different methods from the speech recordings of 252 subjects (188 PD patients, 107 males and 81 females, and 64 healthy subjects, 23 males and 41 females) collected in the Department of Neurology at Cerrahpaşa Faculty of Medicine, Istanbul University, were taken from UCI [21]. The ages of the subjects ranged from 33 to 87. Accompanied by a specialist, the subjects were asked to say the letter /a/ three times, and the recordings were collected. The microphone used during recording was set to 44.1 kHz. As stated in Ref. [21], after information about the data collection process was provided, signed informed consent was obtained from all individual participants in accordance with the approval of the Clinical Research Ethics Committee, Bahçeşehir University, İstanbul, Turkey. More detailed information is provided by the creators of the data in [21].

Certain characteristics received from the audio signals of patients and healthy people in the diagnosis of PD facilitate the separation of these classes. For instance, the sample curves of standard deviation characteristics extracted from the 36 sub-bands obtained after applying the Tunable Q-factor Wavelet Transform (TQWT) method to the signals are shown in Fig. 1. In the figure, these features are distinctly separated from each other in specific sub-bands.

Fig. 1

Sample curves of speech signal attributes in healthy and PD subjects

Analysis of speech signals

Acoustic measurements of the speech of PD patients can be obtained easily, without disturbing the patient, under the supervision of a specialist physician [22]. In the studies in this area, many methods have been suggested for acoustic sound measurements. The most commonly used measurements are jitter, shimmer, and fundamental frequency irregularities that occur in patient syllables [23]. In addition to these measurements, the harmonic-to-noise ratio parameter, which can reveal hoarseness occurring over time, has also been presented as an effective tool [24]. Cepstral peak prominence measurements [25], linear prediction modeling [26, 27], auditory modeling [28], and Mel frequency cepstral coefficients [29] can also be used for acoustic sound measurements in PD patients.

The presence of noise in signal processing applications is one of the main factors affecting the system result. The noise in these signals can occur for many reasons, such as the environment, data transmission, and the subject’s own body actions. In speech signals, which have an important role in PD detection, irregular vibrations and breathiness often produce noise. It is very important to determine the noise source correctly in the first stage. Noise whose source cannot be detected directly will negatively affect the system performance. Most noise has more irregular, random, and high-frequency content than the fundamental signal frequency. However, classical signal processing techniques may be insufficient for detecting and eliminating such noise. Instead, adaptive, tunable signal processing techniques such as TQWT [30] can achieve a high level of noise cancellation. As with any algorithm, the performance of the system proposed in this study may be adversely affected by noise in the input data. However, thanks to the TQWT method used in [21], from which the data used in this study were obtained, the exposure of the proposed system to noise was minimized. Ref. [21] shows that the feature group obtained with TQWT gave better classification results than the feature groups obtained by other methods.

Attributes of the used data

In this study, the TQWT feature group previously obtained from the speech signals of 252 subjects by the authors of reference [21] was used. This feature group consists of sub-bands obtained from the TQWT process. Detailed information on this feature group is given in [21]. The TQWT method is a form of the discrete wavelet transform with three basic parameters: Q (Q-factor), j (the number of levels), and r (redundancy). Band-pass filters with different Q-factors can be generated, and the low and high frequency components of the signal are also separated using this method. The Q-factor, from which the method derives its name, is the ratio of the center frequency of a band-pass filter to its bandwidth. This factor can be adjusted according to the oscillatory behavior of the signal being processed, thus creating a non-linear separation. When the frequency distribution of the signal is examined, the frequency spectrum of suddenly changing finite signals widens for a low Q. The frequency spectra of oscillating signals are more localized for a high Q. In short, Q reflects the oscillatory behavior of the signals being processed. The parameter j is the number of levels; the decomposition yields j + 1 sub-bands, formed by the high-pass filter output of each level and the final low-pass filter output. The last parameter, r, controls the redundancy of the transform, that is, how strongly the frequency responses of adjacent band-pass filters overlap; as r increases, TQWT starts to resemble a continuous wavelet transform [30]. Decomposition stages for a single-level TQWT are given as an example in Fig. 2. In this figure, x(n), H0(w), H1(w), LPS, HPS, α, β, c0(n), and d1(n) represent the input signal, the frequency response of the low-pass filter, the frequency response of the high-pass filter, low-pass scaling, high-pass scaling, the low-pass scaling parameter, the high-pass scaling parameter, the low-pass sub-band signal, and the high-pass sub-band signal, respectively.

Fig. 2

Analysis steps for single level TQWT [21, 30]

In Fig. 3, the decomposition of the speech signals from PD and healthy subjects into sub-bands using TQWT is given. Samples of the signal in this figure were taken from Ref. [31]. Details about this dataset can be found in reference [31].

Fig. 3

Decomposition of the sample speech signals into 36 sub-bands using TQWT method

The performance of the TQWT algorithm is directly dependent on the Q, r, and j parameters. As stated in Ref. [21] from which the dataset used in this study was taken, a large number of trial and error experiments have been conducted to achieve high accuracy rates. In these experiments performed to determine the optimum values, the r parameter was chosen as 3, 4, and 5, respectively. The Q parameter was analyzed for values between 1 and 10. Finally, in order to determine the most appropriate number of levels, the j parameter was tested between 5 and 50 for different Q values. As a result of all these long-term processes, these parameters were determined as Q = 2, r = 4, and j = 35 for the best system performance according to Ref. [21].
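As a worked illustration of how these values relate to the scaling parameters α and β of Fig. 2, the relations below follow the standard TQWT formulation described in [30]. They are not restated in the data set description, so they should be read as an assumption about how Q and r map onto the filter bank rather than as a detail confirmed by [21]:

$$\beta = \frac{2}{Q + 1}, \qquad \alpha = 1 - \frac{\beta}{r}$$

With Q = 2 and r = 4, this gives β = 2/3 ≈ 0.667 and α = 1 − (2/3)/4 = 5/6 ≈ 0.833, and j = 35 levels yield j + 1 = 36 sub-bands, which matches the 36 sub-bands used throughout this study.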

Power spectral density (PSD), or power spectrum, analysis shows the power of a signal or time series, such as sound, as a distribution along the frequency axis. With this analysis, information about the noise components of a signal and the frequencies at which the signal is concentrated can be obtained. In this way, preliminary information about the signals can be obtained before patient and healthy data are classified. If the analyses are sufficiently sensitive, the highly distinctive features determined by the PSD method will contribute positively to the classification performance. In Fig. 4, the results of the 4th level TQWT application on the sample speech signals in Ref. [31] and the power spectra of the fifth sub-band, which has the highest energy, are given in order to obtain the PSD outputs.

Fig. 4

Power spectrum graphs and 4th level TQWT decomposition of the sample speech signals

When the sub-bands, their energy ratios, and power spectra obtained from the sample signals [31] in Fig. 4 are examined, it can be seen that patients and healthy subjects can be separated from each other. The sound signals of the healthy subject have a wider range of power values, while those from the patient occur over a more limited range. In addition, this separation will be understood more clearly when the most basic statistical calculations, such as the average, standard deviation, and maximum value of these power spectra are obtained.
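The kind of power spectrum discussed here can be sketched with a plain periodogram. The text does not state which PSD estimator was used, so the direct-DFT estimate below is an assumption chosen for clarity, and the function name `periodogram` is hypothetical:

```python
import cmath
import math

def periodogram(signal, fs):
    """One-sided periodogram estimate of the PSD via a direct DFT.

    Assumption: a plain periodogram; the estimator behind Fig. 4 is not stated.
    Returns (frequencies in Hz, power per Hz). O(N^2), for illustration only.
    """
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        p = abs(coeff) ** 2 / (fs * n)
        if 0 < k < n / 2:      # fold negative frequencies into the one-sided estimate
            p *= 2
        freqs.append(k * fs / n)
        power.append(p)
    return freqs, power
```

Simple statistics of the returned `power` list, such as its mean, standard deviation, and maximum, correspond to the basic statistical calculations mentioned above.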

TQWT attributes used for this study contain 12 × 36 parameters. In other words, a total of 432 parameters were obtained by extracting 12 attributes (energy, Shannon entropy, Log Energy entropy, mean Teager–Kaiser energy operator (TKEO), TKEO standard deviation (std), median, mean, std, minimum (min), maximum (max), skewness and kurtosis values) from 36 sub-bands reached as a result of applying the TQWT [21].

Entropy is a measure of the complexity of the data being studied. This criterion cannot be negative [32]. Shannon entropy (E) is defined by Formula 1 [32]:

$$E = - \sum\limits_{i = 1}^{N} {P_{i} } \log_{2} P_{i}$$
(1)

In this formula, Pi denotes the probability that the i-th data type occurs in the data set, and N is the number of distinct data types [32].
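A minimal sketch of Formula 1 is given below. It assumes the input is a sequence of discrete symbols (for example, histogram bin indices of a sub-band), because the way the probabilities are estimated from a continuous signal is not specified here; the function name is hypothetical:

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy in bits of a sequence of discrete symbols."""
    counts = Counter(values)
    total = len(values)
    entropy = 0.0
    for count in counts.values():
        p = count / total          # P_i: relative frequency of symbol i
        entropy -= p * math.log2(p)
    return entropy
```

A sequence with a single repeated symbol gives an entropy of 0, while two equally frequent symbols give 1 bit.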

The Log-Energy entropy (H) attribute formulation is stated below [33]:

$$H\left( x \right) = - \sum\limits_{i = 1}^{N - 1} {\left( {\log_{2} \left( {P_{i} \left( x \right)} \right)} \right)}^{2}$$
(2)

TKEO is a method used to monitor energy in audio signals [34,35,36,37,38]. Formula 3 shows a discrete TKEO formulation:

$$\psi \left[ {x\left( n \right)} \right] = x^{2} (n) - x(n - 1)\,x(n + 1)$$
(3)

In this formula, x is the signal and n is the sample index.
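The operator can be sketched in a few lines using the standard discrete form ψ[x(n)] = x²(n) − x(n − 1) x(n + 1), evaluated at interior samples only. The use of the population standard deviation for the TKEO std attribute is an assumption, since the exact definition used in [21] is not given here; the function names are hypothetical:

```python
from statistics import mean, pstdev

def tkeo(x):
    """Discrete Teager-Kaiser energy for interior samples n = 1 .. len(x) - 2."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def tkeo_mean_std(x):
    """Mean and (population) standard deviation of the TKEO output."""
    energy = tkeo(x)
    return mean(energy), pstdev(energy)
```

For a sampled sinusoid such as `[0, 1, 0, -1, 0]`, the operator returns a constant value at every interior sample, which is why it is used to track signal energy.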

Information gain algorithm-based KNN hybrid model (IGKNN)

Automatic analysis systems using artificial intelligence algorithms can easily diagnose diseases with high accuracy, similar to diagnosis made by a physician. These systems analyze the data entered by a selected classifier or a clustering algorithm. They also statistically demonstrate the accuracy of the result of the evaluation system. In this study, a new combined IGKNN approach was proposed for the features analysis process. The information gain (IG) algorithm and KNN classifier, which exhibits high performance against noisy data such as audio signals, were chosen for this model. Figure 5 shows the pseudo-code diagram of the IGKNN hybrid feature analysis system.

Fig. 5

The pseudo code diagram of IGKNN approach

The basis of the IGKNN method is the selection of the feature subset that yields the lowest error for the classifier used. The IG method was selected for this purpose. This method is often used in data mining and artificial intelligence. IG can be expressed as the reduction in entropy obtained by splitting the data according to a feature. This criterion, which takes a value between “0” and “1”, shows how much information is gained for classification from the given feature. A value close to “1” indicates that the related feature plays an active role in separating the classes [39, 40]. In order to calculate the IG criterion, the entropy over the class labels must be calculated. Entropy, a measure of uncertainty in the system, was calculated using formula 4:

$$H(T) = - \sum\nolimits_{i = 1}^{n} {P_{i} \log_{2} (P_{i} )}$$
(4)

Pi shows the probability that each class tag is contained in a data set with n class tags. Also, formula 5 was used to find IG ranging from “0” to “1”.

$$IG(x,T) = H(T) - \sum\nolimits_{i = 1}^{n} {\frac{{\left| {T_{i} } \right|}}{\left| T \right|}H(T_{i} )}$$
(5)

Here, T is the data set and x is the attribute being evaluated; Ti denotes the subset of T that takes the i-th value of x [39, 40]. Besides the feature elimination algorithm, the KNN algorithm [41] was selected as the classifier of the IGKNN hybrid system. In the KNN algorithm, the Euclidean, Manhattan, and Minkowski functions were tried for the distance calculations. The best performance was obtained with the Euclidean function. In addition, the k parameter, which is an input of the algorithm, was varied from “1” to “20”, and the best result was achieved with a value of “1”. The healthy and patient classes were labeled as 0 and 1, respectively. The data were divided into ten folds by the stratified cross-validation (CV) method. In this CV method, each fold contains approximately the same proportion of samples from each class.
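The two stages described above can be sketched as follows: information gain is computed as the class entropy H(T) minus the weighted entropy of the subsets induced by a feature, features with zero gain are dropped, and a query is labeled by its nearest training sample (Euclidean distance, k = 1). The discretization of continuous features into equal-width bins, the bin count, and all function names are assumptions made for illustration, since the discretization used in this study is not described:

```python
import math
from collections import Counter

def entropy(labels):
    """H(T) = -sum_i P_i log2 P_i over the class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(feature, labels, bins=4):
    """IG(x, T) = H(T) - sum_i |T_i|/|T| H(T_i).

    Assumption: the subsets T_i come from equal-width binning of the feature.
    """
    lo, hi = min(feature), max(feature)
    width = (hi - lo) / bins or 1.0          # guard against a constant feature
    groups = {}
    for value, label in zip(feature, labels):
        index = min(int((value - lo) / width), bins - 1)
        groups.setdefault(index, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def select_features(rows, labels, bins=4):
    """Indices of the columns whose information gain is greater than zero."""
    columns = list(zip(*rows))
    return [j for j, col in enumerate(columns)
            if information_gain(col, labels, bins) > 1e-12]

def predict_1nn(train_rows, train_labels, query, keep):
    """Label of the nearest training row (Euclidean distance, k = 1)."""
    def dist(row):
        return math.sqrt(sum((row[j] - query[j]) ** 2 for j in keep))
    nearest = min(range(len(train_rows)), key=lambda i: dist(train_rows[i]))
    return train_labels[nearest]
```

With two classes, H(T) is at most 1 bit, so the gain computed this way stays within the “0” to “1” range mentioned above.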

Big-O notation is used to express the computational complexity of the proposed combined IGKNN approach. This notation is often used in computer science to describe the worst-case behavior of an algorithm [42]. The computational complexity of the proposed analysis system is the same as that of the KNN algorithm used as its classifier. The computational complexity of KNN is proportional to the number and dimensionality of the training data. Accordingly, assuming p is the number of samples and m is the number of dimensions in the training data, the computational complexity of the proposed method is O(pm). A low value means that the algorithm responds quickly and takes up less memory space. Since the KNN classification algorithm is “instance-based” [42], it needs more memory. This negatively affects the computational complexity of the algorithm or the system and causes it to increase. If data with hundreds or thousands of dimensions are presented to the system, as in operations such as text classification, the computational complexity will increase. Consequently, poor classification performance will likely be achieved. Methods such as feature selection/elimination and dimension reduction are used to solve this problem, which is generally caused by the size of the data. In the system proposed in this study, IG was used as the feature analysis algorithm. In this way, each feature of the samples was analyzed directly. As a result, the computational complexity (O(pm)) of the proposed method can be reduced by reducing the value of m.
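As a rough worked example, take the figures reported in the Results: 756 recordings with tenfold CV, so each training set holds about 0.9 × 756 ≈ 680 samples, and the LEE sub-feature set is reduced from 36 to 22 sub-bands. Assuming a brute-force distance search, which is an assumption about the implementation, the number of coordinate comparisons per query is

$$p\,m = 680 \times 36 = 24{,}480 \quad \text{versus} \quad 680 \times 22 = 14{,}960$$

that is, about 39% fewer operations per classified sample after feature elimination.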

Statistical evaluation processes

In expert systems designed with artificial intelligence algorithms, it is necessary to form a confusion matrix to reveal the number of true and false labels obtained by automatic detection. Various statistical criteria can be calculated from the counts in this matrix. In this study, statistically valid results were obtained using several criteria: True Positive rate (TP rate), also known as sensitivity or recall; False Positive rate (FP rate); F measure (F); Matthews Correlation Coefficient (MCC); classification accuracy rate (ACC); Precision (Prec) [43,44,45,46]; and Cohen’s Kappa Coefficient (Kappa) [47]. Besides these statistical criteria, the values of the receiver operating characteristic and precision-recall curves (ROC and PRC, respectively) [48] were computed.
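The criteria listed above can be sketched from the four cells of a binary confusion matrix using their standard definitions. The counts in the example are hypothetical and are not taken from the tables of this study, and whether the reported values are per-class or class-weighted averages is not stated, so the per-class form below is an assumption:

```python
import math

def binary_metrics(tp, fn, fp, tn):
    """Criteria derived from a 2x2 confusion matrix (positive class = PD)."""
    total = tp + fn + fp + tn
    tp_rate = tp / (tp + fn)                 # sensitivity / recall
    fp_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    accuracy = (tp + tn) / total
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Cohen's kappa: observed agreement corrected for chance agreement.
    chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    kappa = (accuracy - chance) / (1 - chance)
    return {"TP rate": tp_rate, "FP rate": fp_rate, "Prec": precision,
            "F": f_measure, "MCC": mcc, "ACC": accuracy, "Kappa": kappa}

# Hypothetical counts, not taken from the tables of this study.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
```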

Results

For this study, the attributes obtained from the audio signals of 252 subjects (188 PD patients and 64 healthy people) [21] were used. Under physician supervision, each subject repeated the letter /a/ three times while the recordings were made. As a result, the total number of recordings was 756 (252 × 3).

In the first phase of the study, the analysis of the sub-feature sets of the TQWT feature group was started. This feature group consisted of 36 sub-bands, and 12 features were extracted from each sub-band, so 12 sub-feature groups were formed [21]. The tenfold stratified CV method was applied to create training-test data in all analysis processes after this stage. Initially, all of the 12 sub-feature sets were presented to KNN and IGKNN in order to compare the system performances. The results obtained for this comparison are given in Table 1.
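The stratified split can be sketched as below: sample indices are dealt round-robin into folds one class at a time, so every fold keeps roughly the class proportions of the whole data set. This version is deterministic and does not shuffle, and it splits individual recordings; whether the three recordings of a subject were kept in the same fold is not stated in the text, so that aspect is left open. The function name is hypothetical:

```python
from collections import defaultdict

def stratified_folds(labels, n_folds=10):
    """Assign each sample index to a fold so class proportions stay similar."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    position = 0
    for indices in by_class.values():
        for index in indices:      # deal each class round-robin across folds
            folds[position % n_folds].append(index)
            position += 1
    return folds

# 188 x 3 PD recordings (label 1) and 64 x 3 healthy recordings (label 0).
labels = [1] * 564 + [0] * 192
folds = stratified_folds(labels)
```

Each fold serves once as the test set while the remaining nine folds form the training set.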

Table 1 The best results for all of the 12 sub-feature sets of TQWT with KNN and IGKNN

When Table 1 is examined, the success rate for KNN was 90.74% using all 432 sub-bands in total, while it was 94.97% for IGKNN using only 108 sub-bands. These results alone demonstrate the superiority of the proposed combined IGKNN system. In addition, with the proposed system, a higher performance was obtained with fewer sub-bands by performing a feature analysis. This contributes to reducing the computational complexity of the system. In the next step, it was investigated which of the 12 sub-feature sets was most effective. Table 2 shows the classification results for each of the 12 sub-feature sets of TQWT with the KNN algorithm. As shown in Table 2, the best statistical result among all groups was obtained with the Log Energy entropy (LEE) sub-feature group. The maximum ACC rate of this sub-feature group was 95.76%. Also, the ROC and PRC values reached almost 0.95 with this sub-feature group, which is close to a perfect classification result. The LEE sub-feature group was followed by Std value and TKEO mean with 92.85% and 87.96% ACC rates, respectively.

Table 2 The best results for each of the 12 sub-feature sets of TQWT with KNN (TNI total number of instances, CCI correctly classified instances, k-parameter = 1, distance function: Euclidean)

The same feature sets were also analyzed in detail using the IGKNN system. In this way, the effects of each sub-feature set on both systems could be seen more clearly. Table 3 shows the results obtained with the number of sub-bands selected for each feature set.

Table 3 The best results for each of the 12 sub-feature sets of TQWT with IGKNN (TNI total number of instances, CCI correctly classified instances, k-parameter = 1, distance function: Euclidean, TNS total number of sub-bands, NSS number of sub-bands selected)

When Table 3 is compared with Table 2, it can be seen that the performance results of all sub-feature sets increased, with the exception of Energy. Although no increase occurred for the Energy feature, the number of sub-bands was reduced from 36 to 12, and almost identical results were achieved. Achieving the same performance with fewer sub-bands contributes to the reduction of computational complexity.

In the next stage of the study, the IGKNN system was used to perform double and triple analyses of the sub-feature groups with the best classification performance. These groups were created by adding the second-best (Std value) and third-best (TKEO mean) sub-feature groups to LEE, which gave the best results with both KNN and IGKNN. These combined feature groups were then submitted to the IGKNN system for evaluation and classification. The results are given in Table 4.

Table 4 The best results for double and triple analysis of the most successful sub-feature groups with IGKNN system (TNI total number of instances, CCI correctly classified instances, k-parameter = 1, distance function: Euclidean, TNS total number of sub-bands, NSS number of sub-bands selected)

When Tables 2 and 3 are examined in terms of statistical performance, it is seen that better results were obtained for the LEE sub-feature group after using the IGKNN system. The ACC increased by almost 2% with the LEE feature reduced to 22 effective sub-bands. Namely, 737 of 756 input instances were correctly classified. Moreover, the Kappa value was calculated as 0.933; a value above 0.8 indicates almost perfect agreement between actual and predicted values. In addition, the TP rate, Prec, F, and MCC criteria, which were closest to “1”, statistically supported the ACC rate. The FP rate criterion reached the lowest value (0.05) among all FP rate calculations in the study. According to the results in Table 4, the LEE sub-feature group was followed by “LEE-Std value” and “LEE-Std value-TKEO mean” with ACC rates of 96.69% and 96.42%, respectively. As can be seen from Table 4, as the number of features in the groups increases, the performance results decrease slightly. In spite of this, the decrease in NSS contributes to the decrease in the computational complexity of the system, as mentioned previously.

Finally, in Table 5, only the information gain rates of the LEE sub-feature set from which the best performance was obtained are given as an example. According to this table, the ratios of 14 out of 36 features were obtained as “0”. The other 22 features were selected for the next step, which was the classification process.

Table 5 Gain ratios of LEE sub-feature set for 22 sub-bands

Figure 6 shows the ROC and PRC curves for the values indicated in the LEE sub-feature group in Table 4. The ROC curve area is the most preferred ROC statistic. Additionally, the balance between precision and recall should be created because these metrics are inversely related. The balance between these two metrics is stated by the PRC curve. In this study, the ROC and PRC criteria exceeded the value of 0.95 after using the IGKNN system and again demonstrated the high success of classification.
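The ROC curve area mentioned above can be sketched through its rank interpretation: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. How class scores were derived from the classifier in this study is not described, so the scores in the example are hypothetical, as is the function name:

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5        # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical scores for two PD (1) and two healthy (0) samples.
area = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

A value of 1 corresponds to a perfect ranking and 0.5 to a ranking no better than chance.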

Fig. 6

ROC and PRC curves of LEE sub-feature group after processing with IGKNN method

Discussion

PD is a stealthy brain disorder that progresses slowly. As in any disease, early diagnosis is likely to improve PD patients’ quality of life. People diagnosed in the early stage of the disease can shape their new life accordingly and take the necessary precautions. There are several diagnostic methods for this disease, including the analysis of audio signals [49]. Because PD can be diagnosed from these signals under specialist supervision, expert systems can be used in real-life applications in areas with limited medical resources for PD. These systems can also help physicians in existing health institutions strengthen the diagnosis with high success rates. In addition, if the proposed system can be made available to people online, it can help direct individuals suspected of having this disease to a specialist physician in the field. For this purpose, this study proposed an expert analysis system that can work quickly and with high accuracy in real-life settings as well as in the virtual environment and can automatically diagnose PD from the audio signals of an individual.

In the literature, data recording and processing, feature extraction and selection, and classification processes have generally been used in PD studies. A detailed analysis of PD-related studies is given in Table 6. The studies in this table were compared according to the number of data, experimental methods, and performance results of these systems. Accuracy rates obtained in the studies ranged from 82.5 to ~ 100%. The main goal of the studies in this field is to obtain maximum performance with the available data. Although the number of subjects in this paper and in the other studies [21, 50,51,52,53] using the same data was 252, this number ranged between 31 and 50 in the remaining studies.

Table 6 Comparison of the results in this study with those available in the literature

As seen in Table 6, the data set used in this study has also been examined in other studies [21, 50,51,52,53], and some comparisons were made with these studies in terms of the process and results. First, the feature groups formed in these studies [21, 50,51,52,53] were classified by several classifiers, but the classifier that gave the best result usually changed from one feature group to another. Such situations are undesirable as they are restrictive for expert systems in terms of time and processing load. For this reason, KNN was chosen as the single classifier at the start of this study. Second, the sub-feature data sets belonging to the TQWT feature group formed in the above-mentioned study [21] were not classified or analyzed separately. In this study, however, great emphasis was placed on the sub-feature data sets of this feature group, from which the best result was obtained, and efforts were made to decrease the data density of the expert system. Last, in the related studies [21, 50,51,52,53], attribute selection methods were applied to the whole data set. As a result, an ACC in the range of 86% to 96.83% was achieved with 20–555 features. This forces the whole system to search for effective features across all data groups. Therefore, the expert system may have difficulty in terms of data density and processing time. In contrast, in this study, a higher ACC rate (approx. 98%) was obtained using 22 features with the IGKNN mechanism. Also, the effectiveness of the proposed approach was supported by multiple statistical metrics, such as Kappa, Prec, ROC, and PRC.

An examination of other studies in the literature that used different data sets for the diagnosis of PD based on audio signals found none using the Information Gain feature analysis approach or the IGKNN hybrid system. Moreover, the number of subjects whose audio signals were obtained in other studies is relatively low compared to this study, which further reinforces the importance of this study.

Although the hierarchical IGKNN system presented in this study achieves successful results for the early diagnosis of PD, it also has some limitations, which fall under two main headings. The first is related to the IG algorithm used for feature selection. An important disadvantage of this method is its poor performance on features that take a large number of distinct values (such as “date: 19_8_1996”). For this reason, the data presented to the proposed sequential system must be chosen selectively; otherwise, the feature selection phase is less likely to succeed. The second limitation is related to the KNN classifier, which is particularly powerful against noisy data, such as audio signals. The disadvantage of this algorithm is its large memory requirement, especially for large data, since it stores all of the training samples in order to calculate distances. This requirement is proportional to the number of samples (p) and the number of features (m) in the data set, and it directly determines the computational complexity, O(pm), of the proposed system. In this study, the value was reduced by eliminating the ineffective attributes of the samples. Further limitations of the KNN classifier are its hyper-parameters, such as the number of neighbors (k) and the distance metric, which affect system performance. As for the general limitations of the study, the major deficiency of this and similar studies is that the proposed systems cannot be tested extensively because of the limited amount of data. In addition, noise in audio signals can adversely affect system performance. Another common deficiency of studies in this area is that the ready-made data banks do not include the raw signals.
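The first limitation can be made concrete with a small sketch, under the assumption that it refers to the well-known bias of Information Gain toward attributes with many distinct values. The labels, dates, and the `jitter_level` attribute below are hypothetical and are not taken from the data set used in this study.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """IG of a categorical feature: H(labels) minus the weighted entropy per value."""
    groups = defaultdict(list)
    for value, label in zip(feature, labels):
        groups[value].append(label)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

labels = ["PD", "PD", "PD", "healthy", "healthy", "healthy"]

# Hypothetical recording dates: unique per subject and unrelated to the diagnosis.
dates = ["19_8_1996", "02_3_1997", "11_1_1998",
         "07_7_1996", "23_9_1997", "30_4_1998"]

# Hypothetical coarse acoustic attribute: weakly related to the diagnosis.
jitter_level = ["high", "high", "low", "low", "low", "high"]

ig_dates = information_gain(dates, labels)          # every value isolates one subject
ig_jitter = information_gain(jitter_level, labels)  # groups remain mixed
```

Because each date occurs once, every group is pure and the date attribute receives the maximum possible gain (the full class entropy), although it carries no diagnostic information. This is why the attributes presented to an IG-based selector need to be screened beforehand.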

In the literature, besides classifiers (such as ANN and SVM), data segmentation methods (such as k-fold CV and leave-one-out CV) and signal processing techniques (such as Fourier and wavelet transforms) have been used. In summary, few studies in this field incorporate a new approach. Thus, the approach proposed in this study makes a significant contribution to the literature in terms of both the data segmentation method and the feature selection method. In addition, a detailed analysis of attributes extracted from speech signals with the proposed IGKNN system has not been previously reported in the literature. Furthermore, the large amount of data in our study compared to similar studies in this field is another factor that makes this study important.

Conclusions

In artificial intelligence-based automatic diagnosis systems, data preparation, method implementation, feature extraction, selection, and reduction, and dimension change methods are important and necessary steps. The aim of this study was to improve the performance of the selected classification system through all of these stages. In this study, a novel approach (the IGKNN approach) was proposed for diagnosing PD with high accuracy based on audio signals. For this system, the attributes extracted from the previously recorded speech signals of 252 people [21] were used as the data set; the data were taken from the UCI repository. These records were separated into training and test sets using the tenfold stratified CV method. The KNN algorithm, which is effective against noisy data such as audio signals, was used for the automatic PD diagnostic system. This study aimed to examine the effect of the Information Gain approach, which, to my knowledge, has not previously been used in PD diagnosis. Also, an expert system like the IGKNN approach, which can diagnose PD from audio signals and achieve maximum performance with fewer features, has not been encountered previously. Considering the low number of subjects used in studies so far, another goal of this study was to describe in detail the success rate obtained on 252 subjects. With the proposed system, an ACC of 97.48% was obtained with the 22 selected features. Also, a Kappa coefficient of 0.933 was achieved; a value above 0.8 indicates almost perfect agreement between actual and predicted values. Moreover, the ROC and PRC areas exceeded 0.95, demonstrating the high success of classification. In a nutshell, maximum performance was obtained with a minimum number of attributes thanks to the IGKNN approach. Furthermore, the amount of data in this paper was higher than in the other studies.
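For readers less familiar with the Kappa coefficient reported above, the following minimal sketch shows how Cohen's kappa is obtained from a confusion matrix as (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The counts are hypothetical and are not the confusion matrix of this study; the function name `cohen_kappa` is likewise an illustrative choice.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: actual, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical counts for a two-class problem (healthy, PD).
confusion = [[45, 5],
             [3, 47]]

accuracy = (confusion[0][0] + confusion[1][1]) / 100  # 0.92
kappa = cohen_kappa(confusion)                        # (0.92 - 0.50) / (1 - 0.50) = 0.84
```

In this example the accuracy is 0.92 but kappa is 0.84, because half of the agreement would be expected by chance alone; this is why kappa is reported alongside ACC.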

Gender difference between subjects is a factor emphasized in some voice processing studies. In addition to differences in tone of voice due to gender or age, many factors, such as accent, mouth and tooth structure, hormonal and racial/ethnic differences, and environmental factors (smoking and other habits), can affect the success of voice processing studies [57,58,59,60]. However, the main purpose of studies in this area is to achieve high classification success under optimum system parameters in spite of all of these differences and adverse factors. The source from which the data set used in this study was taken [21] does not mention differences in the subjects’ tone of voice or any other adverse factors. Nevertheless, thanks to the presented approach, a classification success of almost 98% was achieved, which demonstrates the success of the study. It is therefore understood that the selection of 36 sub-bands and the extracted features for both female and male subjects is effective in minimizing the disadvantage that may arise from these possible differences. In the future, detailed studies can be carried out on the effects of other factors in addition to differences in tone of voice between subjects. Besides, the IGKNN system can be applied to handwriting, gait, and other medical parameters of people with PD. The results of the proposed approach can be improved with larger PD data sets and more significant features obtained by various methods. In addition, system performance can also be assessed on the existing 12-dimensional data set by using dimension reduction methods, such as principal component analysis (PCA). In the implementation phase, a new data matrix obtained by varying the dimension parameter between 1 and 12 is presented to the system, and the most appropriate dimension is decided according to the performance values recorded for each setting. Reducing the dimension of the feature space also reduces the computational complexity of the algorithm.