1 Introduction

Parkinson’s disease is a movement disorder. It affects the nervous system, and its signs worsen over time. Cerebral palsy, ataxia, and Tourette syndrome are other movement disorders [1]. They arise when a disturbance in the nervous system impairs a person’s ability to move or remain still [2, 3]. The National Institutes of Health (NIH) reports that about 60,000 people are diagnosed with Parkinson’s disease every year in the United States of America, and about half a million people are living with the condition [4, 5]. Some symptoms worsen over time, and some patients may develop dementia. Most of the symptoms arise from a reduction in dopamine levels in the brain [6, 7]. A 2018 study conducted in France showed that men are 50% more likely than women to have Parkinson’s disease in general, but women’s risk tends to increase. In most adults, symptoms appear at or above age 60 [8, 9]. In 5–10% of cases, however, they occur sooner [10, 11]. If Parkinson’s disease develops before age 50, it is considered “early-onset” Parkinson’s disease.

The rest of the paper is organized as follows: Sect. 2 describes the related work; Sect. 3 presents the proposed system; Sect. 4 covers the experimental results and analysis; Sect. 5 presents the conclusion [12, 13].

2 Related Work

  • Aaswad Sawant et al. [14] studied various cancer detection strategies. Their system can be used by surgeons and radiologists as a second opinion for fast and effective identification of brain tumors.

  • Gamal Saad Mohamed et al. [15] used four classifiers, based on Naive Bayes, SVM, an MLP neural network, and decision trees, to classify the PD dataset. The output of these classifiers is analyzed when they are applied to the real PD dataset, the discretized PD dataset, and a selected set of PD dataset attributes. The dataset used in this study includes a variety of speech signals from 32 people: 25 with PD and 9 healthy individuals.

  • Enes Celik et al. [16, 17] compared several classification methods for modeling Parkinson’s disease, including logistic regression, support vector machines, random trees, gradient boosting, and random forests. A total of 1200 speech samples were used in the classification stage, each comprising 26 features gathered from patients with Parkinson’s disease and from non-patients. The feature space of the dataset is extended by means of correlation maps. These correlation maps are built from the features selected using principal component analysis (PCA), from the features selected using information gain (IG), and from all features, respectively [18, 19].

  • Monica Giuliano et al. [20, 21] examined demographic details and phonation recordings of the vowel /a/ from the publicly accessible mPower database in order to classify patients with PD. They identified a parsimonious model that reduced the number of phonation features from 62 to 5, which were considered in addition to sex and age. A multilayer perceptron (MLP) neural network and logistic regression (LR) were used to obtain a model with strong predictive power (area under the receiver operating characteristic curve, AUC-ROC, over 0.82) [22, 23]. This research contributes to the tracking of patients with PD by capturing a few phonation features recorded with a mobile phone [24, 25].

3 Proposed System

We propose a model that gives accurate results by analyzing both speech data and spiral drawings from patients. The doctor can then infer normality or deviation by comparing both findings and can recommend medication depending on the stage of the disease.

3.1 Voice Data Processing

The audio data are obtained from the UCI repository. We used RStudio to analyze the data. The conceptual architecture for predictive analytics combines K-means clustering with decision tree classification to obtain patient insights. Using these machine learning algorithms, the problem can be solved with a reduced error rate. The Parkinson’s disease speech dataset from the UCI Machine Learning Repository is used as input. Our experimental results suggest that early diagnosis of the disease can support therapeutic care of the elderly and improve their life expectancy and quality of life.

3.2 Spiral Drawing Analysis

We used Python in PyCharm to process and analyze the spiral images. Our proposed system provides reliable results by combining spiral drawings from healthy subjects and from patients with Parkinson’s disease. Principal component analysis (PCA) is used to extract features from these drawings. From the spiral sketches, the values of X, Y, Z, Pressure, GripAngle, Timestamp, and the reference ID are extracted. Using a machine learning technique (support vector machine), the extracted values are compared with the trained database, and the results are obtained (Fig. 1).

Fig. 1 Overall proposed system

3.2.1 Parkinson’s Disease Voice Dataset Analysis

  1. Importing data into RStudio

Step 1: Input and arrange the data in Excel.

Arrange the data in an Excel worksheet so that the first row (Row 1) contains the column names and each following row contains all the information necessary for one data point in the experiment (i.e., descriptive values and measurements).

Step 2: Save your worksheet as a comma-separated file type (.csv).

Save your Excel spreadsheet as usual (default file type: Excel Workbook); this will be your master file, to which you can always return to make changes, add new details, etc. Then, choose “Save As…” to create a version of your data to load into R. A window should open in which you can specify the filename and the file type you want.

Step 3: Import data to RStudio.

  2. K-Means Clustering

K-means clustering is an unsupervised machine learning algorithm that attempts to cluster data based on their similarity. Unsupervised learning means that there is no outcome to be predicted; the algorithm simply tries to find patterns in the data. In k-means clustering, we specify the number of clusters into which the data should be divided. The algorithm randomly assigns each observation to a cluster and computes the centroid of each cluster. The algorithm then iterates through two steps:

  • Reassign each data point to the cluster whose centroid is closest.

  • Recalculate the centroid of each cluster.

These two steps are repeated until no further reduction of the within-cluster variation is possible. The within-cluster variation is measured as the sum of the squared Euclidean distances between the data points and their respective cluster centroids.
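The iteration described above can be sketched in a few lines of Python. This is a minimal illustration that assumes NumPy; it is not the R code used in our analysis, and the function name `kmeans`, its parameters, and the rule for re-seeding an empty cluster are illustrative assumptions.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Minimal k-means sketch: random initial assignment, then two alternating steps."""
    rng = np.random.default_rng(seed)
    # Randomly assign each observation to one of the k clusters.
    labels = rng.integers(0, k, size=len(points))
    for _ in range(n_iter):
        # Recalculate the centroid of each cluster.
        # Assumption: an empty cluster is re-seeded with a randomly chosen point.
        centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j)
            else points[rng.integers(len(points))]
            for j in range(k)
        ])
        # Reassign each point to the cluster whose centroid is closest.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Stop once the assignment no longer changes.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centroids
```

Calling `kmeans(points, k=2)` on an array of shape (n_samples, n_features) returns one cluster label per row together with the cluster centroids.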

3.3 Decision Tree

A decision tree is also called a prediction tree. It uses a tree structure to specify sequences of decisions and their outcomes. The aim is to predict a response or output variable Y, given the input X = {X1, X2, …, Xn}. Each member of the set {X1, X2, …, Xn} is called an input variable. The prediction is made by constructing a decision tree with test points and branches. At each test point, a decision is made to pick a single branch and continue down the tree. Decision trees are used in a number of disciplines, for example: deciding on the basis of individual attributes whether or not to give a loan to an individual, predicting the rate of return of various investment strategies, predicting whether or not to send direct mail to a prospective client, etc.

A decision tree consists of nodes that form a rooted tree, which means it is a directed tree with a root node. The root node has no incoming edges, while every other node in a decision tree has exactly one incoming edge. A node with an incoming edge and outgoing edges is called an internal node, also known as a test node. Nodes with no outgoing edges are called terminal nodes or leaves.
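The node structure described above can be illustrated with a short Python sketch. The class name `Node`, its fields, and the thresholds and class labels in the usage example are hypothetical and serve only to show how a prediction follows one branch at each test node until a leaf is reached.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Internal (test) nodes set feature/threshold/left/right; leaves set label only.
    feature: Optional[int] = None      # index of the input variable tested here
    threshold: Optional[float] = None  # go left if x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[str] = None        # predicted output Y (leaves only)


def predict(node: Node, x: Sequence[float]) -> str:
    """Walk from the root to a leaf, picking one branch at each test node."""
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label
```

For example, the hypothetical tree `Node(feature=0, threshold=0.5, left=Node(label="A"), right=Node(feature=1, threshold=2.0, left=Node(label="A"), right=Node(label="B")))` has one root, one further internal node, and three leaves.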

3.3.1 Parkinson’s Disease Spiral Drawing Analysis

  1. Preprocessing

    (a) Image Acquisition

      Image acquisition is the first phase of image processing. Compared with HD images, the videos are acquired with minimal noise. The key advantages are images with better clarity, low noise, and low distortion.

    (b) Image Preprocessing

      Image preprocessing is an image processing step that aims to make identification easier. It is a way to improve image quality, so that the resulting image is better than the original. The median filter is a non-linear method, whereas the mean filter is linear. Mean filtering is a simple, intuitive, and easy-to-apply method of smoothing images, that is, of reducing the amount of intensity variation between one pixel and the next.

      The median filter is usually used to reduce salt-and-pepper noise in an image. It often does a better job than the mean filter of preserving useful detail in the image. The median is determined by first sorting all the pixel values from the surrounding neighborhood into numerical order and then replacing the pixel under consideration with the middle pixel value. If the neighborhood under consideration contains an even number of pixels, the average of the two middle pixel values is used. For noise reduction, both mean and median filters are used. The preprocessed image is used as the input for image segmentation.

    (c) Image Segmentation

Image segmentation is an important step for most subsequent image analysis tasks. Segmentation divides an image into its constituent regions or objects. The aim of segmentation is to change the representation of an image into something that is more meaningful and easier to analyze.
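The median filter discussed under Image Preprocessing can be sketched as follows. This is a minimal illustration that assumes NumPy and a grayscale image stored as a 2-D array; the function name `median_filter`, the 3 × 3 default window, and the edge-replicating border handling are assumptions, not details of our implementation.

```python
import numpy as np


def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighborhood."""
    pad = size // 2
    # Assumption: borders are handled by repeating the edge pixels.
    padded = np.pad(image, pad, mode="edge")
    out = np.empty_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            window = padded[r:r + size, c:c + size]
            # np.median sorts the values and takes the middle one
            # (or the average of the two middle values for an even count).
            out[r, c] = np.median(window)
    return out
```

A single bright outlier pixel in an otherwise uniform region is removed by this filter, which is why it suits salt-and-pepper noise.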

  2. Prediction

Our hybrid model thus combines image processing (spiral drawing analysis) with data analytics in R (values derived from the speech dataset and the spiral drawings). Data analytics has a larger role to play in the healthcare sector, because these data are diverse and complex in nature and the Parkinson’s disease dataset is large in scale. With it, new opportunities and demands are found, greater complexity is revealed, predictive capacity is improved, and time is saved in adopting cost-effective measures.

More specifically, this integration helps healthcare organizations to evaluate their large datasets quickly and efficiently. Early detection of any type of disease is important because it allows patients to be treated sooner. The system identifies the classifier with the highest accuracy, and multi-classifier consensus tests are used to detect the disease sooner and increase the lifespan of people with PD.

4 Experimental Results

  • In our work, we used the UCI Machine Learning Repository. The experiments were performed in RStudio. The repository contains large quantities of multidimensional data gathered from different areas, including advertising, geospatial, and biomedical domains. Using Python, we generated a pressure graph for both diseased and non-diseased persons. These graphical representations of the data give an idea of how differently diseased and non-diseased persons appear and can be used to identify affected persons easily (Fig. 2).

Fig. 2 The pressure graph for a diseased person

When this image is passed in, preprocessing takes place, and all the features are extracted for further segmentation (Fig. 3).

Fig. 3 The pressure graph for a non-diseased person

The features are then used to train a decision tree classifier, a method that is widely used for both classification and regression. As the next step, the RGB components are extracted from the image, and the analysis is done on the number of clusters obtained. Some images are set aside for testing, and the rest are used for training. The classifier learns [11] the features and classifies any new image that is given to it. It identifies whether the image is benign or malignant (Fig. 4).

Fig. 4 Accuracy output

  • Figure 2 shows the efficiency of the proposed model. The classifier achieves an accuracy of 86.66%, with 99.9% specificity and 80.48% sensitivity. The segmentation of the prohibited item is extracted from the exact image, which gives us various parameters to measure, such as its intensity, volume, and size. This helps in diagnosing and treating the disease more efficiently. In Fig. 3, the transmission speed of the system is depicted (Figs. 5 and 6).
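For reference, accuracy, sensitivity, and specificity are ratios over the counts of a binary confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` and the counts in the usage note are hypothetical and are not the counts behind the figures reported above.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: diseased cases predicted as diseased / as non-diseased.
    tn/fp: non-diseased cases predicted as non-diseased / as diseased.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity
```

With the hypothetical counts tp = 8, fn = 2, tn = 9, fp = 1, the accuracy is 17/20, the sensitivity is 8/10, and the specificity is 9/10.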

Fig. 5 Scatter plot matrix of the Parkinson’s data

Fig. 6 Accuracy of logistic regression

Logistic regression is another technique used for prediction: it takes the values of diseased persons as the reference and predicts the approximate value for a diseased person (Fig. 7).

Fig. 7 Accuracy of support vector machine

Support vector machines are also helpful for prediction, because in this algorithm we generally identify the nearest points of the training data and test the data based on them (Fig. 8).

Fig. 8 Accuracy of decision tree

A decision tree is a tree-like structure built from the training data, in which each internal node tests an input variable and each leaf gives an outcome; the test data are passed through the nodes to predict the outcomes (Fig. 9).

Fig. 9 K-nearest neighbor

Using K-nearest neighbors, decision tree, SVM, and k-means clustering, we obtain the accuracy of the different algorithms. With the help of all these algorithms, we can identify diseased persons with greater accuracy and help the patient move on to the next step.
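As an illustration of the K-nearest neighbors classifier named above, the following sketch predicts the class of a query point by majority vote among its k closest training points. It assumes NumPy and Euclidean distance; the function name `knn_predict` and the choice k = 3 are illustrative and not taken from our experiments.

```python
from collections import Counter

import numpy as np


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    dists = np.linalg.norm(np.asarray(train_x) - np.asarray(query), axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Here `train_x` holds one feature vector per row and `train_y` holds the corresponding class labels.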

5 Conclusion

Past review papers provide a comprehensive survey of specific neuroimaging modalities and related analytical techniques proposed for the treatment of Parkinson’s disease in recent years. Past research articles focused solely on a specific imaging modality such as MRI or PET, or only on a particular type of dementia such as AD. This study sought to cover a wider range of imaging and machine learning algorithms for the diagnosis of mental illness, so that researchers in the field can readily identify the state of the art in the area. We also emphasize the importance of early detection and prediction of Parkinson’s disease, so that patients can be given treatment and support as soon as possible.