Introduction

Magnetic resonance imaging (MRI) can support an early diagnosis of Parkinson’s disease (PD), an incurable neurodegenerative condition [1]. With the use of biomarkers, the condition can be diagnosed quickly and accurately, and with the right care the gradual death of dopaminergic neurons can be held in check. PD is a long-term disorder that affects the nervous system and the parts of the body it controls. The signs appear gradually. The first sign may be a slight tremor in one hand. Although tremors are common, the disease can also cause stiffness or slowness of movement. In the early stages of Parkinson’s disease, the face may show little or no expression. The arms might not swing while walking. Speech may become soft, monotone, or hard to understand. As Parkinson’s disease progresses, the symptoms worsen [2,3,4]. Although there is no known cure for PD, medication may reduce the symptoms considerably. In rare cases, a doctor might recommend surgery to regulate certain regions of the brain and relieve the symptoms.

A large number of dopamine neurons are found in the substantia nigra, a basal ganglia structure in the midbrain [5]. Dopamine is a neurotransmitter that neurons release to communicate with nearby neurons in the brain. PD develops when the dopamine neurons in the substantia nigra begin to degenerate [6]. It is a neurological disorder that occurs most commonly in the elderly. Parkinson’s disease is estimated to affect between 7 and 10 million individuals worldwide. Among people aged 80 and older, more than 1900 in every 100,000 have the condition. The incidence of the condition, that is, the number of newly diagnosed cases, generally rises with advancing age, although it may level out in those older than 80. An estimated 4% of cases of Parkinson’s disease are diagnosed in people under the age of 50. Parkinson’s disease affects males at a rate 1.5 times higher than females [7,8,9].

Deep learning (DL) is a recent development in the field of neural networks [10, 11]. It is a subset of machine learning that works with unstructured (hierarchical) data types such as text, audio, and images. DL represents data at several levels of abstraction, one per layer. The need for depth arises from the limitations of shallow neural networks (a conventional neural network (NN) typically has at most two hidden layers). In a shallow NN, for instance, the number of neurons in the hidden layers grows exponentially with task complexity, so a shallow NN requires more neurons than a deep NN for the same task. The human nervous system is likewise a deep architecture.

Kaplan et al. [12] proposed a four-stage automated system. Texture features are extracted from patches of non-fixed size using the pyramid histogram of oriented gradients (PHOG) image descriptor. Two classifiers applied to four selected feature vectors produce eight prediction vectors. Finally, iterative majority voting (IMV) is applied to obtain the overall classification results.

Sivaranjini et al. [13] used a deep learning neural network to classify magnetic resonance (MR) images of people with Parkinson’s disease and healthy controls. The AlexNet convolutional neural network architecture is used to make the diagnosis of Parkinson’s disease more accurate. The network is trained on the MR images by transfer learning and then tested to measure its accuracy.

Balaji et al. [14] proposed a long short-term memory (LSTM) network for assessing the severity of PD from a person’s gait. The LSTM network differs from conventional machine learning (ML) methods: it addresses the vanishing gradient problem by replacing self-connected hidden units with memory blocks, which lets it decide when to learn new information. The LSTM network is trained on three separate gait datasets containing vertical ground reaction force (VGRF) recordings for various walking scenarios.

Cigdem et al. [15] used voxel-based morphometry (VBM) to compare structural changes in grey matter (GM) and white matter (WM) between people with PD and healthy controls (HCs). They investigated how several covariates, such as total intracranial volume (TIV), age, sex, and combinations of these, as well as two types of statistical contrast (t-contrast and f-contrast), affect the ability to distinguish PD from HCs. Separate 3D models for GM and WM tissues are built from the regions that differ between PD and HC under a two-sample t test. Principal component analysis (PCA) is used for dimensionality reduction, and a support vector machine (SVM) is used for classification.

Senturk et al. [16] proposed a machine learning-based approach to diagnosing Parkinson’s disease. The approach consists of feature selection followed by classification. For feature selection, both the Feature Importance and the Recursive Feature Elimination methods were considered.

Noor et al. [17] presented DL-based methods for identifying neurological disorders from MRI data. In a head-to-head comparison of the diagnostic accuracy of several DL architectures across a wide range of medical conditions and imaging modalities, the convolutional neural network (CNN) outperformed its competitors.

Tagaris et al. [18] presented a system that analyses medical imaging data. The system is trained on an existing medical database, and its performance is tested on the same database.

Kaur et al. [19] used deep convolutional neural networks to classify MR images of people with Parkinson’s disease and healthy people into two groups. Such algorithms require a substantial task-specific training dataset to perform well. The authors implemented a deep CNN classifier to improve the classification, building on the pre-trained AlexNet architecture to improve the diagnosis.

Zhao et al. [20] proposed building a composite model by combining a number of separate models. Across three retrospective studies, the participants comprised 305 Parkinson’s disease patients and 227 healthy controls, with a mean age of 59.9 ± 9.7 years. The data were divided into a training set evaluated with ten-fold cross-validation (N = 432) and an independent blind test set (N = 100). Diffusion-weighted images were acquired on a 3 T scanner. After computing the fractional anisotropy and mean diffusivity of the brain, the researchers used the Automated Anatomical Labelling template to divide the data into 90 brain regions of interest. A separate model was trained for each region, and a greedy strategy was used to aggregate the region-level predictions into one final prediction.

Sreelakshmi et al. [21] used a genetic algorithm (GA)-based segmentation approach with deep CNN models to improve the speed and accuracy of PD diagnosis.

An exact diagnosis of Parkinson’s disease remains difficult, and researchers are still working to understand the early stages of the disease. The last 5 years have seen several major developments concerning prodromal Parkinson’s disease, including the validation of clinical diagnostic criteria and the creation and testing of research criteria. Current models do not classify brain MRI images well enough to detect PD. The purpose of this work is to classify brain MRI images with a modified version of the ResNet50V2 deep learning model to detect Parkinson’s disease. The remainder of the paper is organised as follows. This first section introduces the problem and reviews the literature. “Proposed Model” lays out the suggested method. “Experimental Results” presents the results of the experiments, followed by a conclusion and a list of references.

Proposed Model

There are several variants of ResNet, all of which employ the same fundamental concept but differ in the number of layers [22,23,24]. ResNet50 is the variant with 50 layers. To tackle difficult problems, deep neural networks usually stack many layers on top of one another, which improves the accuracy and performance of the network. The idea behind stacking is that additional layers allow the system to learn progressively more complex features. In image recognition, for example, the first layer might detect edges, the second layer colours, the third layer objects, and so on. A plain convolutional neural network, however, has a practical limit on its depth. Figure 1 shows how the proposed model is put together.

Fig. 1 Proposed deep learning-based framework for PD detection

The difficulty of training extremely deep networks has been addressed by ResNet, or residual networks, which are built from residual blocks. The key difference from a plain network is a direct link that bypasses the intermediate layers (the number of layers bypassed may differ between models). This link, called a “skip connection”, is at the core of the residual blocks. The skip connection changes the output of the layer. Without it, the input “x” is multiplied by the layer weights and a bias term is added.

The activation function f() is then applied, and the result is denoted by H(x):

$$H(x) = f(wx + b)$$

or, more compactly,

$$H(x) = f(x)$$

With the skip connection, the output becomes:

$$H(x) = f(x) + x$$

A complication arises when convolutional and pooling layers are used, because the dimensions of the input and the output may differ. When the dimensions of f(x) differ from those of x, there are two options:

  • Zero padding: the skip connection is padded with extra zero entries to increase its dimensions.

  • Projection: a 1 × 1 convolutional layer is applied to the input to match the dimensions. The output in this case is:

    $$H(x) = f(x) + w_1 x$$

The first option adds no extra parameters, whereas the projection introduces the additional parameter \(w_1\).
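The two shortcut options can be illustrated with a minimal sketch in plain Python. It assumes, for illustration only, that f is a single fully connected layer followed by ReLU acting on a vector, rather than the convolutional layers of the actual model; the helper names `dense`, `relu`, and `residual_block` are hypothetical and do not come from the paper.

```python
def relu(values):
    """Element-wise max(0, v) on a list of floats."""
    return [max(0.0, v) for v in values]


def dense(x, w, b):
    """Compute w x + b for a weight matrix w (list of rows) and a bias vector b."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(w, b)]


def residual_block(x, w, b, w1=None):
    """Return H(x) = f(x) + shortcut(x), with f(x) = relu(w x + b).

    Identity shortcut when the dimensions match, zero padding when f(x) is
    longer than x and no projection is given, projection w1 x otherwise.
    """
    fx = relu(dense(x, w, b))
    if len(fx) == len(x):
        shortcut = list(x)                                # H(x) = f(x) + x
    elif w1 is None:
        shortcut = list(x) + [0.0] * (len(fx) - len(x))   # zero padding, no new parameters
    else:
        shortcut = dense(x, w1, [0.0] * len(fx))          # H(x) = f(x) + w1 x
    return [a + s for a, s in zip(fx, shortcut)]


x = [1.0, -2.0]
w_same = [[0.5, 0.0], [0.0, 0.5]]                 # output has the same size as x
w_wide = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # output is longer than x

print(residual_block(x, w_same, [0.0, 0.0]))                   # identity shortcut
print(residual_block(x, w_wide, [0.0, 0.0, 0.0]))              # zero-padded shortcut
print(residual_block(x, w_wide, [0.0, 0.0, 0.0], w1=w_wide))   # projection shortcut
```

Only the projection branch introduces weights of its own, which matches the remark that zero padding supplies no additional parameter.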

The skip connections of ResNet address the vanishing gradient problem in deep neural networks by providing an additional shortcut path for the gradient to flow through. Because these connections let the model learn identity functions, a higher layer will perform at least as well as a lower layer, and possibly better. Consider a shallow network and a deep network that both map an input x to an output y through a function H(x). The deep network should perform at least as well as the shallow network and should not degrade as plain neural networks (without residual blocks) do. One way to achieve this is for the additional layers of the deep network to learn the identity function. The output of each such layer then equals its input, so performance does not drop as more layers are added.

Residual blocks make it very easy for layers to learn identity functions, as the formulas above show. In a plain network, the output is

$$H(x) = f(x)$$

Therefore, to learn an identity function, f(x) must equal x, which is harder to learn. A ResNet block instead gives the output:

$$H(x) = f(x) + x$$

Setting

$$f(x) = 0$$

yields

$$H(x) = x$$

so the block reproduces its input x at the output simply by learning f(x) = 0, which is easier.
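A small numerical sketch of this argument follows. It assumes a scalar layer with a ReLU activation, chosen only to keep the arithmetic visible; the function names are hypothetical. With all weights at zero, the residual layer already returns its input, whereas the plain layer needs specific weights and, with ReLU, still cannot pass a negative input through.

```python
def relu(value):
    """Scalar ReLU, max(0, value)."""
    return max(0.0, value)


def plain_layer(x, w, b):
    """Plain layer: H(x) = f(x), with f(x) = relu(w * x + b)."""
    return relu(w * x + b)


def residual_layer(x, w, b):
    """Residual layer: H(x) = f(x) + x, with the same f."""
    return relu(w * x + b) + x


# Zero weights: f(x) = 0, so the residual layer is the identity.
print(residual_layer(3.0, 0.0, 0.0))    # 3.0
print(residual_layer(-3.0, 0.0, 0.0))   # -3.0

# The plain layer outputs 0 with zero weights and must learn w = 1, b = 0
# to copy a positive input; a negative input is still clipped by ReLU.
print(plain_layer(3.0, 0.0, 0.0))       # 0.0
print(plain_layer(3.0, 1.0, 0.0))       # 3.0
print(plain_layer(-3.0, 1.0, 0.0))      # 0.0
```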

ResNet50V2 is an improved version of ResNet50. Whereas the accuracy gains of the original ResNet came from adding layers, the improved variants raise computational efficiency and model accuracy by modifying the block units, for example by increasing the number of group divisions, without increasing the overall depth or width (Fig. 2).

Fig. 2 Proposed ResNet model architecture

The following is an explanation of each layer that is used in the architecture:

  • Convolutional layer: a filter is applied to the brain MRI input data through a convolution operation to extract features. A bank of such filters captures the characteristics of different regions of each image.

  • Rectified linear unit (ReLU): instead of a saturating non-linear activation function such as \(f\left(x\right)=\mathrm{tanh}\left(x\right)\) or \(f\left(x\right)={(1+{e}^{-x})}^{-1}\), the proposed model uses \(f\left(x\right)=\mathrm{max}\left(0,x\right)\). ReLU accelerates learning because it mitigates the vanishing gradient issue, which arises whenever a traditional activation function is used in a deep neural network.

  • Pooling layer: the pooling layer keeps the most prominent features and discards the less important ones. The proposed model uses a max pooling layer to improve performance during training.

  • Batch normalization: batch normalization normalizes the features of each channel based on the data distribution in the batch. It can achieve faster and more stable convergence while preserving the expressive power of the original CNN.

  • Dense: the dense layer is a basic fully connected layer in which each neuron receives input from every neuron in the preceding layer, hence its name.

  • Dropout: the dropout layer is designed to avoid overfitting by randomly setting input units to 0 with a specified frequency at each step during training.

  • Softmax: the softmax function takes a vector of real numbers as input and converts it into a vector of probabilities, one for each value, that sum to 1.
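Three of the layer operations listed above can be written out in a few lines of plain Python. This is an illustrative sketch, not the paper's implementation: the 2 × 2 window and stride of the pooling helper are assumptions, and the function names are hypothetical.

```python
import math


def relu(values):
    """ReLU activation: f(x) = max(0, x), applied element-wise."""
    return [max(0.0, v) for v in values]


def max_pool_2x2(image):
    """2 x 2 max pooling with stride 2 on a list-of-rows image.

    Assumes the height and width of the image are both even.
    """
    return [[max(image[r][c], image[r][c + 1],
                 image[r + 1][c], image[r + 1][c + 1])
             for c in range(0, len(image[0]), 2)]
            for r in range(0, len(image), 2)]


def softmax(logits):
    """Convert real-valued scores into probabilities that sum to 1."""
    largest = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


print(relu([-1.0, 2.0]))                         # [0.0, 2.0]
print(max_pool_2x2([[1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16]]))          # [[6, 8], [14, 16]]
print(softmax([0.0, 0.0]))                       # [0.5, 0.5]
```

For a two-class problem such as Normal versus PD, the softmax output is a pair of probabilities, and the predicted class is the one with the larger value.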

The loss function used in the proposed model is categorical cross-entropy. This loss function is used in multi-class classification models, where there are two or more output labels. Each output label is given a one-hot encoding made up of zeros and ones. If the output label is in integer form, it is converted into this categorical representation using Keras.
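The encoding and the loss can be sketched in plain Python as follows. The helper names are hypothetical, the label order (0 for Normal, 1 for PD) is an assumption, and the small `eps` clamp is an implementation detail added here to avoid taking the logarithm of zero; in Keras the integer-to-one-hot conversion is commonly done with `to_categorical`, although the paper does not name the function it used.

```python
import math


def to_one_hot(label, num_classes):
    """Turn an integer class label into a one-hot list of zeros and ones."""
    return [1.0 if index == label else 0.0 for index in range(num_classes)]


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss = -sum_i y_true[i] * log(y_pred[i]) for a single sample."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Assumed label order for illustration: 0 = Normal, 1 = PD.
target = to_one_hot(1, 2)
print(target)                                          # [0.0, 1.0]
print(categorical_cross_entropy(target, [0.1, 0.9]))   # small loss, confident and correct
print(categorical_cross_entropy(target, [0.9, 0.1]))   # larger loss, confident and wrong
```

The loss approaches zero as the predicted probability of the true class approaches 1.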

Experimental Results

This section presents the experimental results that support the proposed model. Brain MRI scans of individuals with and without Parkinson’s disease serve as the model’s input. A total of 582 images from the two classes were used for training, and 249 images from the two classes were used for testing. MRI images of PD patients are shown in Fig. 3, and Fig. 4 shows MRI scans of healthy individuals.

Fig. 3 MRI of patients with PD

Fig. 4 MRI of normal people

Given a training set of examples, the modified ResNet model learns to map inputs to outputs. Training consists of finding a set of network weights that solves the problem well, or at least well enough. Figure 5 shows the validation loss and validation accuracy during training. The training loss and the validation loss decreased from initial values of 60 and 35, respectively. Both the training accuracy and the validation accuracy reached 100%.

Fig. 5 a Validation loss and b validation accuracy

Table 1 shows the confusion matrix of the model. There are 249 test images in total: 183 MRI images of people without PD (labelled “Normal”) and 66 MRI images of people with PD. The proposed model classified all of the test images correctly.

Table 1 Confusion matrix of the proposed model

Table 2 lists the evaluation parameters: True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN), Precision, Sensitivity, and Specificity. The TP is 183 for the Normal class and 66 for the PD class. The FP and FN values are 0 for both classes. The TN is 66 for the Normal class and 183 for the PD class. The precision, sensitivity, and specificity are equal to 1 for both classes.

Table 2 Evaluation parameters
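The per-class values reported above follow directly from the confusion matrix. The sketch below recomputes them in plain Python; the matrix layout (rows are true classes, columns are predicted classes, in the order Normal, PD) is an assumption inferred from the counts in the text, and the function name is hypothetical.

```python
def per_class_metrics(confusion, class_index):
    """Compute TP, FP, FN, TN and derived rates for one class.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    Assumes every denominator is non-zero, which holds for the matrix below.
    """
    size = len(confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp
    fp = sum(confusion[row][class_index] for row in range(size)) - tp
    tn = sum(sum(row) for row in confusion) - tp - fn - fp
    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Counts from the text: 183 Normal and 66 PD test images, all classified correctly.
confusion = [[183, 0],
             [0, 66]]

normal = per_class_metrics(confusion, 0)
pd_class = per_class_metrics(confusion, 1)
accuracy = (confusion[0][0] + confusion[1][1]) / sum(sum(row) for row in confusion)

print(normal)
print(pd_class)
print(accuracy)   # 1.0
```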

Precision, the fraction of positive predictions that are correct, indicates the basic reliability of the proposed model; the proposed model has a precision of 1. Sensitivity and specificity reflect the rates of false negatives and false positives, respectively. Sensitivity is the ability of a test to correctly identify people who have a certain disease or condition; the proposed model has a sensitivity of 1, meaning that it detected every PD case in the test set. Specificity is the ability of a test to correctly rule out people who do not have the disease or condition; the proposed model has a specificity of 1. The accuracy obtained for the modified ResNet50V2 model is shown in Table 3.

Table 3 Accuracy obtained for modified ResNet50V2 model

To avoid the performance degradation that affects deep neural networks, the residual blocks provide an identity mapping that links activations from earlier in the network. The skip connections also help to alleviate vanishing and exploding gradients. The residual link gives the data an alternative route that bypasses certain layers on the way to later stages of the network. ResNet also limits the computational cost in its first layer: this layer halves the number of rows and columns while using only about 240 M FLOPs, and the subsequent max pooling operation halves them again.
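The figures in the last two sentences can be checked with a short calculation. The sketch assumes the standard ResNet stem (a 7 × 7 convolution with 64 filters, stride 2, and padding 3, followed by a 3 × 3 max pooling with stride 2 and padding 1) and a 224 × 224 RGB input; the input size is an assumption, since the paper does not state it, and one multiply-accumulate is counted as two FLOPs.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


input_size = 224       # assumed input height and width
input_channels = 3     # RGB
filters = 64

after_conv = conv_output_size(input_size, kernel=7, stride=2, padding=3)
after_pool = conv_output_size(after_conv, kernel=3, stride=2, padding=1)

# One multiply-accumulate per kernel weight, per output position, per filter.
stem_macs = 7 * 7 * input_channels * filters * after_conv * after_conv
stem_flops = 2 * stem_macs

print(after_conv)   # 112, rows and columns halved by the first layer
print(after_pool)   # 56, halved again by max pooling
print(stem_flops)   # 236027904, i.e. roughly 240 M FLOPs
```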

Adding a batch normalization layer at the end makes the model less sensitive to the hyperparameter settings. It also reduces internal covariate shift and the dependence of the gradients on the scale of the parameters or their initial values, so weight initialisation becomes less critical. The primary benefit of the dropout layer is that it stops all neurons in a layer from updating their weights at the same time. Because the updates are carried out in random groups, the neurons do not all converge to the same objective, and the weights are decorrelated. As a result, the proposed model obtained the best result when compared with the existing methodologies (Table 4).
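The random grouping performed by dropout can be sketched in plain Python as below. This shows the common "inverted dropout" formulation, in which the surviving units are scaled by 1 / (1 − rate) during training; the rate of 0.5 and the function name are illustrative assumptions, as the paper does not report the dropout rate it used.

```python
import random


def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during training
    and scale the surviving units by 1 / (1 - rate); pass through at inference.

    Assumes 0 <= rate < 1.
    """
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)                     # fixed seed so the sketch is repeatable
activations = [1.0] * 1000

dropped = dropout(activations, rate=0.5, rng=rng)
zero_fraction = dropped.count(0.0) / len(dropped)

print(sorted(set(dropped)))                # [0.0, 2.0]
print(zero_fraction)                       # close to 0.5
print(dropout([1.0, 2.0], rate=0.5, rng=rng, training=False))   # [1.0, 2.0]
```

Because a different random subset of units is silenced at every step, no fixed group of neurons is updated together throughout training.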

Table 4 Comparative analysis

The proposed modified ResNet model achieved an accuracy of 100%, the highest possible. DenseNet201 achieved an accuracy of 71%, NASNet 81%, and VGG19 82%. MobileNetV2 achieved 95% and MobileNet 98%. InceptionV3 achieved 99%.

Conclusion

Parkinson’s disease is a neurological condition that worsens with time. Movement difficulties are often the first indicator of the condition. Dopamine is a chemical produced in the brain that makes smooth and coordinated muscle activity possible. Parkinson’s disease is characterized by the gradual death of substantia nigra cells, which causes a corresponding drop in dopamine levels. The symptoms of Parkinson’s disease often do not appear until dopamine levels have declined by 60–80 percent. This research proposes a deep learning-based method for identifying PD from brain MRI data. The proposed approach employs a modified ResNet50V2 model for image classification, in which the basic ResNet50V2 model is extended with additional dense and dropout layers to improve performance. The proposed model achieved 100 percent accuracy on the test set.