1 Introduction

Alzheimer’s disease (AD) is the most common form of dementia and demands substantial medical attention. Early and accurate assessment of AD is required for timely clinical intervention and effective patient care [1]. AD is a chronic neurodegenerative brain disorder that gradually destroys brain cells, causes memory and cognitive problems, and ultimately accelerates the loss of the ability to carry out even the most fundamental tasks. Physicians use neuroimaging and computer-aided diagnostic methods to categorize AD in its early phases. According to the World Alzheimer’s Association’s evaluation of the most recent census, nearly 4.7 million Americans over 65 are living with this disease, and it is predicted that 60 million people could be afflicted by AD over the next fifty years. Globally, Alzheimer’s disease accounts for 60 to 80 percent of all dementia cases. One person develops dementia every three seconds, and 60% of those cases are caused by AD. Research on early AD diagnosis is ongoing in an effort to slow the aberrant degeneration of neurons in the brain; early diagnosis also benefits the patient’s family financially and emotionally. A person with this illness experiences memory loss, unusual behavior, and language difficulties. The disease is associated with damage to the entorhinal cortex and hippocampal regions of the brain as well as with tangled bundles of neurofibrillary fibers. Navigational problems and episodic memory impairment are typical initial symptoms of this condition. Higher-order symptoms include memory loss, poor judgment, difficulty recognizing objects, difficulty paying bills and operating a vehicle, and misplacing objects. Improved computer-aided diagnostic tools are needed to interpret MRI images and determine whether patients have Alzheimer’s disease or are healthy. Conventional deep learning algorithms perform AD classification on raw MRI images, using the cortical surface as an input to the CNN.

This research presents a methodology for extracting discriminative features using a convolutional neural network. The model is built from the ground up to classify the stages of Alzheimer’s disease more precisely while reducing the number of parameters and the computational cost. The models are evaluated by training them on the Kaggle MRI dataset, which contains four classes of dementia: Mild Dementia (MID), Moderate Dementia (MOD), Non-Demented (ND), and Very Mild Dementia (VMD). The primary goal of this investigation was to report classification accuracies along with other performance indicators such as precision and recall. The most significant outcome of this study is the analysis of how AD detection progresses from prediction to classification. The results show that the suggested model, with fewer parameters, outperforms all previous models. The main contributions of this work can be summarized as follows:

  • A novel “DementiaNet” architecture was developed to aid Alzheimer’s detection; the model can accurately detect Alzheimer’s disease within a relatively short time.

  • Our suggested framework “DementiaNet” employs a sufficient number of convolution filters to capture all of the essential features, increasing the efficiency of feature learning and producing a more accurate and reliable output.

  • The proposed model uses transfer learning during the feature extraction stage, allowing it to take advantage of the learned weights of a successful pre-trained model to further enhance performance.

  • To ensure the quality of the input data, we applied a Gaussian filter to remove noise. By reducing the effect of noise in the MRI images, this step significantly improved the model’s accuracy.

  • Furthermore, we addressed the issue of class imbalance by increasing the size of the minority classes through data augmentation techniques.

The rest of this work is structured as follows: Sect. 2 reviews previous investigations of AD diagnosis and classification, Sect. 3 describes the methodology for building and evaluating the proposed CNN model, Sect. 4 presents the experimental and evaluation results, and Sect. 5 concludes the paper and discusses future work.

2 Literature Review

Recent studies indicate that significant progress has been made in identifying and classifying AD using multimodality data. Multimodality data includes positron emission tomography (PET), X-rays, MRI, computed tomography (CT), and the patient’s clinical records [2]. Although MRI is more effective than CT scans in detecting Alzheimer’s disease when using traditional machine learning (ML), these techniques still face a number of significant challenges [3]. In recent years, considerable progress has been achieved in the automatic classification of Alzheimer’s disease using multiple methodologies. KNN [4, 5], Random Forest (RF) [6], SVM [7, 8], Decision Tree (DT) [9], Deep Neural Network (DNN), CNN [10], and dynamic connectivity networks (DCN) [11] have all been used to detect AD in MRI images.

Liu et al. [12] present a method for detecting Alzheimer’s disease that uses spectrogram features derived from voice data. This strategy can help patients learn about the progression of their illness early on, allowing them to take preventive action. AlexNet and GoogLeNet were used, achieving average classification accuracies of 91.40% and 93.02%, respectively, which is considerably lower than the performance of our work. Arco et al. [7] presented an SVM-based technique coupled with Searchlight and Principal Component Analysis (PCA) on the ADNI dataset. The Searchlight method performed better than PCA, with a maximum accuracy of 80.9%. Helaly et al. [13] used deep learning (DL) techniques (CNN and VGG19) to classify four stages of Alzheimer’s disease. Separate binary medical image classifications were also performed for each pair of AD phases. For 2D and 3D multiclass AD classification, the CNN approaches achieved accuracies of 93.61% and 95.17%, respectively, whereas VGG19 attained a classification accuracy of 97% for multiclass AD. This categorization accuracy is lower than that of our work.

The first transfer learning-based method for multi-class detection of AD stages and cognitive impairment was introduced by Shanmugam et al. [14]. They used 6,000 MRI images from ADNI for training and testing on GoogLeNet, AlexNet, and ResNet-18 networks; the ResNet-18 network achieved the highest classification accuracy (98.63%). Kong et al. [15] developed a PET-MRI image fusion technique and a 3D CNN for multi-class AD classification using deep learning, based on a total of 740 3D images from the ADNI collection. Another study proposed the A3C-TL-GTO framework to classify MRI images and identify AD; the Alzheimer’s Dataset (four image classes) and the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset were used to construct and assess this empirical framework for automatic and precise AD classification.

In summary, the excellent capacity of pre-trained models to categorize image data in current research serves as the inspiration for our study. We propose a novel method for deep CNN-based prediction of Alzheimer’s disease that builds on the effective global and local search capabilities of multi-objective optimization algorithms. In this study, “DementiaNet” is used to derive and optimize a suitable CNN architecture through a transfer learning approach over a wide variety of hyperparameters. To distinguish our work, we formulate the convolutional neural network parameters as a multi-objective function, which makes it easier to discover the best parameters in a given search space and overcomes the drawbacks of manually configured classifiers.

3 Methodology

This section describes our proposed methodology for Alzheimer’s disease detection from MRI images, as illustrated in Fig. 1. We utilized a transfer learning approach to leverage a pre-trained “EfficientNet” model for feature extraction, followed by training a deep convolutional neural network (CNN) as a classifier. Our motivation behind using “EfficientNet” for feature extraction was to benefit from its superior performance in image recognition tasks while reducing the computational complexity and overfitting issues commonly encountered in deep learning models. The resulting model aims to accurately distinguish between Alzheimer’s and non-Alzheimer’s subjects based on the features extracted from MRI images, potentially aiding in early diagnosis and treatment of the disease.

Fig. 1. Workflow Diagram for Alzheimer Detection Using “DementiaNet”

3.1 Dataset Collection and Preprocessing

We have evaluated the performance of our proposed “DementiaNet” for Alzheimer’s detection on the “Alzheimer MRI Processed Dataset” collected from Kaggle. As the dataset is imbalanced, with some classes having far fewer samples, we applied data augmentation techniques to balance the dataset and improve model performance. The dataset used in this study consists of 6,400 MRI images resized to 128 × 128 pixels and classified into four categories: Mild Demented, Moderate Demented, Non-Demented, and Very Mild Demented. The dataset is available on Kaggle and contains 896 Mild Demented, 64 Moderate Demented, 3,200 Non-Demented, and 2,240 Very Mild Demented images.
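As a minimal sketch of this balancing step, the minority classes (e.g. the 64-image Moderate Demented class) can be oversampled with light, label-preserving augmentations until every class matches the majority class; the directory layout, transform choices, and parameter values below are illustrative assumptions rather than the exact configuration used.

```python
# Hypothetical sketch: oversample minority classes with random augmentations
# until every class reaches the majority-class count (paths/settings assumed).
import os
import random
import tensorflow as tf

DATA_DIR = "Alzheimer_MRI/train"   # assumed layout: one sub-folder per class
classes = sorted(os.listdir(DATA_DIR))
counts = {c: len(os.listdir(os.path.join(DATA_DIR, c))) for c in classes}
target = max(counts.values())      # balance every class up to the largest one

augment = tf.keras.Sequential([    # light, label-preserving transforms
    tf.keras.layers.RandomRotation(0.05),
    tf.keras.layers.RandomZoom(0.1),
    tf.keras.layers.RandomTranslation(0.05, 0.05),
])

for c in classes:
    folder = os.path.join(DATA_DIR, c)
    files = os.listdir(folder)
    for i in range(target - counts[c]):
        src = os.path.join(folder, random.choice(files))
        img = tf.keras.utils.load_img(src, target_size=(128, 128))
        arr = tf.keras.utils.img_to_array(img)[None, ...]   # add batch axis
        aug = augment(arr, training=True)[0].numpy()
        tf.keras.utils.save_img(os.path.join(folder, f"aug_{c}_{i}.png"), aug)
```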

In the preprocessing phase of this work, we first rescaled the images to 220 × 220 pixels, as “EfficientNet” accepts higher-resolution images for better performance. After data normalization, a Gaussian filter was applied to reduce noise in the dataset. Data augmentation was then applied to obtain a more generalized model. Some sample images from our dataset are shown below in Table 1.

Table 1. Examples of Alzheimer MRI images from each class
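A minimal sketch of the preprocessing chain described above (resize to 220 × 220, normalization, Gaussian denoising) is given below; the kernel size and sigma of the Gaussian filter are assumed values for illustration.

```python
# Illustrative preprocessing: resize to 220x220, normalize to [0, 1],
# then apply a Gaussian filter to suppress noise (kernel/sigma are assumptions).
import cv2
import numpy as np

def preprocess(path: str) -> np.ndarray:
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)      # MRI slices are single-channel
    img = cv2.resize(img, (220, 220))                 # EfficientNet prefers larger inputs
    img = img.astype(np.float32) / 255.0              # normalization
    img = cv2.GaussianBlur(img, (3, 3), sigmaX=1.0)   # Gaussian noise reduction
    return np.stack([img] * 3, axis=-1)               # replicate to 3 channels for EfficientNet
```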

3.2 Feature Extraction Using EfficientNet

“EfficientNet” has been shown to surpass other commonly used pre-trained models in various computer vision tasks, owing to its innovative scaling technique that optimizes the model’s depth, width, and resolution concurrently. In our study, we employed “EfficientNet” [16] for feature extraction, leveraging its ability to capture high-level features from input images, such as edges, textures, and shapes. To further improve the quality of our features, we applied a Gaussian filter to reduce noise before feature extraction, as described in Fig. 2. By doing this, we aimed to ensure that our model could extract meaningful features and avoid including irrelevant information from the input images. Our feature extraction method involves a combination of convolutional, pooling, and activation layers that progressively extract more complex and abstract features from the input images.
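As a hedged sketch of this transfer-learning step, a pre-trained EfficientNet backbone (ImageNet weights, classification head removed) can be frozen and used as a fixed feature extractor; the specific variant (EfficientNetB0) is an assumption made for illustration.

```python
# Sketch: frozen EfficientNetB0 backbone used as a fixed feature extractor.
import tensorflow as tf

def build_feature_extractor(input_shape=(220, 220, 3)):
    base = tf.keras.applications.EfficientNetB0(
        include_top=False,        # drop the ImageNet classification head
        weights="imagenet",
        input_shape=input_shape,
    )
    base.trainable = False        # keep the pre-trained weights fixed
    inputs = tf.keras.Input(shape=input_shape)
    x = tf.keras.applications.efficientnet.preprocess_input(inputs)
    features = base(x, training=False)   # feature maps, e.g. (batch, 7, 7, 1280)
    return tf.keras.Model(inputs, features, name="efficientnet_features")
```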

Fig. 2. Feature Extraction Using Transfer Learning by “EfficientNet”

3.3 Constructing the Feature Learning and Classification Model: DementiaNet

The details of the proposed “DementiaNet” model are illustrated in Fig. 3. The input features obtained from the pre-trained “EfficientNet” model are passed through a series of convolutional and pooling layers to extract relevant features. The first convolutional layer consists of 32 filters with a kernel size of (3, 3) and uses the rectified linear (ReLU) activation function. Its output is normalized using batch normalization and then max pooled with a (2, 2) filter to reduce the spatial dimensions.

Similarly, the next convolutional layer has 64 filters with a kernel size of (3, 3) and is followed by batch normalization and max pooling. This is repeated for two more layers, with the number of filters increasing to 128 in the third layer and then decreasing back to 64 in the fourth layer.

After the final convolutional layer, the output is flattened and passed through two fully connected dense layers with ReLU activation, consisting of 256 and 128 neurons, respectively. These dense layers help to learn more complex patterns in the extracted features. Dropout with a rate of 0.5 is applied after the second dense layer to prevent overfitting. The output layer has four neurons, one for each class (mild demented, moderate demented, non-demented, and very mild demented), and uses the softmax activation function to produce a probability distribution over the four classes, so that the class probabilities sum to 1. The model is trained using categorical cross-entropy loss and the Adam optimizer. Its performance is evaluated using accuracy metrics, and the hyperparameters are tuned to optimize performance.
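A minimal Keras sketch of the classifier described above follows; the layer sizes mirror the text, while the input shape (feature maps from the frozen EfficientNet backbone) and the "same" padding choices are assumptions made so the sketch runs end to end.

```python
# Sketch of the "DementiaNet" classifier; layer sizes follow the text,
# input shape and padding are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

def build_dementianet(input_shape=(7, 7, 1280), num_classes=4):
    model = keras.Sequential(name="DementiaNet")
    model.add(keras.Input(shape=input_shape))
    for filters in (32, 64, 128, 64):          # four convolutional blocks
        model.add(layers.Conv2D(filters, (3, 3), padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling2D((2, 2), padding="same"))
    model.add(layers.Flatten())
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dropout(0.5))             # regularization before the output layer
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model                               # compiled in the training sketch (Sect. 4.1)
```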

Fig. 3. Detailed Model Architecture of “DementiaNet”

4 Experimental Results

In this work, we have used hold-out cross-validation to train the model. The training, validation, and test sets consist of 60%, 20%, and 20% of the total dataset (including augmented data), respectively. The “DementiaNet” model was developed using the Keras and TensorFlow libraries. We employ several assessment criteria, including accuracy, precision, recall, and F1-score, to assess the effectiveness of the “DementiaNet” architecture and compare our findings with those of prior studies.
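A brief sketch of the 60/20/20 hold-out split and the reported metrics is shown below; the use of scikit-learn and the stratified splitting are assumptions for illustration.

```python
# Sketch: stratified 60/20/20 hold-out split and standard classification metrics.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def split_60_20_20(X, y, seed=42):
    """Stratified train/validation/test split (60%/20%/20%)."""
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        X, y, test_size=0.4, stratify=y, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)

def evaluate(model, X_test, y_test):
    """Accuracy and macro-averaged precision, recall, and F1-score."""
    y_pred = np.argmax(model.predict(X_test), axis=1)
    acc = accuracy_score(y_test, y_pred)
    prec, rec, f1, _ = precision_recall_fscore_support(y_test, y_pred, average="macro")
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1}
```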

4.1 Hyperparameter Tuning

As shown in Fig. 4, the deep CNN model trained for Alzheimer’s detection performs well. During training, the training accuracy increases gradually, plateaus after around 15 epochs, and eventually peaks at 99% between epochs 27 and 30. The validation accuracy reaches a peak of 97.5% at the end of training, indicating that the model generalizes well to the unseen validation set.

Regarding the loss function, it is observed that the loss value decreases rapidly in the initial epochs and eventually becomes saturated around the 17th epoch, with a value below 0.01. This suggests that the model has learned to minimize error and make accurate predictions during training.

The high training accuracy, high validation accuracy, and low loss value indicate that the model has learned to classify MRI images of Alzheimer’s disease with a high degree of accuracy. In this work, we used the Adam optimizer, a learning rate of 0.0001, and sparse categorical cross-entropy as the loss function.
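Continuing from the earlier sketches, the training configuration reported above can be expressed as follows; the batch size and the early-stopping callback are assumptions not stated in the text.

```python
# Sketch: compile and train with the reported hyperparameters
# (Adam, learning rate 1e-4, sparse categorical cross-entropy, ~30 epochs).
import tensorflow as tf

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="sparse_categorical_crossentropy",   # integer class labels 0..3
    metrics=["accuracy"],
)
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=30,
    batch_size=32,                            # assumed batch size
    callbacks=[tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True)],
)
```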

Fig. 4. Training Accuracy, Validation Accuracy, Training Loss, and Validation Loss

4.2 Model Evaluation and Result Analysis

Our proposed “DementiaNet” has shown good classification performance in Alzheimer’s detection from brain MRI images. The confusion matrix in Fig. 5 shows that most of the test samples lie on the diagonal and that the misclassification rate is not significant. Figure 6 shows some misclassified samples of the very mild demented class that were wrongly predicted as non-demented. Analyzing the misclassified samples overall, most of them are confused with neighboring classes. Model evaluation metrics for our work are shown in Table 2.
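As a short sketch, the confusion matrix and the misclassified test samples discussed above can be obtained as follows; the class names and variable names continue the earlier sketches and are assumptions.

```python
# Sketch: confusion matrix, per-class report, and misclassified sample indices.
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

class_names = ["MildDemented", "ModerateDemented", "NonDemented", "VeryMildDemented"]

y_pred = np.argmax(model.predict(X_test), axis=1)
cm = confusion_matrix(y_test, y_pred)            # rows: true class, columns: predicted class
print(classification_report(y_test, y_pred, target_names=class_names))

misclassified = np.where(y_pred != y_test)[0]    # e.g. Very Mild Demented predicted as Non-Demented
print(f"{len(misclassified)} misclassified test samples")
```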

Fig. 5. Confusion Matrix for the Output of “DementiaNet”

Fig. 6. Misclassified Samples

Table 2. Performance comparison of Alzheimer’s detection models

Table 3 compares the proposed model to the most recent state-of-the-art models found in the literature. These models employed various architectures on the same datasets. Table 3 makes it clear that, compared to other techniques reported in the literature, our proposed transfer learning-based “DementiaNet” architecture offers the best prediction performance for the detection of Alzheimer’s disease.

Table 3. Comparison of the proposed framework with other state-of-the-art methods

5 Conclusion

The importance of Alzheimer detection lies in its potential impact on the early diagnosis of Alzheimer’s disease, which can significantly improve patient outcomes and quality of life. It can also help reduce the burden on the healthcare system by enabling earlier interventions and treatments. The proposed “DementiaNet” model, which uses “EfficientNet” as a feature extractor and a deep CNN [23,24,25] as a classifier for Alzheimer’s detection from MRI images, is effective. The model achieved a high accuracy of 97%, which indicates its potential for clinical applications in the early detection of Alzheimer’s disease. Moreover, data augmentation techniques were employed to handle imbalanced data, a common challenge in medical imaging datasets. This approach has shown promising results and could be explored further in future studies. As future work, this study could be extended to multi-center studies and tested on larger datasets. The model could also be fine-tuned and optimized to improve its performance further [25,26,27,28] and [29]. Additionally, the model could be evaluated for its generalizability on different populations and with different types of MRI scans.