1 Introduction

Research in deep learning has been growing fast in the medical imaging field, including Computer-Aided Diagnosis (CAD), medical image analysis, and radiomics. CAD is a rapidly developing area of research in the medical industry. Recent machine learning research promises improved efficiency in the detection of disease. Here, computers acquire intelligence by learning from data [1]. There are distinct types of deep learning techniques, which are used to analyze the data sets (Fig. 1).

Fig. 1

T1-weighted MRI scans acquired at 3 T in the coronal (left), axial (center), and sagittal (right) planes

One of the signs of Alzheimer’s disease (AD) is the buildup of amyloid plaques between nerve cells in the brain [2]. To diagnose Alzheimer’s disease, it is necessary to examine the internal structures of the brain, which can be visualized with various existing scanning techniques.

These scanning methods include angiography, Computed Tomography (CT), Magnetic Resonance Imaging (MRI), MRI Angiography (MRA), dynamic CT or dynamic MRI, Magnetoencephalography (MEG), Flow-sensitive MRI (FSMRI), functional MRI (fMRI), etc. Structural MRI is a key imaging biomarker in AD because cerebral atrophy has been shown to be closely linked with cognitive symptoms.

Deep Learning (DL) is a branch of machine learning inspired by the structure and function of the human brain, and it aims to discover multiple levels of distributed representations. Recently, many deep learning methods have been applied to conventional artificial intelligence problems. There are two main groups of deep learning models, which differ in how data flow through the network. In feed-forward networks, information flows in only one direction, from the input layer to the output layer. In contrast, recurrent networks have feedback connections that allow information from past inputs to affect the current output.

In supervised classification problems, applying Deep Learning involves two main steps. The first, training, uses the training set, a portion of the available dataset, to adjust the network parameters for the classification task. The second, testing, uses the remaining portion, called the test set, to determine whether the trained model can correctly predict the class of new observations. When little data is available, the training and testing phases can be run several times on different training and test splits of the original data, and the performance of the model is then estimated as the average over the runs. This approach is known as cross-validation. The division into training and testing phases is not unique to Deep Learning; it is also used in conventional ML methods [3].
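The cross-validation procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the protocol of any reviewed study: the function names `k_fold_splits` and `cross_validate` and the callback `train_and_score` are hypothetical, and the shuffling seed is an arbitrary choice.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled samples as evenly as possible over k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(n_samples, k, train_and_score):
    """Average the score returned by train_and_score over all k folds.

    train_and_score(train, test) is a placeholder for fitting a model on
    the training indices and returning its score on the test indices.
    """
    scores = [train_and_score(train, test)
              for train, test in k_fold_splits(n_samples, k)]
    return sum(scores) / len(scores)
```

Each sample appears in exactly one test fold, so every observation is used for testing once and for training k - 1 times.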

The paper is organized as follows. Section 2 describes various deep learning techniques used to detect Alzheimer’s disease. Section 3 concludes the paper.

2 Related Works

This section reviews existing Deep Learning approaches used to detect Alzheimer’s disease from structural and functional MRI scans.

2.1 Autoencoder

An autoencoder is an artificial neural network that learns features in an unsupervised way by minimizing the reconstruction error [4]. The aim of an autoencoder is to learn a representation of a collection of data, typically for the purpose of data compression. Training is based on optimizing a cost function that measures the error between the input and its reconstruction at the output; this error is propagated back through the network by backpropagation. An autoencoder consists of an encoder followed by a decoder. The encoder and decoder can each have multiple layers; for simplicity, however, we assume that each has only one layer. The basic structure of an autoencoder is shown in Fig. 2.

Fig. 2

Basic structure of autoencoder
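To make the encoder, decoder, and reconstruction cost concrete, the following is a minimal sketch of an autoencoder with one encoder layer and one decoder layer, trained by plain gradient descent on the mean squared reconstruction error. The class name `TinyAutoencoder`, the sigmoid encoder, the linear decoder, the learning rate, and the toy one-hot data are all assumptions made for illustration; none of them is taken from the reviewed studies.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyAutoencoder:
    """One-layer sigmoid encoder followed by a one-layer linear decoder."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                      for _ in range(n_hidden)]
        self.b_enc = [0.0] * n_hidden
        self.W_dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                      for _ in range(n_inputs)]
        self.b_dec = [0.0] * n_inputs

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W_enc, self.b_enc)]

    def decode(self, h):
        return [sum(w * hj for w, hj in zip(row, h)) + b
                for row, b in zip(self.W_dec, self.b_dec)]

    def loss(self, x):
        """Mean squared error between the input and its reconstruction."""
        x_hat = self.decode(self.encode(x))
        return sum((a - b) ** 2 for a, b in zip(x_hat, x)) / len(x)

    def train_step(self, x, lr=0.1):
        h = self.encode(x)
        x_hat = self.decode(h)
        n = len(x)
        # Gradient of the cost with respect to each reconstructed value.
        d_out = [2.0 * (xh - xi) / n for xh, xi in zip(x_hat, x)]
        # Backpropagate through the decoder weights and the sigmoid.
        d_hid = [sum(d_out[i] * self.W_dec[i][j] for i in range(n))
                 * h[j] * (1.0 - h[j]) for j in range(len(h))]
        for i in range(n):
            for j in range(len(h)):
                self.W_dec[i][j] -= lr * d_out[i] * h[j]
            self.b_dec[i] -= lr * d_out[i]
        for j in range(len(h)):
            for i in range(n):
                self.W_enc[j][i] -= lr * d_hid[j] * x[i]
            self.b_enc[j] -= lr * d_hid[j]

# Toy data: four one-hot vectors compressed to a two-unit code.
data = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
model = TinyAutoencoder(n_inputs=4, n_hidden=2)
loss_before = sum(model.loss(x) for x in data) / len(data)
for _ in range(500):
    for x in data:
        model.train_step(x)
loss_after = sum(model.loss(x) for x in data) / len(data)
```

After training, `model.encode(x)` gives the compressed two-value representation of an input, and the average reconstruction error is lower than before training.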

Jha and Kwon proposed a method based on autoencoders that uses structural MRI data from the Open Access Series of Imaging Studies (OASIS) database [5]. Their deep learning architecture consists of sparse autoencoders, Scaled Conjugate Gradient (SCG) optimization, stacked autoencoders, and a softmax output layer, and is designed to overcome the bottleneck and support the discrimination of AD from normal healthy controls. Compared with earlier workflows, this technique requires fewer labeled training examples and minimal prior knowledge, and it performs dimensionality reduction and data fusion at the same time. The method improves binary classification performance and reaches 91.6% accuracy.
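The role of a softmax output layer can be illustrated with a short sketch: it converts the raw scores produced by the last layer into class probabilities. The function names, the label names "NC" and "AD", and the example scores are illustrative assumptions and are not taken from [5].

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits, labels=("NC", "AD")):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

For example, `predict([2.0, -1.0])` selects "NC" because the first score is larger, and `softmax([0.0, 0.0])` assigns probability 0.5 to each class.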

Bhatkoti and Paul [6] studied the effectiveness of the k-sparse Autoencoder (KSA) algorithm in a deep learning structure for the diagnosis of Alzheimer’s disease. The study compared a modified k-sparse approach with the non-modified one in this application. It used MRI scans, CSF data, and PET images from 150 patients. MRI images for comparison with the research images, together with CSF and PET data, were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). The MRI scans were preprocessed by correcting orientation errors and by skull stripping to expose the underlying tissues. The images were then normalized and smoothed. Patches were extracted, and masks for different brain subregions were obtained and transformed during registration to the Automated Anatomical Labeling (AAL) template. A feed-forward convolutional pair predictor neural network was developed. The feature vectors were flattened, concatenated, and sorted, and then fed into a feed-forward multi-layer perceptron. The output was predicted using a probability function. A three-dimensional convolutional neural network was used, with 384 input neurons and 200 hidden neurons in the multi-layer perceptron. Training used 20-fold cross-validation. The research took a practical approach, using actual MRI images from patient screening and comparing them with data from ADNI as well as data employed in previous studies. The method achieved an efficiency of 63.24% in the early diagnosis of Alzheimer’s disease, and the study reports that KSA raises the efficiency to 74.05%.
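The core idea of a k-sparse autoencoder is that only the k largest hidden activations are kept and all others are set to zero before decoding. The sketch below shows this selection step in its standard form; the function name `k_sparse` is hypothetical, and the modification studied in [6] is not described here and is therefore not reproduced.

```python
def k_sparse(activations, k):
    """Keep the k largest activations and set the rest to zero."""
    if k >= len(activations):
        return list(activations)
    # Indices of the k largest values; ties go to the earlier position.
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    keep = set(order[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]

hidden = [0.1, 0.9, 0.3, 0.7]
sparse_hidden = k_sparse(hidden, 2)  # [0.0, 0.9, 0.0, 0.7]
```

Enforcing sparsity in this way limits how many hidden units can contribute to each reconstruction, which encourages each unit to capture a distinct feature.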

2.2 Convolutional Neural Network

A Convolutional Neural Network (CNN) is made up of convolutional layers, pooling layers, normalization layers, and finally fully connected layers. The architecture of a CNN is designed to take advantage of the 2D structure of an input image. CNNs require little preprocessing compared with other image processing methods. The advantage of CNNs is that they have fewer parameters than fully connected networks with the same number of hidden units, which makes them easier to train. The convolutional layer takes an M × M × R input image, where M is the height and width of the image and R is the number of channels.
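The parameter saving can be checked with simple arithmetic. The sketch below assumes K filters of size N × N × R with one bias each, no padding, and a stride of 1; the symbols N and K, the function names, and the example sizes (a 32 × 32 × 3 input with six 5 × 5 filters) are illustrative assumptions.

```python
def conv_output_size(m, n, stride=1, padding=0):
    """Side length of the activation map for an m x m input and n x n filter."""
    return (m + 2 * padding - n) // stride + 1

def conv_params(n, r, k):
    """Weights and biases of k filters of size n x n x r."""
    return k * (n * n * r + 1)

def dense_params(m, r, units):
    """Weights and biases of a fully connected layer on an m x m x r input."""
    return units * (m * m * r + 1)

side = conv_output_size(32, 5)            # 28
hidden_units = 6 * side * side            # 4704 units in the output volume
shared = conv_params(5, 3, 6)             # 456 parameters
unshared = dense_params(32, 3, hidden_units)  # 14455392 parameters
```

With the same 4704 hidden units, the convolutional layer needs 456 parameters, whereas a fully connected layer would need more than 14 million, because each filter is reused at every spatial position.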

The convolutional layer plays a crucial role in the CNN architecture and is the basic building block of the network. The parameters of a CONV layer consist of a set of learnable filters. Each filter is spatially small but extends through the full depth of the input volume. During the forward pass, each filter is slid across the input, and the input values under the filter are multiplied elementwise by the filter values. These products are summed to produce a two-dimensional activation map for that filter. The activation maps of all filters are then stacked to produce the output volume. A pooling layer is usually inserted between successive CONV layers. Its function is to reduce the dimensionality of each feature map while preserving the important information. The pooling layer operates on each feature map independently and resizes the input spatially. Spatial pooling can be of different types: max, average, sum, etc. Researchers have developed many successful CNN architectures, such as LeNet, AlexNet, ZF Net, GoogLeNet, VGGNet, and ResNet. A major constraint in constructing ConvNet architectures is the limited memory of GPUs [7,8,9].
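The forward pass of a convolutional layer and the effect of max pooling can be sketched for a single-channel input as follows. This is a simplified illustration: it uses one channel instead of a full-depth volume, no padding, and a stride of 1, and, as in most CNN libraries, the filter is not flipped. The function names and the example values are assumptions.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and return the activation map."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            # Multiply the values under the filter and sum the products.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, w - size + 1, size)]
            for r in range(0, h - size + 1, size)]

image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [0, 1, 2, 3]]
kernel = [[1, 0],
          [0, -1]]
activation = conv2d_valid(image, kernel)  # [[-4, -4, 2], [-4, -4, 4], [6, 6, 6]]
pooled = max_pool(image)                  # [[5, 6], [8, 9]]
```

The 4 × 4 input yields a 3 × 3 activation map for the 2 × 2 filter, and 2 × 2 max pooling halves each spatial dimension while keeping the largest value of every window.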

Sarraf and Tofighi [10] used a CNN deep learning architecture (LeNet) that was trained and tested on a large number of images and distinguished AD data from normal control data with 96.86% accuracy. The architecture of LeNet-5 is shown in Fig. 3.

Fig. 3

LeNet-5 architecture

Ciprian et al. [11] use the DemNet architecture, a modified version of the 16-layer CNN developed by the Visual Geometry Group (VGG) at the University of Oxford for the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC). It consists of 13 convolutional layers followed by 3 fully connected layers. This architecture classifies AD and Mild Cognitive Impairment (MCI) from healthy controls (HC) on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset with an accuracy of 91.85%.
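The layer counts quoted above can be verified against the standard 16-layer VGG configuration, sketched below as a plain list. This shows the original VGG-16 layout only; the specific modifications made in DemNet are not described in this review and are not reproduced, and the helper names are hypothetical.

```python
# Standard VGG-16 layout: numbers are the output channels of 3x3
# convolutional layers, "M" marks a 2x2 max-pooling layer.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
# Fully connected layers; 1000 is the number of ImageNet classes.
VGG16_FC = [4096, 4096, 1000]

def count_layers(conv_cfg, fc_cfg):
    """Return (number of convolutional layers, number of FC layers)."""
    n_conv = sum(1 for v in conv_cfg if v != "M")
    return n_conv, len(fc_cfg)

def final_feature_map_side(input_side, conv_cfg):
    """3x3 convolutions with padding 1 keep the size; pooling halves it."""
    side = input_side
    for v in conv_cfg:
        if v == "M":
            side //= 2
    return side

layers = count_layers(VGG16_CONV, VGG16_FC)            # (13, 3)
side = final_feature_map_side(224, VGG16_CONV)         # 7
```

Adapting such a network to a diagnostic task would, as an assumption, at least require replacing the 1000-way output layer with one sized to the number of diagnostic classes.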

Glozman and Liba [12] propose a method for AD classification using the AlexNet architecture, which includes five convolutional layers and three fully connected layers. Here, a network pretrained on natural images is fine-tuned to classify neuroimaging data in which the differences between the classes are very subtle, even for the human eye. Their results suggest that, with the available data, the network can learn to separate the two extreme classes (NC vs. AD), but it does not achieve good accuracy on a three-way classification task.

3 Conclusion

In this paper, different deep learning methods for the diagnosis of Alzheimer’s disease are reviewed. Generally, the disease develops through three stages: normal control, MCI, and AD. Alzheimer’s disease causes changes in the brain that affect both its larger structures and its small cells. It affects parts of the limbic system, mainly the hippocampus, then the cerebral cortex, and finally the brain stem. Most of the methods use MRI images because MRI is considered the preferred neuroimaging examination for Alzheimer’s disease.

Early diagnosis of AD and MCI based on deep learning methods requires only minimal prior knowledge for model optimization. The advantage of the autoencoder technique is that it requires fewer labeled training examples and minimal prior knowledge. Compared with the autoencoder technique, the CNN deep learning architecture, which was trained and tested on a large number of images, classified AD more accurately.