
1 Introduction

Although the diagnosis of neurodegenerative diseases is mainly based on clinical criteria, neuroimaging in nuclear medicine plays important supportive roles in the diagnosis and differential diagnosis of neurodegenerative diseases and in the prediction of disease progression [1, 2]. Unlike magnetic resonance imaging (MRI), which depends on morphological changes of cortical and subcortical structures, positron emission tomography/computed tomography (PET/CT) provides quantitative evaluation of functional or molecular changes related to metabolism, proteinopathy, enzyme expression, transporters, or receptors. In addition to visual analysis, quantitative image analysis is essential to investigate the clinical significance of neuroimaging. Among quantitative approaches, voxel-based analysis and region-of-interest (ROI) or volume-of-interest (VOI) analysis are widely used for comparison between control (or normal) and patient groups. Statistical parametric mapping (SPM) is the most popular voxel-based approach; it demonstrates areas of the brain with a significant difference between normal controls and patients [3, 4]. ROI- or VOI-based image analysis performs calculations over the pixels of each ROI or VOI, as in the sketch below. Manual, semi-automatic, and automatic methods can be used to draw a region or volume. Although accurate, manual drawing is time-consuming, operator-dependent, and less reproducible; conversely, when automatic drawing is used, accurate region segmentation must be guaranteed in each patient for the robustness and reliability of the data analysis.
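
The following is a minimal, illustrative sketch of VOI-based quantification, assuming the PET volume and a binary VOI mask are available as NumPy arrays of the same shape; the array sizes and mask position are hypothetical.

```python
import numpy as np

# Minimal sketch of VOI-based quantification (illustrative only): given a
# PET volume and a binary VOI mask of the same shape, summarize the tracer
# uptake inside the VOI.
def voi_statistics(pet_volume: np.ndarray, voi_mask: np.ndarray) -> dict:
    voxels = pet_volume[voi_mask > 0]       # uptake values inside the VOI
    return {
        "mean": float(voxels.mean()),
        "max": float(voxels.max()),
        "n_voxels": int(voxels.size),       # multiply by voxel size for mL
    }

# Toy example with random data standing in for a real PET volume.
rng = np.random.default_rng(0)
pet = rng.random((64, 64, 64))
mask = np.zeros(pet.shape, dtype=np.uint8)
mask[20:30, 20:30, 20:30] = 1               # hypothetical cubic VOI
print(voi_statistics(pet, mask))
```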

Unlike traditional image analysis, machine learning, a subset of artificial intelligence, finds patterns in large datasets. Based on the training data, it builds a mathematical model to make predictions. A learning method can be unsupervised, semi-supervised, or supervised. Supervised learning requires labeled data to find the pattern, whereas unsupervised learning uses unlabeled data, and semi-supervised learning combines a small amount of labeled data with a large amount of unlabeled data. A machine learning model is trained on a large number of reproducible input data to extract features of clinical significance. After extraction, feature selection removes unnecessary features to reduce the training time, lower the risk of overfitting, and avoid dimensionality issues. Then, a classifier algorithm such as a support vector machine (SVM), random forest, or artificial neural network maps the features to a disease classification, as in the sketch below.
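
As a concrete illustration of this pipeline, the sketch below chains feature scaling, feature selection, and an SVM classifier with scikit-learn; the feature matrix, labels, and the choice of k are hypothetical toy values, not taken from any cited study.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Toy data standing in for extracted imaging features:
# 100 subjects x 50 regional features, binary labels (control vs. patient).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
y = rng.integers(0, 2, size=100)

# Pipeline mirroring the text: scale the features, select a subset to reduce
# training time and overfitting risk, then classify with an SVM.
clf = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif, k=10)),   # feature selection step
    ("svm", SVC(kernel="rbf")),                 # classifier
])
print(cross_val_score(clf, X, y, cv=5).mean())  # cross-validated accuracy
```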

As a part of machine learning, deep learning consists of artificial neural networks with multiple layers of interconnected nodes. Unlike traditional machine learning, deep learning performs feature extraction and learning by itself. For feature extraction and transformation, deep learning techniques are based on a cascade of multiple layers of nonlinear processing units. High-quality data and labels are the most important requirements for training and testing deep learning models. A dataset is typically composed of training, validation, and test sets. The training data are used to train the network: a loss function calculates the loss value in the forward propagation, and the learnable parameters are updated via backpropagation, as in the sketch below. The validation data are used to fine-tune hyper-parameters, and the test data to evaluate the performance of the model. This chapter will focus on artificial intelligence used for neuroimaging in nuclear medicine, including classification of diseases, segmentation of ROIs or VOIs, denoising, image reconstruction, and low-dose imaging.
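
The sketch below shows this train/validate cycle in PyTorch under simplifying assumptions: a placeholder network and hypothetical `train_loader`/`val_loader` objects that yield (image, label) batches.

```python
import torch
from torch import nn

# Placeholder model standing in for a real deep network; the learning rate
# is one of the hyper-parameters tuned on the validation set.
model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_one_epoch(train_loader):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)  # forward pass computes the loss
        loss.backward()                        # backpropagation of gradients
        optimizer.step()                       # update the learnable parameters

@torch.no_grad()
def validate(val_loader):
    model.eval()
    losses = [loss_fn(model(x), y).item() for x, y in val_loader]
    return sum(losses) / len(losses)           # guides hyper-parameter tuning
```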

2 Classification

2.1 Alzheimer’s Disease

Alzheimer’s disease (AD) is a neurodegenerative disease characterized by a decline in cognitive function. It mostly affects older people, so its prevalence is increasing with the growth of the elderly population. Early diagnosis of AD before the symptoms become severe is of utmost clinical importance, since it may provide opportunities for effective treatment. 18F-FDG PET/CT is one of the most useful modalities to support the clinical diagnosis of dementia, including AD. It shows changes in glucose metabolism of the brain across various disease entities related to dementia with high sensitivity and specificity. In patients with AD, a reduction of glucose metabolism is expected, starting from the mesial temporal cortex and extending to the posterior cingulate cortex (PCC), lateral temporal, inferior parietal, and prefrontal regions, which helps establish the diagnosis [5].

Deep learning methods have been studied for the evaluation of patients with AD. Several auto-encoders with multi-layered neural networks that combine multimodal features have been applied to AD classification [6]. In a study using a stacked auto-encoder to extract high-level features from multimodal ROIs and an SVM classifier, the proposed method was 95.9%, 85.0%, and 75.8% accurate for the diagnosis of AD, mild cognitive impairment (MCI), and MCI converters, respectively, on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset [7]. Recently, convolutional neural network (CNN) methods with 2D slices or 3D volume data from PET/CT or MRI scans have been applied to AD classification [8,9,10,11]. In 2D CNN models, the features from specific slices of axial, coronal, and sagittal scans were concatenated and used for AD classification. Using MRI volume data, skull stripping and gray matter segmentation were performed, and the slices with gray matter information were used as CNN model input. Compared to 2D CNN models, studies using 3D volume data have shown promising results. Using the ADNI MRI dataset without skull-stripping preprocessing, Hosseini-Asl et al. built a deep 3D CNN upon a convolutional auto-encoder, which was pre-trained to capture anatomical shape variations in structural brain MRI scans in the source domain [8]. The fully connected upper layers of the 3D CNN were then fine-tuned for each task-specific AD classification in the target domain. The proposed 3D deeply supervised adaptable CNN outperformed several previous approaches, including a plain 3D CNN model, other CNN-based methods, and conventional classifiers, in accuracy and robustness. Liu et al. used cascaded CNNs to learn the multi-level and multimodal features of MRI and PET brain images for AD classification [10]. In their method, multiple deep 3D CNNs were applied to different local image patches to transform the local brain image into more compact high-level features. Then, an upper high-level 2D CNN followed by a softmax layer was cascaded to ensemble the high-level features and generate the latent multimodal correlation features for the classification task. Finally, a fully connected layer followed by a softmax layer combined these learned features for AD classification. Without image segmentation or rigid registration, the method could automatically learn generic multi-level and multimodal features from multiple imaging modalities for classification. With an ADNI MRI and PET dataset of 397 subjects, including 93 AD patients, 204 MCI patients (76 MCI converters + 128 MCI non-converters), and 100 normal controls (NC), the proposed method demonstrated promising performance, with an accuracy of 93.26% for the classification of AD vs. NC and 82.95% for the classification of MCI converters vs. NC. A minimal 3D CNN classifier in this spirit is sketched below.
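
The sketch below is a deliberately small 3D CNN for a binary AD vs. NC task; it illustrates the general idea of 3D volume classification only and is not the architecture of any study cited above.

```python
import torch
from torch import nn

# Illustrative 3D CNN: two convolution/pooling stages followed by global
# average pooling and a linear classifier over a 1-channel brain volume.
class Simple3DCNN(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, n_classes),
        )

    def forward(self, x):                  # x: (batch, 1, D, H, W)
        return self.classifier(self.features(x))

volume = torch.randn(2, 1, 64, 64, 64)     # toy batch of brain volumes
print(Simple3DCNN()(volume).shape)         # (2, 2): scores for AD vs. NC
```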

Although studies have shown that various deep learning methods are effective for AD classification, model performance on external validation data, compared with performance on the training dataset, remains an issue to be resolved. In fact, the quality and properties of medical images can be affected by the image-acquisition environment, including the imaging acquisition system, acquisition protocol, reconstruction method, etc. Therefore, a model with enhanced generalization performance is needed to improve the clinical utility of a proposed method. In a recent study using FDG PET/CT, slice-selective learning using a BEGAN-based model, instead of 3D volume data, was constructed to address this problem (Fig. 9.1) [9]. The model was trained with an ADNI dataset and then externally validated with the authors’ own dataset. A slice range was set to cover the most important AD-related regions, and the most appropriate slices for classification were searched. The model learned generalized features of AD and NC for external validation when appropriate slices were selected. The slice range that covered the PCC using double slices showed the best performance. The accuracy, sensitivity, and specificity were 94.33%, 91.78%, and 97.06% on the authors’ own dataset and 94.82%, 92.11%, and 97.45% on the ADNI dataset. The performance on the two independent datasets showed no statistical difference. The study demonstrated the feasibility of a model with consistent performance when tested on datasets acquired from a variety of image-acquisition environments.

Fig. 9.1 Architecture of slice-selective learning for Alzheimer’s disease classification using a GAN network

Despite the remarkable diagnostic accuracy of deep learning, the correlation between the features extracted by a deep learning model and the disease is hard to explain. Several studies have proposed methods to address this problem by providing the feature maps and input data responsible for the prediction. The class activation map (CAM) has been widely used to understand which regions a deep learning model evaluates for each class and to explain how the model predicts its outputs [12,13,14]. Choi et al. used the CAM method, which can generate a heat map with the probability of AD, to demonstrate the brain regions the CNN model evaluated in AD patients with decreased cognitive function [15]. However, CAM-based interpretation should be treated with caution, because deep learning models may classify diseases based on regions that cannot be explained by existing knowledge. A minimal CAM computation is sketched below.
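
The sketch assumes a network that, as CAM requires, ends in global average pooling followed by a single linear layer; the tiny convolutional backbone and 2D input are placeholders.

```python
import torch
from torch import nn

conv = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())  # feature extractor
fc = nn.Linear(16, 2)                      # class scores from pooled features

def class_activation_map(image, class_idx):
    fmap = conv(image)                     # (1, 16, H, W) feature maps
    weights = fc.weight[class_idx]         # (16,) weights of the target class
    cam = (weights[:, None, None] * fmap[0]).sum(dim=0)  # weighted sum of maps
    cam = torch.relu(cam)                  # keep only positive evidence
    return cam / (cam.max() + 1e-8)        # normalize to a [0, 1] heat map

image = torch.randn(1, 1, 64, 64)          # toy 2D slice
print(class_activation_map(image, class_idx=1).shape)  # (64, 64)
```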

2.2 Parkinson’s Disease

Parkinson’s disease (PD) is the second most common neurodegenerative disease and manifests mainly as a movement disorder with resting tremor, bradykinesia, and rigidity [16, 17]. Alpha-synuclein aggregates, the primary PD pathology, are known to promote dopaminergic loss [18]. Although non-invasive direct PET imaging of alpha-synuclein aggregates in the brain is limited, quantification of the presynaptic transporters of the nigrostriatal dopaminergic neurons can be performed with PET and SPECT using either 18F- or 123I-labeled N-(3-fluoropropyl)-2β-carbomethoxy-3β-(4-iodophenyl)nortropane (FP-CIT) [19, 20]. Dopamine transporter (DAT) imaging with PET/CT has been widely used for the early diagnosis of PD and for the discrimination between PD and other diseases presenting with parkinsonism.

Machine learning has been applied to diagnose PD using DAT SPECT or PET scans [21,22,23,24,25,26,27]. Features extracted by deep learning methods have produced outstanding diagnostic results. However, the clinical correlation between the disease and the deep learning methods needs further explanation and verification, since low-level features extracted by deep learning may not reflect the neuropathological heterogeneity of PD. Shiiba et al. used a semi-quantitative indicator and a shape feature acquired on DAT-SPECT to train a machine learning model for classification between PD and normal controls (NC) [28]. The striatal binding ratio (SBR) as the semi-quantitative indicator and a circularity index as the shape feature were combined into a feature vector for machine learning, as in the sketch below. Classification performance was better with the combination of SBR and circularity (AUC 0.995) than with circularity alone (AUC 0.990) or SBR alone (AUC 0.973).
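
The sketch below illustrates the two kinds of features combined in that study using generic textbook formulas (an uptake-ratio SBR and the circularity index 4πA/P²); the numeric values are hypothetical, and this is not the exact implementation of [28].

```python
import numpy as np

def striatal_binding_ratio(striatal_mean: float, background_mean: float) -> float:
    # Semi-quantitative uptake ratio: (striatal - background) / background.
    return (striatal_mean - background_mean) / background_mean

def circularity(area: float, perimeter: float) -> float:
    # 4*pi*A / P^2: 1.0 for a perfect circle, smaller for irregular shapes.
    return 4.0 * np.pi * area / perimeter ** 2

sbr = striatal_binding_ratio(striatal_mean=5.2, background_mean=1.0)
circ = circularity(area=120.0, perimeter=48.0)
feature_vector = np.array([sbr, circ])     # combined input to a classifier
print(feature_vector)
```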

FDG PET/CT is also actively used for the evaluation of patients with parkinsonism, especially for the differentiation between idiopathic PD and atypical parkinsonism [29]. Wu et al. used an SVM to classify PD patients and NC using radiomics features on 18F-FDG PET [21]. The accuracy of classification between PD and NC was 90.97 ± 4.66% and 88.08 ± 5.27% in the Huashan and Wuxi test sets, respectively. In addition, several studies have shown that deep learning methods are also effective for classification between PD patients and NC [30, 31]. Zhao et al. developed a 3D deep residual CNN for the automated differential diagnosis of idiopathic PD (IPD) and atypical parkinsonism [30]. With a dataset of 920 patients, including 502 IPD patients, 239 multiple system atrophy (MSA) patients, and 179 progressive supranuclear palsy (PSP) patients, the method achieved 97.7% sensitivity, 94.1% specificity, 95.5% PPV, and 97.0% NPV for the classification of IPD, versus 96.8%, 99.5%, 98.7%, and 98.7% for MSA, and 83.3%, 98.3%, 90.0%, and 97.8% for PSP, respectively.

3 Segmentation

Although the sensitivity of PET/CT is usually much higher than that of conventional structural imaging such as CT or MRI, it is considered difficult to extract anatomical information from PET/CT because anatomical structures are not well distinguishable on the low-resolution PET images [32]. So far, there are few studies that segment anatomical structures on PET images using deep learning methods, especially in diseases related to the brain. A 3D U-Net-shaped CNN has been used to segment cerebral gliomas on 18F-fluoroethyltyrosine (18F-FET) PET [33]. Among deep learning methods, the generative adversarial network (GAN) model has received great attention due to its ability to generate data without explicitly modeling probability density functions. It has been applied to many tasks with excellent performance, such as image-to-image translation, semantic segmentation, and low-to-high resolution translation [34]. In particular, GAN models have been promising in the field of segmentation. Among PET/CT studies, only one study has applied the pix2pix framework of GAN to segment normal white matter (WM) on 18F-FDG PET/CT [35]. The average Dice similarity coefficient (DSC; see the sketch below) for segmenting WM from 18F-FDG PET/CT was 0.82. Despite the low resolution of 18F-FDG PET/CT, the results were similar to those reported for MRI [36, 37]. The study showed the feasibility of using 18F-FDG PET/CT for segmenting WM volumes.
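
The DSC itself is a simple overlap measure; a minimal implementation for binary masks follows, with toy masks standing in for a predicted and a reference segmentation.

```python
import numpy as np

# Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|), 1.0 for perfect overlap.
def dice(pred: np.ndarray, ref: np.ndarray) -> float:
    pred, ref = pred.astype(bool), ref.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    denom = pred.sum() + ref.sum()
    return 2.0 * intersection / denom if denom else 1.0

a = np.zeros((32, 32), dtype=np.uint8); a[8:24, 8:24] = 1    # predicted mask
b = np.zeros((32, 32), dtype=np.uint8); b[10:26, 10:26] = 1  # reference mask
print(round(dice(a, b), 3))                # overlap of two shifted squares
```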

In the WM, there are foci or areas called white matter hyperintensities (WMH) because they show increased signal intensity on T2-weighted fluid-attenuated inversion recovery (FLAIR) MRI. Although seen in healthy elderly subjects, WMH are associated with greater hippocampal atrophy in non-demented elderly subjects and with cognitive decline in patients with cognitive impairment (CI) [38,39,40]. Therefore, MRI has been invaluable in the assessment of WMH [41]. As mentioned, 18F-FDG PET/CT is useful for assessing glucose metabolism in cortical or subcortical neurons. However, the low spatial resolution and the low glucose metabolism of the WM have limited the evaluation of the WM and WMH on 18F-FDG PET/CT. In our group, we applied a GAN framework to segment WMH on 18F-FDG PET/CT (Fig. 9.2, unpublished data). A dataset of mild, moderate, and severe groups of WMH according to the Fazekas scoring system was used to train and test the deep learning model. Using WMH on FLAIR MRI as the gold standard, a GAN method was used to segment WMH on 18F-FDG PET/CT. The DSC values were closely dependent on the WMH volumes on MRI. With volumes of more than 60 mL, the DSC values were above 0.7, with a mean value of 0.751 ± 0.048; with volumes of 60 mL or less, the mean DSC was only 0.362 ± 0.263. For WMH volume estimation, the GAN showed an excellent correlation with the WMH volume on MRI (r = 0.998 in the severe group, 0.983 in the moderate group, and 0.908 in the mild group). Although evaluation of WMH on 18F-FDG PET/CT by visual analysis is limited, WMH are an important vascular component contributing to dementia. Our GAN method showed the feasibility of automatically segmenting and estimating volumes of WMH on 18F-FDG PET/CT, which will increase the value of 18F-FDG PET/CT in evaluating patients with CI.

Fig. 9.2 Deep learning-based (GAN) FLAIR image synthesis from PET/CT: 18F-FDG PET/CT (a), T2-weighted FLAIR image (b), predicted WMH volume (c), and manually segmented WMH volume (d)

4 Image Generation and Processing

Artificial intelligence in nuclear medicine is also widely used in image processing, such as image reconstruction and attenuation correction. For PET/MRI, attenuation correction by generating pseudo-CT images from MRI has been compared to CT-based methods [42,43,44,45,46]. In methods using the Dixon sequence, PET activity in bone structures is underestimated in the attenuation map [43, 44]. Despite many approaches, MR-based attenuation correction methods are considered inferior to the CT-based method used for PET/CT. Recently, deep learning methods have been applied to attenuation correction for PET/MRI. Hwang et al. [47] proposed a deep learning-based whole-body PET/MRI attenuation correction that is more accurate than the Dixon-based four-segment method. The proposed deep learning method used the activity and attenuation maps estimated by the maximum-likelihood reconstruction of activity and attenuation (MLAA) algorithm as inputs to a CNN that learned a CT-derived attenuation map. The attenuation map generated by the CNN showed better bone identification than MLAA, and the average DSC for the bone region was 0.77, significantly higher than that of the MLAA-derived attenuation map (0.36). Liu et al. also demonstrated that a deep learning approach generating pseudo-CT from MR images reduced the PET reconstruction error compared to the CT-based method [48]. With retrospective T1-weighted MR images from 40 subjects, a deep convolutional auto-encoder (CAE) network was trained with 30 datasets and then evaluated on 10 datasets by comparing the generated pseudo-CT to the ground-truth CT scan. The study reported DSC values of 0.97 for the air region, 0.94 for soft tissue, and 0.80 for bone. A generic pseudo-CT CAE is sketched below.
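
As a rough illustration of the pseudo-CT idea, the sketch below trains a small 2D convolutional auto-encoder to map an MR slice to a CT slice with an L1 loss; the layer sizes and 2D setting are simplifying assumptions, not the architectures of the cited studies.

```python
import torch
from torch import nn

class PseudoCTNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, mr):
        return self.decode(self.encode(mr))   # MR slice -> pseudo-CT slice

model = PseudoCTNet()
mr = torch.randn(1, 1, 128, 128)              # toy MR slice
ct_truth = torch.randn(1, 1, 128, 128)        # paired ground-truth CT slice
loss = nn.L1Loss()(model(mr), ct_truth)       # voxel-wise training objective
loss.backward()
```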

Generation of MRI from CT, or CT from MRI, has been performed by many researchers, but very few studies have addressed the generation of MR images from PET/CT. Choi et al. [49] built a GAN model, based on image-to-image translation, to generate MR images from florbetapir PET images. The generated MR images were used for quantification of florbetapir PET, and the measured values were highly correlated with those of the real MR-based quantification method. Although there was a high structural similarity of 0.91 ± 0.04 between the real and generated MR images, the differentiation between gray and white matter was difficult, and the detailed structures in the generated MR images were blurred. In our group, a CycleGAN-based deep learning method was applied to generate FLAIR images from 18F-FDG PET/CT; the cycle-consistency idea is sketched below. As shown in Fig. 9.3 (unpublished data), the FLAIR images generated by our method had excellent visual quality.
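
The cycle-consistency constraint at the core of CycleGAN can be stated compactly: translating PET to FLAIR and back should recover the original PET image, and vice versa. The sketch below uses single-layer placeholder generators and omits the adversarial losses that make the translated images look realistic.

```python
import torch
from torch import nn

g_pet2flair = nn.Conv2d(1, 1, 3, padding=1)   # placeholder PET -> FLAIR generator
f_flair2pet = nn.Conv2d(1, 1, 3, padding=1)   # placeholder FLAIR -> PET generator
l1 = nn.L1Loss()

pet = torch.randn(1, 1, 64, 64)               # toy unpaired training images
flair = torch.randn(1, 1, 64, 64)

# Each translation should be invertible: PET -> FLAIR -> PET ≈ PET, and
# FLAIR -> PET -> FLAIR ≈ FLAIR.
cycle_loss = (l1(f_flair2pet(g_pet2flair(pet)), pet)
              + l1(g_pet2flair(f_flair2pet(flair)), flair))
cycle_loss.backward()
```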

Fig. 9.3 Representative images of 18F-FDG PET/CT as input to the deep learning model (a), real FLAIR (b), and the FLAIR image generated by the deep learning model (c) (unpublished data)

5 Low-Dose Imaging

High-quality PET images require a large number of gamma events, obtained either from a high-dose injection or a long scan time. A long scan time can result in patient motion artifacts and inconvenience, while high-dose administration increases radiation exposure to patients. To overcome these issues, technical development has concentrated on increasing PET scanner sensitivity to detect a larger number of coincidence events. A newer PET system with an axial field of view covering the whole body in a single bed position has shown a 40-fold improvement in effective sensitivity [50, 51]. In addition, numerous image reconstruction and noise reduction algorithms have improved the spatial resolution and signal-to-noise ratio (SNR) of PET images [52, 53]. Ordered-subset expectation maximization (OSEM) with modeling of the point spread function has been used to reconstruct the coincidence events for high-resolution PET imaging; a toy version of the underlying update is sketched below.
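
For orientation, the sketch below runs the expectation-maximization update that OSEM accelerates, x ← x · Aᵀ(y / Ax) / Aᵀ1, on a toy system matrix (a single subset, i.e., plain MLEM, and no point-spread-function modeling).

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_voxels = 40, 20
A = rng.random((n_bins, n_voxels))        # toy system (projection) matrix
x_true = rng.random(n_voxels)
y = rng.poisson(A @ x_true * 50)          # noisy measured counts

x = np.ones(n_voxels)                     # uniform initial image
sens = A.T @ np.ones(n_bins)              # sensitivity image A^T 1
for _ in range(20):                       # MLEM iterations
    ratio = y / np.maximum(A @ x, 1e-9)   # measured / expected counts
    x *= (A.T @ ratio) / sens             # multiplicative update
print(np.corrcoef(x, x_true)[0, 1])       # agreement with the true image
```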

With deep learning, CNN models have been used to learn the relationship between full-dose and low-dose PET images [54,55,56]. Xu et al. [56] proposed a deep learning method, an encoder-decoder structure with concatenated skip connections and a residual learning framework, to reduce the radiotracer dose in 18F-FDG PET imaging. Starting from 0.5% (1/200) of the standard dose, they achieved significantly better performance than images reconstructed with conventional denoising algorithms (nonlocal means, block-matching 3D, and an auto-context network). A residual-learning sketch is given below.
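
The residual learning idea used in such networks is easy to state: the model predicts only the difference between the low-dose input and the full-dose target, so the restored image is the input plus the predicted residual. Below is a generic sketch, not the cited encoder-decoder architecture.

```python
import torch
from torch import nn

class ResidualDenoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(          # predicts the residual (noise) map
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, low_dose):
        return low_dose + self.body(low_dose)  # residual (skip) connection

model = ResidualDenoiser()
low_dose = torch.randn(1, 1, 128, 128)     # toy low-dose slice
full_dose = torch.randn(1, 1, 128, 128)    # paired full-dose target
loss = nn.MSELoss()(model(low_dose), full_dose)
loss.backward()
```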

Chen et al. [57] proposed a method to reconstruct full-dose amyloid PET/MR images from low-dose 18F-florbetaben (18F-FBB) images. Compared with the low-dose images, the images synthesized by the CNN model showed marked improvement on all quality metrics, such as peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and root mean square error (RMSE). In a visual reading of the amyloid burden on the synthesized FBB images, the accuracy for amyloid status was 89%. In addition, the CNN model showed the smallest mean and variance of the standardized uptake value ratio (SUVR) difference relative to the full-dose images. Ouyang et al. [58] also reported a GAN model to reconstruct the full-dose PET image from the low-dose image, which significantly outperformed the method of Chen et al. with the same input by 1.87 dB in PSNR, 2.04% in SSIM, and 24.75% in RMSE.
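
RMSE and PSNR have standard definitions; a minimal implementation follows, using the reference image's dynamic range as the PSNR reference value (conventions vary between papers).

```python
import numpy as np

def rmse(pred: np.ndarray, ref: np.ndarray) -> float:
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def psnr(pred: np.ndarray, ref: np.ndarray) -> float:
    data_range = ref.max() - ref.min()     # dynamic range of the reference
    return float(20.0 * np.log10(data_range / rmse(pred, ref)))

rng = np.random.default_rng(0)
ref = rng.random((64, 64))                           # "full-dose" image
pred = ref + rng.normal(scale=0.05, size=ref.shape)  # noisy reconstruction
print(f"RMSE={rmse(pred, ref):.3f}, PSNR={psnr(pred, ref):.1f} dB")
```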

In our group, a CNN model with a residual learning framework was applied to predict full-time 18F-FBB PET/CT images from short-time scans of 1 to 5 min with excellent image quality (Fig. 9.4, unpublished data). In amyloid imaging, amyloid positivity can be determined by quantitative analysis of the SUVR, normalized to the mean uptake in the cerebellar cortex, as in the sketch below. Our ROC analyses showed that the cut-off values for amyloid positivity derived from the images predicted by the CNN models from the 1 to 5 min scans remained unchanged compared with those obtained from the ground-truth images.
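
A minimal SUVR computation under these conventions is sketched below; the PET volume and the binary target and cerebellar reference masks are hypothetical toy arrays.

```python
import numpy as np

# SUVR: mean uptake in the target VOI divided by mean uptake in the
# cerebellar-cortex reference VOI.
def suvr(pet, target_mask, cerebellum_mask):
    target_mean = pet[target_mask > 0].mean()
    reference_mean = pet[cerebellum_mask > 0].mean()
    return float(target_mean / reference_mean)

rng = np.random.default_rng(0)
pet = rng.random((32, 32, 32))
target = np.zeros(pet.shape, dtype=np.uint8); target[5:15, 5:15, 5:15] = 1
cereb = np.zeros(pet.shape, dtype=np.uint8); cereb[20:30, 20:30, 20:30] = 1
print(round(suvr(pet, target, cereb), 3))  # compared against a positivity cut-off
```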

Fig. 9.4 18F-FBB PET/CT images reconstructed with different scan times (left column) and the 18F-FBB PET/CT images predicted by the deep learning method from the short scan times (right column). Amyloid-negative (a) and amyloid-positive (b) cases are shown

Scan time reduction using low-dose imaging has also been attempted for 18F-FDG PET/CT. Kim et al. [59] proposed a deep learning method with concatenated skip connections and a residual learning framework to synthesize PET images with the high SNR of typical scan durations from short-scan-time PET images with low SNR (Fig. 9.5). The list-mode PET data were formatted into 10, 30, 60, and 120 s frames to investigate the effect of scan time on the quality of the synthesized PET images. The PSNRs and normalized root mean square errors (NRMSEs) of the synthesized 18F-FDG PET images were significantly superior to those of the short-scan images for all scan times. As the scan time increased from 10 to 120 s, the PSNRs and NRMSEs of the synthesized 18F-FDG PET images improved by an average of 21.6 ± 3.8% and 47.0 ± 5.5%, respectively.

Fig. 9.5 A schematic of the encoder-decoder convolutional neural network for predicting the full-time scan from a short-time scan of 18F-FDG PET/CT

As shown in Fig. 9.6, high-quality PET images generated by deep learning models from low-count data and/or short scan times can have a practical impact on reducing radiation exposure. This will provide new opportunities for PET/CT in patients such as children, pregnant women, and those prone to motion artifacts.

Fig. 9.6 Representative 18F-FDG PET/CT images in a 62-year-old female normal control: short-time scan (10 s, left), images predicted by the CNN with a residual learning framework (middle), and full-time scan (15 min, right)