Key Points

Brain FDG PET results alone can predict Alzheimer’s disease (AD) conversion in 2 years in patients with mild cognitive impairment (MCI).

A composite visual rating score based on regional glucose metabolism predicted AD conversion better than the traditional all-or-none visual rating in patients with MCI.

Visual rating and computer-assisted analysis were comparably accurate in predicting AD conversion.

1 Introduction

Mild cognitive impairment (MCI) is characterized by cognitive decline in objective neuropsychological tests, but preserved activities of daily living [1, 2]. Compared to normal elderly people, patients with MCI are 20 times more likely to progress to Alzheimer’s disease (AD) [3, 4], with approximately 10–15% of patients with MCI progressing to AD every year [5, 6]. Thus, it is important to predict the progression to dementia at the MCI stage in advance.

Fluorodeoxyglucose (FDG) positron emission tomography (PET) is useful for diagnosing AD and predicting AD conversion in patients with MCI because it can detect the early deterioration of brain glucose metabolism [7]. AD is characterized by cerebral glucose hypometabolism of the temporoparietal lobe [8]. The sensitivity and specificity of visual interpretation and quantitative assessment of FDG PET in diagnosing AD are 85–95% and 70–90%, respectively [8,9,10,11]. In patients with MCI, regional hypometabolism develops in the posterior cingulate, precuneus, parietal, temporal, and frontal regions, similar to AD [12]. This hypometabolic pattern was more prominent in patients with MCI who progressed to AD than in those who remained at the MCI stage [13,14,15]. FDG PET is known to be a good supplementary tool for predicting conversion to AD, with better accuracy than structural magnetic resonance imaging (MRI) and single-photon emission computed tomography [16]. A Cochrane review of the diagnostic accuracy of visual rating of FDG PET for detecting AD conversion in patients with MCI reported a sensitivity of 76% and a specificity of 82% [17].

The need for objective and easy quantitative diagnostic programs is now increasing, and the use of computer-assisted analysis is now more widespread in research fields. Computer-assisted analyses also showed good sensitivity and specificity in predicting AD conversion in previous studies. Multimodal prediction of cerebrospinal fluid (CSF), apolipoprotein E4 (APOE4), MRI, and FDG PET has yielded greater predictive values [18]. Combination of FDG PET and demographic factors [19] or FDG PET and memory score [20] also predicted AD conversion in patients with MCI, with a sensitivity of 92% and a specificity of 93%.

However, few studies have examined the extent to which FDG PET alone can predict AD conversion and compared the efficacy between visual and computer-assisted analysis directly. Thus far, clinical parameters and neuropsychological test results have been included in the prediction model for AD conversion, making it difficult to indicate the predictive value of FDG PET alone. In contrast to recent research trends, it is important to demonstrate whether FDG PET alone can predict AD conversion because it is difficult to use a multimodal prediction model in clinical practice. Comparisons between visual rating and computer-assisted analysis of FDG PET scans are also rarely explored. Visual rating is clinically simple, but requires expertise. Computer-based analysis, however, although somewhat complex, does not require an expert physician, given that it can be automated. Therefore, it is necessary to compare the prediction of AD conversion between visual and computer-assisted analyses of FDG PET. We hypothesized that visual rating and computer-assisted analysis of FDG PET will be comparably useful in predicting AD conversion after 2 years and that FDG PET alone can predict AD conversion in patients with MCI. The purpose of the current study was to compare the efficacy of visual rating and computer-assisted analysis of FDG PET in predicting AD conversion.

2 Methods and Materials

2.1 Participants

Seventy-two participants (aged 60–80 years) who were diagnosed with MCI were initially recruited for the current study. Of these, 13 were excluded because they were not followed up after 2 years and 5 were excluded because of incomplete imaging data or head motion artifacts. Thus, a total of 54 participants were included in this study. They underwent FDG PET, MRI, and neuropsychological tests at initial diagnosis and were clinically followed up after 24 months. All tests were performed at Seoul National University Hospital from July 2003 to June 2011; the institutional review board approved this study.

All participants were diagnosed with MCI according to Petersen’s criteria [21] at the beginning of the study and were re-diagnosed after 24 months. At follow-up, psychiatrists who were blinded to the brain imaging information made the final diagnosis. AD was diagnosed as probable AD using the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association criteria [22]. Patients who showed structural lesions on brain MRI, such as territorial infarction, intracranial hemorrhage, hydrocephalus, traumatic brain injury, infectious encephalitis, multiple sclerosis, or vasculitis, were excluded from this study. Patients with serious medical conditions that could affect the study, including kidney disease, liver disease, hemato-oncologic disease, thyroid disease, vitamin B12 deficiency, folate deficiency, neurosyphilis, or HIV infection, were also excluded.

2.2 Neuropsychological Tests

All participants completed a clinical interview and underwent a battery of standardized neuropsychological tests, i.e., the Korean version of the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD-K) [23]. The CERAD-K included a verbal fluency test (animal category), modified Boston Naming Test, Mini-Mental State Examination (MMSE), word list memory, construction praxis, word list recall, word list recognition, and construction recall. All participants had a global Clinical Dementia Rating (CDR) score of 0.5.

2.3 Image Acquisition and Processing

MRI was performed using a 3T GE whole body imaging system (GE VH/I; General Electric, Milwaukee, WI, USA). PET was performed using an ECAT EXACT 47 scanner (Siemens-CTI, Knoxville, TN, USA). The intrinsic resolution was 5.2 mm full width at half maximum, and the slice thickness was 3.4 mm for a longitudinal field of view of 16.2 cm. Thirty-minute static emission scans were performed 30 min after intravenous administration of 370 MBq of [18F]FDG. For attenuation correction, transmission scans were performed using three germanium-68 rod sources before administration of FDG. The transaxial images were reconstructed using a filtered back-projection algorithm with a Shepp-Logan filter (cutoff frequency of 0.3 cycles/pixel) as 128 × 128 × 47 matrices with a voxel size of 2.1 × 2.1 × 3.4 mm.

FDG PET images were individually co-registered with the corresponding T1 structural images. Transformation parameters were calculated, and the images were spatially normalized to a Montreal Neurological Institute template. Spatially normalized images with a voxel size of 2 × 2 × 2 mm were smoothed using a 16-mm Gaussian filter. Data were processed using Statistical Parametric Mapping (SPM) 8 software (http://www.fil.ion.ucl.ac.uk/spm), which was implemented in Matlab 7.13 (Mathworks, Natick, MA, USA). Whole-brain glucose metabolism was used as the reference for intensity normalization [24].
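The smoothing and intensity-normalization steps can be illustrated with a minimal Python sketch. It assumes that the 16-mm filter width is a full width at half maximum (FWHM), that voxels are 2 mm isotropic, and that intensity normalization is a simple division by the whole-brain mean; the function names and the four voxel values are hypothetical and are not taken from the SPM8 pipeline used in the study.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's FWHM to its standard deviation (same units)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def normalize_to_global_mean(voxels):
    """Divide every in-brain voxel value by the mean of all in-brain voxels."""
    global_mean = sum(voxels) / len(voxels)
    return [v / global_mean for v in voxels]


sigma_mm = fwhm_to_sigma(16.0)  # assumes the 16-mm filter width is an FWHM
sigma_vox = sigma_mm / 2.0      # assumes 2-mm isotropic voxels

# Hypothetical uptake values for four in-brain voxels (not study data).
scaled = normalize_to_global_mean([4.0, 6.0, 5.0, 5.0])
print(round(sigma_mm, 2), round(sigma_vox, 2), scaled)
```

Under these assumptions the kernel's standard deviation is about 6.8 mm (about 3.4 voxels), and the scaled image has a mean of 1, so regional values can be read as ratios to whole-brain metabolism.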

2.4 Qualification of FDG PET (Visual Rating)

FDG PET images were evaluated by two nuclear medicine physicians, who were blinded to the clinical symptoms and demographic information of the participants. In the case of a discrepancy, a third expert nuclear medicine physician was consulted to reach an agreement. We scored the level of cerebral glucose metabolism on a 5-point Likert scale, from 4 (normal metabolism) to 0 (the least metabolism), in the following eight regions of interest (ROIs): (1) right parietal, (2) left parietal, (3) right posterior cingulate cortex (PCC), (4) left PCC, (5) right temporal, (6) left temporal, (7) right frontal, and (8) left frontal.

2.5 Quantification of FDG PET (Computer-Assisted Analysis)

The automated anatomical labeling (AAL) algorithm was applied to obtain the metabolism value for each ROI. The glucose metabolism value of each voxel was normalized to the whole-brain value, which was extracted for each participant. The degree of cerebral glucose metabolism in the precuneus, temporal, frontal, occipital, and overall cortex was quantitatively measured using SPM8.
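The ROI extraction described here amounts to averaging normalized voxel values within each atlas label. The following sketch shows that idea on flat lists; the label numbers, the use of label 0 for non-brain voxels, and the function names are assumptions made for illustration and do not reproduce the actual atlas indices or the SPM8 implementation.

```python
from collections import defaultdict


def whole_brain_normalize(values, labels):
    """Scale voxel values by the mean of all labeled (in-brain) voxels."""
    in_brain = [v for v, lab in zip(values, labels) if lab != 0]
    brain_mean = sum(in_brain) / len(in_brain)
    return [v / brain_mean for v in values]


def roi_means(values, labels):
    """Mean value per atlas label; label 0 is treated as background."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, lab in zip(values, labels):
        if lab == 0:
            continue
        sums[lab] += v
        counts[lab] += 1
    return {lab: sums[lab] / counts[lab] for lab in sums}


# Hypothetical five-voxel image with two ROIs (labels 1 and 2), not study data.
values = [0.0, 2.0, 4.0, 6.0, 8.0]
labels = [0, 1, 1, 2, 2]

normalized = whole_brain_normalize(values, labels)
means = roi_means(normalized, labels)
print(means)
```

Each ROI mean is then a unitless ratio to whole-brain uptake, which is the kind of per-region value a regression model could take as input.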

To define the MCI-related ROIs, we compared the cerebral glucose metabolism of participants with MCI in the present study with that of 29 age- and sex-matched normal elderly people from our previous study [25] using t tests, with a significance threshold of p < 0.001 and without correction for multiple comparisons at the cluster level using random field theory. MCI-related ROIs were as follows: (1) right and left precuneus, (2) right temporo-parietal, (3) left temporo-parietal, (4) right frontal, (5) right temporo-occipital, and (6) left temporo-occipital. Supplemental Table 1 and Supplemental Figure 1 show the results of the comparison (see the Electronic Supplementary Material).

2.6 Predictive Accuracy of the Two Methods and Their Comparison

We first separately investigated the predictive validity of visual rating and computer-assisted analyses. Under the traditional visual rating, a scan was rated positive for AD conversion when reduced glucose metabolism was found in the temporal region (level ≤ 3) or parietal region (level ≤ 2), a pattern that is crucial for the diagnosis of AD in clinical practice [9]. We then calculated a composite score predicting AD conversion based on all eight ROIs to enhance the prediction of the visual rating. The level of metabolism in the eight ROIs was analyzed using linear regression analysis with a backward method. The best predictive model was selected based on the lowest Akaike Information Criterion (AIC) value. The best model for AD conversion included three ROIs, i.e., left parietal, right temporal, and right frontal, each weighted by a coefficient reflecting its contribution to the prediction of AD conversion, plus a constant term. The equation was as follows:

$$\text{Visual rating composite score} = 0.479 - 0.200 \times \text{Parietal(L)} - 0.122 \times \text{Temporal(R)} + 0.246 \times \text{Frontal(R)}$$
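The traditional rule and the composite equation above can be sketched in a few lines of Python. The dictionary keys are hypothetical names for the eight rated ROIs, and the traditional rule is assumed to use the lower-rated hemisphere of each lobe, which the text does not specify. No cutoff is applied to the composite score because none is stated in this section.

```python
def traditional_visual_positive(ratings):
    """Traditional rule from the text: temporal level <= 3 or parietal level <= 2.

    Assumption: the lower-rated hemisphere of each lobe decides the result.
    """
    temporal = min(ratings["temporal_L"], ratings["temporal_R"])
    parietal = min(ratings["parietal_L"], ratings["parietal_R"])
    return temporal <= 3 or parietal <= 2


def visual_composite_score(ratings):
    """Composite score using the constant and coefficients of the equation above."""
    return (0.479
            - 0.200 * ratings["parietal_L"]
            - 0.122 * ratings["temporal_R"]
            + 0.246 * ratings["frontal_R"])


# Illustrative ratings (not study data): 4 = normal metabolism, 0 = lowest.
normal_scan = {"parietal_L": 4, "parietal_R": 4, "pcc_L": 4, "pcc_R": 4,
               "temporal_L": 4, "temporal_R": 4, "frontal_L": 4, "frontal_R": 4}
reduced_scan = dict(normal_scan, parietal_L=2, temporal_R=2)

normal_score = visual_composite_score(normal_scan)
reduced_score = visual_composite_score(reduced_scan)
print(traditional_visual_positive(normal_scan), round(normal_score, 3))
print(traditional_visual_positive(reduced_scan), round(reduced_score, 3))
```

With these coefficients, lower ratings in the left parietal and right temporal regions raise the score, whereas a lower right frontal rating reduces it.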

In the computer-assisted analysis, only the composite score was calculated to predict AD conversion. Among the six ROIs, three ROIs, i.e., the left temporo-parietal, right frontal, and right temporo-occipital regions, were selected for the best predictive model for AD conversion, based on the lowest AIC value. For each selected ROI, a different coefficient was applied according to its contribution to the prediction of AD conversion. The equation was as follows:

$$\text{Computer-assisted analysis composite score} = 1.189 + 0.031 \times \text{Frontal(R)} - 0.028 \times \text{Temporo-parietal(L)} - 0.019 \times \text{Temporo-occipital(R)}$$
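The computer-assisted composite score is the same kind of weighted sum and can be written as a small table-driven function. The key names are hypothetical, and because the scale of the ROI metabolism values is not stated in this section, the inputs below are placeholders chosen only to exercise the arithmetic.

```python
# Constant and coefficients copied from the equation above.
INTERCEPT = 1.189
WEIGHTS = {
    "frontal_R": 0.031,
    "temporo_parietal_L": -0.028,
    "temporo_occipital_R": -0.019,
}


def computer_composite_score(roi_values):
    """Weighted sum of the three selected ROI values plus the constant term."""
    return INTERCEPT + sum(w * roi_values[name] for name, w in WEIGHTS.items())


# Placeholder inputs (not study data; the true value scale is an assumption).
example = {"frontal_R": 10.0, "temporo_parietal_L": 10.0, "temporo_occipital_R": 10.0}
example_score = computer_composite_score(example)
print(round(example_score, 3))
```

Keeping the coefficients in a dictionary makes it easy to see that lower left temporo-parietal and right temporo-occipital metabolism increases the score.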

Receiver operating characteristic (ROC) curves and predictive values were obtained for the traditional visual rating, the composite scores of both visual rating and computer-assisted analysis, and each ROI included in the best predictive models. Then, the ROC curves of the three evaluation methods (traditional visual rating, composite score of visual rating, and composite score of computer-assisted analysis) were compared.
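For readers less familiar with ROC analysis, the sketch below shows how an AUC and a sensitivity/specificity pair can be obtained from a continuous score. It assumes that higher scores indicate a higher likelihood of conversion and uses invented scores and outcomes; it is not the MedCalc procedure used in the study.

```python
def auc(scores, labels):
    """AUC as the probability that a converter outscores a non-converter.

    Ties between a converter and a non-converter count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as positive and return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Invented composite scores and outcomes (1 = converter, 0 = non-converter).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

example_auc = auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, cutoff=0.5)
print(example_auc, sens, spec)
```

A dichotomous rating such as the traditional visual read yields a single operating point, whereas a composite score yields a full curve from which a cutoff can be chosen.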

2.7 Statistical Analyses

A t test or chi-square test was used for demographic and clinical data. κ statistics were used to analyze inter-observer agreement on visual rating of FDG PET. In the computer-assisted analysis, FDG PET uptake was compared with the data of normal elderly people using a t test.
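As an illustration of the agreement statistic, the following sketch computes an unweighted Cohen's κ for two readers. The text does not state whether a weighted or unweighted κ was used, so the unweighted form is an assumption, and the ratings are invented.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Invented 0-4 ratings of one region in ten scans by two readers.
reader_1 = [4, 4, 3, 2, 4, 3, 1, 4, 2, 3]
reader_2 = [4, 4, 3, 2, 3, 3, 1, 4, 2, 4]

kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 2))
```

Here the readers agree on 8 of 10 scans, but κ is lower than 0.8 because part of that agreement would be expected by chance.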

The composite score for both visual rating and statistical parametric mapping (SPM) analyses was calculated. A coefficient was applied to each contributing ROI, using a binary linear regression analysis that predicts AD conversion. ROC curves, sensitivity, specificity, likelihood ratio, and area under the curve (AUC) were determined for all ROIs, composite scores of visual rating, and SPM analysis. Voxelwise analyses were performed using SPM12. MedCalc Statistical Software (MedCalc Software bvba, Ostend, Belgium; http://www.medcalc.org; 2015) was used to analyze the predictive values and ROC curves, and to compare the ROC curves. Other analyses were performed using SPSS Statistics for Windows version 23.0 (SPSS Inc., Chicago, IL, USA). Statistical significance was based on two-tailed tests and p < 0.05.

3 Results

3.1 Demographic and Clinical Data

Table 1 shows the demographic and neuropsychological data of the participants. There were no differences in age, sex, or years of education between the converter and non-converter groups. Compared to non-converters, converters showed significantly lower MMSE scores and higher CDR sum of boxes scores. All participants showed a CDR global score of 0.5, with a mean follow-up period of 24.3 ± 2.5 months. During the follow-up period, 19 patients (35.2%) converted from MCI to AD dementia. At the time of follow-up, 33 patients (61.1%) were diagnosed with MCI and two patients (3.7%) were evaluated as having no cognitive impairment. No patients were diagnosed with other forms of dementia, such as frontotemporal dementia or dementia with Lewy bodies.

Table 1 Demographic and neuropsychological data at baseline

3.2 Qualitative Analyses (Visual Rating)

Agreement between the two readers for the visual rating of FDG PET varied (κ = 0.66–0.85). The highest degree of agreement was found in the parietal region (κ = 0.85 [L] and 0.84 [R]). The detailed results of the visual rating of the two readers are presented in Supplemental Table 2 (see the Electronic Supplementary Material).

Table 2 shows the AUC, sensitivity, specificity, and likelihood ratio of the visual rating analysis. The AUC, sensitivity, and specificity of the traditional visual rating were 0.67, 75.0%, and 58.8%, respectively. The composite score of visual rating was calculated using weighted ROIs, according to the predictive power of the binary logistic linear regression analysis (F = 4.050, p = 0.012). The explanatory power of the regression analysis was 20% (Nagelkerke R² = 0.20). The AUC, sensitivity, and specificity of the visual rating composite score were 0.76, 65.0%, and 85.3%, respectively.

Table 2 Diagnostic accuracy of visual rating and computer-assisted analyses of FDG PET

3.3 Quantitative Analyses (Computer-Assisted Analysis)

Table 2 also shows the AUC, sensitivity, specificity, and likelihood ratio of the computer-assisted analyses. The composite score of computer-assisted analysis was calculated using the weighted ROIs, according to the predictive power of the binary logistic linear regression analysis (F = 4.649, p = 0.006). The explanatory power of the regression analysis was 22% (Nagelkerke R² = 0.22). The AUC, sensitivity, and specificity of the SPM composite score were 0.79, 75.0%, and 76.5%, respectively.
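Likelihood ratios follow directly from sensitivity and specificity. As a worked illustration, the sketch below recomputes them from the rounded percentages reported in Sections 3.2 and 3.3; the function name is hypothetical, and values recomputed from rounded inputs may differ slightly from those in Table 2.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) for proportions between 0 and 1."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Sensitivity and specificity as reported in the text (rounded percentages).
reported = {
    "traditional visual rating": (0.750, 0.588),
    "visual rating composite score": (0.650, 0.853),
    "computer-assisted composite score": (0.750, 0.765),
}

results = {name: likelihood_ratios(se, sp) for name, (se, sp) in reported.items()}
for name, (lr_pos, lr_neg) in results.items():
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

On these inputs, the positive likelihood ratio is highest for the visual rating composite score (about 4.4) because of its higher specificity, compared with about 3.2 for the computer-assisted composite score and about 1.8 for the traditional visual rating.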

3.4 Comparison of the Qualitative and Quantitative Analyses

Figure 1 shows the ROC curves of the traditional visual rating, composite score of the visual rating, and composite score of the computer-assisted analysis. There was a difference between the ROC curve of the traditional visual rating and that of the composite visual rating score (Z = 2.095, p = 0.036). However, no differences were found between the ROC curves of visual rating composite score and computer-assisted analysis composite score (Z = 0.463, p = 0.643) and traditional visual rating and computer-assisted analysis composite score (Z = 1.441, p = 0.150).
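The reported p values can be checked against the Z statistics with the standard normal distribution. The sketch below performs only this conversion, assuming a two-tailed test as stated in Section 2.7; it does not reproduce the computation of Z itself, which requires the paired ROC data.

```python
import math


def two_tailed_p(z):
    """Two-tailed p value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Z statistics reported in the text for the three pairwise ROC comparisons.
comparisons = {
    "traditional vs visual composite": 2.095,
    "visual composite vs computer-assisted composite": 0.463,
    "traditional vs computer-assisted composite": 1.441,
}

p_values = {name: two_tailed_p(z) for name, z in comparisons.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

The resulting values agree with the reported p = 0.036, 0.643, and 0.150 to three decimal places.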

Fig. 1 ROC curves of the visual ratings and the computer-assisted analysis, comparing (1) traditional visual rating (area under the curve = 0.67), (2) computer-assisted analysis (area under the curve = 0.79), and (3) composite score of visual rating (area under the curve = 0.76). ROC receiver operating characteristic

4 Discussion

The current study aimed to find the predictive validity of FDG PET alone to predict the conversion to AD in patients with MCI, and to compare the validity of the visual rating and computer-assisted analysis of FDG PET. We demonstrated that FDG PET analysis can effectively predict AD conversion by itself. Further, both visual rating and computer-assisted analysis showed similar accuracy for detecting AD conversion.

The major finding of this study is that visual rating and computer-assisted analysis of FDG PET showed similar predictive values for AD conversion in patients with MCI. FDG PET scans are usually assessed visually by nuclear medicine physicians. Visual rating is easy to use in clinical practice, but can vary according to the raters’ expertise. In contrast, computer-assisted analysis has the advantage of lower inter-rater variability. However, it is not user-friendly and the images still need to be processed. Further, the difference in predictive accuracy between visual rating and computer-assisted analysis of FDG PET for AD conversion is not well known. A few recent studies reported that both visual rating and computer-assisted analysis are predictive of AD conversion in patients with MCI. However, comparisons of the two methods have shown inconsistent results. In a study by Ito et al. [26], the predictive accuracy of visually inspected FDG PET and computer-assisted analysis for AD conversion was 56% and 83%, respectively, during 3-year follow-up in patients with amnestic MCI. In contrast, Morbelli et al. [27] demonstrated that the predictive accuracy of visual reading of FDG PET by an expert was better than that of computer-assisted analysis (0.89 vs 0.80). Grimmer et al. [28] also concluded that accuracy rates in predicting AD conversion in patients with MCI were better with visual analysis (54–68%) than with fully automated analysis (50%). The authors of these studies used PALZ software (PMOD Technologies, http://www.pmod.com) for the computer-assisted analyses [26,27,28]. Using single-case SPM, Frisoni et al. [29] reported that the visual read was more sensitive (94 vs 72%), but less specific (74 vs 92%), than computer-assisted analysis.

The results of our study were consistent with previous investigations, in that the traditional visual rating showed a similar sensitivity to (75 vs 75%), and a lower specificity than (59 vs 76%), computer-assisted analysis. A useful feature of the present study is that we calculated the AUC of each evaluation method. The AUCs of the composite scores of visual rating and computer-assisted analysis were modest and similar (0.76 vs 0.79), with no significant difference (Figure 1). Currently, there is no consensus on whether FDG PET scans should be analyzed visually or by a computer-assisted method. The European Association of Nuclear Medicine guideline recommends that brain FDG PET scans should be interpreted using both visual inspection and quantification [18]. Our study likewise found comparable predictive values of visual and computer-assisted analyses of FDG PET scans for AD conversion.

In the present study, visual rating of FDG PET was predictive of AD conversion in patients with MCI, with an AUC of 0.67 for the traditional visual rating and 0.76 for the composite score. The visual rating of FDG PET in patients with MCI is already known to be suggestive of conversion to AD [30, 31]. Previous studies using visual ratings showed better prognostic value than the traditional visual read in our study (i.e., an AUC of 0.67, sensitivity of 75%, and specificity of 59%). Morbelli et al. [27] demonstrated that expert visual reading was accurate in predicting AD conversion, with a sensitivity, specificity, and accuracy of 90, 89, and 89%, respectively. Frisoni et al. [29] reviewed FDG PET studies and reported that visual reading with a sensitivity of 74% and specificity of 94% at the MCI stage was predictive of AD conversion. In the present study, the sensitivity and specificity of the traditional visual rating, which determined positive AD conversion in FDG PET images by hypometabolism in the temporal or parietal regions, were slightly lower than those in previous studies. We postulated that this was because the nuclear medicine physicians were blinded to the symptoms, diagnosis, test results, and demographic information of the participants. In addition to the traditional visual rating, we calculated the composite score of visual rating to enhance the predictive value. There was a noticeable increase in the AUC from 0.67 to 0.76. In previous studies, visual analysis findings were based on visual inspection by nuclear medicine physicians, who identified typical AD by the presence of cortical hypometabolism in the parietal, temporal, or frontal regions [9, 26,27,28].

We postulated that, because we calculated the composite score based on the ROIs included in the best model, according to the lowest AIC value and the degree of contribution to the prediction of AD conversion, the predictive value of the composite score might have increased compared to the traditional visual rating, with an increase in specificity. In clinical practice, this high specificity can increase the utility of FDG PET in identifying patients who will not convert to AD. In addition, because the present study showed quite accurate prognostic values without biological, demographic, and neuropsychological test variables, FDG PET visual inspection may serve as a screening tool to detect AD conversion, even without cognitive test results.

The computer-assisted analysis in our study was valuable in predicting AD conversion, with a sensitivity of 75%, specificity of 77%, and AUC of 0.79. Computer-assisted analysis has become a necessary method in this field of research. Several recent studies have reported the good prognostic value of computer-assisted analysis of FDG PET in patients with MCI. A preliminary study demonstrated that the objective statistical evaluation of FDG PET can improve the detection rate in prodromal AD [13]. A study that used AD neuroimaging data showed a moderate prognostic value of computer-assisted analysis of FDG PET for AD conversion (AUC = 0.74) [32]. A meta-analysis that reviewed six studies with FDG PET computer-assisted analysis showed a wide range of predictive values for AD conversion (i.e., sensitivity 37.5–100% and specificity 67.0–80.3%) [33]. In 2015, a Cochrane review [17] reported 16 studies that investigated the predictive value of FDG PET for AD conversion, 12 of which used quantitative evaluation as well as visual inspection. The sensitivity values ranged from 25% to 100%, while the specificity ranged from 29% to 100%. Given the wide range of values, the authors postulated that the computer-assisted analyses were limited by the heterogeneity between the studies in the chosen automated programs, the lack of common thresholds, and the different ROIs [17]. Additional computer-assisted analyses, such as the evaluation of asymmetries and volume of interest methods, have raised the prognostic accuracy of FDG PET, reaching an AUC of 0.93 [34]. Combining other factors, such as biological markers (CSF, MRI, and APOE), demographic variables (age, sex, and years of education), and neuropsychological test results (MMSE and auditory verbal learning test), in a predictive model also raised the prognostic values [19, 32, 35, 36]. Thus, comparing the values of computer-assisted analyses of FDG PET, including our results, is not straightforward. Although computer-assisted analysis has the advantage of low inter-rater variability, it has limitations due to the different automated or semi-automated programs and composite or dichotomous scoring methods being used by each research group.

Some limitations of the current study must also be noted. Due to the absence of cross validation, the results might not be applicable to other groups with MCI. The small sample size can also limit the clinical validity for predicting AD in patients with MCI. Further, since the final diagnosis was confirmed only by clinical evaluations and the follow-up duration was only 2 years, additional conversion to AD or other forms of dementia in non-converters could have gone undetected. The lack of confirmation of amyloid positivity for AD diagnosis is also a limitation. However, a strength of our study is that we quantified the modest predictive value of visual and computer-assisted analyses of FDG PET alone using the AUC of the ROC curve, and calculated a visual rating composite score that predicted AD conversion better than the traditional visual rating.

In conclusion, visual rating and computer-assisted analyses of FDG PET scans were comparably accurate in predicting AD conversion in patients with MCI. FDG PET may be a useful tool for screening for AD conversion in patients with MCI, regardless of interpretation method. FDG PET can be used in a clinical setting even when no other information about the patient is available. However, because both visual and computer-assisted analyses can be performed with a variety of methods, further research directly comparing them is required to identify the most accurate way to evaluate FDG PET scans in AD.