Introduction

Patients with mild cognitive impairment (MCI) represent an important clinical group as they are at increased risk of developing Alzheimer disease (AD) [37] and are an ideal target for drug interventions. Since up to a half of all MCI patients never develop dementia [4], reliable predictors of conversion are necessary. Based on the notion that AD pathology may start decades before the first clinical symptoms [28], and that biological changes typical of AD have been found in MCI patients [9, 19], a proposal for new AD diagnostic criteria has been developed, positing that individuals with memory impairment plus biological marker evidence of AD should be regarded as early AD [15].

The new research criteria consider three main markers: medial temporal (MT) atrophy on MRI, abnormal Abeta42 or tau protein concentrations in the CSF, and reduced cortical glucose metabolism on fluorodeoxyglucose positron emission tomography (18F-FDG PET). A recent review on neurochemical and imaging AD biomarkers concluded that, in MCI patients, hippocampal atrophy predicts later conversion to AD with about 80% accuracy, the combination of tau and Abeta42 with a sensitivity of 95% and a specificity of 85%, and the AD-like temporoparietal and posterior cingulate hypometabolic pattern with an accuracy >80% [23]. A fourth marker, brain amyloid deposition on amyloid imaging with PET, will not be addressed by the present study for its presently poor feasibility in clinical settings.

While the new criteria do not assume that a marker is more predictive than any other, nor do they imply that combinations of markers might be more predictive than individual markers, it is likely that this is the case. The aim of this study was to test the validity of the new diagnostic criteria for AD in patients with MCI from an outpatient memory clinic. Hippocampal volume, CSF Abeta42 and tau, and cortical glucose metabolism were collected, and conversion to dementia was assessed clinically over a time span up to 6.5 years.

Methods

Patients

Patients were taken from a prospective study on the natural history of cognitive impairment (the Translational Outpatient Memory Clinic—TOMC Study) carried out in the outpatient facility of the National Institute for the Research and Care of Alzheimer’s Disease (IRCCS Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy). The TOMC is a multidisciplinary team of professionals working at the IRCCS Centro San Giovanni di Dio Fatebenefratelli made up of: neurologists and geriatricians (patients’ screening and clinical assessment); neuropsychologists (neuropsychological battery administration); image analysis neuroscientists (post-processing of functional and structural images); and biologists (assaying of CSF markers). Lumbar tap of patients seen in the TOMC is performed in loco or, in specific cases, in a local general hospital (S. Orsola-Fatebenefratelli Hospital). Structural MR and 18F-FDG PET were performed in local diagnostic facilities (Spedali Civili and Istituto Clinico Città di Brescia) following ad hoc image acquisition protocols.

Briefly, patients with MCI underwent clinical and cognitive assessment, high-resolution MR, 18F-FDG PET, lumbar tap, and follow-up visits every 12 months until development of probable AD or other dementia according to the National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA) criteria [34]. A detailed description has been provided elsewhere [18]. The study protocol was approved by the local ethics committee and all participants signed an informed consent form.

For the present study, we selected those 108 MCI patients consecutively referred to the TOMC during the first 24 months of activity (June 2006 to June 2008) with no history or presence of neurological signs of major stroke. MCI is defined as the presence of objective impairment in memory or other cognitive domains (performance lower than the fifth percentile on neuropsychological tests detailed below) in the absence of functional impairment. Eighteen patients were excluded because they lacked follow-up assessments due to refusal (n = 16) or logistic problems (n = 2). Finally, 90 patients were included. Only variables relevant to this study are described.

Clinical assessment

Global cognition was assessed with the Mini Mental State Examination (MMSE) [16], and depressive and anxiety symptoms with the pertinent subscales of the Brief Symptom Inventory (BSI) [10]. BSI subscores range from 0 to 4, higher scores indicating more severe symptoms. Hypertension, heart disease, diabetes mellitus, and hypercholesterolemia were investigated based on history, currently prescribed drugs, and historical charts, and defined as present if the patient was currently receiving treatment. Follow-up visits consisted of a complete clinical (cognitive, behavioural, neurological, functional, and physical) examination as in the baseline assessment. In particular, history of progression of symptoms was taken from a knowledgeable informant and was focused on those symptoms that might help in the differential diagnosis of the most frequent forms of dementia: implicit and explicit memory, language and executive functions, disinhibition and other behavioural disturbances, hallucinations and other psychiatric symptoms, fluctuations of consciousness, REM-sleep behavior disturbances, and falls. Instrumental activities of daily living were assessed with the Lawton and Brody scale [29]. The clinical diagnosis of dementia was made by the physician-in-charge (a neurologist or a geriatrician) according to the traditional clinical criteria [1, 33, 35] and blinded to the MR, CSF, and PET findings.

Neuropsychological tests

Tests were administered to assess long term memory (Story Recall Test; Auditory-Verbal Learning Test, immediate and delayed recall; Rey-Osterrieth Complex Figure, recall), language comprehension (Token Test), verbal fluency (phonetic and semantic), constructional and visuo-spatial abilities (Rey-Osterrieth Complex Figure, copy; Clock Drawing Test), attention and executive functions (Trail-Making Test A and B), and non-verbal reasoning (Raven Colored Progressive Matrices). All the tests were administered according to standard procedures and corrected for age and education, according to Italian normative populations [30].

Magnetic resonance imaging

MR images (axial T2 weighted, proton density, fluid attenuated inversion recovery, and gradient echo 3D images) were acquired at the Città di Brescia Hospital, Brescia, with a 1.0-T Philips Gyroscan, as detailed elsewhere [18]. Acquisition parameters of 3D images were as follows: TR 20 ms, TE 5 ms, flip angle 30°, field of view 220 mm, acquisition matrix 256 × 256, slice thickness 1.3 mm.

Hippocampal volume

Three-dimensional images were processed and the hippocampi were manually traced by an expert tracer following a validated and extensively documented protocol [17]. The time required to trace a hippocampus was about 20 min. The volume was normalized to the total intracranial volume (TIV) and rescaled to the mean population TIV according to the following formula: (volume / individual total intracranial volume) × mean total intracranial volume. TIV was obtained by manual tracing of coronal MRs (voxel size 1 mm × 1 mm × 7 mm) using Display 1.3 tools. To obtain this voxel size, the images were transferred to a SUN Workstation and transformed from DICOM to MINC format. A resampling algorithm included in the MINC package developed at the McConnell Brain Imaging Centre (Montreal Neurological Institute, McGill University, Montreal, Canada, http://www.bic.mni.mcgill.ca/software) was used to set the voxel dimension to 1 mm × 1 mm × 7 mm for all images. Hippocampal volume could not be traced in four subjects because of low quality of the 3D images.
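The normalization formula can be illustrated with a minimal Python sketch. The function name and the example numbers are hypothetical and are not taken from the study data; only the formula itself comes from the text.

```python
def normalize_to_mean_tiv(raw_volume, individual_tiv, mean_tiv):
    """Rescale a traced volume to the population mean TIV.

    Implements (volume / individual TIV) x mean TIV, as stated in the text.
    All three arguments must be in the same volume unit.
    """
    if individual_tiv <= 0:
        raise ValueError("individual_tiv must be positive")
    return raw_volume / individual_tiv * mean_tiv


# Hypothetical values, for illustration only.
normalized = normalize_to_mean_tiv(raw_volume=2000.0,
                                   individual_tiv=1600.0,
                                   mean_tiv=1400.0)
print(normalized)
```

A subject whose TIV is larger than the population mean has the traced volume scaled down, and vice versa, so that head size does not drive the comparison with the normative cutoff.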

Medial temporal (MT) atrophy was defined as hippocampal volume (the smallest between the left and right side) below the fifth percentile of the volume distribution (1,970 μl) in a population of 125 normal subjects 60 years and older taken from a study on the structural features of normal ageing [20].
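As a sketch of this definition, the rule can be written as a comparison of the smaller of the two normalized hippocampal volumes with the normative cutoff. The function and constant names are hypothetical; the cutoff value is the one reported above and is assumed to be in the same unit as the input volumes.

```python
# Fifth percentile of the normative distribution, as reported in the text.
MT_ATROPHY_CUTOFF = 1970.0


def has_mt_atrophy(left_volume, right_volume, cutoff=MT_ATROPHY_CUTOFF):
    """Return True when the smaller hippocampus falls below the cutoff.

    Both volumes are assumed to be already normalized to the mean TIV.
    """
    return min(left_volume, right_volume) < cutoff


# Hypothetical volumes, for illustration only.
print(has_mt_atrophy(2100.0, 1900.0))
print(has_mt_atrophy(2300.0, 2250.0))
```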

White matter lesions

WMLs were rated separately in frontal, parieto-occipital, temporal, infratentorial areas and basal ganglia using the rating scale for age-related white matter changes—ARWMC [44]. The total score ranged from 0 to 30, higher scores indicating higher cerebrovascular burden.

Cerebrospinal fluid analysis

CSF was obtained by lumbar tap between L4 and L5 or L3 and L4 and processed as detailed elsewhere [18]. Levels of Abeta42, total tau, and phospho-tau (p-tau) proteins were determined by commercially available enzyme linked immunosorbent assay (Innogenetics, Belgium). CSF was not available in 26 patients because of refusal (n = 24) or failure to reach the arachnoid space due to osteoarthrosis (n = 2).

Abnormal CSF was defined according to the normative values of Sjogren and colleagues [41] when the level of Abeta42 was lower than 500 pg/ml and the level of total tau was either higher than 450 pg/ml for subjects with an age range between 51 and 70, or higher than 500 pg/ml for subjects with an age range between 71 and 93.
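The age-dependent rule above can be sketched as follows. The function names are hypothetical, ages are assumed to be given in whole years, and the sketch deliberately raises an error outside the age ranges for which the text states a cutoff rather than guessing a value.

```python
ABETA42_CUTOFF = 500.0  # pg/ml, abnormal when lower


def tau_cutoff_for_age(age_years):
    """Return the total tau cutoff (pg/ml) for the stated age ranges."""
    if 51 <= age_years <= 70:
        return 450.0
    if 71 <= age_years <= 93:
        return 500.0
    raise ValueError("no cutoff stated in the text for this age")


def is_csf_abnormal(abeta42, total_tau, age_years):
    """Abnormal CSF requires both low Abeta42 and high total tau."""
    return abeta42 < ABETA42_CUTOFF and total_tau > tau_cutoff_for_age(age_years)


# Hypothetical measurements, for illustration only.
print(is_csf_abnormal(abeta42=420.0, total_tau=480.0, age_years=68))
print(is_csf_abnormal(abeta42=420.0, total_tau=480.0, age_years=75))
```

The same tau concentration can therefore be classified differently in a 68-year-old and in a 75-year-old, which is the point of using age-specific normative values.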

18F-deoxyglucose positron emission tomography

18F-FDG PET imaging of the brain was performed at the Nuclear Medicine Service, Spedali Civili of Brescia, Italy, by a 24-ring 3D PET/CT device, as detailed elsewhere [18]. Fifty-two patients did not undergo 18F-FDG PET because of refusal (n = 25), contraindications (n = 7) or because they had previously undergone a brain perfusion study with 99mTc-ECD SPECT (n = 20).

Herholz’s t sum

FDG uptake was assessed with the automated version (PALZ score of PMOD technologies, http://www.pmod.com) of the t sum score developed by Herholz and colleagues [25] for the diagnosis of AD, combining the virtues of voxel-based parametric mapping with the diagnostic information on brain regions that are typically affected in AD. Briefly, the 18F-FDG PET image of an individual patient is compared to a database of normal controls and the voxel-by-voxel sum of t scores in an AD-pattern mask is computed. Abnormal 18F-FDG PET was defined following the original indications of a t sum higher than 11,090.

Regional metabolism

In order to perform a comprehensive study of the predictivity of metabolic markers, a traditional ROI-based measure of hypometabolism was also studied in the hippocampus, posterior cingulate, and temporoparietal areas. A detailed description of 18F-FDG PET image processing is available as supplemental data. ROI metabolism was computed as the average of the signal intensity within each ROI scaled by mean metabolism in the cerebellum.
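A minimal sketch of the ROI computation described above, assuming the voxel intensities of each region have already been extracted into plain lists. The function name and the numbers are hypothetical; the actual image processing is the one described in the supplemental data.

```python
from statistics import fmean


def roi_uptake_ratio(roi_intensities, cerebellum_intensities):
    """Mean ROI intensity scaled by the mean cerebellar intensity."""
    reference = fmean(cerebellum_intensities)
    if reference <= 0:
        raise ValueError("cerebellar reference must be positive")
    return fmean(roi_intensities) / reference


# Hypothetical voxel intensities, for illustration only.
posterior_cingulate = [6.1, 5.9, 6.3, 5.7]
cerebellum = [4.0, 4.2, 3.8, 4.0]
print(roi_uptake_ratio(posterior_cingulate, cerebellum))
```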

Figure 1 shows examples of MCI patients with normal and abnormal MR and 18F-FDG PET markers.

Fig. 1

MR and 18F-FDG PET markers: instances of MCI patients with normal and abnormal markers

Genetic analysis

Blood samples were available for 83 patients. Genomic DNA was extracted from whole-blood samples according to standard procedures. Apolipoprotein E genotyping (ApoE2, ApoE3, and ApoE4 alleles) was carried out by PCR amplification and HhaI restriction enzyme digestion. The genotype was resolved on 4% Metaphor Gel (BioSpa, Italy) and visualized by ethidium bromide staining [27].

Statistical analysis

Data were analyzed using SPSS V. 13.0. Differences in clinical and marker features were assessed among groups disaggregated by dementia diagnosis at conversion: non-converted MCI (MCI-NC), MCI converted to AD (MCI-AD), and MCI converted to non-AD dementias (MCI-nAD). Group differences were tested with analysis of variance (ANOVA) for continuous variables and χ2 for dichotomous variables. For continuous variables, post hoc pairwise comparisons among groups were performed with the Games–Howell or Tukey test depending on homogeneity of variance tested with Levene’s test. For dichotomous variables, pairwise χ2 tests were performed.

The relationship between marker abnormality and outcome (conversion to AD and non-AD dementias) was explored with Kaplan–Meier curves, plotting the survival probability (i.e. the probability of conversion to dementia) against follow-up time, and differences of survival among groups were tested with the logrank test. In order to control for the effect on the outcome of clinical variables found to be significantly different between converted and MCI-NC, the predictive value of markers was tested in Cox regression models with time of follow-up (right censoring) or patients’ clinical variables (left censoring) as the time scale, and the hazard ratio was computed. The hazard ratio is the predicted change in the hazard for a unit increase in the predictor, with values lower than 1 indicating increased predicted survival with increasing values of the variable.
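For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines of Python. This is an illustration only: the analyses reported here were run in SPSS, the function name and data are hypothetical, and the sketch takes "survival" to mean not yet having converted to dementia at a given follow-up time.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the probability of remaining unconverted.

    times  : follow-up time for each patient (e.g. months)
    events : True if the patient converted at that time,
             False if follow-up ended without conversion (censored)
    Returns a list of (time, survival_probability) at each conversion time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up data, for illustration only.
months = [6, 12, 12, 18, 24]
converted = [True, False, True, False, True]
for t, s in kaplan_meier(months, converted):
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to the end of their follow-up but never lower the curve themselves, which is how the method uses non-converters with different lengths of observation.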

In order to adjust for the fact that conversion to dementia was a continuous and not a dichotomous event, we took a detailed history of the progression of functional impairment, allowing us to determine precisely the date (month and year) of conversion to dementia, that is, the appearance of early functional impairment. Because the exact day of the month was more difficult to determine, we used the 15th day of the month as the default.
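The dating convention can be sketched as follows. The function names, the example dates, and the conversion factor from days to months (average month length of 30.4375 days) are assumptions made for illustration and are not specified in the text.

```python
from datetime import date

AVERAGE_DAYS_PER_MONTH = 30.4375  # assumption: 365.25 / 12


def conversion_date(year, month):
    """Date of conversion, with the day defaulted to the 15th of the month."""
    return date(year, month, 15)


def months_to_conversion(baseline, year, month):
    """Follow-up time in months from baseline to the dated conversion."""
    elapsed_days = (conversion_date(year, month) - baseline).days
    return elapsed_days / AVERAGE_DAYS_PER_MONTH


# Hypothetical patient: baseline visit in June 2006, conversion in June 2008.
print(months_to_conversion(date(2006, 6, 15), 2008, 6))
```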

Missing biomarker data were handled with the case deletion method, i.e. missing values were excluded from the analyses, as is the default in the SPSS statistical package. This was appropriate because the basic assumptions of the method (i.e. a sufficiently large sample size, missing data with no structure or pattern, and data missing at random) were satisfied. However, in order to eliminate any potential bias due to the high number of missing data from 18F-FDG PET and the consequent variation in sample size depending on case availability, we re-ran the survival analyses in those 28 MCI patients (16 not converted and 12 converted to AD) who had all three biomarkers and no missing data.

Lastly, the accuracy of markers to separate MCI-AD from MCI-NC was estimated using receiver operating characteristic (ROC) curves and comparing areas under the curve (AUC) (95% CI) (http://www.medcalc.be).
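As an illustration of what the AUC measures, the sketch below computes it as the probability that a randomly chosen converter has a higher marker value than a randomly chosen non-converter, with ties counted as one half. The function name and data are hypothetical; the confidence intervals and AUC comparisons reported here were obtained with MedCalc and are not reproduced by this sketch.

```python
def roc_auc(scores_converters, scores_non_converters):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    if not scores_converters or not scores_non_converters:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for pos in scores_converters:
        for neg in scores_non_converters:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_converters) * len(scores_non_converters))


# Hypothetical dichotomous marker (1 = abnormal, 0 = normal).
mci_ad = [1, 1, 0]
mci_nc = [0, 0, 1, 0]
print(round(roc_auc(mci_ad, mci_nc), 3))
```

For a dichotomous marker such as those used here, this quantity reduces to the average of sensitivity and specificity.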

A combination of biomarkers was defined as the presence of at least one abnormal marker in a pair: MT atrophy and abnormal CSF, MT atrophy and abnormal PET, or abnormal CSF and abnormal PET. For example, the combination of MT atrophy and abnormal CSF is the presence of MT atrophy, abnormal CSF, or both.
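The combination rule amounts to a logical OR over the two markers of a pair, which can be sketched as below. How patients with one missing marker were classified is not stated in the text. The handling of missing values here (positive if the available marker is abnormal, undetermined otherwise) is purely an assumption of this sketch, and the function name is hypothetical.

```python
def combined_marker(first, second):
    """Combine two markers: abnormal if at least one is abnormal.

    Each argument is True (abnormal), False (normal) or None (not measured).
    Assumption of this sketch: a missing marker paired with a normal one
    leaves the combination undetermined (None).
    """
    if first is True or second is True:
        return True
    if first is None or second is None:
        return None
    return False


# Hypothetical patients: (MT atrophy, abnormal CSF).
print(combined_marker(True, False))
print(combined_marker(False, False))
print(combined_marker(False, None))
```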

Results

Of the 90 MCI patients included in the study, 39 (43%) converted and 51 did not convert to dementia after 24.0 ± 13.9 (SD) months of follow-up. Conversion occurred after 20.6 ± 9.7 months on average and non-converters were followed 26.5 ± 16.0 months. Dementia diagnoses were probable AD (n = 24, 62%), possible AD with subcortical cerebrovascular disease (n = 2, 5%), frontotemporal dementia (FTD) (n = 9, 23%), and Lewy body dementia (LBD) (n = 4, 10%).

MCI-NC, MCI-AD, and MCI-nAD had similar sociodemographic, clinical, and vascular features. The frequency of the APOE4 allele was higher in MCI-AD than in MCI-nAD (p = 0.01 on post hoc comparison). MCI-AD had poorer performance on Rey-Osterrieth Complex Figure recall, and MCI-nAD on phonetic fluency relative to the other groups, but post hoc comparisons were not significant. MCI-AD had poorer performance than MCI-NC on semantic fluency (p = 0.03 on post hoc comparison) (Table 1).

Table 1 Baseline features of 90 MCI patients by conversion status

CSF tau and p-tau of MCI-AD were significantly higher and Abeta42 significantly lower relative to both other groups. MCI-AD showed higher prevalence of MT atrophy relative to MCI-NC (55 vs. 18%, p = 0.002 on post hoc comparison) and higher prevalence of abnormal CSF than both MCI-NC (60 vs. 13%, p = 0.001) and MCI-nAD (60 vs. 7%, p = 0.002). The 18F-FDG PET was abnormal for the majority of the subjects in all the groups (Table 2). Regional glucose metabolism in the hippocampus, temporoparietal lobe, and posterior cingulate cortex was similar among MCI-NC, MCI-AD, and MCI-nAD (1.09 ± 0.10 vs. 1.14 ± 0.13 vs. 1.00 ± 0.19, p = 0.09; 1.32 ± 0.16 vs. 1.35 ± 0.18 vs. 1.24 ± 0.19, p = 0.46; 1.53 ± 0.19 vs. 1.45 ± 0.18 vs. 1.38 ± 0.25, p = 0.30, respectively).

Table 2 Baseline marker features of 90 MCI patients by conversion status

On survival analysis, all the MCI patients with MT atrophy as well as those with abnormal CSF converted to AD, but only 48% of those without MT atrophy and 35% of those without abnormal CSF (p on logrank test = 0.0007 and 0.001, respectively) (Fig. 2a). About 70% of MCI patients with abnormal 18F-FDG PET converted to AD, but also 50% of those without abnormal 18F-FDG PET; survival difference was not significant (Fig. 2a). Prediction of AD conversion improved when positivity to MT atrophy or CSF markers combined was considered, only 15% of those MCI patients negative on both converting to AD (p on logrank test <0.0005) (Fig. 3a).

Fig. 2

Time to outcome (conversion) in the whole group of 90 MCI patients (a) and in those 28 for whom all markers were available (b), disaggregated by presence (dotted line) and absence (solid line) of medial temporal (MT) lobe atrophy, abnormal CSF, and abnormal 18F-FDG PET. MT atrophy was defined as hippocampal volume <fifth percentile based on 125 elderly controls [20], abnormal CSF as Abeta1–42 and total tau based on Sjogren’s cutoffs [41], and abnormal 18F-FDG PET as t sum score >11,090 [25]. N denotes group size with and without abnormal marker

Fig. 3

Time to outcome (conversion) in the whole group of 90 MCI patients (a) and in those 28 for whom all markers were available (b), disaggregated by presence (dotted line) and absence (solid line) of MT atrophy, abnormal CSF or both; MT atrophy, abnormal 18F-FDG PET, or both; and abnormal CSF, abnormal 18F-FDG PET, or both. N denotes group size with and without abnormal marker

Positivity to MT atrophy or CSF markers combined remained a significant outcome predictor even after adjustment for the clinical variable found to be significantly different between MCI-AD and MCI-NC (i.e. semantic fluency) (unadjusted hazard ratio 0.29, p = 0.001; adjusted 0.33, p = 0.005).

Survival analyses were quite similar between a subgroup of 28 MCI patients without missing biomarker data and the entire group of MCI-AD patients, for both single and combined biomarkers (Figs. 2b, 3b, respectively). In particular, when biomarkers combined were considered, the combination of MT atrophy and abnormal CSF significantly improved conversion prediction (p on logrank test 0.002), confirming previous results for the entire group of patients. Markers were not significantly predictive of conversion to non-AD dementias (Fig. 1, see supplemental material).

The accuracy of MT atrophy and abnormal CSF in discriminating MCI-AD from MCI-NC was fair (AUC 0.73, 95% CI 0.57–0.89, and 0.74, 95% CI 0.58–0.89, respectively), but was good when the combination of two markers was considered (AUC 0.82, 95% CI 0.70–0.95) (Fig. 4). However, differences between AUCs of single and of combined markers were not significant (p = 0.14 and p = 0.16), likely due to small group sizes.

Fig. 4

Accuracy of AD markers in discriminating MCI-AD from MCI-NC (whole group of 90 MCI patients)

Discussion

The main finding of this study is that AD markers predict conversion of MCI patients to AD with good accuracy, and that markers do not predict conversion to non-AD dementias. This is the first evidence of validity of the new marker-based criteria for AD in a naturalistic memory clinic population where structural, metabolic, and CSF markers have been studied.

Predictive value of single markers

In our MCI patients, MT atrophy and abnormal CSF significantly predicted conversion to AD, while abnormal 18F-FDG PET did not. The predictive usefulness of MT atrophy for MCI conversion to AD has been well established. In a study of 139 MCI patients followed for a mean of 5 years, smaller hippocampal and entorhinal volumes predicted time of conversion to AD with a relative risk of 2.2–2.5 [11]. In a recent Alzheimer's Disease Neuroimaging Initiative (ADNI) study of 339 MCI patients, the degree of neurodegeneration of MT structures (hippocampal gray matter density, hippocampal and amygdalar volumes, and entorhinal cortical thickness) was the best antecedent MR marker of AD conversion, with decreased hippocampal volume being the most robust [39]. Our finding that MT atrophy was predictive of conversion to AD is well in agreement with the previous literature.

The finding that MT atrophy does not predict conversion to non-AD dementias deserves comment. To our best knowledge, no studies in the literature have evaluated the value of hippocampal volume in predicting conversion to non-AD dementias. However, the poor predictive power in our MCI-nAD patients was partly expected since hippocampal atrophy is less marked in full blown non-AD dementias than in AD [3, 31].

A number of studies have shown that AD features increased tau and p-tau and decreased Abeta42 in the CSF [6] and that MCI patients with incipient AD displayed similar CSF changes [24, 40]. Our results are in line with these studies, in that our MCI-AD patients had CSF marker levels with a typical AD profile relative to MCI-NC patients. Moreover, we found that a combination of elevated tau and decreased Abeta42 was significantly predictive of conversion to AD, in agreement with earlier studies showing that predictivity of conversion was greater when tau and Abeta42 are analyzed together than when those markers are independently analyzed [32]. Which combination of CSF markers yields the best predictivity remains to be established; most studies are considering the combination of Abeta42 and total tau [40, 42], but some Abeta42 and p-tau [32].

Lastly, we found that abnormal CSF was not significantly predictive of conversion to non-AD dementias. This result is expected in view of the relative specificity of the CSF markers we investigated. Indeed, in studies of FTD, LBD, and AD patients, the combination of CSF markers yielded specificity values higher than 85% in distinguishing between groups [5, 38].

Unlike MT atrophy and abnormal CSF, abnormal 18F-FDG PET did not predict conversion to AD in our MCI patients, despite literature data showing the value of temporoparietal hypometabolism in predicting AD [14, 36]. In fact, a large part of both our MCI-AD and MCI-NC patients had abnormal 18F-FDG PET, but only 70% of those with abnormal 18F-FDG PET converted to AD as well as 50% of those patients who had a normal 18F-FDG PET. These findings can be partly explained by the high sensitivity of 18F-FDG PET in detection of functional brain alterations related to the earliest symptoms of AD [13], allowing one to detect a proportion of not yet converted MCI patients. Longer follow-up of MCI-NC patients will clarify if those with abnormal 18F-FDG PET will develop dementia in the future. Moreover, study limitations such as short follow-up and small number of our MCI patients with available 18F-FDG PET might explain the lack of predictivity to AD conversion.

In contrast, the result that the t sum score was not significantly predictive of conversion to non-AD dementias was expected, since this score was developed to identify the pattern of hypometabolism typical of AD and not of other dementias [25]. Moreover, the finding that the ROI-based measure of hypometabolism was also not predictive is in agreement with observations showing that temporoparietal and posterior cingulate cortex hypometabolism might also be present in non-AD dementias, such as FTD [26] and LBD [22].

Predictive value of combined markers

When a combination of markers was considered, we found that having at least one positive marker of the pair MT atrophy and abnormal CSF yielded the best predictive power of conversion from MCI to AD, even in a subgroup of MCI patients with no missing biomarker data. This finding suggests that MR and CSF provide complementary predictive information, in line with a recent ADNI study of 192 MCI patients showing that the combination of a structural abnormality index on MR with the ratio of CSF tau to Abeta42 provided better AD prediction than either marker alone [42]. Moreover, a small longitudinal study of 24 MCI patients showed that combining MT atrophy and CSF markers significantly increased overall prediction of AD from 74 to 84% [8].

Lastly, our findings that MCI-NC and MCI-AD had similar clinical characteristics, and that the predictive value of the combination of MT atrophy and abnormal CSF still held after adjusting for different baseline clinical features, support the idea that markers provide incremental information to clinical assessment.

The proportion of the ApoE4 carriers in MCI-NC and MCI-AD was in line with the literature, showing a range of prevalence from 23 to 44% in non-converted and from 30 to 75% in MCI converted to AD [2, 12, 14]. The result that MCI-NC also had a high proportion of ApoE4 genotype might be explained with the short time of follow-up, preventing us from excluding some of the ApoE4 carriers not yet converted who might convert in the future. On the other hand, several genes other than ApoE might contribute to the development of AD, and then the presence of ApoE4 might be neither sufficient nor necessary to explain all occurrences of disease [2]. Lastly, the small number of patients carrying the ApoE2 allele in our study was insufficient to make any meaningful comparisons regarding the protective role of the ApoE2 allele in progression of MCI to AD.

Use of markers for AD diagnosis in the clinical practice

Some issues need to be solved before marker-based AD criteria can be applied in the clinical practice. First, standardized operational procedures (SOPs) need to be developed to avoid differences in marker measurements across laboratories [21, 43]. Efforts are currently under way to develop international SOPs for a CSF marker assay and MR-based hippocampal volumetry by joint European and US working groups (http://www.centroalzheimer.org/sito/ip_sops_e.php).

A further step toward applicability will involve the operationalization of AD markers into measures that could be collected relatively easily and consistently in AD diagnostic centres. New criteria suggested measuring MT atrophy with visual scales or quantitative volumetry of MT lobe structures [15]. Even though the use of visual rating is appealing in clinical settings because of high feasibility, in the present study we chose to adopt hippocampal volumetry for its greater accuracy [7].

Similar considerations apply to 18F-FDG PET. The proposal for new diagnostic criteria does not provide indications for operationalization, and the literature offers a whole range of measurement options, from visual rating scales [18] to category allocation algorithms [25]. In the present study we chose the latter method for its high feasibility and operator independence.

Study limitations

This study had some limitations. First, not all AD markers were available for all patients, decreasing the power of some analyses. However, missing markers were due to various clinical and organizational hurdles (such as fear of and refusal to undergo lumbar puncture, distance from a memory clinic, or physical contraindications), unavoidable under real-life conditions. According to this perspective, the present results could, with all due caution, be generalized to other clinical contexts more confidently than purely experimental studies such as the ADNI. Further limitations of this study are the relatively short follow-up period and the lack of gold standard in AD diagnosis. Indeed, longitudinal studies lasting at least 5 years are needed to assess the natural history of patients with MCI [4] and clarify whether all patients with AD markers eventually progress to dementia. Moreover, AD diagnosis of our MCI patients was made clinically and lacked neuropathological confirmation. Because of the low specificity of the NINCDS-ADRDA criteria for AD, we cannot exclude the co-occurrence of neurodegenerative conditions other than AD in our MCI-AD patients.