Introduction

In recent years, many studies have demonstrated that amyloid biomarkers such as cerebrospinal fluid markers or specific ligands using PET can be useful in the accurate diagnosis of Alzheimer’s disease (AD) [1]. Numerous studies have involved the use of PET tracers that bind to amyloid plaques in vivo [2–4]. Among them, 18F-AV45 (Amyvid™) is a radiopharmaceutical that shows cortical uptake closely linked to AD pathology [5, 6]. This compound has been widely used with good performance in differentiating AD patients from cognitively normal subjects [7–9] and has recently been approved for use by the Food and Drug Administration (April 2012) and the European Medicines Agency (January 2013).

However, amyloid PET imaging suffers from some limitations. One of them is the high prevalence of amyloid positivity in normal older individuals. Population-based studies are only beginning to be reported, but estimates of age-specific positivity rates for amyloid PET are less than 5 % in subjects 50 to 60 years old, 10 % in those 60 to 70 years old, 25 % in those 70 to 80 years old, and more than 50 % in those 80 to 90 years old [10, 11]. This age-related amyloid positivity is the subject of much work in progress, and some authors have suggested that “amyloid positives” may be totally asymptomatic but could have an increased risk of developing symptomatic AD [12]. Another limitation of PET amyloid imaging is the variation reported in the level of inter-reader agreement in previous studies (with a Fleiss kappa coefficient between 0.68 and 0.98) [5, 7, 13, 14]. Such variability has led the radiopharmaceutical companies to introduce special educational reading programmes, which have greatly improved inter-rater agreement. However, some equivocal findings continue to be observed. It would be interesting to know whether the cognitive performance of subjects with equivocal scans is similar to the cognitive pattern in amyloid-positive or amyloid-negative subjects. Amyloid imaging results alter physicians’ diagnostic thinking, intended testing and management of patients undergoing evaluation for cognitive decline [15]. No study has sought to determine whether equivocal scans are associated with a specific pattern of clinical signs.

The main objective of our study was to analyse the specific clinical patterns of such visual equivocal scans, based on a sample of older adults, aged 70 years and over, living in the community without any clinical sign of dementia. In addition, we aimed to assess interobserver and intercentre agreement with semiquantitative analysis.

Material and methods

MAPT trial

All subjects enrolled in this neuroimaging PET study came from the Multidomain Alzheimer Preventive Trial (MAPT) [16–18], a multicentre, randomized, placebo-controlled study. The MAPT study is a large long-term trial specifically designed to test whether omega-3 fatty acid supplementation in combination with a multidomain intervention consisting of nutritional counselling, physical exercise and cognitive stimulation, is effective in slowing cognitive decline in frail older adults at risk of cognitive decline. The protocol is registered on a public-access clinical trial database (www.clinicaltrials.gov, no. NCT01513252). The members of the MAPT study group are listed in the Appendix. This ancillary MAPT study protocol was approved by the French Ethics Committee in Toulouse in December 2007.

Subjects

Included subjects were aged 70 years or older and fulfilled one of the three following clinical criteria: (1) spontaneous memory complaint expressed to a general practitioner, (2) limitation in one instrumental activity of daily living (e.g. the ability to use the telephone, shop, prepare meals, do housekeeping, do the laundry, use transportation, follow a medication schedule, or manage money) [19, 20], (3) slow walking speed (speed ≤0.8 m/s, i.e. at least 5 s to walk 4 m). Subjects with dementia, cognitive impairment, limitation in basic activities of daily living (bathing, dressing, toileting, transferring, continence, eating) or suffering from severe depression were not included in the trial. Inclusion and noninclusion criteria for the MAPT study and PET ancillary studies are presented in Table 1. Written, informed consent was obtained from all participants.

Table 1 Inclusion and exclusion criteria for the MAPT trial

The first randomization was on 30 May 2008, and the targeted number of randomized participants was reached on 24 February 2011. A total of 1,680 participants were recruited in the MAPT centres. The amyloid PET ancillary study was proposed to subjects enrolled in the five centres close to a nuclear medicine department that could offer PET amyloid imaging (Bordeaux, Limoges, Montpellier, Nice and Toulouse).

Clinical data

Clinical visits were scheduled every 6 months to assess physical condition, diseases and corresponding treatments, and adherence to multidomain intervention. Cognitive and functional assessments were conducted at baseline, at 6 months, and annually at 1, 2 and 3 years by independent research staff blinded to the interventions. All the assessments were performed by hospital practitioner memory experts. A series of neuropsychological tests was administered for cognitive assessment. These included the Free and Cued Selective Reminding Test (FCSRT, episodic memory/recall) [21], the Controlled Oral Word Association Test and Category Naming Test (COWAT and CNT, verbal fluency) [22], the Digit Symbol Substitution Subtest of the Wechsler Adult Intelligence Scale–Revised (attention and executive function) [23], the Trail-Making Test (TMT, switching) [24], the Mini-Mental State Examination (MMSE) [25], and the Clinical Dementia Rating Scale (CDR). Two visual analogue scales were also administered to assess memory functioning and the consequences of memory impairment in everyday life. In addition, functional assessment included the Alzheimer Disease Cooperative Study–Activities of Daily Living Prevention Instrument (ADCS-ADL, dependency) and the Short Physical Performance Battery (SPPB) to evaluate functional capacities [26]. Frailty was evaluated using the classification system proposed by Fried et al. [27], based on assessments of grip strength, timed walking, unintentional weight loss, fatigue, and physical activity. Comorbid depression was assessed with the Geriatric Depression Scale-15 items (GDS) [28].

PET imaging

PET scans were performed as close as possible to a clinical visit during the 3 years of follow-up in each patient. Subjects were examined using five different hybrid PET/CT scanners, including one PET/CT 690 (GE Healthcare), one Discovery RX VCT (General Electric), two True Point HiRez (Siemens Medical Solutions) and one Biograph 4 Emission Duo LSO (Siemens Medical Solutions). All tomographs operated in 3D detection mode. All PET sinograms were reconstructed with an iterative algorithm, with corrections for random coincidences, scatter, photon attenuation and decay, which produced images with an isotropic voxel of 2 × 2 × 2 mm and a spatial resolution of approximately 5 mm full-width at half-maximum at the centre of the field of view. The acquisition data were processed using the standard package delivered with each acquisition system. All cerebral emission scans were begun 50 min after injection of a mean of 4 MBq/kg weight of 18F-AV45. In each subject, 10-min or 15-min frames were acquired to ensure movement-free image acquisition.

Image analysis

Visual reading

All amyloid scans were sent to the coordinator centre. 18F-AV45 PET images were visually assessed by a panel of three independent observers, who were specialists in molecular imaging and blinded to all clinical and diagnostic information. Each of the observers, who remained the same throughout the study, had interpreted more than 200 amyloid scans before this study. Prior to performing this assessment, the observers underwent a half-day training session on a training set provided by Avid Radiopharmaceuticals/Eli Lilly and Company. Briefly, the observers used a binary scale to classify each scan as 0 if there was no significant florbetapir cortical retention (clear grey/white matter contrast) or as 1 if there was some significant florbetapir cortical retention (two or more brain areas in which there was reduced or absent grey/white matter contrast or one or more areas in which grey matter radioactivity was intense) as previously described [13].

Semiautomated quantitative analysis

In addition to the visual readings of scan images, semiautomated quantitative analysis (cortical-to-cerebellar ratio of regional mean standardized uptake values, SUVr) was applied using the mean signal of six predefined anatomically relevant cortical regions of interest (frontal, temporal, parietal, precuneus, anterior cingulate, and posterior cingulate) with the whole cerebellum used as the reference region. For this procedure, the 18F-AV45 PET images were coregistered to the 18F-AV45 template provided by Avid Radiopharmaceuticals and previously published [14]. A quality control based on a semiquantification process was also provided by Avid. Based on the literature, the positivity threshold for amyloid PET was set at SUVr >1.17 [8].
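
The SUVr computation described above can be illustrated with a minimal sketch. It assumes that the mean uptake of each region has already been extracted from the coregistered image, and that the six cortical regions are averaged with equal weight (the weighting is not specified in the text). The function names and dictionary keys are hypothetical and are not part of the Avid processing package.

```python
from statistics import mean

# The six cortical regions named in the text; the key spellings are illustrative.
CORTICAL_REGIONS = (
    "frontal", "temporal", "parietal",
    "precuneus", "anterior_cingulate", "posterior_cingulate",
)
SUVR_THRESHOLD = 1.17  # positivity threshold quoted in the text [8]


def mean_cortical_suvr(regional_uptake, cerebellum_uptake):
    """Ratio of the mean cortical uptake to the whole-cerebellum uptake.

    Assumption: an unweighted mean over the six cortical regions.
    """
    if cerebellum_uptake <= 0:
        raise ValueError("cerebellum uptake must be positive")
    cortical = mean(regional_uptake[region] for region in CORTICAL_REGIONS)
    return cortical / cerebellum_uptake


def is_amyloid_positive(suvr, threshold=SUVR_THRESHOLD):
    """A scan is called positive only when SUVr is strictly above the threshold."""
    return suvr > threshold
```

With this convention a scan whose mean cortical SUVr equals exactly 1.17 is counted as negative, because the threshold is stated as a strict inequality.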

Data analysis

Classification of scans

The PET scan visual analysis results were classified into three classes: (1) “positive PET” if the three observers agreed there was some significant 18F-AV45 cortical retention, (2) “negative PET” if the three observers agreed there was no significant 18F-AV45 cortical retention, and (3) “equivocal PET” (EP) if there was no consensus among the three observers. After debate, the three observers reached a consensus and each EP scan was reclassified as positive (EP+) or negative (EP−).
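
The three-class rule above can be written as a short sketch. It assumes each observer's read is coded 0 or 1 as in the visual reading scale; the function names and the ASCII label "EP-" are illustrative choices, not taken from the study.

```python
def classify_visual_reads(reads):
    """Map three binary reads (0 = negative, 1 = positive) to a scan class."""
    if len(reads) != 3 or any(read not in (0, 1) for read in reads):
        raise ValueError("expected exactly three reads coded 0 or 1")
    positives = sum(reads)
    if positives == 3:
        return "positive"   # unanimous positive
    if positives == 0:
        return "negative"   # unanimous negative
    return "equivocal"      # any disagreement among the three observers


def reclassify_equivocal(consensus_read):
    """Label an equivocal scan after the consensus read (0 or 1)."""
    if consensus_read not in (0, 1):
        raise ValueError("consensus read must be 0 or 1")
    return "EP+" if consensus_read == 1 else "EP-"
```

For example, reads of (1, 1, 0) give "equivocal", and a subsequent consensus read of 1 labels that scan "EP+".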

Inter-rater agreement

We assessed pairwise (2 × 2) and overall concordance using kappa coefficients, in the entire population and by PET system, both among the three observers and between visual analysis and quantitative analysis.
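
As an illustration of these concordance measures, the sketch below computes a kappa for one pair of raters and an overall kappa for several raters. It assumes Cohen's kappa for the pairwise case and Fleiss' kappa for the overall case; the text only says "kappa coefficients", so this pairing is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def _kappa(observed, expected):
    """Chance-corrected agreement from observed and expected agreement."""
    if expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


def cohen_kappa(reads_a, reads_b):
    """Pairwise agreement between two raters over the same scans."""
    if len(reads_a) != len(reads_b) or not reads_a:
        raise ValueError("both raters must read the same non-empty set of scans")
    n = len(reads_a)
    observed = sum(a == b for a, b in zip(reads_a, reads_b)) / n
    freq_a, freq_b = Counter(reads_a), Counter(reads_b)
    # Expected agreement if the two raters read independently.
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n**2
    return _kappa(observed, expected)


def fleiss_kappa(reads_per_scan):
    """Overall agreement; reads_per_scan holds one list of reads per scan."""
    if not reads_per_scan:
        raise ValueError("at least one scan is required")
    n_scans = len(reads_per_scan)
    n_raters = len(reads_per_scan[0])
    if n_raters < 2 or any(len(reads) != n_raters for reads in reads_per_scan):
        raise ValueError("every scan needs the same number (>= 2) of reads")
    category_totals = Counter()
    per_scan_agreement = 0.0
    for reads in reads_per_scan:
        counts = Counter(reads)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this scan.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        per_scan_agreement += agreeing / (n_raters * (n_raters - 1))
    observed = per_scan_agreement / n_scans
    total_reads = n_scans * n_raters
    expected = sum((t / total_reads) ** 2 for t in category_totals.values())
    return _kappa(observed, expected)
```

The same `cohen_kappa` function could be applied to a visual read and the thresholded SUVr result for each scan, since both are binary labels over the same set of scans.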

Clinical patterns according to PET scan result assessed by visual reading

We compared the clinical characteristics of subjects according to their visual PET scan result in four comparisons: (1) comparison of the three groups of subjects (negative PET, positive PET, EP), (2) comparison of the EP group with the negative PET and positive PET groups, (3) comparison of the EP subgroups after consensus reclassification with scans unanimously classified as positive or negative (i.e. EP− vs. negative PET, and EP+ vs. positive PET), (4) comparison of the EP subgroups (i.e. EP− vs. EP+).

We used the chi-squared test or Fisher’s exact test (for expected values <5) for categorical variables, one-way analysis of variance for quantitative variables with a normal distribution (Fisher tests), and nonparametric tests (Kruskal-Wallis test) for quantitative variables without a normal distribution. For variables that had a global comparison p value <0.05, we compared the characteristics of the EP group with the negative PET and positive PET groups using univariate polytomous regression for categorical variables and univariate linear regression or Kruskal-Wallis tests for continuous variables. In the absence of a normal distribution, variables were tested after transformation to square root or logarithmic values to obtain normal distributions. P values were based on two-sided tests and were considered statistically significant if <0.05, and <0.025 for subgroup analyses to take into account the multiple comparisons. Analyses were performed using SAS software version 9.3 (SAS Institute, Cary, NC).
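
Two of the decision rules in this paragraph can be sketched in a few lines: the choice between the chi-squared test and Fisher's exact test from the expected cell counts, and the significance thresholds. The sketch is illustrative only (the analyses themselves were run in SAS), the function names are hypothetical, and applying Fisher's exact test as soon as any expected count is below 5 is an assumed reading of the rule.

```python
def expected_counts(table):
    """Expected cell counts of a contingency table under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("the table contains no observations")
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def choose_categorical_test(table):
    """Pick Fisher's exact test when any expected count is below 5."""
    small = any(value < 5 for row in expected_counts(table) for value in row)
    return "fisher_exact" if small else "chi_squared"


def is_significant(p_value, subgroup=False):
    """Two-sided threshold: 0.05 overall, 0.025 for subgroup analyses."""
    return p_value < (0.025 if subgroup else 0.05)
```

The 0.025 threshold corresponds to dividing 0.05 by two, which is what a Bonferroni-style correction would give for a pair of subgroup comparisons.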

Results

Population

Of the 1,680 participants in the MAPT study, 271 subjects were enrolled in this ancillary study: at baseline (10 subjects), 6 months (51 subjects), 12 months (94 subjects), 24 months (111 subjects) and 36 months (5 subjects). The mean time between clinical evaluation and PET scan was 2.55 ± 1.42 (range 0.0 – 9.2). At baseline, subjects who participated in the ancillary study were significantly younger (on average 74.74 ± 4.26 years vs. 75.46 ± 4.44, p = 0.010), had a slightly higher MMSE score (28.28 ± 1.50 vs. 28.03 ± 1.61, p = 0.018), had a lower TMT-B test score (112.93 ± 42.95 vs. 124.56 ± 66.53, p = 0.017), and had a lower total GDS score (2.75 ± 2.26 vs. 3.36 ± 2.68, p < 0.0001) than subjects included in the MAPT study who did not participate in this ancillary study. The demographics of the included subjects and their cognitive performance at the time of the scan are presented in Table 2.

Table 2 Main characteristics of the study population at the time of the PET scan according to the amyloid result

Agreement

Based on visual analysis, 77 subjects (28.41 %) had a positive scan for amyloid, 158 (58.30 %) a negative scan, and 36 (13.28 %) an equivocal scan. After reaching a consensus, these equivocal scans were rated EP+ in 14 subjects (38.89 %) and EP− in 22 subjects (61.11 %). The global agreement among the three observers was excellent (kappa 0.80, standard error 0.035). Kappa values for agreement among the three observers ranged from 0.42 to 0.87 in the five study centres.

Based on semiquantitative analysis (positive PET with mean cortical SUVr >1.17), 103 subjects (38.01 %) had a positive PET scan and 168 (61.99 %) a negative PET scan. The agreement between semiquantification values (mean cortical SUVr) and the visual analysis of the three observers was substantial for all the observers (kappa 0.61, 0.63 and 0.64, respectively). The kappa for agreement between semiquantification values and the visual assessment by consensus was 0.63 and the proportion of scans with observer disagreement was 17 %.

Clinical patterns of equivocal scans

Subjects with positive PET, negative PET and EP scans significantly differed in terms of gender, intervention, verbal episodic memory assessment (FCSRT total free recall score), activities of daily living functioning (ADCS-ADL score) and mean cortical SUVr (Table 2). Subjects with an EP scan were mostly men (58 %), younger than 75 years (53 %), had a high level of education (39 % had more than 12 years of formal education) and exhibited good physical performance (89 % had a SPPB score ≥10; Table 2). The subjects with an EP scan significantly differed from subjects in both the positive and negative PET groups in terms of gender and mean cortical SUVr. The subjects with an EP scan had intermediate scores between those of subjects in the positive and negative PET groups for MMSE, FCSRT, TMT-B, CNT and ADCS-ADL (Table 2).

The 158 subjects with negative PET scans (assessed by three raters) and the 22 subjects with an EP− scan significantly differed in terms of gender (p = 0.019) and SPPB score (p = 0.028). The characteristics of the subjects with an EP scan according to their PET classification after visual consensus (EP+ and EP−) are presented in Table 3. There was no significant clinical difference between the 77 subjects with positive PET scans (assessed by three raters) and subjects with an EP+ scan. However, mean cortical SUVr was significantly different between the negative and EP− groups (p = 0.006), and between the positive and EP+ groups (p = 0.007; Fig. 1).

Table 3 Characteristics of 36 subjects with an equivocal PET scan according to their PET classification after visual consensus (EP− or EP+)

Fig. 1 Distributions of cortical SUVr (from semiautomated quantitative analysis) in the different visual analysis groups. Equivocal PET scans were reclassified as negative (EP−) or positive (EP+). Data from individual subjects are shown. Bottom of the boxes first quartile, top of the boxes third quartile, central lines second quartile (median), extremities minimum and maximum values. *p < 0.001, comparisons among the three groups of subjects; p = 0.006 and p = 0.007, 2 × 2 comparisons, Kruskal-Wallis test

The Z-values for cognitive performance using subjects with negative PET scans as the reference group are presented in Supplementary Table 1.
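
For readers unfamiliar with this kind of standardization, a minimal sketch follows. It assumes that each Z-value is the score minus the mean of the negative PET group, divided by that group's sample standard deviation; the exact formula used for the supplementary table is not given in the text, and the function name is hypothetical.

```python
from statistics import mean, stdev


def z_scores(scores, reference_scores):
    """Standardize scores against a reference group (here, negative PET scans).

    Assumption: the sample standard deviation of the reference group is used.
    """
    if len(reference_scores) < 2:
        raise ValueError("the reference group needs at least two scores")
    ref_mean = mean(reference_scores)
    ref_sd = stdev(reference_scores)
    if ref_sd == 0:
        raise ValueError("the reference group has no variability")
    return [(score - ref_mean) / ref_sd for score in scores]
```

A Z-value of 0 then means a score equal to the mean of the negative PET group, and the sign of the value has to be read in light of whether a higher raw score indicates better or worse performance on the test concerned.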

Discussion

This study explored for the first time the clinical patterns of “equivocal” amyloid PET scans in a large population of nondemented older adults aged 70 years and over. Our study had three main findings. First, 28 % of subjects had a positive amyloid scan based on visual analysis and 13 % had equivocal scans. Second, the global agreement among the three observers was excellent. Third, the clinical characteristics of subjects with equivocal scans did not significantly differ from those with positive or negative scans but seemed to be intermediate. Likewise, the amyloid load of subjects with equivocal scans was intermediate between the amyloid loads of subjects with positive and negative scans. After reclassification of equivocal scans, the mean cortical SUVr remained significantly different between the negative and EP− groups and between the positive and EP+ groups. A high prevalence of amyloid positivity as observed in this study has also been reported in the elderly. It reached 28 % in our population with a mean age of 75 years, and in previous studies the percentage was around 25 % in subjects aged 70 to 80 years [10, 11].

The global agreement among the three observers is in accordance with previously reported data and supports the reliability of our visual assessment methodology. Another interesting result was that the agreement varied with the PET centre (kappa varying between 0.42 and 0.87). The worst agreement was observed in the centre with the oldest PET scanner (installed in 2004). Image quality may therefore be a major factor explaining misclassification in amyloid PET interpretation. The agreement between semiquantification values (mean cortical SUVr, threshold set at 1.17) and the visual analysis of each of the three observers (kappa 0.61, 0.63 and 0.64) was lower, as was the agreement between the semiquantification values and the consensus visual assessment (kappa 0.63). This relative discordance between semiquantification and visual assessment may be explained by some limitations. First, in a recent communication based on 250 amyloid scans, Klein et al. have demonstrated that concordance between quantitative SUVr methods and visual assessment is highest for methods using MRI data and cerebral white matter reference regions [29]. Using the white matter or brainstem as a reference may provide superior performance. Those results are in line with those of a recent study by our group [30]. In this multicentre study, MRI of the subjects was not available and the semiquantification was based on the cerebellum. Another explanation may be the fact that semiquantification is based on a mean cortical value, in contrast to visual assessment, which is based on regional approaches [5]. These results suggest that quantification methods in development should involve a specific regional approach rather than a single global mean value.

Verbal episodic memory assessed by FCSRT (free recall, delayed total recall) scores was associated with the results of visual assessment of scans, suggesting that cognitive function could be related to amyloid load in a nondemented population. Such a relationship between regional amyloid-beta deposition and episodic memory deficits in the presymptomatic stage of AD has already been described by the Australian Imaging, Biomarkers and Lifestyle (AIBL) group [31, 32]. The EP group significantly differed from both the positive and negative groups in terms of gender. There was no significant difference between the EP group and the other two groups in terms of functional and cognitive parameters. However, the EP group seemed to be an intermediate cognitive group (for episodic verbal memory and executive functions), and amyloid load based on mean SUVr was intermediate and corresponded to the threshold between positive and negative PET (1.17).

We sought to determine if the EP group corresponded to the combination of the two subgroups EP+ and EP−, similar to the positive and negative groups, respectively, or to one homogeneous group with its own specific cognitive patterns and mild amyloid load. After the consensus determination, the EP+ and positive groups and the EP− and negative groups did not differ from a neuropsychological point of view. In contrast, amyloid load, based on mean SUVr, differed among the groups. Subjects in the equivocal group had an intermediate amyloid load which could explain the difficulty in interpretation encountered by the three raters.

This study suffered from some limitations. First, the delay between clinical assessment and PET acquisition could have reached 3 months. However, a recent study has demonstrated that the kinetics of amyloid deposition are probably very slow [32], and this delay should not have interfered significantly with our results. Second, MRI scans were not available in this study. However, since PET template-based quantification seems adequate for clinical use to discriminate controls from patients with early AD, the use of MRI-based cortical quantification to avoid nonspecific white matter binding reported in healthy subjects as well as in patients with AD seemed not to be necessary [33]. Third, the population of this ancillary study was cognitively heterogeneous. The lack of accuracy in the diagnosis of mild cognitive impairment is a limitation. Last, the small number of subjects with equivocal scans prevented multivariate analysis of associated clinical factors and limited the comparison of subjects after reclassification of the equivocal scans (EP+ or EP−) with subjects of the other groups.

Equivocal PET scans could be a neuroimaging entity representing intermediate amyloid load and without any specific neuropsychological patterns. The results of this cross-sectional study could justify a clinical follow-up to assess the cognitive evolution of subjects with equivocal PET scans.