Introduction

Parkinson’s disease (PD) is a progressive neurodegenerative disorder affecting dopaminergic neurotransmission that results in resting tremor, bradykinesia, and rigidity [1]. Patients with PD over 50 years of age numbered 4.1–4.6 million in 2005, and this number is predicted to double to 8.7–9.3 million by 2030 due to the overall aging of the population [2]. The main pathological characteristic of PD is a loss of dopaminergic neurons in the substantia nigra, with a subsequent decrease in striatal dopamine concentrations [3]. As the clinical diagnosis of the early-stage disease is still subject to limitations [4, 5], functional imaging of the dopamine transporter (DAT) has been used to confirm or exclude a diagnosis, assess disease severity and monitor disease progression.

18F-labeled N-(3-fluoropropyl)-2β-carbomethoxy-3β-(4-iodophenyl) nortropane ([18F]FP-CIT) is a high-affinity cocaine analog that has been developed for imaging DAT binding in vivo [6]. [18F]FP-CIT has been approved by the Korean Food and Drug Administration and has already been incorporated into clinical practice in several Korean and European hospitals. [18F]FP-CIT benefits from high affinity (Ki = 3.5 nM) and selectivity for DAT, relatively fast kinetics, and a nonradioactive nor-beta-CIT metabolite, making it suitable for positron emission tomography (PET) neuroimaging, where the excellent spatial resolution and sophisticated attenuation correction allow for rapid quantification of DAT binding [6].

The demonstration of a reproducible [18F]FP-CIT PET outcome measure is particularly important when accurate quantification of DAT is essential for purposes such as evaluating disease progression or the therapeutic effects of neuroprotective interventions. The test–retest reproducibility of various dopaminergic tracers has been reported, but most reproducibility studies using healthy controls have included young people in their 20s or 30s [7,8,9]. The mean variabilities and intraclass correlation coefficients (ICCs) were 7.63 ± 5.01% and 0.96 for [99mTc]-TRODAT-1 SPECT [9], 5.70 ± 3.30% and 0.95 for [123I]FP-CIT SPECT [10], and 6.11 ± 3.68%–11.60 ± 5.70% and 0.79–0.80 for [11C]MNPA/[11C]raclopride PET, respectively [11]. In patients with PD, the mean variabilities and ICCs were 7.84 ± 8.94% and 0.95 for [123I]FP-CIT SPECT [12] and 10.20 ± 6.17% and 0.95 for [99mTc]-TRODAT-1 SPECT, respectively [13]. Differences in reproducibility across these studies may have been influenced by differences in age and disease severity, imaging modalities, half-lives of the radioisotopes, tracer kinetics, image processing methods, selection of target and reference regions, and time intervals between test and retest imaging. To the best of our knowledge, no study has reported the test–retest reproducibility of quantitative measures of [18F]FP-CIT PET. We therefore evaluated the test–retest reproducibility of [18F]FP-CIT PET measures of DAT density in patients with essential tremor (ET) and PD. We also investigated whether the reproducibility of [18F]FP-CIT PET measures depended on acquisition time or analysis method.

Materials and methods

Subjects

This prospective study, which was conducted at a single tertiary academic center, was approved by the Asan Medical Center institutional review board (2018-0183), and all patients signed an informed consent form. Between April 2018 and October 2018, 15 patients with probable ET (13 females and 2 males; mean age ± standard deviation (SD), 68.9 ± 6.6 years) and 10 patients with PD (4 females and 6 males; mean age ± SD, 70.5 ± 6.3 years; Hoehn and Yahr (H&Y) stage 2.3 ± 0.8) were enrolled. The diagnoses of ET and PD were based on the Consensus Statement of the Movement Disorder Society on Tremor [14] and the UK Parkinson’s Disease Society Brain Bank Clinical Diagnostic Criteria [15], respectively. The following exclusion criteria were applied: (a) hypersensitivity or allergy to [18F]FP-CIT; (b) history of alcohol or drug abuse; (c) history of stereotactic surgery; and (d) the use of cocaine, benztropine, amphetamines, phentermine, mazindol, or methylphenidate. Each subject underwent two sequential [18F]FP-CIT PET/CT scans at an interval of 48 ± 7 days. All subjects also underwent MRI (average time interval between MRI and test PET, 44 ± 14 days).

[18F]FP-CIT PET/CT

[18F]FP-CIT was synthesized as previously described [16]. All [18F]FP-CIT PET scans were performed with a Biograph Truepoint 40 PET/CT scanner (Siemens, Knoxville, TN, USA) at 90 min and 3 h after the intravenous injection of [18F]FP-CIT (dose: 5.01 ± 0.30 mCi (test), 4.76 ± 0.23 mCi (retest)/specific activity: 5661.83 ± 4670.02 Ci/mmol (test), 5592.86 ± 3848.35 Ci/mmol (retest)). PET images were acquired for 10 min in 3-dimensional mode after the brain CT. CT scanning was performed at 120 kVp and 228 mAs with a slice thickness of 1.5 mm. [18F]FP-CIT PET images were reconstructed with a TrueX algorithm and an all-pass filter with a 336 × 336 matrix.

Image processing

Volumes of interest (VOIs) were applied to the PET images in two different ways: a T1-MRI template-based normalization (automatic) method and a fixed-VOI (manual) method. Statistical parametric mapping 12 software (SPM12; Wellcome Trust Centre for Neuroimaging, London, UK) implemented in MATLAB R2013a for Windows (The MathWorks Inc.) and MRIcro version 1.40 (Chris Rorden, Columbia, SC; http://www.mccauslandcenter.sc.edu/crnl/) were used for the preprocessing with both methods. First, the DICOM files of the reconstructed [18F]FP-CIT PET images and corresponding T1 MRIs were converted into NIfTI file format using MRIcro. Then, the [18F]FP-CIT PET images were coregistered onto the corresponding T1-weighted MRI images.

For the T1-MRI template-based normalization (automatic) method, individual T1-weighted MRI images were spatially normalized onto the Montreal Neurological Institute (MNI) T1 template using SPM12. The coregistered PET images were then spatially normalized using the spatial transformation parameters estimated for the T1 MRI. In this way, we could conduct quantitative analyses based on VOIs defined on a template in standard space. VOIs were defined for the left and right caudate, anterior putamen, and posterior putamen, and an occipital VOI was defined as the reference region (Fig. 1a). The VOI volumes were 2.46 cm3 and 2.55 cm3 for the left and right caudate, 1.70 cm3 and 1.68 cm3 for the left and right anterior putamen, 1.58 cm3 and 1.88 cm3 for the left and right posterior putamen, and 3.18 cm3 for the occipital cortical region. Using the template VOIs and normalized PET images, the ratios of striatal to occipital uptake (i.e., standardized uptake value ratio; SUVR) were defined as follows: (mean standardized uptake value [SUV] of substriatal VOI)/(mean SUV of occipital VOI). The left and right SUVR values were averaged.
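The SUVR definition above can be illustrated with a minimal Python sketch. It assumes the mean SUV of each VOI has already been extracted from the normalized PET image; the function names and the numeric values are hypothetical and are not taken from the study data or software.

```python
def suvr(target_mean_suv: float, reference_mean_suv: float) -> float:
    """Ratio of the mean SUV in a substriatal VOI to the mean SUV in the occipital VOI."""
    if reference_mean_suv <= 0:
        raise ValueError("reference mean SUV must be positive")
    return target_mean_suv / reference_mean_suv


def bilateral_suvr(left_mean_suv: float, right_mean_suv: float,
                   occipital_mean_suv: float) -> float:
    """Average of the left and right SUVRs, as described in the text."""
    left = suvr(left_mean_suv, occipital_mean_suv)
    right = suvr(right_mean_suv, occipital_mean_suv)
    return (left + right) / 2.0


# Hypothetical mean SUVs for one substriatal region and the occipital reference.
example = bilateral_suvr(left_mean_suv=4.0, right_mean_suv=6.0, occipital_mean_suv=2.0)
```

With these illustrative inputs the left and right SUVRs are 2.0 and 3.0, so the averaged SUVR is 2.5.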

Fig. 1

VOIs overlaid on coregistered PET/MR images from an individual patient with ET using a T1-MRI template-based normalization method (a) and manual fixed-VOI method (b). In the T1-MRI template-based normalization method (a) using MRI-based spatial normalization, the positions of the template VOIs were not manually adjusted. In the manual fixed-VOI method (b), each VOI was manually translated without changing the VOI size to match the corresponding structures on the coregistered PET/MR images

For the fixed-VOI (manual) method, the aforementioned preprocessed NIfTI files of coregistered PET and corresponding MRI data were analyzed using PMOD 3.5 software (PMOD Technologies, Zurich, Switzerland). The predefined fixed VOIs for the aforementioned subregions were loaded on the fused PET/MR image on the transaxial slice where the striatum was best visualized (Fig. 1b). The volume of each VOI was 1.12 cm3 for each substriatal region and 12.68 cm3 for the occipital cortical region. To eliminate interoperator variation, all VOIs were applied by one operator, and subject-specific individual VOIs were saved and used for both the test and retest PET images. SUVRs were calculated in the same manner as for the automatic method.

For both methods, retest PET images were realigned to the corresponding test PET images. The other processes were the same as those described above.

Reproducibility analysis

The test–retest reproducibility was evaluated by calculating the bias, variability, percent test–retest (PTRT), within-subject coefficient of variation (WSCV), repeatability coefficient (RC) and ICC [17]. The test–retest bias was calculated as the difference between the two measurements, expressed as a percentage of the first measurement: 100 × (retest − test)/test. The test–retest variability was the SD of the bias. The PTRT was calculated as follows [17]:

$$\mathrm{PTRT}=\frac{1}{n}\sum_{i=1}^{n}\left|2\frac{{y}_{i2}-{y}_{i1}}{{y}_{i2}+{y}_{i1}}\right|,$$

where n is the number of subjects in the test–retest study and yi1 and yi2 are the estimated PET outcome measures obtained for a particular region in the ith subject in the test and retest scans, respectively.
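As an illustration of the bias, variability, and PTRT definitions above, the following Python sketch computes them from paired test and retest values. It is a sketch under stated assumptions: the sample SD (n − 1 denominator) is assumed for the variability, PTRT is returned as a fraction exactly as the equation is written (multiply by 100 to express it as a percentage), and the function names and input values are hypothetical.

```python
from statistics import mean, stdev


def percent_bias(test, retest):
    """Per-subject bias: 100 * (retest - test) / test."""
    return [100.0 * (r - t) / t for t, r in zip(test, retest)]


def variability(test, retest):
    """SD of the per-subject bias (sample SD is an assumption of this sketch)."""
    return stdev(percent_bias(test, retest))


def ptrt(test, retest):
    """Mean absolute difference relative to the mean of the two scans (as a fraction)."""
    return mean([abs(2.0 * (r - t) / (r + t)) for t, r in zip(test, retest)])


# Hypothetical SUVRs for two subjects (test scan, retest scan).
test_suvr = [2.0, 4.0]
retest_suvr = [2.2, 3.8]
bias = percent_bias(test_suvr, retest_suvr)
```

For these illustrative values the per-subject biases are approximately +10% and −5%, so the mean bias is about 2.5%.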

WSCV was calculated using the following equations [18, 19]:

$$\mathrm{WSCV}=\frac{\sigma_{e}}{\mu}$$
$$\widehat{\sigma}_{e}^{2}=\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{2}\left(y_{ij}-\bar{y}_{i}\right)^{2}$$
$$\widehat{\mu}=\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i},$$

where σe is the within-subject SD.
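The WSCV equations can be sketched in Python as follows, assuming two measurements (test and retest) per subject as in this study. The function name and the example values are hypothetical.

```python
from math import sqrt


def wscv(test, retest):
    """Within-subject coefficient of variation for paired test-retest data."""
    n = len(test)
    if n == 0 or n != len(retest):
        raise ValueError("test and retest must be non-empty and of equal length")
    # Per-subject mean of the two measurements.
    subject_means = [(t + r) / 2.0 for t, r in zip(test, retest)]
    # Sum of squared deviations of each measurement from its subject mean.
    ss_within = sum((t - m) ** 2 + (r - m) ** 2
                    for t, r, m in zip(test, retest, subject_means))
    sigma_e = sqrt(ss_within / n)      # estimated within-subject SD
    mu = sum(subject_means) / n        # estimated overall mean
    return sigma_e / mu


# Hypothetical SUVRs for two subjects.
example_wscv = wscv([2.0, 4.0], [2.2, 3.8])
```

With two measurements per subject, dividing the within-subject sum of squares by n is equivalent to dividing by n(k − 1) with k = 2.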

RC was calculated as follows [18]:

$$\mathrm{RC}=\sqrt{2}\times z_{1-\alpha/2}\,\sigma_{e},$$

where z1−α/2 is the 1 − α/2 quantile of the standard normal distribution. RC is defined as a 100(1 − α/2)% quantile of the distribution of test–retest differences; it therefore represents the limits of a typical range containing a large proportion (e.g., 95%) of the distribution of test–retest differences (with α = 0.05, z1−α/2 = 1.96).
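A minimal Python sketch of the RC formula is given below. It assumes the within-subject SD has already been estimated; the function name and the input value of 0.1 are hypothetical. The normal quantile is taken from the Python standard library rather than hard-coded.

```python
from math import sqrt
from statistics import NormalDist


def repeatability_coefficient(sigma_e: float, alpha: float = 0.05) -> float:
    """RC = sqrt(2) * z_{1 - alpha/2} * sigma_e."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    return sqrt(2.0) * z * sigma_e


# Hypothetical within-subject SD of 0.1 SUVR units.
example_rc = repeatability_coefficient(0.1)
```

For α = 0.05 the multiplier √2 × 1.96 is approximately 2.77, so a within-subject SD of 0.1 gives an RC of roughly 0.28.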

Reliability was estimated by calculating the ICC using the following equation [20]:

$$\rho=\frac{\mathrm{MSBS}-\mathrm{MSWS}}{\mathrm{MSBS}+\left(k-1\right)\mathrm{MSWS}},$$

where MSBS and MSWS are the mean sum of squares between and within subjects, and k is the number of within-subject measurements. Comparisons of SUVRs between test and retest imaging were made using Wilcoxon’s matched-pairs test. SPSS for Windows, version 18.0 (SPSS, Chicago, IL, USA), was used for statistical analysis, with statistical significance defined as P < 0.05.
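The ICC equation can be sketched in Python as follows. This sketch assumes a one-way analysis of variance to obtain MSBS and MSWS, which is one common way to compute these mean squares; the text does not state which ANOVA model was used, so this choice, the function name, and the example data are assumptions.

```python
def icc_one_way(measurements):
    """ICC from between- and within-subject mean squares (one-way ANOVA, assumed).

    `measurements` is a list of per-subject lists, each with k repeated values.
    """
    n = len(measurements)
    k = len(measurements[0])
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need at least 2 subjects with the same k >= 2 measurements")
    subject_means = [sum(row) / k for row in measurements]
    grand_mean = sum(subject_means) / n
    # Mean square between subjects.
    msbs = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Mean square within subjects.
    msws = sum((y - m) ** 2
               for row, m in zip(measurements, subject_means)
               for y in row) / (n * (k - 1))
    return (msbs - msws) / (msbs + (k - 1) * msws)


# Hypothetical SUVRs: three subjects, test and retest.
example_icc = icc_one_way([[2.0, 2.2], [4.0, 3.8], [3.0, 3.0]])
```

In this illustrative data set the between-subject spread is large relative to the test–retest differences, so the ICC is close to 1 (about 0.98).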

Results

The mean SUVRs of the test and retest [18F]FP-CIT PET are summarized in Table 1. There was no significant difference in the mean SUVR between the test and retest conditions. Representative test–retest 90-min and 3-h [18F]FP-CIT PET images of ET (Fig. 2a) and PD (Fig. 2b) patients illustrate the excellent reproducibility of DAT binding in the caudate, anterior putamen, and posterior putamen. The test–retest metrics estimated using the automatic and manual methods are summarized in Tables 2 and 3, respectively. The test–retest bias (ET: range 1.05–2.04%, PD: −0.65–2.75%) and variability (ET: 6.99–8.02%, PD: 3.51–6.94%) were both low, and reliability (ET: 0.88–0.96, PD: 0.98–0.99) was excellent. The ET group showed higher variability and lower ICCs than the PD group. In the ET group, there were no significant differences in the test–retest variability (caudate nucleus: 7.17%, anterior putamen: 6.99%, posterior putamen: 8.02%) among the subregions. However, in the PD group, the test–retest variability was significantly higher in the posterior putamen (6.94%) than in the caudate (3.51%) and anterior putamen (4.12%). The test–retest variability and ICCs of the 90-min images were similar to those of the 3-h images in both the ET and PD groups. In the ET group, the manual method showed similar test–retest reproducibility to the automatic method. However, in the PD group, the manual method (11.81%) showed higher variability than the automatic method (6.94%) in the posterior putamen.

Table 1 Test and retest mean SUVR values in Essential Tremor (ET) and Parkinson's Disease (PD) patients
Fig. 2

Representative standardized uptake value ratio (SUVR) images from 90-min and 3-h test–retest [18F]FP-CIT PET scans of patients with essential tremor (a) and Parkinson’s disease (b)

Table 2 Test–retest reproducibility metrics from 90-min and 3-h images estimated by an MRI-based normalization method
Table 3 Test–retest reproducibility metrics from 90-min and 3-h images estimated by a manual fixed-VOI method

Discussion

We showed that [18F]FP-CIT PET measures are reproducible in both ET and PD patients, for both 90-min and 3-h acquisition times with both automated and manual analysis methods. [18F]FP-CIT PET SUVRs showed higher variability in ET patients (6.99–8.02%) than in PD patients (3.51–6.94%). This result was surprising, since it is generally expected that scans in PD patients show lower reproducibility than those in ET patients because of lower counts due to progressive striatal degeneration. In one [123I]FP-CIT SPECT study, PD patients (H&Y stage, I–IV) showed higher variability (16.8 ± 13.3%) than healthy controls (HCs) (12.8 ± 8.9%) [21]. Another [123I]FP-CIT SPECT study involving advanced-stage PD patients (H&Y stage, II–IV) with a longer disease duration (mean, 12 years; range, 2–25 years) reported higher variability (7.36 ± 6.16%) than our result (3.51–6.94%) [22]. Our finding of better reproducibility in PD than in ET patients might be explained by the fact that our PD patients showed early-stage disease (H&Y stage, 2.3 ± 0.8) and that there was a slight overlap of SUVR between PD and ET. Furthermore, the methodological advantages of PET-based measurements and MRI-based coregistration indicated that the imaging quality for [18F]FP-CIT PET was sufficient and had little effect on the variability of the measurements obtained from patients with PD.

Despite the higher counts, the higher variability in patients with ET than in patients with PD may be due to true biological variability within individuals, rather than technical problems related to the measurement methods or image quality. Although striatal DAT uptake in the pseudoequilibrium state will generally show good reproducibility, the time point at which pseudoequilibrium occurs differs between patients with ET and PD, and is affected by various pharmacokinetic factors, including blood flow, peripheral clearance, and association and dissociation rate constants [23]. Because of a loss of striatal DAT, the peak specific striatal uptake in patients with PD equilibrates earlier than in HCs, contributing to lower variability in patients with PD [24, 25]. In patients with PD, the time curve for specific striatal [18F]FP-CIT uptake reached a plateau at approximately 60 min [26]. However, the specific striatal [18F]FP-CIT uptake in HCs did not reach a plateau until over 100 min, and the striatum to occipital ratio increases linearly after specific striatal uptake reaches a plateau [26]. As interindividual variability in [18F]FP-CIT kinetics influences the initial pseudoequilibrium time point, not all patients had completely reached a peak equilibrium state at 90–120 min. As it has the same molecular structure, [18F]FP-CIT shows similar pharmacokinetic features to [123I]FP-CIT. The striatal [123I]FP-CIT SUVR increases gradually up to 3 h and at least approximates a steady state between 3 and 6 h [25]. In the 90-min and 3-h time periods, striatal [18F]FP-CIT uptake had not yet reached pseudoequilibrium, and the hemodynamic status was vulnerable to an individual’s biological factors. Relatively lower ICCs (0.88–0.96) in the ET group than in the PD group (0.97–0.99) may be attributable to higher within-subject variability (6.99–8.02%) due to the kinetic instability and lower intersubject variability among the ET patients.

Our results are consistent with previous test–retest reports and indicate good reproducibility for striatal [18F]FP-CIT binding. The variability in the SUVRs of our patients with ET was 6.99–8.02%, which is comparable to that of traditional DAT SPECT imaging agents such as [123I]FP-CIT SPECT (7.47 ± 6.35%) [22] and [99mTc]-TRODAT-1 SPECT (7.63 ± 5.01%) [9], but inferior to that of the dopamine receptor PET agent [11C]raclopride (3.9–4.5%) [27]. The variability in our patients with PD was 3.51–6.94%, which is superior to that of [99mTc]-TRODAT-1 SPECT (10.20 ± 6.17%) [13]. The variance among studies may be attributed to differences in the imaging modality (SPECT vs. PET) [7, 13], physical decay (99mTc vs. 123I) [13, 22], VOI size (whole striatum vs. subregion) [12, 22], analysis model (kinetic modeling vs. ratio method) [13, 28], VOI delineation method (manual drawing vs. automatic delineation with MRI coregistration) [8, 28] and patient selection.

The decision on whether to place a VOI on a subdivision or the total striatum is determined by the study purpose. The placement of a small spherical VOI on the substriatum is helpful for differentiating patients with early PD from HCs, because it sensitively detects subtle changes in DAT levels in the posterior putamen. However, in a longitudinal study evaluating disease progression, the placement of a VOI on the whole striatum showed higher reproducibility, as it is less affected by variation in movement [12, 25]. In patients with ET, a difference in variability was not observed among the subdivisions, implying that the reproducibility was preserved despite the application of a small VOI. However, in patients with PD, the posterior putamen showed higher variability than other substriatal regions because it is the most substantially affected region in patients with PD [29]. Our result is consistent with a previous [123I]FP-CIT SPECT study comparing the variability of whole-striatal and substriatal ROIs [12]. Although whole-striatal ROIs showed higher variability in HCs (11.1 ± 10.4%) than in patients with PD (7.84 ± 8.94%), subregional ROIs on the posterior putamen showed higher variability in patients with PD (17.7 ± 12.3%) than in HCs (13.7 ± 9.89%) [12]. Using our manual method, patients with PD showed relatively lower ICCs (0.81) in the posterior putamen, which were affected by the higher within-subject variability due to the lower count and lower between-subject variability. As DAT binding decreases from the posterior to the anterior direction in PD, DAT binding in the posterior putamen is commonly decreased, regardless of disease stage.

Regarding the contribution of methodological differences to the reproducibility, the manual method showed comparable reproducibility to the automatic method in patients with ET. An L-[β-11C]DOPA study in which VOIs were manually drawn on each of the coregistered MR images reported higher variability (10.7 ± 5.4% and 11.9 ± 5.8% for the caudate and putamen) than our manual method [30]. The better reproducibility of our manual method may be attributed to our application of the same manually placed predefined VOIs to both the test and retest PET images to decrease interoperator variation. However, the manual method resulted in higher variability (11.81%) in the posterior putamen of patients with PD than the automatic method (6.93%). This finding differs from a previous [123I]FP-CIT SPECT study [22], which found that manual (7.90 ± 6.89%) and automated methods (7.36 ± 6.16%) were equally reproducible. Low counts alone cannot explain the relatively high variability in the posterior putamen, because the posterior putamen showed poor reproducibility only with the manual method, not with the automatic method. Although the VOI volumes were highly reproducible, the poor reproducibility of the posterior putamen with our manual method may be attributed to registration error, deviations in individual anatomy, or motion artifacts. The automatic method appears to be more suitable for evaluating the early loss of DAT binding in patients with PD.

To determine the optimal imaging time for [18F]FP-CIT PET in routine clinical practice, the pseudoequilibrium time point of the tracer kinetics should be considered first, although other practical factors, including a high target-to-background ratio, discrimination between patients and HCs, and high test–retest reproducibility, should also be taken into account. Because striatal [18F]FP-CIT uptake may not reach pseudoequilibrium by 90 min, a 90-min scan is more vulnerable to each individual’s biological conditions, including blood flow, blood–brain permeability, and peripheral tracer clearance [26]. As the 90-min scan time is short relative to the slow clearance of [18F]FP-CIT, it may result in large numbers of outliers, and scan durations of at least 120 min were required to obtain reliable fits using a plasma input model for a dynamic scan [31]. However, the reproducibility of the 90-min images was comparable to that of the 3-h images. Kazumata et al. reported that 90-min images showed a high target-to-background ratio (3.5:1), providing sufficient image quality for visual interpretation [26]. Although the striatal to occipital SUVR at 90 min (SUVR90min) underestimated the distribution volume ratio, SUVR90min correctly differentiated early-stage PD patients from HCs [26, 31]. Our finding that the image acquisition time did not affect the reproducibility supports the use of 90-min static imaging in clinical practice to reduce the patient’s waiting time and scanner occupation.

Our study had some limitations. First, the sample size was not large, and most of the ET patients were females (13 females and 2 males). Estrogen exerts a significant effect on DAT expression, and the administration of estrogen replacement therapy to postmenopausal women is related to a modest increase in [99mTc]TRODAT-1 binding in the putamen [32]. Although our female ET patients were all postmenopausal, we did not control for hormone supplementation status. Second, partial volume effects may have resulted in an underestimation of the count in the posterior putamen. However, this potential issue may not have affected our reproducibility results, because this underestimation effect would similarly apply to both test and retest images, and the high spatial resolution of our PET scanner may have compensated for this problem.

Conclusion

In conclusion, [18F]FP-CIT PET is a reproducible and reliable method for the quantitative measurement of DAT binding in patients with ET or PD, regardless of the imaging acquisition time or analytical method. Also, the automatic method is more suitable for evaluating early loss of DAT binding in patients with PD.