Objective

The objective of this guideline is to define the criteria for the data acquisition protocol of oncology 18F-FDG-PET/CT scans to standardize the PET image quality among facilities and different PET/CT scanner models. It describes the method for phantom experiments and human image quality evaluation, and provides recommended values as a reference. The optimum imaging protocol for each camera model can be determined by using this guideline as a manual and by comparing the results with the recommended values.

Need for the guideline

The quality of FDG-PET images acquired with PET cameras (either dedicated PET or PET/CT units) depends on the camera model, injected activity, scanning duration, and other details of the data acquisition protocol. It also depends on body size: larger subjects generally give poorer image quality with the same injected activity per weight. An optimum data acquisition protocol has not necessarily been established for each camera model. In addition, radiation safety regulations and operational limitations may prevent injecting sufficient activity and/or scanning for a sufficient duration in clinical practice.

The diagnostic accuracy reported by a PET center may not be applicable to other centers that use other camera models with different data acquisition protocols and may therefore provide a different image quality. Unless image quality is universally controlled by standardization of the scanning protocol, the FDG-PET scan will not be validated as a reliable diagnostic tool. Multicenter studies and clinical trials are not possible if the image quality depends on the site. When FDG-PET is used as an end point in clinical trials of new anticancer drugs, in which efficacy of the treatment is evaluated based on the disappearance or decline of FDG uptake by lesions, it is essential to assure a certain level of image quality.

There is, therefore, a growing need for a method to determine a data acquisition protocol that provides adequate image quality for a given camera model, as well as standards of image quality evaluation applicable to human FDG-PET images acquired with any camera model.

Contents of and instructions concerning the guideline

Phantom experiments

Phantom experiment no. 1 of this guideline allows determination of the minimum scanning duration to detect a 10-mm diameter hot sphere with 1:4 background activity, simulating a subject of standard size injected with 3.7 (or 7.4) MBq/kg FDG and imaged at 1 h post-injection. In phantom experiment no. 2, hot spheres of various sizes with 1:4 background activity are imaged in a given data acquisition protocol for the evaluation of visualization, as well as under noise-free conditions to estimate image resolution based on the recovery coefficient (RC).

The reconstruction condition, which affects image quality and spatial resolution, may be predetermined by the users or the manufacturer, but can also be determined with phantom experiments.

Since the detection of a 10-mm hot sphere with 1:4 background activity is a challenging goal, a routine data acquisition protocol may be determined apart from the phantom experiments considering the clinical requirements and operational limitations of the PET center. It should also be kept in mind that phantom experiments use a body phantom of a specific size, and the results do not provide direct evidence for thicker or thinner subjects.

Human image quality evaluation

The clinical part of the guideline defines physical parameters (NECpatient, NECdensity, and liver SNR) and proposes their recommended values as an easy and objective reference for the image quality of human whole-body FDG-PET. These three parameters are used in this guideline, because they are believed to be good indicators of image quality [1]. These reference values, however, may depend on the PET camera model and the body size of the subject to some extent. The human image quality is also influenced by subject factors including blood glucose level, resting conditions, and body motion. Therefore, human images should finally be checked visually by a physician or technologist.

Coverage of PET scanner types and acquisition modes

Although this guideline is designed to apply to a PET/CT scanner in 3D data acquisition mode, which is the norm for oncology scans these days, it can be applied to a dedicated PET scanner as well as to a scanner operated in 2D data acquisition mode. A scanner with continuous bed movement for a simultaneous emission and transmission scan is also evaluable with this guideline.

Since the phantom experiment defined in this guideline requires list mode acquisition, an alternative method is described for scanners that do not provide list mode acquisition. This guideline also requires measurement of prompt and random count rates, for which consultation with the manufacturer may be necessary.

Procedure and evaluation criteria of phantom experiments

This section describes two experiments using 18F-solution and an IEC body phantom (image quality phantom) referred to in the NEMA NU 2-2007 Standard [2]. Another phantom (scatter phantom of the NEMA NU 2-2007 Standard) may be placed adjacent to the body phantom to account for the activity outside the field of view, which is preferable but not essential in this guideline.

If little information about the PET scanner is available, as in the case of a new scanner model/version, or if a new reconstruction parameter is applied, phantom experiment no. 1 has to be carried out first to obtain optimum data acquisition conditions, followed by phantom experiment no. 2. If a data acquisition protocol is already in use, phantom experiment no. 1 may be skipped and image quality can be confirmed by phantom experiment no. 2 under that protocol.

Phantom experiment no. 1

Outline

Since lesion detectability in a PET image and overall image quality depend on the count statistics, phantom experiment no. 1 determines an appropriate scanning duration that enables visualization of a 10-mm diameter hot sphere of unknown location embedded in a warm background at a 4:1 activity concentration ratio. The lid, to which the sphere is attached, is screwed on at an arbitrary angle so that only the person who prepared the phantom knows the location of the 10-mm sphere. Data are acquired in list mode, from which PET images of 1–10 min data acquisition duration are reconstructed and evaluated for detectability of the hot sphere.

Data acquisition

Phantom preparation

Measure the background volume of the phantom beforehand. Using a regularly checked dose calibrator and taking decay into consideration, prepare 18F-FDG with sufficient activity to give a background concentration of 5.3 kBq/ml at the start of data acquisition. Fill exactly one-fourth of the background volume with tap water, add all of the precisely measured 18F-FDG, and stir to make a hot solution. Draw an aliquot and place it in the 10-mm sphere. If phantom experiment no. 2 is to follow, draw another 60 ml of the solution for later use. Fill up the phantom background with tap water and stir to make a warm solution. Fill the other five spheres with the warm background solution.
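The decay correction in this preparation step can be sketched as follows. This is a minimal illustration, not part of the guideline: the function name, the example volume of 9700 ml, and the 30-min lead time are hypothetical, and the 18F half-life of 109.77 min is an assumed constant. The small amount of activity removed with the sphere aliquot and the 60-ml reserve is ignored.

```python
# Assumed physical half-life of 18F in minutes (not stated in the guideline text).
F18_HALF_LIFE_MIN = 109.77


def activity_to_dispense_mbq(background_volume_ml, minutes_before_scan,
                             target_kbq_per_ml=5.3):
    """Activity (MBq) to read on the dose calibrator so that the full
    background volume holds target_kbq_per_ml when acquisition starts."""
    activity_at_scan_kbq = target_kbq_per_ml * background_volume_ml
    # Activity measured earlier must be larger by the decay factor 2^(t / T_half).
    decay_correction = 2.0 ** (minutes_before_scan / F18_HALF_LIFE_MIN)
    return activity_at_scan_kbq * decay_correction / 1000.0


# Hypothetical example: 9700 ml background volume, measured 30 min before the scan.
print(round(activity_to_dispense_mbq(9700, 30), 1), "MBq")
```

Because the whole activity is first dissolved in one-fourth of the background volume, the hot solution has four times the final background concentration, which is what gives the 4:1 sphere-to-background ratio.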

Scanning

Place the body phantom horizontally on the bed, so that the hot spheres are located at the center of the field of view along the z-axis. Acquire two sets of data in list mode, each for 12 min, starting exactly when the background activity concentration has decayed to 5.30 and 2.65 kBq/ml, respectively. Record prompt and random coincidence counts at the same time. Reconstruct PET images of 1, 2, 3, …, 10 min data acquisition duration, three sets for each duration, by summing the data starting at 0, 1, and 2 min and lasting for 1, 2, 3, …, 10 min. Use image reconstruction parameters that are routinely used or recommended for the camera model.
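The list-mode reframing described above can be laid out as a table of time windows. The sketch below is illustrative only: the function names are hypothetical and the 18F half-life of 109.77 min is an assumed constant. It also shows why the second acquisition starts about one half-life after the first, since 2.65 kBq/ml is half of 5.30 kBq/ml.

```python
import math

# Assumed physical half-life of 18F in minutes (not stated in the guideline text).
F18_HALF_LIFE_MIN = 109.77


def reframing_windows(max_duration_min=10, start_offsets_min=(0, 1, 2)):
    """Return {duration: [(start, end), ...]} in minutes from acquisition start.
    Each duration gets one window per start offset (three sets by default)."""
    return {
        duration: [(start, start + duration) for start in start_offsets_min]
        for duration in range(1, max_duration_min + 1)
    }


def minutes_to_decay(from_kbq_per_ml, to_kbq_per_ml):
    """Waiting time for the concentration to decay from one value to another."""
    return F18_HALF_LIFE_MIN * math.log2(from_kbq_per_ml / to_kbq_per_ml)


windows = reframing_windows()
print(windows[1])   # the three 1-min windows
print(windows[10])  # the three 10-min windows; the last one ends at 12 min
print(minutes_to_decay(5.30, 2.65))  # delay between the two acquisitions
```

The longest window starts at 2 min and lasts 10 min, which is why each list-mode acquisition needs 12 min.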

If list mode acquisition is not available for the PET camera, carry out static scans sequentially, each for 1 min or with a preset count mode if the gaps between scans cause significant decay, and add the raw data to make 1–10 min of scanning duration. If raw data cannot be added, carry out ten static scans with a duration of 1, 2, 3,…, 10 min, respectively, and repeat the experiment for a total of three times on three separate days. For a PET camera with continuous bed movement for a simultaneous emission and transmission scan, carry out the list mode data acquisition in the stationary mode and evaluate the data as described here, but convert the resulting optimum scanning duration into the corresponding bed speed when it is put into practice.

Evaluation

PET image quality is evaluated for each acquisition duration with (1) visual score, (2) phantom noise equivalent count (NECphantom), (3) % contrast (Q H,10 mm), and (4) % background variability (N 10 mm).

The PET images are visually evaluated regarding the detectability of the 10-mm diameter hot sphere on a three-step (0, 1, 2) scale by one or more licensed PET physicians, who do not know the hot sphere location or the slice on which it is to be visualized. The images are displayed using an inverse gray scale with an upper level of SUV = 4, which equals the activity concentration of the hot sphere, and a lower level of SUV = 0. The image is scored 2 if the hot sphere is “identifiable”, 1 if it is “visualized, but similar hot spots are observed elsewhere”, and 0 if it is “not visualized”. At each acquisition duration, the score is averaged over the three image sets and over the physicians.

NECphantom, Q H,10 mm, and N 10 mm are computed based on the NEMA standards (see “Appendix”).

Recommendations

This guideline recommends the scanning duration that provides an image with an average score of 1.5 or more, i.e., the 10-mm hot sphere is detected in half or more of the cases. The physical indicators may be used as a reference when determining the optimum scanning duration; the reference values are NECphantom > 10.4 (Mcounts), N 10 mm < 6.2 (%), and Q H,10 mm/N 10 mm > 1.9 (%).
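The averaging of visual scores and the 1.5 threshold can be sketched as below. The data layout, reader labels, and scores are hypothetical and serve only to show the arithmetic; picking the shortest passing duration is one reasonable reading of the recommendation, not a rule stated in the guideline.

```python
from statistics import mean


def average_scores(scores):
    """scores: {duration_min: {reader: [score for each of the three image sets]}}.
    Returns the score averaged over image sets and readers for each duration."""
    return {
        duration: mean(s for reader_scores in by_reader.values() for s in reader_scores)
        for duration, by_reader in scores.items()
    }


def minimum_duration(avg_by_duration, threshold=1.5):
    """Shortest duration whose average score reaches the threshold, else None."""
    passing = [d for d, avg in sorted(avg_by_duration.items()) if avg >= threshold]
    return passing[0] if passing else None


# Hypothetical scores from two readers (0, 1, or 2 per image set).
example_scores = {
    1: {"reader_A": [0, 0, 1], "reader_B": [0, 1, 0]},
    2: {"reader_A": [1, 2, 1], "reader_B": [2, 1, 2]},
    3: {"reader_A": [2, 2, 2], "reader_B": [2, 1, 2]},
}
averages = average_scores(example_scores)
print(averages)
print(minimum_duration(averages))
```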

Phantom experiment no. 2

Outline

In phantom experiment no. 2, hot spheres of various sizes are imaged with a given clinical data acquisition protocol to evaluate their visualization and are also imaged in a noise-free condition to estimate image resolution based on the recovery coefficient (RC). Phantom experiment no. 2 can either be carried out following no. 1 or separately. In the former case, the scanning duration should be adjusted to account for radioactivity decay.

Data acquisition

Phantom preparation

A body phantom is prepared in the same way as in phantom experiment no. 1, except that all six (10, 13, 17, 22, 28 and 37-mm diameter) hot spheres are filled with hot solution. The background is filled with 1:4 warm activity concentration as in phantom experiment no. 1.

Scanning

The phantom is scanned twice; namely, in the given clinical condition and in a noise-free condition.

In the given clinical condition, the scanning duration is determined so that equivalent counts are obtained assuming that the phantom simulates a 60-kg subject injected with 222 MBq (3.7 MBq/kg) FDG. If a 60-kg subject is injected with more (or less) activity than 222 MBq in the given protocol, the scanning duration is shortened (or lengthened) accordingly, in inverse proportion. The activity concentration at which the scan starts depends on how the experiment is run. If experiment no. 2 is done alone, the emission scan starts when the activity concentration decays to 2.65 kBq/ml. If experiment no. 2 is done following no. 1, the emission scan starts when the activity concentration decays to 1.325 kBq/ml and takes twice the scanning duration. When setting up the scan, input the phantom volume as “patient weight (kg),” together with the activity and injection time.
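The doubling of the scanning duration at 1.325 kBq/ml follows from keeping the product of concentration and duration constant. A minimal sketch of that decay compensation is given below; the function name is hypothetical, decay during the scan itself is ignored, and the separate adjustment for injected activities other than 222 MBq is deliberately left out.

```python
def phantom_scan_duration_min(clinical_duration_min, start_kbq_per_ml,
                              reference_kbq_per_ml=2.65):
    """Scale the clinical scanning duration inversely with the background
    concentration at scan start, relative to the 2.65 kBq/ml reference."""
    return clinical_duration_min * reference_kbq_per_ml / start_kbq_per_ml


# Hypothetical 2-min clinical protocol.
print(phantom_scan_duration_min(2.0, 2.65))   # experiment no. 2 done alone
print(phantom_scan_duration_min(2.0, 1.325))  # experiment no. 2 following no. 1
```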

After the static scan of the clinical condition, a second scan of 30-min duration is carried out as a noise-free condition to measure the recovery coefficient.

With all those scans, an acquisition method should be selected that enables the recording of prompt and random coincidence counts in a readable format in the sinogram header or in a separate file. The image reconstruction parameters used in the usual clinical diagnostic scans should be applied to the phantom experiments.

Evaluation

The quality of the PET image acquired in the clinical condition is evaluated by (1) visual inspection, (2) phantom noise equivalent count (NECphantom), (3) % contrast (Q H,10 mm), and (4) % background variability (N 10 mm) for a 10-mm diameter sphere.

The recovery coefficient for a j-mm diameter hot sphere (RCj) is calculated as the maximum pixel value (Cj) in the region of interest (ROI) on the reconstructed image acquired in a noise-free condition divided by that of the 37-mm diameter sphere: RCj = Cj/C37.
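The definition RCj = Cj/C37 can be written as a short sketch. The function name and the maximum pixel values are hypothetical, chosen only to illustrate the normalization to the 37-mm sphere and the comparison with the RC10 mm > 0.38 reference value used in this guideline.

```python
def recovery_coefficients(max_pixel_by_diameter_mm):
    """Divide each sphere's maximum ROI pixel value by that of the 37-mm sphere."""
    c37 = max_pixel_by_diameter_mm[37]
    return {d: c / c37 for d, c in max_pixel_by_diameter_mm.items()}


# Hypothetical maximum pixel values (arbitrary units) from a noise-free scan.
example_max_pixels = {10: 8.0, 13: 11.0, 17: 15.0, 22: 18.0, 28: 19.5, 37: 20.0}
rc = recovery_coefficients(example_max_pixels)
print(rc)
print("RC10 mm meets reference:", rc[10] > 0.38)
```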

Recommendations

It is preferable that the image acquired under clinical conditions provides visualization of the 10-mm diameter sphere and the physical indicators of NECphantom > 10.4 (Mcounts), N 10 mm < 6.2 (%), and Q H,10 mm/N 10 mm > 1.9 (%).

A reconstruction condition that provides spatial resolution of 10-mm FWHM or better (RC10 mm > 0.38) is recommended.

Evaluation of human PET image quality

Objective

This section describes the clinical part of the guideline, in which physical indicators of image quality of human whole-body FDG-PET are defined, including NECpatient (noise equivalent count per axial length), NECdensity (NEC per volume), and liver SNR (mean/SD within liver ROI), together with their reference values as recommended criteria.

While it is preferable that human images are acquired under conditions that meet the recommended criteria of phantom experiment no. 2, especially that for image resolution (RC10 mm > 0.38), this guideline recommends criteria for the physical parameters that are directly measured on the human images, considering the inherent limitations of the phantom experiments such as body size variations.

Method

The criteria are applicable to whole-body FDG-PET images covering at least from the neck to the abdomen. The images should be acquired while recording the prompt and random coincidence counts in each bed position. The transmission or CT images should also be generated together with PET images to compute cross-sectional areas.

For the whole-body image, bed positions corresponding to the axial span from the neck to the abdomen are determined by excluding the brain and urinary bladder. The prompt and random counts are extracted for each bed position, from which NECpatient and NECdensity are computed (see “Appendix”). The liver SNR is computed as mean/SD within the liver ROI, which is placed away from the porta hepatis and major vessels in three consecutive coronal sections (Fig. 1).
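The liver SNR defined above is a ratio of the mean to the standard deviation of the ROI pixel values. The sketch below makes two assumptions that the text does not settle: pixel values from the three sections are pooled into one list, and the sample standard deviation is used. The function name and the pixel values are hypothetical.

```python
from statistics import mean, stdev


def liver_snr(roi_pixel_values):
    """Mean divided by sample standard deviation of the liver ROI pixel values."""
    return mean(roi_pixel_values) / stdev(roi_pixel_values)


# Hypothetical pooled ROI pixel values (SUV) from three consecutive sections.
example_roi = [2.1, 2.3, 2.0, 2.2, 2.4, 2.1, 2.2, 2.3, 2.0]
print(liver_snr(example_roi))
```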

Fig. 1 How to place ROI over the liver

Recommendations

This guideline recommends that the physical parameters meet the criteria of NECpatient > 13 Mcounts/m, NECdensity > 0.2 kcounts/cm3, and liver SNR > 10.
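A check of measured values against these three criteria might look like the following sketch. The dictionary keys, the function name, and the measured values are hypothetical; only the three thresholds come from the guideline.

```python
# Reference values from the guideline; key names are illustrative only.
REFERENCE = {
    "nec_patient_mcounts_per_m": 13.0,
    "nec_density_kcounts_per_cm3": 0.2,
    "liver_snr": 10.0,
}


def check_reference_values(measured):
    """Return {parameter: True/False} for 'measured value exceeds reference'."""
    return {name: measured[name] > limit for name, limit in REFERENCE.items()}


# Hypothetical measurements for one subject.
example_measured = {
    "nec_patient_mcounts_per_m": 17.5,
    "nec_density_kcounts_per_cm3": 0.18,
    "liver_snr": 14.2,
}
print(check_reference_values(example_measured))
```

A failed criterion is a prompt for visual review rather than a verdict, since the reference values may depend on the camera model and on body size.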

Since these reference values may, strictly speaking, depend on the camera model, they may be subject to future modifications and revisions. It may also be inappropriate to use the criteria if the FDG distribution is far from normal, as in subjects with lesions showing extremely strong FDG accumulation.

Discussion

Dependence on camera model

This guideline aims to establish a standard to assure image quality independent of the camera model. Standards of N 10 mm < 6.2, Q H,10 mm/N 10 mm > 1.9, and NECphantom > 10.4 have been proposed for the phantom image quality parameters based on the results of phantom experiment no. 1 for a number of camera models regarding detection of a 10-mm hot sphere of unknown location with 1:4 background activity. The image spatial resolution should be better than 10-mm FWHM, corresponding to RC > 0.38 for the 10-mm sphere in phantom experiment no. 2. As for human images, image quality parameters of NECpatient > 13, NECdensity > 0.2, and liver SNR > 10 have tentatively been proposed as the minimum standards based on the clinical data at a number of PET centers. Although these standards may depend on the camera model, our results suggest that they may be roughly applicable to all camera models.

Computation of NEC needs scatter fraction, which was obtained from the literature or measured under the conditions defined in the NEMA standard and was not measured concurrently in each phantom experiment or human scan in the present study. Therefore, the scatter fraction value may have an error, which may be one of the reasons for camera dependence of the relationship between NEC and visual score.

Scatter fraction

The scatter fraction of a PET camera depends on the camera model, acquisition mode, body size, activity outside the field of view [3], etc. In general, the scatter fraction measured with a scatter phantom based on NEMA standard may provide a lower value than clinical scans, because it increases as the subject size increases [4]. Moreover, the scatter fraction is related to the energy lower level discriminator (ELLD) and is reported to be higher than 40% if ELLD is set below 400 keV [5, 6]. In addition, the scatter fraction is influenced by the radioactivity concentration if the PET camera detector contains lutetium (176Lu), and the data are acquired in the 3D mode [7]. Therefore, the scatter fraction varies widely with body size and activity inside or outside the direct field of view. However, since the real-time measurement of the scatter fraction is impossible with clinical scans, this guideline instructs using the scatter fraction values based on NEMA NU 2-2007 as an intrinsic value for each camera model. Therefore, there is a possibility of errors in the actual scatter fraction for each human scan.

Relationship between phantom results and human scanning conditions

In many PET centers in Japan, patients are injected with 3.7 MBq/kg FDG and are scanned starting 60 min post-injection. Suppose that the target region is scanned at 68 min post-injection (physical decay to 65%). Assuming that 20% of injected FDG is excreted in the urine [8], and that the remaining FDG is distributed uniformly within the body except the adipose tissue, which constitutes 27% of the total body volume [9], the soft tissue activity concentration is estimated to be 3.7 MBq/kg × 1 kg/l × 0.65 × 0.8/0.73 = 2.64 MBq/l (assuming a specific gravity of 1), which is comparable to the background activity concentration in the phantom experiment (2.65 kBq/ml). The soft tissue SUV value is then 0.8/0.73 = 1.1, which is compatible with the SUV value in the mediastinum or abdomen observed in routine clinical experience. The cross-sectional area of the body phantom (550 cm2) corresponds to that of a standard Japanese subject with a body weight of 60 kg. Therefore, the body phantom at an activity concentration of 2.65 kBq/ml corresponds to a standard Japanese subject of 60 kg, injected with 3.7 MBq/kg FDG and scanned starting 60 min p.i. Phantom experiment no. 1 corresponds to determining the minimum scanning duration to detect a 10-mm hot lesion with 4 times the background activity concentration in such a subject.
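The estimate above can be reproduced step by step. The sketch below is illustrative: the function and parameter names are hypothetical, and the 18F half-life of 109.77 min is an assumed constant used to recover the 65% decay factor at 68 min.

```python
# Assumed physical half-life of 18F in minutes (not stated in the guideline text).
F18_HALF_LIFE_MIN = 109.77


def decay_factor(minutes_post_injection):
    """Fraction of 18F activity remaining after the given time."""
    return 0.5 ** (minutes_post_injection / F18_HALF_LIFE_MIN)


def soft_tissue_concentration_kbq_per_ml(dose_mbq_per_kg=3.7,
                                         minutes_post_injection=68,
                                         urinary_excretion=0.20,
                                         adipose_volume_fraction=0.27,
                                         density_kg_per_l=1.0):
    """Soft tissue concentration; MBq/l is numerically equal to kBq/ml."""
    retained = 1.0 - urinary_excretion               # 0.8 stays in the body
    distribution = 1.0 - adipose_volume_fraction     # 0.73 of body volume takes up FDG
    return (dose_mbq_per_kg * density_kg_per_l
            * decay_factor(minutes_post_injection) * retained / distribution)


print(decay_factor(68))                          # close to 0.65
print(soft_tissue_concentration_kbq_per_ml())    # close to 2.64 kBq/ml
print((1.0 - 0.20) / (1.0 - 0.27))               # soft tissue SUV, about 1.1
```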

The results of the present study indicated that scanning for 3–4 min or longer is necessary for most camera models to visualize a 10-mm sphere in phantom experiment no. 1. This is longer than 2–3 min, which is usually adopted for a standard size subject in Japan. This suggests that a 10-mm lesion with 1:4 background activity may not be visualized in routine clinical scans. As a matter of fact, considering that the image activity of a 10-mm hot lesion with 4 times the background is decreased to SUV = 1.7 by the partial volume effect, it seems difficult to detect a 10-mm lesion of SUV = 1.7 of unknown localization in a routine clinical situation. It may be detectable with the aid of a CT using a PET/CT unit.

Body size and current data acquisition protocol

More activity was injected in heavier subjects in the routine clinical setup of all the PET centers surveyed in this study, and some centers further increased the scanning duration in subjects with high body weight or BMI (= weight(kg)/height(m)/height(m)). However, the results of the present study indicated a trend of image quality degradation as the body weight or BMI increased, suggesting that, in general, the current routine protocol adjustment for increased body size is not sufficient. It is advisable to inject more activity or (because injecting more activity may not work due to the increased random rate) to increase the scanning duration in large subjects to obtain image quality equivalent to that in small subjects.

Supporting data

This section presents phantom and human data on a number of PET camera models acquired and/or evaluated based on this guideline, from which the recommended reference values have been derived.

Phantom experiment no. 1

Method

Phantom experiment no. 1 was carried out according to this guideline on seven PET camera models (Aquiduo, Biograph LSO, Discovery ST, Discovery STE, Discovery STEP, SET-3000 BCT/L, SET-3000 G/X) to determine the optimum scanning duration and to investigate the validity of the physical parameters as indicators of the 10-mm hot sphere visualization. The reconstruction condition, which is routinely used in the PET center that housed the PET camera, was also used for this experiment. The PET images were visually evaluated by nine physicians and technologists for visualization scores using “Fusion Viewer 2.0” (NMP) software.

Results and discussion

Figure 2 represents the relationship between the average score of visualization for the 10-mm diameter hot sphere and the scanning duration. As the scanning duration increased, the visualization of each PET camera model improved, although the optimum duration depended on the model.

Fig. 2 Relationship between scanning duration and visualization score of a 10-mm sphere in phantom experiment no. 1 (a 5.30 kBq/ml, b 2.65 kBq/ml). Symbols represent camera models

Figure 3 represents the relationship between the average score of visualization for the 10-mm diameter hot sphere and the physical parameters. The NECphantom, N 10 mm, and Q H,10 mm/N 10 mm were similarly related to the visual score regardless of the camera model, suggesting the validity of those parameters as indicators of the hot sphere detectability. On the other hand, Q H,10 mm was poorly associated with the visual score. It should be noted that N 10 mm and Q H,10 mm are affected by the reconstruction condition, while NECphantom is not, and that the reconstruction condition was pre-determined in the present experiments.

Fig. 3 Relationship between visualization score and NECphantom (a, b), N 10 mm (c, d), and Q H,10 mm/N 10 mm (e, f) in phantom experiment no. 1 for 5.30 kBq/ml (a, c, e) and 2.65 kBq/ml (b, d, f). Symbols represent camera models

The median value of the 7 camera models that provided the average visual score of 1.5 in this experiment was adopted as the recommended reference value for the three physical parameters, i.e., NECphantom > 10.4 Mcounts (95% confidence interval 7.7–18.3), N 10 mm < 6.2 (95% confidence interval 4.8–6.9), and Q H,10 mm/N 10 mm > 1.9 (95% confidence interval 1.5–2.8).

Simulation of image resolution and phantom experiment no. 2

Computer simulation was carried out to obtain the relationship between spatial resolution and the recovery coefficient measured under noise-free conditions in phantom experiment no. 2. Using a 3D Gaussian filter with FWHM = 10 mm, the recovery coefficients of the spheres under the present experimental conditions turned out to be: RC10 mm = 0.38, RC13 mm = 0.52, RC17 mm = 0.72, RC22 mm = 0.88, and RC28 mm = 0.97 (Fig. 4). Based on this simulation, RC10 mm > 0.38 was adopted as the recommended reference value in this guideline, assuming that spatial resolution of 10-mm FWHM or better is necessary for an oncology FDG-PET image with sufficient quality.

Fig. 4 Simulated image of digital body phantom generated with a Gaussian filter of 10-mm FWHM isotropic image resolution

Figure 5 illustrates the RCs for the 7 PET camera models measured under noise-free conditions in phantom experiment no. 2. All 7 models met the reference criterion of RC10 mm > 0.38. Figure 6 presents one of those PET images.

Fig. 5 Recovery coefficients (RCs) obtained in noise-free scans in phantom experiment no. 2. Symbols represent camera models: Aquiduo, Biograph LSO, Discovery ST, Discovery ST-E, Discovery ST-EP, SET-3000BCT/L, SET-3000 G/X

Fig. 6 A representative PET image acquired in noise-free condition in phantom experiment no. 2, on which RCs were measured

Human image quality evaluation

Methods

To examine the image quality of whole-body FDG-PET images currently acquired in Japan and the relationship with the physical parameters, patient images were collected from 5 PET centers using 5 different PET camera models, 30 cases from each center. Those images had been acquired as routine diagnostic scans according to the protocol of each PET center without any artifacts or other problems, interpreted by local PET physicians and reported to the attending physicians. Images with extremely abnormal FDG accumulation were excluded.

The quality of the images was visually evaluated by five licensed PET physicians on a 5-step scale according to whether they had sufficient quality to read and interpret. The image was given a score of 5 for “very good quality”, 4 for “sufficiently good quality”, 3 for “scarcely sufficient quality”, 2 for “not sufficient quality”, and 1 for “unreadable”. NECpatient, NECdensity, and liver SNR were computed as described above.

Results and discussion

Figure 7 illustrates the plots of the average visual score against NECpatient, NECdensity, and liver SNR, respectively. A significant difference was observed between those with a score of less than 3.0 (9 out of 148 patients) and those with 3.0 or higher for NECpatient (17.5 ± 3.0 vs. 23.8 ± 9.6, p < 0.001) and NECdensity (0.28 ± 0.07 vs. 0.45 ± 0.23, p < 0.03), but not for liver SNR (14.2 ± 6.2 vs. 15.9 ± 4.5, p = 0.28). Spearman rank correlation analysis of the overall data indicated that the visual score was significantly associated with both NECpatient (r = 0.47, p < 0.001) and NECdensity (r = 0.57, p < 0.001), but not with liver SNR (r = 0.27, p = 0.01). If examined for each camera model, however, the visual score correlated with all three parameters, suggesting that although these three parameters are good indicators of image quality, camera differences may exist in the exact relationship and recommended reference value.
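For readers unfamiliar with the statistic, a Spearman rank correlation can be computed as the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a generic illustration, not the analysis code used in this study; the function names and the example values are hypothetical, and no p value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical visual scores and NECpatient values for six subjects.
example_scores = [2.8, 3.2, 3.4, 3.4, 3.8, 4.0]
example_nec = [14.0, 19.5, 17.0, 22.0, 30.0, 26.0]
print(spearman_rho(example_scores, example_nec))
```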

Fig. 7 Scatter plots of visual score against NECpatient (a), NECdensity (b), and liver SNR (c). Each point represents a subject. Linear regression lines are shown for each graph

Based on those patient data, the recommended reference values were determined as NECpatient > 13 (Mcounts/m), NECdensity > 0.2 (kcounts/cm3), and liver SNR > 10 for this guideline. It should be noted, however, that these reference values may still depend on the camera model and that further modification and revision may be necessary to make them reliable criteria for quality control to be used in clinical trials.

The average visual score was 3.35 ± 0.44, 3.37 ± 0.35, 3.79 ± 0.29, 3.57 ± 0.27, and 3.28 ± 0.17 (mean ± SD) for each of the five PET center data, showing a rather small variation between patients. Since the images were all selected from routine clinical scans, heavier patients had been injected with more activity and/or were scanned for a longer duration, so that they would not include images with too high or too low quality. This may be another reason for the weak correlation between the visual score and the parameters.

Figure 8 presents the plots of the score against BMI. There was a trend of a lower score for patients with a larger BMI. Those patients with a score below 3.0 (9/148 cases) had a BMI of 26.8 ± 5.0, which was significantly higher than 23.2 ± 3.5 for those with a score of 3.0 or above (p = 0.004). This suggests that patients with a large BMI may need more injected activity and/or a longer scanning duration than that specified in the current protocol and carried out at each center.

Fig. 8 Scatter plots of visual score against BMI