Introduction

18F-fluorodeoxyglucose (FDG) positron emission tomography (PET)/computed tomography (CT) is currently a standard imaging modality for staging and response evaluation in FDG-avid lymphoma such as diffuse large B cell lymphoma (DLBCL). According to the Deauville criteria and the Lugano classification, response evaluation of FDG-avid lymphoma is based on visual grading with a 5-point scale. In this classification, a score of 1–3 (uptake lower than or similar to that of the liver) without any new lesion is classified as complete response (CR), and a score of 4–5 (uptake higher than that of the liver) or any new lesion is classified as partial or no response [1,2,3].

In contrast with aggressive lymphoma such as DLBCL, indolent lymphoma, including follicular lymphoma (FL) and marginal zone B cell lymphoma (MZBCL), exhibits relatively slow growth and variable FDG avidity [4,5,6,7,8,9,10,11]. Mean overall survival times are 9–11 years for FL and 8–13 years for MZBCL. Although some indolent lymphomas may transform into aggressive types and eventually lead to a poor prognosis, most indolent lymphomas progress slowly. Thus, it is unclear whether the response evaluation method used for FDG-avid lymphoma can also be applied to these types of lymphoma.

Quantification is an essential strength of PET imaging. With quantitative analysis, small changes can be detected with high accuracy. Additionally, quantitative analysis is less observer dependent than visual qualitative analysis. However, several different methods are still suggested for response evaluation using FDG PET, differing in target lesions and quantitative indexes. PERCIST [12] recommends measuring the standardized uptake value (SUV) of a single representative lesion, whereas RECIST1.1 recommends measuring the tumor diameters of a maximum of five lesions, with no more than two lesions per organ [13]. Thus, methods for treatment response evaluation using quantitative indexes of FDG PET need further refinement and validation.

In this study, we aimed to investigate the feasibility and effectiveness of quantitative FDG PET indexes in treatment response evaluation of indolent lymphoma, so that quantitative analysis can be used effectively in this clinical setting. Various quantitative indexes on interim and end-of-treatment (EOT) PET were tested for their association with the response determined by the Lugano classification and with patients’ final outcome.

Patients and Methods

Patients

From the image archive of our institution, patients who were diagnosed with indolent lymphoma and underwent initial FDG PET/CT scan between 2012 and 2016 were retrospectively enrolled. The inclusion criteria were (1) patient’s age ≥ 20 years, (2) pathologically proven FL or MZBCL treated with the standard regimen of chemotherapy, and (3) available FDG PET/CT scans obtained at baseline, interim (after 2–3 cycles) and EOT phases. The study design and waiver of informed consent were approved by our institutional review board (H-1703-108-840).

FDG PET/CT and Image Analysis

Patients fasted for at least 6 h, and PET/CT was performed 1 h after intravenous injection of FDG (5.18 MBq/kg) using dedicated PET/CT scanners (Biograph mCT40 or mCT64, Siemens Healthcare, Germany). A CT scan for attenuation correction and lesion localization was performed, followed by an emission scan from the skull base to the proximal thigh. PET images were reconstructed on 128 × 128 matrices using an iterative algorithm.

Images were analyzed using an analysis software package (Syngo.via, Siemens Healthcare, Knoxville, TN, USA). For quantitative analysis of FDG uptake, the maximum SUV (SUVmax; g/mL) of each lesion was obtained. For volumetric analysis, the metabolic tumor volume (MTV; cm³) and total lesion glycolysis (TLG; g) of each lesion were measured: a spherical volume of interest (VOI) was manually drawn to encompass the whole target lesion, and a tumor contour was automatically drawn within it using a margin threshold of SUV 3.0. The volume of the isocontour VOI was defined as MTV, and TLG was calculated by multiplying the mean SUV by the MTV. The cutoff value of SUV 3.0 was chosen based on our preliminary analysis, in which margin thresholds relative to SUVmax (30–70% of SUVmax in increments of 10%), fixed values (SUV 3.0 and 4.0), and a reference tissue-based value (twice the mean SUV of the mediastinal blood pool) were tested. Among them, SUV 3.0 showed the highest statistical significance. Additionally, SUV 3.0 was considered to be the usual mean SUV of the liver, which is used as the reference tissue in the Deauville criteria and Lugano classification.
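The fixed-threshold measurement described above can be illustrated with a minimal Python sketch. It is not the Syngo.via implementation: the function name mtv_and_tlg, the boolean VOI mask, the voxel volume argument, and the choice to include voxels exactly at the threshold (SUV ≥ 3.0) are assumptions made for illustration only.

```python
import numpy as np


def mtv_and_tlg(suv, voi_mask, voxel_volume_ml, threshold=3.0):
    """Return (SUVmax, MTV in cm^3, TLG in g) for one lesion.

    suv             : 3D array of SUV values (g/mL)
    voi_mask        : boolean array of the same shape marking the manual VOI
    voxel_volume_ml : volume of one voxel in mL (1 mL = 1 cm^3)
    threshold       : fixed SUV margin threshold (3.0 in the text);
                      using >= rather than > is an assumption
    """
    suv = np.asarray(suv, dtype=float)
    voi_mask = np.asarray(voi_mask, dtype=bool)

    voi_values = suv[voi_mask]
    suv_max = float(voi_values.max()) if voi_values.size else 0.0

    # Voxels inside the VOI at or above the threshold form the tumor contour.
    tumor_values = voi_values[voi_values >= threshold]
    if tumor_values.size == 0:
        return suv_max, 0.0, 0.0

    mtv = tumor_values.size * voxel_volume_ml  # metabolic tumor volume
    tlg = float(tumor_values.mean()) * mtv     # mean SUV x MTV
    return suv_max, mtv, tlg


# Synthetic example (made-up numbers, not study data).
suv = np.zeros((4, 4, 4))
suv[1:3, 1:3, 1:3] = 5.0
suv[1, 1, 1] = 9.0
voi = np.ones(suv.shape, dtype=bool)
print(mtv_and_tlg(suv, voi, voxel_volume_ml=0.5))
```

A lesion with no voxel reaching the threshold yields an MTV and TLG of zero in this sketch, which is one plausible way to handle lesions that have resolved at EOT.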

Two target lesion sets were defined for analysis: (1) the single hottest lesion (target A), analogous to the target lesion definition of PERCIST, and (2) a maximum of five hottest lesions (target B), analogous to the target lesion definition of RECIST1.1. For target B, the quantitative PET indexes of all lesions were summed into a single value. Quantitative indexes were measured on initial, interim, and EOT PET images, and their percent differences (%Δ) between initial and interim PET and between initial and EOT PET were calculated.
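As a rough illustration of the two target sets and the %Δ calculation, the following Python sketch sums an index over the hottest lesions and computes a percent difference. The helper names summed_index and percent_change are hypothetical, and the sign convention (follow-up minus baseline, divided by baseline) is an assumption, since the text does not state it.

```python
def summed_index(lesion_values, lesion_suvmax, max_lesions=1):
    """Sum an index over the hottest lesions, ranked by SUVmax.

    max_lesions=1 corresponds to target A, max_lesions=5 to target B.
    """
    if len(lesion_values) != len(lesion_suvmax):
        raise ValueError("one index value is needed per lesion")
    ranked = sorted(range(len(lesion_suvmax)),
                    key=lambda i: lesion_suvmax[i], reverse=True)
    return sum(lesion_values[i] for i in ranked[:max_lesions])


def percent_change(baseline, follow_up):
    """Percent difference from baseline (assumed sign: negative = decrease)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (follow_up - baseline) / baseline * 100.0


# Made-up baseline SUVmax values for six lesions (not study data).
baseline_suvmax = [12.0, 8.0, 6.0, 5.0, 4.0, 3.5]
target_a = summed_index(baseline_suvmax, baseline_suvmax, max_lesions=1)
target_b = summed_index(baseline_suvmax, baseline_suvmax, max_lesions=5)
print(target_a, target_b)
print(percent_change(target_a, 3.0))
```

The sketch does not decide whether the same lesions are followed across time points or the hottest lesions are re-selected on each scan; the text leaves this open.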

Response Evaluation and Follow-up for Survival Analysis

EOT PET was visually analyzed by consensus of two experienced nuclear medicine physicians, and response was determined according to the Lugano classification: CR was defined as a score of 1–3 without any new lesion, and non-CR was defined as a score of 4–5 or any new lesion [2]. SUVmax, MTV, and TLG and their %Δ from initial PET were measured for each of the two target lesion sets (targets A and B). On EOT PET, quantitative indexes were compared with the visually determined response. On interim PET, quantitative indexes were tested as early markers of the response on EOT PET.
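The dichotomized response rule stated above can be written as a small Python function. This is only a sketch of the CR versus non-CR grouping used in this study; the function name lugano_response is hypothetical, and the full Lugano classification has more categories than are modeled here.

```python
def lugano_response(deauville_score, new_lesion):
    """Map a 5-point visual score and new-lesion status to CR or non-CR."""
    if deauville_score not in (1, 2, 3, 4, 5):
        raise ValueError("score must be an integer from 1 to 5")
    # Scores 4-5 or any new lesion are grouped as non-CR.
    if new_lesion or deauville_score >= 4:
        return "non-CR"
    return "CR"


print(lugano_response(3, new_lesion=False))
print(lugano_response(2, new_lesion=True))
```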

For survival analysis, progression-free survival (PFS) was evaluated. Progression of disease was defined by PET/CT performed during follow-up. On follow-up PET/CT, progression was defined by the PERCIST criteria, as any new lesion or > 50% increase of SUVmax in previous lesions [12]. PFS was calculated from the date of baseline scan to progression of disease or death.

Statistical Analysis

Quantitative indexes were compared between groups using Student’s t test. For survival analysis, the optimal cutoff value of each index was determined by the receiver-operating characteristic curve analysis to maximize diagnostic performance. Survival analysis was performed using the Kaplan-Meier curve and Cox regression analysis. All statistical analyses were performed using a commercial statistical software package (MedCalc Ver. 18.2.1, MedCalc Software bvba, Ostend, Belgium), and p values less than 0.05 were considered statistically significant.
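How a cutoff can be derived from a receiver-operating characteristic analysis is sketched below in Python. The text states only that the cutoff maximized diagnostic performance; interpreting this as maximizing the Youden index (sensitivity + specificity − 1) is an assumption, as is treating values above the cutoff as test positive. The function name youden_cutoff and the example data are hypothetical and are not study results.

```python
import numpy as np


def youden_cutoff(values, events):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its value is strictly above the cutoff.
    Candidate cutoffs are the observed values themselves.
    """
    values = np.asarray(values, dtype=float)
    events = np.asarray(events, dtype=bool)
    if events.all() or not events.any():
        raise ValueError("both event and non-event cases are required")

    best_cutoff, best_j = None, -1.0
    for c in np.unique(values):
        predicted = values > c
        sensitivity = np.mean(predicted[events])     # true positive rate
        specificity = np.mean(~predicted[~events])   # true negative rate
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cutoff, best_j = float(c), float(j)
    return best_cutoff, best_j


# Made-up index values and progression flags (1 = progression).
index_values = [1.2, 1.5, 2.0, 2.8, 3.5, 4.1, 6.0, 7.2]
progressed = [0, 0, 0, 0, 1, 0, 1, 1]
print(youden_cutoff(index_values, progressed))
```

The resulting cutoff would then be used to split patients into two groups for the Kaplan-Meier and Cox regression analyses; those steps are not reproduced here.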

Results

Patient Characteristics

A total of 57 patients (27 men and 30 women; mean age, 57 years; range, 25–79 years) were included in the analysis; 39 with FL and 18 with MZBCL. Most of the patients were in the advanced stage, and aggressive treatment was performed; all patients received a combination chemotherapy regimen of rituximab, cyclophosphamide, vincristine, and prednisolone. Patient characteristics are summarized in Table 1.

Table 1 Patient characteristics

During the follow-up period of 22.3 ± 11.6 months (range, 8.2–62.0 months), 14 patients (24.6%) showed disease progression at 25.8 ± 13.5 months (range, 9.1–56.4 months).

Comparison of Quantitative Indexes and Response by Visual Analysis

Most of the cases showed high FDG uptake on initial PET (SUVmax, 12.1 ± 7.9, Table 2). In visual analysis of EOT PET, 37 patients were classified as CR and 20 patients were classified as non-CR according to Lugano classification. SUVmax and MTV on EOT PET were well associated with visual analysis and were significantly different between CR and non-CR groups with both targets A and B, whereas TLG was not. The indexes for changes between initial and EOT PET (%ΔSUVmax, %ΔMTV, and %ΔTLG) were not significantly different between CR and non-CR groups with both targets A and B (Table 2).

Table 2 Initial values and end-of-treatment values of quantitative indexes according to response by visual analysis

On interim PET, SUVmax and %ΔSUVmax were well associated with final response on EOT PET; SUVmax and %ΔSUVmax with both targets A and B were significantly different between CR and non-CR groups (Table 3). However, %ΔTLG of target A and MTV and TLG of target B did not differ significantly between CR and non-CR groups, mostly due to wide variation in the non-CR group.

Table 3 Interim values of quantitative indexes according to response by visual analysis

Prognosis by Quantitative PET Indexes

In visual analysis, non-CR by Lugano classification on EOT PET was a significant prognostic factor (p < 0.0001; hazard ratio (HR), 20.07; 95% CI, 6.87–58.66). Most of the tested indexes were also significant factors for predicting PFS (Table 4). On EOT PET, SUVmax, MTV, and TLG were all significant prognostic factors with both targets A and B (Fig. 1). Among them, SUVmax presented the highest HR (6.76 with target A and 8.62 with target B). Regarding the indexes for changes between initial and EOT PET, %ΔMTV of target A and %ΔSUVmax of target B were not significant prognostic factors.

Table 4 Prognostic values of quantitative PET indexes in predicting progression-free survival

Fig. 1 Progression-free survival according to quantitative indexes on EOT PET; with target A (a–c) and target B (d–f). All the indexes were significant prognostic factors, with highest hazard ratio presented by SUVmax (a, d)

On interim PET, most of the tested indexes were also significant prognostic factors, whereas MTV of target A and MTV and %ΔSUVmax of target B were not (Table 4). As on EOT PET, SUVmax was a significant factor with both targets A and B, although the highest hazard ratio was presented by %ΔTLG in target A (HR, 4.61; 95% CI, 1.03–20.59) and TLG in target B (HR, 4.93; 95% CI, 1.71–14.23) (Fig. 2).

Fig. 2 Progression-free survival according to quantitative indexes on interim PET. SUVmax was significantly associated with PFS, both with target A (a) and target B (b). The highest hazard ratio was presented with TLG of target B (c)

Representative cases regarding prognostic role of FDG PET are shown in Fig. 3.

Fig. 3 Whole body FDG PET images of representative cases. A 57-year-old male patient with follicular lymphoma exhibited high SUVmax on both baseline (a) and EOT PET (b) (SUVmax 17.22 and 12.46, respectively). The patient experienced disease progression 11.9 months after completion of chemotherapy. A 32-year-old female patient with follicular lymphoma exhibited high SUVmax (SUVmax 18.24) on baseline (c), but low SUVmax (SUVmax 1.56) on EOT PET (d). The patient had been in CR state for 19.5 months after completion of chemotherapy

Discussion

In this study, quantitative indexes from FDG PET/CT were evaluated for their feasibility and effectiveness in response evaluation and prognosis prediction in indolent lymphoma. On EOT PET, SUVmax and MTV of both targets A and B were well associated with the Lugano classification. Because the Lugano classification and Deauville score are based only on tumor uptake at EOT, it is not surprising that SUVmax on EOT PET is well associated with the Lugano classification. MTV may be an additional effective index for response evaluation. Simple measurement of SUVmax for the single hottest lesion (target A) appears to be sufficient, although measurement from multiple lesions (target B) may show the differences more clearly. None of the %Δ indexes was significantly associated with the Lugano classification, although %ΔSUVmax is the recommended index in PERCIST. On interim PET, SUVmax and its %Δ from initial PET were also well associated with response at EOT.

Despite its wide use, the Lugano classification is a surrogate marker of response. Treatment response should ultimately be associated with the efficacy of treatment or the outcome of the patient who receives the treatment. Thus, PFS was evaluated as the outcome in this study. SUVmax at EOT exhibited the highest HR, although most of the tested indexes were significant prognostic factors. As in response evaluation, SUVmax of target A was an effective prognostic marker, while SUVmax of target B exhibited a slightly higher HR. The results suggest that simple measurement of SUVmax of the single hottest lesion can be used as an effective index for both response evaluation and PFS prediction.

FDG PET/CT has been reported to be effective for response evaluation in FDG-avid lymphoma [14,15,16,17,18,19]. Usually, FDG avidity of lesions is determined, and the response is assessed using the 5-point scale of visual assessment [15, 18, 20]. However, there has not been much evidence for the effectiveness of FDG PET-based response evaluation in indolent lymphoma, especially in tumors with low FDG avidity. Indolent lymphoma shows slow disease progression with variable metabolic features. FL, the most common type of indolent lymphoma, usually shows moderate to high FDG avidity, and previous studies have shown the effectiveness of FDG PET in FL [4, 6]. These studies reported high performance of FDG PET in lesion detection, which results in more accurate staging and more adequate treatment planning. In contrast, MZBCL shows a wide variety of FDG avidity, with a tendency toward low FDG uptake [11, 20,21,22], and thus, follow-up using FDG PET/CT is generally not recommended in this type of lymphoma [20, 21]. The present study demonstrated the effectiveness of FDG PET in response evaluation of indolent lymphoma by using quantitative indexes.

In the analysis of FDG PET, SUVmax is the most widely used index for various purposes. In treatment response evaluation, the use of SUVmax or SUVpeak is recommended in PERCIST. MTV and TLG are volume-based indexes that reflect tumor burden, and they are expected to be effective in prognosis prediction and response evaluation. However, in the present study, MTV and TLG did not surpass SUVmax in terms of response evaluation and prognosis prediction. In particular, TLG did not show significant differences between CR and non-CR groups, probably due to wide variation. Although further studies are required to validate the results, they suggest that SUVmax is still a simple and effective index.

Recently, the role of interim PET for response evaluation has been emphasized [7, 23,24,25,26,27]. It is generally accepted that a favorable response, such as metabolic CR on interim PET, is associated with a greater chance of achieving CR at EOT and a lower chance of relapse [28,29,30]. Conversely, if interim PET does not show a response, the patient is more likely to have a poor outcome. If a poor response is observed on interim PET, the treatment regimen may be changed for a better outcome. In DLBCL, recent studies have suggested the usefulness of interim PET assessment using quantitative indexes as well as the visual scale [29, 31, 32]. These studies attempted to enhance the utility of interim PET using the quantitative index of SUVmax, which exhibited promising predictive values [29, 31, 32]. In this study, the prognostic role of interim FDG PET was tested in a group of patients with indolent lymphoma. In accordance with these previous studies, the results showed that SUVmax and %ΔSUVmax are well associated with final response on EOT PET. Additionally, other indexes on interim PET, such as TLG, %ΔTLG, and %ΔMTV, were also significant in predicting prognosis.

This study has some limitations. First, a small number of patients were included because of the relatively low incidence of indolent lymphoma. Further studies with a larger group of patients are required. Second, because the inclusion criteria required baseline, interim, and EOT PET scans for every patient, almost all the enrolled patients were in an advanced stage. Accordingly, the disease characteristics and treatment regimen in the present study were somewhat different from those of low-stage lymphoma. Additionally, FL, the most common type of indolent lymphoma, shows moderate to high FDG uptake in most cases, which resulted in a large variation of FDG uptake in our study cohort. The comparison of PET indexes for response evaluation may have been affected by the heterogeneous FDG avidity of the lymphomas. In the future, a homogeneous group of lymphomas with low FDG avidity needs to be analyzed in a larger cohort regarding the effectiveness of FDG PET in response evaluation.

Conclusion

In indolent lymphoma, quantitative indexes of FDG PET are well associated with Lugano classification results. A simple measurement of SUVmax of the single hottest lesion can be an effective index for response evaluation and prognosis prediction in indolent lymphoma.