Introduction

For more than half a century, functional imaging has been used in the management of patients with lymphoma, from staging prior to treatment to interim evaluation and posttreatment assessment. Standardization of staging and response assessment permits comparisons among studies and facilitates regulatory approval. Newer applications of imaging include assignment of prognosis and risk-adapted approaches. Taken together, functional imaging has contributed to the improved management of patients with lymphomas.

Staging

Staging of lymphoma is necessary to define the extent of the disease to direct appropriate therapy. Early staging recommendations for lymphoma were primarily for Hodgkin lymphoma for which the primary mode of treatment was radiation therapy [1, 2]. At the time, effective treatments were nonexistent for other lymphoma histologies. Methodologies used to determine the extent of spread of disease ranged from staging laparotomy to imaging techniques including roentgenograms, intravenous pyelograms, lymphangiograms, and ultrasound. However, these modalities lacked sensitivity and were limited in their ability to distinguish active tumor from scar tissue, fibrosis, or other benign entities. Functional imaging was needed to fulfill this necessary objective.

The most successful of the early methods was gallium-67 (67Ga) scintigraphy, which relies on the accumulation of 67Ga by viable lymphoma cells through binding to transferrin receptors. 67Ga scans were more specific than CT scans in their ability to distinguish benign tissue from lymphoma, with an accuracy ranging from 70 to 84%. The negative predictive value ranged widely, from 65 to 96%. Unfortunately, 67Ga scintigraphy suffered from numerous deficiencies that limited widespread adoption. As long as 7–14 days were required after injection of the isotope before the results were available, which reduced its clinical utility. 67Ga scintigraphy also suffered from low spatial resolution and suboptimal sensitivity and specificity. These limitations were particularly notable for indolent lymphomas and for bowel involvement, where interpretation could be confounded by physiological bowel uptake.

The next milestone came from the Cotswolds meeting, which recommended inclusion of CT scans, primarily for staging of Hodgkin lymphoma [3]. However, it was not until 1999 that the first widely accepted response criteria were published by the International Working Group; these were based on the use of CT scans in the assessment of patients with non-Hodgkin lymphoma but were also adopted for Hodgkin lymphoma [4]. These guidelines standardized the definitions of complete remission (CR), complete remission unconfirmed (CRu), partial response (PR), stable disease (SD), and progressive disease (PD). However, these criteria were subject to intraobserver variation and were based on outdated methods of evaluation such as physical examination, chest radiographs, CT scans, and 67Ga scintigraphy.

PET Scanning

The first positron emission tomography (PET) scanner was developed in 1973 by Hoffman and Phelps but was not applied to lymphoma for almost 20 years. In 1998 Townsend and Nutt invented the PET-CT scan using 2′-fluorodeoxyglucose (FDG). FDG uptake by malignant cells depends on the expression of the glucose transporters 1–7 (GLUT1 to GLUT7), and the rate of uptake depends upon the glycolytic activity, mainly the hexokinase II activity [5]. Non-malignant cells with high glycolytic activity, such as activated macrophages or immunocompetent cells, can similarly contribute to FDG uptake. Therefore, the uptake observed in lymphoma may be due not only to lymphoma cells but also to cells of the microenvironment. A lesion’s uptake on a PET scan performed at staging or for response assessment should therefore be interpreted in the context of these two components [6]. Notably, in Hodgkin lymphoma, Hodgkin and Reed-Sternberg (HRS) cells account for less than 1–5% of the total lymphoma but interact with the overwhelming population of non-neoplastic mononuclear bystander cells. These cells (CCR4-expressing cell subsets, including eosinophils, histiocytes, macrophages, plasma cells, and Th2 and Treg lymphocytes) are recruited by chemokines produced by the HRS cells and induce the expression of antiapoptotic proteins in HRS cells and their immortalization via a paracrine loop. There is an intimate relationship between the HRS cells and reactive cells of the microenvironment that enables the tumor to thrive and evade immune surveillance. The microenvironment is the major component of Hodgkin lymphoma nodes. In addition, the uptake of FDG by the different components of the Hodgkin tumor has different prognostic meanings and behaves differently during treatment. GLUT1 expression in HRS cells correlates with PDL1 and PDL2 expression, but not with PD1 expression in the microenvironment [7].

Differences in glucose transport have recently been shown among the various lymphoma subtypes: GLUT1, GLUT3, and GLUT7 expression are involved in FDG uptake in DLBCL cell lines, whereas GLUT7 is not involved in NK/T-cell lines [8]. Clinically, FDG avidity is greater in aggressive than in indolent subtypes, but with great variation within histologies [9]. Detection also depends on the uptake by the microenvironment cells, the distribution of lesions (focal lesions being easier to detect than diffuse infiltration), and the localization of the lesion, in a parenchyma without physiological uptake (lymph node) or with it (gut or bone marrow). As an example, bone marrow infiltration by follicular lymphoma is missed in 60% of cases, which is explained by its diffuse character, and marrow involvement may also be missed in DLBCL when it is diffuse rather than focal. Detection of MALT lymphoma is a challenge because of the high physiological uptake in the gut, as is the detection of other extranodal MZL. Thus, PET has proven to be useful in the majority of lymphoma histologic subtypes, with the exception of chronic lymphocytic leukemia/small lymphocytic lymphoma, extranodal marginal zone lymphoma, and some cutaneous and enteropathy-type T-cell lymphomas.

PET Quantification

PET quantification is currently performed using the standardized uptake value (SUV), an indirect evaluation of FDG uptake and glucose utilization. The SUV is a semiquantitative ratio expressing the relative local increase of FDG metabolism in a region of the body (e.g., a lymph node). This ratio would be equal to 1 if the FDG distribution were homogeneous throughout the body. However, the SUVs measured in lymphoma lesions have increased markedly as PET scanner performance has improved, nearly doubling in some histologies, which prevents direct comparison between newer data and older studies.
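As an illustration, the SUV ratio described above can be sketched in a few lines of Python. The sketch assumes the common body-weight normalization (tissue activity concentration divided by injected activity per gram of body weight, with a tissue density of 1 g/mL) and assumes that both activities are already decay-corrected to the same time point. The function and parameter names are hypothetical and are not taken from any scanner software.

```python
def suv_body_weight(tissue_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Body-weight-normalized SUV (illustrative sketch, not a vendor formula).

    Assumptions: activities are decay-corrected to the same time point and
    tissue density is 1 g/mL, so kBq/mL can be compared with kBq/g.
    """
    if injected_dose_mbq <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    injected_kbq = injected_dose_mbq * 1000.0   # MBq -> kBq
    weight_g = body_weight_kg * 1000.0          # kg -> g
    mean_body_concentration = injected_kbq / weight_g  # kBq per gram
    return tissue_kbq_per_ml / mean_body_concentration


# A perfectly homogeneous distribution gives SUV = 1:
# 350 MBq in a 70 kg patient is 5 kBq/g everywhere.
print(suv_body_weight(5.0, 350.0, 70.0))
# A lymph node concentrating ten times that amount gives SUV = 10.
print(suv_body_weight(50.0, 350.0, 70.0))
```

Because the result depends on reconstruction and scanner performance as well as biology, values computed this way on different generations of devices are not directly comparable.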

Since the degree of uptake measured by the maximum standardized uptake value (SUVmax) observed in a patient correlates with lymphoma aggressiveness, studies have suggested that an SUVmax of 14 or more found in an indolent lymphoma is suggestive of aggressive transformation [10]. This conclusion has been recently challenged in follicular lymphoma, in which a high pretreatment SUV did not correlate with the presence of histologic transformation [11]. Moreover, the thresholds proposed for transformation in older series must be reconsidered for data acquired on newer PET devices.

The Role of PET in Lymphoma Staging

The first attempt at a uniform staging system was the Rye Classification of 1966 [1], which was eventually supplanted by the Ann Arbor Classification of 1971 [2]. These were used primarily for HL, as there were no successful treatments for NHL at that time. The Rye Classification divided patients into 4 stages on the basis of extent of disease, and into A and B related to the absence or presence, respectively, of constitutional symptoms. However, over the ensuing years, it became apparent that patients are better distinguished into limited stage (AA I and II) and advanced stage (bulky II, III, and IV) based on how they are treated and their expected outcome with therapy. More recently, numerous studies have demonstrated that PET-CT is more sensitive and specific than CT alone in staging for both nodal and extranodal disease [12,13,14,15, 16•]. Compared with CT, PET-CT identifies additional sites of disease, leading to upstaging in 10–30% of patients; less often, it rules out disease involvement. However, the data on change in management on the basis of PET-CT are limited, with few data showing that patient outcome is improved. In one of the largest studies, the RATHL trial in 1214 patients with advanced-stage Hodgkin lymphoma, contrast-enhanced CT and PET-CT were compared in staging [17]. The two modalities were concordant in 80%; however, PET-CT upstaged 14% and downstaged 6% of patients. CT identified disease in only 7 patients who were negative by PET-CT. One of the most significant changes in staging as a result of PET-CT is the elimination of the need for bone marrow biopsies in most patients with Hodgkin and diffuse large B-cell lymphoma [18].

The Lugano classification [19] established PET-CT as the standard for staging FDG-avid lymphomas, not only because of its high sensitivity and specificity, but also to provide a baseline against which to compare posttreatment assessment. A contrast-enhanced CT is only needed for accurate measurement of the size of nodes or masses. The increased sensitivity of PET-CT is of particular importance in identifying patients with limited disease who are being considered candidates for radiation therapy.

The Role of PET in Response Assessment

Numerous studies have demonstrated that PET and later PET-CT are more sensitive and specific than other imaging techniques for the evaluation of response in patients with various histologies of lymphoma. In 2005, Juweid et al. [20] reported that the outcome of patients with DLBCL and a residual mass that was no longer FDG-avid was similar to that of patients with a CT-confirmed CR. With the increased use of 18F-fluorodeoxyglucose PET scanning, revised criteria were published by the NCI-Working Group, originally for Hodgkin lymphoma and DLBCL [21]. Again, these were rapidly adopted, and their use expanded to other FDG-avid lymphomas.

Following publication of the revised guidelines, several events warranted further revisions. First, greater experience using the criteria identified issues needing clarification. Whereas the revised criteria used visual interpretation of PET scans, the development of the 5-point Deauville scale allowed for better standardization and interobserver concurrence. PET is positive if the residual uptake is higher than a fixed reference background (RB). The candidate RBs, ranked in increasing intensity, are the nearby background, the mediastinal blood pool, and the liver. For the same residual uptake, choosing a more intense RB can turn a positive PET into a negative one.

In 2009 the Deauville criteria (DC) defined a set of visual criteria dedicated to interim PET evaluation, scaling the residual uptake against all these reference backgrounds on a 5-point scale. The DC underlined that the liver was the RB that was visually easiest to evaluate. Scores of 4 (moderately increased uptake over the liver) and 5 (markedly increased uptake and/or new lesion) were considered positive. These criteria have improved readings of interim PET in HL [22] and DLBCL [23]. However, there are difficulties in visual reporting, since the eye is more sensitive to contrast than to differences in intensity. The uptake of a node surrounded by a faint background (e.g., an axillary lymph node) may seem more intense and can mislead the analysis.

The Deauville criteria were then redefined in Menton [24] and, subsequently, in Lugano in 2014 [19, 25] and extended from interim to end-of-treatment PET. Score 5 was defined as an uptake > 2–3 times the SUVmax in normal liver. However, although it was advised to report the PET using an SUV scale, the term “moderately increased” for score 4 was not defined. Thus, assigning a score of 4 on the basis of a small increase of SUV relative to the liver may produce a false-positive result.
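The logic of the 5-point scale can be sketched as a simple comparison of SUV values against the reference backgrounds. This is only an illustration under stated assumptions: real Deauville scoring is primarily visual, scores 1 to 3 follow the usual wording of the scale (no uptake above background; uptake at or below the mediastinal blood pool; uptake above the mediastinum but at or below the liver), and the multiplier for score 5 is set here to 2.0, the lower end of the 2–3 range quoted above. All names and example values are hypothetical.

```python
def deauville_score(lesion_suvmax, background_suv, mediastinum_suv,
                    liver_suvmax, new_lesion=False, score5_factor=2.0):
    """Illustrative SUV-based approximation of the 5-point Deauville scale.

    score5_factor is an assumption (the text quotes 2-3 times liver SUVmax).
    """
    if new_lesion or lesion_suvmax > score5_factor * liver_suvmax:
        return 5  # markedly increased uptake and/or new lesion
    if lesion_suvmax > liver_suvmax:
        return 4  # "moderately" increased over liver (lower bound undefined)
    if lesion_suvmax > mediastinum_suv:
        return 3  # above mediastinal blood pool, at or below liver
    if lesion_suvmax > background_suv:
        return 2  # at or below mediastinal blood pool
    return 1      # no uptake above background


def is_positive(score):
    """Scores of 4 and 5 are read as positive."""
    return score >= 4


# A residual lesion barely above liver uptake is already scored 4,
# which is the false-positive risk discussed in the text.
score = deauville_score(lesion_suvmax=3.1, background_suv=1.0,
                        mediastinum_suv=2.0, liver_suvmax=3.0)
print(score, is_positive(score))
```

Replacing the strict comparison against the liver with a ratio threshold is one way such a rule could be made less sensitive to small SUV differences.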

Quantitative Approaches

A quantitative approach was proposed several years ago in DLBCL to overcome the difficulties in the interpretation of the Deauville score (DS): the ΔSUVmax, i.e., the reduction of the SUVmax from baseline to interim [26]. For interim PET (iPET), the cut-off for better discrimination of response was a reduction of 66% after two cycles and 72% after 4 cycles. Several retrospective and prospective studies performed by national lymphoma groups (LYSA, CALGB/ALLIANCE, SAKK) have reported that this approach performed better than the DS for predicting outcome in aggressive DLBCL, with better interobserver reproducibility [23, 27, 28••, 29•, 30, 31]. Another approach has been proposed for all risk categories of DLBCL: a tumor/liver (T/L) ratio of 1.4. It was the only significant factor for outcome prediction at iPET4 and EOT in a series of 181 patients [32]. In the AHL 2011 trial including 800 advanced HL patients, the PET-guided strategy of de-escalation from BEACOPP to ABVD was based on PET negativity, and PET was considered positive if the T/L ratio was ≥ 1.4 [33]. The 5-year PFS was remarkably high at 85% and similar in the standard and the PET-driven arms. The overall survival was also similar and higher than 95%.
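The two quantitative rules above reduce to simple arithmetic, sketched here in Python. The cut-offs are those quoted in the text (66% after 2 cycles, 72% after 4 cycles, T/L ratio of 1.4). Treating a reduction strictly greater than the cut-off as a good response is an assumption of this sketch, and the function names are hypothetical.

```python
# Cut-offs as quoted in the text, keyed by the number of cycles before iPET.
DELTA_SUVMAX_CUTOFFS = {2: 66.0, 4: 72.0}


def delta_suvmax(baseline_suvmax, interim_suvmax):
    """Percentage reduction of SUVmax from baseline to interim PET."""
    if baseline_suvmax <= 0:
        raise ValueError("baseline SUVmax must be positive")
    return 100.0 * (baseline_suvmax - interim_suvmax) / baseline_suvmax


def is_good_responder(baseline_suvmax, interim_suvmax, cycles):
    """Assumption: a reduction strictly above the cut-off is a good response."""
    cutoff = DELTA_SUVMAX_CUTOFFS[cycles]
    return delta_suvmax(baseline_suvmax, interim_suvmax) > cutoff


def tl_ratio_positive(tumor_suvmax, liver_suvmax, cutoff=1.4):
    """PET positivity by tumor/liver ratio, using the >= 1.4 rule quoted above."""
    return tumor_suvmax / liver_suvmax >= cutoff


print(delta_suvmax(20.0, 5.0))                 # 75.0 percent reduction
print(is_good_responder(20.0, 5.0, cycles=2))  # above the 66% cut-off
print(is_good_responder(20.0, 8.0, cycles=2))  # 60% reduction, below cut-off
print(tl_ratio_positive(4.5, 3.0))             # ratio 1.5
```

Note that ΔSUVmax requires a baseline scan, whereas the T/L ratio can be computed from the interim or end-of-treatment scan alone.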

Additional data supported the validity of PET in other histologies, notably follicular lymphoma. Trotman et al. performed a pooled analysis of two retrospective studies and one prospective study and demonstrated that posttreatment PET predicted not only PFS, but also OS when a cut-off of DS4 was used [34]. In 2012, at a conference at the International Workshop on Malignant Lymphoma (Lugano, Switzerland), the need for revised guidelines for evaluation, staging, and response assessment was discussed, and the guidelines were developed over the ensuing years. They were eventually published in 2014 as the Lugano classification [19]. Revisions included a modification of the Ann Arbor Classification for staging, FDG-PET-CT in the staging and response assessment of all FDG-avid lymphomas, and elimination of the need for bone marrow evaluation in the staging of Hodgkin lymphoma and DLBCL unless it was likely to alter treatment. Residual masses that were no longer FDG-avid were still consistent with a complete metabolic response. However, progression of a single node or mass indicated PD.

New Staging and Response Techniques

Baseline PET depicts the whole tumor burden. In addition, several parameters can be extracted by a computational process called image segmentation to improve on the prognostic value of the scan. Among these parameters, the total metabolic tumor volume (TMTV) is the most investigated. TMTV is the sum of the 3D volumes of all lesions with FDG uptake; it represents the viable fraction of tumor and microenvironment. The limits of the SUVmax as an absolute value have been addressed above. The same limits apply to the total lesion glycolysis, which is the sum of the SUVs of all the elements (voxels) included in the metabolic volume. Other metrics exist for single lesions under the term radiomics, but they all depend on the method of metabolic volume measurement, as detailed below. Metabolic volume cannot be drawn visually but must be determined by computer thresholding. The threshold for the lesion volume can be set relative to a reference, such as the SUVmax of the lesion (for instance, 40% of the tumor SUVmax, i.e., all the voxels between 40% of the SUVmax and the SUVmax are included in the volume) or any other internal reference; an absolute SUV value such as 2.5 or 4 can also be chosen as a threshold (Fig. 1). Different methods yield different volumes depending on the SUV in the lesions. However, the same method used in similar populations should yield a similar median TMTV, and different investigators using the same method but different software should find similar values.
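The two thresholding strategies can be illustrated with a deliberately simplified Python sketch. It assumes that each lesion has already been delineated and is given as a flat list of voxel SUVs with a known voxel volume; real software works on 3D images and handles lesion detection, connectivity, and physiological uptake, none of which is modeled here. The function names and the example voxel values are hypothetical.

```python
def volume_relative_threshold(voxel_suvs, voxel_ml, fraction=0.40):
    """Metabolic volume (mL) of one lesion: voxels >= fraction * lesion SUVmax."""
    threshold = fraction * max(voxel_suvs)
    return voxel_ml * sum(1 for suv in voxel_suvs if suv >= threshold)


def volume_absolute_threshold(voxel_suvs, voxel_ml, suv_threshold=2.5):
    """Metabolic volume (mL) of one lesion: voxels >= a fixed SUV value."""
    return voxel_ml * sum(1 for suv in voxel_suvs if suv >= suv_threshold)


def tmtv(lesions, voxel_ml, volume_fn, **kwargs):
    """Total metabolic tumor volume: sum of the per-lesion metabolic volumes."""
    return sum(volume_fn(voxels, voxel_ml, **kwargs) for voxels in lesions)


# Two hypothetical lesions, 0.5 mL voxels.
lesions = [
    [10.0, 8.0, 5.0, 3.0, 2.0],   # hot lesion: 40% of SUVmax is 4.0
    [4.0, 3.0, 2.6, 1.0],         # fainter lesion: 40% of SUVmax is 1.6
]
print(tmtv(lesions, 0.5, volume_relative_threshold, fraction=0.40))
print(tmtv(lesions, 0.5, volume_absolute_threshold, suv_threshold=2.5))
```

On these toy data the relative method keeps 3 + 3 voxels and the absolute method keeps 4 + 3, so the two methods return different volumes for the same lesions, which is why cut-offs cannot be transferred between methods.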

Fig. 1
TMTV in 4 patients with DLBCL ranging from small (72 cm3) to large (1119 cm3) volumes. TMTV was computed with the % SUVmax thresholding method. In these patients, the maximum distance between lesions normalized by body surface area (BSA), SDmax, was also computed. Patients with a low risk of events have a small metabolic tumor volume (≤ 220 cm3) and a small normalized maximum distance (≤ 0.32 m−1) between the lymphoma sites. Patients with a high risk have a large volume and a high maximum distance. Patients with an intermediate risk of events have either a large volume or a high distance. From (36•)
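The three risk groups described in the caption amount to counting how many of the two baseline features exceed their cut-off. A minimal Python sketch follows, using the cut-offs given in the caption (220 cm3 for TMTV and 0.32 m−1 for SDmax). The function name is hypothetical, and the SDmax values in the example are invented for illustration; only the two TMTV values come from the caption.

```python
def risk_group(tmtv_cm3, sdmax_per_m, tmtv_cutoff=220.0, sdmax_cutoff=0.32):
    """Risk group from baseline TMTV and normalized maximum distance (SDmax).

    low: both at or below cut-off; high: both above; intermediate: one above.
    """
    n_high = int(tmtv_cm3 > tmtv_cutoff) + int(sdmax_per_m > sdmax_cutoff)
    return ("low", "intermediate", "high")[n_high]


print(risk_group(72.0, 0.10))     # small volume, localized disease
print(risk_group(1119.0, 0.10))   # large volume only
print(risk_group(1119.0, 0.50))   # large volume and disseminated disease
```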

TMTV measurement is now automated but requires standardization to be used in trials. A standardization effort was launched by cooperative groups in 2019 to evaluate different methods and define a benchmark, so that every group using the same software with the same technique of volume measurement finds the same median TMTV. Recently, artificial intelligence methods have been applied successfully to TMTV measurement, and this field is under investigation [35, 36•, 37].

Radiomics and Heterogeneity

In all published studies in aggressive B-cell lymphomas, TMTV stratified patients better than Ann Arbor Stage. Patients with similar stage may have markedly different TMTV, but TMTV and stage do not fully characterize the heterogeneity of the lesions or their dissemination.

Other radiomic features include metrics derived from measurements in the image. Two radiomic parameters describe heterogeneity: the voxel value histogram (for instance, the SUV distribution within a bulky mass) and the voxel value spatial arrangement, which is the spatial organization of SUVs within this mass. Ceriani et al. [38] measured the metabolic heterogeneity (MH) in 103 patients with primary mediastinal B-cell lymphoma (PMBL) using the voxel SUV histogram. They observed that the metabolic heterogeneity in the mass was predictive of outcome. Progression-free survival at 5 years was 94% vs 73% in the low- vs high-MH groups. They proposed that total lesion glycolysis and MH could be used in future studies using risk-adapted strategies. Others have come to similar conclusions [39].

Interim Assessment

One of the more controversial areas of imaging is assessment during the planned treatment. Interim assessment has been suggested to potentially identify high-risk patients who are unlikely to respond to therapy, permitting an alteration in therapy to improve outcome, or, alternatively, those who are lower risk for whom the duration or intensity of therapy could be reduced, limiting toxicity.

A number of factors prevent the routine use of interim scanning. First, the optimal timing remains unclear. Residual activity on iPET can vary widely with the number of cycles and reflects residual tumor cells as well as treatment-related inflammatory cells and cells of the microenvironment. PET performed after cycles 1 and 2 evaluates chemosensitivity; however, it also records acute necrosis and inflammation. Results of PET performed after cycles 3 and 4 are balanced by regrowth. Consequently, data obtained after cycles 1 and 2 must not be pooled with data obtained after cycles 3 and 4, nor can the results at 2 cycles be inferred from what is observed at cycle 3. Some trials in DLBCL have explored these two steps of response by performing PET after both 2 and 4 cycles, which helps stratify patients [33, 40].

Benefit of interim PET has been most clearly demonstrated in Hodgkin lymphoma. Gallamini and coworkers first demonstrated that the results of a PET scan following 2 cycles of adriamycin, bleomycin, vinblastine, and dacarbazine (ABVD) were more predictive of outcome than standard risk scoring systems [22, 41]. In the RAPID trial [42], patients with limited stage disease received 3 cycles of ABVD and then underwent PET-CT. Those who were negative were randomized to either involved field RT or no further therapy. Survival was similar in the two groups, demonstrating that more than 90% of patients can be spared unnecessary radiation. Subsequent analysis showed that the best discriminant was a DS of 5 [16•]. In the German HD15 study [43], patients received one of several induction regimens. Those without residual disease or ≤ 2.5 cm on CT and a negative PET received no further therapy. Those with a residual mass > 2.5 cm on CT and a negative PET were followed without further therapy. Those with a positive PET underwent radiation therapy. The investigators demonstrated that they were able to reduce the number of patients who underwent radiation therapy from 71% in earlier trials to 11% in the HD15 study with no compromise in efficacy. In the aforementioned RATHL study [17], patients with advanced disease received 2 cycles of ABVD followed by a PET-CT scan. Those with a negative PET-CT were randomized to 4 additional cycles of either ABVD or AVD. The outcome of the two groups was similar, but with less toxicity in the AVD group. Thus, these studies demonstrated a reduction in therapy and toxicity with an interim scan.

Based on studies by Gallamini and coworkers [22, 41], patients with advanced HL and a positive interim scan would be expected to have a long-term PFS of about 20–30%. In the RATHL trial [17], patients with a positive interim scan were randomized between 4 cycles of BEACOPP-14 or 3 of escalated BEACOPP followed by another PET scan. If the scan was negative, patients received 2 additional cycles of BEACOPP-14 or 1 of escalated BEACOPP with no radiation therapy. Those with a positive scan went on to salvage approaches. Results with the two regimens in PET-negative patients were similar with a PFS of 60–70%, far better than expected. Other risk-adapted studies from various groups have generated similar findings with PFS consistently in the 60–77% range. Thus, in high-risk patients, altering treatment appears to improve outcome. This observation has been confirmed in other studies [33, 44,45,46,47,48].

Unfortunately, benefit from interim PET has been difficult to demonstrate in patients with DLBCL [49]. The largest randomized study was the PETAL trial in which patients with a positive interim scan after 2 cycles of R-CHOP were randomized to an intensive Burkitt-like regimen or to continue on induction R-CHOP for 4 additional cycles [28••]. There was no benefit from the aggressive regimen with respect to event-free or overall survival; however, those patients experienced greater toxicity. Similar results were reported in patients with peripheral T-cell lymphoma. Thus, interim PET should not be considered a standard approach in patients with aggressive lymphomas.

Assessing Response with Immunomodulatory Therapy

Recently, a number of immunomodulatory drugs used in solid tumors as well as lymphoid malignancies have been found to induce a flare reaction. These findings on PET scan may be confused with PD, resulting in discontinuation of a potentially active therapy. Representative drugs used in lymphomas include lenalidomide, rituximab, brentuximab vedotin, Bruton tyrosine kinase (BTK) inhibitors, and checkpoint inhibitors. This observation led to the development of the LYRIC (lymphoma response to immunomodulatory therapy criteria) recommendations [50]. The working group identified 3 subtypes of indeterminate response: IR1, increase in tumor volume within the first 12 weeks; IR2, increase in volume or number of lesions at any time, but not ≥ 50%; and IR3, an increase in FDG uptake without an increase in lesion size. A biopsy or subsequent scans will help distinguish flare from progressive disease. In addition, incorporation of next-generation sequencing assays, such as circulating tumor DNA, should enable better discrimination.

FDG-PET for Assessment of End of Treatment

End of treatment (EOT) was the first time point where the superiority of PET imaging over CT was demonstrated. Before the publication of harmonized criteria for PET reporting, several studies demonstrated the value of PET, reported using self-made criteria, in determining whether a residual mass after first-line chemotherapy in Hodgkin lymphoma and aggressive non-Hodgkin lymphomas was persistent disease or non-malignant tissue. Juweid et al., summarizing in 2008 the results obtained in 635 patients from different studies in which PET was reported prior to harmonization of criteria, noted that in HL the NPV was 71–100% with a PPV ranging from 13 to 100%; in NHL, the NPV was 74–100% and the PPV ranged from 50 to 83%. These figures have changed with the DS: the high negative predictive value of end-of-treatment PET is still confirmed, but the PPV has increased to 90% in HL, with PPV values ranging from 50 to 100% in DLBCL. In the HD0607 trial, 276 patients with advanced-stage (IIB-IVB) HL and a baseline nodal mass of ≥ 5 cm who were PET negative after 6 cycles of ABVD experienced a similar 6-year PFS whether (91%) or not (95%) the mass was irradiated [51••], supporting the omission of radiation in this population. In a recent British Columbia study, patients who were PET negative after 6 cycles of R-CHOP (72%) did not receive consolidation radiotherapy. Time to progression and overall survival at 3 years were 83% and 56%, respectively, for those with a negative scan and 87% and 64%, respectively, for those with a positive scan who received consolidative radiotherapy. Thus, radiation could be eliminated in PET-negative patients with bulky disease ≥ 10 cm, whose outcome was indistinguishable from that of patients without bulk [52]. In DLBCL the PPV of end-of-treatment PET may be equivocal even when PET is performed, as recommended, a minimum of 3 weeks and preferably 6 to 8 weeks after EOT [25]. This finding can be attributed either to inflammatory changes persisting after immunochemotherapy or to difficulties in scoring DC 4. Consequently, many centers in routine practice biopsy persistently positive sites in DLBCL. In addition, if there are persistent focal changes in the marrow in the context of a nodal response, consideration should be given to further evaluation with MRI, biopsy, or a subsequent scan.

Several studies have shown that post-induction PET predicts outcome in FL [34]. Trotman et al. performed a pooled analysis of 3 trials: PRIMA, PET Folliculaire, and FOLL05. In the 3 trials, 439 patients underwent a PET scan. Using a cut-off of 4 or higher on the DS as a positive study, the hazard ratio for PFS for those with a positive vs negative scan was 3.9 (95% CI 2.5–5.9, p < 0.0001) and was 6.7 (2.4–18.5; p = 0.0002) for overall survival. In contrast, CT-based response was only weakly predictive. FOLL12, including 769 patients treated with 6 cycles of R-CHOP or R-bendamustine + 2R, failed to demonstrate an advantage of PET- and MRD-guided therapy over standard R-maintenance. On the contrary, patients with guided therapy in whom maintenance was omitted had a 76% 3-year PFS versus 96% in the standard arm [53]. An analysis of posttreatment PET was also performed in the GALLIUM trial [54], which included 1202 patients, of whom 595 were included in the PET cohort interpreted prospectively by the IHP criteria and 508 were assessed retrospectively by the Lugano classification. Following treatment, 65.5% achieved a mCMR by IHP and 75.6% by the Lugano classification. The 2.5-year PFS from end of induction was 87.8% in CRs and 72% in non-CRs according to the IHP. Using the Lugano criteria, the PFS was similar for the CMRs at 87.4%, but was only 54.9% for the others. The authors concluded that their data validated the Lugano criteria and supported PET as preferred over contrast-enhanced CT scan.

In addition, PET-CT obviates the need for a posttreatment bone marrow biopsy in both DLBCL and FL [55•].

Thus, PET-CT remains the standard for response assessment for FDG-avid lymphomas, and its performance will be improved with subsequent modifications of response criteria [19].

Surveillance

Another controversial role for PET scanning is surveillance following posttreatment assessment. Theoretically, regular follow-up scans should identify recurrence sooner, allowing for the rapid institution of salvage therapy and improved patient outcome. Unfortunately, there are no data to support this possibility. Moreover, false positives lead to additional scanning and invasive procedures, and the practice is not cost-effective. Thompson et al. [56] performed an analysis of 680 patients with DLBCL treated with anthracycline-based chemo-immunotherapy. Of these, 81% achieved a CR, of whom 20% relapsed. Of the relapses, 64% were identified before a scheduled visit. Surveillance imaging identified asymptomatic relapse in 4 patients (1.8%). Clinical features at relapse included symptoms in approximately 60%, abnormal physical examination in 50%, and elevated LDH in 50%, with at least one feature present in 90%. Importantly, the overall survival was the same with or without routine scans. Thus, the recommendation is not to perform surveillance imaging in DLBCL and HL in the absence of evidence of disease recurrence [19]. How best to manage patients with other histologies is less clear. In patients with FL and a CR, similar management appears to be acceptable. For those with residual disease following induction, CT scans can be performed every 3–6 months, depending on the size of the mass and its rate of growth. There are no specific recommendations for other histologies, and patients should be managed according to best medical judgment.

PET for Pretreatment Prognosis

A number of imaging procedures have been shown to be prognostic prior to initiating lymphoma treatment. Total metabolic tumor volume (TMTV) obtained on a baseline PET-CT is an evaluation of the total metabolic tumor burden. TMTV has been shown to be a strong predictor of PFS and OS in several lymphoma subtypes. In DLBCL the TMTV is correlated with the circulating tumor DNA (Alig et al. in revision JCO 2020). The GOYA study, including 1418 DLBCL patients, found that baseline TMTV is prognostic for PFS and OS, the risk increasing with the volume quartile [57]. In high tumor burden follicular lymphoma, the FOLLCOL study including 181 patients showed that baseline TMTV predicted PFS, POD24, and OS events better than FLIPI and FLIPI2 [58]. In early-stage Hodgkin lymphoma, the analysis of the standard arm of the H10 trial confirmed that a high baseline TMTV correlated with a lower PFS and OS. The 5-year estimate of PFS was 92% in the low TMTV group compared with 71% in the high TMTV group, and the 5-year OS was 98% vs 83%. TMTV was superior to the current EORTC/LYSA classification used in the trial (favorable/unfavorable). Indeed, two-thirds of patients classified as unfavorable were reclassified as low risk with TMTV. The high-risk patients identified with the TMTV had a 71% PFS instead of 84% for the unfavorable category [59]. Recently it has been shown that TMTV before CAR T-cell infusion was highly prognostic of early relapse and, combined with extranodal involvement, could stratify the risk of progression in a population of 116 relapsed or refractory DLBCL patients [60].

In addition, in several lymphoma subtypes (DLBCL, PTCL, HL, FL, PMBL), TMTV combined with other prognostic indices stratifies risk better than the clinical indices alone [58, 61,62,63, 64], and combined with the molecular profile it can identify, among patients at low molecular risk, a subset with poor outcome [65].

Using an approach that combines TMTV and a clinical parameter, a simple prognostic index has been built associating TMTV and ECOG performance status in DLBCL [60, 66]. This index was derived from patients 60–80 years old in the REMARC trial who were in good response after R-CHOP. Patients between 60 and 80 years with ECOG ≥ 2 and TMTV > 220 cm3 at baseline, before R-CHOP, had a worse prognosis and could be defined as ultra-high risk. This index, which is more discriminating than the IPI or NCCN-IPI, has been validated in 2306 DLBCL patients from two prospective trials, PETAL and GOYA, and a group of real-world patients from the UK, France, Poland, and Portugal [60].

Different TMTV cut-offs have been reported to separate high- and low-risk patients, which has created confusion. The likely explanation is that several issues are inherent to the measurement method. Once the method is fixed, the choice of cut-off becomes simpler. As clearly described by Pfreundschuh, taking the size of the bulk as an example, the best cut-off to separate high- and low-risk patients depends on the level of risk in the population [67]. Therefore, the TMTV cut-off changes with the type of population, and for a given population, it differs between methods.

Consistent with this, all methods applied to the same DLBCL population have been shown to give Kaplan-Meier curves with similar PFS and OS. TMTV is a strong prognostic tool that should soon be available in routine practice, but it has not yet been used to guide a therapeutic strategy. Moreover, only one study, in HL, has so far shown its predictive value (Moskowitz et al., Blood 2017). Indeed, in relapsed or refractory HL patients, TMTV improved the predictive power of pre-ASCT PET. In this study, baseline TMTV and pre-ASCT PET were independently prognostic: 3-year EFS for pre-ASCT PET-positive patients with low TMTV was 86%.

Integration of PET into Other Modalities

Direct comparison of unenhanced lower-dose PET-CT and cePET-CT suggests that, although ceCT may occasionally identify additional findings and improve detection of abdominal or pelvic disease, it rarely alters management in FDG-avid lymphoma. When contrast-enhanced CT is included in staging or restaging and has not already been performed, it should ideally be done at the same visit as PET-CT, with the PET/CT acquired before contrast media are injected to avoid errors in quantitative PET parameter measurements.

Metser et al. [68] assessed the role of PET/CT in a multicenter registry of 850 participants, enrolled on the basis of clinical data and CT, or of CT equivocal for advanced stage, who were considered for curative-intent first-line therapy, and compared them with a historical control pool staged by contrast-enhanced CT. One-year mortality was lower for participants with advanced non-Hodgkin lymphoma in the PET/CT cohort than in the CT cohort, and for those with limited stage at PET/CT compared with those with limited stage at CT. PET/CT upstaged approximately 18% of participants, and planned management was frequently altered. Participants with aggressive non-Hodgkin lymphoma whose first-line therapy was guided by PET/CT had significantly better survival than participants whose treatment was guided by CT alone.

PET/MRI

PET combined with MRI has been introduced recently and seems to give results comparable to PET/CT in pediatric lymphoma, with the advantage of a lower radiation dose [69]. Although PET-MRI might be of interest for evaluating brain and spinal cord lesions, the SUV is modified by the attenuation correction process, and consequently so is PET quantification. To date, no published systematic and prospective studies have demonstrated an advantage of PET/MRI over PET/CT in lymphoma management in adults.

PET for Assessment of the Microenvironment

The role of the microenvironment in FDG uptake is important in Hodgkin lymphoma, as described above. Agostinelli et al. [70] showed that increased expression of CD68 macrophages and PD1 by microenvironmental cells was associated with an unfavorable prognosis, permitting stratification of patients with a negative interim PET-2 scan. Indeed, in a group of 208 patients, a subset of PET-2-negative patients was identified whose 3-year PFS was significantly lower than that of the remaining PET-2-negative population. Viviani et al. [71] also showed that elevated levels of serum thymus and activation-regulated chemokine (TARC) identify, among PET-2-negative patients, those with a worse prognosis.

The importance of the microenvironment also explains the so-called pseudo-progression observed after immunomodulatory drugs such as checkpoint inhibitors or after CAR-T cell therapy. The phenomenon is associated with a transient increase in tumor size and activity secondary to an augmented immune infiltrate. Rather than a real progression, pseudo-progression represents a flare phenomenon induced by the massive recruitment of immune cells into the tumor microenvironment. It is a transitory event, confirmed as such only on subsequent scanning (or on biopsy) that demonstrates tumor regression and treatment benefit. The lymphoma response to immunomodulatory therapy criteria (LYRIC) classify the patterns of flare reactions under the category of indeterminate response (IR), with 3 subcategories: IR1, increase in overall tumor burden; IR2, appearance of new lesions or growth of 1 or more existing lesions; and IR3, increase in FDG uptake of 1 or more lesions without a concomitant increase in lesion size or number. Importantly, the authors also proposed that an increase in FDG avidity of 1 or more lesions suggestive of lymphoma, without a concomitant increase in size of those lesions meeting progressive disease (PD) criteria, does not constitute PD [50]. Recently, a retrospective study of 45 patients with relapsed or refractory HL showed that early CT and PET/CT, performed at a median of 2 months after initiation of nivolumab, predicted overall survival using Lugano 2014 or LYRIC criteria, and that early PET detected additional patients with complete metabolic response [72]. However, the authors did not observe any instances of pseudo-progression. This observation confirmed a previous analysis of the PET response in these conditions [73].
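
The three LYRIC indeterminate response subcategories listed above can be expressed as a small Python sketch. It encodes only the qualitative descriptions given in this text. The enum, the function name, and the boolean inputs are hypothetical, and any quantitative thresholds or timing rules that the full criteria may apply are deliberately not represented.

```python
from enum import Enum


class IndeterminateResponse(Enum):
    IR1 = "increase in overall tumor burden"
    IR2 = "new lesions or growth of 1 or more existing lesions"
    IR3 = "increased FDG uptake without increase in lesion size or number"


def indeterminate_categories(burden_increased, new_or_growing_lesions, uptake_increased):
    """Map qualitative scan findings to the IR subcategories named in the text.

    Each argument is a boolean judgment made by the reader of the scan.
    """
    categories = []
    if burden_increased:
        categories.append(IndeterminateResponse.IR1)
    if new_or_growing_lesions:
        categories.append(IndeterminateResponse.IR2)
    # IR3 applies only when uptake rises with no change in size or number.
    if uptake_increased and not (burden_increased or new_or_growing_lesions):
        categories.append(IndeterminateResponse.IR3)
    return categories


if __name__ == "__main__":
    for category in indeterminate_categories(False, False, True):
        print(category.name, "-", category.value)
```

An empty list means none of the flare patterns was flagged. Under the criteria as described, an indeterminate response is resolved only by subsequent imaging or biopsy, which this sketch does not model.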

PET and Minimal Residual Disease

Despite the achievement of a CMR, patients with lymphoma frequently relapse, reflecting the limited sensitivity of the assay. Several studies in FL support the integration of measures of MRD into response assessment [74]. The GALLIUM trial compared rituximab with obinutuzumab, each combined with chemotherapy, in untreated FL [75]. Despite similar response rates between the cohorts, there was a modest but significant difference in PFS. Similarly, in the GADOLIN study comparing bendamustine plus obinutuzumab with bendamustine alone in rituximab-refractory follicular and low-grade NHL [76], overall and complete response rates were comparable, yet the combination achieved longer PFS and overall survival. The improvement in outcome was associated with a greater and more prolonged reduction in MRD [77].

Future Directions

Prognosis

PET-CT has been shown to provide accurate staging and response information, and several lines of evidence also support its role in pretreatment prognosis. One of the more exciting possibilities would be to combine baseline TMTV with other assays to develop risk-adapted strategies. Kurtz et al. [78] demonstrated that dynamic measurements of ctDNA in 217 DLBCL patients could be complemented by results of interim PET. A good correlation was also found between baseline ctDNA and TMTV by this group [79]. However, TMTV does not capture the spread of the lesions, which is partly expressed by the Ann Arbor classification. Recently, a new index, the normalized maximum distance between two lymphoma lesions, was described as prognostic of outcome in a series of 95 DLBCL patients from the LNH073B trial [80]. This index has been validated in 290 DLBCL patients of the REMARC trial and reported to be highly prognostic of PFS. Combined with TMTV, it identifies a very high-risk category, and these two adverse factors appear to be more frequent in patients with CNS relapses [36•]. Interesting results from radiomics came from a recent retrospective study of HL patients, in which a model developed a posteriori from the radiomic characteristics of the bulky mass appeared to predict refractory disease [81]. Other potential combination strategies have been suggested, including radiomics and molecular genetic signatures [82].
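
As an illustration of a dissemination measure of this kind, the Python sketch below computes the largest distance between any two lesions and divides it by a normalizing quantity. The text does not say how lesion positions are defined or what the distance is normalized by, so lesion centroids in a common 3D coordinate system and a caller-supplied `normalizer` are assumptions, as are the function names.

```python
from itertools import combinations
from math import dist


def max_lesion_distance(centroids):
    """Largest Euclidean distance between any two lesion centroids.

    centroids: list of (x, y, z) tuples in a common unit (for example cm).
    Returns 0.0 when fewer than two lesions are present.
    """
    if len(centroids) < 2:
        return 0.0
    return max(dist(a, b) for a, b in combinations(centroids, 2))


def normalized_max_distance(centroids, normalizer):
    """Maximum lesion distance divided by a patient-level normalizer.

    The choice of normalizer is an assumption left to the caller.
    """
    if normalizer <= 0:
        raise ValueError("normalizer must be positive")
    return max_lesion_distance(centroids) / normalizer


if __name__ == "__main__":
    lesions = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
    print(max_lesion_distance(lesions))
    print(normalized_max_distance(lesions, normalizer=2.0))
```

Unlike TMTV, this quantity is unchanged when lesions grow in place and increases only when disease is found farther apart, which is why the two measures can carry complementary information.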

New Tracers

Several new tracers have been evaluated to improve on FDG. F18 deoxy-fluoro-thymidine (FLT), which has no brain uptake, was an interesting tool to detect brain involvement, but it is not readily available and is expensive. Studies have shown that early FLT has a lower positive predictive value than FDG in DLBCL [83]. An F18 glutamine analog, proposed to obviate the risk of a false-negative response with FDG when the tumor is growing mainly through glutamine uptake, has not yet been assessed in a large series of lymphoma patients, and the first results in solid tumors were disappointing except in glioblastoma [84]. 89Zr-immuno-PET is a promising noninvasive clinical tool that measures target engagement of monoclonal antibodies (mAbs) to predict toxicity in normal tissues and efficacy in tumors. First clinical results showed that nonspecific uptake of mAbs in tissues without target expression can be quantified using 89Zr-immuno-PET at multiple time points [85]. Other interesting tracers under investigation include F18 Fludarabine, which could have a better positive predictive value than FDG [86], and Ga68 Pentixafor, which labels CXCR4 expression and has been used to explore CNS lesions and MALT lymphoma [87]. None of these tracers is currently clinically available.

Use of PET-CT in Clinical Practice and Clinical Trials

Despite the marked contribution of PET-CT to the management of patients with lymphoma, the use of this modality should be limited to appropriate indications (Table 1). The decision whether to use PET-CT should be determined by the clinical situation, e.g., clinical practice vs clinical trial. PET-CT is preferred for the staging of all FDG-avid histologies given its improved sensitivity and specificity compared with CT alone. The benefit of interim scans has only been demonstrated in HL, where they assist in improving efficacy for high-risk patients and reducing toxicity for those at low risk. PET-CT is also preferred for restaging after treatment of lymphomas that were FDG-avid prior to therapy. Subsequent PET-CT should only be considered when recurrence is suspected on a CT scan, when there are clinical features of relapse despite a negative CT, and to identify the preferred lesion for a biopsy. Limiting unnecessary scans reduces expense, radiation exposure, and the potential for false positive results. In contrast, the use of PET-CT in clinical trials includes not only the indications noted for clinical practice but also addressing clinically important study questions and confirming suspicious findings on follow-up CT scans being used to assess PFS. The concomitant use of a contrast-enhanced CT scan with the PET should be limited to situations where measurement of the size of nodes or masses is important in patient management.

Table 1 Current role of FDG-PET-CT in clinical practice*

Interpretation and Revised Criteria

Two important tasks must now be accomplished. First, an international committee is working on simplifying and standardizing TMTV [88]. Second, the Deauville criteria must be better defined, especially score 4. Indeed, the UK National Cancer Research Institute initiated a prospective study (UKCRN-ID 1760) to assess the prognostic value of early fluorodeoxyglucose (FDG)-positron emission tomography (PET)/computed tomography (CT) in diffuse large B-cell lymphoma (DLBCL) and showed that only score 5 at interim was predictive of PFS [89], and, in limited disease, the threshold for score 5 has to be increased [16•]. The Lugano classification must also be made more precise for other lymphomas such as follicular lymphoma, as must the role of delta SUV [40, 90]. Finally, integration of PET-CT with other modalities will lead to improved assessment and management of patients with lymphoma.

Conclusions

Metabolic imaging with PET-CT has improved the accuracy of staging and response assessment in patients with lymphoma. It also enables risk-adapted strategies, notably in HL, to reduce toxicities in low-risk patients while improving outcome for those at high risk. To date, the same observations have not been made in other histologies. Pre-treatment PET-CT has been shown to be prognostic, especially when using TMTV and other radiomic measures. Moreover, PET-CT is being integrated with other modalities such as assessment of minimal residual disease and microenvironmental factors, which will lead to therapeutic approaches that will improve the outcome of patients with lymphoma.