1 Staging of Aggressive Lymphoma

When the Ann Arbor staging system for lymphoma was drafted in 1971, the authors acknowledged two aims of staging: to facilitate communication and exchange information and to guide prognosis and assist in therapeutic decisions. The former could only be done at the expense of a loss of some information, as it is necessary to condense in one number a considerable amount of data, while the latter aim was best achieved if the greatest amount of information was given for each patient. These two aims highlighted a tension that has existed in lymphoma staging ever since: between being succinct and comprehensive; a lumper vs. a splitter; providing a simple standardised staging paradigm vs. a more complicated individualised approach. Furthermore, it was acknowledged that intercomparison demands that all the staging procedures performed should be as similar as possible in each centre to avoid bias in staging and interpretation of the therapeutic results.

A diagnosis of aggressive lymphoma is most commonly made after a standard CT scan has been performed and has mapped the presence of lymphadenopathy, including the node/lesion most suitable for core or, ideally, excision biopsy. CT-based anatomic extent of lymphoma has traditionally defined stage, but, with aggressive lymphomas being invariably glucose-avid, in 2007 the International Harmonization Project (IHP) defined PET-CT as the principal imaging modality for staging aggressive lymphoma [1, 2]. Therefore, once a diagnosis is made, whole-body PET-CT is performed. PET-CT is the most sensitive imaging modality, particularly in identifying extranodal disease [3,4,5]. Indeed, the more recent international staging criteria noted significant (~20%) stage migration, particularly upstaging, with the more sensitive PET-CT scanning [6, 7]. It has been shown recently that the detection of extranodal involvement by PET has improved the prognostic value of IPI, R-IPI and NCCN-IPI [3]. While these criteria still separate lymphomas into localised or advanced stage, they recognised that distinguishing nodal vs. extranodal status and unidimensional measurement of bulk were of limited use in an era of widespread use of systemic and multimodality approaches.

As a full-dose contrast-enhanced CT has commonly already been performed when diagnosing aggressive lymphoma, the most common PET-CT performed for staging purposes utilises 18F-FDG (administered 60 ± 5 min after resting) and low-dose CT to minimise the impact of contrast on attenuation correction, that is, the correction made for the loss of detected photons caused by spatially dependent absorption. When precise anatomical definition of disease is required, for example when radiotherapy is being considered, a full contrast-enhanced CT scan is also required, either with or, more commonly, separate from the PET scan.

Staging PET scans are to be reported using visual assessment [8], noting the location of increased focal uptake in nodal and extranodal sites, which is distinguished from physiological uptake and other patterns of disease that may have increased FDG uptake including infection and inflammation [9, 10], according to the distribution and/or CT characteristics. PET scans should be reported using a fixed display and a grey or colour table which can be scaled to the standardised uptake value (SUV) [11]. The SUV is the measured radioactivity corrected for patient weight and administered activity.
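
The SUV definition above can be illustrated with a minimal sketch of the body-weight-normalised form. The function name, the units (kBq/mL for tissue concentration, MBq for administered activity, kg for weight), the assumption that the administered activity is already decay-corrected to scan time and the tissue density of 1 g/mL are all illustrative assumptions, not a description of any scanner vendor's implementation.

```python
def suv_body_weight(tissue_kbq_per_ml: float,
                    injected_mbq: float,
                    weight_kg: float) -> float:
    """Body-weight SUV: tissue concentration / (administered activity / body weight).

    Assumes the administered activity is already decay-corrected to scan time
    and that 1 g of tissue occupies about 1 mL, so the result is dimensionless.
    """
    if injected_mbq <= 0 or weight_kg <= 0:
        raise ValueError("administered activity and weight must be positive")
    injected_kbq = injected_mbq * 1000.0   # MBq -> kBq
    weight_g = weight_kg * 1000.0          # kg -> g
    return tissue_kbq_per_ml / (injected_kbq / weight_g)


# Hypothetical values: 5 kBq/mL in tissue, 350 MBq administered, 70 kg patient
print(suv_body_weight(5.0, 350.0, 70.0))
```

With these hypothetical inputs the tracer is distributed as if uniformly through the body, so the ratio is 1; focal uptake well above this is what the visual assessment is looking for.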

Focal FDG uptake within the bone/bone marrow, liver and spleen is highly sensitive for involvement in aggressive NHL [12,13,14,15], and the presence of focal lesions in the bone marrow may obviate the need for bone marrow biopsy [4, 16, 17]. Indeed PET detects BM involvement more often than BMB, and patients with a positive BMB generally have other factors consistent with advanced stage or poor prognosis. Where the PET scan shows no focal marrow uptake, the clinical value of performing a bone marrow biopsy outside of clinical trials to exclude concurrent low-grade disease is debatable as it generally would not change the prognosis and management. High physiological FDG uptake occurs in the brain, and although intracerebral lymphoma is often highly FDG-avid [18], diffuse and low-volume leptomeningeal disease may be missed. MRI is preferred to assess suspected CNS involvement.

Prior to our capacity to measure both anatomic and metabolic extent of lymphoma, various surrogates of tumour burden, including CT and BM biopsy-based stage, presence of extranodal disease and the lactate dehydrogenase level (a surrogate of both bulk and proliferation), were included in the prognostic indices of all aggressive lymphoma subtypes (the IPI [19], revised IPI [20] and NCCN-IPI [21] in DLBCL and the PIT in PTCL [22]).

More recently, PET studies across a range of lymphomas, including DLBCL and peripheral T-cell lymphoma, suggest that the baseline total metabolic tumour volume (TMTV) may more accurately quantify tumour burden for determining prognosis. TMTV is the sum of the three-dimensional measurements of lesions with FDG uptake and is a measure of the viable fraction of tumours and microenvironment. In two retrospective DLBCL series, the median TMTV was reported around 320 cm³ using the 41% thresholding method [23]. Patients with a large baseline metabolic volume (>300 cm³) had a significantly worse outcome than those with a volume ≤300 cm³ [24, 25]. The populations could be stratified according to TMTV, with risk increasing with each TMTV distribution quartile [25]. Combining these two PET series resulted in a cohort of 187 patients (44% >60 years old, 81% Ann Arbor Stage III/IV, 66% with aaIPI 2–3, 75% treated by R-CHOP), confirming that TMTV with a 300 cm³ cut-off was predictive of both 5-year PFS and OS [24] (Fig. 14.1). In 167 young patients with an aaIPI score of 2–3 enrolled in a prospective study and treated with either R-CHOP14 or R-ACVBP, the median TMTV was 380 cm³. A 6.6% increase in risk of events for each 100 cm³ increase of TMTV was observed, and a TMTV >660 cm³ was the strongest predictor of inferior PFS and OS [26]. In primary mediastinal B-cell lymphoma, TMTV was shown to be predictive of outcome in 103 patients included in the series of the International Extranodal Lymphoma Study Group (IELSG26 trial) [27]. In a retrospective study including 108 patients with nodal PTCL (PTCL NOS, AITL, ALCL), the median TMTV at staging was 220 cm³. It was shown that TMTV with a threshold >230 cm³ was strongly prognostic independent of either IPI or PIT [24].
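
As a rough illustration of how a TMTV figure might be derived with fixed-percentage thresholding, the sketch below assumes each lesion has already been delineated as a list of voxel SUVs, that the 41% threshold is applied to each lesion's own SUVmax and that all voxels share one volume. The function names, the data layout and the example values are hypothetical; clinical software works on full image volumes and differs in detail.

```python
from typing import Sequence


def lesion_volume_cm3(voxel_suvs: Sequence[float],
                      voxel_volume_cm3: float,
                      fraction: float = 0.41) -> float:
    """Volume of one lesion: voxels at or above `fraction` of that lesion's SUVmax."""
    if not voxel_suvs:
        return 0.0
    threshold = fraction * max(voxel_suvs)
    n_voxels = sum(1 for suv in voxel_suvs if suv >= threshold)
    return n_voxels * voxel_volume_cm3


def tmtv_cm3(lesions: Sequence[Sequence[float]],
             voxel_volume_cm3: float,
             fraction: float = 0.41) -> float:
    """Total metabolic tumour volume: sum of the thresholded lesion volumes."""
    return sum(lesion_volume_cm3(l, voxel_volume_cm3, fraction) for l in lesions)


def is_high_tmtv(tmtv: float, cutoff_cm3: float = 300.0) -> bool:
    """High-volume group as in the DLBCL series above (>300 cm3); cut-off is method dependent."""
    return tmtv > cutoff_cm3


# Two toy lesions (hypothetical SUVs), 0.5 cm3 per voxel
lesions = [[10.0, 5.0, 4.0, 1.0], [2.0, 1.0, 0.5]]
print(tmtv_cm3(lesions, voxel_volume_cm3=0.5))
```

The cut-off is left as a parameter because the threshold separating high from low volume depends on the segmentation method and the lymphoma subtype.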

Fig. 14.1 TMTV impacts PFS and OS in 187 patients with DLBCL. Kaplan-Meier curves show that patients with TMTV >300 cm³ (Group 1) have lower 5-year PFS and OS than patients with TMTV ≤300 cm³ (Group 0) (54.7% vs. 77%, HR = 2.32, p = 0.013 and 55% vs. 85%, HR = 3.03, p = 0.0002, respectively). Note that when the best cut-off of 205 cm³ determined for PFS by ROC and X-tile analysis is taken, the 5-year PFS is 55% when >205 cm³ and 82% when ≤205 cm³, HR = 2.96, p = 0.0002

Not only can TMTV per se be used to stratify patient prognosis, but it has also been shown in several studies to be independent of risk assignment using the current clinical prognostic indices. In DLBCL, TMTV stratified patients with high NCCN-IPI into two risk groups: a group with high TMTV and a very poor outcome (5-year PFS 35%, 5-year OS 42%) and a group with low TMTV and a much better outcome (5-year PFS 64%, 5-year OS 69%; P = 0.001 and P = 0.01, respectively) [24]. Patients with low NCCN-IPI had an excellent outcome irrespective of their TMTV (5-year PFS 80%, 5-year OS 77%). Moreover, TMTV has the potential to refine cell-of-origin risk assignment in DLBCL. Both patients with GCB genotype and a high volume and patients with ABC disease with a small TMTV had a 5-year PFS of around 50% [24]. Similarly, in patients with PTCL, baseline TMTV (TMTV0) combined with PIT discriminated outcome better than TMTV0 alone, distinguishing patients with an adverse outcome (TMTV0 > 230 cm³ and PIT >1, n = 33) from those with good prognosis (TMTV0 ≤ 230 cm³ and PIT ≤1, n = 40): 19% vs. 73% 2-year PFS (p < 0.0001) and 43% vs. 81% 2-year OS (p = 0.0002), respectively.

A further application of TMTV measurement relates to drug delivery. It has recently been shown in 108 patients with DLBCL that the TMTV influenced rituximab pharmacokinetics. Exposure to rituximab, measured as the area under the concentration-time curve (AUC), decreased as TMTV increased. Small volume and high AUC were associated with a better response and a longer PFS. These results suggested that TMTV measurement could be helpful for optimising rituximab dose individualisation in DLBCL [28].

With standardisation of PET acquisition and software packages to assist with measurement of TMTV, we may be getting closer to providing a single staging parameter (Fig. 14.2). However, we must remain committed to systematically addressing the challenges of volume calculation and the appropriate choice of TMTV software algorithms to provide reproducible measurements of TMTV across prospective multisite studies. No method is always the most accurate: performance varies as a function of the activity distribution, noise, spatial resolution and contrast. The cut-off volume separating high vs. lower tumour volume across patients depends upon the method. With evolving scanner performance, relative methods relying on internal standards such as fixed percentage thresholding or adaptive methods are the most reproducible, but it has been shown that even if the cut-off varies with a given method, the predictive value remains quite similar once a method is chosen.

Fig. 14.2 Different total metabolic tumour volumes observed in patients with DLBCL (from Cottereau et al. Clin Cancer Res, 2016)

With widespread use of multi-agent systemic therapies, the Ann Arbor staging is no longer fit for purpose. TMTV has the potential to provide the single most efficient and relevant means of informing clinicians and patients of their disease burden. With reliable software and consensus, TMTV may in time replace the Ann Arbor system and become the new standard to convey prognosis and rationale for tailored therapy to our patients.

2 PET to Assess Therapeutic Response

One of the advantages of metabolic imaging is the capacity to accurately chart early metabolic response to therapy. The high FDG avidity of the aggressive lymphoma cells results from increased cellular turnover and internal trapping of glucose by tumour and stromal cells. Decreased glucose metabolism during treatment may be a surrogate of treatment efficiency. Chemotherapy-induced reduction of lymphoma metabolism is a nonlinear process influenced by chemotherapy regimen, the schedule and number of cycles and the effect of the chemotherapy on the surrounding microenvironment. End-of-induction PET-CT is necessary for response assessment in aggressive lymphoma, and the poorer prognosis of patients who remain PET-positive after completion of therapy has driven study of interim PET assessment seeking to identify such patients earlier where it is hoped that a change, often intensification of therapy, will improve outcomes.

2.1 Interim PET

In both clinical trials and in practice, the decision of when to perform interim PET (iPET) is driven by the clinical tension between the very good negative predictive value (NPV) of the test in identifying patients with a good prognosis and obtaining a sufficiently high positive predictive value (PPV) at an early enough time point so that PET-positive patients can be salvaged with a change in therapeutic approach. Therefore in most studies, iPET is performed after 2 and/or 4 cycles of chemotherapy. Resolution of FDG uptake at sites of initial disease indicates a complete metabolic response with a very good negative predictive value with a 2-year PFS rate of 73–85% [6].

The limitations of iPET are twofold. Firstly, the positive predictive value of iPET is too low, with reported values ranging from 18 to 74% [6]. Secondly, no prospective randomised study has clearly demonstrated that either intensification of chemotherapy or a change in therapeutic agent can improve the poorer prognosis of patients who remain PET-positive.
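
To make the NPV/PPV tension concrete, the following minimal sketch computes both values from a 2x2 table of iPET result against later outcome. The function name and the patient counts are invented for illustration and do not come from any of the cited studies.

```python
import math
from typing import Tuple


def predictive_values(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (PPV, NPV) from a 2x2 table of iPET result vs. later outcome.

    tp: iPET-positive patients who progress or relapse
    fp: iPET-positive patients who remain disease-free
    tn: iPET-negative patients who remain disease-free
    fn: iPET-negative patients who progress or relapse
    """
    positives = tp + fp   # all iPET-positive patients
    negatives = tn + fn   # all iPET-negative patients
    ppv = tp / positives if positives else math.nan
    npv = tn / negatives if negatives else math.nan
    return ppv, npv


# Hypothetical cohort (invented numbers, for illustration only)
ppv, npv = predictive_values(tp=20, fp=30, tn=90, fn=10)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

In this invented cohort most iPET-negative patients do well, yet fewer than half of the iPET-positive patients go on to fail treatment, which is the pattern that makes escalation decisions on iPET alone difficult.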

One study highlights the low PPV of iPET, particularly when applying the now outdated IHP criteria (using background FDG uptake as reference) [29]. All patients who remained PET-positive after 4 cycles of R-CHOP-14 underwent systematic biopsy. The 23% with biopsy-confirmed disease had an inferior outcome. However, the PFS in PET-positive biopsy-negative patients was comparable to that of PET-negative patients. The observation of a high false PET-positive rate in this series did not negate the clinically relevant prognostic value of early PET. Rather, it highlighted the key challenge for early PET reporting: to establish reproducible interpretation criteria able to discriminate FDG uptake associated with active disease from that related to non-specific post-therapy inflammatory changes. Indeed, visual interpretation can change significantly depending on the reference background used. For the same residual FDG uptake, raising the reference level used for background uptake can change a PET-positive scan to a negative one, so the cut-off must be chosen carefully. In DLBCL the signal decreases continuously during induction treatment in parallel with tumour destruction, and the residual uptake decreases with each cycle. The degree of uptake that is indicative of response [8] is dependent on the timing of the scan during treatment [30, 31] and on the clinical context, including prognosis, lymphoma subtype [31,32,34] and treatment regimen [35, 36]. It is also dependent on the presence of inflammatory cells induced by rituximab and on microenvironmental cells. In addition, qualitative visual comparison between residual uptake and background is difficult since visual reporting is highly observer dependent.

The 5-PS was developed as a simple, reproducible scoring method for response assessment (Table 14.1). It provides flexibility to change the threshold between good/poor response according to the clinical context and treatment strategy [37]. For example, a lower level of FDG uptake might be preferred to define a “negative result” in a clinical trial exploring de-escalation to avoid undertreatment. A higher level of uptake might be preferred to define a “positive result” in a trial exploring escalation to avoid overtreatment. The 5-PS has been validated for use at interim and at the end of treatment and in the last decade has been adopted as the preferred reporting method for response assessment [8].

Table 14.1 The 5-PS (also called Deauville criteria) scores the most intense uptake in a site of initial disease, if present as follows

Good interobserver agreement has been reported in DLBCL [38]. Scores 1, 2 and for the most part 3 are defined as complete metabolic response (CMR). When a score of 1 or 2 is achieved at interim, an end-of-treatment scan is not required [39]. Score 3 also likely represents CMR at interim [40] and a good prognosis at completion of standard treatment ([41] (suppl 1; abst 15); [42]). One issue to be resolved with the 5-PS is that the terms “moderately” and “markedly” are not yet defined in quantitative terms. However, the 2014 Lugano guidelines recommend that a score of 4 should apply to uptake greater than the SUVmax of the liver and a score of 5 to uptake 2–3 times greater than the liver SUVmax. In these guidelines a score of 4–5 with uptake reduced from baseline represents a partial metabolic response, while a score of 5 with no decreased uptake or with new FDG-avid foci consistent with lymphoma represents treatment failure and/or progression.
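
One way the quantitative Lugano guidance might be expressed is sketched below. The liver-based rules for scores 4 and 5 follow the text above; the rules for scores 1 to 3 (no uptake, uptake at or below mediastinum, uptake above mediastinum but at or below liver) are taken from the commonly published definition of the scale and are an assumption here, as is the choice of 2.0 for the "2–3 times" factor. The function names and the `residual_uptake` and `new_lesions` flags are hypothetical, and visual reporting by a trained reader remains the reference.

```python
def five_point_score(lesion_suvmax: float,
                     mediastinum_suv: float,
                     liver_suvmax: float,
                     *,
                     residual_uptake: bool = True,
                     new_lesions: bool = False,
                     marked_factor: float = 2.0) -> int:
    """Assign a 5-PS score to the most intense residual site of initial disease."""
    if new_lesions:
        return 5            # new FDG-avid foci consistent with lymphoma
    if not residual_uptake:
        return 1            # assumed: no uptake above background
    if lesion_suvmax <= mediastinum_suv:
        return 2            # assumed: uptake at or below mediastinum
    if lesion_suvmax <= liver_suvmax:
        return 3            # assumed: above mediastinum, at or below liver
    if lesion_suvmax > marked_factor * liver_suvmax:
        return 5            # "markedly" above liver (factor is an assumption)
    return 4                # above liver SUVmax


def is_complete_metabolic_response(score: int) -> bool:
    """Scores 1-3 treated as CMR; score 3 may be read differently in de-escalation trials."""
    return score <= 3


# Hypothetical reference values: mediastinum 2.0, liver SUVmax 3.0
for suv in (1.5, 2.5, 4.0, 7.0):
    print(suv, five_point_score(suv, 2.0, 3.0))
```

Keeping `marked_factor` as a parameter reflects the point made above that the boundary between scores 4 and 5 has not been fixed quantitatively.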

The problem of interobserver variability for reporting according to the 5-PS has been pointed out recently. Concordance was excellent with the liver threshold but decreased for all the other thresholds. In this regard a quantitative approach measuring delta (Δ)SUVmax between baseline and interim PET would add value, and such quantitative reporting is being encouraged both clinically and in trials. It decreases the interobserver variability seen with visual reporting and integrates kinetic information by comparing baseline PET with interim PET [38]. The maximum SUV is measured in the hottest lesion before treatment and after 2 or 4 cycles of treatment. The change in SUV is expressed as a percentage of the initial uptake, referred to as the ΔSUVmax (Itti et al. 2009). When calculating the ΔSUVmax, it is important to appreciate that the lesion containing the SUVmax on iPET may not be the same as the lesion with the SUVmax at baseline. One challenge is the difficulty of applying the ΔSUVmax method in patients with a low baseline SUV, who cannot reach the cut-off ΔSUVmax (between 66 and 72%) that defines a good response. This metric has been adopted in two large prospective clinical trials (PETAL and GAINED), but there has been no prospective within-study comparison of the performance of the 5-PS and ΔSUVmax. Another interesting quantitative approach has recently been proposed by a group from Beijing. They showed in DLBCL that a ratio of 1.6 between the residual uptake and the liver discriminated patient outcomes better than the 5-PS or ΔSUVmax approach [43].
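
The ΔSUVmax calculation described above can be sketched as follows. The function names are hypothetical, the 66% default is the lower end of the cut-off range quoted in the text, and treating a reduction strictly greater than the cut-off as a good response is an assumption about the boundary case. The hottest lesion is taken independently on each scan, since it need not be the same lesion.

```python
from typing import Sequence


def delta_suvmax(baseline_lesions: Sequence[float],
                 interim_lesions: Sequence[float]) -> float:
    """Percentage reduction in SUVmax between baseline and interim PET.

    Each argument lists the SUVmax of every lesion on that scan; the hottest
    lesion is chosen separately on each scan.
    """
    baseline_max = max(baseline_lesions)
    interim_max = max(interim_lesions)
    if baseline_max <= 0:
        raise ValueError("baseline SUVmax must be positive")
    return 100.0 * (baseline_max - interim_max) / baseline_max


def is_good_response(reduction_pct: float, cutoff_pct: float = 66.0) -> bool:
    """Good response if the reduction exceeds the chosen cut-off (assumed strict)."""
    return reduction_pct > cutoff_pct


# Hypothetical patient with high baseline uptake
print(delta_suvmax([20.0, 12.0], [4.0, 5.0]))   # hottest lesions: 20.0 -> 5.0
# Hypothetical patient with low baseline uptake
print(delta_suvmax([5.0], [2.0]))
```

The second hypothetical patient illustrates the low-baseline problem: a fall from 5.0 to 2.0 leaves little residual uptake yet amounts to a 60% reduction, which does not reach a 66% cut-off.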

Although there is no prospective comparison between these methods, several exploratory investigations have compared qualitative visual and quantitative PET assessment [38]. Although these studies have limitations, they all conclude in favour of quantitative methods.

The more significant limitation of iPET relates not to the scans themselves but to the failure observed in some iPET-directed escalation studies to improve outcomes for patients who remain PET-positive. The most notable study in this respect was the PETAL (positron emission tomography-guided therapy of aggressive non-Hodgkin lymphomas) trial, where patients who failed to have a ΔSUVmax of 66% after 4 cycles of R-CHOP were escalated to a Burkitt-like regimen. iPET remained positive in 13% of patients and was highly predictive of inferior outcome, with a 2-year TTF of 47% vs. 79% in iPET-negative patients (p < 0.001), but a benefit from escalation could not be demonstrated (Duhrsen et al. 2014). In contrast, in the Australasian Leukaemia and Lymphoma Group phase II escalation study, patients with DLBCL who remained PET-positive after 4 cycles of R-CHOP-14 and were changed to R-ICE followed by ASCT had similar survival to PET-negative patients who completed six cycles of R-CHOP (Hertzberg et al. 2017). However, in this trial, iPET scans were reported using the now outdated IHP criteria. A reanalysis of PET-positive patients with the 5-PS showed that the subset with score 5 had a poor prognosis and were refractory to the intensification approach.

In contrast to the challenges of interpreting an iPET-positive result, the very good negative predictive value of interim PET in DLBCL allows us to consider studies of de-escalation strategies in PET-negative patients. Furthermore, the reassurance given to the patient by achieving iPET negativity and a favourable prognosis should not be underestimated. The results of the French LNH073B trial, which studied patients <60 years with an age-adjusted IPI > 1, showed that 79% of patients became PET-negative using a ΔSUVmax approach. The results suggested that the quantitative approach could better characterise the majority of patients eligible for continued standard immunochemotherapy, select the presumably small subset of patients likely to benefit from upfront ASCT consolidation and identify early those refractory patients needing alternative strategies [44].

In peripheral T-cell lymphoma (PTCL), retrospective studies have reported conflicting results on the value of iPET. However, the largest retrospective multicentre French and Danish series, applying the 5-PS in 140 patients for interim PET performed after either two or 3/4 cycles, has shown that interim PET was predictive of outcome [45]. PFS and OS for iPET3/4-positive and iPET3/4-negative patients were 16% and 32% vs. 75% and 85%, respectively. Moreover, baseline TMTV helped stratify the early PET responders into different risk categories.

The complexity of interpreting iPET demands that it be assessed in a multidisciplinary setting aware of the clinical context of such interpretation before influencing the ongoing therapeutic approach for patients with aggressive lymphoma.

2.2 Postinduction PET

End-of-induction (EOI) PET is the standard imaging modality for response assessment of aggressive lymphoma at the end of induction, with demonstrated greater accuracy than CT scanning. PET should be performed at least 3 weeks after the last cycle of chemotherapy or 8–12 weeks after radiotherapy, given the propensity for inflammatory reactions after the latter.

The current recommendation for end-of-induction PET is to apply the 5-PS, where a score of 4 or 5 represents residual metabolic disease and treatment failure [7]. There is insufficient evidence to specify a target ΔSUVmax at end-of-induction PET that predicts a high probability of cure in DLBCL, and so the 5-PS remains the recommended guide to subsequent prognosis and clinical approach. The NPV of end-of-induction PET is reported to be 80–100%, but again PET assessment is hampered by a lower PPV, ranging from 50 to 100% [6]. Therefore, if further treatment beyond consolidation radiotherapy to a single residual FDG-avid lesion is being considered, a biopsy or follow-up imaging is advised. One DLBCL subtype with excellent long-term response rates despite frequent persistence of FDG uptake is primary mediastinal B-cell lymphoma, with data suggesting that a score of 4 on the 5-PS is not associated with as poor a prognosis as a score of 5 [46]. In this lymphoma, where consolidation radiotherapy is commonly used, a prospective randomised IELSG trial is assessing whether it is safe to omit radiotherapy in patients who become PET-negative.

There are data to suggest that the anatomic CT response may also play a complementary role, with a greater reduction in mass associated with improved outcome both in patients who remain PET-positive and in those who achieve PET-negative status. The recently published response evaluation criteria in lymphoma (RECIL), recommended for use in patients in basket clinical trials with novel agents, outline the predictive power of a reduction by ≥30% in the sum of the longest diameters of three target lesions [47]. This supports an ongoing value of anatomic reduction of masses, although how these criteria can be applied outside of clinical trials is unclear, and for now the Lugano criteria remain central to response assessment for aggressive lymphomas in clinical practice.

3 Assessment Before High-Dose Therapy (HDT) and Autologous Stem Cell Transplant (ASCT)

Several studies have reported that PET is prognostic in patients with relapsed or refractory DLBCL after salvage chemotherapy for whom high-dose chemotherapy and autologous stem cell transplantation are considered. In the context of this population of patients having a poorer prognosis, overall PET separates out a 3-year PFS/EFS of 30–40% in patients who remain PET-positive, vs. 75–82% for those who become PET-negative after salvage. The PET results, particularly in the context of a comparison with PET prior to salvage therapy, and the context of patient age, fitness and alternative clinical trial options serve to assist the clinician in deciding the merits of transplantation +/− consolidation radiotherapy.

4 Peripheral T cell Lymphoma

In 130 patients with relapsed or refractory PTCL treated with romidepsin, end-of-treatment PET reported with outdated criteria appeared superior to conventional CT assessment to determine prognosis [48]. In a recent retrospective study including 140 PTCL patients, the prognostic value of end-of-treatment PET reported with Lugano criteria has been confirmed (Cottereau et al. 2017). In extranodal NK/T-cell lymphoma, it has been shown that posttreatment 5-PS and Epstein-Barr virus DNA positivity were independently associated with progression-free and overall survival in a multivariable analysis (for posttreatment 5-PS of 3–4, PFS hazard ratio [HR] 3.607, 95% CI 1.772–7.341, univariable p < 0.0001; for posttreatment Epstein-Barr virus DNA positivity, progression-free survival HR 3.595, 95% CI 1.598–8.089, univariable p < 0.0001) [49].

5 Remission Surveillance

There is no evidence-based role for either PET or CT in the routine surveillance of remission in aggressive lymphoma. Educating the patient about signs and symptoms of relapse, with clinical follow-up at initially three- and then six-monthly intervals, is more appropriate. In the absence of prospective data demonstrating its benefit, surveillance imaging with either PET or standard contrast-enhanced CT for aggressive lymphoma generates unnecessary cost, anxiety and radiation exposure, as most relapses are detected clinically. The 2014 Lugano guidelines cite a false positive rate of 20%, which results in unnecessary biopsies for such patients. An estimated 91–255 scans were performed for every relapse detected, with no clear demonstrated improvement in patient outcome in the small proportion of patients whose relapse is detected initially with imaging [50, 51].

6 The Future for Imaging in Response Assessment of Aggressive Lymphoma

Despite considerable enthusiasm for identifying blood biomarkers for prognosis and response prediction, PET imaging remains the central biomarker at both baseline and end of immunochemotherapy in aggressive lymphoma. It is hoped that future combinations of baseline TMTV, the biologic profile of the lymphoma (particularly DLBCL), iPET and EOI PET assessment may be sufficiently prognostic to provide a platform for PET-adapted approaches in aggressive lymphomas in future clinical trials. For such approaches to be successful, however, the results from the PETAL study suggest that simple intensification of chemotherapy may not be sufficient and that rational, biologically targeted therapies need to be developed.