3.1 Introduction

Since the inception of chemotherapy in the early 1950s, prediction of the ultimate treatment response has been the object of intensive clinical research in oncology. At the turn of the millennium, this interest was further fuelled by technological progress in medical imaging for cancer treatment monitoring and by the discovery of a vast array of new prognostic and predictive markers for a modern, personalized treatment strategy. The concept of prognostication does not necessarily overlap with treatment response prediction. In general, prognostic markers are readily available before treatment onset and are informative of the risk of recurrence and of the ultimate treatment outcome of a given malignancy. They are useful to minimize confounding factors when comparing the results of similar cohorts of patients in clinical trials, or when stratifying patients according to their risk of treatment failure. Predictive markers, on the other hand, are treatment-dependent and available only during therapy. Tumour response prediction, based on the early appraisal of tumour biomarkers that have proved informative of the final treatment outcome, is increasingly used in oncology [1]. Tumour chemosensitivity was originally studied in in vitro cultures of cancer cells taken from the patient, and was long considered the ideal predictive tool for final treatment outcome [2]. Standard parameters such as colony-forming ability, growth inhibition, or cell viability were used as measurable indexes of sensitivity to cytostatic drugs. Later on, the development of high-throughput technologies, e.g. cDNA microarrays, enabled a more detailed analysis of drug responses. However, these methods proved unsuitable for clinical practice and are currently limited to new drug discovery and preclinical drug testing platforms [3].
Tumour shrinkage has also been considered a surrogate marker of chemosensitivity, and classical radiological imaging by contrast-enhanced computed tomography (CeCT) has been proposed during treatment to assess early tumour response [4]. However, it became clear that traditional radiological assessment of tumour bulk shrinkage is not an accurate predictor of outcome, as any reduction in tumour volume takes time and can lag behind the metabolic slowdown of the neoplastic tissue, which occurs immediately after chemotherapy delivery. This is particularly evident in Hodgkin lymphoma (HL), where a residual mass is observed in up to two-thirds of patients at the end of treatment [5, 6]. Furthermore, treatment response assessment by radiological imaging may be inaccurate because of errors in tumour measurement, errors in the selection of measurable targets, and inter-observer variability in tumour size assessment [7]. More recently, a new class of prognostic markers able to predict treatment outcome on a single-patient basis has been proposed. Among them, functional imaging by 67Ga-citrate scintigraphy or 18F-fluorodeoxyglucose (FDG) positron emission tomography (FDG PET) proved able to predict treatment outcome, acting as a surrogate marker of chemosensitivity with superior overall accuracy in lymphoma [8, 9] and other solid neoplasms [10–12]. Similarly, minimal residual disease (MRD) detection by flow cytometry or molecular biology in acute and chronic leukaemia proved essential to predict long-term disease control [13–16]. The predicted benefit (overall survival) and/or its surrogate (progression-free survival) must be appropriate to the treatment context. In this respect a “predictive” marker differs from a “prognostic” marker, since only the former is strictly related to a given treatment.
In HL, this concept applies both to end of therapy and interim PET scan, whose predictive role on treatment outcome, whatever the time point during chemotherapy or chemoradiation the scan is performed, depends on the intensity of delivered therapy [17].

3.2 Interim PET to Predict Treatment Outcome

3.2.1 Prognostication in HL

HL has long been considered the archetype in oncology for tumour staging, restaging, and prognostication. The Ann Arbor staging system [18], and later the Cotswolds revised classification [19], first introduced the concept that disease manifestations and tumour bulk identify distinct categories of patients who have a different prognosis and may need specific therapeutic approaches. Surgical procedures (the so-called staging laparotomy with splenectomy and multiple nodal and organ biopsies) were first proposed in the early 1970s for tumour staging [20]. These procedures had the merit of advancing knowledge of the pathophysiology of disease spread, but proved cumbersome and carried some morbidity. For these reasons, at the beginning of the 1980s, radiological imaging with lymphography and CeCT superseded staging laparotomy. CeCT, in particular, proved a readily accessible, non-invasive diagnostic tool with high sensitivity and overall accuracy for the detection of tumour spread, and it rapidly became the standard for tumour staging [21].

In the meantime, the growing evidence that the tumour per se and the host reaction against the tumour were the main prognostic parameters correlated with survival provided the framework for a new classification of prognostic factors in HL as (1) tumour-related, (2) host-related, and (3) environment-related [22]. Tumour-related factors include those depending on tumour biology, pathology, and burden. Host-related factors include a number of variables that may significantly influence outcome, such as age, co-morbidity, viral infections, and naïve immunity against the tumour. Environment-related factors mainly include circumstances external to the patient, such as socio-economic status and access to good-quality health care. By definition, “true” prognostic factors have a known value at disease onset, before treatment starts (the so-called fixed covariates), while others may only become known later during treatment (the so-called predictive factors or time-dependent covariates), such as time to response or early chemosensitivity assessment. The latter may be important for answering biological and clinical questions, but their prognostic relevance can be assessed only in prospective randomized studies comparing chemosensitivity-adapted treatment (experimental arm) with traditional non-adapted chemotherapy (standard arm) [23]. In HL, tumour bulk, computed by software from the area of every neoplastic lesion manually contoured by an expert radiologist on transaxial CT slices, proved to be one of the most powerful predictors of treatment outcome and, though related to many clinical staging parameters, was not predicted by them [24]. As a matter of fact, both in early-stage [25, 26] and in advanced-stage [27] HL, the number of involved lymph node regions as well as the volume of disease in individual regions proved to predict progression-free survival (PFS) and overall survival (OS).
These observations prompted clinicians to refine the classical four-stage Ann Arbor classification. As a consequence, a further prognostic breakdown of early-stage disease into two distinct subsets was proposed, based on a mixture of prognostic factors related to tumour bulk and host characteristics (see Table 3.1), and the intensity and duration of treatment were modulated accordingly [28]. At the end of the millennium, prognostic information on several biomarkers related to tumour burden and host reaction in advanced-stage disease was retrospectively extracted from a large data set collected from 5141 advanced-stage patients treated with doxorubicin-containing regimens in 25 international institutions [29]. Seven parameters were found to be associated, in multivariate analysis, with an inferior treatment outcome: low albumin levels, anaemia, male sex, age ≥ 45 years, stage IV, leucocytosis, and lymphopenia. A prognostic model, the International Prognostic Score (IPS), was then constructed, and six risk classes, depending on the number of adverse prognostic factors, were identified, showing a 5-year freedom from progression (FFP) ranging from 84 % for score 0 (no risk factor) to 42 % for score 5 (≥5 risk factors) (Fig. 3.1).
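
The scoring rule described above (one point per adverse factor, with five or more factors sharing the top class) can be sketched in a few lines. This is a minimal illustration only: the class and function names are hypothetical, and each factor is passed as an already-determined yes/no flag because the laboratory cut-offs for albumin, anaemia, leucocytosis, and lymphopenia are not given in this section.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class IPSFactors:
    """One flag per adverse factor named in the text (hypothetical container)."""
    low_albumin: bool
    anaemia: bool
    male_sex: bool
    age_45_or_older: bool
    stage_iv: bool
    leucocytosis: bool
    lymphopenia: bool


def ips_score(factors: IPSFactors) -> int:
    """Count the adverse factors present; five or more share the top class (5)."""
    count = sum(bool(getattr(factors, f.name)) for f in fields(factors))
    return min(count, 5)


# Illustrative patient: low albumin, male, stage IV -> three adverse factors.
example = IPSFactors(
    low_albumin=True,
    anaemia=False,
    male_sex=True,
    age_45_or_older=False,
    stage_iv=True,
    leucocytosis=False,
    lymphopenia=False,
)
print(ips_score(example))
```

The resulting class (0 to 5) is what the FFP figures quoted above refer to; the sketch does not attempt to reproduce the survival estimates themselves.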

Table 3.1 Preliminary results of the multicentre international PET response-adapted prospective trials of the GITIL/FIL (HD0607), of the NCRI (RAPID), and of the SWOG/CALGB (S0816)
Fig. 3.1

The International Prognostic Score (IPS) for advanced-stage Hodgkin lymphoma (From Hasenclever et al. [29])

However, the discriminative power and the prognostic relevance of the model were limited, as only 7 % of the patients showed a 6-year FFS of less than 50 %, and its use in clinical practice has therefore been questioned [30]. Interestingly, nearly 20 years later, the prognostic value of the IPS was again retrospectively assessed by the British Columbia Cancer Agency (BCCA) in a comparable cohort of 686 advanced-stage HL patients aged 15–65 years and staged without the contribution of FDG PET [31]. Although confirming the prognostic role of the IPS, the study showed a substantial narrowing of the distance among the 5-year FFP Kaplan-Meier curves of the different score levels, ranging between 88 % for score 0 and 70 % for score 6, which the authors attributed to a lower percentage of stage IV disease (24 % in the BCCA series vs. 42 % in the original IPS study). This phenomenon, in turn, depended on a more restrictive definition of stage IV according to BCCA guidelines. It should be stressed, however, that in the original IPS study stage IV had an adverse prognostic meaning only when 2 or more extranodal sites (ENS) were involved by disease, which occurred in only 12 % of the patients. This scenario has been profoundly modified in the PET era: because of the higher sensitivity and overall accuracy of PET compared to CeCT in detecting ENS spread, upward stage migration occurs in 20–25 % of the patients, mainly as a shift from stage III to stage IV [32].

Besides staging, HL prognostication was also revolutionized, in the mid-1990s, by the advent of functional imaging with 18F-FDG PET. In all the key aspects of HL management, such as staging and restaging, early and final treatment response monitoring, and radiotherapy planning and guidance, FDG PET/CT has gained an irreplaceable role, thus becoming an essential tool in the HL therapeutic strategy [33] (Fig. 3.2).

Fig. 3.2

FDG PET/CT for Hodgkin lymphoma management (Adapted from Gallamini et al. [33])

Probably the most relevant contribution of PET to overall HL management has been early chemosensitivity assessment in both early- and advanced-stage HL. This success was due to a number of tumour-related and tumour-unrelated reasons but, probably most importantly, to the peculiar pathobiology and tissue architecture of HL. The latter is characterized by the presence of a few scattered neoplastic cells, the Hodgkin and Reed-Sternberg cells (HRSCs), accounting for less than 5 % of the total cell burden, embedded in a meshwork of non-neoplastic, reactive cells, which are attracted into the neoplastic milieu by a cytokine gradient and are in turn responsible for the growth and immortalization of HRSCs [34]. These “inflammatory” cells (lymphocytes, macrophages, granulocytes, and eosinophils), identified as micro-environment (ME) cells, show considerably high glycolytic activity [35] and are largely responsible for the high FDG uptake within the tumour tissue [36]. Both chemokine production and metabolic activity of the ME cells are apparently shut down early during treatment in chemo-sensitive disease, that is, in nearly 80 % of HL patients [37–40]. In this “on-off” phenomenon, ME cells work as a signal amplifier: they are switched off when HRSCs are killed in chemo-sensitive HL, and remain active in chemo-resistant disease. This mechanism, in turn, dramatically increases the detection power of FDG PET/CT, which is normally able to detect only nodal lesions with a diameter of 4–5 mm or more [41]. As a matter of fact, an interim PET scan performed after a few courses (PET-2) of chemotherapy with doxorubicin, bleomycin, vinblastine, and dacarbazine (ABVD) is able to predict long-term disease control with high overall accuracy in HL, while specificity and positive predictive value (PPV) proved higher in advanced- than in early-stage disease [42, 43].
On the other hand, the negative predictive value (NPV) of PET-2 was reportedly very high, ranging from 86 % to 100 % depending on the effectiveness of the chemotherapy regimen [37, 44]. As mentioned above, the PPV proved disappointingly low in early-stage disease, ranging from 20 % to 45 %, probably because of (1) the high rescue rate of radiotherapy in PET-2-positive patients, (2) the low a priori risk of relapse in early-stage disease, (3) a non-negligible rate of false-positive results due to non-specific FDG uptake in post-chemotherapy inflammatory tissue, and (4) the lack of accurate rules for interim PET reporting [42].

The situation is completely different in advanced-stage disease. In a large meta-analysis, interim PET performed after 2 cycles of ABVD (PET-2) had an overall sensitivity of 0.81 (95 % CI, 0.72–0.89) and a specificity of 0.97 (95 % CI, 0.94–0.99) in predicting PFS [45]. In the retrospective Italian-Danish study, a large cohort (N = 260) of patients with advanced-stage (N = 193) or unfavourable early-stage (N = 67) disease, treated with 6 courses of ABVD ± consolidation RT, underwent an interim PET scan after 2 ABVD courses for prognostic purposes only; the 3-year PFS of PET-2-negative and PET-2-positive patients was 95 % and 12.8 %, respectively (p < .0001). Importantly, compared to a classical prognostic model such as the IPS, the predictive value of PET-2 for treatment outcome was maintained in both low-score (0–2) and high-score (≥3) IPS patients, thus superseding the prognostic role of the latter [8] (Fig. 3.3).
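
How strongly PPV and NPV depend on the a priori risk of treatment failure can be illustrated with Bayes' rule. The sketch below takes the pooled sensitivity (0.81) and specificity (0.97) quoted above and applies them to two prior risks, 10 % and 30 %, which are illustrative assumptions chosen to mimic a low-risk and a high-risk population and are not figures from the cited meta-analysis. The function name is hypothetical, and treatment failure is treated as a simple yes/no outcome, which ignores censoring and follow-up time.

```python
def predictive_values(sensitivity: float, specificity: float, prior_risk: float):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    prior_risk is the a priori probability of treatment failure in the
    population being tested.
    """
    true_pos = sensitivity * prior_risk
    false_neg = (1.0 - sensitivity) * prior_risk
    false_pos = (1.0 - specificity) * (1.0 - prior_risk)
    true_neg = specificity * (1.0 - prior_risk)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as quoted in the text; prior risks are assumed.
ppv_low, npv_low = predictive_values(0.81, 0.97, prior_risk=0.10)
ppv_high, npv_high = predictive_values(0.81, 0.97, prior_risk=0.30)

print(f"prior risk 10 %: PPV = {ppv_low:.2f}, NPV = {npv_low:.2f}")
print(f"prior risk 30 %: PPV = {ppv_high:.2f}, NPV = {npv_high:.2f}")
```

With test characteristics held fixed, a lower prior risk yields a lower PPV and a higher NPV, which is in line with the low a priori risk of relapse being listed among the reasons for the low PPV in early-stage disease.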

Fig. 3.3

IPS score and Interim PET scan in predicting treatment outcome in advanced-stage, ABVD-treated HL (From Gallamini et al. [8])

These data have subsequently been confirmed in larger cohorts of patients [46–48]. Other groups have explored the predictive value of interim PET as early as after a single course of chemotherapy (PET-1). After preliminary reports in small, mixed cohorts of HL and aggressive B-cell lymphoma patients, which stressed the very high negative predictive value of PET-1 [49, 50], the results of a large international prospective cooperative study were reported in a series of 126 HL patients with early-stage (N = 68; 54 %) and advanced-stage (N = 58) disease [51]. This study confirmed the very high NPV of PET-1 (96.8 %), while the PPV was only 44.4 %. The authors commented that if the intention of a PET-adapted strategy is treatment de-escalation, which can be an attractive option for early-stage patients, PET-1 is better than PET-2. However, because of the higher rate of false-positive results associated with PET-1, PET-2 should remain the preferred choice for selecting non-responding patients to switch to a more aggressive treatment.

3.3 PET Response-Adapted Therapy

HL is a highly curable disease: most patients become long-term survivors, with 10-year cure and survival rates after first-line treatment exceeding 80 % and 90 %, respectively [52]. However, 10–15 % of early-stage and 20–30 % of advanced-stage patients fail first-line treatment, because of either primary resistant or relapsing disease, and nearly half of them ultimately succumb to their disease [53]. Hence, there is still an unmet need for a valid tool to predict the completeness of therapy response and the final patient outcome. However, the most compelling argument for a personalized treatment approach based on the actual risk of chemo-resistance remains unwarranted treatment-related morbidity. In early-stage HL, for instance, during late follow-up (five years or more beyond diagnosis), the disease itself is no longer the main cause of death; secondary neoplasms and cardiovascular events are [54]. By contrast, in advanced-stage HL, the most frequent cause of death is HL (see Fig. 3.4).

Fig. 3.4

Causes of death in early-stage (I–IIA) and advanced-stage (IIB–IVB) Hodgkin lymphoma according to Haematology Department of S. Matteo IRCCS Institute (Courtesy of E. Brusamolino)

However, in women aged less than or more than 30 years and treated with the very active escalated BEACOPP regimen (EB: a dose-intense combination of bleomycin, etoposide, doxorubicin, cyclophosphamide, vincristine, procarbazine, and prednisolone), amenorrhoea was observed in 51 % and 95 % of cases, respectively [55], while the cumulative risk of secondary acute myeloid leukaemia in the entire cohort of advanced-stage disease was 3 % at 10 years [56]. For these reasons the search for reliable markers of tumour response prediction on an individual basis is very attractive in the context of a highly curable neoplasm, especially in early-stage disease, in which the rate and magnitude of treatment-related morbidity or mortality could even exceed the rate of disease-related death.

As previously mentioned, a novel class of prognostic factors in lymphoma has been proposed, based on early individual risk assessment of chemo-resistance during treatment, either by the evaluation of MRD [57, 58] or by assessing chemosensitivity to treatment with PET scanning. However, the clinical relevance of a prognostic factor should be weighed against its usefulness in therapy planning and its effectiveness in improving overall patient outcome or in reducing therapy-related toxic effects without compromising treatment efficacy. In the absence of published results from multicentre randomized prospective trials, it remains unknown whether a PET-adapted strategy could ultimately improve the final outcome of high-risk HL patients or reduce toxicity in low-risk patients while maintaining the same treatment efficacy [59, 60]. Several prospective trials, ongoing or already concluded, have been launched in low-risk, early- and advanced-stage HL to explore the feasibility of treatment de-escalation strategies in patients with a negative interim PET, while others have been based on therapy escalation in high-risk, interim PET-positive HL patients. In this review we first consider the already concluded phase II studies and then describe the outline and the preliminary results of the ongoing phase III trials based on a PET response-adapted strategy.

3.3.1 Phase II Concluded Studies in Early-Stage Disease

As soon as the prognostic role of the interim PET scan in predicting the final treatment outcome in early-stage HL became manifest [43], this strong predictor was harnessed to answer the long-standing question of whether combined modality treatment with chemoradiation (CMT) should be preferred to chemotherapy alone for a deeper and more immediate disease control in early-stage HL. The better acute disease control, with a 3–7 % superior PFS, shown in four published randomized clinical trials comparing CMT vs. chemotherapy alone in early-stage HL [61–64], did not translate into an improvement in OS for CMT. On the contrary, the final analysis of the National Cancer Institute of Canada Clinical Trials Group (NCIC-CTG) and Eastern Cooperative Oncology Group (ECOG) HD.6 study showed superior OS for chemotherapy alone at 12 years, due to increased late events and toxicity in the CMT arm [65]. Similarly, the GHSG HD10/11 trials, while showing improved long-term disease control (8-year time to treatment failure), were unable to show an OS advantage for patients treated with CMT as compared to chemotherapy alone [66]. On the other hand, clinicians should be aware that the scope of these trials was not merely to compare treatment efficacy between the therapy arms but also to assess the benefits of omitting RT, a well-known risk factor for late toxicity. With the understanding that second-line treatments at the time of relapse can be quite effective in overcoming the transient survival disadvantage, RT can probably be safely avoided, at least in the patient subset with early favourable disease [17].

Owing to the very high NPV of interim PET in early-stage HL [8, 37, 38, 51, 67], its most attractive use in a PET response-adapted strategy is probably the de-escalation of therapy, either by abbreviating chemotherapy or by omitting radiotherapy altogether. However, compared to advanced-stage disease, data in early-stage disease are less mature and results are controversial. Interest in the predictive value of the interim PET scan was ignited in 2005 by Hutchings et al. in a pioneering retrospective study of 85 early- and advanced-stage HL patients undergoing interim PET after 2–3 cycles of ABVD; however, the positive predictive value of interim PET was much less evident in limited-stage disease [43]. This lower predictive value could largely be explained by the fact that chemo-resistance does not imply a priori refractoriness to radiation therapy, which is an essential part of CMT in early-stage HL [28]. This concept was elegantly demonstrated by Sher et al. [67], who reported, among patients with a positive mid-treatment PET scan, a 2-year failure-free survival of 92 % for those receiving consolidation radiotherapy after completion of the chemotherapy program vs. 69 % for those receiving no further treatment.

In a prospective study aimed at assessing the effectiveness of the less toxic regimen of doxorubicin, vinblastine, and gemcitabine (AVG) compared to ABVD, early-stage HL patients underwent restaging with PET/CT after 2 and 6 cycles of chemotherapy [44]. After a mean follow-up of 3.3 years (range 0.4–5.0), the 2-year PFS for cycle 2 PET-negative and PET-positive patients was 88 % and 54 %, respectively, compared with 89 % and 27 % for cycle 6 PET-negative and PET-positive patients. The NPV and PPV for interim PET were 84.4 % and 45.8 %, respectively. This relatively low NPV could be explained by the lower effectiveness of the AVG chemotherapy regimen compared to ABVD (CR rate 94 % vs. 81 %). The reasons for the disappointingly low PPV have already been discussed, including the high patient rescue rate with radiation therapy.

Le Roux et al. reported the results of a PET-adapted strategy in a cohort of 90 HL patients, evenly split between early stage (45 patients) and advanced stage (45 patients), prospectively enrolled in a single institution [68]. After four cycles of ABVD, patients underwent a mid-treatment evaluation including CT and FDG PET/CT. Patients with a negative interim FDG PET/CT, or with a positive interim FDG PET/CT but in CR according to CT, completed the pre-planned treatment for low-risk patients: IFRT for early favourable HL and additional or four more cycles of ABVD for early unfavourable and advanced stages (III and IV). Patients with a positive interim FDG PET/CT who were not in CR were referred for autologous stem cell transplantation (ASCT). The criterion for a positive interim PET was an FDG uptake higher than background. In a subsequent separate analysis, three different criteria for interim PET interpretation were then retrospectively applied. After a median follow-up of 49 (13–81) months, 6 of 31 patients with a positive and 7 of 59 patients with a negative interim PET scan experienced treatment failure. Again, the NPV was very high (95 %) and the PPV very low (16 %). Another prospective study was launched in Italy to assess the role of PET in guiding radiotherapy in both early- and advanced-stage patients in complete remission at the end of chemotherapy. One hundred and sixty HL patients with bulky disease at baseline (defined as a node with a diameter >5 cm) and a negative end-of-therapy PET scan after 6 courses of vinblastine, etoposide, bleomycin, epirubicin, and prednisone (VEBEP) were randomized to receive radiotherapy or observation [69]. Two-thirds of the patients in both arms had limited-stage disease (stage I–IIA). At a median follow-up of 40 months, PFS was 86 % in the chemotherapy arm compared to 96 % in the CMT arm, a statistically significant difference (p = .03).
The overall diagnostic accuracy of FDG PET to exclude impending relapses in the patients non-protected by radiotherapy was 86 % with a false-negative rate of 14 %. All the relapses in the chemotherapy only arm occurred in the bulky site and contiguous nodal regions. The largest concluded phase II study is the RAPID trial, on behalf of the UK National Cancer Research Institute (NCRI) [70]. The study enrolled 602 patients with non-bulky, early-stage (IA–IIA) disease with a median age of 34 years. Sixty-two percent of enrolled patients had a favourable prognosis according to EORTC criteria. Following three cycles of ABVD, an interim PET scan was performed (PET-3). 420 patients with a negative PET-3 were randomized to either no further therapy (NFT) or involved-field radiotherapy (IFRT): 209 to IFRT and 211 to NFT. Patients with a positive PET-3 were treated with a fourth ABVD cycle, followed by IFRT (Fig. 3.5).

Fig. 3.5

Final results of the UK NCRI RAPID trial: progression-free survival of irradiated vs. no further treatment patients. (a) progression-free survival for irradiated versus no further treatment patients: Intention to treat analysis. (b) progression-free survival for irradiated versus no further treatment patients: per-protocol analysis. (From Radford et al. [70])

The interim PET scan was interpreted according to the Deauville five-point scale [71], but the threshold for a positive scan was set between scores 2 and 3 (“sensitive” threshold) in order to avoid false-negative results. Seventy-five percent of patients had a negative (scores 1–2) and 25 % a positive (scores 3–5) PET-3 scan. After a median follow-up of 60 months from randomization, in an intention-to-treat (ITT) analysis, PFS and OS were not statistically different between the arms. The 3-year PFS rate was 94.6 % (95 % confidence interval [CI], 91.5–97.7) in the radiotherapy group and 90.8 % (95 % CI, 86.9–94.8) in the NFT group, with an absolute risk difference of −3.8 percentage points (95 % CI, −8.8 to 1.3). The trial was a randomized non-inferiority study powered to exclude a difference in PFS of 7 percentage points or more between the experimental and the standard arm, and the endpoint was reported as met; it should be noted, however, that the lower limit of the 95 % CI (−8.8) lies beyond this 7-point margin. Furthermore, in a per-protocol (PP) analysis, after exclusion of 26 patients allocated to IFRT but not irradiated, the 3-year PFS was 97.1 % for the IFRT arm and 90.8 % for the NFT arm. Moreover, as a further confounding factor, all 5 deaths recorded in the study occurred in patients allocated to the IFRT arm, before the start of radiation therapy.
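
The absolute risk difference and its confidence interval quoted above can be approximated with a simple calculation. The sketch below uses a binomial (Wald) approximation, which is an assumption made for illustration: the published interval is based on survival estimates, so the result is expected to be close to, but not identical with, the reported −8.8 to 1.3. The arm sizes (211 NFT, 209 IFRT) are the randomization numbers given earlier in this section, and the function name is hypothetical.

```python
import math


def risk_difference_ci(p_exp: float, n_exp: int, p_std: float, n_std: int,
                       z: float = 1.96):
    """Difference in proportions (experimental minus standard) with a Wald CI.

    Binomial approximation only; it ignores censoring, unlike a
    Kaplan-Meier based interval.
    """
    diff = p_exp - p_std
    se = math.sqrt(p_exp * (1.0 - p_exp) / n_exp
                   + p_std * (1.0 - p_std) / n_std)
    return diff, diff - z * se, diff + z * se


# 3-year PFS: 90.8 % with no further therapy (n = 211) vs. 94.6 % with IFRT (n = 209).
diff, lower, upper = risk_difference_ci(0.908, 211, 0.946, 209)

print(f"risk difference: {100 * diff:.1f} percentage points")
print(f"approximate 95 % CI: {100 * lower:.1f} to {100 * upper:.1f}")
```

In a non-inferiority design, it is the lower limit of this interval, rather than the point estimate, that is compared with the pre-specified margin.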

3.3.2 Phase III Ongoing Trials in Early-Stage Disease

Three European groups, the EORTC (European Organisation for Research and Treatment of Cancer), LYSA (Lymphoma Study Association), and FIL (Italian Lymphoma Foundation), jointly launched a prospective, randomized, phase III PET response-adapted study in both early favourable (H10F arm) and early unfavourable (H10U arm) HL. In this trial the interim PET was performed after 2 ABVD cycles (PET-2) and the scans were centrally reviewed. The endpoint was non-inferiority of the experimental arm (PET-2-adapted strategy) compared to the standard arm in both strata (3 ABVD + IFRT in H10F or 4 ABVD + IFRT in H10U, whatever the result of PET-2). In both H10F and H10U, the experimental arm was split into an escalation arm and a de-escalation arm according to the PET-2 result. In the former, PET-2-positive patients were treated with 2 cycles of escalated BEACOPP followed by IFRT 20 Gy, irrespective of the risk stratum (both H10F and H10U). In the latter, PET-2-negative patients were treated with 2 further ABVD cycles (H10F) or 4 further ABVD cycles (H10U) (see Fig. 3.6).
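
The allocation logic of the trial, as described above, can be written as a small lookup. This is a sketch of the design as summarized in this section only, not of the protocol itself: the function name, argument names, and returned labels are hypothetical, and any detail not stated in the text (for example, radiotherapy dose in the standard arm) is deliberately left out.

```python
def h10_treatment(stratum: str, arm: str, pet2_positive: bool) -> str:
    """Return the treatment described in the text for a given patient.

    stratum: "H10F" (early favourable) or "H10U" (early unfavourable)
    arm: "standard" or "experimental"
    pet2_positive: result of the interim PET after 2 ABVD cycles
    """
    if stratum not in ("H10F", "H10U"):
        raise ValueError(f"unknown stratum: {stratum!r}")
    if arm not in ("standard", "experimental"):
        raise ValueError(f"unknown arm: {arm!r}")

    if arm == "standard":
        # Standard arm ignores the PET-2 result.
        total_cycles = 3 if stratum == "H10F" else 4
        return f"{total_cycles} ABVD + IFRT"

    if pet2_positive:
        # Escalation arm, identical in both strata.
        return "2 BEACOPP esc. + IFRT 20 Gy"

    # De-escalation arm: further chemotherapy only.
    further_cycles = 2 if stratum == "H10F" else 4
    return f"{further_cycles} further ABVD"


for stratum in ("H10F", "H10U"):
    for arm in ("standard", "experimental"):
        for positive in (False, True):
            label = "PET-2 positive" if positive else "PET-2 negative"
            print(stratum, arm, label, "->", h10_treatment(stratum, arm, positive))
```

Written this way, it is easy to see that the PET-2 result changes treatment only in the experimental arm, and that the risk stratum matters only for PET-2-negative patients and for the standard arm.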

Fig. 3.6

The EORTC, LYSA, FIL H10 trial in early favourable and unfavourable Hodgkin lymphoma (From Raemaekers et al. [72])

An interim futility analysis of the primary endpoint was scheduled after documentation of 12 and 22 events (progression, relapse, or death) in the H10F and H10U subgroups, respectively. The Deauville five-point scale was adopted as the interpretation key for PET-2: the rate of negative PET-2 scans in the H10F and H10U studies was 86 % and 75 %, respectively. The recently published results of the pre-planned interim analysis led to conclusions opposite to those of the RAPID study [72]. In the H10F stratum approximately 190 patients had been randomized to each study arm: a single event was recorded in the standard arm compared to 9 in the non-irradiated PET-2-negative arm. In the H10U study nearly 260 patients were randomized: 7 and 16 events occurred in the standard arm and in the non-irradiated PET-2-negative arm, respectively. Based on the statistical analysis, and despite the very low number of events, futility was declared (p = .017 and .026, respectively). The Data Safety and Monitoring Board amended the study by closing the experimental, de-intensification arm. The results of the intensification arm were recently presented at the 13th ICML in Lugano [73]. Briefly, 361/1950 (18 %) patients had a positive interim PET scan: 159 continued with one (H10F) or two (H10U) ABVD courses plus INRT, while 169, in both strata, switched to escalated BEACOPP for two courses followed by INRT. After a minimum follow-up of 4.5 years, the 5-year PFS was 77 % for the ABVD arm vs. 91 % for the escalated BEACOPP arm (p = .002). However, the 5-year OS showed only a non-significant advantage for the intensification arm: 89 % vs. 96 % (p = .06).

The German Hodgkin Study Group (GHSG) launched two prospective, non-inferiority clinical trials in favourable (HD 16) and unfavourable (HD 17) early-stage HL [74, 75]. The trials are similar in endpoint (non-inferiority study) and experimental design to the EORTC/LYSA/FIL H10 trial. In both trials a chemoradiation program non-PET-based with ABVD (HD 16) or BEACOPP (HD 17) and IFRT in the standard arm is compared to a chemotherapy-alone program in PET-2-negative patients and a CMT program with the corresponding chemotherapy regimen in PET-2-positive patients. Both studies were powered to a ≤5 % non-inferiority statistical design.

Two American collaborative groups, Cancer and Leukemia Group B and Eastern Cooperative Oncology Group, are conducting two very interesting trials in early-stage bulky HL, in which interim PET-positive patients after 2 ABVD courses are treated with 4 BEACOPP escalated cycles, followed by IFRT. The former trial is designed to omit INRT to the PET-2-negative subset [76] and the latter to deliver the conventional combination of ABVD + INRT to PET-negative patients [77].

3.3.3 Phase II Concluded Studies in Advanced-Stage Disease

In advanced-stage disease, a heated debate spanning over two decades has centred on the following question: should a more effective treatment like escalated BEACOPP (EB) be given indiscriminately to all patients at disease onset, or could it be reserved for those with relapsing or refractory disease after standard ABVD, with the intent of sparing undue toxicity to the entire patient cohort [78]? Despite the proven superiority of EB over standard ABVD in terms of 10-year PFS, reported in four randomized clinical trials [56, 79–82], a large meta-analysis of 2868 patients with advanced-stage HL concluded that there was no significant difference in OS between the groups receiving either treatment [83]. Here again, as for limited disease, the PET scan could ideally play the role of “arbiter” in this debate. As previously mentioned, the early interim PET scan proved the most accurate predictor of treatment outcome in advanced-stage, ABVD-treated HL patients [45].

Moving from these observations, from 2006 onward several Italian haematology institutions agreed to adopt an interim PET response-driven strategy in advanced-stage HL patients, to prospectively test the following working hypotheses: (1) that very high-risk PET-2-positive patients could be rescued with EB in at least half of cases and (2) that the overall outcome of the entire cohort of patients could be improved compared with standard historical results of ABVD treatment. The results of this study showed that after a median follow-up of 34 months (12–52), the 2-year failure-free survival (FFS) for the entire patient cohort was 91 %: 62 % for PET-2-positive and 95 % for PET-2-negative patients [84]. The working hypotheses were thus confirmed, and this therapeutic strategy proved feasible.

Similar to limited-stage HL, the therapy goal for advanced disease includes both maximizing treatment efficacy and avoiding undue toxicity for low-risk patients who do not require intensified therapies. Nevertheless, the primary treatment objective differs significantly from that of limited-stage HL, in that treatment intensification in high-risk disease takes precedence over minimizing therapy-related side effects. Both hypotheses, however, have been addressed in small phase II, single-centre or large cooperative multicentre clinical trials which have been recently concluded and published, adopting an escalation or a de-escalation strategy based on the PET-2 result after ABVD or BEACOPP, respectively [85, 86]. While data from Ganesan [85] seem very similar to those reported in the interim analysis of other large multicentre trials with the same endpoint, Deau et al. reported the results of a retrospective analysis of a small cohort of 64 advanced-stage HL patients who were consecutively enrolled in a single institution over a period of more than 6 years. Treatment started with 2 EB courses, and patients had their treatment adapted on the basis of interim PET results [86]. Fifty-five patients (86 %) achieved a negative PET-2. Six relapses (11 %) occurred within the PET-2-negative group, mostly during the first year of follow-up (range: 4–14 months). In the PET-2-positive group, five patients showed disease progression with a positive PET after two more EB cycles (PET-4) and were allocated to salvage therapy. Moreover, four (44 %) PET-2-positive patients relapsed. After a median follow-up of 30 months, the 2-year PFS was 87 % in the PET-2-negative group but only 47 % in the PET-2-positive arm (p = .0059).

3.3.4 Phase II Ongoing Trials in Advanced-Stage Disease

Three large, international prospective multicentre trials sharing (a) the inclusion criteria, (b) the main study endpoint, (c) the interpretation key for interim PET (the Deauville five-point scale) and (d) the overall treatment strategy were launched in 2007 by the US intergroup (S0816 trial), by the UK National Cancer Research Institute (RATHL study) and by the Italian Gruppo Italiano Terapie Innovative nei Linfomi (GITIL) and the Italian Foundation on Lymphoma (FIL) (HD0607 study) [87–89]. The common trial backbone is the following: advanced-stage HL patients (IIB–IVB) are treated with two ABVD courses, and an interim PET is performed afterwards (PET-2). Patients showing a positive PET-2 switch to EB (minimum 4 courses), whereas patients with a negative PET-2 continue with ABVD for a total of 6 cycles. Secondary intra-arm randomizations are planned in the RATHL study (ABVD vs. AVD in PET-2-negative patients) and in the HD 0607 study (consolidation radiotherapy vs. no further treatment in the PET-2-negative arm). Preliminary results from the interim analysis of these trials have been presented in abstract form. The preliminary results of the US intergroup trial S0816, on behalf of four cooperative groups, were presented at the 12th ICML meeting in Lugano [90]. Interim PET-2 scans were centrally reviewed and reported with the Deauville five-point scale in an overall population of 357 patients. Two hundred ninety-two patients (82 %) were PET-negative (scores 1–3) and 65 (18 %) were PET-positive (scores 4–5). Of 349 patients registered to continue therapy based on the interim PET result, 291 continued with ABVD and 58 with EB. The Kaplan–Meier estimate for 1-year overall survival was 98 % (95 % CI: 95 %, 99 %) and for 1-year PFS was 84 % (95 % CI: 79 %, 89 %). The 1-year PFS of PET-2-negative and PET-2-positive patients was 85 % (95 % CI: 79 %, 90 %) and 72 %, respectively.
The preliminary results of the RATHL study were also presented during the 13th ICML in Lugano [91]. PET-2 results were available for 1137 patients, with the following breakdown: 954 (84 %) were negative and 183 (16 %) positive. Among PET-2-negative patients, 65 % of those treated with ABVD and 69 % of those treated with AVD achieved CR or CRu. The CR/CRu rate was dependent on the PET-2 Deauville five-point score: score 1, 82 %; score 2, 72 %; and score 3, 58 % (p < 0.01). Those with a positive PET-2 who received intensified therapy with EB reached a negative PET-3 in 74 % of cases. The 3-year PFS for PET-2-positive patients treated with eBEACOPP or BEACOPP-14 and for PET-2-negative patients treated with ABVD or AVD was 66 %, 82.5 %, 85.4 % and 84.4 %, respectively. The 3-year PFS for the entire cohort of patients was 82.5 % (80.1–84.7).

The results of the second interim analysis of the GITIL/FIL HD 0607 trial were also presented at the same meeting [92]. The trial was closed in June 2014: 753 patients were enrolled and 656 (84 %) completed the treatment. Upon blinded independent central review (BICR), 114 patients (17.3 %) had a positive and 542 (82.6 %) a negative PET-2. Treatment efficacy could be assessed in a cohort of 500 patients with a minimum follow-up of 2 years after the end of treatment (follow-up 1065.5 days, 749.5–1299.5). A continuous complete remission (CCR) was recorded in 68 out of 97 PET-2-positive patients who switched to EB (70 %) and in 351 out of 400 PET-2-negative patients (88 %) who continued with ABVD. The 2-year and 5-year PFS probabilities were 66 % and 62 % for PET-2-positive patients, 89 % and 85 % for PET-2-negative patients, and 84 % and 81 % for the overall cohort, respectively (p < .001). In conclusion, more than 2000 patients have been enrolled in these three trials; critical information and new treatment options for these patients will therefore soon be available. Importantly, the results of interim PET using the Deauville 5-point scale confirmed the reproducibility of this interpretation key across these studies: the percentages of PET-2-positive patients in this very large pool of patients from the UK, USA and Italian trials were 16 %, 18 % and 17 %, respectively (see Table 3.3).

Although based on preliminary data, the following observations can be made: (1) nearly 10 % of the PET-2-negative patients experience a treatment failure; this percentage seems twice that reported in previous non-adapted observational studies [8, 32, 37, 38, 43, 44, 46, 47]; (2) nearly two-thirds (60–70 %) of the PET-2-positive patients could be rescued with EB and achieve a long-term remission; (3) the 2-year PFS of the overall cohort of patients seems slightly better than that obtained with standard ABVD treatment, with a gain in PFS of 5–10 % compared with historical controls [53].

Another critical point is the procedure used to adjudicate the final result, i.e. the PET scan review process. While no data are available from the U.S. intergroup S0816 or from the RATHL studies, the Italian GITIL/FIL study adopted a blinded independent central review (BICR) procedure. Besides the decision that the local PET site must cede the final determination of a patient’s status to the central review, which should be bilaterally agreed between the sponsor and the local PET site, this choice depended on the need to check the reproducibility of the 5-point Deauville scale (5-PS) and the agreement coefficient among reviewers [93]. The 5-PS for interim PET interpretation had just been proposed at that time [71], and no validation studies were available on the reproducibility of those interpretation rules. Moreover, the U.S. Food and Drug Administration (FDA) recommends BICR for trials in which reviewer blinding is not achievable and reviewers are informed that their decision will determine a switch to a more aggressive treatment [94].

Finally, technological progress in web-based image exchange and the availability of the web platform WIDEN® to upload and download images [95] have made BICR, and the consequent treatment decision by the local clinical investigator, possible and timely. In the HD0607 trial the median scan uploading and downloading times were 1 min 25 s and 1 min 55 s, respectively; the average and median times for central review were 47 h 53 min and 37 h 43 min, respectively. The binary concordance between pairs of reviewers (Cohen’s kappa) ranged from 0.72 to 0.85. The 5-point scale concordance among all reviewers (Krippendorff’s alpha) was 0.77 [95].
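
As an illustration of how a pairwise binary agreement coefficient of this kind is obtained, the following Python sketch computes Cohen's kappa for two reviewers from their positive/negative reads. The function name `cohen_kappa` and the ten example reads are hypothetical and are not data from the HD0607 trial; the sketch assumes that each reviewer reports every scan as 1 (positive) or 0 (negative).

```python
def cohen_kappa(reads_a, reads_b):
    """Cohen's kappa for two reviewers rating the same scans.

    reads_a, reads_b: equal-length lists of labels (here 1 = positive,
    0 = negative). Illustrative helper, not taken from the trial software.
    """
    if len(reads_a) != len(reads_b) or not reads_a:
        raise ValueError("both reviewers must rate the same non-empty set of scans")
    n = len(reads_a)

    # Observed agreement: fraction of scans given the same label by both.
    observed = sum(a == b for a, b in zip(reads_a, reads_b)) / n

    # Chance agreement: product of each reviewer's marginal frequency,
    # summed over all labels that were used.
    labels = set(reads_a) | set(reads_b)
    expected = sum(
        (reads_a.count(label) / n) * (reads_b.count(label) / n)
        for label in labels
    )

    if expected == 1.0:
        # Both reviewers used a single identical label for every scan.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical reads for ten interim scans (invented for illustration only).
reviewer_1 = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
reviewer_2 = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

With these invented reads the two reviewers disagree on one scan out of ten, and the resulting kappa is about 0.74: the raw agreement of 90 % is corrected for the agreement expected by chance alone.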

At this writing, no conclusive or preliminary data are available from clinical trials adopting a de-escalation strategy after EB, with the exception of the results of an interim analysis of the Israeli H2 trial [96], which were presented during the 9th International Symposium on Hodgkin Lymphoma in Cologne [97]. Patients with advanced-stage HL are first assigned to therapy based on the IPS score: patients with IPS 0–2 receive 2 ABVD courses and those with IPS ≥ 3 two EB courses. An interim PET is performed afterwards in both strata: if PET-2 is negative, 4 more cycles of ABVD are given, followed by IFRT to bulky mediastinal masses. In the PET-2-positive arm with no evidence of HL progression, 4 EB cycles are given, followed by IFRT to bulky mediastinal masses. Treatment de-escalation was possible in 80 % of advanced-stage patients. No data are available on treatment escalation. At a median follow-up of 24 months (4–74), PFS was 82 % for the entire cohort of advanced-stage patients. An overview of interim PET-adapted clinical trials is provided in Fig. 3.7.

Fig. 3.7

Overview of the PET-adapted clinical trials in advanced-stage HL. EB escalated BEACOPP, R rituximab, RT consolidation radiotherapy, LYSA Lymphoma Study group de l’Adulte, GHSG German Hodgkin Study Group, FIL Italian Foundation on Lymphoma, GITIL Italian Group for Innovative Therapy of Lymphoma, NCRI National Cancer Research Institute, SWOG Southwest Oncology Group, CALGB Cancer and Leukemia Group B

3.4 PET to Guide Consolidation Radiotherapy

One of the most compelling applications of PET imaging in HL has been guiding consolidation radiotherapy for residual masses persisting after chemotherapy.

Tumour bulk decreases over time during cytostatic treatment, and the rationale for using FDG PET for chemotherapy response assessment is based on the strong relationship between the intensity of FDG uptake and cancer cell number, which has been reported in a substantial number of studies [98, 99]. Therefore, a decline in FDG uptake during tumour shrinkage results from a reduction in the number of viable neoplastic cells, while a sustained increase in SUV values is seen upon tumour regrowth. On the other hand, the relationship between a CT-detected tumour mass and clinical response could be lost in chemo-sensitive neoplastic disorders, as the metabolic slowdown of the neoplastic tissue could precede the reduction of tumour volume by months. As a consequence, 60–80 % of HL patients show a residual mass at end-of-treatment restaging, mostly in sites of bulky disease recorded at baseline [5, 6], but fewer than half of these masses still harbour residual disease [100]. This phenomenon was first described in lymphoma entering a sustained clinical remission at the end of therapy, but it has later also been reported in a number of solid tumours such as head and neck squamous cell carcinoma (HNSCC) and gastrointestinal stromal tumours (GIST), in which a metabolic response of the tumour, documented by a negative FDG PET/CT scan, invariably preceded the anatomical response detected on CT [101, 102].

In the pre-PET era, Bonadonna et al. in Milan originally proposed a boost of consolidation RT for bulky nodal lesions or residual masses in advanced HL as an integral part of ABVD treatment [53]. However, with the advent of PET, it became possible to discriminate residual active disease from fibrotic tissue at the end of chemotherapy in lymphoma, with a sensitivity of 43–100 % and a specificity of 67–100 % [103]. Owing to its ability to detect persisting viable tissue, functional imaging with PET/CT proved superior to conventional radiological imaging in defining the prognosis of tumour masses detected at the end of chemotherapy and turned out to be an ideal tool for guiding consolidation radiotherapy. Predictably, the NPV of the end-of-treatment PET depends on the efficacy of the administered chemotherapy, being as high as 94 % with very effective chemotherapy regimens such as EB [104] or as low as 75 % after the low-intensity VEBEP regimen [69, 105].

A very elegant and convincing demonstration of these concepts came from the results of the large HD15 trial of the GHSG, in which consolidation radiotherapy was administered only to advanced-stage HL patients showing a PET-positive, CT-detected residual mass with a diameter ≥ 2.5 cm at the end of three different EB regimens. The 4-year PFS of irradiated vs. non-irradiated patients was 86.2 % and 92.6 %, respectively (P = 0.022). The NPV of end-of-therapy PET was as high as 94 %. A residual mass was detected by CT scan in 739 of 2126 patients (34.7 %), and 191 of these 739 (26 %) had a positive PET scan at the end of treatment [104]. A very important conclusion of the trial was that consolidation radiotherapy was needed in only 11 % of the enrolled patients, compared with 71 % in the HD 9 trial [56]. In a subsequent analysis combining dimensional data of the residual mass (i.e. the largest diameter of the residual lesion measured in trans-axial CeCT slices) with PET/CT data, the same group was able to refine and improve the interpretation criteria of the end-of-therapy scan for predicting treatment outcome: in the PET-positive patients, a decrease in size of the residual mass ≥ 65 % from baseline values decreased the false-negative results [106].

Similar conclusions have been reached in a cohort of ABVD-treated advanced-stage patients by Savage et al. on behalf of the British Columbia Cancer Agency (BCCA) and reported in abstract form [107]. In all the advanced-stage HL patients enrolled in clinical trials on behalf of BCCA after 2005 who showed a residual mass at CT scan with a diameter ≥ 2 cm at the end of ABVD treatment and a negative PET scan, consolidation radiotherapy was omitted. In short, 151 patients with advanced-stage HL and a PET-negative residual mass at the end of treatment had a 5-year progression-free survival of 92 %, and a subset of 71 patients with a PET-negative residual mass in a nodal region where a bulky lesion with a diameter ≥ 10 cm was recorded at baseline had a 5-year PFS of 90 %. The overall NPV and PPV of end-of-therapy PET scan were 92 % and 55 %, respectively. This study confirmed the high NPV of end-of-therapy PET scan in patients treated with a chemotherapy regimen of adequate intensity. The low positive predictive value could be due to the rescue treatment with consolidation radiotherapy but also to false-positive PET scan results caused by unspecific tissue inflammation secondary to chemotherapy-induced tumour lysis [108]. In conclusion, the decision to irradiate a single PET-positive residual mass should be taken in the awareness of false-positive results, especially in the case of residual masses showing a dramatic shrinkage compared with baseline dimensions.
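
For readers less familiar with these metrics, the short Python sketch below shows how NPV and PPV are derived from a two-by-two table of end-of-therapy PET result versus outcome. The function name `predictive_values` and the counts used are purely illustrative assumptions: the counts were invented so that the arithmetic is easy to follow and are not the BCCA or HD15 data. A "true positive" is taken here to be a PET-positive patient who later experiences treatment failure, and a "false negative" a PET-negative patient who does.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from a 2x2 table of PET result vs. outcome.

    tp: PET-positive patients with treatment failure
    fp: PET-positive patients who remain in remission
    tn: PET-negative patients who remain in remission
    fn: PET-negative patients with treatment failure
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one PET-positive and one PET-negative patient")
    ppv = tp / (tp + fp)  # probability of failure given a positive scan
    npv = tn / (tn + fn)  # probability of remission given a negative scan
    return ppv, npv


# Invented counts for illustration only (not trial data).
ppv, npv = predictive_values(tp=11, fp=9, tn=92, fn=8)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

The example makes the asymmetry discussed above explicit: a high NPV only requires that few PET-negative patients fail, whereas the PPV falls whenever PET-positive patients are cured, whether by consolidation radiotherapy or because the scan was falsely positive.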

3.5 PET During Second-Line Treatment

The standard therapeutic option for second-line treatment of relapsed or refractory HL is high-dose chemotherapy (HDT) followed by autologous haematopoietic stem cell transplantation (ASCT), resulting in rescue and long-term disease control in up to two-thirds of patients. Successful outcome depends on remission duration after first-line chemotherapy and chemosensitivity to second-line or salvage therapy prior to ASCT [109, 110]. Furthermore, recent meta-analysis data confirmed the prognostic value of pre-ASCT FDG PET imaging in lymphoma, demonstrating poor long-term disease control in PET-positive patients after induction chemotherapy (31–41 %) compared with a PFS of 73–82 % in those who achieved a PET-negative remission before undergoing HDT/ASCT [111–114]. Moving from these observations, a PET response-adapted strategy was also proposed during second-line rescue treatment including HDT and ASCT for relapsing/refractory HL. In a non-randomised, open-label, single-centre, phase 2 trial, 45 patients refractory to doxorubicin-containing first-line treatment received weekly infusions of 1.2 mg/kg brentuximab vedotin (BV) on days 1, 8, and 15 for two 28-day cycles. After completion of two cycles, patients underwent a PET scan. Twelve patients (27 %, 95 % CI 13–40) were PET-negative, with a Deauville score of 1 or 2, and proceeded straight to HDT/ASCT, while 33 (73 %, 95 % CI 60–86) were PET-positive (Deauville 3–5) after BV. One still PET-positive patient withdrew consent, and therefore 32 PET-positive patients received HDT with augmented ICE (ifosfamide 5000 mg/m² in combination with mesna 5000 mg/m², continuous infusion every 12 h, days 1 and 2; carboplatin, single dose AUC 5, day 3; etoposide 200 mg/m² every 8 h, day 1 for three doses), for two cycles. After HDT, the PET scan reverted to negativity in 22/32 (69 %, 95 % CI 53–85) cases. Overall, 34/45 patients (76 %, 95 % CI 62–89) achieved PET negativity [115].
However, owing to the very small number of enrolled patients and the very short follow-up (nearly 1 year after the end of treatment), these observations should be taken with caution and considered preliminary, to be confirmed in a larger phase III trial. Interestingly, a very conservative cut-off value for a negative scan (score ≤ 2) on the 5-PS was adopted. This choice, as in other clinical trials such as the RAPID study [70] aimed at assessing the role of interim PET for treatment de-escalation, was made in order to maximize the sensitivity of the imaging technique, as recently proposed in the Lugano Workshop on PET scan for lymphoma staging and restaging [116]. In contrast to the abundant historical data in the literature on outcome prediction in front-line treatment, very few reports are available on the predictive value of interim PET scan during salvage therapy. In a small cohort of 24 relapsing or refractory HL patients treated with rescue chemotherapy consisting of ifosfamide, gemcitabine and vinorelbine (IGEV) followed by ASCT, PET scan was predictive of final treatment outcome when performed after the second cycle. The 2-year PFS was 93 % vs. 10 % for patients with PET-negative and PET-positive results, respectively (P < 0.001) [117]. More recently, brentuximab vedotin (BV) emerged as the most active drug for relapsing or refractory HL, proving able to induce an overall response rate (ORR) as high as 75 % in HL patients treated with up to 13 lines of chemotherapy [118, 119]. BV is an antibody-drug conjugate composed of the anti-CD30 chimeric immunoglobulin G1 (IgG1) monoclonal antibody cAC10 conjugated, through a protease-cleavable linker, with the potent anti-microtubule drug mono-methyl auristatin E (MMAE); the drug is internalized in the HRS cells, which are selectively killed by the MMAE toxin.
Several retrospective experiences have been reported with the use of BV in the so-called national named patient programs (NNP) for the compassionate use of BV in refractory HL, in which interim PET was usually performed after 2–4 doses of BV. In the GHSG experience, 12 consecutive, heavily pretreated patients with relapsed and refractory HL treated with BV at a dose of 1.8 mg/kg every 21 days were available for analysis. Interim PET was performed after a median of 3 cycles (range, 2–5 cycles) and was analysed visually using a 5-point scale (5-PS). The 1-year PFS was 100 % and 38 % in patients with negative and positive interim PET, respectively (p = 0.033) [120]. Similar results were obtained in the Italian NNP in a retrospective study including 65 patients who had received a median of 4 (2–13) prior cancer-related systemic regimens, including HDT and ASCT or allogeneic stem cell transplant, and were treated with BV at a dose of 1.8 mg/kg every 21 days. In the absence of specific indications, response was assessed by PET/CT scans after cycles 3 and 8 (PET-3, PET-8) and at treatment discontinuation, according to the International Harmonization Project (IHP) criteria [121]. The best overall response rate (70.7 %), including 21.5 % complete responses, was observed at the first restaging after the third cycle of treatment (PET-3). Before the second interim evaluation, which was scheduled after eight cycles of BV (PET-8), 21 patients discontinued BV treatment: 12 of them for progressive disease and 3 for toxicity, while 6 underwent stem cell transplantation. The final response of the whole sample was as follows: 14 complete responses (21.5 %), 5 partial responses (7.7 %), 6 cases of stable disease and 40 cases of progressive disease. After a median follow-up of 13.2 months, the overall survival rate at 20 months was 73.8 %, while the progression-free survival was 24.2 % [122].

3.6 PET Scan Interpretation

3.6.1 Historical Proposal

In the pre-PET era, at the end of the millennium, a first set of criteria for treatment response assessment in HL and non-Hodgkin lymphoma (NHL), based on traditional radiological imaging, was proposed with the aim of harmonizing CT interpretation rules; these were later called the International Workshop Criteria (IWC) [123]. The criteria were mainly based on the reduction in size of nodal and extra-nodal lesions. Cheson et al. included anatomic definitions of complete response, based on a “normal” lymph node size, defined as equal to or lower than 1.5 cm in the longest transverse diameter in trans-axial CT slices. A designation of complete response/unconfirmed (CRu) was adopted to include patients with radiological evidence of a residual mass at the end of treatment showing a reduction in the largest diameter ≥ 75 % of that measured at baseline in the same mass. Partial response (PR) was defined as a reduction ≥ 50 % in the sum of the largest diameters of all the measurable nodal masses and extra-nodal lesions, and stable disease (SD) as a reduction ≤ 25 % in the same sum. Progressive disease (PD) was defined as an increase > 50 % in the sum of the largest diameters of all the measurable nodal masses and extra-nodal lesions, or the appearance of a new lesion.

In 2007, the exponential increase of PET use in lymphoma staging and restaging led to a revision of the IWC criteria by including PET/CT in the recommended panoply of imaging tools for treatment response assessment. On the other hand, specific rules for PET scan were also required, as it became clear that residual FDG uptake at the end of treatment does not necessarily mean persisting active disease [43]. Newly established criteria, the so-called International Harmonization Project criteria (IHP criteria), were therefore proposed for treatment response assessment in HL and NHL, based on literature data and consensus expert opinion [121]. The main points of the recommendations were the following:

  • Baseline FDG PET (before treatment) was not deemed mandatory for FDG-avid lymphoma subtypes, i.e. Hodgkin lymphoma (HL), diffuse large B-cell lymphoma (DLBCL), follicular lymphoma (FL) and mantle cell lymphoma (MCL), but was nevertheless recommended to ease end-of-treatment scan interpretation. Baseline PET was also recommended in variably FDG-avid lymphomas (e.g. peripheral T-cell lymphoma, marginal zone lymphoma).

  • Patients had to be scanned at least 3 weeks, but preferably 6–8 weeks, after the end of chemotherapy or chemo-immunotherapy, and 8–12 weeks after radiation.

  • Visual assessment alone was considered adequate for PET interpretation.

  • Mediastinal blood pool activity was recommended as the reference background activity to compare the residual FDG uptake in case of a residual mass ≥2 cm in largest transverse diameter, regardless of its location.

  • In the case of a smaller residual mass (largest diameter < 2 cm), the lesion could be considered positive if its residual FDG uptake showed an intensity above that of the surrounding background.

Specific criteria for defining PET positivity in the liver, spleen, lung, and bone marrow were also proposed. The above criteria were then integrated in the revised response criteria of IWC [124], which included PET/CT and bone marrow biopsy data (Table 3.2).

Table 3.2 IHP criteria

More recently, new criteria for interim and end-of-treatment PET scan interpretation have been proposed by experts, moving from the following observations: (1) the low reproducibility of dimensional criteria for a lesion measured in trans-axial slices of a CT scan, (2) the inconsistencies of FDG activity measurement in small lesions due to the partial volume effect, and (3) the revised concept of minimal residual uptake (MRU), which was considerably widened to encompass a persisting FDG uptake with an intensity as high as that measured in the liver, far beyond that originally proposed by Hutchings et al. [43].

During the 1st international workshop on PET scan in lymphoma, held in Deauville (France), and the ensuing meetings in Menton (France), a visual five-point scale (the so-called Deauville criteria, detailed in the next paragraph) was proposed and validation studies for these rules were launched [125, 126].

The main challenge in interim PET interpretation lies in the presence of residual FDG uptake in interim and end-of-treatment PET scans that is deemed by nuclear medicine physicians to be non-disease-related: the so-called “minimal residual uptake” (MRU). According to the original Hutchings definition, MRU is a “low-grade uptake of FDG (just above background) in a focus within an area of previously noted disease reported by the nuclear medicine physicians as not likely to represent malignancy” [43]. This was recorded in 10.6 % of patients scanned after 2 or 3 courses of chemotherapy. However, tumour shrinkage during chemotherapy is a continuous process, and PET scan is no longer able to detect tumour lesions with a diameter lower than 4–5 mm, which corresponds to a reduction in tumour cell number of only two logarithms and is still compatible with the presence of residual viable cells. It is therefore conceivable, at least in theory, that residual FDG uptake could be a harbinger of residual viable neoplastic tissue. Moving from this assumption, new criteria incorporating PET (PERCIST) have been proposed, evolving from the traditional radiological response criteria in solid tumours (RECIST) [127]. A residual uptake may therefore correspond to residual disease just above this detectability threshold (Fig. 3.8).

Fig. 3.8

The relation between different kinetics of tumour cell kill and the detection power of PET. (Extract from: From RECIST to PERCIST: Evolving Considerations for PET response criteria in solid tumours [127])

However, owing to the high chemosensitivity of lymphoma, the persistence of a single spot of residual FDG uptake in these neoplasms is nearly always due to a post-therapeutic inflammatory change. The MRU concept then evolved over time, with the aim of increasing the specificity and the PPV of interim and final PET scan, as summarized by Gallamini et al. [128] (Fig. 3.9).

Fig. 3.9

The evolution of the MRU definition over time (From: Gallamini et al. [128]). BKG surrounding background, MBPS mediastinal blood pool structures, MRU minimal residual uptake

As mentioned earlier, in 2005 Hutchings et al. defined minimal residual uptake as a low FDG uptake, slightly higher than the surrounding background, in a localization initially involved by lymphoma; this residual uptake was considered as probably non-malignant [43]. The significance of this observation remained undetermined; the hypothesis was that it was due to unspecific FDG uptake by inflammatory cells infiltrating the tumour in response to chemotherapy. In this pioneering study, only one patient relapsed among the 9 patients with MRU at interim PET. In 2007, Juweid et al. defined MRU as a residual FDG uptake with an intensity equal to that of the mediastinal blood pool (MBP) for lesions with a diameter equal to or greater than 2 cm, and with an intensity equal to that of the background for smaller lesions [121]. At the same time, Gallamini et al. defined MRU as low and persistent FDG uptake with an intensity equal to or slightly higher than MBP [8]. In 2008, Barrington et al. [129] defined MRU as residual uptake with an intensity equal to or lower than liver uptake. The concept of MRU has evolved over time to include all the situations in which FDG uptake could be predictably attributed to an unspecific tissue reaction. Accordingly, the proposed threshold for a positive scan has been substantially raised. Moreover, different thresholds may be set according to different clinical situations. For example, for good-prognosis patients, if the aim of a trial is a safe treatment de-escalation, a “sensitive” threshold with a high NPV is desirable. On the other hand, if the aim is intensifying treatment in interim-positive patients, a high PPV is required for the interim scan, in order to spare patients with a predictably favourable outcome the undue toxicity of an aggressive therapy [129]. Furthermore, Barrington et al. were able to demonstrate a fairly high inter-observer concordance when a threshold higher than liver uptake was used.
All the above recommendations were proposed during the first international workshop on interim PET in lymphoma held in Deauville (France) in April 2009, which was attended by haematologists and nuclear medicine experts in lymphoma [71]. The purpose of this meeting was to reach a consensus on simple and reproducible interpretation rules for interim PET in HL and DLBCL and to launch two or more international validation studies (IVS) to validate these criteria.

The main conclusions of this workshop were the following:

  • The threshold should be determined according to the clinical and therapeutic strategy, the lymphoma subtype and whether treatment escalation or de-escalation is planned.

  • A visual analysis using a five-point scale (5-PS) is recommended, with MBP and the liver as reference points.

  • The residual FDG uptake should be scored as follows:

  1. No uptake

  2. Uptake ≤ the mediastinum

  3. Uptake > the mediastinum but ≤ the liver

  4. Moderately increased uptake > the liver

  5. Markedly increased uptake > the liver and/or new lesions related to lymphoma
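
The five-point scale and the choice of positivity threshold can be sketched in Python as follows. The enum member names, the function `is_positive` and its parameter `highest_negative` are illustrative assumptions introduced here. Because the scale is a visual assessment, the sketch does not attempt to derive a score from uptake measurements; it only encodes how an already assigned score is dichotomized. The default treats scores 1–3 as negative, while a lower value such as 2 gives the more conservative cut-off adopted in some de-escalation trials.

```python
from enum import IntEnum


class Deauville(IntEnum):
    """Visual five-point scale; member names are illustrative labels."""
    NO_UPTAKE = 1
    UPTAKE_LE_MEDIASTINUM = 2
    UPTAKE_GT_MEDIASTINUM_LE_LIVER = 3
    MODERATELY_INCREASED_GT_LIVER = 4
    MARKEDLY_INCREASED_GT_LIVER_OR_NEW_LESIONS = 5


def is_positive(score, highest_negative=Deauville.UPTAKE_GT_MEDIASTINUM_LE_LIVER):
    """Return True if the scan counts as positive for the chosen threshold.

    score: an integer 1-5 assigned by the reader (ValueError otherwise).
    highest_negative: highest score still read as negative; 3 by default,
    2 for a more conservative reading.
    """
    return Deauville(score) > highest_negative


# A score of 3 is negative with the default threshold...
print(is_positive(3))
# ...but positive when only scores 1-2 are accepted as negative.
print(is_positive(3, highest_negative=Deauville.UPTAKE_LE_MEDIASTINUM))
```

Keeping the threshold as an explicit parameter mirrors the first conclusion of the workshop: the same score can lead to a different binary reading depending on whether the trial aims at escalation or de-escalation.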

In April 2010, during the second international workshop PET in lymphoma “which was held in Menton (France), [130] the preliminary results of the application of the 5-point Deauville scale (5-PS) were presented and the problems in practical application discussed. In September 2011, during the Third International Workshop on PET in Lymphoma” [125], the final results of the international validation study (IVS) in Hodgkin lymphoma and diffuse large B-cell (DLBCL) lymphoma have been presented [131]. The results confirmed the prognostic value of interim PET in HL (PFS: 28 % in positive interim PET group vs. 95 % in negative interim PET group; p < 0.0001) and the reliability and reproducibility of Deauville five-point scale. The threshold chosen for a positive scan was between scores 3 and 4, with scores 1–3 considered as negative. The inter-observer agreement was very high (97 %). Forty-five patients out of 260 patients (17 %) showed a positive interim scan; however in 12 of them a false-positive result was recorded, upon central review of the scans. Nonetheless, a preliminary consensus was reached on the use of 5-PS for interim PET in HL, with a cutoff value for a positive scan between score 3 and 4. Finally, during the two last workshops in Menton (4th and 5th international workshop on PET in lymphoma, October 2012 and September 2014), the 5-PS was proposed also for other NHL subsets for interim and end-of-treatment PET scan interpretation [126, 132]. Some issues were still discussed, like: (a) the interest, the significance and the reproducibility of differentiating Deauville scores 4 and 5, (b) the different patterns of FDG uptake in bone marrow across NHL subtype and its respective clinical significance in relationship with the “gold standard” to assess bone marrow involvement by lymphoma (trephine bone marrow biopsy), (c) the visual reference organ to be used in case of liver disease, and (d) the significance of complete metabolic response with residual mass on CT. 
Preliminary reports on quantitative PET (Q-PET), using the standardized uptake value (SUV) and SUV-derived quantitative metrics such as metabolic tumour volume (MTV) or total lesion glycolysis (TLG), were also presented, but these results were considered truly preliminary and difficult to interpret owing to the complete absence of a programme for standardizing Q-PET results.

3.7 Current PET Interpretation Recommendations in Treatment Response Evaluation

The most recently updated recommendations for interim and end-of-treatment PET interpretation and, more generally, for the integration of PET into the diagnostic workup for lymphoma staging and restaging, were agreed by nuclear medicine experts and clinicians convening in a closed workshop on PET in lymphoma during the 12th International Congress on Malignant Lymphoma (ICML), held in Lugano in 2013. They are better known as the “Lugano criteria for interim and end-of-treatment PET scan interpretation in Lymphoma” [133] (Table 3.3). The recommendations from this session can be summarized as follows:

Table 3.3 Lugano criteria for interim and end-of-treatment PET scan interpretation in Lymphoma [133]

3.7.1 Staging Procedures

  • “Excisional biopsy is preferred for diagnosis, although core-needle biopsy may suffice when not feasible.

  • Clinical evaluation includes careful history, relevant laboratory tests, and recording of disease-related symptoms.

  • PET-CT is the standard for FDG-avid lymphomas, whereas CT is indicated for non-avid lymphoma subsets.

  • A modified Ann Arbor staging system is recommended”, based on only two groups with different tumour burden: early stage (Ann Arbor stages I or II, non-bulky) or advanced disease (Ann Arbor stages III or IV), with stage II bulky disease considered limited or advanced as determined by histology and a number of prognostic factors. This two-group classification is not intended as guidance for treatment: patients should be treated according to the prognostic and risk factors of each lymphoma subset.

  • “Suffixes A and B are only required for HL.

  • The designation X for bulky disease is no longer necessary; instead, a recording of the largest tumor diameter is required.

  • If a PET-CT is performed, a BMB is no longer indicated for HL; a BMB is only needed for DLBCL if the PET is negative and identifying a discordant histology is important for patient management”.
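The two-group stage classification described in the list above can be written as a small decision rule. The following is a minimal sketch for illustration only; the function name `lugano_stage_group` and the returned labels are hypothetical, and the handling of stage I bulky disease, which the text does not address, is an explicit assumption.

```python
def lugano_stage_group(ann_arbor_stage: int, bulky: bool) -> str:
    """Map an Ann Arbor stage (1-4) and a bulky-disease flag to the
    two-group classification summarized in the text.

    Hypothetical helper: names and return labels are illustrative only.
    """
    if ann_arbor_stage not in (1, 2, 3, 4):
        raise ValueError("Ann Arbor stage must be 1, 2, 3 or 4")

    # Stages III and IV are advanced disease regardless of bulk.
    if ann_arbor_stage >= 3:
        return "advanced"

    # Stage II bulky disease is limited or advanced depending on
    # histology and prognostic factors, so no single label is returned.
    if ann_arbor_stage == 2 and bulky:
        return "limited or advanced (per histology and prognostic factors)"

    # ASSUMPTION: the text does not say how stage I bulky disease is
    # grouped, so the sketch reports that rather than guessing.
    if ann_arbor_stage == 1 and bulky:
        return "not specified in the text"

    # Stages I and II without bulk are early-stage disease.
    return "early"
```

As stated in the text, the grouping describes tumour burden and is not intended to guide treatment.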

3.7.2 Restaging Procedures

The 5-point scale (Deauville score) should be used for interim and end-of-treatment PET scan interpretation, both in clinical trials and in daily clinical practice [116].

  • PET/CT is used to assess early treatment response and, at end of treatment, to establish remission status.

  • A score of 1 or 2 is considered to represent complete metabolic response at interim and end of treatment.

  • More recent data also suggest that most patients with uptake higher than mediastinum but less than or equivalent to liver (score of 3) have good prognosis at the end of treatment with standard therapy in HL [131].

  • However, in response-adapted trials exploring treatment de-escalation, a more cautious approach may be preferred, judging a score of 3 to be an inadequate response to avoid undertreatment. Therefore, interpretation of a score of 3 depends on the timing of assessment, the clinical context, and the treatment.

  • A score of 4 or 5 at interim suggests chemotherapy-sensitive disease, provided uptake has reduced from baseline, and is considered to represent partial metabolic response.

  • Residual metabolic activity at the end of treatment with a score of 4 or 5 represents treatment failure, even if uptake has reduced from baseline.

  • A score of 4 or 5 with intensity that does not change or even increases from baseline and/or new foci compatible with lymphoma represents treatment failure, both at interim and at the end-of-treatment assessment.
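The interpretation rules listed above can be sketched as a short decision function. This is an illustrative reading of the bullets under stated assumptions, not a validated clinical tool: the names `MetabolicResponse` and `interpret_deauville`, the timing labels and the boolean inputs are all hypothetical, and a score of 3 is deliberately returned as context-dependent because the text leaves its interpretation to the timing, the clinical context and the treatment.

```python
from enum import Enum


class MetabolicResponse(Enum):
    """Hypothetical labels for the outcomes named in the text."""
    COMPLETE = "complete metabolic response"
    CONTEXT_DEPENDENT = "score 3: depends on timing, context and treatment"
    PARTIAL = "partial metabolic response"
    FAILURE = "treatment failure"


def interpret_deauville(score, timing, uptake_reduced=True, new_foci=False):
    """Interpret a Deauville score following the rules listed above.

    score          -- Deauville score, 1 to 5
    timing         -- "interim" or "end_of_treatment" (illustrative labels)
    uptake_reduced -- True if uptake has reduced from baseline
    new_foci       -- True if new foci compatible with lymphoma are present
    """
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("Deauville score must be between 1 and 5")
    if timing not in ("interim", "end_of_treatment"):
        raise ValueError("timing must be 'interim' or 'end_of_treatment'")

    # Scores 1 and 2: complete metabolic response at either time point.
    if score <= 2:
        return MetabolicResponse.COMPLETE

    # Score 3: usually favourable with standard therapy, but judged more
    # cautiously in de-escalation trials, so no single answer is returned.
    if score == 3:
        return MetabolicResponse.CONTEXT_DEPENDENT

    # Scores 4 and 5 from here on.
    # Unchanged or increased uptake, or new foci: failure at any time point.
    if new_foci or not uptake_reduced:
        return MetabolicResponse.FAILURE

    # Reduced uptake: failure at end of treatment, partial response at interim.
    if timing == "end_of_treatment":
        return MetabolicResponse.FAILURE
    return MetabolicResponse.PARTIAL
```

For example, `interpret_deauville(4, "interim", uptake_reduced=True)` returns `MetabolicResponse.PARTIAL`, whereas the same score with reduced uptake at the end of treatment returns `MetabolicResponse.FAILURE`.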

All the above recommendations should be based on visual assessment of the PET scan. Some data in the literature suggest that a quantitative cut-off based on SUV measurement may also be of interest. For example, a recent publication [134] showed that, in a cohort of 59 HL patients treated with 4–8 cycles of anthracycline-based chemotherapy, the positive predictive value of PET-2 was better using ΔSUVmax (with a cut-off of 70 %) than using the 5-point scale (46 %). However, there is at present insufficient evidence to define precisely the reduction (“delta”) in FDG uptake that predicts treatment response. Moreover, this quantitative measure depends on the timing and intensity of the treatment given. Finally, caution should be used in assessing data from quantitative PET scan interpretation, especially if retrospectively generated, in the absence of a defined programme for PET scanner calibration and image generation, acquisition and reconstruction.

Recent data also suggest that morphological information from CT may help in patients with a positive interim PET, since a greater reduction in tumour size correlates with an improved outcome. For example, in 88 HL patients treated with doxorubicin, vinblastine and gemcitabine (AVG), interim PET predicted PFS better than the percent decrease in the sum of the products of the perpendicular diameters (%SPPD), but the predictive value for PFS of a combined CT and PET/CT analysis was higher than that of either test alone [135]. On the other hand, classical anatomical CT-based response assessment is preferred for lymphoma subsets with variable or low FDG avidity.

In summary, the following recommendations have been set for end-of-treatment response assessment (Table 3.3):

  1. “PET-CT should be used for response assessment in FDG-avid lymphoma, using the 5-point scale; CT is preferred for low or variable FDG avidity.

  2. A complete metabolic response (CMR) even with a persistent mass is considered a complete remission.

  3. A partial response by CT criteria only requires a decrease by more than 50 % in the sum of the product of the perpendicular diameters of up to six representative nodes or extranodal lesions.

  4. Progressive disease by CT criteria only requires an increase in the cross product of the longest transverse diameter of a lesion and perpendicular diameter of a single node by ≥50 %.

  5. Surveillance PET scans for patients in complete remission are discouraged, especially for DLBCL and HL, although a repeat study may be considered after an equivocal finding after treatment.

  6. Judicious use of follow-up scans may be considered in indolent lymphomas with residual intra-abdominal or retro-peritoneal disease.”
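The percentage thresholds quoted in this section (a decrease of more than 50 % in the sum of the products of the perpendicular diameters for partial response, an increase of ≥50 % in the cross product of a single lesion for progression, and the 70 % ΔSUVmax cut-off reported in [134]) are all relative changes from baseline. The sketch below shows the arithmetic only. All function names are hypothetical, lesions are assumed to be given as (longest diameter, perpendicular diameter) pairs in the same order at both time points, and treating the ΔSUVmax cut-off as a strict "greater than" comparison is an assumption, since the text does not specify it.

```python
def percent_change(baseline, follow_up):
    """Relative change from baseline in percent (negative = decrease)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (follow_up - baseline) * 100.0 / baseline


def sum_of_products(lesions):
    """Sum of the products of the perpendicular diameters (SPD).

    lesions -- list of (longest_diameter, perpendicular_diameter) pairs;
               at most six, as in the CT criteria quoted above.
    """
    if len(lesions) > 6:
        raise ValueError("at most six representative lesions are used")
    return sum(longest * perpendicular for longest, perpendicular in lesions)


def is_ct_partial_response(baseline_lesions, follow_up_lesions):
    """True if the SPD has decreased by more than 50 % from baseline."""
    change = percent_change(sum_of_products(baseline_lesions),
                            sum_of_products(follow_up_lesions))
    return change < -50.0


def is_ct_progression(baseline_lesion, follow_up_lesion):
    """True if the cross product of a single lesion has increased by >= 50 %."""
    baseline_product = baseline_lesion[0] * baseline_lesion[1]
    follow_up_product = follow_up_lesion[0] * follow_up_lesion[1]
    return percent_change(baseline_product, follow_up_product) >= 50.0


def delta_suvmax_exceeds(baseline_suvmax, interim_suvmax, cutoff=70.0):
    """True if the reduction in SUVmax from baseline exceeds the cut-off.

    ASSUMPTION: strict '>' comparison; the 70 % default follows [134].
    """
    reduction = -percent_change(baseline_suvmax, interim_suvmax)
    return reduction > cutoff
```

For instance, two lesions measuring 4 × 3 cm and 2 × 2 cm at baseline (SPD 16 cm²) that shrink to 2 × 1.5 cm and 1 × 1 cm (SPD 4 cm²) show a 75 % decrease, which meets the CT criterion for partial response.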

3.8 Practical Examples of Interim and End-of-Treatment PET Scan Interpretation

Case 1

G. L., female, 26 years. From December 2008 she complained of itching of all four limbs and the trunk and of night sweats; 2 months later an enlarged right supraclavicular lymph node became palpable. Upon surgical resection, the pathology examination of an enlarged left lateral cervical node revealed classical Hodgkin lymphoma, nodular sclerosis subtype. Baseline biochemistry and a complete blood count revealed normal total and differential leucocyte counts, mild anaemia, ESR 66, and LDH 435 U/l. Viral serology was negative. Bone marrow trephine biopsy excluded the presence of lymphoma. The pregnancy test was negative.

The Staging PET/CT, performed in May 2009 (Shown in Fig. 3.9)

Enlarged left cervical nodes were recorded, with SUVmax between 3.3 and 4.8, together with nodes in the left supraclavicular region with a SUVmax of 3.3. Another enlarged lymph node was noted in the infra-pectoral region with a SUVmax of 2.7, and a focal FDG uptake was also recorded in the left upper lung lobe, corresponding to a 1.5-cm opacity on CT, with a SUVmax of 11.4. Pathologically enlarged, partially confluent lymph nodes were present in the right para-tracheal and right pre-carinal regions and in the Barety lodge (SUVmax 9). There were no abnormal findings in the anatomical regions below the diaphragm. A diffuse pattern of FDG uptake in the skeletal bone marrow was compatible with diffuse marrow activation, in the absence of focal lesions.

Final Diagnosis: Classical Hodgkin Lymphoma, Nodular Sclerosis Subtype, Stage IV A (Lung)

IPS 1

The patient was enrolled in the HD0607 trial and treated with two ABVD courses from June to August 2009.

Interim PET/CT in August 2009

There was no evidence of pathological FDG uptake. Non-specific uptake was recorded in the tonsillar region. Upon blinded independent central review, the interim PET (PET-2) was reported as negative and the patient continued therapy with ABVD. A final evaluation by PET/CT in December 2009 (Fig. 3.10) showed complete disappearance of abnormal FDG uptake, compatible with complete metabolic response.

Fig. 3.10 PET/CT for staging

Case 2

B. A., female, 59 years. From May 2010 she noted a persistent cough, fever of 38.5 °C, weight loss of about 7 kg and generalized itching. An ultrasound examination of the neck showed enlarged lymph nodes, 7 and 10 mm in diameter, in the right supraclavicular and cervical regions. In July 2010 a chest X-ray showed a mediastinal lymph node enlargement at the level of the azygos vein confluence. In mid-September, a clinical examination revealed voluminous enlarged nodes in the right axilla, with a largest diameter of about 5 cm, and in the right cervical region, of about 3 cm. The baseline complete blood count showed mild anaemia and leucocytosis. Routine biochemical blood tests were normal. A biopsy of the right cervical node yielded a histological diagnosis of classical HL, nodular sclerosis subtype.

The Baseline PET, Performed in Late September 2010 (Shown in Fig. 3.11)

Fig. 3.11 PET/CT for interim restaging

There was evidence of right cervical nodes with a diameter ranging from 2 to <1 cm with a SUVmax between 6.6 and 17.6. Confluent left supraclavicular lymph nodes with a SUVmax of and right confluent axillary nodal mass were recorded, with the largest diameter of 5 cm and SUVmax 12.8. A mediastinal bulky mass was also detected, with the contribution of anterior mediastinal, internal mammary and para-tracheal lymph nodes, with a SUVmax of 15.7. A pericardial effusion was present, with a SUVmax of 8.4. Several pathologically enlarged para-aortic lymph nodes, extending from D12 to L3, were also noted, showing a SUVmax of 13.6. There were no abnormal findings in the liver. The spleen was massively and focally infiltrated by lymphoma with a pathological area with the largest diameter of 9 cm and SUVmax of 13.5. There were no skeletal abnormalities.

The Final Diagnosis: Hodgkin Lymphoma, Classical, Nodular Sclerosis Subtype, Stage IIIB. IPS 2

The patient was enrolled in the HD0607 clinical trial. After 2 ABVD courses, an interim PET (PET-2) was performed, with the following local report: probable persistence of disease in the Barety lodge, with no other sites of disease (Fig. 3.12). Upon central review, PET-2 was considered positive, with a Deauville score of 4, and accordingly the treatment was intensified with escalated BEACOPP in December 2010: two cycles were administered at full dosage and the other two at an attenuated dose (baseline BEACOPP) because of neurological toxicity (WHO grade 3 peripheral neuropathy). Treatment response was assessed with PET/CT in June 2011, with evidence of complete metabolic response (CMR). The patient did not receive the subsequent treatment planned in the HD0607 trial because of a grade 3 serious adverse event (pneumonia, occurring after the 4th cycle). Complete restaging with FDG PET/CT in November 2011 showed CMR, and the patient has since remained in continuous complete remission.

Fig. 3.12 PET/CT for end-of-treatment restaging