Introduction

Radioimmunotherapy (RIT) uses monoclonal antibodies (mAbs) to selectively target surface antigens on malignant cells; a radioisotope coupled to the mAb then selectively delivers ionizing radiation to the tumor [1]. Ibritumomab tiuxetan (Zevalin®; Spectrum Pharmaceuticals, Irvine, CA) is a murine anti-CD20 monoclonal antibody conjugated to yttrium-90 (90Y-IT). It is currently the only available RIT drug approved in Europe and in the United States for the treatment of patients with lymphoma, specifically for relapsed or refractory, low-grade or follicular B-cell non-Hodgkin lymphoma (NHL) as well as for previously untreated follicular lymphoma (FL) in patients who achieve a partial or complete response to first-line chemotherapy [2, 3]. These indications were approved following the publication of several clinical trials demonstrating that, in patients with relapsed and/or refractory low-grade NHL, a single dose of 90Y-IT induces an overall response rate of 70–80% and a complete response in 15–50% of patients [3,4,5,6].

In more recent years, several authors have investigated the usefulness of 90Y-IT either as consolidation treatment after immunochemotherapy or as first-line monotherapy [7, 8]. These studies suggested that RIT, when used as consolidation after induction immunochemotherapy, might induce a deeper response and postpone an eventual relapse [7]. However, almost all patients nowadays receive rituximab in first line. Accordingly, 90Y-IT is considered a potential treatment option especially in patients with FL who are not eligible for rituximab maintenance [9]. The efficacy of RIT in other types of NHL has also been tested. Available studies demonstrated that RIT is feasible and induces durable responses in a wide variety of NHL types and patient populations [10,11,12,13,14]. On the other hand, the most common adverse reactions of 90Y-IT are cytopenias, fatigue, nasopharyngitis, nausea, abdominal pain, asthenia, cough, diarrhea, and pyrexia, while the most serious adverse reactions are prolonged and severe cytopenias (thrombocytopenia, anemia, lymphopenia, neutropenia). In clinical trials, grade 4 neutropenia and thrombocytopenia were reported in around 30% and 10% of enrolled patients, respectively [3]. Despite the evidence that RIT positively affects NHL patients’ outcomes without compromising quality of life, use of RIT remains limited [15, 16]. Potential reasons for the underuse of RIT include inadequate reimbursement policies, lack of widespread availability, and concerns about radiation protection and about potential delayed toxicities (e.g., marrow damage, secondary malignancies and treatment-related myelodysplastic syndromes), which are actually rare in clinical practice [16,17,18]. Similarly, despite several studies, there is still a lack of evidence-based parameters that can predict the response to RIT on an individual patient basis [19].

Indeed, type of lymphoma (indolent versus non-indolent subtypes), serum levels of lactate dehydrogenase (LDH) and beta-2 microglobulin (β2M), disease extension, age, previous therapies, molecular markers, body surface area (BSA) and, more generally, clinical prognostic indexes such as the Follicular Lymphoma International Prognostic Index (FLIPI) and the International Prognostic Index (IPI) have been taken into account as potential predictors of response to RIT [19]. However, the prognostic role of several of these biomarkers is still controversial. Imaging studies have tried to address the same issues. In this framework, a pretreatment 111In-ibritumomab tiuxetan scan has historically been used to measure organ-specific accumulation of ibritumomab tiuxetan, and it was also tested for prediction of RIT efficacy [20,21,22]. Despite the obvious relevance of organ-specific accumulation of ibritumomab tiuxetan, previous studies concluded that there is no clear evidence of a relationship between the intensity of uptake at a disease site on the pre-RIT 111In-ibritumomab tiuxetan scan and response to treatment. In recent years, the value of a pretreatment FDG PET scan to predict RIT efficacy has also been tested. In this context, several different semiquantitative measures might be taken into account when assessing the predictive value of FDG PET, and thus the results of different studies might not always be straightforward to compare [19]. Accordingly, while several studies have suggested a potential role of FDG PET in evaluating response to RIT [23,24,25], the prognostic role of FDG PET before RIT is still considered controversial [19]. In the present systematic review, we thus aimed to summarize and discuss studies addressing the value of baseline FDG PET as a predictive biomarker for response to RIT in patients with NHL.
Only an effective and more individualized risk stratification can lead to a clear understanding and a better selection of candidates for RIT, thus allowing the efficacy and safety of treatment with RIT to be optimized [26].

Evidence acquisition

We searched (last update: March 2019) the PubMed, PMC, Google Scholar and Medline databases using the following terms, both as free text and as MeSH (Medical Subject Headings) terms: “positron emission tomography - PET”, “PET/CT”, “FDG”, “18F-fluorodeoxyglucose”, and “radioimmunotherapy”, “90Y-ibritumomab tiuxetan” and “non-Hodgkin lymphoma” and “follicular lymphoma”. We also included additional studies if cited in the selected articles. No language restriction was applied to the search, but the reviewed articles were all in English. Among all the retrieved articles, we selected only those specifically analyzing the role and the predictive and overall value of pretreatment FDG PET in patients with NHL undergoing RIT. A total of eight papers met the inclusion criteria and were selected [22, 27,28,29,30,31,32,33]. Characteristics of the selected studies are summarized in Table 1.

Table 1 Studies evaluating the prognostic role of baseline FDG PET in patients treated with radioimmunotherapy

Evidence synthesis

We synthesize the findings in the literature by focusing on studies that used pretreatment FDG PET as a predictor of response in NHL patients treated with RIT, considering both the added value of the FDG PET examination and the predictive value of different semiquantitative biomarkers such as maximum and mean standardized uptake values (SUVmax and SUVmean), metabolic tumor volume (MTV) and total lesion glycolysis (TLG). We also comment on the potential role of more sophisticated approaches able to capture metabolic heterogeneity within NHL lesions and on the complementary or combined value of other imaging and biochemical biomarkers. Eight studies including 254 patients evaluated the role of FDG PET as a predictor of response in patients undergoing RIT. All patients were retrospectively enrolled, with the exception of those included in the studies carried out by Lim et al. [33] and Hertzberg et al. [28].
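The semiquantitative parameters named above are linked by simple definitions: SUVmax is the highest voxel value in a lesion, SUVmean is the average over the segmented voxels, MTV is the segmented volume, and TLG is the product of SUVmean and MTV. As a minimal sketch, and not the method of any of the reviewed studies, the following Python code computes them from a list of voxel SUVs. The function names, the input layout and the 41% of SUVmax segmentation threshold are assumptions made for illustration only; the reviewed studies may have used different segmentation rules. The second function forms patient-level sums analogous to the total volume and cumulative TLG used in one of the reviewed studies [32].

```python
from statistics import mean


def pet_lesion_metrics(suv_voxels, voxel_volume_ml, threshold_fraction=0.41):
    """Return SUVmax, SUVmean, MTV (ml) and TLG for one lesion.

    suv_voxels: SUV values (g/ml) of the voxels in a hypothetical lesion
        region of interest.
    voxel_volume_ml: volume of a single voxel in ml.
    threshold_fraction: ASSUMED segmentation rule; voxels with
        SUV >= threshold_fraction * SUVmax are counted as metabolic tumor.
    """
    if not suv_voxels:
        raise ValueError("lesion has no voxels")
    suv_max = max(suv_voxels)
    cutoff = threshold_fraction * suv_max
    segmented = [v for v in suv_voxels if v >= cutoff]
    suv_mean = mean(segmented)                 # mean SUV inside the segmented volume
    mtv_ml = len(segmented) * voxel_volume_ml  # metabolic tumor volume
    tlg = suv_mean * mtv_ml                    # total lesion glycolysis
    return {"SUVmax": suv_max, "SUVmean": suv_mean, "MTV_ml": mtv_ml, "TLG": tlg}


def patient_totals(lesions):
    """Aggregate per-lesion results into patient-level values.

    Returns the highest SUVmax, the summed volume (TVol) and the
    cumulative TLG (TLGcum) over all lesions of one patient.
    """
    return {
        "SUVmax": max(lesion["SUVmax"] for lesion in lesions),
        "TVol_ml": sum(lesion["MTV_ml"] for lesion in lesions),
        "TLGcum": sum(lesion["TLG"] for lesion in lesions),
    }
```

For example, calling `pet_lesion_metrics([10.0, 8.0, 5.0, 2.0], 0.5)` segments the three voxels at or above 4.1 g/ml, giving an MTV of 1.5 ml and a TLG of about 11.5 g. Because SUVmean, MTV and TLG all depend on the segmentation rule, values obtained with different thresholds are not directly comparable.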

Seven studies evaluated the role of FDG PET as well as that of other clinical biomarkers. In particular, four studies also evaluated the specific relevance of tumor size as assessed by computed tomography (CT).

Lopci and colleagues evaluated 38 relapsed or refractory FL patients who underwent FDG PET before and 3 months after RIT, with the aim of evaluating the role of FDG PET before and after treatment with 90Y-IT [31]. Twenty of the 38 patients had limited disease on baseline FDG PET, 11 patients had nodal findings on both sides of the diaphragm and the remaining 7 patients had both nodal and extra-nodal findings. When disease extent at relapse and response to treatment were compared, a higher rate of complete response (75%) was found in patients with limited metabolically active disease, while patients with diffuse PET-positive nodal and/or extra-nodal findings more frequently showed partial response or progressive disease (66%) [31]. The prognostic value of post-induction and pre-RIT PET was at least partially addressed in a large study aiming more generally to establish whether treatment intensification with R-ICE chemotherapy (rituximab, ifosfamide, carboplatin, and etoposide) followed by 90Y-ibritumomab tiuxetan-BEAM (BCNU, etoposide, cytarabine, and melphalan) can improve 2-year progression-free survival (PFS) in high-risk diffuse large B-cell lymphoma patients with a positive interim PET scan after 4 cycles of R-CHOP-14 [28]. Patients received 4 cycles of R-CHOP-14, followed by a centrally reviewed PET performed at day 17–20 of cycle 4 and assessed according to International Harmonisation Project criteria. Among the 143 patients undergoing interim PET, 42 (29%) were PET positive and 32 of them completed R-ICE and 90Y-IT BEAM. However, at a median follow-up of 35 months, the 2-year PFS for PET-positive patients was 67%, a rate similar to that for PET-negative patients treated with R-CHOP-14, while overall survival (OS) was 78% and 88%, respectively. Only in a further exploratory analysis were PFS and OS markedly superior for PET-positive patients with Deauville score 4 versus score 5.
Therefore, the authors concluded that diffuse large B-cell lymphoma patients who were PET-positive after 4 cycles of R-CHOP-14 and who switched to R-ICE and 90Y-ibritumomab tiuxetan-BEAM could achieve favorable survival outcomes similar to those of PET-negative R-CHOP-14-treated patients [28]. The specific role of PET-based semiquantitative measures has also been tested. Lim and colleagues prospectively enrolled twenty-four patients treated with unlabeled rituximab and a therapeutic activity (median 7.3 GBq) of 131I-rituximab [33]. Contrast-enhanced FDG PET/CT scans were performed before and 1 month after RIT. Tumor size and SUVmax were measured, and a high baseline SUVmax was found to be associated with poorer overall survival (OS) and progression-free survival (PFS). Furthermore, a large tumor size on the pretreatment scan was associated with poorer OS but not with PFS. Finally, in multivariate analyses, a high SUVmax, a large tumor size on the pretreatment scan and diffuse large B-cell lymphoma histology were significantly associated with poorer OS [33].

The value of FDG PET as a prognostic factor in NHL patients treated with RIT has also been evaluated in multicenter studies. In an Italian multicenter study, FDG PET was the only pre-RIT biomarker that remained an independent predictor of PFS at multivariate analysis, while all the other prognostic factors, including age, gender, time from diagnosis to RIT, number of previous treatments and disease extent before RIT, did not show a significant correlation with response to treatment [30].

Similarly, Grgic et al. carried out a multicenter evaluation to prove the feasibility of multicenter web-based data collection and to preliminarily explore imaging findings and prediction of therapy response in patients with FL [29]. They retrospectively analyzed and correlated clinical and imaging data (CT and FDG PET) before and after RIT as documented by the RIT-Network. Treatment response was evaluated on both a patient and a lesion basis. Every measurable lesion was analyzed in terms of SUVmax and volume (PET and CT) response. Uni- and multivariate models were used to identify predictors of RIT efficacy. A total of 159 lesions were measured. In the multivariate model, lesion volume (total and maximum lesion volume) was a predictor of response (CR + PR) [29]. When focusing on lesional CR, small lesion volume at baseline and lesion metabolism were identified as prognostic predictors, thus suggesting that FDG PET may also predict the likelihood of response to RIT. Cazaentre et al. retrospectively enrolled 35 patients with NHL who had undergone FDG PET prior to RIT with either 90Y-IT or 90Y-epratuzumab tetraxetan. Four functional variables were measured for each tumor lesion in a given patient (SUVmax, SUVmean, functional lesion volume (LVol) and TLG), while for each patient the highest SUVmax and SUVmean, the cumulative TLG (TLGcum) and the sum of all LVol (TVol) were computed [32]. The predictive value of these variables for response to RIT [complete or partial response according to International Workshop Criteria (IWC)] was compared with that of conventional prognostic factors. In particular, conventional prognostic evaluation included age, histological type, Ann Arbor stage, performance status according to ECOG (Eastern Cooperative Oncology Group), international prognostic indexes, LDH, and the presence of bone marrow involvement.
The sum of the products of the two longest perpendicular diameters as defined by IWG criteria and the diameter of the largest lesion on the pre-RIT CT scan were also considered. A total of 154 lesions were analyzed. Nineteen patients (54%) responded to RIT according to IWC. In patients treated with 90Y-IT, the response rate was 54% in patients with SUVmax < 20 g/ml, and 75% in patients with either TVol < 100 ml or TLGcum < 1060 g, while no patient above these thresholds responded. The response rate was 93% for patients with SUVmax < 15 g/ml while no patient above this threshold responded. All patients with TLGcum below 1360 g responded, compared with only 37% of patients whose TLGcum was above this threshold. By contrast, conventional prognostic factors failed to predict response, and the authors concluded that pre-therapy FDG PET functional parameters such as SUVmax and TLG may help to predict response to single-agent yttrium-90-based RIT more accurately [32]. Similarly, Hanaoka and colleagues evaluated both tumor accumulation and heterogeneity of 111In-ibritumomab tiuxetan and tumor accumulation of FDG and compared them with the tumor response in B-cell non-Hodgkin’s lymphoma patients treated with 90Y-IT [22]. Sixteen patients were enrolled in this retrospective study. SUVmax was measured on pretherapeutic FDG PET/CT images. The percentage of the injected dose per gram (%ID/g) and SUVmax of 111In-ibritumomab tiuxetan were also measured 48 h after its administration. The skewness and kurtosis of the voxel distribution were calculated to evaluate the intratumoral heterogeneity of tumor accumulation. Moreover, cumulative SUV-volume histograms describing the percentage of the total tumor volume above percentage thresholds of pretherapeutic FDG and 111In-ibritumomab tiuxetan SUVmax were calculated as a further index of intratumoral heterogeneity [22].
Forty-two lesions were analyzed and classified into responders and non-responders on a lesion-by-lesion basis on post-therapeutic CT images. This study reported a positive correlation between FDG SUVmax and accumulation of 111In-ibritumomab tiuxetan in lesions. A significant difference in pretherapeutic FDG SUVmax was observed between responders and non-responders, while no significant difference in 111In-ibritumomab tiuxetan SUVmax was observed between the two groups. Accordingly, the authors concluded that pretherapeutic FDG accumulation was predictive of the tumor response to 90Y-IT. The heterogeneity of the intratumoral distribution of 111In-ibritumomab tiuxetan, rather than its absolute level, was correlated with the tumor response, as the skewness of 111In-ibritumomab tiuxetan images was significantly different in responders and non-responders [22]. Finally, in a recent study including 34 patients with relapsed indolent lymphoma treated with 90Y-IT monotherapy, the predictive value of clinical data as well as of CT and FDG PET was retrospectively assessed [27]. In univariate analysis, tumor long-axis diameter ≤ 2.5 cm, SUVmax ≤ 6.5, localized disease, normal levels of serum soluble interleukin-2 receptor, and number of involved nodal sites ≤ 3 immediately prior to 90Y-IT were associated with a median PFS greater than 6 years [27]. Of note, in multivariate analysis, only tumor long-axis diameter ≤ 2.5 cm and SUVmax ≤ 6.5 affected PFS. Accordingly, the authors concluded that 90Y-IT treatment should be especially considered for patients with indolent lymphoma in first relapse who have tumor long-axis diameter ≤ 2.5 cm and SUVmax ≤ 6.5 [27]. A summary of FDG PET-based parameters already evaluated in studies of patients with different lymphoma subtypes who were candidates for RIT is reported in Table 2. Two representative examples of baseline and post-therapy FDG PET showing homogeneous and heterogeneous response to RIT are shown in Figs. 1 and 2.
In both cases, post-therapy FDG PET was performed 12 weeks after therapy. However, it should be noted that a potential further decline in tumor SUVmax between 12 and 24 weeks in the absence of additional therapy has been previously reported, thus suggesting the potential usefulness of a more delayed response assessment [34]. More recently, early evaluation 6 weeks after therapy has also been proposed for response assessment and post-therapy prognostic stratification after RIT [35].
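The heterogeneity indices mentioned in this section for the study by Hanaoka and colleagues [22] (skewness and kurtosis of the voxel distribution, and cumulative SUV-volume histograms) can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the implementation used in that study: the function names, the use of population (biased) moment estimators, non-excess kurtosis, and the choice of threshold steps are all hypothetical.

```python
from statistics import fmean


def skewness_kurtosis(values):
    """Population skewness and (non-excess) kurtosis of lesion voxel values.

    ASSUMPTION: biased moment estimators are used; a given study may have
    used bias-corrected estimators or excess kurtosis instead.
    """
    n = len(values)
    mu = fmean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n  # variance
    if m2 == 0:
        raise ValueError("all voxel values are identical")
    m3 = sum((v - mu) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mu) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def cumulative_suv_volume_histogram(values, steps=10):
    """Percentage of lesion voxels at or above each percentage of SUVmax.

    Returns (threshold_percent, volume_percent) pairs from 0% to 100% of
    SUVmax. With equally sized voxels, the voxel percentage equals the
    percentage of lesion volume.
    """
    suv_max = max(values)
    n = len(values)
    curve = []
    for i in range(steps + 1):
        pct = 100 * i / steps
        cutoff = suv_max * pct / 100
        above = sum(v >= cutoff for v in values)
        curve.append((pct, 100 * above / n))
    return curve
```

In this sketch, a curve that falls quickly as the threshold rises indicates that only a small part of the lesion approaches SUVmax, that is, a more heterogeneous uptake; a positive skewness likewise reflects a tail of a few high-uptake voxels.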

Table 2 Baseline FDG PET-based parameters as potential predictors of response in different NHL subtypes before radioimmunotherapy
Fig. 1

Representative example of a homogeneous response to radioimmunotherapy (RIT) in a patient with follicular NHL with progressive disease after rituximab and chemotherapy. The pre-RIT FDG PET scan (a) shows highly FDG-avid bilateral cervical, axillary and upper mediastinal lymph nodes (SUVmax 5). The post-RIT FDG PET scan (b), performed 12 weeks after therapy, demonstrates complete resolution of abnormal metabolic activity in all sites of disease despite the persistence of measurable lymph nodes (the red arrow shows a subcentimetric non-FDG-avid lymph node in the left axilla)

Fig. 2

Representative example of a heterogeneous response to radioimmunotherapy (RIT) in a patient with follicular NHL presenting with progressive disease after salvage chemotherapy. The pre-RIT FDG PET scan (a) shows a highly FDG-avid abdominal bulky lesion (SUVmax 12). The post-RIT FDG PET scan (b), performed 12 weeks after therapy, demonstrates a residual mass with markedly reduced uptake with respect to the baseline scan. However, the residual mass is characterized by a relatively heterogeneous response, with some small hot-spots still showing uptake higher than that of the liver (Deauville score 4)

Future perspectives

The identification of FDG PET-based variables able to predict response to RIT might also be of interest for the selection of patients for treatment with new (not yet registered) RIT compounds that could become available in the near future. In fact, despite the underuse of RIT, several new radiolabeled compounds for RIT have been proposed in preclinical models as well as in patients with lymphoma. In particular, epratuzumab is a humanized antibody targeting CD22, which is known to be highly expressed in most types of lymphoma [36]. This antibody has been labeled with yttrium-90 and used for the treatment of aggressive NHL, with 53% of patients showing an objective response [36]. In addition to CD20 and CD22, other biomarkers that have been investigated as targets for RIT in lymphoma are CD37 [37], CD38 [38], CD25 [39], CXCR4 [40], the human leukocyte antigen DR (HLA-DR) [41] and CD45 [42]. Finally, the possibility of administering RIT compounds based on α-particle-emitting radionuclides, which combine high linear energy transfer (LET) with a short path length in tissue, is a further important emerging opportunity [43].

Conclusions

The literature on the predictive role of FDG PET in NHL patients treated with RIT is still based on studies involving small groups of patients, who in the vast majority of cases were retrospectively recruited. Despite these methodological issues, patient- and lesion-based analyses seem to suggest a relevant prognostic role of both morphological (CT) and metabolic (PET) imaging. Indeed, it has already been demonstrated that tumor bulk affects PFS and OS: pre-RIT bulky sites are at significantly higher risk of disease recurrence and appear to be the first sites of recurrence after RIT [44, 45]. A tumor burden not exceeding 5–7 cm [46,47,48] is significantly associated with higher OS, PFS and CR rates after RIT. As in other contexts, in the framework of NHL treated with RIT the evaluation of tumor metabolism seems to provide a further and different window on tumor behavior and responsiveness to therapy. In fact, both SUVmax and TLG have been shown to act as independent predictors of response to 90Y-IT. However, emerging PET-based parameters such as MTV and TLG were analyzed in very few studies. Similarly, while tumor extension and volume as assessed by CT have a well-established prognostic role, isolating the specific weight of the metabolically active tumor burden is not trivial and is not clearly possible on the basis of published studies. Likewise, it is still not possible to define the best PET-based predictor, and the identification of reproducible cut-offs (i.e. for SUVmax, MTV or TLG) in NHL treated with RIT is still a very complex issue. In this framework, FDG PET data available for patients included in already completed clinical trials might be of interest and could be used for new analyses in homogeneous groups of patients, thus allowing a deeper understanding of the role of specific PET-based parameters as prognostic indicators in NHL patients who are candidates for RIT.
In recent years, the FOLL12 study (EUDRACT 2012-003170-60), a multicenter, phase III, randomized study aiming to evaluate the efficacy of a response-adapted strategy in patients with advanced-stage follicular lymphoma, has been promoted and carried out by Italian centers belonging to the “Fondazione Italiana Linfomi” (FIL). In this study, the experimental arm is based on FDG PET and molecular minimal residual disease (MRD) results. In these patients, a de-intensified treatment is reserved for MRD- and PET-negative cases, while consolidation with radioimmunotherapy is performed in patients who are still PET-positive after induction, and a pre-emptive therapy is adopted for PET-negative but MRD-positive patients. FOLL12 has now completed recruitment of 810 patients [49]. Once analyses addressing the primary and secondary endpoints have been published, the same homogeneous group of patients might also be used, on the basis of the study design, to specifically assess the prognostic value of baseline FDG PET in patients treated with RIT.