Introduction

Chimeric antigen receptor T-cell therapy (CART) directed against the CD19 antigen [1] has demonstrated efficacy in relapsed/refractory (r/r) large B-cell lymphoma (LBCL) [2,3,4], follicular lymphoma (FL) [3, 4], and mantle cell lymphoma (MCL) [5]. Compared to historical controls, CART has significantly improved progression-free survival (PFS) and overall survival (OS) [6].

The prognosis of LBCL has historically been estimated by the International Prognostic Index (IPI), introduced in 1993, which includes age, performance status, Ann Arbor stage, serum lactate dehydrogenase (LDH), and extranodal involvement [7]. The IPI score has been shown to be prognostic for PFS in the setting of CART [8, 9], yet no association with OS has been observed [9]. The pivotal trials JULIET and ZUMA-1 showed trends toward higher overall response rates (ORR) with lower IPI scores [3, 4], as did the recent ZUMA-12 trial [10].

The metabolic tumor volume (MTV) on baseline 18F-fluorodeoxyglucose positron emission tomography–computed tomography (18F-FDG PET/CT) is prognostic in Hodgkin lymphoma [11] and in several subtypes of non-Hodgkin lymphoma, including transformed FL [12], MCL [13], and LBCL [14]. Imaging-based response assessment for determination of PFS in most lymphoma entities has most frequently relied on 18F-FDG PET/CT. In current and ongoing phase III trials, the most widely adopted response criteria are based on the Lugano criteria from 2014 [15, 16]. The prognostic value of MTV as a component of a prognostic index has been studied recently using five published trials on DLBCL [17, 18]. The recently introduced International Metabolic Prognostic Index (IMPI) was developed after review of the prognostic potential of the IPI components as well as MTV and resulted in a simplified modification. The IMPI includes only age, Ann Arbor stage, and MTV, yet outperformed the IPI in survival estimation in the first-line DLBCL treatment setting [17].

We aimed to compare the prognostic value of the IMPI and the historically established IPI for progression-free (PFS) and overall survival (OS) in the context of CD19 CART for r/r B-NHL.

Methods

Study design and population

The study population was based on a prospective registry of all patients consecutively treated with standard-of-care CD19 CART (i.e., axicabtagene ciloleucel, tisagenlecleucel, brexucabtagene autoleucel) at the Comprehensive Cancer Center Munich of the Ludwig-Maximilian University Munich (CCCMLMU) between January 2019 and May 2022 (data cutoff). The following inclusion criteria were applied:

  1. Patients with r/r lymphoma (DLBCL and MCL)

  2. Any measurable disease on imaging according to Lugano criteria [15]

  3. Available 18F-FDG PET/CT imaging studies at baseline (≤ 2 weeks before CART) and at least at follow-up around 30 days (FU) or before if clinical progression was evident

The following exclusion criteria were applied:

  1. Any non-diagnostic imaging studies or missing baseline 18F-FDG PET/CT

  2. Patients with non-measurable disease

  3. Lack of follow-up examinations or survival data at time of study inclusion

Histologic diagnoses were reviewed by expert pathologists. Patients received lymphodepletion with fludarabine and cyclophosphamide according to the manufacturers’ instructions.

Definition of IPI and IMPI

IPI was calculated using age, Eastern Cooperative Oncology Group (ECOG) performance status, Ann Arbor stage, serum LDH, and extranodal involvement [7]. IMPI was calculated using age, Ann Arbor stage, and MTV [17]. Tumor delineations were performed by a board-certified imaging expert. For stratification and statistical analysis, the IPI scores were grouped as 0–1 (low risk), 2 (low intermediate risk), 3 (high intermediate risk), and 4–5 (high risk), as described previously by the International Non-Hodgkin’s Lymphoma Prognostic Factors Project [7]. We compared the IMPI with the IPI in two ways. First, we divided the study population into four IMPI groups with the same sizes as the IPI categories (IMPI low risk, IMPI low intermediate risk, IMPI high intermediate risk, and IMPI high risk). For this purpose, we ranked patients by their absolute IMPI and assigned them to groups so that each group contained the same number of patients as the corresponding IPI risk group. Second, we used the interval-scaled nature of the IMPI to subdivide it into tertiles (IMPI low risk, IMPI intermediate risk, and IMPI high risk), generating 3 groups of equal size for a more detailed breakdown of PFS stratification.
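The scoring and grouping steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the cut-offs (age > 60 years, ECOG ≥ 2, stage III/IV, elevated LDH, more than one extranodal site) are the conventional IPI definitions from [7] and are not restated in this text, the function and field names are hypothetical, and the IMPI model itself is not reproduced. The grouping helper simply takes a precomputed continuous risk value per patient, assumed here to increase with risk.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age: int
    ecog: int               # ECOG performance status (0-4)
    stage: int              # Ann Arbor stage (1-4)
    ldh_elevated: bool      # serum LDH above the upper limit of normal
    extranodal_sites: int   # number of extranodal sites involved


def ipi_score(p: Patient) -> int:
    """One point per adverse factor (conventional IPI cut-offs, assumed)."""
    return sum([
        p.age > 60,
        p.ecog >= 2,
        p.stage >= 3,
        p.ldh_elevated,
        p.extranodal_sites > 1,
    ])


def ipi_risk_group(score: int) -> str:
    """Map an IPI score to the four risk groups used in the text."""
    if score <= 1:
        return "low"
    if score == 2:
        return "low intermediate"
    if score == 3:
        return "high intermediate"
    return "high"


def size_matched_groups(risk_values, group_sizes):
    """Rank patients by a continuous risk value (ascending) and cut the
    ranking into consecutive groups of the given sizes.

    Returns one group index per patient, in the original patient order.
    """
    if sum(group_sizes) != len(risk_values):
        raise ValueError("group sizes must add up to the number of patients")
    order = sorted(range(len(risk_values)), key=lambda i: risk_values[i])
    labels = [0] * len(risk_values)
    start = 0
    for group, size in enumerate(group_sizes):
        for i in order[start:start + size]:
            labels[i] = group
        start += size
    return labels
```

Passing the patient counts of the four IPI groups as `group_sizes` corresponds to the size-matched comparison, while three equal sizes (for example `[13, 13, 13]` for 39 patients) correspond to the tertile split.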

18F-FDG PET/CT imaging

PET/CT images were acquired approximately 45 min after tracer injection (159–275 MBq, weight-adapted with approximately 2.5–4.5 MBq 18F-FDG per kg body weight). Contrast-enhanced or unenhanced CT scans (slice thickness 2 mm, 120 kVp, 100–400 mAs, with dose modulation) were acquired for attenuation correction. The following scanners were used: Biograph 64 and Biograph mCT (Siemens Healthineers, Germany) or Discovery 690 (GE Healthcare, USA). All scanners fulfilled the requirements indicated in the European Association of Nuclear Medicine (EANM) imaging guidelines and held EANM Research Ltd. (EARL1) accreditation during acquisition. The following reconstruction algorithms were used: Biograph 64, TrueX (3 iterations, 21 subsets) with Gaussian post-reconstruction smoothing (2 mm full width at half-maximum); Biograph mCT, TrueX (3 iterations, 21 subsets); Discovery 690, VUE Point FX algorithm with 2 iterations and 36 subsets. All systems produced PET images with a voxel size of 2 × 2 × 2 mm³. Images were normalized to decay-corrected injected activity per kg body weight (SUV, g/mL).
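The body-weight normalization mentioned above follows the usual SUV definition: tissue activity concentration divided by the decay-corrected injected activity per gram of body weight. The sketch below is illustrative only. The function name is hypothetical, the 18F half-life is taken as approximately 109.77 min, and it assumes that the measured concentration refers to scan time, so the injected activity is decayed to that time point; scanner software may handle the reference time differently.

```python
F18_HALF_LIFE_MIN = 109.77  # physical half-life of 18F in minutes (approximate)


def suv_body_weight(conc_bq_per_ml, injected_mbq, weight_kg, minutes_since_injection):
    """Body-weight SUV in g/mL.

    conc_bq_per_ml: activity concentration in the voxel at scan time
    injected_mbq: injected activity at injection time
    """
    # Decay the injected activity to the time of the scan
    decay_factor = 0.5 ** (minutes_since_injection / F18_HALF_LIFE_MIN)
    activity_at_scan_bq = injected_mbq * 1e6 * decay_factor
    # Activity per gram of body weight, assuming uniform distribution
    activity_per_gram = activity_at_scan_bq / (weight_kg * 1000.0)
    return conc_bq_per_ml / activity_per_gram
```

An SUV of 1 corresponds to tracer distributed uniformly throughout the body, which is why a fixed SUV threshold can serve as a scanner-independent cut-off for hypermetabolic tissue.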

Imaging response assessment

Overall response was determined based on Lugano criteria with segmentation of up to 6 target lesions (TL). The sum of the products of diameters (SPD) was measured to determine tumor burden (TB). Depth of response (DoR) was calculated as the percent change of SPD from baseline to 30-day follow-up. Spleen size was measured, with splenomegaly defined by a vertical length > 13.0 cm. TL, non-target lesions (NTL), and newly appearing lesions (NL) during therapy were evaluated quantitatively and qualitatively. All imaging analyses were performed with the dedicated trial reporting software mint Lesion 3.8 (mint Medical GmbH; Heidelberg, Germany). The MTV was evaluated using the open-source software platform LIFEx (https://www.lifexsoft.org) [19]. Attenuation-corrected PET images were analyzed, and an absolute standardized uptake value (SUV) threshold of ≥ 4 was used to define hypermetabolic lymphoma tissue, as described before [20, 21].
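The quantities defined above can be illustrated with a short sketch. It is a simplified example under stated assumptions, not the mint Lesion or LIFEx workflow: the function names are hypothetical, lesions are given as pairs of perpendicular diameters in mm, and the MTV helper assumes that the SUV values passed in have already been restricted to lymphoma lesions (physiologic uptake excluded) on the 2 × 2 × 2 mm grid described in the imaging section.

```python
def sum_of_products(lesions):
    """SPD in mm^2 from (long_diameter_mm, short_diameter_mm) pairs."""
    if len(lesions) > 6:
        raise ValueError("Lugano criteria use at most 6 target lesions")
    return sum(long_mm * short_mm for long_mm, short_mm in lesions)


def depth_of_response(spd_baseline, spd_followup):
    """Percent change of SPD; negative values indicate shrinkage."""
    return 100.0 * (spd_followup - spd_baseline) / spd_baseline


def metabolic_tumor_volume_ml(suv_voxels, voxel_volume_mm3=8.0, threshold=4.0):
    """MTV in mL: volume of all voxels with SUV at or above the threshold."""
    n_voxels = sum(1 for suv in suv_voxels if suv >= threshold)
    return n_voxels * voxel_volume_mm3 / 1000.0  # 1 mL = 1000 mm^3
```

For example, two target lesions of 40 × 30 mm and 20 × 10 mm give an SPD of 1400 mm², and a follow-up SPD of 700 mm² corresponds to a DoR of −50%.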

Statistical analysis

All statistical analyses were performed using GraphPad Prism 9. Cox proportional hazards regression was used to study the association of IMPI and IPI with PFS and OS. For survival analysis, PFS and OS were visualized using Kaplan–Meier survival curves with categorization for IMPI and IPI as described above. Multivariable regression analysis was used to study associations between IMPI, IPI, and DoR. The overall response rate (ORR) was calculated as the proportion of patients with complete response (CR) or partial response (PR). The log-rank (Mantel–Cox) test was used to compare survival between groups. p values below 0.05 were considered to indicate statistical significance.
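For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate behind such survival curves can be sketched in a few lines. This is a plain-Python illustration with a hypothetical function name; the analyses in this study were performed in GraphPad Prism, and the sketch does not include the log-rank test or Cox regression.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient (e.g. days)
    events: True if the event (progression or death) was observed,
            False if the patient was censored at that time
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Collect all patients whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t
        at_risk -= n_removed
    return curve
```

Censored patients contribute to the number at risk up to their last follow-up but do not lower the curve, which is how incomplete follow-up is accounted for in the PFS and OS estimates.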

Results

Patient characteristics

Thirty-nine out of 80 patients met the inclusion criteria (median age: 67 years, 38% female). Thirty-two patients had to be excluded because of a missing baseline 18F-FDG PET/CT examination close to the CAR T-cell transfusion, 4 patients were excluded because of lack of survival documentation, and 5 patients did not have a measurable lesion according to the Lugano criteria. A flow chart is provided in Fig. 1. The IPI was determined for all patients. The distribution of IPI scores 1–5 was 23%, 21%, 26%, 21%, and 10%, respectively. Three patients (8%) had stage I disease, 10 patients (26%) stage II, 9 patients (23%) stage III, and 17 patients (44%) stage IV according to the Ann Arbor staging system. Twenty-eight out of 39 patients (72%) received bridging therapy between apheresis and CAR T-cell infusion. Median SPD at baseline was 4835 mm², and median MTV at baseline was 345 mL. Detailed patient characteristics are shown in Table 1.

Fig. 1 Flow chart. A total of 80 lymphoma patients were treated with CAR T-cell therapy at our site. Thirty-two patients did not have a baseline 18F-FDG PET/CT examination close to the CAR T-cell transfusion, 4 patients were excluded because of lack of survival documentation, and 5 patients did not have a measurable lesion according to the Lugano criteria. Thirty-nine patients met the inclusion criteria

Table 1 Patient characteristics

IMPI and depth of response (DoR)

The DoR, defined as the percent increase or decrease in SPD from baseline (BL) to FU, was calculated for all 39 patients and is illustrated in Fig. 2. The waterfall plot is color-coded by IMPI risk category: IMPI low risk in green, IMPI intermediate risk in yellow, and IMPI high risk in red. The majority of patients had a good 30-day DoR, with median percentage changes of −67%, −66%, and −54% for IMPI low risk, IMPI intermediate risk, and IMPI high risk, respectively. In 24 of 39 patients (62%), Lugano-based TB decreased by more than 50% 30 days after CART. In 8 patients (21%), TB decreased by < 50%, and in 1 patient (3%), the size of lymphoma manifestations had not changed. TB increased in 6 patients (15%): by > 50% in 4 patients (10%) and by < 50% in 2 patients (5%). There was no correlation between DoR, IMPI, and IMPI 3-year PFS (r = 0.065; p = 0.697).

Fig. 2 Depth of response and IMPI. Color-coded waterfall plot for depth of response (DoR) as percentage change of Lugano tumor burden (TB) of all patients from baseline to follow-up 30 days after CART. Positive values indicate an increase and negative values a decrease in TB. Bars are labeled red for IMPI high risk, yellow for IMPI intermediate risk, and green for IMPI low risk at baseline

Imaging-based overall response

Overall response rate (ORR) was similar for all IMPI risk categories. IMPI low risk, IMPI intermediate risk, and IMPI high risk patients had a 30-day ORR of 69%, 62%, and 62%, respectively. Imaging-based response classification of the different IMPI risk groups by Lugano criteria also showed only minor differences. In the IMPI low risk group, 3 patients (23%) showed CR, 6 patients (46%) PR, 2 patients (15%) SD, and 2 patients (15%) PD at 30-day FU. Of the 13 IMPI intermediate risk patients, 2 patients (15%) had CR, 6 patients (46%) PR, 1 patient (8%) SD, and 4 patients (31%) PD. In the IMPI high risk group, 1 patient (8%) had CR, 7 patients (54%) PR, 4 patients (31%) SD, and 1 patient (8%) PD.
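The reported ORR values follow directly from the response counts in each tertile, as the short check below shows. The counts are taken from the paragraph above; the function name and dictionary layout are illustrative.

```python
def overall_response_rate(cr, pr, sd, pd):
    """ORR as the fraction of patients with CR or PR."""
    return (cr + pr) / (cr + pr + sd + pd)


# (CR, PR, SD, PD) counts per IMPI tertile at 30-day follow-up
response_counts = {
    "low risk": (3, 6, 2, 2),
    "intermediate risk": (2, 6, 1, 4),
    "high risk": (1, 7, 4, 1),
}

orr_percent = {
    group: round(100 * overall_response_rate(*counts))
    for group, counts in response_counts.items()
}
print(orr_percent)
# {'low risk': 69, 'intermediate risk': 62, 'high risk': 62}
```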

IPI and IMPI scores and imaging endpoints

For the same three IMPI-based risk groups, we analyzed the association with PFS. The median Lugano-based PFS was 187 days, 97 days, and 87 days for IMPI low risk, IMPI intermediate risk, and IMPI high risk patients, respectively (Table 2). A moderate negative correlation between IMPI 3y-PFS probability and IPI (r = −0.672, p < 0.001) and a moderate positive correlation between PFS and OS (r = 0.619, p < 0.001) were observed. Neither the IPI nor the size-adjusted IMPI groups were able to stratify PFS (both p > 0.05). However, dividing patients into the three risk groups IMPI low risk, IMPI intermediate risk, and IMPI high risk according to their IMPI 3y-PFS probability showed a significant trend for PFS stratification (p = 0.030). Neither IPI nor IMPI yielded a significant association with OS after CART (both p > 0.05). Kaplan–Meier curves for PFS and OS with the different groupings are depicted in Fig. 3A–C.

Table 2 Influence of IMPI on imaging endpoints at 90 days and on PFS
Fig. 3 Survival stratification by IPI and IMPI. Depicted are the Kaplan–Meier curves for progression-free survival (PFS, left) and overall survival (OS, right). The upper row (A) shows the color-coded Kaplan–Meier curves according to the IPI groups with low risk (IPI 0–1, green), low intermediate risk (IPI 2, yellow), high intermediate risk (IPI 3, orange), and high risk (IPI 4–5, red). The middle row (B) shows the survival curves for the size-adjusted IMPI groups analogous to the IPI groups with the same color coding. The lower row (C) depicts a split of the IMPI by tertiles into 3 equally sized groups with the following color coding: IMPI low risk (green), IMPI intermediate risk (yellow), and IMPI high risk (red)

Discussion

In our study of later-line CART for r/r B-NHL patients, the IMPI outperformed the IPI for prognostication of PFS. However, we did not find a significant association of either IMPI or IPI with ORR, DoR, or OS in this patient population. Unlike in first-line LBCL treatment, the prognostic relevance of IMPI (and possibly IPI) regarding OS may therefore be limited in the CART setting.

In the first-line treatment setting, the association of imaging endpoint surrogates of survival such as PFS with OS has been established [22, 23]. In contrast, later disease stages may reflect phenotypic and metabolic changes of the lymphoma manifestations themselves, which may in turn affect these associations [24, 25]. Typically, more widespread nodal locations are involved, and extranodal lesions are more frequently encountered. Importantly, these imaging findings are associated with elevated systemic inflammatory markers, which are of prognostic interest in the context of CART [26,27,28]. Notably, prognostic indices have so far not been adapted to such changes in the disease course, and data on association with OS is scarce [15, 29].

In our study population, grouping patients according to the originally published 4 IPI risk groups [7] did not stratify PFS or OS. This could be due to the limited number of subjects in this study. Another explanation is that the IPI might have a different prognostic value in the context of later-line treatments such as CART. Consistent with this, dividing patients into a low-risk IPI group (IPI 0–2) and a high-risk group (IPI 3–5) showed a small non-significant difference in PFS in our cohort (Supplemental Fig. 1A) but no difference in OS (Supplemental Fig. 1B), as previously published in the setting of CART [9].

Novel prognostic indices, imaging endpoints, and response criteria in lymphoma will likely evolve from selected lesion-based assessments (as, e.g., with the Lugano criteria) to whole tumor burden quantification (as, e.g., with MTV). In the first-line setting, the IMPI has outperformed the conventional IPI in estimating the outcome of DLBCL patients. Notably, MTV has replaced the three components LDH, extranodal involvement, and performance status [17]. This indicates that whole tumor burden may contain more important prognostic information and that other less granular clinical or serological data may carry some redundant information.

To our knowledge, there is no literature comparing IMPI and IPI in the context of advanced lymphoma under CAR T-cell therapy. Our study has limitations which need to be considered when interpreting the results. First, this is a single-center study with a limited number of subjects. This may limit the interpretation of the association of IMPI or IPI with OS. Second, some patients had to be excluded as there was no measurable disease on PET. This represents a limitation of imaging-based prognostic indices (as compared with IPI) in clinical routine. Third, resulting from the operational and logistical nature of CART, the clinical use of bridging therapy may affect the MTV as the metabolic activity is likely altered by systemic bridging regimens. During the bridging period, this may affect the metabolic component of the lymphomas more strongly than the morphologic lymphoma masses, which could also affect the prediction of OS.

In conclusion, the IMPI yielded superior prognostic value compared to the IPI alone regarding the estimation of PFS following CD19 CART and thus holds potential as a novel prognostic index. In contrast with IMPI in the first-line DLBCL setting, we did not observe a significant association of IMPI at baseline with OS after CART. Future research should prospectively assess the value of IMPI regarding OS in larger studies of r/r B-NHL patients receiving CART.