Introduction

Over the past 20–25 years, the outcome of patients with newly diagnosed Hodgkin’s lymphoma (HL) and aggressive non-Hodgkin’s lymphoma (NHL) has improved significantly. Contributing factors have been risk-adapted treatment strategies using prognostic indices (Diehl and Fuchs 2007; Hasenclever and Diehl 1998; International Non-Hodgkin’s Lymphoma Prognostic Factors Project 1993; Sehn et al. 2007), the assessment of metabolic response by positron emission tomography (PET) using fluorodeoxyglucose ([18F] FDG) (Borchmann et al. 2017; Borchmann 2020), and the application of different dose-escalated or dose-dense protocols (Pfreundschuh et al. 2004, 2008; Schmitz et al. 2012; Tresckow et al. 2018). In addition, the use of the anti-CD20 monoclonal antibody rituximab (Coiffier et al. 2002; Pfreundschuh et al. 2006) and of the anti-CD30 antibody–drug conjugate (ADC) brentuximab vedotin (Connors et al. 2018) has improved treatment results. However, the prognosis of patients with recurrent lymphomas is still poor. Primary refractory disease and early relapse in particular are associated with a worse outcome (Gisselbrecht et al. 2010; Josting et al. 2002, 2010). Since several studies investigating different platinum-containing protocols such as DHAP, ESHAP and ASHAP had shown encouraging results with response rates of 60–70% (Hänel et al. 2000; Rodriguez et al. 1999; Velasquez et al. 1988, 1993, 1994), the MIFAP regimen was developed. In a non-randomized phase II trial, 46 patients with very poor-risk recurrent lymphoma achieved results comparable to those of patients with prognostically more favorable disease (Hänel et al. 2001). However, a comparison with an established protocol such as Dexa-BEAM was still pending. Therefore, this prospective randomized trial was performed.

Patients and methods

Patient enrollment/course of study

Patients aged 18–65 years with HL or aggressive NHL and up to three relapses, primary refractory disease or refractory relapse were included. After randomization, all patients received salvage therapy with either Dexa-BEAM or MIFAP. Restaging was performed after one cycle, and patients with complete response (CR), partial response (PR) or minor response (MR) received a second course of Dexa-BEAM or MIFAP. Patients reaching at least PR after two chemotherapy cycles were consolidated by either high-dose chemotherapy (HDT) with autologous hematopoietic cell transplantation (HCT), allogeneic HCT or autologous HCT with subsequent allogeneic HCT after reduced intensity conditioning. Non-responders (less than MR after the first course or less than PR after the second course) dropped out of the trial. The applied remission criteria were in accordance with the recommendations valid at the time the study was initiated (Cheson et al. 1999); after the first course, MR was defined as a tumor size reduction between 25 and 49%.

Treatment plan

Enrolled patients were randomly assigned at a 1:1 allocation ratio to receive either two cycles of Dexa-BEAM (standard arm, Table 1a) or two cycles of MIFAP (experimental arm, Table 1b) as salvage therapy. In responding patients (at least PR after the second course), a subsequent consolidation by HCT (autologous, allogeneic or autologous-allogeneic) was planned, as shown in Fig. 1.

Table 1 Treatment regimens (a) Dexa-BEAM and (b) MIFAP
Fig. 1 Trial course. Out of 76 patients assessed for eligibility, 73 were assigned to the Dexa-BEAM arm (N = 37) or the MIFAP arm (N = 36). After the first course 26 patients of the Dexa-BEAM arm and 23 patients of the MIFAP arm were classified as responders (CR complete response, PR partial response, MR minor response). Non-responders (NR) were assigned to crossover or leave the study. After the second course, both treatment arms had 19 responders each. Thereafter, the patients received autologous (auto-HCT), allogeneic (allo-HCT), autologous and allogeneic (auto-allo-HCT), or no hematopoietic cell transplantation (no-HCT).

Statistical analysis

The study design aimed at the detection of a promising response rate of 60%, in contrast to a rate of 40% that was considered futile in view of data on several existing salvage therapies for relapsed/refractory NHL/HL. According to a one-stage phase II design by Fleming (1982), the recruitment of 38 patients for the experimental arm was required to achieve 80% power with a one-sided type-I error of 0.05. A similar number was allocated to a randomized reference group to verify the historical assumptions and thus to control for selection bias.
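The sample size stated above can be checked with a short calculation. The following is a minimal sketch, assuming the normal-approximation formula commonly used for single-stage phase II designs of the Fleming type; whether the trial statisticians used exactly this formula is an assumption, and the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def fleming_single_stage_n(p0, p1, alpha=0.05, power=0.80):
    """Approximate sample size for a one-sided, single-arm test of
    H0: response rate <= p0 against H1: response rate >= p1.

    Uses the normal approximation to the binomial distribution
    (an assumption; exact binomial designs can differ slightly).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided type-I error
    z_beta = NormalDist().inv_cdf(power)       # power = 1 - type-II error
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


# Futile rate 40%, promising rate 60%, one-sided alpha 0.05, power 80%
n_experimental = fleming_single_stage_n(p0=0.40, p1=0.60)
print(n_experimental)  # 38
```

Under these assumptions the approximation reproduces the 38 patients planned for the experimental arm.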

The statistical survival analyses and multivariate Cox proportional hazards models were performed using R version 3.1.3 (R Core Team 2020), including the packages ggplot2 1.0.1 (Wickham 2016), Hmisc 3.15-0 (Harrell Jr et al. 2015) and survival 2.38-1 (Therneau 2015; Therneau and Grambsch 2000). Overall survival (OS) and progression-free survival (PFS) were analyzed using the Kaplan–Meier method (KM estimator). Overall survival time was calculated from the first day of the first cycle (Dexa-BEAM or MIFAP) until death from any cause or last contact (= censored observation). For PFS, relapse, disease progression, or death from any cause counted as events; patients without an event were censored at their last observation date. P values were calculated using two-sided log-rank tests. The statistical analysis of the response rates, the toxicities according to World Health Organization (WHO) criteria, and the supportive therapy was performed using R version 4.0.3 (R Core Team 2020) and the packages exactRankTests 0.8-31 (Hothorn and Hornik 2019), MASS 7.3.53 (Venables and Ripley 2002), car 3.0.10 (Fox and Weisberg 2019), DescTools 0.99.38 (Signorell et al. 2020), lubridate 1.7.9.2 (Grolemund and Wickham 2011), and binom 1.1.1 (Dorai-Raj 2014). The overall response rate (ORR) was calculated on the square matrix of patients’ best response after every stage including both induction cycles as well as the HCT. CR and PR were classified as response, whereas MR, stable disease (SD), progressive disease (PD), and death from any cause were summarized as non-response (NR). Fisher’s exact test was used to determine the P value. All P values are two-sided and considered explorative. Explorative multivariate analysis was performed with all factors listed in Table 3 using generalized linear models for overall and complete response rates after two cycles of Dexa-BEAM or MIFAP (logit transformation).
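To illustrate how the Kaplan–Meier estimator treats events and censored observations, the following minimal sketch computes the product-limit estimate in plain Python. It is not the authors' analysis code (which used the R survival package); the function names and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time per patient (e.g. months)
    events -- 1 if the event occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect all patients sharing this follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set
        at_risk -= n_leaving
    return curve


def median_survival(curve):
    """First time at which the estimate drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy data: six patients, three events, three censored
toy_times = [2, 3, 3, 5, 8, 12]
toy_events = [1, 1, 0, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)

# Inverse Kaplan-Meier method for median follow-up: swap events and censoring
toy_followup = median_survival(kaplan_meier(toy_times, [1 - e for e in toy_events]))
```

A return value of `None` corresponds to a median that is not reached within the observed follow-up.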
Toxicities were analyzed using two-sided Fisher’s exact tests on categorical variables and Welch tests on continuous variables, or Wilcoxon rank sum tests if prior Shapiro–Wilk tests indicated a non-normal data distribution. P values were adjusted for multiple comparisons by calculation of false discovery rates (FDR). The treatment-related toxicities were compared using Fisher’s exact tests (WHO toxicity grades 0–2 and grades 3–4 were each pooled).
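The FDR adjustment mentioned above can be sketched as follows. The text does not name the specific procedure, so this example assumes the Benjamini–Hochberg method, which is the most common choice; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Assumption: the FDR procedure is Benjamini-Hochberg; the source
    text only states that false discovery rates were calculated.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value and enforce monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values from four toxicity comparisons
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
```

Each raw P value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values ordered in the same way as the raw ones.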

Results

Patient characteristics

Between 2000 and 2006, 76 patients from 10 institutions were screened for the study. Table 2 displays the patients’ characteristics and the trial course is depicted in Fig. 1. Out of 76 patients assessed for eligibility, 73 were assigned either to the Dexa-BEAM standard arm (N = 37) or the MIFAP experimental arm (N = 36). Overall, the patients included in this study comprised a population with a very unfavorable prognosis. Fifty-eight patients (79%) had early and/or multiple relapses or had been refractory to the previous therapy. In patients with relapsed diseases, the median duration of the last remission (before study entry) was only 8 months.

Table 2 Patient characteristics at randomization

Treatment response

After the first course, 26 patients (70%) of the standard arm and 23 patients of the experimental arm (64%) achieved at least MR. Non-responders were assigned to crossover to the other treatment regimen or to leave the study. Table 3 shows the response rates after the second treatment course for each treatment arm. The ORR was 51% in the Dexa-BEAM arm and 53% in the MIFAP arm, and the CR rate was 38% in the Dexa-BEAM arm and 36% in the MIFAP arm (both differences not significant). Patients with refractory disease had a significantly higher risk of non-response than patients with relapse (P < 0.001, OR = 14.1, 95% CI 4.0–69.1). In addition, elevated lactate dehydrogenase (LDH) levels tended to increase the risk of not reaching a CR, although this did not reach significance (P = 0.064, OR = 2.9, 95% CI 1.0–9.5). Successful mobilization and harvesting of autologous stem cells was observed in 65 (89%) patients. Poor mobilization was documented in only three cases (two patients in the Dexa-BEAM and one patient in the MIFAP group). Stem cell collection was not performed in five patients due to toxicity (n = 1) or disease progression (n = 4). With regard to stem cell harvest, there were no significant differences between the two treatment arms (Table 1S).
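The comparison of response rates between the arms can be illustrated with a two-sided Fisher's exact test. The sketch below assumes the counts implied by Fig. 1 (19 responders among 37 patients in the Dexa-BEAM arm and 19 among 36 in the MIFAP arm, which match the reported 51% and 53%); it is not the authors' code, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Assumed counts: responders / non-responders after the second course
p_orr = fisher_exact_two_sided(19, 18, 19, 17)  # Dexa-BEAM vs. MIFAP
```

With these assumed counts the observed table is the most likely one under the null hypothesis, so the P value is close to 1, in line with the non-significant difference reported above.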

Table 3 Response rates dependent on risk factors

Toxicity

After the first treatment course, only the median duration of G-CSF support (12 days MIFAP vs. 9 days Dexa-BEAM, P = 0.016) differed significantly between the arms. After the second course, however, the increased hematological toxicity observed in the MIFAP arm was much more pronounced. This included the duration of grade 4 leukocytopenia (P < 0.001), of thrombocytopenia (P < 0.001), and of febrile neutropenia (P = 0.004). In addition, significantly more transfusions of red blood cells (P < 0.001) and platelets (P < 0.001) were necessary. Patients treated in the MIFAP arm needed more G-CSF support (P = 0.021), and their hospitalization lasted longer (P = 0.002). Table 2S summarizes hematological toxicities and supportive therapy. Regarding the non-hematological toxicities, no major differences were observed between the two groups after the first treatment course. After the second treatment cycle, infections (P = 0.001), performance status (P = 0.007), and pain (P = 0.018) were significantly shifted towards higher WHO toxicity grades in the experimental MIFAP arm (Table 3S).

Survival

After a median follow-up of 14.4 years (inverse Kaplan–Meier method), the 15-year rates of PFS and OS of the entire group were 26% and 29% (median 5.6 and 26.6 months), respectively. There were no significant differences between Dexa-BEAM and MIFAP in PFS (median 5.6 vs. 5.4 months, P = 0.812) and OS (median 25.5 vs. 26.6 months, P = 0.440). In addition, in patients with relapsed or with refractory disease, no significant differences in PFS and OS were found between the two regimens. Thirty-five of the 38 responders (with at least a PR after two courses) were consolidated by autologous (N = 29), allogeneic (N = 1) or sequential autologous/allogeneic (N = 5) HCT. Responding patients with subsequent autologous HCT achieved 15-year rates of PFS and OS of 59% and 50% (median not available and 121.8 months, respectively). Survival curves for patients suffering from HL or NHL are presented in Fig. 2, and the outcome of responders with subsequent autologous HCT is shown in Fig. 1S. In a univariate as well as in a multivariate analysis including treatment arm, lymphoma subtype, disease status, Ann Arbor stage and LDH level at randomization, we found no influence of the treatment regimen (Dexa-BEAM vs. MIFAP) on PFS and OS (Table 4S).

Fig. 2 Kaplan–Meier curves for Dexa-BEAM vs. MIFAP: PFS in HL (N = 13 vs. N = 12) (a), PFS in NHL (N = 24 vs. N = 24) (b), OS in HL (N = 13 vs. N = 12) (c) and OS in NHL (N = 24 vs. N = 24) (d).

Discussion

Although the benefit of consolidating high-dose therapy followed by autologous HCT has been demonstrated in patients with recurrent lymphomas (Philip et al. 1995; Schmitz et al. 2002), this option is typically limited to patients with chemosensitive disease. Therefore, it is very important to induce at least a partial remission by an effective salvage regimen. In this trial report, the platinum-containing MIFAP protocol, which had shown encouraging results in a phase II study (Hänel et al. 2001), was tested within a randomized phase II trial against the Dexa-BEAM regimen. No significant differences between the two treatment arms regarding response and survival were documented. In general, the achieved response rates were about 10–15% lower than reported in other trials (Gisselbrecht et al. 2010; Josting et al. 2010; Schmitz et al. 2002). However, the extremely high proportion (79%) of prognostically unfavorable patients with refractory disease or early and/or multiple relapses in our study must be considered. For comparison, the proportion of these patients in the HDR2 study (Josting et al. 2010) and the CORAL study (Gisselbrecht et al. 2010) was only 43% (120/279 patients) and 54% (228/396 patients), respectively. Since no superiority of the experimental arm was observed in response and survival, but the MIFAP regimen was significantly more toxic than Dexa-BEAM (particularly after the second course of treatment), the primary hypothesis of the study could not be confirmed. Still, the results for responders with autologous HCT are comparable to those of other studies (Gisselbrecht et al. 2010; Josting et al. 2010; Schmitz et al. 2002). The main problem remains the achievement of confirmed remission (at least PR) as a prerequisite for autologous HCT, especially in patients with refractory or early relapsed disease (Gisselbrecht et al. 2010; Josting et al. 2010; Schmitz et al. 2002). So far, there is no optimal and favorable salvage protocol available.
The DHAP regimen is routinely used as the standard salvage protocol for recurrent HL and aggressive NHL, although, to the best of our knowledge, its superiority to other protocols has not been demonstrated in a prospective randomized study. Furthermore, allogeneic HCT has not become a standard option for the treatment of Hodgkin’s lymphomas, and remains under debate in non-Hodgkin lymphomas (Shanbhag et al. 2019; Sureda et al. 2014). Nevertheless, allogeneic HCT for high-risk patients (defined as primary refractory/refractory relapse/recurrence after autologous HCT) is still a treatment option for non-Hodgkin lymphoma (Glass et al. 2014). For Hodgkin’s lymphomas, brentuximab vedotin is administered as maintenance therapy in high-risk patients after autologous HCT (Moskowitz et al. 2019). Polatuzumab vedotin (Sehn et al. 2020) has recently demonstrated very good results as salvage therapy without autologous HCT in DLBCL patients and could possibly be investigated as maintenance therapy in high-risk patients after autologous HCT. In addition, the checkpoint inhibitors nivolumab and pembrolizumab are options in the treatment of relapsed or refractory Hodgkin’s lymphoma (Chen et al. 2017; Younes et al. 2016) and are also being intensively investigated for the treatment of NHL (Merryman et al. 2017).

Autologous genetically modified T cells expressing chimeric antigen receptors (CARs, CAR T cells) directed against CD19 are a novel treatment modality for patients with relapsed or refractory DLBCL and PMBCL from the 3rd line onwards. The ZUMA-1 study investigating the efficacy of axicabtagene ciloleucel demonstrated an overall response rate of 83% and a CR rate of 58% (Locke et al. 2019). Subsequently, a 4-year overall survival rate of 44% at a median follow-up of 51.1 months was reported (Jacobson et al. 2020). In the JULIET trial, 52% of patients responded to treatment with tisagenlecleucel and 40% achieved CR (Schuster et al. 2019b). After a median follow-up of 40.3 months, a 3-year overall survival rate of 36% was reported (Jaeger et al. 2020). Recently, a 5-year progression-free survival of 31% at a median follow-up of 60.7 months has been described (Chong et al. 2021). In clinical practice, the survival outcomes were consistent with the ZUMA-1 results (Nastoupil et al. 2020). In addition, Neelapu et al. showed that axicabtagene ciloleucel improved survival compared with conventional treatment based on standardized analyses. In the ZUMA-1 study, over 50% of the patients were alive at 2 years (Neelapu et al. 2019), compared to 12% in the SCHOLAR-1 study, the largest patient-level pooled retrospective analysis to characterize response rates and survival for a population of patients with refractory DLBCL (Crump et al. 2017).

Bispecific antibodies and T cell engagers (BiTE) bind both a target antigen on malignant lymphoma cells and CD3 on T cells. On the one hand, this leads to cell-mediated cytotoxicity, and on the other hand to the activation of various cellular and humoral immune reactions (Huehls et al. 2015). In heavily pretreated patients, various anti-CD3/anti-CD20 bispecific antibodies have led to overall response rates of up to 60% (Bannerji et al. 2020; Coyle et al. 2020; Lugtenburg et al. 2019; Schuster et al. 2019a). In patients after previous CAR T cell therapy, an overall response rate of up to 39% was reported (Bannerji et al. 2020; Schuster et al. 2019a).

In summary, treatment results of patients with recurrent lymphomas have been stagnant for about 20–25 years. This is especially true for unfavorable refractory disease and early relapses. Despite different salvage regimens and the use of autologous and allogeneic HCT, the results have not significantly improved. In the future, new treatment strategies using CAR T cells, BiTE and ADC may improve the outcome of patients with recurrent lymphomas.