Introduction

Allogeneic hematopoietic cell transplantation (allo-HCT) is widely used for adult patients with acute lymphoblastic leukemia (ALL), including those in first complete remission (CR) with high-risk features as well as patients in second or subsequent CR [1, 2]. The antileukemic activity of allo-HCT relies on both the conditioning regimen and the graft-versus-leukemia (GVL) reaction. In ALL, the role of conditioning appears particularly important. Both the intensity of the protocol and the type of regimen influence outcome. The use of myeloablative regimens based on total body irradiation (TBI) is considered the gold standard. In a global, randomized, pediatric trial the use of irradiation-free regimens was associated with a 2.5 times higher risk of relapse compared to TBI combined with etoposide [3]. Similar results have been reported for adults based on retrospective registry-based analyses [4, 5, 6]. In the study by Pavlu et al., the beneficial effect of TBI was independent of minimal residual disease status [5].

While the use of myeloablative TBI for ALL is considered a standard of care, the choice of its chemotherapy counterpart remains controversial. Traditionally, TBI has been combined with cyclophosphamide (Cy), and in adults with ALL this regimen is still the most frequently used [7]. Etoposide, which is more common in pediatric centers, may be an alternative [7, 8]. Finally, the combination of both etoposide and Cy with TBI is also practiced and is considered the most aggressive conditioning regimen [9]. Unfortunately, chemotherapy compounds contribute markedly to overall treatment toxicity, while their contribution to the antileukemic activity of the regimens is poorly documented. In order to reduce regimen toxicity, Cy or etoposide may be substituted with the purine analog fludarabine (Flu). Fludarabine is widely used in allo-HCT with reduced intensity conditioning [10]. In acute myeloid leukemia (AML) the combination of Flu with TBI (TBI/Flu) was evaluated in a prospective, randomized trial [11]. It was shown to reduce non-relapse mortality (NRM) without an increased risk of relapse compared to classical TBI/Cy conditioning [11]. In ALL this issue has never been analyzed, either prospectively or retrospectively. Therefore, the goal of this registry-based study was to compare outcomes of allo-HCT using either myeloablative TBI/Cy or TBI/Flu for adults with ALL in CR.

Methods

Study design and data collection

This was a retrospective, multicenter analysis. Data were provided by the registry of the Acute Leukemia Working Party (ALWP) of the European Society for Blood and Marrow Transplantation (EBMT). The EBMT is a non-profit, scientific society representing more than 600 transplant centers, mainly located in Europe, which are required to report all consecutive stem cell transplantations and follow-ups once a year. Data are entered, managed, and maintained in a central database. Since 1990, all patients have provided informed consent authorizing the use of their personal information for research purposes. The validation and quality control program includes verification of the computer print-out of the entered data, cross-checking with the national registries, and on-site visits to selected teams. The study was approved by the ALWP of the EBMT institutional review board and conducted in accordance with the Declaration of Helsinki and Good Clinical Practice guidelines.

Criteria for selection

The inclusion criteria were as follows: 1) patients with ALL who underwent their first allo-HCT in CR1 or CR2 between January 2010 and June 2020; 2) age ≥18 years; 3) conditioning regimen based on fractionated TBI at the total dose of 12 Gy in combination with either Cy or Flu; 4) transplantation from either a human leukocyte antigen (HLA) matched sibling donor (MSD) or unrelated donor (URD; HLA compatibility was defined as 10/10 or 9/10 match); 5) the use of peripheral blood or bone marrow as a source of stem cells. The use of anti-thymocyte globulin (ATG) or alemtuzumab as part of conditioning was allowed, while transplantations with ex vivo T-cell depletion were excluded from the analysis.

Statistical analysis

Leukemia-free survival (LFS) was the primary study endpoint. Secondary endpoints were: 1) overall survival (OS), 2) relapse incidence (RI), 3) NRM, 4) incidence of grade 2–4 and grade 3–4 acute graft-versus-host disease (GVHD), 5) incidence of chronic GVHD and extensive chronic GVHD, and 6) survival free from grade 3–4 acute GVHD, chronic GVHD and relapse (GRFS) [12].

Patients’ characteristics were compared using the Mann-Whitney U test for continuous variables, and the chi-squared or Fisher’s exact test for categorical variables. Probabilities of LFS, OS and GRFS were calculated using the Kaplan-Meier estimator. Cumulative incidence curves were used to estimate the probabilities of RI, NRM, acute and chronic GVHD in a competing-risks setting [13]. Univariate analyses were performed using the log-rank test for LFS, OS and GRFS, and Gray’s test was used to compare cumulative incidence estimates [14]. Multivariate analyses were performed using Cox’s proportional-hazards model. All variables differing significantly between the groups, and factors known to influence outcomes were included in the Cox model. To take account of possible heterogeneity of data, a random effect or frailty was introduced for each country into the models [15, 16].
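For illustration only, the two estimators named above can be sketched in a few lines of Python. This is not the code used in the study (the analyses were run in SPSS and R); the function names, the event coding (0 = censored, 1 = relapse, 2 = non-relapse death) and the eight-patient toy data set are assumptions made for the example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier estimate; returns [(t, S(t))] at each distinct event time.

    events[i] is 1 if the composite event (e.g. relapse or death for LFS)
    occurred at times[i], and 0 if the patient was censored.
    """
    n_events = Counter(t for t, e in zip(times, events) if e)
    n_exits = Counter(times)
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(n_exits):
        d = n_events.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= n_exits[t]  # events and censorings both leave the risk set
    return curve


def cumulative_incidence(times, causes, cause):
    """Cumulative incidence of `cause` in the presence of competing events.

    causes[i] is 0 for censoring, otherwise an integer event-type code.
    Each increment is (event-free survival just before t) * d_cause / at_risk.
    """
    n_exits = Counter(times)
    n_any = Counter(t for t, c in zip(times, causes) if c != 0)
    n_cause = Counter(t for t, c in zip(times, causes) if c == cause)
    at_risk = len(times)
    surv_before = 1.0
    cif = 0.0
    curve = []
    for t in sorted(n_exits):
        d_cause = n_cause.get(t, 0)
        if d_cause:
            cif += surv_before * d_cause / at_risk
            curve.append((t, cif))
        d_any = n_any.get(t, 0)
        if d_any:
            surv_before *= 1.0 - d_any / at_risk
        at_risk -= n_exits[t]
    return curve


# Hypothetical toy data: follow-up in months and event type per patient.
months = [2, 3, 3, 5, 8, 9, 12, 14]
causes = [1, 2, 0, 1, 0, 1, 2, 0]  # 0 = censored, 1 = relapse, 2 = NRM

lfs = kaplan_meier(months, [int(c != 0) for c in causes])
ri = cumulative_incidence(months, causes, cause=1)
nrm = cumulative_incidence(months, causes, cause=2)
print(lfs[-1], ri[-1], nrm[-1])
```

In this toy example, LFS, RI and NRM at the last event time sum to 1, which is the property that motivates cumulative incidence curves rather than 1 minus Kaplan-Meier when relapse and NRM compete.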

An additional matched-pair analysis was conducted using exact matching for ALL subtype (B-cell precursor/T-cell precursor ALL), disease status at allo-HCT (CR1/CR2) as well as donor type (MSD/URD), and nearest neighbor for recipient age, donor and recipient sex, the use of in vivo T-cell depletion, Karnofsky performance score, and stem cell source.
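The general idea of combining exact and nearest-neighbor matching can be sketched as follows. This is a simplified, hypothetical illustration rather than the procedure actually applied: it matches greedily 1:1 without replacement and uses recipient age as the only nearest-neighbor variable, whereas the study also used sex, in vivo T-cell depletion, Karnofsky score and stem cell source. The function name, field names and patient records are invented for the example.

```python
def match_pairs(cases, controls, exact_keys, distance_key):
    """Greedy 1:1 matching without replacement.

    A control is eligible for a case only if it agrees on every key in
    `exact_keys`; among eligible controls, the one closest on
    `distance_key` is chosen. Cases with no eligible control stay unmatched.
    """
    available = list(controls)
    pairs = []
    for case in cases:
        eligible = [c for c in available
                    if all(c[k] == case[k] for k in exact_keys)]
        if not eligible:
            continue
        best = min(eligible,
                   key=lambda c: abs(c[distance_key] - case[distance_key]))
        available.remove(best)
        pairs.append((case["id"], best["id"]))
    return pairs


# Hypothetical records: TBI/Flu patients are matched to TBI/Cy patients.
tbi_flu = [
    {"id": "F1", "subtype": "B", "status": "CR1", "donor": "URD", "age": 35},
    {"id": "F2", "subtype": "T", "status": "CR1", "donor": "MSD", "age": 28},
    {"id": "F3", "subtype": "B", "status": "CR2", "donor": "URD", "age": 50},
]
tbi_cy = [
    {"id": "C1", "subtype": "B", "status": "CR1", "donor": "URD", "age": 40},
    {"id": "C2", "subtype": "B", "status": "CR1", "donor": "URD", "age": 33},
    {"id": "C3", "subtype": "T", "status": "CR1", "donor": "MSD", "age": 45},
    {"id": "C4", "subtype": "B", "status": "CR2", "donor": "MSD", "age": 50},
]

pairs = match_pairs(tbi_flu, tbi_cy,
                    exact_keys=("subtype", "status", "donor"),
                    distance_key="age")
print(pairs)
```

Patient F3 remains unmatched in this toy example because no TBI/Cy record agrees on all three exact-matching variables, which shows how a matched cohort can end up smaller than the original group.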

All tests were two-sided with the type 1 error rate fixed at 0.05. Statistical analyses were performed with SPSS 24.0 (IBM Corp., Armonk, NY, USA) and R 3.5.3 (R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/).

Results

Patients, donors, and allo-HCT procedure

The analysis included 2255 patients, of whom 2105 were treated with TBI/Cy and 150 received a TBI/Flu conditioning regimen. Details on patient, donor and procedure characteristics are listed in Table 1. Patients in the TBI/Flu group were older (median age 35 years vs. 33 years in the TBI/Cy group, p = 0.006), were treated more recently (median year of allo-HCT: 2018 vs. 2015, p < 0.0001) and more frequently had a Karnofsky performance score <90 (27% vs. 20%, p = 0.03). They more frequently received allo-HCT from URD (71% vs. 56.5%, p = 0.0007), while they were less frequently administered in vivo T-cell depletion (37% vs. 49%, p = 0.003). The use of bone marrow as a source of stem cells was more frequent in the TBI/Cy group compared to the TBI/Flu group (21% vs. 5%, p < 0.0001). Cyclosporin A plus mycophenolate mofetil was the most common immunosuppressive regimen in the TBI/Flu group, while cyclosporin A plus methotrexate predominated in the TBI/Cy group (Table 1).

Table 1 Patients, donors and HCT procedure.

The matched-pair analysis included 132 patients in each group. The characteristics of both cohorts did not differ significantly except for the year of allo-HCT and spectrum of immunosuppressive protocols (Table 1).

Engraftment and GVHD

The engraftment rate was 99.3% and 98.9% in the TBI/Flu and TBI/Cy groups, respectively (p = 1.0).

In the analysis including the entire study population cumulative incidence of grade 2–4 acute GVHD was higher for TBI/Cy compared to the TBI/Flu group although the difference did not reach statistical significance (36% vs. 28%, p = 0.08). The incidences of grade 3–4 acute GVHD, overall chronic GVHD and extensive chronic GVHD were comparable between the study groups (Table 2). On multivariate analysis the risk of grade 2–4 acute GVHD was increased for TBI/Cy compared to TBI/Flu (hazard ratio [HR] = 1.57, p = 0.03) (Table 3). However, the effect was not confirmed in a matched-pair analysis (Table 2).

Table 2 Univariate comparison according to the type of conditioning regimen.
Table 3 Multivariate analysis of the effect of the type of conditioning on outcomes, adjusted for other risk factors (including total study population).

In the Cox model the risk of both grade 2–4 acute GVHD and chronic GVHD was reduced with the use of in vivo T-cell depletion while these were increased in the case of female donors. In addition, the use of URD as compared to MSD was associated with an increased risk of grade 2–4 acute GVHD while the use of peripheral blood as a stem cell source was associated with an increased risk of chronic GVHD compared with bone marrow (Table 3).

Relapse and NRM

The median follow-up was 23 months for the TBI/Flu group and 36 months for the TBI/Cy group. In a univariate analysis including the entire study population, the cumulative RI at 2 years was higher for TBI/Flu compared to TBI/Cy recipients, although the difference was not statistically significant (29% vs. 24%, p = 0.1) (Table 2). In the Cox model the effect of the type of conditioning on the risk of relapse was statistically significant in favor of TBI/Cy (HR = 0.69, p = 0.049) (Table 3). Also, in a matched-pair analysis TBI/Cy was associated with a significantly decreased RI compared with TBI/Flu (18% vs. 30%, HR = 0.5, p = 0.015) (Table 2, Fig. 1). In univariate or multivariate analysis the type of conditioning did not affect NRM (Tables 2, 3).

Fig. 1: Allo-HCT for ALL patients according to the type of conditioning regimen. Results of a matched-pair analysis.
NRM, non-relapse mortality; RI, relapse incidence; LFS, leukemia-free survival; OS, overall survival. For NRM p value is 0.69, for RI p = 0.015, for LFS p = 0.07, for OS p = 0.16.

In the Cox model the risk of relapse was significantly increased for patients transplanted in CR2 compared to CR1 and when using in vivo T-cell depletion. It was decreased for URD-HCT compared to MSD-HCT and when using female compared to male donors (Table 3). The risk of NRM was increased with increasing recipient age, for transplantations performed in CR2 and when using peripheral blood compared to bone marrow as the stem cell source. It was also increased for URD-HCT compared to MSD-HCT (Table 3).

The most frequent causes of death in both groups were: original disease (56.8% for TBI/Flu and 51.6% for TBI/Cy), GVHD (15.9% and 18.6%, respectively), infections (15.9% and 17.6%), neurotoxicity (4.5% and 3.3%), multiorgan failure (2.3% and 2%) and veno-occlusive disease (2.3% and 0.8%).

Survival

No significant differences were found in a univariate analysis comparing the two study cohorts with regard to OS, LFS and GRFS (Table 2, Fig. 1). In the Cox model no significant effect of the type of conditioning on OS, LFS and GRFS could be demonstrated (Table 3).

In a multivariate analysis, both increasing recipient age and more advanced disease status (CR2 vs. CR1) were associated with a decreased chance of OS, LFS and GRFS. In addition, OS and GRFS were improved for female compared to male recipients and for bone marrow compared to peripheral blood as a stem cell source. The chance of GRFS was increased with the use of in vivo T-cell depletion (Table 3).

Discussion

In this registry-based study we compared, for the first time, outcomes of allo-HCT for adults with ALL in CR using two myeloablative, irradiation-based conditioning regimens, TBI/Cy and TBI/Flu. Results of both multivariate and matched-pair analyses demonstrated that TBI/Flu was associated with an increased risk of leukemia recurrence without a significant impact on NRM and survival. The risk of grade 2–4 acute GVHD was reduced for TBI/Flu compared to TBI/Cy in the Cox model, but this was not confirmed in the matched-pair comparison.

TBI is considered an optimal backbone for conditioning in ALL. Its antileukemic activity is dose-dependent and therefore the maximum tolerated dose should be preferentially used [17]. According to the survey performed among EBMT centers, the total dose of 12 Gy is the most commonly used [18]. It used to be applied in 6 fractions of 2 Gy each, however, as shown in a retrospective analysis, the number of fractions may be decreased to 3 (4 Gy per fraction, once daily) without significant impact on outcomes [19]. Clinical practice varies among centers with regard to many technical aspects of TBI including dose rate, organ shielding and methods of patient immobilization, that may affect both safety and efficacy of the treatment [18, 20].

Although TBI at myeloablative doses is sufficiently immunosuppressive to allow for engraftment, it is usually combined with chemotherapy in order to increase the overall antileukemic activity of the conditioning regimen. In adults, the regimens most frequently include Cy, etoposide, or both agents. No randomized trials have been performed to compare particular TBI-chemotherapy combinations. According to retrospective comparisons, TBI + etoposide may display higher efficacy than TBI/Cy, leading to a reduced incidence of relapse, especially when allo-HCT is performed in CR2 [7, 8]. Despite this, according to the EBMT database, Cy is still used almost 10 times more frequently than etoposide [7].

In a recent publication, the ALWP of the EBMT proposed new definitions of the intensity of conditioning regimens taking into account prediction of all early and late NRM as well as relapse [21]. Weight scores were assigned to all components of the most frequently used regimens. Their sum enabled classification of the regimens into three groups: low, intermediate, and high transplant conditioning intensity (TCI). According to the proposed algorithm, TBI 12 Gy/Cy has a TCI score of 4, which places it in the category of high intensity regimens, associated with a high risk of NRM. Attempts to reduce the risk of NRM associated with the use of classical myeloablative regimens include substitution of alkylating agents with purine analogs, usually Flu. In the AML setting, prospective, randomized trials demonstrated improved tolerance to TBI or busulfan combined with Flu compared to regimens incorporating Cy [11, 22]. In particular, a German study group compared TBI 12 Gy/Cy with TBI 8 Gy/Flu for patients <60 years old [11]. With a follow-up of almost 10 years the authors found no differences in the incidence of relapse, and a tendency towards reduced NRM after reduced intensity conditioning [23]. However, findings in AML do not necessarily translate to ALL. As ALL is an oligoclonal disease, the GVL effect may be weaker for ALL compared to AML and therefore the role of the conditioning intensity may be more relevant.

The comparison of TBI/Cy and TBI/Flu has never been the subject of a prospective or retrospective study in adults with ALL. We decided to select only patients treated with TBI at the most popular myeloablative dose of 12 Gy to evaluate whether substitution of Cy with the less toxic agent Flu may lead to reduced NRM without compromising the antileukemic activity of the regimen. TBI/Flu has a TCI score equal to 3.5 and is thus in the category of intermediate intensity or “reduced-toxicity” regimens [21]. Unexpectedly, the incidence of NRM was comparable in both cohorts (13% for TBI/Flu and 15% for TBI/Cy at 2 years, Table 2). Also, despite the increased incidence of grade 2–4 acute GVHD with TBI/Cy, the distribution of causes of death was similar in both groups. The NRM rate for TBI/Cy was lower than that reported in a large registry study on adults with ALL, which included allo-HCT performed between 2000 and 2015 (20% at 2 years) [7]. As the median year of allo-HCT in the TBI/Cy group in this study was 2015, it may be assumed that NRM decreased during the last decade, the probable explanation being progress in supportive care.

While the introduction of Flu instead of Cy did not show a clear benefit in terms of NRM, it was associated with an increased risk of relapse. The effect was particularly distinct in the matched-pair analysis, where the risk of relapse for TBI/Flu was double that of TBI/Cy (Table 2). This observation can be explained by the higher antileukemic activity of Cy compared with Flu. Indeed, the efficacy of Cy as a single agent in patients with ALL had already been documented in 1963 [24]. It is still used as part of basic chemotherapy protocols during pre-treatment, induction and consolidation. Flu is a potent immunosuppressive drug, while its direct antileukemic activity in ALL has never been documented. The drug was found to potentiate the metabolism of cytarabine by increasing the intracellular concentration of its active metabolite, ara-cytosine triphosphate [25, 26]. For patients with acute leukemias, Flu has been incorporated into several salvage regimens such as FLAG-IDA or FLAM, always in combination with cytarabine and never as a single agent [27, 28]. Differences between the antileukemic efficacy of TBI/Cy and TBI/Flu could potentially be a consequence of different synergistic effects of the chemotherapeutic agents and irradiation. However, so far, neither cyclophosphamide nor fludarabine has been documented as an effective radiosensitizer.

Multivariate analysis revealed the role of other potentially modifiable factors such as stem cell source, donor type, and the use of in vivo T-cell depletion. The use of hematopoietic cells derived from peripheral blood was associated with an increased risk of chronic GVHD, increased NRM, and, consequently, decreased GRFS and OS. These findings correspond well with a recent report showing inferior outcomes when using peripheral blood as compared to bone marrow in allo-HCT from haploidentical donors for patients with ALL [29]. The increased risk of chronic GVHD when using peripheral blood as a source of stem cells may be diminished by administration of ATG as part of the conditioning regimen [30]. Indeed, the use of ATG is recommended in both URD-HCT and MSD-HCT [30]. Unfortunately, results of retrospective analyses focusing separately on patients with Ph-negative and Ph-positive ALL demonstrated an increased risk of relapse when using ATG [31, 32]. These findings have been confirmed in our study, which showed a 47% increase in the risk of relapse among patients treated with in vivo T-cell depletion. This implies that in the ALL setting ATG should be used with caution, probably at reduced doses.

As compared to MSD-HCT, URD-HCT procedures were associated with an increased risk of NRM, compensated by a reduced risk of relapse, which suggests a more effective GVL reaction when using URD. This observation corresponds well with previous reports published by our study group [4]. Unfortunately, the results of this study cannot be extrapolated to the haploidentical HCT setting. The use of post-transplant Cy excludes TBI/Cy as conditioning due to the risk of excessive cumulative toxicity. In this type of procedure TBI/Flu may be a valuable alternative.

Our study has some important limitations related to its retrospective nature. Firstly, we could not identify the reasons for the choice of TBI/Cy or TBI/Flu as conditioning. Also, data on minimal residual disease before transplantation were available for only 57% of patients, which did not allow inclusion of this variable in multivariate analyses. Finally, technical aspects of the TBI procedure, including dose rate, energy of the beams, methods of immobilization, dosimetry and organ shielding, which may influence both the safety and efficacy of the procedure, could not be controlled in the analysis. Nevertheless, we believe that our findings highlighting the role of the intensity of conditioning regimens in adults with ALL referred for allo-HCT may be of clinical importance.

We conclude that the use of myeloablative TBI/Cy as conditioning prior to allo-HCT for adult patients with ALL is associated with significantly lower relapse rates compared to TBI/Flu and therefore should probably be considered a preferable regimen. It must be stressed, however, that no significant effect of the type of conditioning on survival could be demonstrated in our study.