Introduction

Mixed-phenotype acute leukemia (MPAL) is a rare diagnosis constituting 2–5% of acute leukemia [1]. The complex phenotype exhibited by this type of leukemia has historically resulted in a myriad of treatment approaches utilizing therapy for acute lymphoblastic leukemia (ALL), acute myeloid leukemia (AML), or so-called “hybrid” therapy mixing elements of both. Because MPAL is a rare disease, current evidence for its treatment is limited to small case series, often scattered across a variety of international pathology and oncology journals. Determining the best therapy is further complicated by the lack of a uniform definition, with two different classification systems for MPAL in common use today: that of the European Group for Immunological Characterization of Acute Leukemias (EGIL) and that published by the World Health Organization in 2008 (WHO2008) [2–5], with minimal changes in the WHO2016 update [6]. Interpretation of clinical outcomes across classification systems is confounded by important differences in the definition of MPAL, resulting in overlapping but distinct patient populations. Both classifications remain in use internationally, with continued controversy over whether one is more advantageous [1]. Few studies classify MPAL using both systems, and all are small case series with fewer than 50 patients in total, thereby limiting comparisons of clinical outcomes between definitions. The dearth of prospective trials, together with such diverse, scattered retrospective data, is a significant hurdle to developing an evidence-based treatment approach for clinicians currently caring for patients with this rare disease. The objective of this meta-analysis and systematic review was therefore to answer the question: is AML-based or ALL-based therapy associated with greater rates of complete remission (CR) and/or overall survival (OS) in patients with MPAL?
To our knowledge, this is the first systematic approach to quantitatively synthesize available evidence for MPAL addressing the association of therapy type with early disease response and survival.

Materials and methods

Search strategy

A systematic literature search was conducted by a research librarian (L.K.) using a combination of controlled vocabulary, when possible, and keywords in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [7]. To ensure inclusion of all available data for this rare disease, a comprehensive search was undertaken using multiple databases (MEDLINE/PubMed, EMBASE (initially through OvidSP and later via Elsevier), Cochrane Library, Web of Science, Scopus, ClinicalTrials.gov). All seven databases were searched from their inception through June 2017 with no language restriction. As MPAL is often inconsistently labeled in the literature, the search strategy was intentionally broad to maximize the chances of identifying all published reports of patients with MPAL. Included terminology was representative of different descriptions of MPAL (e.g., “biphenotypic leukemia” OR “bilineal leukemia” OR “mixed leukemia”). The literature search was supplemented by review of cited references from included case series. All references were compiled into an EndNote (X7) library. Following removal of duplicates, title and abstract review was independently performed by two authors (E.O., M.M.); in case of disagreement, the final decision was reached by consensus. The full search strategy, inclusive of search terms, is provided in the supplement (see Supplementary Methods).

Eligibility

Eligible studies had to meet the following criteria: report de novo acute leukemia (i.e., excluding secondary cancers, presentation as MPAL at relapse, and relapsed MPAL), specifically state the classification system (EGIL or WHO2008) used to establish the diagnosis, describe treatment in sufficient detail to determine therapy type, and include sufficient response data per therapy type to inform the study endpoints of CR and OS. Studies reporting MPAL using the WHO2016 definition (described albeit not yet formally published) were included within the WHO2008 category as only subtle distinctions are present in the updated definition [6]. Studies that reported only combined ALL/AML/MPAL or combined EGIL/WHO2008 data, wherein the contribution of therapy to definition-specific outcomes could not be identified, were excluded. Sufficient treatment data were defined as, at minimum, reporting of chemotherapy agents, protocols, or the authors’ description of therapy “directed at” ALL, AML, or hybrid (or similar verbiage). Studies reporting the use of non-cytotoxic chemotherapy alone (such as immune modulators or epigenetic therapy) or palliative chemotherapy only were excluded. Acute undifferentiated leukemia (AUL) was excluded as a distinct diagnosis [4, 6]. Due to the low prevalence of disease and the goal to quantitatively analyze all relevant information, studies of any design were incorporated, including individual case reports and case series.

Data extraction

Data were extracted by one author (M.M.) and independently verified by a second author (E.O.). Data of interest consisted of type of research (retrospective, prospective, clinical trial), treatment years (or publication year if not available), consortia where applicable, country of origin, and treatment regimen or chemotherapy agents. Patient characteristics included age, sex, ethnicity, and race. Age was classified as pediatric (≤18 years) or adult (>18 years) or as reported within the study as “adult,” “pediatric,” or “mixed” if not separated in the report. Disease presentation data included MPAL definition (EGIL or WHO2008), phenotype (B/Myeloid [B/My], T/Myeloid [T/My], B/T, B/T/Myeloid [B/T/My]), lineage (bi-lineage [Bi-L], bi-phenotypic [Bi-P]), and presenting features (white blood cell [WBC] count, central nervous system [CNS] involvement, and recurrent cytogenetic features [presence/absence of BCR-ABL, rearrangements in the KMT2A gene (KMT2Ar, formerly “MLLr”), or ETV6/RUNX1]). Treatment characteristics included therapy type (ALL, AML, hybrid), use of stem cell transplant (SCT), and use of a tyrosine-kinase inhibitor (TKI). When therapy type was not classified by regimen or assigned by authors within the case report (i.e., only chemotherapy agents listed), induction therapy was considered to be ALL-based if it included steroids, vincristine, and asparaginase; AML-based if it included an anthracycline and cytarabine; and “hybrid” if it included all five of these agents. Outcomes of interest included post-induction CR (yes/no) and OS. CR was defined using traditional morphology criteria and/or as a reported CR. In addition to aggregate data, when available, detailed information was extracted from case reports and case series to enable patient-level analyses as a “compiled case-series.”
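The agent-based rule for assigning induction type can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the drug-class labels, and the "unclassified" fallback are hypothetical, and requiring every listed agent in a group is one reading of the rule rather than the authors' documented procedure.

```python
from typing import Iterable

# Drug-class labels are illustrative assumptions, not an exhaustive vocabulary.
ALL_AGENTS = {"steroid", "vincristine", "asparaginase"}
AML_AGENTS = {"anthracycline", "cytarabine"}


def classify_induction(agents: Iterable[str]) -> str:
    """Assign 'hybrid', 'ALL', 'AML', or 'unclassified' from reported drug classes.

    Assumes a group counts only when every agent in that group is present.
    """
    given = {a.strip().lower() for a in agents}
    has_all = ALL_AGENTS <= given  # all three ALL-type agents reported
    has_aml = AML_AGENTS <= given  # both AML-type agents reported
    if has_all and has_aml:
        return "hybrid"  # all five agents present
    if has_all:
        return "ALL"
    if has_aml:
        return "AML"
    return "unclassified"  # hypothetical fallback; needs manual review


print(classify_induction(["Steroid", "vincristine", "asparaginase"]))
print(classify_induction(["anthracycline", "cytarabine"]))
```

Regimens that match neither group, or only part of one, fall through to the fallback label, which mirrors the need to assign some regimens by hand during extraction.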

Statistical approach

The primary and secondary endpoints of interest were post-induction CR and OS, respectively. The proximal measure of CR was selected as the primary endpoint to best understand treatment sensitivity of MPAL. By focusing on induction, treatment heterogeneity was minimized, thus enabling a 1:1 relationship between therapy type and immediate disease response. Candidate predictors were type of induction, age (adult/pediatric), phenotype (B/My, T/My, B/T, B/T/My), lineage (Bi-P, Bi-L), and SCT. Summary measures were generated from case series reports for inclusion in the meta-analyses. Meta-analyses were conducted within each definition of MPAL to evaluate outcomes per MPAL definition and to enable comparisons between the two systems (EGIL versus WHO2008). Log odds ratios and log relative failure rates were pooled using the random-effects method [8]. Between-study heterogeneity was assessed using the Q-statistic along with the proportion of variation (I²). Multivariable regression analyses for the meta-analyses were not possible as only marginal totals of patients for categories of independent variables were typically provided with inadequate information to describe the association between different variables. Analyses were performed on data from the patient-level compiled case series using multivariable logistic regression analysis for CR and multivariable Cox regression analysis for OS. Age, therapy type, phenotype, lineage, and MPAL definition were included as candidate predictors in each model. All models were performed with backward stepwise main effects analyses using p < 0.10 for retention. As a final step, removed variables were retested for improvement of the final model. Interactions between the MPAL definition variable and main effects were tested for plausible evidence that the observed effect might differ depending on definition.
Examination of the impact of treatment era was limited by case series reporting only the range of years during which cases were treated instead of a patient-specific year. As described in detail in the Supplementary Methods, we explored this indirectly using a simulation approach to examine the impact of including treatment year in the regression models for CR and OS. Data for clinical features of MPAL presentation were analyzed for aggregate and patient-level data; aggregate data were synthesized by weighting each measure by the number of patients included per report. Publication bias for aggregate data was assessed through visualization of funnel plots. Data quality was not formally assessed as all included studies were descriptive reports of retrospective cohorts. P-values of <0.05 were generally considered statistically significant. All statistical computations were performed in Stata, version 14 (StataCorp. 2015. Stata Statistical Software: Release 14. College Station, TX: StataCorp LP). Additional detail for the statistical approach is provided in the supplement (Supplementary Methods).
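For readers less familiar with random-effects pooling, the following Python sketch shows how log odds ratios can be combined and how the Q-statistic and I² are derived. It assumes the DerSimonian-Laird estimator of between-study variance, which the text does not name explicitly, and a 0.5 continuity correction for zero cells; the analyses reported here were run in Stata, so the function names and structure are purely illustrative.

```python
import math
from typing import Dict, List, Tuple


def log_or_and_variance(a: float, b: float, c: float, d: float) -> Tuple[float, float]:
    """Log odds ratio and its variance from a 2x2 table.

    a, b = events / non-events in arm 1; c, d = events / non-events in arm 2.
    Adds 0.5 to every cell when any cell is zero (an assumed correction).
    """
    if min(a, b, c, d) == 0:
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_random_effects(effects: List[float], variances: List[float]) -> Dict[str, float]:
    """Pool log odds ratios with the DerSimonian-Laird estimator (assumed here)."""
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I² as a percentage
    w_star = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is truncated at zero and the random-effects result coincides with the fixed-effect one, which is the situation an I² of 0% describes.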

Results

Search results

The search strategy yielded 17,421 reports fulfilling the intended search terms for this heterogeneous disease with no reports identified from other sources. Following assessment of title and abstract, 252 met criteria for full-text review (Fig. 1, PRISMA Flow Diagram). The final quantitative assessment included 97 studies whose characteristics are described in Supplementary Table 1 [9–106]. From these, data for 1,499 individual patients were extracted: 759 patients met the criteria for MPAL via EGIL, 740 via WHO2008, and 54 of these were reported to satisfy both definitions. From this cohort, 1,351 unique patients fulfilled the EGIL and/or WHO definition for MPAL and provided sufficient information for inclusion in at least one quantitative analysis for the study endpoints of CR or OS. The remaining 148 patients (three reports [15, 71, 77]) did not provide adequate information for either study endpoint and were included in the description of the presenting features of MPAL only. Thirty-two patients were treated with regimens not otherwise classified. These were defined during data extraction (AML therapy = daunorubicin–cytarabine [“DA,” n = 5], idarubicin–cytarabine [“IA,” n = 2], cytarabine–daunorubicin–etoposide [“ADE,” n = 1]; ALL therapy = cyclophosphamide, vincristine, doxorubicin, and dexamethasone [“hyperCVAD,” n = 5], vincristine, cyclophosphamide, etoposide, prednisone, mitoxantrone [“VCEP-M,” n = 19]). Patient-level data for the compiled case series were drawn from 65 of the 97 reports and included detailed information for 342 patients for analysis of CR and 240 for analysis of OS.
Three additional studies [107,108,109] did not meet eligibility criteria for inclusion in the quantitative analysis but are included in the discussion (no attributable definition [n = 2], inadequate treatment data [n = 1]). Reports were drawn from the international literature, with patients treated from 33 different countries.

Fig. 1

PRISMA flow chart

Study cohort and presenting features of MPAL

Systematic review of the literature revealed 31 reports describing the prevalence of MPAL, with a mean prevalence of 2.8% of acute leukemia (range 0.3–9.0%). Presenting characteristics of MPAL are shown in Table 1 according to definition and report type, with several trends apparent irrespective of definition. In general, MPAL did not present commonly with hyperleukocytosis, with a median peak WBC at diagnosis of 12–28 K/µL in the aggregate and patient-level data. Only approximately one in five patients in the compiled case series presented with severe hyperleukocytosis of ≥100,000/µL. Recurrent leukemic cytogenetic abnormalities were relatively infrequent, with BCR-ABL present in <25%, and KMT2Ar and ETV6/RUNX1 in ≤10%. Only 17 of the 51 BCR-ABL+ patients within the compiled case series were actively reported as receiving a TKI during their frontline therapy. Detailed review of these 17 cases showed no added toxicity with the addition of the TKI. Involvement of the CNS was uncommon at the time of diagnosis, affecting <20% of patients. Where reported, Bi-P MPAL was far more common than Bi-L MPAL. B/My MPAL was the most prevalent phenotype, followed by T/My, with B/T and B/T/My MPAL relatively infrequent except within the WHO2008 compiled case series.

Table 1 Presenting characteristics of patients with MPAL

Predictors of CR

ALL induction therapy was associated with more than three-fold greater odds of CR than AML therapy irrespective of MPAL definition (odds ratios below 1 favor ALL induction; Fig. 2a: WHO2008, n = 322, odds ratio [OR] = 0.33, 95% confidence interval [95% CI] 0.18–0.58; Fig. 2b: EGIL, n = 277, OR = 0.18, 95% CI 0.08–0.40). Minimal and insignificant between-study heterogeneity was present (WHO2008 I² = 0%, Q-statistic p = 0.53; EGIL I² = 0%, Q-statistic p = 0.96). CR rates with hybrid induction therapy were not significantly different from those with ALL induction by either definition (Supplementary Fig. 1A, B) and were of borderline significance for higher CR rates compared to AML induction (Supplementary Fig. 1C, D). Meta-analyses of other candidate predictors for CR showed no effect of patient age or MPAL lineage, but in EGIL-defined MPAL, B/My phenotype was associated with a greater chance of post-induction remission than T/My (Supplementary Fig. 2A, B). In the compiled case series, the majority of patients receiving an ALL induction achieved a CR (n = 150/203, 73.9%). Pediatric patients were more likely than adult patients to begin with ALL therapy (OR = 2.67, 95% CI 1.64–4.35, p < 0.001). However, subsequent multivariable analysis inclusive of treatment type, MPAL definition, age, phenotype, and lineage found that only treatment type and phenotype were associated with a CR (Table 2). After accounting for the impact of phenotype, AML therapy was associated with a lower CR rate than ALL therapy (OR = 0.45, 95% CI 0.27–0.77, p = 0.003). Hybrid induction was not significantly different from an ALL induction (OR = 0.95, 95% CI 0.42–2.17, p = 0.899). Examination of treatment era in the context of this final model suggested a possible increase in CR rate as a function of treatment year (Supplementary Data).
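The "three-fold" statement follows from inverting the pooled odds ratios, as the short Python sketch below illustrates. It assumes the reported ORs compare AML with ALL induction, as the multivariable result in the text implies; the helper name is hypothetical. Note that the inverted values describe odds of CR, not CR rates.

```python
from typing import Tuple


def invert_odds_ratio(or_value: float, ci_low: float, ci_high: float) -> Tuple[float, float, float]:
    """Re-express an OR for the opposite comparison.

    The confidence bounds swap places after taking reciprocals.
    """
    return 1 / or_value, 1 / ci_high, 1 / ci_low


# Pooled estimates as reported for Fig. 2 (assumed AML-versus-ALL direction).
reported = {"WHO2008": (0.33, 0.18, 0.58), "EGIL": (0.18, 0.08, 0.40)}

for label, estimate in reported.items():
    or_inv, low, high = invert_odds_ratio(*estimate)
    print(f"{label}: ALL vs AML OR = {or_inv:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Under that assumption the inverted point estimates are roughly 3.0 for WHO2008 and 5.6 for EGIL.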

Fig. 2

Meta-analyses of complete remission rate from ALL or AML induction therapy. Examination of likelihood for obtaining a complete remission when beginning treatment with either ALL or AML induction therapy for MPAL as defined by WHO2008 (a) or EGIL (b). Reference number and sample size as indicated. *, **, *** indicate left, right, or both boundaries extend past figure. C = data reported within detailed case series, S = data reported only as summary/aggregate data

Table 2 Multivariable analysis of patient-level data for complete remission

Predictors of OS

Meta-analyses of OS showed a significant survival benefit for starting with ALL therapy as compared to AML, irrespective of definition (Fig. 3a: WHO2008, n = 154, OR = 0.45, 95% CI 0.26–0.77; Fig. 3b: EGIL, n = 141, OR = 0.43, 95% CI 0.24–0.78). There was similarly no significant between-study heterogeneity (WHO2008 I² = 15%, Q-statistic p = 0.31; EGIL I² = 24%, Q-statistic p = 0.24). Only 11 studies compared outcomes for SCT. The limited data for SCT showed a trend for an association with higher OS for WHO2008-defined MPAL but not EGIL (Supplementary Fig. 3A, B). For the compiled case series, 3-year OS for the combined adult and pediatric cohort was 44 ± 3.7%. Examination of the data revealed a large amount of early censoring due to “early” publication; as a sensitivity analysis to determine an “upper boundary” for potential OS, if publication-censored patients were presumed alive, the maximum 3-year OS would be 52 ± 3.2%. Nonetheless, starting with AML and ALL therapy resulted in similar 3-year OS (47 ± 5.0% and 48 ± 6.9%, respectively) as compared to worse survival for starting with hybrid therapy (23 ± 8.6%) (Fig. 4, logrank p = 0.001). On multivariable analysis (Table 3), two models were found to represent the data equally well, one with starting therapy (i.e., induction type) and age (model #1) and the other with starting therapy and lineage (model #2). In both of these multivariable models, starting with either AML or ALL therapy was associated with greater OS as compared to hybrid therapy. Irrespective of age, patients beginning therapy with AML regimens did not experience significantly poorer survival as compared to ALL (hazard ratio [HR] = 1.18, 95% CI 0.79–1.75, p = 0.413) while those receiving hybrid therapy fared worse (HR = 2.11, 95% CI 1.30–3.43, p = 0.003). This effect of starting therapy was also preserved after accounting for MPAL lineage (model #2). Age was predictive of OS on univariable analysis (Fig. 5, logrank p = 0.025) with continued borderline significance in the multivariable model inclusive of starting therapy (pediatric HR = 0.69, 95% CI 0.48–1.00, p = 0.051). Examination of treatment era in the context of the multivariable OS model did not suggest significant improvement as a function of treatment year (see Supplementary Data). SCT could not be included in the multivariable analysis as it is a time-dependent variable with insufficient information reported in most studies as to timing of transplant. However, in examining the patterns of transplant, ~38% of patients in the compiled case series received SCT (n = 101/265), with ~67% of those proceeding to SCT having achieved an early CR (n = 68/101). Patients who started with ALL therapy were less likely to proceed to SCT if they achieved an end-of-induction CR versus those without CR, while those starting with AML therapy were far more likely to proceed to SCT with an end-of-induction CR versus no initial CR (ratio of ORs = 4.80, 95% CI 1.49–15.45, p = 0.009).

Fig. 3

Meta-analyses of overall survival according to starting therapy. Results of meta-analyses comparing overall survival for those beginning treatment with ALL versus AML therapy for MPAL defined by WHO2008 (a) or EGIL (b). Reference number and sample size as indicated. *, **, *** indicate left, right, or both boundaries extend past graph. C = data reported within detailed case series, S = data reported only as summary/aggregate data

Fig. 4

Overall survival curves according to starting therapy. Comparison of overall survival from diagnosis for patients beginning treatment with ALL, AML, or hybrid therapy in the patient-level compiled case-series

Table 3 Multivariable analysis of patient-level data for overall survival

Fig. 5

Overall survival curves according to age group. Comparison of overall survival from diagnosis for adult versus pediatric patients treated for MPAL in the patient-level compiled case-series

Discussion

To our knowledge, we present here the results of the first quantitative synthesis for MPAL, with over a thousand unique patients with evaluable data included among the different analyses. Due to the relative rarity of MPAL, this represents the largest evaluation to date of therapeutic approaches to treating MPAL. Our principal finding supports the use of an ALL induction regimen as more likely to achieve an initial remission than more toxic AML regimens. Meta-analyses supported the benefit of starting with ALL therapy for OS, but this finding was not replicated in multivariable analysis of the smaller compiled case series. It is unclear if this discrepancy is due to differences in post-induction therapy, variable use of SCT, or other differences not minimized by the large number of patients in the aggregate meta-analyses. Beginning with hybrid therapy showed no consistent difference in the study endpoints across definitions when compared to ALL or AML therapy, but significantly worse survival than either within the compiled case series. Pending future prospective trials, the consistent trends in the large number of both pediatric and adult patients included here support beginning therapy with less intensive ALL therapy [110]. By analyzing data for both the WHO2008 and EGIL definitions, this finding has broad relevance irrespective of preferred classification system.

The nature of the MPAL literature previously precluded a clear understanding of the presenting characteristics of MPAL. Analysis of this large cohort revealed no specific trends for age, sex, or presenting leukemia features. Of note, while not separated by age group, recurrent leukemia cytogenetic abnormalities were present at rates that were not clearly different from those in the general ALL population. As the first collated description of the prevalence of CNS involvement at diagnosis of MPAL, the relative infrequency of CNS disease is important to note for planning CNS prophylaxis even with the unknown incidence of CNS relapse in MPAL. The relatively low prevalence of BCR-ABL is reassuring, as is the lack of noted toxicity in patients treated with a TKI in support of current expert opinion [6, 110]. We found B/My and Bi-P MPAL to be the most common phenotype and lineage, similar to that reported in an earlier registry study [58]. Understanding the presenting characteristics for MPAL such as the prevalence of CNS involvement, hyperleukocytosis, and cytogenetic abnormalities may help providers guide initial care as well as future research efforts to determine optimal therapy for this population.

While the data supporting improved CR rates with ALL therapy are consistent across all aspects of this quantitative analysis, the impact of therapy type on OS is not as clear. We would note that the poor 3-year OS of <50% demonstrated for patients in the compiled case series may be adversely weighted by the inclusion of older patients (as demonstrated in the multivariable analysis). Two epidemiological studies included for qualitative review, one from the Surveillance, Epidemiology, and End Results (SEER) database and one from United States Medicare data [107, 108], both support the effect of age on OS. Both studies were hampered by insufficient diagnosis and/or treatment data, but consistently showed that elderly patients (≥60 or ≥65 years) diagnosed with MPAL had a 2-year OS of less than 20%; the SEER study also showed the highest survival in pediatric patients <20 years of age [107], similar to what we find here. This latter study supports a possible confounding effect of treatment era, with improving survival in MPAL in those treated since 2006, an effect we could not examine directly [107]. While a benefit for ALL therapy was not observed in the compiled case series, even equivalent OS might support the use of ALL regimens with reduced treatment intensity and associated acute toxicity and long-term late effects. Comparison of treatment intensity is particularly relevant as the observed 3-year OS in our series included an undetermined effect of SCT, with patients receiving AML therapy far more likely to proceed to intensive SCT even with an early remission. The overall trends seen in our data are consistent with a recent abstract from the iBFM AMBI2012 registry [109].
While the AMBI2012 registry data could not be included in the meta-analyses due to its inclusion of either MPAL classification without specification, and of patients with acute undifferentiated leukemia (AUL), the study generally supported a survival benefit for ALL-based therapy to treat MPAL (5-year EFS ~70%) compared to AML or hybrid regimens (5-year EFS ~40% and ~50%, respectively). Available quantitative and qualitative data thus both support the use of ALL therapy for initial induction, with the preponderance of evidence supporting at least equivalency of long-term outcomes and OS, if not benefit as seen in the meta-analyses.

Conversely, the role of transplant remains an unanswered question for MPAL. The meta-analyses show that incorporation of SCT may favor OS, but our analysis is limited by inconsistent reporting for SCT, including pre-transplant disease status. Moreover, this apparent association may be an artifact as patients receiving SCT had to survive sufficiently long to receive it. Differences in benefit for SCT between the WHO2008 and EGIL definitions are similarly challenging to explain, although the WHO results were likely influenced by the smaller sample size. A recent retrospective study of patients with MPAL treated with SCT by the Center for International Blood and Marrow Transplant Research (CIBMTR) [111] lacks data on frontline therapy but describes post-transplant survival for 95 WHO2008-defined MPAL patients. The authors concluded that transplant was of equal benefit for MPAL as for other acute leukemia. Within the CIBMTR data, transplant in first or second remission did not impact survival, albeit with the caveat that this does not include those who did not survive to time of transplant. Preliminary data from the earlier AMBI2012 abstract also describe a potential benefit for SCT, although only in those receiving AML therapy [109]. As such, while SCT likely has utility for treatment of MPAL similar to other acute leukemia, its specific role in frontline versus salvage therapy has yet to be elucidated.

Although this constitutes the largest analysis of MPAL patients in the literature, several limitations inherent to meta-analyses are present in our study. Foremost, this systematic review highlights the absence of high-quality prospective studies in MPAL to guide therapy selection. The possibility of a treatment bias from retrospective series cannot be excluded, although this concern is somewhat tempered by the sheer numbers of patients from a wide variety of countries. As is common to all retrospective studies, we cannot entirely exclude unknown interactions between therapy type and leukemia prognosis. While the inclusion of TKI therapy for BCR-ABL+ MPAL is now common, we are unable to gauge the efficacy of this approach from the extant literature; it is nonetheless important to note we found no excessive toxicity reported from inclusion of a TKI. Individual recurrent cytogenetic findings were relatively uncommon in the MPAL literature, but beyond those revealed by routinely obtained cytogenetics, this is likely limited by the specific testing sent in each case. Prospective trials with uniform testing at time of diagnosis are necessary to determine the precise prevalence. Both BCR-ABL and KMT2Ar, while less prevalent in MPAL, likely adversely affect OS in MPAL similar to other acute leukemia. Limitations in the primary data precluded evaluating the extent of their influence on CR rates and survival in this study. Because MPAL is a rare disease, data for case series were often presented in abstract form only; we therefore limited inclusion criteria to reports specifically stating their usage of well-established criteria (EGIL or WHO) to mitigate the risk for added heterogeneity in the diagnosis of MPAL. We also acknowledge that the broad search strategy, including unavoidable overlap with terminology for leukemia with rearrangement of the “mixed lineage leukemia” gene, may have resulted in missed isolated case reports during our extensive title review.
Nonetheless, we minimized the risk for missing data through independent review by two authors, and we would note our strategy resulted in the inclusion of a very large and international cohort for analysis of a rare leukemia, thereby broadening the relevance of these findings. Lastly, the aggregate data were too sparse to determine the impact of post-induction therapy with or without SCT on survival. While the meta-analyses of data for EGIL and WHO2008 MPAL are strongly suggestive of benefit for beginning treatment using ALL therapy, the lack of prospective trial data and the absence of clear improvement in treatment outcomes over the years highlight the need for prospective trials to validate these findings and to explore the role of response-based risk stratification and SCT to determine optimal therapy for MPAL.