Introduction

Many patients with acute myeloid leukemia (AML) achieve a morphological complete remission (CR) with curative-intent induction chemotherapy [1–3]. Despite intensive post-remission therapies, however, fewer than half of the patients achieving a CR will be cured [1]. The likelihood of cure varies considerably across individual patients, highlighting the need for accurate methods to identify patients at high risk of disease recurrence and to direct post-remission therapy in a risk-adapted manner. There is now increasing evidence that levels of minimal residual disease (MRD) during the course of therapy can serve as an independent marker to identify such high-risk patients [5]. Besides achievement of a morphologically defined CR as a prerequisite of cure, the term “molecular remission” was introduced in the 2003 guidelines to refine the definition of treatment response in AML [4], and MRD levels after induction therapy in particular are independently associated with risk of relapse and survival [5].

AML has a wide spectrum of cytogenetic and molecular abnormalities [2,6]. In some cases, leukemia-specific fusion genes (e.g., PML-RARA in t(15;17)(q22;q12), RUNX1-RUNX1T1 in t(8;21)(q22;q22), CBFB-MYH11 in inv(16)(p13.1q22) or t(16;16)(p13.1;q22), MLL-MLLT3 in t(9;11)(p22;q23)), clone-specific recurring gene mutations (e.g., NPM1, DNMT3A), or overexpressed genes (e.g., WT1) are present and can be used as molecular markers to detect and monitor MRD (Table 1). In recent years, two methods, multiparameter flow cytometry (MFC) and quantitative real-time polymerase chain reaction (RT-qPCR), have been developed to detect MRD in AML patients [1,2,5]. Increasing evidence indicates that patients with detectable MRD have a higher risk of relapse than those who are MRD negative. Hence, MRD provides powerful prognostic information during treatment and follow-up that is, in a statistical sense, independent of pretreatment characteristics such as cytogenetic or molecular abnormalities [5,7,8•]. Nonetheless, the use of MRD to fine-tune risk assessment and adapt treatment strategy during post-remission therapy in AML lags behind acute lymphoblastic leukemia (ALL), acute promyelocytic leukemia (APL), and chronic myeloid leukemia (CML), in which MRD is now routinely used to guide treatment decisions at predefined checkpoints during therapy [9–13]. Below, we review MRD assessment in AML by MFC and RT-qPCR, the proposed standardization of the methods, and the potential use of MRD as a surrogate [14] for survival endpoints.

Table 1 Antibody combinations for the identification of leukemia-associated aberrant immunophenotypes as well as molecular targets for the measurement of minimal residual disease in acute myeloid leukemia

MRD Detection by MFC

Besides lineage determination to distinguish AML from ALL, immunophenotyping by MFC is a valuable tool to distinguish normal bone marrow (BM) cells from leukemia cells. Normal BM exhibits reproducible patterns of antigen expression during differentiation, whereas in the vast majority of AML patients (>90 %), leukemic cells show aberrant antigen expression patterns that define a leukemia-associated aberrant immunophenotype (LAIP). The proportion of cells expressing the LAIP can then be monitored in response to therapy [5]. Another approach, called “different-from-normal,” uses a standard, fixed antibody panel to differentiate AML cells from normal hematopoietic cells or recovering marrow after chemotherapy and is therefore independent of the initial LAIP. The latter approach has been successfully used in the setting of allogeneic hematopoietic stem cell transplantation (HSCT), fostered by the fact that immunophenotypic data from initial disease presentation were available for only a small subset of patients [15]. Visualization and analysis of immunophenotypic data depend on manual inspection of cells on biaxial plots with multiple gating steps of selected cell populations. In an adequate sample, an MRD detection sensitivity of 10⁻³ to 10⁻⁴ can be achieved [5]. Currently, 6–10 differently labeled antibodies are used as standard for MFC-MRD detection [5,8•,15]; analysis of immunophenotypic data by automated analysis algorithms (e.g., SPADE [16] or viSNE [17]) may facilitate interpretation and standardization of results in the future.

However, interpretation of MFC for MRD detection is technically demanding and highly individualized, since it depends on the expertise and experience of the reference laboratory [8•,15,18]. Increased accuracy and sensitivity of MRD detection can be obtained by incorporating more antibodies and automated analysis algorithms for the detection of abnormal cells [18], although cost-effectiveness has so far not been systematically evaluated. Leukemic cells are quantified relative to other cells in the specimen; hence, the smallest cell cluster that can be judged as MRD positive depends on the total number of analyzed cells. Consequently, if the cell quantity is limited (e.g., in aplasia during therapy), MRD sensitivity is reduced. In addition, sensitivity is largely a function of whether the same phenotype is present in normal immature cells. The analysis becomes more complex with increasing numbers of simultaneously applied antibodies. It is well recognized that there is limited consistency in results between individual laboratories [18]. Thus, besides standardization of immunophenotypic MRD assays analogous to that achieved for RT-qPCR, for example in the Europe Against Cancer (EAC) program [19], extensive experience and regular cross-validation are necessary to reproducibly identify LAIPs.
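The dependence of MFC sensitivity on the number of analyzed cells can be illustrated with a short calculation. The following is a minimal sketch, not a validated laboratory rule: the function name and the minimum cluster size of 20 events are assumptions chosen for illustration and are not taken from the studies cited above.

```python
def mfc_detection_limit(total_events: int, min_cluster_events: int = 20) -> float:
    """Return the smallest LAIP-positive fraction that could be called MRD positive.

    min_cluster_events is a hypothetical minimum cluster size used only for
    this sketch; it is not a value stated in the text.
    """
    if total_events <= 0 or min_cluster_events <= 0:
        raise ValueError("event counts must be positive")
    return min_cluster_events / total_events


if __name__ == "__main__":
    # Fewer acquired events (e.g., an aplastic marrow) give a coarser limit.
    for events in (50_000, 200_000, 1_000_000):
        limit = mfc_detection_limit(events)
        print(f"{events:>9} events -> detection limit {limit:.1e}")
```

Under these assumptions, a sample with few evaluable cells cannot reach the lower end of the sensitivity range regardless of panel design, which is the point made for aplastic samples above.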

An important shortcoming is related to “antigenic shift,” a change of the LAIP expression profile at relapse, which is a well-known problem and may occur in up to 91 % of the cases [20–22]. This “antigenic shift” is closely related to the time point of assessment, with a very low frequency at an early time point, such as after induction therapy, and an increasing frequency over time, particularly during follow-up. Therefore, several consensus LAIP panels are needed for MFC-MRD, especially at later time points during therapy, to avoid false negative findings due to LAIPs with antigenic shift.

Nevertheless, the advantages of MFC-MRD as compared to RT-qPCR are its applicability to the vast majority of AML patients and the very fast availability of results. An example of a screening antibody panel for MFC is outlined in Table 1.

MRD Detection by RT-qPCR

In general, RT-qPCR is an advanced PCR assay in which a fluorophore emits light on amplification of the PCR product [23]. This RNA-based approach provides a sensitivity of 10⁻⁴ to 10⁻⁶ [5,24–26]. RT-qPCR measures mRNA expression levels of AML-specific genes relative to a control gene, which is used to control for quality factors during sample preparation that may influence cDNA yield, e.g., time between sample withdrawal and RNA preparation, cell concentration, and quality of reverse transcription. Besides providing comparability of different measurements, this allows documentation of the sensitivity of each individual assay and elimination of poor-quality samples. About 10 years ago, ABL was identified as the most stable and reliable gene in normal peripheral blood (PB) and BM and should therefore be used as the control gene [27]. Within the EAC program, a framework for optimal conditions for RT-qPCR analysis, including careful selection of probes and primers, has been established through systematic parallel evaluation in an international network of expert laboratories, thereby enhancing assay sensitivity and specificity as well as interlaboratory comparability [19].
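Normalization to the control gene can be written out as a short calculation. The sketch below expresses a target transcript per 10⁴ ABL copies and rejects samples with too few control-gene copies; the function name and the quality cutoff of 10⁴ ABL copies are assumptions made for illustration and are not specified in the text.

```python
def copies_per_1e4_abl(target_copies: float, abl_copies: float,
                       min_abl_copies: float = 1e4) -> float:
    """Express target transcript copies per 10^4 copies of the ABL control gene.

    min_abl_copies is a hypothetical quality cutoff for this sketch; samples
    with fewer control-gene copies are treated as not evaluable.
    """
    if target_copies < 0:
        raise ValueError("target_copies must not be negative")
    if abl_copies < min_abl_copies:
        raise ValueError("control gene copies too low; sample not evaluable")
    return target_copies / abl_copies * 1e4


if __name__ == "__main__":
    # Two samples with the same raw target signal but different RNA quality.
    for abl in (20_000, 200_000):
        print(abl, copies_per_1e4_abl(target_copies=400, abl_copies=abl))
```

Reporting the ratio rather than the raw copy number is what makes measurements from samples of different quality comparable, and the control-gene count itself documents how sensitive an individual measurement could be.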

Very recently, the European LeukemiaNet developed an MRD-reporting software into which data from various RT-qPCR platforms can be imported, processed, and presented in a uniform manner to generate intuitively understandable reports, allowing efficient data handling and harmonization of reporting [28••].

PML-RARA

MRD monitoring of the fusion gene PML-RARA by sequential RT-qPCR has been shown to be the strongest predictor of relapse in APL and, when coupled with preemptive therapy, provides a valid strategy to prevent overt hematological relapse [29]. This has led to the incorporation of MRD assessment in APL as a component of the standard response criteria [30]. In the context of several clinical trials, the standardized EAC RT-qPCR for PML-RARA has been shown to improve MRD detection rates as compared to conventional nested PCR assays [29,31]. Within a large UK study on 406 APL patients, mostly treated within the Medical Research Council (MRC) AML15 trial, MRD status by the EAC RT-qPCR assay was the strongest predictive factor of relapse (p < 0.0001), even stronger than white blood cell (WBC) count at diagnosis [29]. Patients with detectable transcripts after the second course of standard therapy had a significantly higher risk of relapse as compared to MRD-negative patients (relapse risk at 3 years, 19 vs. 8 %; p = 0.003). Evaluation at completion of consolidation therapy revealed that BM was more sensitive than PB (median difference, 1.5-log). Consequently, monitoring of BM every 3 months after the end of therapy has been recommended, taking into account the assay sensitivity and typical kinetics of relapse [30,32]. However, this approach is increasingly questioned given recently reported relapse rates in adult low-risk APL patients of 1.1 % at 3 years after therapy with arsenic trioxide (ATO) in combination with all-trans retinoic acid (ATRA) [33]. Nonetheless, MRD measurement remains an important tool to inform management of high-risk and relapsed APL patients, and MRD monitoring in these patients is highly recommended [5,30].

RUNX1-RUNX1T1

Several studies have noted the importance of MRD measurement by RT-qPCR in AML patients with t(8;21)(q22;q22) leading to the fusion transcript RUNX1-RUNX1T1 [25,34•,35•,36–38]. Optimal outcomes are achieved when either a molecular remission or a very substantial reduction in RUNX1-RUNX1T1 transcripts is achieved. However, significant heterogeneity within t(8;21)(q22;q22) leukemias is widely appreciated. Pretreatment variables associated with worse outcome have been recognized in AML with t(8;21)(q22;q22), including higher age, a high WBC count, deletion of the long arm of chromosome 9, nullisomy Y in male patients, and the presence of KIT and/or FLT3 mutations at diagnosis [39–46]. Nevertheless, a recent multicenter study in 100 AML patients with t(8;21)(q22;q22) (age range, 18–60 years) suggests that the level of MRD reduction outperforms high WBC count and mutational status of KIT/FLT3 in identifying patients at high risk [35•]. Hence, the authors suggested that MRD level, rather than secondary gene mutations, should be used for future treatment stratifications [35•]. In particular, in the largest prospective study conducted to date on 163 patients, a >3-log reduction of transcript burden after the first induction therapy and a >4-log reduction after the first course of post-remission therapy were associated with cumulative incidences of relapse (CIR) of only 4 and 13 %, respectively [25]. Higher intensity regimens may lead to deeper log reductions after the first course of chemotherapy, as has been shown for the addition of gemtuzumab ozogamicin (GO) to intensive chemotherapy [25] as well as for anthracycline dose intensification during intensive induction chemotherapy [47]. Patients who received daunorubicin at 90 mg/m² showed a faster and deeper MRD reduction and achieved a higher proportion of complete molecular responses, which translated into a reduced relapse rate as compared to patients receiving 60 mg/m².

These findings were extended by Zhu et al., who evaluated a risk-directed therapy approach based on MRD in t(8;21)(q22;q22) AML patients. Within this study, 116 younger patients (range, 15–60 years) who achieved morphological remission with 1–2 courses of standard induction according to the “7 + 3” schema and then completed two cycles of intermediate-dose cytarabine-based consolidation therapy were examined. The lack of a major molecular response (MMR, defined as a >3-log reduction in RUNX1-RUNX1T1 transcript levels from baseline) after the second course of consolidation, or loss of MMR within 6 months, was used to categorize patients as high risk; all other patients were categorized as low risk. Allogeneic HSCT was associated with reduced relapse risk and improved survival as compared to chemotherapy for high-risk patients (5-year CIR 22.1 vs. 78.9 %, p < 0.0001; 5-year disease-free survival (DFS) 61.7 vs. 19.6 %, p = 0.001; 5-year overall survival (OS) 71.6 vs. 26.7 %, p = 0.007), whereas it had no impact on relapse in low-risk patients (5-year CIR 14.7 vs. 5.3 %, p = 0.33) and was associated with inferior DFS relative to those treated with chemotherapy/autologous HSCT (70.3 vs. 94.7 %, p = 0.024) [34•]. Although treatment assignment was not performed in a randomized manner, this study supports the idea of post-remission treatment adaptation according to MRD assessment in AML with t(8;21)(q22;q22).
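The MMR-based categorization described above can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: the function names are hypothetical, transcript levels are assumed to be already normalized to the control gene, and an undetectable follow-up value is treated as an unbounded reduction, which the cited study does not specify.

```python
import math


def log_reduction(baseline: float, follow_up: float) -> float:
    """Log10 reduction of normalized transcript level from baseline to follow-up."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if follow_up < 0:
        raise ValueError("follow_up must not be negative")
    if follow_up == 0:
        return math.inf  # assumption: undetectable transcript counts as full reduction
    return math.log10(baseline / follow_up)


def has_mmr(baseline: float, follow_up: float, threshold_logs: float = 3.0) -> bool:
    """MMR as defined in the text: a >3-log reduction from baseline."""
    return log_reduction(baseline, follow_up) > threshold_logs


def risk_group(mmr_after_second_consolidation: bool,
               mmr_lost_within_6_months: bool) -> str:
    """High risk if MMR was never reached or was lost within 6 months."""
    if not mmr_after_second_consolidation or mmr_lost_within_6_months:
        return "high"
    return "low"


if __name__ == "__main__":
    reached = has_mmr(baseline=1e5, follow_up=10)  # a 4-log reduction
    print(risk_group(reached, mmr_lost_within_6_months=False))
```

Because the criterion is a ratio to the patient's own baseline, it requires a quantified diagnostic sample; a patient without baseline material cannot be classified this way.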

CBFB-MYH11

Similar to RUNX1-RUNX1T1, MRD measurement of CBFB-MYH11 resulting from inv(16)(p13.1q22) or t(16;16)(p13.1;q22) has been shown to be of clinical importance. Yin et al. defined relevant MRD checkpoints during treatment in a large series of 115 patients: an absolute copy number in PB below 10 after first induction and after first consolidation therapy was associated with a reduced CIR of 21 and 36 %, as compared to 52 and 78 % above this copy number, respectively [25]. Other studies came to qualitatively similar conclusions [36,48–50]. Jourdan et al. studied 98 patients with AML and inv(16)(p13.1q22) or t(16;16)(p13.1;q22). As shown for AML with t(8;21)(q22;q22), multivariate analysis revealed a greater MRD reduction (>3-log vs. ≤3-log) after the second consolidation as the only significant prognostic factor for RFS (HR, 0.40; range, 0.18–0.91) [35•].

MLL-MLLT3

AML with MLL-MLLT3 is rare in adults (about 2.0 % of patients) but is more common in childhood AML (9–12 %) [51,52]. Adult patients have an intermediate survival that is superior to that of other MLL-rearranged AML [53]. Scholl et al. established a sensitive RT-qPCR assay for different fusion types and showed that patients achieving a PCR-negative state had a very low probability of relapse (11 %) and a high OS at 4 years of 70 %, whereas patients with RT-qPCR-positive results all relapsed and died within 3 years [54,55]. These data were recently extended by Abildgaard et al., who reported on an RT-qPCR assay in combination with a locked nucleic acid probe for quantification of the most common breakpoint region of the MLL-MLLT3 fusion gene in pediatric patients [56].

NPM1 Mutations

Frameshift mutations of the NPM1 gene are among the most frequent molecular abnormalities in AML, particularly in patients with a normal karyotype [3,5]. To date, more than 50 different NPM1 mutations have been reported [57]. The three most common variants (types A, B, and D) represent 90 % of all mutated cases and have been shown to be reliable markers for MRD detection with a high sensitivity [58]. The same assay can be adapted for cases with rare variants by replacing the mutation-specific primer, but these case-specific assays should be carefully tested in control samples (e.g., NPM1 wild-type AML) to avoid nonspecific background amplification from the wild-type NPM1 allele [59]. In contrast to other molecular aberrations (e.g., FLT3), NPM1 mutations are typically stable during the course of the disease, which supports the notion that they are an early pathogenetic lesion in AML [60]. Similar to the findings in CBF-AML, RT-qPCR assessment of MRD can distinguish patients at high risk of relapse: in a study on 245 adult patients with NPM1-mutated AML, relevant MRD checkpoints could be defined [61]. Achievement of RT-qPCR negativity after double induction therapy identified patients with a low CIR (6.5 % after 4 years) as compared to RT-qPCR-positive patients (53 % after 4 years; p < 0.001), translating into significant differences in OS (90 vs. 51 %, respectively; p = 0.001). After completion of therapy, CIR was 15.7 % in RT-qPCR-negative patients as compared to 66.5 % in RT-qPCR-positive patients (p < 0.001) [61]. These data are extended by the study of Hubmann et al., in which an NPM1 mutation cutoff level of 0.01 was associated with a CIR after 2 years of 77.8 % for patients with ratios above the cutoff versus 26.4 % for those with ratios below the cutoff [62].
Of note, in the randomized French ALFA-0701 trial showing the superiority of intensive chemotherapy in combination with GO over intensive chemotherapy alone, NPM1-MRD was predictive for response to therapy since more MRD-negative results were obtained in patients treated in the GO arm as compared to those treated in the control arm after induction therapy (39 vs. 7 %; p = 0.006) as well as at the end of treatment (91 vs. 61 %; p = 0.028) [63••]. This is one of the first randomized studies indicating that MRD assessment may serve as a surrogate for survival endpoints for the treatment under investigation.

Additionally, in a retrospective analysis performed by the German Study Alliance Leukemia, increasing levels of NPM1 MRD were predictive of an impending relapse after chemotherapy (MRD increase >1 % NPM1mut/ABL1) or allogeneic HSCT (MRD increase >10 % NPM1mut/ABL1) [64].

Internal Tandem Duplication of the FLT3 Gene

Though FLT3 internal tandem duplication (FLT3-ITD) can be detected in roughly 25 % of all AML patients and FLT3 is one of the most frequently mutated genes in AML, its suitability as an MRD marker has been questioned for several reasons: (1) its heterogeneity with respect to size, number of clones per patient, allelic ratio, and insertion site within the FLT3 gene and (2) its proposed instability during the course of treatment (reported in about 25 % of paired diagnosis-relapse samples), an observation previously based mainly on semiquantitative methods with limited sensitivity [65–70]. These results may therefore, to a certain degree, be hampered by insufficient sensitivity of the applied methods. In most cases, the mutation originally detected at diagnosis is also present at relapse, often at a higher allelic ratio than at diagnosis [68,69,71]. The other cases, where a FLT3-ITD is acquired at relapse, may represent clonal disease progression [60,71]. Newer techniques aim to improve the sensitivity of FLT3-ITD detection, such as RT-qPCR with patient-specific primers [72]. However, this approach is time-consuming since each FLT3-ITD needs a clone-specific primer/probe designed from the junctional sequence. Technically, this may not be possible for every case due to the constraints of the sequence at the junction. In addition, samples from patients with a low FLT3-ITD allelic ratio may not be amenable to direct sequencing since the wild-type sequence is competitively amplified. Recently, another PCR-based assay to detect FLT3-ITD MRD has been reported. Within this assay, primers oriented in the opposite direction were used; hence, amplification occurred only if a FLT3-ITD was present. Again, this approach is technically limited, since short FLT3-ITDs (less than 30–40 bases) will not be detected due to insufficient primer annealing space, which may apply to roughly 25 % of all FLT3-ITD cases [73,74]. Neither approach is therefore ready to be implemented in routine clinical care.
Of note, next-generation sequencing (NGS) has been shown to be useful for MRD assessment, particularly in patients with FLT3-ITD [75]. Yet, a major challenge arises from the mass of data generated, which affects data storage, data analysis, and interpretation of the results. Finally, even with the decreasing costs of NGS studies, these tests remain expensive and require expertise to accurately analyze and decipher the complex data [76]. Taken together, though FLT3-ITD mutation testing should be mandatory in all AML patients at diagnosis as well as at relapse for prognostic purposes and for guiding therapeutic decisions, it currently has little utility for MRD monitoring.

DNMT3A Mutations

The complexity of the leukemic genome has been highlighted by the discovery of mutations in genes important for epigenetic gene transcription regulation. DNMT3A mutations can be found in 15–25 % of AML patients, particularly in AML with normal cytogenetics [77,78] and are thought to be a “founder” mutation since they are present in early preleukemic hematopoietic stem cells [79•]. This has been confirmed in two pivotal population-based studies showing an increasing incidence of clonal hematopoiesis with increasing age and DNMT3A as the most frequently mutated gene [80,81]. In AML patients, about 60 % of DNMT3A mutations affect residue R882 [82,83]. Several studies evaluated the stability of DNMT3A mutations in paired diagnosis and relapse material [60,84]. In the largest analyses, Hou et al. studied sequentially 316 samples from 138 patients, including 35 patients with distinct DNMT3A mutations and 103 patients without mutations at diagnosis. At relapse, all initially DNMT3A-mutated cases showed the same mutation, whereas all other patients remained negative at relapse [84]. Differential MRD assessment using DNMT3A, NPM1, FLT3, and KIT has been performed by Ploen and co-workers, who developed a multiplex allele-specific quantitative PCR assay for the sensitive detection of DNMT3A mutations affecting residue R882 [85]. Analysis of DNA from 298 diagnostic AML samples revealed DNMT3A mutations in 45 AML patients (15 %); the mutation was stable in 12 of 13 patients presenting with relapse or secondary myelodysplastic syndrome. Interestingly, persistent levels of DNMT3A mutations could also be found in remission samples from 14 patients up to 8 years after initial AML diagnosis, whereas MRD levels of concurrent mutations at diagnosis, such as NPM1 and FLT3, became negative. Furthermore, cell sorting demonstrated the presence of DNMT3A mutations in leukemic blast cells, but also in T and B cells from the same patients [85]. 
These data suggest that DNMT3A mutations in these cases are markers of clonal hematopoiesis rather than AML-defining events. Thus, long-term persistence of DNMT3A mutations may simply reflect the return from overt leukemia to clonal but otherwise clinically normal hematopoiesis. Based on the current data, the suitability of DNMT3A as an MRD marker within standard AML treatment is questionable; it may be of interest in clinical trials using epigenetically active drugs in combination therapy as well as in single-agent maintenance therapy (e.g., NCT01757535).

MRD Detection by Measuring Expression of WT1

WT1 is overexpressed in roughly 80 % of AML patients, thus making it an attractive target for MRD monitoring [26]. However, its value for MRD monitoring has been debated due to (1) the difficulty of discriminating the residual expression of leukemic cells from background expression, since the expression of WT1 is not leukemia-specific, and (2) differences between the applied assays. In 2009, the ELN consortium systematically evaluated nine different WT1 RT-qPCR assays, leading to the recommendation of a standardized assay as well as proposed threshold levels of 50 and 250 WT1 copies/10⁴ ABL copies for MRD detection in PB and BM samples, respectively [26]. The predictive value of MRD assessment was then analyzed within a cohort of 129 AML patients treated with conventional “7 + 3”-based chemotherapy with diagnostic WT1 levels exceeding 2 × 10⁴ copies/10⁴ ABL copies, allowing the discrimination of at least a 2-log reduction as compared to the pretreatment level. A greater WT1 reduction after the first induction chemotherapy was associated with a reduced relapse risk (hazard ratio [HR] 0.54 per log reduction; range, 0.36 to 0.83; p = 0.004), independent of age, WBC count, or cytogenetic risk group. Reduction of WT1 below the threshold limits defined in normal controls by the end of consolidation also predicted a reduced relapse risk (p = 0.004). However, the degree of WT1 overexpression in most AML patients was too modest to afford a highly sensitive universal marker for sequential MRD monitoring when background levels of expression observed in normal PB and BM were taken into account [26]. In addition, PB seems to be more informative due to the much higher background expression in normal BM [5,63••]. Nevertheless, several studies have shown a correlation between detectable WT1 MRD and clinical outcome [86–89]. Very recently, as already mentioned above, Lambert et al. evaluated the differential prognostic significance of WT1 and NPM1 MRD levels assessed by RT-qPCR in 278 adult AML patients (age range, 50–70 years) treated in the ALFA-0701 trial [63••]. Positive WT1 MRD (defined as >0.5 % in PB) after induction and at the end of consolidation treatment were both significantly associated with a higher risk of relapse (HR = 3.15 [1.78–5.58], p < 0.0001; HR = 3.41 [1.62–7.17], p = 0.001, respectively) and shorter OS (HR = 3.23 [1.64–6.37], p = 0.0007; HR = 4.64 [1.38–15.62], p = 0.013, respectively). However, the impact of the addition of GO to chemotherapy was not reflected by WT1 MRD levels, whereas NPM1 MRD levels tracked along with the treatment efficacy in terms of better molecular remissions in those patients treated with GO as an adjunct to intensive chemotherapy. This discrepancy, against the background of a significant beneficial impact of GO on all survival endpoints, further questions the clinical use of WT1 MRD assessment. In summary, WT1 expression for MRD assessment should be restricted to cases where MRD assessment is otherwise not possible.
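The specimen-dependent WT1 thresholds proposed by the ELN consortium (50 copies/10⁴ ABL copies in PB, 250 in BM) can be applied as in the following minimal sketch. The function name is hypothetical, the input is assumed to be already normalized to 10⁴ ABL copies, and treating only values strictly above the threshold as elevated is an assumption of this sketch rather than a rule stated in the text.

```python
# Upper limits of normal WT1 expression (copies per 10^4 ABL copies) as
# quoted in the text for peripheral blood (PB) and bone marrow (BM).
WT1_THRESHOLDS = {"PB": 50.0, "BM": 250.0}


def wt1_above_normal(copies_per_1e4_abl: float, specimen: str) -> bool:
    """Return True if normalized WT1 expression exceeds the specimen threshold."""
    if copies_per_1e4_abl < 0:
        raise ValueError("copies_per_1e4_abl must not be negative")
    try:
        threshold = WT1_THRESHOLDS[specimen.upper()]
    except KeyError:
        raise ValueError("specimen must be 'PB' or 'BM'") from None
    return copies_per_1e4_abl > threshold


if __name__ == "__main__":
    # The same normalized value is read differently in PB and BM.
    for specimen in ("PB", "BM"):
        print(specimen, wt1_above_normal(120.0, specimen))
```

The fivefold higher BM threshold reflects the higher background expression in normal marrow and narrows the measurable dynamic range between diagnosis and remission in BM samples.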

MRD-Directed Therapies

To date, the efficacy of close MRD-directed therapy is best supported for patients with APL [29,31,32]. For non-APL AML, MRD-based decision-making has not yet been embraced for routine clinical practice [5], but in genetically defined subsets of AML, such as CBF-AML and AML with NPM1 mutations, close MRD assessment has entered clinical trials, triggering intensification of post-remission therapy in case of persistent MRD (e.g., NCT02013648, NCT00893399). Furthermore, the more generally applicable approach of LAIP detection with MFC has been integrated into a number of currently recruiting trials with MRD-triggered treatment intensification (EudraCT Number: 2013-002843-26, NCT02349178, NCT01452646, NCT01462578, NCT01677949).

A widely cited study in 216 pediatric AML patients used a comprehensive risk stratification strategy based on genetic abnormalities at diagnosis and MRD findings to direct decisions on the second induction course and subsequent therapy [90]. At diagnosis, patients were provisionally classified according to their underlying genetic abnormalities; responses to each course of therapy, as assessed by morphology and MFC-MRD, determined the final risk classification. Patients with an MRD level of ≥1 % after induction I were classified as high risk. The intended treatment for low-risk patients (n = 68) consisted of five courses of chemotherapy, whereas high-risk patients (n = 79), as well as standard-risk patients (n = 69) with matched-related donors, were eligible for allogeneic HSCT (performed in 48 high- and 8 standard-risk patients). Comparison with historical controls suggested that this approach could improve outcomes.

In an observational series of 10 adult AML patients who were treated for increasing NPM1 MRD levels with 5-azacytidine (5-AZA), Sockel and co-workers reported promising results: after a median follow-up time of 10 months (range 2–12 months) from initiation of 5-AZA treatment only three patients developed a hematologic relapse [91]. A molecular response with a ≥1-log decrease in the MRD level was observed in 7 of the 10 patients [91]. The impact of 5-AZA therapy in AML patients with an impending hematological relapse and evaluable MRD marker (such as t(8;21), inv(16), and NPM1) is currently explored in a prospective phase II clinical trial (NCT01462578). In addition, since mutant NPM1 can induce an antileukemic T cell response, it could serve as a target in the setting of immunotherapy [92]. Future strategies might therefore additionally use adoptive immunotherapy in case of reoccurrence of mutated NPM1 after HSCT or administration of donor lymphocyte infusion (DLI) in patients after allogeneic HSCT [93].

MRD as a Predictor of Outcome After Transplant

The general notion that MRD-positive patients should immediately be candidates for an allogeneic HSCT has been called into question by studies showing high relapse rates even after allogeneic HSCT in pre-transplant MRD-positive patients [15,94]. Bastos-Oreiro and co-workers evaluated the prognostic impact of MFC-MRD before and after allogeneic HSCT. MRD-negative patients (defined as ≤0.1 % LAIPs) at the time point of allogeneic HSCT had significantly lower rates of relapse (15 vs. 66 %, p = 0.045) and better OS (83 vs. 52 %, p = 0.021) after 1 year as compared to MRD-positive patients [94]. Consistently, the study presented by Walter et al. showed a marked impact of MRD positivity before allogeneic HSCT, with relapse rates as high as 60 to 70 % [15]. However, the benefit of further therapy to reduce the level of MRD before allogeneic HSCT and the level of MRD that would preclude cure after allogeneic HSCT are currently unknown. Additional intensive chemotherapy increases the risk of organ toxicity or life-threatening infections and may not always reduce the leukemia burden because of resistant leukemia cells. Nevertheless, a recent study on pediatric leukemia patients with positive MRD reported relatively high survival rates after allogeneic HSCT (5-year OS, 66.7 vs. 80.4 % for MRD-positive vs. MRD-negative AML patients), suggesting that the negative effect of MRD had been partially offset by allogeneic HSCT [95]. Although MRD is associated with a severalfold increased risk of relapse after allogeneic HSCT, up to 20–30 % of patients with MRD at the time of transplantation experience prolonged DFS; i.e., some MRD-positive patients will be salvaged by allogeneic HSCT with either myeloablative or nonmyeloablative conditioning [15]. Therefore, MRD-positive AML patients should not be excluded from a potentially curative allogeneic HSCT.

Of course, based on these data, one could question whether, in fact, MRD-negative patients actually require allogeneic HSCT for long-term relapse-free survival.

MRD as a Surrogate Endpoint

Beyond a rapid blast cell clearance assessed by morphology after induction therapy [96,97], achievement of MRD negativity or marked transcript level reduction after first or second induction therapy seems to be of high prognostic impact (Table 2) [8•, 25, 62, 63••, 64, 98, 99]. Thus, a rapid decline in MRD levels after induction therapy reflects a highly chemosensitive disease with a per se favorable prognosis. However, relapse rates in patients with MRD negativity or marked transcript level reduction after induction therapy range between 0 and 73 % with a median of 27 %. Therefore, at least in one quarter of these patients, the label “low risk” is misleading, raising the question of specificity and sensitivity of the assay. In addition, Table 2 also illustrates a further shortcoming due to a broad variety of cut points used in these studies. This simply reflects the fact that in each study, the cut point was established and optimized within and for the reported cohorts without any validation process resulting in very low external validity, which is further put into question due to the highly selected patient populations. Therefore, a common international attempt to move forward standardization of MRD assessment is mandatory for future use of MRD in routine practice.

Table 2 Impact of minimal residual disease status at different time points during induction and consolidation treatment on cumulative incidence of relapse

Beyond routine practice, MRD may increasingly be used to optimize efficacy assessment of specific treatment components such as standard consolidation therapy and to facilitate drug development. For example, Burnett et al. recently stated that consolidation therapy with 1.5 g/m² bid for 3 days is as effective as the formerly used standard dose of 3 g/m² bid for 3 days, mainly based on similar OS rates [100]. However, despite comparable OS rates in both arms, the strong trend (p = 0.06) towards a higher relapse rate with the lower dosage of 1.5 g/m² points in a somewhat different direction [7]. In this situation, MRD assessment and systematic comparison of MRD levels in the two treatment arms would have added important scientific arguments. Currently, two studies are available showing that MRD may serve in the future as an early efficacy readout and may therefore be used as a surrogate [14] for survival endpoints. One study in CBF-AML revealed that a higher anthracycline dosage (DNR 90 mg/m² for 3 days) during standard induction therapy was associated with a significant reduction in MRD levels and longer survival as compared to the lower anthracycline dosage (DNR 60 mg/m² for 3 days) [47]. In addition, Lambert et al. showed a marked additional reduction of MRD levels by GO in conjunction with intensive chemotherapy in the French ALFA-0701 trial, with a clear beneficial impact on OS [63••]. These examples illustrate that early biomarker endpoints used as surrogate markers are of high interest because they provide an early readout of clinical efficacy. Such early readouts will allow the design of reasonable adaptive drug development plans with the aim of speeding up the transition of new drugs into routine patient care. Quantifying residual disease by MFC or RT-qPCR has the potential to serve as an early biomarker surrogate for survival endpoints but has so far not been evaluated in a confirmatory prospective manner.

Conclusions

Our progress in unraveling the genetic and immunophenotypic heterogeneity of AML has allowed us to identify a number of aberrations for MRD assessment. Despite the ability to achieve remission in most patients, AML remains a lethal cancer with most patients eventually dying of their disease; as outcomes are variable, however, better risk assessment and subsequent dynamic adaptation of therapy may ultimately improve outcomes. MRD measured quantitatively is a strong prognostic tool and is increasingly used to guide therapy. Recent studies suggest that modification of post-remission treatment intensity based on MRD status may be used to optimize outcomes and that MRD assessment should be implemented in clinical trials as an early readout of clinical efficacy. However, the process of regularly including MRD-based treatment decision-making and MRD as a clinical efficacy measure in clinical practice for non-APL AML has just begun.