Introduction

Acute myeloid leukemia (AML) is the result of proliferation of myeloid precursors with impaired ability for normal differentiation leading to peripheral blood cytopenias and increased risk of infection and hemorrhage [1]. AML is characterized by multiple recurrent cytogenetic and molecular changes that are associated with response to therapy and overall prognosis [2, 3]. This genetic information, together with patient characteristics such as age, performance status, organ function, and comorbidities, is used to determine type of therapy and potential role of allogeneic stem cell transplantation (alloSCT) [3, 4].

In North America, the most commonly used AML therapy is modeled after a Cancer and Leukemia Group B trial consisting of a “7 + 3” induction with cytarabine and an anthracycline followed by several cycles of high dose cytarabine consolidation [5]. Younger patients benefited from the use of higher doses of anthracycline for 3 days with continuous infusion cytarabine. This treatment is associated with complete response (CR) rates of 70% and a median survival of 24 months [6]. These results are still suboptimal as a majority of patients will relapse and die from their disease.

There has been a debate whether higher doses of cytarabine are beneficial or not during the induction phase in AML. An example of such an approach is idarubicin and high dose cytarabine (IA), which was associated with high response rates in a single institution report [7].

In addition to “7 + 3” or IA types of therapy, multiple clinical trials have studied the potential of adding a third agent to such combinations. Examples in the era of genomic annotation include successful addition of a FLT3 inhibitor to 7 + 3 or the use of a nucleoside analogue in combination with cytarabine and an anthracycline [8, 9]. Vorinostat is a histone deacetylase (HDAC) inhibitor that has been shown to have single-agent antileukemia activity and to synergize with hypomethylating agents and, in a sequence-specific fashion, with anthracyclines [10,11,12]. A phase II trial performed at a single institution combining IA with vorinostat (IA + V) reported a high response rate in a cohort of patients with poor-risk characteristics [7]. Of importance, patients with core binding factor abnormalities (CBFs) were excluded from these trials; since re-approval of gemtuzumab ozogamicin (GO) in 2017, induction with the 7 + 3 + GO combination has been recommended for these patients [3].

In view of this significant activity of IA + V, a multicenter randomized phase 3 clinical trial was designed to compare standard “7 + 3” with IA, with or without vorinostat, for younger patients with AML. SWOG S1203 was thus designed to answer two important questions: whether a higher dose of cytarabine improves outcomes when compared to the standard dose as administered in 7 + 3, and whether the addition of vorinostat improves outcomes when compared to IA. An additional goal of this trial was to study the success rate of transplanting high-risk patients in first complete remission in the context of a large randomized clinical trial. The results for this additional goal were reported in a separate manuscript [13].

Methods

SWOG S1203 was a US National Cancer Institute (NCI) study led by SWOG, with accrual from sites throughout the National Clinical Trials Network (NCI NCTN). The study was registered in ClinicalTrials.gov as NCT01802333 and was approved at all participating centers following institutional guidelines. The protocol may be found in the Supplementary Information.

Cytogenetic analysis was performed by local laboratories used by participating institutions. Cytogenetic risk stratification was based on updated SWOG criteria [14,15,16,17]. Favorable-risk cytogenetics was defined as the presence of a clonal abnormality of t(8;21) or inv(16)/t(16;16). High-risk cytogenetics was defined as the presence of clonal aberrations of abn3q26 [inv(3)/t(3;3)], del(5q)/−5, del(7q)/−7, 11q23 rearrangement [except t(9;11)], del(17p), t(6;9), t(9;22), complex karyotype (at least 3 unrelated abnormalities), or monosomal karyotype (either loss of 2 different chromosomes or loss of 1 chromosome together with a structural chromosome abnormality other than add, ring, and mar). Intermediate-risk cytogenetics included all results from an informative study without any of the favorable- or high-risk clonal aberrations. Patients without an informative study, either because karyotype analysis failed due to non-proliferation/no metaphase available or because a limited study showed a normal karyotype based on fewer than 20 metaphases analyzed, were categorized as “unknown cytogenetics”.
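The risk groups above amount to a small set of rules, which can be sketched as follows. This is only an illustration, not the SWOG coding scheme: the string labels, the function name `cytogenetic_risk`, and the `informative` flag are hypothetical, and the caller is assumed to have already determined complex and monosomal karyotype. The text does not state how a karyotype carrying both favorable and high-risk markers is classified, so the sketch declines to assign a group in that case.

```python
# Hypothetical labels chosen for illustration; not the SWOG coding scheme.
FAVORABLE = {"t(8;21)", "inv(16)", "t(16;16)"}
HIGH_RISK = {
    "inv(3)", "t(3;3)", "del(5q)", "-5", "del(7q)", "-7",
    "11q23 rearrangement",  # caller must not use this label for t(9;11)
    "del(17p)", "t(6;9)", "t(9;22)",
    "complex",    # at least 3 unrelated abnormalities
    "monosomal",  # monosomal karyotype as defined in the text
}


def cytogenetic_risk(abnormalities, informative=True):
    """Map a set of flagged clonal abnormalities to a risk group."""
    if not informative:
        # Failed karyotype, or normal karyotype on fewer than 20 metaphases.
        return "unknown"
    found = set(abnormalities)
    has_favorable = bool(found & FAVORABLE)
    has_high = bool(found & HIGH_RISK)
    if has_favorable and has_high:
        # The text does not say which group applies; refuse to guess.
        raise ValueError("favorable and high-risk markers coexist")
    if has_favorable:
        return "favorable"
    if has_high:
        return "high"
    # Informative study with none of the listed aberrations.
    return "intermediate"
```

For example, `cytogenetic_risk({"t(9;11)"})` falls through to the intermediate group, because t(9;11) is explicitly excepted from the high-risk 11q23 rearrangements.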

Eligibility

Patients between 18 and 60 years old with AML by World Health Organization 2008 criteria and evidence of marrow involvement were eligible. Patients with acute promyelocytic leukemia (APL) or blastic phase of chronic myeloid leukemia were excluded. Patients could not have received therapy for AML or myelodysplastic syndromes. Other inclusion criteria included Zubrod performance status 3 or lower, evidence of cardiac ejection fraction of more than 44%, and no evidence of QTc prolongation (more than 500 ms). Patients with severe comorbidities were not eligible but patients with HIV, hepatitis B or C infection were allowed under specific conditions (see protocol in the Supplementary Information). Pregnant patients were not eligible and effective contraceptive methods were required. Informed consent was required to participate in the study and was obtained from all subjects.

Treatment

A detailed treatment schema is provided in the protocol document (Section 7 of the protocol in the Supplementary Information). In summary, therapy was divided into an induction phase followed by a consolidation phase. Treatment schemas during induction and consolidation are shown in the protocol document. The 7 + 3 arm used daunorubicin at 90 mg/m2 with infusional cytarabine 100 mg/m2 per day (DA). Because of potential shortages, daunorubicin could be replaced by idarubicin 12 mg/m2/day when needed. In the IA arms, idarubicin was used at a dose of 12 mg/m2 IV daily x 3 days and cytarabine at a dose of 1500 mg/m2 IV daily as a 24 h continuous infusion for 4 days. In the IA + V arm, vorinostat was used at a dose of 500 mg orally three times a day for 3 days on days 1 to 3 and IA started on day 4. For the 4 cycles of consolidation, in the DA arm cytarabine was used at a dose of 3000 mg/m2 IV over 3 hours twice a day on days 1, 3, and 5. In the IA and IA + V arms consolidation consisted of cytarabine 750 mg/m2 IV daily x 3 and idarubicin at 8 mg/m2 on days 1 and 2. Vorinostat was used at the same dose as during induction. Toxicities were assessed according to the Common Terminology Criteria for Adverse Events version 4.0 and therapy was dose adjusted following protocol guidelines.
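To make the difference in cytarabine exposure between the arms concrete, the per-cycle cumulative doses can be worked out from the figures above. This is a rough arithmetic sketch only: it assumes 7 days of cytarabine in the DA induction (as the regimen name “7 + 3” implies), ignores dose adjustments and re-induction, and uses a hypothetical helper name `cumulative_dose`.

```python
def cumulative_dose(dose_mg_m2, doses_per_day, days):
    """Cumulative cytarabine dose for one cycle, in mg/m2."""
    return dose_mg_m2 * doses_per_day * days


# Induction: DA gives 100 mg/m2/day for 7 days (assumed from "7 + 3");
# the IA arms give 1500 mg/m2/day for 4 days.
induction = {
    "DA": cumulative_dose(100, 1, 7),
    "IA": cumulative_dose(1500, 1, 4),
}

# Consolidation, per cycle: DA gives 3000 mg/m2 twice a day on days 1, 3, 5;
# the IA arms give 750 mg/m2 daily for 3 days.
consolidation = {
    "DA": cumulative_dose(3000, 2, 3),
    "IA": cumulative_dose(750, 1, 3),
}

for arm in ("DA", "IA"):
    print(arm, induction[arm], consolidation[arm])
```

Under these assumptions the IA arms deliver more cytarabine during induction (6000 versus 700 mg/m2), whereas the DA arm delivers more during each consolidation cycle (18,000 versus 2250 mg/m2).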

Study calendar

Follow-up during the study is detailed in the protocol document (Supplementary Information). Patients were required to have a bone marrow evaluation with cytogenetics within 28 days prior to registration. Assessment of cardiac ejection fraction was also required prior to study entry. In addition, physical examination, laboratory assessments and buccal swabs for stem cell donor search were required. Laboratory assessments were serially obtained during induction. Bone marrow was obtained for disease assessment at the end of induction prior to consolidation. Patients who did not achieve remission with the first cycle of induction were allowed to receive a modified cycle of re-induction. Patients with higher-risk cytogenetics and in remission were then assessed for alloSCT. An echocardiogram and electrocardiogram were also obtained at the end of induction. Patients were treated with all supportive care measures as needed including transfusions of blood products, use of prophylactic antibiotics and treatment of any complications such as neutropenic fever using local institutional guidelines.

Study design and statistical methods

The primary objective of the study was to compare event-free survival (EFS) in patients with AML who received DA versus IA or IA + V. A second primary objective was to determine the fraction of high-risk patients receiving alloSCT in first remission. Secondary objectives were to compare toxicities, disease-free survival (DFS), and overall survival among the three arms. Morphologic response (including complete remission [CR] and CR with incomplete hematologic recovery [CRi, ANC < 1000/mcl or platelets < 100,000/mcl]) definitions followed the contemporary consensus review definitions [18]. EFS was defined for all patients and was measured from the date of randomization to the first of the following events: death from any cause, relapse from remission (CR or CRi), or completion of protocol induction/re-induction therapy without documentation of CR or CRi; patients last known to be alive in CR or CRi were censored at the date of last contact. DFS was defined for patients who achieved CR or CRi with protocol induction/re-induction and was measured from date of CR or CRi to the first day of relapse from CR or CRi or death from any cause; patients last known to be alive in CR or CRi were censored at the date of last contact. Overall survival was defined for all patients and was measured from the date of randomization to the date of death due to any cause; patients last known to be alive were censored at the date of last contact.
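The EFS definition above can be expressed as a short derivation from per-patient dates. The following is a minimal sketch under stated assumptions, not the study's analysis code: the function and argument names are hypothetical, and `induction_failure` stands for the date protocol induction/re-induction was completed without documented CR or CRi.

```python
from datetime import date


def event_free_survival(randomized, last_contact, death=None,
                        relapse=None, induction_failure=None):
    """Return (days_from_randomization, event_observed) for EFS.

    The event date is the earliest of death, relapse from CR/CRi, or
    completion of induction/re-induction without CR/CRi. Patients with
    none of these are censored at the date of last contact.
    """
    event_dates = [d for d in (death, relapse, induction_failure)
                   if d is not None]
    if event_dates:
        return (min(event_dates) - randomized).days, True
    return (last_contact - randomized).days, False


# Made-up patient: relapse precedes death, so relapse is the EFS event.
print(event_free_survival(date(2014, 1, 1), date(2016, 1, 1),
                          relapse=date(2014, 7, 1),
                          death=date(2015, 1, 1)))
```

DFS and overall survival follow the same pattern with a different start date (date of CR or CRi for DFS) and a different set of qualifying events (death only for overall survival).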

The full details of the statistical design are provided in Section 11 of the protocol document. The design specified 1:1:1 randomization between the three arms and specified up to 5 interim analyses. Fisher’s exact test and the Wilcoxon rank-sum test were used to assess differences in categorical and quantitative variables across the arms. EFS, DFS, and overall survival (OS) were estimated using the Kaplan-Meier method. Cox models were used for multivariable regression modeling of EFS, DFS, and OS.
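For readers less familiar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines of plain Python. This is a didactic illustration with made-up toy data, not the software or data used in the study; the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up time for each patient
    events : True if the event was observed, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_leaving += 1
            i += 1
        if n_events:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t.
        at_risk -= n_leaving
    return curve


# Toy data (months); False marks a censored observation.
toy_times = [1, 2, 2, 3, 4]
toy_events = [True, True, False, True, False]
toy_curve = kaplan_meier(toy_times, toy_events)
```

With the toy data, the estimate steps down at months 1, 2, and 3, and the censored patients contribute to the risk set only up to their last contact.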

Results

Study population

Between April 2013 and November 2015, 754 patients were registered to the study. Accrual to IA + V was stopped on June 1, 2015 due to IA + V crossing a futility threshold at the second interim analysis. The study completed accrual to the DA and IA arms in November 2015. The IA arm crossed a futility threshold at the fourth interim analysis and the trial was closed to further accrual. The results of the study were released in spring of 2016. Sixteen patients were found to be ineligible; therefore, a total of 738 eligible patients are summarized in the following analyses. A CONSORT diagram is shown in Fig. 1. The DA and IA arms each had 261 eligible patients and the IA + V arm had 216. Patient characteristics are shown in Table 1. Median age was 49 years (min 18, max 60) and 49% of the patients were female. Median white blood cell count (WBC) was 10.8 × 10⁹/L (min 0.3, max 800). Ninety-six patients (13%) had favorable-, 457 (63%) intermediate-, and 159 (22%) high-risk cytogenetics. FLT3-ITD mutation was present in 120 (16%) patients and NPM1 mutation was present in 152 (21%). There were no statistically significant differences in the distribution of patient characteristics among the treatment arms (Table 1).

Fig. 1 CONSORT diagram.

Table 1 Patient summary by treatment arm. Median (range) and N (%) reported.

Toxicities

The overall mortality rates at days 30 and 60 for the whole study population were 4% and 7%, respectively. With DA, day 30 and 60 mortality rates were 3% and 5%, respectively, whereas with IA they were 6% and 9% and with IA + V they were 4% and 9%. There was no significant difference in 30- and 60-day mortality among the 3 arms (p = 0.15 and 0.08, respectively, Supplemental Table S1). The IA + V arm had higher rates of grade 3 diarrhea compared to the other two arms (6% on DA, 8% on IA, 18% on IA + V, p < 0.001) and higher rates of grade 3 typhlitis (3% on DA, 5% on IA, 10% on IA + V, p = 0.002). Rates of grade 4 and 5 GI toxicities were low: one patient on the DA arm had grade 5 typhlitis, one patient on the IA + V arm had grade 4 diarrhea, and another patient on the IA + V arm had grade 4 typhlitis. Grade 4 and 5 sepsis rates were higher on the IA and IA + V arms compared to DA (grade 4 rates: 6% on DA, 7% on IA, 13% on IA + V, p = 0.002; grade 5 rates: 0% on DA, 3% on IA, 3% on IA + V, p = 0.03; p-value for grade 4 + 5 = 0.005). Other toxicities were similar across the arms (Table 2). The study did not collect data on time to hematopoietic recovery, but among patients who achieved a CR or CRi on the first induction cycle, the time from the day of CR/CRi to the start of consolidation was longer on the DA arm compared to the IA and IA + V arms (median 18 days on DA, 12 days on IA, 13 days on IA + V, p < 0.001).

Table 2 Toxicity during induction by treatment arm: number of patients with a given type and grade of adverse event. N (%) reported.

Response

Results are shown in Table 3. There were no differences in response rates between any of the arms. CR was documented in 461 patients (62%) and CRi in an additional 111 patients (15%), for an overall response rate (ORR) of 78%. The median time to response (25th, 75th percentiles) in the DA, IA, and IA + V arms was 27 (14, 35), 32 (27, 35), and 31 (28, 37) days, respectively (p < 0.001). There were no differences in response in subsets defined by cytogenetic risk, FLT3-ITD, NPM1 mutations, and age (18–39 and 40–60) (all p > 0.23).

Table 3 Response summary by arm. N (%) reported.

Event-free survival

With a median follow-up of 33 months among patients without events, there were no significant differences in EFS between any of the arms (all p > 0.38, Fig. 2). All three arms had plateaued between 2 and 3 years after randomization; 2-year EFS in the DA, IA, and IA + V arms was 36%, 41%, and 37%, respectively; and 3-year EFS was 35%, 37%, and 31% respectively. In subsets defined by cytogenetic risk, FLT3-ITD, NPM1 mutation status, and age (18–39 and 40–60) there were no significant differences (all p > 0.20) except in the favorable cytogenetic risk subset, in which the IA and IA + V arms had significantly worse EFS compared to the DA arm (DA versus IA p = 0.008, DA versus IA + V p = 0.006, Fig. 3).

Fig. 2 Event-free survival and overall survival by treatment arm.

Fig. 3 Event-free survival and overall survival among cytogenetic and molecular subgroups.

Overall survival

With a median follow-up of 33 months among patients still alive, there were no significant differences in OS between any of the arms (all p > 0.29, Fig. 2). Two-year OS in the DA, IA, and IA + V arms was 57%, 59%, and 57%, respectively; 3-year OS was 50%, 52%, and 48%, respectively. In subsets defined by cytogenetic risk, FLT3-ITD mutation, NPM1 mutation, and age (18–39 and 40–60) there were no significant differences (all p > 0.13), including among patients with favorable-risk cytogenetics, though the comparison in this subgroup had limited power due to the limited number of events (IA versus DA hazard ratio [HR] = 2.32, p = 0.11; IA + V versus DA HR = 1.94, p = 0.19; Fig. 3).

Comparison with historical data

This study was designed based on assumptions from a prior trial in SWOG, S0106, which used DA as the control arm. We used Cox regression models to evaluate differences in survival between the studies controlling for age, gender, performance status, prestudy WBC, platelets, marrow blasts, blood blasts, hemoglobin, secondary AML status, cytogenetic risk, and NPM1/FLT3 mutation status. There were no significant differences in OS between the DA arm of S0106 and the S1203 DA and IA arms (p = 0.31 and 0.78 respectively); the OS on the IA + V arm was significantly worse than OS on the S0106 DA arm (HR = 1.42, p = 0.019).

Discussion

SWOG S1203 was designed with the aim of improving outcomes in younger patients with non-APL AML. This was a cytogenetic and molecular agnostic study, meaning that therapy was not adapted to potentially targetable cytogenetic or molecular alterations (e.g., FLT3 mutations or presence of CBF abnormalities). In this study, the control arm was DA, still the standard of care but incorporating higher doses of daunorubicin, an approach that resulted in improvement in survival in younger patients [6]. The experimental arms consisted of idarubicin with high doses of cytarabine, alone or in combination with the HDAC inhibitor vorinostat.

In the overall study population, no improvement in outcomes (EFS, OS, or response) was observed when comparing the IA arms with DA. There was an increase in drug-related toxicities with the IA + V arm. These data indicate that there is no advantage to using a high dose cytarabine-based induction compared to standard DA in younger patients with AML. That said, there is no evidence either that DA is superior to any of the other arms. When comparing IA with IA + V, there was no evidence that the addition of vorinostat improved outcomes, and therefore there is no justification to add vorinostat to IA.

AML is a heterogeneous disorder that includes distinct molecular and cytogenetic alterations. These molecular alterations are associated with differences in outcomes. The major groups analyzed here included patients with specific molecular alterations (NPM1 and FLT3 mutations) and cytogenetic alterations (poor, intermediate, and favorable risk). No differences in outcomes were observed when comparing DA versus the IA arms except for the subset of patients with favorable risk cytogenetics, with CBF alterations. This subset included 96 (13%) patients that were evenly distributed in each arm. Unfortunately, treatment of this subset of patients underscores the potential complexity of selecting initial treatment of AML based on genetic data as a large majority of centers during the time of the study did not have access to genetic reports until after therapy was already initiated. For instance, at the institution where the IA programs were developed, MD Anderson Cancer Center, investigators had access to cytogenetic and molecular results in the first 2 to 3 days after initial evaluation of patients prior to any therapy, thus allowing optimal therapy stratification.

S1203 was designed in 2010-2011 before systematic data on mutation profiles and targeted agents were available. Recent reports indicate that the addition of a FLT3 inhibitor has resulted in improved outcomes in patients with FLT3 mutated AML and the addition of GO to standard induction chemotherapy may improve outcomes in patients with favorable cytogenetics [8, 19].

In summary, the data presented from this large multicenter trial indicate that standard 7 + 3 regimens and IA produce similar outcomes in younger patients with AML, except for those with CBF alterations, who clearly need a higher dose cytarabine-based consolidation approach. The addition of vorinostat did not improve the activity of IA.