Introduction

The addition of trastuzumab substantially improves the efficacy of neoadjuvant chemotherapy in HER2-positive early breast cancer patients and has become a core component of standard neoadjuvant regimens [1, 2]. Lapatinib, a small-molecule dual tyrosine kinase inhibitor of HER2 and EGFR, demonstrated non-cross-resistance with trastuzumab in pre-clinical studies and activity in women with HER2-positive metastatic breast cancer who progressed on trastuzumab [3, 4, 5]. A Phase III trial comparing the combination of lapatinib and capecitabine with capecitabine alone in women with progressive, locally advanced or metastatic HER2-positive breast cancer previously treated with an anthracycline, a taxane and trastuzumab demonstrated a significant improvement in median time to progression and a trend toward improvement in overall survival (OS) [6, 7].

NSABP B-41 is a 3-arm randomized clinical trial of neoadjuvant therapy in HER2-positive early breast cancer in which patients received 4 cycles of doxorubicin/cyclophosphamide (AC) followed by weekly paclitaxel (WP) administered with trastuzumab (T), lapatinib (L) or the combination of trastuzumab and lapatinib (TL). Following surgery, all patients received adjuvant trastuzumab to complete a year of HER2-targeted therapy. Detailed methods, the primary endpoint of pathological complete response (pCR) rates, and toxicities have been previously reported [8]. The associations of intrinsic subtypes with pCR, event-free survival (EFS) and OS for a subset of B-41 patients with available tissue samples have also been reported [9, 10]. We now present long-term outcomes for the specified secondary endpoints of recurrence-free interval (RFI) and OS and a non-prespecified endpoint of EFS for the entire cohort. We also present the correlation of RFI and OS with pCR versus non-pCR status and exploratory analyses based on hormone receptor status in each of the treatment groups.

Methods

Study design and patients

Eligible patients had operable HER2-positive breast cancer, were aged ≥ 18 years and had an ECOG performance status of 0 or 1. Additional inclusion criteria were: breast tumor at least 2 cm by palpation; clinical stage T2 to T3, N0 to N2a disease; diagnosis by core needle biopsy; tumor with HER2 gene amplification by fluorescence in situ hybridization (FISH) or chromogenic in situ hybridization (CISH), or a strongly positive (3+) staining score by immunohistochemistry; and left ventricular ejection fraction of 50% or higher by multiple-gated acquisition scan or echocardiogram.

The trial was approved by local human investigations committees or institutional review boards in accordance with assurances filed with and approved by the US Department of Health and Human Services. Written informed consent was required.

Randomization and masking

Patients were randomly assigned in a 1:1:1 ratio to receive an initial four cycles of neoadjuvant AC, followed by WP for 12 doses combined with either T, L or TL. Stratification factors included clinical tumor size (2.0–4.0 cm vs > 4.0 cm); clinical nodal status (negative vs positive); hormone receptor status (estrogen receptor [ER]-positive or progesterone receptor [PR]-positive, vs ER-negative and PR-negative); and age (< 50 years vs ≥ 50 years). To avoid extreme inequality in treatment assignment within an institution, we applied an adaptive randomization scheme that used a biased-coin algorithm [11]. Treatment assignment was done via an online program maintained by the NSABP Biostatistical Center, and neither the patient nor the participating site could know the next assignment in advance. Neither patients nor treating physicians were masked as to treatment assignment. Baseline patient characteristics and subsequent follow-up data were entered into an NSABP data server remotely by trained staff at investigation sites.
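As a rough illustration of the idea behind biased-coin randomization, the sketch below favors whichever arms are currently under-represented at an institution. It is not the NSABP algorithm: the favoring probability `p_favor`, the three-arm generalization and the function names are assumptions made for illustration only, and stratification is omitted.

```python
import random
from collections import Counter

ARMS = ("T", "L", "TL")


def biased_coin_assign(site_counts, rng, p_favor=2 / 3):
    """Pick an arm, favoring arms that lag behind at this site.

    p_favor is a hypothetical tuning value, not taken from the trial.
    """
    fewest = min(site_counts[a] for a in ARMS)
    lagging = [a for a in ARMS if site_counts[a] == fewest]
    # If the site is balanced, or the coin lands on the "unbiased" side,
    # fall back to simple randomization across all three arms.
    if len(lagging) == len(ARMS) or rng.random() >= p_favor:
        return rng.choice(ARMS)
    return rng.choice(lagging)


def simulate_site(n_patients, seed=0):
    """Assign n_patients at one site and return the per-arm counts."""
    rng = random.Random(seed)
    counts = Counter({a: 0 for a in ARMS})
    for _ in range(n_patients):
        counts[biased_coin_assign(counts, rng)] += 1
    return counts


site = simulate_site(30)
print(dict(site))
```

Because the next assignment still depends on a random draw, neither the site nor the patient can predict it with certainty, while large imbalances within a site become less likely than under simple randomization.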

Procedures

The trial was activated on July 16, 2007. Eligible patients were randomly assigned to one of three treatment regimens. All patients received four cycles of standard AC every 3 weeks, followed by 12 doses of WP (80 mg/m²) with concurrent HER2-directed therapy. The T group received trastuzumab at a loading dose of 4 mg/kg followed by 2 mg/kg weekly with WP until surgery. The L group received lapatinib 1250 mg orally daily with WP until surgery. The TL group received weekly trastuzumab combined with lapatinib 750 mg orally daily, administered with WP until surgery. All patients initiated adjuvant trastuzumab (6 mg/kg) every 3 weeks after surgery until completion of 1 year of targeted therapy. Decisions on chest wall and regional nodal irradiation were left to the investigator's discretion, and patients with hormone receptor-positive tumors were to receive a minimum of 5 years of adjuvant endocrine therapy, with the choice of therapy at the investigator's discretion. Additional details can be found in the publication of the primary results [8].

The primary protocol-specified endpoint was pathological complete response of the breast, defined as no histological evidence of invasive tumor cells in the breast specimen removed at surgery. The descriptive secondary endpoints reported here include two pre-specified endpoints: recurrence-free interval (RFI) and OS. EFS was not specified in the protocol but has become a standard endpoint in neoadjuvant trials and is therefore also provided in this report.

Recurrence-free interval was defined as time from surgery to local, regional or distant recurrence. Patients who developed inoperable progressive disease during neoadjuvant treatment were considered as having recurrence on Day 0. Second primary cancers were neither censored nor events, and deaths due to causes other than breast cancer were censored at time of death. There was one ineligible T4 patient who presented with synchronous bone metastasis at randomization and another patient with distant metastasis prior to breast surgery. Their time to recurrence was defined as Day 1.

OS was defined as time from randomization until death due to any cause. EFS was defined as time from randomization to progression preventing surgery, first local or regional recurrence after surgery, distant recurrence, second primary cancer or death due to any cause.

The study was designed to follow patients for disease recurrence and survival for 5 years after study entry. For each time-to-event endpoint, the Kaplan–Meier estimates of the percentages of patients free from events (at 5 years for OS and EFS, and at 4.5 years for RFI) were compared among the three treatment regimens, with the Greenwood formula used to estimate the corresponding standard errors [12]. Inference on these tests followed a step-down procedure to adjust for multiple testing and to control the overall family-wise error rate at 0.05 [13]. The maximum of the absolute values of the three pairwise test statistics was compared with the 99.2th (= 100 × [1 − 0.025/3]) percentile of the Gaussian distribution. If the threshold was crossed, then the next two test statistics would be compared with the 97.5th percentile of the Gaussian distribution; otherwise, they would be compared with the 98.75th and 97.5th percentiles, respectively. The stratified log-rank test was used to compare the distribution of RFI, OS and EFS among the three treatment arms. Multivariate Cox proportional hazards models were used to assess treatment efficacy in terms of hazard ratios after adjusting for the stratification factors. The statistical analyses were done with SAS/STAT version 9.4 and R version 3.4. This study is registered with ClinicalTrials.gov, number NCT00486668.
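The quantities named in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the trial's SAS/R analysis code: the function names and toy data are hypothetical, and the pairwise statistic assumes independent arms and a plain (untransformed) difference of Kaplan–Meier estimates.

```python
from math import sqrt
from statistics import NormalDist


def km_greenwood(times, events, t):
    """Kaplan-Meier estimate S(t) and its Greenwood standard error.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    """
    surv, var_sum = 1.0, 0.0
    event_times = sorted({u for u, e in zip(times, events) if e and u <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)
        d = sum(1 for x, e in zip(times, events) if e and x == u)
        surv *= 1 - d / at_risk
        if at_risk > d:  # term is undefined once everyone at risk has failed
            var_sum += d / (at_risk * (at_risk - d))
    return surv, surv * sqrt(var_sum)


def pairwise_z(s1, se1, s2, se2):
    """Standardized difference between two independent K-M estimates."""
    return (s1 - s2) / sqrt(se1 ** 2 + se2 ** 2)


def gaussian_critical_values(alpha=0.025, m=3):
    """Critical values at the 1 - alpha/m, ..., 1 - alpha/1 quantiles."""
    nd = NormalDist()
    return [nd.inv_cdf(1 - alpha / k) for k in range(m, 0, -1)]


# Toy data (hypothetical, not trial data)
s, se = km_greenwood([1, 2, 3, 4, 5], [1, 0, 1, 0, 1], t=3)
crit = gaussian_critical_values()
print(round(s, 4), round(se, 4), [round(c, 3) for c in crit])
```

With `alpha=0.025` and `m=3`, the three quantiles are 1 − 0.025/3 ≈ 0.9917, 0.9875 and 0.975, which correspond to the 99.2th, 98.75th and 97.5th percentiles named in the text.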

GlaxoSmithKline (GSK) provided lapatinib to all study sites as well as trastuzumab in Canada along with funding support. GSK provided input on the study design, but did not participate in data collection, data analysis, data interpretation, or writing of the report. The authors had full access to all the data and had final responsibility for the decision to submit for publication.

Results

Between August 27, 2007, and June 30, 2011, 529 patients were enrolled in the study. Seven patients withdrew from the study shortly after enrollment and did not provide follow-up data (CONSORT, Supplementary Fig. S1). Among the 522 patients with follow-up data included in the analysis, 179 were on the WP plus T arm, 171 were on the WP plus L arm and 172 were on the WP plus TL arm. Three patients had only survival data reported and are included only in the OS analysis (CONSORT, Supplementary Fig. S1). Characteristics of the patients included in the follow-up analyses were balanced across treatment groups (Table 1). Median follow-up for OS was 5.1 years (IQR 4.9–5.3), with similar follow-up among patients on the three treatment arms (log-rank p = 0.53). Median follow-up for RFI was 4.4 years (IQR 4.0–4.6). Median follow-up for EFS was 5 years (IQR 4.7–5.2). A total of 90 EFS events were observed: 27 on the WP plus T arm, 38 on the WP plus L arm, and 25 on the WP plus TL arm. The numbers of EFS events by treatment arm and site are presented in Supplementary Table S1.

Table 1 Patient characteristics: NSABP B-41

Robidoux et al. previously reported the proportions of breast pCR (ypT0/is) for the three treatment arms (52.5% for the T arm, 53.2% for the L arm and 62% for the TL arm); the differences did not achieve statistical significance.

Among the patients with hormone receptor-positive tumors, the proportions of breast pCR were 46.7%, 48% and 55.6% for the T, L and TL arms, respectively; among the patients with hormone receptor-negative tumors, they were 65.5%, 60.6% and 73%.

Recurrence-free interval (4.5 years) from surgery

The hazard ratio (HR) for RFI was 0.70 (95% CI 0.37–1.32, log-rank p = 0.37) for a comparison of the TL arm with the T arm and 1.37 (95% CI 0.80–2.34, log-rank p = 0.34) for a comparison of the L arm with the T arm. The Kaplan–Meier (K-M) estimates of the proportion of patients free from recurrence 4.5 years after surgery were 89.4% (95% CI 83.5–93.3%) in the TL arm, 87.2% (95% CI 81.1–91.3%) in the T arm, and 79.4% (95% CI 71.9–85%) in the L arm. The maximum of the absolute values of the three pairwise standardized test statistics that compare these three K-M estimates was 2.45 (one-sided p = 0.007), which was larger than the 99.2th percentile of the Gaussian distribution. The next test statistic was 1.86, which was smaller than the 97.5th percentile of the Gaussian distribution. Thus, although the proportion of patients who were recurrence-free at 4.5 years was significantly higher for the TL arm than for the L arm, the other two pairwise comparisons (TL vs T and L vs T) showed no statistically significant differences. The p-value of the stratified log-rank test on the equivalence in RFI of the three arms was 0.08 (Fig. 1a).
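The arithmetic behind this step-down decision can be checked directly from the two reported test statistics. The snippet below is a sketch that uses only the values quoted in the text (2.45 and 1.86) and the standard normal distribution; the variable names are illustrative.

```python
from statistics import NormalDist

nd = NormalDist()

z_max, z_next = 2.45, 1.86  # test statistics reported in the text

# First step: largest statistic against the 1 - 0.025/3 quantile
crit_first = nd.inv_cdf(1 - 0.025 / 3)
p_one_sided = 1 - nd.cdf(z_max)
p_two_sided = 2 * p_one_sided

# Second step (first threshold crossed): next statistic against the 0.975 quantile
crit_next = nd.inv_cdf(0.975)

print(f"step 1: {z_max} vs {crit_first:.3f}, one-sided p = {p_one_sided:.3f}")
print(f"step 2: {z_next} vs {crit_next:.3f}")
```

The first comparison exceeds its threshold and the second does not, which matches the conclusion that only the TL vs L difference was statistically significant.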

Fig. 1 Kaplan–Meier estimates of recurrence-free interval by treatment arms: NSABP B-41. a Overall, b Among hormone receptor-positive patients, c Among hormone receptor-negative patients

RFI was also compared among the three arms according to hormone receptor status. Among 330 patients with hormone receptor-positive tumors, 91.8% of patients in the TL arm were free of recurrence at 4.5 years, compared with 89.1% in the T arm and 83.4% in the L arm. The HR for comparison of the TL arm with the T arm was 0.76 (95% CI 0.31, 1.81, log-rank p = 0.38), and the HR for comparison of the L arm with the T arm was 1.39 (95% CI 0.68, 2.86, log-rank p = 0.28) (Fig. 1b).

Among 189 patients with hormone receptor-negative tumors, the proportions of patients who were recurrence-free at 4.5 years were 85.2%, 82.4% and 73.9% for arms TL, T and L, respectively. Compared with the T arm, the HRs were 0.65 (95% CI 0.26, 1.67) for the TL arm and 1.28 (95% CI 0.56, 2.89) for the L arm, with log-rank p-values of 0.74 and 0.29, respectively (Fig. 1c).

RFI was also compared among the three treatment arms according to the other stratification factors: clinical nodal status (negative vs positive), clinical tumor size (2–4 cm vs > 4 cm) and age at randomization (< 50 years vs ≥ 50 years). In all patient subgroups except those with 2–4 cm tumors, arm TL was associated with the highest percentage free from recurrence, while arm L was associated with the lowest percentage (Supplementary Fig. S2). However, none of the differences were statistically significant.

The multivariate Cox PH model showed that larger tumors (p < 0.001), hormone receptor-negative tumors (p = 0.02) and positive nodes (p = 0.06) were associated with a higher risk of cancer recurrence (Table 2).

Table 2 Multivariate Cox proportional hazard models: NSABP B-41

Overall survival

For patients on the TL and T arms, the 5-year OS was excellent at 95.8% (95% CI 91.3–98.0%), and 94.8% (95% CI 90.2–97.3%), respectively. Patients on the L arm had a lower 5-year OS of 89.1% (95% CI 83.2–93%) (Fig. 2a). However, the p-value for the stratified log-rank test on equivalence in OS among the three arms was 0.09. OS was also compared according to all stratification factors (Supplementary Fig. S3). In all patient subgroups, except for the hormone receptor-negative cohort, the HR favored arm TL relative to arm T. Of interest, all 108 women with hormone receptor-positive tumors on arm TL were alive at their last follow-up. In the comparison of arm L vs arm T, arm L was associated with inferior OS relative to arm T in most of the subgroups (Supplementary Fig. S3). However, none of those differences were statistically significant except among the women with hormone receptor-positive tumors (log-rank p = 0.05).

Fig. 2 Kaplan–Meier estimates by treatment arms: NSABP B-41. a Overall survival, b Event-free survival

Results from the multivariate Cox PH model showed that larger tumors (p = 0.02) and hormone receptor-negative tumors (p < 0.001) were associated with higher mortality (Table 2).

Event-free survival

The 5-year EFS was 84.2% (95% CI 77.5–89.1%), 84.7% (95% CI 78.4–89.4%) and 76.7% (95% CI 69.2–82.6%) for patients on arms TL, T and L, respectively. Compared with arm T, the HRs for EFS were 0.92 (95% CI 0.53, 1.59) for arm TL and 1.40 (95% CI 0.85, 2.31) for arm L (Fig. 2b). The differences were not statistically significant (stratified log-rank p = 0.25).
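Hazard ratio confidence intervals from a Cox model are typically symmetric on the log scale, so a reported HR should sit close to the geometric mean of its interval bounds. The sketch below applies this check to the two EFS hazard ratios above and backs out an approximate standard error and Wald statistic. The helper name `wald_from_ci` is hypothetical, and the resulting Wald p-values are approximations that need not equal the log-rank p-values reported in the paper.

```python
from math import log, sqrt
from statistics import NormalDist


def wald_from_ci(hr, lower, upper, level=0.95):
    """Approximate SE of log(HR), Wald z, two-sided p and CI midpoint."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - (1 - level) / 2)
    se = (log(upper) - log(lower)) / (2 * z_crit)  # CI half-width on log scale
    z = log(hr) / se
    p_two_sided = 2 * (1 - nd.cdf(abs(z)))
    centre = sqrt(lower * upper)  # geometric mean of the bounds
    return se, z, p_two_sided, centre


# EFS hazard ratios vs arm T, as reported in the text
for label, hr, lower, upper in [("TL vs T", 0.92, 0.53, 1.59),
                                ("L vs T", 1.40, 0.85, 2.31)]:
    se, z, p, centre = wald_from_ci(hr, lower, upper)
    print(f"{label}: SE(log HR) = {se:.3f}, z = {z:.2f}, "
          f"Wald p = {p:.2f}, CI midpoint = {centre:.2f}")
```

Both intervals include 1, so the approximate Wald p-values exceed 0.05, in line with the non-significant result reported for EFS.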

Pathological complete response

pCR in the breast was associated with a better prognosis. The HR for RFI was 0.42 (95% CI 0.26, 0.68, log-rank test p < 0.0003) (Fig. 3a). The K-M estimates of freedom from recurrence at 4.5 years were 90.7% (95% CI 86.5%, 93.6%) among the 287 patients with pCR in the breast and 78.9% (95% CI 72.8%, 83.8%) for the 232 patients with residual invasive disease in the breast at surgery. The HRs for OS and EFS were 0.26 and 0.47, respectively, with log-rank test p < 0.001 for both endpoints (Fig. 3b and c).

Fig. 3 Kaplan–Meier estimates by pCR status: NSABP B-41. a Recurrence-free interval, b Overall survival, c Event-free survival

An exploratory analysis of RFI by breast pCR status was performed according to hormone receptor status. For patients with hormone receptor-positive tumors, the HR for pCR vs non-pCR was 0.59 (95% CI 0.31, 1.14, p = 0.11), a difference that was not statistically significant. The proportion of patients who were recurrence-free at 4.5 years was 91.4% for patients with pCR compared with 85.3% for patients with residual invasive disease (Supplementary Fig. S4). In patients with hormone receptor-negative tumors, the HR for RFI for pCR vs non-pCR was 0.23 (95% CI 0.12, 0.47, p < 0.001), with the proportion free from recurrence at 89.6% for patients with pCR in the breast vs 62.2% for those with residual invasive disease in the breast (Supplementary Fig. S5). Among patients with hormone receptor-positive tumors, the K-M estimate of 5-year OS was favorable irrespective of pCR status: 98.1% for patients with pCR in the breast and 95.0% for those with residual invasive disease (p = 0.09; Supplementary Fig. S6). However, among patients with hormone receptor-negative tumors, there was a marked difference in OS, with K-M estimates of 5-year OS of 95.8% for patients with pCR and only 71.7% for non-pCR patients (p < 0.001; Supplementary Fig. S7).

Discussion

The B-41 study showed similar pCR rates with the substitution of L (53.2%) for T (52.5%) and a numerically higher, but not statistically significant pCR rate of 62% with the combination of TL administered with WP following doxorubicin and cyclophosphamide in the neoadjuvant setting for HER2-positive early breast cancer. All patients were to receive adjuvant trastuzumab to complete 1 year of HER2-directed therapy, and patients with hormone receptor-positive disease were to receive a minimum of 5 years of endocrine therapies with choice of therapy at investigator’s discretion. With 4.5 years of follow-up after surgery, there was no statistically significant difference in RFI between the combination of TL compared to T (89.4% vs 87.2% free of recurrence), or between L compared to T (79.4% vs 87.2% free of recurrence). However, RFI at 4.5 years was statistically significantly higher in the TL group relative to the L group (89.4% vs 79.4%) (two-sided p = 0.014). For patients on the TL and T arms, the 5-year OS was excellent at 95.8% (95% CI 91.3–98.0%), and 94.8% (95% CI 90.2–97.3%), respectively. Patients on the L arm had a numerically lower 5-year OS of 89.1% (95% CI 83.2–93%), but the differences among the arms were not statistically significant.

The B-41 study was one of several phase III studies, each designed to assess differences in pCR, that evaluated lapatinib as an alternative to trastuzumab or in combination with trastuzumab as a component of neoadjuvant therapy for HER2-positive early breast cancer. Following demonstration of benefit with adding lapatinib to capecitabine in the second-line setting of metastatic HER2-positive breast cancer [6] and preclinical work [14] suggesting dual HER2-targeted therapy might be more effective than monotherapy, there was strong interest in evaluating lapatinib as an alternative to trastuzumab or in combination with trastuzumab in early breast cancer. The CALGB 40601 [15] trial administered preoperative weekly paclitaxel with T, L, or the combination (TL), followed by surgery with adjuvant administration of AC. All patients were to complete 1 year of HER2-directed therapy with adjuvant trastuzumab. The pCR rates in the breast were 46% in the T cohort, 32% in the L cohort, and 56% in the TL cohort, but these differences were not statistically significant [15]. With 7 years of follow-up, the TL cohort had a significant improvement in RFS and OS compared with the T cohort (RFS HR, 0.32; 95% CI 0.14–0.71; p = 0.005; OS HR, 0.34; 95% CI 0.12–0.94; p = 0.037), while there were no differences between the T and L cohorts, consistent with long-term benefit with dual HER2-targeted therapy. The 7-year RFS rates were 79%, 69% and 93% for the T, L and TL cohorts, respectively, and the 7-year OS rates were 88%, 84% and 96%, respectively [16].

In the NeoALTTO trial [17], patients also received trastuzumab (T), lapatinib (L), or the combination of lapatinib and trastuzumab (TL) with weekly paclitaxel prior to surgery. Following surgery, 3 cycles of FEC were administered along with the assigned neoadjuvant HER2-directed therapy, which was continued to complete a year of HER2-directed therapy. The pCR rate in the breast was significantly higher in the TL cohort (51.3%) compared with the T cohort (29.5%) (p = 0.0001), while there was no significant difference between the L cohort (24.7%) and the T cohort (29.5%). The 6-year EFS rates were 67% for both the T and the L cohorts and numerically higher at 74% for the TL cohort, but the difference compared with the T cohort was not statistically significant (HR = 0.81; p = 0.35). The numerical differences in 6-year OS were also not statistically significant (79%, 82% and 85% for the T, L and TL cohorts, respectively) [18, 19].

The NeoSphere [20] trial employed a similar design to CALGB 40601 and NeoALTTO but evaluated the monoclonal antibody pertuzumab as the second HER2-targeted therapy. Patients (n = 417) were randomized to receive 4 cycles of neoadjuvant docetaxel with either trastuzumab (T), pertuzumab (P), or the combination (TP). A fourth arm evaluated the activity of TP as neoadjuvant therapy without chemotherapy. Following surgery, all patients received FEC for 3 cycles, and the patients randomized to TP alone also received 4 cycles of adjuvant docetaxel. Patients also received trastuzumab as adjuvant therapy to complete 1 year of HER2-directed therapy. Patients given TP plus docetaxel had a significantly improved breast pCR rate of 46% compared with 29% in those given T plus docetaxel (p = 0.014). The cohort receiving P plus docetaxel had a breast pCR rate of 24%, and the cohort receiving neoadjuvant TP alone had a breast pCR rate of 17%. For patients in the TP plus docetaxel cohort, the 5-year progression-free survival (PFS) rate was 86% vs 81% for the T plus docetaxel cohort (HR 0.69 [95% CI 0.34–1.40]) [21]. The cohort receiving P plus docetaxel and the cohort receiving only TP as neoadjuvant therapy both had a 5-year PFS of 73%.

None of the four studies above was designed with a sample size large enough to assess long-term outcomes as a primary endpoint. Although CALGB 40601 demonstrated statistically significant improvement in long-term endpoints with dual HER2-targeted therapy, the other three studies demonstrated numerically higher but non-significant improvements in long-term endpoints. A planned combined analysis of the data from the trials is ongoing and could help inform correlative predictive biomarker studies and assess the clinical utility of combination therapy in important subsets [22]. However, based on results from neoadjuvant studies such as NeoSphere and TRYPHAENA (a phase II cardiac safety study) [23], and on subsequent positive results from the APHINITY adjuvant trial [24] in the node-positive cohort, dual HER2-targeted therapy with trastuzumab and pertuzumab with chemotherapy has become a standard of care for patients presenting with node-negative, HER2-positive breast cancer ≥ 2 cm or with node-positive, HER2-positive breast cancer.