Introduction

Breast cancer is a heterogeneous disease. Estrogen receptor (ER)-positive, human epidermal growth factor receptor 2 (HER2)-negative breast cancer is the most common subtype and accounts for 65–70% of all invasive breast carcinomas [1]. It is increasingly accepted that patients with ER-positive, HER2-negative tumors do not benefit from chemotherapy as much as those with ER-negative or HER2-positive disease, and endocrine therapy remains the mainstay of systemic treatment for these patients.

Neoadjuvant endocrine therapy (NET) was initially introduced in the early 1980s to treat elderly and frail patients who were deemed inoperable or unfit for surgery or chemotherapy [2]. A 2016 meta-analysis showed that NET was as effective as neoadjuvant chemotherapy (NACT) in downstaging the tumor and was associated with lower toxicity [3]. Studies indicate that approximately 50% of patients who would otherwise require upfront mastectomy can be converted to breast-conserving surgery (BCS) after NET [4,5,6]. However, approximately 30–50% of ER-positive tumors do not respond to aromatase inhibitors (AIs) [5, 7].

Pathological complete response (pCR) is rare after NET. A recent systematic review showed a pCR rate as low as 2.8% after NET in patients with ER+, HER2− breast cancer [8]. On the other hand, patients with hormone receptor-positive (HR+) breast cancer have a favorable long-term prognosis regardless of the pathological status at surgery [9,10,11], which undermines pCR as a robust marker of survival. The Preoperative Endocrine Prognostic Index (PEPI) combines Ki67, ER status, tumor size, and nodal status after neoadjuvant endocrine treatment (Table 1). It was developed to identify patients at low risk of relapse [12] and has the potential to triage patients with high PEPI to an escalation of therapy and patients with low PEPI to endocrine therapy only [13].

Table 1 The Preoperative Endocrine Prognostic Index [12]

However, NET has been adopted only cautiously in clinical trials and in practice. A review of data from the National Cancer Data Base showed that from 2004 to 2014, only 3.1% of patients with stage II–III HR-positive breast cancer received NET in the USA [14]. NET is underutilized mainly because of the paucity of available clinical data. Wider use of NET will require definitive long-term survival data.

Methods

Study design and participants

Initiated in 2012, the study was an investigator-initiated, prospective, multi-center, non-randomized, controlled trial conducted at nine institutions in China. The study was overseen by the China Anti-Cancer Association (CACA) to ensure that it was conducted, recorded, and reported in accordance with the protocol, Standard Operating Procedures (SOPs), Good Clinical Practice (GCP), and the applicable regulatory requirements. In brief, management without adjuvant chemotherapy was the preferred approach for patients who had PEPI 0–1 or pCR. Patients who had PEPI ≥ 2 could receive adjuvant chemotherapy at the discretion of the treating physician. A chemotherapy regimen combining anthracyclines and taxanes, or taxanes alone, was used according to institutional guidelines.

Eligibility criteria included clinical stage T2-3N0M0, age ≤ 75 years, positive ER staining in ≥ 50% of malignant epithelial cells on core biopsy, tumor size ≥ 2 cm on physical examination, clinically negative axilla (axillary ultrasound preferred in all patients), and World Health Organization performance status of 0–1. In the case of multifocal tumors, the largest lesion had to be at least 2 cm in diameter and was designated as the target lesion for all subsequent tumor evaluations. HER2-positive tumors were ineligible. HER2 positivity was defined by at least 10% HER2-expressing tumor cells on immunohistochemistry (IHC) or by a positive fluorescence in situ hybridization (FISH) assay. Other exclusion criteria included inflammatory breast cancer, previous chemotherapy or radiation therapy, and concurrent illness, such as active infection, heart failure, or other significant illness that might influence treatment tolerability.

The study was conducted in accordance with the Declaration of Helsinki and Good Clinical Practice guidelines and was approved by the institutional review boards of all participating centers. All patients gave written informed consent. The trial is registered at ClinicalTrials.gov (NCT01613560).

Treatment

Patients were given letrozole 2.5 mg daily for 4 months before surgery. Sentinel lymph node biopsy (SLNB) was performed as a separate procedure. Axillary lymph node dissection (ALND) was recommended to all patients with any residual lymph node disease, including isolated tumor cells (ITC) and micro-metastases. Adjuvant treatment was given in compliance with local guidelines.

PEPI central assessment

After surgery, tumor blocks were sent to the Peking University Cancer Hospital Pathology Department for a blinded, centralized PEPI assessment by a dedicated breast pathologist (LYQ). Ki67 immunohistochemical examination was performed with a mouse anti-Ki67 monoclonal antibody (dilution: 1:100; cat. no. ZM-0167; clone: MIB-1; ZSGB-BIO/ORIGENE, Inc., Beijing, China) using UltraPATH (ZSGB-BIO, Beijing, China) according to the manufacturer’s instructions. The interpretation and scoring of central Ki67 were carried out according to the proposal of the International Ki67 in Breast Cancer Working Group (IKWG) [15, 16]: (1) in full sections, at least three high-power (× 40 objective) fields were selected to represent the spectrum of staining seen on initial overview of the whole section; (2) if there were clear hot spots, data from these hot spots were included in the overall score; (3) only nuclear staining was considered positive, and staining intensity was not relevant; (4) at least 500 malignant invasive cells (and preferably at least 1000 cells) were counted in each case; and (5) the Ki67 index was expressed as the percentage of positively staining cells among the total number of invasive cells in the area scored. PEPI was calculated by combining pT, pN, ER status, and Ki67 after surgery [12].
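As a minimal sketch of the scoring arithmetic described above, the Ki67 index and a PEPI total could be computed as follows. The function names are hypothetical, and the point values are assumptions for illustration: they follow the commonly cited relapse-risk scoring of the PEPI and should be checked against Table 1 and [12]. ER status is passed here as an Allred score, which is also an assumption about how the ER component is scored.

```python
MIN_CELLS = 500  # minimum number of malignant invasive cells counted per case


def ki67_index(positive_cells: int, total_invasive_cells: int) -> float:
    """Ki67 index: percentage of positively staining cells among invasive cells."""
    if total_invasive_cells < MIN_CELLS:
        raise ValueError("at least 500 malignant invasive cells should be counted")
    if not 0 <= positive_cells <= total_invasive_cells:
        raise ValueError("positive cell count must lie between 0 and the total")
    return 100.0 * positive_cells / total_invasive_cells


def ki67_points(ki67_percent: float) -> int:
    """ASSUMED relapse-risk points for Ki67; verify against Table 1 / [12]."""
    if ki67_percent <= 2.7:
        return 0
    if ki67_percent <= 19.7:
        return 1
    if ki67_percent <= 53.1:
        return 2
    return 3


def pepi_score(pt_stage: int, node_positive: bool,
               er_allred: int, ki67_percent: float) -> int:
    """Sum the four post-treatment components (ASSUMED point values)."""
    points = 3 if pt_stage >= 3 else 0      # pT3/4 vs. pT1/2
    points += 3 if node_positive else 0     # pN positive vs. negative
    points += 3 if er_allred <= 2 else 0    # ER Allred 0-2 vs. 3-8
    points += ki67_points(ki67_percent)
    return points


def risk_group(pepi: int, pcr: bool) -> str:
    """Grouping used in this study: PEPI 0-1 or pCR vs. PEPI >= 2."""
    return "PEPI 0-1 or pCR" if pcr or pepi <= 1 else "PEPI >= 2"


# Hypothetical case, not trial data: 50 positive nuclei among 1000 invasive cells.
index = ki67_index(50, 1000)
score = pepi_score(pt_stage=2, node_positive=False, er_allred=8, ki67_percent=index)
print(index, score, risk_group(score, pcr=False))
```

Under these assumed point values, any node-positive or pT3/4 tumor scores at least 3, so the PEPI 0–1 group contains only node-negative pT1/2 tumors that remain ER positive with a low post-treatment Ki67.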

End points

The primary end point was the 5-year RFS of patients who had PEPI 0–1 or pCR without chemotherapy. Secondary end points included 5-year distant disease-free survival (DDFS), pCR rate, and safety of 4-month NET. Objective clinical response was calculated based on WHO criteria [17]. Miller–Payne classification [18] data derived from surgical specimens were collected for each participant.

Patients were followed every 6 months after surgery for 5 years. RFS was defined as the interval between the date of initiation of letrozole and documented disease recurrence, progression, or death from any cause. DDFS was defined as the time from surgery to distant metastasis or death from any cause. Safety was assessed in all participants who had started their allocated treatment using the Common Terminology Criteria for Adverse Events (CTCAE) version 3.0 from the National Cancer Institute.
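The two time-to-event definitions above differ only in their starting point, which the following minimal sketch illustrates. The function name, the month-length convention, and the rule of censoring event-free patients at last follow-up are assumptions made for illustration; the dates are hypothetical and are not trial data.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_MONTH = 30.4375  # average month length; a convention assumed for this sketch


def time_to_event(origin: date, event_date: Optional[date],
                  last_follow_up: date) -> Tuple[float, bool]:
    """Return (months from origin, event observed?) for one patient.

    origin is the start of letrozole for RFS and the date of surgery for DDFS.
    Patients without an event are censored at last follow-up (assumed rule).
    """
    end = event_date if event_date is not None else last_follow_up
    if end < origin:
        raise ValueError("end date precedes the origin date")
    return (end - origin).days / DAYS_PER_MONTH, event_date is not None


# Hypothetical patient with a loco-regional relapse and no distant metastasis.
letrozole_start = date(2012, 5, 1)
surgery = date(2012, 9, 3)
relapse = date(2015, 5, 1)
last_seen = date(2017, 9, 1)

# The relapse counts as an RFS event but not as a DDFS event.
rfs_months, rfs_event = time_to_event(letrozole_start, relapse, last_seen)
ddfs_months, ddfs_event = time_to_event(surgery, None, last_seen)
print(round(rfs_months, 1), rfs_event, round(ddfs_months, 1), ddfs_event)
```

Because the clocks start at different dates, RFS and DDFS times for the same patient are not directly comparable, even when no event has occurred.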

Statistical analysis

The sample size was calculated using PASS 2008 software based on the two-sided, one-sample log-rank test. Based on our previous work [19], a sample size of 202 achieves 80.0% power at a 0.050 significance level to detect a 5-year RFS rate of 98.6% in the PEPI 0–1 or pCR group when the 5-year RFS rate in the historic control group was 95.0% (considering a 10% drop-out rate). According to the statistical design of our study, the confidence interval for the estimate of RFS should exclude 95%, because we considered an RFS rate lower than 95% unacceptable for this group of patients with good prognosis (PEPI 0–1 or pCR). Our previous work showed that the number of patients who had PEPI 0–1 or pCR was similar to the number of patients who had PEPI ≥ 2 after 4 months of neoadjuvant AI [19]. As a result, an equal number of patients were planned to be enrolled in the PEPI ≥ 2 group, and a total of 404 patients were to be recruited.

All analyses were performed using SPSS version 22 (IBM Corp., Armonk, NY, USA). All statistical tests were two-sided, and a P value < 0.05 was considered statistically significant. Quantitative variables were described using medians or means and standard deviations. Categorical data are presented as numbers and percentages. Normally distributed variables were analyzed using Student’s t test, while the Mann–Whitney U test was used for non-normally distributed variables. Frequency-associated analyses were performed using the chi-square test or Fisher’s exact method. Survival rates and curves were estimated using the Kaplan–Meier method. Differences between groups were tested using the log-rank test. A Cox proportional hazards model was used to determine the association between prognostic factors and survival.
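The analyses in this study were run in SPSS; purely as an illustration of the Kaplan–Meier method named above, a minimal pure-Python sketch of the product-limit estimate with a Greenwood standard error is given below. The function name and the toy data are hypothetical, and the use of a plain (linear) Greenwood interval is an assumption, since the type of confidence interval reported by the software is not stated here.

```python
import math
from typing import List, Tuple


def kaplan_meier(times: List[float],
                 events: List[bool]) -> List[Tuple[float, float, float]]:
    """Return (time, survival estimate, Greenwood SE) at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same time; censored patients at time t
        # are still counted as at risk at t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            if at_risk > deaths:
                greenwood_sum += deaths / (at_risk * (at_risk - deaths))
            curve.append((t, survival, survival * math.sqrt(greenwood_sum)))
        at_risk -= removed
    return curve


# Hypothetical follow-up in months (True = event, False = censored); not trial data.
toy_times = [6, 12, 12, 30, 60, 60]
toy_events = [True, False, True, False, False, False]

for t, s, se in kaplan_meier(toy_times, toy_events):
    lower = max(0.0, s - 1.96 * se)
    upper = min(1.0, s + 1.96 * se)
    print(f"t={t}: S={s:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

With very few events, as in a good-prognosis cohort, the linear interval can reach or exceed 100% and has to be truncated, which is one reason transformed intervals (for example log-log) are often preferred.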

Results

Between May 2012 and July 2018, 410 patients were registered to receive letrozole, and 58 patients were excluded (for details see the study flowchart, Fig. 1). Thus, 352 patients constituted the per-protocol population. The median age was 61 years (range 49–75 years). Most patients (349/352) had T2 tumors, and only three patients had T3 tumors. Most tumors were both ER and PR positive. A Ki67 cutoff of 25% was used to differentiate between low and high values according to our institutional guidelines, and the treating physician could consider this information when planning adjuvant treatment (chemotherapy or no chemotherapy). Patient and tumor characteristics are summarized in Table 2.

Fig. 1 Study flowchart

Table 2 Patient and tumor characteristics

Clinical and pathological outcomes

Clinical response was assessed by ultrasound after 4 months of letrozole. Two patients had uCR (0.5%), 53 had uPR (15.0%), 288 had uSD (82.0%), and 9 had uPD (2.5%).

BCS was performed in 182 patients (52.0%). SLNB was attempted in 345 patients, and lymph node visualization failure occurred in 20 patients (5.8%). All patients with a positive SLNB or lymph node visualization failure received further axillary lymph node dissection (ALND). Seven patients proceeded directly to ALND. Nine of 352 patients (2.5%) had pCR (ypT0/is ypN0) after NET. In total, 258 patients (73.3%) had ypN0, 80 patients had 1–3 positive lymph nodes, and 14 patients had more than 4 positive lymph nodes. All patients with residual cancer were found to be ER positive. PEPI was 0 in 128 patients (36.4%), 0 or 1 in 184 patients (52.3%), and ≥ 2 in 159 patients. The distribution of PEPI score components in the per-protocol cohort is summarized in Table 3.

Table 3 PEPI score components distribution in the per-protocol cohort

In the adjuvant phase, 290 (82.4%) patients received endocrine therapy only, and adjuvant chemotherapy was given to 62 (17.6%) patients. Among the 193 patients who had PEPI 0–1 or pCR, only 8 patients (4%) received adjuvant chemotherapy, compared with 53 of 159 patients (33%) who had PEPI ≥ 2. A total of 196 patients (55.7%) received adjuvant radiotherapy.

Safety

Common toxicities were grade 1 or 2 muscular/bone/joint pain and hot flashes. One patient developed an allergic reaction possibly related to letrozole (skin rash) and was taken off study. No other grade 3 or 4 toxicity was recorded.

Survival analysis

Median follow-up was 60 months (4–104 months). In total, 11 protocol-defined events (3.1% of the per-protocol population) were observed, including 1 death from another primary cancer and 10 breast cancer relapses (4 distant metastases, 5 loco-regional relapses, and 1 concurrent loco-regional and metastatic relapse).

RFS at 5 years was 96.8% (95% CI 94.8–98.8%) for the whole per-protocol population. When categorized according to PEPI groups, 5-year RFS was 99.5% (95% CI 98.5–99.9%) for the PEPI 0–1 or pCR group vs. 93.7% (95% CI 89.6–97.8%) for the PEPI ≥ 2 group (hazard ratio [HR] 0.18, 95% CI 0.04–0.83, P = 0.028). DDFS at 5 years was 98.1% (95% CI 96.5–99.7%) for the whole per-protocol population. The 5-year DDFS was 100% for the PEPI 0–1 or pCR group vs. 95.8% (95% CI 92.5–99.1%) for the PEPI ≥ 2 group (log-rank P = 0.007) (Fig. 2, Table 4).
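For readers unfamiliar with the log-rank comparison reported above, the following minimal sketch shows how a two-group log-rank statistic and its P value can be computed. It is an illustration only: the function name and the toy data are hypothetical, the numbers are not trial data, and the sketch is not the SPSS procedure used for the reported results.

```python
import math
from typing import List, Tuple


def logrank_test(times_a: List[float], events_a: List[bool],
                 times_b: List[float], events_b: List[bool]) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square with 1 df, two-sided P value)."""
    samples = [(t, e, 0) for t, e in zip(times_a, events_a)]
    samples += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in samples if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_total = sum(1 for s, _, _ in samples if s >= t)            # at risk overall
        n_a = sum(1 for s, _, g in samples if s >= t and g == 0)     # at risk in group A
        d_total = sum(1 for s, e, _ in samples if s == t and e)      # events at t
        d_a = sum(1 for s, e, g in samples if s == t and e and g == 0)
        observed_minus_expected += d_a - d_total * n_a / n_total
        if n_total > 1:
            variance += (d_total * (n_a / n_total) * (1 - n_a / n_total)
                         * (n_total - d_total) / (n_total - 1))
    if variance == 0:
        raise ValueError("no informative event times; the test is undefined")

    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical follow-up in months; group A has early events, group B is all censored.
chi_square, p_value = logrank_test(
    [1, 2, 3, 4], [True, True, True, True],
    [10, 11, 12, 13], [False, False, False, False],
)
print(f"chi-square = {chi_square:.2f}, P = {p_value:.4f}")
```

When one group has no events at all, as for DDFS in the PEPI 0–1 or pCR group, the log-rank test remains computable, whereas a hazard ratio from a Cox model cannot be estimated reliably.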

Fig. 2 Kaplan–Meier survival estimates

Table 4 Survival estimates by PEPI and Miller & Payne classification

The 5-year RFS was 100% for the PEPI = 0 or pCR group vs. 94.8% (95% CI 91.7–97.9%) for the PEPI ≥ 1 group (log-rank P = 0.007). The 5-year DDFS was 100% for the PEPI = 0 or pCR group vs. 96.8% (95% CI 94.3–99.3%) for the PEPI ≥ 1 group (log-rank P = 0.048) (Fig. 2, Table 4). Pathological responsiveness (defined as Miller & Payne classification grade 3–5 vs. grade 1–2) was not predictive of RFS (Table 4).

For the PEPI ≥ 2 group, 5-year RFS was 90.9% (95% CI 82.3–99.5%) for patients with adjuvant chemotherapy and 94.9% (95% CI 90.6–99.2%) for patients without adjuvant chemotherapy (HR 1.70, 95% CI 0.46–6.32, P = 0.432). The 5-year DDFS was 93.3% (95% CI 85.9–99.9%) for patients with adjuvant chemotherapy and 96.9% (95% CI 93.3–99.9%) for patients without adjuvant chemotherapy (HR 2.11, 95% CI 0.43–10.46, P = 0.361). Similarly, in the PEPI ≥ 1 group, no difference in RFS or DDFS was detected between patients with and without adjuvant chemotherapy (Table 5).

Table 5 Survival estimates by PEPI and adjuvant chemotherapy

Discussion

To this day, we are unable to accurately predict a tumor’s response to endocrine agents. NET has the potential to guide treatment selection for individual patients because it incorporates an in vivo response assessment. Patients with responsive tumors may receive endocrine therapy only, and those with non-responsive tumors may be triaged to alternative treatment to improve survival, as is done for patients with HER2-positive or triple-negative breast cancer who do not achieve pCR after NACT [20, 21].

However, tumor responsiveness is not adequately assessed using clinical response only and requires the incorporation of other prognostic factors. Using data from the P024 trial, Ellis et al. developed the Preoperative Endocrine Prognostic Index (PEPI) [12]. The PEPI was further validated in the IMPACT trial [12] and the American College of Surgeons Oncology Group (ACOSOG) Z1031 trial [22]. In the Z1031 trial, PEPI = 0 cases had a relapse risk of 3% with a median follow-up of 5.5 years and are therefore unlikely to benefit from adjuvant chemotherapy [22].

A major weakness of the PEPI is that the relapse risk estimation is based on a very limited number of cases, and PEPI validation efforts should continue. Previous prospective validations of the PEPI were mostly conducted in Europe and the USA. To the best of our knowledge, only one small phase II Japanese study has validated the PEPI in an Asian population [23]. We seek to contribute more PEPI validation data based on a larger population. Another limitation of the PEPI is the low number of PEPI = 0 cases after NET. In Z1031, 25.9% of tumors were categorized as PEPI = 0 [22]. The proportion of patients with PEPI = 0 was 15.2% in the IMPACT trial and 25.9% in the P024 trial [12].

Efforts have been made to increase the number of patients categorized as “low risk” after NET. In the previously mentioned Japanese study, pretreatment progesterone receptor (PR) > 50% was associated with RFS and breast cancer-specific survival (BCSS). When PR was combined with the PEPI, the percentage of patients categorized as “low risk” increased from 25% (PEPI = 0 group) to 49% (PEPI-P low-risk group). The PEPI-P was also shown to be a stronger predictor of outcome than the PEPI alone [24]. However, the study did not control for adjuvant therapy, and the prognostic significance of the PEPI-P needs further prospective validation.

Our study demonstrates that for early-stage, postmenopausal, strongly ER-positive and HER2-negative breast cancer, the 5-year RFS and DDFS were 97.9% (95% CI 96.3–99.5%) and 98.1% (95% CI 96.5–99.7%), respectively. Our study confirms that patients who had PEPI = 0 or pCR after NET have excellent survival and that chemotherapy can be safely omitted. The excellent survival rates raise the question of whether chemotherapy is overtreatment for many patients even if they have PEPI > 0. The goal is certainly to categorize patients accurately and to minimize the proportion misclassified. However, given the low percentage of “low-risk” patients currently identified by PEPI = 0, we feel that the PEPI itself needs improving, and a good way to start would be to increase its sensitivity so that more patients are categorized into the “low-risk” group. Our institutional retrospective study with a median follow-up of almost 10 years showed that pathological responsiveness (defined as Miller & Payne classification grade 3–5) was an independent prognostic factor for RFS, DDFS, and BCSS. Pathological responsiveness or PEPI 0–1 (about 75% of the enrolled patients) may be used to define a group of “low-risk” patients who are potential candidates for omitting chemotherapy (article in press). Unfortunately, in the present study, pathological responsiveness was not associated with improved survival. One possible explanation is that our follow-up is short for ER-positive disease and the number of events is very low. Longer follow-up is needed.

The Z1031 trial tested the hypothesis that for endocrine-resistant tumors, an early switch to chemotherapy would improve clinical outcome. However, triaging patients with Ki67 > 10% after 2–4 weeks of AI to chemotherapy was less effective than expected [22]. In our study, 5-year RFS was 90.9% (95% CI 82.3–99.5%) and 94.9% (95% CI 90.6–99.2%) for patients with and without adjuvant chemotherapy in the PEPI ≥ 2 group, respectively. Chemotherapy was often given at the discretion of the treating physicians because of clinical–pathological features (for instance, a higher number of positive lymph nodes), and only a limited number of patients received chemotherapy, so the results need to be interpreted cautiously. The idea of triaging patients to escalation or de-escalation of therapy according to their PEPI is being prospectively validated in the ALTERNATE trial [25]. While we await the long-term survival data from the ALTERNATE trial, no survival difference was detected in our study between patients who received adjuvant chemotherapy and those who did not among PEPI ≥ 2 cases. However, our study was not randomized, and no conclusion can be drawn regarding the impact of adjuvant chemotherapy on survival. The low chemotherapy responsiveness observed could partially be explained by the postmenopausal status of the enrolled patients [26], the high ER expression of the tumors [27], or the possibility that neoadjuvant endocrine therapy induces up-regulation of genes associated with chemo-resistance [28]. In the adjuvant setting, the monarchE trial recently showed that the addition of abemaciclib provided a clear efficacy benefit in a population with high clinical–pathological risk [29]. Additional research is warranted to develop new drug combinations and predictive biomarkers to personalize the neoadjuvant strategy for ER-positive breast cancer.

Originally, we planned to randomize PEPI ≥ 2 cases to adjuvant chemotherapy vs. no chemotherapy. However, such a randomized design was considered unfeasible by some participating centers and was dropped. This reflects the reluctance of patients and physicians alike to accept randomization in a NET trial. Similarly, a number of other NET studies had to abandon their original phase III design or close early due to slow accrual [30, 31].

We used an ER expression threshold of 50% to try to maximize response, as this level was viewed as an indication of highly endocrine-responsive tumors [32]. The Z1031 study selected ER-positive patients based on the Allred score (Allred score 6–8) [22]. Decreased levels of ER after NET were previously reported [12] and were speculated to be associated with endocrine resistance [4]. However, in our current study, all residual tumors remained ER positive after NET.

A major strength of the present study is the prospectively planned end points, diagnostic procedures, treatment, and follow-up protocols. Another strength is the relatively large number of patients, which supports quality-controlled analyses and generalizable patient and treatment characteristics. Central review of the postoperative tissue blocks and central PEPI assessment excluded variation between pathology centers.

The present study has several limitations. First, it was performed in a patient subgroup defined by immunohistochemistry only, and a high degree of molecular heterogeneity may still exist. Further trials using specific patient subgroups stratified by gene expression profiles may help to define a more homogeneous group of tumors and identify predictive biomarkers more accurately. However, the use of genomic assays to predict response to neoadjuvant therapies has not been rigorously studied and is not recommended by recent ASCO guidelines [33]. Second, our cohort was geographically constrained, so caution should be taken when generalizing the findings. Third, longer follow-up is required, as patients with HR+ breast cancer tend to experience late relapses. BCSS and overall survival (OS) results are not discussed because of the limited follow-up time and low number of events, and they will be included in future reports. Fourth, some controversy remains regarding the reproducibility of Ki67, but international guidelines have been published and progress is being made [15, 16].

In conclusion, our study indicates that PEPI 0–1 or pCR may be used to define a group of ER-positive and HER2-negative postmenopausal early breast cancer patients with low relapse risk for whom adjuvant chemotherapy can be safely withheld. Studies on the identification of endocrine-resistant tumors and on alternative treatment options for them are warranted.