Introduction

Multiple studies have identified gene expression signatures to predict the response of breast cancer to neoadjuvant chemotherapy [1–9]. An ideal signature would identify, at the presentation of disease, those patients who would benefit from specific chemotherapy regimens and allow the remainder to be spared the side effects. Most studies aimed at identifying such a signature have considered breast cancer as a single homogeneous entity. The molecular heterogeneity of breast cancer, however, has been demonstrated with the “intrinsic” gene signature [10]. At least three major molecular subclasses, namely basal-like (BL), luminal, and erbB2-positive, have subsequently been found to have distinct clinical outcomes [11–13]. Studies are emerging which suggest that pathological complete response (pCR) rates of breast cancer to neoadjuvant cytotoxic chemotherapy differ among the molecular subclasses; notably, significantly higher rates of pCR are achieved with BL tumors [11, 12, 14]. Given these observations, one would expect a gene signature predictive of response to chemotherapy derived from unclassified breast cancer data to contain information redundant with that of the intrinsic gene signature. The distinct biological make-up of these tumor subclasses, as evidenced by their differential clinical course, expression of hormonal receptors, and response to treatment, warrants distinct analysis for each class to arrive at optimal and customized predictors of response to therapy.

Prospective trials have demonstrated that patients with a pCR of the primary tumor have significantly improved disease-free survival and overall survival when compared with patients who do not have a pCR [15–17]. Based on these data, pCR is frequently used as a surrogate for overall survival in the design of clinical trials. In this study, we have identified a gene expression signature which predicts pCR to neoadjuvant chemotherapy within the BL subclass of breast tumors. Analysis of samples from patients with pCR versus those with residual disease (RD) following neoadjuvant chemotherapy yielded a gene expression signature which is independent of chemotherapeutic regimen or method of tissue collection. Clinical surveillance suggested that this signature may also distinguish patients who differ in disease-free survival.

Patients and methods

Patient population

Between March 2003 and 2006, 119 patients with clinical stage II or III breast cancer were enrolled at Washington University into a prospective clinical trial of four cycles of epirubicin (75 mg/m²) and docetaxel (75 mg/m²) every 3 weeks prior to surgery and two cycles after surgery. Half of the patients received zoledronic acid every 3 weeks beginning at the time of chemotherapy. Tumor size was measured from mammograms and ultrasound studies prior to treatment. pCR was defined as no residual invasive disease in the breast or lymph nodes. Residual in situ carcinoma was also considered a pCR [18]. Estrogen receptor (ER) and HER2 status were determined on a diagnostic core obtained before treatment. The mean follow-up was 2 years with annual restaging. This study protocol was approved by the Institutional Review Board at Washington University. Written informed consent was obtained from each patient.

Tissue collection and gene expression profiling

RNA was extracted from snap-frozen 14-gauge core samples obtained from pre-treatment tumors. Specimens containing more than 40% tumor on histological examination were analyzed. Trizol reagent (Invitrogen) was used to isolate total RNA. RNA quality was assessed with an Agilent Bioanalyzer (Agilent, Palo Alto, CA). Affymetrix target preparation, array hybridization, and array scanning were performed using standard protocols. A total of 15 μg of biotinylated cRNA was hybridized to Affymetrix U133Plus2 GeneChip™ oligonucleotide arrays. Array images were processed using the Affymetrix Microarray Analysis Suite (MAS5) algorithm. The arrays were scaled to a target intensity of 1,500 and exported to the Bioinformatics Core Facility (http://bioinformatics.wustl.edu).

Microarray gene expression analysis

Array data from 70 specimens were analyzed. To identify the BL samples, WU-BLAST (http://blast.wustl.edu/) was used to match the genes of the intrinsic signature [10] to their corresponding Affymetrix oligonucleotide probe sets. Hierarchical clustering was performed using Cluster 3.0 (http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#ctv). The intrinsic signature from each of the 70 arrays was scored against the signatures of the five tumor subclasses from Sorlie et al. using Pearson’s correlation coefficient [10]. Samples whose signatures correlated most highly with the Sorlie et al. BL cluster, as compared to the other tumor types, were identified as BL tumors; this yielded 24 BL samples. A similar procedure was used to identify the 26 BL tumors within the microarray data set published by Hess et al. [4].
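
The correlation-based assignment described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the three centroids, and all numeric values are hypothetical, and only three of the five Sorlie et al. subclasses are shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def assign_subtype(profile, centroids):
    """Return the subclass whose centroid correlates best with the profile,
    together with the correlation to every centroid."""
    scores = {name: pearson(profile, c) for name, c in centroids.items()}
    return max(scores, key=scores.get), scores

# Hypothetical centroids over four intrinsic genes (illustrative values only).
centroids = {
    "basal-like": [2.0, 1.5, -1.0, -2.0],
    "luminal": [-1.5, -1.0, 2.0, 1.0],
    "erbB2-positive": [0.5, -2.0, -0.5, 2.0],
}
sample = [1.8, 1.1, -0.7, -1.9]  # hypothetical intrinsic-gene profile of one tumor
subtype, scores = assign_subtype(sample, centroids)
print(subtype, {k: round(v, 2) for k, v in scores.items()})
```

A sample is labeled BL only when its correlation with the BL centroid exceeds its correlation with every other centroid. A profile with no variation across genes would divide by zero and is not handled in this sketch.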

Analysis to identify probes which exhibited differential expression between pCR and RD samples proceeded as follows. In order to reduce the baseline noise within the data, probe sets with fewer than 50% ‘P’ calls, as determined by the MAS5 algorithm, across all 70 samples accrued at this institution were excluded, and the remaining probe sets were retained for further analysis. In order to account for potential batch effects between the data of the two studies, expression values of each probe set were mean-centered and variance-normalized independently within each study. The processed expression values from probe sets common to both studies were then combined to form a data set consisting of 50 arrays with 13,181 probe sets within each array. In order to partition the samples from the two studies into training and validation sets, the arrays of each study were ordered chronologically and approximately four-fifths of them were assigned to training; these were combined across studies to form a 37-array training set. The remainder were combined and sequestered to form a 13-array validation set.
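
The filtering and normalization steps can be sketched as follows, assuming the intent is to keep probe sets called present (‘P’) in at least half of the samples. The function names, probe identifiers, and values are hypothetical, and the use of the population standard deviation is an assumption; the source does not state which estimator was used.

```python
from statistics import mean, pstdev

def filter_by_present_calls(calls, min_fraction=0.5):
    """Keep probe sets whose fraction of 'P' detection calls is at least min_fraction."""
    return [probe for probe, c in calls.items()
            if c.count("P") / len(c) >= min_fraction]

def standardize(values):
    """Mean-center and scale to unit variance (population SD, an assumption)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def combine_studies(study_a, study_b):
    """Standardize every probe set within each study separately, then
    concatenate the samples for probe sets present in both studies."""
    common = sorted(set(study_a) & set(study_b))
    return {p: standardize(study_a[p]) + standardize(study_b[p]) for p in common}

# Hypothetical MAS5 detection calls for two probe sets across four samples.
calls = {"probe_1": ["P", "P", "A", "P"], "probe_2": ["A", "A", "P", "A"]}
kept = filter_by_present_calls(calls)

# Hypothetical expression values; the two studies differ in overall scale.
study_a = {"probe_1": [100.0, 200.0, 300.0], "probe_3": [5.0, 6.0, 9.0]}
study_b = {"probe_1": [10.0, 20.0, 30.0]}
combined = combine_studies(study_a, study_b)
print(kept, combined)
```

Standardizing within each study before merging removes study-wide shifts in location and scale, so a probe set cannot appear informative merely because one study has systematically higher intensities.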

Differential expression between pCR and RD samples was evaluated with a moderated t-statistic using lmFit and eBayes of the LIMMA package within the R statistics package [19, 20]. Discrimination profiles were constructed by incremental inclusion of the most informative probe sets. Each of these profiles was then evaluated for its ability to predict pCR using linear discriminant analysis (LDA) with leave-one-out (LOO) cross validation. The nominally optimal model was identified among the profiles with minimal error rates in LOO cross validation. In order to assess the statistical significance of the result, this search algorithm in its entirety was repeated on 500 data sets in which the data within each probe set were randomized. Each randomized data set was analyzed for probes which showed differential expression between pCR and RD outcomes, and the most informative probes arising from the randomized data were evaluated for their ability to predict pCR using LDA with LOO.
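
The search procedure can be sketched as follows. This illustration swaps in two simpler techniques and should not be read as the study's implementation: a Welch t-statistic replaces the moderated t-statistic of LIMMA, and a diagonal LDA (equal priors, no covariance between probe sets) replaces full LDA. All function names and data values are hypothetical. As in the description above, probe sets are ranked once on the whole training set before cross validation.

```python
import random
from math import sqrt
from statistics import mean, variance

def t_stat(a, b):
    """Welch t-statistic; a plain stand-in for limma's moderated t."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def column(X, y, j, label):
    """Values of probe j among the samples carrying the given label."""
    return [row[j] for row, lab in zip(X, y) if lab == label]

def rank_probes(X, y):
    """Probe indices ordered by |t| between pCR (1) and RD (0), most informative first."""
    return sorted(range(len(X[0])),
                  key=lambda j: abs(t_stat(column(X, y, j, 1), column(X, y, j, 0))),
                  reverse=True)

def dlda_predict(X, y, x_new, probes):
    """Diagonal LDA with equal priors: choose the class whose mean is nearest,
    scaling each probe by its pooled within-class variance."""
    dist = {0: 0.0, 1: 0.0}
    for j in probes:
        groups = {lab: column(X, y, j, lab) for lab in (0, 1)}
        pooled = sum((len(g) - 1) * variance(g) for g in groups.values()) / (len(X) - 2)
        for lab in (0, 1):
            dist[lab] += (x_new[j] - mean(groups[lab])) ** 2 / pooled
    return min(dist, key=dist.get)

def loo_error(X, y, probes):
    """Leave-one-out error rate for a fixed list of probes."""
    wrong = sum(dlda_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], probes) != y[i]
                for i in range(len(X)))
    return wrong / len(X)

def search(X, y, max_probes):
    """Try the top 1..max_probes probes; return (lowest LOO error, smallest such k)."""
    ranked = rank_probes(X, y)
    return min((loo_error(X, y, ranked[:k]), k) for k in range(1, max_probes + 1))

def permuted_errors(X, y, max_probes, n_perm, seed=0):
    """Repeat the entire search on data shuffled within each probe set."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_perm):
        cols = [rng.sample(col, len(col)) for col in zip(*X)]
        shuffled = [list(row) for row in zip(*cols)]
        out.append(search(shuffled, y, max_probes)[0])
    return out

# Hypothetical training data: 8 samples x 3 probes; only probe 0 separates the classes.
X = [[2.1, 0.3, 1.0], [1.9, -0.2, 0.4], [2.4, 0.5, -0.6], [1.7, -0.4, 0.1],
     [-2.0, 0.1, 0.8], [-1.8, -0.3, -0.5], [-2.3, 0.4, 0.3], [-1.6, -0.1, -0.9]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
best = search(X, y, 3)
null = permuted_errors(X, y, 3, n_perm=20)
print(best, sum(e <= best[0] for e in null))
```

Taking the minimum over (error, k) pairs breaks ties in favor of the smaller k, which corresponds to choosing the most parsimonious profile among those with minimal LOO error. The count printed at the end is the number of randomized data sets whose best error matched or beat the real data.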

Disease-free survival curves were computed using the Kaplan–Meier method as implemented in the R survival library. Tests for statistical differences between these survival curves were performed using survdiff of the R survival package.
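
For readers unfamiliar with these methods, the Kaplan–Meier estimator and a two-group log-rank test can be written out directly. This is a minimal sketch of the standard formulas, not a substitute for the R survival package; the function names and the follow-up times are hypothetical.

```python
from math import erfc, sqrt

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.
    events: 1 = recurrence observed, 0 = censored."""
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1 - d / at_risk
        curve.append((t, surv))
    return curve

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test: chi-square statistic (1 df) and p-value."""
    times, events = times_a + times_b, events_a + events_b
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n          # observed minus expected in group a
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var
    return chi2, erfc(sqrt(chi2 / 2))              # upper tail of chi-square, 1 df

# Hypothetical follow-up in months.
curve = kaplan_meier([6, 12, 12, 18, 24], [1, 1, 0, 0, 1])
chi2, p = logrank([3, 5, 7, 9], [1, 1, 1, 1], [20, 22, 24, 26], [0, 0, 1, 0])
print(curve, round(chi2, 2), round(p, 4))
```

Censored patients contribute to the number at risk up to their last follow-up but never count as events, which is why the curve only steps down at observed recurrences.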

Results

Identification of BL tumors

At this institution, 119 patients were enrolled prospectively in a neoadjuvant chemotherapy trial (Table 1). Adequate core biopsies were obtained prior to neoadjuvant therapy from 86 patients, 70 of which fulfilled quality requirements to undergo expression profile analysis. Oligonucleotide probe sets within the Affymetrix U133Plus2 array which corresponded to genes of the intrinsic signature were identified. Hierarchical clustering using the intrinsic signature identified 24 BL tumors (Fig. 1a). As further support of correct BL tumor identification, the tumors of this branch were predominantly ER, progesterone receptor (PR), and HER2 negative by immunohistochemistry compared to the tumor constituents of the other cluster branches (Fig. 1a).

Table 1 Patient and tumor characteristics of the breast cancers used for expression profiling at this institution and on the basal-like tumors accrued by Hess et al. [4]
Fig. 1 Identification of basal-like tumors. Unsupervised hierarchical clustering of samples from WU using the “intrinsic” signature (a) and Hess et al. (b) [4]. The red branch demarcates BL tumors. Blue marks denote positive expression of ER, PR, and HER2. Black marks denote pCR. Probe set numbers differ between the two plots as a result of the different oligonucleotide platforms. For the analyses, probe sets common to both chips were used. In the right-most branch of (b) (blue), samples had low values and were therefore excluded from the analysis

In order to obtain additional samples for analysis, four published studies were evaluated to determine whether their data sets could be merged with ours [4, 5, 21, 22]. Data from three of these studies could not be incorporated in this analysis for reasons including incompatibility of the platforms and the presence of too few BL tumors within the data set [5, 21, 22]. Data from Hess et al. were derived from a microarray platform compatible with that of this study and contained sufficient probe sets to identify the basal subtype using the intrinsic signature [4]. In Hess et al., 133 patients underwent fine needle aspiration (FNA) tissue collection prior to any treatment and subsequently received 12 weekly treatments with paclitaxel followed by four treatments with fluorouracil, doxorubicin, and cyclophosphamide. Application of the intrinsic signature to the 100 profiles in the published data set yielded data from 26 additional BL tumors (Fig. 1b).

Fifty BL tumor expression profiles were thus assembled from two independent studies (Table 1). The patient and tumor characteristics in each data set were similar in age, histology, and grade. In our data set, there was a greater percentage of African-American patients (42 vs. 8%), larger tumors (83 vs. 42% T2 tumors), and a greater percentage of N0 disease (63 vs. 23%) (Table 1). In the study by Hess et al., 62% of the patients with basal tumors had pCR at the time of surgery, compared with 21% of the patients at our institution (Table 2).

Table 2 Distribution of pCR and RD response to chemotherapy within all breast tumors and basal-like subclass of the two studies

Gene expression signature of pathologic complete response

The expression data were normalized independently within each study and then combined for further analysis. The BL tumor profiles were partitioned chronologically into training and validation groups in a balanced manner to ensure proportional representation from each study. Data from both studies were included in the training set with the expectation that this would facilitate the selection of probe sets exhibiting differential expression across treatment response without influence of the specific chemotherapy used or method of sample collection. Within the 37-sample training group, a moderated t-statistic ranked the probe sets by their degree of differential expression as related to treatment response. A search using LDA with LOO cross validation found that inclusion of the top 25 probe sets in the discrimination model yielded the most parsimonious model with optimal predictive results (Fig. 2). These 25 probe sets correspond to 23 genes which encompass a diverse array of cellular functions (Table 3). None of these 23 genes exhibited differential expression in a previous comparison of tumor expression profiles derived from core biopsies versus fine needle aspirates from the same tumor, suggesting that the profile is independent of the tumor-sampling method [23].

Fig. 2 Determination of the optimal number of probe sets. The 50 basal-like tumor samples were partitioned in a balanced manner into a 37-sample training group and a 13-sample validation group. A moderated t-statistic ranked the probe sets by their degree of differential expression as related to treatment response. A search using linear discriminant analysis with leave-one-out cross validation found that inclusion of the top 25 probe sets in the discrimination model yielded the optimal and most parsimonious predictive results

Table 3 Functional classification of the 23 genes corresponding to the predictive 25 probe sets

Using the independent 13-sample validation set, the 23-gene signature achieved an overall accuracy of 0.92 (95% CI 0.64–1.0) with a positive predictive value of 1.0 (95% CI 0.4–1.0) (Table 4; Fig. 3). In order to assess the chance occurrence of this result, the search procedure was repeated on 500 data sets generated by the randomization of data within each probe set. In none of these randomized data sets did the predictive accuracy match or exceed that obtained from analysis of the actual data. The observed average predictive accuracy with the randomized data was 0.47 (95% CI 0.46–0.49).
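
The reported intervals are consistent with exact (Clopper–Pearson) binomial confidence limits, which can be checked with a short calculation. In the sketch below the confusion-matrix counts (4 true positives, 1 false negative, 8 true negatives, 0 false positives) are an assumption inferred from the reported sensitivity of 80% and specificity of 100% on 13 samples; the function names are hypothetical, and the source does not state which interval method was used.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _root(k, n, target):
    """Find p in [0, 1] with binom_cdf(k, n, p) == target by bisection
    (the CDF is decreasing in p)."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion x/n."""
    lower = 0.0 if x == 0 else _root(x - 1, n, 1 - alpha / 2)
    upper = 1.0 if x == n else _root(x, n, alpha / 2)
    return lower, upper

# Assumed counts, inferred from the reported percentages (not stated in the text).
tp, fn, tn, fp = 4, 1, 8, 0
metrics = {
    "accuracy": (tp + tn, tp + fn + tn + fp),
    "sensitivity": (tp, tp + fn),
    "specificity": (tn, tn + fp),
    "positive predictive value": (tp, tp + fp),
}
metrics = {name: (x / n, clopper_pearson(x, n)) for name, (x, n) in metrics.items()}
for name, (estimate, (lo, hi)) in metrics.items():
    print(f"{name}: {estimate:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With so few validation samples the intervals are wide even when the point estimate is perfect: a proportion of 4/4 still has a lower limit near 0.4.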

Table 4 Performance of the 23-gene prediction profile on the validation set
Fig. 3 Training and validation heat plots of the 25 probe model. Using linear discriminant analysis, each of the 13 validation samples was compared to the training data and predicted to have had pCR or RD to neoadjuvant therapy. All RD samples were correctly categorized as such, as were four of five pCR samples. Overall accuracy was 92% (95% CI 64–100), with sensitivity of 80% (95% CI 28–99) and specificity of 100% (95% CI 63–100). Hash marks above heat plots mark samples which had pCR to neoadjuvant therapy

Expression signature and disease-free survival

In order to examine the internal consistency of the expression signature, the expression profiles of all 50 BL tumors were subjected to hierarchical clustering using the 23-gene expression signature. As expected, clusters enriched for pCR and RD samples were observed (Fig. 4). Notably, a small but distinct cluster of eight patients exhibited a high rate of disease recurrence during post-therapy surveillance. Although chemotherapy achieved pCR in half of these eight patients, six nevertheless developed recurrent disease (Table 5). Kaplan–Meier survival analysis demonstrated a statistically significant difference in survival when the pCR and RD groups were compared with the high-recurrence group (P < 0.01), while a similar analysis between the pCR and RD groups alone did not achieve statistical significance (Fig. 5).

Fig. 4 Internal consistency of the expression signature. The expression profiles of the 50 basal-like tumors were clustered using the 23-gene expression signature. The data from Hess et al. are indicated in the recurrence hashes by open boxes; the data from this institution are indicated by closed boxes

Table 5 Tabulation of pCR and recurrence events within the three groups resulting from hierarchical clustering of the 50 basal-like tumors with the 23-gene prediction profile
Fig. 5 Kaplan–Meier survival analysis. Disease-free survival estimates for the three groups (pCR, RD, and high recurrence) resulting from hierarchical clustering of the 50 basal-like tumors with the 23-gene prediction profile demonstrated a statistically significant difference in survival among the pCR, RD, and high-recurrence groups (P < 0.01). Survival analysis of the pCR and RD groups alone did not achieve statistical significance

Discussion

BL breast cancers are associated with a poor clinical outcome [24, 25]. Response of the primary tumor to neoadjuvant chemotherapy has been related to subsequent disease-free and overall survival, making this a valuable intermediate end-point for treatment evaluation. Using array data from 50 BL tumors combined from two independent studies, we have identified a 23-gene expression signature which predicts pCR of BL breast tumors to neoadjuvant chemotherapy regimens containing an anthracycline and a taxane. More importantly, this signature identifies a group of patients who are at high risk of disease recurrence. Our data suggest that within the BL breast cancer molecular subclassification there are further subtypes which have differing biological behavior, and that pCR itself within the BL subtype may not always portend a favorable prognosis.

Several recent studies suggest that gene expression profiles can achieve greater accuracy in predicting breast tumor response to neoadjuvant chemotherapies than clinical predictors alone [7, 9]. There are several approaches to the development of a multigene predictor of response to chemotherapy. One approach is to group all breast cancers into either responders or non-responders and define the gene expression differences between these groups. This approach has been successfully applied to develop prognostic signatures for breast cancer [5, 21]. In these studies, data are usually analyzed from all breast tumors regardless of underlying molecular subclassification. However, recent data indicate that substantive inferences concerning the response of a tumor to neoadjuvant chemotherapy may be made from the subtype of the tumor alone [26, 27]. Since distinct molecular classes of breast cancer exist, stratification of patients by molecular class should yield more accurate class-specific predictors. Therefore, incorporating specific breast tumor subclasses into a prediction scheme for neoadjuvant chemotherapy should increase its accuracy. In this study, our goal was to develop the best possible classifier for prediction of pCR in the BL breast cancer subtype.

Considerable barriers exist for molecular analysis using specific breast tumor subtypes. As each subclass exists as a fraction of all breast tumors, several-fold increases in patient accrual rates would be required to achieve adequate sample numbers for analyses. As an alternative approach, we combined the data sets from two neoadjuvant chemotherapy studies. Multiple published array sets were examined [5, 21]. Due to compatibility in platforms and number of data sets available, we chose to work with the set published by Hess et al. [4]. This allowed us to develop a 23-gene predictor of pCR in BL tumors with high accuracy. The genes identified are involved in a wide variety of cell functions including chromatin remodeling, gene expression, cell proliferation, ubiquitin regulation, cell motility, and signal transduction. Interestingly, two of the genes, NNMT and ABCB1, are involved in drug metabolism. However, not all of these genes may play a causative role in determining sensitivity to chemotherapy; some may represent distant downstream transcription effects of biological events that influence drug sensitivity. In a smaller study, Rouzier et al. found 61 genes which differed significantly between BL tumors which achieved a pCR and those that did not [11]. None of these 61 genes overlap with the 23 genes found to predict pCR in our data.

Although pCR is a powerful prognostic factor for prolonged survival in patients receiving neoadjuvant chemotherapy [15, 28–30], a significant proportion of patients with pCR develop recurrent disease [31]. Reported series have shown a 5-year recurrence rate in patients with pCR ranging from 13 to 25% [15, 29, 30, 32]. Of the 50 patients whose BL tumors were used in this analysis, 28% (n = 14) developed recurrences within 5 years of diagnosis, and half (n = 7) of these patients had a pCR. This paradoxical feature is consistent with other studies of BL breast cancers identified using gene expression profiling [11, 14].

Although the 23-gene signature was not constructed to predict disease-free survival, it is interesting to note that this signature also identified a minor subclass of patients who developed recurrent disease, independent of whether they exhibited pCR. The number of patients in this group was small but the effect reached statistical significance when compared to the remaining tumors. In contrast, survival analysis of the pCR versus the RD groups did not substantiate prior associations of pCR with increased disease-free survival [15, 29]. This is consistent with the findings of Keam et al., who report that although a higher rate of pCR is achieved in patients with triple negative tumors, these patients had shorter relapse-free survival and shorter overall survival [27]. It has been reported that recurrence rates with pCR progressively increased depending on the initial stage at diagnosis [33]. Patients in this study almost uniformly had clinical stage II/III tumors, and therefore stage at diagnosis is unlikely to account for the clustering of these patients. Our observation is more likely due to the biology of the tumor and resistance to systemic therapy [10, 12]. Although this study may not have been powered adequately to fully evaluate disease-free survival, future studies may be warranted to examine whether pCR is an accurate predictor of disease-free survival within the BL subtype of breast cancer.

In order to obtain a sufficient number of BL breast tumors for our analysis, we combined data from two different studies. Notable differences existed between the experimental designs of our study and that of Hess et al., which provided the second set of tumor expression data used in this meta-analysis. Core biopsies were collected in our study, whereas Hess et al. acquired fine needle aspirates [4]. Although these methods result in different ratios of tumor versus stromal cells in the collected sample, expression profiles derived from these sample collection methods have been found to be largely concordant, with only a small set of genes exhibiting differential expression [23]. Another difference in methodology was the number of chemotherapy treatments administered before pathological analysis and the chemotherapeutic regimens used. However, both regimens contained an anthracycline and a taxane. Although this would not affect the tumor expression profiles, given that the samples were collected prior to therapy administration, it may have contributed to the difference in pCR rates observed between the two studies. Various neoadjuvant chemotherapy regimens have previously been observed to achieve different pCR rates and would presumably affect the composition of the derived prediction model [34]. Despite these and other technical differences, we found an expression signature with sufficient signal to overcome these potential sources of bias between the two data sets. A prediction model derived from patients treated with heterogeneous regimens of chemotherapy may not result in optimal predictive accuracy for a specific regimen. However, a practically useful predictive model would be generally applicable to the reality of diverse clinical practice. With these constraints, a model could reasonably be expected to have high specificity for identifying pCR to neoadjuvant chemotherapy. Arguably, type II errors in detecting pCR are more acceptable from the vantage of oncologic treatment.

Gene expression analyses have identified molecular subtypes that are refining our understanding of breast cancer biology. We have identified a 23-gene signature which predicts pCR in BL breast cancers. Moreover, this signature identifies a subgroup of tumors associated with a decreased disease-free survival. Our data support the concept that there exists biological heterogeneity within the BL breast cancer subtype. It is important to identify patients with BL cancers who may develop systemic failure after achieving a pCR; these patients may benefit from additional therapy.