Introduction

Early onset breast cancers are thought to be biologically more aggressive and to result in earlier local recurrences and distant metastatic spread when compared to stage- and treatment-matched breast cancers arising after age 40 [1–4]. It has been argued that breast cancers arising in younger women are a unique biologic entity [5, 6]. However, the underlying biology driving the aggressiveness of breast cancers in younger women remains poorly understood despite previous consideration of many age-associated prognostic breast cancer features including endocrine dependence and biomarkers reflecting tumor proliferation, genetic instability, angiogenesis, and invasiveness [1]. In particular, it remains unclear whether primary tumors arising before age 40 or with high proliferative potential confer an equally poor prognosis for both endocrine-dependent and endocrine-independent breast cancer.

The most important age-associated biological feature of breast cancer that must be controlled for is estrogen receptor (ER) overexpression, as both breast cancer ER content (fmol/mg protein cytosol) and the proportion of ER-positive breast cancers increase in a decade-by-decade pattern after age 40 [1, 7]. In contrast, virtually all clinical measures of breast cancer proliferation (e.g., mitotic index, Ki-67/MIB-1 positivity), growth factor dependence (e.g., ErbB2/HER2, EGFR), and genomic instability (e.g., nuclear grade, p53 positivity or mutation frequency) that are associated with breast cancer aggressiveness show strong inverse correlations with tumor ER and patient age-at-diagnosis [7]. Tumor ER content and the proportion of ER-positive breast cancers rise continuously with increasing age-at-diagnosis from 40 to ≥80 years; and, in an inverse fashion, tumor growth biomarkers like ErbB2/HER2 and EGFR decline continuously with aging while tumor proliferation biomarkers like mitotic index and Ki-67/MIB-1 diminish most rapidly between ages 40–60 [1, 7]. Therefore, efforts to understand the aggressive nature of breast cancer arising in younger women must account for the strong inter-dependencies between age-at-diagnosis, tumor ER expression, and proliferative potential.

A recent cohort study of sporadic, node-negative ER-positive invasive ductal breast cancers arising in women under age 45 compared with histologically matched ER-positive breast cancers arising in women over age 69 revealed no significant genomic differences (copy number changes, p53 mutations), but did find significant differences in gene expression microarray profiles [8]. Unsupervised microarray analysis identified six different transcriptome subclasses with apparent age biases; in addition, supervised transcriptome analysis identified 59 genes (including ER) that were more highly expressed in the breast cancers of older women and 26 genes (including amphiregulin) that were more highly expressed in the breast cancers of younger women [8]. That study concluded that early and late onset ER-positive breast cancers likely arise by fundamentally different biological processes, with epigenetic rather than genomic mechanisms accounting for the fact that early onset breast cancers grow more rapidly and are biologically more aggressive than late onset ER-positive breast cancers. While it has been assumed that similar conclusions apply to endocrine-independent (ER-negative) breast cancers, there have been no similar cohort studies comparing the clinical and molecular features of early vs. late onset ER-negative breast cancer.

This study explores the prognostic impact of young age-at-diagnosis in relation to tumor ER status and proliferative potential. We pooled outcome and expression microarray data on adjuvant treatment-naive, node-negative breast cancer cases from four different clinically annotated public sources [9–12], looking for the differential impact of age-at-diagnosis and tumor proliferative potential on the metastatic outcome of ER-positive and ER-negative breast cancer. Since gene expression measures of tumor proliferation are known to have prognostic significance exceeding that of Ki-67/MIB-1 immunohistochemistry [13, 14], we compared two different, non-redundant transcriptome measures of breast cancer proliferation: a 61-gene proliferation signature [8, 15] and FOXM1, a single gene surrogate known to regulate the fidelity of cell division as well as other cancer, aging and regenerative cell mechanisms [16, 17].

Methods

A total of 683 adjuvant treatment-naive, node-negative breast cancer cases (447 ER-positive and 236 ER-negative) annotated for distant metastasis-free survival (DMFS) were pooled from four sources: GSE2034 [9], GSE5327 [10], GSE7390 [11], and NKI-295 [12]. Cases were dichotomized by age-at-diagnosis into younger (Y; <40 years) and older (O; ≥40 years) cohorts. Normalized and log2-scaled gene expression data were stratified by ER status and mean centered independently within each data source and according to ER subtype. Using chip annotation files obtained from the Broad Institute ftp site, data were collapsed by gene symbol such that expression of a gene represented by multiple probes was defined as the average across probes. Data generated on different microarray platforms were mapped together using gene symbols, yielding 10219 unique genes, and combined using distance weighted discrimination (DWD) [18]. Using 245 genes from the “Intrinsic/UNC” gene signature (Supplementary Table 1) [19], each tumor case was also assigned to one of five different intrinsic breast cancer subtypes: luminal-A, luminal-B, basal-like, HER2-like, or normal-like.
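The probe-collapsing and mean-centering steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the probe identifiers, gene names, and strata labels are hypothetical, the order of the two steps is an assumption (both are linear, so the result does not depend on it), and the DWD cross-platform adjustment is not shown.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_values, probe_to_gene):
    """Average probe-level log2 values into one value per gene symbol.

    probe_values: {probe_id: [value for each sample]}
    probe_to_gene: {probe_id: gene_symbol}; unannotated probes are dropped.
    """
    grouped = defaultdict(list)
    for probe, values in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            grouped[gene].append(values)
    # zip(*rows) walks the samples; each column holds one sample's probe values
    return {gene: [mean(col) for col in zip(*rows)] for gene, rows in grouped.items()}


def mean_center(gene_values, strata):
    """Subtract each gene's mean within each stratum, e.g. (data source, ER status)."""
    centered = {}
    for gene, values in gene_values.items():
        by_stratum = defaultdict(list)
        for value, stratum in zip(values, strata):
            by_stratum[stratum].append(value)
        stratum_mean = {s: mean(vs) for s, vs in by_stratum.items()}
        centered[gene] = [v - stratum_mean[s] for v, s in zip(values, strata)]
    return centered


# Hypothetical toy data: three probes, four samples, two (source, ER) strata
probes = {
    "p1": [8.0, 10.0, 6.0, 4.0],
    "p2": [6.0, 8.0, 4.0, 2.0],
    "p3": [5.0, 7.0, 9.0, 11.0],
}
annotation = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
strata = [("S1", "ER+"), ("S1", "ER+"), ("S2", "ER-"), ("S2", "ER-")]

collapsed = collapse_probes(probes, annotation)
centered = mean_center(collapsed, strata)
```

After centering, every gene has mean zero within each source-by-ER stratum, which is what allows expression values from different sources to be compared on a common scale before they are combined.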

Breast cancer proliferation was evaluated as the average expression score from a 61-gene proliferation signature that includes the MKI67 (Ki-67) gene and correlates well with mitotic grade (Supplementary Table 1) [8, 15]. Survival analysis was restricted to the subset of 621 breast cancer cases with ≤15 years of follow-up (400 ER-positive and 221 ER-negative), avoiding the high censoring rate after 15 years of follow-up; and Kaplan–Meier analyses were performed on cohorts dichotomized by ER status, age-at-diagnosis, and median proliferation score (PS). Significance was assessed by the log rank test. The prognostic value of age-at-diagnosis and breast cancer proliferation (dichotomized at median PS) within each ER cohort was similarly assessed. Associations between breast cancer PS and DMFS within each ER and age-at-diagnosis cohort were also evaluated by univariate Cox analysis. In addition, multivariate Cox analysis was employed to determine the independent prognostic value of age-at-diagnosis and proliferation within ER-stratified cohorts.
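As a rough illustration of how a signature-average score and its median dichotomy might be computed, consider the sketch below. The expression values are invented, `GENE_X` and `NOT_ON_CHIP` are placeholder names, and skipping signature genes that are absent from the data is an assumption rather than a documented step.

```python
from statistics import mean, median


def proliferation_score(expression, signature):
    """Per-sample mean of the signature genes that are present in the data."""
    rows = [expression[gene] for gene in signature if gene in expression]
    return [mean(col) for col in zip(*rows)]


def dichotomize_at_median(scores):
    """Label a sample 'High' when its score exceeds the cohort median."""
    cut = median(scores)
    return ["High" if s > cut else "Low" for s in scores]


# Hypothetical mean-centered log2 expression for four tumors
expression = {
    "MKI67": [1.0, -1.0, 0.5, -0.5],
    "GENE_X": [0.5, -0.5, 0.0, -1.0],
}
signature = ["MKI67", "GENE_X", "NOT_ON_CHIP"]  # the last entry is skipped

ps = proliferation_score(expression, signature)
ps_group = dichotomize_at_median(ps)
```

Because the cutpoint is the cohort median, the High and Low groups are close to equal in size within whichever cohort the median is taken over; whether the median is computed across all cases or within each ER stratum changes which cases fall on each side.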

The relationships between breast cancer proliferation, age-at-diagnosis, and ER status were evaluated by gene set enrichment analysis (GSEA) using the 61-gene proliferation signature. Box plots of the PS were also constructed, and differences between age-at-diagnosis within each ER subgroup were assessed by t test. For comparison, expression levels of FOXM1, a key transcriptional regulator of the cell cycle not present in the 61-gene proliferation signature, were compared between age-at-diagnosis and ER cohorts.

While the originally reported multigene proliferation signature generated on a Stanford custom microarray platform included a gene named forkhead drosophila-like 16 [15], its annotation could not be confirmed as identical to forkhead box M1, which was therefore not included in the set of proliferation signature genes used herein to generate the PS [8]. Potential age differences between the dichotomized PS and FOXM1 subsets (cutpoints defined by median values) were evaluated by Chi-square test (for O vs. Y cohorts) or pair-wise t test comparisons (for the continuous age-at-diagnosis variable). Kaplan–Meier analyses of PS and FOXM1 dichotomized subsets were used to assess metastatic outcome.
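For readers unfamiliar with the survival methods named above, the following is a minimal, dependency-free sketch of the Kaplan–Meier estimator and a two-group log rank test. It is an illustration only, not the software used in the study; the follow-up times, event indicators, and group labels are invented, and the P value uses the chi-square approximation on one degree of freedom.

```python
import math


def kaplan_meier(times, events):
    """Return [(event time, survival probability)] for one cohort.

    events[i] is 1 for a distant metastasis and 0 for a censored case.
    """
    survival, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - failed / at_risk
        curve.append((t, survival))
    return curve


def logrank_test(times, events, groups):
    """Two-group log rank test; returns (chi-square statistic, P value)."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("exactly two groups are required")
    first = labels[0]
    observed_minus_expected, variance = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == first)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e and g == first)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical follow-up (years), events, and median-split groups
times = [2, 3, 5, 7, 8, 10, 12, 15]
events = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["High", "High", "High", "Low", "High", "Low", "Low", "Low"]

high_curve = kaplan_meier([2, 3, 5, 8], [1, 1, 1, 1])
chi2, p_value = logrank_test(times, events, groups)
```

In this toy example all events fall in the High group, so the curves separate and the log rank P value is small; with real data a dedicated survival analysis package would normally be used instead.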

Results

Kaplan–Meier analysis of 621 node-negative, adjuvant treatment-naïve breast cancer cases reveals that while ER-positive breast cancer appears to have a better prognosis within the first 5 years of diagnosis, follow-up out to 15 years indicates no significant difference in metastatic outcome between ER-positive and ER-negative disease (Fig. 1a). In contrast, the survival differences associated with younger age-at-diagnosis (<40 years) and more highly proliferative tumors (PS > median value) persist over 15 years (Fig. 1b, c). Interestingly, the prognostic impact of age-at-diagnosis and breast cancer proliferation appears to be ER dependent, as significant curve separation between young (Y) and old (O) cohorts, and between high and low PS cohorts, is present in ER-positive but not ER-negative cases (Fig. 2). Univariate Cox analysis confirms that PS shows significant prognostic value in ER-positive but not ER-negative cases, irrespective of age-at-diagnosis (Table 1), although a trend was observed for association of higher PS with poor prognosis in the ER-negative Y cohort (P = 0.07). Multivariate Cox analysis revealed that among ER-positive cases, age-at-diagnosis cohorts were no longer significantly associated with survival when PS was taken into account; however, the multivariate hazard ratio (HR) between age cohorts appeared comparable to the univariate Cox HR: multivariate HR for Y vs. O cohorts = 1.40 (95% CI: 0.91–2.15, P = 0.13) relative to the univariate HR for Y vs. O cohorts = 1.65 (95% CI: 1.08–2.52, P = 0.021). Neither age nor PS was prognostic in the ER-negative cases in the multivariate Cox analysis. In line with previous reports, GSEA shows significant enrichment of the proliferation signature in younger age-at-diagnosis ER-positive breast cancer cases (FDR P = 0.002); in contrast, the proliferation signature is not enriched in younger ER-negative cases (FDR P = 0.398). In agreement with these GSEA findings, a higher PS is observed in the Y cohort of ER-positive but not ER-negative cases (Fig. 3a).
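The reported hazard ratios, confidence intervals, and P values can be checked against one another with a short calculation. The sketch below assumes Wald-type intervals that are symmetric on the log-hazard scale, which is a common convention but is not stated in the text; the function name is illustrative.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the standard error, z statistic, and two-sided P value
    from a hazard ratio and its 95% CI, assuming a Wald interval on log(HR)."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Values reported for Y vs. O cohorts among ER-positive cases
_, _, p_multivariate = wald_p_from_ci(1.40, 0.91, 2.15)
_, _, p_univariate = wald_p_from_ci(1.65, 1.08, 2.52)
```

Under that assumption the recovered P values are approximately 0.125 and 0.021, in line with the reported P = 0.13 and P = 0.021. The two log-scale standard errors are nearly identical (about 0.22), so the loss of significance after adjustment for PS reflects the smaller HR estimate rather than a wider interval.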

Fig. 1

Prognostic performance of breast cancer ER status, age-at-diagnosis, and proliferative capacity. Kaplan–Meier plots of distant metastatic events of 621 adjuvant treatment-naïve, node-negative breast cancers dichotomized by a ER status (ER+ (gray), ER− (black)); b age-at-diagnosis (Y: <40 years (black), O: ≥40 years (gray)); and c median PS (High (black), Low (gray))

Fig. 2

Prognostic performance of breast cancer age-at-diagnosis and proliferative capacity in ER-stratified cohorts. Kaplan–Meier plots of distant metastatic events in a 400 ER+ cases dichotomized by age (Y (black), O (gray)); b 221 ER− cases dichotomized by age (Y (black), O (gray)); c 400 ER+ cases dichotomized by proliferative capacity by median PS (High (black), Low (gray)); and d 221 ER− cases dichotomized by proliferative capacity by median PS (High (black), Low (gray))

Table 1 Univariate Cox prognostic analysis of tumor proliferation score (PS) on metastatic outcome (DMFS) for breast cancer cohorts defined by age-at-diagnosis (O ≥ 40 years; Y < 40 years) and ER status (+, −)

Fig. 3

PS and FOXM1 as measures of breast cancer proliferative capacity in relation to age-at-diagnosis and metastatic outcome. a Boxplot of PS in age and ER-stratified cohorts with P values for t test comparisons between Y and O cohorts within each ER subtype; b Scatterplot of FOXM1 expression vs. PS with Pearson correlation coefficient and P values (line depicts linear model fit); c Boxplot of FOXM1 in age and ER-stratified cohorts with P values for t test comparisons between Y and O cohorts within each ER subtype; d Kaplan–Meier analysis of FOXM1/PS subgroups among ER+ breast cancers (PS-High/FOXM1-High (magenta), PS-High/FOXM1-Low (orchid), PS-Low/FOXM1-High (gold), PS-Low/FOXM1-Low (sea green))

A different and non-redundant measure of breast cancer proliferation, FOXM1 expression, was also assessed for its prognostic value. In this dataset, FOXM1 mRNA levels correlate significantly with PS despite the absence of FOXM1 in the 61-gene proliferation signature (Fig. 3b). Although FOXM1 is significantly higher in the Y cohort only in the ER-positive cases (Fig. 3c), it shows a trend for different expression levels based on age-at-diagnosis in ER-negative cases (FOXM1 P = 0.0607 vs. PS, P = 0.2592), suggesting potential links between FOXM1 and aging independent of proliferation. As with PS, FOXM1 appears significantly prognostic only for ER-positive breast cancer (log rank P = 2.21e−06 and 0.380 for ER+ and ER− FOXM1 dichotomized cohorts, respectively); and multivariate Cox analysis revealed a similar reduction in the prognostic significance of young age when FOXM1 levels are accounted for in the ER-positive cohort, with multivariate HR for Y vs. O cohorts = 1.45 (95% CI: 0.947–2.23, P = 0.087).

Curiously, while PS is significantly prognostic within both luminal-A and luminal-B subtypes of endocrine-dependent breast cancer, FOXM1 is only prognostic within the luminal-A breast cancer subtype (data not shown), indicating that FOXM1 and PS do not reflect equivalent breast cancer phenotypes. When ER-stratified cohorts are dichotomized by both PS and FOXM1 (at their median values), the classification concordance by these two parameters (i.e., PS-High and FOXM1-High, PS-Low and FOXM1-Low) is >80% (84% for ER-positive and 87% for ER-negative cases). No significant Y vs. O cohort biases are apparent in the dichotomized PS/FOXM1 subgroups although, among ER-negative cases, discordant PS-High/FOXM1-Low cases exhibit significantly higher age-at-diagnosis than concordant PS-High/FOXM1-High cases (t test P = 0.0159). While Kaplan–Meier analyses suggest that FOXM1 status adds little prognostic value to dichotomized PS cohorts (log rank P values = 5.45e−07 and 0.816 for ER-positive and ER-negative PS/FOXM1 subsets vs. 4.37e−08 and 0.366 for ER-positive and ER-negative PS subsets, respectively), Fig. 3d shows that the subgroup of discordant ER-positive PS-Low/FOXM1-High breast cancers trends toward worse DMFS relative to concordant PS-Low/FOXM1-Low cases (P = 0.0691).
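The concordance figure quoted above is the fraction of cases that fall on the same side of the median for both parameters. A minimal sketch of that calculation, using invented PS and FOXM1 values and treating values strictly above the median as High (an assumption consistent with the "PS > median value" definition used earlier), is:

```python
from collections import Counter
from statistics import median


def median_split(values):
    """True where a value lies strictly above the cohort median."""
    cut = median(values)
    return [v > cut for v in values]


def concordance(a_values, b_values):
    """Fraction of cases classified the same way (both High or both Low)."""
    a_high, b_high = median_split(a_values), median_split(b_values)
    agree = sum(1 for a, b in zip(a_high, b_high) if a == b)
    return agree / len(a_high)


def cross_tabulate(a_values, b_values):
    """Count cases in each (PS group, FOXM1 group) cell."""
    label = {True: "High", False: "Low"}
    pairs = zip(median_split(a_values), median_split(b_values))
    return Counter((label[a], label[b]) for a, b in pairs)


# Hypothetical values for six tumors within one ER stratum
ps_values = [0.9, 0.4, -0.2, -0.8, 0.1, -0.5]
foxm1_values = [1.1, 0.3, 0.2, -0.9, -0.1, -0.4]

agreement = concordance(ps_values, foxm1_values)
table = cross_tabulate(ps_values, foxm1_values)
```

The off-diagonal cells of the cross-tabulation, PS-Low/FOXM1-High and PS-High/FOXM1-Low, correspond to the discordant subgroups examined in Fig. 3d.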

Discussion

The study of young onset breast cancers remains compromised by their rarity and heterogeneity since they constitute <7% of all newly diagnosed breast cancers with nearly half being ER-positive [1, 3]. Given these limitations and the known inter-dependencies between breast cancer age-at-diagnosis, ER status, and proliferative potential, we pooled multiple clinical and expression microarray datasets to evaluate metastatic outcome relative to these parameters in 621 early stage, adjuvant treatment-naïve breast cancers grouped into age-at-diagnosis (O = 520, Y = 101) and ER (ER-positive = 400, ER-negative = 221) cohorts. Consistent with earlier observations about the time-dependent prognostic value of ER [13], the DMFS curves for this pooled dataset show an initial survival benefit for ER-positive breast cancers within the first 5 years of diagnosis—when nearly all of the destined metastatic events for ER-negative cases occur—followed by a convergence of both survival curves (Fig. 1a). Unlike ER status, however, the association between younger onset breast cancer and poor prognosis appears to be retained for at least 15 years after diagnosis (Fig. 1b). Despite the expectation of biomarker experts who once considered the prognostic value of ER status to be more a reflection of tumor proliferative capacity than metastatic potential [13], metastatic outcome in this collection of early stage breast cancer cases appears well dichotomized by PS, based on a 61-gene signature from expression microarrays (Fig. 1c).

In the absence of published studies to the contrary, oncologists generally believe that endocrine-dependent and endocrine-independent primary breast cancers arising in young (age <40) women or with high proliferative capacity are prone to both early local recurrence and distant metastatic spread [1, 3, 4, 13]. This study demonstrates that this generalization is incorrect. Unconfounded by stage and treatment effects, the poor prognosis of early onset breast cancer is only apparent in ER-positive and not in ER-negative cases (Fig. 2a, b). Likewise, the prognostic value of PS is significant only for ER-positive and not for ER-negative breast cancer cases (Fig. 2c, d).

Based on these data, oncologists must now recognize that the poor prognosis associated with breast cancers diagnosed before age 40 or primary tumors with high proliferative capacity applies only to endocrine-dependent (ER-positive) breast cancer.

It is not entirely clear why the metastatic outcome of endocrine-independent breast cancer is not driven by either age-at-diagnosis or tumor proliferative capacity except for the fact that, regardless of age-at-diagnosis, most ER-negative breast cancers are highly proliferative and those destined for metastatic relapse usually do so within 5 years of diagnosis. Moreover, the biological aggressiveness and metastatic potential of ER-negative and ER-positive breast cancers appear to be driven by fundamentally different mechanisms. A recent analysis of the same ER-negative breast cancers studied here revealed a novel set of 14 genes functionally linked to immune/inflammatory chemokine regulation, unassociated with either aging or cell proliferation, that is capable of significantly predicting the metastatic outcome of ER-negative but not ER-positive breast cancer [20].

Unlike ER-negative disease, young onset ER-positive breast cancer is significantly associated with more highly proliferative tumors (Fig. 3a), raising the question of whether the prognostic significance of age-at-diagnosis is merely a reflection of the proliferative status of endocrine-dependent breast cancer. Our multivariate Cox analysis found that young age loses its prognostic significance when either proliferative status (PS) or FOXM1 expression is taken into account, suggesting that much of the prognostic value of young age-at-diagnosis is due to the proliferative status of ER-positive breast cancer. However, the modest reduction in the hazard ratio (HR) from univariate to multivariate Cox analysis suggests that additional tumor features besides PS and FOXM1 potentially contribute to the poor prognosis of young onset ER-positive breast cancer.

Most of the better known multigene predictors of breast cancer outcome contain proliferation gene modules and demonstrate comparable prognostic utility restricted to ER-positive breast cancer [21]. Our attention was drawn to a single gene not included in other validated multigene predictors [9, 12, 14, 22] yet intimately linked to the fidelity of mitosis and maintenance of genomic stability, the forkhead box (FOX) transcription factor family member known as FOXM1 [17]. Upon phosphorylation by mitogenic cues, the transcriptionally active isoforms of FOXM1 (FOXM1B and FOXM1C) bind and transactivate cyclins and cyclin-dependent kinases (Cdk) or repress Cdk inhibitors, qualifying FOXM1 as a bona fide proliferation-specific biomarker [17]. In keeping with its role as a proliferation biomarker, FOXM1 transcript levels correlate significantly (Rp = 0.83, P < 2.2e−16) with the 61-gene PS (Fig. 3b), although we found that this correlation was not nearly as strong as that seen between PS and another 14-gene predictor [14] of breast cancer proliferation (Rp = 0.93, P < 2.2e−16; data not shown), hinting that FOXM1 expression directs additional cell processes besides cell proliferation. This study offers additional provocative evidence for the differentiation of FOXM1 from other proliferation biomarkers. While both PS and FOXM1 levels were significantly higher in younger onset ER-positive breast cancers, FOXM1 but not PS trended toward higher expression in young onset ER-negative cases (Fig. 3a, c). Neither FOXM1 nor PS showed any prognostic value in ER-negative cases yet both were prognostic in ER-positive cases; however, FOXM1 was only prognostic for the luminal-A subtype while PS was prognostic for both luminal-A and luminal-B subtypes of ER-positive breast cancer. The existence of discordant PS/FOXM1 cases with potential differences in their metastatic outcome also indicates that FOXM1 and PS are not prognostically equivalent (Fig. 3d). Since the expression microarray probes detecting FOXM1 transcripts cannot distinguish between splice variants, it was not possible in this study to determine if the discordances between FOXM1 expression and PS were due to variable expression of the transcriptionally inactive FOXM1A variant.

The unfolding role of FOXM1 as a mechanistic link between organ aging, regeneration and tumorigenesis may also contribute to the lack of prognostic equivalence between FOXM1 and PS. On the one hand, constitutive upregulation of FOXM1 not only deregulates cell proliferation and promotes tumor development but also enhances the regenerative potential of aged normal organs [17]. On the other hand, beyond mere induction of cell cycle arrest, FOXM1 absence or loss of function prevents tissue regeneration at the organ level while also inducing chromosome missegregation, defective cytokinesis, and aneuploidy at the cellular level [17]. In addition to being mechanistically linked to the mitotic defects commonly observed in aging normal cells, FOXM1 was recently shown to be one of the most significantly downregulated genes in elderly people as well as in younger patients suffering from the premature aging syndrome known as Hutchinson–Gilford progeria [16]. Thus, this study calls attention to the unexpected finding of discordant ER-positive breast cancer cases in which FOXM1 expression and tumor proliferation are not tightly linked. In some instances, as in PS-Low/FOXM1-High cases, this functional disconnection appears to have prognostic consequences that affect metastatic outcome. A better understanding of the prognostic and functional role of FOXM1 in breast and other cancers may shed new light on how normal cell regenerative mechanisms regulated by FOXM1 may be usurped, beyond that of dysregulated cell cycling, and thereby result in a more aggressive cancer phenotype.

In conclusion, this study demonstrates that the poor prognosis associated with primary breast tumors diagnosed before age 40 or with high proliferative capacity, assessed by either of two proliferation parameters (PS and FOXM1), applies only to endocrine-dependent breast cancer. This higher risk for metastatic relapse associated with early onset ER-positive breast cancer is largely, but not entirely, attributable to the significantly higher proliferative capacity of these primary tumors. As well, our findings illustrate a provocative distinction between FOXM1 and PS as breast cancer proliferation parameters, with FOXM1 potentially playing a tumorigenic role beyond its mere regulation of cell mitosis.