12.1 Introduction

Breast cancer remains the most common cancer diagnosed in women in Europe and the USA. Screening programs, education, and improved adjuvant treatment have decreased the mortality rates from this disease. However, more than 450,000 deaths due to breast cancer are still estimated to occur annually worldwide [1]. The most plausible explanation for this scenario is that we lack a complete picture of the biologic heterogeneity of breast cancers. Importantly, this complexity is not fully reflected by the main clinical parameters (such as tumor size, lymph node involvement, histological grade, age) and pathological markers (estrogen receptor [ER], progesterone receptor [PR], and human epidermal growth factor receptor 2 [HER2]), all of which are routinely used in the clinic to stratify patients for prognostic predictions, to select treatments, and to include patients in clinical trials.

Gene expression profiling has had a considerable impact on our understanding of breast cancer biology, allowing researchers to measure the expression of thousands of genes simultaneously in a single experiment in order to create molecular profiles. During the last 15 years, we and others have identified and extensively characterized 5 intrinsic molecular subtypes of breast cancer (Luminal A, Luminal B, HER2-enriched, basal-like, and claudin-low) and a normal breast-like group [26]. In 2000, Perou and colleagues published the first article classifying breast cancer into intrinsic subtypes based on gene expression profiling [2]. Using DNA microarrays from 38 breast cancer cases, 4 molecular subtypes were identified: Luminal, HER2, basal-like, and normal breast. The subsequent expansion of this work in a larger cohort of patients showed that the Luminal subgroup could be divided into at least two groups (Luminal A and B) [7].

In 2009, Parker et al. [8] published a clinically applicable gene expression-based predictor, known as PAM50, which was developed using microarray and quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) data from 189 prototypic samples, each of which fell into one of the 4 main intrinsic subtypes (Luminal A, Luminal B, HER2-enriched, and basal-like) or the normal-like group. By comparing global gene expression data from microarray and qRT-PCR, a minimized set of 50 genes was identified that could reliably classify each tumor into one of the intrinsic subtypes with 93 % accuracy. Over the past 7 years, the PAM50 intrinsic subtypes have been shown to provide significant prognostic and predictive information beyond standard parameters [9–12]. The PAM50 assay is now clinically implemented worldwide using the nCounter platform [13–19].
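As a rough illustration of how a centroid-based predictor of this kind can assign a subtype, the sketch below correlates a sample's expression profile with one reference profile (centroid) per subtype and picks the best match. It is a toy under stated assumptions, not the published PAM50 algorithm: the four genes, the centroid values, and the choice of Spearman rank correlation as the similarity measure are all illustrative and are not taken from this chapter.

```python
from typing import Dict, List, Tuple


def average_ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation = Pearson correlation of the rank vectors."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of genes")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no ranking information
    return cov / (sd_x * sd_y)


def classify(sample: List[float],
             centroids: Dict[str, List[float]]) -> Tuple[str, Dict[str, float]]:
    """Assign the subtype whose centroid correlates best with the sample."""
    scores = {name: spearman(sample, c) for name, c in centroids.items()}
    return max(scores, key=scores.get), scores


# Hypothetical centroids over 4 genes (ESR1, MKI67, ERBB2, KRT5).
# The values are invented for illustration only.
TOY_CENTROIDS = {
    "Luminal A": [2.0, -1.5, -0.5, -1.0],
    "Luminal B": [1.5, 1.0, -0.2, -1.2],
    "HER2-enriched": [-0.5, 1.0, 2.5, -0.8],
    "Basal-like": [-2.0, 1.5, -0.5, 2.0],
}

luminal_like_call, _ = classify([1.8, -1.0, -0.3, -0.9], TOY_CENTROIDS)
basal_like_call, _ = classify([-1.5, 1.2, -0.4, 1.9], TOY_CENTROIDS)
```

A rank-based similarity is used in this sketch because it is insensitive to platform-specific scaling of the expression values; a real predictor would use all 50 genes and centroids trained on prototypic samples.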

A particular result that highlights the importance of intrinsic subtyping in breast cancer comes from one of the most complete molecular characterization studies ever performed in breast cancer. In this study, led by The Cancer Genome Atlas Project (TCGA), more than 500 primary breast cancers were extensively profiled at the DNA (i.e., methylation, chromosomal copy number changes, and somatic and germ line mutations), RNA (i.e., miRNA and mRNA expression), and protein (i.e., protein and phosphoprotein expression) levels using the most recent technologies [6]. In a particular analysis of over 300 primary tumors [6], 5 different data types (i.e., all except DNA mutations) were combined in a cluster of clusters in order to determine how many biologically homogeneous groups of tumors can be identified in breast cancer. The consensus clustering results showed the presence of 4 main entities of breast cancer but, more importantly, these 4 entities were found to be recapitulated very well by the 4 main intrinsic subtypes (Luminal A, Luminal B, HER2-enriched, and basal-like) as defined by mRNA expression only [8]. Overall, these results suggest that intrinsic subtyping captures a large part of the biological diversity of breast cancer.

12.2 Intrinsic Subtyping Based on Gene Expression Versus Histopathology

To date, numerous studies have evaluated and compared the classification of tumors based on the PAM50 gene expression predictor with the pathology-based surrogate definitions [6, 11, 20–34]. To better understand the concordance between the 2 classification methods, we have combined the data from all of these studies for a total of 5994 independent samples (Fig. 12.1). The vast majority of these studies performed central determination of the pathology-based biomarkers. This needs to be taken into account, since it does not reflect current clinical practice, where each hospital determines these biomarkers locally. Of note, large discrepancies (~20 %) between local and central determination of ER, PR, Ki67, and HER2 are expected [35–39].

Fig. 12.1 Distribution of the PAM50 intrinsic subtypes within each pathology-based group. The data have been obtained from the different publications. Several studies have performed a standardized version of the PAM50 assay (RT-qPCR-based or nCounter-based) from formalin-fixed paraffin-embedded tumor tissues [11, 22, 25, 27–30], while others have performed the microarray-based version of the PAM50 assay [6, 24, 26, 31–34]

In this combined analysis, the overall discordance rate between the two classifications was 30.72 % across all patients. Across the IHC-based subtypes, the discordance rate was 37.8, 48.9, 53.8, 33.9, and 13.9 % for the IHC-Luminal A, IHC-Luminal B, IHC-Luminal B/HER2+ (to identify PAM50 Luminal B), HR-/HER2+ (to identify PAM50 HER2-enriched), and triple-negative (to identify PAM50 basal-like) subtypes, respectively. The most likely explanation for these results is that 3 or 4 biomarkers do not fully recapitulate the intrinsic subtypes of breast cancer. In fact, during the development of the clinically applicable PAM50 intrinsic subtype predictor, 50 genes were found to be the minimum number of genes needed to robustly identify the 4 main intrinsic subtypes without compromising its accuracy [4].
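The arithmetic behind per-group and overall discordance rates can be sketched as follows. This is a minimal illustration, assuming a cross-tabulation of pathology-based groups against PAM50 calls; the counts in the example are invented and are not the data behind Fig. 12.1. The point it makes is that the overall rate is a sample-weighted figure, not the plain average of the per-group rates.

```python
from typing import Dict, Tuple


def discordance_rates(
    crosstab: Dict[str, Dict[str, int]],
    expected: Dict[str, str],
) -> Tuple[Dict[str, float], float]:
    """Return per-group and overall discordance rates in percent.

    crosstab maps each pathology-based group to counts per PAM50 subtype;
    expected maps each group to the PAM50 subtype it is meant to identify.
    """
    per_group: Dict[str, float] = {}
    total = 0
    total_discordant = 0
    for group, counts in crosstab.items():
        n = sum(counts.values())
        if n == 0:
            continue  # an empty group contributes nothing
        discordant = n - counts.get(expected[group], 0)
        per_group[group] = 100.0 * discordant / n
        total += n
        total_discordant += discordant
    overall = 100.0 * total_discordant / total if total else 0.0
    return per_group, overall


# Hypothetical counts, for illustration only.
toy_crosstab = {
    "IHC-Luminal A": {"Luminal A": 60, "Luminal B": 30,
                      "HER2-enriched": 5, "Basal-like": 5},
    "Triple-negative": {"Basal-like": 90, "HER2-enriched": 10},
}
toy_expected = {
    "IHC-Luminal A": "Luminal A",
    "Triple-negative": "Basal-like",
}

toy_per_group, toy_overall = discordance_rates(toy_crosstab, toy_expected)
```

In this toy table, 40 of 100 IHC-Luminal A samples and 10 of 100 triple-negative samples are discordant, so the overall rate is 50 of 200 samples.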

The protein expression of Ki-67 has been studied as a potential IHC marker that could distinguish Luminal B from Luminal A subtypes in HR+ breast tumors. In the article published by Cheang et al. [40], 357 breast tumors were profiled and tumor subtypes were assigned using the 50-gene qRT-PCR ‘PAM50’ subtype predictor. By linking the available immunohistochemical data with the expression profile assignments, the authors identified 84 and 60 HR+/HER2− tumors as Luminal A and B, respectively. Thus, the Luminal A subtype was defined as being HR+/HER2− and low for Ki-67, and the Luminal B subtype as being HR+/HER2− and high for Ki-67 or HR+/HER2+. Further validation of this surrogate IHC panel in an independent population-based cohort of 4046 tumors demonstrated the prognostic value of this Luminal B IHC definition within homogeneously treated patient subsets. However, we must keep in mind that although the HR+/HER2−/Ki67-high/low IHC panel will distinguish the majority of Luminal B from A tumors, this definition does not identify all the tumors within the Luminal B expression-defined subtype since up to 20 and 7 % of Luminal B tumors are clinically ER+/HER2+ and ER−/HER2−, respectively.
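The surrogate definitions described above amount to a small decision rule, sketched here. The group labels follow this section; the Ki-67 cutoff is deliberately left as a required argument because the text does not state one, and treating values at or above the cutoff as "high" is an assumption of this sketch.

```python
from typing import Optional


def ihc_surrogate_subtype(
    hr_positive: bool,
    her2_positive: bool,
    ki67_percent: Optional[float],
    ki67_cutoff: float,
) -> str:
    """Map HR, HER2, and Ki-67 status to a pathology-based surrogate group.

    ki67_cutoff has no default on purpose: the threshold separating low
    from high Ki-67 must be supplied by the caller.
    """
    if hr_positive and her2_positive:
        return "IHC-Luminal B/HER2+"
    if not hr_positive and her2_positive:
        return "HR-/HER2+"
    if not hr_positive and not her2_positive:
        return "Triple-negative"
    # Remaining case: HR+/HER2-, where Ki-67 separates Luminal A from B.
    if ki67_percent is None:
        raise ValueError("Ki-67 is required for HR+/HER2- tumors")
    if ki67_percent >= ki67_cutoff:
        return "IHC-Luminal B"
    return "IHC-Luminal A"


# Usage with a purely illustrative cutoff of 14 % (not taken from the text).
example_low = ihc_surrogate_subtype(True, False, 5.0, ki67_cutoff=14.0)
example_high = ihc_surrogate_subtype(True, False, 30.0, ki67_cutoff=14.0)
```

Because such a rule uses only 3 or 4 markers, it cannot reproduce every expression-defined call, which is consistent with the discordance rates reported above.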

12.3 Main Molecular Features of the Intrinsic Subtypes

12.3.1 Luminal Disease

At the RNA and protein level, Luminal A and B subtypes are largely distinguished by the expression of two main biological processes: proliferation/cell cycle-related pathways and luminal/hormone-regulated pathways (Fig. 12.2).

Fig. 12.2 Intrinsic subtype identification using the PAM50 subtype predictor. PAM50 unsupervised gene expression heatmap of 1197 breast cancer samples available at the TCGA download portal. The subtype calls of each sample are shown below the array tree. Each square represents the relative transcript abundance

Luminal A breast cancer is the most common subtype, representing 50–60 % of the total. It is characterized by the expression of genes activated by the ER transcription factor that are typically expressed in the luminal epithelium lining the mammary ducts. It also shows low expression of genes related to cell proliferation [41]. The Luminal A immunohistochemistry (IHC) profile is characterized by the expression of ER, PR, Bcl-2, and cytokeratin CK8/18, an absence of HER2 expression, a low rate of proliferation measured by Ki67, and a low histological grade. Moreover, GATA3 shows its highest expression in the Luminal A subgroup.

Compared to Luminal A tumors, Luminal B tumors have higher expression of proliferation/cell cycle-related genes or proteins (e.g., MKI67 and AURKA) and lower expression of several luminal-related genes or proteins such as the PR [42] and FOXA1, but not the ER [30], which is similarly expressed in the two luminal subtypes and can only help distinguish luminal from non-luminal disease. At the DNA level, Luminal A tumors show a lower number of somatic mutations across the genome, a lower number of chromosomal copy number changes (e.g., lower rates of CCND1 amplification), fewer TP53 mutations (12 % vs. 29 %), similar GATA3 mutations (14 % vs. 15 %), and more PIK3CA (45 % vs. 29 %) and MAP3K1 mutations (13 % vs. 5 %) compared to Luminal B tumors [6] (Table 12.1). Interestingly, a subgroup of Luminal B tumors is hypermethylated, and a subgroup of Luminal A (6.3–7.8 %) and Luminal B (16.4–20.8 %) tumors shows HER2 amplification/overexpression.

Table 12.1 More frequently mutated genes in 3303 primary breast cancers

Within HR+/HER2-negative breast cancer, 90–95 % of tumors fall into the Luminal A and B subtypes. In early breast cancer, Luminal B disease has worse baseline distant recurrence-free survival at 5 and 10 years than Luminal A disease, regardless of adjuvant systemic therapy (Fig. 12.3). Regarding prognosis, the Luminal A subtype has repeatedly been shown to have a better outcome than the other subtypes across many datasets of patients with early breast cancer, including 6 phase III clinical trials (i.e., CALGB9741 [43], GEICAM9906 [44], TransATAC [11], ABCSG08, MA.5 [45], and MA.12 [25] trials) coming from different countries and populations and with different adjuvant systemic therapies (i.e., endocrine-only, chemotherapy-only, and both).

Fig. 12.3 Kaplan–Meier curves of relapse-free survival based on intrinsic subtype in 2629 patients from a combined cohort (GSE12276 [113], GSE18229 [5], GSE18864 [114], GSE2034 [115, 116], GSE22219 [117], GSE25066 [118, 119], GSE2603 [120], GSE2990 [121], GSE4922 [122, 123], GSE7390 [124], and GSE7849 [125]) of breast cancer patients. Dark blue, Luminal A; light blue, Luminal B; red, basal-like; pink, HER2-enriched; yellow, claudin-low

Of note, the vast majority of these studies with long-term follow-up show that the survival curves of Luminal B tumors cross the survival curves of basal-like disease at ~10 years of follow-up. Thus, although basal-like disease has a worse outcome than Luminal B tumors at 5 years of follow-up, this is not the case at 10 years. This result suggests that we should focus on finding additional therapies for Luminal B disease, since this tumor subtype is very frequent (i.e., represents ~30–40 % of all breast cancer diagnoses), and chemotherapy and endocrine therapies are not enough for the majority of these patients.

Apart from predicting baseline prognosis, the Luminal A vs B classification, together with tumor size and nodal status, predicts the residual risk of recurring at a distant site between years 5 and 10 of follow-up (the so-called late recurrence) [46–48]. This suggests that intrinsic subtype can inform decisions concerning the length of endocrine therapy (i.e., 5 vs. 10 years), with low-risk Luminal A tumors with low tumor burden (e.g., tumor size 1 cm and node-negative) being the group where 5 years of endocrine therapy might be sufficient.

Most of the direct evidence of general chemosensitivity of the Luminal A and B subtypes comes from the neoadjuvant setting. For example, in a cohort of 208 patients with luminal disease treated with anthracycline/taxane-based chemotherapy and with pathologic complete response (pCR) data, the pCR rates in patients with the Luminal A and B subtypes were 3 and 16 % (odds ratio = 6.01, p-value = 0.003), respectively [4, 49–52]. Overall, these data suggest that among the 2 luminal subtypes, Luminal A tumors are less chemosensitive than Luminal B tumors. This hypothesis is further supported by the fact that pCR is not predictive of survival outcome in IHC-Luminal A tumors [51] or in patients with HR+/HER2−/low-grade tumors [53], but it is predictive of outcome in IHC-Luminal B/HER2-negative [51] and in HR+/HER2−/high-grade tumors [53]. Further studies are needed to determine whether Luminal A tumors benefit from chemotherapy, from specific chemotherapeutic agents/regimens, or even from CDK4/6 inhibitors. The answer would be especially relevant in the clinic for those patients with Luminal A tumors with high tumor burden (intermediate or high risk).
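To make the odds ratio quoted above concrete, the sketch below computes an unadjusted odds ratio directly from two response rates. It is an illustration only: with the rounded pCR rates of 16 and 3 % it gives roughly 6.2, close to but not identical with the reported 6.01, which was presumably derived from the exact patient counts or from a model not reproduced here.

```python
def odds_ratio(rate_group: float, rate_reference: float) -> float:
    """Unadjusted odds ratio of an event in one group versus a reference.

    Both rates are proportions strictly between 0 and 1.
    """
    for rate in (rate_group, rate_reference):
        if not 0.0 < rate < 1.0:
            raise ValueError("rates must lie strictly between 0 and 1")
    odds_group = rate_group / (1.0 - rate_group)
    odds_reference = rate_reference / (1.0 - rate_reference)
    return odds_group / odds_reference


# Rounded pCR rates from the text: Luminal B 16 %, Luminal A 3 %.
luminal_b_vs_a = odds_ratio(0.16, 0.03)
```

The odds ratio is larger than the simple ratio of the two rates (16/3, about 5.3) because odds grow faster than proportions as the event becomes more frequent.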

Regarding the benefit from endocrine therapy, both tumor subtypes have been shown to derive a similar relative benefit, as measured by the proportional fall in the proliferation marker Ki67 upon treatment with an aromatase inhibitor in the neoadjuvant setting [24]. However, since Luminal A tumors have a lower baseline proliferation status than Luminal B tumors, a larger proportion of them can achieve low post-treatment values.

12.3.2 HER2-Enriched

The HER2-enriched subtype is characterized at the RNA and protein level by the high expression of HER2-related and proliferation-related genes and proteins (e.g., ERBB2/HER2 and GRB7), intermediate expression of luminal-related genes and proteins (e.g., ESR1 and PGR), and low expression of basal-related genes and proteins (e.g., keratin 5 and FOXC1). At the DNA level, these tumors show the highest number of mutations across the genome, and 72 and 39 % of HER2-enriched tumors are TP53- and PIK3CA-mutated, respectively (Table 12.2). Although the majority (68 %) of HER2-enriched tumors have ERBB2/HER2 overexpression/amplification, the HER2-enriched subtype can also be expected within HER2-negative disease. Interestingly, the HER2-enriched subtype has been found to be uniquely enriched for tumors with a high frequency of APOBEC3B-associated mutations [54]. APOBEC3B is a member of the APOBEC family of cytidine deaminases, which convert cytosine to uracil, and has been implicated as a source of mutations in many cancer types [55].

Table 12.2 Highlights of genomic, clinical, and proteomic features of the intrinsic subtypes

Similar to the other pathology-based groups, all the intrinsic molecular subtypes can be identified within clinically HER2-positive disease, albeit in different proportions. In our combined analysis of 831 HER2+ tumors (Fig. 12.1), 44.6, 26.8, 17.6, and 11.0 % were identified as HER2-enriched, Luminal B, Luminal A, and basal-like, respectively.

From a biological perspective, a particular unanswered question was how much an intrinsic subtype differs according to HER2 status. For example, how different is HER2+/Luminal A disease from classical HER2-negative/Luminal A disease? We recently approached this question by interrogating The Cancer Genome Atlas (n = 495) and Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) datasets (n = 1730) of primary breast cancers for molecular data derived from DNA, RNA, and protein, and determined intrinsic subtype. Within each subtype, only 0.3–3.9 % of genes were found to be differentially expressed between HER2+ and HER2-negative tumors. As expected, the vast majority of differentially expressed genes originated in the 17q12 DNA amplicon where the ERBB2 gene is located. Within HER2+ tumors, HER2 gene and protein expression were statistically significantly higher in the HER2-enriched subtype than in either luminal subtype. Thus, this result suggests that intrinsic subtype dominates the biological phenotype within HER2+ and HER2-negative disease.

Two large studies have evaluated the prognostic value of HR status (i.e., a surrogate manner of looking at luminal vs non-luminal disease) within HER2+ breast cancer [56, 57]. In the 4-year follow-up of the N9831 and National Surgical Adjuvant Breast and Bowel Project B-31 adjuvant trials of trastuzumab in HER2+ disease (n = 4045), HR-positive disease was statistically significantly associated with approximately 40 % increased disease-free survival and overall survival compared to HR-negative disease [38]. This association of HR status with survival was found to be independent of the main clinical–pathological variables, including trastuzumab administration. Similar results were observed in a prospective cohort study of 3394 patients with stage I to III HER2+ breast cancer from National Comprehensive Cancer Network centers [57]. In both studies, patients with HR-negative disease experienced more cancer relapses in the first 5 years than those with HR-positive disease [57]. Interestingly, patients with HR-negative tumors were less likely to experience first recurrence in bone and more likely to recur in brain, compared to patients with HR-positive tumors [57]. Better outcomes, independent of treatment, in the HR-positive group compared to the HR-negative group have also been observed in the NeoALTTO [58] and ALTTO [59] clinical trials.

Regarding intrinsic subtyping, we have recently evaluated the prognostic value of these entities in a large retrospective cohort of 1730 patients from the UK and Canada with and without HER2+ disease treated in the adjuvant setting with different treatments except trastuzumab [32]. The results revealed that intrinsic subtypes are an independent prognostic variable beyond tumor size and nodal status, and HER2+/Luminal A tumors showed a similar outcome compared to HER2-negative/Luminal A tumors [31]. Overall, these data suggest that Luminal A disease could be used in the future, together with tumor size and nodal status, to better identify those patients with a low risk of relapsing who could thus be safely treated with less intense chemotherapy, such as the adjuvant paclitaxel and trastuzumab regimen recently proposed for “small” (i.e., <3.0 cm) and node-negative HER2+ breast cancer [60].

The intrinsic subtypes might also help identify those patients with HER2+ early breast cancer who might be successfully treated with dual HER2 blockade (+/− endocrine therapy) but without chemotherapy, since their tumors are exquisitely sensitive to anti-HER2 therapy. Interestingly, in a recently reported neoadjuvant study, the TBCRC023, comparing 12-week versus 24-week lapatinib + trastuzumab treatment (and endocrine therapy if HR+), the pCR rate in the HR+ tumors was 33.2 %, suggesting that longer treatment in HR+ tumors might reach similar pCR rates as chemotherapy plus two anti-HER2 agents [61]. However, no data on intrinsic subtype are available to date from these studies. Based on prior knowledge, one can speculate that, regardless of HR status, the HER2-enriched subtype enriches for patients who are more likely to achieve a pCR with dual HER2 blockade without chemotherapy. We are currently testing this hypothesis in a prospective neoadjuvant clinical trial called PAMELA (NCT01973660), which is similar to the TBCRC006 and TBCRC023 trials, but with treatment lasting 18 weeks.

12.3.3 Basal-Like

The basal-like subtype is characterized at the RNA and protein level by the high expression of proliferation-related genes (e.g., MKI67) and keratins typically expressed by the basal layer of the skin (e.g., keratins 5, 14, and 17), intermediate expression of HER2-related genes, and very low expression of luminal-related genes. At the DNA level, these tumors show the second highest number of mutations across the genome and are mostly hypomethylated, and 80 and 9 % of basal-like tumors are TP53- and PIK3CA-mutated, respectively. BRCA1-mutated breast cancer is associated with basal-like disease [62, 63]. Finally, ERBB2/HER2 overexpression/amplification is found in 2.1–17.4 % of tumors with a basal-like profile.

Previous studies (including our own) have tried to define basal-like carcinomas based on immunohistochemical (IHC) surrogate profiles. For example, EGFR and keratins 5/6 (CK5/6) have been proposed as positive IHC markers on top of the ER−/PR−/HER2− definition (the “five-marker method,” also known as the Core Basal group). This definition has previously been shown to identify basal-like tumors, as defined by microarray-based classification, with 76 % sensitivity and 100 % specificity [29]. Furthermore, in a series of 4046 breast tumors [64], 17 % (639 of 3744) were defined as triple-negative (TN), whereas 9.0 % were basal-like by the five-marker Core Basal definition. Interestingly, when the triple-negative group was segregated into Core Basal and the “5 Negative Profile” (5NP), the Core Basal group showed a significantly worse outcome compared to the 5NP group.

12.3.3.1 Basal-Like Classification: Biological and Epidemiological Implications

The TCGA comprehensive molecular characterization of breast cancer confirmed that among all the intrinsic subtypes, the basal-like is the most distinct [6]. This observation fits with previous molecular studies and with clinical data that show that triple-negative breast cancer tends to affect young women, is associated with BRCA1 mutations, and is a highly aggressive disease [65]. However, how different is basal-like disease from the rest of breast cancer subtypes?

Two recent studies have addressed this question from a biological perspective [66, 67]. In the first one, we evaluated global microarray-based gene expression profiles of a combined dataset composed of 6 different cancer types obtained from the TCGA project and that included 542 primary breast cancers [66]. The unsupervised results revealed that a subgroup of breast cancers, virtually all basal-like by PAM50, should be considered a molecular entity by itself just like ovarian or colorectal cancer, and that >70 % of basal-like breast cancers were more similar to squamous cell lung cancer than to Luminal A or B disease [66]. In the second study, the panCancer TCGA study group combined all the available molecular data (except mutations) across 12 cancer types, including 845 primary breast cancers [67]. Unsupervised classification using all data types revealed a similar finding as the previous study, namely that basal-like breast cancer is a unique entity and much different from the rest of breast tumors. Interestingly, the other cancer type that showed such a large biological heterogeneity was bladder cancer which could be reclassified into 3 distinct molecular entities, one being similar to the basal-like breast cancer subtype [67].

Despite in vivo preclinical data suggesting that breast cancer arises from the transformation of a common luminal progenitor [68–70], this biological result with human tumors strongly suggests that 2 very different cell types of origin exist in the mammary gland: one whose transformation gives rise to basal-like disease and another whose transformation gives rise to non-basal-like disease.

Epidemiological data point in the same direction. An example is the work by Millikan et al. [71] looking at risk factors of breast cancer in a population-based, case–control study of African-American and white women. The results revealed that Luminal A disease showed the associations typically reported for breast cancer, in which increased parity and younger age at first full-term pregnancy are protective; in contrast, basal-like cases exhibited several associations that were opposite to those observed for Luminal A, including increased risk with parity and with younger age at first full-term pregnancy [71]. Moreover, longer duration of breastfeeding, increasing number of children breastfed, and increasing number of months breastfeeding per child were each associated with reduced risk of basal-like breast cancer, but not Luminal A [71]. Overall, these data suggest that we should clearly separate these two entities when we talk about breast cancer.

Within HR+/HER2-negative early disease, a subpopulation of non-luminal subtypes (i.e., HER2-enriched and basal-like) is expected to be identified by gene expression (Fig. 12.1). Basal-like tumors represent ~1 %. Based on the molecular features of these two non-luminal subtypes, one would expect to identify these tumors in patients whose tumors express low ER. In fact, one study performed intrinsic subtyping in 25 tumor samples with 1–9 % ER-positive tumor cells and found that 80 % were non-luminal (48 % basal-like and 32 % HER2-enriched) [72]. On the other hand, a combined analysis of 48 borderline cases (1–10 % ER+ tumor cells) from the MA.5, MA.12, and GEICAM9906 trials revealed that 46.0 % were non-luminal (29 % HER2-enriched and 17 % basal-like) [73]. Moreover, HER2-enriched and basal-like tumors can still be identified among tumors that have very high expression of ER, as exemplified by the 6 non-luminal tumors (representing 2.9 % of the entire cohort) identified in the Z1031 trial, where all patients’ tumors had an Allred ER score of 6–8.

In terms of survival outcome, we evaluated the prognostic value of the intrinsic subtypes in a cohort of 1380 patients with ER+/HER2-unknown early breast cancer treated with 5 years of adjuvant tamoxifen-only across several retrospective studies [74]. Non-luminal subtypes represented 9 % (7 % HER2-enriched and 2 % basal-like) of the samples, and each non-luminal subtype showed a significantly worse outcome compared to the Luminal A subtype in both node-negative and node-positive disease.

In the past, we have used the terms TN and basal-like interchangeably. However, within TN disease, all the intrinsic molecular subtypes can be identified, although the vast majority fall into the basal-like subtype (86 %; range 56–95 %, depending on the study). In our combined analysis of 868 TN tumors, 86.1, 9.1, 3.2, and 1.6 % were identified as basal-like, HER2-enriched, Luminal B, and Luminal A, respectively. Although the correlation between pathological and gene expression profiling is moderate, this pathology-based subset is the one with the greatest consistency between both classifications. Of note, we did not evaluate the presence of the claudin-low subtype [5].

At the same time, other gene expression-based classifications of TN disease have emerged over the years. For example, Lehmann and colleagues described 6 molecular subtypes of TN breast cancer: two basal-like (BL1 and BL2), an immunomodulatory (IM), a mesenchymal (M), a mesenchymal stem-like (MSL), and a luminal androgen receptor subtype (LAR) [75, 76]. As expected, Lehmann’s classification identified most TN tumors as basal-like (80.6 %) [76] and, with the exception of LAR group, all other subtypes were mostly identified as basal-like by PAM50 (BL1 99 %, BL2 95 %, IM 84 %, M 97 %, MSL 50 %). Interestingly, the LAR subtype was predominantly identified as either HER2-enriched (74 %) or Luminal B (14 %). In another recent study, Burstein et al. [77] classified TN disease into 4 main groups: LAR, mesenchymal (MES), basal-like immune-suppressed (BLIS), and basal-like immune-activated (BLIA). Again, most PAM50 non-basal-like tumors were identified as LAR by this classification, and most PAM50 basal-like were BLIS and BLIA. Thus, we can conclude that TN disease is biologically heterogeneous and that although basal-like disease predominates (+/− immune activation and/or infiltration), there is a small group of non-basal-like tumors (mostly LARs, or HER2-enriched) [23, 78]. These TN tumors with a non-basal-like or LAR profile might benefit from androgen receptor inhibition.

No data are available regarding the prognostic impact of the intrinsic molecular subtypes defined by PAM50 within TN disease. Regarding Lehmann’s classification, the 6 subtypes have been evaluated retrospectively in several publicly available cohorts of TN disease treated with different adjuvant therapies [75, 76, 78]. Although no clear results were obtained, several tendencies were observed. For example, the M group showed the worst outcome and the IM group showed a relatively better outcome. Regarding the LAR group, one study showed a worse outcome and another a tendency toward the best outcome. In Burstein et al. [77], the only group that showed a different outcome from the rest was the BLIA, which is consistent with the known prognostic impact of immune infiltration in TN disease [79–81]. However, the BLIA group, or the basal-like group with immune infiltration, still has a high risk of relapsing (~20 %). Thus, these data suggest that subtyping within TN disease will not have a clinical impact based on prognosis alone, since no group has a sufficiently outstanding outcome.

12.3.4 Claudin-Low

In 2007, Herschkowitz et al. [82] analyzed 232 human breast samples by semi-unsupervised hierarchical clustering and compared their gene expression profiles versus 108 mammary tumors from multiple genetically engineered mouse models. In this report, a potential new intrinsic subtype, apparent in both mouse and human datasets, was identified; this ‘claudin-low’ subtype was characterized by the low expression of genes involved in tight junctions and cell–cell adhesion. Interestingly, most of the defining characteristics of the claudin-low human tumors were conserved in several mouse models including 3 models with engineered BRCA1 and/or p53 deficiencies.

Subsequently, we reported a more comprehensive characterization of this rare intrinsic subtype [5]. Hierarchical clustering analysis of 320 human breast tumors and 17 normal breast samples using a 1900-gene intrinsic list [8] places the claudin-low group next to the basal-like subtype, indicating that both tumor types share some gene expression features. These shared features include low expression of the HER2 and the luminal gene clusters, as well as of the genes HER2, ESR1, GATA3, and the luminal keratins 8 and 18. However, two intrinsic gene clusters are uniquely expressed (or not expressed) in the claudin-low subtype. One of these clusters is enriched with cell–cell adhesion proteins and shows low expression within claudin-low tumors. Among the 20 genes that compose this cluster are claudin 3, 4, and 7, cingulin, and occludin, which are involved in tight junctions, and E-cadherin, which is a calcium-dependent cell adhesion protein. Conversely, the other cluster, which is composed of 40 genes, is highly enriched with immune system response genes and is highly expressed in claudin-low samples. Many of these genes are known to be expressed by T- and B-lymphoid cells (e.g., CD4 and CD79a), indicating high immune cell infiltration in this tumor subtype. However, other immune-related genes highly expressed in claudin-low tumors, such as interleukin 6 or CXCL2, might be produced by the actual tumor cells, by immune cells, or by both.

Clinically, the majority of claudin-low tumors are poor-prognosis ER-negative (ER−), PR-negative (PR−), and HER2-negative (HER2−) (i.e., triple-negative) invasive ductal carcinomas with a high frequency of metaplastic and medullary differentiation. Preliminary data show that they have a response rate to standard neoadjuvant chemotherapy that is intermediate between basal-like and luminal tumors [5]. Furthermore, claudin-low tumors are enriched with unique biologic properties linked to mammary stem cells (MaSCs) [83] and a Core EMT signature [84], and show features of tumor-initiating cells (TICs, also known as cancer stem cells [CSCs]) [85, 86], the study of which is leading to the formulation of new hypotheses regarding the “cell of origin” of the different subtypes of breast cancer.

No differences in survival were observed between claudin-low tumors and other poor prognosis subtypes (Luminal B, HER2-enriched, and basal-like), or even between claudin-low tumors versus all other tumors combined.

Metaplastic and medullary carcinomas have also been linked with the claudin-low profile [3, 86]. These two special histological types represent less than 5–7 % of all breast cancer diagnoses and generally are poorly differentiated triple-negative tumors. However, while metaplastic carcinomas are associated with poor prognosis and treatment resistance [87], medullary carcinomas tend to show good outcomes despite their aggressive pathological features [88].

In a combined dataset of 400 tumors/patients (UNC337 [5] and MDACC133 [89]), 49 % of triple-negative (TN) tumors were basal-like, 30 % claudin-low, 9 % HER2-enriched, 6 % Luminal B, 5 % Luminal A, and 1 % normal breast-like; if the claudin-low classification is ignored, then 72 % of triple-negative tumors are basal-like. Conversely, 6–29 % [7, 90] and 9–13 % of basal-like tumors are ER+ or HER2+, respectively. Thus, the triple-negative surrogate for basal-like makes both kinds of mistakes: it includes samples that are not basal-like, and it fails to identify a significant number of basal-like tumors.
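The two kinds of misclassification can be made concrete with a little arithmetic. The following is only a minimal sketch that restates the percentages quoted above; the variable names are illustrative and are not part of any published analysis.

```python
# Subtype distribution among triple-negative (TN) tumors, in percent,
# as quoted in the text for the combined UNC337/MDACC133 dataset.
tn_subtype_pct = {
    "basal-like": 49,
    "claudin-low": 30,
    "HER2-enriched": 9,
    "Luminal B": 6,
    "Luminal A": 5,
    "normal breast-like": 1,
}

# First kind of error: TN tumors that are not basal-like.
total_pct = sum(tn_subtype_pct.values())
non_basal_pct = total_pct - tn_subtype_pct["basal-like"]

# Second kind of error: basal-like tumors that a TN definition misses.
# The text reports these only as ranges, so they are kept as ranges here.
basal_er_pos_range = (6, 29)    # % of basal-like tumors that are ER+
basal_her2_pos_range = (9, 13)  # % of basal-like tumors that are HER2+

print(f"TN tumors accounted for: {total_pct} %")
print(f"TN tumors that are not basal-like: {non_basal_pct} %")
```

Read this way, roughly half of the tumors selected by a triple-negative definition fall outside the basal-like subtype when claudin-low is treated as its own group.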

Overall, claudin-low tumors are the least frequent subtype (prevalence 12–14 %) and are mostly high-grade and ER−/PR−/HER2− (i.e., triple-negative) tumors similar to the basal-like subtype, which is concordant with the low expression of the luminal and HER2 intrinsic gene clusters observed in both tumor types. However, it is important to note that 15–25 % of claudin-low tumors are hormone receptor-positive (HR+) and 10 % of basal-like tumors are also HR+.

12.4 Novel Subgroups of Breast Cancer

In 2012, Curtis et al. [91] proposed a new molecular classification of breast cancer based on the combination of two different genomic views derived from primary fresh-frozen tissue from 2000 women with breast cancer from the METABRIC cohort. The authors presented an integrated analysis of copy number changes and gene expression in a discovery and validation set of 997 and 995 primary breast tumors, respectively, with long-term clinical follow-up. The results revealed a total of 10 different subtypes [92]:

Integrative cluster (IntClust) 1 is constituted by ER‐positive tumors, predominantly classified into the Luminal B intrinsic subtype. The subgroup typically has an intermediate prognosis, similar to that of IntClust 6 and 9. All encompass a high proportion of higher proliferation ER+/Luminal B tumors and are characterized by relatively high levels of genomic instability. The defining molecular feature of IntClust 1 is amplification of the 17q23 locus. IntClust 1 also has the highest prevalence of GATA3 mutations across all of the 10 clusters.

Integrative cluster 2 is comprised of ER‐positive tumors and includes both Luminal A and Luminal B tumors. Remarkably, this subgroup is associated with the worst prognosis of all ER‐positive tumors with a 10‐year disease‐specific survival rate of only around 50 %. The defining molecular feature of this subtype is amplification of 11q13/14.

Integrative cluster 3 is composed primarily of Luminal A cases and is enriched for histopathological subtypes that have a good prognosis such as invasive lobular and tubular carcinomas. At the molecular level, the subtype is characterized by low genomic instability, a very low prevalence of TP53 mutations, and a paucity of copy number and cis‐acting alterations. However, of note, tumors within this subtype have the highest frequency of PIK3CA, CDH1, and RUNX1 mutations. Importantly, the subgroup is associated with the best prognosis of all the 10 integrative clusters with a 10‐year disease‐specific survival of around 90 %.

Integrative cluster 4 is a unique cluster incorporating both ER‐positive (n = 238/343) and ER‐negative (n = 105/343) cases, including 26 % of all triple-negative tumors, and a mixture of intrinsic subtypes including basal‐like cases. Importantly, the subtype is associated with favorable outcome and a 10‐year disease‐specific survival of around 80 %. Like IntClust 3, IntClust 4, the largest subtype of breast cancer (up to 17 % of cases), is characterized molecularly by low levels of genomic instability and a “CNA‐devoid” flat copy number landscape. Many of the tumors within this subgroup show evidence of extensive lymphocytic infiltration, and the observed deletions are the consequence of the somatic TCR rearrangement present in the infiltrating T cells.

Integrative cluster 5 encompasses the ERBB2-amplified cancers composed of both HER2‐enriched ER‐negative (58 %) and luminal ER‐positive cases (42 %). Women in the METABRIC study were enrolled before the general availability of trastuzumab, and as expected, this group demonstrated the worst disease‐specific survival at 10 years of around 45 %. In addition to specific ERBB2 amplification at 17q12, these tumors demonstrate intermediate levels of genomic instability and a high proportion of TP53 mutations (in >60 % cases).

Integrative cluster 6 represents a distinct subgroup of ER‐positive tumors, comprising both Luminal A and Luminal B cases. Clinically, this cluster shows an intermediate prognosis and a 10‐year disease‐specific survival of around 60 %. Molecularly, this subtype is characterized by specific amplification of the 8p12 locus and high levels of genomic instability. Notably, tumors within this cluster demonstrate the lowest levels of PIK3CA mutations across all of the ER‐positive cancers.

Integrative cluster 7 is comprised predominantly of ER‐positive Luminal A tumors and identifies a good prognostic subgroup with 10‐year disease‐specific survival rates of around 80 %. It is characterized by intermediate levels of genomic instability, specific 16p gain and 16q loss, as well as a higher frequency of 8q amplification.

Integrative cluster 8 shares similarities with IntClust 7 and encompasses ER‐positive tumors predominantly of the Luminal A intrinsic subtype with a good prognosis. This subgroup, however, is characterized molecularly by the classical 1q gain/16q loss event. Furthermore, tumors within IntClust 8 demonstrate high levels of PIK3CA, GATA3, and MAP2K4 mutations.

Integrative cluster 9 is comprised of a mixture of intrinsic subtypes but includes a large number of ER‐positive cases of the Luminal B subgroup. IntClust 9 shows an intermediate prognosis with a 10‐year disease‐specific survival of around 60 %. This cluster is characterized by high levels of genomic instability and the highest level of TP53 mutations among the ER‐positive subtypes.

Integrative cluster 10 incorporates mostly triple-negative tumors (n = 190/320 classify into this cluster) from the core basal‐like intrinsic subtype. Although the subtype represents a high‐risk group in the first 5 years after diagnosis, beyond 5 years the prognosis for this subgroup is relatively good. These breast cancers have the highest rates of TP53 mutations despite displaying only intermediate levels of genomic instability.
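As a reading aid, the approximate survival figures scattered through the ten cluster descriptions above can be collected into a small lookup table and ranked. This is a minimal sketch that only restates numbers quoted in the text; the variable names are illustrative, and clusters without a quoted figure are left out rather than estimated.

```python
# Approximate 10-year disease-specific survival (%) per integrative cluster,
# copied from the descriptions above. IntClust 1, 8 and 10 are described only
# qualitatively in the text, so they are deliberately omitted.
ten_year_dss = {2: 50, 3: 90, 4: 80, 5: 45, 6: 60, 7: 80, 9: 60}

# Rank clusters from best to worst quoted survival.
ranked = sorted(ten_year_dss.items(), key=lambda kv: kv[1], reverse=True)
best_cluster, best_dss = ranked[0]
worst_cluster, worst_dss = ranked[-1]

# ER composition of IntClust 4 from the counts quoted in the text.
er_pos, er_neg = 238, 105
er_pos_share = round(100 * er_pos / (er_pos + er_neg), 1)

for cluster, dss in ranked:
    print(f"IntClust {cluster}: ~{dss} % 10-year disease-specific survival")
print(f"IntClust 4 ER-positive share: {er_pos_share} %")
```

The ranking agrees with the text: IntClust 3 has the best quoted outcome and IntClust 5 the worst, while IntClust 4 is about two-thirds ER-positive despite containing a substantial triple-negative fraction.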

12.5 Intrinsic Subtypes in the Metastatic Setting

A better understanding of the biological changes occurring during metastatic progression of breast cancer is needed to identify new biomarkers, targets, and novel treatment strategies. Although the TCGA results provide a valuable landmark of genomic/genetic information, a critical point is that the TCGA analyses were performed in non-treated primary breast tumors and not in post-treated, resistant, or metastatic tumors. This is important as recent studies that are starting to characterize resistant or metastatic tumors are identifying frequent genomic alterations that were found to be rare in the TCGA dataset [93].

One example is the set of molecular alterations in the ER gene [94] (i.e., somatic mutations, gene amplifications, or gene fusions), which are found in ~20 % of metastatic luminal tumors. We (in collaboration with Washington Univ. St. Louis, USA) and others have shown that these alterations might play an important role in the development of endocrine resistance [95, 96]. Recent studies have identified mutations in ESR1 affecting the ligand-binding domain (LBD) of the ER-α protein [97]. In preclinical models, mutant receptors drive ER-dependent transcription and proliferation in the absence of estrogen and reduce the efficacy of ER antagonists, suggesting that LBD-mutant forms of the ER are involved in mediating clinical resistance to endocrine therapy and that more potent ER antagonists may be of substantial therapeutic benefit.

Regarding the intrinsic changes from primary to metastatic tumors, we compared expression changes in a set of 105 genes between 30 paired luminal primary and metastatic tumors in the CONVERTHER trial [98]. Our data suggest that a potential driver of treatment resistance and aggressiveness in luminal disease (i.e., high proliferation) is the fibroblast growth factor receptor 4 (FGFR4), a tyrosine kinase cell surface receptor, which we have found to be highly upregulated in metastatic tumor samples. Interestingly, upregulation of this gene is a main feature of the HER2-enriched subtype [99], a subtype known to have high RAS-/MAPK-pathway signaling and to be endocrine-resistant [26]. Notably, many Luminal A and B metastatic samples have an FGFR4 expression above the mean expression of this gene in primary HER2-enriched tumors. In contrast, ERBB2 expression was not found to be upregulated in metastatic luminal disease. Our results showed that intrinsic subtype is mostly maintained during metastatic progression, except for primary Luminal A disease, which becomes non-Luminal A in the majority of cases.

Recently, we published [100] an unplanned retrospective analysis of 821 tumor samples (85.7 % primary and 14.3 % metastatic) from the EFG30008 phase III trial [101], in which postmenopausal women with HR-positive invasive breast cancer and no prior therapy for advanced or metastatic disease were randomized to letrozole with or without lapatinib. In this retrospective study, we showed that intrinsic subtype is the strongest prognostic factor independently associated with progression-free survival and overall survival in all patients. This was the first study to reveal an association between intrinsic subtype and outcome in first-line HR-positive metastatic breast cancer. The clinical value of intrinsic subtyping in HR-positive metastatic breast cancer warrants further investigation.

12.6 Frequently Mutated Genes in Breast Cancer

In estrogen receptor-positive (ER+) breast cancer, mutations in PIK3CA represent the most common genetic events, occurring at a frequency of 30–50 %. Other frequently mutated genes in breast cancer are shown in Table 12.1. Less commonly observed are mutations in PTEN (2–4 %), AKT1 (2–3 %), and phosphatidylinositol-3-kinase regulatory subunit alpha (PIK3R1: 1–2 %). Similar findings were observed in HER2-positive breast cancer. In contrast, triple-negative breast cancer (TNBC) is associated with a lower incidence of PIK3CA mutations (<10 %).

The frequent occurrence of PI3K pathway activation makes it an attractive therapeutic target in breast cancer (Table 12.1). The recognition of its importance in tumorigenesis and cancer progression has led to the development of a number of agents that target various components of this pathway as cancer therapeutics. Promising results with these agents have been observed in the treatment of advanced ER+ breast cancer. However, the therapeutic efficacy of single-agent PI3K pathway inhibitors is likely limited by feedback regulation among the pathway components and cross talk with other signaling pathways. Strategies that combine PI3K pathway inhibitors with inhibitors of RTKs, with inhibitors of the MEK, MYC, PARP, or STAT3 pathways, or with agents that activate the autophagy and apoptosis machineries are being explored. In addition, there is continued effort to identify resistance mechanisms and predictors of therapeutic response.

Germ line mutations in p53 occur in a high proportion of individuals with the Li-Fraumeni cancer susceptibility syndrome, which confers an increased risk of breast cancer [102]. This implies an important role for p53 inactivation in mammary carcinogenesis, and the structure and expression of p53 have been widely studied in breast cancer. Loss of heterozygosity (LOH) in the p53 gene was shown to be a common event in primary breast carcinomas [103], and this is accompanied by mutation of the residual allele in some cases. Although the overall frequency of p53 mutation in breast cancer is approximately 20 %, certain types of the disease are associated with higher frequencies. For example, a number of studies have identified an increased rate of p53 mutations in cancers arising in carriers of germ line BRCA1 and BRCA2 mutations. Moreover, a distinct spectrum of p53 mutations occurs in such carcinomas. Strikingly, in typical medullary breast carcinomas, p53 mutation occurs in 100 % of cases. This is of particular interest, since it is now well recognized that medullary breast cancers share clinicopathological similarities with BRCA1-associated cases. Indeed, methylation-dependent silencing of BRCA1 expression occurs commonly in medullary breast cancers. Molecular pathological analysis of specific components of the p53 pathway is likely to have diagnostic and prognostic utility in breast cancer. Moreover, a number of innovative strategies have been proposed to restore p53 function to tumors. It will be of great interest to observe how these and other novel therapeutic approaches targeted to the p53 pathway impact on clinical outcome in breast cancer [104].

HER2 somatic mutations have been described in recent years, with an overall HER2 mutation rate of approximately 1.6 % of breast cancers. Some of them are activating mutations, including G309A, D769H, D769Y, V777L, P780ins, V842I, and R896C, that are likely driver events in their cancer [105]. It is important to note that recurrence did not predict the phenotype of the mutation (activating, drug resistant, or neomorphic). Several HER2-targeted drugs were tested on these mutations, and neratinib was found to be a very potent inhibitor for all of the HER2 mutations. Lobular breast cancer may have an increased frequency of HER2 somatic mutations, but the number of cases sequenced to date is small (3 patients with HER2 somatic mutations among 39 lobular breast cancer cases in the TCGA study and 3 patients with HER2 mutations among 113 lobular cases in Shah et al. [106]). The HER2 mutation frequency in relapsed or metastatic breast cancer patients is currently unknown and potentially could be higher than 1.6 %. Because of the low mutation rate, prospective clinical trials using HER2 gene-sequencing results will need to screen a large number of patients, and the cooperation of many academic institutions and treatment centers is essential.
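The scale of screening implied by a 1.6 % mutation rate can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the enrollment target of 50 mutation-positive patients is purely hypothetical and not taken from any trial, and the helper `patients_to_screen` is an illustrative name that returns only the expected number to screen, ignoring sequencing failures and eligibility losses.

```python
from fractions import Fraction
from math import ceil

# Overall HER2 somatic mutation rate quoted in the text (1.6 %).
HER2_MUTATION_RATE = Fraction(16, 1000)


def patients_to_screen(target_carriers: int, rate: Fraction) -> int:
    """Expected number of patients to sequence to find `target_carriers`."""
    return ceil(target_carriers / rate)


# Hypothetical enrollment target, chosen only for illustration.
target = 50
needed = patients_to_screen(target, HER2_MUTATION_RATE)

# Lobular cohorts quoted in the text: 3/39 (TCGA) and 3/113 (Shah et al.).
lobular_tcga_pct = round(float(Fraction(3, 39)) * 100, 1)
lobular_shah_pct = round(float(Fraction(3, 113)) * 100, 1)

print(f"Patients to screen for {target} carriers: {needed}")
print(f"Lobular HER2 mutation frequency: {lobular_tcga_pct} % and {lobular_shah_pct} %")
```

Under these assumptions, more than three thousand patients would have to be sequenced to identify 50 carriers. The two lobular estimates (about 7.7 % and 2.7 %) both exceed 1.6 % but rest on only three mutated cases each, which is why the text treats the apparent enrichment with caution.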

12.6.1 Lobular Breast Cancer

Invasive lobular carcinoma (ILC) is the second most prevalent histologic subtype of invasive breast cancer, constituting ∼10–15 % of all cases. The classical form [107] is characterized by small discohesive neoplastic cells invading the stroma in a single-file pattern. The discohesive phenotype is due to dysregulation of cell–cell adhesion, primarily driven by lack of E-cadherin (CDH1) protein expression, which is observed in ∼90 % of ILCs. This feature is the ILC hallmark, and immunohistochemistry (IHC) scoring for CDH1 expression is often used to discriminate between lesions with borderline ductal versus lobular histological features. ILC variants have also been described, yet all display loss of E-cadherin expression [108].

The first TCGA breast cancer study reported on 466 breast tumors assayed on six different technology platforms. ILC was represented by only 36 samples, and no lobular-specific features were noted besides mutations and decreased mRNA and protein expression of CDH1. In 2015, Ciriello et al. [109] profiled 817 breast tumors, including 127 ILC, 490 ductal (IDC), and 88 mixed IDC/ILC. As expected, they identified CDH1 loss at the DNA, mRNA, and protein level in almost all ILC cases. Moreover, 12/27 CDH1 mutations in non-ILC cases occurred in mixed tumors strongly resembling ILC at the molecular level. Surprisingly, they did not identify DNA hypermethylation of the CDH1 promoter in any breast tumor, suggesting that E-cadherin loss is not epigenetically driven. Besides E-cadherin loss, they identified mutations targeting PTEN, TBX3, and FOXA1 as ILC-enriched features. PTEN loss associated with increased AKT phosphorylation was highest in ILC among all breast cancer subtypes. Spatially clustered FOXA1 mutations correlated with increased FOXA1 expression and activity. Conversely, GATA3 mutations and high expression characterized Luminal A IDC, suggesting differential modulation of ER activity in ILC and IDC. Proliferation and immune-related signatures determined three ILC transcriptional subtypes associated with survival differences. Mixed IDC/ILC cases were molecularly classified as ILC-like and IDC-like, revealing no true hybrid features. This multidimensional molecular atlas sheds new light on the genetic bases of ILC and provides potential clinical options.

12.7 Conclusions

Breast cancer is a clinically and biologically heterogeneous disease. However, the vast majority of the biological diversity coming from the DNA, mRNA, miRNA, and protein is captured by the 4 main intrinsic subtypes defined by gene expression only. At the same time, and contrary to popular belief, intrinsic biology is not sufficiently captured by standard clinical–pathological variables. In this chapter, we have argued that intrinsic biology identified by gene expression analyses provides clinically relevant information beyond the current pathology-based classification, both today and especially in the future. In the upcoming years, we should expect a wealth of new data regarding the clinical utility of intrinsic subtyping in a variety of clinical scenarios. In combination with other biomarkers such as somatic mutations, intrinsic subtyping should support the development of new targeted therapeutics, some of which are now being tested in ongoing clinical trials.

These findings have led us to understand that breast cancer is not just one disease, but many, and that each patient represents a particular case in which personalized medicine could play a crucial role. The last decade has changed the way researchers understand, classify, and study breast cancer, and it has reshaped the way doctors diagnose and treat this disease. In addition, it has undoubtedly changed the search for alternative therapies by integrating molecular studies, and the selection of study populations based on their molecular markers, into clinical trials. The therapeutic advances made to date have been achieved by performing large randomized clinical trials. The problem is that these trials were designed to determine the best therapeutic approach for the average patient population, not for a specific individual. Furthermore, we have learned through trial and error that new targeted therapies have to be developed in targeted populations, selected on the basis of a given biomarker. The good news is that the molecular studies developed over the past decade have opened a broad field in cancer research that allows basic and translational researchers to look for new potential therapeutic targets and to test them in the clinic.