Introduction

Advances in molecular technologies have led to a paradigm shift in the way we define breast cancer, resulting in the transition from purely morphological classification systems to combined histologic and molecular taxonomies. The advent of massively parallel sequencing has solidified the notion that breast cancer comprises multiple diseases with different biology, clinico-pathological features, natural history, and response to therapy [1,2,3,4,5]. Moreover, microarray-based gene expression profiling has led to the implementation of a molecular classification of breast cancer [1] and to the development of prognostic gene signatures, some of which have now been incorporated into clinical practice [6].

Molecular studies have shed light on the vast tumor heterogeneity of breast cancer, illustrated by the dissimilar genetic makeup of primary tumors and metastatic foci [7]. While the complexity and heterogeneity of breast cancer pose significant diagnostic and therapeutic challenges, they also provide opportunities to realize the potential of precision medicine [8]. Novel strategies, such as the implementation of liquid biopsies, are being developed to overcome these diagnostic and therapeutic hurdles.

In this chapter, we will discuss the key contributions of molecular pathology in the dissection of the biology of breast cancer, focusing on the role of gene expression profiling and massively parallel sequencing in the classification and prognostication of the disease. We will contextualize the diagnostic and therapeutic challenges posed by breast cancer heterogeneity, as well as strategies to overcome them.

Molecular Classification of Breast Cancer

Gene expression studies have solidified the notion that breast cancer should not be regarded as a single disease but rather as a group of entities with different molecular landscapes and clinical outcomes. Pioneering microarray-based gene expression profiling has led to the development of a breast cancer classification comprising five “intrinsic” molecular subtypes, namely, luminal A, luminal B, HER2 (also known as HER2-enriched), basal-like, and normal-like [1]. The “intrinsic” molecular classification has made it evident that ER-positive and ER-negative breast cancer are essentially different diseases at the transcriptomic level [1,2,3, 5, 9]. Furthermore, studies investigating the clinico-pathological features of these cancers revealed that luminal A and basal-like breast cancers differ in terms of risk factors, clinico-pathological presentation, histopathological features, response to therapy, and outcomes [5].

In-depth analyses of the transcriptomic profiles of luminal A, luminal B, HER2-enriched, basal-like, and normal-like breast cancers revealed important characteristics of these molecular subtypes. Luminal tumors are characterized by the expression of the ER gene (ESR1) and ER-related genes. There is marked intrinsic heterogeneity within the luminal subgroup. Luminal tumors are subclassified into luminal A and luminal B subtypes based on the level of expression of proliferation-related genes, whereby luminal A tumors display low levels of expression of proliferation-related genes, whereas luminal B cancers display higher levels [10,11,12]. Luminal A tumors may be further subclassified into four groups, which differ in terms of their somatic mutation profiles, copy number alterations, and clinical behavior [13]. Among these subgroups, a copy number high (CNH) luminal subgroup was recognized, characterized by high genomic instability, TP53 mutations, increased Aurora kinase signaling, and poor clinical outcome [13]. HER2-enriched cancers are characterized by expression of the HER2 gene (ERBB2) and of genes found in the HER2 amplicon. It should be noted, however, that not all HER2-enriched breast cancers display HER2 gene amplification and not all cases diagnosed as HER2-positive according to the ASCO/CAP guidelines are classified as HER2-enriched by microarray analysis [14]. In fact, all intrinsic breast cancer subtypes may be recognized among clinically defined HER2-positive breast cancers [15]. Given that primary and secondary resistance to HER2 blockade is not uncommon, the identification of biomarkers predictive of response is of paramount importance. Along these lines, the determination of the molecular intrinsic subtype in HER2-positive disease is paving the way for the development of a therapeutically sound molecular stratification of HER2-positive breast cancer. For instance, the analysis of HER2-positive breast cancers from the NCCTG (Alliance) N9831 trial, using the Prosigna algorithm, showed that HER2-enriched and luminal tumors benefited the most from the addition of trastuzumab to chemotherapy, whilst basal-like tumors did not show a significant benefit [16].

Similarly, it was recently shown that in patients with clinically defined HER2-positive breast cancer from the PAMELA trial who were managed with dual HER2 blockade with trastuzumab and lapatinib, the pathologic complete response rate varied according to the intrinsic molecular subtype [17]. Indeed, HER2-positive breast tumors of the HER2-enriched subtype showed a significantly higher rate of pathologic response than tumors of non-HER2-enriched subtypes, further suggesting that the intrinsic subtype might greatly aid in the identification of patients who will benefit from HER2 blockade, in whom chemotherapy might potentially be spared [17].

The basal-like subtype was so named because the transcriptomic profiles of these cancers comprise genes that are usually expressed by normal breast epithelial/basal cells. Normal-like breast cancers, on the other hand, have proven to be more controversial. There are several lines of evidence to suggest that this subtype is a mere artifact of gene expression profiling, being the result of “intrinsic” subtyping of samples with a disproportionately high content of normal breast epithelial cells and/or stromal cells [5, 10, 18, 19].

Due to limitations of hierarchical clustering analysis for the classification of single breast cancer samples in a prospective manner [20], single sample predictors have been developed [3]. They allow for gene expression-based subtyping of individual tumors based on microarray gene expression profiling. Microarray-based single sample predictors, however, seem to have limited reproducibility and to require extensive and rather complex processing of the microarray data to be applied for the classification of individual samples [11, 21]. To overcome these limitations and to allow for the use of archival material, the PAM50 assay has been developed. This is an nCounter-based assay based on the expression of 50 genes and classifies breast cancers into the four major intrinsic subtypes (i.e., luminal A, luminal B, HER2-enriched, and basal-like; the normal-like subtype was removed as it is currently perceived as a likely artifact of having a high percentage of normal cell contamination) [18]. Importantly, immunohistochemical surrogate definitions have gained widespread use in the last few years due to their similarities with breast cancer molecular subtypes as defined by gene expression profiling. Indeed, based on the recognition of “intrinsic” breast cancer subtypes, this immunohistochemical surrogate classification was accepted by the 12th St. Gallen International Breast Cancer Conference Expert Panel as a new approach for therapeutic purposes [22]. Nevertheless, it has been recognized that disagreement between the PAM50 assay and immunohistochemistry may lead to different treatment decisions [23].
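As an illustration of how a single sample predictor can work, the sketch below assigns one tumor to the subtype whose centroid its expression profile correlates with best (a nearest-centroid rule using Spearman rank correlation). The four-gene panel, the centroid values, and the function names are hypothetical and were chosen only for readability; they are not the PAM50 gene list or any published centroids.

```python
from statistics import mean

# Hypothetical gene order for every vector below: ESR1, ERBB2, KRT5, MKI67.
# Centroid values are invented for illustration only.
TOY_CENTROIDS = {
    "luminal A":     [2.0, 0.0, -1.0, -1.5],
    "luminal B":     [1.5, 0.2, -1.2, 1.0],
    "HER2-enriched": [-0.5, 2.5, -0.5, 1.0],
    "basal-like":    [-2.0, -0.5, 2.0, 1.5],
}

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def assign_subtype(sample, centroids=TOY_CENTROIDS):
    """Return (best-matching subtype, all correlation scores) for one sample."""
    scores = {name: spearman(sample, c) for name, c in centroids.items()}
    return max(scores, key=scores.get), scores

subtype, scores = assign_subtype([1.8, 0.1, -0.9, -1.2])
print(subtype, scores)
```

Because each tumor is compared with fixed centroids rather than clustered together with a cohort, the call for one sample does not depend on which other samples happen to be analyzed, which is the property that prospective classification requires.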

In addition to the “intrinsic” subtypes, microarray-based class discovery studies have resulted in the identification of additional molecular subtypes, which are predominantly of ER-negative phenotype. The molecular apocrine subtype of breast cancer has been identified by independent investigators [24,25,26] and is characterized by low or no expression of ER and expression of androgen receptor (AR) and AR-related genes [24,25,26]. These tumors have been shown to have an aggressive clinical outcome [26] and to display some molecular and histopathological features consistent with apocrine differentiation. Through an analysis of conditional mouse models, breast cancer cell lines, and primary breast cancers, the claudin-low subtype has been identified [19, 27]. These tumors are characterized by low levels of expression of the tight junction proteins claudins 3, 4, and 7 and other adhesion molecules, including E-cadherin, and display transcriptomic features similar to those of breast cancer-initiating cells and epithelial-to-mesenchymal transition. In comparison with other intrinsic subtypes, claudin-low tumors display low levels of expression of ER and ER-related genes and intermediate levels of expression of proliferation-related genes. Although initially perceived as a variant of triple-negative breast cancers (TNBCs), up to 33% and 22% of claudin-low cancers may be ER and HER2 positive by immunohistochemical analysis [19]. From an immunohistochemical standpoint, it should be emphasized that up to 41% and 55% of tumors classified as claudin-low by gene expression profiling express claudin 3 and E-cadherin, respectively [19].

TNBC, defined by the lack of expression of ER, progesterone receptor, and HER2, is vastly heterogeneous at the molecular level, and despite the large overlap between TNBC and the basal-like intrinsic subgroup of breast cancer, it is nowadays recognized that these definitions are not synonymous. Indeed, seminal studies by Lehmann et al. [28] revealed the existence of six molecular TNBC subtypes, namely, basal-like 1 (BL1), basal-like 2 (BL2), mesenchymal (M), mesenchymal stem-like (MSL), immunomodulatory (IM), and luminal androgen receptor (LAR). Underscoring the therapeutic relevance of this molecular TNBC taxonomy, murine xenografts of breast cancer cell lines representative of the different TNBC subtypes were found to display differential sensitivity to therapeutic agents [28]. While basal-like cell lines displayed sensitivity to cisplatin, mesenchymal stem-like and LAR cell lines were shown to be sensitive to a dual PI3K and mTOR inhibitor (BEZ235) and an antiandrogen (bicalutamide), respectively [28]. The clinical implications of this taxonomy were further supported by the different responsiveness of the various TNBC molecular subtypes to neoadjuvant chemotherapy [29] and by their different survival outcomes [28]. Follow-up studies conducted by the same group revealed, nonetheless, that the transcriptional profiles of IM and MSL tumors derive from tumor-infiltrating lymphocytes and stromal cells, respectively, rather than from tumor cells, and this classification was therefore refined to include only the four remaining molecular TNBC subgroups [30]. Subsequent independent studies by Burstein et al. [31] proposed the existence of four transcriptomic TNBC subgroups. The TNBC molecular subtypes proposed by Burstein et al. [31], i.e., luminal/androgen receptor, mesenchymal, basal-like/immune-suppressed, and basal-like/immune-activated, also differed in terms of their clinical outcomes and were analogous to the ones put forward by Lehmann et al. [30], indicating that the most parsimonious number of molecular TNBC subtypes is likely four.

The Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) implemented an alternative molecular breast cancer taxonomy, based on the integration of genome-wide copy number alterations and transcriptomic profiles [32]. In their pioneering study, Curtis et al. [32] analyzed approximately 2,000 breast cancers and, using this integrative approach, classified them into 10 integrative clusters (IntClust 1–10). The molecular subtypes identified by this strategy had a limited correlation with the “intrinsic” subtypes and have been shown to be associated with different outcomes [32]. These investigators later devised a simplified gene expression-based methodology to subtype breast cancer into the ten IntClusts [33], which could facilitate the application of this taxonomy. A validation study in 7,544 breast cancers using this classifier confirmed the reproducibility of the IntClust molecular classification, as well as its association with survival outcome and response to neoadjuvant chemotherapy [33].

Gene Expression Prognostic Signatures

Gene expression studies have solidified the notion that breast cancer is markedly heterogeneous and that ER-positive and ER-negative breast cancer are different diseases. The identification of breast cancer patients who may benefit from adjuvant chemotherapy, and of those in whom chemotherapy could be spared, remains challenging. Nonetheless, it is nowadays recognized that the assessment of panels of genes, namely, “first-generation” signatures, could aid in the prognostication of breast cancer. It should be noted however that “first-generation” signatures, which identify the patient population having poor prognosis [34, 35], have been shown to be useful only for ER-positive breast cancer patients, as they have negligible discriminatory power in ER-negative disease, because the levels of expression of proliferation-related genes are uniformly high in these tumors (Fig. 26.1). In fact, several meta-analyses [10, 34, 36] have demonstrated that “first generation” signatures identify as poor prognosis those patients whose tumors have high levels of expression of proliferation-related genes, which have been shown to constitute one of the strongest prognostic factors in ER-positive disease [36, 37]. Microarray-based technologies allowed the initial implementation of various multigene assays, which were further developed and are nowadays commercially available [38]. Several multigene assays have been implemented in clinical practice in the context of ER-positive disease, including MammaPrint®, Breast Cancer Index, Oncotype DX®, Prosigna, and EndoPredict (Table 26.1) [5, 6, 39,40,41,42,43].

Fig. 26.1

Schematic representation of gene expression signatures and their prognostic and predictive value for estrogen receptor (ER)-positive and ER-negative breast cancer. First-generation prognostic gene expression signatures are clinically useful for ER-positive disease and classify patients into good or poor prognosis. Second-generation signatures, which are underpinned by the prognostic value conferred by the expression of immune response-related genes, may play a role in the prognostication of patients with ER-negative breast cancer. The stromal gene signatures and endocrine predictive signatures (such as the SET index) also have the potential to help personalize the therapy for patients with ER-positive disease. New genomic platforms for discovering and validating prognostic and predictive biomarkers (e.g., massively parallel sequencing) are expected to have a dramatic impact on systemic therapy decision-making for patients with breast cancer

Table 26.1 Main characteristics of commercially available gene expression signatures in breast cancer

Even though these assays provide similar information at the population level, the pairwise concordance between different assays for individual patients is only moderate [44]. Indeed, the comparison of the EndoPredict score and Oncotype DX® RS in the same cancer samples revealed major discrepancies in 18% of cases [44]. A third of cases classified by MammaPrint® as high risk were classified as low risk by Oncotype DX® [45]. The OPTIMA prelim study [46] compared risk stratification by different multigene assays and revealed that while they provided equivalent risk information at the population level in patients with ER-positive breast cancer, they assigned individual patients to different subtypes and risk strata [46]. Indeed, there was a disagreement in risk categorization in 61% of tumors [46]. These discrepancies might be, at least in part, due to the differences in the weight of proliferation-related genes and ER signaling-related genes in the different assays, which are more relevant in early and late recurrences, respectively. Although all of these multigene assays have the power to predict early recurrences (within 5 years of diagnosis), they differ in their ability to predict late recurrences (beyond 5 years of diagnosis). Indeed, Prosigna ROR, EPclin (EndoPredict), and BCI appear to have the best predictive power for late recurrences [47].
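The difference between agreement at the population level and agreement for individual patients can be made concrete with a small calculation. The sketch below uses invented risk calls for eight hypothetical patients (not data from any of the cited studies) and computes the raw percent agreement and Cohen's kappa, a chance-corrected agreement statistic that is one common choice for this kind of comparison; the cited studies may have used other measures.

```python
from collections import Counter

def percent_agreement(calls_a, calls_b):
    """Fraction of patients assigned the same category by both assays."""
    return sum(a == b for a, b in zip(calls_a, calls_b)) / len(calls_a)

def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(calls_a)
    observed = percent_agreement(calls_a, calls_b)
    count_a, count_b = Counter(calls_a), Counter(calls_b)
    categories = set(calls_a) | set(calls_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical risk calls from two assays on the same eight tumors.
assay_x = ["low", "low", "low", "high", "high", "high", "low", "high"]
assay_y = ["low", "low", "high", "high", "high", "low", "low", "high"]

print(Counter(assay_x)["high"] / len(assay_x))  # share called high risk by X
print(Counter(assay_y)["high"] / len(assay_y))  # share called high risk by Y
print(percent_agreement(assay_x, assay_y))
print(cohen_kappa(assay_x, assay_y))
```

In this toy example both assays label half of the cohort as high risk, so they look equivalent at the population level, yet they disagree on two of the eight individual patients.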

Despite these limitations, prognostic signatures are changing clinical practice and play an important role in the management of ER-positive disease. Indeed, their incorporation in the 8th edition of the American Joint Committee on Cancer (AJCC) staging system in the subset of ER-positive HER2-negative breast cancers has been recommended by a multidisciplinary team of breast cancer experts [48].

Owing to the fact that the prognostic power of these first-generation signatures largely stems from the information provided by proliferation-related genes, the classification of breast cancers according to these signatures correlates with response to conventional chemotherapy agents [49,50,51]. This is not surprising, given that chemotherapy preferentially targets cells that are cycling/proliferating. An important observation, however, is that most of the low-risk/good prognosis groups identified by first-generation prognostic signatures may potentially benefit from specific chemotherapy agents (e.g., taxanes) [52, 53].

MammaPrint®

The 70-gene assay (MammaPrint®, Agendia, Netherlands) is a widely used breast cancer multigene classifier assay and the first US Food and Drug Administration (FDA)-cleared breast cancer recurrence assay. MammaPrint® is a microarray-based gene expression profiling assay that predicts the risk of developing distant metastasis. The assay is intended for patients with ER-positive, node-negative, stage I-II invasive breast cancer. Although it originally required RNA extracted from fresh-frozen tumor specimens, technology improvements have eliminated the need for frozen tissue, and this assay is now available for formalin-fixed, paraffin-embedded (FFPE) tissue. Of note, the analysis of FFPE samples has been shown to be comparable to that of frozen material [54, 55].

This gene signature was originally developed by the supervised expression analysis of 25,000 genes from 78 patients with node-negative stage I-II breast cancer who did not receive adjuvant systemic therapy, which resulted in a list of 70 genes [56]. A prognostic score that categorizes patients into “good” (i.e., no distant metastasis within 5 years of follow-up) and “poor” (i.e., distant metastasis within 5 years of follow-up) outcome groups was developed. Although this prognostic signature consists of genes that are to some extent associated with proliferation, invasion, metastasis, and angiogenesis, its prognostic power seems to stem from the expression levels of proliferation-related genes alone [34].

This signature was further validated in various cohorts of breast cancer patients (e.g., node-negative, node-positive, HER2-positive) and was shown to provide prognostic information in addition to that provided by standard clinico-pathological variables [56,57,58,59,60,61,62]. Furthermore, the prognostic groups identified by MammaPrint® seem to correlate with response to chemotherapy; MammaPrint®-defined good prognosis tumors have been reported to derive minimal benefit from chemotherapy, whereas a subset of tumors classified as of poor prognosis have higher rates of chemotherapy response [60].

The first prospective validation of the MammaPrint® assay was provided by the RASTER study [63], which included 427 node-negative breast cancer patients and showed that patients with a low-risk signature had a 5-year relapse-free survival rate of 97%, compared to 91.7% among patients with a high-risk signature. Later on, the clinical utility of the MammaPrint® assay was validated by the MINDACT randomized phase III trial [64], which included 6,693 patients with negative or 1–3 positive nodes. The results of this trial showed that patients with clinically high-risk disease based on clinico-pathological parameters (Adjuvant! Online) and a MammaPrint®-defined low genomic risk who did not receive chemotherapy had a 5-year distant metastasis-free survival of 94.7%, supporting the utility of the MammaPrint® assay in the selection of patients in whom chemotherapy could be spared [64].

Oncotype DX®

The 21-gene assay (Oncotype DX®, Genomic Health, Redwood City, CA, USA) is one of the most widely used multigene classifier assays. It consists of a qRT-PCR-based signature in which RNA is extracted from FFPE tissue samples [65, 66]. The signature measures the expression of 21 genes, of which 16 are cancer-related genes and 5 are reference genes. An algorithm is used to calculate a “recurrence score” (RS), ranging from 0 to 100, based on the 21-gene list, and patients are classified into three risk groups: low risk (RS <18), intermediate risk (RS from 18 to <31), and high risk (RS ≥31). The RS has been shown to predict the 10-year risk of distant relapse for ER-positive node-negative breast cancer patients, based on the analyses of samples from the National Surgical Adjuvant Breast and Bowel Project (NSABP) B-20 clinical trial [67]. The RS was validated in a large cohort of ER-positive, node-negative tamoxifen-treated patients from the NSABP B-14 trial, which provides level I evidence to support its prognostic value [68]. In addition, RS has also been shown to be associated with benefit from chemotherapy in patients with ER-positive disease. Chemotherapy benefit is observed in patients whose tumors have a high RS, whereas the benefit from chemotherapy is negligible in patients with low-RS cancers [69]. The first prospective study to validate the clinical utility of Oncotype DX® was the TAILORx trial [70]. To minimize undertreatment, the Oncotype DX® RS ranges were modified in this trial, with an RS of 11–25 defining the intermediate-risk group. The initial results of TAILORx showed that the risk of recurrence in patients with hormone receptor-positive, HER2-negative, node-negative breast cancer with an RS ≤ 10, receiving endocrine therapy alone, is very low [70], indicating that this population can safely forgo chemotherapy.
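The two sets of RS cut-offs described above can be written down as a small lookup, which makes it easy to see how the same score lands in different risk groups under each scheme. This is a minimal sketch: the function name and the scheme labels are hypothetical, and for the TAILORx scheme it assumes integer scores, with 10 or below treated as low risk and anything above the stated 11–25 intermediate range treated as high risk. It only maps an already computed score to a category and does not implement the proprietary algorithm that derives the RS from gene expression.

```python
def rs_risk_group(rs, scheme="original"):
    """Map a recurrence score (0-100) to a risk group.

    scheme="original": low <18, intermediate 18 to <31, high >=31.
    scheme="tailorx":  low <=10, intermediate 11-25, high >25 (assumed bounds).
    """
    if not 0 <= rs <= 100:
        raise ValueError("recurrence score must be between 0 and 100")
    if scheme == "original":
        if rs < 18:
            return "low"
        if rs < 31:
            return "intermediate"
        return "high"
    if scheme == "tailorx":
        if rs <= 10:
            return "low"
        if rs <= 25:
            return "intermediate"
        return "high"
    raise ValueError(f"unknown scheme: {scheme}")

# The same score can fall in different groups depending on the scheme.
for score in (10, 15, 26):
    print(score, rs_risk_group(score), rs_risk_group(score, scheme="tailorx"))
```

A score of 15, for example, is low risk under the original cut-offs but intermediate risk under the modified ranges, which reflects the stated aim of minimizing undertreatment.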

Multiple studies have evaluated the clinical utility of Oncotype DX® to determine whether patients with an intermediate RS may benefit from the addition of adjuvant chemotherapy. A recent prospective-retrospective study showed that patients with an intermediate RS (11–25) had very low 5-year distant recurrence rates, suggesting that chemotherapy did not confer clinical benefit in this group [71]. The results of the TAILORx trial in patients with an intermediate RS will be presented soon and are eagerly awaited.

Based on these studies, Oncotype DX® has been incorporated in clinical guidelines and its use is recommended by expert panels; furthermore, it has received support from the American Society of Clinical Oncology for its use in early ER-positive node-negative breast cancer [22, 72, 73].

Oncotype DX® has been shown to provide prognostic information above and beyond that of histologic grade and tumor size [74,75,76]. The applications of Oncotype DX® have been expanded, as this assay has also been shown to be a useful prognostic test in other scenarios such as (i) ER-positive node-positive patients treated with tamoxifen, (ii) ER-positive patients treated with aromatase inhibitors, (iii) ER-positive, node-negative patients receiving no adjuvant therapy, and (iv) node-positive patients treated with doxorubicin-containing chemotherapy [75, 77, 78].

Prosigna®

The prediction analysis of microarrays 50 (PAM50) assay was originally intended as a means to identify breast cancer “intrinsic” gene subtypes with high prognostic validity [18]. Prosigna®, a commercially available assay using NanoString technology in RNA extracted from FFPE samples, was later developed, and its use in postmenopausal women with hormone receptor-positive tumors with or without node involvement was approved by the FDA [79]. Assessment of the expression of 50 classifier genes and 5 control genes can be used to classify breast tumors into the intrinsic subtypes. In addition, this assay provides a risk of recurrence (ROR) score, which ranges between 0 and 100, defining low-, intermediate-, and high-risk categories. The ROR score in the training dataset predicted the probability of cancer recurrence over 10 years for patients with node-negative tumors who did not receive adjuvant systemic therapy [18]. The prognostic value of the ROR score has been further validated in 786 patients with ER-positive breast cancer treated with tamoxifen, showing that PAM50 and tumor size might give more prognostic information than other clinico-pathological variables [80]. Notably, an 11-gene proliferation signature, which is related to cell cycle function, was derived from the 50 genes of the PAM50 assay. The 11-gene signature improved the original model, as it had more prognostic value than expression of Ki67 [80]. A study comparing the prognostic information provided by Oncotype DX® and PAM50 using over 1,000 samples from the Arimidex, Tamoxifen, Alone or in Combination (ATAC) trial revealed that the PAM50 ROR score yielded significantly more prognostic information than the Oncotype DX® RS, and that the PAM50 ROR provides independent prognostic information above and beyond that offered by nodal status, tumor size, histopathologic grade, age, and type of endocrine treatment [81]. Another validation study included 1,478 postmenopausal patients with hormone receptor-positive, HER2-negative breast cancer receiving adjuvant endocrine therapy, and showed that the ROR score was able to predict relapse-free survival [82]. Similarly, a recent comprehensive study conducted in a nationwide Danish cohort, including postmenopausal women with hormone receptor-positive, HER2-negative breast cancer, solidified the notion that Prosigna ROR may identify patients with negative nodes or one to three positive nodes in whom adjuvant chemotherapy could be spared [83].

Breast Cancer Index℠ (BCI)

The Breast Cancer Index (BCI) molecular assay (BioTheranostics, San Diego, CA) was developed to assess the risk of distant recurrence in ER-positive, node-negative breast cancer patients [74, 84, 85]. It is a prognostic assay which combines two gene expression signatures: the HOXB13:IL17BR (H:I) two-gene ratio, which predicts distant recurrence in patients with ER-positive breast cancer treated with tamoxifen [84], and a proliferation-related five-gene molecular grade index (MGI) [74] that distinguishes grade 1 from grade 3 cancers. This dichotomous index (MGI together with HOXB13:IL17BR) is based on quantitative RT-PCR (qRT-PCR) using RNA from FFPE tissues, and provides a more accurate prognosis than either biomarker alone. Furthermore, the BCI is a continuous risk model that enables prediction of distant recurrence risk, and is significantly associated with distant recurrence and breast cancer death [85].

The BCI assay, 21-gene recurrence score, and an immunohistochemical prognostic model (IHC4) were prospectively compared for both early (0–5 years) and late (5–10 years) recurrence in ER-positive, node-negative patients in the TransATAC study (i.e., patients enrolled in the Arimidex, Tamoxifen, Alone or in Combination (ATAC) clinical trial) [86]. The BCI has been shown to be a significant prognostic test for risk of both early and late distant recurrence and could assist in the identification of high-risk patients who would derive benefit from extended endocrine therapy or additional therapy.

A recent retrospective analysis comparing the prognostic accuracy of BCI and Oncotype DX® RS showed that BCI possessed a higher prognostic accuracy than the RS [87]. Notably, the BCI was able to identify subsets of patients with low- and intermediate-RS tumors with significant rates of distant recurrence [87], indicating that BCI may aid in the selection of patients with hormone receptor-positive and node-negative breast cancer who could benefit from adjuvant chemotherapy or extended endocrine therapy. A novel Breast Cancer Index model (BCIN+) was later developed for the assessment of the risk for distant recurrence in patients with hormone receptor-positive breast cancer and one to three positive lymph nodes [88]. BCIN+ integrates BCI gene expression and tumor size and grade and could identify a patient population with limited risk of recurrence over 15 years, who could safely forgo extended endocrine therapy [88].

EndoPredict Test

EndoPredict is an RNA-based multigene assay that interrogates proliferation and ER signaling-related genes for the assessment of the probability of distant recurrence in patients with ER-positive, HER2-negative breast cancer treated with adjuvant endocrine therapy [39, 41,42,43]. The EndoPredict test is based on the quantification of mRNA levels of eight cancer genes plus three reference genes in FFPE specimens by qRT-PCR and was shown to provide additional prognostic information, which is independent from clinico-pathological parameters (i.e., Adjuvant!Online and Ki67 labeling index) [40]. In two validation cohorts, the EndoPredict test was combined with clinical risk factors (i.e., nodal status and tumor size) into a comprehensive risk score called EPclin, which has been shown to identify a subgroup of “very-low”-risk patients who may be satisfactorily treated with adjuvant endocrine therapy only [39]. The clinical utility of EndoPredict was also validated in patients with ER-positive, HER2-negative, node-positive breast cancer from the GEICAM 9906 trial treated with adjuvant chemotherapy and endocrine therapy [89]. The EndoPredict and EPclin scores showed independent prognostic power for the prediction of metastasis-free survival and distinguished low-risk from high-risk patients [89]. A recent study comparing the performance of EPclin and Oncotype DX® RS for the prediction of 10-year distant recurrence showed that EPclin provided more prognostic information than Oncotype DX® RS [90].

Gene Expression Predictive Signatures

Predictive gene signatures aim to define the therapeutic response to chemotherapy, endocrine therapy, or other targeted agents [5, 6, 91,92,93,94,95]. Akin to the prognostic gene expression signatures, ER status and proliferative index have been shown to be major determinants of response to combinatorial chemotherapy. Thus far, the clinical value of gene expression signatures predictive of response to single chemotherapy agents remains controversial for breast cancer. In fact, there is no robust available gene signature capable of predicting responses to specific therapeutic agents. Several hypotheses have been advanced to explain the limited success in developing and validating predictive signatures. First, resistance to chemotherapy can be caused by functional alterations in a few genes or a single gene, and it is plausible that microarray-based gene expression profiling would not be sufficiently sensitive to identify such genes [91]. Second, intra-tumor genetic heterogeneity plays an important role in determining the emergence of drug resistance. Breast tumors often comprise heterogeneous collections of cancer cells that encompass rare clonal subpopulations, which have different genetic and epigenetic aberrations [96, 97]. Some genetic aberrations, which may be found in single clones of tumors, may drive therapeutic resistance [98]. As microarrays report an average of the expression profile of the tumor, this technique cannot reliably identify those rare resistant clones. Finally, multiple genetic and epigenetic factors and also drug-resistance mechanisms not related to the tumor itself (e.g., tissue microenvironment, patient metabolism) may determine resistance to therapy [6]. Although some predictive gene expression signatures appear to have predictive value in validation studies (e.g., SET index) [99], their accuracy in determining the response of individual patients may be limited [6].
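The averaging argument can be illustrated with simple arithmetic. The sketch below treats a bulk measurement as the cell-fraction-weighted mean of the expression level in each clone; this is a deliberately simplified model, and the clone fraction and the fold change are hypothetical numbers chosen for illustration, as is the function name.

```python
import math

def bulk_expression(fractions, levels):
    """Bulk signal modeled as the weighted mean of per-clone expression."""
    if not math.isclose(sum(fractions), 1.0):
        raise ValueError("clone fractions must sum to 1")
    return sum(f * level for f, level in zip(fractions, levels))

# Hypothetical tumor: 99% of cells express a gene at baseline (1.0), while a
# resistant clone making up 1% of cells overexpresses it 10-fold.
bulk = bulk_expression([0.99, 0.01], [1.0, 10.0])

print(bulk)             # bulk signal relative to a tumor without the clone
print(math.log2(bulk))  # the same change expressed as a log2 fold change
```

Under these assumptions a 10-fold change confined to 1% of the cells shifts the bulk signal by only about 9%, a difference that is easily lost in technical and biological noise.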

Massively Parallel Sequencing and the Impact in Intra-tumor Genetic Heterogeneity

The advent of massively parallel sequencing has enabled the entire constellation of genetic alterations in cancers to be defined in a matter of days at reasonable cost. Several large-scale massively parallel sequencing-based studies of breast cancer have now been completed and have demonstrated that (i) the collection of genetic aberrations found in breast cancers is complex, with a limited number of highly recurrently mutated genes in a substantial proportion of unselected cases [32, 96, 100, 101], (ii) the number of genes mutated in small minorities of breast cancers is vast, (iii) the repertoire of mutations in luminal and basal-like breast cancers is vastly different, and (iv) despite these differences, there is no gene or mutation that defines a subtype of breast cancer [100,101,102,103].

Genomic analyses of human cancers have provided direct evidence of spatial [104,105,106] and temporal [104, 107, 108] intra-tumor genetic heterogeneity and have shown that a substantial proportion of cancers at diagnosis are composed of mosaics of tumor cells [96, 106], where subclones of cells harbor private mutations in addition to the founder genetic events. Although intra-tumor genetic heterogeneity has been recognized for many years [109], it has been explored in primary breast cancers using massively parallel sequencing approaches in only a limited number of studies (Fig. 26.2) [96, 97, 110]. The impact of intra-tumor genetic heterogeneity on the biology and, consequently, on treatment design of breast cancer remains to be fully understood. Genomic analysis of two pairs of matched primary tumors and distant metastatic relapses after adjuvant treatment revealed differences in their mutational makeup [107, 108] and suggested that clonal selection during the metastatic process is likely to occur. Along these lines, the integrative whole exome sequencing and gene expression analysis of a cohort of 500 metastatic solid tumors, which was enriched for breast cancer patients, identified TP53, CDKN2A, PTEN, PIK3CA, and RB1 as the most frequently somatically mutated genes in metastatic cancer [7]. A recent study portrayed the mutational landscape of 216 metastatic breast cancers and compared it to that of 772 primary breast cancers from the TCGA [111]. This study identified ESR1 and RB1 as driver genes enriched in breast cancer metastases, with ESR1 being the most frequently mutated metastasis-specific gene [111]. Other frequently mutated actionable genes identified in ER-positive HER2-negative metastatic breast cancer included TSC1 and TSC2, ERBB4, NOTCH3, and ALK [111].

Fig. 26.2
Tumor heterogeneity. (a) Inter-patient heterogeneity. (b) Inter-patient heterogeneity. (c) Clonal evolution and the tree model: mutations shared by all tumor cells derive from the founder clone, which is depicted as the trunk of the tree. The branches are composed of tumor cells that acquire mutations present only in a subset of the tumor cells. (d) Intra-tumor genetic heterogeneity and the approaches for the characterization of the molecular aberrations in breast cancers

Mutations targeting ESR1 are also among the actionable targets that differ between primary and metastatic breast cancers. While ESR1 mutations are found in <1% of primary breast cancers, they may be identified in up to 54% of relapses following endocrine therapy [112]. ESR1 mutations affect the ligand-binding domain, and some of these mutations have been shown to result in the activation of ER-dependent genes even in the absence of estradiol (E2) and to require higher doses of tamoxifen and fulvestrant for the inhibition of ER activity [113,114,115]. Along these lines, ESR1 mutations may be identified in the cfDNA of patients with metastatic breast cancer who progress despite endocrine therapies [116]. Moreover, the detection of Y537S and D538G ESR1 mutations in cfDNA of patients with ER-positive metastatic breast cancer from the BOLERO-2 trial receiving aromatase inhibitors was associated with shorter survival [117].

HER2 mutations are also enriched in metastatic breast cancer [118]. Of note, not all HER2 mutations result in activation of downstream pathways [119]. Indeed, in vitro and in vivo assays revealed that only a subset of HER2 mutations are bona fide activating mutations [119]. Importantly, upon therapeutic pressure, passenger HER2 mutations may become drivers. Along these lines, massively parallel sequencing of lapatinib-resistant cell models showed that acquisition of the HER2 L755S mutation may result in HER2 reactivation, representing a mechanism of resistance to lapatinib, which may be overcome by irreversible HER2 inhibition [120]. A recent “basket” trial across 21 cancer types using the pan-HER kinase inhibitor neratinib showed that its efficacy in HER2 mutant cases varied according to tumor type and individual mutant variant [121]. Breast tumors and missense mutations targeting the kinase domain of HER2 were found to be associated with the greatest sensitivity to neratinib [121].

The spatial and temporal intra-tumor genetic heterogeneity observed in solid cancers constitutes a challenge for the realization of the potentials of precision medicine, given that the results of genetic biomarker analyses performed in single biopsies for treatment decision-making may differ according to the area of the tumor sampled [104], between the primary tumor and its distant metastases, or even between different metastatic sites [104, 122]. This multiregional separation of molecular aberrations can lead to sampling bias, potentially impairing the interpretation of genomics results derived from individual biopsies. Therefore, approaches to provide a global assessment of the repertoire of somatic genetic aberrations in a tumor are important for the accurate selection of targeted therapies for individual patients.

Deciphering intra-tumor heterogeneity using massively parallel sequencing approaches has important implications that may refine our understanding of breast cancer biology, its genetic diversity and the mechanisms that lead to therapeutic resistance [103, 122,123,124,125]. Much effort has been put in this direction, including massively parallel sequencing of single cells [106] and circulating biomarkers [126,127,128,129].

Liquid Biopsies in Breast Cancer

Tumors are composed of multiple subclones with different genetic alterations, and minor subpopulations of the primary tumor may be the ones that develop into metastasis [108]. Despite their many advantages, traditional DNA sequencing approaches, in which the bulk of the tumor is analyzed, lack the power to detect minor tumor subclones [130] which may be the source of disease progression and resistance to therapy [123]. Moreover, occasionally, the anatomic inaccessibility of metastatic outgrowths precludes their sampling [131]. Liquid biopsies, which encompass the study of circulating cell-free tumor DNA (cfDNA) and circulating tumor cells (CTCs), have the potential to circumvent the limitations inherent to tissue-based DNA sequencing and to monitor dynamic changes in tumor genomes, in a noninvasive manner [129, 132].
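
The limited power of bulk sequencing to detect minor subclones can be illustrated with a simplified calculation. The sketch below assumes a heterozygous mutation in a diploid region, no sequencing error, and a purely binomial sampling of reads; the function names, the default threshold of three supporting reads, and the example values are assumptions made for illustration and do not come from the cited studies.

```python
from math import comb


def expected_vaf(purity, clonal_fraction, tumor_copies=2,
                 mutant_copies=1, normal_copies=2):
    """Expected variant allele fraction in a bulk sample.

    purity          -- fraction of cells in the sample that are tumor cells
    clonal_fraction -- fraction of tumor cells carrying the mutation
    """
    mutant_alleles = purity * clonal_fraction * mutant_copies
    total_alleles = purity * tumor_copies + (1.0 - purity) * normal_copies
    return mutant_alleles / total_alleles


def prob_at_least_k_mutant_reads(depth, vaf, min_reads=3):
    """Probability of observing at least `min_reads` mutant reads at a
    given sequencing depth, under a binomial model with no error."""
    p_fewer = sum(
        comb(depth, i) * vaf ** i * (1.0 - vaf) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Hypothetical sample: 60% tumor cells, mutation in 5% of tumor cells.
    vaf = expected_vaf(purity=0.6, clonal_fraction=0.05)
    for depth in (100, 1000):
        print(depth, vaf, prob_at_least_k_mutant_reads(depth, vaf))
```

In this model the mutation is expected in only 1.5% of reads, so the chance of seeing enough supporting reads depends strongly on depth. In practice, sequencing error rates of a similar order make such low-frequency calls harder still.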

Multiple lines of evidence indicate that the study of liquid biopsies has a potential use in tailoring therapy in early and metastatic breast cancer [133]. In the context of early disease, mutation tracking in circulating tumor DNA (ctDNA) in the plasma of early breast cancer patients receiving neoadjuvant chemotherapy could predict metastatic relapse earlier than the methods currently used [134]. Moreover, it allowed for the identification of the genetic events in minimal residual disease, which could in turn predict the genetic alterations in subsequent metastasis with more accuracy than sequencing of the primary tumor [134].

Liquid biopsies may also play a role in the detection of genetic alterations that drive therapeutic resistance in metastatic breast cancer, such as ESR1 mutations [135]. The detection of ESR1 mutations in liquid biopsies might aid in the triage of patients with metastatic hormone receptor-positive breast cancer for further endocrine therapies, as illustrated in the study of archived baseline plasma of patients of the SoFEA trial [136]. In this study, patients with ESR1 mutations detected in plasma treated with fulvestrant had a better progression-free survival than those treated with exemestane, whereas no difference was observed in patients with wild-type ESR1.

Other potential uses of liquid biopsies in tailoring the management of breast cancer patients are currently being explored [133]. BRCA1/2 reversion mutations in BRCA1/2 germline mutation carriers may functionally restore BRCA1 and BRCA2 and mediate resistance to platinum salts or PARP inhibition [137]. Massively parallel sequencing analysis of cfDNA detected BRCA1/2 reversion mutations in BRCA1/2 germline mutation carriers with metastatic breast cancer pretreated with platinum and/or PARP inhibitors, underscoring the potential of liquid biopsies to aid in the selection of patients amenable to PARP inhibition [138].

Taken together, a burgeoning body of evidence indicates that analysis of liquid biopsies represents a robust approach to tackle intra-tumor heterogeneity in breast cancer and to guide the management of breast cancer patients, both in early and in advanced stages.

Molecular Advances in Histologic Subtyping of Breast Tumors

Comprehensive genomic portrayals of breast cancer have analyzed cohorts of unselected breast cancers, where invasive ductal carcinomas of no special type (IDC-NST) were overrepresented [102]. Special types of breast cancer, which collectively account for up to 20% of all invasive breast cancers, were largely not investigated in those studies. In fact, the second breast TCGA study, which focused on lobular breast cancer [139], and independent investigations of invasive lobular carcinomas (ILC) [140] confirmed that inactivating CDH1 mutations, the hallmark of lobular carcinomas, are not present in IDC-NSTs. Furthermore, the genetic alterations activating the estrogen pathway differ according to tumor histology, with FOXA1 and GATA3 mutations being more frequent in ILCs and IDC-NSTs, respectively [139, 140].

The analysis of special types of breast cancer, however, has provided important insights in regard to the taxonomy of breast cancer. Studies focusing on the genomic characterization of rare breast cancer types have demonstrated that the vast histologic heterogeneity of breast cancer is paralleled by marked heterogeneity at the molecular level, which is more overt in the realm of TNBC [141, 142]. Indeed, studies conducted by our group and others have shown that contrary to the common perception of TNBC as a group of tumors with uniformly aggressive biology and poor prognosis, low-grade variants of triple negative disease exist [141, 142]. Among these entities, the “low-grade triple-negative breast neoplasia” family, which includes microglandular adenosis (MGA), atypical MGA, and acinic cell carcinoma (ACC), can be recognized. Notwithstanding their low-grade morphology, MGA and ACC display complex genomic profiles and frequent TP53 mutations, similar to conventional high-grade TNBCs [143].

Salivary gland-like tumors of the breast are also low-grade TNBC variants and encompass tumors that despite being more frequent in the salivary glands arise also in the breast and are underpinned by pathognomonic genetic alterations, such as secretory carcinomas and adenoid cystic carcinomas [144, 145]. Secretory carcinomas are characterized by the t(12;15)(p13;q25) translocation that results in the ETV6–NTRK3 fusion gene [144]. The hallmark genetic alteration of adenoid cystic carcinomas is the t(6;9)(q22-23;p23-24) translocation which results in the MYB-NFIB fusion gene [145]. Interestingly, our study of MYB-NFIB-negative adenoid cystic carcinomas revealed that these tumors harbor MYBL1 rearrangements (MYBL1-ACTN1 and MYBL1-NFIB) or MYB amplification, showing that this entity is driven by MYB or MYBL1 activation achieved by different mechanisms, and constitutes an example of convergent phenotype [146].

Adenomyoepitheliomas (AMEs) and solid papillary carcinomas with reverse polarity (SPCRPs) constitute additional examples of genotypic-phenotypic correlations in the breast. Our recent analysis of breast AMEs revealed that their genetic makeup varies according to their ER status [147]. ER-positive AMEs display frequent PIK3CA or AKT1 activating mutations, whereas ER-negative AMEs are characterized by HRAS Q61 hotspot mutations co-occurring with PIK3CA or PIK3R1 mutations [147]. Notably, epithelial-myoepithelial carcinomas of the salivary glands harbor frequent HRAS Q61 hotspot mutations, which co-occur with PIK3CA mutations in almost half of cases [148], showing that the aforementioned mutational co-occurrence results in epithelial-myoepithelial differentiation regardless of anatomic location. Importantly, this study [147] qualifies HRAS mutations as pathognomonic for AMEs in a breast-specific context.

SPCRPs are extremely rare breast tumors, which morphologically resemble the tall cell variant of papillary thyroid carcinoma and constitute an additional example of genotypic-phenotypic association in the breast. Our analysis of two independent cohorts of SPCRPs revealed that these tumors are underpinned by IDH2 R172 hotspot or TET2 mutations, concurrent with PIK3CA or PIK3R1 mutations [149, 150]. Simultaneous IDH2 and PIK3CA mutations in breast cell lines resulted in the recapitulation of the characteristic morphology of SPCRPs [149], illustrating how the integration of molecular studies and classic pathology resulted in the definition of a discrete breast cancer subtype with a distinctive morphology and molecular underpinning.

Conclusions

Molecular diagnostics play a key role in the management of breast cancer patients, and molecular assays are being increasingly incorporated into routine clinical practice. Gene expression profiling has provided significant advances in the molecular classification and prognostication of breast cancer and has given new insights regarding therapeutic prediction. Microarray-based gene expression studies have changed the way breast cancer is perceived and have highlighted the fact that breast cancer comprises a heterogeneous collection of diseases with distinct molecular characteristics and outcomes. Along these lines, the development of multigene signatures has allowed the identification of patients with ER-positive, HER2-negative breast cancer who could be spared chemotherapy.

The identification of actionable targets by massively parallel sequencing approaches is becoming a cornerstone for the realization of the potentials of precision medicine. Indeed, it is not hard to envision that liquid biopsies will be implemented in the near future for the monitoring of early and advanced breast cancer, as a means to overcome the challenges posed by intra-tumor heterogeneity.

The integration of molecular studies and classic pathology in recent years has facilitated the dissection of the morphologic and genetic heterogeneity of breast cancer. Thus, the taxonomy of breast cancer is becoming increasingly reliant on the genetic makeup of tumors rather than solely on classical histomorphological parameters. Molecular techniques are developing at an unprecedented pace. Nevertheless, to achieve the goals of individualized therapy, molecular methods must be incorporated into clinical practice only after undergoing the same level of scrutiny to which current diagnostic techniques have been subjected.