Introduction

Most primary breast cancers are hormone receptor-positive [1], and targeting the estrogen receptor (ER) is a common treatment strategy [2]. Aromatase inhibitors, selective ER modulators (SERMs), and selective ER downregulators (SERDs) can successfully treat ER-positive disease [3]. However, in a significant number of patients, endocrine resistance eventually develops, leading to disease recurrence and metastases [4]. Delineating the mechanisms of resistance and developing strategies to overcome treatment-resistant disease are important for improving breast cancer outcomes.

Alterations known to contribute to acquired endocrine resistance include somatic alterations to key cancer pathways, ER transcriptional regulators, and DNA repair genes [5,6,7]. Comparative analyses of tumor sequencing data for hormone therapy-naïve and post-hormone therapy tumors identified ESR1, ERBB2 and NF1 among the genes most frequently mutated in response to hormone therapy [6]. ESR1 encodes ERα, and mutations in its ligand-binding domain (LBD) that result in constitutive activation are found almost exclusively in patients with endocrine therapy-resistant disease [8, 9]. ESR1 LBD mutations [10], copy number gains [11], and gene fusions [12] are all associated with acquired resistance and metastatic disease. In an analysis of 541 patients with metastatic breast cancer previously treated with an aromatase inhibitor [10], Chandarlapaty and colleagues tested circulating tumor DNA (ctDNA) for two LBD hotspot mutations, Y537S and D538G, and almost 30% of patients had one of these mutations. In another targeted sequencing analysis (287-gene panel) of 11,616 breast cancer tumors, ESR1 mutations were identified in 10% of samples and were overwhelmingly enriched in metastatic samples; 78% of the samples in which ESR1 mutations were found were metastatic [13].

We sought to survey the landscape of ESR1 variants among a large genomic database of breast cancer cases. We analyzed 9860 breast cancer tumors, which included 5337 (54.1%) samples from distant metastatic sites, with a combination of large panel sequencing—a 592-gene panel—and whole transcriptome sequencing. Herein, we report the results of our ESR1-focused analyses.

Methods

Patient samples

Formalin-fixed paraffin-embedded (FFPE) patient samples (n = 9860) were submitted to a commercial CLIA-certified laboratory (Caris Life Sciences, Phoenix, AZ). The present study was conducted in accordance with the guidelines of the Declaration of Helsinki, Belmont Report, and U.S. Common Rule. In compliance with policy 45 CFR 46.101(b), this study was conducted using retrospective, de-identified clinical data, and patient consent was not required.

Next-generation sequencing (NGS) for 592-gene panel

NGS was performed on genomic DNA isolated from FFPE tumor samples (n = 9860) using the NextSeq platform (Illumina, Inc., San Diego, CA). Matched normal tissue was not sequenced. A custom-designed SureSelect XT assay was used to enrich 592 whole-gene targets (Agilent Technologies, Santa Clara, CA). All variants were detected with > 99% confidence based on allele frequency and amplicon coverage, with an average sequencing depth of coverage of > 500 and an analytic sensitivity of 5%. Prior to molecular testing, tumor enrichment was achieved by harvesting targeted tissue using manual microdissection techniques. Genetic variants identified were interpreted by board-certified molecular geneticists and categorized as ‘pathogenic,’ ‘likely pathogenic,’ ‘variant of unknown significance,’ ‘likely benign,’ or ‘benign,’ according to the American College of Medical Genetics and Genomics standards. Alteration rates were calculated as the total number of samples harboring a ‘pathogenic’ or ‘likely pathogenic’ variant divided by the total number of samples scored.
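
The alteration-rate definition above can be illustrated with a minimal sketch. The function name, the input format, and the treatment of unscored samples as `None` are assumptions made for illustration; they do not describe the laboratory's actual pipeline.

```python
from typing import Iterable, Optional

# Classifications counted as an alteration, per the rule described above.
COUNTED = {"pathogenic", "likely pathogenic"}


def alteration_rate(calls: Iterable[Optional[str]]) -> float:
    """Fraction of scored samples with a pathogenic/likely pathogenic variant.

    Each item is the most severe classification reported for one sample,
    or None when the sample could not be scored (assumption: unscored
    samples are excluded from the denominator).
    """
    scored = [c for c in calls if c is not None]
    if not scored:
        raise ValueError("no scored samples")
    altered = sum(1 for c in scored if c.lower() in COUNTED)
    return altered / len(scored)


# Hypothetical calls for five samples; one sample was not scored.
example_calls = [
    "pathogenic",
    "benign",
    None,
    "likely pathogenic",
    "variant of unknown significance",
]
example_rate = alteration_rate(example_calls)  # 2 of 4 scored samples
```

In this hypothetical input, variants of unknown significance and benign variants count toward the denominator but not the numerator.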

Copy number alteration (CNA)

The CNA of each exon was determined by calculating the average depth of the sample along with the sequencing depth of each exon and comparing this calculated result to a pre-calibrated value.
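
One way to read this description is as a depth ratio normalized to a calibration value. The sketch below is only that reading: the function name, the exact normalization, and the example depths are assumptions, and no amplification threshold is shown because none is stated here.

```python
def exon_copy_ratio(exon_depth: float,
                    sample_mean_depth: float,
                    calibrated_ratio: float) -> float:
    """Depth of one exon relative to the sample, scaled by a calibration value.

    calibrated_ratio is a hypothetical stand-in for the pre-calibrated value
    mentioned in the text (the exon-to-sample depth ratio expected for a
    copy-neutral exon).
    """
    if sample_mean_depth <= 0 or calibrated_ratio <= 0:
        raise ValueError("depth and calibration values must be positive")
    observed_ratio = exon_depth / sample_mean_depth
    return observed_ratio / calibrated_ratio


# Hypothetical values: exon sequenced at 1500x in a sample averaging 500x,
# with an expected copy-neutral ratio of 1.0.
example_ratio = exon_copy_ratio(1500.0, 500.0, 1.0)
```

A value near 1 would indicate depth consistent with the calibration, and larger values would suggest a gain.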

Immunohistochemistry (IHC)

IHC was performed on full FFPE sections mounted on glass slides. These slides were stained using automated staining techniques, per the manufacturer’s instructions, and optimized and validated per Clinical Laboratory Improvement Amendments/College of American Pathologists and International Organization for Standardization requirements. Staining was scored for intensity (0 = no staining; 1+ = weak staining; 2+ = moderate staining; 3+ = strong staining) and staining percentage (0–100%).

Tumor mutational burden (TMB)

TMB was measured by counting all non-synonymous missense, nonsense, in-frame insertion/deletion and frameshift mutations found per tumor that had not been previously described as germline alterations in dbSNP151, Genome Aggregation Database (gnomAD) databases or benign variants identified by Caris Life Sciences geneticists. A cutoff point of ≥ 10 mutations per megabase (mt/MB) was used based on the KEYNOTE-158 pembrolizumab trial [14], which showed that patients with a TMB of ≥ 10 mt/MB across several tumor types had higher response rates than patients with a TMB of < 10 mt/MB. Caris Life Sciences is a participant in the Friends of Cancer Research TMB Harmonization Project [15].
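
The TMB-High classification can be sketched as a count divided by the sequenced territory, compared with the cutoff of 10 mutations per megabase. The size of the sequenced territory is not stated in this section, so `panel_size_mb` and the example values below are assumptions for illustration only.

```python
TMB_HIGH_CUTOFF = 10.0  # mutations per megabase, the cutoff given in the text


def tmb_per_mb(n_qualifying_mutations: int, panel_size_mb: float) -> float:
    """Qualifying somatic mutations per megabase of sequenced territory.

    n_qualifying_mutations is the count left after removing known germline
    and benign variants; panel_size_mb is an assumed input.
    """
    if panel_size_mb <= 0:
        raise ValueError("panel_size_mb must be positive")
    return n_qualifying_mutations / panel_size_mb


def is_tmb_high(n_qualifying_mutations: int, panel_size_mb: float) -> bool:
    """True when TMB meets or exceeds the cutoff."""
    return tmb_per_mb(n_qualifying_mutations, panel_size_mb) >= TMB_HIGH_CUTOFF


# Hypothetical tumor: 18 qualifying mutations over an assumed 1.5 Mb territory.
example_tmb = tmb_per_mb(18, 1.5)
```

Because the cutoff is inclusive, a tumor with exactly 10 mutations per megabase is classified as TMB-High.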

Whole transcriptome sequencing (WTS)

WTS used a hybrid-capture method to pull down the full transcriptome from FFPE tumor samples (n = 4305; the WTS platform was not available at the time of profiling for all samples) using the Agilent SureSelect Human All Exon V7 bait panel (Agilent Technologies, Santa Clara, CA) and the Illumina NovaSeq platform (Illumina, Inc., San Diego, CA). FFPE specimens underwent pathology review to determine percent tumor content and tumor size; a minimum of 10% tumor content in the area for microdissection was required to enable enrichment and extraction of tumor-specific RNA. The Qiagen RNA FFPE tissue extraction kit was used for extraction, and the RNA quality and quantity were determined using the Agilent TapeStation. Biotinylated RNA baits were hybridized to the synthesized and purified complementary DNA (cDNA) targets, and the bait-target complexes were amplified in a post-capture PCR reaction. The resultant libraries were quantified and normalized, and the pooled libraries were denatured, diluted, and sequenced. Raw data were demultiplexed using the Illumina DRAGEN FFPE accelerator. FASTQ files were aligned with the STAR aligner (Alex Dobin, release 2.7.4a, GitHub). A full 22,948-gene dataset of expression data was produced by Salmon, which provides fast and bias-aware quantification of transcript expression [16]. BAM files from the STAR aligner were further processed for RNA variants using a custom detection pipeline. The reference genome used was GRCh37/hg19, and analytical validation of this test demonstrated ≥ 97% Positive Percent Agreement (PPA), ≥ 99% Negative Percent Agreement (NPA) and ≥ 99% Overall Percent Agreement (OPA) with a validated comparator method.

Fusion detection by WTS

For samples tested February 2019 and later, gene fusion detection was performed on mRNA isolated from FFPE tumor samples (n = 4305) using the Illumina NovaSeq platform (Illumina, Inc., San Diego, CA) and the Agilent SureSelect Human All Exon V7 bait panel (Agilent Technologies, Santa Clara, CA). FFPE specimens underwent pathology review to determine percent tumor content and tumor size; a minimum of 10% tumor content in the area for microdissection was required to enable enrichment and extraction of tumor-specific RNA. The Qiagen RNA FFPE tissue extraction kit was used for extraction, and the RNA quality and quantity were determined using the Agilent TapeStation. Biotinylated RNA baits were hybridized to the synthesized and purified cDNA targets, and the bait-target complexes were amplified in a post-capture PCR reaction. The resultant libraries were quantified and normalized, and the pooled libraries were denatured, diluted, and sequenced. The reference genome used was GRCh37/hg19, and analytical validation of this test demonstrated ≥ 97% Positive Percent Agreement (PPA), ≥ 99% Negative Percent Agreement (NPA) and ≥ 99% Overall Percent Agreement (OPA) with a validated comparator method.

Fusion detection by Archer

For samples tested prior to February 2019, gene fusion detection was performed by targeted RNA sequencing using the ArcherDx fusion assay (Archer FusionPlex Solid Tumor panel). The FFPE tumor samples (n = 344) were microdissected to enrich the sample to ≥ 20% tumor nuclei, and mRNA was isolated and reverse transcribed into cDNA. Unidirectional gene-specific primers were used to enrich for target regions, followed by NGS (Illumina MiSeq platform). Targets included 52 genes, and the full list can be found at http://archerdx.com/fusionplex-assays/solid-tumor. We analyzed reads and contigs that were matched to a database of known fusions and other oncogenic isoforms (Quiver database, ArcherDx), as well as those novel isoforms or fusions with high reads (> 10% of total reads) and high confidence after bioinformatic filtering. Samples with < 4000 unique RNA reads were reported as indeterminate and excluded from analysis, and all the analyzed fusions were in-frame and were predicted to have kinase domains preserved. Fusions among the > 11,000 fusions known to be found in normal tissues were excluded (16). The detection sensitivity of the assay allows for detection of a fusion that is present in at least 10% of the cells in the samples tested.
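
The reporting rules in this paragraph can be summarized as a simple filter. The sketch below is a simplified illustration, not the ArcherDx software: the class, field, and function names are hypothetical, and the "high confidence after bioinformatic filtering" criterion is not modeled.

```python
from dataclasses import dataclass

MIN_UNIQUE_RNA_READS = 4000      # below this, a sample is indeterminate
NOVEL_MIN_READ_FRACTION = 0.10   # novel fusions need > 10% of total reads


@dataclass
class FusionCall:
    name: str
    in_known_fusion_db: bool      # matched a known-fusion database
    read_fraction: float          # supporting reads / total reads
    seen_in_normal_tissue: bool   # on the list of fusions found in normal tissue


def sample_status(unique_rna_reads: int) -> str:
    """Samples with too few unique RNA reads are excluded as indeterminate."""
    if unique_rna_reads < MIN_UNIQUE_RNA_READS:
        return "indeterminate"
    return "evaluable"


def keep_call(call: FusionCall) -> bool:
    """Keep known fusions and well-supported novel ones; drop normal-tissue fusions."""
    if call.seen_in_normal_tissue:
        return False
    return call.in_known_fusion_db or call.read_fraction > NOVEL_MIN_READ_FRACTION


# Hypothetical calls used only to exercise the rules.
calls = [
    FusionCall("known_fusion", True, 0.02, False),
    FusionCall("novel_high_support", False, 0.25, False),
    FusionCall("novel_low_support", False, 0.05, False),
    FusionCall("normal_tissue_fusion", True, 0.40, True),
]
kept = [c.name for c in calls if keep_call(c)]
```

Only the known fusion and the well-supported novel fusion pass the filter in this hypothetical input.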

Statistical analysis

All statistical analyses were performed with JMP V13.2.1 (SAS Institute) or R version 3.6.1 (https://www.R-project.org). Categorical data were evaluated using the Chi-square or Fisher’s exact test, where appropriate.
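
For readers unfamiliar with the test, a Pearson Chi-square test on a 2 × 2 table can be sketched in a few lines. The analyses reported here were run in JMP and R; the pure-Python function below is only an illustration (without continuity correction, and without Fisher's exact test), and the counts are hypothetical.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson Chi-square statistic and P value for the table [[a, b], [c, d]].

    No continuity correction is applied. With 1 degree of freedom, the
    upper-tail probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must contain observations")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: variant vs wild-type samples in two groups of 100.
stat, p = chi_square_2x2(30, 70, 10, 90)
```

With small expected cell counts, Fisher's exact test would be the appropriate alternative.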

Results

We retrospectively reviewed the molecular profiles of a national cohort of 9860 tumor samples from 9545 unique breast cancer patients that were submitted to Caris Life Sciences for molecular testing. All samples were analyzed using targeted DNA NGS, and 4305 samples also had WTS performed. Most patients (74.7%) were aged 50 years or older at the time of molecular profiling, and 94 patients (1%) were male. A slight majority of samples (54.1%) were obtained from distant metastatic sites, 45.8% were from primary breast tissue or locoregional (LR) lymph nodes, and the remaining 0.1% were from lymph node sites that were not otherwise specified (Table 1). The most represented metastatic sites were liver (n = 1655, 31%) and bone (n = 733, 13.7%). Overall, an ESR1 LBD mutation was detected in 8.6% of all tumors evaluated and a pathogenic ESR1 fusion was detected in 1.6%. ESR1 variants (mutation or fusion) were enriched in ER-positive/HER2-negative tumors (LBD mutations in 14.5% and fusions in 2.6%). ESR1 LBD mutations were exclusive to ER-positive tumors, whereas ESR1 fusions were also noted, although rarely, in ER-negative tumors. Breast tumors with an unclear receptor subtype (i.e., indeterminate IHC result for ER, PR, and/or HER2) accounted for 10.7% of the cohort.

Table 1 Tumor sample characteristics

ESR1 somatic mutations

Of the 913 ESR1 variant samples, 844 (92.4%) had an ESR1 LBD mutation, seven of which had a concurrent ESR1 fusion detected. A total of 867 pathogenic/likely pathogenic ESR1 mutations were detected (including one from a male patient), along with 229 unclassified/variant of unknown significance (VUS) ESR1 mutations (Fig. 1). Across all samples, the most common mutations were the LBD hotspot mutations ESR1-D538G (326 of 9860, 3.3%), ESR1-Y537S (227 of 9860, 2.3%), and ESR1-E380Q (111 of 9860, 1.1%). All pathogenic/likely pathogenic ESR1 mutations that we detected were in the LBD, while unclassified/VUS mutations were identified throughout the gene sequence, including the LBD, estrogen receptor domain (ERD), and DNA zinc finger (ZF) domain. Pathogenic ESR1 LBD mutations were present in 20.1% of ER-positive tumors and were significantly more common in HER2-negative tumors (14.5%) than in HER2-positive tumors (5.6%, P < 0.0001); this trend persisted in both locoregional and metastatic tissue. Overall, distant metastatic samples more commonly harbored an ESR1 variant than locoregional tissues across breast cancer subtypes (Fig. 2A). Among metastatic tissue sites, liver metastases had the highest overall ESR1 LBD mutation rate and the highest rate for each hotspot mutation (Fig. 2B). ESR1-D538G was the most common ESR1 LBD mutation in most metastatic sites (range: 2.5% in lung to 10.5% in liver metastases); however, ESR1-Y537S was the most common mutation in lung metastases (3.7%). ESR1-D538G was detected in 4.2% of bone metastases.
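
The hotspot frequencies quoted above follow directly from the reported counts. The short check below recomputes them; the variable and function names are illustrative only.

```python
TOTAL_SAMPLES = 9860          # all tumors profiled by DNA NGS
ESR1_VARIANT_SAMPLES = 913    # samples with an LBD mutation or fusion
LBD_MUTATED_SAMPLES = 844

# Sample counts per hotspot mutation, as reported in the text.
hotspot_counts = {"D538G": 326, "Y537S": 227, "E380Q": 111}


def percent(count: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


hotspot_pct = {name: percent(n, TOTAL_SAMPLES) for name, n in hotspot_counts.items()}
lbd_pct_of_all = percent(LBD_MUTATED_SAMPLES, TOTAL_SAMPLES)
lbd_pct_of_variants = percent(LBD_MUTATED_SAMPLES, ESR1_VARIANT_SAMPLES)
```

The recomputed values match the percentages reported for the three hotspots, the overall LBD mutation rate, and the LBD share of variant samples.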

Fig. 1

ESR1 protein and the number of mutations detected at each amino acid. Amino acids 40–180 correspond to the estrogen receptor domain (ERD), 181–252 to the DNA zinc finger (ZF), and 331–595 to the ligand-binding domain (LBD). Bolded text indicates LBD hotspot mutations (E380Q, Y537S, and D538G)

Fig. 2

Frequency of select ESR1 variants by breast cancer receptor subtype and across tissue biopsy sites. A Frequencies reflect the number of variant samples per subtype in locoregional (breast and locoregional lymph nodes) and distant metastatic samples by receptor subtype. B Frequencies reflect the number of variant samples per biopsy site. *P < 0.05 for metastatic sites with a significantly higher variant frequency than locoregional sites

ESR1 fusions

Our assays were capable of detecting gene fusions for 4649 tumor samples. We profiled 4305 samples by WTS and 344 samples by Archer panels. At least one pathogenic/likely pathogenic ESR1 fusion isoform was detected in 76 samples, which constitutes 1.6% of evaluable tumor samples. A total of 40 unique fusion partners were identified, with ESR1 exclusively observed as the upstream (5′) fusion partner. The majority of ESR1 fusion-positive samples lacked a concurrent ESR1 LBD mutation (n = 69, 91%). Of the ESR1 fusions with resolvable breakpoints (94%), 56.5% of downstream fusion partner sequences were in-frame with ESR1; five ESR1 fusion transcripts could not be classified because of low resolution across the breakpoint.
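
The fusion denominator and the total number of ESR1 variant samples can be reconciled with simple arithmetic on the counts reported in the text. The names below are illustrative only.

```python
# Counts reported in the text.
WTS_SAMPLES = 4305
ARCHER_SAMPLES = 344
FUSION_POSITIVE = 76
LBD_MUTATED = 844
BOTH_LBD_AND_FUSION = 7

# Samples in which a fusion could have been detected.
fusion_evaluable = WTS_SAMPLES + ARCHER_SAMPLES
fusion_rate_pct = round(100 * FUSION_POSITIVE / fusion_evaluable, 1)

# Samples with an LBD mutation or a fusion, counting overlaps once.
variant_samples = LBD_MUTATED + FUSION_POSITIVE - BOTH_LBD_AND_FUSION

# Fusion-positive samples without a concurrent LBD mutation.
fusion_only = FUSION_POSITIVE - BOTH_LBD_AND_FUSION
fusion_only_pct = round(100 * fusion_only / FUSION_POSITIVE)
```

Note that the fusion rate uses the 4649 fusion-evaluable samples as its denominator, whereas the LBD mutation rate uses all 9860 samples.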

The most common in-frame fusion partners were YAP1 (n = 4 samples), NCOA2 (n = 4), PLEKHG1 (n = 3), and VTA1 (n = 3) (Fig. 3). Out-of-frame fusion partners included CCDC170 (n = 17), ARMT1 (n = 3), and 12 single-occurrence partners. We identified both in-frame and out-of-frame ESR1 fusion transcripts with AKAP12, ARMT1 and PLEKHG1. In-frame fusion products were predominantly (93.2%) fused at ESR1 exon 6, but ESR1 exons 3, 5 and 7 were also each involved in single in-frame fusion events. Out-of-frame fusions involved ESR1 exons 1–8, with breakpoints occurring most frequently in exon 2 (38.2%). ER-negative tumors comprised 5.6% (n = 4) of all breast tumors with ESR1 fusions.

Fig. 3

Recurrent in-frame and out-of-frame ESR1 fusions in primary and metastatic breast cancer. A Schematic of ESR1 coding sequence (CDS) with exons annotated and scaled to protein sequence. B Schematics of fusion transcript CDS for six recurrent fusion partners

ESR1 variant co-alterations

To determine whether ESR1 variants co-occurred with other molecular alterations frequently observed in breast cancer, we compared ESR1 wild-type samples and ESR1 variant samples (i.e., harboring a pathogenic ESR1 LBD mutation or fusion) for select concurrent molecular alterations (Fig. 4). Compared with ESR1 wild-type tumors, ESR1 variant samples had a higher frequency of androgen receptor overexpression (78.0 vs 58.6%, P < 0.01), and PIK3CA mutations were numerically more common, although this difference did not reach statistical significance (36.2 vs 31.4%, P = 0.09). In addition, ESR1 variant tumors less commonly expressed the immune checkpoint proteins PD-1 (20.0 vs 53.4%, P < 0.05) and PD-L1 (immune cell stain, 10.0 vs 30.2%, P < 0.0001) than wild-type tumors. TP53 mutations were also less common among ESR1 variant tumors (19.8 vs 59.6%, P < 0.0001).

Fig. 4

ESR1 variant co-alterations. The frequency of protein expression [immunohistochemistry (IHC)], TMB-High (tumor mutational burden; ≥ 10 mutations/Mb), select mutations (MT) and copy number amplifications (CNA) were compared in ESR1 variant (pathogenic ESR1 mutation or fusion) and ESR1 wild-type samples (no pathogenic ESR1 variants detected). *P < 0.05

Subjects with multiple biopsies

Two hundred ninety-eight patients had multiple biopsy samples analyzed. We were particularly interested in the 92 patients for whom both a primary and a distant metastatic biopsy sample were analyzed. Nine of these patients had an ESR1 variant detected in the metastatic sample only, one had a variant in the primary sample only, and one had an ESR1-D538G mutation in both the primary and metastatic samples. We analyzed expression patterns among 51 patients with an ESR1 variant detected in at least one biopsy (Fig. 5). ESR1-E380Q variants were more frequently observed across all of an individual’s biopsy samples than in only a subset of their biopsies (66.7%, 4 of 6 patients), whereas D538G (42.1%, 8 of 19 patients), Y537S (37.5%, 6 of 16 patients), and other LBD variants (33.3%, 4 of 12 patients) were shared across all biopsies in fewer than half of patients and were therefore more often unique to a subset of biopsies. Of the 30 patients with an ESR1 variant unique to a subset of their own biopsies, 86.7% (n = 26) did not harbor the variant in the initial biopsy, including nine patients whose initial biopsy was from the breast or LR lymph nodes. An ESR1 fusion was identified in three samples, although it is unclear whether fusions were present in the respective paired samples because fusion detection was not performed at the time of tumor profiling.
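
The distinction between a variant shared across all of a patient's biopsies and one unique to a subset can be made explicit with a small sketch. The data structure (one set of detected variants per biopsy) and the function name are assumptions for illustration, and the example patient is hypothetical.

```python
from typing import List, Set


def classify_variant(biopsies: List[Set[str]], variant: str) -> str:
    """Classify a variant across one patient's biopsies.

    Returns "shared" if it is detected in every biopsy, "subset" if it is
    detected in some but not all biopsies, and "absent" otherwise.
    """
    present = [variant in detected for detected in biopsies]
    if not any(present):
        return "absent"
    return "shared" if all(present) else "subset"


# Hypothetical patient with two biopsies, listed in order of collection.
patient_biopsies = [{"D538G"}, {"D538G", "Y537S"}]
d538g_status = classify_variant(patient_biopsies, "D538G")
y537s_status = classify_variant(patient_biopsies, "Y537S")
```

In this hypothetical patient, D538G is present in both biopsies, while Y537S appears only in the later one.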

Fig. 5

Analysis of paired samples for patients with ESR1 variants. Expression patterns for patients (n = 51) with an ESR1 variant detected in at least one biopsy are shown. Column width reflects number of biopsy samples per patient. IHC immunohistochemistry, ER estrogen receptor, PR progesterone receptor, ISH/CISH in situ hybridization/chromogenic ISH

Discussion

In this study of 9860 breast cancer tumors that underwent comprehensive molecular profiling, we identified a broad range of ESR1 variants. ESR1 LBD mutations were detected in 8.6% of all tumors evaluated and a pathogenic ESR1 fusion was detected in 1.6%. ESR1 LBD mutations were detected in 14.5% of ER+/HER2− breast cancer samples. ESR1 LBD mutations were somewhat less common than previously reported [10, 17]. This may be explained by characteristics of this cohort (a more heterogeneous population) and by the selection of a pretreatment specimen for molecular profiling in several instances. Inherent ESR1 mutations are rare [6]; therefore, this practice would lower the frequency of identified variants. Higher rates of ESR1 mutations (25% to 40%) have been reported in clinical trial settings following progression on endocrine therapy, often utilizing specimens collected at progression of disease [10, 17]. However, in a similar study utilizing the Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT) platform to analyze 929 breast cancers, ESR1 mutations were identified in 10% of samples [9]. In another study that evaluated the molecular profiles of 11,616 breast tumors, ESR1 mutations were detected in 10.2% and were enriched in metastatic samples [13].

ESR1 variants were more prevalent in liver samples than in primary breast samples (27.0 vs 2.3%, P < 0.0001), and liver samples accounted for 55.9% (n = 447/799) of all ESR1 variants identified in metastatic tissue. These findings are similar to prior reports [13], in which the highest ESR1 mutation rates were found in liver metastases (44%), followed by pleura (25%), lung (24%), and bone (20%), with a 5% rate of ESR1 mutations in breast tissue samples. In our study, patients with multiple biopsies more commonly harbored an ESR1 variant in subsequent tissue than in the initial biopsy. Limited literature exists assessing the prevalence of ESR1 variants in paired tissue samples. In one previous study, matched samples were available for four patients, with higher allele frequencies of ESR1 mutations in biopsy samples at progression [18].

The most common ESR1 LBD mutations in our study were D538G, Y537S, and E380Q. These mutations are known to be clinically important. In the BOLERO-2 phase III trial that assessed the combination of everolimus and exemestane in patients with endocrine therapy-resistant metastatic ER-positive breast cancer, both D538G and Y537S mutations identified from baseline cell-free DNA (cfDNA) were associated with decreased overall survival [10]. The SoFEA (Study of Faslodex Versus Exemestane With or Without Arimidex) phase III study identified ESR1 mutations in 39.1% of available baseline plasma ctDNA samples. Within the exemestane-treated arm, ESR1 mutation status was associated with worse progression-free survival (PFS) (HR 2.12, P = 0.01) [17]. In the PALOMA-3 phase III trial that evaluated the addition of palbociclib to second-line endocrine therapy, ESR1 mutations were identified in 25.3% of available baseline plasma ctDNA samples. Treatment with palbociclib significantly improved PFS regardless of ESR1 mutation status. Among patients who received fulvestrant and placebo, PFS was worse for those with ESR1 mutations (3.6 months vs 5.4 months) [17]. In both the SoFEA and PALOMA-3 studies, the predominant ESR1 mutations were D538G, Y537S/N, and E380Q [17].

Targeting fusions for cancer treatment has made a significant impact in other cancer types, including EML4-ALK fusions in non-small cell lung cancer and NTRK fusions, for which targeted therapies are now available. These therapies have led to significant improvements in overall response rate and PFS in biomarker-selected populations [19, 20]. In-frame fusion transcripts are more often considered pathogenic and clinically relevant because they retain key functional domains (e.g., kinase domains, LBD) that can be activated by the fusion partner. Out-of-frame fusion products can have a more varied biology, as completely new sequences without preserved domains are translated. We identified several ESR1 fusions that, although rare, are likely to have clinical relevance, such as recurrent ESR1:CCDC170, ESR1:YAP1, ESR1:NCOA2, and ESR1:PLEKHG1 fusions, along with 36 other unique fusions. ESR1 fusion transcripts are of clinical importance as they may allow for constitutive activation of the estrogen receptor and can contribute to endocrine therapy resistance, predominantly by deleting the binding domain for traditional estrogen inhibitory therapies [21]. ESR1 fusions can therefore render breast cancer cells resistant to aromatase inhibitors, SERMs, and SERDs due to disruption of the LBD. Conversely, ESR1 LBD mutations may allow for some level of responsiveness to SERMs and SERDs [8, 9, 22]. In addition to causing endocrine therapy resistance, ESR1 fusion products may activate downstream signaling pathways [23] and support metastatic proliferation. In our analysis, the majority of the ESR1 fusion products retained the ERD and ZF domains in the setting of a dysfunctional LBD, suggesting that the ability to bind DNA and initiate downstream signaling was retained. ESR1:NCOA2 fusions are not well described in breast cancer but have been identified in uterine tumors [24]. NCOA2 is a transcriptional coactivator for nuclear hormone receptors, and this fusion product may utilize an active promoter region to dysregulate expression of the NCOA2 coactivator domain, thereby increasing signaling through proliferative pathways, including estrogen-mediated pathways. ESR1:CCDC170 fusions activate proliferation pathways involving HER2/HER3 [21, 25] and enhance cell migration and invasion [25]. ESR1:YAP1 upregulates an epithelial-to-mesenchymal transcriptional signature, thereby promoting metastasis [21], and shows estrogen-independent enrichment at regulatory regions of estrogen-responsive genes [21]. A fusion event could therefore provide multiple escape mechanisms for cancer growth.

The ESR1:CCDC170 fusion was the most common fusion we identified. It was out-of-frame in all instances; however, a previous study suggested that this fusion has biological relevance in ER-positive breast cancers. As described by Veeraraghavan et al. [25], CCDC170 fusion transcripts likely generate N-terminally truncated CCDC170 proteins expressed under the ESR1 promoter, which could cause constitutive activation of the ER LBD.

Targeting ESR1 fusions could be a potential treatment strategy. In one study that evaluated proteins interacting with ESR1 fusion transcripts, enhanced recruitment of 26S proteasomal subunits was identified in tumors characterized by an ESR1:YAP1 fusion [26]. Following treatment with bortezomib, a 26S proteasome inhibitor, and fulvestrant, tumor growth was suppressed [27]. Blocking the downstream estrogen receptor kinases CDK4/6 with CDK4/6 inhibitors has also been evaluated [21]. In a patient-derived xenograft model harboring the ESR1:YAP1 fusion, treatment with palbociclib inhibited tumor growth, decreased Ki-67 levels, and reduced pRb. The sensitivity of ESR1 fusion-expressing breast cancer cells to concomitant HER2-targeted therapies has also been assayed in a preclinical setting; breast cancer cells harboring an ESR1:CCDC170 fusion treated with tamoxifen and lapatinib showed decreased growth [28].

The main limitation of our study is the lack of longitudinal outcomes data. We did not have access to clinical outcomes data to correlate ESR1 variants with treatment response and survival, and this is beyond the scope of the current study. Most patients were presumed to have stage IV disease, even if the tumor submitted was obtained from a breast biopsy or surgical specimen. However, it is unknown whether patients presented with de novo metastatic disease or what their history of previous lines of therapy was. In addition, it is not possible to precisely define the clinical scenarios of patients at the time the specimen was collected (i.e., whether the biopsy was taken from a current site of residual disease, at progression during therapy, or prior to any systemic therapy). Finally, the sequencing platforms available for tumor profiling varied over time, with fusion data unavailable for many samples.

Herein, we have described one of the largest series of ESR1 fusions reported, with 40 unique fusions identified. ESR1 LBD mutations were common, identified in 8.6% of all tumors evaluated and 14.5% of ER+/HER2− tumors. An improved understanding of how ESR1 variants affect ER signaling may ultimately guide treatment choices following progression on endocrine therapy. Future studies investigating the prognostic implications of ESR1 variants and how ESR1 variants affect responses to therapies beyond endocrine therapy are needed.