Introduction

Mammography screening reduces breast cancer mortality by 15–20% [1], but some cancers are not detected by mammography, either because they are missed or because they grow aggressively within the screening interval. Approximately 15% of breast cancers are diagnosed after a negative mammogram and before the next recommended screening exam [2]. These interval cancers are particularly problematic when they carry a poor prognosis because of their size, subtype, or involvement of nodes or distant sites. Because of the limitations of mammography screening, many have called for a transition to personalized approaches to breast cancer screening, or precision screening, that tailor screening initiation, interval, and modality to individualized risk in order to maximize screening benefits and reduce harms [3, 4]. Specifically, breast MRI screening is emerging as a potential tool for screening high-risk women because of its higher sensitivity for invasive breast cancers compared with mammography [5,6,7,8]. Knowledge of short-term risk of developing breast cancer, particularly risk of poor prognosis breast cancers, would help direct more intensive screening to those at highest risk who would be most likely to benefit.

Both high and moderate penetrance genes (e.g., BRCA1/2, PALB2, ATM) as well as low penetrance, common single-nucleotide polymorphisms (SNPs) associated with breast cancer susceptibility have been identified [9,10,11]. Multiple studies have shown that combining panels of SNPs into polygenic risk scores (PRS) can summarize genetic risk in a way that stratifies women into high and low risk for breast cancer [12, 13]. A 313 SNP PRS incorporating breast cancer subtype-specific risk estimates was developed and validated in a large population of women of European ancestry and was shown to be well calibrated and accurate in stratifying women based on breast cancer risk [12]. Women with the highest 1% of polygenic risk had a fourfold increased risk of breast cancer compared to women with the population average risk. Furthermore, in addition to the overall breast cancer PRS, this study validated separate PRS for estrogen receptor-positive and estrogen receptor-negative breast cancers, allowing for estimation of subtype-specific risks. The 313 SNP PRS was also shown to be independent of most established breast cancer risk factors [14]. These data, combined with the declining cost of next-generation sequencing, are making the integration of genetics into clinical care increasingly feasible.

The purpose of this study was to evaluate the association of the 313 SNP breast cancer PRS with 2-year risk of breast cancer, including the risk of poor prognosis breast cancer. In addition, we examined whether subtype-specific PRS were associated with ER+ and ER− breast cancer to determine the utility of these PRS to guide screening decisions.

Methods

Study population

The study population included women aged 40–85 who underwent mammography screening at Massachusetts General Hospital between 2006 and 2015 and who had a negative mammogram based on initial assessment (BI-RADS assessment 1 or 2, Fig. 1). Mammograms were excluded for women who had a prior diagnosis of breast cancer, were not residents of Massachusetts, had breast implants, had a prior screening mammogram within 90 days, or had insufficient identifiers for linkage with cancer registry and Partners Biobank data. This resulted in 294,954 screening mammograms among 74,980 women. Among this population, we identified women who were diagnosed with breast cancer within 2 years of the negative mammogram through linkage both to local hospital cancer registries and Massachusetts State Cancer Registry data. For women diagnosed with breast cancer (N = 1394) within 2 years, we selected the earliest mammogram within 2 years of diagnosis, and for women not diagnosed with breast cancer within 2 years, we randomly selected one negative screening mammogram for inclusion in the analysis. Next, we linked this cohort to the Partners Biobank, a large research repository that includes DNA samples and genetic information, and found that 3351 non-cases and 133 cases had genotype data or DNA samples available from the Partners Biobank. To increase the sample size of cases, we actively recruited women diagnosed with breast cancer within 2 years of a negative mammogram to provide a DNA sample via Oragene Saliva DNA collection kits (DNA Genotek, Inc., ON, Canada). Appendix Table 5 details the recruitment of breast cancer cases. We attempted to contact all poor prognosis cases and selected a random sample of good prognosis cases to contact. Patients received a letter introducing them to the study, with a postcard to return indicating whether they were interested in participating or preferred to opt out. Patients who did not opt out within 4 weeks were then sent a second mailing with a consent form.
Patients who consented to the study were mailed an Oragene Saliva DNA collection kit with instructions to provide the saliva sample and a return mailer. We additionally contacted non-respondents via telephone and if patients were scheduled for a follow-up visit in medical oncology, they were given the option to provide the saliva DNA sample at the time of their next appointment. In total, we attempted to contact 684 women with breast cancer, of whom 205 consented and returned a DNA sample (30%). The study was approved by the Partners Healthcare Institutional Review Board (IRB) and all patients provided informed consent for genotyping.
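The rule for choosing one index mammogram per woman can be illustrated with a minimal Python sketch. It is not the study's code: the function name, the use of 730 days for "2 years", and the random number handling are assumptions made only for illustration.

```python
import random
from datetime import date, timedelta

# Assumption: "2 years" is approximated as 730 days; the study does not state the exact window.
TWO_YEARS = timedelta(days=730)


def select_index_mammogram(negative_exam_dates, diagnosis_date=None, rng=None):
    """Pick one negative screening mammogram per woman (hypothetical helper).

    Cases (diagnosis_date given): the earliest negative exam that falls
    within 2 years before the diagnosis.
    Non-cases (diagnosis_date is None): one negative exam chosen at random.
    Returns None if a case has no negative exam inside the window.
    """
    if not negative_exam_dates:
        raise ValueError("at least one negative exam date is required")
    if diagnosis_date is not None:
        eligible = [
            d for d in negative_exam_dates
            if timedelta(0) <= diagnosis_date - d <= TWO_YEARS
        ]
        return min(eligible) if eligible else None
    rng = rng or random.Random()
    # Sort first so that a seeded generator gives a reproducible choice.
    return rng.choice(sorted(negative_exam_dates))


exams = [date(2008, 1, 10), date(2009, 6, 1), date(2010, 5, 20)]
case_exam = select_index_mammogram(exams, diagnosis_date=date(2011, 1, 15))
noncase_exam = select_index_mammogram(exams, rng=random.Random(0))
```

In this example the 2008 exam falls outside the 2-year window, so the case is assigned the 2009 exam, which is the earliest of the two remaining exams.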

Fig. 1 Study population

Breast cancer prognosis

Breast cancer prognosis was categorized according to the Tomosynthesis Mammographic Imaging Screening Trial (TMIST) definition of poor prognosis cancers, which includes size, subtype, and lymph node/distant involvement and predicts breast cancer death within 5 years of mammography screening [15, 16]. Tumor subtypes were categorized based on immunohistochemical expression of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Poor prognosis breast cancers were defined as cancers that (1) were greater than 2 cm, (2) were greater than 1 cm and triple negative or HER2 positive, (3) had positive lymph nodes, or (4) were metastatic. Cancers that did not meet any of these criteria were considered early stage (good prognosis), including ductal carcinomas in situ (DCIS). Missing HER2 status was manually abstracted for invasive cases from electronic health records where available, primarily for cases prior to 2010. Only 1 patient had HER2 status that was truly missing; the remaining patients without HER2 status had DCIS, and the test was not performed. Patients with unknown ER/PR/HER2 status or borderline HER2 status were not categorized as triple negative or HER2 positive for the purposes of the poor prognosis definition. Similarly, patients with unknown lymph node involvement or unknown tumor size were not categorized as having positive lymph nodes or large tumor size for the purpose of the poor prognosis definition. However, most patients missing lymph node involvement had DCIS and all patients missing tumor size had DCIS (Table 2).
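The four criteria and the handling of unknown values described above can be written as a small rule-based classifier. The following Python sketch is illustrative only: the `Tumor` record and its field names are hypothetical, and unknown or borderline values are encoded as `None` so that they never satisfy a criterion, mirroring the conservative handling in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tumor:
    """Hypothetical tumor record; None means unknown (or borderline for HER2)."""
    size_cm: Optional[float] = None
    er_positive: Optional[bool] = None
    pr_positive: Optional[bool] = None
    her2_positive: Optional[bool] = None
    node_positive: Optional[bool] = None
    metastatic: Optional[bool] = None


def is_poor_prognosis(t: Tumor) -> bool:
    """Apply the four poor prognosis criteria; unknown values never count."""
    # Triple negative requires all three receptors to be known and negative.
    triple_negative = (
        t.er_positive is False
        and t.pr_positive is False
        and t.her2_positive is False
    )
    her2_pos = t.her2_positive is True
    larger_than_2cm = t.size_cm is not None and t.size_cm > 2.0
    larger_than_1cm_aggressive = (
        t.size_cm is not None
        and t.size_cm > 1.0
        and (triple_negative or her2_pos)
    )
    return (
        larger_than_2cm
        or larger_than_1cm_aggressive
        or t.node_positive is True
        or t.metastatic is True
    )
```

For example, a 1.5 cm tumor that is ER negative and PR negative with unknown HER2 status is not classified as poor prognosis by this sketch, because it cannot be confirmed as triple negative.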

Genotyping and statistical analyses

Samples were genotyped using the Illumina Multi-Ethnic GWAS/Exome SNP (MEGA) array and genotypes were imputed using TOPMed (Version r2 2020). Patients whose saliva DNA samples had low concentration, failed quality control procedures, or were BRCA1/2 carriers were excluded from analyses (N = 30 cases and 2 non-cases). We generated the 313 SNP breast cancer PRS, the estrogen receptor-positive (ER+) PRS, and the estrogen receptor-negative (ER−) PRS using established methods [12]. We used logistic regression to estimate the odds ratios of breast cancer overall, poor prognosis breast cancer, ER+ breast cancer, and ER− breast cancer for standardized PRS measures. We evaluated the following covariates in the models: age (continuous), race/ethnicity, breast density (4-category variable), year of screening, menopause status, digital mammography vs. digital breast tomosynthesis, and genetic principal components (PCs, calculated from an independent set of common variants across the genome). Menopause status and digital mammography vs. digital breast tomosynthesis were dropped because they were not statistically significant and did not meaningfully change the effect estimate for the PRS. The final models adjusted for age, breast density, race/ethnicity, year of screening, and ancestry PCs. We did not include other established breast cancer risk factors in the model because the 313 SNP PRS has been found to be independent of these risk factors in a very large study [17]. Body mass index (BMI) was not included in the main model because it was missing for 5% of the population; however, we performed a sensitivity analysis adjusting for both continuous BMI and BMI categories (< 25, 25–29, 30+ kg/m²). In addition, we fit logistic regression models stratified by family history of breast cancer [12] and, as a further sensitivity analysis, models excluding women older than 74 years.
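As a rough illustration of what a standardized PRS is, the sketch below computes a score as a weighted sum of effect-allele dosages and converts it to z-score units, so that a regression coefficient can be read as a change per standard deviation. The three SNP weights and the dosages are made up, and standardizing against the sample's own mean and standard deviation is an assumption; the study may have used a different reference distribution.

```python
from statistics import mean, pstdev


def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2) using per-SNP weights.

    In published breast cancer PRS the weights are per-allele log odds ratios;
    the values used below are hypothetical.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


def standardize(scores):
    """Convert raw scores to z-scores (assumption: sample mean and SD)."""
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mu) / sd for s in scores]


# Hypothetical weights for three SNPs and dosages for four women.
weights = [0.10, -0.05, 0.20]
cohort_dosages = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 0],
]
raw_scores = [polygenic_risk_score(d, weights) for d in cohort_dosages]
z_scores = standardize(raw_scores)
```

With the PRS on this scale, exponentiating the logistic regression coefficient for the PRS gives an odds ratio per standard deviation increase, which is how the estimates in this study are expressed.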

Results

After exclusions, our study sample consisted of 308 women who developed breast cancer (cases) and 3349 women who did not develop breast cancer (non-cases). Non-cases in the analytic sample were slightly older, had lower breast density, and were less likely to report Asian/PI race/ethnicity than non-cases in the full cohort (Appendix Table 5). Cases in the analytic sample were slightly younger, less likely to report Asian/PI race/ethnicity, and more likely to have poor prognosis than cases in the full cohort, at least in part because of the oversampling of cases with poor prognosis for DNA collection (Appendix Table 5). The characteristics of the analytic population are displayed in Table 1. The majority of patients were non-Hispanic White (87%) and the mean age was 57 years and was similar for cases and non-cases. Cancer cases were more likely than non-cases to have higher breast density and a family history of breast cancer. Characteristics of cancer cases overall and for early and advanced disease are displayed in Table 2. Of the 308 breast cancers, 137 (44%) had poor prognosis.
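The sample sizes reported here can be reconciled with the counts given in the Methods by simple arithmetic. The sketch below assumes that the 205 actively recruited cases were distinct from the 133 cases already in the Partners Biobank; that assumption is not stated in the text, but the totals agree with it.

```python
# Counts taken from the Methods and Results text.
biobank_cases = 133
biobank_noncases = 3351
recruited_cases = 205        # consented and returned a DNA sample
contacted_cases = 684
excluded_cases = 30          # low DNA concentration, QC failure, or BRCA1/2 carrier
excluded_noncases = 2
poor_prognosis_cases = 137

# Assumption: recruited cases do not overlap with Biobank cases.
analytic_cases = biobank_cases + recruited_cases - excluded_cases
analytic_noncases = biobank_noncases - excluded_noncases

response_rate_pct = round(100 * recruited_cases / contacted_cases)
poor_prognosis_pct = round(100 * poor_prognosis_cases / analytic_cases)

print(analytic_cases, analytic_noncases, response_rate_pct, poor_prognosis_pct)
```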

Table 1 Characteristics of the analytic population by cancer status
Table 2 Characteristics of cancers diagnosed within 2 years of a negative mammogram

Table 3 displays the association of overall, ER+, and ER− PRS with different breast cancer outcomes. Within 2 years of a negative mammogram, PRS was significantly associated with breast cancer diagnosis (OR 1.39 per standard deviation unit increase in PRS, 95% CI 1.23–1.57, p < 0.001). The overall breast cancer PRS was also significantly associated with diagnosis of poor prognosis disease (OR 1.24 per standard deviation unit increase in PRS, 95% CI 1.03–1.49, p = 0.018). We observed a stronger association between the PRS and good prognosis cases (OR 1.52, 95% CI 1.29–1.80, p = 3.60 × 10⁻⁷). In addition, the ER+ PRS was significantly associated with ER/PR+ breast cancer (OR 1.42, 95% CI 1.24–1.66, p < 0.001), and the ER− PRS was significantly associated with ER− breast cancer (OR 1.52, 95% CI 1.11–2.09, p = 0.008). Models excluding women older than 74 years and adjusting for BMI yielded similar results (Appendix Tables 7 and 8). Furthermore, models adjusting for race/ethnicity alone, ancestry PCs alone, or both race/ethnicity and ancestry PCs yielded similar results (Appendix Table 9).
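For readers who want to relate the reported odds ratios, confidence intervals, and p values to one another, the following Python sketch back-calculates the log odds ratio, its standard error, and a Wald test from a published OR and 95% CI. It assumes Wald-type intervals on the log odds scale, which is typical for logistic regression but is not stated in the text, and because the inputs are rounded the recovered values are only approximate.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover log-OR, standard error, z statistic, and two-sided p value.

    Assumes the interval is exp(beta +/- z_crit * se); rounding of the
    published numbers makes the result approximate.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


def ci_midpoint_or(ci_low, ci_high):
    """Geometric midpoint of the interval, i.e. the OR it is centered on."""
    return math.sqrt(ci_low * ci_high)


# Overall breast cancer: OR 1.39, 95% CI 1.23-1.57.
beta_all, se_all, z_all, p_all = wald_summary(1.39, 1.23, 1.57)

# Poor prognosis breast cancer: the interval 1.03-1.49 is centered near 1.24.
poor_midpoint = ci_midpoint_or(1.03, 1.49)
```

The geometric midpoint is a quick consistency check: a Wald interval is symmetric around the point estimate on the log scale, so the square root of the product of the two bounds should reproduce the reported OR up to rounding.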

Table 3 Logistic regression of PRS and cancer diagnosis within 2 years of a negative mammogram, overall, by prognosis, and by ER status

Because previous work reported an interaction between family history and PRS with respect to breast cancer [12], we performed analyses stratified by family history of breast cancer (Table 4). Among women with no family history, PRS was statistically significantly associated with breast cancer across all PRS and all breast cancer outcomes, and the magnitude of the ORs was greater than in the total study population. There were no statistically significant associations between PRS and breast cancer risk among patients with a family history; however, the sample sizes in the subset of patients with family history were small.

Table 4 Logistic regression stratified by family history of breast cancer

Discussion

Even after adjusting for breast density and other risk factors, the breast cancer PRS was significantly associated with diagnosis of both breast cancer overall and poor prognosis breast cancer within 2 years of a negative mammogram [18]. Furthermore, the subtype-specific PRS were significantly associated with short-term risk of ER+ and ER− disease. These results suggest that PRS may be useful in guiding decisions about screening interval and supplemental screening, given the association of PRS with risk of poor prognosis disease in the short term.

To our knowledge, this is the first study to evaluate the associations of the 313 SNP PRS with both short-term risk and risk of poor prognosis breast cancers. While in our study the PRS was significantly associated with 2-year risk of breast cancer (OR 1.39), 2-year risk of poor prognosis breast cancer (OR 1.24), and 2-year risk of ER+ breast cancer (OR 1.42), the associations were slightly lower in magnitude than the associations reported in the original study validating the 313 SNP breast cancer PRS (OR 1.61 for all breast cancer, OR 1.68 for ER+) [12]. For ER− disease, we observed a slightly larger association (OR 1.52) than the original study (OR 1.45) [12]. These differences may be partly due to the selection of women with a recent negative mammogram as the study population and the short follow-up time. A large cohort study in Sweden developed 2-year breast cancer risk models for supplemental screening, and, similar to our results, this study found that the 313 SNP PRS was significantly associated with 2-year risk of breast cancer [19]. However, the Swedish study did not look at the association of the PRS with tumor characteristics or prognosis. A case-only analysis examined the associations between a 77-SNP breast cancer PRS and tumor prognostic factors [18]. Similar to our results, this study found that the breast cancer PRS was associated with favorable tumor prognosis, including higher risk of ER+ disease, smaller tumor size, and lower grade, and the PRS was not significantly associated with metastasis. This is not entirely surprising given that most variants have been identified as associated with ER+ disease, which tends to be less aggressive. A case–control study conducted in Sweden found that the 77-SNP breast cancer PRS was associated with both screen-detected and interval breast cancers; however, patients with higher PRS were less likely to be diagnosed with interval compared with screen-detected cancers [20].
The ER− PRS was significantly associated with ER− disease, but not with other tumor characteristics. A cohort study of women undergoing mammography screening in the UK observed a stronger association of an 18 SNP breast cancer PRS with breast cancer diagnosed at stage 2 or greater and with interval cancer than with breast cancer overall and good prognosis cancer, although the confidence intervals for these estimates were overlapping [21].

Mavaddat and colleagues reported a significant interaction between family history and the PRS such that the association of PRS with breast cancer risk was smaller among women with family history than among women without family history [12]. Our results were similar, with significant associations of the various PRS with breast cancer among women with no family history, and no statistically significant associations among women with a family history. However, the number of cases among patients with family history was small, so results need to be interpreted cautiously. We expect that with a larger sample size we would have observed a significant association between the PRS and breast cancer among women with a family history, although with a smaller magnitude of association.

A recent modeling study from the Cancer Intervention and Surveillance Modeling Network (CISNET) found that mammography screening tailored to the 313 SNP PRS improved life years gained and breast cancer deaths averted compared with USPSTF screening guidelines [22]. Our results add to the literature by suggesting that PRS can identify short-term risk of poor prognosis disease, an important step toward tailoring screening recommendations. Further research is needed to determine whether using this information to increase the frequency of screening or add supplemental screening can reduce breast cancer mortality.

A few limitations should be considered when interpreting our findings. First, the study population was relatively small, which may limit statistical power, particularly for analyses stratified by family history. However, we were able to compare our analytic population to the full underlying cohort of women undergoing mammography screening and found only small differences between those with and without genetic data, suggesting that our results would generalize to the larger population of women undergoing mammography screening. In addition, the study population was predominantly White, with mainly European ancestry. We included women of other ancestries and adjusted for ancestry principal components. However, future studies that utilize trans-ethnic GWAS summary statistics may provide better risk assessment particularly for women with non-European ancestry. We did not include information on reproductive risk factors, given that our main research question was the association of the PRS with short-term risk and risk of poor prognosis cancer and our sample size was modest. However, a very large prior study showed that the 313 SNP breast cancer PRS was independent of reproductive risk factors and BMI [17] and therefore we think it is unlikely that adjustment for these factors would meaningfully change our results.

In summary, this is the first study to our knowledge to examine the association of the 313 SNP breast cancer PRS and subtype-specific PRS with short-term risk of breast cancer and short-term risk of poor prognosis breast cancers. Our results provide intriguing evidence that the PRS may aid in decision-making regarding personalized screening approaches.