Introduction

Primary prevention of breast cancer remains a major goal for reducing the burden associated with this disease. Two large breast cancer prevention trials of selective estrogen receptor modulators (SERMs), the National Surgical Adjuvant Breast and Bowel Project (NSABP) P-1 placebo-controlled trial of tamoxifen [1] and the double-blind NSABP P-2 trial comparing raloxifene to tamoxifen, showed that these agents reduce the risk of breast cancer by 50 % after five years of therapy among women with a 5-year predicted breast cancer risk of at least 1.66 % [2]. Follow-up of the P-2 trial at a median exposure of 81 months suggested that long-term raloxifene use was 76 % as effective as tamoxifen for preventing invasive disease, but had less toxicity [3]. Thus, both tamoxifen and raloxifene are viable prevention strategies for women at high risk of breast cancer [4].

Almost 80 confirmed common genetic susceptibility loci for breast cancer have been identified to date [5–20]. Taken together, these validated loci are estimated to explain up to 14 % of familial breast cancer risk [5]. Two recent studies showed that a polygenic risk score (PRS) composed of 76–77 of these genetic loci can identify individuals at increased breast cancer risk in the general population [21, 22]. Specifically, those at highest risk by the PRS had a 1.8-fold increased risk for breast cancer relative to the second quartile of PRS, and those in the lowest quartile had a reduced risk (0.6-fold) of breast cancer [21]. The PRS association with breast cancer was stronger among those with ER-positive compared to ER-negative disease and effectively stratified breast cancer risk in women both with and without a family history of breast cancer [22].

It is not clear, however, whether these common genetic variants will also be risk factors for breast cancer among high-risk women treated with SERMs for breast cancer prevention, given the large risk reduction associated with SERMs. We present the first report to evaluate a comprehensive set of 75 established breast cancer susceptibility loci, in the context of a PRS, as a risk factor for breast cancer among high-risk women from the NSABP P-1 and P-2 trials taking raloxifene or tamoxifen for breast cancer prevention. We also examined whether the influence of the PRS on breast cancer differs by type of SERM, extent of family history, ER-positive compared to ER-negative breast cancer, and other clinical characteristics.

Methods

Study populations

The study population consisted of a nested case–control sample within the NSABP P-1 and P-2 trials [23], including 596 women who developed breast cancer while on SERM therapy (cases) and 1,171 matched controls selected from the 32,859 participants enrolled in the P-1 and P-2 breast cancer prevention trials. Controls were matched to cases on trial and treatment arm (P-1 tamoxifen, P-2 tamoxifen, P-2 raloxifene), age at trial entry, categories of 5-year predicted breast cancer risk based on the Gail model [24], history of lobular carcinoma in situ, history of atypical hyperplasia, and time on study (controls on study at least as long as the matched breast cancer case) (Table 1) [23]. Each study obtained informed consent and had ethics and institutional approvals.

Table 1 Characteristics of cases and matched controls within National Surgical Adjuvant Breast and Bowel Project P-1 and P-2 Trials

Genotyping

The genotypes of 75 published breast cancer single nucleotide polymorphisms (SNPs) (Supplementary Table 1) were obtained from a genome-wide association study (GWAS) of cases and controls. Genotyping was performed by the RIKEN Center for Integrative Medical Science using the Illumina Human610-Quad BeadChip, and genotypes are currently available through dbGaP (study accession number phs000305.v1.p1) [23, 25]. Four cases were ineligible due to low DNA quantity (n = 2) or quality (n = 2), leaving a total of 592 cases for GWAS analyses. Imputation was performed using Beagle, with all samples from version 2 of the 1000 Genomes data (May 2011) [26] as a reference [27]. Of the 77 SNPs previously shown to be associated with breast cancer [5–20] (Supplementary Table 1), 75 were available and used to form the PRS. Of these, genotypes on 36 SNPs were imputed and had a quality score r² > 0.4, with the majority (n = 33 of 36) having r² > 0.8.

Statistical methods

The PRS was created using per-allele odds ratios from the SNP associations with overall breast cancer (Supplementary Table 1) [5–20]. The PRS represented the combined effect of the 75 SNPs, regardless of departures from a multiplicative model, because there has been no evidence seen for SNP by SNP interactions [28]. Specifically, the log OR for each SNP was multiplied by the number of risk alleles and summed to generate a unique PRS for each person in the dataset [29]. For missing genotypes (0.05 %), the SNP was locally imputed within a 20 Mb region around the SNP, using Beagle v3.3.1 and 1000 Genomes, version 2 [26, 27]. The PRS approximated a normal distribution and was included as a continuous measure (per one unit) in the conditional logistic regression risk model. For ease of presentation, the PRS was also divided into quintiles based on the distribution among controls. Associations of PRS with breast cancer were examined with conditional logistic regression, accounting for the matched design.
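As an illustration of the score construction described above, the following is a minimal Python sketch, not the authors' analysis code. The SNP names and per-allele odds ratios are hypothetical placeholders rather than values from Supplementary Table 1, and the function names are introduced here for illustration only. Quintile cut points are taken from the control distribution, as in the text.

```python
import bisect
import math
import statistics

# Hypothetical per-allele odds ratios for three SNPs (illustrative only;
# the study used 75 SNPs with published per-allele ORs).
PER_ALLELE_OR = {"snp_a": 1.10, "snp_b": 1.05, "snp_c": 1.26}


def polygenic_risk_score(risk_allele_counts, per_allele_or=PER_ALLELE_OR):
    """Sum over SNPs of log(OR) times the number of risk alleles.

    Counts are 0, 1 or 2 for genotyped SNPs; imputed dosages may be fractional.
    """
    return sum(
        math.log(per_allele_or[snp]) * count
        for snp, count in risk_allele_counts.items()
    )


def quintile_cutpoints(control_scores):
    """Four cut points (20th, 40th, 60th, 80th percentiles) among controls."""
    return statistics.quantiles(control_scores, n=5)


def assign_quintile(score, cutpoints):
    """Return the quintile (1 to 5) that a score falls into."""
    return bisect.bisect_right(cutpoints, score) + 1
```

For example, `polygenic_risk_score({"snp_a": 2, "snp_b": 0, "snp_c": 1})` adds twice the log OR of the first SNP to the log OR of the third. Deriving the cut points from controls only means that cases are classified against the distribution expected in women who remained free of breast cancer.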

Differential associations of PRS with breast cancer by prevention agent (raloxifene vs. tamoxifen), family history (1 or more 1st degree relatives vs. 0 relatives), age at trial entry (<55 vs. >55), predicted 5-year risk based on the Gail model (<3.01 vs. >3.01 %), hysterectomy (no/yes), atypical hyperplasia (no/yes), and lobular carcinoma in situ (LCIS) (no/yes) were tested by creating an interaction term between each covariate and the main effect of PRS. For ER status, age at onset (<55, 55–64, 65+), and type of breast cancer (ductal carcinoma in situ (DCIS) vs. invasive), we stratified cases (and their matched controls), fit conditional logistic regression models within each stratum, and compared the odds ratios across strata by taking the difference in log OR and dividing by the square root of the sum of the variances.
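The stratum comparison described above is a two-sample z-test on the log odds ratios. The Python sketch below illustrates it under two assumptions that are not stated in the text: the standard error of each log OR is back-calculated from its 95 % confidence interval, and the P value is two-sided from a standard normal distribution. The function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate the standard error of a log OR from its 95 % CI limits."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def compare_odds_ratios(or_a, se_a, or_b, se_b):
    """Difference in log OR divided by the square root of the summed variances.

    Returns the z statistic and a two-sided P value from the standard normal.
    """
    z = (math.log(or_a) - math.log(or_b)) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# ER-positive vs. ER-negative estimates as reported in the Results.
z, p = compare_odds_ratios(
    1.59, se_from_ci(1.25, 2.02),
    1.05, se_from_ci(0.68, 1.62),
)
print(f"z = {z:.2f}, P = {p:.2f}")
```

With the ER-specific estimates reported in the Results, this approximation gives z of about 1.64 and a two-sided P of about 0.10, in line with the reported test for heterogeneity.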

Results

Table 1 shows the characteristics of the cases and controls. There were 139 women with DCIS and 453 women with invasive breast cancer; 69 % of the invasive cases were ER-positive, 26 % were ER-negative, and 5 % had unknown ER status. A quarter of the sample was younger than 55 years at trial entry, and over two-thirds had a five-year predicted risk of greater than 3 % by the Gail model, indicative of a population at greater than average risk. Matching was successful on all variables (Table 1).

The PRS based on 75 variants ranged from 3.98 to 7.74, with a median of 5.61. A one-unit change in PRS was associated with a 42 % increase in breast cancer risk (OR = 1.42; 95 % CI 1.18–1.70). The PRS association with breast cancer risk was also evident when examining quintiles of the PRS (Table 2; Supplementary Fig. 1) (P trend = 0.0005). Relative to the middle quintile (5.52–5.78), women in the lowest quintile (3.98–5.17) were at a reduced risk of breast cancer (OR = 0.81; 95 % CI 0.59–1.12), while those in the highest PRS quintile (6.10–7.74) were at an increased risk (OR = 1.45; 95 % CI 1.06–1.98). This translates to an odds ratio of approximately 1.8 (1.45/0.81) comparing the highest to the lowest quintile.
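To make the per-unit odds ratio concrete, the sketch below shows how a conditional logistic model for matched sets with one case each can be fit for a single continuous covariate such as the PRS. It is a hand-written Newton-Raphson illustration in plain Python, not the software used in the study, and the matched sets are invented toy values. Each set contributes the probability that the observed case, rather than one of its controls, is the case, so the matching factors cancel out of the likelihood.

```python
import math


def score_and_information(beta, matched_sets):
    """Score and observed information of the conditional log-likelihood.

    matched_sets: list of (case_value, [control_values]) tuples.
    """
    score, info = 0.0, 0.0
    for case_x, control_xs in matched_sets:
        xs = [case_x] + list(control_xs)
        # Shift by the case value for numerical stability (likelihood unchanged).
        weights = [math.exp(beta * (x - case_x)) for x in xs]
        total = sum(weights)
        mean = sum(w * x for w, x in zip(weights, xs)) / total
        mean_sq = sum(w * x * x for w, x in zip(weights, xs)) / total
        score += case_x - mean
        info += mean_sq - mean ** 2
    return score, info


def fit_conditional_logit(matched_sets, max_iter=50, tol=1e-10):
    """Newton-Raphson estimate of the log OR per unit and its standard error."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, matched_sets)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, matched_sets)
    return beta, 1.0 / math.sqrt(info)


# Hypothetical 1:2 matched sets of PRS values (case first, then two controls).
toy_sets = [
    (6.1, [5.4, 5.7]), (5.9, [5.6, 6.0]), (5.3, [5.5, 5.2]),
    (6.4, [5.8, 5.9]), (5.6, [5.9, 5.3]), (5.8, [5.1, 5.6]),
]
beta_hat, se_hat = fit_conditional_logit(toy_sets)
odds_ratio_per_unit = math.exp(beta_hat)
```

Exponentiating the fitted coefficient gives the odds ratio per one-unit change in the covariate, which is the quantity reported above as OR = 1.42. The toy estimate carries no meaning beyond showing the mechanics.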

Table 2 Association of polygenic risk score (PRS) and breast cancer (n = 592 cases, 1,171 controls) within the National Surgical Adjuvant Breast and Bowel Project P-1 and P-2 trials

The association of PRS with breast cancer was similar across age at trial entry, treatment type, 5-year predicted risk, hysterectomy status, body mass index (BMI), presence of atypical hyperplasia, and LCIS (all P values for heterogeneity >0.15) (Fig. 1; Supplementary Table 2). However, there was evidence of a stronger association of PRS with breast cancer among women without a first-degree family history of breast cancer (OR = 1.62 per unit change in PRS, 95 % CI 1.18–2.21) compared to those with a positive family history (OR = 1.32, 95 % CI 1.06–1.66) (P intx = 0.04) (Fig. 1). Further, the PRS appeared to be a stronger risk factor for ER-positive than ER-negative breast cancer, although the test for heterogeneity did not reach statistical significance (P intx = 0.10) likely due to the limited number of ER-negative cases (n = 119). There was a 59 % increased risk with ER-positive breast cancer per unit change in PRS (OR = 1.59, 95 % CI 1.25–2.02, P = 0.0002), but only a 5 % increase in risk associated with ER-negative breast cancer (OR = 1.05, 95 % CI 0.68–1.62, P = 0.84) (Fig. 2; Supplementary Table 3). No differences were evident for age at onset or type of breast cancer (Fig. 2; Supplementary Table 3), although sample size was also limited for these comparisons.

Fig. 1 Polygenic risk score (PRS) and breast cancer association (odds ratios (OR) and 95 % confidence intervals) by clinical covariates

Fig. 2 Polygenic risk score (PRS) and breast cancer association (odds ratios (OR) and 95 % confidence intervals) by age at onset and tumor characteristics

Additional analyses examining the PRS and breast cancer association after adjustment for the two loci identified as breast cancer risk factors through a prior GWAS in this sample (rs10030044 at CTSO and rs8060157 at ZNF423) [23] showed no difference for the PRS and breast cancer association compared to the unadjusted results (data not shown).

Discussion

We present the first report to examine the influence of 75 common breast cancer susceptibility loci on breast cancer risk among women taking SERMs for primary prevention. Using genotyping data from women receiving tamoxifen and raloxifene in the NSABP P-1 and P-2 studies, we have shown that a PRS of the 75 loci is a risk factor for breast cancer in the presence of SERMs, with the risk of breast cancer ranging from OR = 0.81 to 1.45 for those in the lowest and highest PRS quintiles, respectively (compared to the middle quintile). This finding suggests that the intrinsic risk of breast cancer associated with the common variants is maintained in the presence of the strong risk-reducing effects of SERM treatment.

Although the PRS was a risk factor for women with and without a family history of breast cancer, we found that the association was stronger among women without a family history, with a 32 % increased risk per unit PRS in those with a family history and a 62 % increased risk in those without. This difference may reflect the fact that these SNPs explain a portion of familial breast cancer risk (estimated at 14 %) [5], thereby attenuating their influence on risk in this subgroup. Alternatively, a strong family history may reflect the presence of other more highly penetrant mutations that have a larger influence on breast cancer risk than these common genetic loci.

Our data suggested a stronger association of the PRS with ER-positive than with ER-negative breast cancer, although the difference did not reach statistical significance, likely due to the limited number of ER-negative breast cancers. The strong association with ER-positive breast cancer was expected, given that the majority of the 75 variants were originally identified in studies composed primarily of ER-positive breast cancer [5–20]. In fact, only a small number of loci (LGR6, MDM4, 2p24.1, TERT, FTO, 19p13.31) have shown genome-wide significant (P < 5 × 10⁻⁸) associations with ER-negative breast cancer and risk estimates approximating 1.0 for ER-positive breast cancer (Supplementary Table 1) [7–11, 30, 31]. Since SERMs are beneficial for prevention of ER-positive breast cancer, risk models incorporating a PRS that is strongly predictive of ER-positive cancer may allow better selection of women at high risk of ER-positive breast cancer who may benefit from SERM intervention. Furthermore, as the associations with breast cancer were similar in those taking either tamoxifen or raloxifene, it appears that the type of SERM may have little influence on the variant-associated risk. Thus, the PRS should also be evaluated as a risk factor for breast cancer in prevention trials or prospective patient populations treated with aromatase inhibitors, which are effective for prevention of ER-positive breast cancer [32, 33].

The absence of women on placebo or usual care in this study did not allow for examination of the interaction of the PRS and SERMs on risk. However, comparison with the few studies to date on the PRS and breast cancer association [21, 22] suggests that there is some attenuation of the association in the SERM-treated population. Although the PRS distributions are not directly comparable across these studies, a one-unit increase in PRS was associated with a 1.4-fold increased risk of breast cancer in this NSABP population, compared with a 1.8-fold increase (95 % CI 1.6–2.1) in a general population [21]. Because of the known risk reduction associated with these SERMs, the SNPs (PRS) may not be as strong a risk factor in moderate to high-risk women on tamoxifen and raloxifene. In addition, the attenuation may be due in part to the large proportion of women with a family history of breast cancer in the NSABP trials (70 %), for whom the PRS and breast cancer association was attenuated relative to those without a family history.

While the matched nature of the cases and controls precluded calculation of absolute risk and realistic area under the curve (AUC) estimates, the close matching in the two well-characterized clinical trials on a large number of clinical variables did allow evaluation of the PRS without potential confounding influences.

In conclusion, this is the first study to examine a comprehensive PRS among a moderate to high-risk population receiving SERMs and to demonstrate a contribution of common genetic variation to the development of future breast cancers.