Introduction

Globally, approximately 382 million people currently live with diabetes, and this number may rise to 592 million by 2035 [1]. Type 2 diabetes (T2D) accounts for over 90 % of all diabetes cases [2]. Breast cancer is the most common cancer among women in many countries, including the USA [3]. Many epidemiological studies have linked T2D to increased breast cancer risk [4–8]. Recent meta-analyses have shown a more than 20 % increase in risk of breast cancer among women with T2D compared to women without the disease [9–12]. T2D and breast cancer share some risk factors, including obesity in postmenopausal women and physical inactivity [13]. Elevated levels of circulating C-peptide and insulin-like growth factor-1, biomarkers related to insulin resistance, have also been associated with increased breast cancer risk [14, 15]. It remains unclear, however, whether the link between these two diseases is due to shared lifestyle risk factors or intrinsic etiology such as genetic susceptibility. Understanding how genetic variants related to T2D risk influence breast cancer risk may provide insights into the nature of the T2D–breast cancer relationship.

Recent genome-wide association studies (GWASs) have identified approximately 50 genetic variants associated with T2D risk. Some of these reported T2D-related genetic variants have been studied in relation to the risk of several cancers, including cancers of the pancreas [16], colon/rectum [17, 18], and prostate [19]. The influence of these variants on breast cancer risk, however, has not been adequately studied. To date, only two studies have evaluated the association of a subset of these T2D-related genetic variants with breast cancer risk [20, 21]. Both studies reported a null association, which may be due to small study size and low study power.

In this analysis, using data from two consortia including 62,328 breast cancer cases and 83,817 controls, all women of European ancestry, we evaluated T2D-related genetic variants reported to date in relation to breast cancer risk. By constructing a T2D-related genetic risk score (T2D GRS) and evaluating its association with breast cancer risk, we tested the hypothesis that, overall, the alleles that increase T2D risk may also increase breast cancer risk. We also tested the hypothesis that certain individual T2D-related genetic variants may be associated with breast cancer risk.

Methods

Study population

Included in this analysis were 62,328 breast cancer cases and 83,817 controls, all women of European ancestry, recruited either in the 39 studies (Online Resource Table 1) that participated in the Breast Cancer Association Consortium (BCAC), a part of the Collaborative Oncological Gene-Environment Study (COGS), or in the 11 studies (Online Resource Table 2) that are included in the Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) project of Genetic Associations and Mechanisms in Oncology (GAME-ON). From the BCAC, we included individual-level data for 46,325 breast cancer cases and 42,482 controls. The DRIVE project included 16,003 breast cancer cases and 41,335 controls; however, only summary statistics for the association between T2D-related risk variants and breast cancer risk were available, and thus, these summary statistics were used in our study. The study samples and participant data, including demographics and the traditional risk factors for breast cancer, were collected in each contributing study.

Single-nucleotide polymorphism (SNP) selection

We searched for all reported genetic risk variants associated with T2D in European ancestry populations at a genome-wide significance level (p < 5 × 10⁻⁸, trait ‘Type 2 diabetes’ or ‘Type 2 diabetes and other traits’) using the US National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies (GWAS Catalog, accessed 19 November 2012, at http://www.genome.gov/gwastudies). Fifty SNPs representing 33 independent loci (linkage disequilibrium (LD) R² < 0.1) were identified (Fig. 1).

Fig. 1 Overview of the T2D genetic risk score construction

Genetic risk score construction

The genetic risk scores were calculated in 46,325 cases and 42,482 controls included in the BCAC. At each of the 33 independent loci, we selected the SNP with the lowest p value for association with T2D reported in GWASs to represent the locus in constructing the T2D GRS. Using these 33 SNPs, a weighted T2D GRS was constructed as a measure of the overall association of genetic risk variants with T2D. In the BCAC, 11 SNPs were directly genotyped and 22 were imputed with an imputation quality threshold of R² > 0.5. The T2D GRS was created as \(\sum_{i=1}^{33} w_{i}\,\text{SNP}_{i}\), where \(w_{i}\) is the logarithm of the odds ratio (OR) of the ith SNP with T2D reported from previous GWAS and \(\text{SNP}_{i}\) is the number of risk alleles carried by a given subject on the ith SNP. We hypothesized that the risk allele for T2D would be associated with increased risk of breast cancer. The 33 individual T2D risk variants identified from the NHGRI GWAS catalog are presented in Online Resource Table 3.
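The weighted sum described above can be illustrated with a minimal Python sketch. The SNP names, odds ratios, and allele counts below are hypothetical placeholders chosen only for illustration; they are not the 33 variants or the published effect sizes used in this study, and the function name `weighted_grs` is an assumption, not part of the study's analysis code.

```python
import math

# Hypothetical per-SNP T2D odds ratios (placeholders, not published estimates).
t2d_odds_ratios = {"snp_a": 1.40, "snp_b": 1.15, "snp_c": 1.07}


def weighted_grs(risk_allele_counts, odds_ratios):
    """Return sum_i w_i * SNP_i, where w_i = ln(OR_i) and SNP_i is the
    number of T2D risk alleles (0 to 2; imputed dosages may be fractional)."""
    score = 0.0
    for snp, odds_ratio in odds_ratios.items():
        dosage = risk_allele_counts[snp]
        if not 0.0 <= dosage <= 2.0:
            raise ValueError(f"risk allele count for {snp} must lie in [0, 2]")
        score += math.log(odds_ratio) * dosage
    return score


# One hypothetical subject: two genotyped SNPs and one imputed dosage.
subject = {"snp_a": 2, "snp_b": 0, "snp_c": 1.3}
grs = weighted_grs(subject, t2d_odds_ratios)
print(f"T2D GRS = {grs:.4f}")
```

Because each weight is a log odds ratio, variants with larger reported effects on T2D contribute more to the score per risk allele carried.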

Genotyping

In the BCAC, genotype data were obtained either from direct genotyping with a custom Illumina iSelect genotyping array (iCOGS) that contains 211,155 SNPs [22] or from imputation with the 1000 Genomes Project Phase I integrated variant set (version 3, March 2012 release) as the reference [23], using the program IMPUTE2 [24]. Details of the studies that participated in the BCAC and the methodology used by the BCAC and iCOGS have been published elsewhere [22] and can also be found on the iCOGS Web site (http://ccge.medschl.cam.ac.uk/research/consortia/icogs/).

In the DRIVE project, genotype data were obtained either from direct genotyping using Illumina or Affymetrix arrays (Online Resource Table 2) or from imputation with the HapMap version 2 CEU panel (Utah residents of Northern and Western European ancestry) as a reference, using the program MACH v1.0 or IMPUTE [24]. Details of the studies that participated in DRIVE were described in previously published papers [22, 25–28] or on the GAME-ON Web site (http://gameon.dfci.harvard.edu).

Statistical analysis

We evaluated the association between the T2D GRS and breast cancer risk using individual-level data from 46,325 breast cancer cases and 42,482 controls of European ancestry who participated in BCAC studies. Demographic characteristics and known breast cancer risk factors were summarized by case–control status using mean and standard deviation (SD) for continuous variables or frequency with percentage for categorical variables. Differences between cases and controls were compared using the Wilcoxon rank-sum test (continuous variables) or the Chi-square test (categorical variables). To assess the association between the T2D GRS and breast cancer risk factors, we used control data and calculated the mean and SD of the T2D GRS by comparison groups for each categorical variable; the difference was tested by the Wilcoxon rank-sum test. For continuous variables, Pearson correlation coefficients were calculated. To account for potential population stratification within our study population, genetic ancestry was estimated by principal component (PC) analysis using EIGENSTRAT software [29] on 37,000 uncorrelated SNPs (including those selected as ancestry informative markers) on the chip. The mean value of the genomic inflation factor (λ) was 1.01 for the participating studies when PCs were included in the regression models, indicating little evidence of population stratification [22]. The top eight PCs were included in all regression models. For the LMBC study, models were additionally adjusted for a study-specific principal component. To assess the association between the T2D GRS and breast cancer risk, we first fitted unconditional logistic regression models adjusting for age and PCs within each of the 39 contributing studies individually and recorded the β coefficients with standard errors for T2D GRS quintiles (relative to the first quintile). We then conducted a meta-analysis on the results from these 39 studies using both fixed-effect and mixed-effect models. The odds ratios (ORs) with 95 % confidence intervals (CIs) from the fixed-effect model are reported in Table 1, as are further analyses by estrogen receptor (ER) status, menopausal status, age group (<50 vs. ≥50 years), and body mass index (BMI, <25 vs. ≥25 kg/m²).
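The pooling of study-specific β coefficients can be sketched as a standard inverse-variance fixed-effect meta-analysis. This is a minimal illustration in Python rather than the R code used in the study; the function names and the three per-study estimates are hypothetical, and the sketch assumes that each study contributes one log odds ratio with its standard error.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance fixed-effect pooling of per-study log odds ratios."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


def odds_ratio_with_ci(beta, se, z=1.959964):
    """Convert a log odds ratio and its SE to an OR with a 95 % CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical log ORs (e.g. GRS quintile 5 vs. quintile 1) from three studies.
study_betas = [0.02, -0.01, 0.03]
study_ses = [0.05, 0.04, 0.06]

pooled_beta, pooled_se = fixed_effect_meta(study_betas, study_ses)
pooled_or, ci_low, ci_high = odds_ratio_with_ci(pooled_beta, pooled_se)
print(f"pooled OR = {pooled_or:.3f} (95 % CI {ci_low:.3f} to {ci_high:.3f})")
```

Studies with smaller standard errors receive larger weights, so large studies dominate the pooled estimate.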

Table 1 The associations between T2D genetic risk score and breast cancer risk in Breast Cancer Association Consortium

We also used the SNP-set Kernel Association Test (SKAT) to evaluate whether any SNP in the T2D-associated SNP set may be related to breast cancer risk without making the assumption that the alleles that increase T2D risk may also increase breast cancer risk [30]. To evaluate the association of each individual SNP (per copy of risk allele) with breast cancer risk, we used individual-level data from the BCAC (46,325 cases and 42,482 controls) and summary results data from DRIVE (16,003 cases and 41,335 controls). We first estimated allelic OR for each SNP for each BCAC study with adjustment similar to that in the analyses for the association of T2D GRS with breast cancer risk described above and then combined the results across all BCAC studies with results from DRIVE using the inverse-variance meta-analysis with a fixed-effect model. Both consortium-specific results and combined results are reported in Table 2. For individual SNP analyses, statistical significance was considered after adjusting for multiple comparisons using the Bonferroni method (0.05/33). For all other analyses, statistical significance was considered at a two-sided 5 % level unless stated otherwise. All analyses were conducted using R version 3.0.3 [31].

Table 2 Selected T2D risk variants associated with breast cancer risk in BCAC at p < 0.05 and their associations in GAME-ON DRIVE project

Results

Among the 88,807 BCAC participants studied, on average, cases were slightly older than controls (57.8 vs. 54.9 years, p < 0.001) and entered menopause at a younger age (48.5 vs. 48.7 years, p < 0.01), as shown in Online Resource Table 4. More cases than controls were postmenopausal (69.3 vs. 68.1 %, p < 0.01) or had a first-degree family history of breast cancer (27.7 vs. 11.2 %, p < 0.01). Among postmenopausal women, cases and controls had comparable BMI (p = 0.62). Among controls, the T2D GRS was positively correlated with BMI (postmenopausal women, Pearson r = 0.018, p = 0.03) and inversely correlated with age at menarche (Pearson r = −0.021, p < 0.01). For other categorical variables examined, the mean T2D GRS values were virtually identical across different statuses (Online Resource Table 4, right columns).

Overall, the T2D GRS was not found to be associated with breast cancer risk (p for trend = 0.69, Table 1). No significant results were observed in analyses stratified by ER status (p for trend = 0.74 and 0.47 for ER+ and ER− breast cancer, respectively), menopausal status (p for trend = 0.74 and 0.93 for premenopausal and postmenopausal women, respectively), age group (p for trend = 0.74 and 0.62 for age <50 and age ≥50 years, respectively), or BMI group (p for trend = 0.64 and 0.64 for BMI < 25 and BMI ≥ 25, respectively). Meta-analysis using mixed-effect models gave similar results (data not shown). In a sensitivity analysis, which included only the 11 directly genotyped SNPs and 14 imputed SNPs with imputation R² > 0.9, similar results were observed (Online Resource Table 5).

Using SKAT, and without assuming that the alleles that increase T2D risk also increase breast cancer risk, we found evidence of a potential association of some of the T2D-related SNPs with breast cancer risk (p = 3.95E−10). Of the 33 independent SNPs investigated, seven were nominally associated with breast cancer risk using BCAC data alone (Table 2). Of these, the risk allele for T2D in four SNPs was associated with a reduced risk of breast cancer. After adjusting for multiple comparisons, the association for two SNPs, rs7903146 (TCF7L2, OR 1.04, 95 % CI = 1.02–1.07, p = 1.20E−04) and rs9939609 (FTO, OR 0.93, 95 % CI = 0.91–0.95, p = 3.63E−12), remained statistically significant, and both associations were replicated in DRIVE. SNP rs8042680 (PRC1) was related to breast cancer risk in the BCAC at p = 0.02 and in DRIVE at p = 6.18E−3; meta-analyses of these data yielded a significant association after adjusting for multiple comparisons (OR 0.97, 95 % CI = 0.99–0.99, p = 8.05E−4).
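The multiple-comparison criterion applied to these single-SNP results is the Bonferroni threshold of 0.05/33 stated in the Methods. As a small illustrative sketch (Python, not the study's R code), the p values reported in the paragraph above can be compared against that threshold; the dictionary labels are ours.

```python
ALPHA = 0.05
N_TESTS = 33  # number of independent T2D-related SNPs tested

# Bonferroni-corrected significance threshold, about 0.0015.
threshold = ALPHA / N_TESTS

# p values as reported in the text above.
reported_p = {
    "rs7903146 (TCF7L2), BCAC": 1.20e-4,
    "rs9939609 (FTO), BCAC": 3.63e-12,
    "rs8042680 (PRC1), BCAC": 0.02,
    "rs8042680 (PRC1), BCAC + DRIVE": 8.05e-4,
}

significant = {label: p < threshold for label, p in reported_p.items()}

for label, p in reported_p.items():
    print(f"{label}: p = {p:.2e}, passes Bonferroni: {significant[label]}")
```

The comparison shows why rs8042680 is only nominally associated in the BCAC alone but passes the corrected threshold once the BCAC and DRIVE results are combined.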

Discussion

In this large study, we investigated the association of 33 independent T2D-related genetic variants with breast cancer risk individually and in combination (through the use of our GRS). Generally, we found no association between T2D GRS and risk of breast cancer overall or by ER status. Of the 33 T2D-associated SNPs investigated in this study, three showed a significant association with breast cancer risk after adjusting for multiple comparisons: rs9939609 (FTO), rs7903146 (TCF7L2), and rs8042680 (PRC1). Although this study does not provide any evidence for an overall association of T2D susceptibility and breast cancer risk, it does show that some T2D-associated SNPs may be related to breast cancer risk.

It has been hypothesized that the association between T2D and breast cancer may be mediated through insulin resistance and hyperinsulinemia [32]. T2D and breast cancer share some lifestyle risk factors, including obesity in postmenopausal women and physical inactivity. Indeed, it has been shown previously that the observed association between these two diseases may be, in part, due to residual confounding by BMI [33]. With a very large sample size, our study suggests that overall genetic susceptibility to T2D was not related to breast cancer risk, indicating that the previously observed association between T2D and breast cancer risk may be largely due to shared lifestyle risk factors. Our finding of a null association between T2D GRS and breast cancer risk is supported by two previous studies that investigated this association. In one of these studies, Chen et al. [20] investigated 18 T2D-related SNPs among 503 European ancestry cases and 633 controls from the multiethnic cohort and PAGE studies. In the other study, Hou et al. [21] pooled data for 25 genotyped and 15 imputed T2D-related SNPs from seven studies and investigated this association among 1,142 European ancestry cases and 1,137 European ancestry controls. Neither study reported a significant association between T2D GRS and overall breast cancer risk. However, both studies evaluated fewer T2D risk variants and had substantially smaller sample sizes than the current study, and thus, their statistical power was low. For example, for a given SNP with a minor allele frequency of 0.3, the current study had 99.6 % power to detect an OR of 1.05 at a type I error rate of 0.05, while the previous studies had <15 % power to detect an OR of 1.05.
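The contrast in statistical power can be illustrated with a rough normal-approximation sketch. The text does not describe how the power figures were computed, so the method below is an assumption: a two-sided Wald test of the per-allele log odds ratio, with allele counts in Hardy–Weinberg equilibrium, a small effect, and the BCAC case and control counts standing in for "the current study". It is not expected to reproduce the reported 99.6 % exactly, only the order of magnitude of the difference. The function name `approx_power` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(n_cases, n_controls, maf, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided test of the per-allele log OR.

    Assumes Hardy-Weinberg equilibrium and a small effect, so that the
    variance of the allele count is about 2 * maf * (1 - maf) in both groups.
    """
    norm = NormalDist()
    allele_var = 2.0 * maf * (1.0 - maf)
    se = math.sqrt(1.0 / (n_cases * allele_var) + 1.0 / (n_controls * allele_var))
    z_effect = abs(math.log(odds_ratio)) / se
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


# Sample sizes taken from the text; MAF 0.3 and OR 1.05 as in the example above.
power_bcac = approx_power(46325, 42482, maf=0.3, odds_ratio=1.05)
power_hou = approx_power(1142, 1137, maf=0.3, odds_ratio=1.05)
power_chen = approx_power(503, 633, maf=0.3, odds_ratio=1.05)

for name, value in [("BCAC", power_bcac), ("Hou et al.", power_hou), ("Chen et al.", power_chen)]:
    print(f"{name}: approximate power = {value:.3f}")
```

Under these assumptions, the standard error of the log odds ratio shrinks with the square root of the sample size, which is why an OR of 1.05 is readily detectable with tens of thousands of cases but not with about a thousand.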

We identified three T2D risk variants that were associated with breast cancer risk. SNPs in strong correlation with each of these three variants have recently been identified in GWAS to be associated with breast cancer risk. SNP rs9939609 (FTO) located in region 16q12.2 and rs7903146 (TCF7L2) located in region 10q25.2 are in perfect LD (R² = 1) with rs17817449 and rs7904519, respectively, which were identified in relation to breast cancer risk in a GWAS conducted using BCAC data [22]. SNP rs8042680 (PRC1) is in strong LD with rs2290203 (R² = 0.59, 9,270 bp apart) that was recently identified as a risk variant for breast cancer in a GWAS conducted in East Asian women [34]. Interestingly, the T2D-risk allele of rs9939609 and rs8042680 is associated with a decreased risk of breast cancer. Though studies have suggested that TCF7L2 may associate with breast cancer through the Wnt/β-catenin pathway [35, 36], the exact mechanisms underlying these associations are unclear. Further studying these genes may uncover additional insights into the biology and genetics that link the risk of breast cancer and T2D.

The sample size for our study was very large. When comparing subjects in T2D GRS Q5 to those in Q1, our study had 80 % power to detect an OR for breast cancer risk as low as 1.06 (or 0.94) at 5 % type I error rate. Our results indicate that any association between the T2D GRS and breast cancer risk is likely to be very small. The GRS used in our study was constructed using SNPs with established association with T2D, as demonstrated convincingly in previous GWAS, and thus, this GRS should have a clear association with T2D. Indeed, using the resources from the Nashville Breast Health Study [37], we showed that this GRS was related to T2D in a dose–response manner (p for trend < 0.01, Online Resource Table 6). However, there are some potential limitations of our study. The T2D treatment information was not available for the study, preventing us from conducting an in-depth evaluation of the potential influence of T2D treatment on the association of T2D risk variants with breast cancer risk. To reduce potential influence of T2D treatment, we conducted an analysis among younger patients (<50 years old) who are less likely to have a T2D diagnosis than the older age group. This analysis showed similar results in younger and older groups (Table 1), indicating that the influence of T2D treatment on the association of T2D risk variants with breast cancer risk should be small. Approximately two-thirds of the SNPs used to construct the T2D GRS were not directly genotyped. We imputed these SNPs using 1000 Genomes Project data as the reference. The imputation quality was high. In a sensitivity analysis, we constructed an alternate T2D GRS using only the 11 directly genotyped SNPs and the 14 imputed SNPs which had almost perfect quality (R² > 0.9). This T2D GRS is highly correlated with the T2D GRS used in our primary analysis (Pearson’s r = 0.93), and using the alternate T2D GRS did not change the results appreciably. Since we started this project, 14 new genetic loci for T2D have been identified. Unfortunately, we do not have data for these 14 new loci in our study. However, the association with T2D risk is much weaker for these newly identified variants than for the 33 variants identified previously and included in our study. Therefore, we believe that including these variants would not change the conclusion of this study. Finally, all participants in this study are of European ancestry, which may limit the generalizability of our study findings to other populations.

In conclusion, our study found no apparent association between a polygenic score constructed using the known T2D risk variants identified to date in GWAS and breast cancer risk among women of European ancestry. It is possible that the previously reported association between these two diseases could be due to shared lifestyle risk factors for T2D and breast cancer, providing support for lifestyle modification as an effective prevention strategy to reduce the risk of both T2D and breast cancer. Our finding of significant associations of three T2D risk variants with breast cancer suggests a potential link of certain shared genetic and biological pathways for these common diseases.