Background

Breast cancer is one of the most frequently diagnosed cancers in women [1], where etiology is postulated to be polygenic [2]. Polymorphisms in genes contributing to tumorigenesis are less penetrant, but may account for the majority of breast cancers due to their higher frequency in the general population [2]. In fact, single nucleotide polymorphisms (SNPs) have been implicated in risk and age at diagnosis of breast cancers [3, 4]. One such polymorphism in MDM2 was found to lead to increased risk for earlier age at diagnosis of estrogen receptor (ER) positive breast cancers in women carrying the variant allele of the SNP [5].

Although breast cancers are also associated with mutations in the highly penetrant breast cancer susceptibility genes, BRCA1 and BRCA2, fewer than 10% of cases can be attributed to this etiology [2]. What is known based on BRCA1 and BRCA2 is that these familial syndromes are associated with particular breast cancer subtypes [6]. Breast cancer is a heterogeneous disease that is reflective of its molecular features and histology [7]. Whereas BRCA1 carriers frequently develop hormone receptor negative disease, the phenotype in BRCA2 carriers reflects the distribution observed for spontaneous breast cancer [6]. Therefore, other genetic contributors, including SNPs, may also be reflected in tumor heterogeneity.

Tuberous sclerosis is an autosomal dominant disorder that results from mutations in the tumor suppressor genes TSC1 or TSC2 [8–10]. Tuberous sclerosis is characterized by benign hamartomas that affect multiple systems, including the brain, skin, heart, and kidneys [8–10]. Lymphangiomyomatosis, a manifestation of tuberous sclerosis seen in 30–40% of affected women, shares several characteristics with breast cancer, including gender dimorphism, estrogen modulation, and increased cell proliferation and migration [11–13]. TSC1 stabilizes TSC2, preventing its degradation, and the two proteins function together as the TSC1/TSC2 heterodimer [8, 10]. This complex inhibits the G protein Rheb, which acts upstream of the mammalian target of rapamycin (mTOR) kinase [10, 14]. The serine–threonine kinase mTOR plays a central role in the control of cell growth and proliferation through phosphorylation of its effector molecules, 4E-BP1 and S6K1 [15]. Activation of the pathway occurs in response to growth factors, amino acids, and nutrients, leading to mRNA translation and ribosome biogenesis [15]. When the TSC1/TSC2 complex is not functional, mTOR activity is unchecked, causing uncontrolled cell growth and tumor formation. Rapamycin inhibits mTOR and has been used successfully to treat individuals with tuberous sclerosis [16]. Furthermore, analogs of rapamycin are currently being tested in breast and other cancer clinical trials [14].

Despite the implications highlighted by tuberous sclerosis and the key position of TSC1 and TSC2 in the mTOR pathway, few studies have examined the role of TSC1 and TSC2 in breast cancer [17]. We investigated a panel of 18 TSC1 and 14 TSC2 SNPs by first mapping their haplotypes and then using bioinformatics approaches to select potentially functional SNPs. Five germline TSC1 SNPs and one germline TSC2 SNP were evaluated in women with breast cancer and correlated with clinicopathologic characteristics. The data from these studies indicate that estrogen signaling may modulate the effect of a TSC1 SNP on age at diagnosis of breast cancer.

Methods

Subjects

The cohort consisted of 1,137 consecutively enrolled patients invited to participate in this prospective study, from 2004 to the present, through the Stacy Goldstein Breast Cancer Center at The Cancer Institute of New Jersey (CINJ). Over 95% of eligible individuals gave consent for participation. A history of biopsy-proven breast cancer was verified by pathology records and confirmed on review by our institutional breast pathologist. Fewer than 5% of cases were not available for review; for those, pathological features were based on pathology reports from other institutions. Lobular carcinoma in situ (LCIS) was excluded. Negative estrogen receptor (ER) staining was defined as <10%. BRCA1/2 testing was performed where clinically indicated, and patients with known BRCA1/2 mutations were excluded from the age at diagnosis analysis because of potential confounding bias. Investigations were performed with prior approval by the University of Medicine and Dentistry of New Jersey Institutional Review Board.

Determining the haplotype structure

A list of 18-tagged SNPs in TSC1 and 14-tagged SNPs in TSC2 (Table 1) was generated based on HapMap data, representing the minimum number of SNPs necessary to complete the haplotype for TSC1 and TSC2. The SNP tagging formalism employed an information-theoretic definition of haplotype diversity, and the optimal tag SNPs were chosen using a greedy procedure [18] that minimized the haplotypic uncertainty (unpublished). This SNP tagging algorithm was designed independently of Haploview Tagger. It gives results similar to those of Tagger but requires slightly fewer SNPs to tag the haplotypes. Genomic DNA isolated from lymphoblastoid cell lines (LCLs) was genotyped to determine the frequency of haplotypes (demonstrated for one locus in each gene; Fig. 1 and Online Resource 1). LCLs were obtained from the Coriell Institute for Medical Research (Camden, NJ, USA) and represent DNA from healthy individuals. For both TSC1 and TSC2, only haplotypes occurring at greater than 2% are depicted.
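The tagging algorithm itself is unpublished, so the following is only a minimal sketch of what a greedy, entropy-based tag SNP selection could look like. It assumes that haplotype diversity is measured as Shannon entropy and that the residual uncertainty is the entropy of the full haplotype minus the entropy of the haplotype restricted to the chosen tags. The function names, the `max_residual` parameter, and the toy haplotypes are all hypothetical and are not taken from the study.

```python
from collections import Counter
from math import log2


def entropy(haplotypes, snps):
    """Shannon entropy (bits) of the haplotypes restricted to the SNP positions in `snps`."""
    counts = Counter(tuple(h[i] for i in snps) for h in haplotypes)
    n = len(haplotypes)
    return -sum(c / n * log2(c / n) for c in counts.values())


def greedy_tag_snps(haplotypes, max_residual=0.0):
    """Greedily add the SNP that most reduces the remaining haplotype uncertainty."""
    n_snps = len(haplotypes[0])
    total = entropy(haplotypes, list(range(n_snps)))
    tags = []
    # Residual uncertainty = H(full haplotype) - H(haplotype restricted to tags).
    while total - entropy(haplotypes, tags) > max_residual + 1e-12:
        remaining = [s for s in range(n_snps) if s not in tags]
        best = max(remaining, key=lambda s: entropy(haplotypes, tags + [s]))
        tags.append(best)
    return tags


# Hypothetical phased haplotypes over four SNPs (not data from this study).
# SNPs 0 and 1 are perfectly correlated, as are SNPs 2 and 3.
toy_haplotypes = ["AACC"] * 4 + ["AATT"] * 3 + ["GGCC"] * 2 + ["GGTT"] * 1
print(greedy_tag_snps(toy_haplotypes))  # two tag SNPs, one from each correlated pair
```

A greedy search of this kind is not guaranteed to find the smallest possible tag set, but it is simple and usually comes close when SNPs fall into strongly correlated blocks.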

Table 1 TSC1 and TSC2 SNPs used to generate haplotype structure
Fig. 1 Haplotype structure and gene map of TSC1 for Caucasians. The haplotype structure of TSC1 determined from analysis of LCLs is shown, along with the gene structure, haplotype block for rs7874234, and relevant haplotype frequencies. Below the haplotype trees is the gene structure of TSC1; boxes indicate exons, with intervening spaces representing intronic sequence. Below the gene structure is the haplotype block and corresponding haplotype frequencies. Dashed lines and arrows indicate the SNP position within these blocks and location on the corresponding gene map above. The haplotype structure and corresponding haplotype frequencies represent the SNPs immediately adjacent, either right or left, to the reference locus, rs7874234

Candidate TSC1 and TSC2 SNPs for Clinical Study

Candidate SNP selection for further study in the breast cancer cohort was achieved by two different approaches. The first approach searched for SNPs manifesting signatures of natural selection in the pattern of genotype correlations with nearby SNPs. To identify genetic variants that deviate from the standard assumptions of selective neutrality, a previously described and publicly available map of recent positive selection of the human genome (Haplotter, http://hg-wen.uchicago.edu/selection/haplotter.htm) was utilized. In Haplotter, recent positive selection is determined using a haplotype-based approach that looks for enrichment of the classic signal for strong directional selection using the phase II data of the HapMap. Haplotter utilizes a test statistic called the integrated haplotype score (iHS), which is a measure that includes the degree of haplotype homozygosity around a given SNP [19]. Highly positive and negative iHS scores denote SNPs that harbor higher haplotype homozygosity, compared with other SNPs with similar allele frequencies in the genome. Using this methodology, SNPs in TSC1 (rs7874234, rs1076160, rs1073123, and rs3761840) and a SNP in TSC2 (rs13335638) were identified.

The second approach was based on associations between the genotypes and the response of tumor-derived cell lines to standard chemotherapeutic agents (Online Resource 2) [20, 21]. The mutational status of p53, the genotypes of 109,687 SNPs (Affymetrix 125K chip), and the GI50 data for the NCI60 panel of tumor-derived cell lines were obtained from the NCI/NIH Developmental Therapeutics Program web site, http://www.dtp.nci.nih.gov. A univariate test was undertaken for 132 standard agents to evaluate allelic differences in the GI50s. Specifically, the average log GI50, $X = -\log_{10}(\mathrm{GI50})$, was calculated for each of the three genotypes of a given locus (AA, Aa, and aa) in cells either wild-type or mutant for p53. Subsequently, the probability (P value) was computed that, by chance alone, the difference for each of the following groupings was equal to or larger than the actual measurement: (a) $X_{a} - X_{AA}$, (b) $X_{aa} - X_{A}$, (c) $X_{AA} - X_{a}$, (d) $X_{A} - X_{aa}$, (e) [$X_{aa} - X_{aA}$ and $X_{aA} - X_{AA}$], and (f) [$X_{AA} - X_{aA}$ and $X_{aA} - X_{aa}$]. These probabilities were estimated using a permutation test ($10^{6}$ permutations) that preserved the allele or genotype group sizes but permuted the samples among the groups. Results with P < 0.05 were considered significant and those with P < 0.1 marginally significant. A multiple hypothesis test was performed for allelic differences in the GI50s across the entire panel of drugs. Fisher’s exact test was used to compute the statistical significance of observing h univariate hits for a SNP on a total of D = 132 drugs, given that overall H significant hits are observed after testing S reference SNPs on the D drugs. All 109,687 Affymetrix genotyped SNPs were chosen as the reference set. Using this methodology, TSC1 rs2809243 and TSC2 rs13335638 were identified (Online Resource 2).
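The enrichment test across the drug panel can be illustrated with a short sketch. It assumes a one-sided Fisher's exact test on a 2×2 table in which the candidate SNP contributes h hits out of D tests and the reference set contributes H hits out of S × D tests, with the two rows treated as separate. The text does not say whether the reference set is handled this way, so the table layout, the function name, and the example numbers are assumptions.

```python
from math import comb


def fisher_enrichment_p(h, D, H, S):
    """One-sided Fisher's exact P value for seeing at least h hits in D tests
    for one SNP, given H hits in S * D tests for the reference SNPs."""
    ref_tests = S * D
    N = D + ref_tests   # all SNP-drug tests in the 2x2 table
    K = h + H           # all significant hits in the table
    upper = min(D, K)
    # Hypergeometric upper tail: P(hits for the candidate SNP >= h).
    tail = sum(comb(K, k) * comb(N - K, D - k) for k in range(h, upper + 1))
    return tail / comb(N, D)


# Hypothetical numbers for illustration only (not results from this study):
# 12 hits on 132 drugs for one SNP, against 5,000 hits for 1,000 reference SNPs.
print(fisher_enrichment_p(h=12, D=132, H=5000, S=1000))
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow in the tail, at the cost of working with very large integers when S is large.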

Therefore, a total of five SNPs in TSC1 and one SNP in TSC2 were further analyzed for associations with breast cancer phenotypes, i.e., age at diagnosis, breast cancer subtype (ductal versus lobular), ER status (ER-positive versus ER-negative), menopausal status, and recurrence.

Genotyping

Genomic DNA was extracted from 1 ml of peripheral blood, obtained through venipuncture, using a spin column-based method according to the manufacturer’s protocol (QIAGEN). Genotyping for TSC1 and TSC2 SNPs was performed using Taqman assays on the ABI 7900HT Fast Real-Time PCR System (Applied Biosystems). Briefly, reactions were performed using 5–10 ng genomic DNA in a 10 μl volume. For TSC1 rs7874234, rs1076160, rs2809243, rs3761840, and rs1073123 and for TSC2 rs13335638, PCR cycling conditions were 50°C for 2 min, 95°C for 10 min, followed by 45 cycles of 92°C for 15 s and 60°C for 1 min. Conditions were modified to optimize reactions for the following loci: for TSC1 rs11243940, conditions were identical except that only 40 cycles (instead of 45) were performed; for TSC1 rs739442 and rs10491534, conditions were identical except that the final annealing temperature was 58°C.

Statistical Analysis

A permutation test was performed to determine the statistical significance of differences in mean age at diagnosis between different genotype groups (e.g., wild-type homozygote or heterozygote vs. variant homozygote). This permutation test was chosen because it is non-parametric: it assumes only that all genotype groups, or categories, are equivalent and makes no assumptions about the distribution of age at diagnosis. Fisher’s exact test was used to determine the statistical significance of the association between categorical values for each genotype group. The odds ratio and 95% confidence interval were then computed using a Bayesian estimate for the odds ratio posterior distribution.
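As an illustration of the two less standard steps above, the sketch below shows a permutation test for a difference in mean age at diagnosis and one possible Bayesian estimate of an odds ratio. Several details are assumptions, because the text does not specify them: the test is two-sided, the P value uses an add-one correction, and the odds ratio posterior is built from independent uniform Beta(1, 1) priors on the two proportions. All function names and input values are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided permutation P value for the difference in group means.
    Group sizes are preserved; samples are shuffled among the groups."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction (an assumption)


def odds_ratio_posterior(a, b, c, d, n_draws=20_000, seed=0):
    """Posterior median and 95% interval of the odds ratio for a 2x2 table
    (group 1: a with feature, b without; group 2: c with, d without),
    assuming independent Beta(1, 1) priors on the two proportions."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        p1 = rng.betavariate(a + 1, b + 1)
        p2 = rng.betavariate(c + 1, d + 1)
        draws.append((p1 / (1 - p1)) / (p2 / (1 - p2)))
    draws.sort()
    return draws[n_draws // 2], draws[int(0.025 * n_draws)], draws[int(0.975 * n_draws)]


# Hypothetical ages at diagnosis for two genotype groups (not study data).
cc_ages = [44, 48, 51, 52, 55, 57, 60]
tt_ages = [53, 58, 61, 62, 66, 70]
print(permutation_p_value(cc_ages, tt_ages))
print(odds_ratio_posterior(30, 10, 20, 20))
```

The permutation test makes no distributional assumption about age at diagnosis, which matches the rationale given above; the Monte Carlo posterior is only one of several ways to obtain an interval for an odds ratio.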

Results

Analysis in LCLs and haplotype generation

Using genotype data generated by the HapMap project, the minimal set of TSC1 and TSC2 SNPs sufficient to reconstruct their haplotypes was determined. These 18-tagged SNPs in TSC1 and 14-tagged SNPs in TSC2 were analyzed in LCLs obtained from healthy individuals and used to generate haplotype frequencies, which are demonstrated for two loci (Fig. 1 and Online Resource 1). For TSC1 rs7874234, since genotype frequencies were similar between Caucasian and African American populations (Table 3), only the haplotype tree for the Caucasian population is shown (Fig. 1). However, since genotype frequencies differed between the Caucasian and African American populations, haplotype trees for both populations were generated for TSC2 rs13335638 (Online Resource 1).

The major TSC1 haplotypes generated, with rs7874234 as the reference locus, accounted for 38.5, 18.9, 10.5, 9, and 2.6% of the genotypes in the Caucasian population (Fig. 1) and were similar in African Americans (Table 3). The C allele was more prevalent and represented 57.4% of the haplotypes. The major TSC2 haplotypes generated, with rs13335638 as the reference locus, accounted for 63.0, 7.6, 6.1, 3.2, and 3.0% of the genotypes in Caucasians, as depicted in Online Resource 1. In the Caucasian population, the major allele was the T allele for this TSC2 SNP. Haplotype trees for both Caucasian and African American populations are shown for TSC2 (Online Resource 1). For African Americans, there was more heterogeneity, and the major haplotypes occurred at frequencies of 24.0, 22.2, 16.2, 11.0, 9.3, 5.1, and 2.0% (with six haplotypes occurring at 2.0%). In the African American population, the C and T alleles are represented almost equally, with one C allele haplotype accounting for 22.2% of the genotypes and one T allele haplotype accounting for 24.0% of the genotypes.

Demographics and SNP Frequencies in the Breast Cancer Cohort

The demographics for the breast cancer cohort are depicted in Table 2. The data show that the majority of women were Caucasian and the majority of cancers were ductal in origin. The average age at diagnosis was 51.5 years, with patients ranging from 19 to 89 years of age. Nearly 75% of all breast cancers were ER positive.

Table 2 Demographics of the breast cancer cohort

Population-specific genotype frequencies were observed for TSC1 rs7874234 and TSC2 rs13335638 (Table 3). TSC1 genotype frequencies did not deviate from Hardy–Weinberg equilibrium (HWE) for any of the races in this cohort. TSC1 rs7874234 genotype frequencies were similar between the populations depicted, except for the Asian population. TSC2 rs13335638 genotypes were in HWE for all populations except the Hispanic subset. Both Asian and Hispanic populations represent a small number of individuals and heterogeneity in area of participant origin. In Caucasians, the TT genotype was observed in 65.5% of the population. However, in the African American population, both CC and TT genotypes were equally prevalent (each accounting for 29% of the population).
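The text does not state which test was used to assess Hardy–Weinberg equilibrium. One common choice is a chi-square goodness-of-fit test with one degree of freedom, sketched below under that assumption. The function name and the genotype counts are hypothetical, and the sketch does not handle a monomorphic locus, where an expected count would be zero.

```python
from math import erfc, sqrt


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square test (1 df) of observed genotype counts against
    Hardy-Weinberg expectations. Returns (chi2, p_value)."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts (not taken from Table 3).
print(hwe_chi_square(n_AA=310, n_Aa=140, n_aa=25))
```

For small subgroups, such as those noted above, an exact test is generally preferred to the chi-square approximation.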

Table 3 Genotype Frequencies for relevant TSC1 and TSC2 SNPs for all populations in the CINJ breast cancer cohort

Association between TSC1 and TSC2 SNPs and breast cancer phenotypes

Six SNPs across TSC1 and TSC2 were evaluated for associations with breast cancer phenotypes (Table 4). For all comparisons, ancestral homozygotes versus variant homozygotes were used to determine odds ratios and P values. Associations for TSC1 and TSC2 SNPs with breast cancer subtype (ductal versus lobular), ER status (ER-positive versus ER-negative) and menopausal status at diagnosis were evaluated but no significant associations were found. A trend was observed for an association between ER status and TSC1 rs1073123 genotype, though it did not reach significance. Likewise, for rs7874234, the mean age at diagnosis between ancestral and variant homozygotes differed by 3.4 years, but this did not reach significance.

Table 4 Association between TSC SNPs and breast cancer phenotypes

All SNP loci were evaluated for association with age at diagnosis. One TSC1 SNP, rs7874234, showed a significant association with age at diagnosis, as depicted in Fig. 2. To reduce the heterogeneity in the study population and potential confounders, samples were stratified into groups with the same cancer subtype and ethnic background. In a case-only analysis of rs7874234, for Caucasian women who had ER+ ductal carcinomas, CC, CT, and TT carriers had an average age at diagnosis of 52, 55, and 61 years, respectively. Homozygous variant TT carriers had a 9-year later age at diagnosis of ER+ ductal carcinomas as compared with CC carriers (P = 0.0049). When comparing all three genotypes, the curve for heterozygotes fell in between the TT and CC curves and was associated with a 3-year later age at diagnosis than CC carriers and a 6-year earlier age at diagnosis than TT carriers (P = 0.00036). When stratifying the analysis by post-menopausal status, post-menopausal women with ER+ ductal carcinomas showed a similar pattern of genotypes for age at onset as was observed in Fig. 2 (data not shown). Specifically, an additive effect was observed in the analysis of post-menopausal women with ER+ ductal carcinomas. CC, CT, and TT carriers were diagnosed at 61, 64, and 68 years of age, respectively (P = 0.00046). None of the other five SNPs showed significant associations with age at diagnosis by any stratification.

Fig. 2 TT genotype of TSC1 rs7874234 associates with a later age at diagnosis of ER+ ductal carcinomas in Caucasian women. Cumulative incidence as a function of age at diagnosis was evaluated for TSC1 rs7874234 to determine genotype-specific effects. The population shown comprises Caucasian women with ductal ER+ carcinomas. CC carriers are depicted by filled diamonds, CT carriers by filled squares, and TT carriers by filled triangles
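How the cumulative incidence curves were computed is not described. In a case-only setting, one simple reading is the empirical fraction of cases within a genotype group diagnosed at or before a given age. The sketch below follows that assumption; the function name and the ages are hypothetical.

```python
from bisect import bisect_right


def cumulative_incidence(ages, grid):
    """Fraction of cases diagnosed at or before each age in `grid`."""
    ordered = sorted(ages)
    return [bisect_right(ordered, age) / len(ordered) for age in grid]


# Hypothetical ages at diagnosis by genotype (not study data).
ages_by_genotype = {
    "CC": [41, 47, 50, 52, 58, 63],
    "CT": [45, 51, 55, 57, 62, 68],
    "TT": [52, 58, 61, 64, 70, 75],
}
grid = list(range(40, 81, 10))
for genotype, ages in ages_by_genotype.items():
    print(genotype, cumulative_incidence(ages, grid))
```

A curve that is shifted to the right for one genotype corresponds to a later age at diagnosis in that group.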

Discussion

Because of the key role of TSC1 and TSC2 in the mTOR pathway and the role of that pathway in tumor biology [22], a systematic evaluation of clinical associations with SNPs in TSC1 and TSC2 was undertaken. Analysis of TSC1 SNP rs7874234 in this breast cancer cohort showed that in Caucasian women, variant homozygote TT carriers developed ductal ER+ breast carcinomas on average 9 years later than CC carriers. Furthermore, the intermediate age at diagnosis of heterozygotes in comparison to either homozygote indicates an additive effect. The largest differential appears in the postmenopausal age group (assuming the average age of menopause in the US is 51 years of age). No other significant associations were observed for other TSC SNP loci and other clinico-pathologic variables, including ER status, breast cancer subtype and menopausal status.

The effect observed with TSC1 rs7874234 suggests that the CC genotype has a deleterious effect and that the TT genotype confers protection against earlier development of ER+ ductal carcinomas. Alternatively, a later age at diagnosis for TT carriers could also mean that tumors in TT carriers are slower-growing and take longer to reach a threshold for detection than those in CC carriers. In silico analysis showed an estrogen receptor element (ERE) within the flanking sequence of rs7874234. Although there is high variability in ERE sequences, comparison with known EREs in other human genes revealed sequence homology between “GTTAG” in TSC1 and the ERE identified for human calbindin-D9k [23]. However, this homology only exists for the T allele. This suggests that different alleles in rs7874234 may affect ER binding.

Taken together, these findings support the possibility that the T allele may mediate estrogen-specific effects that favor later onset of breast cancer. It is hypothesized that the T allele allows ER to bind to this ERE in TSC1, activating TSC1 transcription and increasing inhibition of mTOR, thereby delaying breast cancer in TT carriers. The C allele, however, does not allow for ER binding, and therefore there is no increased inhibition of mTOR, favoring earlier tumorigenesis in CC carriers.

The post-menopausal effect observed is consistent with our hypothesized mode of action for the T allele: if the T allele increases binding affinity of activated ER and hence leads to increased TSC1 transcription, this would make TSC1 in carriers more sensitive to the low levels of circulating estrogens in post-menopausal women. In premenopausal women, because the circulating levels of estrogens are much higher, ER binding to the TSC1 ERE is not limiting. Furthermore, effects of estrogens on other cell pathways may be more critical in premenopausal women [24]. Other SNP association studies have demonstrated similar post-menopausal effects [25–27]. A recent study of SNPs in folate and alcohol metabolic pathway genes and breast cancer risk showed an association between a SNP in 5-methyltetrahydrofolate-homocysteine methyltransferase reductase (MTRR) and increased risk for breast cancer in postmenopausal but not premenopausal women [25]. Another study investigated the role of SNPs in ER-alpha and other estrogen-metabolizing genes and breast cancer risk in Chinese women and found that postmenopausal but not premenopausal breast cancer risk was associated with a heterozygous CYP17 genotype [26].

One advantage of using this study cohort is the availability of extensive clinico-pathologic information that was collected for all study participants. Analysis was enhanced by the ability to reduce heterogeneity within the study cohort. SNP selection, which was supported by bioinformatic analysis, reduced the number of SNPs to be analyzed, making this a cost-effective approach. This analysis was stratified by race, while other subgroup analyses were limited by small numbers.

Further study in a larger cohort would allow the detection of associations in other subgroups. For example, analysis of this SNP in the African American population showed no significant association with age at diagnosis of ductal ER+ breast carcinomas; however, this may be due to the small number of African American patients with a TT genotype. Furthermore, defining the molecular mechanism of SNP functionality would further support the association observed for rs7874234. TSC1 has not been previously shown to be regulated by estrogen, and while informatics gave evidence for a plausible explanation for rs7874234 functionality, the possibility exists that another SNP, not rs7874234, could be the functional SNP. Therefore, the effect observed could be due to other SNP(s) in the haplotype represented by rs7874234. As a case-only study, a limitation of this study is that it only evaluated one aspect of risk, i.e., age at diagnosis, whereas overall risk for development of breast cancer would provide further clinical utility in potentially identifying individuals at risk for the disease and knowing the optimal time for screening and prevention.

In this cohort, no association was found between any TSC SNPs and recurrence (data not shown). Although analysis of drug response in the NCI-60 cell lines demonstrated genotype-specific effects for another TSC1 SNP, rs2809243, these results were dependent on p53 status (Online Resource 2). In a p53 mutant background, the genotypes for rs2809243 significantly differed in their growth inhibition by several different chemotherapeutic agents (P = 4.7 × 10⁻⁷). However, the heat maps showed a divergent effect between a p53 mutant and a p53 wild-type background. For example, the TT genotype associated with a better response to alkylating agents in a p53 mutant background, while the CC genotype associated with better response to these same agents in a p53 wild-type background, indicating an opposing effect between p53 mutant and p53 wild-type backgrounds for the same class of drugs (Online Resource 2). Overall, the T allele associated with better response in the p53 wild-type setting, while the C allele correlated with better response in a mutant p53 background. Since upwards of 30% of tumors harbor p53 mutations in breast cancer [28], the absence of a recurrence phenotype in the breast cancer cohort, where p53 status is unknown, may not reflect a true lack of association.

In summary, we found that the TSC1 rs7874234 variant associated with delayed age at diagnosis of ER-positive ductal carcinomas in Caucasian women. The observed findings are intriguing and bear confirmation in other breast cancer populations.