Introduction

Breast cancer is the most common malignancy affecting women worldwide, and it is the leading cancer in females in Cyprus, with approximately 400 new cases diagnosed annually [1]. In vitro studies have shown variability in inter-individual DNA repair capacity and have demonstrated that reduced ability to repair DNA is associated with an increased risk for breast cancer [2–4]. It has also been suggested that deficient DNA repair capacity predisposes to both familial and sporadic forms of breast cancer [5–7].

Ten different genes that are involved in pathways critical to genomic integrity have been implicated in inherited predisposition to breast cancer, including BRCA1, BRCA2, p53, PTEN, CHEK2, ATM, NBS1, RAD50, BRIP1 and PALB2. The association of germline mutations in DNA repair genes with an increased susceptibility to breast cancer highlights the importance of these pathways in the development of breast cancer [8].

The DNA repair pathway is clearly involved in familial breast cancer. Thus, it was hypothesized that common single nucleotide polymorphisms (SNPs) of genes involved in the DNA repair pathway may influence breast cancer risk. Many studies have investigated the role of SNPs in DNA repair genes in relation to breast cancer and have reported associations with breast cancer risk [9–12].

Analysis of members of the DNA repair pathway appears to be a good rationale for identifying novel susceptibility loci. In particular, genes which have a direct interaction with the BRCA1 and BRCA2 genes are very good candidates. Recently, two more susceptibility genes, namely BRIP1/FANCJ and PALB2/FANCN, which interact with BRCA1 and BRCA2 genes, respectively, have been identified [13, 14].

BRCA1 and BRCA2 participate in the biological response to DNA damage that includes the activation of cell cycle checkpoints and the recruitment of the DNA damage repair machinery. Both BRCA1 and BRCA2 are implicated in DNA repair by homologous recombination, and their proteins have distinct roles in double-strand break repair [15].

Despite the progress that has been made in improving our understanding of the functions of the BRCA1 protein, a complete picture has not yet been attained. It has been hypothesized that BRCA1 acts as a coordinator of DNA damage recognition, response and repair, including double-strand break repair. BRCA1 interacts with many DNA repair proteins and protein complexes including the RAD50-MRE11A-NBS1 (MRN) complex. The proteins associated with BRCA1 are involved in the response to and the repair of DNA damage in several ways, acting as DNA damage sensors, signal transducers and repair effectors. Hence, these proteins are instrumental in the repair of DNA breakages and in the maintenance of genomic integrity [16–18]. The exact roles of the BRCA2 protein also remain elusive. It has been demonstrated that BRCA2 plays an important role in homologous recombination, both in meiosis and in the repair of double-strand breaks. Fewer proteins are known to interact with BRCA2 compared to BRCA1 [19]. These include RAD51, which mediates DNA repair via homologous recombination (HR) [15], and PALB2, which is required for BRCA2 nuclear localization and stability as well as for some of its functions in HR and double-strand break repair [20]. Overall, BRCA1 and BRCA2 act in response to DNA damage and participate in multi-protein complexes that are involved in tumor suppression processes [17].

In this study, we hypothesized that germline variations in genes encoding proteins that interact with BRCA1/2 are potential candidates for modifying breast cancer risk in the Cypriot population. Disturbances in the interactions with BRCA1 and BRCA2 may prevent their tumor suppression function(s) and consequently modify inter-individual DNA repair capacity. As part of an ongoing study, we assessed genetic variation in 60 SNPs in 29 genes that interact with BRCA1 or BRCA2, and their association with breast cancer, in a case–control study of Cypriot women. Furthermore, we investigated the role of two additional SNPs in the PBOV1 (UROC28) and DBC2 genes that are both upregulated in breast cancer [21, 22].

Materials and methods

Study population

To investigate the associations between genetic factors and breast cancer risk in the Cypriot population, we conducted a population-based case–control study, with the acronym MASTOS (Greek word for breast). The study population comprised women participating in the MASTOS study. Blood samples were collected between 2004 and 2006 from 1,109 female breast cancer patients diagnosed between the ages of 40 and 70 years and from 1,177 age-matched healthy controls. Patients were women who had been diagnosed with breast cancer between January 1999 and December 2006. The majority of patients were ascertained from the Bank of Cyprus Oncology Centre, which operates as a referral centre and offers treatment and follow-up for 80–90% of all breast cancer cases diagnosed in Cyprus. The rest of the patients were recruited at the Oncology Departments of the Nicosia, Limassol, Larnaca and Paphos district hospitals. The control group consisted of healthy women who were participating in the National program for breast cancer screening with the use of mammography. Volunteers were enrolled in the study during the same calendar period as the cases, from the four district mammography screening centers that operate in Cyprus. Eligible controls were women with no previous history of breast cancer and who had a negative mammography result. All study participants, both patients and controls, were of Greek Cypriot Caucasian origin, thus reducing any potential bias due to population stratification. In addition, the study population was representative of the whole island population, consisting of women who resided in all five districts of the country, which minimized potential selection bias. The participation rate was very high, covering around 98% of eligible cases and controls.

In addition to blood samples, a risk factor questionnaire, which included extensive demographic, epidemiologic and pathologic data, was obtained from each participant through a standardized interview. Breast cancer cases were verified by reviewing histological reports. The study was reviewed and approved by the National Bioethics Committee of Cyprus. All participants provided written informed consent.

Gene and SNP selection

Sixty-two SNPs were genotyped: 60 in the ATF1, ATM, ATR, BARD1, BLM, BRIP1, CHEK1, CHEK2, DDB2, DMC1, EME1, FANCA, FANCC, FANCD2, FANCE, MLH1, MRE11A, MSH2, MSH6, MUS81, NBS1, PALB2, PCNA, RFC1, RAD50, RAD51C, RAD51L1, RAD52 and XPC genes, and two in the PBOV1 and DBC2 genes. The genetic variants were selected based on three main criteria: (1) all SNPs chosen belong to genes that interact with either BRCA1 or BRCA2; and the SNPs chosen are either (2) functional SNPs (based on potential protein changes, evolutionary conservation and location in putative functional regions) [23–25] or (3) SNPs which were reported by other groups to modify cancer risk [14, 26–32]. For MRE11A and RAD50, we genotyped the tagging SNPs in Allen-Brady et al. [33], and for NBS1, we genotyped the tagging SNPs in Lu et al. [32]. SNPs in the PBOV1 and DBC2 genes were selected based on their minor allele frequency (MAF) >0.05.

Genotyping

DNA was isolated from blood samples using standard procedures (phenol–chloroform method). SNPs were genotyped by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) of allele-specific primer extension products (Mass Array, Sequenom Inc., San Diego, CA, USA). Assay design was based on published sequences retrieved from the National Center of Biotechnology Information (NCBI) databases. A 34-plex and a 28-plex multiplex assay were designed using the Sequenom MassARRAY Assay Design software (version 3.0). SNPs were genotyped using Sequenom iPLEX chemistry on a MALDI-TOF Compact Mass Spectrometer (Sequenom Inc., San Diego, CA, USA).

Briefly, PCR reactions were carried out in a final volume of 5 μl in standard 384-well plates. PCR was performed with 5 ng of genomic DNA, 1 U of HotStarTaq DNA polymerase (Qiagen, Hilden, Germany), 500 μmol of each dNTP and 100 nmol of each PCR primer. PCR thermal cycling was carried out in an ABI-9700 instrument (Applied Biosystems, Foster City, CA, USA) for 15 min at 94°C, followed by 44 cycles of 20 s at 94°C, 30 s at 56°C and 60 s at 72°C. Next, PCR products were treated with 0.5 U of shrimp alkaline phosphatase for 40 min at 37°C to dephosphorylate unincorporated dNTPs, followed by enzyme inactivation for 5 min at 85°C. After adjusting the concentrations of the extension primers to equilibrate signal-to-noise ratios, the post-PCR primer extension reaction of the iPLEX Gold assay was performed in a final 10 μl volume extension reaction containing 0.2 μl of termination mix, 0.0041 μl of iPLEX enzyme (Sequenom Inc., San Diego, CA, USA) and 700–1,400 nM of extension primers. A two-step program of 200 short cycles was used for the iPLEX reaction: initial denaturation was for 30 s at 94°C followed by five cycles of 5 s at 52°C and 5 s at 80°C. An additional 40 annealing and extension cycles were then looped back to 5 s at 94°C, 5 s at 52°C and 5 s at 80°C. Final extension was carried out at 72°C for 3 min. The iPLEX reaction products were desalted by diluting samples with 16 μl of water and adding 6 mg of clean resin. Following a quick centrifugation (3,200 g for 5 min), reaction products were spotted on a 384-format SpectroCHIP (Sequenom Inc., San Diego, CA, USA). SpectroCHIPs were processed in a MassARRAY Compact Analyzer (Bruker Daltonics, Bremen, Germany) by MassARRAY Workstation (version 3.3) software (Sequenom Inc., San Diego, CA, USA). Acquisition data were analyzed using MassARRAY TYPER 3.4 software (Sequenom Inc., San Diego, CA, USA).

For quality control, 48 random samples were genotyped in duplicate. Furthermore, ten samples were sequenced to confirm genotype calls from the MALDI-TOF platform. The genotype concordance rate between platforms was 99%. The order of the DNA samples on 384-well plates was randomized to ensure the same study conditions for samples from cases and controls. Genotyping call rates ranged from 95 to 99%, and duplicate concordance rates were higher than 99%. One SNP with 20% missing data was excluded from further analysis.
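The call-rate check described above can be illustrated with a minimal sketch. The function names, the toy genotype calls and the 0.95 threshold are all assumptions made for illustration; the text reports only that call rates ranged from 95 to 99% and that the SNP with 20% missing data was excluded, not the cut-off that was actually applied.

```python
def call_rate(calls):
    """Fraction of samples with a non-missing genotype call (None = no call)."""
    if not calls:
        raise ValueError("no samples supplied")
    return sum(c is not None for c in calls) / len(calls)


def passing_snps(calls_by_snp, min_call_rate=0.95):
    """Return SNP identifiers whose call rate meets the (assumed) threshold."""
    return [snp for snp, calls in calls_by_snp.items()
            if call_rate(calls) >= min_call_rate]


# Toy data: hypothetical SNP names and calls, not taken from the study.
toy = {
    "snp_ok": ["AA", "AG", "GG", "AG", "AA"] * 4,        # 20 of 20 samples called
    "snp_missing": ["CC", "CT", None, "CC", "TT"] * 4,   # 16 of 20 called (20% missing)
}
kept = passing_snps(toy)
```

In this toy example only the fully called SNP is retained; a marker with 20% missing calls falls below any threshold in the reported 95 to 99% range.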

Data analysis

Hardy–Weinberg equilibrium (HWE) was assessed in the control samples by applying an exact test. The primary tests of association were univariate analyses of each SNP against breast cancer. Genotype frequencies in cases and controls were compared using the χ² test. The association between breast cancer and each SNP was examined using logistic regression, with the SNP genotype tested under models of complete dominance and recessive inheritance as well as under the log-additive model. These models were adjusted for breast cancer risk factors including age (under or over 55 years), menopause status (pre- or post-menopausal), family history of breast cancer (first degree relative with breast cancer) and use of hormone replacement therapy. Statistical analysis was carried out using SNPStats, a web-based application designed for analysis of association studies [34].
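As an illustration of what an exact test for HWE computes, the following is a minimal sketch of one common formulation: the probability of each possible heterozygote count is evaluated conditional on the observed allele counts, and the P value is the sum of the probabilities that do not exceed that of the observed table. The function name and the genotype counts are hypothetical, and this is not necessarily the algorithm implemented in SNPStats.

```python
from math import exp, lgamma, log


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Two-sided exact HWE P value from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_prob(het):
        # Log-probability of observing `het` heterozygotes given n_a and n_b.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2) + lgamma(n + 1)
                - lgamma(hom_a + 1) - lgamma(het + 1) - lgamma(hom_b + 1)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    # Heterozygote counts must share the parity of the allele counts.
    probs = {h: exp(log_prob(h)) for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    p_obs = probs[n_ab]
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical control genotype counts, for illustration only.
p_in_equilibrium = hwe_exact_p(25, 50, 25)   # proportions match HWE expectations
p_no_hets = hwe_exact_p(50, 0, 50)           # complete absence of heterozygotes
```

A SNP would be flagged when its P value falls below the chosen threshold; the Results apply P < 0.01 in controls.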

Associations between breast cancer and common haplotypes of the ATM, MRE11A and NBS1 genes were also investigated using SNPStats, which allows the estimation of maximum likelihood estimates of haplotype frequencies using the Expectation-Maximization (EM) algorithm. Logistic regression was performed to test the association between haplotypes and breast cancer risk. For assessing the contribution of the MRE11A haplotypes in breast cancer risk, a haplotype tagging SNP genotyped previously was also included in haplotype reconstruction [35]. Haplotypes with a frequency of less than 1% were not considered further for analysis since they are likely to be a result of rare recombination events.
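To make the EM step concrete, here is a minimal sketch for the simplest case of two biallelic SNPs, in which only double heterozygotes have ambiguous phase. It is an illustration under stated assumptions rather than the SNPStats implementation: the function name, the 0/1/2 genotype coding and the toy data are hypothetical, and the analyses in this study involved haplotypes of up to seven SNPs.

```python
def em_two_snp_haplotypes(genotypes, n_iter=200, tol=1e-12):
    """Estimate haplotype frequencies for two biallelic SNPs by EM.

    genotypes: list of (g1, g2), where g is the number of copies (0, 1 or 2)
    of the allele coded 1 at each SNP. Haplotypes are (allele1, allele2).
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    fixed = dict.fromkeys(haps, 0.0)   # counts from phase-unambiguous genotypes
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1          # phase unknown: 00/11 or 01/10
        else:
            # With at least one homozygous locus the two haplotypes are determined.
            fixed[(g1 // 2, g2 // 2)] += 1
            fixed[((g1 + 1) // 2, (g2 + 1) // 2)] += 1

    total = 2 * len(genotypes)
    freqs = dict.fromkeys(haps, 0.25)  # uniform starting values
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries 00/11.
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from expected haplotype counts.
        counts = dict(fixed)
        counts[(0, 0)] += n_double_het * w
        counts[(1, 1)] += n_double_het * w
        counts[(0, 1)] += n_double_het * (1 - w)
        counts[(1, 0)] += n_double_het * (1 - w)
        new = {h: counts[h] / total for h in haps}
        converged = max(abs(new[h] - freqs[h]) for h in haps) < tol
        freqs = new
        if converged:
            break
    return freqs


# Toy genotypes (hypothetical): three 0/0 homozygotes and one 2/2 homozygote.
unambiguous = em_two_snp_haplotypes([(0, 0)] * 3 + [(2, 2)])
# Adding double heterozygotes makes the phase-resolution step matter.
mixed = em_two_snp_haplotypes([(1, 1)] * 2 + [(0, 0), (2, 2)])
```

Estimated frequencies of this kind are what the 1% frequency filter described above would be applied to.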

Results

Table 1 shows the genotype frequency in cases and controls for the 62 SNPs, of which 61 were successfully genotyped. Six SNPs (rs1800149, rs2706377, rs1800282, rs7487683, rs3626, rs28908468) deviated from HWE in controls (P < 0.01) and were excluded from further analysis. Of the remaining 55 SNPs, 8 were monomorphic in both groups. Significant differences in genotype frequencies between breast cancer patients and controls were observed in 5 of the 55 SNPs analyzed.

Table 1 Genotype frequencies in cases and controls for the 62 SNPs studied

The associations of SNPs and breast cancer risk in Cypriot women are shown in Table 2. Five of the 55 SNPs were associated at a P value of less than 0.05. Three SNPs were associated with a reduced risk for breast cancer while the remaining two were associated with an increased breast cancer risk. In detail, the variant allele of NBS1 rs13312840 (924 T>C) was associated with a reduced risk of disease (OR TT vs. TC/CC = 0.58; 95% CI, 0.37–0.92; P = 0.019). Carriers of the NBS1 rs769416 rare allele also had a reduced risk of breast cancer (OR GG vs. GT/TT = 0.23; 95% CI, 0.06–0.85; P = 0.017). Furthermore, the variant allele of MRE11A rs556477 was associated with a reduced risk of developing the disease (OR AA vs. AG/GG = 0.76; 95% CI, 0.64–0.91; P = 0.0022). The variant allele of MUS81 rs545500 was associated with an increased risk of developing breast cancer (OR GG vs. GC/CC = 1.21; 95% CI, 1.02–1.45; P = 0.031). In addition, the rare allele of PBOV1 rs6927706 was also associated with an increased risk of developing breast cancer (OR AA vs. AG/GG = 1.53; 95% CI, 1.07–2.18; P = 0.019).
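The carrier versus non-carrier comparisons above correspond to a dominant genetic model. The sketch below shows how such an odds ratio and a Wald 95% confidence interval can be obtained from genotype counts. It is a simplified, unadjusted calculation, whereas the estimates reported here come from logistic regression adjusted for covariates; the function names and counts are hypothetical and are not the study data.

```python
from math import exp, log, sqrt


def dominant_table(case_counts, control_counts):
    """Collapse (ref/ref, ref/var, var/var) counts into carriers vs non-carriers."""
    a = case_counts[1] + case_counts[2]        # carrier cases
    b = case_counts[0]                         # non-carrier cases
    c = control_counts[1] + control_counts[2]  # carrier controls
    d = control_counts[0]                      # non-carrier controls
    return a, b, c, d


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 for 95%)."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (ref/ref, ref/var, var/var), for illustration only.
cases = (60, 30, 10)
controls = (50, 40, 10)
or_dom, ci_low, ci_high = odds_ratio_wald(*dominant_table(cases, controls))
```

An odds ratio below 1 indicates that carriers of the variant allele are less frequent among cases than among controls, and an interval that excludes 1 corresponds to nominal significance at the 5% level.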

Table 2 Genotypic specific risk (OR and 95% CI)

The NBS1 haplotype GGCGCAC (rs769416, rs769420, rs13312840, rs1805794, rs6413508, rs12677527, rs1805787), which contains the NBS1 rs13312840 C allele, was found to be associated with a reduced breast cancer risk compared with the most frequent haplotype GGTCCGC (OR = 0.62; 95% CI = 0.39–0.97; P = 0.037). We also found a reduced risk for breast cancer for a rare haplotype in NBS1 (OR = 0.42; 95% CI = 0.26–0.66; P = 2 × 10⁻⁴). In addition, the MRE11A haplotype AGCG (rs556477, rs601341, rs10831234, rs1009456) was associated with a significantly increased risk for breast cancer (OR = 1.32; 95% CI = 1.13–1.54; P = 0.0004). None of the common ATM haplotypes were associated with breast cancer (Table 3).

Table 3 Estimated haplotype frequencies in cases and controls and haplotypic specific risks

Discussion

Breast cancer is a complex polygenic disease. Published data suggest that a proportion of breast cancer can be explained by common low-penetrance alleles that increase susceptibility [36]. High-penetrance mutations in genes that are involved in DNA repair pathways such as BRCA1 and BRCA2 predispose to familial breast cancer [37, 38]. Previously our group characterized novel mutations in these genes in Cypriot families [39, 40]. The importance of common inherited variants in DNA repair genes in relation to breast cancer risk is still being elucidated, but is currently receiving increased attention. Our group as part of an ongoing investigation has studied genetic variation in DNA repair genes in relation to breast cancer risk in the Cypriot population and has reported a number of SNPs that modify breast cancer risk [35, 41]. A number of large studies which focused on the contribution of common SNPs in DNA repair genes in breast cancer, using tagging SNP approaches have also been completed [9, 42, 43]. In this case–control study, we evaluated both functional as well as tagging SNPs in DNA repair genes in relation to breast cancer risk in Cypriot women.

We found that Cypriot women who carry the NBS1 rs13312840 C and rs769416 T alleles have a reduced risk of breast cancer. The NBS1 protein is involved in the non-homologous end-joining (NHEJ) pathway that repairs DNA double-strand breaks (DSBs). The first step of this pathway consists of the recognition of DSBs by the MRN complex, whose core contains the MRE11, RAD50 and NBS1 proteins. NBS1 is the key regulator of this protein complex [44, 45]. The NBS1 rs13312840 T>C SNP is located in the 5′ UTR (-1120) of the gene, within a binding site for the transcription factor GATA-1. The activation domains of GATA-1 are capable of activating transcription in mammalian cells through GATA motifs [46]. Our results are in contrast to those of a recent study by Lu et al., who found an increased risk for breast cancer in non-Hispanic Caucasian women aged 55 or younger who were carriers of the C allele [32]. Conflicting evidence for association may be due to population-specific and/or age-specific differences. The protective effect of the NBS1 rs13312840 SNP observed in our study could be attributed to the SNP itself or to linkage disequilibrium with another variant.

To the best of our knowledge, this is the first study investigating the NBS1 rs769416 SNP in relation to breast cancer risk. The rs769416 SNP causes an amino acid change (Gly to Lys) at codon 216 of the NBS1 gene. This SNP is not located within one of the three functional regions of the NBS1 protein, but it may have an alternative splicing regulatory effect, based on the Functional Single Nucleotide Polymorphism (F-SNP) database [47]. Our result on the association between the rs769416 SNP and breast cancer needs to be interpreted with caution, since this is a rare SNP in our population and the most likely explanation for this association is chance.

Haplotype analysis with the combination of the seven NBS1 SNPs showed that the frequency of the GGCGCAC haplotype (rs769416, rs769420, rs13312840, rs1805794, rs6413508, rs12677527, rs1805787) was lower in patients than in controls (0.0147 vs. 0.0225; P = 0.035), suggesting a protective effect. There was also evidence for a protective effect of the rare pooled NBS1 haplotypes. This protective effect is driven by the difference in frequencies of the pooled rare haplotypes that conferred a low risk (OR = 0.42) and had a combined frequency of 3.29% in controls and 1.37% in patients. It is possible that these pooled haplotypes are a marker for a single, rare, protective mutation in the Cypriot population. There may be value in sequencing this region in order to help identify the protective variant(s). Both these findings need to be replicated in independent studies in order to confirm or refute this effect.

Our data support the notion that MUS81 rs545500 C allele carriers are at an increased risk for breast cancer. Rs545500 is a non-synonymous SNP located in the coding region of MUS81, a structure-specific DNA nuclease that plays an important role in DNA repair by homologous recombination [48]. This polymorphism results in an amino acid change from a positively charged hydrophilic arginine to an uncharged hydrophobic proline residue, which may have an effect on the 3D structure or a protein–protein binding interface of the MUS81 protein [25]. The role of the MUS81 gene in breast cancer has not been investigated. However, it was demonstrated that MUS81 homozygote and heterozygote knockout mice have a predisposition to develop cancer. Proper biallelic expression of MUS81 is critical for the maintenance of genomic integrity and tumor suppression [49]. Therefore, the rs545500 SNP could predispose individuals to breast cancer, but functional studies need to be performed in order to identify the actual role of this variant in carcinogenesis.

Our findings also suggest that the PBOV1 rs6927706 polymorphism may be a risk factor for breast cancer. Rs6927706 is a non-synonymous SNP located in the coding region of PBOV1, a gene which is upregulated in prostate, breast and bladder cancers [21]. The polymorphism results in an amino acid change at codon 73 from a hydrophobic isoleucine to a hydrophilic threonine residue. Bioinformatics analysis indicates that this SNP could be involved in splicing regulation [47]. However, further work is warranted since the exact roles of the PBOV1 protein as well as its functional domains are not well known at present.

Our current data suggest that the MRE11A rs556477 G allele may be associated with a reduced breast cancer risk. The MRE11A protein forms a complex with the RAD50 and NBS1 proteins, which is involved in the cellular response to DNA double-strand breaks. Defects in the members of this tri-complex are linked to increased chromosomal instability which leads to cancer [50]. The rs556477 common variant is located in intron 15 of the MRE11A gene. The rs556477 MAF is 40% in Caucasians as reported in NCBI’s dbSNP database, the same as that observed in our population. The functionality of this SNP is not clear. Using the TFSEARCH webtool (http://www.cbrc.jp/research/db/TFSEARCH.html), we searched for potential transcription factor binding sites at this position. The rs556477 SNP is located in a region that is a potential binding site for the transcription factor activator protein 1 (AP-1), which plays a critical role in signal transduction pathways in many cells. A recent study has shown that inhibition of AP-1 transcription factors suppresses breast cancer growth. Inhibitors that are capable of blocking AP-1 activation may be promising agents for the treatment and prevention of breast cancer [51]. The reduced risk of breast cancer for carriers of the rs556477 SNP found in our study is in contrast with the above finding, since it is expected that the creation of an AP-1 binding site will result in an increased breast cancer risk. However, it must be taken into account that the prediction that the rs556477 A to G substitution results in a gain of an AP-1 binding site is based on in silico analysis, and this remains to be proven by in vitro data. Furthermore, the MRE11A rs556477 polymorphism may not be causal, but could be in linkage disequilibrium with a true protective variant.

In the current study, we present evidence for an increased breast cancer risk for women carrying the MRE11A AGCG (rs556477, rs601341, rs10831234, rs1009456) haplotype. It is noteworthy that in a previous study conducted by our group there was evidence for an increased breast cancer risk for women homozygous for the MRE11A rs601341 A allele [35]. The rs601341 A to G substitution results in potential binding of the ubiquitous transcription factor Yin Yang 1 (YY1), which has a fundamental role in normal biologic processes such as differentiation, replication and cell proliferation. YY1 overexpression and/or activation results in uncontrolled cellular proliferation, resistance to apoptotic stimuli and tumorigenesis [52]. Given the intronic position of the two associated SNPs, it is unlikely that these SNPs are in and of themselves disease associated. Rather, in all likelihood, they are in linkage disequilibrium with other variants that cause the associations observed.

Our study has several strengths, including a high participation rate of eligible cases (98%) and a population sample from a homogeneous ethnic background (all participants are Greek Cypriots) thus reducing any potential bias due to population stratification. In addition, our study population (both cases and controls) was from all over the country minimizing potential selection bias.

However, there were limitations in our study, one of which is the possibility of survivor bias. This is one of the known disadvantages of all retrospective case–control studies. In our study, samples from breast cancer cases were collected between 2004 and 2006 for cases diagnosed between 1999 and 2006. Our study may therefore have excluded a number of women with the most aggressive form of breast cancer, diagnosed between 1999 and 2003. It is possible that this could lead to “survivor bias” if genotypes differ between those who succumb quickly compared with longer-term breast cancer survivors.

The SNP selection for this study was based solely on functionality and their position in genes interacting with BRCA1/2 rather than allele frequency. As a result of this, a number of monomorphic/low-polymorphic SNPs were included in the study. It is noted that this is the first time that these SNPs were studied in the Cypriot population, and their allele frequencies were a priori unknown. Rare SNPs can also contribute to disease risk [53]. However, our study did not have sufficient power to detect such associations, and the possibility that some of the low-polymorphic SNPs studied contribute to breast cancer risk cannot be ruled out.

Another limitation of our study is that we did not consider the possibility of gene–gene interactions or gene–environment interactions. It is possible that the risks observed are the result of interactions, but we have not attempted to assess such effects, since the estimate of an interaction effect will be unreliable because of the small numbers available. Furthermore, we did not account for multiple testing. When multiple comparisons are being made, statistically significant associations may be identified by chance alone. Replication in independent, well-powered studies is the gold standard for distinguishing true associations from chance findings. A Cypriot replication set is not available to attempt to replicate the variants identified, and replication will need to be performed in other populations.
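The scale of the multiple-testing issue can be illustrated with a short sketch. It takes the five nominal P values reported in the Results and applies a Bonferroni adjustment. The number of tests (55, one per SNP analyzed) is an assumption made purely for illustration, as is the choice of Bonferroni, which is conservative and only one of several possible corrections; this adjustment was not part of the study's analysis.

```python
def bonferroni(p_values, n_tests):
    """Bonferroni-adjusted P values, capped at 1.0."""
    return [min(1.0, p * n_tests) for p in p_values]


# Nominal P values as reported in the Results.
reported = {
    "NBS1 rs13312840": 0.019,
    "NBS1 rs769416": 0.017,
    "MRE11A rs556477": 0.0022,
    "MUS81 rs545500": 0.031,
    "PBOV1 rs6927706": 0.019,
}

n_tests = 55  # assumption: one test per SNP analyzed
adjusted = dict(zip(reported, bonferroni(reported.values(), n_tests)))

# Number of nominally significant results expected by chance alone
# if none of the tested SNPs were truly associated.
expected_by_chance = 0.05 * n_tests
```

Counting each genetic model fitted per SNP as a separate test would raise the number of comparisons further, which is one reason replication in independent samples is emphasized above.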

In conclusion, this study provides support for the hypothesis that genetic variants in DNA repair genes influence breast cancer risk and provides further evidence for the polygenic model of breast cancer. However, large-scale genetic epidemiologic studies are warranted to further examine and corroborate the associations observed between polymorphisms and breast cancer in multiethnic groups. In addition, elucidation of the functional impact of the breast cancer associated SNPs is needed in order to provide further insights into their mechanistic effects on risk.