Introduction

Breast cancer is the second most common cancer among women in Thailand and its incidence is still increasing [1]. Although there are well-established risk factors such as age at first child’s birth, nulliparity, and family history of breast cancer, the aetiology of breast cancer is still not completely known [2]. There is, however, strong evidence that insufficient repair of DNA damage plays an important role in the development of breast cancer [2–5]. Besides DNA double-strand break repair, the efficiency of base excision repair (BER) is suggested to be a major determinant of breast cancer risk, as BER is involved in the repair of oxidative DNA lesions induced by free radicals that are produced, e.g., during cellular estrogen metabolism or by exogenous exposure to chemicals and ionizing radiation [6, 7].

The BER pathway consists of at least 11 DNA damage specific glycosylases and more than 20 further proteins [8], three of the most important ones being 8-oxoguanine DNA glycosylase (hOGG1), apurinic/apyrimidinic endonuclease 1 (APEX1), and X-ray repair cross-complementing group 1 protein (XRCC1) [9, 10]. BER proteins cooperate at the damaged site in a tightly coordinated way; e.g., an oxidized 8-oxoguanine base is recognized and removed by the Ogg1 glycosylase, leaving behind an apurinic site which is excised by APEX1 and subsequently re-polymerized by DNA polymerase β and ligase 3. XRCC1 acts as a scaffolding protein and can regulate the timing and strength of interaction between the multiple proteins of the BER machinery.

Imbalances in this tightly regulated process can be caused by single nucleotide polymorphisms and will result in insufficient DNA repair and an increase in DNA breaks [8, 9]. Several common polymorphisms are known to affect the amino acid sequence of BER genes, among them Arg194Trp (rs1799782), Arg280His (rs25489) and Arg399Gln in XRCC1 (rs25487), Asp148Glu in APEX1 (rs3136820), and Ser326Cys in OGG1 (rs1052133). The XRCC1 399Gln allele and the OGG1 326Cys allele were shown to be associated with reduced DNA repair capacity as assessed by the persistence of DNA bulky adducts [11–13] and γ-irradiation induced oxidative DNA damage [14]. Decreased DNA repair capacity after exposure to oxidative agents was also found when cells carrying the 280His allele were compared with isogenic cells carrying the 280Arg allele [15]. In addition, an increasing number of variant alleles of APEX1 Asp148Glu and XRCC1 Arg399Gln was significantly associated with prolonged cell cycle delay in G2 phase and decreased DNA repair capacity after γ-irradiation [14, 16]. In contrast, the 194Trp variant of XRCC1 was reported to be associated with lower bleomycin and benzo(a)pyrene diol-epoxide sensitivity in vitro [17], suggesting a protective role for the Trp allele in the development of cancer, potentially by increasing BER activity.

Associations between these polymorphisms and breast cancer risk have been examined in epidemiologic studies, and the results have been summarized in three meta-analyses [18–20]. Only a few studies have investigated associations between breast cancer risk and the genetic variants OGG1 Ser326Cys, APEX1 Asp148Glu and XRCC1 Arg280His. Reasonable numbers of studies are available only for the XRCC1 Arg194Trp and Arg399Gln polymorphisms. The summary estimates for the 399 polymorphism indicated an increased risk in Asian and African homozygous carriers of the 399Gln allele but not in Caucasians [20]. Data on Asian women are, however, based on small sample sizes. Studies on Thai women are missing. Therefore, it is unclear whether the observed differences by race have a biological basis, and further evidence is needed to draw a conclusion about the potential differences in relative risk by race.

Here, we present data on the association between these polymorphisms and breast cancer risk in a case-control study of Thai women. In addition, we determined the effects of the XRCC1 haplotypes and analyzed whether the combined occurrence of these polymorphisms in genes of the BER pathway affects breast cancer risk.

Material and methods

Study population

Cases were all new incident breast cancer patients histologically diagnosed at the National Cancer Institute in Bangkok and at the hospital in Khon Kaen province in northeastern Thailand during the period of May 2002 to March 2004 [21], with a participation rate of 99.6% (554/556). Controls were randomly selected from healthy women who visited patients admitted to the same hospitals for diseases other than breast or ovarian cancer. The participation rate among visitors who were asked to participate was 98.7% (572/579). Informed consent was obtained from all participants, and a structured questionnaire was administered by trained interviewers to collect information on demographic and anthropometric data, reproductive and medical history, residential history, physical activity and occupation as well as diet. Lifestyle exposure parameters were defined as follows: tobacco smoking: less than or equal to 50 cigarettes (non-smoker) versus more than 50 cigarettes over a 6-month period (smoker); involuntary tobacco smoking: less versus more than or equal to 1 h of exposure per day during the last 2 years (sum of three sources: from the spouse, at the workplace or in public settings); alcohol consumption: less versus more than or equal to once a week for at least 6 months. Approximately 7 ml of blood were collected from participants, but 45 cases and 149 controls refused to give blood samples. In total, blood samples of 507 cases and 425 controls were included in the genotype analysis, resulting in a participation rate of 91.2% (cases) and 73.4% (controls). Genomic DNA was isolated from buffy coats using a QIAamp DNA blood kit (Qiagen, Hilden, Germany). The study was approved by the ethical review committee for research in human subjects, Ministry of Public Health, Thailand.

Genotyping analysis

Detection of polymorphisms was performed by rapid capillary PCR followed by melting curve analysis using fluorescence labeled hybridization probes in a LightCycler (Roche Diagnostics, Mannheim, Germany) as described [22]. Primers and probes were designed with the help of Tib Molbiol (Berlin, Germany), and are given in Table 1.

Table 1 Primers and probes for genotyping of single-nucleotide polymorphisms

Analysis was performed in 10-μl volumes in glass capillary tubes (Roche Diagnostics) with Qiagen reagents: 1x PCR Buffer, 2.5 mM total MgCl2, 0.2 mM dNTPs, 0.1% bovine serum albumin in 2 mM Tris–HCl/2.5% glycerol, 1x Q-solution (not for APEX1 Asp148Glu), 0.1 U Taq DNA polymerase, 0.5 μM of each primer, 0.15 μM of each probe and 10 ng of DNA. For APEX1 Asp148Glu, 2 μM of anchor and sensor were used. Reaction conditions were as follows: initial denaturation at 95°C for 3 min, then 40 cycles of denaturation at 95°C for 15 s, annealing at 58°C (OGG1 Ser326Cys, APEX1 Asp148Glu, XRCC1 Arg194Trp and Arg280His) or 55°C (XRCC1 Arg399Gln) for 10 s, and elongation at 72°C for 12 s. Melting curve analysis was performed with an initial denaturing step at 95°C for 30 s, 20 s at 40°C, and slow heating to 75°C with a ramping rate of 0.2°C/s and continuous fluorescence detection. Genotypes were assigned by comparing melting curves of unknown samples with those of samples with known (sequenced) genotype. A negative control containing all reagents but with water instead of the DNA template was included in each amplification. Melting points were evaluated by two independent observers who were unaware of the clinical diagnosis. In addition, 15% of randomly selected samples were repeated independently to verify genotyping results, and 100% concordance was found. Because of an inadequate amount of DNA, a few samples did not generate complete information for all polymorphisms.

Data analysis

Breast cancer patients were compared with controls for basic demographic and lifestyle characteristics. Genotypes of both cases and controls were tested for Hardy-Weinberg equilibrium using the χ2-test of goodness of fit with one degree of freedom, with respect to the distribution of the considered allele groups. Because the XRCC1 280His allele is uncommon in this population, individuals with genotypes Arg/His and His/His were combined into one group as 280His allele carriers and compared with Arg/Arg homozygotes as the reference.
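The Hardy-Weinberg test described above can be illustrated with a minimal sketch. The function name and the genotype counts in the usage note are hypothetical and are not taken from the study data; the P-value uses the closed form of the χ2 distribution with one degree of freedom.

```python
from math import erfc, sqrt


def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    n_aa and n_bb are the observed counts of the two homozygous genotypes,
    n_ab is the observed count of heterozygotes. Both alleles must be
    present, otherwise an expected count of zero causes a division by zero.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value
```

For example, hypothetical counts of 30, 40 and 30 give an allele frequency of 0.5, expected counts of 25, 50 and 25, and a χ2 statistic of 4.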

To analyze the association of breast cancer risk with polymorphisms, odds ratios (ORs) and their 95% confidence intervals (CI) were calculated and assessed for statistical significance according to [23], both as crude ORs and as adjusted ORs. Multivariate unconditional logistic regression analysis was performed to assess the association between occurrence of breast cancer and prevalence of polymorphisms and to adjust for potential confounders. Covariates were selected when they were either significant in the univariate analysis at the 5% level or considered a relevant factor for the occurrence of breast cancer in general. Family history of breast cancer (FH), menopausal status, use of contraceptives, and education (≤9 years, >9 years) were incorporated in the model as binary predictors. Alcohol consumption, active and involuntary smoking were combined into one binary variable “hazardous lifestyle” (0 = no regular alcohol consumption and no active and no involuntary smoking, 1 = else). In the analysis of all subjects, reproduction was a combination of pregnancy and breast-feeding in five categories: non-pregnant, age at 1st pregnancy ≤22 years and non breast-feeding, age at 1st pregnancy ≤22 years and breast-feeding, age at 1st pregnancy >22 years and non breast-feeding, age at 1st pregnancy >22 years and breast-feeding. To deal with possible non-linearity, continuous predictors (age, body mass index [BMI] and age at menarche) were modeled by using fractional polynomials [24]. The final ORs were obtained from a multivariate logistic regression model including age (non-linear), BMI, FH, age at menarche, reproduction parameters (5 classes), menopausal status, use of contraceptives, hazardous lifestyle, and education.
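As an illustration of a crude OR with its 95% CI, the following sketch uses the common logit (Woolf) approximation. This is an assumption made for illustration; the method of [23] used in the study may differ, and the function name and counts are hypothetical.

```python
from math import exp, log, sqrt


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald-type confidence interval.

    a: cases with the genotype        b: cases without the genotype
    c: controls with the genotype     d: controls without the genotype
    All four cells must be non-zero. z = 1.96 gives a 95% interval.
    Returns (odds_ratio, lower, upper).
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the logit (Woolf) approximation.
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With hypothetical counts of 20 and 80 among cases and 10 and 90 among controls, the crude OR is 2.25 and the interval includes 1, so such a result would not be significant at the 5% level. Adjusted ORs require fitting the full regression model and are not covered by this sketch.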

Possible interaction effects for age × BMI, polymorphism × hazardous lifestyle, polymorphism × BMI, and polymorphism × reproduction were tested by introducing an interaction term into the logistic regression model and applying the standard Wald test. The significance level was set to 0.1 to account for the lower power of tests for interaction compared with tests for the effect of a single covariate.
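The Wald test for a single interaction term can be sketched as follows, assuming that the coefficient and its standard error have already been estimated by the logistic regression. The function names and the numeric values in the usage note are hypothetical.

```python
from math import erfc, sqrt


def wald_test(beta, se):
    """Two-sided Wald test for one regression coefficient.

    beta is the estimated coefficient of the interaction term and se its
    standard error. Returns (z, p_value) under a standard normal reference.
    """
    z = beta / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail probability
    return z, p_value


def interaction_flagged(beta, se, alpha=0.10):
    """Return True if the interaction term is significant at level alpha."""
    _, p_value = wald_test(beta, se)
    return p_value < alpha
```

A hypothetical coefficient of 1.96 with a standard error of 1 gives a P-value close to 0.05 and would be flagged at the 0.1 level, whereas a coefficient of 0.2 with the same standard error would not.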

In addition, subgroups based on menopausal status were analysed. Here, reproduction was categorized as non-pregnant, age at 1st pregnancy ≤22 years, and age at 1st pregnancy >22 years. Thus, ORs in subgroups were adjusted by age (linear), BMI, FH, age at menarche, reproduction parameters (three classes), use of contraceptives, hazardous lifestyle, and education. Statistical analyses were performed using the statistical packages SAS (SAS Institute, Cary, NC) for Windows Version 9.

Linkage disequilibrium between the different markers in XRCC1 was estimated using the “Estimating Haplotypes” program [25, 26]. Haplotype pairs of XRCC1 were reconstructed using the PHASE V2.1 online software [27]. The haplotype-specific risks were investigated by univariate and multivariate logistic regression analysis as described for the single nucleotide polymorphisms, using the most probable haplotype pairs yielded by the PHASE software.

To evaluate the potential combined effects of XRCC1 Arg399Gln, APEX1 Asp148Glu, and OGG1 Ser326Cys on breast cancer risk, the number of homozygous variant alleles per individual was introduced in the model and crude as well as adjusted ORs were calculated.
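The counting step can be sketched as follows, assuming the risk genotypes designated in the Results (XRCC1 399 Gln/Gln, APEX1 148 Asp/Asp, OGG1 326 Cys/Cys). The locus labels, function names and group labels are hypothetical choices for this illustration.

```python
# Homozygous risk genotype per locus; the dictionary keys are hypothetical labels.
RISK_GENOTYPES = {
    "XRCC1_Arg399Gln": ("Gln", "Gln"),
    "APEX1_Asp148Glu": ("Asp", "Asp"),
    "OGG1_Ser326Cys": ("Cys", "Cys"),
}


def count_homozygous_risk(genotypes):
    """Count the loci at which an individual is homozygous for the risk allele.

    genotypes maps a locus label to a pair of alleles, e.g. ("Asp", "Glu").
    Loci without genotype information are not counted.
    """
    count = 0
    for locus, risk in RISK_GENOTYPES.items():
        alleles = genotypes.get(locus)
        if alleles is not None and tuple(alleles) == risk:
            count += 1
    return count


def risk_group(genotypes):
    """Assign the group used for the joint analysis: 0, 1, or 2-3 loci."""
    n = count_homozygous_risk(genotypes)
    if n == 0:
        return "reference"
    if n == 1:
        return "one"
    return "two_or_three"
```

The resulting group variable can then enter the logistic regression model as a categorical predictor, with individuals who are not homozygous for any risk allele as the reference.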

Results

Characteristics of the study population were compared by case-control status, as shown in Table 2. The mean age of controls (45.3 ± 12.2 years) was significantly lower (P < 0.01) than that of breast cancer patients (48.0 ± 10.0 years). Pregnancy, breast feeding, BMI, involuntary tobacco smoking, FH, and education differed between cases and controls. However, no significant differences between cases and controls were found for oral contraceptive use, menopausal status, smoking, and alcohol consumption.

Table 2 Selected characteristics of study population

Frequencies of the variant alleles were 0.31 (XRCC1 194Trp), 0.07 (XRCC1 280His), 0.23 (XRCC1 399Gln), 0.34 (APEX1 148Glu) and 0.50 (OGG1 326Cys) in controls (Table 3). Genotype frequencies were in Hardy-Weinberg equilibrium in both cases and controls. In XRCC1 Arg280His, only 3 cases and 2 controls were found to be homozygous for the His allele. Thus, carriers of Arg/His and His/His were combined for further analysis and compared to Arg/Arg carriers as the reference.

Table 3 Associations between XRCC1, APEX1, and OGG1 polymorphisms and breast cancer

Univariate and multivariate logistic regression analysis of the overall effect of the polymorphisms (Table 3) did not give evidence that the XRCC1 194Trp, 280His, and 399Gln alleles were associated with breast cancer (OR = 0.70, 95% CI 0.42–1.14; OR = 1.30, 95% CI 0.88–1.93; and OR = 1.80, 95% CI 0.99–3.29). However, the APEX1 148Glu allele showed a significantly protective effect on breast cancer risk (OR = 0.60, 95% CI 0.38–0.94) when adjusted for age, BMI, family history of breast cancer, age at menarche, reproduction parameters (five classes), menopausal status, use of contraceptives, hazardous lifestyle, and education. For the OGG1 Cys/Cys genotype, an adjusted OR of 1.42 was calculated (95% confidence limits 0.97–2.09). No interaction effects were observed between age and BMI, as well as between polymorphisms and hazardous lifestyle, BMI, or reproduction (P > 0.85).

In the subgroup analysis based on menopausal status (Table 4), no effect of the polymorphisms was seen in premenopausal women, whereas there was a significant increase in breast cancer risk among postmenopausal women homozygous for the 326Cys variant of OGG1 (OR = 2.05, 95% CI 1.14–3.69). The other polymorphisms did not contribute significantly to breast cancer risk in postmenopausal women.

Table 4 Associations between XRCC1, APEX1, and OGG1 polymorphisms and breast cancer stratified by menopausal status

A strong association (linkage disequilibrium) was found among the three XRCC1 polymorphisms at codons 194, 280 and 399 (P-value < 0.0001) in cases and in controls. Reconstruction of haplotypes yielded four haplotypes which were called CGG (Arg194/Arg280/Arg399), CGA (Arg194/Arg280/Gln399), CAG (Arg194/His280/Arg399), and TGG (Trp194/Arg280/Arg399). Haplotype frequencies in the overall study population were 36% for CGG (n = 673), 26% for CGA (n = 572), 7% for CAG (n = 140), and 31% for TGG (n = 475). The haplotypes with the variant sequence in all three loci (TAA) or in two loci (TAG, TGA, or CAA) did not occur. In addition, haplotype pairs were calculated for each individual (Table 5) and the association between haplotype pair (diplotype) and breast cancer risk was analyzed. In total, 10 diplotypes occurred; the diplotype CAG/CAG was combined with CAG/TGG because of the small sample size of these two groups. For regression analysis, the diplotype CGG/CGG, which consists of the wild type sequence in all loci, was selected as reference. A significantly increased risk for breast cancer was found for individuals carrying the CGG/TGG, CGA/CGA, or CGA/CAG diplotype, or the CAG/CAG diplotype combined with CAG/TGG. The strongest effect was seen for CGA/CGA (Arg194/Arg280/Gln399) with an OR of 2.56 (95% CI 1.28–5.15; P = 0.008).
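The haplotype coding and the number of possible diplotypes can be checked with a short sketch. The nucleotide-to-amino-acid mapping is read off the four haplotype definitions given above; the function names are hypothetical.

```python
from itertools import combinations_with_replacement

# The four XRCC1 haplotypes observed in the study (codons 194, 280, 399).
HAPLOTYPES = ["CGG", "CGA", "CAG", "TGG"]

# Amino acid encoded by each nucleotide at each codon position.
CODON_ALLELES = (
    ("194", {"C": "Arg", "T": "Trp"}),
    ("280", {"G": "Arg", "A": "His"}),
    ("399", {"G": "Arg", "A": "Gln"}),
)


def decode_haplotype(haplotype):
    """Translate a three-letter haplotype into its amino acid description."""
    parts = []
    for base, (codon, amino_acids) in zip(haplotype, CODON_ALLELES):
        parts.append(f"{amino_acids[base]}{codon}")
    return "/".join(parts)


def enumerate_diplotypes(haplotypes):
    """List all unordered haplotype pairs, including homozygous pairs."""
    return [f"{a}/{b}" for a, b in combinations_with_replacement(haplotypes, 2)]
```

Four haplotypes yield 4 × 5 / 2 = 10 unordered pairs, which agrees with the 10 diplotypes reported.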

Table 5 Associations between diplotypes of XRCC1 and breast cancer

As already mentioned, the three genes analyzed in this study are involved in the BER pathway. We therefore explored the combined effects of variant alleles in these genes on breast cancer risk. The analysis concentrated on individuals who were homozygous for the variant alleles because these individuals exhibited stronger effects of these alleles compared with heterozygous individuals, for whom ORs were close to 1 (Table 3). As the results of the XRCC1 diplotype analysis showed the strongest effect for the haplotype with the wild type sequence at the 194 and 280 loci and the 399Gln variant at the third locus, the variant allele XRCC1 399Gln was selected for the combination analysis. For APEX1 Asp148Glu, the 148Asp allele was designated as the risk allele because the Glu variant showed a protective effect. The reference population consisted of all individuals who were not homozygous for any of the three risk alleles. Table 6 summarizes the joint effects of the XRCC1 399Gln, APEX1 148Asp and OGG1 326Cys alleles on breast cancer risk when carriers are homozygous for one or more of these variants. Both crude and adjusted ORs were significantly increased for carriers of two or three homozygous risk alleles (OR = 1.88, 95% CI 1.26–2.82; P = 0.002).

Table 6 Breast cancer risk in individuals homozygous for more than one variant allele: Joint effects of variants in base excision repair (XRCC1 399Gln, APEX1 148Asp, and OGG1 326Cys)

Discussion

This study investigated possible associations between five polymorphisms in BER genes (OGG1 Ser326Cys, APEX1 Asp148Glu, XRCC1 Arg194Trp, XRCC1 Arg280His, and XRCC1 Arg399Gln) and breast cancer risk for the first time in Thai women. Frequencies obtained among controls for the rare alleles of all polymorphisms were consistent with frequencies found among other Asian populations [19]. Our data also confirmed that the XRCC1 194Trp and the OGG1 326Cys alleles are more frequent among Asians than among Caucasians. These differences may account for a different contribution of these polymorphisms to breast cancer in Asians, and especially in a Thai population.

All polymorphisms investigated were recently suggested to modify BER activity, based on correlations of various cellular endpoints with genotype (see Introduction). Reduced BER capacity prolongs the persistence of oxidative DNA damage in breast epithelial cells and may thus increase the mutational load in these cells and contribute to breast cancer risk.

We found no association between XRCC1 Arg194Trp or Arg280His and breast cancer risk among Thai women. Although studies on Arg194Trp had contradictory outcomes, our result corresponds to risk estimates obtained by meta-analyses for Arg194Trp [19, 20], which showed that this polymorphism did not modify breast cancer risk regardless of ethnic origin. The 280His allele is very rare in both Caucasian and Asian populations. Therefore, only a few studies have examined this polymorphism and can be compared with our data. For Caucasians, an 80% increased risk for carriers of the 280His allele was reported [28], but this increase was not confirmed by two other studies [20, 29]. For Asians, two small studies, one from India (123 cases/123 controls) and one from China (84 cases/252 controls), did not find an association between Arg280His and breast cancer risk [30, 31].

Studies on the role of XRCC1 Arg399Gln in breast cancer risk have been summarized by meta-analyses [19, 20]. An effect of this polymorphism was found only when different ethnicities were analyzed separately. A summary of 4 studies with Asian populations (1567 cases/1643 controls) [20] revealed a significant 60% increase in risk among Asian carriers of the homozygous Gln variant (OR = 1.6, 95% CI 1.1–2.4). In the Thai population, the Gln/Gln genotype was also more frequent in cases than in controls (P = 0.056). Although our data are not significant, they support the view that XRCC1 Arg399Gln can modulate breast cancer risk in Asian women. This effect became more pronounced when we assessed the associations between the three polymorphisms determined in XRCC1 and breast cancer risk based on diplotypes. Among the 10 diplotypes present in our population, we found the strongest effect, a 2.5-fold increase in risk, in individuals with the CGA/CGA diplotype. These individuals carry the wild type sequence in positions 194 and 280, but the variant Gln in position 399.

For OGG1 Ser326Cys, we found a two-fold, significantly increased risk for breast cancer in postmenopausal women with the Cys/Cys genotype. This is in contrast to three studies among Caucasians [20, 32, 33] and two studies among Asians [34, 35], which did not find an association of this polymorphism with breast cancer risk either in the overall population or in the subgroup analysis according to menopausal status. There is, however, strong evidence that risk factors differ in pre- and post-menopausal women, and it is plausible that factors interacting with this polymorphism in determining breast cancer risk also have a different role in the subgroups [2]. The increased risk we observed for the OGG1 Cys/Cys genotype is supported by detailed biochemical studies with the two variant enzymes. These studies found that the 326Cys-Ogg1 protein has a sevenfold lower activity for repairing 8-oxoguanine, altered substrate specificity and anomalous DNA binding conformation [36–38]. These functional alterations indicate a reduced repair activity for the variant allele and may contribute to the association of the polymorphism with breast cancer risk. Effects might become visible mainly in postmenopausal women, who are older and have lived with the OGG1-related repair insufficiency for a longer time. The differences from other studies may be related to the heterogeneous nature of exposure to breast carcinogens within study populations. However, as our study has no clear data that can be used to assess internal and external exposure to reactive oxygen species, our results need further validation.

To our knowledge, only one recent study examined the role of APEX1 Asp148Glu and risk of breast cancer [20]. This study did not find an association with breast cancer risk among non-Hispanic white Americans. In contrast, a significant protective effect of APEX1 148Glu allele was observed in our Thai study which is the first report on Asian women. Little is known about how APEX1 Asp148Glu affects DNA repair function. Characterization of the isolated variant proteins showed that APEX1 Asp148Glu had no impact on endonuclease and DNA binding activities [39]. However, an increasing number of variant alleles of APEX1 Asp148Glu and XRCC1 Arg399Gln was significantly associated with prolonged cell cycle delay in G2 phase and decreased DNA repair capacity after γ-irradiation [14, 16]. Both conditions impede correct mitosis, and cells might thus favor cell death by apoptosis instead of proliferating with insufficiently repaired genome and potential accumulation of mutations. Therefore, the APEX1 148Glu allele might reduce breast cancer susceptibility as observed in our study. Nevertheless, our result needs further confirmation by additional case-control studies and functional investigations.

An individual’s BER activity is characterized by more than one genetic variant. Although the functional impact of a single gene variant with low penetrance might be very limited, the interaction of several variant proteins with slightly reduced functional activity might sum up to a significant decrease in repair activity and an increased cancer risk. The analysis of the joint effects of the risk variants in the three BER genes XRCC1, APEX1, and OGG1 revealed in fact a nearly two-fold increased risk for breast cancer in individuals homozygous for two or more variant alleles. It would therefore appear reasonable to hypothesize that those BER gene variants may be risk alleles for breast cancer and simultaneously contribute to higher cancer susceptibility. We are nevertheless aware that the polymorphisms studied here still do not give the complete information about variability in gene function. To study the relevance of BER on breast cancer risk in more detail, comprehensive analyses should include polymorphisms in all known BER genes.

Our risk estimates took into consideration several risk factors known from the literature to affect the gene-risk association, including age, reproductive parameters, hormonal use, and hazardous lifestyle [2]. Some of these risk factors were considerably different between cases and controls in our study (Table 2). This is in accordance with previous studies of breast cancer [2], with the exception of pregnancy and education: more cases than controls had children, although pregnancy is considered protective, and controls were more highly educated than cases. As neither criterion was used as an inclusion criterion in patient recruitment, these differences might have happened by chance. They were, however, considered in our multivariate regression analysis as adjustment factors together with other covariates differing in our population or being especially important risk factors. One example is age, which differed between cases and controls in our study. Therefore, we adjusted for this difference by modeling age as an exponential variable in the logistic regression model. The multivariate regression analysis for age only, for the complete model including all covariates, and the interaction analysis revealed that the covariates had only a minor effect on the risk estimates for the polymorphisms. Thus, our results were robust to different model adjustments, indicating that the risk contribution of genetic variants is moderate but not affected by the known strong breast cancer risk factors such as pregnancy, breast feeding and family history of breast cancer.

In conclusion, we have demonstrated that amino acid substitution variants of OGG1, APEX1 and XRCC1 genes, particularly in combination, are associated with increased susceptibility to breast cancer among Thai women. Although our study is larger in sample size than other studies among Asian populations, further epidemiological studies are warranted to confirm the role of genetic variation of BER in breast cancer susceptibility.