Introduction

Breast cancer (BC) is an intricate, multifaceted disease of unclear aetiology. The potential risk of BC can be determined by the extent of DNA damage and genetic instability caused by various environmental factors [1]. However, cells possess a very efficient system that guards the genome against several types of DNA damage and repairs any damage that occurs, through the transcription-coupled repair (TCR), nucleotide excision repair (NER), base excision repair (BER), mismatch repair (MMR) and double-strand DNA break repair systems [2].

NER repairs DNA damage induced by ultraviolet (UV) radiation, intrastrand DNA cross-links, products of organic combustion, oxidative stress and heavy metals. The NER proteins that repair damage from these sources, and thereby maintain genomic integrity and prevent carcinogenesis, include XPA, ERCC1, XPB, XPC, XPD, XPF, XPG and DDB1 [3]. The NER process comprises several defined steps: recognition of DNA damage, demarcation of the damaged DNA, incision of the damaged DNA, repair (patch synthesis) and DNA ligation [4]. The xeroderma pigmentosum complementation group C (XPC) protein forms a complex with HR23B and is responsible for recognizing DNA damage, predominantly cyclobutane pyrimidine dimers formed as a result of exposure to UV radiation. This complex plays a major role in damage recognition: it binds the DNA lesion and unwinds the DNA, making room for XPA to bind the unwound DNA and initiate the whole NER process. Most XPC genetic alterations (95%) are frameshift, deletion, splice-site or nonsense mutations that produce truncated proteins [5, 6]. Numerous NER polymorphisms have been reported in the literature, specifically with reference to their involvement in carcinogenesis [4].

Researchers have demonstrated low NER levels in the peripheral blood lymphocytes of BC cases and their relatives [7]. In addition, some studies have shown an association of NER gene polymorphisms with BC risk in different ethnicities [1, 8,9,10]. However, the results have been inconsistent owing to small sample sizes, lack of clinical details and follow-up, and differing sociocultural and genetic backgrounds [11]. In the Pakistani population, we have previously reported the association of XPG polymorphisms with breast cancer and survival [10], but to date no systematic examination is available of the role of XPC or other NER genes in breast cancer susceptibility. The XPC rs2228001 A > C polymorphism in exon 15 results in the substitution of Gln by Lys at position 939, whereas rs2733532 lies in an intronic region. rs2228001 is one of the most frequently studied XPC polymorphisms, with a high reported global minor allele frequency (MAF) of 0.31, whereas rs2733532 has the second-highest MAF, 0.30 [12]. Therefore, the current study was designed to offer a more comprehensive understanding of the association between XPC polymorphisms (rs2228001–A > C & rs2733532–C > T) and BC carcinogenesis, and to predict possible conformational changes in the protein using different computational tools. In addition, the effect of XPC polymorphisms on overall and progression-free survival among BC cases was explored.

Methodology

Study subjects and ethical considerations

For the case–control study (September 2014 to May 2018), ethical approval was obtained from the Institutional Review Board (IRB) and Ethical Committees of Fatima Jinnah Women University, Rawalpindi, and of hospitals including Benazir Bhutto Shaheed Hospital, District Headquarter Hospital, Holy Family Hospital and Combined Military Hospital, Rawalpindi, and Shifa International Hospital, Islamabad. The study sample size was calculated with the sample size calculator provided by the World Health Organization (WHO) and manually confirmed with the formula \(n = Z_{\alpha/2}^{2}\,\frac{PQ}{\text{MOE}^{2}}\), where \(Z_{\alpha/2}\) is the statistical constant, MOE is the margin of error (relative precision), P is the disease prevalence and Q = 1 − P. Using a disease prevalence of 15% [13] and a 5% MOE, the calculated sample size was 195. Study parameters were designated by following the recommendations for tumor marker prognostic studies [14]. Blood samples were collected from 493 histopathologically diagnosed BC cases and 387 age-matched controls from the same geographical region following written informed consent. Clinical and demographic details were recorded in a detailed questionnaire with the assistance of patients, their attendants, medical records and the respective oncologists.
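As a hedged check of the manual calculation, the formula can be evaluated directly. The sketch below assumes a 95% confidence level (Z = 1.96), which the text does not state explicitly; the function name `sample_size` is illustrative only.

```python
import math

def sample_size(prevalence, margin_of_error, z=1.96):
    """Sample size for estimating a single proportion: n = Z^2 * P * Q / MOE^2.

    z = 1.96 corresponds to 95% confidence (an assumption, not stated in the text).
    """
    q = 1.0 - prevalence  # Q = 1 - P
    return (z ** 2) * prevalence * q / (margin_of_error ** 2)

# Prevalence 15% and margin of error 5%, as given in the text.
n = sample_size(prevalence=0.15, margin_of_error=0.05)
print(round(n, 1))    # about 195.9
print(math.floor(n))  # 195, matching the reported value when truncated
```

Under these assumptions the formula gives roughly 195.9, so the reported figure of 195 corresponds to truncating rather than rounding up.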

XPC genotyping

XPC genotyping was performed as previously described [10] via the tetra-primer amplification refractory mutation system–polymerase chain reaction (Tetra ARMS–PCR). Primers were designed with the program established by Ye et al. using default settings [15]. Reaction conditions included initial denaturation at 94 °C (5 min), followed by 30 cycles of 94 °C (45 s), 74 °C (1 min) and 72 °C (45 s), and a final extension at 72 °C (10 min). Results were validated by repeat analysis of twenty-five randomly selected samples each from cases and controls. PCR products were resolved by agarose gel electrophoresis and recorded with a gel documentation system.

Data set

The SNPs (rs2228001 and rs2733532) were identified in the XPC gene through dbSNP (https://www.ncbi.nlm.nih.gov/projects/SNP) [16]. SNP rs2733532 C > T lies in an intronic region, while the rs2228001 A > C polymorphism lies in exon 15. The primary protein sequence (ID: Q01831) was retrieved from the UniProtKB/Swiss-Prot database (http://www.uniprot.org). 3D structures of XPCWT and XPCQ939K were modelled using Phyre2 [17] and I-TASSER [18]. Energy minimization was carried out in GROMACS 5.1.4 with the Amber force field [19]. The predicted structures were visualized in UCSF Chimera ver. 1.11.2, an extensive program for interactive visualization and analysis of molecular structures and related data (Fig. 2). Structure validation was carried out with the RAMPAGE [20] and ProQ [21] tools, followed by refinement using WinCoot [22]. Structure joining, editing and analysis were carried out in UCSF Chimera ver. 1.11.2 [23], as shown in Fig. 3.

Statistical analysis

For the case–control study, Hardy–Weinberg equilibrium was tested with the χ2 goodness-of-fit test. Distributional differences in clinical features, demographic factors and XPC polymorphisms between BC cases and healthy controls were examined by the chi-square test. The association of XPC polymorphisms (rs2228001–A > C & rs2733532–C > T) with the risk of breast carcinogenesis was assessed by odds ratios (ORs) and 95% confidence intervals (CIs) using a conditional logistic regression model.
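The two basic quantities in this analysis can be illustrated with a minimal sketch. It is not the conditional logistic regression used in the study: the odds ratio here is the simpler unadjusted estimate from a 2 × 2 table with a Wald (Woolf) 95% CI, and all genotype counts in the example are hypothetical, not taken from Table 1. The function names are illustrative.

```python
import math

def hwe_chi_square(n_wt, n_het, n_mut):
    """Chi-square goodness-of-fit statistic for Hardy-Weinberg equilibrium.

    Compare the result with 3.84 (1 degree of freedom, alpha = 0.05).
    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_wt + n_het + n_mut
    p = (2 * n_wt + n_het) / (2 * n)  # frequency of the wild-type allele
    q = 1 - p                         # frequency of the variant allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_wt, n_het, n_mut)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI from a 2 x 2 table.

    a = variant-carrying cases, b = wild-type cases,
    c = variant-carrying controls, d = wild-type controls.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only.
print(hwe_chi_square(n_wt=300, n_het=75, n_mut=12))
print(odds_ratio_ci(a=20, b=80, c=10, d=90))
```

A CI that excludes 1 indicates a statistically significant association at the 5% level; a matched or adjusted analysis such as conditional logistic regression would generally give somewhat different estimates.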

A total of 347 BC cases were included in the survival analysis because they were available for follow-up until the end of the study period. All BC cases were followed up every 6 months via telephone calls or hospital visits to maintain a record of disease progression, patient health status, disease-free survival and deaths. Overall survival (OS) was defined as the time from disease diagnosis (randomization) to death from any cause, whereas progression-free survival (PFS) was the time from diagnosis to last clinical follow-up (45 months for the current study). The Kaplan–Meier method and the log-rank test were applied to estimate survival distributions and to compare them between XPC genotypes, respectively. The association of XPC polymorphisms with OS and PFS in BC was expressed as hazard ratios (HRs) with 95% CIs from a univariate Cox proportional hazards model. The homozygous normal (wild-type) genotype was taken as the reference for both XPC polymorphisms. A P value ≤ 0.05 was considered statistically significant. SPSS version 24.0 and MedCalc version 15.0 were used for the statistical analyses.
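The Kaplan–Meier estimate itself can be sketched in a few lines. This is a minimal product-limit implementation for illustration, not the SPSS or MedCalc procedure used in the study; the follow-up times and event indicators in the example are hypothetical, and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times  : follow-up time per patient (e.g. months)
    events : 1 if the event (death or progression) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        at_risk -= n_removed
    return curve

# Hypothetical follow-up data (months); 45 months is the last follow-up point.
curve = kaplan_meier(times=[6, 12, 12, 18, 45], events=[1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 2))
```

Censored patients contribute to the risk set up to their last follow-up but never lower the curve, which is why patients still alive at 45 months do not produce a step.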

Results

Demographic and clinical details

The mean age of BC cases and controls was 50.4 ± 14.1 years and 46.5 ± 14.9 years, respectively. No significant differences in age or menarche (OR = 0.88; CI 0.7–1.07) were observed between BC cases and controls, whereas early menopause was identified as a risk factor (OR = 3.3; CI 2.2–4.8) for BC development. The mean number of live births and the mean age at first pregnancy were 4.7 ± 1.8 and 20.8 ± 3.4 years, respectively. Only 3% of BC cases were nulliparous, and 3.4% of parous women had no history of lactation. Obesity, consanguinity, positive marital status and family history of BC were significantly associated (P ≤ 0.01) with BC risk. In the current study, most BC cases were diagnosed with invasive ductal carcinoma (76%), whereas 24% had invasive lobular carcinoma, making invasive ductal carcinoma the most common BC type. None of the cases had a grade-I tumor, whereas 66% had a grade-II tumor. In 29% of BC cases, the tumor had metastasized to different body parts.

XPC genotyping and clinical details

Genotyping revealed that rs2228001 (A > C) had 57 heterozygous (C/A) and 13 homozygous (A/A) polymorphisms, and the heterozygous (C/A) polymorphism was significantly associated with BC risk, as shown in Fig. 1a. The XPC polymorphism rs2733532 (C > T) was also significantly associated with BC development (Fig. 1b), with 58 heterozygous (C/T) and 11 homozygous (T/T) alterations (Table 1). Additionally, most of the cases with XPC polymorphisms were married, had consanguinity and had invasive ductal carcinoma with a grade II tumor. Family history of cancer was found to be a pivotal factor in breast cancer, and almost 42% of BC cases with XPC polymorphisms had a family history of cancer. Among cases with the rs2228001 polymorphism, 69% showed lymph node positivity and 34% had metastasis to different body organs; among cases with the rs2733532 polymorphism, 76% had positive lymph nodes and 32% had metastasis. For the rs2733532 polymorphism, significant associations were detected with age and delayed menopause (P ≤ 0.01). The results also showed a statistically significant association between positive family history of cancer and higher tumor grade (P = 0.02) in BC cases with the rs2733532 polymorphism. Taken together, the results suggest that XPC polymorphisms are linked with predisposition to BC, and that lifestyle and clinical factors contribute to disease severity and progression.

Fig. 1

Electropherograms of Tetra–ARMS PCR amplified products. a XPC rs2228001 (A > C): C/C, homozygous wild type; C/A, heterozygous mutant; A/A, homozygous mutant. b XPC rs2733532 (C > T): C/C, homozygous wild type; C/T, heterozygous mutant; T/T, homozygous mutant. C, breast cancer; N, control; numerals represent the number of cases and controls

Table 1 Association between XPC variant and breast cancer risk

Structural analysis

Human XPC comprises four domains: a transglutaminase-like domain and β-hairpin domains 1, 2 and 3 (Fig. 2) [24]. Ramachandran analysis of XPCWT and XPCQ939K indicated that 96.8% and 92.3% of residues, respectively, lay in sterically allowed regions (Fig. S1, Table 2). Superimposition of the XPCWT and XPCQ939K structures gave an RMSD of 2.1 Å, indicating a significant change at the structural level. Compared with XPCWT, the predominant conformational changes were observed in the C-terminal region of XPCQ939K (Fig. 3), particularly in the α-helical pattern. Two additional α-helices (L845-K849 and E859-R864) were observed in XPCQ939K, resulting in a shift of the T923-A930 helix. Another significant change was observed in the helical region of XPCQ939K encompassing residues P238-R307 in the vicinity of the C-terminus, and the helix at H487-H491 was converted into a loop (Fig. 3).
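For readers unfamiliar with the RMSD measure, the following minimal sketch shows how it is computed once two structures have been superimposed. It assumes the coordinates are already optimally aligned and paired atom by atom (for example Cα atoms); it does not reproduce the superposition step performed in UCSF Chimera, and the coordinates in the example are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired, pre-superimposed
    coordinate sets, each a list of (x, y, z) tuples in angstroms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates: every atom displaced by 2 angstroms along z.
wild_type = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
variant = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(wild_type, variant))  # 2.0
```

Because RMSD averages over all paired atoms, a localized rearrangement such as the C-terminal changes described above can yield a moderate global value even when most of the fold is unchanged.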

Fig. 2

Structure of XPC protein. a Domain organization in XPC. b Ribbon diagram of human XPC. Transglutaminase-like domain (TGD) is indicated in spring green color and three β-hairpin domains (BHD 1–3) are shown in purple, yellow and blue colors, respectively

Table 2 Structural evaluation of XPCWT and XPCQ939K
Fig. 3

Comparative 3D and 2D structures of XPCWT and XPCQ939K. Ribbon diagrams of a XPCWT and b XPCQ939K. 2D structures of c XPCWT and d XPCQ939K. The regions exhibiting conformational changes are shown in dotted boxes

XPC genotyping, survival analysis and breast cancer prognosis

The relationship of XPC polymorphisms with OS and PFS among BC cases was assessed using Kaplan–Meier survival curves (Fig. 4) and the Cox proportional hazards model. Even though the heterozygous genotypes of both polymorphisms (rs2733532 and rs2228001) were associated with BC risk, they showed no association (P > 0.05) with OS or PFS (Table 3). Although noticeable differences in PFS were observed between the wild-type and altered genotypes of rs2733532, they were not statistically significant.

Fig. 4

Kaplan–Meier estimates representing association of XPC polymorphisms with OS and PFS in BC cases. a, b No association of rs2228001 polymorphism with PFS and OS among BC cases. c, d No association of rs2733532 polymorphism with PFS and OS among BC cases, respectively

Table 3 Association between XPC variants and breast cancer prognosis using Kaplan–Meier survival analysis and Cox proportional hazards model

Discussion

Breast cancer is not only the most frequently identified malignancy among females, but also a leading cause of cancer-related deaths around the globe [25]. In the present study, the mean age of BC cases was 50.4 years, with a range of 18 to 85 years, illustrating that BC incidence increases with age [26, 27]. Menarche and menopause are the hallmarks of the onset and offset of the ovarian and endocrine activity associated with reproduction. The current study demonstrated a roughly threefold higher BC risk (OR = 3.3) among females with early menopause, consistent with earlier reports [28,29,30,31]. The role of menarche remains inconsistent; in contrast to a previous finding [32], we did not observe any association of menarche with BC risk. Rather, consanguinity and familial cancer history were BC-associated risk factors, as described elsewhere [10, 31, 33, 34]. Positive marital status had a substantial impact on BC risk; married females tend to have better health than unmarried ones, conceivably owing to reproductive and psychosocial factors, as demonstrated in the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) 18 regions database [35].

Several studies have presented contradictory results concerning the association of XPC polymorphisms with cancer risk. Individuals carrying XPC polymorphisms have been associated with a 1000-fold higher risk of skin carcinoma [36]. XPC and XPG act as cofactors that stimulate the activity of OGG1, a DNA glycosylase, and the efficient functioning of NTH1, whereas XPA supports the repair of oxidized bases [37]. The current case–control association study showed a statistically significant association of two XPC polymorphisms (rs2228001 & rs2733532) with BC risk. The XPC rs2228001 polymorphism results in the substitution of glutamine by lysine at position 939, while the other polymorphism, rs2733532, lies in an intronic region. These polymorphisms were selected because of their high MAF values and the lack of data on their association with BC in the Pakistani population. Thus, the XPC codon 939 polymorphic locus (rs2228001) may be considered a potential marker for BC diagnosis. By contrast, a meta-analysis detected no association between the rs2228001 polymorphism and BC risk [38]. Because long-term follow-up is required for a reliable assessment of OS, PFS is a better option under such conditions [39]. In the present study, OS and PFS analysis of 347 BC cases showed no association with either XPC polymorphism. Although an evident difference in PFS was found between the wild-type and altered genotypes of rs2733532, the outcomes were not statistically significant, possibly because of the short follow-up time and limited study size. The visible differences between the outcomes of the current and prior studies may be due to diverse genetic backgrounds and sample population types. Exposure to dissimilar sociocultural backgrounds and environmental and lifestyle factors, together with varying study sample sizes, may explain the conflicting results [11, 12].

Recent studies suggest that the XPC 939 substitution may be associated with a higher frequency of p53 mutations [40] and with aflatoxin B1 (AFB1)-induced hepatocellular carcinoma [41], as XPC plays a major role in reducing AFB1-induced toxic effects. Intriguingly, in XPCQ939K, the presence of two additional α-helices at positions L845-K849 and E859-R864 induced conformational switching in the symmetry of the helix (residues T923-A930) at the C-terminus. Another significant change was observed in the helical region of XPCQ939K encompassing residues P238-R307 in the vicinity of the C-terminus, which may contribute to its abnormal activity. The XPC C-terminus is crucial for TFIIH recruitment and for binding to hHR23B and damaged DNA. Stable binding of XPC to the NER protein hHR23B enables recognition of DNA damage sites formed by exogenous carcinogens such as AFB1. Despite having normal NER activity, XPCQ939K is associated with higher cancer susceptibility [42] and is linked with abnormal p53 degradation owing to disruption of its interaction with the Mdm2 ubiquitin ligase [43]. These findings add substantial evidence for a non-NER activity of XPCQ939K, possibly via impaired p53 proteolysis.

However, the present study has some limitations. It was conducted in different hospitals of Rawalpindi and Islamabad, and collecting samples from representative hospitals of all provinces could give more generalizable results, although individuals from all over the country visit these hospitals because of their better diagnosis and treatment modalities and their location in and around the capital. Genotyping was carried out via ARMS-PCR, whereas sequencing could provide more accurate findings. Additionally, the onset of clinical disease is not solely the manifestation of disruptions in one or a few genes; genetic disturbances are spread across the entire genome and are also affected by various environmental factors. Hence, other genes and polymorphisms could also be involved in breast carcinogenesis.

In conclusion, the present study reveals that XPC polymorphisms (rs2228001–A > C and rs2733532–C > T) are possible risk factors associated with increased BC incidence. The XPC polymorphism rs2228001 may alter the structure and function of the XPC C-terminus, while rs2733532 may have a regulatory role, thereby contributing to potential BC risk. Additional studies with larger sample sizes and longer follow-up may provide better insight in future.