1 Introduction

The XRCC1 (X-ray cross-complementing-1) protein is a key component of the cellular machinery responsible for base excision repair (BER) of DNA damage [1]. XRCC1 acts as a scaffold protein in this repair pathway coordinating the activity of the other components of the BER machinery [1, 2]. Thus, XRCC1 appears to regulate the function of the poly(ADP-ribose) polymerases that catalyze the ribosylation of a number of DNA-bound proteins to decrease their affinity for DNA allowing the repair machinery access to the damaged site (PARP-1 and PARP-2), the specific glycosylases that recognize and remove the damaged base (including hOGG1, human 8-oxoguanine glycosylase, and MPG, methyl purine glycosylase), an apurinic/apyrimidinic endonuclease responsible for cleaving the phosphodiester bond at the abasic site created by the glycosylase (APE1), a DNA polymerase that has deoxyribosephosphodiesterase activity to release the 5′ sugar phosphate group and gap-filling synthesis activity to add one nucleotide to the 3′-OH (Pol β), and a DNA ligase that seals the nick in an ATP-dependent fashion (Lig III) [1, 2]. XRCC1 coordinates the activity of these other proteins through three protein interaction modules that constitute independent and distinct globular domains of the protein, including an N-terminal domain (NTD), a central BRCA1 carboxy terminal domain (BRCT1), and a C-terminal BRCT domain (BRCT2) [3]. The two intervening linker regions between these globular domains have been denominated hinges and have been predicted to be unstructured [3].

The NTD is the site of interaction with Pol β [4], and BRCT2 is the site of interaction with Lig III [5]. The BRCT1 domain (approximately residues 315–407) is believed to be responsible for XRCC1 interaction with PARPs and the BER glycosylase, decreasing the activity of the former and increasing the activity of the latter, and thus is a critical domain for proper functioning of the repair process [6, 7]. Similar BRCT domains have been found in a number of other important cancer-related proteins, such as BRCA1, BRCA2, 53BP1, REV1, RAP1, RAD9, RAD4, Ect2, and Crb2, where they are believed to likewise be involved in critical protein–protein interactions [8]. The BRCT1 domain of XRCC1 is also the site of a very common single nucleotide polymorphism that results in the substitution of a glutamine (Gln) for the normally occurring arginine (Arg) at amino acid residue 399 [9]. Although the data to date have been somewhat inconsistent, some evidence from both experimental and epidemiologic studies suggests that this polymorphism in XRCC1 may alter DNA repair capabilities and cancer risk [10]. Thus, it is theoretically possible that this polymorphism is responsible for altering the structure of the BRCT1 domain and hence the function of the XRCC1 protein by disrupting critical protein–protein interactions and its coordination of BER. The purpose of the present study was to investigate this possibility by determining if there are differences in the structure of the BRCT1 domain with Arg and Gln at position 399 using molecular dynamics techniques.

2 Materials and Methods

The BRCT2 domain of XRCC1 (amino acid residues 538–633) is highly homologous to the BRCT1 domain (68% homologous; 20% identical), and its X-ray crystallographic structure has been determined [11]. Therefore, we used this known BRCT2 structure as the starting point for determining the structure of the wild-type BRCT1 domain with Arg at amino acid residue 399 via an adaptation of a molecular dynamics approach as previously described for other protein structural determinations, which has been shown to yield results that are consistent with experimental data [12, 13].

First, the sequence of the BRCT1 domain from amino acid residues 315–407 with Arg at 399 was threaded onto the coordinates of the BRCT2 domain using commercially available software (Sybyl; Tripos, St. Louis, MO), with the ends of the peptide neutrally blocked. This alignment was done by direct substitution at homologous amino acid residues, which left two gaps because the BRCT2 domain contains three additional amino acid residues (at 553 and 598–599) that have no analogous residues in the BRCT1 sequence; however, these additional residues occur in random coil loop regions that connect β sheets and α helices and thus can be deleted without expecting to produce any significant conformational effects. The resulting gaps were closed by making direct peptide bonds between the adjacent amino acid residues in the BRCT1 sequence. Next, each BRCT1 substituted residue in the BRCT2 structure was relaxed and minimized using Sybyl’s Structure Preparation Tool, which allowed steric clashes to be removed. The BRCT1 domain has an overall charge of +9, which was neutralized by the random placement of Cl⁻ counterions. These counterions were allowed to relax individually to correspond to physico-chemically reasonable placement. The molecule was then immersed in a water box of 7,507 water molecules, and the water molecules were also allowed to relax while the BRCT1 domain was held restrained.
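The residue bookkeeping behind this threading step can be checked with a short sketch. The residue ranges and the three deleted BRCT2 positions come from the text; the function name and the simple position-by-position pairing of the retained residues are illustrative assumptions and are not taken from the Sybyl workflow itself.

```python
# Residue ranges and deleted positions are taken from the text; the pairing
# rule (retained template residues matched in order) is an assumption.
BRCT2_RANGE = (538, 633)          # template residues, inclusive
BRCT1_RANGE = (315, 407)          # target residues, inclusive
BRCT2_DELETED = {553, 598, 599}   # template residues with no BRCT1 counterpart


def build_threading_map(template_range, target_range, deleted):
    """Pair each retained template residue number with a target residue number."""
    t_start, t_end = template_range
    retained = [r for r in range(t_start, t_end + 1) if r not in deleted]
    q_start, q_end = target_range
    target = list(range(q_start, q_end + 1))
    if len(retained) != len(target):
        raise ValueError("template and target lengths differ after deletions")
    return dict(zip(retained, target))


threading_map = build_threading_map(BRCT2_RANGE, BRCT1_RANGE, BRCT2_DELETED)
print(len(threading_map))                        # number of threaded positions
print(threading_map[538], threading_map[633])    # first and last pairings
```

Under this assumption, the 96 BRCT2 residues minus the three deleted positions give exactly the 93 positions of the BRCT1 sequence, which is why closing the two gaps with direct peptide bonds leaves a continuous chain.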

A series of nested energy minimizations was then performed on this complex of the BRCT1 molecule, counterions, and water, resulting in an overall minimized structure for the wild-type BRCT1 domain. After the overall structure was minimized to a root-mean-square (RMS) gradient of 0.001 Å, the dynamics runs were carried out. The BRCT1 complex was heated to 300 K over 2 ps and then allowed to equilibrate over the balance of the dynamics run. The energies, volume, and density of the completed dynamics run were examined to ensure that the results were physically reasonable. The final average wild-type BRCT1 structure was determined from the last 25–50 ps of the run and was calculated from the frames along the part of the dynamics simulation that had equilibrated.
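As a rough illustration of the averaging step, the sketch below computes a mean structure and the per-atom RMS coordinate fluctuation over a set of equilibrated frames. It assumes the frames have already been extracted from the trajectory and aligned to a common reference. The function names and the toy coordinates are hypothetical and do not reproduce the software used in this study.

```python
import math


def average_structure(frames):
    """Mean position of each atom over a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom, in the same order.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    return [
        tuple(sum(frame[i][k] for frame in frames) / n_frames for k in range(3))
        for i in range(n_atoms)
    ]


def rms_fluctuations(frames, mean):
    """Root-mean-square displacement of each atom from its mean position."""
    n_frames = len(frames)
    result = []
    for i, centre in enumerate(mean):
        sq = sum(
            sum((frame[i][k] - centre[k]) ** 2 for k in range(3))
            for frame in frames
        )
        result.append(math.sqrt(sq / n_frames))
    return result


# Toy trajectory (made-up values): two atoms, three frames, coordinates in Å.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.3, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(-0.3, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
mean_coords = average_structure(toy_frames)
fluct = rms_fluctuations(toy_frames, mean_coords)
# Hypothetical convergence check using a 1 Å cut-off on every fluctuation.
converged = all(f < 1.0 for f in fluct)
```

Small fluctuations about the mean across the equilibrated frames are what would indicate that the run has settled on a single structure.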

Next, the final average wild-type BRCT1 structure was used as the starting point for determining the structure of the polymorphic BRCT1 domain with Gln substituted for the normally occurring Arg at amino acid residue 399. Then the same procedures as above were repeated to obtain the final average polymorphic BRCT1 structure. Finally, the average structure for the polymorphic protein was superimposed on that for the wild-type protein such that the RMS deviation of the coordinates of the backbone atoms of one structure from the other was a minimum. The average RMS deviation between the polymorphic and wild-type proteins was determined for the structures as a whole, as well as for each amino acid residue individually to identify isolated regions with the most significant conformational changes.
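The deviation calculation can be sketched as follows. This is a minimal example that assumes the two average structures have already been superimposed by a least-squares fit, so it covers only the overall and per-residue RMS deviations and not the fitting itself. The data layout (residue number mapped to a list of backbone atom coordinates), the function names, and the toy coordinates are assumptions made for illustration.

```python
import math


def per_residue_rmsd(ref, mob):
    """RMS deviation per residue between two already superimposed structures.

    ref and mob map residue number -> list of (x, y, z) backbone coordinates,
    with atoms listed in the same order in both structures.
    """
    out = {}
    for resnum, ref_atoms in ref.items():
        mob_atoms = mob[resnum]
        sq = sum(math.dist(a, b) ** 2 for a, b in zip(ref_atoms, mob_atoms))
        out[resnum] = math.sqrt(sq / len(ref_atoms))
    return out


def overall_rmsd(ref, mob):
    """RMS deviation over all backbone atoms of all residues."""
    sq = 0.0
    n = 0
    for resnum, ref_atoms in ref.items():
        for a, b in zip(ref_atoms, mob[resnum]):
            sq += math.dist(a, b) ** 2
            n += 1
    return math.sqrt(sq / n)


# Toy input (made-up coordinates in Å): two residues, two atoms each.
wild_type = {398: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
             399: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]}
variant = {398: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
           399: [(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]}

by_residue = per_residue_rmsd(wild_type, variant)
total = overall_rmsd(wild_type, variant)
```

Reporting both quantities is useful because a moderate overall value can hide large deviations that are confined to a few residues.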

3 Results

As shown in Figs. 1 and 2, coordinate fluctuations for all residues in both the wild-type and polymorphic forms of the protein were less than 1 Å (in most cases much less), confirming that the proteins converge on a specific structure in each case.

Fig. 1

Coordinate fluctuations for the final low-energy structures on the dynamics trajectory of the wild-type BRCT1 domain of XRCC1. Scale is in Angstroms

Fig. 2

Coordinate fluctuations for the final low-energy structures on the dynamics trajectory of the polymorphic BRCT1 domain of XRCC1. Scale is in Angstroms

The overall RMS deviation for the average structures of the wild-type and polymorphic proteins was moderately large at 4.95 Å, even though the general configuration of the wild-type BRCT1 domain is retained in the polymorphic form. The individual residue RMS deviations for the average structures are shown in Fig. 3, demonstrating even larger deviations (>5 Å at multiple residues) at discrete regions throughout the BRCT1 domain. The most prominent of these deviations occur at Leu320-Gly322, Ser328-Gln331, Arg335-Ser336, Arg339-Ala342, Asp352-Thr354, Trp385-Val386, Arg391-Leu395, and Met403-Gly407. Interestingly, at the site of the polymorphism at residue 399, although there is some loss of secondary structure in the polymorphic form (see below), there is minimal deviation (1.94 Å) between the two structures; however, there are regions of considerable deviation N-terminal and C-terminal to this site.
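One way to turn a per-residue deviation profile into a list of discrete regions like the one above is to group consecutive residues that exceed a cut-off. The sketch below is a hypothetical helper, not the procedure used in this study, and the deviation values in the example are made up for illustration.

```python
def contiguous_regions(deviations, threshold):
    """Return (first, last) residue ranges whose deviation exceeds threshold.

    deviations maps residue number -> RMS deviation in Å.
    """
    regions = []
    start = prev = None
    for resnum in sorted(deviations):
        if deviations[resnum] > threshold:
            if start is None:
                start = resnum                  # open a new region
            elif resnum != prev + 1:
                regions.append((start, prev))   # numbering gap closes the region
                start = resnum
            prev = resnum
        elif start is not None:
            regions.append((start, prev))       # value at or below cut-off closes it
            start = prev = None
    if start is not None:
        regions.append((start, prev))
    return regions


# Made-up deviation profile (Å) for a handful of residues.
toy_profile = {390: 3.1, 391: 5.4, 392: 6.0, 393: 5.8, 394: 5.1, 395: 5.6,
               396: 2.2, 399: 1.9, 403: 5.3, 404: 6.1}
print(contiguous_regions(toy_profile, threshold=5.0))
```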

Fig. 3

Individual residue backbone deviations for the coordinates of corresponding amino acid residues 315–407 of the average structure of the polymorphic BRCT1 domain from those of the wild-type domain average structure. Scale is in Angstroms

Perhaps even more significant are the differences in secondary structure between the wild-type and polymorphic proteins, which can be seen in Figs. 4–6. Figure 4 shows the average backbone structure for the wild-type protein, Fig. 5 shows the average backbone structure for the polymorphic protein, and Fig. 6 shows the best-fit superposition of the two average structures. The wild-type protein can be seen to have four α helices and three β sheets (Fig. 4). However, in the polymorphic protein two extended α helices from Arg335-Leu345 and from Trp385-Arg391 and a shorter α helix from Ser398-Arg400 are no longer present; similarly, the β sheets from Val323-Leu327 and from Ala347-Tyr349 are also missing in the polymorphic protein (Fig. 5). In particular, the α helices, which in many cases are known to be involved in protein interactions, occur on the surface of the BRCT1 domain where they could easily participate in such interactions, so these changes could have significant functional implications.

Fig. 4

Cα tracing of the average structure of the wild-type BRCT1 domain of XRCC1 from molecular dynamics simulations with α helices indicated by broad blue bands and β sheets by broad green arrows. The N-terminus is to the lower left and the C-terminus to the upper right

Fig. 5

Cα tracing of the average structure of the polymorphic BRCT1 domain of XRCC1 from molecular dynamics simulations with α helices indicated by broad red bands and β sheets by broad blue arrows. The N-terminus is to the lower left and the C-terminus to the upper right

Fig. 6

Superposition of the Cα tracings of the average structures of the wild-type (yellow) and polymorphic (gray) forms of the BRCT1 domain of XRCC1. The site of phosphorylation at Ser371 and of the polymorphic substitution at Arg/Gln399 are indicated. The N-termini are to the lower left and the C-termini to the upper right

4 Discussion

These results suggest that the polymorphic substitution of Gln for Arg at amino acid residue 399 in the XRCC1 protein could produce significant conformational changes in the BRCT1 domain. Although the site of the substitution itself does not show much deviation in the two structures, the substitution appears to be responsible for major deviations and secondary structural changes at several other sites in the BRCT1 domain, both adjacent to the site of the substitution, as well as at a distance from it.

The conformational effects noted in this study seem to be consistent with other lines of evidence. For example, it is known that DNA-PK, a member of the phosphatidylinositol 3-kinase-related kinase superfamily, phosphorylates the BRCT1 domain of XRCC1 at Ser371 following exposure to ionizing radiation, causing XRCC1 dimer dissociation [14]. The presence of the polymorphic Gln399 variant does not affect this phosphorylation [14], suggesting that this site is not significantly changed in conformation from that of the wild-type protein. Ser371 is in the center of the one α helix that occurs in both the wild-type and polymorphic forms of the protein from Lys369-Gln372, retaining its relative orientation with respect to the rest of the BRCT1 domain. Furthermore, our results show that the RMS deviation at Ser371 between the wild-type and polymorphic proteins is only a moderate 3.76 Å and that the site remains exposed on the surface of the BRCT1 domain easily accessible for phosphorylation in both forms (Fig. 6).

Data from our own prior studies suggest that the Gln399 polymorphism leads to a reduction in BER capability both in vitro and in vivo [15, 16]. For example, we have studied a model population of workers exposed to the known mutagen/carcinogen vinyl chloride and the effect of this XRCC1 polymorphism on the occurrence of biomarkers of mutations in this cohort. Vinyl chloride is known to be metabolized to the reactive intermediates chloroethylene oxide and chloroacetaldehyde which form promutagenic etheno-DNA adducts. The resultant etheno-adenine DNA adduct is believed to be responsible for the production of A → T transversions in the TP53 tumor suppressor gene that occur in workers with the sentinel neoplasm for vinyl chloride exposure, angiosarcoma of the liver, and the biomarkers for these mutations that occur in exposed workers without tumors [17]. These mutant p53 biomarkers occur in a statistically significant dose-response relationship with regard to cumulative vinyl chloride exposure, but at any given exposure level there are seemingly otherwise similar individuals who differ in the occurrence of the biomarkers, suggesting that some genetically determined susceptibility could account for different outcomes despite similar exposures. The presence of the Gln399 polymorphism in XRCC1 appears to explain much of this differential susceptibility. For instance, vinyl chloride-exposed workers who were homozygous for the Gln399 polymorphism were found to have a 4-fold increased risk for the occurrence of the mutant p53 biomarkers even after controlling for potential confounders including cumulative vinyl chloride exposure [15, 16]. In addition, we studied the effect in cell culture of chloroacetaldehyde exposure on the formation of etheno-adenine adducts in lymphoblast lines from individuals who were homozygous wild-type or homozygous variant for the XRCC1 399 polymorphism. 
The efficiency of repair of the adducts in the homozygous wild-type cells was four times greater than the efficiency of repair in the homozygous variant cells, directly analogous to the epidemiologic results in the exposed workers [16]. These findings are consistent with the fact that the repair of the etheno-adenine adducts produced by vinyl chloride occurs by BER via MPG and the other XRCC1-coordinated repair machinery. As noted, an important step in this process requires XRCC1 regulation of PARP-1, which depends on catalyzing its automodification with multiple poly(ADP-ribose) molecules [18]. The BRCT1 domain of XRCC1 contains the PAR-binding consensus sequence from Arg379 to Arg400 [18]. As found in the present study, this is a region of significant RMS deviation and secondary structural change in the polymorphic protein, including the loss of a major α helical segment in the middle of the sequence from Trp385 to Arg391 on the surface of the BRCT1 domain. Therefore, it is entirely plausible that the effects on BER of etheno-adenine adducts that we have observed in vitro and in vivo are at least partially attributable to the conformational alterations in the PAR-binding site, disrupting PARP-1 regulation and the access of the repair machinery to the site of the DNA damage. Similarly, it is possible that the conformational changes noted in other regions of the BRCT1 domain affect the interaction of XRCC1 with different BER proteins, including the glycosylases. For example, some evidence suggests that the XRCC1–hOGG1 interaction involves the BRCT1 domain and the hinge region just N-terminal to it [19], so the conformational changes that we have found in the Gln399 polymorphic protein at the N-terminal end of BRCT1, including the loss of the α helix from Arg335 to Leu345, could conceivably affect the ability of XRCC1 to interact with and regulate hOGG1 activity.
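The overlap between the PAR-binding consensus sequence and the secondary structure elements lost in the polymorphic protein can be made explicit with a simple interval comparison. The residue ranges below are those reported in the text; the variable and function names are illustrative assumptions.

```python
# Residue ranges are taken from the text; names are hypothetical.
PAR_BINDING = (379, 400)  # PAR-binding consensus sequence, Arg379 to Arg400

LOST_SEGMENTS = {
    "beta sheet Val323-Leu327": (323, 327),
    "alpha helix Arg335-Leu345": (335, 345),
    "beta sheet Ala347-Tyr349": (347, 349),
    "alpha helix Trp385-Arg391": (385, 391),
    "alpha helix Ser398-Arg400": (398, 400),
}


def overlap(a, b):
    """Return the shared (first, last) residue range of two inclusive ranges, or None."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


affected = {
    name: overlap(segment, PAR_BINDING)
    for name, segment in LOST_SEGMENTS.items()
    if overlap(segment, PAR_BINDING) is not None
}
for name, shared in affected.items():
    print(name, shared)
```

Of the five lost elements, only the Trp385-Arg391 and Ser398-Arg400 helices fall within the consensus sequence, and both lie entirely inside it.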

In summary, the results of the present study provide support for the hypothesis that the position 399 polymorphism in XRCC1 could influence BER capability by altering the structure of the BRCT1 domain and disrupting its interaction with other components of the repair machinery. Since this is a relatively common polymorphism in many populations [10], these conformational alterations and their functional effects could account for a significant amount of the observed variability in DNA repair and hence cancer risk in these populations.