Main

Alkylation of DNA by endogenous metabolites, environmental toxins and chemotherapeutic agents is a major source of genotoxic damage8. By virtue of their positive charge, N3- and N7-alkylpurines are prone to spontaneous depurination at physiological pH, and both N3-methyladenine (3mA) and apurinic/apyrimidinic (AP) sites interfere with DNA replication and transcription9,10. As the initial step in the base excision repair pathway, DNA glycosylases remove 3mA and other cationic and neutral nucleobases from the genome. Both enzymatic and non-enzymatic depurination of these lesions have been shown to proceed through a stepwise pathway initiated by cleavage of the glycosidic bond, followed by addition of the nucleophilic water to the oxocarbenium (dR+) intermediate11 (Fig. 1a). The resulting AP site is then converted to an undamaged nucleotide by a common set of lesion-independent base excision repair enzymes1,2.

Figure 1: Crystallographic reconstruction of the reaction trajectory for 3mA excision.

a, Proposed reaction scheme illustrating stepwise depurination of 3mA-DNA. b, Corresponding structural analogues used for crystallography. c, AlkD–3d3mA-DNA complex. d, Substrate-like complex containing 3d3mA-DNA. Watson–Crick (56%) and sheared (44%) conformations of the 3d3mA•T base pair are represented with thin and thick bonds, respectively. e, Intermediate-like complex containing 1aR-DNA and 3mA nucleobase. f, Product-like complex containing THF-DNA and 3mA nucleobase. AlkD is coloured blue, the lesion is purple, the opposing thymine is green and the flanking nucleotides are yellow. Charge–dipole and hydrogen-bonding interactions are shown as dashed lines. Annealed omit electron density in d–f is contoured to 2.5σ.

Despite their structural diversity, DNA glycosylases are generally thought to accomplish base flipping through the use of two conserved protein elements: a nucleobase binding pocket and a DNA intercalating residue5. The nucleobase binding pocket provides a means of damage recognition through shape and charge complementarity with the modified base, while also creating a scaffold for the residues that catalyse hydrolysis of the N-glycosidic bond. With few exceptions, an aspartate or glutamate residue has a dual role in stabilizing developing positive charge on the sugar as the glycosidic bond is broken and in deprotonating the nucleophilic water that forms the AP product12. Additional residues in the pocket enhance the leaving-group potential of the base. For neutral alkylpurines, this enhancement comes from a general acid that protonates the nucleobase to provide the same instability inherent to cationic purines, which are in essence pre-activated for depurination11. Outside the binding pocket, DNA intercalating residues stabilize the catalytically active conformation of the DNA by filling the void in the duplex created when the lesion is flipped13.

We previously determined that the alkylpurine DNA glycosylase AlkD from B. cereus14 uniquely recognizes damaged DNA without a nucleobase binding pocket or an intercalating residue15,16. However, the modified nucleotides in these structures were not in contact with the protein, leaving us to only speculate on the catalytic mechanism. To determine how AlkD excises positively charged substrates without a base binding pocket or an intercalating residue, we determined a new AlkD crystal structure with DNA containing 3-deaza-3-methyladenine (3d3mA), a comparatively stable isostere of 3mA (Fig. 1b and Extended Data Table 1). As in the previous AlkD–DNA complexes, the DNA duplex is bound by the concave surface of the protein. The helix is bent by 30° away from the enzyme while the minor groove surrounding the lesion is widened by 4 Å (Fig. 1c). This distortion induces an equilibrium in which the 3d3mA•T base pair is in either a Watson–Crick or highly sheared conformation. Shearing displaces the 3d3mA nucleobase by 4 Å into the minor groove and towards AlkD, but leaves the base partially stacked in the duplex (Extended Data Fig. 1). In this conformation, the deoxyribose of 3d3mA is in contact with three residues (Trp109, Asp113 and Trp187; Fig. 1d), all of which are crucial for lesion excision and are invariant in the AlkD family15,16,17,18. The carboxylate and two indole side chains cradle the backbone of the lesion and the two flanking nucleotides. Asp113 is in line with the N-glycosidic bond of 3d3mA and thus ideally positioned to stabilize developing positive charge on the deoxyribose as the nucleobase is excised (Fig. 1d). This arrangement also allows Asp113 to position and deprotonate the nucleophilic water for subsequent addition to the oxocarbenium intermediate. Similar interactions between catalytic carboxylate residues and the deoxyribose are achieved by other DNA glycosylases, but only after the lesion has been flipped into the nucleobase binding pocket. By contrast, hypothetical rotation of the 3d3mA nucleotide in the AlkD complex would disrupt these catalytic contacts. The lack of base flipping, however, precludes a general acid from gaining access to protonate the nucleobase substrate, which limits AlkD to excision of cationic lesions.

Unlike 3mA, 3d3mA is uncharged and therefore refractory to depurination at physiological pH10. To our surprise, the electron density in the new AlkD–DNA complexes revealed a mixture of intact 3d3mA nucleotide together with AP site and free 3d3mA nucleobase, indicating that cleavage of the N-glycosidic bond had occurred (Fig. 2a and Extended Data Fig. 2). By flash-freezing crystals at various times and determining their structures, we were able to visualize the glycosylase reaction at 1.4–2.0 Å resolution, starting with intact 3d3mA-DNA substrate and ending with AP-DNA product (Fig. 2a and Extended Data Tables 1 and 2). Quantifying the fractional occupancies of substrate and product over the course of the reaction gave a rate constant for in crystallo base excision of 4.6 × 10⁻⁶ s⁻¹ (Fig. 2b). For comparison, cationic 3mA lesions are excised by AlkD at least 800-fold more rapidly19. The unexpected excision of 3d3mA can be explained by pH-dependent protonation at N7, which would confer positive charge on 3d3mA, activating it for excision by AlkD (Fig. 2c). Given the moderately acidic (pH 5.7) crystallization buffer and the calculated pKa (3.8) of 3d3mA, ~1% of the lesion should be protonated, consistent with the slow rate of in crystallo cleavage. By contrast, we did not observe excision of 3d3mA in AlkD–3d3mA-DNA complexes crystallized at pH 7.0 (Extended Data Fig. 3 and Extended Data Table 3). We also did not observe cleavage in crystals grown at pH 5.7 in which AlkD bound 3d3mA-DNA in a non-catalytic orientation that placed the lesion on the opposite face of the duplex and away from the protein16 (Extended Data Fig. 4). Thus, the AlkD–3d3mA-DNA structure presented here represents a bona fide enzyme–substrate complex that enabled visualization of the endpoints of the glycosylase reaction.
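The two numerical arguments in this paragraph can be checked with a short calculation. The sketch below is illustrative only: it assumes the Henderson–Hasselbalch relation for the protonated fraction and a simple first-order decay of substrate occupancy, which may differ from the model actually fitted to the occupancy data. The function names are hypothetical; the pH, pKa and rate constant are the values quoted in the text.

```python
import math


def fraction_protonated(pH: float, pKa: float) -> float:
    """Henderson-Hasselbalch fraction of a base present as its conjugate acid."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def substrate_fraction(k_per_s: float, t_hours: float) -> float:
    """Fraction of substrate remaining after t_hours, assuming first-order decay."""
    return math.exp(-k_per_s * t_hours * 3600.0)


K_CRYSTAL = 4.6e-6  # s^-1, in crystallo rate constant quoted in the text

# Protonated fraction at the two crystallization pH values mentioned in the text
print(f"protonated at pH 5.7: {fraction_protonated(5.7, 3.8):.2%}")
print(f"protonated at pH 7.0: {fraction_protonated(7.0, 3.8):.3%}")

# Half-life implied by the rate constant (assumes first-order kinetics)
print(f"half-life: {math.log(2) / K_CRYSTAL / 3600.0:.1f} h")

# Illustrative time points within the 4-360 h crystal collection window
for t in (4, 24, 96, 360):
    print(f"{t:>3} h: {substrate_fraction(K_CRYSTAL, t):.2f} substrate remaining")
```

Under these assumptions, roughly 1% of the lesion is protonated at pH 5.7 and the half-life is about 42 h, which fits the 4–360 h collection window described in the Methods.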

Figure 2: Crystallographic snapshots of 3d3mA excision by AlkD.

a, Enzymatic conversion of 3d3mA-DNA substrate (purple) to AP-DNA product and 3d3mA nucleobase (pink). The excised 3d3mA nucleobase is rotated by 180° relative to its position in the non-hydrolysed substrate. Annealed omit electron density is contoured to 2.5σ. b, Time course of 3d3mA excision determined from fractional occupancies of substrate and product in the crystal structures. c, Proposed reaction scheme showing protonation and excision of 3d3mA.

We probed the intervening step of the reaction trajectory by determining a structure representing the oxocarbenium intermediate, using DNA containing 1′-aza-2′,4′-dideoxyribose (1aR) and 3mA nucleobase (Fig. 1b and Extended Data Table 3). Relative to the position of the 3d3mA nucleotide, the cationic 1aR moiety is shifted slightly towards the surface of AlkD, which enhances electrostatic interactions with Asp113 and the nucleophilic water (Fig. 1e). These same interactions would stabilize the high-energy oxocarbenium intermediate formed upon cleavage of the glycosidic bond. A nearly identical arrangement is present in the product-like complex containing tetrahydrofuran (THF)-DNA and 3mA nucleobase (Fig. 1b, f and Extended Data Table 3). The only notable exception is a small rotation of the neutral THF ring away from Asp113. In both ternary complexes, the 3mA nucleobase is retained in the DNA duplex and paired with the complementary thymine, maintaining stacking interactions with the flanking bases (Fig. 1e, f). While base stacking is altered upon shearing of the 3d3mA•T base pair, it is never fully disrupted and is completely restored following cleavage of the N-glycosidic bond. In stark contrast to the traditional base-flipping mechanism, there is no evidence from these structures that a void in the duplex is created at any point along the reaction trajectory that would require stabilization by a DNA intercalating residue. Furthermore, these structures indicate that after shearing of the 3d3mA•T base pair, minimal movement of the protein and the DNA is necessary for either bond cleavage or nucleophile addition to occur.

The manner in which Trp109 and Trp187 contribute to binding and catalysis17,18 has so far been unknown. The new AlkD–DNA structures reveal that both residues form CH–π interactions20 with C2′, C4′ and C5′ of the 3d3mA nucleotide (Fig. 3a). To gain insight into how these weakly polar contacts contribute to recognition and excision of 3mA, we used a reductive computational approach with only the three catalytic residues, the nucleophilic water and the lesion. Electrostatic potential calculations showed that substantial cationic character is present on both the 3mA nucleobase and the deoxyribose (Fig. 3b and Extended Data Fig. 5). Importantly, increased positive charge on the deoxyribose correlated with stronger calculated binding energies, suggesting a means by which AlkD might recognize cationic alkylpurine lesions through backbone contacts (Extended Data Fig. 5). This would enhance detection of altered base pairing or base stacking21 without the need to interact with the modified nucleobase directly. The electrostatic potential calculations also revealed that additional positive charge is transferred to the deoxyribose upon elongation of the N-glycosidic bond (Extended Data Fig. 5). Correspondingly, preferential binding and stabilization of this transition-state-like structure resulted in a theoretical 10³–10⁴-fold rate enhancement of glycosidic bond cleavage. Roughly half of this enhancement was attributed to the increasingly polar CH–π interactions provided by Trp109 and Trp187 (Extended Data Fig. 5). While interactions of this type are widespread among proteins and are prevalent in protein–DNA complexes22,23, to our knowledge, this is the first indication that CH–π interactions might function in a catalytic capacity in DNA repair. These interactions are reminiscent of the π–π and cation–π interactions used by base-flipping enzymes to recognize extrahelical nucleobases24,25, but are distinctly different in their involvement of the deoxyribose, and may be fundamental to lesion excision in the absence of base flipping.

Figure 3: 3mA recognition and excision through charge–dipole and CH–π interactions.

a, Simulated AlkD–3mA-DNA complex (stereo image). Charge–dipole and hydrogen-bonding interactions are depicted as black dashed lines. CH–π interactions are shown as purple arrows. b, Electrostatic potential surfaces of catalytic residues and 3mA nucleoside before protein–DNA binding (E + S), in the enzyme–substrate (E–S) complex and in an approximate enzyme–transition state (E–TS) complex. All structures were determined computationally. Potentials were scaled to −0.22–0.55 atomic units on an isodensity surface of 0.05 electrons bohr⁻³.

The unique mechanism reported here explains our previous finding that AlkD, but not human AAG, removes bulky, cationic pyridyloxobutyl adducts16. While extrahelical pyridyloxobutyl adducts are accommodated by the large nucleobase binding pockets of alkyltransferase-like proteins26, the relatively small active site pockets of AAG and other base-flipping DNA glycosylases impose tighter steric limitations. AlkD, however, does not rely on a base-flipping mechanism and therefore is not restricted in this fashion, suggesting it may have a cellular role in the repair of bulky lesions. Specificity for bulky alkyl adducts would explain the coexistence of several alkylpurine glycosylases in numerous bacterial species14. Consistent with this notion, an AlkD orthologue from Streptomyces sp. TP-A0356 has been found to excise 815-dalton (Da) N3-yatakemycyladenine (YTMA) adducts from DNA27 (Fig. 4a). Yatakemycin (YTM) is a minor-groove DNA-alkylating agent belonging to the duocarmycin family of antibiotic and antitumour drugs6. The gene cluster responsible for biosynthesis of YTM encodes the DNA glycosylase YtkR2, which confers resistance to YTM by excising YTMA lesions27.

Figure 4: Excision of N3-yatakemycyladenine by AlkD.

a, YTMA. b, Modelled AlkD–YTMA-DNA complex. c, Solvent-filled cavity between AlkD and 3d3mA-DNA into which YTM was modelled. d, In vitro YTMA excision monitored by separation of full-length YTMA-DNA substrate from cleaved AP-DNA product by denaturing gel electrophoresis. e, Quantification of in vitro excision of 3mA (black) and YTMA (purple) by diverse alkylpurine DNA glycosylases. Error bars denote s.d. from three replicate experiments. f, Growth of wild-type (black) and alkD-knockout (red) strains of B. anthracis in the presence of no drug (control), 2 mM MMS and 10 nM YTM. Error bars denote s.e.m. from three replicate experiments.

On the basis of the sequence similarity between YtkR2 and AlkD, we proposed that AlkD would also have activity for YTMA. There is an extended solvent-filled volume between the protein and the minor groove in the AlkD–3d3mA-DNA complex that can accommodate YTM modelled onto the C3 position of 3d3mA without introducing steric clashes or disrupting catalytic interactions (Fig. 4b, c). Using a standard in vitro DNA glycosylase assay, we found that AlkD excised YTMA from DNA with the same efficiency as YtkR2 (Fig. 4d, e). By contrast, the alkylpurine DNA glycosylases AAG, MAG and AlkA failed to excise YTMA but readily removed 3mA (Fig. 4d, e). To determine the specificity of AlkD for YTMA in cells, we constructed a Bacillus anthracis strain lacking alkD and tested its sensitivity against YTM and methyl methanesulfonate (MMS). MMS primarily produces 3mA and N7-methylguanine adducts that are removed by canonical alkylpurine DNA glycosylases5. In the absence of alkylating agent, deletion of alkD had no effect on the growth of B. anthracis (Fig. 4f and Extended Data Fig. 6). Similarly, alkDΔ cells were no more sensitive to MMS than wild-type cells (Fig. 4f and Extended Data Fig. 6), most likely as a result of activity from the other alkylpurine glycosylases (AAG, AlkA and AlkC) still present in the deletion strain. Conversely, deletion of alkD caused a marked increase in sensitivity to YTM, consistent with AlkD-catalysed excision of YTMA in vivo (Fig. 4f and Extended Data Fig. 6). This suggests that the primary role of AlkD in the cell is to repair larger alkyl adducts such as YTMA that are not normally corrected by the base excision repair pathway.

This work establishes that, contrary to dogma, substrate recognition and catalysis by DNA glycosylases can occur in the absence of base flipping. There is evidence that other enzymes are able to detect DNA damage and even discriminate between different chemical modifications before flipping28,29,30. That AlkD is limited to removing inherently labile cationic lesions strongly suggests that the main function of the nucleobase binding pocket in canonical DNA glycosylases is to increase the leaving-group potential of the nucleobase substrate, as opposed to merely discriminating against non-substrate nucleobases. With this capability, however, comes a limit on the size of the adduct that can be excised. Ultimately, the greatest benefit of a non-base-flipping mechanism may be the ability to repair bulky lesions.

Methods

Oligodeoxynucleotide synthesis

3d3mA phosphoramidite was purchased from Berry and Associates. 1aR phosphoramidite was synthesized with minor modification of a previously described method31. Both were incorporated into oligodeoxynucleotides using standard solid-phase techniques. The resulting products were purified by reverse-phase HPLC and verified by mass spectrometry. All other oligodeoxynucleotides were purchased from Integrated DNA Technologies and used without further purification.

Protein purification

Human (Δ79)AAG32, Saccharomyces cerevisiae MAG30, Escherichia coli AlkA33 and B. cereus AlkD15 were purified as previously described. The gene encoding YtkR2 (GenBank accession IADZ13541) was synthesized by DNA2.0 and ligated into a modified pET27 expression vector encoding a Rhinovirus 3C cleavable hexahistidine tag. YtkR2 overproduction in E. coli HMS174(DE3) cells was induced at 16 °C after addition of 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). Cells were collected from LB medium by centrifugation, resuspended in buffer L (50 mM Tris-HCl, pH 7.5, 500 mM NaCl and 20% (v/v) glycerol) and lysed on ice by gentle sonication. Cleared lysate was applied to a Ni-NTA column equilibrated in buffer L. The column was then washed with buffer L containing 2 mM histidine and eluted with buffer L containing 100 mM EDTA. Pooled fractions were supplemented with 2 mM dithiothreitol (DTT) before overnight cleavage of the affinity tag. Cleaved YtkR2 was diluted tenfold in buffer H (50 mM Tris-HCl, pH 7.5, 20% (v/v) glycerol, 2 mM DTT and 0.1 mM EDTA) before being applied to a heparin Sepharose column equilibrated in buffer H. The column was then washed with buffer H containing 50 mM NaCl and eluted by linearly increasing to buffer H containing 1 M NaCl. Pooled fractions were passed through a Ni-NTA column equilibrated in buffer L to remove trace protein contaminants. The column was subsequently rinsed with buffer L containing 2 mM histidine to elute weakly bound YtkR2. The flow-through and the rinse were then combined, concentrated by ultrafiltration and applied to a Superdex 200 column equilibrated in buffer S (20 mM Bis-tris propane, pH 6.5, 400 mM NaCl, 20% (v/v) glycerol, 2 mM DTT and 0.1 mM EDTA). YtkR2 was eluted with additional buffer S, concentrated to 3 mg ml⁻¹ by ultrafiltration, flash-frozen in liquid nitrogen and stored at −80 °C.

Crystallization of AlkD–DNA

DNA annealing reactions contained 0.54 mM 9-mer A [d(AAGCAXACC)/d(TGGTTTGCT)], 9-mer B [d(AAGCCXCCC)/d(TGGGTGGCT)], or 12-mer [d(CCCGAXAGTCCG)/d(CGGACTTTCGGG)] oligodeoxynucleotides, 10 mM MES, pH 6.5, and 40 mM NaCl. Reactions containing 1aR•T-DNA or THF•T-DNA were supplemented with 27 mM 3mA nucleobase. Strands were annealed by heating to 85 °C and slowly cooling to 20 °C over several hours. AlkD–DNA complexes were formed by mixing 0.45 mM protein and 0.54 mM DNA solutions in equal volumes and incubating at 4 °C for 30 min. Complexes containing 9-mer DNA were crystallized using the sitting drop vapour diffusion method. Drops were prepared from 2 μl of protein–DNA solution (0.22 mM AlkD and 0.27 mM DNA), 2 μl of reservoir solution (22–25% (w/v) PEG 8,000, 50 mM HEPES, pH 7.0, and 50 mM CaCl2), and 1 μl of additive solution (5% (w/v) benzamidine hydrochloride) and equilibrated against an additional 500 μl of reservoir solution at 21 °C. Crystals were collected after 24 h, briefly soaked in reservoir solution supplemented with 15% (v/v) glycerol and flash-cooled in liquid nitrogen. Complexes containing 12-mer DNA were crystallized using the hanging drop vapour diffusion technique. Drops were assembled from 2 μl of protein–DNA solution (0.22 mM AlkD and 0.27 mM DNA), 2 μl of reservoir solution (15–19% (w/v) PEG 4,000, 42 mM sodium acetate, pH 4.6, 85 mM ammonium acetate and 5% (v/v) glycerol) and 1 μl of seed solution (submicroscopic crystals of AlkD–1mA•T-DNA) and equilibrated against an additional 500 μl of reservoir solution at 21 °C. Crystals were collected after 4–360 h, briefly soaked in reservoir solution supplemented with 15% (v/v) glycerol and flash-cooled in liquid nitrogen.

X-ray data collection and structure refinement

X-ray diffraction data were collected at beamlines 21-ID-F and 21-ID-G at the Advanced Photon Source and processed using HKL2000 (ref. 34). Data collection statistics are provided in Extended Data Tables 1, 2, 3. A previously determined model of AlkD (PDB accession 3BVS) was positioned with Phaser35, while DNA was manually built in Coot36, guided by inspection of 2mFo − DFc and mFo − DFc electron density maps. The entirety of the 12-mer oligodeoxynucleotide duplex was readily apparent, as were AlkD residues 1–229 and two non-native residues (−1–0) from the cleaved N-terminal affinity tag, but not the last eight residues (230–237) at the C terminus. As with the 12-mer, the complete 9-mer duplexes were visible in the density maps. However, three residues (52–54) between helix C and helix D and twelve residues (226–237) at the C terminus could not be reliably modelled. Atomic coordinates, temperature factors and fractional occupancies were refined in PHENIX37. The final AlkD–DNA models were validated using MolProbity38 and contained no residues in the disallowed regions of the Ramachandran plot. Refinement and validation statistics are given in Extended Data Tables 1, 2, 3.

All structure images were created in PyMOL (https://www.pymol.org). mFo − DFc omit electron density maps were calculated using PHENIX by removing the lesion and the opposing thymine and then performing simulated annealing on the remaining AlkD–DNA complex to minimize model bias. Maps were carved around the omitted atoms with a 1.5–2.0 Å radius and contoured to 2.5σ. YTM was manually docked in the AlkD–3d3mA-DNA complex after defining the cavity between the protein and the DNA with Hollow39. Simulated structures shown in Fig. 3a and Extended Data Fig. 1c were generated from AlkD–3d3mA-DNA crystal structures by manually removing or transmuting atoms and without performing subsequent computational optimization.

Preparation of YTMA-DNA

YTM was purified from Streptomyces sp. TP-A0356 culture as previously described6. Adduction reactions containing 150 μM YTM, 10 μM fluorescein (FAM)-labelled DNA (FAM-d(CGGGCGGCGGCAAAGGGCGCGGGCC)/d(GGCCCGCGCCCTTTGCCGCCGCCCG); underline denotes nucleotide modified by YTM), 10 mM MES, pH 6.5, 40 mM NaCl and 10% (v/v) DMSO were incubated at 25 °C for 24 h and periodically mixed by inversion. YTM-modified DNA was then separated from free YTM by passage through a G25 size exclusion column equilibrated in 10 mM MES, pH 6.5, and 40 mM NaCl and stored at −80 °C.

Quantification of base excision

Reaction mixtures containing 5 μM enzyme, 10 μg of methylated calf thymus DNA or 100 nM YTM-modified oligodeoxynucleotide duplex, 50 mM Bis-tris propane, pH 6.5, 100 mM NaCl, 5% (v/v) glycerol, 2 mM DTT, 0.1 mM EDTA and 0.1 mg ml⁻¹ BSA were incubated at 25 °C for 6 h. Excision of 3mA was quantified by measuring cleaved nucleobase using a previously described HPLC–MS/MS method19. Excision of YTMA was quantified by separating the 25-mer substrate from the 12-mer product using standard electrophoretic techniques19. Strand breakage of 25-mer abasic oligodeoxynucleotide was induced by heating the reaction mixture at 70 °C for 30 min in the presence of 0.2 M NaOH.

Preparation of alkDΔ cells

The 1-kilobase flanking regions surrounding alkD were inserted into the knockout-plasmid pLM4 using standard molecular biology techniques40. The modified plasmid was propagated in E. coli K1077 before introduction by electroporation into B. anthracis Sterne cells41,42. B. anthracis colonies containing the plasmid were grown on LB plates supplemented with 20 μg ml⁻¹ kanamycin at 42.5 °C for 1–2 days to generate merodiploids with plasmid DNA integrated into their chromosomal DNA. Merodiploids were then grown in LB medium for 1 day at 30 °C to facilitate elimination of redundant DNA from their genomes. Cultures were serially diluted and grown on LB plates without kanamycin for 1 day at 30 °C so that colonies lacking alkD could be identified by PCR screening.

Determination of YTM and MMS resistance

Overnight cultures of B. anthracis Sterne (wild-type and alkDΔ) grown at 30 °C were diluted 1:100 in 100 μl of LB medium in the presence or absence of 2–40 nM YTM or 1–10 mM MMS in a 96-well flat-bottom plate. The plate was incubated at 30 °C with shaking for 20 h, and cell density was measured at 600 nm every hour using a Synergy 2 multi-detector microplate reader. Before each measurement, the plate was gently vortexed to ensure full resuspension of sedimented cells. Experiments were performed in triplicate.

Overnight cultures of B. anthracis Sterne (wild-type and alkDΔ) grown at 30 °C were diluted 1:100 in 5 ml of LB medium and incubated at 30 °C with shaking until early logarithmic phase. Culture aliquots (5 μl) were then tenfold serially diluted (1:10⁻¹–1:10⁻⁵) and spotted on LB plates prepared with or without 2–40 nM YTM or 1–10 mM MMS. Plates were incubated at 37 °C and imaged after 2 days. Experiments were performed in duplicate.

Calculation overview

All calculations were performed using Gaussian 09 (http://www.gaussian.com/g_prod/g09.htm). Atomic coordinates for computationally optimized structures are provided in Supplementary Data 1.

pKa calculation

The aqueous pKa microacidity constant of N7-protonated 3-deaza-3,9-dimethyladenine (3d3m9mA) was calculated using the isodesmic reaction method43 with N7-protonated 9-methyladenine (9mA) used as the reference acid (aqueous pKa = 2.96)44. Neutral and N7-protonated 9mA and 3d3m9mA were optimized to minima using Truhlar’s M06-2X density functional45 with the split-valence 6–311+G(2d,p) basis set and the SMD solvation model46. Optimized structures were confirmed minima by vibrational frequency analyses. Single point evaluations were carried out on the optimized structures in the presence of implicit solvent (water) at the M06-2X/aug-cc-pVTZ level. Thermal corrections to Gibbs free energies were taken from the frequency calculations and applied to electronic energies determined at the M06-2X/aug-cc-pVTZ level. Proton exchange energetics (ΔGexchg) between 9mA and 3d3m9mA were then determined using the corrected electronic energies.
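As a minimal sketch of the final step, the relation below converts a proton exchange free energy into a pKa relative to the reference acid. It assumes the exchange reaction is written as 3d3m9mA-H⁺ + 9mA → 3d3m9mA + 9mA-H⁺ and a temperature of 298.15 K; neither convention is stated explicitly in the text. The function name is hypothetical, and the example ΔGexchg value is back-calculated from the reported pKa values purely for illustration; it is not a number taken from the calculations.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def pka_isodesmic(delta_g_exchange: float, pka_ref: float,
                  temperature: float = 298.15) -> float:
    """pKa of the target acid from the free energy (kcal/mol) of proton
    transfer from the target acid to the conjugate base of the reference."""
    return pka_ref + delta_g_exchange / (R_KCAL * temperature * math.log(10))


PKA_REF_9MA = 2.96     # aqueous pKa of N7-protonated 9mA (from the text)
DELTA_G_EXAMPLE = 1.15  # kcal/mol, hypothetical value chosen for illustration

print(f"pKa = {pka_isodesmic(DELTA_G_EXAMPLE, PKA_REF_9MA):.2f}")
```

With this sign convention, a positive ΔGexchg means the target holds its proton more tightly than the reference and therefore has the higher pKa.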

Constrained optimization of AlkD–lesion structures

The crystal structure of AlkD bound to 3d3mA-DNA served as the starting point for all computational structural optimizations. Atomic coordinates were extracted for AlkD residues Trp109, Asp113 and Trp187 as well as HOH308 and 3d3mA. Hydrogen atoms were added manually to satisfy all valences, and Asp113 was assumed to be in carboxylate form. To reduce computational cost during structural optimizations and subsequent calculations, protein residues were truncated to their respective side chains (β-carbons as methyl groups), and 3d3mA was truncated to the corresponding 2′-deoxynucleoside. The starting point for the structure of 3mA was generated by replacing carbon with nitrogen at the 3-position of the purine ring in 3d3mA. The starting point for the transition state approximation (TSA) was generated by elongating the glycosidic bond of optimized 3mA by 0.5 Å. This distance was chosen based on the computationally determined transition state for excision of uracil by UDG47. Finally, the starting point for dR+ was generated by deleting the 3d3mA nucleobase from the corresponding 2′-deoxynucleoside, leaving an open valence on C1′.

Structural optimizations were performed in the gas phase using the M06-2X functional in conjunction with the 6–31+G(d) basis set. In all cases, the Cartesian coordinates for all non-hydrogen atoms in AlkD residues Trp109, Asp113 and Trp187 as well as HOH308 were frozen using the freeze code −1 in the molecular coordinates specification. Additionally, the 3′- and 5′-oxygen atoms of all 2′-deoxynucleosides were frozen to simulate the experimentally observed lack of motion in the phosphodiester backbone in the full AlkD–DNA complexes. All other atoms, with the exception of C1′ and N9 in the TSA, were allowed to move during optimizations. Stationary points derived from geometry optimizations on these reduced-dimension potential energy surfaces were verified as minima by vibrational frequency analyses. Frequency analyses were performed at the same level of theory and on the same reduced-dimension surfaces as the geometry optimizations. Using this approach, freezing Cartesian coordinates results in zeroing of elements in the Hessian associated with the frozen atoms. As such, the expected imaginary frequency resulting from elongation of the glycosidic bond in the TSA was not observed. All structures, including the TSA, afforded zero imaginary frequencies.

Calculation of binding energy and catalytic rate enhancement

Binding energies of the truncated AlkD residues and the catalytic water to 3d3mA, 3mA, TSA and dR+ were determined using the M06-2X functional. Structural optimizations were carried out as described above using the 6–31+G(d) basis set, and single point evaluations were carried out using the 6–311++G(3df,2p) basis set. The basis set superposition error was accounted for using the counterpoise method48, in which AlkD residues and the catalytic water were defined as the first fragment and the 2′-deoxynucleosides were defined as the second fragment. Because heavy atoms in AlkD residues were frozen at their experimentally determined atomic positions, binding energies were evaluated as differences in zero point-uncorrected electronic energies. This approach reproduced experimental gas-phase binding energies to within 1 kcal mol⁻¹ for two prototypical CH–π systems: neutral CH₄/benzene (experimental 1.03–1.13 kcal mol⁻¹ versus calculated 1.6 kcal mol⁻¹)49 and cationic (CH₃)₄N⁺/benzene (experimental 9.4 kcal mol⁻¹ versus calculated 10.3 kcal mol⁻¹)50. To determine the individual contributions of Trp109, Asp113, Trp187 and HOH308 to ensemble binding energies, the counterpoise method was again employed as described above, but with the atoms in Trp109, Asp113, Trp187 or HOH308 replaced with the corresponding Gaussian ghost atoms. The contribution of each residue to the total binding energy (defined as BE_individual(x) = BE_ensemble − BE_ensemble,ghost(x)) was computed as the difference between the ensemble binding energy and that where Trp109, Asp113, Trp187 or HOH308 had been replaced with ghost atoms. Complexes containing ghost atoms were not reoptimized, which may affect individual binding energies. However, using this method, between 92% and 99% of the ensemble binding energy was accounted for in each complex.
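The ghost-atom decomposition described above amounts to simple bookkeeping, sketched here under stated assumptions. All energies in the example are hypothetical placeholders (negative values taken as favourable binding), not results from this work, and the function names are introduced only for illustration.

```python
def individual_contributions(be_ensemble: float,
                             be_ghost: dict[str, float]) -> dict[str, float]:
    """BE_individual(x) = BE_ensemble - BE_ensemble,ghost(x) for each component x."""
    return {x: be_ensemble - be for x, be in be_ghost.items()}


def fraction_accounted(be_ensemble: float,
                       contributions: dict[str, float]) -> float:
    """Share of the ensemble binding energy recovered by summing the parts."""
    return sum(contributions.values()) / be_ensemble


# Hypothetical binding energies in kcal/mol (placeholders, not reported data)
BE_ENSEMBLE = -30.0
BE_GHOST = {          # ensemble energy with the named component ghosted
    "Trp109": -25.0,
    "Asp113": -12.0,
    "Trp187": -26.0,
    "HOH308": -28.5,
}

parts = individual_contributions(BE_ENSEMBLE, BE_GHOST)
for name, value in parts.items():
    print(f"{name}: {value:+.1f} kcal/mol")
print(f"accounted for: {fraction_accounted(BE_ENSEMBLE, parts):.0%}")
```

A sum that falls short of 100% reflects non-additive (cooperative) terms, which is presumably why the text reports the recovered fraction for each complex.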

The barrier to bond elongation (defined as ΔE = E_TSA − E_3mA) was calculated in the presence (ΔE = 18.1 kcal mol⁻¹) and absence (ΔE = 23.0 kcal mol⁻¹) of Trp109, Asp113, Trp187 and HOH308. The difference in these values (ΔΔE = 4.9 kcal mol⁻¹) was taken as the catalytic contribution of AlkD to 3mA excision. This process was repeated for complexes in which atoms in each residue or the catalytic water had been replaced with ghost atoms. For each series, ΔΔE values were computed as outlined above, and the difference between each of these ΔΔE values and that of the full ensemble was taken as the individual contribution of the omitted residue to catalysis. Estimated rate enhancements were calculated from these values by substituting ΔΔE for ΔΔG in the Eyring equation at 25 °C. The barrier to bond elongation (ΔE = 23.0 kcal mol⁻¹) calculated with this approach is consistent with the activation enthalpy for depurination of 3mA (ΔH‡ = 23.5 kcal mol⁻¹) extrapolated from empirically determined half-lives51.
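The conversion from barrier lowering to rate enhancement can be sketched as follows. In a ratio of two Eyring rate constants the pre-exponential factor cancels (assuming equal transmission coefficients), leaving exp(ΔΔE/RT). The barrier values are those quoted above; the function name and the choice of gas constant units are assumptions made for this illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def rate_enhancement(dd_e: float, temperature: float = 298.15) -> float:
    """Ratio k_cat/k_uncat from the Eyring equation for a barrier lowering
    dd_e (kcal/mol); the k_B*T/h prefactor cancels in the ratio."""
    return math.exp(dd_e / (R_KCAL * temperature))


BARRIER_WITHOUT = 23.0  # kcal/mol, absence of Trp109, Asp113, Trp187, HOH308
BARRIER_WITH = 18.1     # kcal/mol, presence of the same components

dd_e = BARRIER_WITHOUT - BARRIER_WITH
print(f"ddE = {dd_e:.1f} kcal/mol")
print(f"estimated rate enhancement = {rate_enhancement(dd_e):.2e}")
```

For ΔΔE = 4.9 kcal mol⁻¹ at 25 °C this gives an enhancement of a few thousand-fold, within the 10³–10⁴ range quoted in the main text.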

Quantitation of charge transfer and generation of electrostatic potential maps

Charge transfer from AlkD residues Trp109, Asp113 and Trp187 and HOH308 to the lesions was quantitated by performing Merz–Singh–Kollman (MK) population analyses52,53 on each complex and on each individual component (either AlkD and the catalytic water or the lesion). To correct for potential basis set superposition error in population analyses, the appropriate Gaussian ghost atoms were used to maintain a consistent basis between fragments and complexes. Partial atomic charges derived from MK population analyses were summed for all atoms in AlkD and for the lesion, affording the corresponding group charges. The arithmetic differences between group charges in the AlkD–lesion complex and in AlkD or the lesion alone were taken as the amount of charge transferred upon complex formation and expressed as percentages. Electrostatic potential maps were generated from densities computed at the M06-2X/6–311++G(3df,2p) level with ghost atoms omitted for visual clarity. Potentials were scaled to −0.22–0.55 atomic units on an isodensity surface of 0.05 electrons bohr⁻³.