Main

An obligatory and characteristic step of DNA repair is gap filling, in which excised or missing nucleotides are replaced using the undamaged strand as a template3. If a sufficient number of nucleotides are incorporated, DNA repair synthesis can be used as a proxy for the extent and location of endogenous DNA damage4. In light of this, we developed a method to map sites of DNA repair synthesis by sequencing (synthesis associated with repair sequencing (SAR-seq)). We labelled post-mitotic glutamatergic neurons derived from induced pluripotent stem cells (iPS cells) (i3Neurons5,6) on day 6 after differentiation with ethynyl deoxyuridine (EdU) for 18 h, biotinylated the labelled DNA, reduced it to 150–200 bp by sonication, and then isolated the biotinylated DNA for high-throughput sequencing (Extended Data Fig. 1a). We identified more than 55,000 SAR-seq peaks at recurrent genomic locations in neurons, which were highly reproducible between different experiments (Fig. 1a, Extended Data Fig. 1b–d). Peaks were not caused by DNA synthesis during S phase7,8 because i3Neurons are post-mitotic (Extended Data Fig. 1e), and the SAR-seq peaks were unaffected by inhibition of the replicative DNA polymerase-α (Extended Data Fig. 2a–c). By contrast, and as expected, neuronal synthesis associated with repair (SAR) was largely prevented by hydroxyurea (HU), which reduces the availability of deoxyribonucleotides (Extended Data Fig. 2a, b).

Fig. 1: SAR-seq peaks occur within enhancers and are associated with PARP activation.

a, Genome browser screenshots of SAR-seq (n = 3), ATAC–seq (n = 1), and ChIP–seq for H3K4me1 (n = 2), H3K27ac (n = 1), and MLL4 (n = 1) in i3Neurons. Below, expanded view of the indicated region to show overlapping peaks. b, Genome browser screenshot of SAR-seq performed in rat primary neurons (n = 1) as well as input. The culture was co-incubated with 5 μM aphidicolin to block DNA replication of S-phase glial cells. c, Representative images of i3Neurons with immunofluorescence staining for PAR (green) and the neuronal marker tubulin-β3 (red), counterstained with 4′,6-diamidino-2-phenylindole (DAPI) (blue) (data are representative of three independent experiments). As a positive control, cells were treated with 0.1 mg/ml MMS for 15 min; NT, not treated. Boxed regions in top row are enlarged below. d, Heat maps of SAR-seq signal and ChIP–seq signals for XRCC1 and PAR for 1 kb on either side of SAR-seq peak summits in i3Neurons, ordered by SAR-seq intensity.

The SAR-seq peaks were 200–2,000 bp in width (mean 901 bp) (Extended Data Fig. 2d), with each peak presumably comprising multiple clustered sites of DNA repair (see below). The most prominent neuronal SAR-seq peaks were detectable by pulse labelling with EdU for just 1 h, and EdU incorporation approached saturation after labelling for 18 h (Extended Data Fig. 2e, f).

When iPS cells were differentiated into skeletal muscle cells (iMuscle), we did not detect incorporation of EdU despite labelling cells for 18 h (Extended Data Fig. 3a). Similarly, we did not detect EdU incorporation in G0-arrested pre-B cells, although we could detect EdU incorporation in pre-B cells after the induction of site-specific DNA double-strand breaks (DSBs) (Extended Data Fig. 3b). To rule out the possibility that the SAR-seq peaks were an artefact of iPS cell differentiation, we labelled bona fide rat neurons with EdU. Similar to i3Neurons, we detected robust peaks of EdU incorporation at 22,196 specific sites in rat neurons (Fig. 1b, Extended Data Fig. 5e). Thus, the high frequency of recurrent DNA synthesis appears to be a specific feature of post-mitotic neurons.

DNA repair sites enriched at enhancers

Neuronal SAR-seq peaks were enriched in intragenic regions (Extended Data Fig. 3c, d) and further within expressed genes (Extended Data Fig. 3d, e). However, SAR signal intensity did not correlate with transcript levels (Extended Data Fig. 3e). Moreover, the sites of EdU incorporation were not associated with strand specificity, as EdU was incorporated uniformly in both transcribed and non-transcribed strands (Extended Data Fig. 3f).

We next searched for specific DNA motifs among the strongest 5,000 SAR-seq peaks. More than 25% of the sites contained a motif similar to the ONECUT family of transcription factors, which was centred at SAR-seq peak summits (Extended Data Fig. 4a). As ONECUT1 can promote genomic accessibility in neurons9, we compared SAR-seq peaks with accessible regions using an assay for transposase-accessible chromatin using sequencing (ATAC–seq). Fifty-four per cent of all SAR-seq regions coincided with ATAC-seq peaks (Fig. 1a, Extended Data Fig. 4b, c), and the widths of SAR-seq and ATAC-seq peaks were correlated (Extended Data Fig. 4d), suggesting that open chromatin structure influences the extent of DNA synthesis.
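The overlap comparison described above can be illustrated with a short Python sketch that computes the fraction of one peak set overlapping another. This is not the authors' pipeline: the interval representation (chromosome, start, end; half-open coordinates) and the function names are assumptions made for this example, and any coordinates passed in would need to come from real peak calls.

```python
from bisect import bisect_left
from itertools import accumulate


def build_index(peaks):
    """Group (chrom, start, end) peaks by chromosome, sorted by start.

    For each chromosome, store the sorted starts and the running maximum
    of the ends, which allows a constant-time overlap test after a
    binary search.
    """
    grouped = {}
    for chrom, start, end in peaks:
        grouped.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in grouped.items():
        intervals.sort()
        starts = [s for s, _ in intervals]
        max_ends = list(accumulate((e for _, e in intervals), max))
        index[chrom] = (starts, max_ends)
    return index


def overlaps(index, chrom, start, end):
    """True if [start, end) overlaps any indexed interval on chrom."""
    if chrom not in index:
        return False
    starts, max_ends = index[chrom]
    # Number of reference intervals that begin before the query ends.
    i = bisect_left(starts, end)
    # At least one of those must end after the query starts.
    return i > 0 and max_ends[i - 1] > start


def fraction_overlapping(query, reference):
    """Fraction of query peaks overlapping at least one reference peak."""
    if not query:
        return 0.0
    index = build_index(reference)
    hits = sum(overlaps(index, c, s, e) for c, s, e in query)
    return hits / len(query)
```

Called as `fraction_overlapping(sar_peaks, atac_peaks)`, the function returns a value between 0 and 1. Dedicated interval tools would normally be used for genome-scale data; the sketch only shows the logic of the comparison.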

Despite their localization in open chromatin, SAR-seq peaks were not enriched at promoters (Extended Data Fig. 4e–g), which exhibited only modest levels of DNA synthesis. However, we detected a strong correlation between locations of DNA synthesis and of neuronal enhancers, as measured by chromatin immunoprecipitation with sequencing (ChIP–seq) for methylation at lysine 4 of histone H3 (H3K4me1), acetylation at lysine 27 of histone H3 (H3K27ac) and the lysine methyltransferase MLL4 (Fig. 1a, Extended Data Fig. 5a, b). Nevertheless, SAR-seq peaks were not strongly correlated with other, non-enhancer-specific, epigenetic indicators of open chromatin or condensed chromatin (Extended Data Fig. 5a, c). Similar to i3Neurons, SAR-seq peaks in primary rat neurons overlapped and correlated with rat H3K4me1 ChIP–seq (Extended Data Fig. 5d–f).

We performed ultra-deep Hi-C in i3Neurons to examine chromosomal interactions associated with SAR-seq peaks within the context of topologically associating domains (TADs) (Extended Data Fig. 6a). Enhancers with SAR-seq peaks had more intra-TAD interactions than enhancers without SAR-seq peaks. Consistently, H3K27ac, a mark of active enhancers, was significantly enriched at enhancers with SAR-seq peaks (Extended Data Fig. 6a) (P < 2.2 × 10^−6). Chromatin looping interactions detected by promoter capture Hi-C (pcHi-C) enable distal enhancer elements to be linked to their target genes10. Compared to random sites, we observed an eightfold enrichment of SAR-seq sites among in vivo-validated enhancer elements11 that overlapped with neuronal H3K4me1 histone marks and a fourfold enrichment using a pcHi-C dataset generated in i3Neurons10 (Extended Data Fig. 6b). The latter includes enhancers whose activity was validated in human neurons using CRISPR techniques (Extended Data Fig. 6c, d). Thus, distal-acting regulatory enhancers are hotspots of neuronal SAR.
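One common way to express enrichment "compared to random sites" is to divide the observed number of overlaps by the mean number obtained after randomly repositioning the peaks. The Python sketch below illustrates that idea only. How the random sites were generated in this study is not stated in the text, so the shuffling scheme (same chromosome, same width, uniform position), the function names and the `chrom_sizes` argument are all assumptions.

```python
import random


def count_hits(sites, enhancers):
    """Number of sites overlapping at least one enhancer (half-open intervals)."""
    return sum(
        any(c == ec and s < ee and e > es for ec, es, ee in enhancers)
        for c, s, e in sites
    )


def random_sites(peaks, chrom_sizes, rng):
    """Reposition each peak uniformly on its own chromosome, keeping its width.

    Assumed null model for illustration; real analyses often also exclude
    unmappable or blacklisted regions.
    """
    shuffled = []
    for chrom, start, end in peaks:
        width = end - start
        new_start = rng.randrange(0, chrom_sizes[chrom] - width + 1)
        shuffled.append((chrom, new_start, new_start + width))
    return shuffled


def fold_enrichment(peaks, enhancers, chrom_sizes, n_shuffles=100, seed=0):
    """Observed overlaps divided by the mean overlaps over random placements."""
    rng = random.Random(seed)
    observed = count_hits(peaks, enhancers)
    expected = sum(
        count_hits(random_sites(peaks, chrom_sizes, rng), enhancers)
        for _ in range(n_shuffles)
    ) / n_shuffles
    return observed / expected if expected else float("inf")
```

The quadratic overlap test keeps the example short; it would be too slow for tens of thousands of peaks, where an indexed interval search is preferable.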

To determine whether the enhancers with SAR-seq peaks are specific to neurons, we compared sites of H3K4me1 in i3Neurons and iPS cells. Only 2% of the SAR-seq peaks overlapped with iPS cell-specific H3K4me1 peaks, whereas most of the SAR-seq peaks overlapped with either neuron-specific or shared H3K4me1 sites (Extended Data Fig. 6e). Thus, SAR is associated with enhancers that are active in differentiated neurons.

Gene Ontology (GO) analysis of the genes that contained SAR-seq peaks revealed that they were enriched in GO terms related to nervous system function (Extended Data Fig. 6f). We then further characterized the neuronal enhancers that were associated with SAR. Super-enhancers are a large collection of enhancers that drive the transcription of genes involved in cell identity. Using H3K27ac ChIP–seq, we found approximately 1,300 super-enhancers in i3Neurons. Notably, 90% of super-enhancers exhibited SAR-seq peaks, whereas fewer than 25% of conventional enhancers possessed SAR-seq peaks (Extended Data Fig. 6g, h). Collectively, these data identify enhancers and genes that are associated with neuronal function as hotspots of recurrent DNA synthesis.

PARP activity at neuronal enhancers

Given the close association between unrepaired DNA strand breaks and neurodegeneration1,2, we wondered whether the sites of EdU incorporation might reflect sites of DNA break repair. To test this idea, we measured the activity of poly(ADP-ribose) polymerases (PARPs) at the sites of DNA synthesis. Various types of DNA damage, including SSBs, DSBs and single-strand gaps, activate PARP1 and PARP212,13. PARP activity signals the presence of these lesions by modifying localized proteins with poly(ADP-ribose) (PAR)12,13. When we monitored ADP-ribosylation in individual neurons14, we detected focal sites of nuclear ADP-ribose. These focal sites, along with pan-nuclear staining, increased following treatment with the genotoxin methyl methanesulfonate (MMS) (Fig. 1c, Extended Data Fig. 7a). To determine whether the sites of endogenous ADP-ribosylation were localized to sites of neuronal DNA synthesis, we used ADP-ribose ChIP–seq. As a positive control, we first confirmed that we could detect ADP-ribosylation at site-specific DSBs (Extended Data Fig. 7b). In i3Neurons, the endogenous sites of ADP-ribosylation co-localized with SAR-seq peaks (Fig. 1d, Extended Data Fig. 7c), suggesting that recurrent sites of DNA synthesis are associated with DNA strand break repair.

DNA repair is not associated with DSBs

Neuronal activity has been reported to cause DSBs generated by topoisomerase 2 (TOP2)15,16. TOP2-induced DSBs can promote the expression of early response genes15, associating these DNA breaks with regions of transcriptional activity17,18. Treatment of i3Neurons with etoposide to trigger TOP2-induced DSBs resulted in DNA synthesis within gene bodies (Extended Data Fig. 8a, b). However, most of the sites of etoposide-induced DNA synthesis were distinct from those detected in untreated neurons (Extended Data Fig. 8a, b). In addition, we did not detect DSBs in unchallenged i3Neurons as measured by immunostaining for either γH2AX or 53BP1 (Extended Data Fig. 8c) or by END-seq19 (which involves the ligation of a sequencing adapter to the ends of DSBs) (Extended Data Fig. 8d). Thus, sites of DNA synthesis in neuronal enhancers are independent of DSBs.

SSBs at sites of DNA synthesis

As well as DSBs, PARP1 and/or PARP2 are also activated at SSBs and subsequently recruit the XRCC1 protein complex, which accelerates SSB repair20,21. We therefore examined the genomic localization of XRCC1 by ChIP–seq. Similar to sites of PARP activity, XRCC1 co-localized with SAR-seq peaks, and the intensity of XRCC1 binding correlated with the intensity of EdU incorporation in both human i3Neurons and rat primary neurons (Fig. 1d, Extended Data Figs. 7c, 8e–g). Thus, sites of DNA synthesis colocalize with sites of PARP activation and XRCC1-associated SSB repair.

To directly map SSBs at nucleotide resolution, we treated agarose-embedded i3Neuron plugs with recombinant single strand-specific S1 nuclease in situ to convert SSBs into DSBs, which we then detected using END-seq. To test whether this method would successfully convert DNA nicks into DSBs, we first incubated the i3Neuron agarose plugs with the site-specific nicking endonuclease Nt.BspQI before treating them with S1 nuclease. Nt.BspQI nicks were readily converted into DSBs by S1 (Extended Data Fig. 9a). Moreover, S1 END-seq could detect nicks generated by an inducible CAS9 nickase expressed in G1-arrested MCF10 cells (Extended Data Fig. 9b, c). However, S1 END-seq alone did not detect endogenous SSBs in i3Neurons (Fig. 2a, Extended Data Fig. 9a).

Fig. 2: Single-strand breaks detected by S1 END-seq after incubation with ddN.

a, Top, genome browser screenshots illustrating the overlap in i3Neurons between sites of SAR-seq and SSBs (S1 END-seq) detected in the presence of ddN (n = 1) to block DNA ligation, or in its absence (NT, n = 1). The S1 END-seq signal is separated for positive (black) and negative (grey) strands. Bottom, expanded views demonstrating the presence of multiple SSBs within one SAR-seq peak. b, Aggregate plots of S1 END-seq signal (red, positive strand; blue, negative strand) and END-seq signal in the presence of ddN (n = 1) (black, positive strand; grey, negative strand) for 500 bp on either side of SAR-seq peak summits. RPM, reads per million. c, Composite DNA sequence motif for 5 bp on either side of SSB summits on the positive strand (right) and on the negative strand (left) of the 10,000 most prominent S1 END-seq peaks.

We surmised that failure to detect endogenous SSBs in neurons could be due to their rapid repair. To increase the half-life of SSBs, we incubated i3Neurons with a mixture of chain-terminating dideoxynucleosides (ddA, ddC, ddT and ddG, denoted ddN). This led to robust SSB accumulation as detected by S1 END-seq (Fig. 2a), which co-localized extensively with SAR-seq peaks (Fig. 2a, b, Extended Data Fig. 9d, e). Many of the SAR-seq peaks contained multiple S1 END-seq peaks (Fig. 2a, inset). Each peak spanned about 25 nucleotides on average (Extended Data Fig. 9f), thus comprising multiple-clustered single-strand gaps. Notably, we also detected a very low level of DSBs at SAR-seq peaks following chain termination; these were independent of S1 nuclease treatment and probably reflected closely apposed SSBs on opposite DNA strands (Fig. 2b).
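Aggregate plots of strand-separated END-seq signal around peak summits, such as those in Fig. 2b, can be thought of as a per-offset average of read-end counts scaled to reads per million (RPM). The Python sketch below is one possible way to compute such a profile for a single strand. It is an illustration under stated assumptions, not the procedure used in this study: the input format (a list of `(chrom, position)` read ends), the normalization by total read count and the averaging over summits are all choices made for the example.

```python
from collections import Counter


def aggregate_profile(read_ends, summits, flank=500):
    """Mean RPM-scaled read-end count at each offset around the summits.

    read_ends: list of (chrom, position) tuples, one per read, for one strand.
    summits:   list of (chrom, position) tuples marking peak summits.
    Returns a list of length 2 * flank + 1; index `flank` is the summit itself.
    """
    width = 2 * flank + 1
    if not read_ends or not summits:
        return [0.0] * width
    counts = Counter(read_ends)  # (chrom, pos) -> number of read ends
    profile = [0] * width
    for chrom, summit in summits:
        for offset in range(-flank, flank + 1):
            profile[offset + flank] += counts.get((chrom, summit + offset), 0)
    # Scale to reads per million, then average over the number of summits.
    scale = 1e6 / (len(read_ends) * len(summits))
    return [value * scale for value in profile]
```

Running the function once on positive-strand read ends and once on negative-strand read ends would give two profiles that can be plotted together.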

S1 END-seq enabled us to locate endogenous SSBs with much higher resolution than did SAR-seq, which is limited to the size of the sonicated fragments (150–200 nt). The improved resolution revealed a prevalence of C/G nucleotides at S1 END-seq peak summits on positive or negative strands (Fig. 2c), and showed that CpG dinucleotides were highly enriched at SAR sites (Extended Data Fig. 9g, h). Collectively, these data provide direct evidence for the site-specific formation of clustered SSBs in neuronal enhancers at or near C/G nucleotides.
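A composite sequence motif around break summits, as in Fig. 2c, rests on per-position base frequencies across many aligned sequence windows. The following Python sketch shows that underlying tabulation. It assumes the windows (for example, 5 bp on either side of each summit) have already been extracted from the reference genome in a strand-aware manner; the function name and input format are hypothetical, and rendering the frequencies as a sequence logo is left out.

```python
from collections import Counter


def position_frequencies(windows, alphabet="ACGT"):
    """Per-position base frequencies for equal-length sequence windows.

    Returns one dict per position mapping each base in `alphabet` to its
    frequency. Characters outside the alphabet (such as N) are ignored.
    """
    if not windows:
        return []
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    freqs = []
    for i in range(length):
        column = Counter(w[i].upper() for w in windows)
        total = sum(column[b] for b in alphabet)
        freqs.append({b: (column[b] / total if total else 0.0) for b in alphabet})
    return freqs
```

A position where C or G dominates across windows would show up as a frequency close to 1 for that base in the corresponding dictionary.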

SAR involves long-patch SSB repair

SSB repair comprises both short-patch and long-patch sub-pathways, in which single or multiple nucleotides, respectively, are replaced at the site of the SSB22,23. PARP1 and XRCC1 promote the repair of a wide spectrum of SSBs, primarily by short-patch repair. We therefore examined the effects of inhibiting and/or depleting these proteins on neuronal DNA synthesis. There was a reproducible increase in EdU incorporation at SAR sites if neurons were co-incubated with any of three independent inhibitors of PARP1, or if PARP1 was depleted using CRISPR interference (CRISPRi)24 (Fig. 3a, Extended Data Fig. 10a, b). Depletion of XRCC1 similarly led to a prominent increase in EdU incorporation at sites of SAR (Fig. 3b, Extended Data Fig. 10c–e). These data suggest that if PARP1- or XRCC1-dependent short-patch SSB repair is impeded, long-patch SSB repair is increased to compensate.

Fig. 3: Localized SSB repair in neurons comprises short-patch and long-patch sub-pathways.

a, Bottom, heat maps of SAR-seq intensities for 1 kb on either side of SAR-seq peak summits for i3Neurons treated with the indicated PARP inhibitors (n = 3) or expressing non-targeted control (sgControl, n = 1) or PARP1-targeted (sgPARP1, n = 2) CRISPRi plasmids. Top, aggregate plots of SAR-seq intensity. Rep, replicate. b, Heat maps of SAR-seq intensities for 1 kb on either side of SAR-seq peak summits for i3Neurons expressing non-targeted control (sgControl, n = 2) or XRCC1-targeted (sgXRCC1, n = 2) CRISPRi plasmids. c, Heat maps of SAR-seq intensities for 1 kb on either side of SAR-seq peak summits for i3Neurons expressing non-targeted (sgControl, n = 2) or POLB-targeted (sgPOLB, n = 2) CRISPRi plasmids, either untreated or treated with 50 μM aphidicolin (Aph) for 24 h before and during EdU incorporation.

During short-patch SSB repair, a single nucleotide is replaced at the site of the break22,23,25. Nucleotide replacement typically requires DNA polymerase β (POLβ), which interacts directly with XRCC120. By contrast, during long-patch repair, alternative DNA polymerases such as POLε and POLδ can generate a 2–20-nucleotide DNA repair patch. Consistent with this, we found that depletion of POLβ resulted in a marked increase in DNA synthesis at neuronal enhancers (Fig. 3c, Extended Data Fig. 10f–h). By contrast, incubation with aphidicolin, an inhibitor of POLε and POLδ, greatly reduced SAR in both wild-type and POLβ-depleted neurons when used at a concentration (50 μM) that inhibits cellular repair synthesis4,26 (Fig. 3c, Extended Data Fig. 10i). Collectively, these data indicate that neuronal site-specific SSBs are repaired by both short-patch and long-patch repair, with the latter being the primary source of the SAR-seq signal.

SSBs and active cytosine demethylation

The detection of DNA repair synthesis at neuronal enhancers could indicate that the latter are sites of increased DNA damage or, alternatively, that they are sites of preferential DNA repair. Our observations using S1 END-seq, which showed that SSBs are enriched at neuronal enhancers, strongly support the first hypothesis. To test this further, we treated i3Neurons with MMS to introduce SSBs stochastically across the genome. We predicted that if DNA repair is targeted to enhancer sites, the induction of random SSBs across the genome should not affect the location of the SAR-seq peaks. However, despite the expected two- to fivefold increase in the overall level of EdU incorporation, MMS ablated the appearance of SAR-seq peaks at enhancers. This observation suggests that elevated levels of endogenous SSBs occur specifically at neuronal enhancers and account for the peaks of DNA repair synthesis at these sites, rather than it being a selective and site-specific repair process (Extended Data Fig. 11a).

We considered the source of the localized SSBs. Oxidative DNA lesions such as 8-oxoguanine have strongly been implicated as sources of DNA damage in the brain; however, unlike the sites of DNA repair synthesis detected here, such lesions have not been shown to preferentially accumulate in open chromatin or at enhancers27. By contrast, enhancers may be especially vulnerable to SSBs induced by TOP1, a topoisomerase enzyme that has been implicated in enhancer activation28,29. However, such SSBs are also unlikely to be the source of the localized SSBs, as depletion of the polynucleotide kinase 3′-phosphatase (PNKP)—which is recruited by XRCC1 to repair TOP1-induced SSBs30,31—had only a small effect on the SAR-seq peaks (Extended Data Fig. 11b, c). This was in contrast to the DNA repair synthesis of bona fide TOP1-induced SSBs that we triggered by treating neurons with the TOP1 poison camptothecin, which were increased by PNKP depletion and located in gene bodies rather than in enhancers (Extended Data Fig. 11d).

Given our finding that SSBs associated with SAR-seq peaks were enriched at C/G nucleotides (Fig. 2c, Extended Data Fig. 9g, h), we entertained the possibility that these SSBs were sites of cytosine demethylation (Fig. 4c). Consistent with this idea, active DNA demethylation of cytosine at CpG sites occurs preferentially at enhancers32, is tenfold more active in post-mitotic neurons than in peripheral cell types33, and generates SSBs that are intermediates of XRCC1-associated, but not PNKP-associated, base excision repair (BER)32,34. Active demethylation via ten-eleven translocation (TET) family enzymes (TET1, TET2, and TET3) is initiated through progressive oxidation of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), or 5-carboxylcytosine (5caC), and steady-state levels of 5hmC account for approximately 40% of modified cytosines in the brain33. Using labelling methods to detect 5hmC and 5fC35,36, we mapped oxidized forms of 5-methylcytosine genome-wide in i3Neurons. We found that the sites of both DNA repair synthesis (measured by SAR-seq) and SSBs (measured by S1 END-seq) overlapped with peaks of 5hmC and 5fC, and the intensity of SSBs correlated with that of 5hmC and 5fC (Fig. 4a, b, Extended Data Fig. 11e, f). Although future studies will be needed to determine the precise source(s) of SSBs at neuronal enhancers, our data implicate cycles of cytosine methylation and demethylation at neuronal enhancers as a potential source of neuronal site-specific DNA single-strand breakage.

Fig. 4: Localized SSB repair in neurons correlates with sites of oxidized 5-methylcytosine.

a, Genome browser screenshot illustrating the overlap in i3Neurons between sites of 5hmC (5hmC-Seal, n = 2)35,36, 5fC (5fC-Seal, n = 2)35,36, SSBs (ddN S1 END-seq), DNA repair synthesis (SAR-seq), and H3K4me1 ChIP–seq. Both positive (black) and negative (grey) S1 END-seq signal are shown. b, Heat maps of signals for DNA repair synthesis (SAR-seq), SSBs (ddN S1 END-seq), 5fC (5fC-Seal), and 5hmC (5hmC-Seal) in i3Neurons, for 1 kb on either side of the summits of the SAR-seq peaks and ordered by SAR-seq intensity. c, Model depicting the balance between short-patch and long-patch SSB repair at neuronal enhancers. Although our data suggest that SSBs arise during the base excision repair of modified cytosine residues within enhancers, we do not exclude the occurrence of other sites and sources of SSBs.

Conclusions

Our study reveals that human post-mitotic neurons are subject to an unexpected level of localized DNA synthesis that is associated with ongoing sites of SSB repair at neuronal enhancers. The scaffold protein XRCC1 is of particular importance during SSB repair because it is physically associated with a variety of SSB repair enzymes including TDP1, POLβ, PNKP, APTX, and DNA ligase III (LIG3)20, in some of which mutations result in hereditary neurodegenerative diseases1,2. Failure to repair the site-specific SSBs that we have discovered here may thus contribute to such diseases, although we do not rule out an involvement of other SSB sources. Unrepaired SSBs may trigger neuropathology in several ways, such as via excessive or aberrant activation of PARP137. In addition, our current data raise the possibility that increased dependency on DNA repair synthesis at specific sites of the genome increases the mutational burden in long-lived neurons38. For example, it is possible that an accrual of mutations at sites of recurrent DNA repair synthesis within enhancers could lead to aberrant gene expression, resulting in neurological dysfunction over time.

In summary, we describe methods that enable genome-wide mapping of endogenous sites of frequent DNA single-strand breakage and repair synthesis in post-mitotic neurons. Our findings identify enhancers as hotspots of DNA damage in human post-mitotic neurons, perhaps explaining why DNA SSB repair is important for neurological functionality during development and normal ageing. During the preparation of this manuscript, we became aware of the closely related work of D. Reid et al., which demonstrates recurrent DNA repair sites in embryonic stem cell-derived neurons39.

Methods

iPS cell culture

All iPS cell experiments used the WTC11 line, which was derived from a healthy human male participant and obtained from the Coriell cell repository. All policies of the NIH Intramural Research Program for the registration and use of this iPS cell line were followed. The WTC11 iPS cell line was validated to have a normal male karyotype, as expected, and was confirmed to be mycoplasma free based on the Lonza MycoAlert mycoplasma testing kit. iPS cell culture was performed as previously described5. Tissue culture-treated dishes were coated with human embryonic stem cell (hES cell)-qualified matrigel (Corning, ref 354277). Matrigel was removed and iPS cells were plated in Essential 8 Medium (E8; Thermo Fisher Scientific, cat. no. A1517001) and 10 μM ROCK inhibitor (RI; Y-27632; Selleckchem, cat. no. S1049). iPS cells were maintained in an incubator at 37 °C with 5% CO2 and fed every 1–2 days as needed. Cells were split using either accutase (Life Technologies, cat. no. A1110501) for enzymatic dissociation into single cells or EDTA (0.5 mM; Life Technologies, cat. no. 15575020) for routine passaging. Medium was supplemented with 10 μM RI to promote survival during passaging. As necessary, iPS cells were frozen in 90% ES cell-qualified fetal bovine serum (FBS) (Sigma Aldrich, cat. no. ES-009-B) and 10% DMSO (Mediatech, cat. no. 25-950-CQC), and then thawed rapidly at 37 °C, followed by removal of FBS/DMSO and plating in E8 + RI medium.

i3Neuron culture

The human iPS cells used in this study were previously engineered5,6 to express mouse neurogenin-2 (NGN2) under a doxycycline-inducible promoter integrated at the AAVS1 safe harbour in the WTC11 background, with or without CAG-dCas9-BFP-KRAB at the Clybl promoter24. For neuronal differentiation, 20–25 million iPS cells were plated on day 0 onto a 15-cm plate in N2 medium (knockout Dulbecco’s modified Eagle’s medium (DMEM)/F12 medium; Life Technologies Corporation, cat. no. 12660012) with N2 supplement (Life Technologies, cat. no. 17502048), 1× GlutaMAX (Thermo Fisher Scientific, cat. no. 35050061), 1× MEM nonessential amino acids (NEAA) (Thermo Fisher Scientific, cat. no. 11140050), 10 μM ROCK inhibitor (Y-27632; Selleckchem, cat. no. S1049), and 2 μg/ml doxycycline (Clontech, cat. no. 631311). N2 medium was changed once a day for two more days. On day 3, cells were replated onto freshly prepared dishes coated with poly-l-ornithine (PLO; 0.1 mg/ml; Sigma, cat. no. P3655-10MG) as follows. Cells were washed with PBS, dissociated with accutase for 10 min at 37 °C, washed and plated in i3Neuron culture medium: BrainPhys medium (STEMCELL Technologies, cat. no. 05790) supplemented with 1× B27 Plus Supplement (Thermo Fisher Scientific, cat. no. A3582801), 10 ng/ml BDNF (PeproTech, cat. no. 450-02), 10 ng/ml NT-3 (PeproTech, cat. no. 450-03), 1 mg/ml mouse laminin (Sigma, cat. no. L2020-1MG), and 2 μg/ml doxycycline (Clontech, cat. no. 631311). For 10-cm plates used in SAR-seq and ChIP–seq, 12–15 million neurons were plated. For 15-cm plates, 30–45 million neurons were plated. For ibidi slides used in imaging experiments, 0.2 million neurons per well were plated. Unless otherwise noted, i3Neurons were fed on day 6 during a half-medium change and collected on day 7. For i3Neurons cultured beyond 7 days, half-medium changes were conducted three times per week. In some experiments, pre-differentiated i3Neurons were frozen on day 3 in 90% FBS (Sigma Aldrich, cat. no. ES-009-B) and 10% DMSO (Mediatech, cat. no. 25-950-CQC), and then thawed rapidly at 37 °C, followed by removal of FBS/DMSO and plating in i3Neuron culture medium. We did not detect any differences for experiments in which day 3 neurons were thawed or plated immediately after differentiation.

iMuscle culture

To generate skeletal muscle myoblasts (iMuscle) from human iPS cells, we engineered a doxycycline-inducible vector containing a MyoD1 transcription factor transgene immediately followed by a co-inducible short-hairpin RNA targeting OCT4 and a selection marker/fluorescent protein cassette (MyoD-O iPS cells), similar to previous methods40. We used the PiggyBac system to facilitate the delivery and genome integration of the transgene cassette. To increase transposase expression in iPS cells, we subcloned the PiggyBac transposase under a long version of the EEF1A1 (also known as eF1a) promoter, and co-transfected this transposase vector with the MyoD1 donor vector into iPS cells using Lipofectamine Stem (Thermo Fisher Scientific, cat. no. STEM00015). Transfection rates were approximately 55%, based on fluorescence of a co-expressed reporter gene, and we were able to obtain 100% MyoD-O iPS cells after puromycin selection of iPS cells harbouring stable integration of the transcription factor cassette.

To induce myogenic differentiation, MyoD-O iPS cells were dissociated using accutase (37 °C for 10 min), resuspended in myogenic induction medium (MIM) and plated at 3 × 10^6 iPS cells per 10-cm dish. The MyoD-O iPS cell differentiation was modified from a recent report41. The MIM contained DMEM/F12 HEPES (Gibco, cat. no. 11-330-032) supplemented to a final concentration of 1 mM sodium pyruvate, 1× MEM nonessential amino acids (NEAA) (Thermo Fisher Scientific, cat. no. 11140050), 0.1 mM 2-mercaptoethanol (Gibco, cat. no. 21985023), 10 μg/ml insulin (Roche, cat. no. 11376497001), 2 μg/ml doxycycline and 10 μM RI.

After plating the MyoD-O iPS cells in MIM on day 0, fresh medium was supplied daily until 80% confluency or more (day 6). Myoblast morphology was observed beginning at day 2 followed by myotube-like morphology at day 5. On day 6, the glycogen synthase kinase 3 (GSK-3) inhibitor, CHIR 99021, was administered for 48 h to enhance myoblast differentiation and stimulate fusion (3 μM, Tocris, cat. no. 4423) as reported by others42,43. At day 9 after doxycycline, the differentiating myoblasts were collected.

Lentivirus production for CRISPRi guide delivery

Lenti-X Human Embryonic Kidney (HEK) cells were plated onto dishes coated with poly-l-ornithine (0.1 mg/ml PLO) at a density of 10–15 million cells per 10-cm dish or 30–45 million cells per 15-cm dish in warm DMEM, high glucose GlutaMAX Supplement medium (Life Technologies, cat. no. 10566024) with 10% FBS (Sigma, cat. no. TMS-013-B) and then cultured overnight to achieve approximately 90% confluency. The next morning, cells were transfected using Lipofectamine 3000 (Life Technologies, cat. no. L3000150). For each transfection, 2.4 ml room temperature Opti-MEM medium (Fisher Scientific, cat. no. 31985062) and 60 μl Lipo 3K were combined and incubated at room temperature for 5–20 min. Then, in another tube, 2.4 ml room temperature Opti-MEM, 80 μl P3000, 13.3 μg psPAX2, 4.5 μg pMD2G, 1.8 μg pAdVantage, and 19.5 μg of the lenti-vector of interest were combined. The contents of the two tubes were combined and incubated at room temperature for 30 min. This mixture was then added dropwise to a plate of Lenti-X cells and cultured overnight. The next morning, the medium was changed with 36 ml of fresh warm DMEM + Glutamax + 10% FBS medium supplemented with 72 μl viral boost reagent (ALSTEM, cat. no. VB100). Then, 2–3 days later, the medium containing virus was collected and centrifuged to remove cell debris, and the supernatant was concentrated using Lenti-X concentrator, following the manufacturer’s protocol (Takara Bio, cat. no. 631231). The viral pellet was gently dissolved in PBS at either 1:10 or 1:100. Concentrated virus was then aliquoted and stored at −80 °C for future use.

CRISPRi plasmids

The sgRNAs used in this study were cloned into either the pU6-sgRNA EF1Alpha-puro-T2A-BFP vector (gift from J. Weissman; Addgene no. 60955)24,44 or the pMK1334 CROPSeq vector (gift from M. Kampmann; Addgene no. 127965)24. Guides were driven under the mouse U6 promoter. The sgRNA sequences were as follows: non-targeting controls, GTCCACCCTTATCTAGGCTA or GACCAGGATGGGCACCACCC; PARP1, GGGTGCGGCGTGTTCGGTGG; XRCC1, GGTAGAGTATGGGGTCCGAG; POLB sg1, GCGCCGGAGGGAGATCCCCA; POLB sg2, GCCAGCTTGAAGGAGGTACC; PNKP, GCCAGGGCTTGCCCGTCCGA.
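When guide sequences like those listed above are transcribed into an ordering sheet or analysis script, a quick sanity check can catch copy errors. The Python sketch below stores the sequences exactly as given in the text and checks their length and alphabet. The dictionary keys, the function names and the expected length of 20 nt (which is simply what all listed sequences share) are assumptions made for this example and are not part of the published protocol.

```python
# sgRNA protospacer sequences copied from the Methods text above.
# Dictionary keys are labels chosen for this example.
SGRNAS = {
    "non-targeting control 1": "GTCCACCCTTATCTAGGCTA",
    "non-targeting control 2": "GACCAGGATGGGCACCACCC",
    "PARP1": "GGGTGCGGCGTGTTCGGTGG",
    "XRCC1": "GGTAGAGTATGGGGTCCGAG",
    "POLB sg1": "GCGCCGGAGGGAGATCCCCA",
    "POLB sg2": "GCCAGCTTGAAGGAGGTACC",
    "PNKP": "GCCAGGGCTTGCCCGTCCGA",
}


def gc_fraction(seq):
    """Fraction of G and C bases in an uppercase DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_guides(guides, expected_length=20):
    """Return a list of human-readable problems; empty if all guides pass.

    expected_length is an assumption based on the listed sequences.
    """
    problems = []
    for name, seq in guides.items():
        if len(seq) != expected_length:
            problems.append(f"{name}: length {len(seq)}, expected {expected_length}")
        unexpected = set(seq) - set("ACGT")
        if unexpected:
            problems.append(f"{name}: unexpected characters {sorted(unexpected)}")
    return problems


if __name__ == "__main__":
    for label, sequence in SGRNAS.items():
        print(f"{label}\t{sequence}\tGC={gc_fraction(sequence):.2f}")
    print(check_guides(SGRNAS) or "all guides pass the basic checks")
```

The check covers only transcription errors; it says nothing about on-target efficiency or off-target effects, which require dedicated design tools.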

Puromycin selection for CRISPRi knockdown i3Neurons

To perform CRISPRi knockdown, 1–4 million iPS cells were transduced with an aliquot of sgRNA-expressing virus immediately after an accutase split, before cell attachment. The medium was changed the next day to remove residual virus. Two days after transduction, iPS cells were split with accutase and plated at low density in E8 + RI medium containing 10 μg/ml puromycin. The following morning, cells were washed with PBS and given fresh E8 or E8 + RI medium. Cells were then expanded for 1–2 days before inducing differentiation into neurons. sgRNA knockdown efficiency was tested at the iPS cell stage and confirmed in day 7 i3Neurons via quantitative PCR.

Immunofluorescence and Microscopy

For imaging, i3Neurons were plated onto 96-well plates (0.05 × 10^6 cells per well; Perkin Elmer, cat. no. 6055302) or on μ-Slide glass bottom IBIDI slides (0.2 × 10^6 cells per well; IBIDI). Before fixation, if indicated, cells were treated with 0.1 mg/ml MMS for 15 min, 10 μM PARG inhibitors (PARGi) for 20 min, or 50 μM etoposide (ETO) for 1 h. Cells were then washed with PBS and fixed in 4% paraformaldehyde in PBS for 15 min at room temperature. Cells were then washed three times in PBS, permeabilized in 0.5% Triton X-100 for 5 min at room temperature and blocked in 1% BSA/0.2% Triton in PBS for 1 h at room temperature before incubation with primary antibodies. Primary antibodies and dilutions used were as follows: anti-PAR (1:500, Sigma-Aldrich MABE1031), anti-53BP1 (1:1,000, Novus Biologicals, no. NB100-305), anti-phospho-histone H2AX (1:5,000, Millipore, no. JBW301), and anti-tubulin-β3 (TUBB3, 1:5,000, Biolegend no. 801201). Immunofluorescence was detected using fluorochrome-conjugated secondary antibodies as follows: rhodamine Red-X AffiniPure donkey anti-mouse IgG (1:1,000, Jackson Labs no. 715-295-151) for detection of TUBB3; Alexa Fluor 488 goat anti-rabbit (1:1,000, Invitrogen no. A11034) for detection of PAR or 53BP1; Alexa Fluor 555 goat anti-mouse (1:2,000, Invitrogen no. A21422) for detection of γH2AX. EdU was visualized using Click-iT Plus EdU Cell Proliferation Kit for Imaging, Alexa Fluor 488 dye (Life Technologies, cat. no. C10637), according to the manufacturer’s instructions. Finally, DNA was counterstained with DAPI (2 μg/ml, Thermo Fisher Scientific no. 62248). Images were acquired on an inverted Nikon spinning-disk confocal microscope (Nikon Eclipse T1), using a 60× 1.40 NA oil-immersion objective.

Flow cytometry

For cell cycle profiling, cells were incubated with 10 μM 5-ethynyl-2′-deoxyuridine (EdU) for 30 min at 37 °C and stained using the Click-iT EdU Alexa Fluor 488 or 647 Flow Cytometry Assay Kit (ThermoFisher) according to the manufacturer’s instructions. DNA content was measured using DAPI (0.5 μg/ml). Data were analysed using FlowJo v.10 software.

Rat primary neuron cell culture

All animal procedures were conducted according to the NIH Guide for the Care and Use of Laboratory Animals, under Animal Study Proposal no. 19-011 approved by the NICHD Animal Care and Use Committee. Ten-to-eleven-week-old pregnant albino rats were delivered to our facility on day 17 of gestation from Envigo. They were housed under a 12-h light–dark cycle for 24 h with access to food and water ad libitum. On the next day, the rats were killed by carbon dioxide inhalation followed by decapitation before embryos were extracted and neurons prepared. Neurons from embryos of the same litter were pooled for each experiment.

Primary rat cortical neurons were prepared at embryonic day 18 (E18) as previously described45. Rats were killed at E18 by decapitation. The brain was collected and meninges were removed, after which cortices were isolated in sterile Hanks’ medium (Hanks’ balanced salt solution (HBSS), 20 mM HEPES, pH 7.5). Cortices were then collected and treated with 0.25% trypsin (Gibco), and 100 μg/ml DNase (Roche) for 15 min at 37 °C. One volume of adhesion medium (DMEM without phenol red, 4.5 g/l glucose, 25 mM HEPES, 10% heat-inactivated horse serum (Gibco), 100 U/ml penicillin and 100 mg/ml streptomycin) was added to stop trypsin enzymatic action. The tissue was then disrupted mechanically by pipetting it through a 10-ml serological pipette. Cells were then strained through a 70-μm nylon filter (Corning) and centrifuged at 700g for 10 min. The cell pellet was resuspended in 5 ml adhesion medium and cells were counted. Between 10 million and 25 million cells were plated on 10-cm culture dishes previously coated with poly-l-lysine (Sigma) and 5 μg/ml laminin (Roche). After 2 h, the neurons were adherent to the plate and the medium was changed to complete neurobasal medium (CNB) (neurobasal medium (Gibco), 1× B27 serum-free (Gibco), 4.5 g/l glucose, and 100 U/ml penicillin–streptomycin (Gibco)) and supplemented with 5 μM aphidicolin (Aph) to eliminate residual dividing cells. Primary neurons were then cultured for 7–8 days before being collected for ChIP–seq or SAR-seq.

SAR-seq

Neurons and iMuscle cells were incubated with 20 μM EdU for 18 h, unless otherwise noted. Cells were collected and fixed as follows. Cells were washed with PBS, incubated with accutase for 5–10 min, collected using a cell scraper, pelleted at 200g for 5 min and resuspended in cold 0.1% BSA in PBS. Cold methanol was then added dropwise during slow vortexing to 80% final concentration. Samples were kept on ice for 20 min and then stored at −20 °C until processing.

Copper catalyses azide-alkyne click chemistry. For biotin labelling via Click-iT reaction, cells were first washed 1× in PBS, permeabilized with 0.2% Triton-X100/PBS for 10 min on ice, and then washed 1× in PBS. Then the following were added in order: 3 mM copper sulfate (Sigma), 50 μM biotin azide (ThermoFisher, cat. no. B10184), and 1× Click-iT additive (ThermoFisher, cat. no. C10424) for 2 h with shaking at room temperature. Cells were then washed 1× in PBS and lysed in 50 mM Tris pH 8.0 with 1% SDS and proteinase K overnight at 37 °C. DNA was extracted using UltraPure phenol:chloroform:isoamyl alcohol (25:24:1, v/v) (Invitrogen) according to the manufacturer’s instructions, followed by 2.5:1 volume ethanol and 1:10 volume sodium acetate precipitation. DNA pellets were resuspended in TE buffer and sheared to 150–200-bp fragments using a Covaris S220 sonicator at 10% duty cycle, 175 peak incident power, 200 cycles per burst, for 240 s. DNA was again precipitated in 2.5:1 volume ethanol and 1:10 volume sodium acetate and resuspended in TE buffer. Biotin–EdU fragments were pulled down using MyOne Streptavidin C1 Beads (ThermoFisher, cat. no. 650-01). Before pulldown, 35 μl of Dynabeads was washed twice with 1 ml 1× wash and binding buffer (1× W&B; 10 mM Tris-HCl pH 8.0, 1 mM EDTA, 1 M NaCl, 0.1% Tween20) on a DynaMag-2 magnetic separator (Invitrogen, cat. no. 12321D) and resuspended in 2× W&B (10 mM Tris-HCl pH8.0, 2 mM EDTA, 2 M NaCl, 0.2% Tween20). An equal volume of Dynabeads in 2× W&B was added to DNA in TE buffer and incubated at 24 °C with shaking in a ThermoMixer C at 800 rpm for 30 min. Dynabeads bound to biotin–EdU fragments were washed three times in 1 ml of 1× W&B, twice in 1 ml EB, and once in 1 ml 1× T4 DNA ligase buffer (NEB). 
Dynabeads were resuspended in 50 μl end-repair reaction mix (1× T4 DNA ligase buffer, 0.4 mM dNTPs, 2.7 U T4 DNA polymerase (NEB), 9 U T4 Polynucleotide Kinase (NEB), and 1 U Klenow fragment (NEB)) and incubated at 24 °C with shaking at 800 rpm for 30 min. Dynabeads were washed once in 1 ml 1× W&B, twice in 1 ml EB, and once in 1 ml NEBuffer 2 (NEB) and resuspended in 50 μl A-tailing reaction mix (1× NEB dA-tailing buffer and 20 U Klenow fragment exo- (NEB)), followed by incubation at 37 °C with shaking at 800 rpm for 30 min. Dynabeads were then washed again once in 1 ml NEBuffer 2 and resuspended in 115 μl ligation reaction mix (1× quick ligase buffer (NEB), 6,000 U quick ligase (NEB), 5 nM annealed TruSeq truncated adaptor) and incubated at 25 °C with shaking at 600 rpm for 20 min. The ligation reaction was stopped by adding 50 mM EDTA, and Dynabeads were washed three times in 1 ml 1× W&B, three times in 1 ml EB, and finally resuspended in 8 μl EB + 10 μl 2× KAPA HiFi HotStart Ready Mix (Kapa Biosystems). Primers (10 mM) 5′-CAAGCAGAAGACGGCATACGAGATXXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T-3′ and 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC*T-3′ (the asterisk indicates a phosphorothioate bond and XXXXXX a TruSeq index sequence) were added with 37 μl PCR reaction mix (20 μl 2× KAPA HiFi HotStart Ready Mix, 17 μl H2O) for a final volume of 60 μl. DNA was amplified using the following PCR programme: 98 °C, 45 s; 15 cycles of [98 °C, 15 s; 63 °C, 30 s; 72 °C, 30 s]; 72 °C, 5 min. PCR products were separated from Dynabeads and cleaned using 1.8× volume AMPure XP beads. We isolated 150–200-bp bands on 2% agarose gel and purified them using QIAquick Gel Extraction Kit (Qiagen). Before sequencing using Illumina NextSeq 550 (75 bp single read), library concentrations were calculated by KAPA Library Quantification Kit for Illumina Platforms (Kapa Biosystems).

To specifically sequence only EdU-incorporated strands of DNA (strand-specific SAR-seq), the SAR-seq protocol was followed exactly as above, with additional steps before PCR amplification. After washing three times in 1 ml 1× W&B and three times in 1 ml EB post-ligation, Dynabeads were washed in 50 μl 1× SSC buffer and resuspended in 20 μl 0.15 M NaOH for 10 min at room temperature to denature DNA strands. Beads were placed back on the DynaMag-2 magnetic separator and washed once with 20 μl 0.1 M NaOH, once with 1 ml 1× W&B, and twice with 1 ml EB, and then resuspended in 8 μl EB + 10 μl 2× KAPA HiFi HotStart Ready Mix. Primers and PCR reaction mix were added as above.

In the indicated experiments, i3Neurons were treated with the following compounds. Aphidicolin (Aph; 5 μM), etoposide (50 μM), olaparib (10 μM), veliparib (10 μM) or talazoparib (1 μM) was added along with EdU 18 h before collection. The DNA polymerase-α inhibitor (POLαi) adarotene (1 μM) or Aph (50 μM) was added with EdU 14 h or 24 h before collection, respectively. Hydroxyurea (HU; 10 mM) powder was dissolved fresh into water to make a 1 M stock each time it was used. HU was added on day 3 of differentiation and again on day 6 at 18 h before collection along with EdU for a total of 4 days of treatment. For dideoxynucleoside (ddN) chain termination, 5 μM of each of ddA, ddT, ddG, and ddC chain-terminating nucleosides (20 μM total) were added to i3Neuron culture for 18 h before cell collection for END-seq and S1 END-seq experiments.

END-seq and S1 END-seq

To dissociate i3Neurons for use in END-seq, we performed a modified papain dissociation protocol46. Papain (Worthington Biochemical, cat. no. LK003178) was dissolved into TrypLE Express Enzyme (1×) with no phenol red and warmed at 37 °C for 10 min. Then, day 7 i3Neurons on a 15-cm plate were washed with PBS and treated with 5 ml papain/TrypLE for 1 min at 37 °C. Papain was removed with gentle pipetting and 5 ml trituration solution was added (30 ml i3Neuron culture medium, 10 μM ROCK inhibitor, and 1 vial freshly dissolved DNase). Cells were collected and gently pipetted 3–10 times in a conical tube using a wide-bore 10-ml pipette, being careful not to over-digest the sample. Cells were then washed with PBS, pelleted and resuspended in PBS containing 0.1% BSA and 0.5 mM EDTA, and kept on ice. Cells were processed for END-seq as previously described47. For S1 END-seq, cells were collected and embedded in 1% agarose plugs, lysed and digested with proteinase K (1 h at 50 °C, followed by 7 h at 37 °C), washed with TE buffer, and then treated with RNase A for 1 h at 37 °C. Plugs were then washed in EB and equilibrated in S1 nuclease buffer (40 mM sodium acetate pH 4.5, 300 mM NaCl, 2 mM ZnSO4) for 30 min. We added 1.8 U S1 nuclease to 100 μl S1 nuclease buffer per plug and incubated them on ice for 15 min to allow the enzyme to diffuse into the plug. The reaction mix was then placed at 37 °C for 20 min before addition of EDTA (10 mM final concentration) to terminate the reaction. Finally, plugs were processed through the standard END-seq protocol.

AsiSI induction in pre-B cells

Abelson-transformed mouse pre-B cells48 were retrovirally transduced with tetracycline-inducible ER-AsiSI49. Cells were arrested in G1 with 3 μM imatinib for 24 h, followed by addition of 3 μg/ml doxycycline for 24 h, and then further addition of 1 μM 4OHT for 18 h to induce AsiSI nuclear localization, as previously described19.

Construction of the Dox-Cas9-D10A nickase

Dox-inducible Cas9-D10A was constructed using isothermal assembly50. In brief, a plasmid encoding Dox-inducible Cas9 nuclease was obtained from Addgene. pCW-Cas9-Blast was a gift from M. Babu (Addgene plasmid no. 83481; http://n2t.net/addgene:83481; RRID: Addgene_83481). This plasmid was digested with NheI/BamHI and assembled with 2 PCR fragments (Nickase-P1, Nickase-P2) and transformed into competent cells. Gel-extracted PCR fragments of Nickase-P1 and Nickase-P2 were generated using the Q5 HotStart 2× mastermix with the primers as follows using Addgene-83481 as template: Nickase-P1 forward primer, GTCAGATCGCCTGGAGAATTG; Nickase-P1 reverse primer, tgC CAGGCCGATG CTGTACTTCT; Nickase-P2 forward primer, AGAAGTACAG CATCGGCCTG Gca ATCGGCACCAACTCTGTGGG; Nickase-P2 reverse primer, TGCCTTGGAAAAGGCGCAAC.
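Isothermal assembly relies on adjacent fragments sharing identical end sequence. The following minimal sketch checks this for the two primers listed above: the reverse complement of the Nickase-P1 reverse primer should match the start of the Nickase-P2 forward primer. The spaces shown in the printed primers are treated here as display formatting and removed, and the lower-case bases are assumed to mark the introduced substitution; both are assumptions, and the function names are illustrative only.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving letter case."""
    pairs = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(pairs)[::-1]


# Primer sequences as given in the text, with display spaces removed.
P1_REVERSE = "tgCCAGGCCGATGCTGTACTTCT"
P2_FORWARD = "AGAAGTACAGCATCGGCCTGGcaATCGGCACCAACTCTGTGGG"


def shared_overlap(p1_reverse: str, p2_forward: str) -> str:
    """Leading bases of the P2 forward primer that match the reverse
    complement of the P1 reverse primer (case-insensitive, from position 0)."""
    p1_end = reverse_complement(p1_reverse).upper()
    p2 = p2_forward.upper()
    n = 0
    while n < min(len(p1_end), len(p2)) and p1_end[n] == p2[n]:
        n += 1
    return p2_forward[:n]


overlap = shared_overlap(P1_REVERSE, P2_FORWARD)
print(overlap, len(overlap))
```

With the sequences above, the two primers share a 23-nt overlap that spans the full length of the Nickase-P1 reverse primer and includes the lower-case bases.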

MCF10A Cas9D10A-inducible cell line

To produce the MCF10A Cas9-D10A inducible cell line, we infected MCF10A cells (sourced from ATCC, not authenticated, and not mycoplasma tested) with lentivirus containing Dox-Cas9-D10A and cells were selected with 10 μg/ml blasticidin. Cas9-D10A expression was induced using 3 μg/ml doxycycline and confirmed by western blotting. Three guide RNAs (sequences: 5′-TGGGGCGTTTATCCGATGTC-3′; 5′-GCACTAGCCGGCCCGGACGT-3′; 5′-CCAGCCTGGTAGCGCCCCCA-3′) were cloned into the Lenti-Guide-NLS–GFP vector51 and the MCF10A Cas9-D10A inducible cell line containing the three guide RNAs was selected using 2 μg/ml puromycin. To identify the nicks by S1 END-seq, cells were arrested in G1 for 48 h with 5 μM Palbociclib with doxycycline added during the last 24 h to induce Cas9-D10A, followed by cell collection and S1 END-seq processing.

ChIP–seq and western blotting

Fifteen million i3Neurons or rat neurons were fixed in 1% formaldehyde at 37 °C for 10 min. The fixation reaction was quenched with glycine at a final concentration of 125 mM. Cells were spun down and washed twice with chilled PBS, and pellets were then snap frozen on dry ice and finally stored at −80 °C until sonication. Sonication, immunoprecipitation, and library preparation steps were done as previously reported17. All antibodies were pre-conjugated to 40 μl magnetic protein A beads before immunoprecipitation: H3K4me1 (5 μg, Abcam no. 8895); MLL4 (antibody courtesy of K. Ge); H3K27ac (5 μg, Abcam no. 4729); H3K27me3 (5 μg, Millipore no. CS200603); H3K4me3 (6 μl, Abcam no. 8580); H3K9me3 (10 μg, Active Motif, cat. no. 39765); CTCF (6 μl, Millipore, cat. no. 07-729); RNA polymerase II (8 μg, Abcam no. 26721); H3K36me3 (5 μg, Abcam cat. no. ab9050); anti-PAR (5 μg, Millipore-Sigma MABE1016); XRCC1 (2.6 μg, Novus, cat. no. NBP1-87154). For western blotting, cells were collected and lysed in a buffer containing 50 mM Tris-HCl (pH 7.5), 200 mM NaCl, 5% Tween-20, 0.5% NP-40, 2 mM PMSF, 2.5 mM β-glycerophosphate (all from Sigma-Aldrich) and protease inhibitor cocktail tablet (complete Mini, Roche Diagnostics). Equal amounts of protein were loaded into precast mini-gels (Invitrogen) and resolved by SDS–PAGE. Proteins were blotted onto a nitrocellulose membrane, blocked with Intercept (TBS) blocking buffer (LI-COR Biosciences) and incubated with the corresponding primary and secondary antibodies: anti-DNA polymerase β (1:1,000, Millipore no. ABE1408), anti-tubulin (1:10,000, Sigma-Aldrich no. T5168), IRDye 800 CW goat anti-rabbit (1:15,000, Li-Cor no. 926-32211), and IRDye 680 RD goat anti-mouse (1:15,000, Li-Cor no. 926-68070).

Selective chemical labelling and capture of 5hmC and 5fC

5hmC-Seal was performed as previously described52 with modifications. In brief, 80 μg genomic DNA was resuspended in TE buffer and sonicated to 200 bp fragments using a Covaris S220 sonicator. Fragmented DNA was precipitated using ethanol and sodium acetate and resuspended in TE buffer. The selective 5hmC chemical labelling was performed in 100 μl glucosylation buffer (50 mM HEPES buffer pH 8.0, 25 mM MgCl2) containing above fragmented DNA, β-GT (NEB, cat. no. M0357), and UDP-azide-glucose (Active Motif, cat. no. 55020), and incubated at 37 °C for 1.5 h. After the reaction, DNA was cleaned up with QIAquick Nucleotide Removal Kit. The labelled DNA was eluted with ddH2O, after which 1 mM DBCO-PEG4-Biotin (Click Chemistry Tools) was added and incubated at 37 °C for 2 h. Then, the biotin-labelled DNA was pulled down using C1 Streptavidin beads (ThermoFisher, cat. no. 650-01) for 30 min at room temperature. The captured DNA fragments were processed for library construction as described for SAR-seq (above).

For 5fC-Seal, we modified the previously described protocol10, which reduces 5fC to 5hmC using NaBH453. We labelled the newly generated 5hmC (derived from 5fC reduction with NaBH4) with an azide-modified glucose as described above for 5hmC-Seal. In brief, 80 μg of fragmented i3Neuron genomic DNA was incubated in 100 μl glucosylation buffer (50 mM HEPES buffer pH 8.0, 25 mM MgCl2) containing unmodified UDP-glucose (NEB, Cat No. M0357), and β-GT for 1.5 h at 37 °C. The labelled DNA was cleaned up with QIAquick Nucleotide Removal Kit. Then, an equal volume of freshly prepared NaBH4 (Aldrich, cat. no. 213462) solution was added to the glucose-blocked DNA solution. The reaction mixture was vortexed and incubated in a Thermomixer for 15 min at room temperature. The chemical labelling and capture were performed as described above for 5hmC-Seal.

In situ Hi-C from hiPS cell-derived i3Neurons

Two in situ Hi-C libraries were generated from 10 million cultured hiPS cell-derived i3Neurons as previously described54. In brief, in situ Hi-C consists of seven steps: (1) crosslinking cells with formaldehyde, (2) DNA digestion using MboI, (3) filling in and marking ends with biotin, (4) proximity ligation, (5) DNA shearing, (6) pulling down the biotinylated ligation junctions with streptavidin beads, and (7) paired-end sequencing. As quality control (QC), we confirmed efficient restriction, ligation and DNA shearing with an agarose DNA gel, and confirmed appropriate size selection using the Agilent 4200 TapeStation system after steps (5) and (6). For the final QC, we performed 100-bp paired-end sequencing on the Illumina NextSeq to assess the quality of the libraries based on the percentages of intra-chromosomal reads and long-range (>20 kb) reads, and library complexity. The Hi-C libraries were sequenced with 150-bp paired-end reads using an Illumina NovaSeq 6000.

RNA extraction, RNA-seq and quantitative real-time PCR

To extract RNA, cells were plated on six-well dishes and washed with PBS, and then 500 μl TRI Reagent (Zymo Research, cat. no. R2050-1-200) was added directly to the cells. The lysed cells were collected using a cell scraper. To isolate RNA, we used a Direct-zol RNA miniprep kit (Zymo Research, cat. no. R2052), according to the manufacturer’s instructions including the optional DNase step. For quantitative PCR, total RNA was reverse-transcribed with iSCRIPT Advanced cDNA Synthesis Kit (Bio-Rad, cat. no. 1725037) according to the manufacturer’s instructions. The resulting cDNA was diluted tenfold and used for quantitative real-time PCR (qRT–PCR) analyses with specific primer and probe sets (Bio-Rad, ACTB qHsaCEP0036280; PARP1 qHsaCEP0052423; XRCC1 qHsaCIP0033686; POLB qHsaCEP0057881; FEN1 qHsaCEP0039485; PNKP qHsaCEP0057803) in a final volume of 20 μl, which contained 10 μl SsoAdvanced Universal Probes Supermix (Bio-Rad, cat. no. 1725280) and 3 μl cDNA. qRT–PCR was performed in triplicate wells per sample on a CFX96 Real-Time System (Bio-Rad). For RNA sequencing, six biological replicates were sequenced. Total RNA was enriched for polyA and sequenced 2 × 75 bp on a HiSeq machine.

ATAC–seq

ATAC–seq was performed as previously described55. i3Neurons were grown on 96-well plates (0.05 × 10⁶ cells per well). Cells were washed gently with PBS. Then, 100 μl accutase per well was added and removed, and the plate was incubated at 37 °C for 5 min. Fifty microlitres of cold lysis buffer (10 mM Tris-HCl, pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% IGEPAL CA-630) was added directly to the well for 10 min. Cells were then pipetted 10–20 times to break clumps and centrifuged at 500g for 10 min at 4 °C. The nucleus pellet was resuspended in the Nextera transposition reaction mix (25 μl 2× TD buffer, 2.5 μl Nextera Tn5 transposase (Illumina, cat. no. FC-121-1030), and 22.5 μl nuclease-free H2O) on ice, then incubated for 30 min at 37 °C. The tagmented DNA was purified using the Qiagen MinElute kit and eluted with 10 μl EB. Following purification, library fragments were amplified using the Nextera index kit (Illumina, cat. no. FC-121-1011) under the following cycling conditions: 72 °C for 5 min, 98 °C for 30 s, followed by thermocycling at 98 °C for 10 s, 63 °C for 30 s, and 72 °C for 1 min for five cycles. To prevent saturation due to over-amplification, a 5-μl aliquot was then removed and subjected to qPCR for 20 cycles for calculation of the optimal number of cycles needed for the 45-μl reaction that remained. The number of additional cycles required was determined by calculating the cycle number that corresponded to a quarter of the maximum fluorescence intensity. Adding seven cycles to this estimate was found to yield optimal libraries. PCR reactions were subsequently cleaned with Agencourt AMPure XP beads (Beckman Coulter) and run on a 2% agarose gel, and a smear of 200–800 bp was cut and gel-purified using a QIAquick Gel Extraction Kit (QIAGEN). Library concentration was determined using a KAPA Library Quantification Kit for Illumina Platforms (Kapa Biosystems). Sequencing was performed on the Illumina NextSeq 500 (75-bp paired-end reads).
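The cycle-number rule described above can be sketched as follows. This is an illustrative reading, not the authors' script: it assumes the qPCR fluorescence is available as one value per cycle, takes the first cycle at which the signal reaches a quarter of the maximum (no interpolation between cycles), and then adds the seven-cycle offset stated in the text. The function names and the example curve are hypothetical.

```python
def quarter_max_cycle(fluorescence, fraction=0.25):
    """Return the 1-based qPCR cycle at which fluorescence first reaches
    `fraction` of its maximum. `fluorescence[i]` is the signal after cycle i+1."""
    if not fluorescence:
        raise ValueError("fluorescence curve is empty")
    threshold = fraction * max(fluorescence)
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle
    raise ValueError("threshold never reached")


def additional_cycles(fluorescence, offset=7):
    """Quarter-maximum cycle plus the empirical offset given in the text."""
    return quarter_max_cycle(fluorescence) + offset


# Hypothetical 20-cycle amplification curve (arbitrary fluorescence units).
curve = [1, 2, 4, 8, 16, 32, 64, 90, 98, 100] + [100] * 10
print(quarter_max_cycle(curve), additional_cycles(curve))
```

For this made-up curve the quarter-maximum threshold is 25 units, first reached at cycle 6, so the sketch suggests 13 additional cycles.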

Genome alignment

SAR-seq, END-seq, ChIP–seq and Seal reads were aligned to the reference genome (hg19 for human i3Neuron and iMuscle, mm10 for mouse pre-B cells, or rn6 for rat primary neurons) using bowtie (v.1.1.2)56 with parameters -n 3 -l 50 -k 1 for END-seq and -n 2 -l 50 -m 1 for the rest. ATAC–seq reads were aligned by bowtie (v.2.4.1)57. RNA-seq reads were aligned by STAR (v.2.7.6a)58. Functions ‘view’ and ‘sort’ of samtools (v.1.11)59 were used to convert and sort the aligned .sam files to sorted .bam files. .bam files were further converted to .bed files using the bedtools (v.2.29.2) bamToBed command60. Mitochondrial reads were removed in SAR-seq for intensity comparisons.
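As a rough illustration of the alignment and conversion steps, the sketch below assembles the command lines as argument lists without running them. Only the bowtie parameters -n, -l, -k and -m come from the text; the -S flag (SAM output), the index name and all file names are assumptions added for the example.

```python
def bowtie_command(assay, index, fastq, sam_out):
    """Build a bowtie (v1) argument list using the parameters stated in the
    text. The -S flag (SAM output) and the file names are assumptions."""
    if assay == "END-seq":
        options = ["-n", "3", "-l", "50", "-k", "1"]
    else:  # SAR-seq, ChIP-seq and Seal
        options = ["-n", "2", "-l", "50", "-m", "1"]
    return ["bowtie", *options, "-S", index, fastq, sam_out]


def conversion_commands(sam, prefix):
    """samtools view/sort followed by bedtools bamtobed.
    bamtobed writes BED to standard output, to be redirected to a file."""
    unsorted_bam = prefix + ".unsorted.bam"
    sorted_bam = prefix + ".sorted.bam"
    return [
        ["samtools", "view", "-b", "-o", unsorted_bam, sam],
        ["samtools", "sort", "-o", sorted_bam, unsorted_bam],
        ["bedtools", "bamtobed", "-i", sorted_bam],
    ]


# Hypothetical index and file names.
end_seq = bowtie_command("END-seq", "hg19", "endseq.fastq", "endseq.sam")
sar_seq = bowtie_command("SAR-seq", "hg19", "sarseq.fastq", "sarseq.sam")
steps = conversion_commands("sarseq.sam", "sarseq")
print(" ".join(end_seq))
```

Each list could be passed to `subprocess.run` once the tools and a genome index are installed; that step is left out so the sketch stays runnable without them.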

Peak calling

We used MACS (v.1.4.3)61 to call SAR-seq, XRCC1 ChIP–seq and ATAC–seq peaks. SAR-seq, XRCC1 and CTCF ChIP–seq peaks with more than 10-fold enrichment over background were kept. Peaks of histone modifications determined by ChIP–seq were called by SICER62 v.2-1.0.2 with default parameters. Peaks within blacklisted regions (https://sites.google.com/site/anshulkundaje/projects/blacklists) were filtered63. Overlapping SAR-seq peaks from three non-treated replicates shown in Extended Data Fig. 2b were used for most of the analyses. As peaks of ddN S1 END-seq are always clustered, subpeaks were called by the PeakSplitter tool of PeakAnalyzer with parameters -c 15 -v 0.6 (ref. 64). Peak summits on both strands were identified, and the distance between the peak summit on the negative strand and its closest peak summit on the positive strand was calculated and represented in Extended Data Fig. 9f.
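The summit-to-summit distance described above can be sketched with a sorted list and binary search. The sign convention (positive-strand position minus negative-strand position), the restriction to a single chromosome and the function name are assumptions made for the example; the text does not specify them.

```python
from bisect import bisect_left


def nearest_positive_distances(negative_summits, positive_summits):
    """For each negative-strand summit, return the signed distance to the
    closest positive-strand summit (positive minus negative position).
    All coordinates are assumed to lie on the same chromosome."""
    positives = sorted(positive_summits)
    if not positives:
        raise ValueError("no positive-strand summits supplied")
    distances = []
    for neg in negative_summits:
        i = bisect_left(positives, neg)
        # Only the summits immediately left and right of `neg` can be closest.
        candidates = positives[max(i - 1, 0): i + 1]
        closest = min(candidates, key=lambda pos: abs(pos - neg))
        distances.append(closest - neg)
    return distances


# Hypothetical summit coordinates (bp).
example = nearest_positive_distances([100, 250], [90, 130, 260])
print(example)
```

Here the summit at 100 is closest to the positive-strand summit at 90 and the summit at 250 is closest to the one at 260, so the distances are -10 and 10.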

Quantification for sequencing data

For SAR-seq, Seal, and ChIP–seq, reads per kilobase of transcript per million mapped reads (RPKM) was calculated. For RNA-seq, fragments per kilobase of transcript per million mapped reads (FPKM) was calculated using cufflinks41 (v.2.2.1) based on the annotation from GENCODE v.33 (ref. 65). Mean values of replicates were used for analyses.
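For reference, RPKM follows directly from its definition: the read count in a region, divided by the region length in kilobases and by the number of mapped reads in millions. The sketch below applies this to a peak region and averages two replicates; the counts, lengths and library sizes are made-up numbers.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


def mean_rpkm(replicates):
    """Mean RPKM over replicates, each given as
    (read_count, region_length_bp, total_mapped_reads)."""
    values = [rpkm(*replicate) for replicate in replicates]
    return sum(values) / len(values)


# Hypothetical 2-kb peak in two replicates of 10 million mapped reads each.
rep1 = rpkm(500, 2000, 10_000_000)
rep2 = rpkm(300, 2000, 10_000_000)
average = mean_rpkm([(500, 2000, 10_000_000), (300, 2000, 10_000_000)])
print(rep1, rep2, average)
```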

Enhancer enrichment and super-enhancer identification

The bedtools shuffle command with the -chrom parameter was used to generate 1,000 random sets for SAR-seq peaks and ATAC–seq peaks, respectively, to estimate enrichments at genomic features. One-kilobase regions upstream of transcription start sites from GENCODE v.33 annotation were defined as promoters in analyses. Super-enhancers were separated from conventional enhancers by identifying an inflection point of H3K27ac signal versus enhancer rank66.
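The text does not say how the 1,000 shuffled sets were turned into an enrichment value. One common approach, sketched here purely as an assumption, compares the observed number of peaks overlapping a feature with the mean over the shuffled sets and reports an empirical P value with a pseudocount. The overlap counts below are invented.

```python
def fold_enrichment(observed_overlap, shuffled_overlaps):
    """Observed overlap count divided by the mean over shuffled peak sets."""
    mean_random = sum(shuffled_overlaps) / len(shuffled_overlaps)
    return observed_overlap / mean_random


def empirical_p_value(observed_overlap, shuffled_overlaps):
    """Fraction of shuffled sets with overlap at least as large as observed,
    with a +1 pseudocount so the estimate is never exactly zero."""
    hits = sum(1 for count in shuffled_overlaps if count >= observed_overlap)
    return (hits + 1) / (len(shuffled_overlaps) + 1)


# Hypothetical counts of peaks overlapping enhancers; a real analysis
# would use 1,000 shuffled sets rather than four.
shuffled = [100, 120, 80, 100]
fold = fold_enrichment(300, shuffled)
p_value = empirical_p_value(300, shuffled)
print(fold, p_value)
```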

Gene Ontology analysis

We used the DAVID web-tool67 to find the GO terms that were enriched for genes containing the top 2,000 SAR-seq peaks. The most significant Gene Ontology Biological Process terms and their associated false discovery rate (FDR) values are reported.

Motif analysis

MEME-ChIP68 of the MEME suite was used to identify the common sequence motif in the nucleotide sequences within ± 500 bp of the summits of the top 5,000 SAR-seq peaks. The composite DNA sequence motif shown in Fig. 2c was plotted using ggseqlogo in R.

Kinetics model for EdU labelling

The rate of EdU labelling was estimated under the assumptions that the fraction of synthesis events at a given site is fixed across different EdU incorporation times and that 18 h is enough time for EdU to incorporate into all synthesis events at a given site. The fraction of EdU-labelled synthesis events after h hours at a given site was denoted $[\mathrm{EdU}]_h = 1 - e^{-kh}$, where k represents the rate of EdU labelling at that site; $[\mathrm{EdU}]_h$ was estimated as the SAR-seq intensity at h hours divided by the SAR-seq intensity at 18 h. The intensity values (RPKM) of SAR-seq at different time points (1, 2, 4, 8 and 18 h of EdU) were fitted to the function $[\mathrm{EdU}]_h = 1 - e^{-kh}$ using the nls function in R. The k value shown in Extended Data Fig. 2f was derived using the top 2,000 SAR-seq peaks.
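A minimal sketch of such a fit is given below, assuming the saturating form 1 − e^(−kh) with k > 0. The original analysis used nls in R; this stand-in minimizes the same sum of squared residuals with a one-dimensional golden-section search in plain Python, which is a different optimizer chosen only to keep the example dependency-free. The search bounds and the synthetic data are assumptions.

```python
import math


def labelled_fraction(k, h):
    """Model: fraction of synthesis events labelled after h hours at rate k."""
    return 1.0 - math.exp(-k * h)


def sum_squared_error(k, hours, fractions):
    return sum((y - labelled_fraction(k, h)) ** 2
               for h, y in zip(hours, fractions))


def fit_rate(hours, fractions, lower=1e-6, upper=10.0, tolerance=1e-9):
    """Least-squares estimate of k by golden-section search on [lower, upper].
    Assumes the error has a single minimum in that interval."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lower, upper
    while b - a > tolerance:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sum_squared_error(c, hours, fractions) < sum_squared_error(d, hours, fractions):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


# Synthetic, noise-free fractions generated with k = 0.2 per hour at the
# labelling times used in the text.
hours = [1, 2, 4, 8, 18]
fractions = [labelled_fraction(0.2, h) for h in hours]
k_hat = fit_rate(hours, fractions)
print(round(k_hat, 4))
```

With real data the fractions would be the SAR-seq RPKM at each time point divided by the RPKM at 18 h, so the 18-h point equals 1 by construction.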

Hi-C analysis

Using Juicer software69, .hic files were generated, and normalized contact matrices and observed over expected normalized contact matrices were obtained with a dump command. Loop or domain calls were also done using Juicer software69, and interaction matrices were visualized using Juicebox software. After two replicates had been processed, the final map was generated by merging these two replicates. The final bin resolution of the Hi-C map was 1 kb. From 2.1 billion raw reads generated together with QC and data-generation runs, we obtained 1.5 billion final total contacts whose read quality (Q) was over 30.

For compartment analysis, PC1 eigenvector values were extracted at 50-kb resolution by Juicer software, then the sign of the eigenvector in each chromosome was adjusted according to distribution patterns of the H3K27ac, H3K4me1 and ATAC–seq peaks in the whole chromosome. Among H3K4me1 peaks that did not overlap with transcription start sites within 1 kb, peaks that overlapped with SAR-seq peaks were extracted for the analysis (SAR-seq+H3K4me1+). Ten thousand random sequences were generated using the bedtools random program with 2,600 nt length, which is the average length of H3K4me1 peaks. While 40% of random sequences were located in the A compartment, 63% of SAR-seq+H3K4me1+ peaks were located in the A compartment.

Visualization

BedGraph files were generated using bedtools genomecov, normalized by reads per million (RPM) and then converted into .bigWig files using bedGraphToBigWig from UCSC pre-compiled utilities for visualization at the UCSC genome browser70,71.

Data matrices for heat maps were calculated using computeMatrix and plotted with plotHeatmap of deepTools suite72. Venn diagrams were plotted using VennDiagram package in R. Confocal images were processed using FIJI. Schematics were created using BioRender, and figures compiled in Adobe Illustrator.

Quantification of confocal Images

Confocal images were quantified using Nikon software. Images were first background subtracted. Then nuclei or EdU-positive cells were identified and counted using bright spot detection. The corresponding PAR signal intensity was measured per cell and reported as a mean. Data compilation and statistical analyses were performed using PRISM software.

Statistical analysis

Statistical analysis was performed using R version 3.6.2 (http://www.r-project.org). The statistical tests are reported in the figure legends and main text.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.