Main

To understand the principles that underlie chromatin heterogeneity as related to nucleosome positioning and chromatin accessibility, we developed the scMNase-seq technique to simultaneously measure nucleosome positioning and chromatin accessibility in single cells. We applied scMNase-seq to 48 NIH3T3 single cells, 198 mouse embryonic stem cells (ESCs), and 278 mouse naive CD4 T cells, obtaining on average about 3, 0.9 and 0.7 million unique fragments, respectively, for each cell type (Fig. 1a, Supplementary Table 1). Sequence reads from sorted human or mouse cells from a mixed population mapped exclusively to the respective genome, suggesting that there was no DNA contamination across cells (Extended Data Fig. 1a). Pooled single-cell reads revealed a size distribution that was consistent with that obtained by bulk-cell MNase-seq (Extended Data Fig. 1b). We considered fragments with a length between 140 and 180 bp as canonical nucleosomes, and fragments with a length ≤ 80 bp as subnucleosome-sized particles (Extended Data Fig. 1b, c). Compared to CD4 T cells and mouse ESCs (Fig. 1b), NIH3T3 libraries have the largest number of non-redundant reads (Extended Data Fig. 1d) and the highest genomic coverage (5–30%) of nucleosomes, probably owing to the polyploidy of NIH3T3 cells (Extended Data Fig. 1e). Nevertheless, all three cell types have a similar nucleosome density across different genomic regions, which suggests that representation of the genome is relatively even (Extended Data Fig. 1f). The nucleosome positioning and the enrichment of subnucleosome-sized particles surrounding DNase I hypersensitive sites (DHSs), the transcription start sites (TSSs) of active genes and CTCF-binding sites were consistent between pooled scMNase-seq and bulk-cell MNase-seq data (Fig. 1c, Extended Data Fig. 2a–h).
The density of subnucleosome-sized particles from pooled single cells is correlated with the DNase I tag density at DHSs and with gene expression at TSSs, suggesting that subnucleosome-sized particles are predictive of chromatin accessibility (Extended Data Fig. 2i, j). Moreover, the percentage of DHSs detected by scMNase-seq was higher than that detected by single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq)10 with the same sequencing redundancy (owing to the higher complexity and larger number of non-redundant reads of scMNase-seq libraries), although the percentage of DHSs recovered per non-redundant read was lower for scMNase-seq subnucleosome-sized particles than for scATAC-seq fragments (Extended Data Fig. 2k–n). Nucleosome positions from single cells, aggregated nucleosome density from pooled single cells and tag density from bulk-cell MNase-seq at representative cell-type-specific genes are shown for all three cell types (Fig. 1d). Notably, the similarity of aggregated nucleosome profiles between pooled single cells and bulk cells is correlated with nucleosome positioning stringency and nucleosome coverage, and is higher for active promoters than it is for silent promoters (Fig. 1d, Extended Data Fig. 2o). These results demonstrate that scMNase-seq can simultaneously measure nucleosome positioning and chromatin accessibility in single cells.
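
As an illustration of the fragment size classes defined above, the following minimal Python sketch assigns fragment lengths to the two categories. It assumes that the 140–180 bp window is inclusive at both ends, which the text does not state; the function names and the example lengths are hypothetical and are not taken from the scMNase-seq code.

```python
from collections import Counter


def classify_fragment(length_bp):
    """Return the size class of one fragment length in bp.

    Assumption: the 140-180 bp nucleosome window is inclusive at both ends.
    """
    if 140 <= length_bp <= 180:
        return "nucleosome"
    if length_bp <= 80:
        return "subnucleosome"
    return "other"  # fragments that fall in neither class


def size_class_counts(lengths):
    """Count fragments per size class for one library."""
    return Counter(classify_fragment(n) for n in lengths)


# Made-up fragment lengths, for illustration only.
example_lengths = [147, 160, 75, 40, 120, 181, 140, 80]
counts = size_class_counts(example_lengths)
print(dict(counts))
```

Fragments between 81 and 139 bp, or longer than 180 bp, fall into neither class in this sketch and are reported as "other".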

Fig. 1: scMNase-seq simultaneously measures the positions of nucleosomes and subnucleosome-sized particles in single cells.

a, Schema of scMNase-seq. b, Plot of non-redundant nucleosome read number (x axis) and genomic coverage of nucleosomes (y axis) for single NIH3T3 cells, CD4 T cells and mouse embryonic stem cells. c, Average density profiles of nucleosomes (red) and subnucleosome-sized particles (blue) relative to the TSSs of active genes (left) and CTCF-binding sites (right) for pooled CD4 T cell scMNase-seq data. Subnucl., subnucleosome-sized particles. d, Genome browser view of single-cell nucleosome positions for NIH3T3 cells, CD4 T cells and mouse ESCs at the TSSs of three representative cell-type-specific gene loci. Single-cell libraries that have at least one nucleosome within any of the three genomic regions are shown. Tracks for tag density of corresponding bulk-cell MNase-seq data (one representative from two repeated experiments is shown) and pooled scMNase-seq data (all single-cell libraries with detected nucleosomes in the selected genomic regions are included) are also shown. The nucleosome maps at expressed genes for each cell type are highlighted with a pink rectangle. The expression levels of genes are shown in the heat map above the tracks. 3T3, NIH3T3 cells; T, CD4 T cells; ESC, mouse ESC; RPKM, reads per kilobase of transcript per million mapped reads.

Although nucleosome positioning13 is well-studied5,14,15,16 on the basis of large numbers of pooled cells, genome-wide nucleosome spacing patterns are poorly understood because current knowledge about nucleosome spacing is limited to the positioned nucleosomes7,17,18. We profiled the distribution of nucleosome-to-nucleosome distance for different single cells and used relative peak height to measure the uniformity of nucleosome spacing for both positioned and non-positioned nucleosomes (Extended Data Fig. 3a–c, Supplementary Methods). This analysis revealed a high degree of spacing uniformity in single cells regardless of positioning stringency; by contrast, uniformity in spacing decreased as positioning stringency decreased when using either the pooled single cells or bulk-cell MNase-seq data19 (Extended Data Fig. 3d). The bulk-cell MNase-seq data failed to reveal the actual spacing pattern owing to the mixture of non-positioned nucleosomes from a population of different cells.
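
The idea behind the nucleosome-to-nucleosome distance distribution can be sketched in a few lines of Python. This is not the procedure from the Supplementary Methods: the exact definition of relative peak height is not reproduced here, and the statistic below (the share of distances that falls in the tallest histogram bin) is only a hypothetical stand-in for it. The function names, the 10-bp bin size, the 1,000-bp distance cap and the example coordinates are all assumptions.

```python
from collections import Counter


def pairwise_distances(centres, max_dist=1000):
    """Distances between all pairs of nucleosome centres (one chromosome,
    one cell) that lie no more than max_dist bp apart."""
    centres = sorted(centres)
    distances = []
    for i, left in enumerate(centres):
        for right in centres[i + 1:]:
            d = right - left
            if d > max_dist:
                break  # centres are sorted, so later pairs are farther apart
            distances.append(d)
    return distances


def distance_histogram(distances, bin_size=10):
    """Histogram keyed by the lower edge of each bin."""
    return Counter((d // bin_size) * bin_size for d in distances)


def tallest_bin_share(distances, bin_size=10):
    """Hypothetical stand-in for a uniformity measure: the fraction of all
    distances that falls in the most populated bin."""
    hist = distance_histogram(distances, bin_size)
    return max(hist.values()) / len(distances)


# Made-up nucleosome centres: an evenly spaced array and an irregular one.
uniform_array = [0, 185, 370, 555, 740]
irregular_array = [0, 150, 370, 520, 760]

uniform_share = tallest_bin_share(pairwise_distances(uniform_array))
irregular_share = tallest_bin_share(pairwise_distances(irregular_array))
print(uniform_share, irregular_share)
```

In the evenly spaced array the distances pile up at multiples of the repeat length, so the tallest bin holds a larger share of the distances than in the irregular array.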

The degree of uniformity in spacing in the promoter regions of silent genes is higher than that of active genes (Fig. 2a, b, Extended Data Fig. 4a, b), and uniformity is higher in non-DHS than in DHS regions (Fig. 2c, d, Extended Data Fig. 4a–c). Notably, the higher uniformity of spacing in non-DHS regions was also observed in single haploid mouse ESCs and haploid chromosome X in single mouse ESCs (Extended Data Fig. 4d–g), and was independent of MNase concentration (Extended Data Fig. 4h–m). Furthermore, nucleosome spacing in active chromatin regions associated with H3K4me1, H3K4me3, H3K27ac, H3K9ac and H2AZ shows a lower degree of uniformity than spacing in transcribed regions marked by H3K36me3, in heterochromatic regions marked by H3K27me3, or in regions not marked by any of the histone modifications that we studied (Extended Data Fig. 4n, o).

Fig. 2: Profiling nucleosome positioning and spacing in single cells reveals distinct nucleosome organization principles at active and silent chromatin regions.

a, Density plots of nucleosome-to-nucleosome distance within active-gene promoters (top) and silent-gene promoters (bottom) for bulk-cell MNase-seq, pooled 48 NIH3T3 single cells, one representative single cell and 48 single-cell scMNase-seq datasets. b, The relative peak heights based on the data from a reveal a higher degree of uniformity in spacing within silent-gene promoters than active-gene promoters. c, Density plots of nucleosome-to-nucleosome distance within DHS regions (top) and non-DHS regions (bottom) for bulk-cell MNase-seq (blue) and 48 single-cell scMNase-seq (red) datasets. d, The relative peak heights based on the data from c reveal a higher degree of uniformity in spacing within non-DHS regions than DHS regions. e, Cumulative density of variance in nucleosome positioning in active and silent genes within a cell (top) and across single cells (bottom), at −1 (left) and +1 (right) nucleosomes relative to the TSS. Top left, n = 7,574 and 13,107 nucleosome pairs for active and silent genes, respectively; bottom left, n = 164,512 and 304,847 nucleosome pairs for active and silent genes, respectively; top right, n = 11,388 and 17,631 nucleosome pairs for active and silent genes, respectively; bottom right, n = 237,006 and 416,328 nucleosome pairs for active and silent genes, respectively. P values were calculated using one-sided Mann–Whitney U-test. f, Cartoon illustrating nucleosome organization patterns in silent (left) and active (right) chromatin states. Rep., representative genomic region.

We next measured variation in nucleosome positioning not only across cells but also within single cells (across different alleles) by calculating the mean value of distances between two overlapping nucleosomes within genomic regions related to a particular feature (for example, active promoters) (Extended Data Fig. 5a). As expected, variation in nucleosome positioning around the TSSs of active genes—where nucleosomes are phased relative to TSS—is smaller than that around the TSSs of silent genes (Fig. 2e). In addition, nucleosome positions show smaller variation at the centre of DHSs and the centre of chromatin regions enriched in active histone modifications than they do elsewhere (Extended Data Fig. 5b–g).
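
The variation measure described above, the mean distance between overlapping nucleosomes, can be illustrated with a short Python sketch. It rests on assumptions that are not stated in the text: nucleosomes are represented by their centre coordinates, two nucleosomes count as overlapping when their centres are less than 147 bp apart, and all names and example values are hypothetical.

```python
from itertools import combinations

NUCLEOSOME_BP = 147  # assumed footprint; centres closer than this overlap


def overlapping_pair_distances(nucleosomes, across_cells=True):
    """Centre-to-centre distances of overlapping nucleosome pairs.

    nucleosomes: list of (cell_id, centre) tuples from one genomic region.
    across_cells=True keeps pairs from different cells; False keeps pairs
    from the same cell (for example, from different alleles).
    """
    distances = []
    for (cell_a, pos_a), (cell_b, pos_b) in combinations(nucleosomes, 2):
        same_cell = cell_a == cell_b
        if same_cell == across_cells:
            continue  # pair is of the kind that was not requested
        d = abs(pos_a - pos_b)
        if d < NUCLEOSOME_BP:
            distances.append(d)
    return distances


def mean_or_none(values):
    """Mean of a list, or None when there are no overlapping pairs."""
    return sum(values) / len(values) if values else None


# Made-up centres at one well-positioned and one poorly positioned site.
phased = [("c1", 100), ("c2", 106), ("c3", 103)]
fuzzy = [("c1", 100), ("c2", 160), ("c3", 40)]

phased_variation = mean_or_none(overlapping_pair_distances(phased))
fuzzy_variation = mean_or_none(overlapping_pair_distances(fuzzy))
print(phased_variation, fuzzy_variation)
```

A smaller mean distance indicates that overlapping nucleosomes sit at nearly the same position, as at the well-positioned site in this example.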

The results above reveal that there are different rules of nucleosome organization in different chromatin regions. In silent chromatin states—such as in repressed promoters and heterochromatic regions—nucleosomes are highly uniformly spaced, but are not positioned relative to the underlying genomic DNA across different arrays. By contrast, in active chromatin states—such as transcribed promoters and DHS regions—nucleosomes are positioned but are not as uniformly spaced (Fig. 2f). This model was further supported by the observation that nucleosomes in promoter regions of silent genes, non-DHS regions and heterochromatic regions show higher synchronized shift scores than nucleosomes in promoters of active genes, DHS regions and regions marked by active histone modifications (Extended Data Fig. 6a, b). Furthermore, the synchronized shift score is dependent on nucleosome spacing; the highest scores are in the spacing range of 180–185 bp, which is dominant throughout the genome in all single cells (Extended Data Fig. 6c, d). The nucleosome spacing might indicate a stable structure for packaging nucleosomes20 in silent chromatin states, which is probably collectively determined by chromatin assembly factors21, linker histones22,23 and the environment surrounding chromatin fibres. In active states, ATP-dependent chromatin remodelling activities15,24 may reposition nucleosomes1,25 and consequently change the local nucleosome spacing to facilitate chromatin accessibility and gene transcription. Notably, the average nucleosome spacing surrounding the DHSs is shorter than that in non-DHS regions (Extended Data Fig. 6e, f), which may be the result of repositioning of the nucleosomes to allow accessibility of the DHS regions.

Although nucleosomes are positioned surrounding DHSs to ensure chromatin accessibility7, extensive heterogeneity of chromatin accessibility across different single cells10,12 implies heterogeneity of nucleosome positioning at the same DHS. Profiling nucleosome-to-nucleosome distances over DHSs reveals two distinct peak patterns: one has a summit at about 190 bp and the other has a summit at about 300 bp, which presumably corresponds to two different chromatin states (closed or open) (Fig. 3a, b). More than 80% of the DHSs have both spacing types at the same DHS in different single cells (Fig. 3c), and the higher DNase I tag density at DHSs measured in bulk cells26 is associated with more wide-spacing DHSs in single cells (Extended Data Fig. 7a). Furthermore, the DHSs with a higher fraction of wide space—which is not related to MNase digestion—are associated with higher accessibility, when measured by bulk-cell DNase I hypersensitive sites sequencing (DNase-seq) or by scMNase-seq subnucleosome-sized particles (Extended Data Fig. 7b–d), and with lower variation in DHS accessibility and nucleosome positioning across different single cells (Fig. 3d, e). These results indicate that one DHS may have two types of nucleosome organization (wide or narrow spacing) across different single cells; the degree of accessibility of a DHS as well as the variation in DHS accessibility and nucleosome positioning across cells are directly linked to the ratio between the two states of nucleosome organization in different single cells.
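
A minimal Python sketch of how the fraction of wide spacing at one DHS could be tallied across single cells is given below. The text does not state the cutoff that separates the two peaks, so the 240-bp value used here is purely hypothetical, as are the function names and the example spacings.

```python
WIDE_CUTOFF_BP = 240  # hypothetical cutoff lying between the two peak summits


def spacing_type(distance_bp, cutoff=WIDE_CUTOFF_BP):
    """Label the spacing across a DHS in one cell as narrow or wide."""
    return "wide" if distance_bp >= cutoff else "narrow"


def wide_fraction(spacing_by_cell, cutoff=WIDE_CUTOFF_BP):
    """Fraction of cells in which the spacing across this DHS is wide.

    spacing_by_cell: dict mapping cell id to the distance (bp) between the
    two nucleosomes that flank the DHS in that cell.
    """
    labels = [spacing_type(d, cutoff) for d in spacing_by_cell.values()]
    return labels.count("wide") / len(labels)


def has_both_types(spacing_by_cell, cutoff=WIDE_CUTOFF_BP):
    """True if both spacing types occur at this DHS in different cells."""
    labels = {spacing_type(d, cutoff) for d in spacing_by_cell.values()}
    return labels == {"narrow", "wide"}


# Made-up spacings across one DHS in four cells.
example_dhs = {"cell1": 185, "cell2": 310, "cell3": 295, "cell4": 192}
fraction = wide_fraction(example_dhs)
print(fraction, has_both_types(example_dhs))
```

Grouping DHSs by this fraction would then allow accessibility and cell-to-cell variation to be compared between groups.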

Fig. 3: The bimodal distribution of nucleosome spacing across DHSs is associated with the cell-to-cell variation in nucleosome positioning and chromatin accessibility.

a, Schema of nucleosome spacing across a DHS and two chromatin states inferred by nucleosome spacing. b, Density plot of nucleosome spacing across a DHS within single cells reveals two peaks that correspond to narrow spacing (blue) and wide spacing (red). c, Heat map showing DHS frequency as a function of number of cells with narrow spacing and number of cells with wide spacing. The percentage of DHSs in which there are both types of spacing across a DHS in different single cells is shown. d, e, Box plots showing the cell-to-cell variation in nucleosome positioning (d) and chromatin accessibility (e) for five groups of DHSs, defined by fraction of wide space. Data represent 612, 2,088, 3,858, 2,500 and 1,586 DHSs (from left to right). f, Scatter plot showing nucleosome variance (y axis) and DHS variation (x axis) across cells for 106 bins of DHSs, based on DHS variation. Each dot represents the average of 100 DHSs for each bin. Pearson’s correlation was calculated. g, Box plot showing nucleosome variation at +1 nucleosome relative to TSS for two groups of genes sorted by expression variation. Low, bottom 25% (n = 1,171 genes); high, top 25% (n = 1,174 genes). In d, e and g, P values were calculated by one-sided Mann–Whitney U-test. In the box plots, centre line is median; boxes, first and third quartiles; whiskers, 1.5× interquartile range; notch, 95% confidence interval of the median.

Furthermore, variation in nucleosome positioning around DHSs is positively correlated with variation in accessibility across different single cells (Fig. 3f). The fraction of single cells with nucleosomes positioned around DHSs is correlated with the number of cells detected as DHSs (Extended Data Fig. 7e). The variation in nucleosome positioning around TSSs in different single cells is also correlated with variation in gene expression. The TSSs with +1 nucleosomes that show higher variation in nucleosome positioning also show higher variation in expression across different single cells (Fig. 3g). Genes for which expression was detected in a higher fraction of single cells exhibit positioned +1 nucleosomes in a higher fraction of single cells than do the genes with a lower fraction of expression (Extended Data Fig. 7f). The top 1,000 active genes with smallest nucleosome variance around their TSS across cells are enriched in common biological processes such as translation and protein transport (Extended Data Fig. 7g), consistent with the notion that house-keeping genes display less variation in nucleosome positioning. Furthermore, variation within a cell in nucleosome positioning around DHSs, or at the +1 nucleosome of the TSSs of active genes, is smaller than that across different single cells (Extended Data Fig. 7h, i). The variation in nucleosome positioning within the cell type is smaller than that across different cell types (Extended Data Fig. 7j). Clustering based on similarity in nucleosome positioning at all the DHSs across all the single cells from three cell types separated these single cells into three clusters that correspond to the respective cell types—this clustering is independent of experiment time and fragment-size ratio (Extended Data Fig. 7k).

The DNA sequence has an important role in nucleosome positioning4,14,16. Consistent with a previous report14, we observed high CC, GG and GC frequency in nucleosome-occupied sequences and high AA, TT, AT and TA frequency in flanking regions in single cells, as well as a periodical pattern that supports the rotational positioning of nucleosomes4,16 (Extended Data Fig. 8a). Smaller variation in nucleosome positioning is associated with lower frequencies of CC, GG and GC and higher frequencies of AA, TT, AT and TA in the flanking region (Extended Data Fig. 8b–e). We next explored the relationship between variance in DNA sequence and variance in nucleosome positioning. Our analysis shows that sequences occupied by nucleosomes have a higher fraction of alternative bases than those that are occupied by subnucleosome-sized particles, by tags from DNase-seq or by tags from CTCF chromatin immunoprecipitation with sequencing (ChIP-seq) (Extended Data Fig. 8f, g), which supports the notion that sequence variants influence transcription-factor binding27 and nucleosome positioning28. We found that single-base variance within nucleosome regions is positively correlated with nucleosome variance across cells (Extended Data Fig. 8h). Furthermore, the single-base variance at transcription-factor motifs is positively correlated with nucleosome variance at DHSs and is also positively correlated with gene expression variation across different single cells (Extended Data Fig. 8i, j).
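
The dinucleotide frequencies discussed above can be computed with a simple count over overlapping two-base windows. The Python sketch below is an illustration only: the function name, the two dinucleotide groupings as Python sets and the example sequence are hypothetical, and the sketch does not reproduce the position-resolved profiles or periodicity analysis shown in the Extended Data.

```python
GC_RICH = {"CC", "GG", "GC"}        # dinucleotides named in the text
AT_RICH = {"AA", "TT", "AT", "TA"}  # dinucleotides named in the text


def dinucleotide_fraction(sequence, dinucleotides):
    """Fraction of overlapping dinucleotides in `sequence` that belong to
    the given set. Returns 0.0 for sequences shorter than 2 bases."""
    sequence = sequence.upper()
    total = len(sequence) - 1
    if total <= 0:
        return 0.0
    hits = sum(1 for i in range(total) if sequence[i:i + 2] in dinucleotides)
    return hits / total


# Made-up sequence, for illustration only.
example_sequence = "GGCCAATT"
gc_fraction = dinucleotide_fraction(example_sequence, GC_RICH)
at_fraction = dinucleotide_fraction(example_sequence, AT_RICH)
print(gc_fraction, at_fraction)
```

Applying the same count separately to nucleosome-occupied sequences and to their flanking regions would give the two quantities that are compared in the text.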

Enhancers display remarkable cell-type specificity. Consistent with a previous observation that active enhancers are associated with a nucleosome loss3, the naive CD4 T cell-specific enhancers displayed decreased nucleosome occupancy in naive CD4 T cells, as revealed by the pooled scMNase-seq data from naive CD4 T cells; by contrast, enhancers that are specific to T helper 1 (TH1) and T helper 2 (TH2) cells showed only a very minor overall nucleosome loss in naive CD4 T cells (Extended Data Fig. 9a, b). However, examination of the nucleosome patterns at the TH1- and TH2-specific enhancers across different single cells revealed that 19% and 29% of naive CD4 T cells showed decreased nucleosome occupancy—which is independent of fragment-size ratio—at the de novo enhancers of TH1 and TH2 cells, respectively, whereas much smaller fractions of mouse ESCs and NIH3T3 cells showed decreased nucleosome occupancy at these enhancers (Fig. 4a, Extended Data Fig. 9c–e). Furthermore, subgroups of T cells that show decreased nucleosome occupancy at the TH1 and TH2 enhancers do not have much overlap (Extended Data Fig. 9f), which suggests they are specifically primed for the corresponding lineages. The TH1-specific enhancers with the most nucleosome loss in naive CD4 T cells are linked to genes that encode TH1 cytokine (Ifng) and key regulators (Tbx21, Stat1 and Stat4) (Extended Data Fig. 9g, h); the TH2-specific enhancers with the most nucleosome loss are linked to genes that encode key regulators for TH2 differentiation (Il4 and Stat6) (Extended Data Fig. 9i, j). Motif analysis revealed that the nucleosome loss at TH1 enhancers is specifically associated with motifs for RELA, which promotes TH1 differentiation; the nucleosome loss at TH2 enhancers is specifically associated with motifs for GATA3 and STAT6, which promote TH2 differentiation (Extended Data Fig. 9k). 
Gene Ontology analysis revealed that the higher-ranked nucleosome losses at both TH1 and TH2 enhancers are associated with functions in T cell differentiation, immune system process and cytokine production (Extended Data Fig. 9l, m). These results suggest that a large fraction of naive CD4 T cells have already experienced differentiating signalling events during the developmental history of these cells, which have primed the de novo enhancers of TH1 or TH2 cells by means of decreased nucleosome occupancy in the undifferentiated naive CD4 T cells.

Fig. 4: A subgroup of undifferentiated cells shows a nucleosome signature primed for differentiation.

a, b, A large fraction of naive CD4 T cells shows decreased nucleosome occupancy at the de novo enhancers that are formed either in TH1 (a, top) or TH2 cells (a, bottom), whereas only a small fraction of mouse ESCs and NIH3T3 cells shows nucleosome depletion at the same enhancers. By contrast, a large fraction of mouse ESCs shows depleted nucleosomes at the de novo enhancers that are formed in embryoid bodies (EBs) (b), whereas only a small fraction of naive CD4 T cells and NIH3T3 cells shows nucleosome depletion at the same enhancers. The fractions of primed cells are shown in red. Data represent 237 single naive T cells, 143 single mouse ESCs and 48 single NIH3T3 cells.

Similarly, mouse ESCs displayed a substantial nucleosome loss at the mouse ESC-specific enhancers but only a minor loss at embryoid-body (EB)-specific enhancers, which are formed de novo after differentiation from mouse ESCs (Extended Data Fig. 10a, b). Analysis of single cells revealed that 40% of mouse ESCs showed decreased nucleosome occupancy at the de novo EB-specific enhancers, whereas only 1% and 2% of naive CD4 T cells and NIH3T3 cells, respectively, showed decreased nucleosome occupancy at these enhancers (Fig. 4b, Extended Data Fig. 10c, d). The EB enhancers with the most nucleosome loss are linked to genes that include mesoderm markers (Brachyury (also known as T) and Wnt3) and endoderm markers (Gata4 and Gata6) (Extended Data Fig. 10e, f), and are associated with stem cell differentiation and development of various lineages, such as myeloid, neural tube and placental cells (Extended Data Fig. 10g). These results reveal the heterogeneity of cultured mouse ESCs, and suggest that some of these cells are already primed for differentiation by the reorganization of their nucleosome structure at enhancers formed in the differentiating EBs.

Here we introduce scMNase-seq, a powerful method for simultaneously measuring chromatin accessibility and nucleosome positioning in single cells, which may be paired with existing approaches—such as single-cell RNA-seq9, single-cell DNase-seq12 and/or single-cell ChIP-seq29—for systems analysis and to provide further insights into the molecular underpinning of cellular heterogeneity. Our application of scMNase-seq to three types of single cells revealed principles of nucleosome organization in different chromatin regions as well as heterogeneity of nucleosome positioning and spacing at DHSs. Our data suggest that the cellular heterogeneity of undifferentiated cells is related to heterogeneous nucleosome organization in critical regulatory regions, which reflects the differentiation potential of these cells.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Code availability

Custom code for the quantification of the uniformity of nucleosome spacing and the calculation of the nucleosome occupancy score is available at https://github.com/binbinlai2012/scMNase.

Data availability

The scMNase-seq datasets have been deposited in the Gene Expression Omnibus database with accession number GSE96688.