Abstract
Endogenous retroviruses (ERVs) are abundant and heterogeneous groups of integrated retroviral sequences that affect genome regulation and cell physiology throughout their RNA-centred life cycle1. Failure to repress ERVs is associated with cancer, infertility, senescence and neurodegenerative diseases2,3. Here, using an unbiased genome-scale CRISPR knockout screen in mouse embryonic stem cells, we identify m6A RNA methylation as a way to restrict ERVs. Methylation of ERV mRNAs is catalysed by the complex of methyltransferase-like METTL3–METTL144 proteins, and we found that depletion of METTL3–METTL14, along with their accessory subunits WTAP and ZC3H13, led to increased mRNA abundance of intracisternal A-particles (IAPs) and related ERVK elements specifically, by targeting their 5′ untranslated region. Using controlled auxin-dependent degradation of the METTL3–METTL14 enzymatic complex, we showed that IAP mRNA and protein abundance is dynamically and inversely correlated with m6A catalysis. By monitoring chromatin states and mRNA stability upon METTL3–METTL14 double depletion, we found that m6A methylation mainly acts by reducing the half-life of IAP mRNA, and this occurs by the recruitment of the YTHDF family of m6A reader proteins5. Together, our results indicate that RNA methylation provides a protective effect in maintaining cellular integrity by clearing reactive ERV-derived RNA species, which may be especially important when transcriptional silencing is less stringent.
Main
Mammalian genomes host millions of retrotransposons, including ERVs that derive from past retroviral infections and have integrated as permanent residents. Over the course of evolution, successive waves of ERVs have multiplied and diversified, providing a fertile ground for genomic innovations. However, ERVs potentially compromise genomic integrity by disrupting genome structure and expression6. In laboratory mice, roughly 12% of pathological mutations result from ERV integrations, half of which emanate from a single family of the ERVK class, the IAPs, that comprise approximately 2,800 full-length copies7. By contrast, human ERVs are mostly transposition-defective8. However, by providing cis-regulatory modules, ERVs can also divert regulatory networks and alter cellular states. Moreover, ERVs generate RNA, cDNA, RNA–DNA hybrid species and proteins, the accumulation of which is associated with and may contribute to senescence, cancer and neurodegenerative diseases3.
Homeostatic regulation of ERVs is achieved by surveillance at different steps of the ERV life cycle. Notably, chromatin-based silencing by DNA methylation and histone modifications and post-transcriptional control through RNA editing and RNA interference have extensively been characterized9. However, these control mechanisms are not active in all cell types or developmental periods. To identify unknown ERV-limiting factors, we performed a CRISPR–Cas9 loss-of-function screen for control of IAPEz, a highly active mouse ERV. We engineered mouse embryonic stem (ES) cells to carry constitutively expressed Cas9 and a reporter cassette with IAPEz regulatory elements: IAPEz(5′LTR-UTR+3-60n gag)-GFP-BlastR (Supplementary Table 1), in which ‘LTR’ denotes long terminal repeat and ‘UTR’ denotes the untranslated region (Fig. 1a, Extended Data Fig. 1a). Placing a doxycycline (dox)-responsive promoter upstream of the LTR sequence allowed us to test reactivation of the reporter after dox induction, to choose blasticidin-resistance over GFP as a more sensitive marker, and to adjust the blasticidin concentration for selection (Extended Data Fig. 1b, c). We also showed that the IAPEz reporter responded to known IAP repressors, by transducing cells with single guide RNAs (sgRNAs) against KAP110 (Extended Data Fig. 1d, e, Supplementary Fig. 1, Supplementary Tables 2, 3).
For the screen, we transduced IAPEz-reporter cells with a lentiviral genome-wide sgRNA library at multiplicity of infection (MOI) of 0.2–0.3 (Fig. 1b). The frequencies of sgRNAs after the application of blasticidin (5, 7 and 9 days) versus non-selected conditions were assessed via sequencing, and candidate genes were identified using MAGeCK (model-based analysis of genome-wide CRISPR–Cas9 knockout)11. Selection efficiency was verified by dropout of control intergenic sgRNAs, and genes were ranked on the basis of sgRNA P values (Fig. 1c, Supplementary Tables 4, 5). Several known IAP-repressing genes were among the top 100 hits: Resf1, Trp53, Daxx, Atrx, Uhrf1, Cbx1 and Dnmt112,13,14,15. Moreover, we identified several previously unknown candidates for IAP control (Supplementary Table 4). Notably, among the top hits were regulators of the N6-methyladenosine (m6A) mRNA methylation pathway, such as Mettl3, Mettl14, Wtap and Zc3h13, the enrichment of which gradually increased with extended blasticidin-mediated pressure (Fig. 1c, d). By repeating the screen under the most stringent selection, we confirmed significant enrichment for Mettl3, Mettl14 and Wtap sgRNAs (Extended Data Fig. 2a, b). m6A is the most abundant internal mark on mRNAs, and is crucial for organizing their fate—including export, decay and translation—in an array of biological processes such as development, cell differentiation, stress response and cancer4. The deposition of m6A is carried out by a nuclear complex with an enzymatically active core formed by methyltransferase-like METTL3 and METTL14 proteins and other calibrating subunits, including WTAP and ZC3H13. METTL3 and METTL14 form a heterodimer, in which METTL3 is the catalytic component and METTL14 facilitates binding to the RNA substrate. WTAP and ZC3H13 are essential for assembling the complex into the nucleus16,17.
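The comparison of sgRNA frequencies between blasticidin-selected and non-selected cells can be illustrated with a minimal sketch. This is not MAGeCK and does not reproduce its statistical model; it only computes a per-guide log2 fold change of depth-normalized counts. The function name, the pseudocount and the toy guide names and counts are hypothetical.

```python
import math

def log2_fold_changes(selected, unselected, pseudocount=1.0):
    """Per-sgRNA log2 fold change of counts per million (selected vs non-selected).

    `selected` and `unselected` map sgRNA identifiers to raw read counts.
    The pseudocount (an assumption) avoids taking the log of zero.
    """
    total_sel = sum(selected.values())
    total_unsel = sum(unselected.values())
    lfc = {}
    for guide, raw_unsel in unselected.items():
        # Normalize each library to counts per million to correct for depth
        cpm_sel = selected.get(guide, 0) / total_sel * 1e6
        cpm_unsel = raw_unsel / total_unsel * 1e6
        lfc[guide] = math.log2((cpm_sel + pseudocount) / (cpm_unsel + pseudocount))
    return lfc

# Toy counts for illustration only (not data from the screen)
toy_selected = {"geneA_sg1": 900, "intergenic_sg1": 100}
toy_unselected = {"geneA_sg1": 500, "intergenic_sg1": 500}
toy_lfc = log2_fold_changes(toy_selected, toy_unselected)
```

In this toy case the gene-targeting guide is enriched (positive value) and the intergenic control drops out (negative value), which mirrors the logic used to verify selection efficiency.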
To confirm that m6A RNA methylation regulates endogenous IAP copies, we generated individual ES cell lines that contained gene knockouts of Mettl3, Mettl14, Wtap and Zc3h13 (Fig. 2a, Extended Data Fig. 3a–d). To avoid differentiation and lethality effects previously reported when m6A-depleted ES cells are cultured in metastable conditions (serum plus leukaemia inhibitory factor (LIF)), we derived and cultured ES cells in a medium that stabilizes a ground state of pluripotency (‘2i + LIF’ medium)18. In these conditions, the identity of ES cells was globally preserved in the mutant lines (Extended Data Fig. 3e, f), with a mild reduction in proliferation (Extended Data Fig. 3g). We first confirmed reduced m6A RNA methylation levels in the four knockout lines by enzyme-linked immunosorbent assay (ELISA) (Extended Data Fig. 4a). We used quantitative PCR with reverse transcription (qRT–PCR) to confirm that knockout of each of the four m6A factors increased the abundance of endogenous IAPEz mRNA by two- to threefold compared to wild-type cells (Fig. 2b). RNA sequencing (RNA-seq) analysis confirmed significant upregulation of IAPEz in m6A mutants, as well as close relatives within the ERVK family—such as MMERVK10C, MMERVK10D3, ETnERV2 and Y chromosome-specific IAPEy elements—that share more than 65% sequence identity with IAPEz (https://www.dfam.org/) (Fig. 2c, Extended Data Fig. 4b–f, Supplementary Table 6). Specific Y-linked elements of the ERV1 family were also more abundant in the knockout lines (MuRVY). Upregulation of ERVK was observed at the subfamily level after random assignment of multi-mapped reads, and confirmed at the level of individual ERVK copies, when considering uniquely assigned reads only (Fig. 2d, Extended Data Fig. 5a). By contrast, MERVL remained globally unaffected and transcripts of evolutionarily young long interspersed nuclear elements (LINEs or L1s) showed opposite downregulation in m6A-knockout ES cells (Fig. 2c, d, Extended Data Figs. 4e, f, 5a). These different responses to the loss of m6A mRNA methylation highlight the divergent effects that this pathway may exert depending on the retrotransposon type, with a negative effect on IAP-related ERVK elements, specifically. ERVK-specific upregulation was confirmed in published nuclear RNA-seq datasets from independent Mettl3-knockout ES cells19 (Extended Data Fig. 5b).
We next evaluated the potential effect of increased IAP transcripts on gene regulation. As previously shown, hundreds of gene transcripts were upregulated in m6A mutant ES cells20, among which 941 were common between the 4 knockout lines (Extended Data Fig. 6a). However, these upregulated genes did not show correlation with proximity of ERVK annotations (−5 kb to +1 kb from the transcription start site (TSS)) (Extended Data Fig. 6b). Moreover, we did not score increased splicing between exonic sequences and IAP fragments in m6A-knockout compared to wild-type ES cells (Extended Data Fig. 6c). As a whole, we conclude that the increased abundance of IAP transcripts after loss of m6A has minimal cis-effects on gene expression. Notably, depletion of m6A did not result in downregulation of known retrotransposon repressors (Extended Data Fig. 6d) and did not alter ES cell identity, as demonstrated by expression levels of pluripotency and early differentiation genes (Extended Data Fig. 3e). This provides strong indication that m6A RNA methylation directly represses IAP elements.
To verify this, we mapped the abundance and topology of m6A methylation on IAPEz transcripts, by m6A immunoprecipitation (methylated RNA immunoprecipitation sequencing (MeRIP–seq)) of total RNA from wild-type and Mettl3-knockout ES cells. We scored 15,216 and 4,864 m6A peaks in wild-type and Mettl3-knockout ES cells, respectively, with substantially higher m6A signal intensities in wild-type cells (Extended Data Fig. 7a). Notably, in addition to the well-characterized enrichment of m6A methylation at the 3′ UTR and exons of genic mRNAs20,21,22,23 (Fig. 3a, Extended Data Fig. 7b, c), we found that a considerable number of METTL3-dependent m6A events mapped to retrotransposon annotations, comprising 13% of all peaks, including L1s—as recently reported19,24—and ERVK elements (Fig. 3a). When we plotted m6A distribution along the IAPEz consensus sequence, we found two distinct regions of METTL3-dependent m6A enrichment, predominantly at the 5′UTR—present in the IAPEz reporter—and to a lesser extent on the Pol sequence (Fig. 3b, c, Extended Data Fig. 7d). Enrichment in m6A also coincided with the 5′ UTR region of MMERVK10C (Extended Data Fig. 7e, f) and was rather spread across L1MdA_I (Extended Data Fig. 7g, h). The m6A RNA methylation mostly occurs on conserved RRACH sequence motifs (in which R denotes A or G, and H denotes A, C or U)21,23. Accordingly, we found several RRACH motifs on the 5′ UTR of IAPEz and MMERVK10C consensus sequences (Fig. 3b, Extended Data Fig. 7e, i). By focusing on uniquely mapped copies, we found that m6A-dependent repression was proportional to the RRACH content: the more RRACH motifs an IAPEz copy contains in the 5′ UTR, the more upregulated it is in m6A-knockout cells (Fig. 3d, Extended Data Fig. 7j). These data demonstrate for the first time, to our knowledge, that IAPs and their ERVK relatives undergo m6A RNA methylation, and reveal a new pathway of retrotransposon suppression.
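Counting RRACH motifs in a 5′ UTR, as used for the comparison with upregulation in m6A-knockout cells, can be sketched as follows. This is a minimal illustration and not the authors' pipeline; the function names and the toy sequence are hypothetical, and T is accepted in place of U so that DNA-style consensus sequences can be scanned.

```python
import re

# RRACH: R = A or G; H = A, C or U (T is also accepted for DNA-style input).
# The lookahead makes the scan report motifs that overlap each other.
RRACH = re.compile(r"(?=[AG][AG]AC[ACUT])")

def rrach_positions(sequence):
    """Return 0-based start positions of all RRACH motifs in the sequence."""
    return [match.start() for match in RRACH.finditer(sequence.upper())]

def count_rrach(sequence):
    """Count RRACH motifs, including overlapping occurrences."""
    return len(rrach_positions(sequence))

# Toy sequence for illustration only (not an actual IAPEz 5' UTR)
toy_utr = "CCGGACUAAGACAUUGAACC"
toy_count = count_rrach(toy_utr)  # motifs GGACU, AGACA and GAACC
```

Overlap handling matters because two motifs can share one base: in `GGACAGACA`, the final A of `GGACA` is also the first base of `AGACA`.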
Functions of the m6A RNA methylation complex have so far been investigated by conventional gene perturbation techniques—such as knockout or short hairpin RNA (shRNA)-mediated knockdown—which precludes examining the early consequences of m6A loss and can also lead to secondary effects after prolonged selection. To address the early and direct IAP responses to m6A depletion, we used an auxin-inducible degron (AID)25 to control the depletion of METTL3 and METTL14, individually and in combination (Fig. 4a, Extended Data Fig. 8a, b). The addition of auxin (indole-3-acetic acid (IAA)) resulted in efficient and near-total degradation within 1 h, which persisted over prolonged treatment and was reversible after wash-off (Fig. 4b, Extended Data Fig. 8c). Depletion of METTLs was rapidly followed by a substantial and sustained decrease of m6A RNA methylation levels (Extended Data Fig. 8d). Notably, we observed a progressive, time-dependent accumulation of IAP transcripts after the removal of m6A (Fig. 4c, Extended Data Fig. 8e), whereas levels of cellular proliferation were unaffected (Extended Data Fig. 8f). A similar trend was observed after degron-mediated depletion of ZC3H13 (Extended Data Fig. 8g–j). After 96 h of m6A depletion, levels of IAP mRNA were increased by 7-fold, 5-fold and 15-fold in single METTL3, single METTL14 or METTL3;METTL14 double degron, respectively (Fig. 4c), and this translated into accumulation of IAP-encoded Gag proteins in cytoplasmic speckles (Fig. 4d). Notably, the abundance of IAP mRNA was higher after degron depletion than in established knockouts for individual Mettl genes (only twofold increase in Mettl3 or Mettl14 knockouts) (Fig. 2b), which suggests the implementation of adaptive mechanisms after prolonged m6A loss. Moreover, the relative upregulation after simultaneous METTL3 and METTL14 degron-mediated depletion compared to single degrons of METTL3 and METTL14 highlights their functional synergy in reducing IAP mRNA levels. 
Finally, re-stabilizing the m6A methyltransferase complex after auxin removal resulted in a rapid decline in IAP mRNAs (Fig. 4c, Extended Data Fig. 8e, j). Together, these results suggest that m6A RNA methylation dynamically restrains the cellular availability of IAP mRNAs.
Increased abundance of IAP mRNA may result from transcriptional and/or post-transcriptional effects of the m6A loss. Notably, m6A RNA methylation has been shown to affect chromatin-based regulation of transcription19,26. We performed CUT&RUN (cleavage under targets and release using nuclease) experiments to profile permissive (H3K4me3 and H3K27ac) and repressive (H3K9me3) histone marks after 96 h of auxin treatment or dimethylsulfoxide (DMSO) control in METTL3;METTL14 double degron ES cells (AID-METTL3;METTL14). On a global scale, the rapid m6A removal did not alter the coverage of these histone marks on m6A target genes or on different retrotransposon families (Extended Data Fig. 9a–d). When focusing on full-length IAPEz annotations, we observed moderate H3K4me3 gain (Fig. 4e, f), which may contribute to the mRNA upregulation observed after AID-METTL3;METTL14 double degron depletion, although this was not accompanied by a decrease in H3K9me3 or an increase in H3K27ac (Fig. 4e, f). We therefore concluded that acute withdrawal of m6A does not strongly modify chromatin-based regulation of ERVKs in the short term.
Owing to its major effect on genic mRNA lifetime20, we sought to determine whether m6A could destabilize IAP-derived mRNAs. We used METTL3;METTL14 double degron ES cells to monitor IAP RNA levels after the inhibition of transcription with actinomycin D, and then calculated the half-life rate. IAP mRNA levels and half-life rates were substantially higher after degradation of the m6A methyltransferase complex than in cells treated with DMSO control (Fig. 4g). Importantly, the rapid reintroduction of m6A methylation (by auxin wash-off) restored IAP mRNA degradation (Fig. 4g). The fate of m6A-modified mRNAs is determined by a set of readers, among which the YTH-domain containing proteins YTHDF1, YTHDF2 and YTHDF3 have key roles by directing them to cytosolic compartments where they undergo destabilization and decay5. Using RNA immunoprecipitation (RIP), we found that YTHDF2 binds to IAP and MMERVK10C mRNAs (Extended Data Fig. 10a–c), and notably, this occurred in an m6A-dependent manner (Fig. 4h). Finally, we reasoned that, similarly to depleting the m6A writing complex, depleting YTHDF m6A readers should increase IAP mRNA abundance. To account for potential redundancy27,28, we derived two mutant ES cell lines that exhibited loss of all three YTHDF proteins (Fig. 4i, left, Extended Data Fig. 10d). Depletion of YTHDFs resulted in increased IAP mRNA levels (Fig. 4i, right), which supports the idea that the m6A methylation pathway regulates IAP mRNAs in a post-transcriptional and YTHDF-dependent manner, although other mechanisms may also apply.
Our results provide evidence that m6A RNA methylation directly affects ERV mRNA abundance by accelerating their clearance. This mechanism appears specific to IAP-related ERVK elements, whereas other retrotransposon types may undergo alternative directions and modes of m6A-dependent regulation. Considering the abundance and deleterious effect of ERVKs, this pathway may be particularly relevant in situations of relaxation of chromatin-based control, such as in early mammalian embryos—from which ES cells are derived—or during ageing. The link with YTHDF readers indicates that ERV mRNA decay may occur through phase-partitioning into cytoplasmic processing bodies (P-bodies)29, consistent with previous observations that IAP mRNAs localize to these structures30. In addition, this pathway may cooperatively prevent m6A-modified ERV mRNAs from being reverse transcribed, translated or assembled into protective retrotransposition complexes, providing several layers of control. Notably, m6A mostly occurs on the 5′ UTR of ERVKs, a region that contains the tRNA primer-binding site that is essential for reverse transcription. Finally, m6A-dependent regulation could also dampen the immunogenic potential of ERV-derived RNA species and their ability to trigger inflammatory responses, as seen in human neurodegenerative diseases3.
Methods
Data reporting and statistical analysis
No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment. All statistical tests, resulting P values and observation numbers are indicated in the figure panels or legends.
Data visualization
Unless stated otherwise, heat maps, violin plots, box plots, Venn diagrams, density plots, dot plots and bar plots that visualize deep-sequencing analyses were generated using ggplot2. Box plots always show the median as the centre line, box limits correspond to upper and lower quartiles, and whiskers cover 1.5× the interquartile range. Volcano plots visualizing retrotransposon expression changes from RNA-seq and bar graphs visualizing ELISA and RT–qPCR experiments were generated using Prism 8.3.0. Dot plots and histograms visualizing FACS analyses were generated using NovoExpress software (v.1.2.1).
Cell culture
Mouse embryonic day (E) 14 ES cells were grown in two different media: serum + LIF consisted of Glasgow medium (Sigma), 15% FBS (Gibco), 2 mM l-glutamine, 0.1 mM MEM non-essential amino acids (Gibco), 1 mM sodium pyruvate (Gibco), 0.1 mM β-mercaptoethanol, 1,000 U ml−1 leukaemia inhibitory factor (LIF, Miltenyi Biotec); 2i + LIF consisted of 50% Neurobasal medium (Gibco), 50% DMEM/F12 (Gibco), 2 mM l-glutamine (Gibco), 0.1 mM β-mercaptoethanol, Ndiff Neuro-2 medium supplement (Millipore), B-27 medium supplement (Gibco), 1,000 U ml−1 LIF, 3 μM GSK3 inhibitor (CT-99021), 1 μM MEK inhibitor (PD0325901). Cells were cultured in 0.2% gelatin-coated flasks at 37 °C with 5% CO2. Except for the CRISPR–Cas9 loss-of-function screens that were performed in serum + LIF medium, all experiments were performed in 2i + LIF medium. Mycoplasma-free status of the cell cultures was verified.
Plasmid construction
IAPEz reporter
Plasmids used to target ROSA26 (pEN111) and TIGRE (also known as Igs7) (Addgene 92141) loci, and the ROSA26- and TIGRE-specific sgRNA-encoding plasmids (Addgene 86234 and 92144, respectively) were provided by E. Nora (UCSF). The IAPEz-5′LTR-5′UTR-gag(3-60nt) consensus sequence was obtained from RepeatMasker and/or from repbase (http://www.repeatmasker.org/), synthesized and cloned into pUC57 by GenScript (Supplementary Table 1). To make the IAPEz reporter (pTCH1), the IAPEz-5′LTR-5′UTR-gag(3-60nt) sequence and the GFP-T2A-blasticidin resistance cassette (hereafter denoted as BlastR) were combined using extension PCR and inserted into the pEN111 backbone using the ClaI site. To insert the Cas9 gene at the TIGRE locus, 3× Flag-NLS-Cas9 was PCR amplified from pX459 expression vector (Addgene 62988) and inserted into plasmid 92141 backbone using BamHI and XhoI sites (pTCH2).
Plasmids for N terminus tagging with AID domain for auxin-inducible degron
The plasmids to target OsTIR1 at the TIGRE locus (Addgene 92141) and the TIGRE-specific sgRNA-encoding plasmid (see above) were provided by E. Nora (UCSF). The plasmids to target AID inserts into the gene-of-interest N terminus were generated as follows: either puromycin resistance (PuroR)-P2A-3×Flag-AID, or BlastR-P2A-3×Flag-AID inserts were cloned into pUC19 backbone (pFD71 with puromycin resistance gene and pFD75 with blasticidin S-resistance gene). Next, homology arms ranging from 320 to 530 bp depending on the gene (flanking both sides, but excluding ATG start codon) for Mettl3, Mettl14 and Zc3h13 were PCR amplified from mouse genomic DNA and inserted into pFD71 or pFD75 surrounding and in frame with AID insert using EcoRI/NcoI sites for upstream homology arms and AgeI/HindII for downstream homology arms. Final expression vectors were used as follows: pTCH3 (BlastR-P2A-3×Flag-AID-METTL3_Nter), pTCH4 (BlastR-P2A-3×Flag-AID-METTL14_Nter) and pFD119 (BlastR-P2A-3×Flag-AID-ZC3H13_Nter) to generate individual endogenous degron ES cell lines for aforementioned genes. The pTCH4 and pTCH5 (PuroR-P2A-3×Flag-AID-Mettl3) were used sequentially to generate the AID-METTL3;-METTL14 double degron ES cell line. For the sgRNA cloning, the pX459 plasmid (Addgene 62988) was digested with BbsI immediately downstream of the U6 promoter and annealed DNA duplexes corresponding to the target sgRNA sequences were ligated. sgRNA sequences were chosen to overlap with the gene TSS, so that after introduction of the AID the sgRNA-specific sequences were disrupted. sgRNA sequences used for degron targeting are listed in Supplementary Table 2.
Cell transfection and clone isolation
All transgenic insertions and mutations were performed using Amaxa 4D nucleofector (Lonza). For each nucleofection, 3 × 10^6–5 × 10^6 cells were electroporated with 1–3 μg of nonlinearized targeting vector and/or sgRNA/Cas9-encoding plasmids and plated at a low density. Two days later, cells were selected with puromycin (1 μg ml−1, Life Technologies) or blasticidin S (5 μg ml−1) for 2 and 5 days, respectively, and individual clones were picked and screened by PCR. Flippase-mediated removal of puromycin-resistance cassettes was performed for the IAPEz reporter cell line (from both ROSA26 and TIGRE loci) and for the AID-METTL3;METTL14 double degron line (from the TIGRE locus). For the IAPEz reporter cell line, functionality of the reporter cassette was confirmed by doxycycline (dox) induction followed by FACS and fluorescence microscopy analyses, while Cas9 expression and activity were confirmed by Kap1-specific sgRNA introduction (see below) and western blot analysis. To generate Mettl3, Mettl14, Wtap, Zc3h13, Ythdf2 and combined Ythdf mutant ES cells, two sgRNAs for each gene were designed using the online CRISPOR Design Tool31 to introduce indels and/or deletions. For sgRNA cloning, the pX459 plasmid (Addgene 62988) was digested with BbsI immediately downstream of the U6 promoter and annealed DNA duplexes corresponding to the target sgRNA sequences were ligated. The Mettl3-knockout cells were created by deleting part of exon 4; Mettl14-knockout by deleting part of exon 1; Zc3h13-knockout by deleting part of exon 9; and sgRNA targeting of Wtap gene resulted in single nucleotide insertion and premature stop codon in exon 4. Ythdf2-knockout cells were generated by deleting part of exon 4. The mutYthdf1-2-3 and mutYthdf2-3 ES cells were obtained by simultaneous introduction of six sgRNAs targeting the Ythdf1, Ythdf2 and Ythdf3 genes. 
Protein loss was confirmed by western blot for all mutant cell lines; in addition, m6A ELISA assays were carried out for Mettl3-, Mettl14-, Wtap- and Zc3h13-knockout lines. For degron lines, proper insertion and AID-fusion protein activities were confirmed by genotyping, western blot analysis and m6A ELISA. For sgRNA sequences used for generation of knockout, mutant and degron lines, see Supplementary Table 2.
Cell treatments
IAPEz reporter expression was induced after the administration of doxycycline (1 μg ml−1). Auxin-mediated depletion of target proteins was achieved by supplementing culture medium with auxin (Sigma) at the recommended concentration of 500 μM. Auxin-containing medium was renewed every 24 h. For auxin wash-off, auxin-containing medium was removed, cells were rinsed twice with PBS, and exposed to auxin-free medium. Blockade of transcription with actinomycin D was achieved by supplementing culture medium with 5 μg ml−1 actinomycin D (Sigma).
Lentivirus production and lentiviral-based Kap1-specific sgRNA knockout
Two previously described sgRNAs specific to the Kap1 gene32 (Supplementary Table 2) were incorporated into plentiGuide-puro vector (Addgene 52963). For production of lentiviral particles, HEK293FT cells were co-transfected with 3.33 μg of either of the Kap1 lentiGuide-puro constructs, 2.5 μg psPAX2 packaging plasmid and 1 μg pMD2.G envelope plasmid using Lipofectamine 2000 (Invitrogen). Lentiviral supernatant was collected, filtered with a 0.45-μm filter, concentrated using an Amicon Ultra centrifugal filter (Millipore, 100-kDa cut-off) and added to pre-plated IAPEz reporter cells supplemented with 8 μg ml−1 polybrene (Millipore). Twelve hours after infection, the medium was replaced and supplemented with puromycin (1 μg ml−1). After 48 h of puromycin selection, the medium was replaced and supplemented with blasticidin S (5 μg ml−1) for an additional 72 h.
Protein extraction and western blotting
Cells were trypsinized, washed once in medium and once in PBS and pelleted for 5 min at 1,000 rpm. Cell pellets were resuspended in RIPA buffer (1× PBS, 0.5% sodium deoxycholate, 0.1% SDS, 1% Igepal CA-630) containing protease inhibitors (Roche) and incubated on ice for 20 min. Lysates were then centrifuged for 20 min at 16,400 rpm, at 4 °C and supernatants were kept. Protein concentration was determined using Bradford assay and protein extracts were boiled for 10 min in LDS buffer (Life Technologies) containing 200 mM DTT. Equal amounts of protein were loaded on 4–12% Bis-Tris gel (NuPAGE), or 3–8% Tris-Acetate gel (NuPAGE) for ZC3H13 detection. Transfer was performed on a 0.45-μm nitrocellulose membrane (GE Healthcare) using wet-transfer system, blocked with 5% milk in PBS (+Igepal CA-630 to a final concentration of 0.3%) for 1 h at room temperature. Membranes were incubated with primary antibodies (Supplementary Table 3) at 4 °C overnight in 1% milk in PBS (supplemented with 0.3% Igepal CA-630), washed five times with PBS supplemented with 0.3% Igepal CA-630 and incubated with HRP-conjugated secondary antibodies for 1 h at room temperature and washed again five times. Signal was detected using LumiLight Plus Kit (Roche) on the Chemidoc MP imaging system (BioRad). For uncropped images of western blot membranes, see Supplementary Fig. 1.
FACS analysis
Cells were collected, washed with PBS to remove residual medium and analysed for GFP expression using a NovoCyte 2000R (ACEA Biosciences) flow cytometer and NovoExpress software (v1.2.1). The percentage of GFP-positive cells was determined upon definition of three gates: (i) FSC-H versus SSC-H to isolate cells from debris, (ii) SSC-H versus SSC-A to isolate single cells and (iii) SSC-H versus FITC-H for detection of the GFP-positive population. For pseudocolour plots and gating strategy, see Supplementary Fig. 2.
Genome-wide screen in IAPEz reporter mouse ES cells (screens I and II)
Approximately 300 × 10^6 IAPEz reporter ES cells expressing Cas9 were lentivirally infected with a genome-wide Mouse Two Plasmid Activity-Optimized CRISPR Knockout Library (Addgene 1000000096) as described above, containing 188,509 sgRNAs targeting 18,986 genes and 199 intergenic sgRNAs at a multiplicity of infection of 0.2–0.3 (measured by puromycin-resistance gene co-delivered with the lentiviral vector) and selected for lentiviral integration using puromycin (1 μg ml−1) for 3 days. In screen I, the culture was expanded for another 4–8 days. On days 4, 6 and 8 of expansion, 200 × 10^6 cells were split into blasticidin S-selecting conditions (for 9, 7 and 5 days, respectively) and non-selection conditions (9 days). Cells in non-selection conditions were maintained at a minimum level of 100 × 10^6 cells and in logarithmic growth. After 9 days, 3 × 10^6–5 × 10^6 cells from selection conditions and 100 × 10^6 cells from non-selection conditions were washed three times with PBS and pelleted by centrifugation for genomic DNA extraction using GenElute Mammalian Genomic DNA Miniprep kit (Sigma) and Quick-DNA Midiprep Plus kit (Zymo Research), respectively, following the manufacturers' guidelines. The sgRNA-encoding insertions were PCR-amplified using Agilent Herculase II Fusion DNA Polymerase (600675). These libraries were then sequenced using Illumina HiSeq 2500 (approximately 5 million–10 million reads with sgRNA sequence per condition; around 40× coverage per library element in non-selection conditions, screen I). As screen I demonstrated that longer blasticidin S treatment resulted in better intergenic sgRNA depletion, we performed screen II in two biological replicates with 9-day-long blasticidin S selection after either 8-day-long or 17-day-long cell culture (early and late selection, respectively). 
After genomic DNA extraction and library amplification, libraries were sequenced using Illumina HiSeq 2500 (SE65) (approximately 30 million–35 million reads per condition; around 170× coverage per library element in early and late non-selection conditions, screen II). See Supplementary Table 1 for the primer sequences used to amplify the libraries.
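The coverage figures quoted for the two screens can be checked with simple arithmetic. The sketch below is an illustration only. It assumes that the 199 intergenic sgRNAs are counted in addition to the 188,509 gene-targeting sgRNAs, and it uses a Poisson approximation for the fraction of transduced cells; neither assumption is stated in the text, and the function names are hypothetical.

```python
import math

N_TARGETING = 188_509  # gene-targeting sgRNAs (from the library description)
N_INTERGENIC = 199     # intergenic control sgRNAs (from the library description)
# Assumption: intergenic guides are additional to the gene-targeting guides
LIBRARY_SIZE = N_TARGETING + N_INTERGENIC

def reads_per_element(total_reads, library_size=LIBRARY_SIZE):
    """Average sequencing coverage per library element."""
    return total_reads / library_size

def transduced_fraction(moi):
    """Poisson estimate of the fraction of cells with at least one integration."""
    return 1.0 - math.exp(-moi)

def cells_per_element(n_cells, moi, library_size=LIBRARY_SIZE):
    """Approximate number of transduced cells representing each sgRNA."""
    return n_cells * transduced_fraction(moi) / library_size

# Midpoints of the read ranges reported for each screen
screen1_cov = reads_per_element(7.5e6)   # roughly 40x
screen2_cov = reads_per_element(32.5e6)  # roughly 170x
# Representation at infection for 300 million cells at MOI 0.2 and 0.3
rep_low = cells_per_element(300e6, 0.2)
rep_high = cells_per_element(300e6, 0.3)
```

Under these assumptions, each sgRNA is represented by a few hundred transduced cells at the time of infection.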
Immunofluorescence
Cells were plated on fibronectin-coated (Sigma) glass coverslips. For IAPEz-GFP reporter reactivation control, doxycycline (1 μg ml−1) was added for 24 h. The next day, cells were fixed with 3% paraformaldehyde for 10 min at room temperature, rinsed three times with PBS, incubated 3 min in 0.3 μg ml−1 DAPI and rinsed again with PBS. For detection of IAP-GAG, after fixation, cells were permeabilized for 4 min with PBS/0.5× Triton X-100 on ice, blocked with 1% BSA/PBS for 15 min, incubated for 40 min with rabbit anti-mouse IAP-GAG antibody (gift from B. Cullen), 40 min with secondary antibodies and 3 min in 0.3 μg ml−1 DAPI at room temperature. Slides were mounted with VECTASHIELD medium (Vector Laboratories). Images were obtained with an Upright Spinning disk Confocal Microscope (Roper/Zeiss) and processed with Image J.
RT–qPCR analysis
Total RNA was extracted using Trizol (Life Technologies). Genomic DNA was removed by DNase I treatment (Qiagen), and RNA was precipitated and resuspended in DNase/RNase-free water. Next, 10 μg of RNA was used for a second round of purification using RNeasy Mini columns (Qiagen) and 500 ng RNA was reverse-transcribed using random priming with Superscript III (Life Technologies). Quantitative PCR was performed using the SYBR Green Master Mix on the Viia7 thermal cycling system (Applied Biosystems). Relative expression levels were normalized to Gapdh or Actb (indicated in respective figures) using the ΔΔCt method. For primer sequences, see Supplementary Table 1.
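For readers unfamiliar with the ΔΔCt method, a minimal sketch of the calculation is given here. It assumes 100% amplification efficiency (a doubling per cycle), which is the standard simplification and is not stated in the text. The function name and the Ct values are hypothetical.

```python
def ddct_fold_change(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the ddCt method.

    Each Ct is first normalized to a reference gene (for example Gapdh or Actb),
    then the sample is compared with the control condition.
    Assumes perfect doubling per PCR cycle.
    """
    dct_sample = ct_target_sample - ct_ref_sample
    dct_control = ct_target_control - ct_ref_control
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)

# Hypothetical Ct values for illustration only
example_fold = ddct_fold_change(
    ct_target_sample=22.0, ct_ref_sample=18.0,    # treated or mutant cells
    ct_target_control=23.5, ct_ref_control=18.0,  # control cells
)
```

A target that reaches threshold 1.5 cycles earlier in the sample than in the control, at equal reference Ct, corresponds to a fold change of 2^1.5.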
RNA stability assay
For the RNA stability assay, 0.5 × 10^6 AID-METTL3;METTL14 dd ES cells treated with either 500 μM auxin or DMSO for 96 h or after 48-h auxin wash-off were re-plated on fibronectin-coated 6-cm plates 24 h before addition of actinomycin D. Next, medium was renewed and supplemented with 5 μg ml−1 actinomycin D (Sigma) to inhibit transcription. Total RNA was extracted at indicated time points and used for RT–qPCR. The half-life of IAPΔ1 was calculated according to the following equation: $\ln(C_i/C_0) = -k t_i$, in which $k$ is the degradation rate, $C_i$ is the mRNA value at time $i$, $C_0$ is the mRNA value at time 0, and $t_i$ is the time interval in hours33. First, we calculated the degradation rate $k_i$ from each time point. The half-life is $t_{1/2} = \ln(2)/k_a$, in which $k_a$ is the average degradation rate measured across the different time points.
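The half-life calculation described above can be sketched in a few lines. This is an illustration of the stated formula only, not the authors' script; the function name and the example values are hypothetical.

```python
import math

def mrna_half_life(times_h, levels):
    """Estimate mRNA half-life (hours) from an actinomycin D time course.

    times_h: time points in hours, starting at 0.
    levels:  relative mRNA levels at those times; levels[0] is C0.
    A degradation rate k_i = -ln(C_i / C0) / t_i is computed for each
    later time point, the rates are averaged, and t1/2 = ln(2) / k_avg.
    """
    if times_h[0] != 0 or len(times_h) != len(levels) or len(times_h) < 2:
        raise ValueError("need matching series that start at time 0")
    c0 = levels[0]
    rates = [-math.log(c / c0) / t for t, c in zip(times_h[1:], levels[1:])]
    k_avg = sum(rates) / len(rates)
    return math.log(2) / k_avg

# Idealized decay for illustration only: the level halves every 4 h
example_half_life = mrna_half_life([0, 4, 8], [1.0, 0.5, 0.25])
```

With noisy measurements the per-time-point rates differ, and averaging them as described gives a single summary rate for each condition.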
m6A ELISA
m6A ELISA was performed using an m6A RNA methylation colorimetric assay kit (Abcam, ab185912) according to the manufacturer’s protocol, and using 200 ng of RNA. After incubation in the last developer solution, the reaction was stopped by adding 100 μl of STOP solution at the moment positive control wells turned medium blue. It is crucial to continuously monitor the progressive colour change for the positive control; prolonged incubation will result in signal saturation in the experimental wells and potential masking of differences between tested conditions. Absorbance was measured at 450 nm within 5 min using an absorbance microplate reader.
RNA immunoprecipitation
For each replicate, 1.5 × 10^7 cells were collected, washed with ice-cold PBS, resuspended in 1 ml of RIP lysis buffer (50 mM Tris-HCl pH 7.4, 150 mM NaCl, 1 mM EDTA, 1 mM DTT, 0.5% Igepal CA-630, containing 40 U ml−1 of RNasin and protease inhibitors (Roche)). Lysates were placed on ice for 20 min and centrifuged at 4 °C to remove cell debris. Supernatants containing RNA–protein complexes were collected and 50 μl was kept for input control. Remaining lysates were precleared with M-280 sheep anti-rabbit IgG magnetic beads (Thermo Fisher Scientific) and incubated with either 5 μg of YTHDF2 antibody or rabbit IgG, rotating overnight at 4 °C. Next, 25 μl of beads was added and samples were rotated for an additional 2 h at 4 °C. Beads were washed with RIP lysis buffer five times for 10 min and RNA was isolated from the beads as well as from input samples using Trizol. Genomic DNA was removed by DNase I treatment (Qiagen), and RNA was precipitated and resuspended in DNase/RNase-free water. Next, purified RNA underwent a second round of purification using RNeasy Mini columns (Qiagen). Immunoprecipitated and input RNAs were reverse transcribed using random priming with Superscript III (Life Technologies). Reverse transcription and real-time quantitative PCR were used to measure abundance.
Poly-A RNA sequencing
Total RNA was extracted using Trizol (Life Technologies). Genomic DNA was removed by in-solution DNase I treatment (Qiagen), and RNA was precipitated and resuspended in DNase/RNase-free water. Next, 10 μg of RNA was used for a second round of purification using RNeasy Mini columns (Qiagen). RNA integrity was evaluated on TapeStation 4200 (Agilent) using RNA ScreenTape (5067-5576), requiring a minimal integrity number (RIN) of 9. Libraries were prepared according to Illumina’s instructions accompanying the TruSeq Stranded mRNA Library Prep Kit. Approximately 800 ng of RNA per replicate was used for library preparation. After library preparation, the length profiles were assessed with the LabChip GX Touch HT system (Perkin Elmer) and an equimolar pool from all samples was prepared. Molarity of the pool was quantified by qPCR using the KAPA Library Quantification Kit and the CFX96 qPCR system (Biorad) before sequencing. Samples were sequenced using Novaseq 6000 (PE100, approximately 90 million clusters per replicate).
MeRIP-seq
MeRIP-seq was carried out using the Magna MeRIP m6A Kit (Millipore) according to the manufacturer’s instructions. In short, total RNA was extracted using Trizol (Life Technologies). DNase I-treated RNA samples were chemically fragmented into 100-nucleotide-long fragments and 350 μg of total RNA was subjected to each immunoprecipitation with affinity-purified anti-m6A antibody in the presence of RNase inhibitor. Bound m6A-methylated RNA fragments were eluted with free N6-methyladenosine, purified using the RNeasy Kit (Qiagen) and processed for library generation using the SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (TaKaRa) following the manufacturer’s recommendations, but without the fragmentation step (9 ng of RNA per replicate). Sequencing was performed using Illumina Novaseq 6000 (PE100, approximately 50 million to 90 million clusters per replicate). The m6A IP for wild-type and Mettl3-knockout cells was performed independently three times. Input for each cell line was sequenced as a control.
CUT&RUN
For chromatin profiling, we performed CUT&RUN34 on AID-METTL3;METTL14 double degron cells treated with either DMSO (control) or auxin for 96 h in two biological replicates. In brief, 2 × 10^5 cells were washed with PBS, three times with wash buffer (20 mM HEPES-KOH pH 7.9, 150 mM NaCl, 0.5 mM spermidine and protease inhibitors (Roche)) at room temperature, then resuspended in 1 ml of wash buffer. Next, 10 μl of concanavalin A-coated magnetic beads (Bangs Laboratories, BP531), pre-washed and resuspended in binding buffer (20 mM HEPES-KOH, pH 7.9, 10 mM KCl, 1 mM CaCl2, 1 mM MnCl2), were added to the cells. After 10 min incubation at room temperature under rotation, bead-bound cells were isolated on a magnetic stand and resuspended in 400 μl of antibody buffer (wash buffer supplemented with 0.02% digitonin (Millipore, 300410) and 2 mM EDTA) containing 2 μg of anti-H3K4me3, -H3K27ac or -H3K9me3 antibody or 2 μg of rabbit IgG (Supplementary Table 3). After 15 min incubation at room temperature under rotation, cells were washed three times in 1 ml of digitonin (dig)-wash buffer (wash buffer supplemented with 0.02% digitonin), incubated with pA-MNase (400 μl of dig-wash buffer containing 700 ng ml−1 pA-MNase, produced by the Protein Core Facility of Institut Curie) and washed again three times in 1 ml of dig-wash buffer. Cells were then resuspended in 150 μl of dig-wash buffer, transferred to 1.5 ml tubes, and equilibrated to 0 °C in a metal block for 10 min on ice. To initiate pA-MNase-dependent digestion, CaCl2 was added to a final 2 mM concentration, incubation was carried out at 0 °C for 30 min, and stopped by adding 150 μl of 2 × STOP solution (200 mM NaCl, 20 mM EDTA, 5 mM EGTA, 0.1% Igepal CA-630, 40 μg ml−1 glycogen). RNase A was added to a final concentration of 50 μg ml−1 and samples were incubated at 37 °C for 20 min.
Samples were placed on a magnetic stand, supernatant was transferred to low-binding tubes and SDS and proteinase K were then added to final concentrations of 0.1% and 300 μg ml−1, respectively, and samples were incubated at 70 °C for 30 min. Total DNA was extracted by phenol/chloroform followed by two rounds of ethanol precipitation, eluted in 40 μl of 1 mM Tris-HCl pH 8.0 and 0.1 mM EDTA, and quantified and analysed using Qubit and TapeStation assays. CUT&RUN libraries were prepared using the Accel-NGS 2S Plus DNA Library Kit (Swift Biosciences) according to the manufacturer’s protocol. Samples were sequenced using NovaSeq 6000 (PE50, approximately 60 million clusters per replicate).
Genome-wide CRISPR–Cas9 screen analysis
The sequenced reads were mapped to the sgRNA library. Only reads that contained one sgRNA sequence without mismatch were counted. The MAGeCK11 test command line (version 0.5.8) was used to rank sgRNAs and genes with the following parameters: --norm-method total --adjust-method fdr --remove-zero-threshold 10 --gene-lfc-method alphamean --remove-zero both. For screen II, the sequencing primer was oriented in the opposite direction to the sgRNA; therefore, the CRISPR mouse pooled library was reverse-complemented before counting.
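The counting rule described above (exact match to a single sgRNA, with the library reverse-complemented for screen II) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the toy library and the reads are hypothetical, and the actual counting pipeline used for the screens is not specified beyond the text.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_sgrnas(reads, library, reverse_library=False):
    """Count reads that contain exactly one library sgRNA with no mismatch.

    library maps a guide identifier to its sequence. With reverse_library=True
    the guide sequences are reverse-complemented first, as one would do when
    the sequencing primer reads the opposite strand.
    """
    lookup = {}
    for guide_id, seq in library.items():
        seq = seq.upper()
        lookup[reverse_complement(seq) if reverse_library else seq] = guide_id
    lengths = {len(seq) for seq in lookup}
    counts = Counter()
    for read in reads:
        read = read.upper()
        # Collect every guide whose sequence occurs verbatim in the read.
        hits = {lookup[read[i:i + k]]
                for k in lengths
                for i in range(len(read) - k + 1)
                if read[i:i + k] in lookup}
        if len(hits) == 1:  # discard reads with zero or several guides
            counts[hits.pop()] += 1
    return counts


# Hypothetical 10-nt guides and reads, for illustration only.
toy_library = {"g1": "ACGTACGTAC", "g2": "TTTTGGGGCC"}
toy_reads = ["NNACGTACGTACNN", "GGCCCCAAAA", "ACGTACGTAA"]
print(count_sgrnas(toy_reads, toy_library))
print(count_sgrnas(toy_reads, toy_library, reverse_library=True))
```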
RNA-seq analysis
Adapters were trimmed using Atropos v.1.1.16 (ref. 35). Paired-end read alignment was performed onto the mouse reference genome (mm10) with STAR v.2.7.0a (ref. 36), reporting randomly one position and allowing 6% of mismatches (--outFilterMultimapNmax 5000 --outSAMmultNmax 1 --outFilterMismatchNmax 999 --outFilterMismatchNoverLmax 0.06). Repeat annotation was downloaded from RepeatMasker (http://www.repeatmasker.org/). To reconstruct full-length LTR copies, we used the same strategy as previously10 using the Perl tool ‘one code to find them all’37. Reconstructed transposon annotation and basic gene annotation from GENCODE v.18 were merged and used as input for quantification with FeatureCounts v.1.5.1 (ref. 38). Differential expression analysis was performed using edgeR’s normalization combined with voom transformation from the limma R package39,40. P values were computed using limma and adjusted with the Benjamini–Hochberg correction. Genes and transposon families were declared as differentially expressed if FDR < 5% and log2FC > 0.75. Upregulated genes in all four knockout lines were annotated with proximal retrotransposon elements (overlap with promoter regions defined as −5 kb to +1 kb from the TSS). Randomized gene sets were created 100 times and were annotated to proximal retrotransposon elements to compute a permutation test using the regioneR41 R package.
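The logic of the permutation test on randomized gene sets can be illustrated with a generic empirical P value. The sketch below is not the regioneR implementation; it assumes that proximity to a retrotransposon has already been reduced to a set of gene names, and all identifiers and toy data are hypothetical.

```python
import random


def permutation_p_value(observed_genes, genes_near_te, universe,
                        n_perm=100, seed=0):
    """Empirical one-sided P value for enrichment of genes near retrotransposons.

    Draws n_perm random gene sets of the same size as observed_genes from
    universe and counts how often a random set contains at least as many
    retrotransposon-proximal genes as the observed set.
    """
    rng = random.Random(seed)
    near = set(genes_near_te)
    universe = list(universe)
    observed = sum(gene in near for gene in observed_genes)
    at_least = 0
    for _ in range(n_perm):
        sample = rng.sample(universe, len(observed_genes))
        if sum(gene in near for gene in sample) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical toy data: 100 genes, 10 of them near a retrotransposon.
all_genes = [f"g{i}" for i in range(100)]
near_te = all_genes[:10]
print(permutation_p_value(all_genes[:10], near_te, all_genes))
```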
Transposon element-based analysis was performed using only uniquely mapped reads by retrieving reads with NH tag equal to 1. To avoid confounding effects between expression from retrotransposon copies and from genes, elements overlapping genes in the same direction were removed from RepeatMasker annotations. FeatureCounts v.1.5.1 was used for the quantification and only copies with at least 10 reads in at least one sample were retained. RRACH motifs were searched for in the 5′ UTR sequences of individual IAP copies using RSAT dna-pattern. Division into three categories of RRACH motif number (7–8, 9–10 and 11–18) was performed to homogenize the number of IAP copies per category.
To estimate intron retention between genes and single IAP copies, read alignment was performed using specific parameters to report only uniquely mapped reads with STAR v.2.7.0a (ref. 36) (--outFilterMultimapNmax 1 --outSAMmultNmax 1). Unannotated splice junctions detected by STAR were annotated with GENCODE v.18 and IAP LTR elements from RepeatMasker annotation to retrieve splicing events between a gene and an IAP element. The number of uniquely mapped reads crossing the splicing events was calculated for each sample and normalized by the library size.
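As a rough illustration of the motif search and binning described above, RRACH (R = A or G; H = A, C or U) can be expressed as a regular expression. The study used RSAT for this step; the sketch below is a stand-in written with Python's standard library, counts overlapping occurrences on the given strand only, and uses hypothetical function names and sequences.

```python
import re

# R = A/G, H = A/C/U (T in DNA). The lookahead makes overlapping matches count.
RRACH = re.compile(r"(?=[AG][AG]AC[ACTU])")


def count_rrach(sequence):
    """Number of RRACH motifs (overlaps included) on the given strand."""
    return len(RRACH.findall(sequence.upper()))


def rrach_category(n_motifs):
    """Assign a motif count to one of the three bins used in the text."""
    if 7 <= n_motifs <= 8:
        return "7-8"
    if 9 <= n_motifs <= 10:
        return "9-10"
    if 11 <= n_motifs <= 18:
        return "11-18"
    return None  # outside the reported categories


# Hypothetical short sequences, for illustration only.
print(count_rrach("GGACTGGACA"))
print(count_rrach("AAACAGACT"))
print(rrach_category(9))
```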
MeRIP-seq analysis
Due to the addition of 3 nucleotides on the 5′-end of the second sequencing read (R2) from the Pico v.2 SMART Adaptor, paired-end reads were trimmed using Trim Galore v.0.4.4 with the options: --three_prime_clip_R1 3 --clip_R2 3 (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Reads were aligned onto the mouse ribosomal sequence (GenBank: BK000964.3) using Bowtie v.1.2 allowing at most three mismatches42. Reads that remained unmapped were aligned onto the mouse reference genome (mm10) using STAR v.2.6.0c, reporting randomly one position and allowing 4% of mismatches (--outFilterMultimapNmax 5000 --outSAMmultNmax 1 --outFilterMismatchNmax 999 --outFilterMismatchNoverLmax 0.04). PCR duplicates were removed using STAR with the option --bamRemoveDuplicatesType UniqueIdenticalNotMulti. Bigwig files were produced with deepTools v.2.5.3 (ref. 43) using the option --normalizeUsingRPKM. Peaks enriched in the MeRIP sample over the input control were defined using the MACS2 peak-caller44 with a genome size of 994,080,837 bp (ref. 20) and an FDR threshold of 5%. Reads were extended to 200-bp-long fragments. Only peaks called in at least two replicates were used for downstream analysis. Peak intensity was calculated using FeatureCounts v.1.5.1 (ref. 38) and normalized to background (reads not falling into peaks) and to peak length. GENCODE v.18 was used to define 5′UTR, 3′UTR, intronic and exonic regions. Retrotransposon annotations (RepeatMasker) were downloaded from the UCSC table browser. Genes overlapping with at least one peak were used to calculate coverage along the genic region (5′UTR, coding sequence and 3′UTR) with the trumpet R package45. Reads mapped onto the mouse reference genome and overlapping with IAP and L1MdA_I elements were extracted as single-end reads and mapped to the full-length IAP (GenBank: M17551.1) and L1MdA_I consensus sequences with Bowtie2 v.2.2.9 (ref. 46) with these parameters: --local -N 1.
Coverage along the consensus sequence was normalized to background (reads not falling into peaks) as was done previously for peak intensities. A rolling mean was calculated over a window of 50 bp to smooth the signal. The RRACH motif was searched for in the IAP consensus sequence using RSAT dna-pattern. Intragenic peaks in the wild-type condition were used to define m6A-bound genes. In addition, a threshold of 1 RPKM in the input wild-type samples was applied to ensure that the gene is expressed. Genes with RPKM higher than 1 and not overlapping with an m6A peak were defined as non-m6A-bound genes.
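The 50-bp smoothing step can be sketched as a simple rolling mean over per-base coverage. The text does not state whether the window was centred or how the sequence ends were handled, so the sketch below assumes a centred window that is truncated at the edges; the function name and input values are hypothetical.

```python
def rolling_mean(values, window=50):
    """Smooth per-base coverage with a centred rolling mean.

    Assumption: the window is centred on each position and shortened at the
    two ends of the sequence, so the output has the same length as the input.
    """
    n = len(values)
    # Prefix sums let each window average be computed in constant time.
    prefix = [0.0]
    for value in values:
        prefix.append(prefix[-1] + value)
    half = window // 2
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i - half + window)
        smoothed.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return smoothed


# Hypothetical coverage values with a small window, for illustration only.
print(rolling_mean([1, 2, 3, 4, 5], window=3))
```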
CUT&RUN analysis
Paired-end reads were trimmed using Trim Galore v.0.4.4. The alignment was performed onto a concatenated genome using the mouse reference genome (mm10) and the Escherichia coli genome (str. K-12 substr. MG1655, GenBank: NC_000913) with STAR v.2.7.0a (ref. 36), reporting randomly one position and allowing 6% of mismatches (--outFilterMultimapNmax 5000 --outSAMmultNmax 1 --outFilterMismatchNmax 999 --outFilterMismatchNoverLmax 0.06). PCR duplicates were removed using Picard v.2.6.0 (http://broadinstitute.github.io/picard/). Reconstructed transposon annotation was used as input for quantification with FeatureCounts v.1.5.1 (ref. 38) using reads mapped onto the mouse genome. Normalization of counts per million was performed using as library size the number of mapped reads onto the mouse genome.
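The counts-per-million normalization described above amounts to scaling each feature count by the number of reads mapped to the mouse genome, leaving out reads mapped to E. coli. A minimal sketch, with hypothetical names and numbers:

```python
def counts_per_million(feature_counts, mouse_mapped_reads):
    """Scale raw counts to counts per million mouse-mapped reads.

    feature_counts maps a feature (for example a transposon family) to its
    raw read count; mouse_mapped_reads is the library size, counting only
    reads aligned to the mouse part of the concatenated genome.
    """
    if mouse_mapped_reads <= 0:
        raise ValueError("library size must be positive")
    return {feature: count * 1e6 / mouse_mapped_reads
            for feature, count in feature_counts.items()}


# Hypothetical counts for one sample with 20 million mouse-mapped reads.
print(counts_per_million({"IAPEz": 500, "L1MdA_I": 40}, 20_000_000))
```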
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.
Data availability
The raw imaging data that support the findings of this study are available from the corresponding authors on request owing to size considerations. All sequencing data derived from CRISPR–Cas9 screens I and II, RNA-seq, MeRIP-seq and CUT&RUN have been deposited in the Gene Expression Omnibus (GEO) under accession number GSE145616. Previously published data were downloaded from GEO: Mettl3 control and Mettl3 knockout nuclear RNA-seq: (GSE133585)19. Source data are provided with this paper.
References
Johnson, W. E. Origins and evolutionary consequences of ancient endogenous retroviruses. Nat. Rev. Microbiol. 17, 355–370 (2019).
Barau, J. et al. The DNA methyltransferase DNMT3C protects male germ cells from transposon activity. Science 354, 909–912 (2016).
Tam, O. H., Ostrow, L. W. & Gale Hammell, M. Diseases of the nERVous system: retrotransposon activity in neurodegenerative disease. Mob. DNA 10, 32 (2019).
Shi, H., Wei, J. & He, C. Where, when, and how: context-dependent functions of RNA methylation writers, readers, and erasers. Mol. Cell 74, 640–650 (2019).
Patil, D. P., Pickering, B. F. & Jaffrey, S. R. Reading m6A in the transcriptome: m6A-binding proteins. Trends Cell Biol. 28, 113–127 (2018).
Goodier, J. L. & Kazazian, H. H., Jr. Retrotransposons revisited: the restraint and rehabilitation of parasites. Cell 135, 23–35 (2008).
Gagnier, L., Belancio, V. P. & Mager, D. L. Mouse germ line mutations due to retrotransposon insertions. Mob. DNA 10, 15 (2019).
Hancks, D. C. & Kazazian, H. H. Jr. Roles for retrotransposon insertions in human disease. Mob. DNA 7, 9 (2016).
Zamudio, N. & Bourc’his, D. Transposable elements in the mammalian germline: a comfortable niche or a deadly trap? Heredity 105, 92–104 (2010).
Rowe, H. M. et al. KAP1 controls endogenous retroviruses in embryonic stem cells. Nature 463, 237–240 (2010).
Li, W. et al. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol. 15, 554 (2014).
Fukuda, K., Okuda, A., Yusa, K. & Shinkai, Y. A CRISPR knockout screen identifies SETDB1-target retroelement silencing factors in embryonic stem cells. Genome Res. 28, 846–858 (2018).
Liu, X. et al. UHRF1 targets DNMT1 for DNA methylation through cooperative binding of hemi-methylated DNA and methylated H3K9. Nat. Commun. 4, 1563 (2013).
Sadic, D. et al. Atrx promotes heterochromatin formation at retrotransposons. EMBO Rep. 16, 836–850 (2015).
Maksakova, I. A. et al. H3K9me3-binding proteins are dispensable for SETDB1/H3K9me3-dependent retroviral silencing. Epigenetics Chromatin 4, 12 (2011).
Wen, J. et al. Zc3h13 regulates nuclear RNA m6A methylation and mouse embryonic stem cell self-renewal. Mol. Cell 69, 1028–1038.e6 (2018).
Ping, X. L. et al. Mammalian WTAP is a regulatory subunit of the RNA N6-methyladenosine methyltransferase. Cell Res. 24, 177–189 (2014).
Greenberg, M. V. C. & Bourc’his, D. Cultural relativism: maintenance of genomic imprints in pluripotent stem cell culture systems. Curr. Opin. Genet. Dev. 31, 42–49 (2015).
Liu, J. et al. N6-methyladenosine of chromosome-associated regulatory RNA regulates chromatin state and transcription. Science 367, 580–586 (2020).
Geula, S. et al. m6A mRNA methylation facilitates resolution of naïve pluripotency toward differentiation. Science 347, 1002–1006 (2015).
Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201–206 (2012).
Batista, P. J. et al. m6A RNA modification controls cell fate transition in mammalian embryonic stem cells. Cell Stem Cell 15, 707–719 (2014).
Meyer, K. D. et al. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell 149, 1635–1646 (2012).
Abakir, A. et al. N6-methyladenosine regulates the stability of RNA:DNA hybrids in human cells. Nat. Genet. 52, 48–55 (2020).
Nishimura, K., Fukagawa, T., Takisawa, H., Kakimoto, T. & Kanemaki, M. An auxin-based degron system for the rapid depletion of proteins in nonplant cells. Nat. Methods 6, 917–922 (2009).
Li, Y. et al. N6-Methyladenosine co-transcriptionally directs the demethylation of histone H3K9me2. Nat. Genet. 52, 870–877 (2020).
Zaccara, S. & Jaffrey, S. R. A unified model for the function of YTHDF proteins in regulating m6A-modified mRNA. Cell 181, 1582–1595.e18 (2020).
Lasman, L. et al. Context-dependent functional compensation between Ythdf m6A reader proteins. Genes Dev. 34, 1373–1391 (2020).
Ries, R. J. et al. m6A enhances the phase separation potential of mRNA. Nature 571, 424–428 (2019).
Lu, C., Contreras, X. & Peterlin, B. M. P. P bodies inhibit retrotransposition of endogenous intracisternal A particles. J. Virol. 85, 6244–6251 (2011).
Haeussler, M. et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol. 17, 148 (2016).
Walter, M., Teissandier, A., Pérez-Palacios, R. & Bourc’his, D. An epigenetic switch ensures transposon repression upon dynamic loss of DNA methylation in embryonic stem cells. eLife 5, 1–30 (2016).
Chen, C. Y. A., Ezzeddine, N. & Shyu, A. B. Messenger RNA half-life measurements in mammalian cells. Methods Enzymol. 448, 335–357 (2008).
Skene, P. J. & Henikoff, S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife 6, 1–35 (2017).
Didion, J. P., Martin, M. & Collins, F. S. Atropos: specific, sensitive, and speedy trimming of sequencing reads. PeerJ 5, e3720 (2017).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Bailly-Bechet, M., Haudry, A. & Lerat, E. ‘One code to find them all’: a perl tool to conveniently parse RepeatMasker output files. Mob. DNA 5, 1–15 (2014).
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Gel, B. et al. regioneR: an R/Bioconductor package for the association analysis of genomic regions based on permutation tests. Bioinformatics 32, 289–291 (2016).
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44 (W1), W160–W165 (2016).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Zhang, T., Zhang, S. W., Zhang, L. & Meng, J. trumpet: transcriptome-guided quality assessment of m6A-seq data. BMC Bioinformatics 19, 260 (2018).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Acknowledgements
We thank the members of the Bourc’his laboratory for their support; M. Greenberg and A. Shkumatava for critical reading of the manuscript; G. Cristofari and J. J. Zylicz for suggestions; I. Pinheiro for help with FACS analysis; E. Nora for targeting plasmids; and M. Schultz and M. Borensztein for assistance with cell culture and immunofluorescence experiments. We acknowledge the ICGex NGS platform of the Institut Curie (supported by grants ANR-10-EQPX-03 (Equipex) and ANR-10-INBS-09-08 (France Génomique) from the Agence Nationale de la Recherche) and the Cell and Tissue Imaging Platform (PICT-IBiSA) of Institut Curie, member of the French National Research Infrastructure France-BioImaging (ANR-10-INBS-04). The laboratory of D.B. is part of the LABEX DEEP (ANR-11-LABX-0044, ANR-10-IDEX-0001-02). This work was supported by the Fondation Bettencourt Schueller, the Association Robert Debré pour la Recherche Médicale (ARDRM), the Fondation pour la Recherche Médicale (FRM) and the Association de Recherche contre le Cancer (ARC-PJA-20191209637). T.C. was a recipient of an EMBO postdoctoral fellowship; E.R. and M.D. were supported by PhD fellowships from la Ligue contre le Cancer and from Région Ile-de-France, respectively.
Author information
Contributions
D.B. and T.C. conceived and designed the study. T.C. performed the genetic screen, genetic engineering of the different ES cell lines (reporter, genetic knockouts and auxin-degron), RNA-seq, MeRIP-seq, degron experiments, actinomycin D assays, m6A quantification by ELISA and immunoblots. E.R. contributed to generating and characterizing ES cell knockouts of Mettl3, Mettl14, Wtap, Zc3h13, Ythdf1, Ythdf2 and Ythdf3 and aided degron experiments. T.C., M.D. and L.B. performed CUT&RUN experiments. S.R. and C.F. aided the genetic screen, and M.D. the MeRIP-seq. F.D. provided degron-targeting vectors and helped with degron design. A.T. performed the bioinformatic analyses. S.L. aided sequencing library preparations. D.B. and T.C. interpreted the data and wrote the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature thanks Miguel Branco and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Validation of the IAPEz-reporter ES cell line.
a, Immunoblot showing Cas9 protein levels in parental (E14) and doxycycline (dox)-inducible IAPEz-reporter ES cells. Lamin B1 served as loading control. b, Flow cytometry analysis of GFP expression in IAPEz-reporter ES cells before and upon 48 h of dox (1 μM) induction and blasticidin (blast, 5 μg ml−1) selection. Left, overlaid dot plots (last 130,000 events); right, histograms normalized to 100% scale. Analysis was performed using NovoExpress software (Acea Biosciences). For pseudocolour plots and gating strategy, see Supplementary Fig. 2. Note that blasticidin selection further increases the fraction of GFP-positive cells, and not all blasticidin-resistant cells accumulate detectable GFP levels. Therefore, blasticidin selection is more sensitive than the GFP signal to subtle changes in the expression of the IAPEz reporter. c, Immunofluorescence detection of GFP in dox-induced IAPEz-reporter ES cells. Scale bar, 10 μm. d, Immunoblot showing KAP1 protein levels in IAPEz-reporter ES cells after KAP1-specific sgRNA introduction and blasticidin selection. Scrambled sgRNA served as control. Blasticidin treatment shows that only cells with successful KAP1 depletion become antibiotic-resistant. Lamin B1 served as loading control. e, Flow cytometry dot plot analysis (last 50,000 events) of GFP expression in cells from d. Dashed lines denote the GFP-positive threshold. For pseudocolour plots and gating strategy, see Supplementary Fig. 2. KAP1 depletion combined with blasticidin selection leads to IAPEz-reporter reactivation and GFP expression. Experiments in a–e were replicated at least twice with similar results.
Extended Data Fig. 2 Early versus late genome-wide CRISPR–Cas9 screen for IAPEz suppressors.
a, Schematic of the screening process. b, Ranked P values (permutation test) for enriched genes in early (left) and late (right) blasticidin treatment, as in a. Dashed lines indicate genes ranked in the top 100 according to P value (top) and P = 0.05 (bottom). Known IAP regulators (black, when P < 0.05) and Mettl3, Mettl14, Wtap and Zc3h13 (red) are reported. Presented data were derived from biological duplicates.
Extended Data Fig. 3 Generation and validation of Mettl3-, Mettl14-, Wtap- and Zc3h13-knockout mouse ES cells.
a–d, Mettl3-knockout (a), Mettl14-knockout (b), Wtap-knockout (c), and Zc3h13-knockout (d), mediated by CRISPR–Cas9. Schematic representation showing sgRNAs targeting of indicated loci (top) and immunoblots (bottom) confirming depletion of indicated proteins. PCNA or lamin B1 served as loading controls. In red, clones selected for downstream analyses presented in Figs. 2, 3, Extended Data Figs. 3e–g, 4–7. Western blots for clones indicated in red were repeated at least twice with similar results. e, RNA-seq heat maps showing expression of selected pluripotency and post-implantation markers in wild-type, Mettl3-knockout, Mettl14-knockout, Wtap-knockout and Zc3h13-knockout ES cells. Asterisks indicate genes with log2FC > 0.75 and FDR < 0.05. Data represent averages of two independent RNA-seq. f, Representative images of wild-type, Mettl3-, Mettl14-, Wtap- and Zc3h13-knockout ES cells grown in serum-free 2i + LIF medium, or after 24 h conversion to serum + LIF medium. Scale bar, 0.5 mm. When replacing 2i + LIF with serum + LIF medium, marked morphological changes were observed in all knockout lines. Experiments were repeated twice with similar results. g, Growth curves of wild-type, Mettl3-, Mettl14-, Wtap- and Zc3h13-knockout ES cells cultured in serum-free 2i + LIF medium, or after conversion to serum + LIF. Data are mean ± s.d. from three independent experiments. When replacing 2i + LIF with serum + LIF medium, the self-renewal ability of Wtap-knockout cells was severely impaired and Zc3h13-knockout cells disappeared (highlighted by an asterisk). This could explain the lower sgRNA ranking for these two genes upon extended selection time in the screen, which was performed in serum + LIF medium (Extended Data Fig. 2b).
Extended Data Fig. 4 Depletion of m6A methyltransferase complex results in deregulation of different retrotransposon families.
a, ELISA showing normalized m6A levels in total RNA in WT and KO ES cells. Data are mean ± s.d. of three independent RNA samples. P values were determined by two-sided Student’s t-test. b, Volcano plot showing log2FC in retrotransposon expression in Mettl14, Wtap and Zc3h13-knockout versus wild-type cells using a random assignment of multi-mapped reads. In red, blue, orange and green are significantly deregulated RepeatMasker annotations belonging to ERVK, ERV1, ERVL and L1 families, respectively. In grey, non-differentially expressed (non-DE) retrotransposons. P values were computed using limma and adjusted with the Benjamini–Hochberg correction (Methods). c–f, Volcano plots showing log2FC of retrotransposon expression in Mettl3-, Mettl14-, Wtap- and Zc3h13-knockout versus wild-type ES cells using a random assignment of multi-mapped reads. In red, blue, orange and green are significantly deregulated RepeatMasker annotations belonging to ERVK (c), ERV1 (d), ERVL (e) and L1 (f) families, respectively (log2FC > 0.75 and FDR < 0.05). In grey, non-differentially expressed retrotransposons. The P values were computed using limma and adjusted with the Benjamini–Hochberg correction.
Extended Data Fig. 5 Family-based analysis of retrotransposon expression using random versus unique mapping.
a, RNA-seq heat maps showing log2FC in expression of indicated retrotransposon families in Mettl3-, Mettl14-, Wtap- and Zc3h13-knockout versus wild-type ES cells allowing either multiple mapping with random allocation (left) or unique mapping only (right). In case of unique mappers with intragenic position, only elements transcribed in reverse orientation to the gene were included in the analysis. Note that although ERVK annotations show general upregulation in all knockout ES cells, L1s tended to be downregulated, with the exception of oldest L1MdA_IV, L1ME3Cz and L1MdV_I subfamilies. b, Heat maps of published nuclear RNA-seq data19 showing log2FC in expression for indicated retrotransposon families in two Mettl3-knockout versus control ES cells, allowing multiple mapping with random allocation. The two knockout lines show divergent levels of retrotransposon reactivation but consistently show greater reactivation of ERVK compared to ERVL and LINE1 families. Asterisks indicate retrotransposon families with log2FC > 0.75 and FDR < 0.05. Data were taken from accession GSE133585.
Extended Data Fig. 6 Analysis of gene expression in mutant ES cells of the m6A methyltransferase complex.
a, Venn diagram showing overlap of the upregulated genes (FDR < 0.05, log2FC > 0.75) in Mettl3-, Mettl14-, Wtap- and Zc3h13-knockout ES cells as identified by RNA-seq. In blue, genes (n = 941) commonly upregulated in all four knockout ES cells. b, Correlation between gene upregulation and proximity to retrotransposon annotations (−5 kb to +1 kb from the TSS). In blue, genes (n = 941) commonly upregulated in all four knockout ES cell lines. In grey, random genes plotted as mean + s.d. of n = 1,000 randomizations. P values were determined by permutation test. c, Dot plot showing splicing events (normalized counts per million) occurring between exons and RepeatMasker-annotated IAPs in wild-type, Mettl3-, Mettl14-, Wtap- and Zc3h13-knockout ES cells. Horizontal lines denote mean (n = 2 independent RNA-seq). d, Heat map showing average expression of selected retrotransposon regulators in wild-type, Mettl3-, Mettl14-, Wtap- and Zc3h13-knockout cells. Genes with log2FC > 0.75 and FDR < 0.05 are marked by an asterisk.
Extended Data Fig. 7 IAP mRNAs undergo METTL3-dependent m6A methylation.
a, Left, Venn diagram showing overlap between m6A peaks identified in wild-type and Mettl3-knockout MeRIP-seq. Right, input-normalized cumulative distribution of signal intensity for m6A peaks in wild-type (top), overlapping (middle) and Mettl3-knockout specific (bottom) cells. A set of novel m6A peaks was detected in Mettl3-knockout cells; however, they are probably false positives (weak signal intensity compared to canonical wild-type peaks). b, Normalized MeRIP-seq read density in wild-type (red) and Mettl3-knockout (blue) cells across the 5′ UTR, coding sequence and 3′ UTR of mRNA for the genes with at least one m6A peak. Dashed lines represent respective inputs. c, UCSC genome browser tracks showing m6A distribution (red, normalized read density, RPM) at indicated genes in wild-type and Mettl3-knockout ES cells. In black, input RNA. Results represent average signal of three independent MeRIP-seq experiments. d, Inputs (top) and background-normalized m6A signal distributions (bottom) across the IAPEz consensus sequence in wild-type (red) and Mettl3-knockout (blue) cells. Vertical black lines denote positions of RRACH motifs. Dashed vertical black lines denote regions of m6A enrichment present in wild-type and lost in Mettl3-knockout cells. e, Average of input-normalized m6A signal intensities along the MMERVK10C consensus sequence in wild-type (red) and Mettl3-knockout (blue) cells. Vertical black lines denote RRACH motif positions. Dashed vertical black lines denote region of m6A enrichment present in wild-type and lost in Mettl3-knockout cells. f, Average of m6A signal intensities for indicated MMERVK10C sequence segments in wild-type (red) and Mettl3-knockout (blue) ES cells. Data are mean ± s.d. of three independent MeRIP-seq experiments. g, h, Background-normalized m6A signal distribution (g) and average of input-normalized m6A signal intensities across the L1MdA_I consensus sequence in wild-type (red) and Mettl3-knockout (blue) cells (h).
Vertical black lines denote RRACH motif positions. i, Alignment between 5′ UTRs of IAPEz and MMERVK10C consensus sequences. IAP-specific, MMERVK10C-specific and common RRACH motifs are indicated in blue, yellow and green, respectively. j, Box plot showing log2FC in expression of uniquely mapped IAPEz copies grouped according to the number of 5′ UTR-associated RRACH motifs in Mettl14-, Wtap- and Zc3h13-knockout versus wild-type cells. Box plots are as in Fig. 2d. Only copies with a minimum of 10 reads in at least one sample were retained.
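For readers who want to reproduce the motif annotation used in panels d, e, i and j, the following is a minimal sketch of how RRACH positions could be located in a sequence, assuming the standard IUPAC reading of the motif (R = A or G; H = A, C or U/T). The function names `find_rrach` and `count_rrach` are hypothetical and are not taken from the authors' pipeline.

```python
import re

# RRACH in IUPAC notation: R = A or G, H = A, C or U (T in DNA-style sequence).
# The zero-width lookahead lets the scan report overlapping occurrences too.
RRACH = re.compile(r"(?=[AG][AG]AC[ACTU])")


def find_rrach(seq: str) -> list:
    """Return 0-based start positions of every RRACH motif in seq."""
    return [m.start() for m in RRACH.finditer(seq.upper())]


def count_rrach(seq: str) -> int:
    """Number of RRACH motifs in seq, e.g. within a 5' UTR."""
    return len(find_rrach(seq))


if __name__ == "__main__":
    # Illustrative sequence only, not an IAPEz or MMERVK10C sequence.
    example = "GGACAGACT"
    print(find_rrach(example), count_rrach(example))
```

Counting motifs per 5′ UTR in this way would allow copies to be binned by motif number, as in panel j; whether the original analysis treated overlapping motifs separately is not stated in the legend.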
Extended Data Fig. 8 Auxin-inducible degron of endogenous METTL3, METTL14 and ZC3H13.
a, Schematic of TIR1 and SCF1 complex-dependent degradation of endogenously AID-tagged proteins in the presence of auxin. b, Immunoblot showing protein levels of endogenously 3×Flag-AID-tagged METTL3 and METTL14 in single and double degron ES cell lines. TIR1-only ES cells were used as control for protein levels. PCNA served as loading control. c, Immunoblot showing efficiency and reversibility of METTL3 and METTL14 depletion after 0–96 h auxin treatment followed by 48 h auxin wash-off. PCNA served as loading control (related to Fig. 4b). d, ELISA showing normalized m6A levels in total RNA after 0, 6 and 96 h of auxin-induced degradation in METTL3, METTL14 and METTL3;METTL14 double degron cells. Data are mean ± s.d. of three technical replicates. TIR1-only ES cells served as control. The experiment was repeated twice with similar results. e, RT–qPCR showing normalized IAP mRNA levels relative to 0 h using Δ1- or gag-specific primers after auxin-induced degradation of METTL14 (light blue), METTL3 (blue) and METTL3;METTL14 double degron (dark blue). TIR1-only (black) ES cells served as control. Data are mean ± s.d. of three independent auxin inductions. f, Growth curves of TIR1-only and AID-METTL3;METTL14 double degron ES cells treated with either DMSO (grey) or auxin (black) for 0–96 h. Data are mean ± s.d. of three independent auxin inductions. Note that, in contrast to constitutive m6A-knockout ES cells, the proliferation rate is not altered by acute m6A loss (similar rate between AID-METTL3;METTL14 double degron and TIR1-only ES cells). However, prolonged auxin treatment may have a negative effect on the proliferation rate. g–i, ZC3H13 auxin-dependent degron. Schematic of ZC3H13 degron engineering in mouse ES cells (g), immunoblot showing protein levels of endogenously 3×Flag-AID-tagged ZC3H13 (h) and degron efficiency in the presence of auxin (i). TIR1-only ES cells were used as control for protein levels. KAP1 served as loading control.
j, RT–qPCR showing normalized IAP mRNA levels relative to 0 h using LTR-specific primers after auxin-induced degradation of ZC3H13. Data are mean ± s.d. from three independent auxin inductions. Immunoblots presented in b, c, h, i were repeated at least twice with similar results.
Extended Data Fig. 9 Retrotransposon chromatin profiling in AID-METTL3;METTL14 double degron ES cells using random assignment of multi-mapped reads on full-length elements.
a, Composite profiles (top) and density plots (bottom) showing enrichment of H3K4me3, H3K27ac and H3K9me3 as determined by CUT&RUN on m6A-bound and non-m6A-bound genes (±2 kb; gene lists were obtained from MeRIP-seq results) in AID-METTL3;METTL14 double degron ES cells treated with DMSO or auxin for 96 h. b, Heat map showing average H3K27ac, H3K4me3 and H3K9me3 levels (log2(immunoprecipitate/IgG)) for 56 retrotransposon families in AID-METTL3;METTL14 double degron ES cells treated with DMSO or auxin for 96 h. c, Violin plots and box plots (inside) showing H3K4me3, H3K27ac and H3K9me3 levels at L1MdA_I-IV copies (n = 436 copies) in AID-METTL3;METTL14 double degron ES cells treated with DMSO (grey) or auxin (red) for 96 h. Only copies with a minimum of 10 reads in at least one sample were retained. Box plots are as in Fig. 2d. P values were determined by two-sided Student’s t-test. d, Composite profiles showing H3K4me3 (top), H3K27ac (middle) and H3K9me3 (bottom) coverage along full-length (>5 kb) L1MdA_I copies (n = 1,400) in AID-METTL3;METTL14 double degron ES cells treated with DMSO (control, grey) or auxin (red) for 96 h. Results in a–d represent averages of two independent CUT&RUN experiments.
Extended Data Fig. 10 YTHDF2 binds to ERV mRNAs.
a, Schematic representation showing sgRNA targeting of the Ythdf2 gene. b, Immunoblot showing YTHDF2 protein levels in wild-type and YTHDF2-depleted cells. PCNA served as loading control. Immunoblotting was repeated twice with similar results. c, RIP–qPCR showing YTHDF2 enrichment on indicated ERVKs and Nanog in wild-type and Ythdf2-knockout ES cells (two independent clones). Rabbit IgG served as control. Data are mean ± s.d. from three independent experiments. d, Generation of mutYthdf2-3 and mutYthdf1-2-3 mutant ES cells using CRISPR–Cas9. Schematic representation showing sgRNA sequences; mutation and deletion information based on Sanger sequencing is provided.
Supplementary information
Supplementary Figures
This file contains Supplementary Figure 1 (uncropped images of western blot gels) and Supplementary Figure 2 (pseudocolor plots and gating strategies for the FACS analyses presented in Extended Data Fig. 1).
Supplementary Table
Supplementary Table 1: IAPEz nucleotide sequence used in IAPEz reporter cassette; primer sequences used for CRISPR-Cas9 library amplification and sequencing for screens I and II; primers used in RT-qPCR experiments.
Supplementary Table 2: List of sgRNA sequences used for generation of mutant and degron mouse ESCs.
Supplementary Table 3: List of antibodies.
Supplementary Table 4: CRISPR-Cas9 knockout screen I and II gene ranking obtained using MAGeCK.
Supplementary Table 5: Sequencing statistics for the CRISPR-Cas9 screens I and II, RNA-seq, MeRIP-seq and CUT&RUN.
Supplementary Table 6: Mettl3-, Mettl14-, Wtap- and Zc3h13-KO RNA-seq results for differentially expressed retrotransposons.
Cite this article
Chelmicki, T., Roger, E., Teissandier, A. et al. m6A RNA methylation regulates the fate of endogenous retroviruses. Nature 591, 312–316 (2021). https://doi.org/10.1038/s41586-020-03135-1
DOI: https://doi.org/10.1038/s41586-020-03135-1
- Springer Nature Limited