1 Introduction

Transposable elements (TEs) play a central role in genome evolution and genetic innovation, as first proposed by Barbara McClintock’s seminal work describing somatic transposition in maize, and many subsequent studies (Johnson and Guigo 2014; Feschotte 2008; Oliver and Greene 2011; Bourque 2009; Sasaki et al. 2008; Böhne et al. 2008; Hutchins and Pei 2015; Casacuberta and González 2013; McClintock 1950). The remnants of now inactive TEs pervade most eukaryotic genomes and, in some cases, carry out biological functions that favour the host cell, a phenomenon called ‘exaptation’ (Bejerano et al. 2006; Jordan et al. 2003; Jacques et al. 2013; Gifford et al. 2013; Kelley et al. 2014; Fort et al. 2014; Faulkner et al. 2009). In humans, the only class of TE still able to mobilise autonomously is the retrotransposon LINE-1 (L1). A full-length L1 is a transcribed 6 kb genetic unit (Grimaldi et al. 1984) that encodes two proteins essential for L1 mobility (called ORF1p and ORF2p) (Moran et al. 1996; Scott et al. 1987; Singer et al. 1993), as well as an unusual antisense open reading frame (ORF0) of unclear relevance to L1 retrotransposition (Denli et al. 2015). Although ~500,000 L1 copies comprise 17% of human genomic DNA, nearly all of these copies are now immobile due to 5′ truncations, internal rearrangements and mutations (Lander et al. 2001). As a result, ~100 L1 copies remain retrotransposition competent (Sassaman et al. 1997) and, of these, only a small number, dubbed ‘hot’ L1s, account for the vast majority of new L1 retrotransposition events observed in human populations (Brouha et al. 2003; Beck et al. 2010). The L1 proteins can also recognise and mobilise non-autonomous retrotransposons, such as Alu and SINE-VNTR-ALU (SVA) elements, in trans (Dewannieux et al. 2003; Belancio et al. 2010; Garcia-Perez et al. 2007a; Wei et al. 2001; Doucet et al. 2015).
Until recently, it was considered that mammalian cells only allowed somatic L1 retrotransposition during early embryonic development and under pathological circumstances, such as cancer (Kazazian et al. 1988; Garcia-Perez et al. 2007b; Kano et al. 2009; Trelogan and Martin 1995; Iskow et al. 2010; Miki et al. 1992). However, Muotri et al., analysing the transcriptional profiles of multipotent neural progenitor cells (NPCs), were the first to discover that L1 transcripts were also expressed in the brain under normal conditions (Muotri et al. 2005). Here we consider the last decade of discoveries relating to L1 activity in the mammalian brain that followed on from the key findings of Muotri et al. We particularly emphasise the role of the environment and neurological disease in modulating neuronal L1 retrotransposition, as this is arguably the clearest route available to understand the functional significance of L1 mobilisation in the brain.

2 Detecting Retrotransposition in the Neuronal Lineage

The developmental timing of neuronal L1 retrotransposition is decisive in determining how many somatic L1 insertions are found per neuron, and how many neurons each L1 insertion is found in. It is now well established that L1 mobilisation occurs during neuronal differentiation, when neural stem cells (NSCs) commit to neuronal progenitor cells (NPCs), and potentially in mature, postmitotic neurons. The key findings supporting this conclusion are primarily based on in vitro and in vivo measurements of L1 activity using transgenic L1 elements, and in vivo studies of endogenous L1 behaviour. Cultured adult rat NPCs, as well as human NPCs derived from foetal brain stem cells, each support retrotransposition of a human L1 element bearing an enhanced green fluorescent protein (EGFP) reporter cassette during the early stages of neuronal differentiation (Muotri et al. 2005; Coufal et al. 2009). The L1-EGFP cassette contains the gene encoding EGFP in reverse orientation to the L1 transcript. Due to an interruption of the EGFP gene by an intron in the same transcriptional orientation as the L1, EGFP-positive cells only arise when L1 retrotransposition is completed and the EGFP intron is removed from the RNA intermediate before reverse transcription (Ostertag 2000). Additionally, endogenous L1 mRNAs are detectable in human NPCs (Coufal et al. 2009). The cells that support retrotransposition events and contain endogenous L1 transcripts present a multipotent NSC phenotype with bias towards neuronal differentiation (Muotri et al. 2005; Coufal et al. 2009). L1 insertions can occur within neuronal genes and thereby have the potential to cause gene expression changes (Muotri et al. 2005; Klawitter et al. 2016; Han et al. 2004; Upton et al. 2015). As well as during adult neurogenesis, L1 retrotransposition occurs during early embryonic development, as found in human embryonic stem cells (hESCs) (Garcia-Perez et al. 2007b) and transgenic L1-EGFP mice where an engineered human L1 is under the control of a native L1 promoter (L1RP) (Muotri et al. 2005).

Coufal et al. subsequently developed an L1 copy number variation (CNV) assay based on qPCR which, when applied to human central nervous system (CNS) and other somatic tissues, displayed an overall elevation of L1 copy number in the CNS (Coufal et al. 2009), consistent with substantial full-length and processed L1 mRNA expression occurring in the brain (Faulkner et al. 2009; Belancio et al. 2010; Tyekucheva et al. 2011). This higher L1 copy number is particularly observed in the hippocampal dentate gyrus (DG) (Coufal et al. 2009; Baillie et al. 2011). It is notable that although the engineered L1-EGFP and L1 CNV assays provide a window into endogenous L1 activity in the brain, they also carry considerable drawbacks. For example, the L1-EGFP assay requires reverse transcription of the sizeable EGFP cassette, at a minimum, to observe EGFP-positive cells, and the EGFP promoter is subject to host genome silencing (Garcia-Perez et al. 2010). The L1 CNV assay, by contrast, measures endogenous L1 genome content, but is primarily useful as an indicator of relative L1 copy number, and does not provide the genomic locations of L1 integration sites.
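To make concrete why a qPCR-based assay reports relative rather than absolute L1 content, the following minimal sketch applies the generic delta-delta-Ct calculation. It is an illustration under stated assumptions, not the published protocol: the function name, the choice of a reference locus for normalisation, the default amplification efficiency of 2 and all Ct values are hypothetical.

```python
def relative_l1_content(ct_l1_sample, ct_ref_sample,
                        ct_l1_control, ct_ref_control,
                        efficiency=2.0):
    """Relative L1 content of a sample tissue versus a control tissue.

    Generic delta-delta-Ct calculation (an assumption of this sketch):
    each L1 Ct is normalised to a reference locus in the same DNA sample,
    and the sample tissue is then compared with the control tissue.
    """
    delta_sample = ct_l1_sample - ct_ref_sample
    delta_control = ct_l1_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    # A lower Ct means more template, hence the negative exponent.
    return efficiency ** (-delta_delta)


# Hypothetical Ct values for illustration only (not measured data).
ratio = relative_l1_content(ct_l1_sample=14.8, ct_ref_sample=20.0,
                            ct_l1_control=15.0, ct_ref_control=20.0)
print(f"Relative L1 content, sample vs control: {ratio:.2f}")
```

The result is a ratio between tissues, which is why such an assay can indicate an elevation of L1 copy number but cannot say where in the genome the additional copies reside.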

High-throughput DNA sequencing can overcome these issues by allowing the detection and genomic localisation of endogenous L1 variants. Briefly, this usually involves sequencing genomic DNA in order to identify L1 integration sites in brain tissue that are not present in matched non-brain tissue (e.g. liver or heart). Subsequently, these data are cross-referenced to databases containing known polymorphic insertions (Baillie et al. 2011; Kurnosov et al. 2015; Mir et al. 2015) to gain further confidence in predicted somatic L1 variants. To facilitate higher sequencing depth at L1 insertion sites, DNA can be enriched prior to sequencing. Retrotransposon capture sequencing (RC-seq), for instance, is a hybridisation-based method developed to enrich sequencing libraries for fragments containing L1 junctions (Baillie et al. 2011; Upton et al. 2015; Shukla et al. 2013). Using RC-seq, Baillie et al. again identified the hippocampus as a region prone to somatic L1 retrotransposition (Baillie et al. 2011), corroborating the earlier Coufal et al. study (Coufal et al. 2009). Interestingly, the hippocampus is one of the primary brain regions where neurogenesis is maintained in adulthood (Eriksson et al. 1998), which is consistent with the finding that L1 activity becomes more prominent during neurogenesis and neuronal differentiation (Coufal et al. 2009; Muotri et al. 2005). Investigating the genome-wide integration site pattern of detected somatic L1 insertions, Baillie et al. found an overrepresentation of insertions in some protein-coding loci, specifically the introns of neurobiological genes, corroborating a preliminary observation made by Muotri et al. based on genomic mapping of L1-EGFP insertions (Baillie et al. 2011; Muotri et al. 2005).
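The filtering logic described above, in which brain insertion calls are retained only if they are absent from matched non-brain tissue and from catalogues of known polymorphisms, can be sketched as follows. This is a simplified illustration rather than any published pipeline: the `Insertion` type, the function names, the 100 bp matching window and the example coordinates are all hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Insertion(NamedTuple):
    """A called L1 integration site (hypothetical minimal representation)."""
    chrom: str
    pos: int


def same_site(a: Insertion, b: Insertion, window: int) -> bool:
    """Treat two calls as one site if they lie within `window` bp on a chromosome."""
    return a.chrom == b.chrom and abs(a.pos - b.pos) <= window


def candidate_somatic(brain: Iterable[Insertion],
                      matched_tissue: Iterable[Insertion],
                      known_polymorphic: Iterable[Insertion],
                      window: int = 100) -> List[Insertion]:
    """Keep brain calls seen neither in matched tissue nor in a polymorphism catalogue."""
    excluded = list(matched_tissue) + list(known_polymorphic)
    return [site for site in brain
            if not any(same_site(site, other, window) for other in excluded)]


# Made-up coordinates for illustration only.
brain = [Insertion("chr1", 1_000_500), Insertion("chr2", 5_000), Insertion("chr7", 42_000)]
liver = [Insertion("chr1", 1_000_520)]     # also present in non-brain tissue
polymorphic = [Insertion("chr2", 5_010)]   # listed as a known polymorphism
print(candidate_somatic(brain, liver, polymorphic))
```

In this toy example only the chr7 call survives both filters. Real analyses also have to contend with sequencing depth, chimeric molecules and amplification artefacts, which this sketch deliberately ignores.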

That the hippocampus is a major source of adult neurogenesis, and provides a substantial contribution to behavioural phenotypes (Kim et al. 2015; McDonald and Hong 2013), combined with Baillie et al.’s finding that somatic L1 insertions primarily occur in gene-rich regions, is striking because in this setting the chances of an L1 insertion leading to phenotypic change are greatly increased (Richardson et al. 2014). However, the rate at which L1 mobilisation takes place in neurons is still unclear. Estimates from single-cell genomic analyses, where DNA is obtained from individual cells and then massively amplified, range from 1 L1 insertion per 300 neurons (Evrony et al. 2012) through to multiple insertions per cell (Upton et al. 2015). The most recent study aiming to resolve this issue reported 13.7 somatic L1 insertions per hippocampal neuron (Upton et al. 2015), leaving the chances of functional consequences relatively high.
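The size of the gap between these estimates is easier to appreciate with a short calculation. The sketch below uses only the two rates quoted above; the population of one million neurons is an arbitrary figure chosen for illustration, and the helper names are hypothetical.

```python
def fold_difference(rate_a: float, rate_b: float) -> float:
    """How many times larger rate_a is than rate_b."""
    return rate_a / rate_b


def expected_insertions(rate_per_neuron: float, n_neurons: int) -> float:
    """Expected total number of somatic insertions across n_neurons."""
    return rate_per_neuron * n_neurons


low_rate = 1 / 300   # insertions per neuron (Evrony et al. 2012, as cited above)
high_rate = 13.7     # insertions per neuron (Upton et al. 2015, as cited above)
n_neurons = 1_000_000  # arbitrary population size, for illustration only

print(f"Fold difference between estimates: ~{fold_difference(high_rate, low_rate):.0f}")
print(f"Expected insertions at the low rate:  ~{expected_insertions(low_rate, n_neurons):.0f}")
print(f"Expected insertions at the high rate: ~{expected_insertions(high_rate, n_neurons):.0f}")
```

The two estimates differ by more than three orders of magnitude, so any inference about functional impact depends heavily on which rate is closer to the truth.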

3 Insertional Impact and Regulation of Retrotransposons

The integration of new L1 insertions, or other TEs, into genes can significantly impact gene expression by constraining or differentially regulating transcription or altering the encoded protein. The consequences of an L1 insertion depend on the characteristics of the insertion (full-length or 5′ truncated, sense or antisense to the gene) and the cellular environment, including the response of the host cell to the insertion. For example, L1 insertions in the sense orientation to a gene are expected to be more detrimental to that gene than an antisense insertion because RNA polymerase II struggles to process the L1 sequence in the sense orientation (Chen et al. 2006; Han and Boeke 2004; Han et al. 2004). This is the primary explanation for a strong depletion of sense-oriented L1 insertions in protein-coding genes in the human reference genome (Ewing and Kazazian 2011). New L1 insertions can, however, impact host gene expression via many routes, and generate phenotypes (Beck et al. 2011). This is nicely illustrated in two distinct mouse models: the spastic mouse and the Orleans reeler. The spastic mouse contains a homozygous mutation in the brain-expressed glycine receptor β subunit-encoding (Glyrb) gene. This mutation results in defects of the glycine signalling pathway and subsequent motor deficiency and is the consequence of a full-length L1 insertion in intron 5 of the Glyrb gene, leading to aberrant splicing of the pre-mRNA by skipping of exon 5 (Mülhardt et al. 1994; Kingsmore et al. 1994). As the L1 insertion solely affects splicing of the adult isoform of the receptor subunit (GlyRA), the spastic phenotype only becomes apparent around 2 weeks of age, when a developmental switch from the neonatal isoform (GlyRN) to GlyRA takes place (Becker 1990). By comparison, the Orleans reeler mouse has a full-length L1 insertion into an exon of the Reelin (Reln) gene, inducing exon skipping (D’Arcangelo et al. 1995). 
Exon skipping causes a 220 bp deletion in the Reln mRNA and a consequent frame shift, so that the mRNA encodes a truncated protein that is secreted inefficiently (de Bergeyck et al. 1997; Takahara et al. 1996). As Reln is an extracellular signalling protein required for the regulation of neuronal migration, deficiency in its secretion leads to a severe impairment of neuronal migration and, as a consequence, cortical and cerebellar delamination and subsequent typical neurological symptoms. These archetypal examples of germline L1 retrotransposition leading to neuronal phenotypes, in the Orleans reeler and spastic mouse, point to the possible consequences of somatic L1 retrotransposition occurring during neurogenesis. Unsurprisingly, the host genome has evolved several mechanisms to limit L1 mobilisation in germ cells, and the neuronal lineage (Fig. 1).

Fig. 1

L1 regulation is complex and dynamic. Numerous proteins, including YY1, RUNX3, SRY family members (Sox2 and Sox11), HDAC1, MeCP2, SIRT6 and p53, regulate L1 activity via epigenetic modifications, and through transcriptional stimulation/repression.

Methylation of the L1 promoter region is the first line of defence for cells to guard against potentially deleterious L1 mobilisation (Hata and Sakaki 1997). Methyl CpG-binding protein 2 (MeCP2), a protein required for DNA methylation-mediated gene repression and mainly expressed in mature neurons (Fig. 2), is closely involved in inhibiting L1 activity. MeCP2 knockdown correlates with an increase in L1 promoter activity (Muotri et al. 2010). Under normal circumstances, MeCP2 binds methylated CpG dinucleotides and interacts with histone deacetylase protein (HDAC) and SIN3A corepressor complex resulting in blockage of transcription factors, histone deacetylation and methylation (Fig. 1) (Fuks et al. 2003; Nan et al. 1998). Inhibition of an MeCP2-interacting protein, HDAC1, by valproic acid enhances the transcriptional activity of L1 (Lennartsson et al. 2015). This indicates that HDAC1 is also involved in L1 repression (Fig. 1). HDAC1 dysfunction is known to play a role in psychiatric disorders, specifically schizophrenia, suggesting a potential mechanism underlying the symptoms experienced by these patients (Weïwer et al. 2013). The mono-ADP ribosyltransferase enzyme, Sirtuin 6 (SIRT6), another deacetylase, is suggested to inhibit L1 transcription by promoting heterochromatin formation (Van Meter et al. 2014). SIRT6 localises to the L1 promoter and, interestingly, appears to be displaced during aging as well as in oxidative stress conditions, circumstances known to enhance TE activity (Li et al. 2013). Although L1 is silenced by the MeCP2 complex and other mechanisms in most tissues, the brain exhibits significantly lower L1 methylation than matched skin samples (Coufal et al. 2009). Furthermore, during cell differentiation the L1 promoter tends to be demethylated (Muotri et al. 2010) potentially creating a brief window for retrotransposition to take place (Kano et al. 2009; Muotri et al. 2005).

Fig. 2

Dynamic L1 activity during neurogenesis. The factors illustrated in Fig. 1 are involved in proliferation, differentiation and neuronal function. L1 expression is, as a result, differentially regulated during brain development as well as early and adult neurogenesis, resulting in potentially dynamic L1 activity, and mobilisation, during these stages (CA1, CA3: cornu ammonis 1 and 3; GCL: granule cell layer)

Beyond epigenetic suppression, L1 can be regulated by transcription factors (TFs) expressed in neural cells. For instance, Yin Yang 1 (YY1), a zinc finger protein TF, strongly and predominantly expressed in neurons (Rylski et al. 2008), is involved in neuronal differentiation (Fig. 2) (reviewed in He and Casaccia-Bonnefil 2008) and facilitates L1 transcription, potentially by directing the RNA polymerase II (pol II) complex to its proper binding site (Fig. 1) (Becker et al. 1993; Athanikar et al. 2004). Members of the sex-determining region Y (SRY) protein family can also impact L1 activity. SRY-box 2 (Sox2) can inhibit L1 transcription (Kuwabara et al. 2009; Coufal et al. 2009; Muotri et al. 2005), while Sox11 is suggested to stimulate L1 activity (Tchénio et al. 2000). During embryonic and adult neurogenesis Sox2 is involved in maintenance of the multipotent state of NSCs and NPCs (Graham et al. 2003; Heinrich et al. 2014; Ring et al. 2012). By contrast, Sox11 is mainly expressed in non-proliferative, committed neuronal cells in the neurogenic niches of the adult brain, where it acts as a transcriptional activator of several neuronal genes (Haslinger et al. 2009; Mu et al. 2012; Bergsland et al. 2006). Another TF, runt-related transcription factor 3 (RUNX3), which is involved in neurogenesis, development and survival of proprioceptive neurons, stimulates the L1 promoter region (Yang et al. 2003; Inoue et al. 2008; Lallemend et al. 2012). Finally, p53 suppresses L1 retrotransposition through its involvement in H3K9 trimethylation (H3K9me3), a silencing marker, which has been found to occur at the L1 enhancer region (Wylie et al. 2015; Harris et al. 2009). p53 expression is found in proliferating and newly formed neurons where it helps regulate proliferation and differentiation (reviewed in Tedeschi and Di Giovanni 2009). Thus, L1 activity in the brain is regulated by TFs essential to neurogenesis. 
It remains unclear as to whether this is by coincidence or because L1, a molecular parasite, has found a niche where it is derepressed as part of the greater cascade of gene regulation governing neurogenesis.

As new L1 insertions attract epigenetic suppression and carry TF-binding sites, the integration of an L1 into introns or intergenic regions upstream of protein-coding genes can alter the expression pattern of those genes. For example, 79 protein-coding genes were shown by Kuwabara et al. to present SRY-binding sites from L1 insertions occurring proximal to their transcription start sites in the human genome (Kuwabara et al. 2009). In these cases, transcriptional activation or suppression of L1 by one of the members of the SRY family may lead to the activation or suppression of the downstream protein-coding gene. That the regulatory factors described above play a role in neurogenesis and differentiation may suggest that L1 can influence these processes by, for example, genetically reprogramming differentiating cells (Spadafora 2015; Peaston et al. 2004; Muotri et al. 2005). It follows that L1 mobilisation in the brain is proposed as a source of neuron functional diversity (Muotri et al. 2005; Baillie et al. 2011; Upton et al. 2015; Singer et al. 2010; Coufal et al. 2009; Richardson et al. 2014). Hypothetically, if L1 causes genome plasticity in neurons, it may provide itself, and the host organism, extra capacity to adapt to its environment (Casacuberta and González 2013; Oliver and Greene 2011) at the cost of, perhaps, occasional catastrophic consequences for the individual, including neurological disorders (reviewed in Reilly et al. 2013).

4 Environmental Influences upon L1 Activity

Barbara McClintock was the first to propose the “genomic shock” hypothesis, speculating that environmental factors have the ability to stimulate the activity of TEs (McClintock 1984). Since then, numerous studies have aimed to address this hypothesis for environmental/cellular changes ranging from stress and toxic agents to voluntary physical activity. Although these studies have often reported enhanced L1 activity, we must emphasise that many of these observations require replication.

Preliminary experiments suggest that heavy metals may, for instance, modulate L1 mobilisation. Mercury (Hg), nickel (Ni) and cadmium (Cd) exposure appear to increase L1 retrotransposition (El-Sawy et al. 2005; Kale et al. 2005, 2006). The particulate, water-insoluble forms of these heavy metals (mercury sulfide (HgS), nickel oxide (NiO) and cadmium sulfide (CdS)) increase L1 mobilisation in HeLa cells (Kale et al. 2005). Exploring the effect of the soluble forms of these substances produces slightly different results for mercury (HgCl2) (Habibi et al. 2014). No difference in L1 promoter activity, transcription or putative genomic L1 integration is detected for non-neuronal cells, including HeLa cells, after HgCl2 exposure. By contrast, a neuroblastoma cell line (NB) does potentially show an increase in all of these measurements. The soluble forms of nickel (NiCl2) and cadmium (CdCl2), however, generate similar results to those of their particulates (Kale et al. 2006; El-Sawy et al. 2005). Examination of these phenomena reveals that L1 endonuclease activity associated with the increase in L1 mobilisation does not contribute to the toxicity observed for CdS or CdCl2 (Kale et al. 2006). Furthermore, the increased L1 retrotransposition resulting from NiCl2 exposure is not mediated by enhancement of L1 promoter activity (El-Sawy et al. 2005). Likewise, the direct genotoxicity of CdS and NiCl2, which could potentially facilitate L1 insertion into DNA double-stranded breaks (DSBs), is not causative (El-Sawy et al. 2005; Kale et al. 2006). Instead, it appears that the displacement of magnesium (Mg) and zinc (Zn) cofactors by Ni and Cd induces L1 activity, as demonstrated by the abolishment of this effect after Mg and Zn supplementation (El-Sawy et al. 2005; Kale et al. 2006).

L1 activity induced by other genotoxic agents, such as benzo[a]pyrene (BaP), an aromatic hydrocarbon produced by wood burning and found in coal tar and automobile exhaust fumes, is plausibly dependent on their ability to induce DNA damage (Stribinskis and Ramos 2006). This potentially reflects cellular attempts to recruit L1 as a compensatory mechanism, either to induce apoptosis via genome instability triggered via ORF2p activity or to use the ability of L1 to repair DNA damage through EN-independent L1 integration (Stribinskis and Ramos 2006; Morrish et al. 2002; Teng et al. 1996). Morrish et al. described enhanced levels of retrotransposition of an EN-incompetent L1 in cell lines lacking DNA repair mechanisms (Morrish et al. 2002). However, induced DSBs in cell lines with intact DNA repair mechanisms were not found to increase retrotransposition of an EN-incompetent L1 (Farkash et al. 2006). Coufal et al. further reported that mutations inactivating the function of both non-homologous end joining (NHEJ) and p53 are required for efficient EN-incompetent L1 retrotransposition (Coufal et al. 2011). Therefore, L1 could be used by the cell to mediate the repair of DNA damage, but exclusively in cells suffering from NHEJ and p53 dysfunction. Finally, oxidative stress, which can result from a number of natural stimuli as well as toxic agents, appears to increase retrotransposition of an L1 reporter in cultured neuroblastoma cells (Giorgi et al. 2011), an interesting finding considering that the brain is a metabolic hotspot. Despite the studies described above, it remains unclear whether L1 can function as a cellular buffer against the environmental impact of toxic agents. Additionally, the observed influence of environmental factors may be dependent on the exact characteristics of the chosen stimulus, as well as the cell type investigated. 
As a result, more extensive investigation is required in this area, particularly for primary neuronal cells, as most data obtained thus far has been from immortalised cancer cell lines.

Although environmental factors impact neurogenesis (Koehl 2015) and, as the above-mentioned literature suggests, may also alter L1 activity, it remains to be proven whether environmental perturbation during neuronal differentiation leaves L1 more prone to mobilise. The only substantive data in this area is from a 2009 study by Muotri et al.: using transgenic mice carrying the human L1-EGFP reporter construct, they found that voluntary exercise resulted in an increase in EGFP-positive cells in the brain (Muotri et al. 2009). However, these EGFP-positive cells were not only found in the hippocampus where exercise was shown to lead to a significant increase in NPC proliferation and newborn neurons, providing the opportunity for L1 to mobilise, but also in the cerebellum, a non-neurogenic area. L1 retrotransposition in the cerebellum was an intriguing observation because it either indicated that L1 could jump in postmitotic neurons or that the detected EGFP was found in cells born elsewhere that migrated to the cerebellum and then underwent derepression of the EGFP cassette in mature neurons due to chromatin remodelling. Hence, it is difficult to conclude whether exercise led to an increased detection of L1 insertions due to increased L1 mobilisation, neurogenic rate, chromatin accessibility or a combination of these factors. Muotri et al.’s experiments therefore highlight difficulties in attributing phenotypic effects to L1 mobilisation in vivo, but do at least favour speculation that L1 can mediate neuronal genome plasticity in response to environmental changes.

5 Retrotransposon Involvement in Neurological Disorders

Traumatic early life events and chronic stress are major risk factors for the development of a range of neurological disorders (Bagot et al. 2014). If L1 is reactive to environmental stressors, it could play a potentially important role in the development or exacerbation of neurological diseases. Here we highlight the intriguing findings in this area while noting that no causative link has yet been established with certainty between any brain disorder and somatic L1 retrotransposition.

5.1 Retrotransposons in Neurodevelopmental and Neurodegenerative Disorders

Neurological disorders resulting from inherited or spontaneous genetic mutations can reproducibly present upregulation of L1 retrotransposition in the brain. In particular, recent work has revealed that L1 copy number is elevated in Rett syndrome (RTT) and ataxia telangiectasia (AT) patient brains (Coufal et al. 2011; Muotri et al. 2010).

RTT is a progressive and devastating disease predominantly associated with mutation of the MeCP2 gene, characterised by a range of neurological problems from ataxia to autism and usually developing before 2 years of age (Amir et al. 1999). As noted above, MeCP2 is involved in transcriptional repression by binding methylated DNA and inducing histone methylation and deacetylation. MeCP2 is highly expressed in mature neuronal nuclei (Fig. 2) and, when mutated, is associated with aberrant epigenetic profiles, potentially explaining the severe CNS defects seen in RTT (Shahbazian 2002; Gabel et al. 2015). Although Yu et al. established that MeCP2 influences L1 promoter activity and L1 retrotransposition in transformed cell lines (Yu et al. 2001), Muotri et al. brought this work forward by showing that MeCP2 knockout in mouse neuroepithelial cells increases L1 promoter activity fourfold (Muotri et al. 2010). This result was specific for the reduction of MeCP2 and was not found for methyl CpG-binding domain protein 1 (MBD1), a protein from the same family but with a different DNA specificity. L1-EGFP transgenic mice deficient for MeCP2 also showed increased L1 retrotransposition compared to wild-type animals, with the strongest effects found in the cerebellum, striatum and hippocampus. Muotri et al. also found, using the L1 qPCR assay, a marked increase in L1 ORF2 copy number but not the L1 5′UTR, perhaps indicating that new L1 retrotransposition events were characterised by substantial 5′ truncations. NPCs produced from induced pluripotent stem cells (iPSCs) derived from RTT patient fibroblasts supported a higher (twofold) retrotransposition rate of the L1-EGFP reporter compared to unaffected controls. Altogether, this seminal work from Muotri et al. showed conclusively that L1 activity was higher in RTT patients than in controls. L1 insertion site mapping with single-cell genomics (Upton et al. 2015) would be a valuable future strategy to demonstrate differential L1 activity in RTT. It also should be considered that a wild-type phenotype can be rescued in a conditional mutant mouse RTT model (Guy et al. 2007), raising an important question as to whether elevated L1 activity impacts RTT phenotype.

AT patients suffer from a loss-of-function mutation in the ATM gene, which encodes a 350 kDa serine/threonine kinase (Taylor et al. 2015). In its most severe and typical form, AT starts to show symptoms between 1 and 2 years of age. ATM dysfunction leads to neuronal degeneration, immunodeficiency, chromosomal instability and a predisposition to cancer (Shiloh 2001). Under normal circumstances, ATM phosphorylates downstream factors such as CHK2, p53, BRCA1 and the MRN complex (MRE11, Rad50 and NBS1) in response to the presence of double-stranded DNA breaks, which activates the DNA damage checkpoint and cell cycle arrest, leading to the repair of damaged DNA or p53-mediated apoptosis. NPCs produced from hESCs and carrying ATM mutations present a two- to fourfold increase of L1-EGFP retrotransposition but do not show a significant difference in promoter activity or endogenous ORF1p levels (Coufal et al. 2011). Even though L1 retrotransposition may be toxic for cells (Wallace et al. 2008), no difference in survival rates, nor cell cycle or cell division pattern, is observed in ATM-mutated versus wild-type cells (Symer et al. 2002; Coufal et al. 2011; Haoudi et al. 2004). These findings led to speculation that ATM-deficient cells might have a survival advantage due to higher tolerance for L1-induced toxicity (Coufal et al. 2011). Further experiments are required to address whether this is the case, and why ATM mutations result in higher L1 retrotransposition. Notably, ATM mutation appears to lead to longer L1 insertions, possibly due to the role of ATM in cellular DNA repair, which may interfere with L1 retrotransposition in wild-type cells. 
Although the L1 CNV assay revealed an increase in L1 content in post-mortem hippocampal neurons from AT patients compared to age/gender-matched healthy individuals, single-cell genomic analyses are again required to corroborate this result, and identify if the endogenous L1 insertions generated are longer, or follow a different genome-wide integration pattern compared to wild-type cells.

Although RTT and AT present the clearest evidence of unusual retrotransposon activity in the brain, TEs have also been observed to undergo derepression in neurodegenerative disorders commonly associated with aging (Bollati et al. 2011). For example, TAR DNA-binding protein 43 (TDP-43) dysfunction, a hallmark for a number of neurodegenerative disorders such as amyotrophic lateral sclerosis (ALS), frontotemporal lobar degeneration (FTLD) and Alzheimer’s disease, is found to correspond to higher transcription levels of LINEs, SINEs and LTRs, the three major classes of TEs, in mice (Li et al. 2012). Brain samples with TDP-43 dysfunction, from transgenic mouse models as well as human FTLD patients, show a reduced association between this protein and a range of TE-derived transcripts. These particular TE transcripts are the same transcripts identified as being upregulated in response to TDP-43 dysfunction, indicating that this multifunctional RNA-binding protein might play a role in the regulation of TEs in somatic tissue. Furthermore, aging itself has been found to lead to activation of transposable elements (Li et al. 2013; Van Meter et al. 2014), suggesting that the increase of TE transcripts and potential copy number in the genome may lie at the base of the development of these neurodegenerative disorders.

5.2 Do Retrotransposons Link Environmental and Genetic Risk Factors in Psychiatric Disorders?

In RTT and AT, where the mutated gene responsible for pathophysiology also influences L1 activity, the detected increase in L1 expression and, potentially, L1 copy number is most likely a direct effect of the main driver mutation in these diseases. However, abnormal retrotransposon activity in the brain has also been observed for several psychiatric disorders where the interaction between genetic and environmental risk factors, or solely environmental factors, is considered central to disease aetiology. For example, use of methamphetamines and cocaine, a major risk factor for the development of psychiatric disorders (Akindipe et al. 2014; Zhornitsky et al. 2015), with the potential to turn into substance-use disorder (SUD), is suggested to lead to L1 activation (Okudaira et al. 2014; Maze et al. 2011). This effect is observed using an engineered L1 reporter in vitro for neuronal, but not non-neuronal, cell lines (Okudaira et al. 2014). Furthermore, the abolishment of unusual L1 mobilisation after knockdown of the cAMP response element-binding protein (CREB) suggests that this neuronal response to methamphetamine and cocaine is CREB dependent. Investigation of the mechanism underlying this phenomenon identified an enhanced recruitment of L1 ORF1p to chromatin-rich fractions, without increasing the total expression of L1 mRNA or ORF1p. This startling result suggests that methamphetamine and cocaine use may elevate L1 mobilisation by recruiting L1 ORF1p to the chromatin in a CREB-dependent manner, facilitating L1 integration into the genome. This in turn could induce changes in chromatin structure and gene expression, with the potential to lead to psychiatric disorders.

Post-traumatic stress disorder (PTSD), a disorder closely related to SUD (Jacobsen et al. 2001), has been found to lead to differential epigenetic regulation of L1 as well as Alu copies in the genome (Rusiecki et al. 2012). PTSD and SUD share numerous cellular circuits and signalling pathways in their pathophysiology, due to similar involvement of the learning and memory system (reviewed in Tipps et al. 2014). PTSD is an anxiety disorder characterised by persistent re-experiences of a past traumatic event or events, often accompanied by memory and concentration problems, anxiety, panic attacks, insomnia, substance abuse and/or depressive symptoms (American Psychiatric Association 2013). PTSD patients present gene expression signatures not found in controls (Segman et al. 2005). Multiple studies have shown that epigenetic alterations play an important role in facilitating changes in gene expression associated with the formation and persistence of memory (Kwapis and Wood 2014; Zovkic and Sweatt 2013). Changes in methylation levels of L1 and Alu in soldiers pre- and post-deployment, of which a subset developed PTSD after their return, have been detected, potentially reflecting resilience or vulnerability factors to PTSD development (Rusiecki et al. 2012). Increased methylation of L1 was detected in the control group post-deployment compared to pre-deployment which, to speculate, might be a result of the body’s response to stress-mediated L1 activation (Li and Schmid 2001). By contrast, a pre-existing abundance of Alu methylation in cases compared to controls might reflect a potential vulnerability to stress or a protective effect of hypomethylation. Specific patterns of Alu expression have been previously linked to physiological stress responses, with perhaps functional consequences (Berger et al. 2014; Pandey et al. 2011; Li and Schmid 2001). Hypermethylation may prevent Alu from fulfilling a protective function, although the mechanism involved is unknown at this stage.

More recently, Bundo et al. investigated L1 CNV in schizophrenia (SCZ), major depression (MD) and bipolar disorder (BD), detecting increased L1 copy number in the prefrontal cortex (PFC) of patients suffering from SCZ compared to healthy controls (Bundo et al. 2014). SCZ is a multifactorial disorder with a typical onset between late puberty and early adulthood and characterised by a chronic and dynamic progression, with genes and environment playing important aetiological roles (Brown 2011). Diagnosis of SCZ is based on a collection of positive, negative and cognitive symptoms, persisting over a period of time (American Psychiatric Association 2013). The PFC is considered to be involved in SCZ symptomology and shows differential gene expression when patients are compared to controls, making the finding of Bundo et al. particularly interesting (Kimoto et al. 2014; Joshi et al. 2014; Farzan et al. 2010; Guillozet-Bongaarts et al. 2014). Repeating the L1 CNV analysis using solely neuronal cells yielded a more prominent difference, suggesting that the phenomenon is neuron specific (Bundo et al. 2014). In order to investigate the contribution of genetic factors, Bundo et al. assessed L1 CNV in neurons derived from iPSCs of patients suffering from a rare variant of schizophrenia caused by a 22q11 deletion, one of the strongest genetic risk factors. This resulted in the detection of a consistent increase in L1 copy number in the neuronal cells of patients. Furthermore, the influence of environmental risk factors was explored by determining L1 CNV in the PFC of two established SCZ animal models and, consistently, higher L1 copy number was detected in both models. Several environmental risk factors for the development of schizophrenia, such as metal exposure and drug use (Modabbernia et al. 2016; Akindipe et al. 2014), were discussed above as also influencing L1 activity, making it plausible that L1 would be involved in the development of this disorder.

Although these studies of L1 activity in psychiatric disorders are correlative, they do suggest that L1 mobilisation may be more than a secondary effect of abnormal neurobiology. Particularly impressive were the experiments by Bundo et al. showing that L1 content is increased in SCZ patient samples, iPSC-derived neurons and SCZ animal models. Consistent L1 upregulation in SCZ across very diverse experimental systems indicates a close association between disease phenotype and ectopic L1 activity, though it remains unknown whether L1 plays an active role in the manifestation of SCZ symptoms or is merely a passenger. Given that L1 can influence genome stability, as well as gene transcription, and is responsive to environmental cues, it is plausible that subtle genetic differences arise in genes related to SCZ symptomology. Alternatively, inability to control L1 activity is at the least emblematic of neuronal genome vulnerability and instability. A great deal more research is required in this area before any substantive conclusions can be drawn regarding the functional role of L1 mobilisation in SCZ and other psychiatric disorders.

6 Conclusion and Future Directions

Somatic L1 retrotransposition is now well established to occur in the neuronal lineage. The field also has a reasonable idea of how this process is regulated, by MeCP2 and other factors. However, we lack even a basic understanding of how L1 mobilisation in the brain impacts normal neurobiology, let alone neuronal phenotype in psychiatric, neurodevelopmental or neurodegenerative disorders. As a result, the significance of L1 retrotransposition to brain function is still largely unclear. To move forward in this area, we require improved resolution of the precise timing and cell specificity of retrotransposition during embryonic and adult neurogenesis as well as, potentially, in mature neurons. These parameters are prerequisites to define the contribution of L1 mobilisation to neuronal genome diversity. Moreover, despite advances in single-cell genomics, it is currently not possible to assay the genome and transcriptome of the same individual neuron, precluding detection of gene expression changes associated with somatic L1 insertions. One alternative approach in this area would be to use newly developed genome editing tools (e.g. CRISPR-Cas9) (Wright et al. 2016) to artificially introduce L1 insertions found in patient samples into homogeneous neuronal cultures in vitro, or into transgenic animal models. This could facilitate a more comprehensive analysis of how individual L1 insertions alter normal neuronal physiology and, potentially, behaviour. Moreover, although L1 deregulation has been found in several neurological disorders, the mechanisms through which L1 retrotransposition could impact disease symptomology remain largely unexplored. Therefore, the role of L1-derived genomic mosaicism in neurobiology remains unclear, despite its obvious appeal as a foundation for complex brain functions (e.g. memory formation), and as an aetiological factor in the dysregulation of those functions.