1 Retrotransposons in Mammalian Genomes

Retrotransposons are a class of mobile genetic elements that make up around 40 % of the sequenced mammalian genome (Chinwalla et al. 2002; Lander et al. 2001). Retrotransposons contribute to genomic instability in mammalian genomes by providing interspersed repeats of homologous sequences that can act as substrates for recombination causing deletions, duplications and structural rearrangements in the genome (Romanish et al. 2010). Retrotransposons are thought to be the only active class of mobile genetic element in most mammalian genomes, and can also cause genome instability through jumping to new locations in the genome. These de novo retrotransposon insertions have been reported as the causal mutation in various human genetic diseases (Crichton et al. 2014; Hancks and Kazazian 2012). The copy-and-paste mechanism that retrotransposons use to jump to new locations in the genome involves reverse-transcription of retrotransposon RNA, and integration of the resulting cDNA into new locations in the genome. There are typically a few hundred different types of retrotransposon annotated in each mammalian genome, with each type of retrotransposon being present in up to 10,000 copies. However, the types of retrotransposon, their copy numbers and their genomic locations vary significantly between species.

Mammalian retrotransposons are classified into LINE (long interspersed nuclear elements), SINE (short interspersed nuclear elements) and LTR (long terminal repeat) retrotransposon classes (Chinwalla et al. 2002; Lander et al. 2001). Each class of retrotransposons can be further subdivided into families, and each family into individual types. The LINE class of retrotransposons in mammals is primarily represented by the LINE-1 family. Only the human-specific L1HS-Ta subfamily of LINE-1 is still active in the human genome, whereas the L1MdA, L1MdTf and L1MdGf types of LINE-1 are all active in mice (Beck et al. 2011; Hancks and Kazazian 2012; Sookdeo et al. 2013). Full-length LINE-1s are typically 6–7 kb long and are transcribed from an internal promoter located in the 5′ untranslated region (UTR) of these elements (Beck et al. 2011; Hancks and Kazazian 2012; Swergold 1990). The LINE-1 promoter also generates an antisense transcript that can extend into the adjacent flanking cellular DNA (Cruickshanks et al. 2013; Li et al. 2014; Macia et al. 2011; Speek 2001). The LINE-1 5′ UTR varies significantly between species, and even between individual types of LINE-1 within species (Khan et al. 2006; Lee et al. 2010; Sookdeo et al. 2013). LINE-1 encodes two open reading frames in mice and other rodents, and three open reading frames in humans and other primates (Beck et al. 2011; Denli et al. 2015; Hancks and Kazazian 2012). The recently discovered primate-specific LINE-1 ORF0 protein is localised to nuclear bodies and enhances LINE-1 retrotransposition activity, although its mechanism of action is not currently understood (Denli et al. 2015). LINE-1 ORF1 encodes an RNA-binding protein that forms a particle with LINE-1 RNA and is required for LINE-1 retrotransposition activity (Khazina and Weichenrieder 2009; Martin and Branciforte 1993; Moran et al. 1996). The LINE-1-encoded ORF2 protein is also required for LINE-1 retrotransposition and possesses endonuclease and reverse transcriptase activities that nick the host genomic DNA and catalyse target-primed reverse transcription of LINE-1 RNA into DNA at the site of genomic integration (Cost et al. 2002; Feng et al. 1996; Mathias et al. 1991; Moran et al. 1996). LINE-1 elements display a strong cis-preference, in which LINE-1 ORF1p and ORF2p tend to associate with the same mRNA molecule from which they are translated (Esnault et al. 2000; Kulpa and Moran 2006; Wei et al. 2001).

The SINE class of retrotransposons includes a group of elements that are typically 100–300 bp long and are derived from small non-coding cellular RNAs including 7SL RNA, 5S rRNA and tRNAs (Kramerov and Vassetzky 2011). SINE retrotransposons are transcribed from internal RNA polymerase III promoters, and include the Alu family in humans and primates. The human genome contains hundreds of active Alu elements, particularly those belonging to AluY and AluS subfamilies (Bennett et al. 2008). Alu elements have been proposed to utilise a ‘stealth’ mode of amplification where elements mobilise at low frequencies for a long time, occasionally generating hyperactive copies that expand aggressively but rapidly become extinct (Han et al. 2005). Alu elements, and SINEs in general, are non-autonomous retrotransposons that rely on LINE-1-encoded proteins to catalyse their retrotransposition (Dewannieux et al. 2003; Hancks et al. 2011; Raiz et al. 2012). The crystal structure of the Alu ribonucleoprotein particle suggests that Alu elements hijack LINE-1 reverse transcriptase by binding to ribosomes that are stalled when LINE-1 ORF2p reverse transcribes its encoding mRNA (Ahl et al. 2015). Other active SINEs in the human genome include SVA elements which can be up to 2 kb in length and contain regions derived from SINE-R and Alu retrotransposons (Hancks and Kazazian 2010; Wang et al. 2005).

The LTR retrotransposon class, whose members are also known as endogenous retroviruses (ERVs), typically encodes the Gag, Pol, Pro and sometimes also Env proteins that are found in retrovirus genomes (Bannert and Kurth 2006). These retroviral proteins function in the production of retroviral capsid proteins, retroviral DNA synthesis and integration into the host genome, processing of retroviral proteins, and forming the surface envelope on the retroviral capsid, respectively. The 5′ LTR acts as a promoter for these elements. LTR retrotransposons typically reverse-transcribe their RNA in the cytoplasm of the host, then translocate the cDNA into the nucleus and use an LTR retrotransposon-encoded integrase to insert the DNA into the genome. Some LTR retrotransposons are autonomous and encode the proteins required to catalyse their own retrotransposition, whereas others use proteins encoded by other types of LTR retrotransposon to retrotranspose in trans (Dewannieux et al. 2004; Ribet et al. 2004). LTR retrotransposons actively mobilise in rodents, and de novo retrotransposition of LTR retrotransposons accounts for around 10 % of spontaneously occurring mutations in mice (Maksakova et al. 2006). However, the vast majority of LTR retrotransposons in human genomes are probably extinct, and mobilisation of LTR retrotransposons in humans is extremely limited (Wildschutte et al. 2016).

Despite the diversity in retrotransposon structure and life cycle, every one of these elements has been able to accumulate multiple copies in the genome during evolution because it is able to retrotranspose in the cells that can transmit those new genomic copies to the next generation. These crucial cells, which play a key role in the life cycle and biology of retrotransposons in mammals, belong to the germline.

2 The Mammalian Germline

In mammals, genetic information is transmitted from generation to generation by germ cells. Germ cells are conceptually distinct from the somatic cells that populate organs like the liver, brain, kidney, lungs and heart in that any genetic change that arises in germ cells can potentially be transmitted to subsequent generations whereas those that arise in somatic cells cannot (Fig. 1). This defining distinction between germ cells and somatic cells, as first proposed by the evolutionary biologist August Weismann (1834–1914) (Weismann 1889), means that it is events and activities that occur within germ cells that shape the landscape of mammalian genomes during evolution. Thus, while any individual retrotransposon might retrotranspose in any specific somatic tissue, all successful retrotransposons must retrotranspose in the germline.

Fig. 1

The germline cycle. A schematic diagram showing the mammalian germline cycle. Genetic information, variation and mutations are inherited in the zygote from the parental gametes. The zygote gives rise to an individual containing soma (white) and germline (grey) tissues. Genetic information, variation and mutation in germline tissues can be incorporated into the gametes and transmitted to the next generation, whereas that in the soma cannot.

Weismann’s distinction between germline and soma means that genetic information and mutations that arise in somatic cells cannot be transmitted to the germ cells and subsequent generations, but this barrier between germ cells and soma is unidirectional (Fig. 1). Weismann realised that the soma must originate from the germline during early development, and therefore genetic information and mutations that arise in the germline can be propagated into the soma. These early embryonic cells that can give rise to both the germline and all somatic tissues would correspond to totipotent and pluripotent cells in modern terminology (Nichols and Smith 2009), while the term ‘germ cell’ is now typically reserved for lineage-restricted cells that have the capacity to differentiate into the mature gametes, but that do not normally contribute to somatic tissues in that individual (McLaren 2003). Weismann’s germline encompasses totipotent cells, pluripotent cells and germ cells as they are all able to transmit genetic information and mutations to the next generation. Similarly, I will include the totipotent and pluripotent cells that are present in early mammalian development as part of the germline for the purposes of this review.

The major stages in mammalian germline development are outlined in Fig. 2. As many of these events are best characterised in mice, the timings and stages of germline development will be described for this species, although there are likely to be broad similarities with other mammalian species. Mouse development initiates when mature germ cells, that is, female eggs and male sperm, fuse during fertilisation to generate a single-celled zygote. This zygote is totipotent in that it has the potential to differentiate into all extra-embryonic and embryonic cell types in the conceptus, which includes the germ cells. As pre-implantation development proceeds to the blastocyst stage, some cells in the embryo differentiate into trophectoderm and primitive endoderm lineages that contribute only to extra-embryonic tissues. The remaining cells differentiate into pluripotent epiblast cells which retain the capacity to differentiate into all cell types in the embryo proper, including the germ cells (Magnúsdóttir and Surani 2014; Nichols and Smith 2009). After the blastocyst implants into the uterus, some pluripotent epiblast cells located close to the extra-embryonic ectoderm are induced by extracellular signals to differentiate into primordial germ cells. This germ cell specification event occurs around 6.25–7.25 days post coitum (dpc) in mouse embryos, and generates a founding population of around 40 primordial germ cells (Lawson and Hage 1994; Ohinata et al. 2005). The nascent primordial germ cells then embark on a phase of proliferation as they migrate through the embryo to reach the emerging gonads around 10.5 dpc. The germ cells are typically referred to as primordial germ cells during the early stages of their development until they reach the gonad, and primordial germ cell development at this point occurs similarly in male and female embryos. Once the primordial germ cells colonise the gonad, they continue to proliferate for a few more days, then initiate sex-specific differentiation into either male prospermatogonia or female oocytes (Kocer et al. 2009).

Fig. 2

Germline development in mice. A schematic diagram summarising the main stages of germline development in mice. Germ cells and their developmental precursors (totipotent and pluripotent cells) are coloured grey. For pre-implantation stages, a totipotent zygote and cleavage stage embryo, and a blastocyst containing pluripotent epiblast cells are shown. Post-implantation embryos before and after gastrulation are also shown, with pluripotent epiblast cells and primordial germ cells coloured grey. For foetal stages, primordial germ cells are shown within the gonads, and meiotic cells are indicated by a pair of homologous chromosomes in their nuclei. Pachytene stage of meiosis is depicted by the ladder-like synaptonemal complex between homologous chromosomes, which is absent in dictyate oocytes. Products of the first and second meiotic divisions are depicted by nuclei with a single replicated chromosome and an individual chromatid, respectively. Note that the second meiotic division in oocytes is typically only completed at fertilisation. Chromosomes are not shown in diploid mitotic cells for clarity, and the distinctive morphologies of fully grown oocytes and mature sperm are also indicated.

The germ cells’ decision to differentiate along a male or a female pathway depends on sex-determining cues present in the gonadal environment rather than the sex chromosome constitution of the germ cells themselves (Kocer et al. 2009). In mice, germ cells in a foetal ovary become committed to develop down a female pathway between 12.5 dpc and 13.5 dpc (Adams and McLaren 2002). These oocytes initiate meiosis around 13.5 dpc and progress through most of the first meiotic prophase during late foetal development, then arrest as dictyate oocytes a few days after birth. The dictyate oocytes will eventually be stimulated to grow and resume meiosis in response to hormonal cues in adult animals. Fully grown oocytes then arrest meiosis during the second meiotic division before being ovulated and potentially fertilised (MacLennan et al. 2015).

In contrast, germ cells in a foetal mouse testis become committed to male development between 11.5 dpc and 12.5 dpc (Adams and McLaren 2002). The resulting prospermatogonia, also termed gonocytes, enter a period of quiescence and differentiation during late foetal development, then resume mitotic proliferation a few days after birth (Rossitto et al. 2015). The prospermatogonia give rise to mitotic spermatogonia and to a pool of spermatogonial stem cells which will self-renew and differentiate into more mitotic spermatogonia, thereby maintaining spermatogenesis throughout adulthood (Yang and Oatley 2014). The spermatogonia undergo a number of mitotic divisions before initiating meiosis which, in contrast to oocytes, typically proceeds without interruption. During spermiogenesis, the post-meiotic spermatids undergo a series of morphological changes that include condensation of the chromatin, elongation of the nucleus, specialisation of the Golgi membranes into an acrosome, generation of a flagellum and elimination of residual cytoplasm (O’Donnell 2015). The sperm chromatin is delivered, along with a limited amount of cytoplasmic material, into the oocyte at fertilisation. The highly condensed and specialised sperm chromatin is predominantly associated with protamines rather than histones, and is reprogrammed with oocyte-derived histones shortly after fertilisation (Hogg and Western 2015).

3 Retrotransposon Expression in the Mammalian Germline

For a retrotransposon to accumulate new genomic integrations during evolution it needs to be active in the mammalian germline. A retrotransposon does not need to be expressed and active at all stages in the germline cycle, but it does need to be expressed and active at one point in the cycle at least. As outlined in the previous section, the germline cycle involves multiple distinct phases of development, and germline cells appear to use distinct transcription factor networks during these phases. Thus, it is rare to find germline-specific genes or transcription factors that are expressed throughout pre-implantation development, primordial germ cell development, foetal gametogenesis, oogenesis and spermatogenesis. One might expect individual retrotransposon expression profiles to behave similarly.

LINE-1 element transcripts, for example, are reported to be present in pre-implantation embryos, and in primordial germ cells in foetal gonads from 11.5 dpc onwards (Fadloun et al. 2013; Hayashi et al. 2008; Molaro et al. 2014; Seisenberger et al. 2012). Although LINE-1 transcripts are present in foetal germ cells from 11.5 dpc, LINE-1 ORF1 protein is not detected in these cells until 15.5 dpc (Trelogan and Martin 1995). In female germ cells, LINE-1 ORF1 protein is expressed during early meiotic prophase but does not appear to be as abundant during post-natal oocyte development (Malki et al. 2014; Trelogan and Martin 1995). In male germ cells, LINE-1 ORF1 protein levels decrease after birth and, somewhat analogously to females, increase transiently during early meiotic prophase, then decrease as spermatogenesis proceeds (Branciforte and Martin 1994). Thus, early meiotic prophase appears to be a point in the germline cycle that LINE-1 retrotransposons target for expression. Interestingly, different types of LINE-1 retrotransposon have distinct RNA expression profiles during mouse spermatogenesis (Zamudio et al. 2015), which presumably reflects differences in the transcription factors binding to the distinct 5′ UTRs that each type of LINE-1 possesses.

LTR retrotransposons have a rich diversity of expression patterns during the germline cycle. Multiple types of LTR retrotransposon are highly expressed in pre-implantation and early post-implantation embryos, and retroviral-like particles are abundant in the totipotent and pluripotent cells present in these stages, although each LTR retrotransposon has quite specific expression profiles within these developmental stages (Brûlet et al. 1985; Dupressoir and Heidmann 1996; Fadloun et al. 2013; Macfarlan et al. 2012; Peaston et al. 2004; Piko et al. 1984; Reichmann et al. 2012; Ribet et al. 2008; Yotsuyanagi and Szöllösi 1981). The complete germline expression patterns of many types of LTR retrotransposon have not been reported, and additional intricacies are likely to emerge from next generation sequencing of RNA isolated from germline cells. As the retrotransposon LTRs will contain binding sites for transcription factors that are expressed in the germline, understanding how retrotransposons are expressed at specific stages in the germline cycle may help decipher some aspects of the transcriptional regulatory networks operating in these cells and can potentially help identify developmentally distinct subpopulations in the germline cycle (Macfarlan et al. 2012).

4 Retrotransposon Activity in the Mammalian Germline

The rate of de novo retrotransposition in the germline is presumably subject to evolutionary constraint. Although retrotransposons and de novo retrotransposition provide a rich source of genetic material and genetic variation, the insertional mutations that arise from these events contribute to genome instability, and high de novo retrotransposition rates could prove to be deleterious for both the host and the retrotransposon. Rates of de novo retrotransposition in the germline have been estimated from sequencing genomic DNA, and from identification of disease and phenotype-causing mutations in mice and humans (Hancks and Kazazian 2012). In humans, de novo LINE-1 insertions are estimated to arise in 1 in every 100 births, with de novo Alu insertions estimated to occur five times more frequently. SVA elements have a somewhat lower de novo retrotransposition rate of approximately 1 in every 1000 births. LTR retrotransposons are not thought to be retrotranspositionally active in humans, but in mice de novo LTR retrotransposition events account for around 10–15 % of sequenced spontaneous mutant alleles (Maksakova et al. 2006). However, in general the stages of germline development during which retrotransposition can occur are not known.
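
As a rough illustration of how these per-birth estimates combine, the following minimal Python sketch adds the three human rates quoted above. It assumes, purely for illustration, that the estimates can be treated as independent expected numbers of insertions per birth and simply summed; the constant and function names are hypothetical, and the figures are only the approximate values given in the text.

```python
from fractions import Fraction

# Approximate per-birth rates of de novo insertions in humans, as quoted in the text.
LINE1_RATE = Fraction(1, 100)    # LINE-1: about 1 in every 100 births
ALU_RATE = 5 * LINE1_RATE        # Alu: about five times more frequent than LINE-1
SVA_RATE = Fraction(1, 1000)     # SVA: about 1 in every 1000 births


def combined_rate(*rates):
    """Sum per-birth rates (assumption: the classes are independent and additive)."""
    return sum(rates, Fraction(0))


def births_per_insertion(rate):
    """Average number of births per de novo insertion at the given rate."""
    return 1 / rate


total = combined_rate(LINE1_RATE, ALU_RATE, SVA_RATE)
print(total)                        # 61/1000
print(births_per_insertion(total))  # 1000/61, i.e. roughly 1 in every 16 births
```

Under these simplifying assumptions, Alu insertions alone would arise in about 1 in every 20 births, and the three classes together in roughly 1 in every 16 births.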

Retrotransposition at different stages of germline development can have distinct consequences for the host. Retrotransposition in the one-cell zygote immediately after fertilisation and before the first round of DNA replication would theoretically result in the de novo retrotransposition event being present in a heterozygous state in all germ cells and somatic cells in that individual. However, retrotransposition at later times during pre-implantation development or during early post-implantation development would likely result in a mosaic conceptus containing some cells that have new heterozygous retrotransposition events, and some that do not. The characterisation of a mutagenic LINE-1 insertion in humans suggests that LINE-1 retrotransposition can generate extensive somatic mosaicism consistent with retrotransposition occurring early in development (van den Hurk et al. 2007). Importantly, if a de novo retrotransposition event has phenotypic consequences, the genetically distinct cells in the conceptus could potentially compete with each other, and select for or against cells carrying the de novo retrotransposition event. Lastly, de novo retrotransposition within the germ cells themselves once they are specified will generate mosaicism and potential competition and selection in the germ cell population, but these events will not be present in the soma.
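
The relationship between the timing of an insertion and the resulting degree of mosaicism can be illustrated with a deliberately simplified model. The Python sketch below assumes that an insertion arises in a single cell, that all cells present at that stage contribute equally to the descendant population, and that no selection acts on carrier cells. None of these assumptions is established by the studies cited here, and the function name and parameters are hypothetical.

```python
from fractions import Fraction


def carrier_fraction(cells_at_insertion, after_replication=False):
    """Expected fraction of descendant cells carrying a new insertion.

    cells_at_insertion: number of cells in the relevant population when the
        insertion arises in one of them (1 for the zygote).
    after_replication: if True, the insertion lands on one chromatid after
        DNA replication, so only one of the two daughter cells inherits it.

    Assumption: every cell contributes equally to the later population and
    carrier cells are neither selected for nor against.
    """
    if cells_at_insertion < 1:
        raise ValueError("cells_at_insertion must be at least 1")
    lineages = cells_at_insertion * (2 if after_replication else 1)
    return Fraction(1, lineages)


print(carrier_fraction(1))                          # 1   (zygote, before replication)
print(carrier_fraction(1, after_replication=True))  # 1/2 (zygote, after replication)
print(carrier_fraction(8))                          # 1/8 (one cell of an 8-cell embryo)
print(carrier_fraction(40))                         # 1/40
```

Under these assumptions, an insertion arising in one of the roughly 40 founding primordial germ cells would be carried by about 1 in 40 germ cells and would be absent from the soma, whereas an insertion in the zygote before DNA replication would be present in every cell.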

The timing of de novo retrotransposition is probably best studied for LINE-1 elements. Experiments using transgenic mice and rats carrying human or mouse LINE-1 retrotransposition reporter cassettes suggest that de novo LINE-1 retrotransposition occurs infrequently in the germ cells themselves, and is more readily detectable in pre-implantation embryos (Kano et al. 2009). It is not clear if the inner cell mass, trophectoderm and primitive endoderm layers present in mouse blastocysts have differential susceptibilities to LINE-1 retrotransposition. Even though expression of these LINE-1 reporter transgenes was significantly higher in spermatogenic cells than somatic cells, the relative abundance of cells carrying de novo retrotransposition events in sperm was an order of magnitude lower than that in somatic tissues (Kano et al. 2009). Thus germ cells may possess host defence mechanisms that inhibit LINE-1 retrotransposition at a post-transcriptional level. Intriguingly, this study also indicates that pre-implantation embryos can inherit LINE-1 RNA from both the mature parental sperm and egg, and that this parentally transcribed RNA can retrotranspose in pre-implantation embryos (Kano et al. 2009). The finding that transgenic LINE-1 reporters can retrotranspose in early pre-implantation embryos when pluripotent cells are present in mice is consistent with the observation that transgenic LINE-1 reporter and endogenous LINE-1 and SINE retrotransposition occurs in human induced pluripotent stem cells and human embryonic stem (ES) cells in culture (Garcia-Perez et al. 2010; Klawitter et al. 2016; Wissing et al. 2011). However, data on retrotransposition rates of endogenous LINE-1 elements at different stages of the germline cycle are still lacking. Similarly, the rates of de novo retrotransposition of LTR retrotransposons at different stages of the mouse germline cycle also remain poorly understood. The application and adaptation of new methodologies to identify de novo retrotransposition events in genomic DNA (Baillie et al. 2011; Evrony et al. 2012; Ewing and Kazazian 2010; Upton et al. 2015) should help characterise the natural retrotransposition rates of individual retrotransposons during different stages of the germline cycle.

Although viable retrotransposons must be able to retrotranspose in the germline cycle, some retrotransposons may also be active in somatic tissues. In recent years, evidence has accumulated for de novo LINE-1 retrotransposition in human and mouse brain tissue, and de novo retrotransposition has been shown to be a mutational mechanism that can inactivate tumour suppressor genes in cancer (Baillie et al. 2011; Coufal et al. 2009; Evrony et al. 2012; Muotri et al. 2005; Shukla et al. 2013; Solyom et al. 2012; Upton et al. 2015). This aspect is detailed in chapters ‘Retrotransposon Contribution to Genomic Plasticity’, ‘The Mobilisation of Processed Transcripts in Germline and Somatic Tissues’, ‘Neuronal Genome Plasticity: Retrotransposons, Environment and Disease’ and ‘Activity of Retrotransposons in Stem Cells and Differentiated Cells’ of this book. Similarly, various types of LTR retrotransposon are also expressed in specific somatic tissues (Chuong et al. 2013; Faulkner et al. 2009; Gimenez et al. 2010; Seifarth et al. 2005). Some retrotransposon expression in somatic tissues may represent additional somatic roles for the transcription factors that individual retrotransposons are using to drive their expression in the germline cycle. For example, SOX2, which associates with the LINE-1 5′UTR in human cells (Coufal et al. 2009), is an integral component of the transcription factor network in pluripotent cells, but it also maintains the identity of neural progenitor cells (Avilion et al. 2003; Boyer et al. 2005; Graham et al. 2003). For other retrotransposons, expression in somatic tissues may represent the transcription of a small number of specific integration events that have occurred at genomic loci that promote their expression in somatic tissues. The liver-specific expression of a subset of IAP elements in mice appears to fall into this category (Puech et al. 1997), as does expression of the active L1HS-Ta LINE-1 subfamily in human cell lines (Philippe et al. 2016).

It is often not clear whether retrotransposon expression in somatic tissues has a functional role or if it is evolutionarily neutral. Retrotransposition itself can generate mosaicism in an individual, and it is possible that this provides some evolutionary advantage in some somatic tissues (Muotri et al. 2007). Some retrotransposons can influence expression of nearby host genes. For example, the LTRs of some types of LTR retrotransposon appear to act as enhancers in the placenta, and the activity of these elements in the placenta could be under positive selection (Chuong et al. 2013). In some cases, specific copies of retrotransposon-encoded proteins appear to have been co-opted into the host genome to provide key functions in somatic cells, with the repeated independent co-option of LTR retrotransposon proteins to promote cell–cell fusions in the placenta during mammalian evolution being a good example of this (Dupressoir et al. 2011; Mi et al. 2000) (see chapter ‘Roles of Endogenous Retrovirus-Encoded Syncytins in Human Placentation’). The domestication of an ancient LINE-like retrotransposon to generate telomerase, the enzyme that maintains telomeres at the ends of chromosomes in eukaryotes, suggests that retrotransposon-derived sequences can evolve to have functions in somatic tissues (Belfort et al. 2011; Curcio and Belfort 2007; Eickbush 1997). Thus, although viable retrotransposons must be expressed and active in the germline cycle, this requirement is not incompatible with potential roles for these elements in somatic tissues.

5 The Impact of Retrotransposons on the Mammalian Germline

Retrotransposons are able to impact on the germline as a source of trans-generational genomic instability that can cause insertional mutations due to jumping to new locations in the genome, and due to recombination between homologous retrotransposon loci causing genetic deletions, segmental duplications and other chromosomal rearrangements (Fig. 3). However, retrotransposons also influence the biology of germline cells in other ways (Fig. 3). Retrotransposons use germline transcription factors to drive their expression, therefore each retrotransposon locus provides clusters of binding sites for transcription factors that are active in the germline that can provide a useful source of DNA sequence modules for evolution (Bourque et al. 2008; Rebollo et al. 2012). For example, retrotransposon sequences, particularly those from the ERV1 family of LTR retrotransposons, account for around 15–20 % of the genomic locations occupied by either OCT4 or NANOG pluripotency-associated transcription factors in human ES cells, and can drive expression of nearby genes in human ES cells (Kunarso et al. 2010). The ERVL family of LTR retrotransposons similarly appears to strongly influence the oocyte transcriptome and a subset of oocyte transcripts are chimaeras between ERVL retrotransposons and host genes (Peaston et al. 2004). One gene that has an oocyte-specific isoform driven by an ERVL retrotransposon promoter in oocytes is DICER1, which encodes an RNA endonuclease that is involved in the production of endogenous short interfering RNAs (Flemr et al. 2013). This oocyte-specific isoform of DICER1 influences the abundance of retrotransposon transcripts in mouse oocytes, presumably through endogenous siRNAs targeting retrotransposon transcripts (Flemr et al. 2013). Thus, retrotransposon sequences in the genome are even promoting expression of retrotransposon defence mechanisms in the germline. Chimaeric oocyte transcripts originating from ERVL-derived promoters can encode retrotransposon-host gene fusion proteins that are translated in the oocytes (Peaston et al. 2004). Similarly, the recently discovered primate LINE-1 ORF0 transcript can run through to adjacent host genes and generate fusions between the ORF0 polypeptide and host-encoded proteins in human ES cells (Denli et al. 2015). Non-coding RNAs derived from LTR retrotransposons are also reported to be essential to maintain human ES cells in a pluripotent state, possibly through facilitating the binding of some transcriptional co-activator proteins to chromatin (Lu et al. 2014; Wang et al. 2014). Thus retrotransposons appear to be a rich source of genetic material that is contributing to the evolution of the transcriptome and the proteome of germline cells.

Fig. 3

Impact of retrotransposons on germline biology. A schematic diagram summarising some of the ways that retrotransposons impact on mammalian germ cells. A region of a chromosome containing a gene (grey) flanked by retrotransposons (RPNs, black) is indicated. Transcription is indicated by corner arrows, and active enhancers by asterisks. RNA transcripts are indicated by wavy lines. For details see main text.

Mutations in retrotransposon defence mechanisms can result in high levels of retrotransposon expression during the germline cycle (Crichton et al. 2014; Ollinger et al. 2010; Zamudio and Bourc’his 2010). Mutations in the PIWI-piRNA pathway (Fu and Wang 2014), or in the accessory de novo methyltransferase DNMT3L, can result in high levels of LINE-1 expression in male germ cells, particularly during meiotic prophase (De Fazio et al. 2011; Di Giacomo et al. 2013; Soper et al. 2008; Zamudio et al. 2015). In general, these mutants arrest spermatogenesis during meiosis, typically around the pachytene stage (Crichton et al. 2014; Ollinger et al. 2010; Zamudio and Bourc’his 2010). Mice that have mutations in MAEL, a component of the PIWI-piRNA pathway, are reported to arrest during meiosis with high levels of meiosis-independent DNA damage that could potentially be caused by high levels of de novo retrotransposition of the de-repressed retrotransposons (Soper et al. 2008). In contrast, DNMT3L-/- mice are reported to have no detectable increase in meiosis-independent DNA damage, and have been proposed to arrest during meiosis because histone modifications at transcriptionally active LINE-1 retrotransposon loci recruit the meiotic recombination machinery and disrupt the pairing of homologous chromosomes that characterises meiotic prophase (Zamudio et al. 2015). However, mice that de-repress LINE-1 post-transcriptionally also exhibit chromosome asynapsis and pachytene arrest (Di Giacomo et al. 2013), suggesting that there may be additional aspects to this phenotype that are not currently understood. Importantly, although de-repression of retrotransposons has been reported in various germline genome defence mutants, it remains to be determined whether de novo retrotransposition events are accumulating in the mutant germ cells.

De-repression of retrotransposons in oocytes is also associated with defects in progression through meiotic prophase. Mutations in LSH, a gene implicated in the establishment or maintenance of DNA methylation at retrotransposons and some single copy genes, result in loss of methylation at IAP retrotransposon sequences in oocytes, a failure of oocytes to progress through pachytene, and foetal oocyte death (De La Fuente et al. 2006). Mutating LSH in somatic cells also results in loss of DNA methylation and de-repression of retrotransposons (Dunican et al. 2013). This loss of DNA methylation appears to have indirect effects on other chromatin modifications in the genome, as polycomb repressive complexes re-localise to sites normally occupied by DNA methylation, sequestering them away from their normal targets (Dunican et al. 2013). A similar phenomenon is seen in somatic cells with mutations in the maintenance DNA methyltransferase DNMT1 (Reddington et al. 2013), and in hypomethylated embryonic stem cells (Lynch et al. 2012). It is not clear if re-localisation of polycomb repressive complexes also happens in LSH mutant oocytes or in PIWI-piRNA mutant spermatocytes, but DNA hypomethylation at abundant retrotransposon sequences does have the potential to cause significant effects on the genome-wide distribution of other histone modifications that could be contributing more to the mutant phenotypes than retrotransposon de-repression itself.

Mutations in MAEL result in de-repression of retrotransposons, meiotic abnormalities and foetal oocyte death (Malki et al. 2014). The MAEL -/- oocyte phenotype appears to be related to de-repression of LINE-1 retrotransposons, and differences in the level of LINE-1 expression between individual oocytes in wild-type mice have been proposed to influence foetal oocyte attrition and the number of oocytes present in the ovarian pool at birth (Malki et al. 2014). The rate of foetal oocyte attrition is accelerated in transgenic mice carrying an active LINE-1 transgene, and delayed by treating pregnant mice with an anti-retroviral drug, although these manipulations do not change the final number of oocytes in the ovarian pool (Malki et al. 2014). There are fundamental differences in the way that the oocyte pool influences fertility and menopause between humans and mice, and it will be of interest to determine if manipulating LINE-1 activity can influence the size of the oocyte pool in humans. Human oocytes do, however, contain the host factors to support LINE-1 retrotransposition, at least using engineered LINE-1 reporter constructs (Georgiou et al. 2009).

6 Genome Defence Mechanisms Operating in the Mammalian Germline

The mammalian germline possesses a number of defence mechanisms that suppress the potentially mutagenic activity of retrotransposons in these cells (Crichton et al. 2014; Friedli and Trono 2015; Zamudio and Bourc’his 2010). Histone modification appears to play an important role in repressing retrotransposons in mouse ES cells, with H3K9me3 and H3K27me3 chromatin marks frequently associating with silenced retrotransposons in these cells (Day et al. 2010). Canonical and alternative polycomb repressive complexes, which catalyse trimethylation of H3K27, are involved in repressing multiple families of LTR retrotransposons in mouse ES cells (Hisada et al. 2012; Leeb and Wutz 2007; Reichmann et al. 2012). SETDB1 and SUV39H1/SUV39H2, which trimethylate H3K9, are similarly implicated in repressing LINE-1 and multiple families of LTR retrotransposons in mouse ES cells (Bulut-Karslioglu et al. 2014; Karimi et al. 2011; Matsui et al. 2010; Reichmann et al. 2012). The lysine demethylase KDM1A also plays a role in repressing LINE-1 elements and ERVL LTR retrotransposons (Macfarlan et al. 2012), and histone deacetylases are implicated in repression of ERVK LTR retrotransposons in mouse ES cells (Reichmann et al. 2012), and in suppressing de novo but not pre-existing LINE-1 integrations in human embryonal carcinoma cells (Garcia-Perez et al. 2010). The variety of different mechanisms operating in ES cells may reflect the diversity of retrotransposons in the genome and the multiple strategies that these elements use to drive their transcription in the germline.

As retrotransposons hijack the transcriptional networks present in the host to drive their transcription, it is possible that some of the silencing mechanisms operating in ES cells reflect the mechanisms normally used to regulate the developmental timing of host genes whose expression is driven by the same transcription factors. For example, endogenous gene transcripts that have a role in the zygote might be downregulated as pre-implantation development proceeds such that they are not expressed in pluripotent epiblast cells or ES cells. Similarly, retrotransposons that are using this transcription factor network to drive their expression in zygotes would be expected to be downregulated in ES cells. Some of the mechanisms involved in repressing retrotransposons in ES cells will presumably reflect the normal mechanisms that mediate developmental changes in host gene transcription in these cells, with retrotransposon sequences being repressed ‘by association’ due to their co-regulation with host genes that are downregulated at this stage of development.

In contrast, the specific targeting of repressive chromatin marks to retrotransposons that is mediated by Krüppel-associated box zinc finger proteins (KRAB-ZFPs) appears to represent a more active and directed defence mechanism against these elements. Specific KRAB-ZFPs bind to specific retrotransposon sequences, recruiting the co-repressor KAP1 (also known as TRIM28) and H3K9me3 chromatin modifications to these sites (Friedman et al. 1996; Rowe et al. 2010; Wolf and Goff 2007, 2009). KAP1 interacts with the histone H3K9 methyltransferase SETDB1, which appears to be the major H3K9 histone methyltransferase involved in silencing LTR retrotransposons in mouse ES cells (Karimi et al. 2011; Matsui et al. 2010; Sripathy et al. 2006). Some LINE-1 elements are silenced by SUV39H1/SUV39H2 H3K9 histone methyltransferases rather than SETDB1 (Bulut-Karslioglu et al. 2014; Matsui et al. 2010), but it is not clear why different histone methyltransferases are being used to silence different retrotransposons. For KAP1 to target a retrotransposon for silencing, a KRAB-ZFP must evolve to bind to that retrotransposon sequence, and KRAB-ZFPs appear to be evolving rapidly for this purpose (Jacobs et al. 2014). Furthermore, retrotransposon loci that mutate or delete their KRAB-ZFP binding sites can escape from KAP1-dependent repression and evolve into new retrotransposon sub-types (Jacobs et al. 2014). Thus while some LINE-1 elements recruit and are repressed by KAP1 in mouse and human ES cells, the youngest types of LINE-1 are repressed by alternative mechanisms (Castro-Diaz et al. 2014).

DNA methylation plays an important role in repressing retrotransposons in germ cells and somatic cells (Bourc’his and Bestor 2004; Davis et al. 1989; De La Fuente et al. 2006; Dunican et al. 2013; Jackson-Grusby et al. 2001; Walsh et al. 1998), although its role in repressing retrotransposons in mouse ES cells may be more restricted than either KAP1 or SETDB1 (Karimi et al. 2011; Matsui et al. 2010; Rowe et al. 2010). The ability of mouse ES cells to induce compensatory histone modifications at some retrotransposons in response to DNA hypomethylation (Walter et al. 2016) may contribute to this. Pluripotent epiblast cells in mouse embryos undergo a wave of de novo DNA methylation during post-implantation development (Borgel et al. 2010) and it is possible that DNA methylation is recruited to retrotransposons that are already silenced by histone modifications at this stage to reinforce and stabilise the repressed state (Rowe et al. 2013). This mechanism bears some resemblance to the observation that the bulk of the de novo DNA methylation that occurs during tumourigenesis is located at genes that are already silenced by other mechanisms (Sproul et al. 2011). However, persistent KAP1/KRAB-ZFP activity is required to maintain repression of at least some retrotransposon loci in differentiating and adult somatic cells (Ecco et al. 2016), and the interplay between histone modifications and DNA methylation in establishing and maintaining silencing at different retrotransposon sequences in different cell types is likely to be complex and requires further study.

In the developing germ cells, DNA methylation is globally lost from the genome from 8.5 dpc through to 11.5 dpc as the primordial germ cells migrate to and colonise the genital ridge (Hajkova et al. 2002, 2008). This global loss of DNA methylation includes retrotransposon sequences, although IAP elements are amongst the most resistant to this phenomenon (Popp et al. 2010; Seisenberger et al. 2012). This loss of DNA methylation is part of a more extensive epigenetic reprogramming event that also involves genome-wide loss of various repressive histone modifications including H3K9me1 and H3K9me2 (Hajkova et al. 2008). It is not clear how retrotransposons are transcriptionally repressed during this stage of germ cell development, but the histone arginine methyltransferase PRMT5 localises to the nucleus, and symmetric dimethylation of histones H2AR3 and/or H4R3 is upregulated during the early part of this reprogramming event and is present at LINE-1 and IAP retrotransposon sequences (Ancelin et al. 2006; Kim et al. 2014). PRMT5 -/- primordial germ cells have undetectable levels of H2A/H4R3me2s in their nuclei, and de-repress LINE-1, IAP and other retrotransposons, suggesting that PRMT5-dependent symmetric dimethylation of histones H2AR3 and H4R3 may contribute directly to the repression of retrotransposons in hypomethylated primordial germ cells (Kim et al. 2014). There may of course be additional, presently uncharacterised, histone modifications associated with specific types of retrotransposon that are helping to repress transcription of these elements in hypomethylated germ cells.

At around 11.5 dpc, PRMT5 is re-localised from the primordial germ cell nucleus into the cytoplasm and levels of H2A/H4R3me2s concomitantly decrease (Ancelin et al. 2006). Additional mechanisms likely become important to limit retrotransposon activity in hypomethylated germ cells at this stage. Analysis of retrotransposon transcript abundance in the transcriptome of 13.5 dpc primordial germ cells from wild-type mice supports widespread low level transcriptional de-repression of retrotransposons in hypomethylated germ cells at this stage (Molaro et al. 2014). The DNA hypomethylation that occurs in the developing primordial germ cells is not restricted to retrotransposons and extends to most genomic features including endogenous gene promoters (Popp et al. 2010; Seisenberger et al. 2012). Thus this global DNA hypomethylation event can influence expression of host genes in the developing germline (Hackett et al. 2012). Interestingly, many genes that are primarily and causally regulated by DNA methylation in mice are germline-specific genes that are involved in suppressing retrotransposon activity (Hackett et al. 2012). DNA hypomethylation in the developing germline therefore appears to induce expression of a group of retrotransposon-suppressing genome defence genes to compensate for the increase in potential retrotransposon activity caused by loss of this repressive epigenetic mark.

One of the genome defence genes that is most sensitive to DNA hypomethylation is TEX19.1. TEX19.1 is required to repress MMERVK10C LTR retrotransposons in spermatocytes, and also to repress LINE-1 elements and LTR retrotransposons in the hypomethylated somatic cells present in the placenta (Öllinger et al. 2008; Reichmann et al. 2013). Most of the other germline genome defence genes induced in response to global DNA hypomethylation are components of the PIWI-piRNA pathway for repressing retrotransposons. The PIWI-piRNA pathway uses small RNAs encoded in the genome to target retrotransposons for suppression by epigenetic and post-transcriptional mechanisms in the germline (Fu and Wang 2014; Iwasaki et al. 2015). There are three PIWI proteins in mouse and rat genomes, but four PIWI proteins in many other mammals including humans. The three mouse PIWI proteins have distinct expression profiles during spermatogenesis. These PIWI proteins physically interact with small single-stranded PIWI-interacting RNAs (piRNAs) whose sequence is thought to target the PIWI proteins to retrotransposon sequences (Aravin et al. 2006, 2007, 2008; Carmell et al. 2007; Kuramochi-Miyagawa et al. 2001). Genomic piRNAs are derived from long RNA precursors that undergo a number of processing events to generate mature piRNAs. These primary piRNAs can facilitate processing of complementary precursor sequences, such as retrotransposon transcripts, into secondary piRNAs which in turn can promote processing of genomic precursors into primary piRNAs. This ping-pong cycle can amplify groups of piRNAs and is important for generating an effective piRNA response against LINE-1 elements in male mouse germ cells (De Fazio et al. 2011). The slicer RNA endonuclease activity of PIWI proteins plays an important role in processing piRNA precursors, and PIWI-piRNA-directed slicing of retrotransposon RNAs by PIWIL1 and PIWIL2 contributes to the PIWI-piRNA defence against retrotransposons (De Fazio et al. 2011; Di Giacomo et al. 2013; Reuter et al. 2011). PIWIL2 and PIWIL4 are required for male germ cells to establish de novo DNA methylation at retrotransposon sequences (Aravin et al. 2008; Kuramochi-Miyagawa et al. 2008). De novo methylation of retrotransposons occurs from 16.5 dpc onwards in foetal male germ cells, and it is possible that sequence information present in the PIWI-piRNA pathway is being used to direct the de novo DNA methylation machinery to these sequences. It is not clear if PIWI-piRNA complexes regulate DNA methylation directly, or indirectly through other chromatin modifications. H3K9 methylation has been implicated in PIWI-dependent retrotransposon silencing in Drosophila, and in silencing retrotransposons in post-natal male germ cells in mice (Di Giacomo et al. 2013, 2014; Huang et al. 2013; Pezic et al. 2014). Perhaps one of the reasons that genome-wide loss of DNA methylation occurs in the developing germ cells is to expose retrotransposon loci to the PIWI-piRNA pathway so that retrotransposons can be identified and epigenetic repression of these elements established de novo in preparation for transmission of the genome to the next generation. Removing and resetting epigenetic marks on these sequences may be preferable to propagating existing marks in order to prevent epimutations from being transmitted across multiple generations. There may also be some analogies to the mechanisms operating in Arabidopsis, where programmed loss of DNA methylation in the pollen’s vegetative nucleus results in de-repression of retrotransposons whose transcripts are processed into small RNAs, transported to the pollen’s germline nucleus, and used to direct epigenetic silencing of retrotransposons in the germline DNA (Calarco et al. 2012; Slotkin et al. 2009). The DNA methylation-sensitive coupling of expression of post-transcriptional genome defence mechanisms and components of the PIWI-piRNA pathway to transcriptional de-repression of retrotransposons in mouse germ cells may similarly allow mouse germ cells to generate the retrotransposon RNA transcripts needed to direct de novo identification and silencing of retrotransposon loci in the mammalian germline (Fig. 4).

Fig. 4

Potential role of genome defence genes in the male germline. A schematic diagram outlining the potential role of genome defence genes during epigenetic reprogramming in the male germline. Germ cells losing DNA methylation (filled grey circles to filled white circles) transcribe RNA (wavy lines) encoding genome defence genes and retrotransposons, which can be translated into protein (filled triangles and squares, respectively). Genome defence proteins, including components of the PIWI-piRNA pathway, that inhibit any post-transcriptional stages of the retrotransposon life cycle (grey) can limit mutations caused by retrotransposition, while allowing retrotransposon RNA transcripts to prime the PIWI-piRNA pathway (broken wavy lines). The PIWI-piRNA pathway slices retrotransposon RNA transcripts to generate piRNAs and can potentially use sequence information in the piRNAs to direct de novo DNA methylation onto retrotransposon sequences (indicated by question mark)

Components of the PIWI-piRNA system also appear to play roles in post-transcriptional suppression of retrotransposons in oocytes, and PIWIL2 suppresses LINE-1 mobility in human induced pluripotent stem cells (Lim et al. 2013; Malki et al. 2014; Marchetto et al. 2013; Watanabe et al. 2008). In mice, PIWI function in oocytes is primarily provided by PIWIL2, although additional members of the PIWI family may also contribute to PIWI function in oocytes in other mammalian species including humans (Roovers et al. 2015). Mutating PIWIL2 in mice does not have the severe consequences for fertility in females that it does in males (Kuramochi-Miyagawa et al. 2004; Lim et al. 2013), which may in part reflect differences in the way that de novo DNA methylation is regulated between spermatogenesis and oogenesis (Smallwood and Kelsey 2012). De novo DNA methylation occurs post-natally during oocyte growth in the female germline, and oocytes are therefore in a DNA hypomethylated state throughout their prolonged dictyate arrest and for much of their adult life. In the absence of PIWIL2, the abundance of some retrotransposon transcripts is elevated in oocytes, potentially reflecting PIWIL2-dependent post-transcriptional suppression of these elements during oogenesis (Lim et al. 2013; Watanabe et al. 2008). However, the PIWI-piRNA system is not the only mechanism that operates in oocytes to post-transcriptionally suppress retrotransposons, and DICER1-dependent endogenous siRNAs also make a significant contribution (Flemr et al. 2013; Stein et al. 2015; Tam et al. 2008; Watanabe et al. 2008). MARF1 may represent another mechanism regulating retrotransposons at a post-transcriptional level in these cells (Su et al. 2012a, b).

The distinct post-transcriptional suppression mechanisms operating in mouse oocytes appear to complement each other to target different types of retrotransposon (Watanabe et al. 2008). Pools of endogenous siRNA and piRNA present in fully grown oocytes can potentially be transmitted to the next generation to provide some protection against retrotransposons in pre-implantation embryos. However, retrotransposon-encoded transcripts, proteins, and ribonucleoprotein particles that are expressed during the oocyte’s prolonged dictyate arrest or post-natal growth can similarly be transmitted in the oocyte cytoplasm and can cause retrotransposition in the next generation (Kano et al. 2009). The presence of multiple overlapping and complementary genome defence mechanisms in oocytes may therefore provide some protection against retrotransposon mobilisation during oogenesis, and also help to limit maternal transmission of retrotransposon-derived ribonucleoprotein particles that can retrotranspose in the next generation.

7 Concluding Remarks

As described in this chapter, the mammalian germline and retrotransposons are intrinsically linked in multiple ways. All retrotransposons need to be expressed and active in the mammalian germline in order to accumulate in the genome during evolution, and while the germline appears to have evolved multiple defence mechanisms to limit the mutagenic activity of these elements, these mechanisms are helping to drive evolution of retrotransposons to escape suppression. This situation is analogous to the Red Queen hypothesis (van Valen 1973) as both retrotransposons and germline defence mechanisms need to continue to evolve simply to keep up with each other. However, in addition to this antagonistic relationship, retrotransposons appear to be participating in the transcriptional and proteomic networks of germline cells and providing regulatory modules for gene expression that are being repurposed by the germline to help it evolve. Additional intricacies will likely emerge in the coming years as the interplay between retrotransposons and the germline becomes better understood, but it would appear that retrotransposons can be viewed as having both beneficial and deleterious effects on their germline hosts.