Introduction

The exchange of endogenous genomic sequences for exogenous donor DNA molecules via homologous recombination (HR) is a process that has been known for decades. Pioneering studies by the late Oliver Smithies illustrated how homologous DNA molecules could recombine and correctly insert at defined mammalian chromosomal locations (Smithies et al. 1984, 1985). These findings were instrumental for envisaging and developing gene targeting methodologies in mouse embryonic stem (ES) cells, for which Smithies, together with Mario Capecchi and Martin Evans, was jointly awarded the Nobel Prize in Physiology or Medicine in 2007 (Mak 2007). In these first experiments, the recorded efficiency of targeted integration by HR in cultured somatic cell lines was relatively low (at best 1 in 1000), even in the presence of selection (Smithies et al. 1985). Nonetheless, the method was at least ten times more efficient than attempting to insert a genomic DNA fragment into its corresponding chromosomal location by HR through direct pronuclear microinjection of mouse fertilized eggs, as demonstrated by the perseverance of Ralph Brinster and collaborators (Brinster et al. 1989).

It quickly became obvious that opening the donor DNA molecule through a double-strand break (DSB) within the region of homology considerably enhanced the frequency of HR within the gene (Kucherlapati et al. 1984; Smithies et al. 1985). A similar observation was confirmed some years later in mouse ES cells when Allan Bradley reported increased targeting efficiencies using insertion versus replacement vectors (Hasty et al. 1991). Therefore, the gain in integration efficiency correlated with the promotion of DSBs in the donor DNA template, which exposes homologous sequences at the ends of these DNA molecules, a concept that became solidly established. This observation would play a fundamental role in subsequent developments leading to the first genome-editing strategies.

Indeed, the next step towards the establishment of the first genome-editing strategies was based on using I-SceI yeast meganuclease, a rare-cutter endonuclease with a recognition site of 18 base pairs responsible for intron homing in yeast mitochondria (Jacquier and Dujon 1985). Ten years after its discovery, it was shown that I-SceI meganuclease could be used to promote HR in mammalian chromosomes with a frequency two orders of magnitude higher than spontaneous HR (Choulika et al. 1995). The authors demonstrated the efficient repair of I-SceI-induced specific DSBs with donor DNA molecules carrying regions of homology flanking the endogenous I-SceI site in mouse cells, which was probably one of the first attempts at applying genome-editing strategies based on I-SceI meganuclease to produce a DSB for repair by endogenous cellular mechanisms. Subsequently, the introduction of I-SceI restriction sites at a mouse genomic locus by HR in ES cells, and the succeeding expression of this meganuclease in the presence of a replacement vector with homologous DNA sequences surrounding the introduced I-SceI site, resulted in a significant (100-fold) increase in targeting efficiency (Cohen-Tannoudji et al. 1998). Of course, the procedure was lengthy and cumbersome, requiring two consecutive HR steps and two selection markers. It was also not very practical since the I-SceI-based integration promotion was strictly dependent on the prior introduction of I-SceI recognition sites at the locus to be modified. Moreover, recent reports have suggested that HR by meganuclease-induced DSBs may be locus dependent in mammalian cells (Fenina et al. 2012). Introducing I-SceI meganuclease sites at either end of transgenes and co-microinjecting with the meganuclease also resulted in increased efficiency of transgenesis in medaka fish embryos (Thermes et al. 2002) and Xenopus (Pan et al. 2006), but not in mammalian embryos. Recently, this limitation has been overcome using a modified meganuclease with a nuclear localization signal (NLS), which efficiently mediates germline transgenesis in mouse and porcine embryos (Wang et al. 2014).
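
The "rare-cutter" property of an 18-bp recognition site, and the resulting need to introduce the site beforehand, can be illustrated with a rough back-of-the-envelope calculation. The sketch below is only an approximation under assumptions that are not taken from the studies cited above: equal base frequencies, independent positions, a non-palindromic site, and a round haploid genome size of 2.7 × 10^9 bp. The function name and the 6-bp comparison site are hypothetical.

```python
# Expected number of chance occurrences of a recognition site in random DNA.
# Assumptions (illustrative only): equal base frequencies, independent
# positions, a non-palindromic site, and a haploid genome of ~2.7e9 bp.

def expected_sites(site_length: int, genome_size: float) -> float:
    """Expected number of matches, counting both DNA strands."""
    positions = genome_size - site_length + 1
    return 2 * positions / 4 ** site_length


GENOME_BP = 2.7e9  # assumed round figure for a mammalian haploid genome

for name, length in [("hypothetical 6-bp site", 6), ("18-bp I-SceI site", 18)]:
    print(f"{name}: {expected_sites(length, GENOME_BP):.3g} expected sites")
```

Under these assumptions a 6-bp site is expected more than a million times, whereas an 18-bp site is expected less than once per genome, which is consistent with the requirement to knock in the I-SceI site before it can be used.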

The evident limitation of target sequences associated with I-SceI would ultimately be overcome by producing engineered meganucleases with customized recognition sites, once they became commercially available. One of these engineered meganucleases, derived from I-CreI, was used to induce targeted recombination at high frequency at the SCID gene in human cells (Grizot et al. 2009). More recently, another engineered I-CreI, cleaving a 22-bp sequence of the Rag1 gene, was applied to demonstrate effective targeting events in rat and mouse embryos (Menoret et al. 2013).

The fact that meganucleases were the first nucleases to be used for genome-editing purposes, well before other more celebrated and recent genome-editing nucleases (zinc-finger nucleases, ZFN; transcription activator-like effector nucleases, TALEN; CRISPR-associated proteins, Cas), is often not sufficiently acknowledged; hence these introductory words on the short but intense history of genome editing in mammals.

The mechanism by which the different types of nucleases enhance gene disruption or HR, in the presence of suitable donor DNA molecules, is the same; namely, they all efficiently promote DSBs at specific genomic locations. Subsequently, the endogenous cellular repair mechanisms, in the absence (non-homologous end joining, NHEJ) or presence (homology-driven repair, HDR) of adequate donor DNA molecules, with homologous sequences surrounding the DSB, will seal this breach in the genome. Depending on the repair route taken by the cell, the original DSB can result in gene disruption events, which are associated with the insertion or deletion of nucleotides (INDELs) through NHEJ, or in gene-editing events, which are associated with HDR (Fig. 1). Indeed, the quantifiable benefits of these nucleases regarding the increase in the frequency of HR events they can promote are estimated to be at least 1000-fold (Bibikova et al. 2001), as compared to previous attempts without nucleases (Smithies et al. 1985; Brinster et al. 1989).

Fig. 1

Basic scheme illustrating the process of genome editing mediated by nucleases. This scheme is valid for all known types of genome editors. The associated nucleases trigger DNA cleavage at specific genomic locations and this generates a double-strand break (DSB) that is eventually repaired through one of the two possible routes: NHEJ or HDR, as indicated. The PAM sequence is indicated for the most common CRISPR-Cas9 system used for genome-editing purposes. DSB double-strand break, NHEJ non-homologous end joining, HDR homology-driven repair, PAM protospacer adjacent motif, INDELs insertions and deletions, HR homologous recombination

Zinc-finger nucleases (ZFN)

Even though yeast meganucleases were the first to be utilized to edit mammalian genomes (Choulika et al. 1995; Cohen-Tannoudji et al. 1998), a fundamental change in methodology occurred in 2009 that would transform how transgenic animals are produced. In a pioneering study, Geurts, Jacob, Buelow, and colleagues used zinc-finger nucleases (ZFN) to produce the world's first knockout rats (Geurts et al. 2009). The methodological approach was based on the use of a DNA endonuclease domain from the bacterial restriction enzyme FokI, engineered with zinc-finger domains with known DNA-binding capacity, which was used to target and cleave a specific genome location. ZFN had actually been devised some years earlier, when the first chimeric proteins resulting from fusing the FokI endonuclease domain with engineered DNA-binding zinc-finger domains were shown to efficiently cleave and promote HR events at specific genomic sequences in Xenopus embryos (Bibikova et al. 2001). The technology was improved a few years later, and the engineered ZFN were shown to drive the correction of the human SCID mutation in cells by HR with an extrachromosomal DNA donor at high efficiencies (more than 18% of cells) without selection (Urnov et al. 2005). Generating the expected DSB required dimerization of the FokI endonuclease domain, indicating that the combined effect of two ZFN was required for efficient cleavage of both DNA strands (Mani et al. 2005). Soon, approaches were developed to design ZFN for the editing of mammalian genomes (Porteus 2008).

ZFN were quickly converted into an efficient gene-editing tool for mammalian transgenesis. As stated earlier, the first of these reports described the generation of knockout rats (Geurts et al. 2009), a mammalian species in which the classical gene targeting approach was not possible because of the lack of equivalent rat ES cells. In fact, rat ES cells had been isolated only a few months before the publication of the knockout rat study (Buehr et al. 2008). The use of ZFN for generating transgenic mammals rapidly disseminated among numerous laboratories (Remy et al. 2010) and soon ZFN were shown to drive successful genome-editing experiments in mice (Carbery et al. 2010; Meyer et al. 2010), cattle (Yu et al. 2011; Liu et al. 2013; Wei et al. 2015), and pigs (Hauschild et al. 2011; Carlson et al. 2012; Kwon et al. 2013; Qian et al. 2015).

The biggest impact of ZFN was probably in livestock species, where the absence of bona fide species-specific ES cells had traditionally limited genome alterations that required HR events (Petersen and Niemann 2015). Good examples of ZFN applications in livestock include genome-edited cattle with increased resistance to mastitis (Liu et al. 2014a, b) and genome-edited domestic pigs carrying a warthog-derived variant of the RELA gene with the aim of transmitting the associated resilience to African swine fever (Lillico et al. 2016).

Gene therapy applications with ZFN were also explored and analyzed in tissue culture (e.g., Overlack et al. 2012), including the delivery of expression constructs for these chimeric nucleases through adeno-associated virus (AAV) particles (Ellis et al. 2013). In parallel, however, some unwanted consequences were discovered. For example, it was observed that the sequence specificity of ZFN is not strict and similar genomic sequences (known as off-targets) can also be targeted and eventually repaired with the donor DNA molecule, highlighting one of the current limitations of genome-editing nucleases (Radecke et al. 2010). Nevertheless, some pre-clinical studies using ZFN in human cells have progressed into clinical trials, as illustrated by the inactivation of the HIV co-receptor CCR5, approved by the US regulatory authorities (Maier et al. 2013; Tebas et al. 2014).

As a final note for this section, it became apparent that the ZFN methodology was not easily affordable or accessible to standard molecular biology laboratories, a fact that may have limited the success and impact of ZFN in mammalian transgenesis. Indeed, although some open platforms were established for producing ZFN (Hermann et al. 2012), most were produced commercially by the company that owned the patent for the technology (Swarthout et al. 2011).

Transcription activator-like effector nucleases (TALEN)

In 2011, a new type of genome-editing nuclease, known as transcription activator-like effector nuclease (TALEN), was reported in the field, representing a potential alternative to ZFN. As with ZFN, TALEN were first used to efficiently produce knockout rats (Tesson et al. 2011). Whereas established and robust gene targeting in mice has been standard practice since 1987, precise genome editing in the rat was not possible for decades, likely explaining why these new technologies were first used in rats and eventually transferred to mice. Indeed, soon thereafter, the use of TALEN was extended to genome editing in mice, where numerous novel knockout models were quickly created (Panda et al. 2013; Sung et al. 2013; Wang et al. 2013a; Wefers et al. 2013). TALEN were also applied to the production of genome-edited livestock species (Carlson et al. 2012), including pigs (Xin et al. 2013), goats (Cui et al. 2015), sheep and cattle (Proudfoot et al. 2015; Wei et al. 2015), and non-human primates (Liu et al. 2014a, b). Again, similar to what was observed for ZFN, TALEN were shown to be robust and reliable tools for several biotechnological and biomedical applications in livestock and, in particular, an efficient strategy to generate large animal models of human disease (Whitelaw et al. 2016). Agricultural applications of TALEN have included the production of genome-edited hornless cattle, matching an existing natural mutant allele (Tan et al. 2013), and the generation of genome-edited cattle with an increased resistance to tuberculosis (Wu et al. 2015).

While TALEN share similarities with ZFN, they also present some attractive differences. Both ZFN and TALEN are chimeric proteins generated by fusing a different DNA-binding domain with the same endonuclease domain, from the restriction enzyme FokI. They also operate as dimers, hence one pair of ZFN or TALEN must always be considered to target a given genomic location. This is normally achieved through web-based algorithms that help researchers find the best target DNA sequences in both strands within the target locus (Periwal 2016). However, whereas ZFN are entirely engineered in the laboratory, with a defined DNA-binding specificity, according to an inferred code that links groups of three amino acids with three nucleotides, TALEN derive from TALE proteins, which exist in nature and are used by some plant-infecting pathogens to hijack and transform the cellular machinery to support the pathogen’s life cycle (Mussolino and Cathomen 2012). Through the systematic investigation of numerous TALE proteins, it was possible to decipher a code linking a precise pair of amino acids found inside each repeat of the protein with its capacity to bind to one specific nucleotide (Bochtler 2012). Unlike ZFN, which remained proprietary to a company and had to be ordered through it, engineered TALEN were easy to generate locally. Indeed, any molecular biology laboratory could easily purchase a large collection of intermediate reagents, made available as plasmids, which would be used to generate customized TALEN using a convenient step-by-step Golden Gate cloning protocol based on an innovative use of type IIS restriction enzymes for progressively assembling DNA fragments (Cermak et al. 2015; Sakuma and Yamamoto 2016). In theory, the Golden Gate assembly kit and protocol offer the possibility to design and easily produce any TALEN according to the target DNA sequence. In reality, however, this process can be long and cumbersome due to the numerous plasmids that are required for producing one of these engineered nucleases. In addition, whether the final TALEN will eventually target and cleave the expected DNA sequence unfortunately remains unpredictable, and can only be resolved by validating each newly produced TALEN in vitro and in vivo (Seruggia and Montoliu 2014).
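
The one-repeat-per-nucleotide logic of the TALE code can be sketched in a few lines. The mapping used here (NI for A, HD for C, NN for G, NG for T) reflects commonly reported associations and is a simplification that is not taken from this text; for example, NN is also reported to bind A. The function name and the example sequence are hypothetical, and real TALEN design involves further constraints (pairing of two TALEN, spacer length, experimental validation) that this sketch ignores.

```python
from typing import List

# Commonly reported amino acid pair -> nucleotide associations (simplified).
# Assumption for illustration: one preferred pair per nucleotide.
PAIR_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def tale_repeat_array(target: str) -> List[str]:
    """Return the amino acid pair of each TALE repeat for a DNA target."""
    target = target.upper()
    unknown = set(target) - set(PAIR_FOR_BASE)
    if unknown:
        raise ValueError(f"Unsupported characters in target: {sorted(unknown)}")
    # One repeat (and therefore one amino acid pair) per target nucleotide.
    return [PAIR_FOR_BASE[base] for base in target]


# Hypothetical 6-nt target, far shorter than a real TALEN binding site.
print(" ".join(tale_repeat_array("TGACCT")))
```

Because each nucleotide needs its own repeat, a binding site of realistic length requires a correspondingly long array of repeats, which is one way to see why the assembly involves so many intermediate plasmids.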

TALEN have also been instrumental for gene therapy applications, where a number of pre-clinical strategies, including human induced pluripotent stem cells (iPSC), have been investigated successfully (e.g., Osborn et al. 2013; Ramalingam et al. 2014; Biffi 2015; Garate et al. 2015).

In the mouse genetics field, TALEN have proven their great utility as an efficient genome-editing tool in a study that aimed to correct a mutation affecting retinal function that was found associated with C57BL/6N mice, one of the most common mouse inbred strains currently used by the International Mouse Phenotyping Consortium (IMPC; http://www.mousephenotype.org). This unexpected mutation could impair the interpretation of the eye phenotype of additional mouse mutations generated on the same genetic background (Mattapallil et al. 2012). Researchers used TALEN to correct the C57BL/6N-specific mutation at the Crb1 locus (rd8 mutant allele) and obtained genome-edited C57BL/6N mice without the original retinal degeneration (Low et al. 2014).

Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated proteins (Cas)

In 2013, the scientific community discovered a new genome-editing tool that had a formidable capacity to drive precise genome alterations with high efficiency and, at the same time, was extraordinarily easy to implement in any molecular biology laboratory. This disruptive technology was envisaged in 2012 by at least two different groups (Jinek et al. 2012; Gasiunas et al. 2012), but it was not until 2013 that its capacity as a powerful genome-editing tool could be demonstrated. Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated proteins (Cas) are elements of an adaptive immune system discovered and investigated in prokaryotes more than two decades ago (Mojica and Montoliu 2016). The CRISPR-Cas system, or simply CRISPR tools, were first tested and used for genome editing in mammalian cells in culture (Cong et al. 2013; Mali et al. 2013), and soon after in mice (Wang et al. 2013b). It was immediately apparent that these third-generation genome-editing tools would be the simplest, easiest, most robust, and most efficient of all four reported nuclease methodologies: meganucleases, ZFN, TALEN, and CRISPR.

CRISPR tools are derived from an adaptive defense system found in most archaea and bacteria. In brief, the CRISPR systems require two basic components: a small RNA molecule, which is responsible for pairing with the corresponding homologous DNA sequence at the target site, and a DNA endonuclease, which cleaves the DNA and generates the unique DSB at the locus of choice, driven by the short RNA molecule. The indicated small RNA molecule in reality corresponds to two original RNA molecules used by prokaryotes, crRNA and tracrRNA, responsible for pairing with the target sequence and for interacting with a Cas endonuclease, respectively. The two small RNA molecules were artificially merged into a single synthetic guide RNA (sgRNA), or simply gRNA, by the laboratories of Jennifer Doudna and Emmanuelle Charpentier in their milestone 2012 paper, thereby further simplifying a system that was already fairly simple (Jinek et al. 2012). The most common CRISPR tools are derived from the CRISPR-Cas system found in Streptococcus pyogenes, where the relevant Cas endonuclease is Cas9. For this reason, CRISPR tools are often reported as CRISPR-Cas9. However, since we now know of numerous types of CRISPR systems in prokaryotes, each one associated with a different set of Cas proteins (Makarova et al. 2015), it is recommended to refer to these tools as CRISPR-Cas or, even simpler, CRISPR tools (Mojica and Montoliu 2016).
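
To make the RNA-guided targeting principle concrete, the following minimal sketch scans a DNA sequence for candidate Cas9 target sites on both strands. It assumes the conventions widely used for the Streptococcus pyogenes Cas9 system, which are not spelled out in the text above: a 20-nucleotide target sequence immediately followed by an NGG protospacer adjacent motif (PAM). The function names are hypothetical, and the sketch does not attempt to score efficiency or predict off-target sites.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_cas9_targets(seq: str, guide_len: int = 20):
    """List candidate targets as (strand, start, target, pam) tuples.

    Assumes an NGG PAM directly 3' of the target (S. pyogenes Cas9
    convention). 'start' is the 0-based position on the strand being
    read 5'->3', i.e. on the reverse complement for '-' hits.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Hypothetical toy sequence: 20 A residues followed by a TGG PAM.
for hit in find_cas9_targets("A" * 20 + "TGG"):
    print(hit)
```

Retargeting the nuclease in this scheme only means choosing a different 20-nucleotide sequence for the guide RNA, whereas ZFN and TALEN require a newly engineered protein for each target.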

The mechanism by which CRISPR tools can target and cleave specific DNA sequences to generate a DSB is analogous to that of ZFN or TALEN. In essence, all three sets of tools use a binary system formed by an endonuclease and an element to direct it to a specific genomic location. As stated previously, in the case of ZFN and TALEN, the engineered chimeric proteins obtain the endonuclease domain from the bacterial restriction enzyme FokI. By contrast, the CRISPR systems use their own natural endonuclease, a Cas protein, usually Cas9. Whereas both ZFN and TALEN depend on protein-DNA interactions to detect the planned target genomic sequences, through different amino acid/nucleotide codes, the CRISPR system relies on its own natural partner, a small RNA molecule, to match the target DNA sequence through the standard Watson-Crick nucleotide pairing code, which is biochemically more stable than the electrostatic interactions between DNA and protein molecules (Seruggia and Montoliu 2014).

The amazing plasticity and simplicity of generating CRISPR tools (e.g., Harms et al. 2014), along with their proven efficacy in promoting gene disruption (through NHEJ) or gene editing (through HDR, in the presence of exogenous DNA templates), have transformed the field and positioned CRISPR as the most popular genome-editing tool and the method of choice for gene editing. Using CRISPR reagents, any molecular biology laboratory can very easily envisage a strategy for inactivating a locus, through INDELs, or for generating large insertions or deletions, substitutions, duplications, inversions, knockins, and many other complex mutations or chromosomal rearrangements.

CRISPR tools have been used successfully for the generation of a large variety of tailored genomic alterations in many mammalian species. Some examples include strategies for introducing small INDELs (Wang et al. 2013b; Han et al. 2015), substitutions and insertions (Yang et al. 2013; Ma et al. 2014; Platt et al. 2014; Peng et al. 2015), large deletions and insertions (Xiao et al. 2013; Seruggia et al. 2015; Zhang et al. 2015; Birling et al. 2017), inversions (Xiao et al. 2013; Seruggia et al. 2015), and duplications and other chromosomal rearrangements (Maddalo et al. 2014). The largest known deletions and gross chromosomal alterations were recently reported in rats and mice, where the excision of genomic fragments up to 24.4 Mb could be triggered by CRISPR (Birling et al. 2017). CRISPR tools have also been applied for investigating the role and relevance of defined DNA regulatory elements in non-coding genomic areas (Han et al. 2015; Seruggia et al. 2015; Guo et al. 2015; Lupiañez et al. 2015).

As expected, the use of CRISPR approaches for gene therapy applications is beginning to be explored. CRISPR tools have been used successfully to generate or correct mutations in human iPSC (e.g., Flynn et al. 2015). Moreover, CRISPR reagents encapsulated into AAV or non-viral particles have also been used to partially restore gene function in a number of animal models of human genetic disorders, such as Duchenne muscular dystrophy (Nelson et al. 2016), human hereditary tyrosinemia (Yin et al. 2016) or retinitis pigmentosa (Suzuki et al. 2016). Preliminary attempts to use CRISPR for gene editing in human embryos have been reported; however, these studies produced results similar to those observed in embryos from other mammalian species, namely, mosaicism, multiple alleles created and co-existing on top of the expected allele, and some off-target modifications (Liang et al. 2015; Kang et al. 2016). Overall, these reports suggest that CRISPR-mediated genome editing may not yet be used for irreversibly modifying the human genome, should there be a good reason to do it.

Beyond basic research and rodent models, CRISPR has also had a strong impact in large animal models. CRISPR strategies have been applied for the generation of genome-edited livestock for biomedicine and biotechnology applications, including pigs (Tan et al. 2013; Hai et al. 2014; Peng et al. 2015; Lai et al. 2016), cattle (Wang 2015; Tan et al. 2016), sheep (Crispo et al. 2015; Tan et al. 2016), rabbits (Guo et al. 2016; Lv et al. 2016), and goats (Ni et al. 2014; Wang et al. 2015; Guo et al. 2016). Indeed, the arrival of CRISPR editing may have revitalized some biomedical applications such as xenotransplantation, owing to its success in inactivating all copies of porcine endogenous retrovirus (PERV) in the porcine genome (Yang et al. 2015). In addition, CRISPR tools have facilitated the generation of new pig models carrying multiple transgenic and knockout alleles (Niemann and Petersen 2016; Wang et al. 2016; Fisher et al. 2016). Some other recent examples can serve to illustrate the power of CRISPR-mediated genome editing in livestock species. CRISPR tools were used to inactivate the PRNP prion gene in the cattle genome, as a valid strategy to interrupt the dissemination of bovine spongiform encephalopathy, or mad-cow disease (Bevacqua et al. 2016). As a route to produce animals with increased resistance to tuberculosis, CRISPR strategies were used for inactivation of the NRAMP1 locus in cattle (Gao et al. 2017).

The recent use of CRISPR tools to inactivate the NANOS2 locus in the pig has led to the generation of germline-ablated male pigs, which can serve as surrogates for transplantation of donor spermatogonial stem cells to expand the availability of gametes from genetically desirable sires (Park et al. 2017). Finally, a very recent and exciting application of CRISPR is its use for the production of interspecies chimeras. Experiments have been designed to trigger the differentiation of human pluripotent cells in developing swine embryos that have been previously genome-edited with CRISPR tools, in an attempt to find differentiated progenies of human cells contributing to various cell lineages and organs in developing pig fetuses. The idea behind these innovative and challenging experiments is the future generation of human organs for transplantation. The first results from these interspecies chimeras have just been released (Wu et al. 2017).

The future of genome-editing nucleases

Four different genome-editing technologies are known to date: engineered meganucleases, ZFN, TALEN, and CRISPR. Other new systems that use different mechanisms or elements might yet be discovered, but if we consider only CRISPR-Cas systems, we realize that this may represent only the tip of the iceberg regarding the adaptive defense systems in prokaryotes. Most laboratories use only CRISPR tools derived from the CRISPR-Cas9 system of a single bacterium: Streptococcus pyogenes (Jinek et al. 2012). This Cas9 protein is now being engineered and new mutant versions have been generated with increased on-target efficiency and decreased off-target activity, aiming for more suitable tools for gene therapy applications (Kleinstiver et al. 2015, 2016; Slaymaker et al. 2016). Beyond Cas9, which we all mainly use, we also know there are many other CRISPR-Cas systems and types, each one with its own pros and cons, for use as genome-editing tools (Makarova et al. 2015). Many thousands of novel CRISPR-Cas systems in known cultivated species remain to be characterized. Some of these novel CRISPR systems are just now beginning to be studied, including Cpf1 and C2c2, with Cas-like proteins with different gene-editing properties and requirements (Zetsche et al. 2015; Abudayyeh et al. 2016). Even more astounding are the new CRISPR-Cas systems that have been recently discovered from uncultivated microorganisms (Burstein et al. 2017), opening this field to an almost unlimited source of potentially useful CRISPR reagents with specific properties, perhaps each one particularly suitable for a specific type of application.

Genome-editing nucleases (meganucleases, ZFN, TALEN, and CRISPR) have been extremely useful for increasing our knowledge of regular gene function through systematic functional genomic approaches, and for the generation of newer and better animal and cellular models of human disease. These improvements are irreversible and have already shaped the approaches for producing genetic alterations in mammals. Mouse functional genomic consortia now regularly use CRISPR tools to generate the remaining mouse mutants to accomplish the announced goal of creating a library of knockout animals for all murine genes. The next wave is expected to focus on gene therapy applications. ZFN and TALEN protocols have already been used in pre-clinical and clinical trials. CRISPR approaches for gene therapy are starting to be explored, with caution, due to the inherent diversity of genetic alleles created at the target locus and the possibility of modifying similar sequences at off-target genomic locations. A careful risk-benefit analysis will have to be considered before CRISPR genome-editing capacities are eventually transferred to routine clinical protocols.