1 Introduction

Engineered endonuclease-mediated genome editing using ZFNs and TALENs is conceptually similar to restriction endonuclease-mediated DNA manipulation in vitro. Given that recent advances in the engineering of programmable nucleases have almost completely abolished any limitations of the target sequences, especially for TALENs, it appears as though the use of restriction endonucleases has become unrestricted.

On the other hand, engineered endonuclease-mediated genome editing strategies always require the construction of a customized nuclease for each intended genomic sequence. By analogy, this is like performing immunostaining with primary antibodies that are directly conjugated with alkaline phosphatase: every antigen needs its own conjugate. A two-component system, consisting of unconjugated primary antibodies and alkaline phosphatase-conjugated secondary antibodies that recognize them, makes immunochemical manipulation far easier, because a single conjugated secondary antibody can be used with primary antibodies against a variety of antigens. The same advantage is expected in genome editing technology, and prokaryotic immunity has proved to be a “diamond in the rough” for realizing such two-component gene targeting.

2 The CRISPR/Cas System in Prokaryotic Adaptive Immunity

The clustered regularly interspaced short palindromic repeats (CRISPR) locus is found in the genomes of some bacteria and archaea (Ishino et al. 1987; Mojica et al. 2000; Jansen et al. 2002). It contains tandem repeats and spacers, in which the repeats comprise the same sequence and the spacers comprise different sequences derived from foreign DNA (Mojica et al. 2005; Pourcel et al. 2005). The CRISPR locus functions with CRISPR-associated (Cas) proteins as an adaptive immune system against invading foreign DNA (CRISPR/Cas system; Fig. 2.1) (Wiedenheft et al. 2012; Westra et al. 2014). In the system, a fragment of the invading DNA is incorporated as a new spacer in the CRISPR locus, and the locus is transcribed as a long pre-crRNA (CRISPR RNA) containing multiple repeats and spacers. Subsequently, in the type II CRISPR/Cas system, the pre-crRNA is processed, together with another short RNA molecule, trans-activating crRNA (tracrRNA), transcribed from a different locus, into crRNAs that each harbor a single spacer sequence matching the foreign DNA. The resulting crRNA–tracrRNA heteroduplex works as a guide molecule that targets exogenous DNA carrying the sequence matching the crRNA spacer, and induces a DNA double-strand break (DSB) at the specific locus in association with Cas protein(s) (Bhaya et al. 2011; Reeks et al. 2013; Barrangou and Marraffini 2014).

Fig. 2.1
Natural mechanism of the type II CRISPR/Cas adaptive immune system. Foreign nucleotides such as viral DNA or plasmids are incorporated into the CRISPR locus on the host genome. Following new spacer acquisition, pre-crRNA is transcribed and hybridized with tracrRNA. After processing, the crRNA–tracrRNA complex is recruited to the target DNA sequence along with Cas9 protein, and degradation occurs through the nuclease activity of Cas9

3 Application of CRISPR/Cas9 in Genome Editing

When applying the CRISPR/Cas system in genome editing, only two components are needed, namely a chimeric guide RNA (gRNA) mimicking the crRNA–tracrRNA complex, and a Cas9 protein with nuclease activity (Fig. 2.2) (Jinek et al. 2012; Cong et al. 2013; Mali et al. 2013a). Although the targeting specificity is mainly dependent on the gRNA sequence, Cas9 also requires a few particular bases adjacent to the target, known as a protospacer adjacent motif (PAM) (Bolotin et al. 2005). The PAM sequences vary among species. For example, Streptococcus pyogenes Cas9 (SpCas9) requires 5′-NGG-3′ (Jinek et al. 2012), Streptococcus thermophilus Cas9 (StCas9) requires 5′-NNAGAAW-3′ (Cong et al. 2013; Esvelt et al. 2013), and Neisseria meningitidis Cas9 (NmCas9) requires 5′-NNNNGATT-3′ (Esvelt et al. 2013; Hou et al. 2013; Walsh and Hochedlinger 2013). Currently, SpCas9 is the most widely used for genetic engineering (Hsu et al. 2014; Wilkinson and Wiedenheft 2014). The SpCas9-gRNA complex is known to first seek out the PAM sequence in the genome, and then unwind the double-stranded DNA and form DNA-RNA base pairs in a directional manner (Sternberg et al. 2014). When introducing a DSB, the two nuclease domains, HNH and RuvC, each nick one of the two DNA strands (the Watson and Crick strands), resulting in a blunt-ended DSB between the third and fourth bases upstream of the PAM sequence (Jinek et al. 2012; Nishimasu et al. 2014).
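The PAM requirement and the cut position described above can be illustrated with a short Python sketch. It scans both strands of a sequence for a 20-nt protospacer followed by 5′-NGG-3′ and reports where the blunt cut would fall. The function names and the tuple layout are hypothetical choices made for this illustration, and the scan considers exact NGG PAMs only.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_spcas9_sites(seq: str, guide_len: int = 20):
    """List candidate SpCas9 sites as (strand, protospacer, pam, cut) tuples.

    `cut` is a top-strand coordinate: the number of bases lying to the left
    of the blunt cut, which falls between the 3rd and 4th bases upstream of
    the PAM (illustrative convention, not taken from the source text).
    """
    seq = seq.upper()
    n = len(seq)
    # A lookahead pattern reports overlapping candidate sites as well.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            cut_on_strand = m.start() + guide_len - 3
            cut = cut_on_strand if strand == "+" else n - cut_on_strand
            sites.append((strand, m.group(1), m.group(2), cut))
    return sites


if __name__ == "__main__":
    example = "TT" + "GATTACAGATTACAGATTAC" + "TGG" + "TT"
    for site in find_spcas9_sites(example):
        print(site)
```

For Cas9 orthologs with other PAMs, such as 5′-NNAGAAW-3′, the PAM part of the pattern would need to be replaced accordingly.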

Fig. 2.2
Schematic representation of target DNA recognition and cleavage by a gRNA-Cas9 complex. SpCas9 initially searches for the PAM sequence (5′-NGG-3′) on the target DNA. Subsequently, base-pairing between the target DNA and gRNA proceeds gradually from the PAM side. After 20-bp hybridization, the target DNA is cleaved by Cas9 nuclease, resulting in blunt ends 3 bp upstream of the PAM site

The gRNA structure is another important factor for CRISPR/Cas9-based genome editing. Although crRNA and tracrRNA can be separately transcribed as in the naturally occurring CRISPR/Cas system, a chimeric gRNA structure is simpler and often leads to high activity (Hsu et al. 2013). A chimeric gRNA consists of a crRNA-derived region at the 5′ end and a tracrRNA-derived region at the 3′ end, and various modifications have been adopted in both regions by several groups (reviewed in Sander and Joung 2014). The DNA-recognition sequence in the crRNA region is typically 20 nt long, but the addition or truncation of a few bases can reportedly improve the specificity (Cho et al. 2014; Fu et al. 2014). The 3′ end of the crRNA region and the 5′ end of the tracrRNA region are generally linked with four nucleotides (5′-GAAA-3′) to form a major stem loop, known as a tetraloop (Kim and Kim 2014). Stem extensions have also been reported (Chen et al. 2013; Hsu et al. 2013; Jinek et al. 2013). The tracrRNA region has additional minor loops on the 3′ side, and these sequences are known to be important for high gRNA expression (Hsu et al. 2013). In addition, A-U flips in the poly-A or poly-T regions have been adopted in some studies (Chen et al. 2013; Jinek et al. 2013).

4 Targeting Specificity of CRISPR/Cas9

As described above, the approximately 20-bp gRNA sequence and 3-bp PAM sequence of SpCas9 define the targeting specificity of CRISPR/Cas9. However, the stringency of base recognition is not equivalent among these sequences. Regarding the PAM sequence, SpCas9 has the ability to bind to 5′-NGA-3′ (Zhang et al. 2014) and 5′-NAG-3′ (Hsu et al. 2013; Jiang et al. 2013) sites as well as 5′-NGG-3′. Regarding the gRNA targeting sequence, the specificity decreases with increasing distance from the PAM site. The sequence extending up to 12 bp adjacent to the PAM site is called the seed sequence, and has relatively high targeting specificity (Jinek et al. 2012; Cong et al. 2013).
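As a rough illustration of the position dependence described above, the following Python sketch classifies a candidate PAM and separates the mismatches between a gRNA and a candidate site into seed and PAM-distal positions. The 12-nt seed length and the PAM variants come from the text; the function names and the returned data layout are hypothetical, and the sketch does not attempt to score cleavage likelihood.

```python
def pam_class(pam: str) -> str:
    """Classify a 3-nt PAM for SpCas9 as 'canonical' (NGG),
    'alternative' (NAG or NGA), or 'none'."""
    if len(pam) != 3:
        raise ValueError("a SpCas9 PAM is 3 nt long")
    tail = pam.upper()[1:]
    if tail == "GG":
        return "canonical"
    if tail in ("AG", "GA"):
        return "alternative"
    return "none"


def mismatch_profile(guide: str, site: str, seed_len: int = 12) -> dict:
    """Split mismatches into seed and PAM-distal positions.

    Both sequences are written 5'->3' with the PAM on the 3' side, so
    position 1 is the base directly adjacent to the PAM.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    n = len(guide)
    profile = {"seed": [], "distal": []}
    for i, (g, s) in enumerate(zip(guide.upper(), site.upper())):
        if g != s:
            distance_from_pam = n - i
            key = "seed" if distance_from_pam <= seed_len else "distal"
            profile[key].append(distance_from_pam)
    return profile


if __name__ == "__main__":
    guide = "GATTACAGATTACAGATTAC"
    off_target = "AATTACAGATTACAGATTAG"
    print(pam_class("TAG"), mismatch_profile(guide, off_target))
```

Under this scheme, a candidate site whose mismatches all fall in the distal list would be a more plausible off-target than one with seed mismatches.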

In some types of cultured cells, especially immortalized cell lines such as U2OS, HEK293T, and K562, highly frequent off-target mutations have been observed by many groups (Cradick et al. 2013; Fu et al. 2013; Hsu et al. 2013; Pattanayak et al. 2013; Cho et al. 2014; Lin et al. 2014). In vitro assays have also shown off-target binding with high frequencies (Pattanayak et al. 2013). However, in normal cells such as mouse embryonic stem (mES) cells and organisms such as mice and rats, the levels of induced off-target mutations do not seem to be as high (Wang et al. 2013; Mashiko et al. 2014; Yoshimi et al. 2014). Furthermore, whole-genome sequencing has recently been conducted by several groups to analyze the off-target mutations in genome-edited cells, resulting in findings that individual cell clones had low frequencies of unintended mutations among human stem cells treated with CRISPR/Cas9, as well as TALENs (Smith et al. 2014; Suzuki et al. 2014; Veres et al. 2014). These reports suggest that the mutation frequencies at off-target sites vary among species and cell types because of potential differences in the DSB repair machineries.

Interestingly, a genome-wide survey of SpCas9-gRNA-binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq) revealed that only seven nucleotides, including 5′-GG-3′ in the PAM, could be identified as a consensus sequence (Wu et al. 2014). Furthermore, although thousands of off-target binding sites were determined, only one of the analyzed potential off-target sites carried significant mutations in mES cells (1/295; 0.34 %). Similar results were reported by two other groups (Duan et al. 2014; Kuscu et al. 2014). On the other hand, 70 % of SpCas9-gRNA-binding sites were found to be associated with genes (Wu et al. 2014). This result suggests that changes in transcriptional regulation can be triggered by CRISPR/Cas9 for unintended genes, because various studies have shown that binding of catalytically inactive Cas9 to a coding region or regulatory region causes transcriptional inhibition (CRISPRi) (Gilbert et al. 2013; Larson et al. 2013; Qi et al. 2013; Zhao et al. 2014). Although further studies are needed to clarify this issue, we need to recognize the possibility of such potential side effects without any mutations when using CRISPR/Cas9.

5 Double-Nicking and Dimeric FokI-dCas9 Strategies for Highly-Specific Genome Editing

Based on the strong concern about off-target mutations, several advanced strategies for highly specific CRISPR/Cas9-mediated genome editing have been developed (Fig. 2.3). The main problem for CRISPR/Cas9 specificity is its monomeric architecture, unlike the case for the dimeric ZFNs and TALENs. A conventional CRISPR/Cas9 genome editing system contains a single gRNA and a Cas9 nuclease. Since the Cas9 nuclease has cleavage activity for both DNA strands, the induced DSB site is determined by the single gRNA.

Fig. 2.3
Three different DNA-cleaving strategies using the CRISPR/Cas9 system. (a) Original CRISPR/Cas9 system mediated by wild-type Cas9 nuclease and a single gRNA. (b) Double-nicking strategy mediated by Cas9 nickase harboring the D10A mutation and two gRNAs. (c) RNA-guided FokI nuclease system mediated by catalytically inactive Cas9 harboring D10A and H840A mutations (dCas9) fused to the nuclease domain of FokI and two gRNAs

Previous research on engineered endonucleases has provided some clues to solve the problem of specificity. The TALE::TevI architecture, known as compact TALEN (cTALEN), can induce a nick when used as a monomer, but can also induce a DSB when used as a pair (Beurdeley et al. 2013). This paired nicking can only cleave DNA when the space between two nicks is within a range of defined lengths (9–18 bp). Similarly, the double nicking induced by CRISPR/Cas9 was reported to introduce a DSB (Mali et al. 2013b; Ran et al. 2013; Cho et al. 2014). The nuclease activity of Cas9 can be converted to nickase activity when a D10A or H840A mutation is incorporated (Jinek et al. 2012). Theoretically, these nickases cannot induce a DSB unless two adjacent nicks on both strands are introduced. Therefore, when using Cas9 nickase, the target site needs to be recognized by two distinct gRNAs, meaning that targeted mutagenesis can be performed in a highly specific manner. Moreover, this double-nicking strategy reportedly works not only in cultured cells, but also in embryos of animals such as mice (Fujii et al. 2014; Shen et al. 2014).

Another attempt to improve the specificity involves the creation of a fusion protein between catalytically inactive Cas9 (dCas9) and the nuclease domain of FokI (RNA-guided FokI nuclease; RFN). dCas9 has both D10A and H840A mutations and no DNA-cleaving activity. Tsai et al. (2014) showed that FokI-dCas9 can be used as a dimeric nuclease similar to ZFNs and TALENs. FokI-dCas9 can introduce a DSB when the spacer length is in the range of 13–18 bp. In fact, paired nicking is not truly a dimeric strategy, because Cas9 nickase is catalytically active and the nicks sometimes induce mutations (Tsai et al. 2014). On the other hand, FokI-dCas9 acts as a proper dimeric nuclease. The applicability of RFNs in various organisms other than cultured cells needs to be investigated in future studies.
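The spacer constraint on dimeric FokI-dCas9 can be expressed as a simple check on a pair of gRNA target sites. The Python sketch below assumes that the spacer is measured between the facing edges of the two protospacers and that the two sites lie on opposite strands; neither convention is spelled out in the text, and the required PAM orientation is not checked. The `Site` type and function names are hypothetical.

```python
from typing import NamedTuple


class Site(NamedTuple):
    start: int   # 0-based start of the protospacer on the top strand
    end: int     # exclusive end of the protospacer on the top strand
    strand: str  # "+" or "-"


def spacer_length(a: Site, b: Site) -> int:
    """Number of base pairs between the two protospacers
    (negative if they overlap)."""
    left, right = sorted((a, b), key=lambda s: s.start)
    return right.start - left.end


def valid_rfn_pair(a: Site, b: Site,
                   min_spacer: int = 13, max_spacer: int = 18) -> bool:
    """True if the pair targets opposite strands and the spacer falls
    within the 13-18 bp window reported for FokI-dCas9."""
    if a.strand == b.strand:
        return False
    return min_spacer <= spacer_length(a, b) <= max_spacer


if __name__ == "__main__":
    left = Site(0, 20, "-")
    right = Site(35, 55, "+")
    print(spacer_length(left, right), valid_rfn_pair(left, right))
```

The same check could be reused for other paired strategies by changing the spacer window, for example to the 9–18 bp range reported for paired cTALEN nicking.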

6 Web-Based Software for Designing gRNA Targets and Predicting Off-Target Candidates

In principle, the sequence limitation for targeting with CRISPR/Cas9 is only a PAM site. In practice, other actual limitations are as follows: 1) sequences harboring poly-T should be avoided as gRNA target sequences, because poly-T can work as a transcriptional terminator; and 2) the number of potential off-target sites should be minimized.
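These two practical rules translate into a simple filter. The Python sketch below discards candidate guides containing a run of thymines and ranks the rest by a predicted off-target count. The threshold of four consecutive T residues is an assumption made for the example (the text only says "poly-T"), and the off-target counts are expected to come from an external tool; the function names are hypothetical.

```python
import re


def has_poly_t(guide: str, min_run: int = 4) -> bool:
    """True if the guide contains a run of at least `min_run` T residues,
    which may act as a transcriptional terminator (threshold assumed)."""
    return re.search("T{%d,}" % min_run, guide.upper()) is not None


def rank_candidates(candidates, min_run: int = 4):
    """Filter and rank (guide, off_target_count) pairs.

    Guides with a poly-T run are dropped; the remainder are sorted so
    that guides with the fewest predicted off-target sites come first.
    """
    kept = [(g, n) for g, n in candidates if not has_poly_t(g, min_run)]
    return sorted(kept, key=lambda pair: pair[1])


if __name__ == "__main__":
    demo = [
        ("GATTACAGATTACAGATTAC", 5),
        ("GACCTTTTAGATTACAGATC", 0),  # dropped: contains TTTT
        ("GCATGCATGCATGCATGCAT", 2),
    ]
    print(rank_candidates(demo))
```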

Currently, various tools for designing CRISPR/Cas9 target sites and predicting potential off-target sites are available on the web. CRISPR Design Tool (http://crispr.mit.edu/), developed by the Feng Zhang laboratory at MIT (Ran et al. 2013), is probably the most widely used resource for designing and assessing gRNA target sequences (Ni et al. 2014; Yen et al. 2014; Yoshimi et al. 2014). CRISPR Design Tool is used not only for design, but also for searching for off-target sites in the genomes of certain species, including humans, rats, mice, zebrafish, flies, and nematodes. ZiFiT Targeter (http://zifit.partners.org/ZiFiT/) (Sander et al. 2007, 2010; Hwang et al. 2013; Fu et al. 2014), available on the website of the Zinc Finger Consortium, is also commonly used for designing gRNAs (Blitz et al. 2013; Nakayama et al. 2013; Yu et al. 2014). Other web resources include E-CRISP (http://www.e-crisp.org/E-CRISP/) (Heigwer et al. 2014), CRISPRdirect (http://crispr.dbcls.jp/), CRISPR Optimal Target Finder (http://tools.flycrispr.molbio.wisc.edu/targetFinder/) (Gratz et al. 2014), CasOT (http://eendb.zfgenetics.org/casot/) (Xiao et al. 2014), Cas-OFFinder (http://www.rgenome.net/cas-offinder/) (Bae et al. 2014), CHOPCHOP (https://chopchop.rc.fas.harvard.edu/) (Montague et al. 2014), sgRNAcas9 (http://www.biootools.com/col.jsp?id=103) (Xie et al. 2014), and CRISPy (http://staff.biosustain.dtu.dk/laeb/crispy/) (Ronda et al. 2014).

CRISPR Genome Analyzer, CRISPR-GA (http://crispr-ga.net/) (Guell et al. 2014), developed by the George Church laboratory, is a different type of web tool. Using CRISPR-GA, we can obtain analytical data by uploading forward and reverse MiSeq reads of amplicons from genetically modified cells or organisms. The percentages of error-prone non-homologous end-joining are calculated, and the sizes and locations of deletions and insertions can be visualized in automatically created figures.

7 Construction of CRISPR/Cas9 Vectors

Plasmids for constructing custom gRNA- and Cas9-expressing vectors are available from Addgene (https://www.addgene.org/) and several other commercial companies including Life Technologies, OriGene, and System BioSciences. The construction procedure only involves insertion of annealed oligonucleotides into the vectors, which is much simpler than the procedures for ZFNs or TALENs (Fig. 2.4) (Cong et al. 2013; Ran et al. 2013). The gRNA and Cas9 can be expressed using either separate vectors or a single combined vector. pX330, a single vector expressing both gRNA and Cas9 nuclease with human U6 and chicken beta-actin hybrid (CBh) promoters, respectively, was originally developed by the Feng Zhang laboratory (Cong et al. 2013) and has been very widely used for cell and animal genome editing (Mashiko et al. 2013, 2014; Ran et al. 2013; Matsunaga and Yamashita 2014; Mizuno et al. 2014; Park et al. 2014; Yin et al. 2014).

Fig. 2.4
Construction methods for CRISPR/Cas9 vectors for single (upper panel) and multiple (lower panel) gene targeting. pX330, originally developed in the Feng Zhang laboratory, is probably the most commonly used CRISPR/Cas9 vector. A template DNA sequence for the gRNA should be prepared as annealed oligonucleotides and inserted into the BbsI-digested pX330 vector. For multiplex genome engineering, an all-in-one vector system containing multiple gRNA cassettes and a Cas9 cassette can be used. The system involves the BsaI-mediated Golden Gate assembly method for the concatemerization of gRNA cassettes
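The oligonucleotide design step for a BbsI-digested vector can be sketched in Python as follows. The overhang sequences (5′-CACC on the top oligo and 5′-AAAC on the bottom oligo) and the addition of a 5′ G for U6-driven transcription are assumptions based on commonly described pX330 cloning protocols; they are not stated in this chapter and should be checked against the documentation of the plasmid actually used. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_bbsi_oligos(guide: str,
                       top_overhang: str = "CACC",
                       bottom_overhang: str = "AAAC",
                       prepend_g: bool = True):
    """Return (top_oligo, bottom_oligo), both written 5'->3'.

    The overhangs are assumed values for a BbsI-digested pX330-type
    vector. If `prepend_g` is set and the guide does not start with G,
    an extra G is added for U6 transcription (also an assumption).
    """
    guide = guide.upper()
    if set(guide) - set("ACGT"):
        raise ValueError("guide must contain only A, C, G, and T")
    insert = guide if (guide.startswith("G") or not prepend_g) else "G" + guide
    top = top_overhang + insert
    bottom = bottom_overhang + reverse_complement(insert)
    return top, bottom


if __name__ == "__main__":
    top, bottom = design_bbsi_oligos("ATTACAGATTACAGATTACG")
    print("top:   ", top)
    print("bottom:", bottom)
```

When the two oligonucleotides are annealed, the insert regions pair with each other and the two 4-nt overhangs remain single-stranded for ligation into the digested vector.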

Meanwhile, Sakuma et al. (2014) developed an all-in-one CRISPR/Cas9 vector system for multiplex genome engineering by modifying the pX330 plasmid. In their system, up to seven gRNA expression cassettes are tandemly ligated into a single vector along with a Cas9 nuclease/nickase cassette using the Golden Gate assembly method (Fig. 2.4), which is often used for modular assembly of DNA-binding repeats of TALE (Cermak et al. 2011; Kim et al. 2013; Sakuma et al. 2013a, b). The all-in-one vector constructed with this system has been proven to be applicable for simultaneous gene targeting of up to seven and three genomic loci with standard nuclease and paired nickase strategies, respectively. The materials for the construction are expected to be distributed as the “Multiplex CRISPR/Cas9 Assembly System Kit” by Addgene.

8 Methods for Introducing CRISPR/Cas9 into Cells and Organisms

To achieve CRISPR/Cas9-mediated genome engineering, various methods have been devised for delivery of the two components, gRNA and Cas9. For cultured cells and animal embryos, the two components can be introduced by DNA/RNA/protein transfection or microinjection. Multiplex genome engineering is also applicable when multiple plasmids, plasmid and DNA fragments, single all-in-one plasmids, or RNA/protein are introduced (Fig. 2.5) (Jao et al. 2013; Li et al. 2013b; Wang et al. 2013; Guo et al. 2014; Ma et al. 2014; Sakuma et al. 2014). Purified Cas9 protein and gRNAs can form ribonucleoproteins (RNPs) in vitro, which can be incorporated into cells by electroporation (Kim et al. 2014b). If cell-penetrating peptides are complexed with the gRNAs and conjugated to Cas9, the RNPs can be delivered into cells simply by adding them to the medium (Ramakrishna et al. 2014), similar to the case for TALENs with cell-penetrating peptides (Ru et al. 2013; Liu et al. 2014). Lentiviral delivery into cells and animals has also been reported (Malina et al. 2013; Heckl et al. 2014). Importantly, lentiviral CRISPR/Cas9 libraries have enabled forward genetic screening in cultured cells (Koike-Yusa et al. 2014; Shalem et al. 2014; Wang et al. 2014; Zhou et al. 2014). For plant applications, protoplast transformation or Agrobacterium infection has generally been used for the delivery (Feng et al. 2013; Li et al. 2013a; Nekrasov et al. 2013; Shan et al. 2013). The current status of CRISPR/Cas9-mediated genome editing in various cells and organisms is described in Part II of this book.

Fig. 2.5
Examples of transfection strategies for CRISPR/Cas9-mediated multiplex genome engineering. Multiple plasmids, plasmid and DNA fragments, or single plasmids can be applied for the DNA transfection (upper panels). Alternatively, Csy4-mediated cleavage of long transcripts can produce multiple gRNAs (Nissim et al. 2014; Tsai et al. 2014). Several gRNAs transcribed in vitro and Cas9 mRNA or protein can be used for DNA-free transfection

9 Expanded Applications of CRISPR/Cas9 in Life Science Studies

Similar to ZF- and TALE-based technologies, fusion proteins of dCas9 with various functional domains can act as a variety of site-specific DNA-binding effector proteins (Fig. 2.6) (Mali et al. 2013c; Hsu et al. 2014; Sander and Joung 2014). For transcriptional activation, the herpes simplex virus-derived activator domain, VP16, or its concatemers, such as VP48, VP64, and VP120, are fused with dCas9 (Cheng et al. 2013; Maeder et al. 2013a; Hu et al. 2014). Regarding transcriptional repression, although dCas9 itself can inhibit transcription (Gilbert et al. 2013; Larson et al. 2013; Qi et al. 2013; Zhao et al. 2014), dCas9 fused with a repressor domain such as KRAB can repress gene expression more efficiently (Gilbert et al. 2013; Kearns et al. 2014). dCas9-GFP, developed by Chen et al. (2013), enables dynamic imaging of genomic loci in cultured cells. Site-specific epigenome editing is also thought to be applicable using a dCas9-fusion strategy (Rusk 2014), but only a few examples have currently been reported using TALE-based strategies (Konermann et al. 2013; Maeder et al. 2013b; Mendenhall et al. 2013) and there are no reports for CRISPR technology. Nuclease-independent genetic engineering enzymes have also been adopted in ZF/TALE-fusion architectures. ZF/TALE-recombinases and ZF/TALE-transposases have been reported in the following papers: Gordley et al. (2007), Gersbach et al. (2011), Mercer et al. (2012), and Gaj et al. (2013) for recombinases; Li et al. (2013c) and Owens et al. (2013) for transposases. CRISPR applications for these purposes are expected to be developed in the near future.

Fig. 2.6
Various applications of ZF/TALE/CRISPR technologies. Genome editing techniques can expand beyond site-specific nucleases. For example, VP16 and KRAB fusions result in transcriptional activation and repression of specific genes, respectively, GFP fusion results in visualization of specific genomic loci, and TET1 and LSD1 fusions result in site-specific epigenetic modifications

It is particularly worth noting that CRISPR/Cas9-based transcriptional control methodologies, like TALE-based transcriptional modulation techniques, open up a huge new field of synthetic biology (Farzadfard et al. 2013; Kiani et al. 2014; Moore et al. 2014). Among others, Nissim et al. (2014) constructed particularly sophisticated gene networks using CRISPR transcriptional control tools in combination with various RNA-modifying systems such as RNA-triple-helix structures, introns, microRNAs, and ribozymes. At present, however, further expansion and deepening of the technologies are required for this challenging field.

In addition, there are some other applications that differ from the standard genome engineering approaches. Engineered DNA-binding molecule-mediated chromatin immunoprecipitation, enChIP, is one of the unique methods utilizing CRISPR/Cas9 (Fujita and Fujii 2013). Using enChIP, specific genomic regions can be efficiently purified and their associated proteins can be identified by mass spectrometry. The same group further showed that similar experiments can be performed with TALEs instead of CRISPR/Cas9 (Fujita et al. 2013). Kim et al. (2014a) applied the CRISPR/Cas9 system to the genotyping of polymorphisms in vitro. Restriction fragment length polymorphism (RFLP) analysis is often used for genotyping of genome-edited alleles (Suzuki et al. 2013; Nakagawa et al. 2014; Sakane et al. 2014). However, conventional RFLP analysis can only be applied when there is a recognition sequence for a restriction enzyme around the DSB site. On the other hand, the CRISPR/Cas9-mediated RFLP method using in vitro-synthesized gRNA and Cas9 protein enables the genotyping of any sequence, as long as there is a PAM sequence around the target site.
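The logic of CRISPR/Cas9-mediated RFLP genotyping can be sketched in Python by predicting the fragment sizes obtained when a PCR amplicon is digested in vitro with Cas9 and a gRNA. The sketch assumes an exact match between the gRNA and the wild-type allele, a 5′-NGG-3′ PAM, and a single cut 3 bp upstream of the PAM; an allele whose target site has been disrupted by an indel is treated as uncut. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def cas9_rflp_fragments(amplicon: str, guide: str):
    """Predict fragment lengths (bp, largest first) after in vitro
    digestion of an amplicon with SpCas9 and the given gRNA.

    Only a perfect guide match followed by an NGG PAM is cut; an uncut
    amplicon is returned as a single fragment of full length.
    """
    amplicon, guide = amplicon.upper(), guide.upper()
    n, k = len(amplicon), len(guide)
    for strand_seq in (amplicon, reverse_complement(amplicon)):
        idx = strand_seq.find(guide)
        while idx != -1:
            pam = strand_seq[idx + k: idx + k + 3]
            if len(pam) == 3 and pam.endswith("GG"):
                cut = idx + k - 3  # 3 bp upstream of the PAM
                return sorted([cut, n - cut], reverse=True)
            idx = strand_seq.find(guide, idx + 1)
    return [n]


if __name__ == "__main__":
    guide = "GATTACAGATTACAGATTAC"
    wild_type = "A" * 30 + guide + "TGG" + "A" * 47
    mutant = "A" * 30 + guide[:15] + guide[17:] + "TGG" + "A" * 47
    print(cas9_rflp_fragments(wild_type, guide))  # cut allele
    print(cas9_rflp_fragments(mutant, guide))     # uncut allele
```

In practice Cas9 tolerates some mismatches, so small indels far from the PAM may not fully block cleavage; the exact-match rule here is a simplification.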

Conclusions

Owing to its simplicity, convenience, and flexibility, CRISPR/Cas9 technology is currently evolving with astonishing rapidity (Pennisi 2013). Along with novel mechanistic insights, various improvements and upgrades of the system are continuously being reported, together with a rich variety of applications that are too numerous to mention. This innovative technology is clearly upsetting conventional wisdom in every research field in life science studies. Researchers are encouraged to enjoy the benefits of this novel technique and drive the growth of their science.