13.1 Introduction

Genetics is concerned with identifying genes, the fundamental hereditary units responsible for variation in organisms. The conventional approach to identifying the genes responsible for a particular phenotype is to start from a distinctive phenotype that has emerged either through natural selection or through mutation. A modern approach (reverse genetics) starts from the genomic sequence of a gene and mutates that gene specifically in order to characterize its phenotype. Genetic mutations can be introduced in two ways (Fig. 13.1): (a) targeted mutagenesis and (b) targeted gene replacement. Targeted gene replacement relies on homology directed repair to modify an endogenous gene. Targeted mutagenesis allows extensive and subtle genetic change.

Fig. 13.1
Two illustrations. a, depicts DNA exposed to a mutagen, which causes an error in replication. b, depicts a foreign gene and a genome, which leads to the foreign gene being placed in the genome.

Schematic representation of targeted mutagenesis and targeted gene replacement. (a) Exposure to mutagen results in error or mutation in replication. (b) Foreign gene is inserted into DNA, in order to modify genome

Targeting a specific gene within a large genome to make directed changes is a challenging task. Targeted gene replacement in Saccharomyces cerevisiae (baker’s yeast) was achieved several decades ago (Rothstein 1983; Scherer and Davis 1979). Yeast DNA is easy to manipulate in the laboratory for several reasons, including the ability of yeast cells to take up DNA readily, efficient homologous recombination between donor and recipient DNA, the lack of competing reactions that divert the donor DNA to alternative genomic sites, and powerful genetic selection. These properties are often found in fungi and bacteria but not in most higher eukaryotic organisms (Carroll 2011).

Targeted gene replacement in mice has become routine practice because embryonic stem (ES) cells can be cultured and are easy to select (Capecchi 2005). Targeted gene replacement in mice also depends on homologous recombination to target the donor DNA and on selection for the desired product. Dual selection (positive for the donor and negative for sequences outside the homology) enriches for the preferred substitution. Because ES cells are pluripotent, the desired substitution is transferred to the progeny once the cells are injected into embryos. Even with selection for the desired event, the frequency of homologous recombination in both mice and yeast remains low. Moreover, ES cells are not generally available for other model organisms. The challenge of precise gene targeting can therefore be met by increasing the frequency of homologous recombination. One way is to use linear donor DNA, which is more efficient than circular DNA. The greatest impact on the frequency of homologous recombination was observed when the target DNA was damaged, so that the cell's inherent repair system was engaged. A double strand break (DSB) increased the frequency of recombination manifold. The inspiration for these experiments came from yeast mating-type switching and meiotic crossing over (Choulika et al. 1995; Plessis et al. 1992; Rouet et al. 1994; Rudin et al. 1989).

DSBs (Double Strand Breaks) are repaired by cells’ inherent repair system either by non-homologous end joining (NHEJ) or homology directed repair (HDR) (Fig. 13.2) (Carroll 2011).

Fig. 13.2
An illustration depicts double strand breaks of a genome, which are repaired in two ways as follows. 1. Non-homologous end joining, which includes insertions or deletions and gene disruptions. 2. Homology directed repair, which includes precise DNA editing and gene insertion. The factors that cause the double strand breaks are also depicted.

Double strand breaks (DSBs) are repaired by two inherent cellular mechanisms: (a) non-homologous end joining (NHEJ) and (b) homology directed repair (HDR)

13.2 Genome Editing Techniques in Plants

Genome editing with engineered nucleases (GEEN) is a technique that modifies the genome by creating site-specific DSBs, whose subsequent repair is exploited to engineer the genetic material of cells. Small genomes, such as those of viruses and bacteria, can be manipulated with restriction endonucleases for cutting and ligases for rejoining the DNA. Cleaving and rejoining larger genomes, such as those of eukaryotes, is extremely difficult with restriction endonucleases and ligases because these enzymes are suited only to smaller DNA fragments. Artificial enzymes were therefore made to modify complex genomes by coupling a module that binds the target site selectively to a protein that cleaves it. This targeted approach was accomplished by constructing chimeric proteins that carry one or two structural units capable of selectively binding the targeted DNA site and catalyzing its cleavage (Knorre and Vlasov 1985). These chimeric proteins are produced within the cell, and the DNA fragments encoding them are engineered into appropriate vectors for nuclear localization. Genome editing tools make it possible to add, delete, or replace genes with great precision. The genome editing tools currently used to edit genomes at specific sites are described in the following section.

  1. ZFNs (Zinc Finger Nucleases)

Zinc finger nucleases (ZFNs) are the first generation of engineered nucleases. ZFNs entered research use after the discovery that the zinc finger domain owes its specific function to Cys2-His2 elements (Kim et al. 1996; Gaj et al. 2013; Palpant and Dudzinski 2013; Pabo et al. 2001). Each Cys2-His2 domain consists of about 30 amino acid residues folded into a ββα configuration (Pabo et al. 2001; Cathomen and Joung 2008; Petolino 2015). Crystallographic analysis showed that the Cys2-His2 domain binds its target DNA by inserting its α-helix into the major groove of double-stranded DNA (Pavletich and Pabo 1991). A single zinc finger recognizes three consecutive nucleotides in the genome of the target organism, and the DNA binding domain of a ZFN can be engineered to bind a specific DNA sequence. A cleavage domain (normally the FokI domain of a type IIS restriction nuclease) is fused to the DNA binding domain and cleaves the region that the binding domain recognizes. At the target site, the cleavage domains of two ZFNs dimerize across a 5–7 base pair spacer region (Fig. 13.3).

Fig. 13.3
An illustration depicts a ZFN, which consists of a DNA binding domain and a FokI DNA cleavage domain. Both are separated by a spacer region of 5 to 7 nucleotides. The left ZFN and right ZFN are labeled.

Zinc finger nuclease (ZFN) comprising a DNA cleavage domain (FokI) and a DNA binding domain separated by a 5–7-nucleotide spacer region

  (i) DNA Binding Domain

The DNA binding domain of a ZFN comprises 3–6 zinc finger repeats. Each repeat recognizes a specific sequence of three base pairs, so a combination of 4–5 zinc fingers recognizes 12–15 base pairs. ZFNs are engineered to target a particular sequence in the genome. Yeast one- and two-hybrid systems, bacterial one- and two-hybrid systems, mammalian cells, and phage display are used to select, from a pool of candidates, the individual proteins that bind a specific DNA sequence. A more recent approach to engineering ZFNs is oligomerized pool engineering (OPEN) (Maeder et al. 2008). In OPEN, individual zinc fingers are first selected for binding to three specific DNA nucleotides. A second round of selection then identifies combinations of three consecutive zinc fingers that bind nine specific base pairs. OPEN has become a successful alternative to commercially available engineered ZFNs (Ramirez et al. 2008) (Fig. 13.4).

Fig. 13.4
An illustration depicts a ZFN DNA binding domain in which the following parts are labeled. Zinc finger, DNA, β-pleated sheet, α recognition helix, and zinc ion.

ZFN DNA binding domain: zinc finger proteins are wrapped around DNA as a β-pleated sheet coupled with an α-recognition helix
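The three-base-pair-per-finger rule described above can be illustrated with a minimal Python sketch. The function name `finger_triplets` and the example half-site are hypothetical and chosen only for illustration; the sketch ignores the orientation in which fingers contact the DNA and simply shows how a 9–18 bp binding site breaks down into the triplets read by 3–6 fingers.

```python
def finger_triplets(half_site: str) -> list[str]:
    """Split a ZFN binding site into the 3-bp subsites read by successive fingers.

    Assumes one finger per 3 bp and 3-6 fingers per array, as stated in the text.
    """
    site = half_site.upper()
    if len(site) % 3 != 0:
        raise ValueError("binding site length must be a multiple of 3 bp")
    n_fingers = len(site) // 3
    if not 3 <= n_fingers <= 6:
        raise ValueError("expected an array of 3-6 fingers (9-18 bp)")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


# Hypothetical 9-bp binding site, i.e. a three-finger array.
EXAMPLE_SITE = "GCGTGGGCG"
print(finger_triplets(EXAMPLE_SITE))  # ['GCG', 'TGG', 'GCG']
```

A real design would select a finger for each triplet from a validated library or by a selection scheme such as OPEN; the sketch covers only the bookkeeping.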

  (ii) DNA Cleavage Domain

The cleavage domain of ZFNs is derived from the type IIS restriction endonuclease FokI (Kim et al. 1996). The FokI domain cleaves the site targeted by the DNA binding domain (Bitnaite et al. 1998). A ZFN is formed by fusing the cleavage domain (FokI) to the C-terminus of the DNA binding domain. ZFNs work in pairs in opposite orientation: they attach to complementary strands of the target DNA and cut within the spacer region (5–7 base pairs) flanked by the two binding sites (Cathomen and Joung 2008).
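The paired, opposite-orientation arrangement can be sketched in Python as follows. This is an illustrative layout calculation under stated assumptions, not a design tool: the function `zfn_pair_sites`, its parameters, and the example sequence are hypothetical, and the site length of 9 bp corresponds to a three-finger array.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def zfn_pair_sites(seq: str, left_start: int, site_len: int = 9, spacer_len: int = 6) -> dict:
    """Lay out left site, spacer, and right site for a ZFN pair on `seq`.

    The right ZFN binds the opposite strand, so its site is reported as the
    reverse complement of the top-strand sequence. The spacer must be 5-7 bp.
    """
    if not 5 <= spacer_len <= 7:
        raise ValueError("spacer must be 5-7 bp")
    spacer_start = left_start + site_len
    right_start = spacer_start + spacer_len
    right_end = right_start + site_len
    if left_start < 0 or right_end > len(seq):
        raise ValueError("sites do not fit inside the sequence")
    return {
        "left_site": seq[left_start:spacer_start],
        "spacer": seq[spacer_start:right_start],
        "right_site": reverse_complement(seq[right_start:right_end]),
    }


# Hypothetical target: two inverted 9-bp sites around a 6-bp spacer.
TARGET = "TT" + "GCGTGGGCG" + "ACGTAC" + "CGCCCACGC" + "AA"
print(zfn_pair_sites(TARGET, left_start=2))
```

In this example both ZFNs read the same 9-bp sequence on their respective strands, which is the situation in which a single engineered binding domain could serve both halves of the pair.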

ZFNs are used as a genome engineering tool for targeting particular loci in many organisms (Ekker 2008). These molecular scissors work in pairs to manipulate the genomes of numerous plant and animal species (Kim and Kim 2011). The Cys2-His2 domain is an important class of zinc finger domain and the most widely used DNA binding domain of ZFNs in eukaryotes. Cys2-His2 domains are the second most frequently encoded protein domains in the human genome (Gaj et al. 2013). A zinc finger has a conserved ββα conformation comprising nearly 30 amino acids (Fig. 13.5).

Fig. 13.5
An illustration depicts a ZFN DNA binding domain in which the following parts are labeled. N terminus, α-helix, β-pleated sheet, Zn, C terminus, Cys side chain, and His side chain.

ZFN DNA binding domain comprising an α-helix and a β-pleated sheet whose Cys and His side chains are bonded with the Zn ion

Amino acids on the surface of the α-helix bind specifically and with high affinity to three base pairs in the major groove of DNA (Gaj et al. 2013). A targeted DNA sequence is recognized by an array of three or more zinc fingers, forming a site-specific structure that recognizes 9–18 base pairs. ZFNs were first described in 1996 (Kim et al. 1996), and they are able to pinpoint a targeted DNA sequence within a large, complex genome (Gonzalez et al. 2010). ZFNs can be synthesized in the laboratory by identifying DNA binding proteins with high selectivity for the target DNA. Artificially synthesized ZFNs can be designed to cleave a chosen target DNA sequence and may also inhibit the replication of viruses, thereby imparting resistance to the host organism (Chen et al. 2014). Numerous methods for artificially synthesizing ZFNs have been established; some construct ZFNs by modular assembly of fingers with known DNA specificities (Beerli et al. 2002), while others use OPEN (oligomerized pool engineering) (Maeder et al. 2008). Some applications of ZFNs are listed in Table 13.1.

Table 13.1 Applications of zinc finger nucleases (ZFNs) in eukaryotes

  2. TALENs (Transcription Activator-Like Effector Nucleases)

Transcription activator-like effector nucleases (TALENs) are molecular scissors developed as an advance over ZFNs. Like ZFNs, they are formed by fusing a DNA binding domain, here derived from a TAL effector (TALE), to a nuclease cleavage domain. TALEs can be engineered to bind a specific DNA site in the genome, so TALENs act as engineered restriction enzymes that bind and cleave DNA at a very specific site. These engineered endonucleases are used in genome editing with engineered nucleases (GEEN). Along with ZFNs and CRISPR, TALENs are prominent tools of genome editing (Boch 2011) (Fig. 13.6).

Fig. 13.6
An illustration depicts TALENs, which consist of a DNA binding domain and a FokI DNA cleavage domain. Both are separated by a spacer region of 16 to 19 nucleotides. The left TALEN and right TALEN are labeled.

Transcription activator-like effector nucleases (TALENs) comprised of DNA binding domain and FokI DNA cleavage domain separated by 16–19-nucleotide spacer region

  (iii) DNA Binding Domain

TAL effectors are proteins that bacteria of the genus Xanthomonas secrete through their type III secretion system when infecting plants (Boch and Bonas 2010). The DNA binding domain is formed by a conserved array of repeats of 33–35 amino acids each, in which the divergent 12th and 13th amino acids are termed the repeat variable di-residue (RVD). RVDs are highly variable and correlate strongly with recognition of a specific nucleotide (Boch 2009; Moscou and Bogdanove 2009). This correlation between amino acid sequence and DNA recognition site has allowed the development of numerous DNA binding domains with selected combinations of RVDs. Target specificity can be improved further by incorporating nonconventional RVDs or by using altered combinations (Juillerat et al. 2015).
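The one-repeat-per-nucleotide correspondence can be sketched in Python. The RVD table below is an assumption added for illustration: it uses the commonly reported code (NI for A, HD for C, NN for G, NG for T), which this chapter does not spell out, and NN is known to tolerate A as well as G. The function name `rvds_for_target` and the example target are hypothetical.

```python
# Commonly reported RVD-to-nucleotide preferences (an assumption for this
# sketch, not taken from the chapter). NN also recognizes A to some extent.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target: str) -> list[str]:
    """Return one RVD per nucleotide of `target`, in 5' to 3' order."""
    rvds = []
    for base in target.upper():
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        rvds.append(RVD_FOR_BASE[base])
    return rvds


# Hypothetical 6-nt stretch of a TALE target site.
print(rvds_for_target("TGACCT"))  # ['NG', 'NN', 'NI', 'HD', 'HD', 'NG']
```

Each entry of the returned list stands for one 33–35 amino acid repeat whose 12th and 13th residues are the listed pair; nonconventional RVDs would simply extend the table.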

  (iv) DNA Cleavage Domain

DNA cleavage domain is nonspecific and often comes from the FokI restriction endonuclease. DNA cleavage domain is also used for the construction of hybrid restriction endonucleases that are effective in yeast (Christian et al. 2010; Li et al. 2011), plants (Mahfouz et al. 2011; Cermak et al. 2011), and animal cells (Cermak et al. 2011; Miller et al. 2011; Hockemeyer et al. 2011; Wood et al. 2011).

Initially, wild-type nonspecific FokI domains were used as a cleavage domain. Current studies on TALEN construction (Hockemeyer et al. 2011; Tesson et al. 2011; Huang et al. 2011) utilized mutant FokI cleavage domains to impart specificity (Doyon et al. 2011; Szczepek et al. 2007) and efficient activity (Guo et al. 2010) to cleave at the very specific site identified by DNA binding domain. The FokI cleavage domain acts as a dimer targeting unique DNA binding site with specific spacing and orientation within the target genome. The number of base pairs present between the binding sites of two independent TALENs and the number of amino acids present between DNA binding and cleavage domain are important for acquiring highest specificity and TALEN activity (Miller et al. 2011; Mussolino et al. 2014).

13.2.1 Construction of TALEN Engineering Vectors and Transfection

Efficient TALEN engineering relies on the correspondence between the amino acid sequence of the TALE DNA binding domain and the DNA sequence it binds. Because TALE domains are highly repetitive, annealing of their repetitive DNA sequences is poorly specific during artificial synthesis of TALE domains (Zhang et al. 2011). Software such as DNAWorks (Hoover 2012) helps design TALE domains by accurately calculating the oligonucleotides needed to assemble and amplify the whole gene in a two-step PCR. Numerous modular assembly schemes for engineering TALEN constructs have been reported (Cermak et al. 2011; Zhang et al. 2011; Morbitzer et al. 2011; Li et al. 2011; Geissler et al. 2011; Weber et al. 2011). The systematic approaches used for engineering these DNA binding domains are also applicable to designing the corresponding domains of ZFNs.

TALEN constructs are assembled in plasmids and transfected into the target cells. The genes are expressed in the target cells, and the resulting proteins enter the nucleus to reach the genome. Alternatively, TALENs can be delivered to the cells as mRNAs, which eliminates the possibility that TALEN-encoding DNA integrates into the genome. mRNA-based delivery also increases the probability of homology directed repair and of successful introgression of the intended edit. Some applications of TALENs are given in Table 13.2.

Table 13.2 Applications of transcription activator-like effector nucleases (TALENs) in plants 

CRISPR/Cas System

Bacteria protect themselves from viruses and other foreign invaders with an inherent defense system made up of clustered regularly interspaced short palindromic repeats (CRISPR) and the associated Cas proteins (Barrangou 2015). A CRISPR locus consists of short palindromic repeats separated by spacer sequences that were acquired from invading viral or phage genomes and integrated into the bacterial genome during a previous attack. These sequences are integrated during the first attack and serve as defense sequences during subsequent attacks (Barrangou 2015). The associated Cas proteins act as protective enzymes: they take part in processing the CRISPR-derived RNA and recognize and cleave the matching target DNA. This combination of CRISPR sequences and associated proteins underlies one of the most powerful genome editing technologies (Zhang et al. 2014) (Fig. 13.7).

Fig. 13.7
An illustration depicts Cas9, which has target DNA and sgRNA. The sgRNA depicts the upper stem, lower stem, bulge, nexus, and hairpins. The target DNA has the PAM sequence.

Clustered regularly interspaced short palindromic repeats (CRISPR) system. CRISPR/Cas9 with crRNA (CRISPR RNA) and tracrRNA (trans-activating crRNA) binds the protospacer and cleaves double-stranded DNA upstream of the PAM

The CRISPR system is a native defense system of bacteria and archaea that provides acquired immunity against invading phages and viral genomes (Sternberg and Doudna 2015; Barrangou et al. 2007; Marraffini and Sontheimer 2008). An RNA spacer guides Cas proteins to recognize complementary sequences and target foreign DNA. In engineered CRISPR systems, this RNA is termed guide RNA (gRNA). The system is regarded as inherent to bacteria and archaea because it has so far been reported in about 50% of bacterial and 90% of archaeal genomes (Hille et al. 2018). Small clusters of CRISPR-associated (Cas) genes lie in close proximity to the CRISPR spacer sequences. Based on sequence homology, 93 Cas genes make up 35 CRISPR-associated gene families. Of these 35 gene families, 11 are reported as core gene families (including Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, and Cas9). A CRISPR system must possess a core gene for accurate recognition and targeting of a DNA sequence (Makarova et al. 2015) (Fig. 13.8).

Fig. 13.8
An illustration depicts 3 stages of CRISPR immunity as follows. 1, Acquisition depicts tracrRNA, cas9, cas genes, crRNA. 2, Expression depicts tracrRNA, Cas protein, primary transcript. 3, Interference depicts crRNA. The foreign DNA before and after the 3 stages is also depicted.

CRISPR immunity involves three stages: (a) The adaptation (acquisition) stage involves the addition of new spacer sequences into CRISPR loci upon attack by foreign DNA. (b) The expression stage involves transcription of the CRISPR loci and the formation of CRISPR RNA. (c) The interference stage involves the detection and degradation of mobile foreign DNA by the complex of CRISPR RNA and Cas protein

CRISPR systems fall into two categories: Class 1 and Class 2. Class 1 CRISPR systems use a combination of Cas proteins to target the invading genome and are subdivided into types I, III, and IV. Class 2 systems use a single Cas protein to cleave the target sequence and are subdivided into types II, V, and VI (Wright et al. 2016). The six types are further distinguished by 19 specific and unique proteins (Westra et al. 2016).
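The two-class scheme can be captured in a small lookup structure. The Python sketch below encodes only what the paragraph above states; the names `CRISPR_CLASSES` and `class_of` are hypothetical and introduced for illustration.

```python
# Class -> types and effector architecture, as summarized in the text.
CRISPR_CLASSES = {
    "Class 1": {"types": ("I", "III", "IV"), "effector": "combination of Cas proteins"},
    "Class 2": {"types": ("II", "V", "VI"), "effector": "single Cas protein"},
}


def class_of(crispr_type: str) -> str:
    """Return the class ('Class 1' or 'Class 2') that a CRISPR type belongs to."""
    for class_name, info in CRISPR_CLASSES.items():
        if crispr_type.upper() in info["types"]:
            return class_name
    raise KeyError(f"unknown CRISPR type: {crispr_type!r}")


for crispr_type in ("I", "II", "V", "VI"):
    print(crispr_type, "->", class_of(crispr_type))
```

The single-protein effectors discussed later in the chapter (Cas9, Cas12, Cas13) all resolve to Class 2 in this lookup.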

The CRISPR mechanism starts when a foreign invader attacks the bacterium. First, a fragment of the viral genome is captured and integrated as a spacer sequence into the CRISPR locus, which later enables cleavage of the matching target site. Because the Cas1 and Cas2 genes are involved in spacer acquisition, they are found in almost all classes of CRISPR (Aliyari and Ding 2009; Dugar et al. 2013; Hatoum et al. 2011; Yosef et al. 2012; Swarts et al. 2012; Mussolino et al. 2014). The new spacer is inserted adjacent to the leader sequence and is flanked by direct repeat sequences. Direct repeat duplication occurs after single-strand extension to repair the CRISPR sequence. The sequence acquisition step is essentially the same in almost all CRISPR types. The remaining two steps, (a) processing of CRISPR RNA and (b) interference, differ between CRISPR types. The type I CRISPR/Cas system uses Cas6e/Cas6f to cleave the direct repeat sequence at the junction of single-stranded and double-stranded RNA formed by a hairpin structure. The type II CRISPR system uses trans-activating crRNA (tracrRNA) to produce double-stranded RNA that is cleaved by Cas proteins and RNase III. The type III CRISPR system uses a homolog of the Cas6 protein that does not rely on a hairpin structure for targeting. In addition, type II and type III CRISPR systems require further trimming at the 5′ or 3′ end to form functional crRNAs. Cas proteins and crRNAs combine to form interference complexes. In type I and II CRISPR systems, recognition of a PAM site by the Cas proteins is obligatory for appropriate cleavage of incoming DNA. Type III CRISPR systems do not require a PAM sequence for cleavage because they rely on pairing between crRNA and mRNA (Aliyari and Ding 2009; Dugar et al. 2013; Hatoum et al. 2011; Yosef et al. 2012; Swarts et al. 2012; Mussolino et al. 2014).

Multiple strategies are reported for targeting precise genome editing of eukaryotes. The most practical strategy in laboratory is to introduce the Cas gene, gRNA, and its scaffold in a plasmid to cleave the genome at a very specific site (Swarts et al. 2012). Several companies (Editas and Cellectis) have developed their own strategies to introduce CRISPR system for genome editing and are highly practical in gene therapies (Rinaldo and Ayliffe 2015; Regalado 2015).

Several innovative CRISPR methods have been reported, viz., a hyper-efficient CRISPR/Cas9 system for homology-directed repair (HDR) (Charpentier et al. 2018), bridged-nucleic-acid-containing CRISPR RNA for accurate detection of the target DNA (Cromwell et al. 2018), HypaCas9 (Chen et al. 2017), Local Accumulation of Double Strand Repair (LOAD) (Sakuma and Yamamoto 2018), and xCas9 with broadened PAM compatibility (Hu et al. 2018). CRISPR system applications in plants are listed in Table 13.3.

Table 13.3 Applications of CRISPR/Cas systems in plant manipulation

13.3 Comparison of Different Genome Editing Techniques

Nearly all organisms can be edited with the latest genome editing techniques. When DSBs are generated and subsequently repaired through the NHEJ pathway, DNA sequences can be disrupted and indels introduced at the cleavage point. If two sites on the same chromosome are cleaved by nucleases delivered simultaneously into the same cell, the intervening fragment can be deleted. Moreover, if a DNA repair template enters the same cell together with the nuclease, the template is copied by the HDR pathway. This mechanism results in the exchange of some base pairs and/or the addition of an expression cassette.
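The three repair outcomes described above (an indel from NHEJ, deletion of the fragment between two cuts, and template copying by HDR) can be mimicked on plain strings. The Python sketch below is a toy model under strong simplifying assumptions: sequences are single strands, cut positions are given as string indices, and all function names and the example sequence are hypothetical.

```python
def nhej_indel(seq: str, cut: int, deleted: int = 0, inserted: str = "") -> str:
    """Rejoin the ends at `cut`, losing `deleted` bases and gaining `inserted`."""
    return seq[:cut] + inserted + seq[cut + deleted:]


def two_cut_deletion(seq: str, cut_a: int, cut_b: int) -> str:
    """Remove the fragment lying between two cleavage points."""
    first, second = sorted((cut_a, cut_b))
    return seq[:first] + seq[second:]


def hdr_insert(seq: str, cut: int, left_arm: str, payload: str, right_arm: str) -> str:
    """Copy `payload` into the cut site if both homology arms match the flanks."""
    if not seq[:cut].endswith(left_arm) or not seq[cut:].startswith(right_arm):
        raise ValueError("homology arms do not match the sequence around the cut")
    return seq[:cut] + payload + seq[cut:]


GENOME = "AAACCCGGGTTT"  # hypothetical locus, cut between CCC and GGG

print(nhej_indel(GENOME, cut=6, deleted=2))          # AAACCCGTTT
print(two_cut_deletion(GENOME, 3, 9))                # AAATTT
print(hdr_insert(GENOME, 6, "CCC", "ATG", "GGG"))    # AAACCCATGGGGTTT
```

The payload stands in for either a few exchanged base pairs or a whole expression cassette; real repair outcomes are stochastic, which this deterministic sketch does not attempt to model.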

During the last decade, applications of CRISPR systems have revolutionized basic and applied research in agriculture and medicine. Because all three genome editing tools work on the same principle, none of ZFNs, TALENs, or CRISPR can at present be singled out as the best performing tool. The most suitable platform depends on the availability of resources and on applicability criteria such as the target tissues and the delivery process. CRISPR and TALENs are now being applied to all those organisms for which ZFNs were established nearly 15 years ago. This work involves knock-in, knock-out, or gene modification in numerous organisms including cows, C. elegans, Drosophila melanogaster, human stem cells and cell lines, mice, monkeys, pigs, plants, rats, and zebrafish. Genetic alterations have been used for a range of applications such as the development of insect- and pest-resistant crops, the study of genomic functions, the development of genetically modified animals, the production of biopharmaceuticals by engineering cell lines and tissues, gene therapy, and combating numerous genetic disorders.

So far, ZFNs, TALENs, and the CRISPR system have mainly been used as nucleases for genome editing. An equally important application of these tools is the regulation of gene expression and the modification of the epigenome. Just as the DNA binding domains of ZFNs and TALENs are fused with the FokI domain to obtain nuclease activity, they can be fused with an activator, a repressor, or another enzymatic domain to modulate gene expression and to direct histone modification and DNA methylation. Likewise, the nuclease activity of the CRISPR system can be altered by two point mutations, and the gRNA or the protein scaffold can be fused with a gene regulating domain to alter the target gene. Together, the three tools provide a comprehensive toolbox for redesigning almost any type of organism in scientific research and medical therapies (Fig. 13.2).

Today’s era of genome editing with ZFNs, TALENs, and the CRISPR system has enabled researchers to explore the enormous amount of hidden information stored in an organism’s genome. These tools catalyze basic research and help identify the genetic basis of numerous diseases such as cardiovascular diseases, diabetes, and neurological disorders. They have fostered agricultural and pharmacological improvements and have addressed technical and scientific challenges that have hindered the advancement of gene therapy for decades. This will bring a genomic revolution to the advancement of plants, animals, and the human world.

13.4 CRISPR/Cas: A Splendid Gift from Nature

The CRISPR revolution shows no sign of slowing. This remarkable immune system of prokaryotes seems almost to have been designed by nature for genome engineering, with remarkable flexibility and ease of scaling up and multiplexing. Prokaryotes use the CRISPR system as their adaptive defense against viral attack, and it is emerging as a powerful genome editing tool that eclipses other gene editing tools such as ZFNs and TALENs. CRISPR was originally discovered in 1987 in the genome of E. coli, but its role as an inherent immune system against viral attack was revealed only in 2007. Researchers proposed that CRISPR, as a defense system, makes use of CRISPR-associated genes (Cas genes) that help store information about attacking phages and prepare the bacteria for the next attack (Makarova et al. 2015). In 2012, CRISPR was identified as a programmable tool for targeted genome editing (Jinek et al. 2012). In 2013, CRISPR-based genome editing was applied in mammalian cell cultures (Cong et al. 2013; Mali et al. 2013). By 2018, the CRISPR-based publications listed in PubMed dealt with improving CRISPR specificity, multiplexing, and orthogonality in numerous organisms and with developing new functions. The progress of CRISPR in just 5 years was remarkable, with steeply rising interest that extended to induced pluripotent stem cells and RNA interference. Customized gRNAs were developed for numerous transcriptional, translational, and therapeutic purposes. The CRISPR revolution proved fruitful, and the development of the CRISPR system was recognized with the Nobel Prize in 2020: Professor Jennifer A. Doudna and Professor Emmanuelle Charpentier were awarded the Nobel Prize in Chemistry for their breakthrough in establishing the CRISPR system as the most robust and powerful genome editing tool.

CRISPR reagents could be delivered in the form of plasmids, in-vitro transcribed mRNA or RNPs for genome editing. In addition, different delivery methods such as viral, non-viral and physical methods are available for efficient delivery of genome editing reagents.  Different delivery tools used for CRISPR reagents are elaborated in Table 13.4.

Table 13.4 Comparison of different delivery tools for CRISPR reagents

13.5 Emerging CRISPR/Cas Systems

  (a) CRISPR/Cas9

CRISPR/Cas9 belongs to the type II CRISPR systems, which comprise the II-A, II-B, and II-C subtypes. CRISPR/Cas9 is the first characterized system that uses a single protein as the Cas effector. Cas9 makes blunt-ended cuts through both DNA strands, which are repaired by NHEJ or, in the presence of a DNA template, by HDR for site-specific editing. Among CRISPR/Cas9 systems, type II-A has the highest specificity for genome editing. Off-target effects at various genomic locations make it somewhat less advantageous than other CRISPR systems; however, this CRISPR system has been customized to reduce off-target effects. Among the type II CRISPR systems, type II-C has naturally high fidelity. CRISPR/Cas9 systems originate from Streptococcus thermophilus, S. pyogenes, Staphylococcus aureus, Campylobacter jejuni, and Neisseria meningitidis. The Cas9 protein is about 1000 to 1600 aa in size. The guide RNA spacer ranges from 18 to 24 nucleotides, and the total gRNA length is about 100 nucleotides. Protospacer adjacent motif (PAM) specificity varies among CRISPR/Cas9 systems, for example 3′ NGG (SpCas9), 3′ NNNNGATT (NmCas9), and 3′ NNGRRT (SaCas9) (Fig. 13.9).

Fig. 13.9
An illustration depicts Cas9, in which the following parts are labeled. Protospacer, PAM, crRNA, and tracrRNA.

Schematic diagram of CRISPR-Cas9. CRISPR/Cas9 with crRNA and tracrRNA binds and cleaves double-stranded DNA upstream of PAM together with protospacer
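The PAM requirements listed for the different Cas9 proteins lend themselves to a simple search. The Python sketch below scans one strand of a sequence for protospacers followed by a 3′ PAM, translating the degenerate letters N and R into regular-expression character classes. It is a minimal illustration, not a guide design tool: the function name `find_protospacers`, the default spacer length of 20 nucleotides (within the 18–24 range given above), and the example sequence are assumptions, and only the given strand is searched.

```python
import re

# Degenerate nucleotide letters used by the PAMs in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}

# 3' PAM motifs as listed in the text.
PAMS = {"SpCas9": "NGG", "NmCas9": "NNNNGATT", "SaCas9": "NNGRRT"}


def find_protospacers(seq: str, pam: str, spacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every site on the given strand.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pam_regex = "".join(IUPAC[letter] for letter in pam.upper())
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq.upper())]


# Hypothetical target: a 20-nt protospacer followed by TGG and one extra base.
EXAMPLE = "ACGTACGTACGTACGTACGT" + "TGG" + "A"
print(find_protospacers(EXAMPLE, PAMS["SpCas9"]))
# [(0, 'ACGTACGTACGTACGTACGT', 'TGG')]
```

Searching the opposite strand would amount to running the same function on the reverse complement of the sequence.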

  (b) Cas12

CRISPR/Cas12 comprises the type V-A (Cpf1) and type V-B (C2c1) CRISPR systems. It originated from Acidaminococcus species, Francisella novicida, Lachnospiraceae species, and Prevotella species. The protein is about 1100 to 1300 aa in size, and the guide RNA spacer ranges from 18 to 25 nucleotides. The total gRNA length is between 42 and 44 nucleotides. The PAM site is 5′ TTTN for FnCas12a. CRISPR/Cas12 is an efficient system that leaves 5′ overhangs at the double-strand break. Cas12 processes its own guide RNAs, whether single or multiple. Cas12 can also be used to target epigenomes, and it can cleave ssDNA (single-stranded DNA) once it has been activated by a target sequence complementary to its spacer sequence. This ability to identify small DNA fragments in a mixture of DNAs of varied sizes makes Cas12 a particularly powerful tool (Fig. 13.10).

Fig. 13.10
Two illustrations. Top: depicts Cas12a, in which the following parts are labeled. Protospacer, PAM, and crRNA. Bottom: depicts Cas12b, in which the following parts are labeled. Protospacer, PAM, crRNA, and tracrRNA.

Schematic diagram of CRISPR-Cas12a and 12b. CRISPR-Cas12 binds and cleaves double-stranded DNA upstream of PAM together with protospacer. CRISPR-Cas12a only requires crRNA. CRISPR-Cas12b requires both crRNA and tracrRNA for its activity

  (c) Cas13

CRISPR/Cas13 is a type VI CRISPR/Cas system and is divided into the VI-A (C2c2), VI-B, VI-C, and VI-D (CasRx) subtypes. It mainly originated from Bergeyella zoohelcum, Leptotrichia buccalis, L. shahii, Listeria seeligeri, Prevotella buccae, and Ruminococcus flavefaciens. The protein ranges from 900 to 1300 aa in size, and the gRNA spacer ranges from 22 to 30 nucleotides. The total gRNA length ranges from 52 to 66 nucleotides. The flanking sequence specificity is 3′ H (LshCas13a), none (RfCas13d), and 3′ NNA/NAN (BzCas13b). Cas13 cleaves ssRNA (single-stranded RNA) rather than DNA. When an ssRNA with a sequence complementary to its crRNA activates Cas13, the enzyme can target RNA non-specifically, cleaving all nearby RNAs irrespective of their size and sequence. This property of Cas13 has been exploited in precision diagnostics, RNA knockdown, and multiplexed RNA targeting in mammalian cell lines. Cas13 is known for its substantial utility in gene expression analysis without causing permanent changes in the genomic sequence (Fig. 13.11).

Fig. 13.11
Two illustrations. Top: depicts C a s 13 a, in which the following parts are labeled. Protospacer, P F S, c r R N A. It carries direct repeat at 5 apostrophe end. Bottom: depicts C a s 13 b, in which the following parts are labeled. Protospacer, P F S, c r R N A. It carries direct repeat at 3 apostrophe end.

Schematic diagram of CRISPR-Cas13a and 13b. CRISPR-Cas13a and 13b, together with crRNA, bind and cleave single-stranded RNA. Complementarity between crRNA and protospacer, together with the protospacer flanking sequence (PFS), results in the cleavage of RNA. Cas13a carries direct repeat at 5′ end. Cas13b carries direct repeat at 3′ end
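
The 3′ H rule quoted above for LshCas13a (H meaning A, C, or U, i.e., any base except G) can be illustrated with a short Python sketch that scans an RNA for protospacers followed by a permitted flanking base and reports the complementary crRNA spacer. The function names, the default spacer length of 28 (a value inside the 22 to 30 nucleotide range given in the text), and the toy sequence are assumptions for illustration only. The input is assumed to contain only A, C, G, and U (or T).

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(rna):
    """Reverse complement of an RNA string made of A, C, G, U."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_lshcas13a_targets(rna, spacer_len=28):
    """Return (start, protospacer, pfs, spacer) for sites whose 3' flanking base is H."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    # Stop one base early so that every protospacer has a 3' flanking base.
    for start in range(len(rna) - spacer_len):
        protospacer = rna[start:start + spacer_len]
        pfs = rna[start + spacer_len]
        if pfs in "ACU":  # H = not G
            hits.append((start, protospacer, pfs, reverse_complement_rna(protospacer)))
    return hits


if __name__ == "__main__":
    # Toy transcript with a short spacer length so the output stays readable.
    for hit in find_lshcas13a_targets("AUGCGAUGCA", spacer_len=4):
        print(hit)
```

For subtypes with no flanking requirement, such as RfCas13d in the text, the `pfs` check would simply be dropped.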

(d) CasX

A new CRISPR system able to edit the human genome was identified and described in Nature in 2019. Compared with other CRISPR systems, it is distinguished by having the smallest size, minimal trans-cleavage activity, and very high gRNA content. An earlier metagenomic analysis of groundwater samples had identified the CasX protein, which interfered with bacterial transformation, through an unknown mechanism, when expressed together with an RNA complementary to the transforming construct. This observation, along with the apparent similarity of CasX to other Cas proteins, pointed to an endonuclease activity. CasX generates a staggered dsDNA break in DNA carrying a 20-nucleotide segment exactly complementary to its gRNA (Liu et al. 2019) (Fig. 13.12).

Fig. 13.12
An illustration depicts C a s X, that has P A M, c r R N A, t r a c r R N A. A loop is between c r R N A, and t r a c r R N A.

Schematic diagram of CRISPR-CasX. RNA-dependent plasmid with two natural RNAs including crRNA and tracrRNA binds and cleaves double-stranded DNA

(e) Cas14

CRISPR/Cas14 is another new genome editing system, with the advantage of being smaller and simpler than other CRISPR systems. It is being used broadly in diagnostics (infectious and non-infectious). Professor Jennifer Doudna and her team searched for other forms of CRISPR systems by compiling a database of metagenomes and microbial genomes. Cas14 was thereby identified as a family of small proteins of 40–70 kDa. They are almost half the size of other Cas proteins, with 400 to 700 amino acids. Cas14 has 24 variants, which are grouped into 3 categories: Cas14a, Cas14b, and Cas14c. Despite their diverse sequences, all Cas14 proteins share a conserved RuvC cleavage domain. A clear difference between Cas14 and other Cas proteins is that it is found only in archaea and not in bacteria, which suggests that it is more primitive than the other Cas proteins. Being smaller, it is known as a stand-alone CRISPR system. Cas14 is able to detect dsDNA, ssDNA, and RNA, and it shows high fidelity in discriminating SNPs (single nucleotide polymorphisms). Cas14 has applications in microbial infection diagnostics and cancer therapeutics (Aquino-Jarquin 2019; Harrington et al. 2018; NIH 2017) (Fig. 13.13).

Fig. 13.13
An illustration depicts C a s 14, in which the parts labeled are as follows. Protospacer, c r R N A, t r a c r R N A, 5 apostrophe and 3 apostrophe.

Schematic diagram of CRISPR-Cas14. CRISPR-Cas14 together with crRNA and tracrRNA binds and cleaves single-stranded DNA without PAM recognition
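
The two size figures given above for Cas14 (400 to 700 amino acids and 40–70 kDa) can be cross-checked with a back-of-the-envelope calculation. The Python sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this chapter, and the function name `approx_mass_kda` is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110  # assumed rule-of-thumb average mass per amino acid residue


def approx_mass_kda(n_residues, avg_residue_da=AVG_RESIDUE_MASS_DA):
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * avg_residue_da / 1000


if __name__ == "__main__":
    for n_residues in (400, 700):
        print(n_residues, "aa is roughly", approx_mass_kda(n_residues), "kDa")
```

Under this assumption the estimates come to about 44 and 77 kDa, which is broadly in line with the quoted range; actual masses depend on the amino acid composition of each variant.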

13.6 CRISPR/Cas Tools for Engineering Abiotic Stress Tolerance in Soybean

There is a dire need to increase agricultural productivity, and to reduce the stresses that limit it, in order to feed the world’s ever-increasing population. To cope with food security problems, innovative breeding methods that can boost agricultural production must be applied. Genome sequencing coupled with genome editing technologies has opened new horizons for biologists to edit the genome of almost any crop according to the world’s need. First-generation genome editing tools, i.e., ZFNs, meganucleases, TALENs, and other site-specific nucleases, can edit almost any gene in plants, but these tools were costly, tedious, and cumbersome. The emergence of second-generation genome editing tools, i.e., CRISPR, offers a precise and highly efficient means of targeting the genomes of almost all crops. CRISPR-based genome editing has transformed agriculture by acting as an efficient tool for regulation of gene expression, imparting viral resistance in crops, generation of mutant libraries, and crop improvement by integrating new traits (Singh et al. 2020) (Fig. 13.14).

Fig. 13.14
An illustration depicts a plant in the center, abiotic stress on the left, and secondary stresses on the right. Abiotic stress includes heat, cold, heavy metals, salinity, drought, and flooding. Secondary stresses include oxidative stress, ionic stress, mechanical stress, and osmotic stress. Related factors are depicted under Secondary stresses.

Schematic diagram of abiotic stress tolerance. Plant response to abiotic stress which includes signaling pathways is important for the survival of the plant

The soybean genome was successfully edited by the CRISPR system at the endogenous Gm-FE-12 and Gm-SHR genes using 6 gRNAs and at the exogenous bar gene using a single gRNA (Cai et al. 2015). Similarly, chromosome 4 of soybean was targeted at two genomic loci, i.e., DD-20 and DD-43, yielding small indels (Li et al. 2015). Moreover, the Glycine max U6 promoter (Gm-U6-16-1) can be used to edit numerous homeotic genes efficiently (Du et al. 2016). The soybean nodulation restriction gene (Rj-4) was edited by the CRISPR system to inhibit nodulation (Tang et al. 2016). A virulence gene (Avr4/6) of the soybean pathogen Phytophthora sojae was characterized by mutating it with CRISPR (Fang and Taylor 2016). Delayed flowering under both short-day and long-day conditions was induced by mutating the Gm-FT-2 gene (soybean flowering time-2 gene) with the CRISPR system (Cai et al. 2018). A summary of the crops edited with the CRISPR/Cas system for abiotic stress tolerance is given in Table 13.5.

Table 13.5 Summary of abiotic stress resistance crops developed through CRISPR-Cas9

13.7 Applications of CRISPR for Abiotic Stress Tolerance in Soybean

Soybean is a crop of high economic value, being rich in oil and protein. With the ever-increasing demand for high-quality oil and higher protein content, it is important to investigate gene functions and to accelerate breeding procedures for crop improvement and better yield. CRISPR technology now provides a precise tool for targeted genome editing in crops, with a broad range of applications (Fig. 13.15) including reverse genetics, gene knock-out/knock-in, editing of multiple genes, gene deletion, and gene replacement (Gratz et al. 2013, 2014; Feng et al. 2013, 2014; Mao et al. 2013; Zhou et al. 2014). Several plant species have been edited by CRISPR, including Arabidopsis, barley, cotton, maize, rice, wheat, and tobacco (Li et al. 2013; Kapusi et al. 2017; Gao et al. 2017; Chen et al. 2018; Shan et al. 2013; Upadhyay et al. 2013; Gao et al. 2015). In soybean, the CRISPR/Cas system was used to knock out GFP (green fluorescent protein) and to modify nine endogenous genes (Jacob et al. 2015).

Fig. 13.15
An illustration depicts a plant, for which genome editing is performed. T r u, g R N A is targeted to O S T 2 slash A H A 1 gene. The changes in the leaf due to genome editing and a graph that compares the wild type and O S T 2, C R I S P R 1 are also depicted.

CRISPR/Cas-based genome editing in plants to improve abiotic stress tolerance: CRISPR-Cas9-based genome editing for the generation of ost2 mutant allele by using tissue-specific AtEF1 promoter for the expression of Cas9. The mutant Arabidopsis ost2-crispr-1 showed enhanced stomatal response to abiotic stress (drought)

Soybean has been extensively edited by CRISPR (Cai et al. 2015, 2018; Du et al. 2016; Jacob et al. 2015; Li et al. 2015, 2019a, b; Michno et al. 2015; Sun et al. 2015; Tang et al. 2016). The CRISPR system is an important genome editing tool in soybean that has accelerated breeding progress and improved soybean quality. CRISPR/Cas editing depends on transformation, however, and soybean has very low transformation efficiency, which is further limited by its dependence on receptor genotypes (Du et al. 2016). Only a few soybean varieties are suitable for transformation (Guo et al. 2015; Donaldson and Simmonds 2000). In addition, soybean is a short-day plant with high sensitivity to photoperiod, which restricts its geographical cultivation and limits breeding productivity and the development of superior varieties (Wang et al. 2016a, b; Xu et al. 2013). Because soybean varieties are adapted to particular photoperiods and latitudes, almost all of them would benefit from the introduction of photoperiod insensitivity. This adaptability in soybean is controlled by some QTLs and numerous major genes (Watanabe et al. 2012). Currently, eleven genes (E-1, E-2, E-3, E-4, E-5, E-6, E-7, E-8, E-9, E-10, and J) related to the soybean growth period are potential targets for improvement through the CRISPR system (Bernard 1971; Bonato and Antonio 1999; Buzzell and Voldeng 1980; Buzzell 1980; Cober et al. 2010; Kong et al. 2014; Lu et al. 2017; Mcblain et al. 1987; Mcblain and Bernard 1987; Ray et al. 1995; Yue et al. 2017). The CRISPR/Cas9 system has been used to eliminate detrimental DNA sequences by targeting specific genes. In addition, crops can be made clean, or transgene free, by eliminating Cas9 or selectable markers through segregation in selfed progeny, resulting in clean gene crops (Chen et al. 2018; Cai et al. 2018). Off-target activity can also be reduced by carefully selecting the target sequence (Xie et al. 2014; Xu et al. 2015).
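
One simple way to picture the target-selection step mentioned above is to count how many other sites in a sequence closely resemble a candidate guide. The Python sketch below is a minimal illustration under stated assumptions: it compares the guide with every window on one strand by Hamming distance and ignores PAM requirements, mismatch position, and the reverse strand. The function names, the mismatch threshold, and the toy sequences are hypothetical and are not taken from the cited studies.

```python
def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def near_matches(guide, genome, max_mismatches=3):
    """Return (position, site, mismatches) for windows within max_mismatches of the guide."""
    guide, genome = guide.upper(), genome.upper()
    k = len(guide)
    hits = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        n = count_mismatches(guide, window)
        if n <= max_mismatches:
            hits.append((i, window, n))
    return hits


if __name__ == "__main__":
    # Toy example: one perfect match plus two sites that differ by a single base.
    for hit in near_matches("ACGTAC", "TTACGTACGGACGTTCAA", max_mismatches=1):
        print(hit)
```

A guide whose only hit is the intended site (zero mismatches) would be preferred over one with several near matches elsewhere.
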
Site-directed mutagenesis laid a significant foundation for editing the soybean genome by CRISPR (Jacob et al. 2015). Soybean quality was improved by introducing a clean, transgene-free soybean variety with a high oleic acid content (Huan et al. 2014). Transgene-free soybean with delayed flowering was developed by using the CRISPR/Cas system to knock out the Gm-FT2a gene (Cai et al. 2018). A summary of the crops developed through the CRISPR/Cas system for abiotic stress tolerance is given in Table 13.5.
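
The expected yield of transgene-free plants from selfing can be estimated with simple Mendelian arithmetic. The sketch below assumes, purely for illustration, a primary transformant carrying a single hemizygous Cas9 insertion, so that one quarter of the selfed progeny are expected to lack the transgene. The function names and the 95% threshold are hypothetical choices, not values from the studies cited.

```python
import math


def prob_at_least_one_null(n_progeny, p_null=0.25):
    """Probability that at least one of n progeny lacks the transgene."""
    return 1 - (1 - p_null) ** n_progeny


def progeny_needed(confidence=0.95, p_null=0.25):
    """Smallest number of progeny giving at least `confidence` of one null segregant."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_null))


if __name__ == "__main__":
    n = progeny_needed(0.95)
    print(n, "progeny give a probability of", round(prob_at_least_one_null(n), 3))
```

With multiple unlinked insertions the transgene-free fraction falls, for example to 1/16 for two loci, which can be modelled by passing a smaller `p_null`.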

13.8 Engineering Biotic Stress Tolerance in Soybean Through CRISPR

Numerous disease-resistant crops have been developed using CRISPR/Cas technology. In 2016, resistance to rice blast disease was developed in Oryza sativa by targeting the OzERF-922 gene (Wang et al. 2016a, b). Mutant rice lines were selected by segregation in the T1 and T2 generations, and the resulting lines developed fewer disease lesions on pathogen attack (Wang et al. 2016a, b). Triticum aestivum (hexaploid bread wheat) was targeted at the mildew resistance locus (MLO) homeoalleles using site-specific endonucleases (Wang et al. 2014). Citrus sinensis was targeted for resistance to citrus canker disease, caused by Xanthomonas citri subspecies citri (Xcc) (Peng et al. 2017). Disease resistance was enhanced by using the CRISPR/Cas system to modify the promoter of the lateral organ boundaries gene (CsLOB-1). Deletion of the entire sequence (EBEpthA-4) from both alleles of CsLOB-1 conferred maximum resistance on Wanjincheng oranges. Stacking multiple genes in a single CRISPR transgene led to the targeted disruption of numerous viruses (Iqbal et al. 2016). A summary of different crops engineered with CRISPR/Cas for biotic stress tolerance is presented in Table 13.6.

Table 13.6 Summary of biotic stress resistance crops developed through CRISPR-Cas9

13.8.1 Future Prospects

CRISPR/Cas technology is progressing at an exceptional rate. The gene silencing and gene knockouts achieved so far through NHEJ are limited by their lack of precision. HDR-based gene replacement/gene knock-in has shown promising results in numerous plants and mammalian cell lines. Homology directed repair has been challenging, especially in plants, because of inefficient delivery of the donor sequence into transformed plant cells (Puchta and Fauser 2014; Steinert et al. 2016). Nevertheless, numerous successful HDR-based applications of CRISPR have been reported (Collonnier et al. 2017; Gil-Humanes et al. 2017). Apart from gene editing, CRISPR technology is used in molecular biology, cell biology, and functional genomics, including the study of gene modules, loss- or gain-of-function analysis of individual genes, and regulation of gene expression. Some key applications of the CRISPR system are as follows, some of which are yet to be applied in plants:

  (a) Site-specific gene integration
      (i) Gene knock in by non-homologous end joining (NHEJ)
      (ii) Knock in by homologous recombination (HR)
      (iii) Fusion of GFP with native genes
      (iv) Cas9 gene splitting

  (b) Clean gene technology
      (i) Editing of RNP
      (ii) Editing of viral encoding genes
      (iii) Selfing and crossing

  (c) Imparting resistance against virus
      (i) Viral genome disintegration
      (ii) Viral genome cleaning
      (iii) Removal of RNPs

  (d) Regulation of gene expression
      (i) Regulation of transcription
      (ii) Regulation of translation

  (e) Manipulation of structure, function, and number of chromosomes
      (i) Addition
      (ii) Deletion
      (iii) Translocation

  (f) Screening of functional genomes
      (i) Repressor
      (ii) Activator
      (iii) Enhancer