1 Introduction

The modifications (like insertions, deletions, and substitutions) in the genomes of organisms are commonly referred to as genome engineering (El-Mounadi et al. 2020). Genome or gene editing techniques are used for genome engineering to incorporate site-specific modifications into any genomic DNA, making use of different DNA repair mechanisms found endogenously. Gene editing usually deals with one target gene, i.e., a single gene is modified, whereas genome editing refers to large-scale modifications of the complete genomic DNA (Robb 2019). This technology has addressed the unmet need for the tools to introduce different types of genetic modifications that can cause a change in the physical as well as genomic traits of an individual or a population. Currently, scientists are making headway in developing gene-therapy treatment strategies by employing these advanced genome editing tools to prevent and treat various diseases in humans and animals. A breakthrough in the field occurred when Capecchi in 1989 for the first time demonstrated that the introduction of a segment of DNA, having homologous arms at both ends, into embryonic stem cells allowed its integration into the host genome via homologous recombination, causing inheritable changes in the cell (Capecchi 1989). Later, the discovery and development of methods to introduce an artificial DNA restriction enzyme into cells, which can cut genomic or dsDNA and generate a double-stranded break (DSB) at specific recognition sites, increased our ability to use cellular repair systems for genome engineering (Zhang et al. 2011). The mechanism of action for site-directed nucleases is based on the site-specific cleavage of the DNA or induction of a double-stranded break/nick (DSB) at targeted regions of DNA sequence by nucleases followed by triggering the two prominent DNA repair pathways of the cell, i.e., homology-directed repair (HDR) and nonhomologous end joining (NHEJ) (Fig. 9.1). 
The HDR mechanism uses homologous donor DNA to repair DNA damage, whereas NHEJ is an error-prone mechanism in which the broken ends of DNA are joined together directly, often resulting in a heterogeneous pool of insertions or deletions (indels). Although NHEJ is one of the most efficient repair mechanisms active in cells, its high mutation rate makes such indels frequent. HDR is less efficient because it requires high sequence similarity between the damaged and donor DNA strands. In the HDR mechanism, there is also always a chance of reversion if the DNA template used is identical to the original undamaged DNA (Jasin and Rothstein 2013; Ghezraoui et al. 2014). The development of various techniques for genome engineering has focused on the use of different endonucleases (to create DSBs with high precision), followed by the use of these repair mechanisms to develop an engineered genome with new properties (Mandip and Steer 2019; Khalil 2020) (Fig. 9.1).

Fig. 9.1
A schematic shows the flow from a double-stranded break, which bifurcates into non-homologous end joining repair, leading to the insertion of random indels, and homology-directed repair, leading to precise repair through correction, crossover, and ligation.

An overview of the mode of action of ZFNs, TALENs, and the CRISPR/Cas9 system: ZFNs, TALENs, and CRISPR/Cas9 nucleases induce double-strand breaks (DSBs) in a gene at a targeted location, which can be repaired by either NHEJ or HDR. NHEJ-mediated repair leads to the introduction of variable-length indel mutations. HDR with double-stranded DNA “donor templates” can lead to the introduction of precise nucleotide substitutions or insertions

2 Tools/Methods for Genome Engineering

The invention of genome editing tools opened up a whole set of opportunities for assisting the treatment of various diseases at the genome level (National Human Genome Research Institute 2019). Though various tools or methods for DNA modification have already existed for several decades, the development of more precise methods has made genome editing much cheaper, faster, and more efficient. Every genome editing tool being employed so far is based on one mechanism in which a targeted broken portion of DNA sequence in a gene or genome activates the cell repair mechanism that repairs the break in DNA sequence, i.e., HDR and NHEJ for DNA repair (National Institute of Health 2020). These tools and techniques allow efficient and accurate changes in genomic DNA by introducing DSB at a specific or targeted site in DNA followed by known modifications (i.e., insertion, deletion, indels, etc.). Currently, several genome engineering or genome/gene editing techniques exist which are primarily based on the following tools (nucleases) to target-specific sequences (molecular scissors): (1) zinc finger nucleases (ZFNs) (Porteus and Baltimore 2003; Miller et al. 2007; Sander et al. 2011; Wood et al. 2011), (2) transcription activator-like effector nucleases (TALENs) (Boch et al. 2009; Moscou and Bogdanove 2009; Christian et al. 2010; Hockemeyer et al. 2011; Wood et al. 2011; Zhang et al. 2011; Reyon et al. 2012; Sanjana et al. 2012), and (3) the RNA-guided CRISPR-Cas nuclease system (Deveau et al. 2010; Horvath and Barrangou 2010; Bhaya et al. 2011; Makarova et al. 2011; Cho et al. 2013; Cong et al. 2013; Jinek et al. 2013; Mali et al. 2013) (Table 9.1). The tools used for making sequence-specific cuts in the genome for the genome editing tool are briefly described below.

2.1 Zinc Finger Nucleases (ZFNs)

The first endonucleases used for genome engineering were zinc finger nucleases (ZFNs), which are composed of zinc finger domains fused with the FokI endonuclease (Kim et al. 1996). ZFNs are members of the zinc finger protein (ZFP) family, in which the zinc fingers (ZFs) are DNA-binding domains that can bind to discrete base sequences. These ZFs are Cys2-His2 fingers, and each ZF can recognize a triplet (3 bp) of DNA sequence (Miller et al. 1985; Wolfe et al. 2000). The ZFNs used for genome engineering comprise a tandem array of ZFs, also known as the ZF array, that confers unique nucleotide sequence-binding specificity. When two ZFNs bind to opposite DNA strands, their FokI endonuclease domains dimerize, which allows cleavage of the dsDNA at the target site (Fig. 9.2). For genome editing, two recombinant ZFNs are employed, each recognizing one of two closely located nucleotide sequences within the target DNA; together, with the help of FokI, they create a double-strand break (DSB) at the desired target sequence. Since the series of linked ZF domains (the ZF array) determines the specificity for the target nucleotide sequence, any desired sequence may be targeted by changing the array of ZFs. A certain degree of off-target effects (nonspecific cleavage of undesired sequences) sometimes occurs when the employed pair of ZFNs does not recognize the desired target sequence exclusively. Adding more fingers per ZFN is recommended to minimize off-target effects and to specify rarer and longer target cleavage sites.
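The effect of adding fingers can be illustrated with simple arithmetic. The following is a minimal sketch, not a design tool: it assumes that each finger reads exactly one 3-bp triplet, that the four bases are equally frequent and independent, and that only one strand orientation is counted. The function names and the genome size are hypothetical and chosen only for illustration.

```python
# Illustrative arithmetic only. Assumptions (not from the chapter): each zinc
# finger reads exactly one 3-bp triplet, bases are independent and equally
# frequent, and only one strand orientation is counted.

BP_PER_FINGER = 3  # one Cys2-His2 finger recognizes one DNA triplet


def zfn_pair_site_length(fingers_per_zfn: int) -> int:
    """Total base pairs read by a pair of ZFNs (the spacer is excluded)."""
    if fingers_per_zfn < 1:
        raise ValueError("a ZFN needs at least one zinc finger")
    return 2 * fingers_per_zfn * BP_PER_FINGER


def expected_random_sites(fingers_per_zfn: int, genome_size_bp: float) -> float:
    """Expected number of chance matches to the composite site in a genome."""
    site_length = zfn_pair_site_length(fingers_per_zfn)
    return genome_size_bp / 4 ** site_length


# Hypothetical genome size, roughly human-scale, used only as an example.
EXAMPLE_GENOME_BP = 3.0e9

for n_fingers in (3, 4, 5, 6):
    length = zfn_pair_site_length(n_fingers)
    chance = expected_random_sites(n_fingers, EXAMPLE_GENOME_BP)
    print(f"{n_fingers} fingers per ZFN: {length} bp site, {chance:.2e} chance matches")
```

Under these simplified assumptions, each extra finger per ZFN lengthens the composite site by 6 bp and lowers the expected number of chance matches by a factor of 4^6 = 4096. Real genomes are not random sequences, so the numbers indicate a trend rather than a prediction.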

Fig. 9.2
A diagram of a zinc finger nuclease targeting the sequence NGG. The ZFN is made up of four zinc finger arrays and two FokI nucleases. The zinc finger arrays bind to the DNA sequence, and the FokI nucleases cut the DNA on both strands, creating a double-strand break.

Illustration of a pair of ZFNs bound to targeted nucleotide sequence: Zinc fingers are shown as ZF, with short circles indicating binding with the DNA base pairs. FokI cleavage domains are shown as shaded boxes, with common cleavage sites, spaced by N bp, and indicated by vertical arrows as ZFN-induced DSB. Zinc fingers are numbered from the N-terminus. The linker between the binding and cleavage domains of one protein is labeled. The spacer between the zinc finger-binding sites is 5–7 bp in this case

Table 9.1 Target specificity, mechanism of action, and experimental design of commonly used gene/genome editing nucleases

The FokI domains of ZFNs are key to their successful application, as they carry features that help in the cleavage of a complex genome at a specific target. FokI dimerization is crucial for the cleavage of dsDNA. Because the interaction between FokI monomer domains is weak, cleavage requires two ZFNs to bind independently at adjacent, appropriately spaced sites in the correct orientation so that a catalytically active dimer can form (Miller et al. 1985; Vanamee et al. 2001; Szczepek et al. 2007) (Fig. 9.2). ZFN-based genome editing depends mainly on the ability of the endonuclease to create a site-specific double-strand break (DSB) at the locus of interest. In all eukaryotic cells, the DSBs generated by ZFNs are efficiently repaired by the NHEJ or HDR pathway (Szczepek et al. 2007; Lieber 2010; Moynahan and Jasin 2010) (Fig. 9.1).

Different strategies have been reported for the synthesis of ZFNs of desired DNA-binding specificity by “modular assembly” of different ZFs that have unique triplet base specificities (Segal et al. 2003; Sander et al. 2010; Thakore and Gersbach 2015). The ZFs developed for the modular assembly had been mostly for triplet sequences only (Choo and Klug 1994; Jamieson et al. 1994; Rebar and Pabo 1994; Segal et al. 1999, 2003; Dreier et al. 2001, 2005; Bae et al. 2003; Thakore and Gersbach 2015). The modular assembly of ZF components led to the generation of active ZFNs with specificity to a large number of endogenous sequences (Kim et al. 2009; Remy et al. 2010; Gaj et al. 2013b; Gupta and Musunuru 2014; Shiva and Suma 2019). Apart from the modular assembly approach, several other alternative strategies have also been developed for making ZFPs (Wu et al. 2007, 2013; Chandrasegaran and Carroll 2016; Paschon et al. 2019). These new approaches were focused on accommodating the deviation from strict functional modularity (like many natural and designed fingers can only contact with the adjacent ZF and to bases present outside of their proximal DNA triplet) which was observed for many of the ZF and making them specific (Fairall et al. 1993; Pavletich and Pabo 1993; Houbaviy et al. 1996; Nolte et al. 1998; Wolfe et al. 2001; Segal et al. 2006). These approaches could permit more selective binding and reduce the complications and wasted efforts that occur in modular designing for producing new ZFPs (Ramirez et al. 2008; Chandrasekharan et al. 2009; Chandrasegaran and Carroll 2016; Paschon et al. 2019).

Whatever may have been the methods used for designing ZFNs module, they were always first evaluated in vitro for their affinity and specificity toward the target DNA sequence followed by their application in vivo system. It is done as there is always a possibility that ZFNs/ZFPs which are validated in vitro could fail in performing the genome editing in vivo (Urnov et al. 2010; Wang et al. 2013a; Paschon et al. 2019). Many times, it may arise from the complexity of the genome which sometimes contains multiple copies of sequences that are identical or highly related (paralogues or pseudogenes) to the intended targeted sequences which can act as an additional target for ZFNs. The researchers have tried to address this problem by focusing on DNA-protein interactions and creating minor sequence divergence to reduce the chances of nonspecific targeting of related genomic regions (Carroll 2011; Urnov et al. 2010; Laoharawee et al. 2018). The specificity, recognition, and cleavage of desired sites by ZFNs are determined by the amino acid sequence of each ZF, nuclease (FokI) domain interactions, and quantity of the ZFs. The structure of both the functional domains of ZFNs, i.e., a catalytic domain and binding domains, can be optimized to increase specificity and enhance the affinity for the novel models developed by genome engineering (Jackson and Bartek 2009; Paschon et al. 2019). For improving the accuracy of targeting by ZFN, the “selection-based methods” have been also developed to optimize its cleavage specificity and reduce the nonspecific toxicity (Rahim et al. 2021).

2.2 Transcription Activator-Like Effector Nucleases (TALENs)

The second tool developed for genome editing or genome engineering is Transcription Activator-Like Effector Nucleases (TALENs) which display better specificity and functionality than ZFNs. Similar to ZFNs, TALENs also consist of an endonuclease, i.e., DNA cleavage domain, and a site-specific DNA-binding domain derived from transcription activator-like effectors (TALEs) proteins which together allow the creation of DSBs at specific sites. The DNA cleavage domain used for TALENs is primarily the FokI nuclease. The DNA-binding domains of TALENs, i.e., TALEs originated from a repeated sequence of highly conserved proteins of “phytopathogenic Xanthomonas” (Boch et al. 2009; Boch and Bonas 2010; Chandrasegaran and Carroll 2016). In Xanthomonas, the transcription activator-like effectors (TALEs) proteins are present in the cytoplasm where they promote the modification of genes that help in transcription. TALE proteins are capable of localization to the nucleus, DNA binding, and transcription activation of the target gene (Schornack et al. 2006). The studies conducted on the mechanism of action of these effector proteins showed that these proteins can mimic the functioning of eukaryotic transcription factors in binding with DNA and activating gene expression (Becker and Boch 2021).

Soon after the simplicity of the TALE domain code was recognized, i.e., that one monomer binds/recognizes one nucleotide, the first chimeric TALE domain-fused nuclease (TALEN) was constructed (Joung and Sander 2013; Gaj et al. 2013b; Nemudryi et al. 2014; Becker and Boch 2021). The chimera was developed by inserting the DNA-binding domain of TALE into a plasmid vector that had been used for ZFNs (Christian et al. 2010). This leads to a genetic construct that encodes both the DNA-binding domain and the catalytic domain of a restriction endonuclease, i.e., FokI. The monomers of the DNA-binding domain (i.e., TALE), each of which binds a single nucleotide in the targeted DNA sequence, are repeats of 34 amino acid residues in which the amino acids at positions 12 and 13 are highly variable and are known as the repeat variable diresidue (RVD). The RVD of each TALE repeat is responsible for the recognition of a specific nucleotide. The variation in RVDs allows them to bind to different nucleotides with different efficiencies (Fig. 9.3). TALEs with different RVDs were combined to form artificial nucleases (i.e., TALENs) that bind and cleave the targeted DNA sequences. TALENs contain a half repeat (i.e., 20 amino acid residues of the last tandem repeat, which bind to the nucleotide at the 3′ end of the recognition site), an N-terminal domain, nuclear localization signals, and the FokI catalytic domain (Fig. 9.3). A thymidine at the 5′ end of the target sequence interacts with the N-terminal domain of the TALE and affects its overall binding efficiency (Lamb et al. 2013). TALENs always work in pairs; their binding sites are located on opposite DNA strands and are separated by a small fragment (i.e., 12–25 bp) known as the “spacer sequence.” After the TALENs enter the nucleus, they bind to the targeted sequence, and the FokI domains located at the C-terminus of the TALE create the DSB (Fig. 9.1).
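The one-repeat-to-one-nucleotide logic can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the RVD table below is the commonly reported code (NI for A, HD for C, NN for G, NG for T) and is not given in this chapter; NN is also reported to bind A, which the sketch ignores. The function names are hypothetical.

```python
# Hypothetical helper names. The RVD table is the commonly reported TALE code
# and is an assumption here, not something stated in the chapter.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}

SPACER_RANGE_BP = (12, 25)  # spacer length range quoted in the text


def design_rvd_array(target: str) -> list[str]:
    """Return one RVD per base for a target site written 5'->3'.

    The leading T is contacted by the N-terminal domain, so it is required
    but does not receive a repeat of its own.
    """
    target = target.upper()
    if not target or target[0] != "T":
        raise ValueError("target should start with a 5' T")
    unsupported = set(target) - set(RVD_FOR_BASE)
    if unsupported:
        raise ValueError(f"unsupported bases: {sorted(unsupported)}")
    return [RVD_FOR_BASE[base] for base in target[1:]]


def spacer_is_valid(spacer_length_bp: int) -> bool:
    """Check that the gap between the two TALEN binding sites is in range."""
    low, high = SPACER_RANGE_BP
    return low <= spacer_length_bp <= high


print(design_rvd_array("TGACCT"))
print(spacer_is_valid(15))
```

In practice, each RVD sits within a 34-residue repeat and the last position is covered by the half repeat; the sketch deliberately leaves out that protein-level detail.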

Fig. 9.3
A diagram of the TAL effector RVD DNA-binding code and a TALEN gene editing tool.

Illustration of a pair of transcription activator-like effector nucleases (TALENs) bound to targeted nucleotide sequence: TALE repeats, i.e., RVD are shown as colored boxes that are responsible for the recognition of specific nucleotides. RVDs bind to different nucleotides with different efficiency. Letters inside each repeat represent the two hypervariable residues. TALE-derived amino (N domain) and carboxy-terminal domains required for DNA-binding activity are shown as pink boxes. The nonspecific nuclease domain from the FokI endonuclease is shown as a larger shaded green box. TALENs bind and cleave as dimers on a target DNA site. The TALE-derived amino- and carboxy-terminal domains flanking the repeats may make some contact with the DNA. Cleavage by the FokI domains occurs in the “spacer” sequence that lies between the two regions of the DNA bound by the two TALEN monomers. The amino acid sequence of a single TALE repeat is expanded below with the two hypervariable residues highlighted in red and bold text. TALE-derived DNA-binding domain aligned with its target DNA sequence is shown in the box indicated as TAL effector RVD DNA-binding codes

Although the design code of TALENs is simpler than that of ZFNs, cloning the designed TALE arrays, which consist of many repeats, has been difficult. To overcome this problem, different strategies have been developed, such as high-throughput solid-phase assembly, Golden Gate cloning, and ligation-independent cloning, which help in assembling the desired TALE arrays (Schmid-Burgk et al. 2013). Several other modifications have also been made to TALENs to make them a better tool than ZFNs, such as (1) enhancement of site selection by varying the length of the spacer sequence (Nemudryi et al. 2014); and (2) development of mutant variants of the TALE’s N-terminal domain that can bind more specifically to A, G, and C nucleotides (Nemudryi et al. 2014; Mak et al. 2012; Lamb et al. 2013).

2.3 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)

Two years after the discovery of TALENs, work on Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) led to the development of a third genome engineering tool, which has revolutionized biotechnology and the health sector (Nemudryi et al. 2014; Lino et al. 2018; Kaminski et al. 2021). CRISPR was first discovered in Escherichia coli (E. coli) in 1987 and was later found in many other prokaryotes (in 87% of archaea and 48% of eubacteria) (Grissa et al. 2007). The CRISPR system consists of noncoding RNAs and a CRISPR-associated (Cas) protein that has nuclease activity (Ishino et al. 1987; Jore et al. 2012). In bacteria, the CRISPR-Cas system plays an important role in the adaptive immune response. It helps protect bacteria from phage infection by generating a memory of the phage in the bacterial chromosome (Barrangou and Marraffini 2014; Renaud et al. 2016; Kim et al. 2021).

There are two classes of CRISPR/Cas systems, distinguished by the structural variation in their Cas genes: (1) Class 1 systems contain multi-protein effector complexes, and (2) Class 2 systems have a single effector protein. To date, six types and 29 subtypes of CRISPR/Cas systems have been reported (Moon et al. 2019; Liu et al. 2020). The type II CRISPR-Cas9 system is one of the most widely used, advanced, and versatile CRISPR systems for genome engineering or editing because of its specificity, which stems from the Cas protein. The Cas protein of this system was derived from Streptococcus pyogenes (i.e., SpCas9); it targets specific DNA sequences and is responsible for the high specificity of the system (Jiang et al. 2013).

Sequencing of various bacterial genomes revealed the presence of short unique DNA regions known as spacer DNA, which are separated from one another by short palindromic repeat sequences (Deshpande et al. 2015; Lino et al. 2018). These structures are located in the proximity of cas genes, which give rise to protein products that have nuclease and helicase activity (Haft et al. 2005). Spacer DNA is homologous to sequences found in several phages and plasmids (Bolotin et al. 2005; Mojica et al. 2005; Pourcel et al. 2005; Barrangou et al. 2007) (Fig. 9.4). The Cas9 protein is polyfunctional: it takes part both in interference with foreign DNA and in pre-crRNA processing (Sapranauskas et al. 2011). The processing of crRNA depends on a small noncoding RNA known as trans-activating crRNA (tracrRNA). The tracrRNA forms a duplex after binding to the complementary repeat sequence present in the pre-crRNA. RNase III (present in the host cell), in the presence of Cas9, cleaves this duplex and leads to the formation of mature crRNA (Makarova et al. 2006; Marraffini and Sontheimer 2010). For genome editing, the CRISPR-Cas9 system employs two main components, i.e., the Cas9 endonuclease and a single guide RNA (sgRNA), which is a tracrRNA-crRNA chimera (Cong et al. 2013). The sgRNA recognizes and binds to the targeted sequence, and Cas9 cleaves the DNA, causing a DSB (Fig. 9.4). Cas9 cleaves 3 bp upstream of an “NGG” PAM located on the genomic DNA. The resulting DSB is repaired by NHEJ or HDR (Fig. 9.1) (Pawelczak et al. 2018).
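The targeting rule described above (a protospacer followed by an NGG PAM, with the cut 3 bp upstream of the PAM) can be sketched as a simple sequence scan. This is an illustrative sketch only: the 20-nt protospacer length is an assumption that is not stated in this chapter, only the given strand is scanned, and the function and field names are hypothetical.

```python
import re

# Lookahead so that overlapping NGG motifs are all reported.
PAM_PATTERN = re.compile(r"(?=([ACGT]GG))")
PROTOSPACER_LENGTH = 20  # assumed length; not given in the chapter
CUT_OFFSET = 3           # Cas9 cuts 3 bp upstream of the PAM


def find_cas9_sites(sequence: str) -> list[dict]:
    """List candidate SpCas9 target sites on the given strand (5'->3')."""
    sequence = sequence.upper()
    sites = []
    for match in PAM_PATTERN.finditer(sequence):
        pam_start = match.start()
        protospacer_start = pam_start - PROTOSPACER_LENGTH
        if protospacer_start < 0:
            continue  # not enough upstream sequence for a full protospacer
        sites.append({
            "protospacer": sequence[protospacer_start:pam_start],
            "pam": match.group(1),
            # The break falls immediately before this 0-based index.
            "cut_index": pam_start - CUT_OFFSET,
        })
    return sites


example = "A" * 20 + "TGG" + "CC"  # made-up sequence for illustration
for site in find_cas9_sites(example):
    print(site)
```

A fuller search would also scan the reverse complement and score each candidate for off-target matches; both steps are omitted here to keep the rule itself visible.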

Fig. 9.4
An illustration of the CRISPR-Cas9 gene editing system, comprising the DNA-specific sgRNA sequence, cleavage site, sgRNA, PAM, Cas9 enzyme, target DNA, and protospacer.

Illustration of the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system bound to a targeted DNA sequence: The CRISPR system has a single chimeric sgRNA (crRNA and tracrRNA) that guides Cas9 to introduce a DSB into the target nucleotide sequence. A protospacer is the site that is recognized by the CRISPR/Cas9 system. A spacer is the sequence in the sgRNA that is responsible for complementary binding to the target site. PAM is a short motif (NGG in the case of CRISPR/Cas9) whose presence at the 3′-end of the protospacer is required for introducing a break. The Cas9 nuclease is capable of introducing a DSB into a selected DNA site

The mechanism of action of the CRISPR system, both inside prokaryotic cells and in vitro, is divided into three stages, i.e., adaptation, transcription, and intervention. In adaptation, a small fragment of foreign DNA entering the bacterial cell is inserted into the CRISPR locus of the bacterial genome, forming a new spacer; the corresponding fragment of the viral genome is called the protospacer. The viral protospacer is complementary to the spacer present in the host cell, and protospacers are flanked by a short, conserved sequence (2–5 bp) known as the protospacer adjacent motif (PAM) (Mojica et al. 2009a). The new spacer is inserted at the AT-rich side of the CRISPR cassette, which also carries a promoter element and a landing site for regulatory proteins located just before the cassette (Deltcheva et al. 2011; Jinek et al. 2012). In the transcription step, the complete CRISPR locus is transcribed into a long poly-spacer precursor crRNA (pre-crRNA) (Fig. 9.4). Cas6 endonucleases are responsible for the formation of mature crRNA in most CRISPR/Cas systems (Carte et al. 2008; Lillestøl et al. 2009; Mojica et al. 2009b; Hale et al. 2012; Pawelczak et al. 2018). Each short CRISPR RNA (crRNA) contains one spacer sequence flanked by repeat-derived ends. The 5′ end, which carries eight nucleotides of the repeat, has an OH group, whereas the 3′ end forms a hairpin (stem-loop) structure and carries a 2′,3′-cyclic phosphate (Haurwitz et al. 2010; Gesner et al. 2011).

During the intervention step, the viral DNA or RNA interacts with the crRNA and Cas proteins. The crRNA recognizes the complementary protospacer sequence, whereas the Cas protein degrades it (Marraffini and Sontheimer 2010; Rath et al. 2015; Shabbir et al. 2019). The coevolution of viruses/phages with their hosts over time has led to the formation of a wide range of CRISPR/Cas systems in prokaryotes (Hale et al. 2009; Sashital et al. 2011; Richter et al. 2012; Bondy-Denomy et al. 2013; Newsom et al. 2021).

3 Applications of Genome Engineering/Editing Methods

The development of genome editing tools has given possibilities of directly targeting and modifying genome sequences in eukaryotes. The recent progress in the development of programmable nucleases such as ZFNs, TALENs, and CRISPR-Cas-associated nucleases has significantly accelerated the progress of genome engineering in different fields ranging from basic research to biomedical and applied biotechnological research. The application of different gene editing tools in different fields of biological sciences and their future possibilities are briefly indicated below and summarized in Table 9.2.

Table 9.2 Application of genome editing tools in different fields of biological sciences

3.1 In Genetic Engineering of Cell Lines and Animal Models

Before the development of engineered nucleases, generating genetically modified mammalian cell lines was costly, labor-intensive, time-consuming, and required specialized expertise. With the introduction of cost-effective and user-friendly gene editing technologies, however, custom cell lines bearing a wide range of genome modifications can now be generated easily in a few days, e.g., gene deletion (Lee et al. 2010), gene inversion (Xiao et al. 2013), gene knockout (Santiago et al. 2008; Mali et al. 2013), gene addition (Moehle et al. 2007; Hockemeyer et al. 2011; Hou et al. 2013), gene correction (Urnov et al. 2005; Ran et al. 2013), and chromosomal translocation (Torres et al. 2014). Along with cell line engineering, targeted nucleases have also accelerated the generation of genetically modified organisms, such as transgenic zebrafish (Doyon et al. 2008; Sander et al. 2011; Hwang et al. 2013), livestock (Hauschild et al. 2011; Carlson et al. 2012), monkeys (Liu et al. 2014), mice (Cui et al. 2011; Wang et al. 2013a, b; Wu et al. 2013), rats (Geurts et al. 2009; Tesson et al. 2011; Li et al. 2013), etc.

3.2 In Genetic Engineering of Plant Cells

These engineered nucleases have also emerged as a dominant tool for plant engineering (Baltes and Voytas 2015). For example, both CRISPR-Cas9 and TALENs have been used to modify multiple alleles in hexaploid bread wheat to create a variety resistant to powdery mildew (Wang et al. 2014b). Moreover, TALENs were used in soybean to knock out genes involved in fatty acid metabolism and thereby alter the composition of the plant’s metabolic products (Haun et al. 2014). Purified proteins of the various genome engineering tools can be delivered directly into plant protoplasts to effect germline-transmissible changes that are almost indistinguishable from natural varieties (Luo et al. 2015; Woo et al. 2015). This technological advance could help reduce some of the regulatory problems associated with the use of transgenic plants. Targeted nucleases have also been used to inactivate genes of pathogens, which helps prevent parasitic or viral infections, and to knock out specific host factors, leading to the development of pathogen-resistant varieties (Ghorbal et al. 2014; Lin et al. 2014; Wu et al. 2015).

3.3 In Genetic Engineering for Insect-Borne Disease

Interestingly, targeted nucleases have also been used to limit mosquito- and other insect-borne diseases (Burt 2003; Sinkins and Gould 2006). Genome editing enables the introduction of a particular gene or mutation into the host that can also be transferred to its progeny (Windbichler et al. 2011). This gene editing technique has been used in mosquito disease vectors, i.e., Aedes aegypti, Anopheles stephensi, and Anopheles gambiae, for disease control and prevention (Gantz and Bier 2015; Hammond et al. 2016). Countries and regions such as Saudi Arabia, Turkey, Korea, the Philippines, India, the USA, Europe, China, and Japan are using the CRISPR technique to combat vector-borne diseases (Mahto et al. 2022). Smidler et al. reported the targeted disruption of the thioester-containing protein 1 (TEP1) gene using TALENs in Anopheles gambiae mosquitoes, which transmit malaria. The TEP1 gene of An. gambiae has been identified as a key gene for immunity against Plasmodium infection (Miller et al. 2011). Gene editing in Ae. aegypti and An. stephensi using ZFNs and TALENs was reported in 2013 (Degennaro et al. 2013). De Gennaro et al. investigated the involvement of the odorant receptor coreceptor (orco) gene and the odorant receptor pathway in host identification and susceptibility to the chemical repellent N,N-diethyl-meta-toluamide (DEET) in Ae. aegypti (Christian et al. 2010). In this experiment, the developed ZFN was injected into embryos of Ae. aegypti, with promising results.

3.4 In Genetic Engineering of Industrially Important Microorganisms

Targeted nucleases also offer a convenient means of developing modified bacterial and yeast strains for synthetic biology applications such as metabolic pathway engineering. For example, bacterial species belonging to the order Actinomycetales are one of the key sources of industrially relevant secondary metabolites. However, a large number of Actinomycetales species have historically been resistant to genetic manipulation, which has severely hindered their use for metabolic engineering. CRISPR-Cas9 has now been used to inactivate several genes of actinomycetes (Tong et al. 2015). This indicates the ability of the CRISPR/Cas9 system to create designer bacteria with enhanced secondary metabolite production capabilities. CRISPR has also helped in metabolic pathway engineering in yeast by enabling mutagenesis of yeast chromosomal DNA at high efficiency (Jakočiūnas et al. 2015), allowing rapid screening of the desired mutants (Ryan et al. 2014).

3.5 In Genetic Engineering for Functional Genomics

The CRISPR-based knockout strategy has been playing an important role in functional genomics (Hilton and Gersbach 2015); for example, it has facilitated the discovery of genomic loci that make cells drug-resistant (Koike-Yusa et al. 2014; Shalem et al. 2014; Wang et al. 2014a; Zhou et al. 2014; Blancafort et al. 2008). Genome editing tools have also uncovered how cells can initiate the host immune response (Parnas et al. 2015) and continue to give new insights into the genetic basis of cell fitness (Hart et al. 2015; Wang et al. 2015). They have also increased the understanding of how certain viruses affect cell death (Ma et al. 2015). The genome-wide application of the CRISPR strategy has helped in the discovery of functional noncoding elements (Kim et al. 2013; Korkmaz et al. 2016) and in understanding their role in the structure and evolution of the human genome (Findlay et al. 2014). CRISPR has also helped in identifying factors key to zebrafish development (Shah et al. 2015) as well as to disease development in mice (Chen et al. 2015).

3.6 In Genetic Engineering for Therapeutics

Genome editing technologies have great potential to treat or cure various diseases at the genetic level (Cox et al. 2015; Porteus 2015; Maeder and Gersbach 2016). For example, ZFN-mediated disruption of the HIV co-receptor CCR5 allowed the development of resistance against HIV in both CD4+ T cells (Perez et al. 2008) and CD34+ hematopoietic stem/progenitor cells (HSPCs) (Holt et al. 2010; Tebas et al. 2014). In addition to introducing gene modifications that enhance autologous cell therapies, targeted nucleases can also mediate genome editing in situ when combined with a viral vector, such as AAV (Gaj et al. 2016a). The delivery of an AAV vector designed to target a defective copy of the factor IX gene and provide a repair template led to effective gene correction in mouse liver, increasing factor IX protein production in both neonatal (Li et al. 2011) and adult (Anguela et al. 2013) mice. Recently, in vivo gene editing has been used to restore expression of the dystrophin gene, allowing the rescue of muscle function in mouse models of Duchenne muscular dystrophy (Long et al. 2016; Nelson et al. 2016; Tabebordbar et al. 2016). Therapeutic gene editing has also been used successfully in a mouse model of human hereditary tyrosinemia (Yin et al. 2014). This approach has further been used to correct disease-causing mutations in the ornithine transcarbamylase gene in the liver in a neonatal model of the disease (Yang et al. 2016).

3.7 In Genetic Engineering: Epigenome Editing (Modulating Gene Expression)

Along with the DNA recognition ability of CRISPR-Cas9, the flexibility associated with constructing arrays of ZF and TALE proteins capable of binding to specific sequences allows their fusion with transcriptional activator and repressor protein domains to modulate the expression of any gene from its promoter or enhancer sequences. The first fully synthetic transcriptional effector proteins were generated by fusing engineered zinc finger proteins with either the transcriptional activation domain derived from herpes simplex virus or the Kruppel-associated box (KRAB) repression domain (Sadowski et al. 1988; Margolin et al. 1994; Beerli et al. 1998). Over the next 15 years, zinc finger-based transcriptional modulators were extended to feature several other types of effector domains (Beerli and Barbas 2002). For example, transcription was modulated through targeted methylation or demethylation using the Dnmt3a methyltransferase domain (Rivenbark et al. 2012; Siddique et al. 2013) and the ten-eleven translocation methylcytosine dioxygenase 1 (TET1) (Chen et al. 2014). The TALE transcription factor has also emerged as an effective platform for targeted modulation of transcription (Miller et al. 2011; Zhang et al. 2011). Like zinc fingers, TALEs are compatible with several modifiers, such as the TET1 hydroxylase catalytic domain, which has been used for targeted CpG demethylation (Maeder et al. 2013), and the lysine-specific histone demethylase 1 (LSD1) domain, which has been used for targeted histone demethylation (Beerli et al. 2000; Pollock et al. 2002; Magnenat et al. 2008; Polstein and Gersbach 2012; Maeder et al. 2013; Mendenhall et al. 2013; Perez-Pinera et al. 2013). TALE activators have also been effectively engineered to regulate gene expression in response to endogenous chemical stimuli (Li et al. 2012), proteolytic cues (Copeland et al. 2016; Lonzarić et al. 2016), external stimuli (Mercer et al. 2014), and optical signals (Konermann et al. 2013). The potential is immense.

3.8 Genome Engineering for Transcriptional Modulation

Because of its ease of use, the CRISPR-Cas9 system has also been used for transcriptional modulation by fusing a particular effector domain to a catalytically disabled variant of the Cas9 protein (Qi et al. 2013). This mutant form is unable to cleave DNA and is therefore referred to as dCas9 (dead Cas9), but it retains the ability to bind DNA in an RNA-directed manner. dCas9 fused at its carboxyl terminus to an effector domain can modulate gene expression from either strand of the targeted DNA sequence (Farzadfard et al. 2013; Maeder et al. 2013; Perez-Pinera et al. 2013; Gilbert et al. 2014; Hu et al. 2014). Moreover, dCas9 alone can inhibit gene expression simply by blocking transcription initiation or elongation, a process known as CRISPR interference (Qi et al. 2013), whereas fusion of transcriptional repressor domains to dCas9 can be used to effectively silence a gene from its promoter region (Gilbert et al. 2013; Balboa et al. 2015; Zalatan et al. 2015). Light-inducible dCas9-based systems have been shown to allow optical or otherwise conditional control of gene expression (Nihongaki et al. 2015; Polstein and Gersbach 2015). The first-generation dCas9 activators were found to display a sub-optimal level of activation (Karlson et al. 2021). The development of second-generation CRISPR activators has since emerged and expanded rapidly as a promising area of research (Vora et al. 2016; Chen and Qi 2017).

Although much has been accomplished with these genetic engineering tools, many challenges still limit the realization of their full potential. Most important is the development of new techniques capable of introducing gene modifications without DNA breaks, such as oligonucleotide-directed mutagenesis (ODM) and base editing (Komor et al. 2018). These methods can convert one target base pair to a different base pair without requiring DSBs and may in the future be promising technologies for the study of potential treatments for genetic diseases (Komor et al. 2018). Targeted recombinases that recognize specific DNA sequences and incorporate desired therapeutic factors into the human genome can also be designed and developed (Akopian et al. 2003; Pruett-Miller et al. 2009; Mercer et al. 2012; Gaj et al. 2013a; Sirk et al. 2014; Wallen et al. 2015). This could well herald the union of regenerative medicine and genome engineering (see Table 9.2). However, despite the substantial knowledge gained from genome editing in immortalized cell lines, its application in regenerative medicine, which requires genetic manipulation of progenitor or stem cell populations, is still in its infancy, because the epigenome, genome organization, and functional regulation of these cells are inherently different from those of transformed cell lines. It is important to fully explore and understand the potential role and usage of these technologies in progenitor and stem cells before their large-scale use in designer therapeutic applications, which could mean reprogramming cell fate and behavior for the next generation of advances in gene therapy and synthetic biology.