1 Introduction

Plant breeders have long worked to improve crop varieties to feed a rapidly growing human population. Steady changes in climate and shrinking natural resources have created new problems that make it harder for scientists to achieve the desired outcomes in time. Traditional breeding methods rely on the genetic variation already available in natural populations, and producing a new variety takes around 8–10 years, so these methods lag behind in the race to feed the ever-growing population. In addition, traditional breeding erodes genomic diversity, ultimately producing vulnerable genetic stock (Haroon et al. 2020). Genome-editing (GE) technology provides fundamental insight into the biology of crop plants and is revolutionizing the agriculture sector at commercial scale (Chen et al. 2019). GE techniques based on custom-designed site-specific nucleases (SSNs), such as meganucleases, zinc finger nucleases (ZFNs), and transcription activator-like effector nucleases (TALENs), are regarded as the traditional techniques. GE came into the limelight after CRISPR/Cas9 was developed, and CRISPR/Cas9 is considered the modern technique (Chen et al. 2020). CRISPR/Cas9 is the technique most widely used in plant breeding programs because of its high efficiency, ease of use, and flexibility compared with conventional GE techniques. Plants generated through these techniques are almost identical to their wild types except for the edited trait, which allows them to be treated separately from genetically modified organism (GMO) legislation (Ran et al. 2017). GE techniques are now known as new breeding techniques (NBT), and they have prompted academic institutions, legislative authorities, and government bodies to rewrite regulatory documents. Together with conventional plant breeding methods, NBT show immense potential for the future improvement of traits in elite cultivars.
Double-stranded breaks (DSBs) generated by SSNs are repaired by one of two natural mechanisms, non-homologous end joining (NHEJ) or homologous recombination (HR), resulting in loss of function or replacement of a gene, respectively (Yin et al. 2017). Building on these natural mechanisms, plant breeders exploit the NHEJ pathway to produce knock-out mutants and harness the homology-directed repair (HDR) pathway to develop knock-in mutants. GE offers great opportunities for basic and applied research, but it also raises many challenges. Its major drawbacks are that the response to in vitro culture, the transformation efficiency, and the survival rate vary with the plant species or cultivar. Furthermore, off-target edits and unintended modifications caused by integration of cassettes into the plant genome reduce the usefulness of GE techniques (Ellison et al. 2020). Since CRISPR was first applied to plant GE, the technique has improved tremendously with the introduction of novel tools. These include DNA-free editing, base editing, prime editing, epigenome editing, CRISPRa (CRISPR activation, i.e., gene activation), and CRISPRi (CRISPR interference, i.e., gene repression) (Zhang et al. 2019, 2020). The future of plant science looks promising, although the limitations of each novel technique need to be addressed promptly to avoid delaying progress in agricultural science. This chapter highlights the current state of GE techniques, their challenges, and new approaches, with a view to future prospects.

2 Mechanisms of Repairing Double-Stranded Breaks

2.1 Non-Homologous End Joining (NHEJ)

NHEJ is a type of DSB repair that does not require extensive sequence homology. There are two types of NHEJ: classical NHEJ and alternative NHEJ. Classical NHEJ requires several factors, such as Ligase 4, KU70/80, and XRCC4 (Burma et al. 2006), whereas alternative NHEJ accounts for only a minor share of DSB repair and does not need any of these classical factors. Alternative NHEJ usually relies on minimal homology and leaves a deletion at the repair junction. It is not clear to what extent alternative NHEJ differs from homology-directed repair (Guirouilh-Barbat et al. 2004). NHEJ can introduce mutations, and error rates can be as high as 50% (Paris et al. 2015).

Both DNA repair mechanisms, NHEJ and HDR, play a role in genome editing. In bacteria, a DSB can be repaired by either HDR or NHEJ. In eukaryotes, breaks made by CRISPR/Cas are most often repaired by NHEJ, which leads to indel mutations (Bernheim et al. 2017). The repair pathways can also affect how CRISPR/Cas performs: they regulate whether the DSB remains available, and they may compete with the CRISPR/Cas machinery for the DNA substrate. Conversely, once the DNA substrate is engaged by the CRISPR/Cas machinery, the repair pathways may be prevented from acting on it. It has been observed that, after a DSB, CRISPR/Cas spacers are acquired through a DNA repair mechanism called the RecBCD pathway (Levy et al. 2015).

2.2 Genome Editing and Homologous Recombination

Homologous recombination refers to the exchange of DNA between identical sequences. This mechanism ensures precise replacement and joining of DNA molecules; however, the exchange may not be possible when there is too little homology. The process is very helpful when certain mutations need to be introduced into an organism or removed from it. Different kinds of mutations can be introduced into DNA with the help of nucleases called SSNs (sequence-specific nucleases). As the name suggests, SSNs cut double-stranded DNA very specifically at a targeted sequence. The natural DSB repair mechanisms of the host then come into play; these have been studied in yeast and bacteria (Doudna and Charpentier 2014). This indispensable mechanism has many applications in biological systems. It offers an efficient and simple method for gene deletion, as even a minimal level of homology can be enough to target a specific gene. Gene targeting is very successful in the mouse model system. Mammalian cells were initially targeted less often, but as techniques improved, non-selectable genes could be manipulated more frequently and with much higher efficiency (Müller et al. 1999; Sedivy and Dutriaux 1999). Once the DNA is damaged, the DSB can be repaired by the DSBR (double-stranded break repair) pathway, which yields either non-crossover or crossover products, or by the SDSA (synthesis-dependent strand annealing) pathway, which yields only non-crossover products (San Filippo et al. 2008).

2.2.1 History of Genome Editing

Gene targeting was first done in animal cells, which were earlier considered hard to work with. Shortly after the first knockouts were created in animal cells, it was shown that the same could be done in plant cells. Nicotiana spp. were transformed using PEG (polyethylene glycol)-mediated transformation, which gave low transformation efficiency and was time consuming (Paszkowski et al. 1988); Agrobacterium-mediated transformation was later shown to be more efficient and less labor intensive (Offringa et al. 1990). However, although Agrobacterium-mediated transformation increased the transformation efficiency, it decreased the targeting efficiency (Offringa et al. 1990). False positives can also be a problem: some gene targeting products that were thought to be positive proved to be random integrations. This may have happened because the cell’s repair mechanism integrated a random sequence in place of the target sequence. In some studies, Agrobacterium-mediated transformation has also shown efficiencies as low as those obtained with PEG (Hrouda and Paszkowski 1994). Lower organisms such as Chlamydomonas show transformation efficiencies as low as those of higher organisms such as tobacco or Arabidopsis (Smart and Selman 1991; Sodeinde and Kindle 1993; Gumpel et al. 1994).

Most studies used leaf mesophyll protoplasts from Arabidopsis or tobacco, and sometimes root tissue from Arabidopsis (Miao and Lam 1995). Vacuum infiltration is another method, which makes use of Arabidopsis inflorescences (Bechtold 1993). It was hypothesized that the positive-negative selection as well as the endogenous genes chosen as targets might also have contributed to the low efficiency. With a negative selection system, a very high efficiency has been achieved in rice (Terada et al. 2002).

2.2.2 Homologous Recombination in E. coli

Prokaryotic systems are generally easier to study than eukaryotic ones (Roca and Cox 1997). In both eukaryotes and prokaryotes, several kinds of enzymes are involved in HR. In E. coli, DSBs are identified and repaired by the RecBCD pathway (Kowalczykowski et al. 1994). The heterotrimer of RecB, RecC, and RecD proteins recognizes the break and acts on it through its exonuclease and helicase activities, which require Mg2+ ions. Chi (χ) site sequences create recombination hot spots: the heterotrimer complex interacts with these sites, and the enzyme degrades the 3′-terminal strand and then the 5′-terminal strand. Single-stranded DNA with a 3′ terminus is produced by the RecBCD complex, followed by RecQ helicase, providing a substrate on which RecA forms a nucleoprotein filament by coating the 3′ ssDNA tail (Bianco and Kowalczykowski 1997; Arnold and Kowalczykowski 2000). RecBCD displaces single-stranded DNA-binding proteins from the ssDNA, which is stabilized by the binding of RecBCD (Meyer and Laine 1990).

Branch migration is promoted by proteins such as RuvA, RuvB, and RecG. RuvA identifies the Holliday junction, RuvB drives branch migration, and RecG also contributes to branch migration, but to a lesser extent (West 1996; Whitby and Lloyd 1998). Mutants of RecA proteins show no recombination events (Cox 1999).

2.2.3 HR in Saccharomyces cerevisiae

In S. cerevisiae, HR is more important than NHEJ. Cells that were competent for HR did not show any DSBs. NHEJ activity could be seen only when the HR system was disabled, so NHEJ acts as a backup system (Siede et al. 1996). RAD genes such as RAD50, RAD51, RAD52, RAD54, RAD55, and RAD57, together with MRE11 and XRS2, are involved in HR. The functions of these genes are listed in Table 1. Mutants of these genes are sensitive to ionizing radiation (IR) but not to ultraviolet light (Petes 1991; Pâques and Haber 1999; Pastink et al. 2001; Symington 2002; van den Bosch et al. 2002). RAD52 is the most important of these genes for HR, and the rad52 mutant shows IR sensitivity. When a double mutant is made with rad52 and one of rad51, rad54, rad55, or rad57, the phenotypes are consistent with each other. Besides RAD52, RAD51 is also needed for some HR events. Mutants of the other genes involved in HR in S. cerevisiae, such as xrs2, mre11, and rad50, show phenotypes similar to those described above (Game and Mortimer 1974).

Table 1 List of factors involved in homologous recombination

Different genes in S. cerevisiae that are involved in HR are connected by networks.

2.2.4 HR in Higher Organisms

The genes involved in homologous recombination in higher organisms differ slightly from those found in lower organisms. Gene targeting in mice helped scientists generate thousands of mutations that led to loss of function of a protein (Smithies 1987). In embryonic stem cells, about one integration event in a hundred was targeted (Jasin et al. 1996). The same has not been seen in plants, where the frequencies are much lower (Paszkowski et al. 1988). Transformation methods such as polyethylene glycol treatment, electroporation, and Agrobacterium-mediated transformation are generally employed; however, the efficiencies are consistently as low as 1 in 10,000 or 1 in 100,000 (Zupan et al. 2000; Potrykus and Spangenberg 2013). DNA fragments of up to 22 kb have been transferred (Thykjær et al. 1997). Recombination in plants can be increased by inducing DSBs (Pâques and Haber 1999).

Agrobacterium-mediated transformation has been used in Arabidopsis to target the TGA (TGA1A-related gene 3) locus. Of the 2580 calluses screened, only one carried the targeted TGA locus (Miao and Lam 1995). The MADS-box gene AGL5 (Agamous-like 5) was also knocked out in Arabidopsis: with vacuum infiltration, one out of 750 events proved to have the gene actually targeted (Kempin et al. 1997). Of the two models of recombination, double-stranded break repair (DSBR) and SDSA, the DSBR model allows chromosomal rearrangements, specifically translocations, to occur, whereas the SDSA model avoids translocations completely (Gorbunova and Levy 1999; Puchta 1999).
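The two screening results quoted above can be put on a common scale with a small calculation. The following is a minimal sketch; the function name and the dictionary labels are illustrative choices, and only the counts (1 in 2580 and 1 in 750) come from the text.

```python
from fractions import Fraction

# Counts quoted in the text: (targeted events, events screened).
SCREENS = {
    "TGA locus, Agrobacterium (Miao and Lam 1995)": (1, 2580),
    "AGL5, vacuum infiltration (Kempin et al. 1997)": (1, 750),
}


def targeting_frequency(targeted, screened):
    """Return targeted events per screened event as an exact fraction."""
    if screened <= 0:
        raise ValueError("screened must be a positive count")
    return Fraction(targeted, screened)


for label, (targeted, screened) in SCREENS.items():
    freq = targeting_frequency(targeted, screened)
    # Print the exact ratio and its value in scientific notation.
    print(f"{label}: {freq} = {float(freq):.2e}")
```

Both frequencies fall well below the roughly one-in-a-hundred rate reported for mouse embryonic stem cells, which is the contrast the section draws.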

Gene targeting has also been done in the moss Physcomitrella patens, where DNA was integrated into the genome very efficiently by homologous recombination (Schaefer 2001). The more efficient gene transfer in moss compared with higher plants has been attributed to the fact that gene transfer occurs at a particular stage in the moss life cycle, the G2/M phase (Reski 1999).

3 Different Genome Editing Techniques

3.1 Meganucleases (MNs)

Meganucleases, or homing endonucleases, are endonucleases that cleave DNA at a large recognition site of around 14–40 bp (Iqbal et al. 2020). They are naturally occurring restriction enzymes found in prokaryotes and unicellular eukaryotes (Carroll 2017). The recognition site of MNs is larger than that of ordinary type II restriction enzymes, and MNs can alter the target sequence in a highly efficient manner. MNs are encoded by mobile genetic elements and contain both DNA-binding and DNA-cleavage domains. The double-stranded breaks formed by MNs are repaired by the NHEJ or HDR process (Silva et al. 2011).

Based on their structural and sequence motifs, MNs have been classified into five families: HNH, His-Cys box, GIY-YIG, PD-(D/E)XK, and LAGLIDADG (Zhao et al. 2007). Of these, the LAGLIDADG family is the best characterized and the most widely used for genome modification. I-SceI (Saccharomyces cerevisiae), I-CreI (Chlamydomonas reinhardtii), and I-DmoI (Desulfurococcus mobilis) are widely used meganucleases of the LAGLIDADG family (Khandagale and Nadaf 2016). They can tolerate polymorphism within the recognition site without loss of binding and cleavage activity. I-SceI-mediated transformation has been successful in prokaryotes and eukaryotes, but the structure of I-SceI is quite complex, which makes it difficult to re-engineer for targeting genes of interest (Zaman et al. 2019). The structure of I-CreI is less complex, and it has been widely used to knock out genes in several organisms (Arnould et al. 2007). For example, I-CreI-based meganucleases were used to target two maize loci, liguleless1 and ms26, and induced insertion and deletion mutations at the target loci (Gao et al. 2010; Djukanovic et al. 2013). Furthermore, it has been shown that DSB repair by NHEJ led to gene knockout in Arabidopsis and tobacco (Kirik et al. 2000). Overall, meganucleases are easy to use and can be used to edit the genomes of plants and animals. Their small size (40 kD) also makes them compatible with viral vectors that accommodate only short coding sequences (Iqbal et al. 2020). Despite these advantages, they have not been used in genome engineering as commonly as other genome-editing tools because of certain limitations. The first limitation is that the DNA-binding domain and the catalytic domain overlap. To edit a target gene, one has to engineer the DNA recognition site of the MN, but because the two domains overlap, MNs are much harder to re-engineer than other genome-editing tools (Khandagale and Nadaf 2016; Iqbal et al. 2020). Second, meganucleases are prone to sequence degeneracy, which can readily result in off-target binding and cleavage (Argast et al. 1998). To overcome these limitations, researchers turned to simpler and more efficient methods of gene editing, which gave rise to ZFNs, TALENs, and CRISPR.

3.2 Zinc Finger Nucleases (ZFNs)

ZFNs are proteins designed to cut DNA at specific sites, creating DSBs that subsequently induce HR or NHEJ. These repair mechanisms result in deletions, insertions, and base mutations at the site of cleavage (Carroll 2011). Hence, this technology has been employed for editing plant and mammalian genomes. ZFNs have separate DNA-binding and DNA-cleavage domains (Li et al. 1992).

FokI is a type IIS restriction enzyme (Kim and Chandrasegaran 1994). It consists of an N-terminal DNA-binding domain and a non-specific DNA cleavage domain at the C-terminal end. Because the cleavage domain has no sequence specificity, it can be redirected by replacing the native binding domain with alternative recognition domains, of which Cys2His2 zinc fingers have proved the most useful (Kim and Chandrasegaran 1994). A wide variety of sequences can be targeted by using novel assemblies of zinc fingers. Dimerization and cleavage are achieved when both members of a ZFN pair bind to their recognition sequences on the DNA. Short linkers between the domains of the protein and a spacer of 5–6 bp (base pairs) between the binding sites are generally used (Bibikova et al. 2001; Händel et al. 2009; Shimizu et al. 2009).
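The geometry of a ZFN pair site can be illustrated with a short search routine. This is a minimal sketch under stated assumptions, not a design tool: it assumes each monomer carries three fingers that each read one DNA triplet (so each half-site is 9 bp), that the two half-sites lie on opposite strands, and that they are separated by a 5 or 6 bp spacer. The sequences and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def triplets(site):
    """Split a half-site into the 3-bp units read by individual fingers."""
    return [site[i:i + 3] for i in range(0, len(site), 3)]


def find_zfn_pair_sites(seq, left_site, right_site, spacers=(5, 6)):
    """Find positions where a ZFN pair could bind and dimerize.

    left_site is read on the top strand; right_site is read on the bottom
    strand, so it appears on the top strand as its reverse complement.
    Returns a list of (left_start, spacer_length) tuples.
    """
    seq = seq.upper()
    left_site = left_site.upper()
    right_on_top = reverse_complement(right_site)
    hits = []
    start = seq.find(left_site)
    while start != -1:
        left_end = start + len(left_site)
        for gap in spacers:
            right_start = left_end + gap
            if seq[right_start:right_start + len(right_on_top)] == right_on_top:
                hits.append((start, gap))
        start = seq.find(left_site, start + 1)
    return hits


# Hypothetical 9-bp half-sites separated by a 6-bp spacer.
left = "GATGCTGAC"
right = "GCAGTGGAA"
target = "AA" + left + "TTCAGA" + reverse_complement(right) + "GG"
print(triplets(left))
print(find_zfn_pair_sites(target, left, right))
```

Requiring both half-sites in the correct orientation and spacing is what allows the non-specific FokI cleavage domains to dimerize only at the intended locus.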

Kim et al. (1996) created the first ZFNs as chimeric restriction endonucleases, and the first success was achieved using a ZFN pair that targeted the genome of Drosophila. Although the frequency of target modification varies, ZFN pairs have been used successfully in a wide range of organisms and cell types (Carroll 2011). The success of this genome-editing technique depends on the method used to deliver ZFNs into the host cell. Early experiments depended on genomic integration of ZFN-coding sequences and donor DNA via P-element-mediated transformation (Bibikova et al. 2002, 2003; Beumer et al. 2006), which required elaborate and complex construction of the ZFN delivery system. A major breakthrough occurred when it was demonstrated that the products of both DSB repair mechanisms could be obtained by injecting ZFN mRNAs and donor DNA into the host embryo (Beumer et al. 2006). This method is well established in zebrafish, rat (Geurts et al. 2009; Mashimo et al. 2010), frog (Young et al. 2011), and sea urchin (Ochiai et al. 2010), and ZFN-induced mutagenesis has been achieved in a number of genes (Doyon et al. 2008; Meng et al. 2008; Foley et al. 2009). In plants, including Arabidopsis thaliana and several crop species, Agrobacterium-mediated transformation has been used to deliver ZFN coding sequences under the control of a viral promoter (Lloyd et al. 2005; Cai et al. 2009; De Pater et al. 2009; Osakabe et al. 2010; Zhang et al. 2010). In addition, direct transfer of DNA (Wright et al. 2005; Cai et al. 2009; Shukla et al. 2009; Townsend et al. 2009) and viral delivery have also been successful in plants (Ira et al. 2010).

Designing ZFNs for targeted genetic modification is straightforward in principle; however, a substantial proportion of ZFN pairs fail (Ramirez et al. 2008; Joung et al. 2010; Carroll 2011). For this reason, scientists at Sigma-Aldrich and Sangamo Biosciences routinely make multiple pairs for sequences within a single target gene and test them extensively. Several methods exist for selecting three-finger sets from partially randomized libraries, but they can be quite time consuming (Meng et al. 2007). ZFNs derived by ToolGen for some DNA triplets indicate which individual fingers in their collection behave best in modular assembly (Kim et al. 2011).

The modular structure of ZF motifs and the modular recognition by ZF domains allow artificial DNA-binding domains to be designed that facilitate genetic modification at a specific target site in the genome (Pabo et al. 2001; Beerli and Barbas 2002). A ZF motif binds DNA by inserting its α-helix into the major groove of the DNA double helix (Pavletich and Pabo 1991). Key amino acids are present at positions −1, +1, +2, +3, +4, +5, and +6 (relative to the start of the α-helix) of the ZF motif (Pavletich and Pabo 1991; Shi and Berg 1995; Elrod-Erickson and Pabo 1999). To bind long DNA sequences, several ZF motifs are linked in tandem to form zinc finger proteins (ZFPs) (Kim et al. 1996; Liu et al. 1997; Beerli et al. 1998). Zinc finger activators (ZFAs), zinc finger transcription repressors (ZFRs), and zinc finger methylases (ZFMs) make up the whole ZFP platform (Xu and Bestor 1997; Bartsevich and Juliano 2000; Zhang et al. 2000; Liu et al. 2001; McNamara et al. 2002; Rebar et al. 2002; Ren et al. 2002; Bartsevich et al. 2003; Snowden et al. 2003; Dai et al. 2004; Rebar 2004).

The outcome of ZFN targeting depends on how well the ZF domains match the target DNA sequence as well as on the mechanism employed for DSB repair. The ability of ZFNs to modify specific gene sequences and create loss-of-function variants is a powerful tool for investigating gene function and for developing new products (Osakabe et al. 2010). Gene knockouts have been generated in zebrafish. Similarly, mutagenesis and gene replacement have been achieved in mice (Carbery et al. 2010). Alterations at genomic loci have also been achieved in crop plants: tobacco (Townsend et al. 2009) and maize (Shukla et al. 2009) plants can be regenerated from callus modified in culture by ZFNs. ZFN knockout of the CCR5 gene is one of the therapeutic applications in humans, providing resistance to HIV and an improved immune system (Urnov et al. 2005). Similarly, efforts are being made to knock out targeted genes in order to treat human neurodegenerative diseases such as Huntington’s, Parkinson’s, schizophrenia, and amyotrophic lateral sclerosis (Swarthout et al. 2011).

This technology is broadly applicable and versatile in the genetic engineering of plants. Before it emerged, targeted gene modification was difficult in plants, and engineering of plant traits had always been laborious, time consuming, and unpredictable (Puchta 2002). The Zinc Finger Consortium (ZFC) was established to support the development of ZFN technology by creating software, resources, and other tools required to engineer zinc fingers for genome editing (Wright et al. 2005; Maeder et al. 2008). For instance, using the consortium’s publicly available resources, ZFNs were engineered to recognize the SuR loci and achieve high-frequency modification of plant genes: Townsend et al. (2009) used ZFNs to target the acetolactate synthase (ALS) genes SuRA and SuRB in tobacco, which resulted in herbicide-resistance mutations. The ZFC developed a method called oligomerized pool engineering (OPEN), which uses genetic selections in bacteria to identify zinc finger arrays (ZFAs) that recognize specific target sequences in the genome (Maeder et al. 2008). These ZFAs serve as the DNA-binding domains of ZFNs. Other researchers have also used ZFNs to modify endogenous loci in plants; for instance, Shukla et al. (2009) described the use of this technology for genome editing in the crop species Zea mays. Furthermore, ZFNs are a powerful tool for genome modification in animals (Rémy et al. 2010). Initially, genetic manipulation of embryonic cells relied on cloning through nuclear transfer, which was limited to a few species; modification at specific loci began with the emergence of ZFNs. ZFNs have been used to modify Drosophila, zebrafish, and rats (Beumer et al. 2008; Ekker 2008; Geurts et al. 2009). Mammalian cells, including human cells, have been modified permanently via HR at targeted DSBs (Durai et al. 2005).

Scientists have also faced several challenges with delivery of the targeting materials. In Caenorhabditis elegans, a high level of somatic mutagenesis was achieved at genomic targets as well as in extrachromosomal arrays by using a heat shock promoter to drive ZFN expression from a DNA template; however, parallel expression in the germline was undetectable because of RNA interference (Morton et al. 2006). Apart from the use of ZFNs for mutagenesis, other studies have reported gene replacement using ZFNs. Some genomic regions, and some sequences within a single gene, are inaccessible because of compact chromatin structure or DNA modifications. For instance, chromatin structure prevents cleavage of intact recognition sites during mating-type switching in Saccharomyces cerevisiae (Rusche et al. 2003). DSB repair mechanisms differ between cell types and developmental stages; hence, understanding the biology of each organism is essential to overcome these limitations. Another challenge is the specificity of ZF binding, as some fingers bind equally well to triplets other than their supposed preference. Adding fingers can improve both specificity and affinity; however, it might also lead to binding at off-target sites. Separating two-finger modules with a short linker was shown to improve specificity (Moore et al. 2001). Off-target cleavage ultimately results in death of the host cell, as the number of breaks outstrips the cell’s DSB repair capacity (Bibikova et al. 2002; Porteus and Baltimore 2003; Alwin et al. 2005).

3.3 Transcription Activator-Like Effector Nucleases (TALENs)

TALENs are restriction enzymes that can be engineered to cut DNA at specific sequences; hence, they are used as site-specific nucleases for targeted genome editing. TALENs are fusions between a non-specific DNA cleavage domain and a custom-designed DNA-binding domain (Miller et al. 2011; Wood et al. 2011). DSBs are induced at the desired site in the DNA and can be repaired by HDR or NHEJ, the latter creating small insertions or deletions at the cleavage site. This technology emerged as an alternative to ZFNs, a similar genome-editing technology. The DNA-binding domain contains highly conserved repeats derived from transcription activator-like effectors (TALEs), proteins secreted by Xanthomonas spp. (Boch and Bonas 2010). ZFNs and TALENs cleave DNA with similar efficiency (Hockemeyer et al. 2011; Tesson et al. 2011; Reyon et al. 2012). The difference between them is that TALENs are more site specific, with fewer off-target effects, than ZFNs (Chandrasegaran and Carroll 2016).

TALENs can be designed easily and rapidly using a “protein-DNA code” that relates DNA-binding TALE repeat domains to the target-binding site. Each TAL effector repeat binds to one base pair of DNA (Boch et al. 2009; Moscou and Bogdanove 2009). TAL effector repeats can be joined together into extended arrays that recognize new targets in DNA (Boch et al. 2009; Morbitzer et al. 2010; Miller et al. 2011; Weber et al. 2011). Most of the methods used to construct TALENs rely on the Golden Gate cloning method with some variations (Morbitzer et al. 2010; Cermak et al. 2011; Huang et al. 2011; Li et al. 2011; Sander et al. 2011; Weber et al. 2011). However, none of them is adaptable to automated high-throughput production (Maeder et al. 2008; Morbitzer et al. 2010; Cermak et al. 2011; Li et al. 2011; Sander et al. 2011; Weber et al. 2011). In nature, TALE proteins, whose DNA-binding domains consist of highly conserved repeats of 33–35 amino acids, are delivered into host cells through the type III secretion system of Xanthomonas spp., where they alter transcription of the host genome and thereby facilitate bacterial colonization. Two hypervariable residues in each repeat determine the base to which that repeat binds. The hypervariable residue pairs NN, NI, HD, and NG, found in nearly all engineered TALE repeats, recognize guanine, adenine, cytosine, and thymine, respectively (Joung and Sander 2013).
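The one-repeat-per-base code described above lends itself to a simple lookup. The following minimal sketch translates a target DNA sequence into the ordered list of hypervariable residue pairs and back, using only the four pairings given in the text (NN for G, NI for A, HD for C, NG for T). The function names and example sequences are illustrative, and the sketch ignores other design constraints that real TALEN design has to consider.

```python
# Hypervariable residue pair for each base, as listed in the text.
RVD_FOR_BASE = {"G": "NN", "A": "NI", "C": "HD", "T": "NG"}
BASE_FOR_RVD = {rvd: base for base, rvd in RVD_FOR_BASE.items()}


def rvds_for_target(target):
    """Return one hypervariable residue pair per base of the target."""
    rvds = []
    for base in target.upper():
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported base: {base!r}")
        rvds.append(RVD_FOR_BASE[base])
    return rvds


def target_for_rvds(rvds):
    """Return the DNA sequence recognized by an ordered list of residue pairs."""
    return "".join(BASE_FOR_RVD[rvd] for rvd in rvds)


array = rvds_for_target("TGACCT")
print(array)
print(target_for_rvds(array))
```

Because each repeat reads a single base, the length of the repeat array equals the length of the target site, which is why long targets require assembling many near-identical repeats.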

Fast ligation-based automatable solid-phase high-throughput (FLASH) assembly is a rapid and cost-effective technology for assembling TALENs at large scale (Reyon et al. 2012). It was used to construct 48 TALEN pairs targeted to a diverse range of gene sequences in humans, and 100% of these nucleases were active in human cells. Similarly, FLASH TALEN pairs were targeted to 96 human genes involved in epigenetic regulation, and targeted alterations arose in 84 of them (87.5%) (Reyon et al. 2012).

Genes in a wide range of cell types and organisms can be engineered using this technology; it therefore has a significant impact on biological research, has the potential to treat genetic diseases, and has applications in crop improvement (Joung and Sander 2013). Reflecting its wide importance, it was named the 2011 “Method of the Year” by the journal Nature Methods (Baker 2011). The technique has been employed in a variety of organisms, including yeast (Li et al. 2011), zebrafish (Sander et al. 2011), frog (Lei et al. 2012), roundworm (Wood et al. 2011), rat (Tesson et al. 2011), cow, pig (Carlson et al. 2012), rice (Li et al. 2012), thale cress (Cermak et al. 2011), silkworm (Ma et al. 2012), cricket (Watanabe et al. 2012), fruit fly (Liu et al. 2012), and human somatic and pluripotent stem cells (Cermak et al. 2011; Hockemeyer et al. 2011; Miller et al. 2011; Reyon et al. 2012).

Construction of TALE repeat arrays can be challenging because multiple, nearly identical repeat sequences have to be assembled. Different platforms have been designed to facilitate the assembly of plasmids that encode TALE repeat arrays, including “Golden Gate” cloning, solid-phase assembly, and standard restriction enzyme and ligation-based cloning (Joung and Sander 2013). TALENs are usually built to bind sequences of 18 bp or longer; however, recent studies have suggested that longer TALENs may be less specific (Guilinger et al. 2014). Off-target effects are also a major concern: in one study in which this technology was used in human pluripotent stem cells, mutagenesis at 19 possible off-target sites was reported (Hockemeyer et al. 2011). The cDNA encoding a TALEN is approximately 3 kb. This large size is another disadvantage, as it makes TALENs harder to deliver to and express in host cells. Delivery of TALENs by some viral vectors is often impaired by their highly repetitive nature (Holkers et al. 2013); however, this limitation can be overcome by diversifying the coding sequences of the TALE repeats (Yang et al. 2013).

3.4 Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR Associated Protein

TALENs emerged as an alternative to the less accurate and more error-prone ZFNs. CRISPR is even easier to implement and is more efficient than both TALENs and ZFNs. CRISPR and different types of Cas proteins have been widely adopted as a gene-editing technology with many applications and promising results. Emmanuelle Charpentier and Jennifer Doudna won the 2020 Nobel Prize in Chemistry for the development of CRISPR/Cas9 genome editing (NobelPrize.org). Cas proteins include Cas1, Cas2, Cas3, Cas5, Cas6, Cas7, Cas9, Cas10, etc. (Makarova et al. 2011). CRISPR/Cas is part of the immune system of bacteria and provides resistance against viruses: when a bacterium encounters the same virus again, it attacks the virus using the memory of the earlier infection (Horvath and Barrangou 2010). The editing system consists of a Cas endonuclease, most commonly Cas9 derived from Streptococcus pyogenes, and a single guide RNA (sgRNA) (Heler et al. 2015). The sgRNA combines a crRNA (CRISPR RNA) and a tracrRNA (trans-activating crRNA) (Deltcheva et al. 2011). One part of the sgRNA binds to the target sequence of the host, and another part folds back on itself (Cui et al. 2018); the former can be made specific for any target sequence (Cong and Zhang 2015). The target site consists of the region recognized by the sgRNA, which is usually 20 nucleotides long, and an adjacent PAM (protospacer adjacent motif), which is usually NGG, where N is any nucleotide and G is guanine. If the PAM is absent, the Cas9 endonuclease cannot recognize, and hence cannot cleave, the target sequence. If the PAM is present immediately downstream (3′) of the sequence matched by the sgRNA, Cas9 can bind and create a DSB within the target sequence, about three nucleotides upstream of the PAM. Once the sequence is cut, the natural mechanisms of the host cell, such as NHEJ and HDR, repair the DSB (Wyman and Kanaar 2006). HDR generally does not, but NHEJ may, create insertions and deletions that lead to loss of gene function.
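The target-site rule described above (a 20-nucleotide sequence followed by an NGG PAM on either strand) can be illustrated with a short Python sketch. The function names are hypothetical and chosen only for illustration; the sketch assumes an uppercase or lowercase DNA string containing only A, C, G, and T, and it does not score efficiency or specificity.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_spcas9_sites(seq: str, guide_len: int = 20):
    """List candidate target sequences that are followed by an NGG PAM.

    Hypothetical helper: returns (strand, target, pam) tuples, with both
    sequences written 5'->3' on the strand that carries the PAM.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(s):
            sites.append((strand, match.group(1), match.group(2)))
    return sites


if __name__ == "__main__":
    # Toy sequence (an assumption, not real genomic data).
    demo = "A" * 20 + "TGG"
    for strand, target, pam in find_spcas9_sites(demo):
        print(strand, target, pam)
```

Scanning the reverse complement as well as the input strand reflects the fact that a PAM on either strand can define a usable target site.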

Figure 1 shows the different parts of the CRISPR/Cas system. CRISPR can not only cause loss of gene function, i.e., gene knockouts (CRISPRko) (Mali et al. 2013), but can also help to induce gain of function of a gene. CRISPRi (Qi et al. 2013) uses a non-functional Cas9 endonuclease, which does not cleave the target sequence. CRISPRa (Gilbert et al. 2014) can be used to activate a gene when the non-functional Cas9 is attached to a transcriptional activation domain. The sgRNA can be designed in various ways. In addition to binding the target sequence, the sgRNA is responsible for recognizing that sequence within the whole genome. A good sgRNA therefore combines specificity with efficacy.

Fig. 1
An image with a double-stranded target DNA placed horizontally at the left that uncoils from its double helix form into 2 parallel strands and again regains its helix form. An RNA as a loop is placed vertically near the uncoiled DNA and combines with the above strand which leads to DNA fragment insertion and a circle is set in the background marked as Cas9 enzyme.

Mechanism of CRISPR/Cas. (Modified from https://www.labiotech.eu/in-depth/crispr-cas9-drug-discovery/)

The mutations created by NHEJ can be identified by Sanger sequencing, AFLP (amplified fragment length polymorphism), and restriction enzyme assays (Belhaj et al. 2013, 2015). Instead of targeting only one gene, multiple genes can be targeted together, which is called target gene multiplexing (Cong et al. 2013; Čermák et al. 2017). Multiplexing can be used to target either different genes or different sequences in the same gene to increase efficiency, with multiple sgRNAs expressed either under different promoters or under the same promoter (Čermák et al. 2017). Other versions include the Csy4-gRNA and tRNA-gRNA systems (Xie et al. 2015); Csy4 is an endoribonuclease from Pseudomonas aeruginosa. CRISPR/Cas9 has been applied in plants such as wheat, Arabidopsis, tomato, potato, rice, banana, and tobacco (Upadhyay et al. 2013; Andersson et al. 2018; Kaur et al. 2018; Castel et al. 2019).

4 Challenges in the CRISPR/Cas9 System

Despite its several advantages, the CRISPR/Cas9 system faces major hurdles in gene modification that lower the efficiency of genome editing or cause it to fail completely.

4.1 Complex Designing

Despite being highly specific, the conventional techniques are restricted to research laboratories because their complex nature is poorly understood. Customized protein engineering varies from species to species and is more difficult in polyploid genomes. The use of meganucleases is limited because homing sites are rare in any particular genome, and extensive research is required to design DNA-binding domains, which ultimately narrows their application in plant science (Wright et al. 2014). They are also time-consuming and costly in comparison with other GE techniques (Aglawe et al. 2018). ZFNs overcome a few of the problems of meganucleases and broaden the scope of plant genome editing in various agriculturally important crops. However, the selection-based construction of large libraries for different traits requires high technical expertise, which makes ZFNs unsuitable for many laboratories (Maeder et al. 2008; Nelson and Gersbach 2016). The most specific technique, TALENs, requires monomeric DNA-binding domains; hence, comprehensive knowledge is needed to achieve significant target efficiency. Various methods of TALE construction have been developed that use 20–30 monomers and involve ligation and cloning to generate a dimer library, followed by Golden Gate assembly, which is a very tedious and time-consuming process (Abdallah et al. 2015). Currently, CRISPR/Cas9 is the preferred approach, bypassing meganucleases, ZFNs, and TALENs because of its simple RNA-dependent DNA binding followed by cleavage of the target site by a single protein (Zhang et al. 2014; Sharma et al. 2017).

4.2 Inefficient Delivery

To modify a plant genome, the GE construct must be introduced effectively into the plant cell, which makes delivery a crucial step for a successful outcome. The methods available for delivering cassettes into plants include Agrobacterium-mediated transformation, viral vectors, the gene gun, lipofection, etc. (Nelson and Gersbach 2016; Yin et al. 2017; Liu et al. 2020). The target tissues used for transformation include callus, immature embryos, protoplasts, the shoot apical meristem, etc. (Altpeter et al. 2016; Ran et al. 2017). Delivery of the construct is very challenging because each genotype, explant, and type of in vitro culture condition has its own specific requirements (Ran et al. 2017). Additionally, the demand for transgene-free edited plants at commercial scale restricts the most frequently used methods of delivering GE cassettes, creating a tough task for researchers. This emerges as a principal obstacle to the generation of novel traits in plants (Baltes et al. 2014). Although considerable progress has been made in the transformation or delivery of GE reagents in various crops, low regeneration ability and unstable integration still hinder scientific endeavors (Altpeter et al. 2016).

4.3 Selection of Target Site and gRNA Design

The major advantage of the CRISPR machinery is its ability to target a ~21–23 base pair (bp) DNA sequence containing a PAM sequence on the forward or reverse strand. The PAM occurs on average every 8 bp, which provides high flexibility in target site selection (Ramakrishna et al. 2014). In addition, Cas9 proteins from several other organisms have been studied, and different PAM sites have been identified. This diversity in PAM sites has increased the choice of target sequences. However, recent reports have shown that target site selection and guide RNA design are not as simple as previously assumed. Because mRNA transcribed by RNA polymerase II undergoes post-transcriptional modification, RNA polymerase II cannot be used for sgRNA construction (Zhang et al. 2014). Currently, RNA polymerase III-dependent promoters of the U3 small nucleolar RNA (snoRNA) and U6 small nuclear RNA (snRNA) genes are used for guide RNA production. These are ubiquitously expressed housekeeping genes, so their promoters cannot be used to generate tissue-specific or cell-specific gRNAs (Gao and Zhao 2014). As RNA polymerase III is not commercially available, the production of U3- and U6-based guide RNA is also limited. Therefore, approaches should be developed to overcome the limitations of U3 and U6 promoter-based gRNA construction.
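The figure of roughly one PAM every 8 bp can be recovered with a back-of-the-envelope calculation. The derivation below is only a sketch: it assumes an NGG PAM, equal frequencies of the four bases, and independence between neighboring positions, none of which holds exactly in a real plant genome.

```latex
% Probability that a given dinucleotide on the forward strand is GG
P(\mathrm{GG}) = \tfrac{1}{4}\cdot\tfrac{1}{4} = \tfrac{1}{16}

% An NGG PAM on the reverse strand appears as CCN on the forward strand
P(\mathrm{CC}) = \tfrac{1}{4}\cdot\tfrac{1}{4} = \tfrac{1}{16}

% Expected number of PAMs per position, counting both strands
P(\mathrm{GG}) + P(\mathrm{CC}) = \tfrac{1}{16} + \tfrac{1}{16} = \tfrac{1}{8}
\quad\Longrightarrow\quad \text{about one PAM every 8 bp}
```

In GC-poor or GC-rich regions the actual spacing will deviate from this average.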

4.4 Off-Target Effect

In addition to the rational design of sgRNA, off-target DNA cleavage by the Cas9 endonuclease is a major challenge that reduces the efficiency of the machinery. During CRISPR/Cas9 gene editing, the interaction between the endonuclease and the target strand is initiated by recognition of the PAM, which denatures the DNA upstream of the motif and thereby allows the target sequence and the sgRNA to pair and form an R-loop (Ebrahimi and Hashemi 2020). However, Cas9 can also cleave DNA that differs from the target by only a small number of mismatches, which results in off-target cleavage (Herai 2019; Newton et al. 2019). This tolerance stems from the natural combat between bacteria and viruses: viral DNA mutates to escape cleavage by the Cas9 nuclease, and Cas9 in return is able to bind and cleave viral targets that carry a small number of mismatches (Li et al. 2019). Based on the number of mismatches tolerated, various algorithms have been developed to design guide RNAs and to check for off-target sequences. In general, a perfect match (zero mismatches) across the 7–9 PAM-proximal bases is required for DNA cleavage, whereas 3–4 mismatches distal to the PAM still permit Cas9 binding but not cleavage (Dagdas et al. 2017; Singh et al. 2017).
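The mismatch rule of thumb stated above can be expressed as a small Python sketch. This is a deliberately simplified illustration, not one of the published off-target algorithms: the function names, the default seed length of 8 (taken from the 7–9 base range), the threshold of 4 distal mismatches, and the category labels are all assumptions made for this example.

```python
def mismatch_profile(guide: str, site: str, seed_len: int = 8):
    """Count mismatches in the PAM-proximal seed and the PAM-distal region.

    Both sequences are written 5'->3' without the PAM, so the PAM-proximal
    bases are the last `seed_len` positions.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    if not 0 < seed_len < len(guide):
        raise ValueError("seed_len must be between 1 and len(guide) - 1")
    guide, site = guide.upper(), site.upper()
    mismatches = [g != s for g, s in zip(guide, site)]
    seed = sum(mismatches[-seed_len:])
    distal = sum(mismatches[:-seed_len])
    return seed, distal


def classify_site(guide: str, site: str, seed_len: int = 8, max_distal: int = 4) -> str:
    """Apply the rule of thumb from the text (hypothetical labels)."""
    seed, distal = mismatch_profile(guide, site, seed_len)
    if seed == 0 and distal == 0:
        return "perfect match"
    if seed == 0 and distal <= max_distal:
        return "binding possible, review as potential off-target"
    return "binding unlikely"


if __name__ == "__main__":
    guide = "ACGTACGTACGTACGTACGT"  # toy 20-nt guide, not a real target
    print(classify_site(guide, guide))
    print(classify_site(guide, "T" + guide[1:]))   # one PAM-distal mismatch
    print(classify_site(guide, guide[:-1] + "A"))  # one seed mismatch
```

Real design tools weight each position and mismatch type separately, so a binary seed/distal split like this one should be read only as a first filter.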

Off-target effects do not depend solely on the Cas9 endonuclease; they also depend strongly on sgRNA design. Therefore, modifications of the Cas9 protein and of the sgRNA can both limit off-target effects. One approach is chemical modification of the sgRNA backbone. The introduction of 2′-O-methyl-3′-phosphonoacetate at particular sites of the sgRNA, or partial substitution of ribonucleotides with deoxyribonucleotides, can reduce off-target effects (Ryan et al. 2018). Another approach is to create mutant Cas9 systems, for which ZFNs and TALENs serve as an inspiration for improved target precision. One such system fuses dCas9 (deactivated Cas9) to the FokI nuclease domain, which is active only as a dimer. Here, the 20–21 nt guide RNAs bind the target strand. After the sgRNAs bind, the FokI domains dimerize into a functional nuclease and cause double-strand breaks (Guilinger et al. 2014; Tsai et al. 2014). This system significantly reduces off-target cleavage but increases the size of the genome-editing tool, thereby reducing the efficiency of transformation in vivo. Another mutant approach involves the fusion of the Cas9 protein with ZFPs or TALEs, which can target genomic loci with better precision (Bolukbasi et al. 2015).

4.5 Weak Repair Efficiency of HDR in Eukaryotes

In eukaryotic cells, double-strand breaks are repaired with higher efficiency by the NHEJ mechanism. This mechanism commonly repairs the DNA without a template, resulting in indel mutations that cause frameshifts and thereby establish gene knockout or knockdown (Shalem et al. 2014). In contrast, the efficiency of HDR-mediated repair of double-strand breaks is very low in mammalian cells. It has been reported that HDR efficiency in mice after Cas9-based genome editing is 0.5–20%, whereas the repair efficiency through NHEJ is ~20–60% (Maruyama et al. 2015).
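Why an NHEJ-derived indel so often knocks out a gene follows from the triplet reading frame: any net length change that is not a multiple of three shifts every downstream codon. A minimal Python sketch of this check is given below; the function name is hypothetical, and the sketch assumes that the indel lies entirely within a coding exon.

```python
def indel_consequence(ref_len: int, edited_len: int) -> str:
    """Describe the net length change of a coding sequence after repair."""
    delta = edited_len - ref_len
    if delta == 0:
        return "no length change"
    kind = "insertion" if delta > 0 else "deletion"
    # A net change that is a multiple of 3 keeps the downstream reading frame.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{abs(delta)} bp {kind} ({frame})"


if __name__ == "__main__":
    print(indel_consequence(300, 299))  # single-base deletion
    print(indel_consequence(300, 303))  # three-base insertion
```

In-frame indels can still disrupt function if they remove or add critical residues, so this check identifies likely knockouts rather than proving them.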

Various strategies have been developed to enhance HDR and reduce NHEJ. One approach to improving recombination frequency is the use of small molecules that act as inhibitors (of NHEJ) or inducers (of HDR) (Aird et al. 2018). The application of SCR-7 (an NHEJ inhibitor) or RS-1 (an HDR enhancer) has been reported to increase HDR efficiency severalfold (Yu et al. 2015; Vartak and Raghavan 2015; Song et al. 2016). Other approaches include gene silencing, the use of cell lines deficient in NHEJ machinery (Weinstock and Jasin 2006), cell cycle synchronization, and controlled Cas9 expression (Weber et al. 2015). Coordinating the delivery of the CRISPR/Cas9 machinery with the cell cycle has increased CRISPR/Cas9-mediated DNA repair. Cas9 can be synchronized with cell cycle progression by fusing it to the human DNA replication inhibitor geminin, which leads to post-translational modification of the Cas9 endonuclease (Gutschner et al. 2016). Although NHEJ inhibitors and HDR inducers have increased HDR-mediated genome editing, they are toxic to host cells and impair cell growth. To overcome these problems, another approach has been developed that relies on covalent tethering of the repair template to the ribonucleoprotein complex. This approach increased HDR by ~30-fold, and the strategy was shown to be applicable to various other organisms and target loci (Aird et al. 2018).

4.6 Cas9 Endonuclease Activity and Cytotoxicity

Several Cas9 proteins from different species have been identified and used in genome editing, including those from Staphylococcus aureus (SaCas9) (Ran et al. 2013a, b), Streptococcus thermophilus (StCas9) (Kleinstiver et al. 2016), and Streptococcus pyogenes (SpCas9) (Vento et al. 2019). Among these, SpCas9 is the endonuclease most commonly used for genome editing in prokaryotes. SpCas9 is a well-characterized endonuclease with a simple PAM site and a high expression rate in various prokaryotes. In eukaryotes, different Cas9 proteins have been used, and selecting a specific Cas9 ortholog for each organism has improved the efficiency of editing for a specific sequence. In addition to ortholog selection, many other factors have been shown to affect the activity of the Cas9 protein. For gene editing to occur in eukaryotes, the Cas9 protein must translocate into the nucleus, so a nuclear localization signal (NLS) must be attached to it. It has been reported that separating the NLS from Cas9 with a 32-amino-acid spacer markedly increased DNA cleavage activity (Shen et al. 2013). In addition, increasing the guide RNA:Cas9 ratio was shown to increase on-target cleavage activity by ensuring that all Cas9 protein forms an active R-loop complex with the sgRNA and DNA (Kim et al. 2014). Unlike other endonucleases, Cas9 has low activity, with a single-turnover rate of ~0.4–1.0 per min (Jinek et al. 2012). Also, once Cas9 is bound to the target DNA, it is displaced slowly and remains attached even after DSB formation; it has been reported that 1 nM Cas9 enzyme cleaved ~2 nM plasmid DNA after 2 h (Jinek et al. 2012). This shows that the Cas9 enzyme works more like an actuator than a catalytic enzyme.

In addition to enzyme activity, the cytotoxicity of the Cas9 protein can be a crucial obstacle in gene editing (Vento et al. 2019). Various attempts have been made to reduce this toxicity for efficient genome editing using programmable DNA cleavage. The first approach uses an inducible expression system for Cas9, in which Cas9 activity is greatly reduced when no inducer is present (Reisch and Prather 2015). Another approach uses toxin-free endonucleases or variant nucleases. The Cas9 variant Cas9n (nickase) has shown reduced toxicity because it cuts only a single strand of DNA (Standage-Beier et al. 2015).

5 Approaches

5.1 Base Editing

Base editing allows nucleotide substitutions in the genome by using modified Cas effectors without producing DSBs (Komor et al. 2016). There are two types of base-editing tools, the cytidine base editor (CBE) and the adenine base editor (ABE), which enable cytosine-guanine to thymine-adenine and adenine-thymine to guanine-cytosine transitions, respectively (Chen et al. 2019). The CBE and ABE tools use cytosine and adenosine deaminases for cytosine and adenine base editing (Komor et al. 2016; Nishida et al. 2016; Gaudelli et al. 2017; Ren et al. 2018; Wang et al. 2018). In the CBE system, a nickase or dead Cas protein is fused to a cytidine deaminase, which catalyzes the conversion of cytidine (C) to uridine (U). In the ABE system, an engineered Escherichia coli tRNA adenosine deaminase (TadA) is fused to a nickase or dead Cas protein, leading to conversion of adenine (A) to inosine (I), which is read as guanine by DNA polymerase during DNA replication. Plasmid transfection and viral delivery are the common methods of delivering base editors into living cells (122,124). Base editors thus introduce targeted substitutions in genes, and they have been employed intensively in model plants and crops to improve agricultural traits, including flowering, plant height, disease resistance, and herbicide resistance (Chen et al. 2017; Shimatani et al. 2017; Kang et al. 2018; Tian et al. 2018; Chen et al. 2019; Wu et al. 2020). Other major applications are the study or treatment of disease-associated point mutations, the use of base editors as recorders of cellular events in biomedical research, and the introduction of premature stop codons by CBE to disrupt genes in a homogeneous manner (Landrum et al. 2014, 2016; Farzadfard and Lu 2018). However, in mammalian cells, the challenge is to circumvent DNA repair processes that oppose target base pair conversion (Rees and Liu 2018). Human cells efficiently repair the U·G intermediate through base excision repair, in which uracil N-glycosylase (UNG) recognizes the U·G mismatch and cleaves the glycosidic bond between uracil and the deoxyribose backbone (Kunz et al. 2009). The generation of indels, targeting limitations, and off-target editing of DNA base pairs have also been reported (Rees and Liu 2018), and scientists worldwide are working to overcome these limitations. Base editing of RNA is also possible and provides powerful capabilities for the life sciences and medicine. To date, only deamination of A to I has been reported (Vogel and Stafforst 2019).
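The use of a CBE to introduce premature stop codons can be illustrated by listing the codons in which a single C-to-T change on the coding strand creates a stop codon. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, the input is assumed to be an in-frame coding sequence, and the sketch ignores the editing window, PAM availability, and edits on the opposite strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def cbe_stop_candidates(cds: str):
    """Find codons that a single C->T change turns into a stop codon.

    Returns (codon_index, original_codon, edited_codon) tuples, with
    codon_index counted from 0 at the start of the coding sequence.
    """
    cds = cds.upper()
    hits = []
    # Walk the sequence codon by codon, ignoring any incomplete final codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        for pos, base in enumerate(codon):
            if base != "C":
                continue
            edited = codon[:pos] + "T" + codon[pos + 1:]
            if edited in STOP_CODONS:
                hits.append((i // 3, codon, edited))
    return hits


if __name__ == "__main__":
    # Toy coding sequence (an assumption, not a real gene).
    for index, before, after in cbe_stop_candidates("ATGCGACAAGGATGA"):
        print(index, before, "->", after)
```

On the coding strand, only CAA, CAG, and CGA can be converted in this way; a practical design would additionally check that the target C falls within the editor's activity window next to a suitable PAM.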

5.2 Prime Editing

Prime editors are used for precise editing; they employ the same targeting mechanism as conventional CRISPR/Cas systems but do not require DSBs (Anzalone et al. 2019). Prime editing uses a longer-than-usual guide RNA, commonly known as the pegRNA. The major component of a prime editor is a fusion of Moloney murine leukemia virus (M-MLV) reverse transcriptase (RT) and Cas9 nickase (nCas9). The nCas9/M-MLV RT/pegRNA complex mediates site-specific nicking by nCas9; the pegRNA then serves as a template for the RT, and reverse transcription ultimately produces stably edited DNA (Zhan et al. 2020). Base conversions and small insertions can be achieved with prime editors (Anzalone et al. 2019). There are three prime editors: PE1, a combination of Cas9 H840A nickase and the wild-type (WT) M-MLV RT enzyme; PE2, in which the thermostability, processivity, and DNA-RNA substrate affinity of the RT component are improved; and PE3, in which a second gRNA is introduced in addition to the pegRNA (Anzalone et al. 2019). Prime-editing systems are capable of precise genome editing in human cells, and scientists have also tried to create mutations in the genome for treating rare genetic diseases, including sickle cell disease (SCD), Tay-Sachs disease, and prion diseases in humans (Matsoukas 2020). Recently, Xu et al. (2020) developed a plant prime-editing system, plant prime editor 2, which was tested through targeted mutation of an HPT-ATG reporter in rice. Its development is an essential addition to genome-editing technologies and also addresses limitations of CRISPR/Cas. However, this technology has challenges of its own: it may not be able to create DNA insertions or deletions as large as those made by conventional CRISPR/Cas systems, the presence of the RT raises the possibility of cDNA insertion, and the large protein constructs may affect the delivery of the full-length therapeutic protein (Matsoukas 2020). Hence, further research is required to optimize prime editors and maximize their efficiency in different cell types.

6 Conclusion and Future Prospects

The steady and undesirable changes in climatic conditions, together with depletion of natural resources and biodiversity, are creating new challenges for sustainable crop production. The rapidly and constantly evolving GE techniques assist plant scientists through numerous applications in crop improvement. These advances widen the scope of trait refinement in diverse genetic backgrounds, irrespective of their natural mechanisms. However, efficient use of GE techniques requires a deep understanding of the interaction between genotype and phenotype under different environmental conditions, as most agronomically important plant features are controlled through complex genetic mechanisms. Conventional GE techniques initially gained the attention of researchers because of their ability to induce directed DSBs followed by natural repair. However, the natural weapon of the bacterial immune system transformed the era of GE and emerged as the principal genome-modifying tool. Its need for only a short sgRNA and a single-unit Cas9 nuclease protein makes it the first preference over the conventional methods. CRISPR opens up tremendous opportunities in the living world beyond DSB-mediated SDNs, including the study of regulatory elements, complex genetic mechanisms, cell signaling, chromatin modeling, etc. These developments have broadened the scope of assessment in various agriculturally important monocot and dicot species as well as in plants with multiple copies of their genomes. Adequate functional genomics information is a prerequisite for targeting any specific genotype. GE techniques, especially CRISPR, have shown the ability to produce precise knock-out, knock-in, and knock-down mutations, resulting in loss of function, gain of function, and specific transient modulation, respectively. 
The rapid advancement in CRISPR techniques, accompanied by dynamic natural variation, presents enormous opportunities for the betterment of plant sciences. On the other hand, technical difficulties and shortcomings such as off-targets, genotype dependency, construct delivery, and unintended effects delay the translation of these techniques from basic research to applied studies. The techniques are already standardized in model plants such as Arabidopsis and rice, but their exploitation in other important plant species needs to be addressed in parallel. The exponential growth of the GE field in recent years promises plant scientists customized and flexible solutions for advancing the agricultural sciences. In conclusion, this handy approach has shown the potential to counter upcoming agricultural, environmental, social, and geological challenges and ultimately to help avert a food crisis under a changing climate.