1 Introduction

The emergence of transcription activator-like effector nucleases (TALENs) has made genome-editing tools widely accessible to any laboratory with basic molecular biology expertise. The development of the TALEN technology and its use in various biotechnological applications build on the considerable progress in genome editing over the previous decade with other approaches. Accordingly, the availability of the TALEN technology over the past few years has led to numerous advances in genome editing in a diverse range of cell types and organisms. This facile genome editing approach has facilitated new strategies to model disease, develop novel genetic therapies, or create desired phenotypic properties through highly specific rewriting of the genome. In this chapter, the development and use of TALEN technologies are reviewed and discussed.

1.1 Genome-Editing Systems

Genome editing with engineered site-specific endonucleases has emerged as a technology to selectively replace or correct disrupted genes, in contrast to conventional genetic engineering methods of gene addition [1, 2]. There are numerous platforms for generating site-specific gene modifications in the genome, but to date the most successful have been based on zinc finger nucleases [3, 4], TALENs [5, 6], and, more recently, the RNA-guided CRISPR/Cas9 system [7–9]. These systems are at present the most developed publicly available platforms for robust and efficient targeted gene editing. In particular, the recent development of TALENs and CRISPR/Cas9 has dramatically advanced genome editing due to their ease of engineering and efficient genetic modification [6–20]. Other systems in development include meganucleases [21, 22], triplex-forming oligonucleotide (TFO) complexes [23], and programmable recombinases based on zinc finger protein [24–27] or TALE DNA-binding domains [28]. Historically, meganucleases have been difficult to engineer due to interdependence of the DNA-binding and cleavage domains, although recent developments in directed evolution of meganucleases [29–31] and fusion of meganucleases to TALE DNA-binding proteins [32, 33] are providing promising new opportunities with this technology. TFO complexes have thus far been limited by relatively low levels of gene modification, but oligonucleotide-mediated gene editing can be improved with the incorporation of TALENs [34]. Programmable recombinases are a promising next-generation gene editing technology, but target site requirements, overall efficiency, and unknown off-target effects are still major challenges to the widespread adoption of this technology [35].

1.2 Nuclease-Mediated Genome Editing

Engineered nucleases generate targeted genome modifications by creating a targeted double-strand break in the genome that stimulates cellular DNA repair through either homology-directed repair (HDR) or nonhomologous end joining (NHEJ) [36, 37] (Fig. 1). Briefly, HDR uses a designed synthetic donor DNA template to guide repair and can be used to create specific sequence changes to the genome, including the targeted addition of whole genes. HDR has enabled integration of gene cassettes of up to 8 kb in the absence of selection at high frequency (~6 %) in human cells [38]. Generally, gene correction strategies have been based solely on HDR, the efficiency of which is dependent on the genomic target, cell type, cell-cycle state, and efficient delivery of an exogenous DNA template [39–43]. In many cases, antibiotic selection is used in tandem with genome editing for gene correction in cell types with low levels of HDR [40–42]. In contrast to genome modification by HDR, the template-independent religation of DNA ends by NHEJ is a stochastic, error-prone repair process that introduces random small insertions and deletions at the DNA breakpoint (Fig. 1). Gene editing by NHEJ has been used in mammalian cells to disrupt genes [44, 45], delete chromosomal segments [46–48], or restore aberrant reading frames [49, 50]. This chapter reviews how TALENs have been used to exploit NHEJ and HDR DNA repair processes to create highly specific changes to a desired gene.
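The arithmetic behind restoring an aberrant reading frame with NHEJ-induced indels can be sketched in a few lines. This is a minimal illustration only: the function names and the example indel sizes are hypothetical and are not taken from the cited studies. It relies on the fact that a net indel of d bases shifts the frame by d modulo 3, so an indel restores a frame that is out of phase by s bases when s + d is a multiple of 3.

```python
def frame_shift(net_indel: int) -> int:
    """Return the reading-frame offset (0, 1, or 2) caused by a net indel.

    Positive values are insertions, negative values are deletions.
    """
    return net_indel % 3


def restores_frame(existing_offset: int, net_indel: int) -> bool:
    """True if an NHEJ indel brings a disrupted frame back into phase.

    existing_offset: how many bases the frame is already out of phase
    (hypothetical input, e.g. from an upstream deletion).
    """
    return (existing_offset + net_indel) % 3 == 0


if __name__ == "__main__":
    # Hypothetical case: an upstream mutation leaves the frame out by +1.
    for indel in (-4, -2, -1, 1, 2, 3):
        print(indel, restores_frame(1, indel))
```

Because NHEJ outcomes are stochastic, only a fraction of the indels produced at a given break would be expected to satisfy this condition.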

Fig. 1 Mechanisms of DNA repair following nuclease-induced double-strand breaks. (a) In the absence of a DNA repair template, the break is repaired by nonhomologous end joining, which is an error-prone process and can lead to small insertions or deletions. Alternatively, two adjacent nuclease-induced breaks can be used to excise the intervening chromosomal DNA from the genome. (b) If a DNA repair template is provided with homology to the target site surrounding the break, it will be used to guide homology-directed repair. In this way, particular small changes to the DNA sequence or the insertion of whole-gene expression cassettes can be directed to specific genome target sites.

2 Development of TALENs

2.1 TALE DNA Recognition

In 2009, two landmark studies described the simple and modular TALE DNA-binding domain [14, 15]. These novel DNA-binding proteins are naturally occurring transcriptional activators from the plant pathogen Xanthomonas. As reported in these studies, the TALE DNA-binding domain consists of numerous tandem repeats, with each repeat specifying recognition of a single base pair of DNA. Importantly, single-base-pair recognition by each repeat is determined by alteration of only two hypervariable amino acids, termed repeat variable diresidues (RVDs), and each repeat appears to recognize DNA in a modular manner. This simple mode of DNA recognition was confirmed in structural studies of a naturally occurring TALE bound to its cognate DNA target [51, 52]. These discoveries were quickly expanded upon to create novel TALE proteins by engineering a custom RVD array to recognize a user-specified DNA target [53–55]. The only sequence requirement for TALE binding is that each target site be immediately preceded by a 5′-thymine for efficient DNA recognition, although more recently modified proteins have been developed to accept other nucleotides at this position [56, 57]. These novel DNA-binding domains were then fused to transcriptional activator domains [53, 55], nuclease catalytic domains [11, 54, 55], epigenetic modifying domains [58, 59], and recombinases [28] to generate an array of programmable enzymes for manipulating genes in complex genomes.

Although naturally occurring TALEs have a modular RVD recognition code, several studies have shown that some RVDs, specifically those targeting guanosine, display unexpected recognition of degenerate bases in the context of engineered TALE DNA-binding domains [55, 60]. However, more specific RVDs, such as NH or NK for recognition of guanosine, can result in significantly reduced activity of the reengineered TALE protein [60–62]. Recently, a publicly available web server has been developed that generates TALE targets utilizing more specific RVDs predicted to have minimal impact on activity [63]. Other publicly available web servers assist in generating RVD arrays that are predicted to have high activity and specificity [64–66]. Together, these studies demonstrate the overall robustness of TALE DNA recognition and its utility in generating highly active nucleases at novel targets of interest.
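The one-repeat-per-base recognition described in this section lends itself to a short sketch of how a target sequence might be translated into an RVD array. The NI, HD, and NG assignments for adenine, cytosine, and thymine are the commonly reported RVD code and are an assumption here, since this chapter does not list them; NN, NH, and NK are the guanine options discussed above. The function name and its parameters are hypothetical.

```python
# Commonly reported RVD assignments (assumed here, not listed in this chapter).
DEFAULT_RVDS = {"A": "NI", "C": "HD", "T": "NG"}
# Guanine options discussed in the text: NN is more degenerate, while NH and NK
# are more specific but may reduce activity.
GUANINE_RVDS = ("NN", "NH", "NK")


def design_rvd_array(target: str, upstream_base: str, guanine_rvd: str = "NN") -> list[str]:
    """Return one RVD per base of `target`, read 5' to 3'.

    upstream_base is the base immediately 5' of the target (the N0 position);
    standard TALE scaffolds expect a thymine there.
    """
    target = target.upper()
    if upstream_base.upper() != "T":
        raise ValueError("standard TALE scaffolds require a 5' thymine at N0")
    if guanine_rvd not in GUANINE_RVDS:
        raise ValueError(f"guanine RVD must be one of {GUANINE_RVDS}")
    code = dict(DEFAULT_RVDS, G=guanine_rvd)
    unknown = set(target) - set(code)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [code[base] for base in target]


if __name__ == "__main__":
    print(design_rvd_array("GATC", upstream_base="T"))
    print(design_rvd_array("GATC", upstream_base="T", guanine_rvd="NH"))
```

Exposing the guanine RVD as a parameter mirrors the trade-off noted above between specificity and activity; a real design tool would also weigh array length and overall RVD composition.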

2.2 Assembly of RVD Arrays to Create Customized TALE DNA-Binding Domains

Synthesizing custom TALE DNA-binding domains requires contiguous assembly of many RVD repeats, each only differing by two amino acids, into a destination TALE array. The large number of repeats, typically 15–20, makes this process difficult with conventional recombinant DNA technology. To overcome this technical challenge, several approaches have been developed that iteratively assemble new TALE arrays in a highly efficient and rapid manner. Custom TALE arrays can be created from a relatively small library of plasmids using publicly available reagents and “Golden Gate” molecular cloning techniques, which assemble new arrays within a few days [13, 53]. These methods are simple and only require reagents and equipment commonly found in molecular biology labs, although the overall throughput of assembly is limited. Other protocols are well suited to high-throughput generation of TALE arrays using solid-phase assembly [12, 67] or ligation-independent cloning techniques [68]. Notably, with the proper equipment, these high-throughput assembly methods are able to generate dozens to hundreds of TALEN constructs in 1 day. Alternatively, TALE arrays can also be custom ordered and pre-validated through commercial sources such as Life Technologies, Cellectis Bioresearch, Transposagen Biopharmaceuticals, and System Biosciences.

2.3 TALE Nuclease Architectures

Conventionally, TALEN monomers are created as a fusion of the TALE DNA-binding domain to the nonspecific endonuclease catalytic domain of FokI. Site-specific double-strand breaks are created when two separate nuclease monomers bind to adjacent target DNA sequences on opposite strands in a tail-to-tail fashion, thereby permitting dimerization of FokI and cleavage of the target DNA (Fig. 2). Thus, since FokI acts as a dimer, TALENs are designed in pairs to guide two separate FokI monomers to a desired target site. Several TALEN architectures have been described that demonstrate improved nuclease activity by truncating the C-terminus of the TALE DNA-binding domain [11, 55, 69]. These studies also show that the translocation domain on the TALE N-terminus can be removed without impacting activity. Moreover, these truncations can be used to restrict the length of the sequence allowed between the TALEN monomers [55] and may be useful for restricting potential off-target mutagenesis. Directed evolution of the TALE DNA-binding domain has also yielded mutants that have higher observed gene editing activity against episomal and chromosomal targets [70]. Alternate nuclease catalytic domains are also possible; for example, fusions of TALEs to monomeric meganucleases have recently been shown to improve targeting of these enzymes [32].
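The tail-to-tail pair geometry described above can be illustrated with a small search over a top-strand sequence. This is a rough sketch under stated assumptions: the binding-site length of 15 bases and the spacer range of 12 to 21 bases are illustrative defaults rather than values given in this chapter, and all names are hypothetical. The left monomer reads the top strand and needs a thymine immediately 5′ of its site; the right monomer reads the bottom strand, so its 5′ thymine appears as an adenine immediately 3′ of its site on the top strand.

```python
from typing import Iterator, NamedTuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


class TalenPair(NamedTuple):
    left_site: str   # read 5'->3' on the top strand
    spacer: str      # top-strand sequence between the two binding sites
    right_site: str  # read 5'->3' on the bottom strand


def find_talen_pairs(
    seq: str,
    site_len: int = 15,                       # assumed repeat count per monomer
    spacer_range: tuple[int, int] = (12, 21), # assumed spacer limits
) -> Iterator[TalenPair]:
    """Yield candidate TALEN pairs whose sites are each preceded by a 5' T."""
    seq = seq.upper()
    shortest, longest = spacer_range
    for left_start in range(1, len(seq) - site_len):
        if seq[left_start - 1] != "T":        # N0 of the left monomer
            continue
        left_end = left_start + site_len
        for spacer_len in range(shortest, longest + 1):
            right_start = left_end + spacer_len
            right_end = right_start + site_len
            if right_end >= len(seq):         # need one base 3' of the site
                break
            if seq[right_end] != "A":         # N0 of the right monomer (T on bottom strand)
                continue
            yield TalenPair(
                seq[left_start:left_end],
                seq[left_end:right_start],
                reverse_complement(seq[right_start:right_end]),
            )
```

Narrowing `spacer_range` corresponds to the use of truncated architectures to restrict the allowed distance between monomers, as mentioned above.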

Fig. 2 TALEN architecture and structure. (a) The TALE DNA-binding domain consists of the array of RVDs engineered to recognize specific sequences, along with fixed N- and C-terminal domains (orange), fused to the catalytic domain of the FokI endonuclease (blue). (b) Schematic of the TALEN structure, with TALEs (orange, PDB 3UGM) fused to the FokI domain (blue, PDB 2FOK) on DNA (green).

2.4 Enhancement of Nuclease Activity

Several improvements have been made to enhance the specificity of the FokI chimeric nucleases. A major advance was the identification of mutations that require heterodimerization of the nuclease pairs [71–73], thereby preventing potential homodimerization of nuclease monomers at unintended target loci. Furthermore, introduction of distinct obligate heterodimer mutations can be used to create two independent TALEN pairs by preventing unexpected interactions between monomers from either pair [48]. Introduction of inactivating mutations to the FokI domain on one of the two nuclease monomers in each pair can be used to generate targeted nickases. The single-strand nicks generated by these enzymes facilitate high levels of HDR but do not stimulate error-prone NHEJ repair [74, 75]. TALE nickases therefore display significantly reduced mutagenesis at off-target loci. Finally, directed evolution was utilized to find mutations that enhance the activity of FokI in a target site-independent manner [76].

2.5 Relaxation of the 5′-Thymine Targeting Requirement

The range of DNA sequences that can be targeted by TALEs is constrained by a strict requirement of a thymine base at the zero base position (N0) [55]. The crystal structure of a natural TALE protein suggests that there is a cryptic repeat domain in the N-terminus of the protein that specifically recognizes thymine [51, 52]. Novel TALE architectures have been developed to overcome this requirement by engineering this region of the TALE N-terminus to recognize alternative bases at this position [56, 57] or by utilizing TALE-like domains from related plant pathogens [77, 78] that naturally recognize guanine at the N0 position. However, DNA-binding activity of these TALE architectures may be reduced, especially for targets with adenosine and cytosine bases at the N0 position. Further work in this area may yield TALE scaffolds that can readily target sequences with any base at the N0 position with high efficiency.

2.6 Targeting Methylated DNA

The methylation status of the target DNA locus is known to impact DNA binding of TALE proteins, particularly with chromosomal targets directly containing 5-methylcytosine (5mC) [79, 80]. As a result, DNA methylation can significantly reduce or completely eliminate TALE binding. Methylation analysis of a target locus can be used to generate TALENs targeted to open chromatin; however, this further restricts the utility of TALENs for site-specific gene modification. Global demethylation of a target genome using chemical modifiers such as 5-aza-2′-deoxycytidine can rescue TALE binding [79]; however, these methods are commonly associated with undesirable toxicity. More attractive methods have been developed that substitute specific RVDs in TALE proteins to efficiently bind particular methylated and/or demethylated cytosines in the target sequence. Thus, TALE proteins can be reengineered either to be insensitive to cytosine methylation, by using the N* RVD that binds to both cytosine and 5mC [81], or to discriminate between the two, by utilizing RVDs that specifically recognize 5mC (NG) or cytosine (HD) [82]. By substituting these particular RVDs, TALENs can be engineered to target these sites with high efficiency. It is also noteworthy that TALEs have been shown to target regions that are insensitive to DNase I, indicating that these proteins are able to access sites located in heterochromatin [83]. These studies were performed in dividing cells, and future work is necessary to determine the role of DNA replication in facilitating access to these target sites.
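The two RVD substitution strategies described in this section can be expressed as a simple lookup. The sketch below is illustrative only: the function name, the mode labels, and the idea of supplying methylated positions as a set (for example from a bisulfite sequencing experiment) are assumptions, and applying N* to every cytosine in the insensitive mode is a simplification.

```python
def cytosine_rvds(target: str, methylated: set[int], mode: str = "insensitive") -> dict[int, str]:
    """Choose an RVD for each cytosine in `target`, keyed by 0-based position.

    methylated: positions known to carry 5mC (hypothetical input).
    mode "insensitive": N* at every cytosine, since N* binds both C and 5mC.
    mode "specific":    NG at 5mC positions and HD at unmethylated cytosines.
    """
    target = target.upper()
    if mode not in ("insensitive", "specific"):
        raise ValueError("mode must be 'insensitive' or 'specific'")
    for pos in methylated:
        if not 0 <= pos < len(target) or target[pos] != "C":
            raise ValueError(f"position {pos} is not a cytosine in the target")
    choices = {}
    for pos, base in enumerate(target):
        if base != "C":
            continue
        if mode == "insensitive":
            choices[pos] = "N*"
        else:
            choices[pos] = "NG" if pos in methylated else "HD"
    return choices


if __name__ == "__main__":
    print(cytosine_rvds("ACGC", methylated={1}, mode="specific"))
    print(cytosine_rvds("ACGC", methylated={1}))
```

The specific mode ties the resulting TALE to one methylation state of the locus, whereas the insensitive mode trades that discrimination for binding regardless of methylation.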

2.7 Delivery of TALENs

TALEN monomers are readily delivered by DNA expression cassettes or directly as mRNA by conventional transfection methods. However, the size of TALEN monomers and the highly repetitive array of RVD sequences present a significant challenge to viral delivery of TALEN constructs, thereby potentially limiting their utility in some gene editing applications. Adenovirus presents an attractive vehicle for delivering gene constructs encoding both TALEN monomers [84], although adenovirus has limited tropism in some cell types and is highly immunogenic. Interestingly, the same study also demonstrated that lentivirus was unable to deliver intact TALEN gene cassettes, due to rearrangements in the TALEN-coding region caused by the repetitive structure of RVD arrays. This limitation was overcome by the development of recoded TALEN constructs, termed re-TALEs, that can be efficiently expressed by lentiviral delivery [20], although this method may require reoptimization and synthesis of each new TALE gene. In contrast to DNA or mRNA delivery, direct protein delivery of TALENs can be achieved by utilizing cell-penetrating peptides covalently bound to purified TALEN proteins [85]. This method enables efficient genome editing in cells without the risk of spontaneous integration of the TALEN DNA expression construct into the genome that can be caused by non-viral and viral gene delivery. Furthermore, previous evidence suggests that protein delivery of gene-editing nucleases may reduce off-target activity by limiting the duration of nuclease exposure [86].

3 Applications in Basic Science and Biotechnology

Conventional genetic engineering methods involve the addition of new genes to cellular genomes by random integration of foreign genetic material into the chromosomal DNA. In contrast, genome editing using engineered nucleases enables precise manipulation at nearly any desired locus with high efficiency. Importantly, genome editing can generate a variety of genetic mutations without leaving any exogenous DNA sequences in the target genome. The development of high-throughput TALE assembly methods, in combination with high success rates of engineering highly active TALEN pairs, has resulted in the unprecedented ability to manipulate any gene of interest in a diverse array of organisms (Table 1). As one example of the breadth of TALEN assembly and applicability, libraries of TALENs have been generated to target 18,740 human protein-coding genes [80].

Table 1 Examples of biotechnology applications of TALEN-mediated gene modification

A powerful application of the TALEN technology is to rapidly and efficiently generate cellular models of human disease or to interrogate disease-related mutations or genes. This approach has been exploited to create disease-associated genetic mutations in somatic and stem-cell models for a variety of human diseases [87]. Notably, in this study, few if any TALEN-associated off-target mutations were detectable in many of the modified cell populations. High-throughput TALEN assembly was also used to interrogate a large panel of genes related to epigenetic regulation or cancer, with successful modification of >85 % of targeted genes [12]. The ease of TALEN technologies has enabled researchers to rapidly generate large genomic deletions to quickly interrogate microRNA function [88, 89]. These notable examples demonstrate that TALENs are a versatile tool to interrogate and study small and large genetic elements in complex genomes.

TALENs have also enabled rapid gene modification to efficiently generate transgenic species or to knock out genes of interest. This has enabled the study of a variety of genes of interest in a diverse range of organisms, including mice [90, 91], rats [92], pigs [93], cows [93], zebrafish [47, 94, 95], C. elegans [96, 97], newts [98], silkworm [99], flies [100], mosquitoes [101], and frogs [102]. In addition, genome engineering is an exciting method to address challenges in plant engineering [103, 104]. Many plant genes are arranged in tandem arrays, making it difficult to selectively alter single genes to study or impart new gene function. The ability of TALENs to discriminate between sequences that differ by relatively few mismatches makes this technology particularly powerful for altering specific gene arrays. An example of this approach is the application of TALENs in rice to generate disease resistance, as well as the rapid modification of numerous other genes [105]. Other studies have demonstrated that TALENs are a powerful platform to rapidly modify genes in other plants, including Arabidopsis thaliana [106], barley [107], and Brachypodium [105].

4 Applications for Gene Therapies

Gene therapies using designer nucleases have shown promise to correct the genetic basis of human diseases [1, 2]. The significant advances made in the efficiency and precision of novel genome engineering technologies across the past decade have led to the development of TALENs targeted to numerous genes related to a range of human diseases (Table 1). In contrast to gene replacement therapies, genome editing can directly correct mutations associated with disease. For example, we developed TALENs to generate small insertions and deletions to restore the reading frame of the dystrophin gene as a novel method to correct the molecular basis of Duchenne muscular dystrophy [50]. TALENs have also been used to correct mutations associated with epidermolysis bullosa [108], sickle cell disease [109, 110], beta-thalassemia [111], xeroderma pigmentosum [112], and alpha-1 antitrypsin deficiency [113] by homologous recombination and to correct mitochondrial DNA disorders [114] by deletion of aberrant sequences.

Beyond correction of mutant genes, gene editing strategies have been developed to modify genes in order to modulate disease phenotypes. ZFNs targeted to the gene encoding the HIV-coreceptor CCR5 are currently in clinical trials and have laid the groundwork for genome editing as a novel treatment modality [44, 115]. Studies have demonstrated that TALENs can also introduce efficient mutations to CCR5 [11, 55, 116, 117] and present an alternative gene editing technology for this application. TALENs have also been designed to target and eliminate hepatitis B viral genomes from human cells [118, 119]. TALENs have been utilized to disrupt the myostatin gene [120], the loss of which leads to hypertrophy of skeletal muscle that could be used to treat a range of diseases, including muscular dystrophies. Collectively, these studies show that TALENs are a powerful technology to generate a variety of gene modifications to correct human diseases.

5 Discussion

Over the past 5 years, the rapid advancement of genome editing technologies has led to widespread adoption of various gene editing platforms for a diverse range of applications [16]. TALEN technologies have made effective gene-editing tools accessible to nearly any researcher at low cost. The robustness of this technology has enabled researchers to rapidly and efficiently interrogate a large number of genes in a range of organisms (Table 1). Importantly, TALENs have impressive observed specificity and several advances in this field have further improved the fidelity of this approach [56, 57, 60, 61, 63, 121]. The specificity and efficiency of these approaches may be further improved as second-generation technologies are developed, such as TALE recombinases [28] and single-chain TALE-meganuclease fusions [32, 33]. The easily programmable TALE DNA-binding domain has also been a boon to creating other synthetic enzymes to regulate gene expression [53, 83, 122] and the epigenome [58, 59]. Although the recent advent of CRISPR/Cas9-based genome-engineering tools has provided an alternative facile method for gene editing [7, 123, 124], there are many differences between the two technologies and various applications could benefit from the strengths of each approach. Collectively, TALENs and other TALE-based gene-modifying tools have introduced publicly available, low-cost, efficient, and rapid gene modification that is accessible to any lab and has enabled studies for a remarkable variety of applications.