16.1 Introduction

Protozoan parasites remain important infectious agents affecting animal and human health globally. Their often complex life cycles and relationships with their hosts present numerous challenges to the development of improved methods of control. Thus, new, combined research strategies are urgently required to accelerate the rate of discovery. In this chapter, we focus on the application of genomics and genetic manipulation techniques as tools to improve our understanding of the biology of a selected group of typically neglected protozoan parasites of veterinary and medical importance (Table 16.1).

Table 16.1 Full genome, and “omic” studies currently performed on a selected group of protozoan parasites of veterinary importance

Ideally, and in addition to other measures, new and improved vaccines and novel drugs are needed to prevent or treat most of the burden of disease caused by this diverse group of protozoan pathogens. In particular, developing improved control using rational approaches requires an advanced understanding of the parasites’ biology, of their interactions with their hosts, and, particularly for vaccine development, of the mechanisms of protective immunity. Recent significant advances in our understanding of the biology of most protozoan parasites affecting farm animals began with the arrival of the “omics” era, including genomics, proteomics, transcriptomics, metabolomics, and lipidomics, as well as other existing or future “omics.”

The emergence of genomics, perhaps the initial “omics,” and the provision of the first complete and annotated organism genomes permitted the identification of numerous species-specific genes, but it was quickly realized that genomic approaches alone were insufficient to understand gene function. The value and utility of genomic data, however, are greatly increased when complemented with additional approaches such as transcriptomics, proteomics, lipidomics, and metabolomics. Thus, the “omics” field is extremely dynamic, and progress can be expected to accelerate as novel computer-aided strategies of data management and analysis become able to integrate the massive data arriving from all of these diverse research fields. The simplified scheme shown in Fig. 16.1 depicts a model of analysis involving several different “omics” strategies. For example, genome annotation and re-annotation, an area related to genomics, usually depends on proteomic as well as transcriptomic data. Additionally, distinct epigenetic markers in identical genomes can influence gene transcription and hence affect everything occurring downstream in the natural flow of information in a cell.

Fig. 16.1 Interrelationships among commonly used “omics” methods applied to the functional characterization of protozoan parasites

Metabolomics aims at integrating the flow of pathways and metabolites involved in cell function at a given moment in the life of the cell. Lipidomics and glycomics involve the study of pathways and networks of cellular lipids and sugars in biological systems, respectively, while fluxomics is aimed at determining the rates of metabolic reactions in biological systems. Finally, phenomics studies the set of physical and biochemical traits of a given organism as they respond to mutations and environmental changes. In fact, the relationships among the different “omics” are highly dynamic, and information may flow in any direction between them. In general, the combined use of these techniques in “cross-sectional” studies can provide useful snapshots that may deliver insights into a parasite’s lifestyle, status, and survival strategies. In particular, comparing integrated “omics” profiles of different stages of a parasite’s life cycle can provide useful information on the lifestyle of any unicellular organism, as is the case here for protozoa. Additionally, comparison of virulent and attenuated parasite strains or lines using “omics” approaches may also provide revealing insights into regulatory and metabolic networks and can be useful for the identification of virulence factors.

16.2 Genomics and Beyond

Genomics greatly facilitated the development of methods of genetic manipulation of these protozoan parasites, besides helping to provide a general blueprint of the biology of the organisms. Important progress has so far been achieved in genome-wide, transcriptomic, and proteomic analysis and in the genetic manipulation of most of the protozoan parasites listed in Table 16.1 (1–94). However, a few have received relatively little attention, as in the case of Besnoitia, for which “omics” analyses and, predictably, gene manipulation methods remain unavailable. Genome size varies widely among this arbitrarily selected group of highly diverse protozoan parasites (Fig. 16.2, Table 16.1), differing by a factor of more than 25 between the smallest (B. microti, 6.4 Mbp) and the largest (Tritrichomonas spp., ~176.4 Mbp). In general, there appears to be an association between the size of an organism’s genome and its dependency on an intracellular parasitic lifestyle (Sundberg and Pulkkinen 2015), with gene reduction being generally more drastic for obligate intracellular organisms that depend almost entirely on their host for survival. Determining the possible associations between lifestyle and genome size, and the evolutionary significance of genome size differences among this diverse collection of protozoan parasites, would be of great interest. However, a better understanding of parasitic lifestyle also requires comprehensive, integrative, and comparative molecular, functional, and metabolic studies, as well as improved knowledge of the parasite-host relationships. Regardless, comparative genomic studies performed among related apicomplexans have so far resulted in a better, albeit somewhat limited, understanding of their biology (Blake 2015; Lv et al. 2015).

Fig. 16.2 Schematic comparison of the genome sizes of selected protozoan parasites

For instance, basic cellular and genomic research performed on Toxoplasma gondii, which is widely considered a “model” apicomplexan parasite (Kim and Weiss 2004), is generally applicable to other related apicomplexans, mainly because several key mechanisms, such as apical subcellular organelle formation and function, apicoplast and mitochondrial function, signaling, gliding motility, intracellular molecule trafficking, and cell invasion, are overall well conserved among most of them (Kim and Weiss 2004; Ngô et al. 2004).

Linear regression analysis of gene density (the number of genes per megabase of genome) against genome size among all the protozoan parasites in Table 16.1, with the exception of Tritrichomonas, gives a significant negative linear correlation, with an r² coefficient of 0.85, suggesting a strong association between these two parameters (Fig. 16.3). This information is consistent with the notion that smaller protozoan genomes are generally more compact than larger genomes, containing more genes per megabase of DNA, likely as a result of having overall similar average gene sizes but fewer repeated or redundant regions and fewer and/or shorter introns and noncoding intergenic regions. Tritrichomonas was not included in these comparisons because it has a highly atypical, large, and vastly repetitive genome. As shown in Fig. 16.2 and Table 16.1, coccidian parasites, such as Toxoplasma and Neospora, have a significantly larger genome size compared to the compact genomes of the piroplasmids (Babesia and Theileria). An exception to this pattern is Cryptosporidium spp., a protozoan parasite with a genome size and gene density comparable to those of piroplasmid parasites. Interestingly, Cryptosporidium parasites lack a plastid, and no genes of plastid origin were identified, suggesting the early loss of the symbiotic apicoplast by these parasites (Abrahamsen et al. 2004; Xu et al. 2004), as well as the loss of the mitochondrial genome. Genome analysis of Cryptosporidium also revealed extremely streamlined metabolic pathways and an absence of many cellular structures and metabolic pathways found in other apicomplexans (Bouzid et al. 2013). Remarkably, Plasmodium parasites have a genome size that is intermediate between these two clades (~23 Mbp). Both piroplasmid and Plasmodium spp.
share the existence of intraerythrocytic stages, but the Plasmodium sporozoites inoculated by the mosquito vectors are only capable of invading liver cells, and thus these parasites, in addition to an intrahepatic stage, have many other significant life cycle differences during the arthropod life stages. Interesting differences among piroplasmids include the unique ability of Babesia parasites to be transmitted via transovarial mechanisms and the ability of Theileria, but not Babesia, sporozoites to invade and transform leucocytes of the mammalian host (Lau 2009). Clearly, each of these parasites faces distinct adaptive challenges, including their different strategies for causing persistent infections, requiring a unique and specific genome composition. Other remarkable differences among coccidian and piroplasmid parasites include the ability of some coccidians, such as Toxoplasma, to invade multiple distinct cell types in their vertebrate hosts and to form cysts. Once more, all of these phenotypic differences may account for the unique requirements in the number and quality of genes that can sustain the distinct parasitic lifestyles, with different levels of complexity, occurring in each species. However, and consistent with the relatively conserved number of genes found for each of these organisms, genome size is also related to the sizes of the noncoding and intergenic areas in their genomes, as reflected in the ratios shown in Fig. 16.3. Again, a special case is the large genome of the closely related Trichomonas and Tritrichomonas parasites. Tritrichomonas are members of the eukaryotic supergroup Excavata, a group that includes both free-living and parasitic organisms. The name of the supergroup is derived from the existence of an “excavated” ventral feeding groove. Tritrichomonas are anaerobic parasites that lack classical mitochondria but instead contain specialized organelles, called hydrogenosomes, which are responsible for anaerobic metabolism.
Consistent with its extracellular living status, Trichomonas vaginalis possesses a large genome, which is composed largely of repeats and transposable elements (Carlton et al. 2007), a feature shared by other related Tritrichomonas parasites of veterinary importance. A draft genome sequence of T. foetus showed that 72% of the open reading frames (ORFs) were similar to those of Trichomonas vaginalis (Benchimol et al. 2017). The remaining 28% of ORFs had no significant matches in any other genome. In both parasites, the superabundance of repeats resulted in a highly fragmented sequence, preventing an investigation of genome architecture. The assembled genome of T. foetus, together with the functional annotation, is available at http://www.labinfo.lncc.br/index.php/tritrichomonas_foetus. Another study, using homology analysis, suggested that massive expansions might have occurred in the T. foetus genome, as was also predicted for Trichomonas vaginalis, while conservation assessment showed that the duplications were acquired after the differentiation of the two species (Oyhenart and Breccia 2014). Upon comparing the two genomes, the authors of that study concluded that gene duplications might be common among these parasitic protozoans (Oyhenart and Breccia 2014). In view of these findings, we included the genome of T. vaginalis, a human pathogen, together with T. foetus in Table 16.1. The high level of similarity between the genomes of Trichomonas and Tritrichomonas might simplify gene functional analysis using one of these organisms as a model.

Fig. 16.3 Representation of the relationship between gene density and genome size for selected protozoan parasites. Gene density (genes/Mbp) was calculated by dividing total gene number by genome size (in Mbp) for each parasite species. The “X” axis is organized in ascending order, according to genome size: 1. Babesia microti, 2. Babesia bovis, 3. Theileria parva, 4. Theileria annulata, 5. Theileria orientalis, 6. Cryptosporidium spp., 7. Theileria equi, 8. Babesia bigemina, 9. Leishmania spp., 10. Trypanosoma spp., 11. Eimeria spp., 12. Neospora caninum, 13. Toxoplasma gondii, 14. Acanthamoeba spp., 15. Sarcocystis spp.
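The calculation behind Fig. 16.3 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis: the function names `gene_density` and `linear_fit` are hypothetical, an ordinary least-squares fit is assumed, and the gene counts and genome sizes must be supplied by the reader from Table 16.1, since they are not reproduced here.

```python
def gene_density(gene_count, genome_size_mbp):
    """Gene density in genes/Mbp: total gene number divided by genome size."""
    if genome_size_mbp <= 0:
        raise ValueError("genome size must be positive")
    return gene_count / genome_size_mbp


def linear_fit(xs, ys):
    """Ordinary least-squares fit of ys on xs.

    Returns (slope, intercept, r_squared). A negative slope with r_squared
    close to 1 indicates a strong negative linear association.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

To approximate the analysis, one would compute `gene_density` for each species except Tritrichomonas and pass the genome sizes and densities to `linear_fit`; the resulting value depends on the exact figures taken from Table 16.1.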

Genomic and genetic manipulation studies performed on the “model apicomplexan” Toxoplasma gondii and Babesia sp. parasites, discussed below, exemplify the potential of these techniques toward improved parasite control. It is expected that the application of these approaches in other still poorly researched protozoa of veterinary importance will enhance our understanding of the biology of these parasites and their relationships with their hosts. It can be predicted that this new knowledge will translate into improved control of important yet neglected diseases with a high public health and economic impact globally. Certainly, more dramatic advances are expected to occur with the massive application of “omics” and vaccinology approaches in the near future.

16.3 Genomic Resources for Protozoan Parasites

Complete genomes of most protozoans of veterinary, medical, and zoonotic importance are currently available (Table 16.1). Furthermore, comparison of genome sequences among apicomplexans and other protozoans is now greatly facilitated by the Eukaryotic Pathogen Database Resource, EuPathDB (http://eupathdb.org/eupathdb/). This database provides access to the full genomes of Babesia spp. and Theileria spp. organized into the PiroplasmaDB (http://piroplasmadb.org/piro/showApplication.do), Acanthamoeba spp. genomes organized in the AmoebaDB (http://amoebadb.org/amoeba/showApplication.do), Cryptosporidium spp. at the CryptoDB (http://cryptodb.org/cryptodb/showApplication.do), and coccidian genomes, including Toxoplasma gondii, Neospora caninum, Sarcocystis neurona, and Eimeria sp., at the ToxoDB (http://toxodb.org/toxo/showApplication.do). The Trichomonas vaginalis genome sequence is at the TrichDB (http://trichdb.org/trichdb/showApplication.do), and Leishmania spp. and Trypanosoma spp. can be found at the TriTrypDB (http://tritrypdb.org/tritrypdb/showApplication.do).

Importantly, the information on the EuPathDB is easily available and comprehensive and not limited to genome sequences. The site also provides easy access to analytical genomic tools such as Blast and available EST, microarray, RNA-seq, and proteomics data for these organisms, among other useful information.

16.4 Genetic Manipulation of Protozoan Parasites of Veterinary and Zoonotic Importance

Genetic manipulation techniques are important tools that allow access to multiple research approaches, including identification of virulence factors, subunit vaccine components, and parasite transmission factors. Importantly, genetically manipulated parasites themselves can potentially be used for vaccine development, since targeted knockout of genes encoding known virulence factors might result in the production of genetically defined attenuated parasites. Another application of interest is the development of novel vaccine delivery platforms by manipulating attenuated parasites to express foreign genes coding for exogenous or stage-specific endogenous protective antigens. Also, genetically manipulated parasites used in vaccines can be easily distinguished from their wild-type counterparts, facilitating the discrimination between vaccinated and naturally infected animals. In addition to classic gene manipulation using transfection or gene editing techniques, RNA interference (RNAi) methods are also tools for gene function characterization (Meissner et al. 2007). However, Trypanosoma spp. and Leishmania spp. parasites, as well as most apicomplexan parasites (including Babesia and Plasmodium), lack the enzymes required for this pathway, and RNAi is so far not generally regarded as a useful method of gene analysis for these parasites. The mechanisms leading to the loss of the RNAi genes in these organisms, with no recognizable traces of their past presence, are unknown, although chromosomal rearrangements may have contributed to their disappearance (Kolev et al. 2011). Interestingly, the genome sequence of T. gondii revealed the existence of Dicer, AGO, and RdRp homologues (Braun et al. 2010) that appear to have plant/fungal (Dicer/RdRp) and metazoan (AGO) signatures. Initial reports thus suggested that T. gondii is the only apicomplexan with a functional RNAi pathway (Kolev et al. 2011).
However, reported experimental results on the activity of this pathway were not reproducible, and the occurrence of this mechanism in this parasite has been called into question, highlighting the need for more research. A recent report also described the use of RNAi techniques to inhibit B. bovis in vitro growth by targeting three distinct genes (AbouLaila et al. 2016), but the possible mechanisms involved remain uncertain given the absence of canonical RNAi genes in B. bovis. In other protozoan parasites, such as Trichomonas vaginalis, the presence of a Dicer-like gene and two Argonaute genes suggests the existence of the RNAi pathway (Carlton et al. 2007). Identification of these components raises the possibility of using RNAi technology to manipulate T. vaginalis gene expression.

The most widely used genetic manipulation methods include classic transfections based on the insertion of exogenous DNA using homologous recombination mechanisms (de Koning-Ward et al. 2000) and, more recently, CRISPR/Cas9 (Lander 2016; Wright et al. 2016) and other gene editing methods based on programmable nucleases, such as TALENs and zinc-finger nucleases (Ma and Liu 2015). Such methods have been extensively used for the genetic manipulation of apicomplexan and other protozoa, as will be described below.

16.4.1 Classic Transfection Methods

In transfection, DNA (or RNA) molecules may be introduced either as extrachromosomal replicating episomes or inserted into chromosomes by homologous recombination. Stably transfected lines can then be used for multiple applications, including the study of gene function by creating parasite lines that either overexpress or lack specific genes of interest. However, a limitation of reverse genetic approaches for functional gene analysis is that essential genes may be impossible to knock out, since this will result in nonviable parasites. This limitation can now be overcome, at least partially, by using inducible promoter strategies, including the use of tetracycline-inducible promoters. By choosing appropriate 5′ and 3′ flanking regions in the transfecting plasmid DNA, endogenous chromosomal genes can be targeted and deleted. Additionally, the method can also be used to create transfected parasites that may function as vaccine delivery systems. The study of genetically transformed parasite lines can provide important clues about gene function during the parasite life cycle.

Perhaps mirroring their importance as human pathogens, Leishmania and other trypanosomatid parasites were first targeted for genetic transformation using transfection methods (Bellofatto and Cross 1989; Cruz and Beverley 1990; Laban and Wirth 1989; Lee and Van der Ploeg 1990; Ten Asbroek et al. 1990). The first report of the genetic modification of an apicomplexan parasite was the description of a transient transfection method for Toxoplasma gondii (Soldati and Boothroyd 1993). This led shortly thereafter to the development of a method for the stable transfection of this organism (Kim et al. 1993). Transient transfection is regarded as a useful approach for finding appropriate electroporation settings and to identify and test the function and efficacy of regulatory elements (promoters and termination signals) mediating gene expression and regulation. Based in part on these findings, transient and stable transfection techniques were later also applied to some species of Plasmodium parasites (Goonewardene et al. 1993; Van Dijk et al. 1995).

Transient transfection methods are useful for characterizing and defining promoters and other regulatory factors and later became essential components of advanced gene engineering and editing techniques, such as those based on CRISPR/Cas9. Briefly, transient transfection techniques (Fig. 16.4) are designed to introduce and express foreign DNA, usually in the form of a plasmid, in a nucleated cell in a non-stable manner. Thus, in transient transfection, the introduced plasmid nucleic acid does not integrate into the genome of the target cells, and the transfected genes will not be replicated. After being developed in T. gondii (Soldati and Boothroyd 1993), transient transfection methods were applied to Babesia bovis (Suarez and McElwain 2008, 2010; Suarez et al. 2004, 2006, 2007), Eimeria mitis (Qin et al. 2014), Sarcocystis neurona (Gaji et al. 2006), Theileria parva (De Goeyse et al. 2015), and T. annulata (Adamson et al. 2001). This approach proved to be useful for the definition of promoters in B. bovis and later in B. bigemina (Silva et al. 2016a, b) and laid the basis for the development of stable transfection systems for T. gondii (Kim et al. 1993), Sarcocystis neurona (Gaji et al. 2006), Acanthamoeba castellanii (Peng et al. 2005), and B. bovis (Suarez and McElwain 2009, 2010). Transient transfection plasmids typically include a reporter gene, or a gene that needs to be expressed transiently (such as required for current CRISPR/Cas9 methods), placed under the control of promoter and termination regions located at the 5′ and 3′ ends, respectively (Fig. 16.4). An appropriate amount of the transient transfection plasmid is then introduced into the target cells using one of several methods. These include methods relying on physical treatments, such as electroporation, nucleofection, biolistic delivery (gene gun), or microinjection, and those relying on chemical entities, such as liposomes (Kepczynski and Róg 2016).

Fig. 16.4 Schematic representation of the principles and elements involved in transient transfection methods. A transfection plasmid (a) containing a reporter gene (red box) under the control of promoter and termination regions is transferred into the nucleus of a target parasite (b) using physical or chemical methods (transfection process). The plasmid DNA transferred into the target cell (in red) is not integrated stably into the genome (c), but it can be processed by the transcription and translation machinery of the cell to generate a product (red dot) (d) that can be quantified (e.g., by measuring luciferase activity)

Physical methods create reversible “holes” in the cell membranes to insert the nucleic acids, whereas chemical methods are based on the use of transfection reagents, sometimes in the form of cationic lipids that allow membrane fusion and intracellular/intranuclear delivery of the foreign DNA into the target cells. More recently, nanoparticles and other polymers have been applied to this end. In general, the transiently transfected plasmids are designed so that they do not integrate into the genome but remain as episomes in the target cell, where the gene of interest is expressed for a limited period of time. However, promoter strength studies and comparisons using transient transfection approaches are relative and limited, and so the data should be analyzed strictly in the context in which these experiments are performed. This is because the approach measures the activity of promoter regions that are cloned in transiently transfected plasmids. Such an experimental design precludes estimating the possible regulatory role and contributions of distantly located or “trans” enhancers, the potential competition for transcription factors between the native and the plasmid-cloned promoters, and the possible contributions to promoter activity that depend on other regulatory elements such as epigenetic factors.

Stable transfection techniques are essentially based on the ability of the parasites to insert genetic material into their DNA using homologous recombination mechanisms, in a fashion allowing expression of the transfected genes. Again, as for transient transfection, these techniques rely either on the use of liposomes, such as Lipofectamine, or on the application of a controlled electrical pulse, as in electroporation or, later, nucleofection. These procedures allow the incorporation of exogenous DNA, usually provided in the form of a circular or linearized plasmid, into the nuclear compartment of a eukaryotic cell. The basic steps involved therefore include (1) identification of a suitable genetic marker to select for transgenic parasites (“selectable marker”); (2) preparation of a transfection plasmid vector containing, at a minimum, a selectable marker gene under the control of a suitable promoter, and 5′ and 3′ regions to target integration of the construct into the genome; (3) a liposome or electroporation/nucleofection protocol that does not greatly compromise the viability of the target cells; and (4) a method for selection of transfected parasites. A schematic representation of a typical stable transfection vector is shown in Fig. 16.5.

Fig. 16.5 Schematic representation of the principles and elements involved in stable transfection methods. A transfection plasmid containing a selectable drug resistance gene (green box) (such as bsd or dhfr, the latter conferring resistance to pyrimethamine) with a gene coding for a fluorescent marker (such as GFP) under the control of promoter and termination regions, in addition to the 5′ and 3′ flanking regions required for homologous recombination, is transferred into the nucleus of a target parasite using physical or chemical methods. The incorporated plasmid DNA (in red) is integrated stably into the genome of the target cell by homologous recombination and processed by the transcription and translation machinery of the cell to generate a product (i.e., GFP-BSD) (green dot). Drug selection is performed to eliminate non-transfected parasites and to obtain a cell line of transfected parasites
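The role of the 5′ and 3′ flanking regions in directing integration can be illustrated with a toy string model in Python. This is a deliberately simplified sketch and not a simulation of the biology: the function name `integrate_by_homology` is hypothetical, sequences are treated as plain strings on one strand, and each homology arm is assumed to match the genome exactly and only once.

```python
def integrate_by_homology(genome, arm5, cassette, arm3):
    """Toy model of targeted integration by double crossover.

    The region lying between the 5' and 3' homology arms in the genome is
    replaced by the cassette (for example, a selectable marker). Returns
    the edited genome, or None when either arm does not occur exactly once
    or the arms are not in 5'-to-3' order, in which case targeting is
    treated as failed.
    """
    genome, arm5, cassette, arm3 = (
        s.upper() for s in (genome, arm5, cassette, arm3)
    )
    # Require unique arms so that the target site is unambiguous.
    if genome.count(arm5) != 1 or genome.count(arm3) != 1:
        return None
    start = genome.find(arm5) + len(arm5)  # first base after the 5' arm
    end = genome.find(arm3)                # first base of the 3' arm
    if end < start:
        return None
    return genome[:start] + cassette + genome[end:]
```

In this model, the choice of arms alone determines which endogenous segment is removed, which mirrors how flanking regions in the plasmid are used to target and delete a chromosomal gene.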

Specific integration of the transfected gene(s) at the intended site in the genome of the target parasite depends upon the operation of homologous recombination mechanisms. However, this requirement may affect the efficiency of the integration process. In fact, the efficiency of the integration process in protozoan parasites is highly variable among species and depends heavily on the DNA repair mechanisms operating in each cell, as found for Toxoplasma parasites. The efficiency of exogenous gene integration can also be affected by the particular DNA base composition of the target cells, as in the case of the A+T-rich genome of Plasmodium parasites. Interestingly, Toxoplasma parasites are also difficult to engineer using classic transfection technologies because they tend to insert the foreign DNA randomly, at sites different from the targeted one. This occurs, at least in part, due to the action of the NHEJ repair mechanisms based on the activity of a gene encoding the KU80 protein. This limitation has been recently addressed by preparing a genetically transformed Toxoplasma gondii line lacking the KU80 gene, which makes it amenable to gene targeting using homologous recombination mechanisms (Huynh and Carruthers 2009). In contrast, transfection work performed in Babesia bovis suggests that this technique is efficient, at least in terms of targeting, when applied to this organism (Suarez et al. 2015). Differences between B. bovis and T. gondii in the repertoires and structure of genes encoding other proteins involved in DNA repair, together with differential regulation of their expression, might also help explain the differences observed between the repair mechanisms operating in these two organisms. In contrast to Plasmodium and Toxoplasma, B. bovis appears to be quite amenable to stable transfection and, consistently, is able to specifically and efficiently integrate foreign genes.
Thus, stable transfection techniques for B. bovis allowed the highly specific knockout (KO) and KO reversion experiments that are needed to study gene function (Asada et al. 2012a, b, 2015; Suarez et al. 2015). Recent progress in Babesia transfection technology includes the demonstration of a method for functional gene analysis by generating a gene KO followed by gene function recovery (Asada et al. 2015) and the demonstration of cross-species promoter function (Silva et al. 2016a, b). This latter study describes the ability of a B. bovis ef-1α promoter to function efficiently in B. bigemina. This observation suggests that common regulatory signals exist that allow the control of promoter function in both parasites.

16.4.2 Gene Editing Using Programmable Nucleases

Targeted genetic editing methods that allow precise modifications in a genome were developed more recently. These methods offer great potential for the manipulation of the genomes of Toxoplasma, Plasmodium, and other protozoan parasites, where transfection methods based solely on homologous recombination typically demonstrate very low efficiency. A key factor dramatically increasing the efficiency of programmable nucleases is their ability to generate blunt double-strand breaks (DSB) in the target DNA of interest. The DSB triggers the intervention of the repair systems of the cell, such as the error-prone nonhomologous end joining (NHEJ) mechanism, which can repair the break without the presence of homologous donor DNA. Alternatively, the breaks can be repaired by homology-directed repair (HDR) mechanisms in the presence of homologous donor double- or single-strand DNA, leading to the insertion of exogenous genetic material. The two mechanisms of DNA repair are exemplified in Fig. 16.6. The existence of these alternative pathways also suggests the possibility of using different gene manipulation strategies. Thus, simple mutations resulting in gene inactivation or disruption can be generated by a double-strand break followed by NHEJ. This repair mechanism can generate either insertions or deletions (indels) in the target gene, resulting in frameshifts that disrupt the continuity of the open reading frame and usually lead to the knockout of the gene. If the objective is the insertion of foreign genes, such as reporter genes, it may then be necessary to add donor plasmid DNA containing the gene to be inserted, flanked by homologous regions to facilitate accurate targeting. In this case, the insertion of the foreign gene will likely be mediated by HDR mechanisms. Importantly, new discoveries on the mechanisms of DNA repair in apicomplexan parasites revealed the participation of certain proteins such as rad51 and KU80.
As discussed earlier, targeted mutation of the KU80 gene resulted in a Toxoplasma gondii mutant line that is more efficient for gene targeting, since it favors the KU80-independent HDR mechanism of repair and prevents the random incorporation of transfected genes that was commonly found in this parasite. This cell line is thus ideally suited for gene function analysis in Toxoplasma using homologous recombination KO approaches (Huynh and Carruthers 2009; Smolarz et al. 2014).
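Why NHEJ-generated indels usually knock out a gene can be shown with a short Python sketch. It rests on one simplifying assumption, labeled here and in the code: an indel shifts the reading frame whenever the net change in length is not a multiple of three. The function names `apply_indel`, `is_frameshift`, and `codons` are hypothetical and introduced only for illustration.

```python
def apply_indel(orf, pos, deleted=0, inserted=""):
    """Return the ORF after removing `deleted` bases at `pos` and
    inserting the string `inserted` at the same position."""
    return orf[:pos] + inserted + orf[pos + deleted:]


def is_frameshift(deleted=0, inserted=""):
    """Assumption: the frame is shifted when the net length change
    (inserted minus deleted bases) is not a multiple of three."""
    return (len(inserted) - deleted) % 3 != 0


def codons(seq):
    """Split a sequence into complete triplets, reading from position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
```

For a toy ORF such as `ATGGCCAAATGA`, deleting a single base after the start codon changes every downstream triplet, whereas deleting three bases removes one codon and leaves the rest of the frame intact. In-frame indels can still inactivate a protein, so the check is a first approximation only.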

The specific design of gene editing experiments depends on the programmable nuclease method of choice. The programmable gene editing methods currently available include the use of engineered proteins such as zinc-finger nucleases (ZFN), transcription activator-like effector nucleases (TALENs), or RNA-guided engineered nucleases (RGEN). Despite the perceived improved target specificity of TALEN methods, the RGEN methods have several advantages over the other two, including their simple design, versatility, and cost. Briefly, ZFN attach cutting domains derived from the prokaryote Flavobacterium okeanokoites to proteins called zinc fingers that can be customized to recognize certain three-base-pair DNA codes. TALENs, on the other hand, fuse the same cutting domains to different proteins called TAL effectors. Both ZFN and TALENs require two cutting domains in order to cleave double-stranded DNA. Excellent reviews on the use of ZFN and TALEN approaches for gene editing are available elsewhere (Ma and Liu 2015).

The most widely used RGEN method is based on the CRISPR/Cas9 system. Deeper coverage of the discovery and function of the CRISPR/Cas9 system is available elsewhere (Lander 2016; Wright et al. 2016). The acronym CRISPR is derived from “clustered regularly interspaced short palindromic repeats,” which, together with the Cas (“CRISPR-associated”) endonucleases such as Cas9, are part of an adaptive immune system of bacteria and archaea against phages (Wright et al. 2016). The system has been divided into three types based on the Cas proteins involved. Only the simpler type II system is used for gene editing; it is essentially based on a single effector protein, Cas9, although other putative effectors can now also be used. The principle of the method is illustrated in Fig. 16.6 and its applications for gene editing in Fig. 16.7.

Fig. 16.6

Principles and elements involved in current gene editing methods. A targeted genome area is specifically cleaved by a nuclease, producing a double-strand break (DSB) that the targeted cell can repair using two different mechanisms: “nonhomologous end joining” (NHEJ) or “homology-directed repair” (HDR). NHEJ introduces mutations, such as insertions/deletions (pink box), that inactivate the target gene; in the presence of a donor sequence with homologous arms, HDR stably incorporates new genetic material (green box) into the targeted genomic locus. L and R: left and right flanking homology arms.

Fig. 16.7

Basic elements involved in gene editing methods based on CRISPR/Cas9. The 20-nucleotide guide RNA (gRNA) (represented in red), complexed with Cas9, is targeted to a specific sequence in the genome next to a PAM (black box) by Watson-Crick complementary base pairing. The complex locks onto the targeted locus, where Cas9 generates a double-strand break (DSB) that can be repaired by the “nonhomologous end joining” (NHEJ) or “homology-directed repair” (HDR) mechanisms of the target cell. As a result, the targeted gene can be mutated, or a new sequence can be stably integrated into the genome of the target cell. Gene-edited cells can later be selected using positive or negative selection procedures.

This bacterial immune system provides RNA-mediated immunity against viruses and plasmids based on copying and specifically cleaving exogenous genetic material. It was soon realized that, if the necessary components are provided to target cells, this system could also be manipulated to edit DNA in virtually any cell. Together, CRISPR and Cas9 are able to target and cut almost any DNA in vivo, and, combined with transfection techniques, they quickly became an important asset as efficient and specific tools for gene editing.

The most commonly used CRISPR/Cas9 systems are adapted from Streptococcus pyogenes. A CRISPR/Cas9 system specifically cleaves a DNA sequence through a two-stage recognition process, as depicted in Fig. 16.6. Initially, as detailed below, a Cas9-sgRNA complex attaches stably to a DNA sequence only if an appropriate, short (often only a few base pairs) protospacer adjacent motif (PAM) is located immediately adjacent to the target sequence (Fig. 16.7). An important advantage of the CRISPR/Cas9 type II system is its simplicity, since only three components are required to achieve site-specific DNA recognition and cleavage: the Cas9 enzyme, a CRISPR RNA (crRNA), and a trans-activating crRNA (tracrRNA), the latter two being required to guide Cas9 to its target sequence (Fig. 16.7). The crRNA and tracrRNA are usually combined into a single synthetic guide RNA (sgRNA). Depending on the experimental design and the gene targeted, the sgRNA is designed to include the specific base sequence that matches the target gene of interest. In that way, the complex can redirect the Cas9 enzyme to almost any preferred sequence. The S. pyogenes Cas9 endonuclease, which should bear nuclear localization signals (NLS), preferentially requires an NGG PAM (with “N” representing any nucleobase, followed by two guanine or “G” nucleobases). However, NAG and NGA PAM motifs can also sometimes be recognized. The 20-nucleotide sequence in the guide RNA then recognizes the complementary DNA target sequence by Watson-Crick base pairing. If a complete target sequence is confirmed, allosteric activation of the two nuclease domains of Cas9, RuvC and HNH, results in cleavage of both strands and, accordingly, a complete double-strand break in the target sequence. Clearly, the specificity of any CRISPR/Cas9 system depends heavily on the proper design of the guide RNA, which can be assisted by algorithms that minimize the likelihood of off-target effects.
In other words, a CRISPR/Cas9 gene editing experiment requires the design of an sgRNA whose 20-nucleotide guide sequence hybridizes specifically with a sequence in the target gene. The sequence encoding the sgRNA, the gene encoding Cas9 (including the NLS), and the donor DNA need to be provided to the target cells for expression in the form of plasmid DNA. Co-expression of these DNAs can be achieved following single-vector or multiple-vector strategies. For example, a single vector can include the genetic information necessary for the co-expression of Cas9, sgRNA, and donor DNA. This can be achieved by transient transfection of a properly engineered plasmid in which each gene is under the transcriptional control of a distinct promoter that is functional in the target parasite. This is now facilitated in B. bovis by the discovery that at least one heterologous B. bigemina promoter is also active in this parasite (Silva et al. 2016a, b). Alternative strategies include the delivery of in vitro transcribed sgRNA, as was the case for T. cruzi (Peng et al. 2015). Gene editing based on CRISPR/Cas9 has been used successfully for genetic analysis of several apicomplexan parasites of veterinary importance (Cui and Yu 2016), including Cryptosporidium parvum (Vinayak et al. 2015) and Toxoplasma gondii (Shen et al. 2014). In contrast to T. gondii, the low level of nonhomologous or random integration of exogenous transfected genes in B. bovis suggests that this parasite uses mainly HDR rather than NHEJ repair mechanisms. In addition, CRISPR/Cas9 approaches have also been used for genetic modification of trypanosomatid parasites such as Trypanosoma cruzi, T. brucei, and Leishmania spp. (Zhang and Matlashewski 2015). The versatility of T. gondii as a model apicomplexan was also exploited to develop a CRISPR/Cas9-based genome-wide genetic screen for T. gondii genes that are essential during infection of human fibroblasts (Sidik et al. 2016). This approach allowed the description of an apicomplexan-conserved invasion factor termed claudin-like apicomplexan microneme protein (CLAMP). This novel approach has the potential to be applied to other apicomplexan parasites.
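
The first step of guide design described above, locating 20-nucleotide target sequences that are immediately followed by an NGG PAM, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not a substitute for dedicated design tools: the function names (reverse_complement, find_protospacers) and the example sequence are hypothetical, only the canonical NGG PAM is considered, and no off-target scoring is attempted.

```python
import re

# Translation table used to complement A/C/G/T bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq: str, spacer_len: int = 20):
    """List candidate target sites followed by an NGG PAM on both strands.

    Each hit is (strand, start, protospacer, pam). For the "-" strand,
    start is counted on the reverse-complemented sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Hypothetical 23-nucleotide example: a 20-nucleotide target followed by AGG.
example = "GATTACAGATTACAGATTACAGG"
for hit in find_protospacers(example):
    print(hit)
```

Each reported protospacer is only a candidate; in practice it would still need to be checked for uniqueness in the parasite genome before being used as an sgRNA guide sequence.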

Trypanosomatid parasites present the additional challenge of possessing a diploid genome; thus, deletion of an entire gene requires at least two distinct selection markers (Lander 2016). The use of these systems greatly accelerates our knowledge of the genetics of these parasites and the development of new vaccines. A website that guides the design of CRISPR tools in protozoan pathogens is a useful, freely available resource (http://grna.ctegd.uga.edu/batch.html). Finally, different strategies for the selection of edited parasites, with or without the use of selectable markers, are also available (Mogollon et al. 2016).

Despite possible off-target cleavage and other potential limitations, gene editing procedures can be used for understanding gene function, generation of mutated attenuated parasites, or as a tool for the development of novel vaccines and therapeutics, thus improving the control of parasites of veterinary interest.

Conclusions

Combining current “omics” and gene manipulation methods can dramatically improve our understanding of the genetics and the biology of protozoan parasites of veterinary and enzootic relevance. Moreover, the rapid pace of progress of biotechnology, “omics,” and other molecular and computer-aided tools will likely result in the constant emergence of novel integrated methods for the interrogation and modification of genomes, leading to a better understanding of parasite lifestyle and, ultimately, to the rational design of improved methods for the control of animal infectious diseases.