
1 Trypanosomatids: Taxonomy, Life Cycle, Genome Organization and Post-transcriptional Control of Gene Expression

Trypanosomatids are protozoans characterized by a single flagellum and distinct cellular morphologies. The best-characterized trypanosomatids are Leishmania and Trypanosoma species, mainly those that are pathogenic in humans and other mammals. The pathogenic species tend to have complex life cycles with many developmental forms that occur successively during the physical transit between different tissues of the invertebrate vector and mammalian host. These species cause diseases of worldwide impact that affect millions of people, mainly in low-income countries, and that are classified as neglected diseases by the World Health Organization. Chagas disease and human African trypanosomiasis are caused respectively by two Trypanosoma species, T. cruzi and T. brucei, while different Leishmania species are responsible for the various forms of leishmaniasis.

Trypanosomatids are classified within the order Kinetoplastida (phylum Euglenozoa, supergroup Excavata), which includes both free-living and pathogenic organisms. The Kinetoplastida are characterized by a specialized single mitochondrion containing an internal array of concatenated mitochondrial DNA molecules forming a structure, visible with the light microscope, called the kinetoplast [1]. The Kinetoplastida probably represent one of the earliest diverged groups of eukaryotes and consequently have had ~1 billion years of separate evolution. Within the Kinetoplastida, the family Trypanosomatidae (the trypanosomatids) contains species that are all symbiotic/parasitic and probably co-evolved with invertebrates. The majority of species retain a single host, but some have evolved to infect a second host. The range of the secondary host species is amazing; some infect mammals, others infect vascular plants and others everything in between. The evolution of the ability to infect a secondary host must have occurred many times [2, 3].

The environment encountered by the trypanosomatid cell can change dramatically, especially when transferring between hosts, and its adaptive response includes changes in the cell morphology, cell surface and catabolic metabolism; all of course downstream of regulated changes in gene expression. For example, Leishmania sp. are intracellular within their mammalian hosts, where they multiply as amastigotes, rounded cells with no protruding flagellum, within the phagolysosomes of macrophages, at conditions of acidic pH at 37 °C [4, 5]. When transferred to a hematophagous sandfly vector, they first differentiate into extracellular, elongated, flagellated forms (promastigotes and others), which proliferate at neutral pH and ambient temperature in the insect gut. These then migrate to the salivary glands where they differentiate once again to metacyclic forms, endowed with a very long flagellum and pre-adapted for survival within the mammalian host [6]. A new cycle is started when the infected insect once again feeds upon a mammal.

The ability to survive through their complex life cycles requires that the trypanosomatids precisely regulate their gene expression in order to adapt to the different extracellular environments. This regulation has to be superimposed on a gene expression mechanism that evolved while the world was a simpler place. A decade ago, the availability of the first three trypanosomatid genomes, the TriTryps [7–10], followed by subsequent genomes from major lineages [11], led to a major boost in the study of genome structure and gene expression [12–14]. The most unexpected aspects of the genome structures were the arrangement of protein coding genes into long tandem arrays and the conservation of synteny within the arrays despite divergence in chromosome number and size. This conservation of gene order is linked to the mechanism of RNA polymerase II transcription and mRNA maturation; tens of protein-coding genes are encoded in tandem arrays and co-transcribed from occasional transcription start sites. No defined promoters for RNA polymerase II-dependent protein coding genes have been characterized so far, but changes in the pattern of histone modifications have been seen to be associated with these regions and might be required for transcription to start/stop [15–17].

Transcription in trypanosomatids is thought to be constitutive as there is little or no evidence for selective use of RNA pol II [18], although the overall rate of transcription initiation is regulated. Whether RNA pol II transcribes different tandem gene arrays at the same rate remains to be tested. There is co-transcriptional processing of the pre-mRNAs to monocistronic mRNAs, through trans-splicing of a capped ~39 nucleotide mini-exon donated from the Spliced Leader (SL) RNA to the 5′ end of each mRNA and a linked cleavage and polyadenylation of the upstream mRNA. Both processes are mediated by the same set of sequence motifs found within the precursor mRNA [19–21]. Prior to trans-splicing, the SL RNA is modified with a ‘cap 4’ structure at its 5′ end consisting of the typical inverted 7-methyl-GTP commonly followed by methylations at various positions on the first four nucleotides of the SL sequence (AACU) [22, 23]. All mature trypanosomatid mRNAs then are characterized by the SL sequence plus cap 4 at their 5′ ends with the typical eukaryotic poly-A tails at their 3′ ends.
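The processing scheme described above can be pictured with a small toy model. The Python sketch below is only an illustration under stated assumptions: the mini-exon is a placeholder in which only the first four nucleotides (AACU) and the approximate length (~39 nt) come from the text, the poly-A tail length is arbitrary, the precursor sequence and the span coordinates are hypothetical, and the cap 4 modifications are not modeled at all.

```python
# Toy model of trans-splicing plus polyadenylation of a polycistronic precursor.
# SL_MINI_EXON is a placeholder: only its first four nucleotides (AACU) and its
# approximate length (~39 nt) are taken from the text; the rest is filled with N.
SL_MINI_EXON = "AACU" + "N" * 35
POLY_A_TAIL = "A" * 20  # arbitrary illustrative tail length (assumption)


def process_precursor(precursor, spans):
    """Return one mature mRNA per (start, end) span of the precursor.

    Each mature mRNA is built as: SL mini-exon + span sequence + poly-A tail.
    Real splice-site and polyadenylation-site selection is not modeled.
    """
    mature = []
    for start, end in spans:
        body = precursor[start:end]
        mature.append(SL_MINI_EXON + body + POLY_A_TAIL)
    return mature


# Hypothetical precursor carrying two genes separated by an intergenic region.
precursor = "GGGCCCAUGAAAUAG" + "UUUUUU" + "CCCAUGGGGUAAGG"
mrnas = process_precursor(precursor, [(0, 15), (21, 35)])
```

In this sketch every mature mRNA shares the same 5′ leader and differs only in its body, which mirrors the statement that all mature trypanosomatid mRNAs carry the SL sequence at their 5′ ends and a poly-A tail at their 3′ ends.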

The constitutive nature of transcription in trypanosomatids, lack of defined promoters and the long polycistronic transcription units indicate an absence of most known mechanisms for selective transcriptional control of gene expression. It follows that most processes associated with the regulation of gene expression are mediated at the post-transcriptional level. Many of these may act at the level of mRNA processing, transport, storage, translation and half-life, while others likely act at the post-translational level, through the regulation of protein modification and degradation, in most cases by yet undefined mechanisms [24–28]. Translation is a key step in quantitative regulation of gene expression in trypanosomatids, and a major role is expected to be played during its initiation stage. Several unique features of the trypanosomatid translation initiation apparatus have been identified and will be discussed in detail below. This review also provides an update on what is known regarding other major elements of the protein synthesis machinery in trypanosomatids, including other translation factors, ribosomal subunits and RNA-binding proteins, always comparing data from the best studied species belonging to the genera Leishmania and Trypanosoma. Here, when discussing individual trypanosomatid polypeptides, all will be named in capital letters following the proposed nomenclature for trypanosomatid proteins [29].

1.1 Eukaryotic Initiation Factors (eIFs) and Translation Initiation

Searches for genes encoding translation initiation factors (eIFs) within the available trypanosomatid genomes reveal a complete, conserved set of initiation factors. Apart from the automatic annotation made with the original sequencing of the first trypanosomatid genomes, however, little progress has been made in characterizing many of their eIFs or in evaluating how far their functions are conserved when compared to other eukaryotes. Nevertheless, substantial progress has been made concerning the characterization of individual polypeptides/complexes. Here, in order to systematically evaluate what is known regarding their translation initiation factors, this review will first discuss those factors that consist of single polypeptides (eIF1, eIF1A, eIF4B, eIF5, eIF5B and eIF6), followed by less well-studied complexes of multiple subunits (eIF2, eIF2B, eIF3) and a more detailed description of the multiple eIF4F complexes, and their subunits, which have been characterized recently in these organisms.

1.1.1 Single Polypeptide Translation Initiation Factors

Most single polypeptide eIFs—eIF1, eIF1A, eIF5, eIF5B and eIF6—have readily identifiable trypanosomatid orthologs, annotated as such within the various genome sequences available at the TriTrypDB database. The single exception is eIF4B, which is more divergent in sequence, as exemplified by a comparison of human, yeast and plant eIF4Bs [30]. Trypanosomatids lack clearly identifiable eIF4B orthologs, and it will not be discussed further. None of the above remaining factors have been specifically targeted for more detailed studies in trypanosomatids, although an eIF6 sequence was early on described from Trypanosoma cruzi after being found encoded within a genomic DNA fragment containing a short interspersed repetitive element (SIRE) [31]. Several of these factors, however, were found to co-purify with other translation initiation complexes studied in these organisms, therefore providing clues as to their functional properties. Leishmania EIF1, for instance, strongly associated with the eIF3 complex in a co-precipitation assay using polyclonal antibodies directed against its EIF3E subunit. In this experiment, no other eIF, or even the EIF3J subunit of eIF3, co-precipitated with the 11-subunit eIF3 complex, highlighting the strict conditions in which the purification was carried out [32]. The strong interaction between the Leishmania EIF1 and the eIF3 complex was independently observed using a slightly different approach where a streptavidin-binding-peptide (SBP)-tagged EIF3E was also used to precipitate eIF3 using streptavidin-Sepharose beads. In this assay the EIF1A and EIF5 orthologs also co-precipitated with the eIF3 complex, as well as the orthologs for the three eIF2 subunits. This pattern of co-precipitation, also seen using a tagged Leishmania EIF3A subunit, was proposed to indicate the presence in the trypanosomatids of the multi-factor complex (MFC), formed by EIF1, EIF1A, EIF5 and both eIF3 and eIF2 complexes, independently of the ribosome [33]. 
Leishmania EIF5 also co-precipitated with two eIF4E homologs (EIF4E1 and EIF4E4), when these two proteins were tagged with the SBP, in an assay that also brought down several eIF3 and all eIF2 subunits, but no EIF1 or EIF1A [34]. EIF6 has also been seen to co-precipitate with eIF3 [33] and EIF5B with EIF4E1/EIF4E4 [34] in the assays described above, but no further functional data have been described associated with them. In an extensive analysis of the parasite’s phosphoproteins [35], the T. brucei EIF6 ortholog was found to be phosphorylated at a single serine residue near to its C-terminus; however, the serine is not conserved in L. major EIF6, and the phosphorylation is distinct from the previously reported eIF6 phosphorylation from other organisms [36]. Overall the analysis of these different eIFs in trypanosomatids highlights their overall conservation within the eukaryotes but emphasizes the need for a better investigation in order to define conserved and divergent aspects of their functions.

1.1.2 eIF2, eIF2B and eIF2α Kinases

Orthologs of all three eIF2 subunits are found in the trypanosomatid genome sequences, but so far only the EIF2α subunit has been investigated. The two other eIF2 subunits (EIF2β and EIF2γ) have yet to be characterized, although both, and EIF2α, co-precipitated with tagged eIF3 and eIF4F subunits [33, 34], as described elsewhere in this review. Orthologs to all five eIF2B subunits are also present in the genome sequences although they have not been characterized either. The overall conservation is much higher for the three regulatory subunits (EIF2Bα, EIF2Bβ and EIF2Bδ) when compared with the catalytic ones (EIF2Bγ and EIF2Bε) [de Melo Neto et al., unpublished observation], which is consistent with only the regulatory subunits having defined archaean homologs [37, 38]. The pattern seen then from both eIF2 and eIF2B subunits is consistent with a conserved role for these factors throughout the eukaryotes, although unique features have been found for those polypeptides studied in more detail. One noteworthy feature has to do with the regulation of eIF2 and eIF2B function by novel phosphorylation events. Not only EIF2α, but also EIF2γ and two regulatory subunits of eIF2B (EIF2Bβ and EIF2Bδ) have been identified as phosphoproteins [35], and, in an independent analysis, EIF2Bβ has also been found to be targeted by tyrosine phosphorylation [39].

A trypanosomatid EIF2α was first identified within the first published genome sequences from T. brucei in a study that had as its main focus the characterization of its EIF2α kinases [40]. The T. brucei EIF2α gene was found to encode an extended protein containing many features conserved in eukaryotic eIF2α sequences but including an N-terminal extension only present in other trypanosomatids. This extension is roughly 110 amino acids long and shows little variation between the Trypanosoma and Leishmania sequences. When ectopically expressed in Saccharomyces cerevisiae cells, the full-length T. brucei sequence could not complement a lack of the endogenous yeast protein; however, a truncated T. brucei EIF2α, missing its N-terminal extension, was functional. Despite an overall conservation in the predicted loop that includes the serine 51 (S51) residue targeted for phosphorylation by eIF2α kinases in other eukaryotes, this residue is replaced by a threonine in different trypanosomatid sequences (T169 in T. brucei EIF2α). This T169 residue, nevertheless, was efficiently phosphorylated by a T. brucei EIF2α kinase although it did not seem to be recognized by other eukaryotic eIF2α kinases [40].

Three potential eIF2α kinases (TbEIF2K1 to TbEIF2K3) have been identified in trypanosomatids, originally from T. brucei. TbEIF2K2 and its orthologs have been the focus of several functional studies in both Leishmania and Trypanosoma species and will be discussed in more detail. In T. brucei, it is a transmembrane glycoprotein located in or near the flagellar pocket that has been shown to specifically phosphorylate yeast eIF2α at S51 and also phosphorylates the T. brucei EIF2α at T169. This kinase is also targeted by phosphorylation, with its phosphorylation pattern changing during the parasite’s life cycle, and it has been suggested that it could be involved in sensing protein or nutrient transport [40]. Its ortholog in T. cruzi has been recently characterized (TcK2) and seen to localize to the endosomal compartment where it binds heme and is inhibited by it [41]. In Leishmania, the TbEIF2K2 ortholog (~44 % identical) has been identified as a PERK kinase homolog, a glycosylated transmembrane protein that localizes to the endoplasmic reticulum, phosphorylates Leishmania EIF2α at T166 (equivalent to the T. brucei’s T169) and also undergoes autophosphorylation. It has been shown that this PERK kinase homolog is differentially targeted by post-translational modifications during Leishmania’s life cycle and that these modifications are also seen after treatment with an agent that induces stress to the endoplasmic reticulum [42]. These modifications are presumed to be phosphorylation, which would likely be associated with the kinase activation, EIF2α phosphorylation and translation inhibition. In T. cruzi, a nutritional stress has also been shown to induce EIF2α phosphorylation [43], highlighting the central role that the EIF2α kinases and EIF2α phosphorylation may have in the regulation of translation in response to different cellular stresses. However, this may not always be so, since in T. brucei the phosphorylation of the EIF2α T169 residue was not found to be associated with a decrease in polysomes seen in cells stressed through heat shock [44]. A second EIF2α kinase, TbEIF2K3 or PK3, has only recently been studied in more detail and also seen to partially localize to the endoplasmic reticulum, but, upon stress induction and phosphorylation, translocates to the nucleus where it seems to phosphorylate factors associated with transcription events [45].

Several different studies have indicated a requirement for a tight control of protein synthesis associated with EIF2α phosphorylation and linked to the differentiation processes seen during different trypanosomatid life cycles. In Leishmania an increase in EIF2α phosphorylation has been seen to correlate with a decrease in overall protein synthesis, which occurs during the differentiation to amastigote forms (which live in their mammalian hosts) [46, 47]. Furthermore, the overexpression of a dominant version of the PERK kinase, which inhibits EIF2α phosphorylation, delays differentiation [42]. In T. cruzi, blocking the attenuation of protein synthesis by the overexpression of a mutant form of EIF2α, which cannot be phosphorylated, abolished the differentiation of the parasite’s epimastigote form (which grows in the insect vector) into infective metacyclic forms [43]. More recently, the study of the T. cruzi TcK2 kinase has indicated a critical role for this protein during the differentiation process, linking it to the metabolism of heme. In the absence of heme, activation of TcK2 leads to EIF2α phosphorylation, translation arrest and differentiation [41].

1.1.3 eIF3

The eIF3 complex in trypanosomatids has only recently been the focus of more detailed studies, although seven eIF3 subunits (A, C, E, G, I, K and L) were initially detected in Leishmania in complexes that were pulled down with an eIF4E homolog, EIF4E3, but not with its partner, EIF4G4 [48]. Recently, 12 Leishmania eIF3 subunits (A through L, with the exception of the M subunit) were identified using an in-depth bioinformatics study with subsequent biochemical validation. This study used the Leishmania EIF3E as a target to immunoprecipitate the whole complex followed by mass spectrometry analysis. The biochemical data indicate a lack of association of the EIF3J subunit to the complex, and an important finding was the strong interaction between EIF1 and the whole eIF3 complex [32]. An independent study combining bioinformatics with affinity purification and mass spectrometry identified the same subunits and additionally demonstrated that the EIF3J subunit may be part of the complex in Leishmania [33].

Comparative functional studies of the eIF3 complex in trypanosomatids have only just begun, and, as expected, all identified eIF3 subunits were found in polysome fractions in T. brucei [49]. In L. amazonensis, pull-down of a tagged eIF4G homolog, EIF4G3, expressed in transgenic cells, resulted in efficient capture of eIF3 subunits, implicating this eIF4G homolog as a possible link to recruit eIF3 and the 43S pre-initiation complex to the mRNAs in these protozoans, in a manner reminiscent of what is seen in other eukaryotes. The same work identified a direct interaction between eIF3 and an eIF4E homolog, EIF4E1, which in turn does not bind any eIF4G or eIF4G-like protein. Through yeast two-hybrid experiments, this interaction was mapped between the C-terminus of the EIF3A subunit and EIF4E1 [33]. The biological function of this interaction between EIF3A/EIF4E1 is not yet clear, but it could serve as a novel mode of eIF3 recruitment to the mRNA, presumably directed to selected populations of mRNAs, and which would bypass the need for any eIF4G function.

Phosphorylation sites for multiple eIF3 subunits have also been identified in T. brucei, with single sites reported for EIF3B, EIF3D, EIF3E, EIF3I and EIF3K, while four sites were found targeting EIF3C [35]. When compared to known phosphorylation events targeting the mammalian eIF3 subunits [50], most were not conserved, with the exception of those targeting EIF3C. Three of the four T. brucei EIF3C sites mapped to its N-terminal end, remarkably conserved between homologs from different eukaryotes and required for its interaction with eIF5 [32]. The equivalent segment from both human and plant eIF3c is also targeted by multiple phosphorylation events [50, 51], and in plants these have been seen to be mediated by the CK2 kinase, which also targets several other initiation factors that are part of the MFC. It has been proposed that the CK2 kinase, through the phosphorylation of the plant eIF3c subunit and other translation factors, may have a role in regulating translation initiation in plants [51], and the conservation in target sites among trypanosomatids, plants and mammals might indicate a mechanism conserved through most eukaryotic lineages.

1.1.4 eIF4F

In trypanosomatids the eIF4F subunits have been the focus of several recent papers that have begun to unravel their role during translation initiation and have highlighted a novel pattern for eIF4F-like complexes. Multiple homologs for the eIF4A, eIF4E and eIF4G subunits have been described, conserved in different Leishmania and Trypanosoma species, which associate into multiple eIF4F-like complexes with yet unresolved roles in translation [23, 52–56]. Here we review what is known about these factors, the complexes in which they participate, protein partners and possible biological roles. Figure 1 summarizes the data available regarding these subunits, which are discussed in more detail below.

Fig. 1

Comparative scheme of the trypanosomatid eIF4E homologs and their respective eIF4F-like complexes, mostly as defined from T. brucei. Group 1, formed by eIF4E homologs that do not interact with eIF4G partners: EIF4E1/EIF4E1-IP [34] and EIF4E2, with so far unknown partner(s). Group 2, formed by eIF4E homologs that form complexes involved in translation: EIF4E4/EIF4G3/EIF4AI/PABP1 and EIF4E3/EIF4G4 (which may or may not also interact with EIF4AI and PABP2) [34, 48, 63]. Group 3, formed by eIF4E homologs that interact with eIF4G partners and also with putative cap-generating proteins: EIF4E5/EIF4G1/Tb117.5/Tb47.5 and/or EIF4E5/EIF4G2/Tb17.9/14-3-3 I/14-3-3 II [56]; EIF4E6/EIF4G5/EIF4G5-IP [75]

1.1.4.1 eIF4A

The first translation initiation factor identified in trypanosomatids was an eIF4A homolog, identified after the screening of a genomic expression library from Leishmania braziliensis for immunologically relevant proteins [57]. Subsequently, the L. major ortholog was cloned and characterized as part of an effort to investigate its role in inducing protective immunity against leishmaniasis [58]. With the focus on studying translation initiation and the eIF4F complex in Leishmania and other trypanosomatids, sequences encoding two eIF4A homologs were first identified through Blast searches using the human eIF4A as query against the L. major genomic sequences. These were first named EIF4A1 and EIF4A2, and conserved orthologs were found from other trypanosomatid sequences. When used to evaluate their evolutionary relationship with other eIF4A sequences in a phylogenetic tree based on their sequence alignment, the trypanosomatid eIF4As were found to cluster together in branches distinct from the human eIF4A homologs. Nevertheless, EIF4A1 was >100-fold more abundant than EIF4A2 and was the only one able to bind to a Leishmania eIF4G homolog (EIF4G3), although it did not bind to human eIF4G [52]. Further functional assays were carried out in yeast where the Leishmania EIF4A1 gene was not able to complement the deletion of the endogenous eIF4A genes, and the overexpressed protein, despite being able to interact with yeast eIF4G, inhibited cell growth [59]. Using T. brucei as a model, the two eIF4A sequences were further characterized, with its EIF4A1 ortholog seen to localize to the cytoplasm while EIF4A2 localized strictly within the nucleus. Metabolic labeling with 35S-methionine after RNA interference induction and growth curve analysis of cells expressing dominant negative versions of the two proteins were then carried out to investigate their function more directly.
The data derived from these experiments conclusively defined EIF4A1 as the eIF4A homolog involved in translation initiation. Further sequence analysis, with the focus on identifying individual amino acid residues that could discriminate between eIF4AI and eIF4AIII orthologs, clearly grouped the second trypanosomatid eIF4A homolog with EIF4AIII orthologs from major eukaryotic lineages and pinpointed several residues that could have functional relevance for both proteins [60]. The identification of an eIF4AIII ortholog in trypanosomatids was an unexpected event considering its known role in mRNA splicing as part of the exon-junction complex [61]. Nevertheless, subsequently, further core components of this complex, such as the conserved Mago protein, were found in trypanosomatids, although no direct interaction was seen between a dimer of the Mago and Y14-like subunits and the EIF4AIII ortholog [62].

1.1.4.2 eIF4E

The initial studies carried out with the purpose of characterizing the trypanosomatid eIF4E homologs were based on homology searches using mammalian or yeast eIF4E sequences as queries against Leishmania genomic sequences. These resulted in the identification of four homologs that were subsequently targeted for functional studies [52, 53]. More recently, two further eIF4E homologs, more divergent in sequence, were identified in T. brucei, with orthologs in other trypanosomatid species [56]. To date then, six trypanosomatid eIF4E homologs (EIF4E1 to E6) have been identified, conserved in both Leishmania and Trypanosoma species and that can be paired in three distinct groups, based on structural and molecular properties and on associated protein partners [52, 56, 63]: Group 1 consists of EIF4E1 and EIF4E2; group 2 is formed by EIF4E3 and EIF4E4; group 3, more recently identified, is formed by EIF4E5 and EIF4E6. None of these trypanosomatid eIF4E groups can be confidently assigned to any of the better known classes of eIF4E homologs found in multicellular organisms [64], suggesting independent duplication events.
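The grouping just described lends itself to a simple lookup table. The following Python sketch is only a convenience illustration: the group memberships are taken from the paragraph above, whereas the names `EIF4E_GROUPS` and `group_of` are hypothetical and introduced here solely for the example.

```python
# Group membership of the six trypanosomatid eIF4E homologs, as listed in the text.
EIF4E_GROUPS = {
    1: ("EIF4E1", "EIF4E2"),
    2: ("EIF4E3", "EIF4E4"),
    3: ("EIF4E5", "EIF4E6"),
}


def group_of(homolog):
    """Return the group number (1, 2 or 3) for a trypanosomatid eIF4E homolog name."""
    name = homolog.upper()
    for group, members in EIF4E_GROUPS.items():
        if name in members:
            return group
    raise KeyError(f"{homolog!r} is not one of the six eIF4E homologs listed")


# Example lookups.
group_of_e4 = group_of("EIF4E4")
group_of_e5 = group_of("eif4e5")
```

A flat table like this records only membership; the structural properties and protein partners that motivate each group are described in the paragraphs that follow.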

Group 1 This group is formed by proteins with sizes similar to human and yeast eIF4Es, which interact with the cap structure but not with eIF4G homologs. EIF4E1, the first eIF4E homolog identified in L. major (LeishIF4E-1) [65], is a 24-kDa protein (26 kDa for T. brucei) with 23 % identity to the human eIF4E-1 homolog and containing seven of the eight conserved tryptophan residues typical of eIF4E family members. These include the three residues involved in cap binding (W56, W102 and W166 positions in human eIF4E-1), with the tryptophan involved in eIF4G interaction (equivalent to W73 in human eIF4E) being replaced by a phenylalanine. EIF4E2 was described from Leishmania as a 31.5-kDa protein (28.3 kDa in T. brucei) possessing all conserved tryptophan residues typical of the eIF4E protein family [52] and sharing 29 % identity with the human eIF4E-1. EIF4E1 and EIF4E2 were identified as cytoplasmic proteins in Leishmania [53], but in T. brucei, when overexpressed as fluorescent fusions, both behaved as nucleo-cytoplasmic polypeptides [63]. Neither of these two eIF4E homologs is an abundant polypeptide, and both are present at levels far below those estimated for their group 2 counterparts (EIF4E3 and EIF4E4; group 3 eIF4Es have not been quantified so far), in both Leishmania and T. brucei [52, 63].
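Percent identity values such as those quoted above are derived from pairwise sequence alignments. The Python sketch below shows one common way of computing the figure, under assumptions that should be stated clearly: the source does not say which alignment method or which denominator was used, so counting only columns without gaps is a choice made here for illustration, and the two short aligned sequences are made up and do not correspond to any real eIF4E.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of a pairwise alignment.

    Assumption: columns containing a gap in either sequence are ignored.
    Other conventions (e.g. dividing by the full alignment length) give
    different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Made-up aligned fragments, for illustration only.
toy_identity = percent_identity("MWKF-DE", "MFKYADE")
```

Because the result depends on the alignment and on the denominator convention, values reported by different studies for the same protein pair are not necessarily directly comparable.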

EIF4E1 from both Leishmania and T. brucei is able to bind to m7GTP-Sepharose, and the same applies for the EIF4E2 from T. brucei. The EIF4E2 from Leishmania differs from its T. brucei ortholog in that it does not bind to m7GTP-Sepharose beads [52, 53, 63, 65, 66]. The difference between the EIF4E2 orthologs may be caused by an insertion between W113–W130 in the Leishmania protein that could hypothetically change the protein folding, making it unable to bind m7GTP-Sepharose [55]. Additional analyses using cap analogs in vitro have shown that the Leishmania EIF4E2 binds preferentially to the methylated cap 4 structure, while EIF4E1 binds well to both the m7GTP cap and cap 4 [53].

Polysome distribution analysis reported for Leishmania EIF4E2 placed it in both monosome and polysome fractions and suggested an association with translating mRNAs. In contrast, the data for EIF4E1 were contradictory: it was first reported as present exclusively in the top fraction of a sucrose gradient, which is not associated with ribosomes, but was subsequently also seen to be associated with polysomes [53, 67]. Most of the early work with the Leishmania eIF4Es was carried out with the insect promastigote form, since it is easily cultured. The cultured vertebrate stage, the intracellular amastigote, is not easily available and does not necessarily reflect the pattern of gene expression present in the true intracellular parasite [68]. Nevertheless, experiments using axenic amastigote cells from L. amazonensis found that EIF4E1 was the single eIF4E with increased expression in amastigotes, and it was proposed that EIF4E1 would be a functional eIF4E in amastigotes [34]. This possibility was reinforced by the recent identification of a direct interaction between EIF4E1 and the EIF3A subunit of eIF3, an interaction that would eliminate the need for an eIF4G intermediate [33]. In addition, a Leishmania EIF4E1-binding protein was identified, later called Leish4E-IP (here 4E1-IP), that has no homology to eIF4G or eIF4E-BP but contains the short conserved eIF4E-binding motif. This protein has been seen to preferentially bind to EIF4E1 only during the promastigote stage, leading to a model where the release of EIF4E1 by 4E1-IP during amastigote differentiation would lead to changes in mRNA translation mediated by the freed EIF4E1 [34]. The evidence from T. brucei, however, does not support this model. In T. brucei, neither EIF4E1 nor EIF4E2 is essential for cell survival in the insect procyclic forms although EIF4E1 is essential for the human bloodstream forms.
Depletion of both together in procyclics led to rapid cell death without any apparent effect on general translation. The fact that neither is able to bind to any of the described eIF4G homologs also makes them unlikely candidates for bona fide eIF4Es acting during translation initiation [63]. More recently, no association between EIF4E1 or EIF4E2 and polysomes could be detected in T. brucei in a proteomic analysis of the polysome constituents from both stages of its life cycle [49]. In fact, in a search for translation regulators from T. brucei, using a tethering assay where selected factors were bound to the 3′ UTR of reporter mRNAs, both EIF4E1 and the 4E-IP ortholog were seen to cause translational repression of the mRNA to which they were bound [69, 70].

Group 2 This group consists of EIF4E3 (29 % identity between L. major EIF4E3 and human eIF4E-1) and EIF4E4 (31 % identity with the human eIF4E). Both proteins are characterized by N-terminal extensions not seen in other eIF4Es from eukaryotes outside the kinetoplastids. In L. major, EIF4E3 is a 38-kDa protein (48 kDa in T. brucei) while EIF4E4 is slightly larger in size, 48 kDa (46 kDa in T. brucei). The group 2 eIF4Es are characterized by substitutions in key residues involved in cap binding: the human W56 is replaced by a phenylalanine or tyrosine in both EIF4E3 and EIF4E4 and the almost universally conserved WED motif, which includes the W102 residue, is replaced by WEH in EIF4E3 (also seen in the trypanosomatids’ EIF4E2 homologs). Several other tryptophan residues generally conserved in eIF4E homologs (three for EIF4E3 and two for EIF4E4) are also modified, although mostly replaced by aromatic residues. Both proteins are cytoplasmic and the most abundant trypanosomatid eIF4E homologs. EIF4E4 binds efficiently to m7GTP-Sepharose beads and to the soluble m7GTP cap or cap 4, while EIF4E3, despite being able to bind to m7GTP in solution, binds poorly to the m7GTP beads or to soluble cap 4 [52, 53, 55, 63].

EIF4E3 and EIF4E4 are constitutively expressed in the insect and vertebrate developmental forms of T. brucei [63], but they have been found to be downregulated in the amastigote, vertebrate stage of L. amazonensis [34]. Both proteins undergo post-translational modifications, specifically multiple phosphorylation events, during cell growth of T. brucei and L. amazonensis. EIF4E4 phosphorylation in both organisms was found to be associated with exponentially growing cells, while phosphorylation of EIF4E3, in L. amazonensis only, was associated with stationary phase cells [71]. L. infantum EIF4E4, which had its phosphorylation characterized in more detail, was found to be constitutively expressed in the two stages of the parasite life cycle, and its phosphorylation was seen to be typical of exponential growth in both life stages. These phosphorylation events did not require binding to eIF4G homologs, ruling out a Mnk-like pattern of phosphorylation, and the target sites were mapped to several serine/proline or threonine/proline motifs localized within its N-terminal extension, in a pattern reminiscent of phosphorylation by CDK or MAP kinases [72]. In T. brucei, a high-throughput analysis of its phosphoproteins identified both EIF4E3 and EIF4E4 as targets for multiple phosphorylation events directed at serine/threonine residues within their N-terminal halves [35]. Both proteins displayed a phosphorylation pattern reminiscent of Leishmania EIF4E4 and in marked contrast to what is observed for Leishmania EIF4E3.

Sucrose density gradient analysis in Leishmania showed that while both group 2 proteins co-sedimented with monosomes and polysomes, most of the EIF4E4 was present at the top of the gradients, away from ribosomes [53]. Later experiments found that EIF4E3 is present only in nuclease-resistant 80S particles and enters stress granules during starvation [48]. A recent analysis in T. brucei, however, has shown that both EIF4E3 and EIF4E4 are present in polysomes and that most of the polysomal mRNAs are bound to EIF4E4, strongly implying a role for EIF4E4 in translation [49]. RNAi experiments showed that EIF4E3 is required for proliferation of both mammalian bloodstream and insect procyclic forms of T. brucei. This contrasts with EIF4E4, which is required in only the bloodstream form. Nevertheless, a double knockdown of EIF4E1 and EIF4E4 does lead to a major inhibition of translation, for reasons not yet understood [63]. Failed attempts to create a double EIF4E4 knockout, together with complementation studies in L. infantum, indicate that this protein may be essential for cell survival in insect stage promastigotes [72].

EIF4E3 and EIF4E4 have the ability to interact with eIF4G homologs forming potential eIF4F complexes: EIF4E3 binds to EIF4G4 and EIF4E4 to EIF4G3, and both complexes can interact with EIF4AI [34, 48, 63]. The conservation in sequence within the putative eIF4G-binding regions implicates the DVECFW motif in T. brucei EIF4E3 (DVESFW in L. major) as being required for the interaction with EIF4G4, while T. brucei EIF4E4 most likely binds to EIF4G3 through the similar ISSFW motif (ILTFW in L. major) [23, 55, 73]. EIF4E4 can also interact directly with PABP1 through its N-terminal extension, a unique eIF4E/PABP interaction that so far has only been described in trypanosomatids [34]. Recently this interaction was further characterized, and it was found to require three small regions, or boxes, mapped to the N-terminal extension of EIF4E4. Those boxes are based on the consensus L/MN/DXXAXXY/FXP (where X can be any amino acid) and are found conserved not only in different EIF4E4 homologs but also in the EIF4E3 sequences (within their N-terminus) from several trypanosomatid species. Mutational analysis revealed that amino acid changes disrupting individual boxes were not enough to prevent the EIF4E4/PABP1 binding, but this interaction was abolished when all three boxes were mutated. The functional relevance of this interaction was evaluated by assaying whether ectopic overexpression of EIF4E4, wild type and mutants, could complement the absence of the endogenous protein. Both wild-type EIF4E4 and a mutant impaired in its ability to bind to EIF4G3 could compensate for the knockout of the endogenous alleles, although the cells grown with the mutant protein could not differentiate into amastigotes. A mutant EIF4E4 impaired in its binding to PABP1, however, could not compensate for the loss of the endogenous protein, highlighting the importance of the novel EIF4E4/PABP1 interaction and confirming that it is more critical for EIF4E4 function than its interaction with EIF4G3 [72].
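The box consensus quoted above, L/MN/DXXAXXY/FXP, can be read as a ten-residue pattern and written as a regular expression. The following is only a minimal sketch of how such a pattern might be scanned for in a protein sequence. The function name find_boxes and the toy sequence are hypothetical and are not taken from the cited work, and a match at the sequence level says nothing by itself about PABP1 binding.

```python
import re

# Consensus L/M N/D X X A X X Y/F X P, where X is any amino acid.
# The lookahead lets overlapping candidate boxes be reported as well.
BOX_PATTERN = re.compile(r"(?=([LM][ND][A-Z]{2}A[A-Z]{2}[YF][A-Z]P))")


def find_boxes(sequence):
    """Return (1-based start position, matched peptide) for each candidate box."""
    sequence = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in BOX_PATTERN.finditer(sequence)]


# Toy sequence invented for illustration only; it is NOT a real EIF4E4 sequence.
toy_sequence = "GGLNSSAQQYTPGGGMDKKARRFGPGG"

for position, box in find_boxes(toy_sequence):
    print(position, box)
```

Applied to a real N-terminal extension, a scan of this kind would only flag candidate boxes; the requirement for all three boxes was established experimentally by mutational analysis [72].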

Most of the evidence generated so far implicates EIF4E4 directly as the most likely eIF4E homolog to be required for translation in trypanosomatids. Nevertheless, the possibility that EIF4E3 is also involved in translation in some manner cannot be ruled out, especially considering the impact on translation of its knockdown in T. brucei [63]. This was reinforced by the data from tethering experiments where both EIF4E3 and EIF4E4 were able to stimulate the translation of a reporter mRNA when tethered to its 3′ UTR [69]. The possibility remains that the two different eIF4F-like complexes, based on EIF4E3 and EIF4E4, could be required for the translation of distinct mRNA subsets, with the EIF4E4-based complex being responsible for the translation of the bulk of the mRNAs. In Leishmania, the presence of EIF4E3 in stress granules led to the proposal that it would be involved in stress granule formation only [48], but mammalian eIF4E is also present in stress granules despite being active in translation [74]. Perhaps the dissociation of EIF4E3 from EIF4G4 under stress conditions [48] is an indication of specific translation repression, but additional data are required to clarify the real role of the EIF4E3/EIF4G4 complex and the degree to which it is conserved between different trypanosomatid species.

Group 3 This group is based on the two recently identified eIF4E homologs, EIF4E5 and EIF4E6, which are both very small, fewer than 200 amino acids, and more closely related to each other than to the other trypanosomatid eIF4Es. When compared with group 1 and group 2 eIF4Es, these are the most divergent, with pair-wise alignments between EIF4E5 or EIF4E6 with the human eIF4E-1 producing identity values equal to or below 25 %. Nevertheless, both contain several critical residues conserved in eIF4Es from different organisms and required for their function. So far, these two proteins have only been properly studied in T. brucei [56, 75].

T. brucei EIF4E5 is a cytoplasmic, 22-kDa protein, having the W56 and W166 residues required for cap binding and with W102 replaced by a tyrosine. It binds to soluble cap and cap 4 structures with affinities similar to those reported for Leishmania EIF4E4. So far, this is the only T. brucei eIF4E homolog with two putative eIF4G partners, EIF4G1 and EIF4G2, although it seems that, at least in the procyclic stage, EIF4G2 is the preferred binding partner. Mass spectrometry analysis confirmed that EIF4E5 forms two independent complexes. When bound to EIF4G1, it also associates with two hypothetical proteins of 117.5 kDa (named Tb117.5) and 47.5 kDa (named Tb47.5), plus one of the two T. brucei homologs of the conserved phosphoserine/phosphothreonine-binding protein 14-3-3 (14-3-3 II). Tb117.5 has some intriguing characteristics since it harbors two cap-generating domains, with possible guanylyltransferase and methyltransferase activities, while Tb47.5 has two RNA-binding domains. The complex based on EIF4E5/EIF4G1 only interacts with its Tb117.5 partner in the absence of 14-3-3 II, suggesting that the latter might act as a regulator of this interaction. A second EIF4E5-based complex, dependent on the interaction with EIF4G2, is associated with yet another hypothetical protein of 17.9 kDa (Tb17.9) and both 14-3-3 homologs (14-3-3 I and 14-3-3 II). Knockdown of EIF4E5 indicated that the protein is not essential for the survival of procyclic T. brucei in culture but caused a motility phenotype, since the cells did not remain in suspension in liquid culture and growth patterns on agarose plates were altered [56].

EIF4E6 is also a small cytoplasmic protein, 21 kDa in size, having a conserved W166 but with both W56 and W102 being replaced by phenylalanine. It also can bind to cap analogs in vitro, but its affinity for soluble m7GTP cap or cap 4 is lower than those observed for EIF4E4 and EIF4E5 and more similar to that observed for EIF4E1. This eIF4E homolog forms a complex with EIF4G5 and a 70.3-kDa hypothetical protein that interacts directly with EIF4G5, named TbG5-IP. Surprisingly, TbG5-IP also possesses two domains usually found in nuclear cap-generating proteins, a nucleoside triphosphate hydrolase and a guanylyltransferase domain. Interestingly, knocking down EIF4E6 leads to a flagellum detachment phenotype and alters growth on agarose plates, but it does not influence overall translation rates, interfere with cell growth or reduce the targeted cell’s ability to remain in suspension [75].

1.1.4.3 eIF4G

Five distinct trypanosomatid eIF4G homologs (named EIF4G1 to EIF4G5) were originally identified through searches using the mammalian HEAT1-MIF4G domain as a query against the L. major genome sequences. All five identified polypeptides shared the central conserved HEAT1-MIF4G domain and, on reciprocal BLAST searches against mammalian, plant and yeast sequences, yielded eIF4G homologs as the most similar proteins. Orthologs to all five proteins were easily identifiable in other trypanosomatid species, but apart from EIF4G3 and EIF4G4, which are clearly related, the homology between these proteins is restricted to the HEAT1-MIF4G domain [52]. In both T. brucei and Leishmania, EIF4G3 and EIF4G4 have subsequently been shown to specifically bind to two different eIF4E homologs, EIF4E3 and EIF4E4, forming distinct eIF4F-like complexes that have both been implicated in protein synthesis and have subsequently been studied in more detail [34, 48, 63, 67, 73]. As described above, EIF4G1, EIF4G2 and EIF4G5 have only recently been shown to form novel eIF4F complexes in T. brucei that have not been directly linked to the translation initiation process [56, 75]. All three proteins were nevertheless seen to stimulate translation of reporter mRNAs when tethered to their 3′ UTR [69, 70], although none were found in T. brucei polysomal fractions [49]. In the phosphoproteomic analysis of T. brucei proteins, both EIF4G1 and EIF4G2 were seen to be phosphorylated at multiple serine (EIF4G1) or serine/threonine (EIF4G2) residues. EIF4G1 was also found to be phosphorylated at a unique tyrosine residue while a single serine phosphorylation site was identified for EIF4G5 [35]. Further work will therefore be required to understand what, if any, roles these proteins and their partners play during initiation of protein synthesis in trypanosomatids and how they might be associated with translation control mechanisms.
Figure 1 summarizes the data available for the trypanosomatid eIF4G homologs within the described eIF4F-like complexes.

Apart from the common HEAT1-MIF4G domain, EIF4G3 and EIF4G4 also share a very short, similar N-terminus and homologous C-termini, including divergent HEAT2-MA3 and HEAT3-W2 domains. The structure of both proteins thus indicates an ancient origin of the eIF4G tripartite structure, which is apparently missing from plant and yeast eIF4G homologs [73]. Both EIF4G3 and EIF4G4 are localized predominantly to the cytoplasm, and both are moderately expressed, with EIF4G3 being roughly three- to tenfold more abundant than EIF4G4. When compared with the previous estimates for the abundance of their corresponding eIF4E partners, EIF4G3 is found in levels similar to EIF4E4, while EIF4G4 is present at levels at least tenfold lower than EIF4E3, although the reasons for the difference in abundance of the latter protein pair is not understood [63, 73]. As seen for their eIF4E partners, EIF4G3 and EIF4G4 are also targeted for phosphorylation in both L. amazonensis and T. brucei, although no differences in phosphorylation were observed between logarithmic growth and stationary phase [71]. In the high-throughput search for T. brucei phosphoproteins, EIF4G4 was indeed found to be targeted by six different phosphorylation events directed at serine residues concentrated at its N- and C-terminal halves. In contrast, EIF4G3 was not identified as a phosphoprotein in this assay [35].

In sucrose gradients in Leishmania, EIF4G3 co-migrates with EIF4E4 and EIF4AI in polysome-containing fractions [67], but EIF4G4 does not [48]. Leishmania EIF4G3 also binds to the initiation complex eIF3 through a direct interaction observed in vitro between the fully assembled eIF3 complex and recombinant EIF4G3 [33]. EIF4G4, however, did not pull down any eIF3 subunits in co-precipitation assays from native extracts, contrasting with its partner EIF4E3, which efficiently pulled down seven eIF3 subunits [48]. In T. brucei, knockdown of EIF4G3 strongly reduced cell growth and global translation very soon after RNAi induction, whereas knockdown of EIF4G4 did not significantly affect protein synthesis, although it did result in cell death over a longer time scale. The conclusion from these experiments is that the two proteins are essential for viability, but only EIF4G3 is a major participant in the initiation of translation. Nevertheless, the observation that knockdown of EIF4G4 induces changes in morphology prior to cell death may indicate a selective role in the translation of specific mRNAs [73]. Indeed, also in T. brucei, the tethering assay confirmed that both EIF4G3 and EIF4G4 and their eIF4E partners are able to stimulate the translation of mRNAs to which they are tethered [69, 70], and the two protein pairs are found in polysomes [49].

The similarities in sequence and structure observed between EIF4G3 and EIF4G4 and also between their eIF4E-binding partners, EIF4E3 and EIF4E4, are indicative of gene duplication events preceding the split of the Trypanosoma and Leishmania lineages. A single original complex would then have evolved into the two distinct complexes seen today, with acquisition of new functions for at least one of these [73]. As discussed above, the evidence from both Leishmania and T. brucei definitely implicates the EIF4G3/EIF4E4 complex in translation initiation, and the data from T. brucei are also consistent with a similar role for the second complex, EIF4G4/EIF4E3 [33, 49, 67, 70, 73]. Nevertheless, the data from Leishmania are not consistent with a clear role for EIF4G4/EIF4E3 in translation, and the data from the two subunits are not entirely in agreement with both functioning as a pair, since EIF4E3, but not its partner EIF4G4, is found in polysomes and pulls down eIF3 subunits [48]. Likewise, in T. brucei, RNAi-mediated depletion clearly implicates EIF4E3, but not its partner EIF4G4, in translation [63, 73]. A clear difference between the two eIF4F complexes has to do with the requirements for eIF4A binding. EIF4G3 interacts strongly with EIF4AI in vitro, requiring only its HEAT1-MIF4G domain, and overexpression of an EIF4G3 mutant that is impaired in its interaction with EIF4AI inhibits cell growth, an indication of a dominant negative phenotype. In contrast, the binding between EIF4G4 and EIF4AI seems much less efficient and requires the full-length EIF4G4, and overexpression of an EIF4G4 mutant that does not bind EIF4AI does not impact on cell growth [73]. Both EIF4G3 and EIF4E4 have also been seen to interact directly with one of the Leishmania PABP homologs, PABP1, and these interactions seem to be specific and required for proper function of all three proteins during translation initiation [34, 72, 73, 76].
So far, an interaction between the EIF4G4/EIF4E3 complex and PABP homologs has not been definitively proven, and, in vitro at least, no clear interaction between EIF4G4 and any of the Leishmania PABP homologs has been detected [73]. Leishmania PABP2 specifically co-precipitated with EIF4G4 in assays using whole parasite extract, but it was not found in similar co-precipitation assays using EIF4E3 as bait [48], so the possibility exists of a yet undemonstrated specific interaction between PABP2 and the second eIF4F-like complex based on EIF4G4/EIF4E3.

One aspect of the study involving the two complexes based on EIF4G3/EIF4E4 and EIF4G4/EIF4E3 deals with the motifs in the two eIF4Gs responsible for these interactions and that reside in the short N-terminal segments from both proteins. It has been proposed that an eIF4E-binding motif is present in the N-terminus of EIF4G3 between positions 20 and 26 (YPGFSLD) in Leishmania. Mutations in the tyrosine, leucine and phenylalanine residues within this motif completely abrogate binding to EIF4E4 [67]. A subsequent work has suggested that there is not a consensus motif in Leishmania EIF4G4 for binding to EIF4E3, but instead this interaction is mainly based on the secondary structure of the EIF4G4 binding peptide, and preventing it by replacing the L26 residue with a proline would eliminate the interaction [48]. However, more recently, it has been proposed that the EIF4G3 and EIF4G4 proteins do share a common eIF4E-binding motif (F/MXXXXIL/R—Leishmania consensus) that should include residues in equivalent positions in both eIF4Gs (I8 and R9 in EIF4G3/I25 and L26 in EIF4G4). In agreement with this hypothesis, overexpression of T. brucei EIF4G3 and EIF4G4 variants having mutations targeting the equivalent residues effectively abolished the interaction with their eIF4E partners. Furthermore, overexpression of these EIF4G3 and EIF4G4 mutants induced minor but reproducible reductions in the cell proliferation rate [73]. Surprisingly, as previously stated, loss of the direct EIF4E4/EIF4G3 interaction does not affect the survival of Leishmania promastigotes, possibly a consequence of the compensatory interactions between EIF4E4/PABP1 and PABP1/EIF4G3, which under some circumstances might bypass the need for the direct EIF4E4/EIF4G3 interaction [72].

1.2 Other Translation Factors

This section covers the current knowledge of individual proteins participating in other steps of the translation process in trypanosomatids. Classical elongation, termination and release factors that have been studied in trypanosomatids are discussed, as well as eIF5A, which has been studied in some detail. Poly-A binding proteins (PABPs) are also discussed, since they have multiple roles not only during translation initiation but also during other stages of mRNA translation, processing and degradation.

1.2.1 Elongation Factors

1.2.1.1 EF-1 and EF-2

Phylogenetic analysis of the distribution of EF-1A and its paralog EF-Like (EFL) in the Euglenozoa, and in other eukaryotes, showed that trypanosomatids have only a single gene encoding EF-1A [77, 78]. This factor is essential for T. brucei cell viability since its silencing through RNA interference led to an almost instant cessation of growth, eventually causing cell death [79]. EF-1A depletion also caused a lethal phenotype in a high-throughput experiment that evaluated loss of fitness following RNAi in procyclic and bloodstream forms and also during differentiation [80], consistent with an essential function during protein synthesis.

Comparison of structural models shows that EF-1A orthologs from Leishmania, yeast and mammals resemble each other closely. Despite its conservation in function and general structure, the EF-1A in trypanosomatids has several distinct differences when compared to its mammalian counterpart. A major difference is the deletion of 12 amino acids from the Leishmania protein that is associated with a hairpin loop found in mammalian EF-1A [81]. Absence of this hairpin loop has been found to expose part of the main body of EF-1A. The function of this particular region is not well defined, but when it was ‘blocked’ by a specific antibody, which did not bind mammalian EF-1A because of the presence of the hairpin loop, protein synthesis was inhibited in vitro in a Leishmania cell-free translation system. A similar inhibitory effect was observed by a peptide based on the exposed region of the Leishmania protein [82]. Indeed this region has been evaluated as a target for small molecules that bind specifically to the Leishmania EF-1A and that inhibit specifically the Leishmania translation [83].

Similarly to mammalian and plant EF-1As, which are modified by two ethanolamine phosphoglycerol (EPG) moieties attached to conserved glutamic acid residues, T. brucei EF-1A is also modified accordingly at a single EPG modification site, residue E362. Although a second potential EPG modification site, E289, equivalent to the one found in mammals and plants, is found in the T. brucei protein, it does not seem to be targeted by this modification. Despite the fact that the EPG modification is exclusive to EF-1A, and found in many species, it is not necessary for cell growth in T. brucei, and its function remains unknown [84, 85].

Despite their importance in translation, recent data from a number of laboratories have supported the idea of secondary functions and biological significance for the EF1 subunits [86]. Notably, many studies have linked the EF-1A to immunological responses during pathogenesis in Leishmania and it should also be noted that two previously unrecognized putative immunoreceptors have been revealed through its sequence analysis. In a similar context, EF-1B subunits in trypanosomatids were identified as part of a trypanothione S-transferase (TST) complex, with the active site localized to the γ subunit, with the authors suggesting that it may have a role in translational response to oxidative and xenobiotic stress [87, 88].

In T. cruzi, EF-2 is encoded by two identical gene copies, and the protein has 60 % identity to S. cerevisiae EF-2. It is highly conserved in all trypanosomatids, sharing over 96 % homology with EF-2 of Leishmania species and over 98 % with other trypanosomes. Comparative analysis of secondary structures shows an overall conserved architecture, displaying the four canonical alpha helices in the GTPase domain as well as a motif involved in nucleotide binding. However, all Trypanosoma EF-2 proteins seem to lack regulation by phosphorylation, since they do not have the targeted threonine residue that mediates its binding to the ribosome [89].

Both EF-1 subunits (1A and 1B) and EF-2 were detected in pull-down assays targeting the cap-binding protein EIF4E4, as well as the polypeptides EIF4G3 and eIF3, all currently suggested as functional translation initiation factors that act during the initial steps of protein synthesis in trypanosomatids [33, 34]. As expected, they were also found in a proteomic analysis of the polysome contents of T. brucei [49].

1.2.1.2 eIF5A

In trypanosomatids, eIF5A homologs have been described from Trypanosoma cruzi, Leishmania donovani and T. brucei [90–92]. As expected, the Leishmania EIF5A was cytoplasmic and displayed high sequence identity with T. cruzi (76 %) and T. brucei (77 %) orthologs and less identity to the human ortholog (45 % identity) [91]. Trypanosoma EIF5A was also found to localize to the cytoplasm and to be modified by hypusination as in other eukaryotes [90, 92]. Indeed, the two enzymes that catalyze hypusination, deoxyhypusine synthase and deoxyhypusine hydroxylase, have been found and characterized from Leishmania [93, 94]. Other post-translational modifications targeting eIF5A in trypanosomatids were found through mass spectrometry analysis. T. cruzi EIF5A was phosphorylated at a single conserved serine residue, S2, but a novel tyrosine phosphorylation event, at Y21, has also been identified. Three methylation events were also found, mapped to residues E73, E99 and E116 [90]. Phosphorylation at S2 was also identified for T. brucei EIF5A [35, 95].

In exponentially growing T. cruzi cells, EIF5A was phosphorylated and partially found to be associated with polysomes. In stationary phase cells, EIF5A was dephosphorylated, and a greater fraction was present in polysomes. Overexpression of a phosphomimetic mutant S2D of EIF5A increased cell proliferation and protein synthesis but seems to be toxic to stationary phase cells. Overall EIF5A seems to cycle between phosphorylated and dephosphorylated forms, which might be required to regulate translation in response to growth conditions [90]. Complementary work has shown that EIF5A is essential for cell growth in T. brucei; RNAi-mediated knockdown causes growth arrest and also leads to morphological abnormalities including cell rounding and detached flagella. The RNAi-induced growth defect was complemented by expression of wild-type human eIF5A, showing that the trypanosome and human proteins are functionally homologous, but not by a K52 mutant (K50 in human eIF5A) that blocks modification by deoxyhypusine, confirming this modification to be an essential requirement for cell viability. The expression levels of two representative polyprolyl proteins involved in the actin cytoskeleton were also analyzed after T. brucei EIF5A knockdown, and both were shown to be reduced [92]. The selective reduction of the polyprolyl-containing protein suggests a preferential requirement for EIF5A for the translation of proteins containing consecutive proline tracts, as demonstrated recently in other eukaryotes [96].

1.2.2 Termination/Release Factors

Trypanosomatid orthologs of eRF1 and eRF3 are present and annotated as putative proteins within the various genome sequences available at the TriTrypDB database [10]. Due to the high degree of sequence conservation in release factors, the T. brucei ERF1 and ERF3 sequences have been included in phylogenetic studies that helped clarify the relationships between eukaryotic groups and improve the understanding of the translation process [97–99]. As for some other factors with major roles in translation, RNAi knockdown of ERF1 and ERF3 resulted in cessation of proliferation in both procyclic and bloodstream forms of T. brucei, and also during differentiation, in an experiment that evaluated loss of fitness following RNAi [80].

1.2.3 Poly-A-Binding Proteins (PABPs)

The first trypanosomatid PABP homolog, from T. cruzi, was identified through early studies in the 1990s, which started with the biochemical characterization of the native protein after its purification through affinity chromatography using poly-A Sepharose. This was followed by the screening of DNA libraries with polyclonal serum directed against the purified protein, which led to the identification of a single PABP homolog encoded by two gene copies [100]. The T. brucei ortholog was subsequently identified [101, 102] followed by the identification of a more divergent Leishmania PABP homolog [103]. With the completion of the genome sequences, the differences in PABP genes were clarified, with three distinct PABP homologs identified first in L. major, and subsequently in other Leishmania species, and two in T. brucei and T. cruzi, orthologous to Leishmania PABP1 and PABP2. In Leishmania, the three PABPs are simultaneously expressed as abundant proteins at least during the promastigote insect stage of its life cycle [76].

PABP1, the first Leishmania PABP homolog identified, is a phosphoprotein that binds specifically to poly-A and localizes to the cytoplasm. It interacts in vivo and in vitro with EIF4G3, the Leishmania eIF4G homolog most directly implicated in translation initiation. The second PABP homolog, PABP2, is the ortholog to the one originally identified in T. cruzi and T. brucei. Its specificity in binding to poly-A seems to be reduced, and it contains several polymorphisms in residues almost universally conserved in PABP sequences and previously implicated in poly-A recognition. PABP3, absent from T. brucei and T. cruzi, co-precipitates with PABP2 in a mRNA independent manner, and it seems likely that both bind to the same population of mRNAs and may interact with each other [76].

There is evidence that the PABP orthologs have discrete functions: PABP1 binds to a distinct set of mRNAs and also differs from PABP2 and PABP3 in trafficking between the cytoplasm and nucleus, since both PABP2 and PABP3, but not PABP1, migrate to the nucleus upon inhibition of transcription by actinomycin D [76]. A subsequent phylogenetic analysis of PABP sequences from different trypanosomatids and related kinetoplastids revealed that the lack of PABP3 from the Trypanosoma genus is a secondary loss that occurred after its divergence from the remaining trypanosomatid lineages [104], in agreement with PABP2 and PABP3 in Leishmania having related and perhaps redundant functions.

In T. brucei, the two PABP homologs were seen to stimulate translation of a reporter mRNA when tethered to their 3′ UTR, compatible with both proteins having relevant roles during translation initiation [69]. Nevertheless, both proteins are individually essential for viability and seem to have non-redundant roles [76]. In T. cruzi, both PABP homologs were shown to migrate to cytoplasmic mRNP granules [105], an observation also confirmed in T. brucei where the two proteins were found to localize to different sets of granules in response to different stress conditions or inhibition of transcription/trans-splicing, implying again a differential association with distinct mRNA populations [104, 106]. So far, however, not much has been described regarding specific mRNA targets differentially bound by these two PABP homologs, with the exception of an early report from Crithidia fasciculata, a related trypanosomatid, where PABP2 was found in a complex that binds to mRNAs with cell cycle-dependent regulation [107].

The evidence regarding the association of the different PABP homologs with the eIF4F complexes in Leishmania is still controversial and needs to be resolved. As discussed above, a clear interaction between PABP1 and EIF4E4 has been identified through different means [34, 72], and PABP1, but not PABP2 or PABP3, was seen to co-precipitate with native EIF4G3 [76]. Other co-precipitation assays targeting EIF4E4 brought down both PABP1 and PABP2, but in these assays EIF4E3 and also EIF4G4 were found in the precipitated fractions [34]. Subsequently, PABP2 was found to co-precipitate with EIF4G4 but not with EIF4E3 [48]. The data available so far then favor an exclusive association of PABP1 with the EIF4E4/EIF4G3 complex; an association that seems to be conserved in T. brucei, since co-precipitation experiments have also seen that PABP1 co-migrates with the complex formed by EIF4E4/EIF4G3 [104, 106]. In the same study, PABP2 co-migrated with EIF4E1 so it is still not clear whether a similar association exists between PABP2 and the second eIF4F complex, formed by EIF4E3/EIF4G4. No PABP homologs co-precipitated with the recently described eIF4F-like complexes based on EIF4E5 and EIF4E6, which indeed do seem to be functionally distinct [56, 75].

A number of RNA-binding proteins from both T. brucei and T. cruzi have been seen to associate with either of the PABP homologs, and their characterizations have led to relevant data that can be useful for the understanding of the functional distinctions between these proteins. In T. cruzi, PABP2 has been found to associate specifically with proteins named as UBPs, which bind to U-rich sequences found within the 3′ UTRs of trypanosomatid mRNAs and are involved in mRNA destabilization events [108]. In T. brucei, both PABPs were also found to co-precipitate with four distinct Alba-domain proteins, two of which are associated with polysomes and might be required for the stage-specific translation of selected mRNAs [109]. Also in T. brucei, the two PABP homologs have been seen to interact with the T. brucei ortholog of PBP, a non-essential protein in yeast that interacts with PABP and co-sediments with polysomes. In T. brucei the PBP ortholog seems to bridge an interaction between the PABPs and ZC3H11, a zinc finger RNA-binding protein that binds selectively to AU-rich elements in mRNAs and stimulates translation in tethering assays. A model has been proposed, according to which these interactions would allow the selective translation of mRNAs bound by ZC3H11 [110].

1.3 Ribosomes and Ribosomal Proteins

A remarkable characteristic of trypanosomatid ribosomes, first reported in the 1980s, is the set of processing events targeting the rRNAs. These lead to the generation of a very long 18S rRNA, the presence of only one form of 5.8S rRNA and the fragmentation of the 25/28S rRNA into six mature transcripts [111–114]. More recently, with the solving of the trypanosomatid ribosome structure and the characterization of its protein content, much more has come to be understood regarding similarities and differences in ribosome structure and function in comparison with other eukaryotes, as described below.

1.3.1 Ribosome Structure

In 2005, the structure of the Trypanosoma cruzi 80S ribosome was resolved by cryo-electron microscopy (cryo-EM). The general structures of the 40S and 60S subunits display evolutionarily conserved features seen in other eukaryotes, and the major characteristics of a typical ribosome and the phylogenetically conserved eukaryotic rRNA core structure are also maintained. However, in comparison to other species there are some distinctive differences in both the small and large subunits. Indeed, the density map of the T. cruzi 80S ribosome includes unusual structural components, mostly related to large expanded segments in the rRNA molecules [115]. Subsequently, the structure of the T. brucei ribosome was solved, also through cryo-EM but at a higher resolution. As noted for T. cruzi, the main characteristic of the ribosome is the unusual size of several expanded segments (ES) in the rRNAs. Some of these segments, found in both ribosomal subunits, seem to be several times larger than in other ribosomes of known structure, and the 60S subunit also includes an additional domain, called the kinetoplastid-specific domain (KSD) [116]. The functions of the large expanded segments of rRNA remain unknown, although the arrangement and localization of the largest expansions within the 18S rRNA (ES6/ES7) near the mRNA channel suggest that they may be important for translation control. Indeed, part of the ES6/ES7 in the 40S subunit makes up a structure first reported in T. cruzi and named the ‘turret’. This structure was considered the longest helical structure ever observed in a ribosome, and it exists only in trypanosomatids, with high conservation in sequence and size. The extension of a lower part of the turret also forms a bridge with the 60S subunit, being responsible for a unique type of connection between the small and large subunits.
A comparison of the 60S subunit structure with that of other eukaryotes revealed that the trypanosomatid subunit does not have typical eukaryotic features, such as a planar surface near the exit site of the polypeptide; instead, it has a shape more similar to the one seen in bacteria [115, 116]. Based on the differences cited above, mainly the turret structure, and on the cap 4 structure at the 5′ end of trypanosomatid mRNAs, a role in translation initiation has been proposed for the turret. Such a role might explain the low affinity of the trypanosomatid eIF4Es for cap 4, when compared to the affinity of mouse eIF4E for the monomethylated cap [115].

1.3.2 Ribosomal Proteins

Sequence analysis of ribosomal proteins showed that almost all yeast ribosomal proteins have counterparts in T. cruzi and T. brucei, with the exception of L41 and S31 [115, 116]. In regard to individual ribosomal proteins, the P complex present in the stalk of the ribosome, involved in the translocation step of protein synthesis, was described quite early on from T. cruzi. Four components were identified prior to the availability of the genome sequence (P0, P1, P2α and P2β). P0 was found to have a divergent C terminus, while P1, P2α and P2β have more typical C-terminal ends [117]. Later, a novel putative P complex component, named P1β, was identified within the T. cruzi genome. Further sequence analysis revealed that T. brucei and Leishmania also contain a similar set of five P proteins, suggesting that the composition of the different P protein subtypes is a feature found in their common ancestor. Yeast two-hybrid and surface plasmon resonance analysis revealed a unique protein-protein interaction pattern for this complex, when compared to other eukaryotes, as P0 interacts with each of the four P1/P2 proteins [89, 118]. More recently, the P proteins have been investigated as potential drug targets, with the identification of a single chain recombinant antibody directed against the C-terminal end of the T. cruzi P2β, which specifically inhibits the translation of various trypanosomatid species [119].

In an expanded data-mining search for trypanosomatid ribosomal proteins, using S. cerevisiae sequences as queries, several putative ribosomal proteins were identified within the T. cruzi genome database. However, the average amino acid identity was very low (approximately 50 %), contrasting with the high degree of conservation of the 80S ribosome during evolution. T. cruzi ribosomal proteins are, on average, longer than their S. cerevisiae counterparts, with the extra regions usually at their N- or C-terminal ends. Thirty-two genes encoding proteins with significant sequence identity to polypeptides found within the S. cerevisiae 40S subunit were found, where 29 of those were also identified after mass spectrometry of the T. cruzi ribosome. For the 60S subunit, genes encoding putative orthologs for all 48 yeast proteins were found, with the mass spectrometry confirming the presence of most proteins within the native ribosome, with the exceptions of L1, L35, L39, L40, P1 and P2 [120].

One component of the large ribosomal subunit characterized in more detail is the L5 protein, which associates tightly with the 5S rRNA, stabilizing it and protecting it from degradation by nucleases. In trypanosomes, in addition to L5, two novel 5S rRNA-binding proteins were found, named P34 and P37 [121]. While it is reported that 90 % of mammalian 5S rRNA is bound by L5, only about 25 % of the trypanosomatid 5S rRNA is associated with its L5 ortholog. The amino acid sequence of the trypanosomatid L5 differs from the eukaryotic consensus at potentially significant positions; however, L5 is still essential for cell viability in T. brucei. Considering their characteristics, the trypanosome-specific P34 and P37 probably compensate for the weaker binding of L5 to the 5S rRNA, and it has been suggested that the trimolecular complex is necessary for adequate 5S rRNA stability in trypanosomatids [122]. The scaffold protein RACK1, which is a constituent of all eukaryotic ribosomes, was also identified in its conserved binding site within the small subunit of T. brucei ribosomes. For some unresolved reason, RACK1 was missing from the T. cruzi ribosome cryo-EM analysis, but data mining indicates the presence of one RACK1 ortholog within the T. cruzi genome database [115, 116].

1.4 tRNAs and tRNA Synthetases

The total number of tRNA genes in the TriTryps was found early on to be low when compared to most eukaryotes, which usually have hundreds of tRNA genes [123]. Subsequent bioinformatic analysis confirmed 83 tRNA genes annotated in the L. major genome and indicated 66 genes in T. brucei. The distribution of these genes does not seem to be random, as they are normally organized into clusters that are mostly confined to a subset of chromosomes [124]. Sixteen anticodons were not found in the tRNA genes of trypanosomatids, although their corresponding codons are present in many mRNAs of these organisms [125]. It is well known that some relaxed base pairing occurs at the third position of the codon (wobble), and this seems to be the case in trypanosomatids, since there are 46 isoacceptor types, which are able to read the 61 canonical codons [124].

Genes encoding tRNA-Sec, which reads UGA as a selenocysteine codon in some specific mRNAs, were also identified in all three TriTryp species, but with variations in the number of copies [124]. The presence of selenoproteins is reported in trypanosomatids [126–128], although they do not seem to be essential for parasite viability [129, 130]. Unlike other tRNAs, which are transcribed by RNA pol III, the tRNA-Sec genes of L. major and T. brucei are reported to be transcribed only by RNA pol II or by both RNA pol II and III [131, 132].

Few aminoacyl-tRNA synthetases (aaRSs) have been individually characterized in trypanosomatids. Twenty-four nuclear genes have been identified in the genome of T. brucei, all highly conserved in T. cruzi, which encode enzymes able to charge tRNAs with all 20 amino acids [133]. Besides the 24 conserved genes, Leishmania spp. have an extra copy for the AsnRS [134]. Unlike most eukaryotes, which usually encode two genes for each of the 20 standard aminoacyl-tRNA synthetases, trypanosomatids encode only one copy of each gene, except for AspRS, TrpRS and LysRS, which have two copies each, and PheRS, which is formed by two subunits (α and β). Amino acid sequence comparison shows good identity with other eukaryotic aminoacyl-tRNA synthetases, and the enzymes identified play roles in either cytoplasmic or mitochondrial translation, with some of them working in both compartments. The dual localization is the result of several different processes, such as alternative trans-splicing, alternative start codons or protein modification and/or translocation to the mitochondria. All gene products are essential for cell growth in T. brucei procyclic and bloodstream forms. As in mammals and yeast, several T. brucei aminoacyl-tRNA synthetases and other related proteins seem to be associated in a multiprotein complex. This complex, called MARS (Multiple Aminoacyl-tRNA Synthetase complex), is responsible for enhancing the efficiency and accuracy of tRNA aminoacylation and improving general translation, by preventing the diffusion of charged tRNAs and facilitating tRNA reutilization [133, 135].

1.5 Translation in Trypanosomatid Mitochondria

The trypanosomatid mitochondrial genome encodes polypeptides similar to those of other eukaryotes, as well as two rRNAs. As in other organisms, mitochondrial translation is therefore required for the synthesis of several components of the electron transport chain, but the process by which some of these components are synthesized is distinct from the mammalian system. First, transcripts are post-transcriptionally modified through RNA editing, an extensive and precise insertion and deletion of uridines, usually within the coding regions, which corrects frameshifts and produces translatable mRNAs [136, 137]. Second, major differences in the mitochondrial translation apparatus have also been identified, which are described below.

1.5.1 Mitochondrial tRNAs and tRNA Synthetases

In contrast to yeast and mammals, the mitochondrial genome of trypanosomes is devoid of tRNA genes, and so all the tRNAs needed for translation in the organelle are imported in small amounts from the cytosol through a unique import machinery [138, 139]. This import is dependent on the binding of the tRNAs by EF1A in the cytosol [140], and tRNA and protein import share elements in common [141]. Upon reaching the mitochondria, most of these tRNAs are aminoacylated by the same synthetases found in the cytosol. One exception is the tRNAAsp, which is targeted by distinct synthetases in the two compartments [142].

As in all eukaryotes, the mitochondrial translation in trypanosomatids is of the bacterial type although it only uses eukaryotic-type tRNAs. This has resulted in some adaptations to integrate the imported eukaryotic-type tRNAs [143]. For instance, in mitochondria the stop codon UGA has been reassigned to tryptophan. Trypanosomatids have adapted to this through a mitochondria-specific RNA-editing event that converts the CCA anticodon of the imported tRNATrp to UCA and by evolving a highly divergent eukaryotic-type TrpRS that is specific for mitochondria [144, 145]. Generation of the formylated initiator methionine is required for mitochondrial initiation of translation, but in trypanosomatids the single initiator tRNAMet remains in the cytosol and is required solely for cytosolic translation. Within the mitochondria only the elongator tRNAMet is found, but a fraction of it is targeted by the formylase activity to generate the formylated tRNAMet, which is then used during the mitochondrial translation. This unique activity is carried out by a polypeptide homologous to other prokaryotic and mitochondrial enzymes of similar function [146, 147].
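
The effect of the anticodon editing described above can be illustrated with a minimal Python sketch. It assumes deliberately simplified pairing rules (standard Watson-Crick pairing at the first two codon positions and textbook wobble pairing for unmodified bases at the first anticodon position) and ignores the nucleotide modifications that affect decoding in vivo. The function name `codons_read` and the two lookup tables are hypothetical names introduced only for this illustration.

```python
# Simplified pairing rules (an assumption made for illustration only).
# Watson-Crick partners, used for codon positions 1 and 2.
WATSON_CRICK = {"A": {"U"}, "U": {"A"}, "G": {"C"}, "C": {"G"}}

# Wobble partners for the first anticodon base (position 34),
# which pairs with the third base of the codon.
WOBBLE_34 = {"C": {"G"}, "U": {"A", "G"}, "G": {"C", "U"}, "A": {"U"}}


def codons_read(anticodon):
    """Return the codons (5'->3') read by an anticodon written 5'->3'."""
    first, second, third = anticodon  # anticodon positions 34, 35 and 36
    codons = []
    # Pairing is antiparallel: codon base 1 pairs with anticodon base 36,
    # codon base 2 with base 35, and codon base 3 with base 34 (wobble).
    for c1 in WATSON_CRICK[third]:
        for c2 in WATSON_CRICK[second]:
            for c3 in WOBBLE_34[first]:
                codons.append(c1 + c2 + c3)
    return sorted(codons)


print(codons_read("CCA"))  # unedited tRNATrp anticodon
print(codons_read("UCA"))  # anticodon after the C-to-U editing event
```

Under these assumptions the unedited CCA anticodon reads only UGG, whereas the edited UCA anticodon reads both UGG and UGA, which is consistent with the reassignment of UGA to tryptophan in the organelle.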

1.5.2 Mitochondrial Translation Factors

The orthologs of mitochondrial translation factors (EF-Tu, EF-G1, EF-Ts and RF1) are readily detected in trypanosomatid genomes, and the depletion of each of these four factors results in growth impairment of the procyclic, insect forms. Despite having much reduced mitochondria and not depending on oxidative phosphorylation, the mammalian bloodstream forms of T. brucei also require functional mitochondrial translation for cell survival, since the depletion of EF-Tu through RNAi in bloodstream cells also led to growth arrest and death [145, 148]. EF-Tu of trypanosomatids is very similar to that of other species, although it includes a C-terminal extension of approximately 30 amino acids. Sequence comparison has shown that the cytosolic EF-Tu ortholog, EF-1A, also has a similar C-terminal extension enriched in charged amino acids. RNAi depletion assays have confirmed that the trypanosomatid-specific C-terminal extension of EF-Tu is essential, and complementation with a chimeric EF-Tu harboring the corresponding segment of EF-1A is able to restore cell growth. Thus, it has been suggested that this specific motif may be linked to unique features of mitochondrial translation, such as an adaptation that allows the protein to fit in the peculiar organellar ribosomes of these parasites [148].

1.5.3 Mitochondrial Ribosomes

The mitochondrial ribosome of the non-pathogenic Leishmania tarentolae is a minimal ribosomal rRNA-containing structure. It was isolated as a 50S protein-rich complex, and its two subunits were characterized as 28-30S small and 40S large subunits, and an unusual ≈45S SSU complex, representing a dimer of the small subunits and additional proteins. The cryo-EM of this ribosome shows most of the main characteristics of a typical ribosome, but with a size smaller than its eubacterial or mammalian counterparts. Comparing the morphology of each subunit, they resemble eubacterial subunits, containing several homologs of bacterial ribosomal proteins. Despite these similarities, more than 60 % of the eubacterial rRNA segments are absent while other differences in the density map of the small subunit are the result of specific proteins and extensions of the Leishmania mitochondrial ribosome. In addition, the subunits are held together by only nine inter-subunit bridges, fewer than those found in bacterial (13) or mammalian mitochondrial (15) ribosomes. The overall size and the maintenance of minimal key architectural components reflect a conserved functioning of the ribosomes. However, distinct topological differences in the Leishmania mitochondrial ribosome suggest that the process of protein synthesis is mechanistically different [149].

The remarkably small size of the mitochondrial ribosomes in trypanosomatids is due to the very short rRNAs found in their small and large subunits (9S and 12S, respectively). The main rRNA core regions are conserved, but other known domains are completely absent. A total of 133 proteins were identified in association with purified subunits of the T. brucei mitochondrial ribosome, and these included orthologs of all 49 proteins found in the 45S SSU-related ribosomal complex of L. tarentolae [143]. The specific function of the 45S SSU-related complex is not fully understood, although it has been suggested that it could be involved in the translational control of specific mRNAs. More recently, this complex has been shown in T. brucei to be indispensable for normal cellular growth and for the translation of mitochondrial mRNAs, such as those encoding cytochrome c oxidase and cytochrome bc1, in procyclic insect forms. The same complex, however, seems to be absent or downregulated in the bloodstream mammalian forms, in which the respiratory chain is not required [150].

It is possible that the higher number of ribosomal proteins identified in the mitochondrial ribosomes of trypanosomatids works in a way to compensate for the shorter rRNAs. Only 30 % of the ribosome-associated proteins in the T. brucei mitochondrial ribosome have orthologs in bacteria or in other eukaryotes, while 70 % are trypanosomatid-specific or may have diverged beyond confident recognition when compared to proteins from other species. To date, 56 proteins with high confidence assignment to purified mitochondrial ribosome of T. brucei have no recognizable homology outside the Kinetoplastida while several other proteins have no evident homology to other ribosomal proteins despite having motifs suggestive of associated functions, for example, PPR, GTP binding, GTPase and methyltransferase motifs [143, 151].

1.6 RNA Binding Proteins

The lack of transcriptional regulation and the reliance of trypanosomatids on post-transcriptional mechanisms to control gene expression, mainly at the mRNA level, imply the existence of elaborate mechanisms for the regulation of mRNA translation, possibly involving a diversified set of RNA-binding proteins (RBPs). It is believed that the association of specific RNA-binding proteins with the 3′ UTRs of different classes of regulated transcripts may coordinately regulate the stability and translation of groups of mRNAs in response to environmental signals. This process likely constitutes a major mechanism for regulation of gene expression in these organisms, mediated by cis-acting signals through the binding of mRNA-specific or generic RNA-binding proteins [25, 28, 152]. Several recent reviews have discussed the different classes of RNA-binding proteins found in trypanosomatids, their diversity and conservation in comparison to other eukaryotes. These include RNA helicases of the DEAD-box subfamily, the RRM-containing proteins, the CCCH family of zinc-finger proteins, and PUF and ALBA domain-containing proteins [25, 153–156]. This review will briefly discuss what is known regarding these proteins, with the focus on those that have been shown to play a role in mRNA metabolism and/or translation.

1.6.1 RNA Helicases

The first reports of DEAD-box RNA helicases in trypanosomatids were in 1995 and included a description of a 64-kDa polypeptide (HEL64) from T. brucei homologous to a yeast nuclear RNA helicase [157] and the first Leishmania eIF4A homolog [57]. Subsequently, a great number of polypeptides were identified and annotated as putative DEAD/DEAH helicases within the TriTryp genomes. A systematic assessment of the full range of trypanosomatid helicases, however, revealed that the total number of true RNA helicases of the DEAD-box subfamily does not significantly exceed the number described for other eukaryotes of similar life cycle. Within the TriTryps, 27 to 30 different DEAD-box helicases were identified [155, 158], a number comparable to the 25 helicases reported from S. cerevisiae [159]. Indeed, only four of the trypanosomatid DEAD-box RNA helicases were identified as components of the T. brucei polysomes, and these include orthologs of two DED1-related proteins originally described from Leishmania [49], as detailed below.

The two DED1 homologs (DED1-1 and DED1-2) that have been identified in trypanosomatids evolved from a gene duplication event that preceded the origin of the different trypanosomatid lineages. The two Leishmania proteins are able to complement a yeast mutant lacking the endogenous protein Ded1p. Both co-precipitated with EIF4E1 and EIF4E4, but only DED1-2 co-precipitated with EIF4G3. In both instances, however, these interactions were RNA dependent and were not detected through yeast two-hybrid assays. A stage-specific expression pattern was observed in L. amazonensis, with DED1-2 being preferentially expressed in the insect promastigote stage, while DED1-1 was more abundant in the mammalian amastigote stage. RNAi assays using the conserved T. brucei orthologs in procyclic cells led to a very mild reduction in proliferation rate after DED1-1 depletion, while depletion of DED1-2 induced a proliferation arrest. These experiments suggest redundant functions for the two proteins, since simultaneous depletion of both homologs led to a synthetic phenotype with a rapid growth arrest and inhibition of translation [160]. Leishmania DED1-1 was also investigated in an independent study, which identified it in a search for proteins that bound to the large ribosomal subunit rRNA and found it to have a protective role in preventing anti-sense rRNA fragmentation [161].

Yet another member of the DEAD-Box family of RNA helicases that has roles implicated in the regulation of gene expression, and possibly translation, in trypanosomatids, is Dhh1. It was first identified in a search for eIF4A homologs in Leishmania where it was found to be the third nearest match to the human eIF4A within the L. major genomic sequences, although clearly distinct from the EIF4AI and EIF4AIII sequences [52]. Subsequently, the Dhh1 orthologs were characterized from T. cruzi, where they were found to be associated with P-bodies and other granules and with translationally repressed mRNAs [162, 163], and also from T. brucei. In fact, T. brucei Dhh1 was found to be essential for cell viability and the overexpression of either wild type or an ATPase-inactive form of Dhh1 led to a decrease in polysomes, increase in P-bodies and growth arrest. A selective effect on the expression of developmentally regulated mRNAs was observed, implicating this enzyme in critical roles during translation regulation in trypanosomatids [164]. Interestingly, Dhh1 was more recently found not to interact with the T. brucei ortholog of the known translational repressor SCD6. T. brucei SCD6 shares the same general domain structure seen in orthologs from different eukaryotic lineages and localizes to P-bodies and other granules, and its depletion leads to an increase in global translation rates, demonstrating similar roles to the ones described in other organisms. Lack of an association with Dhh1, seen in all other organisms studied so far, suggests a clear difference in the mode of action of the two proteins during translation control in trypanosomatids [165].

1.6.2 RRM Domain Proteins

RNA recognition motif (RRM) proteins are involved in several processes that are conserved between trypanosomatid species. There are around 75 genes encoding RBPs with one or more RRMs in trypanosomatid genomes [154, 166]. Some are involved in differentiation control, like RBP10, a protein involved in the regulation of several mRNAs differentially expressed in the T. brucei mammalian form [167]. Another example is RBP6, which binds to AU-rich elements in the 3′ UTRs of mRNAs and is involved in differentiation/progression between different life stages [168, 169]. RBP42 binds mainly within the coding region of mRNAs that encode proteins involved in cellular energy metabolism [170]. An RRM protein homologous to hnRNP F/H controls trans-splicing efficiency and mRNA stability by binding to a purine-rich motif, AAGAA, found in some bloodstream form mRNAs [171]. UBP1 and UBP2, also RRM-containing proteins from T. cruzi, bind to a 43-nt U-rich mRNA element and stabilize mRNAs from selected life stages [172, 173]. In T. brucei, UBP1 and UBP2 are essential for normal growth and have different targets from those in T. cruzi, binding to a group of mRNAs encoding F-box family proteins, some of which are transmembrane proteins. Overexpression of T. brucei UBP2 inhibited cell growth and upregulated the levels of several mRNAs [174, 175]. Yet another pair of RRM proteins that has been better characterized in trypanosomatids is the PTB homologs PTB1/PTB2 (DRBD3/DRBD4). These are mainly nuclear proteins, but they can localize to the cytoplasm and have been implicated in different events such as trans- and cis-splicing, mRNA transport, and the regulation of mRNA stability and translation [176–179].

1.6.3 CCCH Family Proteins

Several genes encoding proteins with one to five zinc-finger motifs (CCCH) were identified in the trypanosomatid genomes. Most are conserved and probably evolved before speciation events, and some may have regulatory roles in mRNA metabolism [25, 153]. Relevant examples are ZFP1, ZFP2 and ZFP3, small CCCH-type proteins. In T. brucei, ZFP1 is transiently enriched during differentiation to tsetse fly midgut procyclic forms, and while ZFP1 RNAi had no phenotype in the mammalian bloodstream form, ZFP1 null bloodstream cell lines were unable to differentiate into procyclics. ZFP2 RNAi affects the ability of cells to differentiate to the procyclic form, and ectopic expression of ZFP2 induced differentiation with elongated cell phenotypes due to polar extension of the cytoskeleton and cell cycle arrest in G1 [180, 181]. ZFP2 from T. cruzi was found associated with mRNAs upregulated in the metacyclic trypomastigote forms [182]. Ectopic expression of ZFP3 in T. brucei also induced a differentiation phenotype similar to that of ZFP2 [183], and this protein is associated with a subset of mRNAs enriched in transmissible forms of the parasite [184]. Cycling sequence binding proteins (CSBP) A and B are CCCH proteins identified as binding to mRNAs containing the sequence CAUAGAAG, expressed during the S-phase, but their specific function remains unclear [185, 186].

1.6.4 PUF Domain Proteins

Eleven PUF proteins were identified in trypanosomatids [187, 188], but only a few have been characterized. PUF2 exhibits a low level of in vivo RNA binding, is not associated with polysomes and may have a repressive role [189]. PUF6 regulates target mRNA levels by association with degradation complexes [190]. PUF7, PUF8 and PUF10 are nucleolar proteins involved in rRNA metabolism and do not seem to be involved in translation [191, 192]. PUF9 controls the expression of proteins involved in replicative processes by binding and stabilizing a small number of mRNAs that increase in G1 phase [193].

1.6.5 ALBA Proteins

In Trypanosoma brucei there are four ALBA (acetylation lowers binding affinity) proteins found in homo- and heterodimers in the cytoplasm [109]. ALBA1 and 2 interact with 3′ UTR elements of GPEET procyclin. ALBA3 and 4 are involved in cell growth and linked to differentiation. They are expressed in all stages of the T. brucei life cycle, except the transition from procyclic to epimastigote form. Complexes of ALBA1/2 and ALBA3/4 were found as components of stress granules, and they also seem to be associated with polysomes and can be co-purified with the translation initiation machinery, which indicates they may be involved in translation control [109, 194]. In Leishmania infantum there are two ALBA proteins, LiAlba1 and LiAlba3, which form a complex interacting with other RNA-binding proteins, ribosomal subunits and translation factors and may have a role in translational repression. They display differential subcellular localization since in the insect promastigote stage they localize predominantly in the cytoplasm but move to the nucleolus and the flagellum upon amastigote differentiation [195, 196].

1.7 Concluding Remarks

Throughout this review, novel features of translation in trypanosomatids have been described that highlight differences to other, better characterized eukaryotes. For instance, the canonical eIF4F complex EIF4G3/EIF4E4/EIF4AI binds PABP1 directly through EIF4E4 and appears to be regulated by phosphorylation at novel sites; in animals the interaction is between eIF4G and PABP. There is a second eIF4F complex also likely to be involved in translation, based on EIF4G4/EIF4E3, and it also may make similar interactions with PABP homologs. The evidence from trypanosomatids then reinforces the strong link between PABP and eIF4F function during eukaryotic translation initiation, and further characterization might help clarify the role of PABP in the process. Two further eIF4F complexes, recently identified and based on EIF4E5 and EIF4E6, are associated with proteins homologous to nucleotide-modifying enzymes associated with cap formation in the nucleus, but with so far undefined functions. Could these enzymes be involved in a novel cytoplasmic mRNA capping or recapping, or maybe translation repression or selective translation of small subsets of mRNAs? What are the mechanisms involved? It is also possible that the multiple eIF4F-like complexes identified in trypanosomatids constitute multiple hubs integrating signals from different sources and leading to the selective translation of specific mRNAs, reminiscent of nematodes [197]. Regarding ribosome function, the distinctive features seen in both cytoplasmic and mitochondrial ribosomes in trypanosomatids expand the range of ribosome diversity in eukaryotes considerably. They also raise questions regarding to what extent these ribosomes use conserved mechanisms in order to achieve proper translation or whether they depend on novel features/processes in order to accomplish their functions successfully. 
The great number of RNA-binding proteins and other mRNA associated proteins in trypanosomatids [70], most of which are novel and have unknown function, also highlights the diversity associated with mRNA metabolism and translation. Further approaches will be required in different fields in order to solve these and other questions raised by the study of translation in trypanosomatids, but undoubtedly the answers generated shall further highlight the exquisite and complex model for translation found in such extraordinary organisms.