Introduction

Euglenozoa is the protist phylum comprising euglenids, kinetoplastids (bodonids and trypanosomatids), and diplonemids. The grouping of these organisms is supported by morphological, molecular, and phylogenetic evidence. These organisms possess discoidal mitochondrial cristae and a characteristic feeding apparatus (Simpson 1997; Triemer and Farmer 1991), they possess the unusual base “J” in nuclear DNA (Dooijes et al. 2000), and they add a non-coding capped spliced-leader (SL) RNA to nearly all cytosolic mRNAs via trans-splicing (Bonen 1993; Liang et al. 2003). The monophyly of Euglenozoa has also been supported by molecular phylogenies (Simpson and Roger 2004).

The phylogenies support the early divergence of euglenids within Euglenozoa, followed by the split of diplonemids and kinetoplastids (Simpson and Roger 2004; Simpson et al. 2002, 2004). However, the discovery of a new euglenozoan clade, Symbiontida, whose exact phylogenetic position within Euglenozoa is currently uncertain (Breglia et al. 2010; Yubuki et al. 2009), has challenged this view: diplonemids and symbiontids might instead be more closely related to euglenids than to kinetoplastids (Chan et al. 2012). Nevertheless, trypanosomatid parasites seem to be one of the latest branches of Euglenozoa, and they most likely evolved from one of the free-living bodonid clades (Callahan et al. 2002; Deschamps et al. 2011; Dyková et al. 2003; von der Heyden et al. 2004; Moreira et al. 2004; Simpson and Roger 2004; Simpson et al. 2004, 2006). Parasitism evolved at least four times within kinetoplastids (Simpson et al. 2006). Most euglenids are free-living heterotrophic flagellates, while some possess plastids of secondary green algal origin. Various lines of evidence suggest a relatively recent acquisition of plastids by a phagotrophic ancestor of plastid-bearing euglenids (Leander et al. 2001; Leander 2004; Nozaki et al. 2003; Rogers et al. 2007; Turmel et al. 2009; Vesteg et al. 2010). Although the vast majority of phototrophic euglenids have lost the ability of phagocytosis, the recently described marine phototrophic euglenid Rapaza viridis has retained the ability to capture eukaryotic prey (Yamaguchi et al. 2012). However, the flagellate Euglena gracilis, belonging to the latest euglenid clade represented by freshwater phototrophs (Linton et al. 2010), has been the most studied euglenid species so far. The scheme of euglenozoan phylogeny is presented in Supplemental Fig. 1.

Euglenozoan mitochondrial genome structures are unusual and diverse. The E. gracilis mitochondrial genome is represented by a heterodisperse collection of short molecules (approximately 4 kb) encoding gene fragments flanked by repeats (Spencer and Gray 2011). The diplonemid Diplonema papillatum possesses multiple 6–7 kb circular-mapping chromosomes containing short subgenic modules expressed as separate transcripts that are then trans-spliced to yield translatable mRNAs (Marande and Burger 2007; Vlček et al. 2011). In kinetoplastids, mitochondrial transcripts encoded by maxicircles are edited by guide RNAs (gRNAs) encoded by minicircles (Hajduk et al. 1993), although in some trypanosomatids a small proportion of gRNAs is also encoded by maxicircles (for review see Simpson et al. 2000; Stuart and Panigrahi 2002). The RNA editing in mitochondria of kinetoplastids includes uridine insertions and deletions.

Euglenozoa have been classified within Excavata, which have only recently been suggested to be one of the three major eukaryotic groups, possibly representing the most basal eukaryotic branch (Hampl et al. 2009). However, another recently proposed hypothesis holds that Euglenozoa might instead be the earliest branching eukaryotes, apart from Excavata (Cavalier-Smith 2010). Cavalier-Smith (2010) has proposed that some euglenozoan features are primitive. Two of these are mitochondrial features: unique cytochrome c biogenesis (Allen et al. 2008) and the possible absence of the mitochondrial outer membrane channel Tom40 (Schneider et al. 2008).

The engulfment of an α-proteobacterial ancestor of mitochondria by a host entity was probably a key moment in eukaryogenesis (Martin and Müller 1998; Vesteg and Krajčovič 2008, 2011). The α-proteobacterial ancestors of mitochondria might have been either strict aerobes (Cavalier-Smith 2002) or facultative anaerobes (Martin and Müller 1998). The former view is supported by the fact that most eukaryotes possess aerobic mitochondria and by the calculation of an oxyphobic index considering the amino acid distribution in anaerobes and aerobes suggesting that the last eukaryotic common ancestor (LECA) was an aerobe (Di Giulio 2007). The latter hypothesis is supported by the fact that mitochondria of E. gracilis, other euglenozoans and excavates possess biochemical properties of both aerobic and anaerobic mitochondria (Ginger et al. 2010), and thus, the biochemistry of mitochondria of these organisms might represent an intermediary evolutionary stage reminiscent of the mitochondria of LECA.

The acquisition of an α-proteobacterium and its evolution into a primitive mitochondrion was accompanied by the transfer of endosymbiont genes to the host genome and by the evolution of a mechanism for importing proteins into mitochondria, including the evolution of mitochondria-targeting presequences. Most of the proteins necessary for mitochondrial function were probably nucleus-encoded in LECA (Desmond et al. 2011). The potentially primitive mitochondrial import apparatus of what is possibly the most ancient eukaryotic group (either euglenozoans or excavates) could be a good model for tracing the evolution of the mitochondrial import mechanism of the first eukaryote. However, the data on mitochondrial targeting presequences of Euglenozoa are fragmentary.

Studies of some proteins targeted to mitochondria, together with some predictions, suggest that mitochondrial presequences of trypanosomatids are quite short (some only 6 aa in length) (Häusler et al. 1997). Nevertheless, the trCOIV (Cox4) preprotein of Leishmania tarentolae, for example, possesses a 31 aa-long presequence (Maslov et al. 2002). In contrast to other eukaryotes, cytochrome c1 lacks a cleaved targeting peptide in euglenozoans (Priest et al. 1993; Priest and Hajduk 2003). Experimental evidence exists that the E. gracilis presequences of subunits II and IX of the ubiquinol-cytochrome c reductase complex (Qcr2 and Qcr9) are 42 and 30 aa-long, respectively (Cui et al. 1994). Although a 30 aa-long consensus sequence has been generated from the N-termini of 107 hypothetical E. gracilis proteins potentially targeted to mitochondria (Gawryluk and Gray 2009), nearly nothing is known about the variability in length of mitochondrial presequences in euglenids. While the structure of plastid-targeting presequences in E. gracilis, and the domains and motifs therein, have been precisely analyzed (Durnford and Gray 2006), the common patterns present in mitochondrial presequences of euglenids are largely unknown.

In this study, we searched for common protein motifs in predicted mitochondrial presequences of nucleus-encoded mitochondrial precursor proteins in E. gracilis and parasitic trypanosomatids Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major. Since trypanosomatid parasites and phototrophic freshwater euglenids (including E. gracilis) seem to be the most distant branches in euglenozoan phylogeny (see Supplemental Fig. 1), the similarities of mitochondrial import signals and machineries in these organisms could likely reflect the nature of mitochondrial import apparatus of euglenozoan common ancestor, if not LECA itself.

Methods

More than 500 proteins of the respiratory chain and associated proteins, proteins of the citric acid cycle, and proteins involved in the synthesis of Fe–S clusters from T. brucei, Chlamydomonas reinhardtii, Saccharomyces cerevisiae and Bos taurus were used as queries in tBLASTn searches (http://blast.ncbi.nlm.nih.gov/Blast.cgi) of the currently available non-annotated EST data (http://www.ncbi.nlm.nih.gov/) of E. gracilis (Ahmadinejad et al. 2007; Durnford and Gray 2006; Ferreira et al. 2007). In addition, the E. gracilis EST data were searched for homologs of 20 trypanosomatid editosomal proteins. Only hits with e-values below 1 × 10^-9 were used for further analysis. Some of these contained the SL-leader sequence TTTTTTTCG generally present at the 5′-end of E. gracilis mRNAs and mRNAs of other euglenids (Frantz et al. 2000). The protein sequences used for analysis of presequence-encoding regions were obtained either via in-frame translation (based on comparison with homologous proteins from other organisms) of ESTs containing the SL-leader, or via linking of 2–5 EST sequences (at least one containing the SL-leader) with identical overlaps. The first methionine encoded by ATG in the sequence downstream of the SL-leader was chosen as the presequence start. Accession numbers of the E. gracilis ESTs used in this study, the names of the organisms with the best BLASTx hits, e-values, and names of putative gene products are given in Supplemental Table 1. Mitochondrial protein homologs of T. brucei (TB), T. cruzi (TC), and L. major (LM) (if found) were also included in the analysis of presequence regions, as were the E. gracilis mitochondrial proteins involved in heme synthesis: ferrochelatase (FeCH), δ-aminolevulinic acid synthase (ALAS), and protoporphyrinogen oxidase (PPOX) (Kořený and Oborník 2011). In addition, various other T. brucei, T. cruzi, and L. major mitochondrial protein precursors involved in the citric acid cycle, synthesis of Fe–S clusters, and RNA editing, homologs of which were not found in the currently available E. gracilis sequence data, were also used in this study. Accession numbers (http://www.ncbi.nlm.nih.gov/) and names of the trypanosomatid nucleus-encoded mitochondrial protein precursors included in the analysis of euglenozoan mitochondrial presequences can be found in Supplemental Table 2.
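The rule for choosing the presequence start (first ATG downstream of the SL-leader) can be illustrated with a minimal Python sketch. It is a simplification and not the procedure actually used in the study: the function name `translate_after_sl` and the example EST are hypothetical, the standard genetic code is assumed, and the homology-based check of the reading frame described above is omitted.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

SL_LEADER = "TTTTTTTCG"  # SL-leader sequence quoted in the Methods


def translate_after_sl(est, sl=SL_LEADER):
    """Translate from the first ATG found downstream of the SL-leader.

    Returns the protein up to (not including) the first stop codon,
    or None if the EST lacks the SL-leader or a downstream ATG.
    """
    est = est.upper()
    sl_pos = est.find(sl)
    if sl_pos == -1:
        return None
    start = est.find("ATG", sl_pos + len(sl))
    if start == -1:
        return None
    protein = []
    for i in range(start, len(est) - 2, 3):
        aa = CODON_TABLE.get(est[i:i + 3], "X")  # X marks ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical EST: 2 nt, SL-leader, 3 nt spacer, then a short ORF.
example_est = "GG" + SL_LEADER + "AAC" + "ATGCTGCGTCGTGCTTAA"
print(translate_after_sl(example_est))
```

In practice the translated N-terminus would still need to be compared with homologous proteins, as described above, before being accepted as the presequence start.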

E. gracilis protein sequences and trypanosomatid proteins listed in Supplemental Tables 1 and 2 were used as queries in tBLASTn search in currently available ESTs of Euglena longa, Bodonidae, and Diplonemida, and our unpublished EST data of Eutreptiella gymnastica. The currently available EST data of E. gracilis and E. gymnastica were also screened for the presence of Tom40—key component of the complex of mitochondrial outer membrane translocon. The screening was performed, firstly, using tBLASTn with Tom40 (S. cerevisiae) and ATOM (T. brucei) as queries and, secondly, using Hidden Markov model (HMM) search. The HMM search (Likic et al. 2010) was performed using MyHMMER script kindly provided by Vojtěch Žárský (Department of Parasitology, Faculty of Science, Charles University in Prague) under the default setting using alignments of either 23 eukaryotic Tom40 sequences or ATOM of kinetoplastids. The top hits from the searches were further evaluated using HHpred (Söding 2005) (http://hhpred.tuebingen.mpg.de/hhpred) under default setting and with pdb70_9Feb12 database. This database consisted of HMMs created from alignment of proteins present in protein data bank on February 9, 2012 (Bourne et al. 2004).

The probable cleaved targeting sequences (CTS) were identified using the programs MITOPROT (Claros and Vincens 1996) (http://ihg.gsf.de/ihg/mitoprot.html) and targetP (Emanuelsson et al. 2000; Nielsen et al. 1997) (http://www.cbs.dtu.dk/services/TargetP/) under default settings. When E. gracilis sequences were used for CTS prediction by targetP, the organism group was changed from non-plant to plant to account for potential plastid localization. Common sequence motifs within presequences were determined using the MEME program (Bailey and Elkan 1994) (http://meme.nbcr.net/meme4_6_1/cgi-bin/meme.cgi) and GLAM2 (Frith et al. 2008) (http://meme.nbcr.net/meme4_6_1/cgi-bin/glam2.cgi). The biochemical properties of predicted CTSs, such as molecular weights, theoretical isoelectric points (pI), and numbers of positively and negatively charged amino acids, were calculated using the ProtParam program (Gasteiger et al. 2005) (http://expasy.org/tools/protparam.html).
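As a rough illustration of the simplest of these properties, the following Python sketch counts charged residues in a presequence. It assumes the usual convention of counting Arg and Lys as positively charged and Asp and Glu as negatively charged; the function name `charge_summary` and the example sequence are hypothetical, and molecular weight and pI are deliberately left to ProtParam.

```python
from collections import Counter


def charge_summary(presequence):
    """Count charged and selected hydrophobic residues in a presequence."""
    counts = Counter(presequence.upper())
    return {
        "length": len(presequence),
        "positive": counts["R"] + counts["K"],   # Arg + Lys
        "negative": counts["D"] + counts["E"],   # Asp + Glu
        # Ala, Phe, Leu, Val: the hydrophobic residues highlighted in the text
        "hydrophobic_AFLV": sum(counts[aa] for aa in "AFLV"),
    }


# Hypothetical 10 aa presequence used only to exercise the function.
summary = charge_summary("MLRRAAFVDE")
print(summary)
```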

Results

The predicted presequences (cleaved targeting sequences, CTSs) of mitochondrial protein precursors of E. gracilis, T. brucei, T. cruzi and L. major were analyzed in this study. A CTS could not be found in every protein included in this study, even though all were predicted to be targeted to mitochondria. For example, no CTS regions were identified in cytochromes c1 and c of E. gracilis and all three trypanosomatids. Similarly, no CTS could be detected in E. gracilis NADH dehydrogenase NDUFA9 and aconitase (Aco). The total number of euglenozoan proteins predicted to be targeted to mitochondria and possessing a CTS identifiable by either MITOPROT or targetP (or both) was 127. While only 105 CTSs were detected by MITOPROT, 122 CTSs were detected by targetP. Therefore, CTSs predicted by targetP were chosen for further analysis.
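The overlap between the two programs follows from these three counts by inclusion-exclusion. The short Python sketch below works through the arithmetic, assuming that 127 is the size of the union of the two prediction sets; the function name is hypothetical.

```python
def count_predicted_by_both(n_first, n_second, n_either):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    overlap = n_first + n_second - n_either
    if overlap < 0 or overlap > min(n_first, n_second):
        raise ValueError("counts are mutually inconsistent")
    return overlap


# Counts reported above: MITOPROT 105, targetP 122, either program 127.
both = count_predicted_by_both(105, 122, 127)
mitoprot_only = 105 - both
targetp_only = 122 - both
print(both, mitoprot_only, targetp_only)
```

Under that assumption, 100 CTSs were detected by both programs, 5 by MITOPROT only, and 22 by targetP only.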

The predicted euglenozoan CTSs were 5–118 aa-long, with average lengths of 31 and 41 aa as predicted by MITOPROT and targetP, respectively (see Table 1 for details). The CTSs contained mainly positively charged and hydrophobic amino acids, and the average isoelectric point was 11.37 ± 1.26 (MITOPROT) and 10.66 ± 1.84 (targetP). The CTS lengths and biochemical properties of presequences predicted by MITOPROT and targetP are given in Supplemental Table 3. In addition, the E. gracilis Qcr1 respiratory chain protein was found to be homologous to β subunits of matrix processing peptidases (MPPs) from trypanosomatids. This was consistent with the previous study of Cui et al. (1994). Qcr1 and its partial homolog Qcr1p were found to be homologous to β subunits of different trypanosomatid MPPs (designated B1 and B2 here). Supplemental Table 1 contains the list of E. gracilis mitochondrial preproteins included in this study whose ESTs were found in the currently available non-annotated EST sequence data of this organism.

Table 1 Average, mode, and median length of euglenozoan presequences predicted by MITOPROT and targetP

Since CTSs needed to be longer than 7 aa to be searched for common motifs using the MEME program, the 116 presequences predicted by targetP that met this criterion were analyzed. This analysis generated a short conserved motif, (M/L)RR, present in most of the analyzed presequences. The consensus logo generated from all 116 proteins included in the analysis is presented in Fig. 1a. This motif is generally followed by a hydrophobic region rich in alanine, phenylalanine, leucine, and valine. The short (M/L)RR motif is the same as that previously identified at the N-terminus of some trypanosomatid mitochondrial presequences predicted to be short (Häusler et al. 1997). In the MEME output including all 116 presequences, the consensus region serving as the basis for generation of the logo was present at the N-terminus (up to 7 aa from the first methionine) in 60 euglenozoan presequences, while it was found elsewhere within the predicted presequence region (more than 7 aa from the N-terminus) in 56 euglenozoan presequences (data not shown). The MEME analysis of the 60 proteins in which the (M/L)RR motif was detected at the N-terminus in the initial step is presented in Table 2, and the common 10 aa-long motif (present in 56 of these 60 proteins) is depicted in Fig. 1b. The MEME analysis of the 56 proteins in which the (M/L)RR motif was detected within the presequence region in the initial step is presented in Table 3, and the common 20 aa-long logo generated from 50 of these 56 proteins is depicted in Fig. 1e. Table 4 includes only the analysis of mitochondrial protein precursors from E. gracilis and MppB2 from Eutreptiella gymnastica, and the 11 aa-long logo generated by the program is depicted in Fig. 1c. This logo is very similar to the N-terminus of the logo generated by Gawryluk and Gray (2009). Figure 1d shows the common motif generated from six euglenozoan sequences predicted to be 5 or 6 aa-long by targetP. The sequence identities among presequences and among mature parts of preproteins in E. gracilis and trypanosomatids are shown in Supplemental Fig. 2.
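The split of presequences into those with an N-terminal motif and those with an internal motif can be sketched in Python with a plain regular expression. This is only a stand-in for illustration: MEME discovers motifs statistically, whereas the sketch below simply scans for the literal pattern [ML]RR. The function name, the example sequences, and the reading of "up to 7 aa from the first methionine" as a 0-based motif start index of at most 7 are all assumptions.

```python
import re

MOTIF = re.compile(r"[ML]RR")  # literal form of the (M/L)RR motif


def classify_motif_position(presequence, n_terminal_window=7):
    """Label a presequence by where the first (M/L)RR occurrence starts."""
    match = MOTIF.search(presequence.upper())
    if match is None:
        return "absent"
    if match.start() <= n_terminal_window:
        return "N-terminal"
    return "internal"


# Hypothetical presequences, not taken from the study's data.
examples = ["MLRRAAFV", "MSTAVKLAPSSMRRLA", "MSTAV"]
labels = [classify_motif_position(seq) for seq in examples]
print(labels)
```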

Fig. 1
Common protein motifs present in presequences of mitochondrial precursor proteins of Euglena gracilis and trypanosomatids. a Logo generated in the MEME analysis of 116 euglenozoan presequences predicted by targetP. b 10 aa-long logo generated in the MEME analysis of 60 euglenozoan mitochondrial presequences predicted by targetP in which the (M/L)RR motif was detected at the N-terminus (up to 7 aa from the first methionine) (see Table 2). c The 11 aa-long logo generated in the MEME analysis of 20 E. gracilis targetP-predicted mitochondrial presequences and 1 presequence from Eutreptiella gymnastica (see Table 4). d Logo generated in GLAM2 program from 6 euglenozoan sequences predicted to be 5 or 6 aa-long by TargetP. e 20 aa-long logo generated in the MEME analysis of 56 euglenozoan mitochondrial presequences (predicted by targetP) in which the (M/L)RR motif was detected more than 7 aa from the first methionine (see Table 3)

Table 2 The output of the MEME analysis of 60 euglenozoan mitochondrial presequences (predicted by targetP) in which the (M/L)RR motif was detected at the N-terminus (up to 7 aa from the first methionine) in the initial analysis in which 116 predicted euglenozoan mitochondrial targeting presequences were included
Table 3 The output of the MEME analysis of 56 euglenozoan mitochondrial presequences (predicted by targetP) in which the (M/L)RR motif was detected more than 7 aa from the first methionine in the initial analysis in which 116 predicted euglenozoan mitochondrial targeting presequences were included
Table 4 The output of the MEME analysis of 20 E. gracilis targetP-predicted mitochondrial presequences

In addition, we could not identify the mRNAs of the proteins analyzed in this study in the currently available EST data of diplonemids. We could identify some of them in bodonid data, but none of these was predicted by the programs to be targeted to mitochondria. We identified ESTs encoding Cit1, Fh, Mdh, Grx, Hel61, MEAT1, trCOIV, AtpB, AtpA, CytC1, CytC, SdhB1, TAO, SdhB2, NuoG, MppB1 (Qcr1), MppB2 (Qcr1p), Eao2, and Eao3 in Euglena longa EST data. However, most of these lack 5′-ends. ESTs encoding cytochromes c and c1 do possess the 5′-end but, as in E. gracilis and trypanosomatids, no CTSs were detectable by the programs, probably because these proteins lack them. The only sequences encoding homologs of proteins analyzed in this study found in E. gymnastica transcriptome data (unpublished) were trCOIV, AtpA, AtpB, Mdh, MppB2, and cytochromes c and c1, of which only MppB2 and cytochromes c and c1 possessed a complete 5′-end. While cytochromes c and c1 lacked an identifiable CTS, the MppB2 presequence was predicted to be 40 aa-long by targetP (and 39 aa-long by MITOPROT).

Using BLAST and HMM search, we have identified a single eukaryotic porin3 protein in E. gracilis (accession number AF317222 in GenBank nr database and ELE00007502 in GenBank EST database), and its ortholog was detected also in the transcriptome of E. gymnastica. HHpred search indicated that this porin likely belongs to the VDAC (Voltage-dependent anion channel) subfamily rather than to Tom40 subfamily. No porin similar to ATOM of kinetoplastids has been detected in euglenids.

Discussion

The nucleus-encoded mitochondrial precursor proteins in E. gracilis and trypanosomatids apparently possess presequences sharing common features. Principally, the euglenozoan mitochondrial presequences are of three types with respect to length. Two types of presequences were previously known in trypanosomatids: (1) short (up to 10 aa-long) mitochondrial targeting presequences, and (2) 10–30 aa-long presequences (most frequently about 16 aa-long) (Häusler et al. 1997). In this study, it has been demonstrated that some E. gracilis mitochondrial precursor proteins also possess these two types of CTS. Moreover, this study revealed that some mitochondrial targeting presequences can be quite long (up to 118 aa) in both E. gracilis and trypanosomatids. These predictions are consistent with the experimental evidence that the frataxin precursor of T. brucei possesses a 55 aa-long mitochondrial targeting sequence (Long et al. 2008), and that the 115 aa-long N-terminus of a WD-repeat preprotein can serve as a mitochondrial targeting signal in T. cruzi (Bromley et al. 2004). Most of the long euglenozoan presequences identified here possess at least one long common protein motif (up to 20 aa) (Fig. 1e). The variability of mitochondrial presequence length in E. gracilis was also predicted via alignments of eukaryotic enzymes involved in tetrapyrrole synthesis with their bacterial homologs lacking a CTS (Kořený and Oborník 2011). The presequences of the mitochondria-targeted enzymes ferrochelatase (FeCH) and δ-aminolevulinic acid synthase (ALAS) seem to be quite long (47 and 86 aa, respectively), while the presequence of protoporphyrinogen oxidase (PPOX) may be only 7 aa-long (see Supplemental Fig. S3 in Kořený and Oborník 2011). In our analysis, the ALAS mitochondrial presequence was detected by neither MITOPROT nor targetP, while the FeCH and PPOX presequences were predicted to be 42 and 96 aa-long, respectively, by targetP (and 49 and 26 aa-long by MITOPROT). This suggests that the use of these programs for the prediction of mitochondrial presequences has its limitations.

Although the presequence lengths predicted by MITOPROT and targetP differed in many cases (Supplemental Table 3), most of the predicted presequences contained arginine (R) as the penultimate amino acid (data not shown). This may reflect the fact that both programs search for the conserved cleavage sites most commonly found among eukaryotes, while their algorithms differ. It is also possible that euglenozoan processing peptidases do not recognize consensus cleavage sites in some cases. It should be mentioned that both MITOPROT and targetP were trained mainly on model organisms that are phylogenetically distant from euglenozoans. Although the cleavage sites of most euglenozoan presequences included in this study remain to be experimentally verified, the MEME analysis revealing common motifs at the N-termini of predicted presequences is significant. Since the analyzed mitochondrial proteins have different mitochondrial functions, such as in oxidative phosphorylation, the citric acid cycle, heme synthesis, synthesis of Fe–S clusters, RNA editing, and RNA and protein processing (Supplemental Tables 1 and 2), the common motif identified at the N-termini of most euglenozoan presequences can hardly be explained by redundancy of the data used in this study.

Another possibility for the explanation of the difference between presequence lengths predicted by targetP and MITOPROT is that some precursor proteins are sequentially processed to the mature form in two steps by peptidases, while each program suggests only one of the cleavage sites. Two-step processing of protein precursor has been demonstrated experimentally, e.g., for Rieske iron-sulfur protein subunit of the cytochrome c reductase in T. brucei (Priest and Hajduk 1996). Moreover, even more complex processing of frataxin in T. brucei occurs (Long et al. 2008). Since we found two different homologs of E. gracilis Qcr1 (possessing domains typical for β subunits of MPPs) as well as two homologs of MPP β subunits in trypanosomatids, it might be possible that two types of MPPs exist in Euglenozoa probably recognizing different cleavage sites. Hypothetically, cleavage sites might alternate under different growth conditions or in different developmental stages in the case of trypanosomatids. It has been shown for example that alternative oxidase (TAO) and subunit IV of cytochrome oxidase (trCOIV) are imported in the mitochondrion via different mechanisms in T. brucei bloodstream and procyclic form (Williams et al. 2008).

The euglenozoan mitochondrial import apparatus is often assumed to be primitive, potentially reminiscent of that of LECA (Cavalier-Smith 2010; Schneider et al. 2008). Apart from some exceptions including Tim17, most components of the mitochondrial protein import machinery generally present in other eukaryotes cannot be identified in euglenozoan sequence data by homology searches (Schneider et al. 2008; Singha et al. 2008). Most striking is the potential absence of Tom40 (Cavalier-Smith 2010; Pusnik et al. 2011; Schneider et al. 2008). Pusnik et al. (2011) have recently identified the protein translocase ATOM, belonging to the eukaryotic porin family, in trypanosomes. ATOM is assumed to have the same function as Tom40 in other eukaryotes, and it has been suggested that this translocase is related to bacterial Omp85-like proteins (Pusnik et al. 2011). On the other hand, Hidden Markov Model (HMM)-based analysis of trypanosomatid ATOMs has recently revealed that ATOM is most likely a highly derived Tom40 (Žárský et al. 2012). Our HMM search detected neither Tom40 nor ATOM in the transcriptomes of euglenids.

Since the euglenozoan mitochondrial import machinery seems to differ from that of all organisms studied so far (Lithgow and Schneider 2010), one would expect mitochondrial presequences to be somehow specific as well. However, it has been recently shown that trypanosome mitochondria have the capacity to use the human CTS (Long et al. 2008). Moreover, except for the highly variable sequence length and the presence of the (M/L)RR motif in presequences, there seems to be nothing special about euglenozoan presequences in comparison to those of other organisms. The (M/L)RR logo has been previously generated when the N-termini of 24 L. major mitochondrial protein precursors were included in the analysis (Uboldi et al. 2006). Euglenozoan presequences are generally rich in hydrophobic amino acids such as alanine, phenylalanine, leucine, and valine, and in the basic amino acid arginine. The (M/L)RR motif is present at the N-terminus (up to 10 aa) in most presequences and is generally followed by a hydrophobic region, although in some longer presequences the RR motif is found farther from the N-terminus. In addition, when some Bos taurus and/or S. cerevisiae mitochondrial presequences were included in the analysis, they all appeared in the outputs of the MEME program (data not shown), suggesting that the biochemical properties of euglenozoan presequences and the motifs therein are not very different from those of presequences from other, unrelated organisms. Nevertheless, the data presented here are consistent with the hypothesis that the (M/L)RR motif alone, and/or RR between two hydrophobic amino acids (mainly alanine, leucine, methionine or valine) or within longer hydrophobic regions, is probably a sufficient euglenozoan mitochondrial targeting signal irrespective of presequence length. Since not all presequences possess exactly the (M/L)RR motif, but all share at least a short region with highly similar biochemical properties, another possibility is that a hydrophobic region rich in arginine (R) at the N-terminus of a preprotein is by itself a sufficient signal for mitochondrial import in Euglenozoa. Nevertheless, it was also possible to identify a 20 aa-long region with statistically significant biochemical similarities in 50 of 56 euglenozoan presequences (see Table 3 and Fig. 1e). Therefore, it is also possible that longer regions with similar biochemical properties in long presequences can serve as N-terminal mitochondria-targeting signals. It has also been experimentally demonstrated that T. brucei CoxVI does not possess an N-terminal cleavable presequence and uses an internal signal sequence for targeting to mitochondria (Tasker et al. 2001). Taken together, although the majority of euglenozoan mitochondrial protein precursors likely require only the (M/L)RR motif and/or a short hydrophobic region rich in arginine at the N-terminus to be targeted to mitochondria, longer motifs within longer presequences of some precursors, as well as internal protein sequences, can also be responsible for protein targeting to euglenozoan mitochondria.