Abstract
Many eubacteria contain an ATP-dependent protease complex, which is built from multiple copies of the HslV and HslU proteins and is therefore called HslVU. HslU proteins are AAA+ ATPases, while HslV proteins are proteases that show highly significant similarity to β subunits of proteasomes. Therefore, the HslVU complex has been envisaged as a precursor or ancestral type of proteasome. Here we show that species of most of the main eukaryotic lineages have HslU and HslV genes very similar to those found in proteobacteria. We have detected them in amoebozoa, plantae, chromoalveolata, rhizaria, and excavata species. Phylogenetic analyses suggest that these genes have been obtained by endosymbiosis from the proteobacterial ancestor that gave rise to eukaryotic mitochondria. The products encoded by these eukaryotic genes adopt, according to modeling based on the known crystal structures of prokaryotic HslU and HslV proteins, conformations that are compatible with their being fully active, suggesting that functional HslVU complexes may be present in many eukaryotic species.
Introduction
Protein degradation is a key cellular process. All organisms possess specific systems that digest proteins into small peptides and, finally, single amino acids. In eukaryotes, most proteins are degraded by the ubiquitin-proteasome system (reviewed by Glickman and Ciechanover 2002; Groll et al. 2005). Ubiquitin chains are added to proteins that become tagged for destruction by a complex multiprotein machine, the proteasome. Proteasomes are found not only in eukaryotes but also in archaea and in some eubacteria, such as several actinomycete species (Volker and Lupas 2002; Gille et al. 2003). The eukaryotic 26S proteasome is formed by a proteolytic core, called the 20S proteasome, and 19S regulatory complexes. The 20S proteasome is formed by 28 related subunits. There are two types of subunits, α and β, that are arranged to form four seven-membered rings, with the two outer rings containing only α and the two inner rings only β subunits (Glickman and Ciechanover 2002; Groll et al. 2005). This structure is conserved in the archaeal and eubacterial proteasomes (reviewed by De Mot et al. 1999; Volker and Lupas 2002). The α and β subunits probably diverged before the archaea/eukaryote split but still conserve substantial primary sequence similarity and a characteristic protein fold, often called the proteasome fold (Hughes 1997; Bouzat et al. 2000; Volker and Lupas 2002; Groll et al. 2005). However, while in prokaryotes single α and β subunits are found, in eukaryotes these subunits are encoded by multiple genes, in such a way that each of the proteasome rings is formed by seven distinct subunits (e.g., Bouzat et al. 2000). The origin of the proteasome is still obscure.
Two different views are as follows: (1) proteasomes originated just before the archaea/eukaryotic split, being subsequently horizontally transferred to actinomycetes (e.g., Volker and Lupas 2002); and (2) proteasomes originated in eubacteria—archaea and eukaryotes derive from an actinomycete ancestor that already contained proteasomes (Cavalier-Smith 2002).
Many eubacteria do not contain proteasome-related genes, but they have other types of protease systems (reviewed by Groll et al. 2005). Most interestingly, a complex structurally reminiscent of the proteasome, called HslVU (or, sometimes, ClpQY), can be found in many eubacterial species (Gille et al. 2003). This complex contains two types of proteins. HslU (also known as ClpY) is an AAA+ ATPase, unrelated to any of the subunits of the 20S proteasome. HslV (ClpQ), however, has significant similarity to proteasomal β subunits (Chuang et al. 1993). The structure of the HslVU complex has been determined, also showing interesting similarities with proteasomes. This complex has four characteristic rings with six subunits each. The two outer rings are formed by six HslU subunits each, while the two inner rings contain six HslV subunits each (Rohrwild et al. 1997; see also Sousa et al. 2000; Wang et al. 2001). Determination of the three-dimensional structure of HslV proteins of multiple species has shown that they also possess the characteristic proteasome fold found in α and β proteasomal subunits (e.g., Bochtler et al. 1997). This has led several authors to postulate that HslVU complexes are the eubacterial counterparts of proteasomes or even some type of precursor complexes from which 20S proteasomes derived, passing from a state with two rings of proteases to the current structure with four protease rings (Volker and Lupas 2002). For some time, it was thought that HslVU and proteasomes were mutually incompatible, so species may have one or the other (or none of them, as in some eubacteria) but not both (De Mot et al. 1999). However, more recently it was found that protists such as the euglenozoans Leishmania and Trypanosoma or the apicomplexan Plasmodium contained both proteasome complexes and HslU and HslV genes (Couvreur et al. 2002; Gille et al. 2003).
It was then suggested that these eukaryotes may have acquired the genes by horizontal transfer from the endosymbiont that gave rise to mitochondria or by more recent gene transfers favored by the direct contact of these unicellular eukaryotes with bacteria in their insect vectors (Couvreur et al. 2002; Gille et al. 2003).
In this study, we describe the finding of HslU and HslV genes in many different eukaryotic lineages. We also show that eukaryotic HslU proteins are most similar to their proteobacterial counterparts, a result that is also found, albeit with low statistical support, for HslV proteins. Structural analyses strongly suggest that eukaryotic and prokaryotic HslU and HslV proteins may fold in very similar ways. These results suggest that active HslVU complexes may be present in many eukaryotic lineages and that eukaryotes most likely acquired these proteins from the endosymbiotic proteobacteria from which mitochondria derived. The implications of these results for our understanding of the evolutionary history of HslVU complexes and proteasomes are discussed.
Methods
Sequence Data Mining
All sequences analyzed in this study were obtained from the National Center for Biotechnology Information (NCBI) databases. To generate a representative database, several prokaryotic HslU and HslV sequences were used as queries to perform TBLASTN searches against the nonredundant, est, month, wgs, htgs, and gss NCBI databases. We used both the blastcl3 client and web searches at http://www.ncbi.nlm.nih.gov/BLAST/. Large sets of representative sequences were retrieved for prokaryotic species. Then specific searches were performed to characterize all available eukaryotic sequences. Because all HslU and HslV sequences are very similar, these searches quickly became saturated (i.e., no matter which query sequence was used, searches generally detected the same eukaryotic sequences). The analyses were finished in April 2006.
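The notion of search saturation described above can be illustrated with a minimal sketch. It assumes that the hits returned for each query have already been collected as lists of sequence identifiers; the function names, the two-query window, and the identifiers are hypothetical and are not part of the original analysis.

```python
def new_hits_per_query(hits_by_query):
    """For each query, in order, count hits not returned by any earlier query."""
    seen = set()
    new_counts = []
    for query, hits in hits_by_query:
        fresh = set(hits) - seen
        new_counts.append((query, len(fresh)))
        seen |= fresh
    return new_counts, seen


def is_saturated(new_counts, window=2):
    """True when the last `window` queries added no new sequences (assumed criterion)."""
    tail = new_counts[-window:]
    return len(tail) == window and all(count == 0 for _, count in tail)


# Made-up identifiers, for illustration only.
example = [
    ("query_A", ["euk_1", "euk_2", "euk_3"]),
    ("query_B", ["euk_2", "euk_3", "euk_4"]),
    ("query_C", ["euk_1", "euk_4"]),
    ("query_D", ["euk_2", "euk_3"]),
]
counts, all_hits = new_hits_per_query(example)
print(counts)
print(is_saturated(counts))
```

In this toy case the third and fourth queries recover only sequences that were already known, which is the behavior referred to as saturation.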
Multiple-Sequence Alignments and Phylogenetic Trees
We used CLUSTALX 1.83 (Thompson et al. 1997) to align the sequences obtained in our searches. Multiple alignments were then manually refined using GeneDoc 2.6 (Nicholas and Nicholas 1997). Phylogenetic trees were obtained by both the neighbor-joining and the maximum parsimony methods. Neighbor-joining analyses were performed using Mega 3.1 (Kumar et al. 2004) with the correction for multiple substitutions and the pairwise deletion option. Maximum parsimony analyses were also performed using Mega 3.1, with the following parameters: (1) all sites included; (2) initial trees obtained by random addition, with 10 replicates; and (3) close-neighbor interchange with search level 3. Interior branch tests and bootstrap tests (1000 replicates) were performed to establish the reliability of the topologies obtained for the neighbor-joining and maximum parsimony analyses. TreeView 1.6.6 (Page 1996) was used to explore the trees and generate the figures presented below. The results of phylogenetic analyses allowed us to detect false-positive eukaryotic sequences (i.e., sequences annotated as eukaryotic but actually of prokaryotic origin) as well as to identify sequences that were present two or more times. These false-positive or duplicate sequences were eliminated from our final trees.
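As an illustration of what pairwise deletion and a correction for multiple substitutions mean when distances are computed for neighbor-joining, the following sketch may be helpful. It is not the Mega 3.1 implementation. It assumes a Poisson correction, d = -ln(1 - p), which is a common choice for protein data but is not specified in the text, and it treats only "-" and "?" as missing data. The toy alignment is invented.

```python
import math

MISSING = {"-", "?"}  # assumed symbols for gaps and missing data


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping sites missing in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue  # pairwise deletion: drop the site for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance (assumed correction for multiple substitutions)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("sequences too divergent for the Poisson correction")
    return -math.log(1.0 - p)


def distance_matrix(alignment):
    """Distances for every unordered pair of sequences in a name -> sequence dict."""
    names = list(alignment)
    return {
        (x, y): poisson_distance(alignment[x], alignment[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Invented toy alignment, for illustration only.
toy = {"seq1": "MKLT-AVG", "seq2": "MKLTSAVG", "seq3": "MRLT-GVG"}
matrix = distance_matrix(toy)
print(matrix)
```

With pairwise deletion, a gapped column is excluded only from the comparisons that involve the gapped sequence, so short or partial sequences still contribute all of their informative sites to the other comparisons.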
Three-Dimensional Structure Modeling
Comparison of the primary sequences of eukaryotic HslU and HslV proteins with those of prokaryotes for which crystal structures are available and generation of the three-dimensional models were performed using Swiss-Model (http://www.swissmodel.expasy.org/) (Peitsch 1996). The models were then analyzed with Swiss-PdbViewer 3.7 (Guex and Peitsch 1997) (details at http://www.swissmodel.expasy.org/spdbv/). The same program was used to generate the figure with the three-dimensional structures shown below. The known structures of the HslU and HslV Escherichia coli proteins were used as templates. Protein Data Bank codes for the templates of HslV were 1ned (Bochtler et al. 1997), 1e94 (Song et al. 2000), 1g4b, 1ht1, and 1hqy (the latter three from Wang et al. 2001). For HslU, we used the template 1do0 (Bochtler et al. 2000).
Results
HslU Genes of Likely Mitochondrial Origin Are Present in Most Eukaryotic Lineages
Figure 1 shows a multiple-sequence analysis of all the HslU eukaryotic sequences detected compared with the sequence of the canonical Escherichia coli HslU gene. The complete sequences available include not only the ATP-binding site, located in the N-terminal (N) domain, but also the intermediate (I) and C-terminal (C) domains (see, e.g., Bochtler et al. [2000] and Fig. 1 legend for a precise definition of these domains). The I domain is the least conserved region of the HslU protein. We found HslU sequences in species from five of the six main eukaryotic groups (Simpson and Roger 2004), namely, amoebozoa (Dictyostelium discoideum), plantae (the vascular plant Oryza sativa and algae such as Ostreococcus tauri and Cyanidioschyzon merolae), chromoalveolata (alveolata such as species of the Plasmodium, Toxoplasma, and Tetrahymena genera; haptophytes such as Emiliania huxleyi; stramenopiles such as species of the Phytophthora and Thalassiosira genera; cryptophytes such as Guillardia theta), rhizaria (the cercozoan Bigelowiella natans), and excavata (the euglenozoans Leishmania infantum and the species of the Trypanosoma genus). It is unclear whether HslU is present in the sixth main eukaryotic group, the opisthokonta, which includes, among others, fungi and animals. In fact, there are many HslU sequences in the databases that have been annotated as belonging to animal species, including human. In those cases, their similarity to bacterial sequences allowed us to detect and eliminate them from our final analyses. However, a single cDNA sequence putatively derived from the fish Danio rerio was found that is quite different from all other sequences, either prokaryotic or eukaryotic. This sequence encodes just a small part of the HslU protein (see Fig. 1), precluding its precise classification. Thus, it is impossible at present to know whether this sequence may be of eukaryotic origin.
However, even if it is indeed eukaryotic, a contamination of Danio samples by DNA of other eukaryotic organisms cannot be ruled out. Considering that no other animals seem to have HslU genes, we think it is very unlikely that this sequence actually derives from a vertebrate.
Figure 2 shows a neighbor-joining phylogenetic tree for HslU sequences (101 prokaryotic plus the longest 16 eukaryotic sequences available) in which the maximum parsimony results have also been included. Both methods of phylogenetic reconstruction generate almost-identical trees, in which the eukaryotic sequences always appear as a monophyletic group, sister to α-proteobacterial HslU genes. The eukaryote/α-proteobacteria clade is strongly supported by the interior branch test. However, bootstrap support is quite low (see Fig. 2). In any case, the high similarity of eukaryotic and proteobacterial genes, already detected by other authors, albeit in much smaller analyses (Couvreur et al. 2002; Gille et al. 2003), is striking and suggests that the origin of eukaryotic HslU genes may be the endosymbiotic event that generated mitochondria, which derive from an α-proteobacterial ancestor.
HslV Genes Are Also Present in Multiple Lineages
We detected HslV genes in almost as many lineages as HslU genes. Figure 3 shows the multiple-sequence analysis comparing the 22 available eukaryotic sequences together with their E. coli homolog. HslV genes were found in amoebozoa (D. discoideum), plantae (both plants [Cycas and Physcomitrella] and algae [Ostreococcus, Cyanidioschyzon, Chlamydomonas]), chromoalveolata (the alveolata Plasmodium, Neospora, Toxoplasma, and Tetrahymena, the stramenopiles Phaeodactylum, Phytophthora, and Thalassiosira, and the haptophyte Prymnesium), and excavata (Leishmania, Trypanosoma). We did not find HslV sequences from Opisthokonta or Rhizaria. We generated phylogenetic trees based on HslV sequences, which included the 20 longest eukaryotic sequences (length > 100 amino acids) plus 140 prokaryotic HslV genes (Fig. 4). In both neighbor-joining and maximum parsimony trees, eukaryotic sequences again group together with α-proteobacterial sequences. However, HslV sequences are quite short and therefore the information that they contain is limited. Consequently, bootstrap support for the corresponding topologies is very low (see details in legend to Fig. 4).
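The length criterion mentioned above (only sequences longer than 100 amino acids entered the HslV trees) can be sketched as a simple filter. This is an illustrative reconstruction rather than the procedure actually used; the function names and the toy records are hypothetical, and gap characters ("-") and terminal stop symbols ("*") are assumed not to count as residues.

```python
def ungapped_length(sequence):
    """Number of residues, ignoring alignment gaps and stop symbols (assumed)."""
    return sum(1 for ch in sequence if ch not in "-*")


def select_long_sequences(records, min_length=100):
    """Keep sequences strictly longer than `min_length`, longest first."""
    measured = [(name, ungapped_length(seq)) for name, seq in records]
    kept = [(name, n) for name, n in measured if n > min_length]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Invented records, for illustration only.
records = [
    ("long_a", "A" * 150),
    ("exact_100", "M" * 100),
    ("short", "K" * 40),
    ("long_b", "K-" * 120),
]
selected = select_long_sequences(records)
print(selected)
```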
Eukaryotic HslU and HslV Sequences Can Be Modeled to Structures Compatible with Functional Activity
To determine whether eukaryotic sequences may encode products able to form functional HslVU complexes, we used the available crystal structures for HslU and HslV proteins to model the three-dimensional folding of deduced eukaryotic proteins. As an example, Fig. 5 shows the folding of the HslU (Fig. 5a) and HslV (Fig. 5b) proteins of the alga Cyanidioschyzon merolae (results for the proteins of other eukaryotic species are similar). The modeled structures are almost identical to those of the bacterial proteins. Thus, C. merolae HslU shows the three characteristic domains already described in prokaryotic proteins (C, I, and N domains; see Fig. 5a), while C. merolae HslV is modeled as having a typical proteasome fold. These results show that the eukaryotic-specific amino acid changes (e.g., all those accumulated in the I domain of HslU that we show in Fig. 1) do not significantly alter protein folding.
Discussion
Our results establish that HslU and HslV genes are present in many eukaryotic species, thus extending the previous findings of Couvreur et al. (2002) and Gille et al. (2003), who detected those genes in a few protozoans. Given the likelihood of contamination of eukaryotic samples (especially from partially sequenced genomes) by eubacterial DNA, it is important to stress the coherence of the results, which makes it very unlikely that they are an artifact. First, all eukaryotic genes appear as a monophyletic ensemble in the HslU phylogenetic trees, clearly distinct from the prokaryotic sequences. They also appear together in the HslV trees, although mixed with α-proteobacterial sequences due to the poor resolution provided by the short HslV sequences. Second, in many cases, both genes have been found together in the same species. This includes amoebozoa species (such as Dictyostelium discoideum), plantae species (Cyanidioschyzon merolae, Ostreococcus tauri), chromoalveolata (several Plasmodium species, Phytophthora infestans, Tetrahymena thermophila, Toxoplasma gondii, Thalassiosira pseudonana), and excavata (Leishmania infantum, two Trypanosoma species). The likelihood of all of them being false positives is negligible.
In summary, most of the main eukaryotic lineages have some species with HslU and HslV genes. Most significantly, both unikonts (such as D. discoideum) and bikont species (all the rest) have been found to have these genes. Because there is good evidence for the unikont/bikont split being the deepest dichotomy in eukaryotes (see reviews by Cavalier-Smith [2004] and Richards and Cavalier-Smith [2005]), we think that the last common eukaryotic ancestor contained both genes. This result, together with the phylogenetic analyses, which strongly hint that these genes have a proteobacterial origin, suggests that they were part of the set of genes transferred to eukaryotes in the endosymbiotic process involving the α-proteobacterium that gave rise to mitochondria. If this is indeed the correct evolutionary history, then we may deduce that HslU and HslV genes became dispensable and disappeared in many organisms independently. We have evidence suggesting that this process is still continuing today. As indicated above, most eukaryotic gene sequences are compatible with their encoding functional products. However, the sequences of both genes of Tetrahymena thermophila contain several stop codons, which suggests that they have become inactive quite recently.
The substantial sequence conservation, conserved three-dimensional structures, and appearance of both genes in many species suggest that functional HslVU complexes similar to those found in eubacteria may still be produced in many eukaryotes. We can speculate that they serve as a backup or complementary system for the proteasome, although other functions related to protein degradation are also possible. Our results also contribute to the understanding of the relationships between the proteasome and the HslVU complex. When thinking about the origin of those two protein complexes, we envisage two main options. First, proteasome and HslVU may be viewed as alternatives, in which eubacteria have HslVU, while archaea and eukaryotes have proteasomes. Under this hypothesis, both actinomycete proteasomes and eukaryotic HslVU complexes would have been horizontally transferred, in the first case from eukaryotes to actinomycetes and in the second case by endosymbiosis. This is an option already suggested by Volker and Lupas (2002) and Couvreur et al. (2002), among other authors. A second option is that the HslVU complex is much more ancient than the proteasome and that HslV is a likely precursor of proteasome subunit genes. Thus, proteasomes may have emerged in actinomycetes, perhaps as a substantial modification of HslVU complexes, and later archaea and eukaryotes received proteasomes as a consequence of their deriving from an actinomycete species (as suggested by Cavalier-Smith 2002). The finding of HslVU complexes in eukaryotes again would be explained by horizontal transmission by endosymbiosis. It is unclear how to test which one of these two hypotheses is correct. However, in our opinion any proof will depend more on the true relationships among the three domains (eubacteria, archaea, and eukaryotes) than on further extensions of the phylogenetic range in which HslVU or proteasomes are present.
References
Bochtler M, Ditzel L, Groll M, Huber R (1997) Crystal structure of heat shock locus V (HslV) from Escherichia coli. Proc Natl Acad Sci USA 94:6070–6074
Bochtler M, Hartmann C, Song HK, Bourenkov GP, Bartunik HD, Huber R (2000) The structures of HslU and the ATP-dependent protease HslU-HslV. Nature 403:800–805
Bouzat JL, McNeil LK, Robertson HM, Solter LF, Nixon JE, Beever JE, Gaskins HR, Olsen G, Subramaniam S, Sogin ML, Lewin HA (2000) Phylogenomic analysis of the alpha proteasome gene family from early-diverging eukaryotes. J Mol Evol 51:532–543
Cavalier-Smith T (2004) Only six kingdoms of life. Proc Biol Sci 271:1251–1262
Chuang SE, Burland V, Plunkett 3rd G, Daniels DL, Blattner FR (1993) Sequence analysis of four new heat-shock genes constituting the hslTS/ibpAB and hslVU operons in Escherichia coli. Gene 134:1–6
Couvreur B, Wattiez R, Bollen A, Falmagne P, Le Ray D, Dujardin JC (2002) Eubacterial HslV and HslU subunits homologs in primordial eukaryotes. Mol Biol Evol 19:2110–2117
De Mot R, Nagy I, Walz J, Baumeister W (1999) Proteasomes and other self-compartmentalizing proteases in prokaryotes. Trends Microbiol 7:88–92
Gille C, Goede A, Schloetelburg C, Preissner R, Kloetzel PM, Gobel UB, Frommel C (2003) A comprehensive view on proteasomal sequences: implications for the evolution of the proteasome. J Mol Biol 326:1437–1448
Groll M, Bochtler M, Brandstetter H, Clausen T, Huber R (2005) Molecular machines for protein degradation. ChemBioChem 6:222–256
Guex N, Peitsch MC (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18:2714–2723
Hughes AL (1997) Evolution of the proteasome components. Immunogenetics 46:82–92
Kumar S, Tamura K, Nei M (2004) MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform 5:150–163
Lupas A, Zuhl F, Tamura T, Wolf S, Nagy I, De Mot R, Baumeister W (1997) Eubacterial proteasomes. Mol Biol Rep 24:125–131
Nicholas KB, Nicholas HB Jr (1997) GeneDoc: Analysis and visualization of genetic variation. Distributed by the authors; http://www.psc.edu/biomed/genedoc/
Page RD (1996) TREEVIEW: an application to display phylogenetic trees on personal computers. Comput Appl Biosci 12:357–358
Peitsch MC (1996) ProMod and Swiss-Model: Internet-based tools for automated comparative protein modelling. Biochem Soc Trans 24:274–279
Richards TA, Cavalier-Smith T (2005) Myosin domain evolution and the primary divergence of eukaryotes. Nature 436:1113–1118
Rohrwild M, Pfeifer G, Santarius U, Muller SA, Huang HC, Engel A, Baumeister W, Goldberg AL (1997) The ATP-dependent HslVU protease from Escherichia coli is a four-ring structure resembling the proteasome. Nat Struct Biol 4:133–139
Simpson AG, Roger AJ (2004) The real ‘kingdoms’ of eukaryotes. Curr Biol 14:R693–R696
Sitnikova T, Rzhetsky A, Nei M (1995) Interior-branch and bootstrap tests of phylogenetic trees. Mol Biol Evol 12:319–333
Song HK, Hartmann C, Ramachandran R, Bochtler M, Behrendt R, Moroder L, Huber R (2000) Mutational studies on HslU and its docking mode with HslV. Proc Natl Acad Sci USA 97:14103–14108
Sousa MC, Trame CB, Tsuruta H, Wilbanks SM, Reddy VS, McKay DB (2000) Crystal and solution structures of an HslUV protease-chaperone complex. Cell 103:633–643
Stechmann A, Cavalier-Smith T (2003) The root of the eukaryote tree pinpointed. Curr Biol 13:R665–R666
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882
Volker C, Lupas AN (2002) Molecular evolution of proteasomes. Curr Top Microbiol Immunol 268:1–22
Wang J, Song JJ, Franklin MC, Kamtekar S, Im YJ, Rho SH, Seong IS, Lee CS, Chung CH, Eom SH (2001) Crystal structures of the HslVU peptidase-ATPase complex reveal an ATP-dependent proteolysis mechanism. Structure 9:177–184
Acknowledgments
Our group is supported by Grants GEN2001-4851-C06-02 and SAF2003-09506 (Ministerio de Educación y Ciencia, Spain) and Grant GV04B-141 (Generalitat Valenciana, Spain).
Additional information
[Reviewing Editor: Dr. Yves Van de Peer]
Cite this article
Ruiz-González, M.X., Marín, I. Proteasome-Related HslU and HslV Genes Typical of Eubacteria Are Widespread in Eukaryotes. J Mol Evol 63, 504–512 (2006). https://doi.org/10.1007/s00239-005-0282-1