Introduction

Heat shock proteins (HSPs) are expressed constitutively and/or in response to stress conditions in living organisms, playing important or even essential physiological roles (Lindquist and Craig 1988). The HSPs are commonly categorized into five families: Hsp100, Hsp90, Hsp70, chaperonin (e.g., Hsp60), and small heat shock proteins (sHSPs) (Narberhaus 2002). The structures and biological functions of large HSPs like Hsp90, Hsp70, and Hsp60 are usually highly conserved from prokaryotes to eukaryotes (Bukau and Horwich 1998; Hartl and Hayer-Hartl 2002), and the evolutionary relationships of members in each family have been well elucidated via phylogenetic analyses (Gupta et al. 1993; Gupta 1995; Brocchieri and Karlin 2000).

Unlike the large HSPs, the sHSPs are highly divergent in both primary sequences and oligomeric status (Lindquist and Craig 1988; Vierling 1991; Plesofsky-Vig et al. 1992; Kappe et al. 2000). Although sHSPs are characterized by a relatively conserved α-crystallin domain spanning about 100 residues, their sequence similarity is far lower than that of the large HSPs (Vierling 1991; Plesofsky-Vig et al. 1992; de Jong et al. 1993). Because of this low sequence similarity, tracing the evolutionary relationships of members of the sHSP family is difficult (Vierling 1991; Plesofsky-Vig et al. 1992; Kappe et al. 2000; de Jong et al. 1993, 1998; Waters 1995; Waters et al. 1996). Furthermore, unlike the large HSPs, which usually exist as well-defined oligomers, the sHSPs from different organisms exist as oligomers that vary in size, shape, and subunit interaction dynamics, with some of them even exhibiting polydispersity (meaning that the protein exists simultaneously as oligomers of various sizes; reviewed by Narberhaus [2002], Hartl and Hayer-Hartl [2002], Vierling [1991], and Waters [1995]).

This study was carried out in an attempt to understand the evolutionary relationships of sHSPs. Our phylogenetic analysis revealed that certain bacterial sHSPs, such as E. coli IbpA/IbpB, and the animal sHSPs each form an independent monophyletic group, and that the two groups cluster together to form an outgroup that is highly distinct from the sHSPs of plants, fungi, archaea, and other bacteria. Further analyses demonstrate that these bacterial sHSPs are members of the bacterial class A, according to the reports of Narberhaus and co-workers (Studer et al. 2000; Munchbach et al. 1999). Consistent with such an evolutionary assignment are the accumulating observations that the bacterial class A and animal sHSPs are often found to exist as polydisperse oligomers, while the other sHSPs exist as relatively well-defined monodisperse oligomers. In view of these findings, it is hypothesized that the animal sHSPs originated evolutionarily from the sHSPs of bacterial class A, extending the previous observations that animal sHSPs are linked with some prokaryote sHSPs (de Jong et al. 1998).

Materials and Methods

Sequence Alignments

All the available amino acid sequences of sHSPs were downloaded from the mirror website of SWISSPROT and TREMBL databases in China (available at http://cn.expasy.org). Partial and duplicated sequences were discarded. The remaining 344 unique and complete sequences, each indicated by the accession number of either SWISSPROT or TREMBL (see Supplementary Table S1), were grouped as bacterial class A, bacterial class B (grouped according to Munchbach et al. 1999), archaea, fungi, plant, animal, and other eukaryotes.

To perform sequence alignment and phylogenetic analysis, a representative set of 51 sequences (as shown in Fig. 1) was chosen from the total of 344 sequences. The number of sequences selected from each subfamily for the alignment was roughly proportional to the total number of sequences accumulated so far for that subfamily (for instance, 13 of the 115 plant sequences and 11 of the 79 animal sequences were chosen). Particular sequences were chosen from each subfamily so as to represent as many species and classes as possible. For instance, the higher plants were represented by three, two, two, two, and two sequences from classes I, II, and IV, chloroplast, and mitochondria, respectively, and the lower plants (moss) by one sequence each from classes I and II. A partial list of species selected to represent each subfamily is as follows: bacteria: Mycobacterium tuberculosis, Streptococcus thermophilus, Bradyrhizobium japonicum, and Escherichia coli; archaea: Methanococcus jannaschii, Halobacterium sp., and Sulfolobus solfataricus; fungi: Schizosaccharomyces pombe, Saccharomyces cerevisiae, and Laccaria bicolor; plants: Arabidopsis thaliana, Lycopersicon esculentum, Pisum sativum, and Funaria hygrometrica; and animals: human, frog, fish, fruit fly, and nematode.
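As a quick check on the "roughly proportional" selection, the sampling fractions implied by the counts quoted above can be computed directly. The sketch below uses only those three pairs of counts; the function and variable names are illustrative and not part of the original analysis.

```python
# (selected, total) counts quoted in the text.
counts = {
    "plant": (13, 115),
    "animal": (11, 79),
    "all sHSPs": (51, 344),
}

def sampling_fraction(selected, total):
    """Fraction of the available sequences that was chosen as representatives."""
    return selected / total

for group, (selected, total) in counts.items():
    fraction = sampling_fraction(selected, total)
    print(f"{group}: {selected}/{total} = {fraction:.1%}")
```

The plant and animal fractions come out somewhat below the overall fraction of about 15%, which fits the description of the selection as only roughly proportional.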

Figure 1

Sequence alignments of the α-crystallin domains of 51 representative sHSPs. These 51 proteins cover the six subfamilies of sHSPs: bacterial class A, bacterial class B, fungi, archaea, plant, and animal. The number of proteins selected from each subfamily is indicated in brackets. The highly conserved residues are indicated by black backgrounds. The types of secondary structures indicated at the bottom are assigned according to the determined crystal structure of M. jannaschii Hsp16.5 (Kim et al. 1998).

These 51 proteins were initially aligned using the ClustalW multiple alignment program (Thompson et al. 1994) in BioEdit (version 5.06; Department of Microbiology, North Carolina State University). The α-crystallin domain sequences were first identified from these alignments by referring to the alignments presented by Saitou and Nei (1987) and then, after all gaps were removed, realigned using the ClustalX software (version 1.7 [Thompson et al. 1994]) (as shown in Fig. 1).
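The trimming and gap-removal step that precedes realignment can be sketched as follows. This is a minimal illustration rather than the BioEdit/ClustalX procedure itself: the function names, the toy sequences, and the column boundaries are all hypothetical.

```python
def extract_domain(aligned, start, end):
    """Slice alignment columns [start, end) and strip gap characters ('-')."""
    return {name: seq[start:end].replace("-", "") for name, seq in aligned.items()}

def to_fasta(seqs):
    """Format ungapped sequences as FASTA text, ready for a realignment program."""
    return "".join(f">{name}\n{seq}\n" for name, seq in seqs.items())

# Toy gapped alignment (hypothetical sequences, not taken from the study).
aligned = {
    "seqA": "MK--AVLGTR",
    "seqB": "M-STAV-GTK",
}

# Hypothetical domain boundaries, given as 0-based alignment columns.
domains = extract_domain(aligned, 2, 8)
fasta_text = to_fasta(domains)
print(fasta_text)
```

The resulting FASTA text would then be passed to the alignment program, so that the domain sequences are realigned without being constrained by gaps introduced in the full-length alignment.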

Phylogenetic Analysis

The neighbor-joining (NJ) (Saitou and Nei 1987) distance tree (Fig. 2), constructed from the realignment of the α-crystallin domain sequences of the 51 sHSPs, was evaluated by bootstrapping (Felsenstein 1985) with 1000 trials. Phylogenetic trees based on either the full sequences or the N-terminal region sequences were also constructed using the NJ method (Supplementary Figs. S1A and S1B). Maximum likelihood (Felsenstein and Churchill 1996) and maximum parsimony phylogenetic trees were also constructed from the realigned α-crystallin domains using the PROTML and PROTPARS programs, respectively, in the PHYLIP package 3.6α (Figs. 3A and B).
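For readers unfamiliar with the NJ method, the core agglomeration loop can be sketched in a few lines. This is a from-scratch illustration of the general algorithm, not the ClustalX implementation used in this study; the function name, the internal node labels ("node0", "node1", ...), and the five-taxon toy distance matrix are assumptions made for the example.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Build an unrooted NJ tree.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns (joins, last), where each join is
    (child_a, child_b, branch_a, branch_b, parent) and last is the final
    branch (node_a, node_b, length) connecting the two remaining nodes.
    """
    d = dict(dist)
    active = list(names)
    joins = []
    next_id = 0
    while len(active) > 2:
        n = len(active)
        # Net divergence of each active node from all the others.
        r = {a: sum(d[frozenset((a, b))] for b in active if b != a) for a in active}
        # Select the pair that minimises the NJ criterion Q(i, j).
        i, j = min(
            combinations(active, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        len_i = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        len_j = dij - len_i
        parent = f"node{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        # Distances from the new internal node to every remaining node.
        for k in active:
            if k not in (i, j):
                d[frozenset((parent, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                ) / 2
        active = [k for k in active if k not in (i, j)] + [parent]
        joins.append((i, j, len_i, len_j, parent))
    a, b = active
    last = (a, b, d[frozenset((a, b))])
    return joins, last

# Toy distance matrix for five taxa (illustrative values, not from the study).
raw = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
dist = {frozenset(pair): value for pair, value in raw.items()}
joins, last = neighbor_joining(["a", "b", "c", "d", "e"], dist)
for join in joins:
    print(join)
print(last)
```

In an actual analysis the distance matrix would be derived from the aligned α-crystallin domains, and the whole procedure would be repeated on each bootstrap replicate of the alignment.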

Figure 2

The neighbor-joining distance tree constructed from the α-crystallin domain. The NJ distance tree was first constructed from the realigned α-crystallin domains of the 51 sHSPs (as shown in Fig. 1) using the program ClustalX and then evaluated by bootstrapping (Felsenstein 1985) with 1000 trials. The number of bootstrap replicates (out of a total of 1000) in which a branch appears is indicated above the branch, with values below 500 not indicated. The scale bar indicates the number of substitutions per site for a unit branch length. Indicated on the right are the sHSP subfamilies that appear as monophyletic groups. To show the subfamily of each sequence, the sequence name is followed by the abbreviation of the subfamily (BA, bacterial class A; BB, bacterial class B; AR, archaea; F, fungi; P, plant; AN, animal); the same notation is used in the other phylogenetic trees.

Figure 3

Phylogenetic trees constructed on the basis of α-crystallin domains by maximum likelihood (A) and maximum parsimony (B). The trees were constructed on the basis of the alignment of the α-crystallin domains of 51 sHSPs (Fig. 1). (A) The scale bar indicates the number of substitutions per site for a unit branch length. The animal, bacterial class A, archaeal, and plant sHSPs, each of which forms a monophyletic group, are labeled. (B) The animal and bacterial class A sHSPs, each of which forms a monophyletic group, are labeled.

Plasmid Construction and Protein Purification

Rat lens αA-crystallin was expressed in the E. coli strain BL21(DE3) harboring a pET20b+ expression vector that carries the encoding cDNA and was purified according to the method described before (Bova et al. 1997). Recombinant M. jannaschii Hsp16.5 protein was expressed in BL21(DE3) cells harboring the expression vector pET21a carrying the encoding gene and purified according to the method described previously (Kim et al. 1998). Wheat Hsp16.9 was expressed in BL21(DE3) cells harboring the expression vector PJC20 carrying the gene and purified according to methods described before (Lee et al. 1995). The gene for M. tuberculosis Hsp16.3 was subcloned into the pET-9d expression vector as previously described (Chang et al. 1996). Recombinant Hsp16.3 protein was overexpressed in E. coli BL21(DE3) host cells and purified as previously described (Fu et al. 2003; Gu et al. 2002).

The gene encoding E. coli IbpB was inserted into pALTER-Ex1 and expressed in JM109(DE3) cells, and the protein was purified via sequential chromatography steps: first on two ion-exchange columns (resins, DEAE Sepharose FastFlow and subsequently Source 15Q; buffer, 50 mM Tris–HCl, pH 8.5; elution, 0–0.5 M NaCl), then on a hydrophobic interaction column (resin, phenyl-Sepharose HP; buffer, 50 mM Tris–HCl, pH 8.5, 0.8 M [NH4]2SO4; elution, 0.8–0 M [NH4]2SO4), and finally on a prepacked size-exclusion column, Superose 6 HR 10/30 (buffer, 50 mM Na3PO4, pH 7.3, 50 mM NaCl).

The purified proteins were dialyzed in deionized water, lyophilized, and stored at −20°C before further analysis. Protein concentrations were determined using the Bio-Rad Protein Assay.

Blue Native Polyacrylamide Gel Electrophoresis (PAGE)

The blue native PAGE gel, with a 4–25% linear acrylamide gradient, was prepared as described before (Schagger and von Jagow 1991). Electrophoresis was performed at 4°C at constant voltage: 100 V until the samples had entered the separating gel, then 150 V for another 2 h. Protein bands were visualized by staining with Coomassie brilliant blue G-250.

Size-Exclusion Chromatography

Size-exclusion chromatography (SEC) was performed on an ÄKTA FPLC system connected to a water bath, using an XK 16/70 column that was self-packed with Superdex 200 prep grade medium (all from Amersham Pharmacia Biotech). For each analysis, a 500-μl protein sample, clarified by centrifugation (10,000g, 10 min), was loaded onto the column. Protein samples were then eluted with 50 mM sodium phosphate buffer (containing 0.15 M NaCl, pH 7.0) at a flow rate of 0.8 ml/min.

Results

Bacterial Class A and Animal sHSPs Are Closely Related as an Evolutionary Outgroup

Optimal alignments for the 51 sequences, which are essential for reliable phylogeny reconstruction, were obtained by discarding the highly variable N-terminal regions and C-terminal extensions, with reference to previous alignments of sHSPs (Kappe et al. 2000; Kim et al. 1998), after a first round of alignment based on the full sequences. The realigned sequences of the α-crystallin domains are presented in Fig. 1.

The phylogenetic tree presented in Fig. 2 was constructed by applying the NJ method (Saitou and Nei 1987) to the aligned sequences of the α-crystallin domains (Fig. 1) and evaluated by bootstrapping (Felsenstein 1985). The data displayed in Fig. 2 show that the bacterial class A and animal sHSPs cluster together to form an outgroup that is highly distinct from the other sHSPs (bootstrap score, 900/1000). This close relationship between the bacterial class A and animal sHSPs was also revealed when the complete sequences of the 51 proteins were phylogenetically analyzed (see Supplementary Fig. S1A) and when the phylogenetic tree was constructed by the maximum likelihood method (Fig. 3A). The close relationship between the two subfamilies of sHSPs was, however, not revealed in the phylogenetic tree constructed by applying maximum parsimony (Fig. 3B). Similar inconsistency among the three methods of phylogenetic analysis has been reported previously (Gupta 1995; Nikoh et al. 1994), and it might be attributed to the fact that small errors in distance may disrupt the phylogenetic tree for the maximum parsimony method (Bruno et al. 2000). As a result, when this method was applied to the sequences of sHSPs, which are highly divergent from each other, the close evolutionary relationship between the animal and bacterial class A sHSPs failed to be revealed.
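A bootstrap score such as 900/1000 is the number of replicate trees, each built from a column-resampled alignment, in which a given grouping appears. The sketch below illustrates the two bookkeeping steps under simplifying assumptions: the tree-building step between them is omitted, each replicate tree is represented simply as a set of clades, and the toy alignment, taxon names, and replicate trees are hypothetical.

```python
import random

def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}

def clade_support(clade, replicate_trees):
    """Count the replicate trees (each a set of frozenset clades) containing `clade`."""
    return sum(1 for tree in replicate_trees if clade in tree)

# Toy alignment (hypothetical sequences of equal length).
alignment = {"s1": "ACDE", "s2": "AGDE", "s3": "TGDE"}
replicate = bootstrap_columns(alignment, random.Random(0))

# Toy replicate trees: 9 of 10 contain the clade {s1, s2}.
target = frozenset({"s1", "s2"})
trees = [{target}] * 9 + [{frozenset({"s2", "s3"})}]
support = clade_support(target, trees)
print(f"{support}/{len(trees)} replicates support the clade")
```

With 1000 replicates, as used here, a grouping recovered in 900 of them would be reported as 900/1000, well above the threshold of 500 below which values are not shown in Fig. 2.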

Consistent with previous reports (Plesofsky-Vig et al. 1992; de Jong et al. 1998; Munchbach et al. 1999; Waters and Vierling 1999), animal sHSPs (Fig. 2 and Supplementary Figs. S1A and S2) as well as plant sHSPs (Figs. 2 and 3A) were found to exhibit monophyletic grouping. The bacterial class A sHSPs were also found to exhibit monophyletic grouping (Figs. 2 and 3A and Supplementary Fig. S2). Such monophyletic properties for the animal, bacterial class A, and plant sHSPs, however, were not detected in the NJ tree based on the N-terminal regions alone (Supplementary Fig. S1B), reflecting the extreme variation in the amino acid sequences of this region among sHSPs (de Jong et al. 1993). Interestingly, the archaeal sHSPs showed only a marginal tendency toward monophyletic grouping in our analysis (Figs. 2 and 3A), suggesting that this subfamily may either have diverged after evolving from one ancestral gene or have formed by convergent evolution from more than one ancestral gene.

Bacterial Class A and Animal sHSPs Exist as Highly Polydisperse Oligomers, and the Others as Relatively Monodisperse Oligomers

The information for the native folding of a polypeptide chain is in principle encoded in its amino acid sequence (Anfinsen 1973; Dobson and Karplus 1999). The fact that sHSPs commonly exist as oligomers, together with the close evolutionary relationship between bacterial class A and animal sHSPs deduced from a comparison of their amino acid sequences (Fig. 2), prompted us to ask whether they also share certain features in their patterns of oligomerization. To our surprise, the bacterial class A and animal sHSPs were all found to exist exclusively as polydisperse oligomers (i.e., the protein exists simultaneously as oligomers of various sizes), as revealed by previously published data that are summarized in Table 1 (Kim et al. 1998; Lee et al. 1995; Haley et al. 1998, 2000; Vanhoudt et al. 2000; Ehrnsperger et al. 1997, 1999; van de Klundert et al. 1998; Leroux et al. 1997; Sobott et al. 2002; Lee et al. 1997; Suzuki et al. 1998; Haslbeck et al. 1999). In contrast, all the other sHSPs, including those from plants, fungi, archaea, and bacterial class B, exist exclusively as monodisperse oligomers (Table 1).

Table 1 Polydispersity vs monodispersity for sHSP oligomers

Partial confirmation of this observation comes from the data presented in Fig. 4A, which demonstrate that bovine α-crystallin, rat αA-crystallin, and E. coli IbpB (a member of bacterial class A) all exist as polydisperse oligomers, while M. tuberculosis Hsp16.3 (a member of bacterial class B), M. jannaschii Hsp16.5 (archaea), and wheat Hsp16.9 exist as monodisperse oligomers. The results presented here are highly consistent with previous individual analyses of each of these six proteins using various other methods (Kim et al. 1998; Chang et al. 1996; Haley et al. 1998, 2000; van Montfort et al. 2001; Shearstone and Baneyx 1999).

Figure 4

Characterization of the oligomeric status of representative sHSPs. (A) Blue native pore-gradient (4–10%) PAGE analysis of the following six proteins (20 μg of protein loaded for each): bovine α-crystallin (lane 2), rat αA-crystallin (lane 3), IbpB (lane 4), M. tuberculosis Hsp16.3 (lane 5), M. jannaschii Hsp16.5 (lane 6), and wheat Hsp16.9 (lane 7). BSA was used as a marker (lane 1). Electrophoresis was first performed at 100 V and 4°C, then continued at 150 V for another 2 h after the samples entered the separating gel. (B) The subunit exchange test between α-crystallin and IbpB, examined by temperature-controlled SEC analysis performed at 45°C using an XK 16/70 column self-packed with Superdex 200 prep grade medium. Bovine α-crystallin (2 mg/ml), E. coli IbpB (0.5 mg/ml), or their mixture was preheated at 45°C for 1 h before being loaded for chromatography analysis. Inset: SDS-PAGE analysis of fractions (corresponding to peaks a and b) collected from chromatography (concentrated by trichloroacetic acid precipitation before electrophoresis).

Subunit Exchange Does Not Occur Between E. coli IbpB (a Bacterial Class A sHSP) and Bovine α-Crystallin (an Animal sHSP)

The homo-oligomeric sHSPs have been found to undergo dynamic dissociation and reassociation (Gu et al. 2002), thus allowing subunit exchange to occur between the oligomers. Similar subunit exchange has also been reported to occur between the sHSPs of different species (Studer and Narberhaus 2000; Bova et al. 1997, 2000, 2002; Sobott et al. 2002; Datta and Rao 2000), indicating that their interfaces for oligomeric assembly are compatible. In view of these observations and the close phylogenetic relationship between bacterial class A and animal sHSPs (Fig. 2), we asked whether sHSPs from the two groups are able to undergo such subunit exchange. For this purpose, bovine α-crystallin and E. coli IbpB were selected as representatives of the two groups. Temperature-controlled size-exclusion chromatography (SEC) in combination with SDS-PAGE was applied to detect whether hetero-oligomers could be formed between the two sHSPs via subunit exchange. Such hetero-oligomers were not detected after the two proteins were coincubated at 25°C for 3 h (data not shown). To exclude the possibility that subunit exchange might be inefficient at such a low temperature (25°C) (Bova et al. 1997), the two proteins were coincubated at 45°C for 1 h before being analyzed for hetero-oligomers; again, none were detected (Fig. 4B). Although our data indicate that the subunit interfaces of bovine α-crystallin and E. coli IbpB are essentially incompatible, the nonexhaustive nature of the subunit exchange results presented here does not allow us to conclude that subunit exchange does not occur between any members of the bacterial class A and animal sHSPs. Our observations are at least consistent with certain previously reported results, in which subunit exchange was not detected between rat α-crystallin and M. jannaschii Hsp16.5 (an archaeal sHSP) (Bova et al. 2000) or even between bacterial class A and class B sHSPs (Studer et al. 2000).

Discussion

Although a phylogenetic linkage between animal sHSPs and certain prokaryote sHSPs has been reported (de Jong et al. 1998), the evolutionary relationship between the animal sHSPs and the two classes of bacterial sHSPs is still elusive. The present study, based on sequence comparison of the most conserved α-crystallin domain, suggests that the animal sHSPs might have their evolutionary origin in the bacterial class A sHSPs (see Figs. 2 and 3A and Supplementary Fig. S1A). Such a claim is further strengthened by the fact that members of these two subfamilies of sHSPs exhibit polydispersity in their oligomeric structures, while sHSPs from the other subfamilies exhibit monodispersity (see Table 1 and Fig. 4A). To our knowledge, this is the first report in which a close phylogenetic relationship revealed by sequence comparison is also exhibited at the level of quaternary structure. Given that the information for the native tertiary structure of a protein is encoded in its amino acid sequence, according to the Anfinsen principle (Anfinsen 1973), a reasonable hypothesis is that the information for the oligomeric assembly of a protein (i.e., for the quaternary structure) is also encoded in its amino acid sequence. This hypothesis is apparently supported by our data: the close evolutionary relationship between bacterial class A and animal sHSPs revealed by analysis of their amino acid sequences is also reflected at the quaternary structure level, in the polydispersity of their oligomers, which distinguishes them from the other sHSP subfamilies, all of which exhibit monodispersity.

However, it is very difficult to determine experimentally which amino acids are involved in the oligomeric polydispersity. We thus explored an alternative line of reasoning, as follows: if the similarity in oligomeric polydispersity between bacterial class A and animal sHSPs is indeed correlated with their close relation in primary sequence, then sequence regions that show the same close relation as the whole α-crystallin domain can reasonably be considered to contribute to the oligomeric polydispersity. Previous studies (Kappe et al. 2000; Waters et al. 1996) reported that region I and region II (as indicated at the top of Fig. 1) are conserved in all animal and in most plant sHSPs, respectively. Our phylogenetic analysis based on each region (I or II), as shown in Supplementary Fig. S2, indicates that region II seems to support the close relation between animal and bacterial class A sHSPs better than region I does (compare the tree in Fig. 2 with those in Supplementary Figs. S2A and S2B). This observation suggests that region II may play a more important role than region I in allowing these two subfamilies of sHSPs to exhibit oligomeric polydispersity. Structure determination studies of M. jannaschii Hsp16.5, on the other hand, indicate that region II is primarily involved in dimerization, and region I in higher oligomerization (Kim et al. 1998). These observations seem to imply that the dimerization interfaces of such sHSPs also play a role in their existence as polydisperse oligomers. Interestingly, the monophyletic grouping of plant sHSPs is also observed in the evolutionary tree constructed from region II (Supplementary Fig. S2B) but not in that from region I (Supplementary Fig. S2A). This might result from the higher diversity of region II compared to region I, allowing a better distinction of subfamilies when sequences of region II are used for such evolutionary analysis.

If animal sHSPs evolved from a bacterial class A ancestor, it might be assumed that this evolutionary event occurred via lateral gene transfer. The ancestral sHSP gene might have been transferred to the animal nuclear genome from the genome of either an endosymbiotic mitochondrion or a bacterial pathogen. The subsequent rapid duplication and divergence of the transferred gene may have allowed animal sHSPs to become highly divergent from the ancestral gene, as well as from each other (Plesofsky-Vig et al. 1992; de Jong et al. 1993, 1998). Given that there is a second subfamily of bacterial sHSPs (class B), and in view of the close ecological relationships between plants and Rhizobia, a similar logic would suggest that the plant sHSPs might have evolved from those present in certain bacteria like Rhizobia. The sequences of sHSPs presently available for Rhizobia, however, do not allow us to test such a hypothesis.

Given that functionally important structural features of proteins are often highly conserved during evolution, our data strongly imply that the polydispersity of the bacterial class A and animal sHSPs may somehow allow them to act differently from the sHSPs that exhibit monodispersity, for example, in responding to stressful conditions and in forming complexes with denatured substrate proteins. The details of this aspect merit further exploration. In addition, does the conclusion that oligomers of bacterial class A and animal sHSPs exhibit polydispersity, while those of other subfamilies exhibit monodispersity, hold true in general? Answering this question awaits examination of the oligomeric structures of more sHSPs.