Introduction

The genetic basis of the polyglutamine (polyQ) repeat disease family is a CAG trinucleotide repeat expansion in the protein-coding region that results in expression of an expanded polyQ domain. Diseases within this family include Huntington’s disease (HD), spinal and bulbar muscular atrophy, and the spinocerebellar ataxias (SCAs). In these diseases, polyQ expansion leads to the formation of fibrillar protein aggregates and, ultimately, neuronal cell death. In the case of ataxin-3, the causative agent of SCA3, exceeding a threshold of 52 glutamines is known to trigger formation of intranuclear aggregates, with consequent cell death [1]. There is nonetheless experimental evidence that ataxin-3, as well as the HD agent huntingtin, undergoes fibrillar aggregation by a multidomain misfolding mechanism in which non-polyQ regions self-associate before the polyQ tract [2–4].

Small heat shock proteins (sHsps) are widely distributed molecular chaperones that bind to misfolded proteins to prevent irreversible aggregation and aid in refolding to a competent state. The sHsps characterized thus far all contain a conserved α-crystallin domain and variable N- and C-termini that are critical for chaperone activity and oligomerization. The sHsp isoforms αA-crystallin and αB-crystallin are a large component of the human eye lens [5], and αB-crystallin is also expressed in many other cell types, including neurons [6]. The chaperone activity of the sHsp dimer of various α-crystallins has been postulated to coincide with the exposure of hydrophobic interface sites after a temperature-regulated subunit exchange or dissociation of the oligomer [7, 8]. This exposes substrate binding sites, which in turn results in the sequestration of the target protein in a high-mass complex, thereby preventing formation of an amorphous protein aggregate. In addition to maintaining homeostasis by protecting against protein misfolding, some sHsps, most notably αB-crystallin, have also been shown to protect against amyloid fibril formation [5]. It has been shown that peptide sequences from specific domains of both αA-crystallin and αB-crystallin bind independently to target proteins and protect against the unfolding and aggregation of amyloidogenic proteins [9]. Finally, αB-crystallin has been identified as a suppressor of SCA3 toxicity [10], most likely through the formation of a transient αB-crystallin/ataxin-3 complex [11].

Solvent-exposed intramolecular backbone hydrogen bonds, or dehydrons, have been previously identified as vulnerabilities or structural defects in the packing of a wide array of proteins [12, 13]. Exposure of such dehydrons to an aqueous environment has been shown to weaken protein secondary structures [14, 15]. In turn, excluding solvent from protein regions containing exposed hydrogen bonds has been implicated as a determinant factor in ligand–protein [16] and protein–protein [17] interactions as well as protein subunit assembly [18]. A mechanism of action for sHsps based on the protection of solvent-exposed backbone hydrogen bonds within the α-crystallin domain has also been developed [19, 20]. This paper seeks to explore whether the dehydron hypothesis might also provide a structural basis for the modulation of SCA3 toxicity by αΒ-crystallin, and whether the development of peptide mimics as therapeutic targets against polyQ diseases might also benefit from this analysis.

Materials and Methods

Proteins

The dodecameric structure resulting from the docking of the α-crystallin domain from T. aestivum (wheat) into the density map of M. tuberculosis α-crystallin is available from the RCSB (www.rcsb.org) as Protein Data Bank (PDB) entry 2BYU. A homology model for the α-crystallin domain of human αB-crystallin was generated from a sequence alignment (UniProt P02511) with the 2BYU homolog, using the MODELER protocol as implemented in the Discovery Studio program suite from Accelrys Inc. The fitted structure 2BYU includes the conserved IXI motif of the C-terminal extension, but lacks the remainder of the C-terminal tail. The missing residues at positions 138–146 were added from the wheat structure (available as PDB entry 1GME) after superposition of the backbones of residues 133–138 and 147–151. After adding hydrogens, the inserted loop residues and the hydrogen positions were subjected to a short energy minimization using the CHARMm force field [21]. A homology model of the αB-crystallin dimer was generated by fitting the subunits onto the augmented oligomeric structure of 2BYU. These αB-crystallin models were then subjected to a short energy minimization, followed by successive steepest descent and conjugate gradient minimizations. A crystal structure for the josephin domain of ataxin-3 is available from the RCSB as PDB entry 1YZB [22]. After adding hydrogens, this protein was also subjected to a short energy minimization, followed by successive steepest descent and conjugate gradient minimizations.

The extent of hydrogen bond desolvation is quantified as the number of nonbonded carbonaceous groups, ρ, contained within a domain centered on the residues linked by the interaction [17]. This desolvation domain is defined as two intersecting spheres of fixed radius centered on the Cα atoms of the linked residues. Dehydrons are then identified as those backbone hydrogen bonds that are underwrapped by nonpolar carbonaceous groups, defined as those interactions with ρ values at or below the average minus one root mean squared deviation. In this work, the default values for domain radius, 6.2 Å, and dehydron cutoff, ρ ≤ 19, were used as per reference 16. The dehydrons for the α-crystallin domain of αB-crystallin and the josephin domain of ataxin-3 are shown as green connectors in Fig. 1.
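The wrapping count described above can be illustrated with a minimal sketch. It assumes that the desolvation domain is the union of the two spheres and that the caller has already selected the nonpolar carbonaceous groups; the function names and the toy coordinates are hypothetical and are not taken from any structure or published code.

```python
import math
import statistics

RADIUS = 6.2      # sphere radius in Å, the default quoted in the text
RHO_CUTOFF = 19   # dehydron cutoff quoted in the text


def wrapping_count(ca_i, ca_j, carbon_groups, radius=RADIUS):
    """Return rho: the number of nonpolar carbonaceous groups lying in
    either sphere centred on the two C-alpha atoms (union assumed)."""
    return sum(
        1
        for group in carbon_groups
        if math.dist(group, ca_i) <= radius or math.dist(group, ca_j) <= radius
    )


def statistical_cutoff(rho_values):
    """Mean minus one root-mean-square deviation of a set of rho values."""
    return statistics.fmean(rho_values) - statistics.pstdev(rho_values)


def is_dehydron(rho, cutoff=RHO_CUTOFF):
    """Flag a backbone hydrogen bond when rho is at or below the cutoff."""
    return rho <= cutoff


# Toy coordinates in Å (hypothetical), used only to exercise the functions.
ca_i, ca_j = (0.0, 0.0, 0.0), (5.0, 0.0, 0.0)
groups = [(1.0, 1.0, 1.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0), (0.0, 6.3, 0.0)]
rho = wrapping_count(ca_i, ca_j, groups)
print(rho, is_dehydron(rho))
```

In practice the group coordinates would come from the minimized models, and the fixed cutoff of 19 could be compared against the value returned by `statistical_cutoff` for the protein set under study.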

Fig. 1 The solvent-exposed backbone hydrogen bonds for the α-crystallin domain of αB-crystallin, the josephin domain of ataxin-3, and the complex generated by ZDOCK. The dehydrons are shown as green connectors between the Cα atoms of the linked residues [images generated with YAPView, available from http://sourceforge.net/projects/protlib/files/yapview/] (Color figure online)

Fig. 2 a and c: Shown in yellow are those residues of the josephin domain of ataxin-3 determined by 15N-HSQC spectroscopy to lie at the interaction surface with αB-crystallin; b and d: surface image of the predicted ataxin-3/αB-crystallin complex showing the placement of the surface residues (Color figure online)

Fig. 3 a Ribbon diagram showing deformation of the α2 helix after a 5 ns MD equilibration of the partially solvated josephin domain of ataxin-3; b The solvated ataxin-3/αB-crystallin complex after a 5 ns MD equilibration, with the josephin domain shown as a ribbon and the αB-crystallin monomers shown in green and brown; c Hydrogen bonding between water molecules and the carbonyl oxygens of residues Arg47, Glu50 and Gly51 of the α2 helix of ataxin-3 (Color figure online)

Docking

ZDOCK is a Fast Fourier Transform rigid docking protocol that searches all possible binding modes in the translational and rotational spaces between two proteins, and evaluates each using an energy scoring function based on shape complementarity [23]. By associating an unfavorable desolvation energy contribution with specified atoms, protein residues can be blocked from being included in binding sites. Results are filtered by clustering poses whose RMSD lies within a specified cutoff of the predicted binding interface. The pose shown in Fig. 1 was generated by first blocking all those residues in the josephin domain of ataxin-3 that are not on the αB-crystallin interaction surface. A signal enhancement of greater than 20 % in the 15N-HSQC spectrum of ataxin-3 in the presence of a relaxation agent, upon addition of a twofold molar excess of αB-crystallin, was used to identify those residues at the interaction surface [11]. For αB-crystallin, only those residues that were specifically identified in reference 9 as possessing antifibril activity when incubated as synthetic peptides were left unblocked. Of the 54,000 docked poses generated using the default parameters as implemented in the Discovery Studio program suite from Accelrys Inc, the 2,000 top-ranked poses were filtered into 86 clusters. For each cluster, an average contact surface area was generated by calculating the change in the solvent accessible surface (SAS) area upon formation of the protein–protein complex for each docked pose. Finally, the highest ranked pose, as represented by its ZRANK score [24], in the cluster with the largest average contact surface area was selected.
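The final selection step can be sketched as follows. This is an illustration under stated assumptions rather than the Discovery Studio implementation: the dictionary keys are hypothetical, the contact area is taken as the total SAS area lost on complexation, and a more negative ZRANK score is assumed to indicate a better-ranked pose.

```python
from collections import defaultdict


def contact_area(sas_a, sas_b, sas_complex):
    """Change in SAS area (Å^2) on forming the complex from its partners."""
    return sas_a + sas_b - sas_complex


def select_pose(poses):
    """Pick the best-ranked pose from the cluster with the largest mean
    contact area. Each pose is a dict with the hypothetical keys
    'cluster', 'zrank', 'sas_a', 'sas_b' and 'sas_complex'."""
    by_cluster = defaultdict(list)
    for pose in poses:
        by_cluster[pose["cluster"]].append(pose)

    def mean_area(members):
        areas = [
            contact_area(p["sas_a"], p["sas_b"], p["sas_complex"])
            for p in members
        ]
        return sum(areas) / len(areas)

    best_cluster = max(by_cluster.values(), key=mean_area)
    # Assumption: lower (more negative) ZRANK means a better-ranked pose.
    return min(best_cluster, key=lambda p: p["zrank"])


# Hypothetical poses; the areas and scores are made up for illustration.
poses = [
    {"id": 1, "cluster": 0, "zrank": -50.0, "sas_a": 5000, "sas_b": 4000, "sas_complex": 8400},
    {"id": 2, "cluster": 0, "zrank": -60.0, "sas_a": 5000, "sas_b": 4000, "sas_complex": 8200},
    {"id": 3, "cluster": 1, "zrank": -40.0, "sas_a": 5000, "sas_b": 4000, "sas_complex": 8000},
    {"id": 4, "cluster": 1, "zrank": -45.0, "sas_a": 5000, "sas_b": 4000, "sas_complex": 8100},
]
print(select_pose(poses)["id"])
```

Note that the pose with the best overall ZRANK score (id 2 in this toy set) is not necessarily chosen, because the cluster is selected on contact area first.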

The docking of the αB-crystallin bioactive tetrapeptide HEER was performed with version 4.0 of the program AutoDock, using the implemented empirical free energy function and the Lamarckian genetic algorithm [25]. AutoTors, as implemented in the ADT (AutoDockTools) software program [26], was used to define the torsional degrees of freedom to be considered during the docking process. The peptide coordinates were extracted from the computed homology model of αB-crystallin; the backbone structure was held fixed, but all side-chains were free to rotate during docking. Atomic charges were assigned using the Gasteiger-Marsili formalism [27], which is the type of atomic charges used in calibrating the AutoDock empirical free energy function [28]. The grid maps representing the protein in the actual docking process were calculated with AutoGrid 4.0. The grid dimensions were chosen to be 30 Å × 30 Å × 30 Å, with a spacing of 0.375 Å between the grid points, and the grid box was centered at the approximate center of mass of the helical hairpin formed by helices two and three. Docking parameters included an initial population of 100 randomly placed individuals, a maximum number of 15 million energy evaluations, a maximum of 27,000 generations, a mutation rate of 0.02, a crossover rate of 0.80, and an elitism value of 1. Proportional selection was used, where the average of the worst energy was calculated over a window of the previous 10 generations. For the local search, the pseudo-Solis and Wets algorithm was applied using a maximum of 300 iterations per local search. The probability of performing local search on an individual in the population was 0.06, and the maximum number of consecutive successes or failures before doubling or halving the local search step size was 4. One hundred independent docking runs were carried out, and results differing by less than 4 Å in positional root-mean-square deviation (RMSD) were clustered together and represented by the result with the most favorable free energy of binding. The pose shown in Fig. 4b has the most favorable free energy of binding and is representative of the cluster with the highest occupancy. This indicates that convergence was achieved.
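The clustering of docking runs described above can be sketched with a simple energy-ordered scheme. This is an assumption about the procedure, written for illustration and not extracted from AutoDock: runs are visited from most to least favorable energy, and each joins the first cluster whose representative lies within the RMSD tolerance. The coordinates are hypothetical single-atom "conformations".

```python
import math


def rmsd(conf_a, conf_b):
    """Positional RMSD between two conformations given as equal-length
    lists of (x, y, z) tuples; no superposition is applied."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(conf_a, conf_b))
    return math.sqrt(squared / len(conf_a))


def cluster_runs(runs, tolerance=4.0):
    """Group (energy, coordinates) runs into clusters. The first member of
    each cluster is its lowest-energy representative."""
    clusters = []
    for run in sorted(runs, key=lambda r: r[0]):
        for cluster in clusters:
            if rmsd(run[1], cluster[0][1]) < tolerance:
                cluster.append(run)
                break
        else:
            # No existing representative is close enough: start a new cluster.
            clusters.append([run])
    return clusters


# Hypothetical runs: (free energy in kcal/mol, coordinates in Å).
runs = [
    (-4.0, [(0.0, 0.0, 0.0)]),
    (-4.5, [(1.0, 0.0, 0.0)]),
    (-3.0, [(10.0, 0.0, 0.0)]),
    (-2.0, [(11.0, 0.0, 0.0)]),
    (-3.5, [(2.0, 0.0, 0.0)]),
]
clusters = cluster_runs(runs)
most_populated = max(clusters, key=len)
print(len(clusters), most_populated[0][0])
```

When the most populated cluster is also the one whose representative has the lowest energy, as in this toy set, the situation matches the convergence criterion stated in the text.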

Fig. 4 a: Ribbon structure of the predicted ataxin-3/αB-crystallin complex with the 104HEER107 sequence of αB-crystallin shown in yellow; b: The lowest energy, highest cluster occupancy pose for the binding of the HEER tetrapeptide (yellow) to the josephin domain of ataxin-3, showing the extensive hydrogen bonding between the arginine and histidine of the tetrapeptide and residues Gly52, Thr54 and Ser55 of ataxin-3 (Color figure online)

Molecular Dynamics

Molecular dynamics (MD) simulations were performed on the josephin domain of ataxin-3 and its complex with the αB-crystallin dimer as generated by ZDOCK. In both cases, the initial structure was solvated with TIP3P explicit water molecules occupying a sphere of radius 20 Å from the center of mass of the helical hairpin formed by helices two and three, and an explicit spherical boundary with harmonic restraint was employed. The system was minimized by steepest descent and conjugate gradient, heated to 300 K, and equilibrated at 300 K for 5 ns. The SHAKE algorithm was employed to keep bonds involving hydrogen atoms at their equilibrium length, allowing the use of a 2 fs time step.
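Two quantities implied by this protocol can be made concrete with a short sketch: the number of integration steps in the equilibration, and the penalty applied by a harmonic spherical boundary. The functional form of the restraint and the force constant are assumptions made for illustration; the text does not state them.

```python
import math

TIME_STEP_FS = 2.0       # time step quoted in the text
EQUILIBRATION_NS = 5.0   # equilibration length quoted in the text
SPHERE_RADIUS = 20.0     # solvent sphere radius in Å quoted in the text


def n_steps(total_ns, step_fs):
    """Number of integration steps needed to cover total_ns."""
    return round(total_ns * 1_000_000 / step_fs)


def boundary_energy(position, centre, radius=SPHERE_RADIUS, k=1.0):
    """Half-harmonic penalty for a particle outside the sphere.
    The form 0.5 * k * (r - R)^2 and k = 1.0 are hypothetical choices."""
    excess = math.dist(position, centre) - radius
    return 0.5 * k * excess ** 2 if excess > 0 else 0.0


print(n_steps(EQUILIBRATION_NS, TIME_STEP_FS))
print(boundary_energy((22.0, 0.0, 0.0), (0.0, 0.0, 0.0)))
```

Particles inside the sphere feel no restraint in this sketch, so only water molecules that stray beyond 20 Å are pushed back toward the solvated region.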

Results and Discussion

Dehydron analysis of αB-crystallin, Fig. 1, confirms our previous observation that the Ig-like β fold structure of the α-crystallin domain provides an ideal topology for the presentation of solvent-exposed backbone hydrogen bonds. Specifically, two β89 interstrand hydrogen bonds, between the Ser71 and Thr79 residues, respectively, are identified as vulnerable to solvent exposure, a vulnerability shown to be conserved across most α-crystallin domains [19]. Previous MD simulations for solvated α-crystallin oligomers confirmed that solvent exposure triggers a small but important dislocation of the β8 strand relative to the β9 strand, initiated by the migration of solvent molecules toward these underwrapped, and thus exposed, backbone hydrogen bonds. This dislocation is in turn sufficient in scope to disrupt key β89 interstrand stabilizing forces [19]. The distribution of dehydrons in ataxin-3, shown in Fig. 1, also proves instructive: fully half of all the vulnerabilities present in the protein are found in the thumb-like extension formed by the protrusion of the α23 helical hairpin from the protein core. Our analysis also shows that six of the eleven intrahelical hydrogen bonds in this subdomain are vulnerable to solvent exposure.

Dehydron analysis for the complex formed by ataxin-3 and the αB-crystallin dimer, as predicted by the ZDOCK algorithm and using the change in SAS area as the key selection criterion, is also shown in Fig. 1. The reduction in SAS area of 963 Å², relative to the SAS area for the individual proteins, is achieved by fitting one face of the ataxin-3 core into the hydrophobic groove bounded by the β4 and β8 strands of one of the αB-crystallin monomers. This allows the helical hairpin to sit in a groove bounded by residues 73–85 of one αB-crystallin monomer and residues 101–110 of the other. This result is not wholly unexpected, since it was these residues that were found previously to possess antifibril activity and thus were left unblocked in the ZDOCK protocol. However, the identification of the shape of the ataxin-3 α23 helical hairpin as complementary to the crevice bounded by the αB-crystallin monomers is independent of any such selection, as are the results showing that this thumb-like protrusion has both a very high density of αB-crystallin binding sites and a high density of dehydrons.

The values reported in Table 1 highlight the extent to which the penetration of one partner’s residues into the desolvation domains of the other partner’s dehydrons serves to protect otherwise solvent-exposed backbone hydrogen bonds. For the complexes highlighted, Y represents the number of dehydrons in the individual binding partners, and δ represents the density of solvent-exposed backbone hydrogen bonds per thousand Å². The value Yint represents the number of dehydrons at the binding interface, and δint is the density of solvent-exposed backbone hydrogen bonds at the binding interface satisfied on complexation. The δint value for the ataxin-3/αB-crystallin complex, in addition to being significantly larger than the average density of the binding partners, also compares favorably with that calculated for complexes previously identified as exhibiting the greatest compensatory effect upon complexation [29]. Of the fifteen complexes studied in reference 29, only the two included in Table 1 have a higher δint. Additionally, the two β89 interstrand hydrogen bonds that still remain underwrapped in the αB-crystallin dimer even after complexation are found, upon examination, to have ρ values of 19. Although 19 is the cutoff value for dehydron characterization, 19 protecting nonpolar carbonaceous groups represents a dramatic improvement over the ρ values of 13 (Ser74(O)-(HN)Val77) and 16 (Ser71(NH)-(O)Thr79) calculated for the isolated sHsp. Identifying these hydrogen bonds as sufficiently protected from solvent exposure upon complexation increases the δint value for the ataxin-3/αB-crystallin complex to 7.3, a compensatory benefit greater than that found for any complex studied in reference 29.
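The two bookkeeping quantities used in this comparison can be written out as a minimal sketch. The function names are hypothetical; the ρ values are those quoted in the text for the two interstrand hydrogen bonds, and the areas and counts needed to reproduce the tabulated densities are not given here, so no δ value is recomputed.

```python
RHO_CUTOFF = 19  # dehydron cutoff quoted in the text


def dehydron_density(n_dehydrons, area):
    """Dehydrons per thousand Å^2 of surface area."""
    if area <= 0:
        raise ValueError("area must be positive")
    return 1000.0 * n_dehydrons / area


def wrapping_gain(rho_free, rho_complex):
    """Extra nonpolar groups recruited around a hydrogen bond on complexation."""
    return rho_complex - rho_free


# rho values quoted in the text: (isolated sHsp, after complexation).
bonds = {
    "Ser74(O)-(HN)Val77": (13, 19),
    "Ser71(NH)-(O)Thr79": (16, 19),
}
for name, (free, bound) in bonds.items():
    # The last column shows whether the bond is still at or below the cutoff.
    print(name, wrapping_gain(free, bound), bound <= RHO_CUTOFF)
```

Both bonds gain wrapping groups yet remain at the cutoff, which is why the text treats their classification as a judgment call when computing δint.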

Table 1 Y represents the number of dehydrons in the individual binding partners; δ represents the density of solvent-exposed backbone hydrogen bonds, per thousand Å², in the separate proteins; Yint represents the number of dehydrons at the binding interface; δint is the density of solvent-exposed backbone hydrogen bonds at the binding interface satisfied on complexation

Figure 3 highlights the structural consequences of exposing the dehydron vulnerabilities present in the ataxin-3 α23 helical hairpin to water, and the consequent structural benefit of complex formation with αB-crystallin. Whereas a 5 ns MD equilibration of the partially solvated josephin domain of ataxin-3 results in a clear deformation of the α2 helix, Fig. 3a, equilibrating the solvated ataxin-3/αB-crystallin complex over the same timeframe results in no loss of secondary structure, Fig. 3b. The extent of this dislocation can be seen by contrasting the lengths of the helical axis, measured as the distance between the α carbons of residues Val31 and Gly51. The cause of the nearly 3 Å extension of this axis, and the resultant loss in secondary structure, can be seen in Fig. 3c. Hydrogen bonding between water molecules and the carbonyl oxygens of Arg47, Glu50, and Gly51 clearly results in a concomitant loss of the stabilizing intrahelical hydrogen bonds involving those residues. As might be expected, these residues had previously been identified by our dehydron analysis as being the loci of underwrapped backbone hydrogen bonds, and thus vulnerable to solvent exposure. Indeed, the ρ values of 11 for (Arg47(O)-(HN)Glu50) and 9 for (Met48(O)-(HN)Gly51) marked them as the most vulnerable of all the dehydrons found in our analysis of ataxin-3, Fig. 2.

As previously noted, several peptides corresponding to interactive sequences in αB-crystallin have been shown to modulate the fibrillation of amyloidogenic proteins [9]. One of these sequences, the 104HEER107 component, is found to lie proximate and transverse to the α23 turn in our predicted complex, Fig. 4a. By extracting the coordinates for this tetrapeptide from the computed homology model of αB-crystallin, and holding the backbone fixed while leaving all side-chains free to rotate, we succeeded in converging to the docked pose shown in Fig. 4b, using the Lamarckian genetic algorithm as implemented in the program AutoDock. While the predicted free energy of binding of 4.46 kcal mol−1 indicates relatively weak attachment, this pose does show extensive hydrogen bonding between the arginine and the histidine of the tetrapeptide and residues Gly52, Thr54, and Ser55 of ataxin-3. While the Ki predicted by this docking is somewhat higher than might be expected based on the 3–50 μM range at which such peptides inhibited fibrillation (at 5:1 ratios of peptide to protein) [9], such a value is consistent with the prediction, based on size exclusion chromatography and nuclear magnetic resonance data [11], that the inhibition of aggregation of the josephin domain of ataxin-3 occurs by transient association with αB-crystallin.
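The relationship between a docking free energy and an inhibition constant can be illustrated with the standard expression Ki = exp(ΔG / RT). The sketch below assumes a temperature of 298.15 K and takes the quoted magnitude of 4.46 kcal/mol as a favorable (negative) free energy; both are assumptions, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def inhibition_constant(delta_g, temperature=298.15):
    """Estimate Ki (mol/L) from a binding free energy in kcal/mol using
    Ki = exp(delta_g / (R * T)). The temperature is an assumed default."""
    return math.exp(delta_g / (R_KCAL * temperature))


# Assumption: the free energy quoted in the text is favorable, i.e. negative.
ki_molar = inhibition_constant(-4.46)
print(f"{ki_molar * 1e6:.0f} uM")
```

Under these assumptions the estimate falls in the high micromolar range, above the 3–50 μM window cited for the peptides, which is in line with the weak, transient association discussed in the text.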

This minimal model of polyQ misfolding involves a two-step process, in which aggregation of the globular N-terminal josephin domain is followed by self-association of the expanded polyQ segments [2]. The formation of a weakly bound complex with αB-crystallin impedes polyQ aggregation by sequestering the monomeric protein, thereby inhibiting fibril growth. The chaperonin TRiC has also been shown to have a similar effect on the aggregation of huntingtin (Htt), though in this case it is proposed that the chaperonin attachment is specific to a 17-residue segment at the N-terminus [30]. In this model, it is postulated that the N17 region, though intrinsically unstructured, can form an amphiphilic helix, which then promotes aggregation through self-association of the nonpolar helical faces. By reversibly attaching to the same nonpolar regions, the chaperonin sequesters the protein in a manner similar to that proposed for the inhibition of ataxin-3 by αB-crystallin [31]. The analysis presented here raises the possibility that, as previously noted in reference 29, the amyloidogenic propensity of proteins such as ataxin-3 results not from specific structural folds but from vulnerabilities contained within these structural features. Targeting such vulnerabilities could allow for the development of novel and effective treatments of these, and other, devastating neurodegenerative disorders.