Introduction

α-Synuclein (AS) is a 140-residue, 14.5-kDa, intrinsically disordered protein (IDP) implicated in the development of Parkinson’s Disease (PD) and Lewy Body Dementia. Aggregated α-synuclein, which collects in the form of plaques called Lewy bodies, is one of the key features exhibited in both disorders, and the aggregation process has been implicated in the progression of the diseases (Bermel et al. 2006; Bertoncini et al. 2005). Understanding the mechanism of aggregation at a molecular level may lead to the discovery of pharmacological agents or other therapeutic interventions that can arrest it, and hence offer a potential benefit for the increasing number of patients.

Wild-type (WT) α-synuclein contains three regions: the amphipathic N-terminal region (residues 1–60), a hydrophobic central domain known as the non-amyloid-β component (NAC) region (residues 61–95), and the C-terminal region (residues 96–140). α-Synuclein has been found in both membrane- or vesicle-bound and free forms in pre-synaptic terminals of neurons, and several experimental methodologies have implicated residues 71–82 in the formation of amyloid fibrils (Bussel and Eliezer 2001). α-Synuclein is encoded by the SNCA gene, and several point mutations in the gene are known to cause familial forms of PD, including A30P, A53T and E46K (Breydo et al. 2012). Our study focused specifically on the WT form of the protein and these three familial mutants, in an effort to shed light on the structural changes that result from these residue substitutions and from temperature variations.

An important question that may shed light on the aggregation mechanism of α-synuclein is how the mutations causing familial PD affect the conformational behavior of α-synuclein and what distinguishes each of the variants. These questions are challenging to address because of the intrinsically disordered nature of α-synuclein (Bertoncini et al. 2005; Dedmon et al. 2005), but they have been probed previously with a variety of approaches. For example, assays such as CD and FTIR spectroscopy, SAXS, thioflavin T fluorescence, and imaging with electron and atomic force microscopy have shown that the A53T and E46K mutations induce significantly faster fibril formation than in WT α-synuclein (Conway et al. 1998; Fredenburg et al. 2007; Greenbaum et al. 2005; Li and Fink 2001), with distinct differences in the fibril morphology (Choi et al. 2004; Lemkau et al. 2013). Interestingly, the effect of the A30P mutation on fibrillization is less clear from such studies, with some showing increased rates compared to WT α-synuclein (Li and Fink 2001; Narhi et al. 1999) and others decreased rates (Conway et al. 2000; Serpell et al. 2000); contradictory results were also found for the effect of the mutation on fibril morphology (Conway et al. 2000; Serpell et al. 2000). Turning to studies of the monomeric protein, NMR studies have found a loss of helical structure in the region around residue 30 in the A30P mutant, as well as small variations in β-sheet propensity in the region around residue 53 in the A53T mutant (Clayton and George 1998; National Library of Medicine 2015; Lee and Lee 2014). Studies of fluorescently labeled protein found that WT α-synuclein and its A53T and E46K mutants exhibited three states in the presence of sodium dodecyl sulfate (SDS), whereas the A30P variant exhibited only two (Ferreon et al. 2010).
Computer simulations have found similar structural variations, using various sampling methodologies such as molecular dynamics simulations using replica exchange, simulated annealing or umbrella sampling methods, and Monte Carlo simulations (Balesh et al. 2011; Dedmon et al. 2005; Jónsson et al. 2012; Wise-Scira et al. 2013). Although such studies typically describe secondary structure assignments in terms of the per-residue probability, this approach presents difficulties in the case of an IDP like α-synuclein, because the probability of structure formation at any given residue is sufficiently low that it is challenging to determine the likelihood of forming an extended structure.

Previous studies of α-synuclein have indicated the existence of distinct conformational phases in monomeric ensembles (Jónsson et al. 2012), but never compared the propensity of individual residues to form extended structures. Here, we investigate the relative correlations between residues and their neighbors to determine the frequency with which these residues are members of short- and long-chain secondary structures. We used Monte Carlo simulations to most effectively sample the diverse conformational fluctuations of α-synuclein and minimize any bias from the choice of initial structure in the simulations. Through these simulations, we identified regions in the protein exhibiting an increased propensity to form secondary structure chains that correlated suggestively with the relative aggregation propensities presented by WT α-synuclein and its mutants.

Methods

Protein Monte Carlo simulations

Canonical Monte Carlo (MC) simulations were performed using the ProFASI software package (Irbäck 2006) on wild-type human α-synuclein, as well as the A30P, A53T and E46K mutants. In each case, a cysteine residue was added to each terminus of the protein, to mimic the proteins used in single-molecule studies (Neupane et al. 2014). Sixteen independent simulations were run at each temperature, both to ensure appropriate occupancy levels for potential bimodal peaks and to improve the overall sampling, because each run follows a different trajectory. Each simulation was run for a total of 12 million MC sweeps, where each sweep consisted of elementary steps over all degrees of freedom of the system, to ensure energetic convergence. Elementary moves consisted of pivots, rotations, and biased Gaussian steps. “Pivots” represent rotations about a backbone bond, “rotations” represent rotations of the amino acid side-chains, and biased Gaussian steps represent concerted internal rotations over several residues of the backbone. These three move types were applied randomly, with relative frequencies of 55, 22.5, and 22.5%, respectively, chosen to reflect the relative degrees of freedom for the different motions (Irbäck 2006). After each move was performed, the energy of the system was calculated according to the ProFASI force field given by the formula:

$$E\; = \;E_{\rm loc} + E_{\rm ev} + E_{\rm hp} + E_{\rm hb} ,$$
(1)

where $E_{\rm loc}$ is the energy due to local electrostatic interactions, $E_{\rm ev}$ the energy due to the excluded volume term, $E_{\rm hp}$ the energy due to hydrophilic and hydrophobic interactions, and $E_{\rm hb}$ the energy of hydrogen bond formation. After each move, the resulting structure was accepted with the probability determined by the expression

$$P\; = \hbox{min} \left\{ {1,\exp \left[ {\frac{{ - \left( {E_{2} - E_{1} } \right)}}{kT}} \right]} \right\},$$
(2)

where $E_{2}$ and $E_{1}$ are the energies of the new and old structures, respectively, $k$ is the Boltzmann constant, and $T$ is the simulation temperature.
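As a minimal sketch of the move selection and acceptance rule described above, the following Python fragment draws a move type with the quoted frequencies and applies Eq. 2. It is not ProFASI code: the function names, the choice of kcal/mol as the energy unit, and the corresponding value of the Boltzmann constant are assumptions made purely for illustration.

```python
import math
import random

# Assumption: energies in kcal/mol, so k is in kcal/(mol K).
# ProFASI uses its own energy and temperature scales.
K_B = 0.0019872041

# Relative move frequencies quoted in the text (55%, 22.5%, 22.5%).
MOVE_WEIGHTS = {"pivot": 0.55, "rotation": 0.225, "biased_gaussian": 0.225}


def choose_move(rng):
    """Pick one elementary move type at random with the quoted frequencies."""
    names = list(MOVE_WEIGHTS)
    weights = [MOVE_WEIGHTS[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def acceptance_probability(e_old, e_new, temperature):
    """Eq. 2: P = min{1, exp[-(E2 - E1) / kT]}."""
    delta = e_new - e_old
    if delta <= 0.0:
        return 1.0  # downhill and neutral moves are always accepted
    return math.exp(-delta / (K_B * temperature))


def metropolis_accept(e_old, e_new, temperature, rng):
    """Return True if the trial structure should replace the current one."""
    return rng.random() < acceptance_probability(e_old, e_new, temperature)
```

In a full sweep, a driver loop would call `choose_move`, apply the move to the chain, evaluate the energy of Eq. 1, and keep or discard the trial structure according to `metropolis_accept`.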

Simulations were performed at temperatures ranging from 290 to 340 K, in 10 K increments. We note that the temperature scale in ProFASI MC simulations does not represent a true physical temperature; it was calibrated using the Trp-cage phase transition, and for larger proteins it requires re-calibration by comparison to experimental results that confirm the structural ensembles (Jónsson et al. 2012; Mitternacht et al. 2010).

Analysis

Individual trajectories produced by ProFASI for the α-synuclein simulations were converted to DCD format via the CatDCD plugin available with the Visual Molecular Dynamics (VMD) program (Humphrey and Schulten 1996). The AmberTools trajectory analysis program, ptraj, was used to determine the secondary structure propensities for each residue, as well as the correlation matrices for the structures in each trajectory (Case et al. 2012). In the case of the bimodal WT ensemble, a second set of simulations was run to determine whether the initial simulation had been run long enough to converge. We analyzed the lengths of secondary structures formed, the frequency of each length, and the specific residues involved in these various lengths using custom scripts in the Octave package (Eaton et al. 2009, 2014). Comparison of ensembles to the available NMR results was performed using the SHIFTX2 package, which calculates theoretical chemical shifts for each structure and generates ensemble averages (Han et al. 2011). These theoretical values were then compared to NMR data available in the Biological Magnetic Resonance Data Bank (Ulrich et al. 2008). Further comparison to experiment was performed by calculating ³J(HN,Hα) coupling values via the Karplus equation (Schmidt et al. 1999), as well as calculating hydrodynamic radius (Rh) values using the program HYDROPRO (Ortega et al. 2011).
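The Karplus step can be illustrated with a short Python sketch that converts backbone φ angles into ³J(HN,Hα) couplings and averages them over an ensemble. The coefficient values below are an assumption: they are the values commonly quoted for the self-consistent parametrization of Schmidt et al. (1999) and should be checked against that reference before use. The function names and the input layout are hypothetical.

```python
import math

# Assumed Karplus coefficients in Hz (commonly quoted for Schmidt et al. 1999);
# verify against the original reference before relying on them.
A, B, C = 7.09, -1.42, 1.55
PHASE_DEG = -60.0  # theta = phi - 60 degrees for the HN-Halpha coupling


def karplus_j(phi_deg, a=A, b=B, c=C, phase_deg=PHASE_DEG):
    """3J = a*cos^2(theta) + b*cos(theta) + c, with theta = phi + phase."""
    theta = math.radians(phi_deg + phase_deg)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


def ensemble_mean_j(phi_by_frame):
    """Average the coupling over frames for each residue.

    phi_by_frame: list of frames, each a list of phi angles in degrees,
    one angle per residue (hypothetical input layout).
    """
    n_frames = len(phi_by_frame)
    n_res = len(phi_by_frame[0])
    return [
        sum(karplus_j(frame[r]) for frame in phi_by_frame) / n_frames
        for r in range(n_res)
    ]
```

Averaging the coupling frame by frame, rather than applying the Karplus curve to an averaged angle, matters for a disordered protein because the curve is strongly non-linear in φ.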

Results

We first examined the distribution of energies for the conformations of wild-type and mutant α-synuclein at each temperature value, as a sign of possible structural differences between the protein variants (Fig. 1). At the highest simulation temperature, 340 K, the energy distributions for all four versions of the protein (WT and mutant) converged to a single peak (Fig. 1a). As the temperature was lowered, a bimodal distribution indicative of a phase transition was observed for each protein (Fig. 1b–d), being most evident at 320 K (Fig. 1c). Variations in the distributions between the WT and mutant proteins also became apparent, implying differences in the temperature for the structural phase transition in each protein variant. Wild-type α-synuclein was found to have the lowest average energy compared to the three mutants, in each case. Whereas the differences between the α-synuclein variants were in most cases not very large, the A30P mutant was notably different from the other variants. To check convergence, a second set of 16 independent simulations was run at 320 K for the WT, and the energy distributions of the two sets were found to agree to within error (see Fig. S1). Because this mixed-phase simulation converged, our simulation setup would likely also converge at higher temperatures, given the increased sampling that occurs at higher temperature. At the lowest temperature (Fig. 1f), the distributions appeared to converge again towards a single peak, although small differences between the four variants remained.

Fig. 1

Distribution of conformational energies of wild-type and mutant α-synuclein as a function of temperature. a 340 K, b 330 K, c 320 K, d 310 K, e 300 K, f 290 K. Black wild-type α-synuclein, blue E46K, green A53T, red A30P

To ensure that our results were consistent with previous work, we compared the NMR chemical shifts that would be expected theoretically from the ensemble of simulated structures (using SHIFTX2) to the values measured experimentally by NMR. The bimodal energy distributions were decomposed into distinct high- and low-energy peaks of approximately the same size by fitting the sides of each peak to a Gaussian distribution. Because experimental results exhibit properties consistent with mixed phases at low temperature, or mainly the high-energy phase at higher temperatures, we focused on the high-energy peak in the simulated distribution. The high-energy peak of the distribution at 320 K led to predicted chemical shifts that agreed well with NMR data collected at 285.5 K and pH 6.5 (Bermel et al. 2006): the observed root-mean-square deviations (RMSDs) were 0.48 and 1.81 ppm for 13C and 15N shifts, respectively, compared to expected RMS errors for SHIFTX of 0.9 ppm (13C) and 2.4 ppm (15N) (Han et al. 2011; Neal et al. 2003). The difference between the simulation and experimental temperatures arises because the ProFASI temperature scale is not physical, but was instead calibrated using a smaller protein, Trp-cage. We note that our results are also consistent with previous MC simulations of α-synuclein (Jónsson et al. 2012).

Theoretically predicted paramagnetic relaxation enhancement (PRE) intensity ratios were estimated using the measure $1 - p_{ij}$, where $p_{ij}$ is the contact probability between residues $i$ and $j$, and were compared to experiment. We found that our results are comparable to the measurements of Wu et al. (2009) as well as the values computed by Jónsson et al. (2012). This includes an apparent extended region of long-range contacts about residue 90 (residue 91 in Cys-terminated α-synuclein), and the presence of a contact “bump” between residues 80–100 for PRE-labelled residue 132 (133 in Cys-terminated α-synuclein). Residues 19 and 132 (residues 20 and 133 in Cys-terminated α-synuclein) also exhibit extended contact regions similar to those found by Jónsson et al. (2012) and Wu et al. (2009). These results are shown in Fig. 2.
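The PRE proxy can be sketched in a few lines of Python. The contact cutoff of 8 Å, the function names, and the input layout are assumptions for illustration only; the text does not specify how a contact was defined.

```python
def contact_probability(distances, cutoff=8.0):
    """Fraction of frames in which a residue pair is closer than the cutoff.

    distances: one i-j distance per frame, in angstroms.
    cutoff: contact threshold in angstroms (assumed value).
    """
    if not distances:
        raise ValueError("need at least one frame")
    return sum(1 for d in distances if d < cutoff) / len(distances)


def pre_proxy_profile(distances_by_residue, cutoff=8.0):
    """Return 1 - p_ij for a fixed labelled residue i against each residue j.

    distances_by_residue: dict mapping residue index j to the list of
    i-j distances over the ensemble (hypothetical input layout).
    """
    return {
        j: 1.0 - contact_probability(dists, cutoff)
        for j, dists in distances_by_residue.items()
    }
```

A value near 1 means residue j is rarely in contact with the labelled site, which corresponds to little signal attenuation in a PRE experiment; a dip towards 0 marks a frequently contacted region.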

Fig. 2

Values of $1 - p_{ij}$ for residues in our simulations corresponding to WT i = 19 (red), 90 (blue), and 132 (black), in our high-energy ensembles produced at T = 320 K. These values have been used as an estimate of spin-labelled PRE results, and are similar to the results of both Jónsson et al. and Wu et al. (Jónsson et al. 2012; Wu et al. 2009)

Scalar ³J(HN,Hα) couplings were calculated for both peaks via the self-consistent Karplus equation (Schmidt et al. 1999), and the results were compared against experiment. Kim et al. (2007) found a mean coupling value of 7.44 Hz, whereas we found a mean of 8.072 Hz for the low-energy phase and 6.31 Hz for the high-energy phase, which are very close to the values of 8.09 and 6.43 Hz, respectively, found by Jónsson et al. (2012). The low-energy peak exhibited J-coupling values in good agreement with the low-pH system of Cho et al. (2009), with an RMSD of 1.006 Hz and a mean offset of 0.2 Hz. However, when compared to the results of Lendel and Damberg (2009), the low-energy peak exhibited an RMSD of 1.50 Hz and a mean offset of 1.2002 Hz. The high-energy phase exhibited greater variation when compared to Cho et al. (2009), with an RMSD of 1.40 Hz and a mean offset of 1.03 Hz. In contrast, the high-energy phase showed an RMSD of 0.58 Hz and a mean offset of 0.05 Hz when compared to Lendel and Damberg (2009). These results are also consistent with the observations of Jónsson et al. (2012), obtained using the Wang-Landau Monte Carlo method.
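The two agreement measures quoted above, RMSD and mean offset, can be computed as in the following sketch. The sign convention for the offset (calculated minus experimental) and the dictionary-based input are assumptions; only residues present in both data sets are compared.

```python
import math


def paired_values(calc, expt):
    """Pair calculated and experimental values for residues found in both.

    calc, expt: dicts mapping residue number to a coupling in Hz
    (hypothetical input layout).
    """
    common = sorted(set(calc) & set(expt))
    if not common:
        raise ValueError("no residues in common")
    return [(calc[r], expt[r]) for r in common]


def rmsd(calc, expt):
    """Root-mean-square deviation between calculated and measured values."""
    pairs = paired_values(calc, expt)
    return math.sqrt(sum((c - e) ** 2 for c, e in pairs) / len(pairs))


def mean_offset(calc, expt):
    """Mean signed difference, calculated minus experimental (assumed sign)."""
    pairs = paired_values(calc, expt)
    return sum(c - e for c, e in pairs) / len(pairs)
```

Reporting both numbers is informative: a large mean offset with a similar RMSD points to a systematic shift between the data sets, whereas a small offset with a large RMSD points to residue-by-residue scatter.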

We also calculated the hydrodynamic radius (Rh) for both the low- and high-energy distributions. One thousand structures were randomly selected from each peak and the values of Rh were determined using HYDROPRO (Ortega et al. 2011); we found the WT values to be 25.9 ± 1.5 and 30.8 ± 2.4 Å, respectively. The Rh value for the high-energy distribution agrees well with previous experimentally measured values, as well as previous simulations, which put it at approximately 32 Å (Allison et al. 2009; Binolfi et al. 2006; Cho et al. 2009; Jónsson et al. 2012; Wu et al. 2009). The Rh of the low-energy phase is also consistent with the simulations of Jónsson et al. (2012), who measured a mean value of 26.0 Å, and with the low-temperature measurement of 28.1 Å by Wu et al. (2009). The consistency of our results with previous simulations and experiments, as well as with a second set of simulations at 320 K, justifies our simulation setup and temperature comparisons. Radius of gyration values also compared favourably with previous simulations by Allison et al. (2009).

We next analyzed the propensity of each protein variant to form secondary structure as a function of temperature, concentrating on the range 310–340 K around the phase transition. Our focus in this case is primarily on the distributions formed at the phase transition temperature and at temperatures above it. Distributions below this point exhibited properties that were unverifiable, or outside of the known properties of low-temperature experiments (e.g. Rh values significantly lower than known values), and were hence excluded. The frequency with which each residue took on dihedral angles consistent with β-strands (Fig. 3) and α-helices (Fig. 4) was measured from the simulations. For both types of secondary structure, the per-residue frequency at high temperature (340 K) was quite similar across the entire protein for WT α-synuclein and the A53T and E46K mutants (Figs. 3a, 4a), but significant local differences were seen for A30P near the mutation site, including a reduction in helical propensity. At 330 K, the results for helices were similar to the results at 340 K (Fig. 4b), but differences were observed in the β-strand propensity, with E46K being more prone to β-strands than WT α-synuclein whereas the other two mutants were generally less prone (Fig. 3b). There was also an overall increase in the per-residue probabilities for both structures. At 320 K, the trends in β-strand propensity continued (Fig. 3c), but A30P became more liable than WT α-synuclein to form helices, whereas E46K became less so (Fig. 4c); again, the overall per-residue probabilities increased in all cases. Finally, at 310 K the four variants began to converge again in their β-strand propensity (Fig. 3d), albeit with a few notable deviations, but A30P remained somewhat different from the others in its helical tendencies, especially near the N terminus (Fig. 4d).

Fig. 3

Per-residue probability of dihedral angles consistent with β-strand secondary structure as a function of temperature. a 340 K, b 330 K, c 320 K, d 310 K. Black wild-type α-synuclein, blue E46K, green A53T, red A30P

Fig. 4

Per-residue probability of dihedral angles consistent with α-helix secondary structure as a function of temperature. a 340 K, b 330 K, c 320 K, d 310 K. Black wild-type α-synuclein, blue E46K, green A53T, red A30P

Per-residue conformational frequency analysis is a useful tool, but it does not capture correlations between neighboring residues and therefore cannot reveal the frequency with which amino acid residues form extended structures. The latter is more indicative of actual secondary structure formation than are transient fluctuations that are merely consistent with the ability to form structure. We therefore also investigated the frequency with which extended segments of the peptide chain formed dihedral angles consistent with β-strands or α-helices, reflecting extended structure.

Considering first the β-strands, we divided structures into “short” strands of fewer than 5 residues, and “long” strands of 5–12 residues; strands longer than 12 residues were ignored as they occurred with a frequency of less than 0.001%. These lengths were chosen as they are commonly reported strand lengths in NMR experiments on aggregates (Lemkau et al. 2012; Vilar et al. 2008), and they are also the most commonly formed extended structures as determined via joint probability plots (see Supporting Materials, Fig. S2). Short β-strand segments were dominated by “strands” of only 2 residues; perhaps not surprisingly, the short strands occurred at a rate that was very similar to the per-residue frequency of β-strand angles (Fig. S3).
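The segment-length analysis can be sketched as follows. This is an illustrative Python reimplementation, not the Octave scripts used in the study: it assumes each frame has already been reduced to one boolean per residue marking whether its dihedral angles are strand-like, and the function names are hypothetical.

```python
def runs(flags):
    """Yield (start, length) for each maximal run of True values."""
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            yield start, i - start
            start = None
    if start is not None:
        yield start, len(flags) - start


def long_strand_probability(frames, min_len=5, max_len=12):
    """Per-residue fraction of frames in which the residue lies inside a run
    of min_len..max_len consecutive strand-like residues.

    frames: list of frames, each a list of booleans, one per residue.
    """
    n_res = len(frames[0])
    counts = [0] * n_res
    for flags in frames:
        for start, length in runs(flags):
            if min_len <= length <= max_len:
                for r in range(start, start + length):
                    counts[r] += 1
    return [count / len(frames) for count in counts]
```

Calling the same function with `min_len=2, max_len=4` gives the corresponding probabilities for short segments, so one routine covers both length classes.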

The probability that a given residue formed part of a longer strand (Fig. 5) displayed some of the same trends observed in the per-residue probabilities, including a decreasing probability of structure at higher temperature and quasi-periodic regions with increased propensity for strand formation. At 340 K, all four variants had similarly low strand probability, although A30P was notably lower in the region around residue 30. At 330 K, the variants were more differentiated: E46K was generally more prone to form strands than the other forms of α-synuclein, especially near the N terminus, at residues 2–18, 21–31, 38–45, 50–57, 60–69, 73, 75–82, 85–88, 91–92, 97–102, 110–122, 127, 130–136, and 139. A53T was more prone than WT to form strands at residues 24–25, 36–55, 59–64, 66, 69–73, 86, 110–119, 123–127, 129–138, and 140. A30P was less likely to form strands over most of the sequence, but showed increased propensity at residues 36–37, 50–56, 70–73, 86, 140, and 141. At 320 K, A30P remained much less likely to form strands than the other variants, which were otherwise quite similar over the C-terminal half of the protein; A53T and especially E46K were more prone than WT to form strands in the N-terminal half. Finally, at 310 K, the probability for strand formation in A30P was only slightly lower than for WT (except at residue 30), and in some regions (e.g. 50–100) A53T was lowest.

Fig. 5

Probability of a given residue forming part of a β-strand of length 5–12 residues. a 340 K, b 330 K, c 320 K, d 310 K. Black wild-type α-synuclein, blue E46K, green A53T, red A30P. The inset in panel a shows the segment from residues 50–60, to show the variation more clearly. This region contains residues showing consistent relative variation in extended sheet propensity

The same kind of analysis was applied to α-helical structures, again divided into short helices (2–10 residues, shown in Fig. S4) and long helices (11–20 residues, shown in Fig. 6); helices longer than 20 residues occurred in less than 0.005% of extended structures. At 330–340 K, both long and short helical segments formed with similar frequency in the E46K and A53T mutants as in WT α-synuclein, but helix formation was depressed in residues 17–39 for A30P. In contrast, at 320 and 310 K (below the phase transition), A30P formed long helices more frequently than the other variants for residues 2–19 and 40–100, but less frequently for residues 22–38 around the mutation site. The A53T mutant behaved very similarly to WT α-synuclein at both temperatures, with a slightly higher helix propensity for residues 60–100, whereas E46K had generally lower helix propensity than WT at 320 K but a somewhat higher propensity at 310 K, especially at the N terminus.

Fig. 6

Probability of a given residue forming part of a helix of length 11–20 residues. a 340 K, b 330 K, c 320 K, d 310 K. Black wild-type α-synuclein, blue E46K, green A53T, red A30P

Discussion

Our work extends the previous study of Irbäck and colleagues, who analyzed bimodal distributions of wild-type α-synuclein via multi-canonical Wang-Landau simulations, determining the ensemble-averaged secondary and tertiary structural properties of each mode of the distributions (Jónsson et al. 2012). Here we focus instead on the simulation of temperature-dependent properties of α-synuclein and its familial point mutations (A30P, E46K, and A53T). Reassuringly, our simulations exhibited comparable peak energy levels to those of Irbäck and colleagues for WT α-synuclein, as seen in the 320 K simulation of Fig. 1 (Jónsson et al. 2012), although there was a roughly 9 K difference in the apparent transition temperature, likely due to differences in the move frequency used in the simulations. Our simulation of WT α-synuclein at a simulation temperature of 340 K also agreed well with solution NMR experiments at 303 K, providing a useful point of comparison to experimental results. We note that the difference between the experimental and simulated temperatures arises because the protein Monte Carlo model is only weakly coupled to physical temperature.

Based on our simulations, we found significant structural variation between WT and mutant α-synuclein. The E46K and A53T mutants both exhibited an increased propensity to form extended sheet structure when compared with either the WT or A30P mutant, most notably at the simulation temperature of 330 K. This variation was found primarily within the portion of the protein that makes up the core structure of α-synuclein fibrils, from approximately residues 30–110 (Chen et al. 2007; Heise et al. 2005, 2008; Lemkau et al. 2012; Lemkau et al. 2013; Vilar et al. 2008). To test whether any of these structural variations might be correlated with changes in the aggregation propensities of the α-synuclein variants in vitro, we looked for regions in the primary sequence where the structural trends matched the trend in the aggregation rates, with E46K aggregating fastest, then A53T, then A30P and WT α-synuclein (where A30P and WT are taken to be similar, given the uncertainty in the experimental aggregation results for these variants).

Looking first at the propensity to form extended sheet structure, at 340 K the results are generally similar for all variants, but at residues 24–25, 36–43, 46–51, 53–55, 59–60, 69, and 131–136 the propensity to form extended sheets does indeed match the order of aggregation propensity (E46K > A53T > A30P ~ WT). This pattern of extended-sheet propensities is even more prominent at 330 K, occurring over a wider range of residues: 24–25, 38–43, 50–55, 60–64, 66, 69, 73, 110–119, 127, and 130–136. Comparing the two lists, the pattern appears at both temperatures at residues 24–25, 38–43, 50–51, 53–55, 60, 69, and 131–136. We note that previous work showed that the aggregation rates of α-synuclein mutants were linearly related to the propensity of the structure to form extended sheets, with residues 50–54 showing a marked increase in aggregation rate with increased sheet content (Camilloni and Vendruscolo 2013), and residues 38–40, 50–54, 65–75, and 87–92 have been predicted to be some of the most aggregation-prone regions of the α-synuclein sequence (Pawar et al. 2005). Our results reinforce the idea that the increase in extended β-sheet propensity is the underlying cause of increasing aggregation propensity. Regions showing consistent relative variation in β-sheet content between the variants most likely reflect the effect of the mutations, whereas regions where the variation is not consistent are more likely due to random fluctuations. This suggests that the effect of the mutations is to increase the propensity of aggregation-prone regions to form β-sheet structures, and that this may be the underlying driving force causing increased fibril formation. Below the phase transition temperature, the monomers exhibit less consistent variations in extended sheet content, with the main variations occurring at the mutation sites and their neighbors. However, E46K and A53T do exhibit increased local and long-range correlations when compared to A30P and WT, which is indicative of structure formation. The propensity for short β-strands appears to converge quickly above the transition temperature; this is likely because random short-chain fluctuations increasingly dominate the ensemble.
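The overlap between the residue lists reported at 340 K and 330 K can be checked mechanically. The sketch below is only a bookkeeping aid: the two strings are transcribed from the lists given in the text, and the helper names are hypothetical.

```python
def expand(spec):
    """Turn a string such as '24-25, 36-43, 69' into a set of residue numbers."""
    residues = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            residues.update(range(int(first), int(last) + 1))
        else:
            residues.add(int(part))
    return residues


def collapse(residues):
    """Turn a set of residue numbers into a sorted list of (first, last) ranges."""
    ranges = []
    for r in sorted(residues):
        if ranges and r == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], r)  # extend the current range
        else:
            ranges.append((r, r))  # start a new range
    return ranges


# Residues matching the aggregation order, transcribed from the text.
AT_340K = "24-25, 36-43, 46-51, 53-55, 59-60, 69, 131-136"
AT_330K = "24-25, 38-43, 50-55, 60-64, 66, 69, 73, 110-119, 127, 130-136"

shared = collapse(expand(AT_340K) & expand(AT_330K))
```

Here `shared` holds the ranges common to both temperatures as `(first, last)` tuples, with single residues represented as ranges of length one.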

Turning next to the propensity to form helices, we found only minor variations in α-helical propensity, whether for extended or short helices, with the exception of the A30P mutant. The variations in helix content between mutants did not consistently match the effects of the mutations on aggregation rates in any region of the sequence, suggesting that the effects of the mutations on helix content are not significant for aggregation. The effects of the A30P mutation were most important in the region 17–39 at temperatures above the transition, consistent with observations linking depressed α-helical formation in residues 6–25 with reduced membrane binding (Fusco et al. 2014) and the reduction in membrane/vesicle binding observed in A30P (Diao et al. 2013; Jensen et al. 1998; Jo et al. 2002).

In conclusion, our simulations support the notion that the ability of the monomers to form extended β-sheet structures in the fibril core region may be intimately related to the rates of aggregation. This might suggest a mechanism where aggregates form via the interaction of long chains, so that the ability of the protein to form these structures upon interaction is important in fibril growth. Our work also lends support to a drug design strategy targeting those regions of the protein exhibiting increased β-strand content proportional to aggregation tendency. Targeting these regions for small-molecule binding and aggregation-inhibition could reduce the propensity of these regions to form β-strand secondary structures as a likely method of reducing the effects of α-synuclein aggregation in disease.