Introduction

Intrinsically disordered proteins (IDPs) are abundant and important in biology (Dyson and Wright 2005). In structural studies of IDPs, NMR spectroscopy has proven a powerful technique that gives high resolution information about transiently formed structures (Dyson and Wright 2002). The simplest way to characterize transiently formed structure in IDPs is secondary chemical shift analysis of the peptide backbone atoms. Chemical shifts are attractive probes of transient structure as they can be measured with very high precision, which is required for detection of marginally populated structural elements in disordered proteins.

The backbone chemical shifts depend on the dihedral angles ψ and φ, and are thus affected by the secondary structure content of the peptide chain (Spera and Bax 1991). In folded proteins, secondary structural elements can be identified by calculating a chemical shift index (Wishart et al. 1992; Wishart and Sykes 1994). In disordered proteins, secondary structure is only formed transiently, and the chemical shift index is thus not appropriate, as it is primarily intended to identify fully formed secondary structure. Instead, IDPs are evaluated in terms of secondary chemical shifts, Δδ, that identify the location and the population of transiently structured regions. Identification of transiently structured regions is of biological interest, as these are frequently involved in molecular recognition (Mohan et al. 2006).

The secondary chemical shift is calculated for each atom as the difference between the experimentally determined shift and the corresponding random coil chemical shift. The choice of an appropriate set of random coil values is pivotal for calculation of reliable secondary chemical shifts. However, it is unclear how the appropriate values should be defined. Numerous datasets have been proposed by different groups using a variety of methods. The procedures for obtaining random coil chemical shifts can be grouped into two categories, depending on whether the chemical shifts are derived from a protein chemical shift database (Zhang et al. 2003; De Simone et al. 2009; Wishart et al. 1991; Peti et al. 2001) or a set of model peptides (Wishart et al. 1995a; Schwarzinger et al. 2000; Richarz and Wuthrich 1978; Bundi and Wuthrich 1979; Jimenez et al. 1986; Braun et al. 1994; Thanabal et al. 1994; Merutka et al. 1995; Plaxco et al. 1997; Bienkiewicz and Lumb 1999). Database methods collect chemical shift data from a large number of proteins recorded under a variety of experimental conditions. The large datasets in database methods allow good approximation of the average values, but the averaged chemical shifts are only appropriate if the protein of interest is investigated at conditions similar to the database average, which is usually poorly defined. Alternatively, random coil chemical shifts have been derived from short unstructured peptides. The datasets used for determination of peptide based random coil chemical shifts are much smaller than those underlying the database derived values. However, the experimental conditions can be tightly controlled, which makes small peptides optimal for investigation of the effects of environmental perturbations such as pH, denaturant concentration and temperature.
A widely used random coil peptide series has the sequence Ac-GGXGG-NH2, where the conformational freedom of the glycines combined with chemical denaturant ensure that the peptides are unstructured (Schwarzinger et al. 2000).
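As a minimal sketch of the calculation described above, the secondary chemical shift can be obtained by subtracting a random coil value from each measured shift. The function name, the residue selection and all shift values below are hypothetical placeholders chosen for illustration; they are not data from this study.

```python
# Hypothetical Cα random coil shifts (ppm) per residue type; placeholder values only.
RANDOM_COIL_CA = {"A": 52.5, "E": 56.5, "G": 45.0}


def secondary_shifts(sequence, observed, random_coil):
    """Return Δδ = δ_observed − δ_random_coil for each residue.

    Residues without a measured shift (None) are reported as None.
    """
    result = []
    for residue, shift in zip(sequence, observed):
        if shift is None:
            result.append(None)
        else:
            result.append(shift - random_coil[residue])
    return result


# Hypothetical observed Cα shifts (ppm) for a three-residue stretch.
delta = secondary_shifts("AEG", [53.1, 56.5, None], RANDOM_COIL_CA)
```

A positive Cα value in such a list would hint at helical population, but only if the random coil reference matches the sample conditions, which is the issue addressed in the remainder of this work.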

Sequential neighbors in the peptide chain affect the random coil chemical shifts (Schwarzinger et al. 2001), and if this is not taken into account, the neighbor effect could mask the transiently structured regions. One approach for eliminating the neighbor effects compares the chemical shifts to those of the same protein recorded under highly denaturing conditions (Modig et al. 2007). This procedure gives clean secondary chemical shifts for IDPs (Kjaergaard et al. 2010a). This approach is, however, expensive in terms of instrument time, sample consumption and analysis time and the denaturant may affect the ensemble of the disordered state. Alternatively, the sequence effects can be removed using a set of sequence correction factors. Like the random coil chemical shifts, the sequence correction factors can either be derived from database approaches (Wang and Jardetzky 2002; De Simone et al. 2009; Wang et al. 2007) or from peptide studies (Schwarzinger et al. 2001). An elegant approach for experimentally determining the sequence correction factors was proposed by Schwarzinger et al. (Schwarzinger et al. 2001). This method uses the chemical shifts of the glycines in the Ac-GGXGG-NH2 peptide series to calculate correction factors for the X residue. The underlying assumption of this method is thus that all residue types will experience a similar perturbation of the random coil chemical shift as glycine. The dataset reported by Schwarzinger et al. (Schwarzinger et al. 2000; Schwarzinger et al. 2001), however, was determined at pH 2.3 where Asp, Glu and His are fully protonated. When this dataset is used for analysis of chemical shifts recorded at neutral pH, these residues show large deviations in the secondary chemical shift profile suggesting that these random coil values and correction factors are inappropriate for IDPs investigated at neutral pH.

The quality of a protein NMR spectrum often depends on the sample conditions. Therefore, NMR investigations of proteins often start by finding the optimal experimental conditions in terms of salt, pH and temperature. For IDPs in particular, low temperatures are often attractive as the exchange rates of amide protons with the solvent are lower. In addition, the secondary structure content is higher at low temperature and transiently formed structures are thus easier to detect (Kjaergaard et al. 2010a). High temperatures, in contrast, are often used for molten globules to reduce peak broadening due to intermediate exchange (Eliezer et al. 1997; Ramboarina and Redfield 2003; Kjaergaard et al. 2010b). Random coil chemical shifts depend on the temperature and it is thus necessary to extrapolate random coil chemical shifts to the desired temperature. For this reason temperature coefficients have been reported for both 1H and 15N random coil chemical shifts (Merutka et al. 1995; Lam and Hsu 2003). The intrinsic temperature effect of the 13C chemical shifts has been assumed to be negligible.

Using the approach of Schwarzinger et al., we report a set of random coil chemical shifts and sequence correction factors recorded for the Ac-GGXGG-NH2 peptide series. The present dataset differs from that determined previously by being recorded at neutral pH and at a lower urea concentration. Furthermore, we have explored the effects of experimental conditions that are likely to be relevant for IDPs, namely the temperature dependence of the 13C random coil chemical shifts and the pH dependence of the histidine random coil chemical shifts. This dataset removes systematic errors in secondary chemical shift analyses due to mismatch between the experimental conditions and those used for determining the random coil chemical shifts. Although this dataset was determined with IDPs in mind, it should be equally suitable for folded proteins determined at neutral pH.

Materials and methods

Peptides with the sequence Ac-GGXGG-NH2 were purchased from KJ Ross-Petersen ApS (Klampenborg, Denmark), where X was each of the 20 common amino acids. All peptides were purified to more than 95% purity by reversed phase HPLC and their identities were confirmed by mass spectrometry. NMR samples were prepared by dissolving 2–3 mg of peptide in 500 μl 20 mM sodium phosphate buffer pH 6.5 containing 5% (v/v) D2O, 1 M urea, 3 mM NaN3, and 1 mM DSS. pH was adjusted to 6.5 by addition of small quantities of HCl. The cysteine containing peptide additionally contained 10 mM DTT to keep the thiol group in the reduced state.

All NMR spectra were acquired on a Varian Unity 800 MHz spectrometer equipped with a cold probe. Chemical shifts were referenced to internal DSS as described previously (Wishart et al. 1995b). Temperatures were measured by a thermocouple mounted in an NMR tube and inserted into the spectrometer probe. For each sample the following spectra were acquired at natural isotope abundance: 1H-15N HSQC, 1H-13C HSQC and a newly developed 1Hα-13C′ HSQC. For all peptides, data were recorded at 5, 15, 25, 35 and 45°C. The 1Hα-13C′ HSQC experiment correlates the Hα protons with the carbonyl resonances of the same and the preceding residue and will be discussed fully in a subsequent publication. In short, the 1Hα-13C′ HSQC experiment is a modified version of a previously described experiment (Kay et al. 1992), where the INEPT delay has been changed to 37 ms, and the hard pulses in the INEPT period have been replaced by a selective sech/tanh pulse (Silver et al. 1985) centered on carbonyl for 13C and a REBURP pulse (Geen and Freeman 1991) centered in the Hα region for 1H. NMR data were processed using NMRPipe (Delaglio et al. 1995) and analyzed using CCPNMR Analysis (Vranken et al. 2005). Temperature coefficients were determined in OpenOffice 3.2 by least squares fitting of the chemical shifts to a linear function of temperature (Eq. 1), where “a” is the temperature coefficient:

$$ \delta_{\text{rc}} ({\text{T}}) = \delta_{\text{rc}} (25^\circ {\text{C}}) + {\text{a}} \times ({\text{T}} - 25) $$
(1)
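The fit to Eq. 1 was carried out in a spreadsheet; the following sketch shows an equivalent ordinary least squares calculation in plain Python. The function name and the example data are assumptions made for illustration: the data are synthetic and exactly linear, not measured shifts.

```python
def fit_temperature_coefficient(temps_c, shifts_ppm, t_ref=25.0):
    """Least squares fit of δ(T) = δ(t_ref) + a × (T − t_ref).

    Returns (delta_ref, a), with a in ppm per degree.
    """
    n = len(temps_c)
    if n < 2:
        raise ValueError("need at least two temperatures")
    x = [t - t_ref for t in temps_c]  # temperature relative to the reference
    mean_x = sum(x) / n
    mean_y = sum(shifts_ppm) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, shifts_ppm))
    a = sxy / sxx                      # slope: temperature coefficient
    delta_ref = mean_y - a * mean_x    # intercept: shift at the reference temperature
    return delta_ref, a


# Synthetic example: a carbonyl-like shift with a = −0.008 ppm per degree.
temps = [5, 15, 25, 35, 45]
shifts = [176.0 - 0.008 * (t - 25) for t in temps]
delta_25, coeff = fit_temperature_coefficient(temps, shifts)
```

Centering the temperature axis on 25°C makes the intercept directly equal to the random coil shift at the reference temperature, which is the form used in Eq. 1.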

The sequence corrected random coil chemical shifts of a residue can thus be calculated at any temperature using the equation:

$$ \delta_{\text{rc}} ({\text{T}}) = \delta_{\text{rc}} (25^\circ {\text{C}}) + {\text{a}} \times ({\text{T}} - 25) + {\text{A}}_{{({\text{i}} + 2)}} + {\text{B}}_{{({\text{i}} + 1)}} + {\text{C}}_{{({\text{i}} - 1)}} + {\text{D}}_{{({\text{i}} - 2)}} $$
(2)

where B(i+1) is the B correction factor of the subsequent residue etc.
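Eq. 2 can be sketched in code as follows. This is an illustration under stated assumptions rather than a reference implementation: the function name, the dictionary layout and every numerical value are hypothetical, and neighbors that would lie beyond the chain termini are simply assumed to contribute nothing.

```python
def random_coil_shift(seq, i, rc25, temp_coeff, corr, temp_c):
    """Sequence and temperature corrected random coil shift of residue i (Eq. 2).

    seq        : one-letter sequence string
    i          : zero-based index of the residue of interest
    rc25       : dict, residue -> random coil shift at 25 °C (ppm)
    temp_coeff : dict, residue -> temperature coefficient a (ppm per degree)
    corr       : dict, residue -> dict with keys "A", "B", "C", "D" (ppm)
    temp_c     : temperature in °C
    """
    res = seq[i]
    shift = rc25[res] + temp_coeff[res] * (temp_c - 25.0)
    # Offsets follow Eq. 2: A from i+2, B from i+1, C from i−1, D from i−2.
    for factor, offset in (("A", 2), ("B", 1), ("C", -1), ("D", -2)):
        j = i + offset
        if 0 <= j < len(seq):  # assumption: no correction past the chain ends
            shift += corr[seq[j]][factor]
    return shift


# Hypothetical values for one nucleus type; not taken from Tables 1-3.
rc25 = {"G": 45.0, "A": 52.5}
temp_coeff = {"G": 0.0, "A": -0.002}
corr = {
    "G": {"A": 0.0, "B": 0.0, "C": 0.0, "D": 0.0},
    "A": {"A": 0.01, "B": -0.05, "C": 0.10, "D": 0.02},
}
value = random_coil_shift("GAGAG", 2, rc25, temp_coeff, corr, temp_c=5.0)
```

In this example the central glycine receives the B factor of the following alanine and the C factor of the preceding alanine, while the two terminal glycines contribute zero.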

For the histidine containing peptide, a titration series from pH 8 to pH 4 in steps of 0.5 was recorded by stepwise addition of HCl. The NMR spectra were recorded at 5°C to minimize exchange of the amide protons with the solvent and to prevent overlap between the histidine Hα and the water signal. The chemical shifts were assumed to be a linear combination of the chemical shifts of a fully protonated and a fully deprotonated species and thus follow the equation:

$$ \delta = \delta_{\text{A}} \times {\frac{{{\text{K}}_{\text{a}} }}{{10^{{ - {\text{pH}}}} + {\text{K}}_{\text{a}} }}} + \delta_{\text{HA}} \times \left( {1 - {\frac{{{\text{K}}_{\text{a}} }}{{10^{{ - {\text{pH}}}} + {\text{K}}_{\text{a}} }}}} \right) $$
(3)

δHA and δA represent the random coil chemical shifts of the fully protonated and fully deprotonated species, respectively. Ka is the acid dissociation constant of the side chain. The chemical shifts were fitted to Eq. 3 using IGOR PRO (WaveMetrics), where Ka was treated as a global fitting parameter.
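For reference, Eq. 3 can be related to the acid dissociation equilibrium in two steps. The symbol f_A (the fraction of the deprotonated species) and the relation pKa = −log10 Ka are introduced here only for this illustration. The derivation assumes the population weighted averaging of the two species stated above.

$$ {\text{K}}_{\text{a}} = \frac{10^{-{\text{pH}}} \, [{\text{A}}]}{[{\text{HA}}]} \quad \Rightarrow \quad f_{\text{A}} = \frac{[{\text{A}}]}{[{\text{A}}] + [{\text{HA}}]} = \frac{{\text{K}}_{\text{a}}}{10^{-{\text{pH}}} + {\text{K}}_{\text{a}}} = \frac{1}{1 + 10^{{\text{pK}}_{\text{a}} - {\text{pH}}}} $$

$$ \delta = \delta_{\text{A}} \times f_{\text{A}} + \delta_{\text{HA}} \times (1 - f_{\text{A}}) $$

At pH = pKa the two species are equally populated, so that f_A = 1/2 and the observed shift is the mean of δA and δHA.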

Results and discussion

Three spectra were recorded for each peptide: 1H-15N HSQC, 1H-13C HSQC and 1Hα-13C′ HSQC. The amide peaks were well resolved in the 1H-15N HSQC (Fig. 1A) and assignments were readily transferred from the previous study at low pH. The glycine signals were in the same region of the 1H-13C HSQC (Fig. 1B), but they could be separated to allow determination of the chemical shifts of each glycine. The 1Hα-13C′ HSQC spectrum correlates Hα(i) with 13C′(i) and 13C′(i − 1), which allows the glycine Hα resonances to be assigned by a sequential walk through the carbonyl resonances (Fig. 1C). The chemical shifts of glycine peaks are similar to those reported at low pH. The signals were weaker in the 1H-15N HSQC at 35°C and absent at 45°C due to exchange of the amide protons with the solvent. Accordingly, the 1H-15N HSQC spectrum was not recorded at 45°C for most peptides. In a few spectra, the chemical shifts could not be obtained due to overlap with the water signal or due to proton exchange. The temperature dependence could still be analyzed as chemical shifts from at least three temperatures were available for all nuclei with the amide resonances from histidine as the only exception. The chemical shifts depend linearly on temperature (Fig. 2) and were thus analyzed by least squares fitting to a linear function. Random coil chemical shifts and temperature coefficients for all 20 residues are reported in Table 1. Sequence correction terms for residue X were obtained using the method of Schwarzinger et al. (Schwarzinger et al. 2001) by subtracting the chemical shifts of the Ac-GGGGG-NH2 peptide from those of the Ac-GGXGG-NH2 peptide (Tables 2, 3). Glycine 1, 2, 4 and 5 give rise to correction factors A, B, C and D, respectively.
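The subtraction that yields the correction factors can be sketched as below. The function name and the shift values are hypothetical placeholders chosen to make the arithmetic easy to follow; the mapping of glycine positions to the factors A to D is the one described above.

```python
def correction_factors(ggxgg_gly_shifts, ggggg_gly_shifts):
    """Correction factors A-D for residue X from glycine chemical shifts.

    Both arguments map glycine position (1, 2, 4, 5) to a chemical shift in ppm
    for one nucleus type. Glycine 1, 2, 4 and 5 give A, B, C and D.
    """
    labels = {1: "A", 2: "B", 4: "C", 5: "D"}
    return {
        label: ggxgg_gly_shifts[pos] - ggggg_gly_shifts[pos]
        for pos, label in labels.items()
    }


# Placeholder shifts (ppm), chosen only to illustrate the subtraction.
reference = {1: 45.00, 2: 45.00, 4: 45.00, 5: 45.00}  # Ac-GGGGG-NH2
with_x = {1: 45.00, 2: 45.25, 4: 44.50, 5: 45.00}     # Ac-GGXGG-NH2
factors = correction_factors(with_x, reference)
```

With these placeholder numbers only the glycines adjacent to X are perturbed, giving non-zero B and C factors.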

Fig. 1

2D NMR spectra used for determination of chemical shifts for the glutamate containing peptide. a 1H-15N HSQC, b 1H-13C HSQC and c 1Hα-13C′ HSQC spectra recorded at pH 6.5 and 25°C on the peptide with the sequence Ac-GGEGG-NH2. * denotes aliased peaks from the N-terminal acetyl group and E3 Cγ-Hγ. The dashed line in (c) illustrates a sequential walk through the sequence, which forms the basis for sequential assignment of the peaks. The peaks from Hα of G4 are split due to prochirality induced by E3

Fig. 2

Temperature dependence of the random coil chemical shifts of glutamate at pH 6.5. Random coil chemical shifts from 25°C have been subtracted from all datasets to allow the different nuclei to be represented together. Lines represent the best linear fit from which temperature coefficients were extracted

Table 1 Random coil chemical shifts and temperature coefficients at pH 6.5, 1 M urea and 25°C
Table 2 Sequence correction factors for 13C at pH 6.5, 1 M urea and 25°C
Table 3 Sequence correction factors for 15N and 1H at pH 6.5, 1 M urea and 25°C

For all residues other than histidine, aspartate, and glutamate, the random coil chemical shifts obtained at pH 6.5 are very similar to those reported by Schwarzinger et al. (Schwarzinger et al. 2000) at pH 2.3 and 8 M urea. For 13C resonances, the difference in chemical shifts was below 0.2 ppm and in most cases it was much smaller. This shows that for residues that do not change protonation state, the 13C random coil chemical shifts are largely independent of pH and denaturant concentration. For the 15N chemical shifts, the deviations from the values reported by Schwarzinger et al. (2000) are larger. This difference is presumably due to the effect of urea rather than pH, as most residues do not change protonation state. For the 1H chemical shifts, the random coil chemical shifts are similar to the values determined at low pH and at a high concentration of urea. The sequence correction terms are almost identical to the low pH values, and even the residues deprotonated at pH 6.5 show only minor differences in the sequence correction factors.

The motivation for determining the random coil chemical shifts at a higher pH was to obtain accurate values for histidine, aspartate, and glutamate. Aspartate and glutamate have side chain pKa values of approximately 4, which means that the side chains will be fully protonated at the conditions where Schwarzinger et al. determined their shifts (pH 2.3), 10% protonated at the conditions of Wishart et al. (pH 5) and fully deprotonated at the conditions used here (pH 6.5). Consistent with the pH values, the 13C and 15N chemical shifts for aspartate and glutamate at pH 6.5 are significantly higher than those determined by Schwarzinger et al. and slightly higher than those reported by Wishart et al. due to the remaining protonation at pH 5. For the 1H chemical shifts, protonation does not change the chemical shifts in the same direction for all protons. Similarly to the 13C and 15N chemical shifts, the 1H chemical shifts are significantly different from those determined at pH 2.3 and slightly different from those determined at pH 5. In contrast to peptide based random coil chemical shifts, the effect of pH is more difficult to control in database based random coil values, where it is assumed that the chemical shift effect from differences in the pH values is averaged by using a large database. Averaging of the pH effects requires that pH induced changes at higher pH values are canceled by changes of opposite sign at lower pH. However, this will not be the case for the pH dependence of chemical shifts, as the chemical shifts are only affected when the pH is near the pKa value. When the average pH value in the database is near 6, the chemical shifts of aspartate and glutamate at pH values above 6 are identical to the value at pH 6, as the side chain is already fully deprotonated. Accordingly, the contributions from chemical shifts recorded at pH values below 6 are not canceled by averaging with values at higher pH values. 
Unless care is taken to eliminate datasets recorded at lower pH values, database methods are likely to result in random coil values representing partially protonated side chains of aspartate and glutamate. The distribution of pH values used for random coil databases suggests that this may be a problem for at least some datasets (De Simone et al. 2009). Random coil chemical shift values for aspartate and glutamate determined by database based approaches are generally much more similar to the values determined at pH 6.5 than to those determined at pH 2.3. However, the deviations are consistent with a small contribution from low pH datasets.
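The protonation levels quoted above, and the reason why averaging over a database does not cancel the pH effect, can be checked with a short calculation. The sketch below assumes a simple two-state titration with pKa = 4; the five-entry "database" of pH values is purely illustrative and does not describe any real database.

```python
def protonated_fraction(ph, pka):
    """Fraction of side chains carrying the proton at a given pH (two-state model)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PKA_ASP_GLU = 4.0  # approximate side chain pKa quoted in the text

# Conditions of Schwarzinger et al., Wishart et al. and the present study.
for ph in (2.3, 5.0, 6.5):
    print(f"pH {ph}: {100 * protonated_fraction(ph, PKA_ASP_GLU):.1f}% protonated")

# Illustrative database with one entry at each pH value (an assumption).
database_ph = [4.0, 5.0, 6.0, 7.0, 8.0]
mean_ph = sum(database_ph) / len(database_ph)
mean_fraction = sum(
    protonated_fraction(p, PKA_ASP_GLU) for p in database_ph
) / len(database_ph)
fraction_at_mean_ph = protonated_fraction(mean_ph, PKA_ASP_GLU)
```

The loop gives roughly 98%, 9% and 0.3% protonation at pH 2.3, 5 and 6.5. In the illustrative database the mean pH is 6, where about 1% of the side chains are protonated, yet the average protonated fraction over the entries is about 12%: the entries above pH 6 cannot compensate for those below it.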

Of the 20 common amino acids, only histidine has a side chain with a pKa value near neutral pH. The pH values used in protein NMR studies are highly variable, but are generally kept slightly below neutral pH to minimize proton exchange. Due to the variability in pH values, a set of generally applicable random coil chemical shifts needs to take the pH dependence of histidine into account. We have performed a pH titration using the Ac-GGHGG-NH2 peptide to address this problem. The random coil chemical shifts of this peptide were determined from pH 4 to 8, which is the range where protein NMR studies are usually carried out (Fig. 3). As expected, the chemical shifts of all nuclei change with a midpoint at pH 6.9, corresponding to the expected pKa of the histidine side chain. The data are fitted to Eq. 3 using Ka as a common parameter for all nuclei. This procedure extracts the chemical shifts of the fully protonated and the fully deprotonated residue (Table 4) and Ka = 1.2 × 10⁻⁷ M. Using these values and Eq. 3, the pH corrected random coil chemical shift for histidine at any desired pH can be calculated. A similar procedure can be applied to aspartate and glutamate residues, using the values reported by Schwarzinger et al. as the fully protonated state and the values reported here as the fully deprotonated state.
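A pH corrected histidine shift can be evaluated from Eq. 3 as sketched below. The value of Ka is the fitted one quoted above; the two endpoint shifts are hypothetical placeholders standing in for the entries of Table 4, and the function name is likewise an assumption.

```python
import math

KA_HIS = 1.2e-7  # fitted acid dissociation constant quoted in the text (M)


def ph_corrected_shift(ph, delta_ha, delta_a, ka=KA_HIS):
    """Eq. 3: population weighted average of the two protonation states."""
    frac_a = ka / (10.0 ** (-ph) + ka)  # fraction of deprotonated species
    return delta_a * frac_a + delta_ha * (1.0 - frac_a)


pka_his = -math.log10(KA_HIS)  # about 6.9, the titration midpoint

# Hypothetical endpoint shifts (ppm); real values would be taken from Table 4.
DELTA_HA, DELTA_A = 55.0, 56.0
shift_ph65 = ph_corrected_shift(6.5, DELTA_HA, DELTA_A)
shift_at_pka = ph_corrected_shift(pka_his, DELTA_HA, DELTA_A)
```

At pH 6.5 the result lies closer to the protonated endpoint, since the pH is below the pKa. The same function could in principle be reused for aspartate and glutamate by passing a Ka that corresponds to their side chain pKa.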

Fig. 3

pH dependence of the random coil chemical shifts of histidine. A pH titration was carried out for the Ac-GGHGG-NH2 peptide at 5°C. The random coil chemical shifts as a function of pH were fitted to Eq. 3 with Ka as a global fitting parameter. This gives the random coil chemical shifts of the fully protonated, δHA, and the fully deprotonated, δA, species. Data could not be collected for the amide at pH 8 due to hydrogen exchange with the solvent

Table 4 Histidine random coil chemical shifts of the fully protonated and deprotonated state at 5°C and 1 M urea

Random coil chemical shifts depend on the temperature. If the temperature dependence is not taken into account, a systematic bias is introduced into the secondary chemical shifts. The temperature dependence of 1H and 15N random coil chemical shifts has been addressed by others (Merutka et al. 1995; Lam and Hsu 2003) and our values resemble those reported previously. The temperature dependence of 13C random coil chemical shifts is relatively small and has not been addressed previously. In folded proteins with fully formed secondary structure, the temperature dependence of the 13C random coil chemical shifts is less important due to the small temperature coefficients relative to the secondary chemical shifts. In IDPs, however, the contribution from the temperature dependence of the random coil chemical shifts might be of the same magnitude as the secondary chemical shifts and consequently even minute contributions to the random coil chemical shifts need to be taken into account. Table 1 shows that C′ temperature coefficients are always negative and Cβ temperature coefficients are, with the exception of proline, positive. Histidine is an exception and will be discussed below. The consistent sign of the temperature coefficients means that if the temperature dependence is not taken into consideration, low temperature data will consistently overestimate the α-helix content and underestimate the degree of β-structure and vice versa. If random coil chemical shifts recorded at 25°C are used for analysis of NMR data recorded at 5°C, the magnitude of the error is up to 0.2 ppm, corresponding to a systematic error of 5–10% in the calculated secondary structure content. The Cα temperature coefficients are smaller than those of C′ and Cβ and can be positive or negative. Accordingly, when the temperature dependence of Cα chemical shifts is not corrected for, it is less likely to lead to a systematic error in quantification of transiently formed secondary structure elements. 
The temperature dependence of the Cα shifts will, however, still add noise to the secondary chemical shifts if the chemical shifts are recorded at a different temperature than the random coil chemical shifts.
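The size of the bias follows directly from Eq. 1. As a worked illustration, consider random coil values tabulated at 25°C applied to data recorded at 5°C:

$$ \delta_{\text{rc}} (5^\circ {\text{C}}) - \delta_{\text{rc}} (25^\circ {\text{C}}) = {\text{a}} \times (5 - 25) = - 20 \times {\text{a}} $$

A temperature coefficient of a = −0.01 ppm/°C, a value chosen here only because it reproduces the upper bound quoted above and not taken from Table 1, thus shifts the random coil value by +0.2 ppm. If this shift is ignored, the secondary chemical shift of that nucleus is overestimated by the same 0.2 ppm.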

With respect to temperature dependence of the chemical shift, histidine is a special case. A change of temperature results in a change in the pH and as a consequence a change in the protonation state and thus the random coil chemical shifts. Small temperature induced pH changes are, however, difficult to avoid. In order to minimize the temperature dependence of the pH value, we have used phosphate buffer whose pH is relatively insensitive to temperature (Dawson et al. 1986). Small pH changes can, however, not be avoided, and accordingly the positive temperature coefficients for C′ and the unusually large Cβ temperature coefficient for histidine may thus represent small temperature induced changes in the pH.

Intrinsically disordered proteins often undergo temperature dependent structural changes (Uversky 2009; Kjaergaard et al. 2010a; Kim et al. 2007; Wu et al. 2008; Hsu et al. 2009). The temperature dependence of chemical shifts in disordered proteins is the sum of two components: the chemical shift change induced by structural transitions, and the change caused by the temperature dependence of the random coil chemical shifts. The temperature dependence of the random coil chemical shifts thus complicates analysis of temperature induced structural changes in disordered proteins. Intrinsic random coil referencing solves this problem as the chemical shifts can be recorded at the same temperatures as the reference chemical shifts (Kjaergaard et al. 2010a). However, this approach requires reassignment of the protein under denaturing conditions at several temperatures. The tabulated temperature coefficients reported here allow analysis of temperature dependent structural changes by removing the intrinsic temperature dependence of the random coil chemical shifts. To compare the temperature coefficients to intrinsic random coil referencing, the chemical shift differences calculated for a 40°C temperature difference using Eq. 2 were compared to the differences recorded on the urea denatured state of ACTR between 5 and 45°C (Fig. 4). The chemical shifts of the urea denatured state vary more than the tabulated values, which is likely to result from the effects of the neighboring residues on the temperature coefficients. The agreement between the two methods for correcting for the intrinsic temperature dependence of the random coil chemical shifts is not complete, but both suggest that the intrinsic temperature effect has the same sign and magnitude. This shows that the tabulated temperature coefficients remove the systematic error from secondary chemical shift analyses. 
However, intrinsic random coil referencing is expected to perform better due to the contributions of neighboring residues to the temperature coefficients.

Fig. 4

Comparison of the difference in random coil chemical shifts between 5°C and 45°C obtained using intrinsic random coil referencing and peptide based temperature coefficients at pH 6.5. The temperature induced change in random coil chemical shifts between 5 and 45°C was calculated for the activation domain of ACTR based on the temperature coefficients in Table 1 (dashed line) for Cα (black) and C′ (red). For comparison, the temperature induced changes in the intrinsic random coil chemical shifts reported previously are displayed as a full line. The intrinsic random coil chemical shifts are measured in the presence of 6 M urea

The residue preceding a proline experiences an unusually large sequence effect (Schwarzinger et al. 2001). This most likely reflects the unusual φ/ψ distributions of these residues, where the population of the α-helical region of the Ramachandran plot is unusually low (Ting et al. 2010) due to steric clashes between the proline Cδ and the Cβ of the X-residue (MacArthur and Thornton 1991). Glycine does not have a Cβ atom and glycine derived correction factors are thus not able to correct for this phenomenon. The neighbor effect varies considerably depending on the nature of the X residue (Wishart et al. 1995a), reflecting the strong residue-type dependence of the φ/ψ distribution of the residue preceding proline (Ting et al. 2010). For this reason, we recommend using the amino acid specific correction factors determined previously for proline (Wishart et al. 1995a) instead of the glycine derived value presented here.

The effect of the neighboring residues on the random coil chemical shifts is often corrected using a set of correction factors derived from glycine rich peptides (Schwarzinger et al. 2001). The sequence correction factors for residue X are derived from the chemical shifts of the glycines in the Ac-GGXGG-NH2 peptide relative to those of the Ac-GGGGG-NH2 peptide. The underlying assumption of this approach is thus that all residues experience the same neighbor perturbation as glycine. Since the dataset reported here is determined under similar conditions to that reported previously by Wishart et al. (Wishart et al. 1995a), we can test this assumption. Figure 5 compares chemical shifts for the Ac-GGXAGG-NH2 peptide series obtained by experiment or calculated from the Ac-GGXGG-NH2 peptide series. In all cases, the correction factors improve agreement between the two peptide series. The predicted random coil chemical shifts are, however, consistently too low, consistent with the correction factors being too small. This suggests that the glycine-derived correction factors are able to correct only a part of the neighbor effect. Residue-type specific correction factors for the residue preceding proline suggest that glycine is an outlier with smaller correction factors than all other residues. Two different mechanisms can be envisioned for the neighbor effect. First, the different chemical structures of the amino acids affect the polarization of the chemical bonds, and thus the electron density around the nuclei. Presumably, this effect is similar for all residue types, and may thus be corrected using glycine-derived correction factors. Second, the neighboring residues may affect the φ/ψ distribution in the disordered state as was demonstrated using residual dipolar couplings of a set of disordered peptides (Dames et al. 2006). 
Glycine has an unusual φ/ψ distribution due to the absence of a side chain, and is thus not perturbed by the neighboring residues in the same way as other residue types. Correction factors derived from glycine residues are thus only able to correct for certain aspects of the neighbor effects.

Fig. 5

Comparison between chemical shifts for the Ac-GGXAGG-NH2 peptide series obtained experimentally at pH 5 (Wishart et al. 1995a) and predicted from the Ac-GGXGG-NH2 peptide series at pH 6.5 with (grey) and without (black) glycine derived correction factors. In all cases except for the carbonyl of glycine, glycine derived sequence correction factors improved the agreement between experimental and predicted chemical shifts. The predicted Cα and C′ chemical shifts show a small, but consistent, systematic deviation from the experimental chemical shifts

To compare the performance of the present data set to other random coil datasets, we compared the random coil values to a dataset consisting of chemical shift assignments of 14 IDPs. The dataset was assembled recently to determine random coil chemical shifts suitable for IDPs (Tamiola et al. 2010). As these proteins are highly disordered, the deviation from the experimental chemical shifts can be used to probe the suitability of random coil datasets for these proteins. Figure 6 shows the RMSD between the experimental chemical shifts and predicted random coil values for seven datasets. The datasets made specifically for IDPs have the smallest RMSD values suggesting the best predictive power. Compared to the present dataset, the database assembled by Tamiola et al. has smaller RMSDs for the amide nuclei. For the Hα, Cα, Cβ and C′ nuclei that are more frequently used for secondary chemical shift analysis, these two datasets perform similarly.
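The RMSD used for this comparison can be sketched as follows for a single nucleus type. The function name and the shift values are placeholders for illustration, and skipping residues that lack either value is an assumption about how missing assignments might be handled, not a description of the procedure used here.

```python
import math


def rmsd(experimental, predicted):
    """Root mean square deviation over residues where both values exist."""
    pairs = [
        (e, p)
        for e, p in zip(experimental, predicted)
        if e is not None and p is not None
    ]
    if not pairs:
        raise ValueError("no overlapping values")
    return math.sqrt(sum((e - p) ** 2 for e, p in pairs) / len(pairs))


# Placeholder shifts (ppm); None marks a missing experimental assignment.
exp_shifts = [56.5, 52.4, None, 45.3]
pred_shifts = [56.2, 52.8, 61.0, 45.3]
value = rmsd(exp_shifts, pred_shifts)
```

A lower value indicates that a random coil dataset predicts the shifts of the disordered proteins more closely, which is the criterion applied in Fig. 6.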

Fig. 6

Comparison of random coil dataset predictions and experimental chemical shifts reported for 14 IDPs. A dataset consisting of 14 IDPs with available NMR assignments was manually compiled as described previously (Tamiola et al. 2010). The RMSD between predicted and experimental chemical shifts was determined for each random coil dataset. The compared random coil values are Zhang et al. (2003), Schwarzinger et al. (2001), Wang and Jardetzky (2002), Wishart et al. (1995a), De Simone et al. (2009) and Tamiola et al. (2010)

Secondary chemical shifts were calculated for the activation domain (residues 1018–1088) of the intrinsically disordered ACTR using the dataset reported here and the six other random coil datasets compared above (Fig. 7). Chemical shifts recorded at 5°C were deliberately chosen to allow the effect of temperature correction to be seen. When the random coil chemical shifts recorded at pH 2.3 are used, spikes are seen at the positions of aspartate and glutamate residues. In datasets that do not correct for the effects of neighbors (Zhang et al. 2003), spikes are seen at the residues preceding proline. Both of these types of spikes disappear when sequence corrected random coil chemical shifts recorded at pH 6.5 are used as the reference. Relative to several of the other datasets, the values reported here appear to have smaller fluctuations in the secondary chemical shifts, consistent with the RMSD analysis discussed above. A transiently formed α-helix is observed from residue 1044 to 1054 using all random coil datasets. This region corresponds to the first α-helix in the complex between ACTR and its ligand, the nuclear coactivator binding domain from CBP (Demarest et al. 2002). This agrees with previous studies on ACTR (Kjaergaard et al. 2010a; Ebert et al. 2008), even though secondary chemical shift data are less noisy when intrinsic random coil referencing is used (Kjaergaard et al. 2010a). Due to the small magnitude of the secondary chemical shifts, it is primarily Cα and C′ that detect this transiently formed helix. For this reason, these two nuclei are usually the only nuclei for which secondary chemical shifts are reported for IDPs. The effect of temperature correction can be seen most clearly for the C′ and HN secondary chemical shifts, where the values are systematically lower when the temperature correction values are applied. This suggests that temperature correction of the random coil chemical shifts removes a systematic effect from the secondary chemical shifts. 
If this effect was not removed, it would cause a systematic bias resulting in overestimation of α-helical populations and underestimation of extended regions.

Fig. 7

Comparison of secondary chemical shifts for the intrinsically disordered activation domain of ACTR (residues 1018–1088) using 7 different reference random coil datasets. The spikes in the Zhang ′03 dataset are from the residues preceding proline whereas the spikes in the Schwarzinger dataset are due to side chain protonation at low pH. The six nuclei represented are a Cα, b Cβ, c C′, d N, e HN, and f Hα

Conclusion

We have measured a set of random coil chemical shifts for an Ac-GGXGG-NH2 peptide series including the 20 common amino acid residues of proteins at pH 6.5 and in the temperature range from 5 to 45°C. The dataset overcomes a problem inherent in previous peptide based random coil datasets recorded at acidic pH. The pH dependence problem is prone to persist in database derived random coil datasets due to incomplete averaging of pH dependent chemical shift differences amongst the datasets in the database. Temperature coefficients for the random coil chemical shifts were extracted, allowing extrapolation of the random coil values to the temperature at which the chemical shifts are determined. Temperature correction of random coil datasets will thus remove a systematic error from secondary chemical shifts caused by mismatch between the temperature of the random coil study and that of the protein sample of interest. The dataset reported here has been measured in order to improve the precision of secondary structure analysis of disordered proteins in particular, but it can also be used as a reference for secondary chemical shifts of folded proteins. Future studies will reveal whether peptide or database derived random coil chemical shifts perform better for IDPs. Even if the database values are eventually found to be better than peptide based values, experimental studies are still useful for understanding the effects of experimental conditions such as pH and temperature.