Introduction

Nuclear Overhauser effect (NOE) measurements yield the most important structural data that can be obtained from NMR with proteins. For the purpose of structure determination, NOEs are traditionally interpreted in a conservative way as loose upper distance bounds. This approach, which has been used successfully to determine more than 8,500 protein solution structures, implicitly takes into account that proteins are dynamic molecules and that NOEs do not fulfill the independent spin pair approximation, as well as the experimental difficulty of determining NOE rates with high accuracy. However, using imprecise upper distance bounds entails a significant loss of information. We have recently shown that this loss of information can largely be avoided by a quantitative determination of NOE rates, resulting in “exact” NOEs (eNOEs) that can yield distances with an accuracy better than 5 % (Vögeli et al. 2009, 2010). This made it possible to measure the temperature dependence of 1HN–1HN distances in ubiquitin (Leitz et al. 2011), and to elucidate motion in proteins by ensemble-based structure calculation on the basis of eNOEs (Orts et al. 2012; Vögeli et al. 2012, 2013).

Stereospecific assignments are important to fully exploit the potential of eNOEs, lest part of the increased accuracy be lost to account for the lack of stereospecific assignments. Stereospecific assignments are therefore more relevant in the context of eNOEs than with traditional upper distance bounds. Fortunately, the high accuracy of eNOEs also opens up new ways to determine stereospecific assignments, as will be shown in this work.

The standard NMR resonance assignment methods do not yield stereospecific assignments for diastereotopic groups. Thus, there have been a variety of studies on the impact of the presence, or absence, of stereospecific assignments on NMR structure determinations of proteins (Driscoll et al. 1989; Fletcher et al. 1996; Güntert 1998; Güntert et al. 1989; Havel 1991), and a variety of methods for determining stereospecific assignments, mostly from the early 1990s, including approaches based on stereospecific isotope labeling (Kainosho and Güntert 2009; Kainosho et al. 2006; Neri et al. 1989; Plevin et al. 2011; Senn et al. 1989), and computational algorithms based on systematic searches of the local conformation space (Güntert et al. 1989; Hyberts et al. 1987; Nilges et al. 1990; Polshakov et al. 1995; Tejero et al. 1999) or analyses of preliminary three-dimensional structures (Beckman et al. 1993; Folmer et al. 1997; Güntert et al. 1991a, b; Pristovšek and Franzoni 2006; Weber et al. 1988). Stereospecific assignment methods based on isotope labeling are reliable and are widely used for the methyl groups of valine and leucine (Senn et al. 1989), for which stereospecific assignments have the largest impact on the structure. The computational methods, on the other hand, have a certain potential for errors, especially when internal dynamics is present (Folmer et al. 1997; Havel 1991). For this reason, and because methods have been developed that reduce the loss of structural information in the absence of stereospecific assignments (Fletcher et al. 1996), the use of computational approaches for determining stereospecific assignments has decreased during the last decade.

In this paper we introduce a computational method based on eNOEs that can provide stereospecific assignments for a large number of methylene and isopropyl methyl groups in a straightforward and reliable way without need for additional experiments.

Materials and methods

NMR measurement and evaluation of eNOEs

NMR measurements of the protein GB3 were performed with 350 μl of a 4 mM uniformly 13C,15N-labeled protein solution in 97 % H2O, 3 % D2O, 50 mM potassium phosphate buffer, pH 6.5, and 0.5 mg/ml sodium azide on a Bruker 700 MHz spectrometer equipped with a triple resonance cryoprobe at 298 K. A series of 3D 15N- or 13C-resolved [1H,1H]-NOESY spectra with mixing times τ_m = 20, 30, 40, 50, and 60 ms was recorded for the measurement of NOE buildups. Cross-relaxation rates were extracted following the previously established protocol (Vögeli et al. 2010). Details and experimental data have been presented elsewhere (Vögeli et al. 2013).

A total of 823 distances were measured based on eNOEs. Of these, 324 were obtained from two pathways (two symmetrically related peaks in the spectrum) and were used as exact distance restraints (upper bound = lower bound), 481 were obtained from one pathway and were used with ±15 % distance error, and 18 were between two methyl groups and were used with ±20 % distance error (Vögeli et al. 2010). In addition, there were 61 NOEs with aromatics that were used conventionally with an upper distance bound of 8 Å. For comparison, NOEs were also interpreted in the traditional, semi-quantitative way, yielding 1,956 upper distance bounds. Of these, 1,041 were non-redundant conformation-restricting restraints. In addition to the NOE distance restraints, the NMR data for GB3 also comprised, in both cases, 54 torsion angle restraints obtained from 13Cα chemical shifts, 147 3JHNHα, 3JHNC′, and 3JHNCβ scalar coupling restraints, as well as 90 15N–1HN and 13C–1Hα residual dipolar coupling (RDC) restraints (Vögeli et al. 2012). The stereospecific assignments for βCH2 of amino acid residues 3, 5, 8, 22, 30, 35, 36, 37, 40, 43, 45, 46, 47, 52 and 54 were also confirmed independently using a set of scalar couplings and RDCs reported in the literature (Lian et al. 1992; Miclet et al. 2005; Vögeli et al. 2013).
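The three error classes quoted above can be illustrated with a minimal Python sketch. It assumes that a "±x % distance error" means symmetric lower and upper bounds around the measured distance; the category labels and the function name are hypothetical and are not taken from CYANA or from the published protocol.

```python
from dataclasses import dataclass


@dataclass
class DistanceRestraint:
    lower: float  # lower distance bound in Å
    upper: float  # upper distance bound in Å


# Relative tolerances as quoted in the text; the keys are hypothetical labels.
TOLERANCE = {
    "two_pathways": 0.00,   # exact restraint: upper bound = lower bound
    "one_pathway": 0.15,    # +/-15 % distance error
    "methyl_methyl": 0.20,  # +/-20 % distance error
}


def enoe_bounds(distance: float, category: str) -> DistanceRestraint:
    """Turn an eNOE-derived distance (Å) into lower/upper bounds.

    Assumption: the relative error is applied symmetrically around the distance.
    """
    tol = TOLERANCE[category]
    return DistanceRestraint(lower=distance * (1.0 - tol),
                             upper=distance * (1.0 + tol))
```

Under this reading, a 4.0 Å distance obtained from one pathway would be restrained to roughly 3.4–4.6 Å, whereas the same distance obtained from two pathways would be restrained to exactly 4.0 Å.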

Stereospecific assignment based on eNOEs

The original eNOE restraints are given arbitrary stereospecific assignments. Stereospecific assignments are determined by comparing the eNOE-derived distance restraints to structures that were calculated with the program CYANA in the absence of any stereospecific assignments using the same eNOE data, and possibly other conformational restraints such as torsion angle restraints, residual dipolar couplings, etc. The absence of stereospecific assignments is handled by symmetrizing the restraint list (Güntert et al. 1991a, b), using the CYANA command ‘distances modify’. In short, a pair of distance restraints d(A, B_1) < u_1 and d(A, B_2) < u_2 from an atom A to the two atoms B_1 and B_2 of a diastereotopic group is replaced by (in general) three restraints that are invariant under exchange of the stereospecific assignment, i.e. a restraint to a pseudoatom Q located centrally with respect to the positions of atoms B_1 and B_2, d(A, Q) < u_Q, and two restraints with identical upper bound u = max(u_1, u_2) for the individual distances, d(A, B_1) < u and d(A, B_2) < u (Güntert et al. 1991a). Structures are calculated using the standard torsion angle dynamics simulated annealing protocol of the program CYANA. Starting from 250 conformers with random torsion angles, 25,000 torsion angle dynamics steps were applied per conformer, and the 50 conformers with the lowest final target function values were selected for analysis. Structures obtained in this way are strictly independent of the arbitrary stereospecific assignments assumed in the input restraints.
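The symmetrization of a restraint pair described above can be sketched in a few lines of Python. This is an illustration only, not the implementation behind the CYANA command: the function and type names are hypothetical, and the pseudoatom bound u_Q is passed in by the caller because its derivation is not given in the text.

```python
from typing import List, NamedTuple


class UpperBound(NamedTuple):
    atom_a: str
    atom_b: str
    upper: float  # upper distance bound in Å


def symmetrize(atom_a: str, b1: str, b2: str, u1: float, u2: float,
               pseudo: str, u_q: float) -> List[UpperBound]:
    """Replace d(A,B1) < u1 and d(A,B2) < u2 by three restraints that do not
    change when the stereospecific assignment of B1 and B2 is exchanged.

    Assumption: u_q (bound to the pseudoatom Q) is supplied by the caller.
    """
    u = max(u1, u2)  # identical bound for both individual distances
    return [
        UpperBound(atom_a, pseudo, u_q),
        UpperBound(atom_a, b1, u),
        UpperBound(atom_a, b2, u),
    ]
```

Exchanging u1 and u2 in the call, which corresponds to reversing the arbitrary input assignment, returns the same three restraints. This is the invariance that makes the resulting structures independent of the input assignment.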

The following algorithm can also be applied to structures obtained in other ways, for example to an X-ray crystal structure. In this paper, this was done with the RDC-refined X-ray structure of GB3 (Derrick and Wigley 1994; Yao et al. 2008).

The algorithm then calculates for each diastereotopic group the weighted target function difference upon exchanging its stereospecific assignment, Δf = (f_R − f_I) × |f_R − f_I|/max(f_I, f_R), where f_I and f_R are the CYANA target function values with the stereospecific assignment of the group under consideration as in the input or reversed, respectively, calculated only for the distance restraints that involve the given diastereotopic group. This definition captures two ideas: a stereospecific assignment should be safer when the target function difference f_R − f_I is larger, and the difference is more significant when the relative difference between the two target function values, |f_R − f_I|/max(f_I, f_R), is larger. The two quantities are combined into a single formula by multiplying them. For instance, the stereospecific assignment of a diastereotopic group with f_I = 0 Å2 and f_R = 2 Å2, yielding Δf = 2 Å2, is considered more significant than one with f_I = 8 Å2 and f_R = 10 Å2, yielding Δf = 0.4 Å2. Since the stereospecific assignment of one diastereotopic group can in principle have an influence on the target function values f_I and f_R of another diastereotopic group, the optimal swapping of the entire set of all stereospecific assignments is iterated multiple times until no further change occurs for any of the diastereotopic groups. If there are multiple structures, the minimal absolute value of Δf over the ensemble is taken, and the maximal fraction q of conformers with either Δf ≥ 0 (i.e., the input stereospecific assignment is preferred) or Δf < 0 (the reversed stereospecific assignment is preferred) is computed. If the same stereospecific assignment consistently yields a lower target function value for all conformers of the structure ensemble, then q = 1; in all cases q ≥ 0.5, because the set of target function difference values is divided into two groups (Δf ≥ 0 or Δf < 0). A stereospecific assignment is considered reliable if Δf and q meet given thresholds, |Δf| ≥ Δf_min and q ≥ q_min.
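The scoring of a single diastereotopic group can be written as a short Python sketch. It follows the formula and the ensemble rules stated above, but the function names and the input layout (one (f_I, f_R) pair per conformer) are assumptions made for illustration. The handling of the case f_I = f_R = 0 is also an assumption, and the iteration over all groups until convergence is not shown.

```python
from typing import List, Tuple


def weighted_delta_f(f_input: float, f_reversed: float) -> float:
    """Weighted target function difference (f_R - f_I) * |f_R - f_I| / max(f_I, f_R)."""
    largest = max(f_input, f_reversed)
    if largest == 0.0:
        return 0.0  # assumption: no violations either way means no preference
    diff = f_reversed - f_input
    return diff * abs(diff) / largest


def assess_group(f_pairs: List[Tuple[float, float]],
                 delta_f_min: float, q_min: float):
    """Evaluate one group over an ensemble of (f_I, f_R) pairs, one per conformer.

    Returns (min |delta_f|, q, keep_input, reliable).
    """
    deltas = [weighted_delta_f(f_i, f_r) for f_i, f_r in f_pairs]
    n_input = sum(1 for d in deltas if d >= 0.0)  # input assignment preferred
    n_reversed = len(deltas) - n_input            # reversed assignment preferred
    q = max(n_input, n_reversed) / len(deltas)    # always >= 0.5
    min_abs = min(abs(d) for d in deltas)
    keep_input = n_input >= n_reversed
    reliable = min_abs >= delta_f_min and q >= q_min
    return min_abs, q, keep_input, reliable
```

With the two cases from the text, `weighted_delta_f(0.0, 2.0)` evaluates to 2 Å2 and `weighted_delta_f(8.0, 10.0)` to 0.4 Å2, so an equal absolute difference counts for less when both target function values are already large.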

The calculations of this paper were performed with Δf_min = 0.1 Å2 when using an input X-ray structure or Δf_min = 0.2 Å2 when using only the NMR data. In the latter case it was required that all conformers yielded a consistent stereospecific assignment, i.e. q_min = 1.

The algorithm can be applied to the side-chain NH2 groups of Asn and Gln in exactly the same way as to diastereotopic groups. In this paper, we therefore include these side-chain NH2 groups among the diastereotopic groups.

Structure calculations with three-state ensemble-averaged restraints

Structure calculations were performed with the ensemble-based structure determination protocol using ensemble-averaged distance restraints obtained from eNOE rates, as described recently (Vögeli et al. 2012, 2013). CYANA structure calculations were started from 100 conformers with random torsion angle values, simulated annealing with 50,000 torsion angle dynamics steps was applied, and the 20 conformers with the lowest final target function values were analyzed. For the ensemble-averaged calculations, three structural states of the entire protein were calculated simultaneously, excluding steric repulsion between atoms of different states, and applying the eNOE distance restraints to the r^−6 averages of the corresponding distances in the individual states. The absence of stereospecific assignments was handled as described above. Similarly, the 3J coupling restraints and the RDC restraints were applied to the arithmetic mean of the corresponding quantities in the individual states. Bundling restraints were applied in order to keep the individual structural states together in space as far as permitted by the experimental restraints. To this end, weak upper distance bounds of 1.2 Å were imposed on all distances between the same nitrogen and carbon atoms in different states. The weight of these bundling restraints was 100 times lower than for NOE upper distance bounds, except for the backbone atoms N, Cα, C′, and Cβ, for which a 10 times lower weight than for NOEs was used.
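The difference between r^−6 averaging and arithmetic averaging over the structural states can be made concrete with a small Python sketch. It assumes that the averaged quantity is converted back to an effective distance as (mean of r^−6)^(−1/6), which is the usual convention but is not spelled out in the text; the function name is hypothetical.

```python
from typing import Sequence


def r6_average(distances: Sequence[float]) -> float:
    """Effective distance (Å) from the r^-6 average over the structural states.

    Assumption: effective distance = (mean of r^-6) ** (-1/6).
    """
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)
```

For three states with distances of 3, 5, and 5 Å, the effective distance lies well below the arithmetic mean of about 4.3 Å, because the state with the shortest distance dominates the average.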

Results and discussion

The protein GB3 contains 75 diastereotopic groups. The eNOE distance restraint set (Vögeli et al. 2012) provides distance restraints for 47 diastereotopic groups (Table 1). No distance restraints are available for the remaining 28 diastereotopic groups.

Table 1 Stereospecific assignments for GB3 using eNOEs

Stereospecific assignments based on the RDC-refined X-ray structure

Using the RDC-refined X-ray structure (Derrick and Wigley 1994; Yao et al. 2008) and the eNOE data set for GB3 (Vögeli et al. 2012), our algorithm yielded stereospecific assignments for 45 out of the 47 diastereotopic groups for which the eNOE data set provided relevant information (Table 1). All 45 stereospecific assignments were in agreement with those reported earlier (Vögeli et al. 2012). There are significant differences with regard to the unambiguousness of the stereospecific assignments: 6 diastereotopic groups show a weighted target function difference upon exchanging the stereospecific assignment Δf > 10 Å2, 24 have 1 < Δf ≤ 10 Å2, and 15 have 0.1 < Δf ≤ 1 Å2. Two diastereotopic groups have Δf ≤ 0.1 Å2, and are therefore not stereospecifically assigned. The stereospecific assignments with Δf > 0.1 Å2 include 3 out of 3 Gly αCH2 groups, 24 out of 24 βCH2 groups, 6 out of 8 γCH2 groups, 3 out of 3 δCH2 groups, 5 out of 5 Val and Leu isopropyl (CH3)2 groups, and 4 out of 4 Asn and Gln side-chain NH2 groups, for which eNOE data is available (Fig. 1). This shows that eNOEs in conjunction with a high-resolution X-ray structure of GB3 enable our algorithm to determine unambiguous stereospecific assignments for the large majority of diastereotopic methylene and isopropyl methyl groups, as well as for the planar side-chain amide groups of Asn and Gln.

Fig. 1

Stereospecific assignments for GB3. Diastereotopic groups with correct and ambiguous stereospecific assignments are colored in green and cyan, respectively. The ribbon is colored in green or cyan if the majority of the stereospecific assignments of a residue is correct or ambiguous, respectively. a Stereospecific assignments determined using eNOEs and the X-ray structure, mapped onto the X-ray structure. b Same as a; structure rotated by 180° around a vertical axis. c Stereospecific assignments determined using eNOEs and the NMR structure bundle calculated without stereospecific assignments, mapped on the structure with the lowest target function value. d Same as c; structure rotated by 180° around a vertical axis. The 10 threonines and the 6 alanines are not shown, as they do not have diastereotopic groups. Spheres represent oxygen, nitrogen, or sulfur atoms

It should be noted that a comparable result could not be achieved with traditional, semi-quantitative NOE distance restraints. Applying the same algorithm to the conventional NMR data set of 1,956 upper distance bounds (and no lower distance bounds) yielded correct stereospecific assignments for only 13 diastereotopic groups, instead of 45 when using eNOEs.

Stereospecific assignments based on NMR data alone

In the absence of an input (e.g. X-ray) structure, a bundle of 50 conformers was generated with CYANA using the NMR data set for GB3 (Vögeli et al. 2012) after “stereo-symmetrization”, as described in the “Materials and methods” section. A larger number than the usual 20 NMR conformers was generated to increase the statistical significance and thus the reliability of the stereospecific assignment. Applying the present stereospecific assignment algorithm to the eNOE data set with these 50 NMR conformers yielded stereospecific assignments for 27 out of the 47 diastereotopic groups with relevant experimental data. The stereospecific assignments with Δf > 0.2 Å2 include 2 out of 3 Gly αCH2 groups, 16 out of 24 βCH2 groups, 2 out of 8 γCH2 groups, 0 out of 3 δCH2 groups (all Lys), 3 out of 5 Val and Leu isopropyl (CH3)2 groups, and 4 out of 4 Asn and Gln side-chain NH2 groups, for which eNOE data is available (Fig. 1).

The choice of the cutoff value Δf for the weighted target function difference is to some extent arbitrary. The numbers of correct/wrong stereospecific assignments vary with increasing Δf values (and q = 100 %) as follows: Δf = 0.0 Å2, 31 correct/2 wrong; Δf = 0.1 Å2, 27 correct/2 wrong; Δf = 0.2 Å2, 27 correct/0 wrong; Δf = 0.3 Å2, 27 correct/0 wrong; Δf = 0.4 Å2, 25 correct/0 wrong; Δf = 0.5 Å2, 25 correct/0 wrong. We have chosen the Δf cutoff value as the lowest “round” number that excluded any erroneous stereospecific assignments. However, as the above numbers show, one could also choose significantly higher (safer) cutoffs without losing a significant number of stereospecific assignments.
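The cutoff selection rule stated above can be reproduced with a short Python sketch that uses the correct/wrong counts quoted in this paragraph. The dictionary layout and the function name are hypothetical; the rule encoded is simply "lowest scanned cutoff with no wrong assignment".

```python
from typing import Dict, Optional, Tuple

# Cutoff (Å2) -> (correct, wrong) counts at q = 100 %, as quoted in the text.
SCAN: Dict[float, Tuple[int, int]] = {
    0.0: (31, 2),
    0.1: (27, 2),
    0.2: (27, 0),
    0.3: (27, 0),
    0.4: (25, 0),
    0.5: (25, 0),
}


def lowest_error_free_cutoff(scan: Dict[float, Tuple[int, int]]) -> Optional[float]:
    """Return the lowest scanned cutoff with zero wrong assignments, if any."""
    candidates = [cutoff for cutoff, (_, wrong) in scan.items() if wrong == 0]
    return min(candidates) if candidates else None


print(lowest_error_free_cutoff(SCAN))  # prints 0.2
```

Such a scan is only possible when reference assignments are available, as they are for GB3, so the resulting cutoff should be read as an empirical choice for this data set.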

There is a correlation between the solvent accessibility of a residue and its stereospecific assignments. All 10 diastereotopic groups with eNOE data in buried residues with <10 % solvent accessibility could be assigned correctly, whereas the stereospecific assignments remained ambiguous for 7 out of the 11 diastereotopic groups with eNOE data in the highly solvent exposed residues with more than 40 % solvent accessibility. Most of the 20 ambiguous stereospecific assignments occur in charged or hydrophilic residues, i.e. 17 in Lys, Asp, Glu, but only 3 in other residues (Gly, Val, Leu).

Comparing the results obtained with the X-ray structure and on the basis of the NMR data alone, it is apparent that in the clear-cut cases of stereospecific assignments with high Δf values, the latter are very similar with X-ray and NMR structures. In general the X-ray structure yields slightly higher Δf values. On the other hand, there are 10 diastereotopic groups with Δf > 0.5 Å2 when using the X-ray structure but insignificant Δf < 0.1 Å2 when using the NMR structure. The eNOE restraints and the NMR structure calculated from them without any assumptions on the stereospecific assignments can thus serve to determine many but not all of the stereospecific assignments that are possible by knowledge of a high-resolution X-ray structure. The eNOE restraints provide significantly more stereospecific assignments than the set of conventional semi-quantitative upper distance limits, which yield only 2 reliable stereospecific assignments with Δf > 0.2 Å2.

Using only the NMR data, the stereospecific assignments of 20 out of 47 relevant diastereotopic groups remain ambiguous. This may appear to be a significant number. However, it should be noted that of the total of 1,707 distance restraints in the eNOE data set, only 254 (15 %) involve atoms without stereospecific assignment. Thus, the present stereospecific assignment method serves its principal purpose well, namely to enable the accurate interpretation of the large majority of the eNOEs.

This finding is corroborated by comparing the results of CYANA structure calculations of three-state ensembles (Vögeli et al. 2012) using the 27 stereospecific assignments that can be made from NMR data alone with those obtained with complete stereospecific assignments (Fig. S1). Overall there is little difference between the two calculations, which exhibit very similar heavy atom RMSDs to the mean coordinates of 0.85 and 0.87 Å, respectively. The summaries of structure quality factors from the Protein Structure Validation Suite (PSVS; Bhattacharya et al. 2007) in Tables S1 and S2 also show similar values for the two structures.

It is conceivable that the extent of stereospecific assignments could be increased by an iterative procedure. In such a procedure, the stereospecific assignments determined by a first application of the algorithm to the structure obtained in the absence of any stereospecific assignments are used as input for the calculation of a new NMR structure bundle, the stereospecific assignment algorithm is then run again with this new bundle as input, and so on. We applied this approach for ten iterative cycles of NMR structure calculation and stereospecific assignment determination for GB3. The results showed that virtually no additional stereospecific assignments could be determined compared to the first, non-iterative cycle of the procedure, and that occasionally incorrect stereospecific assignments appeared in later cycles. This indicates that the stereospecific assignments of different diastereotopic groups are essentially independent of each other. We therefore conclude that it is sufficient and more reliable to run the structure calculation and the stereospecific assignment algorithm only once for a given eNOE data set.

Conclusions

In this paper we have presented an algorithm for the determination of stereospecific assignments on the basis of “exact” eNOEs. Application of the algorithm to the protein GB3 shows that a significant number of stereospecific assignments can be obtained, and that all these stereospecific assignments are in agreement with those determined earlier by other methods. The use of eNOEs is essential for this, as corresponding calculations with traditional, semi-quantitative NOE distance restraints resulted in far fewer stereospecific assignments. The stereospecific assignment algorithm is automatic and fast, requiring less than 1 s of CPU time on a laptop computer for GB3.

The GB3 protein sample used for this study was of exceptionally good quality in terms of sample concentration and stability. It is possible that for more demanding proteins the eNOE analysis could not be carried out to the same degree of completeness as for GB3, resulting in a smaller number of unambiguous stereospecific assignments. Nevertheless, the approach presented in this paper will remain valid. We have initiated eNOE measurements of several other proteins, including cyclophilin A, for which results will be reported in the future.

Stereospecific assignments are of particular importance for the optimal use of eNOE data, for example to elucidate motions in proteins (Vögeli et al. 2012). Much of the accuracy of the eNOE-based distance measurements is otherwise lost by corrections that have to be made to account for the absence of the stereospecific assignments (Fletcher et al. 1996; Güntert 1998). This effect is illustrated in Fig. S2 by the distributions of the χ1 torsion angle values in three-state ensembles of GB3 obtained from eNOEs with either no stereospecific assignments or complete stereospecific assignments. This figure clearly shows that stereospecific assignments for the βCH2 groups lead in many cases to significantly narrower χ1 distributions. The present stereospecific assignment method is therefore a crucial complement of the eNOE methodology (Orts et al. 2012; Vögeli et al. 2009, 2010, 2012, 2013).