Introduction

The structure determination of biological macromolecules by NMR in solution relies primarily on distance restraints derived from cross peaks in NOESY spectra. A large number of assigned NOESY cross peaks are necessary to compute an accurate three-dimensional (3D) structure because many of the NOEs are short-range with respect to the sequence and thus carry little information about the tertiary structure and because NOEs are generally interpreted as loose upper bounds in order to implicitly take into account internal motions and spin diffusion [although, in principle, accurate distance measurements are possible with NOEs (Vögeli et al. 2012, 2009)]. Obtaining a comprehensive set of distance restraints from NOESY spectra is in practice not straightforward. The sheer amount of data, as well as resonance and peak overlap, spectral artifacts and noise, and the absence of expected signals because of fast relaxation turn interactive NOESY cross peak assignment into a laborious and error-prone task. Therefore, the development of computer algorithms for automating this often most time-consuming step of a protein structure determination by NMR has been pursued intensely and reviewed extensively (Altieri and Byrd 2004; Baran et al. 2004; Billeter et al. 2008; Gronwald and Kalbitzer 2004; Guerry and Herrmann 2011; Güntert 1998, 2003, 2009; Moseley and Montelione 1999; Williamson and Craven 2009). Besides semi-automatic approaches (Duggan et al. 2001; Güntert et al. 1993; Meadows et al. 1994), several algorithms have been developed for the automated analysis of NOESY spectra given the chemical shift assignments, namely NOAH (Mumenthaler and Braun 1995; Mumenthaler et al. 1997), ARIA (Nilges et al. 1997; Rieping et al. 2007), ASDP (Huang et al. 2006), KNOWNOE (Gronwald et al. 2002), CANDID (Herrmann et al. 2002a), PASD (Kuszewski et al. 2004), AutoNOE-Rosetta (Zhang et al. 2014), and a Bayesian approach (Hung and Samudrala 2006). 
Automated NOESY peak picking has been integrated into the method (Herrmann et al. 2002b). Automated NOESY assignment can be combined with automated sequence-specific resonance assignment with the Garant (Bartels et al. 1997) or FLYA (Schmidt and Güntert 2012) algorithms in order to perform a complete NMR structure determination without manual interventions (López-Méndez and Güntert 2006). In favorable cases, this can even be achieved using exclusively experimental data from NOESY spectra (Ikeya et al. 2011; Schmidt and Güntert 2013).

The fundamental problem of NOESY assignment is the ambiguity of cross peak assignments. Assigning based solely on the match between cross peak positions and the chemical shift values of candidate resonances does not, in general, yield a sufficient number of unambiguously assigned distance restraints to obtain a structure (Mumenthaler et al. 1997). Ambiguous distance restraints make it possible to also use NOEs with multiple assignment possibilities in a structure calculation (Nilges 1995). Nevertheless, additional criteria have to be applied to resolve these ambiguities, such as using secondary structure information (Huang et al. 2006) or a preliminary structure that is refined iteratively in cycles of NOE assignment and structure calculation (Mumenthaler and Braun 1995). The CANDID automated NOESY assignment method introduced the concepts of network anchoring to reduce the initial ambiguity of NOE assignments and constraint combination to reduce the impact of erroneous restraints (Herrmann et al. 2002a). In CYANA, the conditions applied by CANDID for valid NOE assignments have been reformulated in a probabilistic framework that is conceptually more consistent and better able to handle situations of high chemical shift-based ambiguity of the NOE assignments (Güntert 2004, 2009).

The aforementioned approaches can go wrong in two ways, especially with low-quality input data. In failures of the first kind, the algorithm never assigns enough NOE distance restraints to obtain a defined structure. This outcome, manifested by a divergent structure bundle with a high RMSD, is unfortunate but straightforward to detect. More problematic are failures of the second kind, where the algorithm, possibly gradually over several cycles, discards part of the NOE cross peaks (by leaving them unassigned) and selects a self-consistent but incomplete subset of the data to compute a well-defined but erroneous structure, i.e. a tight bundle of conformers with low RMSD to its mean coordinates that, however, differs significantly from the (unknown) correct structure of the protein. If this outcome goes unnoticed, it may result in the publication or PDB deposition of erroneous structures that cannot be detected easily by coordinate-based validation tools (Nabuurs et al. 2006).

Given the widespread use of automated NOESY assignment algorithms (Guerry and Herrmann 2011; Williamson and Craven 2009), it is important to give criteria for their safe application (Herrmann et al. 2002a) and to assess their reliability. It is known that the CANDID algorithm generally requires a high degree of completeness of the backbone and side chain chemical shift assignments (Jee and Güntert 2003). Recently, the CASD-NMR initiative (Rosato et al. 2009) has evaluated several NMR structure determination methods by blind testing. Using high-quality data sets of small proteins from a structural genomics project, it was found that the NOESY-based methods included in the test yielded structures with an accuracy of 2 Å RMSD or better to the subsequently released reference structures (Rosato et al. 2012). However, the situation is less clear for more difficult cases, in which the resonance assignments may be incomplete, spectral crowding, overlap, and low signal-to-noise ratios prevent collecting a “complete” set of NOESY cross peaks, or the lack of isotope labeling may preclude the use of the intrinsically less ambiguous 3D and 4D NOESY spectra. Further complications may arise with symmetric multimers or solid-state NMR data. In this paper, we address these questions by an extensive, systematic analysis of the combined automated NOESY assignment and structure calculation algorithm in CYANA under a variety of conditions mimicking data imperfections that may occur with challenging systems.

Materials and methods

Combined automated NOE assignment and structure calculation algorithm

The algorithm for automated NOE assignment in CYANA (Güntert 2004, 2009) is a re-implementation of principles of the former CANDID procedure (Herrmann et al. 2002a) on the basis of a probabilistic treatment of the NOE assignment process. The key features of the algorithm are network anchoring to reduce the initial ambiguity of NOESY peak assignments, ambiguous distance restraints to generate conformational restraints from NOESY cross peaks with multiple possible assignments, and constraint combination to minimize the impact of erroneous distance restraints on the structure. Automated NOE assignment and the structure calculation are combined in an iterative process that comprises, typically, seven cycles of automated NOE assignment and structure calculation, followed by a final structure calculation using only unambiguously assigned distance restraints. Between subsequent cycles, information is transferred exclusively through the intermediary 3D structures. The molecular structure obtained in a given cycle is used to guide the NOE assignments in the following cycle. Otherwise, the same input data are used for all cycles, that is the amino acid sequence of the protein, one or several chemical shift lists from the sequence-specific resonance assignment, and one or several lists containing the positions and volumes of cross peaks in 2D, 3D, or 4D NOESY spectra. The input may further include previously assigned NOE upper distance bounds or other previously assigned conformational restraints for the structure calculation.

In each cycle, first all assignment possibilities of a peak are generated on the basis of the chemical shift values that match the peak position within given tolerance values, and the quality of the fit between the atomic chemical shifts and the peak position is expressed by a Gaussian probability, \( P_{\text{shifts}} \). Second, the probability \( P_{\text{structure}} \) for agreement with the preliminary structure from the preceding cycle (if available), represented by a bundle of conformers, is computed as the fraction of the conformers in which the corresponding distance is shorter than the upper distance bound plus the acceptable distance restraint violation cutoff. Assignment possibilities for which the product of these two probabilities is below the required probability threshold are discarded. Third, each remaining assignment possibility is evaluated for its network anchoring, i.e., its embedding in the network formed by the assignment possibilities of all the other peaks and the covalently constrained short-range distances. The network anchoring probability \( P_{\text{network}} \) that the distance corresponding to an assignment possibility is shorter than the upper distance bound plus the acceptable violation is computed given the assignments of the other peaks but independent from knowledge of the three-dimensional structure. Contributions to the network anchoring probability for a given “current” assignment possibility result from other peaks with the same assignment (e.g. transposed peaks), from pairs of peaks that connect indirectly the two atoms of the current assignment possibility via a third atom, and from peaks that connect an atom in the vicinity of the first atom of the current assignment with an atom in the vicinity of the second atom of the current assignment. Short-range distances that are constrained by the covalent geometry can, for network anchoring, take the same role as an unambiguously assigned NOE. Individual contributions to the network anchoring of the current assignment possibility are expressed as probabilities, \( P_{1}, P_{2}, \ldots, \) that the distance corresponding to the current assignment possibility satisfies the upper distance bound. The network anchoring probability is obtained from the individual probabilities as \( P_{\text{network}} = 1 - (1 - P_{1})(1 - P_{2}) \cdots, \) which is never smaller than the highest probability of an individual network anchoring contribution. Only assignment possibilities for which the product of the three probabilities is above a threshold, \( P_{\text{tot}} = P_{\text{shifts}} P_{\text{network}} P_{\text{structure}} \ge P_{\min} \), are accepted. Cross peaks with a single accepted assignment yield a conventional unambiguous distance restraint. Cross peaks with multiple accepted assignments result in an ambiguous distance restraint.
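
The acceptance test described above can be illustrated with a minimal Python sketch. It is not the CYANA implementation: the function names, the Gaussian width (taken here as half the tolerance), and the hard cutoff at the tolerance are assumptions made for illustration only. Only the fraction-of-conformers definition of the structure probability, the product formula for the network anchoring probability, and the threshold on the total probability follow the text.

```python
import math

def p_shifts(delta_ppm, tolerance_ppm):
    """Gaussian-shaped score for the mismatch between a chemical shift and
    a peak position in one dimension. The width (tolerance / 2) and the
    cutoff at the tolerance are assumptions, not values from the text."""
    if abs(delta_ppm) > tolerance_ppm:
        return 0.0
    sigma = tolerance_ppm / 2.0
    return math.exp(-0.5 * (delta_ppm / sigma) ** 2)

def p_structure(distances, upper_bound, violation_cutoff):
    """Fraction of conformers in which the distance is shorter than the
    upper distance bound plus the acceptable violation cutoff."""
    limit = upper_bound + violation_cutoff
    return sum(d < limit for d in distances) / len(distances)

def p_network(contributions):
    """Network anchoring probability 1 - (1 - P1)(1 - P2)..."""
    remaining = 1.0
    for p in contributions:
        remaining *= 1.0 - p
    return 1.0 - remaining

def accept(p_sh, p_net, p_str, p_min):
    """Accept an assignment possibility if the product of the three
    probabilities reaches the threshold p_min."""
    return p_sh * p_net * p_str >= p_min
```

For example, two network anchoring contributions of 0.5 each combine to 0.75, which is never smaller than the largest individual contribution.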

Spurious distance restraints may arise from the misinterpretation of noise and spectral artifacts, in particular at the outset of a structure determination before 3D structure-based filtering of the restraint assignments can be applied. CYANA uses “constraint combination” (Herrmann et al. 2002a) to reduce structural distortions from erroneous distance restraints. Medium-range and long-range distance restraints are incorporated into “combined distance restraints”, which are ambiguous distance restraints with assignments taken from different, in general unrelated, cross peaks. A basic property of ambiguous distance restraints is that the restraint will be fulfilled by the correct structure whenever at least one of its assignments is correct, regardless of the presence of additional, erroneous assignments. This implies that such combined restraints have a lower probability of being erroneous than the corresponding original restraints, provided that the fraction of erroneous original restraints is smaller than 50 %. Constraint combination aims at minimizing the impact of erroneous NOE assignments on the resulting structure at the expense of a temporary loss of information. It is applied to medium- and long-range distance restraints in, by default, the first two cycles of combined automated NOE assignment and structure calculation with CYANA.
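
The benefit of combining restraints can be made concrete with a deliberately simplified Python sketch. It assumes that errors in the original restraints are statistically independent and that a combined restraint merges `n_combined` original restraints; both are illustrative assumptions, and the sketch does not reproduce the specific combination patterns used in CYANA. It relies only on the property stated above, that an ambiguous restraint is fulfilled by the correct structure whenever at least one of its assignments is correct.

```python
def combined_restraint_is_valid(assignment_is_correct):
    """An ambiguous (combined) restraint is fulfilled by the correct
    structure if at least one of its assignments is correct."""
    return any(assignment_is_correct)

def combined_error_probability(p_err, n_combined=2):
    """Probability that a combined restraint is erroneous, i.e. that all
    n_combined original restraints are wrong. Assumes independent errors
    (an assumption of this sketch, not a statement from the text)."""
    return p_err ** n_combined

if __name__ == "__main__":
    for p in (0.1, 0.2, 0.4):
        print(f"original error rate {p:.2f} -> combined {combined_error_probability(p):.3f}")
```

Under these assumptions, a 20 % error rate among the original restraints drops to about 4 % for pairwise combined restraints, at the price of each combined restraint being less restrictive.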

The distance restraints are then included in the input for the structure calculation with simulated annealing by the fast CYANA torsion angle dynamics algorithm (Güntert et al. 1997). A complete structure calculation with automated NOESY assignment typically comprises seven cycles. The second and subsequent cycles differ from the first cycle by the use of additional selection criteria for cross peaks and NOE assignments that are based on assessments relative to the protein 3D structure from the preceding cycle. The precision of the structure determination normally improves with each subsequent cycle. Accordingly, the cutoff for acceptable distance restraint violations in the calculation of \( P_{\text{structure}} \) is tightened from cycle to cycle. In the final cycle, an additional filtering step ensures that all NOEs either have unique assignments to a single pair of hydrogen atoms or are eliminated from the input for the structure calculation. This facilitates the use of subsequent refinement and analysis programs that cannot handle ambiguous distance restraints.

Experimental NMR data sets

The performance of CYANA was assessed on the basis of the NMR structure bundles of ten proteins to which we refer in this paper by the four-letter acronyms given in Table 1: copz, the copper chaperone CopZ of Enterococcus hirae (Wimmer et al. 1999); cprp, the chicken prion protein fragment 128–242 (Calzolai et al. 2005); enth, the ENTH-VHS domain At3g16270 from Arabidopsis thaliana (López-Méndez and Güntert 2006; López-Méndez et al. 2004); fsh2, the Src homology two domain from the human feline sarcoma oncogene Fes (Scott et al. 2004, 2005); fspo, the F-spondin TSR domain 4 (Pääkkönen et al. 2006); pbpa, the Bombyx mori pheromone binding protein (Horst et al. 2001); rhod, the rhodanese homology domain At4g01050 from Arabidopsis thaliana (Pantoja-Uceda et al. 2005, 2004); wmkt, the Williopsis mrakii killer toxin (Antuch et al. 1996); scam, stereo-array isotope labeled (SAIL) calmodulin (Kainosho et al. 2006); ww2d, the second WW domain from mouse salvador homolog 1 protein (Ohnishi et al. 2007).

Table 1 Overview of proteins and data sets used in this study

The proteins copz, cprp, enth, fsh2, pbpa, rhod and wmkt have a well-defined single-domain structure. The protein fspo has an unusual, less well-defined structure without regular secondary structure. The protein scam has two flexibly connected domains. The protein ww2d forms a symmetric dimer. For the original structure determinations the proteins were uniformly labeled with 13C and 15N, except for copz, which was only 15N labeled, wmkt, which was unlabeled, and scam, which was stereo-array isotope labeled (Kainosho et al. 2006). The completeness of the resonance assignments and the type and amount of NOESY data are summarized in Table 1.

For most proteins the unassigned NOESY peak lists were the only source of conformational restraints. Exceptions are cprp and pbpa, whose data sets included 123 and 148 ϕ/ψ torsion angle restraints derived from Cα chemical shifts (Luginbühl et al. 1995), respectively, and ww2d, whose data set included 44 ϕ/ψ torsion angle restraints from TALOS (Cornilescu et al. 1999). In the data set of cprp the assignments of 18 NOESY cross peaks were kept fixed, as in the original structure determination (Calzolai et al. 2005). Disulfide bonds were restrained in cprp, fspo, pbpa, and wmkt. In scam the distances between the four calcium ions and their 16 ligands were restrained to the range 1.7–2.8 Å. No hydrogen bond restraints or other additional restraints were used.

The original experimental data sets were used to determine a reference structure for each protein using the same computational schedule as for the subsequent calculations with modified data. Seven cycles of combined automated NOESY assignment and structure calculation were performed, followed by a final structure calculation. In each cycle, structure calculations were started from 100 conformers with random values of the torsion angles, to which the standard CYANA simulated annealing schedule was applied with 10,000 torsion angle dynamics steps per conformer. The 20 conformers with the lowest final target function values were selected for analysis and are shown in Fig. 1.

Fig. 1

Bundle representations of the ten proteins included in the present study (see Table 1). Secondary structure elements are highlighted in purple (α-helix) and yellow (β-sheet). Atomic coordinates originate from the Protein Data Bank (PDB) entries 1CPZ (copz), 1U3M (cprp), 1VDY (enth), 1WQU (fsh2), 1VEX (fspo), 1GM0 (pbpa), 1VEE (rhod), 1X02 (scam), 1WKT (wmkt), and 2DWV (ww2d). Two separate superpositions are presented for the two-domain protein scam

Modified input data sets

The experimental input data sets were modified in 14 different ways to mimic different kinds of data imperfections. All random data modifications were applied five times using different random numbers resulting in a total of 397 different data sets for each protein including the respective complete data set.

  1. Missing chemical shift assignments

    A given percentage P between 0 and 40 % of randomly selected 1H chemical shift assignments was deleted. Experimental NOESY peak lists were not changed.

    (a) Random shift deletion: The shifts to be deleted were chosen randomly among all assigned 1H chemical shifts.

    (b) Deletion of side chain chemical shifts: The shifts to be deleted were chosen randomly among all side-chain 1H chemical shift assignments.

    (c) Deletion of “important” chemical shift assignments: The shifts to be deleted were chosen among all assigned 1H chemical shifts, but “important” shifts were deleted with higher probability. Importance was defined according to the number of NOEs in the reference calculation that involve a given atom. Chemical shifts were divided into eleven classes occurring in 0–1, 2–3, 4–5,…, and ≥20 peaks, with class indices i = 0, 2, 4,…, 20. Chemical shifts from class i were deleted with relative deletion probability \( p_{i} = 1/(21 - i) \), resulting in higher deletion probabilities for more important chemical shifts.

    (d) Deletion of “unimportant” chemical shift assignments: As in (c), but “unimportant” 1H shifts were deleted preferentially. Chemical shifts from class i were deleted with relative deletion probability \( p_{i} = 1/(i + 1) \).

  2. Erroneous chemical shift assignments

    A given percentage P between 0 and 40 % of randomly selected assigned 1H chemical shift values was modified. Experimental NOESY peak lists were not changed.

    (e) Random new chemical shift values: The selected chemical shifts were set to randomly chosen values within fifteen times the assignment tolerance for a given atom.

    (f) Chemical shift permutations: Each selected chemical shift value was replaced with the chemical shift value of another atom from the set of selected atoms. Only atoms with a chemical shift value within 2.5 times the standard deviation of the corresponding chemical shift distribution from the BMRB were used for replacement.

    (g) Permuted locally with other chemical shifts: As in (f), but only atoms from the same or directly neighboring amino acid residues were used for replacement.

  3. Missing NOESY peaks

    A given percentage P between 0 and 75 % of the NOESY peaks was deleted. Chemical shift lists were not changed.

    (h) Random peak deletion: The peaks to be deleted were chosen randomly.

    (i) Deletion of weak peaks: The weakest peaks were (non-randomly) deleted.

  4. Inaccurate NOESY peaks

    The positions or volumes of all NOESY peaks were distorted. Chemical shift lists were not changed.

    (j) Inaccurate peak positions: Peak positions were modified in all spectral dimensions by adding a random number from a normal distribution with mean 0 and standard deviation equal to the corresponding assignment tolerance times a varying percentage P between 0 and 100 %.

    (k) Inaccurate peak volumes: Peak volumes were multiplied by a normally distributed random number with mean 1 and standard deviation P between 0 and 150 %.

  5. Projection to two dimensions

    (l) NOESY peak lists of all data sets were reduced to the two proton dimensions.

  6. Increased chemical shift tolerances

    (m) Chemical shift tolerance for NOESY peak assignment was increased from the standard value of 0.03 ppm to 0.04, 0.05, 0.06, 0.08, and 0.1 ppm for 1H, and proportionally from 0.5 ppm to 0.67, 0.83, 1.0, 1.33, and 1.67 ppm for 15N and 13C. Chemical shift lists and NOESY peak lists were not changed.

  7. Increased number of random starting structures and annealing steps

    (n) The calculations with randomly deleted chemical shifts of modification (a) were repeated with 200 instead of 100 random starting structures and 20,000 instead of 10,000 torsion angle dynamics steps during the simulated annealing protocol.
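
Some of the modifications listed above can be sketched in a few lines of Python. The sketch below illustrates the importance-weighted shift deletion of modifications (c) and (d) and the peak distortions of modifications (j) and (k). It is not the script used for this study: all function names and the data layout (a dictionary mapping atom names to NOE counts) are hypothetical, and the sequential weighted draw without replacement is only one possible reading of “relative deletion probability”.

```python
import random

def importance_class(n_peaks):
    """Class index i = 0, 2, ..., 20 for shifts occurring in
    0-1, 2-3, ..., >=20 peaks of the reference calculation."""
    return min(20, 2 * (n_peaks // 2))

def deletion_weight(n_peaks, prefer_important=True):
    """Relative deletion probability: 1/(21 - i) for modification (c),
    1/(i + 1) for modification (d)."""
    i = importance_class(n_peaks)
    return 1.0 / (21 - i) if prefer_important else 1.0 / (i + 1)

def select_shifts_to_delete(peak_counts, fraction, prefer_important=True, rng=None):
    """Draw round(fraction * N) atoms without replacement, one at a time,
    with probabilities proportional to their deletion weights (assumed scheme)."""
    rng = rng or random.Random()
    remaining = dict(peak_counts)
    n_delete = round(fraction * len(remaining))
    deleted = set()
    while len(deleted) < n_delete:
        atoms = list(remaining)
        weights = [deletion_weight(remaining[a], prefer_important) for a in atoms]
        pick = rng.choices(atoms, weights=weights, k=1)[0]
        deleted.add(pick)
        del remaining[pick]
    return deleted

def distort_position(position_ppm, tolerance_ppm, percent, rng):
    """Modification (j): add Gaussian noise with standard deviation
    equal to the assignment tolerance times percent / 100."""
    return position_ppm + rng.gauss(0.0, tolerance_ppm * percent / 100.0)

def distort_volume(volume, percent, rng):
    """Modification (k): multiply by a Gaussian factor with mean 1 and
    standard deviation percent / 100. The factor can become negative for
    large percentages; how such cases were treated is not stated in the text."""
    return volume * rng.gauss(1.0, percent / 100.0)
```

With `prefer_important=True`, a shift involved in 20 or more NOEs is 21 times more likely to be drawn in a given step than a shift involved in at most one NOE.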

Structure calculations

Automated NOESY peak assignment was performed with a chemical shift tolerance of 0.03 ppm for 1H and 0.5 ppm for heavy atoms [except for modifications (m), see above]. Twenty independent structure calculation runs starting from different random structures were performed for each data set of each protein. Each of these structure calculations [except for modification (n), see above] started from 100 random conformers to which the standard CYANA simulated annealing protocol with 10,000 torsion angle dynamics steps was applied, and the 20 conformers with lowest target function values were chosen for the final structure bundle.

Analysis of results

For each protein, the solution NMR structure calculated from the complete data set was used as the reference structure (Fig. 1). The accuracy of a structure was measured by the RMSD bias (Güntert 1998), i.e. the backbone RMSD between the average structure of a given calculation and the average structure of the reference. The average structure of a structure bundle was obtained by optimally superimposing its individual conformers for minimal backbone RMSD of the ordered regions, and calculating the average coordinates. Ordered parts of each protein were determined by the program CYRANGE (Kirchner et al. 2011) applied to the reference structure. The average RMSD bias for each type of input data modification was averaged over all ten proteins, five different random modifications and 20 independent structure calculation runs leading to averaging over 1000 structure calculations.
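
The two RMSD measures used throughout this paper, the RMSD bias (accuracy) and the RMSD radius (precision, the average RMSD of each conformer to the mean structure of its bundle), can be written down in a short Python sketch. The sketch assumes that all coordinate sets contain only the backbone atoms of the ordered regions and have already been optimally superimposed; the superposition step itself is omitted. The function names and the representation of a conformer as a list of (x, y, z) tuples are illustrative choices.

```python
import math

def mean_structure(bundle):
    """Average coordinates over all conformers of a bundle.
    bundle: list of conformers, each a list of (x, y, z) tuples."""
    n_conf = len(bundle)
    n_atoms = len(bundle[0])
    return [
        tuple(sum(conf[a][k] for conf in bundle) / n_conf for k in range(3))
        for a in range(n_atoms)
    ]

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long coordinate
    lists. No superposition is performed here."""
    sq = sum(
        (p[k] - q[k]) ** 2
        for p, q in zip(coords_a, coords_b)
        for k in range(3)
    )
    return math.sqrt(sq / len(coords_a))

def rmsd_radius(bundle):
    """Precision: average RMSD of each conformer to the bundle mean."""
    mean = mean_structure(bundle)
    return sum(rmsd(conf, mean) for conf in bundle) / len(bundle)

def rmsd_bias(bundle, reference_bundle):
    """Accuracy: RMSD between the mean structure of a calculated bundle
    and the mean structure of the reference bundle."""
    return rmsd(mean_structure(bundle), mean_structure(reference_bundle))
```

A bundle can therefore have a small RMSD radius and still a large RMSD bias, which is the signature of a precise but inaccurate structure.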

Important as well as unimportant chemical shifts were further analyzed by classification into six different 1H classes: Hα, HN, methyl protons, aromatic ring protons, lysine and arginine side chain protons beyond Hβ, and aliphatic protons. The number of NOE cross peaks involving a given atom was determined for each atom and the average was calculated for the different classes.

In de novo structure calculations there is usually no reference structure available. It is therefore necessary to have a measure independent of the RMSD bias to assess the quality of a structure calculation result. We analyzed two previously suggested criteria, i.e. the RMSD to the mean structure (RMSD radius) of cycle 1 (convergence) and the RMSD between the structure obtained in cycle 1 and in the final structure calculation (RMSD drift). The individual criteria were then combined into a single weighted measure calculated as \( \sqrt{(1.5R)^{2} + D^{2}} \), where R denotes the RMSD radius in cycle 1 and D the RMSD drift.
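
As a worked example of the combined criterion, the following Python sketch evaluates it for given values of R and D. The function name is hypothetical and the numbers in the usage line are invented for illustration, not results from this study.

```python
import math

def combined_criterion(rmsd_radius_cycle1, rmsd_drift):
    """Combined measure sqrt((1.5 R)^2 + D^2), where R is the RMSD radius
    of cycle 1 and D is the RMSD drift between cycle 1 and the final
    structure calculation (both in Angstrom)."""
    return math.sqrt((1.5 * rmsd_radius_cycle1) ** 2 + rmsd_drift ** 2)

if __name__ == "__main__":
    # Illustrative values only: R = 2.0 A and D = 4.0 A
    print(combined_criterion(2.0, 4.0))
```

With R = 2.0 Å and D = 4.0 Å the measure evaluates to 5.0 Å; the factor 1.5 means that a given RMSD radius in cycle 1 contributes more strongly than an equally large RMSD drift.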

Results and discussion

The effect of missing, erroneous, or inaccurate structure calculation input data was investigated by random deletion and modification of chemical shifts as well as NOESY peaks. Structure calculations were performed using original and modified experimental data sets of ten different proteins (Table 1; Fig. 1) and the average RMSD bias was used as a measure of accuracy.

The consequence of random new chemical shifts in comparison to missing NOESY peaks is illustrated in Fig. 2 for the protein fsh2 as an example of the two principal kinds of structure calculation failures that were discussed in the Introduction. An incomplete set of NOESY peaks generally causes less well-defined structure bundles, indicative of a loss of long-range information. This is reflected in the RMSD radius, which increases from 1.15 Å at 30 % deleted peaks (Fig. 2a) to 2.08 Å at 60 % deleted peaks (Fig. 2b) and 10.13 Å at 75 % deleted peaks (Fig. 2c). This example illustrates the first category of structure calculation failure, namely the inability to ever assign enough distance restraints to converge to a well-defined structure bundle. This type of error is straightforward to detect and therefore less problematic. The results for erroneous chemical shifts show a different effect. The bundle remains rather well defined with a low RMSD radius of 0.82 Å (10 % modified chemical shifts, Fig. 2d), 1.04 Å (30 %, Fig. 2e) and 1.8 Å (40 %, Fig. 2f), whereas the RMSD bias of 2.07 Å (10 %), 7.64 Å (30 %) and 7.1 Å (40 %) shows that the structure calculation converges to an incorrect fold beyond a certain degree of erroneous shifts. This reflects the second kind of failure, which can be attributed to the selection of a self-consistent but incorrect subset of NOESY peak assignments. Due to the well-defined nature of the structure bundle, the error is more difficult to detect and hence potentially more dangerous.

Fig. 2

Effect of a 30 %, b 60 %, c 75 % missing NOESY peaks [modification (h) in Methods] and d 10 %, e 30 %, f 40 % erroneous chemical shift assignments [modification (e) in Methods] on the structure calculation result of the protein fsh2. Structures were calculated using the standard CYANA protocol for combined automated NOE assignment and structure calculation based on 100 random starting structures and 10,000 annealing steps. The final structure bundles comprise the 20 conformers with lowest target function values. The ordered residues 8–108 in the reference structure (PDB 1WQU) were used for superposition and RMSD calculation. The RMSD bias is calculated as the RMSD between the mean structure of the bundle and the mean reference structure and represents the accuracy. The RMSD radius is calculated as the average RMSD of each conformer to the mean structure of the bundle and represents the precision

For a systematic evaluation, the average RMSD bias was plotted against the percentage P of modified input data for the different types of modifications (Figs. 3, 4, 5). The dotted line indicates an RMSD value of 3 Å representing the threshold below which the global fold of the structure is still assumed to be correct. The results for each individual protein can be found in Fig. 4 and in the Supplementary Material (Figs. S1–S10).

Fig. 3

RMSD to the reference structure for different types of simulated chemical shift imperfections. For each data point, twenty independent automated NOESY assignment and structure calculation runs were performed for each of five randomly modified data sets of the ten proteins of Table 1. The average RMSD to the reference structure is plotted against the percentage P of modified chemical shifts. See Methods for details. The data point at 0 % shift modification denotes the RMSD for 20 runs with the complete, unmodified experimental data. a Random deletion of chemical shift assignments, b random deletion of side-chain chemical shift assignments, c random deletion of “important” chemical shift assignments, d random deletion of “unimportant” chemical shift assignments, e random new chemical shift values, f random permutation of chemical shift values, g local permutation of chemical shift values, h doubled number of random starting structures and annealing steps for randomly deleted chemical shift assignments

Fig. 4

RMSD to the reference structure for different percentages of randomly deleted chemical shifts [modification (a) in Methods]. Results are presented separately for each of the ten proteins of Table 1 and Fig. 1. For each data point, twenty independent automated NOESY assignment and structure calculation runs were performed for each of five randomly modified data sets. The RMSD bias from the reference structure is plotted against the percentage P of modified chemical shifts. The data point at 0 % shift deletion denotes the RMSD for 20 runs with the complete, unmodified experimental data

Fig. 5

RMSD to the reference structure for different types of simulated peak list imperfections. For each data point, twenty independent automated NOESY assignment and structure calculation runs were performed for each of five randomly modified data sets of the ten proteins of Table 1. The average RMSD to the reference structure is plotted against the percentage P of modified data, where applicable. See Methods for details. The data point at 0 % peak modification denotes the RMSD for 20 runs with the complete, unmodified experimental data. a Random deletion of NOESY peaks, b random deletion of the weakest NOESY peaks, c erroneous peak positions, d erroneous peak volumes, e 2D projection of NOESY peaks, f increased assignment tolerances

The overall effect of chemical shift deletions is presented in Fig. 3a–d. Chemical shifts were deleted in four different ways: random deletion from the set of all shifts (Fig. 3a), random deletion only from side chain atoms (Fig. 3b), random deletion of “important” shifts (Fig. 3c) and random deletion of “unimportant” shifts (Fig. 3d). Omission rates were varied between 0 and 40 % in steps of 5 %. In all four cases the average RMSD bias increases with increasing omission rate P. In most cases, random deletion of 5 % of the chemical shifts results in structures with an RMSD bias below 3 Å, whereas 10 and 15 % missing chemical shifts raise the average RMSD bias slightly above 3 Å (Fig. 3a). Omission rates of more than 15 % raise the average RMSD bias, including its standard deviation range, considerably above 3 Å, indicating that structure calculations reproducibly fail to converge to the correct global fold when using severely incomplete chemical shift data. The outcome in the range between 10 and 15 % chemical shift omission strongly depends on the protein and the quality of the respective NOESY data, which becomes apparent when comparing the plots for the individual proteins presented in Fig. 4 and in the Supplementary Material. In favorable cases, the correct structure can still be found with 20 % of the chemical shifts missing, whereas rather unfavorable cases may fail at 5 % missing chemical shifts. Torsion angle restraints generated from chemical shifts with the program TALOS (Cornilescu et al. 1999) can in some cases slightly improve the structure calculation result. These improvements are predominantly observed in cases of higher deletion percentages (above 20 %) where the original calculations excluding TALOS restraints did not necessarily converge to the correct global fold (Fig. S11). Nearly no improvement is observed for the proteins copz and fspo.

It does not make any significant difference whether random chemical shifts or only side-chain shifts are missing (Fig. 3a, b). Deletion of “important” shifts causes a steeper increase in the average RMSD bias compared to random deletion, whereas the slope is less steep in the case of “unimportant” shifts (Fig. 3c, d). This shows that it can make a difference for the structure calculation results which particular chemical shifts are missing. It is in practice more likely that “unimportant” shifts are missing, as they are typically more difficult to assign.

To further investigate the importance of individual types of protons, chemical shifts from all data sets were classified into six different classes: Hα, NH, methyl protons, aromatic protons, lysine and arginine side chain protons, and aliphatic protons. Importance is measured by the number of medium- and long-range NOESY peaks that involve the respective chemical shift (Fig. 6). Protons from methyl groups appear on average in 17.5 medium- and long-range NOE peaks, aromatic protons in 13.5 peaks, NH protons in 11.9 peaks, Hα protons in 10.3 peaks, aliphatic protons in 10.2 peaks and Lys/Arg side chain protons in 9.0 peaks. Figure 6 suggests that methyl and aromatic protons are particularly important, which can be attributed to their preferential occurrence in the hydrophobic, densely packed core of the protein, where a large number of NOE contacts are possible.
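The counting behind this importance measure can be sketched as follows. The representation of an assigned peak as a pair of (residue number, atom name) tuples and the function name `count_nonlocal_peaks` are hypothetical. The sketch also assumes that "medium- and long-range" means a sequence separation of at least two residues, which is a common convention but is not stated explicitly in this section.

```python
from collections import Counter

def count_nonlocal_peaks(assigned_peaks, min_separation=2):
    """Count, per proton, the assigned peaks spanning >= min_separation residues.

    assigned_peaks: iterable of ((res_i, atom_i), (res_j, atom_j)) tuples
    """
    counts = Counter()
    for proton_a, proton_b in assigned_peaks:
        if abs(proton_a[0] - proton_b[0]) >= min_separation:
            counts[proton_a] += 1
            counts[proton_b] += 1
    return counts

# Hypothetical assigned peaks, for illustration only.
peaks = [
    ((5, "HA"), (6, "H")),       # sequential, not counted
    ((5, "QD1"), (40, "HB2")),   # long-range
    ((5, "QD1"), (8, "H")),      # medium-range
    ((12, "HA"), (40, "HB2")),   # long-range
]
counts = count_nonlocal_peaks(peaks)
print(counts[(5, "QD1")], counts[(5, "HA")])
```

Averaging these counts within each of the six proton classes would give per-class values of the kind quoted in the text.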

Fig. 6 Number of 1H resonances that are involved in a given number of medium- and long-range NOESY peaks. Proton chemical shifts were separated into six disjoint classes: Hα, NH, aliphatic, methyl, aromatic, and lysine and arginine side chain atoms. Chemical shifts were taken from the data sets of ten different proteins (Table 1). Peaks were counted in the final assigned peak lists of the combined automatic NOE assignment and structure calculation run that yielded the reference structure

Figure 3e–g shows the effect of modified chemical shift values. The different simulated sources of error, namely random new chemical shift values (Fig. 3e), randomly permuted chemical shift values (Fig. 3f), and locally permuted chemical shift values (Fig. 3g), result in average RMSD values very similar to those obtained with randomly missing chemical shifts. Even local permutations give the same result.

Compared to missing chemical shifts, deletion of NOESY peaks leads to a less steep increase of the average RMSD (Fig. 5a). On average, the RMSD bias at 30 % deleted NOESY peaks is below 3 Å, while the average RMSD rises slightly above 3 Å at 45 %. The much less pronounced increase can be explained by the fact that NOESY peak lists firstly contain a large number of signals that carry little or no structural information because of their sequential nature and secondly carry rather redundant information through the dense NOE network. In contrast, one missing chemical shift leads to a whole set of NOESY peaks that remain unassigned in the more favorable case or are assigned incorrectly in the less favorable case. Figure 5b shows the result for deletion of weak peaks. The RMSD bias at 30 % deletion is comparable to random deletion, whereas deletion of 45 % of the weakest peaks results in a significantly higher RMSD bias of 7 Å, compared to 3 Å at 45 % randomly deleted peaks. A higher average RMSD for deletion of weak peaks is expected because weak peaks contain important long-range information.

Using the complete peak lists, but introducing errors in peak positions yields an average RMSD bias of 3 Å at 45 % error and of more than 5 Å at 60 % error (Fig. 5c). In contrast to errors in peak positions, errors in peak volumes have largely no effect on the average RMSD over the complete range tested, up to 150 % error (Fig. 5d). The larger influence of erroneous peak positions can be explained by the fact that the number of incorrect assignments increases, creating potentially distorting restraints, whereas erroneous peak volumes only affect the upper distance limit value. The effect of volume errors on the upper distance limit is furthermore greatly reduced by the \(r^{-6}\) relationship between peak volume and calibrated distance.
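The damping effect of the inverse sixth power relationship can be made concrete with a short calculation. The sketch below assumes the simple calibration V = k / d**6 and interprets a "150 % error" as a volume multiplied by 2.5; both are illustrative assumptions, and the calibration actually used may differ in detail.

```python
def distance_scaling(volume_factor):
    """Factor by which a calibrated distance changes when the peak volume
    is multiplied by `volume_factor`, assuming V = k / d**6."""
    return volume_factor ** (-1.0 / 6.0)

# A volume overestimated by 150 % (factor 2.5) shortens the distance by about 14 %.
print(round(distance_scaling(2.5), 3))   # 0.858
# A volume underestimated by 50 % (factor 0.5) lengthens it by about 12 %.
print(round(distance_scaling(0.5), 3))   # 1.122
```

Under these assumptions, even a large volume error changes the derived upper distance limit by only a small fraction, which fits the weak dependence on volume errors seen in Fig. 5d.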

Using only two-dimensional peak lists has almost no effect on the structure calculation result in the case of three proteins (copz, ww2d and wmkt). This result can be explained by the fact that a significant part of the peaks of the original data set comes from 2D NOESY spectra. Reducing the remaining peaks to two dimensions has a less severe effect in these cases compared to other data sets, which contain mainly 3D data. For fsh2, fspo, rhod and scam the RMSD bias shows a slight increase but remains below 3 Å, and for cprp, enth and pbpa the RMSD bias increases above 5 Å (Fig. 5e).

Figure 5f shows the effect of increased chemical shift tolerances, which simulates spectra with lower resolution and therefore higher assignment ambiguity. Chemical shift tolerances for NOESY peak assignments were raised up to 3.33 times their original value, which corresponds to 0.1 ppm for 1H and 1.66 ppm for 15N and 13C. Up to 200 % increased tolerance, the average RMSD bias is still around 3 Å, whereas a further increase results in RMSD bias values of around 5 Å. Increased chemical shift tolerances have very diverse consequences for the different data sets (Supplementary Figs. S1–S10). The effect is most severe in cases where the data sets contain a large amount of two-dimensional data (copz, ww2d and wmkt) as well as in the case of the data set of cprp. Two-dimensional data are especially sensitive to reduced resolution because the number of assignment possibilities is much higher. It should, however, be noted that these two simulations (reduction to two spectral dimensions and increased chemical shift tolerance) might give a somewhat too optimistic picture of the situation encountered in NMR spectra with poor resolution. In severely overlapped spectra, several peaks may be fused into one single peak with a biased peak position, or peaks may no longer be recognizable at all. In our simulation, all peaks are still considered individually at the correct peak position.
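How a larger tolerance increases assignment ambiguity can be illustrated with a minimal sketch for one spectral dimension. The function name `candidate_assignments`, the atom labels and the shift values are hypothetical. The original 1H tolerance of 0.03 ppm is inferred from the quoted values (0.1 ppm divided by 3.33) rather than stated in this section.

```python
def candidate_assignments(peak_position, shifts, tolerance):
    """Return the atoms whose chemical shift lies within `tolerance` (ppm)
    of a peak position in one spectral dimension."""
    return [atom for atom, shift in shifts.items()
            if abs(shift - peak_position) <= tolerance]

# Hypothetical 1H chemical shifts (ppm), for illustration only.
proton_shifts = {"A": 4.20, "B": 4.22, "C": 4.27, "D": 4.33, "E": 4.45}

narrow = candidate_assignments(4.21, proton_shifts, 0.03)   # inferred original tolerance
wide = candidate_assignments(4.21, proton_shifts, 0.10)     # 3.33 times larger
print(narrow, wide)
```

For a peak with several dimensions, the number of possible assignments is at most the product of the candidate counts per dimension. A two-dimensional peak has no heavy atom dimension to narrow down the choice, which is one way to understand why such data are more sensitive to increased tolerances.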

Finally, we tested whether the effect of missing data can be compensated by performing more annealing steps during structure calculation and using more random starting structures. For this purpose, we repeated all calculations with randomly deleted chemical shifts with 200 instead of 100 random starting structures and with 20,000 instead of 10,000 annealing steps. The calculation results show only marginal overall improvement (Fig. 3h), indicating that data imperfections can in general not be compensated by longer computation times. The only exception is the homodimeric protein ww2d, for which longer simulated annealing yielded significantly lower RMSD bias values for the data sets with 5–15 % deleted chemical shifts.

These results show that data imperfections of various natures can dramatically reduce the quality of NMR structures. In case of de novo structure determination with lack of a reference structure, it is important to be able to evaluate the structure calculation result based on a measure independent of the RMSD bias. Several criteria have been suggested previously. Two of these criteria are the convergence (RMSD to the mean structure) of the initial structure calculation cycle and the RMSD drift (RMSD between the first and the last cycle). If the initial cycle converges to an RMSD radius below 3 Å and the RMSD drift is simultaneously below 2 Å, the result is considered reliable (Herrmann et al. 2002a; Jee and Güntert 2003). We have investigated these criteria using all aforementioned structure calculations and summarized the results in Fig. 7.

Fig. 7 Structural accuracy plotted against commonly applied evaluation criteria for combined automated NOESY assignment and structure calculation runs. The accuracy is represented by the RMSD bias, i.e. the RMSD between the mean structure of the bundle and the mean reference structure. Every data point represents one combined automated NOESY assignment and structure calculation run. (a) Initial convergence measured by the RMSD to the mean structure of the structure bundle from the first structure calculation cycle, (b) RMSD drift measured by the RMSD between the final structure bundle and the structure bundle of the first cycle, (c) a combination of the two criteria calculated as \( \sqrt{(1.5R)^{2} + D^{2}} \), where R denotes the RMSD radius in cycle 1 and D the RMSD drift, for all proteins except the homodimeric ww2d, and (d) same as in (c) for the structure calculations of the protein ww2d

Figure 7a and b show the accuracy plotted against the RMSD in cycle 1 and the RMSD drift. Especially dangerous are false positives, i.e. cases where the evaluation parameters meet the required criteria (convergence < 3.0 Å, drift < 2.0 Å) but the structure is misfolded. Considering both criteria individually, the fraction of false positives is 2 % (convergence) and 0.4 % (drift), respectively. Calculation of a weighted average from both values (Fig. 7c) further reduces the fraction of false positives to 0.01 %. The correlation of the weighted average and the accuracy shows a significantly reduced number of data points above the diagonal (accuracy exceeding the criterion), which allows the weighted average to be used as an upper limit on the accuracy. The distribution for the homodimeric protein ww2d is presented separately in Fig. 7d. In contrast to the monomeric proteins, it shows multiple clusters that are presumably due to different ways of dimer formation. On the one hand, there are a large number of structures with a high accuracy around 1 Å for which the combined criterion varies over a large range of 1–10 Å. On the other hand, there is a narrow cluster of structures with an RMSD bias of about 10 Å and values of 2–10 Å for the combined criterion.
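The combined criterion of Fig. 7c is straightforward to evaluate. The following sketch implements the expression from the figure caption, the square root of (1.5 R)² + D², where R is the RMSD radius in cycle 1 and D is the RMSD drift. The function name is an assumption, and the example input simply reuses the individual thresholds quoted above; it is not a recommended cutoff for the combined value.

```python
import math

def combined_criterion(rmsd_radius_cycle1, rmsd_drift):
    """Return sqrt((1.5 R)^2 + D^2), with R and D in angstroms (Fig. 7c)."""
    return math.hypot(1.5 * rmsd_radius_cycle1, rmsd_drift)

# Value at the individual thresholds quoted in the text (R = 3.0 Å, D = 2.0 Å).
print(round(combined_criterion(3.0, 2.0), 2))   # 4.92
```

Because the text describes the combined value as an upper limit on the accuracy, the result can be compared directly with the RMSD bias that would be tolerated for a given application.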

In order to investigate the influence of artifacts such as water signals or baseline distortions on the structure calculation result, we have recalculated the structures of the three proteins enth, fsh2, and rhod based on peak lists from automatic peak picking without subsequent refinement (López-Méndez and Güntert 2006). Results are summarized in Table 2. In the case of enth, only slight differences between the results obtained with refined and unrefined sets of peak lists can be observed with respect to the RMSD bias, the final CYANA target function, and the aforementioned evaluation parameters (RMSD radius in cycle 1, RMSD drift, and the combination thereof). This is in good agreement with the results obtained from the modified data sets, where enth is one of the rather stable structure calculations that yield an accurate structure bundle up to 15 % missing chemical shifts (Fig. 4). In the two other cases, the structural quality drops significantly compared to the results obtained from refined peak lists; however, the RMSD bias is still below 3.0 Å and the global fold is thus considered correct. In all three cases, the final CYANA target function increases and the RMSD radius decreases when using unrefined peak lists. This can be attributed to an increased number of potentially incorrect long-range restraints that result from artifact peaks. The combined criterion gives a good indication of the structural quality.

Table 2 Structure calculation results using refined and unrefined NOESY peak lists

Conclusions

The results presented in this study clearly show that imperfections in the chemical shift assignment can cause severe problems during NOE assignment and structure calculation. In most of the data sets tested, 10 % missing or erroneous chemical shifts result in inaccurate structures with RMSD bias values above 3 Å. In some cases with high-quality data and large numbers of 3D peaks, higher percentages of missing or erroneous chemical shifts can be tolerated. Less severe problems arise from missing peaks, errors in peak positions and volumes, and lower resolution simulated by using higher assignment tolerances. Furthermore, it was shown that data imperfections cannot be overcome by longer computation times. The convergence of the initial structure calculation cycle and the RMSD drift between the first and the last cycle can be combined in a weighted average and used as an indication of the reliability of a structure calculation result.