Abstract
Atomic displacement parameters (ADPs, also known as B-factors), which depend on structural heterogeneity, provide a wide spectrum of information on protein structure and dynamics and find several applications, from protein conformational disorder prediction to protein thermostabilization, and from protein folding kinetics prediction to protein binding site prediction. A crucial aspect is the standardization of the ADPs when two or more protein crystal structures are compared, since ADPs are affected to different extents by several factors, from crystallographic resolution to refinement protocols. A potential limitation to ADP analysis is the modern tendency to let ADPs inflate to extremely large values that have little physico-chemical meaning.
Introduction
More than a century after the first diffraction experiment and more than half a century after the determination of the first protein crystal structure, a large amount of structural data has been accumulated in databases. The Protein Data Bank, the repository of all the structures of biological macromolecules, contains nowadays more than 130,000 entries, most of them determined with single-crystal X-ray crystallography. Each entry contains a rich assortment of annotations, ranging from experimental details to biological features, and the description of the three-dimensional structure of the macromolecule or of the supramolecular assembly made by two or more macromolecules. The essence of this description is the list of the atoms, their names, their position in space (three coordinates x, y, and z) and their occupancy, which is equal to one if the atoms have a unique stable position or it is less than one if the atoms are conformationally disordered and have two or more stable positions (the sum of the occupancies should be equal to one); and, if it is a crystal structure, the atomic ADP, which is as important as the other positional parameters.
Chemists, physicists, and molecular biologists have been using this enormous amount of structural data with increasing interest, and structural bioinformatics tools are increasingly used in macromolecular science. ADPs have been studied too, and it was found that they provide valuable information and, in certain cases, allow the prediction of biologically interesting features. This review summarizes numerous structural bioinformatics analyses and applications where ADPs play starring roles.
First, the physico-chemical significance of the ADP is summarized, especially for readers not familiar with crystallographic computing. Then several ADP features, which are inferable from database information, are described: its dependence on crystallographic resolution and its relationship with temperature. These features point to the ADP standardization problem, which is described in detail, and several standardization techniques that have been used in structural bioinformatics studies are presented. Later, ADP distributions and techniques for ADP prediction are summarized. The uses of ADPs for protein thermostabilization, conformational disorder prediction, protein folding kinetics prediction, and protein binding site prediction are then summarized. Conservation of ADPs during evolution is also mentioned and the use of ADPs to estimate the atomic positional accuracy in protein crystal structures is described.
Atomic displacement parameters
This section is addressed to non-crystallographers, who might not be familiar with the physico-chemical significance of ADPs, with their determination and refinement, and with potential pitfalls in their use.
In the crystalline state, protein atoms can move in several ways. For example, they can simply oscillate around their equilibrium positions or they can move from one equilibrium position to another, showing what is known as dynamic conformational disorder, which becomes static if the temperature is too low for the atoms to overcome the activation energy associated with the passage from one position to the other (Giacovazzo et al. 2002; Schmidt and Lamzin 2010).
In X-ray crystal structures, displacements are monitored by atomic displacement parameters (ADPs) (Dunitz et al. 1988a, b; Trueblood et al. 1996), which are frequently named B-factors or thermal factors and are related to the mean-square amplitude of displacements of the atoms around their equilibrium positions (\( \langle u^{2} \rangle \)) according to
\[ B = 8\pi^{2} \langle u^{2} \rangle \]
ADPs are estimated by refining the parameters of an atomic model against diffraction intensities, since the decrease of the atomic form factors (f) with the diffraction angle (θ) is enhanced by an ADP increase according to
\[ f = f_{0} \exp\left( -B \, \frac{\sin^{2}\theta}{\lambda^{2}} \right) \]
where \( f_{0} \) is the atomic form factor at B = 0 Å² and λ is the X-ray wavelength. This implies that atoms with larger ADPs contribute less to the diffraction intensities than atoms with smaller ADPs and that it is possible, as a consequence, to determine not only the positions of the atoms but also their displacements.
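To give a feeling for the magnitudes involved, the following minimal sketch converts an isotropic B value into a root-mean-square displacement and evaluates the attenuation of the form factor. It assumes the standard isotropic relations B = 8π²⟨u²⟩ and f = f0 exp(−B sin²θ/λ²), together with Bragg's law (sin θ/λ = 1/(2d)); the function names are illustrative only.

```python
import math

def rms_displacement(b_factor):
    """Root-mean-square displacement (Å) from an isotropic B (Å^2),
    assuming B = 8 * pi^2 * <u^2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

def form_factor_attenuation(b_factor, d_spacing):
    """Ratio f/f0 = exp(-B * sin^2(theta) / lambda^2) for a reflection
    at the given d-spacing (Å).

    Bragg's law gives sin(theta)/lambda = 1/(2d), so the exponent
    becomes -B / (4 d^2).
    """
    return math.exp(-b_factor / (4.0 * d_spacing ** 2))

for b in (10.0, 30.0, 100.0):
    print(f"B = {b:5.1f} Å^2  rms u = {rms_displacement(b):.2f} Å  "
          f"f/f0 at d = 2 Å: {form_factor_attenuation(b, 2.0):.3f}")
```

Under these assumptions, a B of 100 Å² corresponds to a root-mean-square displacement larger than 1 Å, which is comparable to a covalent bond length.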
In macromolecular crystallography, ADPs are usually refined isotropically, by assuming that oscillation amplitudes are equal in all directions around the equilibrium position of the atom (Zanotti 2002). Although this is a rather severe approximation, it is adopted because of the scarcity of diffraction data, which does not allow one to refine more than one variable per atom, the atomic displacement, in addition to the three coordinates x, y, and z. However, when more diffraction data are available, ADPs are refined anisotropically (Dunitz et al. 1988a, b; Dauter et al. 1997), by assuming that atomic displacements can be different in the three dimensions and in this case six additional parameters (the six unique elements of a symmetric 3 × 3 tensor) are refined in addition to the three coordinates x, y, and z. Obviously, this is also an approximation, though less severe than the isotropic model.
It is well known and accepted that other structural features influence the ADP values. One, mentioned above, is conformational disorder. At high resolution, it is often possible to identify alternative atomic positions and refine them together with their respective occupancies, which should sum up to one, if the atom has not been lost because of radiation damage or other degradation reactions (Garman 2003; Carugo and Djinovic-Carugo 2005; Garman and Owen 2006; Holton 2009; Bury et al. 2017). In this case, under the anisotropic approximation, at least 19 variables must be refined per atom when there are only two alternative positions (two sets of x, y, and z coordinates, two sets of anisotropic ADPs with six variables each, and one occupancy value occ, the other being 1 − occ).
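The parameter counts quoted above (four per atom for isotropic refinement, nine for anisotropic refinement, 19 for an anisotropic atom with two alternative positions) can be checked with a short sketch. The function name and signature are illustrative, not part of any refinement program.

```python
def refined_parameters(anisotropic, n_conformers=1):
    """Number of variables refined for one atom.

    Each conformer carries 3 coordinates plus 1 (isotropic) or 6
    (anisotropic) displacement parameters. n conformers need n - 1
    independent occupancies, because occupancies are constrained to
    sum to one.
    """
    adp_terms = 6 if anisotropic else 1
    return n_conformers * (3 + adp_terms) + (n_conformers - 1)

print(refined_parameters(anisotropic=False))                 # isotropic, single position
print(refined_parameters(anisotropic=True))                  # anisotropic, single position
print(refined_parameters(anisotropic=True, n_conformers=2))  # two alternative positions
```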
However, it is often impossible, if the resolution is insufficient, to identify alternative positions and this is compensated by an increase of the ADP. It may happen that the atom is positioned in between the two (or more) positions really occupied and refined with an ADP large enough to encompass the entire region occupied by its electron density. It is difficult to say if this happens seldom or frequently, though protein flexibility suggests that conformational disorder is rather common, at least at the protein surface (Hartmann et al. 1982; Läuger 1985; Stein 1985; Smith et al. 1986; Declercq et al. 1999; Woldeyes et al. 2014).
Several other factors may affect ADP values. Among them, it is necessary to remember that most macromolecular crystallography refinements are restrained (Zanotti 2002): for example, deviations of bond distances from their ideal values are penalized, and in this way the ideal bond distances are treated as further experimental data that add to the diffraction data. Analogously, ADPs of atoms connected by a covalent bond are restrained to have similar components along the bond, since a covalent bond is rather rigid and cannot stretch vigorously. Therefore, the variability of the ADP values is somewhat reduced.
It is also necessary to be aware that occupancy and ADP are correlated: decreases in occupancy are accompanied by ADP decreases, since the reduction of the number of electrons that occupy a certain part of the crystal implies a parallel reduction of the apparent oscillation amplitude. Erroneous ADP values may also arise from mistakes in the interpretation of the electron density map. For example, an isolated peak may be interpreted as a calcium(II) cation, with a larger ADP, or as a water molecule, with a smaller ADP, since calcium(II) electrons are more numerous than water electrons and thus must spread out more than those of water to fit the same electron density peak.
ADP and crystallographic resolution
One of the reasons why ADPs may be different in different crystal structures of the same protein is that the average ADP tends to increase if resolution decreases.
Based on analyses of a limited number of protein crystal structures, the dependence of ADPs on resolution was observed nearly 20 years ago (Carugo and Argos 1999).
Figure 1 depicts the dependence of the average ADP on resolution in the entire Protein Data Bank and in a non-redundant subset of the Protein Data Bank obtained by imposing a maximal pairwise percentage of sequence identity of 30% (both data sets were generated in July 2017). Only protein crystal structures were considered, while structures of nucleic acids and of protein–nucleic acid complexes were discarded.
Clearly, a strict relationship between average ADP and resolution is apparent and it can be fitted by:
and by
for all protein crystal structures (Pearson correlation coefficient = 0.982) and for the non-redundant subset of protein crystal structures (Pearson correlation coefficient = 0.984).
It might therefore be unnecessary to standardize ADPs when comparing protein structures of similar resolution, and it might be simple to rescale the ADPs of one protein structure to make them comparable to those of another.
Large ADPs
In PDB files, there are four types of lines that can be used to indicate the atoms/residues that were invisible in the electron density maps computed in crystallographic studies. Lines beginning with “REMARK 465” and “REMARK 470” enumerate residues and atoms of the protein that were invisible and were not included in the “ATOM” lines; lines beginning with “REMARK 475” and “REMARK 480” enumerate residues and atoms that were invisible and were included in the “ATOM” lines with zero occupancy. It is obviously a rather arbitrary decision whether the electron density is interpretable or not and, perhaps, this is the reason why during the last decade many crystallographers prefer to include in the refinement also the atoms that are (nearly) invisible, allowing their ADPs to inflate enormously.
Figure 2 shows that up to 2007–2008 only 15–20% of the protein X-ray crystal structures deposited in the Protein Data Bank had at least one atom with an ADP larger than 100 Å² and that in the same period the percentage of complete structures, containing coordinates of all protein atoms and without missing atoms, was in the range 80–90%. After 2008, the percentage of structures with large ADPs began to increase and now more than 50% of the structures contain atoms with large ADPs. Analogously, the percentage of structures without missing residues began to decrease and now less than 50% of the structures have coordinates for all the atoms.
The practice of allowing ADP values to inflate in an uncontrolled way is scientifically questionable: although the agreement between the model and the experimental observations (the R-factors) may marginally improve when atoms that are not visible in the electron density map are included, there is no physical understanding behind the improved fit. For example, one may decide to place an arbitrary number of uranyl cations (UO₂²⁺) in the asymmetric unit and allow their ADPs to increase enormously, without significant consequences either on the rest of the model or on the R-factors.
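A survey of this kind can be sketched for a single entry by reading the fixed-column PDB format, where the B-factor of an "ATOM" record occupies columns 61–66. The sketch below is only illustrative: the helper `atom_line` and the demo coordinates and B values are invented for the example, and the 100 Å² threshold is the one quoted above.

```python
def atom_line(serial, name, res, chain, resseq, x, y, z, occ, b):
    """Build a fixed-column PDB ATOM record (hypothetical demo helper;
    atom-name alignment is simplified)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}")

def summarize_adps(lines, threshold=100.0):
    """Summarize the ADPs of ATOM records and flag declared missing
    residues or atoms (REMARK 465 / REMARK 470)."""
    b_values = [float(line[60:66]) for line in lines if line.startswith("ATOM  ")]
    missing = any(line.startswith(("REMARK 465", "REMARK 470")) for line in lines)
    return {
        "n_atoms": len(b_values),
        "max_b": max(b_values) if b_values else None,
        "n_large": sum(b > threshold for b in b_values),
        "has_missing": missing,
    }

# Made-up records, for illustration only.
demo = [
    "REMARK 465 MISSING RESIDUES",
    atom_line(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614, 1.00, 9.67),
    atom_line(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842, 1.00, 10.38),
    atom_line(3, "CZ", "ARG", "A", 45, 11.100, 3.250, 40.870, 1.00, 134.52),
]
print(summarize_adps(demo))
```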
It must also be remembered that the inclusion in the model of atoms with extremely large ADPs, which reflect their immense positional spread, may result in over-interpretations of the structural data delivered to the scientific community. For example, the electrostatic potential at the protein surface might be absolutely inaccurate if atoms/residues, the position of which is uncertain, are included in the calculation.
For this reason, ADP thresholds must be used to filter out structure moieties that cannot be considered to have been experimentally determined. For example, Benkert and co-workers discarded structures with more than 20% of the residues having an ADP above two standard deviations in an analysis of statistical potential in globular proteins (Benkert et al. 2008). However, it is necessary to design less arbitrary criteria to handle atoms, residues and structures associated with enormous and unreasonable ADPs.
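A threshold filter of this kind might look like the sketch below. The text does not say which mean and standard deviation define the cutoff, so the sketch assumes they come from an external reference set (`ref_mean` and `ref_std` are hypothetical parameters): within a single structure, no more than 20% of the values can lie two standard deviations above that structure's own mean (Cantelli's inequality), so an internal cutoff would never reject anything.

```python
def fraction_above(residue_adps, ref_mean, ref_std, n_sigma=2.0):
    """Fraction of residues whose ADP exceeds ref_mean + n_sigma * ref_std."""
    cutoff = ref_mean + n_sigma * ref_std
    return sum(b > cutoff for b in residue_adps) / len(residue_adps)

def keep_structure(residue_adps, ref_mean, ref_std, max_fraction=0.20):
    """Keep the structure unless too many residues have a large ADP."""
    return fraction_above(residue_adps, ref_mean, ref_std) <= max_fraction

# Made-up per-residue ADPs (Å^2) and reference statistics.
adps = [12.0, 15.0, 18.0, 22.0, 25.0, 31.0, 48.0, 75.0, 90.0, 110.0]
print(fraction_above(adps, ref_mean=25.0, ref_std=12.0))
print(keep_structure(adps, ref_mean=25.0, ref_std=12.0))
```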
ADPs and temperature
Protein X-ray crystal structures, once routinely determined at room temperature, are nowadays generally determined at low temperature (100 K), to reduce the radiation damage induced by bright synchrotron X-ray beamlines and to allow the analysis of very small crystal specimens (Carugo and Djinovic-Carugo 2005). Presently (September 25, 2017) 87,044 protein crystal structures, deposited in the Protein Data Bank together with the experimental data, have been determined at 90–110 K and only 4941 have been determined at 280–320 K (ratio 18 to one); moreover, while 66% of the 280–320 K crystal structures were deposited prior to 2008 (10 years ago), only 26% of the 90–110 K crystal structures were deposited prior to 2008.
After the first attempts to determine protein crystal structures at temperature below 273 K (Alber et al. 1976), several studies have been dedicated to the analysis and comparison of room-temperature and low-temperature protein crystal structures.
In general, only modest modifications of the protein structure are associated with the temperature decrease. Small reduction in the protein volume and subtle changes of contacts between α-helices have been observed in myoglobin (Frauenfelder et al. 1987). Protein shrinkage was observed also in ribonuclease A (Tilton et al. 1992). Juers and Matthews reported that cryo-cooling generally increases lattice contacts and reduces protein volumes, but causes only small changes in crystallographic models (Juers and Matthews 2001). However, it has also been suggested that cryo-cooling modifies the repertoire of accessible conformations and, consequently, it has been proposed that room-temperature data provide a fuller description (Fraser et al. 2011a, b). This hypothesis is supported by the observation that the crystal cryo-cooling process is too slow (several seconds) to trap the room-temperature equilibrium distribution of protein and solvent configurations (Halle 2004). Much earlier, Frauenfelder and co-workers hypothesized that minor conformational substates are influenced by cryo-cooling (Frauenfelder et al. 1979).
It is expected that ADPs depend on the temperature at which crystal structures are determined. The cooling-induced reductions in ADPs suggest that cryogenic structures adopt less variable conformations (Fraser et al. 2011a, b). Huber and co-workers observed that the average ADP decreases from 13.3 to 6.1 Å² in the crystal structures of trypsinogen if temperature decreases from 293 to 213 K and that the decrease is not linear but sigmoidal, with a sharp decrease in a small temperature range that depends on the solvent composition (Singh et al. 1980). Similarly, the average protein ADP is 14 Å² at 300 K and 5 Å² at 80 K in the structures of met-myoglobin and the ADP decrease is not linear, but shows a discontinuity of slope (Hartmann et al. 1982).
The slope discontinuity is believed to depend on the “glass transition”. In crystalline RNaseA, a “glass transition” in the protein between 212 and 228 K reduces ADPs and the cooling-induced reductions in ADPs suggest that cryogenic structures adopt less changeable conformations (Rasmussen et al. 1992; Tilton et al. 1992). Similarly, the B-factors in thaumatin decrease on cooling, indicating a reduction in thermal motions, but there is a sudden change in the slope dB/dT at T ≈ 210 K, due to the protein dynamical transition (glass transition) (Warkentin and Thorne 2009, 2010).
However, large ADPs at low temperature have been observed recently for thaumatin by Russi and co-workers (26 Å² at 100 K and only 19 Å² at 278 K), who suggested that the ADPs reflect mainly the radiation damage at low temperature, while other features play a relevant role at room temperature (Russi et al. 2017).
Interestingly, there is no trace of an ADP decrease at low temperature in the Protein Data Bank. A simple statistical survey is summarized in Table 1. At high resolution, the average ADPs, computed only on protein atoms, are nearly identical in the data sets of structures determined at low temperature and in the data sets of structures determined at room temperature. At intermediate and low resolution, on the contrary, ADPs are larger at low temperature.
This analysis is certainly extremely simple, since it compares proteins that have completely different dimensions, folds, and secondary structure compositions. A better methodology would require the comparison of pairs of identical proteins, one determined at room temperature and the other at low temperature. However, the data sets of Table 1 are rather large and consequently it seems reasonable to suppose that they contain similar levels of structural heterogeneity both at room and low temperature. Therefore, it seems also reasonable to suppose that the average ADP values based on these large data sets are close to the real and genuine average values. It must, however, be observed that further and more accurate analyses are necessary to fully characterize the relationship between temperature and ADPs based on PDB data.
ADP standardization
It has been observed that average ADP values may change drastically among different crystal structures of the same protein. For example, Fig. 3 shows the average ADPs, plotted against the resolution, of 109 sperm whale myoglobin crystal structures. In a few cases, the average ADPs are lower than 10 Å² or higher than 30 Å². Three extreme cases can be examined: 1mbn (Watson 1969), 1ebc (Bolognesi et al. 1999), and 4of9 (Wang et al. 2014) (Table 2). In model 1mbn, which is one of the oldest protein crystal structures, deposited in the Protein Data Bank in 1973, the ADPs were not refined, as was common practice in the early days of macromolecular crystallography. In 1ebc, the average ADP is large, more than 45 Å², and in 4of9 it is more than four times smaller (9 Å²). On the one hand, lower ADPs might be expected in 4of9, since its diffraction data were collected at lower temperature (100 K at a synchrotron beamline), while the data collection for 1ebc was performed at room temperature (300 K with a rotating anode X-ray generator), as was routine until the end of the last century. On the other hand, larger ADPs might be expected in 4of9, since the fraction of the crystal volume occupied by liquid solvent is considerably larger in 4of9 (60%) than in 1ebc (38%), and this should increase the average mobility of the atoms in 4of9, with a consequent increase of the ADPs. However, it is not surprising that 4of9 and 1ebc have different average ADPs, since other features distinguish the two crystal structures. The space groups are different (hexagonal in 4of9 and monoclinic in 1ebc); different refinement programs were used (TNT, which was widely used at the time of 1ebc, and REFMAC, which was commonly used at the time of 4of9); and the resolution is also different: it is better in 4of9, and better resolution is associated, on average, with smaller ADPs.
It is clear that often ADPs in a structure cannot be directly compared with ADPs in another structure. In these cases, it is necessary to standardize them and the most common procedure is to transform them into z-scores (Carugo and Argos 1999; Smith et al. 2003; Yang et al. 2016), often named normalized ADPs (BN), according to
where Bave and Bstd are the average ADP and its standard deviation, respectively, defined as
and
where n is the number of protein atoms. In this way, all crystal structures have an average BN equal to zero and a standard deviation of the population equal to one, though BNs are dimensionless and thus part of the information provided by Bs is lost.
A slightly different approach was followed by Gourinath et al. (2003) in comparing ADPs of a single helix in several states of myosin, where the normalized ADPs (BN’) were defined as
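The z-score transformation can be illustrated with a minimal sketch. It assumes the usual definition BN = (B − Bave)/Bstd and, in line with the "standard deviation of the population" mentioned above, a standard deviation computed with n in the denominator; the ADP values are made up.

```python
import math

def normalize_adps(adps):
    """Return z-scores BN = (B - Bave) / Bstd, with Bstd taken as the
    population standard deviation (assumed here: denominator n)."""
    n = len(adps)
    b_ave = sum(adps) / n
    b_std = math.sqrt(sum((b - b_ave) ** 2 for b in adps) / n)
    return [(b - b_ave) / b_std for b in adps]

structure_a = [8.0, 10.0, 12.0, 20.0]    # made-up ADPs (Å^2)
structure_b = [24.0, 30.0, 36.0, 60.0]   # same profile, three times larger

print(normalize_adps(structure_a))
print(normalize_adps(structure_b))
```

The two made-up structures differ by a constant factor in their raw ADPs but yield the same normalized profile, which is what makes the z-scores comparable and also why the absolute scale is lost.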
where Bave and Bstd were computed only on the N residues of helices and strands, thus ignoring loops, and n is the number of residues in the examined helix. This standardization should be preferred when the examined sample is small.
Another standardization that has been used is
where the value of D is empirically selected to yield normalized B values (BN″) with mean 1.0 and root-mean-square deviation 0.3 (Vihinen et al. 1994).
A further standardization has been used, defined as
where the number 1.645 is a typical threshold in standard normal distributions, indicating the 0.05 probability of a value outside the interval − 1.645 to 1.645 for each of the two tails, and where the values − 1 or + 1 were imposed on BN″′ values lower than − 1 or larger than + 1 (Liu et al. 2013, 2014).
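One reading of this description is a z-score divided by 1.645 and then clipped to the interval from −1 to +1. The sketch below implements that reading as an assumption; it is not taken from the cited papers, and the function name and data are illustrative.

```python
import statistics

def clipped_normalized_adps(adps, z_limit=1.645):
    """Assumed form: (B - Bave) / (z_limit * Bstd), clipped to [-1, +1]."""
    b_ave = statistics.mean(adps)
    b_std = statistics.pstdev(adps)  # population standard deviation
    scaled = [(b - b_ave) / (z_limit * b_std) for b in adps]
    return [max(-1.0, min(1.0, s)) for s in scaled]

adps = [10.0, 11.0, 12.0, 13.0, 14.0, 60.0]  # made-up values with one outlier
print(clipped_normalized_adps(adps))
```

With this form, the outlier is capped at +1 instead of dominating the scale.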
Other standardization procedures can be conceived. For example, since independent sources of disorder add in determining the resulting ADP, it can be envisaged that it is sufficient to subtract a constant, equal to Bave, from individual ADPs to standardize their values among different crystal structures (Elgavish and Shaanan 1998). Similarly, the minimum-function method equalizes the minimum ADP values found in two protein structures (Frauenfelder and Petsko 1980; Ringe and Petsko 1986). Alternatively, one might refine each crystal structure with exactly the same computational protocol, for example by using the PDB_REDO server (Joosten et al. 2014), although two data sets at different resolutions might require different ADP handlings (for example, isotropic refinement in a structure with very high-resolution data, which allow anisotropic refinement, could lead to erroneous ADP that cannot be compared to ADPs of a medium-resolution data structure, which cannot be refined anisotropically).
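The two shift-based procedures can be sketched as follows. Implementing the minimum-function method as a constant shift that makes the two minima coincide is an assumption made for illustration; the function names and ADP values are hypothetical.

```python
def subtract_mean(adps):
    """Subtract the average ADP (Bave) from every value."""
    b_ave = sum(adps) / len(adps)
    return [b - b_ave for b in adps]

def equalize_minimum(adps, reference):
    """Shift adps by a constant so that their minimum equals the
    minimum of the reference structure (assumed form of the
    minimum-function method)."""
    shift = min(reference) - min(adps)
    return [b + shift for b in adps]

structure_a = [14.0, 18.0, 25.0, 40.0]  # made-up ADPs (Å^2)
structure_b = [6.0, 9.0, 15.0, 28.0]

print(subtract_mean(structure_a))
print(equalize_minimum(structure_a, structure_b))
```

Unlike z-scores, both shifts keep the values in Å² and preserve the spread of the original ADPs.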
ADP distributions
Parthasarathy and Murthy analyzed the ADPs of the Cα atoms of more than 35,000 residues found in a non-redundant ensemble of 110 high-resolution (better than 2.0 Å) protein crystal structures and found that the distribution of the normalized BN values is bimodal, according to
where k1, k2, k3, k4, B1, and B2 are parameters that were optimized with least-squares procedures (Parthasarathy and Murthy 1997). The same authors also investigated the correlation between main- and side-chain atom ADPs and found that it is quite variable (Parthasarathy and Murthy 1999).
Different results were published more recently by Erman, based on the analysis of Cα atom ADPs of more than 400,000 residues found in 2000 non-redundant protein crystal structures (Erman 2016). The distribution of the ADPs is unimodal and can be fitted by a gamma function
where Bav, the average ADP, is equal to 12.9 Å². A similar expression can be employed to fit the ADP distribution in a single protein crystal structure:
where the scaling factor a is
and
where Bav, Bmin, and Bmax are the average value, the minimal value, and the maximal value of the ADPs of the protein crystal structure. Clearly, this indicates that large ADPs are extremely unlikely, since these distributions are positively skewed.
The reason for the discrepancy between the results of Parthasarathy and Murthy, on the one hand, and of Erman, on the other hand, is unclear. It is possible that the much larger data set analyzed by Erman makes his results more reliable, though it must also be remembered that the structures analyzed more than 20 years ago were likely determined at room temperature, while those examined more recently were mostly determined at 100 K, and that a temperature-dependent effect cannot be disregarded. Moreover, while Parthasarathy and Murthy analyzed normalized BN values, Erman analyzed unnormalized ADPs.
ADP prediction
ADP prediction has attracted considerable attention. ADP profiles, where a single ADP value is associated with each amino acid (Cα’s ADP), have been predicted from the protein sequence to estimate the flexibility of each residue. Individual, atomic ADPs have been predicted from the protein tertiary structures to estimate the flexibility of each atom in computationally built structures. Unfortunately, several methods have not been benchmarked and we lack a systematic comparison of these computational tools.
ADP profiles have been predicted with a variety of methods. Yuan et al. used a support vector regression (SVR) approach to predict ADP profiles from protein sequence with a Pearson correlation coefficient of 0.53 between experimental and predicted ADPs (Yuan et al. 2005). A more complex technique, where the most important global and local features of the protein sequence, identified with random forests, are input into a two-stage support vector regression tool, has been developed and a web server (www.csbio.sjtu.edu.cn/bioinf/PredBF) is available for academic use (Pan and Shen 2009).
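Prediction quality in these studies is reported as a Pearson correlation coefficient between experimental and predicted ADP profiles. A minimal sketch of that metric follows; the two profiles are made-up numbers, not data from the cited work.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

experimental = [12.0, 15.0, 30.0, 22.0, 11.0, 18.0]  # made-up Cα ADPs (Å^2)
predicted = [14.0, 13.0, 24.0, 25.0, 12.0, 15.0]     # made-up predictions

print(round(pearson(experimental, predicted), 3))
```

Because the coefficient is invariant to linear rescaling, it measures whether the shape of the profile is reproduced, not whether the absolute ADP values are correct.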
Support vector regression was used also to predict individual atomic ADPs (Yang et al. 2016) and a server is presently available at https://zhanglab.ccmb.med.umich.edu/ResQ/ to allow users to predict ADPs based on modeled three-dimensional structures. Graph theory-based methods, which consider both covalent and non-covalent interactions observed in protein structures, were used to predict isotropic ADPs of all protein atoms (Jacobs et al. 2001; Gohlke et al. 2004). Atomic ADPs were predicted also from atomic fluctuations in molecular dynamics simulations (Higo and Umeyama 1997; MacKerell et al. 1998; Hinsen and Kneller 1999; Pang 2016), from normal mode analyses of protein structures (Levitt et al. 1985; Tirion 1996; ben-Avraham and Tirion 1998) and from Gaussian network models (Bahar et al. 1997, 1998; Haliloglu and Bahar 1999; Halle 2002; Kundu et al. 2002). Recently, Nguyen et al. (2016) proposed a new predictive method, named flexibility–rigidity index (FRI), to predict ADPs. Generalized Gaussian network models, coupled with anisotropic network model, were used to foresee ADPs, with performance close to FRI (Xia et al. 2015). In a study, Weiss described the relationship between ADPs and the number of atomic contacts for each atom and used this simple relationship to predict ADPs (Weiss 2007).
Finally, it is interesting to mention that ADPs might be drastically underestimated by crystallographic refinements. Based on classical molecular dynamics simulations of villin headpiece domain crystals, Kuzmanić and co-workers observed that refined isotropic and anisotropic ADPs underestimate the values computed in silico by as much as sixfold, probably because of inadequate conformational averaging and treatment of correlated motions (Kuzmanic et al. 2014).
Extremophilic proteins and thermostabilization
While most living organisms presently known grow best at moderate temperature, around 20–45 °C, several organisms prefer either lower temperature, and they are named psychrophiles, or higher temperature, and they are named thermophiles (psychrophiles and thermophiles taken together are named extremophiles).
Proteins of extremophiles have been studied intensively, because of their potential biotechnological applications, and their ADPs have been analyzed (Parthasarathy and Murthy 2000; Gianese et al. 2002).
By comparing the structures of 93 mesophilic and 21 thermophilic proteins, Parthasarathy and Murthy observed that serines and threonines have lower ADPs in thermophilic proteins and that lysines and glutamates are more frequent in high ADP protein moieties in thermophilic proteins (Parthasarathy and Murthy 2000). On the contrary, the overall dispersion of B values is similar in mesophilic and thermophilic proteins (Parthasarathy and Murthy 2000).
Based on the hypothesis that thermostable proteins tend to be more rigid than mesophilic proteins, thermostabilization of the mesophilic lipase A from Bacillus subtilis was achieved by mutations of amino acids that display the highest B-factors, corresponding to the most pronounced degrees of thermal motion and thus flexibility (Reetz et al. 2006). Similarly, the “rigidity theory” has been applied to the thermostabilization of lipase A from Bacillus subtilis (Rathi et al. 2016). Recently, ADPs were examined to identify residues for site-saturation mutagenesis to stabilize Candida rugosa lipase 1 (Zhang et al. 2016). Similarly, Huang and co-workers selected mutation sites based on ADPs to thermostabilize Aspergillus terreus amine transaminase (Huang et al. 2017).
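The selection step shared by these studies, picking the residues with the highest B-factors as mutagenesis candidates, can be sketched as below. Averaging over the atoms of each residue and the choice of how many residues to keep are assumptions made for illustration; the input tuples are made-up.

```python
from collections import defaultdict

def rank_residues_by_adp(atoms, top=3):
    """Rank residues by their average atomic ADP, highest first.

    atoms: iterable of (chain, residue_number, residue_name, b_factor).
    """
    per_residue = defaultdict(list)
    for chain, resseq, resname, b in atoms:
        per_residue[(chain, resseq, resname)].append(b)
    averages = {res: sum(bs) / len(bs) for res, bs in per_residue.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:top]

# Made-up atoms, two per residue.
atoms = [
    ("A", 10, "LYS", 40.0), ("A", 10, "LYS", 50.0),
    ("A", 11, "GLY", 12.0), ("A", 11, "GLY", 14.0),
    ("A", 12, "ARG", 60.0), ("A", 12, "ARG", 70.0),
]
for residue, mean_b in rank_residues_by_adp(atoms, top=2):
    print(residue, mean_b)
```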
Based on a careful intra-family comparison of psychrophilic, mesophilic, and thermophilic protein structures, Siglioccolo and co-workers observed that flexibility is more heterogeneous in psychrophilic enzymes, which show an irregular alternation of rigid and flexible small regions (Siglioccolo et al. 2010).
Conformational disorder and flexibility prediction
Given that they reflect positional spread, ADPs have been analyzed with the aim of predicting protein flexibility and conformational disorder. This is justified by many observations. For example, it has been shown that the ADPs of the atoms flanking polypeptide segments that are “invisible” in the electron density maps become increasingly large on approaching these segments (Djinovic-Carugo and Carugo 2015). Given that the conformational disorder of the invisible segments is likely to be too extreme to leave a trace in the electron density maps, it follows that the last residues still visible and close to the missing segment are considerably disordered.
Prediction of flexibility from amino acid sequence is somewhat similar to prediction of ADPs, though flexibility may be defined in different ways, always related to ADPs. Early flexibility predictions, based on few protein crystal structures, provided quite contradictory results (Karplus and Schulz 1985; Bhaskaran and Ponnuswamy 1988; Ragone et al. 1989; Vihinen et al. 1994). This research field converged with the more specific problem of ADP prediction, which is described in another section of the review.
Predictions of conformational disorder can be done with several programs and meta-servers (Lieutaud et al. 2016). One of them, DisEMBL, is based on ADP analyses (Linding et al. 2003). It consists of three different predictors, one aimed at the prediction of loops, one at the prediction of “hot loops”, which are characterized by large ADPs, and the third one aimed at the prediction of strings of residues that were not detected in the electron density maps (Linding et al. 2003). Although it is not recent, DisEMBL is used in several meta-servers, like DisMeta (Huang et al. 2014), GeneSilico MetaDisorder MD2 (Kozlowski and Bujnicki 2012), MetaPrDOS (Ishida and Kinoshita 2008), MobiDB-lite (Necci et al. 2017), and MeDor (Lieutaud et al. 2008), and its results are included in databases (Potenza et al. 2015).
ADPs and sequence evolution
Given that protein flexibility is stringently related to protein function and stability, it is expected that it is conserved during evolution and sequence divergence and, given that ADPs reflect protein flexibility, studies have been devoted to ADP conservation.
Maguid and co-workers analyzed the evolutionary divergence of Cα atom ADPs in homologous proteins classified into families and superfamilies and observed that Cα atom flexibility diverges slowly and that it is sometimes conserved even for protein pairs with insignificant sequence similarity (Maguid et al. 2006). It became possible to predict ADP profiles based on evolutionary information and statistical methods (Yuan et al. 2005).
Protein folding
In vitro protein folding rates are extremely variable and depend on several factors, including the occurrence of post-translational modifications, the fold topology, the amino acid composition, and the size of the protein. They also depend on the local flexibility, which may hinder or favor certain backbone movements. Based on this consideration, Gao and co-workers designed three predictors, for two-state, multistate, and unknown folding kinetics, which require predicted ADPs among other parameters (Gao et al. 2010).
Protein binding sites
Prediction of binding sites at the protein surface has attracted considerable attention in recent structural bioinformatics, and ADPs have been repeatedly used for this purpose.
A first problem, when dealing with protein crystal structures, is to distinguish crystal-packing contacts from physiological protein–protein contacts (Janin and Rodier 1995; Carugo and Argos 1997; Krissinel and Henrick 2007; Duarte et al. 2012). Liu and co-workers defined four variables to describe the ADPs of protein–protein interfaces (Liu et al. 2014):
where n is the number of interfacial atoms and BN″′_j is the standardized ADP of the j-th interfacial atom;
where min_r is the smaller of the average numbers of residues per chain of the two biological units in a complex; and
where NoB is the number of interface atoms with a negative standardized ADP and a combination of the last two,
With empirical threshold values, these variables reach encouraging prediction accuracies on various data sets (Liu et al. 2014).
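As a rough illustration of interface descriptors of this kind, the following minimal Python sketch computes two quantities from the standardized ADPs of the interfacial atoms: their mean and the count NoB of atoms with a negative standardized ADP. Only NoB follows a definition stated in the text. The use of the mean as the first descriptor is an assumption, and the function name and dictionary keys are hypothetical. The exact expressions of Liu et al. (2014), including those that involve min_r, are not reproduced here.

```python
from statistics import mean


def interface_adp_descriptors(standardized_adps):
    """Summarize the standardized ADPs of the atoms at an interface.

    standardized_adps: one standardized ADP per interfacial atom.
    The mean is an assumed descriptor; NoB is the number of atoms
    with a negative standardized ADP, as defined in the text.
    """
    if not standardized_adps:
        raise ValueError("at least one interfacial atom is required")
    n = len(standardized_adps)
    mean_adp = mean(standardized_adps)
    # Atoms that are less mobile than the structure average have negative values.
    n_negative = sum(1 for b in standardized_adps if b < 0)
    return {"n": n, "mean_standardized_adp": mean_adp, "NoB": n_negative}


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(interface_adp_descriptors([-1.0, -0.5, 0.5, 2.0]))
```

A classifier would then compare such descriptors with empirically chosen thresholds. The thresholds themselves are not given here and would have to be taken from the original work.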
A machine learning technique, random forest, was used by Jiao and Ranganathan to predict interface residues in a set of heterodimers, where each surface residue is described by several variables, among which the ADP plays a prominent role (Jiao and Ranganathan 2017). Another machine learning technique, support vector machine, was used to predict interface residues in non-obligate dimers by using standardized ADPs as input, in addition to sequence profiles and solvent-accessible surface areas (Liu et al. 2010).
A further question is the computation of binding affinity. ADPs (the standardized BN″′ values) have been shown, with machine learning methods, to improve significantly on previous methods for predicting the affinity of protein–small molecule complexes (Liu et al. 2013).
Order parameter and positional accuracy
The positional accuracy of an atom is obviously related to its thermal motion, and atoms with extremely large ADPs are hardly detectable in the electron density maps. Cruickshank observed that the positional standard error (psu) increases with B following a quadratic trend:
where the parameters a, b, and c depend on the crystal structure examined (Cruickshank 1999). It has been proposed to estimate the average coordinate standard error [σ(x_i)] of the atoms of type i (for example, nitrogens, oxygens, or carbons) with the following expression:
where nobs is the number of experimental observations, npar is the number of refined parameters, R is the R-factor, res is the crystallographic resolution, and N_i is the number of atoms of type i needed to give scattering power equal to that of the asymmetric unit of the structure.
where f_i is the atomic form factor of atom i and the sum in the numerator runs over all the atoms in the asymmetric unit. The average coordinate standard error can be used to estimate the standard error of each individual atom [σ(x_i,B)] with the following expression:
where B_ave is the average ADP and the parameters a, b, and c depend on the crystal structure. Unfortunately, the rather empirical nature of this expression has limited its use by the scientific community.
More recently, Fenwick and co-workers proposed an ADP-based order parameter (OP) for pairs of bonded atoms defined as:
where the sum runs over all the conformational states i of the atoms u and v, o_i is the occupancy of the i-th conformational state, and B_u,i and B_v,i are the ADPs of the atoms u and v in the i-th conformational state (Fenwick et al. 2014). Interestingly, if the numerator (B_u,i + B_v,i) is equal to 8π² (≈ 79 Å²), then OP = 0, which indicates a completely disordered pair of atoms. On the contrary, OP approaches 1 if the ADPs are extremely small, that is, when the pair of atoms is particularly ordered. It must be observed that OP is only applicable to high-resolution structures (Fenwick et al. 2014).
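The limiting cases described above can be checked numerically with a minimal Python sketch. It assumes the form OP = 1 − Σ_i o_i (B_u,i + B_v,i) / (8π²), with occupancies that sum to one. This form is an assumption chosen to reproduce the two limits stated in the text (OP = 0 when B_u,i + B_v,i = 8π², and OP close to 1 for very small ADPs), not necessarily the exact expression of Fenwick et al. (2014). The function name and the input layout are hypothetical.

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2  # about 79 Å^2


def order_parameter(states):
    """ADP-based order parameter for a pair of bonded atoms u and v.

    states: list of (occupancy, B_u, B_v) tuples, one per conformational
    state, with ADPs in Å^2. Assumed form (see the accompanying text):
    OP = 1 - sum_i o_i * (B_u,i + B_v,i) / (8 * pi^2).
    """
    total_occupancy = sum(o for o, _, _ in states)
    if not math.isclose(total_occupancy, 1.0, abs_tol=1e-6):
        raise ValueError("occupancies must sum to one")
    # Occupancy-weighted sum of the ADPs of the two atoms.
    weighted = sum(o * (b_u + b_v) for o, b_u, b_v in states)
    return 1.0 - weighted / EIGHT_PI_SQ


if __name__ == "__main__":
    # Made-up ADPs for a pair of atoms with two alternate conformations.
    print(order_parameter([(0.6, 10.0, 12.0), (0.4, 20.0, 22.0)]))
```

With this form, a single fully occupied state with B_u + B_v = 8π² gives OP = 0, and smaller ADPs move OP towards 1, in agreement with the behavior described in the text.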
Conclusions
ADPs, which have been refined in crystal structures for decades and which depend on structural heterogeneity, provide a wide spectrum of information that can be used in numerous fields of structural biology and bioinformatics. Here, several applications of ADPs are reviewed, ranging from conformational disorder prediction in proteins to protein thermostabilization. A crucial aspect is the standardization of the ADPs when two or more protein crystal structures are compared, since ADPs are affected differently by several factors, from crystallographic resolution to refinement protocols; several standardization procedures are briefly summarized. A potential limitation to ADP analysis is the modern tendency to let ADPs inflate to extremely large values that have little physico-chemical meaning; upper limits, probably resolution dependent, need to be defined.
References
Alber T, Petsko GA, Tsernoglou D (1976) Crystal structure of elastase-substrate complex at −55 degrees C. Nature 263:297–300
Bahar I, Atilgan AR, Erman B (1997) Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Fold Des 3:173–181
Bahar I, Atilgan AR, Demirel MC, Erman B (1998) Vibrational dynamics of folded proteins: significance of slow and fast motions in relation to function and stability. Phys Rev Lett 80:2733–2736
ben-Avraham D, Tirion MM (1998) Normal modes analyses of macromolecules. Physica A. 249:415–423
Benkert P, Tosatto SC, Schomburg D (2008) QMEAN: a comprehensive scoring function for model quality assessment. Proteins. 71:261–277
Bhaskaran R, Ponnuswamy PK (1988) Positional flexibilities of amino acid residues in globular proteins. Chem Biol Drug Des 32:241–255
Bolognesi M, Rosano C, Losso R, Borassi A, Rizzi M, Wittenberg JB, Boffi A, Ascenzi P (1999) Cyanide binding to Lucina pectinata hemoglobin I and to sperm whale myoglobin: an X-ray crystallographic study. Biophys J 77:1093–1099
Bury CS, Carmichael I, Garman EF (2017) OH cleavage from tyrosine: debunking a myth. J Synchrotron Radiat 24:7–18
Carugo O, Argos P (1997) Protein-protein crystal-packing contacts. Protein Sci 6:2261–2263
Carugo O, Argos P (1999) Reliability of atomic displacement parameters in protein crystal structures. Acta Crystallogr D Biol Crystallogr 55(Pt 2):473–478
Carugo O, Djinovic-Carugo K (2005) When X-rays modify the protein structure: radiation damage at work. Trends Biochem Sci 30:213–219
Cruickshank DWJ (1999) Remarks about protein structure precision. Acta Cryst. D55:583–593
Dauter Z, Lamzin VS, Wilson KS (1997) The benefits of atomic resolution. Curr Opin Struct Biol 7:681–688
Declercq JP, Evrard C, Lamzin V, Parello J (1999) Crystal structure of the EF-hand parvalbumin at atomic resolution (0.91 Å) and at low temperature (100 K). Evidence for conformational multistates within the hydrophobic core. Protein Sci 8:2194–2204
Djinovic-Carugo K, Carugo O (2015) Missing strings of residues in protein crystal structures. Intrinsically Disord Proteins 3(1):1–7
Duarte J, Srebniak A, Scharer M, Capitani G (2012) Protein interface classification by evolutionary analysis. BMC Bioinform 13:334
Dunitz JD, Maverick EF, Trueblood KN (1988a) Atomic motions in molecular crystals from diffraction measurements. Angew Chem Int Ed Eng 27:880–895
Dunitz JD, Schomaker V, Trueblood KN (1988b) Interpretation of atomic displacement parameters from diffraction studies of crystals. J Phys Chem 92:856–867
Elgavish S, Shaanan B (1998) Structures of the Erythrina corallodendron lectin and of its complexes with mono- and disaccharides. J Mol Biol 277:917–932
Erman B (2016) Universal features of fluctuations in globular proteins. Proteins. 84:721–725
Fenwick RB, van den Bedem H, Fraser JS, Wright PE (2014) Integrated description of protein dynamics from room-temperature X-ray crystallography and NMR. Proc Natl Acad Sci USA 111:E445–E454
Fraser JS, van den Bedem H, Samelson AJ, Lang PT, Holton JM, Echols N, Alber T (2011a) Accessing protein conformational ensembles using room-temperature X-ray crystallography. Proc Natl Acad Sci USA 108:16247–16252
Fraser JS, van den Bedem H, Samelson AJ, Lang PT, Holton JM, Echols N, Alber T (2011b) Accessing protein conformational ensembles using room-temperature X-ray crystallography. Proc Natl Acad Sci USA 108:16247–16252
Frauenfelder H, Petsko GA (1980) Structural dynamics of liganded myoglobin. Biophys J 32:465–483
Frauenfelder H, Petsko GA, Tsernoglou D (1979) Temperature-dependent X-ray diffraction as a probe of protein structural dynamics. Nature 280:558–563
Frauenfelder H, Hartmann H, Karplus M, Kuntz IDJ, Kuriyan J, Parak F, Petsko GA, Ringe D, Tilton RFJ, Connolly ML et al (1987) Thermal expansion of a protein. Biochemistry 26:254–261
Gao J, Zhang T, Zhang H, Shen S, Ruan J, Kurgan L (2010) Accurate prediction of protein folding rates from sequence and sequence-derived residue flexibility and solvent accessibility. Proteins. 78:2114–2130
Garman E (2003) ‘Cool’ crystals: macromolecular cryocrystallography and radiation damage. Curr Opin Struct Biol. 13:545–551
Garman EF, Owen RL (2006) Cryocooling and radiation damage in macromolecular crystallography. Acta Crystallogr. D62:32–47
Giacovazzo C, Monaco HL, Artioli G, Viterbo D, Ferraris G, Gilli G, Zanotti G, Catti M (2002) Fundamentals of crystallography. Oxford University Press, Oxford
Gianese G, Bossa F, Pascarella S (2002) Comparative structural analysis of psychrophilic and meso- and thermophilic enzymes. Proteins 47:236–249
Gohlke H, Kuhn LA, Case DA (2004) Change in protein flexibility upon complex formation: analysis of Ras-Raf using molecular dynamics and a molecular framework approach. Proteins 56:322–327
Gourinath S, Himmel DM, Brown JH, Reshetnikova L, Szent-Györgyi AG, Cohen C (2003) Crystal structure of scallop myosin S1 in the pre-power stroke state to 2.6 Å resolution: flexibility and function in the head. Structure. 11:1621–1627
Haliloglu T, Bahar I (1999) Structure-based analysis of protein dynamics: comparison of theoretical results for hen lysozyme with X-ray diffraction and NMR relaxation data. Proteins. 37:654–667
Halle B (2002) Flexibility and packing in proteins. Proc Natl Acad Sci USA 99:1274–1279
Halle B (2004) Biomolecular cryocrystallography: structural changes during flash-cooling. Proc Natl Acad Sci USA 101:4793–4798
Hartmann H, Parak F, Steigemann W, Petsko GA, Ponzi DR, Frauenfelder H (1982) Conformational substates in a protein: structure and dynamics of metmyoglobin at 80 K. Proc Natl Acad Sci USA 79:4967–4971
Higo J, Umeyama H (1997) Protein dynamics determined by backbone conformation and atom packing. Prot Eng. 10:373–380
Hinsen K, Kneller G (1999) A simplified force field for describing vibrational protein dynamics over the whole frequency range. J Chem Phys. 111:10766–10769
Holton JM (2009) A beginner’s guide to radiation damage. J Synchrotron Radiat 16:133–142
Huang YJ, Acton TB, Montelione GT (2014) DisMeta: a meta server for construct design and optimization. Methods Mol Biol 1091:3–16
Huang J, Xie DF, Feng Y (2017) Engineering thermostable (R)-selective amine transaminase from Aspergillus terreus through in silico design employing B-factor and folding free energy calculations. Biochem Biophys Res Commun 483:397–402
Ishida T, Kinoshita K (2008) Prediction of disordered regions in proteins based on the meta approach. Bioinformatics 24:1344–1348
Jacobs DJ, Rader AJ, Kuhn LA, Thorpe MF (2001) Protein flexibility predictions using graph theory. Proteins. 44:150–165
Janin J, Rodier F (1995) Protein-protein interaction at crystal contacts. Proteins. 23:580–587
Jiao X, Ranganathan S (2017) Prediction of interface residue based on the features of residue interaction network. J Theor Biol 432:49–54
Joosten RP, Long F, Murshudov GN, Perrakis A (2014) The PDB_REDO server for macromolecular structure model optimization. IUCrJ. 1:213–220
Juers DH, Matthews BW (2001) Reversible lattice repacking illustrates the temperature dependence of macromolecular interactions. J Mol Biol 311:851–862
Karplus PA, Schulz GE (1985) Prediction of chain flexibility in proteins. Naturwissenschaften. 72:212–213
Kozlowski LP, Bujnicki JM (2012) MetaDisorder: a meta-server for the prediction of intrinsic disorder in proteins. BMC Bioinform 13:111
Krissinel E, Henrick K (2007) Inference of macromolecular assemblies from crystalline state. J Mol Biol 372:774–797
Kundu S, Melton JS, Sorensen DC, Phillips GN Jr (2002) Dynamics of proteins in crystals: comparison of experiment with simple models. Biophys J 83:723–732
Kuzmanic A, Pannu NS, Zagrovic B (2014) X-ray refinement significantly underestimates the level of microscopic heterogeneity in biomolecular crystals. Nat Commun 5:3220
Läuger P (1985) Ionic channels with conformational substates. Biophys J 47:581–590
Levitt M, Sander C, Stern PS (1985) Protein normal-mode dynamics: trypsin inhibitor, crambin, ribonuclease and lysozyme. J Mol Biol 181:423–447
Lieutaud P, Canard B, Longhi S (2008) MeDor: a metaserver for predicting protein disorder. BMC Genomics 9(Suppl 2):S25
Lieutaud P, Ferron F, Longhi S (2016) Predicting conformational disorder. Methods Mol Biol 1415:265–299
Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, Russell RB (2003) Protein disorder prediction: implications for structural proteomics. Structure (Camb). 11(11):1453–1459
Liu Q, Kwoh CK, Li J (2010) Identifying protein-protein interaction sites in transient complexes with temperature factor, sequence profile and accessible surface area. Amino Acids 38:263–270
Liu Q, Kwoh CK, Li J (2013) Binding affinity prediction for protein-ligand complexes based on β contacts and B factor. J Chem Inf Model 53:3076–3085
Liu Q, Li Z, Li J (2014) Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts. BMC Bioinform 15:S3
MacKerell AD, Bashford D, Bellott M, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S et al (1998) All-atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B. 102:3586–3616
Maguid S, Fernández-Alberti S, Parisi G, Echave J (2006) Evolutionary conservation of protein backbone flexibility. J Mol Evol 63:448–457
Necci M, Piovesan D, Dosztányi Z, Tosatto SCE (2017) MobiDB-lite: fast and highly specific consensus prediction of intrinsic disorder in proteins. Bioinformatics 33:1402–1404
Nguyen DD, Xia K, Wei GW (2016) Generalized flexibility-rigidity index. J Chem Phys. 144:234106
Pan XY, Shen HB (2009) Robust prediction of B-factor profile from sequence using two-stage SVR based on random forest feature selection. Protein Pept Lett 16:1447–1454
Pang YP (2016) Use of multiple picosecond high-mass molecular dynamics simulations to predict crystallographic B-factors of folded globular proteins. Heliyon. 2:e00161
Parthasarathy S, Murthy MRN (1997) Analysis of temperature factor distribution in high-resolution protein structures. Protein Sci 6:2561–2567
Parthasarathy S, Murthy MRN (1999) On the correlation between the main-chain and side-chain atomic displacement parameters (B values) in high-resolution protein structures. Acta Crystallogr. D55:173–180
Parthasarathy S, Murthy MR (2000) Protein thermal stability: insights from atomic displacement parameters (B values). Protein Eng 13:9–13
Potenza E, Domenico TD, Walsh I, Tosatto SC (2015) MobiDB 2.0: an improved database of intrinsically disordered and mobile proteins. Nucleic Acids Res. 43:D315–D320
Ragone R, Facchiano F, Facchiano A, Facchiano AM, Colonna G (1989) Flexibility plot of proteins. Prot Eng. 2:497–504
Rasmussen BF, Stock AM, Ringe D, Petsko GA (1992) Crystalline ribonuclease A loses function below the dynamical transition at 220 K. Nature 357:423–424
Rathi PC, Fulton A, Jaeger K-E, Gohlke H (2016) Application of rigidity theory to the thermostabilization of Lipase A from Bacillus subtilis. PLoS Comput Biol 12:e1004754
Reetz MT, Carballeira JD, Vogel A (2006) Iterative saturation mutagenesis on the basis of B factors as a strategy for increasing protein thermostability. Angew Chem Int Ed Eng. 45:7745–7751
Ringe D, Petsko GA (1986) Study of protein dynamics by X-ray diffraction. Methods Enzymol 131:389–433
Russi S, González A, Kenner LR, Keedy DA, Fraser JS, van den Bedem H (2017) Conformational variation of proteins at room temperature is not dominated by radiation damage. J Synchrotron Radiat 24:73–82
Schmidt A, Lamzin VS (2010) Internal motion in protein crystal structures. Protein Sci 19:944–953
Siglioccolo A, Gerace R, Pascarella S (2010) “Cold spots” in protein cold adaptation: insights from normalized atomic displacement parameters (B-factors). Biophys Chem 153:104–114
Singh TP, Bode W, Huber R (1980) Low-temperature protein crystallography. Effect on flexibility, temperature factor, mosaic spread, extinction and diffuse scattering in two examples: bovine trypsinogen and Fc fragment. Acta Cryst. B36:621–627
Smith JL, Hendrickson WA, Honzatko RB, Sheriff S (1986) Structural heterogeneity in protein crystals. Biochemistry 25:5018–5027
Smith DK, Radivojac P, Obradovic Z, Dunker AK, Zhu G (2003) Improved amino acid flexibility parameters. Protein Sci 12:1060–1072
Stein DL (1985) A model of protein conformational substates. Proc Natl Acad Sci USA 82:3670–3672
Tilton RFJ, Dewan JC, Petsko GA (1992) Effects of temperature on protein structure and dynamics: X-ray crystallographic studies of the protein ribonuclease-A at nine different temperatures from 98 to 320 K. Biochemistry 31:2469–2481
Tirion MM (1996) Large amplitude elastic motions in proteins from a single-parameter, atomic analysis. Phys Rev Lett. 77:1905–1908
Trueblood KN, Bürgi H-B, Burzlaff H, Dunitz JD, Gramaccioli CM, Schulz HH, Shmueli U, Abrahams SC (1996) Atomic displacement parameter nomenclature. Report of a subcommittee on atomic displacement parameter nomenclature. Acta Cryst. A52:770–781
Vihinen M, Torkkila E, Riikonen P (1994) Accuracy of protein flexibility predictions. Proteins. 19:141–149
Wang C, Lovelace LL, Sun S, Dawson JH, Lebioda L (2014) Structures of K42N and K42Y sperm whale myoglobins point to an inhibitory role of distal water in peroxidase activity. Acta Cryst. D70:2833–2839
Warkentin M, Thorne RE (2009) Slow cooling of protein crystals. J Appl Cryst. 42:944–952
Warkentin M, Thorne RE (2010) Glass transition in thaumatin crystals revealed through temperature-dependent radiation-sensitivity measurements. Acta Cryst. D66:1092–1100
Watson HC (1969) The stereochemistry of the protein myoglobin. Prog Stereochem 4:299–312
Weiss MS (2007) On the interrelationship between atomic displacement parameters (ADPs) and coordinates in protein structures. Acta Crystallogr. D63:1235–1242
Woldeyes RA, Sivak DA, Fraser JS (2014) E pluribus unum, no more: from one crystal, many conformations. Curr Opin Struct Biol 28:56–62
Xia K, Opron K, Wei GW (2015) Multiscale Gaussian network model (mGNM) and multiscale anisotropic network model (mANM). J Chem Phys. 143:204106
Yang J, Wang Y, Zhang Y (2016) ResQ: an approach to unified estimation of B-factor and residue-specific error in protein structure prediction. J Mol Biol 428:693–701
Yuan Z, Bailey TL, Teasdale RD (2005) Prediction of protein B-factor profiles. Proteins. 58:905–912
Zanotti G (2002) Protein crystallography. In: Giacovazzo C (ed) Fundamentals of crystallography. Oxford University Press, Oxford, pp 667–757
Zhang XF, Yang GY, Zhang Y, Xie Y, Withers SG, Feng Y (2016) A general and efficient strategy for generating the stable enzymes. Sci Rep. 6:33797
Acknowledgements
I would like to thank Kristina Djinović (University of Vienna) and the members of the COST BM1405 network on non-globular proteins for interesting discussions. The in-depth work of a reviewer is also gratefully acknowledged.
Ethics declarations
Conflict of interest
The author reports no declarations of interest.
Research involving human participants and/or animals
None.
Informed consent
None.
Additional information
Handling Editor: J. D. Wade.
Cite this article
Carugo, O. Atomic displacement parameters in structural biology. Amino Acids 50, 775–786 (2018). https://doi.org/10.1007/s00726-018-2574-y