Introduction

Solution hydrodynamic parameters of macromolecules, such as the translational (\( D_{t} \)) and rotational (\( D_{r} \)) diffusion coefficients, the sedimentation coefficient (s), and the intrinsic viscosity ([η]), can be experimentally determined by well-established techniques. Since the size, detailed shape, and time-dependent conformation determine the macromolecules’ frictional properties, the computation of these parameters from their structures has been a field of intense research. These calculations are, however, not straightforward. Well-defined geometrical objects, such as cylinders and ellipsoids, were initially used to build very low resolution models of proteins (Tanford 1961; Cantor and Schimmel 1980) and other biopolymers, and are still in use today in an enhanced version (Harding et al. 2004). A big step forward was the development of the theory for the computation of translational and rotational frictional coefficients and intrinsic viscosity of ensembles of non-overlapping spheres (beads) of differing radii (reviewed in García de la Torre and Bloomfield 1981; Spotorno et al. 1997; Carrasco and García de la Torre 1999). This procedure has been extended in a number of different ways to model proteins and other biomacromolecules of known 3D structure, ranging from shell modeling to grid-based methods (see Byron 2000). However, the calculation of the hydrodynamic parameters of ensembles of beads can be computationally demanding, requiring a compromise between the bead model resolution and the number and size of the beads employed. Furthermore, the effect of the so-called water of hydration (Halle and Davidovic 2003) should be correctly taken into account (see Rai et al. 2005).

Currently, three principal bead modeling methods are available, implemented in public-domain computer programs. A “grid” method was implemented by O. Byron in the program AtoB (Byron 1997). Here, the protein is subdivided into equally sized cubes and each residue is assigned to a particular cube. Then, according to user choice, beads of either equal or differing radii are generated and placed at the center of gravity of each cube, the resolution of the final model depending on the spacing of the initial cubic grid. AtoB (Byron 1997) was tested against a large globular protein (aldolase) and a spherical hollow protein (apoferritin). Bead models were generated and the calculated \( s^{0}_{(20,w)} \) and [η] values agreed well with experimental values, provided that an appropriate grid spacing was used and after radial expansion of the beads to compensate for the water of hydration. Zipper and Durchschlag (1997, 1998) followed a similar approach, and the usefulness of such methods appears to rest mainly in the modeling of very large structures.
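The grid idea described above can be illustrated with a minimal sketch. It is not the AtoB code: the function name `grid_beads`, the `(x, y, z, mass)` input layout, and the choice to bin generic points rather than whole residues are assumptions made only for illustration, and bead radii are deliberately left out.

```python
from collections import defaultdict
from math import floor


def grid_beads(points, spacing):
    """Return one bead per occupied cubic cell of edge `spacing`.

    points: iterable of (x, y, z, mass) tuples; this input layout is a
    hypothetical choice for the sketch, not the AtoB file format.
    Each bead is (x, y, z, mass), placed at the cell's center of gravity.
    """
    cells = defaultdict(list)
    for x, y, z, mass in points:
        # Integer cell index along each axis of the cubic grid
        key = (floor(x / spacing), floor(y / spacing), floor(z / spacing))
        cells[key].append((x, y, z, mass))

    beads = []
    for key in sorted(cells):
        members = cells[key]
        total = sum(m for _, _, _, m in members)
        # Mass-weighted mean position of the points in this cell
        cx = sum(x * m for x, _, _, m in members) / total
        cy = sum(y * m for _, y, _, m in members) / total
        cz = sum(z * m for _, _, z, m in members) / total
        beads.append((cx, cy, cz, total))
    return beads
```

A larger `spacing` merges more points into each bead, which is the resolution trade-off mentioned in the text.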

At the other end of the spectrum lies the “shell modeling” approach implemented in the currently most widely used bead modeling program, Hydropro, developed by García de la Torre and collaborators (García de la Torre et al. 2000; García de la Torre 2001). In this approach, all atoms in a protein are first replaced by equally sized beads of a certain radius. Then, the surface of this “primary” model is covered with a “shell” of smaller beads, and the procedure is iterated, decreasing the shell beads’ radius, allowing extrapolation to zero bead size. This approach has undergone more extensive testing (García de la Torre et al. 2000; García de la Torre 2001), and the models can on average reasonably reproduce the hydrodynamic parameters determined experimentally (albeit without a critical evaluation of the literature data, see below). Furthermore, to reach a consensus agreement across the test proteins, the primary beads’ radius was adjusted until a mean satisfactory value was found. In addition, to avoid excessive memory requirements and very long computing times, Hydropro currently has an upper limit of ~3,000 shell beads, whose radius is a function of the protein’s size, potentially limiting its precision when large structures are analyzed.

A third approach is to build a bead model with direct correspondence between the atoms in the macromolecules’ residues, such as amino acids in proteins, sugar units in carbohydrates and nucleosides in nucleic acids, and the beads that are used to represent them. This approach can overcome some of the limitations of the other methods, and was chosen for the development of SOMO (SOlution MOdeller; Rai et al. 2005), where, for instance, amino acid residues are each represented by two beads, one for the main-chain and another for the side-chain segments. The beads’ volumes are initially determined by summing the volumes of the atoms which they represent, and are then augmented by adding the volume of the water molecules which were experimentally found to be statistically bound to each residue (e.g., Kuntz and Kauzmann 1974). Beads are positioned according to the characteristics of the residues they represent, and the overlaps between them are then removed by proportional radial reduction, trying to preserve the original anhydrous surface envelope as much as possible. This is aided by an accessible surface area (ASA) computation initially performed on the atomic structure to separate exposed beads from buried beads. Buried beads can then be excluded from subsequent hydrodynamic parameter computations, which in the original SOMO implementation were carried out separately by the program SUPCW (Spotorno et al. 1997; Rai et al. 2005). SOMO was extensively tested against three small proteins, BPTI, RNase A and lysozyme, for which a very large body of critically evaluated hydrodynamic data exists, plus two larger proteins, fibrinogen fragment D and citrate synthase dimer, with very good results (Rai et al. 2005). SOMO has already been instrumental in discriminating between alternative conformations of integrins in a recently published study (Rocco et al. 2008). However, the original SOMO implementation suffered from a number of drawbacks. 
SOMO consisted of a collection of separate, command-line driven executables running under the Linux operating system, with a rather rigid user interface, recognizing only residues hard-coded in the programs. To overcome these flaws, an entirely re-designed and enhanced version of the SOMO program was developed by the authors by integrating the basic functionality of SOMO under the open source software UltraScan (US). First, we added a graphical user interface (GUI) and replaced the hard coded residue representation by implementing user-modifiable reference tables which code for the atomic groups and residues present in the Protein Data Bank (PDB; Berman et al. 2000) structures. A full range of options controlling details of the modeling process and of the hydrodynamic computations (performed with an integrated version of SUPCW; see Spotorno et al. 1997 and Rai et al. 2005) can be accessed through dedicated menus. In the process, we have also corrected some mistakes that went unnoticed in the original SOMO release, and added new features. Finally, a module for the creation of bead models based on the AtoB grid method (Byron 1997), further developed by M. Nöllman (Centre de Biochimie Structurale, CNRS-INSERM, Montpellier, FR) for the original SOMO program (Rai et al. 2005), has been coded for US-SOMO. This allows either a further reduction of resolution starting from a previously generated bead model, or the direct generation of grid-based models from PDB files or small-angle X-ray scattering (SAXS)-derived dummy atoms models. This first US-SOMO release was tested with an expanded number of X-ray crystallography and NMR spectroscopy structures, as presented by García de la Torre (2001), whose experimental hydrodynamic parameters were, however, critically re-evaluated. In the Electronic Supplementary Material (ESM) of this paper, a detailed description of the operation and main features of US-SOMO is presented. 
The very satisfactory results of US-SOMO in reproducing most experimental parameters of the test proteins are here reported and discussed, highlighting its potential as a powerful tool for many hydrodynamic modeling applications.
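As an illustration of the SOMO bead construction outlined earlier in this Introduction (bead volume as the sum of atomic volumes plus bound water, followed by proportional radial reduction of overlapping beads), a minimal sketch might look as follows. The function names and the `water_volume` parameter are hypothetical, and the pairwise rescaling shown here is a simplification: the actual procedure is hierarchical and can include outward translation of exposed side-chain beads.

```python
from math import dist, pi


def bead_radius(atom_volumes, n_waters, water_volume):
    """Radius of a sphere whose volume is the atoms' volume plus bound water.

    water_volume is the volume assigned to one bound water molecule; its
    value is left to the caller because the text does not state it.
    """
    volume = sum(atom_volumes) + n_waters * water_volume
    return (3.0 * volume / (4.0 * pi)) ** (1.0 / 3.0)


def reduce_overlap(center1, r1, center2, r2):
    """Scale both radii by the same factor so that the two beads just touch."""
    d = dist(center1, center2)
    if d >= r1 + r2:
        return r1, r2  # no overlap, nothing to do
    scale = d / (r1 + r2)  # same proportional reduction for both beads
    return r1 * scale, r2 * scale
```

Scaling both radii by the same factor keeps their ratio unchanged, which is one simple reading of "proportional" reduction.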

Methods

US-SOMO implementation: general layout, reference tables and options

In Fig. 1 we present the main GUI panel of the new SOMO implementation under UltraScan, which can be accessed from the US “Simulation” drop down menu. The program is divided into three sub-menus. “Modify Lookup Tables:” refers to the reference files needed to operate the program; settings of various modeling and computational options are listed under “Modify SOMO Options for:”; and “Run SOMO Program:” refers to the various runtime operations. The right-side window updates the user about the operation(s) in progress. The US-SOMO operations are described in detail in the ESM.

Fig. 1 Main panel of the US-SOMO program, shown after processing the 8RAT.pdb file. The font size in the right-side window has been artificially reduced to show the entire process

At the core of the program lies its capability to read and interpret PDB-formatted structural files. US-SOMO loads a PDB file and recognizes only the relevant records, discarding all others. Currently, these include the ATOM, HETATM, MODEL, ENDMDL, TER, and END records. Within the ATOM and HETATM records, US-SOMO extracts and loads the atom name, the residue name, the chain identifier, the residue sequence number, and the x, y and z coordinates into appropriate data structures. The atom and residue names are then compared with the records present in the somo.residue table, which can be edited by the user through a pop-up window accessed by pressing the “Add/Edit Residue” button in the main panel.
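The extraction step described above can be sketched with the fixed-column layout of the PDB format. This is an illustrative reader, not the US-SOMO parser: the function name `parse_pdb_atoms` and the dictionary keys are hypothetical, and only the fields named in the text are read.

```python
def parse_pdb_atoms(lines):
    """Collect the fields named in the text from ATOM and HETATM records."""
    atoms = []
    for line in lines:
        record = line[0:6].strip()  # record name, columns 1-6
        if record not in ("ATOM", "HETATM"):
            continue  # all other records are ignored in this sketch
        atoms.append({
            "atom": line[12:16].strip(),     # atom name, columns 13-16
            "residue": line[17:20].strip(),  # residue name, columns 18-20
            "chain": line[21],               # chain identifier, column 22
            "resseq": int(line[22:26]),      # residue sequence number, 23-26
            "x": float(line[30:38]),         # coordinates in Å, columns 31-54
            "y": float(line[38:46]),
            "z": float(line[46:54]),
        })
    return atoms
```

The returned atom and residue names are the keys that would then be looked up in a residue table.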

Each residue type present in the PDB file must be correctly described in the somo.residue table, and, in order to have maximum flexibility in coding for all possible residues, two other tables were defined. In the first one, somo.hybrid, the different atomic groups are listed, together with their fundamental properties, i.e., the mass and the atomic van der Waals (VdW) radius, according to the “hybridizations” described by Tsai et al. (1999). The current content of the somo.hybrid file is shown in Table S1 in the ESM, and users can edit the current definitions or add new atomic groups through the “Add/Edit Hybridization” menu (not shown). The atomic groups listed in somo.hybrid are then used to build the somo.atom table through the “Add/Edit Atom” menu (not shown). A brief excerpt of the current entries in this table is shown in Table S2 in the ESM, which shows that the PDB coding for atoms does not discriminate between different hybridization states. For instance, CB is bound to three hydrogen atoms in alanine (C4H3), to just one in valine, isoleucine and threonine (C4H1), and to two in all other amino acids (C4H2), implying a mass difference as shown in Table S2. A more profound difference is found, as an example, between the CG in leucine, having four single bonds and one hydrogen atom bound (C4H1), and that in histidine, having two single bonds and one double bond, and no hydrogens bound (C3H0). In this case, not only the molecular weight but also the VdW radius is different. Thus, the somo.atom file allows selection of the correct atom name/molecular weight/VdW radius combination for each of the atoms within a residue. A full description of the operations necessary to enter/edit a residue in the somo.residue table is presented in the ESM.
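The role of the lookup tables can be illustrated with a small sketch that maps a (residue, atom name) pair to an atomic group and derives the group mass from its label. The dictionary layout and function names are hypothetical and do not reproduce the somo.atom or somo.hybrid file formats; only group assignments stated in the text are included, and VdW radii are omitted because the text does not list their values.

```python
ATOMIC_MASS = {"C": 12.011, "H": 1.008}  # standard atomic weights (g/mol)

# (residue name, PDB atom name) -> atomic group label, taken from the text
ATOM_GROUP = {
    ("ALA", "CB"): "C4H3",
    ("ILE", "CB"): "C4H1",
    ("THR", "CB"): "C4H1",
    ("SER", "CB"): "C4H2",
    ("LEU", "CG"): "C4H1",
    ("HIS", "CG"): "C3H0",
}


def group_mass(label):
    """Mass of a group labeled <element><bonds>H<n hydrogens>, e.g. 'C4H3'."""
    element, n_hydrogens = label[0], int(label[3:])
    return ATOMIC_MASS[element] + n_hydrogens * ATOMIC_MASS["H"]


def atom_mass(residue, atom):
    """Resolve the same PDB atom name to different masses per residue."""
    return group_mass(ATOM_GROUP[(residue, atom)])
```

With these entries, the atom name CB resolves to different masses in alanine and threonine, which is the ambiguity the tables are meant to remove.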

Technical details

US-SOMO is written in C++ and linked against the UltraScan (Demeler 2005) and Qt (TrollTech.com: Qt - a cross-platform application framework. http://www.trolltech.com/) libraries. The code is licensed under the GPL (The GNU General Public License Version 3. http://www.gnu.org/copyleft/gpl.html) and can be downloaded from the UltraScan wiki (The UltraScan Trac Wiki. http://wiki.bcf.uthscsa.edu/ultrascan/). Binaries for all major platforms (Linux/X11, Microsoft Windows, Macintosh OS-X) can be downloaded from the UltraScan website at http://www.ultrascan.uthscsa.edu.

Experimental hydrodynamic data

All experimental hydrodynamic parameters of the proteins used to test US-SOMO were taken from the literature, but with a critical evaluation of the conditions used and of the correctness, whenever possible, of their reduction to standard conditions (water at 20°C). A full list is presented in the ESM, with the appropriate references.

Protein structures

The high-resolution structures of the test proteins were taken from the PDB (http://www.rcsb.org/pdb/home/home.do). Whenever possible, we sought structures deriving from the same species from which the solution data were available. This explains some differences between the structures we have employed and those previously used (García de la Torre et al. 2000; García de la Torre 2001). The completeness of each structure was checked and ensured at two levels: missing atoms within side chains were automatically added by the WHATIF webserver (Vriend 1990; http://swift.cmbi.ru.nl/servers/html/index.html) while missing residues were mostly manually modeled using O (Jones et al. 1991). A relatively long C-terminal sequence in nitrogenase MoFe was generated by Robetta (Chivian et al. 2005; http://robetta.bakerlab.org/) using the ab initio protocol (Bonneau et al. 2002), and then pasted in the original structure using O.

Results and discussion

The US-SOMO implementation was first thoroughly tested against the original SOMO software (Rai et al. 2005). In the process, several minor bugs were fixed, the most significant of which involved an incorrect formulation of the outward translation when reducing exposed side-chain beads (see Rai et al. 2005). Of the two ASA algorithms implemented, SurfRace (Tsodikov et al. 2002) was found to be very reliable for small, compact structures, but presented some problems with larger, multisubunit structures. Therefore, in all subsequent work we used the ASAB1 option based on the Lee and Richards (1971) rolling sphere method, which is also the only option implemented for re-checking the beads’ exposure after overlap reduction. Another change affects the threshold detection for activating bead fusion (“popping”), which is now done by computing the intersection volume of pairs of beads. A pair of beads is fused when the volume of intersection is greater than the volume of either bead multiplied by the user-defined percentage overlap. The volume of the fused bead is the total volume of the pair of beads.
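The volume-based popping test can be sketched with the standard closed-form expression for the intersection volume of two spheres. This is an illustration, not the US-SOMO source: the function names are hypothetical, and the sketch assumes that fusion is triggered when the shared volume exceeds the threshold fraction of either bead's volume.

```python
from math import pi


def sphere_volume(r):
    return 4.0 * pi * r ** 3 / 3.0


def intersection_volume(r1, r2, d):
    """Volume shared by two spheres of radii r1, r2 with centers d apart."""
    if d >= r1 + r2:
        return 0.0  # beads do not overlap
    if d <= abs(r1 - r2):
        return sphere_volume(min(r1, r2))  # smaller sphere fully inside
    return (pi * (r1 + r2 - d) ** 2
            * (d * d + 2.0 * d * (r1 + r2) - 3.0 * (r1 - r2) ** 2)
            / (12.0 * d))


def should_fuse(r1, r2, d, threshold):
    """threshold is a fraction (0.40 for 40%); assumed criterion, see lead-in."""
    shared = intersection_volume(r1, r2, d)
    return any(shared > threshold * sphere_volume(r) for r in (r1, r2))


def fused_radius(r1, r2):
    """Radius of a bead whose volume is the total volume of the pair."""
    return (r1 ** 3 + r2 ** 3) ** (1.0 / 3.0)
```

For two unit beads whose centers are one radius apart, the shared volume is 5/16 of a bead volume, so under this assumed criterion the pair would be fused at a 30% threshold but not at 40%.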

The testing against protein structures was performed in three phases. In the first, multiple structures for the same protein, originating from both X-ray crystallography and NMR spectroscopy, were used. For the latter, the hydrodynamic parameters computed for each of the multiple conformations present in the models were averaged. The test proteins chosen, for which an extensive body of experimental hydrodynamic data exists, are the same as those utilized in Rai et al. (2005), namely bovine pancreatic trypsin inhibitor (BPTI), bovine pancreatic ribonuclease (RNase), and hen egg white lysozyme, to which myoglobin was added. A second set included the other proteins utilized by García de la Torre and collaborators (García de la Torre et al. 2000; García de la Torre 2001), excluding some less characterized proteins (trypsin, pepsin and subtilisin). Instead, we have examined in more detail hemoglobin, glyceraldehyde-3-phosphate dehydrogenase (G3PD) and lactate dehydrogenase (LDH), for which data and structures coming from different species exist. For these two sets, we computed and compared \( D^{0}_{t(20,w)} \), \( s^{0}_{(20,w)} \), \( \tau^{h}_{c(20,w)} \), and [η], whose experimental values were critically assessed as reported in Tables S3–S5 of the ESM. Finally, only the \( \tau^{h}_{c(20,w)} \) values were computed for the full protein set presented in Table 2 of García de la Torre (2001), after re-calculation of the reduction to standard conditions of the experimental values as presented in Table S6. In all our modeling, we kept some options fixed, including: (1) a popping threshold of 40% for exposed side chains and of 60% for exposed main chain beads (this differs from what was used by Rai et al. (2005) because of the new definition of overlap threshold implemented in US-SOMO, see above); (2) the hierarchical overlap removal procedure was used in all cases, with outward translation for the exposed side chain beads; (3) the computations of the hydrodynamic parameters were done with stick boundary conditions, referred to the diffusion center, and with exclusion of the buried beads from both the full computations and from the volume correction; (4) the molecular weights and partial specific volumes used for the computation of \( s^{0}_{(20,w)} \) and [η] were those computed by US-SOMO from the composition.
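For orientation, item (4) can be read against the classical Svedberg relation linking sedimentation and translational diffusion. The expressions below are the textbook forms, not formulas quoted from US-SOMO or SUPCW; the symbols \( M \) (molecular weight), \( \rho \) (solvent density), \( f \) (translational frictional coefficient), \( N_{A} \), \( R \) and \( T \) are introduced here for illustration only.

\[
s = \frac{M\,(1 - \bar{v}_{2}\,\rho)}{N_{A}\, f}, \qquad
D_{t} = \frac{R\,T}{N_{A}\, f}, \qquad
\frac{s}{D_{t}} = \frac{M\,(1 - \bar{v}_{2}\,\rho)}{R\,T}
\]

The buoyancy factor \( (1 - \bar{v}_{2}\rho) \) enters \( s \) but not \( D_{t} \), so the choice of molecular weight and partial specific volume affects the computed sedimentation coefficient while leaving the diffusion coefficient unchanged.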

In Table 1, the comparisons between experimental and calculated \( D^{0}_{t(20,w)} \) and \( s^{0}_{(20,w)} \) for BPTI, RNase, lysozyme and myoglobin are presented. Taking full advantage of the ease with which some modeling options can now be set in US-SOMO, we explored the influence of ASA thresholds on these parameters. In practice, increasing the residues’ ASA threshold labels more beads as buried, and increasing the ASA re-check threshold also keeps more beads in the buried category. The effect on the two parameters examined in Table 1 is, therefore, entirely due to the number and position of the beads employed in the computations. However, as we will see later in Table 2 (and Tables 4, 5), it has an additional impact on the \( \tau^{h}_{c(20,w)} \) and [η] values because of the exclusion of the buried beads from the volume correction. The three conditions examined are residues’ ASA thresholds of 10, 20 and 40 Å² (A10, A20, A40), coupled respectively with beads’ ASA re-check thresholds of 30, 50 and 60% (R30, R50, R60). For comparison, the A10/R30 condition is equivalent to that employed by Rai et al. (2005) in their modeling study.
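The effect of the two thresholds can be illustrated with a short sketch. The function names are hypothetical, the handling of values exactly at the threshold is an assumption, and so is the reading of the re-check threshold as a percentage of the bead's full surface area; the ASA values themselves would come from a separate surface computation.

```python
from math import pi


def buried_residues(residue_asa, asa_threshold):
    """Names of residues whose ASA (in Å²) does not exceed the threshold."""
    return {name for name, asa in residue_asa.items() if asa <= asa_threshold}


def stays_exposed(bead_radius, bead_asa, recheck_percent):
    """Re-check after overlap removal: keep a bead exposed only if its
    accessible area is at least recheck_percent of its full surface."""
    full_surface = 4.0 * pi * bead_radius ** 2
    return 100.0 * bead_asa / full_surface >= recheck_percent
```

Raising either threshold can only move beads from the exposed to the buried category, which is the monotonic behavior described in the text.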

Table 1 Comparison between experimental and calculated \( D_{t(20,w)} \) and \( s_{(20,w)} \) values for US-SOMO bead models derived from test proteins
Table 2 Comparison between experimental and calculated \( \tau_{c(20,w)} \) and [η] values for US-SOMO bead models derived from test proteins

The first interesting result from Table 1 is that the \( D^{0}_{t(20,w)} \) value of three out of the four test proteins examined here is reproduced extremely well, within the 3% error of the experimental data. Moreover, it is independent of the structure used to generate the models, with no appreciable differences between X-ray and NMR models. The lone exception is lysozyme, for which the X-ray-derived structures perform slightly worse than in the other cases examined, while the NMR structure is in excellent agreement (<1%). This was already noticed by Rai et al. (2005), who suggested that this effect is mainly due to the high number of long, hydrophilic residues on the lysozyme surface, which are not fully extended in crystal structures due to crystal packing. As for \( s^{0}_{(20,w)} \), the results are mixed, with excellent agreement for RNase (≤3%) and for the NMR-derived model of lysozyme (≤2%), while the X-ray-derived models of the latter suffer from the same problem seen with \( D^{0}_{t(20,w)} \). The poor agreement of the myoglobin CO \( s^{0}_{(20,w)} \) data is instead likely due to a suspicious experimental value ("?" in Table 1), since only a minor change is observed in the \( D^{0}_{t(20,w)} \) data between the CO and apo forms. As will be discussed in more detail below, the computed value of the partial specific volume \( \bar{v}_{2} \) could also affect the reliability of these numbers.

The other important evidence derived from Table 1 is the very small effect of greatly reducing the number of the beads used in the computations by increasing the ASA thresholds. Practically, halving the number of beads decreases the accuracy by roughly 1%. This is a surprising result, and it will be further discussed below in conjunction with some graphical images of larger protein models. From the data presented in Table 1, it seems safe to use a residue ASA threshold of 20 Å² coupled with a bead ASA re-check threshold of 50%, effectively obtaining a factor of ~10 in the reduction of the number of frictional points with respect to the starting atomic structures.

In Table 2, the results of the comparisons between experimental and computed \( \tau^{h}_{c(20,w)} \) and [η] values are presented for the same proteins as in Table 1. The first thing to notice is the larger error present in the experimental \( \tau^{h}_{c(20,w)} \) values, between 6 and 10%, with respect to the \( D^{0}_{t(20,w)} \), \( s^{0}_{(20,w)} \) and [η] values. The second is that the ASA threshold values have a relevant effect on the calculated parameters: increasing the ASA threshold decreases the computed \( \tau^{h}_{c(20,w)} \) and [η] values, because fewer beads are included in the volume correction. Examining in detail the \( \tau^{h}_{c(20,w)} \) values, it seems that the A20/R50 combination produces the best match between experimental and computed data, well below the experimental errors. The exceptions are BPTI, for which it seems that the experimental data might underestimate the rotational tumbling (supported by the lack of differences between X-ray and NMR structures), and the NMR-derived lysozyme model(s). For the latter, this effect was again noticed and tentatively explained by Rai et al. (2005) as deriving from the opposite effect of the long, hydrophilic and flexible surface side-chains on translational and rotational diffusion. As for [η], the two available datasets confirm that choosing a 20 Å² residue ASA threshold coupled with a 50% beads ASA re-check threshold produces an excellent match between experimental and computed data. Again, the NMR-derived model(s) of lysozyme are in poor agreement for the reasons given above.

Next, we examine a series of structures of increasing size. In Table 3, the \( D^{0}_{t(20,w)} \) and \( s^{0}_{(20,w)} \) values are reported, and it can be seen that most \( D^{0}_{t(20,w)} \) values computed with A10/R30 are within 1% of the experimental values. The exceptions are catalase (+6.3%), α-lactalbumin (+3.7%), the oxy form of hemoglobin (−3.6%), the holo form of G3PD (−4.2%), and nitrogenase MoFe (−5.4%). Given the uncertainty associated with the experimental data, some of which are more than 60 years old (see ESM Table S3), we can consider this an excellent result. As for the computed \( s^{0}_{(20,w)} \) values, most of them are within 5% of the experimental data. The deoxy form of hemoglobin and again nitrogenase MoFe are here the worst performers (+9.7 and +11%, respectively), while in the case of NAD-bound pig muscle lactate dehydrogenase (+8.6%) the experimental value is suspiciously equal to that of the pig heart form. This is in contrast to the corresponding \( D^{0}_{t(20,w)} \) values, where there is a net difference extremely well matched by the respective models. As for Table 1, we have not investigated the effect of using experimental \( \bar{v}_{2} \) values in the computations instead of calculated values, and similar effects could have affected the conversion to standard conditions of the original data. In this light, the agreement of the computed and experimental \( s^{0}_{(20,w)} \) data in Table 3 can be considered satisfactory. Moreover, it can be seen that the A20/R50 combination works here as well as the original A10/R30.

Table 3 Further comparison between experimental and calculated \( D_{t(20,w)} \) and \( s_{(20,w)} \) values for US-SOMO bead models derived from test proteins

\( \tau^{h}_{c(20,w)} \) and [η] data are also available for a restricted set of the same proteins, presented in Table 4. Using the A20/R50 combination, the \( \tau^{h}_{c(20,w)} \) values of all proteins, except ovalbumin, are within 15% of the experimental values, which is good considering the errors associated with the experimental data. More data are available for [η], and here the results are mixed, with four protein models having computed values within 2% of the experimental data, and another four lying between 10 and 15% (still considering the A20/R50 framework). Again, some experimental data are suspicious, like the 4 cm³/g value for ovalbumin, but a full examination of these issues is beyond the scope of this paper.

Table 4 Further comparison between experimental and calculated \( \tau_{c(20,w)} \) and [η] values for US-SOMO bead models derived from test proteins

The results described in the previous section can be better interpreted by comparing the original atomic structures and the US-SOMO-generated models. In Fig. 2, panels a–d, the original β-lactoglobulin (1BEB.pdb) structure is shown (panel a) together with the three bead models generated by US-SOMO (panels b–d) whose parameters are reported in Tables 3 and 4. The color-coding, fully described in the Fig. 2 legend, refers to the characteristics of the residues’ side chains and also distinguishes the buried beads (orange) from all the other beads. Note the increasing proportion of the buried beads in going from panel b (model generated with A10/R30) to panel d (A20/R50), to panel c (A40/R60). From the data in Table 3, the loss of prediction accuracy for \( D^{0}_{t(20,w)} \) is only about 0.5% in going from model b (A10/R30) to model c (A40/R60), while the number of beads used to calculate the parameters drops from 325 to 139. Evidently, the translational motion of the protein is dominated by a restricted number of highly exposed frictional centers, clearly seen in Fig. 2 by comparing panels b and c. A similar situation is found with a larger protein made of four subunits, pig heart lactate dehydrogenase, whose atomic structure (5LDH.pdb) is shown in panels e and f of Fig. 2. The two different representations were made to show both the subunit composition of LDH (panel e) and the US-SOMO residue coding (panel f). The two US-SOMO bead models in panels g and h were generated with A10/R30 and A40/R60, respectively. Again, the huge increase of “buried” beads between the two models corresponds only to a modest loss of accuracy of about 0.4% in \( D^{0}_{t(20,w)} \) (Table 3). Overall, these data cast doubt on the necessity of an accurate modeling of protein surfaces for translational friction, which appears to be dominated by a subset of frictional centers. As for the rotational dynamics and intrinsic viscosity, the interpretation is complicated by the role of the excluded beads in the volume correction, and a more in-depth analysis should await further studies.

Fig. 2 Atomic structures, shown in space filling mode, of β-lactoglobulin (1BEB.pdb, panel a) and pig heart lactate dehydrogenase with NAD bound (5LDH.pdb, panels e and f) with their corresponding US-SOMO-generated bead models (β-lactoglobulin, panels b–d; LDH, panels g and h). The models in panels b and g were generated with A10/R30, that in panel d with A20/R50, and those in panels c and h with A40/R60 (see text for details). The color-coding in panel e is blue for the main chain atoms, and green, green-blue, magenta and yellow for the side chain atoms of the four LDH subunits; the NAD moieties are pink. In all other panels, the color coding is: blue, main-chain; cyan, hydrophobic; magenta, non-polar; yellow, basic; green, acid; white, fused beads; orange, buried beads

Nevertheless, we can now examine in detail the performance of the US-SOMO-generated models in matching the NMR-derived \( \tau^{h}_{c(20,w)} \) values for a set of relatively small proteins, as presented by García de la Torre (2001). To ensure a proper comparison, the reduction to standard conditions of the experimental data was again critically assessed, as reported in Table S6. It was found that most likely the effect of D2O on the solution viscosity was not accounted for, leading to values different from those reported in Table 2 of García de la Torre (2001). In Table 5, the corrected \( \tau^{\mathrm{exp}}_{c(20,w)} \) values are thus reported, and compared with those computed by US-SOMO (\( \tau^{\mathrm{SOMO}}_{c(20,w)} \)) using the A20/R50 combination, which performed best in the previously described testing phase. In addition, we have also reported in Table 5 the Hydropro-generated values, \( \tau^{\mathrm{HP}}_{c(20,w)} \), presented in Table 2 of García de la Torre (2001), with their % differences from the new recalculated experimental values. Furthermore, we have expanded the set of structures to include additional NMR-derived structures, for which the \( \tau^{\mathrm{SOMO}}_{c(20,w)} \) averages were computed. This was facilitated by the fully automated processing implemented in US-SOMO, allowing the generation of most of the dataset presented in Table 5 in a mere 5 h of work, including the retrieval of the structures from the PDB and the coding of new residues (ligands and co-factors) not originally present in the somo.residue file. A few structures needed more work because they were incomplete, requiring additional operations. In particular, in the 1LKI leukemia inhibitory factor X-ray structure the first eight N-terminal residues were missing, and were taken from the 1A7M NMR structure (three different conformations were selected, and averages computed). Likewise, in the 1STN staphylococcal nuclease SN X-ray structure, the first five N-terminal and the last eight C-terminal residues were missing, and were taken from the 1JOR NMR structures, again generating three different models whose parameters were then averaged. For comparison, the original incomplete structures were also processed and their computed values are presented in Table 5. We must also underscore that the data reported in Table 2 of García de la Torre (2001) were computed on a single structure for each protein examined, while the full datasets could be processed in our study.
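The reduction of a measured correlation time to standard conditions, including the solvent viscosity effect discussed above, can be sketched as follows. This is the generic scaling that follows from the Stokes–Einstein–Debye relation for a rigid particle (correlation time proportional to viscosity divided by absolute temperature), not the exact procedure behind Table S6; the function name and parameters are hypothetical, and the viscosity of the actual solvent (for example a D2O-containing buffer) must be supplied by the user.

```python
T_STD = 293.15  # 20 °C in kelvin


def tau_to_standard(tau, temperature_k, solvent_viscosity,
                    water_viscosity_20=1.002):
    """Scale a correlation time to water at 20 °C.

    Viscosities must share one unit (here mPa·s); solvent_viscosity is the
    viscosity of the actual solvent at temperature_k.
    """
    return (tau * (water_viscosity_20 / solvent_viscosity)
            * (temperature_k / T_STD))
```

If the higher viscosity of a D2O-based solvent is ignored and the viscosity of H2O is used instead, the reduced correlation time comes out too long by the ratio of the two viscosities.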

Table 5 Comparison of NMR-derived correlation times, \( \tau^{\mathrm{exp}}_{c(20,w)} \), with those computed by US-SOMO with A20/R50, \( \tau^{\mathrm{SOMO}}_{c(20,w)} \), and by HYDROPRO, \( \tau^{\mathrm{HP}}_{c(20,w)} \) (García de la Torre, 2001), for a set of test proteins

Examining in detail the data presented in Table 5, we notice that our models in general match the experimental data better than those produced by Hydropro, the apparent exceptions being interleukin-1β, lysozyme and eglin-c. However, these data must also be interpreted in the light of the evidence, documented in Table 5, that the average data from the NMR-derived structures in many cases perform worse than the corresponding X-ray structures, when both can be compared. An examination of the structures (not shown) reveals that in these cases the residues at the N- and C-terminal ends are very disordered, giving rise to quite different conformations. It is thus likely that this disorder reflects true conformational flexibility, which cannot be properly modeled in the rigid-body approximation used by the current implementation of US-SOMO and by Hydropro. Obviously, it would be possible to choose a single conformation matching the experimental parameters, but this would be clearly incorrect. In any case, overall the data presented in Table 5 confirm the reliability of the US-SOMO hydrodynamic modeling, while suggesting that other approaches, like Brownian dynamics (Ermak and McCammon 1978) or discrete molecular dynamics (Dokholyan et al. 1998) simulations, should be used to properly account for local flexibility effects.

In conclusion, we have shown that the bead-modeling scheme implemented in US-SOMO could be a very valuable tool in biomacromolecular hydrodynamic studies. The limitations present in the original SOMO (Rai et al. 2005) have been removed, and the program is now controlled from a GUI. When pre-set default parameters are used and non-standard residues or ligands are absent from a structure, the computations of all the hydrodynamic parameters are fast and reliable, and very accurate for at least the translational diffusion parameters. When using the proper ASA/ASA re-check combination, the computational time required is minimal on standard personal computers, ranging from seconds for structures in the range 5–50 kDa to a few minutes for structures up to 250 kDa like catalase. While defining new residues still requires a detailed knowledge of their physical-chemical characteristics, these operations are also greatly aided by the GUI, effectively allowing the modeling of any kind of biomacromolecule from proteins to nucleic acids, carbohydrates, lipids and their complexes. The somo.atom and somo.residue files already contain, respectively, 300 and 64 entries covering amino acids, nucleotides, sugars, co-factors like heme and NAD/NADPH, and prosthetic groups like N-acetyl and phosphate. These files will be constantly updated, also, we hope, with the help of the analytical ultracentrifugation and other hydrodynamic techniques communities, for which this enhanced and powerful tool was mainly developed.