Abstract
The ability to simulate sedimentation velocity (SV) analytical ultracentrifugation (AUC) experiments has proved to be a valuable tool for research planning, hypothesis testing, and pedagogy. Several options for SV data simulation exist, but they often lack interactivity and require up-front calculations on the part of the user. This work introduces SViMULATE, a program designed to make AUC experimental simulation quick, straightforward, and interactive. SViMULATE takes user-provided parameters and outputs simulated AUC data in a format suitable for subsequent analyses, if desired. The user is not burdened by the necessity to calculate hydrodynamic parameters for simulated macromolecules, as the program can compute these properties on the fly. It also frees the user of decisions regarding simulation stop time. SViMULATE features a graphical view of the species that are under simulation, and there is no limit on their number. Additionally, the program emulates data from different experimental modalities and data-acquisition systems, including the realistic simulation of noise for the absorbance optical system. The executable is available for immediate download.
Introduction
When properly applied, simulation can be a vital component of planning a biophysical experiment. This utility is especially important in the field of analytical ultracentrifugation (AUC) in the sedimentation velocity (SV) mode. In this experimental modality, a high centrifugal field is applied to a solution of a macromolecule or a mixture of several macromolecules. As the molecules migrate along the vector of centrifugal force, concentration profiles of the solutes are acquired via on-board absorbance optics or a Rayleigh interferometer. These “scans” are taken at all relevant radii and occur at discrete times. In addition to their centrifugal transport, the molecules also undergo translational diffusion due to the concentration gradients inherent in the experiment. Examination of the velocity of the migration and the properties of the diffusion allows the discernment of the sedimentation coefficient (s) and the translational diffusion coefficient (DT), and these two quantities can be used to determine the molar mass (M) of a species via the Svedberg equation:
\[ M = \frac{sRT}{D_{T}\left(1 - \overline{v}\rho\right)} \tag{1} \]
where R is the universal gas constant, T is the temperature in kelvins, \(\overline{v}\) is the partial-specific volume of the species, and ρ is the solution density.
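As a quick numerical illustration of Eq. 1, the following Python sketch evaluates the molar mass from s and DT in CGS units. The function name and the default solvent values (a partial-specific volume of 0.73 cm³/g and the density of water at 20 °C) are assumptions made for this example and are not taken from SViMULATE.

```python
R_GAS = 8.314462618e7  # universal gas constant in erg/(mol K), CGS units


def svedberg_molar_mass(s, d_t, vbar=0.73, rho=0.99823, temp_k=293.15):
    """Molar mass (g/mol) from the Svedberg equation.

    s      : sedimentation coefficient in seconds (1 S = 1e-13 s)
    d_t    : translational diffusion coefficient in cm^2/s
    vbar   : partial-specific volume in cm^3/g (assumed default)
    rho    : solution density in g/cm^3 (assumed: water at 20 degrees C)
    temp_k : temperature in kelvins
    """
    return s * R_GAS * temp_k / (d_t * (1.0 - vbar * rho))


# Example: a species with s = 7.0 S and D_T = 5.7e-7 cm^2/s
molar_mass = svedberg_molar_mass(7.0e-13, 5.7e-7)
print(f"M = {molar_mass:.0f} g/mol")
```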
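The acquired data can be modeled directly in data space using solutions of the Lamm equation (Lamm 1929):
\[ \frac{\partial c}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\left[ r D_{T} \frac{\partial c}{\partial r} - s\omega^{2} r^{2} c \right] \tag{2} \]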
where c is the concentration of the solute, t is time from the start of centrifugation, r is the distance from the center of rotation, and ω is the rotation speed. Because the solutions to this partial differential equation can be used to model data, they obviously can also be used to simulate data given a physically rational set of parameters. However, no exact analytical solution of the Lamm equation is known. Rather, analysts today predominantly solve the equation numerically, although there are good approximate analytical solutions (Behlke and Ristau 2002). Prominent AUC data-analysis software programs (e.g., SEDFIT (Brown and Schuck 2008), UltraScan (Cao and Demeler 2008), and SedAnal (Stafford and Sherwood 2004)) use numerical finite-element solutions of the Lamm equation, an approach pioneered by Claverie and colleagues in the 1970s (Claverie et al. 1975).
Simulation is common in AUC because the method, while very robust, is not amenable to quick pilot experiments. Thus, simulation offers the possibility of performing preliminary experiments in silico without the investment in time and materials needed for in vitro experiments. Common questions that can be addressed are (1) “How long will the experiment take at a given rotor speed?” (2) “What combination of rotor speed and experimental duration will result in an optimal analysis?” (3) “Can standard analyses resolve two (or more) putative species?” and (4) “How will the signal-to-noise ratio affect my analysis?” Thus, the ability to simulate AUC data is a rapid, cost-free means to augment the planning of AUC experiments. Simulation has obvious pedagogic value as well.
In recognition of the usefulness of simulation in AUC, the software packages mentioned above all contain simulation functionalities that work well. But they have features that can make them difficult to use for an inexperienced experimentalist. For example, a protein chemist will most often be familiar with the molar mass and perhaps the shape of a macromolecule under study, but the relationship between these parameters and those needed for simulation, namely s and DT, is not straightforward. This fact imposes a burden on the experimenter to properly calculate the necessary quantities and enter them into the simulation software of choice. Another disadvantage of the available simulators is the lack of interactivity, i.e., adding or adjusting the parameters of a sedimenting species after examining the results of an initial simulation can be cumbersome. Also, some extant simulators require the user to input a finite time of sedimentation, but this may be unknown, forcing the user to make a difficult guess that may result in unwanted data or too few scans. Finally, alternative optical systems (i.e., Schlieren optics), different modes of data collection (difference sedimentation velocity, DSV, see Kirschner and Schachman 1971b; Brautigam et al. 2020), and realistic modeling of noise from the absorbance optical system are not supported.
To address the need for flexible, rapid, and interactive AUC simulations, a computer program called SViMULATE (Sedimentation Velocity in silico Mock experiments Using numerical Lamm and analytic Archibald-Type Equations) is introduced herein. This software has a built-in, on-the-fly hydrodynamics calculator that quickly provides the relationships between s, DT, frictional ratio, and M under user-provided experimental conditions. The program can also receive such information from HullRad, which efficiently calculates s and DT from structure files in the PDB format (Fleming and Fleming 2018). Results from the simulation are generally displayed within 1 s, and any necessary adjustments are easy to make, followed by additional simulation. There are simulation modes in which no finite time of simulation need be provided; instead, the software senses (based on user-adjustable criteria) whether the sedimentation is “complete”, and the simulation is halted at that point. There are no limits on the number of species that can be simulated, and the simulated data may be displayed as standard signal-based or Schlieren profiles. Additionally, DSV experiments aimed at discerning small changes in s-values (Kirschner and Schachman 1971b; Brautigam et al. 2020) can be simulated, and noise features of the absorbance optical system can be emulated. Finally, the generated data may be outputted for analysis with other software. SViMULATE is freely available as a pre-compiled executable for 64-bit Windows-based computers and is distributed with all dependencies.
Methods
Algorithms
Numerical
For the numerical simulation of SV data, the finite-element algorithm of Claverie and others (Claverie et al. 1975) using mathematical strategies introduced by Todd and Haschemeyer (Todd and Haschemeyer 1983) and Schuck (2016) was coded into a C++ module (clavPack). Although the aforementioned authors have extensively documented the respective algorithms, a few of the concepts are recapitulated here to justify some of the strategies used in clavPack and SViMULATE. clavPack was encoded as a Python-readable module using SWIG (Beazley 1996). SViMULATE imports this module, gathers parameters from the user, communicates them to the module, actuates the simulation, and finally, clears it from memory (i.e., collects the garbage). clavPack reports the results back to SViMULATE, which graphs the results.
The goal of simulating SV data is the description of the continuous function c(r,t), representing the concentration of the solute as a function of radius and time after the start of centrifugation. In the formulation used herein and by others (Claverie et al. 1975; Cox and Dale 1981; Todd and Haschemeyer 1983), the radial space from the meniscus to the bottom of the solution column is divided into N equal-sized intervals; each interval thus has the size
\[ \Delta r = \frac{r_{b} - r_{m}}{N} \tag{3} \]
where rm is the radial position of the meniscus and rb is that of the bottom of the solution column. In SViMULATE, the user has control of the number of intervals and thus of the magnitude of Δr. This radial space is spanned by N + 1 invariant triangular basis elements sometimes called “hat functions” (Claverie et al. 1975). Each of these elements \(P_{i}\) reaches its zenith (1, by definition) at \(r_{i}\), slopes to 0 at \(r_{i-1}\) and \(r_{i+1}\), and is 0 everywhere outside of this range. Thus, a vector (C) with N + 1 elements may be used to scale the N + 1 hat functions to result (after summation) in c(r,t) (Cox and Dale 1981):
\[ c(r,t) = \sum_{i=1}^{N+1} C_{i}(t)\,P_{i}(r) \tag{4} \]
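The hat-function expansion can be illustrated with a short Python sketch. The function names `hat` and `reconstruct` are hypothetical, indices are zero-based, and an equally spaced grid is assumed. The check at the end shows that the weighted sum of hat functions is simply a piecewise-linear interpolation of the nodal values in C.

```python
import numpy as np


def hat(i, r_grid, r):
    """Hat function P_i(r): 1 at r_grid[i], falling linearly to 0 at the
    neighbouring grid points, and 0 everywhere else (equal spacing assumed)."""
    dr = r_grid[1] - r_grid[0]
    return np.clip(1.0 - np.abs(r - r_grid[i]) / dr, 0.0, None)


def reconstruct(C, r_grid, r):
    """c(r) as the sum of the hat functions scaled by the nodal values C."""
    return sum(C[i] * hat(i, r_grid, r) for i in range(len(r_grid)))


# Example with N = 10 intervals between a 6.0 cm meniscus and a 7.0 cm bottom
r_grid = np.linspace(6.0, 7.0, 11)
C = np.random.default_rng(0).random(11)      # arbitrary nodal concentrations
r_fine = np.linspace(6.0, 7.0, 257)
c_fine = reconstruct(C, r_grid, r_fine)
print(np.allclose(c_fine, np.interp(r_fine, r_grid, C)))
```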
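Consequently, the Lamm equation may be formulated thus for any element \(P_{j}\) at a given moment in time:
\[ \sum_{i} \frac{dC_{i}}{dt} \int_{r_{m}}^{r_{b}} P_{i} P_{j}\, r\, dr = s\omega^{2} \sum_{i} C_{i} \int_{r_{m}}^{r_{b}} P_{i} \frac{dP_{j}}{dr}\, r^{2}\, dr - D_{T} \sum_{i} C_{i} \int_{r_{m}}^{r_{b}} \frac{dP_{i}}{dr} \frac{dP_{j}}{dr}\, r\, dr \tag{5} \]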
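Equation 5 represents a set of N + 1 simultaneous equations that can be reformulated using matrices:
\[ \mathbf{B}\frac{d\mathbf{C}}{dt} = s\omega^{2}\mathbf{A}_{2}\mathbf{C} - D_{T}\mathbf{A}_{1}\mathbf{C} \tag{6} \]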
where the elements of matrices B, A2, and A1 can be calculated by computing the respective integrals that they substitute for (cf. Equations 6 and 5). These matrices are tridiagonal and invariant during a given simulation, and therefore they may be efficiently calculated at the outset and remain fixed. The formulas for the values in these matrices are tabulated elsewhere (Cox and Dale 1981; Todd and Haschemeyer 1983). A remaining problem is the calculation of the vectors dC/dt and C. They are estimated as
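\[ \frac{d\mathbf{C}}{dt} \approx \frac{\mathbf{C}_{a} - \mathbf{C}_{b}}{\Delta t}, \qquad \mathbf{C} \approx \theta\,\mathbf{C}_{a} + (1 - \theta)\,\mathbf{C}_{b} \tag{7} \]
where Cb is the concentration vector before the time step at hand (which has a magnitude of Δt), Ca is the concentration vector after, and θ is a dimensionless value between 0 and 1 (inclusive).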
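Making these substitutions, rearranging, and conveniently defining \(\mathbf{A} = D_{T}\mathbf{A}_{1} - s\omega^{2}\mathbf{A}_{2}\), Eq. 6 becomes
\[ \left(\mathbf{B} + \theta\,\Delta t\,\mathbf{A}\right)\mathbf{C}_{a} = \left[\mathbf{B} - (1 - \theta)\,\Delta t\,\mathbf{A}\right]\mathbf{C}_{b} \tag{8} \]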
The choice of θ underscores the main difference between the original approach of Claverie et al. (1975) and subsequent treatments by Todd and Haschemeyer (1983) and Schuck (1998). Claverie chose θ = 1, an “implicit” scheme that simplifies the right-hand side of Eq. 8 to BCb. The θ value was set to 0.5 by Todd and Haschemeyer (1983), justifying the choice based on its inherent numerical stability. Schuck effectively made the same choice by applying a Crank-Nicolson scheme to the finite-element method (Crank and Nicolson 1947; Schuck et al. 1998). A θ value of 0.5 is used in SViMULATE, and the default value of Δt is 1.0 s.
By definition in this numerical simulation, Cb is known. At the start (t = 0), it is a uniform value across all elements (i.e., radial positions). Therefore, Eq. 8 must be solved for Ca, i.e., the concentration must be calculated following the time step Δt. This is accomplished in SViMULATE using the iterative procedure outlined by Todd and Haschemeyer (1983), with the only embellishment being the necessary recalculation of A at each time step during rotor acceleration (if used). After Ca is calculated, it is reassigned as Cb, and the process begins again for the next time step. The Ca vector is not recorded for every time step; rather, the user stipulates a reporting frequency (called “scan frequency”) in seconds, and only at these time points is Ca recorded for output before being reassigned as Cb.
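The time-stepping scheme described above can be sketched compactly in Python. This is an illustration under stated assumptions and not the clavPack code: the matrix entries are obtained by Gauss-Legendre quadrature instead of the tabulated formulas, the linear system of Eq. 8 is solved directly with dense NumPy routines instead of the iterative procedure of Todd and Haschemeyer (1983), rotor acceleration is ignored, and the function names are hypothetical. The matrices follow the definitions B = ∫P_i P_j r dr, A1 = ∫P_i′ P_j′ r dr, and A2 = ∫P_i P_j′ r² dr, with A = DT·A1 − sω²·A2.

```python
import numpy as np


def assemble_matrices(r):
    """Build B, A1 and A2 for hat functions on the radial grid r (cm)."""
    n = r.size
    B, A1, A2 = (np.zeros((n, n)) for _ in range(3))
    xg, wg = np.polynomial.legendre.leggauss(3)  # exact for these integrands
    for k in range(n - 1):
        h = r[k + 1] - r[k]
        x = r[k] + 0.5 * h * (xg + 1.0)          # quadrature radii
        w = 0.5 * h * wg                         # quadrature weights
        phi = np.array([(r[k + 1] - x) / h, (x - r[k]) / h])  # local hats
        dphi = np.array([-1.0 / h, 1.0 / h])                  # their slopes
        for j in range(2):        # row: test function P_j
            for i in range(2):    # column: trial function P_i
                B[k + j, k + i] += np.sum(w * phi[i] * phi[j] * x)
                A1[k + j, k + i] += np.sum(w * dphi[i] * dphi[j] * x)
                A2[k + j, k + i] += np.sum(w * phi[i] * dphi[j] * x**2)
    return B, A1, A2


def simulate(s, d_t, rpm, r_m=6.0, r_b=7.0, n_intervals=400,
             dt=1.0, t_end=600.0, theta=0.5, c0=1.0):
    """Propagate a uniform starting concentration to t_end (seconds)."""
    r = np.linspace(r_m, r_b, n_intervals + 1)
    B, A1, A2 = assemble_matrices(r)
    omega = rpm * 2.0 * np.pi / 60.0             # rad/s, constant speed
    A = d_t * A1 - s * omega**2 * A2
    lhs = B + theta * dt * A                     # multiplies C_a
    rhs = B - (1.0 - theta) * dt * A             # multiplies C_b
    propagator = np.linalg.solve(lhs, rhs)       # fixed while omega is fixed
    C = np.full(r.size, c0)
    for _ in range(int(round(t_end / dt))):
        C = propagator @ C                       # C_a becomes the next C_b
    return r, C, B


# Example: s = 7.0 S, D_T = 5.7e-7 cm^2/s, 60,000 rpm, 600 s of sedimentation
r, C, B = simulate(s=7.0e-13, d_t=5.7e-7, rpm=60000)
print("integral of c*r dr:", np.ones(r.size) @ B @ C)
```

Because the matrices do not change at constant rotor speed, the sketch precomputes one propagation matrix and reuses it for every time step; a production code would exploit the tridiagonal structure instead of dense algebra.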
An important aspect of numerical simulation is when to stop it. SViMULATE offers four different ways to define the halt point. The first two are trivial: the user may indicate an integral number of “scans” to be outputted or may stipulate a total time of the simulation in hours and minutes. In the second two, the user tasks clavPack with the decision of when to exit. The first of these completion modes is called “Completion.” At the recording points in simulation time, the algorithm compares the current values of Cb with the just previously recorded one at all radial values between rm and ru, the latter being a user-chosen “right-side limit.” When the maximum difference (on an element-by-element basis) between the two “scans” falls below a user-defined level, the algorithm exits. The final mode is the “Concentration” mode, in which the signal at a user-provided radius (usually close to rb) is monitored, and the algorithm exits when it falls below a user-defined threshold. It is possible for the user to set the halt criteria such that the simulation would never stop; however, as a failsafe, if the simulation reaches three days (259,200 s), clavPack automatically exits and SViMULATE displays the results. For all modes, SViMULATE displays the total time of sedimentation by default, as this value is sometimes the objective of the simulation.
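The two automatic halt criteria can be expressed as simple checks performed at each recording point. The sketch below is only an illustration of the logic described in the text; the function and parameter names are hypothetical, and the exact comparisons used in clavPack are not documented here.

```python
import numpy as np

FAILSAFE_SECONDS = 259_200.0  # three days, the failsafe quoted in the text


def completion_reached(scan_now, scan_prev, r, r_m, r_u, tolerance):
    """'Completion' mode: the largest point-by-point change between two
    successive recorded scans, within [r_m, r_u], is below tolerance."""
    window = (r >= r_m) & (r <= r_u)
    change = np.max(np.abs(scan_now[window] - scan_prev[window]))
    return bool(change < tolerance)


def concentration_reached(scan_now, r, r_watch, threshold):
    """'Concentration' mode: the signal at the grid point nearest r_watch
    has dropped below threshold."""
    nearest = int(np.argmin(np.abs(r - r_watch)))
    return bool(scan_now[nearest] < threshold)


def should_halt(elapsed_seconds, criterion_met):
    """Stop when the chosen criterion is met or the failsafe time is reached."""
    return criterion_met or elapsed_seconds >= FAILSAFE_SECONDS
```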
The user is afforded significant control over the simulation in SViMULATE. Parameters under user control are: hydrodynamic parameters (vide infra), partial-specific volume, concentrations, rm, rb, T, ω (given as rotor speed in rpm), solution density, solution viscosity, N, Δt, scan frequency, rotor acceleration, completion mode/criteria, output sampling, and noise elements (vide infra).
Analytic
For analytical simulation, all calculations are performed in Python using all six terms of the Archibald-type equation promulgated by Behlke and Ristau (2002). A difficulty encountered in some simulations is that large exponents for e may need to be calculated, and the results can exceed the largest value representable by the floating-point numbers used in the program. The user is warned in such cases. All of the user-adjustable parameters available to the numerical simulations are also present for the analytic ones, except for rotor acceleration and N.
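The overflow problem can be detected before it occurs by comparing an exponent with the largest value that a 64-bit float can accommodate. The sketch below shows one way of doing so; how SViMULATE performs its own test is not described here, and the function names are hypothetical.

```python
import math
import sys

# exp(x) overflows a 64-bit float once x exceeds ln(largest float), about 709.78
MAX_EXPONENT = math.log(sys.float_info.max)


def exp_is_representable(x):
    """True if exp(x) fits in a 64-bit floating-point number."""
    return x < MAX_EXPONENT


def checked_exp(x):
    """exp(x), with an explicit error instead of a silent failure."""
    if not exp_is_representable(x):
        raise OverflowError(f"exp({x}) exceeds the 64-bit floating-point range")
    return math.exp(x)
```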
Noise elements
In SViMULATE, three sources of random noise may be added to the noiseless, simulated data. First, of course, is the stochastic noise of data acquisition (y(r,t)s). This may be selected as normally distributed, randomly sampled noise added to each outputted data point, and the user has control of the standard deviation of the sampled distribution. However, because the absorbance scans result from the log transformation of a ratio of intensities, the noise distribution can no longer be assumed to be normal. Rather, simulations of an absorbance detector show that the noise increases and skews positively as absorbance increases; no simple analytic representation of this amplifying, skewing noise distribution could be found (see Supplemental Methods). Instead, the user may request realistic absorbance noise in two ways: (1) the user-provided parameters can be used to consult a series of tabulated parameters for an exponentially modified Gaussian function that can be sampled for noise-generation purposes, or (2) a simulation can be performed to generate the noise elements; the rationale and mathematics underpinning these protocols are presented in “Results and Discussion” and more thoroughly in Supplemental Methods. Other noise sources include time-invariant (TI) noise, probably caused by imperfections in the optical path that light traverses during data acquisition, and radially invariant (RI) noise, which is usually only encountered with the Rayleigh interferometer and is due to minute changes in the vertical values of the fringes from scan to scan (i.e., “jitter”) (Schuck and Demeler 1999). For TI noise, a function y(r)TI is initiated with all n̂ values in this data set assigned to 0. This author has observed that the frequency of TI noise appears to be less than that of data acquisition.
Thus, to mimic this “medium-frequency” noise, only every third data point from y(r)TI is selected for noise generation, resulting in \(\hat{n}\) data points in a subset called \(\hat{y}(\hat{r})_{TI}\). Next, \(\hat{n}\) values of stochastic, normally distributed noise are generated about 0 (again with a user-selected standard deviation) and, respectively, added to \(\hat{y}\)TI. This distribution is then subjected to a differencing procedure:
Finally, the neglected data points from y(r)TI are re-inserted to restore the full data set, and their values are interpolated (or extrapolated as necessary) between the newly calculated values of \(y^{\prime}_{{{\text{TI}}}}\). For RI noise (y(t)RI), for each time point t, a number is randomly sampled from a Gaussian distribution whose standard deviation is also specified by the user. This number is added to all radial points for a given t. Thus, the final formula for the output (c(r,t)out) is
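\[ c(r,t)_{\text{out}} = c(r,t)_{\text{sim}} + y(r,t)_{s} + y(r)_{\text{TI}} + y(t)_{\text{RI}} \tag{10} \]
where c(r,t)sim represents the noiseless simulated data. The addition of noise in this fashion is available in SViMULATE for both the numerical and analytic simulation modes.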
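A simplified Python sketch of this noise model follows. It assumes the three noise terms are simply summed with the noiseless data, it generates TI noise at every third radial point and interpolates linearly in between, and it omits the differencing step applied to the TI subset. The function name, argument names, and array layout (scans in rows, radii in columns) are hypothetical.

```python
import numpy as np


def add_noise(c_sim, r, sigma_s=0.0, sigma_ti=0.0, sigma_ri=0.0, rng=None):
    """Add stochastic, TI and RI noise to c_sim (shape: n_scans x n_radii)."""
    rng = np.random.default_rng() if rng is None else rng
    n_scans, n_radii = c_sim.shape

    # Stochastic noise: an independent normal deviate for every data point
    y_s = rng.normal(0.0, sigma_s, size=c_sim.shape)

    # TI noise: one deviate at every third radius, shared by all scans;
    # np.interp fills the skipped radii linearly and holds the end value
    # constant beyond the last sampled point (no true extrapolation here)
    subset = np.arange(0, n_radii, 3)
    y_hat = rng.normal(0.0, sigma_ti, size=subset.size)
    y_ti = np.interp(r, r[subset], y_hat)

    # RI noise: one deviate per scan, added to all radii of that scan
    y_ri = rng.normal(0.0, sigma_ri, size=(n_scans, 1))

    return c_sim + y_s + y_ti[np.newaxis, :] + y_ri
```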
DSV simulations
For DSV simulations, the user is constrained to simulating two species: one for the reference sector, and one for the sample sector. The user inputs information about the reference species, and then all aspects of the species in the sample sector are kept the same except for the sedimentation coefficient (represented as Δs) and the meniscus (Δrm). When actuated, SViMULATE calculates simulations for both species and then subtracts the concentration trace of the reference sector from that of the sample sector, plotting the result.
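The subtraction at the heart of a DSV simulation can be sketched as follows. To keep the example self-contained, a toy diffusion-free step boundary stands in for the real Lamm-equation simulation; that stand-in, the function names, and the parameter values are assumptions made purely for illustration.

```python
import numpy as np


def ideal_boundary(s, r, r_m, rpm, t, c0=1.0):
    """Toy stand-in for a simulation: a step boundary without diffusion."""
    omega2_t = (rpm * 2.0 * np.pi / 60.0) ** 2 * t
    boundary = r_m * np.exp(s * omega2_t)        # boundary position (cm)
    plateau = c0 * np.exp(-2.0 * s * omega2_t)   # radial dilution
    return np.where(r > boundary, plateau, 0.0)


def dsv_difference(s_ref, delta_s, r_m, delta_rm, r, rpm, t):
    """Sample-sector trace minus reference-sector trace."""
    reference = ideal_boundary(s_ref, r, r_m, rpm, t)
    sample = ideal_boundary(s_ref + delta_s, r, r_m + delta_rm, rpm, t)
    return sample - reference


# Example: a 0.1 S increase in s after one hour at 50,000 rpm
r = np.linspace(6.0, 7.0, 1001)
trace = dsv_difference(5.0e-13, 0.1e-13, 6.0, 0.0, r, 50000, 3600.0)
```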
Schlieren optics
Simulations (except DSV) can be displayed either as signal-concentration traces (the default) or pseudo-Schlieren profiles. The latter are estimated using the central difference formula to approximate the first derivative of the profile. For a stable estimation, the concentration profile had to be interpolated with the assumption of a cubic spline connecting the successive data points. The suggestion of Cox and Dale (1981) of estimating this profile by differencing all concentration values and dividing by Δr was considered, but the resulting displacement of the radial grid by \(\Delta r/2\) was not desired.
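One plausible way to combine spline interpolation with a central difference is shown below, using SciPy's `CubicSpline`. The step size and the function name are assumptions; the sketch is not necessarily how SViMULATE computes its pseudo-Schlieren profiles. Note that the derivative is returned on the original radial grid, so no half-interval displacement arises.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def pseudo_schlieren(r, c, h=None):
    """Estimate dc/dr on the grid r by central differences of a cubic spline.

    h is the half-width of the difference stencil; by default it is half of
    the smallest grid spacing (an arbitrary choice for this sketch).
    """
    spline = CubicSpline(r, c)       # extrapolates slightly past the ends
    if h is None:
        h = 0.5 * np.min(np.diff(r))
    return (spline(r + h) - spline(r - h)) / (2.0 * h)


# Example on a smooth sigmoidal boundary centred at 6.5 cm
r = np.linspace(6.0, 7.0, 101)
c = 0.5 * (1.0 + np.tanh((r - 6.5) / 0.05))
dc_dr = pseudo_schlieren(r, c)
```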
On-the-fly hydrodynamics calculations
Three of the hydrodynamic-calculation modes described in the main text essentially combine the Svedberg equation (Eq. 1), the Stokes–Einstein equation
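\[ D_{T} = \frac{kT}{f} \tag{11} \]
where k is the Boltzmann constant and f is the frictional coefficient, Stokes’ law,
\[ f_{0} = 6\pi\eta R_{0} \tag{12} \]
and
\[ R_{0} = \left( \frac{3M\overline{v}}{4\pi N_{A}} \right)^{1/3} \tag{13} \]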
where f0 is the frictional coefficient of a sphere with radius R0, which is the minimum radius that a particle of molar mass M may assume, η is the solution viscosity, and NA is Avogadro’s number. For example, when the user inputs M and a frictional ratio f/f0, Eq. 13 is used to find R0, which is inserted into Eq. 12 to yield f0 and, trivially, f. DT can then be found from Eq. 11 and inserted into a rearranged Eq. 1 to yield s; s and DT are then supplied to the simulation algorithm when the user starts the simulation. The values f/f0, s, DT, and M are continuously updated as appropriate in response to user inputs.
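The chain of calculations for the case in which the user supplies M and f/f0 can be sketched in Python as follows. CGS units are used throughout. The function name and the default solvent values (water at 20 °C with a partial-specific volume of 0.73 cm³/g) are assumptions made for this example; with these defaults, a 40,000 g/mol species with f/f0 = 1.3 comes out at about 3.24 S.

```python
import math

K_B = 1.380649e-16     # Boltzmann constant, erg/K
N_A = 6.02214076e23    # Avogadro's number, 1/mol
R_GAS = K_B * N_A      # universal gas constant, erg/(mol K)


def s_and_d_from_mass(molar_mass, f_ratio, vbar=0.73, rho=0.99823,
                      eta=0.01002, temp_k=293.15):
    """Return (s in seconds, D_T in cm^2/s) from M (g/mol) and f/f0.

    vbar in cm^3/g, rho in g/cm^3, eta in poise; the defaults are assumed
    values for water at 20 degrees C.
    """
    r0 = (3.0 * molar_mass * vbar / (4.0 * math.pi * N_A)) ** (1.0 / 3.0)
    f0 = 6.0 * math.pi * eta * r0          # Stokes' law for the minimal sphere
    f = f_ratio * f0                       # actual frictional coefficient
    d_t = K_B * temp_k / f                 # Stokes-Einstein equation
    s = molar_mass * d_t * (1.0 - vbar * rho) / (R_GAS * temp_k)  # Svedberg
    return s, d_t


s, d_t = s_and_d_from_mass(40000.0, 1.3)
print(f"s = {s / 1e-13:.3f} S, D_T = {d_t:.3e} cm^2/s")
```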
Fitting simulated data
Data for accuracy testing were outputted using SViMULATE’s standard output features. No noise elements were added. The data were loaded into SEDFIT version 16.1c (https://sedfitsedphat.github.io/download.htm) and analyzed using the “Non-Interacting Discrete Species” model in a mode that directly fits s and DT. No changes to the default numerical Lamm-equation parameters were made. Because SViMULATE writes out sedimentation data with a header feature indicating correct time-stamps, SEDFIT did not attempt to automatically modify them (see Zhao et al. 2013). The sample meniscus, s, DT, and concentrations were fitted in the analyses.
Results and discussion
Simulation algorithms
SViMULATE has two different means of calculating solutions to the Lamm equation (Eq. 2). The first, preferred, mode is using a finite-element numerical simulation. The simulation implemented is similar to that proposed by Claverie (Claverie et al. 1975) and essentially identical to that implemented by Todd and Haschemeyer (1983), with the exception that the rotor acceleration to the target speed can be simulated (this feature is active by default in SViMULATE). Specifics of this simulation are beyond the scope of this communication and are mostly presented elsewhere (Claverie et al. 1975; Todd and Haschemeyer 1983; Schuck et al. 1998), but some aspects are detailed in Methods. The simulation can be efficiently carried out; 50 scans (spaced 5 min apart) of a 40,000 Da species with a frictional ratio of 1.3 sedimenting at 50,000 rpm in water were completed in 0.1 s on the author’s laptop computer. This efficiency was achieved by encoding the simulation in C++ and interfacing this code to the Python master program (see Methods), and it was aided by optimized calculations in Python libraries like NumPy (Harris et al. 2020), SciPy (Virtanen et al. 2020), and Matplotlib (Hunter 2007).
The second mode of calculating concentration profiles is via an approximate analytic Lamm-equation solution as detailed by Behlke and Ristau (2002). This mode does not take rotor acceleration into account, and it was included mainly as a point for comparison between it and the numerical calculation. The advantage of the method is its speed: the calculation mentioned above, in this case performed entirely in the native Python environment, only takes 0.04 s. Although numerical solutions are very frequently used for modeling SV data, accuracy testing (vide infra) demonstrates that this analytical formula can work very well. Indeed, this approach forms the computational underpinnings of the data-modeling programs SVEDBERG (Philo 1996) and LAMM (Behlke and Ristau 2002). The main disadvantage of the analytic approach is that some terms of the Behlke/Ristau formula can assume values larger than the maximum value allowed in a 64-bit floating-point number. SViMULATE tests for this problem and reports to the user when a set of parameters may produce errors.
Neither of the simulation modes currently encoded into SViMULATE accounts for inter-solute interactions. That is, at present, only non-interacting, ideal species may be simulated. Other authors have modified the finite-element method to account for concentration-dependent effects on sedimentation, such as hydrodynamic non-ideality (Cox and Dale 1981) and infinitely fast self-association (Schuck 1998). Further, the numerical calculations can be extended to account for finite kinetics and hetero-associations (Stafford and Sherwood 2004; Dam et al. 2005). Although none of these are currently implemented in SViMULATE, expansion of the program to include at least a few simple non-ideal and interacting models is envisioned.
Accuracy testing
In the initial publication on the finite-element numerical method, Claverie et al. (1975) noted that there was some inaccuracy in the calculation when spatial and temporal discretization is sparse. That is, when they simulated an SV data set (N = 400, Δt = 1 s) with a sedimentation coefficient (s) of 7.0 S and a diffusion coefficient (DT) of 5.7 F and then analyzed it using linear-transformation methods, errors of ≤ 0.2% and 2.3%, respectively, were observed. In an initial test of SViMULATE, this simulation was exactly recapitulated: in addition to the parameters listed above, it featured a rotor speed of 60,000 rpm (with no attempt to model rotor acceleration), one observation every 200 s, a meniscus of 6.0 cm, the sector bottom at 7.0 cm, and a starting “concentration” of 1.0, using the “implicit” scheme to perform the calculations (i.e., θ = 1; see Eq. 8). These noiseless data were then analyzed with SEDFIT, which uses a different finite-element method (specifically, a non-equidistant grid and different time discretization (Brown and Schuck 2008)) to model SV data. The agreement between the modeled and refined values was excellent for the s-value but evinced a +2.4% error in DT (Table 1).
Next, a modification to the algorithm was made to depict the actual centrifugation experiment more realistically. Specifically, the rotor acceleration was modeled at 270 rpm/s, which is approximately the acceleration value observed with the analytical ultracentrifuge in service at UT Southwestern. Only slight increases in accuracy were observed (Table 1).
Finally, a correction scheme providing better numerical stability was added to the algorithm according to the method outlined by Todd and Haschemeyer (1983) and Schuck (Schuck et al. 1998). This method abandoned the “implicit” scheme of Claverie et al. (1975) for a more numerically robust form (θ = 0.5; see Eq. 8). It required roughly twice the number of calculations to model the acceleration phase of the rotor, but it resulted in substantial increases in the accuracy of DT (−0.06%) without sacrificing significant levels of accuracy in the s value (Table 1). Given the excellent accuracy and performance of this method (33 “scans” of this simulation were completed in 0.04 s on the author’s laptop), it was adopted as the method of choice for simulation in SViMULATE.
The approximate analytical solution encoded in SViMULATE performed very well for this particular set of parameters (Table 1). Indeed, its performance exceeded that of the previously described numerical simulation, having the same error in s and a slightly smaller deviation in DT. However, as emphasized above, the rotor acceleration was not simulated, and thus the analytic solution is not the most faithful proxy for real-world SV data.
In early tests of the implicit Claverie algorithm implemented in SViMULATE, it was noted that large species sedimenting in a high centrifugal field suffered an even higher degree of inaccuracy than that noted above (Table 1). To illustrate this, a scenario with two species having considerably different s values (3.244 S v. 11.516 S), molar masses (40 kDa v. 400 kDa), and frictional ratios (1.3 v. 1.7) was considered (Figs. 1 and 2A). In the implicit Claverie scheme, the SEDFIT-analyzed results featured a DT for the larger species that was incorrect by 6.7%, leading to a faulty determination of the molar mass (see Eq. 1 and Table 2). The numerically robust scheme with modeled rotor acceleration provided far superior estimates of DT and molar mass (errors of −0.4% and +0.4%, respectively). This scenario could not be simulated with the analytic algorithm, as it resulted in values in some terms exceeding the maximum for 64-bit floating-point numbers.
Given the recent interest in performing SV on gene-therapy vectors, particularly adeno-associated viruses (AAVs) (Burnham et al. 2015; Nass et al. 2018; Maruno et al. 2021), a simulation was conducted with large species meant to mimic empty and full AAV capsids (Table 3). The s-values of these species were approximately 64 S and 100 S, respectively, which are similar to results garnered in this lab and others. Although performing this simulation only resulted in 34 scans (with a scan frequency of 300 s), the analysis results (Table 3) show that the SViMULATE simulations accord with those from SEDFIT to a very high degree.
On-the-fly hydrodynamics calculations
Five modes of inputting macromolecular parameters were enabled in SViMULATE, named for the information that the user provides: (1) frictional ratio/M, (2) s/M, (3) s/D, (4) s/frictional ratio, and (5) HullRad. For example, in the first mode, the user can input the known molar mass and a guess regarding the frictional ratio (along with solution parameters), and all other parameters necessary for the simulation will be calculated on-the-fly and displayed to the user. Once these are adjusted to the user’s satisfaction, the simulation can be actuated, with the result immediately displayed. Any number of species can be simulated, and each one can have its own mode of macromolecular-property input (Fig. 1). The fifth method, “HullRad,” utilizes the convex-hull method introduced by Fleming and Fleming (2018) to generate s and DT from a structure-coordinate file, and all other parameters are calculated from knowledge of these two.
If desired, SViMULATE allows the user to inspect a graph that summarizes the hydrodynamic properties of all currently inputted species. For example, the simulation used to produce Table 2 could be visualized as in Fig. 1. SViMULATE can generate three such views: M vs. f/f0 (Fig. 1), s vs. f/f0, and s vs. DT. It is straightforward to switch between them.
Data display and output
The SV data resulting from the simulation can be displayed and outputted in several different ways. First and most commonly, the user may specify that signal profiles, as collected by the AUC data-acquisition software, be displayed (Fig. 2A). The user may enter “concentrations” in signal, molar, or mass-concentration units. Signal increments of course must be provided for the latter two values. A Schlieren-type data-output mode, i.e., \(dc/dr\) vs. r (Fig. 2B), is available, but the program enforces a requirement for the mass-concentration mode of concentration input, as this data-acquisition method is based on refractive-index changes.
A specialized mode offered only in SViMULATE is the ability to simulate DSV data. In this experimental strategy, samples of identical concentration are placed in both sectors of an AUC centerpiece, and the Rayleigh interferometer is used to measure the refractive-index differences between them. The usual objective is to find differences in sedimentation coefficient between the samples in the two sectors. This method is a sensitive means to detect ligand-induced conformational changes in proteins (Kirschner and Schachman 1971a; Brautigam et al. 2020). SViMULATE, when used for such simulations, expects the user to define two species, one for each sector. Upon actuation, it simulates both curves, computes the difference between them, and displays the result in signal units (Fig. 2C).
Systematic noise designed to mimic the noise generated by the AUC optics can also be simulated. Three major types of noise in AUC data are (1) the stochastic noise of data acquisition, (2) time-invariant (TI) noise, and (3) radially invariant (RI) noise (Schuck and Demeler 1999). Sources of these noise elements are briefly discussed in Methods and are elaborated elsewhere (Stafford 1992; Schuck and Demeler 1999; Kar et al. 2000; Schuck et al. 2016). SViMULATE can add all three types of noise in every possible combination. The user has control of the magnitude of noise added in all cases. An example of only TI noise added to the simulation in Fig. 2A is shown in Fig. 3.
Realistically modeling the stochastic noise from the absorbance optical system represents a particular challenge. This is because the absorbance reading is the base-ten logarithm of the ratio of two intensity readings (one from the reference sector, and one from the sample sector). Simulation of the intensity readings, considering the likely noise features, suggested that realistic noise for the absorbance optics has two trends as the reading increases: (1) it becomes higher, and (2) it becomes more asymmetrically distributed (Fig. 4A). Extensive modeling of theoretical noise led to the conclusion that it could be simulated with an exponentially modified Gaussian (EMG) distribution. Although a simple analytic relationship between the absorbance, user-selected noise, and the EMG’s parameters could not be found, the modeling of 30,100 achievable combinations of parameters allowed the construction of parametric tables that can be consulted by SViMULATE (see Supplemental Methods). Thus, when the user selects realistic absorbance noise, the tables are referred to, and noise from an appropriate EMG is sampled. Notably, a single scan can feature readings from 0.0 to near the maximum of absorbance (Fig. 4B), and thus the noise should increase correspondingly. This feature is also a part of the SViMULATE absorbance modeling. The user may turn this realistically skewed noise feature on or off on demand. An alternative mode for calculating realistic absorbance noise is to simulate the noise elements directly as if they resulted from the logarithm of the ratio of two noisy intensity readings. Although this second method is effective and is provided as an option in SViMULATE, it is time consuming and imposes significant limitations on the magnitudes of the noise and the absorbance readings. For these reasons, the EMG-based method is preferred. 
An important aspect of these noise-generation protocols is that they do not guarantee accurate modeling for all absorbance optical systems; rather, they generate noise features that are plausible for absorbance optics that behave as described in Supplemental Methods. Future modifications will seek to augment the verisimilitude of the noise (e.g., adding a sloping baseline to TI noise for data simulated to be from the Rayleigh interferometer).
The simulation can be saved in two ways. First, SViMULATE can write out a binary file that contains the parameters of every species and the global experimental parameters. The user may thus load these data later and exactly recapitulate the simulation. The second means of saving the data is to write the simulated scans to disk in the Beckman-Coulter file format. Because the radial grid of the numerical simulation may not exactly match the radial points of the output, linear interpolation is used to provide values for all of the output radial points. The output files may be opened by any analysis software package that reads this format. An informational text file is also written in the same directory as the simulated data files; it contains all relevant details of the simulation.
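The interpolation step can be sketched as follows. The function name resample_scan, the grid values, and the radial spacing are hypothetical, and the sketch covers only the interpolation, not the writing of files. It relies on numpy.interp, which performs one-dimensional linear interpolation and requires the simulation radii to be in increasing order.

```python
import numpy as np

def resample_scan(r_sim, c_sim, r_min, r_max, dr):
    """Interpolate a simulated profile onto an evenly spaced output grid.

    r_sim : increasing radii of the simulation grid (may be unevenly spaced)
    c_sim : simulated signal at those radii
    """
    # Number of output points; the small epsilon guards against round-off.
    n_out = int(np.floor((r_max - r_min) / dr + 1e-9)) + 1
    r_out = r_min + dr * np.arange(n_out)
    # numpy.interp holds the end values for points outside [r_sim[0], r_sim[-1]].
    return r_out, np.interp(r_out, r_sim, c_sim)

# A hypothetical, unevenly spaced simulation grid (cm) and signal values.
r_sim = np.array([6.00, 6.15, 6.40, 6.80, 7.20])
c_sim = np.array([0.00, 0.10, 0.55, 0.98, 1.00])
r_out, c_out = resample_scan(r_sim, c_sim, r_min=6.0, r_max=7.2, dr=0.1)
print(len(r_out))
```

Linear interpolation is exact only where the profile is linear between neighboring simulation points, so a simulation grid that is fine near steep boundaries keeps the interpolation error small.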
In summary, SViMULATE is an accurate, quick, easy, and interactive tool for simulating AUC data in the sedimentation velocity mode. It may be downloaded immediately from https://www.utsouthwestern.edu/research/core-facilities/mbr/software, and it is designed for use on 64-bit Windows-based computers. It is hoped that the scientific community will use it for experimental planning and hypothesis testing, facilitating the informed use of limited centrifuge time and maximizing throughput. Also, its ease of use should encourage AUC neophytes to explore the principles of the method.
Data availability
A compiled version of the software is freely available at https://www.utsouthwestern.edu/research/core-facilities/mbr/software.
References
Beazley DM (1996) Using SWIG to control, prototype, and debug C programs with Python. https://www.legacy.python.org/workshops/1996-06/papers/. Accessed 6 Feb 2023
Behlke J, Ristau O (2002) A new approximate whole boundary solution of the Lamm differential equation for the analysis of sedimentation velocity experiments. Biophys Chem 95:59–68
Brautigam CA, Deka RK, Liu WZ, Norgard MV (2018) Crystal structures of MglB-2 (TP0684), a topologically variant D-glucose-binding protein from Treponema pallidum, reveal a ligand-induced conformational change. Protein Sci 27:880–885
Brautigam CA, Tso S-C, Deka RK et al (2020) Using modern approaches to sedimentation velocity to detect conformational changes in proteins. Eur Biophys J 49:729–743
Brown PH, Schuck P (2008) A new adaptive grid-size algorithm for the simulation of sedimentation velocity profiles in analytical ultracentrifugation. Comput Phys Commun 178:105–120
Burnham B, Nass S, Kong E et al (2015) Analytical ultracentrifugation as an approach to characterize recombinant adeno-associated viral vectors. Hum Gene Ther Methods 26:228–242
Cao W, Demeler B (2008) Modeling analytical ultracentrifugation experiments with an adaptive space-time finite element solution for multicomponent reacting systems. Biophys J 95:54–65
Claverie J-M, Dreux H, Cohen R (1975) Sedimentation of generalized systems of interacting particles. I. Solution of systems of complete Lamm equations. Biopolymers 14:1685–1700
Cox DJ, Dale RS (1981) Simulation of transport experiments for interacting systems. In: Frieden C, Nichol LW (eds) Protein-protein interactions. John Wiley & Sons, New York, pp 173–211
Crank J, Nicolson P (1947) A practical method for numerical evaluation of solutions of partial differential equations of the heat-conduction type. Math Proc Cambridge Philos Soc 43:50–67
Dam J, Velikovsky CA, Mariuzza RA et al (2005) Sedimentation velocity analysis of heterogeneous protein-protein interactions: Lamm equation modeling and sedimentation coefficient distributions c(s). Biophys J 89:619–634
Fleming PJ, Fleming KG (2018) HullRad: Fast calculations of folded and disordered protein and nucleic acid hydrodynamic properties. Biophys J 114:856–869
Harris CR, Millman KJ, van der Walt SJ et al (2020) Array programming with NumPy. Nature 585:357–362
Hunter JD (2007) Matplotlib: a 2D graphics environment. Comput Sci Eng 9:90–95
Kar SR, Kingsbury JS, Lewis MS et al (2000) Analysis of transport experiments using pseudo-absorbance data. Anal Biochem 285:135–142
Kirschner MW, Schachman HK (1971a) Conformational changes in proteins as measured by difference sedimentation studies. II. Effect of stereospecific ligands on the catalytic subunit of aspartate transcarbamylase. Biochemistry 10:1919–1926
Kirschner MW, Schachman HK (1971b) Conformational changes in proteins as measured by difference sedimentation studies. I. A technique for measuring small changes in sedimentation coefficient. Biochemistry 10:1900–1919
Lamm O (1929) Die differentialgleichung der ultrazentrifugierung. Ark För Mat Astron Och Fys 21B:1–4
Ma J, Zhao H, Schuck P (2015) A histogram approach to the quality of fit in sedimentation velocity analyses. Anal Biochem 483:1–3
Maruno T, Usami K, Ishii K et al (2021) Comprehensive size distribution and composition analysis of adeno-associated virus vector by multiwavelength sedimentation velocity analytical ultracentrifugation. J Pharm Sci 110:3375–3384. https://doi.org/10.1016/j.xphs.2021.06.031
Nass SA, Mattingly MA, Woodcock DA et al (2018) Universal method for the purification of recombinant AAV vectors of differing serotypes. Mol Ther Methods Clin Dev 9:33–46
Philo JS (1996) An improved function for fitting sedimentation velocity data for low-molecular-weight solutes. Biophys J 72:435–444
Schuck P (1998) Sedimentation analysis of noninteracting and self-associating solutes using numerical solutions to the Lamm equation. Biophys J 75:1503–1512
Schuck P (2016) Sedimentation velocity analytical ultracentrifugation: discrete species and size-distributions of macromolecules and particles. CRC Press, Boca Raton
Schuck P, Demeler B (1999) Direct sedimentation analysis of interference optical data in analytical ultracentrifugation. Biophys J 76:2288–2296
Schuck P, MacPhee CE, Howlett GJ (1998) Determination of sedimentation coefficients for small peptides. Biophys J 74:466–474
Schuck P, Zhao H, Brautigam CA, Ghirlando R (2016) Basic principles of analytical ultracentrifugation. CRC Press, Boca Raton
Stafford WF (1992) Boundary analysis in sedimentation transport experiments: a procedure for obtaining sedimentation coefficient distributions using the time derivative of the concentration profile. Anal Biochem 203:295–301
Stafford WF, Sherwood PJ (2004) Analysis of heterologous interacting systems by sedimentation velocity: curve fitting algorithms for estimation of sedimentation coefficients, equilibrium and kinetic constants. Biophys Chem 108:231–243
Todd GP, Haschemeyer RH (1983) Generalized finite element solution to one-dimensional flux problems. Biophys Chem 17:321–336
Virtanen P, Gommers R, Oliphant TE et al (2020) SciPy 1.0: fundamental algorithms for scientific computing in python. Nat Methods 17:261–272
Zhao H, Ghirlando R, Piszczek G et al (2013) Recorded scan times can limit the accuracy of sedimentation coefficients in analytical ultracentrifugation. Anal Biochem 437:104–108
Acknowledgements
The author wishes to thank Dr. Peter Schuck for helpful discussions, Dr. Walter Stafford for providing exemplary code, and Drs. Lake Paul and Alexander Yarawsky for beta-testing SViMULATE and offering helpful suggestions.
Funding
No funding was received for this work.
Ethics declarations
Conflict of interest
The author has no relevant financial or non-financial interests to disclose.
Additional information
Special Issue: Analytical Ultracentrifugation 2022.
About this article
Cite this article
Brautigam, C.A. SViMULATE: a computer program facilitating interactive, multi-mode simulation of analytical ultracentrifugation data. Eur Biophys J 52, 293–302 (2023). https://doi.org/10.1007/s00249-023-01637-0