Abstract
The repulsion term in conventional force fields constitutes a major source of error. Assuming that this error could originate from an overly simple analytical functional form, we analyzed various analytical functions using ab initio exchange component values as a reference and obtained \((\alpha + \beta R^{-1})\exp(-\gamma R)\) as the optimal form to represent the repulsion term. Universal exchange, delocalization, and electrostatic penetration potentials approximating the corresponding interaction energy components defined within hybrid variation-perturbation theory (HVPT) were derived using a training set of 660 biomolecular complexes as a reference. The electrostatic multipole term was calculated using cumulative atomic multipole moments, whereas the correlation contribution, including the dispersion term and the first-order correlation correction, was estimated from the nonempirical \(D_{as}\) functions derived by Pernal et al. The resulting non-empirical atom–atom potentials (NEAAP) were tested on several urokinase–inhibitor complexes, yielding improved docking results.
Introduction
Precise knowledge of intermolecular interactions between proteins and their ligands is of critical importance in rational drug or biocatalyst design. Experimental studies still yield very limited information in this respect, while rigorous ab initio calculations are prohibitively expensive due to their steep scaling with basis set size. Therefore, simulations of large biomolecular systems are usually based on empirical force fields (FF), which exhibit limited transferability between molecular systems, mainly due to the electrostatic term [1–3], and do not preserve any clear physical meaning of their particular non-bonded contributions. Historically, the first conventional non-bonded force field parameters were fitted to reproduce experimentally available data such as enthalpies of vaporization or sublimation; however, the resulting parameter sets usually differ considerably from one another. Accurate quantum chemical interaction energies, which may cover a much wider range of intermolecular distances, constitute an alternative and more reliable source of data for FF fitting. Such FFs could therefore be called non-empirical. To our knowledge, only a few attempts have been made to derive universal FF nonbonded energy terms directly by fitting interaction energy components as defined by perturbation theory [1, 4]. Corresponding system-specific potential functions have been much more frequent [5–8]. In this contribution, we focus on improving the functional form of the exchange repulsion term, which has been noted as a major source of error in conventional force fields [9]. We have already encountered this issue when studying inhibitors docked into urokinase, where a force field placed some inhibitors 0.5 Å too deep in the active site compared to ab initio results [10].
One of the reasons for this could be an inappropriate functional form for the repulsion term, usually represented by \(\alpha R^{-12}\) or \(\alpha\exp(-\beta R)\), whereas the analytical formula for the first-order exchange term involves squared overlap integrals, which are in turn products of complex polynomials and exponential functions [11, 12].
With this in mind, we systematically explore various possible analytical formulas for reproducing the first-order exchange component for a large number of biomolecular complexes and determine their optimal parameters. To facilitate applications in conventional software, we limit ourselves to products of polynomial and exponential functions resembling those mentioned above, since they can be easily incorporated into typical force fields.
This work is an extension of our earlier report [1], in which what were perhaps among the first universal non-empirical atom–atom potentials were derived for the main interaction energy components. These were calculated in a minimal valence basis set for 336 hydrogen-bonded dimers and tested on the packing of N₂, CO₂, and nitromethane crystals [13]. The present contribution is based on an extended aug-cc-pVDZ basis set and a much wider collection of 660 biomolecular complexes, including various types of interactions besides hydrogen bonds. We tested the resulting nonempirical atom–atom potentials on the aforementioned urokinase–inhibitor complexes, which exhibited artefact structures when optimized with the standard Tripos 5.2 force field [10].
Interaction energy components
The only rational way to determine non-empirical atom–atom potentials (NEAAP) is to partition the intermolecular interaction energy ΔE into well-defined components, separating system-specific terms (for example, first-order electrostatic) from other, more transferable contributions whose different distance dependence requires different analytical representations. There are many possible interaction energy partitioning methods, mostly related to the variational Morokuma scheme [14] or symmetry-adapted perturbation theory (SAPT) [15]. Non-bonded interactions can also be explained in an elegant way using the Hellmann–Feynman theorem [16]; unfortunately, this approach has not yet been practically demonstrated to provide simple potential functions. In this study, we have applied hybrid variation-perturbation theory (HVPT) [17, 18], which is applicable to systems containing over 1800 AOs [19] and in which the total interaction energy is partitioned into the contributions described below.
The first-order multipole electrostatic term \(E_{EL,MTP}^{(10)}\) is obtained by contracting the SCF monomer cumulative atomic multipole moments (CAMMs) \(M_{A}^{(k_{a})}\) and \(M_{B}^{(k_{b})}\) with the Cartesian interaction tensor \(T^{k_{a}+k_{b}}\) for all atom pairs of the A and B molecules [20]. Here \(M^{(k)}\) is a rank-\(k\) multipole and \(T^{k_{a}+k_{b}}\) contains all the partial derivatives of \(|R_{ab}|^{-1}\) of rank \(k_{a}+k_{b}\). In this contribution, the CAMM expansion was truncated at the \(R^{-5}\) term (rank 5), which yields the best convergence [21], and the moments were calculated using the GAMESS-US ab initio package [22] (activated by adding $ELMOM IAMM=n $END to the input file, where n is the desired highest rank). Higher atomic multipole moments can easily be transformed into alternative point charge models [23, 24].
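The structure of the atom-pair multipole sum can be illustrated with a minimal sketch. It is not the CAMM implementation used in this work: it stops at atomic charges and dipoles (terms up to \(R^{-3}\)) rather than at \(R^{-5}\), uses atomic units, and all function names and the site layout `(position, charge, dipole)` are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))


def multipole_pair_energy(pos_a, q_a, mu_a, pos_b, q_b, mu_b):
    """Charge/dipole electrostatic energy of one atom pair (atomic units)."""
    r_vec = tuple(pb - pa for pa, pb in zip(pos_a, pos_b))  # vector from a to b
    r = math.sqrt(dot(r_vec, r_vec))
    # R^-1 term: charge-charge
    charge_charge = q_a * q_b / r
    # R^-2 terms: charge-dipole, from first derivatives of 1/R
    charge_dipole = (q_b * dot(mu_a, r_vec) - q_a * dot(mu_b, r_vec)) / r**3
    # R^-3 term: dipole-dipole, from second derivatives of 1/R
    dipole_dipole = (dot(mu_a, mu_b) / r**3
                     - 3.0 * dot(mu_a, r_vec) * dot(mu_b, r_vec) / r**5)
    return charge_charge + charge_dipole + dipole_dipole


def multipole_energy(sites_a, sites_b):
    """Sum pair energies over all atoms of A and all atoms of B.

    Each site is a tuple (position, charge, dipole).
    """
    return sum(multipole_pair_energy(*a, *b) for a in sites_a for b in sites_b)


# Hypothetical two-atom example: a unit charge interacting with a point dipole.
site_a = ((0.0, 0.0, 0.0), 1.0, (0.0, 0.0, 0.0))
site_b = ((0.0, 0.0, 2.0), 0.0, (0.0, 0.0, 1.0))
energy = multipole_energy([site_a], [site_b])
```

Because the sum runs over all atom pairs, the cost grows with the product of the numbers of atoms in A and B, whatever the truncation rank.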
The complete first-order electrostatic term \(E_{EL}^{(10)}\) is calculated in the dimer basis set as the first-order perturbational correction within the polarization approximation [17] and is equivalent to the analogous term defined within SAPT [15]. It is expressed through the monomer electron densities \(D_{rs}^{A}(D)\) and \(D_{tu}^{B}(D)\), obtained in the dimer basis set D = A + B, the nuclear charges \(Z_{a}\) and \(Z_{b}\), and the two-electron repulsion and one-electron nuclear attraction integrals \(\langle rs \mid tu \rangle\) and \(\langle r \mid Z_{b} R_{1b}^{-1} \mid s \rangle\), respectively.
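The quantities named above are those of the standard electrostatic interaction between two unperturbed charge distributions. This suggests the following form, given here as a reconstruction under stated assumptions rather than as the authors' typeset equation: the summation ranges, the symbol \(R_{2a}\) for the electron–nucleus distance on the B side, and the reading of \(\langle rs \mid tu \rangle\) with \(r, s\) on one electron and \(t, u\) on the other are all assumed.

\[
E_{EL}^{(10)} = \sum_{a \in A}\sum_{b \in B} Z_{a} Z_{b} R_{ab}^{-1}
- \sum_{b \in B}\sum_{r,s} D_{rs}^{A}(D)\, \langle r \mid Z_{b} R_{1b}^{-1} \mid s \rangle
- \sum_{a \in A}\sum_{t,u} D_{tu}^{B}(D)\, \langle t \mid Z_{a} R_{2a}^{-1} \mid u \rangle
+ \sum_{r,s}\sum_{t,u} D_{rs}^{A}(D)\, D_{tu}^{B}(D)\, \langle rs \mid tu \rangle
\]

The four terms are nucleus–nucleus repulsion, attraction of the electrons of A to the nuclei of B, attraction of the electrons of B to the nuclei of A, and electron–electron repulsion.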
The electrostatic penetration term \(E_{EL,PEN}^{(10)}\) is defined as the difference between the entire electrostatic energy \(E_{EL}^{(10)}\) and its multipole component \(E_{EL,MTP}^{(10)}\):
\[ E_{EL,PEN}^{(10)} = E_{EL}^{(10)} - E_{EL,MTP}^{(10)} \]
Taking mutually orthogonalized monomer wavefunctions obtained in the dimer basis set as a starting point, the first-order Heitler–London interaction energy \(E^{(10)}\) is calculated as the difference between the AB dimer energy \(E_{AB}(D)\) at iteration zero and the monomer energies \(E_{A}(D)\) and \(E_{B}(D)\) obtained in the dimer basis set:
\[ E^{(10)} = E_{AB}(D) - E_{A}(D) - E_{B}(D) \]
Neglecting the small Murrell delta term, the exchange repulsion component \(E_{EX}^{(10)}\) is then defined as the difference between the Heitler–London term \(E^{(10)}\) defined above and the electrostatic component \(E_{EL}^{(10)}\):
\[ E_{EX}^{(10)} = E^{(10)} - E_{EL}^{(10)} \]
Another component, \(E_{DEL}^{(R0)}\), called the delocalization term, covers higher-order (R) induction and exchange-deformation interactions [25] and is obtained as the difference between the converged SCF interaction energy \(\Delta E_{SCF}\) and the Heitler–London term \(E^{(10)}\):
\[ E_{DEL}^{(R0)} = \Delta E_{SCF} - E^{(10)} \]
It has to be noted that further partitioning of \(E_{DEL}^{(R0)}\) yields strongly basis set-dependent induction and charge transfer terms, whereas their sum, i.e., the delocalization term, does not display such a dependency [26]. The dispersion and exchange-dispersion terms obtained within the SAPT approach (\(E_{DISP}^{(2)} + E_{EX-DISP}^{(2)}\)) as well as the first-order correlation correction \(\epsilon^{(1)}\) can be closely approximated by atom–atom potentials that include damping functions, \(D_{as}\) [27], which represent the inter- and intramolecular correlation term \(E_{CORR}\).
The high quality of the \(D_{as}\) functions allows the SCF interaction energy to be supplemented by correlation effects [27], avoiding the well-known deficiencies of DFT or MP2 methods in representing dispersion interactions.
Thus, the total interaction energy used in this study can be expressed as the sum of the following terms:
\[ \Delta E = E_{EL,MTP}^{(10)} + E_{EL,PEN}^{(10)} + E_{EX}^{(10)} + E_{DEL}^{(R0)} + D_{as} \]
where both the long-range \(E_{EL,MTP}^{(10)}\) and \(D_{as}\) terms scale with the square of the number of atoms, \(O(A^{2})\). They will be supplemented by similarly scaling atom–atom potentials derived in this work to approximate the short-range \(E_{EL,PEN}^{(10)}\), \(E_{EX}^{(10)}\) and \(E_{DEL}^{(R0)}\) terms, yielding a complete nonempirical estimate of the major nonbonded interactions applicable in any force field.
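The bookkeeping behind these definitions can be summarized in a short sketch. The function and key names are hypothetical, and the numerical values are invented purely to exercise the arithmetic; they are not results from this study.

```python
def hvpt_components(e_dimer_hl, e_mono_a, e_mono_b, e_el, e_el_mtp, de_scf):
    """Split an SCF interaction energy into the components defined in the text.

    e_dimer_hl : dimer energy at iteration zero (Heitler-London), dimer basis
    e_mono_a/b : monomer energies in the dimer basis
    e_el       : complete first-order electrostatic term
    e_el_mtp   : multipole part of the electrostatic term
    de_scf     : converged SCF interaction energy
    """
    e10 = e_dimer_hl - e_mono_a - e_mono_b       # Heitler-London term E(10)
    return {
        "EL_MTP": e_el_mtp,
        "EL_PEN": e_el - e_el_mtp,               # penetration = total - multipole
        "EX": e10 - e_el,                        # exchange (Murrell delta neglected)
        "DEL": de_scf - e10,                     # delocalization
    }


def total_interaction(components, e_corr):
    """Add the correlation estimate to the four SCF-level components."""
    return sum(components.values()) + e_corr


# Invented numbers (arbitrary energy units), for illustration only.
parts = hvpt_components(e_dimer_hl=-150.004, e_mono_a=-100.0, e_mono_b=-50.0,
                        e_el=-0.010, e_el_mtp=-0.008, de_scf=-0.006)
total = total_interaction(parts, e_corr=-0.003)
```

By construction, the four SCF-level components add up to the converged SCF interaction energy, so the decomposition redistributes that energy without changing it.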
Results and discussion
Selection of the optimal functional form for the exchange repulsion term
Due to the critical importance of the repulsion term in empirical force fields [9], we extensively tested various possible functions, starting with the most popular ones such as \(\alpha R^{-12}\) and \(\alpha\exp(-\gamma R)\) and ending with \((\alpha + \beta R^{-1} + \delta R + \kappa R^{2} + \omega R^{-3})\exp(-\gamma R)\), as well as simpler intermediate versions. As reference data, we used values of the first-order exchange term obtained with the aug-cc-pVDZ basis set for 660 dimers of biomolecular complexes generated from the S66 training set [28]. This set includes 23 hydrogen-bonded, 23 dispersion-dominated, and 20 mixed molecular complexes composed of hydrogen, carbon, nitrogen, and oxygen only. Inclusion of sulphur, phosphorus, and halogen complexes involved in sigma-hole bonding is planned for the future. Besides the six shortest original distances defined by the ratio \(R/R_{eq}\), we generated an additional S66x4 set to cover shorter distances critical for repulsion interactions. The parameters α, β, γ, δ, κ, and ω that appear in the longest functional form given above were optimized by nonlinear least-squares fitting to the nonempirical exchange energy, with all weights equal to 1, using Powell’s conjugate direction method [29]. The corresponding root mean square errors (RMSE), both total and for distances close to equilibrium structures (eq), are given in Table 1, together with mean unsigned error (MUE) values (Table 2) and mean unsigned relative error (MURE) values (Table 3). The results presented in Tables 1–3 indicate that the \(\alpha R^{-12}\) function fails completely, whereas among all functions considered \((\alpha + \beta R^{-1})\exp(-\gamma R)\) seems to provide an optimal representation of the exchange repulsion interaction around equilibrium distances. This is illustrated in Fig. 1, which shows, for the methanol dimer, the distance dependence of the ratio of the various approximations to the exact reference \(E_{EX}^{(10)}\) values.
Analogous plots for acetate–methanol, methylamine–methanol, and methylammonium–methanol complexes are shown in Figs. 2, 3, and 4. Again, the \(\alpha R^{-12}\) function, with either Amber force field or NEAAP parameters, seems to considerably underestimate repulsion over the entire distance range. The very poor performance of the Amber repulsive FF term could be due to its direct coupling to the \(R^{-6}\) van der Waals component via an imprecisely defined well depth and equilibrium distance. Since the optimal function \((\alpha + \beta R^{-1})\exp(-\gamma R)\) resembles the conventional force field expression, it could be easily incorporated into existing molecular mechanics or dynamics packages. The computationally costly contribution of three-body interactions, dominated by the induction term, seems to be negligible and has only a small influence on final geometries.
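The coupling mentioned above can be made explicit for the common 12-6 Lennard-Jones form. It is written here with a well depth \(\varepsilon\) and a minimum-energy distance \(R_{m}\); these symbols are introduced for illustration and do not appear in the text.

\[
E_{LJ}(R) = \varepsilon\left[\left(\frac{R_{m}}{R}\right)^{12} - 2\left(\frac{R_{m}}{R}\right)^{6}\right]
= \frac{A}{R^{12}} - \frac{B}{R^{6}}, \qquad A = \varepsilon R_{m}^{12}, \quad B = 2\varepsilon R_{m}^{6}
\]

In this form the repulsive coefficient \(\alpha = A = B^{2}/(4\varepsilon)\) is fixed by the same two parameters that set the attractive \(R^{-6}\) term. It therefore cannot be adjusted independently to match the exchange repulsion.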
Atom–atom representation of the remaining interaction energy terms
Due to the presumably critical role of the exchange repulsion term in determining equilibrium geometries, the \((\alpha + \beta R^{-1})\exp(-\gamma R)\) function has also been applied to approximate the remaining short-range components, namely delocalization \(E_{DEL}^{(R0)}\) and electrostatic penetration \(E_{EL,PEN}^{(10)}\), in order to compare them on an equal footing. The values of root mean square errors around equilibrium presented in Table 4 indicate that the delocalization term \(E_{DEL}^{(R0)}\) could also be reasonably represented by \((\alpha + \beta R^{-1})\exp(-\gamma R)\). On the other hand, the electrostatic penetration component \(E_{EL,PEN}^{(10)}\) seems to require a more complex functional form, such as the one recently proposed by Tafipolsky and Engels [30]. Due to the relatively small contribution of electrostatic penetration effects (on average, electrostatic penetration at equilibrium in the S66 test set is 14.8 times smaller than exchange repulsion), we kept its functional form the same as for the exchange and delocalization terms for the sake of simplicity. The corresponding α, β, and γ parameters are given in Tables 5–7. Combining the \(E_{DEL}^{(R0)}\) and \(E_{EX}^{(10)}\) terms, or the \(E_{DEL}^{(R0)}\), \(E_{EX}^{(10)}\) and \(E_{EL,PEN}^{(10)}\) terms, leads to some RMSE reduction (Table 4) due to error compensation, but we did not resort to this in order to keep a clear meaning for all interaction energy components.
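Evaluating one such short-range component for a dimer amounts to a double sum over the atoms of the two molecules. The sketch below assumes that the parameters are indexed by element pair; the parameter values and the names `PARAMS`, `pair_params`, and `short_range_energy` are hypothetical and are not the fitted values of Tables 5–7.

```python
import math

# Hypothetical (alpha, beta, gamma) triples keyed by alphabetically sorted
# element pair. These numbers are placeholders, NOT fitted parameters.
PARAMS = {
    ("H", "H"): (5.0, 10.0, 2.0),
    ("H", "O"): (10.0, 20.0, 1.5),
    ("O", "O"): (40.0, 80.0, 1.8),
}


def pair_params(el_a, el_b, table):
    """Look up parameters regardless of the order of the two elements."""
    return table[tuple(sorted((el_a, el_b)))]


def short_range_energy(mol_a, mol_b, table):
    """Sum (alpha + beta / R) * exp(-gamma * R) over all intermolecular atom pairs.

    Each molecule is a list of (element, (x, y, z)) entries.
    """
    total = 0.0
    for el_a, pos_a in mol_a:
        for el_b, pos_b in mol_b:
            r = math.dist(pos_a, pos_b)
            alpha, beta, gamma = pair_params(el_a, el_b, table)
            total += (alpha + beta / r) * math.exp(-gamma * r)
    return total


# Two-atom fragments in arbitrary units, used only to show the call pattern.
fragment_a = [("O", (0.0, 0.0, 0.0)), ("H", (0.0, 0.8, 0.6))]
fragment_b = [("O", (0.0, 0.0, 3.0)), ("H", (0.0, 0.0, 2.0))]
energy = short_range_energy(fragment_a, fragment_b, PARAMS)
```

The same routine can serve the exchange, delocalization, and penetration terms by passing a different parameter table for each, which keeps the components separate as described above.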
Testing nonempirical atom–atom potentials for inhibitor–active site complexes
The NEAAP potentials derived for S66 model biomolecular complexes have been tested on several urokinase–inhibitor complexes [10], where force field docking resulted in several short contacts, shown in Table 8. Equilibrium distances for active site amino acid–inhibitor contacts obtained using the standard Tripos 5.2 force field and our atom–atom potentials are compared in Fig. 6 alongside MP2 results. Clearly, the application of nonempirical atom–atom potentials results in considerable improvement, yielding a correlation coefficient \(R^{2} = 0.92\) between NEAAP contact distances and MP2 data, in contrast to the force field value of \(R^{2} = 0.64\). This feature can be very useful in improving the quality of structural predictions, which can be of practical importance in drug design and scoring. It is possible that the occurrence of artefact short contacts in simulations based on conventional force fields is underreported in the literature, as locating them requires significant computational effort and incorrect data are overshadowed by other results when statistical averages are calculated (Fig. 6).
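The text does not state how the correlation coefficient was evaluated. A common choice, assumed here, is the squared Pearson correlation between predicted and reference contact distances; the function name `r_squared` is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)
```

This measure is insensitive to a constant offset or scaling between the two sets of distances. A systematic shift, such as contacts that are uniformly too short, would therefore need to be checked separately, for example with a mean signed error.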
Conclusions
The derivation of universal nonempirical atom–atom potentials (NEAAP) from interaction energy components defined within hybrid variation-perturbation theory (HVPT) opens the possibility of systematic improvements to force field nonbonded terms, which are critical for more accurate modeling of molecular materials. This study indicates that the analytical functions \(\alpha R^{-12}\) or \(\alpha\exp(-\beta R)\) commonly used to represent repulsive interactions are not adequate. By using the more appropriate but still relatively simple form \((\alpha + \beta R^{-1})\exp(-\gamma R)\), it is possible to obtain a better description of exchange repulsion over a wide range of distances, especially around equilibrium. The application of the derived NEAAPs resulted in a considerable improvement of the structural characteristics of an enzyme–inhibitor complex that had exhibited artefact short contacts when a conventional force field was applied in the past [10].
References
Sokalski WA, Lowrey AH, Roszak S, Lewchenko V, Blaisdell J, Hariharan PC, Kaufman JJ (1986) Nonempirical atom–atom potentials for main components of intermolecular energy. J Comp Chem 7:693–700
Gilson MK, Honig BH (1988) Energetics of charge–charge interactions in proteins. Proteins: Struct Funct Dyn 3:32–52
Roterman I, Gilson KD, Scheraga HA (1989) A comparison of the CHARMM, AMBER and ECEPP potentials for peptides. 1. Conformational predictions for the tandemly repeated peptide (Asn-Ala-Asn-Pro)9. J Biomol Str Dyn 7:391–419
Gresh N, Claverie P, Pullman A (1986) Intermolecular interactions: elaboration on an additive procedure including an explicit charge-transfer contribution. Int J Quantum Chem 29:101–118
Singh UC, Kollman PA (1985) A water dimer potential based on ab initio calculations using Morokuma component analyses. J Chem Phys 83:4033–4040
Torheyden M, Jansen G (2006) A new potential energy surface for the water dimer obtained from separate fits of ab initio electrostatic, induction, dispersion and exchange energy contributions. Mol Phys 104:2101–2138
Parrish RM, Sherrill CD (2014) Spatial assignment of symmetry adapted perturbation theory interaction energy components: the atomic SAPT. J Chem Phys 141:044115
Schmidt JR, Yu K, McDaniel JG (2015) Transferable next-generation force fields from simple liquids to complex materials. Acc Chem Res 48:548–556
Zgarbová M, Otyepka M, Sponer J, Hobza P, Jurecka P (2010) Large-scale compensation of errors in pairwise-additive empirical force fields: comparison of AMBER intermolecular terms with rigorous DFT-SAPT calculations. Phys Chem Chem Phys 12:10476–10493
Grzywa R, Dyguda-Kazimierowicz E, Sieńczyk M, Feliks M, Sokalski WA, Oleksyszyn J (2007) The molecular basis of urokinase inhibition: from the nonempirical analysis of intermolecular interactions to the prediction of binding affinity. J Mol Model 13:677–683
Murrell JN, Teixeira JJ (1970) Dependence of exchange energy on orbital overlap. Chem Phys Lett 19:521–525
Sokalski WA, Chojnacki H (1978) Approximate exchange perturbation study of inter-molecular interactions in molecular complexes. Int J Quantum Chem 13:679–692
Sokalski WA, Roszak S, Lowrey AH, Hariharan P, Walter Koski S, Kaufman JJ (1983) Crystal structure studies using ab-initio potential functions from partitioned MODPOT/VRDDO SCF energy calculations. I. N₂ and CO₂ test cases. II. Nitromethane CH₃NO₂. Int J Quantum Chem: Quantum Chemistry Symp 17:375–391
Kitaura K, Morokuma K (1976) New energy decomposition scheme for molecular-interactions within Hartree–Fock approximation. Int J Quantum Chem 10:325–340
Jeziorski B, Moszynski R, Szalewicz K (1994) Perturbation theory approach to intermolecular potential energy surfaces of van der Waals complexes. Chem Rev 94:1887–1930
Politzer P, Murray J, Clark T (2015) Mathematical modeling and physical reality in noncovalent interactions. J Mol Model 21:52
Sokalski WA, Roszak S, Pecul K (1988) An efficient procedure for decomposition of the SCF interaction energy into components with reduced basis set dependence. Chem Phys Lett 153:153–159
Szefczyk B, Mulholland A, Ranaghan K, Sokalski WA (2004) Differential transition state stabilization in enzyme catalysis: quantum chemical analysis of interactions in the chorismate mutase reaction and prediction of the optimal catalytic field. J Am Chem Soc 126:16148–16159
Langner KM, Janowski T, Gora R, Dziekonski P, Sokalski WA, Pulay P (2011) The ethidium-UA/AU intercalation site: effect of model fragmentation and backbone charge state. J Chem Theor Comp 7:2600–2609
Sokalski WA, Poirier RA (1983) Cumulative atomic multipole representation of the molecular charge distribution and its basis set dependence. Chem Phys Lett 98:86–92
Sokalski WA, Sawaryn A (1987) Correlated molecular and cumulative atomic multipole moments. J Chem Phys 87:526–534
Schmidt MW, Baldridge KK, Boatz JA, Elbert ST, Gordon MS, Jensen JH, Koseki S, Matsunaga N, Nguyen KA, Su S, Windus TL, Dupuis M, Montgomery JA (1993) General atomic and molecular electronic structure system. J Comp Chem 14:1347–1363
Sokalski WA, Shibata M, Ornstein RL, Rein R (1993) Point-charge representation of multicenter multipole moments in calculation of electrostatic properties. Theor Chim Acta 85:209–216
Devereux M, Raghunathan S, Fedorov DG, Meuwly M (2014) A novel, computationally efficient multipolar model employing distributed charges for molecular dynamics simulations. J Chem Theor Comp 10:4229–4241
Chalasinski G, Szczesniak MM (1994) Origins of structure and energetics of van der Waals clusters from ab initio calculations. Chem Rev 94:1723–1765
Sokalski WA, Roszak S (1991) Efficient techniques for the decomposition of intermolecular interaction energy at SCF level and beyond. J Mol Struct(THEOCHEM) 234:387–400
Podeszwa R, Pernal K, Patkowski K, Szalewicz K (2010) Extension of the Hartree–Fock plus dispersion method by first-order correlation effects. J Phys Chem Lett 1:550–555
Řezáč J, Riley KE, Hobza P (2011) S66: A well-balanced database of benchmark interaction energies relevant to biomolecular structures. J Chem Theor Comput 7:2427–2438
Powell MJD (1964) An efficient method for finding the minimum of a function of several variables without calculating derivatives. Comp J 7:155–162
Tafipolsky M, Engels B (2011) Accurate intermolecular potentials with physically grounded electrostatics. J Chem Theor Comp 7:1791–1803
Acknowledgments
This work was supported by Wroclaw Research Centre EIT+ within the project “Biotechnologies and advanced medical technologies” - BioMed (POIG.01.01.02-02-003/08), co-financed by the European Regional Development Fund (Operational Programme Innovative Economy, 1.1.2). Partial financing by a statutory activity subsidy from the Polish Ministry of Science and Higher Education for the Faculty of Chemistry of Wroclaw University of Technology is also acknowledged. Calculations were performed at supercomputer centers in Wroclaw (WCSS), Poznan (PCSS), and Warsaw (ICM). The authors are indebted to Dr. Karol M. Langner from the University of Virginia at Charlottesville, VA, USA, for stimulating discussions and valuable comments.
Additional information
This paper belongs to Topical Collection 6th conference on Modeling & Design of Molecular Materials in Kudowa Zdrój (MDMM 2014)
Cite this article
Konieczny, J.K., Sokalski, W.A. Universal short-range ab initio atom–atom potentials for interaction energy contributions with an optimal repulsion functional form. J Mol Model 21, 197 (2015). https://doi.org/10.1007/s00894-015-2729-7
DOI: https://doi.org/10.1007/s00894-015-2729-7