Heats of formation of the amino acids re-examined by means of W1-F12 and W2-F12 theories

Karton, Amir; Yu, Li-Juan; Kesharwani, Manoj K.; Martin, Jan M. L.

doi:10.1007/s00214-014-1483-8


Regular Article
Published: 16 April 2014

Volume 133, article number 1483, (2014)

Theoretical Chemistry Accounts

Amir Karton¹,
Li-Juan Yu¹,
Manoj K. Kesharwani² &
…
Jan M. L. Martin²,³

Abstract

We have obtained accurate heats of formation for the twenty natural amino acids by means of explicitly correlated high-level thermochemical procedures. Our best theoretical heats of formation, obtained by means of the ab initio W1-F12 and W2-F12 thermochemical protocols, differ significantly (RMSD = 2.3 kcal/mol, maximum deviation 4.6 kcal/mol) from recently reported values using the lower-cost G3(MP2) method. With the more recent G4(MP2) procedure, RMSD drops slightly to 1.8 kcal/mol, while full G4 theory offers a more significant improvement to 0.72 kcal/mol (max. dev. 1.4 kcal/mol for glutamine). The economical G4(MP2)-6X protocol performs equivalently at RMSD = 0.71 kcal/mol (max. dev. 1.6 kcal/mol for arginine and glutamine). Our calculations are in excellent agreement with experiment for glycine and alanine and with the recently revised value for methionine, but suggest revisions by several kcal/mol for valine, proline, phenylalanine, and cysteine, in the latter case confirming a recently proposed revision. Our best heats of formation at 298 K ($\Delta H_{f,298}^{\circ }$) are as follows: at the W2-F12 level: glycine $-$94.1, alanine $-$101.5, serine $-$139.2, cysteine $-$94.5, and methionine $-$102.4 kcal/mol, and at the W1-F12 level: arginine $-$98.8, asparagine $-$146.5, aspartic acid $-$189.6, glutamine $-$151.0, glutamic acid $-$195.5, histidine $-$69.8, isoleucine $-$118.3, leucine $-$118.8, lysine $-$110.0, phenylalanine $-$76.9, proline $-$92.8, threonine $-$149.0, and valine $-$113.6 kcal/mol. For the two largest amino acids, an average over G4, G4(MP2)-6X, and CBS-QB3 yields best estimates of $-$58.4 kcal/mol for tryptophan, and of $-$117.5 kcal/mol for tyrosine. For glycine, we were able to obtain a “quasi-W4” result corresponding to $\hbox {TAE}_e$ = 968.1, $\hbox {TAE}_0$ = 918.6, $\Delta H_{f,0}^{\circ }=-90.0$, and $\Delta H_{f,298}^{\circ }=-94.0$ kcal/mol.

1 Introduction

Due to the increasing computational power provided by supercomputers and recent advances in the development of economical ab initio methods (e.g., advances in explicitly correlated techniques [1–4]), high-level ab initio methods have now been refined to the point where they are applicable to biologically relevant systems (see Refs. [5–13] for some recent studies). Proteinogenic amino acids are the most basic building blocks of proteins and play key roles in protein structure and function. They also serve as precursors of many biologically relevant molecules, such as polypeptides, nucleotides, hormones, neurotransmitters, and antioxidants [14]. Despite their importance, the experimental gas-phase heats of formation of most of the natural amino acids are not accurately known. Determination of these fundamental thermochemical quantities may be important in understanding why nature chose these molecules as fundamental biological building blocks—for example, by comparing the relative stabilities of $\alpha$- versus $\beta$-amino acids [15, 16]. Accurate heats of formation for the amino acids are also important from a theoretical point of view, e.g., for the validation and parameterization of computationally cost-effective procedures such as density functional theory, semiempirical molecular orbital theory, and molecular mechanics. In recent years, a large number of theoretical studies were dedicated to obtaining thermochemical properties of amino acids using high-level theoretical procedures [15–25].

In the present work, we obtain accurate theoretical heats of formation for the lowest-energy conformers for the 18 proteinogenic amino acids using the high-level, ab initio W1-F12 and W2-F12 thermochemical procedures [31]. These thermochemical procedures represent layered extrapolations to the all-electron, relativistic CCSD(T)/CBS energy (complete basis set limit coupled cluster with singles, doubles, and quasiperturbative triple excitations) and can achieve an accuracy in the sub-kcal/mol range for molecules whose wave functions are dominated by dynamical correlation [31, 32]. We use these benchmark values to evaluate the performance of a variety of G$n$-type procedures [33] that were recently used for obtaining accurate thermochemical properties of amino acids [15–20].

The present paper also seeks to pay tribute to the scientific achievements of Prof. Isaiah Shavitt (OBM) and specifically to his seminal contributions to coupled cluster theory [26], to the theory and development of Gaussian basis sets [27], to accurate applied quantum chemistry [28–30], and to computational biochemistry [5].

2 Computational details

Most calculations were run on the CRUNTCh (Computational Research at UNT in Chemistry) Linux farm at the University of North Texas, on the high-performance computing National Computational Infrastructure (NCI) National Facility at Canberra, and on the iVEC@UWA facilities. Some additional calculations were carried out on the Faculty of Chemistry Linux farm at the Weizmann Institute of Science.

The geometries have been optimized at the B3LYP/A’VTZ level of theory [34–36] (where A’VTZ indicates the combination of the standard correlation-consistent cc-pVTZ basis set on hydrogen, [37] the aug-cc-pVTZ basis set on first-row elements, [38] and the aug-cc-pV(T+d)Z basis set on sulfur) [39]. All geometry optimizations and frequency calculations were performed using the Gaussian 09 program suite [40]. Benchmark relativistic, all-electron CCSD(T)/CBS energies were then obtained by means of our recently developed W1-F12 and W2-F12 thermochemical protocols [31] using the Molpro 2012.1 program suite [41]. The computational protocols of W1-F12 and W2-F12 theories have been specified and rationalized in reference [31].

In W1-F12 theory, the Hartree–Fock component is extrapolated from the VDZ-F12 and VTZ-F12 basis sets, using the $E(L) = E_{\infty } + \hbox {A}/L^{\alpha }$ two-point extrapolation formula with $\alpha$ = 5 (where $L$ is the highest angular momentum represented in the basis set, and V$n$Z-F12 denotes the cc-pV$n$Z-F12 basis sets of Peterson et al. [42] which were developed for explicitly correlated calculations). Optimal values for the geminal Slater exponents ($\beta$) used in conjunction with the V$n$Z-F12 basis sets were taken from reference [43]. The valence CCSD-F12 correlation energy is extrapolated from the same basis sets, using the said two-point extrapolation formula. Extrapolation exponents ($\alpha$) were taken from references [31, 43]. In all of the explicitly correlated coupled cluster calculations the diagonal, fixed-amplitude 3C(FIX) ansatz [45–47] and the CCSD-F12b approximation [48, 49] are employed. The (T) valence correlation energy is obtained in the same way as in the original Weizmann-1 (W1) theory, [50] i.e., extrapolated from the A’VDZ and A’VTZ basis sets using the above two-point extrapolation formula with $\alpha$ = 3.22. The CCSD inner-shell contribution is calculated with the core-valence weighted correlation-consistent A’PWCVTZ basis set of Peterson and Dunning, [51] while the (T) inner-shell contribution is calculated with the PWCVTZ(no $f$) basis set (where A’PWCVTZ indicates the combination of the cc-pVTZ basis set on hydrogen and the aug-cc-pwCVTZ basis set on carbon, and PWCVTZ(no $f$) indicates the cc-pwCVTZ basis set without the $f$ functions). The scalar relativistic contribution (in the second-order Douglas–Kroll–Hess approximation [52, 53]) is obtained as the difference between non-relativistic CCSD(T)/A’VDZ and relativistic CCSD(T)/A’VDZ-DK calculations [54] (where A’VDZ-DK indicates the combination of the cc-pVDZ-DK basis set on H and aug-cc-pV(D+d)Z-DK basis set on heavier elements). 
The atomic spin–orbit coupling terms are taken from the experimental fine structure, and the diagonal Born–Oppenheimer correction (DBOC) is calculated at the HF/A’VTZ level of theory. The zero-point vibrational energies (ZPVEs) are derived from B2PLYP/def2-TZVPP harmonic frequencies (and scaled by 0.9833, see Sect. 3.3).
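The two-point extrapolation formula used throughout this section can be solved in closed form for $E_{\infty }$. The following is a minimal sketch, not the authors' implementation; the function name and the synthetic test energies are illustrative assumptions, and only the functional form $E(L) = E_{\infty } + \hbox {A}/L^{\alpha }$ is taken from the text.

```python
def extrapolate_two_point(e_small, e_large, l_small, l_large, alpha):
    """Solve E(L) = E_inf + A / L**alpha for E_inf from two basis-set results.

    e_small, e_large: energies obtained with the smaller and larger basis set
    l_small, l_large: highest angular momentum in each basis set (e.g. 2 and 3)
    alpha:            extrapolation exponent (e.g. 5 for the Hartree-Fock component)
    """
    if l_large <= l_small:
        raise ValueError("l_large must exceed l_small")
    ratio = (l_large / l_small) ** alpha
    # Eliminating A from the two equations gives this closed-form limit.
    return e_large + (e_large - e_small) / (ratio - 1.0)


# Synthetic check with made-up numbers (not taken from the paper):
# build two energies from a known limit and confirm that it is recovered.
E_INF, A, ALPHA = -100.0, 0.5, 5.0
e_dz = E_INF + A / 2 ** ALPHA
e_tz = E_INF + A / 3 ** ALPHA
recovered = extrapolate_two_point(e_dz, e_tz, 2, 3, ALPHA)
print(f"recovered limit: {recovered:.6f}")
```

Because the correction term scales as $1/[(L_2/L_1)^{\alpha }-1]$, a smaller exponent yields a larger extrapolation correction, which is why the choice of $\alpha$ matters for the correlation components.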

In W2-F12, the Hartree–Fock component is calculated with the VQZ-F12 basis set. The valence CCSD-F12 correlation energy is extrapolated from the VTZ-F12 and VQZ-F12 basis sets, using the above two-point extrapolation formula with $\alpha$ = 5.94. The quasiperturbative triples, (T), corrections are obtained from standard CCSD(T)/VTZ-F12 calculations (i.e., without inclusion of F12 terms) and scaled by the factor f = $0.987 \times E_{\text {MP2-F12}}/E_{\text {MP2}}$. This approach has been shown to accelerate the basis set convergence [31, 49]. The CCSD inner-shell contribution is calculated with the core-valence weighted correlation-consistent A’PWCVTZ basis set, while the (T) inner-shell contribution is calculated with the PWCVTZ(no f) basis set. The scalar relativistic, spin–orbit coupling, DBOC, and ZPVE corrections are obtained in the same way as in W1-F12 theory.
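The scaling of the conventional (T) contribution described above amounts to a single multiplication. The sketch below assumes that the three energies are supplied in the same units; the function name and the sample numbers are hypothetical and serve only to show the arithmetic.

```python
def scaled_triples(e_t, e_mp2_f12, e_mp2, prefactor=0.987):
    """Scale a conventional (T) energy by f = prefactor * E(MP2-F12) / E(MP2).

    e_t:       (T) correlation energy from a standard CCSD(T) calculation
    e_mp2_f12: MP2-F12 correlation energy in the same basis set
    e_mp2:     conventional MP2 correlation energy in the same basis set
    """
    if e_mp2 == 0.0:
        raise ValueError("E(MP2) correlation energy must be non-zero")
    factor = prefactor * e_mp2_f12 / e_mp2
    return factor * e_t


# Hypothetical correlation energies in hartree, for illustration only.
print(scaled_triples(e_t=-0.02, e_mp2_f12=-0.55, e_mp2=-0.50))
```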

The total atomization energies at 0 K ($\hbox {TAE}_0$) are converted to heats of formation at 298 K using the Active Thermochemical Tables (ATcT) [55–59] atomic heats of formation at 0 K (H 51.633 $\pm$ 0.000, C 170.024 $\pm$ 0.014, N 112.469 $\pm$ 0.007, O 58.997 $\pm$ 0.000, and S 65.709 $\pm$ 0.036 kcal/mol), and the CODATA [60] enthalpy functions, $\hbox {H}_{298}-\hbox {H}_{0}$, for the elemental reference states ($\hbox {H}_{2}(\hbox {g}) = 2.024\pm 0.000$, C(cr,graphite) = $0.251\pm 0.005, \hbox {N}_2(\hbox {g}) = 2.072\pm 0.000, \hbox {O}_2(\hbox {g}) = 2.075\pm 0.000$, and S(cr,rhombic) = $1.054\pm 0.001$ kcal/mol), while the enthalpy functions for the amino acids are obtained within the rigid rotor harmonic oscillator (RRHO) approximation from B3LYP/A’VTZ geometries and harmonic frequencies.
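The conversion from $\hbox {TAE}_0$ to heats of formation can be sketched as follows, using only the atomic and elemental reference data quoted above. The helper names are illustrative assumptions; the enthalpy function of the molecule itself (from the RRHO treatment) is not given in the text and is therefore left as an input parameter.

```python
# Atomic heats of formation at 0 K (kcal/mol), ATcT values quoted in the text.
ATOM_DHF_0K = {"H": 51.633, "C": 170.024, "N": 112.469, "O": 58.997, "S": 65.709}

# H298 - H0 of the elemental reference states per atom (kcal/mol), CODATA values
# quoted in the text; the diatomic gases are halved to obtain a per-atom value.
ELEMENT_H298_H0 = {
    "H": 2.024 / 2, "C": 0.251, "N": 2.072 / 2, "O": 2.075 / 2, "S": 1.054,
}


def dhf_0k(formula, tae_0):
    """Heat of formation at 0 K from the atomization energy at 0 K."""
    return sum(n * ATOM_DHF_0K[el] for el, n in formula.items()) - tae_0


def dhf_298k(formula, tae_0, mol_h298_h0):
    """Heat of formation at 298 K; mol_h298_h0 is the molecule's H298 - H0."""
    elements = sum(n * ELEMENT_H298_H0[el] for el, n in formula.items())
    return dhf_0k(formula, tae_0) + mol_h298_h0 - elements


GLYCINE = {"C": 2, "H": 5, "N": 1, "O": 2}
# Quasi-W4 TAE_0 for glycine as quoted in the abstract (918.6 kcal/mol);
# the result is about -89.9 kcal/mol with this rounded input.
print(round(dhf_0k(GLYCINE, 918.6), 2))
```

For glycine, the elemental enthalpy functions sum to about 8.67 kcal/mol, so the 298 K value lies below the 0 K value whenever the molecular $\hbox {H}_{298}-\hbox {H}_{0}$ is smaller than that.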

W1-F12 shows excellent performance for systems containing first-row elements (and H). Specifically, for the 97 first-row atomization energies in the W4-11 dataset, [32] W1-F12 attains a root mean square deviation (RMSD) of 0.19 kcal/mol relative to all-electron, relativistic CCSD(T) reference atomization energies at the infinite basis set limit. However, for second-row systems, it was found that the performance of W1-F12 is significantly degraded owing to shortcomings of the cc-pVDZ and cc-pVDZ-F12 basis sets for second-row elements (see Ref. [31] for details): for the 40 second-row atomization energies in the W4-11 dataset, the RMSD actually exceeds 1 kcal/mol. W2-F12 does not suffer from this problem and yields similar RMSDs of 0.18 kcal/mol for first-row and 0.24 kcal/mol for second-row systems. Thus, for the sulfur-containing amino acids (cysteine and methionine) and for the small amino acids (alanine, glycine, and serine), the heats of formation are also obtained using W2-F12 theory.

Glycine is small enough (especially considering its $C_s$ symmetry) that the result can be independently verified using accurate thermochemical procedures based on layered extrapolation of orbital basis sets, specifically the high-accuracy W4 method [68]. Full details of the method are given in that reference and will not be repeated here; suffice it to say that for a set of molecules where accurate experimental atomization energies are available via ATcT, the RMSD from experiment is 0.10 kcal/mol [32, 68]. The largest-scale calculation involved here, CCSD/aug’-cc-pV6Z, entails 1400 basis functions and required 3 terabytes of scratch space, yet ran to completion within a day on a machine with a large solid-state disk array. The CCSD(T)/aug’-cc-pV5Z calculation, involving 910 basis functions, ran in under a day on 32 cores with 512 GB of RAM.

The heats of formation have also been obtained using computationally more economical composite procedures, namely the Gaussian-4 (G4) protocol [33, 63] and its less expensive G4(MP2) and G4(MP2)-6X variants [64, 65]. These calculations were performed using the Gaussian 09 program suite [40]. The G4 and G4(MP2) protocols are widely used for the calculation of thermochemical properties and are applicable to relatively large systems (of up to 20–30 non-hydrogen atoms). They generally give RMSDs from experimental or high-accuracy theoretical thermochemical data of 1–2 kcal/mol [32, 63, 64]. For example, for the 454 experimental thermochemical determinations of the G3/05 test set (including heats of formation, ionization energies, and electron affinities), [66] G4 and G4(MP2) attain RMSDs of 1.2 and 1.5 kcal/mol, respectively [63, 64]. For the set of 137 very accurate theoretical atomization energies in the W4-11 set, both procedures attain an RMSD of 2.0 kcal/mol [32]. Finally, we have also considered the performance of the CBS-QB3 procedure [67] using Gaussian 09.

3 Results and discussion

3.1 Computational cost of the W1-F12 calculations

For systems consisting of more than eight non-hydrogen atoms (with $\hbox {C}_1$ symmetry), W1 theory [50] becomes prohibitively expensive with current commodity server hardware. W1-F12 theory is an explicitly correlated version of the W1 method, [50] which combines explicitly correlated F12 methods [1–4] with extrapolation techniques in order to approximate the CCSD(T)/CBS energy. Because of the drastically accelerated basis set convergence of the F12 methods [42, 43], W1-F12 is superior to the original W1 method, not only in terms of performance but also in terms of computational cost [31]. For example, the CPU times for calculating W1 and W1-F12 energies for a system containing 8 non-hydrogen atoms (with $\hbox {C}_1$ symmetry) are 595 and 163 h, respectively (both calculations ran on 8 Intel Xeon Sandy Bridge cores at 2.6 GHz). In terms of disk space requirements, the W1 calculation used about five times the amount of scratch disk (660 GB) that the W1-F12 calculation required (126 GB).

In the present work, we obtain W1-F12 energies for the 18 amino acids with up to 12 non-hydrogen atoms. Of these, the largest amino acids are glutamic acid, glutamine, and lysine (10 non-hydrogen atoms); histidine (11 non-hydrogen atoms); arginine and phenylalanine (12 non-hydrogen atoms). Considering the fact that none of the amino acids (apart from glycine) have any spatial symmetry, these represent the largest W1-F12 calculations reported to date. For example, the W1-F12 calculation for arginine ran for 51 days on 6 Intel Nehalem 8837 cores at 2.67 GHz and used 253 GB of RAM and 1.1 TB of scratch disk. Due to this very steep computational cost, we obtain our best heats of formation for the two amino acids with more than 12 non-hydrogen atoms (i.e., tryptophan and tyrosine) with the G$n$ and CBS-QB3 methods [33, 67], which have a significantly reduced computational cost. In Sect. 3.4, we show that, relative to W1-F12 and W2-F12 heats of formation, G4, G4(MP2)-6X, and CBS-QB3 result in RMSDs of 0.72, 0.71, and 1.01 kcal/mol, respectively, i.e., near or below the threshold of “chemical accuracy” (traditionally arbitrarily defined as 1 kcal/mol).

3.2 W1-F12 and W2-F12 benchmark heats of formation

Since W1-F12 and W2-F12 theories represent layered extrapolations to the CCSD(T) basis set limit energy, it is of interest to estimate whether the contributions from post-CCSD(T) excitations are likely to be significant for the atomization energies of the amino acids. The percentage of the total atomization energy accounted for by parenthetical connected triple excitations, $\%\hbox {TAE}_e$[(T)], has been shown to be a reliable energy-based diagnostic for the importance of non-dynamical correlation effects [68, 74]. It has been suggested that $\%\hbox {TAE}_e$[(T)] $<$ 2 % indicates systems that are dominated by dynamical correlation, while 2 % $<$ $\%\hbox {TAE}_e$[(T)] $<$ 5 % indicates systems that include mild non-dynamical correlation. $\%\hbox {TAE}_e$[(T)] values for the amino acids are gathered in Table 1. The amino acids are characterized by $\%\hbox {TAE}_e$[(T)] values ranging from 1.7 % (leucine) to 2.5 % (histidine). Note also that in all cases, the SCF component accounts for 69–77 % of the total atomization energy. These values suggest that our all-electron, non-relativistic, vibrationless benchmark atomization energies should, in principle, lie considerably closer than 1 kcal/mol to the atomization energies at the full configuration interaction (FCI) basis set limit. For example, for systems that are associated with similar $\%\hbox {TAE}_e$[(T)] values in the W4-11 dataset [32], post-CCSD(T) contributions to the atomization energy are 0.2 kcal/mol or less, although somewhat larger values were found for benzene [75, 76].
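The $\%\hbox {TAE}_e$[(T)] diagnostic and the two ranges quoted above can be expressed as a short helper. This is a sketch under stated assumptions: the function names are hypothetical, the text does not say how a value of exactly 2 % should be classified (it is assigned to the upper range here), and values of 5 % or more fall outside the ranges quoted in this section.

```python
def triples_share(tae_total, tae_triples):
    """Percentage of the total atomization energy that comes from (T)."""
    return 100.0 * tae_triples / tae_total


def classify(pct_t):
    """Map a %TAE_e[(T)] value onto the two ranges quoted in the text."""
    if pct_t < 2.0:
        return "dominated by dynamical correlation"
    if pct_t < 5.0:
        # Assumption: the boundary value 2 % itself is placed in this range.
        return "mild non-dynamical correlation"
    return "outside the ranges quoted in the text"


# The extreme values reported for the amino acids: leucine and histidine.
for name, pct in (("leucine", 1.7), ("histidine", 2.5)):
    print(name, pct, classify(pct))
```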

Table 1 Diagnostics indicating the importance of post-CCSD(T) contributions for the amino acids

Table 2 gives an overview of basis set convergence of the CCSD-F12 component of the total atomization energy. The magnitude of the valence CCSD-F12 correlation component spans a relatively large range. For example, the CCSD-F12/V{D,T}Z-F12 results extrapolated with $\alpha$ = 3.67 (which was optimized to minimize the RMSD over 137 first- and second-row systems in the W4-11 dataset [31]) extend from 272.48 (glycine) up to 701.76 (arginine) kcal/mol. The differences between the CCSD-F12/V{D,T}Z-F12 results obtained with $\alpha$ = 3.67 (optimized over the entire W4-11 set of small molecules) and $\alpha$ = 3.38 (optimized over the subset of 97 first-row species only) can get quite significant for these medium-sized species, ranging from 0.25 kcal/mol for glycine to 0.71 kcal/mol for arginine. Note that these differences still only correspond to about 0.1 % of the valence CCSD correlation component. For comparison, for the systems in the W4-11 dataset, the absolute differences between the CCSD-F12/V{D,T}Z-F12 component extrapolated with $\alpha$ = 3.67 and 3.38 are reduced to just 0.00–0.22 kcal/mol, or 0.08 kcal/mol mean absolute—likewise, about 0.1 % of the valence CCSD correlation component of the atomization energy. Finally, using instead the extrapolation exponent optimized by Hill et al. [43] ($\alpha$ = 3.144), which was optimized over a smaller set of 14 absolute correlation energies, results in atomization energies increased by 0.24 (glycine) up to 0.69 (arginine) kcal/mol over the values with $\alpha$ = 3.38 (Table 2).

Table 2 Overview of the basis set convergence of the CCSD-F12 component of the total atomization energies for the amino acids (kcal/mol)

For five smaller amino acids (alanine, cysteine, glycine, methionine, and serine), we were able to obtain CCSD-F12/VQZ-F12 energies. Table 2 gives the CCSD-F12/V{T,Q}Z-F12 results extrapolated with $\alpha$ = 5.94 (used in W2-F12 theory [31]) and 4.596 (from Ref. [43]). For these systems, the difference between the CCSD-F12/V{T,Q}Z-F12 contributions extrapolated with $\alpha$ = 5.94 and 4.596 ranges between 0.20 (glycine) and 0.34 (methionine) kcal/mol (Table 2). We note that the error statistics over the 137 systems in the W4-11 dataset are as follows: RMSD = 0.13, MAD = 0.10, and MSD = 0.01 kcal/mol for $\alpha$ = 5.94, and RMSD = 0.15, MAD = 0.11, and MSD = 0.08 kcal/mol for $\alpha$ = 4.596. Peterson and Feller [44] obtained benchmarks extrapolated from basis sets as large as aug-cc-pV8Z for a fairly large sample of molecules that overlaps W4-11 and found that CCSD-F12b/V{T,Q}Z-F12 tends to overestimate the valence CCSD component on average: as they were using $\alpha$ = 4.596, this is consistent with the present finding. (They also report difficulties reaching 0.1 kcal/mol convergence for CCSD-F12b energies with aug-cc-pV5Z basis sets: we were only able to apply this basis set to glycine, and in any case 0.1 kcal/mol is smaller than other potential error sources in the present work.)

Table 3 Component breakdown of the W1-F12 and W2-F12 atomization energies and final gas-phase heats of formation at 0 and 298 K for the lowest-energy conformers of the amino acids (kcal/mol)

For the five W2-F12 amino acids, the RMSDs for CCSD-F12/V{D,T}Z-F12 with various choices of extrapolation exponent are 0.43 ($\alpha$ = 3.67), 0.14 ($\alpha$ = 3.38), and 0.24 ($\alpha$ = 3.144) kcal/mol. Taking the average between the CCSD-F12/V{D,T}Z-F12 components extrapolated with $\alpha$ = 3.38 and 3.144 results in an RMSD of 0.12 kcal/mol and a mean signed deviation of only +0.06 kcal/mol. We thus use this averaged CCSD-F12/V{D,T}Z-F12 component in our final W1-F12 atomization energies. The spread between the $\alpha$ = 3.38 and 3.144 values can be considered a crude gauge of the uncertainty in the basis set limit.
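The averaging of the two extrapolated CCSD-F12/V{D,T}Z-F12 components can be retraced from the glycine and arginine numbers quoted in the discussion of Table 2. The sketch below assumes that both quoted differences are increases as the exponent is lowered (the text states this explicitly only for $\alpha$ = 3.144 versus 3.38); the function name is hypothetical.

```python
def averaged_ccsd(e_367, d_338, d_3144):
    """Average of the alpha = 3.38 and alpha = 3.144 extrapolated components.

    e_367:  component extrapolated with alpha = 3.67 (kcal/mol)
    d_338:  increase on going from alpha = 3.67 to 3.38 (assumed positive)
    d_3144: increase on going from alpha = 3.38 to 3.144
    Returns (average, spread); the spread is the crude uncertainty gauge.
    """
    e_338 = e_367 + d_338
    e_3144 = e_338 + d_3144
    return (e_338 + e_3144) / 2.0, e_3144 - e_338


for name, args in (("glycine", (272.48, 0.25, 0.24)),
                   ("arginine", (701.76, 0.71, 0.69))):
    average, spread = averaged_ccsd(*args)
    print(f"{name}: average {average:.2f}, spread {spread:.2f} kcal/mol")
```

Within the rounding of the quoted inputs, the averages agree with the final valence CCSD-F12 components of 272.85 (glycine) and 702.80 (arginine) kcal/mol listed among the observations on Table 3.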

Table 4 Dependence of computed ZPVEs (kcal/mol) on the level of theory

The component breakdowns of the W1-F12 and W2-F12 atomization energies are gathered in Table 3. The following general observations may be noted:

As pointed out above, the magnitude of the valence CCSD-F12 correlation component runs a large gamut, extending from 272.85 (glycine) up to 702.80 (arginine) kcal/mol.
The magnitude of the valence (T) correlation component can be rather large, reaching 54.28 kcal/mol for phenylalanine.
The core–valence contribution approaches or exceeds 10 kcal/mol for the largest systems. Namely, it is 9.88 (arginine) and 11.68 (phenylalanine) kcal/mol.
The DBOC contribution ranges from 0.28 (glycine) up to as much as 0.72 (arginine) kcal/mol.

Comparison of the W1-F12 and W2-F12 results for alanine, cysteine, glycine, methionine, and serine reveals the following:

The HF/V{D,T}Z-F12 component systematically underestimates the HF/VQZ-F12 basis set limit, namely by 0.03 (glycine), 0.04 (alanine and cysteine), 0.05 (serine), and 0.08 (methionine) kcal/mol.
Our best CCSD-F12/V{D,T}Z-F12 component overestimates the CCSD-F12/V{T,Q}Z-F12 component by 0.05 (glycine), 0.06 (methionine), 0.10 (serine), 0.20 (alanine) kcal/mol, and underestimates it by 0.11 kcal/mol for cysteine.
The valence (T) contribution from W1-F12 theory systematically overestimates the W2-F12 results, specifically by 0.06 (cysteine), 0.13 (methionine), 0.17 (glycine), 0.20 (alanine), and 0.25 (serine) kcal/mol.
The core–valence contribution from W1-F12 systematically underestimates the W2-F12 result, namely by 0.09 (glycine), 0.12 (alanine), 0.14 (serine), and 0.16 (cysteine) kcal/mol (we were not able to obtain the core–valence contribution for methionine from W2-F12 theory).
Overall, the $\hbox {TAE}_e$ from W1-F12 theory overestimates the $\hbox {TAE}_e$ from W2-F12 theory by 0.11 (glycine and methionine), 0.16 (serine), and 0.23 (alanine) kcal/mol, and underestimates it by 0.30 kcal/mol for cysteine.
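The CCSD-F12 deviations listed above can be condensed into the same error statistics used elsewhere in this section. The following sketch uses the five W1-F12 minus W2-F12 differences for the CCSD-F12 component (positive for an overestimate); the helper names are illustrative.

```python
from math import sqrt

# W1-F12 minus W2-F12 differences in the CCSD-F12 component (kcal/mol),
# as listed in the comparison above; cysteine is the one underestimate.
CCSD_DEVIATIONS = {
    "glycine": 0.05,
    "methionine": 0.06,
    "serine": 0.10,
    "alanine": 0.20,
    "cysteine": -0.11,
}


def msd(values):
    """Mean signed deviation."""
    values = list(values)
    return sum(values) / len(values)


def rmsd(values):
    """Root mean square deviation."""
    values = list(values)
    return sqrt(sum(v * v for v in values) / len(values))


print(f"MSD  = {msd(CCSD_DEVIATIONS.values()):+.2f} kcal/mol")
print(f"RMSD = {rmsd(CCSD_DEVIATIONS.values()):.2f} kcal/mol")
```

These reproduce the mean signed deviation of +0.06 kcal/mol and the RMSD of 0.12 kcal/mol stated for the averaged CCSD-F12/V{D,T}Z-F12 component.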

As noted in Sect. 2, we were able to “cross-check” the result for glycine at the W4 level: the lower-cost W2.2 level is obtained as a by-product. As seen in Table 3, the SCF, CCSD, (T), core-valence, and relativistic components of the W2-F12 calculation are all in excellent agreement with the W4 calculation, the cumulative difference being just 0.04 kcal/mol. The higher-order correlation steps, CCSDT(Q)/cc-pVTZ and CCSDTQ/cc-pVDZ, are more problematic from a computational point of view, but their importance is typically quite small for molecules dominated by a single reference configuration (due to error compensation between “antibonding” higher-order $T_3$ and “bonding” $T_4$ contributions [68–73]). Absent a direct calculation, their importance can be estimated by assuming that their contribution to the following isodesmic reaction energy will be approximately zero:

$$\begin{aligned} \hbox {CH}_{3}\hbox {COOH} + \hbox {CH}_{3}\hbox {NH}_{2} \rightarrow \hbox {glycine} + \hbox {CH}_{4} \end{aligned} \tag{1}$$

From Table SI-II of Ref. [32], we find the post-CCSD(T) contributions to the TAEs to be $-$0.05 kcal/mol for acetic acid, $-$0.09 kcal/mol for methyl amine, and +0.01 kcal/mol for methane, leading to an estimated post-CCSD(T) correction of $-$0.15 kcal/mol for glycine.
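The estimate follows from requiring that the post-CCSD(T) contributions cancel across the isodesmic reaction, so that the glycine value equals the sum over the reactants minus that of the co-product. A minimal sketch of this bookkeeping (the function name is an illustrative assumption) is:

```python
def isodesmic_estimate(reactants, other_products):
    """Contribution for the target product, assuming a zero net reaction term.

    reactants:      contributions of the species on the left-hand side
    other_products: contributions of the products other than the target
    """
    return sum(reactants) - sum(other_products)


# Post-CCSD(T) contributions to the TAEs (kcal/mol), as quoted in the text.
POST_CCSDT = {"acetic acid": -0.05, "methylamine": -0.09, "methane": 0.01}

glycine = isodesmic_estimate(
    reactants=[POST_CCSDT["acetic acid"], POST_CCSDT["methylamine"]],
    other_products=[POST_CCSDT["methane"]],
)
print(f"estimated post-CCSD(T) correction for glycine: {glycine:+.2f} kcal/mol")
```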

3.3 A note on zero-point vibrational energies (ZPVEs)

In view of the magnitude of the zero-point vibrational energies (50–140 kcal/mol, see Table 4), some remarks are due concerning their calculation. Ideally, one should obtain them from accurate anharmonic force fields, and for small molecules, this is indeed a practical option [68, 85, 91]. In the present case, however, the computational cost would be prohibitive with the computational resources at hand, and multiplication of calculated harmonic frequencies with a scaling factor $\lambda (\mathrm{ZPVE})$ appropriate for zero-point vibrational energies [50, 83, 84, 86, 90] is the only practical option. As shown in Ref. [84], ZPVEs are typically almost exact averages of one-half the sum of the harmonics and one-half the sum of the fundamentals, the difference being just $\mathrm{ZPVE}-(1/4)\sum _i(\omega _i+\nu _i)=G_0-\sum _i X_{ii}/4$, where the $X_{ii}$ are the diagonal anharmonicity constants and $G_0$ is the polyatomic counterpart of the small $Y_{00}$ Dunham constant [82] in diatomics. Consequently [50, 84, 90], the optimal scaling factor for ZPVEs is almost exactly midway between a $\lambda ({\omega })$ suitable for harmonic frequencies (as an approximate correction for systematic bias in the calculated frequencies) and a $\lambda ({\nu })$ suitable for fundamental frequencies (which additionally seeks to approximately correct for anharmonicity). In fact, Alecu et al. [86] found for a large variety of basis sets and ab initio and DFT methods that $\lambda ({\omega })/\lambda (\mathrm{ZPVE})=1.014\pm 0.002$, which is almost exactly the ratio of 1.0143 found by Perdew and coworkers [87] between harmonic frequencies and ZPVEs derived from experimental anharmonic force fields. Note that the “small” uncertainty of 0.002 on a ZPVE of 140 kcal/mol still would translate to about 0.3 kcal/mol, and even that is probably optimistic for the uncertainty in an individual molecule [88]. It has been argued earlier [91] (see also Ref. [92]) that for organic and bio-organic molecules that are “well-behaved” from an electronic structure point of view, the main factor limiting accuracy in computational thermochemistry may well be the treatment of the nuclear motion, rather than the electronic problem as such.

Computed zero-point vibrational energies for the amino acids at various levels of theory (including those used in the composite thermochemistry schemes compared in this work) are listed in Table 4. In search of an alternative that was more accurate than B3LYP yet still comparatively affordable, we considered the B2PLYP double hybrid functional [93] in conjunction with the def2-TZVPP basis set [94] and optimized a $\lambda (\omega )$ scaling factor by minimizing the RMSD for the HFREQ27 dataset [95] of accurately known harmonic frequencies. As can be seen in Table 4, the RMSD over the HFREQ27 set is only half that of B3LYP and drops to 13.2 cm$^{-1}$ if the anomalous F$_2$ molecule is eliminated. (For comparison, the HFREQ27 RMSD for CCSD(T)/cc-pV(Q+d)Z is still 8.4 cm$^{-1}$.) The optimum scale factor $\lambda (\omega )=0.9971$ is very close to unity and, in conjunction with the “universal” ratio of 1.014, translates into $\lambda (\mathrm{ZPVE})=0.9833$. As a sanity check on our procedure, we re-evaluated the $\lambda (\mathrm{ZPVE})$ for B3LYP/6-31G(2df,p) and B3LYP/6-311G(2d,d,p) and obtained 0.9858 and 0.9896, respectively, which agree to better than 3 decimal places with the “official” values used in G4 theory and CBS-QB3, respectively [63, 67].
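The step from a harmonic-frequency scaling factor to a ZPVE scaling factor, and the size of the resulting uncertainty, can be illustrated with a few lines. This is a sketch that assumes the “universal” ratio $1.014\pm 0.002$ is applied as a simple divisor and that its uncertainty propagates linearly; the function names are hypothetical.

```python
UNIVERSAL_RATIO = 1.014     # lambda(omega) / lambda(ZPVE), as quoted in the text
RATIO_UNCERTAINTY = 0.002   # quoted uncertainty on that ratio


def zpve_scale(lambda_omega, ratio=UNIVERSAL_RATIO):
    """ZPVE scaling factor implied by a harmonic-frequency scaling factor."""
    return lambda_omega / ratio


def zpve_uncertainty(zpve, ratio=UNIVERSAL_RATIO, d_ratio=RATIO_UNCERTAINTY):
    """First-order uncertainty in a scaled ZPVE due to the ratio uncertainty."""
    return zpve * d_ratio / ratio


# B2PLYP/def2-TZVPP: lambda(omega) = 0.9971 gives lambda(ZPVE) of about 0.9833.
print(round(zpve_scale(0.9971), 4))
# A ZPVE of 140 kcal/mol then carries an uncertainty of roughly 0.3 kcal/mol.
print(round(zpve_uncertainty(140.0), 2))
```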

It can be seen in Table 4 that the lower levels of theory used for ZPVEs in G3(MP2) [61] and G3(MP2)B3 [62] can yield values several kcal/mol lower than the highest-level method: the RMSDs from B2PLYP/def2-TZVPP are 2.12 and 2.29 kcal/mol, respectively, compared to 0.33 and 0.14 kcal/mol, respectively, for B3LYP/6-31G(2df,p) (scaled by 0.9854) as used by the G4 variants, and B3LYP/6-311G(2d,d,p) as used by CBS-QB3. (The “2d” refers to the use of an extra d function on second-row elements.) However, B3LYP/cc-pV(T+d)Z scaled by 0.985, as used in W1 and W1-F12 theory, also appears to yield values that are too low, and indeed $\lambda (\mathrm{ZPVE})$ as obtained from the HFREQ27 set is 0.9892. For B3LYP with a basis set that is effectively at the Kohn-Sham limit, $\lambda (\omega )$ = 1.004 was found, which corresponds to $\lambda (\mathrm{ZPVE})$ = 0.99, and the database of Radom and coworkers [90] likewise lists scaling factors near 0.99 for B3LYP with large basis sets. While a scaling factor of 0.985 vs. 0.990 may rightly be considered a distinction without a difference for small molecules (where anybody concerned about 0.1 kcal/mol in a ZPVE should seriously consider an accurate anharmonic ZPVE), the problem is much more obvious in larger systems such as those presently considered.

For one system, glycine, an anharmonic value of 49.438 kcal/mol is available due to Puzzarini and coworkers [81], who combined CCSD(T)/CBS harmonic frequencies with a DFT anharmonic force field. Fortuitously, our scaled B2PLYP/def2-TZVPP value agrees to two decimal places. As an additional observation, for ethane, the accurate anharmonic ZPVE is 46.29 kcal/mol, [91] compared to 45.97 kcal/mol B3LYP/cc-pVTZ scaled by 0.985, 46.20 with a revised scaling factor of 0.99, and 46.33 kcal/mol at the B2PLYP/def2-TZVPP level scaled by 0.9833.
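The ethane comparison can be retraced by undoing one scaling factor and applying another. The sketch below uses only the values quoted above; the function name is an illustrative assumption.

```python
def rescale_zpve(zpve_scaled, old_factor, new_factor):
    """Replace the scaling factor applied to a harmonic ZPVE."""
    return zpve_scaled / old_factor * new_factor


ANHARMONIC_ETHANE = 46.29  # accurate anharmonic ZPVE quoted in the text, kcal/mol

candidates = {
    "B3LYP/cc-pVTZ x 0.985": 45.97,
    "B3LYP/cc-pVTZ x 0.99": rescale_zpve(45.97, 0.985, 0.99),
    "B2PLYP/def2-TZVPP x 0.9833": 46.33,
}
for label, zpve in candidates.items():
    # Signed error relative to the anharmonic reference value.
    print(f"{label}: {zpve:.2f} ({zpve - ANHARMONIC_ETHANE:+.2f}) kcal/mol")
```

The revised factor moves the B3LYP value from about 0.3 kcal/mol below the anharmonic reference to within about 0.1 kcal/mol of it.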

3.4 Performance of G$n$ methods for the heats of formation of the amino acids

In this section, we use our best heats of formation from W1-F12 and W2-F12 theories (given in Table 3) to evaluate the performance of a variety of composite thermochemical Gaussian-$n$ (G$n$) procedures including G3(MP2), [61] G3(MP2)B3, [62] G4, [63] G4(MP2), [64] and G4(MP2)-6X [65]. Table 5 presents the deviations (G$n$–W$n$-F12) from our benchmark W$n$-F12 results, as well as the RMSD, mean absolute deviations (MAD), and mean signed deviations (MSD) for the G$n$ methods. Stover et al. [17] obtained G3(MP2) heats of formation for the amino acids: except for phenylalanine, cysteine, and methionine, the deviations between their heats of formation and our reference values exceed 1 kcal/mol. The mean signed deviation (MSD) of 1.90 kcal/mol being nearly equal to the RMSD of 2.25 kcal/mol indicates a very systematic error. Simply switching to G3(MP2)B3 cuts the MSD to 0.78 kcal/mol and the RMSD to 1.13 kcal/mol, while “upgrading” to G3B3 lowers these numbers even further to 0.45 and 0.60 kcal/mol, respectively. While both methods use B3LYP rather than MP2 reference geometries, the entire G3 family suffers from underestimated ZPVEs for the amino acids (Table 4), so apparently some of that issue is absorbed by the empirical correction. Stover et al. [17] also obtained G3(MP2) heats of formation via isodesmic bond separation reactions. As expected, this improves the performance, with RMSD = 1.48 kcal/mol and a maximum deviation of 2.40 kcal/mol for phenylalanine. We note, however, that their CCSD(T)/CBS anchor value for the heat of formation at room temperature of glycine, $-$92.6 kcal/mol, is 1.5 kcal/mol higher than our W2-F12 value. If we substitute the latter in their isodesmic reactions, their RMSD plunges to just 0.47 kcal/mol.
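The statement that an MSD close to the RMSD signals a systematic error rests on the identity $\mathrm{RMSD}^2 = \mathrm{MSD}^2 + \sigma ^2$, where $\sigma$ is the (population) standard deviation of the errors about their mean. A brief sketch, using the rounded statistics quoted above (so the results are approximate) and a hypothetical function name:

```python
from math import sqrt


def spread_about_mean(rmsd, msd):
    """Population standard deviation implied by RMSD**2 = MSD**2 + sigma**2."""
    # Guard against tiny negative values caused by rounded inputs.
    return sqrt(max(rmsd ** 2 - msd ** 2, 0.0))


# (RMSD, MSD) pairs in kcal/mol as quoted in the text.
QUOTED = {"G3(MP2)": (2.25, 1.90), "G3(MP2)B3": (1.13, 0.78), "G3B3": (0.60, 0.45)}

for method, (rmsd, msd) in QUOTED.items():
    sigma = spread_about_mean(rmsd, msd)
    print(f"{method}: sigma = {sigma:.2f} kcal/mol")
```

For G3(MP2), the scatter about the mean is only about 1.2 kcal/mol, so most of the 2.25 kcal/mol RMSD reflects a uniform offset rather than random error.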

Table 5 Performance of a selection of composite procedures of the G$n$ family for the calculation of heats of formation ($\Delta _f H^\circ _{298 K}$, exclusive of conformer correction) of the 18 amino acids in Table 3

We now turn our attention to the performance of the Gaussian-4 family: G4, [63] G4(MP2), [64] and G4(MP2)-6X [65]. The G4(MP2) procedure exhibits somewhat disappointing performance: its RMSD of 1.80 kcal/mol falls between those of G3(MP2) and G3(MP2)B3. The largest deviations are obtained for asparagine (2.48), lysine (2.32), glutamine (3.15), and arginine (3.34 kcal/mol), and all other deviations except those for phenylalanine, cysteine, and methionine also exceed 1 kcal/mol. The computationally more expensive “full” G4 procedure yields much better performance with an RMSD of 0.72 kcal/mol, and just three cases exceeding 1 kcal/mol (glutamine 1.39, arginine 1.21, and lysine 1.37 kcal/mol). However, an essentially identical RMSD = 0.71 kcal/mol is afforded by the G4(MP2)-6X procedure, which involves the same computational steps and cost as G4(MP2) but entails six additional empirical scaling factors. Deviations larger than 1 kcal/mol are obtained for just four systems, namely arginine (1.63), glutamine (1.63), asparagine (1.10), and methionine ($-$1.02 kcal/mol). Finally, we note that the CBS-QB3 method clocks in at RMSD = 1.01 kcal/mol.

Very recently, Ramabhadran et al. [21] determined the enthalpies of formation of cysteine and methionine using their connectivity-based hierarchy (CBH-$n$) approach [77, 78]. From their Table 3, the best enthalpies of formation obtained for the lowest-energy conformer at the CBH-2 (isoatomic) rung using experimental heats of formation for the reference species and CCSD(T)/6-311++G(3df,2p) reaction energies are $-$96.1 (cysteine) and $-$104.3 (methionine) kcal/mol. From their Table 7, we calculate conformer corrections of +0.77 kcal/mol for cysteine and +0.37 kcal/mol for methionine; the latter we actually use in the present work, while the former is slightly less than our own calculation of 0.81 kcal/mol. According to their Table 9, the heats of formation after conformer correction are $-$95.3 and $-$104.0 kcal/mol (the latter value presumably after roundoff), both more exothermic than our W2-F12 values (Table 3) of $-$94.5 and $-$102.4 kcal/mol. We do note that some of the experimental data for reference species used in Ref. [21] carry non-trivial uncertainties, which could account for at least some of the discrepancy.

3.5 Comparison with experiment

Comparison with experiment obviously entails thermal corrections. The RRHO approximation will cause some errors, the largest of which will be neglect of the population of the various low-energy conformers. If we neglect the difference between the rovibrational partition functions of the different conformers, then the conformer contribution to the enthalpy function $\mathrm{hcf}_{298}\equiv H_{T=298}-E_0$ is easily found as [96]

$$\begin{aligned} \mathrm{hcf}_{298}^\mathrm{conf}=RT\,\frac{\sum _i x_i \exp ( -x_i)}{\sum _i \exp ( -x_i)} \quad \hbox {where}\quad x_i \equiv \frac{E_i - E_0}{RT} \end{aligned}$$

(2)

where the index $i$ runs over the conformers. The effect of accounting for different rovibrational partition functions in the different conformers was considered in Ref. [96] for the alkane conformers and is negligible compared to other potential error sources in the present calculation, such as the neglect of anharmonicity and the uncertainty in the basis set extrapolation. Conformer energies were gathered from published calculations in the literature [21–24, 81, 100–112]: these range from complete basis set CCSD(T) studies for glycine [81] and alanine [24] to relatively low-level MP2 or DFT calculations for some other species. Details are given in the footnotes to Table 3.
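Expressed in kcal/mol, the conformer term of Eq. (2) is the Boltzmann-weighted mean of the relative conformer energies, i.e. $RT$ times the dimensionless ratio of sums. The sketch below is a minimal illustration under the same assumption of identical rovibrational partition functions; the function name and the relative energies are hypothetical and are not taken from Table 3.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def conformer_enthalpy_correction(rel_energies, temperature=298.15):
    """Boltzmann-averaged conformer energy above the global minimum.

    rel_energies: E_i - E_0 in kcal/mol, including 0.0 for the lowest conformer.
    Returns RT * sum(x_i exp(-x_i)) / sum(exp(-x_i)) in kcal/mol.
    """
    rt = R_KCAL * temperature
    x = [e / rt for e in rel_energies]
    weights = [math.exp(-xi) for xi in x]
    return rt * sum(xi * w for xi, w in zip(x, weights)) / sum(weights)

# Hypothetical relative conformer energies (kcal/mol), for illustration only
example = [0.0, 0.5, 1.0, 1.5, 2.5]
print(f"{conformer_enthalpy_correction(example):.2f} kcal/mol")
```

A single conformer gives a correction of exactly zero, and the correction is always positive once higher-lying conformers are populated, which is why it makes the computed heat of formation less exothermic.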

Table 6 lists the available experimental gas-phase heats of formation at 298 K ($\Delta H_{f,298}^{\circ }$). Our W2-F12 value for alanine ($-$101.5 kcal/mol) is spot on the experimental value of Dorofeeva and Ryzhova [97] ($-$101.5$\,\pm\,$0.5 kcal/mol) and still agrees to within mutual uncertainties with that of da Silva et al. [15] ($-$101.9$\,\pm\,$0.7). However, the NIST chemistry WebBook [79] value ($-$99.1$\,\pm\,$1.0 kcal/mol) is clearly incompatible with our calculations.

Table 6 Experimental gas-phase heats of formation at 298 K for the amino acids (kcal/mol)

Our W2-F12 heat of formation for cysteine ($-$94.2 kcal/mol) suggests that the experimental value of Roux et al. [19] should be revised downward by about 2.8 kcal/mol; the recent study of Ramabhadran et al. [21] suggests even further downward revision (vide supra). As for glycine, the calculated heats of formation ($-$94.1 kcal/mol from W2-F12, $-$94.0 from quasi-W4) and the available experimental values agree to within overlapping uncertainties. Specifically, our calculations are spot on the experimental value of Dorofeeva and Ryzhova [97] ($-$94.1$\,\pm\,$0.4 kcal/mol), just slightly below the experimental value from the CRC Handbook ($-$93.7 kcal/mol), and in the lower end of the uncertainty band of the NIST WebBook value ($-$93.3$\,\pm\,$1.1 kcal/mol). Our W2-F12 value for methionine ($-$102.4 kcal/mol) agrees well with the new measurement of Roux et al. [18] ($-$102.8$\,\pm\,$2.4 kcal/mol), and both imply a downward revision of the NIST Chemistry WebBook value ($-$98.8$\,\pm\,$1.0 kcal/mol) by about 3–4 kcal/mol. As for phenylalanine, our W1-F12 value ($-$76.9 kcal/mol) suggests that the experimental value from the CRC Handbook ($-$74.8 kcal/mol) should be revised downward by about 2 kcal/mol. The W1-F12 values for proline ($-$92.8 kcal/mol) and valine ($-$113.6 kcal/mol) suggest that the experimental values should be revised downward by about 5 kcal/mol (Table 6).
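The comparisons in this section reduce to simple differences between calculated and measured values. The sketch below tabulates a few of them using only numbers quoted in the text. As a simplifying assumption it tests agreement against the experimental error bar alone and ignores the uncertainty of the calculations; the helper name `compare` is ours.

```python
# (theory, experiment, experimental uncertainty or None), kcal/mol, from the text
cases = {
    "alanine vs. Dorofeeva-Ryzhova": (-101.5, -101.5, 0.5),
    "alanine vs. NIST WebBook": (-101.5, -99.1, 1.0),
    "glycine vs. NIST WebBook": (-94.1, -93.3, 1.1),
    "methionine vs. Roux et al.": (-102.4, -102.8, 2.4),
    "methionine vs. NIST WebBook": (-102.4, -98.8, 1.0),
    "phenylalanine vs. CRC Handbook": (-76.9, -74.8, None),
}

def compare(theory, experiment, uncertainty):
    """Return (theory - experiment, within-error flag or None)."""
    diff = round(theory - experiment, 1)
    within = None if uncertainty is None else abs(diff) <= uncertainty
    return diff, within

for label, (theory, experiment, unc) in cases.items():
    diff, within = compare(theory, experiment, unc)
    print(f"{label}: {diff:+.1f} kcal/mol, within stated error: {within}")
```

A negative difference means that the calculated value is more exothermic than the measurement, i.e. that the experimental value would have to be revised downward.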

For the two largest amino acids, tryptophan and tyrosine, we were unable to calculate W1-F12 atomization energies. At the G4, CBS-QB3, and G4(MP2)-6X levels, respectively, we obtain heats of formation at 0 K for tryptophan of $-$49.60, $-$47.87, and $-$48.77 kcal/mol, and for tyrosine of $-$109.12, $-$108.58, and $-$108.49 kcal/mol. At room temperature, the corresponding values are $-$59.98, $-$58.27, and $-$58.98 kcal/mol for tryptophan and $-$118.56, $-$118.03, and $-$117.78 kcal/mol for tyrosine. Averaging over all three levels of theory, and adding in conformer corrections for tryptophan of 0.71 kcal/mol [111] and for tyrosine of 0.65 kcal/mol, we finally obtain estimated heats of formation at 298 K of $-$58.37 kcal/mol for tryptophan, and of $-$117.47 kcal/mol for tyrosine.
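The averaging step for tryptophan and tyrosine can be reproduced directly from the numbers quoted above. The following is a minimal sketch; the variable names are ours.

```python
from statistics import mean

# 298 K heats of formation (kcal/mol) at G4, CBS-QB3, G4(MP2)-6X, from the text
values_298 = {
    "tryptophan": [-59.98, -58.27, -58.98],
    "tyrosine": [-118.56, -118.03, -117.78],
}
# Conformer corrections (kcal/mol), from the text
conformer_corr = {"tryptophan": 0.71, "tyrosine": 0.65}

best = {name: mean(vals) + conformer_corr[name] for name, vals in values_298.items()}
for name, value in best.items():
    print(f"{name}: {value:.2f} kcal/mol")
# tryptophan: -58.37 kcal/mol
# tyrosine: -117.47 kcal/mol
```

The spread among the three methods (about 1.7 kcal/mol for tryptophan and 0.8 kcal/mol for tyrosine) gives a rough feel for how much these averaged estimates depend on the choice of composite procedure.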

4 Conclusions

We have obtained benchmark heats of formation at the CCSD(T)/CBS limit for the 20 natural amino acids. Our best heats of formation at 298 K ($\Delta H_{f,298}^{\circ }$) are $-$101.5 (alanine), $-$98.8 (arginine), $-$146.5 (asparagine), $-$189.6 (aspartic acid), $-$94.5 (cysteine), $-$151.0 (glutamine), $-$195.5 (glutamic acid), $-$94.0 (glycine, quasi-W4) or $-$94.1 (glycine, W2-F12), $-$69.8 (histidine), $-$118.3 (isoleucine), $-$118.8 (leucine), $-$110.0 (lysine), $-$102.4 (methionine), $-$76.9 (phenylalanine), $-$92.8 (proline), $-$139.2 (serine), $-$149.0 (threonine), and $-$113.6 (valine) kcal/mol. These heats of formation are obtained at the W2-F12 level for alanine, cysteine, glycine, methionine, and serine, and at the W1-F12 level for all of the rest. For the two largest amino acids, an average over G4, G4(MP2)-6X, and CBS-QB3 yields best estimates of $-$58.4 kcal/mol for tryptophan, and of $-$117.5 kcal/mol for tyrosine.

Uncertainties caused by issues with the zero-point vibrational energy and the conformer corrections rival, and probably exceed, those directly related to the electronic structure treatment. The overall uncertainty is somewhat difficult to quantify, but a semi-quantitative estimate would range from about $\pm$0.5 kcal/mol for the smaller, to about $\pm$1 kcal/mol for the larger, amino acids.

For glycine, by way of validation, we were able to obtain a “quasi-W4” result corresponding to $\hbox {TAE}_e=968.1$, $\hbox {TAE}_0=918.6$, $\Delta H_{f,0}^{\circ }=-90.0$, and $\Delta H_{f,298}^{\circ }=-94.0$ kcal/mol.

Our best theoretical values suggest that the experimental gas-phase heats of formation from the NIST WebBook should be revised downward by 2.4 (alanine), 0.7–0.8 (glycine), 3.6 (methionine), and 5.3 (proline) kcal/mol. Similarly, we suggest that the experimental values from the CRC Handbook should be revised downward by 0.4 (glycine), 2.0 (phenylalanine), and 4.8 (valine) kcal/mol. Our best theoretical values are in good agreement with the recently reported experimental values of Roux and coworkers for alanine [15] and methionine, [18] but suggest that their experimental value for cysteine should be revised downward by 2.8 kcal/mol. Finally, our best theoretical values for alanine and glycine are in excellent agreement with the recent values of Dorofeeva and Ryzhova [97].

Using our W1-F12 and W2-F12 heats of formation as reference values, we have assessed the performance of the empirical composite G$n$ procedures. We obtain the following RMSDs: 2.25 (G3(MP2)), 1.13 (G3(MP2)B3), 0.60 (G3B3), 1.80 (G4(MP2)), 0.71 (G4(MP2)-6X), and 0.72 (G4) kcal/mol. G4(MP2)-6X in particular appears to offer an excellent performance-to-computational cost ratio.

Finally, it appears that for W1 and W1-F12, the scaling factor for the B3LYP/cc-pV(T+d)Z or B3LYP/aug’-cc-pV(T+d)Z zero-point vibrational energy should be revised upward to 0.990.

5 Supporting information

B3LYP/A’VTZ optimized geometries for the species considered in the present work (Table S1). Full references for ref [40] (Gaussian 09) and ref [41] (Molpro 2010) (Table S2). B2PLYP/def2-TZVPP harmonic frequencies for all amino acids except tryptophan and tyrosine, and B3LYP/aug’-cc-pV(T+d)Z frequencies for all amino acids.

References

1. Hättig C, Klopper W, Köhn A, Tew DP (2012) Chem Rev 112:4
2. Kong L, Bischoff FA, Valeev E (2012) Chem Rev 112:75
3. Ten-no S, Noga J (2012) WIREs Comput Mol Sci 2:114
4. Peterson KA, Feller D, Dixon DA (2012) Theor Chem Acc 131:1079
5. Quinn JR, Zimmerman SC, Del Bene JE, Shavitt I (2007) J Am Chem Soc 129:934
6. Distasio RA Jr, Steele RP, Rhee YM, Shao Y, Head-Gordon M (2007) J Comput Chem 28:839
7. Šponer J, Riley KE, Hobza P (2008) Phys Chem Chem Phys 10:2595
8. Valdes H, Pluháčková K, Pitonák M, Řezáč J, Hobza P (2008) Phys Chem Chem Phys 10:2747
9. Valdes H, Pluháčková K, Hobza P (2009) J Chem Theory Comput 5:2248
10. Jiang J, Wu Y, Wang Z-X, Wu C (2010) J Chem Theory Comput 6:1199
11. Tkatchenko A, Rossi M, Blum V, Ireta J, Scheffler M (2011) Phys Rev Lett 106:118102
12. Bokatzian-Johnson SS, Stover ML, Dixon DA (2012) J Phys Chem B 116:14844
13. Goerigk L, Karton A, Martin JML, Radom L (2013) Phys Chem Chem Phys 15:7028
14. Nelson DL, Cox MM (2008) Lehninger principles of biochemistry. Palgrave-Macmillan, New York
15. Ribeiro da Silva MAV, Ribeiro da Silva MDMC, Santos AFLOM, Roux MV, Foces-Foces C, Notario R, Guzmán-Mejía R, Juaristi E (2010) J Phys Chem B 114:16471
16. Notario R, Roux MV, Foces-Foces C, da Silva MAVR, da Silva MDMCR, Santos AFLOM, Guzmán-Mejía R, Juaristi E (2011) J Phys Chem B 115:9401
17. Stover ML, Jackson VE, Matus MH, Adams MA, Cassady CJ, Dixon DA (2012) J Phys Chem B 116:2905
18. Roux MV, Notario R, Segura M, Chickos JS, Liebman JF (2012) J Phys Org Chem 25:916
19. Roux MV, Foces-Foces C, Notario R, da Silva MAVR, da Silva MDMC, Santos AFLOM, Juaristi E (2010) J Phys Chem B 114:10530
20. Brás NF, Perez MAS, Fernandes PA, Silva PJ, Ramos MJ (2011) J Chem Theory Comput 7:3898
21. Ramabhadran RO, Sengupta A, Raghavachari K (2013) J Phys Chem A 117:4973
22. Jaeger HM, Schaefer HF III, Demaison J, Császár AG, Allen WD (2010) J Chem Theory Comput 6:3066
23. Wilke JJ, Lind MC, Schaefer HF III, Császár AG, Allen WD (2009) J Chem Theory Comput 5:1511
24. Balabin RM (2011) Comp Theor Chem 965:15
25. Balabin RM (2009) Chem Phys Lett 479:195
26. Shavitt I, Bartlett RJ (2009) Many-body methods in chemistry and physics: MBPT and coupled-cluster theory. Cambridge Molecular Science, Cambridge
27. Shavitt I (1993) The history and evolution of Gaussian basis sets. Isr J Chem 33:357
28. Bartlett RJ, Cole SJ, Purvis GD, Ermler WC, Hsieh HC, Shavitt I (1987) J Chem Phys 87:6579
29. Shavitt I (1985) Tetrahedron 41:1531
30. Comeau DC, Shavitt I, Jensen P, Bunker PR (1989) J Chem Phys 90:6491
31. Karton A, Martin JML (2012) J Chem Phys 136:124114
32. Karton A, Daon S, Martin JML (2011) Chem Phys Lett 510:165
33. Curtiss LA, Redfern PC, Raghavachari K (2011) WIREs Comput Mol Sci 1:810
34. Lee C, Yang W, Parr RG (1988) Phys Rev B 37:785
35. Becke AD (1993) J Chem Phys 98:5648
36. Stephens PJ, Devlin FJ, Chabalowski CF, Frisch MJ (1994) J Phys Chem 98:11623
37. Dunning TH (1989) J Chem Phys 90:1007
38. Kendall RA, Dunning TH, Harrison RJ (1992) J Chem Phys 96:6796
39. Dunning TH Jr, Peterson KA, Wilson AK (2001) J Chem Phys 114:9244
40. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Scalmani G, Barone V, Mennucci B, Petersson GA, et al (2009) Gaussian 09, Revision D01, Gaussian Inc, Wallingford CT. See also: http://www.gaussian.com
41. Werner H-J, Knowles PJ, Manby FR, Schütz M, Celani P, Knizia G, Korona T, Lindh R, Mitrushenkov A, Rauhut G et al (2012) Molpro 2012.1, University College Cardiff Consultants Limited: Cardiff U.K. See also: http://www.molpro.net
42. Peterson KA, Adler TB, Werner H-J (2008) J Chem Phys 128:084102
43. Hill JG, Peterson KA, Knizia G, Werner H-J (2009) J Chem Phys 131:194105
44. Feller D, Peterson KA (2013) J Chem Phys 139:084110
45. Ten-no S (2004) Chem Phys Lett 398:56
46. Werner H-J, Adler TB, Manby FR (2007) J Chem Phys 126:164102
47. Knizia G, Werner H-J (2008) J Chem Phys 128:154103
48. Adler TB, Knizia G, Werner H-J (2007) J Chem Phys 127:221106
49. Knizia G, Adler TB, Werner H-J (2009) J Chem Phys 130:054104
50. Martin JML, de Oliveira G (1999) J Chem Phys 111:1843
51. Peterson KA, Dunning TH (2002) J Chem Phys 117:10548
52. Douglas M, Kroll NM (1974) Ann Phys 82:89
53. Heß BA (1986) Phys Rev A 33:3742
54. de Jong WA, Harrison RJ, Dixon DA (2001) J Chem Phys 114:48
55. Ruscic B, Pinzon RE, Morton ML, von Laszewski G, Bittner S, Nijsure SG, Amin KA, Minkoff M, Wagner AF (2004) J Phys Chem A 108:9979
56. Ruscic B (2004) Encyclopedia of Science and Technology (2005 Yearbook of Science and Technology). McGraw-Hill, New York, p 3
57. Ruscic B, Pinzon RE, Morton ML, Srinivasan NK, Su M-C, Sutherland JW, Michael JV (2006) J Phys Chem A 110:6592
58. Stevens WR, Ruscic B, Baer T (2010) J Phys Chem A 114:13134
59. Ruscic B, Feller D, Peterson KA (2014) Theor Chem Acc 133:1415. doi:10.1007/s00214-013-1415-z
60. Cox JD, Wagman DD, Medvedev VA (1989) CODATA key values for thermodynamics, Hemisphere Publishing Corp.: New York. http://www.codata.org/resources/databases/key1.html
61. Curtiss LA, Redfern PC, Raghavachari K, Rassolov V, Pople JA (1999) J Chem Phys 110:4703
62. Baboul AG, Curtiss LA, Redfern PC, Raghavachari K (1999) J Chem Phys 110:7650
63. Curtiss LA, Redfern PC, Raghavachari K (2007) J Chem Phys 126:084108
64. Curtiss LA, Redfern PC, Raghavachari K (2007) J Chem Phys 127:124105
65. Chan B, Deng J, Radom L (2011) J Chem Theory Comput 7:112
66. Curtiss LA, Redfern PC, Raghavachari K (2005) J Chem Phys 123:124107
67. Montgomery JA, Frisch MJ, Ochterski JW, Petersson GA (1999) J Chem Phys 110:2822
68. Karton A, Rabinovich E, Martin JML, Ruscic B (2006) J Chem Phys 125:144108
69. Bak KL, Jørgensen P, Olsen J, Helgaker T, Gauss J (2000) Chem Phys Lett 317:116
70. Boese AD, Oren M, Atasoylu O, Martin JML, Kallay M, Gauss J (2004) J Chem Phys 120:4129
71. Stanton JF (1997) Chem Phys Lett 281:130
72. Karton A, Taylor PR, Martin JML (2007) J Chem Phys 127:064104
73. Harding ME, Vázquez J, Ruscic B, Wilson AK, Gauss J, Stanton JF (2008) J Chem Phys 128:114111 and references therein
74. Fogueri UR, Kozuch S, Karton A, Martin JML (2013) Theor Chem Acc 132:1291
75. Karton A, Kaminker I, Martin JML (2009) J Phys Chem A 113:7610
76. Harding ME, Vázquez J, Gauss J, Stanton JF, Kállay M (2011) J Chem Phys 135:044513
77. Ramabhadran RO, Raghavachari K (2011) J Chem Theory Comput 7:2094
78. Ramabhadran RO, Raghavachari K (2012) J Phys Chem A 116:7531
79. Afeefy HY, Liebman JF, Stein SE “Neutral Thermochemical Data” in NIST Chemistry WebBook, NIST Standard Reference Database Number 69, Eds Linstrom PJ, Mallard WG, National Institute of Standards and Technology, Gaithersburg MD, 20899. http://webbook.nist.gov. Retrieved October 22, 2013
80. (2012) CRC handbook of chemistry and physics, 93rd edn. CRC Press, Boca Raton 2013
81. Barone V, Biczysko M, Bloino J, Puzzarini C (2013) Phys Chem Chem Phys 15:10094–10111. The absolute ZPVE for the lowest-energy conformer is not given explicitly in the paper, but is 49.438 kcal/mol: Puzzarini C, personal communication to authors (January, 2014).
82. Dunham JL (1932) Phys Rev 41:721
83. Del Bene JE, Aue DH, Shavitt I (1992) J Am Chem Soc 114:1631
84. Grev RS, Janssen CL, Schaefer HF (1991) J Chem Phys 95:5128
85. Schuurman MS, Allen WD, Schaefer HF (2005) J Comput Chem 26:1106
86. Alecu IM, Zheng J, Zhao Y, Truhlar DG (2010) J Chem Theory Comput 6:2872. http://comp.chem.umn.edu/freqscale/index.html
87. Csonka G, Ruzsinszky A, Perdew JP (2005) J Phys Chem A 109:6779
88. Irikura KK, Johnson RD, Kacker RN, Kessel R (2009) J Chem Phys 130:114102
89. Sinha P, Boesch SE, Gu C, Wheeler RA, Wilson AK (2004) J Phys Chem A 108:9213
90. Merrick JP, Moran D, Radom L (2007) J Phys Chem A 111:11683. http://groups.chem.usyd.edu.au/radom/More/ScaleFactor.html
91. Karton A, Ruscic B, Martin JML (2007) J Mol Struct Theochem 811:345
92. Pfeiffer F, Rauhut G, Feller D, Peterson KA (2013) J Chem Phys 138:044311
93. Grimme S (2006) J Chem Phys 124:034108
94. Weigend F, Ahlrichs R (2005) Phys Chem Chem Phys 7:3297
95. Kozuch S, Gruzman D, Martin JML (2011) J Phys Chem C 114:20801, Table S-16.
96. Gruzman D, Karton A, Martin JML (2009) J Phys Chem A 113:11974
97. Dorofeeva OV, Ryzhova ON (2009) J Chem Thermodyn 41:433
98. CFOUR, Coupled-Cluster techniques for Computational Chemistry, a quantum-chemical program package by Stanton JF, Gauss J, Harding ME, Szalay PG with contributions from Auer AA, Bartlett RJ, Benedikt U, Berger C, Bernholdt DE, Bomble YJ, Cheng L, Christiansen O, Heckert M, Heun O, Huber C, Jagau TC, Jonsson D, Jusélius J, Klein K, Lauderdale WJ, Matthews DA, Metzroth T, Mück LA, O’Neill DP, Price DR, Prochnow E, Puzzarini C, Ruud K, Schiffmann F, Schwalbach W, Simmons C, Stopkowicz S, Tajti A, Vázquez J, Wang F, Watts JD and the integral packages MOLECULE (Almlöf J, Taylor PR), PROPS (Taylor PR), ABACUS (Helgaker T, Jensen HJA, Jørgensen P, Olsen J), and ECP routines by Mitin AV, van Wüllen C. For the current version, see http://www.cfour.de
99. Gauss J, Tajti A, Kállay M, Stanton JF, Szalay PG (2006) J Chem Phys 125:144111
100. Ling S, Yu W, Huang Z, Lin Z, Haranczyk M, Gutowski M (2006) J Phys Chem A 110:12282–12291
101. Chen M, Huang Z, Lin Z (2005) J Mol Struct Theochem 719:153–158
102. Chen M, Lin Z (2007) J Chem Phys 127:154314
103. Meng L, Lin Z (2011) Comp Theor Chem 976:42–50
104. Pang R, Guo M, Ling S, Lin Z (2013) Comp Theor Chem 1020:14–21
105. Huang Z, Yu W, Lin Z (2006) J Mol Struct Theochem 801:7–20
106. Dokmaisrijan S, Lee VS, Nimmanpipug P (2010) J Mol Struct Theochem 953:28–38
107. Boeckx B, Maes G (2012) J Phys Chem B 116:12441–12449
108. Huang Z, Yu W, Lin Z (2006) J Mol Struct Theochem 758:195–202
109. Czinki E, Császár AG (2003) Chem Eur J 9:1008–1019
110. Szidarovszky T, Czakó G, Császár AG (2009) Mol Phys 107:761–775
111. Huang Z, Lin Z (2005) J Phys Chem A 109:2656. MP2/6-311++G(d,p) total energies for 45 lowest conformers, with B3LYP/6-311G* zero-point energies.
112. Zhang M, Huang Z, Lin Z (2005) J Chem Phys 122:134313. Lowest 36 conformers, MP2/6-311G(2df,p)//B3LYP/6-311++G(d,p) with zero-point energy from B3LYP/6-311++G(d,p).

Acknowledgments

JMLM is the Baroness Thatcher Professor of Chemistry at the Weizmann Institute of Science and acknowledges partial financial support from the Lise Meitner-Minerva Center for Computational Quantum Chemistry and the Helen and Martin Kimmel Center for Molecular Design. This research was supported in part by the Weizmann AERI (Alternative Energy Research Initiative) and by a startup grant from the University of North Texas from which the Martin group Linux cluster was purchased. The authors would like to thank Dr. David Hrovat for assistance with procurement and management of the latter. A.K. is the recipient of an Australian Research Council (ARC) Discovery Early Career Researcher Award (project number: DE140100311). We also acknowledge the generous allocation of computing time from the National Computational Infrastructure (NCI) National Facility and the support of iVEC through the use of advanced computing resources located at iVEC@UWA.

Author information

Authors and Affiliations

School of Chemistry and Biochemistry, The University of Western Australia, Perth, Australia
Amir Karton & Li-Juan Yu
Department of Organic Chemistry, Weizmann Institute of Science, 76100, Reḥovot, Israel
Manoj K. Kesharwani & Jan M. L. Martin
Department of Chemistry and Center for Advanced Scientific Computing and Modeling (CASCaM), University of North Texas, Denton, TX, 76201, USA
Jan M. L. Martin

Authors

Amir Karton
Li-Juan Yu
Manoj K. Kesharwani
Jan M. L. Martin

Corresponding authors

Correspondence to Amir Karton or Jan M. L. Martin.

Additional information

Dedicated to the memory of Professor Isaiah Shavitt and published as part of the special collection of articles celebrating his many contributions.

About this article

Cite this article

Karton, A., Yu, LJ., Kesharwani, M.K. et al. Heats of formation of the amino acids re-examined by means of W1-F12 and W2-F12 theories. Theor Chem Acc 133, 1483 (2014). https://doi.org/10.1007/s00214-014-1483-8

Received: 20 January 2014
Accepted: 15 March 2014
Published: 16 April 2014
DOI: https://doi.org/10.1007/s00214-014-1483-8
