Introduction

Composite quantum chemical methods such as the Gaussian-n (Gn) theories (n = 1, 2, 3, 4) [1–4] have been developed to predict accurate thermochemical properties. These methods combine a set of calculations at different levels of theory and with different basis sets, with the goal of approaching the exact energy without requiring extensive computer resources. The G4 method [4] was assessed on 270 accurate experimental enthalpies of formation, mostly of small and medium-sized molecules with 2–8 non-hydrogen atoms. Only four larger molecules, with 10 (naphthalene, azulene) and 12 (hexafluorobenzene, chloropentafluorobenzene) non-hydrogen atoms, were included in the test set. The mean absolute deviation of G4 theory from experiment is 3.3 kJ/mol.

In the G4 approach [4], a high-level correlation calculation, CCSD(T), with a moderate-sized basis set, 6-31G(d), is combined with energies from lower-level calculations (MP4, MP2, and HF) with larger basis sets to approximate the energies of more expensive calculations. In addition, several molecule-independent empirical parameters (higher level correction (HLC) terms) are included to estimate the remaining deficiencies, assuming that they are systematic. Since the G4 method was parameterized using a test set of relatively small molecules, one may expect systematic errors to accumulate when G4 theory is applied to larger molecules, as was indicated for G2 theory [5–7]. Moreover, further testing may reveal types of molecules for which G4 fails.

To assess how G3 theory performs on molecules larger than those in its original test set, the enthalpies of formation have been calculated for large alkanes with up to 16 carbon atoms [8]. The G3 enthalpies of formation of the alkanes deviate from experiment by less than 8 kJ/mol. This suggests a small accumulation of error (0.16 kJ/mol per bond) that increases the deviation with chain length. However, similar studies have not been undertaken for molecules of other types. With improvements in hardware performance, accurate predictions of enthalpies of formation have become feasible even for large molecules, and it is of interest to determine how G4 theory performs on such molecules for which accurate experimental data are available.
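As a rough illustration of how a per-bond error could add up with chain length, the sketch below multiplies the quoted 0.16 kJ/mol per bond by the number of bonds in an unbranched or branched acyclic alkane. It assumes that the per-bond error applies uniformly to C–C and C–H bonds alike; the function names are hypothetical and are not taken from Ref. [8].

```python
def alkane_bond_count(n_carbon):
    """Bonds in an acyclic alkane C_nH_(2n+2): (n - 1) C-C bonds plus (2n + 2) C-H bonds."""
    return (n_carbon - 1) + (2 * n_carbon + 2)


def accumulated_error(n_carbon, per_bond_kj=0.16):
    """Accumulated deviation in kJ/mol, assuming a uniform error per bond (an assumption)."""
    return per_bond_kj * alkane_bond_count(n_carbon)


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, alkane_bond_count(n), round(accumulated_error(n), 2))
```

Under this assumption a C16 alkane has 49 bonds, which gives an accumulated deviation of about 7.8 kJ/mol, in line with the statement that the G3 deviations stay below 8 kJ/mol.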

In this paper, we assess G4 theory on 122 molecules with up to 15 non-hydrogen atoms. Some of the molecule types considered were also examined by Curtiss et al. [4]; however, most of the additional molecules are larger. Particular attention has been given to nitro compounds and heterocycles containing nitrogen, oxygen, and sulfur. Although these compounds are represented to varying degrees in the test set [4], they pose difficult cases. Alkanes are known to be the simplest category of molecules for which to obtain reliable results; for other organic molecules, accurate enthalpies of formation are generally more difficult to achieve [9, 10]. Therefore, it is of interest to assess G4 theory on types of molecules that were not sufficiently represented in the original test set. The experimental enthalpies of formation used in this work for comparison with theory were taken mainly from papers published in recent years.

Computational details

All ab initio and density functional theory (DFT) calculations were performed using the Gaussian 03 package of programs [11]. The G4 energies were calculated for the most stable conformers. Geometry optimization and conformational analysis for flexible molecules were performed at the B3LYP/6-31G(d,p) level of theory. To characterize the optimized stationary points, harmonic vibrational frequencies were calculated at the same level. The resulting geometries were used as inputs for the G4 calculations.

The enthalpies of formation at 298 K \( \left( {\Updelta_{\text{f}} H_{298}^{^\circ } } \right) \) were calculated from atomization energies. The calculation through atomization reactions [7, 12] involves the use of experimental enthalpies of formation of gaseous atoms at T = 0 K and thermal corrections for elements in their standard states; the corresponding values were taken from the reference book [13]. For molecules with several stable conformers, a correction for the mixture of conformers was estimated from the conformational energy differences based on Boltzmann averaging [8]. The B3LYP/6-31G(d,p) energies were used in these estimations.
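The atomization scheme and the conformer correction described above can be sketched as follows. This is a minimal sketch, not the authors' code: it assumes the commonly used form of the atomization procedure (atomization energy at 0 K, then thermal corrections for the molecule and for the elements in their standard states) and a simple Boltzmann-weighted mean of relative conformer energies. All function and key names are hypothetical, and the atomic reference data must be supplied by the user, for example from a reference book such as [13].

```python
import math

HARTREE_TO_KJ_MOL = 2625.4996  # 1 hartree expressed in kJ/mol
R_KJ = 8.314462618e-3          # gas constant in kJ/(mol K)


def enthalpy_of_formation_298(e0_molecule, thermal_molecule, composition, atom_data):
    """Enthalpy of formation at 298 K (kJ/mol) from an atomization energy.

    e0_molecule      -- molecular energy at 0 K including ZPE, in hartree
    thermal_molecule -- H(298) - H(0) of the molecule, in kJ/mol
    composition      -- element symbol -> number of atoms, e.g. {"C": 6, "H": 6}
    atom_data        -- element symbol -> dict with keys (hypothetical names):
                        "e0"      atomic energy at 0 K, hartree
                        "dfh0"    experimental enthalpy of formation of the gaseous atom at 0 K, kJ/mol
                        "dh_elem" H(298) - H(0) of the element in its standard state, kJ/mol
    """
    # Atomization energy at 0 K: sum of atomic energies minus the molecular energy.
    sum_d0 = (sum(n * atom_data[el]["e0"] for el, n in composition.items())
              - e0_molecule) * HARTREE_TO_KJ_MOL
    # Enthalpy of formation at 0 K.
    dfh0 = sum(n * atom_data[el]["dfh0"] for el, n in composition.items()) - sum_d0
    # Thermal correction from 0 K to 298 K.
    elem_thermal = sum(n * atom_data[el]["dh_elem"] for el, n in composition.items())
    return dfh0 + thermal_molecule - elem_thermal


def conformer_correction(rel_energies_kj, temperature=298.15):
    """Boltzmann-weighted mean of conformer energies relative to the most stable one (kJ/mol)."""
    weights = [math.exp(-e / (R_KJ * temperature)) for e in rel_energies_kj]
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, rel_energies_kj)) / total
```

In such a scheme the conformer correction would be added to the enthalpy of formation computed for the most stable conformer; it is zero for a rigid molecule and grows as low-lying conformers become populated.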

To check the accuracy of the \({\Updelta_{\text{f}} H_{298}^{^\circ } }\) values obtained from atomization energies, the method of isodesmic reactions [14, 15] was also applied to calculate the enthalpies of formation of some compounds. An isodesmic scheme combines theoretical and experimental data to eliminate systematic errors and usually improves on the results obtained from the atomization scheme. The electronic energies of all molecules involved in the isodesmic reactions were obtained from G4 calculations. The G4(0) energies include the zero-point energies calculated at the B3LYP/6-31G(2df,p) level and scaled by 0.9854. These energies, corrected for the change in enthalpy from T = 0 to 298 K, were used to calculate the enthalpies of the isodesmic reactions. Thermal corrections were computed from scaled B3LYP/6-31G(2df,p) vibrational frequencies. The resulting enthalpies of formation were calculated by combining the G4 enthalpies of the isodesmic reactions with the experimental enthalpies of formation of reference molecules whose thermochemical data are well established. The experimental \({\Updelta_{\text{f}} H_{298}^{^\circ } }\) values for the species involved in the isodesmic reactions were taken from the literature (Ref. [16] and references in Tables 1, 2, 3, and 4).
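The isodesmic procedure described above amounts to a small piece of bookkeeping, sketched here under the assumption that the target compound appears on the reactant side of the reaction. The function names and dictionary layout are hypothetical; the calculated enthalpies are assumed to be G4 values at 298 K already converted to kJ/mol.

```python
def reaction_enthalpy(h298_calc, reactants, products):
    """Calculated reaction enthalpy (kJ/mol): products minus reactants.

    h298_calc -- species name -> calculated enthalpy at 298 K, kJ/mol
    reactants, products -- species name -> stoichiometric coefficient
    """
    h_products = sum(nu * h298_calc[s] for s, nu in products.items())
    h_reactants = sum(nu * h298_calc[s] for s, nu in reactants.items())
    return h_products - h_reactants


def target_dfh_from_isodesmic(target, h298_calc, dfh_exp, reactants, products):
    """Enthalpy of formation of `target` (a reactant) from one isodesmic reaction.

    dfh_exp -- species name -> experimental enthalpy of formation of each
               reference species, kJ/mol (the target itself is not needed)
    """
    d_r_h = reaction_enthalpy(h298_calc, reactants, products)
    # Experimental contribution of every species except the target.
    known = (sum(nu * dfh_exp[s] for s, nu in products.items())
             - sum(nu * dfh_exp[s] for s, nu in reactants.items() if s != target))
    # d_r_h = known - nu_target * dfh(target), solved for dfh(target).
    return (known - d_r_h) / reactants[target]
```

Because the same calculated quantities enter both sides of the reaction, errors that are similar for similar bonding environments largely cancel in the reaction enthalpy, which is why the result depends mainly on the quality of the experimental reference values.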

Table 1 Experimental enthalpies of formation and deviations from G4 values for nitro compounds
Table 2 Experimental enthalpies of formation and deviations from G4 values for nitrogen heterocycles
Table 3 Experimental enthalpies of formation and deviations from G4 values for oxygen and sulfur heterocycles
Table 4 Experimental enthalpies of formation and deviations from G4 values for different organic compounds

Results and discussion

All compounds considered in this work are divided into four groups. Nitro compounds (Table 1) are singled out because substantial discrepancies are observed between experimental enthalpies of formation and those calculated from atomization energies obtained by different Gn methods. Moreover, only two nitro compounds (nitromethane and 2-nitrobutane) were included in the test set of the G4 method [4], whereas many experimental data have been obtained for various nitro compounds in recent years. The second group comprises nitrogen-containing heterocycles (Table 2). Although such compounds were represented in the test set [4] (aziridine, pyrrole, tetrahydropyrrole, N-methylpyrrole, pyridine, pyrimidine, pyrazine, piperidine), fairly large deviations from experiment were observed for pyrimidine and pyrazine. In this work, G4 calculations have been carried out for five-membered rings with 2–4 nitrogen atoms, various derivatives of pyridine, pyridazine, pyrimidine, and pyrazine, and some condensed nitrogen heterocycles with 9–14 non-hydrogen atoms. The third group comprises oxygen- and sulfur-containing heterocycles (Table 3). Of these compounds, mainly derivatives of furan and thiophene are represented in the test set [4]. Therefore, it is of interest to check larger compounds, and such compounds, including condensed rings with 10–15 non-hydrogen atoms, were considered in this work. The last group (Table 4) includes various types of organic substances that are widely represented in the test set [4]; however, the new molecules are substantially larger.

Nitro compounds

The calculated G4 values (Table 1), except those of compounds 4 and 15, are 1.8 to 19.0 kJ/mol lower than the experimental values. The mean absolute deviation is 10.7 kJ/mol, which is substantially larger than that obtained for the test set (3.3 kJ/mol [4]). Although most of the compounds in Table 1 are chlorinated nitrobenzenes, it is unlikely that the underestimation of the \( \Updelta_{\text{f}} H_{298}^{^\circ } \) values is caused by the chlorine atoms. No such underestimation is observed for chlorobenzenes 112–118 (Table 4), while the \( \Updelta_{\text{f}} H_{298}^{^\circ } \) values for nitro compounds without chlorine atoms (1–3, 5, 6, 23, 24, 52, 57, 58) are also lower than the experimental values by 10 kJ/mol on average.
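For readers who wish to reproduce statistics of this kind from the tabulated data, the following sketch computes the mean absolute deviation and the mean signed deviation. It assumes that a deviation is defined as experiment minus G4, so that a positive value means the G4 value is too low; the function names are hypothetical, and the numbers in the demonstration are placeholders, not values from Table 1.

```python
def deviations(experimental, calculated):
    """Deviation for each compound, defined here as experiment minus theory (kJ/mol)."""
    return [e - c for e, c in zip(experimental, calculated)]


def mean_absolute_deviation(experimental, calculated):
    """Average magnitude of the deviations, ignoring their sign."""
    devs = deviations(experimental, calculated)
    return sum(abs(d) for d in devs) / len(devs)


def mean_signed_deviation(experimental, calculated):
    """Average deviation with sign; a positive value indicates systematic underestimation by theory."""
    devs = deviations(experimental, calculated)
    return sum(devs) / len(devs)


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not data from this work.
    exp_values = [10.0, 20.0, 30.0]
    g4_values = [5.0, 22.0, 18.0]
    print(mean_absolute_deviation(exp_values, g4_values))
    print(mean_signed_deviation(exp_values, g4_values))
```

A mean signed deviation close to the mean absolute deviation, as the text reports for the nitro compounds, indicates that the error is predominantly one-sided rather than random.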

For the two compounds with the largest deviations, 4 and 15, the experimental values may be in error. To check the accuracy of the enthalpies of formation obtained from G4 atomization energies, the \( \Updelta_{\text{f}} H_{298}^{^\circ } \) values of these compounds were also calculated from isodesmic reactions (Table 5). For 4, the isodesmic reactions with different reference molecules clearly favor the theoretical value. New measurements of the enthalpies of formation and sublimation of 4 would be extremely valuable for checking the accuracy of the theoretical calculation. As for 15, the experimental value is likely to be overestimated by about 10 kJ/mol. This suggestion is supported not only by the results of the isodesmic reaction calculations (Table 5) but also by the G4 enthalpies of formation of the other nitro compounds (all compounds in Table 1 and 52, 57, 58 in Table 2), for which the deviations between theory and experiment do not exceed 19.0 kJ/mol.

Table 5 Comparison of experimental enthalpies of formation with those calculated from G4 atomization energies and G4 enthalpies of isodesmic reactions (in kJ/mol)

The \( \Updelta_{\text{f}} H_{298}^{^\circ } \) values were also calculated from isodesmic reactions for some other compounds (3, 10, 14, 17) that show considerable deviations from experiment. For these compounds (Table 5), however, the results from isodesmic reactions are very close to the experimental values. This suggests that the \( \Updelta_{\text{f}} H_{298}^{^\circ } \) values calculated from G4 atomization energies carry a systematic error for nitro compounds.

The deviations between experimental and G4 enthalpies of formation for all nitro compounds studied in this work are shown in Fig. 1a. Only compounds 4 and 15 are excluded from consideration because of the apparent inaccuracy of their experimental data. As can be seen from Fig. 1a, for nitro compounds with 8–14 non-hydrogen atoms the G4 enthalpies of formation are appreciably lower than the experimental values; however, the error does not accumulate as the number of non-hydrogen atoms increases. Therefore, for nitro compounds with up to 15 heavy atoms one would expect the G4 values to be underestimated by about 10 kJ/mol regardless of molecular size.
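One simple way to quantify whether deviations grow with molecular size is the least-squares slope of the deviation against the number of non-hydrogen atoms. The sketch below is only an illustration of that check and is not the analysis used in this work, which relies on visual inspection of Fig. 1; the function name is hypothetical and the demonstration numbers are placeholders.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line y = a + b*x (here: kJ/mol per heavy atom)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return s_xy / s_xx


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not data from Fig. 1.
    heavy_atoms = [8, 10, 12, 14]
    deviation_kj = [10.0, 11.0, 9.0, 10.0]
    print(least_squares_slope(heavy_atoms, deviation_kj))
```

A slope near zero together with a non-zero mean deviation would correspond to the situation described for the nitro compounds: a roughly constant offset without accumulation of error.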

Fig. 1 Deviations between experimental and G4 enthalpies of formation. (a) Nitro compounds: all species from Table 1 except 4 and 15, together with 52, 57, 58 from Table 2; (b) nitrogen heterocycles from Table 2 except for 31; (c) all oxygen and sulfur heterocycles from Table 3; (d) different compounds from Table 4 except for 109

Nitrogen heterocycles

Among the compounds given in Table 2, a deviation of more than 20 kJ/mol is observed only for 2-methyl-2H-tetrazole (31). Since substantially better agreement between theory and experiment is obtained for the other methyl derivatives of tetrazole (30, 32), the large deviation for 31 probably results from an inaccurate experimental value [29]. Excluding 31 from consideration, the mean absolute deviation of G4 theory from experiment for the species in Table 2 is 6.8 kJ/mol, which is appreciably less than that for the nitro compounds. The deviations differ in sign, and the largest ones, with absolute values of about 15 kJ/mol, are found for compounds with different numbers of heavy atoms (Fig. 1b). Thus, although the deviations of the G4 enthalpies of formation from the experimental ones are larger than for the molecules of the test set [4], no accumulation of error with molecular size is observed for nitrogen heterocycles, at least as the number of non-hydrogen atoms increases from 6 to 14. It is worth noting that rather small deviations are observed even for large condensed tricyclic molecules (59, 60).

For three compounds with relatively large deviations (28, 50, 52), the \( \Updelta_{\text{f}} H_{298}^{^\circ } \) values were also calculated from isodesmic reactions (Table 5). The results for 28 support the G4 value calculated from the atomization energy, thus calling the experimental value into question. The first reaction for 50 strongly supports the value from the G4 atomization energy, while the value obtained from the second reaction lies between the experimental value and that calculated from the G4 atomization energy. Therefore, the experimental value for 50 is likely to be slightly overestimated. Interestingly, as for the other nitro compounds (see 3, 10, 14, 17 in Table 5), the experimental value for the compound with a nitro group (52) is convincingly supported by the isodesmic reaction calculations.

As mentioned above, appreciable deviations of 10.4 and −7.9 kJ/mol were obtained for pyrimidine and pyrazine in the test set [4]. In this work, similar discrepancies are observed for some derivatives of pyrimidine and pyrazine (40, 41, 43, 45), whereas the deviations are insignificant for others (39, 42, 44) and for another six-membered ring with two nitrogen atoms (38). To clarify these deviations, additional comparisons with experimental data should be made for six-membered nitrogen heterocycles.

Oxygen and sulfur heterocycles

The mean absolute deviation between the experimental and G4 values is 9.1 kJ/mol for the compounds in Table 3, which is somewhat larger than for the nitrogen heterocycles. The largest deviations, of about 16 kJ/mol, are found for 68 and 73. Unfortunately, it is difficult to select a sufficient number of well-balanced isodesmic reactions for these species. However, as seen from Table 5, there are no grounds to doubt the experimental data for 68 and 73: the \( \Updelta_{\text{f}} H_{298}^{^\circ } \) values calculated from isodesmic reactions lie between the experimental values and those calculated from atomization energies.

Almost all theoretical values for the oxygen and sulfur heterocycles are lower than the experimental values (Fig. 1c) and, as for the nitrogen heterocycles, no accumulation of error with molecular size is observed as the number of non-hydrogen atoms increases from 6 to 15.

Different compounds

Various groups of organic substances (aromatic compounds, cyclic compounds, ethers, amines, cyano compounds, amides, azo compounds, chlorobenzenes, chlorobenzoic acids, amino acids) are presented in Table 4. Molecules of these types, but smaller in size, were widely used in the testing of the G4 method [4]. The mean absolute deviation of G4 theory from experiment for the species in Table 4, excluding 109, is 4.5 kJ/mol. This deviation is rather close to that observed for the test set [4] (3.3 kJ/mol). Again, as for the other compounds, there is no accumulation of error as the number of heavy atoms increases from 7 to 14 (Fig. 1d).

The significant difference between the G4 and experimental enthalpies of formation of 109 suggests that the experimental value is inaccurate, which is supported by the isodesmic reaction results (Table 5). The experimental enthalpy of formation of gaseous proline given in Table 4 was obtained with the value of \( \Updelta_{\text{sub}} H_{298}^{^\circ } = 149 \pm 4\,{\text{kJ/mol}} \) based on the experimental enthalpies of sublimation in the temperature range 390 < T < 420 K [79]. In this work, a substantially lower value of \( \Updelta_{\text{sub}} H_{298}^{^\circ } = 123\,{\text{kJ/mol}} \) was obtained by adjusting the enthalpy of sublimation to 298.15 K using the equation

$$ \Updelta_{\text{sub}} H_{298}^{^\circ } \approx \Updelta_{\text{sub}} H_{\text{T}}^{^\circ } + \left[ C_{\text{p, 298}}^{^\circ } ({\text{cr}}) - C_{\text{p, 298}}^{^\circ } ({\text{g}}) \right] (T - 298.15) $$

together with experimental values of \( \Updelta_{\text{sub}} H_{\text{T}}^{^\circ } \) [79], \( C_{\text{p, 298}}^{^\circ } ({\text{cr}}) \) [80], and the \( C_{\text{p, 298}}^{^\circ } ({\text{g}}) \) value calculated in this work for the most stable conformer of proline. A value of \( \Updelta_{\text{sub}} H_{298}^{^\circ } = 131\,{\text{kJ/mol}} \) was obtained by a similar adjustment of the \( \Updelta_{\text{sub}} H_{406}^{^\circ } \) value obtained from other measurements [81]. These two enthalpies of sublimation estimated in this work lead to values of \( \Updelta_{\text{f}} H_{298}^{^\circ } \left( {{\text{g}},{\text{ proline}}} \right) \) that agree with the G4 result within 4 kJ/mol. Therefore, the large discrepancy between the experimental and theoretical enthalpies of formation of 109 is likely due to the incorrect values of \( \Updelta_{\text{sub}} H_{298}^{^\circ } \) accepted in Refs. [16, 79]. A similar picture is observed for methionine (108): the difference between experiment and theory is reduced from 16 kJ/mol to 7 kJ/mol or −4 kJ/mol if the available experimental data on \( \Updelta_{\text{sub}} H_{\text{T}}^{^\circ } \) [82, 83] are adjusted to \( \Updelta_{\text{sub}} H_{298}^{^\circ } \) in the same way as for 109.
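The temperature adjustment of the sublimation enthalpy given by the equation above can be written as a one-line function. This is a minimal sketch with a hypothetical function name; it assumes that the sublimation enthalpy is given in kJ/mol and the heat capacities in J/(K mol), so a factor of 1000 is needed to reconcile the units. The numbers in the demonstration are placeholders, not the proline data of Refs. [79–81].

```python
def adjust_sublimation_enthalpy(dsub_h_t, t_mean, cp_cr, cp_g, t_ref=298.15):
    """Adjust a sublimation enthalpy measured at a mean temperature t_mean (K) to t_ref (K).

    dsub_h_t -- sublimation enthalpy at t_mean, kJ/mol
    cp_cr    -- heat capacity of the crystal at 298.15 K, J/(K mol)
    cp_g     -- heat capacity of the gas at 298.15 K, J/(K mol)
    Returns the sublimation enthalpy at t_ref in kJ/mol.
    """
    # The heat capacity difference is in J/(K mol); divide by 1000 to obtain kJ/mol.
    return dsub_h_t + (cp_cr - cp_g) * (t_mean - t_ref) / 1000.0


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    print(adjust_sublimation_enthalpy(100.0, 398.15, 200.0, 150.0))
```

The size of the correction scales with both the heat capacity difference and the distance of the measurement temperature from 298.15 K, so measurements made near 400 K are particularly sensitive to the heat capacities used.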

Particular attention in Table 5 was given to amino acids 107–111, since the experimental values of \( \Updelta_{\text{sub}} H_{298}^{^\circ } \) for these compounds are often unreliable and theoretical calculations make it possible to identify such cases [84, 85]. As seen from Table 5, the \( \Updelta_{\text{f}} H_{298}^{^\circ } \) values obtained by the isodesmic reaction method are very close to those calculated from atomization energies. Therefore, the G4 values obtained from atomization energies may be used to check the accuracy of the experimental values for amino acids.

Conclusions

Continual evaluation of composite quantum chemical methods is necessary. This is especially important for methods whose empirical parameters are obtained by fitting to test sets of accurate experimental data. Assessments are needed to ensure that the methods are accurate for species not yet included in the test set. Systematic evaluations can help to find weaknesses and eventually lead to new and improved methods [86].

In this work, the G4 method was tested on organic compounds of different types that are larger than those in the original test set [4]. The largest deviations between experiment and G4 values are found for nitro compounds and for oxygen and sulfur heterocycles: almost all G4 values are 5–15 kJ/mol lower than the experimental values (Fig. 1a, c). The G4 method was parametrized using only a small number of molecules of these types. Therefore, further correction schemes are very likely to be necessary to improve the performance of the G4 method for nitro compounds and oxygen- and sulfur-containing heterocycles.

For the other compounds the deviations are appreciably smaller and for the most part positive. Hence, the theoretical values tend to be underestimated rather than overestimated with respect to the experimental enthalpies of formation. The smallest deviations are observed for compounds whose types were extensively represented in the test set [4]. Importantly, the expected accumulation of error with increasing molecular size is not observed for molecules with up to 15 non-hydrogen atoms. From the results obtained in this work, it may be concluded that deviations of 20 kJ/mol or more between experimental and G4 values point to errors in the experimental values. Finally, it is worth emphasizing the importance of the isodesmic reaction method, which, as shown in Table 5, often helps to decide between experimental \( \Updelta_{\text{f}} H_{298}^{^\circ } \) values and those calculated from G4 atomization energies.