Introduction

Highly correlated ab initio methods used in conjunction with a large number of basis functions provide highly accurate results, but have an elevated computational cost, thus limiting their application to small systems. Composite theories emerged in the 1980s as a means to extend the application of high-level ab initio calculations to larger molecules. The philosophy behind composite procedures is the development of a systematic combination of ab initio calculations so as to calculate atomic and molecular properties with high accuracy and low computational cost [1]. The strategy is based on the additive properties of basis set effects and different levels of electron correlation.

The first widely used combinations of ab initio calculations to predict thermochemical properties were referred to as the Gaussian n theories (n = 1, 2, 3 and 4), or simply the Gn theories, and were developed by Pople, Curtiss and coworkers [2–12]. The expectation that the general corrections are additive and can be applied to essential improvements of different properties possibly led the authors to refer to the procedure as a theory. These methods indeed resulted in a reduction in computational costs, and have evolved in two different directions. One approach has searched for more accurate results at the expense of a larger number of correction components. The main Gn procedures, G1, G2, G3 and G4, correspond to a sequence of improvements in the central methodology. A second approach tries to minimize computational costs while preserving an acceptable level of accuracy for any particular Gn method, thus allowing its application to larger systems. Examples of this second perspective are found in modifications of the G3 theory, for example, G3(MP2) [5, 6] and G3(MP2)//B3 [13]. These reduced-order methods have been guided by the following principles: (1) third- and fourth-order perturbation calculations with large basis sets are avoided; (2) the equilibrium molecular geometries are obtained at the density functional theory level of calculation; and (3) the core electrons in single-point calculations are kept frozen.

In addition to the reduced-order alternatives, computational costs can be minimized by using pseudopotentials, such as in the G3CEP [14, 15] and G3(MP2)//B3-CEP [16] theories. These new composite theories present a significant reduction in computational demand while maintaining an accuracy similar to that of the respective all-electron versions, and have been applied not only to calculating thermochemical properties but also to other molecular properties [17, 18].

The main objective of this paper was to evaluate the applicability of the pseudopotentials developed by Stevens, Krauss and Basch [19, 20] [compact effective pseudopotential (CEP)] in the G3 reduced-order MP2 theory [G3(MP2)], hereafter referred to as G3(MP2)-CEP, to molecules containing atoms from the first, second and third rows of the periodic table. The choice of CEP was based on the simplicity of the pseudopotential, its availability for the largest number of elements of the periodic table, applications in the literature and successful adaptation in previous works [14–16].

Computational methods

Two similar theories were adapted from the G3 theory: G3(MP2) and G3(MP2)//B3. The aim was to achieve a level of accuracy for the final energy calculated with both methods that is comparable to that obtained with a QCISD(T,Frz)/G3MP2large calculation by using additive corrections according to the equation:

$$ {E}_{G3(MP2)//B3}=E\left[ QCISD\left(T,Frz\right)/6-31G(d)\right]+\varDelta {E}_{G3MP2\mathrm{large}}+{E}_{SO}+{E}_{ZPE}+\varDelta {E}_{HLC} $$
(1)

where E[QCISD(T,Frz)/6-31G(d)] is the initial reference energy at the QCISD level, including triple excitation corrections and the frozen core approximation, to be modified by the following corrections: (1) ∆E_G3MP2large = E[MP2/G3MP2large] − E[MP2/6-31G(d)], where G3MP2large is a large basis set developed specially for the Gn theories; (2) E_SO = spin-orbit correction; (3) E_ZPE = thermal corrections and vibrational zero point energy (ZPE); and (4) ∆E_HLC = high level correction, used empirically to minimize the deficiencies of the calculation. ∆E_HLC is expressed as ∆E_HLC = −A n_β − B (n_α − n_β) for molecules and ∆E_HLC = −C n_β − D (n_α − n_β) for atoms, where n_α and n_β are the numbers of valence electrons with alpha and beta spins, respectively, with n_α ≥ n_β; A, B, C and D are parameters optimized to give the smallest mean absolute deviation from experimental data.
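The HLC expression above can be sketched in a few lines of Python. This is only an illustration: the class and function names are hypothetical, and the parameter values are placeholders chosen for easy arithmetic, not the optimized values of Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HLCParameters:
    """HLC parameters in millihartree: A and B for molecules, C and D for atoms."""
    a: float
    b: float
    c: float
    d: float


def high_level_correction(n_alpha: int, n_beta: int,
                          params: HLCParameters, is_atom: bool) -> float:
    """Return the HLC in millihartree for given valence alpha/beta electron counts."""
    if n_alpha < n_beta:
        raise ValueError("the convention in the text requires n_alpha >= n_beta")
    unpaired = n_alpha - n_beta
    if is_atom:
        return -params.c * n_beta - params.d * unpaired
    return -params.a * n_beta - params.b * unpaired


# Placeholder parameters (NOT the values of Table 1).
PLACEHOLDER = HLCParameters(a=9.0, b=4.0, c=9.0, d=2.0)

# Closed-shell H2O: 8 valence electrons, 4 alpha and 4 beta.
hlc_h2o = high_level_correction(4, 4, PLACEHOLDER, is_atom=False)
# Triplet O atom: 6 valence electrons, 4 alpha and 2 beta.
hlc_o = high_level_correction(4, 2, PLACEHOLDER, is_atom=True)
```

For a closed-shell molecule the second term vanishes, so the correction depends only on the number of valence electron pairs.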

The G3(MP2) and G3(MP2)//B3 theories differ in their optimization of molecular geometries. G3(MP2) is based initially on geometries optimized at the Hartree-Fock/6-31G(d) level of theory, which are further improved by optimization at the MP2(full)/6-31G(d) level using all electrons for the calculation of correlation effects, while G3(MP2)//B3 uses geometries optimized at the B3LYP/6-31G(d) level. The adaptation of pseudopotentials to the G3(MP2) theory was carried out in a similar manner to that introduced in the G3(MP2)//B3-CEP theory [16] and consisted of the adaptation of the basis sets and the reoptimization of the HLC parameters following the procedure used previously for the G3CEP and G3(MP2)//B3-CEP theories [15, 16]. Since the methodologies were similar, the 6-31G(d) and G3MP2large basis sets adapted for use with G3(MP2)-CEP were the same as those used with G3(MP2)//B3-CEP [16]. The adaptation from the all-electron basis set to the pseudopotential is discussed in detail in the literature [14, 15] and will not be reproduced here.

The four high level correction (HLC) parameters for G3(MP2)-CEP were optimized by minimizing the mean absolute deviation with respect to experimental data using the modified simplex method of Nelder and Mead [21]. The set of experimental reference data [13] consisted of compounds containing elements of the first, second and third rows of the periodic table evaluated for four properties: enthalpies of formation, ionization energies, electron affinities and proton affinities. Atomization energies were also considered for compounds containing third-row elements. Table 1 shows the optimal HLC parameters for the G3(MP2)-CEP, G3(MP2)//B3-CEP [16] and G3CEP [14, 15] theories. The first general trend was that the HLC parameters were larger for G3(MP2)-CEP, G3(MP2), G3(MP2)//B3-CEP and G3(MP2)//B3 than for G3 and G3CEP. HLC parameters should tend to zero for theories approaching the exact results. The number of energy correction terms in the G3 and G3CEP theories was greater than in the two reduced-order alternatives, which implies a smaller dependence on the HLC for the more complex theories. Table 1 also compares the HLC parameters for G3(MP2) and G3(MP2)//B3 with those of the corresponding approaches that use pseudopotentials, and shows larger parameters for the pseudopotential version of both theories than for the all-electron one. In all of these cases, the generally larger parameters seen when using pseudopotentials are an indication of the effects of inner shell electrons on the calculated properties. The parameters for G3(MP2)-CEP were also smaller than for G3(MP2)//B3-CEP, suggesting a better representation of the calculated properties by the MP2 optimized geometries than by the B3LYP geometries. It is noteworthy that the original methodologies did not include compounds containing third-row elements of the periodic table in the optimization of the HLC term, unlike the methodologies adapted for pseudopotentials.
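The fitting described above can be illustrated with a small, self-contained sketch. It assumes that each reference property depends linearly on A, B, C and D (as follows from the HLC expression) and uses a basic textbook Nelder-Mead simplex, which is not necessarily the modified variant of ref. [21]. The data set `TOY_DATA` and all function names are hypothetical, and the numbers are made up for illustration only; they are not taken from the G3 test set.

```python
from typing import Callable, List, Sequence, Tuple

# (experimental value, value computed without HLC, coefficients of A, B, C, D).
# Made-up numbers for illustration; NOT from the G3 test set.
Datum = Tuple[float, float, Sequence[float]]
TOY_DATA: List[Datum] = [
    (10.0, 4.0, (1.0, 0.0, 0.0, 0.0)),
    (12.0, 3.0, (1.0, 1.0, 0.0, 0.0)),
    (5.0, 1.0, (0.0, 0.0, 1.0, 0.0)),
    (7.0, 2.0, (0.0, 0.0, 1.0, 1.0)),
]


def mean_absolute_deviation(params: Sequence[float], data: List[Datum]) -> float:
    """Mean of |experimental - calculated| with the HLC contribution included."""
    total = 0.0
    for experimental, uncorrected, coeffs in data:
        calculated = uncorrected + sum(c * p for c, p in zip(coeffs, params))
        total += abs(experimental - calculated)
    return total / len(data)


def nelder_mead(f: Callable[[Sequence[float]], float], x0: Sequence[float],
                step: float = 0.5, tol: float = 1e-10,
                max_iter: int = 2000) -> Tuple[List[float], float]:
    """Basic Nelder-Mead: reflection, expansion, inside contraction, shrink."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced along each coordinate.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    values = [f(v) for v in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if abs(values[-1] - values[0]) < tol:
            break
        worst = simplex[-1]
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_r = f(reflected)
        if f_r < values[0]:
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_e = f(expanded)
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            simplex[-1], values[-1] = reflected, f_r
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_c = f(contracted)
            if f_c < values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                # Shrink every vertex halfway toward the best one.
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, v)]
                                    for v in simplex[1:]]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    best_index = min(range(n + 1), key=lambda i: values[i])
    return simplex[best_index], values[best_index]


start = [0.0, 0.0, 0.0, 0.0]
initial_mad = mean_absolute_deviation(start, TOY_DATA)
best_params, best_mad = nelder_mead(
    lambda p: mean_absolute_deviation(p, TOY_DATA), start)
```

Because the mean absolute deviation is not smooth, a derivative-free method such as the simplex search is a natural choice; the best vertex never gets worse from one iteration to the next, so `best_mad` cannot exceed the starting value.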

Table 1 Optimized high level correction (HLC) parameters for the G3(MP2)-CEP, G3(MP2)//B3-CEP and G3CEP theories as well as original parameters for the corresponding all-electron theories in parentheses (data in millihartrees)

The steps used to yield the G3(MP2)-CEP energy can be summarized as:

  1. The equilibrium molecular geometry was obtained at the HF/CEP-P31G(d) level of theory.

  2. The molecular structure obtained in step 1 was used to calculate the harmonic ZPE and vibrational thermal effects (E_ZPE), which were scaled by 0.8929 to account for anharmonic effects [3, 22, 23].

  3. The equilibrium geometries were refined by optimizing at the MP2/CEP-P31G(d) level.

  4. The molecular structure obtained in step 3 was used in single-point calculations at the QCISD(T)/CEP-P31G(d) and MP2/CEP-G3MP2large levels. The energy correction due to the use of the larger basis set is given by:

    $$ \varDelta {E}_{G3MP2\mathrm{large}}=E\left[MP2/ CEP-G3MP2\mathrm{large}\right]-E\left[MP2/ CEP-P31G(d)\right] $$
    (2)

  5. Spin-orbit corrections, E_SO, were considered for atomic species and molecules. These corrections were obtained from the literature and have been determined experimentally or by accurate calculation [6, 23].

  6. The ∆E_HLC empirical correction was added to the total energy to account for any other residual effects not considered in the previous corrections. Parameters A, B, C and D are listed in Table 1.

  7. The final energy is given by:

    $$ {E}_{G3(MP2)- CEP}=E\left[ QCISD(T)/ CEP-P31G(d)\right]+\varDelta {E}_{G3MP2\mathrm{large}}+{E}_{SO}+{E}_{ZPE}+\varDelta {E}_{HLC} $$
    (3)
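The additive assembly of the final energy in the steps above can be sketched as follows. The function and argument names are hypothetical, all energies are assumed to be in hartree, and for simplicity only the scaled harmonic zero point energy is included in the vibrational term (thermal contributions are left out). The input numbers are arbitrary and serve only to show the arithmetic.

```python
ZPE_SCALE_FACTOR = 0.8929  # scaling factor quoted in step 2


def g3mp2_cep_energy(e_qcisdt_small: float, e_mp2_small: float,
                     e_mp2_large: float, e_so: float,
                     zpe_harmonic: float, e_hlc: float) -> float:
    """Combine single-point energies and corrections (all in hartree).

    e_qcisdt_small: QCISD(T) energy with the small CEP-P31G(d) basis
    e_mp2_small:    MP2 energy with the small CEP-P31G(d) basis
    e_mp2_large:    MP2 energy with the CEP-G3MP2large basis
    e_so:           spin-orbit correction
    zpe_harmonic:   unscaled harmonic zero point energy
    e_hlc:          high level correction
    """
    # Basis set extension correction: large-basis minus small-basis MP2 energy.
    delta_large = e_mp2_large - e_mp2_small
    # Scaled zero point energy.
    e_zpe = ZPE_SCALE_FACTOR * zpe_harmonic
    # Sum of the reference energy and all additive corrections.
    return e_qcisdt_small + delta_large + e_so + e_zpe + e_hlc


# Arbitrary illustrative inputs (not results from any calculation).
example_energy = g3mp2_cep_energy(
    e_qcisdt_small=-10.0, e_mp2_small=-9.5, e_mp2_large=-9.75,
    e_so=0.0, zpe_harmonic=0.02, e_hlc=-0.036)
```

Keeping each correction as a separate argument mirrors the additivity assumption on which the composite approach rests: the basis set effect is estimated at the cheaper MP2 level and simply added to the small-basis QCISD(T) energy.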

The steps described above were performed with 6d and 7f polarization functions for all calculations except those carried out with the G3MP2large basis set following the G3(MP2) procedure adapted in the GAUSSIAN09 program [24].

The standard enthalpy of formation (ΔfH°) was calculated as described in the literature [2]. The proton affinity (PA0) was estimated at 0 K following the G3 theory [23] and the ionization energy (IE0) and electron affinity (EA0) were calculated adiabatically [2].
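As a minimal sketch, the three energy-difference properties can be obtained from composite total energies at 0 K using their standard definitions. The function names are hypothetical, each species is assumed to be evaluated at its own optimized geometry (adiabatic values), and the inputs are arbitrary numbers rather than computed energies. The enthalpy of formation is not included because it also requires atomic reference data that are not given in the text.

```python
HARTREE_TO_KCAL_MOL = 627.5095  # conversion factor, 1 hartree in kcal/mol


def ionization_energy(e_neutral: float, e_cation: float) -> float:
    """Adiabatic ionization energy in kcal/mol: E(cation) - E(neutral)."""
    return (e_cation - e_neutral) * HARTREE_TO_KCAL_MOL


def electron_affinity(e_neutral: float, e_anion: float) -> float:
    """Adiabatic electron affinity in kcal/mol: E(neutral) - E(anion)."""
    return (e_neutral - e_anion) * HARTREE_TO_KCAL_MOL


def proton_affinity(e_base: float, e_protonated: float) -> float:
    """Proton affinity at 0 K in kcal/mol: E(B) - E(BH+).

    The bare proton has no electrons, so its electronic energy is zero.
    """
    return (e_base - e_protonated) * HARTREE_TO_KCAL_MOL


# Arbitrary total energies in hartree, for illustration only.
ie_example = ionization_energy(e_neutral=-1.0, e_cation=-0.5)
ea_example = electron_affinity(e_neutral=-1.0, e_anion=-1.1)
pa_example = proton_affinity(e_base=-1.0, e_protonated=-1.3)
```

With these sign conventions, a bound cation lying above the neutral gives a positive ionization energy, and a stable anion gives a positive electron affinity.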

Results and discussion

The G3 test set [13] was used as the reference for evaluation of the G3(MP2)-CEP theory. This data set consists of 247 enthalpies of formation, 22 atomization energies, 104 ionization energies, 63 electron affinities and 10 proton affinities, resulting in calculations of 446 chemical species containing atoms from the first, second and third rows of the periodic table.

Statistically, the molecular geometries used by G3(MP2) and G3(MP2)-CEP, calculated at the MP2/6-31G(d) and MP2/CEP-P31G(d) levels, respectively, are similar. The mean absolute deviation is 0.02 Å for bond lengths and 0.5° for bond angles, taking the G3(MP2) geometries as reference. Similar deviations were obtained for the comparison between the B3LYP/6-31G(d) and B3LYP/CEP-P31G(d) optimized geometries [16] used in the G3(MP2)//B3 and G3(MP2)//B3-CEP theories, respectively.

The performance of the G3(MP2)-CEP theory with respect to the selected properties was analyzed as outlined in the following three sections: (1) properties of compounds containing elements of the first and second rows, (2) properties of compounds containing representative elements of the third row, and (3) general performance of the Gaussian n-CEP theories.

Compounds containing elements of the first and second rows

The results for all chemical species containing elements of the first and second rows can be found in the supplementary material, Tables S1–S4. The mean absolute deviations (MADs) for the thermochemical properties, ΔfH°, IE0, EA0 and PA0, are shown in Fig. 1. Comparing the total mean absolute deviations for G3(MP2) and G3(MP2)-CEP, it can be seen that the all-electron theory preserves a higher level of accuracy (1.44 kcal mol−1) than the pseudopotential theory (1.61 kcal mol−1). The total MAD for G3(MP2)-CEP is nevertheless close to that of the original theory, differing by only 0.17 kcal mol−1.

Fig. 1

Mean absolute deviations (MADs) with respect to experimental data for the G3(MP2) and G3(MP2)-CEP theories using a test set of atoms, ions and molecules containing first- and second-row elements. The numbers above each column show the number of species evaluated for each thermochemical property. ΔfH°: enthalpy of formation; IE0: ionization energy; EA0: electron affinity; PA0: proton affinity

The best G3(MP2)-CEP results among the studied properties were seen for the proton affinity (see Fig. 1), with an MAD of 0.85 kcal mol−1, similar to the 0.88 kcal mol−1 of the original G3(MP2) theory. The electron affinities and enthalpies of formation also exhibited great similarity between the all-electron and pseudopotential versions of the theory. The largest difference was seen for the ionization energies, with MADs of 1.52 kcal mol−1 for G3(MP2) and 1.80 kcal mol−1 for G3(MP2)-CEP. Although these deviations may seem high, the results compare very well with other rigorous methods presented in the literature.

The enthalpies of formation (ΔfH°) were affected mainly by the optimization of the HLC parameters. The test set for the standard enthalpies of formation of compounds containing elements of the first and second rows contains 236 molecules (see Table S1). From the literature [5, 14–16] it is known that some of these molecules exhibit high deviations with respect to experimental data. They usually have halogen atoms (chlorine and/or fluorine) in their structures and the respective deviations with respect to experimental data are greater than ±4 kcal mol−1. Some hypervalent sulfur- and phosphorus-containing compounds, and unsaturated aromatics, are also subject to significant deviations [12, 23]. As mentioned by Curtiss et al. [12] and referenced in other papers on the Gaussian n-CEP theories, the reasons for these large deviations are not clear [14–16].

Histograms of the deviations calculated for the enthalpies of formation provide more detailed information on the accuracy of the method, as shown in Fig. 2. Most values obtained by the G3(MP2)-CEP calculations are in the range of ±2 kcal mol−1 (Fig. 2b). Figure 2a shows that the G3(MP2) theory presents a larger number of cases within this range of accuracy.

Fig. 2

Histograms of the deviations obtained for 236 standard enthalpies of formation of compounds containing first- and second-row elements. The data sets represent the results obtained at the (a) all-electron G3(MP2) and (b) G3(MP2)-CEP levels of theory

It is important to consider not only the enthalpy of formation, but also the other properties (Tables S2–S4) as an indication of the excellent agreement between G3(MP2)-CEP and G3(MP2). As mentioned previously, the use of pseudopotentials leads to larger deviations for some compounds. Among the outliers are the ionization energies of Be (−10.2 kcal mol−1), Ne (−5.3 kcal mol−1) and P2 (−6.4 kcal mol−1) as well as the electron affinities of Li (−11.4 kcal mol−1) and Na (−9.4 kcal mol−1). Figure 3 illustrates the outliers for the ionization energies with deviations larger than ±2 kcal mol−1 using either all-electron calculations (Fig. 3a) or pseudopotentials (Fig. 3b). Most compounds with anomalous behavior are common to both methods, but some are improved significantly by the all-electron calculation and others by the use of pseudopotentials. In these cases, core electron effects may be responsible either for the improvement or for the errors in the respective ionization energies, or there may be a cancellation of errors in either alternative. The same argument may explain the larger number of outliers relative to the all-electron calculations. However, far from being a simple matter, these anomalous results suggest that more complex modifications are required to improve the electronic properties for these compounds.

Fig. 3

Distribution of absolute deviations for calculated ionization energies from (a) G3(MP2) and (b) G3(MP2)-CEP with respect to the experimental data
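The ±2 kcal mol−1 outlier criterion and the histogram binning used in this section can be sketched as follows. The five deviations are the G3(MP2)-CEP values quoted in the text; the dictionary layout, the function names and the 2 kcal mol−1 bin width are assumptions made for illustration, and the sign convention of the deviations is taken as given.

```python
import math
from typing import Dict, Tuple

# Deviations (kcal/mol) quoted in the text for G3(MP2)-CEP outliers.
QUOTED_DEVIATIONS: Dict[Tuple[str, str], float] = {
    ("IE", "Be"): -10.2,
    ("IE", "Ne"): -5.3,
    ("IE", "P2"): -6.4,
    ("EA", "Li"): -11.4,
    ("EA", "Na"): -9.4,
}


def find_outliers(deviations: Dict[Tuple[str, str], float],
                  threshold: float = 2.0) -> Dict[Tuple[str, str], float]:
    """Return the entries whose absolute deviation exceeds the threshold."""
    return {key: dev for key, dev in deviations.items() if abs(dev) > threshold}


def histogram(deviations: Dict[Tuple[str, str], float],
              width: float = 2.0) -> Dict[float, int]:
    """Count deviations per bin; each key is the lower edge of [edge, edge + width)."""
    counts: Dict[float, int] = {}
    for dev in deviations.values():
        edge = math.floor(dev / width) * width
        counts[edge] = counts.get(edge, 0) + 1
    return counts


outliers = find_outliers(QUOTED_DEVIATIONS)
bins = histogram(QUOTED_DEVIATIONS)
```

All five quoted species fall outside the ±2 kcal mol−1 window, which is why they are singled out in the discussion.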

Compounds containing third row representative elements

G3(MP2)-CEP was also used in the calculation of the following properties for compounds containing representative elements of the third row: 22 atomization energies, 11 enthalpies of formation, 17 ionization energies, 5 electron affinities and 2 proton affinities. Figure 4 shows the mean absolute deviations for these compounds; the underlying data are presented in Table S5. Figure 4 indicates that the largest difference between the G3(MP2)-CEP and G3(MP2) theories occurs for the proton affinity. This result is not statistically significant since it was obtained for only two compounds, precluding any proper conclusion regarding the accuracy of the methods for this particular set of compounds. The property with the next largest difference is the enthalpy of formation: the G3(MP2) results give a mean absolute deviation of 1.7 kcal mol−1, whereas the pseudopotential approach gives 2.3 kcal mol−1. The ionization energies follow a similar trend. However, even with these large deviations, the G3(MP2)-CEP theory provides results consistent with high level ab initio methods [1].

Fig. 4

Mean absolute deviations with respect to experimental data for the G3(MP2) and G3(MP2)-CEP theories using a test set comprising 57 species containing third-row representative elements. The numbers above each column show the number of species evaluated for each thermochemical property. D0: atomization energy; ΔfH°: enthalpy of formation; IE0: ionization energy; EA0: electron affinity; PA0: proton affinity

Fig. 5

Mean absolute deviations with respect to all experimental data for the G3(MP2) and G3(MP2)-CEP theories on a test set comprising 446 atoms, ions and molecules containing first-, second- and third-row atoms. The numbers above each column show the number of species evaluated for each thermochemical property. D0: atomization energy; ΔfH°: enthalpy of formation; IE0: ionization energy; EA0: electron affinity; PA0: proton affinity

The best results using the G3(MP2)-CEP theory were obtained for electron affinity, with a mean absolute deviation of 2.19 kcal mol−1 vs 2.54 kcal mol−1 for G3(MP2).

Some cases suggest that the G3(MP2)-CEP theory could be refined to obtain better results. The KF molecule was not included in the test set for the third row because the large difference in electronegativity between fluorine and potassium causes such a significant distortion of the electronic distribution that the bond length reached an unrealistic value lower than 1 Å. The results for potassium also yield a very large ionization potential and a very negative electron affinity, indicating a larger compression of the electronic distribution. It is worth noting that the G3(MP2) theory gives accurate results for the ionization potential of potassium, although it presents considerable deviations for the electron affinity.

Comparison among G3(MP2)-CEP, G3(MP2)//B3-CEP and G3CEP

Comparing the reduced-order pseudopotential theories (G3(MP2)-CEP and G3(MP2)//B3-CEP [16]) with the more elaborate pseudopotential theory (G3CEP) [14, 15], it was observed that the elimination of the MP4 calculations and the use of smaller adapted basis sets, G3CEPlarge and G3CEPMP2large, provided a significant reduction in CPU time with respect to G3CEP.

The total mean absolute deviations were 1.67 kcal mol−1 with G3(MP2)-CEP, 1.60 kcal mol−1 with G3(MP2)//B3-CEP and 1.29 kcal mol−1 with G3CEP [14, 15]. These results suggest that there are only small statistical differences between the two reduced-order methods employing pseudopotentials. As previously mentioned, the smaller HLC parameters for G3(MP2)-CEP with respect to G3(MP2)//B3-CEP also suggest better structural and electronic conditions for the first method. However, the insignificant statistical differences indicate that any improvement of the results depends on more elaborate corrections, such as those carried out by the G3CEP theory, or on other superior corrections. The difference of 0.38 kcal mol−1 between G3(MP2)-CEP and G3CEP is relatively small and indicates that the former is sufficiently accurate for initial estimates of the properties studied, although further improvements are necessary to obtain more accurate results.

Conclusion

The CEP pseudopotential was adapted to the G3(MP2) theory. This modified reduced-order Gn theory, referred to as G3(MP2)-CEP, was applied to the calculation of enthalpies of formation, ionization energies, atomization energies, and electron and proton affinities for 446 species, containing elements of the first, second, and third rows of the periodic table. The adaptation was carried out similarly to the G3(MP2)//B3-CEP theory, preserving the characteristics of G3(MP2) as much as possible.

The final implementation presented a total mean absolute deviation of 1.67 kcal mol−1 for G3(MP2)-CEP, compared with 1.47 kcal mol−1 for G3(MP2). The electron affinities and enthalpies of formation are the properties that present the lowest deviations with respect to the original G3(MP2) theory.

In summary, the use of pseudopotentials in composite methods within the framework of the G3 theories is feasible for any of the three versions tested, provides accurate results compatible with the all-electron approach and significantly reduces CPU time.