1 Introduction

In the past two decades Kohn–Sham Density Functional Theory (DFT) [15] has become a very important tool for understanding mechanistic problems in chemistry. At the heart of this topic is the proper energetic description of all chemical species involved in a reaction. This thermochemistry problem, which from a broader point of view may also contain transition states or non-equilibrium structures in addition to the normal minima, is the topic of this work. The evaluation of the performance of density functionals (DFs) by benchmarking for energetic properties is a crucial step prior to the investigation of a new system. The reason for this is the still somewhat empirical nature of current DF approximations and their non-systematic improvability.

Several molecular sets were developed over the last few years to test DFs for, e.g., atomization energies [6–8], non-covalent interactions (NCI) [9–11], or special reactions and kinetics [12–16]. Many of them were collected in the GMTKN30 [17] test set by our group to build a large benchmark set which includes the chemically most important properties of main-group chemistry. Less extensive benchmarks exist in the field of transition metal chemistry, for which we mention a few examples [18–24].

The reference data used can be taken from experiment but nowadays it has become common practice to compute reference reaction energies at high Wave Function Theory (WFT) level (normally coupled-cluster) and compare these data for the same molecular geometry directly with DFT results. This procedure avoids the effects of temperature, conformations, solvents, and other uncertainties in the measurements, and is the preferred way in our group. However, it cannot always be applied to large molecules because the computationally demanding WFT calculations are intractable. Mixed approaches which combine experimental and theoretical data (back-correction schemes) represent a solution to the problem and will be discussed below.

Another important aspect in the context of DFT and larger chemical systems is the effect of the London dispersion interaction. As an electron correlation effect, dispersion has a fundamentally quantum chemical, complex many-particle origin but is chemically a local phenomenon. It often operates on a relatively long-range length scale where classical (atomic or other local) approximations perform well (and exchange is negligible) but also has a short-range component. The dispersion energy, and in particular its long-range (London) part, is not accurately described by common semi-local DFs [25–27] and dispersion corrections represent a very active field of research [28–30]. In this chapter we also want to highlight the importance of the dispersion energy in intramolecular cases and in thermochemistry generally. Even if dispersion alone is insufficient to form a stable chemical bond, it is clear that larger molecules are significantly more influenced by the intramolecular dispersion energy than smaller ones. Because dispersion as a special type of electron correlation effect is always attractive (energy lowering), this means larger molecules are thermodynamically stabilized by dispersion compared to small systems [31]. From this new concept it is concluded that large (preferably electron-rich and polarizable) functional groups can be used to stabilize thermodynamically (and not merely kinetically) weak bonds or reactive parts in a molecule. The chemical examples discussed at the end of this chapter illustrate this point.

As already mentioned, DFT has also become the “work-horse” of modern quantum chemistry because it represents a good compromise between computational effort and accuracy. However, the huge number of developed DFs to date shows that current approximations still suffer from several flaws and that the quest for finding a functional which comes close to the “true one” is still going on. In this context, we want to particularly focus on the fact that not every DF is equally well applicable to every problem [32]. This makes choosing the right functional for the right problem a tough task, even for experienced researchers in this field. Here, we want to shed light on the question whether very recent, newly or further developed functionals are accurate for thermochemistry and concomitantly robust, i.e., broadly applicable to various chemical problems. For evaluation we employ a combination of standard thermochemical benchmark sets and four “real” chemical reactions of molecules with about 70–200 atoms. Nine modern, higher-level functionals of dispersion corrected hybrid, range-separated hybrid, and double-hybrid type are tested.

2 Theory

2.1 Thermochemical Calculations in the Condensed Phase

A free reaction energy \( \Delta G_{\mathrm{r}} \) in solution, which is often measured experimentally under convenient equilibrium conditions for solvent X at temperature T, can be computed as

$$ \Delta {G}_{\mathrm{r}}\approx \Delta {E}_{\mathrm{r}}\left(\mathrm{gas}\right)+\underset{\begin{array}{l}\mathrm{DFT},\mathrm{RRHO},\\ {}\mathrm{low}\hbox{-} \mathrm{freq}.\mathrm{mode}\\ {}\mathrm{approx}.\end{array}}{\underbrace{\Delta {G}_{\mathrm{TRV}}^T\left(\mathrm{gas}\right)}}+\underset{\mathrm{COSMO}\hbox{-} \mathrm{RS}}{\underbrace{\Delta \delta {G}_{\mathrm{solv}}^T\left(\mathrm{solvent}\; X\right)}} $$
(1)

where \( \Delta E_{\mathrm{r}}(\mathrm{gas}) \) is the reaction energy of the isolated molecules excluding zero-point vibrational energy, \( \Delta G_{\mathrm{TRV}}^{T}(\mathrm{gas}) \) is the thermo-statistical correction from energy to free energy with translational, rotational, and vibrational contributions, and \( \Delta \delta G_{\mathrm{solv}}^{T} \) is the solvation free energy contribution. For the latter term we employ the COSMO-RS continuum solvation model [33–35] throughout. It is based on single point calculations at the default BP86/def-TZVP [36–38] level of theory for optimized gas phase structures. Consistently, the rigid-rotor-harmonic-oscillator (RRHO) model is also based on gas phase structures. Note that the normally small effects stemming from changes of the structure and vibrational frequencies upon solvation are implicitly accounted for by the COSMO-RS parametrization. For charged or very polar species, where larger changes are expected, we usually compute the structure and frequencies at the DFT-D3/COSMO level. Low-lying vibrational modes (<100 \( \mathrm{cm}^{-1} \)) are treated by a special rigid-rotor approximation in order to avoid numerical artifacts in the entropy calculations [39].
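The need for a special low-frequency treatment can be made concrete: the harmonic-oscillator entropy of a mode grows without bound as its frequency approaches zero, whereas a free-rotor entropy stays moderate. The Python sketch below interpolates between the two with a smooth weighting function. The functional forms, the use of 100 cm−1 as the switching frequency of the weighting function, the average moment of inertia `B_AV`, and all function names are assumptions made for illustration; they are not taken from this text, and the scheme actually used is the one described in [39].

```python
import math

H = 6.62607015e-34      # Planck constant, J s
KB = 1.380649e-23       # Boltzmann constant, J/K
C_CM = 2.99792458e10    # speed of light, cm/s
R_GAS = 8.314462618     # gas constant, J/(mol K)
B_AV = 1.0e-44          # ASSUMED average moment of inertia, kg m^2


def s_vib_harmonic(nu_cm, temp):
    """Harmonic-oscillator entropy (J/mol/K) of one mode at wavenumber nu_cm."""
    x = H * C_CM * nu_cm / (KB * temp)
    # x/(e^x - 1) - ln(1 - e^-x), written with expm1 for small-x accuracy
    return R_GAS * (x / math.expm1(x) - math.log(-math.expm1(-x)))


def s_free_rotor(nu_cm, temp):
    """Entropy (J/mol/K) of a one-dimensional free rotor mapped from nu_cm."""
    mu = H / (8.0 * math.pi ** 2 * C_CM * nu_cm)   # moment of inertia of the mode
    mu_eff = mu * B_AV / (mu + B_AV)               # capped effective moment
    arg = 8.0 * math.pi ** 3 * mu_eff * KB * temp / H ** 2
    return R_GAS * (0.5 + 0.5 * math.log(arg))


def s_interpolated(nu_cm, temp, nu0_cm=100.0, power=4):
    """Weighted mix: harmonic above nu0_cm, free rotor well below it."""
    w = 1.0 / (1.0 + (nu0_cm / nu_cm) ** power)
    return w * s_vib_harmonic(nu_cm, temp) + (1.0 - w) * s_free_rotor(nu_cm, temp)
```

Evaluating `s_vib_harmonic` and `s_interpolated` at, say, 1, 10, and 1000 cm−1 and 298.15 K shows the intended behavior: the two agree for stiff modes, while the interpolated entropy stays bounded for soft ones.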

Equation (1) can be rearranged to obtain approximate experimental reaction energies from condensed phase equilibrium measurements and the corresponding theoretical correction energy terms given above according to

$$ \Delta {E}_{\mathrm{r}}{\left(\mathrm{gas}\right)}^{\mathrm{exptl}.}\approx \Delta {G}_{\mathrm{r}}^{\mathrm{exptl}.}-\Delta {G}_{\mathrm{TRV}}^T\left(\mathrm{gas}\right)-\Delta \delta {G}_{\mathrm{solv}}^T\left(\mathrm{solvent}\;X\right). $$
(2)
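Equations (1) and (2) are simple bookkeeping, which the following Python sketch makes explicit. The function names and the numerical values are hypothetical and serve only to show the sign conventions; all quantities are in kcal/mol.

```python
def free_energy_in_solution(de_gas, dg_trv, ddg_solv):
    """Eq. (1): reaction free energy in solution from its three contributions."""
    return de_gas + dg_trv + ddg_solv


def back_corrected_reaction_energy(dg_exptl, dg_trv, ddg_solv):
    """Eq. (2): 'experimental' gas phase reaction energy via back-correction."""
    return dg_exptl - dg_trv - ddg_solv


# Hypothetical example values (kcal/mol), not taken from any measurement.
dg_exptl = -2.0     # measured reaction free energy in solution
dg_trv = -18.5      # computed thermo-statistical correction
ddg_solv = -6.0     # computed solvation contribution

de_ref = back_corrected_reaction_energy(dg_exptl, dg_trv, ddg_solv)
```

Feeding `de_ref` back into `free_energy_in_solution` with the same corrections returns the measured value, which is a quick consistency check on the signs.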

This approach is used to obtain reference reaction energies for three examples discussed in Sect. 3.2 with which the various DF results are compared. If experimental values are not available, the quantum chemistry “gold standard,” namely coupled-cluster with single and double excitations and perturbative triple excitations (CCSD(T)), is employed to obtain reference reaction energies directly. For large molecules it cannot be applied in its canonical form due to a steep increase of the computational effort with the number of correlated electrons. However, CCSD(T) calculations on larger molecules become computationally feasible if local correlation approaches are applied. Among the growing number of local coupled cluster implementations, the recently published DLPNO-CCSD(T) method [40], which employs pair-natural orbitals (PNOs) and domain-based techniques, seems to be very promising. It shows near linear scaling of the computation time with the system size and, hence, molecular calculations with up to 200 atoms and reasonable valence triple-zeta AO basis sets are possible, although the computational demands are still significantly higher compared to DFT methods. It has been shown [40] and confirmed by us [41] that the errors due to the additional approximations in the DLPNO-CCSD(T) method are small (<1–2 kcal/mol) and well controllable. Using a tight value of the electron pair cut-off (\( T_{\mathrm{cutPairs}} = 10^{-5}\,E_{\mathrm{h}} \)) to ensure that London dispersion interactions are captured properly, and estimating the remaining basis set incompleteness error by focal point analysis [42, 43], DLPNO-CCSD(T) offers a reliable means for obtaining accurate reference reaction energies if experimental values are missing.

The complete basis set (CBS) DLPNO-CCSD(T) results were estimated from the following standard additivity scheme for the electronic energy E in which a correction from a def2-TZVP (TZ) [44] calculation is added to the MP2/CBS result:

$$ E\left(\mathrm{CCSD}\left(\mathrm{T}\right)/\mathrm{C}\mathrm{B}\mathrm{S}\right)\approx E\left(\mathrm{M}\mathrm{P}2/\mathrm{C}\mathrm{B}\mathrm{S}\right)+\left[E\left(\mathrm{CCSD}\left(\mathrm{T}\right)/\mathrm{T}\mathrm{Z}\right)-E\left(\mathrm{M}\mathrm{P}2/\mathrm{T}\mathrm{Z}\right)\right]. $$
(3)

Here E(CCSD(T)) refers to a canonical CCSD(T) or for large molecules to a DLPNO-CCSD(T) energy.
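The additivity scheme of Eq. (3) can be sketched in a few lines of Python. Section 2.3 states that the MP2 energies are extrapolated from def2-TZVP and def2-QZVP results following Halkier et al. [95, 96]; the inverse-cubic two-point form used below for the correlation energy is an assumption about that procedure, and the function names are hypothetical. The Hartree–Fock part would have to be treated separately and is not shown.

```python
def extrapolate_two_point(e_small, e_large, x_small=3, x_large=4, power=3.0):
    """Two-point extrapolation assuming E(X) = E_CBS + A * X**(-power).

    e_small and e_large are correlation energies obtained with basis sets of
    cardinal numbers x_small and x_large (3 = triple-zeta, 4 = quadruple-zeta).
    """
    a = x_small ** power
    b = x_large ** power
    return (b * e_large - a * e_small) / (b - a)


def ccsdt_cbs_estimate(e_mp2_cbs, e_ccsdt_tz, e_mp2_tz):
    """Eq. (3): MP2/CBS energy plus the higher-order correction in the TZ basis."""
    return e_mp2_cbs + (e_ccsdt_tz - e_mp2_tz)
```

Because the bracketed term in Eq. (3) is a difference of two energies in the same basis, most of the basis set incompleteness error is expected to cancel in it, which is the rationale for adding it to the MP2/CBS value.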

For neutral systems and reaction energies in a “normal” range (10–40 kcal/mol), the errors of the back-correction scheme as well as residual errors in the DLPNO-CCSD(T)/CBS treatment (mostly due to basis set extrapolation errors and non-additivity effects) typically amount to 10% of ΔE, i.e., 2–3 kcal/mol. As will be shown below this is within the error range of most modern DFs. According to our experience, the often claimed “chemical accuracy” of about 1 kcal/mol for thermochemistry is unrealistic for large systems from the experimental as well as the theoretical point of view and the quoted 2–3 kcal/mol should be considered as a more realistic target.

2.2 The Functional “Zoo”

The “zoo” of density functionals [45] has grown tremendously over the years and it has become difficult even for an expert in the field to follow all developments continuously. About a decade ago it was basically sufficient to specify the choice of the Generalized Gradient Approximation (GGA) flavor in a DF with the amount of non-local Fock-exchange included (as specified by the mixing parameter \( a_x \) [46, 47]). Nowadays, the exchange-correlation energy \( E_{\mathrm{XC}} \) is composed of more diverse components. We want to clarify this issue with the help of the general formula

$$ {E}_{\mathrm{XC}}={E}_X^{\mathrm{GGA}}+{E}_X^{\mathrm{NL}}\left({a}_x,\mu \right)+{E}_{\mathrm{C}}^{\mathrm{GGA}}+{E}_{\mathrm{C}}^{\mathrm{NL}} $$
(4)

where \( E_{\mathrm{XC}}^{\mathrm{GGA}} = E_X^{\mathrm{GGA}} + E_{\mathrm{C}}^{\mathrm{GGA}} \) collects the semi-local GGA exchange-correlation energy components, \( E_X^{\mathrm{NL}}(a_x, \mu) \) is the non-local (NL) Fock-exchange energy determined by the global mixing parameter \( a_x \) and possibly the range-separation parameter μ, and \( E_{\mathrm{C}}^{\mathrm{NL}} \) is the non-local correlation energy describing mostly long-range London dispersion effects. These three parts constitute three independent “coordinate axes” which define the functional space of modern DFT. Because there are many technically and theoretically different choices for the three parts, and since there is no universally accepted theory on how to combine them, the number of possible (and probably at least reasonably performing) DFs is huge. In the spirit of John Perdew’s so-called “Jacob’s Ladder” scheme [48, 49], however, one can make some general classifications (cf. Peverati and Truhlar [50]):

  • Global hybrids like B3LYP [47, 51] or PBE0 [52, 53] employ standard GGA and Local Density Approximation (LDA) components and a fixed amount of Fock-exchange. The hybrids performing optimally for thermochemistry have small \( a_x \) values, typically in the range 0.1–0.3. They have been used extensively in chemistry and recently their theoretical foundation was discussed in some detail [54]. If properly corrected for London dispersion effects, e.g., by the atom-pairwise D3 [55, 56] or density-dependent VV10 [57] schemes (i.e., inclusion of \( E_{\mathrm{C}}^{\mathrm{NL}} \), see below), good accuracy for thermochemical properties can be obtained [50, 58, 59].

  • Range-separated hybrids (RSHs) split the two-electron operator based on the inter-electronic distance into short-range (SR) and long-range (LR) parts, which are then treated differently. Hirao and coworkers [60] realized this splitting as

    $$ {r}_{12}^{-1}=\underset{\mathrm{SR}}{\underbrace{\mathrm{erf}\mathrm{c}\left(\mu \cdot {r}_{12}\right)\cdot {r}_{12}^{-1}\;}}+\underset{\mathrm{LR}}{\underbrace{\mathrm{erf}\left(\mu \cdot {r}_{12}\right)\cdot {r}_{12}^{-1}\;}}, $$
    (5)

    where erf(x) is the error function, erfc(x) = 1 − erf(x), and μ is an adjustable parameter controlling the switch between the two regimes SR and LR. Usually, the SR part is treated by a conventional GGA while the LR part is considered “exactly,” i.e., Fock-exchange is taken. Again, such functionals can be dispersion-corrected for better performance, as in ωB97X-D or ωB97X-D3 [61, 62]. Note that simple RSHs derived from standard GGA components perform worse for thermochemistry than global hybrids [58] but improve reaction barriers.

  • If \( E_{\mathrm{C}}^{\mathrm{NL}} \) contains an orbital-dependent term which is computed by second-order perturbation theory, the functionals are called double-hybrids (DHs). The similar term “doubly-hybrid” for a linear combination of DFT and MP2 parts was first coined by Zhao et al. [63] (the term DH was first used by Neese et al. [64]). The first DF in this class was B2PLYP [65] (for earlier related mixtures of DF and MP2 components, see [63, 66, 67]). A general expression for the correlation energy \( E_{\mathrm{C}} \) in modern DFs is given by

    $$ {E}_{\mathrm{C}}=\left(1-{a}_{\mathrm{c}}\right){E}_{\mathrm{C}}^{\mathrm{GGA}}+{a}_{\mathrm{c}}{E}_{\mathrm{C}}^{\mathrm{PT}2}+{E}_{\mathrm{C}}^{\mathrm{disp}}, $$
    (6)

    where \( a_{\mathrm{c}} \) is a local/non-local mixing parameter (in analogy to the Fock-exchange mixing parameter \( a_x \)), \( E_{\mathrm{C}}^{\mathrm{PT2}} \) is the standard MP2 correlation energy expression but evaluated with hybrid-GGA orbitals and eigenvalues, and \( E_{\mathrm{C}}^{\mathrm{disp}} \) is a further London dispersion energy correction. In the limit \( a_{\mathrm{c}} = 0 \) a normal hybrid is obtained. For the PT2 part, different scale factors for same- and opposite-spin pair correlation energies in the spirit of the SCS-MP2 method [68, 69] can be used, increasing robustness in electronically complicated situations and possibly leading to computational savings. According to extensive benchmarks [58], DHDFs are the best performing methods in the functional “zoo.” For a thorough review of DHDFs including extensive benchmarking, see Kozuch et al. [70].

  • Dispersion effects are ubiquitous in matter and hence we briefly describe our recommended procedure for their computation. For reviews on dispersion corrections to DFT and other atom-pairwise approaches, see [28–30, 71–73]. In the VV10 scheme by Vydrov and van Voorhis [57] (application denoted by “-NL” or “-V”), which is based on earlier work by Langreth and Lundqvist [74], the non-local correlation (dispersion) energy takes the form of a double-space integral

    $$ {E}_{\mathrm{C}}^{\mathrm{disp},\mathrm{V}\mathrm{V}10}=\frac{1}{2}{\displaystyle \iint \rho (r)\phi \left(r,{r}^{\prime}\right)\rho \left({r}^{\prime}\right)\mathrm{d}r\mathrm{d}{r}^{\prime },} $$
    (7)

    where ρ is the charge density and r and r′ denote electron coordinates. The different flavors of such density-dependent corrections [57, 74, 75] only differ in the choice of the non-local correlation kernel ϕ(r, r′). These kernels are physically based on local approximations to the (averaged) dipole polarizability at frequency ω (i.e., α(r, ω)). Knowing α at all (imaginary) frequencies leads automatically, via the famous Casimir–Polder relationship [76], to the long-range part of the dispersion energy. This clarifies the deep relation to atom-pairwise methods (e.g., the DFT-D3 [55] approach used here) which employ dispersion coefficients as basic quantities and replace the charge density by atom-centered delta functions and the double-integral by a double-sum. The C6 coefficient for the induced dipole–induced dipole dispersion interaction between fragments A and B is given by

    $$ {\mathrm{C}}_6^{\mathrm{A}\mathrm{B}}=\frac{3}{\pi }{\displaystyle {\int}_0^{\mathit{\infty}}\alpha {\left(\mathrm{i}\omega \right)}^{\mathrm{A}}\alpha {\left(\mathrm{i}\omega \right)}^{\mathrm{B}}\mathrm{d}\omega .} $$
    (8)

    Higher-order dipole–quadrupole, quadrupole–quadrupole, … coefficients (i.e., C8, C10, …) can also be computed by similar formulas [77]. The C6 coefficients (and derived C8) in the D3 method were obtained from a modified form of this relation where the α(iω) are computed non-empirically by time-dependent DFT and A and B are reference molecules from which atomic values are derived [55]. Because the reference system can also be a molecular cluster modeling a solid environment, special coefficients for atoms in the bulk can be derived [78] (for a discussion of these atom-in-molecules effects, see Johnson [79]). The final form for the DFT-D3 two-body part of the dispersion energy employs the so-called Becke–Johnson (BJ) damping [56, 71] and truncates the expansion at C8:

    $$ {E}_{\mathrm{C}}^{\mathrm{disp},\mathrm{D}3\left(\mathrm{BJ}\right)}=-\frac{1}{2}{\displaystyle \sum_{\mathrm{A}\ne \mathrm{B}}{s}_6\frac{{\mathrm{C}}_6^{\mathrm{A}\mathrm{B}}}{R_{\mathrm{A}\mathrm{B}}^6+ f{\left({R}_{\mathrm{A}\mathrm{B}}^0\right)}^6}+{s}_8\frac{{\mathrm{C}}_8^{\mathrm{A}\mathrm{B}}}{R_{\mathrm{A}\mathrm{B}}^8+ f{\left({R}_{\mathrm{A}\mathrm{B}}^0\right)}^8}} $$
    (9)

    where \( f(R_{\mathrm{AB}}^0) = a_1 R_{\mathrm{AB}}^0 + a_2 \), and \( a_1 \), \( a_2 \), \( s_6 \), and \( s_8 \) are empirical parameters that have been determined by a fit to CCSD(T) interaction energies for typical NCI (i.e., the S66 set [10]). The VV10 functional also contains two empirical parameters (adjusting the long- and short-range behavior) of which only the latter is fitted for each DF [59]. The above form of the damping function (which is similar in VV10) ensures that, for small interatomic distances, the right constant limit of the dispersion energy is obtained [80]. The D3 or VV10 corrections can be added to all semi-local DFs that are devoid of dispersion in the medium-range regime so that no significant double-counting effects occur. For DHDFs the correction is scaled to complement the contribution of the orbital-dependent PT2 part [17] or empirically fitted [70].
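Equations (8) and (9) can be illustrated with a short Python sketch. It evaluates the Casimir–Polder integral numerically for a single-oscillator model polarizability (an assumption made here because the integral then has a closed-form value to compare with) and sums the BJ-damped pair terms of Eq. (9). All function names and all numerical inputs are hypothetical; in particular, the pair quantities `c6`, `c8`, and `r0` are plain inputs and are not D3 reference data.

```python
import math


def alpha_model(omega, alpha0, omega0):
    """ASSUMED single-oscillator polarizability at imaginary frequency i*omega."""
    return alpha0 * omega0 ** 2 / (omega0 ** 2 + omega ** 2)


def c6_casimir_polder(alpha_a, alpha_b, n=20000):
    """Eq. (8) by midpoint quadrature after mapping omega = t / (1 - t)."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        omega = t / (1.0 - t)
        jacobian = 1.0 / (1.0 - t) ** 2        # d(omega)/dt
        total += alpha_a(omega) * alpha_b(omega) * jacobian
    return 3.0 / math.pi * total / n


def e_disp_bj(coords, c6, c8, r0, s6, s8, a1, a2):
    """Eq. (9): BJ-damped two-body dispersion energy.

    coords is a list of (x, y, z) tuples; c6, c8, r0 are dicts keyed by (i, j)
    with i < j. Summing over i < j equals the 1/2 * sum over A != B in Eq. (9).
    """
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            f = a1 * r0[i, j] + a2             # f(R0_AB) = a1 * R0_AB + a2
            energy -= s6 * c6[i, j] / (r ** 6 + f ** 6)
            energy -= s8 * c8[i, j] / (r ** 8 + f ** 8)
    return energy
```

For the model polarizability the integral in Eq. (8) reduces to (3/2) α0A α0B ωA ωB / (ωA + ωB), which provides a check on the quadrature. The pair energy stays finite when two atoms coincide, reflecting the constant short-range limit mentioned above, and approaches −s6 C6/R^6 at large separation.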

Although these DFs do not fully solve the “bouquet of DFT puzzles” [81] (see also [82]), the modern functionals considered in this work are expected to provide, for “non-exotic” electronic structure problems, on average a significantly higher accuracy than what was typically obtained a decade ago. This will be illustrated by selected sets from our GMTKN30 [17] database and a few “real-life” examples involving large but in a chemical sense prototypical systems. We consider three standard dispersion corrected hybrids with two flavors of dispersion correction (B3LYP-NL, PW6B95-D3, and PBE0-D3), two RSHs which have been constructed consistently by including dispersion in the empirical fittings (ωB97X-D3 [62] and ωB97X-V [83]), and three versions of double-hybrid DFs which are based on different construction principles (DSD-PBEP86-D3 [84], PWPB95-D3 [17], and PBE0-DH [85]). The highly parametrized M06-2X meta-hybrid DF [86] is included because of its widespread use and to investigate the question of how much accuracy is lost by including only the medium-range dispersion energy [87]. For comparison, dispersion uncorrected B3LYP results are also given. Table 1 provides an overview and some properties of the investigated functionals.

Table 1 Overview of the investigated density functionals

Rather important in practical applications is the amount of non-local Fock-exchange included as determined by the mixing parameter \( a_x \) (in RSHs the non-local exchange contribution normally reaches 100% for large inter-electronic distances). It determines the magnitude of the so-called self-interaction error (SIE) which leads to too low reaction barriers, too loosely bound electrons, and over-delocalized electronic structures [88–90]. In particular, these problems may arise in unsaturated, radical-containing structures. Most DHDFs employ larger \( a_x \) values and hence suffer less from SIE; the unwanted effects of the larger Fock-exchange contribution are compensated by the orbital-dependent correlation energy. The influence of SIE on thermochemical properties and its structural dependence in large systems is much less clear than that of the dispersion energy.
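The two mixing ideas of Eqs. (5) and (6) can be sketched in Python as follows. The first function returns the fraction of Fock-exchange acting at a given inter-electronic distance according to the error-function splitting; the optional short-range fraction `a_x_sr` is a hypothetical generalization added here for illustration and is zero in the plain splitting of Eq. (5). The second function evaluates the double-hybrid correlation energy of Eq. (6). Function and parameter names are assumptions.

```python
import math


def fock_exchange_fraction(r12, mu, a_x_sr=0.0):
    """Fraction of Fock-exchange at distance r12 for range-separation parameter mu.

    The long-range weight erf(mu * r12) and the short-range weight
    erfc(mu * r12) sum to one, as in Eq. (5). a_x_sr (hypothetical) admixes a
    fixed share of Fock-exchange in the short-range part.
    """
    long_range = math.erf(mu * r12)
    short_range = math.erfc(mu * r12)
    return long_range + a_x_sr * short_range


def double_hybrid_correlation(e_c_gga, e_c_pt2, a_c, e_c_disp=0.0):
    """Eq. (6): mix of semi-local and PT2 correlation plus a dispersion term."""
    return (1.0 - a_c) * e_c_gga + a_c * e_c_pt2 + e_c_disp
```

The fraction rises monotonically from `a_x_sr` at zero distance to one at large distance, and setting `a_c = 0` in the second function recovers the correlation energy of a normal dispersion-corrected hybrid.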

2.3 Technical Details of Quantum Chemical Calculations

The most important issue to consider in practical calculations is the choice of the one-particle atomic orbital (AO) basis set which is used to expand the orbitals in the Kohn–Sham approach and to generate excitation spaces in PT2 or CCSD(T) treatments. While thermochemical results with hybrid DFs are typically already close to convergence with properly polarized triple-ζ type basis sets (e.g., def2-TZVP [44] or cc-pVTZ [91]), DHDFs require larger sets due to the presence of the perturbation term [58]. Therefore, we employ here in single-point energy computations the def2-QZVP basis deprived of g-functions on non-hydrogen atoms and f-functions on hydrogen and lithium atoms. In standard notation this basis reads for first- and second-row elements [7s4p3d2f]/[4s3p2d]. According to many tests performed over the years for various reactions (see, e.g., [24, 58, 92]), this basis set level provides reaction energies within 1–2 kcal/mol of the complete basis set (CBS) limit for the cases considered here. Because basis set effects are largest for energies but only moderate in structure optimizations (see, e.g., [93]), the def2-TZVP level is sufficient for the latter purpose and hence most structures are based on TPSS-D3/def2-TZVP optimizations. The structures in the subsets of the GMTKN30 [17] database in Sect. 3.1 were taken without modification. For the palladium atoms (DIMPD reaction) an ECP (SD(28,MWB) [94]) with the respective ECP basis set was used. In this case, the geometries used were obtained at the PBE-D3/def2-TZVP level. The MP2 energies used in the DLPNO-CCSD(T) calculations are based on def2-TZVP/def2-QZVP extrapolations according to the procedure of Halkier et al. [95, 96]. For the two-electron Coulomb integrals the RI-approximation [97–99] was used which speeds up the computations remarkably without any significant loss of accuracy when optimized auxiliary basis sets [100, 101] are used.
The MP2 results are also based on RI-treatments [102] with the corresponding exchange-type auxiliary basis sets [103]. All orbital-dependent correlation energies were obtained within the frozen (chemical) core approximation. The numerical quadrature grid m5 was generally employed for the integration of the exchange-correlation contribution. For the M06-2X calculations with ORCA the larger grid 7 was used because results with this functional are known to be strongly grid-dependent [58, 104]. Moreover, it was recently reported [105] that the M06-2X functional produces artificially large basis set superposition errors (BSSE) even with very large AO basis sets indicating some numerical instability.

All electronic energy and frequency calculations were conducted with TURBOMOLE [106, 107] or ORCA [108, 109] codes which provide practically identical results for similar technical settings. The COSMO-RS corrections were obtained from the COSMOtherm [110] software package. For further numerical details and discussions of the back-correction scheme, see Grimme [39]. All calculations involved in the DLPNO-CCSD(T) treatment were conducted with the ORCA code.

3 Examples

3.1 Thermochemical Benchmark Sets

A very convenient and unbiased way to assess the “global” accuracy of DFs is using the so-called GMTKN30 [17] database developed in our group over several years [111]. This benchmark covers 30 subsets related to general main group thermochemistry, kinetics, and NCI. In total, it encompasses 1,218 single-point calculations and 841 data points (relative energies). It therefore turned out to be ideal for evaluation and development of DFT methods. Here we utilize only parts of the GMTKN30 database and concentrate on four prototypical benchmarks for “true” chemical reactions which are described below. Intermolecular NCI have been studied extensively in recent years [9, 10, 112, 113] and are not considered here. It is noted that all tested functionals in this work perform well for NCI as long as they are corrected for long-range London dispersion effects by, e.g., D3 [55, 56] or NL(VV10) [57, 59] methods. As usual, fixed molecular structures are used and all energies are vibrational zero-point energy exclusive, which can conveniently be compared to the result of a standard QC calculation.
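The relation between single-point calculations and data points is worth spelling out: each relative energy is a stoichiometry-weighted sum of several total energies, which is why the number of calculations exceeds the number of data points. A minimal Python sketch is given below; the function name, the species labels, and the energies are hypothetical.

```python
HARTREE_TO_KCAL = 627.5095   # conversion factor, kcal/mol per Hartree


def reaction_energy(stoich, energies_hartree):
    """Reaction energy in kcal/mol from total energies in Hartree.

    stoich maps species names to stoichiometric coefficients, negative for
    reactants and positive for products.
    """
    delta = sum(nu * energies_hartree[name] for name, nu in stoich.items())
    return HARTREE_TO_KCAL * delta


# Hypothetical reaction A + B -> C with made-up total energies.
energies = {"A": -1.00, "B": -2.00, "C": -3.01}
de = reaction_energy({"A": -1, "B": -1, "C": 1}, energies)
```

The same function covers barrier heights and isomerization energies by choosing the coefficients accordingly.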

The MB08-165 subset: the “mindless-benchmark” set (MB08-165) was introduced by Korth and Grimme [16]. It contains 165 randomly created so-called “artificial molecules” (AMs) with varying constituents. For these molecules, decomposition energies into their hydrides (for elements of main groups 1–4) and homonuclear diatomics (for elements of main groups 5–7) were calculated. For these reactions, estimated CCSD(T)/CBS reference values were computed. In contrast to other benchmark sets, MB08-165 is less biased towards certain chemical aspects, as it contains artificial systems only. Korth and Grimme assessed a variety of density functionals and could reproduce nicely the Jacob’s Ladder scheme, with higher rung functionals yielding better results. We chose MB08-165 to be the first subset of our benchmark study as it can be regarded as one of the most important for general thermochemistry, in particular in difficult situations when the electronic structure is not fully clear. Although the stated objective of this investigation is to focus on non-exotic cases, we believe it is also important to test the limits of DFT. Compared to the other subsets, it contains a large number of reference values and rather large reaction energies (117 kcal/mol on average, with an energy range from −570.6 to 433.7 kcal/mol).

The G2RC subset: this set contains 25 reactions whose reactants and products are part of the G2/97 set of heats of formation [6]. Reference energies were calculated from the vibrationally back-corrected experimental data of Curtiss et al. [6]. The G2RC set comprises 47 single point calculations and has an average absolute reaction energy of 50.6 kcal/mol, with an energy range from −1.0 to −212.7 kcal/mol. It contains relatively small molecules (benzene being the largest) but is non-trivial because chemically large structural changes (e.g., transformation of multiple to single bonds) occur. In contrast to MB08-165, however, it covers only the conventional chemical space.

The BH76 subset: a fusion of the HTBH38 [114] and NHTBH38 [12] databases by Truhlar and co-workers. HTBH38 contains forward and reverse barriers of 19 hydrogen atom transfer reactions. NHTBH38 comprises 38 barriers of 19 heavy atom transfer, nucleophilic substitution, unimolecular, and association reactions. Reference values are based on high-level W1 calculations and “best theoretical estimates” (see [12, 114] for more details). The combined BH76 test set involves 95 single point calculations and has an average barrier height of 18.5 kcal/mol, with an energy range from −15.5 to 106.2 kcal/mol. Because the reaction energy considered always involves a transition state in which chemical bonds are partially broken, the results are sensitive to the treatment of the SIE in approximate DF.

The ISOL24-6 subset: Huenerbein et al. recently published a new benchmark set containing 24 isomerization reactions (ISOL24 [15]) of large molecules covering a wide range of different compounds, e.g., a sugar, a steroid, an organic dye, hydrocarbons, and large molecules containing heteroatoms. As reference, estimated SCS-MP3/CBS(TQ) was used. In contrast to the popular ISO34 set [14], which is a part of GMTKN30, the large size of the molecules casts additional light on effects that are important in “real life” organic chemistry. These are in particular intramolecular London-dispersion effects. Charged systems are also considered. This set has been reconsidered by Truhlar et al. [115] and for a selection of smaller systems (with 24–35 atoms) new, higher-level CCSD(T) reference values were computed. The energy range is from 4.7 to 33.5 kcal/mol and the average reaction energy is 13.6 kcal/mol.

The statistical data (mean and maximum deviations) for the four benchmark sets are given in Table 2 along with an overall performance measure (mean of the four MAD values) for each DF. Because the sets are of varying complexity and the reaction energies are also of different magnitude, we mostly consider the relative performance of the DF and take the maximum deviation as a measure of robustness (which should be proportional to the number of expected outliers). We have also investigated a weighted MAD similar to that of Goerigk and Grimme [17], but have found virtually no difference in relative performances.
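The performance measures named above are straightforward to compute. The Python sketch below assumes that MAD denotes the mean absolute deviation from the reference values and MaxD the largest absolute deviation; the function names and the data layout (one pair of lists per benchmark set) are hypothetical.

```python
def mad(calc, ref):
    """Mean absolute deviation of calculated from reference energies."""
    if len(calc) != len(ref):
        raise ValueError("calc and ref must have the same length")
    return sum(abs(c - r) for c, r in zip(calc, ref)) / len(ref)


def max_dev(calc, ref):
    """Largest absolute deviation, used here as a robustness measure."""
    if len(calc) != len(ref):
        raise ValueError("calc and ref must have the same length")
    return max(abs(c - r) for c, r in zip(calc, ref))


def mean_mad(sets):
    """Unweighted mean of the per-set MAD values.

    sets maps a benchmark name to a (calculated, reference) pair of lists.
    """
    return sum(mad(calc, ref) for calc, ref in sets.values()) / len(sets)
```

Because every set enters `mean_mad` with equal weight, a set with large reaction energies can dominate the overall measure, which is one reason to look at relative performance and at MaxD alongside it.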

Table 2 Statistical results for the thermochemical benchmark sets

From the mean \( \overline{\mathrm{MAD}} \) one can identify three groups of DFs with increasing accuracy: (1) PBE0-D3, PBE0-DH, and ωB97X-D3 with values in the range 4.5–5.4 kcal/mol and a largest MaxD value >36 kcal/mol; (2) B3LYP-NL, ωB97X-V, and PW6B95-D3, with values in the range 3.3–4.1 kcal/mol and largest MaxD values in the range 17–34 kcal/mol; (3) M06-2X, PWPB95-D3, and DSD-PBEP86-D3 with MAD values of 1.6–2.5 kcal/mol and largest MaxD values of 13–25 kcal/mol. According to these four benchmarks, the PWPB95-D3 and DSD-PBEP86-D3 double-hybrids are clearly the best performers. They provide the smallest MAD for all sets, the lowest mean MAD, and always very low MaxD values. The other double-hybrid (PBE0-DH) performs less well, which is not unexpected because it misses a dispersion correction and the non-local perturbation contribution is much smaller than for the other two DHDFs. The only moderately good result for ωB97X-V is somewhat disappointing because this functional tries to account for all exchange-correlation effects, and is based on extensive parametrization and/or empirical searches for best performing functional components. The improved accuracy of ωB97X-V compared to ωB97X-D3 at least shows that some progress can be made even on a relatively high accuracy level at a formal cost lower than for most DHDFs. The good performance of B3LYP with the density dependent dispersion correction is noteworthy and, as mentioned before [59], the combination of an over-repulsive hybrid GGA part with a (partly overbinding) medium-range correlation part (VV10) is beneficial due to systematic error compensation. B3LYP-NL yields mostly small MaxD values and only fails for the barrier heights (similar to PBE0-D3) due to the SIE introduced by the too small Fock-exchange component. 
Although in particular the LYP part in B3LYP is sometimes responsible for larger errors [17, 116], this functional in dispersion-corrected form (and this also holds for B3LYP-D3) can still be recommended for thermochemistry at least for comparison (error estimation) purposes. However, at a similar numerical complexity and empiricism level, PW6B95-D3 provides in general better results and hence represents our default hybrid DF. It is often close to the performance of M06-2X which unfortunately is numerically unstable as noted above. Nevertheless, it is clear that all “modern” functionals perform better than standard B3LYP which yields large errors, in particular for the MB08-165 and ISOL24-6 sets. The only moderately positive effect of the NL(VV10) dispersion correction to the B3LYP results is mostly rooted in the small molecules considered so far. Whether this picture also prevails in larger, more realistic situations will be discussed below.

3.2 Chemical Reaction Examples

In the following sections we discuss the results for four “real-life” chemical applications involving rather large systems: the dissociation energy of a substituted hexaphenylethane (HEXAPE), the dissociation energy of a substituted, dimeric hydroquinone derivative (DHCH), the ligand exchange and dissociation reaction of a dimeric palladium species (DIMPD), and the activation of H2 by a so-called frustrated Lewis pair (FLP), which is a very active field of chemical research [117]. The results for all tested functionals are given in Table 3; the reaction formulas and structures are shown graphically in each subsection.

Table 3 Results for the four chemical reaction energies ΔE (in kcal/mol)

3.2.1 Hexaphenylethane

The predominant view in chemistry is that bulky groups in molecular structures are repulsive rather than stabilizing. In particular, the widespread misconception that a tert-butyl group only acts repulsively has recently been challenged by Grimme and Schreiner, who re-investigated the textbook case of hexaphenylethane (Fig. 1) [118]. The stabilization of normal covalent bonds by dispersion in large systems is rather obvious from general considerations of the size and distance dependence of the dispersion energy [31]. Here we discuss a case in which the dispersion interaction between seemingly “innocent” ligands provides the main driving force for binding, meaning that without these forces the system would spontaneously dissociate. Hexaphenylethane is known for its inability to form a stable C(sp3)–C(sp3) single bond [118]. A delicate balance of covalent bonding, dispersion attraction, and Pauli repulsion forces between the phenyl rings and attached substituents can be expected when the central C–C bond is broken. It was long an open question, however, why the parent molecule hexaphenylethane cannot be synthesized whereas the seemingly more sterically overcrowded all-meta tert-butyl-substituted derivative can be isolated. It was shown that its thermodynamic stability and the instability of the parent molecule can be fully explained by dispersion-corrected DFT computations and that the tert-butyl groups stabilize the molecule compared to its radical fragments by as much as 40 kcal/mol. Because the system is very large (212 atoms) and involves open-shell species and various interaction types, we consider the computation of the dissociation energy ΔE to be non-trivial. Unfortunately, the experimental ΔG is not precisely known. A value around zero with an error bar of ±2 kcal/mol is compatible with the observations [118] from which, after back-correction, a gas phase ΔE value of 33 ± 4 kcal/mol for the formation of the two trityl-type radicals is deduced.
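
The back-correction step can be illustrated with a short sketch. It assumes the common additive scheme in which the measured free energy is the sum of the gas phase electronic energy, a thermostatistical correction, and a solvation correction, and it assumes that independent uncertainties combine in quadrature. The function names, the split between the two correction terms, and the individual error bars are hypothetical; only ΔG ≈ 0 kcal/mol and the resulting ΔE of about 33 kcal/mol come from the text.

```python
import math


def back_correct(dg_exp, dg_thermal, ddg_solv):
    """Gas phase electronic reaction energy from a measured free energy.

    Assumes the additive scheme dG_exp = dE + dG_thermal + ddG_solv,
    with all quantities in kcal/mol.
    """
    return dg_exp - dg_thermal - ddg_solv


def combined_uncertainty(*sigmas):
    """Combine independent uncertainties in quadrature (an assumption)."""
    return math.sqrt(sum(s * s for s in sigmas))


# dg_exp is the experimental estimate quoted in the text. The two correction
# terms are HYPOTHETICAL and chosen only so that their sum (-33 kcal/mol)
# is consistent with the quoted dE of about 33 kcal/mol.
delta_e = back_correct(dg_exp=0.0, dg_thermal=-23.0, ddg_solv=-10.0)

# Hypothetical error bars for experiment, thermal term, and solvation term.
sigma = combined_uncertainty(2.0, 3.0, 2.0)

print(f"dE = {delta_e:.1f} +/- {sigma:.1f} kcal/mol")
```

The sketch makes explicit that the uncertainty of the back-corrected ΔE is necessarily larger than that of the measurement itself, because the errors of the computed corrections add to it.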

Fig. 1

Reaction formula and optimized structures for the all-meta substituted hexaphenylethane (HEXAPE). In the structure on the lower left, hydrogens have been omitted for clarity

As can be seen from the results in Table 3, most functionals come close to this value, but we also note a rather large spread of the computed values (5.6–60.6 kcal/mol). Because of the missing dispersion effects in B3LYP, its error is huge (>70 kcal/mol) so that the molecule becomes unbound. In accordance with this observation, the lowest dissociation energies result from M06-2X and PBE0-DH. The former DF only includes the medium-range dispersion energy but misses the important long-range component, and the DHDF does not include enough dispersion through the perturbation part, as noted above. The other examples discussed below support this finding, and this underbinding tendency of M06-2X was also observed for non-equilibrium (stretched) van der Waals complexes [119]. Good performers with values within the error bar of the reference value are B3LYP-NL, PBE0-D3, and PW6B95-D3 with deviations of about 3 kcal/mol (10%). Considering the size and complexity of the system, this agreement between theory and experiment is very satisfactory.

3.2.2 A Zero Free Dissociation Energy Bond

It was recently shown experimentally [120] that the 2,6-di-tert-butyl-4-methoxyphenoxyl radical dimerizes in solution and in the solid state (Fig. 2). In the same study, a Gibbs free energy of bond dissociation around zero (−0.2 ± 0.1 kcal/mol) was measured using optical and IR spectroscopy. The authors also provided results of DFT calculations in the gas phase in the supporting information. However, the calculations were done with a rather small basis set and only one DF was applied without any consideration of solvation effects. Thus, a more detailed theoretical investigation is justified and we take this example here to test our selection of modern DFs.

Fig. 2

Reaction formula and optimized structures for the substituted quinone dimer (DHCH)

Similar to the previous example, the dimer is stabilized by dispersion interactions between the tert-butyl substituents on the aromatic ring. While the stabilization here is surely smaller than in the case of the HEXAPE, the counteracting Pauli-repulsion is also smaller, resulting in a similar ΔG value in solution. As will be seen below, the DFT description of the monomers suffers more from spin-overdelocalization, which leads to an overstabilization of the monomers with respect to the dimer. This makes the system, though smaller than the previous one, very challenging.

The DFT results for the dissociation are given in Table 3. From the experimental free dissociation energy, a back-corrected value for ΔE of 17 ± 2 kcal/mol has been deduced. Thus, the central C–C bond is even weaker than in the hexaphenylethane example but, similarly, uncorrected B3LYP yields an unbound molecule due to missing dispersion effects. Good DF performers are in this case only the two range-separated DFs and one of the DHDFs with dispersion correction, namely PWPB95-D3. Even though these functionals counter the SIE to some degree, they still overestimate the stability of the radicals by 4–5 kcal/mol. It should be noted here that M06-2X performs better than most other functionals due to its large amount of Fock-exchange (54%), but cannot match the accuracy of the above-mentioned methods, which can in part be attributed to the missing long-range dispersion energy in this functional. The M06-2X value reported here deviates by about 3 kcal/mol from that given in the original publication (11.6 kcal/mol), which can be attributed to the basis set superposition error caused by the small basis set used in the original publication, which leads to artificial overbinding. The strong overbinding of DSD-PBEP86-D3 by about 10 kcal/mol with respect to the reference value is probably related to the higher spin contamination of the monomer compared to the other functionals.

The system is a good example of a case in which both SIE and dispersion effects must be considered, and in which an accuracy of 2–3 kcal/mol or better poses a challenge for even the most sophisticated DFs.

3.2.3 Transition Metal Complex

As a third example, a chemical reaction which involves a di-palladium complex was studied (see Fig. 3), for which the reaction enthalpy in solution was measured experimentally by Djukic et al. and back-corrected to pure electronic energies [41]. In this reaction, the Pd–Pd bond is quenched by a triphenylphosphane ligand, yielding the corresponding monopalladium complex. The best estimate experimental reference value for the reaction energy is ΔE = −32 ± 3 kcal/mol, where the largest source of error is the Gibbs free solvation energy correction (for a comprehensive discussion including different ligands and methods, see [41]). The relatively large uncertainty of ±3 kcal/mol reflects the problems associated with obtaining reliable reference values for large systems, and a relative error of about ±10% is a realistic estimate. Hence, it should be emphasized once again that the term “chemical accuracy” has to be adjusted for the thermochemistry of large molecules which are the focus of this review. Several KS-DFT methods were tested concerning their ability to reproduce the experimental reference value of ΔE for the investigated reaction. For comparison, it should be noted that the plain HF method predicts a qualitatively wrong (endothermic) reaction energy, which clearly shows that the driving force for this reaction is electron correlation. There is no strong steric crowding around the Pd center, so it is easily accessible for the ligand, and the base can readily approach the Lewis acid. It is also not the relatively large medium-range correlation but the significant long-range contribution to the dispersion energy which renders this reaction particularly difficult. Not surprisingly, KS-DFT methods without dispersion correction are unable to reproduce the experimental reference value, e.g., the hybrid functional B3LYP undershoots the reaction energy by more than 60% and also the M06-2X functional underbinds as noted before.
The best performance of all tested methods is given by B3LYP with NL correction but the meta-hybrid PW6B95 with D3 correction and the ωB97X-V functional are also able to reproduce the reaction energy almost quantitatively. Moreover, the B3LYP-NL reaction energy is in reasonable agreement with the PW6B95-D3 result, thus indicating the D3 dispersion correction is physically sound. The two double hybrids PWPB95-D3 and DSD-PBEP86-D3 suffer from an overestimation of the reaction energy on the MP2 level and hence slightly overshoot the reaction energy. In contrast, the PBE0-DH functional, which does not include the D3 correction, underestimates the reaction energy due to the missing long-range dispersion interaction. It is encouraging to see that after describing the physics correctly, i.e., proper treatment of medium and long-range correlation effects, KS-DFT methods can reproduce the experimental reaction energy with sufficient accuracy, thus giving the right answer for the right reason. This finding is in agreement with a previous study of ligand substitution energies for transition metal complexes [23]. Thus, we are confident that the available modern KS-DFT methods are also well suited for studying the thermochemistry of larger transition metal complexes.

Fig. 3

Reaction formula and optimized structures for the palladium dimer ligand exchange reaction (DIMPD)

3.2.4 Dihydrogen Activation by an FLP

Activation of dihydrogen is typically a domain of transition metal chemistry and even Nature uses metal centered reactions to split the dihydrogen molecule in the hydrogenase enzymes. There is a recent development in the use of metal-free systems for H2-activation. Stephan, Erker, and others have described so-called frustrated Lewis pairs (FLP), i.e., pairs of Lewis acids and bases, which do not quench each other due to the steric bulk of their substituents and heterolytically split the H2-molecule [117, 121]. Phosphane/borane pairs such as the system considered here (see Fig. 4) react rapidly and effectively with H2 to yield the corresponding phosphonium cation/hydridoborate anion pair [122]. These systems have been used as active metal-free hydrogenation catalysts [123] and an increasing number of related systems that appear in the literature can activate many other small molecules (e.g., alkenes, CO, CO2, NO [124–127]). The quantum chemical description of FLP reactions has attracted a lot of interest recently [92, 128–132]. Here we study the reaction energy of the original [B(C6F5)3]/[P(tBu)3] system [122] with H2. Because the thermodynamic properties have never been measured accurately (the reaction is practically irreversible in common solvents, i.e., ΔG ≪ 0) we take as reference the value of −21.6 kcal/mol from a DLPNO-CCSD(T)/CBS treatment which should be accurate to about 1–2 kcal/mol. Note that the reaction energy is calculated relative to the weakly bound [B(C6F5)3]/[P(tBu)3] donor-acceptor complex and separate H2 and not with respect to all free reactants (which would lower the reaction energy by about 15 kcal/mol [129]).

Fig. 4

Reaction formula and optimized structures for dihydrogen activation by a frustrated Lewis pair (FLPH2)

As can be seen from Table 3, the DFT results are relatively close to each other and also to the reference value. The span of the values from −18.3 to −24 kcal/mol is much smaller than in the previous cases and the DFT results nicely bracket the DLPNO-CCSD(T)/CBS reference value. Although the structural changes are large, i.e., splitting of a strong single bond and formation of a zwitterionic structure with two new bond types, it seems that the FLP reaction is electronically rather easy to treat by DFT. Part of the reason for this somewhat surprising finding (which, however, has been noted before for a comparison of MP2, SCS-MP2, and B97-D methods [129]) is that the non-covalent interactions are similar in the FLP and the reaction product so that most errors in the theoretical treatment cancel. This view is supported by the relatively good results of uncorrected B3LYP. The best results with deviations of only about 1 kcal/mol are provided by ωB97X-V and DSD-PBEP86-D3. The worst performers are B3LYP-NL and M06-2X with deviations of 3.3 and 2.5 kcal/mol, respectively.
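
The statement that the DFT values bracket the reference can be checked with a few lines of arithmetic. The sketch below uses only the reference value and the two ends of the span quoted above; the dictionary labels are illustrative and are not assigned to particular functionals. Because the upper end of the span is quoted in rounded form (−24 kcal/mol), the computed deviation comes out as 2.4 rather than the 2.5 kcal/mol given in the text.

```python
# Reference and span limits as quoted in the text (kcal/mol).
reference = -21.6  # DLPNO-CCSD(T)/CBS reaction energy
span = {"least exothermic": -18.3, "most exothermic": -24.0}


def signed_deviation(calculated, ref):
    """Signed error of a calculated reaction energy (calculated - reference)."""
    return calculated - ref


# Round to one decimal to match the precision of the quoted values.
deviations = {
    label: round(signed_deviation(value, reference), 1)
    for label, value in span.items()
}

# The reference is bracketed if it lies strictly between the two extremes.
brackets_reference = min(span.values()) < reference < max(span.values())

print(deviations, brackets_reference)
```

Errors of opposite sign at the two ends of the span are what distinguishes bracketing from a systematic over- or underestimation by all functionals.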

4 Summary and Conclusion

The accurate description of the electronic energy part of the thermodynamic properties in large molecule reactions still represents some challenge to theory. However, the benchmarks presented and the four “real-life” chemical problems show that significant progress has been achieved in DFT in recent years. The modern density functionals investigated from hybrid and double-hybrid rungs of “Jacob's Ladder” mostly perform very well when properly corrected for long-range London dispersion effects. The basic reason for this is that larger molecules are significantly more stabilized by the intramolecular dispersion energy than smaller ones, and in particular dissociation reactions definitely require dispersion corrections in DFT. This also explains why the M06-2X functional, which lacks long-range dispersion, performs less well for all larger “real-life” examples. Note that these findings are not evident from still widely employed small molecule benchmark sets. Similar conclusions apply to the performance analysis of the dispersion-uncorrected PBE0-DH double-hybrid functional. In contrast, the “old-fashioned” B3LYP approach shows reasonably good performance in its tested NL(VV10)-corrected form. Somewhat better accuracy than with global hybrids (which include non-locality only for the exchange part) can be achieved with double-hybrids that include a perturbative orbital-dependent correlation energy. Of the three tested variants, the PWPB95-D3 functional, which is also computationally efficient due to the use of opposite-spin orbital correlation only, shows the best overall performance. The two range-separated functionals tested from the ωB97X family provide good results in particular when the self-interaction error is relevant (e.g., for delocalized radicals or reaction barriers), but some outliers are also noted. This is tentatively attributed to the underlying Taylor-expansion of the B97-type GGA part.
Nevertheless, we reiterate that the error estimates of the experimental measurements of substantial reaction energies in large molecular systems do not justify the supposition of a chemical accuracy of 1 kcal/mol. Rather, we suggest a relaxation to 2–3 kcal/mol and note that the modern functionals investigated are not too far away from this bound. Concerning just the electronic energy we conclude that DFT in combination with nowadays possible large-scale DLPNO-CCSD(T) calculations opens a bright future for theoretical thermochemistry.

However, comparisons to experimental data under typical conditions require inclusion of thermal and entropic effects in the gas phase together with corrections for solvation. Since solvation and entropic contributions almost never cancel and some reactions are entirely driven by solvation (e.g., those leading to zwitterions or ion pairs), their accurate account is mandatory. Comparison of the results from different continuum solvation models (not shown here) for large but not very polar systems indicates that an accuracy of 1–2 kcal/mol for the solvation free energy contribution to a reaction is not easy to achieve. Similar estimates are obtained for the error of the thermostatistical calculation of reaction entropies. Further work along these lines together with improved density functionals should allow routine calculations with 2 kcal/mol accuracy or better for even larger systems than treated in this review in the foreseeable future.