Introduction

Semiempirical molecular orbital theory (SEMO) is an umbrella term for a family of methods that originated in the work of John Pople in the 1960s as approximations to minimal basis Hartree–Fock (HF) theory [1,2,3]. The main hallmarks of these methods are the neglect of the atomic orbital (AO) overlap matrix (S = 1) in the secular equation and the neglect of a large number of (multi-center) two-electron integrals. While these approximations were originally intended to simply maintain the invariance properties of HF, they could partially be justified later on [4,5,6].

Even though it quickly became apparent that ab initio minimal basis HF is a woefully inadequate model chemistry, SEMO methods have survived to this day [7,8,9,10,11,12,13]. This is due to a shift in philosophy most prominently associated with Michael Dewar and Michael Zerner [14,15,16,17,18]. Instead of understanding SEMO as an approximation to HF, they simply used it as a parametric framework, which was fitted to experimental data. This allowed them to obtain thermochemical (Dewar) and spectroscopic (Zerner) predictions with (at the time) unrivaled accuracy and efficiency. Importantly, the physical basis of the SEMO equations ensured that these methods were reasonably transferable, compared to fully empirical force fields. At the same time, two different approaches to the development of SEMO methods emerged: the first exploits parametric flexibility in a “brute-force” fitting strategy on as large and diverse a training set as possible, whereas the second pursues improvements of the physical model, which is then parametrized on a carefully selected, limited training set [19,20,21].

The fully empirical route of the first approach brings the danger of unexpected (and sometimes catastrophic) failures for systems outside the training set. As an example, to improve hydrogen bond energies, additional Gaussian potentials were introduced in AM1 (and used in most PMx methods) [13, 22, 23]. These corrections do improve hydrogen bond geometries (though not necessarily energies [24]), but they lead to non-covalent alkyl-alkyl equilibrium distances of below 1 Å. Ironically, dispersion corrections to PM6 therefore contain an additional empirical repulsive potential between hydrogen atoms [25, 26]. While this solves the specific issue of alkyl-alkyl interactions, it will likely cause new problems elsewhere (e.g., in the covalent bond of H2). Overall, patching problems caused by empirical corrections with new empirical corrections resembles the old theory of epicycles in Ptolemaic astronomy.

In the second approach, the underlying non-parametric SEMO model is modified to come as close as possible to the parent minimal basis HF and to retain the speed of the SEMO methods, as in Thiel’s OMx methods [6, 27]. Nevertheless, recovering minimal basis HF is not the goal per se due to the known inadequacy of this method as a modern model chemistry. Instead, the hope is that given a better model (closer to HF), it should be easier to introduce and fit parameters to reproduce experimental and high-level theoretical properties and that the resulting method will be more robust in situations outside the training set. Indeed, the OMx methods greatly outperform methods of the first approach for very challenging systems with peculiar electronic structure [27, 28]. We argue, however, that the next-generation SEMO models should target not HF, but quantum chemical methods that include correlation in their model. These SEMO models should also remain as fast as modern general-purpose SEMO methods.

To approach the design of such next-generation SEMO models, we have to reconsider what the SEMO model is supposed to be approximating. In particular, it has long been argued that the empirical scaling of two-electron integrals in methods like MNDO (modified neglect of diatomic overlap) can be justified as an incorporation of dynamical correlation [14, 29]. If that is the case, then it is potentially misleading to look to HF for guidance regarding the desired form and properties of SEMO Hamiltonians. After all, HF by definition contains no electron correlation [30].

In this paper, we consider the question posed in the title in light of these considerations. To this end, we briefly review the equations of MNDO and related methods as they are derived in the context of HF. We then consider how correlation is introduced in other HF-like theories based on ab initio wavefunction theory (WFT) and Kohn–Sham (KS) density functional theory (DFT). From this analysis, we argue that the introduction of correlation effects into the SEMO Fock matrix is inconsistent with the use of a HF-like total energy expression. Alternative expressions are discussed, and a consistent parameterization strategy is proposed. Finally, some preliminary numerical results are shown, indicating the feasibility of improved, consistent SEMO methods.

Theory

Electron correlation in semiempirical Hamiltonians

The central equation to be solved in HF and other mean-field electronic structure methods is a generalized matrix eigenvalue problem in a basis of non-orthogonal AO basis functions [31]:

$$ \boldsymbol{FC}=\boldsymbol{SC}\boldsymbol{\varepsilon}, $$
(1)

with the Fock matrix F, the molecular orbital (MO) coefficient matrix C, the AO overlap matrix S and the diagonal MO eigenvalue matrix ε. This is typically solved by orthogonalizing the Fock matrix so that [32]:

$$ {}^{\lambda}\boldsymbol{F}={\boldsymbol{S}}^{-1/2}\boldsymbol{F}{\boldsymbol{S}}^{-1/2} $$
(2)

and

$$ {}^{\lambda}\boldsymbol{C}={\boldsymbol{S}}^{1/2}\boldsymbol{C} $$
(3)

This leads to the canonical eigenvalue problem:

$$ {}^{\boldsymbol{\lambda}}\boldsymbol{F}{}^{\boldsymbol{\lambda}}\boldsymbol{C}={}^{\boldsymbol{\lambda}}\boldsymbol{C}\boldsymbol{\varepsilon}, $$
(4)

which can be solved by diagonalization of \( {}^{\lambda}\boldsymbol{F} \). In MNDO and related methods, the approximation is made that \( {}^{\lambda}\boldsymbol{F}=\boldsymbol{F} \), so that Eq. (4) can be solved directly. The matrix elements of F resemble the HF ones, i.e.:

$$ {F}_{\mu \nu}={T}_{\mu \nu}+{V}_{\mu \nu}+\sum \limits_{\lambda \sigma}^{AO}{P}_{\lambda \sigma}\left[\left(\mu \nu |\lambda \sigma \right)-\frac{1}{2}\left(\mu \lambda |\nu \sigma \right)\right]. $$
(5)

Here, Tμν and Vμν are the one-electron kinetic and potential energy contributions, (μν| λσ) are two-electron repulsion integrals and Pλσ are density matrix elements defined as:

$$ {P}_{\lambda \sigma}=\sum \limits_i^{occ}{n}_i{C}_{\lambda i}{C}_{\sigma i}, $$
(6)

with the summation going over all occupied MOs. Here ni are orbital occupation numbers.

For convenience, the first two terms on the r.h.s of Eq. (5) can be combined into the core Hamiltonian matrix Hcore, whereas the remaining terms are grouped into the two-electron matrix G (i.e., F = Hcore + G). The two-electron matrix can be further decomposed into the Coulomb (J) and exchange matrices (K).
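The orthogonalization route of Eqs. (1)–(4) and the density matrix of Eq. (6) can be illustrated with a minimal NumPy sketch. The function names and the two-orbital matrices below are made up for illustration and do not correspond to any real molecule or to the MNDO program.

```python
import numpy as np

def lowdin_solve(F, S):
    """Solve the generalized problem F C = S C eps via symmetric orthogonalization."""
    # S^{-1/2} from the eigendecomposition of the positive definite overlap matrix
    s_val, s_vec = np.linalg.eigh(S)
    S_inv_half = s_vec @ np.diag(s_val ** -0.5) @ s_vec.T
    # Eq. (2): Fock matrix in the orthogonalized basis
    F_orth = S_inv_half @ F @ S_inv_half
    # Eq. (4): ordinary symmetric eigenvalue problem
    eps, C_orth = np.linalg.eigh(F_orth)
    # Invert Eq. (3): C = S^{-1/2} C_orth
    C = S_inv_half @ C_orth
    return eps, C

def density_matrix(C, occ):
    """Eq. (6): P = sum_i n_i C_i C_i^T, where occ holds n_i for every MO."""
    return (C * occ) @ C.T

# Toy two-orbital example with made-up numbers
F = np.array([[-1.0, -0.5], [-0.5, -0.8]])
S = np.array([[1.0, 0.4], [0.4, 1.0]])
eps, C = lowdin_solve(F, S)
P = density_matrix(C, np.array([2.0, 0.0]))
```

Setting S to the unit matrix in this sketch reproduces the simplification made in MNDO-type methods, where the Fock matrix is diagonalized directly.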

In MNDO and related methods, \( {H}_{\mu \nu}^{core} \) is directly given via empirical expressions. Meanwhile, G is strongly simplified via the neglect of diatomic differential overlap (NDDO) approximation, which leads to a neglect of most two-electron integrals according to [5]:

$$ \left({\mu}_A{\nu}_B\right.\left.|{\lambda}_C{\sigma}_D\right)={\delta}_{AB}{\delta}_{CD}\left({\mu}_A{\nu}_A\right.\left.|{\lambda}_C{\sigma}_C\right) $$
(7)

where the notation μA indicates that basis function μ is centered on atom A.
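The selection rule of Eq. (7) can be sketched as a mask on a four-index integral array. This is only an illustration under the assumption that all integrals are available in chemists' notation; actual NDDO codes never compute the discarded integrals in the first place. The function name and the basis layout are hypothetical.

```python
import numpy as np

def apply_nddo(eri, atom_of):
    """Zero every (mu nu|lambda sigma) that is not retained by Eq. (7).

    eri     : array of shape (n, n, n, n) in chemists' notation
    atom_of : atom_of[mu] is the index of the atom carrying basis function mu
    """
    atom_of = np.asarray(atom_of)
    # delta_AB for every pair (mu, nu)
    same = atom_of[:, None] == atom_of[None, :]
    # delta_AB * delta_CD for every quadruple (mu, nu, lambda, sigma)
    mask = same[:, :, None, None] & same[None, None, :, :]
    return np.where(mask, eri, 0.0)

# Hypothetical layout: two basis functions on atom 0 and one on atom 1
atom_of = [0, 0, 1]
eri = np.ones((3, 3, 3, 3))
eri_nddo = apply_nddo(eri, atom_of)
```

In this toy layout, 25 of the 81 integrals survive, and the surviving ones involve at most two centers.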

As an aside, this approximation (which amounts to a neglect of all three- and four-center integrals and some two-center two-electron integrals) is only partially justified in an orthogonal Löwdin basis for the valence-only minimal basis set, meaning that the assumption that \( {}^{\lambda}\boldsymbol{F}=\boldsymbol{F} \) and the NDDO approximation are not entirely consistent with each other [6]. Consequently, the reintroduction of the generalized eigenvalue equation into one of the least accurate SEMO models led to noticeable improvements [33]. In addition, the widely used OMx methods remedy the imbalance in the one-electron integrals, which among the NDDO-retained integrals are most strongly affected by the orthogonalization [6, 7].

In the formulation of the MNDO method, Dewar and Thiel used an empirical expression for the remaining two-electron integrals (μAνA| λCσC). This expression is based on a multipole expansion of the densities μAνA and λCσC, which is asymptotically correct for large distances rAC. Meanwhile, in the limit rAC = 0, the values of the one-center two-electron integrals (μAνA| λAσA) should be recovered (which were derived from experimental data). In MNDO, this is achieved via the Dewar–Sabelli–Klopman (DSK) scaling function.

It was observed that this scaling leads to semiempirical integrals that are smaller than the corresponding ab initio ones. Since the one-center integrals are derived from spectroscopic data (which obviously include all electron correlation effects), this was interpreted as an average inclusion of dynamical correlation. Indeed, the mean-field potential of HF is known to be overly repulsive, an effect that is, e.g., reflected in erroneous one-electron properties such as dipole moments [34, 35].
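The effect of the damping can be illustrated with the simplest term of such a scheme, the monopole-monopole (ss|ss) interaction. The sketch below assumes a Klopman–Ohno-type form in atomic units and omits all higher multipoles; the one-center value `g` is an illustrative number, not a fitted parameter. The bare 1/r curve is used only as a simple comparison, since genuine ab initio integrals also deviate from 1/r at short range.

```python
import numpy as np

def dsk_ss_integral(r, g_a, g_b):
    """Damped monopole-monopole repulsion in atomic units (assumed Klopman-Ohno form).

    r        : interatomic distance in bohr
    g_a, g_b : one-center integrals (ss|ss) of atoms A and B in hartree
    """
    rho_a = 0.5 / g_a   # chosen so that the r = 0 limit for A = B returns g_a
    rho_b = 0.5 / g_b
    return 1.0 / np.sqrt(r ** 2 + (rho_a + rho_b) ** 2)

g = 0.47                                  # illustrative value, not a fitted parameter
r = np.array([0.0, 1.0, 2.0, 5.0, 50.0])
se = dsk_ss_integral(r, g, g)
point_charge = 1.0 / r[1:]                # bare Coulomb limit, undefined at r = 0
```

The damped values interpolate between the one-center integral at r = 0 and the 1/r limit at large separation, and they stay below 1/r at every finite distance.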

To make this implicit correlation treatment more apparent, we can write the semiempirical (SE) integrals as:

$$ \left({\mu}_A{\nu}_A\right.{\left.|{\lambda}_C{\sigma}_C\right)}_{SE}=\left({\mu}_A{\nu}_A\left.|{\lambda}_C{\sigma}_C\right)\right.+{V}_{\mu \nu \lambda \sigma}^C\left({r}_{AC}\right) $$
(8)

where we have introduced a correlation potential \( {V}_{\mu \nu \lambda \sigma}^C\left({r}_{AC}\right) \), which is defined as the difference between the semiempirical and ab initio integrals. Using the definitions of the Coulomb and exchange matrix elements (Eq. 5), we can define an analogous matrix VC for the correlation potential, which adds up the SE corrections (Eq. 8) to J and K. With this, we can reformulate the SEMO Fock matrix as:

$$ {F}_{\mu \nu}^{SEMO}={H}_{\mu \nu}^{core}+{J}_{\mu \nu}-{K}_{\mu \nu}+{V}_{\mu \nu}^C $$
(9)

It should be emphasized that the point of this reformulation of the MNDO equations is not to actually implement the method in this way, but rather to make the treatment of electron correlation more obvious, and to facilitate comparison with WFT and DFT methods. In general, Eq. (9) defines a correlated single determinant (independent particle) theory [36,37,38].
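Because J and K are linear in the two-electron integrals, the correlation potential matrix follows from contracting the integral differences of Eq. (8) with the density matrix. The sketch below makes this explicit. The random arrays and the uniform scaling factor of 0.9 are purely illustrative assumptions, and the factor 1/2 of Eq. (5) is absorbed into K.

```python
import numpy as np

def two_electron_matrix(P, eri):
    """G = J - K, with J_mn = sum P_ls (mn|ls) and K_mn = 1/2 sum P_ls (ml|ns)."""
    J = np.einsum("ls,mnls->mn", P, eri)
    K = 0.5 * np.einsum("ls,mlns->mn", P, eri)
    return J - K

def correlation_potential(P, eri_se, eri_ab_initio):
    """V^C of Eq. (9): the integral corrections of Eq. (8) contracted with P."""
    return two_electron_matrix(P, eri_se - eri_ab_initio)

rng = np.random.default_rng(0)
n = 3
P = rng.random((n, n))
P = 0.5 * (P + P.T)                       # made-up symmetric density matrix
eri_ab_initio = rng.random((n, n, n, n))  # stand-in for ab initio integrals
eri_se = 0.9 * eri_ab_initio              # illustrative uniform down-scaling
V_c = correlation_potential(P, eri_se, eri_ab_initio)
```

By construction, adding `V_c` to the two-electron matrix built from the unscaled integrals reproduces the matrix built from the scaled ones, which is the content of Eq. (9).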

Correlation potentials in Kohn–Sham density functional theory

In an AO basis set, the KS equations closely resemble Eq. (9) [39]:

$$ {F}_{\mu \nu}^{KS}={H}_{\mu \nu}^{core}+{J}_{\mu \nu}+{V}_{\mu \nu}^{XC} $$
(10)

Here, \( {V}_{\mu \nu}^{XC} \) is a matrix element of the exchange-correlation potential (Vxc) in the AO basis [40]. Indeed, hybrid functionals like B3LYP or PBE0 also include a scaled contribution of HF exchange (Kμν), further increasing the structural similarity between the KS and SEMO equations [41, 42].

Let us consider the properties of Vxc and the corresponding KS determinants. Most importantly, the exact Vxc is the potential that yields the correct ground-state density of the system [40, 43]. Consequently, a good approximation to Vxc should yield a determinant that accurately reflects the exact electron density and related one-electron properties such as dipole moments and molecular electrostatic potentials.

A second important property of the exact Vxc is that the eigenvalue of the highest occupied molecular orbital (HOMO) of the corresponding self-consistent KS determinant is exactly equal to the vertical ionization potential (IP) of the system [44]. Furthermore, Bartlett has formulated a more general IP theorem, based on the adiabatic time-dependent DFT equation [36, 45, 46]. According to this, all occupied orbital eigenvalues should correspond to vertical ionization potentials. Baerends and coworkers found this to be true for a broad range of molecules when using the statistical averaging of orbital potentials (SAOP) approximation to Vxc [47].

The exact Vxc is not known in general, but it can be determined for specific systems, e.g., via the optimized effective potential method or by inverting accurate electron densities [48,49,50,51,52]. Such studies offer numerical confirmation that the above conditions (in particular regarding ionization potentials) are indeed properties of the ideal Vxc. For general applications, many approximations to Vxc exist, which fulfill these conditions to a greater or lesser extent [45, 53].

In this context, it should be noted that the focus of most functional developers has been on the exchange-correlation functional (Exc, see below), and not on the potential [54]. This is attractive because the corresponding potential can in principle be derived for any functional via the functional derivative with respect to the electron density ρ:

$$ {V}_{xc}\left(\boldsymbol{r}\right)=\frac{\delta {E}_{xc}\left[\rho \right]}{\delta \rho \left(\boldsymbol{r}\right)} $$
(11)

Meanwhile, the inverse is not true, i.e., it is in general not possible to derive a functional that corresponds to some known potential. However, the convenience of Eq. (11) is misleading. The consistent potentials derived in this way tend to be of poor quality, which is reflected, e.g., in an erroneous description of transition states and other self-interaction-based problems. Burke and coworkers have characterized these problems as “density-driven” errors [55, 56]. It is telling that these issues can (to a large extent) be fixed by combining HF densities non-self-consistently with common density functionals [57, 58]. In other words, having no correlation potential seems to be better than having a poor one in many cases.
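As a concrete instance of Eq. (11), consider the local density approximation for exchange alone (the Dirac/Slater expression, used here purely as a textbook illustration and not as part of any SEMO method). Because the integrand is a simple power of the density, the functional derivative can be taken pointwise:

$$ {E}_x^{LDA}\left[\rho \right]=-{C}_x\int \rho {\left(\boldsymbol{r}\right)}^{4/3}d\boldsymbol{r},\kern1em {C}_x=\frac{3}{4}{\left(\frac{3}{\pi}\right)}^{1/3} $$

$$ {V}_x^{LDA}\left(\boldsymbol{r}\right)=\frac{\delta {E}_x^{LDA}\left[\rho \right]}{\delta \rho \left(\boldsymbol{r}\right)}=-\frac{4}{3}{C}_x\rho {\left(\boldsymbol{r}\right)}^{1/3} $$

Integrating this potential against the density gives \( \int {V}_x^{LDA}\rho\, d\boldsymbol{r}=\frac{4}{3}{E}_x^{LDA} \), which differs from the energy functional itself. Potential and functional therefore carry different information, even in this simplest case.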

Correlation potentials in wavefunction theory methods

Conventionally, higher-level WFT methods are based on a canonical HF reference, without any correlation contribution to the Fock matrix. The HF equations yield the variationally optimal reference, i.e., the one with the lowest energy. There are, however, also alternative choices to be made, such as Brueckner, natural or orbital-optimized references [38, 59,60,61,62,63].

These alternative determinants by definition have a higher energy than the canonical HF one. At first sight, this may seem like a drawback, as it means that more energy has to be recovered by the correlation treatment. However, these references are optimal in different ways. For instance, the Brueckner determinant has the largest overlap with the exact wavefunction and orbital-optimization yields the determinant which minimizes the total energy (i.e., the Hartree–Fock energy, EHF, plus the correlation energy, EC) [64]. These determinants are therefore more directly tied to the correlated wavefunction. The electrons are not simply subject to the mean-field potential, but feel an additional correlation potential [38]. The corresponding generalized Fock matrices thus formally resemble Eq. (9).

As for DFT, we can ask what the properties of these correlation potentials and the related determinants are. Hesselmann and Jansen showed that the one-electron and response properties of the Brueckner determinant are in significantly better agreement with experiment than those of the uncorrelated HF determinant [35]. The same is true for natural orbitals, which are obtained from diagonalization of correlated one-electron reduced-density matrices. By construction, they therefore reflect the more accurate electron density of the correlated treatment. Similarly, orbital-optimized MP2 calculations have been shown to produce accurate spin densities [62]. Finally, there is a close relationship between the extended Brueckner Fock matrix and correlation corrections to Koopmans’ theorem, meaning that the correlation potential also improves the orbital energies [65, 66].

Though not usually counted as WFT methods, we should also mention electron propagator and Green’s function approaches, such as the Outer Valence Green’s Function and GW approximations [38, 65, 67,68,69,70]. Here, electrons interact with a screened potential that is described by a so-called “self-energy” term. Again, the methods can formally be written via a generalized Fock matrix resembling Eq. (9), with the self-energy corresponding to the correlation potential [65]. The eigenvalues of this equation then correspond to vertical IPs (and electron affinities).

Comparison between SEMO, DFT, and WFT methods

As a short summary of the previous three sections, we note that: 1) The effect of scaling two-electron integrals in SEMO corresponds to introducing a correlation potential into the Fock matrix. 2) The exact exchange-correlation potential in KS-DFT leads to a determinant that has the exact ground state electron density and occupied orbital eigenvalues, which correspond to vertical IPs. 3) Correlation potentials in WFT also improve the electron density and eigenvalue spectra (but not the energy of the determinant) relative to HF.

As both DFT and WFT are in principle exact, the corresponding exact correlation potentials must have the same properties. This should also be true for good approximations in both theories, although the KS potentials of popular density functional approximations like BLYP are actually rather poor [49]. While SEMO correlation potentials have not usually been considered explicitly, the parameterization of the methods typically uses reference dipole moments and (principal) IPs. The recent hpCADD-Hamiltonian of Clark and coworkers further includes molecular electrostatic potentials [71]. In light of the above discussion, this is a very good choice. Furthermore, the use of experimentally (or theoretically) derived one-center two-electron integrals ensures that the SEMO Hamiltonian is tied to the correlated ionization potentials of isolated atoms [72, 73].

In this sense, the established practice in the SEMO community is vindicated by our analysis. However, heats of formation and molecular geometries also play a very significant role in the parameterization. As we will discuss in the following section, this should be addressed independently.

Total energy expressions

The total electronic energy Eel in HF and SEMO is calculated via:

$$ {E}_{el}^{HF}=\frac{1}{2}\sum \limits_{\mu \nu}^{AO}{P}_{\mu \nu}\left[{H}_{\mu \nu}^{core}+{F}_{\mu \nu}\right]=\sum \limits_{\mu \nu}^{AO}{P}_{\mu \nu}\left[{H}_{\mu \nu}^{core}+\frac{1}{2}{G}_{\mu \nu}\right] $$
(12)

As is well known, this total electronic energy is not equal to the sum of occupied MO eigenvalues, sometimes referred to as the band energy Eband:

$$ {E}_{band}=\sum \limits_i^{occ}{n}_i{\varepsilon}_i=\sum \limits_i^{occ}\sum \limits_{\mu \nu}^{AO}{n}_i{C}_{\mu i}{C}_{\nu i}{F}_{\mu \nu}=\sum \limits_{\mu \nu}^{AO}{P}_{\mu \nu}\left[{H}_{\mu \nu}^{core}+{G}_{\mu \nu}\right] $$
(13)

It follows that the band energy includes a double counting of two-electron terms, which can be removed to obtain the total energy, in terms of Eband [74]:

$$ {E}_{el}^{HF}=\sum \limits_i^{occ}{n}_i{\varepsilon}_i-\frac{1}{2}\sum \limits_{\mu \nu}^{AO}{P}_{\mu \nu}{G}_{\mu \nu} $$
(14)

This gives the total electronic energy of a single determinant in HF. However, as discussed above, the inclusion of a correlation potential does not improve (in fact worsens) the energy of the reference determinant. From a total energy perspective, the benefit of using a correlated determinant is only seen when it is combined with an explicit expression for the correlation energy, e.g., via coupled cluster (CC) or perturbation theory (so-called post-HF methods).
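The equivalence of Eqs. (12) and (14), and the double counting contained in Eq. (13), can be checked numerically. The sketch below uses random symmetric matrices as stand-ins for Hcore and G in an orthogonal basis (S = 1). As a simplifying assumption, G is held fixed rather than built self-consistently from P, which does not affect the algebraic identities.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_occ = 4, 2

def random_symmetric(n):
    a = rng.standard_normal((n, n))
    return 0.5 * (a + a.T)

# Made-up symmetric matrices standing in for H_core and G (not self-consistent)
H_core = random_symmetric(n)
G = random_symmetric(n)
F = H_core + G

eps, C = np.linalg.eigh(F)
occ = np.array([2.0] * n_occ + [0.0] * (n - n_occ))
P = (C * occ) @ C.T                       # Eq. (6)

e_band = np.sum(occ * eps)                # Eq. (13)
e_hf_12 = 0.5 * np.sum(P * (H_core + F))  # Eq. (12)
e_hf_14 = e_band - 0.5 * np.sum(P * G)    # Eq. (14)
```

Both energy expressions agree to numerical precision, while the band energy differs from them by exactly half of the two-electron contribution.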

There has in fact been a long history of applying such post-HF treatments to SEMO Hamiltonians. Most prominently, the configuration interaction (CI) method can be combined with SEMO determinants to obtain excited state properties [8, 75,76,77,78,79]. While there is some concern with respect to double-counting between the implicit and explicit description of correlation, this does not seem to be a major issue as long as the CI treatment is mostly used to describe static correlation within a limited active space [78].

In contrast, explicit correlation treatments for ground-state properties are rarely used, although the MNDO/C method was explicitly parameterized for ground-state energetics, in combination with second-order perturbation theory [29]. This approach is formally well justified in our view, but the use of traditional correlated quantum chemistry methods in a minimal basis set does not yield satisfactory results, as high-angular momentum polarization functions are necessary to adequately describe the electron-electron cusp [80]. From a more practical point of view, using a WFT-based correlation energy expression significantly impacts the main advantage of SEMO methods, namely their computational efficiency.

It is therefore worthwhile to look at DFT, where the correlated total energy is obtained at a mean-field cost. This is achieved through the exchange-correlation functional Exc. In analogy to Eq. (14), the total electronic energy in DFT can be written in terms of a sum of orbital energies and correction terms [74, 81]:

$$ {E}_{el}^{DFT}=\sum \limits_i^{occ}{n}_i{\varepsilon}_i-{E}_H\left[\rho \right]-\int {V}_{xc}(r)\rho (r) dr+{E}_{xc}\left[\rho \right] $$
(15)

Here, EH[ρ] is the Hartree energy (corresponding to Jμν in Eq. 9), and the integral over Vxc corresponds to the contributions of Kμν and \( {V}_{\mu \nu}^C \). In other words, the first three terms in Eq. (15) are analogous to the semiempirical version of Eq. (14), where \( {V}_{\mu \nu}^C \) is implicitly included in Gμν via the scaled integrals. However, Eqs. (12) and (14) contain no analogue of the last term Exc[ρ].

From the point of view advocated in this paper (namely that the semiempirical scaling of two-electron integrals corresponds to an implicit definition of a correlation potential analogous to Vxc), the use of Eq. (12) or (14) to determine the electronic energy in SEMO is therefore inconsistent.

How can this inconsistency be resolved? The straightforward solution would be to include an explicit Exc[ρ]. However, this is not ideal for several reasons. Firstly, the quality of SEMO electron densities is limited due to the minimal basis set and frozen core approximation. Secondly, Exc[ρ] is generally evaluated with numerical quadrature, which has negligible computational cost in the context of a full first-principles DFT calculation, but may significantly impact the efficiency of a SEMO approach. Finally, as we alluded to above, there is no simple prescription for obtaining the Exc[ρ] that corresponds to a given Vxc.

It is instead worth looking to semiempirical density functional based tight-binding (TB) schemes such as DFTB [82,83,84,85,86]. These methods also avoid the explicit evaluation of Exc[ρ], despite being approximations to DFT. In the simplest case, the total energy of a TB scheme is given by:

$$ {E}_{tot}^{TB}=\sum \limits_i^{occ}{n}_i{\varepsilon}_i+\frac{1}{2}\sum \limits_{A,B}^{atoms}{V}_{rep}\left({r}_{AB}\right) $$
(16)

Here, Vrep(rAB) are pair-potentials, which fold in all effects from the last three terms of Eq. (15), as well as the core–core repulsion between atoms A and B. This term is purely empirical, but it has been shown that the sum of these contributions is indeed approximately pairwise and short-ranged in DFT [74]. Formally, it resembles empirical modifications of the core–core terms that are also known in SEMO methods based on Eq. (12) (used, e.g., to improve the description of hydrogen bonds) [87]. Similar pairwise expressions have also been used by Grimme et al. to correct basis-set insufficiencies [88, 89].
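A minimal sketch of Eq. (16) is given below. It assumes that the double sum excludes A = B, so that the factor 1/2 is equivalent to a sum over unordered atom pairs. The exponential pair potential with a cutoff is a hypothetical placeholder, not a fitted Vrep.

```python
import numpy as np

def tb_total_energy(eps, occ, coords, v_rep):
    """Eq. (16): band energy plus pairwise repulsive potentials.

    eps, occ : orbital energies and occupation numbers
    coords   : (n_atoms, 3) Cartesian coordinates
    v_rep    : callable v_rep(i, j, r) giving the pair potential of atoms i, j
    """
    e_band = float(np.dot(occ, eps))
    e_rep = 0.0
    n_atoms = len(coords)
    # Unordered pairs i < j replace 1/2 * sum over ordered pairs with A != B
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            r = float(np.linalg.norm(coords[i] - coords[j]))
            e_rep += v_rep(i, j, r)
    return e_band + e_rep

def toy_v_rep(i, j, r, a=10.0, b=2.0, r_cut=3.0):
    """Hypothetical short-ranged repulsion, switched off beyond r_cut."""
    return a * np.exp(-b * r) if r < r_cut else 0.0

eps = np.array([-1.0, -0.5])
occ = np.array([2.0, 0.0])
coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
e_tot = tb_total_energy(eps, occ, coords, toy_v_rep)
```

In this toy geometry only the first atom pair lies within the cutoff, so a single pair term is added to the band energy.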

Determining the total energy of a semiempirical Hamiltonian via Eq. (16) defines a hybrid TB-SEMO method. In the tight-binding literature, the short-ranged, repulsive nature of Vrep(rAB) is attributed to a cancellation between double-counting terms, Exc[ρ] and the core–core repulsion [74]. This indicates that the use of Eq. (16) might be advantageous over two other possible alternatives.

One of them is to add an additional nucleus-electron potential to the core Hamiltonian [90]. This technique allows for the description of noncovalent interactions, which are not described adequately by HF and SEMO methods without dispersion corrections, but are described adequately by post-HF methods that include correlation explicitly. This recovers the long-range correlation contribution, but not the complete correlation energy.

Another approach is to model the effects of the correlation functional explicitly:

$$ {E}_{el}^{SEMO}=\sum \limits_i^{occ}{n}_i{\varepsilon}_i-\frac{1}{2}\sum \limits_{\mu \nu}^{AO}{P}_{\mu \nu}{G}_{\mu \nu}+{E}_C^{SE} $$
(17)

Here, \( {E}_C^{SE} \) represents some semiempirical expression for the correlation energy, which remains to be defined, although empirical and machine-learning (ML)-based approximations for the correlation energy have been reported [91,92,93,94]. From a formal perspective, Eq. (17) mirrors Møller–Plesset perturbation theory. Specifically, if the Hamiltonian is partitioned as

$$ \hat{H}=\hat{F}+\lambda \hat{V}, $$
(18)

the first and second terms on the r.h.s of Eq. (17) correspond to the MP0 and MP1 energies, whereas the last term in principle contains all higher orders (and in practice would be truncated, e.g., to second order). As mentioned above, the explicit calculation of the correlation energy in a minimal basis set is not desirable, however.

Independent of the final form of the total energy expression, the above arguments have important consequences with respect to the parameterization procedure. In the spirit of a correlated orbital theory, all SEMO parameters that directly affect the Fock matrix elements (electronic parameters) should be optimized to fulfill the known properties of a correlation potential. Specifically, only IPs and one-electron properties (such as dipole moments) should be used as reference data at this stage. Clearly, this will lead to a poorer performance for thermochemistry and molecular geometries, if the electronic energy is determined according to Eqs. (12) or (14). These errors should then be corrected by a further parametric expression (e.g., Vrep(rAB)), which now includes the effects of Exc[ρ].

Importantly, the parameterization of this term is not performed simultaneously with the electronic parameters, but subsequently. In other words, first the electronic parameters are optimized to correctly describe a reference set of IPs and dipole moments (for molecules with fixed geometries). Second, the pair potentials are parameterized to correctly describe thermochemistry and geometries. Besides being physically motivated, this separation of potential and energy functional also has practical advantages for the parameterization process. We expect that the error function for both steps should be significantly smoother, compared to when thermochemical and electronic reference data are included simultaneously.

Results and discussion

In the previous section, we analyzed the nature of the implicit description of electron correlation in SEMO methods. This analysis showed that (1) a single-particle correlation potential leads to improved one-electron properties (orbital energies, dipole moments, etc.) and (2) evaluating the total energy of a correlated determinant with the HF energy expression does not lead to improved energies. We concluded that the parameterization of SEMO Hamiltonians should focus on one-electron properties (rather than energies and molecular geometries), and that the SEMO total energy expression needs an additional term to account for the correlation energy. Unfortunately, the full development and parameterization of a new SEMO method along these lines is beyond the scope of this manuscript. In the following, we provide some numerical evidence that such a method is feasible, however.

Specifically, we construct a prototype method using the unmodified SEMO formalism for the Fock matrix. We decided to build this TB-SEMO method based on the OM2 [7, 95, 96] Hamiltonian, which is known [27, 97] to be among the most robust NDDO-based SEMO methods. The Hamiltonian is reparameterized using only one-electron properties as reference data. We then apply this Hamiltonian in an energy expression like Eq. (16), and explore the properties of the corresponding repulsive potential Vrep(rAB). All OM2 calculations were performed with the development version of the MNDO program [98].

We started construction of the TB-OM2 method by reparametrization of the OM2 technique to minimize the sum of squares of errors (SSQ) in IPs (calculated according to Koopmans’ theorem) and dipole moments for the set of fixed reference geometries. The training set was the CHNO set [27] used in parametrization of the OM2 method. We optimized all OM2 parameters except for the one-center two-electron integrals and parameters for the effective core potentials. First, we optimized each parameter one after the other. Second, we simultaneously re-optimized all parameters obtained in the first step. We used the Subplex optimization algorithm as implemented in the NLopt library [99, 100]. Weighting factors for errors in IPs and dipole moments were adjusted so that their SSQs were numerically equal for calculations with the standard OM2 parameters used as the initial guess in reparametrization (\( \frac{w_{IP}}{w_{Dipole}}=1.04259 \)). Optimized parameter values of TB-OM2 are given in Table 1.
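The balancing of the two error contributions can be sketched as follows. The sketch assumes that the weights multiply the SSQ terms (rather than the individual errors) and that the dipole weight is fixed to one; the text does not specify either choice. All numerical values are made-up placeholders, not data from the CHNO set.

```python
import numpy as np

def ssq(calc, ref):
    """Sum of squared errors between calculated and reference values."""
    calc, ref = np.asarray(calc, float), np.asarray(ref, float)
    return float(np.sum((calc - ref) ** 2))

def koopmans_ip(eps, n_occ):
    """Koopmans' estimate of the first IP: minus the HOMO eigenvalue (eps ascending)."""
    return -eps[n_occ - 1]

def balanced_weight_ratio(ip_calc, ip_ref, dip_calc, dip_ref):
    """w_IP / w_Dipole for which both weighted SSQs are equal at the initial guess."""
    return ssq(dip_calc, dip_ref) / ssq(ip_calc, ip_ref)

def objective(ip_calc, ip_ref, dip_calc, dip_ref, w_ip, w_dip=1.0):
    """Weighted SSQ to be minimized with respect to the electronic parameters."""
    return w_ip * ssq(ip_calc, ip_ref) + w_dip * ssq(dip_calc, dip_ref)

# Placeholder numbers for illustration only
ip_ref, ip_calc = [10.0, 12.0, 9.5], [10.3, 11.8, 9.9]
dip_ref, dip_calc = [1.8, 0.0, 1.5], [1.6, 0.1, 1.9]
w_ip = balanced_weight_ratio(ip_calc, ip_ref, dip_calc, dip_ref)
initial_value = objective(ip_calc, ip_ref, dip_calc, dip_ref, w_ip)
```

With this choice, both properties contribute equally to the objective at the starting parameters, so neither dominates the early stages of the optimization.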

Table 1 Parameters of the TB-OM2 method and their deviations from the standard OM2 parameters (in parentheses) for H, C, N, and O elements. See Ref. [7] for the definitions of the OM2 parameters

The thus obtained TB-OM2 Hamiltonian performs systematically much better for IPs and dipole moments than the standard OM2 Hamiltonian as is clear from the correlation plots between calculated and reference values for the CHNO set (Figs. 1 and 2). The correlation coefficients are higher, the slope of the linear trend line is closer to unity, and its y-intercept is closer to zero for TB-OM2 compared to OM2. The mean absolute errors (MAEs) in IPs are 0.13 and 0.30 eV at TB-OM2 and OM2, respectively. MAEs in dipole moments are 0.13 and 0.27 D at TB-OM2 and OM2, respectively. We note that the errors of OM2 somewhat decrease for IPs (MAE of 0.26 eV) and dipole moments (MAE of 0.25 D) when full optimizations are performed at this level of theory (this is how the OM2 method was parametrized), but they remain much higher than errors of TB-OM2 [27]. The linear trend lines are very similar for IPs and dipole moments calculated with OM2 regardless of whether geometry optimizations were performed or not at this level of theory.
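The statistical measures used in this comparison (slope and intercept of the linear trend line, correlation coefficient, and MAE) can be computed as in the following sketch. It assumes that the calculated values are regressed on the reference values; the input numbers are synthetic and serve only to exercise the function.

```python
import numpy as np

def correlation_stats(calc, ref):
    """Trend line calc = slope * ref + intercept, Pearson r, and mean absolute error."""
    calc, ref = np.asarray(calc, float), np.asarray(ref, float)
    slope, intercept = np.polyfit(ref, calc, 1)
    r = np.corrcoef(ref, calc)[0, 1]
    mae = np.mean(np.abs(calc - ref))
    return {"slope": slope, "intercept": intercept, "r": r, "mae": mae}

# Synthetic data: calc = 2 * ref + 1 exactly
ref = np.array([1.0, 2.0, 3.0, 4.0])
calc = 2.0 * ref + 1.0
stats = correlation_stats(calc, ref)
```

A perfect method would give a slope of one, an intercept of zero, and a vanishing MAE, which is why these three quantities are reported together.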

Fig. 1 Correlation between reference ionization potentials (IPref) and IPs calculated with the OM2 (left) and TB-OM2 (right) methods on the reference geometries (IPcalc)

Fig. 2 Correlation between reference dipole moments (μref) and dipole moments calculated with the OM2 (left) and TB-OM2 (right) methods on the reference geometries (μcalc)

The TB-OM2 Hamiltonian thus performs quite well for the targeted properties. It is, of course, not very surprising that the new method outperforms the standard OM2 for these properties, given the parameterization. However, we do find it notable that such high accuracy can be achieved in absolute terms. This means that the minimal-basis NDDO framework can faithfully represent the exact one-electron potentials. This confirms that SEMO methods are an attractive basis for the development of an efficient correlated orbital theory.

Nevertheless, the new parameters are not suitable for total energy calculations following the conventional theory (i.e., based on Eq. 12). Properties depending on total energies are described poorly, e.g., errors in heats of formation at 298 K exceed 100 kcal/mol for most of the molecules in the CHNO set. Errors in relative energies are also unacceptably large. Geometry optimizations lead to wrong structures.

These deficiencies can potentially be corrected with a new energy expression following Eq. (16). While developing these new approximations is beyond the scope of this study, it is instructive to investigate the properties of Eband (i.e., the sum of occupied orbital energies). This energy is the first term in Eqs. (16) and (17) and should recover most of the long-ranged electronic interactions. We can therefore use Eband to probe the properties of the unknown term in Eq. (16), i.e., Vrep(rAB).
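For illustration, Eband can be evaluated from a set of orbital energies as sketched below. The sketch assumes a closed-shell system in which the lowest orbitals are each doubly occupied (occupation-weighted sum); the function name `band_energy` and the example eigenvalues are hypothetical and are not taken from this work.

```python
import numpy as np


def band_energy(orbital_energies, n_electrons):
    """Sum of occupied orbital energies, weighted by occupation numbers.

    Assumption (not from the text): closed-shell aufbau filling, i.e. the
    n_electrons / 2 lowest orbitals are doubly occupied.
    """
    eps = np.sort(np.asarray(orbital_energies, dtype=float))
    if n_electrons < 0 or n_electrons % 2 != 0:
        raise ValueError("closed-shell sketch needs an even, non-negative electron count")
    n_occ = n_electrons // 2
    if n_occ > eps.size:
        raise ValueError("not enough orbitals for the requested electron count")
    # Each occupied spatial orbital contributes twice its energy
    return 2.0 * float(np.sum(eps[:n_occ]))


if __name__ == "__main__":
    # Placeholder eigenvalues in arbitrary units (illustration only)
    print(band_energy([-1.0, -0.5, 0.3], n_electrons=4))
```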

To this end, we apply a bond-projection procedure used in the DFTB literature [101]. Assuming that Vrep(rAB) is a short-ranged pairwise potential, its values can be determined from rigid potential energy surface scans along bond coordinates. This was performed for the symmetric stretch of all C–H bonds in methane and along the C–C bond in ethane. Figures 3 and 4 show the forces obtained by differentiation of the PES with respect to these coordinates. As a reference, the same curves were computed at the PBE0/def2-TZVP level.

Fig. 3 Plot of forces obtained from the differentiation of rigid PES scans along the C–H bond length in methane

Fig. 4 Plot of forces obtained from the differentiation of rigid PES scans along the C–C bond length in ethane

In both cases, the curves obtained from Eband are in reasonable agreement with the reference at large distances (i.e., r > 2.5 Å for C–C and r > 1.5 Å for C–H). This is because the systems dissociate into neutral atoms, meaning that long-range core–core interactions are screened. Furthermore, electronic effects are well represented by Eband in this regime. In contrast, the curves deviate strongly in the bonding and repulsive regions, where these conditions no longer hold. The plots also show the difference between the two curves, ΔF. The pairwise potential correcting the error of Eband (i.e., Vrep(rAB)) can now be obtained by integrating ΔF (see Fig. 5).

Fig. 5 Pairwise potentials obtained by integrating the ΔF curves in Figs. 3 and 4. In the case of methane, the potential is divided by the number of C–H bonds (four)
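The bond-projection steps described above (differentiating the scanned energies, forming ΔF, and integrating it to a pairwise potential) can be sketched numerically as follows. This is an illustrative sketch under several assumptions that are not taken from the text: forces are defined as negative energy derivatives, ΔF is taken as reference minus Eband, the potential is set to zero at the largest scanned distance, and finite differences with trapezoidal integration are used. The function name and the model curves in the usage example are hypothetical.

```python
import numpy as np


def bond_projected_potential(r, e_band, e_ref, n_bonds=1):
    """Return (delta_f, v_rep) from rigid scans along one bond coordinate.

    r       : increasing bond lengths of the scan
    e_band  : band energies at each r
    e_ref   : reference total energies at each r
    n_bonds : number of equivalent bonds stretched simultaneously
              (e.g. four for the symmetric C-H stretch in methane)
    """
    r = np.asarray(r, dtype=float)
    e_band = np.asarray(e_band, dtype=float)
    e_ref = np.asarray(e_ref, dtype=float)
    # Forces as negative derivatives along the scan coordinate (assumed sign)
    f_band = -np.gradient(e_band, r)
    f_ref = -np.gradient(e_ref, r)
    delta_f = f_ref - f_band
    # V(r) = integral of delta_f from r to r_max, with V(r_max) = 0,
    # accumulated from the right with the trapezoidal rule
    segments = 0.5 * (delta_f[1:] + delta_f[:-1]) * np.diff(r)
    v_rep = np.zeros_like(r)
    v_rep[:-1] = np.cumsum(segments[::-1])[::-1]
    return delta_f, v_rep / n_bonds


if __name__ == "__main__":
    # Model curves for illustration only (NOT data from this study)
    r = np.linspace(0.8, 3.0, 441)
    e_band = -1.0 / r
    e_ref = e_band + 5.0 * np.exp(-3.0 * r)
    delta_f, v_rep = bond_projected_potential(r, e_band, e_ref)
    print(v_rep[0], v_rep[-1])
```

Fixing the potential to zero at the outer end of the scan mirrors the expectation that Vrep(rAB) is short-ranged; a different integration constant would only shift the curve.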

The resulting potentials behave analogously to what is typically observed in DFTB, i.e., they are short-ranged and mostly repulsive. Importantly, they tend to zero between the typical first and second nearest neighbor distances. This allows choosing appropriate cutoff values for these potentials, an essential requirement for a robust pairwise correction.

While these initial results are encouraging, the important question of how transferable the potentials are to other bond types and hybridizations remains open. Preliminary investigations for acetylene and ethylene were inconclusive because wavefunction instabilities were observed for the dissociation curves of these systems. Indeed, semiempirical methods are very prone to such instabilities, in particular for π-systems [102, 103]. Consequently, the bond-projection scheme applied above is likely not well suited for determining a ‘general’ Vrep(rAB). Instead, a force-matching approach for equilibrium structures may be a better choice.

Conclusions

In this work, we discussed the approaches used to incorporate electron correlation in the existing semiempirical molecular orbital (SEMO) theory, ab initio wavefunction theory, density functional theory, and density functional-based tight-binding (TB) methods. We outlined the design of next-generation TB-SEMO methods, which include correlation via explicit DFTB-like pair potentials (the TB part of the model) that also incorporate core–core repulsions. In these methods, the parametrization of the SEMO part of the model targets only ionization potentials and dipole moments, and this part then enters the TB-SEMO model via the sum of orbital energies. We provided a numerical proof-of-concept for such a method based on the OM2 method. The optimized parameters of this experimental TB-OM2 method provide a significant improvement over the OM2 method for ionization energies and dipole moments. We then demonstrated that the pairwise potentials needed to complete the TB-OM2 model are short-ranged and mostly repulsive, in analogy to the DFTB case.