1 Introduction

The extended Hückel theory [1–3] (EHT) has a long history of development and application to various types of systems [4–8]. At the beginning of the computational chemistry era, it was among the most successful models of molecular interactions, combining computational efficiency with physical transparency. With the advent of efficient computers, interest in the EHT method gradually declined, because relatively small molecules could be treated at a higher level of theory in reasonable computational time. Still, the EHT method keeps attracting researchers, mainly because of its great computational efficiency, which matters when relatively large systems [9] are to be studied or when one is interested in the time evolution of electronic states of large-scale systems [10–16].

Currently, the EHT approach, in its original formulation or in a slightly modified form, is used to study electron transport (conductivity) [17–22], photoinduced electron transfer [10–16], electronic structure and magnetic properties of inorganic materials [21, 23–26], as well as enthalpies of formation and interaction energies of organic molecules [27–30]. The EHT method is particularly advantageous for systems with metal atoms, especially heavy ones. Treatment of such species at the density functional theory (DFT) or ab initio levels requires suitable atomic pseudopotentials and proper handling of relativistic effects. Both types of complications are easily overcome by a suitable parameterization within the EHT formulation [8]. The EHT Hamiltonian has been used as the basic framework for time-dependent tight-binding calculations of molecular excited states [31] and electronic resonances [32]. In further discussion, we will show that the EHT-type Hamiltonians that include self-consistent charges are closely related to the popular tight-binding DFT (DFTB) method. Some degree of similarity can be found with the phenomenological Anderson–Newns Hamiltonian or the more elaborate Hubbard and DFT\(+\)U Hamiltonians. Generalizations of the EHT to periodic systems [7, 17, 33] (with inclusion of crystal momentum quantum numbers) and to unrestricted formulations, in which dependence on electronic spin polarization is included [33], have been developed. Therefore, the EHT approach has potential as an efficient method that can be applied to large systems and to processes that involve not only charge transfer, but also spin polarization and spin relaxation dynamics.

The original EHT formulation was proposed as a non-iterative approach based on diagonalization of a simple tight-binding Hamiltonian, to account for covalent chemical bonding. Later, it was recognized that such an approach predicted wrong equilibrium geometries [34, 35] and could not properly describe charge-transfer processes [36, 37], especially in ionic crystals [38]. The former problem was solved by introducing properly parameterized and tuned nuclear repulsion terms and electron-nuclear attraction energy terms [27, 34, 35, 39–41]. To improve the accuracy of computed charge-transfer related properties, a simple correction to the original Hamiltonian was introduced via charge-dependent ionization potentials [8, 33, 36–38, 42], tracing back to the work of Harris [43, 44]. As a complication, the resulting equations must be solved iteratively until self-consistency is achieved, because the Hamiltonian depends on the charge distribution, and the charge distribution depends on the Hamiltonian. The resulting method is known as the iterative EHT (IEHT), or self-consistent EHT (SC-EHT). It is worth noting that a non-iterative scheme for the description of charge-transfer effects also exists [45]. The method starts with the conventional EHT Hamiltonian and a single-determinant wavefunction. Various excited configurations are then created, following the standard configuration interaction (CI) philosophy. Unlike the standard technique, the CI coefficients are obtained via a predefined formula rather than from the diagonalization of the CI matrix. This technique allows one to accelerate calculations, to avoid problems that appear in SCF iterations (divergences, etc.), and to account explicitly for the multi-configurational nature of some processes. However, a judicious choice of parameters is required and the parameters may not always be transferable.

In the early works, the SC-EHT Hamiltonian was used directly to reach self-consistency and to obtain converged solutions [8, 46]. This approach led to slow convergence and unstable charge fluctuations during the iterative process. Elaborate workaround schemes were proposed to help converge this process [8]. The workaround technique was criticized by Mukherjee [37], who argued that the convergence was in error, since the use of the charge-corrected EHT Hamiltonian for obtaining wavefunctions and 1-electron energy levels was not consistent with the variational principle. Additional modifications to the EHT Hamiltonian were needed to satisfy the variational principle. Several authors derived such corrections via density matrix variation [37, 42]. However, the results are not always clear and systematic. In addition, the application of the direct variational procedure is mathematically elaborate, especially if a complicated form of the charge-dependent matrix elements is considered. A general derivation of the effective Hamiltonian corresponding to an arbitrary potential was reported by Sanhueza et al. [47]. Nonetheless, although the utilization of the variational principle by itself does not raise questions, the relation of the resulting approximations to the established ab initio methods remains obscure.

Rather similar to the SC-EHT, the self-consistent charge tight-binding density functional theory (SCC-DFTB) was derived directly from the DFT [48–51]. The rigorous relation to the fundamental DFT is one of the reasons for the popularity of SCC-DFTB. Because of this heritage, a hierarchy of increasingly accurate approximations can be made, illustrating the history of development of the DFTB family of methods. However, no derivation of the SC-EHT from wavefunction theory has been presented so far. The lack of connection to rigorous theories, such as Hartree–Fock (HF), may be the reason why the EHT method and its derivatives have not gained much popularity as a computational tool. In contrast, semiempirical methods rooted in rigorous HF theory, such as MNDO [52, 53] or ZINDO [54, 55], have gained significantly greater interest and respect.

Successful results in different aspects of simulations [7, 8, 17, 27, 28, 34, 42, 56] and high computational efficiency make the EHT-based methods attractive candidates for further improvement and systematization. In this regard, it is important to understand the roots of the SC-EHT and establish its connection to the ab initio wavefunction theory. This connection would not only justify the EHT as a semiempirical method derived from the HF theory, but it would also have immediate practical value. First, it would present a simple way of constructing a proper effective Hamiltonian matrix that is to be used in the eigenvalue problem. Specifically, we address the questions raised by Mukherjee regarding the need for a proper self-consistency correction [37]. We find that this correction may be needed, but the methods without such a correction are valid, and the difference resides mostly in interpretation. We also present a simple way of deriving the Mukherjee-type correction. Second, the connection would clarify the approximations made in the transition from the reference wavefunction theory (HF) to the EHT-based models, and hence can suggest further ways of refining the SC-EHT formulation, potentially leading to a family of new semiempirical methods with systematically increased accuracy, similar to the DFTB family.

In this work, we present the connection between the SC-EHT method and the HF theory. We start by introducing notation and reviewing a few related techniques. We then analyze the variational HF theory and present the approximation that maps the HF theory into the SC-EHT method. In this way, we always know the proper effective Hamiltonian for the SCF iterative process, namely the Fock matrix. One can then relate an arbitrary EHT Hamiltonian to the Fock matrix, to ensure that the solution of the SC-EHT is variationally consistent. Further, we analyze the approximations that reduce the HF theory to the SC-EHT and suggest possible ways of improving the approximations. Finally, we discuss similarities of the SC-EHT and its derivatives with the charge equilibration scheme and DFTB-based methods.

2 Overview of the EHT method

EHT is a molecular orbital (MO) method. In it, each MO, \(|{\psi _i}\rangle \), is represented by a linear combination of atomic orbitals (AOs), \(|{\chi _a}\rangle \):

$$\begin{aligned} |{\psi _i}\rangle =\sum _a {C_{ai} |{\chi _a}\rangle }, \end{aligned}$$
(2.1)

where \(C_{ai}\) is the coefficient of the \(a\)th AO in the expansion of the \(i\)th MO.

In the original formulation of Hoffmann [1], the matrix elements of the EHT Hamiltonian, \(H\), are charge-independent, and they are computed in the AO basis as:

$$\begin{aligned} H_{ij} =\frac{1}{2}K_{ij} S_{ij} ({h_i +h_j}), \end{aligned}$$
(2.2)

where \(K_{ij}\) is the proportionality constant, typically assumed to lie in the range between 1 and 2 and, in general, dependent on the types of orbitals \(i\) and \(j\). For the diagonal elements this constant is set to 1, \(K_{ii} =1\), while for all other pairs of orbital types, \(K_{ij},i\ne j\), it is treated as an adjustable parameter. It is customary to set this parameter to the constant value 1.75 for all types of orbital pairs. The quantity \(S_{ij}\) is the overlap integral in the AO basis:

$$\begin{aligned} S_{ij} =\langle {\chi _i}| {\chi _j}\rangle . \end{aligned}$$
(2.3)

The parameter \(h_i\) entering Eq. (2.2) is the energy of the \(i\)th orbital of an isolated atom. These parameters are typically available from X-ray spectroscopy measurements as the orbital binding energies, also known as valence state ionization potentials (VSIPs),

$$\begin{aligned} h_i =h_i^0 =- VSIP _i, \end{aligned}$$
(2.4)

but can also be treated as adjustable parameters.
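As a minimal sketch of Eq. (2.2), the snippet below assembles the charge-independent EHT Hamiltonian from an overlap matrix and a list of orbital energies. The function name `eht_hamiltonian`, the nested-list matrix layout, and the numerical values (a two-orbital toy system with an assumed overlap of 0.6) are illustrative assumptions rather than part of the original formulation.

```python
def eht_hamiltonian(S, h, K=1.75):
    """Build H_ij = 0.5 * K_ij * S_ij * (h_i + h_j), Eq. (2.2).

    S : overlap matrix as a list of lists (S[i][i] == 1 for normalized AOs)
    h : orbital energies h_i, e.g. -VSIP_i as in Eq. (2.4)
    K : off-diagonal proportionality constant; K_ii is fixed to 1
    """
    n = len(h)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            k_ij = 1.0 if i == j else K
            H[i][j] = 0.5 * k_ij * S[i][j] * (h[i] + h[j])
    return H


# Toy example: two identical 1s-like orbitals (illustrative values, in eV).
S = [[1.0, 0.6], [0.6, 1.0]]
h = [-13.6, -13.6]
H = eht_hamiltonian(S, h)
```

With \(K_{ii}=1\) and normalized AOs, the diagonal elements reduce to \(h_i\), so only the off-diagonal elements carry the empirical constant.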

In the EHT method, the AOs \(|{\chi _a}\rangle \) are typically taken to be of the form of either single-exponent Slater-type orbitals (STOs):

$$\begin{aligned} |{\chi _a}\rangle =N\exp ({-\xi r}), \end{aligned}$$
(2.5a)

or double-zeta STOs:

$$\begin{aligned} |{\chi _a}\rangle =c_1 N_1 \exp ({-\xi _1 r})+c_2 N_2 \exp ({-\xi _2 r}). \end{aligned}$$
(2.5b)

The numbers \(N,\,N_1,\,N_2\) are the normalization coefficients, \(c_1,\,c_2 \) are the linear combination coefficients and \(\xi ,\,\xi _1,\,\xi _2 \) are the orbital exponents. The latter are typically available from ab initio calculations of the electronic structure of isolated atoms [57–59] or can be treated as adjustable parameters [7, 17]. The overlap integrals in Eq. (2.3) are first computed in the coordinate system in which one of the axes is parallel to the direction connecting the pair of atomic centers on which the orbitals are located. The formulae for computing these integrals are available from different authors [60, 61]. The computed quantities are then rotated back to the molecular coordinate system. To avoid complications and to facilitate the calculation of derivatives and other molecular integrals, one can represent the STOs as a linear superposition of \(n\) Gaussian-type orbitals (STO-nGTO, STO-nG):

$$\begin{aligned} N_1 \exp ({-\xi _1 r})=\sum _{j=1}^n {c_j \exp ({-\tilde{\xi }_{ij} r^{2}})}. \end{aligned}$$
(2.6)
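The expansion in Eq. (2.6) can be illustrated with a short sketch that checks the normalization of an STO-3G fit to a 1s orbital. The exponents and contraction coefficients below are the commonly tabulated STO-3G values for a unit Slater exponent and are quoted here as an assumption; unlike Eq. (2.6), the coefficients in this sketch multiply individually normalized Gaussian primitives. The function names are hypothetical.

```python
import math

# Commonly tabulated STO-3G fit of a 1s STO with xi = 1 (assumed values;
# the coefficients multiply *normalized* Gaussian primitives).
ALPHAS = [2.227660584, 0.405771156, 0.109818]
COEFFS = [0.154328967, 0.535328142, 0.444634542]


def gaussian_overlap(a, b):
    """Overlap of two normalized s-type Gaussians on the same center."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** 1.5


def sto_ng_norm(xi):
    """Self-overlap of the contracted function for Slater exponent xi.

    The Gaussian exponents scale as alpha * xi**2 with the STO exponent.
    """
    alphas = [a * xi ** 2 for a in ALPHAS]
    return sum(
        ci * cj * gaussian_overlap(ai, aj)
        for ci, ai in zip(COEFFS, alphas)
        for cj, aj in zip(COEFFS, alphas)
    )


norm_unit = sto_ng_norm(1.0)
norm_scaled = sto_ng_norm(1.24)  # 1.24: exponent conventionally used for H in STO-3G
```

Because all exponents scale by the same factor, the self-overlap does not depend on the Slater exponent, which is why a single tabulated fit can serve for every value of \(\xi\).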

In general, more elaborate expressions of the EHT Hamiltonian matrix elements can be utilized. They were shown to produce superior accuracy or can correct unnatural behavior of the simple formula Eq. (2.2) that is encountered under certain circumstances. Among the most notable examples are the weighted Wolfsberg-Helmholz formula [2]:

$$\begin{aligned} H_{ij}&= \frac{1}{2}\left[ {K_{ij} +\Delta ^{2}+\Delta ^{4}({1-K_{ij}})} \right] S_{ij} ({h_i +h_j}),\end{aligned}$$
(2.7a)
$$\begin{aligned} \Delta&= \frac{h_i -h_j}{h_i +h_j}, \end{aligned}$$
(2.7b)

and the Calzaferri formula [34, 56]:

$$\begin{aligned} H_{ij} =\frac{1}{2}[ {1+\kappa _{ij} \exp [ {-\delta ({R_{ij} -R_{ij,0}})} ]} ]S_{ij} ({h_i +h_j}). \end{aligned}$$
(2.8)

Different dependencies of the Hamiltonian matrix elements on the atomic overlaps have also been considered. Notable are the simple Wolfsberg-Helmholz formula [62]:

$$\begin{aligned} H_{ij} =\frac{1}{2}S_{ij} ({h_i +h_j}), \end{aligned}$$
(2.9)

the Cusachs formula [63]:

$$\begin{aligned} H_{ij} =\frac{1}{2}S_{ij} ({2-|{S_{ij}}|})({h_i+h_j}), \end{aligned}$$
(2.10)

and the exsin formula [42]:

$$\begin{aligned} H_{ij}&= \hbox {sgn}({S_{ij}})\frac{1}{2} \{ {({1+| {S_{ij}}|}) [ {1+c\sin ({\pi | {S_{ij}} |} )\exp ({b | {S_{ij}} |} )} ]-1} \}\nonumber \\&\quad \times ({h_i +h_j}),\end{aligned}$$
(2.11a)
$$\begin{aligned} b&= -\pi \cot ({\pi | {S_m} |} ), \end{aligned}$$
(2.11b)

with \(c\) and \(S_m\) being parameters.
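The following sketch compares three of the off-diagonal rules listed above for a single pair of orbitals: the standard rule of Eq. (2.2), the weighted Wolfsberg-Helmholz rule of Eq. (2.7), and the Cusachs rule of Eq. (2.10). The function names and the default \(K=1.75\) are illustrative assumptions.

```python
def wolfsberg_helmholz(S_ij, h_i, h_j, K=1.75):
    """Standard rule, Eq. (2.2)."""
    return 0.5 * K * S_ij * (h_i + h_j)


def weighted_wolfsberg_helmholz(S_ij, h_i, h_j, K=1.75):
    """Weighted rule, Eqs. (2.7a)-(2.7b)."""
    delta = (h_i - h_j) / (h_i + h_j)
    k_eff = K + delta ** 2 + delta ** 4 * (1.0 - K)
    return 0.5 * k_eff * S_ij * (h_i + h_j)


def cusachs(S_ij, h_i, h_j):
    """Cusachs rule, Eq. (2.10): the factor 2 - |S_ij| replaces K."""
    return 0.5 * S_ij * (2.0 - abs(S_ij)) * (h_i + h_j)
```

For equal orbital energies \(\Delta =0\), so the weighted rule reduces to the standard one; the two differ only for orbitals of unequal energy.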

Despite the different performance of the above formulations and the different effects they account for, in all cases the EHT Hamiltonian is independent of the wavefunctions (charge density). It is under this assumption that the variational principle yields the well-known secular equation:

$$\begin{aligned} HC = SC \varepsilon , \end{aligned}$$
(2.12)

with the EHT Hamiltonian, \(H\), being also the effective 1-electron Hamiltonian (Fock) operator that enters Eq. (2.12). The total energy of the system in this case is given by the sum of occupied orbital energies

$$\begin{aligned} E=2\sum _{i\in occ } {\varepsilon _i} = tr ({H^{T}P})=\sum _{a,b} {H_{ ab } P_{ ab }}, \end{aligned}$$
(2.13a)

or, more generally:

$$\begin{aligned} E&= \sum _{i\in occ _\alpha } {\varepsilon _i^\alpha } +\sum _{i\in occ _\beta } {\varepsilon _i^\beta } = tr ({H^{T}P^{\alpha }} )+ tr ({H^{T}P^{\beta }})\nonumber \\&= \sum _{a,b} {H_{ ab } P_{ ab }^{\alpha }} +\sum _{a,b} {H_{ ab } P_{ ab }^\beta }, \end{aligned}$$
(2.13b)

where \(P, \,P^{\alpha }\) and \(P^{\beta }\) are the density matrices (total, spin-up, and spin-down, respectively):

$$\begin{aligned} P_{ ab }^\alpha&= \sum _{i\in occ_\alpha } {C_{ai}^\alpha C_{bi}^\alpha } =({C^{\alpha }O^{\alpha }({C^{\alpha }} )^{T}} )_{ ab } \Leftrightarrow P^{\alpha }=C^{\alpha }O^{\alpha }({C^{\alpha }} )^{T},\end{aligned}$$
(2.14a)
$$\begin{aligned} P_{ ab }^\beta&= \sum _{i\in occ_\beta } {C_{ai}^\beta C_{bi}^\beta } =({C^{\beta }O^{\beta }({C^{\beta }})^{T}} )_{ ab } \Leftrightarrow P^{\beta }=C^{\beta }O^{\beta }({C^{\beta }} )^{T},\end{aligned}$$
(2.14b)
$$\begin{aligned} P&= P^{\alpha }+P^{\beta }. \end{aligned}$$
(2.14c)

The matrices of the MO coefficients, \(C^{\alpha }\) and \(C^{\beta }\), are organized such that the \(i\)th column contains the MO-LCAO coefficients of the \(i\)th MO. The matrices \(O^{\alpha }\) and \(O^{\beta }\) are the density matrices in the MO basis, also known as the population matrices. These are diagonal matrices with the first \(N_\alpha \) and \(N_\beta \) diagonal elements, respectively, set to 1.0 and all remaining elements set to zero, where \(N_\alpha \) and \(N_\beta \) are the numbers of spin-up and spin-down electrons.
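A small worked example may help connect Eqs. (2.12)–(2.14). The sketch below solves the secular equation for a two-orbital, two-electron toy system in closed form and checks that the sum of occupied orbital energies equals \(\sum _{a,b} H_{ab}P_{ab}\). The helper `lowest_mo_2x2` and all numerical values are assumptions chosen for illustration; a general code would use a library generalized eigensolver instead.

```python
import math


def lowest_mo_2x2(H, S):
    """Lowest solution of HC = SC*eps (Eq. (2.12)) for a 2x2 problem.

    det(H - eps*S) = 0 is a quadratic in eps; the eigenvector follows from
    the first row of (H - eps*S) c = 0 and is normalized so that c^T S c = 1.
    """
    s = S[0][1]
    a = 1.0 - s * s
    b = -(H[0][0] + H[1][1] - 2.0 * H[0][1] * s)
    c = H[0][0] * H[1][1] - H[0][1] ** 2
    eps = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    c1, c2 = -(H[0][1] - eps * s), H[0][0] - eps
    norm = math.sqrt(c1 * c1 + c2 * c2 + 2.0 * s * c1 * c2)
    return eps, [c1 / norm, c2 / norm]


S = [[1.0, 0.6], [0.6, 1.0]]
H = [[-13.6, -14.28], [-14.28, -13.6]]   # Eq. (2.2) with K = 1.75
eps, C = lowest_mo_2x2(H, S)

# Closed shell, two electrons: P = P_alpha + P_beta = 2 * C C^T, Eq. (2.14).
P = [[2.0 * C[a] * C[b] for b in range(2)] for a in range(2)]
E_trace = sum(H[a][b] * P[a][b] for a in range(2) for b in range(2))
E_orbital = 2.0 * eps                    # Eq. (2.13a)
```

For this symmetric case the bonding level is \((h+H_{12})/(1+S_{12})\), and both energy expressions of Eq. (2.13a) give twice that value.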

3 Self-consistent EHT

In the simplest SC-EHT approach, the parameters \(h_i\) are modified according to:

$$\begin{aligned} h_i =h_i^{0} -a_i q_I, \end{aligned}$$
(3.1)

where \(a_i\) is an adjustable parameter, and \(q_I\) is the partial charge of atom \(I\), the atom on which the \(i\)th AO is localized. Typically, Mulliken charges [64] are used:

$$\begin{aligned} q_I =Z_I -\sum _{a\in I} {n_a}, \end{aligned}$$
(3.2)

where the summation runs over all AOs (index \(a\)) localized on a given atom (index \(I\)) and

$$\begin{aligned} n_a =\sum _b {\left( {P_{ ab }^\alpha +P_{ ab }^\beta } \right) S_{ ab }}, \end{aligned}$$
(3.3)

is the total Mulliken population on the orbital \(a\). The off-diagonal matrix elements of the EHT Hamiltonian are typically computed according to the standard rule, Eq. (2.2), using charge-corrected parameters \(h_i\). In principle, one may utilize one of the expressions Eqs. (2.7)–(2.11) together with the charge-dependent orbital energy parameters, Eq. (3.1).
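Equations (3.1)–(3.3) can be sketched in a few lines. The function names, the `atom_of` mapping from AO index to atom index, and the numerical inputs are assumptions made for this illustration only.

```python
def mulliken_charges(P, S, Z, atom_of):
    """Atomic Mulliken charges, Eqs. (3.2)-(3.3).

    P       : total density matrix P_alpha + P_beta
    S       : overlap matrix
    Z       : number of valence electrons of each neutral atom
    atom_of : atom_of[a] is the index of the atom carrying AO a
    """
    q = list(Z)
    for a in range(len(P)):
        n_a = sum(P[a][b] * S[a][b] for b in range(len(P)))  # Eq. (3.3)
        q[atom_of[a]] -= n_a                                  # Eq. (3.2)
    return q


def charge_corrected_energies(h0, a, q, atom_of):
    """h_i = h_i^0 - a_i * q_I, Eq. (3.1)."""
    return [h0[i] - a[i] * q[atom_of[i]] for i in range(len(h0))]


# Symmetric two-center density (bonding MO with overlap 0.6): no net charges.
P = [[0.625, 0.625], [0.625, 0.625]]
S = [[1.0, 0.6], [0.6, 1.0]]
q_sym = mulliken_charges(P, S, [1.0, 1.0], [0, 1])

# Effect of assumed charges of +0.2 and -0.2 on the orbital energies.
h = charge_corrected_energies([-13.6, -13.6], [10.0, 10.0], [0.2, -0.2], [0, 1])
```

A positive atomic charge lowers the orbital energies on that atom and a negative charge raises them, as expected from Eq. (3.1).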

The dependence of the electronic Hamiltonian on charge density via Mulliken charges, Eq. (3.2), is perhaps one of the simplest treatments of self-consistent electrostatics in electronic structure calculations. Other similar schemes are represented by early semiempirical [65–71] or mixed quantum mechanics/molecular mechanics (QM/MM) [72] approaches. While the semiempirical formulations often originate directly from the Hartree–Fock method, the QM/MM approach is substantially guided by an accurate description of electrostatic and polarization effects between different parts of a system. Although it disregards some quantum effects and relies on the classical picture of interactions, the approach proved to be extremely successful. Transparent and efficient, it earned its authors the Nobel Prize in Chemistry in 2013.

Equation (3.1) implies that an excess of electron charge density (negative charge) on a given atom pushes the energy levels of atomic states toward more positive values and makes the ionization potential smaller. Consider two limiting situations: a mono-charged cation and a mono-charged anion (Fig. 1a).

Fig. 1 Definition of the orbital-resolved (a) and atom-resolved (b) slope parameters \(a\) for different numbers of electrons, N. The number N corresponds to the reference value (e.g. the number of electrons in the isolated atom). The orbital energy levels in panel (a) are adjusted according to the EHT definition. The atom energies in panel (b) follow the standard chemical definition of IP and EA. Note that the IP and EA values used in the two panels have different meanings

Then the energies can be approximated by:

$$\begin{aligned} - IP _2&= - IP _+ =h_i ({+1} )=h_i^0 -a_i,\end{aligned}$$
(3.4a)
$$\begin{aligned} - EA&= - IP _- =h_i ({-1} )=h_i^0 +a_i, \end{aligned}$$
(3.4b)

or

$$\begin{aligned} a_i =\left\{ \begin{array}{ll} { IP _2 - IP _1}&{} {q>0} \\ { IP _1 - EA }&{} {q<0} \\ \end{array}\right. , \end{aligned}$$
(3.5)

where \( IP _1\) is the first ionization potential of a given atom (so that \(h_i^0 =- IP _1\) in Eq. (3.4)), \( IP _2\) is its second ionization potential, and \( EA \) is its electron affinity. As follows from Eq. (3.5) and Fig. 1, the slope parameter \(a_i\), which is essentially the energy derivative with respect to the number of electrons (or charge), \(a_i=\frac{\partial E}{\partial N}\), is a discontinuous function of the atomic charge. This result is well known in both wavefunction theory and DFT. It originates from the differences in exchange integrals for systems with different numbers of electrons. The derivative discontinuity is central to many DFT formulations, yet it is often very hard to account for in a systematic way. In the context of DFT, attempts to account for this effect led to approaches known as the “scissor operator” method [73, 74]. In the context of the SC-EHT method, the derivative discontinuity can easily be incorporated via Eq. (3.5). In most SC-EHT formulations, the importance of incorporating the derivative discontinuity was not explicitly recognized, and the same value of the slope parameter was typically used for positive and negative charges, mainly for the sake of simplicity. We anticipate that incorporation of charge-dependent slopes may lead to new interesting effects, especially when electron transfer and electronic excitations are of concern.
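The charge-dependent slope can be sketched directly from the limiting cases of Eqs. (3.4a)–(3.4b). The function names and the numerical values of the ionization potentials and electron affinity below are hypothetical and serve only to show the piecewise-linear behaviour.

```python
def slope_parameter(q, ip1, ip2, ea):
    """Charge-dependent slope a_i implied by Eqs. (3.4a)-(3.4b).

    With h_i^0 = -IP_1, the cation limit h_i(+1) = -IP_2 gives
    a_i = IP_2 - IP_1, and the anion limit h_i(-1) = -EA gives
    a_i = IP_1 - EA.  The q == 0 case is assigned to the cation branch
    here; that tie-break is an arbitrary choice of this sketch.
    """
    return ip2 - ip1 if q >= 0.0 else ip1 - ea


def orbital_energy(q, ip1, ip2, ea):
    """Piecewise-linear h_i(q) = h_i^0 - a_i(q) * q, Eq. (3.1)."""
    return -ip1 - slope_parameter(q, ip1, ip2, ea) * q


# Hypothetical atom (energies in eV): IP_1 = 11, IP_2 = 24, EA = 1.
h_cation = orbital_energy(+1.0, 11.0, 24.0, 1.0)
h_anion = orbital_energy(-1.0, 11.0, 24.0, 1.0)
```

By construction the two branches reproduce \(- IP _2\) and \(- EA \) at unit positive and negative charge, while the slope changes abruptly at the neutral atom.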

Unlike Eq. (3.1), some formulations of the SC-EHT method involve dependence of the atomic orbital energies for \(i\)th orbital, \(h_i\), on the fluctuation of its population:

$$\begin{aligned} h_i =h_i^0 -a_i \delta n_i, \end{aligned}$$
(3.6a)

where

$$\begin{aligned} \delta n_i =n_i -\bar{{n}}_i, \end{aligned}$$
(3.6b)

is the fluctuation of the population on the orbital \(i\) with respect to atomic limit or other reference value \(\bar{{n}}_i\). Higher order polynomials in \(\delta n_i\) were also considered in early works [8]:

$$\begin{aligned} h_i =h_i^0 -a_i \delta n_i +b_i \delta n_i^2. \end{aligned}$$
(3.7)

We emphasize that under the approximations given by Eqs. (3.6)–(3.7), the Hamiltonian is not rotationally invariant. Therefore, these approximations should not be used, despite their higher flexibility for parameterization. A simple example illustrates why one should avoid such formulae. Suppose the values of the parameters \(\{{a_i}\}\) and, potentially, \(\{{b_i}\}\) are fixed. Depending on the initial guess of the atomic orbitals, one can obtain different values of the reference populations, \(\bar{{n}}_i \). Consider the carbon atom. Its valence configuration, \(2{s}^{2}2{p}^{2}\), implies that if the initial guess is chosen as a set of atomic orbitals \(2s, 2{p}_\mathrm{x}, \,2{p}_\mathrm{y}\) and \(2{p}_\mathrm{z}\), the atomic limit of the population on one of the \(2p\) orbitals would be zero (if no smearing of MO populations is utilized). However, one could start with four \({sp}^{3}\) hybrids, each of which would have a non-negligible projection (population) on any of the atomic orbitals. Therefore, the fluctuations \(\delta n_i\) depend on the initial guess of orbitals and on a specific choice of orbital directions, which can vary along all subsequent iterations. The dependence of the Hamiltonian matrix elements on the choice of coordinate system is an undesirable property: apart from being physically incorrect, it may lead to numerical instabilities and poor convergence.
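The loss of rotational invariance can be demonstrated with a minimal sketch: a single electron occupying a \(p\) orbital of an isolated atom, described once in the original axes and once in axes rotated by 45° about \(z\). The helper `rotate_density` and the chosen angle are assumptions of this illustration.

```python
import math


def rotate_density(P, theta):
    """Return R P R^T for a rotation of the (p_x, p_y) pair by theta."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s], [s, c]]
    RP = [[sum(R[i][k] * P[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(RP[i][k] * R[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


# One electron in p_x of an isolated atom; the AOs are orthonormal, so the
# Mulliken orbital populations are just the diagonal of P.
P = [[1.0, 0.0], [0.0, 0.0]]
P_rot = rotate_density(P, math.pi / 4.0)

orbital_pops = [P[0][0], P[1][1]]              # (1.0, 0.0)
orbital_pops_rot = [P_rot[0][0], P_rot[1][1]]  # approximately (0.5, 0.5)
atomic_pop, atomic_pop_rot = sum(orbital_pops), sum(orbital_pops_rot)
```

The orbital-resolved populations change from (1, 0) to approximately (0.5, 0.5) although the physical state is the same, whereas their sum, the atomic population, is unchanged.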

In contrast to Eqs. (3.6)–(3.7), if the charges in Eq. (3.1) are chosen to be atomic partial charges that are rotationally invariant, for example Mulliken charges, Eqs. (3.2)–(3.3), the rotational invariance of the charge-corrected Hamiltonian is preserved at all times. Despite the absence of rotational invariance, a number of researchers have used orbital-resolved populations to construct charge-corrected EHT Hamiltonians. One should then be careful when transferring the parameters \(\{{a_i}\}\) and, potentially, \(\{{b_i }\}\) obtained in those works to formulations based on Mulliken charges. It is reasonable to expect that the magnitude of the fluctuation \(\delta n_i\) for any given orbital \(i\) is smaller than the total Mulliken charge on the atom containing this orbital. Therefore, the parameters obtained for \(\delta n_i\) should be scaled down, approximately by the number of valence orbitals considered in atomic calculations. Utilization of quadratic polynomials in \(\delta n_i\) also leads to large slope parameters. Finally, in some works the net rather than the gross Mulliken atomic or orbital charges are utilized. As will become clear in the following sections, the choice of net populations is unnatural, although one could argue for its clear physical interpretation. Because these charges are not rotationally invariant, we discourage their use in charge-corrected Hamiltonians.

At this point, we should comment on the construction of the charge-dependent Hamiltonian matrix elements. The main condition for the definition Eq. (3.1) to be physically justified is the requirement that the charge-dependent proportionality factor be rotationally invariant. This condition is satisfied for the Mulliken charges and is violated for the orbital-resolved quantities, Eq. (3.6). At the same time, Mulliken decomposition of the charge density is one of infinitely many proper (rotationally invariant) schemes. Thus, we anticipate that this flexibility may be used to construct more accurate and computationally efficient charge-dependent functionals for computing Hamiltonian matrix elements via charge-dependent ionization potentials.
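The iterative procedure implied by Eqs. (3.1)–(3.3) can be sketched for the smallest possible case: two AOs on two atoms sharing two electrons. The sketch uses the charge-corrected Hamiltonian directly, without any additional self-consistency correction, and stabilizes the iterations by simple linear mixing of the charges; the mixing scheme, the function names, and all parameter values are assumptions of this illustration, not a prescription from the works cited above.

```python
import math


def lowest_mo_2x2(H, S):
    """Lowest eigenpair of the 2x2 problem HC = SC*eps, with c^T S c = 1."""
    s = S[0][1]
    a = 1.0 - s * s
    b = -(H[0][0] + H[1][1] - 2.0 * H[0][1] * s)
    c = H[0][0] * H[1][1] - H[0][1] ** 2
    eps = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    c1, c2 = -(H[0][1] - eps * s), H[0][0] - eps
    norm = math.sqrt(c1 * c1 + c2 * c2 + 2.0 * s * c1 * c2)
    return eps, [c1 / norm, c2 / norm]


def scc_eht_two_center(h0, a, s, K=1.75, mix=0.3, tol=1e-10, max_iter=200):
    """Damped self-consistent-charge loop for two AOs and two electrons."""
    S = [[1.0, s], [s, 1.0]]
    q = [0.0, 0.0]
    for iteration in range(1, max_iter + 1):
        h = [h0[i] - a[i] * q[i] for i in range(2)]            # Eq. (3.1)
        h12 = 0.5 * K * s * (h[0] + h[1])                      # Eq. (2.2)
        H = [[h[0], h12], [h12, h[1]]]
        _, c = lowest_mo_2x2(H, S)
        # Mulliken populations of the doubly occupied MO, Eq. (3.3)
        n = [2.0 * (c[i] * c[i] + s * c[0] * c[1]) for i in range(2)]
        q_new = [1.0 - n[i] for i in range(2)]                 # Eq. (3.2)
        change = max(abs(q_new[i] - q[i]) for i in range(2))
        q = [(1.0 - mix) * q[i] + mix * q_new[i] for i in range(2)]
        if change < tol:
            return q, iteration
    raise RuntimeError("charges did not converge")


# Hypothetical parameters (eV): atom 0 has the deeper level and gains electrons.
q, n_iter = scc_eht_two_center([-13.6, -10.0], [5.0, 5.0], 0.4)
```

The charge transferred to the atom with the deeper level raises that level through Eq. (3.1), which in turn limits further transfer; the mixing factor damps the oscillations that this feedback would otherwise produce.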

4 SC-EHT equations via direct variation of charge density

In this section we show how the charge-corrected EHT Hamiltonian should be modified to satisfy the variational principle. Starting with the EHT energy expression, Eq. (2.13), we consider its first-order variation with respect to all density matrix elements:

$$\begin{aligned} \delta E&= E({P+\delta P} )-E(P)=\sum _{a,b} {H_{ ab } \delta P_{ ab }^\alpha } +\sum _{a,b} {H_{ ab } \delta P_{ ab }^\beta }\nonumber \\&\quad +\sum _{a,b} {\delta H_{ ab } P_{ ab }^\alpha } +\sum _{a,b} {\delta H_{ ab } P_{ ab }^\beta }. \end{aligned}$$
(4.1)

Because the variations of spin-up and spin-down densities are independent, we consider each of them separately, so:

$$\begin{aligned} \delta E^{\sigma }=\sum _{a,b} {H_{ ab } \delta P_{ ab }^\sigma } +\sum _{a,b} {\delta H_{ ab }^\sigma P_{ ab }},\quad \sigma =\alpha ,\beta . \end{aligned}$$
(4.2)

From Eqs. (2.2) and (3.1) we have:

$$\begin{aligned} H_{ij}&= \frac{1}{2}K_{ij} S_{ij} ({h_i +h_j} )=\frac{1}{2}K_{ij} S_{ij} \left( {h_i^0 +h_j^0}\right) \nonumber \\&\quad -\,\frac{1}{2}K_{ij} S_{ij} ({a_i Z_I +a_j Z_J} )+\frac{1}{2}K_{ij} S_{ij} ({a_i n_I +a_j n_J} ). \end{aligned}$$
(4.3)

The variation of the Hamiltonian matrix element is:

$$\begin{aligned} \delta H_{ij}^\sigma =\frac{1}{2}K_{ij} S_{ij} \left( {a_i \delta n_I^\sigma +a_j \delta n_J^\sigma }\right) . \end{aligned}$$
(4.4)

Using the definition of atomic population and the definition of Mulliken orbital populations, Eq. (3.3), we obtain:

$$\begin{aligned} n_I&= \sum _{a\in I} {n_a} =\sum _{a\in I} {\sum _b {\left( {P_{ ab }^\alpha +P_{ ab }^\beta }\right) S_{ ab }}}=\sum _{a,b} {\left( {P_{ ab }^\alpha +P_{ ab }^\beta } \right) S_{ ab } \delta _{aI}}\nonumber \\&= \frac{1}{2}\sum _{a,b} {\left( {P_{ ab }^\alpha +P_{ ab }^\beta }\right) S_{ ab } ({\delta _{aI} +\delta _{bI}})}, \end{aligned}$$
(4.5)

so

$$\begin{aligned} \delta n_I^\sigma =\frac{1}{2}\sum _{a,b} {\delta P_{ ab }^\sigma S_{ ab } ({\delta _{aI} +\delta _{bI}} )}. \end{aligned}$$
(4.6)

Summing up:

$$\begin{aligned} \delta E^{\sigma }&= \sum _{a,b} {H_{ ab } \delta P_{ ab }^\sigma } +\sum _{i,j} {\delta H_{ij}^\sigma P_{ij}} =\sum _{a,b} {H_{ ab } \delta P_{ ab }^\sigma } +\sum _{i,j} {\frac{1}{2}K_{ij} S_{ij} ({a_i \delta n_I^\sigma +a_j \delta n_J^\sigma } )P_{ij}} \nonumber \\&= \sum _{a,b} {H_{ ab } \delta P_{ ab }^\sigma } +\sum _{i,j} \frac{1}{2}K_{ij} S_{ij} \left( a_i \sum _{a,b} {\delta P_{ ab }^\sigma S_{ ab } \frac{({\delta _{aI} +\delta _{bI}} )}{2}}\right. \nonumber \\&\quad \left. +\,a_j \sum _{a,b} {\delta P_{ ab }^\sigma S_{ ab } \frac{({\delta _{aJ} +\delta _{bJ}})}{2}}\right) P_{ij} \nonumber \\&= \sum _{a,b} {H_{ ab } \delta P_{ ab }^\sigma } +\frac{1}{2}\sum _{a,b} \delta P_{ ab }^\sigma S_{ ab } \left[ \sum _{i,j} {a_i K_{ij} S_{ij} P_{ij} \frac{({\delta _{aI} +\delta _{bI}} )}{2}}\right. \nonumber \\&\quad \left. +\,\sum _{i,j} {a_j K_{ij} S_{ij} P_{ij} \frac{({\delta _{aJ}+\delta _{bJ}} )}{2}}\right] \nonumber \\&= \sum _{a,b} {F_{ ab }^\sigma \delta P_{ ab }^\sigma }, \end{aligned}$$
(4.7)

with the sought-for effective Hamiltonian, \(F^{\sigma }\), given by:

$$\begin{aligned} F_{ ab }^\sigma&= H_{ ab } +\frac{1}{2}S_{ ab } \left[ {\sum _{i,j} {a_i K_{ij} S_{ij} P_{ij} \frac{({\delta _{aI} +\delta _{bI}})}{2}} +\sum _{i,j} {a_j K_{ij} S_{ij} P_{ij} \frac{({\delta _{aJ} +\delta _{bJ}} )}{2}}} \right] \nonumber \\&= H_{ ab } +\frac{1}{2}S_{ ab }\left[ {\sum _i {a_i ({\tilde{S}P} )_{ii} \frac{({\delta _{aI} +\delta _{bI}} )}{2}}+\sum _j {a_j ({\tilde{S}P})_{jj} \frac{({\delta _{aJ} +\delta _{bJ}} )}{2}}} \right] \nonumber \\&= H_{ ab } +\frac{1}{2}S_{ ab } \left[ {\sum _i {a_i ({\tilde{S}P} )_{ii} \delta _{aI}} +\sum _i {a_i ({\tilde{S}P} )_{ii} \delta _{bI}}}\right] , \end{aligned}$$
(4.8)

where

$$\begin{aligned} \tilde{S}_{ij} =K_{ij} S_{ij}. \end{aligned}$$
(4.9)

The resulting correction is similar to the formulae presented by Mukherjee [37] and Kalman [42]. However, unlike Mukherjee's result, Eq. (4.8) contains a summation over all orbitals, \(i\), centered on a given atom, \(I\). This is reflected by the terms \(\delta _{aI}\) and \(\delta _{bI}\). In contrast, in Mukherjee’s results these symbols are effectively reduced to \(\delta _{ai}\) and \(\delta _{bi} \). The result given by Kalman [42] is closer in this respect to ours. However, the notation used there is somewhat confusing, leading to a numerical prefactor and sign that are difficult to follow.
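The last line of Eq. (4.8) can be transcribed almost literally. In the sketch below, the function name, the `atom_of` mapping from AO index to atom index, and the use of a single constant \(K\) for all off-diagonal pairs are assumptions made for illustration; the Kronecker symbols \(\delta _{aI}\) and \(\delta _{bI}\) are realized by accumulating one weight per atom.

```python
def sc_corrected_fock(H, S, P, a, atom_of, K=1.75):
    """Effective Hamiltonian of Eq. (4.8) for one spin channel.

    H : charge-corrected EHT Hamiltonian; P : total density matrix;
    a : slope parameters a_i; atom_of[i] : atom carrying AO i.
    """
    n = len(H)
    # Diagonal of (S~ P) with S~_ij = K_ij S_ij, Eq. (4.9); K_ii = 1.
    sp_diag = [
        sum((1.0 if i == j else K) * S[i][j] * P[j][i] for j in range(n))
        for i in range(n)
    ]
    # w[I] = sum over AOs i on atom I of a_i (S~ P)_ii
    w = {}
    for i in range(n):
        w[atom_of[i]] = w.get(atom_of[i], 0.0) + a[i] * sp_diag[i]
    return [
        [H[x][y] + 0.5 * S[x][y] * (w[atom_of[x]] + w[atom_of[y]])
         for y in range(n)]
        for x in range(n)
    ]
```

Because the bracket in Eq. (4.8) is symmetric under exchange of \(a\) and \(b\), the resulting matrix stays symmetric, and it reduces to \(H\) when all slope parameters vanish.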

5 Mapping of the SC-EHT to the Hartree–Fock method for derivation of self-consistency (SC) correction

A more general and significantly more convenient derivation of the correct effective Hamiltonian for the SC-EHT method can be obtained starting from the conventional ab initio HF theory. In this formulation, the total electronic energy of the system is given by

$$\begin{aligned} E_{ HF } =\frac{1}{2}\sum _{i,j} {P_{ij}^\alpha \left( {H_{ij}^0 +F_{ij}^\alpha }\right) } +\frac{1}{2}\sum _{i,j} {P_{ij}^\beta \left( {H_{ij}^0 +F_{ij}^\beta }\right) }, \end{aligned}$$
(5.1)

with the Fock matrix \(F^{\sigma }\) playing the role of the effective 1-electron Hamiltonian for spin channel \(\sigma \) and defined as:

$$\begin{aligned} F_{ij}^\alpha&= H_{ij}^0 +\sum _{a,b} {\left[ {\left( {P_{ ab }^\alpha +P_{ ab }^\beta }\right) J_{ijab} +P_{ ab }^\alpha K_{ijab}} \right] },\end{aligned}$$
(5.2a)
$$\begin{aligned} F_{ij}^\beta&= H_{ij}^0 +\sum _{a,b} {\left[ {\left( {P_{ ab }^\alpha +P_{ ab }^\beta }\right) J_{ijab} +P_{ ab }^\beta K_{ijab}}\right] }. \end{aligned}$$
(5.2b)

The integrals \(J_{ijab}\) and \(K_{ijab}\) are defined by:

$$\begin{aligned} J_{ijab}&= ({ij\;|\;ab}),\end{aligned}$$
(5.3a)
$$\begin{aligned} K_{ijab}&= ({ib\;|\;aj}). \end{aligned}$$
(5.3b)

The chemists’ notation for molecular integrals, Eq. (5.3), is adopted:

$$\begin{aligned} ({ab|cd})&\equiv \left( {\chi _a \chi _b |\frac{1}{r_{12}}|\chi _c\chi _d}\right) \nonumber \\&\equiv \int {d\sigma _1 d\vec {r}_1 \int {d\sigma _2 d \vec {r}_2 \chi _a^*(1)\chi _b (1)\frac{1}{r_{12}}\chi _c^*(2)\chi _d (2)}}. \end{aligned}$$
(5.4)

To show that the Fock matrix Eq. (5.2) is, indeed, a proper 1-electron Hamiltonian that corresponds to the total energy, Eq. (5.1), we consider energy variation in a way similar to the one already done for the particular case discussed in Sect. 4. Using definitions Eq. (5.2) in the intermediate step of derivation, we obtain:

$$\begin{aligned} \delta E_{HF}^\alpha&= \frac{1}{2}\sum _{i,j} {\delta P_{ij}^\alpha \left( {H_{ij}^0 +F_{ij}^\alpha }\right) } +\frac{1}{2}\sum _{i,j} {P_{ij}^\alpha \left( {\sum _{a,b} {\delta P_{ ab }^\alpha ({J_{ijab} +K_{ijab}} )}}\right) }\nonumber \\&\quad +\frac{1}{2}\sum _{i,j}{P_{ij}^\beta \left( {\sum _{a,b} {\delta P_{ ab }^\alpha J_{ijab}}}\right) } \nonumber \\&= \frac{1}{2}\sum _{a,b} {\delta P_{ ab }^\alpha \left( {H_{ ab }^0 +F_{ ab }^\alpha }\right) } +\frac{1}{2}\sum _{a,b} {\delta P_{ ab }^\alpha \sum _{i,j} {\left[ {P_{ij}^\alpha ({J_{ijab}+K_{ijab}} )+P_{ij}^\beta J_{ijab}} \right] }} \nonumber \\&= \frac{1}{2}\sum _{a,b} {\delta P_{ ab }^\alpha \left( {H_{ ab }^0 +F_{ ab }^\alpha }\right) } +\frac{1}{2}\sum _{a,b} {\delta P_{ ab }^\alpha \left( {F_{ ab }^\alpha -H_{ ab }^0}\right) } =\sum _{a,b} {\delta P_{ ab }^\alpha F_{ ab }^\alpha }. \end{aligned}$$
(5.5)

The standard EHT Hamiltonian is defined such that the energy is given by the expression:

$$\begin{aligned} E_{ EHT } =\sum _{i,j} {P_{ij}^\alpha H_{ij}^{EHT,\alpha }} +\sum _{i,j} {P_{ij}^\beta H_{ij}^{EHT,\beta }}. \end{aligned}$$
(5.6)

In many works separation of alpha and beta electrons is not considered, because of the absence of exchange (and often even Coulomb) terms in the EHT Hamiltonian. To generalize the EHT methodology, and to facilitate its mapping onto unrestricted HF theory, we explicitly consider different spin channels.

To obtain the effective Fock matrix that corresponds to the charge-corrected EHT Hamiltonian, \(H^{ EHT ,\sigma }\), we map Eq. (5.6) onto Eq. (5.1), leading to:

$$\begin{aligned} ({H^{0}+F^{\sigma }})=2H^{ EHT ,\sigma }, \end{aligned}$$
(5.7a)

or

$$\begin{aligned} F^{\sigma }=2H^{ EHT ,\sigma }-H^{0}. \end{aligned}$$
(5.7b)

We remind the reader that \(H^{0}\) in this context has the meaning of the charge-independent EHT Hamiltonian. Thus, the necessary correction is simply:

$$\begin{aligned} \Delta ^{\sigma }=H^{ EHT ,\sigma }-H^{0}. \end{aligned}$$
(5.8)

The SC-corrected Hamiltonian is:

$$\begin{aligned} F^{\sigma }\equiv \tilde{H}^{\sigma }=H^{ EHT ,\sigma }+\Delta ^{\sigma }. \end{aligned}$$
(5.9)

Obviously, in the case when the EHT Hamiltonian does not depend on charges, \(H^{ EHT ,\sigma }=H^{0}\), the correction is zero and the same Hamiltonian, \(H^{0}\), appears in the eigenvalue problem and in the total energy expression.

The explicit expression for the SC correction of the EHT Hamiltonian based on Hoffmann's rule, Eq. (2.2), and the charge dependence, Eq. (3.1), is then:

$$\begin{aligned} \Delta _{ij}^\sigma&= H_{ij}^{ EHT ,\sigma } -H_{ij}^{0,\sigma } =\frac{1}{2}K_{ij} S_{ij} ({h_i +h_j})\nonumber \\&\quad -\frac{1}{2}K_{ij} S_{ij} ({h_i^0 +h_j^0})=-\frac{1}{2}K_{ij} S_{ij} ({a_i q_I +a_j q_J} ). \end{aligned}$$
(5.10)

Hence, the effective 1-electron operator is:

$$\begin{aligned} F_{ij}^\sigma =H_{ij}^{ EHT ,\sigma } +\Delta _{ij}^\sigma =H_{ij}^{ EHT ,\sigma } -\frac{1}{2}K_{ij} S_{ij} ({a_i q_I +a_j q_J}). \end{aligned}$$
(5.11)

We remind the reader that the Hamiltonian \(H_{ij}^{ EHT ,\sigma }\) in Eq. (5.11) and before is the charge-corrected EHT Hamiltonian, not the one based on the charge-independent diagonal matrix elements.
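The construction of Eqs. (5.8)–(5.11) can be sketched in a few lines. This is a hedged illustration rather than a reference implementation: it assumes the linear charge dependence \(h_i = h_i^0 - a_i q_I\) implied by Eq. (5.10), applies the Hoffmann-type formula to all matrix elements (so that \(K_{ii}=S_{ii}=1\) reproduces \(H_{ii}=h_i\)), and uses a hypothetical list `atom_of` that maps each orbital index to its host atom.

```python
def sc_eht_matrices(S, K, h0, a, q, atom_of):
    """Return (H0, H_EHT, Delta, F) following Eqs. (5.8)-(5.11)."""
    n = len(S)
    # Charge-dependent orbital energies implied by Eq. (5.10): h_i = h_i^0 - a_i q_I
    h = [h0[i] - a[i] * q[atom_of[i]] for i in range(n)]

    def hoffmann(e):
        # H_ij = 1/2 K_ij S_ij (e_i + e_j); K_ii = S_ii = 1 gives H_ii = e_i
        return [[0.5 * K[i][j] * S[i][j] * (e[i] + e[j]) for j in range(n)]
                for i in range(n)]

    H0, H = hoffmann(h0), hoffmann(h)
    # Eq. (5.8): correction is the charge-induced change of the EHT Hamiltonian
    delta = [[H[i][j] - H0[i][j] for j in range(n)] for i in range(n)]
    # Eq. (5.9): SC-corrected effective 1-electron Hamiltonian
    F = [[H[i][j] + delta[i][j] for j in range(n)] for i in range(n)]
    return H0, H, delta, F
```

With all charges set to zero the correction vanishes and `F`, `H`, and `H0` coincide, in line with the remark following Eq. (5.9).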

The advantage of the present mapping scheme for deriving the SC correction of charge-dependent EHT Hamiltonians is easy to see when elaborate formulae of the type of Eq. (2.7) are used for computing the EHT Hamiltonian. In this case, the dependence of the matrix elements on the diagonal terms is non-linear, and application of the direct density matrix variation, similar to the one shown in Sect. 4, is difficult. At the same time, the formula Eq. (5.8) is much more transparent and easy to apply. We also note that the result, Eqs. (5.10)–(5.11), is very close to the expression given by Mukherjee [37], although some differences still exist.

6 SC-EHT as an approximation of the Hartree–Fock method

An alternative and very illuminating view of the SC-EHT method is to assume that no self-consistency correction is needed and that the charge-corrected extended Hückel Hamiltonian itself plays the role of the effective Fock matrix:

$$\begin{aligned} H_{ij}^{EHT,\sigma } =F_{ij}^\sigma =H_{ij}^0 +\sum _{a,b} {\left[ {\left( {P_{ ab }^\alpha +P_{ ab }^\beta }\right) J_{ijab} +P_{ ab }^\sigma K_{ijab}}\right] }. \end{aligned}$$
(6.1)

This interpretation supports the use of the charge-corrected EHT Hamiltonians without SC corrections, as utilized by various authors. The definition, Eq. (6.1), is more consistent with the physical meaning of the charge-correction terms, namely alteration of the orbital energies as a function of atomic partial charges rather than alteration of the total energy, although the latter interpretation can be rather appealing and could be related to classical charge equilibration principles. In addition, with the interpretation of the EHT Hamiltonian as Eq. (6.1), the meaning of the slope constants is the same as in most charge-dependent EHT formulations.

We can now answer the question: “Which particular approximation of the 2-electron integrals, Eq. (5.3), leads to the SC-EHT Hamiltonian?” If we adopt the orbital (as opposed to energy) interpretation of the EHT Hamiltonian, Eq. (6.1), the form of the charge-dependent EHT Hamiltonian, Eqs. (2.2) and (3.1), can be obtained via the following approximation:

$$\begin{aligned} K_{ijab}&= 0,\end{aligned}$$
(6.2a)
$$\begin{aligned} J_{ijab}&= f_{ij}^{ab} S_{ ab },\end{aligned}$$
(6.2b)
$$\begin{aligned} f_{ij}^{ab}&= K_{ij} S_{ij} \left( {\left( {\delta _{Ia} -\frac{\bar{{n}}_I}{N}}\right) f_i +\left( {\delta _{Jb} -\frac{\bar{{n}}_J}{N}} \right) f_j} \right) , \end{aligned}$$
(6.2c)

where \(f_i\) are constants, \(\bar{{n}}_I\) is the reference electron population on atom \(I\), and \(N\) is the total number of electrons in the system. Under the approximations of Eq. (6.2), Eq. (6.1) becomes:

$$\begin{aligned} F_{ij}^\sigma =H_{ij}^0 +\sum _{a,b} {\left[ {\left( {P_{ ab }^\alpha +P_{ ab }^\beta }\right) J_{ijab} +P_{ ab }^\sigma K_{ijab}} \right] } =H_{ij}^0 +\sum _{a,b} {f_{ij}^{ab} S_{ ab } P_{ ab }}.\qquad \end{aligned}$$
(6.3)

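Utilizing the definition, Eq. (6.2c), with \(P_{ ab }=P_{ ab }^\alpha +P_{ ab }^\beta \) denoting the total density matrix and \(\sum _{a,b} {S_{ ab }P_{ ab }} =N\), the double sum becomes: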

$$\begin{aligned} \sum _{a,b} {f_{ij}^{ab} S_{ ab } P_{ ab }}&= K_{ij} S_{ij} \sum _{a,b} {\left( {\left( {\delta _{Ia} -\frac{\bar{{n}}_I}{N}}\right) f_i +\left( {\delta _{Jb}-\frac{\bar{{n}}_J}{N}} \right) f_j}\right) S_{ ab } P_{ ab }}\nonumber \\&= K_{ij} S_{ij} \sum _{a,b} {\delta _{Ia} f_i S_{ ab } P_{ ab }} -K_{ij} S_{ij} \frac{\bar{{n}}_I}{N}f_i \sum _{a,b} {S_{ ab }P_{ ab }}\nonumber \\&\quad +\,K_{ij} S_{ij} \sum _{a,b} {\delta _{Jb} f_j S_{ ab } P_{ ab }} -K_{ij} S_{ij} \frac{\bar{{n}}_J}{N}f_j \sum _{a,b} {S_{ ab }P_{ ab }}\nonumber \\&= K_{ij} S_{ij} \left( {f_i \left( {\sum _a {\delta _{Ia} n_a} -\bar{{n}}_I} \right) +f_j \left( {\sum _b {\delta _{Jb} n_b}-\bar{{n}}_J}\right) }\right) \nonumber \\&= K_{ij} S_{ij} ({f_i ({n_I -\bar{{n}}_I} )+f_j ({n_J -\bar{{n}}_J} )} )\nonumber \\&= K_{ij} S_{ij} ({f_i \delta n_I +f_j \delta n_J} ), \end{aligned}$$
(6.4)

where \(\delta n_I\) is the excess of the Mulliken (gross) population (not charge) on atom \(I\) with respect to the reference value \(\bar{{n}}_I \). To summarize:

$$\begin{aligned} F_{ij}^\sigma =H_{ij}^0 +\sum _{a,b} {f_{ij}^{ab} S_{ ab } P_{ ab }} =\frac{1}{2}K_{ij} S_{ij} \left( {h_i^0 +h_j^0}\right) +K_{ij} S_{ij} ({f_i \delta n_I +f_j \delta n_J}).\qquad \end{aligned}$$
(6.5)

If one chooses \(f_i =\frac{1}{2}a_i \) and recalls that \(\delta n_I=-q_I\), Eq. (6.5) turns into the desired charge-corrected EHT Hamiltonian:

$$\begin{aligned} H_{ij}^{SC-EHT,\sigma } =F_{ij}^\sigma =\frac{1}{2}K_{ij} S_{ij} \left( {\left[ {h_i^0 -a_i q_I}\right] +\left[ {h_j^0 -a_j q_J}\right] }\right) . \end{aligned}$$
(6.6)
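The reduction of the double sum in Eq. (6.4) to Mulliken population differences is purely algebraic and can be verified numerically. The sketch below is illustrative only: it assumes symmetric \(S\) and \(P\) matrices, takes \(N=\sum_{a,b}S_{ab}P_{ab}\), and reads \(\delta_{Ia}\) as "orbital \(a\) belongs to atom \(I\)". The function names and the `atom_of` orbital-to-atom map are hypothetical.

```python
def orbital_shift_sum(i, j, S, P, K, f, nbar, atom_of):
    """Left-hand side of Eq. (6.4): sum_ab f_ij^ab S_ab P_ab, with Eq. (6.2c)."""
    n = len(S)
    N = sum(S[a][b] * P[a][b] for a in range(n) for b in range(n))
    I, J = atom_of[i], atom_of[j]
    total = 0.0
    for a in range(n):
        for b in range(n):
            d_Ia = 1.0 if atom_of[a] == I else 0.0  # orbital a sits on atom I
            d_Jb = 1.0 if atom_of[b] == J else 0.0  # orbital b sits on atom J
            f_ab = K[i][j] * S[i][j] * ((d_Ia - nbar[I] / N) * f[i]
                                        + (d_Jb - nbar[J] / N) * f[j])
            total += f_ab * S[a][b] * P[a][b]
    return total

def population_form(i, j, S, P, K, f, nbar, atom_of):
    """Right-hand side of Eq. (6.4): K_ij S_ij (f_i dn_I + f_j dn_J)."""
    n = len(S)
    # Mulliken gross orbital populations n_a = sum_b P_ab S_ab
    gross = [sum(P[a][b] * S[a][b] for b in range(n)) for a in range(n)]
    pop = [0.0] * len(nbar)
    for a in range(n):
        pop[atom_of[a]] += gross[a]  # atomic gross population n_I
    I, J = atom_of[i], atom_of[j]
    return K[i][j] * S[i][j] * (f[i] * (pop[I] - nbar[I])
                                + f[j] * (pop[J] - nbar[J]))
```

Setting `f[i] = a[i] / 2` and identifying the population excess with minus the Mulliken charge then gives the off-diagonal shift of Eq. (6.6).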

Thus, the utilization of the charge-corrected EHT Hamiltonians for determination of the eigenvalues, adopted by many authors, is justified. One should keep in mind that in this situation the energy that is minimized variationally is not the one defined by Eq. (5.6). The sought-for energy is:

$$\begin{aligned} E_{ SC \text {-} EHT }&= \frac{1}{2}\sum _{i,j} {P_{ij}^\alpha \left( {H_{ij}^0 +F_{ij}^\alpha }\right) } +\frac{1}{2}\sum _{i,j} {P_{ij}^\beta \left( {H_{ij}^0 +F_{ij}^\beta }\right) }\nonumber \\&= \left( {\sum _{i,j} {P_{ij}^\alpha H_{ij}^0} +\sum _{i,j} {P_{ij}^\beta H_{ij}^0}}\right) -\frac{1}{2}\sum _{i,j} {P_{ij} K_{ij} S_{ij} ({a_i q_I +a_j q_J} )} \nonumber \\&= E_{ EHT }^0 -\frac{1}{2}\sum _{i,j} {P_{ij} K_{ij} S_{ij} ({a_i q_I +a_j q_J} )} \nonumber \\&= E_{ EHT }^0 -\sum _i {a_i ({P\tilde{S}} )_{ii} q_I}. \end{aligned}$$
(6.7)

Note that in Eq. (6.7) the atomic index \(I\) is a function of the orbital index \(i\): \(I=f(i ):i\in I\), i.e. \(I\) labels the atom that hosts orbital \(i\).
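The last step of Eq. (6.7), which turns the pairwise sum into a sum over orbitals, relies only on the symmetry of \(P\) and of the scaled overlap. A minimal sketch of that single step follows, assuming \(\tilde{S}_{ij}=K_{ij}S_{ij}\) (an assumption made here for illustration) and a hypothetical `atom_of` orbital-to-atom map.

```python
def pairwise_sum(P, S, K, a, q, atom_of):
    """1/2 sum_ij P_ij K_ij S_ij (a_i q_I + a_j q_J)."""
    n = len(P)
    return 0.5 * sum(P[i][j] * K[i][j] * S[i][j]
                     * (a[i] * q[atom_of[i]] + a[j] * q[atom_of[j]])
                     for i in range(n) for j in range(n))

def orbital_sum(P, S, K, a, q, atom_of):
    """sum_i a_i (P S~)_ii q_I, with S~_ij = K_ij S_ij (assumed definition)."""
    n = len(P)
    # (P S~)_ii = sum_j P_ij S~_ji
    ps_diag = [sum(P[i][j] * K[j][i] * S[j][i] for j in range(n))
               for i in range(n)]
    return sum(a[i] * ps_diag[i] * q[atom_of[i]] for i in range(n))
```

For symmetric `P`, `S`, and `K` the two functions return the same number, because each term \(a_j q_J\) can be relabeled as \(a_i q_I\) under the exchange of \(i\) and \(j\).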

In this section we showed that the SC-EHT can be derived straight from the HF theory, by neglecting exchange-type integrals, Eq. (5.3b), and by approximating Coulomb-type integrals, Eq. (5.3a), by the product of pair-wise overlaps:

$$\begin{aligned} J_{ijab} =({ab\,|\,ij} )=\frac{1}{2}K_{ij} S_{ ab } S_{ij} \left( {\left( {\delta _{Ia} -\frac{\bar{{n}}_I}{N}}\right) a_i +\left( {\delta _{Jb} -\frac{\bar{{n}}_J}{N}}\right) a_j}\right) . \end{aligned}$$
(6.8)

Analysis of Eq. (6.8) reveals explicitly the reasons for potential failures of the SC-EHT method and rationalizes some of the early modifications of the EHT method, Eqs. (2.7)–(2.11). First of all, the correct asymptotic form of the Coulomb-type integral is \(\frac{1}{R}\), where \(R\) is some measure of the separation of the orbitals. The product \(S_{ ab } S_{ij} \) behaves as \(\exp ({-R})\) if Slater AOs are utilized explicitly, or as \(\exp ({-R})\) gradually switching to \(\exp ({-R^{2}})\) if the STO-nGTO approach is used. For intermediate and short distances the approximation Eq. (6.8) is acceptable and causes no problems in most situations. However, the incorrect asymptotic behavior can have a prominent effect when long-range electrostatic interactions are important, for example when charge transfer or electronic polarization over an extended spatial region is considered. We also note that many other methods, typically considered high-level, often lack the correct \(\frac{1}{R}\) asymptotic form as well, so that their advantage over semiempirical methods stems mostly from their short-range description. The asymptotic behavior of the RHS of Eq. (6.8) can be partially improved by making the constant \(K_{ij} \) distance-dependent. In this regard, the Calzaferri formula, Eq. (2.8), can be considered one improvement of this type. Indeed, utilization of such an approximation helped to model excited states [56], a task particularly sensitive to long-range interactions. Use of correct asymptotic formulae for \(K_{ij}\), for example approximated by Ohno [75], Klopman [76], or Mataga [77] terms, Eq. (6.9), can be advantageous.

$$\begin{aligned} K_{ij} \sim ({a_{ij} +R^{n}} )^{-1/n}. \end{aligned}$$
(6.9)
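The difference between the interpolation form of Eq. (6.9) and an overlap-like exponential decay is easy to see numerically. The sketch below assumes a unit prefactor in Eq. (6.9) and a hypothetical decay exponent `zeta`; the values of `a_ij`, `n`, and `zeta` are placeholders, not fitted parameters.

```python
import math

def k_interpolated(R, a_ij=1.0, n=2):
    """Eq. (6.9) with unit prefactor: (a_ij + R**n) ** (-1/n)."""
    return (a_ij + R ** n) ** (-1.0 / n)

def overlap_like_decay(R, zeta=1.0):
    """exp(-zeta R), the Slater-type decay quoted for S_ab S_ij."""
    return math.exp(-zeta * R)

# R * K(R) tends to 1 at large R (correct 1/R tail), while
# R * exp(-zeta R) tends to 0 (the tail is lost).
for R in (1.0, 5.0, 20.0):
    print(R, R * k_interpolated(R), R * overlap_like_decay(R))
```

At \(R=0\) the interpolation stays finite, \(a_{ij}^{-1/n}\), which is what allows a single expression to cover both the short-range and the long-range regimes.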

An alternative to modifying the parameters \(K_{ij}\) is to modify the slope parameters so that they introduce a dependence of the \(\frac{1}{R}\) type on the inter-orbital separation. Specifically, assuming that the following equation holds,

$$\begin{aligned} a_i =\alpha _i +\sum _j {f_1({R_{ij}})\alpha _j}, \end{aligned}$$
(6.10)

one may naturally incorporate correct asymptotic behavior via functions \(f_1({R_{ij}})\), as well as introduce dependence of orbital energies not only on the charge of the host atom, but also on the charges on different atoms.

Finally, we consider the possibility of incorporating exchange effects into EHT by re-introducing the exchange integrals. The 4-orbital, 2-electron integrals, Eq. (5.3), do not have a clear distinction as Coulomb and exchange integrals when expressed in AOs, in contrast to their definition in the MO basis. Therefore, one can apply the same type of approximation to \(K_{ijab}\) as that applied to \(J_{ijab}\), Eq. (6.8), with a suitable orbital index permutation. Re-parameterization of the conventional quantities entering the EHT Hamiltonian definition may be needed.

7 Relation to other methods

In this section we establish connections between the SC-EHT method and two related methods: the charge equilibration (QEq) method by Rappe and Goddard [78] and the family of DFTB methods by Elstner and co-workers [48–51]. We start by analyzing the result, Eq. (6.7). Assuming that the parameter \(K_{ij}\) is independent of the orbital indices, and that the parameter \(a_i\) is the same for all orbitals centered on a given atom \(I,\,a_i =a_I ,\forall i\in I\), the energy term \(-\sum _i {a_i ({P\tilde{S}} )_{ii} q_I}\) can be simplified:

$$\begin{aligned} -\sum _i {a_i ({P\tilde{S}} )_{ii} q_I}&= -K\sum _i {a_i n_i q_I} =K\sum _I {q_I a_I \sum _{i\in I} {-n_i}} \nonumber \\&= K\sum _I {a_I q_I \left( {Z_I -\sum _{i\in I} {n_i}}\right) } -K\sum _I {a_I q_I Z_I} \nonumber \\&= \sum _I {({-Ka_I Z_I} )q_I} +\sum _I {Ka_I q_I^2}. \end{aligned}$$
(7.1)
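The regrouping in Eq. (7.1) can be confirmed with a short numerical check. This sketch assumes the Mulliken charge definition \(q_I = Z_I - \sum_{i\in I} n_i\) used in the second line of Eq. (7.1); the function names and the `atom_of` map are hypothetical.

```python
def charges(Z, n_orb, atom_of):
    """Mulliken-type charges q_I = Z_I - sum_{i in I} n_i."""
    q = list(Z)
    for i, n_i in enumerate(n_orb):
        q[atom_of[i]] -= n_i
    return q

def orbital_form(K, a_atom, Z, n_orb, atom_of):
    """First line of Eq. (7.1): -K sum_i a_I n_i q_I."""
    q = charges(Z, n_orb, atom_of)
    return -K * sum(a_atom[atom_of[i]] * n_i * q[atom_of[i]]
                    for i, n_i in enumerate(n_orb))

def atomic_form(K, a_atom, Z, n_orb, atom_of):
    """Last line of Eq. (7.1): sum_I (-K a_I Z_I) q_I + sum_I K a_I q_I^2."""
    q = charges(Z, n_orb, atom_of)
    return sum(-K * a_I * Z_I * q_I + K * a_I * q_I * q_I
               for a_I, Z_I, q_I in zip(a_atom, Z, q))
```

Both functions evaluate the same quantity; the second form makes the linear and quadratic dependence on the atomic charges explicit.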

The term, Eq. (7.1), has a clear interpretation: the partial atomic charge \(q_I\) interacts attractively with the (effective) core nuclear charge, \(Z_I\), and repulsively with itself. To relate the SC-EHT method to the QEq scheme [78], we consider the energy of a charged atom. A Taylor expansion in atomic charge fluctuations (e.g. partial Mulliken charges), \(q_I \), up to second order yields:

$$\begin{aligned} E_I ({q_I} )=E_{I,0} +\frac{\partial E}{\partial q_I}q_I +\frac{1}{2!}\frac{\partial ^{2}E}{\partial q_I^2}q_I^2. \end{aligned}$$
(7.2)

The energy diagram of the atom with different numbers of electrons is presented in Fig. 1b. Note the difference between the definitions of the (atomic) EA and IP quantities used in the QEq scheme and the orbital-resolved VSIPs shown in Fig. 1a.

For the system of N atoms the energy can be written as (up to the second order in charge fluctuations):

$$\begin{aligned} E(q)=E_0 +\sum _I {\frac{\partial E}{\partial q_I}q_I} +\frac{1}{2!}\sum _I {\frac{\partial ^{2}E}{\partial q_I^2}q_I^2} +\frac{1}{2!}{\mathop {\mathop {\sum }\limits _{I,J}}\limits _{I \ne J}} {\frac{\partial ^{2}E}{\partial q_I \partial q_J}q_I q_J}. \end{aligned}$$
(7.3)

Unlike the original QEq scheme, in the SC-EHT method the charges are determined from the orbital occupations, which are obtained by solving self-consistent field equations. It is the form of the energy expression that is similar in the two methods. Comparing the structure of Eq. (7.3) with that of Eq. (7.1), we observe a clear similarity and can establish useful relations between the proportionality constants used in the two methods. Equation (7.1) disregards all mixed second-order derivatives of the energy. For the rest of the terms the relations are straightforward:

$$\begin{aligned} E_0&= E_{ EHT }^0,\end{aligned}$$
(7.4a)
$$\begin{aligned} ({-Ka_I Z_I})&= \frac{\partial E}{\partial q_I}\approx \frac{1}{2}({IP_I +EA_I} )=\chi _I,\end{aligned}$$
(7.4b)
$$\begin{aligned} Ka_I&= \frac{1}{2}\frac{\partial ^{2}E}{\partial q_I^2}\approx \frac{1}{2}({IP_I -EA_I} )=\frac{1}{2}J_I. \end{aligned}$$
(7.4c)

The definition of the Mulliken charges, Eq. (3.2), utilizes a well-defined value of the effective core charge, \(Z_I\). Therefore, Eqs. (7.4b) and (7.4c) are overdetermined with respect to the parameter \(a_I\). We also remind the reader that this result is obtained under the assumptions \(a_i =a_I,\forall i\in I\) and \(K_{ij} =K,\forall i,j\). If these requirements are lifted, one may obtain a different set of equations, possibly better determined. Because of the close connection between the SC-EHT and QEq methods, one may apply the latter for finding charges, which is efficient even for large systems. The resulting charges can then be used to construct the charge-corrected effective 1-electron Hamiltonian and to determine the electronic structure in a non-iterative way. For the described approach to yield the best accuracy, it is important that the parameters \(a_I, \,Z_I, \,K\) on one side and the parameters \(J_I\) and \(\chi _I\) on the other side are chosen in the most consistent way, as suggested by Eq. (7.4).
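A non-iterative charge determination of this kind can be sketched as follows. This is a deliberately simplified illustration and not the full QEq scheme of Ref. [78]: it keeps only the terms of Eq. (7.3) that survive in Eq. (7.1), i.e. it neglects all mixed second derivatives (interatomic Coulomb terms), and minimizes the energy under a fixed total charge with a Lagrange multiplier. The function names are hypothetical; the parameter mapping follows Eqs. (7.4b) and (7.4c).

```python
def qeq_parameters_from_eht(K, a, Z):
    """Eqs. (7.4b), (7.4c): chi_I = -K a_I Z_I and J_I = 2 K a_I."""
    chi = [-K * a_I * Z_I for a_I, Z_I in zip(a, Z)]
    J = [2.0 * K * a_I for a_I in a]
    return chi, J

def diagonal_qeq_charges(chi, J, total_charge=0.0):
    """Minimize sum_I (chi_I q_I + 1/2 J_I q_I^2) subject to sum_I q_I = Q.

    Simplification: no interatomic Coulomb terms are included.
    Stationarity gives chi_I + J_I q_I = mu for every atom.
    """
    inv = [1.0 / J_I for J_I in J]
    mu = (total_charge + sum(c * w for c, w in zip(chi, inv))) / sum(inv)
    return [(mu - c) * w for c, w in zip(chi, inv)]
```

The resulting charges could then be inserted into a charge-corrected Hamiltonian such as Eq. (6.6) and the eigenvalue problem solved once. Note that the mapping fixes the ratio \(\chi_I/J_I=-Z_I/2\), which is one way to see the overdetermination mentioned above.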

An equation similar to Eq. (7.3) appears in the theory of the self-consistent DFTB methods, SCC-DFTB [50] and DFTB3 [51]. Namely, the total DFT energy can be approximated by a Taylor expansion in the charge density fluctuation, \(\Delta \rho =\rho -\rho _0 \):

$$\begin{aligned} E[\rho ]&= E_0 +E^{(2)}+E^{(3)}+\cdots ,\end{aligned}$$
(7.5a)
$$\begin{aligned} E^{(2)}&= \frac{1}{2}\int {d\vec {r}^{\prime }\int {d\vec {r}\left( {\frac{1}{| {\vec {r}-\vec {r}^{\prime }} |}+\left. {\frac{\delta ^{2}E_{xc}}{\delta \rho \delta \rho ^{\prime }}} \right| _{\rho _0,{\rho _0}^{\prime }}}\right) \Delta \rho \Delta \rho ^{\prime }}},\end{aligned}$$
(7.5b)
$$\begin{aligned} E^{(3)}&= \frac{1}{6}\int {d\vec {r}^{{\prime }{\prime }}\int {d\vec {r}^{\prime }\int {d\vec {r}\left. {\frac{\delta ^{3}E_{xc}}{\delta \rho \,\delta \rho ^{\prime }\,\delta \rho ^{{\prime }{\prime }}}} \right| _{\rho _0,{\rho _0}^{\prime },{\rho _0}^{{\prime }{\prime }}} \Delta \rho \Delta \rho ^{\prime } \Delta \rho ^{{\prime }{\prime }}}}}.\quad \end{aligned}$$
(7.5c)

The second and third order terms are approximated:

$$\begin{aligned} E^{(2)}\approx E^{\gamma }\equiv \frac{1}{2}\sum _{A,B} {\gamma _{AB} \Delta q_A \Delta q_B}, \end{aligned}$$
(7.6a)

and

$$\begin{aligned} E^{(3)}\approx E^{\Gamma }\equiv \frac{1}{3}\sum _{A,B} {\Delta q_A^2 \Delta q_B \Gamma _{AB}}, \end{aligned}$$
(7.6b)

The similarity of Eqs. (7.6) to Eq. (7.3) and Eq. (7.1) is now apparent. The second-order energy correction, Eq. (7.6a), leads to an effective 1-electron Hamiltonian of the form:

$$\begin{aligned} F_{ij}^{ SCC \text {-} DFTB } =H_{ij}^0 +\frac{1}{2}S_{ij} \sum _A {({\gamma _{IA} +\gamma _{JA}} )\Delta q_A}, \end{aligned}$$
(7.7)

which can be compared to Eq. (6.6), for example. The major deficiencies of the latter are the lack of summation over all atomic charges and the wrong asymptotic form of \(K_{ij} a_i \). From Eq. (7.7) it is clear that the orbital energies must be corrected not only for the charge present on the atom containing the orbital, but also for the charges on all other atoms. This correction to the SC-EHT method can be introduced by the approximation Eq. (6.10), for example. Secondly, the parameters \(K_{ij} a_i\) must be chosen to behave similarly to the functions \(\gamma _{IA}\), which possess the correct \(\frac{1}{R}\) asymptotic form. Finally, non-linear charge-correction terms, which have been reported in some versions of the SC-EHT [8], can lead to higher-order corrections of the total energy, such as Eq. (7.6b). Therefore, we identify a close relation and a high degree of similarity between the SC-EHT-based methods and DFTB with self-consistent charges. On the grounds of this comparative analysis, as well as the analysis of the asymptotic behavior and physical interpretation of certain quantities discussed in previous sections, we suggest that proper modifications of the original SC-EHT methods can be developed, leading to high-accuracy semiempirical methods that have their roots in rigorous DFT and wavefunction theories.
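For comparison with the SC-EHT expression, the charge-dependent shift of Eq. (7.7) can be written out directly. The following is a minimal sketch under stated assumptions: `gamma` is a precomputed symmetric atom-by-atom matrix, `dq` holds the atomic charge fluctuations \(\Delta q_A\), and `atom_of` is a hypothetical orbital-to-atom map. It is not taken from any DFTB code.

```python
def scc_dftb_fock(H0, S, gamma, dq, atom_of):
    """Eq. (7.7): F_ij = H0_ij + 1/2 S_ij sum_A (gamma_IA + gamma_JA) dq_A."""
    n_atoms = len(dq)
    # Per-atom potential shift: sum_A gamma_IA dq_A (runs over ALL atoms A)
    shift = [sum(gamma[I][A] * dq[A] for A in range(n_atoms))
             for I in range(n_atoms)]
    n = len(H0)
    return [[H0[i][j] + 0.5 * S[i][j] * (shift[atom_of[i]] + shift[atom_of[j]])
             for j in range(n)] for i in range(n)]
```

In contrast to Eq. (6.6), every matrix element here feels the charges on all atoms through `shift`, which is the feature that Eq. (6.10) would add to the SC-EHT slope parameters.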

Finally, we discuss the relation of the simple EHT scheme and its self-consistent variant to high-level ab initio theories. In his fundamental works, Löwdin elaborated a rigorous wavefunction theory that describes many-body interactions in quantum systems [79–81]. Among other important results, the method of configuration interaction (CI) was presented. Nowadays, the CI family of methods provides very high accuracy, making wavefunction-based calculations predictive. It is important for our purposes that, as discussed by Löwdin, variational electronic structure CI calculations either can be performed using a linear CI Hamiltonian and a large enough set of basis states, or can be based on a non-linear projected Hamiltonian and a smaller set of basis states. The advantages of the non-linear equations are questionable in the straightforward application of the CI method. However, the projected Hamiltonian formulation provides fundamental grounds for constructing Hamiltonians of the EHT type that also account for many-body quantum effects. Further improvements of the EHT and SC-EHT methods can be based on elaboration of the non-linear projected Hamiltonian derived by Löwdin.

In recent years several works have attempted to utilize ideas similar to the one just discussed. Namely, semiempirical methods have been used as a framework for efficient and accurate calculations on large systems, with the parameters derived by fitting to the results of correlated calculations on small systems. This approach is essentially a projection of the CI-based Hamiltonian onto a simple effective Hamiltonian of semiempirical form.

Projected Hamiltonians were constructed for fast calculations with an accuracy comparable to that of correlated wavefunction methods. For example, Rossi and Truhlar [82] utilized the neglect of diatomic differential overlap (NDDO) approximation as a framework to fit the potential energy surfaces (PESs) of the Cl \(+\) CH\(_{4}\) reactive system. The parameters were derived from a number of single-point calculations along the reaction coordinate, as obtained with the MP2 method. The resulting model was able to describe successfully points of the PESs away from the reaction coordinate.

The Thiel group proposed a transfer Hamiltonian approach. The method originates from the standard coupled-cluster theory in which the formally exact wavefunction, \(| \Psi \rangle \) can be expressed via:

$$\begin{aligned} | \Psi \rangle =\exp ({\hat{{T}}} )|{\Phi _0} \rangle . \end{aligned}$$
(7.8)

\(| {\Phi _0} \rangle \) is the reference wavefunction, which is typically chosen as the ground state Slater determinant, and \(\hat{{T}}\) is the excitation operator. Then, solving the Schrödinger equation for the exact wavefunction \(|\Psi \rangle \):

$$\begin{aligned} H| \Psi \rangle =E| \Psi \rangle , \end{aligned}$$
(7.9)

is equivalent to finding orbitals of the reference wavefunction \(| {\Phi _0} \rangle \), but with the projected Hamiltonian \(\bar{{H}}\):

$$\begin{aligned} \bar{{H}}| {\Phi _0} \rangle&= E| {\Phi _0} \rangle ,\end{aligned}$$
(7.10a)
$$\begin{aligned} \bar{{H}}&= \exp ({-\hat{{T}}} )H\exp ({\hat{{T}}} ). \end{aligned}$$
(7.10b)

Eventually, the equations can be reduced to the form of Eq. (2.12), but with the effective Hamiltonian containing correlation and electrostatic effects. Thus, a generalization of the EHT and SC-EHT methods is of broad and fundamental value, since it is rooted in correlated wavefunction theory. The construction Eq. (7.10b) is similar to earlier results of Löwdin. The non-linear form of the projected Hamiltonian can be used as a starting point for further theoretical elaboration of new generations of the EHT method, with the SC-EHT being among the simplest ones.
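The non-linear character of the projected Hamiltonian, Eq. (7.10), can be illustrated with a toy two-state model. The sketch below is an illustration constructed here, not taken from the cited works: it assumes a real symmetric \(2\times 2\) Hamiltonian in the basis of the reference state \(|0\rangle\) and one excited state \(|1\rangle\), with \(\hat{T}=t\,|1\rangle\langle 0|\), so that \(\exp(\pm\hat{T})=1\pm\hat{T}\) exactly. The coupling `h01` must be non-zero.

```python
import math

def similarity_transformed(h00, h01, h11, t):
    """Hbar = exp(-T) H exp(T) for T = t |1><0| in the {|0>, |1>} basis."""
    return [[h00 + h01 * t, h01],
            [h01 + (h11 - h00) * t - h01 * t * t, h11 - h01 * t]]

def solve_two_state(h00, h01, h11):
    """Solve the amplitude equation Hbar_10 = 0 (quadratic in t).

    Returns (t, E) for the root with the lower energy E = Hbar_00.
    """
    gap = h11 - h00
    disc = math.sqrt(gap * gap + 4.0 * h01 * h01)
    roots = [(gap + s * disc) / (2.0 * h01) for s in (1.0, -1.0)]
    t = min(roots, key=lambda r: h00 + h01 * r)
    return t, h00 + h01 * t
```

Once the quadratic amplitude equation is satisfied, the reference state alone is an eigenvector of the transformed matrix and \(\bar{H}_{00}\) coincides with an eigenvalue of the original linear problem. The price of working with a single reference state is the non-linearity in \(t\).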

Another result found by Löwdin concerns convergence properties of the CI expansions. It was found that the choice of 1-particle orbitals as eigenfunctions of the density matrix leads to the fastest convergence, such that a single Slater determinant (HF method) may be an adequate approximation. The result is important for the EHT and derivative methods, since they all are based on a single Slater determinant wavefunction. This concerns mostly the theoretical formulations rather than the accuracy and physical interpretation, because of the parametric nature of the EHT and SC-EHT methods. The choice of orbitals may be compensated by the choice of the parameters.

8 Conclusions

In this work, the SC-EHT method has been analyzed in detail. Construction of a proper 1-electron Hamiltonian operator that variationally minimizes the energy for a given charge-corrected EHT Hamiltonian has been presented. Simple routes are based on the mapping of the ab initio HF Fock operator onto the effective EHT Hamiltonian (energy-based mapping) or onto the effective 1-electron Hamiltonian (orbital-based mapping). Both approaches are valid, but their interpretation is different and must be properly performed when analyzing results or when developing parameterizations against different types of data (e.g. enthalpies of formation and electronic spectra).

Using the energy mapping, the self-consistency correction is required and can be easily obtained. Our analysis suggests a much simpler formulation than the one that can be obtained by application of direct variation with respect to the density matrix. The convenience of our approach is especially valuable when the matrix elements are strongly non-linear in atomic charges.

For orbital-based mapping, which is very convenient for analysis and Hamiltonian construction, the correction is not needed. Instead, the effective 1-electron Hamiltonian is obtained as a specific approximation of the HF Fock matrix. The approximation leading to the original SC-EHT method disregards exchange integrals and introduces incorrect asymptotic behavior of the matrix elements as a function of inter-orbital separation. We discuss earlier approximations in light of their potential to soften the introduced inaccuracies. Further, we propose possible modifications that would allow one to achieve correct asymptotic properties and to improve the quality of the approximations, bringing the SC-EHT method to a new level of theory. The proposed modifications are expected to have a prominent impact on the accuracy of the wavefunction and related properties, especially for charge transfer and polarization processes that occur over extended spatial regions in large-scale systems. They can serve as the basis for novel accurate and efficient semiempirical methodologies.

We establish a connection of the SC-EHT method and its possible extensions to the QEq method and to the series of DFTB approximations. The relation to the QEq method suggests techniques for avoiding self-consistent charge determination via MO optimization. Instead, the charges may be obtained directly from the computationally more favorable QEq method and can then be used for non-iterative electronic structure calculations, provided a suitable mapping between the parameters in the two techniques is established. The analogy of the SC-EHT with the DFTB-derived methods emphasizes that the SC-EHT can also be considered an approximation to the DFT technique, not just to the HF method.

We also outline the relation of the EHT and SC-EHT methods to correlated wavefunction methods. In particular, we argue that the simple Hamiltonians of the EHT or SC-EHT type may be further elaborated starting from the pioneering works of Löwdin on non-linear projected Hamiltonians, as exemplified by several recent works reporting the mapping of high-level ab initio calculations onto reparameterized semiempirical Hamiltonians of the NDDO form. The improvements along these lines can also be applied to the significantly simpler EHT and SC-EHT Hamiltonians, leading to novel, computationally efficient, high-accuracy electronic structure calculations applicable to large-scale systems.

To recapitulate, with the above analysis we demonstrate that the SC-EHT derived Hamiltonians provide rigorous grounds and promising opportunities for constructing simple, physically transparent, accurate, and efficient semiempirical methodologies that deserve further exploration.