The unappreciated Hellmann-Feynman theorem!

The Hellmann-Feynman theorem has an illustrious albeit somewhat complicated lineage, which includes three Nobel Laureates. The key relationship was reported by Schrödinger in one of his landmark 1926 papers [1], by Güttinger in 1932 [2] and by both Pauli [3] and Hellmann [4] in 1933. Feynman presented it, apparently independently, in 1939 [5]. It was Hellmann [6] and Feynman [5] who arrived separately at one of the best known consequences of the theorem, which is presumably why their names are attached to it.

Over the years, the approval rating of the Hellmann-Feynman theorem has fluctuated wildly. In a 1945 paper, Coulson and Bell concluded that it was invalid [7]; however, Berlin rescued it in 1951 [8], demonstrating that it was their argument that was invalid. He also showed how the theorem could be used to obtain insight into bonding forces in molecules, an approach that Bader et al. followed successfully a few years later [9,10,11]. In 1962, Wilson used the theorem to derive an exact formula for molecular energies [12]. Nevertheless, Musher claimed in 1966 that many find the theorem to be “too trivial to merit the term ‘theorem’” [13]. Yet only six years later, no less an authority than Slater proclaimed the Hellmann-Feynman theorem to be one of the “two most powerful theorems applicable to molecules and solids” [14], the other being the virial theorem.

Overall, the Hellmann-Feynman theorem has not received the respect that it deserves. Fernandez Rico et al. commented in 2005 that “the possibilities that it opens up have been scarcely exploited, and today the theorem is mostly regarded as a scientific curiosity” [15]. Deb had already concluded nearly 25 years earlier that “the apparent simplicity of the H-F theorem had evoked some skepticism and suspicion” [16]. We agree with Deb, and note the irony in the fact that simplicity arouses suspicion, given the observations of Newton (“Nature is pleased with simplicity.”) and Einstein (“Nature is the realization of the simplest conceivable mathematical ideas.”) [17].

We suggest that the real power of the Hellmann-Feynman theorem is, at present, more conceptual than numerical. In this brief overview, we shall point out some significant insights to which it has led, despite the “flaw” of being lamentably straightforward.

Derivations

Consider a system having a Hamiltonian operator H with a normalized eigenfunction Ψ and energy eigenvalue E; then HΨ = EΨ and accordingly E = <Ψ*|H|Ψ>. If λ is any parameter that appears explicitly in the Hamiltonian, then,

$$ \frac{\mathrm{\partial E}}{\mathrm{\partial \uplambda }}=\frac{\partial }{\mathrm{\partial \uplambda }}\left\langle {\Psi}^{\ast}\left|\mathrm{H}\right|\Psi \right\rangle $$
(1)
$$ =\left\langle \frac{{\mathrm{\partial \Psi}}^{\ast }}{\mathrm{\partial \uplambda }}\left|\mathrm{H}\right|\Psi \right\rangle +\left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial H}}{\mathrm{\partial \uplambda }}\right|\Psi \right\rangle +\left\langle {\Psi}^{\ast}\left|\mathrm{H}\right|\frac{\mathrm{\partial \Psi }}{\mathrm{\partial \uplambda }}\right\rangle $$
(2)

Since the operator H is Hermitian, we can rewrite the third term on the right side of Eq. (2),

$$ \frac{\mathrm{\partial E}}{\mathrm{\partial \uplambda }}=\left\langle \frac{{\mathrm{\partial \Psi}}^{\ast }}{\mathrm{\partial \uplambda }}\left|\mathrm{H}\right|\Psi \right\rangle +\left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial H}}{\mathrm{\partial \uplambda }}\right|\Psi \right\rangle +\left\langle \frac{\mathrm{\partial \Psi }}{\mathrm{\partial \uplambda }}\left|\mathrm{H}\right|{\Psi}^{\ast}\right\rangle $$
(3)

and since HΨ = EΨ,

$$ \frac{\mathrm{\partial E}}{\mathrm{\partial \uplambda }}=\mathrm{E}\left\langle \frac{{\mathrm{\partial \Psi}}^{\ast }}{\mathrm{\partial \uplambda }}\left|\Psi \right.\right\rangle +\left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial H}}{\mathrm{\partial \uplambda }}\right|\Psi \right\rangle +\mathrm{E}\left\langle \frac{\mathrm{\partial \Psi }}{\mathrm{\partial \uplambda }}\left|{\Psi}^{\ast}\right.\right\rangle $$
(4)

The first and third terms on the right side of Eq. (4) add up to zero,

$$ \mathrm{E}\left\langle \frac{{\mathrm{\partial \Psi}}^{\ast }}{\mathrm{\partial \uplambda }}\left|\Psi \right.\right\rangle +\mathrm{E}\left\langle \frac{\mathrm{\partial \Psi }}{\mathrm{\partial \uplambda }}\left|{\Psi}^{\ast}\right.\right\rangle =\mathrm{E}\frac{\partial }{\mathrm{\partial \uplambda }}\left\langle {\Psi}^{\ast}\left|\Psi \right.\right\rangle =\mathrm{E}\frac{\partial }{\mathrm{\partial \uplambda }}(1)=0 $$
(5)

so that Eq. (4) becomes,

$$ \frac{\mathrm{\partial E}}{\mathrm{\partial \uplambda }}=\left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial H}}{\mathrm{\partial \uplambda }}\right|\Psi \right\rangle $$
(6)

Equation (6) is what Levine has called the “generalized Hellmann-Feynman theorem” [18]. (This term has also been applied to various extensions of Eq. (6) [19,20,21].)

While the derivation of Eq. (6) may appear to be deplorably simple, this is somewhat deceptive, and even as eminent a theoretician as Coulson could be misled. Equations (1) and (6) show that,

$$ \frac{\partial }{\mathrm{\partial \uplambda }}\left\langle {\Psi}^{\ast}\left|\mathrm{H}\right|\Psi \right\rangle =\left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial H}}{\mathrm{\partial \uplambda }}\right|\Psi \right\rangle $$
(7)

Since H can be expressed in terms of the kinetic energy operator T and the potential energy operator V, H = T + V, Eq. (7) might seem to imply that analogous equations can be written for T and V separately. Coulson and Bell argued that the analogues of Eq. (7) for T alone and V alone would not be valid [7], and that is correct:

$$ \frac{\partial }{\mathrm{\partial \uplambda }}\left\langle {\Psi}^{\ast}\left|\mathrm{T}\right|\Psi \right\rangle \ne \left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial T}}{\mathrm{\partial \uplambda }}\right|\Psi \right\rangle \kern0.36em \frac{\partial }{\mathrm{\partial \uplambda }}\left\langle {\Psi}^{\ast}\left|\mathrm{V}\right|\Psi \right\rangle \ne \left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial V}}{\mathrm{\partial \uplambda }}\right|\Psi \right\rangle $$
(8)

Their error was in not recognizing that Eq. (7) does not in reality imply the T and V analogues. As pointed out by Berlin [8], the derivation of Eqs. (6) and (7) relies on the fact that Ψ is an eigenfunction of H; it is not an eigenfunction of either T or V and so the transition from Eq. (3) to Eq. (4) could not be made for either T or V as the operator.
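The failure of the separate analogues in Eq. (8) can be made concrete with an exactly solvable model (our illustration, not Coulson and Bell's). For the one-dimensional harmonic oscillator H = T + V with V = λx²/2, in atomic units, the ground state has E = ½√λ and, by the virial theorem, ⟨T⟩ = ⟨V⟩ = ¼√λ. Then,

$$ \frac{\partial }{\mathrm{\partial \uplambda }}\left\langle {\Psi}^{\ast}\left|\mathrm{T}\right|\Psi \right\rangle =\frac{1}{8\sqrt{\uplambda}}\ne \left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial T}}{\mathrm{\partial \uplambda }}\right|\Psi \right\rangle =0\kern0.36em \frac{\partial }{\mathrm{\partial \uplambda }}\left\langle {\Psi}^{\ast}\left|\mathrm{V}\right|\Psi \right\rangle =\frac{1}{8\sqrt{\uplambda}}\ne \left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial V}}{\mathrm{\partial \uplambda }}\right|\Psi \right\rangle =\frac{1}{4\sqrt{\uplambda}} $$

The two discrepancies cancel: the left sides sum to 1/(4√λ), as do the right sides, and this common value is just ∂E/∂λ, in accord with Eq. (7).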

Prior to 1937, Eq. (6) had already been obtained by Schrödinger [1], Güttinger [2], Pauli [3], and Hellmann [4], and perhaps by others as well. However it was Hellmann in 1937 [6] and Feynman in 1939 [5] who arrived at its most famous (or infamous) formulation. For a system of nuclei and electrons within the Born-Oppenheimer approximation, the Hamiltonian is,

$$ \mathrm{H}=-\frac{1}{2}\sum \limits_{\mathrm{A}}{\nabla_{\mathrm{A}}}^2-\frac{1}{2}\sum \limits_{\mathrm{i}}{\nabla_{\mathrm{i}}}^2+\sum \limits_{\mathrm{A}}\sum \limits_{\mathrm{B}>\mathrm{A}}\frac{{\mathrm{Z}}_{\mathrm{A}}{\mathrm{Z}}_{\mathrm{B}}}{\left|{\mathbf{R}}_{\mathrm{B}}-{\mathbf{R}}_{\mathrm{A}}\right|}-\sum \limits_{\mathrm{A}}\sum \limits_{\mathrm{i}}\frac{{\mathrm{Z}}_{\mathrm{A}}}{\left|{\mathbf{r}}_{\mathrm{i}}-{\mathbf{R}}_{\mathrm{A}}\right|}+\sum \limits_{\mathrm{i}}\sum \limits_{\mathrm{j}>\mathrm{i}}\frac{1}{\left|{\mathbf{r}}_{\mathrm{j}}-{\mathbf{r}}_{\mathrm{i}}\right|} $$
(9)

In Eq. (9), RA and RB are the positions of nuclei A and B and ri and rj are the positions of electrons i and j. Atomic units are used throughout this paper. Let the parameter in Eq. (6) be a coordinate of nucleus A, e.g., xA; then,

$$ \frac{\mathrm{\partial E}}{\partial {\mathrm{x}}_{\mathrm{A}}}=\left\langle {\Psi}^{\ast}\left|\frac{\mathrm{\partial H}}{\partial {\mathrm{x}}_{\mathrm{A}}}\right|\Psi \right\rangle $$
(10)

Since the gradient of an energy is the negative of a force, Eq. (10) gives the negative of the x-component of the force felt by nucleus A.

The first, second, and fifth terms on the right side of Eq. (9) do not explicitly depend upon the nuclear coordinates; hence, their derivatives in Eq. (10) are zero. Performing the differentiations indicated in Eq. (10) for the remaining two terms of Eq. (9), and doing this for each coordinate of A (see Feynman [5] or Levine [18]), will yield the gradient of E at RA and thus the negative of the force exerted upon nucleus A:

$$ \mathrm{\nabla E}\left({\mathbf{R}}_{\mathrm{A}}\right)=-\mathbf{F}\left({\mathbf{R}}_{\mathrm{A}}\right)=\sum \limits_{\mathrm{B}\ne \mathrm{A}}\frac{{\mathrm{Z}}_{\mathrm{A}}{\mathrm{Z}}_{\mathrm{B}}\left({\mathbf{R}}_{\mathrm{B}}-{\mathbf{R}}_{\mathrm{A}}\right)}{{\left|{\mathbf{R}}_{\mathrm{B}}-{\mathbf{R}}_{\mathrm{A}}\right|}^3}-\int \frac{{\mathrm{Z}}_{\mathrm{A}}\left({\mathbf{r}}^{\prime }-{\mathbf{R}}_{\mathrm{A}}\right)\uprho \left({\mathbf{r}}^{\prime}\right)\mathrm{d}{\mathbf{r}}^{\prime }}{{\left|{\mathbf{r}}^{\prime }-{\mathbf{R}}_{\mathrm{A}}\right|}^3} $$
(11)

The negative of the first term on the right side of Eq. (11) is the force on nucleus A due to the other nuclei, and the negative of the second term is the force due to the electrons.

Equation (11) is the key result. It is just the generalized Hellmann-Feynman theorem for a particular choice of λ, but to distinguish it from the more general Eq. (6), it has sometimes been called the “electrostatic theorem” [11, 15, 18].

What is the significance of Eq. (11)? It states, in Feynman’s words [5], that “the force on any nucleus (considered fixed) in any system of nuclei and electrons is just the classical electrostatic attraction exerted on the nucleus in question by the other nuclei and by the electron charge density distribution for all electrons.” Thus, Coulomb’s Law is sufficient to explain the bonding in molecules, complexes, etc. All that is needed is the electronic density distribution (which can be obtained experimentally as well as computationally) and the positions and charges of the nuclei. The Hellmann-Feynman theorem, in the form of Eq. (11), can in fact be regarded as a forerunner of the Hohenberg-Kohn theorem [22], which shows that the electronic density alone determines all of the properties of a system of nuclei and electrons, even the external potential due to the nuclei.
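To make the ingredients of Eq. (11) explicit, here is a minimal numerical sketch of our own (the charges, geometry, Gaussian exponent, and grid are all arbitrary illustrative choices, not data from the cited works). It evaluates both terms of Eq. (11) on a grid for a model density: a single normalized Gaussian “electron cloud” centered on nucleus B.

```python
import numpy as np

# Two fixed nuclei on the z-axis (atomic units); all parameters are
# arbitrary illustrative choices
ZA, ZB = 1.0, 1.0
RA = np.array([0.0, 0.0, 0.0])
RB = np.array([0.0, 0.0, 2.0])

def rho(pts):
    """Model electronic density: one normalized Gaussian centered on B."""
    a = 1.0
    d2 = np.sum((pts - RB) ** 2, axis=-1)
    return (a / np.pi) ** 1.5 * np.exp(-a * d2)

# Grid enclosing both centers; chosen so that no point coincides with R_A
gxy = np.linspace(-6.0, 6.0, 60)   # symmetric in x and y, excludes 0
gz = np.linspace(-6.0, 8.0, 71)
dxy, dz = gxy[1] - gxy[0], gz[1] - gz[0]
X, Y, Z = np.meshgrid(gxy, gxy, gz, indexing="ij")
pts = np.stack([X, Y, Z], axis=-1)

# Electronic term of Eq. (11): Z_A * int (r - R_A) rho(r) / |r - R_A|^3 dr
d = pts - RA
r3 = np.sum(d * d, axis=-1) ** 1.5
elec = ZA * np.sum(d * (rho(pts) / r3)[..., None], axis=(0, 1, 2)) * dxy * dxy * dz

# Nuclear term: Z_A Z_B (R_B - R_A) / |R_B - R_A|^3
rAB = RB - RA
nuc = ZA * ZB * rAB / np.linalg.norm(rAB) ** 3

# Eq. (11): grad E(R_A) = nuc - elec = -F(R_A)
force_on_A = elec - nuc
print(force_on_A)
```

The x and y components vanish by symmetry. Along z, the Gaussian cloud only partially screens nucleus B at this separation, so the electronic attraction falls slightly short of the nuclear repulsion and the net force on A is weakly repulsive. For an exact density at an equilibrium geometry, the net force on each nucleus would of course be zero.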

Equation (11), expressing the Coulombic force felt by a nucleus in a molecule or complex, could certainly be written by someone with no knowledge of the generalized Hellmann-Feynman theorem, Eq. (6), or even of quantum mechanics. So what function does Eq. (6) serve in this context? It establishes the quantum-mechanical credibility of Eq. (11), by relating it directly to the Schrödinger equation.

Hellmann-Feynman theorem and approximate wave functions

The derivation of the generalized Hellmann-Feynman theorem, Eq. (6), relied explicitly upon the wave function Ψ being an eigenfunction of the Hamiltonian operator. In practice, however, the wave functions are almost invariably approximate, and therefore not eigenfunctions. Can Eq. (6) still be valid? A second and independent question is, for an approximate electronic density, how accurate is the force obtained by Eq. (11)?

The validity of Eq. (6) for approximate wave functions was addressed quite a few years ago by a number of studies, summarized very well by Deb [19]; see also Epstein [23]. Deb cites significant contributions made by Coulson, notwithstanding the latter’s earlier concern about the Hellmann-Feynman theorem. These various studies showed that a number of different types of approximate wave functions, including true Hartree-Fock, do satisfy Eq. (6), despite not being eigenfunctions of the Hamiltonian operator. However, this does not necessarily mean that they will give accurate forces via Eq. (11); for approximate wave functions this is a separate issue from satisfying Eq. (6).

The accuracy of the force obtained from Eq. (11) depends upon the quality of the electronic density that is used. According to the Møller-Plesset theorem [24], Hartree-Fock electronic densities are correct to first order. However, this can be misleading; second-order corrections can sometimes be quite significant [25]. Forces are particularly sensitive to a proper description of inner-shell polarization, owing to the 1/rA² dependence of the force [19, 26].

Deb pointed out that a very demanding test of the accuracy of a computed electronic density is the degree to which it meets the condition that the net force upon each nucleus as computed by Eq. (11) is zero at equilibrium [19, 27]. Applying this to a series of hydrogen fluoride calculations is illustrative [28]. For electronic densities from minimal-basis-set SCF wave functions, the electronic forces exerted upon the hydrogen nucleus (which has no inner shell) were given quite well, with deviations from the zero force criterion of just 1.5–2.8%. Extended-basis-set near-Hartree-Fock densities gave even better results, the deviations being 0.4–0.6%. For the forces exerted upon the fluorine nucleus (which does have inner shell electrons), the situation was very different. The minimal-basis-set electronic densities gave almost no net electronic force upon the fluorine nucleus, i.e., deviations of nearly 100%! Furthermore, the two extended-basis-set densities yielded very different results; one deviated from the zero force criterion by 26%, the other by only 3.4%. The latter electronic density was clearly more accurate, even though the energies differed by just 0.0004 au and the one giving the greater deviation actually had the lower energy! (This shows again that a lower energy does not necessarily imply better values for other properties.)

To summarize this section, the Hellmann-Feynman theorem proves that the force felt by a nucleus in a real system is the resultant of the Coulombic attraction of the electrons and the Coulombic repulsion of the other nuclei. That concept is correct, regardless of the particular wave function that may be used to describe the system. Equation (11) is the exact expression of these forces. The accuracy of the result that is obtained with Eq. (11) in any specific case depends solely upon how well the electronic density and the nuclear positions are represented. The preceding example shows that any generalizations with respect to the density must be made with caution, since the forces are so sensitive to it. A good option for evaluating the quality of the electronic density would seem to be to follow Deb’s suggestion [19], and test how close to zero the net force is upon each nucleus.

Applications to chemical bonding

Equation (11) indicates that a purely Coulombic interpretation suffices to explain covalent and noncovalent bonding. This is highly disturbing to many theoreticians. What about such time-honored quantum-mechanical concepts as exchange, Pauli repulsion, orbital interaction, correlation, etc.?

To a large extent, such concerns reflect a failure to distinguish between mathematical modeling and physical reality. Exchange and Pauli repulsion reflect the requirements that electrons be indistinguishable and the wave function antisymmetric. They are very important in arriving at a good approximate solution to the Schrödinger equation and hence electronic density; however, they do not correspond to physical forces [8, 14, 18, 29, 30]. Correlation refers to the instantaneous repulsions between electrons and thus is part of the total Coulombic interaction. Orbitals are simply mathematical constructs, a useful means of expressing a wave function. They are not physically real [31], nor is their overlap. It is essential to distinguish — as did Schrödinger [32] — between the mathematics that produces the wave function, which itself has no physical significance, and the electronic density, which does. To quote Levine, “there are no ‘mysterious quantum-mechanical forces’ acting in molecules” [18]. (For a good relevant discussion, see Bader [33].)

In 1951, Berlin examined the bond-strengthening or bond-weakening effects of electronic charge in various regions within a molecule [8]. Figure 1 illustrates the basic idea in simplified fashion. Consider a diatomic molecule AB with nuclear charges ZA and ZB, and an element of electronic charge Q at distances rA and rB from the nuclei. Q exerts an attractive force upon each nucleus, the components of which along the z-axis are, by Coulomb’s Law,

$$ {\mathrm{F}}_{\mathrm{A},\mathrm{z}}=\frac{{\mathrm{QZ}}_{\mathrm{A}}\cos {\uptheta}_{\mathrm{A}}}{{{\mathrm{r}}_{\mathrm{A}}}^2}\kern0.84em {\mathrm{F}}_{\mathrm{B},\mathrm{z}}=\frac{{\mathrm{QZ}}_{\mathrm{B}}\cos {\uptheta}_{\mathrm{B}}}{{{\mathrm{r}}_{\mathrm{B}}}^2} $$
(12)
Fig. 1

Bond-strengthening or bond-weakening effects of a charge Q at different locations in a molecule AB. Forces are not drawn to scale. (a) Bond-strengthening. (b) and (c) Bond-weakening if FBz > FAz, bond-strengthening if FAz > FBz

In Fig. 1(a), FA,z and FB,z are clearly pulling the two nuclei together; thus, the effect of Q is bond-strengthening. In Fig. 1(b), FB,z is likely to be larger than FA,z due to rB being much less than rA, in which case Q is pulling nucleus B away from A and thus is bond-weakening. However, if ZA is considerably greater than ZB, and depending also upon the magnitudes of θA and θB, then FA,z could be larger than FB,z, and the result would be to pull nucleus A toward B, i.e., bond-strengthening. In Fig. 1(c), it is yet more plausible that Q could have either effect, depending upon the relative magnitudes of the nuclear charges and the angles.
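The case analysis of Fig. 1 is easy to mechanize. The sketch below is our own schematic (the function names and sample geometries are invented for illustration): it applies Eq. (12) with a unit charge element Q, nucleus A at the origin and nucleus B at z = R in atomic units, and classifies the element by whether it pulls the nuclei together or apart.

```python
import numpy as np

def axial_forces(q_pos, ZA, ZB, R):
    """z-components of the attractive force (per unit charge Q) that a
    charge element at q_pos exerts on nucleus A (at the origin) and on
    nucleus B (at z = R); this is Eq. (12), using cos(theta)/r^2 = dz/r^3."""
    q = np.asarray(q_pos, dtype=float)
    dA = q - np.array([0.0, 0.0, 0.0])
    dB = q - np.array([0.0, 0.0, R])
    FAz = ZA * dA[2] / np.linalg.norm(dA) ** 3
    FBz = ZB * dB[2] / np.linalg.norm(dB) ** 3
    return FAz, FBz

def effect(q_pos, ZA=1.0, ZB=1.0, R=2.0):
    """Berlin-style classification: the element is bond-strengthening when
    it pulls A toward B more strongly than it pulls B away from A,
    i.e., when FAz > FBz (cf. the caption of Fig. 1)."""
    FAz, FBz = axial_forces(q_pos, ZA, ZB, R)
    return "strengthening" if FAz > FBz else "weakening"

print(effect((0.0, 0.0, 1.0)))           # between the nuclei, as in Fig. 1(a)
print(effect((0.0, 0.0, 3.0)))           # beyond B, equal nuclear charges
print(effect((0.0, 0.0, 3.0), ZA=20.0))  # beyond B, much heavier nucleus A
```

The three sample calls reproduce the qualitative discussion above: charge between the nuclei strengthens the bond; charge beyond B weakens it when the nuclear charges are comparable, but can strengthen it when ZA is much larger than ZB.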

Berlin [8] and subsequently Bader et al. [9,10,11] formalized and quantified the preceding type of analysis, and Bader et al. in particular applied it to a variety of molecules. For other approaches that relate electronic densities and Coulombic forces to chemical bonding, see Fernandez Rico et al. [15, 34] and Hirshfeld and Rzotkiewicz [35]. Correlations have been reported, for diatomic molecules, between intramolecular forces and dissociation energies [11, 35].

Such reasoning can be quite useful even qualitatively. For instance, the relative bond lengths and vibrational frequencies indicate that the bond in CO+ is stronger than that in CO [36]. Why is that? It can be explained quite readily [37] by noting that the most energetic electrons in CO are in the carbon lone pair. Their effect, as discussed in relation to Fig. 1, is to pull the carbon away from the oxygen, which weakens the bond. When one of them is lost in forming CO+, this bond-weakening effect is diminished and the bond is strengthened.

Applications to calculating energies

From classical physics, a force is the negative gradient of an energy. Accordingly, as Slater pointed out [14], one could in principle calculate the energy of a system by integration of Eq. (11) — however, this would require a knowledge of the electronic density for arbitrary positions of the nuclei.

Another approach was taken by Wilson [12]. The starting point was the molecular Hamiltonian, as in Eq. (9). A key step was the introduction of a general scaling parameter λ such that the charge on any nucleus A is ZA’ = λZA, where ZA is the true nuclear charge on atom A. λ can vary between 0 and 1. In the actual molecule λ = 1, so that ZA’ = ZA. Introducing λ allows all of the nuclear charges to vary in a concerted manner between 0 and their real values.

Equation (9) thus becomes,

$$ {\mathrm{H}}^{\mathrm{mol}}=-\frac{1}{2}\sum \limits_{\mathrm{A}}{\nabla_{\mathrm{A}}}^2-\frac{1}{2}\sum \limits_{\mathrm{i}}{\nabla_{\mathrm{i}}}^2+\sum \limits_{\mathrm{A}}\sum \limits_{\mathrm{B}>\mathrm{A}}\frac{\uplambda^2{\mathrm{Z}}_{\mathrm{A}}{\mathrm{Z}}_{\mathrm{B}}}{\left|{\mathbf{R}}_{\mathrm{B}}-{\mathbf{R}}_{\mathrm{A}}\right|}-\sum \limits_{\mathrm{A}}\sum \limits_{\mathrm{i}}\frac{{\mathrm{Z}}_{\mathrm{A}}\uplambda}{\left|{\mathbf{r}}_{\mathrm{i}}-{\mathbf{R}}_{\mathrm{A}}\right|}+\sum \limits_{\mathrm{i}}\sum \limits_{\mathrm{j}>\mathrm{i}}\frac{1}{\left|{\mathbf{r}}_{\mathrm{j}}-{\mathbf{r}}_{\mathrm{i}}\right|} $$
(13)

Then according to Eq. (6),

$$ {\left(\frac{\partial {\mathrm{E}}^{\mathrm{mol}}}{\mathrm{\partial \uplambda }}\right)}_{\mathrm{N}}=2\uplambda \sum \limits_{\mathrm{A}}\sum \limits_{\mathrm{B}>\mathrm{A}}\frac{{\mathrm{Z}}_{\mathrm{A}}{\mathrm{Z}}_{\mathrm{B}}}{\left|{\mathbf{R}}_{\mathrm{B}}-{\mathbf{R}}_{\mathrm{A}}\right|}-\sum \limits_{\mathrm{A}}{\mathrm{Z}}_{\mathrm{A}}\int \frac{\uprho \left(\mathbf{r},\uplambda \right)\mathrm{d}\mathbf{r}}{\left|\mathbf{r}-{\mathbf{R}}_{\mathrm{A}}\right|} $$
(14)

In Eq. (14), the number of electrons N is to be held constant.

Note that λ serves two purposes: First, it allows Eq. (6) to apply simultaneously to all of the nuclei. Second, since λ is continuous between 0 and 1, it avoids the problem that would arise if E were differentiated with respect to a nuclear charge, which can take only integer values and thus is not continuous. (There is continuing controversy over defining the chemical potential as ∂E/∂N [38], since N is likewise not continuous.)

Integrating both sides of Eq. (14) between λ = 0 and λ = 1 produces Wilson’s formula [12], Eq. (15).

$$ {\mathrm{E}}^{\mathrm{mol}}=\sum \limits_{\mathrm{A}}\sum \limits_{\mathrm{B}>\mathrm{A}}\frac{{\mathrm{Z}}_{\mathrm{A}}{\mathrm{Z}}_{\mathrm{B}}}{\left|{\mathbf{R}}_{\mathrm{B}}-{\mathbf{R}}_{\mathrm{A}}\right|}-\sum \limits_{\mathrm{A}}{\mathrm{Z}}_{\mathrm{A}}\underset{\uplambda =0}{\overset{1}{\int }}{\left[\int \frac{\uprho \left(\mathbf{r},\uplambda \right)\mathrm{d}\mathbf{r}}{\left|\mathbf{r}-{\mathbf{R}}_{\mathrm{A}}\right|}\right]}_{\mathrm{N}}\mathrm{d}\uplambda $$
(15)

The quantity in brackets in Eq. (15) is simply Ve,0,A(λ), the electrostatic potential at nucleus A due to all of the electrons,

$$ {\mathrm{V}}_{\mathrm{e},0,\mathrm{A}}\left(\uplambda \right)=\int {\left[\frac{\uprho \left(\mathbf{r},\uplambda \right)\mathrm{d}\mathbf{r}}{\left|\mathbf{r}-{\mathbf{R}}_{\mathrm{A}}\right|}\right]}_{\mathrm{N}} $$
(16)

Accordingly, Eq. (15) can also be written as,

$$ {\mathrm{E}}^{\mathrm{mol}}=\sum \limits_{\mathrm{A}}\sum \limits_{\mathrm{B}>\mathrm{A}}\frac{{\mathrm{Z}}_{\mathrm{A}}{\mathrm{Z}}_{\mathrm{B}}}{\left|{\mathbf{R}}_{\mathrm{B}}-{\mathbf{R}}_{\mathrm{A}}\right|}-\sum \limits_{\mathrm{A}}{\mathrm{Z}}_{\mathrm{A}}\underset{\uplambda =0}{\overset{1}{\int }}{\left[{\mathrm{V}}_{\mathrm{e},0,\mathrm{A}}\left(\uplambda \right)\right]}_{\mathrm{N}}\mathrm{d}\uplambda $$
(17)

For a single atom with nuclear charge Z and located at R, an analogous procedure gives Eq. (18).

$$ {\mathrm{E}}^{\mathrm{atom}}=-\mathrm{Z}\underset{\uplambda =0}{\overset{1}{\int }}\int {\left[\frac{\uprho \left(\mathbf{r},\uplambda \right)\mathrm{d}\mathbf{r}}{\left|\mathbf{r}-\mathbf{R}\right|}\right]}_{\mathrm{N}}\mathrm{d}\uplambda =-\mathrm{Z}\underset{\uplambda =0}{\overset{1}{\int }}{\left[{\mathrm{V}}_{\mathrm{e},0}\left(\uplambda \right)\right]}_{\mathrm{N}}\mathrm{d}\uplambda $$
(18)

Equations (15), (17), and (18) are exact formulas for molecular and atomic energies. Applying them rigorously encounters the problem that evaluating the integrals requires knowing the electronic density as a function of λ, with the number of electrons being held constant.
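One case in which the λ-dependence of the density is known exactly, so that this program can be carried through rigorously, is a one-electron atom (our illustrative check). If the nuclear charge is scaled to λZ, the exact 1s density of that scaled atom gives Ve,0(λ) = ⟨1/r⟩ = λZ in atomic units, and Eq. (18) yields,

$$ {\mathrm{E}}^{\mathrm{atom}}=-\mathrm{Z}\underset{\uplambda =0}{\overset{1}{\int }}\uplambda \mathrm{Z}\kern0.2em \mathrm{d}\uplambda =-\frac{{\mathrm{Z}}^2}{2} $$

which is indeed the exact ground-state energy of a hydrogen-like atom of nuclear charge Z.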

However, Eqs. (15), (17), and (18) are of considerable significance conceptually. They demonstrate, surprisingly, that molecular and atomic energies can be expressed exactly in terms of only the electronic electrostatic potentials at the nuclei, which are one-electron properties, and the nuclear charges and positions. Interelectronic repulsion, a two-electron property that has traditionally been a key problem in quantum chemistry, does not appear explicitly; it is evidently taken into account by the integration over λ.

In drawing attention to the key roles of electronic electrostatic potentials at nuclei (a purely Coulombic property), Eqs. (15), (17), and (18) stimulated a great deal of further analysis, including the derivation of a variety of approximate atomic and molecular energy expressions. For reviews, see Levy et al. [39] and Politzer et al. [40, 41]. In this context, it is important to note that while Hartree-Fock electronic densities are correct to the first-order [24], the electrostatic potentials at nuclei are correct to the second-order [39, 42]. Levy et al. have exploited this to derive some remarkably accurate formulas for atomic energies and energy differences, including even atomic correlation energies, obtained from Hartree-Fock electrostatic potentials at the nuclei [43].

Dispersion interactions

An example of the suspicion with which the Hellmann-Feynman theorem is viewed is the long-standing failure to accept Feynman’s interpretation of what are known as dispersion interactions, e.g., between two xenon atoms, in spite of the evidence that supports it. These weak nonbonding attractions have historically been described as arising from fluctuating induced dipoles, as in structure 1 below [44, 45]. Feynman’s explanation, on the other hand, was that the electronic charge of each xenon atom is slightly polarized toward the other, structure 2, and that it is “the attraction of each nucleus for the distorted charge distribution of its own electrons that gives the attractive 1/R⁷ force” [5]. In physical terms, the two interpretations (1 vs. 2) are obviously quite different, even though both lead to the same 1/R⁷ dependence of the force.

$$ {\displaystyle \begin{array}{cc}{}^{-\zeta }{Xe}^{+\zeta}\hbox{-} \hbox{-} {\hbox{-}}^{-\zeta }{Xe}^{+\zeta }& {}^{+\zeta }{Xe}^{-\zeta}\hbox{-} \hbox{-} {\hbox{-}}^{-\zeta }{Xe}^{+\zeta}\\ {}1& 2\end{array}} $$

Feynman’s argument has been confirmed by several subsequent studies [27, 46,47,48,49]. Investigations of the computed electronic density in the Ar---Ar complex [50,51,52] indicate polarization of the electronic charge of each atom toward the other, as predicted by Feynman [5] and depicted in structure 2. Nevertheless, dispersion interactions continue to be widely attributed to fluctuating dipoles.

Discussion and summary

The generalized Hellmann-Feynman theorem, Eq. (6), has often been criticized on the grounds that it is only valid for the exact wave function. We find this criticism puzzling. First of all, the same statement could be made about the Schrödinger equation, yet it is the foundation of quantum chemistry. Second, it is not true that only exact wave functions satisfy Eq. (6); as mentioned above, some types of approximate wave functions (e.g., Hartree-Fock) do so as well.

Equations (6) and (11) prove the key concept that, in physical reality, the forces felt by the nuclei in molecules and other systems are entirely Coulombic. This physical fact is independent of whatever wave function may be used to describe a system. Equation (11) is the rigorous expression for these forces in terms of Coulomb’s law, however one may in practice obtain the electronic density.

Evaluation of these forces typically requires approximations, as does nearly everything else in quantum chemistry. An approximate wave function is generally used to obtain the electronic density for Eq. (11), and unfortunately the resulting forces are very sensitive to any inadequacies in the wave function, more so than is the energy [19, 23, 26, 46]. Accordingly, it is often preferred to use the negative gradients of the approximate energies rather than Eq. (11) to determine forces in molecules, as in geometry optimizations invoking the zero net force criterion for equilibrium [26]. However, this does not alter the basic fact that the forces on the nuclei are purely Coulombic and that Eq. (11) is their rigorous formulation.

For the exact wave function, and therefore the exact electronic density, Eq. (11) and the negative of the energy gradient must of course give the same results, but the contributions to them come from different spatial regions in the two cases [26]. They may therefore differ significantly for approximate wave functions. This is illustrated by the fact that wave functions for optimized geometries that meet the zero net force criterion using energy gradients typically do not give zero net forces by Eq. (11); see, e.g., Kern and Karplus [28].

The focus of quantum chemistry has traditionally been upon energy. These efforts have been enormously successful. We suggest that the proper role of the Hellmann-Feynman theorem, at present, is not in that area but rather in the development of insight and understanding. In this paper, for example, we have discussed three important concepts that have come out of the Hellmann-Feynman theorem:

  1. The forces felt by the nuclei in molecules, complexes, etc. are purely classical Coulombic. Given an accurate electronic density and geometry, quantum mechanics has no further role.

  2. The energies of atoms and molecules can be expressed rigorously in terms of just the electronic electrostatic potentials at their nuclei, which are one-electron properties dependent only upon the electronic density, and the nuclear charges and positions. This is fully in accord with the Hohenberg-Kohn theorem [22].

  3. The forces in dispersion interactions arise from polarization such as is depicted in 2. Both 1 and 2 give the R⁻⁷ dependence of the forces, but physically they are the opposite of each other. 2 is supported by considerable evidence.

There are certainly more examples, such as the use of molecular electrostatic potentials to interpret intermolecular interactions, pioneered by Scrocco and Tomasi [53, 54] and now so widespread (e.g., [55, 56]). This is clearly an application of the Hellmann-Feynman theorem.

We suggest that energies and forces not be viewed as alternatives but rather as complementary. Again quoting Deb [19], “we should do well to employ both the energy and the force formulation in our studies of molecular structure and dynamics. The former approach would generally provide more accurate numbers, while the latter should provide a simple unified basis for developing physical insights into different chemical phenomena.” This statement was made many years ago, but it seems to still be generally valid.