Introduction: What and Why?

Quantum chemistry is a branch of science originating from quantum mechanics that focuses on the investigation of chemical systems. The mathematical roots of quantum chemistry allow it to be treated as a methodology for solving an eigenvalue equation for operators or – even simpler – for finding solutions of certain differential equations. We cannot totally escape this way of thinking, since this is how things really are. However, a chemist will comprehend quantum chemistry more as a helping tool in experimental work, supporting the description of chemical reactions – a tool that plays a role similar to a spectrophotometer or a chromatographic column, a tool that can provide information about the system under consideration, and a powerful tool whose popularization was achieved thanks to the fast progress of computer power and the hard work of the people who made the transformation from pure theory to computer programs possible. Their efforts were appreciated – in 1998, the Nobel Prize was awarded to Walter Kohn “for his development of the density-functional theory” and John Pople “for his development of computational methods in quantum chemistry.”

From the point of view of the experimentalist, the apparatus of quantum chemistry can be perceived similarly to an NMR spectrometer. One knows that the quality of the obtained NMR spectrum depends not only on the magnetic field of the magnet but also on the signal-processing capabilities. To successfully use NMR spectroscopy in experimental work, detailed knowledge about the technology of production and preparation of the magnets and electronic equipment is not a requisite. It is enough to keep in mind that a given frequency yields a corresponding accuracy and information. All the rest is simply skill in sample preparation and expertise in spectrum interpretation. For effective usage of the computational techniques of quantum chemistry, one must likewise be aware of the applied approximations, which tune the accuracy of the calculations, and possess knowledge of the physicochemical phenomenon one wants to describe.

The aim of the present chapter is to provide a gentle introduction to basic quantum chemistry methods – the methods of solving the electronic Schrödinger equation. The chapter is intended for people starting their adventure with computational chemistry and wanting it to become the tool, not the aim itself.

When discussing quantum chemistry, we cannot totally avoid quantum mechanics. However, let us use another comparison: When traveling abroad, it is good to know some basic expressions in the local language of the country you are visiting. It makes life easier and gives pleasure in interpersonal contacts. Still, no one expects a tourist to speak the language as fluently as a native speaker. Therefore, to efficiently apply computational techniques in experimental research, one has to learn some basic quantum mechanical terms that will help during the journey through the remainder of this chapter.

Quantum Mechanics for Dummies

We will begin with the basic terms of quantum mechanics. In this section, they will be introduced in an intuitive manner, to enable understanding of the next sections’ content, even by beginners.

We consider a system of N electrons in the field produced by the potentials arising from the nuclei (the nuclei are not treated as particles consisting of nucleons but just as point sources of the electrostatic potential). We are interested in only one particular case:

  • The probability of finding the electrons of the considered system at an infinite distance from the nuclei is equal to zero. In other words, we want our system to constitute a bound whole, not to break into separate and independent parts (the latter would be the case for two interacting electrons with no attraction present – the two negative charges would repel each other to infinity).

  • The energies of this system constitute a discrete spectrum.

  • We want to know only the lowest value of energy (the wider approach can be found in the next volume of the present book).

With such limitations, we do not need to consider all of the different general cases and can simply concentrate on bound-state chemistry.

A central notion in quantum chemistry is the wave function. This is a function characterizing the state of the system. Therefore, it depends on the variables that are adequate for the given system. This means that the wave function has to depend, at least, on the spatial coordinates describing the motions of the particles in the investigated system. Moreover, the wave function depends on so-called spin variables (spin is an additional degree of freedom included a posteriori in nonrelativistic quantum mechanics). This spin dependency can be built into the wave function by introducing a spin function. For instance, for the electron with label 1, its wave function depends on the spatial coordinates \( {x}_1 \), \( {y}_1 \), and \( {z}_1 \) and is multiplied by the spin function \( \alpha \left({\sigma}_1\right) \) or \( \beta \left({\sigma}_1\right) \), where \( {\sigma}_1 \) is a spin variable. The spin functions must fulfill the following requirements:

$$ {\displaystyle \int {\alpha}^{*}\left({\sigma}_1\right)\alpha \left({\sigma}_1\right)d{\sigma}_1}={\displaystyle \int {\beta}^{*}\left({\sigma}_1\right)\beta \left({\sigma}_1\right)d{\sigma}_1}=1 $$
(1)
$$ {\displaystyle \int {\alpha}^{*}\left({\sigma}_1\right)\beta \left({\sigma}_1\right)d{\sigma}_1}={\displaystyle \int {\beta}^{*}\left({\sigma}_1\right)\alpha \left({\sigma}_1\right)d{\sigma}_1}=0, $$
(2)

where the integration is carried out over the spin variable, which can be treated as an integration variable only. Such a construction may seem somewhat unnatural; however, it is a convenient way of ascribing spins to the electrons without dealing with their origins.

In general, the wave function must depend on time to reproduce information about the time evolution of the system. However, since we are interested only in the ground state of the system, we can neglect the time dependence. Considering a bound state is equivalent to imposing the condition of square integrability on the wave function. The integral over all variables in their full range must exist:

$$ {\displaystyle \iint \dots {\displaystyle \int {f}^{*}fd\tau =q,}} $$
(3)

where q is a finite real number. The asterisk under the integral denotes the complex conjugate; it comes from the fact that the wave function can, in general, be complex. The square-integrability condition ensures that the wave function vanishes for infinite values of all spatial variables and, therefore, that our molecule is kept together. In the above expression, the integration intervals and the integration variables are not stated explicitly. For the investigated N-electron system, the wave function depends on \( 3N \) spatial variables (for each particle i, we have the \( {x}_i \), \( {y}_i \), and \( {z}_i \) coordinates) and additionally N spin variables (\( {\sigma}_i \) for particle i):

$$ f=f\left({x}_1,{y}_1,{z}_1,{\sigma}_1,{x}_2,{y}_2,{z}_2,{\sigma}_2,\dots, {x}_N,{y}_N,{z}_N,{\sigma}_N\right). $$
(4)

The volume element in this \( 4N \)-dimensional space is

$$ d\tau =dV\cdotp d\sigma, $$
(5)

where the spatial part can be written as

$$ dV=d{x}_1d{y}_1d{z}_1d{x}_2d{y}_2d{z}_2\dots d{x}_Nd{y}_Nd{z}_N, $$
(6)

and the spin part is

$$ d\sigma =d{\sigma}_1d{\sigma}_2\dots d{\sigma}_N. $$
(7)

The spatial variables change from \( -\infty \) to \( \infty \), and the spin variables run over their allowed discrete values. One can see that writing all of the integrals, variables, and volume elements explicitly takes time and a lot of paper, even for relatively small systems. Therefore, one usually keeps them in mind, not writing them down.

The wave function contains all of the information about the state of the system. In order to extract it, operators are applied. An operator can be understood by analogy to a function: a function ascribes a number to a number, while an operator ascribes a function to a function. In other words, an operator is a recipe for how to obtain one function from another:

$$ \widehat{A}f=g. $$
(8)

We will denote operators by hats above the symbol to distinguish them from functions and numbers. One of the particularly interesting cases is when the function g is proportional to the function f,

$$ g=af, $$
(9)

where a is a number. Then Eq. 8 takes the form

$$ \widehat{A}f=af. $$
(10)

Equation 10 is called an eigenvalue equation of the operator Â. The function f fulfilling this equation is called an eigenfunction and a is an eigenvalue of the operator Â. The Schrödinger equation

$$ \widehat{H}\Psi =E\Psi $$
(11)

is a typical eigenvalue equation in which the Hamilton operator Ĥ extracts the information about the energy E of the system from the wave function Ψ.

As in the case of the wave function, we will not consider operators in full generality. Let us concentrate on the Hamilton operator and its properties to simplify our discussion. We need the operators ascribed to observables (the Hamiltonian among others) to satisfy the following requirements:

  • Linearity – the operators must fulfill the condition

    $$ \widehat{A}\left(\alpha f+\beta g\right)=\alpha \widehat{A}f+\beta \widehat{A}g, $$
    (12)

    where now α and β are numbers. This seems simple and obvious; however, it is not a property of all operators. For instance, the square root is not a linear operator, since the square root of a sum is not equal to the sum of the square roots.

  • Real eigenvalues – only real values can be measured in a laboratory.

    For these reasons, we will be interested in so-called Hermitian operators that can be defined by the relation

    $$ {\displaystyle \iint \dots {\displaystyle \int {f}_1^{*}\left(\widehat{A}{f}_2\right)d\tau =}}{\displaystyle \iint \dots {\displaystyle \int {f}_2{\left(\widehat{A}{f}_1\right)}^{*}d\tau .}} $$
    (13)

All the functions, variables, and integration intervals remain the same as in Eq. 3. Writing all these things explicitly in the expressions was already troublesome enough, and things become even more complicated when operators appear. In order to make life easier, Dirac notation can be applied. In this notation, Eq. 13 has the form

$$ \left\langle {f}_1\Big|\widehat{A}{f}_2\right\rangle =\left\langle \widehat{A}{f}_1\Big|{f}_2\right\rangle, $$
(14)

where the left-hand side can be equivalently written as \( \left\langle {f}_1\left|\widehat{A}\right|{f}_2\right\rangle, \) and the integral of Eq. 3 becomes simply

$$ \left\langle f\Big|f\right\rangle =q. $$
(15)

In this very convenient notation, it is also assumed that the integration intervals and variables follow from the context.

Let us look at Hermitian operators more carefully, considering them using the example of the Hamiltonian. It has already been mentioned that such operators have real eigenvalues. Furthermore, the eigenfunctions of a Hermitian operator that correspond to different eigenvalues are orthogonal. In other words, for

$$ \widehat{H}{f}_1={E}_1{f}_1\quad \mathrm{and}\quad \widehat{H}{f}_2={E}_2{f}_2, $$
(16)

where \( \left({E}_1\ne {E}_2\right), \) one has

$$ \left\langle {f}_1\Big|{f}_2\right\rangle =\left\langle {f}_2\Big|{f}_1\right\rangle =0. $$
(17)

This will be a very useful property, since it will cause various terms in complicated expressions to vanish. In the case of degeneracy – in other words, when one eigenvalue corresponds to two or more eigenfunctions – the eigenfunctions \( {f}_1 \) and \( {f}_2 \) can always be orthogonalized.
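
These two properties can be illustrated with a small numerical sketch (assuming NumPy; the matrix below is an arbitrary stand-in for a Hermitian operator, not any particular molecular Hamiltonian):

```python
import numpy as np

# A small real symmetric (hence Hermitian) matrix plays the role of the
# operator H; its eigenvector columns stand in for the eigenfunctions f_i.
H = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

E, F = np.linalg.eigh(H)   # real eigenvalues (ascending) and eigenvectors

# Eigenvalues of a Hermitian matrix are real ...
assert np.all(np.isreal(E))
# ... and eigenvectors belonging to different eigenvalues are orthogonal:
# the matrix of pairwise "overlaps" <f_i|f_j> is the identity (cf. Eq. 17).
assert np.allclose(F.T @ F, np.eye(3))
```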

It is worth considering the integral

$$ \left\langle {f}_1\left|\widehat{H}\right|{f}_1\right\rangle . $$
(18)

Since \( {f}_1 \) is an eigenfunction of Ĥ with the eigenvalue \( {E}_1 \), it is obvious that

$$ \left\langle {f}_1\left|\widehat{H}\right|{f}_1\right\rangle =\left\langle {f}_1\Big|{E}_1{f}_1\right\rangle ={E}_1\left\langle {f}_1\Big|{f}_1\right\rangle . $$
(19)

It would certainly be more convenient if the result were a single number – the eigenvalue \( {E}_1 \). This would be the case if \( \left\langle {f}_1\Big|{f}_1\right\rangle =1 \), that is, if the function \( {f}_1 \) were normalized to unity. This is consistent with the interpretation of the integral \( \left\langle {f}_1\Big|{f}_1\right\rangle \) as the probability of finding the system anywhere in the whole space – it should surely equal 1. This is a very handy requirement. Any function that does not possess this property can be normalized by multiplying it by the normalization factor \( \mathcal{N}=1/\sqrt{\left\langle {f}_1|{f}_1\right\rangle } \). Then, the new function \( {\tilde{f}}_1 \) is given as

$$ {\tilde{f}}_1=\mathcal{N}{f}_1. $$
(20)

This new function \( {\tilde{f}}_1 \) is also an eigenfunction of the Hamiltonian, since \( {f}_1 \) was only divided by the number \( \sqrt{\left\langle {f}_1\Big|{f}_1\right\rangle } \) and the Hamiltonian is linear:

$$ \widehat{H}{\tilde{f}}_1=\widehat{H}\mathcal{N}{f}_1=\mathcal{N}\widehat{H}{f}_1=\mathcal{N}{E}_1{f}_1={E}_1\mathcal{N}{f}_1={E}_1{\tilde{f}}_1. $$
(21)

In the case of unnormalized functions, the expression for the eigenvalue \( {E}_1 \) can be obtained from Eq. 19:

$$ {E}_1=\frac{\left\langle {f}_1\left|\widehat{H}\right|{f}_1\right\rangle }{\left\langle {f}_1\Big|{f}_1\right\rangle }. $$
(22)
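
The recipe of Eq. 22 can be checked on a matrix stand-in for the Hamiltonian (a sketch assuming NumPy; the 2×2 matrix and the scaling factor 3.7 are arbitrary):

```python
import numpy as np

# A toy Hermitian "Hamiltonian" with eigenvalues 1 and 3.
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])

E, F = np.linalg.eigh(H)
f1 = 3.7 * F[:, 0]               # an unnormalized eigenvector: <f1|f1> != 1

# Eq. 22: the eigenvalue is recovered even from an unnormalized eigenfunction.
E1 = (f1 @ H @ f1) / (f1 @ f1)
assert np.isclose(E1, E[0])      # the lowest eigenvalue, here 1

# Normalization as in Eq. 20 restores <f1|f1> = 1 without changing the state.
f1_tilde = f1 / np.sqrt(f1 @ f1)
assert np.isclose(f1_tilde @ f1_tilde, 1.0)
```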

However, many of the functions applied in practice are not eigenfunctions of the Hamiltonian. Therefore, let us investigate another interesting integral,

$$ \left\langle g\left|\widehat{H}\right|g\right\rangle, $$
(23)

where g is not an eigenfunction of Ĥ. In order to calculate this integral, another important property of Hermitian operators needs to be exploited: the fact that their eigenfunctions constitute a complete basis set. Every function depending on the same variables as the eigenfunctions can be expressed as a linear combination of the basis functions. This concept may seem to be hard-core mathematics; however, anybody using computational techniques knows well that the two things one must input to an ab initio program are the method and the basis set. Hence, let us take a break from the general considerations of operators and abstract-space functions and concentrate for a while on the basis set concept, using the example of simple trigonometric functions.

In a calculus course, one learns how to express a function using a set of other functions. For example, consider the \( \sin x \) function and expand it in a Taylor series around 0:

$$ \sin x={\displaystyle \sum_{i=1}^{\infty}\frac{{\left(-1\right)}^{i-1}}{\left(2i-1\right)!}{x}^{2i-1}}=x-\frac{x^3}{3!}+\frac{x^5}{5!}-\dots $$
(24)

In Eq. 24, the \( \sin x \) function is expressed in the basis set of monomials:

$$ \sin x={\displaystyle \sum_{k=1}^{\infty }{c}_k{x}^k}, $$
(25)

where the \( {c}_k \) are the expansion coefficients that need to be determined. In our case this is simple, since the \( {c}_k \) result directly from the Taylor expansion and are equal to:

$$ {c}_k=\left\{\begin{array}{cc}\hfill \frac{{\left(-1\right)}^{\left(k-1\right)/2}}{k!}\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\;\mathrm{o}\mathrm{dd}\;k,\hfill \\ {}\hfill 0\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\;\mathrm{even}\;k.\hfill \end{array}\right. $$
(26)

The summation in Eqs. 24 and 25 goes from 1 to \( \infty \). In practice, finite and possibly short expansions are applied:

$$ {F}_n(x)={\displaystyle \sum_{i=1}^n{c}_i{x}^i}. $$
(27)

This truncation of the series introduces an approximation to our function.

Let us analyze the \( \sin x \) function in the range \( x\in \left[-{\scriptscriptstyle \frac{\pi }{2}},{\scriptscriptstyle \frac{\pi }{2}}\right] \). The standard deviation works well as an accuracy measure:

$$ {\sigma}_n=\sqrt{{\displaystyle {\int}_{-{\scriptscriptstyle \frac{\pi }{2}}}^{{\scriptscriptstyle \frac{\pi }{2}}}{\left( \sin x-{F}_n(x)\right)}^2dx}}. $$
(28)

Table 1 summarizes the data for small n values. Increasing the number of expansion terms decreases the standard deviation and yields a more accurate representation of the original \( \sin x \) function. Given the required accuracy of the calculation, the necessary length of the expansion, n, can be found.

Table 1 Taylor expansion of the \( \sin x \) function and the standard deviation for various expansion lengths

The following question arises: Why use the Taylor expansion instead of the \( \sin x \) function itself, if one then needs to worry about the expansion accuracy? The answer is straightforward: simplifications and savings. It is much easier to operate on polynomials than on trigonometric functions (for instance, the integral \( {\int {\left({x}^i\right)}^2dx} \) is much easier to handle than \( {\int { \sin}^2xdx} \)). Moreover, the required accuracy can often be obtained with a relatively short expansion.
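
The behavior summarized in Table 1 can be reproduced with a short numerical sketch (assuming NumPy; the quadrature grid size is arbitrary, and the trapezoidal rule is written out explicitly):

```python
from math import factorial, pi
import numpy as np

def F(n, x):
    """Truncated Taylor expansion F_n(x) of sin x (Eqs. 26-27): odd k <= n."""
    return sum((-1.0) ** ((k - 1) // 2) / factorial(k) * x ** k
               for k in range(1, n + 1, 2))

def sigma(n, points=20001):
    """Standard deviation of Eq. 28 on [-pi/2, pi/2], trapezoidal quadrature."""
    x = np.linspace(-pi / 2, pi / 2, points)
    err2 = (np.sin(x) - F(n, x)) ** 2
    integral = np.sum((err2[:-1] + err2[1:]) / 2) * (x[1] - x[0])
    return np.sqrt(integral)

sigmas = [sigma(n) for n in (1, 3, 5, 7)]
# Each additional pair of terms lowers the standard deviation of Eq. 28.
assert all(a > b for a, b in zip(sigmas, sigmas[1:]))
```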

Let us now make these considerations more general. As stated before, the set of eigenfunctions of the Hermitian operator Ĥ is complete and orthonormal – the functions are orthogonal and normalized:

$$ {\forall}_{i,j}\left\langle {f}_i\Big|{f}_j\right\rangle ={\delta}_{ij}, $$
(29)

where \( {\delta}_{ij} \) is the Kronecker symbol, which takes the value 1 for i = j and 0 otherwise. The completeness of the basis set means that every function depending on the same set of variables can be expressed through the basis functions:

$$ g={\displaystyle \sum_{i=1}^{\infty }{c}_i{f}_i,} $$
(30)

where the coefficients \( {c}_i \) need to be found. Knowing the normalized g function makes this task simple: because of the orthonormality of the \( \left\{{f}_i\right\} \) set, the coefficients are equal to

$$ {c}_i=\left\langle {f}_i\Big|g\right\rangle, $$
(31)

since

$$ \left\langle {f}_i\Big|g\right\rangle ={\displaystyle \sum_{j=1}^{\infty }{c}_j}\left\langle {f}_i\Big|{f}_j\right\rangle ={\displaystyle \sum_{j=1}^{\infty }{c}_j{\delta}_{ij}={c}_i}. $$
(32)

(Only the term with j = i, namely \( {c}_i \), remains; all others vanish because the Kronecker delta equals zero for \( i\ne j \).) Similarly,

$$ \left\langle g\Big|g\right\rangle ={\displaystyle \sum_{j=1}^{\infty }{c}_j^{*}{c}_j.} $$
(33)

However, things are not that easy, since we usually apply the expansion (Eq. 30) when we do not know the g function. Thus, the integrals (Eqs. 31 and 33) should be perceived as the interpretation of the \( {c}_i \) coefficients rather than a direct recipe for calculations. From Eq. 30, the g function can be treated as a linear combination of the \( {f}_i \) functions. Moreover (see Eq. 33), the probability that a system described by the function g is in the state \( {f}_j \) is given by \( {c}_j^{*}{c}_j \).
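
The relations of Eqs. 30, 31, and 33 have a direct finite-dimensional analogue (a sketch assuming NumPy; the 4×4 size and the random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# An orthonormal basis {f_i}: the eigenvectors of a random symmetric matrix.
A = rng.standard_normal((4, 4))
_, f = np.linalg.eigh(A + A.T)       # column f[:, i] plays the role of f_i

# A normalized "state" g and its expansion coefficients c_i = <f_i|g> (Eq. 31).
g = rng.standard_normal(4)
g /= np.linalg.norm(g)
c = f.T @ g

# Completeness (Eq. 30): g is exactly recovered from its expansion ...
assert np.allclose(f @ c, g)
# ... and Eq. 33: <g|g> = sum_j c_j* c_j = 1 for a normalized g.
assert np.isclose(np.sum(c ** 2), 1.0)
```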

Now let us consider the expression

$$ \left\langle g\left|\widehat{H}\right|g\right\rangle \equiv {\left\langle \widehat{H}\right\rangle}_g, $$
(34)

when g is not an eigenfunction of the Hamiltonian. Using the expansion of Eq. 30 and the fact that the \( {f}_i \) are eigenfunctions of the Hamiltonian (Eq. 16), one obtains

$$ \left\langle g\left|\widehat{H}\right|g\right\rangle ={\displaystyle \sum_{j=1}^{\infty }{c}_j^{*}{c}_j}{E}_j. $$
(35)

The above integral is called an average (expectation) value, and Eq. 35 for the Hamiltonian carries the information about the average energy of the system in the state described by the g function. A closer look at Eq. 35 shows that this average energy is simply a weighted average of all possible \( {E}_j \) energies of the system. The weights are determined by the \( {c}_j^{*}{c}_j \) products – the probabilities of finding the system in the \( {f}_j \) states. It should be noticed that we used the linearity of the Hamilton operator to achieve this result.

Conclusion? Very optimistic: We can say something about the sought energy value without knowing the eigenfunctions of the operator of interest, since for the calculation of Eq. 35, we do not need the \( {f}_j \) functions. Strange? Not at all, if we recall some linear algebra: Using three basis vectors, we can describe each and every point in 3D space. Likewise, the wave function can be perceived as a vector, the Hermitian operator as a symmetric transformation matrix, the integral \( \left\langle f\Big|g\right\rangle \) as a dot product, orthogonality of functions as orthogonality of vectors, and normalization as dividing the vector components by the vector’s length.

Now, when the term “linear algebra” has already appeared, let us see how it is applied for solving the eigenequation. Almost all calculations are performed with basis functions. This means that the unknown function Ψ describing the investigated system is expressed in a basis of known functions \( {\chi}_i \) (see Eq. 30):

$$ \Psi \approx {\displaystyle \sum_{i=1}^n{c}_i{\chi}_i=\Phi, } $$
(36)

where Ψ is the eigenfunction of the Hamiltonian corresponding to a given eigenvalue E (Eq. 11). The task is to find such \( {c}_i \) coefficients that the function Φ is the best approximation to Ψ. Since Φ is an approximation to the wave function, the corresponding energy will also be only approximate. Let us call this approximation \( {E}_{\Phi} \). The basis functions \( {\chi}_i \) are not eigenfunctions of the Hamiltonian; therefore, to estimate the energy, an average value must be calculated. Substituting Eq. 36 into Eq. 11 and multiplying both sides from the left by \( {\Phi}^{*}=\left\langle {\displaystyle \sum_i{c}_i{\chi}_i}\right| \) gives

$$ \mathrm{L}\mathrm{H}\mathrm{S}=\Big\langle {\displaystyle \sum_i{c}_i{\chi}_i\left|\widehat{H}\right|}{\displaystyle \sum_j{c}_j{\chi}_j\Big\rangle }={\displaystyle \sum_i{\displaystyle \sum_j{c}_i^{\ast }{c}_j\left\langle {\chi}_i\left|\widehat{H}\right|{\chi}_j\right\rangle, }} $$
(37)
$$ \mathrm{R}\mathrm{H}\mathrm{S}={E}_{\Phi}{\displaystyle \sum_i{\displaystyle \sum_j{c}_i^{\ast }{c}_j\left\langle {\chi}_i\right|}{\chi}_j}\Big\rangle . $$
(38)

In order to further simplify the notation, let us denote \( \left\langle {\chi}_i\left|\widehat{H}\right|{\chi}_j\right\rangle \) by \( {H}_{ij} \) and \( \left\langle {\chi}_i\Big|{\chi}_j\right\rangle \) by \( {S}_{ij} \). Then,

$$ {\displaystyle \sum_i{\displaystyle \sum_j{c}_i^{*}{c}_j{H}_{ij}={E}_{\Phi}{\displaystyle \sum_i{\displaystyle \sum_j{c}_i^{*}{c}_j{S}_{ij}.}}}} $$
(39)

Equivalently, in the matrix form

$$ \mathbf{H}\mathbf{c}=\mathbf{S}\mathbf{c}{E}_{\Phi}, $$
(40)

where H is the Hamiltonian matrix with the elements \( {H}_{ij} \), S is called the overlap matrix and is built of the overlap integrals \( {S}_{ij} \), and c denotes the vector of the \( {c}_i \) coefficients. The basis sets applied in practice are usually non-orthogonal, which means that the off-diagonal terms of the S matrix do not vanish.

Such a method of finding approximate eigenvalues and eigenvectors of the Hamiltonian is known as the Ritz method and is frequently applied in quantum chemistry.
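
A minimal sketch of the Ritz procedure of Eq. 40, assuming SciPy is available; the 3×3 matrices H and S below are invented for illustration and do not correspond to any real basis set:

```python
import numpy as np
from scipy.linalg import eigh

# Made-up matrix elements H_ij = <chi_i|H|chi_j> and overlaps S_ij = <chi_i|chi_j>
# for a toy non-orthogonal three-function basis (S is symmetric positive definite).
H = np.array([[-1.0,  0.2,  0.1],
              [ 0.2, -0.5,  0.3],
              [ 0.1,  0.3,  0.4]])
S = np.array([[ 1.0,  0.4,  0.1],
              [ 0.4,  1.0,  0.2],
              [ 0.1,  0.2,  1.0]])

# Solve the generalized eigenproblem H c = S c E_Phi of Eq. 40.
E, C = eigh(H, S)
E0, c0 = E[0], C[:, 0]     # lowest Ritz energy and its coefficient vector

assert np.allclose(H @ c0, E0 * (S @ c0))   # the pair satisfies Eq. 40
assert np.isclose(c0 @ S @ c0, 1.0)         # normalized in the S metric
```

Note that the ordinary eigenproblem is recovered when the basis is orthonormal, since then S is the identity matrix.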

This simple introduction of the basic terms of quantum mechanics is obviously far from complete. One can notice the lack of further discussion of degeneracy, the continuous spectrum, and many other topics. For these we encourage the reader to dive into the following excellent books on quantum mechanics and chemistry: Atkins and Friedman (2005), Griffiths (2004), Levine (2008), Lowe and Peterson (2005), McQuarrie and Simon (1997), Piela (2007), Ratner and Schatz (2000), and Szabo and Ostlund (1996).

On the Way to Quantum Chemistry

For the sake of simplicity, we assume that the energy of the ground state of our system differs from all other energy values. This allows one to avoid becoming embroiled in technical details that are unnecessary at this point. Our system is described by the wave function Ψ fulfilling the Schrödinger Eq. 11. It is important to notice that this eigenvalue equation can be solved exactly only for one-electron (hydrogen-like) atoms. Any more complicated system requires approximate techniques. In order to explain this complication, let us look into the Hamilton operator. For a system of N electrons and M nuclei, the full Hamiltonian is a sum of the following terms:

  • Kinetic energy of electrons, \( {\widehat{T}}_e \)

  • Kinetic energy of nuclei, \( {\widehat{T}}_n \)

  • Energy of interactions between electrons, \( {\widehat{V}}_{ee} \)

  • Energy of interactions between nuclei, \( {\widehat{V}}_{nn} \)

  • Energy of interactions between a nucleus and an electron, \( {\widehat{V}}_{ne} \)

In the atomic units, these terms have the following form:

$$ {\widehat{T}}_e=-\frac{1}{2}{\displaystyle \sum_{i=1}^N}{\nabla}_{r_i}^2 $$
(41)
$$ {\widehat{T}}_n=-{\displaystyle \sum_{i=1}^M}\frac{1}{2{m}_i}{\nabla}_{R_i}^2 $$
(42)
$$ {\widehat{V}}_{\mathrm{ee}}={\displaystyle \sum_{i=1}^N}{\displaystyle \sum_{j>i}^N}\frac{1}{r_{ij}} $$
(43)
$$ {\widehat{V}}_{\mathrm{nn}}={\displaystyle \sum_{i=1}^M}{\displaystyle \sum_{j>i}^M}\frac{Z_i{Z}_j}{R_{ij}} $$
(44)
$$ {\widehat{V}}_{\mathrm{ne}}=-{\displaystyle \sum_{i=1}^N}{\displaystyle \sum_{j=1}^M}\frac{Z_j}{\left|{\mathbf{r}}_i-{\mathbf{R}}_j\right|} $$
(45)

where \( {m}_i \) is the mass of nucleus i, \( {Z}_i \) stands for the nuclear charge, and \( {r}_{ij} \) denotes the distance between the electrons i and j, \( {r}_{ij}=\left|{\mathbf{r}}_i-{\mathbf{r}}_j\right| \). Likewise, \( {R}_{ij} \) refers to the internuclear distance, \( {R}_{ij}=\left|{\mathbf{R}}_i-{\mathbf{R}}_j\right| \). The presence of the mutual distances between the particles causes a serious problem when solving the Schrödinger equation: it does not allow one to decouple the equations.
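
Of the five terms, only \( {\widehat{V}}_{\mathrm{nn}} \) is a plain function of the nuclear positions and can be evaluated directly. A minimal sketch (assuming NumPy; the geometry is illustrative, with two protons placed at roughly the equilibrium H–H distance):

```python
import numpy as np

def nuclear_repulsion(Z, R):
    """V_nn of Eq. 44 in atomic units: sum over pairs i < j of Z_i Z_j / R_ij."""
    V = 0.0
    for i in range(len(Z)):
        for j in range(i + 1, len(Z)):
            V += Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])
    return V

# Two protons (Z = 1) separated by 1.4 bohr, roughly the H2 equilibrium distance.
Z = [1.0, 1.0]
R = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, 1.4]])
V_nn = nuclear_repulsion(Z, R)   # 1 * 1 / 1.4, about 0.714 hartree
```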

Fortunately, from the chemist’s point of view, such a Hamiltonian is not very useful. The chemist is not interested in each and every bit of information one can get about an arbitrary N-electron, M-nucleus system; rather, he or she focuses on a given molecule, its conformations, its interactions with the environment, and its properties (spectroscopic, magnetic, electric, and so on). What makes quantum mechanics a valuable tool for chemists is the Born–Oppenheimer approximation, discussed in detail in the previous chapter of this volume. Let us briefly summarize it to maintain consistent notation throughout the chapter.

The chemist is concerned with the relative positions of the nuclei in the molecule and with the internal energy, but not with the motion of the molecule as a whole. This motion can be excluded from our considerations, for example, by elimination of the center-of-mass translation. Moreover, the intramolecular (electrostatic) forces acting on electrons and nuclei are of similar magnitude. This causes a much slower internal motion of the heavy nuclei in comparison to the light electrons. For this reason, the approximate description of electron motion with a parametric dependence on the static positions of the nuclei is justified. Such reasoning leads to the adiabatic approximation and finally to the Born–Oppenheimer approximation.

According to this approximation, the Hamiltonian can be written as

$$ \widehat{H}={\widehat{T}}_n+{\widehat{H}}_e+{\widehat{V}}_{\mathrm{nn}}, $$
(46)

where \( {\widehat{T}}_n \) now has the meaning of the nuclear kinetic energy of a molecule whose center of mass is stopped (however, there are still vibrations and rotations), and

$$ {\widehat{H}}_e={\widehat{T}}_e+{\widehat{V}}_{\mathrm{ee}}+{\widehat{V}}_{\mathrm{ne}} $$
(47)

is called the electronic Hamiltonian; it represents the energy of the system after omitting the nuclear kinetic energy and the nuclear repulsion terms. One can now focus on the solution of an equation of the form

$$ \left[{\widehat{T}}_n\left(\mathbf{R}\right)+{\widehat{H}}_e\left(\mathbf{r};\mathbf{R}\right)+{\widehat{V}}_{\mathrm{nn}}\left(\mathbf{R}\right)\right]\Psi \left(\mathbf{r},\mathbf{R}\right)=E\Psi \left(\mathbf{r},\mathbf{R}\right). $$
(48)

Here, the dependence on the electronic spatial variables \( \mathbf{r}=\left({x}_1,{y}_1,{z}_1,\dots, {x}_N,{y}_N,{z}_N\right) \) and the nuclear spatial variables \( \mathbf{R}=\left({X}_1,{Y}_1,{Z}_1,\dots, {X}_M,{Y}_M,{Z}_M\right) \) is written explicitly. The semicolon in the Ĥ e term denotes the parametric dependence – for various R, various electronic equations are obtained.

With such a Hamiltonian, it seems reasonable to also distinguish the nuclear f(R) and electronic \( {\Psi}_e\left(\mathbf{r};\mathbf{R}\right) \) parts of the wave function

$$ \Psi \left(\mathbf{r},\mathbf{R}\right)\approx {\Psi}_e\left(\mathbf{r};\mathbf{R}\right)f\left(\mathbf{R}\right), $$
(49)

which leads to a significant reduction of the problem.

Now Eq. 48 can be separated into three equations:

$$ {\widehat{H}}_e\left(\mathbf{r};\mathbf{R}\right){\Psi}_e\left(\mathbf{r};\mathbf{R}\right)={E}_e\left(\mathbf{R}\right){\Psi}_e\left(\mathbf{r};\mathbf{R}\right) $$
(50)
$$ \left({\widehat{H}}_e\left(\mathbf{r};\mathbf{R}\right)+{\widehat{V}}_{\mathrm{nn}}\left(\mathbf{R}\right)\right){\Psi}_e\left(\mathbf{r};\mathbf{R}\right)=U\left(\mathbf{R}\right){\Psi}_e\left(\mathbf{r};\mathbf{R}\right) $$
(51)
$$ \left({\widehat{T}}_n\left(\mathbf{R}\right)+\widehat{U}\left(\mathbf{R}\right)\right)f\left(\mathbf{R}\right)=Ef\left(\mathbf{R}\right) $$
(52)

The first two describe the electronic motion for a given position of the nuclei. The difference between \( {E}_e\left(\mathbf{R}\right) \) and \( U\left(\mathbf{R}\right) \) is that in \( U\left(\mathbf{R}\right) \) the nuclear repulsion energy is taken into account. These equations are milestones in our considerations for two reasons. First, since we are now talking about “fixed positions of the nuclei,” we finally have molecules instead of an unspecified system containing some electrons and some nuclei. The second reason is hidden in Eq. 52: The electronic energy and the nuclear repulsion energy constitute the potential in which the nuclei move. That is why the proper description of the electronic motion in a molecule is so important: The electrons glue the whole molecule together.

Our attention in the rest of the chapter will be focused only on Eq. 50; hence, to simplify notation, all the subscripts denoting the electronic case will be omitted:

$$ {\widehat{H}}_e\to \widehat{H} $$
(53)
$$ {\Psi}_e\to \Psi $$
(54)
$$ {\widehat{H}}_e{\Psi}_e\left(r;R\right)={E}_e{\Psi}_e\left(r;R\right)\to \widehat{H}\Psi =E\Psi $$
(55)

The electronic wave function Ψ satisfies all the requirements discussed in the previous sections, depends on the coordinates of N electrons, and additionally must be antisymmetric with respect to the exchange of the coordinates of two electrons.

It should be noted that the analytic solution of Eq. 50 is not known even for the smallest molecules, such as \( {\mathrm{H}}_2 \). Therefore, approximate techniques must be applied to extract the necessary information about the molecules of interest. Quantum mechanics provides two tools:

  • Variational principle

  • Perturbation theory

Variational Principle: An Indicator

The variational principle allows one to judge the quality of the obtained solutions. It can be formulated as follows: For an arbitrary trial function χ that is square-integrable, differentiable, and antisymmetric and depends on the same set of variables as the sought ground-state function \( {\Psi}_0 \), we have

$$ {E}_0\le \frac{\left\langle \chi \left|\widehat{H}\right|\chi \right\rangle }{\left\langle \chi \Big|\chi \right\rangle }, $$
(56)

where \( {E}_0 \) is the ground-state energy corresponding to \( {\Psi}_0 \) (Eq. 50). The important consequence of the variational principle is that to estimate the energy of the system, one does not need to solve the eigenequation (this we already know; see Eq. 35), and moreover – what is crucial – the estimated energy value will never be lower than the exact eigenvalue \( {E}_0 \).

The proof of the inequality (Eq. 56) is straightforward and can be derived from Eqs. 30 and 35. The function χ that satisfies the above requirements can be expanded in the basis of the Hamiltonian eigenfunctions:

$$ \chi ={\displaystyle \sum_{i=0}^{\infty }{c}_i{f}_i,} $$
(57)

where the \( {f}_i \) fulfill the eigenproblem \( \widehat{H}{f}_i={E}_i{f}_i \). Thus,

$$ \frac{\left\langle \chi \left|\widehat{H}\right|\chi \right\rangle }{\left\langle \chi \Big|\chi \right\rangle }=\frac{{\displaystyle {\sum}_{i=0}^{\infty }{c}_i^{*}{c}_i{E}_i}}{{\displaystyle {\sum}_{i=0}^{\infty }{c}_i^{*}{c}_i}}\ge \frac{{\displaystyle {\sum}_{i=0}^{\infty }{c}_i^{*}{c}_i{E}_0}}{{\displaystyle {\sum}_{i=0}^{\infty }{c}_i^{*}{c}_i}}={E}_0, $$
(58)

with the assumption that \( {E}_0 \) is the lowest of all Hamiltonian eigenvalues (Atkins and Friedman 2005; Levine 2008; Lowe and Peterson 2005; McQuarrie and Simon 1997; Piela 2007; Szabo and Ostlund 1996).
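
The inequality of Eq. 56 is easy to verify numerically in a finite-dimensional model (a sketch assuming NumPy; the 5×5 size, the seed, and the number of trial vectors are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# A random symmetric matrix stands in for the Hamiltonian.
A = rng.standard_normal((5, 5))
H = (A + A.T) / 2
E0 = np.linalg.eigvalsh(H)[0]    # exact lowest eigenvalue, the "ground state"

# For every trial vector chi, the Rayleigh quotient of Eq. 56 bounds E0
# from above (the small tolerance guards against floating-point round-off).
for _ in range(100):
    chi = rng.standard_normal(5)
    assert (chi @ H @ chi) / (chi @ chi) >= E0 - 1e-12
```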

Perturbation Calculus: The Art of Estimation

Because the variational principle holds for the electronic Hamiltonian, a family of methods can be constructed that search for parameters optimizing the energy value. The quality of a given wave function is verified by the corresponding energy value: the lower, the better. Beyond this “quality control,” the variational principle gives no prescription for the choice of trial wave functions. Here perturbation calculus comes in – a method frequently applied in physics for estimating functions or values on the basis of partial knowledge about the solutions of the investigated problem. We will consider here the Rayleigh–Schrödinger variant of perturbation calculus (Atkins and Friedman 2005; Levine 2008; Lowe and Peterson 2005; McQuarrie and Simon 1997; Piela 2007; Ratner and Schatz 2000).

Let us assume that the total electronic Hamiltonian of the investigated system can be divided into

$$ \widehat{H}={\widehat{H}}^0+{\widehat{H}}^1, $$
(59)

in such a fashion that we know the exact solutions of

$$ {\widehat{H}}^0{\Psi}_k^{(0)}={E}_k^{(0)}{\Psi}_k^{(0)}, $$
(60)

where the subscript k enumerates the eigenvalues of the \( {\widehat{H}}^0 \) operator in such a way that \( E_0^{(0)} \) is the lowest energy. Now, one can say that the operator \( \widehat{H} \) describes the system for which \( {\widehat{H}}^0 \) is an unperturbed operator and \( {\widehat{H}}^1 \) denotes a perturbation. We can assume that if the change in the system represented by \( {\widehat{H}}^1 \) is minor, then the functions \( {\Psi}_k^{(0)} \) will be a good approximation to \( {\Psi}_k \). Considering \( {\widehat{H}}^0 \), one postulates its Hermiticity and that its eigenvalues are non-degenerate (in our case, at least the ground-state energy \( E_0^{(0)} \) must not be equal to any other eigenvalue). This condition will become clear in a moment.

Knowing only the unperturbed solutions (Eq. 60), we would like to say something more about the ground-state energy of the investigated system. Nothing is easier – we can calculate the average value of the full electronic Hamiltonian with the \( {\Psi}_0^{(0)} \) function. The variational principle states that the resulting energy will not be lower than the exact energy:

$$ {E}_0\le \left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^0+{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle ={E}_0^{(0)}+{E}_0^{(1)}. $$
(61)

The term modifying \( E_0^{(0)} \) is simply

$$ {E}_0^{(1)}=\left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle . $$
(62)

So far, the only new thing is the manner of partitioning the total energy into the energy of the unperturbed system and the corrections (where \( E_0^{(1)} \) is not the only term):

$$ {E}_0={E}_0^{(0)}+{E}_0^{(1)}+{E}_0^{(2)}+.... $$
(63)

Likewise, the wave function can be written as

$$ {\Psi}_0={\Psi}_0^{(0)}+{\Psi}_0^{(1)}+{\Psi}_0^{(2)}+\dots, $$
(64)

where \( {\Psi}_0^{(1)} \), \( {\Psi}_0^{(2)} \), and so forth are the corrections to the wave function of the unperturbed system \( {\Psi}_0^{(0)} \). Now, the electronic Schrödinger Eq. 50 becomes

$$ \begin{array}{l} \left({\widehat{H}}^0+{\widehat{H}}^1\right)\left({\Psi}_0^{(0)}+{\Psi}_0^{(1)}+{\Psi}_0^{(2)}+\dots \right)\\[6pt] {\qquad}=\left({E}_0^{(0)}+{E}_0^{(1)}+{E}_0^{(2)}+\dots \right)\left({\Psi}_0^{(0)}+{\Psi}_0^{(1)}+{\Psi}_0^{(2)}+\dots \right).\end{array} $$
(65)

Introducing the expansions (Eqs. 63 and 64) does not increase our knowledge about the energy or the wave function; it is only a different way of expressing the unknowns by other unknowns. However, now we have a starting point for further investigations.

The comparison of the terms on the left- and right-hand side of the above expression is instructive. Let us regard as similar the terms with the same sum of the superscripts (so-called perturbation order, by analogy to the multiplication and ordering of polynomials). Simple multiplication in Eq. 65 and directing the terms of the same order to separate equations gives

$$ {\widehat{H}}^0{\Psi}_0^{(0)}={E}_0^{(0)}{\Psi}_0^{(0)}, $$
(66)
$$ {\widehat{H}}^0{\Psi}_0^{(1)}+{\widehat{H}}^1{\Psi}_0^{(0)}={E}_0^{(0)}{\Psi}_0^{(1)}+{E}_0^{(1)}{\Psi}_0^{(0)}, $$
(67)
$$ \begin{array}{l}{\widehat{H}}^0{\Psi}_0^{(2)}+{\widehat{H}}^1{\Psi}_0^{(1)} ={E}_0^{(0)}{\Psi}_0^{(2)}+{E}_0^{(1)}{\Psi}_0^{(1)}+{E}_0^{(2)}{\Psi}_0^{(0)}.\\ {} \vdots \end{array} $$
(68)

These equations link the corrections to the wave function and to the energy. Before a detailed investigation of the subsequent corrections, one more thing should be underlined. Up to now, the function \( \Psi_0 \) is not normalized; only the \( {\Psi}_k^{(0)} \) are normalized. Until the corrections to \( \Psi_0 \) are found, we are not able to normalize it; we can only write the normalization constant as \( \mathcal{N}=\frac{1}{\sqrt{\left\langle {\Psi}_0|{\Psi}_0\right\rangle }} \). However, it is not necessary at this moment. The intermediate normalization condition is more useful now:

$$ \left\langle {\Psi}_0^{(0)}\Big|{\Psi}_0\right\rangle =1. $$
(69)

Such a concept is based on the fact that the eigenfunctions of \( {\widehat{H}}^0 \) form an orthonormal complete set (that is one of the reasons why the Hermiticity of \( {\widehat{H}}^0 \) was required), and they can be applied to express any other function, for instance, \( \Psi_0 \), as

$$ {\Psi}_0={\displaystyle \sum_{k=0}^{\infty }{c}_k{\Psi}_k^{(0)}}={c}_0{\Psi}_0^{(0)}+{\displaystyle \sum_{k\ne 0}^{\infty }{c}_k{\Psi}_k^{(0)}.} $$
(70)

In this linear combination, the function \( {\Psi}_0^{(0)} \) has a distinguished meaning \( \left({c}_0=1\right) \): it is the approximation of the wave function of the considered system. Therefore, one can require that \( {\Psi}_0^{(0)} \) does not contribute to the higher corrections \( {\Psi}_0^{(1)} \), \( {\Psi}_0^{(2)} \), and so on:

$$ {\displaystyle \sum_{k\ne 0}^{\infty }{c}_k{\Psi}_k^{(0)}={\Psi}_0^{(1)}+{\Psi}_0^{(2)}+....} $$
(71)

Here, the benefits from the intermediate normalization are obvious: The function \( {\Psi}_0^{(0)} \) is orthogonal to each of the corrections (or, in other words, the corrections are defined in such a way that they are orthogonal to \( {\Psi}_0^{(0)} \)).

Therefore, there is an additional set of equations to be satisfied:

$$ \left\langle {\Psi}_0^{(0)}\Big|{\Psi}_0^{(n)}\right\rangle ={\delta}_{0n}, $$
(72)

where the superscript n denotes the nth-order correction to the ground-state wave function \( \Psi_0 \). Now we can go back to Eqs. 66, 67, and 68 and extract the corrections to the energy. For this purpose, each of the equations must be multiplied from the left-hand side by \( {\Psi}_0^{(0)} \) and integrated:

$$ {E}_0^{(0)}=\left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^0\right|{\Psi}_0^{(0)}\right\rangle, $$
(73)
$$ {E}_0^{(1)}=\left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle, $$
(74)
$$ {E}_0^{(2)} =\left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{(1)}\right\rangle, \quad \vdots $$
(75)

The integrals \( \left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^0\right|{\Psi}_0^{(n)}\right\rangle \) vanish for \( n>0 \), since

$$ \left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^0\right|{\Psi}_0^{(n)}\right\rangle =\left\langle {\widehat{H}}^0{\Psi}_0^{(0)}\Big|{\Psi}_0^{(n)}\right\rangle ={E}_0^{(0)}\left\langle {\Psi}_0^{(0)}\Big|{\Psi}_0^{(n)}\right\rangle =0. $$
(76)

Thus, obtaining the energy corrections of any order is straightforward. The general expression for the nth-order correction can be written as

$$ {E}_0^{(n)}=\left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{\left(n-1\right)}\right\rangle \quad \mathrm{for}\ n\ge 1. $$
(77)

The problem is that to obtain the corrections to the energy in the second or higher orders, the corrections to the wave function are necessary. Then, let us try to find \( {\Psi}_0^{(1)} \). This function can be expressed as a linear combination of the functions from the orthonormal set \( \{{\Psi}_k^{(0)}\} \) for \( k\ne 0 \):

$$ {\Psi}_0^{(1)}={\displaystyle \sum_{k\ne 0}^{\infty }{c}_k^{(1)}{\Psi}_k^{(0)}}, $$
(78)

where \( {c}_k^{(1)} \) are the expansion coefficients of the first-order correction (the superscript is dropped below for brevity). Again, the whole problem reduces to finding the coefficients \( {c}_k \). Substituting Eq. 78 into Eq. 67 gives

$$ \left({\widehat{H}}^0-{E}_0^{(0)}\right){\displaystyle \sum_{k\ne 0}^{\infty }{c}_k^{(1)}{\Psi}_k^{(0)}=\left({E}_0^{(1)}-{\widehat{H}}^1\right)}{\Psi}_0^{(0)}. $$
(79)

Integrating this equation with the Ψ (0) l function leads to

$$ \begin{array}{ll}\mathrm{L}\mathrm{H}\mathrm{S}&=\left\langle {\Psi}_l^{(0)}\left|{\widehat{H}}^0-{E}_0^{(0)}\right|{\displaystyle \sum_{k\ne 0}^{\infty }{c}_k}{\Psi}_k^{(0)}\right\rangle ={\displaystyle \sum_{k\ne 0}^{\infty }{c}_k}\left\langle {\Psi}_l^{(0)}\left|{\widehat{H}}^0-{E}_0^{(0)}\right|{\Psi}_k^{(0)}\right\rangle \\[10pt] {}&={\displaystyle \sum_{k\ne 0}^{\infty }{c}_k\left({E}_l^{(0)}-{E}_0^{(0)}\right)\left\langle {\Psi}_l^{(0)}\Big|{\Psi}_k^{(0)}\right\rangle }={\displaystyle \sum_{k\ne 0}^{\infty }{c}_k\left({E}_l^{(0)}-{E}_0^{(0)}\right){\delta}_{lk}}\\[10pt] {}&={c}_l\left({E}_l^{(0)}-{E}_0^{(0)}\right)\end{array} $$
(80)

and

$$ \begin{array}{ll}\mathrm{R}\mathrm{H}\mathrm{S}&=\left\langle {\Psi}_l^{(0)}\left|{\mathrm{E}}_0^{(1)}-{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle ={E}_0^{(1)}\left\langle \left.{\Psi}_l^{(0)}\right|{\Psi}_0^{(0)}\right\rangle -\left\langle {\Psi}_l^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle \\[10pt] {}&=-\left\langle {\Psi}_l^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle .\end{array} $$
(81)

Altogether, these allow one to write the coefficients of the expansion (Eq. 78) as

$$ {c}_l=\frac{\left\langle {\Psi}_l^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle }{E_0^{(0)}-{E}_l^{(0)}}. $$
(82)

Hence, the first correction to the wave function is already known:

$$ {\Psi}_0^{(1)}={\displaystyle \sum_{k\ne 0}^{\infty}\frac{\left\langle {\Psi}_k^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle }{E_0^{(0)}-{E}_k^{(0)}}}{\Psi}_k^{(0)}, $$
(83)

and, thereby, the second-order correction to the energy can be calculated as

$$ {E}_0^{(2)}={\displaystyle \sum_{k\ne 0}^{\infty}\frac{\left\langle {\Psi}_k^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle }{E_0^{(0)}-{E}_k^{(0)}}}\left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_k^{(0)}\right\rangle . $$
(84)

This is also equivalently written as

$$ {E}_0^{(2)}={\displaystyle \sum_{k\ne 0}^{\infty}\frac{{\left|\left\langle {\Psi}_k^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle \right|}^2}{E_0^{(0)}-{E}_k^{(0)}}}. $$
(85)

The energy difference in the denominator of the above expression cannot be equal to zero; it is for this reason that a non-degenerate ground state was assumed. The higher-order corrections are sought in a similar manner, which simply requires more operations.
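The working formulas (Eqs. 73, 74, and 85) can be verified on a finite model: take a diagonal \( {\widehat{H}}^0 \) (so its eigenpairs are known exactly), add a small Hermitian perturbation \( {\widehat{H}}^1 \), and compare the exact lowest eigenvalue of \( {\widehat{H}}^0+{\widehat{H}}^1 \) with \( E_0^{(0)}+E_0^{(1)}+E_0^{(2)} \). The matrices below are arbitrary stand-ins, not a physical system:

```python
import numpy as np

rng = np.random.default_rng(2)

E_unpert = np.array([0.0, 1.0, 2.5, 4.0])  # non-degenerate spectrum of H0
H0 = np.diag(E_unpert)

B = rng.standard_normal((4, 4))
H1 = 0.001 * (B + B.T)                      # small Hermitian perturbation

# Eigenfunctions of H0 are the unit vectors, so the matrix elements
# <Psi_k(0)|H1|Psi_0(0)> are simply H1[k, 0].
E0_0 = E_unpert[0]
E0_1 = H1[0, 0]                                                # Eq. 74
E0_2 = sum(abs(H1[k, 0])**2 / (E_unpert[0] - E_unpert[k])
           for k in range(1, 4))                               # Eq. 85

E_exact = np.linalg.eigvalsh(H0 + H1)[0]
error = abs(E_exact - (E0_0 + E0_1 + E0_2))
# Through second order the residual error is O(H1^3).
assert error < 1e-5
```

Note that every term of the sum in Eq. 85 has a negative denominator when \( E_0^{(0)} \) is the lowest level, so the second-order correction to the ground state is never positive, which the model reproduces.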

One interesting issue is the problem of variationality of the perturbation calculus built upon the variational Hamiltonian. This is, however, a sophisticated problem for advanced readers and will not be discussed here. It should be added, in summary, that the manner of partitioning the Hamilton operator was arbitrary. The only prerequisites were the Hermitian character of the operators (so that the eigenfunctions form an orthonormal set) and a non-degenerate ground-state eigenenergy – nothing more. One should also remember that, in practice, even the unperturbed problem cannot be solved exactly, and approximations must be applied; the consequence can be a loss of accuracy in the higher-order corrections. Moreover, good convergence of the perturbation expansion can be expected when the consecutive corrections are small in comparison with the total estimated value. In such a case, the low orders of the series already reproduce the sought value with relatively good accuracy. Thus, application of the low orders of perturbation calculus is highly recommended.

One-Electron Approximation: Describe One and Say Something About All

Equipped with general knowledge about the tools for the Schrödinger equation solution, one can move to many-electron systems.

The electronic Hamiltonian for any many-electron system in atomic units has the following form (compare Eqs. 41, 42, 43, 44, and 45):

$$ \widehat{H}=-\frac{1}{2}{\displaystyle \sum_{i=1}^N{\Delta}_{r_i}-{\displaystyle \sum_{i=1}^N{\displaystyle \sum_{j=1}^M\frac{Z_j}{\left|{\mathbf{r}}_i-{\mathbf{R}}_j\right|}}+{\displaystyle \sum_{i=1}^N{\displaystyle \sum_{j=i+1}^N\frac{1}{\left|{\mathbf{r}}_i-{\mathbf{r}}_j\right|}}.}}} $$
(86)

Let us look more closely. In the first term, we sum up over the number of electrons N; in the second term, the summations run over the number of electrons N and number of nuclei M; and the third term contains the double sum over the number of electrons N. Thus, one can simplify the notation of the first two terms:

$$ -\frac{1}{2}{\displaystyle \sum_{i=1}^N{\Delta}_{r_i}}-{\displaystyle \sum_{i=1}^N{\displaystyle \sum_{j=1}^M\frac{Z_j}{\left|{\mathbf{r}}_i-{\mathbf{R}}_j\right|}}={\displaystyle \sum_{i=1}^N\left(-\frac{1}{2}{\Delta}_{r_i}-{\displaystyle \sum_{j=1}^M\frac{Z_j}{\left|{\mathbf{r}}_i-{\mathbf{R}}_j\right|}}\right)}}. $$
(87)

Now, denoting the term in parentheses by ĥ(i),

$$ \widehat{h}(i)=-\frac{1}{2}{\Delta}_{r_i}-{\displaystyle \sum_{j=1}^M\frac{Z_j}{\left|{\mathbf{r}}_i-{\mathbf{R}}_j\right|}}, $$
(88)

we get the part of the Hamiltonian depending only on the coordinates of one electron i (and nuclear coordinates, but it does not bother us). Hence, the total electronic Hamiltonian (Eq. 86) can be rewritten as the sum of one-electron and two-electron contributions:

$$ \widehat{H}={\displaystyle \sum_{i=1}^N\widehat{h}}(i)+{\displaystyle \sum_{i=1}^N{\displaystyle \sum_{j=i+1}^N\widehat{g}\left(i,j\right)},} $$
(89)

where we introduced a symbol:

$$ \widehat{g}\left(i,j\right)=\frac{1}{\left|{\mathbf{r}}_i-{\mathbf{r}}_j\right|}. $$
(90)

It should be noticed that each of the one-electron Hamiltonians ĥ(i) describes a single electron in the field of some potentials. Therefore, the exact solutions of the eigenvalue problem for these one-electron operators are available. The problem lies in the ĝ(i, j) operator that couples two electrons together: It is not possible to separate their coordinates exactly.

All electronic Hamiltonians have the same general form; they differ only in the number of electrons N and the nuclear potential hidden in ĥ(i). One can choose any possible chemical compounds and try to write the corresponding equations; however, one would soon note the similarity of all of them. Therefore, we will not invest time in describing the procedure for a polypeptide or a nanotube, but for simplicity we will start from the two-electron helium atom (the simplest many-electron system) and later try to generalize the considerations. For the helium atom:

  • \( N=2 \) – two electrons

  • \( M=1 \) – one nucleus

The generalization of the helium discussion into the larger (N-electron) systems should be straightforward:

$$ \widehat{h}(1)+\widehat{h}(2)={\displaystyle \sum_{i=1}^2\widehat{h}(i)}\to {\displaystyle \sum_{i=1}^N\widehat{h}(i)}, $$
(91)
$$ \widehat{g}\left(1,2\right)={\displaystyle \sum_{i=1}^2{\displaystyle \sum_{j=i+1}^2\widehat{g}\left(i,j\right)}\to {\displaystyle \sum_{i=1}^N{\displaystyle \sum_{j=i+1}^N\widehat{g}\left(i,j\right)},}} $$
(92)

and finally

$$ \widehat{h}(i)=-\frac{1}{2}{\Delta}_{r_i}-\frac{Z_1}{\left|{\mathbf{r}}_i-{\mathbf{R}}_1\right|}\to -\frac{1}{2}{\Delta}_{r_i}-{\displaystyle \sum_{j=1}^M\frac{Z_j}{\left|{\mathbf{r}}_i-{\mathbf{R}}_j\right|}}. $$
(93)

Recall that the exact solutions of the one-electron problem are known, and the task is to solve the full problem. The ideas of perturbation theory were explained in the previous section; now it is time to apply that knowledge. The one-electron part can be treated as the unperturbed Hamiltonian and the rest as the perturbation:

$$ {\widehat{H}}^0=\widehat{h}(1)+\widehat{h}(2),\qquad {\widehat{H}}^1=\widehat{g}\left(1,2\right), $$
(94)

where the normalized solutions for the one-electron part are known:

$$ \widehat{h}(1){\phi}_i(1)={\upepsilon}_i{\phi}_i(1), $$
(95)
$$ \widehat{h}(2){\phi}_j(2)={\upepsilon}_j{\phi}_j(2). $$
(96)

These functions require more attention. Although the electronic Hamiltonian – and thereby the one-electron operators – do not act on the spin variables, the wave functions \( \phi_k \) must carry the spin dependence. Therefore, the function \( \phi_k(l) \) is a product of a spatial part, depending on the three spatial coordinates of the electron l, and a spin part; such a function is called a spin-orbital. For simplicity, the set of coordinates \( \tau_l \) is written as a label of the electron, i.e., l. This convention will be applied from now on, except where the \( \tau_l \) labeling is really needed.

If an operator can be written as a sum of contributions acting on different variables, its eigenfunction takes a form of the product of the eigenfunctions of the subsequent operators in the summation. In the case of the helium atom, where Ĥ 0 is a sum of ĥ(1) and ĥ(2), the wave function Ψ(1, 2) can be denoted as the product of one-electron functions:

$$ \Psi \left(1,2\right)={\phi}_i(1){\phi}_j(2). $$
(97)

The eigenproblem for such a function gives the eigenvalue that is simply the sum of the one-electron eigenvalues:

$$ \begin{aligned}\left[\widehat{h}(1)+\widehat{h}(2)\right]{\phi}_i(1){\phi}_j(2)&=\widehat{h}(1){\phi}_i(1){\phi}_j(2)+\widehat{h}(2){\phi}_i(1){\phi}_j(2)\nonumber\\ &{\quad}=\left[\widehat{h}(1){\phi}_i(1)\right]{\phi}_j(2)+\left[\widehat{h}(2){\phi}_j(2)\right]{\phi}_i(1)\nonumber\\ &{\quad}=\left[{\upepsilon}_i{\phi}_i(1)\right]{\phi}_j(2)+\left[{\upepsilon}_j{\phi}_j(2)\right]{\phi}_i(1)\nonumber\\ &{\quad}=\left[{\upepsilon}_i+{\upepsilon}_j\right]{\phi}_i(1){\phi}_j(2).\end{aligned} $$
(98)
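The rule used in Eq. 98 – for a sum of operators acting on different variables, the eigenvalues are sums of the one-particle eigenvalues – can be illustrated with matrices: representing ĥ(1) and ĥ(2) by the same matrix h, the two-particle operator ĥ(1) + ĥ(2) becomes a Kronecker sum on the product space, and its spectrum is exactly the set of pairwise sums \( {\upepsilon}_i+{\upepsilon}_j \). A sketch with an arbitrary symmetric 3 × 3 stand-in for h:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
h = (A + A.T) / 2                      # model one-electron Hamiltonian
eps = np.linalg.eigvalsh(h)            # one-electron eigenvalues eps_i

I = np.eye(3)
# h(1) + h(2) on the two-particle product space: a Kronecker sum.
H0 = np.kron(h, I) + np.kron(I, h)

# The two-particle spectrum is all pairwise sums eps_i + eps_j (Eq. 98).
pair_sums = np.sort((eps[:, None] + eps[None, :]).ravel())
assert np.allclose(np.linalg.eigvalsh(H0), pair_sums)
```

The corresponding eigenvectors of the Kronecker sum are the products (tensor products) of the one-particle eigenvectors, mirroring the product ansatz of Eq. 97.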

However, here the antisymmetry requirement should also be taken into account. The many-electron function must change the sign with respect to the exchange of labels of any two electrons:

$$ \Psi \left(1,2\right)=-\Psi \left(2,1\right). $$
(99)

The product (Eq. 97) is not antisymmetric; the interchange of the electron labels leads to

$$ \Psi \left(2,1\right)={\phi}_i(2){\phi}_j(1)\ne -\Psi \left(1,2\right), $$
(100)

and the result is a function different from the original Ψ(1, 2). But another function,

$$ \Psi \left(1,2\right)\sim \left[{\phi}_i(1){\phi}_j(2)-{\phi}_i(2){\phi}_j(1)\right], $$
(101)

satisfies the antisymmetry condition.

Moreover, the wave function Ψ(1, 2) has to be normalized. This can be achieved by calculating the following (overlap) integral:

$$ \begin{array}{l}\left\langle \Psi \left(1,2\right)\Big|\Psi \left(1,2\right)\right\rangle {=}\left\langle \mathcal{N}\left({\phi}_i(1){\phi}_j(2){-}{\phi}_j(1){\phi}_i(2)\right)\Big|\mathcal{N}\left({\phi}_i(1){\phi}_j(2){-}{\phi}_j(1){\phi}_i(2)\right)\right\rangle \\[8pt] {} ={\mathcal{N}}^2\Big(\left\langle \left.{\phi}_i(1){\phi}_j(2)\right|{\phi}_i(1){\phi}_j(2)\right\rangle -\left\langle \left.{\phi}_i(1){\phi}_j(2)\right|{\phi}_j(1){\phi}_i(2)\right\rangle \\[8pt] {} -\left\langle \left.{\phi}_j(1){\phi}_i(2)\right|{\phi}_i(1){\phi}_j(2)\right\rangle +\left\langle \left.{\phi}_j(1){\phi}_i(2)\right|{\phi}_j(1){\phi}_i(2)\right\rangle \Big).\end{array} $$
(102)

The spin-orbitals of electrons 1 and 2 are mutually independent; thus,

$$ \left\langle {\phi}_i(1){\phi}_j(2)\Big|{\phi}_k(1){\phi}_l(2)\right\rangle =\left\langle {\phi}_i(1)\Big|{\phi}_k(1)\right\rangle \left\langle {\phi}_j(2)\Big|{\phi}_l(2)\right\rangle ={\delta}_{ik}{\delta}_{jl}, $$
(103)

where \( \left\langle {\phi}_i(1)\Big|{\phi}_k(1)\right\rangle ={\delta}_{ik} \) arises from the fact that the eigenfunctions of the Hermitian operator ĥ(1) form the orthonormal set. Hence, in Eq. 102 only the first and last integral will be nonvanishing, and, finally,

$$ \left\langle \Psi \left(1,2\right)\Big|\Psi \left(1,2\right)\right\rangle =2{\mathcal{N}}^2=1, $$
(104)

and, therefore, the normalization constant \( \mathcal{N} \) must equal \( 1/\sqrt{2} \). In order to fulfill the normalization and the antisymmetry request, the trial wave function can be written as

$$ \Psi \left(1,2\right)=\frac{1}{\sqrt{2}}\left({\phi}_i(1){\phi}_j(2)-{\phi}_j(1){\phi}_i(2)\right)=\frac{1}{\sqrt{2}}\left|\begin{array}{cc}\hfill {\phi}_i(1)\hfill & \hfill {\phi}_i(2)\hfill \\ {}\hfill {\phi}_j(1)\hfill & \hfill {\phi}_j(2)\hfill \end{array}\right|. $$
(105)
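Equations 99–105 can be checked in a small discrete model: represent the spin-orbitals \( \phi_i \), \( \phi_j \) by orthonormal vectors, build Ψ(1, 2) of Eq. 105 as a two-index array, and verify antisymmetry and unit norm. The four-dimensional basis below is an arbitrary illustration:

```python
import numpy as np

# Two orthonormal "spin-orbitals" in a 4-dimensional discrete basis.
phi_i = np.array([1.0, 0.0, 0.0, 0.0])
phi_j = np.array([0.0, 1.0, 0.0, 0.0])

# Psi(1,2) of Eq. 105; axis 0 <-> electron 1, axis 1 <-> electron 2.
Psi = (np.outer(phi_i, phi_j) - np.outer(phi_j, phi_i)) / np.sqrt(2)

# Antisymmetry, Eq. 99: swapping the electron labels flips the sign.
assert np.allclose(Psi.T, -Psi)

# Normalization, Eq. 104: <Psi|Psi> = 1 with the 1/sqrt(2) factor.
assert np.isclose(np.sum(Psi**2), 1.0)
```

Setting \( \phi_i=\phi_j \) in this construction gives Psi identically zero, which is the Pauli exclusion principle emerging from the determinantal form.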

Thereafter, the expectation value of the Hamiltonian can be calculated using the same tricks as in Eq. 102:

$$ \begin{array}{l}{\left\langle \widehat{H}\left(1,2\right)\right\rangle}_{\Psi \left(1,2\right)}=\left\langle \Psi \left(1,2\right)\left|\widehat{H}\left(1,2\right)\right|\Psi \left(1,2\right)\right\rangle \\[6pt] {} =\left\langle {\phi}_i(1)\left|\widehat{h}(1)\right|{\phi}_i(1)\right\rangle +\left\langle {\phi}_j(2)\left|\widehat{h}(2)\right|{\phi}_j(2)\right\rangle \\[7pt] {} +\left\langle {\phi}_i(1){\phi}_j(2)\left|\frac{1}{r_{12}}\right|{\phi}_i(1){\phi}_j(2)\right\rangle \\[7pt] {} -\left\langle {\phi}_i(1){\phi}_j(2)\left|\frac{1}{r_{12}}\right|{\phi}_j(1){\phi}_i(2)\right\rangle .\end{array} $$
(106)

The only difference with respect to the overlap integral is that the integrals containing 1/r 12 cannot be separated.

In the spirit of perturbation calculus, the expectation value (Eq. 106) can be treated as the energy corrected up to the first order of the perturbation expansion. According to the variational principle, this integral is an upper bound to the exact energy of the two-electron system under consideration. The recipe for correcting the trial function is given by perturbation theory: Apply the functions corresponding to the remaining states of the unperturbed system in the expansion. In other words, in order to improve the wave function, an expansion built from products of the remaining states of the system should be used.

Let us summarize. The many-electron function of the system can be approximately written as the antisymmetrized product of the one-electron functions being the solutions for the one-electron eigenvalue problem. This is the idea of the popular one-electron approximation (Atkins and Friedman 2005; Lowe and Peterson 2005; McQuarrie and Simon 1997; Ratner and Schatz 2000). In the N-electron case, one obtains the trial function as the antisymmetrized product of N one-electron functions that can be written in the form of the Slater determinant:

$$ \Psi \left(1,2,\dots, N\right)=\frac{1}{\sqrt{N!}}\left|\begin{array}{llll}{\phi}_1(1)\hfill & {\phi}_1(2)\hfill & \dots \hfill & {\phi}_1(N)\hfill \\ {}{\phi}_2(1)\hfill & {\phi}_2(2)\hfill & \dots \hfill & {\phi}_2(N)\hfill \\ {}\vdots \hfill & \vdots \hfill & \ddots \hfill & \vdots \hfill \\ {}{\phi}_N(1)\hfill & {\phi}_N(2)\hfill & \dots \hfill & {\phi}_N(N)\hfill \end{array}\right|. $$
(107)

The \( 1/\sqrt{N!} \) factor ensures the normalization of the wave function. Often, instead of writing the whole determinant, only its diagonal is written down:

$$ \left|\begin{array}{llll}{\phi}_1(1)\hfill & {\phi}_1(2)\hfill & \dots \hfill & {\phi}_1(N)\hfill \\ {}{\phi}_2(1)\hfill & {\phi}_2(2)\hfill & \dots \hfill & {\phi}_2(N)\hfill \\ {}\vdots \hfill & \vdots \hfill & \ddots \hfill & \vdots \hfill \\ {}{\phi}_N(1)\hfill & {\phi}_N(2)\hfill & \dots \hfill & {\phi}_N(N)\hfill \end{array}\right|=\left|{\phi}_1(1){\phi}_2(2)\dots {\phi}_N(N)\right|. $$
(108)

It is important to realize that the many-electron function in the form of the Slater determinant is only an approximation. Intuition tells us that describing the many-electron system using only one-electron functions cannot be exact. One needs to be aware of the fact that such an approach causes the loss of some information included in the sought wave function. In particular, the one-electron function cannot “see” another electron; therefore, the terms coupling the mutual electron positions are missing in the one-electron approximation.

For example, consider the two-electron function (Piela 2007)

$$ F\left(1,2\right)\sim \left({e}^{-a{r}_1-b{r}_2-c{r}_{12}}-{e}^{-a{r}_2-b{r}_1-c{r}_{12}}\right), $$
(109)

where \( r_1 \), \( r_2 \) are the electron–nucleus distances, \( r_{12} \) denotes the distance between the two electrons, and a, b, and c stand for the coefficients. F(1, 2) contains the factor \( {e}^{-c{r}_{12}} \) correlating the electron motions. In the one-electron approach, such a function would be approximated by the antisymmetrized function

$$ f\left(1,2\right)\sim \left({e}^{-a{r}_1-b{r}_2}-{e}^{-a{r}_2-b{r}_1}\right), $$
(110)

with total neglect of this correlation. From the practical point of view, this means that the trial function written as the single determinant does not allow reproduction of the exact electronic energy, even when using the best possible one-electron functions for its construction.

The question arises: Why use such an approach, knowing from the very beginning that it is inexact? The answer is simple. First, including the electron correlation in the wave function is very expensive. For two electrons, one additional term appears; for three, three terms; for four, six. In general, for N electrons there are N(N − 1)∕2 terms (the triangle of the N × N matrix without the diagonal elements). Therefore, the number of coefficients describing the electron correlation is much larger than the number of one-electron terms. Calculating even the one-electron coefficients is very time-consuming; moreover, calculating the overlap integrals and Hamiltonian matrix elements becomes prohibitively complicated with the correlated functions. Second, perturbation theory provides ways of improving the results: An expansion built on a larger number of determinants will lead to better energy. Third, a chemist usually does not need exact data but only an appropriate accuracy (furnishing a house does not require caliper measurements, just a quick glance to estimate the size of the door and the furniture).
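The count quoted above – N(N − 1)∕2 interelectronic distances – is just the number of unordered pairs of electrons, which can be confirmed directly:

```python
from itertools import combinations

# Each r_ij with i < j is one unordered pair of electron labels.
for N in [2, 3, 4, 10]:
    pairs = list(combinations(range(N), 2))
    assert len(pairs) == N * (N - 1) // 2

# N = 2 -> 1 term, N = 3 -> 3 terms, N = 4 -> 6 terms, as in the text.
```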

Therefore, let us stick to the one-electron approximation. The next section will explain how to find the best possible spin-orbitals.

Hartree–Fock Method: It Is Not That Sophisticated

Now we are prepared to concentrate on the methods for solving the electron equation. As you will see, they are only an extension of the already-discussed techniques. We start with the fundamental Hartree–Fock method.

The main goal of this approximation is to find the spin-orbitals applied for construction of the Slater determinant that will best reproduce the exact wave function. We again begin our considerations with the two-electron system. The problem is that the operator ĥ(i) does not contain the part arising from the potential of the second electron, and an operator responsible for this missing part must be found. What we do know is that such an interaction is included in the two-electron part of Eq. 106:

$$ {\left\langle \frac{1}{r_{12}}\right\rangle}_{\Psi \left(1,2\right)}=\left\langle {\phi}_i(1){\phi}_j(2)\left|\frac{1}{r_{12}}\right|{\phi}_i(1){\phi}_j(2)\right\rangle -\left\langle {\phi}_i(1){\phi}_j(2)\left|\frac{1}{r_{12}}\right|{\phi}_j(1){\phi}_i(2)\right\rangle . $$
(111)

Unfortunately, Eq. 111 contains the integrals that cannot be exactly separated into a product of the simpler integrals depending only on the coordinates of one electron. What we can propose is rewriting this equation in the following form:

$$ {\left\langle \frac{1}{r_{12}}\right\rangle}_{\varPsi \left(1,2\right)}=\left\langle {\phi}_i(1)\left|\widehat{\upsilon}(1)\right|{\phi}_i(1)\right\rangle +\left\langle {\phi}_j(2)\left|\widehat{\upsilon}(2)\right|{\phi}_j(2)\right\rangle +\mathrm{the}\;\mathrm{rest}, $$
(112)

where \( \widehat{\upsilon}(1) \) and \( \widehat{\upsilon}(2) \) are introduced to extract only the terms depending on the coordinates of a single electron from Eq. 111, and “the rest” is what remains from \( {\left\langle \frac{1}{r_{12}}\right\rangle}_{\Psi \left(1,2\right)} \) after this extraction.

If such extraction were possible, the operators \( \widehat{\upsilon}(1) \) and \( \widehat{\upsilon}(2) \) could be applied to improve our one-electron operators, which leads to the following equations:

$$ \left[\widehat{h}(1)+\widehat{\upsilon}(1)\right]{\phi}_i(1)={\upepsilon}_i{\phi}_i(1), $$
(113)
$$ \left[\widehat{h}(2)+\widehat{\upsilon}(2)\right]{\phi}_j(2)={\upepsilon}_j{\phi}_j(2). $$
(114)

The solutions of such equations (spin-orbitals ϕ i , ϕ j ) can be used to build up the Slater determinant. They ensure better approximation than Eqs. 95 and 96, since they somehow provide for the influence of the second electron.

Therefore, our goal now is to utilize Eq. 106 (treating \( \phi_i \) and \( \phi_j \) as known functions) to find the best form of the operators \( \widehat{\upsilon}(1) \) and \( \widehat{\upsilon}(2) \). Adding zero, written as

$$ 0={\left\langle \frac{1}{r_{12}}\right\rangle}_{\Psi \left(1,2\right)}-{\left\langle \frac{1}{r_{12}}\right\rangle}_{\Psi \left(1,2\right)}, $$
(115)

to Eq. 111 allows one to ascribe, for instance,

$$ \begin{array}{rl}\left\langle {\phi}_i(1)\left|\widehat{\upsilon}(1)\right|{\phi}_i(1)\right\rangle &=\left\langle {\phi}_i(1){\phi}_j(2)\left|\frac{1}{r_{12}}\right|{\phi}_i(1){\phi}_j(2)-{\phi}_j(1){\phi}_i(2)\right\rangle, \\ {}\left\langle {\phi}_j(2)\left|\widehat{\upsilon}(2)\right|{\phi}_j(2)\right\rangle &=\left\langle {\phi}_i(1){\phi}_j(2)\left|\frac{1}{r_{12}}\right|{\phi}_i(1){\phi}_j(2)-{\phi}_j(1){\phi}_i(2)\right\rangle, \\ {}\mathrm{the}\;\mathrm{rest}&=-\left\langle {\phi}_i(1){\phi}_j(2)\left|\frac{1}{r_{12}}\right|{\phi}_i(1){\phi}_j(2)-{\phi}_j(1){\phi}_i(2)\right\rangle .\end{array} $$

Consider in more detail the integral containing \( \widehat{\upsilon}(1) \). We want it to be expressed in such a way that only the coordinates of the electron labeled by 1 are explicitly written under the integral:

$$ \begin{array}{l}\left\langle {\phi}_i(1)\left|\widehat{\upsilon}(1)\right|{\phi}_i(1)\right\rangle =\left\langle {\phi}_i(1){\phi}_j(2)\left|\frac{1}{r_{12}}\right|{\phi}_i(1){\phi}_j(2)-{\phi}_j(1){\phi}_i(2)\right\rangle \\ {} =\int \int {\phi}_i^{*}\left({\tau}_1\right){\phi}_j^{*}\left({\tau}_2\right)\frac{1}{r_{12}}{\phi}_i\left({\tau}_1\right){\phi}_j\left({\tau}_2\right)d{\tau}_1d{\tau}_2\\ {} -\int \int {\phi}_i^{*}\left({\tau}_1\right){\phi}_j^{*}\left({\tau}_2\right)\frac{1}{r_{12}}{\phi}_j\left({\tau}_1\right){\phi}_i\left({\tau}_2\right)d{\tau}_1d{\tau}_2\\ {} =\int {\phi}_i^{*}\left({\tau}_1\right)\left(\int {\phi}_j^{*}\left({\tau}_2\right)\frac{1}{r_{12}}{\phi}_j\left({\tau}_2\right)d{\tau}_2\right){\phi}_i\left({\tau}_1\right)d{\tau}_1\\ {} -\int {\phi}_i^{*}\left({\tau}_1\right)\left(\int {\phi}_j^{*}\left({\tau}_2\right)\frac{1}{r_{12}}{\phi}_i\left({\tau}_2\right)d{\tau}_2\right){\phi}_j\left({\tau}_1\right)d{\tau}_1\\ {} =\int {\phi}_i^{*}\left({\tau}_1\right)\left({\widehat{J}}_j(1){\phi}_i\left({\tau}_1\right)\right)d{\tau}_1-\int {\phi}_i^{*}\left({\tau}_1\right)\left({\widehat{K}}_j(1){\phi}_i\left({\tau}_1\right)\right)d{\tau}_1\\ {} =\left\langle {\phi}_i(1)\left|{\widehat{J}}_j(1)\right|{\phi}_i(1)\right\rangle -\left\langle {\phi}_i(1)\left|{\widehat{K}}_j(1)\right|{\phi}_i(1)\right\rangle \\ {} =\left\langle {\phi}_i(1)\left|{\widehat{J}}_j(1)-{\widehat{K}}_j(1)\right|{\phi}_i(1)\right\rangle, \end{array} $$
(116)

where two operators were defined:

$$ {\widehat{J}}_j(1){\phi}_i(1)=\left({\displaystyle \int {\phi}_j^{*}\left({\tau}_2\right)\frac{1}{r_{12}}{\phi}_j\left({\tau}_2\right)d{\tau}_2}\right){\phi}_i(1), $$
(117)
$$ {\widehat{K}}_j(1){\phi}_i(1)=\left({\displaystyle \int {\phi}_j^{*}\left({\tau}_2\right)\frac{1}{r_{12}}{\phi}_i\left({\tau}_2\right)d{\tau}_2}\right){\phi}_j(1). $$
(118)

Despite the slightly different way in which they are defined, these are still ordinary operators: an operator transforms a function into another function. The operators \( {\widehat{J}}_j(1) \) and \( {\widehat{K}}_j(1) \) act on the function \( \phi_i(1) \), producing another function, as was written in Eq. 8. The operator \( {\widehat{J}}_j(1) \) acting on \( \phi_i(1) \) returns the same function multiplied by a potential factor:

$$ {\phi}_i(1)\overset{{\widehat{J}}_j(1)}{\to}\left({\displaystyle \int {\phi}_j^{*}\left({\tau}_2\right)}\frac{1}{r_{12}}{\phi}_j\left({\tau}_2\right)d{\tau}_2\right){\phi}_i(1), $$
(119)

and the operator \( {\widehat{K}}_j(1) \) produces ϕ j (1):

$$ {\phi}_i(1)\overset{{\widehat{K}}_j(1)}{\to}\left({\displaystyle \int {\phi}_j^{*}\left({\tau}_2\right)\frac{1}{r_{12}}{\phi}_i\left({\tau}_2\right)d{\tau}_2}\right){\phi}_j(1). $$
(120)

The expressions on the right-hand side of the arrows are functions of the variable \( \tau_1 \) (the integration eliminates the dependence on \( \tau_2 \); however, its result is not a number but a function of \( \tau_1 \)). The interpretation of the \( {\widehat{J}}_j(1) \) and \( {\widehat{K}}_j(1) \) operators by ascribing them to observables is not straightforward. These operators appear in the equations when we try to write the interaction between two electrons as the average value of a one-electron operator calculated with a one-electron function. In fact, they appear as the difference:

$$ {\widehat{\upsilon}}_i^{\mathrm{HF}}(1)={\widehat{J}}_j(1)-{\widehat{K}}_j(1). $$
(121)

And the physical sense should be sought in this difference. Here, \( {\widehat{\upsilon}}_i^{\mathrm{HF}}(1) \) is the operator of the average interaction of the electron labeled 1, described by the spin-orbital \( {\phi}_i(1) \), with the second electron, characterized by \( {\phi}_j(2) \). It should be noticed that the potential \( {\widehat{\upsilon}}_i^{\mathrm{HF}}(1) \) depends on \( {\phi}_j \), since this spin-orbital is necessary to define the \( {\widehat{J}}_j(1) \) and \( {\widehat{K}}_j(1) \) operators.
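The local character of \( \widehat{J} \) and the nonlocal character of \( \widehat{K} \) can be illustrated numerically. The sketch below is a toy one-dimensional model with a softened 1/r<sub>12</sub> kernel and Gaussian-type stand-ins for the spin-orbitals; all functions and grid parameters are illustrative assumptions, not part of the text:

```python
import numpy as np

# Toy 1D illustration of the Coulomb (J) and exchange (K) operators.
# Spin-orbitals phi_i, phi_j are sampled on a grid; 1/r12 is replaced by the
# softened kernel 1/sqrt((x1 - x2)^2 + 1) so all integrals stay finite.
x = np.linspace(-5.0, 5.0, 200)
dx = x[1] - x[0]
phi_i = np.exp(-x**2)                      # stand-ins for two spin-orbitals
phi_j = x * np.exp(-x**2)
phi_i /= np.sqrt(np.sum(phi_i**2) * dx)    # normalize on the grid
phi_j /= np.sqrt(np.sum(phi_j**2) * dx)

kernel = 1.0 / np.sqrt((x[:, None] - x[None, :])**2 + 1.0)

def J_j(f):
    """Coulomb operator: multiply f by the local potential of |phi_j|^2."""
    v = kernel @ (np.abs(phi_j)**2) * dx   # v(x1) = ∫ |phi_j(x2)|^2 k(x1,x2) dx2
    return v * f

def K_j(f):
    """Exchange operator: a nonlocal integral operator built from phi_j."""
    return phi_j * (kernel @ (np.conj(phi_j) * f) * dx)

# J_j phi_i is phi_i times a local factor; K_j phi_i is proportional to phi_j.
J_ij = np.sum(phi_i * J_j(phi_i)) * dx     # <phi_i|J_j|phi_i>
K_ij = np.sum(phi_i * K_j(phi_i)) * dx     # <phi_i|K_j|phi_i>
print(J_ij, K_ij)
```

As expected for a repulsive kernel, both integrals come out positive, with the exchange part smaller than the Coulomb part.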

Similarly,

$$ {\widehat{\upsilon}}_j^{\mathrm{HF}}(2)={\widehat{J}}_i(2)-{\widehat{K}}_i(2), $$
(122)

where the action of the operators \( {\widehat{J}}_i(2) \) and \( {\widehat{K}}_i(2) \) on the function ϕ j (2) is defined by

$$ {\widehat{J}}_i(2){\phi}_j(2)=\left({\displaystyle \int {\phi}_i^{*}\left({\tau}_1\right)\frac{1}{r_{12}}{\phi}_i\left({\tau}_1\right)d{\tau}_1}\right){\phi}_j(2) $$
(123)
$$ {\widehat{K}}_i(2){\phi}_j(2)=\left({\displaystyle \int {\phi}_i^{*}\left({\tau}_1\right)}\frac{1}{r_{12}}{\phi}_j\left({\tau}_1\right)d{\tau}_1\right){\phi}_i(2). $$
(124)

Here, one more important circumstance should be mentioned. The electronic Hamiltonian does not depend on spin; the electronic wave function, however, is spin dependent. During the construction of the HF equations, this dependence is introduced into the operators \( \widehat{K} \), since they depend on two different spin-orbitals.

Let us briefly review. We want a one-electron operator that includes the interaction between the electrons in some averaged way. Such an operator must depend on the function describing the motion of the second, adjacent electron. Therefore, for the two-electron case, two coupled equations must be solved (Eqs. 113 and 114). The word “coupled,” which distinguishes this set of equations from Eqs. 95 and 96, is crucial. Denoting

$$ {\widehat{f}}_i(1)=\widehat{h}(1)+{\widehat{\upsilon}}_i^{\mathrm{HF}}(1), $$
(125)

one can rewrite the above equations as

$$ {\widehat{f}}_k(1){\phi}_k(1)={\upepsilon}_k{\phi}_k(1),\quad \mathrm{for}\ k=i,j. $$
(126)

\( {\widehat{f}}_i(1) \) is called the Fock operator and Eq. 114 gives the Hartree–Fock equations.

It should be noted that the labels of the electrons determine only the names of the integration variables, and the result of the integration does not depend on the name of the variable. Therefore, what really matters is the label of the spin-orbital. This will be even more pronounced in the N-electron case, where the electron labels are used only to show that an operator acts on one or two electron coordinates. The form of the N-electron wave function depends only on the spin-orbitals and not on the electron labels; the latter are just integration variables.

The most popular way of solving Eq. 126 is an iterative procedure. It starts from guessed or chosen spin-orbitals ϕ i (1) and ϕ j (2), which are used to construct the potentials \( {\widehat{\upsilon}}_i^{\mathrm{HF}}(1) \) and \( {\widehat{\upsilon}}_j^{\mathrm{HF}}(2) \). Next, the obtained potentials are substituted into Eq. 125, and the solutions of Eq. 126 give improved forms of the ϕ i (1) and ϕ j (2) spin-orbitals. These, in turn, are treated as the new starting point, and the whole procedure is repeated until the starting and final orbitals of a given iteration no longer differ appreciably. This technique is called the self-consistent field (SCF) method. Often the abbreviations HF (for the Hartree–Fock method) and SCF are used interchangeably. They can also be joined together as SCF–HF, denoting the self-consistent way of solving the Hartree–Fock equations.
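The SCF cycle described above can be sketched schematically. In the toy model below, the “integrals” are small fixed matrices standing in for a real system; only the structure of the loop – build the Fock-like matrix from the current orbitals, diagonalize, rebuild, repeat until nothing changes – mirrors the actual procedure:

```python
import numpy as np

# Schematic SCF loop with model matrices (not a real molecule).
np.random.seed(0)
n, n_occ = 6, 2                          # basis size, number of occupied orbitals
h = np.diag(np.arange(n, dtype=float))   # model one-electron (core) matrix
g = 0.1 * np.random.rand(n, n)
g = g + g.T                              # model symmetric "two-electron" coupling

def fock(density):
    # Mean-field potential proportional to the density (schematic v^HF).
    return h + g * density

density = np.zeros((n, n))               # initial guess
for iteration in range(100):
    eps, C = np.linalg.eigh(fock(density))       # solve f phi = eps phi
    new_density = C[:, :n_occ] @ C[:, :n_occ].T  # occupy the lowest orbitals
    if np.max(np.abs(new_density - density)) < 1e-10:
        break                                     # self-consistency reached
    density = new_density

print("converged after", iteration, "iterations; eps =", eps[:n_occ])
```

The convergence criterion compares the orbitals (here, the density they build) at the start and end of an iteration, exactly as described in the text.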

Let us assume that the spin-orbitals are already known and concentrate on the calculation of the average value of the Hamiltonian with the determinant built of these spin-orbitals, using the operators defined previously. Writing down the two-electron part:

$$ \begin{array}{ll}{\left\langle \frac{1}{r_{12}}\right\rangle}_{\varPsi \left(1,2\right)}&=\left\langle {\phi}_i(1)\left|{\widehat{J}}_j(1)-{\widehat{K}}_j(1)\right|{\phi}_i(1)\right\rangle \\[10pt] {}&=\left\langle {\phi}_j(2)\left|{\widehat{J}}_i(2)-{\widehat{K}}_i(2)\right|{\phi}_j(2)\right\rangle, \end{array} $$

one obtains the integrals that can be denoted by

$$ \begin{array}{ll}{J}_{ij}&=\left\langle {\phi}_i(1)\left|{\widehat{J}}_j(1)\right|{\phi}_i(1)\right\rangle, \\[10pt] {}{K}_{ij}&=\left\langle {\phi}_i(1)\left|{\widehat{K}}_j(1)\right|{\phi}_i(1)\right\rangle .\end{array} $$

With this notation, the expression (Eq. 106) takes the following form:

$$ {\left\langle \widehat{H}\left(1,2\right)\right\rangle}_{\varPsi \left(1,2\right)}=\left\langle {\phi}_i(1)\left|\widehat{f}(1)\right|{\phi}_i(1)\right\rangle +\left\langle {\phi}_j(2)\left|\widehat{f}(2)\right|{\phi}_j(2)\right\rangle -\left({J}_{ij}-{K}_{ij}\right), $$
(127)

or if one wants to apply the spin-orbital energies ε calculated earlier:

$$ {\left\langle \widehat{H}\left(1,2\right)\right\rangle}_{\Psi \left(1,2\right)}={\upepsilon}_i+{\upepsilon}_j-\left({J}_{ij}-{K}_{ij}\right). $$
(128)

Now it is time to generalize these considerations into the N-electron case. The recipe for this transformation was given previously (Eqs. 91, 92, and 93). For the system of N electrons, the set of N-coupled equations of the form

$$ {\widehat{f}}_i(1){\phi}_i(1)={\upepsilon}_i{\phi}_i(1),\quad \mathrm{for}\ i=1,\dots, N $$
(129)

must be solved. Here,

$$ {\widehat{f}}_i(1)=\widehat{h}(1)+{\displaystyle \sum_{j\ne i}^N\left({\widehat{J}}_j(1)-{\widehat{K}}_j(1)\right).} $$
(130)

The summation in the expression for the one-electron Fock operator arises from the fact that now the given electron described by the spin-orbital i interacts with (N − 1) electrons in the states determined by the remaining spin-orbitals.

In searching for the ground state, one is interested in the lowest possible energy. Therefore, the functions of choice are the spin-orbitals corresponding to the lowest values of ε. This set of spin-orbitals, called occupied, stands in contrast to all other solutions of the Fock equations, which correspond to higher energies and are known as virtual (unoccupied) orbitals.

Finally, the average value of the Hamiltonian can be written as

$$ {\left\langle \widehat{H}\left(1,2,\dots, N\right)\right\rangle}_{\Psi^{(0)}(N)}={\displaystyle \sum_{i=1}^N{\upepsilon}_i}-{\displaystyle \sum_{i=1}^N{\displaystyle \sum_{j>i}^N\left({J}_{ij}-{K}_{ij}\right)}}={E}_0^{\mathrm{HF}}. $$

It should be emphasized that the energy estimated in this manner is not a simple sum of the orbital energies. If the term \( {\displaystyle {\sum}_{i=1}^N{\displaystyle {\sum}_{j>i}^N\left({J}_{ij}-{K}_{ij}\right)}} \) were neglected, the electron–electron interaction would be counted twice, since each pair interaction enters both orbital energies of the pair (Atkins and Friedman 2005; Cramer 2004; Jensen 2006; Levine 2008; Lowe and Peterson 2005; McQuarrie and Simon 1997; Piela 2007; Ratner and Schatz 2000; Roos and Widmark 2002; Szabo and Ostlund 1996).
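This bookkeeping can be verified with a few arbitrary numbers. In the sketch below, the one-electron energies and the J and K values are invented stand-ins; the point is that subtracting the pair sum from Σε i reproduces the directly evaluated energy, in which each pair interaction is counted exactly once:

```python
import numpy as np

# Check that E_HF = sum(eps) - sum_{j>i}(J_ij - K_ij) removes the
# double-counted electron-electron interaction (arbitrary model numbers).
N = 3
h_ii = np.array([-2.0, -1.5, -1.0])          # one-electron energies <i|h|i>
J = np.array([[0.0, 0.6, 0.5],
              [0.6, 0.0, 0.4],
              [0.5, 0.4, 0.0]])               # J_ij (symmetric, i != j)
K = 0.3 * J                                   # K_ij, smaller than J_ij

# Orbital energy: eps_i = h_ii + sum_{j != i} (J_ij - K_ij)
eps = h_ii + np.sum(J - K, axis=1)

# Each pair enters eps_i and eps_j once -> counted twice in sum(eps).
pair_sum = sum(J[i, j] - K[i, j] for i in range(N) for j in range(i + 1, N))
E_HF = np.sum(eps) - pair_sum

# Direct evaluation: one-electron part plus each pair counted once.
E_direct = np.sum(h_ii) + pair_sum
print(E_HF, E_direct)                         # the two expressions agree
```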

Møller–Plesset Perturbation Theory: HF Is Just the Beginning

From here onward, we will treat the Hartree–Fock function as the basis for further investigations and denote it as \( {\Psi}_0^{(0)} \), where the subscript 0 indicates the ground state and the superscript (0) indicates the reference (zeroth-order) function. We will also omit explicit writing of the dependence of the Hamiltonian and the wave function on the coordinates of the N electrons. As a consequence,

$$ \left\langle {\Psi}_0^{(0)}\left|\widehat{H}\right|{\Psi}_0^{(0)}\right\rangle ={E}_0^{\mathrm{HF}}. $$
(131)

One of the possible ways of improving the HF results is the application of the perturbation theory (Atkins and Friedman 2005; Cramer 2004; Jensen 2006; Levine 2008; McQuarrie and Simon 1997; Piela 2007; Ratner and Schatz 2000; Szabo and Ostlund 1996). The Hamiltonian (Eq. 89) can be partitioned into the unperturbed part and the perturbation in the following manner:

$$ {\widehat{H}}^0={\displaystyle \sum_{i=1}^N\widehat{f}(i)},\qquad {\widehat{H}}^1={\displaystyle \sum_{i=1}^N\left({\displaystyle \sum_{j>i}^N\frac{1}{r_{ij}}}-{\widehat{\upsilon}}^{\mathrm{HF}}(i)\right)} $$
(132)

(compare Eq. 125). Up to the first order in the perturbation theory, the energy is

$$ {E}_0^{\mathrm{HF}}={E}_0^{(0)}+{E}_0^{(1)}=\left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^0+{\widehat{H}}^1\right|{\Psi}_0^{(0)}\right\rangle ={\left\langle \widehat{H}\right\rangle}_{\Psi_0^{(0)}}. $$
(133)

The correction to the HF energy appears in the second order:

$$ {E}_0^{(2)}={\displaystyle \sum_{k\ne 0}^{\infty}\frac{\left|\left\langle {\Psi}_0^{(0)}\left|{\widehat{H}}^1\right|{\Psi}_k^{(0)}\right\rangle \right|{}^2}{E_0^{(0)}-{E}_k^{(0)}}.} $$
(134)

During calculations, an expansion of the spin-orbitals in a finite basis set is applied. This gives access to only a finite number of spin-orbitals, but the number of all possible \( {\Psi}_k^{(0)} \) functions can still be horrifyingly large. In practice, this is equivalent to a finite but long expansion in the above summation. In quantum chemistry we are interested in the best quality results at moderate expense and are continuously searching for more economical methods.

Careful examination of the second-order energy correction \( {E}_0^{(2)} \) shows that a significant number of its terms do not contribute to the final result. Some work invested in the manipulation of the expressions allows one not only to learn the basic computational apparatus but also to save a lot of effort.

In order to proceed comfortably, the notation should again be simplified. Let us drop the superscript (0) denoting the zeroth-order functions (other functions will not appear in our considerations). Additionally, we omit the subscript k from the determinants containing the virtual spin-orbitals. In return, we explicitly specify the pattern of spin-orbital exchange. Using Eq. 108, the Slater determinant can be written as

$$ \left|{\Psi}_0\right\rangle =\left|{\phi}_1(1){\phi}_2(2){\phi}_3(3){\phi}_4(4)\dots {\phi}_N(N)\right\rangle . $$
(135)

The numbers of the occupied spin-orbitals vary from 1 to N. Thus, the virtual spin-orbitals will be labeled starting from N + 1 onward. Now consider the example of the determinant in which the occupied spin-orbital ϕ 3 is exchanged for the virtual one, \( {\phi}_{N+8} \). The new determinant can be written as

$$ \left|{\Psi}_3^{N+8}\right\rangle =\left|{\phi}_1(1){\phi}_2(2){\phi}_{N+8}(3){\phi}_4(4)\dots {\phi}_N(N)\right\rangle, $$
(136)

where the subscript in \( {\Psi}_3^{N+8} \) denotes the occupied orbital that is exchanged and the superscript denotes the virtual one that takes its place. To generalize, one can denote the occupied orbitals building the Ψ0 function by the first letters of the alphabet, a, b, c, d, …, and the virtual ones by p, q, r, s, …. Therefore, \( {\Psi}_3^{N+8} \) can be written as \( {\Psi}_a^p \), where a = 3 and p = N + 8. In this way, any determinant can be represented.
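This labeling convention is easy to mechanize. In the sketch below, a determinant is represented simply by the tuple of its occupied spin-orbital labels; the sizes N and M are arbitrary choices for illustration:

```python
from itertools import product

# Determinants as tuples of occupied spin-orbital labels.
# Psi_0 occupies 1..N; Psi_a^p swaps occupied a for virtual p.
N, M = 4, 8                              # occupied 1..4, virtuals 5..8
psi0 = tuple(range(1, N + 1))

def excite(det, swaps):
    """Replace occupied labels by virtual ones, e.g. {3: 7} gives Psi_3^7."""
    return tuple(sorted(swaps.get(orb, orb) for orb in det))

print(excite(psi0, {3: 7}))              # Psi_3^7   -> (1, 2, 4, 7)
print(excite(psi0, {1: 5, 2: 6}))        # Psi_12^56 -> (3, 4, 5, 6)

# All singly excited determinants Psi_a^p:
singles = [excite(psi0, {a: p})
           for a, p in product(psi0, range(N + 1, M + 1))]
print(len(singles))                      # N occupied x (M - N) virtuals = 16
```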

Thereafter, the influence of the choice of functions on the values of the integrals appearing in the electronic energy calculations can be investigated. Similarly to the Hamiltonian, the integrals can be divided into two groups: those with the one-electron operator

$$ {\widehat{o}}_1={\displaystyle \sum_{i=1}^N\widehat{o}(i)}, $$
(137)

where in place of ô(i) the operators ĥ(i) or \( \widehat{f}(i) \) will be used, and those with the two-electron operator,

$$ {\widehat{O}}_2={\displaystyle \sum_i{\displaystyle \sum_{j>i}\left(\frac{1}{r_{ij}}\right).}} $$
(138)

To get a full picture, we should analyze the following types of integrals:

$$ \left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_a^p\right\rangle \quad\mathrm{and}\quad \left\langle {\Psi}_0\left|{\widehat{O}}_2\right|{\Psi}_a^p\right\rangle, $$
(139)
$$ \left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_{ab}^{pq}\right\rangle \quad\mathrm{and}\quad \left\langle {\Psi}_0\left|{\widehat{O}}_2\right|{\Psi}_{ab}^{pq}\right\rangle, $$
(140)
$$ \left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_{abc}^{pqr}\right\rangle \quad\mathrm{and}\quad \left\langle {\Psi}_0\left|{\widehat{O}}_2\right|{\Psi}_{abc}^{pqr}\right\rangle . $$
(141)

Any other integrals will be equal to zero (Levine 2008; Szabo and Ostlund 1996).

Recall that the determinant of an N × N matrix can be represented as a sum of N! products of the matrix elements. According to Laplace’s formula, a determinant can be expanded along a row or a column. Thus, the calculation of the \( \left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_a^p\right\rangle \) integrals, where the occupied orbital a in Ψ0 has been exchanged for the virtual orbital p in \( {\Psi}_a^p \), can be performed by expanding the determinant Ψ0 along the a-th row and the determinant \( {\Psi}_a^p \) along the p-th row:

$$ {\Psi}_0=\frac{1}{\sqrt{N!}}{\displaystyle \sum_{i=1}^N{\phi}_a(i){C}_{ai}}, $$
(142)
$$ {\Psi}_a^p=\frac{1}{\sqrt{N!}}{\displaystyle \sum_{i=1}^N{\phi}_p}(i){C}_{pi}. $$
(143)

The cofactors C can be perceived as (N − 1)-electron determinants obtained from Ψ0 and \( {\Psi}_a^p \) via the elimination of the a-th and p-th spin-orbitals, respectively. Thus, after elimination of what is different in the two determinants, both cofactors become equal to each other. Hence,

$$ \begin{array}{rl}\left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_a^p\right\rangle &=\frac{1}{N!}\left\langle {\displaystyle \sum_{i=1}^N{\phi}_a(i){C}_{ai}\left|{\displaystyle \sum_{j=1}^N\widehat{o}(j)}\right|{\displaystyle \sum_{k=1}^N{\phi}_p(k){C}_{pk}}}\right\rangle \\[4pt] {}&=\frac{1}{N!}{\displaystyle \sum_{i=1}^N\left\langle {\phi}_a(i)\left|\widehat{o}(i)\right|{\phi}_p(i)\right\rangle \left\langle \left.{C}_{ai}\right|{C}_{pi}\right\rangle}\\[4pt] {}&=\frac{1}{N}{\displaystyle \sum_{i=1}^N\left\langle {\phi}_a(i)\left|\widehat{o}(i)\right|{\phi}_p(i)\right\rangle}\\[4pt] {}&=\left\langle {\phi}_a(1)\left|\widehat{o}(1)\right|{\phi}_p(1)\right\rangle, \end{array} $$
(144)

where

$$ \left\langle {C}_{ai}\Big|{C}_{pi}\right\rangle =\left(N-1\right)!, $$
(145)
$$ {\displaystyle \sum_{i=1}^N\left\langle {\phi}_a(i)\left|\widehat{o}(i)\right|{\phi}_p(i)\right\rangle }=N\left\langle {\phi}_a(1)\left|\widehat{o}(1)\right|{\phi}_p(1)\right\rangle $$
(146)

was applied. If the cofactors are not the same – which is the case when the determinants differ by two or more spin-orbitals – the corresponding overlap integral is equal to zero according to the orthogonality condition. Thus,

$$ \begin{array}{rl}\left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_a^p\right\rangle &=\left\langle {\phi}_a(1)\left|\widehat{o}(1)\right|{\phi}_p(1)\right\rangle, \\[6pt] {}\left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_{ab}^{pq}\right\rangle &= 0,\\[6pt] {}\left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_{abc}^{pqr}\right\rangle &= 0.\end{array} $$
(147)
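The cofactor expansion used in Eqs. 142 and 143 is the ordinary Laplace expansion, which can be checked numerically; the matrix A below is an arbitrary example:

```python
import numpy as np

# Laplace (cofactor) expansion of a determinant along a chosen row a:
# det(A) = sum_i A[a, i] * C_ai, with C_ai the signed minor (cofactor).
def det_expand(A, a=0):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for i in range(n):
        # Minor: delete row a and column i, then attach the sign (-1)^(a+i).
        minor = np.delete(np.delete(A, a, axis=0), i, axis=1)
        total += A[a, i] * (-1) ** (a + i) * det_expand(minor)
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det_expand(A), np.linalg.det(A))   # both give the same value
```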

Similar (but a little more time-consuming) considerations for the two-electron operators lead to the following expressions:

$$ \begin{array}{rl}\left\langle {\Psi}_0\left|{\widehat{O}}_2\right|{\Psi}_a^p\right\rangle &={\displaystyle \sum_{i=1}^N\left(\left\langle {\phi}_a(1){\phi}_i(2)\left|\frac{1}{r_{12}}\right|{\phi}_p(1){\phi}_i(2)\right\rangle \right.}\\[10pt] {}&\quad -\left.\left\langle {\phi}_a(1){\phi}_i(2)\left|\frac{1}{r_{12}}\right|{\phi}_i(1){\phi}_p(2)\right\rangle \right)\\[6pt] {}\left\langle {\Psi}_0\left|{\widehat{O}}_2\right|{\Psi}_{ab}^{pq}\right\rangle &=\left\langle {\phi}_a(1){\phi}_b(2)\left|\frac{1}{r_{12}}\right|{\phi}_p(1){\phi}_q(2)\right\rangle \\[6pt] {}&\quad-\left\langle {\phi}_a(1){\phi}_b(2)\left|\frac{1}{r_{12}}\right|{\phi}_q(1){\phi}_p(2)\right\rangle \\[5pt] {}\left\langle {\Psi}_0\left|{\widehat{O}}_2\right|{\Psi}_{abc}^{pqr}\right\rangle &=0.\end{array} $$
(148)

On first contact with these equations, one can have the feeling that something has been lost. We start from integrals with N-electron functions and finish with integrals over only the electrons labeled 1 and 2. What happened to the rest? Again, it should be emphasized that the electron labels symbolize only the integration variables. What matters is the functions of these variables, namely, the spin-orbitals. Thus, in an integral with a one-electron operator, the electron label 1 means that the integration is performed only over the variables of one electron. Likewise, a two-electron operator integral depends on the variables of two electrons, which is symbolized by the two labels 1 and 2.

In the integration of one-electron expressions, the electron label can be omitted without any harm: \( \left\langle {\phi}_a\left|{\widehat{o}}_1\right|{\phi}_a\right\rangle \). Likewise, in the two-electron case, we can declare that the spin-orbitals are always written in a given order. So,

$$ \left\langle {\phi}_x(1){\phi}_y(2)\left|\frac{1}{r_{12}}\right|{\phi}_v(1){\phi}_z(2)\right\rangle =\left\langle {\phi}_x{\phi}_y\left|\frac{1}{r_{12}}\right|{\phi}_v{\phi}_z\right\rangle . $$
(149)

Moreover, now there is no reason to explicitly write the ϕ symbol. Thereby, the next simplification of the notation is obvious:

$$ \begin{array}{rl}\left\langle {\phi}_a\left|{\widehat{o}}_1\right|{\phi}_a\right\rangle &=\left\langle a\left|{\widehat{o}}_1\right|a\right\rangle, \\[6pt] {}\left\langle {\phi}_x{\phi}_y\left|\frac{1}{r_{12}}\right|{\phi}_v{\phi}_z\right\rangle &=\left\langle xy\left|\frac{1}{r_{12}}\right|vz\right\rangle, \end{array} $$
(150)

and finally, since the combination of integrals \( \left\langle xy\Big|1/{r}_{12}\Big|vz\right\rangle -\left\langle xy\Big|1/{r}_{12}\Big|zv\right\rangle \) appears frequently, the following symbol is introduced:

$$ \left\langle xy\left|\right|vz\right\rangle =\left\langle xy\left|\frac{1}{r_{12}}\right|vz\right\rangle -\left\langle xy\left|\frac{1}{r_{12}}\right|zv\right\rangle . $$
(151)

Now Eqs. 147 and 148 can be rewritten as

$$ \begin{array}{rl}\left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_a^p\right\rangle &=\left\langle a\left|\widehat{o}\right|p\right\rangle, \\[6pt] {}\left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_{ab}^{pq}\right\rangle &=0,\\[6pt] {}\left\langle {\Psi}_0\left|{\widehat{o}}_1\right|{\Psi}_{abc}^{pqr}\right\rangle &=0,\\[6pt] {}\left\langle {\Psi}_0\left|{\widehat{O}}_2\right|{\Psi}_a^p\right\rangle &={\displaystyle \sum_{i=1}^N\left\langle ai\left|\right| pi\right\rangle },\\[6pt] {}\left\langle {\Psi}_0\left|{\widehat{O}}_2\right|{\Psi}_{ab}^{pq}\right\rangle &=\left\langle ab\left|\right|pq\right\rangle, \\[6pt] {}\left\langle {\Psi}_0\left|{\widehat{O}}_2\right|{\Psi}_{abc}^{pqr}\right\rangle &=0.\end{array} $$
(152)

These simple equations, known as the Slater rules, allow for the following general remark: if a one-electron operator is integrated with functions that differ by more than one spin-orbital, the corresponding integral vanishes. Similarly, the result is zero for the integration of a two-electron operator with functions differing by more than two spin-orbitals. Hitherto, only the integrals with Ψ0 were considered. However, it is easy to notice that the functions \( {\Psi}_{ab}^{pq} \) and \( {\Psi}_{abc}^{pqr} \) differ by only one exchange (spin-orbital \( {\phi}_c\to {\phi}_r \)), etc. Therefore, the above considerations can also be applied in any other case.
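As a bookkeeping aid, the Slater rules reduce to counting differing spin-orbitals. A minimal sketch, with determinants again represented as sets of occupied labels (all labels chosen arbitrarily):

```python
# Slater rules as a zero-check: the number of spin-orbitals by which two
# determinants differ decides whether an integral can be nonzero.
def n_differences(det1, det2):
    return len(set(det1) - set(det2))

def can_be_nonzero(det1, det2, n_body):
    """A one-body (n_body=1) or two-body (n_body=2) integral survives only
    if the determinants differ by at most n_body spin-orbitals."""
    return n_differences(det1, det2) <= n_body

psi0   = {1, 2, 3, 4}
single = {1, 2, 3, 7}        # Psi_4^7
double = {1, 2, 7, 8}        # Psi_34^78
triple = {1, 6, 7, 8}        # Psi_234^678

print(can_be_nonzero(psi0, single, 1))   # True
print(can_be_nonzero(psi0, double, 1))   # False: <Psi0|o1|Psi_ab^pq> = 0
print(can_be_nonzero(psi0, double, 2))   # True
print(can_be_nonzero(psi0, triple, 2))   # False
```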

In this abundance of equations, our main goal cannot be lost: all these derivations were necessary to limit the types of the Ψ k functions present in the MP2 energy expression. Now, with the above Slater rules recognized, one can safely neglect the integrals with pairs of Ψ k functions differing by more than two spin-orbital exchanges. But this is not everything.

Let us consider the integral

$$ \left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_a^p\right\rangle =\left\langle a\left|\widehat{h}\right|p\right\rangle +{\displaystyle \sum_j\left(\left\langle aj\left|\frac{1}{r_{12}}\right|pj\right\rangle -\left\langle aj\left|\frac{1}{r_{12}}\right|jp\right\rangle \right)}. $$
(153)

Since

$$ \left\langle aj\left|\frac{1}{r_{12}}\right|pj\right\rangle =\left\langle a\left|{\widehat{J}}_j\right|p\right\rangle, $$
(154)

and

$$ \left\langle aj\left|\frac{1}{r_{12}}\right|jp\right\rangle =\left\langle a\left|{\widehat{K}}_j\right|p\right\rangle, $$
(155)

one gets

$$ \left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_a^p\right\rangle =\left\langle a\left|{\widehat{H}}^0\right|p\right\rangle, $$
(156)

which has to vanish, because

$$ \left\langle a\left|{\widehat{H}}^0\right|p\right\rangle =\left\langle a\left|{\displaystyle \sum_{i=1}^N\widehat{f}(i)}\right|p\right\rangle =\left\langle a\left|\widehat{f}\right|p\right\rangle ={\upepsilon}_p{\delta}_{ap}=0, $$
(157)

for it was assumed that \( a\ne p \).

The obtained result,

$$ \left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_a^p\right\rangle =0, $$
(158)

is known as the Brillouin theorem. Applying it also allows one to show that

$$ \left\langle {\Psi}_0\left|{\widehat{H}}^1\right|{\Psi}_a^p\right\rangle =0. $$
(159)

Indeed, using

$$ {\widehat{H}}^1=\widehat{H}-{\widehat{H}}^0, $$
(160)

one obtains

$$ \begin{array}{rl}\left\langle {\Psi}_0\left|{\widehat{H}}^1\right|{\Psi}_a^p\right\rangle &=\left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_a^p\right\rangle -\left\langle {\Psi}_0\left|{\widehat{H}}^0\right|{\Psi}_a^p\right\rangle \\[6pt] {}&=0-\left({\displaystyle \sum_{i=1}^N{\upepsilon}_i}\right)\left\langle {\Psi}_0\Big|{\Psi}_a^p\right\rangle \\[6pt] {}&=0.\end{array} $$
(161)

Now it is time for an important conclusion: in order to calculate the energy correction in MP2, only the functions arising from Ψ0 by the exchange of precisely two spin-orbitals need to be applied, no more, no less. Life becomes easier when zero contributions are not calculated in a complicated way.

Let us exploit the above knowledge to transform Eq. 134. The integral on the right-hand side will be calculated with the functions \( {\Psi}_{ab}^{pq} \). Assuming b > a and q > p avoids counting the same function twice. The upper limit for the summations over a and b will be equal to N. For the remaining spin-orbitals, it should be ∞; however, in practice a finite basis is applied, and the upper limit is determined by the basis set size. Let us look into the numerator of Eq. 134 carefully. Denoting the total Fock operator as \( \widehat{F}={\displaystyle {\sum}_i\widehat{f}}(i) \), one gets

$$ \begin{array}{rl}\left\langle {\Psi}_0\left|{\widehat{H}}^1\right|{\Psi}_{ab}^{pq}\right\rangle &=\left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_{ab}^{pq}\right\rangle -\left\langle {\Psi}_0\left|\widehat{F}\right|{\Psi}_{ab}^{pq}\right\rangle \\[5pt] {}&=\left\langle ab\left|\frac{1}{r_{12}}\right|pq\right\rangle -\left\langle ab\left|\frac{1}{r_{12}}\right|qp\right\rangle =\left\langle ab\left|\right|pq\right\rangle, \end{array} $$
(162)

since

$$ \left\langle {\Psi}_0\left|\widehat{F}\right|{\Psi}_{ab}^{pq}\right\rangle =0 $$
(163)

(integration of the one-electron operator with the functions differing by two spin-orbitals; see Eq. 152).

Now consider the denominator of Eq. 134. The energy of the ground state is simply a sum of N lowest spin-orbital energies:

$$ {E}_0^{(0)}={\displaystyle \sum_{i=1}^N{\upepsilon}_i}. $$
(164)

The zeroth-order energy for the function \( {\Psi}_{ab}^{pq} \) is a similar sum, with \( {\upepsilon}_a \) and \( {\upepsilon}_b \) replaced by \( {\upepsilon}_p \) and \( {\upepsilon}_q \):

$$ {E}_{\left({}_{ab}^{pq}\right)}^{(0)}={E}_0^{(0)}-{\upepsilon}_a-{\upepsilon}_b+{\upepsilon}_p+{\upepsilon}_q. $$
(165)

Thus, the final form of Eq. 134 is

$$ {E}_0^{(2)}={\displaystyle \sum_{b>a}{\displaystyle \sum_{q>p}\frac{\left|\left\langle ab\left|\right|pq\right\rangle \right|{}^2}{{\upepsilon}_a+{\upepsilon}_b-{\upepsilon}_p-{\upepsilon}_q}.}} $$
(166)

Why was it worth the hard work? Not only for satisfaction. These derivations are necessary to understand the mechanisms employed in the computational methods of quantum chemistry. With the Slater rules, it is straightforward to recognize the vanishing integrals. The double exchanges appear to be the most important, since these are the only exchanges that give rise to the energy correction in the MP2 method.
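The structure of Eq. 166 is just a quadruple loop over occupied pairs b > a and virtual pairs q > p. The sketch below uses random placeholder values for the antisymmetrized integrals ⟨ab||pq⟩ and invented orbital energies; it only illustrates the summation pattern and the sign of the result:

```python
import numpy as np

# Schematic MP2 correction (Eq. 166):
# E2 = sum_{b>a} sum_{q>p} |<ab||pq>|^2 / (eps_a + eps_b - eps_p - eps_q).
np.random.seed(1)
n_occ, n_virt = 3, 4
eps_occ = np.array([-1.5, -1.0, -0.8])        # occupied orbital energies
eps_virt = np.array([0.3, 0.5, 0.9, 1.2])     # virtual orbital energies
g = 0.1 * np.random.rand(n_occ, n_occ, n_virt, n_virt)  # stand-in <ab||pq>

E2 = 0.0
for a in range(n_occ):
    for b in range(a + 1, n_occ):             # b > a
        for p in range(n_virt):
            for q in range(p + 1, n_virt):    # q > p
                denom = eps_occ[a] + eps_occ[b] - eps_virt[p] - eps_virt[q]
                E2 += abs(g[a, b, p, q]) ** 2 / denom

print(E2)        # negative: MP2 lowers the Hartree-Fock energy
```

Since every denominator (occupied minus virtual energies) is negative while every numerator is a square, the correction is necessarily negative.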

According to the perturbation calculus, the wave function corrected in the first order can be written as

$$ \Psi \approx {\Psi}_0+{\displaystyle \sum_{k\ne 0}{c}_k{\Psi}_k}, $$
(167)

where c k is given by

$$ {c}_k=\frac{\left\langle {\Psi}_0\left|{\widehat{H}}^1\right|{\Psi}_k\right\rangle }{E_0-{E}_k}. $$
(168)

From the Slater rules, it is easy to see that the only nonvanishing terms will arise from the double spin-orbital exchanges in the wave function. Intuitively, the low-order corrections should have the largest impact, and the above considerations lead to the most important accomplishment of this section: the largest contribution to the corrections of the Hartree–Fock function arises from the functions with doubly exchanged spin-orbitals.

Beyond the HF Wave Function

Having done all this hard work, one can now sit comfortably in an armchair and think. The main goal of quantum chemistry is to find the best possible description of the state of the system (the best possible wave function). The exact solutions of the Hamiltonian eigenproblem are unavailable, and all we have are approximations. We have become used to approximations in everyday life. The important thing is to realize that the Hartree–Fock solutions can be improved. At the beginning of this chapter, various properties of operators were discussed. Among others, it was stated that the eigenfunctions of a Hermitian operator constitute a complete set, and any other function of the same variables can be represented with them. The one-electron spin-orbitals that are eigenfunctions of the Fock operator are accessible. They form a complete set, but only for one-electron functions. However, they can be applied to build up N-electron determinants. The set of all possible determinants is also complete and, therefore, can be applied to express any N-electron function (Cramer 2004; Jensen 2006; Levine 2008; Lowe and Peterson 2005; Piela 2007; Ratner and Schatz 2000; Roos and Widmark 2002; Szabo and Ostlund 1996):

$$ \Psi ={c}_0{\Psi}_0+{\displaystyle \sum_{a,p}{c}_a^p{\Psi}_a^p+}{\displaystyle \sum_{a,b,p,q}{c}_{ab}^{pq}{\Psi}_{ab}^{pq}+}{\displaystyle \sum_{a,b,c,p,q,r}{c}_{abc}^{pqr}{\Psi}_{abc}^{pqr}+}\dots . $$
(169)

For this purpose, only the coefficients \( {c}_0 \), \( {c}_a^p \), \( {c}_{ab}^{pq} \), \( {c}_{abc}^{pqr} \), … need to be found. In the ideal case, all the summations would be infinite, and the problem must be reduced. Still, instead of using the infinite expansions, a finite and relatively small number of terms can be sufficient. Moreover, solving the Hartree–Fock equations in a finite basis, one possesses only a finite number of orbitals that can be exchanged.

Anyway, it is instructive to see how large the number of terms in Eq. 169 can be. Consider the methane molecule CH4. Calculations with the minimal basis set (each orbital described by a single one-electron function; for carbon, single functions for each of the orbitals 1s, 2s, 2p x , 2p y , 2p z , and for hydrogen a single function for the 1s orbital) require 9 orbitals/18 spin-orbitals for this 10-electron system. From combinatorics, the number of combinations (K) of k elements from an n-element set can be calculated as

$$ K=\left(\begin{array}{c}\hfill n\hfill \\ {}\hfill k\hfill \end{array}\right)=\frac{n!}{\left(n-k\right)!k!}. $$
(170)

In the case of methane, 10 electrons can be placed in 18 spin-orbitals in \( \left(\begin{array}{c}\hfill 18\hfill \\ {}\hfill 10\hfill \end{array}\right)=43,758 \) ways. This is equivalent to 43,758 terms in the expansion (Eq. 169). Impressive. And one needs to remember that the minimal basis set gives relatively poor Hartree–Fock solutions and is not recommended in ab initio calculations. However, increasing the basis set size causes the number of expansion terms to grow dramatically. For instance, in the case of the so-called double-ζ basis set (two functions per orbital) for methane, one has 36 spin-orbitals, which gives 254,186,856 combinations! And double-ζ is still not much….
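Both counts quoted above follow directly from Eq. 170:

```python
from math import comb

# Number of ways to distribute k electrons among n spin-orbitals (Eq. 170),
# evaluated for the two methane basis sets discussed in the text.
minimal = comb(18, 10)       # minimal basis: 18 spin-orbitals, 10 electrons
double_zeta = comb(36, 10)   # double-zeta basis: 36 spin-orbitals
print(minimal, double_zeta)  # 43758 and 254186856
```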

Therefore, it is necessary to find some way to reduce the size of the problem. The symmetry of the molecule can be applied here, and the fact that the chosen determinants (or their linear combinations) must be eigenfunctions of the spin operators can be beneficial. Moreover, one would like to eliminate from the expansion the determinants that are not crucial for the quality of the wave function, whose neglect does not cause deterioration of the description of the system (or causes only a slight one). In other words, only the determinants that have a significant contribution to the total energy should be chosen for the wave function construction. Let us begin with a classification of the determinants, taking into account the number of spin-orbitals exchanged with respect to Ψ0. For this purpose, the average value of the Hamiltonian calculated with Ψ will be useful. We can write the wave function expansion as

$$ \Psi ={c}_0{\Psi}_0+\mathbf{S}{\mathbf{c}}_{\mathbf{S}}+\mathbf{D}{\mathbf{c}}_{\mathbf{D}}+\mathbf{T}{\mathbf{c}}_{\mathbf{T}}+\mathbf{Q}{\mathbf{c}}_{\mathbf{Q}}+\dots . $$
(171)

The symbols’ meaning can be deciphered by comparison with Eq. 169. S denotes a vector built of the determinants constructed from Ψ0 by single exchanges, and c S is a vector of the coefficients corresponding to the functions in S:

$$ \mathbf{S}{\mathbf{c}}_{\mathbf{S}}={\displaystyle \sum_a{\displaystyle \sum_q{c}_a^q{\Psi}_a^q}}. $$
(172)

In other words, S contains all the functions with a single exchanged spin-orbital. Similarly, D is the combination of the functions with double exchanges, T with triple exchanges, and so forth. With such a notation, the function Ψ can be treated as the scalar product of the vector of basis functions Ψ0, S, D, …, and the vector of coefficients:

$$ \left|\Psi \right\rangle =\left[\left|{\Psi}_0\right\rangle, \left|\mathbf{S}\right\rangle, \left|\mathbf{D}\right\rangle, \left|\mathbf{T}\right\rangle, \left|\mathbf{Q}\right\rangle, \dots \right]\left[\begin{array}{c}\hfill {c}_0\hfill \\ {}\hfill {\mathbf{c}}_{\mathbf{S}}\hfill \\ {}\hfill {\mathbf{c}}_{\mathbf{D}}\hfill \\ {}\hfill {\mathbf{c}}_{\mathbf{T}}\hfill \\ {}\hfill {\mathbf{c}}_{\mathbf{Q}}\hfill \\ {}\hfill \vdots \hfill \end{array}\right]. $$
(173)

Using this notation, the Hamiltonian Ĥ of the system can be linked in an elegant way to a matrix H. Let us apply this form of the wave function for the calculation of the Hamiltonian average value. To simplify the expressions, let us limit ourselves to the truncated expansion:

$$ {\Psi}_{\mathrm{SD}}=\left[{\Psi}_0, \mathbf{S}, \mathbf{D}\right]\left[\begin{array}{c}\hfill {c}_0\hfill \\ {}\hfill {\mathbf{c}}_{\mathbf{S}}\hfill \\ {}\hfill {\mathbf{c}}_{\mathbf{D}}\hfill \end{array}\right]={c}_0{\Psi}_0+\mathbf{S}{\mathbf{c}}_{\mathbf{S}}+\mathbf{D}{\mathbf{c}}_{\mathbf{D}}. $$
(174)

The average value of the Hamiltonian can now be written as

$$ {\left\langle H\right\rangle}_{\Psi_{\mathrm{SD}}}=\left\langle {\Psi}_{\mathrm{SD}}\left|\widehat{H}\right|{\Psi}_{\mathrm{SD}}\right\rangle =\left[{c}_0^{*}, {\mathbf{c}}_{\mathbf{S}}^{\dagger }, {\mathbf{c}}_{\mathbf{D}}^{\dagger}\right]\left[\begin{array}{c}\hfill \left\langle {\Psi}_0\right|\hfill \\ {}\hfill \left\langle \mathbf{S}\right|\hfill \\ {}\hfill \left\langle \mathbf{D}\right|\hfill \end{array}\right]\widehat{H}\left[\left|{\Psi}_0\right\rangle, \left|\mathbf{S}\right\rangle, \left|\mathbf{D}\right\rangle \right]\left[\begin{array}{c}\hfill {c}_0\hfill \\ {}\hfill {\mathbf{c}}_{\mathbf{S}}\hfill \\ {}\hfill {\mathbf{c}}_{\mathbf{D}}\hfill \end{array}\right]. $$
(175)

The vector multiplication leads to the following expression:

$$ \begin{array}{l}{\left\langle H\right\rangle}_{\Psi_{\mathrm{SD}}}={c}_0^{*}\left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle {c}_0+{c}_0^{*}\left\langle {\Psi}_0\left|\widehat{H}\right|\mathbf{S}\right\rangle {\mathbf{c}}_{\mathbf{S}}+{c}_0^{*}\left\langle {\Psi}_0\left|\widehat{H}\right|\mathbf{D}\right\rangle {\mathbf{c}}_{\mathbf{D}}\\ {} +{{\mathbf{c}}_{\mathbf{S}}}^{\dagger}\left\langle \mathbf{S}\left|\widehat{H}\right|{\Psi}_0\right\rangle {c}_0+{{\mathbf{c}}_{\mathbf{S}}}^{\dagger}\left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{S}\right\rangle {\mathbf{c}}_{\mathbf{S}}+{{\mathbf{c}}_{\mathbf{S}}}^{\dagger}\left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{D}\right\rangle {\mathbf{c}}_{\mathbf{D}}\\ {} +{{\mathbf{c}}_{\mathbf{D}}}^{\dagger}\left\langle \mathbf{D}\left|\widehat{H}\right|{\Psi}_0\right\rangle {c}_0+{{\mathbf{c}}_{\mathbf{D}}}^{\dagger}\left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{S}\right\rangle {\mathbf{c}}_{\mathbf{S}}+{{\mathbf{c}}_{\mathbf{D}}}^{\dagger}\left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{D}\right\rangle {\mathbf{c}}_{\mathbf{D}}.\end{array} $$
(176)

Such an equation is not very useful, since we still do not know the \( c_0 \), \( {\mathbf{c}}_{\mathbf{S}} \), and \( {\mathbf{c}}_{\mathbf{D}} \) coefficients determining the \( {\Psi}_{\mathrm{SD}} \) function. The only thing that can be said about them so far comes from the normalization requirement for \( {\Psi}_{\mathrm{SD}} \):

$$ 1={c}_0^{*}{c}_0+{\mathbf{c}}_{\mathbf{S}}^{\dagger }{\mathbf{c}}_{\mathbf{S}}+{\mathbf{c}}_{\mathbf{D}}^{\dagger }{\mathbf{c}}_{\mathbf{D}}. $$
(177)

This is not enough to uniquely determine the wave function. However, going back to Eq. 175 and multiplying only the inner vectors, one obtains

$$ {\mathbf{H}}_{\mathbf{SD}}=\left[\begin{array}{c}\hfill \left\langle {\Psi}_0\right.\Big|\hfill \\ {}\hfill \left\langle \mathbf{S}\right.\Big|\hfill \\ {}\hfill \left\langle \mathbf{D}\right.\Big|\hfill \end{array}\right]\widehat{H}\left[\left.\Big|{\Psi}_0\right\rangle, \left.\Big|\mathbf{S}\right\rangle, \left.\Big|\mathbf{D}\right\rangle \right]=\left[\begin{array}{ccc}\hfill \left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill \left\langle {\Psi}_0\left|\widehat{H}\right|\mathbf{S}\right\rangle \hfill & \hfill \left\langle {\Psi}_0\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill \\ {}\hfill \left\langle \mathbf{S}\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{S}\right\rangle \hfill & \hfill \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill \\ {}\hfill \left\langle \mathbf{D}\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{S}\right\rangle \hfill & \hfill \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill \end{array}\right]. $$
(178)

We can then associate finding the approximate Hamiltonian eigenvalues with the eigenproblem of its matrix in the \( \left\{{\Psi}_0,\mathbf{S},\mathbf{D}\right\} \) basis:

$$ {\left\langle H\right\rangle}_{\Psi_{\mathrm{SD}}}=\left[{c}_0^{*},{\mathbf{c}}_{\mathbf{S}}^{\dagger },{\mathbf{c}}_{\mathbf{D}}^{\dagger}\right]\left[\begin{array}{ccc}\left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle & \left\langle {\Psi}_0\left|\widehat{H}\right|\mathbf{S}\right\rangle & \left\langle {\Psi}_0\left|\widehat{H}\right|\mathbf{D}\right\rangle \\ {}\left\langle \mathbf{S}\left|\widehat{H}\right|{\Psi}_0\right\rangle & \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{S}\right\rangle & \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{D}\right\rangle \\ {}\left\langle \mathbf{D}\left|\widehat{H}\right|{\Psi}_0\right\rangle & \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{S}\right\rangle & \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{D}\right\rangle \end{array}\right]\left[\begin{array}{c}{c}_0\\ {}{\mathbf{c}}_{\mathbf{S}}\\ {}{\mathbf{c}}_{\mathbf{D}}\end{array}\right]. $$
(179)

Because of the Hermitian character of the Hamiltonian operator, the \( {\mathbf{H}}_{\mathbf{SD}} \) matrix is real and symmetric. Its diagonalization provides the set of eigenvalues and the corresponding eigenvectors. We are interested in the ground-state energy, and thus we need only the lowest eigenvalue of the \( {\mathbf{H}}_{\mathbf{SD}} \) matrix and the respective normalized eigenvector \( {\Psi}_{\mathrm{SD}} \).
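The diagonalization step can be sketched numerically. The matrix below is a randomly generated symmetric stand-in, not a matrix built from actual Hamiltonian integrals; it only illustrates extracting the lowest eigenvalue and its normalized eigenvector:

```python
import numpy as np

# Toy basis: Psi_0 plus five excited functions (a stand-in for H_SD).
rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
H_SD = (A + A.T) / 2                     # symmetrize: H_SD is real and symmetric

# eigh is the routine for symmetric/Hermitian matrices; it returns
# eigenvalues in ascending order, so index 0 is the ground state.
eigvals, eigvecs = np.linalg.eigh(H_SD)
E_ground = eigvals[0]
c_ground = eigvecs[:, 0]                 # the c coefficients of the expansion

# The eigenvector is already normalized and satisfies H_SD c = E c:
assert abs(np.linalg.norm(c_ground) - 1.0) < 1e-12
assert np.allclose(H_SD @ c_ground, E_ground * c_ground)
```

In a real calculation the matrix is far too large for dense diagonalization, and iterative solvers (e.g., Davidson-type) that target only the lowest root are used instead.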

Knowing the procedure for the truncated expansion (only single and double spin-orbital exchanges), we can see how it looks for the full expansion (Eq. 171). The matrix notation leads to the average value of the Hamiltonian, written as

$$ {\left\langle H\right\rangle}_{\Psi}=\left[{c}_0^{*},{\mathbf{c}}_{\mathbf{S}}^{\dagger},{\mathbf{c}}_{\mathbf{D}}^{\dagger},{\mathbf{c}}_{\mathbf{T}}^{\dagger},{\mathbf{c}}_{\mathbf{Q}}^{\dagger},\dots \right]\left[\begin{array}{c}\left\langle {\Psi}_0\right|\\ {}\left\langle \mathbf{S}\right|\\ {}\left\langle \mathbf{D}\right|\\ {}\left\langle \mathbf{T}\right|\\ {}\left\langle \mathbf{Q}\right|\\ {}\vdots \end{array}\right]\widehat{H}\left[\left|{\Psi}_0\right\rangle, \left|\mathbf{S}\right\rangle, \left|\mathbf{D}\right\rangle, \left|\mathbf{T}\right\rangle, \left|\mathbf{Q}\right\rangle, \dots \right]\left[\begin{array}{c}{c}_0\\ {}{\mathbf{c}}_{\mathbf{S}}\\ {}{\mathbf{c}}_{\mathbf{D}}\\ {}{\mathbf{c}}_{\mathbf{T}}\\ {}{\mathbf{c}}_{\mathbf{Q}}\\ {}\vdots \end{array}\right]. $$
(180)

The vector multiplication permits one to perceive the average value as the eigenproblem of the Hamiltonian matrix:

$$ \mathbf{H}=\left[\begin{array}{cccccc}\hfill \left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill 0\hfill & \hfill \left\langle {\Psi}_0\left|\widehat{H}\right|\left.\mathbf{D}\right\rangle \right.\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill \dots \hfill \\ {}\hfill 0\hfill & \hfill \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{S}\right\rangle \hfill & \hfill \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill & \hfill \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{T}\right\rangle \hfill & \hfill 0\hfill & \hfill \dots \hfill \\ {}\hfill \left\langle \mathbf{D}\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{S}\right\rangle \hfill & \hfill \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill & \hfill \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{T}\right\rangle \hfill & \hfill \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{Q}\right\rangle \hfill & \hfill \dots \hfill \\ {}\hfill 0\hfill & \hfill \left\langle \mathbf{T}\left|\widehat{H}\right|\mathbf{S}\right\rangle \hfill & \hfill \left\langle \mathbf{T}\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill & \hfill \left\langle \mathbf{T}\left|\widehat{H}\right|\mathbf{T}\right\rangle \hfill & \hfill \left\langle \mathbf{T}\left|\widehat{H}\right|\mathbf{Q}\right\rangle \hfill & \hfill \dots \hfill \\ {}\hfill 0\hfill & \hfill 0\hfill & \hfill \left\langle \mathbf{Q}\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill & \hfill \left\langle \mathbf{Q}\left|\widehat{H}\right|\mathbf{T}\right\rangle \hfill & \hfill \left\langle \mathbf{Q}\left|\widehat{H}\right|\mathbf{Q}\right\rangle \hfill & \hfill \dots \hfill \\ {}\hfill \vdots \hfill & \hfill \vdots \hfill & \hfill \vdots \hfill & \hfill \vdots \hfill & \hfill \vdots \hfill & \hfill \ddots \hfill \end{array}\right]. $$
(181)

It can be clearly seen that some blocks in this matrix are equal to zero. This happens in two cases:

  • The integrals between Ψ0 and functions of the S type (single exchange of spin-orbitals) vanish due to the Brillouin theorem, as was shown in the previous section.

  • The integrals between the functions that differ by more than two exchanges, for instance, S and Q type, vanish due to the Slater rules.

Even without knowing combinatorics, one can expect that the number of functions in a block grows drastically with an increase in the number of exchanges (block S contains fewer functions than D, etc.). A bit of thinking at the beginning helps to save a lot of time by not calculating zero integrals. Let us see: only in the case of the \( \left\langle {\Psi}_0\left|\widehat{H}\right|\mathbf{D}\right\rangle \) and \( \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{S}\right\rangle \) blocks must all the elements be calculated. The remaining matrices are sparse. For example, the \( \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{D}\right\rangle \) block contains integrals of the following types: \( \left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}\right|{\Psi}_{ab}^{pq}\right\rangle \) (the same function on both sides), \( \left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}\right|{\Psi}_{ab}^{pr}\right\rangle \) (functions differing by one spin-orbital), \( \left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}\right|{\Psi}_{ab}^{rs}\right\rangle \) (differing by two spin-orbitals), \( \left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}\right|{\Psi}_{ac}^{rs}\right\rangle \) (differing by three), and \( \left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}\right|{\Psi}_{cd}^{rs}\right\rangle \) (differing by four). The two latter cases produce zeros by the Slater rules.
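This Slater-rule bookkeeping can be sketched in a few lines. Representing a determinant by the set of its occupied spin-orbital labels is the standard trick; the labels and helper names below are illustrative:

```python
# Represent each determinant by its set of occupied spin-orbital labels.
# A matrix element over a Hamiltonian with only one- and two-electron
# operators vanishes when the two determinants differ by more than two
# spin-orbitals (the Slater rules).

def excite(reference, removed, added):
    """Occupied set after exchanging spin-orbitals `removed` for `added`."""
    return (set(reference) - set(removed)) | set(added)

def n_differences(det1, det2):
    """Number of spin-orbitals occupied in det1 but not in det2."""
    return len(det1 - det2)

def element_vanishes(det1, det2):
    return n_differences(det1, det2) > 2

ref = {"a", "b", "c", "d"}                  # toy 4-electron reference
D1 = excite(ref, ["a", "b"], ["p", "q"])    # Psi_ab^pq
D2 = excite(ref, ["a", "b"], ["p", "r"])    # Psi_ab^pr: differs by one
D3 = excite(ref, ["c", "d"], ["r", "s"])    # Psi_cd^rs: differs by four

print(n_differences(D1, D2), element_vanishes(D1, D2))  # 1 False
print(n_differences(D1, D3), element_vanishes(D1, D3))  # 4 True
```

A screening of this kind, done before any integral evaluation, is exactly the "bit of thinking" that avoids computing zero matrix elements.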

As the number of exchanges grows, the sizes of the blocks increase abruptly, but most of the elements are equal to zero. A natural simplification is to limit the size of the H matrix by eliminating from the expansion the functions that include more than a given number of exchanges. Let us keep only the single-exchange block. Thus, the Hamiltonian matrix has the form

$$ {\mathbf{H}}_{\mathrm{S}}=\left[\begin{array}{cc}\hfill \left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill 0\hfill \\ {}\hfill 0\hfill & \hfill \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{S}\right\rangle \hfill \end{array}\right]. $$
(182)

This is a block diagonal matrix. One of the properties of such matrices is that their eigenvalue set is the union of the eigenvalues of the diagonal blocks. This means that the lowest possible eigenvalue is \( {E}_0=\left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle \); consequently, there is no improvement in the ground-state energy with respect to the Hartree–Fock theory when taking only the single orbital exchanges.
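The block-diagonal property invoked here is easy to verify numerically. The blocks below are arbitrary stand-ins for \( \left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle \) and \( \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{S}\right\rangle \):

```python
import numpy as np

rng = np.random.default_rng(1)
E0 = np.array([[-2.5]])                  # 1x1 block: stand-in for <Psi0|H|Psi0>
B = rng.standard_normal((4, 4))
S_block = (B + B.T) / 2                  # symmetric stand-in for <S|H|S>

# Assemble the block-diagonal matrix H_S with zero off-diagonal blocks.
H_S = np.block([
    [E0, np.zeros((1, 4))],
    [np.zeros((4, 1)), S_block],
])

# The spectrum of H_S is the union of the spectra of the blocks.
all_eigs = np.linalg.eigvalsh(H_S)
block_eigs = np.sort(np.concatenate([np.linalg.eigvalsh(E0),
                                     np.linalg.eigvalsh(S_block)]))
assert np.allclose(all_eigs, block_eigs)
```

Because the spectra simply merge, diagonalizing H_S can never push the lowest root below the lowest eigenvalue already present in one of the blocks.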

Hence, let us also include the functions of the D type. Now the Hamiltonian matrix can be written as

$$ {\mathbf{H}}_{\mathrm{SD}}=\left[\begin{array}{ccc}\hfill \left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill 0\hfill & \hfill \left\langle {\Psi}_0\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill \\ {}\hfill 0\hfill & \hfill \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{S}\right\rangle \hfill & \hfill \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill \\ {}\hfill \left\langle \mathbf{D}\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{S}\right\rangle \hfill & \hfill \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill \end{array}\right]. $$
(183)

It is no longer a block diagonal matrix – all blocks contribute to its eigenvalues and one can count on some improvement. An interesting observation, however, is that here the functions with the single spin-orbital exchange also have influence on the energy via the \( \left\langle \mathbf{S}\left|\widehat{H}\right|\mathbf{D}\right\rangle \) and \( \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{S}\right\rangle \) blocks.

Next, subsequent groups of functions containing more than two spin-orbital exchanges can be included. However, the calculations become prohibitively expensive, even for moderately sized systems, and the consecutive corrections become smaller and smaller. The distinguished character of the double spin-orbital exchange was already discussed within the MP2 method. Now one can also expect that including double exchanges produces reasonable results at moderate computational cost. Then, why not save more and diagonalize only

$$ {\mathbf{H}}_{\mathrm{D}}=\left[\begin{array}{cc}\hfill \left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill \left\langle {\Psi}_0\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill \\ {}\hfill \left\langle \mathbf{D}\left|\widehat{H}\right|{\Psi}_0\right\rangle \hfill & \hfill \left\langle \mathbf{D}\left|\widehat{H}\right|\mathbf{D}\right\rangle \hfill \end{array}\right] $$
(184)

instead of H SD ? This can be done; however, savings are not that great, since the number of S functions is significantly smaller than the number of D functions. Thus, if one can afford H D diagonalization, H SD diagonalization is probably also within easy reach.

The above reasoning has led to the sequence of quantum chemistry methods. The best results can be obtained within full CI (FCI) by applying the full expansion (Eq. 171) within the given basis set. This is certainly the most expensive variant. Cheaper – but also worse – are, respectively, CISD based on the H SD matrix and CID neglecting single exchanges (Cramer 2004; Jensen 2006; Levine 2008; Lowe and Peterson 2005; Piela 2007; Ratner and Schatz 2000; Szabo and Ostlund 1996).

So far, the reference function has been a single determinant. Such an approach is very limited. For instance, it does not allow one to describe a dissociation process: a correct description of dissociation requires at least one determinant for each subsystem. And, even in cases when a multi-determinant description of the reference state is not obligatory, such a flexible wave function provides an improved description of the system of interest (Cramer 2004; Jensen 2006; Levine 2008; Piela 2007; Roos and Widmark 2002).

The multi-determinant wave function Ψ depends both on the expansion coefficients and on the spin-orbitals building up the determinants. Both sets of variables can be optimized simultaneously. The particular case of this procedure, taking only the first expansion term, is the Hartree–Fock approximation (SCF–HF). Therefore, the optimization of the multi-determinant wave function is called the multiconfiguration (MC) SCF method. Even without a detailed study of the MC–SCF equations, an improvement in the results with respect to the HF energy can be expected. However, this approach is much more expensive, since the spin-orbitals are optimized several times. Again, time saving is desired. Therefore, let us search for the spin-orbitals with the highest influence on the total energy value. It has been observed that not all doubly exchanged functions provide the same contribution to the energy: some improve the result more and others less. This is due to the spin-orbital energy differences. The exchange of spin-orbitals with significantly different energies does not contribute much to the total energy improvement. Therefore, it can be requested that an exchange be included in the calculations only if the energy difference between the involved spin-orbitals is smaller than some given value. Hence, only some groups of spin-orbitals are exchanged.

Up to now, the spin-orbital notion was used. However, let us switch to the orbital language that is frequently used for MC–SCF considerations.

For the N-electron system, the orbitals can be divided into three groups:

  • Core orbitals, which are not varied, since their orbital energies are too low, but are applied in the wave function expansion (doubly occupied orbitals)

  • Active orbitals, which are exchanged in the expansion (partially occupied orbitals)

  • Virtual orbitals, which are not varied and not applied in the expansion (unoccupied orbitals)

Instead of optimizing all the orbitals, only the active orbitals are varied within the complete active space self-consistent field (CASSCF) approximation. In the acronym of this method, the numbers of active orbitals and active electrons are also provided for the given system. For instance, CASSCF(6,4) denotes calculations with the expansion including all possible exchanges of the four electrons within the six active orbitals. The CASSCF approach leads to all possible exchanges in the given active space, and even for a moderately sized system, the size of the active space can quickly exceed the available computational resources. In such a case, the solution can be the restricted active space self-consistent field (RASSCF) method, which supplies a way of limiting the size of the active space.
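The growth of the active space can be illustrated with a rough determinant count. The sketch below assumes a simple product of binomial coefficients over the alpha and beta electrons and ignores spin adaptation, which would reduce the counts somewhat:

```python
from math import comb

def cas_determinants(n_orb, n_alpha, n_beta):
    """Number of determinants for n_alpha + n_beta active electrons
    distributed over n_orb active spatial orbitals."""
    return comb(n_orb, n_alpha) * comb(n_orb, n_beta)

# 4 electrons (2 alpha, 2 beta) in 6 active orbitals -- small:
print(cas_determinants(6, 2, 2))    # 225
# 14 electrons in 14 orbitals -- already millions of determinants:
print(cas_determinants(14, 7, 7))   # 11778624
```

Counts like these are why active spaces much beyond roughly (16,16) are out of reach for conventional CASSCF implementations.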

Additionally, one needs to remember that for a powerful tool such as perturbation theory, there is no obstacle to applying the multi-determinant reference function as the unperturbed function in perturbation calculus. Thus, similar to the SCF–HF and MP2 approaches, CASPT2 would be the second-order perturbation theory complete active space method – the perturbationally corrected CASSCF.

Coupled Cluster Approximation: The Operator Strikes Back

It would seem that all the straightforward ways to improve the wave function in the one-electron approximation have been exploited. However, we next discuss one of the most accurate (and simultaneously most expensive) methods applied in quantum chemistry.

The idea is simple. Consider again the expansion (Eq. 171). Introducing an operator

$$ \widehat{C}={\widehat{C}}_0+{\widehat{C}}_1+{\widehat{C}}_2+{\widehat{C}}_3+{\widehat{C}}_4+\dots, $$
(185)

defined as

$$ {\widehat{C}}_0\left|{\Psi}_0\right\rangle ={c}_0\left|{\Psi}_0\right\rangle, $$
(186)
$$ {\widehat{C}}_1\left|{\Psi}_0\right\rangle =\mathbf{S}{\mathbf{c}}_{\mathbf{S}}, $$
(187)
$$ {\widehat{C}}_2\left|{\Psi}_0\right\rangle =\mathbf{D}{\mathbf{c}}_{\mathbf{D}}, $$
(188)
$$ {\widehat{C}}_3\left|{\Psi}_0\right\rangle =\mathbf{T}{\mathbf{c}}_{\mathbf{T}}, $$
(189)
$$ {\widehat{C}}_4\left|{\Psi}_0\right\rangle =\mathbf{Q}{\mathbf{c}}_{\mathbf{Q}}, $$
(190)
$$ \vdots $$

allows one to write Eq. 171 in a very compact form:

$$ \Psi =\widehat{C}{\Psi}_0. $$
(191)

Now the problem of finding the appropriate expansion can be replaced by the problem of finding the adequate operator. This is the essence of the coupled cluster (CC) method. Here the assumption is made that the wave function can be expressed by

$$ \Psi ={e}^{\widehat{T}}{\Psi}_0, $$
(192)

where Ψ0 is a reference function (depending on the approach, this can be the one-determinant HF function or the multi-determinant function arising from MC–SCF) and \( \widehat{T} \) is a sought operator. Applying the expansion of the exponential function, it can be written that

$$ {e}^{\widehat{T}}=\widehat{1}+\widehat{T}+\frac{1}{2!}{\widehat{T}}^2+\frac{1}{3!}{\widehat{T}}^3+\dots . $$
(193)
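A numerical aside on why this series is manageable: in a finite basis, excitation-type operators are nilpotent (repeated excitation eventually annihilates every determinant), so for them the exponential series terminates exactly. The 4×4 shift matrix below is only a stand-in with \( T^4=0 \):

```python
import numpy as np
from math import factorial

# A strictly lower-triangular (shift) matrix is nilpotent: T^4 = 0 here.
T = np.diag([1.0, 1.0, 1.0], k=-1)

# The exponential series then terminates after finitely many terms.
expT = sum(np.linalg.matrix_power(T, k) / factorial(k) for k in range(4))

# Adding further terms changes nothing, since every higher power vanishes:
assert np.allclose(np.linalg.matrix_power(T, 4), 0)
```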

Such an expanded form makes the interpretation of the \( \widehat{T} \) operators easier. Putting

$$ \widehat{T}={\widehat{T}}_1+{\widehat{T}}_2+{\widehat{T}}_3+{\widehat{T}}_4+\dots, $$
(194)

one can identify the subsequent \( \widehat{T_i} \) operators as corresponding to i-tuple exchanges of the spin-orbitals in the reference function:

$$ {\widehat{T}}_1{\Psi}_0={\displaystyle \sum_{a,p}{t}_a^p{\Psi}_a^p,} $$
(195)
$$ {\widehat{T}}_2{\Psi}_0={\displaystyle \sum_{a,b,p,q}{t}_{ab}^{pq}{\Psi}_{ab}^{pq}}, $$
(196)

and so forth. The coefficients t (called “amplitudes”) are in general not equivalent to the c coefficients in the CI expansion (see Eq. 171). In order to find their mutual relation, let us consider the approximate operator:

$$ \widehat{T}\approx {\widehat{T}}_1+{\widehat{T}}_2+{\widehat{T}}_3+{\widehat{T}}_4. $$
(197)

The operator (Eq. 193) takes the form

$$ \begin{array}{c}{e}^{{\widehat{T}}_1+{\widehat{T}}_2+{\widehat{T}}_3+{\widehat{T}}_4}=\widehat{1}\\ {}+{\widehat{T}}_1+{\widehat{T}}_2+{\widehat{T}}_3+{\widehat{T}}_4\\ {}+\frac{1}{2!}{\left({\widehat{T}}_1+{\widehat{T}}_2+{\widehat{T}}_3+{\widehat{T}}_4\right)}^2\\ {}+\frac{1}{3!}{\left({\widehat{T}}_1+{\widehat{T}}_2+{\widehat{T}}_3+{\widehat{T}}_4\right)}^3\\ {}+\frac{1}{4!}{\left({\widehat{T}}_1+{\widehat{T}}_2+{\widehat{T}}_3+{\widehat{T}}_4\right)}^4+\dots .\end{array} $$
(198)

Limiting ourselves to the terms corresponding to not more than four spin-orbital exchanges and writing it in the ordered way according to the number of exchanges, one gets

$$ \begin{array}{c}{e}^{{\widehat{T}}_1+{\widehat{T}}_2+{\widehat{T}}_3+{\widehat{T}}_4}=\widehat{1}\\ {}+{\widehat{T}}_1\\ {}+{\widehat{T}}_2+\frac{1}{2}{\widehat{T}}_1^2\\ {}+{\widehat{T}}_3+{\widehat{T}}_1{\widehat{T}}_2+\frac{1}{3!}{\widehat{T}}_1^3\\ {}+{\widehat{T}}_4+{\widehat{T}}_1{\widehat{T}}_3+\frac{1}{2}{\widehat{T}}_2^2+\frac{1}{2}{\widehat{T}}_1^2{\widehat{T}}_2+\frac{1}{4!}{\widehat{T}}_1^4+\dots .\end{array} $$
(199)

Now the direct comparison can be made:

$$ {\widehat{C}}_1={\widehat{T}}_1, $$
(200)
$$ {\widehat{C}}_2={\widehat{T}}_2+\frac{1}{2}{\widehat{T}}_1^2, $$
(201)
$$ {\widehat{C}}_3={\widehat{T}}_3+{\widehat{T}}_1{\widehat{T}}_2+\frac{1}{3!}{\widehat{T}}_1^3, $$
(202)
$$ {\widehat{C}}_4={\widehat{T}}_4+{\widehat{T}}_1{\widehat{T}}_3+\frac{1}{2}{\widehat{T}}_2^2+\frac{1}{2}{\widehat{T}}_1^2{\widehat{T}}_2+\frac{1}{4!}{\widehat{T}}_1^4. $$
(203)

We have the relation between the Ĉ i and \( {\widehat{T}}_i \) operators, but still neither Ĉ i nor \( {\widehat{T}}_i \) are known. Recall from the earlier sections that the double exchanges have a significant influence on the energy improvement with respect to the Hartree–Fock results. Taking double exchanges into account within the coupled cluster formalism means that the operators \( {\widehat{T}}_1 \) and \( {\widehat{T}}_2 \) need to be determined. However, as a side effect, they also allow inclusion of some non-negligible contributions arising from the triple and higher exchanges. In the above comparison, the \( {\widehat{T}}_1 \) and \( {\widehat{T}}_2 \) operators recover two out of three terms in Ĉ 3 and three out of five terms in Ĉ 4. This is the power of the CC method.
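The comparison in Eqs. 200–203 can be checked mechanically. Since the excitation operators commute with one another, the coefficient of \( {\widehat{T}}_1^a{\widehat{T}}_2^b{\widehat{T}}_3^c{\widehat{T}}_4^d \) in the expansion of \( {e}^{{\widehat{T}}_1+{\widehat{T}}_2+{\widehat{T}}_3+{\widehat{T}}_4} \) is \( 1/(a!\,b!\,c!\,d!) \); grouping monomials by the total excitation level a + 2b + 3c + 4d then reproduces the Ĉ operators. A sketch:

```python
from fractions import Fraction
from math import factorial
from itertools import product

def c_operator(level, max_power=4):
    """Monomials T1^a T2^b T3^c T4^d of total excitation level
    a + 2b + 3c + 4d == level, with their exp-series coefficients."""
    terms = {}
    for a, b, c, d in product(range(max_power + 1), repeat=4):
        if a + 2 * b + 3 * c + 4 * d == level:
            coeff = Fraction(1, factorial(a) * factorial(b)
                                * factorial(c) * factorial(d))
            terms[(a, b, c, d)] = coeff
    return terms

# C_2 = T2 + (1/2) T1^2  -> monomials (a,b,c,d) = (0,1,0,0) and (2,0,0,0):
print(c_operator(2))
# C_4 = T4 + T1*T3 + (1/2) T2^2 + (1/2) T1^2*T2 + (1/24) T1^4:
print(c_operator(4))
```

Running this for levels 3 and 4 confirms the coefficients 1/3! and 1/4! in front of \( {\widehat{T}}_1^3 \) and \( {\widehat{T}}_1^4 \).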

Unfortunately, the strength of this method does not go together with ease of calculations. Obtaining the expressions for the \( \widehat{T} \) operators comes at the price of compromises. Not only is the operator expansion (Eq. 194) truncated, but the basis set is finite. Moreover, the variational character of the method is sacrificed.

In order to realize the complications, let us consider step-by-step the energy calculation within the CC formalism. We begin, as usual, with the electron Schrödinger Eq. 50. Substituting Eq. 192 gives

$$ \widehat{H}{e}^{\widehat{T}}{\Psi}_0=E{e}^{\widehat{T}}{\Psi}_0. $$
(204)

Taking into account that, due to (Eq. 193),

$$ \left\langle \left.{\Psi}_0\right|\Psi \right\rangle =\left\langle {\Psi}_0\left|{e}^{\widehat{T}}\right|{\Psi}_0\right\rangle =\left\langle {\Psi}_0\left|{\Psi}_0\right.\right\rangle =1, $$
(205)

the energy can be calculated as

$$ E=\left\langle {\Psi}_0\left|\widehat{H}{e}^{\widehat{T}}\right|{\Psi}_0\right\rangle =\left\langle {\Psi}_0\left|\widehat{H}\right|\Psi \right\rangle . $$
(206)

This is not the Hamiltonian average value expression. Additionally, the operator inside the bracket is not Hermitian. However, as long as we assume that Eq. 192 holds, such an approach works. We can also construct an integral:

$$ \left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}{e}^{\widehat{T}}\right|{\Psi}_0\right\rangle =E\left\langle {\Psi}_{ab}^{pq}\left|{e}^{\widehat{T}}\right|{\Psi}_0\right\rangle, $$
(207)

which is a consequence of Eq. 204 and will be applied shortly.

We should now concentrate on the way of determining the amplitudes. To simplify the considerations, we can assume

$$ \widehat{T}\approx {\widehat{T}}_2, $$
(208)

which is equivalent to the CCD variant. We are interested in finding the amplitudes \( {t}_{ab}^{pq} \). The final result of the calculations will be the approximate energy:

$$ {E}_{\mathrm{CCD}}=\left\langle {\Psi}_0\left|\widehat{H}{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle . $$
(209)

The information about the amplitude \( {t}_{ab}^{pq} \) can be extracted from the integral

$$ {t}_{ab}^{pq}=\left\langle {\Psi}_{ab}^{pq}\left|{\widehat{T}}_2\right|{\Psi}_0\right\rangle $$
(210)

(see Eq. 196). However, the amplitudes are still not known, since we do not know the \( {\widehat{T}}_2 \) operator. Therefore, one more equation is necessary to elicit the sought information. Let us begin with the approximated expression (Eq. 207):

$$ \begin{array}{rl}\left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle &={E}_{\mathrm{CCD}}\left\langle {\Psi}_{ab}^{pq}\left|{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle \\[6pt] {}&=\left\langle {\Psi}_0\left|\widehat{H}{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle \left\langle {\Psi}_{ab}^{pq}\left|{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle .\end{array} $$
(211)

The expansion (Eq. 193) tailored to the present case,

$$ {e}^{{\widehat{T}}_2}=\widehat{1}+{\widehat{T}}_2+\frac{1}{2}{\widehat{T}}_2^2+\dots, $$
(212)

and substituted to the left-hand side of Eq. 211 gives

$$ \left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle =\left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}\left(\widehat{1}+{\widehat{T}}_2+\frac{1}{2}{\widehat{T}}_2^2\right)\right|{\Psi}_0\right\rangle . $$
(213)

Further terms are not necessary: for them, the functions on the two sides of the integral would differ by four or more spin-orbital exchanges (and the Hamiltonian is still a sum of one- and two-electron operators). Similarly, the expansion in the integral \( \left\langle {\Psi}_0\left|\widehat{H}{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle \) is truncated after the second term:

$$ \left\langle {\Psi}_0\left|\widehat{H}{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle =\left\langle {\Psi}_0\left|\widehat{H}\left(\widehat{1}+{\widehat{T}}_2\right)\right|{\Psi}_0\right\rangle . $$
(214)

Remembering that

$$ {E}_0^{(0)}=\left\langle {\Psi}_0\left|\widehat{H}\right|{\Psi}_0\right\rangle, $$
(215)

one gets

$$ \left\langle {\Psi}_0\left|\widehat{H}{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle ={E}_0^{(0)}+\left\langle {\Psi}_0\left|\widehat{H}{\widehat{T}}_2\right|{\Psi}_0\right\rangle . $$
(216)

The last integral on the right-hand side of Eq. 211, \( \left\langle {\Psi}_{ab}^{pq}\left|{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle \), can be nonvanishing only when the functions on the left and right sides of the integral are the same. This is possible only for the \( {\widehat{T}}_2 \) term of the expansion:

$$ \left\langle {\Psi}_{ab}^{pq}\left|{e}^{{\widehat{T}}_2}\right|{\Psi}_0\right\rangle =\left\langle {\Psi}_{ab}^{pq}\left|{\widehat{T}}_2\right|{\Psi}_0\right\rangle . $$
(217)

This is the integral that can provide information about the desired amplitudes of Eq. 210. Putting all these together, one gets

$$ \left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}\left(\widehat{1}+{\widehat{T}}_2+\frac{1}{2}{\widehat{T}}_2^2\right)\right|{\Psi}_0\right\rangle =\left({E}_0^{(0)}+\left\langle {\Psi}_0\left|\widehat{H}{\widehat{T}}_2\right|{\Psi}_0\right\rangle \right)\left\langle {\Psi}_{ab}^{pq}\left|{\widehat{T}}_2\right|{\Psi}_0\right\rangle . $$
(218)

Therefore, the amplitude \( {t}_{ab}^{pq}=\left\langle {\Psi}_{ab}^{pq}\left|{\widehat{T}}_2\right|{\Psi}_0\right\rangle \) can be expressed as

$$ {t}_{ab}^{pq}=\frac{\left\langle {\Psi}_{ab}^{pq}\left|\widehat{H}\left(\widehat{1}+{\widehat{T}}_2+\frac{1}{2}{\widehat{T}}_2^2\right)\right|{\Psi}_0\right\rangle }{{E}_0^{(0)}+\left\langle {\Psi}_0\left|\widehat{H}{\widehat{T}}_2\right|{\Psi}_0\right\rangle }. $$
(219)

Unfortunately, this does not mean that the amplitudes are known: the above expression still contains the \( {t}_{ab}^{pq} \) amplitudes on the right-hand side, hidden in the \( {\widehat{T}}_2 \) operators. Moreover, all the other amplitudes are also present on the right-hand side. The consequence of this complication is that the CC equations cannot be solved separately, one by one. A complicated set of coupled nonlinear equations must be handled, with the number of equations equal to the number of sought amplitudes. This is the main reason for the huge computational cost of CC calculations, even though the variationality of the method was abandoned (Atkins and Friedman 2005; Cramer 2004; Jensen 2006; Levine 2008; Piela 2007; Roos and Widmark 2002).
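The self-consistent structure of such equations can be illustrated with a toy fixed-point iteration. This is not the actual CCD equations: the quadratic map and its coefficients below are arbitrary stand-ins, chosen only so that each "amplitude" appears on both sides:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
b = 0.1 * rng.standard_normal(n)        # stand-in for the driving integrals
A = 0.1 * rng.standard_normal((n, n))   # stand-in for the quadratic couplings

def f(t):
    # t appears on the right-hand side: each amplitude depends on all others.
    return b + 0.1 * A @ (t * t)

t = np.zeros(n)                          # zero-amplitude starting guess
for it in range(100):
    t_new = f(t)
    if np.linalg.norm(t_new - t) < 1e-12:
        break                            # self-consistency reached
    t = t_new
print(it, t)
```

Production CC codes solve their (much larger) coupled amplitude equations in essentially this spirit, usually with convergence acceleration such as DIIS.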

It can be seen that solving the CC equations is quite complicated, even in the simplified case of the CCD approach. If one instead wanted to keep the variational character and apply the Hamiltonian average value, the following integrals would appear:

$$ \left\langle \Psi \left|\widehat{H}\right|\Psi \right\rangle =\left\langle {\Psi}_0\left|{e}^{{\widehat{T}}^{\dagger}}\widehat{H}{e}^{\widehat{T}}\right|{\Psi}_0\right\rangle =\left\langle {e}^{\widehat{T}}{\Psi}_0\left|\widehat{H}\right|{e}^{\widehat{T}}{\Psi}_0\right\rangle . $$
(220)

In order to calculate them, one needs to know the form of all the \( {\widehat{T}}_i \) operators, since not only the function on the right-hand side of the above integral contains the exchanged spin-orbitals but also the function on the left-hand side. Therefore, one needs to calculate terms like \( \left\langle {\widehat{T}}_3{\Psi}_0\left|\widehat{H}\right|{\widehat{T}}_2{\Psi}_0\right\rangle \) and many others. This causes a significant increase in the computational cost of the CC method.

As in the MP n case, the CC method is worth using with a short expansion of the \( \widehat{T} \) operator. Thus, relatively good accuracy is obtained at a moderate price.

Conclusions

We have finally reached the end of the zeroth iteration in the process of learning quantum chemistry methods. The beginner may feel saturated or even overwhelmed; however, we hope that this chapter arouses interest. Our aim was to show that simple ideas underlie quantum chemistry methods. The purpose is to put complicated things in a simpler and more convenient form. One of the most popular rules in computational chemistry is as follows: “If you cannot calculate something, divide it into parts in such a way that you can calculate some contribution while the other is too difficult.” For instance, nonrelativistic energy can be divided into HF energy and correlation energy. Correlation energy accounts for the contribution that we cannot calculate in practice, but methods such as MP n, CC, and CI allow one to find some part of it. It may happen (and it often does!) that what we can calculate will be enough.

This chapter should be treated as the introduction to more advanced handbooks or as a guide through the symbols and concepts applied in the later parts of this book. Thus, some of the concepts are just touched upon, and many are omitted. If the reader noticed this and wants to know more, it means that this chapter has met its goal.