In this chapter we consider the methods of first-principles calculations of bulk-crystal and nanostructure properties that depend on the electronic structure.

The choice of the basis set is of particular importance when treating periodic systems, where a large variety of chemical bonding can be found. The following two approaches to the basis-set choice define two types of methods for electronic-structure calculations of periodic systems: plane-wave (PW) methods and methods using localized atomic-like orbitals (LCAO methods). Each method has its advantages and disadvantages.

In the case of monoperiodic (1D) systems (for example, nanotubes and nanowires), PW calculations require the introduction of two vacuum regions, both of which must be wide enough so that the periodically repeated 1D systems do not interact with each other. LCAO calculations are free from this shortcoming. Therefore, in the quantum chemistry of systems with 2D and 1D periodicity the LCAO methods are preferable. These methods are more flexible as they allow both Hartree-Fock and Kohn–Sham equations to be solved, are applicable in the correlated wavefunction approaches (for example, MP2 theory) and in the Kohn–Sham theory based on hybrid exchange-correlation functionals. Both all-electron and valence-electron-only calculations can be made in an LCAO basis, while the PW methods always require the use of an effective core potential (ECP) for the core electrons. Different choices of the ECP for solids are discussed.

The extension of the one-electron approximation (Restricted, Unrestricted and Restricted Open-shell one-determinant Hartree-Fock methods) to periodic systems is considered. The direct lattice summations in the Fock matrix elements and the \(\mathbf k\) dependence of the one-electron density matrix, energy levels and crystalline orbitals are the main difficulties of the HF LCAO method for periodic systems compared with molecules.

In the self-consistent calculations of the electronic structure of bulk crystals and nanostructures, both in the basis of plane waves and in the basis of localized atomic-type functions, one needs at every stage of the self-consistent procedure to evaluate the approximate electron-density matrix by integration over the Brillouin zone (BZ).

The progress in the electron-correlation study of solids is connected mainly with density-functional theory (DFT), which determines various physical properties of a system directly, without any knowledge of the many-electron wavefunction. Unfortunately, for the density-functional-based methods there is no procedure for a systematic improvement of the calculated results when a higher accuracy is desired. Modeling the exchange and correlation interactions becomes difficult within DFT as the exact functionals for the exchange and correlation are not known except for the homogeneous (uniform) electron gas. The problem of the transferability of molecular exchange-correlation functionals to crystals is discussed. The climbing of the so-called Jacob’s ladder of exchange-correlation functionals defines the approximations that permit the calculation of real periodic systems: LDA, GGA, hybrid Hartree-Fock/Kohn–Sham (orbital-dependent) exchange-correlation functionals and screened exchange-correlation functionals.

The Density Functional Tight Binding (DFTB) method is popular for 1D nanostructure calculations. This method uses Slater-type atomic orbitals and introduces the two-center integrals as free parameters. When these parameters are fitted to reproduce the results of DFT calculations, electron correlation is also taken into account. The Hamiltonian matrix elements can be modified by a self-consistent redistribution of Mulliken charges.

The application of the DFT and hybrid exchange-correlation functionals to simulations of nanostructures (nanolayers, nanotubes, nanowires) is discussed in Part II of this book. As we mentioned in Chap. 2, the study of nanostructures usually starts with calculations of the structure and properties of the corresponding bulk crystals. The bulk crystals play the role of benchmark systems for checking the applicability of the selected computational method to reproducing the experimental data for different bulk-crystal properties: the equilibrium atomic structure, one-electron properties (band structure and density of states), formation and atomization (cohesive) energies, bulk modulus and elastic constants, phonon frequencies and thermodynamic properties.

The results of the LCAO DFT(PBE0) calculations of the structure and electronic properties, bulk modulus and relative energies of different phases are presented for several bulk crystals serving as a basis for nanostructure modeling: binary oxides (the rutile phase of TiO\(_2\)), ternary oxides (SrTiO\(_3\), SrZrO\(_3\), BaTiO\(_3\)) and sulfides (TiS\(_2\), ZrS\(_2\)). Good agreement of the calculated results with the experimental data is demonstrated.

1 Basis Sets and Pseudopotentials in the Crystalline Electronic Structure Calculations

1.1 Plane Wave and Localized Atomic Functions Basis Sets

Although the same fundamental theory underlies the Hartree-Fock or the density-functional approach in both solid-state physics and molecular quantum chemistry (for example, the same exchange-correlation functionals can be used in both disciplines), the detailed implementation is usually quite different [1]. First, this difference concerns the basis-set choice.

Since the electronic charge density of an isolated molecule is necessarily localized in a finite region of space, the traditional quantum-chemical method is LCGTO (MOs are expanded in a basis of localized Gaussian-type orbitals, centered on the atomic nuclei).

The choice of the basis set is of particular importance when treating periodic systems, where a large variety of chemical bonding can be found. The following three approaches to the basis-set choice define three types of methods for electronic-structure calculations in crystals [2]: plane-wave (PW) methods, atomic-sphere (AS) methods and LCAO methods. Each method has its advantages and disadvantages. Information about existing computer codes using AS, PW and LCAO basis sets for periodic-structure calculations can be found at [3].

The tradition in solid-state physics is to use plane waves (PW) as a basis set for expanding the one-particle Kohn–Sham wavefunctions (crystalline orbitals) [4]. These can be used in their pure form in the first-principles Effective Core Potential (pseudopotential) method [5] or modified near the atomic cores (augmented plane waves [6]).

The basic idea of AS methods is to divide the electronic-structure problem, providing an efficient representation of atomic-like features that are rapidly varying near each nucleus and of smoothly varying functions between the atoms [2]. The smooth functions are augmented near each nucleus by solving the Schrödinger equation in the sphere at each energy and matching to the outer wavefunction. The resulting APW (augmented plane waves) or KKR (Korringa–Kohn–Rostoker) methods are powerful but require the solution of nonlinear equations. The linear modifications of the AS methods (LAPW, LMTO) use the familiar form of a secular equation involving a Hamiltonian and an overlap matrix. The linear modification of the all-electron (also termed full-potential, FP) APW method (FP-LAPW) provides the most precise solutions of the Kohn–Sham equations. However, FP-LAPW calculations are practically difficult, as the description of the core states requires a huge number of plane waves.

The simpler PW methods using the effective core potential (ECP, pseudopotential) for the core electrons are the most popular in Kohn–Sham calculations for periodic systems. Plane waves form an orthonormal complete set; any function belonging to the class of continuous normalizable functions can be expanded with arbitrary precision in such a basis set. Using the Bloch theorem, the single-electron wavefunction \(\varphi _i(\mathbf{r})\) can be written as a product of a wave-like part and a cell-periodic part, \(\varphi _{i\varvec{k}}=\exp (\mathrm{i}\varvec{k}\varvec{r})u_i(\varvec{r})\) (see Chap. 2). Due to its periodicity in the direct lattice, \(u_i(\varvec{r})\) can be expanded as a set of plane waves, \(u_i(\varvec{r})=\sum _{\varvec{B}} C_{i\varvec{B}}\exp (\mathrm{i}\varvec{B}\varvec{r})\), where \(\varvec{B}\) runs over the reciprocal-lattice vectors. Thus, in the PW basis the single-electron wavefunction can be written as a linear combination of plane waves

$$\begin{aligned} \varphi _{i\varvec{k}}(\varvec{r})=\sum \limits _{\varvec{B}}C_{i\varvec{B}}\exp (\mathrm{i}(\varvec{k}+\varvec{B})\varvec{r}) \end{aligned}$$
(3.1)

The number of basis functions used is controlled by the largest wavevector in the expansion (3.1). This is equivalent to imposing a cutoff on the kinetic energy, as the kinetic energy of an electron with wavevector \((\varvec{k}+\varvec{B})\) is given by \(\frac{|\varvec{k}+\varvec{B}|^2}{2}\). Thus, the size of the PW basis set is defined by the so-called cutoff energy, i.e. the kinetic energy for the largest reciprocal lattice vector included in the PW basis. The PW basis set is universal, in the sense that it does not depend on the positions of the atoms in the unit cell, nor on their nature [7]. One does not have to construct a new basis set for every atom in the periodic table nor modify it in different materials, as is the case with localized atomic-like functions, and the basis can be made better (and more expensive) or worse (and cheaper) by varying a single parameter: the number of plane waves, defined by the cutoff-energy value. This characteristic is particularly valuable in quantum molecular-dynamics calculations, where nuclear positions are constantly changing. It is relatively easy to compute forces on atoms. Finally, plane-wave calculations do not suffer from the basis-set superposition error (BSSE). In practice, one must use a finite set of plane waves, and this in fact means that well-localized core electrons cannot be described in this manner. One must either augment the basis set with additional functions (as in the linear combination of augmented plane waves scheme), or use pseudopotentials to describe the core states. Both the AS and PW methods developed in solid-state physics are used to solve the Kohn–Sham equations. We refer the reader to recently published books for the detailed description of these methods [2, 8–10]. The plane waves are a reasonable first approximation to conduction-band eigenfunctions and permit many formal simplifications and computational economies.
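
As a rough illustration of how the cutoff controls the basis-set size, the following sketch (a Python toy example assuming a simple cubic cell, the \(\Gamma \) point and atomic units; the lattice constant and cutoff values are arbitrary) counts the reciprocal-lattice vectors \(\varvec{B}\) whose kinetic energy \(|\varvec{k}+\varvec{B}|^2/2\) lies below a chosen cutoff:

```python
import itertools
import numpy as np

def count_plane_waves(a, e_cut, nmax=20):
    """Count reciprocal-lattice vectors B of a simple cubic lattice
    (constant a, atomic units) with kinetic energy |B|^2/2 <= e_cut
    at the Gamma point (k = 0)."""
    b = 2.0 * np.pi / a                      # reciprocal-lattice constant
    count = 0
    for n1, n2, n3 in itertools.product(range(-nmax, nmax + 1), repeat=3):
        B = b * np.array([n1, n2, n3])
        if 0.5 * np.dot(B, B) <= e_cut:
            count += 1
    return count

# Increasing the cutoff (or enlarging the cell) rapidly increases the basis size.
for e_cut in (5.0, 10.0, 20.0):              # hartree
    print(e_cut, count_plane_waves(a=10.0, e_cut=e_cut))
```

The number of plane waves grows roughly as the cell volume times \(E_{cut}^{3/2}\), which is why the large vacuum regions discussed below make PW calculations expensive.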

Nevertheless, PW methods have the disadvantage that very localized or inhomogeneous systems may require excessive numbers of plane waves for their representation. The PW basis can formally be applied only to systems with 3D periodicity, i.e. bulk crystals. The main problem with the use of a PW basis arises when three-dimensional objects have 2D periodicity (nanolayers) or 1D periodicity (nanotubes, nanowires).

This problem can be especially serious in surface calculations, where the requirement of periodicity in three dimensions leads to a model with slabs repeating periodically along the surface normal. A surface may have periodicity in the plane of the surface, but it cannot have periodicity perpendicular to the surface. Therefore the artificial supercell for a surface calculation is introduced. The supercell contains a crystal slab and a vacuum region [5]. The supercell is repeated over all space, so the total energy of an array of crystal slabs is calculated. To ensure that the results of the calculation accurately represent an isolated surface, the vacuum regions must be wide enough so that faces of adjacent crystal slabs do not interact across the vacuum region, and the crystal slab must be thick enough so that the two surfaces of each crystal slab do not interact through the bulk crystal [5]. If the space between the slabs is wide enough to make interactions between them negligible, the “lattice constant” normal to the surface has to be quite large and the corresponding reciprocal lattice vector quite short, resulting in a large number of plane waves within a given kinetic-energy cutoff and hence a relatively expensive calculation. The distance between the slabs needs to be even larger if adsorbate molecules are to be added to the surface, since these molecules also should not interact across the space between the slabs.

In the case of monoperiodic (1D) systems (for example, nanotubes and nanowires), one needs to use two vacuum regions in PW calculations, and both must be wide enough so that the periodically repeated 1D systems do not interact with each other.

Finally, even molecules (0D periodicity) can be studied in this fashion (three vacuum regions have to be introduced when a PW basis set is used). Again, the supercell needs to be large enough so that the interactions between the molecules are negligible. The calculation of charged molecular systems (molecular ions) requires charge compensation by the surrounding medium.

To make Hartree-Fock calculations in a PW basis practically applicable, one needs to introduce a separation of the Hartree-Fock exchange into short- and long-range parts (see the subsection on the Hartree-Fock method in a PW basis).

Localized-basis approaches to periodic systems do not suffer from these disadvantages. Very localized states, including core states if desired, can be represented by a suitable choice of atom-centered basis functions. The electronic charge density goes naturally to zero in regions of space where there are no atoms, and the lack of basis functions in such regions is not a problem unless a detailed description of energetic excitations, scattering states, or tunneling is required [1]. LCAO methods can be formulated equally easily for systems periodic in one, two or three dimensions. For surface modeling there is no need for periodicity normal to the surface; a single slab of sufficient finite thickness can be used. This facilitates the accurate treatment of molecule–surface interactions, especially when the molecules are relatively far from the surface. Also, if the basis functions are carefully constructed and optimized for the crystalline environment (see next subsection) it is possible to represent the valence-electron eigenstates with a relatively small number of basis functions: just enough to accommodate all the electrons, plus a few more functions to give the Kohn–Sham orbitals sufficient variational freedom. Thus, only a few tens of basis functions per atom are typically needed, versus hundreds per atom in typical plane-wave approaches.

The auxiliary basis sets needed to represent the charge density and, if desired, the exchange-correlation potential are also fairly modest in size. Once the basis set has been constructed and the required overlap and Hamiltonian matrices formed, the small (compared to plane wave) size of the basis can greatly reduce the computational cost of solving the Kohn–Sham equations.

Also, especially in systems with large unit cells, one can exploit sparsity in these matrices, since basis functions on distant centers will have negligible overlap and interaction (except for certain long-range Coulomb multipole interactions that can be summed according to Ewald’s convention or screened). Advanced techniques can be applied to calculate multicenter integrals in LCGTO methods for solids. For example, in the implementation of periodic boundary conditions in the MO LCAO program [11, 12] all terms contributing to the KS Hamiltonian are evaluated in real space, including the infinite Coulomb summations, which are calculated with the aid of the fast multipole method. In LCGTO methods with PBC the \(O(N)\) linear-scaling DFT calculations for large and complex systems are possible [11, 13]. For example, carbon nanostructures up to C540 were calculated [14] using DFT LCAO with the numerical atomic basis.

In the quantum chemistry of systems with 2D and 1D periodicity the LCAO methods are therefore preferable [7]. These methods are more flexible as they allow both Hartree-Fock and Kohn–Sham equations to be solved, are applicable in the correlated wavefunction approaches (for example, MP2 theory) and in Kohn–Sham theory based on hybrid exchange-correlation functionals. In comparison with plane waves, the use of all-electron LCAO calculations allows one to describe accurately the electronic distribution both in the valence and in the core region with a limited number of basis functions. The local nature of the basis allows a treatment both of finite systems and of systems with periodic boundary conditions in one, two or three dimensions. This is an advantage over plane-wave calculations of molecules, polymers or surfaces, which work by imposing artificial periodicity: the calculation must be done on, e.g. a three-dimensional array of molecules with a sufficiently large distance between them (the molecule is placed at the center of a periodic supercell). LCAO total energies can be made very precise (i.e. reliable to many places of decimals) since all integrals can be done analytically (in practice, this is only true for Hartree-Fock calculations; density-functional LCAO calculations require a numerical integration of the exchange-correlation potential that reduces the attainable precision). Having an “atomic-like” basis facilitates population analyses, the computation of properties such as projected densities of states, and the “pre-SCF alteration of orbital occupations”, which makes the convergence of SCF calculations faster. As was already noted, the LCAO basis allows an easy comparison of the results obtained for molecules and solids at the same precision level of calculations.

As is demonstrated in the next subsection, molecular atomic-like basis sets can be considered as a starting point for generating the atomic basis sets to be used in crystalline compounds. Therefore, we begin with a description of the molecular basis sets.

In molecular quantum chemistry two types of atomic-like basis sets are used: Slater-type orbitals (STO) and Gaussian-type orbitals (GTO). In fact, it is not really correct to call them “orbitals”. They are better described as basis-set functions, since they are Slater-type or Gaussian-type functions used to approximate the shapes of the orbitals defined as one-electron wavefunctions. Using the acronyms accepted in solid-state theory, it would even be possible to call the LCAO methods for crystals “all-electron or full-potential linear combination of Slater (Gaussian)-type functions” (FP LS(G)TF) methods [7], by analogy with the acronym FP LAPW (full-potential linear combination of augmented plane waves).

The mathematical form of the normalized primitive Slater-type function (STF) in atom-centered polar coordinates is

$$\begin{aligned} \chi _{nlm}^{STF}(\varvec{r})=N_{nl}(\zeta )r^{n-1}\exp (-\zeta r)Y_{lm}(\theta ,\varphi )=R_{nl}(r)Y_{lm}(\theta ,\varphi ) \end{aligned}$$
(3.2)

where \(\zeta \) is an orbital exponent, \(n\) is the principal quantum number, and \(Y_{lm}(\theta ,\varphi )\) are the spherical harmonics, depending on the angular-momentum quantum numbers \(l\) and \(m\). The Slater orbitals can be written in the form (3.2) with the orbital exponent \(\zeta =\frac{Z-s}{n^*}\) and \(n=n^*\), where \(s\) is the screening constant and \(n^*\) the effective principal quantum number of Slater’s rules.
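
As a numerical consistency check of (3.2), the short sketch below assumes the standard normalization \(N_{nl}(\zeta )=(2\zeta )^{n+1/2}/\sqrt{(2n)!}\) of the radial part (not written out explicitly above) and verifies that \(\int _0^\infty R_{nl}^2(r)r^2dr=1\):

```python
from math import factorial, sqrt, exp
from scipy.integrate import quad

def R_stf(r, n, zeta):
    """Normalized radial part of a Slater-type function, cf. Eq. (3.2):
    R_nl(r) = N_nl(zeta) * r^(n-1) * exp(-zeta*r)."""
    N = (2.0 * zeta) ** (n + 0.5) / sqrt(factorial(2 * n))
    return N * r ** (n - 1) * exp(-zeta * r)

# Radial normalization: the integral of R^2 r^2 dr should be 1.
for n, zeta in [(1, 1.0), (2, 1.6), (3, 2.5)]:
    val, _ = quad(lambda r: R_stf(r, n, zeta) ** 2 * r ** 2, 0.0, 50.0)
    print(n, zeta, round(val, 6))   # expected: 1.0
```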

The STFs with the radial part in the form (3.2) and integer \(n\) can be used as basis functions in Hartree-Fock–Roothaan calculations of atomic wavefunctions. The radial dependence of the atomic orbitals is then an expansion in radial Slater-type basis functions \(R_{lp}(r)\), whose indices are \(l\), running over \(s,p,d,f,\ldots \) symmetries, and \(p\), counting serially over the basis-set members for a given symmetry:

$$\begin{aligned} \varphi _{lmi}=\sum \limits _pC_{lip}R_{lp}(r)Y_{lm}(\theta , \varphi ) \end{aligned}$$
(3.3)

The radial expansion is independent of \(m\); all orbitals with given \(l\) and \(i\) have the same radial dependence. The orbital angular dependence \(Y_{lm}(\theta , \varphi )\) is a normalized spherical harmonic.

The Slater-type orbitals were the first to be used in semiempirical molecular quantum-chemistry calculations. Nowadays STOs are used in the DFTB method (see Sect. 3.4). Unfortunately, such functions are not suitable for fast calculations of multicenter integrals in ab-initio calculations. Gaussian-type functions (GTFs) were introduced to remedy these difficulties. GTFs are used in the basis sets of practically all modern codes for LCAO calculations of molecules. We know of two exceptions to this rule: the KS LCAO codes SIESTA and ADF, see [3]. The SIESTA code uses a numerical AO basis set, the ADF code uses Slater-type basis orbitals.

A primitive Gaussian-type function can be written in a local Cartesian coordinate system in the form

$$\begin{aligned} \chi ^{GTF}=x^ly^mz^n\exp (-\alpha r^2) \end{aligned}$$
(3.4)

where \(\alpha \) is the orbital exponent, and the \(l, m, n\) are not quantum numbers but simply integral exponents of Cartesian coordinates. Gaussian primitives (3.4) can be factorized into their Cartesian components, i.e. \(\chi ^{GTF}=\chi _x^{GTF}\chi _y^{GTF}\chi _z^{GTF}\), where each Cartesian component has the form (introducing an origin such that the Gaussian is located at position \(A\)),

$$\begin{aligned} \chi _x^{GTF}=(x-x_A)^l\exp \left( -\alpha (x-x_A)^2\right) \end{aligned}$$
(3.5)

This simplifies considerably the calculation of integrals. If we write the exponential part of an STF, \(\exp (-\alpha r)\), in Cartesian components we get \(\exp (-\alpha \sqrt{x^2+y^2+z^2})\), which is not separable in this way. Note that the absence of the STO pre-exponential factor \(r^{n-1}\) restricts single Gaussian primitives to approximating only \(1s\), \(2p\), \(3d\), etc. orbitals and not, e.g. \(2s\), \(3p\), \(4d\), etc. However, combinations of Gaussians are able to approximate the correct nodal properties of atomic orbitals if the primitives are included with different signs. The sum of the exponents of the Cartesian coordinates, \(L=l+m+n\), is used analogously to the angular-momentum quantum number for atoms to mark Gaussian primitives as \(s\)-type (\(L=0\)), \(p\)-type (\(L =1\)), \(d\)-type (\(L =2\)), \(f\)-type (\(L =3\)) etc. Of the six \(3d\) GTFs (\(x^2, xy, xz, yz, y^2, z^2\)) only five linearly independent and orthogonal atomic \(d\)-orbitals can be formed as linear combinations of Cartesian Gaussians (\(3z^2-r^2, xz, yz, x^2-y^2, xy\)); the sixth combination, \(x^2+y^2+z^2=r^2\), is a Gaussian primitive of \(s\)-type.
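
The counting argument above (six Cartesian \(d\)-type Gaussians giving five spherical \(d\) combinations plus one \(s\)-type combination) can be checked directly, since the transformation between the two sets is a square, invertible matrix; a minimal sketch (rows written only up to normalization):

```python
import numpy as np

# Columns: coefficients of the Cartesian monomials x^2, y^2, z^2, xy, xz, yz.
# Rows: the five d-type combinations 3z^2-r^2, x^2-y^2, xy, xz, yz
#       plus the s-type combination x^2+y^2+z^2 = r^2.
T = np.array([
    [-1, -1,  2, 0, 0, 0],   # 3z^2 - r^2
    [ 1, -1,  0, 0, 0, 0],   # x^2 - y^2
    [ 0,  0,  0, 1, 0, 0],   # xy
    [ 0,  0,  0, 0, 1, 0],   # xz
    [ 0,  0,  0, 0, 0, 1],   # yz
    [ 1,  1,  1, 0, 0, 0],   # x^2 + y^2 + z^2  (s-type)
])

print(np.linalg.matrix_rank(T))   # 6: the change of basis is invertible,
                                  # so exactly five independent pure-d
                                  # combinations exist.
```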

It is clear that the behavior of a Gaussian is qualitatively wrong both at the nuclei and in the long-distance limit for a Hamiltonian with point-charge nuclei and Coulomb interaction. From this point of view Slater type orbitals would be preferable.

In so-called Pople basis sets, the basis functions are made to look more like Slater-type functions by representing each STF \(\chi ^{STF}\) as a linear combination of Gaussian primitives:

$$\begin{aligned} \chi ^{STF}=\sum \limits _{i=1}^N C_i\chi ^{GTF}_i \end{aligned}$$
(3.6)

where \(C_i\) is a fixed coefficient and \(N\) is the number of Gaussian primitives used to represent the Slater-type basis function. The sums (3.6) are known as the contracted Gaussian basis set. Linear combinations of Gaussian primitives allow the representation of the electron density close to the nucleus to be improved. Recall that STF has a cusp at the nucleus, while GTF does not. By taking linear combinations of Gaussian primitives, the cusplike behavior is better reproduced.
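
A minimal sketch of the idea behind (3.6): a small number of \(s\)-type Gaussian primitives is fitted to a \(1s\) Slater function by least squares on a radial grid. The grid, weights and starting values below are arbitrary illustrative choices and will not reproduce the published STO-3G parameters:

```python
import numpy as np
from scipy.optimize import least_squares

r = np.linspace(1e-3, 8.0, 400)            # radial grid (bohr)
sto = np.exp(-r)                           # 1s Slater function, zeta = 1

def model(params):
    """Sum of N s-type Gaussian primitives c_i * exp(-a_i r^2)."""
    n = len(params) // 2
    a, c = params[:n], params[n:]
    return sum(ci * np.exp(-ai * r**2) for ai, ci in zip(a, c))

def residual(params):
    # weight by r to emphasize the valence region a little
    return r * (model(params) - sto)

x0 = np.array([0.1, 0.5, 2.5, 0.4, 0.4, 0.4])   # 3 exponents + 3 coefficients
fit = least_squares(residual, x0, bounds=(0.0, np.inf))
print("exponents:", fit.x[:3])
print("coefficients:", fit.x[3:])
```

However many primitives are used, the fitted combination still lacks the cusp at the nucleus; it only approaches it, which is exactly the point made in the text.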

The Gaussian basis set can also be used without any connection to the Slater-type orbitals: solutions of the HF equations for atoms are obtained by Roothaan’s expansion method with Gaussian-type functions, just as was done with STFs. The details of both procedures (the Gaussian expansion of Slater-type orbitals and atomic SCF calculations with a Gaussian-type basis) are well known in molecular quantum chemistry. It was found that GTF expansions require the inclusion of more primitives than STF expansions for the same accuracy in the total and one-electron SCF energies of atoms.

In molecular quantum chemistry Gaussian-type basis functions are expanded as a linear combination (contraction) of individually normalized Gaussian primitives \(g_j(\varvec{r})\) characterized by the same center and angular quantum numbers but with different exponents

$$\begin{aligned} \chi _i(\varvec{r})=\sum \limits _{j=1}^N d_jg_j(\varvec{r}), \quad g_j(\varvec{r})=g(\varvec{r};\alpha _j, l,m)=N_{lm}(\alpha _j)r^lY_{lm}(\theta , \phi )\exp (-\alpha _j r^2) \end{aligned}$$
(3.7)

where \(N\) is the length of the contraction, the \(\alpha _j\) are the contraction exponents, the \(d_j\) are contraction coefficients. Gaussian primitives can be written in terms of real spherical harmonics including a normalization constant.

If accurate solutions for an atom are desired, they can be obtained to any desired accuracy in practice by expanding the “core” basis functions in a sufficiently large number of Gaussians to ensure their correct behavior. Furthermore, properties related to the behavior of the wavefunction near nuclei can often be predicted correctly, even without an accurately “cusped” wavefunction. In most molecular applications the asymptotic behavior of the density far from the nuclei is considered much more important than the nuclear cusp [7]. The molecular wavefunction for a bound state must fall off exponentially with distance whenever the Hamiltonian contains the Coulomb electrostatic interaction between particles. However, even though an STF basis would, in principle, be capable of providing such a correct exponential decay, this occurs in practice only when the smallest exponent in the basis set is \(\zeta _{min}=\sqrt{2I_{min}}\), where \(I_{min}\) is the first ionization potential. Such a restriction on the range of exponent values, while acceptable for atomic SCF calculations, is far too restrictive for molecular and solid-state work. Some of these formal limitations have thus turned out to be of relatively little importance in practice.
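
As a quick consistency check of the condition \(\zeta _{min}=\sqrt{2I_{min}}\): for the hydrogen atom \(I_{min}=0.5\) hartree, so

$$\begin{aligned} \zeta _{min}=\sqrt{2\cdot 0.5}=1, \end{aligned}$$

which is exactly the exponent of the exact \(1s\) orbital \(\mathrm{e}^{-r}\).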

By proper choice of the \(N\), \(\alpha _j\), and \(d_j\) in the contraction (3.7) the “contracted Gaussians” may be made to assume any functional form consistent with the primitive functions used. One may therefore choose the exponents of the primitives and the contraction coefficients so as to lead to basis functions with desired properties, such as reasonable cusp-like behavior at the nucleus (e.g. approximate Slater functions or HF atomic orbitals). Integrals involving such basis functions reduce to sums of integrals involving the Gaussian primitives. Even though many primitive integrals may need to be calculated for each basis function integral, the basis function integrals will be rapidly calculated provided the method of calculating primitive integrals is fast, and the number of orbital coefficients in the wavefunction will have been considerably reduced. The exponents and contraction coefficients are normally chosen on the basis of relatively cheap atomic SCF calculations so as to give basis functions suitable for describing exact Hartree-Fock atomic orbitals. An approximate atomic basis function, whose shape is suitable for physical and chemical reasons, is thus expanded in a set of primitive Gaussians, whose mathematical properties are attractive from a computational point of view. Note that the physical motivation for this procedure is that, while many primitive Gaussian functions may be required to provide an acceptable representation of an atomic orbital, the relative weights of many of these primitives are almost unchanged when the atoms are formed into molecules or crystals. The relative weights of the primitives can therefore be fixed from a previous calculation and only the overall scale factor for this contracted Gaussian function need be determined in the extended calculation. It is clear that contraction will in general significantly reduce the number of basis functions.

For molecular basis sets of Gaussian-type functions (GTF) general acronyms and notations are used that are well known in molecular quantum chemistry.

Minimal basis sets are constructed by using one Slater-type-orbital basis function of each type occupied in the separated atoms that comprise a molecule. If at least one \(p\)-type, \(d\)-type or \(f\)-type orbital is occupied in the atom, then the complete set (3 \(p\)-type, 5 \(d\)-type, 7 \(f\)-type) of functions must be included in the basis set. The simplest of these basis sets is that designated STO-3G, an acronym for Slater-type orbitals simulated by 3 primitive Gaussians added together. The coefficients of the Gaussian functions are adjusted to give as good a fit as possible to the Slater orbitals.

Only one best fit to a given type of Slater orbital is possible for a given number of Gaussian functions. Hence, all STO-3G basis sets for any row of the periodic table are the same, except for the exponents of the Gaussian functions. The exponents are expressed as scale factors, the squares of which are used as multipliers of the adjusted exponents in the original best-fit Gaussian functions. In this way, the ratios of exponents remain the same while the effective exponent of each orbital can be varied [7]. The STO-3G basis set, and other minimal basis sets, usually do reasonably well at reproducing geometries, but only for simple organic molecules. The minimal basis sets do not allow alteration of the basis orbitals in response to a changing molecular environment and therefore make comparisons between charged and uncharged species unreliable. Anisotropic environments are another problem for minimal basis sets [7].

Because the core electrons of an atom are less affected by the chemical environment than the valence electrons, the core electrons can be treated with a minimal basis set while the valence electrons are treated with a larger basis set. This is known as a split-valence basis set. In these bases, the valence AOs are split into two parts: an inner, compact orbital and an outer, more diffuse one. The coefficients of these two kinds of orbitals can be varied independently during construction of the MOs. Thus, the size of the AO can be varied between the limits set by the inner and outer functions. Basis sets that similarly split the core orbitals are called double zeta, DZ (implying two different exponents) or triple zeta, TZ (implying three different exponents). For example, the 3-21G notation of a split-valence basis set means that the core orbitals are represented by three Gaussians, whereas the inner and outer valence orbitals consist of two and one Gaussians, respectively. If we were to name bases consistently, of course, this one would be labeled STO-3-21G, but the STO is customarily omitted from all split-valence descriptors. Two other split-valence bases are the 6-31G and the 6-311G. In both, the core orbitals are represented by six Gaussians. The 6-311G is a triply split valence basis, with an inner valence orbital represented by three Gaussians, and middle and outer orbitals represented as single Gaussians. The triple split improves the description of the outer valence region.

Further improvement of the basis set is achieved by including polarization functions. For example, this is done by adding \(d\)-orbitals to the basis of all atoms having no \(d\)-electrons. For typical organic compounds these are not used in bond formation, unlike the \(d\)-orbitals of transition metals; they are used to allow a shift of the center of an orbital away from the position of the nucleus. For example, a \(p\)-orbital on carbon can be polarized away from the nucleus by mixing into it a \(d\)-orbital of lower symmetry. One obvious place where this can improve results is in the modeling of small rings; compounds of second-row elements also are more accurately described by the inclusion of polarization. The presence of polarization functions is indicated in the Pople notation by appending an asterisk to the set designator. Thus, 3-21G* implies the previously described split-valence basis with polarization added. Typically, six \(d\)-functions (\(x^2, y^2, z^2, xy, xz\), and \(yz\)), equivalent to five \(d\)-orbitals and one \(s\), are used (for computational convenience). Most programs can also use the five real \(d\)-orbitals. An alternative description of this kind of basis is DZP: double zeta plus polarization. A second asterisk, as in the 6-31G** basis set, implies the addition of a set of \(p\)-orbitals to each hydrogen to provide for their polarization. Again, an alternative notation exists: DZ2P (double zeta, two polarization). An asterisk in parentheses signals that polarization functions are added only to second-row elements. Another alternative to the asterisk for specifying polarization functions is (\(d\)), placed after the G.

To provide more accurate descriptions of anions, or neutral molecules with unshared pairs, basis sets may be augmented with so-called diffuse functions. These are intended to improve the basis set at large distances from the nuclei, thus better describing the barely bound electrons of anions. Processes that involve changes in the number of unshared pairs, such as protonation, are better modeled if diffuse functions are included.

The augmentation takes the form of a single set of very diffuse (exponents from 0.1 to 0.01) \(s\) and \(p\) orbitals. The presence of diffuse functions is symbolized by the addition of a plus sign, +, to the basis set designator: 6-31+G. (Since these are \(s\)- and \(p\)-orbitals, the symbol goes before the G.)

Again, a second + implies diffuse functions added to hydrogens; however, little improvement in results is noted for this addition unless the system under investigation includes hydride ions.

All of the codes for molecular ab-initio calculations offer at least one set of diffuse functions. Still more extensive basis sets exist, and are described by more complicated notation.

Let us summarize the notations used for molecular GTO basis sets. Basis sets denoted by the general nomenclature N-M1G or N-M11G, where N and M are integers, are called Pople basis sets. The first, N-M1G, is a split-valence double-zeta basis set, while the second, N-M11G, is a split-valence triple-zeta basis set. The integers N and M in the basis-set name give the number of Gaussian primitives used. For example, in the split-valence double-zeta basis set 6-31G for a carbon atom, the first number (N \(=\) 6) represents the number of Gaussian primitives used to construct the core-orbital basis function (the \(1s\) function). The next two numbers (3 and 1) represent the valence orbitals \(2s\), \(2s'\), \(2p\)(3) and \(2p'\)(3). The first number after the dash in the basis-set name (3 in this case) indicates the number of Gaussian primitives used to construct the \(2s\) and \(2p\)(3) basis functions. The second number after the dash (1 in this case) gives the number of Gaussian primitives used to construct the \(2s'\) and \(2p'\)(3) basis functions. There are two common methods for designating that polarization functions are included in a basis set. The first method is to use * or ** after the Pople basis-set name; for example, 6-31G* or 6-31G**. The single * means that one set of \(d\)-type polarization functions is added to each nonhydrogen atom in the molecule. The double ** means that one set of \(d\)-type polarization functions is added to nonhydrogens and one set of \(p\)-type polarization functions is added to hydrogens. The second method for including polarization functions in the basis-set designation is more general. It is indicated by the notation (l1,l2) following the Pople basis-set name; for example, 6-31G(\(d\)) or 6-31G(\(d,p\)). The first label indicates the polarization functions added to nonhydrogen atoms in the molecule. The notations 6-31G(\(d\)) and 6-31G(\(d,p\)) mean that one set of \(d\)-type polarization functions is added to all nonhydrogens. The notation 6-311G(2\(df\)) means that two sets of \(d\)-type and one set of \(f\)-type polarization functions are added to nonhydrogens. The second label in the notation (l1,l2) indicates the polarization functions added to hydrogen atoms. The basis set 6-31G(\(d\)) has no polarization functions added to hydrogen, while the basis 6-31G(\(d,p\)) has one set of \(p\)-type polarization functions added to hydrogen atoms. The basis set 6-311G(2\(df\),2\(pd\)) has two sets of \(p\)-type and one set of \(d\)-type polarization functions added to hydrogen atoms. The use of diffuse functions in a Pople basis set is indicated by the notation + or ++. The + notation, as in 6-31+G(\(d\)), means that one set of \(sp\)-type diffuse basis functions is added to nonhydrogen atoms (four diffuse basis functions per atom). The ++ notation, as in 6-31++G(\(d\)), means that one set of \(sp\)-type diffuse functions is added to each nonhydrogen atom and one \(s\)-type diffuse function is added to hydrogen atoms.

In Table 3.1 we give a list of basis sets of orbitals that are commonly used in modern MO calculations. The H and C atomic orbitals that are included are listed. The first set contains “Gaussian-like orbitals”. Apart from the STO-3G basis set, most Gaussian basis sets are “split-valence”, which means that they use different numbers of Gaussian functions to describe core and valence atomic orbitals. Thus, a 6-31G basis set uses 6 Gaussians for the core orbitals and two sets of Gaussians for the valence orbitals, one with 3 Gaussians and another with one. Adding more Gaussians allows more flexibility in the basis set so as to give a better approximation of the true orbitals. Therefore, although some of the basis sets incorporate the same AOs, the larger ones provide a much better description of them. For each basis set, polarization (*) and diffuse functions (\(+\)) can also be added, which add flexibility to the basis set.

Table 3.1 The basis sets and orbitals they include
Table 3.2 Correlation-consistent polarized valence X zeta (XZ) basis sets

Alternative basis sets that are commonly used are the Dunning correlation-consistent polarized valence X-zeta basis sets, denoted cc-pVXZ, see Table 3.2. Dunning pointed out that basis sets optimized at the Hartree-Fock level might not be ideal for correlated computations [15]. The “correlation-consistent” basis sets are optimized using correlated wavefunctions, and cc-pVXZ means a Dunning correlation-consistent, polarized valence, X-zeta basis; X \(=\) D,T,Q,5,6,7. In particular, cc-pVDZ for the C atom consists of \(3s2p1d\), cc-pVTZ of \(4s3p2d1f\) and cc-pVQZ of \(5s4p3d2f1g\) functions. The advantage of these sets is that the increase of the basis-set size is systematic. In fact, if energies are calculated with the cc-pVDZ, cc-pVTZ and cc-pVQZ basis sets, there is an analytical function that can be used to extrapolate to the complete basis-set limit. Unfortunately, cc-pVQZ calculations are not commonly made. Moreover, the cc-pVXZ functions are more costly to integrate than the Pople-type Gaussian sets described above, and these calculations take longer even for comparably sized basis sets. Dunning basis sets already contain polarization functions, but they can be augmented with diffuse functions.
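
The extrapolation mentioned above can take several forms; one commonly used choice (an assumption here, since the text does not specify which analytical function is meant) is the two-point inverse-cube formula \(E_X=E_{CBS}+AX^{-3}\) for the correlation energy, sketched below with hypothetical input energies:

```python
def cbs_extrapolate(e_x, e_y, x, y):
    """Two-point complete-basis-set extrapolation assuming
    E(X) = E_CBS + A / X**3 (a common, but not the only, choice)."""
    a = (e_x - e_y) / (x**-3 - y**-3)
    return e_x - a * x**-3

# Hypothetical correlation energies (hartree) for cc-pVTZ (X=3) and cc-pVQZ (X=4):
print(cbs_extrapolate(-0.250, -0.265, 3, 4))
```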

The EMSL (Environmental Molecular Sciences Laboratory) basis-set library [16] supplies a wide selection of atomic basis sets optimized for molecules. This library allows the extraction of Gaussian basis sets, and any related effective core potentials (the effective core potentials are considered in the next sections), from the Molecular Science Research Center’s Basis Set Library. A user may request that the basis set be formatted appropriately for a wide variety of popular molecular electronic-structure packages. In addition to the exponents and contraction coefficients that define the basis set, the user can obtain descriptive data that include the overall philosophy behind the basis, literature citations, the angular-momentum composition of the basis and many other pieces of information.

1.2 Gaussian Basis Sets for Solid State Calculations

Molecular AO basis sets can formally be used in periodic LCAO calculations but their adequacy must be carefully checked.

In periodic systems the basis sets are chosen in such a way that they satisfy the Bloch theorem. Let a finite number \(m_A\) of contracted GTFs be attributed to the atom \(A\) with coordinate \(\varvec{r}_A\) in the reference unit cell. The same GTFs are then formally associated with all translationally equivalent atoms in the crystal occupying positions \(\varvec{r}_A+\varvec{a}_n\) (\(\varvec{a}_n\) is a direct-lattice translation vector). For the main region of the crystal, consisting of \(N\) primitive unit cells, there are \(N\sum _A m_A\) Gaussian-type Bloch functions (GTBFs), constructed according to

$$\begin{aligned} \chi _{\varvec{k} i}(\varvec{r})=\frac{1}{\sqrt{N}}\sum \limits _{\varvec{a}_n}\exp (\mathrm{i}\varvec{k}\varvec{a}_n)\chi _i(\varvec{r}-\varvec{r}_A-\varvec{a}_n) \end{aligned}$$
(3.8)

The number \(M\) of GTBFs for each wavevector \(\varvec{k}\) equals \(M=\sum _{A}m_{A}\), where the summation is made over the atoms of the reference cell. Thus, in solids the basis functions are modulated over the infinite lattice: any attempt to use large uncontracted molecular or atomic basis sets with very diffuse functions can result in a waste of computational resources [17]. Therefore the exponents and contraction coefficients in molecular and periodic systems are generally rather different, and with some exceptions, such as molecular crystals and certain covalent solids, molecular basis sets are not directly transferable to the study of crystalline solids.
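
A minimal numerical illustration of the Bloch sum (3.8) for a one-dimensional chain of \(s\)-type Gaussians; the lattice constant, exponent and truncation of the direct-lattice sum are arbitrary illustrative choices:

```python
import numpy as np

a = 4.0          # lattice constant (bohr)
alpha = 0.5      # Gaussian exponent
n_cells = 20     # truncation of the direct-lattice sum

def bloch_gtf(x, k):
    """chi_k(x) = (1/sqrt(N)) * sum_n exp(i k a n) * exp(-alpha (x - a n)^2),
    cf. Eq. (3.8) in one dimension."""
    n = np.arange(-n_cells, n_cells + 1)
    phases = np.exp(1j * k * a * n)
    gauss = np.exp(-alpha * (x[:, None] - a * n[None, :]) ** 2)
    return (gauss * phases).sum(axis=1) / np.sqrt(len(n))

x = np.linspace(-8.0, 8.0, 5)
k = np.pi / a                       # zone-boundary point
# Bloch property: chi_k(x + a) = exp(i k a) * chi_k(x)
print(np.allclose(bloch_gtf(x + a, k), np.exp(1j * k * a) * bloch_gtf(x, k)))
```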

As in the case of molecules, basis functions in solids are grouped into shells. In general, a shell contains all functions characterized by the same \(n\) and \(l\) quantum numbers (e.g. all the different \(d\)-functions in a \(3d\) shell); this allows the partitioning of the total charge density into “shell charge distributions” and is useful in the selection of bielectronic integrals and in the evaluation of long-range interactions. A feature of the contraction schemes originally used in the basis sets of the Pople type (and often useful in calculations with the CRYSTAL code [17]) is the additional grouping into shells of basis functions with only the same principal quantum number; e.g. a \(2sp\) shell, in which both \(2s\) and \(2p\) functions have the same set of exponents \(\alpha _j\) but different contraction coefficients \(d_j\). This procedure reduces the number of auxiliary functions to be calculated in the evaluation of electron integrals. Note that if the basis set is restricted to the \(s\), \(p\) and \(d\) basis functions, only \(sp\) shells can be formed in this way. In certain circumstances it may actually represent an important constraint on the form of the basis functions. For relatively small calculations where the time and storage limitations are not an important factor, some consideration should be given to describing the \(s\) and \(p\) functions with separate sets of exponents. Most standard molecular codes use what is known as a segmented contraction scheme, in which the transformation from the larger primitive set to the smaller contracted set is restricted in such a way that each Gaussian primitive \(g_j(\varvec{r})\) contributes to exactly one contracted GTF. In contrast, the general contraction scheme makes no such assumptions, and allows each Gaussian primitive to contribute to several contracted GTFs. A considerable advantage of the general scheme is that the contracted GTFs reproduce exactly the desired combinations of primitive functions. For example, if an atomic SCF calculation is used to define the contraction coefficients in a general contraction, the resulting minimal basis will reproduce the SCF energy obtained in the primitive basis. This is not the case with segmented contractions. There are other advantages of the general contraction: for example, it is possible to contract inner-shell orbitals to single functions with no error in the atomic energy, making calculations on heavy elements much easier. Another advantage is a conceptual one: using a general contraction, it is possible to perform calculations in which the one-particle space is a set of atomic orbitals, a true LCAO scheme, rather than being a segmented grouping of a somewhat arbitrary expansion basis. The MOs or COs can then be analyzed very simply, just as with the original qualitative MO LCAO or CO LCAO approach, but in terms of “exact AOs” rather than relatively crude approximations to them.
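
The difference between the segmented and general contraction schemes can be pictured through the shape of the primitive-to-contracted transformation matrix; in the schematic sketch below the coefficients are placeholders, not an actual basis set:

```python
import numpy as np

# Rows: 6 primitives; columns: 2 contracted functions.
# Segmented contraction: each primitive contributes to exactly one
# contracted GTF (at most one nonzero entry per row).
segmented = np.array([
    [0.1, 0.0],
    [0.4, 0.0],
    [0.5, 0.0],
    [0.0, 0.6],
    [0.0, 0.3],
    [0.0, 0.1],
])

# General contraction: every primitive may contribute to every
# contracted GTF (e.g. coefficients taken from an atomic SCF calculation).
general = np.array([
    [0.1,  0.02],
    [0.4, -0.05],
    [0.5, -0.10],
    [0.2,  0.60],
    [0.1,  0.30],
    [0.05, 0.10],
])

print((np.count_nonzero(segmented, axis=1) <= 1).all())  # True for segmented
print((np.count_nonzero(general,  axis=1) <= 1).all())   # False for general
```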

The computational aspects of the use of GTFs in calculations of solids are discussed in [7]. An important reason for the usefulness of a Gaussian basis set is embodied in the Gaussian product theorem (GPT), which in its simplest form states that the product of two primitive Gaussian functions with exponents \(\alpha \) and \(\beta \), located at the centers \(\varvec{A}\) and \(\varvec{B}\), is itself a primitive Gaussian with exponent \(\gamma =\alpha +\beta \), multiplied by a constant factor \(F\), located at a point \(\varvec{C}\) on the line segment joining \(\varvec{A}\) and \(\varvec{B}\), where \(\varvec{C}=\frac{\alpha \varvec{A}+\beta \varvec{B}}{\gamma }\) and \(F=\exp \left( -\frac{\alpha \beta }{\gamma }(\varvec{A}-\varvec{B})^2\right) \).
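
The GPT as stated above is easy to verify numerically; a minimal sketch for two \(s\)-type primitives in one dimension:

```python
import numpy as np

alpha, beta = 0.8, 1.7          # exponents
A, B = -0.4, 1.1                # centers
gamma = alpha + beta
C = (alpha * A + beta * B) / gamma
F = np.exp(-alpha * beta / gamma * (A - B) ** 2)

x = np.linspace(-5.0, 5.0, 7)
product = np.exp(-alpha * (x - A) ** 2) * np.exp(-beta * (x - B) ** 2)
single = F * np.exp(-gamma * (x - C) ** 2)
print(np.allclose(product, single))   # True: the product is one Gaussian at C
```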

The product of two polynomial GTF, of degree \(\mu \) and \(\nu \) and located at the points \(\varvec{A}\) and \(\varvec{B}\) is therefore another polynomial GTF located at \(\varvec{C}\) of degree \(\mu +\nu \) in \(x_C\), \(y_C\) and \(z_C\), which can be expressed as a short expansion of one-center Gaussians:

$$\begin{aligned} \chi _{_{Ax}}(x)\chi _{_{Bx}}(x)=\sum \limits _{i=0}^{\mu +\nu }C_i^{\mu +\nu }\varphi _{Ci}(x-x_C) \end{aligned}$$
(3.9)

where \(\varphi _{Ci}(x)=x^i\exp (-\alpha _p x^2)\), \(\alpha _p=\alpha +\beta \), and \(x_C=\frac{\alpha x_{A}+\beta x_B}{\alpha +\beta }\).

The product of two Gaussians that are functions of the coordinates of the same electron is referred to as an overlap distribution, and all the integrals that must be calculated involve at least one such overlap distribution. The most important consequence of the GPT is that all four-center two-electron integrals can be expressed in terms of two-center quantities. Even though the four-center bielectronic integrals can be written in terms of two-center quantities, the cost of evaluating them still scales nominally as \(N^4\), where \(N\) is the number of functions in the expansion. This scaling must be reduced in order to treat large systems. One way of doing this that is used in the CRYSTAL code is the method of prescreening, where, rather than attempting to calculate the integrals more efficiently, one seeks, where possible, to avoid their evaluation altogether. The expression for an integral over primitive Gaussians can be formally written as

$$\begin{aligned} (ab|cd)=S_{ab}S_{cd}T_{abcd} \end{aligned}$$
(3.10)

where \(S_{ab}\) is a radial overlap between the functions \(\chi _a\) and \(\chi _b\), and \(T_{abcd}\) is a slowly varying angular factor. In many situations the product \(S_{ab}S_{cd}\) thus constitutes a good estimate of the magnitude of the integral and is used to screen out small integrals. In order to estimate these overlaps quickly, a single, normalized \(s\)-type Gaussian called an adjoined Gaussian is associated with each shell, whose exponent \(\alpha \) is the smallest of the exponents in the shell contraction. This function thus reproduces approximately the absolute value of the corresponding AOs at intermediate and long range. The adjoined Gaussian is used in fast algorithms for estimating overlaps, on the basis of which integrals are either evaluated exactly, approximately, or not at all. The level of approximation is user-definable through a set of tolerances given in the input. Such algorithms, and a consideration of the crystalline symmetry, mean that the integrals part of the CRYSTAL code scales between \(N\) and \(N^2\), depending on the size of the system. The most unpleasant scaling in this code is thus that of the SCF part which, since it involves the diagonalization of the Fock matrix, scales approximately as \(N^3\).
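
A minimal sketch of overlap-based prescreening in the spirit of (3.10): two-electron integrals over shell quadruplets are evaluated only when the product of the overlap estimates exceeds a tolerance. The overlap formula below is the standard one for normalized \(s\)-type primitives, and the branch where a real code would call its integral routine is left as a counter:

```python
import numpy as np

def s_overlap(alpha, A, beta, B):
    """Overlap of two normalized s-type Gaussian primitives."""
    R2 = np.sum((np.asarray(A) - np.asarray(B)) ** 2)
    mu = alpha * beta / (alpha + beta)
    return (4.0 * alpha * beta / (alpha + beta) ** 2) ** 0.75 * np.exp(-mu * R2)

def screened_integrals(shells, tol=1e-8):
    """Loop over shell quadruplets (a,b|c,d); skip those whose
    S_ab * S_cd estimate falls below the tolerance, cf. Eq. (3.10)."""
    n = len(shells)
    kept, skipped = 0, 0
    for a in range(n):
        for b in range(n):
            S_ab = s_overlap(shells[a][0], shells[a][1], shells[b][0], shells[b][1])
            for c in range(n):
                for d in range(n):
                    S_cd = s_overlap(shells[c][0], shells[c][1], shells[d][0], shells[d][1])
                    if S_ab * S_cd < tol:
                        skipped += 1          # integral not evaluated at all
                    else:
                        kept += 1             # here a real code would call
                                              # its integral routine
    return kept, skipped

# Three s shells: two close together and one far away (exponent, center in bohr).
shells = [(0.8, (0.0, 0.0, 0.0)), (1.2, (1.0, 0.0, 0.0)), (0.9, (25.0, 0.0, 0.0))]
print(screened_integrals(shells))
```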

The role of diffuse basis-set functions requires a special consideration in crystalline systems. Very diffuse functions can yield numerical instabilities and involve a risk of linear-dependence catastrophes [7]. Furthermore, due to the truncation criteria of the infinite sums, based on the overlap, when the exponents of the primitive Gaussians decrease, the number of integrals to be calculated increases very rapidly. Overly extended basis sets are not needed in periodic calculations because the complete basis-set limit is reached more quickly than in molecular calculations. Furthermore, the risk of linear-dependence problems increases. The choice of the AO basis set is a compromise between accuracy and cost. As accuracy must be the main goal of ab-initio calculations, good-quality basis sets should always be used in spite of their computational cost, to avoid producing meaningless numbers.

The choice of the basis set (BS) is one of the critical points, due to the large variety of chemical bonding that can be found in a periodic system. For example, carbon can be involved in covalent bonds (polyacetylene, diamond) as well as in strongly ionic situations (\(\mathrm Be_2C\), where the Mulliken charge of carbon is close to \(-\)4). Some general principles of the basis-set choice for periodic systems have been formulated [7, 17], and we briefly reproduce them here. The great deal of experience gained in molecular computational chemistry can be used in the selection of basis sets for studies of crystalline solids. However, molecular quantum chemists do not, in general, optimize basis sets by varying the exponents or contraction coefficients to minimize the energy. Rather, there is a hierarchy of basis sets with perceived qualities, and for a difficult problem where accuracy is important, one would use a “good-quality” standard basis set from a library without modification. In crystalline systems, by contrast, basis-set optimization is usually necessary, essentially for two reasons. First, there is a much larger variety of binding than in molecules, and basis sets are thus less transferable. Secondly, hierarchical libraries of basis sets comparable to those available for molecules do not really exist for solids. For certain types of compounds, such as molecular crystals or many covalent materials, the molecular sets can sometimes be used largely unmodified, but this has to be done carefully. However, for strongly ionic crystals and metals the basis sets, particularly the valence states, need to be redefined completely. In essentially all cases, the core states may be described using the solutions of atomic calculations, as even in the presence of strong crystal fields the core states are barely perturbed and may be described by the linear variation parameters in the SCF calculation.

Redefining basis sets in this way is obviously time consuming and even more obviously rather boring, and so over the recent years various workers involved with the CRYSTAL code have contributed to the effort to develop libraries of basis sets to be made available on the Internet. The URL of the official site is: http://www.crystal.unito.it/basis-sets.php. We recommend the reader to use the information given on this site. The site shows a periodic table. Clicking on the symbol for the required element will reveal a text file containing various different basis sets that may have been used in different materials containing that element type. Accompanying each basis set is a list of authors, a list of materials where the set has been used, references to publications and hints on optimization where relevant. This table is obviously not complete (in particular, lanthanides and actinides are practically absent as the calculations with \(f\)-electrons are not included even in the CRYSTAL06 code). Additional information can be found on the site of the Cambridge basis set library: http://www.tcm.phy.cam.ac.uk/.

Following [7], we discuss the optimization strategies for the basis sets given in the libraries mentioned, as well as the adaptation of molecular basis sets to various types of solids. First, a number of general principles are given that should be taken into account when choosing a basis set for a periodic system.

The diffuse basis functions are used in atoms and molecules to describe the tails of the wavefunction, which are poorly described by the long-range decay of the Gaussian function. In periodic systems the cost of HF/DFT calculations can increase substantially when diffuse basis functions are included (in the silicon and diamond crystals, for example, the number of bielectronic integrals can be increased by a factor of 10 simply by changing the exponent of the most diffuse single Gaussian from 0.168 to 0.078 (Si) and from 0.296 to 0.176 (C) [7]). Fortunately, in crystalline compounds, in contrast to molecules, and especially in nonmetallic systems, the large overlap between neighbors in all directions drastically reduces the contribution of low-exponent Gaussians to the wavefunction. This has the consequence that a small “split-valence” basis set such as 6-21G is closer to the Hartree-Fock limit in crystals than in molecules.

The number of primitives is an important feature of the basis set. A typical basis set for all electron calculations includes “core functions” with higher exponents and a relatively large number of primitives—these will have a large weight in the expansion of the core states. The “valence functions” with a large weight in the outer orbitals will have lower exponents and contractions of only a very few primitives. It is possible to get away with putting a lot of primitives in the core since core states have very little overlap with the neighboring atoms. The use of many primitives in the valence shells would add significantly to the cost of the calculation, but in many cases it is necessary.

There are several ways to improve a basis set [7]: (1) reoptimize the more diffuse exponents (and contraction coefficients if necessary); (2) decontract, i.e. convert the more diffuse contractions into single Gaussian primitives; (3) convert \(sp\) functions into separate \(s\) and \(p\) functions; (4) add polarization functions if not already present; (5) add more primitives (watch out for linear-dependence problems); (6) use a better starting point for the basis set. Optimization in this sense means varying an appropriate subset of the basis-set parameters until the energy is minimized. In principle, this is a reasonably complex multidimensional minimization, but there are various standard shell scripts available. Two of them can be downloaded from the Internet: the BILLY code (http://www.tcm.phy.cam.ac.uk/mdt26/downloads/billy.tar.gz) and the LoptCG code (http://www.crystal.unito.it/LoptCG/LoptCG.html).

The adequacy of the starting molecular basis sets depends on the type of crystalline compound.

To describe covalent crystals, the small molecular split-valence basis sets can be used with confidence and essentially without modification. It is enough to reoptimize the exponent of the most diffuse shell, which produces a slightly improved basis while reducing the cost of the calculation. That said, 6-21G* is not really all that good, and a larger, better basis set with more variational freedom is quite easy to make for these cases (see the web libraries).

For fully-ionic crystals (like alkali halides, LiH or MgO) with an almost completely empty cation valence shell it often proves convenient to use a basis set containing only “core” functions plus an additional sp shell with a relatively high exponent. It is usually difficult, and often impossible, to optimize the exponents of functions that only have appreciable weight in almost empty orbitals. Anions present a different problem. Reference to isolated ion solutions is possible only for halides, because in such cases the ions are stable even at the Hartree-Fock level. For other anions, which are stabilized by the crystalline field, the basis set must be redesigned with reference to the crystalline environment. Consider, for example, the optimization of the oxygen basis set in \(\mathrm Li_2O\) [7]. The difficulty is to allow the valence distribution to relax in the presence of two more electrons. We can begin from a standard STO-6G basis set, i.e. six contracted primitive Gaussians for the \(1s\) shell, and six more to describe the \(2sp\) shell. First, two more Gaussians were introduced into the \(1s\) contraction, in order to improve the virial coefficient and total energy. The two outer Gaussians of the valence \(sp\) shell were then removed from the contraction and allowed to vary independently. The exponents of the two outer independent Gaussians and the coefficients of the four contracted ones were optimized in \(\mathrm Li_2O\). The best outer exponents of the ion were found to be 0.45 and 0.15, and are therefore considerably more diffuse than the neutral isolated atom, where the best exponents are 0.54 and 0.24. The rest of the oxygen valence shell is unchanged with respect to the atomic situation. The introduction of \(d\) functions in the oxygen basis set gives only a minor improvement in the energy, with a population of 0.02 electrons/atom/cell (\(d\) functions may be important in the calculation of certain properties, however). Thus, for anions, reoptimization of the most diffuse valence shells is mandatory when starting from a standard basis set.

The majority of crystals can be classified as semi-ionic (with chemical bonding intermediate between the covalent and purely ionic limits). For such crystals the adequacy of the selected basis sets must be carefully tested, as is discussed in [7] for the semi-ionic compounds \(\mathrm SiO_2\) (quartz) and \(\mathrm Al_2O_3\) (corundum), for example. The exponents of the outer shell for the two cations (Si and Al) used in molecular calculations prove to be too diffuse. For the Si atom in quartz, reoptimization in the bulk gives \(\alpha =0.15\) (instead of the molecular value 0.09). Corundum is more ionic than quartz, and in this case it is better to eliminate the most diffuse valence shell of Al and to use two Gaussians of the inner valence shells as independent functions (\(\alpha =0.94\) and \(\alpha = 0.3\), respectively).

For metals, very diffuse Gaussians are required to reproduce the nearly uniform density, so it has often been argued that plane waves are a more appropriate basis for these systems. It is generally impossible to optimize an atomic-like basis set in Hartree-Fock calculations of metallic systems. However, Gaussian DFT studies indicate that GTFs are able to provide a reliable and efficient description of simple metallic systems (see, for example, the metallic lithium calculations [18]). In this way the effects of the basis set and of the Hamiltonian were separated.

UHF and hybrid DFT calculations (see Sects. 3.2 and 3.3) for strongly correlated transition-element magnetic compounds require reasonably good basis sets for the transition elements. These are not so widely available even to molecular quantum chemists since most of the effort in developing molecular GTF basis sets has been for the first- and second-row atoms. One reason for this may be that molecules containing transition metal atoms tend to be very badly described at the Hartree-Fock level. Molecular bonds tend to have a fairly high degree of “covalency” and the existence of partially-occupied \(d\) states leads to a great many nearly degenerate levels, and thus to a large “static correlation” (i.e. the weight of the HF determinant in a CI expansion would be small, and a multideterminant treatment is more appropriate). Basis sets to describe the correlation using quantum chemistry correlated wavefunction techniques need to be much richer than those for systems well described at the HF level since they need to treat all of the unoccupied levels. It may seem surprising that single-determinant HF could be so successful in periodic crystalline magnetic insulators containing transition elements, but this is an important characteristic of these ionic materials. The highly symmetric environment and long-range Coulomb forces tend to separate the orbitals into well-defined subsets with a significant gap between the occupied and unoccupied states. Hence, the ground state of NiO (for example) is rather well described by a single determinant. In this sense, a strongly correlated magnetic insulator is in many ways a “simpler system” than many molecules. The success of UHF calculations in these materials (and also hybrid DFT schemes) is now well known.

Molecular GTF basis sets for transition elements have been reoptimized in the solid state and are available on the Torino and Cambridge Gaussian basis set library web sites referred to earlier.

There exist a number of different algorithms for the minimization of a function of many variables [19]. A comparative study of their efficiency for basis-set optimization in crystals was performed in [20]. As an example of LCAO basis-set optimization in TiO\(_2\) (anatase) calculations we mention the study [21]. To optimize the BS, the derivative-free minimization method developed by Powell [22], often called “the method of conjugate directions”, was used; it requires no total-energy derivatives and is believed to be one of the most efficient direct minimization methods. The program package OPTBAS [20], interfaced with the CRYSTAL09 LCAO computational code [17], was applied for the BS optimization. BS exponential parameters less than 0.1 were excluded from the AOs and a bound-constrained optimization was performed for the remaining exponential parameters with a 0.1 lower bound. The diffuse exponents of the valence \(s\), \(p\) and \(d\)-orbitals were optimized for the stable anatase phase of bulk titania. Its atomic and electronic properties were reproduced in good agreement with experiment (the experimental values are given in brackets): the lattice parameters a \(=\) 3.784 Å (3.782 Å) and c \(=\) 9.508 Å (9.502 Å), and the dimensionless parameter for the relative position of the oxygen atom u \(=\) 0.2074 (0.2080); the bandgap is reproduced less well, being overestimated: 4.0 versus 3.2 eV. In any case, these results for bulk titania (anatase) agree with the experimental data better than those given in [23] for both plane-wave (PW) and LCAO calculations with different exchange-correlation potentials.
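The workflow just described can be sketched as a small driver script. In the sketch below, which is only an illustration and not the actual OPTBAS implementation, `crystal_total_energy` is a hypothetical placeholder for a routine that writes the current exponents into the input of an external LCAO code (such as CRYSTAL), runs the SCF calculation and returns the total energy; here it is replaced by a smooth toy surrogate so that the script runs as is. The 0.1 lower bound on the exponents is enforced by a simple variable transformation, so that the derivative-free Powell minimizer can be used without explicit bound support.

```python
# Sketch of a derivative-free (Powell) optimization of diffuse valence exponents.
import numpy as np
from scipy.optimize import minimize

LOWER_BOUND = 0.1   # exponents below 0.1 are excluded from the variational space

def crystal_total_energy(exponents):
    """Placeholder for a call to an external LCAO code (write the basis input,
    run the SCF calculation, parse the total energy). A smooth toy surrogate
    with a minimum at (0.5, 0.3, 0.2) is used here so the script is runnable."""
    target = np.array([0.5, 0.3, 0.2])
    return -100.0 + np.sum((exponents - target) ** 2)

def to_exponents(x):
    # variable transformation enforcing alpha >= LOWER_BOUND
    return LOWER_BOUND + x ** 2

def objective(x):
    return crystal_total_energy(to_exponents(x))

alpha0 = np.array([0.60, 0.45, 0.35])   # starting guesses for diffuse s, p, d exponents
x0 = np.sqrt(alpha0 - LOWER_BOUND)

res = minimize(objective, x0, method="Powell", options={"xtol": 1e-4, "ftol": 1e-8})
print("optimized exponents:", to_exponents(res.x))
print("final total energy :", res.fun)
```

In a real application every evaluation of the objective function costs one full periodic SCF calculation, which is why a minimizer needing as few function evaluations as possible, such as Powell’s method, is preferred.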

In the final part of this section we discuss a rather serious problem associated with Gaussian basis sets: the basis-set superposition error (BSSE). A common response to this problem is to ignore it, since it vanishes in the limit of a complete basis set; to approach this limit, however, one needs calculations that are seldom performed. The problem of BSSE is a simple one: in a system comprising interacting fragments A and B, the fact that in practice the basis sets on A and B are incomplete means that the fragment energy of A will necessarily be improved by the basis functions on B, irrespective of whether there is any genuine binding interaction in the compound system or not. The improvement in the fragment energies lowers the energy of the combined system, giving a spurious increase in the binding energy. It is often stated that BSSE is an effect that one needs to worry about only in calculations on very weakly interacting systems. This is not really true [7]. BSSE is an ever-present phenomenon, and accurate calculations should always include an investigation of BSSE. Examples of areas in which one should be particularly worried are the study of the binding energy of molecules adsorbed on surfaces (see, for example, [24] for an interesting discussion) or the calculation of defect-formation energies.

The approach most commonly taken to estimate the effect of BSSE is the counterpoise correction of Boys [25]: the separated fragment energies are computed not in the individual fragment basis sets, but in the total basis set of the system, including “ghost functions” for the fragment that is not present. These energies are then used to define the counterpoise-corrected (CPC) interaction energy, which, by comparison with perturbation theory, was shown to converge to the BSSE-free value [26].
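In explicit form, denoting by \(E^{AB}_{AB}(X)\) the energy of subsystem \(X\) computed at the geometry of the complex AB with the full AB basis set (a notation introduced here for compactness), the counterpoise-corrected interaction energy of fragments A and B reads

$$\begin{aligned} \Delta E^{CP}_{int}=E^{AB}_{AB}(AB)-E^{AB}_{AB}(A)-E^{AB}_{AB}(B) \end{aligned}$$

where in \(E^{AB}_{AB}(A)\) the basis functions of B enter as ghost functions centered at the B positions (and vice versa for \(E^{AB}_{AB}(B)\)).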

Linear dependencies of Gaussian-type orbital basis sets employed in the framework of the HF SCF method for periodic structures, which occur when diffuse basis functions are included in the basis set in an uncontrolled manner, were investigated [27]. The basis sets constructed avoid numerical linear dependences, and were optimized for a number of periodic structures. The numerical AO basis sets for solids were generated in [28] by confining atoms within spheres and smoothing the orbitals so that the first and second derivatives go to zero at the boundary. This gives rise to small atomic-like basis sets that can be applied to solid-state problems and are efficient for treating large systems.

The concept of balanced Gaussian basis sets for periodic density-functional-theory calculations was analyzed in [29] for diamond and silicon bulk crystals using energy-optimized Gaussian basis sets and compared to the corresponding molecular cases. Benchmark calculations [29] show that basis sets explicitly optimized for the condensed phase exhibit slower basis-set convergence than atom-optimized ones.

The first attempt to develop a consistent standard all-electron Gaussian basis set of defined quality for the elements H–Br for periodic calculations was made in [30]. The def2-TZVP (VTZ* in the previous subsection) molecular basis sets [31, 32] were chosen as a starting point. These basis sets consist of highly contracted Gaussians for the core shells and three less contracted or primitive Gaussians per valence shell. For every valence shell, there is at least one primitive or even contracted polarization function of higher angular momentum. Gaussian functions with orbital exponents smaller than 0.1 were removed. The pob-TZVP basis sets (“pob” means portable) inherit the highly contracted core-shell Gaussians and three (contracted or primitive) basis functions per valence shell of the def2-TZVP molecular basis sets, but only one (instead of two) primitive polarization function of higher angular momentum is added as polarization for the complete shell. This scheme results in smaller numbers of contracted and primitive Gaussians than in the original def2-TZVP basis. The orbital exponents and contraction coefficients of the valence shells were variationally optimized for selected solids using a hybrid DFT functional: the correlation functional is PW91 [33], whereas the exchange functional is a mixture of 80 % of PW91 exchange and 20 % of Hartree-Fock exchange; see Sect. 3.3 for a description of hybrid exchange-correlation functionals. This functional was shown to provide good agreement with experiment for lattice parameters and cohesive energies [34–36].

The final basis sets obtained, denoted as pob-TZVP, are not fully variational for all reference systems. The final values of the orbital exponents and contraction coefficients depend on the selected systems and on the DFT method that was used. However, as the necessary adjustments were relatively small, these basis sets are portable to other systems (the pob-TZVP basis sets are given on the CRYSTAL code site http://www.crystal.unito.it). These basis sets were tested in [30] and the results obtained were compared with standard basis sets taken from the CRYSTAL code basis-set database. The lattice constants and atomic positions were optimized using the hybrid exchange-correlation functional defined above. Figures 3.1, 3.2 and 3.3 give the relative error in the lattice constants for ionic compounds, semiconductors and transition metal oxides, respectively.

Fig. 3.1 Relative error in the lattice constants of ionic compounds with respect to experimental values: a cubic; b hexagonal; c orthorhombic [30]

It is seen that the maximum and mean errors in the lattice constants obtained with pob-TZVP basis sets are less than those obtained with the standard basis sets. The pob-TZVP basis sets were also used in [30] for cubic metals. Although most of the optimization runs failed or gave bad results with the CRYSTAL standard basis sets, the pob-TZVP basis sets gave satisfactory results without further modifications, showing the portability of these basis sets. The results can be easily improved by adding diffuse valence functions with very small orbital exponents.

Fig. 3.2 Relative error in the lattice constants of semiconducting compounds with respect to experimental values: a cubic; b hexagonal [30]

Fig. 3.3 Relative error in the lattice constants of transition metal oxides with respect to experimental values: a cubic; b hexagonal and tetragonal [30]

So far we have considered basis sets for all-electron calculations, in which all the core electrons are treated explicitly. In both molecular and solid-state quantum chemistry the core states can instead be treated implicitly, through an effective core potential (pseudopotential). The pseudopotential approximation becomes most efficient for crystalline compounds of heavy elements. At the same time, relativistic effects play an important role in the electronic structure of such compounds. In the next subsection we consider nonrelativistic and relativistic pseudopotentials used in modern LCAO calculations of periodic systems. The choice of the corresponding valence basis sets is also discussed.

1.3 Effective Core Potentials

It is well known that most of the chemical properties of molecules and solids are determined by the valence electrons of the constituent atoms. The core states are weakly affected by changes in chemical bonding. The effect of core electrons is principally to shield the nuclear charges and to provide an effective potential for the valence electrons. The main reason for the limited role of the core electrons is the spatial separation of the core and valence shells that originates from the comparatively strong binding of the core electrons to the nucleus.

The idea behind effective core potentials (ECPs), also called pseudopotentials (PP), is to treat the core electrons as creating effective averaged potentials rather than actual particles. Effective core potentials are based on the frozen-core approximation and serve to represent the potential generated by the core electrons, also incorporating relativistic effects. ECP application can introduce significant computational efficiencies as it allows the formulation of a theoretical method for dealing only with the valence electrons, while retaining the accuracy of all-electron methods.

As we already noted, pseudopotentials are essentially mandatory in plane-wave calculations of solids, since the core orbitals have very sharp features in the region close to the nucleus and too many plane waves would be required to expand them if they were included. In atomic-like basis-set calculations pseudopotentials are formally not mandatory and have different characteristics from those designed for plane waves, since localized basis functions can provide the necessary sharp features in the core region. Nevertheless, ECP methods are also used in LCAO calculations of molecules and solids, since the cost of the standard LCAO methods rises rapidly with the number of electrons. If the CPU time in LCAO calculations were dominated by integral evaluation, introducing a PP would not gain very much, since the number of integrals is controlled by the more diffuse functions that overlap strongly with neighboring atoms. These diffuse functions are introduced mainly to describe the change of the atomic valence states. However, the use of an ECP decreases the number of coefficients in the one-electron wavefunctions and can give significant savings in the SCF part. It is also quite easy to incorporate relativistic effects into pseudopotentials, which is increasingly important for heavy atoms. All-electron relativistic calculations are very expensive and in many cases practically unfeasible.

The parameters and the underlying basis set of so-called energy-consistent ECPs can be adjusted in accordance with representative experimental data, not only for the ground state, but also for excited states, electron affinities, ionization potentials and so on. Being of semiempirical origin, they can perform remarkably well for a given system, but their transferability to other environments can be poorer than that of shape-consistent ECPs. So-called shape-consistent ECPs (known in computational condensed matter as the norm conserving ECPs) are rather easy to derive and contain no adjustable parameters, i.e. these are ab-initio ECPs. The construction of shape-consistent ECPs for molecules is based on the original proposal of Christiansen et al. [37], where shape consistency was introduced. Simultaneously, norm conserving ECPs were introduced in computational condensed-matter physics by Hamann et al. [38]. This example demonstrates practically independent development of ECPs for molecules and crystals as different terms are used for the same ECP property.

The work by Phillips and Kleinman (PK) [39] was an important step towards ECP applications to solids. PK developed the pseudopotential formalism as a rigorous formulation of the earlier “empirical potential” approach. They showed that an ECP whose eigenstates are the smooth, plane-wave-like pseudo-wavefunctions can be derived from the all-electron potential and the core-state wavefunctions and energies. Thus a nonempirical route to constructing ECPs was introduced.

The PK pseudopotential shortcomings are well known [40]: it depends explicitly on the one-electron eigenvalue and outside the core region the normalized pseudo orbital (PO) is proportional but not equal to the true orbital. Typically, generation of a pseudopotential proceeds as follows. First, a cutoff distance \(r_c\) for the core is chosen. In a PP approach all radial \(R_{nl}(r)\) orbitals of the valence shell must be nodeless, as for each \(l\) all lower-lying states have to be projected out by the PP. In the case of oxygen, for instance, the \(2s\)-PO must be nodeless. It is, however, impossible to produce two (or more) nodeless orbitals in the same energy range with only a single spherical PP, as for fixed PP only the angular momentum term can generate differences.

Since the PP replaces the potential of the nucleus and the core electrons, it is spherically symmetric and each angular momentum \(l\) can be treated separately, which leads to nonlocal \(l\)-dependent PPs \(V_l(r)\). Consequently, the total atomic PP usually consists of several components, one for each angular momentum present in the valence space. The PP dependence upon \(l\) means that, in general, the PP is a nonlocal operator that can be written in a semilocal (SL) form

$$\begin{aligned} {\hat{V}}^{PS} =\sum \limits _{lm}|Y_{lm}(\theta ,\varphi )\rangle V_l(r)\langle Y^*_{lm}(\theta ,\varphi )| \end{aligned}$$
(3.11)

where \(Y_{lm}(\theta ,\varphi )\) are spherical harmonics and \(V_l(r)\) is the pseudopotential for the \(l\)th angular-momentum component. This is termed semilocal because it is nonlocal in the angular variables, but local in the radial variable: when operating on the function \(f(r,\theta ',\varphi ')\), \({\hat{V}}^{PS}\) has the effect [2]

$$\begin{aligned} \left[ {\hat{V}}^{PS}f\right] _{r,\theta ,\varphi }=\sum \limits _{lm}Y_{lm}(\theta ,\varphi )V_l(r)\int d(\cos \theta ')d\varphi 'Y_{lm}(\theta ',\varphi ')f(r,\theta ',\varphi ') \end{aligned}$$
(3.12)

All the PP information is in the radial functions \(V_l(r)\). (We note that the HF exchange operator is fully nonlocal both in the angular and radial variables.) To generate PP an all-electron calculation (HF or DFT) of the free atom is performed. The DFT PP Hamiltonian includes the local (Hartree and exchange-correlation) and semilocal (PP) parts; the HF PP Hamiltonian includes local (Hartree), nonlocal (exchange) and semilocal (PP) parts. The set of PP parameters is chosen to accurately reproduce the eigenvalues and eigenfunctions of the valence states. Clearly, in the region of space where most of the electronic norm is concentrated, both orbitals must be very close, if not identical. On the other hand, the form of the valence orbitals in the core region, where the core electrons are moving, is less relevant. Otherwise, the core states themselves would play a more important role. In the core region one can thus allow the pseudo-orbital (PO) to differ from the all-electron orbital (AO) without losing too much accuracy [41].
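As a purely numerical illustration of the semilocal action (3.12), the following sketch applies \({\hat{V}}^{PS}\) to a function sampled on an angular grid at fixed \(r\): the function is first projected onto the spherical harmonics by angular quadrature and each \((l,m)\) component is then multiplied by \(V_l(r)\). The radial potentials used here are arbitrary model Gaussians, not a real pseudopotential.

```python
# Apply a semilocal operator V^PS = sum_lm |Y_lm> V_l(r) <Y_lm| to a function
# f(r, theta', phi') at fixed radius, using Gauss-Legendre quadrature in
# cos(theta') and a uniform grid in phi'. The V_l below are toy model functions.
import numpy as np
from numpy.polynomial.legendre import leggauss
from scipy.special import sph_harm   # sph_harm(m, l, azimuthal, polar) in SciPy's convention

LMAX = 2

def V_l(l, r):
    return -(l + 1.0) * np.exp(-r**2)          # toy l-dependent radial potentials

# angular quadrature grid
nth, nph = 32, 64
x, w = leggauss(nth)                           # x = cos(theta'), w = Gauss-Legendre weights
theta, phi = np.arccos(x), np.linspace(0.0, 2*np.pi, nph, endpoint=False)
TH, PH = np.meshgrid(theta, phi, indexing="ij")
W = w[:, None] * (2*np.pi / nph)               # full solid-angle weights

def apply_semilocal(f_grid, r, th_out, ph_out):
    """f_grid = f(r, theta', phi') on the quadrature grid; returns [V^PS f](r, th_out, ph_out)."""
    out = 0.0
    for l in range(LMAX + 1):
        for m in range(-l, l + 1):
            Y_grid = sph_harm(m, l, PH, TH)              # Y_lm on the quadrature grid
            c_lm = np.sum(W * np.conj(Y_grid) * f_grid)  # angular projection <Y_lm|f> at this r
            out += sph_harm(m, l, ph_out, th_out) * V_l(l, r) * c_lm
    return out

# example: a p_z-like angular function times a radial factor
r0 = 1.0
f_grid = np.cos(TH) * np.exp(-r0)
print(apply_semilocal(f_grid, r0, th_out=0.3, ph_out=0.0))
```

Only the \(l=1\), \(m=0\) projection survives in this example, so the output is simply \(V_1(r_0)\cos \theta \, e^{-r_0}\), illustrating that the operator is diagonal in \((l,m)\) and acts as a multiplication in the radial variable.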

A giant step forward in pseudopotentials was taken by Hamann et al. [38], who introduced norm-conserving pseudopotentials (NCPP). The NCPP for the angular momentum \(l\) is chosen so that

  1. the resulting atomic valence PO agrees with the corresponding all-electron (AE) AO for all \(r\) larger than some \(l\)-dependent cutoff (core) radius \(r_{c,l}\),

     $$\begin{aligned} R_{nl}^{PS}(r)=R_{nl}^{AE}(r) \end{aligned}$$
     (3.13)

  2. the norm of the orbital is conserved,

     $$\begin{aligned} \int \limits _0^{r_{c,l}}|R_{nl}^{PS}(r)|^2dr=\int \limits _0^{r_{c,l}}|R_{nl}^{AE}(r)|^2 dr \end{aligned}$$
     (3.14)

  3. the logarithmic derivatives of the true and pseudo wavefunctions and their first energy derivatives agree for \(r > r_{c,l}\).

Condition 1 automatically implies that the real and pseudovalence eigenvalues agree for a chosen “prototype” configuration, as the eigenvalue determines the asymptotic decay of the orbitals.

Properties (2) and (3) are crucial for the pseudopotential to have optimum transferability among a variety of chemical environments [41].
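These conditions are easy to verify numerically once the AE and pseudo radial functions are available on a grid. The sketch below uses arbitrary model radial functions in place of real atomic HF/DFT output and simply compares the norms inside the cutoff radius and the logarithmic derivatives at the cutoff, in the spirit of conditions (2) and (3).

```python
# Compare the norm inside r_c and the logarithmic derivative at r_c for a pair
# of radial functions; the functions below are model curves, not real orbitals.
import numpy as np

r = np.linspace(1e-4, 10.0, 4000)
r_c = 2.0

R_ae = (1.0 - r / 3.0) * np.exp(-0.5 * r)   # model all-electron radial function (node at r = 3)
R_ps = r * np.exp(-0.8 * r)                 # model nodeless pseudo radial function

def trapezoid(y, x):
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def norm_inside(R):
    m = r <= r_c
    return trapezoid(np.abs(R[m]) ** 2, r[m])

def log_derivative(R):
    i = np.searchsorted(r, r_c)
    return np.gradient(R, r)[i] / R[i]

print("norm inside r_c (AE, PS):", norm_inside(R_ae), norm_inside(R_ps))
print("d lnR/dr at r_c (AE, PS):", log_derivative(R_ae), log_derivative(R_ps))
```

For a properly constructed norm-conserving pseudopotential the two norms and the two logarithmic derivatives would agree; for the arbitrary model curves above they of course do not, and the script is meant only as a template for such a check.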

The PP concept was motivated by the inertness of the atomic core states in binding, so that the ionic core of the atom provides a fixed potential in which the valence electrons are moving, independently of the system (atom, molecule or solid) considered. However, in polyatomic systems the valence states undergo obvious modifications compared to the atomic valence orbitals, even if the polyatomic core potential is given by a simple linear superposition of atomic core potentials. Most notably, the eigenenergies change when atoms are packed together, which leads to bonding and antibonding states in molecules and to energy bands in solids. Thus, while PPs are designed to reproduce the valence AOs of some chosen atomic reference configuration (usually the ground state), it is not clear a priori that they will have the same property for all kinds of polyatomic systems and for other atomic configurations. Consequently, one has to make sure that the PP is transferable from its atomic reference state to the environment of actual interest. To check the transferability of a PP one has to analyze the sensitivity of the agreement between atomic POs and AOs to the specific eigenenergy in the single-particle equation. One finds that the variation of the logarithmic derivative \(R'_{nl}(r)/R_{nl}(r)\) with the single-particle energy is determined by the norm contained in the sphere between the origin and \(r\). This is true in particular in the neighborhood of one of the actual atomic eigenvalues \(\varepsilon _{nl}\), i.e. for a bound atomic eigenstate. Thus, as soon as norm conservation is ensured, the POs exactly reproduce the energy dependence of the logarithmic derivative of the AOs for \(r>r_{c,l}\). Consequently, one expects the POs to react as the AOs when the valence states experience some energy shift in a polyatomic environment, provided the underlying PPs are norm conserving. This argument supporting the transferability of PPs emphasizes the importance of norm conservation in a very explicit way. In practice, it is, nevertheless, always recommended to check the transferability explicitly, by examining some suitable atomic excitation process and the binding properties of simple molecular or crystalline systems [41].

The form of the ECP used in condensed-matter applications depends on the basis chosen (PW or LCAO). The numeric pseudopotentials in plane-wave calculations must be used with the density functional that was employed to generate them from the reference atomic state. This is a natural and logical choice whenever one of the plane-wave DFT codes is used. In PW calculations the valence functions are expanded in Fourier components, and the cost of the calculation scales as a power of the number of Fourier components needed. One goal of a PP is therefore to create POs that are as smooth as possible while remaining accurate: in PW calculations, maximizing smoothness means minimizing the range of Fourier space needed to describe the valence properties to a given accuracy [2]. Norm-conserving PPs achieve the goal of accuracy, usually at some sacrifice of smoothness. A different approach by Vanderbilt, known as “ultrasoft” pseudopotentials (US) [42], reaches the goal of accurate PW calculations by a transformation that re-expresses the problem in terms of a smooth function and an auxiliary function around each core that represents the rapidly varying part of the density. The generation code for Vanderbilt US pseudopotentials and their library can be found at the site http://www.physics.rutgers.edu/dhv/uspp.

Ab-initio pseudopotentials for PW calculations of solids can be generated also by the fhiPP package [43], see also site http://www.fhi-berlin.mpg.de/th/fhimd/.

The numerical AO-based DFT code SIESTA [44] employs the same numeric pseudopotentials as plane-wave-based codes. An alternative approach is used in the Slater-orbital-based DFT code ADF [45], where so-called core functions are introduced. They represent the core-electron charge distribution, but are not variational degrees of freedom and serve as fixed core charges that generate the potential experienced by valence electrons [46].

In molecular quantum-chemical computations based on Gaussian functions, core potentials were originally derived from a reference calculation of a single atom within the nonrelativistic Hartree-Fock or relativistic Dirac–Fock approximations, or from some method including electron correlation (CI, for instance). A review of these methods, as well as the general theory of ECPs, is provided in [47, 48].

Let us discuss the effective core potentials and the corresponding valence basis sets that are used for Gaussian-function-based LCAO periodic computations implemented in the computer codes CRYSTAL [17] and GAUSSIAN [49].

The ECP general form is a sum of the Coulomb term, the local term and the semilocal term

$$\begin{aligned} V_{PS}(r)=C+V_{loc}+V_{sl}=&-\frac{Z_N}{r}+\sum \limits _{k=1}^M r^{n_k}C_k\exp (-\alpha _k r^2)\nonumber \\&+\sum \limits _{l=0}^3\left[ \sum \limits _{k=1}^{M_l} r^{n_{kl}-2}C_{kl}\exp (-\alpha _{kl}r^2)\right] {\hat{P}}_l \end{aligned}$$
(3.15)

where \(Z_N\) in the Coulomb term is the effective nuclear charge (total nuclear charge minus the number of electrons represented by ECP). The local term is a sum of products of polynomial and Gaussian radial functions. The semilocal term is a sum of products of polynomial radial functions, Gaussian radial functions and angular-momentum projection operators \({\hat{P}}_l\). Therefore, to specify the semilocal ECP one needs to include a collection of triplets (coefficient, power of \(r\) and exponent) for each term in each angular momentum of ECP.
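To make the bookkeeping of (3.15) concrete, the following sketch stores an ECP as the effective charge \(Z_N\), a list of local triplets and per-\(l\) lists of semilocal triplets (coefficient, power of \(r\), exponent), and evaluates the Coulomb-plus-local part and the semilocal radial factors \(V_l(r)\) on a grid. The numerical parameters are placeholders chosen for illustration, not a published ECP parameterization.

```python
# Evaluate the radial parts of a semilocal ECP of the form (3.15) from
# (coefficient, power-of-r, exponent) triplets. Placeholder parameters only.
import numpy as np

Z_eff = 4.0                                          # effective nuclear charge Z_N
local_terms = [(-1.60, 1, 3.5), (0.80, 2, 1.2)]      # (C_k, n_k, alpha_k)
semilocal_terms = {                                  # l -> list of (C_kl, n_kl, alpha_kl)
    0: [(5.0, 2, 2.0), (-0.9, 2, 0.5)],
    1: [(3.2, 2, 1.5)],
    2: [(1.1, 2, 0.8)],
}

def radial_sum(terms, r, shift=0):
    """sum_k C_k * r**(n_k + shift) * exp(-alpha_k * r**2)"""
    return sum(C * r**(n + shift) * np.exp(-a * r**2) for C, n, a in terms)

def V_coulomb_plus_local(r):
    return -Z_eff / r + radial_sum(local_terms, r)

def V_semilocal(l, r):
    # radial factor multiplying the projector P_l in (3.15); note the power n_kl - 2
    return radial_sum(semilocal_terms[l], r, shift=-2)

r = np.linspace(0.05, 5.0, 200)
print("local part at r[0]    :", V_coulomb_plus_local(r)[0])
print("semilocal parts at r[0]:", {l: V_semilocal(l, r)[0] for l in semilocal_terms})
```

A complete ECP integral code would, in addition, contract these radial factors with the angular projection operators \({\hat{P}}_l\) and with the Gaussian basis functions, which is the part handled internally by codes such as CRYSTAL or GAUSSIAN.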

Hay and Wadt (HW) ECP [50] are of the general form (3.15). The procedure employed for the generation of ECPs includes the following sequence of steps: (1) the “core” orbitals to be replaced and the remaining “valence” orbitals are defined. This step defines whether the small-core (the outermost core electrons are explicitly treated along with the valence electrons) or a large-core HW pseudopotential is generated; (2) the true numerical valence orbitals are obtained from self-consistent nonrelativistic Hartree-Fock (or relativistic Dirac–Fock) calculations for \(l = 0, 1, \ldots , L\), where \(L\), in general, is greater than the highest angular-momentum quantum number of any core orbital; (3) smooth, nodeless pseudo-orbitals (PO) are derived from the true Hartree-Fock (Dirac–Fock) orbitals so that PO behave as closely as possible to HF orbitals in the outer, valence region of the atom; (4) numerical effective core potentials \(V_l^{PS}\) are derived for each \(l\) by demanding that PO be a solution in the field of \(V_l\) with the same orbital energy \(\varepsilon \) as the Hartree-Fock (Dirac–Fock) orbital; (5) the numerical potentials are fit in the analytic form with Gaussian functions, the total potential is represented as (3.15); (6) the numerical POs are also fit with Gaussian functions to obtain basis sets for molecular or periodic calculations. In the case of large-core ECP the primitive Gaussian bases (\(3s2p5d\)), (\(3s3p4d\)) and (\(3s3p3d\)) are tabulated for the first, second, and third transition series atoms, respectively. The figures in brackets mean the number of primitive Gaussians in \(ns\), \(np\) and \((n-1)d\) contracted AOs for \(n=4,5,6\). In the case of small-core ECP \((n-1)s,(n-1)p\) contracted AOs are added and given as the linear combinations of primitive Gaussians. Hay-Wadt ECPs and valence-electron basis sets are also generated for main-group elements: large core—for Na to Xe, and Cs to Bi, small core—for K, Ca, Rb, Sr, Cs, Ba.

The other known ECP and valence-electron basis sets were generated using the procedure described for Hay–Wadt ECP generation. The Durand–Barthelat large-core semilocal ECP [51] and the corresponding valence-electron basis sets are generated for \(3d\)-transition elements and the main-group elements Li to Kr.

Compact one- and two-Gaussian expansions for the components of the effective potentials of atoms in the first two rows are presented by Stevens–Basch–Krauss [52]. Later, the list of ECP was extended to the third-, fourth- and fifth-row atoms [53] and includes relativistic ECP (RECP). The pseudo-orbital basis-set expansions for the first two rows of atoms consist of four Gaussian primitives using a common set of exponents for the \(s\) and \(p\) functions. Analytic SBK RECP are generated in order to reproduce POs and eigenvalues as closely as possible. The semilocal SBK ECP are given by

$$\begin{aligned} r^2V_l(r)=\sum \limits _k A_{lk} r^{n_{l,k}}\exp (-B_{l,k}r^2) \end{aligned}$$
(3.16)

The potentials and basis sets were used to calculate the equilibrium structures and spectroscopic properties of several molecules. The results compare extremely favorably with the corresponding all-electron calculations.

Stuttgart–Dresden (SD) ECP (formerly Stoll and Preuss ECP) are under constant development [54]. SD semilocal ECPs are written in the form

$$\begin{aligned} V_{sl}=\sum \limits _{l=0}^3\left[ \sum \limits _{k=1}^{\mu _k}r^{n_{kl}-2}C_{kl}\exp (-\alpha _kr^2)\right] {\hat{P}}_l \end{aligned}$$
(3.17)

Note the different convention for the factor \(r^{n_{kl}-2}\) compared to (3.15). The database of SD ECPs includes relativistic ECPs (RECPs) generated by solving the relativistic Dirac–Fock equation for atoms. Improved SD pseudopotentials exist for many of the main-group elements, and pseudopotentials are also available for \(5d\) and other heavier elements. The most recent ECP parameters, optimized valence-electron basis sets, a list of references and guidelines for the choice of pseudopotentials can be found at the site http://www.theochem.uni-stuttgart.de.

The use of atomic pseudopotentials (or effective core potentials–ECPs) considerably simplifies the quantum-mechanical description of polyatomic systems (molecules and crystals) as the much more localized and chemically inert core electrons are simulated by ECP introduction. The choice of the norm conserving and transferable ECPs ensures that the valence states are reproduced in the majority of cases as accurately as would be done in all-electron calculations.

Pseudopotentials are also used as embedding potentials when some special region of a covalently bonded solid or very large molecule is modeled by a modest-size cluster. This model is applied when one is interested in the electronic structure and properties of some small region of a large system such as a localized point defect in a solid, an adsorbed molecule on a solid surface or an active site in a very large biological molecule. In such a case one can model the region of interest by cutting a modest-sized but finite cluster out of a larger system and performing the calculation on it. For ionic solids the surrounding crystal can be modeled by the system of point charges. More difficult is the case when the environment is a covalently bonded system and the boundary between the cluster and the environment passes through chemical bonds (covalent or partly covalent). In this case, the distant and nearest parts of the environment should be treated separately [55]. In the distant part, only the electrostatic potential representing the ionic component of the environment should be retained. The nearest part, corresponding to the “broken or dangling bonds”, needs special consideration. Each atom of the cluster boundary surface has unsaturated “dangling bonds” that cause spurious effects unless saturated in some way, usually by adding hydrogen or pseudoatoms [56]. This is better than leaving the dangling bond but clearly the termination is still imperfect in the sense that the bond to the attached atom is different from the bond to whatever atom is situated there in the real system [57]. The use of the embedding potentials (EP) to saturate the dangling bonds of the cluster gives a cluster surface bond identical to that in the real large system.

Heavy-element systems are involved in many important chemical and physical phenomena. However, they still present difficulties to theoretical study, especially in the case of solids containing atoms of heavy elements (with the nuclear charge \(Z \ge 50\)). The description of the relativistic electronic-structure theory for molecular systems is given, for example, in [58] and in our earlier book [59]. For a long time the relativistic effects inherent in heavy atoms were not considered important for chemical properties because they appear primarily in the core atomic region. However, now the importance of these effects, which play an essential and vital role in the total nature of electronic structures for heavy-element molecular and periodic systems, is recognized [58].

While accurate relativistic calculations of simple heavy-atom molecules can be performed on modern computers, the relativistic calculations of periodic systems are mainly made using the relativistic effective core potential (RECP).

There are several reasons for using RECPs in calculations of complicated heavy-atom molecules, molecular clusters and periodic solids. Like nonrelativistic ECP approaches, RECP approaches allow one to exclude from the calculation a large number of chemically inactive electrons and to treat explicitly only the valence and outermost core electrons. The oscillations of the valence spinors are usually smoothed in heavy-atom cores simultaneously with the exclusion of the small components from explicit treatment (quasirelativistic approximation). As a result, the number of primitive basis functions can be substantially reduced; this is especially important for the calculation and transformation of two-electron integrals when studying polyatomic systems with very heavy elements, including lanthanides and actinides. The RECP method is based on the well-developed earlier nonrelativistic technique of pseudopotential calculations; however, effective scalar-relativistic and spin-orbit interaction effects are taken into account by means of the RECP operator. In the RECP method, the interactions with the excluded inner core shells (spinors!) are described by spin-dependent potentials, whereas the explicitly treated valence and outer core shells can be described by spin-orbitals. This means that some “soft” way of accounting for the core-valence orthogonality constraints is applied in the latter case [60]. Meanwhile, strict core-valence orthogonality can be restored after the RECP calculation by using restoration procedures. The use of spin-orbitals allows one to reduce dramatically the cost of the correlation treatment.

In LCAO calculations of heavy-atom molecules and periodic systems the relativistic “energy-consistent” pseudopotentials of the Stuttgart–Dresden–Cologne group are also actively used [54, 61]. To generate an “energy-consistent” RECP, a direct adjustment of two-component pseudopotentials (scalar-relativistic plus spin-orbit potentials) to the atomic total-energy valence spectra is performed. The latter are derived from four-component multiconfiguration Dirac–Hartree–Fock all-electron atomic calculations. The “energy-consistent” RECPs are now tabulated for all the elements of the periodic table at the site http://www.theochem.uni-stuttgart.de. The adjustment of the pseudopotential parameters was done in fully numerical atomic calculations; valence basis sets were generated a posteriori via energy optimization. A complete set of potentials includes one-component (nonrelativistic and scalar-relativistic) effective core potentials (ECP), spin-orbit (SO) and core-polarization potentials (CPP); only the one-component ECPs are listed in full. The “energy-consistent” pseudopotentials are under continuous development and extension [61, 62] and the corresponding Gaussian basis sets have been published [63–65].

In plane-wave calculations of solids and in molecular dynamics, separable pseudopotentials [66] are now more popular because they provide linear scaling of the computational effort with the basis-set size, in contrast to the radially local RECPs. In contrast to the four-component wavefunction used in fully relativistic calculations, the pseudowavefunction in the RECP case can be either two- or one-component. The RECP operator simulates, in particular, the interactions of the explicitly treated electrons with those that are excluded from the RECP calculation.

The relativistic effects in compounds of heavy atoms are most easily taken into account in scalar-relativistic (one-component) calculations with the use of a relativistic effective core potential (RECP). Following [67], we discuss a problem of practical importance: to what extent the application of the RECP of a uranium atom in scalar-relativistic calculations is able to reproduce the dissociation energy and the spectrum of one-electron energies found in the all-electron relativistic calculation by the Dirac–Hartree–Fock (DHF) method. The validity of the RECP method is studied in [67] using the UF\(_6\) molecule. Uranium hexafluoride UF\(_6\) became the subject of numerous experimental and theoretical studies owing to its use in the molecular-laser enrichment of uranium. The difficulties in the calculation of the electronic structure of this molecule are connected both with the necessity of taking relativistic effects into account and with the influence of electron-correlation effects in the case of the localized \(f\)-electrons of the uranium atom. The main results of the study [67] are the following.

According to the all-electron nonrelativistic calculation by the HF method, the dissociation energy of the UF\(_6\) molecule is 13.7 eV, which reproduces only 43 % of the experimental value. If relativistic effects are taken into account, this value increases to 23.5 eV (73 % of the experimental value), and the energies of the MOs corresponding to the core \(1s\), \(2s\), and \(3s\) states of the uranium atom are substantially lowered compared to the nonrelativistic HF calculation. At the same time, for the MOs that are higher in energy (arising from the \(6s\) and \(6p\) states of the uranium atom) the lowering is noticeably smaller. For the MOs containing the filled \(3d\), \(4d\), \(5d\), and \(4f\)-orbitals of the uranium atom the one-electron energies increase, as a consequence of the increased screening of the nucleus by the inner electrons when relativistic effects are taken into account. It is thus evident that the inclusion of relativistic effects in the all-electron calculation of the UF\(_6\) molecule is essential.

The results of scalar-relativistic calculations on the UO\(_2\) crystal and the comparative study of UN, U\(_2\)N\(_3\) and UN\(_2\) crystals are discussed in [59] and demonstrate the efficiency of RECP use for the periodic systems.

In the next sections we briefly discuss the Hartree-Fock (also called wavefunction-based) and Density Functional Theory (DFT) methods used nowadays in the modeling of periodic structures, in particular nanotubes and nanowires.

2 LCAO Hartree-Fock Method for Periodic Systems

2.1 Electron Correlation, One-Electron and One-Determinant Approximations

Electrons in molecules and crystals repel each other according to Coulomb’s law, with the repulsion energy depending on the interelectron distance as \(r_{12}^{-1}\). This interaction creates a correlation hole around any electron, i.e. the probability of finding any pair of electrons at the same point of spin-coordinate space is zero. From this point of view the Hartree product \(\varPsi _H\) of molecular or crystalline spin-orbitals \(\psi _i(\varvec{x})\):

$$\begin{aligned} \varPsi _H(\varvec{x}_1,\varvec{x}_2,\ldots ,\varvec{x}_{N_e})=\psi _1(\varvec{x}_1)\psi _2(\varvec{x}_2), \ldots , \psi _{N_e}(\varvec{x}_{N_e}) \end{aligned}$$
(3.18)

is a completely uncorrelated function. The Hartree product (3.18) describes the system of \(N_e\) electrons in an independent particle model. This independence means that the probability of simultaneously finding electron 1 at \(\varvec{x}_1\), electron 2 at \(\varvec{x}_2\), etc. (\(\varvec{x}\) means the set of coordinate \(\varvec{r}\) and spin \(\sigma \) variables) is given by

$$\begin{aligned}&|\varPsi _H(\varvec{x}_1,\varvec{x}_2,\ldots ,\varvec{x}_{N_e})|^2d{\varvec{x}_1}d{\varvec{x}_2}, \ldots , d{\varvec{x}_{N_e}}\nonumber \\ \quad&=|\psi _1(\varvec{x}_1)|^2d{\varvec{x}_1}|\psi _2(\varvec{x}_2)|^2d{\varvec{x}_2}, \ldots , |\psi _{N_e}(\varvec{x}_{N_e})|^2d{\varvec{x}_{N_e}} \end{aligned}$$
(3.19)

which is the probability of finding electron 1 at \(\varvec{x}_1\) times the probability of finding electron 2 at \(\varvec{x}_2\), etc., i.e. product of probabilities.

The well-known Extended Hückel semiempirical method for molecules and the Tight Binding (TB) approach to crystals are examples of models in which electron correlation is entirely absent from the wavefunction. The Hamiltonian in these methods does not explicitly include electron–electron interactions (such a Hamiltonian is known as a one-electron Hamiltonian), so that the total many-electron wavefunction is a simple product (3.18) of one-electron functions and the total electron energy is a sum of one-electron energies. If semiempirical parameters are used in these methods, electron correlation is at least partly taken into account. When the TB parameters are fitted to reproduce the results of DFT calculations (see the DFTB method), electron correlation is also taken into account, but it is not known precisely how much of the correlation is included.

The difference between a one-electron Hamiltonian and the Hamiltonian of the one-electron approximation (HF method) is the following. The former does not include the electron–electron interaction, so that the calculation of its eigenvalues and eigenvectors does not require a self-consistent procedure. The Hamiltonian of the one-electron approximation includes the interelectron interactions explicitly; the one-electron approximation is made only in the many-electron wavefunction. The one-electron approximation Hamiltonian depends on one-electron wavefunctions that are unknown at the beginning of the calculation (for example, the Coulomb and exchange parts of the Hamiltonian in the Hartree-Fock method, see the next section), so that a self-consistent calculation is required.

The Hartree-Fock (HF) self-consistent (SCF) method replaces the instantaneous electron–electron repulsion with the repulsion of each electron with an average electron charge cloud. The HF method assumes that the many-electron wavefunction can be written as one Slater determinant. The Hartree-Fock method is usually defined as “uncorrelated”. However, the electron motions are no longer completely independent. Indeed, no two electrons with the same spin can be at the same place. This is called the Fermi hole. Thus, same-spin electrons are correlated in Hartree-Fock, different-spin electrons are not. Sometimes, it is said that HF methods take into account the so-called spin correlation.

The HF method is also called the independent-electron approximation [68], but this independence is restricted by the Pauli principle.

In modern molecular quantum chemistry the correlation energy is defined as the difference between the exact energy and the HF energy in a complete basis (the Hartree-Fock limit). As the exact energy is not known, one either uses the experimental total energy (the sum of the experimental cohesive energy and the free-atom energies) or calculates the exact energy for a given one-electron basis set and defines the basis-set correlation energy as the difference between the exact and HF energies calculated in the same one-electron basis set. In molecular systems, the correlation energy is about 1 eV per electron pair in a bond or lone pair.

One of the first attempts to include electron correlation in calculations was made by Fock et al. [69], who suggested the incomplete separation of variables for atoms with two valence electrons.

The key distinction between the Hamiltonian operator and the Fock operator is the following [70]: the former returns the electronic energy of the many-electron system; the latter is really not a single operator, but the set of interdependent one-electron operators that are used to find the one-electron functions (molecular or crystalline orbitals) from which the HF wavefunction is constructed as a Slater determinant. The HF wavefunction corresponds to the lowest possible energy for a single-determinant many-electron wavefunction formed from the chosen basis set.

Including electron-correlation in MO theory means an attempt to modify the HF wavefunction to obtain a lower electronic energy when we operate on that modified wavefunction with the Hamiltonian. This is why the name post-Hartree-Fock methods is traditionally used for the methods including the electron-correlation.

In the unrestricted Hartree-Fock approximation (where the coordinate dependence of spin-up and spin-down MOs is allowed to differ) the one-determinant many-electron wavefunction is, in the general case, not an eigenfunction of the total spin operator \(S^2\). To repair that deficiency the technique of projection is used [71], so that the resulting wavefunction becomes a sum of several Slater determinants and therefore partly takes electron correlation into account, i.e. goes beyond the one-determinant HF approximation. However, the coefficients in the sum of Slater determinants are defined only by the projection procedure, i.e. by the total-spin symmetry requirements imposed on the many-electron wavefunction.

The sum of Slater determinants

$$\begin{aligned} \varPsi =C_0\varPsi _{HF}+C_1\varPsi _1+C_2\varPsi _2+\cdots \end{aligned}$$
(3.20)

is used also in other post-HF approaches: configuration interaction (CI), multiple-configuration SCF (MCSCF) and coupled-cluster (CC) methods applied to include the electron-correlation in molecules.

Often, the HF approximation provides an accurate description of the system and the effects of the inclusion of correlation with CI or MCSCF methods are of secondary importance. In this case, the correlation effects may be considered as a small perturbation and as such treated using perturbation theory. This is the Møller–Plesset [72], or many-body, perturbation theory approach to the inclusion of correlation effects. In the MP2 approximation only the second-order many-body perturbation corrections are taken into account.

The above-mentioned quantum-chemical approaches to electron-correlations in molecules (also called wavefunction-based correlation methods) are described in detail in monographs [68, 70], review articles [73, 74] and are implemented in modern computer codes.

The main disadvantage of the wavefunction-based correlation methods is the steep scaling of the computational cost with the number of atoms \(N\) in a molecule, at least when the canonical MOs are used. The computational complexity scales as \(O(N^5)\) for the simplest and cheapest method, second-order perturbation theory (MP2). For CC theory the computational cost scales as \(O(N^6)\) and \(O(N^7)\) when the expansion is truncated at double and at triple substitutions, respectively. Such a high “scaling wall” [75] restricts the application range of the wavefunction-based correlation methods to molecules of rather modest size. For this reason the density-functional-based correlation methods remain until now the main way to treat large molecular systems. The main disadvantages of the latter methods are the principal impossibility of systematic improvement, the underestimation of transition-state energies, and the inability to describe weak interactions (dispersive forces).

Essential progress in the inclusion of correlation effects was achieved in the so-called local correlation methods [75, 76], which take into account the short-range nature of the correlation. In these methods, localized MOs are generated from the occupied canonical MOs using different localization criteria. For the virtual space, the atomic-orbital basis projected out of the occupied MO space is used.

Compared with molecules, the wavefunction-based correlation methods for periodic systems are practical mainly when the molecular-cluster model is used. Unfortunately, the well-known problems of the cluster choice and the influence of the dangling bonds on the numerical results restrict the application range of the molecular-cluster model to essentially ionic systems.

The more sophisticated incremental scheme [77–80] maintains the infinite nature of periodic systems but the correlation effects are calculated incrementally using standard quantum-chemical codes.

Only recently has MP2 theory been applied to periodic systems, on the basis of the local correlation methods and the use of Wannier functions [75, 81].

While for molecules the local correlation methods are already implemented in computer codes, the implementation of this approach for periodic systems is the main goal of the CRYSCOR project [82].

2.2 LCAO Hartree-Fock Method for Periodic Systems

In molecular quantum chemistry the molecular orbital (MO) is expanded in terms of GTOs. The different atomic functions used in calculations of molecules and crystals were considered above. For the moment, we restrict ourselves to the representation of an MO \(\varphi _i(\varvec{r})\) as a linear combination of atomic orbitals \(\chi _{\mu A}(\varvec{r})\) (the MO LCAO approximation):

$$\begin{aligned} \varphi _i(\varvec{r})=\sum \limits _{\mu A}C_{i\mu }\chi _{\mu A}(\varvec{r}) \end{aligned}$$
(3.21)

where \(\mu \) numbers all basis functions centered on atom A and the summation runs over all atoms in the molecular system. The MO LCAO approximation (also known as the Hartree-Fock–Roothaan approximation [83]) is practically the only way to make first-principles calculations for molecular systems. In the standard derivation of the Hartree-Fock equations for a closed-shell system, the constraint that each molecular orbital is either populated by two electrons or vacant is introduced (Restricted Hartree-Fock theory, RHF).

In the MO LCAO approximation the RHF method leads to the matrix equation

$$\begin{aligned} \mathbf{FC=SCE} \end{aligned}$$
(3.22)

where \(\mathbf{F}\) and \(\mathbf{S}\) are the Fock and overlap matrices, and \(\mathbf{C}\) and \(\mathbf{E}\) are the matrices of eigenvectors and eigenvalues. The dimension M of the square matrices \(\mathrm{\mathbf{{F,S,C}}}\) is equal to the number of terms in the sum (3.21), i.e. the total number of AOs used in the calculation.

The Fock matrix \(\mathbf{F}\) is the sum of a one-electron part \(\mathbf{H}\) and a two-electron part \(\mathbf{G}\). The former includes the kinetic (T) and nuclear-attraction (Z) energy; the latter is connected with the electron–electron interactions.
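For given \(\mathbf{F}\) and \(\mathbf{S}\), (3.22) is a standard generalized symmetric eigenvalue problem, and its solution is a single library call. The following minimal sketch uses random symmetric matrices as stand-ins for the real Fock and overlap matrices and checks the \(\mathbf{C}^{T}\mathbf{S}\mathbf{C}=\mathbf{1}\) orthonormality of the resulting eigenvectors.

```python
# Solve the Roothaan equations FC = SCE as a generalized eigenvalue problem.
# F and S are random symmetric / symmetric-positive-definite stand-ins here.
import numpy as np
from scipy.linalg import eigh

M = 6                                    # number of AO basis functions
rng = np.random.default_rng(0)
A = rng.standard_normal((M, M))
F = (A + A.T) / 2                        # symmetric "Fock" matrix
B = rng.standard_normal((M, M))
S = B @ B.T + M * np.eye(M)              # symmetric positive-definite "overlap" matrix

E, C = eigh(F, S)                        # orbital energies and expansion coefficients
print("orbital energies:", E)
print("C^T S C = 1:", np.allclose(C.T @ S @ C, np.eye(M)))
```

In a real SCF calculation this diagonalization is repeated at every iteration, because \(\mathbf{F}\) depends on the occupied eigenvectors through the density matrix.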

By definition, in molecular states with closed shells all MOs are either fully occupied by electrons or empty. The corresponding many-electron wavefunction can be written as a single Slater determinant, with each occupied MO containing an equal number of electrons with \(\alpha \) and \(\beta \) spins. Such a function describes the ground state of a molecule with total spin \(S =\) 0 and with the symmetry of the identity representation of the point-symmetry group.

In the case of open-shell molecular systems a single Slater determinant describes a state with fixed spin projection (equal to the difference between the numbers \(n_{\alpha }\) and \(n_{\beta }\) of electrons) but is not a correct spin eigenfunction. Indeed, let the highest one-electron energy level of an open-shell configuration be occupied by one electron with \(\alpha \) spin. As there are no spin interactions in the Hartree-Fock Hamiltonian, the same electron energy corresponds to the function with a \(\beta \)-spin electron on the highest occupied level. This means that in order to obtain the correct total-spin eigenfunction transforming over the identity representation of the point group it is necessary to use a sum of Slater determinants. Molecules with an odd number of electrons, radicals and magnetic systems have open shells in the ground state.

The restricted open-shell Hartree-Fock (ROHF) and the unrestricted Hartree-Fock Method (UHF) approximations permit, however, open-shell systems to be described, while maintaining the simplicity of the single-determinant approximation. This is made at the stage of self-consistent electronic-structure calculations. Afterwards, the obtained spin-orbitals can be used to get the correct total spin many-determinant wavefunction and to calculate the corresponding electron energy.

The ROHF [84] many-electron wavefunction is, in the general case, a sum of Slater determinants; each determinant contains a closed-shell subset, with doubly occupied orbitals and an open-shell subset, formed by orbitals occupied by a single electron.

In one particular case, the ROHF wavefunction reduces to a single determinant: this is the so-called half-closed-shell case, where it is possible to define two sets of orbitals, the first \(n_d\) occupied by paired electrons and the second \(n_s\) by electrons with parallel spins. The total number of electrons is \(n=n_d+n_s\). In all molecular programs ROHF means a single-determinant wavefunction with maximal spin projection, which is automatically an eigenfunction of \(S^2\) with spin \(S=n_s/2\). So, for the ROHF method a projection onto a pure spin state is not required. The space symmetry of the Hamiltonian in the ROHF method remains the same as in the RHF method, i.e. it coincides with the space symmetry of the nuclear configuration. The double-occupancy constraint allows the ROHF approach to obtain solutions that are eigenfunctions of the total spin operator. The molecular-orbital diagram for the ROHF half-closed-shell case is given in Fig. 3.4a.

Fig. 3.4 The filling of the one-electron levels in a the ROHF and b the UHF methods

In the UHF method, while keeping a single-determinant description, the constraint of double occupancy of the molecular orbitals is removed: the \(\alpha \) electrons are allowed to occupy orbitals other than those occupied by the \(\beta \) electrons. The greater variational freedom allows the UHF method to produce wavefunctions that are energetically more stable, i.e. give a lower electron energy. Another advantage of the UHF method is that it allows solutions with locally nonzero (positive or negative) spin density, i.e. it can describe ferromagnetic or antiferromagnetic systems. However, UHF solutions are formed by a mixture of spin states and are eigenfunctions only of the total spin-projection operator \(S_z\).

In the UHF approach, the single-determinant wavefunction is computed using \(n_{\alpha }\) MOs \(\varphi ^{\alpha }(\varvec{r})\) and \(n_{\beta }\) MOs \(\varphi ^{\beta }(\varvec{r})\), corresponding to the electrons with \(\alpha \) and \(\beta \) spin, respectively.

Let us now examine the main modifications of the Hartree-Fock–Roothaan equations (3.22) that are necessary to take into account the translation symmetry of periodic systems.

The first and most important difference appears in the LCAO representation of the crystalline orbitals (COs) compared to the molecular orbitals (MOs). For a periodic system the AOs are symmetrized over the translation subgroup of the space group, giving the Bloch sums of AOs:

$$\begin{aligned} {\chi }_{\mu \varvec{k}}(\varvec{r})=\frac{1}{\sqrt{N}}\sum \limits _{\varvec{R}_n} \exp ({\mathrm{i}\varvec{k}\varvec{R}_n}){\chi }_{\mu }(\varvec{r} - \varvec{R}_n) \end{aligned}$$
(3.23)

In (3.23), the index \(\mu \) labels all AOs in the reference primitive unit cell (\(\mu =1,2,\ldots ,M\)) and \(\varvec{R}_n\) is the translation vector of the direct lattice (for the reference primitive cell \(\varvec{R}_n=0\)). The summation in (3.23) is supposed to be made over the infinite direct lattice (in the model of the infinite crystal) or over the inner primitive translations \(\varvec{R}_n^0\) of the cyclic cluster (in the cyclic model of a crystal).

In the LCAO approximation a crystalline orbital (CO) is a Bloch function, as it is expanded over the Bloch sums of AOs:

$$\begin{aligned} {\varphi }_{i\varvec{k}}(\varvec{r})=\sum \limits _{\mu }C_{i\mu }(\varvec{k}){\chi }_{\mu \varvec{k}} (\varvec{r}) \end{aligned}$$
(3.24)

In the MO LCAO approximation the index \(i\) in the expansion (3.21) numbers the MOs (their total number \(M\) is equal to the number of atomic orbitals used in the expansion (3.21)). In the case of closed shells the \(N_e\) electrons of the molecule occupy \(N_e/2\) MOs and \((M - N_e/2)\) MOs are empty.

The total number of COs of the cyclic cluster equals \(M\times N\) (\(N\) is the number of the primitive unit cells in the cyclic cluster; \(M\) is the number of AO basis functions per primitive unit cell). For the cyclic cluster containing \(N_e =N\times n\) electrons (\(n\) is the number of electrons per primitive unit cell) \(N_e/2 \) crystalline orbitals are occupied by electrons and \((M\times N - N_e/2)\) orbitals are empty. The numbering of crystalline orbitals is made by two indices \(i\) and \(\varvec{k}\): the one-electron states of a crystal form the energy bands numbered by the index \(i\), each band joining the \(N\) states with the same \(i\). The closed-shell case (the nonconducting crystals) means that all the energy bands are either filled or empty (there are no partly filled energy bands).

For \(N\rightarrow \infty \) the total number of COs also becomes infinite and the one-electron energy levels form continuous energy bands, but the total number of energy bands remains finite and equal to \(M\). For the closed-shell case the \(n/2\) lowest-energy bands are occupied for each \(\varvec{k}\)-vector. The forbidden gap in the nonconducting crystals is the crystalline analog of the HOMO (highest-occupied MO)–LUMO (lowest-unoccupied MO) one-electron energy difference.

On the basis of Bloch functions, the Fock and overlap matrices \( F\) and \( S\) become

$$\begin{aligned} F_{\mu \nu }(\varvec{k})=\sum \limits _{\varvec{R}_n}\exp (\mathrm{i}\varvec{k}\varvec{R}_n)F_{\mu \nu }(\varvec{R}_n);\ \ S_{\mu \nu }(\varvec{k})=\sum \limits _{\varvec{R}_n}\exp (\mathrm{i}\varvec{k}\varvec{R}_n)S_{\mu \nu }(\varvec{R}_n) \end{aligned}$$
(3.25)

where \(F_{\mu \nu }(\varvec{R}_n)\) is the matrix element of the Fock operator between the \(\mu \)th AO located in the reference (zero) cell and the \(\nu \)th AO located in the \(\varvec{R}_n\) cell. The matrix element \(S_{\mu \nu }(\varvec{R}_n)\) is the overlap integral of the same AOs. Owing to translational symmetry, the row index can be limited to the reference cell. Matrices represented in the Bloch basis (or in \(\varvec{k}\)-space) take a block-diagonal form, as Bloch functions are bases for irreducible representations of the translation group T (\(\mathrm{T}^{(N)}\) for the cyclic cluster of \(N\) primitive cells); each block has the dimension of the AO basis in the primitive cell, \(M\),

$$\begin{aligned} \mathrm{F}(\varvec{k})\mathrm{C}(\varvec{k})=\mathrm{S}(\varvec{k})\mathrm{C}(\varvec{k})\mathrm{E}(\varvec{k}) \end{aligned}$$
(3.26)

In the HF LCAO method, (3.26) for periodic systems replaces (3.22) written for molecular systems. In principle, this equation should be solved at each step of the SCF procedure for all the (infinitely many) \(\varvec{k}\)-points of the Brillouin zone. Usually, a finite set \(\{\varvec{k}_j\} (j=1,2,\ldots ,L)\) of \(\varvec{k}\)-points is taken (this means replacing the infinite crystal by a cyclic cluster of \(L\) primitive cells). The convergence of the results with respect to the increase of the \(\varvec{k}\)-point set is examined in real calculations; for converged results interpolation techniques can be used for the eigenvalues and eigenvectors, as both are continuous functions of \(\varvec{k}\) [85]. The convergence of the SCF calculation results is connected with the density-matrix properties.
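
As an illustration of (3.25) and (3.26), the following minimal sketch assembles \(F(\varvec{k})\) and \(S(\varvec{k})\) from stored real-space blocks and solves the generalized eigenvalue problem at one \(\varvec{k}\)-point. The data layout (dictionaries keyed by a lattice-vector index) and the function name are assumptions made for the illustration; real codes handle the truncation of these lattice sums with specialized strategies.

```python
import numpy as np
from scipy.linalg import eigh

def solve_at_k(k, F_R, S_R, lattice_vectors):
    """Assemble F(k), S(k) by the Fourier sums (3.25) and solve the
    generalized eigenvalue problem (3.26) at a single k-point.

    F_R, S_R        : dicts mapping a lattice-vector index n to the M x M
                      real-space blocks F(R_n), S(R_n) (hypothetical layout).
    lattice_vectors : dict mapping the same index n to the Cartesian R_n.
    """
    M = next(iter(F_R.values())).shape[0]
    Fk = np.zeros((M, M), dtype=complex)
    Sk = np.zeros((M, M), dtype=complex)
    for n, Rn in lattice_vectors.items():
        phase = np.exp(1j * np.dot(k, Rn))   # exp(i k . R_n)
        Fk += phase * F_R[n]
        Sk += phase * S_R[n]
    # Generalized Hermitian eigenproblem F(k) C(k) = S(k) C(k) E(k)
    eps, C = eigh(Fk, Sk)
    return eps, C                            # band energies and CO coefficients
```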

The overlap matrix elements in the AO basis are the lattice sums of the overlap integrals between AOs, numbered now by indices of AO in the zero (reference) cell and of the cell defined by the translation vector \(\varvec{R}_n\).

The Fock matrix elements (as in the case of molecules) are the sums of one-electron (the kinetic energy T and the nuclear attraction energy Z) and two-electron (Coulomb J and exchange energies) parts. The difference of these matrices from the molecular analogs is the appearance of the sums over a direct lattice, containing one-electron (kinetic energy and nuclear attraction) and two-electron integrals.

The spinless one-electron density matrix (DM) elements are defined in the LCAO approximation as

$$\begin{aligned} P_{\lambda \sigma }({\varvec{R}}_{m'})=2\int d\varvec{k}e^{i\varvec{k}{\varvec{R}}_{m'}} \sum \limits _i C^*_{i\lambda }(\varvec{k})C_{i\sigma }(\varvec{k})\theta (\epsilon _F- \epsilon _i(\varvec{k})) \end{aligned}$$
(3.27)

where \(\epsilon _F\) is the Fermi energy, the integration in (3.27) extends to the first Brillouin zone and corresponds to the model of the infinite crystal. In the cyclic-cluster model the integration is replaced by the summation over \(\varvec{k}\)-points numbering the irreps of the group of the inner cyclic-cluster translations. The DM elements depend on the eigenvectors \(\mathrm{C}_i(\varvec{k}) \) of the \(\mathrm F(\varvec{k})\) matrix. The integration or summation over \(\varvec{k}\) in DM elements makes it impossible to solve (3.26) for different \(\varvec{k}\) independently.
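
The cyclic-cluster (discrete) form of (3.27), in which the BZ integral becomes a weighted sum over a finite \(\varvec{k}\)-point set, can be sketched as follows; the array layout for the eigenvectors and the normalization of the weights are assumptions made for the illustration.

```python
import numpy as np

def density_matrix_block(R_m, kpts, weights, eps, C, e_fermi):
    """Cyclic-cluster version of (3.27): the BZ integral is replaced by a
    weighted sum over a finite k-point set.

    kpts    : (L, 3) Cartesian k-points; weights : (L,) weights summing to 1
    eps     : (L, M) one-electron energies eps_i(k)
    C       : (L, M, M) eigenvector arrays with C[k][mu, i] = C_{i mu}(k)
              (a hypothetical storage convention)
    Returns the M x M block P_{lambda sigma}(R_m).
    """
    M = C.shape[-1]
    P = np.zeros((M, M), dtype=complex)
    for w, k, e_k, C_k in zip(weights, kpts, eps, C):
        occ = e_k < e_fermi                  # theta(eps_F - eps_i(k))
        C_occ = C_k[:, occ]
        band_sum = C_occ.conj() @ C_occ.T    # sum_i C*_{i lambda} C_{i sigma}
        P += 2.0 * w * np.exp(1j * np.dot(k, R_m)) * band_sum
    return P
```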

The electron energy of the crystal (per primitive unit cell) as calculated within the HF LCAO approximation can be expressed in terms of the one-electron density matrix (DM) and includes the lattice sums

$$\begin{aligned} E_e=\frac{1}{2}\sum \limits _{\mu \nu }^M\sum \limits _{\varvec{R}_n} P_{\mu \nu }(\varvec{R}_n)\left( F_{\mu \nu }(\varvec{R}_n)+T_{\mu \nu }(\varvec{R}_n)+ Z_{\mu \nu }(\varvec{R}_n) \right) \end{aligned}$$
(3.28)

The direct lattice summations in the Fock matrix elements and the \(\varvec{k}\) dependence of the one-electron DM, energy levels and COs are the main difficulties of the HF LCAO method for periodic systems, compared with molecules. A special strategy must be specified for the treatment of the infinite Coulomb and exchange series, as well as for the substitution of the integral that appears in the DM with a weighted sum extended over a finite set of \(\varvec{k}\)-points. The efficient solution of these problems has been implemented in the CRYSTAL code [86]. These difficulties also arise in the UHF and ROHF LCAO methods for periodic systems.
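
Given the real-space blocks of the density, Fock, kinetic-energy and nuclear-attraction matrices, the per-cell energy (3.28) is a truncated lattice sum; the short sketch below assumes the same hypothetical dictionary layout as in the previous sketches, with the sum restricted to the lattice vectors actually stored.

```python
import numpy as np

def hf_energy_per_cell(P_R, F_R, T_R, Z_R):
    """Evaluate (3.28): E_e = 1/2 sum_{mu nu} sum_{R_n} P_{mu nu}(R_n)
    [F_{mu nu}(R_n) + T_{mu nu}(R_n) + Z_{mu nu}(R_n)].
    All arguments are dicts of M x M blocks keyed by the lattice-vector index n."""
    E = 0.0
    for n, P in P_R.items():
        # element-wise product and double sum over mu, nu for this R_n
        E += 0.5 * np.sum(P * (F_R[n] + T_R[n] + Z_R[n])).real
    return E
```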

In self-consistent calculations of the electronic structure of crystals, both in the basis of plane waves and in the basis of localized atomic-type functions, one needs at every stage of the self-consistent procedure to evaluate an approximate electron-density matrix by integration over the Brillouin zone (BZ):

$$\begin{aligned} \rho (\mathbf{r},\mathbf{r}')=\int \limits _{BZ}Q_{\mathbf{r},\mathbf{r}'}(\mathbf{k})d\mathbf{k}, \qquad Q_{\mathbf{r},\mathbf{r}'}(\mathbf{k})=\sum _v \psi _v^*(\mathbf{k},\mathbf{r}')\psi _v(\mathbf{k},\mathbf{r}) \end{aligned}$$
(3.29)

where the sum is over the occupied electronic states of the crystal. In practice, \(\psi (\mathbf{k},\mathbf{r})\) (and hence \(\varphi (\mathbf{k})\equiv Q_{\mathbf{r},\mathbf{r}'}(\mathbf{k})\)) are calculated on some finite mesh of \(\mathbf{k}\)-points. Therefore, to evaluate the integral (3.29) one needs to construct a numerical-integration formula based on an interpolation procedure over some set of interpolation functions.

The functions \(\varphi (\mathbf{k})\equiv Q_{\mathbf{r},\mathbf{r}'}(\mathbf{k})\) are periodic in reciprocal space, with periods determined by the basic translation vectors \(\mathbf{b}_{i}\) of the reciprocal lattice, and have the full point symmetry of the crystal (\(F\) is the point group of the crystal, of order \(n_F\)):

$$\begin{aligned} \varphi (\mathbf{k} + \mathbf{b}_{\mathbf{m}})=\varphi ({\mathbf{k}})=\varphi (f{\mathbf{k}}),\quad f\in F,\quad \mathbf{b}_{\mathbf{m}}=\sum _{i=1}^3m_i\mathbf{b}_i \end{aligned}$$
(3.30)

The plane waves \(\exp (\mathrm{i}\mathbf{k}\cdot \mathbf{a}_{\mathbf{n}})\) seem to be the most convenient as interpolation functions for the integrand in the BZ integration, where \(\mathbf{a}_{\mathbf{n}}=\sum _{i=1}^3n_i\mathbf{a}_i\) are direct lattice translation vectors and \(\mathbf{a}_i\) are primitive translations. It is easy with plane waves to take into account the translational and point symmetry of the crystal.

In the problem of BZ integration, the interpolation nodes (\(\mathbf{k}\)-points of the BZ) are called special points (SP).

Many procedures for special points generation have been proposed. The most popular special-points (SP) sets are meshes obtained by a very simple algorithm proposed by Monkhorst and Pack (MP) in [87]:

$$\begin{aligned} \mathbf{k}_\mathbf{p}^{(n)}=\sum _{i=1}^3u_{p_i}^{(n)}\mathbf{b}_{i}, n=1,2,\ldots , p_i=1,2,\dots ,n, \quad u_{p_i}^{(n)}=\frac{2p_i-n-1}{2n} \end{aligned}$$
(3.31)

These meshes are widespread in modern Hartree-Fock and density-functional theory calculations of crystals and are automatically generated in the corresponding computer codes. But MP meshes of SP have at least one shortcoming, which may be understood if one considers the problem from the point of view of the more general large unit cell–small Brillouin zone (LUC–SBZ) method of SP generation [59]. The MP method is a particular case of the LUC–SBZ method. In [88] a modification of the MP meshes was suggested that provides faster and more regular convergence of the results of the self-consistent calculations of the electronic structure of crystals.
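
A minimal sketch of the MP recipe (3.31) for an \(n_1\times n_2\times n_3\) mesh, in fractional coordinates of the reciprocal-lattice vectors, is given below; symmetry reduction of the mesh to the irreducible wedge of the BZ and the associated weights are not included.

```python
import numpy as np

def monkhorst_pack(n1, n2, n3):
    """Generate the Monkhorst-Pack fractional coordinates (3.31):
    u_p^(n) = (2p - n - 1) / (2n), p = 1..n, along each reciprocal axis.
    Returns an (n1*n2*n3, 3) array of k-points in units of b_1, b_2, b_3."""
    axes = [(2.0 * np.arange(1, n + 1) - n - 1) / (2.0 * n) for n in (n1, n2, n3)]
    u1, u2, u3 = np.meshgrid(*axes, indexing="ij")
    return np.column_stack([u1.ravel(), u2.ravel(), u3.ravel()])

# Example: a 2 x 2 x 2 mesh gives the eight points (+-1/4, +-1/4, +-1/4)
print(monkhorst_pack(2, 2, 2))
```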

The modification of the Monkhorst–Pack special-point meshes for Brillouin-zone integration suggested in [88] is especially useful for crystals with many atoms in a primitive unit cell or for point-defect calculations in a supercell model. In these cases each step of the self-consistent procedure in Hartree-Fock or DFT calculations is time consuming, so that a higher efficiency of the \(\mathbf{k}\)-point meshes shortens the computing time. It is also important for the optimization of lattice parameters or atomic positions in crystals, when the self-consistent procedure is repeated for different atomic structures. The LUC–SBZ method of special-point-set generation can be applied both in LCAO and plane-wave calculations to approximate the density matrix of a crystal.

3 Foundations of Density Functional Theory

3.1 Density Functional Theory

The low computational cost, combined with useful accuracy, has made Density Functional Theory (DFT) a standard technique in most branches of chemistry and materials science [89]. The number of papers given by Web of Knowledge when DFT is searched as a topic will soon reach 10,000 per year [89]. The applications to materials will soon outstrip those in chemistry. It is demonstrated in the next chapters that the majority of the published nanostructure calculations are also based on the DFT approach.

Density-functional theory has its conceptual roots in the Thomas–Fermi model of a uniform electron gas [90, 91] and the Slater local exchange approximation [92]. A formal justification of the Thomas–Fermi model was provided by the Hohenberg–Kohn theorems [93]. DFT has been very popular for calculations in solid-state physics since the 1970s. In many cases DFT with the local-density approximation (LDA) and plane waves as basis functions gives quite satisfactory results for solid-state calculations in comparison with experimental data, at relatively low computational cost compared to other ways of solving the quantum-mechanical many-body problem.

In DFT, the one-electron density \(\rho (\varvec{r})\)

$$\begin{aligned} \rho (\varvec{r})=N\int d^3\varvec{r}_2\int d^3\varvec{r}_3\ldots \int d^3\varvec{r}_N\varPsi ^*(\varvec{r},\varvec{r}_2,\ldots ,\varvec{r}_N)\varPsi (\varvec{r},\varvec{r}_2,\ldots ,\varvec{r}_N) \end{aligned}$$
(3.32)

becomes the key variable: DFT can be summarized by the sequence

$$\rho (\varvec{r})\rightarrow \varPsi (\varvec{r}_1,\varvec{r}_2,\ldots ,\varvec{r}_N)\rightarrow V(\varvec{r})$$

i.e. knowledge of \(\rho (\varvec{r})\) implies knowledge of the wavefunction and the potential, and hence of all other observables. This also represents the fact that ultimately the electron density and not a wavefunction is the observable. Although this sequence describes the conceptual structure of DFT, it does not really represent what is done in actual applications of it and does not make explicit the use of many-body wavefunctions.

Some chemists still consider DFT a method containing “semiempirism” (i.e. not ab-initio), but recognize that only a small number of semiempirical parameters is used in DFT and that these parameters are “universal to the whole of chemistry” [94]. It took a long time for quantum chemists to recognize the possible contribution of DFT. A possible explanation of this is that the molecule is a very different object from the solid, as the electron density in a molecule is very far from uniform [94]. DFT was not considered accurate enough for calculations in molecular quantum chemistry until the 1990s, when the approximations used in the theory were greatly refined to better model the exchange and correlation interactions. DFT is now a leading method for electronic-structure calculations in both fields.

Whereas the many-electron wavefunction is dependent on \(3N\) variables, three spatial variables for each of the \(N\) electrons, the density is only a function of three variables and is a simpler quantity to deal with both conceptually and practically.

The literature on DFT and its applications is large. Some representative examples are the following: books [95–101], separate chapters of monographs [2, 70, 102, 103] and review articles [104–110]. The short essay [111] introduces newcomers to the basic ideas and uses of modern electronic density-functional theory.

The recent progress of DFT (in particular, for materials and nanoscience) and its ongoing challenges are discussed in [89].

Two core elements of DFT are the Hohenberg–Kohn (HK) theorems [4, 93] and the Kohn–Sham equations [112]. The former are mainly conceptual, but it is via the latter that the most common implementations of DFT have been realized.

The first HK theorem states that the external potential of the system, and hence the total energy, is a unique functional of the electron density. This theorem demonstrates the existence of a one-to-one mapping between the ground-state electron density and the ground-state wavefunction of a many-particle system. The first Hohenberg–Kohn theorem is only an existence theorem, stating that the mapping exists, but does not provide any such exact mapping. It is in these mappings that approximations are made.

Hohenberg and Kohn proved [93] that for a given ground-state density \(\rho (\varvec{r})\) it is in principle possible to calculate the corresponding ground-state wavefunction \(\varPsi _0(\varvec{r}_1,\varvec{r}_2,\ldots ,\varvec{r}_N)\). This means that a given ground-state density of some system cannot correspond to two different external potentials \(V\), i.e. the electron density \(\rho (\varvec{r})\) defines all terms in the Hamiltonian, and therefore we can, in principle, determine the complete \(N\)-electron wavefunction of the ground state by knowing only the electron density. The first HK theorem shows only that it is possible to calculate any ground-state property when the electron density is known, but does not give the means to do it. Since the wavefunction is determined by the density, we can write it as \(\varPsi _0=\varPsi _0[\rho ]\), which indicates that \(\varPsi _0\) is a function of its \(N\) spatial variables, but a functional of \(\rho (\varvec{r})\). More generally, a functional \(F[n]\) can be defined as a rule for going from a function to a number, just as a function \(y = f(x)\) is a rule \((f)\) for going from a number \((x)\) to a number \((y)\). A simple example of a functional is the total number of electrons in a system \(N\)

$$\begin{aligned} N=\int d^3\varvec{r} \rho (\varvec{r})=N[\rho (\varvec{r})] \end{aligned}$$
(3.33)

which is a rule for obtaining the number \(N\), given the function \(\rho (\varvec{r})\). Note that the name given to the argument of \(\rho \) is completely irrelevant, since the functional depends on the function itself, not on its variable. Hence we do not need to distinguish \(F[\rho (\varvec{r})]\) from, e.g. \(F[\rho (\varvec{r}')]\). Another important case is that in which the functional depends on a parameter, such as in

$$\begin{aligned} V_H[\rho (\varvec{r})]=\int d^3\varvec{r}'\frac{\rho (\varvec{r}')}{|\varvec{r} - \varvec{r}'|} \end{aligned}$$
(3.34)

that is a rule that for any value of the parameter \(\varvec{r}\) associates a value \(V_H[\rho (\varvec{r})]\) with the function \(\rho (\varvec{r}')\). This term is the so-called one-electron Hartree potential of the Coulomb field created by all electrons of the system, with the electron in question included.
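
As a numerical illustration of these two functionals, the sketch below evaluates \(N[\rho ]\) of (3.33) and the Hartree potential (3.34) for an assumed test density not taken from the text, the hydrogen 1s density \(\rho (r)=e^{-2r}/\pi \) in atomic units, for which \(N[\rho ]=1\) and \(V_H(r)=1/r-e^{-2r}(1+1/r)\) are known analytically.

```python
import numpy as np

def trapz(y, x):
    """Simple trapezoidal quadrature."""
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

# Assumed test density: hydrogen 1s in atomic units
rho = lambda r: np.exp(-2.0 * r) / np.pi
r_grid = np.linspace(1e-6, 20.0, 20001)

# N[rho] = integral of rho over all space: a rule taking a function to a number (3.33)
N = trapz(4.0 * np.pi * r_grid**2 * rho(r_grid), r_grid)
print(N)   # close to 1.0

def hartree_potential(r0):
    """V_H[rho](r0) of (3.34), using the spherical-shell decomposition of 1/|r - r'|."""
    inner = r_grid <= r0
    q_in  = trapz(4.0 * np.pi * r_grid[inner]**2 * rho(r_grid[inner]), r_grid[inner])
    v_out = trapz(4.0 * np.pi * r_grid[~inner]   * rho(r_grid[~inner]), r_grid[~inner])
    return q_in / r0 + v_out

for r0 in (0.5, 1.0, 2.0):
    exact = 1.0 / r0 - np.exp(-2.0 * r0) * (1.0 + 1.0 / r0)
    print(r0, hartree_potential(r0), exact)   # numerical vs. analytic value
```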

DFT explicitly recognizes that nonrelativistic Coulomb systems differ only by their external potential \(V(\varvec{r})\), and supplies a prescription for dealing with the universal operators \(T\) and \(U\) once and for all. This is done by promoting the electron density \(\rho (\varvec{r})\) from just one of many observables to the status of the key variable, on which the calculation of all other observables can be based. In other words, \(\varPsi _0\) is a unique functional of \(\rho \), i.e. \(\varPsi _0[\rho ]\) and consequently all other ground-state observables \(O\) are also functionals of \(\rho \)

$$\begin{aligned} \langle O\rangle [\rho ]=\langle \varPsi _0[\rho ]|\hat{O}|\varPsi _0[\rho ]\rangle \end{aligned}$$
(3.35)

From this it follows, in particular, that also the ground-state energy is a functional of \(\rho \)

$$\begin{aligned} E_0=E[\rho ]=\langle \varPsi _0[\rho ]|T+V+U|\varPsi _0[\rho ]\rangle \end{aligned}$$
(3.36)

where the contribution of the external potential can be written explicitly in terms of the density

$$\begin{aligned} V[\rho ]=\int V(\varvec{r})\rho (\varvec{r})d^3\varvec{r} \end{aligned}$$
(3.37)

The functionals \(T[\rho ]\) and \(U[\rho ]\) are called universal functionals, while \(V[\rho ]\) is obviously nonuniversal, as it depends on the system under study. Having specified a system, i.e. once \(V\) is known, one then has to minimize the functional

$$\begin{aligned} E[\rho ]=T[\rho ]+U[\rho ]+\int V(\varvec{r})\rho (\varvec{r})d^3\varvec{r} \end{aligned}$$
(3.38)

with respect to \(\rho (\varvec{r})\), assuming one has reliable expressions for \(T[\rho ]\) and \(U[\rho ]\).

The second HK theorem states that the ground-state energy can be obtained variationally: the density that minimizes the total energy is the exact ground-state density, i.e. the theorem proves that the ground-state density minimizes the total electronic energy of the system. It states that once the functional that relates the electron density to the total electronic energy is known, one may calculate the energy approximately by inserting approximate densities \(\rho '\). Furthermore, just as for the variational method for wavefunctions, one may improve any actual calculation by minimizing the energy functional \(E[\rho ']\). A successful minimization of the energy functional will yield the ground-state density \(\rho _0\) and thus all other ground-state observables. A practical scheme for calculating ground-state properties from the electron density was provided by the approach of Kohn and Sham [112], considered in the next section.

3.2 The Kohn–Sham Single-particle Equations

Within the framework of Kohn–Sham (KS) DFT, the intractable many-body problem of interacting electrons in a static external potential is reduced to a tractable problem of noninteracting electrons moving in an effective potential. The functional in (3.38) is rewritten as the density functional of a fictitious noninteracting system

$$\begin{aligned} E_{eff}[\rho ]=\langle \varPsi _{eff}[\rho ]|T_{eff}+V_{eff}|\varPsi _{eff}[\rho ]\rangle \end{aligned}$$
(3.39)

where \(T_{eff}\) denotes the noninteracting electrons kinetic energy and \(V_{eff}\) is an external effective potential in which the electrons are moving. It is assumed that the fictitious (model) system has the same energy as the real system. Obviously, \(\rho _{eff}(\varvec{r})=\rho (\varvec{r})\) if \(V_{eff}\) is chosen to be

$$\begin{aligned} V_{eff}=V+U+(T-T_{eff}) \end{aligned}$$
(3.40)

Thus, one can solve the so-called Kohn–Sham equations of this auxiliary noninteracting system with the effective Hamiltonian

$$\begin{aligned} H_{eff}=\sum \limits _{i=1}^N\left[ -\frac{1}{2}\varDelta _i+V_{eff}(\varvec{r}_i)\right] =\sum \limits _{i=1}^N h_{eff}(\varvec{r}_i) \end{aligned}$$
(3.41)

that yields the orbitals \(\varphi _i(\varvec{r})\) that reproduce the density \(\rho (\varvec{r})\) of the original many-electron system, \(\rho (\varvec{r})=\rho _{eff}(\varvec{r})={\sum }_{i=1}^N|\varphi _i(\varvec{r})|^2\). The effective single-electron potential \(V_{eff}(\varvec{r})\) can be written as

$$\begin{aligned} V_{eff}(\varvec{r})=V+\int \frac{\rho _{eff}(\varvec{r}')}{|\varvec{r} - \varvec{r}'|}d^3\varvec{r}'+V_{XC}\left[ \rho _{eff}(\varvec{r})\right] \end{aligned}$$
(3.42)

where the second term denotes the so-called Hartree term describing the electron–electron Coulomb repulsion, while the last term \(V_{XC}\) is called the exchange-correlation potential and includes all the many-electron interactions. Since the Hartree term and \(V_{XC}\) depend on \(\rho (\varvec{r})\), which depends on the \(\varphi _i\), which in turn depend on \(V_{eff}\), the Kohn–Sham equations have to be solved in a self-consistent (i.e. iterative) way. Usually, one starts with an initial guess for \(\rho (\varvec{r})\), then calculates the corresponding \(V_{eff}\) and solves the Kohn–Sham equations for the \(\varphi _i\). From these one calculates a new density and starts again. This procedure is then repeated until convergence is reached.
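
The iterative cycle just described can be summarized by the following schematic sketch; the three callables (for building \(V_{eff}\), solving the one-particle equations and rebuilding the density) are hypothetical placeholders, and simple linear density mixing is assumed only to stabilize the iteration.

```python
import numpy as np

def scf_loop(rho0, build_veff, solve_ks, density_from_orbitals,
             mix=0.3, tol=1e-6, max_iter=100):
    """Schematic Kohn-Sham self-consistency cycle.
    build_veff(rho)            -> effective potential of (3.42)
    solve_ks(veff)             -> (orbital energies, orbitals) from (3.43)
    density_from_orbitals(phi) -> rho(r) = sum_i |phi_i(r)|^2
    All three callables are placeholders for a real implementation."""
    rho = rho0
    for it in range(max_iter):
        veff = build_veff(rho)                    # Hartree + external + V_xc
        eps, phi = solve_ks(veff)                 # one-particle equations
        rho_new = density_from_orbitals(phi)
        if np.max(np.abs(rho_new - rho)) < tol:   # convergence test on the density
            return eps, phi, rho_new
        rho = (1.0 - mix) * rho + mix * rho_new   # simple linear mixing
    raise RuntimeError("SCF not converged")
```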

There are only single-particle operators in (3.41). Therefore, the solution to the Schrödinger equation for a model system of noninteracting electrons can be written exactly as a single Slater determinant \(\varPsi =|\varphi _1,\varphi _2,\ldots ,\varphi _N|\), where the single-particle orbitals \(\varphi _i\) are determined as solutions of the single-particle equation

$$\begin{aligned} h_{eff}\varphi _i=\varepsilon _i\varphi _i \end{aligned}$$
(3.43)

Furthermore, \(\rho (\varvec{r})={\sum }_{i=1}^N |\varphi _i(\varvec{r})|^2\), where the summation runs over the \(N\) orbitals with the lowest eigenvalues \(\varepsilon _i\).

The main problem is that we do not know the exact form of the effective potential \(V_{eff}\), i.e. the exact exchange-correlation energy and potential. There exist approximations to these (LDA, GGA, hybrid exchange-correlation potentials) that are good approximations in very many cases. On the other hand, by introducing an approximation in the Schrödinger equation for the model noninteracting particles, we do not know whether improved calculations (e.g. using larger basis sets) also lead to improved results (compared with, e.g. experiment or other calculations). The success of the Kohn–Sham approach is based on the assumption that it is possible to construct the model system of noninteracting particles moving in an effective external potential. Thus, it is indirectly assumed that for any ground-state density there exists such an effective potential (the corresponding density is called \(V\)-representable). There exist (specifically constructed) examples where this is not the case, but in most practical applications this represents no problem [102].

HK theorems and KS equations can be extended to spin-polarized systems, where the electron-density components \(\rho _{\alpha }(\varvec{r}), \rho _{\beta }(\varvec{r})\) for spin-up and spin-down orbitals differ, i.e. the spin density \(\rho ^s(\varvec{r})=\rho _{\alpha }(\varvec{r})- \rho _{\beta }(\varvec{r})\) is nonzero.

The result is that the total energy and any other ground-state properties become a functional not only of \(\rho (\varvec{r})\) but also of spin-density \(\rho ^s(\varvec{r})\), i.e. \(E_{eff}=E[\rho (\varvec{r}),\rho ^s(\varvec{r})]\). Spin-density-functional theory (SDFT) is a widely implemented and applied formalism of DFT.

Since 1990 there has been an enormous number of comparisons for molecules between DFT KS and HF (MO) theory. Such a comparison is easily extended to crystals when we consider COs instead of MOs. The discussion of the advantages and disadvantages of DFT compared to MO theory can be found, for example, in [70] and is briefly reproduced here. The most fundamental difference between DFT and MO theory is the following: DFT optimizes an electron density, while MO theory optimizes a wavefunction. So, to determine a particular molecular property using DFT, we need to know how that property depends on the density, while to determine the same property using a wavefunction, we need to know the correct quantum-mechanical operator. As a simple example, consider the total energy of interelectronic repulsion. Even if we had the exact density for some system, we do not know the exact exchange-correlation energy functional, and thus we cannot compute the exact interelectronic repulsion. However, with the exact wavefunction it is a simple matter of evaluating the expectation value for the interelectronic repulsion operator to determine this energy. Thus, it is easy to become confused about whether there exists a KS “wavefunction”. Formally, the KS orbitals are pure mathematical constructs useful only in construction of the density. In practice, however, the shapes of KS orbitals tend to be remarkably similar to canonical HF MOs and they can be quite useful in qualitative analysis of chemical properties. If we think of the procedure by which they are generated, there are indeed a number of reasons to prefer KS orbitals to HF orbitals. For instance, all KS orbitals, occupied and virtual, are subject to the same external potential. HF orbitals, on the other hand, experience varying potentials, and, in particular, HF virtual orbitals experience the potential that would be felt by an extra electron being added to the molecule. As a result, HF virtual orbitals tend to be too high in energy and anomalously diffuse compared to KS virtual orbitals. This fact is especially important for crystalline solids and explains why HF bandgaps are overestimated compared to those in DFT and experiment. Unfortunately, for some choices of exchange-correlation potential DFT bandgaps are too small in comparison with the experimental data. In exact DFT, it can also be shown that the eigenvalue of the highest occupied KS MO is the exact first ionization potential, i.e. there is a direct analogy to Koopmans' theorem for this orbital; in practice, however, approximate functionals are quite poor at predicting IPs in this fashion without applying some sort of correction scheme, e.g. an empirical linear scaling of the eigenvalues.

The Slater determinant formed from the KS orbitals is the exact wavefunction for the fictitious noninteracting system having the same density as the real system. This KS Slater determinant has certain interesting properties in comparison to its HF analogs. It is an empirical fact that DFT is generally much more robust in dealing with open-shell systems where UHF methods show high spin contamination, i.e. incorporate some higher spin states (doublets are contaminated by quartets and sextets, while triplets are contaminated by quintets and septets). The degree of spin contamination can be estimated by inspection of \(\langle S^2\rangle \), which should be 0.0 for a singlet, 0.75 for a doublet, 2.00 for a triplet, etc. Note, incidentally, that the expectation values of \(S^2\) are sensitive to the amount of HF exchange in the functional. A “pure” DFT functional nearly always shows very small spin contamination, and each added per cent of HF exchange tends to result in a corresponding percentage of the spin contamination exhibited by the HF wavefunction. This behavior can make the hybrid HF-DFT functionals useful for open-shell systems (see next subsection).
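
A small sketch of this diagnostic follows: the ideal \(\langle S^2\rangle =S(S+1)\) values and the textbook UHF expectation value computed from the overlap matrix between occupied \(\alpha \) and \(\beta \) spatial orbitals. The UHF formula is the standard one quoted in quantum-chemistry texts and is used here as an assumption, since it is not written out in the present text.

```python
import numpy as np

def s2_exact(n_alpha, n_beta):
    """Exact <S^2> = S(S+1) for a pure spin state with S = (n_alpha - n_beta)/2:
    0.0 for a singlet, 0.75 for a doublet, 2.0 for a triplet, ..."""
    s = 0.5 * (n_alpha - n_beta)
    return s * (s + 1.0)

def s2_uhf(S_ab):
    """Standard UHF expectation value (assumed textbook formula):
    <S^2>_UHF = S(S+1) + N_beta - sum_{ij} |<phi_i^alpha | phi_j^beta>|^2,
    where S_ab is the N_alpha x N_beta overlap matrix between the occupied
    alpha and beta spatial orbitals."""
    n_alpha, n_beta = S_ab.shape
    return s2_exact(n_alpha, n_beta) + n_beta - np.sum(np.abs(S_ab) ** 2)

# A doublet whose beta orbital coincides with one alpha orbital is uncontaminated:
print(s2_uhf(np.eye(2)[:, :1]))   # prints 0.75
```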

The formal scaling behavior of DFT is, in principle, no worse than \(N^3\), where \(N\) is the number of basis functions used to represent the KS orbitals. This is better than HF by a factor of \(N\), and very substantially better than other methods that include electron correlation.

The most common methods for solving the KS equations proceed by expanding the Kohn–Sham orbitals in a basis set. DFT has a clear advantage over HF in its ability to use basis functions that are not necessarily contracted Gaussians. The motivation for using contracted GTOs is that arbitrary four-center two-electron integrals can be solved analytically. In the electronic-structure programs where DFT was added as a new feature to the existing HF code, the representation of the density in the classical electron-repulsion operator is carried out using the KS orbital basis functions. Thus, the net effect is to create a four-index integral, and these codes inevitably continue to use contracted GTOs as basis functions. In particular, such a scheme is used in the CRYSTAL code [17]. However, if the density is represented using an auxiliary basis set, or even represented numerically, other options are available for the KS orbital basis set, including STOs. The SIESTA density-functional code [44] for crystalline solids uses a numerical AO basis instead of GTOs. STOs have the advantage that fewer of them are required (since they have the correct cusp behavior at the nuclei), and certain advantages associated with symmetry that can be exploited even more readily, so they speed up calculations considerably. The Amsterdam density-functional code and its BAND version for solids [45] make use of STO basis functions covering atomic numbers 1–118.

Another interesting possibility is the use of plane waves as basis sets in periodic infinite systems (crystalline solids) represented using periodic boundary conditions. While it takes an enormous number of plane waves to properly represent the decidedly aperiodic densities that are possible within the unit cells of interesting chemical systems, the necessary integrals are particularly simple to solve, and thus this approach has found wide use in solid-state physics.

3.3 Climbing the Jacob’s Ladder for Exchange-Correlation Functionals

The effective potential (3.40) includes the external potential and the effects of the Coulomb interactions between the electrons, e.g. the exchange and correlation interactions. In principle, it also includes the difference in kinetic energy between the fictitious noninteracting system and the real system. In practice, however, this difference is ignored in many modern functionals as empirical parameters appear that necessarily introduce some kinetic-energy correction if they are based on experiment [70].

Modeling the exchange and correlation interactions becomes difficult within KS DFT as the exact functionals for exchange and correlation are not known except for the homogeneous (uniform) electron gas. However, approximations exist that permit the calculation of real systems.

Perdew [113] described a Jacob's ladder in which the “rungs” (levels of sophistication) of density-functional approximations ultimately lead to the “Heaven” of chemical accuracy [114], see Fig. 3.5.

Fig. 3.5 Jacob's ladder of density-functional approximations to the exchange-correlation energy; n is the electron density (Reprinted figure with permission from Perdew et al. [113], Copyright (2005) by the AIP Publishing LLC)

The first “rung” consists of density functionals that are classified as local spin density approximation (LSD) and local density approximation (LDA). The second “rung” consists of semilocal density functionals that depend explicitly on both the electron density and the density gradient and are referred to as generalized gradient approximations (GGAs). The GGA density functionals increase the level of accuracy previously achieved by the LDA.

More accurate functionals of higher “rungs” can be called “beyond-GGA functionals” or nonlocal functionals.

The third “rung” consists of density functionals that, in addition to the electron density and the density gradient, depend explicitly on the kinetic-energy density, expressed in terms of the occupied KS orbitals. Density functionals on the third “rung” are called meta-GGA (MGGA) and they strive for wider applicability and better accuracy than LDA or GGA functionals [114, 115].

Fourth-rung hybrid functionals mix a fraction of the Hartree-Fock exchange \(\epsilon _x\) into the DFT exchange functional, i.e. the wavefunction-based HF method is combined with the density-based theory. Walter Kohn, the Nobel Prize winner in chemistry, writes in his Nobel lecture [4] that wavefunction-based and density-based theories will, in complementary ways, continue not only to give us quantitatively more accurate results, but also to contribute to a better physical/chemical understanding of the electronic structure of matter.

MGGA and hybrid functionals can be called orbital-dependent functionals because they are not only represented in terms of the electron density, but also contain parts represented in single-particle Kohn–Sham orbitals \(\varphi _i(\varvec{r})\). In MGGA functionals the kinetic energy density \(\tau (\varvec{r})=1/2{\sum }_i|\nabla \varphi _i(\varvec{r})|^2\) is included. In hybrid functionals the nonlocal HF exchange energy is also an orbital-dependent functional. Still another type of orbital functional is the self-interaction correction (SIC).

The fifth “rung” density-functionals, called double-hybrid, are considerably more complex. A double hybrid functional includes a certain amount of HF exchange and the perturbative second-order correlation part PT2 (hence double hybrid) [116]. In this case the unoccupied KS orbitals are included in the calculation.

The simplest approximation is the local-density approximation (LDA), based upon the exact exchange energy for a uniform electron gas, which can be obtained from the Thomas–Fermi (TF) model, and from fits to the correlation energy for a uniform electron gas.

In the TF model it is suggested that the number of electrons is so large that the system can be treated using quantum-statistical arguments. The approximation in the TF model concerns the kinetic energy. For a homogeneous interaction-free electron gas the density is constant and the average kinetic energy per particle is \(\varepsilon _t^{hom}(n)\propto n^{2/3}\), so that the kinetic energy per unit volume in this model is \(n\varepsilon _t^{hom}(n)\propto n^{5/3}\). If the electron density varies sufficiently slowly in space, \(T^{LDA}=\int d^3\varvec{r}\,\rho (\varvec{r})\varepsilon _t^{hom}(\rho (\varvec{r}))\) can serve as a workable approximation for the kinetic-energy functional.

If this is combined with the expression for the nuclei–electron attractive potential and the electron–electron Hartree repulsive potential we have the TF expression for the energy of a homogeneous gas of electrons in a given external potential:

$$\begin{aligned} E_{TF}[\rho (\varvec{r})]=C_F\int \rho ^{5/3}(\varvec{r})d^3\varvec{r}-\sum \limits _j\int \frac{Z_j\,\rho (\varvec{r})}{|\varvec{r} -\varvec{R}_j|}d^3\varvec{r}+\frac{1}{2}\int \int \frac{\rho (\varvec{r})\rho (\varvec{r}')}{|\varvec{r} -\varvec{r}'|}d^3\varvec{r} d^3\varvec{r}' \end{aligned}$$
(3.44)

The importance of this equation lies not so much in how well it actually describes the energy of an atom, as in the fact that the energy is given completely in terms of the electron density \(\rho (\varvec{r})\).

This is an example of a density functional for the energy, allowing us to map the density \(\rho (\varvec{r})\) onto the energy \(E\) without any additional information required. Furthermore, the TF model employs the variational principle, assuming that the ground state of the system corresponds to the electron density that minimizes the energy (3.44) under the constraint \(N=\int d^3\varvec{r}\rho (\varvec{r})\).

The exchange-correlation energy \(E_{XC}\) is approximated by a sum of the exchange \(E_X\) and correlation \(E_C\) energies.

In the LDA the Dirac–Slater expression is used for the exchange energy

$$\begin{aligned} E_X[\rho ]=C_X\int \rho ^{4/3}(\varvec{r})d^3\varvec{r} \end{aligned}$$
(3.45)

or the more complicated expression suggested by von Barth and Hedin [117].

For the correlation energy functional \(E_C[\rho (\varvec{r})]\) the situation is more complicated, since even for a homogeneous electron gas it is not known exactly. Early approximate expressions for the correlation in homogeneous systems were based on applying perturbation theory. With the advent of highly precise calculations of the correlation energy of the electron liquid by Ceperley and Alder (CA) [118], the approximations for the correlation energy in a homogeneous system are made by a parameterization of the CA data for the free-electron gas. There are known parameterizations of Vosko–Wilk–Nusair [119], Perdew–Zunger [120] and Perdew–Wang [121]. These three parameterizations of the LDA are implemented in most standard DFT program packages (both for molecules and solids) and in many cases give almost identical results. On the other hand, the earlier parameterizations of the LDA, based on perturbation theory, can deviate substantially and are better avoided.
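
As a small worked example of the Dirac–Slater exchange (3.45), the sketch below evaluates \(E_X^{LDA}\) for an assumed test density, the hydrogen 1s density in atomic units. This case also illustrates the self-interaction problem discussed later in this section: for a one-electron density the exact exchange should cancel the Hartree self-repulsion of \(5/16\) Ha, while the spin-unpolarized LDA expression recovers only part of it.

```python
import numpy as np

def trapz(y, x):
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

C_X = -0.75 * (3.0 / np.pi) ** (1.0 / 3.0)   # Dirac-Slater coefficient in (3.45)

rho = lambda r: np.exp(-2.0 * r) / np.pi     # hydrogen 1s density (assumed test case)
r = np.linspace(1e-6, 20.0, 20001)

# E_X^LDA = C_X * integral of rho^(4/3) over all space (spin-unpolarized form)
E_x_lda = C_X * trapz(4.0 * np.pi * r**2 * rho(r) ** (4.0 / 3.0), r)
print(E_x_lda)   # about -0.21 Ha; the exact self-exchange for this density is -5/16 Ha
```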

The functional dependence of \(E_{XC}\) on the electron density is expressed as an interaction between the electron density and “an energy density” \(\varepsilon _{XC}\) that is dependent on the electron density

$$\begin{aligned} E_{XC}[\rho (\varvec{r})]=\int \rho (\varvec{r})\varepsilon _{XC}[\rho (\varvec{r})]d^3\varvec{r} \end{aligned}$$
(3.46)

The energy density \(\varepsilon _{XC}\) is treated as a sum of individual exchange and correlation contributions. Two different kinds of densities are involved [70]: the electron density is a per unit volume density, while the energy density is a per particle density. The LDA for \(E_{XC}\) formally consists in

$$\begin{aligned} E_{XC}^{LDA}[\rho (\varvec{r})]=\int d^3\varvec{r}\,\rho (\varvec{r})\left[ \varepsilon _X^{hom}[\rho (\varvec{r})]+\varepsilon _C^{hom}[\rho (\varvec{r})]\right] =\int d^3\varvec{r}\,\rho (\varvec{r})\,\varepsilon _{XC}^{hom}[\rho (\varvec{r})] \end{aligned}$$
(3.47)

The energy densities \(\varepsilon _X^{hom}\), \(\varepsilon _C^{hom}\) refer to a homogeneous system, i.e. the exchange-correlation energy is simply an integral over all space with the exchange-correlation energy density at each point assumed to be the same as in a homogeneous electron gas with that density. Nevertheless, LDA has proved amazingly successful, even when applied to systems that are quite different from the electron liquid that forms the reference system for the LDA.

In the local spin-density approximation (LSDA) the exchange-correlation energy can be written in terms of either of the two spin densities \(\rho ^{\alpha }(\varvec{r})\) and \(\rho ^{\beta }(\varvec{r})\)

$$\begin{aligned} E_{XC}^{LSDA}[\rho ^{\alpha }(\varvec{r}),\rho ^{\beta }(\varvec{r})]&= \int d^3\varvec{r}\rho (\varvec{r})\varepsilon _{XC}^{hom}[\rho ^{\alpha }(\varvec{r}), \rho ^{\beta }(\varvec{r})]\nonumber \\&= \int d^3\varvec{r} \rho (\varvec{r})\left[ \varepsilon _{X}^{hom}\left[ \rho ^{\alpha }(\varvec{r}), \rho ^{\beta }(\varvec{r})\right] +\varepsilon _{C}^{hom}\left[ \rho ^{\alpha }(\varvec{r}), \rho ^{\beta }(\varvec{r})\right] \right] \qquad \end{aligned}$$
(3.48)

or the total density \(\rho (\varvec{r})\) and the fractional spin polarization \(\zeta (\varvec{r})=(\rho ^{\alpha }(\varvec{r})- \rho ^{\beta }(\varvec{r}))/\rho (\varvec{r})\).

For many decades the LDA has been applied in, e.g. calculations of band structures and total energies in solid-state physics. The LDA provides surprisingly good results for metallic solids with delocalized electrons, i.e. those that most closely resemble the uniform electron gas (jellium). At the same time, there are well-known disadvantages of the LDA for solids. It reveals systematic shortcomings in the description of systems with localized electrons, resulting in an underestimation of bond distances and an overestimation of binding energies. LDA calculations as a rule give bandgaps that are too small. In the quantum chemistry of molecules the LDA is much less popular, because the local formulation of the energy expression does not account for the electronic redistribution in bonds. For well-localized electrons the inexact cancellation of the self-energy part (self-interaction) of the Hartree term by the LDA exchange functional is important (in the HF energy the self-energy part of the Hartree term is cancelled exactly by the corresponding part of the exchange term). The LDA fails to provide results that are accurate enough to permit a quantitative discussion of the chemical bond in molecules (so-called “chemical accuracy” requires calculations with an error of not more than about 1 kcal/mol per particle). The LDA exploits knowledge of the density only at each point, \(\rho (\varvec{r})\). Real systems, such as molecules and solids, are inhomogeneous (the electrons are exposed to the spatially varying electric fields produced by the nuclei) and interacting (the electrons interact via the Coulomb interaction). The way density-functional theory, in the local-density approximation, deals with this inhomogeneous many-body problem is by decomposing it into two simpler (but still highly nontrivial) problems: the solution of the spatially uniform many-body problem (the homogeneous electron liquid) yields the uniform exchange-correlation energy, and the solution of a spatially inhomogeneous noninteracting problem (the inhomogeneous electron gas) yields the particle density. Both steps are connected by the local-density approximation, which shows how the exchange-correlation energy of the uniform interacting system enters the equations for the inhomogeneous noninteracting system.

We note that both the local-density approximation and the local exchange approximation use only the diagonal part of the density matrix \(\rho (\varvec{r}, \varvec{r}')\), i.e. \(\rho (\varvec{r})\) = \(\rho (\varvec{r}, \varvec{r})\). However, these approximations are different in nature: in the LDA the local density is used to include both the exchange and the correlation of the electrons, whereas in the local exchange approximation to the HF exchange the electron correlation is not taken into account at all.

The particular way in which the inhomogeneous many-body problem is decomposed, and the various possible improvements over the LDA, are behind the success of DFT in practical calculations, in particular, those of materials. The most important improvement over LDA is the attempt to introduce a spatially varying density and include information on the rate of this variation in the functional. The corresponding functionals are known as semilocal functionals [107].

The first successful extensions of the LDA were developed in the early 1980s, when it was suggested that the density \(\rho (\varvec{r})\) be supplemented at a particular point \(\varvec{r}\) with information about the gradient of the electron density at this point, in order to account for the nonhomogeneity of the true electron density [100]. The LDA was interpreted as the first term of a Taylor expansion about the uniform density, and the resulting form of the functional was termed the gradient-expansion approximation (GEA). The authors expected to obtain better approximations of the exchange-correlation functional by extending the series with the next-lowest term. In practice, the inclusion of low-order gradient corrections almost never improves on the LDA, and often actually makes it worse. The reason for this failure is that for the GEA the exchange-correlation hole has lost many of the properties that made the LDA hole physically meaningful [100]. Higher-order corrections, on the other hand, are exceedingly difficult to calculate, and little is known about them.

It was a major breakthrough when it was realized, in the early 1980s, that instead of power-series-like systematic gradient expansions one could experiment with more general functions of \(\rho (\varvec{r})\) and \(\nabla \rho (\varvec{r})\), which do not need to proceed order by order. Such functionals, of the general form

$$\begin{aligned} E_{XC}^{GGA}[\rho _{\alpha },\rho _{\beta }]=\int d^3\varvec{r} f(\rho _{\alpha },\rho _{\beta },\nabla \rho _{\alpha },\nabla \rho _{\beta }) \end{aligned}$$
(3.49)

have become known as generalized-gradient approximations (GGAs) [122]. GGA functionals are the workhorses of current density-functional theory. Different GGAs differ in the choice of the function \(f(\rho ,\nabla \rho )\). Note that this makes different GGAs much more different from each other than the different parameterizations of the LDA: essentially there is only one correct expression for \(\varepsilon _{XC}^{hom}(\rho )\), and the various parameterizations of the LDA are merely different ways of writing it [107]. On the other hand, depending on the method of construction employed for obtaining \(f(\rho ,\nabla \rho )\), one can obtain very different GGAs. In particular, GGAs used in molecular quantum chemistry typically proceed by fitting parameters to test sets of selected molecules. On the other hand, the GGAs used in solid-state theory tend to emphasize exact constraints on the density functional for the exchange-correlation energy [113]. In this approach, the density-functional approximations are assigned to various rungs according to the number and kind of their local ingredients [113].

The semiempirical functionals are fitted to selected data from experiment or from ab-initio calculations. The higher the rung of the functional, the larger the number of parameters (functionals with as many as 21 fit parameters are popular in chemistry). Is DFT ab-initio or semiempirical? As was suggested in [113], it can fall in between, as a nonempirical theory, when the functionals are constructed without empirical fitting.

The best nonempirical functional for a given rung is constructed to satisfy as many exact theoretical constraints as possible while providing satisfactory numerical predictions for real systems [113]. Once a rung has been selected, there remains little choice of which constraints to satisfy (but greater freedom in how to satisfy them) [113]. Accuracy is expected to increase up the ladder of rungs as additional local ingredients enable the satisfaction of additional constraints. A short summary of the exact constraints on \(E_{XC}[\rho ]\) can be found in [113]. In this paper some useful recommendations for DFT users are also given. Users should not randomly mix and match functionals, but use exchange and correlation pieces designed to work together, with their designer-recommended local parts. They should not shop indiscriminately for the functional that “works best”. Users should always specify which functional they used, with its proper name and literature reference, and why they chose it. Statements like “we used density-functional theory” or “we used the generalized gradient approximation” are almost useless to a reader or listener who wants to reproduce the results.

Nowadays, the most popular (and most reliable) GGA functionals are PBE (denoting the functional proposed in 1996 by Perdew et al. [123]) in physics, and BLYP (denoting the combination of Becke's 1988 exchange functional [124] with the 1988 correlation functional of Lee et al. [125]) in chemistry. A detailed consideration of the PBE functional can be found in [126].

PWGGA denotes the GGA functional suggested by Perdew and Wang [33, 122]. Many other GGA-type functionals are also available, and new ones continue to appear. The newer meta-GGA (MGGA) functional TPSS (Tao–Perdew–Staroverov–Scuseria) [127] supersedes the older PKZB (Perdew–Kurth–Zupan–Blaha) functional [128]. Known functionals continue to be modified, as has been done recently for the PBE functional to improve its accuracy for the thermodynamic and electronic properties of molecules [115, 129].

Recently, several new GGA functionals have been proposed to improve the description of solids: Armiento–Mattsson (AM05) [130], Wu–Cohen (WC06) [131] and the modified PBE GGA for solids (PBEsol) [132]. A short discussion of them can be found in [133]. Both WC06 and PBEsol are based on the PBE functional. In the former, the exchange part of the PBE functional is slightly modified to recover the fourth-order parameters of the gradient expansion in the limit of a slowly varying electron density. The PBEsol functional has the same analytical form as the PBE except that two of the parameters are changed in order to satisfy constraints that are more appropriate for solids. The AM05 functional was developed in quite a different framework, the so-called subsystem functional scheme, in which a general functional is obtained by combining functionals obtained from different model systems. All these new functionals have been systematically investigated for comprehensive sets of solids with different bonding characters, and have shown significant improvements over the LDA and PBE, in particular for equilibrium lattice constants [134]. On the other hand, these new GGA functionals still fail to treat the van der Waals dispersion interaction. Layered materials (being the base for nanotube production, see Part II) are characterized by a quasi-two-dimensional layered structure, and the inter-layer interaction is usually held to be of van der Waals character. The performance of LDA and GGA functionals for layered materials has rarely been investigated. In [135], \( MoSe_2\) was included in a set of solids used to test different GGA functionals. It was found that while the PBE overestimates the lattice constant c by 17 %, WC06 does so by only 2 %. A systematic investigation of LDA and GGA functionals for the layered materials \(ZrX_2\) and \(HfX_2\) (X = S, Se) was performed in [133], using the FPLAPW method.

The PBE functional [123] has two nonempirical derivations, based on the exact properties of the XC hole [123] and of the energy [136].

The PBE GGA correctly reduces to LSD for uniform electron densities, its correlation component recovers the slowly varying (\(t\rightarrow 0\)) and rapidly varying (\(t\rightarrow \infty \)) limits of the numerical GGA. Under uniform scaling (\(\rho (\varvec{r})\rightarrow \lambda ^3\rho (\lambda \varvec{r})\)), the PBE exchange energy scales like \(\lambda \) (as does the exact exchange functional) and the PBE correlation energy correctly scales to a constant as \(\lambda \rightarrow \infty \). For small-amplitude density variations around a uniform density, LSD is a very good approximation. The PBE functional recovers this limit. The PBE functional satisfies the Lieb–Oxford bound [137]:

$$\begin{aligned} E_X^{PBE}[\rho _{\alpha },\rho _{\beta }]\ge E_{XC}^{PBE}[\rho _{\alpha },\rho _{\beta }]\ge 2.273E_X^{LDA}[\rho ] \end{aligned}$$
(3.50)

A useful way to compare GGA functionals is to write

$$\begin{aligned} E_{XC}^{GGA}[\rho _{\alpha },\rho _{\beta }]\approx \int d^3\varvec{r} \rho \varepsilon _X(\rho )F_{XC}(r_s,\zeta ,s) \end{aligned}$$
(3.51)

where the enhancement factor \(F_{XC}(r_s,\zeta ,s)\) over the local exchange depends upon the local radius \(r_s (r_s\lesssim 1\) for the core electrons and \(r_s\gtrsim 1\) for the valence electrons), the spin polarization \(\zeta \) and inhomogeneity \(s\). The \(s\)-dependence of \(F_{XC}\) is the nonlocality of the GGA. We see that the nonempirical PBE functional best fulfils many of the physical and mathematical requirements of DFT. The application of the PBE functional in the calculations of bulk crystals and nanostructures is considered in Part II of this book.
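
For the exchange part of the PBE functional the enhancement factor of (3.51) has a simple closed form, \(F_X(s)=1+\kappa -\kappa /(1+\mu s^2/\kappa )\) with \(\kappa =0.804\) and \(\mu \approx 0.2195\). The sketch below evaluates it together with the reduced gradient \(s\); the correlation part and the spin dependence are omitted, so this is only an illustration of the enhancement-factor idea, not a full PBE implementation.

```python
import numpy as np

KAPPA, MU = 0.804, 0.21951        # PBE exchange parameters [123]

def pbe_fx(s):
    """PBE exchange enhancement factor F_X(s) = 1 + kappa - kappa/(1 + mu s^2/kappa).
    F_X(0) = 1 recovers the LDA limit; F_X -> 1 + kappa = 1.804 as s -> infinity,
    a cap chosen so that the local form of the Lieb-Oxford bound is respected."""
    return 1.0 + KAPPA - KAPPA / (1.0 + MU * s * s / KAPPA)

def reduced_gradient(rho, grad_rho):
    """Dimensionless inhomogeneity s = |grad rho| / (2 k_F rho), k_F = (3 pi^2 rho)^(1/3)."""
    kf = (3.0 * np.pi ** 2 * rho) ** (1.0 / 3.0)
    return np.abs(grad_rho) / (2.0 * kf * rho)

def pbe_exchange_density(rho, grad_rho):
    """Exchange energy per unit volume: rho * eps_x^unif(rho) * F_X(s)."""
    eps_x_unif = -0.75 * (3.0 / np.pi) ** (1.0 / 3.0) * rho ** (1.0 / 3.0)
    return rho * eps_x_unif * pbe_fx(reduced_gradient(rho, grad_rho))

print(pbe_fx(np.array([0.0, 0.5, 1.0, 3.0])))   # growth of F_X with inhomogeneity s
```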

At the same time, there are some failures of the PBE functional essential for extended systems. For example, the exact exchange-correlation hole in crystals can display a diffuse long-range tail that is not properly captured by the GGA (such a diffuse hole arises in the calculation of the surface energy of a metal). Neither the LSD nor the GGA correctly describes the long-range tail of the van der Waals interaction. Corrections of these failures might be made possible by using orbital-dependent or hybrid functionals.

The density-functional theory, even with rather crude approximations such as the LDA and the GGA, is often better than Hartree-Fock: the LDA is remarkably accurate, for instance, for geometries and frequencies, and the GGA has also made bond energies quite reliable. Therefore, an “aura of mystery” appeared around DFT (see the discussion of this by Baerends and Gritsenko [138]). The simple truth is not that LDA/GGA is particularly good, but that Hartree-Fock is rather poor in the description of the two-electron chemical bond. This becomes clear when one considers the statistical two-electron distribution, which is usually cast in terms of the exchange-correlation hole: a decrease in the probability of finding other electrons in the neighborhood of a reference electron, compared to the (unconditional) one-electron probability distribution [100].

The concept of the electron density, which provides an answer to the question “how likely is it to find one electron of arbitrary spin within a particular volume element while all other electrons may be anywhere”, can be extended to the probability of finding not one but a pair of electrons with spins \(\sigma _1\) and \(\sigma _2\) simultaneously within two volume elements \(d\varvec{r}_1\) and \(d\varvec{r}_2\), while the remaining \(N-2\) electrons have arbitrary positions and spins [100].

The exchange-correlation hole describes the change in the conditional probability caused by the correction for self-interaction, exchange and Coulomb correlation, compared to the completely uncorrelated situation. It can formally be split into the Fermi hole, \(h_{x}^{\sigma _1=\sigma _2}(\varvec{r}_1,\varvec{r}_2)\) and the Coulomb hole \(h_{c}^{\sigma _1, \sigma _2}(\varvec{r}_1,\varvec{r}_2)\)

$$\begin{aligned} h_{xc}(\varvec{x}_1,\varvec{x}_2)=h_{x}^{\sigma _1=\sigma _2}(\varvec{r}_1,\varvec{r}_2)+h_{c}^{\sigma _1, \sigma _2}(\varvec{r}_1,\varvec{r}_2) \end{aligned}$$
(3.52)

The exchange hole \(h_x\) applies only to electrons with the same spin, while the correlation hole \(h_c\) has contributions from electrons of either spin and is the hole resulting from the \(1/r_{12}\) electrostatic interaction. This separation is convenient, but only the total xc hole has a real physical meaning. In the HF method the Fermi hole is accounted for through the use of a single Slater determinant, whereas the Coulomb hole is neglected. Like the total hole, the Fermi hole contains exactly the charge of one electron, \(\int h_{x}(\varvec{r}_1,\varvec{r}_2)d\varvec{r}_2=-1\), and takes care of the self-interaction correction (SIC). The exchange hole \(h_x\) is negative everywhere, \(h_{x}(\varvec{r}_1,\varvec{r}_2)<0\), and its actual shape depends not only on the Fermi correlation factor but also on the density at \(\varvec{r}_2\). As a consequence, \(h_{x}\) is not spherically symmetric.
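
The sum rule \(\int h_{x}\,d\varvec{r}_2=-1\) can be checked numerically for the one case where the exchange hole is known in closed form, the unpolarized homogeneous electron gas. The hole formula used below is the standard textbook expression and is quoted here as an assumption, since it is not written out in the text; the test density is arbitrary.

```python
import numpy as np

def trapz(y, x):
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

def heg_exchange_hole(rho, u):
    """Exchange (Fermi) hole of the unpolarized homogeneous electron gas
    (standard closed form): h_x(u) = -(rho/2) * [3 j1(x)/x]^2, x = k_F u,
    with j1 the first spherical Bessel function."""
    kf = (3.0 * np.pi ** 2 * rho) ** (1.0 / 3.0)
    x = kf * u
    j1_over_x = (np.sin(x) - x * np.cos(x)) / x ** 3
    return -(rho / 2.0) * (3.0 * j1_over_x) ** 2

rho = 0.05                                  # arbitrary test density (a.u.)
u = np.linspace(1e-4, 1000.0, 1000001)
sum_rule = trapz(4.0 * np.pi * u ** 2 * heg_exchange_hole(rho, u), u)
print(sum_rule)                             # close to -1, as required by the sum rule
```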

The Coulomb hole must be normalized to zero, \(\int h_{c}(\varvec{r}_1,\varvec{r}_2)d\varvec{r}_2=0\), and this result is independent of the positions of the electrons with \(\sigma '\ne \sigma \).

The essential error of the Hartree-Fock method arises from the fact that in the HF model the exchange-correlation hole in a two-electron bond is not centered around the reference electron, but is too delocalized, having considerable amplitude on both atoms involved in the bond. The LDA and GGA models incorporate, simply by using an electron-centered hole, an important part of the effect of the interelectron Coulomb repulsion. To improve the LDA and GGA models, orbital-dependent functionals are introduced. The introduction of orbital dependence, not only density and gradient dependence, into the functionals can be realized in different ways [107]. Hybrid HF-DFT functionals are the most popular beyond-GGA functionals. These functionals mix a fraction of the HF exchange into the DFT exchange functional and use the DFT correlation part. But why is the exact HF exchange mixed with the approximate DFT exchange part? How can one find the weights of this mixing? Maybe it would be possible to use the exact HF exchange and rely on approximate functionals only for the part missing in the HF model, i.e. the electron correlation: \(E_{xc}=E_x^{HF}+E_c^{KS}\) (EXX, the orbital-dependent exact-exchange functional). The HF exchange is calculated using the KS orbitals, so that the EXX energy will be slightly higher than the HF energy. The orbital-dependent exact-exchange (EXX) group of methods has received much attention in the literature. Two advantages of the EXX functional are often mentioned: (a) the self-interaction correction is incorporated in this functional; (b) it is natural to break up the total problem of the electron–electron interaction into the large (due to the self-interaction correction) exchange part and the small correlation part. The hope is, of course, that it will be simpler to find an accurate density functional for the small correlation energy.

A critical analysis of the EXX method in DFT can be found, for example, in [100, 138]. The true xc hole is substantially more localized around the reference electron than the bare exchange hole. That is why rough localized model holes, like those of the GGA, which approximate this total hole, are so successful.

The LDA and GGA total holes are localized around the reference electron, which is a definite advantage, so that the EXX approximation cannot be regarded as the next step of improvement over the GGA. There are other possibilities for such an improvement: (1) to develop orbital-dependent functionals that represent exchange and correlation simultaneously; (2) to mix HF and DFT exchange.

The first possibility was realized, for example, in [139], where the virtual-orbital-dependent xc functional \(E_{xc}^{BB}\) (BB, Buijse–Baerends) was suggested:

$$\begin{aligned} E_{xc}^{BB}\left[ \{\varphi _j\},\{\varphi _a\}\right] =-\frac{1}{2}\sum \limits _i^M\sum \limits _j^M\sqrt{w_i w_j} \int d\varvec{r}_1 d\varvec{r}_2 \frac{\varphi _i^*(\varvec{r}_1)\varphi _j(\varvec{r}_1 )\varphi _j^*(\varvec{r}_2 )\varphi _i(\varvec{r}_2 )}{r_{12}} \end{aligned}$$
(3.53)

and applied in the calculations of hydrogen chains \(\mathrm H_{n}\) and small molecules. Here \(w_i[\rho ]\) is the xc orbital weight, which governs the involvement of the occupied/virtual orbital \(\varphi _i\) in the xc functional. It is important that the summation in (3.53) is made over \(M=N_{occ}+N_v\) orbitals, where \(N_{occ}\) and \(N_v\) are the numbers of occupied and virtual KS orbitals. By giving weights to the virtual orbitals one can incorporate the effect of correlation. In [140] the functional form of \(w_i\) was approximated with the Fermi-type distribution

$$\begin{aligned} w_i=\frac{2}{1+\exp (f(\varepsilon _i-\varepsilon _F))} \end{aligned}$$
(3.54)

where \(\varepsilon _i\) are the KS orbital energies, \(\varepsilon _F\) is the Fermi-level parameter, and \(f\) controls the sharpness of the distribution. We do not consider the \(E_{xc}^{BB}\) functional in more detail as its possibilities for crystals are not well understood.
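As a small illustration of the weight function (3.54), the following minimal sketch evaluates \(w_i\) for a set of orbital energies; the energies, the Fermi-level parameter and the sharpness parameter used here are purely illustrative placeholders, not values from [140].

```python
import numpy as np

def xc_orbital_weights(eps, eps_f, f):
    """Fermi-type weights of Eq. (3.54): w_i = 2 / (1 + exp(f*(eps_i - eps_F)))."""
    return 2.0 / (1.0 + np.exp(f * (eps - eps_f)))

# Hypothetical KS orbital energies (hartree): occupied below eps_F, virtual above.
eps = np.array([-0.60, -0.45, -0.30, 0.05, 0.20, 0.40])
print(xc_orbital_weights(eps, eps_f=-0.10, f=10.0))
# Occupied orbitals get weights close to 2, high-lying virtuals close to 0,
# so the virtual orbitals enter (3.53) with progressively smaller weights.
```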

The second possibility of GGA improvement—the mixing of HF and DFT exchange—is used in the well-known B3LYP, B3PW and PBE0 hybrid functionals. In Kohn–Sham density-functional theory, the exchange-correlation energy is rigorously given by

$$\begin{aligned} E_{xc}&=\frac{1}{2}\int \rho (\varvec{r})\int \frac{h_{xc}(\varvec{r},\varvec{r}')}{|\varvec{r}' -\varvec{r}|}d^3\varvec{r}'d^3\varvec{r} \end{aligned}$$
(3.55)
$$\begin{aligned} h_{xc}(\varvec{r},\varvec{r}')&=\int \limits _0^1h_{xc}^{\lambda }(\varvec{r},\varvec{r}')d\lambda \end{aligned}$$
(3.56)

known as the “adiabatic connection” or “coupling strength integration”. In this most fundamental of DFT formulas, \(h_{xc}\) is an effective exchange-correlation hole, and \(\lambda \) is a coupling-strength parameter that switches on interelectronic \(1/r_{12}\) repulsion, subject to a fixed electronic density (achieved, in principle, by suitably adjusting the external potential as a function of \(\lambda \)). The Kohn–Sham \(h_{xc}\) is therefore a coupling strength average of \(h_{xc}^{\lambda }\) as it evolves from \(\lambda =0\) through \(\lambda =1\) [141].

In an atom, the size of the hole is relatively insensitive to \(\lambda \) and remains of roughly atomic size. In a molecule, significant changes in the character and size of the hole occur as \(\lambda \) varies from the noninteracting \(\lambda =0\) to the fully interacting \(\lambda =1\) limit. At \(\lambda =0\) (pure exchange with no correlation whatsoever) delocalization of the hole over two or more centers is characteristic. Accurate density functionals must recognize this \(\lambda =0\) nonlocality. As \(\lambda \) increases, the hole is localized by the long-range, nondynamical, left–right correlations that are absent in atoms but operative in molecules. At the fully interacting limit (\(\lambda =1\)), the hole is localized to roughly atomic size. To incorporate the necessary nonlocality, Becke [142] proposed the so-called hybrid functionals

$$\begin{aligned} E_{xc}^{hyb}=E_{xc}^{KS}+a_{mix}\left( E_x^{HF}-E_x^{KS}\right) \end{aligned}$$
(3.57)

The difference from the EXX functional is the following: the exact HF exchange \(E_x^{HF}\) is mixed with the DFT (LDA, GGA) exchange-correlation energy in hybrid functionals, but with the DFT correlation energy in EXX functionals. The mixing parameter \(a_{mix}\) controls the amount of HF exchange admixed into the exchange-correlation energy. Even the simplest “half-half” hybrid functional (\(a_{mix}=0.5\)) greatly improves the calculated properties compared with pure DFT results. Becke developed three-parameter expressions known as the B3PW and B3LYP hybrid functionals:

$$\begin{aligned} E_{xc}^{B3PW}=E_{xc}^{LSDA}+a\left( E_x^{HF}-E_x^{LSDA}\right) +b\varDelta E_x^{Becke}+c\varDelta E_c^{PW} \end{aligned}$$
(3.58)
$$\begin{aligned} E_{xc}^{B3LYP}=E_{xc}^{LSDA}+a\left( E_x^{HF}-E_x^{LSDA}\right) +b\varDelta E_x^{Becke}+c\varDelta E_c^{LYP} \end{aligned}$$
(3.59)

The B3PW functional (3.58) uses the Becke exchange [124] and the Perdew–Wang correlation [122], while in the B3LYP functional (3.59) the correlation part is that suggested by Lee–Yang–Parr [125]. The \(a, b, c\) parameters were optimized to fit experimental data and do not depend on the molecule under consideration. It is clear that the choice of hybrid functionals is motivated by reasonable physical arguments. The term \(a\left( E_x^{HF}-E_x^{LSDA}\right) \) replaces some electron-gas exchange with the exact exchange to capture the proper small-\(\lambda \) limit in (3.55). The coefficient \(a\) reflects the rate of onset of correlation as \(\lambda \) increases from zero. The other terms allow optimum admixtures of exchange- and correlation-type gradient corrections. These hybrid functionals are the simplest mixture of the exact exchange, the LSDA for exchange-correlation, and gradient corrections of exchange and correlation type that exactly recovers the uniform electron-gas limit. The B3LYP functional is the most popular one in molecular quantum chemistry. Nevertheless, the correct amount of HF exchange included in any hybrid functional cannot be a constant over all species or even all geometries of a single species [143]. The rationale for mixing the exact exchange with the DFT approximation is discussed in [144]. The authors write the hybrid functional in the form

$$\begin{aligned} E_{xc}\cong E_{xc}^{KS}+\frac{1}{n}\left( E_x-E_x^{KS}\right) \end{aligned}$$
(3.60)

where the optimum integer \(n\) can be found from perturbation theory; \(n=4\) for the atomization energies of typical molecules. Such a formally parameter-free hybrid functional with a PBE exchange-correlation part is known as the PBE0 hybrid functional. A simplification of hybrid exchange-correlation functionals was suggested by Becke [141], based on the simulation of delocalized exact exchange by local density functionals. A simple model was introduced that detects exchange-hole delocalization in molecules through a local variable related to the kinetic-energy density.
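To make the mixing recipe of (3.57)–(3.60) concrete, the following minimal sketch assembles a three-parameter hybrid energy from precomputed energy components. The commonly quoted B3LYP parameter values a = 0.20, b = 0.72, c = 0.81 are used as defaults for illustration, and all component energies in the example call are placeholders, not results of an actual calculation; for PBE0 one would instead use a single mixing fraction of 1/4.

```python
def b3_hybrid_energy(e_xc_lsda, e_x_hf, e_x_lsda, de_x_becke, de_c_gga,
                     a=0.20, b=0.72, c=0.81):
    """Three-parameter hybrid energy of Eqs. (3.58)/(3.59):
    E_xc = E_xc^LSDA + a*(E_x^HF - E_x^LSDA) + b*dE_x^Becke + c*dE_c^GGA."""
    return e_xc_lsda + a * (e_x_hf - e_x_lsda) + b * de_x_becke + c * de_c_gga

# Placeholder component energies (hartree) from a hypothetical SCF run:
print(b3_hybrid_energy(e_xc_lsda=-9.81, e_x_hf=-10.12, e_x_lsda=-9.35,
                       de_x_becke=-0.42, de_c_gga=-0.05))
```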

In the next section we consider the extension of DFT methods to the periodic systems, including the monoperiodic nanostructures.

4 LCAO and Density Functional Tight Binding (DFTB) Methods for Periodic Systems

4.1 LCAO DFT Method for Periodic Systems

We discuss here the KS LCAO method for crystals in comparison with the HF LCAO approach (Sect. 3.2). The electronic energy of the crystal (per primitive cell) as calculated within the HF approximation (\(E_{HF}\)) and DFT (\(E_{DFT}\)) can be expressed in terms of the one-electron density matrix (DM) of the crystal defined as \(P(\varvec{k})\) in terms of the Bloch sums of AOs, or as \(\rho (\varvec{R},\varvec{R}')\) in coordinate space. These expressions are:

$$\begin{aligned} E_{HF}[\rho ]=E_{0}[\rho ]+E_{H}[\rho ]+E_{X}[\rho ] \end{aligned}$$
(3.61)
$$\begin{aligned} E_{DFT}[\rho ]=E_{0}[\rho ]+E_{H}[\rho ]+E_{XC}[\rho ] \end{aligned}$$
(3.62)

where \(E_{0}[\rho ]\) is defined as the expectation value of the one-electron operator \(\hat{h}(\varvec{R})\)

$$\begin{aligned} E_0[\rho ]=\frac{1}{N}\int \limits _{V_N}d^3\varvec{R}[\hat{h}(\varvec{R})\rho (\varvec{R},\varvec{R}')]_{\varvec{R}'=\varvec{R}} \end{aligned}$$
(3.63)

\(E_H[\rho ]\) is the Coulomb (Hartree) energy,

$$\begin{aligned} E_H[\rho ]=\frac{1}{2N}\int \limits _{V_N}d^3\varvec{R}\int \limits _{V_N}d^3\varvec{R}'\frac{\rho (\varvec{R},\varvec{R})\rho (\varvec{R}',\varvec{R}')}{|\varvec{R} - \varvec{R}'|} \end{aligned}$$
(3.64)

\(E_X[\rho ]\) is the HF exchange energy,

$$\begin{aligned} E_X[\rho ]=-\frac{1}{2N}\int \limits _{V_N}d^3\varvec{R}\int \limits _{V_N}d^3\varvec{R}'\frac{|\rho (\varvec{R},\varvec{R}')|^2}{|\varvec{R} - \varvec{R}'|} \end{aligned}$$
(3.65)

and \(E_{XC}[\rho ]\) in (3.62) is the exchange-correlation energy functional of the density \(\rho (\varvec{R},\varvec{R})=\rho (\varvec{R})\) (the different expressions for this functional depend on the DFT version and are discussed in the next subsection). The electron position vector \(\varvec{R}\) is written as the sum \(\varvec{R}=\varvec{r}+\varvec{R}_n\), where \(\varvec{R}_n\) specifies the primitive cell and \(\varvec{r}\) is the position vector of an electron within this primitive cell. The difference between the HF exchange energy \(E_X\) and the DFT exchange-correlation energy \(E_{XC}\) is the following: the former depends on the total DM \(\rho (\varvec{R},\varvec{R}')\), whereas the latter depends only on the diagonal elements of the DM, \(\rho (\varvec{R})\), i.e. the electron density.

In the HF approximation and DFT, the crystal orbitals are solutions to the equations

$$\begin{aligned} \hat{F}(\varvec{k})\varphi _i(\varvec{k})=\varepsilon _i(\varvec{k})\varphi _i(\varvec{k}) \end{aligned}$$
(3.66)

where the one-electron operator \(\hat{F}(\varvec{k})\) is either the HF operator, \(\hat{F}^{HF}(\varvec{k})\)

$$\begin{aligned} \hat{F}^{HF}(\varvec{k})=\hat{H}(\varvec{k})+\hat{J}(\varvec{k})+\hat{X}(\varvec{k}) \end{aligned}$$
(3.67)

or the Kohn–Sham operator \(\hat{F}^{KS}(\varvec{k})\),

$$\begin{aligned} \hat{F}^{KS}(\varvec{k})=\hat{H}(\varvec{k})+\hat{J}(\varvec{k})+\hat{V}_{XC}(\varvec{k}) \end{aligned}$$
(3.68)

Here, \(\hat{H}(\varvec{k})\) is a one-electron operator that describes the motion of an electron in the crystal and is equal to the sum of the kinetic-energy operator and the operator of the Coulomb interaction between the electron and the fixed atomic nuclei; \(\hat{J}(\varvec{k})\) and \(\hat{X}(\varvec{k})\) are the Coulomb and exchange operators, respectively, which describe the interaction of the given electron with the other electrons of the crystal.

In the LCAO basis, both the Hartree-Fock and the Kohn–Sham equations take the matrix form:

$$\begin{aligned} F(\varvec{k})C(\varvec{k})=S(\varvec{k})C(\varvec{k})E(\varvec{k}) \end{aligned}$$
(3.69)

The HF and KS operators in the reciprocal space are represented by the Fock matrices \(F^{HF}_{\mu \nu }(\varvec{k})\) and Kohn–Sham matrices \(F^{KS}_{\mu \nu }(\varvec{k})\), which are related to the matrices in the coordinate space by the relations

$$\begin{aligned} F^{HF}_{\mu \nu }(\varvec{k})=\sum \limits _{\varvec{R}_n}\exp (i\varvec{k}\varvec{R}_n)\left[ h_{\mu \nu }(\varvec{R}_n)+j_{\mu \nu }(\varvec{R}_n)+x_{\mu \nu }(\varvec{R}_n)\right] \end{aligned}$$
(3.70)
$$\begin{aligned} F^{KS}_{\mu \nu }(\varvec{k})=\sum \limits _{\varvec{R}_n}\exp (i\varvec{k}\varvec{R}_n)\left[ h_{\mu \nu }(\varvec{R}_n)+j_{\mu \nu }(\varvec{R}_n)+v^{XC}_{\mu \nu }(\varvec{R}_n)\right] \end{aligned}$$
(3.71)

where \(h_{\mu \nu }(\varvec{R}_n), j_{\mu \nu }(\varvec{R}_n)\) and \(x_{\mu \nu }(\varvec{R}_n)\) are the one-electron, Coulomb, and exchange parts of the Fock matrix in the coordinate space. In the DFT, instead of the nonlocal-exchange interaction matrix \(x_{\mu \nu }(\varvec{R}_n)\), the exchange-correlation matrix \(v^{XC}_{\mu \nu }(\varvec{R}_n)\) is used, with different exchange-correlation functional approximations being employed in various versions of the DFT. In particular, in the local-density approximation (LDA), it is assumed that

$$\begin{aligned} v^{XC}_{\mu \nu }(\varvec{R}_n)&= \int d^3\varvec{R} \varphi _{\mu }(\varvec{R})v^{XC}(\varvec{R})\varphi _{\nu }(\varvec{R} - \varvec{R}_n)\nonumber \\ v^{XC}(\varvec{R})&= \frac{\partial }{\partial \rho }\varepsilon [\rho (\varvec{R})] \end{aligned}$$
(3.72)
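As a minimal sketch of how (3.69)–(3.71) are used in practice, the following snippet assembles \(F(\varvec{k})\) and \(S(\varvec{k})\) by Bloch sums of real-space matrices and solves the generalized eigenvalue problem at one k-point. The array names and shapes (a set of direct lattice vectors with the corresponding real-space Fock/KS and overlap matrices) are assumptions for illustration; the Hermiticity of the assembled matrices is taken for granted here.

```python
import numpy as np
from scipy.linalg import eigh

def solve_at_k(k, lattice_vectors, F_real, S_real):
    """Assemble F(k), S(k) via the Bloch sums (3.70)/(3.71) and solve
    F(k) C(k) = S(k) C(k) E(k), Eq. (3.69), at a single k-point.

    lattice_vectors : (ncell, 3) direct lattice vectors R_n
    F_real, S_real  : (ncell, nbasis, nbasis) real-space Fock/KS and overlap matrices
    """
    phases = np.exp(1j * lattice_vectors @ k)           # exp(i k.R_n)
    F_k = np.tensordot(phases, F_real, axes=(0, 0))     # sum_n exp(i k.R_n) F(R_n)
    S_k = np.tensordot(phases, S_real, axes=(0, 0))     # sum_n exp(i k.R_n) S(R_n)
    # F_k and S_k are assumed Hermitian; eigh returns band energies and crystal orbitals
    eps_k, C_k = eigh(F_k, S_k)
    return eps_k, C_k
```

In a self-consistent calculation this step is repeated for every k-point of the BZ sampling, and the resulting crystal orbitals are used to rebuild the density matrix.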

The CO-LCAO calculations based on both the HF method and DFT allow one not only to compare the results obtained within these two approximations but also to employ a combination of them in hybrid HF-DFT methods. The HF self-consistent electron density of the crystal can be used to calculate correlation corrections to the total HF energy a posteriori [145]. In some cases it is useful to use the HF self-consistent density matrix to accelerate the convergence of the DFT LCAO self-consistent procedure.

The extension of existing HF-LCAO computer codes to DFT-LCAO ones allows a very direct comparison between these methods using the same code, the same basis set, and the same computational conditions. The modification of an HF LCAO code with PBC to perform DFT calculations is straightforward in principle: one only needs to delete the HF exchange and instead evaluate the matrix elements of the exchange-correlation operator, that is, to solve the KS rather than the HF equations in each iteration of the SCF procedure [146]. The main difficulty of this modification arises in the calculation of the matrix elements of the exchange-correlation operator. Different approaches to this problem have been suggested.

The principal merit of the HF LCAO scheme is the possibility to calculate the matrix elements of the Fock matrix analytically. This merit may be retained in DFT LCAO calculations if the exchange-correlation potential is expanded in an auxiliary basis set of Gaussian-type functions with even-tempered exponents. At each SCF iteration the expansion in the auxiliary basis is refitted to the actual analytic form of the exchange-correlation potential, which changes with the evolving charge density [147].

Numerical integration can also be used to calculate the matrix elements of the exchange-correlation potential. For the numerical integration, the atomic partition method proposed by Savin [148] and Becke [149] has been adopted and combined with Gauss–Legendre (radial) and Lebedev (angular) quadratures [150]. The Kohn–Sham LCAO periodic method, based on numerical integration at each cycle of the self-consistent-field process, is computationally more expensive than the periodic LCAO Hartree-Fock method, which is almost fully analytical.
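The central ingredient of the atomic partition is a set of smooth cell functions that divide each grid point among the atoms. The sketch below implements the Becke-type weights built from the iterated switching polynomial \(p(\mu)=(3\mu-\mu^3)/2\); it omits the atomic-size adjustment of the original scheme and any particular radial/angular grid, so it should be read as an illustration of the partition step only.

```python
import numpy as np

def becke_weights(point, atom_positions, k_iter=3):
    """Becke atomic partition weights w_A(r) at one grid point (no size adjustment).

    point          : (3,) Cartesian coordinates of the grid point
    atom_positions : (natom, 3) nuclear positions
    """
    natom = len(atom_positions)
    r = np.linalg.norm(point - atom_positions, axis=1)     # distances to the nuclei
    cell = np.ones(natom)
    for a in range(natom):
        for b in range(natom):
            if a == b:
                continue
            R_ab = np.linalg.norm(atom_positions[a] - atom_positions[b])
            mu = (r[a] - r[b]) / R_ab                       # confocal elliptical coordinate
            for _ in range(k_iter):                         # iterated switching polynomial
                mu = 1.5 * mu - 0.5 * mu**3
            cell[a] *= 0.5 * (1.0 - mu)                     # smooth step, ~1 near atom a
    return cell / cell.sum()                                # normalized atomic weights
```

The exchange-correlation matrix elements are then accumulated as sums over grid points of (quadrature weight) x (atomic weight) x \(\varphi_\mu v^{XC} \varphi_\nu\).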

In conclusion of this section we give some comments concerning the meaning of the KS eigenvalues and orbitals [9]. Strictly speaking, the KS eigenvalues and eigenvectors do not have the physical meaning of one-electron energies and wavefunctions as in the HF theory. Also, the single-determinant many-electron wavefunction constructed from the KS orbitals does not correspond to the exact wavefunction of the system of interacting electrons; it is merely the wavefunction of the corresponding fictitious noninteracting system. The HF eigenvalues for occupied and empty states are calculated with different potentials, but the KS eigenvalues with the same potential. Therefore, one should be careful with the KS eigenvalues of the states lying above the “occupied” ones. For example, the energy gap between the uppermost valence and the lowermost conduction bands in a solid is strongly underestimated in the majority of DFT versions. Note that the gap is always overestimated in the HF theory. It can be shown that, similarly to Koopmans’ theorem in the HF theory, the ionization energy in the KS theory is given by the KS eigenvalue of the highest-occupied state (taken with the opposite sign). But this result cannot be applied to other occupied KS states, since in the KS theory all states must be occupied sequentially from the bottom to the top: one cannot remove an electron below the highest-occupied state leaving a hole there.

As an advantage of the DFT LCAO method one can mention the possibility of linear scaling in calculations of complex molecules and crystals.

The SIESTA (Spanish Initiative for Electronic Simulations with Thousands of Atoms) method [13, 14] achieves linear scaling by the explicit use of localized Wannier-like functions and numerical pseudoatomic orbitals confined by a spherical infinite-potential wall. The SIESTA method provides a very general scheme to perform a range of calculations, from very fast to very accurate, depending on the needs and stage of the simulation, for all kinds of molecules, materials and surfaces. It allows DFT simulations of more than a thousand atoms on modest PC workstations, and of over a hundred thousand atoms on parallel platforms [151]. The numerous applications of the SIESTA DFT LCAO method can be found on the SIESTA code site [44]. These applications include nanotubes, surface phenomena and amorphous solids. As restrictions of the SIESTA method we mention the difficulty of all-electron calculations and the use of only LDA/GGA exchange-correlation functionals.

The second DFT LCAO linear-scaling method, by Scuseria and Kudin (SK method) [11], uses Gaussian atomic orbitals and a fast multipole method, which achieves not only linear scaling with system size but also very high accuracy in all infinite summations. This approach allows both all-electron and pseudopotential calculations and can also be applied with hybrid HF-DFT exchange-correlation functionals.

4.2 Exchange-Correlation Functionals for Periodic Systems

Are Molecular Exchange-correlation Functionals Transferable to Crystals? This question arises due to the well-known fact that the properties of crystals are quite often different from those of molecular systems. The difference is particularly evident for ionic and semi-ionic solids, in which the long-range electrostatic forces provide a strong localizing field for the electronic states that is not present in molecules. In this situation the shortcomings of standard LDA and GGA functionals, linked to the missing electronic self-interaction, are likely to be most severe [152]. This explains why the electronic-structure calculations in molecular quantum chemistry and solid-state physics developed for a long time along two independent lines. While in molecular quantum chemistry the wavefunction-based approaches (HF and post-HF) in the LCAO approximation were mainly used, in solid-state physics DFT-based methods with a plane-wave (PW) basis were popular. These two standard approaches were for a long time poorly transferable between the two fields [152]: early DFT functionals underperformed post-HF techniques in reproducing the known properties of small molecules, while the extension of accurate post-HF methods to solid-state systems is difficult or has prohibitive computational expense.

The formulation of hybrid HF/DFT exchange-correlation functionals and their extension to periodic systems changed the situation: the combined HF-DFT approach has adequate accuracy for most needs in the quantum chemistry of both molecules and solids, while retaining a tractable computational cost and even allowing linear scaling with system size. One appealing feature of this approach is that it can readily exploit the progress and tools available to quantum chemists for calculating the HF exchange.

Numerous solid-state studies have been performed with molecular hybrid functionals, providing valuable experience on their accuracy and applicability. In the review article [152] the results are summarized of publications in which hybrid exchange functionals have been applied under PBC to represent crystalline solids. The list of later publications, using LCAO basis set, can be found on the site http://www.crystal.unito.it.

The next step in the HF/DFT extension to solids was made by Scuseria and coworkers, who coded the HF/DFT method with PBC in the LCAO basis [153] and included a linear-scaling hybrid exchange-correlation HSE functional. Later the HSE hybrid functional was implemented in the VASP code [154], which is based on the PW basis set.

The performance of hybrid density functionals (HDF) in solid-state chemistry is examined in the review article [152] and in the original publications [131, 155–164]. The HDF method has been used in considering the bulk crystal structure and electronic properties [23, 158, 165], phase transitions in solids [59, 166–168], vibrational properties and lattice dynamics [21, 169–171], surfaces [172, 173], and point defects in crystals [174–179]. References are given here only to a few recent publications demonstrating that the HDF methods allow the calculation of different properties of solids in good agreement with experiment.

However, the electronic-structure calculations [160] of the prototypical ferroelectric oxides BaTiO\(_3\) and PbTiO\(_3\) using the most popular exchange-correlation functionals show that it is difficult to obtain simultaneously good accuracy for structural and electronic properties. On the one hand, all the usual DFT functionals reproduce the structural properties with various degrees of success, the recently introduced GGA Wu–Cohen functional [131] being by far the most accurate, but severely underestimate the band gaps. On the other hand, the B3LYP and PBE0 hybrid functionals correct the band-gap problem, but overestimate the volumes and atomic distortions, giving a supertetragonality comparable to that rendered by the GGA-PBE for the tetragonal phases. It has been found that the supertetragonality inherent in B3LYP and PBE0 calculations is mostly associated with the GGA exchange part and, to a lesser extent, with the exact-exchange contribution. To bypass this problem, a different hybrid functional, B1–WC, has been suggested by combining the GGA–WC functional with a small percentage of exact HF exchange, A \(=\) 0.16 (instead of A \(=\) 0.25 in PBE0). With this B1–WC functional, very good structural, electronic, and ferroelectric properties have been obtained in LCAO calculations [160] as compared to experimental data for BaTiO\(_3\) and PbTiO\(_3\). Indeed, DFT(B1–WC) calculations give for the SrTiO\(_3\) and BaTiO\(_3\) cubic phases the lattice-constant values 3.880 Å (3.905 Å) and 3.971 Å (4.00 Å), respectively. The calculated indirect band gaps are 3.57 eV (3.25 eV) and 3.39 eV (3.20 eV), respectively. The experimental values are given in brackets.

In solid-state calculations, however, the use of hybrid functionals is not a common practice because of the high computational cost that exact exchange involves. A recent alternative to conventional hybrid functionals is a screened exchange hybrid functional developed by Heyd, Scuseria, and Ernzerhof (HSE) [156, 180, 181]. This functional uses a screened Coulomb potential for the exact exchange interaction, drastically reducing its computational cost, and providing results of similar quality to traditional hybrid functionals. It was demonstrated that the screened HF exchange (neglect of the computationally demanding long-range part of HF exchange) exhibits all physically relevant properties of the full HF exchange. The HSE functional for solids is much faster than regular hybrids and can be used in metals also.

In insulators the exchange interaction decays exponentially with distance, at a rate governed by the bandgap [182]; in metallic systems the decay is only algebraic. To render the HF exchange tractable in extended systems, either the exchange interactions need to be truncated artificially or their spatial decay accelerated [181]. Various truncation schemes have been proposed to exploit the exponential decay in systems with sizable bandgaps. Truncation schemes are very useful for systems with localized charge distributions, where the HF exchange decays rapidly with distance. However, these approaches fail to significantly decrease the computational effort in systems with small or no gaps. In delocalized systems truncation leads to severe convergence problems in the self-consistent-field (SCF) procedure as well as to uncertainties in the predicted energy of the system. The second option, accelerating the spatial decay, circumvents both of these problems but still neglects interactions that might be physically important.

The HSE hybrid density-functional includes the spatial decay acceleration using a screened Coulomb potential and attempts to compensate for the neglected interactions. A screened Coulomb potential is based on a splitting of the Coulomb operator into short-range (SR) and long-range (LR) components. The choice of the splitting function is arbitrary as long as the SR and LR components add up to the original Coulomb operator. HSE uses the error function to accomplish this split since it leads to computational advantages in evaluating the short-range HF exchange integrals: the error function can be integrated analytically when Gaussian basis functions are used. The following partitioning is used for the full \(1/r\) Coulomb potential:

$$\begin{aligned} \frac{1}{r}=\left( \frac{erfc(\omega r)}{r}\right) _{SR}+\left( \frac{erf(\omega r)}{r}\right) _{LR} \end{aligned}$$
(3.73)

where the complementary error function \(erfc(\omega r)=1-erf(\omega r)\) and \(\omega \) is an adjustable parameter. For \(\omega =0\), the long-range term becomes zero and the short-range term is equivalent to the full Coulomb operator. The opposite is the case for \(\omega \rightarrow \infty \). In the HSE approach the screened Coulomb potential is applied only to the exchange interaction. All other Coulomb interactions of the Hamiltonian, such as the Coulomb repulsion of the electrons, do not use a screened potential. The exchange part of the PBE0 functional \(E_{xc}^{PBE0}=a_{mix}E_x^{HF}+(1-a_{mix})E_x^{PBE}+E_c^{PBE}\) is split into short- and long-range components

$$\begin{aligned} E_x^{PBE0}&=a_{mix}E_x^{HF,SR}(\omega )+a_{mix}E_x^{HF,LR}(\omega )+(1-a_{mix})E_x^{PBE,SR}(\omega )\nonumber \\&\quad +E_x^{PBE,LR}(\omega )-a_{mix}E_x^{PBE,LR}(\omega ) \end{aligned}$$
(3.74)

where \(\omega \) is an adjustable parameter governing the extent of the short-range interactions. For the exchange-mixing parameter value \(a_{mix}=0.25\) and \(\omega =0\), (3.74) is equal to the exchange part of the parameter-free PBE0 functional. Numerical tests based on realistic \(\omega \) values (\(\omega \approx 0.15 \)) indicate that the HF and PBE long-range exchange contributions of this functional are rather small (just a few per cent), and that these terms tend to cancel each other [180].

Thus, if these terms are neglected, under the assumption that their neglect may be compensated by other terms in the functional, one obtains a screened Coulomb potential hybrid density-functional of the form:

$$\begin{aligned} E_{xc}^{\omega PBE}=a_{mix}E_x^{HF,SR}(\omega )+(1-a_{mix})E_x^{PBE,SR}(\omega )+E_x^{PBE,LR}(\omega )+E_c^{PBE} \end{aligned}$$
(3.75)

As before, \(\omega \) governs the extent of the short-range interactions. The hybrid functional (3.75) is equivalent to PBE0 for \(\omega =0\) and asymptotically reaches PBE for \(\omega \rightarrow \infty \).
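The splitting (3.73) is easy to verify numerically. The short sketch below checks that the SR and LR parts always sum to the bare \(1/r\) kernel and illustrates the two limiting cases of \(\omega\); the grid of distances and the sample \(\omega\) values are arbitrary illustrations.

```python
import numpy as np
from scipy.special import erf, erfc

def coulomb_split(r, omega):
    """Short- and long-range parts of 1/r according to Eq. (3.73)."""
    sr = erfc(omega * r) / r
    lr = erf(omega * r) / r
    return sr, lr

r = np.linspace(0.5, 10.0, 5)
for omega in (0.0, 0.15, 5.0):
    sr, lr = coulomb_split(r, omega)
    assert np.allclose(sr + lr, 1.0 / r)   # the two parts always add up to 1/r
    # omega = 0: sr = 1/r, lr = 0; large omega: sr dies off beyond ~1/omega, lr -> 1/r
    print(omega, sr[0], lr[0])
```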

The short-range component of the HF exchange can be obtained by using the SR Coulomb potential when calculating the electron-repulsion integrals for the HF exchange energy:

$$\begin{aligned} (\mu \nu |\lambda \sigma )^{SR}=\int \int d\varvec{r}_1 d\varvec{r}_2\varphi _{\mu }(\varvec{r}_1)\varphi _{\nu }(\varvec{r}_1)\frac{erfc(\omega r_{12})}{r_{12}}\varphi _{\lambda }(\varvec{r}_2)\varphi _{\sigma }(\varvec{r}_2) \end{aligned}$$
(3.76)

over contracted Gaussian-type functions. For these calculations the algorithm of Gill and Pople [183] was modified [180] so that the evaluation of the integrals (3.76) is only slightly more time consuming than that of the regular electron-repulsion integrals.

To calculate the screened Coulomb PBE exchange, the model exchange hole \(J_x^{PBE}\) of the PBE functional, constructed in a simple analytical form by Ernzerhof and Perdew [184], is used. The model hole reproduces the exchange-energy density of the PBE approximation for exchange and accurately describes the change in the exchange hole upon the formation of single bonds. In the HSE functional this PBE exchange hole is screened by employing the short-range Coulomb potential from (3.73).

The PBE long-range exchange contribution is then defined as the difference of the exchange-hole-based PBE and the SR PBE exchange energy densities.

In the so-called revised HSE03 hybrid functional [181] an improvement was introduced in the integration procedure for the PBE exchange hole. This modification makes the calculations numerically more stable and ensures that the HSE03 hybrid functional for \(\omega =0\) is closer to the PBE0 hybrid. The HSE03 functional (denoted also as \(E^{\omega PBE}\)) was incorporated into the development version of the GAUSSIAN code [185]. It was demonstrated for molecular systems that the HSE03 hybrid functional delivers results (bond lengths, atomization energies, ionization potentials, electron affinities, enthalpies of formation, vibrational frequencies) that are comparable in accuracy to the nonempirical PBE0 hybrid functional [181].

The HSE03 hybrid functional was extended to periodic systems [181]. The calculations were made for 21 metallic, semiconducting and insulating solids. The examined properties included lattice constants, bulk moduli and bandgaps. The results obtained with HSE03 exhibit significantly smaller errors than pure DFT calculations.

The influence of the exchange screening parameter \(\omega \) on the performance of screened hybrid functionals has been reexamined in [156]. It has been shown that variation of the screening parameter influences solid band gaps the most. Other properties such as molecular thermochemistry or lattice constants of solids change little with \(\omega \). A new version of HSE (HSE06) with the screening parameter \(\omega =0.11\) has been recommended for further use. Compared to the original implementation, the new parameterization yields better thermochemical results and preserves the good accuracy for band gaps and lattice constants in solids.

The two-parameter space (\(\omega \), a) of the HSE functional has been systematically examined in [186, 187] to assess the performance of hybrid screened exchange functionals and to determine a balance between improving accuracy and reducing the screening length, which can further reduce computational costs. The HSE12 functional suggested in [187] is a range-minimized functional that matches the overall accuracy of the existing HSE06 parameterization but reduces the Fock-exchange length scale by half. Analysis of the error trends over the parameter space produces useful guidance for future improvement of density functionals.

Preliminary screening of the integrals is necessary to take advantage of the rapid decay of the short-range exchange integrals for periodic systems [181]. Two different screening techniques are used. In the first technique (Schwarz screening), substituting the SR integrals in place of the \(1/r\) integrals yields an upper bound of the form

$$\begin{aligned} |(\mu \nu |\lambda \sigma )_{SR}|\le \sqrt{(\mu \nu |\mu \nu )_{SR}}\sqrt{(\lambda \sigma |\lambda \sigma )_{SR}} \end{aligned}$$
(3.77)

The bound (3.77) is evaluated for each batch of \((\mu \nu |\lambda \sigma )\) integrals, and only batches with nonnegligible contributions are used in calculating the HF exchange. The SR screening integrals are evaluated by the same procedure as the SR exchange integrals.
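A minimal sketch of this Schwarz-type test is shown below: the diagonal SR integrals \((\mu\nu|\mu\nu)_{SR}\) are assumed to be precomputed once per shell pair, and a batch is kept only if the product bound of (3.77) exceeds a threshold. The threshold value and the sample Schwarz factors are placeholders.

```python
import numpy as np

def schwarz_keep(Q_sr, threshold=1e-10):
    """Boolean mask of (bra, ket) shell-pair batches kept by the SR Schwarz bound
    |(munu|lamsig)_SR| <= sqrt((munu|munu)_SR) * sqrt((lamsig|lamsig)_SR), Eq. (3.77).

    Q_sr : (npairs,) array of sqrt((munu|munu)_SR) for each shell pair
    """
    bound = np.outer(Q_sr, Q_sr)     # upper bound for every bra/ket combination
    return bound > threshold         # only these batches are actually evaluated

# Placeholder Schwarz factors for four shell pairs:
Q_sr = np.array([1.0e-2, 3.0e-4, 5.0e-7, 1.0e-9])
print(schwarz_keep(Q_sr))
```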

The second technique, distance-based multipole screening, uses multipole moments, introducing the following screening criterion

$$\begin{aligned} T_n=\sum \limits _{M\in \mu \nu }\sum \limits _{M\in \lambda \sigma }C_{\mu \nu }^{max}\frac{1}{r^{l+M_{\mu \nu }^{low}+M_{\lambda \sigma }^{low}}}C_{\lambda \sigma }^{max} \end{aligned}$$
(3.78)

where \(T_n\) is an estimate for the contribution of a shell quartet and \(M\) is a multipole in the multipole expansion of a given shell pair. \(M_{ij}^{low}\) are the lowest-order multipoles that can contribute to the integral and \(C_{ij}^{max}\) are the maximum coefficients in each order of multipoles. Replacing the \(1/r\) potential with the \(erfc(\omega r)/r\) short-range potential yields

$$\begin{aligned} T_n^{SR}=\sum \limits _{M\in \mu \nu }\sum \limits _{M\in \lambda \sigma }C_{\mu \nu }^{max}\frac{erf(\omega r^{l+M_{\mu \nu }^{low}+M_{\lambda \sigma }^{low}})}{r^{l+M_{\mu \nu }^{low}+M_{\lambda \sigma }^{low}}} \end{aligned}$$
(3.79)

This provides a distance-based upper bound for the SR exchange contribution of a given shell quartet.

The implementation of periodic boundary conditions relies on evaluating all terms of the Hamiltonian in real space [11]. The HF exchange is evaluated using replicated density matrices. All interactions within a certain radius from a central reference cell are calculated and the rest are neglected. This so-called near-field exchange (NFX) method [188] allows the calculation of the HF exchange in a time scaling linearly with system size. It works reasonably well for insulating solids, since the corresponding density-matrix elements decay rapidly. In systems with smaller bandgaps, however, the spatial extent of nonnegligible contributions to the exchange energy is large. This large extent results in a large number of significant interactions. To render the computation tractable, the truncation radius must be decreased. Thus, significant interactions are neglected, which leads to errors in the total energy of the system and introduces instabilities into the self-consistent-field procedure.

Screened Coulomb hybrid functionals do not need to rely on the decay of the density matrix to allow calculations in extended systems [181]. The SR HF exchange interactions decay rapidly and without noticeable dependence on the bandgap of the system. The screening techniques do not rely on any truncation radius and provide much better control over the accuracy of a given calculation. In addition, the thresholds can be set very tightly, without resulting in extremely long calculations.

A series of benchmark calculations [181] on three-dimensional silicon (a 6–21G basis was used) demonstrates the effectiveness of the screening techniques. The time per SCF cycle was studied as a function of the distance up to which exchange interactions were included in the calculation. As this radius grows, the number of replicated cells grows as \(O(N^3)\). The PBE0 curve tracks this growth since regular HF exchange has a large spatial extent. The relatively small bandgap of silicon (1.9 eV in this calculation) is insufficient for the density-matrix elements to decay noticeably. HSE, on the other hand, shows only a modest increase in CPU time as the system becomes larger. Beyond 10 Å, the CPU time increases only due to the time spent on screening. The HSE calculation of the total energy converges very rapidly, and only cells up to 10 Å from the reference cell contribute to the exchange energy. PBE0, by comparison, converges significantly more slowly. Thus, HSE not only reduces the CPU time drastically, but also decreases the memory requirements of a given calculation, since fewer replicated density matrices need to be stored in memory. In practice, HSE calculations can be performed with the same amount of memory as pure DFT calculations, whereas traditional hybrid methods have larger memory demands. Given the fast decay of the SR HF exchange interactions and the high screening efficiency, the HSE hybrid functional has been successfully applied to a variety of three-dimensional solids. It was demonstrated that the screened Coulomb hybrid density-functional HSE not only reduces the amounts of memory and CPU time needed, when compared to its parent functional PBE0, but is also at least as accurate as the latter for the structural, optical and magnetic properties of solids.

In nanostructure calculations the monoperiodic unit cell can contain several hundred atoms, so that not only hybrid but even pure DFT exchange-correlation functionals become impractical. The DFTB method allows one to simplify the calculation scheme while maintaining the DFT accuracy of the results.

4.3 Density Functional Tight Binding (DFTB) Method

General information about the DFTB method and its implementation in computer codes is given on the sites [189, 190]. The Density-Functional based Tight-Binding (DFTB) method is based on a second-order expansion of the Kohn–Sham total energy in density-functional theory (DFT) with respect to charge-density fluctuations. The zeroth-order approach is equivalent to a common standard non-self-consistent tight-binding (TB) scheme, while at second order a transparent, parameter-free, and readily calculable expression for generalized Hamiltonian matrix elements can be derived. These are modified by a self-consistent redistribution of Mulliken charges (SCC).

Besides the usual “band structure” and short-range repulsive terms the final approximate Kohn–Sham energy additionally includes a Coulomb interaction between the charge fluctuations. At large distances this accounts for long-range electrostatic forces between two point charges and approximately includes self-interaction contributions of a given atom if the charges are located at one and the same atom. Due to the SCC extension, DFTB can be successfully applied to problems where deficiencies within the non-SCC standard TB approach become obvious.

In the last few years, the DFTB method has been significantly extended to allow the calculation of optical and excited-state properties. The GW formalism as well as the time-dependent DFTB have been implemented. Furthermore, the DFTB has been used to calculate the Hamiltonian for transport codes, using Green's-function techniques.

We list here some selected references allowing the study of the foundations and practical applications of the DFTB method [191–201].

Based on DFT, the DFTB formalism introduces several approximations. The electronic charge density is written as a superposition of atomic densities. Likewise, the effective potential can be written as a superposition of atomic contributions so that the exchange-correlation energy is also decomposed into atomic contributions.

For large interatomic distances, the integral describing the electron-nucleus interaction is approximated through a point-charge approximation. In turn, the atomic population converges to the nuclear charge in the limit of large distances, and the nuclear–nuclear repulsion and electron–nuclear energy terms compensate each other at large distances. In addition, in the large-distance range, the two-center terms containing the potential vanish. In total, the many-center terms vanish at large interatomic distances.

For small interatomic distances, these arguments are no longer valid. Therefore, the compensating terms are calculated using the atomic densities and potentials and then summed, together with the nuclear repulsion energy terms, into a sum of pairwise repulsive energy terms. For short interatomic distances, the nuclear repulsion dominates. In this limit, the pairwise energy terms are repulsive, and they asymptotically approach zero in the long range. The second-order terms are represented in a multipole expansion considering only the monopolar terms.

KS-like equations can be set up, in which the effective potential consists of the sum of the atomic potentials and the corresponding contributions related to the second-order terms. The corresponding KS orbitals are written within the LCAO approach. They are represented in a minimal basis of optimized orthogonal atomic orbitals. These pseudoatomic basis functions are obtained by self-consistently solving the KS equations for a spherically symmetric, spin-unpolarized neutral atom.

The approximations formulated above contain two-center terms only and lead to the same structure of the secular equations as in other nonorthogonal TB schemes or in the extended Hückel method; the important advantage is that all matrix elements are calculated within the DFT.

The notations and formalism of the DFTB method are described in detail in [192].

Within the DFT, the total energy can be written as follows:

$$\begin{aligned} E_\mathrm{tot}[\rho ({\varvec{r}})] = T[\rho ({\varvec{r}})]+E_\mathrm{ext}+\overbrace{\frac{1}{2}\int \int \frac{\rho ({\varvec{r}})\rho ({\varvec{r}'})}{\left| {\varvec{r}}-{\varvec{r}'}\right| }d{\varvec{r}}d{\varvec{r}'}}^{E_\mathrm{H}}+E_\mathrm{nn}+E_\mathrm{xc}[\rho ({\varvec{r}})] \end{aligned}$$
(3.80)

Here, T is the kinetic energy of the electrons, and \(E_\mathrm{ext}\), \(E_\mathrm{H}\), and \(E_\mathrm{nn}\) are the electron-nuclei, the mean-field (Hartree), and the nuclear–nuclear interaction energies, respectively. \(E_\mathrm{xc}[\rho ({\varvec{r}})]\) denotes the exchange-correlation (xc) energy functional.

The electron-density distribution

$$\begin{aligned} \rho ({\varvec{r}}) = \sum _i^\mathrm{occ}\psi _i^{*}(\varvec{r})\psi _i(\varvec{r}) \end{aligned}$$
(3.81)

can be obtained through the solution of the KS equations

$$\begin{aligned} \nonumber&\left[ -\frac{1}{2}\nabla ^2+V_\mathrm{eff}({\varvec{r}})\right] \psi _i({\varvec{r}})=\varepsilon _i\psi _i({\varvec{r}}),\\ \nonumber&V_\mathrm{eff} = V_\mathrm{ext}+V_\mathrm{H}+V_\mathrm{xc},\\ \nonumber&V_\mathrm{H}({\varvec{r}}) = \int \frac{\rho ({\varvec{r}'})}{\left| {\varvec{r}}-{\varvec{r}'}\right| }d{\varvec{r}'},\quad V_\mathrm{ext}({\varvec{r}})=-\sum _j\frac{Z_j}{\left| {\varvec{r}}-{\varvec{R}}_j\right| },\\&V_\mathrm{xc} = \frac{\delta E_\mathrm{xc}}{\delta \rho }. \end{aligned}$$
(3.82)

Using (3.82) the total energy can also be written as follows:

$$\begin{aligned} E_\mathrm{tot}&= \sum ^\mathrm{occ}_i \langle \psi _i | \overbrace{-\frac{1}{2} \nabla ^2 + V_\mathrm{eff}}^{\hat{h}^0} | \psi _i \rangle - \frac{1}{2}\left[ \int V_\mathrm{eff}\rho d{\varvec{r}} - \int V_\mathrm{ext}\rho d{\varvec{r}} \right] + E_{xc}[\rho ] \nonumber \\&\quad -\frac{1}{2}\int V_\mathrm{xc}\rho d{\varvec{r}}+E_\mathrm{nn} \end{aligned}$$
(3.83)

Because \(E_\mathrm{tot}\) is variational with respect to density variations, the total energy may be calculated from an approximate electronic charge density obtained from a roughly known reference density distribution \(\rho _0\). The reference density should be close to the exact density \(\tilde{\rho }\), but one can further improve it [192].

As the whole Hamiltonian and overlap matrices contain one- and two-center contributions only, they can be calculated and tabulated in advance as functions of the distance between the atomic pairs. Thus, it is not necessary to recalculate any integral during the actual calculation. The pairwise tabulated integrals have to be transformed to a specific coordinate system by transformations of the angle-dependent part of the basis functions (atomic orbitals).

Because the atomic charges \(\varDelta q_j\) in the second-order contributions depend on the orbitals \(\psi _i\) via the KS ansatz for the charge density in (3.81), the eigenvalue problem has to be solved self-consistently. However, in contrast to a full DFT calculation, self-consistency has to be achieved not for the charge density but only for the charges \(\varDelta q_j\) of the atoms in the system. Such a restricted self-consistency is called a self-consistent-charge (SCC) treatment.

The charge of an atom in a molecule or a solid is not uniquely defined. In line with the minimal valence-basis ansatz for the KS orbitals, the most obvious way to define the charges is via the projection of the KS orbitals onto the atomic orbitals of each atom. The partitioning of the overlap contributions can simply be done within the Mulliken approach. The resulting charges then correspond to the widely used Mulliken charges.
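A minimal sketch of this Mulliken step of the SCC cycle is given below: from the density and overlap matrices in the minimal AO basis, the atomic populations \(\sum_{\mu\in A}(PS)_{\mu\mu}\) are accumulated and the charge fluctuations \(\varDelta q_A\) are obtained by subtracting the number of valence electrons of the neutral atoms. The input names (AO-to-atom map, valence-electron counts) are assumptions made for illustration.

```python
import numpy as np

def mulliken_charge_fluctuations(P, S, ao_to_atom, n_val):
    """Mulliken charge fluctuations Delta q_A used in the SCC iterations.

    P, S       : (nbasis, nbasis) density and overlap matrices in the minimal AO basis
    ao_to_atom : (nbasis,) index of the atom to which each AO belongs
    n_val      : (natom,) number of valence electrons of each neutral atom
    """
    gross_pop = np.diag(P @ S)                 # Mulliken gross orbital populations (PS)_mumu
    q = np.zeros(len(n_val))
    for mu, atom in enumerate(ao_to_atom):
        q[atom] += gross_pop[mu]               # sum populations of the AOs on each atom
    return q - np.asarray(n_val)               # Delta q_A entering the second-order terms
```

These fluctuations are fed back into the second-order (charge-charge) part of the Hamiltonian, and the diagonalization is repeated until the \(\varDelta q_A\) stop changing.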

After the solution of the approximate KS equations, the total energy or the binding energy can be calculated. The repulsive energy contributions can also be calculated; in practice, however, these pairwise contributions are fitted to full DFT calculations. In principle, any high-level calculation method could be used for these reference calculations. The pairwise repulsive energy can be fitted, for example, to polynomials of the following type:

$$\begin{aligned} V_\mathrm{rep}(R)=\left\{ \begin{aligned}&\sum _{n=2}d_n(R_c-R)^n,\quad \mathrm{for}\; R<R_c, \\&0,\qquad \qquad \qquad \quad \mathrm{otherwise}. \end{aligned}\right. \end{aligned}$$
(3.84)

Thereby, the repulsive potential has to be fitted for each pair of elements, not for each pair of atoms. It can be tabulated, like the Hamiltonian and overlap matrix elements. It is also important to note that the repulsive terms are rather short-ranged, i.e., \(\mathrm R_c\) is of the order of \(\approx \)1.5 times the typical interatomic bonding distance. This characteristic makes \(\mathrm {V_{rep}}\) rather well transferable from one system to another.
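The following minimal sketch evaluates the pairwise repulsive energy from tabulated polynomial coefficients of the form (3.84); the element-pair table and its numerical coefficients are placeholders, not actual DFTB parameterization data.

```python
import numpy as np

def v_rep(R, d_coeffs, R_c):
    """Pairwise repulsive potential, Eq. (3.84):
    V_rep(R) = sum_{n>=2} d_n (R_c - R)^n for R < R_c, and 0 otherwise."""
    if R >= R_c:
        return 0.0
    return sum(d * (R_c - R) ** n for n, d in enumerate(d_coeffs, start=2))

def repulsive_energy(atoms, pair_tables):
    """Sum V_rep over all atom pairs; pair_tables maps an (unordered) element pair
    to its fitted (d_coeffs, R_c)."""
    elements, coords = zip(*atoms)
    e_rep = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            key = tuple(sorted((elements[i], elements[j])))
            d_coeffs, R_c = pair_tables[key]
            R = np.linalg.norm(np.array(coords[i]) - np.array(coords[j]))
            e_rep += v_rep(R, d_coeffs, R_c)
    return e_rep

# Hypothetical table for a C-C pair (coefficients and cutoff are illustrative only):
tables = {("C", "C"): ([0.8, -0.3, 0.05], 2.4)}
print(repulsive_energy([("C", (0.0, 0.0, 0.0)), ("C", (1.4, 0.0, 0.0))], tables))
```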

Analytical expressions for the forces on the atoms can be derived as well [192].

It should be noted that the derivatives of the Hamiltonian and overlap matrices can be computed readily, because the integrals are stored in tables. An efficient calculation of the forces is essential for the realization of molecular-dynamics (MD) simulations within the DFTB approach.

Corresponding formulations of the second derivative of the energy with respect to the nuclear coordinates exist as well. These are needed, e.g. for the calculation of vibrational spectra. However, a numerical differentiation of the forces turned out to be more practical.

In materials science one needs to study the macroscopic properties of aggregates of atoms or molecules (an infinite molecular crystal, a molecular liquid, a polymer, or a large biomolecule). Within these systems, one needs to take care not only of the covalent or ionic forces that bond the units, but also of the weak forces that contribute at long range. The most important one is the London dispersion, which belongs to the class of van der Waals interactions.

Within the DFT, the correct description of the London dispersion interactions is complicated. Therefore, standard DFT does not include these forces. The same holds for the DFTB in the standard and the SCC approach, as only short-range atomic potentials are used and the Hamiltonian matrix elements are negligible at large interatomic distances, where van der Waals forces have a significant effect. However, the van der Waals interactions may be included into the DFTB approach a posteriori, which means that one calculates the van der Waals contributions independently after a DFTB or SCC-DFTB cycle and adds the result \(\mathrm {E_{disp}}\) to the total energy. The typical van der Waals decay goes as \(R^{-6}\), so that a Lennard–Jones potential can be used. The full periodic-table force field for molecular mechanics and molecular-dynamics simulations is given in [202]. However, the \(R^{-12}\) term in the Lennard-Jones potential is repulsive at short distances, where the energy is already calculated within the quantum-mechanical DFT or DFTB approach. Therefore, the potential has to be modified so that no repulsive terms occur (the details of this correction are considered in [192]).
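To illustrate the idea of such an a posteriori correction, the sketch below adds a \(-C_6/R^{6}\) London term damped at short range so that it does not interfere with the region already treated quantum mechanically. The damping form and all parameter values here are illustrative assumptions, not the specific correction of [192] or the force field of [202].

```python
import numpy as np

def e_disp(R, c6, r_damp, steepness=7.0):
    """Damped London term added a posteriori: E = -f(R) * C6 / R^6,
    with a Fermi-type damping f(R) switching the correction off at short range.
    The damping form and parameters are illustrative only."""
    f = 1.0 / (1.0 + np.exp(-steepness * (R / r_damp - 1.0)))
    return -f * c6 / R**6

# The correction vanishes smoothly below ~r_damp and behaves as -C6/R^6 at long range:
for R in (1.5, 3.0, 6.0, 10.0):
    print(R, e_disp(R, c6=30.0, r_damp=3.0))
```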

The force-field treatment of van der Waals interactions suggests the inclusion of further non-quantum-mechanically calculated interactions within the QM/MM (quantum mechanics/molecular mechanics) methodology [203]. In such a treatment, the total energy is written as \(\mathrm {E_{tot}} = \mathrm {E_{QM}} + \mathrm {E_{MM}} + \mathrm {E_{QM/MM}}\), where \(\mathrm {E_{QM}}\) is the energy of the quantum-mechanically described part of the system, and \(\mathrm {E_{MM}}\) is the energy of the classically described part. For the QM part, the DFTB approach can be used; for the classical molecular-mechanics part, one can choose an empirical force field. \(\mathrm {E_{QM/MM}}\) describes the energy contribution resulting from the linkage of the QM to the MM region.

The QM/MM is a very practical approach for large systems, in which only parts of the processes are of interest, or in which a special process occurs in only a small region of a large system. With an approximate QM method such as the DFTB, one can even describe large parts of a biomolecule quantum mechanically, see references in [192].

The advantage of QM/MM calculations is that they are much faster than a full quantum-mechanical description, which in most cases is not even possible. However, there are also several drawbacks of the QM/MM approach. Molecular mechanics is not able to properly describe bond breaking and bond formation. Therefore, one has to make sure that the MM part of the system is restricted to the region where bonds will not be broken during the simulation.

The DFTB approach is used to calculate not only ground-state properties. It has been extended to the excited states using the Time Dependent DFTB (TD–DFTB) scheme [192, 204].

A critical point in using the DFTB in any implementation is still the fitting procedure for the repulsive energy. To obtain meaningful results, it is vital that this procedure is done with the greatest care. Because the repulsive energy depends on the fitting procedure, its tables are not generally transferable from one class of systems to another. This is the reason why these parameters are not publicly available for all the elements of the periodic table. Attempts have been made to automate this procedure [205].

The discussion of the DFTB method concludes the description of the use of LCAO basis sets for periodic-system calculations. In the next section we consider plane-wave approaches to periodic systems.

5 Plane Wave Hartree-Fock and DFT Methods for Periodic Systems

5.1 Projector Augmented-Wave Method

The Hartree-Fock and DFT plane-wave methods for periodic systems are realized using the projector augmented-wave (PAW) technique. It is a generalization of the pseudopotential and linear augmented-plane-wave methods, and allows density-functional-theory calculations to be performed with greater computational efficiency [206]. The formal relationship between the ultrasoft (US) pseudopotential and the PAW method is derived in [207].

The valence wavefunctions tend to have rapid oscillations near the ion cores due to the requirement that they be orthogonal to the core states. The PAW approach addresses this issue by transforming these rapidly oscillating wavefunctions into smooth wavefunctions that are more computationally convenient. The PAW method is typically combined with the frozen-core approximation, in which the core states are assumed to be unaffected by the ion’s environment. However, the PAW method is an exact all-electron (AE) method for a complete set of partial waves [207]. Therefore, the method should yield results that are indistinguishable from any other frozen-core AE method. In the US-PP method additional approximations are made.

The formalism of the PAW approach is described in [207, 208].

The AE (all-electron) wave functions \(\varPsi _{nk}\), with the index \(n\) labeling the bands and \(k\) the k-points, are obtained starting from the PS (pseudo) wavefunctions \(\widetilde{\varPsi }_{nk}\) by using a linear transformation:

$$\begin{aligned} |\varPsi _{nk}\rangle =|\widetilde{\varPsi }_{nk}\rangle +\sum _i(|\varPhi _i\rangle -| \widetilde{\varPhi }_i\rangle )\langle \widetilde{p}_i|\widetilde{\varPsi }_{nk}\rangle \end{aligned}$$
(3.85)

The index \(i\) stands for the atomic position \({\varvec{R}}\), the angular momentum \((l,m)\), and an additional index \(n\) labeling different partial waves for the same site and angular momentum. The AE \(\varPhi _i\) and PS \(\widetilde{\varPhi }_i\) partial waves are equal outside the PAW sphere. Therefore, as in the norm-conserving and ultrasoft schemes, \(\varPsi _n=\widetilde{\varPsi }_n\) outside the core radius \(r^l_c\). The projector functions \(\widetilde{p}_i\) are dual to the PS partial waves:

$$\begin{aligned} \langle \widetilde{p}_i|\widetilde{\varPhi }_j\rangle =\delta _{ij} \end{aligned}$$
(3.86)
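A minimal sketch of the transformation (3.85) for one band is given below: the AE wavefunction is recovered from the pseudo wavefunction by adding, inside the augmentation sphere, the difference of AE and PS partial waves weighted by the projector overlaps \(\langle \widetilde{p}_i|\widetilde{\varPsi }\rangle\). All input arrays (functions sampled on a radial grid with its quadrature weights) are assumed to be given; this is an illustration of the algebra, not of any code's actual data layout.

```python
import numpy as np

def paw_reconstruct(psi_ps, phi_ae, phi_ps, projectors, weights):
    """Apply the PAW transformation (3.85) for one band on a radial grid.

    psi_ps     : (npts,) pseudo wavefunction sampled on the grid
    phi_ae     : (nproj, npts) all-electron partial waves
    phi_ps     : (nproj, npts) pseudo partial waves (equal to phi_ae outside the sphere)
    projectors : (nproj, npts) projector functions p~_i
    weights    : (npts,) radial integration weights
    """
    # projector overlaps <p~_i | psi~> by radial quadrature
    c = projectors @ (weights * psi_ps)
    # |psi> = |psi~> + sum_i (|phi_i> - |phi~_i>) <p~_i|psi~>
    return psi_ps + c @ (phi_ae - phi_ps)
```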

In the PAW method, two grids are used: a radial one inside the PAW sphere and a regular mesh in the whole simulation cell. As a consequence, on the one hand, the partial waves and projector functions are separated into radial and angular parts:

$$\begin{aligned} \varPhi _i({\varvec{r}})&=\frac{\varPhi _{n_il_i}(r)}{r}S_{l_im_i}(\hat{r});\quad \widetilde{\varPhi }_i({\varvec{r}})=\frac{\widetilde{\varPhi }_{n_il_i}(r)}{r}S_{l_im_i}(\hat{r}) \end{aligned}$$
(3.87)
$$\begin{aligned} \widetilde{p}_i({\varvec{r}})&=\frac{\widetilde{p}_{n_il_i}(r)}{r}S_{l_im_i}(\hat{r}) \end{aligned}$$
(3.88)

with \(S_{lm}(\hat{r})\) the real spherical harmonics (see Appendix A in [208]). On the other hand, the pseudized wave functions are expanded in a plane-wave basis set:

$$\begin{aligned} \widetilde{\varPsi }_{nk}({\varvec{r}})=\sqrt{\frac{1}{\varOmega }}\sum _{{\varvec{B}}}c_{nk}\left( {\varvec{B}}\right) e^{i({\varvec{k}}+{\varvec{B}})\cdot {\varvec{r}}} \end{aligned}$$
(3.89)

with \(\varOmega \) the volume of the unit cell.

By using (3.85) and the completeness relation of the projector-partial wave set,

$$\begin{aligned} \sum _i|\widetilde{\varPhi }_i\rangle \langle \widetilde{p}_i|=1 \end{aligned}$$
(3.90)

the AE valence charge density becomes:

$$\begin{aligned} \rho _v({\varvec{r}})=\sum _{nk}f_{nk}|\varPsi _{nk}|^2=\widetilde{\rho }({\varvec{r}})+\rho ^1({\varvec{r}})-\widetilde{\rho }^1({\varvec{r}}) \end{aligned}$$
(3.91)

with \(f_{nk}\) the occupation factors.

The pseudized charge density \(\widetilde{\rho }\) is analogous to the one obtained in the norm-conserving pseudopotential case, whereas \(\rho ^1\) and \(\widetilde{\rho }^1\) are the AE and pseudized partial densities, which are only defined within the PAW spheres.

A discussion of the PAW method application to the forces and stress tensor calculations can be found in [208].

The PAW method has been implemented in several codes using Plane Waves as the basis set [208]. The PAW technique is also used for the Hartree-Fock Plane Wave method realization for periodic systems, considered in the next subsection.

5.2 Plane Wave Hartree-Fock Method for Periodic Systems

For periodic systems the long-range nature of the Fock exchange interaction and the resulting large computational requirements present a major drawback [209]. This is especially true for metallic systems, which require a dense Brillouin-zone sampling. Therefore the Hartree-Fock method in the basis of plane waves is usually realized within the screened hybrid density-functional approach [209].

The nonlocal Fock exchange energy \(E_x\) in real space can be written as

$$\begin{aligned} E_x = -\frac{1}{2}\sum _{\mathbf{k}n,\mathbf{q}m}2w_\mathbf{k}f_{\mathbf{k}n}\times 2w_\mathbf{q}f_{\mathbf{q}m}\times \int \int d^3\mathbf{r}d^3\mathbf{r'}\frac{\phi ^{*}_{\mathbf{k}n}(\mathbf{r})\phi _{\mathbf{q}m}(\mathbf{r})\phi ^{*}_{\mathbf{q}m}(\mathbf{r'})\phi _{\mathbf{k}n}(\mathbf{r'})}{\left| \mathbf{r - r'}\right| } \end{aligned}$$
(3.92)

Here \(\phi _{\mathbf{k}n}\) is a set of one-electron Bloch states of the system with the corresponding set of (possibly fractional) occupation numbers \(f_{\mathbf{k}n}\). The sums over \(\mathbf k\) and \(\mathbf q \) need to be performed over all the \(\mathbf k\) points chosen to sample the Brillouin zone (BZ), whereas the sums over \(m\) and \(n\) are performed over all the bands at these \(\mathbf k\) points. The \(w_\mathbf{k}\) are \(\mathbf k\)-point weights, and the factors 2 account for the fact that a closed-shell system with doubly occupied one-electron states is considered.

The corresponding nonlocal Fock exchange potential is given by

$$\begin{aligned} V_x(\mathbf{r,r'})&= -\sum _{\mathbf{q}m}2w_\mathbf{q}f_{\mathbf{q}m}\frac{\phi ^{*}_{\mathbf{q}m}(\mathbf{r'})\phi _{\mathbf{q}m}(\mathbf{r})}{\left| \mathbf{r - r'}\right| } \nonumber \\&= -\sum _{\mathbf{q}m}2w_\mathbf{q}f_{\mathbf{q}m}e^{-i\mathbf{q\cdot r'}}\frac{u^{*}_{\mathbf{q}m}(\mathbf{r'})u_{\mathbf{q}m}(\mathbf{r})}{\left| \mathbf{r - r'}\right| }e^{i\mathbf{q\cdot r}} \end{aligned}$$
(3.93)

where \(u_{\mathbf{q}m}\) is the cell-periodic part of the Bloch state \(\phi _{\mathbf{q}m}\) at the k-point \(\mathbf q\) with band index \(m\). Using the decomposition of the Bloch states \(\phi _{\mathbf{q}m}\) in plane waves (\(\mathbf B\) is a reciprocal lattice vector)

$$\begin{aligned} \phi _{\mathbf{q}m}(\mathbf{r}) = \frac{1}{\sqrt{\varOmega }}\sum _\mathbf{B}C_{\mathbf{q}m}(\mathbf{B})e^{i(\mathbf{q+B})\cdot \mathbf{r}} \end{aligned}$$
(3.94)

Equation (3.93) can be rewritten as

$$\begin{aligned} V_x(\mathbf{r,r'}) = \sum _\mathbf{k}\sum _\mathbf{BB'}e^{i(\mathbf{k+B})\cdot \mathbf{r}}V_\mathbf{k}(\mathbf{B,B'})e^{-i(\mathbf{k+B'})\cdot \mathbf{r'}} \end{aligned}$$
(3.95)

where

$$\begin{aligned} V_\mathbf{k}(\mathbf{B,B'})&= \langle \mathbf{k+B} | \hat{V}_x | \mathbf{k+B'}\rangle \nonumber \\&= -\frac{4\pi e^2}{\varOmega }\sum _{m\mathbf{q}}2w_\mathbf{q}f_{\mathbf{q}m}\times \sum _\mathbf{B''}\frac{C^{*}_{\mathbf{q}m}\left( \mathbf{B'-B''}\right) C_{\mathbf{q}m}\left( \mathbf{B-B''}\right) }{\left| \mathbf{k-q+B''}\right| ^2} \end{aligned}$$
(3.96)

is the representation of the Fock exchange potential in reciprocal space [210].

To avoid the calculation of expensive integrals over the slowly decaying long-ranged part of the Fock exchange, Heyd et al. [180] proposed replacing it with the corresponding density functional counterpart. The resulting expression for the exchange-correlation energy in the hybrid DFT-HF HSE03 approach is given by

$$\begin{aligned} E_{xc}^\mathrm{HSE03} = \frac{1}{4}E_x^{\mathrm{sr,}\mu }+\frac{3}{4}E_x^{\mathrm{PBE,sr,}\mu }+E_x^{\mathrm{PBE,lr,}\mu }+E_c^\mathrm{PBE} \end{aligned}$$
(3.97)

As can be seen from (3.97), only the exchange component of the electron-electron interaction is separated into a short- (sr) and a long-range (lr) part. The complete electronic correlation is represented by the standard correlation part of the PBE density functional.

A simple decomposition of the Coulomb kernel can be obtained using the construction

$$\begin{aligned} \frac{1}{r} = S_{\mu }(r)+L_{\mu }(r) = \frac{\mathrm{erfc}(\mu r)}{r}+\frac{\mathrm{erf}(\mu r)}{r} \end{aligned}$$
(3.98)

where \(\mu \) is the parameter that defines the range separation, related to a characteristic distance \({2}/{\mu }\) at which the short-range interactions become negligible. In the context of the HSE03 functional, it was empirically established that the optimum range-separation parameter \(\mu \) is approximately 0.3 Å\(^{- 1}\). Using the decomposed Coulomb kernel (3.98) in (3.92), one straightforwardly obtains

$$\begin{aligned} E_x^{\mathrm{sr,}\mu } =&-\frac{e^2}{2}\sum _{\mathbf{k}n,\mathbf{q}m}2w_\mathbf{k}f_{\mathbf{k}n}2w_\mathbf{q}f_{\mathbf{q}m} \times \int \int d^3\mathbf{r}d^3\mathbf{r'}\frac{\mathrm{erfc}(\mu \left| \mathbf{r - r'}\right| )}{\left| \mathbf{r - r'}\right| }\nonumber \\&\times \phi ^{*}_{\mathbf{k}n}(\mathbf{r})\phi _{\mathbf{q}m}(\mathbf{r})\phi ^{*}_{\mathbf{q}m}(\mathbf{r'})\phi _{\mathbf{k}n}(\mathbf{r'}) \end{aligned}$$
(3.99)

The representation of the corresponding short-range Fock exchange potential in reciprocal space is given by

$$\begin{aligned} V_\mathbf{k}^\mathrm{sr,\mu }(\mathbf{B,B'}) = \langle \mathbf{k+B} | \hat{V}_x^\mathrm{sr,\mu } | \mathbf{k+B'}\rangle =&-\frac{4\pi e^2}{\varOmega }\sum _{m\mathbf{q}}2w_\mathbf{q}f_{\mathbf{q}m} \nonumber \\&\times \sum _\mathbf{B''}\frac{C^{*}_{\mathbf{q}m}\left( \mathbf{B'-B''}\right) C_{\mathbf{q}m}\left( \mathbf{B-B''}\right) }{\left| \mathbf{k-q+B''}\right| ^2}\nonumber \\&\times \left( 1-e^{-\left| \mathbf{k-q+B''}\right| ^2/4\mu ^2}\right) \end{aligned}$$
(3.100)

The only difference from the reciprocal space representation of the complete (undecomposed) Fock exchange potential, given by (3.96), is the second factor in the summand in (3.100), representing the complementary error function in reciprocal space. Note that (3.100) shows the range-separated exchange interaction to belong to the class of screened exchange interactions.
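A minimal Python sketch of how the reciprocal-space exchange matrix of (3.96), and its screened counterpart (3.100), could be assembled is given below. All inputs (plane-wave coefficients, q-mesh, weights, occupations) are toy arrays, a simple cubic cell with unit reciprocal-lattice vectors is assumed for the norm \(|\mathbf{k}-\mathbf{q}+\mathbf{B}''|\), and the divergent \(\mathbf{k}-\mathbf{q}+\mathbf{B}''=0\) term is simply skipped rather than treated by one of the standard singularity corrections:

```python
import numpy as np

def fock_exchange_matrix(k, qpts, wq, fq, C, Bvecs, mu=None, e2=1.0, omega=1.0):
    """Sketch of V_k(B,B') of Eq. (3.96); with mu given, the short-range
    (screened) form of Eq. (3.100).

    k, qpts : k-point and (Nq,3) q-mesh, in units of the cubic reciprocal vector
    wq, fq  : (Nq,) q-point weights and (Nq,Nb) occupation numbers f_{qm}
    C       : (Nq,Nb,Npw) plane-wave coefficients C_{qm}(B)
    Bvecs   : (Npw,3) integer reciprocal-lattice vectors B of the basis
    """
    k = np.asarray(k, dtype=float)
    Bvecs = np.asarray(Bvecs, dtype=int)
    index = {tuple(B): i for i, B in enumerate(map(tuple, Bvecs))}
    Npw = len(Bvecs)
    V = np.zeros((Npw, Npw), dtype=complex)
    for q, w, f_m, C_m in zip(np.asarray(qpts, float), wq, fq, C):
        for f, Cq in zip(f_m, C_m):                     # loop over bands m
            if f < 1e-12:
                continue
            for B2 in Bvecs:                            # B'' summation
                G2 = float(np.dot(k - q + B2, k - q + B2))
                if G2 < 1e-12:
                    continue                            # skip the divergent G = 0 term
                kern = 4.0 * np.pi * e2 / (omega * G2)
                if mu is not None:                      # screening factor of (3.100)
                    kern *= 1.0 - np.exp(-G2 / (4.0 * mu ** 2))
                for iB, B in enumerate(Bvecs):
                    jB = index.get(tuple(B - B2))       # C_{qm}(B - B'')
                    if jB is None:
                        continue                        # coefficient outside the basis
                    for iBp, Bp in enumerate(Bvecs):
                        jBp = index.get(tuple(Bp - B2)) # C_{qm}(B' - B'')
                        if jBp is None:
                            continue
                        V[iB, iBp] -= 2.0 * w * f * kern * np.conj(Cq[jBp]) * Cq[jB]
    return V
```

In a production code the \(\mathbf{G}=0\) singularity of the unscreened kernel must of course be handled by a proper correction scheme; the sketch only illustrates the structure of the double lattice sum and the screening factor.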

The short-range PBE exchange energy and potential, as well as their long-range counterparts, are arrived at using the same decomposition (3.98), in accordance with Heyd et al. [180]. It is easily seen from (3.98) that the long-range term becomes zero for \(\mu = 0\) and the short-range contribution then equals the full Coulomb operator, whereas for \(\mu \) \(\rightarrow \,\infty \) it is the other way around.

A detailed comparison of the performance of the HSE03 and PBE0 functionals for a set of archetypal solid-state systems is presented in [209], where lattice parameters, bulk moduli, heats of formation, and band gaps are calculated. The results indicate that the hybrid functionals indeed often improve the description of these properties, but in several cases they do not improve on the standard gradient-corrected functionals. This concerns in particular metallic systems, for which the bandwidth and exchange splitting are seriously overestimated.

5.3 Plane Wave Density Functional Method for Periodic Systems

The Kohn–Sham equations take on a very simple form when one uses plane waves [5]

$$\begin{aligned} \sum \limits _{\varvec{B}'}\left[ \frac{1}{2}|\varvec{k}+\varvec{B}|^2\delta _{\varvec{B}\varvec{B}'}+V_{eN}(\varvec{B}-\varvec{B}')+V_{ee}(\varvec{B}-\varvec{B}')+V_{XC}(\varvec{B}-\varvec{B}')\right] C_{i,\varvec{k}+\varvec{B}'}=\varepsilon _{i\varvec{k}}C_{i,\varvec{k}+\varvec{B}} \end{aligned}$$
(3.101)

where \(V_{eN}(\varvec{B}-\varvec{B}'), V_{ee}(\varvec{B}-\varvec{B}')\), and \(V_{XC}(\varvec{B}-\varvec{B}')\) are the Fourier transforms of the electronic–nuclei, electron–electron Coulomb, and exchange-correlation potentials. In this form, the kinetic energy is diagonal.

The solution of (3.101) proceeds by diagonalization of a Hamiltonian matrix whose elements \(H_{\varvec{k}+\varvec{B}, \varvec{k}+\varvec{B}'}\) are given by the terms in the brackets above. The size of the matrix is determined by the choice of the cutoff energy, and it becomes intractably large for systems that contain both valence and core electrons. This is a severe problem: a plane-wave basis set is very poorly suited to expanding the electronic wave functions in such a case, because a very large number of plane waves is needed to describe the tightly bound core orbitals and to follow the rapid oscillations of the valence wave functions in the core region. An extremely large plane-wave basis set would be required to perform an all-electron calculation, and a vast amount of computational time would be needed to calculate the electronic wave functions [5]. The pseudopotential approximation allows the electronic wave functions to be expanded using a much smaller number of plane-wave basis states. A list of PW DFT codes for periodic systems is given in [211].
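The structure of (3.101) can be illustrated with a short Python sketch for a one-dimensional model: the kinetic term is diagonal in the plane-wave indices, while a fixed local potential (a single cosine here, a toy assumption rather than a self-consistent Kohn–Sham potential) enters only through its Fourier components \(V(\varvec{B}-\varvec{B}')\); the cutoff energy determines the size of the matrix to be diagonalized.

```python
import numpy as np

# Minimal 1D plane-wave sketch of Eq. (3.101): the kinetic term is diagonal,
# the potential enters through its Fourier components V(B - B').
# Toy local potential V(x) = 2*V0*cos(2*pi*x/a): only the +/- first
# Fourier components are nonzero (an assumption made for illustration).
a, V0, Ecut = 1.0, 0.5, 200.0          # lattice constant, potential strength, PW cutoff
b = 2.0 * np.pi / a                    # primitive reciprocal-lattice vector
nmax = int(np.sqrt(2.0 * Ecut) / b) + 1
G = b * np.arange(-nmax, nmax + 1)     # plane waves with 0.5*G^2 <= Ecut (roughly)

def bands(k, nbands=4):
    """Lowest eigenvalues of H_{k+G,k+G'} for one k-point."""
    H = np.diag(0.5 * (k + G) ** 2).astype(complex)
    for i, Gi in enumerate(G):
        for j, Gj in enumerate(G):
            if abs(abs(Gi - Gj) - b) < 1e-9:   # V(B - B') = V0 for B - B' = +/- b
                H[i, j] += V0
    return np.linalg.eigvalsh(H)[:nbands]

for k in np.linspace(-np.pi / a, np.pi / a, 5):
    print(f"k = {k:+.3f}  eps = {bands(k)}")
```

Increasing Ecut enlarges the basis (and the matrix) without changing the model, which is exactly the convergence parameter discussed above.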

6 Molecular Dynamics Approach

Molecules and crystals are complex quantum-mechanical systems consisting of a great number of atomic nuclei and electrons; in many cases they can be modeled by the nonrelativistic Hamiltonian of the form

$$\begin{aligned} \hat{\mathrm{H}}=-\frac{1}{2}\sum \limits _{i=1}^{2N_e}\varDelta _{\varvec{r}_i}-\frac{1}{2}\sum \limits _{j=1}^{N_a}\frac{1}{M_j}\varDelta _{\varvec{R}_j}+V(\varvec{r},\varvec{R}) \end{aligned}$$
(3.102)

where \(\varvec{r}_i\) are the coordinates of electrons (\(i =\mathrm{1,2},\ldots ,2N_e\)), \(\varvec{R}_j\) are the coordinates of nuclei (\(j = 1,2,\ldots , N_\mathrm{a}\)) and \(V(\varvec{r}, \varvec{R})\) is the energy of the Coulomb interaction of the electrons and nuclei:

$$\begin{aligned} V(\varvec{r}, \varvec{R})=\sum \limits _{j<j'}\frac{Z_jZ_{j'}}{|\varvec{R}_j-\varvec{R}_{j'}|}+\sum \limits _{i<i'}\frac{1}{|\varvec{r}_i-\varvec{r}_{i'}|}-\sum \limits _{ij}\frac{Z_j}{|\varvec{r}_i-\varvec{R}_j|} \end{aligned}$$
(3.103)

The Hamiltonian (3.102) is approximate as it does not take into account the spin-orbit interaction and other relativistic effects. The calculation of eigenfunctions and eigenvalues of the operator (3.102), i.e. the solution of the time-independent Schrödinger equation

$$\begin{aligned} \hat{\mathrm{H}}\varPhi =E\varPhi \end{aligned}$$
(3.104)

is possible only after applying some approximations. The first of them is the adiabatic approximation. It permits the motion of electrons and nuclei to be considered separately and is based on the large difference in electron and nuclear masses (\(m_e\ll M_j\)).

In the adiabatic approximation, first the problem of electronic motion is solved for the fixed positions of nuclei

$$\begin{aligned} \left[ -\frac{1}{2}\sum \limits _{i=1}^{2N_e}\varDelta _{\varvec{r}_i}+V(\varvec{r},\varvec{R})\right] \psi (\varvec{r},\varvec{R}) =W(\varvec{R})\psi (\varvec{r},\varvec{R}) \end{aligned}$$
(3.105)

The wavefunctions \(\psi (\varvec{r},\varvec{R})\) and the eigenvalues \(W(\varvec{R})\) in (3.105) depend on the nuclear coordinates \(\varvec{R}\) as parameters. Then, the found eigenvalues \(W(\varvec{R})\) are used as the operators of potential energy in the equation determining the nuclear motion:

$$\begin{aligned} \left[ -\frac{1}{2}\sum \limits _{j=1}^{N_a}\frac{1}{M_j}\varDelta _{\varvec{R}_j}+W(\varvec{R})\right] \chi (\varvec{R})=\varepsilon \chi (\varvec{R}) \end{aligned}$$
(3.106)

This way of solving (3.104) is equivalent to the representation of the wavefunction \(\varPhi \) in the form of the product

$$\begin{aligned} \varPhi (\varvec{r},\varvec{R})=\psi (\varvec{r},\varvec{R})\chi (\varvec{R}) \end{aligned}$$
(3.107)

Further corrections to this reasonable approximation may be obtained from adiabatic perturbation theory by using, as the small parameter, the value \((\frac{1}{M})^{1/4}\), where \(M\) is the average mass of the nuclei. Equation (3.105) is often considered as an independent problem without any relation to the more general problem (3.104). This is motivated by the following reasoning. If the temperature is not very high, the nuclei vibrate about some equilibrium positions \(\varvec{R}^{(0)}\). Thus, in calculating the electronic structure, only the configuration with the nuclei fixed at their equilibrium positions \(\varvec{R}^{(0)}\) is considered. The latter are typically known from experimental data (e.g. from X-ray or neutron-scattering crystallographic data). Such electronic-structure calculations, which use no experimental data other than the equilibrium positions of the nuclei, are often referred to as first-principles calculations. Often, the first-principles calculations are made with geometry optimization, when the positions of the nuclei are found from the total-energy minimization. Formally, the results of such calculations correspond to zero temperature.

The temperature dependence of the structure can be found using the Molecular Dynamics (MD) approach [5]. The simplest MD approach uses classical mechanics to find solutions of (3.106). The familiar Newton equations of motion are solved

$$\begin{aligned} M_j\frac{d^2\varvec{R}_j(t)}{dt^2}=-\nabla _{j}V(\varvec{R}(t)),\qquad j=1,2,\ldots ,N_a \end{aligned}$$
(3.108)

The potential \(V(\varvec{R}(t))\) can be found either from a force-field approach or from the solution of the ground-state time-independent Schrödinger equation for the fixed nuclear configuration \(\varvec{R}\). In the first case this approach is known as classical (molecular-mechanics) molecular dynamics; in the second case it is usually called quantum-mechanical (ab initio) molecular dynamics.
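As an illustration of the simplest (classical) MD scheme, a velocity Verlet integrator for the Newton equations is sketched below; the force callback stands for either a force field or forces obtained from an electronic-structure calculation at fixed nuclei, and the harmonic two-particle example is purely illustrative.

```python
import numpy as np

def velocity_verlet(R, V, masses, force, dt, nsteps):
    """Integrate M_j d^2R_j/dt^2 = -grad_j V for the nuclei.

    R, V    : (N, 3) positions and velocities
    masses  : (N,) nuclear masses
    force   : callable R -> (N, 3) forces (force field or electronic-structure forces)
    """
    F = force(R)
    for _ in range(nsteps):
        V = V + 0.5 * dt * F / masses[:, None]   # half kick
        R = R + dt * V                           # drift
        F = force(R)
        V = V + 0.5 * dt * F / masses[:, None]   # half kick
    return R, V

# toy usage: two particles bound by a harmonic spring (k = 1, r0 = 1)
def spring_force(R, k=1.0, r0=1.0):
    d = R[1] - R[0]
    r = np.linalg.norm(d)
    f = -k * (r - r0) * d / r                    # force acting on particle 1
    return np.array([-f, f])

R0 = np.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0]])
V0 = np.zeros((2, 3))
print(velocity_verlet(R0, V0, np.array([1.0, 1.0]), spring_force, 0.01, 1000)[0])
```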

7 First-Principles Simulation of Bulk Crystal (3D) Properties

The applications of the DFT and hybrid exchange-correlation functionals for simulations of nanostructures (nanolayers, nanotubes, nanowires) are discussed in Part II of this book. However, the study of nanostructures usually starts with calculations of the structure and properties of the corresponding bulk crystals. The bulk crystals play the role of benchmark systems used to check whether the selected computational method reproduces the experimental data for different bulk-crystal properties: the equilibrium atomic structure, one-electron properties (band structure and density of states), formation and atomization (cohesive) energies, bulk modulus and elastic constants, phonon frequencies and thermodynamic properties.

7.1 One-Electron Properties: Band Structure, Density of States

The local properties of chemical bonding in crystals are defined by the electron-density distribution in real space, described by a one-electron density matrix (DM). The latter is calculated self-consistently for a finite set of \(L\) discrete \(\varvec{k}\)-points in the BZ and corresponds to the cyclic cluster of \(L\) primitive unit cells modeling the infinite crystal:

$$\begin{aligned} \rho _{\mu \nu }(\varvec{R}_n)=\frac{1}{L}\sum \limits _{i=1}^M\sum \limits _{j=1}^L\exp (-\mathrm{i}\varvec{k}_j\varvec{R}_n)C_{i\mu }(\varvec{k}_j)C^*_{i\nu }(\varvec{k}_j)n_i(\varvec{k}_j) \end{aligned}$$
(3.109)

The total number \(M\) of the bands in (3.109) equals the number of AOs (both in LCAO and PW calculations) associated with the primitive unit cell. The one-electron energy levels form energy bands consisting of \(L\) levels in each band. Therefore, each energy band can accommodate \(2L\) electrons; if there are \(n\) electrons in the unit cell and the bands do not cross, the lowest \(n/2\) bands are occupied and are separated from the empty bands. In this case the occupation numbers in (3.109) are \(n_i = 2, 0\) for the occupied and empty bands, respectively (insulators and semiconductors). However, if \(n\) is odd, or if the valence and conduction bands cross, some bands are only partially occupied (metal).

The eigenvalue spectrum of an infinite periodic system does not consist of discrete energy levels as the wavevector \(\varvec{k}\) changes continuously along the chosen direction of the BZ. The DM of an infinite crystal is written in the form where the summation over the discrete \(\varvec{k}\)-vectors of the BZ is replaced by the integration over the BZ. In particular in the LCAO approximation DM (calculated for the cyclic-cluster in the coordinate space) is replaced by

$$\begin{aligned} \rho _{\mu \nu }(\varvec{R}_n)&= \sum \limits _{i=1}^M\rho ^{(i)}_{\mu \nu }(\varvec{R}_n) \nonumber \\&= \frac{1}{V_{BZ}}\sum \limits _{i=1}^M \int \limits _{BZ}\exp (-\mathrm{i}\varvec{k}\varvec{R}_n)C_{i\mu }(\varvec{k})C^*_{i\nu }(\varvec{k})\theta \left( \varepsilon _F-\varepsilon _i(\varvec{k})\right) d\varvec{k} \end{aligned}$$
(3.110)

where \(\rho ^{(i)}_{\mu \nu }(\varvec{R}_n)\) is the contribution of the \(i\)th energy band.

For the infinite crystal at each cycle of the SCF process, an energy \(\varepsilon _F\) (the Fermi energy) must be determined, such that the number of one-electron levels with energy below \(\varepsilon _F\) is equal to the number of electrons (or, in other words, the number of filled bands below \(\varepsilon _F\) is equal to half the number of electrons in the unit cell). The Fermi surface is the surface in reciprocal space that satisfies the condition \(\varepsilon _i(\varvec{k})=\varepsilon _F\). By limiting the integration over the BZ to states with energy below \(\varepsilon _F\), a Heaviside step function \(\theta (\varepsilon _F-\varepsilon )\) excludes the empty states from the summation over the energy bands. In fact, the band structure of the infinite crystal is obtained after the cyclic-cluster self-consistent calculation by the interpolation of the one-electron energy levels considered as continuous functions \(\varepsilon (\varvec{k})\) of the wavevector. The band structure of solids is an important feature, defining their optical, electrostatic and thermal properties. Both the electron charge distribution and the band structure are defined by the self-consistent DM of a crystal.
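For a closed-shell insulator sampled on a finite k-mesh, the Fermi energy can be located by a simple bisection on the electron count; a minimal sketch (T = 0, integer occupations, toy band energies) is given below.

```python
import numpy as np

def fermi_level(eps, weights, n_electrons, tol=1e-10):
    """Find eps_F such that 2 * sum_k w_k * (bands below eps_F) equals the
    number of electrons per cell (closed shells, T = 0).

    eps     : (Nk, Nb) band energies eps_i(k) on the k-mesh
    weights : (Nk,) normalized k-point weights (sum to 1)
    """
    def count(ef):
        return 2.0 * np.sum(weights[:, None] * (eps <= ef))
    lo, hi = eps.min(), eps.max()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count(mid) < n_electrons:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# toy usage: 2 bands on 4 k-points, 2 electrons per cell -> eps_F at the top
# of the (fully occupied) lowest band
eps = np.array([[-1.0, 2.0], [-0.8, 2.2], [-0.5, 2.5], [-0.9, 2.1]])
w = np.full(4, 0.25)
print(fermi_level(eps, w, 2))
```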

For example, the two binary oxides cubic MgO (sodium chloride structure) and tetragonal TiO\(_2\) (rutile structure) differ significantly in the character of metal-oxygen chemical bonding, because the electron-density distributions in them are very different: Mg–O bonding is practically ionic, whereas Ti–O bonding is essentially covalent.

We discuss now the differences in the band structures of these crystals.

The electronic structures of both compounds have been well studied experimentally. The experimental data show that the MgO crystal is a wide-bandgap insulator (\(E_g = 7.8\ \) eV); titanium dioxide TiO\(_2\) in the rutile structure is a semiconductor with an experimental bandgap of approximately 3 eV. These differences are reproduced in the band structure of these two binary oxides, calculated in [212] by HF and LDA LCAO methods and shown in Figs. 3.6 and 3.7, respectively. The details of the AO basis-set choice and BZ summation can be found in [212].

Fig. 3.6 Band structure and DOS of MgO crystal [212]: a HF LCAO method; b DFT(LDA) LCAO method

Fig. 3.7 Band structure and DOS of rutile TiO\(_2\) crystal [212]: a HF LCAO method; b DFT(LDA) LCAO method

In MgO (in accordance with the results of other calculations), the two highest valence bands are the oxygen \(s\)- and \(p\)-like bands, respectively, whereas the conduction bands are more complicated. The upper valence bands in TiO\(_2\) are also oxygen \(s\)- and \(p\)-like bands; however, they consist of 4 and 12 sheets, respectively, because the primitive cell of titanium oxide contains four oxygen atoms.

The lowest conduction band in TiO\(_2\) consists of 10 branches formed by \(3d\)-states of two titanium atoms and is noticeably separated in energy from the upper conduction bands. The symmetry of the one-electron states can be found using the band representation theory of space groups and data on the crystalline structures.

Evidently the symmetry of band states does not depend on the basis choice for the calculation (LCAO or PW); the change of basis set can only make changes in the relative positions of one-electron energy levels.

An important parameter in the band theory of solids is the Fermi-level energy (see Figs. 3.6 and 3.7), the top of the available electron energy levels at low temperatures. The position of the Fermi level in relation to the conduction band is a crucial factor in determining electrical and thermal properties of solids. The Fermi energy is the analog of the highest-occupied MO energy (HOMO) in molecules. The LUMO (the lowest unoccupied MO) energy in molecules corresponds to the conduction-band bottom in solids. The HOMO-LUMO energy interval in a solid is called the forbidden energy gap or simply bandgap.

Depending on the translation symmetry of the corresponding Bloch states the gap may be direct or indirect. In HF and LDA calculations of both crystals under consideration the bandgap is direct, i.e. the one-electron energies at the top of the valence band and the bottom of the conduction band belong to the same \(\varGamma \)-point of the BZ (this result agrees both with the experiment and other band-structure calculations). It is seen that the HF bandgap in both crystals is essentially overestimated and decreases in LDA calculations mainly due to the lowering of the conduction-band bottom energy. The influence of the correlation effects on the valence-band states is smaller.
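Whether the gap is direct or indirect can be read off directly from the band energies on the sampled k-path: one compares the k-point of the valence-band maximum with that of the conduction-band minimum. A minimal sketch with toy band energies:

```python
import numpy as np

def band_gap(eps, n_occ):
    """Classify the gap from band energies eps_i(k) of shape (Nk, Nb):
    top of band n_occ-1 (valence) vs bottom of band n_occ (conduction)."""
    vbm_k = np.argmax(eps[:, n_occ - 1])
    cbm_k = np.argmin(eps[:, n_occ])
    gap = eps[cbm_k, n_occ] - eps[vbm_k, n_occ - 1]
    kind = "direct" if vbm_k == cbm_k else "indirect"
    return gap, kind, vbm_k, cbm_k

# toy bands on a 5-point k-path: VBM and CBM both at k-index 0 -> direct gap
eps = np.array([[-0.1, 3.0], [-0.5, 3.4], [-1.0, 3.9], [-0.5, 3.4], [-0.1, 3.2]])
print(band_gap(eps, n_occ=1))
```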

Figures 3.6 and 3.7 show that the oxygen \(2p\) bandwidth in ionic crystal MgO is smaller than that in TiO\(_2\). The bandwidth is a measure of dispersion in the \(\varvec{k}\)-space and depends on the magnitude and the range of interactions within the crystal: for the more covalent rutile crystal the oxygen-oxygen interactions are stronger. The core bands of both crystals (not shown on the figures) are separated by a large energy gap from the valence bands and are completely flat due to the high localization of core states near the atomic nuclei.

The Fermi energy of a crystal with \(n\) electrons in the primitive unit cell is defined from the condition

$$\begin{aligned} n=2\sum \limits _{i=1}^M \int \limits _{-\infty }^{\varepsilon _F}\frac{1}{V_{BZ}}\int \limits _{BZ}\delta (\varepsilon -\varepsilon _i(\varvec{k}))d\varvec{k}\,d\varepsilon =\int \limits _{-\infty }^{\varepsilon _F}n(\varepsilon )d\varepsilon =\sum \limits _i\int \limits _{-\infty }^{\varepsilon _F}n_i(\varepsilon )d\varepsilon \end{aligned}$$
(3.111)

where \(n(\varepsilon )\) is the total density of states (DOS) per unit energy. The DOS is an important quantity calculated for crystals. The total DOS can be expressed as the sum of contributions \(n_i(\varepsilon )\) from individual energy bands, see (3.111). The total DOS definition is independent of the basis-set choice (PW or LCAO); the product \(n_i(\varepsilon )d\varepsilon \) defines the number of states with energy in the interval \(d\varepsilon \). Each energy band spans a limited energy interval between the minimal and maximal one-electron energies \(\varepsilon _{min}\), \(\varepsilon _{max}\), so that \({\int }_{\varepsilon _{min}}^{\varepsilon _{max}}n(\varepsilon )d\varepsilon \) gives the number of states in the corresponding energy interval.
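Numerically, the delta function in the DOS is usually replaced by a finite-width broadening function (or handled by the tetrahedron method); a minimal Gaussian-smearing sketch over toy band energies is shown below. Spin degeneracy is not included here, so the DOS integrates to the number of bands rather than the number of states.

```python
import numpy as np

def total_dos(eps, weights, egrid, sigma=0.05):
    """n(eps): the delta function in Eq. (3.112) replaced by a normalized
    Gaussian of width sigma (spin degeneracy not included)."""
    dos = np.zeros_like(egrid)
    norm = 1.0 / (sigma * np.sqrt(2.0 * np.pi))
    for wk, ek in zip(weights, eps):          # loop over k-points
        for e in ek:                          # loop over bands
            dos += wk * norm * np.exp(-0.5 * ((egrid - e) / sigma) ** 2)
    return dos

egrid = np.linspace(-2.0, 4.0, 601)
eps = np.array([[-1.0, 2.0], [-0.8, 2.2], [-0.5, 2.5], [-0.9, 2.1]])   # toy bands
w = np.full(4, 0.25)
dos = total_dos(eps, w, egrid)
print(dos.sum() * (egrid[1] - egrid[0]))      # integrates to the number of bands (2)
```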

The connection between the electronic structure of a crystal and the one-electron states of the constituent atoms is given by the projected density of states (PDOS), associated with the separate AOs, their shells or individual atoms.

Let us rewrite the total DOS as

$$\begin{aligned} n(\varepsilon )=\sum \limits _{i=1}^Mn_i(\varepsilon )=\frac{1}{V_{BZ}}\sum \limits _{i=1}^M\int \limits _{BZ}f^{(i)}(\varvec{k})\delta (\varepsilon -\varepsilon _i(\varvec{k}))d\varvec{k} \end{aligned}$$
(3.112)

The weighting function \(f^{(i)}(\varvec{k})\) in (3.112) chosen as

$$\begin{aligned} f^{(i)}(\varvec{k})=C_{i\mu }(\varvec{k})C^*_{i\nu }(\varvec{k})\exp (-\mathrm{i}\varvec{k}\varvec{R}_n) \end{aligned}$$
(3.113)

defines the PDOS associated with AO \(\mu \) in the reference cell and AO \(\nu \) in the cell with the translation vector \(\varvec{R}_n\). After summation over the band index \(i\) and integration up to \(\varepsilon _F\), the PDOS (3.113) gives the density-matrix elements (3.110).

According to Mulliken population analysis, the DOS projected onto a given set of AOs \(\{\lambda \}\) (belonging to the given shell or to the whole atom), is defined by the weight function

$$\begin{aligned} f^{(i)}_{\{\lambda \}}(\varvec{k})=\sum \limits _{\mu \in \{\lambda \}}\sum \limits _{\nu }C_{i\mu }(\varvec{k})C^*_{i\nu }(\varvec{k})S_{\mu \nu }(\varvec{k}) \end{aligned}$$
(3.114)

where \(S_{\mu \nu }(\varvec{k})\) is the Fourier component of the overlap integral.

Let the set \(\{\lambda \}\) consist of one AO from the reference cell. The summation of (3.114) over the direct lattice translations \(\varvec{R}_n\) and integration over BZ gives the orbital DOS

$$\begin{aligned} n_{\mu }(\varepsilon )=\frac{1}{V_{BZ}}\sum \limits _i\sum \limits _{\nu }\sum \limits _{\varvec{R}_n}\int \limits _{BZ}C_{i\mu }(\varvec{k})C^*_{i\nu }(\varvec{k})\exp \left( \mathrm{i}\varvec{k}\varvec{R}_n\right) S_{\mu \nu }(\varvec{k})\delta (\varepsilon -\varepsilon _i(\varvec{k}))d\varvec{k} \end{aligned}$$
(3.115)

The DOS of an atom \(A\), \(n_A(\varepsilon )\), and the total DOS \(n_{tot}(\varepsilon )\) are calculated from the orbital DOS:

$$\begin{aligned} n_A(\varepsilon )=\sum \limits _{\mu \in A}n_{\mu }(\varepsilon ); n_{tot}(\varepsilon )=\sum \limits _An_A(\varepsilon ) \end{aligned}$$
(3.116)
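A minimal sketch of the Mulliken weights (3.114) for one k-point is given below; real orbitals are assumed, and the toy "crystalline orbitals" are simply S-normalized eigenvectors of a 2x2 model matrix, so that the weights of each band sum to one.

```python
import numpy as np

def mulliken_weights(C, S, ao_indices):
    """Mulliken weight f^(i)_{lambda}(k) of Eq. (3.114) for one k-point.

    C          : (Nao, Nbands) LCAO coefficients (columns = bands), real orbitals
    S          : (Nao, Nao) overlap matrix S_{mu nu}(k)
    ao_indices : AOs of the chosen shell/atom (the set {lambda})
    """
    SC = S @ C                                               # sum_nu S_{mu nu} C_{i nu}
    return np.einsum("mi,mi->i", C[ao_indices], SC[ao_indices])

# toy 2-AO example in a non-orthogonal basis (pretend crystalline orbitals)
S = np.array([[1.0, 0.3], [0.3, 1.0]])
_, C = np.linalg.eigh(np.array([[-1.0, 0.2], [0.2, 1.0]]))
C = C / np.sqrt(np.einsum("mi,mn,ni->i", C, S, C))           # S-normalize the columns
print(mulliken_weights(C, S, [0]) + mulliken_weights(C, S, [1]))   # -> [1. 1.]
```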

The total DOS and PDOS give rich information on the chemical structure of a system, connecting the calculated band structure with the atomic states. We demonstrate this by considering total and projected DOS for binary oxides MgO, TiO\(_2\) and ternary oxides SrTiO\(_3\) and SrZrO\(_3\) with cubic perovskite structure (see Figs. 3.6, 3.7 and 3.8).

Fig. 3.8 Full and partial DOS in a SrTiO\(_3\) and b SrZrO\(_3\) crystals

Fig. 3.9 Band structure of cubic crystals: a, b, c SrTiO\(_3\); d, e, f SrZrO\(_3\). a, d HF LCAO method; b, e hybrid HF-DFT(PBE0) LCAO method; c, f DFT(PBE) LCAO method

The analysis of total and partial DOS demonstrates that in all the three oxides under consideration the upper valence band is predominantly formed by the O \(2p\) states. It can also be seen that Mg \(3s\), Ti \(3d\) and Zr \(4d\) states make the dominant contribution to the bottom of the conduction band.

As is seen from Fig. 3.9, the inclusion of correlation effects moves the valence and conduction bands to higher and lower energies, respectively, which decreases the bandgap. In the more ionic strontium zirconate the energy bands are narrower than in the less-ionic strontium titanate.

The ground-state electron-charge density in a crystal can be expressed as

$$\begin{aligned} \rho (\varvec{r})=\sum \limits _{\mu \nu }\sum \limits _{\varvec{R}_n}\sum \limits _{\varvec{R}_m}\rho _{\mu \nu }(\varvec{R}_n-\varvec{R}_m)\chi _{\mu }(\varvec{r}-\varvec{R}_n)\chi _{\nu }(\varvec{r}-\varvec{R}_m) \end{aligned}$$
(3.117)

and reproduces the essential features of the electron-density distribution near the atomic nuclei and along the interatomic bonds. For each \(\varvec{r}\) the sums over \(\varvec{R}_n\) and \(\varvec{R}_m\) are restricted to those direct lattice vectors for which the values of \(\chi _{\mu }(\varvec{r}-\varvec{R}_n)\) and \(\chi _{\nu }(\varvec{r}-\varvec{R}_m)\) are non-negligible.
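A one-dimensional sketch of (3.117) with a single normalized Gaussian AO per cell is given below; the density-matrix elements are toy numbers, and the lattice sums are restricted, exactly as described above, to cells whose orbital is non-negligible at the evaluation point.

```python
import numpy as np

# Minimal 1D sketch of Eq. (3.117): one normalized Gaussian AO per cell,
# density-matrix elements rho(R_n - R_m) decaying with distance (toy values).
a, alpha = 2.0, 1.5                               # lattice constant, Gaussian exponent
chi = lambda x: (2 * alpha / np.pi) ** 0.25 * np.exp(-alpha * x ** 2)

def rho_element(dn):                              # toy rho_{mu nu}(R_n - R_m)
    return {0: 1.0, 1: 0.15, -1: 0.15}.get(dn, 0.0)

def density(x, ncut=6):
    """Charge density at x; the lattice sums run only over cells whose
    AO is non-negligible at x (distance cutoff on |x - n*a|)."""
    cells = [n for n in range(-ncut, ncut + 1) if abs(x - n * a) < 5.0]
    rho = 0.0
    for n in cells:
        for m in cells:
            rho += rho_element(n - m) * chi(x - n * a) * chi(x - m * a)
    return rho

xs = np.linspace(-a, a, 9)
print([round(density(x), 4) for x in xs])
```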

The total electron-density maps provide a pictorial representation of the total electronic distribution and are obtained by calculation of the charge density in a grid of points belonging to some planes. More useful information is obtained by considering difference maps, given as a difference between the crystal electron-density and a “reference” electron density. The latter is a superposition of atomic or ionic charge distributions.

Figure 3.10 shows the total and difference maps for Cu\(_2\)O crystal, obtained in HF LCAO calculations [213].

Fig. 3.10 Electron-density maps obtained for Cu\(_2\)O on a (110) plane [213]. a Total electron density. b Density difference map, bulk minus neutral-atom superposition. Values corresponding to neighboring isodensity lines differ by 0.01 e/Bohr\(^3\). The full and broken curves in (b) indicate density increase and decrease, respectively

Two important quantities that require an integration involving \(\rho (\varvec{r})\) are the electrostatic potential and the electric field [214]. In particular, maps of the electrostatic potential created by electrons and nuclei at a crystal surface may be useful for gathering information about reaction paths and active sites of electrophilic or nucleophilic chemical processes at the surface. As concerns the electric field, it may be of interest to calculate the electric-field gradient at the location of nuclei with a nonzero nuclear quadrupole moment, since comparison with experimental data is possible in such cases.

Three functions characterizing the electron distribution in momentum space may be computed; they carry the same information content but have different uses in the discussion of theoretical and experimental results [86]: the electron momentum density (EMD) \(\rho (\varvec{p})\); the Compton profile (CP) function \(J(\varvec{p})\); and the autocorrelation function, or reciprocal-space form factor, \(B(\varvec{r})\).

Let \(\chi _{\mu }(\varvec{p})\) be defined as the Fourier transform of AO \(\chi _{\mu }(\varvec{r})\) belonging to atom A

$$\begin{aligned} \chi _{\mu }(\varvec{p})=\int \exp (\mathrm{i}\varvec{p}\varvec{r})\chi _{\mu }(\varvec{r})d\varvec{r} \end{aligned}$$
(3.118)

and \(\varvec{s}_{\mu }\) is the fractional coordinate of atom \(A\) in the reference cell.

EMD is defined as the diagonal element of the six-dimensional Fourier transform of the one-electron density matrix from coordinate to momentum space:

$$\begin{aligned} \rho (\varvec{p})=\sum \limits _{\mu \nu }\sum \limits _{\varvec{R}_n}\rho _{\mu \nu }(\varvec{R}_n)\exp \left( -\mathrm{i}\varvec{p}(\varvec{R}_n+\varvec{s}_{\mu }-\varvec{s}_{\nu })\right) \chi _{\mu }(\varvec{p})\chi _{\nu }(\varvec{p}) \end{aligned}$$
(3.119)

The Compton profile function is obtained by 2D integration of the EMD over the plane through \(\varvec{p}\) and perpendicular to its direction

$$\begin{aligned} J(\varvec{p})=\int \rho (\varvec{p}+\varvec{p}'_{\perp })d\varvec{p}'_{\perp } \end{aligned}$$
(3.120)

where \(\varvec{p}'_{\perp }\) denotes a general vector perpendicular to \(\varvec{p}\).

It is customary to make reference to CPs as functions of a single variable \(p\), with reference to a particular direction \(\langle hkl\rangle \) identified by a vector \(\varvec{e}=(h\varvec{a}_1+k\varvec{a}_2+l\varvec{a}_3)/|h\varvec{a}_1+k\varvec{a}_2+l\varvec{a}_3|\). We have

$$\begin{aligned} J_{\langle hkl\rangle }(p)=J(p\varvec{e}) \end{aligned}$$
(3.121)

The function \(J_{\langle hkl\rangle }(p)\) is referred to as directional CPs. The weighted average of the directional CPs over all directions is the average CP. In the so-called impulse approximation, \(J_{\langle hkl\rangle }(p)\) may be related to the experimental CPs, after correction for the effect of limited resolution [86].

Once the directional CPs are available, the numerical evaluation of the corresponding autocorrelation function, or reciprocal-space form factor, \(B(\varvec{r} )\) is given by the 1D Fourier transform:

$$\begin{aligned} B_{\langle hkl \rangle }(\varvec{r})=\frac{1}{\pi }\int \limits _{-\infty }^{\infty }J_{\langle hkl\rangle }(p)\exp (\mathrm{i}\varvec{p}\varvec{r}){ d}\varvec{p} \end{aligned}$$
(3.122)
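Given a directional CP sampled on a symmetric p-grid, the autocorrelation function (3.122) reduces to a one-dimensional numerical Fourier transform along the chosen direction; a short sketch with a toy Gaussian profile (for which B(r) is again Gaussian) is:

```python
import numpy as np

def autocorrelation_B(p, J, r):
    """Eq. (3.122): B_<hkl>(r) = (1/pi) * 1D Fourier transform of J_<hkl>(p).
    p, J are sampled on a symmetric grid; a simple Riemann sum is used."""
    dp = p[1] - p[0]
    phase = np.exp(1j * np.outer(r, p))           # e^{i p r} for every r
    return (phase * J).sum(axis=1) * dp / np.pi

# toy profile: a Gaussian CP, whose B(r) is (up to scale) again a Gaussian
p = np.linspace(-10.0, 10.0, 2001)
J = np.exp(-p ** 2)
r = np.linspace(0.0, 4.0, 5)
print(np.round(autocorrelation_B(p, J, r).real, 4))
```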

The structural and electronic properties of Cu\(_2\)O have been studied in [213] using the HF LCAO method and a posteriori density-functional corrections. The electronic structure and bonding in Cu\(_2\)O were analyzed and compared with X-ray photoelectron spectroscopy spectra, showing a good agreement for the valence-band states. The Fourier transform of the ground-state charge density of a crystalline system provides the X-ray structure factors of the crystal, which can be determined experimentally by X-ray diffraction. To check the quality of the calculated electron density in Cu\(_2\)O crystal, structure factors have been calculated in [213], showing a good agreement with the available experimental data.

7.2 Equilibrium Structure, Bulk Modulus, Formation and Surface Energy

The comparison of the parameters of the bulk rutile calculated using DFT-LCAO and DFT-PW methods with those obtained previously was made in [215].

Large-scale DFT-GGA-PBE LCAO calculations of bulk rutile TiO\(_2\) with full geometry optimization were performed using the CRYSTAL-09 code [17]. An all-valence basis set (BS) in the form of 6s–311sp–1d Gaussian-type functions (GTFs) was used for the O atom [216]; for the Ti atom, a small-core effective core pseudopotential (SCECP) was employed for the internal shells, and the BS includes valence (3s, 3p, 3d, 4s electrons) and virtual shells, i.e., the GTF configuration is SCECP–5s–6sp–5d [217]. The BSs for both the O and Ti atoms were optimized, and the self-consistent procedure was performed on the \(6\!\times \!6\!\times \!6\) Monkhorst–Pack [87] \(\mathbf {k}\)-mesh. To estimate effective atomic charges within the LCAO method, the Mulliken population analysis was used.
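The Monkhorst–Pack mesh mentioned above is generated by a simple rule; a minimal sketch producing the fractional k-points of a \(q_1\times q_2\times q_3\) mesh (without the symmetry reduction that the codes apply afterwards) is:

```python
import numpy as np

def monkhorst_pack(q1, q2, q3):
    """Fractional k-points of a q1 x q2 x q3 Monkhorst-Pack mesh,
    u_r = (2r - q - 1)/(2q), r = 1..q (no symmetry reduction applied)."""
    grids = [(2.0 * np.arange(1, q + 1) - q - 1) / (2.0 * q) for q in (q1, q2, q3)]
    return np.array([[u, v, w] for u in grids[0] for v in grids[1] for w in grids[2]])

kmesh = monkhorst_pack(6, 6, 6)
print(len(kmesh), kmesh[0])        # 216 points; each carries equal weight 1/216
```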

For large-scale DFT-PW calculations on rutile with full geometry optimization, scalar-relativistic pseudopotentials combined with the Projector Augmented Wave (PAW) method, as implemented in the VASP computer code [154], were employed. The same PBE-GGA nonlocal exchange-correlation functional was used in both the DFT-LCAO and DFT-PW calculations. To estimate the effective charge distribution on atoms within the plane-wave method, the Bader approach has been used.

The results calculated in [215] and in a previous study using both the DFT-LCAO and DFT-PW approaches are compared in Table 3.3. Comparison of the structure parameters of bulk titania calculated with the different methods clearly shows that they correlate qualitatively with one another and correspond to the experimental values, although a certain scatter of the results obtained with the different methods can be observed. The same is true for the Ti–O bond lengths inside the first (I) and second (II) coordination shells, consisting of four and two O atoms, respectively, around each Ti atom. The calculated values of the effective charges q\(_O\) and q\(_{Ti}\) also look quite reasonable and are qualitatively comparable with the corresponding data obtained earlier.

Table 3.3 Structural and electronic properties of optimized rutile-based TiO\(_2\) bulk [215]

As for the large difference between the experimental and calculated values of the band gap \(\varDelta \varepsilon _g\), it is well known that standard DFT calculations substantially underestimate \(\varDelta \varepsilon _g\) for semiconductors and insulators. Nevertheless, DFT-PW-GGA methods were found to be quite applicable for a qualitative description of a number of properties of bulk rutile.

Table 3.4 shows the results of LCAO (PBE0) calculations of TiS\(_2\) and ZrS\(_2\) bulk crystals [219].

Table 3.4 Calculated lattice parameters, formation energies \(E_\mathrm{form}\)\(^\mathrm{a}\), and band gaps \(E_g\) of different TiS\(_2\) and ZrS\(_2\) phases [219] (experimental values are given in parentheses)

Table 3.4 shows that the difference between the theoretical and experimental structure parameters does not exceed 1 % for the two sulfides under consideration. A good estimate of the formation energies of TiS\(_2\) and ZrS\(_2\) is also provided. This fact can be considered as verification that the PBE0 approximation is sufficiently accurate in the special case of these layered crystals. The success of the hybrid DFT method in reproducing the weak van der Waals-type interlayer forces may be explained by the large contribution of the induced electrostatic interactions due to the large polarizability of the sulfur atom. In agreement with the experimental observations, the 1T-o polymorphs were found to be the most stable modifications among all the considered phases. This can be clearly seen from Table 3.4, where the formation energy (per formula unit) of the other presupposed phases is given with respect to that of the 1T-o phase. Most encouragingly, the calculated energy of the observed 1T-o polytype proved to be lower than the energies of the 2H-o (with P6\(_3\)mc symmetry) and 3R-o (with R3m symmetry) polytypes, which have the same octahedral morphology of the three-plane atomic layers but were not actually observed.

Ab initio simulations of surface properties use both isolated-slab and repeated-slab models; the two models were compared in HF studies of the surface properties of BaTiO\(_3\) in [226]. Ab initio schemes implemented with plane-wave based computational methods are quite convenient and effective for 3D periodic bulk crystals, but cannot treat isolated slabs directly. With plane-wave-based methods, slabs must be treated with periodically repeated supercells, which contain a vacuum space separating the slabs. Two major differences then result between isolated-slab and periodic-slab calculations [226]. The first one is the presence of spurious interactions among the periodically repeated slabs. Secondly, isolated-slab calculations can assume the proper boundary condition of a vanishing electric field in the vacuum regions, while periodic-slab calculations must assume the boundary condition of a vanishing average field over the supercell. This is a major problem when dealing with slabs with asymmetric terminations, even in the absence of a macroscopic polarization. Since the work functions of the two nonequivalent surfaces are generally different in that case, there is a potential difference across the slab. However, periodic boundary conditions force a periodic potential, which amounts to imposing a fictitious external field. This problem is unavoidable with periodic-slab calculations, where self-consistency is achieved in such a fictitious field. Presumably, both of these problems for periodically repeated slabs diminish with increasing thickness of the vacuum regions separating the slabs. On the other hand, even moderate vacuum separations greatly increase the computational burden, particularly when plane-wave-like basis sets are used.

The results of the very first ab initio study of surface properties of BaTiO\(_3\) using isolated slabs were reported in [226]. The ability to deal with isolated slabs is due to the localized basis implementation that has been adopted in [226]. A localized-basis set not only allows one to perform calculations for genuinely finite and isolated systems, but also permits the introduction of arbitrarily large vacuum separations in periodic systems without any essential increase in the computational burden. That allows one to test periodic-slab calculations with various thicknesses of vacuum space in between. The results obtained in [226] indicate that periodic-slab calculations are substantially affected by the two problems mentioned above, even at quite large vacuum separations. On the other hand, the results for truly isolated slabs are quite consistent with fast convergence to genuine surface properties.

Several surface properties of BaTiO\(_3\) have been calculated in [226] within an ab initio HF scheme: namely, the surface charge, surface energy, and surface dynamical charges. These properties have been studied as functions of the thickness and termination of the slabs. Both genuinely isolated slabs and periodically repeated slabs with different terminations have been used. The ability to deal with genuinely isolated slabs is a virtue of the localized-basis implementation adopted in [226]. In particular, the slab surface charge provides a value for the spontaneous polarization in excellent agreement with the calculation of the latter as a Berry phase, and with experiment as well. On the other hand, it has been found that when periodically repeated slabs are used, the interactions among slabs and the fictitious field imposed by the periodic boundary conditions can significantly affect the accuracy of the calculations, even at quite large vacuum separations.

7.3 Phonon Frequencies and Relative Phase Stability Calculations

The second-order derivatives of the energy with respect to the nuclear coordinates are involved in lattice dynamics, in particular, in the calculation of vibrational (phonon) spectra. We shall begin from the molecular case [227]. The decoupling of the nuclear from the electronic motion is made in the adiabatic approximation. Let \(u_i\) represent a displacement of the \(i\)th Cartesian coordinate from its equilibrium value (\(i=1,2,\ldots , 3N\)), where \(N\) is the number of nuclei in a molecule; \(q_i=\sqrt{M_i}u_i\) is the corresponding generalized coordinate (\(M_i\) is the mass of the atom associated with the \(i\)th coordinate), and \(p_i={\overset{.}{q}}_i\) is its derivative with respect to time. In the harmonic approximation the classical vibrational Hamiltonian of a polyatomic molecule becomes

$$\begin{aligned} H=T+V=\frac{1}{2}\left( \sum \limits _iM_i{\overset{.}{u}}^2_i+\sum \limits _{ij}u_iH_{ij}u_j\right) +V_0=\frac{1}{2}\left( \langle p|p\rangle +\langle q|W|q \rangle \right) +V_0 \end{aligned}$$
(3.123)

Here \(V_0\) is the electron energy for the equilibrium atomic coordinates, and \(H_{ij}\) are the Hessian matrix elements

$$\begin{aligned} H_{ij}=\frac{1}{2}\left[ \frac{\partial ^2V}{\partial u_i\partial u_j}\right] _0 \end{aligned}$$
(3.124)

evaluated at equilibrium. The relation \(W_{ij}=\frac{H_{ij}}{\sqrt{M_iM_j}}\) defines the elements of the weighted Hessian.

The eigenvalues \(\varkappa _j\) of the Hermitian matrix \(W\) are the generalized force constants. The Hamiltonian (3.123) then can be factorized into \(3N\) one-dimensional harmonic Hamiltonians

$$\begin{aligned} H=\sum \limits _{\nu }h_{\nu }=\sum \limits _{\nu }\frac{1}{2}\left( P_{\nu }^2+\omega _{\nu }^2Q_{\nu }^2\right) \end{aligned}$$
(3.125)

Thus each of the \(3N-6\) vibrational modes can be interpreted as a collective oscillatory movement with frequency \(\omega _{\nu }=\sqrt{\varkappa _{\nu }}/2\pi \) and the problem of calculating vibrational spectra reduces to the diagonalization of matrix \(W\) to find the set of eigenvalues \(\varkappa _j\).
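A minimal numerical sketch of this diagonalization step is given below: the Cartesian Hessian is mass-weighted and diagonalized, and the frequencies follow from the generalized force constants as stated above (the toy two-coordinate Hessian is purely illustrative; zero or negative eigenvalues correspond to translations/rotations or to instabilities).

```python
import numpy as np

def harmonic_frequencies(hessian, masses):
    """Mass-weighted Hessian W_ij = H_ij / sqrt(M_i M_j); its eigenvalues are
    the generalized force constants, and omega = sqrt(kappa) / (2*pi).
    hessian : (3N, 3N) second derivatives w.r.t. Cartesian displacements
    masses  : (3N,) mass associated with each Cartesian coordinate
    """
    W = hessian / np.sqrt(np.outer(masses, masses))
    kappa = np.linalg.eigvalsh(W)                  # generalized force constants
    # negative eigenvalues show up as imaginary frequencies (reported with a sign)
    return np.sign(kappa) * np.sqrt(np.abs(kappa)) / (2.0 * np.pi)

# toy example: two unit masses coupled by a unit spring along one coordinate
H = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(harmonic_frequencies(H, np.array([1.0, 1.0])))   # one zero + one finite mode
```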

For periodic systems, the translation invariance of the potential energy and Hessian matrix should be used. The generalized coordinates obey the Bloch theorem and are written in the form

$$\begin{aligned} q_i(\varvec{k})=N\sum \limits _{\varvec{g}}\exp (-\mathrm{i}\varvec{k}\varvec{g})q_i^{\varvec{g}} \end{aligned}$$
(3.126)

The vibrational problem is block-factorized into a set of problems (one for each \(\varvec{k}\) point in the BZ) of dimension \(3N\), where \(N\) is the number of atoms in the primitive unit cell.

The \(\varvec{k}\)-block of the \(\varvec{k}\)-factorized \(W\) matrix takes the form

$$\begin{aligned} W_{ij}(\varvec{k})=\sum \limits _{\varvec{g}}\exp (\mathrm{i}\varvec{k}\varvec{g})\frac{H_{ij}^{\varvec{0}\varvec{g}}}{\sqrt{M_iM_j}} \end{aligned}$$
(3.127)

where \(H_{ij}^{\varvec{0}\varvec{g}}\) is the second derivative of potential energy at equilibrium with respect to atom \(i\) in the reference cell \(\varvec{0}\) and atom \(j\) in cell \(\varvec{g}\). The number of equations (3.127) to be solved equals the number of \(\varvec{k}\)-points in the BZ, i.e. is infinite for the infinite crystal. In practice, the calculations of phonon frequencies are made for a finite number of \(\varvec{k}\)-points and the interpolation is used to obtain so-called phonon branches \(\omega _1(\varvec{k}),\ldots ,\omega _i(\varvec{k}),\ldots ,\omega _{3N}(\varvec{k})\) (like one-electron energies are obtained in SCF calculations for a finite set of \(\varvec{k}\)-points and then interpolated to form the electron-energy bands). The relationship between phonon frequencies \(\omega \) and wavevector \(\varvec{k}\) determines the phonon dispersion.
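As a simple illustration of (3.127), the sketch below builds and diagonalizes \(W(\varvec{k})\) for a one-dimensional diatomic chain with a single nearest-neighbour force constant (a textbook model, not taken from the calculations discussed here); the two eigenvalues at each k give the acoustic and optical branches.

```python
import numpy as np

def phonon_branches(k, M1, M2, f):
    """Eq. (3.127) for a 1D diatomic chain with nearest-neighbour force constant f:
    W(k) is a 2x2 Hermitian matrix; its eigenvalues are the squared frequencies."""
    W = np.array([
        [2 * f / M1, -f * (1 + np.exp(-1j * k)) / np.sqrt(M1 * M2)],
        [-f * (1 + np.exp(1j * k)) / np.sqrt(M1 * M2), 2 * f / M2],
    ])
    w2 = np.linalg.eigvalsh(W)
    return np.sqrt(np.abs(w2))          # acoustic and optical branch at this k

for k in np.linspace(0.0, np.pi, 5):    # k in units of 1/a along the chain
    print(f"k = {k:.2f}  omega = {phonon_branches(k, 1.0, 2.0, 1.0)}")
```

Sampling k on a finer grid and interpolating reproduces the phonon dispersion curves discussed in the text.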

Comparing the vibrational branches and electronic bands calculations we note that in the former case the equations for different \(\varvec{k}\) values are solved independently while in the latter case the self-consistent calculation is necessary due to the BZ summation in the HF or KS Hamiltonian.

Once the phonon dispersion in a crystal is known, thermodynamic functions can be calculated on the basis of statistical mechanics equations. As an example, the Helmholtz free energy, \(F\), can be obtained as:

$$\begin{aligned} \langle F\rangle =\sum \limits _{i\varvec{k}}\left\{ \frac{1}{2}\hslash \omega _{i\varvec{k}}+k_BT\ln \left[ 1-\exp \left( -\frac{\hslash \omega _{i\varvec{k}}}{k_BT}\right) \right] \right\} \end{aligned}$$
(3.128)

where the sum extends over all lattice vibrations \(\omega _{i\varvec{k}}\), and \(k_B\) is the Boltzmann constant. Another way of computing thermodynamic functions is based on the use of the phonon density of states. The evolution of the crystal structure as a function of temperature and pressure can also be simulated by minimizing \(G=F+pV\). The procedure requires a sequence of geometry optimizations and lattice-vibration calculations [228].
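A direct transcription of (3.128) into Python is shown below; it works in units with \(\hslash = k_B = 1\), takes the phonon frequencies collected from the sampled k-points as a flat list, and the numbers used are purely illustrative.

```python
import numpy as np

def helmholtz_free_energy(omegas, T):
    """Vibrational Helmholtz free energy, Eq. (3.128), in units hbar = k_B = 1."""
    omegas = np.asarray(omegas, dtype=float)
    zero_point = 0.5 * np.sum(omegas)                     # zero-point contribution
    if T <= 0.0:
        return zero_point
    thermal = T * np.sum(np.log1p(-np.exp(-omegas / T)))  # T * ln(1 - e^{-w/T})
    return zero_point + thermal

freqs = [0.6, 1.0, 1.4]                                   # toy phonon frequencies
for T in (0.0, 0.5, 1.0):
    print(T, helmholtz_free_energy(freqs, T))
```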

Lattice vibrations can be measured experimentally by means of classical vibrational spectroscopic techniques (infrared and Raman) or by inelastic neutron scattering. However, only the latter technique allows one to measure the full spectrum in a range of \({\varvec{k}}\) vectors, whereas with infrared and Raman spectroscopy only lattice vibrations at \(\varGamma (\varvec{k} =\varvec{0})\) are usually detected (second-order spectra, which correspond to nonzero wavevectors \(\varvec{k} \ne \varvec{0}\), are more demanding). The calculation of the vibrational frequencies only at the \(\varGamma \) point requires the solution of only one secular equation

$$\begin{aligned} \det \left| W(0)-\omega ^2\mathbf{1}\right| =0,\quad W_{ij}(0)=\sum \limits _{\varvec{g}}\frac{H_{ij}^{\varvec{0}\varvec{g}}}{\sqrt{M_iM_j}} \end{aligned}$$
(3.129)

The knowledge of phonon symmetry is useful in the analysis of infrared and Raman spectra of solids as the symmetries of active phonons in these spectra are governed by selection rules following from the symmetry restrictions imposed on the transitions matrix elements [229]. Using the symmetry of the phonons found one can establish which vibrational modes are active in the first- and second-order infrared and Raman spectra.

To analyze the symmetry of phonon states, the method of induced reps of space groups can be used [229]. The procedure of the phonon-symmetry analysis is the following. Arranging the atoms in the primitive cell over the Wyckoff positions of the symmetry group of the periodic system (the space, layer or line group) and using the induced reps of this group, one can determine the symmetry of the phonons. Only those induced reps are used that are induced by the irreps of the site-symmetry groups according to which the components of the local atomic displacement vectors transform. The total dimension \(n\) of the induced rep (called the mechanical representation) equals \(3N\) (\(N\) is the number of atoms in the primitive cell).

As an example, Table 3.5 shows the phonon symmetry in rutile TiO\(_2\) crystal.

Table 3.5 Phonon symmetry in rutile TiO\(_2\) crystal with space group \(D_{4h}^{14}\)

There is a one-to-one correspondence between the irreps of the crystal point group \(D_{4h}\) and the irreps of the space group at the \(\varGamma \) point: \(A_{1g, u} -1^{\pm }, A_{2g, u} -3^{\pm }, B_{1g, u} -2^{\pm }, B_{2g, u} -4^{\pm }, E_ {g, u} -5^{\pm }\). The atomic displacements of the six atoms in the primitive cell generate the 18-dimensional reducible representation, which contains three acoustic modes and 15 optical modes (\(A_{1g}+A_{2g}+A_{2u}+B_{1g}+2B_{1u}+B_{2g}+E_g+3E_u\)). The three acoustic modes have zero frequency at the \(\varGamma \) point and are associated with the translation of the entire crystal along any direction in space. These branches are called acoustic modes as the corresponding vibrations behave as acoustic waves. The translation of the entire crystal along the \(z\)-axis corresponds to the irrep \(3^-(A_{2u})\) at the \(\varGamma \) point, and the translation in the \(xy\) plane to the irrep \(5^-(E_u)\).

The polar modes of \(A_{2u}\) and \(E_u\) symmetry split into transverse (TO) and longitudinal (LO) components with different frequencies due to the macroscopic electric field. All other branches show finite nonzero frequencies at \(\varGamma \) and are known as optical modes, because they correspond to unit-cell dipole-moment oscillations that can interact with electromagnetic radiation.

In the model of a finite crystal the rotations of the entire crystal around the \(z\)-axis and around the axes in the \(xy\) plane correspond to the irreps \(3^+\) and \(5^+\), respectively, and involve the displacements of only the oxygen atoms, see Table 3.5. As seen from Table 3.5, the phonons of even symmetry (\(1^+, 2^+,3^+, 4^+,5^+\)) are connected only with the oxygen-atom displacements. Due to the different atomic masses of the oxygen and titanium atoms, the corresponding lines can appear in different parts of the vibrational spectrum, making its interpretation easier.

The calculation of phonon frequencies of the crystalline structure is one of the fundamental subjects when considering the phase stability, phase transitions, and thermodynamics of crystalline materials. The approaches of ab-initio calculations fall into two classes: the linear response method [230] and the direct method, see [231] and references therein.

In the first method, the dynamical matrix is expressed in terms of the inverse dielectric matrix describing the response of the valence electron-density to a periodic lattice perturbation. For a number of systems, in particular, the nanostructures, the linear-response approach is difficult, since the dielectric matrix must be calculated in terms of the electronic eigenfunctions and eigenvalues of the perfect crystal.

There are two variants of the direct method.

In the frozen-phonon approach the phonon energy is calculated as a function of the displacement amplitude in terms of the difference in the energies of the distorted and ideal lattices. Another approach of the direct method uses the forces \(\left( \frac{\partial E}{\partial u_i}\right) _{eq}\) calculated in the total-energy calculations, derives from them the values of the force-constant matrices and hence the dynamical matrix and phonon-dispersion curves.

The direct-method calculations are restricted to phonons whose wavelength is compatible with the periodic boundary conditions (PBC) applied to the supercell used in the calculations [232]. Let the supercell be defined by the linear transformation (2.4) with matrix \(\mathbf {l}\). The determinant \(L\) of this matrix defines the number of primitive unit cells in the supercell.

The eigenvectors of the dynamical matrix \(\mathrm D(\mathbf{k};\mu ,\nu )\) are Bloch functions with the wave vectors \(\mathbf{K}^{(s)}=\sum _i s_i\mathbf{B}_i\) \((|s_i| \le 1/2)\) belonging to the BZ of the initial reciprocal lattice, defined by the basic translation vectors \(\mathbf{B}_i, i=1,2,3\). Let \(\mathbf{b}_i\) be the basic translation vectors of the reciprocal lattice corresponding to the direct lattice \(\varGamma _2\) of supercells and defining the reduced BZ. By definition of a reciprocal lattice, \((\mathbf{a}_i\mathbf{B}_j)= (\mathbf{A}_j\mathbf{b}_i)=2\pi \delta _{ij}\). The transformation (2.4) of the direct lattice is accompanied by the transformation of the reciprocal lattices \(\mathbf{b}=\mathbf{l}^{-1}\mathbf{B}\). For the symmetrical transformation \(\mathbf l\) the transformation \(\mathbf{l}^{-1}\) is also symmetrical [229], and \(L\) points \(\mathbf{K}^{\mathbf{k}_0}_i\ (i=1,2,\ldots ,L)\) of the initial BZ become equivalent to the point \(\mathbf{k}_0\) of the reduced BZ:

$$\begin{aligned} \mathbf{K}^{\mathbf{k}_0}_i=\mathbf{k}_0+\sum \limits _{j}q_{ij}\mathbf{b}_j,\quad i=1,2,\ldots ,L \end{aligned}$$
(3.130)

where \(q_{ij}\) are integers.

Introducing PBC for a supercell means that only the \(\varGamma (\mathbf{k}=0)\) point in the reduced BZ is considered, so that \(\mathbf{k}_0 =0\) in (3.130). The corresponding \(\mathbf{K}_i^{(s)}\) points of the initial BZ have to be translation vectors of the new reciprocal lattice, i.e. \(\mathbf{K}^{(s)}=\sum _{j'} m_{j'}\mathbf{b}_{j'}\), so that

$$\begin{aligned} \sum \limits _{j'}m_{j'}\mathbf{b}_{j'}\sum \limits _jn_j\mathbf{A}_j=\sum \limits _{j'j}m_{j'}n_j(\mathbf{b}_{j'}\mathbf{A}_j)=2\pi \sum \limits _jm_jn_j \end{aligned}$$
(3.131)

However,

$$\begin{aligned} (\mathbf{K}^{(s)}\mathbf{A}_n)&=\sum \limits _{i'}s_{i'}\mathbf{B}_{i'}\sum \limits _jn_j\sum \limits _il_{ji}\mathbf{a}_i\nonumber \\&=\sum \limits _{i'}\sum \limits _j\sum \limits _is_{i'}l_{ji}n_j(\mathbf{B}_{i'}\mathbf{a}_i)=2\pi \sum \limits _j\sum \limits _is_il_{ji}n_j \end{aligned}$$
(3.132)

where \(m_j, n_j\) are integers (including zero). Thus, the relation

$$\begin{aligned} \sum \limits _i l_{ji}s_i = m_j,\quad j=1,2,3 \end{aligned}$$
(3.133)

is fulfilled for the wave vectors \(\mathbf{K}^{(s)}=\sum _jm_j\mathbf{b}_j\). These wave vectors are called commensurate with the supercell defined by the matrix \(\mathbf{l}\).
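The commensurability condition (3.133) is easy to apply in practice: writing the fractional coordinates of a candidate wave vector with denominator \(L\), one keeps those vectors for which \(\mathbf{l}\cdot \mathbf{s}\) is integer. A brute-force sketch (fine for small supercells) is:

```python
import numpy as np

def commensurate_kpoints(l):
    """Fractional wave vectors s (in units of B_i) commensurate with the
    supercell defined by the integer matrix l, i.e. satisfying Eq. (3.133).
    Exactly L = |det l| such vectors exist in the initial BZ."""
    l = np.array(l, dtype=int)
    L = abs(round(np.linalg.det(l)))
    pts = [(i, j, k)
           for i in range(L) for j in range(L) for k in range(L)
           if np.all(l @ np.array([i, j, k]) % L == 0)]   # l.s is integer
    assert len(pts) == L
    return np.array(pts) / L

# 2 x 2 x 2 supercell of the primitive lattice: its 8 commensurate k-points
print(commensurate_kpoints(np.diag([2, 2, 2])))
```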

The majority of such calculations have been performed with the DFT PW method, for example to study phonons in the rutile structure [233]. The calculation of the vibrational frequencies of crystals is also implemented in the LCAO CRYSTAL code (see http://www.crystal.unito.it) and has been applied to different bulk oxide crystals [59].

In the LCAO approximation the frequencies at \(\varGamma \) are evaluated in the direct method in the same way as for molecules [227]: a set of SCF calculations of the unit cell is performed at the equilibrium geometry and with each of the nuclear coordinates incremented in turn by \(u\) (the use of symmetry can reduce the number of required calculations). Second-order energy derivatives are evaluated numerically. Obtaining frequencies at wavevector symmetry points different from \(\varGamma \) would imply the construction of appropriate supercells, due to the one-to-one correspondence between the supercell choice and the set of \(\varvec{k}\)-points equivalent to the \(\varGamma \) point in the reduced BZ. A finite range of interaction in the lattice sum (3.129) is assumed, usually inside the supercell chosen. In the case of ionic compounds, long-range Coulomb effects due to coherent displacement of the crystal nuclei are neglected, as a consequence of imposing the periodic boundary conditions [227]. Therefore, \(W_{ij}(0)\) needs to be corrected to obtain the longitudinal optical (LO) modes [234]. For this reason, in some cases only the transverse optical (TO) part of the phonon spectrum is calculated, as is done in the combined DFT PW-DFT LCAO lattice-dynamics study of TiO\(_2\) rutile [235]. The phonon frequencies computed in [235] for the optimized rutile crystal structure are reported in Table 3.6 and compared with experimental data.

Table 3.6 Calculated and measured frequencies (in cm\(^{-1}\)) relative to the \(\varGamma \)-point of bulk TiO\(_2\) rutile [235]\(^*\)

The LDA frequencies are in excellent agreement with the experimental frequencies, especially if compared with the frequencies measured at low temperature (\(T \sim \) 4 K), when these data are available. The deviation between the LDA and experimental frequencies is \(\sim \)13 cm\(^{-1}\) at most, and is often much smaller than that. For instance, the deviation drops to no more than \(\sim \)2 cm\(^{-1}\) for the two stiffest modes, \(B_{2g}\) and \(A_{1g}\), and it remains small also for several of the softer modes. Both PBE and PW91 results are much less satisfactory. With the exception of the \(B_{1g}\) mode, the GGA functionals systematically underestimate the LDA and measured frequencies. It is found in [235] that this discrepancy between LDA and GGA results is mostly due to the difference in the equilibrium lattice parameters at zero pressure predicted by the functionals. The LDA frequency is the highest because LDA predicts the smallest equilibrium volume, and the PBE equilibrium volume is large enough to lead to an imaginary frequency.

All the results discussed were obtained within the plane-wave pseudopotential implementation of DFT. For all functionals, calculations were repeated with the all-electron LCAO scheme for the equilibrium geometries, bulk moduli and energy profiles along the ferroelectric TO \(A_{2u}\) mode. Apart from slight quantitative differences, in all cases the all-electron LCAO calculations agree well with the plane-wave, pseudopotential results, confirming their independence of the particular numerical scheme used to implement DFT. We note that the hybrid HF-DFT LCAO calculations could in principle give better agreement with experimental data for phonon frequencies of rutile as was shown in the B3LYP LCAO calculations of the vibrational spectrum of calcite CaCO\(_3\) [236]. In this case the mean absolute error is less than 12 cm\(^{-1}\) (frequencies range from 100 to 1600 cm\(^{-1}\)).

The phonon frequencies calculations allow one to estimate the relative stability of the different bulk crystal phases. Table 3.7 shows the results of LCAO DFT(PBE0) calculations [237] of the phonon frequencies in BaTiO\(_3\) bulk crystal. It is seen that, in agreement with the experimental data, only the rhombohedral BaTiO\(_3\) phase is stable at low temperatures (the smallest phonon frequencies are imaginary for all the three other phases).

Table 3.7 Calculated formation energies and the smallest phonon frequencies of the rhombohedral, orthorhombic, tetragonal and cubic phases of BaTiO\(_3\) [237]

8 Nanostructure Formation, Surface and Strain Energy

In Part II (Applications) we consider the results of calculations of inorganic nanostructures (nanolayers (slabs), nanotubes (NT), nanowires (NW)) based on boron compounds, tetravalent semiconductors, boron and Group III metal nitrides, oxides (binary and ternary) and sulfides.

For the nanolayers, nanotubes and nanowires under consideration, the calculated formation energy E \(_{form}\), surface energy E \(_{surf}\) and strain energy E \(_{str}\) are defined by the following equations.

The formation energy E \(_{form}\) of the stoichiometric slabs was calculated via the equation:

$$\begin{aligned} E_{form} = E_{slab} /n_{slab} - E_{bulk} /n_{bulk} \end{aligned}$$
(3.134)

where E \(_{slab}\), n\(_{slab}\) and E \(_{bulk}\), n\(_{bulk}\) are the total energies and the numbers of formula units per primitive cell of the slab and the bulk crystal, respectively.

Stoichiometric nanotube (nanowire) formation energy E \(_{form}\) can be calculated using the total energy E \(_{NT}\)(E \(_{NW}\)) and the number of formula units n \(_{NT}\) (n \(_{NW}\)) per primitive cell for the nanotube (nanowire) instead of the slab values.

The stoichiometric slab surface energy was estimated as

$$\begin{aligned} E_{surf} = (E_{slab} - n_{slab} E_{bulk}/n_{bulk})/(2S_{slab}) \end{aligned}$$
(3.135)

where S \(_{slab}\) is the surface area of the final slab structure. The term layer designates the smallest possible group of closely spaced (but separated from the others) atomic planes in the slab that obeys the bulk-crystal stoichiometry.

The nanotube strain (rolling) energy was calculated by the equation:

$$\begin{aligned} E_{str} = E_{NT}/n_{NT}-E_{slab}/n_{slab} \end{aligned}$$
(3.136)

This quantity may be interpreted as the energy of NT formation from the corresponding slab having the optimized structure.
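The three definitions (3.134)-(3.136) reduce to simple arithmetic once the total energies and cell contents are known; a small sketch with purely illustrative numbers (units are arbitrary but must be consistent) is:

```python
def nano_energies(E_slab, n_slab, E_bulk, n_bulk, E_NT, n_NT, S_slab):
    """Formation, surface and strain energies per Eqs. (3.134)-(3.136).
    Energies are totals per primitive cell, n_* are formula units per cell,
    S_slab is the slab surface area; the factor 2 accounts for the two
    surfaces of the slab."""
    e_form_slab = E_slab / n_slab - E_bulk / n_bulk          # Eq. (3.134)
    e_form_nt = E_NT / n_NT - E_bulk / n_bulk                # same, for the nanotube
    e_surf = (E_slab - n_slab * E_bulk / n_bulk) / (2.0 * S_slab)   # Eq. (3.135)
    e_str = E_NT / n_NT - E_slab / n_slab                    # Eq. (3.136)
    return e_form_slab, e_form_nt, e_surf, e_str

# toy numbers purely for illustration
print(nano_energies(E_slab=-120.10, n_slab=4, E_bulk=-30.05, n_bulk=1,
                    E_NT=-240.05, n_NT=8, S_slab=50.0))
```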