
1 Photoemission Spectroscopy and Methods for Electronic Structure Calculations

Photoemission spectroscopy (PES) is one of the most powerful tools for probing the electronic properties of materials [1, 2]. Direct photoemission is a photon-in electron-out technique: the energy of an incoming photon is used to extract an electron (called the photoelectron) from the sample. If N is the initial number of electrons, the electronic system after the electron removal is left in an (N − 1)-electron state, which can be the ground state or an excited state. By measuring the kinetic energy of the photoelectron, one obtains the ionization energies of the electrons in the system (i.e., the energy required to remove an electron, or, equivalently, to create a hole). The affinity energies (i.e., the energy gained by adding an electron to the system) can be analogously obtained by means of inverse photoemission, which is an electron-in photon-out technique. Electron addition and removal energies together define the electronic structure of a material (e.g., its band gap). In particular, when the momentum (and possibly the spin) of the photoelectron is also measured, as in angle-resolved photoemission (ARPES), one gets access to the band structure of the material.

From the theoretical point of view, an accurate description of photoemission spectra is still a great challenge today [3–8]. In the sudden approximation one assumes that the photoelectron is immediately decoupled from the sample. In this approximation the measured photocurrent J k (ω), which is the probability per unit time of emitting an electron with momentum k and kinetic energy E k when the sample is irradiated with photons of frequency ω, is given by [5]

$$ {J}_{\mathbf{k}}\left(\omega \right)=\sum_i{\left|{\Delta}_{\mathbf{k}i}\right|}^2{A}_{ii}\left({E}_{\mathbf{k}}-\omega \right), $$
(1)

where we have introduced the matrix elements of the spectral function A(ω) (for ω < μ, where μ is the Fermi energy), which are weighted by the photoemission matrix elements Δ k i that describe the coupling with the photons. Here we have also assumed that there exists a one-particle basis in which A, rigorously defined as the imaginary part of the one-particle Green’s function G (see Sect. 2), is diagonal. While the spectral function A ii is an (intrinsic) property of the (bulk) material, independent of the photon field, photoemission spectra often display an important dependence on the photon properties (see Fig. 1). However, when comparing theoretical results with experimental spectra, most often the photoemission matrix elements are completely neglected, together with the photon-energy dependence of the spectra. Going through the series of drastic assumptions that we have made so far, we see that bridging the gap between theory and experiment is still an important issue.

Fig. 1

Experimental photoemission spectra for the valence band of the metallic phase of V2O3 for a broad range of photon energies from hν = 19 eV to hν = 6 keV. We observe a strong variation of the spectra with the photon energy. From [9]. Copyright 2009, The American Physical Society

In the following we will confine our discussion to the electronic properties contained in the spectral function A. In an independent-particle picture the spectral function for occupied states is simply A ii (ω) = δ(ω − ε i )θ(μ − ω) and the photocurrent

$$ {J}_{\mathbf{k}}\left(\omega \right)={\displaystyle \sum_i^{\mathrm{occ}}}\delta \left({E}_{\mathbf{k}}-\omega -{\varepsilon}_i\right) $$
(2)

is hence described by the density of occupied states evaluated at the energy E k  − ω, and is given by a series of delta peaks corresponding to the energies ε i of the one-particle Hamiltonian. By measuring the kinetic energy of the photoelectron, in an independent-particle picture one would obtain directly the energy of the one-particle level that the electron was occupying before being extracted from the sample. More realistically, as a consequence of the electronic interactions, the delta peaks in the spectral function acquire a finite width, which corresponds to the finite lifetime of the excitation in the electronic system. Generally a broadened peak is still prominent in the spectrum, having a weight Z (called the renormalization factor) as large as 0.6–0.7 (a delta peak has Z = 1). In these situations one associates the peak with a quasiparticle (QP) excitation. The collection of the energy positions of the quasiparticle peaks defines the band structure of the material, which is most often the target of the calculations.

However, beyond a quasiparticle picture, the spectral function is generally much richer. In fact, features with lower intensities (the satellites or side bands) are often visible in the spectra. They correspond to situations in which the electronic system is left in an (N − 1)-electron excited state which cannot be described by a quasi-hole. Indeed, together with the electron removal, additional excitations (e.g., plasmons) can also be induced in the system by the absorption of the photon. Clearly a calculation of quasiparticle energies only is unable to describe satellites, including Hubbard side bands, which are, by definition, a signature of many-body effects beyond a band-structure description.

Density-functional theory (DFT) [10] is a ground-state theory and hence is not appropriate to describe excitation energies measured in photoemission. In the Kohn–Sham scheme [11] the ground-state density and the ground-state total energy are obtained from a set of fictitious noninteracting electrons. Unfortunately, the eigenvalues ε i of the Kohn–Sham equation

$$ \left(-\frac{\nabla^2}{2}+{V}_{\mathrm{ext}}\left(\mathbf{r}\right)+{V}_{\mathrm{H}}\left(\mathbf{r}\right)+{V}_{\mathrm{xc}}\left(\mathbf{r}\right)\right){\phi}_i\left(\mathbf{r}\right)={\varepsilon}_i{\phi}_i\left(\mathbf{r}\right), $$
(3)

where V ext, V H, and V xc are respectively the external, Hartree, and exchange-correlation potentials, cannot be interpreted as the addition or removal energies measured in photoemission [12] (except for the HOMO level [13]). In practical applications, the Kohn–Sham band gaps are most commonly smaller than the experimental values. The exact quasiparticle band gap is defined as [E(N + 1) − E(N)] − [E(N) − E(N − 1)], where E(M) is the ground-state total energy of M electrons. In finite systems, the “Δ self-consistent field” (Δ-SCF) scheme calculates these total energies for N and N ± 1 electrons explicitly, and approximations to V xc can give good results for the band gaps, since they describe Hartree relaxation effects well [14]. The variation of the electron density δρ upon adding or removing an electron scales as 1/N: thus in an infinite system δρ → 0. In this limit the Δ-SCF scheme yields, with analytic functionals of the density such as the local-density approximation (LDA) [11], the Kohn–Sham band gap [15]. In exact Kohn–Sham theory a nonanalytic, discontinuous change in V xc with respect to a change in the electron number must be present [16–18].
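As a small illustration of the total-energy definition of the gap given above, the following sketch evaluates [E(N + 1) − E(N)] − [E(N) − E(N − 1)] from three ground-state total energies. The function name and the numerical values are hypothetical and chosen only to make the arithmetic explicit; they do not refer to any real material.

```python
def fundamental_gap(e_n_minus_1, e_n, e_n_plus_1):
    """Gap [E(N+1) - E(N)] - [E(N) - E(N-1)] from ground-state total energies."""
    addition_energy = e_n_plus_1 - e_n    # level occupied by the added electron
    removal_energy = e_n - e_n_minus_1    # level vacated by the removed electron
    return addition_energy - removal_energy


# Hypothetical total energies in eV, for illustration only.
gap = fundamental_gap(e_n_minus_1=-90.0, e_n=-100.0, e_n_plus_1=-102.0)
print(f"fundamental gap = {gap:.1f} eV")
```

In this made-up example the ionization energy is 10 eV and the electron affinity is 2 eV, so the gap is their difference.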

Even in the homogeneous electron gas, where the local-density approximation is exact for the total energies by definition, the discontinuity of the momentum distribution at the Fermi surface (which corresponds to the renormalization factor Z) and spectral properties such as lifetimes and satellites remain inaccessible in the Kohn–Sham formalism [19, 20]. The momentum distribution, for instance, can be accurately calculated using Quantum Monte Carlo (QMC) techniques [19], but excitations contained in the spectral function cannot be easily calculated in QMC at the moment. Accurate quantum chemical methods, such as full configuration interaction or coupled-cluster techniques, have been developed for molecular systems, but until recently they have not been explored in solids because of their high computational complexity [21].

In spectroscopy one is often interested in specific answers to limited questions and there is no need to calculate the full many-body wavefunction, which contains more information than necessary. A successful strategy is hence to introduce suitable quantities with a reduced number of degrees of freedom that are able to provide the desired spectra. A key quantity is the spectral function – see (1) – which can be calculated from the Green’s function G:

$$ {A}_{ii}\left(\omega \right)=\frac{1}{\pi}\left|\mathrm{Im}{G}_{ii}\left(\omega \right)\right|. $$
(4)
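To make (4) concrete, the sketch below evaluates the spectral function of a Green’s function with a single pole placed slightly above the real axis, as for an occupied level, with a small finite broadening η. The function names, the level position, and the value of η are assumptions made for illustration. The resulting A(ω) is a Lorentzian of height 1/(πη) whose integrated weight approaches 1 when the integration window is much wider than η.

```python
import math


def green_single_pole(omega, level, eta):
    """Diagonal G for one occupied level: G(w) = 1 / (w - level - i*eta), eta > 0."""
    return 1.0 / complex(omega - level, -eta)


def spectral_function(omega, level, eta):
    """A(w) = |Im G(w)| / pi, as in Eq. (4)."""
    return abs(green_single_pole(omega, level, eta).imag) / math.pi


def integrated_weight(level, eta, half_width=200.0, step=0.01):
    """Midpoint-rule integral of A(w) over [level - half_width, level + half_width]."""
    n = int(round(2.0 * half_width / step))
    start = level - half_width
    total = sum(spectral_function(start + (k + 0.5) * step, level, eta) for k in range(n))
    return total * step


# Hypothetical level at -5 eV with a broadening of 0.1 eV.
peak_height = spectral_function(-5.0, -5.0, 0.1)
weight = integrated_weight(-5.0, 0.1)
print(peak_height, weight)
```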

Many-body perturbation theory (MBPT) has traditionally been the framework in which different approaches have been developed to calculate G. We refer to the various textbooks for a general introduction (see, e.g., [22–24]). Within MBPT, Hedin’s GW approximation (GWA) [25] represents the state of the art for the calculation of band structures in a large variety of materials. Besides the early review work by Hedin and Lundqvist [26], other in-depth reviews were published around the year 2000 [14, 27, 28]. Since then the number of applications has exploded, reaching more and more complicated materials such as interfaces [29], surfaces with adsorbates (see, e.g., [30–32]), or defects in bulk materials (see, e.g., [33–38]), and extending to other areas like quantum transport (see, e.g., [39–43]). Moreover, the field itself has experienced several advances, both from the technical and the fundamental points of view. For instance:

  • Computational schemes that avoid sums over some or all of the empty states have recently been proposed (see, e.g., [44–49]).

  • Pathologies and limits of the GWA (e.g., the self-screening problem and the wrong atomic limit of the Hubbard model) and ways to go beyond GW have been explored (see, e.g., [34, 50–54]).

  • The GWA has been extended in various ways to account for phenomena not contained in it. Examples include the cumulant expansion (see, e.g., [55, 56]), the T matrix approximation (see, e.g., [57–59]), approaches developed starting from the Hubbard model, like dynamical mean field theory (DMFT) [60], or with the calculation of vibrational excitations [61, 62].

  • Another emerging research line in recent years has been the application of the GWA to non-equilibrium situations by means of the solution of the Baym–Kadanoff equations (see, e.g., [42, 63–65]).

In this chapter we will focus on the approaches that have been developed in the last decade to go beyond the perturbative G 0 W 0 scheme (see Sect. 2), which has been the standard implementation for GW calculations in real materials. We will focus on the calculation of spectral properties (for instance we will not deal with calculations of ground-state total energies; see, e.g., [66–70]). While advanced computational schemes allow one to deal with more complicated materials (e.g., with a larger number of atoms), beyond-G 0 W 0 approaches have led to a qualitative breakthrough in ab initio calculations of spectral properties of more complex materials, such as those containing localized d or f electrons, also including the challenging class of strongly correlated materials. In particular, in the following we will discuss why (see Sect. 3) and how (see Sects. 4 and 5) to go beyond G 0 W 0, notably by means of the quasiparticle self-consistent GW (QSGW) method introduced in 2004 by Faleev et al. [71]. Finally, we will conclude the chapter by presenting a brief discussion of the new prospects that QSGW has opened in the field.

2 Theoretical Survival Kit in GW Environment

2.1 Green’s Function G and Self-Energy Σ

The single-particle Green’s function G is the most basic ingredient of MBPT. The time-ordered Green’s function describes the propagation of an extra electron in an electronic system for positive times and the propagation of a missing electron (i.e., a hole) for negative times (see footnote 1):

$$ \begin{array}{c}\hfill iG\left(\mathbf{r}t,\mathbf{r}\mathbf{^{\prime }}t^{\prime}\right)=\theta \left(t-t^{\prime}\right)\left\langle N0\left|\psi \left(\mathbf{r}t\right){\psi}^{\dagger}\left(\mathbf{r}\mathbf{^{\prime }}t^{\prime}\right)\right|N0\right\rangle \hfill \\ {}\hfill -\theta \left(t^{\prime }-t\right)\left\langle N0\left|{\psi}^{\dagger}\left(\mathbf{r}\mathbf{^{\prime }}t^{\prime}\right)\psi \left(\mathbf{r}t\right)\right|N0\right\rangle, \hfill \end{array} $$
(5)

where |N0〉 denotes the exact ground-state wavefunction of an N electron system, ψ and ψ† are the annihilation and creation field operators in the Heisenberg picture, and θ is the step function.

The physical meaning of G becomes clear when inserting the closure relation in between the two field operators and taking a Fourier transform in time. The so-called Lehmann representation reads

$$ G\left(\mathbf{r},{\mathbf{r}}^{\prime},\omega \right)=\sum_i\frac{f_i\left(\mathbf{r}\right){f}_i^{*}\left({\mathbf{r}}^{\prime}\right)}{\omega -{E}_i}. $$
(6)

The poles of G are located at the energies E i :

$$ \begin{array}{c}\hfill {E}_i={E}_{N+1i}-{E}_{N0}- i\eta \kern1em \mathrm{when}\ {E}_i>\mu \hfill \\ {}\hfill ={E}_{N0}-{E}_{N-1i}+ i\eta \kern1em \mathrm{when}\ {E}_i<\mu, \hfill \end{array} $$
(7)

where the energies E N ± 1i are the exact eigenenergies of the N ± 1 electron system and i is the index labeling the exact eigenvectors of both the N − 1 and N + 1 electron systems. The ubiquitous vanishing positive η has naturally arisen from the Fourier transform of the step functions. In a solid, the discrete set of poles in (6) merges into a branch-cut. The so-called Lehmann amplitudes f i are then defined as

$$ \begin{array}{c}\hfill {f}_i\left(\mathbf{r}\right)=\left\langle N0\left|\psi \left(\mathbf{r}0\right)\right|N+1i\right\rangle \kern1em \mathrm{when}\ {E}_i>\mu \hfill \\ {}\hfill =\left\langle N-1i\left|\psi \left(\mathbf{r}0\right)\right|N0\right\rangle \kern1em \mathrm{when}\ {E}_i<\mu .\hfill \end{array} $$
(8)

Note that the Lehmann amplitudes f i are not mutually orthogonal. From this representation we see that the poles E i carry the exact ionization energies of electrons in the system or the exact affinity energies. The analytical structure of G is also made clear: the poles lie slightly above the real axis for E i  < μ and slightly below for E i  > μ. The poles can be directly compared to the peaks obtained from a photoemission or inverse photoemission experiment (see Sect. 1).

Because G is the fundamental quantity, a great deal of effort has been put into its evaluation in a many-body context. This poses a very large challenge since the equation of motion for G involves the two-particle Green’s function. Its equation of motion in turn involves the three-particle Green’s function, and so on. The standard remedy in MBPT is to break this hierarchy by introducing an effective operator, the self-energy Σ. As Schwinger showed, by introducing an auxiliary external field U(r t) that is set to zero at the end, it is possible to express formally the two-particle Green’s function as a functional of the one-particle Green’s function [72]. This results in an equation of motion for G alone:

$$ \int d\mathbf{r}\mathbf{^{\prime }}\left\{\left[\omega -{h}_0\left(\mathbf{r}\right)-{V}_{\mathrm{H}}\left(\mathbf{r}\right)\right]\delta \left(\mathbf{r}-\mathbf{r}\mathbf{^{\prime }}\right)-\Sigma \left(\mathbf{r},\mathbf{r}\mathbf{^{\prime }},\omega \right)\right\}G\left(\mathbf{r}\mathbf{^{\prime }},\mathbf{r}\mathbf{^{{\prime\prime} }},\omega \right)=\delta \left(\mathbf{r}-\mathbf{r}\mathbf{^{{\prime\prime} }}\right). $$
(9)

Here h 0 is the non-interacting Hamiltonian and V H the Hartree potential – see (3). Note that the self-energy Σ hides all the complexity of the original problem and thus is a non-local, dynamical and non-Hermitian operator. When Σ = 0 the Green’s function G 0 is simply the resolvent of the Hartree Hamiltonian: \( {G}_0^{-1}=\omega -{h}_0-{V}_{\mathrm{H}} \). We refer the reader to the review articles of Strinati [72] or of Hedin and Lundqvist [26] for further details.

Dyson’s equation results by multiplying (9) by G 0:

$$ G\left(\mathbf{r},{\mathbf{r}}^{\prime},\omega \right)={G}_0\left(\mathbf{r},{\mathbf{r}}^{\prime},\omega \right)+\int d{\mathbf{r}}_1d{\mathbf{r}}_2\,{G}_0\left(\mathbf{r},{\mathbf{r}}_1,\omega \right)\Sigma \left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega \right)G\left({\mathbf{r}}_2,{\mathbf{r}}^{\prime},\omega \right). $$
(10)

This equation establishes the link between the Hartree Green’s function G 0 (easily calculated) and the fully interacting Green’s function G (very hard to calculate) through the self-energy Σ.
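In a finite one-particle basis, Dyson’s equation (10) becomes a matrix equation that can be solved by inversion, G = (G 0 ⁻¹ − Σ)⁻¹. The sketch below checks this numerically for a 2 × 2 toy problem: it builds G from the inverse formula and verifies that it satisfies G = G 0 + G 0 Σ G. The two levels, the complex frequency, and the self-energy matrix are arbitrary illustrative numbers, and the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_add(a, b, sign=1.0):
    """Element-wise a + sign * b."""
    n = len(a)
    return [[a[i][j] + sign * b[i][j] for j in range(n)] for i in range(n)]


def mat_inv_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def solve_dyson(g0, sigma):
    """Return G = (G0^{-1} - Sigma)^{-1}."""
    return mat_inv_2x2(mat_add(mat_inv_2x2(g0), sigma, sign=-1.0))


def dyson_residual(g0, sigma, g):
    """Largest |element| of G - (G0 + G0 Sigma G); zero if Dyson's equation holds."""
    rhs = mat_add(g0, mat_mul(mat_mul(g0, sigma), g))
    diff = mat_add(g, rhs, sign=-1.0)
    return max(abs(x) for row in diff for x in row)


# Toy inputs (illustrative only): two levels, a complex frequency, a model Sigma.
omega = 0.3 + 0.05j
levels = (-1.0, 2.0)
g0 = [[1.0 / (omega - levels[0]), 0.0], [0.0, 1.0 / (omega - levels[1])]]
sigma = [[0.2, 0.1], [0.1, -0.3 - 0.05j]]

g = solve_dyson(g0, sigma)
print(dyson_residual(g0, sigma, g))
```

The inversion route is equivalent to summing the Dyson series to all orders, which is why it is preferred over iterating G = G 0 + G 0 Σ G term by term.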

The purpose of MBPT is then to provide approximations with increasing accuracy for the self-energy. The Coulomb interaction between electrons

$$ v\left(\mathbf{r}-\mathbf{r}\mathbf{^{\prime }}\right)=\frac{1}{\left|\mathbf{r}-{\mathbf{r}}^{\mathbf{\prime}}\right|} $$
(11)

is considered as the perturbation with respect to the independent-particle case. The first-order contribution to the self-energy is nothing else but the Fock exchange operator (the Hartree potential is already taken into account by G 0). This level of approximation is widely used for atoms and molecules, and in quantum chemistry perturbative methods in v with respect to Hartree–Fock are known as Møller–Plesset perturbation theory [73]. However, for the homogeneous electron gas, Hartree–Fock yields an anomalous zero density of states at the Fermi level. There is therefore a stringent need for higher order terms for periodic systems. Unfortunately, the analytical evaluation of one of the two second-order contributions is not finite in the case of the homogeneous electron gas [22, 23]. Perturbation theory is thus not justified. How should one proceed to circumvent this problem, especially for periodic systems?

2.2 The Screened Coulomb Interaction W

This divergence can be addressed in an effective manner by introducing a screened counterpart to the Coulomb interaction v. Other electrons act as a dielectric medium that reduces the interaction between any pair. It is common sense that the interaction between charges is not the same in vacuum as in a dielectric medium. At the macroscopic scale this is measured by the dielectric constant of the medium. At the microscopic scale the screening of the Coulomb interaction is given by

$$ W\left(\mathbf{r},{\mathbf{r}}^{\prime},\omega \right)=\int d{\mathbf{r}}_1\,{\varepsilon}^{-1}\left(\mathbf{r},{\mathbf{r}}_1,\omega \right)v\left({\mathbf{r}}_1-{\mathbf{r}}^{\prime}\right), $$
(12)

where the inverse of the microscopic dielectric matrix, \( {\varepsilon}^{-1} \), has been introduced. ε is linked to the macroscopic dielectric function ε M [74, 75], which is a measurable quantity. For instance, \( -\mathrm{Im}\,{\varepsilon}_{\mathrm{M}}^{-1} \) is called the loss function and can be measured by electron energy loss spectroscopy (EELS) or inelastic X-ray scattering (IXS).

So far, the expression of the dielectric matrix has not been specified. Nevertheless, one can still analyze the physical meaning of the dynamically screened Coulomb interaction W(r,r′,ω). The effective interaction between electrons in a medium is decreased from v, the bare Coulomb interaction, to W the screened interaction. A perturbation theory based on W rather than on v then makes much more sense. However there is a price to pay: the screened interaction W is dynamical, meaning that the screening is more effective for some frequencies than for others. For metals, the static dielectric constant is infinite and consequently the long-range component of W vanishes. This fixes the problem of the vanishing density of states at the Fermi level predicted by Hartree–Fock theory for the homogeneous electron gas. Conversely, in the high frequency limit, the screening by the electrons becomes completely ineffective and the screened Coulomb interaction is simply the bare Coulomb interaction.

The additional complexity contained in W compared to v raises the hope that the perturbation theory will converge rapidly. Perhaps, as W already contains an infinite sum of interactions v, just the first order in W will suffice, as proposed by Hedin [25].

2.3 Hedin’s Equations and the GW Approximation

Employing W instead of v in MBPT allowed Hedin to reformulate the many-electron problem as a closed set of exact equations for the calculation of G [25]. They read:

$$ G\left(1,2\right)={G}_0\left(1,2\right)+{\displaystyle \int d(34){G}_0\left(1,3\right)\Sigma \left(3,4\right)G\left(4,2\right)} $$
(13a)
$$ \Sigma \left(1,2\right)=i{\displaystyle \int d(34)G\left(1,3\right)W\left({1}^{+},4\right)\Gamma \left(3,2,4\right)} $$
(13b)
$$ W\left(1,2\right)={\displaystyle \int d(3){\varepsilon}^{-1}\left(1,3\right)v\left(3,2\right)} $$
(13c)
$$ \varepsilon \left(1,2\right)=\delta \left(1,2\right)-{\displaystyle \int d(3)v\left(1,3\right)\tilde{\chi}\left(3,2\right)} $$
(13d)
$$ \tilde{\chi}\left(1,2\right)=-i{\displaystyle \int d(34)G\left(1,3\right)G\left(4,1\right)\Gamma \left(3,4,2\right)} $$
(13e)
$$ \Gamma \left(1,2,3\right)=\delta \left(1,2\right)\delta \left(1,3\right)+{\displaystyle \int d(4567)\frac{\delta \Sigma \left(1,2\right)}{\delta G\left(4,5\right)}G\left(4,6\right)G\left(7,5\right)\Gamma \left(6,7,3\right).} $$
(13f)

Contracted indices (1) = (r 1,t 1,σ 1) have been used for simplification. The index 1+ denotes the time t 1 + η for a vanishing positive η. Most of the quantities have been introduced earlier. \( \tilde{\chi} \) is the irreducible polarizability and Γ is the three-point vertex function. These nonlinear equations are coupled. If solved self-consistently, these equations form an exact scheme to obtain the solution of the many-body problem. This process is pictured in the left panel of Fig. 2. Of course, in practice, even in the simplest cases it is not feasible, mainly due to the presence of the three-point vertex function Γ. It is then natural that an approximate scheme begins by simplifying this particular term.

Fig. 2

Schematic view of the three frameworks described in this review. The exact expression (GWΓ, left panel) can be obtained after an initialization with a guessed Green’s function G 0 by iteration of all of Hedin’s equations. The self-consistent GW approximation (GW, central panel) arises from the iteration of the equations keeping the vertex function Γ = 1. The perturbative GW approximation (G 0 W 0, right panel) is the one-shot evaluation of the GW self-energy based on the guessed Green’s function G 0

The second term in the right-hand side of (13f) involves the self-energy which is of first-order in W according to (13b). Retaining only the zero-order terms in (13f) (i.e., the δ functions), Hedin’s equations are greatly simplified:

$$ G\left(1,2\right)={G}_0\left(1,2\right)+{\displaystyle \int d(34){G}_0\left(1,3\right)\Sigma \left(3,4\right)G\left(4,2\right)} $$
(14a)
$$ \Sigma \left(1,2\right)= iG\left(1,2\right)W\left({1}^{+},2\right) $$
(14b)
$$ W\left(1,2\right)={\displaystyle \int d(3){\varepsilon}^{-1}\left(1,3\right)v\left(3,2\right)} $$
(14c)
$$ \varepsilon \left(1,2\right)=\delta \left(1,2\right)-{\displaystyle \int d(3)v\left(1,3\right)\tilde{\chi}\left(3,2\right)} $$
(14d)
$$ \tilde{\chi}\left(1,2\right)=- iG\left(1,2\right)G\left(2,1\right). $$
(14e)

The irreducible polarizability \( \tilde{\chi} \) is then a simple product of two Green’s functions. This is the well known Random-Phase Approximation (RPA) to the dielectric matrix. The self-energy is also much simplified: this is just the simple product of G and W, giving the name to the GW approximation. It is of first-order in W. The missing terms (second and higher orders in W) are commonly named the “vertex corrections”.

The set of equations (14a–14e) still requires a self-consistent treatment since W and Σ depend on G, which is the quantity one needs to find. This is pictured in the central panel of Fig. 2. The practical implementation of these equations is still far from obvious. This is the reason why for many years the GW self-energy has been evaluated non-self-consistently.

2.4 Practical Calculation of the GW Self-Energy: The G 0 W 0 Approach

It is usually impossible to evaluate the Green’s function self-consistently from (14a–14e). However, let us imagine that mean-field theories such as Hartree–Fock or Kohn–Sham provide a good description of the electronic system under study. In a mean-field theory, the one-electron wavefunctions ϕ i (r) and eigenvalues ɛ i allow one to evaluate the independent-particle Green’s function G 0 (see footnote 2):

$$ {G}_0\left(\mathbf{r},{\mathbf{r}}^{\prime},\omega \right)=\sum_i\frac{\phi_i\left(\mathbf{r}\right){\phi}_i^{\ast}\left({\mathbf{r}}^{\prime}\right)}{\omega -{\varepsilon}_i+ i\eta\, \mathrm{sign}\left({\varepsilon}_i-\mu \right)}. $$
(15)

The poles of G 0 are located above the real axis for occupied states and below for empty states. \( \tilde{\chi} \) and then W can be readily evaluated from this expression of G 0. Let us label these evaluations of the irreducible polarizability and of the screened Coulomb interaction χ 0 and W 0, respectively. Finally, the GW self-energy is obtained as the convolution G 0 W 0.

The so-called G 0 W 0 approach consists in stopping the procedure immediately after the first evaluation of the self-energy, as shown in the right hand panel of Fig. 2. This “one-shot” procedure is justified when the starting mean-field theory used for G 0 is accurate enough for the targeted property. The vast majority of the GW applications for almost 50 years have been obtained with the G 0 W 0 procedure. Of course, the choice of the starting point is material dependent. The seminal paper of Hedin [25] simply employed the free electron model to calculate the GW self-energy for the homogeneous electron gas. The first application of GW to real solids used either the Hartree–Fock approximation [76] or the local density approximation [77]. For atoms, Shirley and Martin chose Hartree–Fock [78]. The rationale underlying the choice is the selection of the most accurate mean-field theory for the specific system under scrutiny. This strategy is sometimes referred to as the “best G, best W” approach (see Sect. 3.1).

In the quasiparticle approximation, the Dyson equation (10) becomes

$$ \left(-\frac{\nabla^2}{2}+{V}_{\mathrm{ext}}\left(\mathbf{r}\right)+{V}_{\mathrm{H}}\left(\mathbf{r}\right)\right){\psi}_i\left(\mathbf{r}\right)+\int d{\mathbf{r}}^{\prime}\,\Sigma \left(\mathbf{r},{\mathbf{r}}^{\prime},{E}_i\right){\psi}_i\left({\mathbf{r}}^{\prime}\right)={E}_i{\psi}_i\left(\mathbf{r}\right). $$
(16)

In the G 0 W 0 framework one assumes that the quasiparticle wavefunctions ψ i can be approximated by the Kohn–Sham orbitals ϕ i . By comparing (3) and (16) one finds that the quasiparticle energies E i can be calculated as a first-order correction with respect to the underlying mean-field starting point from

$$ {E}_i={\varepsilon}_i+\left\langle {\phi}_i\left|\Sigma \left({E}_i\right)-{V}_{\mathrm{xc}}\right|{\phi}_i\right\rangle, $$
(17)

where Σ is the G 0 W 0 self-energy. From a linearization of the frequency dependence of Σ, one finally obtains

$$ {E}_i={\varepsilon}_i+{Z}_i\left\langle {\phi}_i\left|\Sigma \left({\varepsilon}_i\right)-{V}_{\mathrm{xc}}\right|{\phi}_i\right\rangle, $$
(18)

where the renormalization factors Z i are

$$ {Z}_i={\left[1-\left\langle {\phi}_i\left|{\left.\frac{\partial \Sigma \left(\omega \right)}{\partial \omega}\right|}_{\omega ={\varepsilon}_i}\right|{\phi}_i\right\rangle \right]}^{-1}. $$
(19)
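The relation between (17), (18), and (19) can be illustrated with a minimal numerical sketch. It assumes a hypothetical, real-valued, single-pole model for the diagonal matrix element 〈ϕ i |Σ(ω)|ϕ i 〉; the pole position, coupling strength, ε i , and V xc values are invented for illustration and do not come from any calculation. The linearized energy from (18) is compared with the solution of (17) obtained by simple fixed-point iteration.

```python
def model_sigma(omega, static=-5.0, coupling=60.0, pole=-20.0):
    """Hypothetical matrix element <phi_i|Sigma(w)|phi_i> with a single pole (eV)."""
    return static + coupling / (omega - pole)


def renormalization_factor(sigma, eps, h=1e-4):
    """Z_i of Eq. (19), with dSigma/dw taken by a central finite difference at eps."""
    dsigma = (sigma(eps + h) - sigma(eps - h)) / (2.0 * h)
    return 1.0 / (1.0 - dsigma)


def qp_energy_linearized(sigma, eps, vxc):
    """Eq. (18): E_i = eps_i + Z_i * <Sigma(eps_i) - Vxc>."""
    return eps + renormalization_factor(sigma, eps) * (sigma(eps) - vxc)


def qp_energy_iterated(sigma, eps, vxc, n_iter=100):
    """Eq. (17) solved by fixed-point iteration E <- eps + <Sigma(E) - Vxc>."""
    energy = eps
    for _ in range(n_iter):
        energy = eps + sigma(energy) - vxc
    return energy


eps_i, vxc_i = -5.0, -1.5  # hypothetical mean-field eigenvalue and <Vxc>, in eV
z_i = renormalization_factor(model_sigma, eps_i)
e_lin = qp_energy_linearized(model_sigma, eps_i, vxc_i)
e_sc = qp_energy_iterated(model_sigma, eps_i, vxc_i)
print(z_i, e_lin, e_sc)
```

For this model the two energies differ by a few meV, which is why the linearized form (18) is commonly used. The fixed-point iteration converges here only because the slope of the model self-energy is smaller than one in magnitude; in general a more robust root finder may be needed.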

In most G 0 W 0 calculations the band structures are obtained using (18). One can also calculate the spectral function defined in (4) from:

$$ {A}_{ii}\left(\omega \right)=\frac{1}{\pi}\frac{\left|\left\langle {\phi}_i\left|\mathrm{Im}\Sigma \left(\omega \right)\right|{\phi}_i\right\rangle \right|}{{\left[\omega -{\varepsilon}_i-\left\langle {\phi}_i\left|\mathrm{Re}\Sigma \left(\omega \right)-{V}_{\mathrm{xc}}\right|{\phi}_i\right\rangle \right]}^2+{\left[\left\langle {\phi}_i\left|\mathrm{Im}\Sigma \left(\omega \right)\right|{\phi}_i\right\rangle \right]}^2}. $$
(20)

The spectral function has peaks corresponding to the quasiparticle energies, i.e., when ω − ε i  − 〈ϕ i |ReΣ(ω) − V xc|ϕ i 〉 = 0; cf. (17). The width of the quasiparticle peak is given by ImΣ(ω), which is hence linked to the lifetime of the excitation (defined as the inverse of its width). The spectral function can have other peaks, the satellites, that originate from structures in ImΣ(ω). ω − ε i  − 〈ϕ i |ReΣ(ω) − V xc |ϕ i 〉 can also have additional zeroes, giving rise to satellites. Within the GWA this latter kind of satellite has been called a plasmaron [79, 80], but lately these have been shown to be an artifact of the GWA [56, 81, 82]. In Hartree–Fock the self-energy is Hermitian: ImΣ(ω) = 0. Therefore quasiparticle peaks become delta functions (i.e., the lifetime of the quasiparticle becomes infinite). Moreover, since the self-energy is static, no other structures (e.g., satellites) can appear in the spectral function.
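The structure of (20) can be explored with a toy calculation. The sketch below assumes a hypothetical single-pole, complex model for 〈ϕ i |Σ(ω)|ϕ i 〉 (all parameters are invented for illustration) and locates the maxima of A ii (ω) on a frequency grid. It shows a sharp quasiparticle peak near the solution of (17) and a weaker, broader structure at the second zero of ω − ε i  − 〈ϕ i |ReΣ(ω) − V xc |ϕ i 〉, of the kind discussed above.

```python
import math


def model_self_energy(omega, static=-5.0, coupling=60.0, pole=-20.0, gamma=0.5):
    """Hypothetical complex matrix element <phi_i|Sigma(w)|phi_i> with one broadened pole."""
    return static + coupling / complex(omega - pole, -gamma)


def spectral_function(omega, eps, vxc, sigma):
    """A_ii(w) of Eq. (20) for a diagonal self-energy matrix element."""
    s = sigma(omega)
    shift = omega - eps - (s.real - vxc)
    return abs(s.imag) / (math.pi * (shift * shift + s.imag * s.imag))


def peak_position(eps, vxc, sigma, lo, hi, step=0.005):
    """Grid point in [lo, hi] where A_ii(w) is largest."""
    n = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n + 1)]
    return max(grid, key=lambda w: spectral_function(w, eps, vxc, sigma))


eps_i, vxc_i = -5.0, -1.5  # hypothetical mean-field eigenvalue and <Vxc>, in eV
qp_peak = peak_position(eps_i, vxc_i, model_self_energy, -10.0, 0.0)
satellite_peak = peak_position(eps_i, vxc_i, model_self_energy, -30.0, -21.0)
print(qp_peak, satellite_peak)
```

A plain grid search is used rather than a root finder because the peak maximum is shifted slightly from the zero of the real part by the frequency dependence of ImΣ.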

The GW self-energy can be split into a Fock exchange term Σ x and a correlation term Σ c (ω): Σ(ω) = Σ x  + Σ c (ω). While Σ x  = iGv is static, the evaluation of Σ c (ω) requires the calculation of the convolution integral of G and W p  = W − v:

$$ {\Sigma}_c\left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega \right)=\frac{i}{2\pi }{\displaystyle \int d{\omega}^{\prime }{e}^{i\eta {\omega}^{\prime }}G\left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega +{\omega}^{\prime}\right){W}_p\left({\mathbf{r}}_1,{\mathbf{r}}_2,{\omega}^{\prime}\right)}. $$
(21)

Since Σ c is obtained through the frequency integration (21), the fine details of the energy dependence of W p are often not important. In these cases one can approximate the imaginary part of the inverse dielectric function ε − 1 as a single-pole function in ω (plasmon-pole model) [83, 84]. Plasmon-pole models can be used for calculating quasiparticle energies, but should be avoided for spectral functions because, for example, they do not describe ImΣ correctly. In these cases the full-frequency dependence of Σ is required and the frequency integration has to be performed with care [85].

The spectral representation of W p is given by [5]

$$ {W}_p\left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega \right)=2{\displaystyle \sum_s}\frac{\omega_s{W}_s\left({\mathbf{r}}_1,{\mathbf{r}}_2\right)}{\omega^2-{\left({\omega}_s- i\eta \right)}^2}. $$
(22)

The poles of W p are the energies ω s that correspond to neutral excitations (electron-hole transitions and plasmons). By combining (22) with (15) and performing the frequency integration (21), one finds that the G 0 W 0 self-energy is given by the sum of two terms:

$$ {\Sigma}_{\mathrm{SEX}}\left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega \right)=-{\displaystyle \sum_i\theta \left(\mu -{\varepsilon}_i\right){\phi}_i\left({\mathbf{r}}_1\right){\phi}_i^{\ast}\left({\mathbf{r}}_2\right)W\left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega -{\varepsilon}_i\right)}, $$
(23a)
$$ {\Sigma}_{\mathrm{COH}}\left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega \right)={\displaystyle \sum_i{\phi}_i\left({\mathbf{r}}_1\right){\phi}_i^{\ast}\left({\mathbf{r}}_2\right)}{\displaystyle \sum_s\frac{W_s\left({\mathbf{r}}_1,{\mathbf{r}}_2\right)}{\omega -\left({\omega}_s- i\eta \right)-{\varepsilon}_i}}. $$
(23b)

The first term arises from the poles in G and the second from the poles in W. Owing to the similarity of the first term with the Fock exchange, it is usually called the “screened exchange” term. The second term is referred to as the “Coulomb-hole” term [25]. If a further static approximation is carried out, this decomposition gives rise to the so-called COHSEX (Coulomb hole plus screened exchange), first introduced by Hedin [25, 79]. This static and Hermitian self-energy is obtained by setting ω − ε i  = 0 in ΣSEX(ω) and ΣCOH(ω). This corresponds to assuming that the main contribution to the self-energy Σ(ω) stems from the states ε i close to ω. So ω − ε i is small compared to the main excitations in W which are at the plasmon energies ω s [25, 79].
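For reference, the static limit can be written out explicitly; the following is a short worked step using only (22), (23a), and (23b), assuming that the orbitals ϕ i form a complete set. Setting ω − ε i  = 0 in (23a) gives

$$ {\Sigma}_{\mathrm{SEX}}\left({\mathbf{r}}_1,{\mathbf{r}}_2\right)=-\sum_i\theta \left(\mu -{\varepsilon}_i\right){\phi}_i\left({\mathbf{r}}_1\right){\phi}_i^{\ast}\left({\mathbf{r}}_2\right)W\left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega =0\right). $$

In (23b) the same substitution removes the dependence on i from the denominator, so that the completeness relation \( \sum_i{\phi}_i\left({\mathbf{r}}_1\right){\phi}_i^{\ast}\left({\mathbf{r}}_2\right)=\delta \left({\mathbf{r}}_1-{\mathbf{r}}_2\right) \) can be used. Since (22) evaluated at ω = 0 gives \( {W}_p\left({\mathbf{r}}_1,{\mathbf{r}}_2,0\right)=-2\sum_s{W}_s\left({\mathbf{r}}_1,{\mathbf{r}}_2\right)/{\omega}_s \) for η → 0, one obtains

$$ {\Sigma}_{\mathrm{COH}}\left({\mathbf{r}}_1,{\mathbf{r}}_2\right)=-\delta \left({\mathbf{r}}_1-{\mathbf{r}}_2\right)\sum_s\frac{W_s\left({\mathbf{r}}_1,{\mathbf{r}}_2\right)}{\omega_s}=\frac{1}{2}\delta \left({\mathbf{r}}_1-{\mathbf{r}}_2\right){W}_p\left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega =0\right). $$

The screened-exchange term thus has the form of the Fock operator with v replaced by the statically screened W, while the Coulomb-hole term is a local potential.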

In the majority of practical cases the G 0 W 0 scheme is largely sufficient to evaluate the GW self-energy successfully. In some cases, however, it is not, and this review will focus precisely on those pathological cases.

3 Beyond G 0 W 0

In this section we will present arguments to show why one should go beyond the G 0 W 0 scheme that has been introduced in the previous section. First of all, we will discuss the criticisms that can be raised about G 0 W 0 from the point of view of basic principles. Then we will consider a case study in which the G 0 W 0 scheme proves in practice to be inadequate in several respects. This practical example is VO2, a prototypical transition metal oxide, which has been the subject of intense debate for the last five decades because of its metal-insulator transition occurring just above room temperature [86].

3.1 Which Starting Point?

The GW approximation can be seen to correspond to the first step of an iterative solution of Hedin’s equations that takes as the starting point Σ = 0 (see Fig. 2). Therefore, in principle the GW self-energy should be built using a Green’s function G calculated in the Hartree approximation (i.e., with Σ = 0). However, the Hartree approximation is generally inadequate, giving bad electron densities and one-particle wavefunctions. Moreover, performing an additional step in the iterative solution of Hedin’s equations leads to the inclusion of vertex corrections beyond GW that are immediately very expensive from the computational point of view.

One would rather like to remain within the GWA and find an alternative strategy to evaluate the self-energy, discarding the idea of solving Hedin’s equations iteratively. This has been the approach followed in practice in the application of the GWA to the calculations of band structures of real materials [18, 77, 83]. The idea is to build the GW self-energy with the best ingredients that can be calculated from first principles. Historically, in solids this led to the use of the LDA or the GGA (alternatively, Hartree–Fock for atoms [78]) in place of the Hartree approximation. This was justified by the observation that in sp semiconductors and metals LDA wavefunctions are a good approximation to QP wavefunctions [83] (see Sect. 2.4). A posteriori this choice was validated by the good agreement of G 0 W 0 results with experimental band gaps [27, 28].

Moreover, still in the spirit of the iterative solution of Hedin’s equations, one could improve the starting point by setting Σ = V xc, instead of Σ = 0 [83, 87]. This would give a self-energy that still has a GW-like form. However, the Coulomb interaction is now screened by an effective dielectric function that has an electron-test-charge form, \( \tilde{\varepsilon}=1-\left(v+{f}_{\mathrm{xc}}\right){\chi}_0 \), instead of the RPA ε used in the standard GW approximation, ε = 1 − vχ 0. The main difference is that in the electron-test-charge case the induced charge generates an exchange-correlation potential in addition to the induced Coulomb potential that is already taken into account at the RPA level. This approximation has been called “GWΓ” [87] because it contains, to some extent, vertex corrections beyond GW. It is consistently derived and calculated using Kohn–Sham ingredients. It has been shown to give results that are similar to (or slightly worse than) standard LDA-based GW [29, 52, 87, 88], adding another argument in favor of the pragmatic “best G best W” approach. Vertex corrections beyond the local approximation used to build the vertex in the “GWΓ” scheme [87] are still an open question under investigation [89, 90].

Anyway, this “best G best W” strategy clearly introduces a degree of arbitrariness in the results. As in any first-order perturbation scheme, the G 0 W 0 results directly depend on the quality of the starting point (i.e., the zero order of the perturbative scheme). In general, Hartree–Fock overestimates and the LDA (or GGA) underestimates band gaps (and analogously HOMO-LUMO gaps in finite systems). G 0 W 0 corrections improve with respect to the starting point, getting closer to experiment from above if starting from Hartree–Fock, or from below if starting from LDA or GGA. Therefore G 0 W 0 results obtained from these two starting points generally bracket the reference values, and starting from any hybrid functional mixing LDA with Fock non-local exchange gives rise to intermediate results between the two. This observation is, for instance, illustrated for a benchmark of 34 small closed-shell molecules in Fig. 3. Deviations from the reference quantum chemistry CCSD(T) calculations are reported: each bar corresponds to a different approximation for the exchange-correlation functional used in the underlying mean-field calculation, comprising local, semilocal, and several hybrid functionals [92]. Similar results for atoms and molecules have recently been obtained, for instance also in [91, 93–96], and these observations can be generalized to extended systems as well.

Fig. 3

Deviation of the HOMO energies with respect to the CCSD(T) total energy difference. The upper panel shows the mean error (ME), and the lower panel the mean-absolute error (MAE) for the 34-molecule set introduced in [91]. The shaded bar corresponds to the HOMO energy of the starting point, and the white bar shows the corresponding G 0 W 0 HOMO energy. Reprinted with permission from [92]. Copyright 2013, American Chemical Society

Moreover, in contrast to the first applications of the LDA-based G 0 W 0 scheme, in recent years the investigation of more complex materials has shown severe limitations of the G 0 W 0 approach. Alternative starting points beyond LDA have been put forward, either remaining in the Kohn–Sham scheme – e.g., using the exact exchange approximation (EXX) [97, 98] or LDA+U [99, 100] – or moving to a generalized Kohn–Sham scheme and adding a part of non-local Fock exchange – e.g., using the Heyd–Scuseria–Ernzerhof (HSE) hybrid functional [101, 102]. In most cases bad G 0 W 0 results have been explained in terms of the inadequacy of the underlying LDA starting point.

3.2 Energy Scales

In the G 0 W 0 scheme, the self-energy displays a dependency on the absolute energy scale used for the construction of G 0. To illustrate this statement we follow Hedin [5, 25] and we introduce an arbitrary shift Δ of the energy scale of G 0. The Green’s function G 0 Δ, depending parametrically on Δ, can be written as (see (15))

$$ {G}_0^{\Delta}\left({\mathbf{r}}_1,{\mathbf{r}}_2,\omega \right)={\displaystyle \sum_i}\frac{\phi_i\left({\mathbf{r}}_1\right){\phi}_i^{\ast}\left({\mathbf{r}}_2\right)}{\omega -{\varepsilon}_i-\Delta + i\eta {\rm{sgn}}\left({\varepsilon}_i-\mu \right)} $$
(24)

While χ 0 and W 0 are independent of Δ, the G 0 W 0 self-energy becomes

$$ {\Sigma}^{\Delta}\left(\omega \right)={\Sigma}^{\Delta =0}\left(\omega -\Delta \right). $$
(25)

Using the linear expansion of ΣΔ(ω) – see (18) – the quasiparticle energies are in turn calculated as

$$ {E}_i={\varepsilon}_i+\Delta +{Z}_i^{\Delta =0}\left[\left\langle {\phi}_i\left|{\Sigma}^{\Delta =0}\left({\varepsilon}_i\right)-{V}_{\mathrm{xc}}\right|{\phi}_i\right\rangle -\Delta \right]. $$
(26)

The result should not depend on the absolute energy scale, which is arbitrary. On the contrary, here we see that shifting by Δ the energy scale of G 0 leads to a variation of the quasiparticle energies equal to (1 − Z i Δ = 0)Δ. This is a consequence of the fact that the self-energy in the GWA is dynamical, hence Z i  ≠ 1.

However, if the renormalization factors Z i do not change much from state to state, as is generally the case in simple systems, this dependency on the absolute energy scale of G 0 has a tiny effect on the G 0 W 0 band-structure results. In this situation, in fact, changing arbitrarily the energy scale of G 0 would only give a rigid shift of the whole band structure.
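
The size of this effect can be read off directly from (26). The following minimal Python sketch evaluates (26) with and without a shift Δ for two states; all numerical values (energies, self-energy matrix elements, renormalization factors, and Δ itself) are hypothetical and chosen only for illustration.

```python
def qp_energy(eps, sigma_minus_vxc, z, delta=0.0):
    """Linearized quasiparticle energy of Eq. (26); all energies in eV."""
    return eps + delta + z * (sigma_minus_vxc - delta)


delta = 0.5  # arbitrary shift of the energy scale of G0 (hypothetical value)

# (eps_i, <Sigma(eps_i) - V_xc>, Z_i) for two hypothetical states
states = {"state A": (-1.0, 0.4, 0.8), "state B": (1.5, 0.9, 0.6)}

shifts = {}
for name, (eps, corr, z) in states.items():
    # Change of the quasiparticle energy caused by the shift delta alone
    shifts[name] = qp_energy(eps, corr, z, delta) - qp_energy(eps, corr, z)
    print(f"{name}: E_i moves by {shifts[name]:+.3f} eV; "
          f"(1 - Z_i) * Delta = {(1 - z) * delta:+.3f} eV")
```

Each state moves by (1 − Z i )Δ. With equal renormalization factors the two shifts would coincide and the band structure would move rigidly; with the different Z i assumed here, the energy difference between the two states changes as well.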

In order to fix this problem, Hedin suggested setting the energy scale by requiring a self-consistency at the Fermi level [5, 25]. This means setting Δ equal to the matrix element of the self-energy at the Fermi level calculated with Δ = 0. A more general solution would be to calculate self-consistently the energies entering Σ (and hence also the Fermi level).

A more severe consequence of this dependency on the energy scale can be found in the calculation of spectral functions. To see this effect in practice, we consider here the example of the metallic phase of VO2 [Gatti et al. (2013) Unpublished]. In Fig. 4 we compare the spectral functions calculated within G 0 W 0 (dashed lines) and within a scheme in which the quasiparticle energies used to calculate the GW self-energy are updated to self-consistency (solid lines). In LDA the spectral function for each state is just a delta peak and in Fig. 4 the LDA top-valence state at the Γ point is represented by the vertical arrow. It corresponds to a V 3d state. In G 0 W 0 the corresponding position of the quasiparticle peak is shifted upwards and its width becomes finite, meaning a finite lifetime of the quasiparticle excitation. The shift of the peaks is similar for all the other V 3d states (not shown in Fig. 4), which also have very similar renormalization factors. This is the situation described above: the absolute energy scale entering G 0 does not have an important influence on the band-structure results.

Fig. 4

Spectral functions A(ω) and self-energy Σ(ω) for the top-valence state at Γ point for the metallic phase of VO2, calculated in G 0 W 0 (dashed lines) and eigenvalue self-consistent GW (solid lines). Vertical orange arrow: energy of the LDA KS state. Vertical dotted line: Fermi energy in LDA. The zero of the energy axis is at the Fermi energy calculated in the eigenvalue self-consistent GW. Δ F is the shift of the Fermi energy. From Gatti et al. (2013) Unpublished

On the contrary, G 0 W 0 gives an inaccurate description of the incoherent features (i.e., those besides the quasiparticle peak) in the spectral function. In fact, the satellite measured in the photoemission spectrum of metallic VO2 [103] is too weak in the G 0 W 0 spectral function (see dashed red line in Fig. 4) [104]. On the basis of this G 0 W 0 result, one would claim the inadequacy of the GW approximation for the description of this genuine signature of dynamical correlation. However, we now show that it is actually a problem of the G 0 W 0 scheme itself.

In G 0 W 0 the energy scale for G 0 is given by the LDA starting point. For example, the imaginary part of the G 0 W 0 self-energy can be written as (see (15), (21), and (22))

$$ \mathrm{Im}\,\Sigma \left(\mathbf{r},{\mathbf{r}}^{\prime },\omega \right)=\pi {\displaystyle \sum_{i,s\ne 0}}\mathrm{sgn}\left(\mu -{\varepsilon}_i\right){\phi}_i\left(\mathbf{r}\right){\phi}_i^{\ast}\left({\mathbf{r}}^{\prime}\right){W}_s\left(\mathbf{r},{\mathbf{r}}^{\prime}\right)\delta \left[\omega -{\varepsilon}_i+{\omega}_s\mathrm{sgn}\left(\mu -{\varepsilon}_i\right)\right] $$
(27)

where ε i are LDA energies and μ the LDA Fermi energy. Therefore, the G 0 W 0 ImΣ(ω) (see dashed black line in Fig. 4) changes its sign, crossing zero at the LDA Fermi energy (vertical dotted line in Fig. 4) which does not correspond to the Fermi energy after inclusion of GW corrections. In the eigenvalue self-consistent GW scheme, instead, since the energies ε i entering the self-energy are updated self-consistently, ImΣ(ω) (solid black line), as expected, crosses zero at the Fermi energy μ that is obtained self-consistently (zero of energy axis in Fig. 4).

The peaks in ImΣ(ω) can give rise to structures (satellites) in the spectral function; see (20). This is indeed the case for the top-valence state of VO2 in Fig. 4. For occupied states peaks of ImΣ(ω) are located approximately at − ω s  + ε i ; see (27). In VO2 ω s ∼1.5 eV is the energy of a localized d − d plasmon and the coupling with this plasmon is at the origin of the satellite [105]. This satellite is called a shake-up excitation. This means that in the photoemission process the photoemitted electron leaves the (N − 1)-electron system in an excited state in which the creation of an additional plasmon excitation has occurred. The main QP peak in this case corresponds to the ground state of the (N − 1)-electron system.

On the basis of this physical picture, by energy conservation we expect that in the spectral function the satellite is at lower energy than the quasiparticle peak and that the distance is equal to the plasmon energy ω s . However, this does not happen in the G 0 W 0 approximation. In fact, in G 0 W 0 the QP energy is shifted upwards with respect to the LDA energy ε i that is used to build ImΣ i . If we call this shift Δ i , from (27) we find that the peak of ImΣ i is located at − ω s + ε i and hence has a distance Δ i + ω s from the QP peak. Since Δ i > 0, the satellite in G 0 W 0 then has too high a binding energy. Moreover, the intensity is so weak that one would even conclude that there is no satellite in GW [104]. Instead, in the eigenvalue self-consistent GW scheme the satellite gets closer to the QP peak (see solid red line), because at self-consistency Δ i = 0. The distance between the satellite and the QP peak becomes correctly ∼ ω s , i.e., the energy of the plasmon. Moreover, the intensity of the satellite is enhanced with respect to G 0 W 0 due to the fact that |ω − ε i − (ReΣ i (ω) − V xc i )| becomes smaller (compare solid and dashed cyan lines). In conclusion, the eigenvalue-self-consistent GW scheme gives a correct description of the low-energy satellite [Gatti et al. (2013) Unpublished], in contrast to G 0 W 0.
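
The bookkeeping of this argument can be reproduced with a toy model. The sketch below is not the calculation of Fig. 4: it assumes a single occupied level coupled to one plasmon of energy ω s through a hypothetical coupling constant g, so that the correlation self-energy has a single pole at ε in − ω s , where ε in is the energy used to build Σ. It only tracks the distance between the QP peak and the peak of ImΣ, not the full spectral function, and all numbers are illustrative.

```python
import math


def qp_energy(eps_lda, eps_in, omega_s, g):
    """QP root of  omega = eps_lda + g**2 / (omega - eps_in + omega_s).

    eps_lda : LDA energy of the occupied state (eV)
    eps_in  : energy used to build the self-energy (eV)
    """
    pole = eps_in - omega_s  # peak of Im Sigma for an occupied state, see (27)
    return 0.5 * (eps_lda + pole
                  + math.sqrt((eps_lda - pole) ** 2 + 4.0 * g ** 2))


eps_lda, omega_s, g = -0.5, 1.5, 0.6  # eV, hypothetical values

# G0W0: the self-energy is built with the LDA energy
e_g0w0 = qp_energy(eps_lda, eps_lda, omega_s, g)
dist_g0w0 = e_g0w0 - (eps_lda - omega_s)  # equals Delta_i + omega_s

# Eigenvalue self-consistency: feed the QP energy back into the self-energy
eps_in = eps_lda
for _ in range(200):
    e_sc = qp_energy(eps_lda, eps_in, omega_s, g)
    if abs(e_sc - eps_in) < 1e-12:
        break
    eps_in = e_sc
dist_sc = e_sc - (eps_in - omega_s)  # equals omega_s at self-consistency

print(f"G0W0:            QP shift {e_g0w0 - eps_lda:.3f} eV, distance {dist_g0w0:.3f} eV")
print(f"self-consistent: QP shift {e_sc - eps_lda:.3f} eV, distance {dist_sc:.3f} eV")
```

In this model the G 0 W 0 distance exceeds ω s by exactly the upward QP shift Δ i , whereas updating the energy that enters the self-energy brings the distance back to ω s .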

The example just discussed shows that conclusions about spectral functions should not be based on calculations at the G 0 W 0 level, especially if one is interested in spectral properties close to the Fermi level and the self-energy induces a non-negligible shift of the positions of the peaks with respect to the LDA starting point. In fact the same problem has been observed in other situations, such as Hubbard chains [106] or SrVO3, another prototypical strongly correlated metal [107]. This conclusion is also relevant when GW is combined with other methods that calculate spectral functions, such as dynamical mean field theory (DMFT).

3.3 Quasiparticle Wavefunctions

The G 0 W 0 scheme is based on the hypothesis that QP wavefunctions can be approximated by LDA orbitals; see (17). This assumption holds for simple semiconductors and metals and has been investigated in depth, e.g., for bulk silicon [83]. However, it has been known for a long time that it can be questionable for finite systems [108] and surfaces [109–111]. In simple systems like bulk silicon the LDA conduction wave functions have also recently been shown to be of significantly poorer quality, in particular away from highly symmetric points of the Brillouin zone [112]. Here, through the case study of the insulating phase of VO2, we discuss a situation in which the G 0 W 0 perturbation theory breaks down completely [105]. Similar problems occur when LDA produces a wrong ordering of the bands (see, e.g., [113, 114]).

Across the metal-insulator transition, VO2 also undergoes a structural phase transition from rutile (metal) to monoclinic (insulator). This is accompanied by the formation of V–V dimers along the rutile c axis, resulting in a doubling of the unit cell. The LDA underestimates the bonding-antibonding splitting of the V 3d states associated with the formation of these V–V dimers and in turn the band structure is metallic [115]. Alternatively, the metallic LDA band structure has been interpreted as the proof of the strongly correlated nature of VO2 [116].

In apparent support of the hypothesis of “strong correlation,” G 0 W 0 based on LDA also fails to open the gap. In order to get rid of the LDA starting point, QP wavefunctions have been calculated self-consistently in the COHSEX approximation [105]. This self-consistent quasiparticle calculation does succeed in opening a gap (0.8 eV, quite close to the experimental value of 0.6 eV). In contrast, in a COHSEX calculation where the wavefunctions are constrained to be the LDA ones and only the energies are updated self-consistently, a gap close to zero is found: the change of the wavefunctions with respect to the LDA ones is thus of utmost importance.

In Fig. 5 the LDA electron density for the top valence bands is plotted (see the left panel). It represents the V a 1g states that derive from the V–V dimer bonding states and are hence highly polarized along the V–V dimer axis (the vertical c axis in the figure). The self-consistent QP COHSEX calculation results in the enhancement of this anisotropy, inducing a stronger bonding character to the top valence wavefunctions. Therefore if the orbital redistribution associated with the formation of V–V dimers is underestimated, as happens in LDA, the system remains metallic. Since LDA orbitals are not a sufficiently good approximation of the QP wavefunctions at the Fermi level, the G 0 W 0 perturbative scheme on top of LDA is not valid in the present situation. Improved QP wavefunctions are instead obtained in the COHSEX approximation, which can be used for a subsequent G 0 W 0 calculation.

Fig. 5

(Left panel) Isosurface of the LDA electron density for the top valence V 3d bands in insulating VO2. It is a V–V dimer bonding a 1g state, highly polarized along the c axis (vertical axis in the figure). (Right panel) Difference between COHSEX and LDA electron density for the same states. Yellow surfaces are for positive variations and purple for negative ones. The c-axis polarization increases for COHSEX wavefunctions. See [105]

More generally, this conclusion about the failure of G 0 W 0 holds for d and f electron states for which the LDA tends to produce too delocalized wavefunctions [71, 113, 117–126]. The Fock exchange term of the self-energy is essential in these situations to cure this delocalization error [126].

Fig. 6

Absorption spectrum Im ϵ of insulating VO2, calculated in the independent-particle approximation using COHSEX+ G 0 W 0 energies and LDA (yellow line in the main panel and in the inset) or COHSEX (blue line) wavefunctions. See Gatti et al. (2013) Unpublished

Finally, in Fig. 6 we illustrate the effect of using QP wavefunctions instead of LDA ones in the optical spectra of insulating VO2 [Gatti et al. (2013) Unpublished]. Here the absorption spectrum, given by the imaginary part of the dielectric function ɛ M, is obtained as a sum over independent vertical transitions from valence to conduction states, corresponding to Fermi’s golden rule in the independent-particle picture. The energies of these transitions are calculated using the COHSEX+ G 0 W 0 results that correctly produce an insulator in VO2 [105]; hence the onset of the optical spectrum is at a finite energy. However, the spectrum calculated with LDA wavefunctions starts with a peak of huge intensity (see yellow line). This huge peak derives from the shift of the metallic Drude peak at ω ∼ 0 in the spectrum calculated with metallic LDA eigenvalues. In order to understand this problem better, we write the oscillator strengths for vanishing q that determine the absorption spectrum by using k ⋅ p perturbation theory [127]:

$$ \underset{\mathbf{q}\to 0}{ \lim}\left\langle {\phi}_{v\mathbf{k}}\left|{\mathrm{e}}^{-i\mathbf{q}\cdot \mathbf{r}}\right|{\phi}_{c\mathbf{k}+\mathbf{q}}\right\rangle =\underset{\mathbf{q}\to 0}{ \lim}\frac{i\mathbf{q}\cdot \left\langle {\phi}_{v\mathbf{k}}\left|\left[{H}^{\mathrm{KS}},\mathbf{r}\right]\right|{\phi}_{c\mathbf{k}}\right\rangle }{\varepsilon_{c\mathbf{k}}-{\varepsilon}_{v\mathbf{k}}}, $$
(28)

with H KS ϕ i = ε i ϕ i . If LDA wavefunctions are used to calculate the oscillator strengths (28), then the energy differences appearing in the denominator of (28) are by definition also to be calculated in LDA. Therefore, since VO2 is metallic in LDA, for the first transitions the denominator is vanishingly small. In turn, the oscillator strength of the first peak remains very high, although its position is given correctly by COHSEX+ G 0 W 0 energy differences. Thus the wrong oscillator strength is due to the use of LDA wavefunctions and is corrected by using QP wavefunctions calculated in the self-consistent COHSEX scheme (see blue line in Fig. 6). The effect is drastic for VO2, but one should expect similar behavior whenever the opening of a small band gap is accompanied by a change of wavefunctions.
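
To give a feeling for the role of the denominator in (28), the following Python sketch compares the squared matrix element obtained with a small, LDA-like energy difference and with a gap of the order of the experimental one. The commutator matrix element in the numerator is kept fixed, which is an assumption made only to isolate the effect of the denominator; the 0.05 eV energy difference is hypothetical.

```python
def relative_intensity(energy_difference_ev, numerator=1.0):
    """Squared modulus of the right-hand side of Eq. (28), up to the factor q**2,
    for a fixed (hypothetical) commutator matrix element."""
    return (numerator / energy_difference_ev) ** 2


lda_difference = 0.05  # eV, hypothetical near-metallic LDA energy difference
qp_gap = 0.6           # eV, of the order of the experimental gap of VO2

enhancement = relative_intensity(lda_difference) / relative_intensity(qp_gap)
print(f"intensity enhancement from the LDA denominator: {enhancement:.0f}x")
```

Because the intensity scales as the inverse square of the energy difference, even a moderate underestimation of the gap in the denominator inflates the first peak by orders of magnitude.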

4 Quasiparticle Self-Consistent GW

For all the aforementioned reasons, it seems very attractive to get rid of the mean-field starting point which pervasively affects all the G 0 W 0 results. The natural way to do so would be to turn to self-consistent GW calculations, as pictured in the central panel of Fig. 2. However, the full self-consistent GW approach suffers from many drawbacks as we will show in the next paragraph. This is the main reason why Faleev, Kotani, and van Schilfgaarde introduced in 2004 the so-called “Quasiparticle-Self-consistent GW” scheme [71], to which this section is devoted.

4.1 Full Self-Consistent GW

The full self-consistent GW is affected by two kinds of issues: there are some well identified theoretical problems and the calculations are extremely cumbersome.

In the self-consistent GW scheme, the Dyson equation (10) is solved self-consistently using the GW self-energy. At first sight, self-consistent GW is appealing, since it is a conserving approximation, obeying conservation laws for particle number, energy, and momentum under the influence of external perturbations [129]. For instance, it has been shown to yield very accurate total energies [68, 69]. However, in self-consistent GW the polarizability \( \tilde{\chi} \) is built from the interacting Green’s functions G that contain dynamical self-energy effects. In contrast to G 0, the spectrum of the full Green’s function G contains not only quasiparticles but also “incoherent” parts with a finite spectral weight. This transfer of spectral weight is measured by the renormalization factor Z, which is most often smaller than 1. When performing self-consistent GW without any vertex correction, the polarizability \( \tilde{\chi}=- iGG \) contains excitations renormalized with a factor Z 2 [118]. As a consequence, screening is reduced and the results are worse in practice. Furthermore, the resulting polarizability \( \tilde{\chi} \) does not obey the f-sum rule [67]. To summarize, there are theoretical arguments that strongly hint towards the inclusion of both self-consistency and vertex corrections together [130, 131].

These problems do not appear to be too dramatic for atoms and molecules. For finite systems, self-consistent GW does not deteriorate G 0 W 0 results [93, 95, 132] since, in this case, the dynamical effects in Σ are indeed negligible (i.e., Z ∼ 1). However, as far as solid-state systems are concerned, the full self-consistent approach appears to be questionable.

Besides these physical problems, the computational cost of self-consistent GW is very high. The GW self-energy is dynamical: the equations have a frequency dependence. For instance, the quasiparticle wavefunctions are not eigenvectors of the same Hamiltonian. The GW self-energy is non-Hermitian: one should distinguish left and right eigenvectors and the eigenvalues are complex. As a consequence, the “wavefunctions” do not form an orthogonal basis. In summary, a fully self-consistent GW calculation is a formidable task and one can easily appreciate why its application to solid-state systems has been so scarce in the available literature.

4.2 The QSGW Approximation to the GW Self-Energy

A static and Hermitian approximation to the GW self-energy would be devoid of the issues mentioned in the previous paragraph. This would prevent the spectral weight from being transferred to the “incoherent part,” thus curing most of the physical flaws of the self-consistent GW scheme. Furthermore, it would allow one to have an orthogonal set of wavefunctions that are eigenvectors of the same (non-local) Hamiltonian. The computational gain would be massive.

The problem is then to design a static and Hermitian approximation that still retains the accuracy of the full GW self-energy. Faleev, Kotani, and van Schilfgaarde made two proposals in their seminal paper of 2004 [71]. One method was identified as clearly superior to the other: this is the method nowadays named QSGW. The matrix elements of the self-energy in the QSGW approximation read

$$ \begin{array}{l}\left\langle {\psi}_i\left|{\Sigma}^{QSGW}\right|{\psi}_j\right\rangle =\frac{1}{4}\left\{\left\langle {\psi}_i\left|{\Sigma}^{GW}\left({E}_i\right)\right|{\psi}_j\right\rangle +{\left\langle {\psi}_j\left|{\Sigma}^{GW}\left({E}_i\right)\right|{\psi}_i\right\rangle}^{\ast}\right.\\ {}\left.+\left\langle {\psi}_i\left|{\Sigma}^{GW}\left({E}_j\right)\right|{\psi}_j\right\rangle +{\left\langle {\psi}_j\left|{\Sigma}^{GW}\left({E}_j\right)\right|{\psi}_i\right\rangle}^{\ast}\right\}.\end{array} $$
(29)

The QSGW method was later justified as a way to optimize the quasiparticle Hamiltonian H 0 with respect to Δ(ω) = H(ω) − H 0, where H(ω) = h 0 + V H  + ΣGW(ω) [117]. QSGW is thus a self-consistent perturbation theory where self-consistency determines the best H 0 within the GWA. Here we see that this static and Hermitian approximation to Σ GW was chosen to conserve the quality of the diagonal terms. Once self-consistency is achieved, the eigenvalues of H 0 coincide with the (real part of the) poles of GW Green’s function. In fact, at self-consistency the diagonal terms of the self-energy are precisely evaluated at the self-consistent quasiparticle energy E i :

$$ \left\langle {\psi}_i\left|{\varSigma}^{QSGW}\right|{\psi}_i\right\rangle =\frac{1}{2}\left\{\left\langle {\psi}_i\left|{\varSigma}^{GW}\left({E}_i\right)\right|{\psi}_i\right\rangle +{\left\langle {\psi}_i\left|{\varSigma}^{GW}\left({E}_i\right)\right|{\psi}_i\right\rangle}^{\ast}\right\} $$
(30)

and the only approximation for these terms is the neglect of the imaginary parts. Then the calculated quasiparticle band structures should be very similar to the GW ones.
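
The construction in (29) can be sketched in a few lines of Python with NumPy. The function below assumes that a callable returning the full matrix of Σ GW at a given energy is available; the model self-energy used to exercise it (a random complex matrix with a linear energy dependence) is purely hypothetical and has no physical content.

```python
import numpy as np


def qsgw_self_energy(sigma_gw, energies):
    """Static Hermitian matrix of Eq. (29).

    sigma_gw : callable, sigma_gw(E) -> (n, n) complex matrix <psi_i|Sigma^GW(E)|psi_j>
    energies : sequence of the n quasiparticle energies E_i
    """
    n = len(energies)
    # Hermitian part of Sigma^GW evaluated at each quasiparticle energy
    herm = []
    for e in energies:
        s = sigma_gw(e)
        herm.append(0.5 * (s + s.conj().T))
    out = np.empty((n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            # Average of the Hermitian parts taken at E_i and at E_j
            out[i, j] = 0.5 * (herm[i][i, j] + herm[j][i, j])
    return out


# Hypothetical model self-energy: non-Hermitian and energy dependent
rng = np.random.default_rng(0)
a = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
b = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))


def model_sigma(e):
    return a + e * b


energies = [-1.0, 0.5, 2.0]
sigma_qsgw = qsgw_self_energy(model_sigma, energies)
print(np.allclose(sigma_qsgw, sigma_qsgw.conj().T))  # Hermiticity check
```

The result is Hermitian by construction, and its diagonal reduces to the real part of the diagonal GW matrix elements evaluated at E i , as in (30).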

4.3 Practical Implementation of the QSGW Method

The extension of an existing G 0 W 0 code into a fully working QSGW code is straightforward. Instead of only calculating the diagonal matrix elements

$$ \left\langle {\psi}_i\left|{\Sigma}^{GW}\right|{\psi}_i\right\rangle, $$
(31)

the implementation should also allow for i ≠ j,

$$ \left\langle {\psi}_i\left|{\Sigma}^{GW}\right|{\psi}_j\right\rangle . $$
(32)

Then compared to a simple G 0 W 0 calculation, there is one additional convergence parameter: the extension of the range of states j. Indeed, after diagonalization of the QSGW Hamiltonian, the quasiparticle wavefunctions are expanded as a linear combination of Kohn–Sham orbitals:

$$ \left|{\Psi}_i\right\rangle ={\displaystyle \sum_j}{c}_{ji}\left|{\phi}_j\right\rangle . $$
(33)

The c ji coefficients form a unitary matrix, since both the quasiparticle wavefunctions and the Kohn–Sham wavefunctions form an orthonormal set. The accuracy of the expansion is of course governed by the flexibility of the basis set, i.e., the number of Kohn–Sham orbitals included in the right-hand side of (33). Fortunately, in all practical cases, the number of elements in the linear combination need not be as large as the original basis set.
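
A minimal sketch of this diagonalization step, assuming that the QSGW Hamiltonian in the Kohn–Sham basis can be written as the diagonal matrix of Kohn–Sham energies plus the Hermitian matrix of Σ QSGW − V xc (i.e., with the Hartree potential kept fixed, an assumption of this example), is given below in Python with NumPy. The energies and the perturbation matrix are hypothetical.

```python
import numpy as np


def quasiparticle_states(eps_ks, sigma_minus_vxc):
    """Diagonalize diag(eps_KS) + (Sigma^QSGW - V_xc) in the Kohn-Sham basis.

    Returns the quasiparticle energies and the matrix c whose column i
    holds the coefficients c_ji of Eq. (33).
    """
    h = np.diag(np.asarray(eps_ks, dtype=complex)) + sigma_minus_vxc
    energies, c = np.linalg.eigh(h)  # eigh assumes a Hermitian matrix
    return energies, c


# Hypothetical Kohn-Sham energies (eV) and Hermitian perturbation
eps_ks = [-2.0, -0.5, 1.0, 3.0]
rng = np.random.default_rng(1)
m = 0.2 * (rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
sigma_minus_vxc = 0.5 * (m + m.conj().T)

qp_energies, c = quasiparticle_states(eps_ks, sigma_minus_vxc)
print(np.allclose(c.conj().T @ c, np.eye(4)))  # unitarity of the c_ji matrix
```

The number of Kohn–Sham orbitals kept in the basis (four in this toy example) plays the role of the additional convergence parameter mentioned above.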

Concerning computational time, there are two differences between QSGW and G 0 W 0. Within QSGW, the GW self-energy has to be evaluated for all states used as a basis set in (33). For solids, this further implies the need to calculate the self-energy for all the k-points in the grid. However, in practice the calculation of the self-energy is often not the limiting step of a QSGW run. This is rather the evaluation of the dielectric matrix used to construct W. This step is exactly the same for G 0 W 0 and for QSGW. If this operation dominates the computer time consumption, QSGW is slower than G 0 W 0 simply because of the need to iterate the calculation of the dielectric matrix (typically 3–10 cycles). In conclusion, the QSGW method is roughly one order of magnitude slower than the usual G 0 W 0 approximation for the most common cases.

4.4 Results

As mentioned earlier in this chapter, the G 0 W 0 approach may experience problems when the starting point is inadequate. This is particularly true for crystals with very small or very large band gaps. In the first systematic study of QSGW for crystalline systems, van Schilfgaarde et al. [117] demonstrated the deficiencies of G 0 W 0 (left panel of Fig. 7). The G 0 W 0 band gaps generally slightly underestimate the experimental data.

Fig. 7

Calculated band gaps compared to experimental band gaps within LDA and G 0 W 0 (left panel), or within QSGW (right panel). Γ − Γ direct gaps are shown as circles and other gaps as squares. The perfect agreement would be with all the points along the diagonal. Adapted from [117]. Copyright 2006, The American Physical Society

The QSGW method then brings a clear improvement, as can be appreciated in the right panel of Fig. 7. For the small band-gap semiconductors, for which LDA wrongly predicted a metallic band structure, the improvement is spectacular. Whereas the G 0 W 0 results were still affected by the wrong metallic starting point, the QSGW band gaps of InAs, InSb, and GaSb are quantitatively correct. For the large band-gap insulators, the LDA underestimation of the band gap is so dramatic that once again the G 0 W 0 gaps are strongly affected by the starting point. The QSGW band gaps of diamond, MgO, CaO, etc., are closer to the experimental values.

The QSGW has been found to be particularly important for transition metal oxides such as MnO and NiO [71]. In particular, the elements with a complete d shell experience an unexpectedly large error in G 0 W 0, as shown in Fig. 7 for ZnO and Cu2O. For Cu2O it was also shown that QSGW greatly improves both the photoemission spectroscopy and the optical spectrum [133], as obtained from the solution of the Bethe–Salpeter equation [14].

In order to judge the performance of an electronic-structure method, one should always consider more properties than just the band gaps. Indeed, QSGW generally improves on the LDA for other properties as well, such as effective masses and gaps at other critical points [118]. The bandwidth of Na is narrower in QSGW than in LDA, in agreement with experiment and with G 0 W 0 (self-consistent GW results on the homogeneous electron gas, which is a very good approximation to sodium, had instead shown an increase of the bandwidth [67, 130]). In 3d metals (e.g., Ti, Cr, Fe, Co, Ni), the 3d bandwidths, the exchange splittings, and magnetic moments are generally improved with respect to LDA [117], with some exceptions like Ni and other systems where LDA already overestimates the magnetic moment, and one expects that the coupling with spin fluctuations beyond GW should be taken into account [57]. In addition to spectroscopic data, some ground-state properties can also be successfully predicted with QSGW, such as electric field gradients [134]. Finally, QSGW results have also been used as a starting point for other calculations, like impact ionization rates [135] and the transverse spin susceptibility [136], again finding good agreement with experiment.

There is still an overall slight overestimation of the band gaps by the QSGW method, which is systematic in all compounds studied [117, 118]. The accuracy of QSGW somewhat deteriorates in d and f compounds with respect to sp systems. The unoccupied states are generally too high: by 0.2 eV for sp semiconductors, less than 1 eV for d 0 compounds like SrTiO3 and TiO2, more than 1 eV for other d materials like NiO, and up to 3 eV for f compounds like Gd and Er [119]. Most often these errors have been ascribed to the missing electron–hole interactions (excitonic effects) in the calculation of the polarization \( \tilde{\chi} \) and the screened interaction W, which is done at the RPA level in the GW approximation. The dielectric constants turn out to be too small (by ~20%) [117], resulting in a slight underscreening and, in turn, in too large gaps. Indeed, van Schilfgaarde and coworkers empirically found that by rescaling the screened interaction W by 0.8, the results were improved (see, e.g., [123, 124, 137]). This tendency was later confirmed by Shishkin et al. [138], who corrected the remaining error with the inclusion of electron-hole interaction in W using the Nanoquanta kernel of time-dependent density-functional theory [139]. These observations mean that vertex corrections in Σ [see (13f)] would play a secondary role in a large variety of materials. However, for example, the localized d band in the occupied valence band of materials like GaAs, GaN, and GaP, which is slightly too shallow, cannot be corrected by this simple rescaling that corrects the QSGW underscreening. A more detailed investigation of the effects of vertex corrections beyond GW is thus still an open issue.

Of course, not all the features of the full GW self-energy can be retained in QSGW. The simplicity has a price. Since QSGW is a quasiparticle-only theory, all the properties beyond quasiparticles are naturally absent from it. For instance, the satellites in the photoemission spectra of correlated [57] and non-correlated [56] materials are no longer accessible. The finite lifetimes of quasiparticles [58] are also beyond QSGW. A way to overcome this limitation is to use the QSGW results at self-consistency as a starting point to calculate the Green’s function in the GW approximation and hence the spectral function, as has been done recently for SrVO3 [107].

QSGW has mostly been applied to solids, and the first applications to atoms and molecules are now being carried out by several groups. Bruneval [34] first applied QSGW to the ionization potential of small sodium clusters. Then Ke [140] demonstrated the improvement of QSGW over G 0 W 0 based on Hartree–Fock for the electron affinities and ionization potentials of conjugated molecules (CnHm). More recently, Bruneval [141] investigated its performance for all the light atoms from H to Ar. As shown in Fig. 8, the improvement offered by QSGW is not systematic. On average QSGW performs better than G 0 W 0 based on HF, with a smaller mean absolute error, although for specific atoms the opposite can hold true. Note that the error in the LUMO of the positive ion (right panel) remains rather large even for QSGW. Of course, the application of QSGW to molecules is still in its early stage and its performance has to be evaluated for a wider range of molecules, especially large ones.

Fig. 8

Deviation from the experimental ionization potential of atoms as obtained from the HOMO of the neutral atoms (left panel) or from the LUMO of the positive ions (right panel), i.e., ϵHOMO/LUMO − (−I). HF is presented with open bars, G 0 W 0 based on HF inputs (GW@HF) with striped bars, and QSGW with filled bars. The mean absolute error (MAE) is also provided. Reprinted with permission from [141]. Copyright 2012, American Institute of Physics
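The quantity plotted in Fig. 8 and its mean absolute error can be sketched as follows. This is only an illustration of the definition ϵ − (−I); the function names and the numerical values are hypothetical placeholders and are not taken from [141].

```python
import numpy as np

def ip_deviation(eigenvalues_ev, experimental_ip_ev):
    """Return eps - (-I) in eV for each species (eps: HOMO or LUMO eigenvalue)."""
    eps = np.asarray(eigenvalues_ev, dtype=float)
    ip = np.asarray(experimental_ip_ev, dtype=float)
    return eps + ip

def mean_absolute_error(deviations):
    """Mean of the absolute deviations, in the same units as the input."""
    return float(np.mean(np.abs(deviations)))

# Placeholder numbers for illustration only (NOT data from the chapter).
eps_placeholder = [-10.0, -12.5, -8.0]   # calculated eigenvalues (eV)
ip_placeholder = [10.5, 12.0, 8.0]       # experimental ionization potentials (eV)

deviations = ip_deviation(eps_placeholder, ip_placeholder)
mae = mean_absolute_error(deviations)
```

A positive deviation means that the calculated eigenvalue lies above −I, i.e., the ionization potential is underestimated.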

5 Relation with Alternative Self-Consistent Schemes

Although the QSGW method is a simplified version of the full self-consistent GW self-energy, the calculations may still be quite cumbersome. Many authors resort to even more approximated types of self-consistency. This section briefly describes the alternatives to QSGW.

5.1 Energy-Only Self-Consistency

In the energy-only self-consistent GW scheme, the eigenvalues used to build G and W are updated iteratively, while the wavefunctions are kept at the level of the mean-field starting point. This scheme corresponds to keeping only the diagonal terms of the QSGW Hamiltonian. Thus whenever the mean-field orbitals are a good approximation to QP wavefunctions, the two schemes will give very similar results. Of course, the energy-only self-consistent GW is computationally much cheaper than QSGW.
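The iteration described above can be sketched as a damped fixed-point loop on the eigenvalues alone. This is a minimal sketch under stated assumptions, not the implementation of any GW code: the function names, the linear mixing, and in particular the two-level `toy_sigma` model (whose correction simply grows with the gap) are hypothetical and stand in for the actual diagonal matrix elements of the GW self-energy.

```python
import numpy as np

def energy_only_scgw(e_mf, vxc_diag, sigma_diag, mixing=0.5, tol=1e-8, max_iter=200):
    """Update only the eigenvalues; the orbitals stay those of the mean field.

    e_mf       : mean-field eigenvalues (eV)
    vxc_diag   : diagonal matrix elements <n|v_xc|n> (eV)
    sigma_diag : callable returning <n|Sigma|n> evaluated with the current eigenvalues
    """
    e_mf = np.asarray(e_mf, dtype=float)
    vxc_diag = np.asarray(vxc_diag, dtype=float)
    e_qp = e_mf.copy()
    for iteration in range(1, max_iter + 1):
        # Quasiparticle equation restricted to the diagonal of the Hamiltonian.
        e_target = e_mf + sigma_diag(e_qp) - vxc_diag
        # Linear mixing damps oscillations between successive iterations.
        e_next = (1.0 - mixing) * e_qp + mixing * e_target
        if np.max(np.abs(e_next - e_qp)) < tol:
            return e_next, iteration
        e_qp = e_next
    raise RuntimeError("eigenvalue self-consistency not reached")

def toy_sigma(energies, strength=0.5):
    """Hypothetical two-level model: the correction grows with the current gap."""
    gap = energies[1] - energies[0]
    shift = strength * gap / (1.0 + gap)
    return np.array([-shift, shift])  # valence pushed down, conduction pushed up

# Two-level toy system: valence at 0 eV, conduction at 1 eV, v_xc set to zero.
e_qp, n_iter = energy_only_scgw([0.0, 1.0], [0.0, 0.0], toy_sigma)
gap_qp = e_qp[1] - e_qp[0]
```

For this toy model the self-consistent gap g satisfies g = 1 + g/(1 + g), so the loop should converge to g = (1 + √5)/2 regardless of the mixing parameter, which only affects the number of iterations.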

This scheme has been used many times in the past [83, 142–145] in order to improve band gaps and bandwidths with respect to G 0 W 0 results. Recently, Shishkin and Kresse [145] presented a systematic study of the band gap of several sp semiconductors and insulators. The conclusion of their analysis is that updating the eigenvalues in G while keeping W at the GGA level (the scheme is called GW 0) gives the best results in comparison to experiment (see Fig. 9). This is due to a cancellation of effects between the band-gap opening and excitonic interactions in W that is verified in the class of materials considered in [145]. In turn, the GGA screening is in better agreement with the physical one (i.e., the one that can be measured by loss spectroscopies) than the screening calculated self-consistently. In fact, updating the eigenvalues in W also leads to an underscreening and further opens the gaps, causing the results to deteriorate [145]. Similar conclusions can be reached in finite systems. For instance, Blase and coworkers [94, 146] recently found that in a set of organic molecules and DNA/RNA nucleobases the energy-only self-consistent GW scheme provides much better results than G 0 W 0 on top of LDA.

Fig. 9

Comparison of band gaps calculated in GGA (PBE), G 0 W 0, and GW 0, with experiment for a series of prototypical sp semiconductors and insulators. GW 0 performs best. From [145]. Copyright 2007, The American Physical Society

5.2 Alternatives to the QSGW: COHSEX and Others

COHSEX is a static and Hermitian approximation to the GW self-energy (see Sect. 2.4). In this respect it parallels the self-energy of the QSGW approximation. The difference between the two approximations is the fact that in COHSEX the screening is static (i.e., it is calculated only at ω = 0), while it is fully dynamical in QSGW.

Bruneval, Vast, and Reining [112] proposed to combine self-consistent COHSEX with a subsequent G 0 W 0 calculation, giving very similar results to QSGW [112, 133] at a much lower computational cost. In fact, in COHSEX both the calculations of the screening (only static) and the self-energy (depending only on occupied states) are cheaper than in GW. COHSEX alone generally overestimates band gaps and dynamical effects contained in the GW self-energy are needed to correct this overestimation. In many cases a perturbative G 0 W 0 correction is sufficient, while, for example, in prototypical transition metal oxides like NiO and MnO, a self-consistent update of the eigenvalues is needed to get results similar to QSGW [147, Gatti and Rubio (2013) Unpublished].

The COHSEX+G 0 W 0 scheme has been successfully applied to a wide range of materials: from transition metal compounds (e.g., Cu2O, VO2, NiS2−xSex [105, 126, 133]) to CIGS and quaternary chalcogenides [120, 125, 148], delafossite transparent conductive oxides [121, 149], and intermediate-band materials [150, 151].

In an earlier work, Gygi and Baldereschi derived a simplified model static self-energy from the GW approximation [152] that was later used in a self-consistent manner in several transition metal oxides (e.g., MnO, NiO, CaCuO2, and VO2 [153–155]). These applications employed a model screened interaction W in which the static dielectric constant was a parameter taken from experiment. Nevertheless, together with the work of Aryasetiawan and Gunnarsson on NiO [156], in which the self-consistency was simulated by adding to LDA a static correction term for the Ni e g bands, they provided the first meaningful GW descriptions of these transition metal oxides beyond the G 0 W 0 scheme.

More recently, Sakuma, Miyake, and Aryasetiawan have proposed an alternative self-consistent quasiparticle scheme that is based on Löwdin’s method of symmetric orthogonalization [157]. Results have been found to be very close to those in the QSGW scheme.

5.3 Hybrid Functionals and LDA+U

Hybrid functionals [158] and the LDA+U approach [159] have become very popular as improved exchange-correlation DFT functionals in solids. At the same time, both can be seen as an approximation to the GW self-energy [160, 161]. In hybrids like the Heyd–Scuseria–Ernzerhof (HSE) functional [158] the α and ω parameters, which are obtained by considerations of the adiabatic-connection formula and numerically fitting the results against a benchmark set of data, play the role of effective screening. In LDA+U the on-site Hubbard U correction is only applied to a subset of states (and a double-counting correction is introduced). The electron–electron local interaction is treated at the Hartree–Fock level and the screening of the interaction is effectively obtained by reducing U from its atomic values. The Hubbard U can be pragmatically used as fitting parameter or calculated for example in RPA: in this case all the electrons not belonging to the chosen subset of states are then providing the effective screening of U [162]. The common effect of both hybrid functionals and LDA+U is to produce, as a result of an exchange effect, a localization of the LDA orbitals that for d and f electrons are generally too delocalized.
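For orientation, the role of the two HSE parameters can be made explicit with the form in which the functional is commonly written. The notation below is an assumption of this sketch and is not taken from the chapter; in particular, ω here denotes the range-separation parameter and not a frequency.

$$ {E}_{xc}^{\mathrm{HSE}}=\alpha {E}_x^{\mathrm{HF},\mathrm{SR}}\left(\omega \right)+\left(1-\alpha \right){E}_x^{\mathrm{PBE},\mathrm{SR}}\left(\omega \right)+{E}_x^{\mathrm{PBE},\mathrm{LR}}\left(\omega \right)+{E}_c^{\mathrm{PBE}}, $$

where the short-range (SR) and long-range (LR) parts follow from the splitting of the Coulomb interaction

$$ \frac{1}{r}=\frac{\mathrm{erfc}\left(\omega r\right)}{r}+\frac{\mathrm{erf}\left(\omega r\right)}{r}. $$

In the widely used HSE06 parametrization, α = 1/4 and ω is about 0.2 Å−1. Only a fraction α of the Hartree–Fock exchange is retained, and only at short range, which is how the two parameters mimic a screened exchange interaction.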

Even though both methods depend on parameters, they can be seen as a cheap way to perform self-consistent quasiparticle calculations. In fact, in recent years they have been proposed as an improved starting point for subsequent G 0 W 0 calculations. In particular, Rödl and coworkers [163–165] have shown that in prototypical transition metal oxides it is possible to fit the more expensive HSE results from GGA+U calculations with an additional scissor shift, producing results in good agreement with experiments. The application of these computationally efficient schemes is also gaining considerable popularity in the GW community (see, e.g., [99–101, 122, 163–173] and [102] for an early review; Fig. 10 contains a collection of results from the latter).
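A scissor shift, as mentioned above, is a rigid upward shift of the unoccupied states that leaves the occupied ones untouched. The following is a minimal sketch of this operation; the function names, the occupation threshold, and all numerical values are hypothetical placeholders and do not reproduce the procedure or the data of [163–165].

```python
import numpy as np

def apply_scissor(eigenvalues_ev, occupations, shift_ev):
    """Rigidly shift the unoccupied eigenvalues by shift_ev (eV)."""
    eig = np.asarray(eigenvalues_ev, dtype=float)
    occ = np.asarray(occupations, dtype=float)
    # States with occupation above 0.5 are treated as occupied (assumption).
    return np.where(occ > 0.5, eig, eig + shift_ev)

def band_gap(eigenvalues_ev, occupations):
    """Lowest unoccupied minus highest occupied eigenvalue (eV)."""
    eig = np.asarray(eigenvalues_ev, dtype=float)
    occ = np.asarray(occupations, dtype=float)
    return float(eig[occ <= 0.5].min() - eig[occ > 0.5].max())

# Placeholder eigenvalues at a single k point (NOT data from the chapter).
eig_placeholder = [-5.0, -1.0, 0.0, 1.5, 4.0]
occ_placeholder = [2, 2, 2, 0, 0]

gap_before = band_gap(eig_placeholder, occ_placeholder)
eig_shifted = apply_scissor(eig_placeholder, occ_placeholder, shift_ev=1.2)
gap_after = band_gap(eig_shifted, occ_placeholder)
```

By construction the gap increases by exactly the shift, while the dispersion within the valence and conduction manifolds is unchanged.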

Fig. 10

Comparison of band gaps calculated in GGA (PBE), HSE and G 0 W 0 on top of them with experiment for a series of prototypical sp semiconductors and insulators. HSE is a better starting point than GGA. From [102]. Copyright © 2009 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

Besides the presence of parameters, the most important drawback of hybrid functionals and LDA+U with respect to QSGW is their limited transferability, which stems from the limited flexibility of their description of the screening of the Coulomb interaction (which in GW is explicitly accounted for). This can become an important issue in situations where the screening changes substantially as a function of an external control parameter: for example, in the study of metal-insulator transitions under pressure [Gatti and Rubio (2013) Unpublished] and for materials characterized by a strong interplay between structure and electronic properties such as the CIGS [120]. In these cases QSGW is expected to be better (also with respect to G 0 W 0).

6 Conclusions and Outlook

The present chapter aimed at presenting the robustness and the simplicity of QSGW in addressing the spectral properties of materials. The QSGW method is nowadays a mature theory that offers a remarkable trade-off between speed and accuracy.

It has been applied to an impressively wide variety of materials, covering the range from sodium [113] to uranium and plutonium [174, 175], and including metals, semiconductors, insulators, atoms, and molecules. The comparison to experiment is in general very favorable. Therefore, in our opinion, QSGW is, to date, the best ab initio method available to calculate the quasiparticle spectrum of photoemission (and inverse photoemission) for solid state systems and can be thought of as a predictive theory.

Furthermore, QSGW is readily available in popular codes, such as Abinit [176] or Vasp [177]. On the other hand, the increasing popularity of the method nowadays goes hand in hand with the need for optimizing the computational algorithms, in order to prevent slow convergence in the self-consistency (see, e.g., [178]).

The QSGW method has enlarged the range of compounds that can be described by the GW approximation. It has demonstrated that, for materials for which G 0 W 0 was performing badly and thus considered out of reach for ab initio methods (e.g. “strongly correlated” transition metal oxides), the failure was to be ascribed to the perturbative scheme of G 0 W 0, and not to the GW approximation itself [71].

QSGW has provided an accurate reference from which it now makes sense to investigate more refined features of the electronic structure of materials, such as electron–phonon coupling [61, 62] and lattice polarization [179], which now are larger than the error bar of the method. Analogously, the field is now ready to study effects of electron interaction beyond the GW approximation (i.e., vertex corrections, for example, through the coupling with other approximations), without the bias that the perturbative G 0 W 0 scheme introduces.