Bold Feynman Diagrams and the Luttinger–Ward Formalism Via Gibbs Measures: Non-perturbative Analysis

Lin, Lin; Lindsey, Michael

doi:10.1007/s00205-021-01691-y

Published: 21 July 2021

Volume 242, pages 527–579, (2021)

Archive for Rational Mechanics and Analysis


Abstract

Many-body perturbation theory (MBPT) is widely used in quantum physics, chemistry, and materials science. At the heart of MBPT is the Feynman diagrammatic expansion, which is, simply speaking, an elegant way of organizing the combinatorially growing number of terms of a certain Taylor expansion. In particular, the construction of the ‘bold Feynman diagrammatic expansion’ involves the partial resummation to infinite order of possibly divergent series of diagrams. This procedure demands investigation from both the combinatorial (perturbative) and the analytical (non-perturbative) viewpoints. In this paper, we approach the analytical investigation of the bold diagrammatic expansion in the simplified setting of Gibbs measures (known as the Euclidean lattice field theory in the physics literature). Using non-perturbative methods, we rigorously construct the Luttinger–Ward formalism for the first time, and we prove that the bold diagrammatic series can be obtained directly via an asymptotic expansion of the Luttinger–Ward functional, circumventing the partial resummation technique. Moreover we prove that the Dyson equation can be derived as the Euler–Lagrange equation associated with a variational problem involving the Luttinger–Ward functional. We also establish a number of key facts about the Luttinger–Ward functional, such as its transformation rule, its form in the setting of the impurity problem, and its continuous extension to the boundary of the domain of physical Green’s functions.

1 Introduction

The bold Feynman diagrammatic expansion of many-body perturbation theory (MBPT), along with the many practically used methods in quantum chemistry and condensed matter physics that derive from it, can be formally derived from the Luttinger–Ward (LW)^{Footnote 1} formalism [19]. Since its original proposal in 1960, the LW formalism has found widespread applicability [5, 8, 13, 24]. However, the LW formalism and the LW functional are defined only formally, and this shortcoming poses serious questions both in theory and in practice. Indeed, the very existence of the LW functional in the setting of fermionic systems is under debate, with numerical evidence to the contrary appearing in the past few years [9, 11, 15, 28] in the physics community.

This paper expands on the work in [18], as well as an accompanying paper. In the accompanying paper, we provided a self-contained explanation of MBPT in the setting of the Gibbs model (alternatively known as the ‘Euclidean lattice field theory’ in the physics literature). In this setting one is interested in the evaluation of the moments of certain Gibbs measures. While the exact computation of such possibly high-dimensional integrals is intractable in general, important exceptions are the Gaussian integrals, that is, integrals for the moments of a Gaussian measure, which can be evaluated exactly. Perturbing about a reference system given by a Gaussian measure, one can evaluate quantities of interest by a series expansion of Feynman diagrams, which correspond to certain moments of Gaussian measures. For a specific form of quartic interaction that we refer to as the generalized Coulomb interaction, such a perturbation theory enjoys a correspondence with the Feynman diagrammatic expansion for the quantum many-body problem with a two-body interaction [1, 2, 22]. The generalized Coulomb interaction is also of interest in its own right and includes, for example, the (lattice) $\varphi ^{4}$ interaction [2, 29], as a special case. The combinatorial study of its perturbation theory was the goal of the accompanying paper. Nonetheless, the techniques of the accompanying paper, and MBPT more broadly, are more generally applicable to various types of field theories and interactions.

The culmination of the developments of the accompanying paper is the bold diagrammatic expansion, which is obtained formally via a partial resummation technique that sums possibly divergent series of diagrams to infinite order. Indeed, the main technical contribution of the accompanying paper was to place the combinatorial side of this procedure on firm footing. One motivation for this paper is to interpret the bold diagrams analytically, which we accomplish by first constructing the LW formalism. In fact this construction is non-perturbative and valid for rather general forms of interaction. Below we focus on the contributions and organization of this paper only.

1.1 Contributions

The main contribution of this paper is to establish the LW formalism rigorously for the first time, in the context of Gibbs measures. In this setting, the role of the Green’s function is assumed by the two-point correlator.

The construction of the LW functional proceeds via concave duality, in a spirit similar to that of the Levy-Lieb construction in density functional theory [16, 17] at zero temperature and the Mermin functional [21] at finite temperature, as well as the density matrix functional theory developed in [3, 7, 27]. With careful interpretation, this duality gives rise to a one-to-one correspondence between non-interacting and interacting Green’s functions. The LW formalism yields a variational interpretation of the Dyson equation; to wit, the free energy can be expressed variationally as a minimum over all physical Green’s functions, and the self-consistent solution of the Dyson equation yields its unique global minimizer. We also prove a number of useful properties of the LW functional, such as the transformation rule, the projection rule, and the continuous extension of the LW functional to the boundary of its domain, which can be interpreted as the domain of physical Green’s functions. In particular, this last property suggests a novel interpretation of the LW functional as the non-divergent part of the concave dual of the free energy. These results allow us to interpret the appropriate analogs of quantum impurity problems in our simplified setting. In particular, we prove that the self-energy is always a sparse matrix for impurity problems, with nonzero entries appearing only in the block corresponding to the impurity sites. Such a result is at the foundation of numerical approaches such as the dynamical mean field theory (DMFT) [10, 14].

We prove that the bold diagrams for the generalized Coulomb interaction can be obtained as asymptotic series expansions of the LW and self-energy functionals, circumventing the formal strategy of performing resummation to infinite order. The proof of this fact proceeds by proving the existence of such series non-constructively and then employing the combinatorial results of the accompanying paper to ensure that the terms of these series are in fact given by the bold diagrams.

Although the bold diagrammatic expansion (evaluated in terms of the interacting Green’s function, which is always defined) appears to be applicable in cases where the non-interacting Green’s function is ill-defined, we demonstrate that caution should be exercised in practice in such cases. Using a one-dimensional example, we demonstrate that the approximate Dyson equation obtained via a truncated bold diagrammatic expansion may yield solutions with large error in the regime of vanishing interaction strength or fail to admit solutions at all.

1.2 Outline

In Section 2 we review preliminary material and definitions needed to understand the results of this paper.

Section 3 concerns the construction of the LW formalism, beginning with a discussion of the variational formulation of the free energy and the relevant concave duality (Section 3.1). This is followed by the introduction of the LW functional and the Dyson equation (Section 3.2). Then we introduce several key properties of the LW functional: the transformation rule (Section 3.3); the projection rule, accompanied by a discussion of impurity problems (Section 3.4); and the continuous extension property (Section 3.5). The proof of the continuous extension property, which is the most technically demanding part of the paper, is postponed to Section 5, which has its own outline.

Section 4 concerns the bold diagrammatic expansion. In Section 4.1 we prove the existence of asymptotic series for the LW functional and the self-energy, and in Section 4.2 we relate the coefficients of the former to the latter. Then for the rigorous development of the bold diagrammatic expansion, it only remains at this point to prove that the asymptotic series for the self-energy matches the bold diagrammatic expansion of the accompanying paper. This is the most involved task of Section 4. In Section 4.3, we review the results that we need from the accompanying paper in a ‘diagram-free’ way that should be understandable to the reader who has not read the accompanying paper, and in Section 4.4, we establish the claimed correspondence. Finally, in Section 4.5 we illustrate the aforementioned warning about the truncation of the bold diagrammatic series in cases where the non-interacting Green’s function is ill-defined.

Relevant background material on convex analysis and the weak convergence of measures is collected in “Appendices A and B”, respectively. The proofs of many lemmas are provided in “Appendix C”, as noted in the text.

2 Preliminaries

In this section we discuss some preliminary definitions and notations.

2.1 Notation and Quantities of Interest

Throughout we shall let $\mathcal {S}^{N}$, $\mathcal {S}^{N}_+$, and $\mathcal {S}^{N}_{++}$ denote respectively the sets of symmetric, symmetric positive semidefinite, and symmetric positive definite $N\times N$ real matrices. For simplicity we restrict our attention to real matrices, though analogous results can be obtained in the complex Hermitian case.

In this paper we will consider Gibbs measures defined by Hamiltonians $h:\mathbb {R}^N \rightarrow \mathbb {R}\cup \{+\infty \}$ of the form

$$\begin{aligned} h(x) = \frac{1}{2} x^T A x + U(x), \end{aligned}$$

where $A \in \mathcal {S}^N$. The first term represents the quadratic or ‘non-interacting’ part of the Hamiltonian, while the second term, U, represents the interaction. We define the partition function accordingly as

$$\begin{aligned} Z[A, U] = \int _{\mathbb {R}^{N}} e^{-\frac{1}{2} x^{T} A x - U(x)}\,\mathrm {d}x. \end{aligned}$$

(2.1)

For fixed interaction U, we may think of the partition function as a function of A alone, that is, as $Z:\mathcal {S}^N\rightarrow \mathbb {R}$ sending $A\mapsto Z[A]$. In fact we adopt this perspective exclusively for the time being.

The free energy is then defined as a mapping $\Omega :\mathcal {S}^{N} \rightarrow \mathbb {R}\cup \{-\infty \}$ via

$$\begin{aligned} \Omega [A]:= -\log Z[A] = -\log \int _{{\mathbb {R}^{N}}}e^{-\frac{1}{2}x^{T}Ax-U(x)}\,\mathrm {d}x. \end{aligned}$$

(2.2)

We denote the domain of $\Omega $ by

$$\begin{aligned} \mathrm {dom}\,\Omega := \{A\in \mathcal {S}^N\,:\,\Omega [A]>-\infty \}, \end{aligned}$$

and the interior of the domain by $\mathrm {int}\,\mathrm {dom}\,\Omega $. As we will see, $\Omega $ is concave in A, and this notion of domain is the usual one from convex analysis (see “Appendix A”); it is simply the set of A for which the integral in Eq. (2.2) converges.

For $A \in \mathrm {int}\,\mathrm {dom}\,\Omega $, in fact the integrand in Eq. (2.2) must decay exponentially, hence we can define the two-point correlator (which we call the Green’s function by analogy with the quantum many-body literature) in terms of A via

$$\begin{aligned} G_{ij}[A] := \frac{1}{Z[A]} \int _{\mathbb {R}^{N}} x_i x_j \, e^{-\frac{1}{2} x^{T} A x - U(x)}\,\mathrm {d}x, \end{aligned}$$

and the integral on the right-hand side is convergent. More compactly, we have a mapping $G : \mathrm {int}\,\mathrm {dom}\,\Omega \rightarrow \mathcal {S}^N_{++}$ defined by

$$\begin{aligned} G[A] := \frac{1}{Z[A]} \int _{\mathbb {R}^{N}} xx^T\, e^{-\frac{1}{2} x^{T} A x - U(x)}\,\mathrm {d}x. \end{aligned}$$

(2.3)

It is important to note that $G[A] \in \mathcal {S}^N_{++}$ for all $A \in \mathrm {int}\,\mathrm {dom}\,\Omega $. As we shall see in Section 3, this constraint defines the domain of ‘physical’ Green’s functions, in a certain sense. In the discussion below, G is also called the interacting Green’s function.

In the case of the ‘non-interacting’ Gibbs measure, where $U \equiv 0$, all quantities of interest can be computed exactly by straightforward multivariate integration. In particular, letting $G^{0}[A] := G[A;0]$, we have for $A\in \mathrm {dom}\,\Omega = \mathcal {S}^N_{++}$ that

$$\begin{aligned} G^0 [A] = A^{-1}. \end{aligned}$$

(2.4)

The neatness of this relation motivates the factor of one half included in the quadratic part of the Hamiltonian. We refer to $G^0 [A]$ as the non-interacting Green’s function associated to A, whenever $A \in \mathcal {S}^N_{++}$. Note that for a general interaction U, $\mathrm {int}\,\mathrm {dom}\,\Omega $ may contain elements not in $\mathcal {S}^N_{++}$. For such A there is an associated (interacting) Green’s function but not a non-interacting Green’s function.
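As a numerical illustration (a minimal sketch, not part of the original analysis), one can check Eq. (2.4) for $N=1$ by quadrature, and observe that a quartic interaction $U(x)=x^4$ yields a finite interacting Green’s function even when the quadratic coefficient is negative, so that no non-interacting Green’s function exists. The helper names `integrate` and `green_1d`, the integration window, and the grid size are illustrative assumptions.

```python
import math

def integrate(f, lo=-12.0, hi=12.0, n=48000):
    """Composite midpoint rule on [lo, hi]; adequate for rapidly decaying integrands."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def green_1d(a, U):
    """Two-point correlator G[A] of Eq. (2.3) for N = 1 and A = (a), by direct quadrature."""
    weight = lambda x: math.exp(-0.5 * a * x * x - U(x))
    Z = integrate(weight)
    return integrate(lambda x: x * x * weight(x)) / Z

free = lambda x: 0.0        # U = 0: non-interacting Gibbs measure
quartic = lambda x: x ** 4  # quartic interaction, integrable for every real a

g0 = green_1d(2.0, free)         # should agree with 1/a = 0.5, cf. Eq. (2.4)
g_neg = green_1d(-1.0, quartic)  # a < 0: finite only because of the quartic term

print(g0, 1.0 / 2.0)
print(g_neg)
```

With $U \equiv 0$ the same call with $a \leqq 0$ would not converge on an unbounded domain, which is the one-dimensional form of the statement $\mathrm {dom}\,\Omega = \mathcal {S}^N_{++}$.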

In general G can be viewed as the gradient of $\Omega $, for a suitably defined notion of gradient for functions of symmetric matrices, which we now define:

Definition 2.1

For $i,j=1,\ldots ,N$, let $E^{(ij)} \in \mathcal {S}^N$ be defined by $E^{(ij)}_{kl} = \delta _{ik}\delta _{jl} + \delta _{il}\delta _{jk}$. For a differentiable function $f:\mathcal {S}^N\rightarrow \mathbb {R}$, define the gradient $\nabla f :\mathcal {S}^N\rightarrow \mathcal {S}^N$ by

$$\begin{aligned} \nabla _{ij} f = (\nabla f)_{ij} := \lim _{\delta \rightarrow 0} \frac{f(A+\delta \cdot E^{(ij)})-f(A)}{\delta }. \end{aligned}$$

If f is obtained by restriction from a function $f: \mathbb {R}^{N\times N} \rightarrow \mathbb {R}$, then equivalently $\nabla _{ij}f = \frac{\partial f}{\partial X_{ij}} + \frac{\partial f}{\partial X_{ji}}$.

Then on $\mathrm {dom}\,\Omega $ the gradient map $\nabla \Omega $ is given by

$$\begin{aligned} \nabla _{ij}\Omega [A]=\frac{1}{Z[A]}\int x_i x_j \,e^{-\frac{1}{2}x^{T}Ax-U(x)}\,\,\mathrm {d}x, \end{aligned}$$

(2.5)

that is, $G = \nabla \Omega $, as claimed. The notion of gradient of Definition 2.1 is natural for our setting in that it yields this relation. However, it may seem a bit awkward when applied to specific computations. Indeed, consider a function $X\mapsto f(X)$ on $\mathcal {S}^N$ that is specified by a formula that can be applied to all $N\times N$ matrices and in which the roles of $X_{kl}$ and $X_{lk}$ are the same for all l, k. For instance, such a formula is given by $f(X) = \sum _{ij} X_{ij}^2$. Then the usual matrix derivative of f, considered as a function on $N\times N$ matrices, is given by $\frac{\partial f}{\partial X_{ij}}(X) = 2 X_{ij}$, whereas, viewing f as a function on $\mathcal {S}^N$ and with notation as specified in Definition 2.1, we have $\nabla _{ij} f (X) = 4 X_{ij}$. More generally in this situation we have $\nabla _{ij} = 2 \frac{\partial }{\partial X_{ij}}$. Since formulas like this arise from the bold diagrammatic expansion (as discussed in the accompanying paper), it is convenient to establish the following definition.

Definition 2.2

For a differentiable function $f:\mathcal {S}^N\rightarrow \mathbb {R}$, define the matrix derivative $\frac{\partial f}{\partial X} :\mathcal {S}^N\rightarrow \mathcal {S}^N$ by

$$\begin{aligned} \frac{\partial f}{\partial X_{ij}} = \frac{1}{2} \nabla _{ij} f. \end{aligned}$$

Moreover, this notion of derivative will yield the relation

$$\begin{aligned} \Sigma [G] = \frac{\partial \Phi }{\partial G}, \end{aligned}$$

where $\Sigma $ is the self-energy and $\Phi $ is the LW functional, as was foreshadowed in the accompanying paper.
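The two derivative conventions of Definitions 2.1 and 2.2 can be checked numerically on the example $f(X) = \sum _{ij} X_{ij}^2$ from the text. The following is a minimal sketch; the function names and the use of a central difference (in place of the one-sided limit of Definition 2.1) are illustrative assumptions.

```python
def f(X):
    """Example from the text: f(X) = sum over i, j of X_ij^2."""
    return sum(entry * entry for row in X for entry in row)

def basis(i, j, n):
    """E^(ij) of Definition 2.1, with entries delta_ik*delta_jl + delta_il*delta_jk."""
    E = [[0.0] * n for _ in range(n)]
    E[i][j] += 1.0
    E[j][i] += 1.0  # for i == j this gives 2 on the diagonal, as in the definition
    return E

def shifted(X, E, t):
    """Return X + t * E as a new nested list."""
    n = len(X)
    return [[X[k][l] + t * E[k][l] for l in range(n)] for k in range(n)]

def symmetric_gradient(func, X, delta=1e-5):
    """Central-difference approximation of nabla_ij f along the directions E^(ij)."""
    n = len(X)
    grad = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            E = basis(i, j, n)
            up = func(shifted(X, E, delta))
            down = func(shifted(X, E, -delta))
            grad[i][j] = (up - down) / (2.0 * delta)
    return grad

X = [[1.0, 2.0], [2.0, -3.0]]  # an arbitrary symmetric test matrix
nabla = symmetric_gradient(f, X)
matrix_derivative = [[0.5 * g for g in row] for row in nabla]  # Definition 2.2

print(nabla)              # close to 4 * X
print(matrix_derivative)  # close to 2 * X
```

The factor of two between the two outputs is exactly the relation $\nabla _{ij} = 2 \frac{\partial }{\partial X_{ij}}$, including on the diagonal, where $E^{(ii)}$ has the entry 2.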

2.2 Interaction Growth Conditions

Note that $\mathrm {dom}\,\Omega $ depends on the shape of U(x). For example, if $U(x)=0$, then $\mathrm {dom}\,\Omega =\mathcal {S}^{N}_{++}$. If $U(x)=\sum _{i=1}^{N} x_{i}^4$, then $\mathrm {dom}\,\Omega =\mathcal {S}^{N}$. Our most basic condition on U is the following:

Definition 2.3

(Weak growth condition) A measurable function $U:{\mathbb {R}^{N}}\rightarrow \mathbb {R}$ satisfies the weak growth condition, if there exists a constant $C_U$ such that $U(x) + C_U( 1 + \Vert x\Vert ^2) \geqq 0$ for all $x\in {\mathbb {R}^{N}}$, and $\mathrm {dom}\,\Omega $ is an open set.

The weak growth condition of Definition 2.3 specifies that U cannot decay to $-\infty $ faster than quadratically, which ensures in particular that $\mathrm {dom}\,\Omega $ is non-empty. The assumption that $\mathrm {dom}\,\Omega $ is an open set (that is, $\mathrm {dom}\,\Omega = \mathrm {int}\,\mathrm {dom}\,\Omega $) will be used later to ensure that for fixed U there is a one-to-one correspondence between A and G (hence also between non-interacting and interacting Green’s functions) over suitable domains.

Note that the condition of Definition 2.3 is weaker than the condition

$$\begin{aligned} \frac{1}{2} x^{T} A x + U(x) \rightarrow +\infty , \quad \Vert x\Vert \rightarrow +\infty . \end{aligned}$$

(2.6)

For instance, if $N=2$ and $U(x)=x_{1}^4$, then the weak growth condition is satisfied with $C_{U}=0$, but Eq. (2.6) is not satisfied for all $A\in \mathcal {S}^N$. In fact, when U(x) only depends on a subset of components of $x\in {\mathbb {R}^{N}}$, we call the Gibbs model an impurity model or impurity problem, in analogy with the impurity models of quantum many-body physics [20], and we call the subset of components on which U depends the fragment. The flexibility of the weak growth condition will allow us to rigorously establish the LW formalism for the impurity model. In the setting of the impurity model, the ‘projection rule’ of Proposition 3.13 then allows us to understand the LW formalism of the impurity model in terms of the lower-dimensional LW formalism of the fragment and to prove a special sparsity pattern of the self-energy.

One of our main results (Theorem 3.18) is that the LW functional, which is initially defined on the set $\mathcal {S}^N_{++}$ of physical Green’s functions, can in fact be extended continuously to the boundary of $\mathcal {S}^N_{++}$, a fact which will not be apparent from the definition of the LW functional. (In fact, this extension shall be specified by an explicit formula involving lower-dimensional LW functionals.) However, in order for this result to hold, we need to strengthen the weak growth condition to the following:

Definition 2.4

(Strong growth condition) A measurable function $U:{\mathbb {R}^{N}}\rightarrow \mathbb {R}$ satisfies the strong growth condition if, for any $\alpha \in \mathbb {R}$, there exists a constant $b\in \mathbb {R}$ such that $U(x) + b \geqq \alpha \Vert x\Vert ^2$ for all $x\in {\mathbb {R}^{N}}$.

Note that the strong growth condition ensures that $\mathrm {dom}\,\Omega = \mathcal {S}^{N}$ and is hence an open set. If U is a polynomial function of x and satisfies the strong growth condition, then Eq. (2.6) will also be satisfied.

In Section 5 we will discuss the precise statement and proof of the aforementioned continuous extension property. In addition, a counterexample will be provided in the case where the weak growth condition holds but the strong growth condition does not. In fact, the continuous extension property is also valid for impurity models (which do not satisfy the strong growth condition) via the projection rule (Proposition 3.13), provided that the interaction satisfies the strong growth condition when restricted to the fragment.

For the generalized Coulomb interaction considered in the accompanying paper, that is,

$$\begin{aligned} U(x) = \frac{1}{8} \sum _{i,j=1}^{N} v_{ij} x_{i}^2 x_{j}^2, \end{aligned}$$

(2.7)

there is a natural condition on the matrix v that ensures that U satisfies the strong growth condition, namely that the matrix v is positive definite. We will simply assume that this holds whenever we refer to the generalized Coulomb interaction. To see that this assumption implies the strong growth condition, first note that $v \succ 0$ guarantees in particular that U is a nonnegative polynomial, strictly positive away from $x=0$. Since U is homogeneous quartic, it follows that $U\geqq C^{-1}\Vert x\Vert ^{4}$ for some constant C sufficiently large, which evidently implies the strong growth condition. Another sufficient assumption is that the entries of v are nonnegative and moreover that the diagonal entries are strictly positive.
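For the case $v \succ 0$, the homogeneity argument can be made quantitative. The following sketch gives one admissible, not necessarily optimal, choice of constants; the symbol $\lambda _{\min }(v)$ for the smallest eigenvalue of v and the auxiliary vector y are introduced here only for illustration. Writing $y_i = x_i^2$, we have

$$\begin{aligned} U(x) = \frac{1}{8}\, y^T v\, y \geqq \frac{\lambda _{\min }(v)}{8} \sum _{i=1}^{N} x_i^4 \geqq \frac{\lambda _{\min }(v)}{8N} \Vert x\Vert ^4, \end{aligned}$$

where the last step uses the Cauchy–Schwarz inequality $\left( \sum _i x_i^2 \right) ^2 \leqq N \sum _i x_i^4$. Hence one may take $C = 8N/\lambda _{\min }(v)$. Given $\alpha > 0$ and setting $t = \Vert x\Vert ^2$, the quadratic $C^{-1} t^2 - \alpha t + b$ is nonnegative for all t as soon as its discriminant is nonpositive, that is,

$$\begin{aligned} b \geqq \frac{C \alpha ^2}{4} \quad \Longrightarrow \quad U(x) + b \geqq \alpha \Vert x\Vert ^2 \ \ \text {for all } x\in {\mathbb {R}^{N}}, \end{aligned}$$

while for $\alpha \leqq 0$ the choice $b = 0$ suffices because $U \geqq 0$.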

Our interest in diagrammatic expansions leads us to adopt a further condition on the interaction. To see why this is necessary, recall from the accompanying paper that the perturbation about a non-interacting theory ($U \equiv 0$) involves integrals such as

$$\begin{aligned} \int U(x)\, e^{-\frac{1}{2} x^T A x}\, \,\mathrm {d}x, \end{aligned}$$

which is clearly undefined if, for example, $U(x) = e^{x^4}$. In most applications of interest, U(x) is only of polynomial growth, but it is sufficient to assume growth that is at most exponential in the sense of Assumption 2.5, which is actually only needed in Section 4 for our consideration of the bold diagrammatic expansion.

Assumption 2.5

(At-most-exponential growth) In this section, we assume that there exist constants $B,C > 0$ such that $\vert U(x) \vert \leqq B e^{C \Vert x\Vert }$ for all $ x\in {\mathbb {R}^{N}}$.

Further technical reasons for this assumption will become clear in Section 4.

2.3 Measures and Entropy: Notation and Facts

Let $\mathcal {M}$ be the space of probability measures on ${\mathbb {R}^{N}}$ (equipped with the Borel $\sigma $-algebra), let $\mathcal {M}_2 \subset \mathcal {M}$ be the subset of probability measures with moments up to second order, and let $\lambda $ denote the Lebesgue measure on ${\mathbb {R}^{N}}$. For notational convenience we define a mapping that takes the second-order moments of a probability measure:

Definition 2.6

Define $\mathcal {G} :\mathcal {M}_2 \rightarrow \mathcal {S}_{+}^{N}$ by $\mathcal {G}(\mu )=\int xx^T\,\,\mathrm {d}\mu $. Writing $\mathcal {G}=(\mathcal {G}_{ij})$, we equivalently have $\mathcal {G}_{ij}(\mu )=\int x_{i}x_{j}\,\,\mathrm {d}\mu $.

Therefore if $\mu $ is defined via a density

$$\begin{aligned} \,\mathrm {d}\mu = \rho (x) \,\mathrm {d}x, \ \ \mathrm {where}\ \ \rho (x) = \frac{1}{Z[A]} e^{-\frac{1}{2}x^{T}Ax-U(x)}, \end{aligned}$$

then $\mathcal {G}(\mu ) = G[A]$.

We also denote by

$$\begin{aligned} \mathrm {Cov}(\mu )=\int x x^T \,\mathrm {d}\mu - \left( \int x\,\,\mathrm {d}\mu \right) \left( \int x\,\,\mathrm {d}\mu \right) ^T \end{aligned}$$

the covariance matrix of $\mu $.

For $\mu \in \mathcal {M}$, let H denote the (differential) entropy

$$\begin{aligned} H(\mu )={\left\{ \begin{array}{ll} -\int \log \frac{\,\mathrm {d}\mu }{\,\mathrm {d}\lambda }\,\,\mathrm {d}\mu , &{} \mu \ll \lambda \\ -\infty , &{} \mathrm {otherwise} \end{array}\right. } \end{aligned}$$

(2.8)

where $\frac{\,\mathrm {d}\mu }{\,\mathrm {d}\lambda }$ denotes the Radon-Nikodym derivative (that is, the probability density function of $\mu $ with respect to the Lebesgue measure $\lambda $) whenever $\mu \ll \lambda $ (that is, whenever $\mu $ is absolutely continuous with respect to the Lebesgue measure). We will often refer to the differential entropy as the entropy for convenience.

For $\mu ,\nu \in \mathcal {M}$, define the relative entropy $H_\nu (\mu )$ via

$$\begin{aligned} H_\nu (\mu ) = {\left\{ \begin{array}{ll} -\int \log \frac{\,\mathrm {d}\mu }{\,\mathrm {d}\nu }\,\,\mathrm {d}\mu , &{} \mu \ll \nu \\ -\infty , &{} \mathrm {otherwise}. \end{array}\right. } \end{aligned}$$

(2.9)

Note carefully the sign convention.^{Footnote 2} The integral in (2.9) is well-defined with values in $\mathbb {R}\cup \{-\infty \}$ for all $\mu ,\nu \in \mathcal {M}$.

We now record some useful properties of the relative entropy.

Fact 2.7

For fixed $\nu \in \mathcal {M}$, $H_\nu $ is non-positive and strictly concave on $\mathcal {M}$, and $H_\nu (\mu ) = 0$ if and only if $\mu = \nu $. Moreover $H_\nu $ is upper semi-continuous with respect to the topology of weak convergence; that is, if the sequence $\mu _k \in \mathcal {M}$ converges weakly to $\mu \in \mathcal {M}$, then $\limsup _{k\rightarrow \infty } H_{\nu }(\mu _k) \leqq H_{\nu }(\mu )$.

Proof

For proofs see [23].

By contrast to the relative entropy, the differential entropy suffers from two analytical nuisances.

First, in the definition of the entropy in (2.8), the entropy may actually fail to be defined for some measures (which simultaneously concentrate too much in some area and fail to decay fast enough at infinity, so the negative and positive parts of the integral are $-\infty $ and $+\infty $, respectively, and the Lebesgue integral is ill-defined). However, Lemma 2.8 states that when we restrict to $\mathcal {M}_2$, the integral cannot have an infinite positive part and is well-defined.

Lemma 2.8

For $\mu \in \mathcal {M}_2$, if $\mu \ll \lambda $, then the integral in (2.8) exists (in particular, the positive part of the integrand has finite integral) and moreover

$$\begin{aligned} H(\mu ) \leqq \frac{1}{2} \log \left( (2\pi e)^N \det \mathrm {Cov}(\mu ) \right) \leqq \frac{1}{2} \log \left( (2\pi e)^N \det \mathcal {G}(\mu ) \right) , \end{aligned}$$

with possibly $H(\mu ) = -\infty $. The first inequality is satisfied with equality if and only if $\mu $ is a Gaussian measure with a positive definite covariance matrix. The second inequality is satisfied with equality if and only if $\mu $ has mean zero.

Note that Lemma 2.8 also entails a useful bound on the entropy in terms of the second moments, as well as the classical fact that Gaussian measures are the measures of maximal entropy subject to second-order moment constraints.
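The two inequalities of Lemma 2.8 can be illustrated for $N=1$ with closed-form quantities. The sketch below (illustrative only; the function names and test parameters are assumptions) uses a uniform measure with nonzero mean, for which both inequalities are strict, and a centered Gaussian, whose entropy is computed by quadrature and should match the bound.

```python
import math

def gaussian_bound(s):
    """N = 1 form of the bounds in Lemma 2.8: 0.5 * log(2*pi*e*s)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * s)

def uniform_stats(center, half_width):
    """Entropy, variance and second moment of the uniform law on [c - w, c + w]."""
    entropy = math.log(2.0 * half_width)
    variance = half_width ** 2 / 3.0
    second_moment = variance + center ** 2
    return entropy, variance, second_moment

H_unif, var_unif, G_unif = uniform_stats(center=1.0, half_width=1.0)
bound_cov = gaussian_bound(var_unif)  # bound in terms of Cov(mu)
bound_G = gaussian_bound(G_unif)      # bound in terms of the second moment G(mu)

def gaussian_entropy_numeric(s2, lo=-12.0, hi=12.0, n=24000):
    """Differential entropy of N(0, s2) by the midpoint rule."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        p = math.exp(-0.5 * x * x / s2) / math.sqrt(2.0 * math.pi * s2)
        if p > 0.0:  # guard against log(0) if the density underflows
            total -= p * math.log(p) * h
    return total

H_gauss = gaussian_entropy_numeric(0.7)

print(H_unif, bound_cov, bound_G)    # strictly increasing for this example
print(H_gauss, gaussian_bound(0.7))  # approximately equal
```

The gap between `bound_cov` and `bound_G` comes entirely from the nonzero mean of the uniform measure, in line with the equality condition for the second inequality.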

The second analytical nuisance of the differential entropy is that we do not have the same semi-continuity guarantee as we have for the relative entropy in Fact 2.7. However, control on second moments allows a semi-continuity result that will suffice for our purposes.

Lemma 2.9

Assume that $\mu _j \in \mathcal {M}_2$ weakly converge to $\mu \in \mathcal {M}$, and that there exists a constant C such that $\mathcal {G}(\mu _j) \preceq C\cdot I_N$ for all j. Then $\limsup _{j\rightarrow \infty } H(\mu _j) \leqq H(\mu )$.

Remark 2.10

In other words, the entropy is upper semi-continuous with respect to the topology of weak convergence on any subset of probability measures with uniformly bounded second moments. The subtle difference between the statements in Fact 2.7 and Lemma 2.9 is due to the fact that the Lebesgue measure $\lambda \notin \mathcal {M}$.

The proofs of Lemmas 2.8 and 2.9 are given in “Appendix C”.

Finally we record the classical fact that subject to marginal constraints, the entropy is maximized by a product measure. In the statement and throughout the paper, ‘$\#$’ denotes the pushforward operation on measures.

Fact 2.11

Suppose $p < N$ and let $\pi _1:{\mathbb {R}^{N}}\rightarrow \mathbb {R}^{p}$ and $\pi _2:{\mathbb {R}^{N}}\rightarrow \mathbb {R}^{N-p}$ be the projections onto the first p and last $N-p$ components, respectively. Then for $\mu \in \mathcal {M}_2$, $H(\mu ) \leqq H(\pi _1 \# \mu ) + H(\pi _2 \# \mu )$.

Remark 2.12

Note that $\pi _1 \# \mu $ and $\pi _2 \# \mu $ are the marginal distributions of $\mu $ with respect to the product structure ${\mathbb {R}^{N}}= \mathbb {R}^p \times \mathbb {R}^{N-p}$.

See “Appendix C” for a short proof.
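For Gaussian measures, Fact 2.11 reduces to a determinant inequality that is easy to verify. The following minimal sketch treats $N=2$, $p=1$ with a centered Gaussian of covariance entries `s11`, `s12`, `s22` (names and values chosen here purely for illustration), using the standard closed form for the Gaussian entropy.

```python
import math

def gaussian_entropy_1d(var):
    """Entropy of a one-dimensional Gaussian with variance var."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

def gaussian_entropy_2d(s11, s12, s22):
    """Entropy of a two-dimensional Gaussian with covariance [[s11, s12], [s12, s22]]."""
    det = s11 * s22 - s12 * s12
    return 0.5 * math.log((2.0 * math.pi * math.e) ** 2 * det)

s11, s12, s22 = 2.0, 0.9, 1.0  # a positive definite covariance
joint = gaussian_entropy_2d(s11, s12, s22)
marginals = gaussian_entropy_1d(s11) + gaussian_entropy_1d(s22)
gap = marginals - joint        # nonnegative by Fact 2.11

rho_sq = s12 * s12 / (s11 * s22)  # squared correlation coefficient
print(gap, -0.5 * math.log(1.0 - rho_sq))

# With s12 = 0 the Gaussian is a product measure and the gap vanishes.
gap_product = (gaussian_entropy_1d(s11) + gaussian_entropy_1d(s22)
               - gaussian_entropy_2d(s11, 0.0, s22))
print(gap_product)
```

In this Gaussian case the inequality is equivalent to $\det \mathrm {Cov}(\mu ) \leqq \mathrm {Cov}(\mu )_{11}\, \mathrm {Cov}(\mu )_{22}$, with equality exactly when the two components are uncorrelated.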

3 Luttinger–Ward Formalism

This section is organized as follows. In Section 3.1, we provide a variational expression for the free energy via the classical Gibbs variational principle. For fixed U, this allows us to identify the Legendre dual of $\Omega [A]$, denoted by $\mathcal {F}[G]$, and to establish a bijection between A and the interacting Green’s function G. In Section 3.2, we define the Luttinger–Ward functional and show that the Dyson equation can be naturally derived by considering the first-order optimality condition associated to the minimization problem in the variational expression for the free energy. Then we prove that the LW functional satisfies a number of desirable properties. First, in Section 3.3 we prove the transformation rule, which relates a change of the coordinates of the interaction with an appropriate transformation of the Green’s function. The transformation rule leads to the projection rule in Section 3.4, which implies the sparsity pattern of the self-energy for the impurity problem. Up until this point we assume only that U satisfy the weak growth condition. Then in Section 3.5 we motivate and state our result that the LW functional is continuous up to the boundary of $\mathcal {S}^N_{++}$, for which we need the assumption that U satisfies the strong growth condition. The proof (as well as a counterexample demonstrating that weak growth is not sufficient) is deferred to Section 5. Throughout we defer the proofs of some technical lemmas to “Appendix C”. Moreover we will invoke the language of convex analysis following Rockafellar [25] and Rockafellar and Wets [26]. See “Appendix A” for further background and details.

3.1 Variational Formulation of the Free Energy

The main result in this subsection is given by Theorem 3.1.

Theorem 3.1

(Variational structure) For U satisfying the weak growth condition, the free energy can be expressed variationally via the constrained minimization problem

$$\begin{aligned} \Omega [A]=\inf _{G\in \mathcal {S}^{N}_{+}}\left( \frac{1}{2}\mathrm {Tr}[AG]-\mathcal {F}[G]\right) , \end{aligned}$$

(3.1)

where

$$\begin{aligned} \mathcal {F}[G]:=\sup _{\mu \in \mathcal {G}^{-1}(G)}\left[ H(\mu )-\int U\,\,\mathrm {d}\mu \right] \end{aligned}$$

(3.2)

is the concave conjugate of $\Omega [A]$ with respect to the inner product $\langle A,G\rangle = \frac{1}{2}\mathrm {Tr}[AG]$. (Note that by convention $\mathcal {F}[G] = -\infty $ whenever $\mathcal {G}^{-1}(G)$ is empty, that is, whenever $G\in \mathcal {S}^{N}\backslash \mathcal {S}_{+}^{N}$.) Moreover $\Omega $ and $\mathcal {F}$ are smooth and strictly concave on their respective domains $\mathrm {dom}\,\Omega $ and $\mathcal {S}^N_{++}$. The mapping $G[A]:=\nabla \Omega [A]$ is a bijection $\mathrm {dom}\,\Omega \rightarrow \mathcal {S}^{N}_{++}$, with inverse given by $A[G]:=\nabla \mathcal {F}[G]$.

We first record some technical properties of $\Omega $ in Lemma 3.2.

Lemma 3.2

$\Omega $ is an upper semi-continuous, proper (hence closed) concave function. Moreover, $\Omega $ is strictly concave and $C^{\infty }$-smooth on $\mathrm {dom}\,\Omega $.

Remark 3.3

Recall that a function f on a metric space X is upper semi-continuous if for any sequence $x_k \in X$ converging to x, we have $\limsup _{k\rightarrow \infty } f(x_k) \leqq f(x)$.

We now turn to exploring the concave (or Legendre-Fenchel) duality associated to $\Omega $. The following lemma, a version of the classical Gibbs variational principle [23] (alternatively known as the Donsker-Varadhan variational principle [12]), is the first step toward identifying the dual of $\Omega $.

Lemma 3.4

For any $A\in \mathcal {S}^{N}$,

$$\begin{aligned} \Omega [A]=\inf _{\mu \in \mathcal {M}_2}\left[ \int \left( \frac{1}{2}x^{T}Ax+U(x)\right) \,\,\mathrm {d}\mu (x)-H(\mu )\right] . \end{aligned}$$

(3.3)

If $A\in \mathrm {dom}\,\Omega $, the infimum is uniquely attained at $\,\mathrm {d}\mu (x)=\frac{1}{Z[A]}e^{-\frac{1}{2}x^{T}Ax-U(x)}\,\,\mathrm {d}x$.
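Lemma 3.4 can be checked numerically in one dimension. The sketch below is only an illustration under stated assumptions: the values $A=1$ and $U(x)=x^4/8$, the truncated grid, and the helper names are hypothetical choices, not taken from the text. It evaluates the bracketed functional in (3.3) at the Gibbs density and at a few centered Gaussian trial densities.

```python
import numpy as np

# Hypothetical 1D setup: A = 1 and U(x) = x^4 / 8 are illustrative choices.
A = 1.0
x = np.linspace(-6.0, 6.0, 6001)
dx = x[1] - x[0]
V = 0.5 * A * x**2 + x**4 / 8.0          # (1/2) x^T A x + U(x)

# Free energy Omega[A] = -log Z[A], with Z[A] computed by a Riemann sum.
Z = np.sum(np.exp(-V)) * dx
omega = -np.log(Z)

def gibbs_functional(rho, log_rho):
    """Evaluate int V dmu - H(mu) for a density rho, with H(mu) = -int rho log rho."""
    return np.sum(rho * (V + log_rho)) * dx

# Gibbs density: the minimizer asserted by Lemma 3.4.
log_rho_gibbs = -V - np.log(Z)
value_gibbs = gibbs_functional(np.exp(log_rho_gibbs), log_rho_gibbs)

def gaussian_value(s):
    """Value of the functional at the centered Gaussian with variance s."""
    log_rho = -x**2 / (2.0 * s) - 0.5 * np.log(2.0 * np.pi * s)
    return gibbs_functional(np.exp(log_rho), log_rho)

trial_values = [gaussian_value(s) for s in (0.3, 0.6, 1.0)]
```

In this sketch the value at the Gibbs density reproduces $-\log Z[A]$, while each Gaussian trial density gives a strictly larger value; the gap is the relative entropy of the trial density with respect to the Gibbs measure.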

Remark 3.5

One might wonder whether the infimum in (3.3) can be taken over all of $\mathcal {M}$. Note that if $\mu $ does not have a second moment, it is possible to have both $H(\mu ) = +\infty $ and $\int \left( \frac{1}{2}x^{T}Ax+U(x)\right) \,\,\mathrm {d}\mu (x) = +\infty $, so the expression in brackets is of the indeterminate form $\infty - \infty $. The restriction to $\mu \in \mathcal {M}_2$ takes care of this problem because Lemma 2.8 guarantees that $H(\mu ) < +\infty $, and by the weak growth condition, the other term in the infimum must be either finite or $+\infty $. Moreover, $\mathcal {M}_2$ is still large enough to contain the minimizer, and restricting our attention to measures with finite second-order moments will be convenient in later developments.

From the previous lemma we can split up the infimum in (3.3) and obtain

$$\begin{aligned} \Omega [A]=\inf _{G\in \mathcal {S}_{+}^{N}}\inf _{\mu \in \mathcal {G}^{-1}(G)}\left[ \int \left( \frac{1}{2}x^{T}Ax+U(x)\right) \,\,\mathrm {d}\mu (x) -H(\mu )\right] . \end{aligned}$$

Since $\int x^{T}Ax\,\,\mathrm {d}\mu =\mathrm {Tr}[\mathcal {G}(\mu )A]$, it follows that

$$\begin{aligned} \Omega [A]=\inf _{G\in \mathcal {S}_{+}^{N}}\left( \frac{1}{2}\mathrm {Tr}[AG]+\inf _{\mu \in \mathcal {G}^{-1}(G)}\left[ \int U\,\,\mathrm {d}\mu -H(\mu )\right] \right) . \end{aligned}$$

This proves Eq. (3.1) of Theorem 3.1 using the definition of $\mathcal {F}[G]$ in Eq. (3.2).

Remark 3.6

From the perspective of large deviations theory, we comment that the construction of $\mathcal {F}$ from the entropy may be recognized by analogy to the contraction principle [23]. Indeed, the expression $\int U\,\,\mathrm {d}\mu - H(\mu )$ is equal (modulo a constant offset) to $-H_{\nu _U}(\mu )$, where $\nu _U$ is the measure with density proportional to $e^{-U}$. If one considers i.i.d. sampling from the probability measure $\nu _U$, by Sanov’s theorem $-H_{\nu _U}$ is the corresponding large deviations rate function for the empirical measure. The rate function for the second-order moment matrix (that is, $-\mathcal {F}$, modulo constant offset) is obtained via the contraction principle applied to the mapping $\mu \mapsto \mathcal {G}(\mu )$. This is analogous to the procedure by which one obtains Cramér’s theorem from Sanov’s theorem via application of the contraction principle to a map that maps $\mu $ to its mean [23].

Now we record some technical facts about $\mathcal {F}$ in Lemma 3.7, which demonstrates in particular that $\mathcal {F}$ diverges (at least) logarithmically at the boundary $\partial \mathcal {S}_{+}^{N} = \mathcal {S}_{+}^{N} \backslash \mathcal {S}_{++}^{N}$.

Lemma 3.7

$\mathcal {F}$ is finite on $\mathcal {S}^N_{++}$ and $-\infty $ elsewhere. Moreover,

$$\begin{aligned} \mathcal {F}[G] \leqq \frac{1}{2}\log \left[ (2\pi e)^N \det G \right] + C_U (1+\mathrm {Tr}\,G) \end{aligned}$$

for all $G \in \mathcal {S}^N_{++}$.

Define

$$\begin{aligned} \Psi [\mu ]:=H(\mu )-\int U\,\,\mathrm {d}\mu , \end{aligned}$$

so $\mathcal {F}[G]=\sup _{\mu \in \mathcal {G}^{-1}(G)}\Psi [\mu ]$. By the concavity of the entropy, $\Psi $ is concave on $\mathcal {M}_2$. Thus, given G, we can in principle solve a concave maximization problem over $\mu \in \mathcal {M}$ to find $\mathcal {F}[G]$, with the linear constraint $\mu \in \mathcal {G}^{-1}(G)$. Moreover, this variational representation of $\mathcal {F}$ in terms of the concave function $\Psi $ is enough to establish the concavity of $\mathcal {F}$ by abstract considerations. This and other properties of $\mathcal {F}$ are collected in the following.

Lemma 3.8

$\mathcal {F}$ is an upper semi-continuous, proper (hence closed) concave function on $\mathcal {S}^{N}$.

Now Eq. (3.1) states precisely that $\Omega $ is the concave conjugate of $\mathcal {F}$ with respect to the inner product $\langle A,G\rangle = \frac{1}{2}\mathrm {Tr}[AG]$, and accordingly we write $\Omega =\mathcal {F}^{*}$. Since $\mathcal {F}$ is concave and closed, we have by Theorem A.14 that $\mathcal {F}=\mathcal {F}^{**}=\Omega ^{*}$, that is, $\mathcal {F}$ and $\Omega $ are concave duals of one another. Thus we expect that $\nabla \mathcal {F}$ and $\nabla \Omega $ are inverses of one another, but to make sense of this claim we need to establish the differentiability of $\mathcal {F}$. We collect this and other desirable properties of $\mathcal {F}$ in the following:

Lemma 3.9

$\mathcal {F}$ is $C^\infty $-smooth and strictly concave on $\mathcal {S}^N_{++}$.

Then Theorem A.15 guarantees that $\nabla \Omega $ is a bijection from $\mathrm {dom}\,\Omega \rightarrow \mathcal {S}^N_{++}$ with its inverse given by $\nabla \mathcal {F}$. This completes the proof of Theorem 3.1.

Finally, following Lemma 3.4, together with the splitting of (3.3) and the $A\leftrightarrow G$ correspondence of Theorem 3.1, we observe that the supremum in (3.2) is attained uniquely at the measure $\,\mathrm {d}\mu := \frac{1}{Z[A[G]]} e^{-\frac{1}{2}x^{T}A[G]x-U(x)} \,\mathrm {d}x$.

3.2 The Luttinger–Ward Functional and the Dyson Equation

According to Lemma 3.7, $\mathcal {F}$ should blow up at least logarithmically as G approaches the boundary of $\mathcal {S}_{++}^{N}$. Remarkably, we can explicitly separate the part that accounts for the blowup of $\mathcal {F}$ at the boundary. In fact, subtracting away this part is how we define the Luttinger–Ward (LW) functional for the Gibbs model. We will see in this subsection that the definition of the Luttinger–Ward functional can also be motivated by the stipulation that its gradient (the self-energy) should satisfy the Dyson equation.

Consider for a moment the case in which $U\equiv 0$, so

$$\begin{aligned} \mathcal {F}[G]=\sup _{\mu \in \mathcal {G}^{-1}(G)}\left[ H(\mu )-\int U\,\,\mathrm {d}\mu \right] =\sup _{\mu \in \mathcal {G}^{-1}(G)}H(\mu ). \end{aligned}$$

The random variable X achieving the maximum entropy subject to $\mathbb {E}[X_{i}X_{j}]=G_{ij}$ follows a Gaussian distribution, that is, $X\sim \mathcal {N}(0,G)$. It follows that

$$\begin{aligned} \mathcal {F}[G]=\frac{1}{2}\log \left( (2\pi e)^{N}\det G\right) =\frac{1}{2}\mathrm {Tr}[\log (G)]+\frac{N}{2}\log (2\pi e). \end{aligned}$$

This motivates, for general U, the consideration of the Luttinger–Ward functional

$$\begin{aligned} \Phi [G]:=2\mathcal {F}[G]-\mathrm {Tr}[\log (G)]-N\log (2\pi e). \end{aligned}$$

(3.4)

For non-interacting systems, $\Phi [G]\equiv 0$ by construction.
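The maximum-entropy statement behind this construction can be illustrated with closed-form entropies in one dimension. The sketch below assumes $N=1$ and a hypothetical value $G=1.7$; the Laplace and uniform laws are comparison distributions chosen only for illustration. All three laws are centered and share the second moment $G$.

```python
import math

G = 1.7  # hypothetical second moment E[X^2], chosen only for illustration

# Closed-form differential entropies of three centered laws with E[X^2] = G.
h_gauss = 0.5 * math.log(2.0 * math.pi * math.e * G)   # N(0, G)
b = math.sqrt(G / 2.0)                                  # Laplace scale b: variance 2 b^2 = G
h_laplace = 1.0 + math.log(2.0 * b)
a = math.sqrt(3.0 * G)                                  # uniform on [-a, a]: variance a^2 / 3 = G
h_uniform = math.log(2.0 * a)

# With U = 0 and N = 1, F[G] is the Gaussian entropy, so Phi[G] from (3.4) vanishes.
F = h_gauss
Phi = 2.0 * F - math.log(G) - math.log(2.0 * math.pi * math.e)
```

The Gaussian entropy dominates the other two, and inserting it into (3.4) gives $\Phi [G]=0$ up to rounding.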

Now we turn to establishing the Dyson equation. Theorem 3.1 shows that for $A\in \mathrm {dom}\,\Omega $, the minimizer $G^{*}$ in (3.1) satisfies $A=\nabla \mathcal {F}[G^{*}]=A[G^{*}]$, so the minimizer is $G^{*}=G[A]$. Recall that

$$\begin{aligned} \mathcal {F}[G] = \frac{1}{2}\mathrm {Tr}[\log (G)] + \frac{1}{2}\Phi [G] +\frac{1}{2}N \log (2\pi e). \end{aligned}$$

Taking gradients and plugging into $A=\nabla \mathcal {F}[G^*]$ yields

$$\begin{aligned} 0=A-(G^{*})^{-1}- \frac{1}{2} \nabla \Phi [G^{*}]. \end{aligned}$$

Define the self-energy $\Sigma $ as a functional of G by $\Sigma [G]:= \frac{1}{2} \nabla \Phi [G] = \frac{\partial \Phi }{\partial G} [G]$. Then we have established that for $G=G[A]$,

$$\begin{aligned} G^{-1}=A-\Sigma [G]. \end{aligned}$$

(3.5)

Moreover, by the strict concavity of $\mathcal {F}$, $G=G[A]$ is the unique G solving (3.5).
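The relation (3.5) can be tested numerically in one dimension, where $\Sigma [G]=\partial \Phi /\partial G$ is an ordinary derivative. The following is a minimal sketch under stated assumptions: the interaction $U(x)=x^4/8$, the grid, the bisection bracket, and all function names are hypothetical choices made for illustration. It evaluates $\Phi $ through $\mathcal {F}[G]=\frac{1}{2}A[G]\,G-\Omega [A[G]]$ and compares a finite-difference derivative of $\Phi $ with $A[G]-G^{-1}$.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]
U = x**4 / 8.0            # hypothetical quartic interaction (illustrative choice)

def partition_and_moment(A):
    """Return Z[A] and the second moment G[A], both by a Riemann sum on the grid."""
    w = np.exp(-0.5 * A * x**2 - U)
    Z = np.sum(w) * dx
    return Z, np.sum(x**2 * w) * dx / Z

def A_of_G(G, lo=-5.0, hi=50.0):
    """Invert the decreasing map A -> G[A] by bisection on an assumed bracket."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if partition_and_moment(mid)[1] > G:
            lo = mid      # second moment too large, so A must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

def Phi(G):
    """1D LW functional via F[G] = (1/2) A G - Omega[A] evaluated at A = A[G]."""
    A = A_of_G(G)
    Z, _ = partition_and_moment(A)
    F = 0.5 * A * G + np.log(Z)          # uses Omega[A] = -log Z[A]
    return 2.0 * F - np.log(G) - np.log(2.0 * np.pi * np.e)

G = 0.5
h = 1e-4
sigma_fd = (Phi(G + h) - Phi(G - h)) / (2.0 * h)   # central difference for dPhi/dG
sigma_dyson = A_of_G(G) - 1.0 / G                   # Sigma read off from G^{-1} = A - Sigma
```

The two values of the self-energy agree to within the finite-difference error, which is what (3.5) predicts.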

Eq. (3.5) is in fact the Dyson equation as in Section 3.8 of the accompanying paper. To see this, recall from Eq. (2.4) that the non-interacting Green’s function $G^{0}$ is given by $G^{0}=A^{-1}$, so we have

$$\begin{aligned} G^{-1}=(G^{0})^{-1}-\Sigma [G]. \end{aligned}$$

Left- and right-multiplying by $G^{0}$ and G, respectively, and then rearranging, we obtain

$$\begin{aligned} G=G^{0}+G^{0}\Sigma [G]G. \end{aligned}$$

However, Eq. (2.4) requires $G^{0}$ to be well defined, that is, $A\in \mathcal {S}_{++}^{N}$. On the other hand, the Dyson equation (3.5) derived from the LW functional does not rely on this assumption and makes sense for all $A\in \mathrm {dom}\,\Omega $. Nonetheless, if for fixed A one seeks to approximately solve the Dyson equation for G by inserting an ansatz for the self-energy obtained from many-body perturbation theory, one must be wary in the case that $A \notin \mathcal {S}^N_{++}$; see Section 4.5.

3.3 Transformation Rule for the LW Functional

Though the dependence of the Luttinger–Ward functional on the interaction U was only implicit in the previous section, we now explicitly consider this dependence, including it in our notation as $\Phi [G,U]$. The same convention will be followed for other functionals without comment. Proposition 3.10 relates a transformation of the interaction with a corresponding transformation of the Green’s function.

Proposition 3.10

(Transformation rule) Let $G\in \mathcal {S}_{++}^{N}$, and let U be an interaction satisfying the weak growth condition. Let T denote an invertible matrix in $\mathbb {R}^{N\times N}$, as well as the corresponding linear transformation ${\mathbb {R}^{N}}\rightarrow {\mathbb {R}^{N}}$. Then

$$\begin{aligned} \Phi [TGT^{*},U]=\Phi [G,U\circ T]. \end{aligned}$$

Proof

For $G\in \mathcal {S}^N_{++}$, note that the supremum in (3.2) can be restricted to the set of $\mu \in \mathcal {G}^{-1}(G)$ that have densities with respect to the Lebesgue measure. (Indeed, for any $\mu \in \mathcal {M}_2$ that does not have a density, $H(\mu )-\int U\,\,\mathrm {d}\mu = -\infty $.) Then observe

$$\begin{aligned} \Phi [G,U]= & {} -N\log (2\pi e) - \log \det G + 2 \sup _{\mu \in \mathcal {G}^{-1}(G)}\left[ H(\mu )-\int U\,\,\mathrm {d}\mu \right] \\= & {} -N\log (2\pi e) - \log \det G - 2 \inf _{\{\rho \,:\, \rho \,\,\mathrm {d}x \in \mathcal {G}^{-1}(G)\}}\left[ \int \left( \log \rho + U \right) \,\rho \,\,\mathrm {d}x \right] \\= & {} -N\log (2\pi e) - 2 \inf _{\{\rho \,:\,\rho \,\,\mathrm {d}x\in \mathcal {G}^{-1}(G)\}}\left[ \int \left( \log \left[ (\det G)^{1/2} \rho \right] + U \right) \,\rho \,\,\mathrm {d}x \right] . \end{aligned}$$

Going forward we will denote $C:=-N\log (2\pi e)$.

Then for T invertible, we have

$$\begin{aligned} \Phi [TGT^*,U] = C - 2 \inf _{\rho \,\,\mathrm {d}x\in \mathcal {G}^{-1}(TGT^*)}\left[ \int \left( \log \left[ (\det G)^{1/2}\cdot \vert \det T\vert \cdot \rho \right] + U \right) \,\rho \,\,\mathrm {d}x \right] . \end{aligned}$$

Now observe by changing variables that

$$\begin{aligned} \left\{ \rho \,:\, \rho \ \,\mathrm {d}x \in \mathcal {G}^{-1}(TGT^*)\right\} = \left\{ \vert \det T\vert ^{-1}\cdot \rho \circ T^{-1} \,:\, \rho \ \,\mathrm {d}x \in \mathcal {G}^{-1}(G)\right\} . \end{aligned}$$

Therefore

$$\begin{aligned}&\Phi [TGT^*,U]\nonumber \\&\quad = C - 2 \inf _{\rho \,\,\mathrm {d}x\in \mathcal {G}^{-1}(G)}\left[ \vert \det T\vert ^{-1} \int \left( \log \left[ (\det G)^{1/2} \cdot \rho \circ T^{-1}\right] + U \right) \,\rho \circ T^{-1}\,\,\mathrm {d}x \right] \\= & {} C - 2 \inf _{\rho \,\,\mathrm {d}x\in \mathcal {G}^{-1}(G)}\left[ \int \left( \log \left[ (\det G)^{1/2} \cdot \rho \right] + U\circ T \right) \,\rho \,\,\mathrm {d}x \right] \\= & {} \Phi [G,U\circ T], \end{aligned}$$

as was to be shown.

Remark 3.11

Since T is real, the Hermitian conjugate $T^{*}$ is the same as the matrix transpose; the notation $T^{*}$ is used simply to avoid writing $T^{T}$.

From the transformation rule we have the following corollary:

Corollary 3.12

Let $G\in \mathcal {S}_{++}^{N}$, and consider an interaction U which is a homogeneous polynomial of degree 4 satisfying the weak growth condition. For $\lambda >0$, we have

$$\begin{aligned} \Phi [\lambda G,U]=\Phi [G,\lambda ^{2}U]. \end{aligned}$$
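The scaling identity of Corollary 3.12 can be checked numerically in one dimension. The sketch below is an illustration only: it assumes the hypothetical family $U_c(x)=c\,x^4/8$, a truncated grid, and a bisection bracket chosen by hand, and it computes $\Phi $ by numerically inverting $A\mapsto G[A]$.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 8001)
dx = x[1] - x[0]

def Phi(G, c):
    """1D LW functional for U(x) = c * x^4 / 8 (a hypothetical homogeneous quartic)."""
    U = c * x**4 / 8.0

    def Z_and_G(A):
        w = np.exp(-0.5 * A * x**2 - U)
        Z = np.sum(w) * dx
        return Z, np.sum(x**2 * w) * dx / Z

    lo, hi = -3.0, 60.0                   # assumed bracket for A[G]
    for _ in range(80):                   # bisection: G[A] is decreasing in A
        mid = 0.5 * (lo + hi)
        if Z_and_G(mid)[1] > G:
            lo = mid
        else:
            hi = mid
    A = 0.5 * (lo + hi)
    Z, _ = Z_and_G(A)
    F = 0.5 * A * G + np.log(Z)           # F[G] = (1/2) A G - Omega[A], Omega = -log Z
    return 2.0 * F - np.log(G) - np.log(2.0 * np.pi * np.e)

G, lam = 0.4, 2.0
lhs = Phi(lam * G, 1.0)      # Phi[lambda G, U]
rhs = Phi(G, lam**2)         # Phi[G, lambda^2 U]
free = Phi(G, 0.0)           # non-interacting case, expected to vanish
```

Here `lhs` and `rhs` agree up to quadrature and rounding error, while `Phi(G, 1.0)` differs from both, so the agreement is not a trivial consequence of the code path.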

3.4 Impurity Problems and the Projection Rule

For the impurity problem, the interaction only depends on a subset of the variables $x_1,\ldots ,x_N$, namely the fragment. In such a case, the Luttinger–Ward functional can be related to a lower-dimensional Luttinger–Ward functional corresponding to the fragment. This relation, called the projection rule, is given in Proposition 3.13 below. In the notation, we will now explicitly indicate the dimension d of the state space associated with the Luttinger–Ward functional via subscript as in $\Phi _{d}[G,U]$, since we will be considering functionals for state spaces of different dimensions. We will follow the same convention for other functionals without comment.

Before we state the projection rule, we record some remarks on the domain of $\Omega $ and growth conditions in the context of impurity problems. Suppose that the interaction U depends only on $x_1, \ldots , x_p$, where $p\leqq N$, so U can alternatively be considered as a function on $\mathbb {R}^p$. Notice that even if U satisfies the strong growth condition as a function on $\mathbb {R}^p$, it is of course not true that $\mathrm {dom}\left( \Omega _N[\,\cdot \,,U] \right) = \mathcal {S}^N$. As mentioned above, this provides a natural reason to consider interactions that do not grow fast in all directions and motivates the generality of our previous considerations.

In fact, for

$$\begin{aligned} A = \left( \begin{array}{cc} A_{11} &{} A_{12} \\ A_{12}^T &{} A_{22} \end{array}\right) , \end{aligned}$$

one can show by Fubini’s theorem, integrating out the last $N-p$ variables in (2.2), that $A\in \mathrm {dom}\left( \Omega _N[\,\cdot \,,U] \right) $ if and only if both

$$\begin{aligned} A_{22} \in \mathcal {S}^{N-p}_{++}\ \mathrm {and}\ A_{11} - A_{12} A_{22}^{-1} A_{12}^T \in \mathrm {dom}\,\left( \Omega _p[\,\cdot \,,U] \right) . \end{aligned}$$

Moreover, one can show that for such A,

$$\begin{aligned} \Omega _N[A,U] = \Omega _p \left[ A_{11} - A_{12} A_{22}^{-1} A_{12}^T,\ U \right] + \frac{1}{2}\log ( (2\pi )^{p-N} \det A_{22}). \end{aligned}$$

Therefore, if $\mathrm {dom}\,\left( \Omega _p[\,\cdot \,,U(\,\cdot \,,0)] \right) $ is open, then so is $\mathrm {dom}\left( \Omega _N[\,\cdot \,,U] \right) $. It follows that if U satisfies the weak growth condition as a function on $\mathbb {R}^p$, then U also satisfies the weak growth condition as a function on $\mathbb {R}^N$.
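The Schur complement formula for $\Omega _N[A,U]$ above can be sanity-checked in the special case $U\equiv 0$, where $Z[A]=(2\pi )^{N/2}(\det A)^{-1/2}$ is available in closed form. The sketch below covers only this non-interacting case; the sizes $N=5$, $p=2$, the random matrix, and the helper name are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 5, 2                      # hypothetical sizes for illustration
M = rng.standard_normal((N, N))
A = M @ M.T + N * np.eye(N)      # a random symmetric positive definite matrix

def omega_gaussian(B):
    """Omega[B] = -log Z[B] for U = 0, where Z[B] = (2 pi)^{n/2} det(B)^{-1/2}."""
    n = B.shape[0]
    return -0.5 * n * np.log(2.0 * np.pi) + 0.5 * np.linalg.slogdet(B)[1]

A11, A12, A22 = A[:p, :p], A[:p, p:], A[p:, p:]
schur = A11 - A12 @ np.linalg.solve(A22, A12.T)   # A11 - A12 A22^{-1} A12^T

lhs = omega_gaussian(A)
rhs = omega_gaussian(schur) + 0.5 * ((p - N) * np.log(2.0 * np.pi)
                                     + np.linalg.slogdet(A22)[1])
```

In this case the identity reduces to $\det A=\det A_{22}\,\det (A_{11}-A_{12}A_{22}^{-1}A_{12}^T)$, which the comparison of `lhs` and `rhs` confirms numerically.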

Proposition 3.13

(Projection rule) Let $p\leqq N$. Suppose that U depends only on $x_1,\ldots ,x_p$ and satisfies the weak growth condition. Hence we can think of U as a function on both ${\mathbb {R}^{N}}$ and $\mathbb {R}^p$. Then for $G\in \mathcal {S}^N_{++}$,

$$\begin{aligned} \Phi _{N}\left[ G,U\right] =\Phi _{p}\left[ G_{11},U\right] , \end{aligned}$$

where $G_{11}$ is the upper-left $p\times p$ block of G.

Remark 3.14

If U can be made to depend only on $p\leqq N$ variables by linearly changing variables, then we can use the projection rule in combination with the transformation rule (Proposition 3.10) to reveal the relationship with a lower-dimensional Luttinger–Ward functional, though we do not make this explicit here with a formula.

Corollary 3.15

Let $p\leqq N$, and P be the orthogonal projection onto the subspace $\mathrm {span}\,\{e_1^{(N)},\ldots ,e_p^{(N)}\}$. Suppose that $U(\,\cdot \,,0)$ satisfies the weak growth condition. Then for $G \in \mathcal {S}^N_{++}$,

$$\begin{aligned} \Phi _{N}\left[ G,U\circ P\right] =\Phi _{p}\left[ G_{11},U(\,\cdot \,,0)\right] , \end{aligned}$$

where $G_{11}$ is the upper-left $p\times p$ block of G.

Proof of Proposition 3.13

First we observe that we can assume that G is block-diagonal. To see this, let $G\in \mathcal {S}^N_{++}$, and write

$$\begin{aligned} G = \left( \begin{array}{cc} G_{11} &{} G_{12} \\ G_{12}^T &{} G_{22} \end{array}\right) . \end{aligned}$$

Then block Gaussian elimination reveals that

$$\begin{aligned} G = \left( \begin{array}{cc} I &{} 0\\ G_{12}^{T}G_{11}^{-1} &{} I \end{array}\right) \left( \begin{array}{cc} G_{11} &{} 0\\ 0 &{} G_{22}-G_{12}^{T}G_{11}^{-1}G_{12} \end{array}\right) \left( \begin{array}{cc} I &{} G_{11}^{-1}G_{12}\\ 0 &{} I \end{array}\right) . \end{aligned}$$

Define

$$\begin{aligned} T:= \left( \begin{array}{cc} I &{} 0\\ G_{12}^{T}G_{11}^{-1} &{} I \end{array}\right) , \ \ \widetilde{G} := \left( \begin{array}{cc} G_{11} &{} 0\\ 0 &{} G_{22}-G_{12}^{T}G_{11}^{-1}G_{12} \end{array}\right) , \end{aligned}$$

so $G = T\widetilde{G}T^*$. Then by the transformation rule, we have

$$\begin{aligned} \Phi _N[G,U] = \Phi _N[\widetilde{G},U\circ T] = \Phi _N[\widetilde{G},U], \end{aligned}$$

where the last equality uses the fact that U depends only on the first p arguments, which are unchanged by the transformation T.
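The block factorization and the action of $T$ on the first $p$ coordinates can be verified directly. The sketch below uses a random positive definite matrix with hypothetical sizes $N=6$ and $p=2$; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 6, 2                                # hypothetical sizes for illustration
M = rng.standard_normal((N, N))
G = M @ M.T + np.eye(N)                    # random symmetric positive definite G

G11, G12, G22 = G[:p, :p], G[:p, p:], G[p:, p:]
L = np.linalg.solve(G11, G12).T            # G12^T G11^{-1}, using symmetry of G11

T = np.block([[np.eye(p), np.zeros((p, N - p))],
              [L, np.eye(N - p)]])
G_tilde = np.block([[G11, np.zeros((p, N - p))],
                    [np.zeros((N - p, p)), G22 - L @ G12]])

xvec = rng.standard_normal(N)              # arbitrary test vector
```

One finds that $T\widetilde{G}T^{*}$ reproduces $G$, that $\widetilde{G}$ is positive definite, and that $Tx$ agrees with $x$ in its first $p$ components, which is why $U\circ T=U$ when U depends only on those components.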

Since $\widetilde{G}$ is block-diagonal with the same upper-left block as G, we have reduced to the block-diagonal case, as claimed, so now assume that $G\in \mathcal {S}^N_{++}$ with

$$\begin{aligned} G = \left( \begin{array}{cc} G_{11} &{} 0 \\ 0 &{} G_{22} \end{array}\right) . \end{aligned}$$

Recall the following expression for $\mathcal {F}_N$:

$$\begin{aligned} \mathcal {F}_N [G,U] = \sup _{\mu \in \mathcal {G}_N^{-1}(G)}\left[ H(\mu )-\int U\,\,\mathrm {d}\mu \right] . \end{aligned}$$

Next define $\pi _1:{\mathbb {R}^{N}}\rightarrow \mathbb {R}^{p}$ and $\pi _2:{\mathbb {R}^{N}}\rightarrow \mathbb {R}^{N-p}$ to be the projections onto the first p and last $N-p$ components, respectively. Then with ‘$\#$’ denoting the pushforward operation on measures, $\pi _1 \# \mu $ and $\pi _2 \# \mu $ are the marginals of $\mu $ with respect to the product structure ${\mathbb {R}^{N}}= \mathbb {R}^p \times \mathbb {R}^{N-p}$. Now recall Fact 2.11, in particular the inequality $H(\mu ) \leqq H(\pi _1 \# \mu ) + H(\pi _2 \# \mu )$. Also note that if $\mu \in \mathcal {G}_N^{-1}(G)$, then $\pi _1 \# \mu \in \mathcal {G}_p^{-1}(G_{11})$ and $\pi _2 \# \mu \in \mathcal {G}_{N-p}^{-1}(G_{22})$. Finally observe that since U depends only on the first p arguments, $\int U\,\,\mathrm {d}\mu = \int U\,d(\pi _1 \# \mu )$ for any $\mu $. Therefore,

$$\begin{aligned} \mathcal {F}_N [G,U]\leqq & {} \sup _{\mu \in \mathcal {G}_N^{-1}(G)}\left[ H(\pi _1 \# \mu ) + H(\pi _2 \# \mu ) -\int U\,d(\pi _1 \# \mu ) \right] \\\leqq & {} \sup _{\mu _1 \in \mathcal {G}_p^{-1}(G_{11})} \left[ H(\mu _1) -\int U\,\,\mathrm {d}\mu _1 \right] + \sup _{\mu _2 \in \mathcal {G}_{N-p}^{-1}(G_{22})}\left[ H(\mu _2) \right] \\= & {} \mathcal {F}_p [G_{11},U] + \frac{1}{2}\log ( (2\pi e)^{N-p} \det G_{22}). \end{aligned}$$

Since $\det G = \det G_{11} \det G_{22}$, it follows that

$$\begin{aligned} \Phi _N [G,U] \leqq \Phi _p [G_{11},U]. \end{aligned}$$

For the reverse inequality, let $\mu _1$ be arbitrary in $\mathcal {G}_p^{-1}(G_{11})$, and consider $\mu := \mu _1 \times \mu _2$, where $\mu _2$ is given by the normal distribution with mean zero and covariance $G_{22}$. Then

$$\begin{aligned} \mathcal {F}_N [G,U] \geqq H(\mu ) - \int U\,\,\mathrm {d}\mu = H(\mu _1)- \int U\,\,\mathrm {d}\mu _1 + \frac{1}{2}\log ( (2\pi e)^{N-p} \det G_{22}). \end{aligned}$$

Since $\mu _1$ is arbitrary in $\mathcal {G}_p^{-1}(G_{11})$, it follows by taking the supremum over $\mu _1$ that

$$\begin{aligned} \mathcal {F}_N [G,U] \geqq \mathcal {F}_p [G_{11},U] + \frac{1}{2}\log ( (2\pi e)^{N-p} \det G_{22}), \end{aligned}$$

which implies

$$\begin{aligned} \Phi _N [G,U] \geqq \Phi _p [G_{11},U]. \end{aligned}$$

Remark 3.16

The proof suggests that for U depending only on the first p arguments and G block-diagonal, the supremum in the definition of $\mathcal {F}$ is attained by a product measure, which is perhaps not surprising. The proof also suggests, however, that for such U and general G, the supremum is attained by taking a product measure and then ‘correlating’ it via the transformation T.

For the impurity problem, Proposition 3.13 immediately implies that the self-energy has a particular sparsity pattern, and thus we have

Corollary 3.17

Let $p\leqq N$ and suppose that U (satisfying the weak growth condition) depends only on $x_1,\ldots ,x_p$. Then

$$\begin{aligned} \Sigma _N [G,U] = \left( \begin{array}{cc} \Sigma _p [G_{11},U] &{} 0 \\ 0 &{} 0 \end{array}\right) . \end{aligned}$$

For example, consider $U(x) = \frac{1}{8} \sum _{i,j} v_{ij} x_i^2 x_j^2$. Here the stipulation that U depend only on the first p arguments corresponds to the stipulation that $v_{ij} = 0$ unless $i,j \leqq p$. For such an interaction, in the bold diagrammatic expansion for $\Phi $ and $\Sigma $, any term in which $G_{ij}$ appears will be zero unless $i,j\leqq p$. This is a non-rigorous perturbative explanation of the fact that $\Phi $ depends only on the upper-left block of G, which in turn explains the sparsity structure of $\Sigma $, as well as the fact that $\Sigma $ also depends only on the upper-left block of G. However, the developments of this section apply to interactions U of far greater generality, which may indeed be non-polynomial and hence not admit a bold diagrammatic expansion.

3.5 Continuous Extension of the LW Functional to the Boundary

The discussion in this subsection is only heuristic, and the proofs of the theorems stated here are deferred to Section 5.

In Section 3.1 we saw that the functional $\mathcal {F}[G]$ diverges at the boundary $\partial \mathcal {S}_{+}^{N} = \mathcal {S}_{+}^{N} \backslash \mathcal {S}_{++}^{N}$. On the other hand, the projection rule, together with the transformation rule, motivates a formula by which we can extend $\Phi $ continuously up to the boundary $\partial \mathcal {S}_{+}^{N}$.

Indeed, suppose that $T^{(j)} \rightarrow P$, where $T^{(j)}$ is invertible and P is the orthogonal projection onto the first p components, as in Corollary 3.15. Then for $G\in \mathcal {S}^N_{++}$,

$$\begin{aligned} \Phi _N [T^{(j)}G(T^{(j)})^*,U] = \Phi _N [G,U\circ T^{(j)}] . \end{aligned}$$

By naively taking limits of both sides, we expect that

$$\begin{aligned} \Phi _N [PGP, U] = \Phi _N [G,U\circ P], \end{aligned}$$

Then by the projection rule (in the form of Corollary 3.15) we expect

$$\begin{aligned} \Phi _N \left[ \left( \begin{array}{cc} G_{11} &{} 0 \\ 0 &{} 0 \end{array}\right) , U \right] = \Phi _p [G_{11},U(\,\cdot \,,0)], \end{aligned}$$

where $G_{11}$ is the upper-left $p\times p$ block of G. After possibly changing coordinates via the transformation rule, this formula provides a general recipe for evaluating the LW functional on the boundary $\partial \mathcal {S}^N_{+}$, which is the content of Theorem 3.18 below.

Unfortunately, there are nontrivial analytic difficulties that are hidden by this heuristic derivation. In fact there exists an interaction U satisfying the weak growth condition for which the continuous extension property fails. Since the discussion of this counterexample is somewhat involved, it is postponed to Section 5.5. However, the continuous extension property is true for U satisfying the strong growth condition of Definition 2.4.

Before stating the continuous extension property in Theorem 3.18, we provide a more careful discussion of the structure of the boundary $\partial \mathcal {S}^N_{+}$. Consider a q-dimensional subspace K of ${\mathbb {R}^{N}}$, and let $p=N-q$. Then the set

$$\begin{aligned} S_{K}:=\left\{ G\in \mathcal {S}_{+}^{N}\,:\,\ker G=K\right\} \end{aligned}$$

forms a ‘stratum’ of the boundary of $\mathcal {S}^N_{+}$, which is itself isomorphic to the set of $p\times p$ positive definite matrices. In turn, one can consider boundary strata (of smaller dimension) nested inside of $S_{K}$.

We will show that the restriction of the Luttinger–Ward functional to such a stratum is precisely the Luttinger–Ward functional for a lower-dimensional system. To this end, fix a subspace K and choose any orthonormal basis $v_{1},\ldots ,v_{p}$ for $K^{\perp }$. (The choice of basis is not canonical but can be made for the purpose of writing down results explicitly.) Define $V_{p}:=[v_{1},\ldots ,v_{p}]$. We use this notation to indicate both the matrix and the corresponding linear map.

Theorem 3.18

(Continuous extension, I) Suppose that U is continuous and satisfies the strong growth condition. With notation as in the preceding discussion, $\Phi _{N}[\,\cdot \,,U]$ extends continuously to $S_{K}$ via the rule

$$\begin{aligned} \Phi _{N}\left[ G,U\right] =\Phi _{p}\left[ V_{p}^{*}GV_{p},U\circ V_{p} \right] \end{aligned}$$

for $G\in S_{K}$. Consequently, $\Phi _{N}[\,\cdot \,,U]$ extends continuously to all of $\mathcal {S}_{+}^{N}$.

Remark 3.19

In the case $G=0$, we interpret the extension rule as setting $\Phi _N [0,U] = \Phi _0[U] := -2 \cdot U(0)$. Moreover, it will become clear in the proof that even for continuous interactions U that do not satisfy the strong growth condition, the extension is still lower semi-continuous on $\mathcal {S}^N_{+}$ and continuous on $\mathcal {S}^N_{++}\cup \{0\}$.

Changing coordinates via Proposition 3.10, we see that Theorem 3.18 is actually equivalent to the following:

Theorem 3.20

(Continuous extension, II) Suppose that U is continuous and satisfies the strong growth condition. For $G\in \mathcal {S}_{++}^{p}$, $\Phi [\,\cdot \,,U]$ extends continuously via the rule

$$\begin{aligned} \Phi _{N}\left[ \left( \begin{array}{cc} G &{} 0\\ 0 &{} 0 \end{array}\right) ,U\right] =\Phi _{p}\left[ G,U(\cdot ,0)\right] . \end{aligned}$$

Once again we comment that the proof is deferred to Section 5.

4 Bold Diagram Expansion for the Generalized Coulomb Interaction

Using the Luttinger–Ward formalism, in this section we prove that the bold diagrammatic expansions from the accompanying paper of the self-energy and the LW functional [for the generalized Coulomb interaction (2.7)] can indeed be interpreted as asymptotic series expansions in the interaction strength at fixed G. This provides a rigorous interpretation of the bold expansions that is not merely combinatorial. Recall that when each G in the bold diagrammatic expansion of the self-energy is further expanded using $G^{0}$ and U, the resulting expansion should be formally the same as the bare diagrammatic expansion of the self-energy. The combinatorial argument in Section 4 of the accompanying paper guaranteeing this fact does not need to be repeated in this setting, and we will be able to directly use Theorem 4.12 from the accompanying paper. The remaining hurdles are analytical, not combinatorial.

We summarize the results of this section as follows:

Theorem 4.1

For any continuous interaction $U:{\mathbb {R}^{N}}\rightarrow \mathbb {R}$ satisfying the weak growth condition and any $G \in \mathcal {S}^N_{++}$, the LW functional and the self-energy have asymptotic series expansions as

$$\begin{aligned} \Phi [G,\varepsilon U] = \sum _{k=1}^\infty \Phi ^{(k)}[G,U]\varepsilon ^k,\quad \Sigma [G,\varepsilon U] = \sum _{k=1}^\infty \Sigma ^{(k)}[G,U]\varepsilon ^k. \end{aligned}$$

(4.1)

Moreover, for U a homogeneous quartic polynomial, the coefficients of the asymptotic series satisfy

$$\begin{aligned} \Phi ^{(k)}[G,U] = \frac{1}{2k} \mathrm {Tr}\left[ G \Sigma ^{(k)}[G,U] \right] . \end{aligned}$$

(4.2)

If U is moreover a generalized Coulomb interaction (2.7), we have (borrowing the language of the accompanying paper) that

$$\begin{aligned} \Sigma ^{(k)}_{ij}[G,U] = \sum _{\Gamma _{\mathrm {s}} \in \mathfrak {F}_2^{\mathrm {2PI}},\,\mathrm {order}\,k} \frac{\mathbf {F}_{\Gamma _{\mathrm {s}}}(i,j)}{S_{\Gamma _{\mathrm {s}}}}, \end{aligned}$$

(4.3)

that is, $\Sigma ^{(k)}$ is given by the sum over bold skeleton diagrams of order k with bold propagator G and interaction $v_{ij} \delta _{ik} \delta _{jl}$.

Remark 4.2

For a series as in Eq. (4.1) to be asymptotic means that the error of the M-th partial sum is $O(\varepsilon ^{M+1})$ as $\varepsilon \rightarrow 0$.
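The notion of an asymptotic series in Remark 4.2 can be illustrated on a simpler quantity than $\Phi $ or $\Sigma $. The sketch below is an analogy under stated assumptions, not a computation of the LW functional: it takes the hypothetical function $f(\varepsilon )=\mathbb {E}[e^{-\varepsilon X^4/8}]$ with $X\sim \mathcal {N}(0,1)$, whose Taylor coefficients at the origin follow from the Gaussian moments $\mathbb {E}[X^4]=3$ and $\mathbb {E}[X^8]=105$, and checks how the partial-sum error scales when $\varepsilon $ is halved.

```python
import numpy as np

x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]
gauss = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def f(eps):
    """E[exp(-eps X^4 / 8)] for X ~ N(0, 1), by a Riemann sum."""
    return np.sum(gauss * np.exp(-eps * x**4 / 8.0)) * dx

# Taylor coefficients at eps = 0: c_k = (-1/8)^k E[X^{4k}] / k!,
# using E[X^4] = 3 and E[X^8] = 105.
c = [1.0, -3.0 / 8.0, 105.0 / 128.0]

def partial_sum_error(eps, M):
    """Absolute error of the M-th partial sum at eps."""
    return abs(f(eps) - sum(c[k] * eps**k for k in range(M + 1)))

# Halving eps should shrink the M-th partial-sum error by roughly 2^(M+1).
ratio_M0 = partial_sum_error(0.02, 0) / partial_sum_error(0.01, 0)
ratio_M1 = partial_sum_error(0.02, 1) / partial_sum_error(0.01, 1)
```

The ratios come out close to $2$ and $4$ respectively, consistent with errors of order $\varepsilon ^{M+1}$ as $\varepsilon \rightarrow 0^+$. The Taylor series of this $f$ has zero radius of convergence, so the example also shows that an asymptotic series need not converge.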

Since U is fixed, for simplicity in the ensuing discussion we will omit the dependence on U from the notation via the definitions $\Phi _G(\varepsilon ) := \Phi [G,\varepsilon U]$, $\Sigma _G(\varepsilon ) := \Sigma [G,\varepsilon U]$, and $A_G(\varepsilon ) := A[G,\varepsilon U]$. We will also denote the series coefficients via $\Phi ^{(k)}_G := \Phi ^{(k)}[G,U]$ and $\Sigma ^{(k)}_G := \Sigma ^{(k)}[G,U]$. In this notation, our asymptotic series take the form

$$\begin{aligned} \Phi _G (\varepsilon ) = \sum _{k=1}^\infty \Phi ^{(k)}_G \varepsilon ^k, \quad \Sigma _G (\varepsilon ) = \sum _{k=1}^\infty \Sigma ^{(k)}_G \varepsilon ^k. \end{aligned}$$

(4.4)

Notation 4.3

Note carefully that in this section the superscript (k) is merely a notation and does not indicate the k-th derivative. Such derivatives will be written out as $\frac{\,\mathrm {d}^k}{\,\mathrm {d}\varepsilon ^k}$.

Now we outline the remainder of this section. In Section 4.1 we prove that the LW functional and the self-energy do indeed admit asymptotic series expansions. In Section 4.2 we prove the relation between the LW and self-energy expansions for quartic interactions, namely Eq. (4.2). Interestingly, this relation—which is well-known formally based on diagrammatic observations—was originally assumed to be true to obtain a formal derivation of the LW functional [19, 20]. Our proof here does not rely on any diagrammatic manipulation, only making use of the transformation rule and the quartic nature of the interaction U. Similar relations for homogeneous polynomial interactions of different order could easily be obtained. Next, in Section 4.3, we summarize and expand on the necessary results from the accompanying paper in diagram-free language; this both reduces the prerequisite knowledge needed for the remainder of the section and clarifies the arguments that follow. Finally, in Section 4.4 we prove that when U is a generalized Coulomb interaction, the series for the self-energy is in fact the bold diagrammatic expansion of Section 4 of the accompanying paper.

4.1 Existence of Asymptotic Series

In this section we assume that U is continuous and satisfies the weak growth condition. We first prove the following pair of lemmas.

Lemma 4.4

For any $G\in \mathcal {S}^N_{++}$, $A_G (\varepsilon ) \rightarrow G^{-1}$ as $\varepsilon \rightarrow 0^+$.

Lemma 4.5

For $G\in \mathcal {S}^N_{++}$, all derivatives of the functions $\Phi _G:(0,\infty ) \rightarrow \mathbb {R}$ and $\Sigma _G:(0,\infty ) \rightarrow \mathbb {R}^{N\times N}$ extend continuously to $[0,\infty )$.

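Using the continuous extension of the derivatives to the origin, we define the series coefficients as the Taylor coefficients at $\varepsilon =0$, namely $\Phi ^{(k)}_G := \frac{1}{k!}\frac{\,\mathrm {d}^k \Phi _G}{\,\mathrm {d}\varepsilon ^k}(0)$, and similarly for the self-energy $\Sigma _G^{(k)} := \frac{1}{k!}\frac{\,\mathrm {d}^k \Sigma _G}{\,\mathrm {d}\varepsilon ^k}(0)$ (recall from Notation 4.3 that the superscript (k) alone does not denote a derivative). From the preceding it will follow that the series (4.4) are indeed asymptotic series in the following sense: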

Proposition 4.6

For any nonnegative integer M, $\Phi _G(\varepsilon ) - \sum _{k=1}^M \Phi ^{(k)}_G \varepsilon ^k = O(\varepsilon ^{M+1})$ and $\Sigma _G(\varepsilon ) - \sum _{k=1}^M \Sigma ^{(k)}_G \varepsilon ^k = O(\varepsilon ^{M+1})$ as $\varepsilon \rightarrow 0^+$.

Proof

Consider any function $f:[0,\infty )\rightarrow \mathbb {R}$ with all derivatives extending continuously up to the boundary (and so defined at 0). Let $\delta >0$. Then for $\varepsilon \in (\delta ,1]$ the Lagrange error bound for the Taylor polynomial of degree M centered at $\delta $ gives

$$\begin{aligned} \left| f(\varepsilon ) - \sum _{k=0}^M \frac{1}{k!} \frac{\,\mathrm {d}^k f}{\,\mathrm {d}\varepsilon ^k}(\delta )\, (\varepsilon -\delta )^k \right| \leqq C (\varepsilon - \delta )^{M+1} \leqq C \varepsilon ^{M+1}, \end{aligned}$$

where C is a constant that depends only on a uniform bound on $\left( \frac{\,\mathrm {d}}{\,\mathrm {d}\varepsilon }\right) ^{M+1} f$ over [0, 1] (the existence of which is guaranteed by the continuous extension property). Simply taking the limit of our inequality as $\delta \rightarrow 0^+$, and again employing the continuous extension property, yields that $\left| f(\varepsilon )-\sum _{k=0}^M \frac{1}{k!} \frac{\,\mathrm {d}^k f}{\,\mathrm {d}\varepsilon ^k}(0)\, \varepsilon ^k \right| \leqq C\varepsilon ^{M+1}$. Applying this fact to $\Phi _G$ and to each entry of $\Sigma _G$, which is permitted by Lemma 4.5, proves the proposition: the coefficients $\Phi ^{(k)}_G$ and $\Sigma ^{(k)}_G$ are the corresponding Taylor coefficients at the origin, and the $k=0$ terms vanish because $\Phi $ and $\Sigma $ are identically zero for the non-interacting system.

4.2 Relating the LW and Self-energy Expansions

The bold diagrams for the Luttinger–Ward functional are pinned down in terms of the bold diagrams for the self-energy via the following:

Proposition 4.7

If U is a homogeneous quartic polynomial, then for all k,

$$\begin{aligned} \Phi ^{(k)}_G = \frac{1}{2k} \mathrm {Tr}[G \Sigma ^{(k)}_G]. \end{aligned}$$

Proof

Observe that, by the transformation rule, for any $G\in \mathcal {S}^N_{++}$ and $\varepsilon , t>0$, we have

$$\begin{aligned} \Phi [tG,\varepsilon U] = \Phi [G, \varepsilon U\circ (t^{1/2} I)]. \end{aligned}$$

Taking the gradient in G of both sides, we have

$$\begin{aligned} t \Sigma [tG,\varepsilon U] = \Sigma [G, \varepsilon U\circ (t^{1/2} I)]. \end{aligned}$$

Since U is homogeneous quartic, in fact, we have

$$\begin{aligned} \Sigma [tG,\varepsilon U] = \frac{1}{t} \Sigma [G, t^2 \varepsilon U]. \end{aligned}$$

Then, using this relation, we compute

$$\begin{aligned} \Phi [G,\varepsilon U]= & {} \int _{0}^{1}\frac{\mathrm{{d}}}{\mathrm{{d}}t}\Phi [tG,\varepsilon U]\,\mathrm{{d}}t\\= & {} \int _{0}^{1}\mathrm {Tr}[G\Sigma [tG,\varepsilon U]]\,\mathrm{{d}}t\\= & {} \int _{0}^{1}\frac{1}{t}\mathrm {Tr}[G\Sigma [G,t^{2}\varepsilon U]]\,\mathrm{{d}}t\\= & {} \int _{0}^{1}\frac{1}{t}\left[ \sum _{k=1}^{M}\mathrm {Tr}\left[ G\Sigma _{G}^{(k)}\right] t^{2k}\varepsilon ^k +O\left( t^{2(M+1)}\varepsilon ^{M+1}\right) \right] \,\mathrm{{d}}t\\= & {} \int _{0}^{1}\left[ \sum _{k=1}^{M}\mathrm {Tr}\left[ G\Sigma _{G}^{(k)}\right] t^{2k-1}\varepsilon ^{k}+O\left( t^{2M+1}\varepsilon ^{M+1}\right) \right] \,\mathrm{{d}}t. \end{aligned}$$

Now since t ranges from 0 to 1 in the integrand, we have that $t^{2M+1}\varepsilon ^{M+1}\leqq \varepsilon ^{M+1}$, and therefore

$$\begin{aligned} \Phi [G,\varepsilon U]= & {} \int _{0}^{1}\left[ \sum _{k=1}^{M}\mathrm {Tr}\left[ G\Sigma _{G}^{(k)}\right] t^{2k-1}\varepsilon ^{k}\right] \,\mathrm{{d}}t+O(\varepsilon ^{M+1})\\= & {} \sum _{k=1}^{M}\frac{1}{2k}\mathrm {Tr}\left[ G\Sigma _{G}^{(k)}\right] \varepsilon ^{k}+O(\varepsilon ^{M+1}). \end{aligned}$$

This establishes the proposition.
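As a quick consistency check of Proposition 4.7, consider the scalar case $N=1$ with $U(x) = \frac{1}{8} x^4$, and take the first two bold self-energy coefficients from the one-dimensional example of Section 4.5 (reading $\lambda $ there as $\varepsilon $), namely $\Sigma ^{(1)}_G = -\frac{3}{2} G$ and $\Sigma ^{(2)}_G = \frac{3}{2} G^3$. This is only an illustrative sketch under those assumptions. The proposition then gives

$$\begin{aligned} \Phi ^{(1)}_G = \frac{1}{2} G\, \Sigma ^{(1)}_G = -\frac{3}{4} G^2, \quad \Phi ^{(2)}_G = \frac{1}{4} G\, \Sigma ^{(2)}_G = \frac{3}{8} G^4, \end{aligned}$$

and differentiating in G recovers

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}G} \Phi ^{(1)}_G = -\frac{3}{2} G = \Sigma ^{(1)}_G, \quad \frac{\mathrm {d}}{\mathrm {d}G} \Phi ^{(2)}_G = \frac{3}{2} G^3 = \Sigma ^{(2)}_G, \end{aligned}$$

which is consistent with the relation $\frac{\mathrm {d}}{\mathrm {d}t}\Phi [tG,\varepsilon U] = \mathrm {Tr}[G\Sigma [tG,\varepsilon U]]$ used in the proof. In this scalar setting the factor $\frac{1}{2k}$ is simply Euler's identity for a function homogeneous of degree 2k in G.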

4.3 Diagram-free Discussion of Results from the Accompanying Paper

For U satisfying the weak growth condition and $A \in \mathrm {dom}\,\Omega [\,\cdot \,,U]$, define

$$\begin{aligned} \sigma [A,U] := A - (G[A,U])^{-1}. \end{aligned}$$

Here we use the lowercase $\sigma $ to emphasize that the self-energy here is being considered as a functional of A (not G), together with the interaction.

Now we set the notation of U to indicate a fixed generalized Coulomb interaction (2.7). Further define

$$\begin{aligned} G_A(\varepsilon ) := G[A,\varepsilon U], \quad \sigma _A(\varepsilon ) := \sigma [A,\varepsilon U]. \end{aligned}$$

(4.5)

The following lemma concerns the bare diagrammatic expansion of the Green’s function and the self-energy, that is, the asymptotic series for $G_A$ and $\sigma _A$:

Lemma 4.8

For fixed $A \in \mathcal {S}^N_{++}$, all derivatives $\frac{\,\mathrm {d}^{n}G_A}{\,\mathrm {d}\varepsilon ^{n}} : (0,\infty )\rightarrow \mathcal {S}^N_{++}$ and $\frac{\,\mathrm {d}^{n}\sigma _A}{\,\mathrm {d}\varepsilon ^{n}}: (0,\infty ) \rightarrow \mathcal {S}^N$ extend continuously to $[0,\infty )$. In fact, interpreted as functions of both A and $\varepsilon $, $\frac{\,\mathrm {d}^{n}G_A}{\,\mathrm {d}\varepsilon ^{n}}(\varepsilon )$ and $\frac{\,\mathrm {d}^{n}\sigma _A}{\,\mathrm {d}\varepsilon ^{n}}(\varepsilon )$ extend continuously to $\mathcal {S}^N_{++}\times [0,\infty )$. Moreover, we have asymptotic series expansions

$$\begin{aligned} G_A(\varepsilon ) = \sum _{k=0}^\infty g^{(k)}_A \varepsilon ^k, \quad \sigma _A(\varepsilon ) = \sum _{k=1}^\infty \sigma ^{(k)}_A \varepsilon ^k, \end{aligned}$$

where the coefficient functions $g^{(k)}_A$ and $\sigma ^{(k)}_A$ are polynomials in $A^{-1}$. More precisely, $g^{(k)}_A$ and $\sigma ^{(k)}_A$ are homogeneous polynomials of degrees $2k+1$ and $2k-1$, respectively. (Note that the zeroth-order term $\sigma _A^{(0)}$ is implicitly zero.)

Finally, let $G_{A}^{(\le M)}(\varepsilon )$ and $\sigma _{A}^{(\le M)}(\varepsilon )$ denote the M-th partial sums of the above asymptotic series for $G_A(\varepsilon )$ and $\sigma _A(\varepsilon )$, respectively. For every $A \in \mathcal {S}^N_{++}$, there exists a neighborhood $\mathcal {N}$ of A in $\mathcal {S}^N_{++}$ on which the truncation errors can actually be bounded

$$\begin{aligned} \left| G_A(\varepsilon ) - G^{(\le M)}_{A}(\varepsilon ) \right| \leqq C \varepsilon ^{M+1}, \quad \left| \sigma _A(\varepsilon ) - \sigma ^{(\le M)}_{A}(\varepsilon ) \right| \leqq C \varepsilon ^{M+1} \end{aligned}$$

for all $\varepsilon \in [0,\tau ]$, with $C, \tau $ independent of $A \in \mathcal {N}$.

Proof

The asymptotic series expansions for $G_A$ and $\sigma _A$ are established in Theorems 3.15 and 3.17 of the accompanying paper. The continuous extension of the derivatives of $G_A$ and $\sigma _A$ to $[0,\infty )$ follows from differentiation under the integral and simple dominated convergence arguments.

The uniform error bound follows from a Lagrange error bound argument as in Proposition 4.6, together with the continuity of $\frac{\,\mathrm {d}^{n}G_A}{\,\mathrm {d}\varepsilon ^{n}}(\varepsilon )$ and $\frac{\,\mathrm {d}^{n}\sigma _A}{\,\mathrm {d}\varepsilon ^{n}}(\varepsilon )$ on $\mathcal {S}^N_{++}\times [0,\infty )$.

Inspired by Eq. (4.3), let

$$\begin{aligned} \mathbf {S}_G^{(k)} = \sum _{\Gamma _{\mathrm {s}} \in \mathfrak {F}_2^{\mathrm {2PI}},\,\mathrm {order}\,k} \frac{\mathbf {F}_{\Gamma _{\mathrm {s}}}}{S_{\Gamma _{\mathrm {s}}}}. \end{aligned}$$

In fact $\mathbf {S}_G^{(k)}$ is polynomial in G, homogeneous of degree $2k-1$. At this point we do not yet know that $\mathbf {S}_G^{(k)}$ coincides with $\Sigma _G^{(k)}$, and indeed this is what we want to show. For any G, also define the partial sum

$$\begin{aligned} \mathbf {S}_G^{(\leqq M)}(\varepsilon ) := \sum _{k=1}^M \mathbf {S}_G^{(k)} \varepsilon ^k. \end{aligned}$$

Then the main result (Theorem 4.12) of the accompanying paper can be phrased as follows:

Theorem 4.9

For any fixed $A\in \mathcal {S}^N_{++}$, the expressions

$$\begin{aligned} \mathbf {S}_{G^{(\le M)}_{A}(\varepsilon )}^{(\le M)}(\varepsilon ) = \sum _{k=1}^M \mathbf {S}_{G^{(\le M)}_{A}(\varepsilon )}^{(k)} \varepsilon ^k, \quad \sigma _{A}^{(\le M)}(\varepsilon ) = \sum _{k=1}^M \sigma ^{(k)}_A \varepsilon ^k \end{aligned}$$

agree as polynomials in $\varepsilon $ up to order M, and hence they agree as joint polynomials in $(A^{-1},\varepsilon )$ after neglecting all terms in which $\varepsilon $ appears with degree at least $M+1$.

4.4 Derivation of Self-energy Bold Diagrams

We have already shown that there exist asymptotic series for the LW functional and the self-energy. The remainder of Theorem 4.1 then consists of identifying that the self-energy coefficients $\Sigma _G^{(k)}$ are indeed given by the bold diagrammatic expansion, that is, that $\Sigma _G^{(k)} = \mathbf {S}_G^{(k)}$. Equivalently, we want to show that the partial sums $\mathbf {S}_G^{(\leqq M)}(\varepsilon )$ and $\Sigma _G^{(\leqq M)}(\varepsilon )$, which are polynomials of degree M in $\varepsilon $, are equal. We will think of $G \in \mathcal {S}^N_{++}$ as fixed throughout the following discussion, and we omit dependence on G from some of the notation below to avoid excess clutter. We will also think of M as a fixed positive integer and $\varepsilon >0$ as variable (and sufficiently small).

Since our series expansion is only valid in the asymptotic sense, for any finite M we consider the truncation

$$\begin{aligned} \Sigma ^{(\le M)}_G (\varepsilon ) := \sum _{k=1}^M \Sigma ^{(k)}_G\, \varepsilon ^k. \end{aligned}$$

Then we have $\Sigma _G (\varepsilon ) - \Sigma ^{(\le M)}_G (\varepsilon ) = O(\varepsilon ^{M+1})$. For the purpose of this discussion, $O(\varepsilon ^{M+1})$ will be thought of as negligibly small, and ‘$\approx $’ will be used to denote equality up to error $O(\varepsilon ^{M+1})$. Meanwhile ‘$\sim $’ will be used to denote error that is $O(\varepsilon ^{M+1-p})$ for all $p\in (0,1)$, equivalently $O(\varepsilon ^{M+\delta })$ for all $\delta \in (0,1)$. We remark that the difference between the relations ‘$\approx $’ and ‘$\sim $’ is due to technical reasons to be detailed later, and may be neglected on first reading.

Note that it actually suffices to show that $\Sigma _G^{(\leqq M)}(\varepsilon ) \sim \mathbf {S}_G^{(\leqq M)}(\varepsilon )$. Indeed, both sides are polynomials of degree M in $\varepsilon $. Thus their difference is a polynomial of degree $\leqq M$. If the degree-n part of the difference is nonzero for some $n = 1,\ldots , M$, then the difference is not $O(\varepsilon ^{n+\delta })$ for any $\delta >0$. But if $\Sigma _G^{(\leqq M)}(\varepsilon ) \sim \mathbf {S}_G^{(\leqq M)}(\varepsilon )$, then the difference is $O(\varepsilon ^{n+\delta })$ for all $n=1,\ldots ,M$, $\delta \in (0,1)$. Thus in this case the difference is zero. With this reduction in mind, we now make a simple yet critical observation, namely that $\Sigma ^{(\le M)}_G (\varepsilon )$ can be identified as the exact self-energy yielded by a modified interaction term. This will allow us to identify a quadratic form $A^{(M)}(\varepsilon )$, for which dependence on G has been suppressed from the notation, which generates (up to negligible error) the Green’s function G under the interaction $\varepsilon U$.

Lemma 4.10

With notation as in the preceding discussion, $\Sigma ^{(\le M)}_G (\varepsilon )$ is the self-energy induced by the interaction ${U}^{(M)}_\varepsilon (x) := \varepsilon U(x) + \frac{1}{2} x^T \left[ \Sigma _G(\varepsilon ) - \Sigma ^{(\le M)}_G (\varepsilon ) \right] x $, that is,

$$\begin{aligned} \Sigma ^{(\le M)}_G (\varepsilon ) = \Sigma [G,{U}^{(M)}_\varepsilon ], \end{aligned}$$

and moreover

$$\begin{aligned} A^{(M)} (\varepsilon ) := A\left[ G,{U}^{(M)}_\varepsilon \right] = G^{-1} + \Sigma ^{(\le M)}_G (\varepsilon ). \end{aligned}$$

Thus we may identify

$$\begin{aligned} G = G[A^{(M)}(\varepsilon ),U_\varepsilon ^{(M)}],\quad \Sigma ^{(\le M)}_G (\varepsilon ) = \sigma [A^{(M)} (\varepsilon ),{U}^{(M)}_\varepsilon ]. \end{aligned}$$

Proof

Recalling that $A_G(\varepsilon ) = A[G,\varepsilon U]$ and $\Sigma _G (\varepsilon ) = \Sigma [G,\varepsilon U]$, write

$$\begin{aligned} \frac{1}{2} x^T A_G(\varepsilon ) x + \varepsilon U(x)= & {} \frac{1}{2} x^T \left( A_G(\varepsilon ) - \Sigma _G(\varepsilon ) + \Sigma ^{(\le M)}_G (\varepsilon ) \right) x + U_\varepsilon ^{(M)} (x) \\= & {} \frac{1}{2} x^T \left( G^{-1} + \Sigma ^{(\le M)}_G (\varepsilon ) \right) x + U_\varepsilon ^{(M)} (x). \end{aligned}$$

It follows that under the interaction ${U}^{(M)}_\varepsilon $, the quadratic form $G^{-1} + \Sigma ^{(\le M)}_G (\varepsilon )$ corresponds to the (interacting) Green’s function G. This establishes the second statement of the lemma, that is, that

$$\begin{aligned} A[G,{U}^{(M)}_\varepsilon ] = G^{-1} + \Sigma ^{(\le M)}_G (\varepsilon ). \end{aligned}$$

Moreover, by the Dyson equation we have that

$$\begin{aligned} \Sigma [G,{U}^{(M)}_\varepsilon ] = A[G,{U}^{(M)}_\varepsilon ] - G^{-1} = \Sigma ^{(\le M)}_G (\varepsilon ), \end{aligned}$$

which is the first statement of the lemma. The last statement then follows from the second, together with the definitions of $G[\,\cdot \,,\,\cdot \,]$ and $\sigma [\,\cdot \,,\,\cdot \,]$.

Remark 4.11

Note carefully that Lemma 4.10 is a non-perturbative fact and is valid for all $\varepsilon >0$, though we shall apply it in a perturbative context.

At this point we have defined the terms needed to present a schematic diagram (Fig. 1) of our proof that $\Sigma _{G}^{(\le M)}(\varepsilon ) \sim \mathbf {S}^{(\le M)}_G (\varepsilon )$. Although the motivation for this schematic may not be fully clear at this point, the reader should refer back to it as needed for perspective.

Now recalling the definitions (4.5), we can write

$$\begin{aligned} G_{A^{(M)}(\varepsilon )}(\varepsilon ) = G[A^{(M)}(\varepsilon ) , \varepsilon U], \quad \sigma _{A^{(M)}(\varepsilon )} (\varepsilon ) := \sigma [A^{(M)}(\varepsilon ) , \varepsilon U]. \end{aligned}$$

(4.6)

Meanwhile, following Lemma 4.10 we have the identities

$$\begin{aligned} G = G[A^{(M)}(\varepsilon ) , U_\varepsilon ^{(M)}], \quad \Sigma _G^{(\le M)} (\varepsilon ) = \sigma [A^{(M)}(\varepsilon ) , U_\varepsilon ^{(M)}]. \end{aligned}$$

(4.7)

Note that pointwise, $\varepsilon U$ and $U_\varepsilon ^{(M)}$ differ negligibly, but the form of $\varepsilon U$ is simpler and easier to work with going forward.

Based on Eqs. (4.6) and (4.7), one then hopes that $G_{A^{(M)}(\varepsilon )}(\varepsilon )$ is close to G and $\sigma _{A^{(M)}(\varepsilon )}(\varepsilon )$ is close to $\Sigma _G^{(\leqq M)}(\varepsilon )$. This is the content of the next two lemmas.

Lemma 4.12

$G_{A^{(M)}(\varepsilon )}(\varepsilon ) \sim G$.

Proof

See “Appendix C.11”.

Lemma 4.13

$\sigma _{A^{(M)}(\varepsilon )}(\varepsilon ) \sim \Sigma ^{(\le M)}_{G}(\varepsilon )$.

Proof

Based on Eqs. (4.6) and (4.7), we want to show that $ \sigma [A^{(M)}(\varepsilon ), U_\varepsilon ^{(M)}] \sim \sigma [A^{(M)}(\varepsilon ), \varepsilon U]$. We have already shown that $ G = G[A^{(M)}(\varepsilon ), U_\varepsilon ^{(M)}] \sim G[A^{(M)}(\varepsilon ), \varepsilon U]$, from which it follows that

$$\begin{aligned} A^{(M)}(\varepsilon ) - (G[A^{(M)}(\varepsilon ), U_\varepsilon ^{(M)}])^{-1} \sim A^{(M)}(\varepsilon ) - (G[A^{(M)}(\varepsilon ), \varepsilon U])^{-1}, \end{aligned}$$

which is exactly what we want to show.

Then we can use $\sigma _{A^{(M)}(\varepsilon )}(\varepsilon )$ as a stepping stone to relate $\Sigma ^{(\leqq M)}_G (\varepsilon )$ with the bare diagrammatic expansion for the self-energy via the following:

Lemma 4.14

$\sigma _{A^{(M)}(\varepsilon )}(\varepsilon ) \approx \sigma _{A^{(M)}(\varepsilon )}^{(\leqq M)}(\varepsilon )$.

Proof

Since $A^{(M)}(\varepsilon ) = G^{-1} + O(\varepsilon )$, the result follows from Lemma 4.8 (in particular, the locally uniform bound on truncation error of the bare self-energy series).

We can prove a similar fact (which will be useful later on) regarding the bare series for the interacting Green’s function:

Lemma 4.15

$G_{A^{(M)}(\varepsilon )} (\varepsilon ) \approx G_{A^{(M)}(\varepsilon )}^{(\leqq M)} (\varepsilon )$.

Proof

Since $A^{(M)}(\varepsilon ) = G^{-1} + O(\varepsilon )$, the result follows from Lemma 4.8 (in particular, the locally uniform bound on truncation error of the bare series for the interacting Green’s function).

From Lemmas 4.12 and 4.15 we immediately obtain

Lemma 4.16

$G_{A^{(M)}(\varepsilon )}^{(\leqq M)} (\varepsilon ) \sim G$.

Finally, we are ready to state and prove the last leg of the schematic diagram (Fig. 1).

Lemma 4.17

$\mathbf {S}^{(\le M)}_G (\varepsilon ) \sim \sigma _{A^{(M)}(\varepsilon )}^{(\leqq M)}(\varepsilon )$.

Proof

Consider $\mathbf {S}_{G_{A}^{(\le M)}}^{(\le M)}$ as a polynomial in $(A^{-1},\varepsilon )$, and let $P(A^{-1},\varepsilon )$ be the contribution of terms in which $\varepsilon $ appears with degree at least $M+1$. By Theorem 4.9 we have the equality

$$\begin{aligned} \mathbf {S}_{G_{A}^{(\le M)}(\varepsilon )}^{(\le M)}(\varepsilon ) - P(A^{-1},\varepsilon ) = \sigma _A^{(\leqq M)} (\varepsilon ) \end{aligned}$$

of polynomials in $(A^{-1},\varepsilon )$. Then substituting $A\leftarrow A^{(M)}(\varepsilon )$, we obtain

$$\begin{aligned} \mathbf {S}_{G_{A^{(M)}(\varepsilon )}^{(\le M)}(\varepsilon )}^{(\le M)}(\varepsilon ) - P([A^{(M)}(\varepsilon )]^{-1},\varepsilon ) = \sigma _{A^{(M)}(\varepsilon )}^{(\leqq M)} (\varepsilon ). \end{aligned}$$

(4.8)

Although the first term on the left-hand side of Eq. (4.8) looks quite intimidating, we can recognize it as $\mathbf {S}_{ \mathbf {G}(\varepsilon ) }^{(\leqq M)}(\varepsilon )$, where

$$\begin{aligned} \mathbf {G}(\varepsilon ) := G_{A^{(M)}(\varepsilon )}^{(\le M)}(\varepsilon ) \sim G \end{aligned}$$

is the expression from Lemma 4.16. Since $\mathbf {S}_{ [\,\cdot \,] }^{(\leqq M)}(\varepsilon ) = \sum _{k=1}^M \mathbf {S}_{ [\,\cdot \,] }^{(k)} \varepsilon ^k $, where each $\mathbf {S}_{ [\,\cdot \,] }^{(k)}$ is a polynomial (homogeneous of positive degree) in the subscript slot, it follows that

$$\begin{aligned} \mathbf {S}_{ \mathbf {G}(\varepsilon ) }^{(\leqq M)}(\varepsilon ) \sim \mathbf {S}_{ G }^{(\leqq M)}(\varepsilon ). \end{aligned}$$

Then from Eq. (4.8) we obtain

$$\begin{aligned} \mathbf {S}_{G}^{(\le M)}(\varepsilon ) - P([A^{(M)}(\varepsilon )]^{-1},\varepsilon ) \sim \sigma _{A^{(M)}(\varepsilon )}^{(\leqq M)} (\varepsilon ), \end{aligned}$$

but since $[A^{(M)}(\varepsilon )]^{-1} = G + O(\varepsilon )$ and since P only includes terms of degree at least $M+1$ in the second slot, it follows that $P([A^{(M)}(\varepsilon )]^{-1},\varepsilon ) \approx 0$, and the desired result follows.

Taken together (as indicated in Fig. 1), Lemmas 4.13, 4.14, and 4.17 imply that $\Sigma _{G}^{(\le M)}(\varepsilon ) \sim \mathbf {S}^{(\le M)}_G (\varepsilon )$ as desired, and the proof of Theorem 4.1 is complete.
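The identification $\Sigma _G^{(k)} = \mathbf {S}_G^{(k)}$ can also be probed numerically in the scalar case. The following is a minimal sketch, not part of the proof, assuming $N=1$, $U(x)=\frac{1}{8}x^4$, $G=1$, and the second-order bold coefficients quoted in Section 4.5 (with $\lambda $ read as $\varepsilon $). The function names `green`, `self_energy`, and `bold2` are hypothetical, and the quadrature window and bisection bracket are ad hoc choices for small $\varepsilon $. The code solves $G[a,\varepsilon U] = G$ for the scalar a by bisection, forms $\Sigma _G(\varepsilon ) = a - G^{-1}$, and compares it with the truncated bold series.

```python
import math

def green(a, eps, half_width=15.0, n=3001):
    """Second moment of the density proportional to exp(-a*x^2/2 - eps*x^4/8)."""
    h = 2.0 * half_width / (n - 1)
    num = den = 0.0
    for i in range(n):
        x = -half_width + i * h
        w = math.exp(-0.5 * a * x * x - eps * x ** 4 / 8.0)
        num += x * x * w
        den += w
    return num / den  # the grid spacing cancels in the ratio

def self_energy(G, eps, lo=0.05, hi=3.0, iters=60):
    """Solve green(a, eps) = G for a by bisection and return a - 1/G.

    Assumes the bracket [lo, hi] contains the solution; green is decreasing in a.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if green(mid, eps) > G:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - 1.0 / G

def bold2(G, eps):
    """Bold series truncated at second order, coefficients as in Section 4.5."""
    return -1.5 * eps * G + 1.5 * eps ** 2 * G ** 3

for eps in (0.01, 0.02, 0.04):
    err = self_energy(1.0, eps) - bold2(1.0, eps)
    print(eps, err, err / eps ** 3)
```

If the second-order truncation is correct, the last printed column should remain bounded as $\varepsilon $ decreases, in line with an $O(\varepsilon ^{3})$ truncation error.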

4.5 Caveat Concerning Truncation of the Bold Diagrammatic Expansion

Although the LW and self-energy functionals are defined even for G such that the corresponding quadratic form $A = A[G]$ is indefinite (and hence there is no physical bare non-interacting Green’s function), Green’s function methods (as discussed in Section 4.7 of the accompanying paper) based on truncation of the bold diagrammatic expansion can fail dramatically in the case of indefinite A. One can encounter divergent behavior as the interaction becomes small, or the Green’s function method may fail to admit a solution. Both failure modes can be demonstrated by simple one-dimensional examples. The relevance of these to the solution of the quantum many-body problem is at this point unclear.

Consider the one-dimensional example of

$$\begin{aligned} Z = \int _{\mathbb {R}} e^{\frac{1}{2} x^2 - \frac{1}{8} \lambda x^4}\,\mathrm {d}x, \end{aligned}$$

(4.9)

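where the quadratic coefficient is $a = -1$, that is, the Hamiltonian is $\frac{1}{2} a x^2 + \frac{1}{8} \lambda x^4$ with $a=-1$. The corresponding non-interacting Green’s function is $G^{0}=a^{-1}=-1<0$ and hence is not even a physical Green’s function.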

Nonetheless with $\lambda >0$ the true Green’s function is still well-defined via

$$\begin{aligned} G = \frac{1}{Z} \int _{\mathbb {R}} x^2 e^{\frac{1}{2} x^2 - \frac{1}{8} \lambda x^4}\,\mathrm {d}x. \end{aligned}$$

We now compute G via the Hartree-Fock method (cf. Section 4.7 of the accompanying paper), that is, we approximate the self-energy as

$$\begin{aligned} \Sigma ^{(1)} = -\frac{1}{2} \lambda G - \lambda G = -\frac{3}{2} \lambda G. \end{aligned}$$

Hence the self-consistent solution $G^{(1)}$ of the Dyson equation solves

$$\begin{aligned} \frac{1}{G^{(1)}} = -1 + \frac{3}{2} \lambda G^{(1)}. \end{aligned}$$

There is only one positive (physical) solution to this equation, namely

$$\begin{aligned} G^{(1)} = \frac{1 + \sqrt{1+6\lambda }}{3\lambda }. \end{aligned}$$

In the spirit of perturbation theory, one might hope that $G^{(1)}$ is a good approximation to G at least when $\lambda \rightarrow 0$. However we see just the opposite. This is perhaps not surprising because the exact Green’s function G itself blows up in this limit.

The failure of the method as $\lambda \rightarrow 0$ can be understood more precisely as follows. Rewrite the Hamiltonian from (4.9) as

$$\begin{aligned} \frac{1}{8} \lambda \left( x^2-\frac{2}{\lambda }\right) ^2 - \frac{1}{2\lambda }. \end{aligned}$$

The corresponding Gibbs measure (which is unaffected by the additive constant) then concentrates about two peaks at $x=\pm \sqrt{\frac{2}{\lambda }}$ as $\lambda \rightarrow 0$. Hence we expect

$$\begin{aligned} G \sim 2 \lambda ^{-1}. \end{aligned}$$

We note that, in contrast with the statement of Lemma 4.8, the limit $\lim _{\lambda \rightarrow 0+} G(\lambda )$ does not exist. According to the explicit formula for $G^{(1)}$ above,

$$\begin{aligned} G^{(1)} \sim \frac{2}{3} \lambda ^{-1}. \end{aligned}$$

We find that as $\lambda \rightarrow 0+$, G and its first order approximation $G^{(1)}$ do not agree.
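The mismatch between the two scalings can be observed numerically. The sketch below is only illustrative: the function names `exact_green` and `hartree_fock_green` are hypothetical, the quadrature window is an ad hoc choice, and the maximum $\frac{1}{2\lambda }$ of the exponent is subtracted purely to avoid overflow. It evaluates the exact G from (4.9) by quadrature and compares $\lambda G$ with $\lambda G^{(1)}$.

```python
import math

def exact_green(lam, n=8001):
    """E[x^2] under the density proportional to exp(x^2/2 - lam*x^4/8)."""
    x0 = math.sqrt(2.0 / lam)        # location of the two peaks
    half_width = x0 + 10.0           # integration window [-half_width, half_width]
    h = 2.0 * half_width / (n - 1)
    shift = 1.0 / (2.0 * lam)        # maximum of the exponent, subtracted for stability
    num = den = 0.0
    for i in range(n):
        x = -half_width + i * h
        w = math.exp(0.5 * x * x - lam * x ** 4 / 8.0 - shift)
        num += x * x * w
        den += w
    return num / den                 # the grid spacing cancels in the ratio

def hartree_fock_green(lam):
    """Positive root of 1/G = -1 + (3/2)*lam*G."""
    return (1.0 + math.sqrt(1.0 + 6.0 * lam)) / (3.0 * lam)

for lam in (0.2, 0.1, 0.05, 0.02):
    print(lam, lam * exact_green(lam), lam * hartree_fock_green(lam))
```

As $\lambda $ decreases, the second printed column should approach 2 and the third should approach $\frac{2}{3}$, matching the two asymptotic statements above.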

We now include the second-order terms of the bold diagrammatic expansion:

$$\begin{aligned} \Sigma ^{(2)} = \frac{1}{2} \lambda ^2 G^3 + \lambda ^2 G^3 = \frac{3}{2} \lambda ^2 G^3. \end{aligned}$$

(4.10)

Then the self-consistent solution $G^{(2)}$ of the Dyson equation solves

$$\begin{aligned} \frac{1}{G^{(2)}} = -1 + \frac{3}{2} \lambda G^{(2)} - \frac{3}{2} \lambda ^2 \left( G^{(2)}\right) ^3. \end{aligned}$$

This yields a quartic equation in the scalar $G^{(2)}$, which in fact has no solution for physical $G^{(2)}$, that is, $G^{(2)} > 0$.

To see this, first ease the notation by substituting $x \leftarrow G^{(2)}$, so we are interested in the solutions $x>0$ of

$$\begin{aligned} \frac{3}{2} \left[ (\lambda ^{1/2}x)^4 - (\lambda ^{1/2} x)^2 \right] + x + 1 = 0. \end{aligned}$$

However, $y^4 - y^2 \geqq -\frac{1}{4}$ for all y, so the first term is at least $-\frac{3}{8}$, which evidently implies that no solutions exist for $x>0$.
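The bound can be sanity-checked on a grid. The following sketch, with hypothetical helper names `dyson_residual` and `min_residual` and an arbitrary grid on $(0,50]$, evaluates the left-hand side of the quartic for several values of $\lambda $. By the inequality above, every value should exceed $\frac{5}{8}$.

```python
def dyson_residual(x, lam):
    """(3/2)[(lam^{1/2} x)^4 - (lam^{1/2} x)^2] + x + 1, the left-hand side of the quartic."""
    y2 = lam * x * x
    return 1.5 * (y2 * y2 - y2) + x + 1.0

def min_residual(lam, x_max=50.0, n=50001):
    """Smallest residual over a uniform grid on (0, x_max]."""
    return min(dyson_residual(x_max * i / (n - 1), lam) for i in range(1, n))

for lam in (0.01, 0.1, 1.0, 10.0):
    print(lam, min_residual(lam))
```

A grid search is of course no substitute for the one-line argument in the text; it only illustrates that the residual stays bounded away from zero for $x>0$.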

5 Proof of the Continuous Extension of the LW Functional

In Section 3.5 we motivated the continuous extension of the LW functional to the boundary of $\mathcal {S}^N_{++}$ and stated this result in two equivalent forms (Theorems 3.18 and 3.20). In this section we prove the continuous extension property (for interactions of strong growth). We also develop the counterexample promised earlier, an interaction of weak but not strong growth for which the continuous extension property fails.

The section is outlined as follows. In Section 5.1, we describe some preliminary reductions in the proof of the continuous extension property, after which the proof can be divided into two parts: lower-bounding the limit inferior of the LW functional as the argument approaches the boundary and upper-bounding the limit superior. In Section 5.2, we prove the lower bound, and in Section 5.3 we prove the upper bound. In Section 5.4 we provide an alternate view on the continuous extension property from the Legendre dual side, and in Section 5.5 we use this perspective to exhibit the aforementioned counterexample to the continuous extension property, which satisfies the weak growth condition but not the strong one.

5.1 Proof Setup

We are going to prove Theorem 3.20, which as we have remarked suffices to prove Theorem 3.18 by changing coordinates via Proposition 3.10.

Suppose $G\in \mathcal {S}^N_{+}$ is of the form

$$\begin{aligned} G=\left( \begin{array}{cc} G_{p} &{} 0\\ 0 &{} 0 \end{array}\right) , \end{aligned}$$

where $G_{p}\in \mathcal {S}_{++}^{p}$, and suppose that $G^{(j)} \in \mathcal {S}^N_{++}$ with $G^{(j)}\rightarrow G$ as $j\rightarrow \infty $. For each j, diagonalize $G^{(j)}=\sum _{i=1}^{N}\lambda _{i}^{(j)} v_{i}^{(j)} \left( v_{i}^{(j)} \right) ^{T}$, where the $v_{i}^{(j)}$ are orthonormal, $\lambda _{i}^{(j)}>0$ for $i=1,\ldots ,N$.

We want to show that

$$\begin{aligned} \Phi _{N}[G^{(j)},U] \rightarrow \Phi _{p}[G_p,U(\,\cdot \,,0)]. \end{aligned}$$

It suffices to show that every subsequence has a convergent subsequence with its limit being $\Phi _{p}[G_p,U(\,\cdot \,,0)]$. The $G^{(j)}$ are convergent, hence bounded (in the $\Vert \cdot \Vert _2$ norm), so the $\lambda _i^{(j)}$ are bounded. Moreover, the $v_i^{(j)}$ are all of unit length, hence bounded, so by passing to a subsequence if necessary we can assume that, for each i, there exist $\lambda _i, v_i$ such that $\lambda _i^{(j)} \rightarrow \lambda _i$ and $v_i^{(j)} \rightarrow v_i$ as $j\rightarrow \infty $. It follows that the $v_i$ are orthonormal and that G can be diagonalized as $G=\sum _{i=1}^{N}\lambda _{i}v_{i}v_{i}^{T}$. Since $G_p$ is positive definite, we must have $\lambda _{i}>0$ for $i=1,\ldots ,p$, and moreover $\lambda _{i}=0$ for $i=p+1,\ldots ,N$. Evidently, the eigenvectors of G with strictly positive eigenvalues must be precisely the eigenvectors of $G_p$, concatenated with $N-p$ zero entries, that is, for $i=1,\ldots ,p$, $v_i$ must be of the form $(*,0)$. By orthogonality, for $i=p+1,\ldots , N$, $v_i$ must be of the form $(0,*)$.

For convenience we also establish the following notation:

$$\begin{aligned}V_{G}:= \mathrm {span}\{v_1 ,\ldots , v_p\},\ V_{G^{(j)}}:= \mathrm {span}\{v_1^{(j)} ,\ldots , v_p^{(j)}\}. \end{aligned}$$

Now the proof consists of proving two bounds: a lower bound

$$\begin{aligned} \liminf _{j\rightarrow \infty }\Phi _{N}[G^{(j)},U] \geqq \Phi _{p}[G_{p},U(\cdot ,0)] \end{aligned}$$

and an upper bound

$$\begin{aligned} \limsup _{j\rightarrow \infty }\Phi _{N}[G^{(j)},U] \leqq \Phi _{p}[G_{p},U(\cdot ,0)]. \end{aligned}$$

These bounds will be proved in the next two sections, that is, Sections 5.2 and 5.3, respectively.

5.2 Lower Bound

We want to establish a lower bound on $\Phi _{N}[G^{(j)},U]$ via our expression for $\mathcal {F}_{N}$ as a supremum:

$$\begin{aligned} \mathcal {F}_{N}[G^{(j)},U]=\sup _{\mu \in \mathcal {G}_{N}^{-1}(G^{(j)})}\left[ H(\mu )-\int U\,\,\mathrm {d}\mu \right] . \end{aligned}$$

(5.1)

This strategy requires us to construct measures $\mu ^{(j)} \in \mathcal {G}_{N}^{-1}(G^{(j)})$. Intuitively, what one hopes to do (though this strategy will require some modification) is the following: consider the measure $\alpha $ on $\mathbb {R}^p$ that attains the supremum in the analogous expression for $\mathcal {F}_p [G_p,U(\,\cdot \,,0)]$, identify this measure with a measure on $V_G \simeq \mathbb {R}^p$, rotate and scale appropriately to obtain a measure $\alpha ^{(j)}$ supported on $V_{G^{(j)}}$ with the correct second-order moments with respect to this subspace, and finally take the direct sum with an appropriate Gaussian measure $\beta ^{(j)}$ on $V_{G^{(j)}}^\perp $. Unfortunately, due to difficulties of analysis, it is not clear how to then prove the desired limit as $j\rightarrow \infty $.

However, the analysis of this limit would be feasible if the $\mu ^{(j)}$ had compact support (which they evidently do not). Then our approach is to carry out a construction that preserves the spirit of the ‘ideal’ construction just described but instead works with $\mu ^{(j)}$ of (uniform) compact support.

For convenience we let $\mathcal {M}_c \subset \mathcal {M}_2$ denote the subset of measures of compact support. The acceptability of working with measures of compact support can be motivated by the following lemma, which will be used below. (In the statement we temporarily suppress dependence on the interaction and the dimension from the notation.)

Lemma 5.1

For all $G\in \mathcal {S}^N$,

$$\begin{aligned} \mathcal {F} [G] = \sup _{\mu \in \mathcal {G}^{-1}(G)\cap \mathcal {M}_c} \left[ H(\mu ) - \int U\,\,\mathrm {d}\mu \right] . \end{aligned}$$

Now we outline our actual construction of the $\mu ^{(j)}$. Consider an arbitrary measure $\alpha \in \mathcal {G}_{p}^{-1}(G_{p})$ with compact support on $\mathbb {R}^{p}\simeq V_{G}$. (We abuse notation slightly by considering $\alpha $ as a measure on both $\mathbb {R}^{p}$ and $V_{G}$.) The idea now is to construct a measure $\mu ^{(j)} \in \mathcal {G}_{N}^{-1}(G^{(j)})$ by rotating $\alpha $ and scaling appropriately to obtain a measure $\alpha ^{(j)}$ supported on $V_{G^{(j)}}$ and then taking the direct sum with a compactly supported measure $\beta ^{(j)}$ on $V_{G^{(j)}}^\perp $ (the details of which will be discussed later). In fact the supremum in (5.1) will be approximately attained by a measure of this form as $j\rightarrow \infty $, that is, our lower bound will be tight as $j\rightarrow \infty $.

Accordingly, for the construction of $\alpha ^{(j)}$, let $O^{(j)}$ be the orthogonal linear transformation sending $v_i \mapsto v_i^{(j)}$, and let $D^{(j)}$ be the linear transformation with matrix (in the $v_i^{(j)}$ basis) given by

$$\begin{aligned} \mathrm {diag}\left( \sqrt{\lambda _{1}^{(j)}/\lambda _{1}},\ldots ,\sqrt{\lambda _{p}^{(j)}/\lambda _{p}}, 1,\ldots ,1 \right) . \end{aligned}$$

Then define $T^{(j)} := D^{(j)} O^{(j)}$ and $\alpha ^{(j)} := T^{(j)} \# \alpha $. Note that $T^{(j)} \rightarrow I_N$ as $j \rightarrow \infty $. Moreover, observe that $\alpha ^{(j)}$ is a measure supported on $V_{G^{(j)}}$ with second-order moment matrix given by $\mathrm {diag}(\lambda _{1}^{(j)},\ldots ,\lambda _{p}^{(j)})$ with respect to the coordinates on $V_{G^{(j)}}$ induced by the orthonormal basis $v_{1}^{(j)},\ldots ,v_{p}^{(j)}$.

Now we turn to the construction of $\beta ^{(j)}$. Let $R>1$ and let $\gamma $ be a measure supported on $[-R,R]$ with $\int x^2 \,\mathrm{{d}}\gamma = 1$. The parameter R will control the size of the support of $\beta ^{(j)}$ and will be sent to $+\infty $ at the very end of the proof of the lower bound (after the limit in j has been taken). Then define

$$\begin{aligned} \Lambda ^{(j)} := \mathrm {diag}\left( \sqrt{\lambda _{p+1}^{(j)}},\ldots ,\sqrt{\lambda _{N}^{(j)}} \right) , \end{aligned}$$

and define a measure $\beta ^{(j)}$ on $\mathbb {R}^{N-p}$ by $\beta ^{(j)} := \Lambda ^{(j)} \# (\gamma \times \cdots \times \gamma )$. Note that $\Lambda ^{(j)}\rightarrow 0$ as $j\rightarrow \infty $. Abusing notation slightly, we will also identify $\beta ^{(j)}$ with a measure supported on $V_{G^{(j)}}^{\perp }\simeq \mathbb {R}^{N-p}$ via the identification of the orthonormal basis $v_{p+1}^{(j)},\ldots ,v_{N}^{(j)}$ for $V_{G^{(j)}}^\perp $ with the standard basis of $\mathbb {R}^{N-p}$.

Finally, define the product measure $\mu ^{(j)}:=\alpha ^{(j)}\times \beta ^{(j)}$ with respect to the product structure ${\mathbb {R}^{N}}=V_{G^{(j)}}\times V_{G^{(j)}}^{\perp }$, and note that $\mu ^{(j)}\in \mathcal {G}_N^{-1}(G^{(j)})$, so by (5.1),

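$$\begin{aligned} \mathcal {F}_{N}[G^{(j)},U]\geqq & {} H(\alpha ^{(j)}\times \beta ^{(j)})-\int U\,\,\mathrm {d}\mu ^{(j)}\\= & {} H(\alpha ^{(j)})+H(\beta ^{(j)})-\int U\,\mathrm{{d}} \mu ^{(j)}\\= & {} H(\alpha )+\frac{1}{2}\sum _{i=1}^{p}\log \frac{\lambda _{i}^{(j)}}{\lambda _{i}}-\int U\,\,\mathrm {d}\mu ^{(j)}+\frac{1}{2}\sum _{i=p+1}^{N}\log \lambda _{i}^{(j)}+(N-p)H(\gamma ), \end{aligned}$$

where $H(\alpha ^{(j)})$ and $H(\beta ^{(j)})$ are the entropies of $\alpha ^{(j)}$ and $\beta ^{(j)}$ on the probability spaces $V_{G^{(j)}}$ and $V_{G^{(j)}}^{\perp }$, respectively. The second term on the last line accounts for the scaling by $D^{(j)}$ and tends to zero as $j\rightarrow \infty $, since $\lambda _{i}^{(j)}\rightarrow \lambda _{i}>0$ for $i=1,\ldots ,p$.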

Notice that there is a compact set on which all of the measures $\mu ^{(j)}$ are supported. It is then not difficult to see that $\mu ^{(j)}$ converges weakly to the measure $\alpha \times \delta _0$, where the product is with respect to the product structure ${\mathbb {R}^{N}}= V_G \times V_G^\perp $ and $\delta _0$ is the Dirac delta measure localized at the origin. By the continuity of U and the uniform boundedness of the supports of $\mu ^{(j)}$, this is enough to guarantee that

$$\begin{aligned} \int U\,\,\mathrm {d}\mu ^{(j)} \rightarrow \int U\,\mathrm {d}(\alpha \times \delta _0) = \int U(\cdot ,0)\,\mathrm{{d}}\alpha \end{aligned}$$

as $j\rightarrow \infty $.

Next we write the Luttinger–Ward functional in terms of $\mathcal {F}_N$:

$$\begin{aligned} \frac{1}{2}\Phi _{N}[G^{(j)},U]= & {} \mathcal {F}_{N}[G^{(j)},U]-\frac{1}{2}\mathrm {Tr}[\log (G^{(j)})]-\frac{N}{2}\log (2\pi e)\\= & {} \mathcal {F}_{N}[G^{(j)},U]-\frac{1}{2}\sum _{i=1}^{N}\log \lambda _{i}^{(j)}-\frac{N}{2}\log (2\pi e). \end{aligned}$$

Then combining the preceding observations yields

$$\begin{aligned} \liminf _{j\rightarrow \infty }\frac{1}{2}\Phi _{N}[G^{(j)},U]\geqq & {} \liminf _{j\rightarrow \infty }\left[ H(\alpha )-\int U\,\,\mathrm {d}\mu ^{(j)}-\frac{1}{2}\sum _{i=1}^{p}\log \lambda _{i}^{(j)}\right. \\&\left. -\frac{N}{2}\log (2\pi e)+(N-p)H(\gamma ) \right] \\= & {} H(\alpha )-\int U(\cdot ,0)\,\mathrm{{d}}\alpha -\frac{1}{2}\sum _{i=1}^{p}\log \lambda _{i}\\&-\frac{N}{2}\log (2\pi e)+(N-p)H(\gamma )\\= & {} H(\alpha )-\int U(\cdot ,0)\,\mathrm{{d}}\alpha -\frac{1}{2}\mathrm {Tr}\left[ \log (G_{p})\right] \\&-\frac{N}{2}\log (2\pi e)+(N-p)H(\gamma ). \end{aligned}$$

Now for any $\varepsilon > 0$, we can choose R sufficiently large and $\gamma $ supported on $[-R,R]$ such that $H(\gamma ) \geqq \frac{1}{2} \log (2\pi e) - \varepsilon $. Indeed, note that $\frac{1}{2} \log (2\pi e)$ is the entropy of the standard normal distribution, that is, the maximal entropy over measures of unit variance. By restricting the normal distribution to $[-R,R]$ for R sufficiently large, we can become arbitrarily close to saturating this bound. Therefore we have that
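The saturation claim can be illustrated with a closed-form computation. The sketch below is an assumption-laden illustration rather than the construction used in the proof: it restricts the standard normal to $[-R,R]$ and then rescales to unit variance, so the resulting measure is supported on a slightly larger interval than $[-R,R]$. The helper name `truncated_normal_entropy` is hypothetical. It uses the fact that the truncated density is $\phi (x)/Z$ with $Z=\mathrm {erf}(R/\sqrt{2})$, and that rescaling by a factor c shifts the entropy by $\log c$.

```python
import math

def truncated_normal_entropy(R):
    """Entropy of the standard normal restricted to [-R, R], then rescaled to unit variance."""
    Z = math.erf(R / math.sqrt(2.0))                      # mass of [-R, R]
    phi_R = math.exp(-0.5 * R * R) / math.sqrt(2.0 * math.pi)
    var = 1.0 - 2.0 * R * phi_R / Z                       # variance after truncation
    h = 0.5 * math.log(2.0 * math.pi) + math.log(Z) + 0.5 * var
    return h - 0.5 * math.log(var)                        # rescaling x -> x / sqrt(var)

H_MAX = 0.5 * math.log(2.0 * math.pi * math.e)            # entropy of the standard normal

for R in (2.0, 4.0, 6.0):
    print(R, H_MAX - truncated_normal_entropy(R))
```

The printed gap should be positive and shrink rapidly as R grows, which is the sense in which $H(\gamma )$ can be made to exceed $\frac{1}{2}\log (2\pi e) - \varepsilon $.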

$$\begin{aligned} \liminf _{j\rightarrow \infty }\frac{1}{2}\Phi _{N}[G^{(j)},U] \geqq H(\alpha )-\int U(\cdot ,0)\,\mathrm{{d}}\alpha -\frac{1}{2}\mathrm {Tr}\left[ \log (G_{p})\right] -\frac{p}{2}\log (2\pi e). \end{aligned}$$
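The truncation step behind this bound can be checked numerically. The following is a minimal sketch, not part of the original proof: it evaluates the closed-form entropy of the standard normal distribution restricted to $[-R,R]$ and renormalized, and prints the gap to $\frac{1}{2}\log (2\pi e)$. The function name `truncated_normal_entropy` is an illustrative choice. The truncated law has variance slightly below one; should a unit-variance $\gamma $ be required, rescaling to unit variance only increases the entropy.

```python
import math

def truncated_normal_entropy(R):
    """Differential entropy of N(0,1) restricted to [-R, R] and renormalized."""
    phi_R = math.exp(-0.5 * R * R) / math.sqrt(2.0 * math.pi)  # normal density at R
    Z = math.erf(R / math.sqrt(2.0))                            # mass of [-R, R]
    # H = log Z + (1/2) log(2 pi) + (1/2) E[x^2], with E[x^2] = 1 - 2 R phi(R) / Z
    return math.log(Z) + 0.5 * math.log(2.0 * math.pi) + 0.5 - R * phi_R / Z

H_MAX = 0.5 * math.log(2.0 * math.pi * math.e)  # entropy of the standard normal

for R in (1.0, 2.0, 4.0, 8.0):
    # The gap to the maximal entropy shrinks toward zero as R grows.
    print(R, H_MAX - truncated_normal_entropy(R))
```

The gap decays very quickly in $R$, which is why any prescribed $\varepsilon $ can be met with a finite truncation radius.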

Since $\alpha $ was arbitrary in $\mathcal {G}_{p}^{-1}(G_{p})\cap \mathcal {M}_c$, this establishes the desired lower bound

$$\begin{aligned} \frac{1}{2}\liminf _{j\rightarrow \infty }\Phi _{N}[G^{(j)},U]\geqq & {} \sup _{\alpha \in \mathcal {G}_{p}^{-1}(G_{p})\cap \mathcal {M}_c}\left[ H(\alpha )-\int U(\cdot ,0)\,\mathrm{{d}}\alpha \right] \\&-\frac{1}{2}\mathrm {Tr}\left[ \log (G_{p})\right] -\frac{p}{2}\log (2\pi e)\\= & {} \frac{1}{2} \Phi _{p}[G_{p},U(\cdot ,0)], \end{aligned}$$

where we have used Lemma 5.1, which allows us to look at the supremum over compactly supported measures.

Observe that the proof of the lower bound did not require the strong growth assumption, which establishes the semi-continuity claim of Remark 3.19.

5.3 Upper Bound

Next we turn to establishing an upper bound. The basic strategy is to select measures $\mu ^{(j)}$ that (approximately) attain the supremum in (5.1) and take a limit as $j\rightarrow \infty $.

Before proceeding, let $\varepsilon >0$. Moreover, define $\pi _1$ to be the orthogonal projection onto $V_{G} \simeq \mathbb {R}^p$, and define $\pi _2$ to be the orthogonal projection onto $V_{G}^\perp \simeq \mathbb {R}^{N-p}$.

Now for every j, as suggested above choose $\mu ^{(j)} \in \mathcal {G}_N^{-1}(G^{(j)})$ such that

$$\begin{aligned} \mathcal {F}_N [G^{(j)},U] \leqq H( \mu ^{(j)} ) - \int U \,\,\mathrm {d}\mu ^{(j)} + \varepsilon . \end{aligned}$$

Therefore

$$\begin{aligned} \Phi _N [G^{(j)},U] \leqq \underbrace{H( \mu ^{(j)} ) - \int U \,\,\mathrm {d}\mu ^{(j)} - \frac{1}{2} \sum _{i=1}^N \log (2\pi e \lambda _i^{(j)})}_{=: a_j} \,+ \ \varepsilon . \end{aligned}$$

(5.2)

Then choose a subsequence $j_k$ such that $\lim _{k\rightarrow \infty } a_{j_k} = \limsup _{j\rightarrow \infty } a_j$.

Now the $\mu ^{(j)}$ have uniformly bounded second moments, so by Markov’s inequality, the sequence $\mu ^{(j)}$ is tight. Then by Prokhorov’s theorem (Theorem A.4), we can assume, by extracting a further subsequence if necessary, that $\mu ^{(j_k)}$ converges weakly to some measure $\mu $.
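For completeness, the tightness claim can be spelled out as follows. This is a sketch of the standard argument, writing $B_R$ for the closed ball of radius $R$ in ${\mathbb {R}^{N}}$ (notation introduced only for this remark) and using $\mathcal {G}_N(\mu ^{(j)}) = G^{(j)}$:

$$\begin{aligned} \mu ^{(j)}\left( {\mathbb {R}^{N}} \backslash B_R \right) \leqq \frac{1}{R^2} \int \Vert x\Vert ^2 \,\,\mathrm {d}\mu ^{(j)} = \frac{\mathrm {Tr}[G^{(j)}]}{R^2} \leqq \frac{\sup _{j} \mathrm {Tr}[G^{(j)}]}{R^2}. \end{aligned}$$

Since the sequence $G^{(j)}$ converges, the supremum is finite, so the right-hand side can be made smaller than any prescribed tolerance by taking $R$ large, uniformly in j.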

We claim that $\mathcal {G}_N(\mu ) \preceq G$ (so in particular, $\mu \in \mathcal {M}_2$). Indeed, for any $z \in {\mathbb {R}^{N}}$, by the Portmanteau theorem for weak convergence of measures (Theorem A.1) we have

$$\begin{aligned} \int (z^T x)^2 \,\,\mathrm {d}\mu\leqq & {} \liminf _{k\rightarrow \infty } \int (z^T x)^2 \,\,\mathrm {d}\mu ^{(j_k)} \\= & {} \liminf _{k\rightarrow \infty } \int z^T xx^T z \,\,\mathrm {d}\mu ^{(j_k)} = \liminf _{k\rightarrow \infty } z^T G^{(j_k)} z = z^T G z. \end{aligned}$$

It follows that $\mu \in \mathcal {M}_2$ and moreover $z^T \mathcal {G}_N(\mu ) z \leqq z^T G z$ for all z, that is, $\mathcal {G}_N(\mu ) \preceq G$. In particular, $\mu $ is supported on $V_G$.
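In general the inequality between the second moments of a weak limit and the limit of the second moments can be strict, because mass carrying second moment may escape to infinity. A minimal one-dimensional sketch, constructed here for illustration and not taken from the proof: the measures $\mu _j = (1 - j^{-2})\delta _0 + \frac{1}{2} j^{-2}(\delta _j + \delta _{-j})$ all have second moment 1, yet they converge weakly to $\delta _0$, whose second moment is 0.

```python
import math

def mu_j(j):
    """Atoms and weights of mu_j = (1 - 1/j^2) delta_0 + (1/(2 j^2)) (delta_j + delta_{-j})."""
    w = 1.0 / (j * j)
    return [(0.0, 1.0 - w), (float(j), 0.5 * w), (-float(j), 0.5 * w)]

def integrate(f, measure):
    """Integral of f against a finitely supported measure given as (atom, weight) pairs."""
    return sum(weight * f(x) for x, weight in measure)

def second_moment(x):
    return x * x

def bounded_test(x):
    # A bounded continuous test function (hypothetical choice for this example).
    return math.cos(x) / (1.0 + x * x)

for j in (2, 10, 100, 1000):
    m = mu_j(j)
    # Second moment stays at 1, while the bounded integral tends to bounded_test(0).
    print(j, integrate(second_moment, m), integrate(bounded_test, m))
```

In the proof, ruling out this kind of loss for the near-maximizers $\mu ^{(j)}$ is exactly where the strong growth condition enters.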

Define $T^{(j)}$ to be the orthogonal transformation that sends $v^{(j)}_i \mapsto v_i$, so $T^{(j)} \rightarrow I_N$ as $j\rightarrow \infty $. Define $\nu ^{(j)}:= T^{(j)}\#\mu ^{(j)}$. Again by Prokhorov’s theorem, we can assume that $\nu ^{(j_k)}$ converges weakly to some measure $\nu $. In fact, we must have $\nu = \mu $. To see this, note that for any continuous compactly supported function $\phi $ on ${\mathbb {R}^{N}}$, we have that $\phi \circ T^{(j)} \rightarrow \phi $ uniformly as $j\rightarrow \infty $. Therefore

$$\begin{aligned} \lim _{j\rightarrow \infty } \int \left| \phi - \phi \circ T^{(j)} \right| \,\,\mathrm {d}\mu ^{(j)} = 0. \end{aligned}$$

Consequently

$$\begin{aligned} \int \phi \,\,\mathrm {d}\mu = \lim _{k\rightarrow \infty } \int \phi \,\,\mathrm {d}\mu ^{(j_k)} = \lim _{k\rightarrow \infty } \int \phi \circ T^{(j_k)} \,\,\mathrm {d}\mu ^{(j_k)} = \lim _{k\rightarrow \infty } \int \phi \, \mathrm{{d}}\nu ^{(j_k)} = \int \phi \,\mathrm{{d}}\nu . \end{aligned}$$

Since $\mu $ and $\nu $ agree on all continuous compactly supported functions, they must be equal (Riesz representation theorem), and $\nu ^{(j_k)} \rightarrow \mu $ weakly.

Define $\mu ^{(j)}_i := \pi _i \# \nu ^{(j)} = \left( \pi _i \circ T^{(j)}\right) \# \mu ^{(j)}$ and $\mu _i := \pi _i \#\mu $ for $i=1,2$. It follows that $\mu ^{(j_k)}_i \rightarrow \mu _i$ weakly. Notice (using Fact 2.11) that

$$\begin{aligned} H(\mu ^{(j)}) = H(\nu ^{(j)}) \leqq H(\mu ^{(j)}_1) + H(\mu ^{(j)}_2) \leqq H(\mu ^{(j)}_1) + \frac{1}{2}\sum _{i=p+1}^N \log (2\pi e \lambda _i ^{(j)}). \end{aligned}$$

Therefore, using Lemma 2.9 with the weak convergence $\mu _1^{(j_k)} \rightarrow \mu _1$, we obtain

$$\begin{aligned} \lim _{k\rightarrow \infty } a_{j_k}= & {} \lim _{k\rightarrow \infty } \left[ H(\mu ^{(j_k)}) - \int U \,\,\mathrm {d}\mu ^{(j_k)} - \frac{1}{2} \sum _{i=1}^N \log (2\pi e \lambda _i^{(j_k)}) \right] \\\leqq & {} \limsup _{k\rightarrow \infty } \left[ H(\mu _1^{(j_k)}) - \frac{1}{2}\sum _{i=1}^p \log (2\pi e \lambda _i ^{(j_k)}) \right] - \liminf _{k\rightarrow \infty } \left[ \int U \,\,\mathrm {d}\mu ^{(j_k)} \right] \\\leqq & {} H(\mu _1) - \liminf _{k\rightarrow \infty } \left[ \int U \,\,\mathrm {d}\mu ^{(j_k)} \right] - \frac{1}{2} \log ( (2\pi e)^p \det G_p). \end{aligned}$$
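The entropy estimates used in this step, namely subadditivity over the factors $V_G \times V_G^\perp $ and the Gaussian maximum-entropy bound, can be illustrated in the simplest case. The sketch below is a toy example under stated assumptions: $N=2$, $p=1$, the measure is itself a centered Gaussian, and the covariance entries are hypothetical. It uses only the standard closed form $\frac{1}{2}\log ((2\pi e)^N \det \Sigma )$ for the entropy of a Gaussian with covariance $\Sigma $.

```python
import math

def gaussian_entropy_2d(a, b, c):
    """Entropy of a centered Gaussian on R^2 with covariance [[a, c], [c, b]]."""
    det = a * b - c * c
    assert a > 0 and det > 0, "covariance must be positive definite"
    return 0.5 * math.log((2.0 * math.pi * math.e) ** 2 * det)

def gaussian_entropy_1d(var):
    """Entropy of a centered Gaussian on R with variance var."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

a, b, c = 2.0, 0.5, 0.6  # hypothetical covariance entries for illustration
joint = gaussian_entropy_2d(a, b, c)
marginals = gaussian_entropy_1d(a) + gaussian_entropy_1d(b)
# Subadditivity: the joint entropy never exceeds the sum of the marginal entropies.
print(joint, marginals)
```

Equality holds exactly when the off-diagonal entry vanishes, that is, when the two marginals are independent.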

Now for any $\alpha \in \mathbb {R}$, define $U_\alpha (x) = U(x) - \alpha \Vert x\Vert ^2$. Then

$$\begin{aligned} \int U \,\,\mathrm {d}\mu ^{(j)} = \int U_\alpha \,\,\mathrm {d}\mu ^{(j)} + \alpha \mathrm {Tr}[G^{(j)}]. \end{aligned}$$

The utility of this manipulation will be made clear later. By the strong growth condition, $U_\alpha $ is bounded below. Therefore, by the Portmanteau theorem for weak convergence of measures,

$$\begin{aligned} \liminf _{k\rightarrow \infty } \left[ \int U \,\,\mathrm {d}\mu ^{(j_k)} \right] = \alpha \mathrm {Tr}[G] + \liminf _{k\rightarrow \infty } \left[ \int U_\alpha \,\,\mathrm {d}\mu ^{(j_k)} \right] \geqq \alpha \mathrm {Tr}[G_p] + \int U_\alpha \,\,\mathrm {d}\mu . \end{aligned}$$

Since $\mu $ is supported on $V_G$, in fact we have

$$\begin{aligned} \int U_\alpha \,\,\mathrm {d}\mu = \int U_\alpha (\,\cdot \,,0)\,\,\mathrm {d}\mu _1 = \int U(\,\cdot \,,0)\,\,\mathrm {d}\mu _1 - \alpha \mathrm {Tr}[\mathcal {G}_p(\mu _1)], \end{aligned}$$

and therefore,

$$\begin{aligned} \lim _{k\rightarrow \infty } a_{j_k}\leqq & {} H(\mu _1)- \int U(\,\cdot \,,0) \,\,\mathrm {d}\mu _1 - \frac{1}{2} \log ( (2\pi e)^p \det G_p) + \alpha \mathrm {Tr}[\mathcal {G}_p(\mu _1) - G_p] \\\leqq & {} \mathcal {F}_p[\mathcal {G}_p (\mu _1), U(\,\cdot \,,0)] - \frac{1}{2} \log ( (2\pi e)^p \det G_p) + \alpha \mathrm {Tr}[\mathcal {G}_p(\mu _1) - G_p]. \end{aligned}$$

Recall from (5.2) that

$$\begin{aligned} \limsup _{j\rightarrow \infty } \Phi [G^{(j)},U] \leqq \lim _{k\rightarrow \infty } a_{j_k} + \varepsilon . \end{aligned}$$

Since $\varepsilon >0$ was arbitrary, this means that

$$\begin{aligned} \limsup _{j\rightarrow \infty } \Phi [G^{(j)},U]\leqq & {} \mathcal {F}_p[\mathcal {G}_p (\mu _1), U(\,\cdot \,,0)]\\&- \frac{1}{2} \log ( (2\pi e)^p \det G_p) + \alpha \mathrm {Tr}[\mathcal {G}_p(\mu _1) - G_p]. \end{aligned}$$

If we had $\mathcal {G}_N(\mu ) = G$, that is, $\mathcal {G}_p(\mu _1)=G_p$, then we would be done. We have $\mathcal {G}_p(\mu _1) \preceq G_p$, so it will suffice to show that $\mathrm {Tr}[\mathcal {G}_p(\mu _1) - G_p]=0$. Suppose for contradiction that $\mathrm {Tr}[\mathcal {G}_p(\mu _1) - G_p] < 0$. Then, by taking $\alpha $ arbitrarily large we see that $\limsup _{j\rightarrow \infty } \Phi [G^{(j)},U] = -\infty $, which is impossible because we already have a lower bound on $\liminf _{j\rightarrow \infty } \Phi [G^{(j)},U]$. Therefore $\mathcal {G}_p(\mu _1) = G_p$, as desired, and we have

$$\begin{aligned} \limsup _{j\rightarrow \infty } \Phi [G^{(j)},U] \leqq \Phi _p[G_p, U(\,\cdot \,,0)], \end{aligned}$$

which completes the proof.

Notice the strong growth assumption was only used in this part of the proof (that is, the proof of the upper bound). In particular, it was only used to ensure that the measure $\mu ^{(j)}$ of maximum entropy relative to $\nu _U$ (as in Remark 3.6) subject to the moment constraint $\mathcal {G}(\mu ^{(j)}) = G^{(j)}$ cannot weakly converge to a measure $\mu $ with $\mathcal {G}(\mu ) \ne G = \lim _{j\rightarrow \infty } G^{(j)}$.

5.4 Dual Perspective on Continuous Extension

We now outline how Theorem 3.18 can be reinterpreted via the transformation rule. This perspective provides another way of understanding Theorem 3.18 and allows us to present a counterexample that illustrates the necessity of the strong growth condition of Definition 2.4.

Suppose that $T_j$ are linear transformations such that $T_j \rightarrow P$, where $P = I_p \oplus 0_{N-p}$ is the orthogonal projection onto $\mathrm {span}\{e_1^{(N)},\ldots ,e_p^{(N)}\}$. Let $G \in \mathcal {S}^N_{++}$ with upper-left block given by $G_{p}$. Then, using the transformation rule, Theorem 3.18, and the projection rule, we obtain

$$\begin{aligned} \Phi _N [G, U\circ T_j] = \Phi _N [T_j G T_j^*, U] \rightarrow \Phi _p[G_{p}, U(\,\cdot \,,0)] = \Phi _N [G, U \circ P]. \end{aligned}$$

This manipulation suggests that Theorem 3.18 is equivalent to the pointwise convergence

$$\begin{aligned} \Phi _N [\,\cdot \,, U\circ T_j] \rightarrow \Phi _N [\,\cdot \,, U \circ P] \end{aligned}$$

(5.3)

for all $T_j \rightarrow P$. To see the equivalence, consider an arbitrary sequence $G^{(j)} \in \mathcal {S}^N_{++}$ converging, as before, to the block-diagonal matrix $G = G_p \oplus 0_{N-p} \in \mathcal {S}^N_{+}$, where $G_p \in \mathcal {S}^p_{++}$. Then we want to show, using Eq. (5.3), that $\Phi _N [ G^{(j)}, U] \rightarrow \Phi _p[G_{p}, U(\,\cdot \,,0)]$.

To this end, let $T_j = [G^{(j)}]^{1/2} [G_p \oplus I_{N-p}]^{-1/2}$, so $G^{(j)} = T_j (G_p \oplus I_{N-p}) T_j^*$, and $T_j \rightarrow P$. Then (5.3) implies that $\Phi _N [G_p \oplus I_{N-p}, U\circ T_j] \rightarrow \Phi _N [G_p \oplus I_{N-p}, U \circ P]$, and combining with the transformation and projection rules yields Theorem 3.18.
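The construction of $T_j$ can be made concrete in the smallest nontrivial case $N=2$, $p=1$. The sketch below is an illustration under assumptions made here: the sequence `G_j`, the value `g` of the $1\times 1$ block $G_p$, and all helper names are hypothetical, and the $2\times 2$ square root uses the closed form $(M + sI)/t$ with $s=\sqrt{\det M}$, $t=\sqrt{\mathrm {Tr}\,M + 2s}$. It checks that $G^{(j)} = T_j (G_p \oplus I_{N-p}) T_j^*$ and that $T_j$ approaches $P$.

```python
import math

def sqrt_spd_2x2(m):
    """Principal square root of a symmetric positive (semi)definite 2x2 matrix.

    Uses sqrt(M) = (M + s I) / t with s = sqrt(det M), t = sqrt(tr M + 2 s),
    which follows from the Cayley-Hamilton theorem.
    """
    (a, c), (_, b) = m
    s = math.sqrt(max(a * b - c * c, 0.0))
    t = math.sqrt(a + b + 2.0 * s)
    return [[(a + s) / t, c / t], [c / t, (b + s) / t]]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]

g = 2.0  # the 1x1 block G_p (hypothetical value)

def G_j(j):
    """A hypothetical positive definite sequence converging to diag(g, 0)."""
    return [[g, 1.0 / j], [1.0 / j, 2.0 / j]]

def T_j(j):
    """T_j = [G^(j)]^{1/2} [G_p (+) I]^{-1/2}."""
    inv_sqrt_ref = [[1.0 / math.sqrt(g), 0.0], [0.0, 1.0]]
    return matmul(sqrt_spd_2x2(G_j(j)), inv_sqrt_ref)

ref = [[g, 0.0], [0.0, 1.0]]  # G_p (+) I_{N-p}
for j in (1, 100, 10**8):
    T = T_j(j)
    rebuilt = matmul(matmul(T, ref), transpose(T))  # should reproduce G_j(j)
    print(j, T, rebuilt)
```

As j grows, the printed `T` approaches the projection with entries 1, 0, 0, 0, while `rebuilt` matches `G_j(j)` up to rounding.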

Note that (5.3) is equivalent to the pointwise convergence of concave functions $\mathcal {F}_N[\,\cdot \,,U\circ T_j] \rightarrow \mathcal {F}_N[\,\cdot \,,U\circ P]$ as $T_j \rightarrow P$. Since the domains of these concave functions are open (namely, $\mathcal {S}^N_{++}$), by Theorem A.22 this is actually equivalent to uniform convergence on all compact subsets of $\mathcal {S}^N_{++}$. Furthermore, since $\mathcal {F}_N[\,\cdot \,,U\circ T_j]$ and $\mathcal {F}_N[\,\cdot \,,U\circ P]$ are both uniformly $-\infty $ on $\mathcal {S}^N\backslash \mathcal {S}^N_{++}$, this is in turn equivalent to uniform convergence on all compact subsets of $\mathcal {S}^N$ that do not contain a boundary point of $\mathcal {S}^N_{++}$, which by Theorem A.20 is equivalent to the hypo-convergence (see Definition A.19) $\mathcal {F}_N[\,\cdot \,,U\circ T_j] \overset{\mathrm {h}}{\rightarrow } \mathcal {F}_N[\,\cdot \,,U\circ P]$. (Note that the role of epi-convergence for convex functions is assumed by hypo-convergence for concave functions.) But then hypo-convergence is equivalent to hypo-convergence of the concave conjugates (Theorem A.21), that is, of $\Omega [\,\cdot \,,U\circ T_j]$ to $\Omega [\,\cdot \,,U\circ P]$ as $j \rightarrow \infty $.

In summary, the continuous extension property is equivalent to the hypo-convergence $\Omega [\,\cdot \,,U\circ T_j] \overset{\mathrm {h}}{\rightarrow } \Omega [\,\cdot \,,U\circ P]$.

5.5 Counterexample of Weak but Not Strong Growth

Here we give a counterexample to show that the weak growth condition is insufficient for guaranteeing the continuous extension property. By the discussion of Section 5.4, we need only find U satisfying the weak growth condition for which $\Omega [\,\cdot \,,U\circ T_j]$ fails to hypo-converge to $\Omega [\,\cdot \,,U\circ P]$.

For example, consider $N=2$ and

$$\begin{aligned} U(x_1, x_2) = {\left\{ \begin{array}{ll} \vert x_1 \vert ^4 &{} \vert x_1 \vert \leqq \vert x_2 \vert ^{-1} \\ \vert x_2 \vert ^{-4} &{} \vert x_1 \vert \geqq \vert x_2 \vert ^{-1}. \end{array}\right. } \end{aligned}$$

If $x_2 = 0$, then the first case holds for all $x_1$. This interaction is nonnegative, and hence satisfies the first part of the weak growth condition of Definition 2.3 with $C_{U}=0$. To see that U satisfies the weak growth condition, we need only show that $\mathrm {dom}\,\Omega $ is open. Clearly $\mathrm {dom}\,\Omega \supset \mathcal {S}^N_{++}$. Moreover, the restriction of U to any line except the $x_1$-axis is bounded, and it follows that in fact $\mathrm {dom}\,\Omega = \mathcal {S}^N_{++}$, hence $\mathrm {dom}\,\Omega $ is open, as desired.

Now let

$$\begin{aligned} T_j := \left( \begin{array}{cc} 1 &{} 0\\ 0 &{} j^{-1} \end{array}\right) \rightarrow P := \left( \begin{array}{cc} 1 &{} 0\\ 0 &{} 0 \end{array}\right) . \end{aligned}$$

Since $\Omega [\,\cdot \,,U\circ P]$ has an open domain, namely,

$$\begin{aligned} \mathrm {dom}\,\left( \Omega [\,\cdot \,,U\circ P]\right) = \left\{ A = (a_{ij}) \in \mathcal {S}^2 : a_{22} > 0 \right\} , \end{aligned}$$

the hypo-convergence of $\Omega [\,\cdot \,,U\circ T_j]$ to $\Omega [\,\cdot \,,U\circ P]$ is equivalent to pointwise convergence (by Theorems A.20 and A.22), which is the same as the pointwise convergence $Z[\,\cdot \,,U\circ T_j] \rightarrow Z[\,\cdot \,,U\circ P]$.

Set $A = (a_{ij})$ via $a_{11} = a_{12} = 0$, $a_{22} = 1$, so A is in the domain of $\Omega [\,\cdot \,,U\circ P]$, that is, $Z[A,U\circ P] < +\infty $. However,

$$\begin{aligned} Z[A,U\circ T_j]= & {} \int e^{- \frac{1}{2} \vert x_2 \vert ^2 - U(x_1,j^{-1} x_2) }\,\,\mathrm {d}x_1\,\,\mathrm {d}x_2\\= & {} j\cdot \int e^{-j^2 \frac{1}{2} \vert x_2 \vert ^2 - U(x_1, x_2) }\,\,\mathrm {d}x_1\,\,\mathrm {d}x_2. \end{aligned}$$

Now the restriction of the last integrand to any line of constant $x_2 \ne 0$ is equal to the constant $e^{-\frac{1}{2} j^2 \vert x_2\vert ^2 - \vert x_2 \vert ^{-4}} > 0$ whenever $\vert x_1 \vert \geqq \vert x_2 \vert ^{-1}$, so the integral along any such line is $+\infty $, and by Fubini’s theorem, $Z[A,U\circ T_j] = +\infty $. Thus convergence fails at A, and we have a counterexample as claimed.
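The divergence along lines of constant $x_2$ can also be observed numerically. The sketch below is an illustration added here, with helper names and the midpoint rule chosen for this example only. It integrates $e^{-U(x_1,x_2)}$ over $x_1 \in [-L,L]$ at fixed $x_2 = 1$ and shows that the result grows linearly in the cutoff $L$. The Gaussian factor in $x_2$ is constant along each such line and is therefore omitted.

```python
import math

def U(x1, x2):
    """The interaction of the counterexample (first case also covers x2 = 0)."""
    if x2 == 0.0 or abs(x1) <= 1.0 / abs(x2):
        return abs(x1) ** 4
    return abs(x2) ** -4

def line_integral(x2, L, n=20000):
    """Midpoint rule for the integral of exp(-U(x1, x2)) over x1 in [-L, L]."""
    h = 2.0 * L / n
    return h * sum(math.exp(-U(-L + (i + 0.5) * h, x2)) for i in range(n))

for L in (10.0, 100.0, 1000.0):
    # For x2 = 1 the integrand equals exp(-1) once |x1| >= 1, so each unit of
    # extra cutoff adds 2 * exp(-1) to the integral.
    print(L, line_integral(1.0, L))
```

Because the tail contribution is proportional to $L$, no finite value is approached as the cutoff is removed, in line with $Z[A,U\circ T_j] = +\infty $.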

Notes

The Luttinger–Ward formalism is also known as the Kadanoff-Baym formalism [4] depending on the context. In this paper we always use the former.
Our relative entropy is then the negative of the Kullback-Leibler divergence, that is, $H_\nu (\mu ) = -D_{\mathrm {KL}} (\mu \Vert \nu )$.

References

Altland, A.; Simons, B.D.: Condensed Matter Field Theory. Cambridge University Press, Cambridge, 2010
Amit, D.J.; Martin-Mayor, V.: Field Theory, the Renormalization Group, and Critical Phenomena: Graphs to Computers. World Scientific Publishing Co Inc, Singapore, 2005
Baerends, E.J.: Exact exchange-correlation treatment of dissociated ${H}_{2}$ in density functional theory. Phys. Rev. Lett. 87, 133004, 2001
Baym, G.; Kadanoff, L.P.: Conservation laws and correlation functions. Phys. Rev. 124, 287, 1961
Benlagra, A.; Kim, K.-S.; Pépin, C.: The Luttinger-Ward functional approach in the Eliashberg framework: a systematic derivation of scaling for thermodynamics near the quantum critical point. J. Phys. Condens. Matter 23, 145601, 2011
Billingsley, P.: Probability and Measure. Wiley, New York, 2012
Blöchl, P.E.; Pruschke, T.; Potthoff, M.: Density-matrix functionals from Green’s functions. Phys. Rev. B 88, 205139, 2013
Dahlen, N.E.; Van Leeuwen, R.; Von Barth, U.: Variational energy functionals of the Green function tested on molecules. Int. J. Quantum Chem. 101, 512–519, 2005
Elder, R.: Comment on “Non-existence of the Luttinger-Ward functional and misleading convergence of skeleton diagrammatic series for Hubbard-like models”. arXiv:1407.6599, 2014
Georges, A.; Kotliar, G.; Krauth, W.; Rozenberg, M.J.: Dynamical mean-field theory of strongly correlated fermion systems and the limit of infinite dimensions. Rev. Mod. Phys. 68, 13, 1996
Gunnarsson, O.; Rohringer, G.; Schäfer, T.; Sangiovanni, G.; Toschi, A.: Breakdown of traditional many-body theories for correlated electrons. Phys. Rev. Lett. 119, 056402, 2017
Hartmann, C.; Richter, L.; Schütte, C.; Zhang, W.: Variational characterization of free energy: theory and algorithms. Entropy 19, 626, 2017
Ismail-Beigi, S.: Correlation energy functional within the GW-RPA: exact forms, approximate forms, and challenges. Phys. Rev. B 81, 1–21, 2010
Kotliar, G.; Savrasov, S.Y.; Haule, K.; Oudovenko, V.S.; Parcollet, O.; Marianetti, C.A.: Electronic structure calculations with dynamical mean-field theory. Rev. Mod. Phys. 78, 865, 2006
Kozik, E.; Ferrero, M.; Georges, A.: Nonexistence of the Luttinger–Ward Functional and Misleading Convergence of Skeleton Diagrammatic Series for Hubbard-Like Models. Phys. Rev. Lett. 114, 156402, 2015
Levy, M.: Universal variational functionals of electron densities, first-order density matrices, and natural spin-orbitals and solution of the v-representability problem. Proc. Natl. Acad. Sci. 76, 6062–6065, 1979
Lieb, E.H.: Density functional for Coulomb systems. Int. J. Quantum Chem. 24, 243, 1983
Lin, L.; Lindsey, M.: Variational structure of Luttinger–Ward formalism and bold diagrammatic expansion for Euclidean lattice field theory. Proc. Natl. Acad. Sci. 115, 2282, 2018
Luttinger, J.M.; Ward, J.C.: Ground-state energy of a many-fermion system. II. Phys. Rev. 118, 1417, 1960
Martin, R.M.; Reining, L.; Ceperley, D.M.: Interacting Electrons. Cambridge University Press, Cambridge, 2016
Mermin, N.D.: Thermal properties of the inhomogeneous electron gas. Phys. Rev. 137, A1441, 1965
Negele, J.W.; Orland, H.: Quantum Many-Particle Systems. Westview, Boulder, 1988
Rassoul-Agha, F.; Seppäläinen, T.: A Course on Large Deviations with an Introduction to Gibbs Measures. American Mathematical Society, Providence, 2015
Rentrop, J.F.; Meden, V.; Jakobs, S.G.: Renormalization group flow of the Luttinger–Ward functional: conserving approximations and application to the Anderson impurity model. Phys. Rev. B 93, 195160, 2016
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton, 1970
Rockafellar, R.T.; Wets, R.J.-B.: Variational Analysis. Springer, Berlin, 2009
Sharma, S.; Dewhurst, J.K.; Lathiotakis, N.N.; Gross, E.K.U.: Reduced density matrix functional for many-electron systems. Phys. Rev. B 78, 201103, 2008
Tarantino, W.; Romaniello, P.; Berger, J.A.; Reining, L.: Self-consistent Dyson equation and self-energy functionals: an analysis and illustration on the example of the Hubbard atom. Phys. Rev. B 96, 045124, 2017
Zinn-Justin, J.: Quantum Field Theory and Critical Phenomena. Clarendon Press, Oxford, 2002


Acknowledgements

This work was partially supported by the Department of Energy under Grant DE-AC02-05CH11231 (L.L., M.L.), by the Department of Energy under Grant No. DE-SC0017867 and by the Air Force Office of Scientific Research under award number FA9550-18-1-0095 (L.L.), and by the NSF Graduate Research Fellowship Program under Grant DGE-1106400 (M.L.). We thank Fabien Bruneval, Garnet Chan, Alexandre Chorin, Lek-Heng Lim, Nicolai Reshetikhin, Chao Yang and Lexing Ying for helpful discussions.

Author information

Authors and Affiliations

Department of Mathematics, University of California, Berkeley, Berkeley, CA, 94720, USA
Lin Lin
Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Lin Lin
Courant Institute of Mathematical Sciences, New York University, New York, NY, 10012, USA
Michael Lindsey

Authors

Lin Lin
Michael Lindsey

Corresponding author

Correspondence to Michael Lindsey.

Additional information

Communicated by G. Friesecke.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Definitions and Results from Convex Analysis

In this section we review some definitions and results from convex analysis. In this paper many results are stated for concave functions, that is, functions f such that $-f$ is convex. The standard results of convex analysis can always be applied by considering negations. We state results below for convex functions to maintain consistency with the literature. Many results are stated in somewhat more generality than is needed for the purposes of this paper (for example, we do not simply conflate proper and non-proper convex functions). This is done to make sure that the reader can refer to the cited references. The discussion follows developments from Rockafellar [25] and Rockafellar and Wets [26].

1.1 Convex Sets and Functions

We begin with the definition of convex sets and functions.

Definition A.1

A set $C\subset \mathbb {R}^{n}$ is convex if $(1-t)x+ty\in C$ for every $x,y\in C$ and all $t\in [0,1]$.

Definition A.2

An extended real-valued function f on a convex set C, that is, a function $f:C \rightarrow [-\infty ,\infty ] = \mathbb {R}\cup \{ -\infty ,+\infty \}$, is convex if

$$\begin{aligned} f\left( (1-t)x+ty\right) \leqq (1-t)f(x)+tf(y) \end{aligned}$$

for all $x,y\in C$ and all $t\in (0,1)$, where we interpret $\infty - \infty = +\infty $ if necessary. We say that f is strictly convex on the convex set C if this inequality holds strictly whenever $x \ne y$.

Definition A.3

The (effective) domain of a convex function f on S, denoted $\mathrm {dom}\, f$, is the set $ \mathrm {dom}\,f=\{x\in S\,:\, f(x)<+\infty \}$.

The following is an immediate consequence of the preceding definitions:

Lemma A.4

Let f be convex on $S\subset \mathbb {R}^{n}$. Then $\mathrm {dom}\, f$ is convex.

We note that when $f\in C^{2}(C)$, our definition of convexity coincides with the definition from multivariate calculus:

Theorem A.5

Let $f\in C^{2}(C)$, where $C\subset \mathbb {R}^{n}$ is open and convex. Then f is convex on C if and only if the Hessian matrix $\nabla ^{2}f(x)$ is positive semi-definite for all $x\in C$.

Proof

See Theorem 4.5 of Rockafellar [25].

Notice that for f convex on a convex set $C\subset \mathbb {R}^{n}$, we can extend to $\tilde{f}$ defined on $\mathbb {R}^{n}$ by taking $\tilde{f}\vert _{\mathbb {R}^{n}\backslash C}\equiv +\infty $. It is immediate that $\tilde{f}$ is convex on $\mathbb {R}^{n}$. Thus one loses no generality by considering only functions that are convex on $\mathbb {R}^{n}$.

The following definitions are helpful for ruling out pathologies:

Definition A.6

A convex function f is called proper if $\mathrm {dom}\, f\ne \emptyset $ and $f(x)>-\infty $ for all x.

We will only ever need to consider proper convex functions.

Definition A.7

If f is a proper convex function, then f is called closed if it is also lower semi-continuous. (If f is a non-proper convex function, then f is called closed if it is either $f\equiv +\infty $ or $f\equiv -\infty $.)

Remark A.8

For the fact that this can be taken as the definition, see Theorem 7.1 of [25].

The convexity of a function guarantees its continuity in a certain sense:

Theorem A.9

A convex function f on $\mathbb {R}^{n}$ is continuous relative to any relatively open convex set in $\mathrm {dom}\, f$. In particular, f is continuous on $\mathrm {int\, dom}\, f$. In fact, it holds that a proper convex function f is locally Lipschitz on $\mathrm {int\, dom}\, f$.

Proof

See Theorems 10.1 and 10.4 of Rockafellar [25].

1.2 First-order Properties of Convex Functions

There is an extension of the notion of differentiability that is fundamental to the analysis of convex functions.

Definition A.10

Let f be a convex function on $\mathbb {R}^{n}$. $y\in \mathbb {R}^{n}$ is called a subgradient of f at $x\in \mathrm {dom}\,f$ if $f(z)\geqq f(x)+\left\langle y,z-x\right\rangle $ for all $z\in \mathbb {R}^{n}$. The subdifferential of f at $x\in \mathrm {dom}\,f$, denoted $\partial f(x)$, is the set of all subgradients of f at x. By convention $\partial f (x) = \emptyset $ for $x \notin \mathrm {dom}\,f$.

Theorem A.11

Let f be a proper convex function. $\partial f(x)$ is a non-empty bounded set if and only if $x\in \mathrm {int\, dom}\, f$.

Proof

See Theorem 23.4 of Rockafellar [25].

It is perhaps no surprise that the derivative and the subdifferential of a convex function coincide wherever it is differentiable.

Theorem A.12

Let f be a convex function, and let $x\in \mathbb {R}^{n}$ such that f(x) is finite. If f is differentiable at x, then $\nabla f(x)$ is the unique subgradient of f at x, where $\nabla $ is the gradient defined with respect to the inner product used to define the subgradient. Conversely, if f has a unique subgradient at x, then f is differentiable at x.

Proof

See Theorem 25.1 of Rockafellar [25].

1.3 The Convex Conjugate

A fundamental notion of convex analysis is convex conjugation, which extends the older notion of Legendre transformation.

Definition A.13

Let f be a function $\mathbb {R}^{n}\rightarrow [-\infty ,+\infty ]$. Then the convex conjugate (or, Legendre-Fenchel transform) $f^{*}:\mathbb {R}^{n}\rightarrow [-\infty ,+\infty ]$ with respect to an inner product $\langle \, \cdot \,,\,\cdot \,\rangle $ on $\mathbb {R}^n$ is defined by

$$\begin{aligned} f^{*}(y)=\sup _{x}\left\{ \left\langle x,y\right\rangle -f(x)\right\} =-\inf _{x}\left\{ f(x)-\left\langle x,y\right\rangle \right\} . \end{aligned}$$

Theorem A.14

Let f be a convex function. Then $f^{*}$ is a closed convex function, proper if and only if f is proper. Furthermore, if f is closed, then $f^{**}=f.$

Proof

See Theorem 12.2 of Rockafellar [25].
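As a concrete illustration of Definition A.13 and of the identity $f^{**}=f$, the following minimal sketch computes a discrete Legendre-Fenchel transform on a grid. The grid, its truncation to $[-5,5]$, and the helper name `conjugate_on_grid` are assumptions of this example; the test function $f(x)=x^2/2$ is chosen because it is its own conjugate.

```python
def conjugate_on_grid(f_values, grid):
    """Discrete Legendre-Fenchel transform: f*(y) = max over grid points x of (x*y - f(x))."""
    return [max(x * y - fx for x, fx in zip(grid, f_values)) for y in grid]

# Uniform grid on [-5, 5]. The truncation is an assumption of this sketch; the
# result is only meaningful for slopes attained inside the grid.
grid = [-5.0 + 0.01 * i for i in range(1001)]
f = [0.5 * x * x for x in grid]                 # f(x) = x^2 / 2
f_star = conjugate_on_grid(f, grid)             # approximates f*(y) = y^2 / 2
f_star_star = conjugate_on_grid(f_star, grid)   # approximates f**(x) = f(x)

i = grid.index(min(grid, key=lambda x: abs(x - 1.5)))  # grid point nearest 1.5
print(grid[i], f[i], f_star[i], f_star_star[i])
```

The three printed function values agree up to rounding, reflecting that a closed convex function is recovered by conjugating twice.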

It is an important fact that the subgradients of f and $f^{*}$ are, in a sense, inverse mappings.

Theorem A.15

If f is a closed proper convex function, then $x\in \partial f^{*}(y)$ if and only if $y\in \partial f(x)$.

Proof

See Corollary 23.5.1 of Rockafellar [25].

Roughly speaking, differentiability of a convex function corresponds to the strict convexity of its conjugate. Indeed:

Theorem A.16

If f is a closed proper convex function, then the following are equivalent:

1.
$\mathrm {int}\,\mathrm {dom}\,f$ is nonempty, f is differentiable on $\mathrm {int}\,\mathrm {dom}\,f$, and $\partial f (x) = \emptyset $ for all $x\in \mathrm {dom}\,f \,\backslash \, \mathrm {int}\,\mathrm {dom}\,f$.
2.
$f^*$ is strictly convex on all convex subsets of $\mathrm {dom}\,\partial f^* := \{ y \,:\, \partial f^* (y) \ne \emptyset \}$.

Proof

See Theorem 11.13 of [26].

Note that for proper convex f, if $\mathrm {dom}\,f^*$ is open, then $\mathrm {dom}\,\partial f^* = \mathrm {dom}\,f^*$ by Theorem A.11, and under the additional assumption that $\mathrm {dom}\,f$ is open, Theorem A.16 simplifies to the following:

Theorem A.17

Let f be a lower semi-continuous, proper convex function, and suppose that $\mathrm {dom}\,f$ and $\mathrm {dom}\,f^*$ are open. Then the following are equivalent:

1.
f is differentiable on $\mathrm {dom}\,f$.
2.
$f^*$ is strictly convex on $\mathrm {dom}\,f^*$.

1.4 Sequences of Convex Functions

Pointwise convergence of convex functions entails a kind of convergence of their subgradients.

Theorem A.18

Let f be a convex function on $\mathbb {R}^{n}$, and let C be an open convex set on which f is finite. Let $f_{1},f_{2},\ldots $ be a sequence of convex functions finite on C and converging pointwise to f on C. Let $x\in C$, and let $x_{1},x_{2},\ldots $ be a sequence of points in C converging to x. Then for any $\varepsilon >0$, there exists N such that

$$\begin{aligned} \partial f_{i}(x_{i})\subset \partial f(x)+B_{\varepsilon }(0) \end{aligned}$$

for all $i\geqq N$.

Proof

See Theorem 24.5 of Rockafellar [25].

Besides pointwise convergence, there is in fact another natural notion of convergence for convex functions. This is the notion of epi-convergence, which is defined (even for non-convex functions) as follows:

Definition A.19

Let $f_i, f$ be extended-real-valued functions on $\mathbb {R}^n$. Then we say that the sequence $\{f_i\}$ epi-converges to f, written as $f = \mathrm {e}\lim _{i\rightarrow \infty } f_i$ or $f_i \overset{\mathrm {e}}{\rightarrow } f$ as $i\rightarrow \infty $, if for all $x \in \mathbb {R}^n$, the following two conditions are satisfied:

$$\begin{aligned} \begin{aligned} \liminf _i f_i (x_i) \geqq f(x) \quad \text{ for } \text{ every } \text{ sequence } x_i \rightarrow x \\ \limsup _i f_i( x_i) \leqq f(x) \quad \text{ for } \text{ some } \text{ sequence } x_i \rightarrow x. \end{aligned} \end{aligned}$$

We say that the sequence $\{f_i\}$ hypo-converges to f, written as $f = \mathrm {h}\lim _{i\rightarrow \infty } f_i$ or $f_i \overset{\mathrm {h}}{\rightarrow } f$ as $i\rightarrow \infty $, if $\{-f_i\}$ epi-converges to $-f$.

The notion of epi-convergence is particularly natural in the theory of convex functions; accordingly hypo-convergence is more relevant to concave functions. Note also that epi-convergence is neither stronger nor weaker than pointwise convergence. However, there is a useful theorem that relates the pointwise convergence and epi-convergence of convex functions.

Theorem A.20

Let $f_{i}$ be a sequence of convex functions on $\mathbb {R}^{n}$, and let f be a lower semi-continuous convex function on $\mathbb {R}^{n}$ such that $\mathrm {dom}\, f$ has non-empty interior. Then $f=\mathrm {e}\lim _{i\rightarrow \infty }f_{i}$ if and only if the $f_{i}$ converge uniformly to f on every compact set C that does not contain a boundary point of $\mathrm {dom}\, f$.

Proof

See Theorem 7.17 of Rockafellar and Wets [26].

Under certain mild conditions, the epi-convergence of a sequence of convex functions is equivalent to the epi-convergence of the corresponding sequence of conjugate functions. Indeed, the following theorem is a natural motivation for considering epi-convergence as opposed to pointwise convergence.

Theorem A.21

Let $f_{i}$ and f be lower semi-continuous, proper convex functions on $\mathbb {R}^{n}$. Then the $f_{i}$ epi-converge to f if and only if the $f_{i}^{*}$ epi-converge to $f^{*}$.

Proof

See Theorem 11.34 of Rockafellar and Wets [26].

Finally, under certain circumstances one can upgrade mere pointwise convergence of convex functions to uniform convergence on compact subsets:

Theorem A.22

Let $f_{i}$ and f be finite convex functions on an open convex set $O \subset \mathbb {R}^n$, and suppose that $f_i \rightarrow f$ pointwise on O. Then $f_i$ converges uniformly to f on every compact subset of O.

Proof

See Corollary 7.18 of Rockafellar and Wets [26].

Classical Results on Weak Convergence of Probability Measures

For completeness we recall here several classical results on the weak convergence of measures. For reference, see, for example, Billingsley [6].

Let S be a metric space, and let $\mathcal {P}(S)$ denote the set of probability measures on S (equipped with the Borel $\sigma $-algebra). We say that a sequence $\mu _k \in \mathcal {P}(S)$ converges weakly to $\mu \in \mathcal {P}(S)$, denoted $\mu _k \Rightarrow \mu $, if $\int f \, \,\mathrm {d}\mu _k \rightarrow \int f\,\,\mathrm {d}\mu $ as $k\rightarrow \infty $ for all bounded, continuous functions $f : S \rightarrow \mathbb {R}$. A number of equivalent characterizations of weak convergence are given in the following result, often known as the Portmanteau theorem:

Theorem A.1

(Portmanteau) Let S be a metric space, and let $\mu _k, \mu \in \mathcal {P}(S)$. The following are all equivalent conditions for the weak convergence $\mu _k \Rightarrow \mu $:

1.
$\lim _{k\rightarrow \infty } \int f \, \,\mathrm {d}\mu _k = \int f\,\,\mathrm {d}\mu $ for all bounded, continuous functions $f : S \rightarrow \mathbb {R}$.
2.
$\liminf _{k\rightarrow \infty } \int f \, \,\mathrm {d}\mu _k \geqq \int f\,\,\mathrm {d}\mu $ for all lower semi-continuous functions $f : S \rightarrow \mathbb {R}$ bounded from below.
3.
$\liminf _{k\rightarrow \infty } \mu _k(U) \geqq \mu ( U)$ for all open sets $U \subset S$.

Remark A.2

There are several other equivalent conditions often included in the statement of this result.
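
A standard example (added here as a sketch, not part of the source) shows that the inequality in condition 3 can be strict, so that $\mu _k(U) \rightarrow \mu (U)$ may fail for open sets even under weak convergence. Take $S = \mathbb {R}$, $\mu _k = \delta _{1/k}$ and $\mu = \delta _0$:

```latex
\int f \,\mathrm{d}\mu_k = f(1/k) \longrightarrow f(0) = \int f \,\mathrm{d}\mu
\quad \text{for all bounded continuous } f,
\qquad
U = (0,1): \quad \mu_k(U) = 1 \ (k \ge 2), \quad \mu(U) = 0,
\qquad
\liminf_{k\to\infty} \mu_k(U) = 1 > 0 = \mu(U).
```

The first relation is condition 1, so $\mu _k \Rightarrow \mu $, while the last line is condition 3 holding with strict inequality.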

A condition for extracting a weakly convergent subsequence, as guaranteed by Prokhorov’s theorem below, is given by the following notion of tightness:

Definition A.3

Let S be a metric space equipped with the Borel $\sigma $-algebra. A set $\mathcal {C}$ of measures on S is called tight if for any $\varepsilon > 0$, there exists a compact subset $K \subset S$ such that $\mu (K) > 1-\varepsilon $ for all $\mu \in \mathcal {C}$. A sequence of measures is called tight if the set of terms in the sequence is tight.

Theorem A.4

(Prokhorov) Let S be a metric space equipped with the Borel $\sigma $-algebra. Then any tight sequence in $\mathcal {P}(S)$ admits a weakly convergent subsequence.

Proof of Lemmas

1.1 Lemma 2.8

Proof

Suppose $\mu \ll \lambda $ is in $\mathcal {M}_2$ and write $\,\mathrm {d}\mu = \rho \,\,\mathrm {d}x$ where $\rho $ is the probability density. Since $\mu \ll \lambda $, $\mathrm {Cov}(\mu )$ must be positive definite. Let $\mu _G$ be the Gaussian measure with the same mean and covariance as $\mu $, and let $\rho _G$ be the corresponding probability density. Then one can compute that

$$\begin{aligned} \int \rho \log \rho _G \,\,\mathrm {d}x = -\frac{1}{2} \log \left( (2\pi e)^N \det \mathrm {Cov}(\mu ) \right) \end{aligned}$$

(and in particular this integral is absolutely convergent). Now

$$\begin{aligned} \rho \log \rho = \rho \log \rho _G + \rho \log \frac{\rho }{\rho _G}. \end{aligned}$$

The first term on the right-hand side of this equation is absolutely integrable, and the integral of the second term exists (in particular, the integral of the negative part of the second term is finite, and the value of the full integral is in fact $-H_{\mu _G}(\mu )$). Therefore the integral $\int \rho \log \rho \,\,\mathrm {d}x \in (-\infty ,\infty ]$ exists. Moreover,

$$\begin{aligned} H(\mu )= & {} -\int \rho \log \rho \,\,\mathrm {d}x = \frac{1}{2} \log \left( (2\pi e)^N \det \mathrm {Cov}(\mu ) \right) \\&+ H_{\mu _G}(\mu ) \leqq \frac{1}{2} \log \left( (2\pi e)^N \det \mathrm {Cov}(\mu ) \right) \end{aligned}$$

with equality if and only if $\mu _G = \mu $.

To prove the second inequality in the statement of the lemma, define $\overline{\mu }:=\int x\, \,\mathrm {d}\mu $ to be the mean of $\mu $. Then $\mathrm {Cov}(\mu ) = \mathcal {G}(\mu ) - \overline{\mu }\,\overline{\mu }^T$, so in particular $\det \mathrm {Cov}(\mu ) \leqq \det \mathcal {G}(\mu )$, with equality if and only if $\overline{\mu } = 0$.
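
The two inequalities can be sanity-checked in one dimension with closed-form entropies. The following is a minimal sketch, not part of the source: the function names are hypothetical, and the uniform and Laplace laws are chosen only because their entropies and variances are known explicitly.

```python
import math


def gaussian_entropy_bound(second_moment: float) -> float:
    """1D bound (1/2) * log(2*pi*e*s), where s is a variance or a second moment."""
    return 0.5 * math.log(2.0 * math.pi * math.e * second_moment)


def uniform_entropy_and_variance(a: float):
    """Differential entropy and variance of the uniform law on [-a, a]."""
    return math.log(2.0 * a), a * a / 3.0


def laplace_entropy_and_variance(b: float):
    """Differential entropy and variance of the Laplace law with scale b."""
    return 1.0 + math.log(2.0 * b), 2.0 * b * b


if __name__ == "__main__":
    for name, (h, var) in {
        "uniform": uniform_entropy_and_variance(1.0),
        "laplace": laplace_entropy_and_variance(1.0),
    }.items():
        # The entropy should not exceed the Gaussian value at equal variance.
        print(name, "entropy:", h, "Gaussian bound:", gaussian_entropy_bound(var))

    # Shifting the mean by m keeps the variance but raises the second moment
    # to var + m**2, so the bound in terms of the second moment is weaker.
    var, m = 1.0, 0.5
    print(gaussian_entropy_bound(var), "<=", gaussian_entropy_bound(var + m * m))
```

In one dimension $\det \mathrm {Cov}(\mu )$ is the variance and $\det \mathcal {G}(\mu )$ is the second moment, so the last comparison mirrors the step $\det \mathrm {Cov}(\mu ) \leqq \det \mathcal {G}(\mu )$.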

1.2 Lemma 2.9

Proof

Without loss of generality we can assume that $\mu _j = \rho _j \,\,\mathrm {d}x$ for all j.

First, by the Portmanteau theorem for weak convergence of measures (Theorem A.1) we have, for any $z\in {\mathbb {R}^{N}}$, that

$$\begin{aligned} z^{T}\mathcal {G}(\mu ) z = \int (z^T x)^2 \,\,\mathrm {d}\mu \leqq & {} \liminf _{j\rightarrow \infty } \int (z^T x)^2 \,\,\mathrm {d}\mu _j \\= & {} \liminf _{j\rightarrow \infty } \int z^T xx^T z \,\,\mathrm {d}\mu _j = \liminf _{j\rightarrow \infty } z^T \mathcal {G}(\mu _j) z \leqq C \Vert z\Vert ^2. \end{aligned}$$

It follows that $\mu \in \mathcal {M}_2$ (and moreover $\mathcal {G}(\mu ) \preceq C\cdot I_N$).

Our goal is to put ourselves in a position to use the upper semi-continuity (note our sign convention) of the relative entropy with respect to the topology of weak convergence (see Fact 2.7). Let $\beta > 0$, and let $Z_\beta = \int e^{-\beta \Vert x\Vert ^2}\,\,\mathrm {d}x$. Let $\gamma _\beta $ be the Gaussian measure with density proportional to $e^{-\beta \Vert x\Vert ^2}$. Then

$$\begin{aligned} H(\mu _j)= & {} -\int \rho _j \log \rho _j \,\,\mathrm {d}x \\= & {} \log (Z_\beta ) - \int \rho _j(x) \log \frac{\rho _j(x)}{\frac{1}{Z_\beta } e^{-\beta \Vert x\Vert ^2 }} \,\,\mathrm {d}x + \beta \int \rho _j(x) \Vert x\Vert ^2 \,\,\mathrm {d}x \\= & {} \log (Z_\beta ) + H_{\gamma _\beta } (\mu _j) + \beta \mathrm {Tr}[\mathcal {G}(\mu _j)]. \end{aligned}$$

Then by the upper semi-continuity of the relative entropy with respect to the topology of weak convergence, we have

$$\begin{aligned} \limsup _{j\rightarrow \infty } H(\mu _j) \leqq \log (Z_\beta ) + H_{\gamma _\beta } (\mu ) + \beta C N = H(\mu ) + \beta \left( C N - \mathrm {Tr}[\mathcal {G}(\mu )] \right) . \end{aligned}$$

Since this inequality holds for any $\beta >0$, the lemma follows.

1.3 Fact 2.11

Proof

We can assume that $\mu $ is absolutely continuous with respect to the Lebesgue measure, that is, has a density $\rho $ (otherwise $H(\mu ) = -\infty $ and the inequality is trivial). It follows that $\mu _i := \pi _i \# \mu $ are absolutely continuous with respect to the Lebesgue measure, that is, have densities $\rho _i$, for $i=1,2$. Let $x=(x_1,x_2)$ denote the splitting of $x\in \mathbb {R}^N$ according to the product structure ${\mathbb {R}^{N}}= \mathbb {R}^p \times \mathbb {R}^{N-p}$. Then using the fact that $\mu _1 \times \mu _2$ has density $\rho _1(x_1) \rho _2(x_2)$, one directly computes that

$$\begin{aligned} {\begin{matrix} &{} H(\mu _1) + H(\mu _2) + H_{\mu _1 \times \mu _2} (\mu ) \\ &{} \ \ = -\int \rho _1(x_1) \log \rho _1(x_1) \,\mathrm {d}x_1 - \int \rho _2(x_2) \log \rho _2(x_2) \,\mathrm {d}x_2 - \int \rho (x) \log \frac{\rho (x)}{\rho _1(x_1) \rho _2(x_2)} \,\mathrm {d}x \\ &{} \ \ = -\int \rho (x) \log \rho _1(x_1) \,\mathrm {d}x - \int \rho (x) \log \rho _2(x_2) \,\mathrm {d}x - \int \rho (x) \log \frac{\rho (x)}{\rho _1(x_1) \rho _2(x_2)} \,\mathrm {d}x \\ &{} \ \ = -\int \rho (x) \log \rho (x) \,\mathrm {d}x\\ &{} \ \ = H(\mu ), \end{matrix}} \end{aligned}$$

but by Fact 2.7 (with our sign convention), the relative entropy term is non-positive, which yields the desired inequality.
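
For a concrete instance, take a centered bivariate Gaussian, for which all three entropies are explicit. The sketch below is illustrative only (the helper names are hypothetical and the Gaussian family is an assumption made for the example); it evaluates $H(\mu ) - H(\mu _1) - H(\mu _2)$, which should be non-positive and vanish exactly when the two coordinates are independent.

```python
import math


def gaussian_entropy(cov_det: float, dim: int) -> float:
    """Entropy (1/2) * log((2*pi*e)**dim * det(Cov)) of a Gaussian measure."""
    return 0.5 * math.log((2.0 * math.pi * math.e) ** dim * cov_det)


def subadditivity_gap(s1: float, s2: float, r: float) -> float:
    """H(mu) - H(mu_1) - H(mu_2) for a centered bivariate Gaussian.

    s1, s2 are the marginal variances and r is the correlation coefficient,
    so the covariance determinant is s1 * s2 * (1 - r**2).
    """
    joint = gaussian_entropy(s1 * s2 * (1.0 - r * r), 2)
    return joint - gaussian_entropy(s1, 1) - gaussian_entropy(s2, 1)


if __name__ == "__main__":
    for r in (0.0, 0.3, 0.6, 0.9):
        # Analytically the gap equals (1/2) * log(1 - r**2) <= 0.
        print(r, subadditivity_gap(2.0, 0.5, r), 0.5 * math.log(1.0 - r * r))
```

The gap is the relative entropy term $H_{\mu _1 \times \mu _2}(\mu )$ in the sign convention of this paper, here equal to $\frac{1}{2}\log (1-r^2)$.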

1.4 Lemma 3.2

Proof

Upper semi-continuity follows directly from Fatou’s lemma. $\Omega $ is proper because its domain is nonempty and evidently $\Omega $ does not attain the value $+\infty $.

Now let $\theta \in [0,1]$ and $A_{1},A_{2}\in \mathrm {dom}\,\Omega $. Then

$$\begin{aligned} \Omega [\theta A_{1}+(1-\theta )A_{2}]= & {} -\log \int _{{\mathbb {R}^{N}}}\left( e^{-\frac{1}{2}x^{T}A_{1}x-U(x)}\right) ^{\theta }\left( e^{-\frac{1}{2}x^{T}A_{2}x-U(x)}\right) ^{1-\theta }\,\,\mathrm {d}x\\\geqq & {} -\log \left[ \left( \int _{{\mathbb {R}^{N}}}e^{-\frac{1}{2}x^{T}A_{1}x-U(x)}\,\,\mathrm {d}x\right) ^{\theta }\left( \int _{{\mathbb {R}^{N}}}e^{-\frac{1}{2}x^{T}A_{2}x-U(x)}\,\,\mathrm {d}x\right) ^{1-\theta }\right] \\= & {} \theta \Omega [A_{1}]+(1-\theta )\Omega [A_{2}], \end{aligned}$$

where we have used Hölder’s inequality in the second step. This establishes concavity. Strict concavity on $\mathrm {dom}\,\Omega $ follows from the following fact: Hölder’s inequality holds with equality in this scenario if and only if $e^{-\frac{1}{2}x^{T}A_{1}x-U(x)}=e^{-\frac{1}{2}x^{T}A_{2}x-U(x)}$ for all x, that is, if and only if $A_{1}=A_{2}$.

Lastly, observe that since $\mathrm {dom}\,\Omega $ is an open set, for any $A\in \mathrm {dom}\,\Omega $,

$$\begin{aligned} \int _{\mathbb {R}^{N}}e^{\delta x^2} e^{-\frac{1}{2}x^{T}Ax-U(x)}\,\,\mathrm {d}x < +\infty \end{aligned}$$

for some $\delta > 0$. Now, for any polynomial P, there exists a constant C such that for all $A'$ in a sufficiently small neighborhood of A,

$$\begin{aligned} P(x) e^{-\frac{1}{2}x^{T}A'x-U(x)} \leqq C e^{\delta x^2} e^{-\frac{1}{2}x^{T}Ax-U(x)}. \end{aligned}$$

Since derivatives of all orders of the integrand in (2.2) are of the form

$$\begin{aligned} P(x) e^{-\frac{1}{2}x^{T}Ax-U(x)}, \end{aligned}$$

differentiation under the integral is justified, and the smoothness result follows.
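
The concavity inequality can be observed numerically in the scalar case $N=1$. The following sketch is not from the source: it assumes the quartic interaction $U(x) = x^4$, uses a plain midpoint rule on a truncated interval, and all function names are hypothetical.

```python
import math


def quad(func, lo=-10.0, hi=10.0, n=4000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def omega(a: float) -> float:
    """Omega[a] = -log of the integral of exp(-a*x^2/2 - U(x)), with U(x) = x^4 (assumed)."""
    return -math.log(quad(lambda x: math.exp(-0.5 * a * x * x - x ** 4)))


def concavity_gap(a1: float, a2: float, theta: float) -> float:
    """Omega at the convex combination minus the convex combination of Omega."""
    mixed = omega(theta * a1 + (1.0 - theta) * a2)
    return mixed - (theta * omega(a1) + (1.0 - theta) * omega(a2))


if __name__ == "__main__":
    # Negative values of a are allowed here because x^4 dominates at infinity.
    for theta in (0.1, 0.3, 0.5, 0.9):
        print(theta, concavity_gap(-2.0, 3.0, theta))
```

A strictly positive gap for $A_1 \ne A_2$ and $\theta \in (0,1)$ is what the strict form of Hölder's inequality predicts; the quadrature error of this crude rule is far smaller than the gaps printed for these parameters.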

1.5 Lemma 3.4

Proof

First assume $A \in \mathrm {dom}\,\Omega $, so $Z[A]<+\infty $. Let $\mu \in \mathcal {M}_2$ and define $f(x):=\frac{1}{2}x^{T}Ax+U(x)$. For any f such that $e^{-f}$ is integrable, define $\nu _f$ to be the probability measure with density proportional to $e^{-f}$. Then, provided that $\mu \ll \lambda $,

$$\begin{aligned} {\begin{matrix} \int f \,\,\mathrm {d}\mu -H(\mu ) &{} = \Omega [A] -\int \log \left( \frac{1}{Z[A]} e^{-f} \right) \,\,\mathrm {d}\mu -H(\mu ) \\ &{} = \Omega [A] + \int \log \left( \frac{\,\mathrm {d}\mu }{\,\mathrm {d}\lambda }\right) -\log \left( \frac{\,\mathrm {d}\nu _{f}}{\,\mathrm {d}\lambda } \right) \,\,\mathrm {d}\mu \\ &{} = \Omega [A] + \int \log \frac{\,\mathrm {d}\mu }{\,\mathrm {d}\nu _f} \,\,\mathrm {d}\mu \\ &{} = \Omega [A] - H_{\nu _f}(\mu ) \geqq \Omega [A]. \end{matrix}} \end{aligned}$$

(C.1)

Since $\mu \in \mathcal {M}_2$, we have $H(\mu ) < +\infty $ as discussed in Remark 3.5. Careful observation reveals that these manipulations are valid in the sense of the extended real numbers even when $\int f\,\,\mathrm {d}\mu = +\infty $. Moreover, $\mu \not \ll \lambda $ if and only if $\mu \not \ll \nu _f$, in which case both sides of (C.1) are $+\infty $. Therefore (C.1) holds for all $\mu \in \mathcal {M}_2$.

For $A\in \mathrm {dom}\,\Omega $, (C.1) establishes the ‘$\leqq $’ direction of (3.3). For $A\notin \mathrm {dom}\,\Omega $, $\Omega [A]=-\infty $, so this direction is immediate.

Next suppose that $A \in \mathrm {dom}\,\Omega $. Since $\mathrm {dom}\,\Omega $ is open, it follows that $\nu _f \in \mathcal {M}_2$. From (C.1) and the inequality $-H_{\nu _f}(\mu )\geqq 0$ (which holds with equality if and only if $\mu = \nu _f$), it follows that (3.3) holds. Moreover, the infimum in (3.3) is uniquely attained at $\mu = \nu _f$, that is, at $\mathrm{{d}}\mu (x)=\frac{1}{Z[A]}e^{-\frac{1}{2}x^{T}Ax-U(x)}\,\,\mathrm {d}x$.
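
In the simplest case $N=1$, $U \equiv 0$, the variational characterization can be checked by hand over Gaussian trial measures. The sketch below is only an illustration under these assumptions (the function names are hypothetical): for $\mu = \mathcal {N}(0,s)$ and $f(x) = \frac{1}{2}ax^2$ it evaluates $\int f\,\mathrm {d}\mu - H(\mu )$ and compares it with $\Omega [a] = -\log \sqrt{2\pi /a}$.

```python
import math


def omega_gaussian(a: float) -> float:
    """Omega[a] = -log Z[a] for U = 0 in one dimension, where Z[a] = sqrt(2*pi/a)."""
    return -0.5 * math.log(2.0 * math.pi / a)


def trial_free_energy(a: float, s: float) -> float:
    """Integral of f against mu, minus H(mu), for mu = N(0, s) and f(x) = a*x^2/2."""
    energy = 0.5 * a * s
    entropy = 0.5 * math.log(2.0 * math.pi * math.e * s)
    return energy - entropy


if __name__ == "__main__":
    a = 2.0
    for s in (0.1, 0.25, 0.5, 1.0, 4.0):
        # The excess equals (a*s - 1 - log(a*s)) / 2, which is >= 0 and zero at s = 1/a.
        print(s, trial_free_energy(a, s) - omega_gaussian(a))
```

The excess over $\Omega [a]$ is the quantity $-H_{\nu _f}(\mu )$ appearing in (C.1), restricted to Gaussian $\mu $; it vanishes only at $s = 1/a$, where $\mu = \nu _f$.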

1.6 Lemma 3.7

Proof

By definition $\mathcal {F}[G]=-\infty $ whenever $G\in \mathcal {S}^{N}\backslash \mathcal {S}_{+}^{N}$. Now we show that also $\mathcal {F}[G]=-\infty $ for G on the boundary $\partial \mathcal {S}_{+}^{N}$. This follows from the fact that for such G, any $\mu \in \mathcal {G}^{-1}(G)$ is supported on a subspace of ${\mathbb {R}^{N}}$ of positive codimension, that is, not absolutely continuous with respect to the Lebesgue measure, and therefore $H(\mu )=-\infty $. Moreover, since such $\mu $ is in $\mathcal {M}_2$, we have (via the weak growth condition) that $\int U\,\,\mathrm {d}\mu \in (-\infty ,\infty ]$, so the expression within the supremum of (3.2) is $-\infty $ for all $\mu \in \mathcal {G}^{-1}(G)$.

Meanwhile, for $G\in S_{++}^{N}$, one can see that $\mathcal {F}[G]>-\infty $ by considering $\mu $ to be mean-zero with a compactly supported smooth density, linearly transformed to have the appropriate covariance G. For such $\mu $, both terms in the supremum are finite.

Moreover, for $G\in S_{++}^{N}$ we also have that $\mathcal {F}[G] < +\infty $. Indeed, for $\mu \in \mathcal {G}^{-1}(G)$, by Lemma 2.8 we have $H(\mu ) \leqq \frac{1}{2}\log \left[ (2\pi e)^N \det G \right] $. Since $\int U\,\,\mathrm {d}\mu \geqq -C_U (1+\mathrm {Tr}\,G)$, we have a finite upper bound on the expression in the supremum in (3.2), which finishes the proof.

1.7 Lemma 3.8

Proof

Let $G_{1},G_{2}\in \mathcal {S}^N_{++}$, $\theta \in [0,1]$, and $\varepsilon >0$. Furthermore let $\mu _{1},\mu _{2}\in \mathcal {M}_2$ such that $\mu _{i}\in \mathcal {G}^{-1}(G_{i})$ and $\Psi [\mu _{i}]\geqq \mathcal {F}[G_{i}]-\varepsilon /2$. Then, noting that $\theta \mu _{1}+(1-\theta )\mu _{2}\in \mathcal {G}^{-1}\left( \theta G_{1}+(1-\theta )G_{2}\right) $, we observe

$$\begin{aligned} \mathcal {F}[\theta G_{1}+(1-\theta )G_{2}]= & {} \sup _{\mu \in \mathcal {G}^{-1}\left( \theta G_{1}+(1-\theta )G_{2}\right) }\Psi [\mu ]\\\geqq & {} \Psi \left[ \theta \mu _{1}+(1-\theta )\mu _{2}\right] \\\geqq & {} \theta \Psi [\mu _{1}]+(1-\theta )\Psi [\mu _{2}]\\\geqq & {} \theta \mathcal {F}[G_{1}]+(1-\theta )\mathcal {F}[G_{2}] - \varepsilon , \end{aligned}$$

where the penultimate step employs concavity of $\Psi $. Since $\varepsilon $ was arbitrary, we have established concavity.

The fact that $\mathcal {F}$ is proper follows from Lemma 3.7. Since $\mathcal {F}$ is concave, by Theorem A.9 it is continuous on $\mathrm {int}\,\mathrm {dom}\,\mathcal {F}$, which is in fact all of $\mathrm {dom}\,\mathcal {F}$ by the weak growth assumption. Thus we only need to check upper semi-continuity at points G outside of $\mathrm {dom}\,\mathcal {F}$. At $G \notin \overline{\mathrm {dom}\,\mathcal {F}} = \mathcal {S}^N_{+}$, upper semi-continuity is trivial because $\mathcal {F} \equiv -\infty $ on a neighborhood of G. Therefore let $G\in \partial \mathcal {S}^N_{++}$ and suppose that $G_k \in \mathcal {S}^N$ such that $G_k \rightarrow G$ as $k\rightarrow \infty $. We need to show that $\limsup _{k\rightarrow \infty } \mathcal {F}[G_k] = -\infty $. Throwing out all $G_k \notin \mathcal {S}^N_{++}$ from the sequence cannot increase the limit superior, so we can just assume that $G_k \in \mathcal {S}^N_{++}$ for all k. Since $G \in \partial \mathcal {S}^N_{++}$, we have $\det G = 0$, and therefore $\det G_k \rightarrow 0$. By the proof of Lemma 3.7 we have

$$\begin{aligned} \mathcal {F}[G_k] \leqq \frac{1}{2}\log \left[ (2\pi e)^N \det G_k \right] + C_U (1+\mathrm {Tr}\,G_k). \end{aligned}$$

Since the right-hand side of this inequality goes to $-\infty $ as $k\rightarrow \infty $, the proof is complete.

1.8 Lemma 3.9

Proof

Observe that (1) $\Omega $ and $\mathcal {F}$ are upper semi-continuous, proper concave functions (by Lemmas 3.2 and 3.8), (2) $\mathcal {F} = \Omega ^*$ and $\Omega = \mathcal {F}^*$, and (3) both $\mathrm {dom}\,\Omega $ and $\mathrm {dom}\,\mathcal {F} = \mathcal {S}^N_{++}$ are open. Then the strict concavity and differentiability of $\mathcal {F}$ on $\mathrm {dom}\,\mathcal {F} = \mathcal {S}^N_{++}$ follow directly from Theorem A.17.

Now we turn to proving $C^{\infty }$-smoothness. Though infinite-order differentiability is not typically discussed in convex analysis, it can be obtained from infinite-order differentiability and strict convexity of the convex conjugate via the implicit function theorem. Indeed, define the smooth function $h:\mathcal {S}^N_{++} \times \mathrm {dom}\,\Omega \rightarrow \mathcal {S}^N$ by

$$\begin{aligned} h(G,A)= \nabla \Omega [A] - G. \end{aligned}$$

Then $Dh=\left( \ -I_{\mathcal {S}^N}\ \big \vert \ \nabla ^2 \Omega \ \right) $, and since $\Omega $ is smooth and strictly concave, the right block is invertible for all A, G. Fix some $G'\in \mathcal {S}_{++}^N$, and let $A' = \nabla \mathcal {F}[G']\in \mathrm {dom}\,\Omega $, so $h(G',A') = 0$. Then the implicit function theorem gives the existence of a smooth function $\phi $ on a neighborhood $\mathcal {V}\subset \mathcal {S}^N_{++}$ of $G'$ such that $h(G,\phi (G))=0$ for all $G\in \mathcal {V}$. But this means precisely that $\phi = \nabla \mathcal {F}$, hence in particular $\nabla \mathcal {F}$ is smooth at $G'$.

1.9 Lemma 4.4

Proof

Write

$$\begin{aligned} Z[A,\varepsilon U] = \int e^{-\frac{1}{2} x^T A x - \varepsilon U(x)} \,\mathrm {d}x. \end{aligned}$$

We want to show that as $\varepsilon \rightarrow 0^+$, $Z[\,\cdot \,,\varepsilon U]$ epi-converges (see Definition A.19) to $Z[\,\cdot \,,0]$. If so, then $-\Omega [\,\cdot \,,\varepsilon U]$ epi-converges to $-\Omega [\,\cdot \,,0]$, and Theorems A.21 and A.20 yield in particular that $\mathcal {F}[\,\cdot \,,\varepsilon U] \rightarrow \mathcal {F}[\,\cdot \,,0]$ pointwise on $\mathcal {S}^N_{++}$ as $\varepsilon \rightarrow 0^+$. Then by Theorem A.18 we have the pointwise convergence of the gradients on $\mathcal {S}^N_{++}$, that is, $A[G,\varepsilon U] \rightarrow A[G,0] = G^{-1}$ as $\varepsilon \rightarrow 0^+$ for $G\in \mathcal {S}^N_{++}$.

Thus it remains to show that $Z[\,\cdot \,,\varepsilon U]$ epi-converges to $Z[\,\cdot \,,0]$. The first of the conditions in Definition A.19 follows immediately from Fatou’s lemma, so we need only show that for any $A \in \mathcal {S}^N$, there exists a sequence $A_\varepsilon \rightarrow A$ such that

$$\begin{aligned} \limsup _{\varepsilon \rightarrow 0^+} Z [A_\varepsilon , \varepsilon U ] \leqq Z [A, 0 ]. \end{aligned}$$

In particular, it suffices to show that

$$\begin{aligned} \limsup _{\varepsilon \rightarrow 0^+} Z [A, \varepsilon U ] \leqq Z [A, 0]. \end{aligned}$$

(C.2)

For $A \notin \mathcal {S}^N_{++}$, the righthand side is $+\infty $, so the inequality holds trivially.

Thus assume $A \in \mathcal {S}^N_{++}$. By the weak growth condition, we can write $U(x) = \widetilde{U}(x) - \lambda - \lambda \Vert x\Vert ^2$, where $\lambda >0$ and $\widetilde{U} \geqq 0$. Then

$$\begin{aligned} Z[A,\varepsilon U] = \int e^{\varepsilon \lambda } e^{-\frac{1}{2} x^T (A - 2\varepsilon \lambda ) x - \varepsilon \widetilde{U}(x)} \,\mathrm {d}x \leqq \int e^{\varepsilon \lambda } e^{-\frac{1}{2} x^T (A - 2\varepsilon \lambda ) x} \,\mathrm {d}x, \end{aligned}$$

and evidently the righthand side converges to Z[A, 0] by dominated convergence.
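
The convergence $Z[A,\varepsilon U] \rightarrow Z[A,0]$ can be observed numerically in one dimension. The sketch below is not from the source: it assumes the non-negative quartic interaction $U(x) = x^4$ (so that $Z[a,\varepsilon U] \leqq Z[a,0]$ holds trivially), a midpoint rule on a truncated interval, and hypothetical function names.

```python
import math


def quad(func, lo=-12.0, hi=12.0, n=4000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def partition(a: float, eps: float) -> float:
    """Z[a, eps*U] in one dimension with the assumed interaction U(x) = x^4."""
    return quad(lambda x: math.exp(-0.5 * a * x * x - eps * x ** 4))


if __name__ == "__main__":
    a = 1.0
    z_free = math.sqrt(2.0 * math.pi / a)  # Z[a, 0], the Gaussian integral
    for eps in (1e-1, 1e-2, 1e-3, 1e-4):
        # The values increase toward z_free as eps decreases, consistent with (C.2).
        print(eps, partition(a, eps), z_free)
```

For this choice of U the approach to the limit is monotone; for interactions that are only bounded below by $-\lambda - \lambda \Vert x\Vert ^2$ the Gaussian comparison in the proof replaces this monotonicity.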

1.10 Lemma 4.5

Proof

Let $G\in \mathcal {S}^N_{++}$. Recall Eq. (C.2) from the proof of Lemma 4.4. From this inequality, it follows that there exists $\tau > 0$ and an open neighborhood $\mathcal {N}$ of $G^{-1}$ in $\mathcal {S}^N_{++}$ such that $A \in \mathrm {dom}\,\Omega [\,\cdot \,,\varepsilon U]$ for all $(\varepsilon , A) \in (0,\tau ) \times \mathcal {N}$.

Now consider $\hat{\varepsilon } > 0$ sufficiently small so that $\hat{\varepsilon } < \tau $ and $\hat{A} := A_G(\hat{\varepsilon }) \in \mathcal {N}$ (possible by Lemma 4.4). Define the smooth function $h:(0,\tau ) \times \mathcal {N} \rightarrow \mathcal {S}^N$ by

$$\begin{aligned} h(\varepsilon ,A)= \nabla _A \Omega [A,\varepsilon U] - G. \end{aligned}$$

Then $Dh(\varepsilon ,A) =\left( \ * \ \big \vert \ \nabla ^2_A \Omega [A,\varepsilon U] \ \right) $, and since $\Omega [\,\cdot \,,\varepsilon U]$ is smooth and strictly concave, the right block is invertible for all $\varepsilon , A$. Moreover, we have $h(\hat{\varepsilon },\hat{A}) = 0$ by construction. Then the implicit function theorem gives the existence of a smooth function $\phi $ on a neighborhood I of $\hat{\varepsilon }$ such that $h(\varepsilon ,\phi (\varepsilon ))=0$ for all $\varepsilon \in I$, but this means precisely that $\phi = A_G$. The implicit function theorem then also says that

$$\begin{aligned} A_G '(\varepsilon ) = -(\nabla ^2_A \Omega [A_G(\varepsilon ),\varepsilon U])^{-1} \frac{\partial h}{\partial \varepsilon }(\varepsilon , A_G(\varepsilon )) \end{aligned}$$

(C.3)

for all $\varepsilon \in I$, where $A_G'$ denotes the ordinary derivative of the function $A_G$ of a single variable. In particular Eq. (C.3) holds at $\varepsilon = \hat{\varepsilon }$, but since $\hat{\varepsilon }$ was arbitrary (beyond being taken sufficiently small), it follows that Eq. (C.3) simply holds for all $\varepsilon > 0$ sufficiently small.

We want to show that all derivatives of $A_G:(0,\infty ) \rightarrow \mathcal {S}^N$ extend continuously to $[0,\infty )$. Starting with $A_G'$, we can examine these functions by taking further derivatives on the righthand side of Eq. (C.3). The result will be an expression involving integrals of the form

$$\begin{aligned} \int P(x,U(x))\, e^{-\frac{1}{2} x^T A_G(\varepsilon ) x - \varepsilon U(x)} \, \,\mathrm {d}x, \end{aligned}$$

where P is some polynomial, and it suffices to show that such integrals converge to their desired limits

$$\begin{aligned} \int P(x,U(x))\, e^{-\frac{1}{2} x^T G^{-1} x} \, \,\mathrm {d}x. \end{aligned}$$

The argument is by dominated convergence. First observe that from the at-most-exponential growth assumption (Assumption 2.5), it follows that there exist $a,b>0$ such that $ \vert P(x,U(x)) \vert \leqq a e^{b \Vert x\Vert }$ for all x. As in the proof of Lemma 4.4, write $U(x) = \widetilde{U}(x) - \lambda - \lambda \Vert x\Vert ^2$, where $\lambda >0$ and $\widetilde{U} \geqq 0$. Then

$$\begin{aligned} \vert P(x,U(x))\, e^{-\frac{1}{2} x^T A_G(\varepsilon ) x - \varepsilon U(x)} \vert \leqq & {} \vert P(x,U(x)) \vert \, e^{\varepsilon \lambda } e^{-\frac{1}{2} x^T (A_G(\varepsilon ) - 2\varepsilon \lambda ) x - \varepsilon \widetilde{U}(x)} \\\leqq & {} a e^{b \Vert x \Vert } e^{\varepsilon \lambda } e^{-\frac{1}{2} x^T (A_G(\varepsilon ) - 2\varepsilon \lambda ) x}. \end{aligned}$$

Then for all $\varepsilon > 0$ small enough such that $\varepsilon < 1$ and $A_G(\varepsilon ) - 2\varepsilon \lambda \succ \frac{1}{2} G^{-1}$, we see that the absolute value of the integrand is bounded uniformly by

$$\begin{aligned} a e^{b \Vert x \Vert } e^{ \lambda } e^{-\frac{1}{4} x^T G^{-1} x}, \end{aligned}$$

which is integrable. This completes the dominated convergence argument, and we conclude that all derivatives of $A_G$ extend continuously to $[0,\infty )$.

Next we aim to use the preceding to show that all derivatives of $\Phi _G$ and $\Sigma _G$ also extend continuously to $[0,\infty )$.

To this end, recall the Dyson equation

$$\begin{aligned} \Sigma _G = A_G - G^{-1}, \end{aligned}$$

which shows that the desired extension property of $\Sigma _G$ is equivalent to that of $A_G$, which we have already proved.

Now for any $\varepsilon >0$, we have

$$\begin{aligned} \Phi _G (\varepsilon )= & {} 2\mathcal {F}[G,\varepsilon U] - \mathrm {Tr}\log G - N\log (2\pi e) \\= & {} \mathrm {Tr}[A_G (\varepsilon ) G] -2\Omega [A_G (\varepsilon ),\varepsilon U] - \mathrm {Tr}\log G - N\log (2\pi e) \end{aligned}$$

by Legendre duality, from which it follows from our extension property for $A_G$, together with the arguments used to establish it, that all derivatives of $\Phi _G$ extend continuously to $[0,\infty )$.
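
To make the objects $A_G(\varepsilon )$ and $\Sigma _G(\varepsilon )$ concrete, one can solve $h(\varepsilon ,A)=0$ numerically in one dimension. The following is a rough sketch under assumptions not in the source: $N=1$, $U(x)=x^4$, a midpoint quadrature, and a bisection search on $[0, 1/g]$, which brackets the root only when $\varepsilon $ is small enough (for $g=1$, roughly $\varepsilon \leqq 0.1$). All names are hypothetical.

```python
import math


def quad(func, lo=-12.0, hi=12.0, n=2000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def second_moment(a: float, eps: float) -> float:
    """Second moment of the Gibbs measure with density ~ exp(-a*x^2/2 - eps*x^4)."""
    weight = lambda x: math.exp(-0.5 * a * x * x - eps * x ** 4)
    return quad(lambda x: x * x * weight(x)) / quad(weight)


def a_of_g(g: float, eps: float, iters: int = 40) -> float:
    """Solve second_moment(a, eps) = g for a by bisection on [0, 1/g].

    The second moment is decreasing in a, and at a = 1/g it is at most g
    because U >= 0; the lower end a = 0 is assumed to give a value >= g.
    """
    lo, hi = 0.0, 1.0 / g
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if second_moment(mid, eps) > g:
            lo = mid  # measure too spread out: a must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    g = 1.0
    for eps in (1e-1, 1e-2, 1e-3):
        a = a_of_g(g, eps)
        # Self-energy in the convention of the Dyson equation above.
        print(eps, a, a - 1.0 / g)
```

As $\varepsilon \rightarrow 0^+$ the computed $A_G(\varepsilon )$ approaches $G^{-1}$ and the difference $A_G(\varepsilon ) - G^{-1}$ shrinks, consistent with Lemma 4.4 and with the continuous extension to $\varepsilon = 0$ established here.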

1.11 Lemma 4.12

Proof

Based on Eqs. (4.6) and (4.7), we want to show that $G[A^{(M)}(\varepsilon ), U_\varepsilon ^{(M)}] \sim G[A^{(M)}(\varepsilon ) , \varepsilon U]$. As a first step, we aim to show that $Z[A^{(M)}(\varepsilon ), U_\varepsilon ^{(M)}] \sim Z[A^{(M)}(\varepsilon ) , \varepsilon U]$. Indeed, we can write

$$\begin{aligned}&Z[A^{(M)}(\varepsilon ) , \varepsilon U] - Z[A^{(M)}(\varepsilon ), U_\varepsilon ^{(M)}] \nonumber \\&\quad \quad \quad =\ \ \int e^{-\frac{1}{2} x^T A^{(M)}(\varepsilon ) x - \varepsilon U(x)} \left( 1 - e^{- \frac{1}{2} x^T \left[ \Sigma _G(\varepsilon ) - \Sigma _G^{(\leqq M)}(\varepsilon ) \right] x} \right) \,\mathrm {d}x. \end{aligned}$$

(C.4)

We can choose C such that

$$\begin{aligned} -C \varepsilon ^{M+1} \preceq \Sigma _G(\varepsilon ) - \Sigma _G^{(\leqq M)}(\varepsilon ) \preceq C \varepsilon ^{M+1} \end{aligned}$$

for all $\varepsilon >0$ sufficiently small.

Now let $R(\varepsilon ) = \varepsilon ^{-p/2}$ for $p\in (0,1)$. We split the integral in (C.4) into a part over $B_{R(\varepsilon )}(0)$ and another part over the complement. The integrand is dominated by $e^{- \delta x^T x}$ for some $\delta $ uniform in $\varepsilon $, the integral of which over the complement of $B_{R(\varepsilon )}(0)$ decays super-algebraically as $\varepsilon \rightarrow 0$, so we can neglect this contribution.

Meanwhile, for $x\in B_{R(\varepsilon )}(0)$, we have

$$\begin{aligned} \left| x^T \left[ \Sigma _G(\varepsilon ) - \Sigma _G^{(\leqq M)}(\varepsilon ) \right] x \right| \leqq C \varepsilon ^{M+1-p}, \end{aligned}$$

hence there exists $C'$ such that

$$\begin{aligned} \left| 1 - e^{- \frac{1}{2} x^T \left[ \Sigma _G(\varepsilon ) - \Sigma _G^{(\leqq M)}(\varepsilon ) \right] x} \right| \leqq C' \varepsilon ^{M+1-p} \end{aligned}$$

for all $x\in B_{R(\varepsilon )}(0)$. Combining with (C.4) and dominated convergence, we have established $Z[A^{(M)}(\varepsilon ), U_\varepsilon ^{(M)}] \sim Z[A^{(M)}(\varepsilon ) , \varepsilon U]$.

This result, together with analogous arguments applied to integrals of the form

$$\begin{aligned} \int x_i x_j \,e^{-\frac{1}{2} x^T A^{(M)}(\varepsilon ) x - \varepsilon U(x)} \left( 1 - e^{- \frac{1}{2} x^T \left[ \Sigma _G(\varepsilon ) - \Sigma _G^{(\leqq M)}(\varepsilon ) \right] x} \right) \,\mathrm {d}x, \end{aligned}$$

yields $G[A^{(M)}(\varepsilon ), U_\varepsilon ^{(M)}] \sim G[A^{(M)}(\varepsilon ) , \varepsilon U]$.

1.12 Lemma 5.1

Proof

For convenience, we define

$$\begin{aligned} \mathcal {F}_{c} [G] := \sup _{\mu \in \mathcal {G}^{-1}(G)\cap \mathcal {M}_c} \left[ H(\mu ) - \int U\,\,\mathrm {d}\mu \right] . \end{aligned}$$

Evidently $\mathcal {F}_c \leqq \mathcal {F}$ and $\mathcal {F}_{c}[G] = -\infty $ if $G \notin \mathcal {S}^N_{++}$, so we can restrict attention to $G \in \mathcal {S}^N_{++}$.

Fix $\varepsilon > 0$. Let $G\in \mathcal {S}^N_{++}$, so $\mathcal {F}[G]$ is finite, and let $\mu \in \mathcal {M}_2$ such that

$$\begin{aligned} H(\mu ) - \int U \,\,\mathrm {d}\mu \geqq \mathcal {F}[G] - \varepsilon /2. \end{aligned}$$

In particular, $H(\mu ) \ne -\infty $, so $\mathrm{{d}}\mu = \rho \,\,\mathrm {d}x$ for some density $\rho $. Then consider the measure $\mu _R \in \mathcal {M}_c (R)$ given by density $\rho _R := Z_R ^{-1}\cdot \rho \cdot \chi _R$, where $\chi _R$ is the indicator function for $B_R (0)$ and $Z_R = \int _{B_R (0)} \rho \,\,\mathrm {d}x$. By monotone convergence, $Z_R \rightarrow 1$.

Unfortunately we cannot expect $\mathcal {G}(\mu _R) = G$, but we do have $\mathcal {G}(\mu _R) \rightarrow G$ (following from dominated convergence, together with the finite second moments of $\mu $). We then want to modify $\mu _R$ (keeping its support compact) to construct a nearby measure with the correct second moments.

To this end let $G_R = \tau _R [G - \mathcal {G}(\mu _R)] + \mathcal {G}(\mu _R)$, where $\tau _R > 1$ is chosen so that $\tau _R \rightarrow +\infty $ and the eigenvalues of $G_R$ remain uniformly bounded away from zero and infinity (possible since $\mathcal {G}(\mu _R) \rightarrow G$). Note that we have $G = \tau _R^{-1} G_R + (1 - \tau _R^{-1}) \mathcal {G}(\mu _R)$.

Now let $\pi \in \mathcal {M}_2$ be any compactly supported measure with a density and finite entropy, and let $\pi _R = T_R \# \pi $, where $T_R$ is a linear transformation chosen so that $\mathcal {G}(\pi _R) = G_R$. Since the eigenvalues of $G_R$ are uniformly bounded away from zero and infinity, the $T_R$ can be chosen to have determinants uniformly bounded away from zero and infinity (which guarantees that the $\vert H(\pi _R) \vert $ are uniformly bounded), and $\pi _R$ can be taken to have uniformly bounded support. Then finally we can define a measure $\nu _R := \tau _R^{-1} \pi _R + (1 - \tau _R^{-1}) \mu _R$, so $\mathcal {G}(\nu _R) = G$ and $\nu _R$ is compactly supported.

For the proof it suffices to show that

$$\begin{aligned} H(\nu _R) - \int U \,\,\mathrm {d}\nu _R \rightarrow H(\mu ) - \int U \,\,\mathrm {d}\mu \end{aligned}$$

(C.5)

as $R \rightarrow \infty $.

By the weak growth condition (Definition 2.3), we can choose a constant C such that $\widetilde{U}$ defined by $\widetilde{U}(x) := C(1+\Vert x\Vert ^2) + U(x)$ satisfies $\widetilde{U}(x) \geqq \Vert x\Vert ^2$. Now

$$\begin{aligned} \int (1+\Vert x\Vert ^2) \,\,\mathrm {d}\mu _R \rightarrow \int (1+\Vert x\Vert ^2) \,\,\mathrm {d}\mu < +\infty \end{aligned}$$

by monotone convergence together with the fact that $Z_R \rightarrow 1$. Furthermore

$$\begin{aligned} \tau _R^{-1} \int (1+\Vert x\Vert ^2) \,\,\mathrm {d}\pi _R \rightarrow 0, \end{aligned}$$

so in fact

$$\begin{aligned} \int (1+\Vert x\Vert ^2) \,\,\mathrm {d}\nu _R \rightarrow \int (1+\Vert x\Vert ^2) \,\,\mathrm {d}\mu < +\infty \end{aligned}$$

Therefore, without loss of generality, we can prove (C.5) under the assumption that $U(x) \geqq \Vert x\Vert ^2$. But then $\int U \,\,\mathrm {d}\mu _R \rightarrow \int U\,\,\mathrm {d}\mu $ by monotone convergence, and $\tau _R^{-1} \int U \,\,\mathrm {d}\pi _R \rightarrow 0$ since the $\pi _R$ have uniformly bounded support, so in fact $\int U\,\,\mathrm {d}\nu _R \rightarrow \int U\,\,\mathrm {d}\mu $.

Then we need only show that $H(\nu _R) \rightarrow H(\mu )$. Here one verifies from the construction that $\nu _R$ converges weakly to $\mu $, and moreover the second moments of $\nu _R,\mu $ are uniformly bounded, so by Lemma 2.9, we have $\limsup _R H(\nu _R) \leqq H(\mu )$.

However, by the concavity of the entropy, we have $H(\nu _R) \geqq \tau _R^{-1} H(\pi _R) + (1-\tau _R^{-1}) H(\mu _R)$. Now recall that the $\vert H(\pi _R) \vert $ are uniformly bounded in R, so $ \tau _R^{-1} H(\pi _R) \rightarrow 0$. Thus the statement $\liminf _R H(\nu _R) \geqq H(\mu )$ (and hence also $H(\nu _R) \rightarrow H(\mu )$) will follow if we can establish $H(\mu _R) \rightarrow H(\mu )$.

Now

$$\begin{aligned} H(\mu _R) = \log (Z_R) - Z_R^{-1}\int _{B_R (0)} \rho \,\log \rho \,\,\mathrm {d}x, \end{aligned}$$

but we know that $Z_R \rightarrow 1$, so we need only show that

$$\begin{aligned} \int _{B_R (0)} \rho \log \rho \,\,\mathrm {d}x \rightarrow \int \rho \log \rho \,\,\mathrm {d}x. \end{aligned}$$

From Lemma 2.8, the negative part of $\rho \log \rho $ is integrable. But then the fact that $H(\mu ) > -\infty $ precisely means that the positive part of $\rho \log \rho $ is integrable, that is, $\rho \log \rho $ is absolutely integrable. Then the desired fact follows from dominated convergence.
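
The truncation step can be followed explicitly when $\mu $ is the standard Gaussian on $\mathbb {R}$, where $Z_R$, the second moment of $\mu _R$ and $H(\mu _R)$ all have closed forms. The sketch below is an illustration under that assumption only (the function name is hypothetical); it uses the formula $H(\mu _R) = \log (Z_R) - Z_R^{-1}\int _{B_R(0)} \rho \log \rho \,\mathrm {d}x$ from the proof.

```python
import math


def truncated_gaussian_stats(R: float):
    """Return (Z_R, second moment of mu_R, H(mu_R)) for mu = N(0, 1) cut off to [-R, R]."""
    phi_R = math.exp(-0.5 * R * R) / math.sqrt(2.0 * math.pi)  # density at x = R
    z_R = math.erf(R / math.sqrt(2.0))                         # mass of [-R, R]
    # Integral of x^2 * rho over [-R, R], obtained by integration by parts.
    x2_mass = z_R - 2.0 * R * phi_R
    # Integral of rho * log(rho) over [-R, R], using log(rho) = -log(2*pi)/2 - x^2/2.
    rho_log_rho = -0.5 * math.log(2.0 * math.pi) * z_R - 0.5 * x2_mass
    entropy = math.log(z_R) - rho_log_rho / z_R
    return z_R, x2_mass / z_R, entropy


if __name__ == "__main__":
    limit_entropy = 0.5 * math.log(2.0 * math.pi * math.e)  # H(mu) for N(0, 1)
    for R in (1.0, 2.0, 4.0, 8.0):
        z_R, m2, h = truncated_gaussian_stats(R)
        print(R, z_R, m2, h, limit_entropy)
```

As R grows, $Z_R \rightarrow 1$, the second moment of $\mu _R$ tends to that of $\mu $ from below, and $H(\mu _R) \rightarrow H(\mu )$, which is the behavior the proof establishes in general; the small deficit in the second moment is what the correction measure $\pi _R$ is designed to repair.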

About this article

Lin, L., Lindsey, M. Bold Feynman Diagrams and the Luttinger–Ward Formalism Via Gibbs Measures: Non-perturbative Analysis. Arch Rational Mech Anal 242, 527–579 (2021). https://doi.org/10.1007/s00205-021-01691-y

Received: 31 October 2018
Accepted: 29 June 2021
Published: 21 July 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s00205-021-01691-y
