Abstract
The generalized mean-field orthoplicial model is a mean-field model on a space of continuous spins on \(\mathbb {R}^n\) that are constrained to a scaled \((n-1)\)-dimensional \(\ell _1\)-sphere, equivalently a scaled \((n-1)\)-dimensional orthoplex, and interact through a general interaction function. The finite-volume Gibbs states of this model correspond to singular probability measures. In this paper, we use probabilistic methods to rigorously classify the infinite-volume Gibbs states of this model, and we show that they are convex combinations of product states. The predominant methods utilize the theory of large deviations, relative entropy, and equivalence of ensembles, and the key technical tools utilize exact integral representations of certain partition functions and locally uniform estimates of expectations of certain local observables.
1 Introduction
The purpose of this paper is to present rigorous probabilistic methods to compute and classify the large n-limits of integrals of the form
where \(g: \mathbb {R} \rightarrow \mathbb {R}\) is a “sufficiently regular” function which will be referred to as an interaction function, \(d \phi \) is the Lebesgue measure on \(\mathbb {R}^n\), \(\delta (\cdot )\) is formally a delta function, \(f \in C_b (\mathbb {R}^I)\), where \(C_b (\mathbb {R}^I)\) is the space of continuous bounded functions on \(\mathbb {R}^I\) for a finite index set \(I \subset [n]:= \{ 1,2,..., n\}\), \(\pi _I:\mathbb {R}^n \rightarrow \mathbb {R}^I\) is the canonical coordinate projection, and \(Q_n (g)\) is a normalization constant, which will be referred to as the partition function, which makes \(\mu _n^g\) into a probability measure. The main result in this paper is given in Theorem 3.3.6, and it constitutes a full characterization of the infinite-volume Gibbs states corresponding to the models given by the probability measures in Eq. (1.0.1) given some regularity of the interaction function g. The main result states that the infinite-volume Gibbs states of this model are given by convex combinations of particular exponential product states, and the coefficients of the convex combination are explicitly determined by an associated free energy function. Explicit examples of this type of result are given in the associated examples appearing after Theorems 3.2.4 and 3.2.6.
As a guide for the further reading of our introduction, we present an informal version of this main result.
Theorem
For sufficiently regular interaction functions g, it follows that
where \(\psi ^g: [-1,1] \rightarrow \mathbb {R}\) is given by
and, if we denote the elements of the finite collection \(M^*\) of global maximizing points of \(\psi ^g\) by the collection \((m^*)\), then it follows that there exists a collection \((c_{m^*})\) of positive weights summing to unity such that
where \(\eta ^{m^*}\) are probability measures corresponding to factorizable product states on \(\mathbb {R}^\mathbb {N}\) with single-site marginal densities given by
where \((\beta , \mu ) \in \mathbb {R} \times (0, \infty )\) are coefficients satisfying \(|\beta | < \mu \) and \(q (\beta , \mu )\) are normalization constants making the marginals into probability measures.
The dependence on the collection \((m^*)\) of global maximizing points of the weights \(c_{m^*}\) and of the coefficients \((\beta ,\mu )\) of each \(\eta ^{m^*}\) can be explicitly determined.
The rest of this introduction is dedicated to the explanation and motivation of the various objects appearing in this informal statement of the main result. For the actual statement of the main result, one should note that the notation is changed to be more specific and suggestive.
We will refer to the probability measure \(\mu _n^g\) as a finite-volume Gibbs state and the infinite-volume limit, i.e. the large n-limit, when it exists, will be referred to as the infinite-volume Gibbs state. We refer to Sect. 2 for a complete definition and discussion of the notion of infinite-volume Gibbs state in this context.
At a heuristic level, to make such finite-volume Gibbs states rigorous, we use the fact that the constraint function inside the delta function, when restricted to an orthant of \(\mathbb {R}^n\), precisely defines a uniform measure over a scaled \((n-1)\)-dimensional simplex. This method is used in Sect. 4.1. Since the \((n-1)\)-dimensional \(\ell _1\)-sphere corresponds to the \((n-1)\)-dimensional orthoplex, we refer to this model as the generalized mean-field orthoplicial model. This naming convention is similar to the convention used for the mean-field spherical model, see [1], but the constraint changes from the \(\ell _2\)-sphere to the orthoplex.
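As a concrete instance of the orthant decomposition, the following sketch computes the total mass of the formal delta measure on the \(\ell _1\)-sphere. It assumes that the constraint is written as \(\delta (\Vert \phi \Vert _1 - r)\) with \(r > 0\); this normalization is an assumption made for illustration and is not taken from the rigorous definition in Sect. 4.1.

\[
\int _{\mathbb {R}^n} d \phi \ \delta \Big ( \sum _{i=1}^n |\phi _i| - r \Big ) = 2^n \int _{[0, \infty )^n} d x \ \delta \Big ( \sum _{i=1}^n x_i - r \Big ) = 2^n \frac{r^{n-1}}{(n-1)!} .
\]

Each of the \(2^n\) orthants contributes the same simplex integral. Setting \(r = n\) and applying Stirling's formula gives \(\frac{1}{n} \ln \big ( 2^n n^{n-1} / (n-1)! \big ) \rightarrow 1 + \ln 2\), which under this normalization is the exponential growth rate of the partition function for \(g \equiv 0\).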
Mean-field models of equilibrium statistical mechanics have been studied extensively as toy models of spins on various types of spaces, and the most famous model belonging to this class is the Curie–Weiss model, see [2,3,4]. The classical spin-\(\frac{1}{2}\)-Curie–Weiss model is a relatively simple exactly solvable model with interesting statistical mechanical phenomena such as phase transitions, anomalous scaling, infinite-volume Gibbs states, etc. In addition, the model can be generalized in a variety of ways while retaining the essential simplicity of the models. Such generalizations are for instance the modifications of the interaction function, see [5], additions of external random fields, see [6], and modifications to the ambient space, see [2]. In this direction, there are also modifications to the entire measure of the ambient spin space, where one changes the product structure to a constrained singular measure such as the uniform measure on a scaled sphere, see [1]. The model here also falls into this category, since the ambient space of spins is assumed to be constrained to scaled spheres in the \(\ell _1\) norm.
The primary motivation for investigating this particular model is that it is a non-trivial exactly solvable model corresponding in a sense to a canonical ensemble probability measure of a thermodynamic system with two constraints, such that the classical ferromagnetic mean-field models are included via a suitable choice of the interaction function. Similar systems have been investigated in [7,8,9,10,11,12,13,14], and there is a wealth of different methodologies that have been used depending on the different models, but it is typical that these models do not utilize a general interaction function, if they include any interaction function at all. For this particular model, as opposed to the mean-field spherical model in [1, 11], we must employ different, more abstract methods to investigate the infinite-volume states, and it is our belief that these more abstract methods can be of use to study other similar problems. In fact, the general solution of this problem does not follow the more standard approach used in [2], which uses the Hubbard–Stratonovich transform, also referred to as Gaussian linearization. We comment on this point in Sect. 2. When comparing this model to the Berlin–Kac model [14], one should note that label permutation invariant models, like the models of this paper, have no underlying geometry of the index set. However, since the limiting states of the model are shown to be convex combinations of product states, unless the convex combination is actually trivial, i.e. there is just one product state, the limiting state is in general not factorizable.
To our knowledge, the orthoplicial model and the methods presented to solve the various problems associated with the orthoplicial model are novel in the literature, and known methods, such as in [2], are not necessarily applicable. A particular approach, which is quite natural, is to try to swap the delta function for an appropriately parametrized exponential function, and solve rigorously the problem of interchanging such functions. This approach, however, fails, see Sect. 2. Instead, the general method introduced in this paper, presented heuristically in Sect. 2, is to go from a singly delta constrained probability measure to one which is doubly constrained. This particular doubly constrained probability measure is more tractable, and one can characterize both its limiting probability measure, and the uniform parametric convergence it has in the large n-limit. The limiting measure of the doubly constrained measure has a product structure, and, subject to further analysis, we find the general result that for a wide variety of suitable interaction functions, the limiting states of the model are convex combinations of product states. The suitable swapping between constrained and non-constrained measures is one aspect of the equivalence of ensembles, see [15]. In fact, the approach we mentioned in the beginning of this paragraph, which is also shown to fail, is related to the fact that if we instead study this model as a thermodynamic system with two conserved quantities, namely the Hamiltonian associated to the interaction function, chosen to be quadratic here, and the particle number function corresponding to the \(\ell _1\) norm, then the calculation given in Eq. (2.0.15) shows that the grand canonical ensemble of this model fails to capture the full set of possible microstate values. We can only attain the trivial vanishing energy density with expectations in the grand canonical ensemble.
By the result of this paper, we know that the corresponding canonical ensemble has limiting states corresponding to non-trivial energy densities, and thus the grand canonical ensemble is insufficient to describe all the infinite-volume Gibbs states of this system. This is one form of non-equivalence of ensembles considered in [15]. This point is also important in the sense that it is one of the practical reasons for studying the canonical versions of these models as opposed to the grand canonical versions, and why it is valuable to develop different techniques for studying such models. In this vein, the Berlin–Kac model [14] was one of the first models of this type, and it too exhibits a “trivial” infinite-volume Gibbs state of the grand canonical version of that corresponding model, since it does not capture the phase transition of the standard Berlin–Kac model.
Let us now remark on the main methods and concepts used in this paper in more detail. The first step involves writing the finite-volume Gibbs state as an integral mixture of probability measures such that the mixture acts on two variables which parametrize a doubly constrained measure which we will call a microcanonical probability measure. This step is carried out in Sects. 3 and 4.1. The general strategy is then to utilize a type of generalized dominated convergence theorem where both the integrating measure and the functions we are integrating are varying, see Lemma 3.1.1.
Using relative entropy methods, we are able to show that the difference in expectations of local observables between the microcanonical probability measure and the completely unconstrained probability measure, referred to as the grand-canonical probability measure, depends explicitly on their corresponding statistical mechanical entropies, see Lemma 3.1.2. Using large deviations methods, we are able to prove a broadly applicable theorem which allows one to prove locally uniform convergence of the finite-volume microcanonical entropies using the concavity of the microcanonical entropies along with convergence of the grand-canonical entropy, see Theorem 3.1.3. Note that we are using a non-standard terminology by referring to the normalized logarithm of a partition function as an entropy irrespective of the ensemble the partition function comes from. We should emphasize that Theorem 3.1.3 formalizes, at least at this level of regularity, the notion that one can rigorously deduce many properties of the limiting microcanonical entropy from the grand canonical entropy, which is typically far more tractable mathematically.
For the orthoplicial model, one can easily verify some of the conditions of Theorem 3.1.3 that are required, since the corresponding grand-canonical probability measure is a product state. Of some methodological interest is the fact that we use the notion of Lorentzian polynomials, see [16], to prove that the microcanonical entropies are log-concave functions. The end result is that by combining Lemmas 3.1.2, 3.1.5, and 3.1.6, we obtain the locally uniform convergence of the difference of expectations of local observables for the microcanonical and grand-canonical probability measures in the form of Corollary 3.1.7.
As for the mixture probability measure, we begin by once again applying Theorem 3.1.3 to deduce the entropy of the corresponding canonical model, see Lemma 3.2.1, which is directly related to our model with a linear interaction function. Using tilting, we are then able to show that the mixture probability measures satisfy a large deviations principle, see Corollary 3.2.2. Using large deviations techniques found in Sect. 4.3, we are able to already classify the limiting states of our model for a variety of relevant non-trivial interaction functions, see Example 3.2.5 for the quadratic mean-field interaction with a non-vanishing magnetic field, see Example 3.2.7 for the quadratic mean-field interaction without an external magnetic field, and see Theorems 3.2.4 and 3.2.6 for the rigorous results concerning these two examples.
In order to fully classify the limiting states of the model for more general interaction functions, we need an additional result concerning the microcanonical partition function which comes in the form of an exact generating function representation, see Lemma 3.3.1. The generating function that we obtain is a modified Bessel function of the first kind, and we utilize a particular integral representation of it. This allows one to fully characterize the weak convergence of the mixture probability measure by relating it to Laplace-type integrals in three variables for which we can exactly deduce their asymptotics, see Lemma 3.3.4. The exact result employs the notions of type, and maximal type, given in [2], adapted to this particular model. The primary pair of results concerning this final result are Theorems 3.3.5 and 3.3.6, which can be summarized by stating that given sufficient regularity conditions on the interaction function g, which are intimately related to properties of the limiting entropy, one is able to show that the limiting states are convex combinations of product states.
In the literature, the closest works are [1, 11], in which similar results, with entirely different methods, are produced for the so-called mean-field spherical model. Another similar work which considers a Berlin–Kac-type model, see [14], with a spherical constraint is given in [10]. From the pure mathematical perspective, non-interacting continuous models with multiple constraints have been considered in [7, 8]. These works both consider the particular phenomenon of condensation, and their approaches could be described as probabilistic. For discrete two-constraint models, and formalism for the equivalence of ensembles for such models, see [12]. In terms of methods, in [17], there is an approach to proving a type of uniform convergence between constrained and non-constrained probability measures by adapting a uniform local central limit theorem. A similar approach based on uniform estimates is used in [18] to prove ensemble equivalence of some observables for microcanonical and canonical ensembles. For a random-field model constrained to the sphere, a similar uniform convergence result between constrained and non-constrained probability measures is obtained in [19]. Finally, we should also remark that this paper does not make use of the method of steepest descent, see [14], nor do we rely on characteristic functions in any particular way to complete any of the proofs. In Sect. 2, we use Gaussian linearization, also called the Hubbard–Stratonovich transform, to form a counter-example, but we will otherwise not use this standard tool.
1.1 Reading Guide
This paper is primarily organized so that a majority of the concepts and methods without proofs can be gathered by reading the introduction contained in Sect. 1 and the heuristics contained in Sect. 2. These sections do not contain any proofs, but they do contain some definitions and outline the basic approach to the problems in this paper.
The statements of the results, some important intermediate results, short or simple proofs, and relevant expository computations are given in Sect. 3. The more involved proofs or methods are contained in Sect. 4. Note that Sect. 4 also contains an entire subsection devoted to some results in the theory of large deviations, see Sect. 4.3, and the basic concepts and properties of relative entropy are given in Sect. 4.2.
2 Heuristics
The functions f used in Eq. (1.0.1) will be referred to as local functions and their associated finite index sets I will be referred to as local index sets. Such local functions f are naturally functions on \(\mathbb {R}^n\) for large enough n by using the coordinate projection \(\pi _I: \mathbb {R}^n \rightarrow \mathbb {R}^I\), and representing them as a composition \(f \circ \pi _I\). If one is able to resolve the large-n limits of integrals of the form given in Eq. (1.0.1), then one is able to specify, in the limit, the “expectations” of a large class of local observables. In doing so, subject to other regularity conditions on this limiting state, one is able to produce a genuine probability measure on \(\mathbb {R}^\mathbb {N}\). From now on, we will omit the coordinate projection \(\pi _I\), and simply write the expectation with respect to a local function f without the composition, unless it becomes pertinent for a specified reason. We will use the following definition of weak convergence and limit points of probability measures.
Definition 2.0.1
A sequence of probability measures \(\mathcal {G}:= \{ \mu _n \}_{n \in \mathbb {N}}\), such that each \(\mu _n\) is a probability measure on \(\mathbb {R}^n\), is said to converge weakly to a probability measure \(\mu _\infty \) on \(\mathbb {R}^\mathbb {N}\) if
for any \(f \in C_b (\mathbb {R}^I)\).
The set of limit points \(\mathcal {G}_\infty \) of \(\mathcal {G}\) is given by
where the limit is understood in the sense of the weak limit given here.
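In symbols, one way to write the two conditions of this definition that is consistent with the surrounding text is the following. The precise quantifiers are an assumption made here for illustration, with the limit taken over n large enough that \(I \subset [n]\).

\[
\lim _{n \rightarrow \infty } \mu _n [f \circ \pi _I] = \mu _\infty [f \circ \pi _I] \quad \text {for every finite } I \subset \mathbb {N} \text { and every } f \in C_b (\mathbb {R}^I),
\]

\[
\mathcal {G}_\infty = \Big \{ \mu : \text {there exists a subsequence } n_k \rightarrow \infty \text { such that } \mu _{n_k} \rightarrow \mu \text { weakly} \Big \} .
\]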
There are simple extensions, see [19] for an extension by “tensoring on 0” to the remaining \(\mathbb {N} {\setminus } [n]\) components, that make the probability measures \(\mu _n\) in this definition into probability measures on \(\mathbb {R}^\mathbb {N}\), and using these extensions the definitions above are equivalent to the standard definitions of weak convergence of probability measures on Polish spaces, and the notion of limit points is to be understood as limit points with respect to the Lévy–Prokhorov metric. In notation, we would redefine the measures \(\mu _n':= \mu _n \otimes \delta ^{\mathbb {N} {\setminus } \{ 1,2,...,n\}}_{0}\), where \(\delta ^{\mathbb {N} {\setminus } \{ 1,2,...,n\}}_{0}\) is the Dirac measure on the 0 vector of the space \(\mathbb {R}^{\mathbb {N} \setminus \{ 1,2,...,n\}}\). It is now clear that if n is large enough, then for any local observable f, we have \(\mu _n [f] = \mu _n' [f]\). We see then that this type of redefinition simply extends the probability measures on \(\mathbb {R}^n\) to \(\mathbb {R}^\mathbb {N}\), but since we are predominantly interested in large n-limits, we might as well work only with the sequence of probability measures \(\{ \mu _n \}_{n \in \mathbb {N}}\), since the two sequences give the same expectations of fixed local observables for large enough n. For our purposes, understanding that we are predominantly interested in studying the limit of expectations of local observables is sufficient for the contents of this paper.
Using this notation, we are then interested in studying and classifying the structure and content of the sets \(\mathcal {G}^g\), corresponding to the sequence of probability measures \(\{ \mu _n^g \}_{n \in \mathbb {N}}\) specified in their functional form in Eq. (1.0.1), which will be called the collection of finite-volume Gibbs states, and \(\mathcal {G}_\infty ^g\), which will be called the collection of infinite-volume Gibbs states, and their dependence on the interaction function g.
The prototypical interaction function g of this paper is based on the Curie–Weiss Hamiltonian \(H^J_{{\text {CW}}, n}: \mathbb {R}^n \rightarrow \mathbb {R}\) given by
where \(J > 0\) is a coupling constant, with the associated interaction function \(g^{\beta , J}: \mathbb {R} \rightarrow \mathbb {R}\) given by
where \(\beta > 0\). With this interaction function, the probability measure in Eq. (1.0.1) takes the form
and can be seen to contain two competing weights in the integrand: the interaction function gives larger weight to fields \(\phi \) in which the components are of the same sign and as large as possible, which is why we refer to this interaction as ferromagnetic, while the delta function term constrains the size aspect of the interaction. It is this competition which produces the non-trivial nature of the limiting state.
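For orientation, a standard normalization of the Curie–Weiss Hamiltonian and the resulting interaction function can be sketched as follows. The prefactor \(J/(2n)\) is an assumption, chosen to be consistent with the Gaussian linearization discussed later in this section.

\[
H^J_{{\text {CW}}, n} (\phi ) = - \frac{J}{2n} \sum _{i,j=1}^n \phi _i \phi _j = - \frac{J n}{2} \Big ( \frac{1}{n} \sum _{i=1}^n \phi _i \Big )^2, \qquad g^{\beta , J} (m) = \frac{\beta J}{2} m^2, \qquad e^{- \beta H^J_{{\text {CW}}, n} (\phi )} = e^{n g^{\beta , J} \left( \frac{1}{n} \sum _{i=1}^n \phi _i \right) } .
\]

On the constraint surface \(\sum _{i=1}^n |\phi _i| = n\) one has \(\big | \frac{1}{n} \sum _{i=1}^n \phi _i \big | \le 1\), so under this normalization the interaction weight is at most \(e^{n \beta J / 2}\), and the bound is attained only when all non-zero components share the same sign.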
From this recipe of going from the Hamiltonian to the interaction function g, we can produce a number of “generalized” interactions such as k-body interactions corresponding to interaction functions of polynomial-type
where \(\alpha _j\) are some real constants, even convex smooth interactions intended to model non-polynomial ferromagnetic interaction, and countless others which might be of interest.
The problem described here is well understood for models where the delta function is replaced by a product of density functions, see [2]. Let us now remark on the connection between these types of generalized Curie–Weiss models, and the generalized mean-field orthoplicial model.
Formally, using delta functions, we have
where
where \(\rho > 0\), \(|m| \le \rho \), and \(Z_n (m n, \rho n)\) is a normalization constant which makes \(\nu _n (m, \rho )\) into a probability measure. The values of \((m, \rho )\) for which the probability measure \(\nu _n (m, \rho )\) exists in some formal sense are given by pairs satisfying \(\rho > 0\), and \(|m| \le \rho \). These statements can be heuristically guessed “geometrically” by considering the intersection of hyperplanes with the \(\ell _1\)-spheres. For reasons which will become clear later, we will consider the interior of this set of existence, given and denoted by \(\mathcal {A}:= \{ (m, \rho ): \rho > 0, |m| < \rho \}\). Returning to Eq. (1.0.1), we see that
In this form, the finite-volume Gibbs state is written as an integral mixture of another probability measure.
Although the original problem constrained the integrals to the \(\ell _1\)-sphere of radius n, we have suggestively modified the notation so as to include the other possible values of the radius. This suggestive notation is due to the principle or phenomenon of the equivalence of ensembles, see [15]. We will refer to the probability measure \(\nu _n (m, \rho )\) given formally in Eq. (2.0.2) as the microcanonical probability measure. This probability measure is constrained by two functions \(M_n, N_n: \mathbb {R}^n \rightarrow \mathbb {R}\) given by
We will refer to these functions as macrostates and the individual functions will be referred to as the magnetization and particle number respectively. In this paper, we will often refer to either ensembles or probability measures when discussing a particular thermodynamic model. Integrals with delta functions of the macrostates are referred to as constrained, and whenever we replace a delta function by some non-singular “function” of a macrostate, we are moving toward a less constrained state. With this perspective in mind, we will focus on the connection between the microcanonical probability measure and the grand canonical probability measure \(\eta (\beta , \mu )\) on \(\mathbb {R}^\mathbb {N}\) given by its action on \(f \in C_b (\mathbb {R}^I)\) given by
where \(\mu > 0\), \(|\beta | < \mu \), and \(q(\beta , \mu )^{|I|}\) is a normalization constant making the finite marginals into probability measures. One can compute, by direct integration, that
Note that, strictly speaking, the grand canonical probability measure should refer to the probability measure obtained from \(\eta (\beta , \mu )\) by considering its marginal distribution on the index set [n].
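The direct integration mentioned above can be sketched as follows, assuming that the single-site weight is \(e^{\beta \phi - \mu |\phi |}\). This exponential form is inferred from the constraints \(\mu > 0\) and \(|\beta | < \mu \), and should be read as an assumption.

\[
q (\beta , \mu ) = \int _{\mathbb {R}} d \phi \ e^{\beta \phi - \mu |\phi |} = \int _0^\infty d \phi \ e^{- (\mu - \beta ) \phi } + \int _0^\infty d \phi \ e^{- (\mu + \beta ) \phi } = \frac{1}{\mu - \beta } + \frac{1}{\mu + \beta } = \frac{2 \mu }{\mu ^2 - \beta ^2} .
\]

Both one-sided integrals converge exactly when \(|\beta | < \mu \), which accounts for the stated parameter range.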
The equivalence of ensembles principle states that, subject to some yet to be verified properties of the microcanonical and grand canonical partition functions, there are a number of ways in which these two probability measures are the same. For our purposes, we will utilize ideas stemming from the ensemble equivalence principle corresponding in some sense to thermodynamic, macrostate, and measure level equivalence of these probability measures. For a more complete view on the principle of the equivalence of ensembles, see [15].
To that end, we will need the finite- and infinite-volume specific microcanonical entropies \(s_n, s: \mathcal {A} \rightarrow \mathbb {R}\) given respectively by
In addition, for the grand canonical ensemble we will need the finite- and infinite-volume specific entropies \(f_n,f: \mathcal {A} \rightarrow \mathbb {R}\) given respectively by
Note the sign conventions used here. We will omit the specific part in their naming, and refer simply to entropies. For this particular model, as for all product state models, we trivially have \(f_n (\beta , \mu ) = f (\beta , \mu ) = \ln q (\beta , \mu )\).
Using the entropies, we can rewrite Eq. (2.0.3) as
The first type of equivalence property that we wish to utilize is the following pair of relations
This relation is practically equivalent to that of two functions being Legendre conjugates, see [20]. Since we already have a closed form for \(f(\beta , \mu )\), we may extract the form of \(s(m, \rho )\) if this relation holds.
The second equivalence property is the parameter matching scheme given by
If for every pair \((m, \rho ) \in \mathcal {A}\) there exists a corresponding pair \((\beta , \mu ) \in \mathcal {A}\) satisfying the above relations and vice versa, then these corresponding pairs of values are the values for which we would expect the probability measures to be the same. We will use the notations \(m (\beta , \mu )\), \(\rho (\beta , \mu )\), \(\beta (m, \rho )\), and \(\mu (m, \rho )\) for this bijection. This bijection is intimately connected to the first equivalence property through the Legendre conjugates.
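To illustrate the two equivalence properties, here is a worked sketch under the assumptions that the grand canonical weight is \(e^{\beta M_n - \mu N_n}\), so that \(f (\beta , \mu ) = \ln \frac{2 \mu }{\mu ^2 - \beta ^2}\), and that the conjugacy relation reads \(s (m, \rho ) = \inf _{(\beta , \mu ) \in \mathcal {A}} \{ f (\beta , \mu ) - \beta m + \mu \rho \}\). These sign conventions are assumptions and may differ from those of the displayed relations.

\[
m (\beta , \mu ) = \partial _\beta f (\beta , \mu ) = \frac{2 \beta }{\mu ^2 - \beta ^2}, \qquad \rho (\beta , \mu ) = - \partial _\mu f (\beta , \mu ) = \frac{\mu ^2 + \beta ^2}{\mu (\mu ^2 - \beta ^2)},
\]

\[
\mu (m, \rho ) = \frac{1}{\sqrt{\rho ^2 - m^2}}, \qquad \beta (m, \rho ) = \frac{m}{\sqrt{\rho ^2 - m^2} \left( \rho + \sqrt{\rho ^2 - m^2} \right) },
\]

\[
s (m, \rho ) = f (\beta (m, \rho ), \mu (m, \rho )) - \beta (m, \rho ) m + \mu (m, \rho ) \rho = 1 + \ln \left( \rho + \sqrt{\rho ^2 - m^2} \right) .
\]

Since \(2 |\beta | \mu < \mu ^2 + \beta ^2\) whenever \(|\beta | < \mu \), the first line maps \(\mathcal {A}\) into \(\mathcal {A}\), and the second line inverts it. In the last line the two linear terms combine to \(- \beta m + \mu \rho = 1\) and the logarithmic term is \(f = \ln ( \rho + \sqrt{\rho ^2 - m^2} )\).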
The final form of equivalence is then the rough statement that in the large n-limit, we have
for local functions \(f \in C_b (\mathbb {R}^{I})\).
If we now return to Eq. (2.0.3), the heuristic behaviour of the model in the large n-limit is roughly speaking that
and using the Laplace method, see [21], one would expect that
where \(\alpha \) is a probability measure on \([-1,1]\) and \(M^* (\psi ^g) \subset (-1,1)\) is the set of global maximizing points of the mapping \([-1,1] \ni m \mapsto \psi ^g (m):= g(m) + s(m,1)\). This is to be expected, since integrals with integrands of the form above concentrate at an exponential rate on the global maximum points of the given function.
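As an illustration of how the maximizing points arise, consider the quadratic interaction \(g (m) = \frac{\beta J}{2} m^2\) together with the closed form \(s (m, 1) = 1 + \ln ( 1 + \sqrt{1 - m^2} )\). Both the normalization of g and this expression for the entropy are assumptions here, the latter obtained from the Legendre relation with \(f (\beta , \mu ) = \ln \frac{2 \mu }{\mu ^2 - \beta ^2}\), and the computation is meant as a sketch rather than a statement of the rigorous results.

\[
\psi ^g (m) = \frac{\beta J}{2} m^2 + 1 + \ln \left( 1 + \sqrt{1 - m^2} \right) , \qquad \frac{d}{d m} \psi ^g (m) = m \left( \beta J - \frac{1}{\sqrt{1 - m^2} \left( 1 + \sqrt{1 - m^2} \right) } \right) .
\]

The bracket decreases in \(|m|\) from \(\beta J - \frac{1}{2}\) at \(m = 0\) to \(- \infty \) as \(|m| \rightarrow 1\). Under these assumptions, \(m = 0\) is the unique global maximizing point when \(\beta J \le \frac{1}{2}\), while for \(\beta J > \frac{1}{2}\) there are two symmetric maximizing points \(\pm m^*\) with \(m^* = \sqrt{1 - D^2}\) and \(D = \frac{1}{2} \big ( \sqrt{1 + 4 / (\beta J)} - 1 \big ) \in (0,1)\).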
The connection between this model and the generalized Curie–Weiss model is now evident. The limiting states of both models are given by mixtures of product states. However, for this model, one cannot realize these limiting states without the \(\ell _1\) constraint. To see this, let us consider the following integral
where \(\mu > 0\) and \(\beta \le 0\). This would be the less constrained grand canonical partition function for which one would hope that an equivalence principle holds. The partition function here is not finite if \(\beta > 0\). For the allowed values of \(\beta \), using the Fourier transform of the Gaussian, we have
Since the function \(z \mapsto \frac{1}{2} z^2 + \ln (\mu ^2 + (-\beta ) J z^2)\) is trivially minimized when \(z = 0\), by the Laplace method, it follows that
Now, if we include the mixture measure form of this integral, it follows that
where \(f \in C_b (\mathbb {R}^I)\) is a local function, from which we have
As can be seen, the limiting state is trivial in the sense that it is a pure state, i.e. not a convex combination of any other probability measures, and it does not depend on \(\beta \le 0\). It is for this reason that it is desirable to study the \(\ell _1\) constrained model, since the replacement of the product measure, for this particular model, with a delta function reproduces the non-trivial limiting states.
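The computation behind this counter-example can be sketched as follows, assuming the grand canonical weight \(\exp \big ( \frac{\beta J}{2n} ( \sum _{i} \phi _i )^2 - \mu \sum _{i} |\phi _i| \big )\) with \(\beta \le 0\). The normalization is an assumption, chosen to match the function of z minimized above. The two ingredients are the Fourier transform of the Gaussian and the Fourier transform of the single-site weight:

\[
e^{- \frac{(- \beta ) J}{2 n} x^2} = \int _{\mathbb {R}} \frac{d z}{\sqrt{2 \pi }} \ e^{- \frac{1}{2} z^2} e^{i z \sqrt{(- \beta ) J / n} \, x}, \qquad \int _{\mathbb {R}} d \phi \ e^{i t \phi - \mu |\phi |} = \frac{2 \mu }{\mu ^2 + t^2} .
\]

Applying the first identity with \(x = \sum _{i} \phi _i\), integrating out each \(\phi _i\) with the second identity, and rescaling \(z \mapsto \sqrt{n} z\) gives

\[
\int _{\mathbb {R}^n} d \phi \ e^{\frac{\beta J}{2n} \left( \sum _{i=1}^n \phi _i \right) ^2 - \mu \sum _{i=1}^n |\phi _i|} = \sqrt{\frac{n}{2 \pi }} \int _{\mathbb {R}} d z \ e^{- n \left[ \frac{1}{2} z^2 + \ln \left( \mu ^2 + (- \beta ) J z^2 \right) - \ln (2 \mu ) \right] } .
\]

The exponent is minimized at \(z = 0\), where the single-site factor reduces to \(q (0, \mu ) = 2 / \mu \). This is consistent with the observation that the limiting state carries no dependence on \(\beta \).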
The heuristic is then that the limiting states of the model are mixtures of product states of the form given in Eq. (2.0.5), where the mixture probability measure is determined by the properties of the interaction function g. This is precisely what we will prove rigorously.
Before presenting the main results and proofs, let us remark on what exactly is not rigorous, incorrect, or too formal in the above exposition. The delta functions appearing in Eqs. (1.0.1) and (2.0.2) are completely formal objects, and we will rigorously define the microcanonical probability measure on which we can actually perform non-formal computations. In particular, the formal calculation presented in Eq. (2.0.1) is strictly speaking incorrect. For this particular model, it is important to take into consideration the “boundary values” of the set \(\mathcal {A}\). That is to say, the admissible pairs which satisfy \(\rho > 0\) and \(|m| = \rho \) produce partition functions which cannot be neglected if one wants to verify the formal calculation in Eq. (2.0.1). In addition, the forms of equivalence of ensembles we have specified here are vague and unverified. We will verify these forms of equivalence explicitly, and they will be presented as lemmas.
3 Main Results
In this section, we present the main results, short or simple proofs, and expository computations concerning the main results.
3.1 Locally Uniform Convergence of Observables and Entropy of the Microcanonical Ensemble
We begin by rigorously defining the microcanonical probability measure \(\nu _n (m, \rho )\) from Eq. (2.0.2) for \((m, \rho ) \in \mathcal {A}\), and the so-called “boundary values” corresponding to \(\rho > 0\) and \(|m| = \rho \). This is done by identifying the microcanonical probability measure as a convex combination of products of uniform measures on simplexes. The uniform measures on simplexes are rigorously definable via the so-called flag coordinates, and these uniform measures are computationally tractable. Due to the large number of properties that need to be shown for the microcanonical probability measures, we dedicate an entire section, see Sect. 4.1, to the rigorous definition and methods of use of this particular probability measure. The key definitions are for the microcanonical probability measures \(\nu _n (m, \rho )\) and the microcanonical partition functions \(Z_n (M,N)\), now defined in Definition 4.1.3.
In this work, we will often refer to Polish spaces and probability measures on them. Whenever we do so without an explicit reference to a \(\sigma \)-algebra, we implicitly mean with respect to the Borel \(\sigma \)-algebra associated with the topology of the Polish space. The basic principle by which we will identify the infinite-volume Gibbs states is presented in the following lemma.
Lemma 3.1.1
Let X be a Polish space. If \(\{ \mu _n \}_{n \in \mathbb {N}}\) is a sequence of probability measures on X converging weakly to a probability measure \(\mu \) on X, \(K \subset X\) is a compact continuity set of \(\mu \) such that \({\text {supp}} (\mu ) \subset K\), and \(\{ f_n \}_{n \in \mathbb {N}}\) is a sequence of uniformly bounded functions on X converging uniformly on K to a function f, then it follows that
The proof, see Sect. 4.2, is an application of conditioning to K and applying various weak convergence properties.
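The estimate behind this lemma can be sketched as follows, under the additional assumptions, made here for illustration, that \(|f_n| \le C\) for all n and that f is continuous on K, so that f admits a continuous extension \(\tilde{f}\) to X with \(|\tilde{f}| \le C\). Since \({\text {supp}} (\mu ) \subset K\), we have \(\mu [f] = \mu [\tilde{f}]\), and splitting \(\mu _n [f_n] - \mu _n [\tilde{f}]\) over K and its complement gives

\[
\left| \mu _n [f_n] - \mu [f] \right| \le \sup _{x \in K} \left| f_n (x) - f (x) \right| + 2 C \mu _n (X \setminus K) + \left| \mu _n [\tilde{f}] - \mu [\tilde{f}] \right| .
\]

The first term vanishes by uniform convergence on K, the second because K is a continuity set with \(\mu (X \setminus K) = 0\), and the third by weak convergence.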
With reference to Eqs. (1.0.1) and (2.0.3), using the definition and methods of Sect. 4.1, we can write the finite-volume Gibbs states in the following form
where \(\kappa _n^g\) are probability measures on \(\mathbb {R}\) supported by \([-1,1]\) with actions on \(f \in C_b (\mathbb {R})\) given by
where the partition function then takes the following form
In light of Lemma 3.1.1, we have two goals. The first goal is to show that the collection of mixture probability measures \(\{ \kappa _n^g \}_{n \in \mathbb {N}}\) converges weakly to some limiting probability measure, and that there exists a compact continuity set of this limiting probability measure which contains its support. The second goal is to show that, for a fixed \(f \in C_b (\mathbb {R}^I)\), the collection of functions \(\{ \nu _n (m,1) [f] \}_{n \in \mathbb {N}}\), understood as functions of the variable \(m \in [-1,1]\), is uniformly bounded, which is immediate by the boundedness of f, and uniformly convergent on the required compact continuity set.
In the heuristic sketch in the introduction, we did not pay any particular attention to the modes of convergence of the limiting objects. For this particular model, we are able to obtain locally uniform convergence by relating the rate of convergence of local functions to the rate and mode of convergence of the finite-volume entropies. This connection is described in the following fundamental inequality.
Lemma 3.1.2
For any finite index set \(I \subset [n]\) and any pairs of values \((m, \rho ) \in \mathcal {A}\) and \((\beta , \mu ) \in \mathcal {A}\), we have
The proof of this result is an application of Pinsker’s inequality for relative entropy, followed by the subadditivity property of relative entropy coupled with the permutation invariance of the microcanonical probability measure. For this model, we can exactly compute the relative entropy of the \((n-2)\):th marginal of the microcanonical probability measure from which we obtain the entropy terms in the above inequality. For the full proof, see Sect. 4.2.
If we were only interested in showing that the microcanonical probability measure converges to the grand canonical probability measure, it would suffice to study the pointwise convergence of the entropies. However, since we want to prove locally uniform convergence, we need some additional regularity. The additional regularity that we will prove is that the sequence of finite-volume microcanonical entropies is pointwise uniformly bounded, and that the microcanonical partition functions are log-concave functions on \(\mathcal {A}\). By a classical result in convex analysis, see [20, Section 10], once the pointwise limit of the finite-volume microcanonical entropies is deduced, the convergence is immediately elevated to locally uniform convergence.
In some models, the grand canonical entropy is more computationally tractable than the microcanonical entropy. This is the case here as well, and we will prove a general result, possibly of interest for other models, which combines the aforementioned regularity properties of the microcanonical partition functions with some additional regularity properties of the grand canonical entropy.
Theorem 3.1.3
Let \(\{ Z_n \}_{n \in \mathbb {N}}\) be a sequence of log-concave functions \(Z_n: n \mathcal {C} \rightarrow (0, \infty )\), where \(\mathcal {C} \subset \mathbb {R}^m\) is a non-empty open convex set and \(n \mathcal {C}:= \{n c: c \in \mathcal {C} \}\), such that
for any \(x \in \mathcal {C}\), and there exists a non-empty open convex set \(\mathcal {C}' \subset \mathbb {R}^m\) such that
for all \(t \in \mathcal {C}'\) and all \(n \in \mathbb {N}\), where \(\left\langle \cdot , \cdot \right\rangle \) is the Euclidean inner product.
If the function \(f: \mathbb {R}^m \rightarrow \mathbb {R} \cup \{ \pm \infty \}\) given by the mapping
exists and is a proper convex lower semi-continuous function of Legendre type which satisfies \(\nabla [- f] \mathcal {C}' = \mathcal {C}\) then it follows that
for any compact set \(K \subset \mathcal {C}\).
The proof of this result requires definitions and notions from large deviations theory. We have dedicated an entire section, see Sect. 4.3, to the proof, the relevant definitions, and the results which can be deduced after establishing a large deviations principle. The proof itself uses a relative compactness argument concerning locally uniformly convergent subsequences, and a characterization of the limits of said subsequences using a large deviations principle.
To apply this method to this model, we proceed by providing the sufficient regularity of the finite-volume microcanonical entropies.
Lemma 3.1.4
The collection of finite-volume microcanonical entropies \(\{ s_n \}_{n \in \mathbb {N}}\) is pointwise uniformly bounded and concave on \(\mathcal {A}\).
The proof, see Sect. 4.4, of log-concavity proceeds by identifying the microcanonical partition functions \(Z_n\) as a composition of a bivariate Lorentzian polynomial of degree \(n-2\) and a linear map. To prove the uniform pointwise boundedness, we use the positivity of the relative entropy between the \((n-2)\):th marginal of the microcanonical probability measure and the grand-canonical probability measure.
In light of Theorem 3.1.3, it remains to consider the mapping \(f: \mathbb {R}^2 \rightarrow \mathbb {R}\) given by
where, in accordance with Eq. (4.1.3), we have
for \((M,N) \in \mathcal {A}\).
It is immediate that if \((\beta , \mu ) \not \in \mathcal {A}\), then \(f(\beta ,\mu ) = \infty \). As for \((\beta , \mu ) \in \mathcal {A}\), we can directly compute that
Computing the limit, it follows that
In summary, we have
We have included this calculation here to emphasize the fact that this calculation is relatively straightforward.
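The following Python sketch indicates why a computation of this kind is elementary. It assumes, purely for illustration, that the single-site grand canonical weight is \(e^{-\beta \phi - \mu |\phi |}\) on \(\mathbb {R}\) and that \(\mathcal {A}\) is the open cone \(|\beta | < \mu \). Neither the weight nor the closed form \(2 \mu / (\mu ^2 - \beta ^2)\) used below is quoted from the text, and the sign and normalization conventions may differ from those of Eq. (4.1.3). Under these assumptions the one-site integral is finite exactly on the cone, and the sketch compares a midpoint-rule quadrature with the closed form.

```python
import math

def one_site_integral(beta, mu, cutoff=80.0, steps=160_000):
    """Midpoint rule for the integral of exp(-beta*x - mu*|x|) over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps):
        x = -cutoff + (i + 0.5) * h
        total += math.exp(-beta * x - mu * abs(x))
    return total * h

def closed_form(beta, mu):
    """Value on the assumed cone |beta| < mu: 1/(mu + beta) + 1/(mu - beta)."""
    if abs(beta) >= mu:
        return math.inf  # the integral diverges outside the cone
    return 2.0 * mu / (mu * mu - beta * beta)

beta, mu = 0.3, 1.0  # hypothetical parameter values inside the cone
print(one_site_integral(beta, mu), closed_form(beta, mu))
# Logarithm of the one-site integral; under the stated assumptions this plays
# the role of f(beta, mu) up to convention-dependent constants.
print(math.log(closed_form(beta, mu)))
```

The even number of quadrature cells places the kink of \(|\phi |\) on a cell boundary, so the midpoint rule keeps its usual second-order accuracy.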
We present the relevant regularity conditions of the map \(f: \mathbb {R}^2 \rightarrow \mathbb {R}\) in the following result.
Lemma 3.1.5
The mapping \(f: \mathbb {R}^2 \rightarrow \mathbb {R}\) is a proper convex lower semi-continuous function of Legendre type.
In addition, it follows that \((- \nabla [f]) \mathcal {A} = \mathcal {A}\), and
where \((\beta , \mu ):= (-\nabla [f])^{-1}: \mathcal {A} \rightarrow \mathcal {A}\) is given by
For the proof, which is completely computational, see Sect. 4.4.
Combining together the regularity of the finite-volume entropies from Lemma 3.1.4, and the computations and verifications concerning the function f given in Lemma 3.1.5, we have the following result.
Lemma 3.1.6
It follows that
for any compact set \(K \subset \mathcal {A}\), where
for any \((m, \rho ) \in \mathcal {A}\).
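The locally uniform convergence in Lemma 3.1.6 can be probed numerically. The Python sketch below assumes, as an illustration only, that the microcanonical partition function is, up to constant factors, \(Z_n (M,N) = \sum _{k=1}^{n-1} \binom{n}{k} \frac{a^{k-1}}{(k-1)!} \frac{b^{n-k-1}}{(n-k-1)!}\) with \(a = (N+M)/2\) and \(b = (N-M)/2\), that is, a sum over sign patterns of products of simplex volumes, and that the limiting entropy is \(s(m, \rho ) = 1 - \ln 2 + 2 \ln (\sqrt{\rho + m} + \sqrt{\rho - m})\). Both formulas are assumptions of this sketch rather than quotations of Definition 4.1.3 or of the displayed limit, and all function names are hypothetical.

```python
import math

def log_partition(n, m, rho=1.0):
    """ln Z_n(m n, rho n) for the assumed sum over sign patterns, via log-sum-exp."""
    a = 0.5 * n * (rho + m)  # total mass carried by the positive spins
    b = 0.5 * n * (rho - m)  # total mass carried by the negative spins
    logs = []
    for k in range(1, n):  # k = number of positive spins
        log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        log_plus = (k - 1) * math.log(a) - math.lgamma(k)            # a^(k-1)/(k-1)!
        log_minus = (n - k - 1) * math.log(b) - math.lgamma(n - k)   # b^(n-k-1)/(n-k-1)!
        logs.append(log_binom + log_plus + log_minus)
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def limiting_entropy(m, rho=1.0):
    """Assumed closed form of the limiting entropy s(m, rho)."""
    return 1.0 - math.log(2.0) + 2.0 * math.log(math.sqrt(rho + m) + math.sqrt(rho - m))

def sup_error(n, grid=(-0.8, -0.4, 0.0, 0.4, 0.8)):
    """Largest gap between the finite-volume entropy and its limit on a compact grid."""
    return max(abs(log_partition(n, m) / n - limiting_entropy(m)) for m in grid)

for n in (50, 200, 800):
    print(n, sup_error(n))
```

The grid stays away from the boundary \(|m| = \rho \), in line with the restriction to compact subsets of \(\mathcal {A}\).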
Combining together Lemmas 3.1.2, 3.1.5, and 3.1.6, we have the following result concerning the mode of convergence of local observables of the microcanonical probability measures.
Corollary 3.1.7
For any finite index set \(I \subset [n]\), it follows that
Having established the compact-open convergence of the microcanonical probability measures, we move on to the weak convergence of the mixture probability measures.
3.2 Limiting Entropy and Convergence of Mixture Probability Measures
By the heuristics given, one expects that the mixture probability measures \(\{ \kappa _n^g \}_{n \in \mathbb {N}}\) should concentrate, at an exponential rate, on the global maximizing points of some tilting function. This idea can be realized by proving that the mixture probability measures satisfy a large deviations principle. Since the full models have a general interaction function g, we will first prove a large deviations principle for linear g, and then use tilting to obtain the full large deviations principle. The following result considers the large deviations principle for a linear g.
Lemma 3.2.1
Let \(\beta \in \mathbb {R}\), \(g^\beta (m):= - \beta m\), \(Q_n (\beta ):= Q_n (g^\beta )\), and \(\kappa _n^\beta := \kappa _n^{g^\beta }\).
Then, it follows that
Moreover, \(\{ \kappa _n^\beta \}_{n=1}^\infty \) satisfies a large deviations principle with rate function \(I^\beta : \mathbb {R} \rightarrow [0, \infty ]\) given by
and \(I^\beta (m) = \infty \) for \(m \not \in [-1,1]\).
The proof, see Sect. 4.4, follows the same strategy as for the microcanonical entropy. Here, the log-concavity is proved by an application of the Prékopa–Leindler theorem, and pointwise uniform boundedness is a direct calculation.
Since the previous result yields a large deviations principle for the mixture probability measures \(\{ \kappa _n^0 \}_{n \in \mathbb {N}}\), corresponding to the choice of g being identically 0, as a direct corollary of tilting, see [22], we have the following large deviations principle for the full mixture probability measures.
Corollary 3.2.2
For any \(g \in C_b ([-1,1])\), it follows that
Moreover, \(\{ \kappa _n^g \}_{n=1}^\infty \) satisfies a large deviations principle with rate function \(I^g: \mathbb {R} \rightarrow [0, \infty ]\) given by
and \(I^g (m) = \infty \) for \(m \not \in [-1,1]\).
Whenever a sequence of probability measures satisfies a large deviations principle with some rate function, it is accompanied by a measure concentration result to the kernel of the rate function, see Sect. 4.3. In this vein, consider the function \({\psi ^g}: [-1,1] \rightarrow \mathbb {R}\) given by
It is clear that if \(I^g (m^*) = 0\), then the point \(m^*\) corresponds to a global maximum point of \({\psi ^g}\) by definition, and vice versa. Denote the set of global maximizing points of \({\psi ^g}\) by \(M^* (\psi ^g)\), and, by the previous observation, we have \(\left( I^g\right) ^{-1} \{ 0 \} = M^* (\psi ^g)\).
We may now begin the classification of the infinite-volume Gibbs states. As a first partial result, by combining together Lemma 3.1.1, Corollary 3.1.7, and Theorem 4.3.8, we have the following result.
Lemma 3.2.3
Let \(g \in C_b ([-1,1])\), and suppose that \(M^* (\psi ^g) \subset (-1,1)\). Then, it follows that
The proof, see Sect. 4.4, is a direct combination of the given results.
As a corollary, if we can deduce that there is exactly one global maximizing point of \({\psi ^g}\) contained in the interval \((-1,1)\), then there is a unique infinite-volume Gibbs state. This follows since the Dirac measure on a single point is the only probability measure supported on a single point.
Theorem 3.2.4
Let \(g \in C^1 ([-1,1])\), and suppose that \({\psi ^g}\) has a unique global maximizing point \(m^* \in (-1,1)\).
Then, it follows that
There are two prototypical functions g that fall into this category. The first, which we have already seen, is \(g^\beta (m) = - \beta m\) for \(\beta \in \mathbb {R}\). Since s is strictly concave, it is easy to check that \(\psi ^{g^{\beta }}\) has a unique global maximizing point. The other example is related to the Curie–Weiss Hamiltonian with an external field.
Example 3.2.5
Consider \(g (m):= \frac{\beta J}{2} m^2 + \beta h m\), where \(J > 0\), \(\beta > 0\), and \(h \not = 0\). Let us first remark that \({\psi ^g}\) must attain its maximum on \([-1,1]\). Suppose first that \(h > 0\). No point \(m^* < 0\) can be a global maximum point of \({\psi ^g}\), since \({\psi ^g}(- m^*) > {\psi ^g}(m^*)\). It follows that any global maximizing point must be of the same sign as h. We continue with the case \(h > 0\), and note that the other case is analogous. By direct computation, we have
One can further compute that
and
for \(0 < m \le 1\). It follows that \(\partial ^3 [\psi ^g](m) < 0\) on (0, 1], and \(\partial [\psi ^g]\) is thus strictly concave there. In addition, we have \(\partial [\psi ^g](0) = \beta h > 0\), and \(\lim _{m \rightarrow 1^-} \partial [\psi ^g](m) = -\infty \). Using these properties, it follows that there must exist a unique point \(m^* \in (0,1)\) such that \(\partial [\psi ^g](m^*) = 0\). In addition, by strict concavity of \(\partial [\psi ^g]\), it follows that \({\psi ^g}\) is monotonically increasing on \((0, m^*)\) and monotonically decreasing on \((m^*,1)\), which implies that this \(m^*\) is the unique global maximum point and that it is contained in (0, 1). A similar argument shows that if \(h < 0\), then there is a unique global maximum point contained in \((-1,0)\).
For the second type of interaction, we consider even functions \(g \in C_b ([-1,1])\) such that \({\psi ^g}\) has precisely two global maximizing points \(m^+ \in (0,1)\), and \(m^- = - m^+ \in (-1,0)\). For such even functions, by spin-flip symmetry, or by changing variables \(m \mapsto - m\), it follows that
for small enough \(\delta > 0\). In particular, by Corollary 4.3.10, it follows that
weakly. By combining this simple result with Lemma 3.1.1, and Corollary 3.1.7, we have the following result.
Theorem 3.2.6
Let \(g \in C_b ([-1,1])\) be an even function such that \(M^* (\psi ^g) = \{ m^+, m^-\}\), where \(m^+ > 0\), and \(m^- = - m^+\).
Then, it follows that
The prototypical example here is the Curie–Weiss Hamiltonian without an external field.
Example 3.2.7
Consider \(g(m):= \frac{\beta J}{2} m^2\) where \(\beta > 0\), and \(J > 0\). We have
From the form of the first derivative, we see that \({\psi ^g}\) cannot attain its maximum at either end of the interval \([-1,1]\); the maximum must thus be attained at a critical point in the open interval \((-1,1)\). There are now two options for the critical point. The first is that \(m = 0\), from which we have
Due to the sign of the second derivative, this fails to be even a local maximum when \(\beta J > \frac{1}{2}\), and whatever other critical point must be the global maximizing point if we are in this parameter range. The other case is that
when \(\beta J > \frac{1}{2}\). For other values of \(\beta J\), there is no solution to this equation, and we must conclude that the critical point \(m = 0\) corresponds to the global maximizing point. In summary, \(m^* = 0\) is a critical point for every \(\beta J > 0\). When \(\beta J > \frac{1}{2}\), it is not even a local maximizing point, so the pair of solutions \(m^\pm \) given above are the only remaining critical points; since \({\psi ^g}\) is even and must attain its maximum at a critical point, both \(m^\pm \) are global maximizing points. When \(\beta J < \frac{1}{2}\), the critical point \(m^* = 0\) is the only critical point, and it must then be the global maximizing point. If \(\beta J = \frac{1}{2}\), we can check that \(m^\pm = 0\), and thus we again have a single critical point which must be the global maximizing point.
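The case analysis of Example 3.2.7 can be checked numerically. The Python sketch below assumes that \({\psi ^g} = g + s(\cdot , 1)\) with \(s(m,1) = 1 + \ln (1 + \sqrt{1 - m^2})\); this closed form is an assumption of the sketch, chosen because it reproduces the threshold \(\beta J = \frac{1}{2}\) stated above, and it is not quoted from the text. The function names are hypothetical. Under this assumption, the nonzero critical points solve \(t (1 + t) = 1/(\beta J)\) with \(t = \sqrt{1 - m^2}\).

```python
import math

def entropy(m):
    """Assumed closed form of s(m, 1); not taken from the text."""
    return 1.0 + math.log(1.0 + math.sqrt(1.0 - m * m))

def psi(m, beta_j):
    """Assumed tilting function for g(m) = (beta J / 2) m^2."""
    return 0.5 * beta_j * m * m + entropy(m)

def nonzero_critical_point(beta_j):
    """Positive solution of d psi / dm = 0, or None when beta J <= 1/2."""
    if beta_j <= 0.5:
        return None
    t = 0.5 * (-1.0 + math.sqrt(1.0 + 4.0 / beta_j))  # t = sqrt(1 - m^2)
    return math.sqrt(1.0 - t * t)

def grid_argmax(beta_j, points=20001):
    """Brute-force maximizer of psi on a uniform grid of [-1, 1]."""
    grid = [-1.0 + 2.0 * i / (points - 1) for i in range(points)]
    return max(grid, key=lambda m: psi(m, beta_j))

for beta_j in (0.25, 1.0, 2.0):
    print(beta_j, nonzero_critical_point(beta_j), abs(grid_argmax(beta_j)))
```

For \(\beta J \le \frac{1}{2}\) the grid search returns a point at the origin, while for larger \(\beta J\) it lands, up to sign, on the nonzero critical point.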
The interactions described here are ones which can be dealt with without any further study of the structure of the function \({\psi ^g}\). When there are no symmetries or unique global maximum points, one has to resort to other methods to resolve the limits. We will now present such methods for dealing with sufficiently smooth interaction functions that have multiple global maximizing points.
3.3 Exact Integral Representations of the Weights and Full Classification of the Infinite-Volume Gibbs States
We will need a preliminary result concerning the microcanonical partition function in order to have better control of the mixture probability measures. We have the following generating function based representation of the microcanonical partition function.
Lemma 3.3.1
Let \((m, \rho ) \in \mathcal {A}\).
Then, it follows that
where \(s:\mathcal {A} \times [0, 2 \pi ) \times [0, 2 \pi )\) is given by
The proof of this representation, see Sect. 4.5, follows by using the convolution structure of the microcanonical partition function and identifying the generating function to be the product of modified Bessel functions of the second kind. The proof is concluded by differentiation of these Bessel functions.
In the previous result, we introduced the overloaded s function by adding an angular dependence. We will differentiate between these functions by always specifying, in one form or another, the number of arguments the function takes.
In the following, we will specialize to functions g that are infinitely continuously differentiable and for which \({\psi ^g}\) attains its global maximum at finitely many points in the interval \((-1,1)\). In light of Corollary 4.3.10, our goal is to study quantities of the form
Using Lemma 3.3.1, it follows that
where we have introduced the overloaded function \({\psi ^g}: (-1,1) \times [0, 2 \pi ) \times [0, 2 \pi )\) given by
We see that the integral in Eq. (3.3.1) takes the form of a Laplace-type integral in three variables, and we expect that the local structure around the global maximum points of the overloaded function \({\psi ^g}\) determines the exponential asymptotics of such integrals precisely.
To that end, we present the following result which contains the relevant information concerning the structure and local asymptotics of the overloaded \({\psi ^g}\) function.
Lemma 3.3.2
Suppose that \({\psi ^g}\) has a local maximizing point \(m^*\) contained in the interval \((m^* - \delta , m^* + \delta )\), and there exists \(k \in \mathbb {N}\) such that \(\partial ^{2k} [\psi ^g] (m^*) < 0\) and \(\partial ^j [\psi ^g] (m^*) = 0\) for all \(1 \le j \le 2k - 1\).
Then, it follows that
where
In addition,
The proof of this result, see Sect. 4.5, follows by developing the Taylor polynomial of the overloaded \({\psi ^g}\) function around the point \((m^*, 0, 0)\), and using the fact that odd derivatives of cosines vanish when evaluated at 0. The second statement simply follows by taking the limit.
From the previous result, we see that it is pertinent to introduce the following classification, which is directly adapted from [2], of the global maxima of \({\psi ^g}\).
Definition 3.3.3
A global maximum point \(m^* \in (-1,1)\) of \({\psi ^g}\) is said to be of type \(k(m^*) \in \mathbb {N}\) if \(\partial ^{2k} [\psi ^g](m^*) < 0\) and \(\partial ^j [\psi ^g] (m^*) = 0\) for all \(1 \le j \le 2k - 1\).
For a finite collection of global maximum points \(M^* (\psi ^g) \subset (-1, 1)\) of \({\psi ^g}\), the maximal type \(k_\infty (\psi ^g)\) is given by \(k_\infty (\psi ^g) = \max _{m^* \in M^* (\psi ^g)} k (m^*)\). The collection of global maximum points of maximal type \(M_\infty ^* (\psi ^g)\) is given by \(M_\infty ^* (\psi ^g):= \{ m^* \in M^* (\psi ^g): k(m^*) = k_\infty (\psi ^g) \}\).
Combining together Lemmas 3.3.1 and 3.3.2, and the form given in Eq. (3.3.1), we have the following asymptotic result.
Lemma 3.3.4
Suppose that \({\psi ^g}\) has a unique maximizing point \(m^* \in (m^* - \delta , m^* + \delta )\) of type \(k \in \mathbb {N}\).
Then, it follows that
The proof of this result, see Sect. 4.5, is a standard application of the multivariate Laplace method.
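The role of the type k in such Laplace asymptotics can be seen in a one-dimensional caricature, which is not the three-variable integral of Eq. (3.3.1). The Python sketch below assumes a model exponent \(- c x^{2k}\) near the maximizing point and compares a midpoint-rule quadrature of \(\int _{-\delta }^{\delta } e^{-n c x^{2k}} dx\) with the whole-line value \(\Gamma (1/(2k)) / (k (n c)^{1/(2k)})\). The function names and the parameters c and \(\delta \) are hypothetical.

```python
import math

def laplace_integral(n, k, c=1.0, delta=1.0, steps=20_000):
    """Midpoint-rule value of the integral of exp(-n c x^(2k)) over [-delta, delta]."""
    h = 2.0 * delta / steps
    total = 0.0
    for i in range(steps):
        x = -delta + (i + 0.5) * h
        total += math.exp(-n * c * x ** (2 * k))
    return total * h

def laplace_leading_term(n, k, c=1.0):
    """Exact value of the same integrand integrated over the whole real line."""
    return math.gamma(1.0 / (2 * k)) / (k * (n * c) ** (1.0 / (2 * k)))

n = 500
for k in (1, 2, 3):
    # The ratio should be close to 1, and the integral decays like n^(-1/(2k)).
    print(k, laplace_integral(n, k) / laplace_leading_term(n, k))
```

Since \(n^{-1/(2k)}\) decays more slowly for larger k, a maximizing point of higher type contributes a larger integral at the same exponential order, which suggests why the maximal type plays a distinguished role in the weights.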
From the previous result, denote \(W_n (g, m^*, \delta )\) to be the quantity given by
and its limit \(W (g, m^*)\) given by
To resolve the weak convergence of the mixture measure, using both Lemma 3.3.4 and Corollary 4.3.10, we compute
from which it follows that
Following this computation, we have the following result.
Theorem 3.3.5
Let \(g \in C_b ([-1,1])\) be an infinitely continuously differentiable function such that \(\psi ^g\) has finitely many global maximizing points \(M^* (\psi ^g) \subset (-1,1)\) of finite type.
Then, it follows that
To finish, we can directly compute the following
this implies that the integral containing these terms does not depend on g, other than through the value of the global maximizing point. In addition, it is immediate that the factor \(e^{\psi ^g (m^*) - g (m^*)}\) does not depend on g either. Furthermore, we immediately have
We can thus combine all factors not depending functionally on g into a single function \(C^k: (-1,1) \rightarrow (0, \infty )\) given by
so that
Using Corollary 3.1.7, Theorem 3.3.5, Lemma 3.1.1, and the form of the weights \(W^g (m^*)\) given above, we have the final result.
Theorem 3.3.6
Let \(g \in C_b ([-1,1])\) be an infinitely continuously differentiable function such that \(\psi ^g\) has finitely many global maximizing points \(M^* (\psi ^g) \subset (-1,1)\) of finite type, and let \(k_\infty := k_\infty (\psi ^g)\).
Then, it follows that
This concludes the presentation of the main results of this paper.
4 Intermediate Results and Proofs
This section contains proofs of some of the results in Sect. 3, and some collections of intermediate results and theory that are required.
4.1 Microcanonical Probability Measures
To motivate the rigorous definition of the microcanonical ensemble and its associated probability measure, consider the following formal calculation
where the pair \((m, \rho ) \in \mathcal {A}\), \(f: \mathbb {R}^n \rightarrow \mathbb {R}\) is a sufficiently regular function, and \(\sigma \phi \) is notation for the multiplication map defined by \((\sigma \phi )_i:= \sigma _i \phi _i\). Note that the integral in the sum is a product of two integrals since the index sets \(\sigma ^{-1} \{ +1 \}\) and \(\sigma ^{-1} \{ -1 \}\) are trivially disjoint. Note that the primary formal rule we have made use of is the following one
for an invertible linear map \(T: \mathbb {R}^k \rightarrow \mathbb {R}^k\), and elements \(x,y \in \mathbb {R}^k\).
To make this formal calculation rigorous, we need to define integrals over scaled simplexes in arbitrary dimensions. To do this, we introduce the so-called flag coordinates \(\phi ': \mathbb {R}^k \rightarrow \mathbb {R}^k\) given by
Note that \(\phi ' ([0, \infty )^k) = \{ \phi \in [0, \infty )^k: \phi _1 \le \phi _2 \le ... \le \phi _k \}\), \(\det (\phi ') = 1\), and the inverse function of \(\phi '\) is given by
where we take the convention that \(\phi '_0:= 0\).
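A minimal Python sketch of these coordinates follows. It assumes that the flag coordinates are the partial sums \(\phi '_j = \phi _1 + \cdots + \phi _j\), which is consistent with the stated image of \([0, \infty )^k\), the unit determinant, and the convention \(\phi '_0:= 0\); the function names are hypothetical.

```python
from itertools import accumulate

def flag(phi):
    """Assumed flag coordinates: partial sums phi'_j = phi_1 + ... + phi_j.

    The map is linear and lower triangular with ones on the diagonal,
    so its determinant equals 1.
    """
    return list(accumulate(phi))

def flag_inverse(phi_prime):
    """Inverse map: successive differences, with the convention phi'_0 := 0."""
    previous = [0.0] + list(phi_prime[:-1])
    return [current - before for current, before in zip(phi_prime, previous)]

def is_ordered(values):
    """Check 0 <= v_1 <= v_2 <= ... <= v_k."""
    return values[0] >= 0.0 and all(a <= b for a, b in zip(values, values[1:]))

phi = [1.0, 2.0, 0.5]
phi_prime = flag(phi)
print(phi_prime, is_ordered(phi_prime), flag_inverse(phi_prime))
```

In these coordinates the constraint \(\phi _1 + \cdots + \phi _k = r\) fixes only the last coordinate \(\phi '_k = r\), which is what makes integrals over the scaled simplex tractable.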
The connection between the flag coordinates and the integrals over simplexes can be seen from the following formal calculation
where \(r > 0\), and \(f: \mathbb {R}^n \rightarrow \mathbb {R}\) is a sufficiently regular function.
From this formal calculation, we produce the following definition.
Definition 4.1.1
For a finite index set I and \(r > 0\), the measure \(S_I (r)\) on \([0, \infty )^I\) corresponding to the integral over an \((|I|-1)\)-dimensional r-scaled simplex on the index set I is given by its action on \(f \in C_b ([0, \infty )^I)\) given by
where \(\{ i_k \}_{k=1}^{|I|}\) is some enumeration of I.
For future use, whenever it is clear that we are either referring to the measure or the normalization constant, we will use the following notation
where the right-hand side follows by direct computation. Using dominated convergence, it is also clear that the mapping \(r \mapsto S_I (r) [f]\) is continuous for every \(f \in C_b ([0, \infty )^I)\).
To show that Definition 4.1.1 is independent of the enumeration of I given above, we will use a Lebesgue-absolutely continuous approximation of \(S_I (r)\). Let \(g: [0, \infty ) \rightarrow \mathbb {R}\) be a measurable function such that
It follows that
where \(f \in C_b ([0, \infty )^I)\). Now, consider the family \(\{ g_\varepsilon \}_{\varepsilon > 0}\) given by
Fix \(r > 0\). Since \(f \in C_b ([0, \infty )^I)\), as stated before, one can verify that \(S_I (\cdot ) [f] \in C ([0, \infty ))\). It follows that
We see that the left-hand side of the above equality will inherit properties from the right-hand side limiting term. In particular, the measure given by its action on \(f \in C_b ([0, \infty )^I)\) given by
is independent of any enumeration of I, and it is label permutation invariant. It follows that the measure \(S_I (r)\) is independent of the given enumeration in the definition, and it is label permutation invariant.
We can now define the microcanonical probability measure using Definition 4.1.1.
Definition 4.1.2
The measure \(Z_n (M, N)\) is given by its action on \(f \in C_b (\mathbb {R}^n)\) given by
where \(\otimes (\cdot )\) is the tensor product of two measures, \(f \circ \sigma \) is the composition of the multiplication map \(\sigma \) with f, and we take the necessary convention that
if \(\sigma = (1,1,...,1)\) or \(\sigma = (-1,-1,...,-1)\) whenever \((M,N) \in \mathcal {A}\).
This last convention implies that the “first” and “last” terms are not included in the sum, but we have left them in to save space on notation.
To conclude this section, we will, finally, give the definition of the microcanonical probability measure.
Definition 4.1.3
For \((m, \rho ) \in \overline{\mathcal {A}} \setminus \{ 0 \}\), the probability measure \(\nu _n (m,\rho )\) on \(\mathbb {R}^n\) corresponding to the microcanonical probability measure is defined by its action on \(f \in C_b (\mathbb {R}^n)\) given by
and the microcanonical partition function, acting as the normalization constant \(Z_n (mn, \rho n)\) is given by
which can be verified by direct computation.
To make the microcanonical probability measure computationally tractable, we will utilize a similar Lebesgue-absolutely continuous approximation as for the integrals over the simplex. However, as opposed to the approximation for the integrals over the simplexes, one must be more careful here. Using the family of functions \(\{ g_\varepsilon \}_{\varepsilon > 0}\) from Eq. (4.1.1), observe that
where \((m, \rho ) \in \mathcal {A}\), and \(\sigma \) does not consist of all 1’s or all \(-1\)’s. Now, if we consider instead the right-hand side first, then it makes sense even when \(\sigma \) consists of all 1’s or \(-1\)’s. In that instance, the argument of one of the \(g_\varepsilon \) will not integrate over any \(\phi \)-variables, and for small enough \(\varepsilon > 0\) the indicator function vanishes. Summing over the \(\sigma \), in this case, it then follows that
Returning now to the microcanonical probability measure, we see that it inherits the various properties of the measure with action on \(f \in C_b (\mathbb {R}^n)\) given by
In particular, it is label permutation invariant. Furthermore, this approximation will be used for some calculations related to the microcanonical probability measure.
4.2 Relative Entropy and Local Observables
We begin with the proof of a type of generalized dominated convergence theorem.
Proof of Lemma 3.1.1
The condition that K is a continuity set of \(\mu \) implies that
and the condition that \({\text {supp}} (\mu ) \subset K\) implies that \(\mu (K) = 1\).
Next, we have the following two simple inequalities
and
Since K is a continuity set of \(\mu \), using the continuity set definition of weak convergence, it follows that \(\mu _n\) conditioned to K converges weakly to \(\mu \) conditioned to K. Transitioning to the continuous bounded form of weak convergence, it follows that
For completeness, we have the following final inequality
Combining together all three inequalities, the result follows. \(\square \)
We will need the relative entropy between two absolutely continuous probability measures.
Definition 4.2.1
Let X be a Polish space, and let \(\mu \) and \(\nu \) be probability measures on X. If \(\mu \) is absolutely continuous with respect to \(\nu \), the relative entropy \(\mathcal {H}(\mu || \nu )\) is given by
If \(\mu \) is not absolutely continuous with respect to \(\nu \), we set \(\mathcal {H} (\mu || \nu ) = \infty \).
We will need the following properties of relative entropy.
Theorem 4.2.2
Let X be a Polish space, and let \(\mu \) and \(\nu \) be probability measures on X such that \(\mu \) is absolutely continuous with respect to \(\nu \).
-
For any \(\mu \) and \(\nu \) satisfying the assumptions
$$\begin{aligned} \mathcal {H}(\mu || \nu ) \ge 0 . \end{aligned}$$
-
For any \(\mu \) and \(\nu \) satisfying the assumptions
$$\begin{aligned} \sup _{f \in M_b (X), \ || f ||_\infty \le 1} |\mu [f] - \nu [f]| \le \sqrt{2 \mathcal {H}(\mu || \nu )}, \end{aligned}$$ where \(M_b (X)\) is the space of measurable bounded functions on X.
-
If \(X = Y^n\), where Y is another Polish space, and \(\nu = \otimes _{k=1}^n \lambda \), where \(\lambda \) is a probability measure on Y, it follows that
$$\begin{aligned} \mathcal {H}_I (\mu || \nu ) + \mathcal {H}_J (\mu || \nu ) \le \mathcal {H}_{I \cup J} (\mu || \nu ) + \mathcal {H}_{I \cap J} (\mu || \nu ) , \end{aligned}$$ where \(I,J \subset \{ 1,2,...,n\}\), and \(\mathcal {H}_I (\mu || \nu )\) denotes the relative entropy of the I:th marginal distributions of \(\mu \) and \(\nu \).
The first and third properties are discussed and given proofs in [23]. The second property is sometimes referred to as Pinsker’s inequality and references to proofs and other details concerning this inequality can be found in [24].
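For discrete distributions, non-negativity and Pinsker's inequality are easy to check by hand. The Python sketch below uses the total-variation form of Pinsker's inequality, \(\sup _A |\mu (A) - \nu (A)| \le \sqrt{\mathcal {H}(\mu || \nu ) / 2}\), where the supremum runs over events A. The two distributions `p` and `q` and the function names are hypothetical choices made for illustration.

```python
import math

def relative_entropy(p, q):
    """H(p || q) for discrete distributions with q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def total_variation(p, q):
    """Supremum over events A of |p(A) - q(A)|, equal to half the l1 distance."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

entropy_value = relative_entropy(p, q)
print(entropy_value >= 0.0)                                    # non-negativity
print(total_variation(p, q), math.sqrt(entropy_value / 2.0))   # Pinsker: left <= right
```

The relative entropy vanishes when the two distributions coincide, in which case both sides of Pinsker's inequality are zero.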
We can now give a proof of the fundamental inequality connecting the constrained and non-constrained ensemble probability measures.
Proof of Lemma 3.1.2
Using Eq. (4.1.4), we can compute the integral over only the first 2 variables leaving the other \(n-2\) variables fixed. We compute
Taking the limit, it follows that
Accounting for the normalization, the \((n-2)\):th marginal of the microcanonical probability measure is given by
where \(d \phi _{n-2}\) is the \((n-2)\)-dimensional Lebesgue measure. Note the factor of 2 vanishes due to the presence of a factor of \(\frac{1}{2}\) in the partition function. It follows that
The relative entropy is then directly computed to be
Using label permutation invariance, one can directly compute that
In summary, we have
To continue, by Theorem 4.2.2, it follows that
By label permutation invariance, it follows that
Since I is finite, it follows that there exists \(k \in \mathbb {N}\) such that \((k - 1) |I| \le n - 2 < k |I|\). Since \(\eta _n (\beta , \mu )\) is a product measure, using Theorem 4.2.2, it follows that
Combining these inequalities together, it follows that
as desired. \(\square \)
4.3 Large Deviations and Weak Convergence
We begin with the standard key definitions of large deviations theory. Note that these definitions are either the same or slightly modified versions of the same results and definitions found in [22]. In addition, the results concerning convexity are either provided in [22], or we refer to [20] for a more detailed analysis of convex objects.
In the following \(\{ P_n \}_{n = 1}^\infty \) is a sequence of probability measures on a Polish space X.
Definition 4.3.1
A function \(I: X \rightarrow [0, \infty ]\) is called a rate function if it satisfies the following properties
-
\(I(x) < \infty \) for at least one \(x \in X\).
-
I is lower semi-continuous.
-
I has compact level sets.
In the following, we use the notation \(I(A):= \inf _{x \in A} I(x)\).
Definition 4.3.2
A sequence of probability measures \(\{ P_n \}_{n = 1}^\infty \) is said to satisfy a large deviations principle with rate function I if it satisfies the following properties
-
For all closed sets \(C \subset X\), we have
$$\begin{aligned} \limsup _{n \rightarrow \infty } \frac{1}{n} \ln P_n (C) \le - I (C) . \end{aligned}$$
-
For all open sets \(O \subset X\), we have
$$\begin{aligned} \liminf _{n \rightarrow \infty } \frac{1}{n} \ln P_n (O) \ge - I (O) . \end{aligned}$$
Now, we specialize to probability distributions on \(\mathbb {R}^d\). In the following, let \(\{ m_n \}_{n=1}^\infty \) be a sequence of random variables on \(\mathbb {R}^d\), and we set \(P_n (A):= \mathbb {P}(m_n \in A)\). The moment generating functions \(\varphi _n: \mathbb {R}^d \rightarrow (0, \infty ]\) are given by \(\varphi _n (t):= \mathbb {E} e^{ \left\langle t, m_n \right\rangle }\). In the following, we assume the existence of a function \(\Lambda : \mathbb {R}^d \rightarrow [- \infty , \infty ]\) given by
and that this function satisfies \(0 \in {\text {int}}(\mathcal {D} (\Lambda ))\) where \(\mathcal {D}(\Lambda ):= \{ t \in \mathbb {R}^d: \Lambda (t) < \infty \}\). For such a function, it follows that \(\Lambda \) is convex and \(\Lambda (t) > - \infty \) for all \(t \in \mathbb {R}^d\). A convex function \(\Lambda : \mathbb {R}^d \rightarrow [- \infty , \infty ]\) is called proper if \(\Lambda (t) > - \infty \) for all \(t \in \mathbb {R}^d\), and there exists at least one point \(t_0 \in \mathbb {R}^d\) such that \(\Lambda (t_0) < \infty \). It is clear that when \(\Lambda \) is the limit of the scaled logarithmic moment generating functions, then it is a proper convex function.
We will need the Legendre transform of \(\Lambda \).
Definition 4.3.3
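The Legendre transform \(\Lambda ^*: \mathbb {R}^d \rightarrow [- \infty , \infty ]\) of a function \(\Lambda : \mathbb {R}^d \rightarrow [-\infty , \infty ]\) is given by
$$\begin{aligned} \Lambda ^* (x):= \sup _{t \in \mathbb {R}^d} \left\{ \left\langle t, x \right\rangle - \Lambda (t) \right\} . \end{aligned}$$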
For \(\Lambda \) given by the limit of the scaled logarithmic moment generating functions, it follows that \(\Lambda ^*\) is a convex rate function. In particular, we see that the range of \(\Lambda ^*\) must be contained in \([0, \infty ]\).
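As a small numerical illustration of Definition 4.3.3, the following sketch approximates the Legendre transform of a one-dimensional example that is not taken from this paper: the logarithmic moment generating function \(\Lambda (t) = - \ln (1 - t)\) of a unit-rate exponential random variable, for which \(\Lambda ^* (x) = x - 1 - \ln x\) when \(x > 0\). The function names and the search bracket are assumptions made for the illustration only.

```python
import math

def log_mgf_exp1(t):
    """Lambda(t) = -ln(1 - t) for t < 1 and +infinity otherwise (illustrative example)."""
    return -math.log(1.0 - t) if t < 1.0 else math.inf

def legendre_transform(Lambda, x, lo=-50.0, hi=1.0, iters=200):
    """Approximate sup_t { t*x - Lambda(t) } by ternary search on [lo, hi].

    This relies on t -> t*x - Lambda(t) being concave. The bracket [lo, hi]
    is an assumption suited to this example: the maximizer t = 1 - 1/x must
    lie inside it, which holds for x >= 1/51.
    """
    g = lambda t: t * x - Lambda(t)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            lo = m1   # the maximizer lies to the right of m1
        else:
            hi = m2   # the maximizer lies to the left of m2
    return g(0.5 * (lo + hi))

for x in (0.5, 1.0, 2.0):
    closed_form = x - 1.0 - math.log(x)
    print(x, legendre_transform(log_mgf_exp1, x), closed_form)
```

In this example the transform vanishes only at \(x = 1\), the mean of the underlying random variable, which is the typical behaviour of a rate function with a unique zero.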
To specify the form of the Gärtner-Ellis theorem that we wish to utilize, we need the concept of essential smoothness.
Definition 4.3.4
A proper convex function \(\Lambda : \mathbb {R}^d \rightarrow (-\infty , \infty ]\) is called essentially smooth if it satisfies the following properties
-
\({\text {int}} (\mathcal {D} (\Lambda )) \not = \emptyset \).
-
\(\Lambda \) is differentiable on \({\text {int}} (\mathcal {D} (\Lambda ))\).
-
Either \(\mathcal {D}(\Lambda ) = \mathbb {R}^d\) or, for any \(t^* \in \partial \mathcal {D}(\Lambda )\), it follows that \(\lim _{t \rightarrow t^*} || \nabla [\Lambda ] (t)|| = \infty \).
We can now give the essentially smooth form of the Gärtner-Ellis theorem.
Theorem 4.3.5
Let \(\Lambda : \mathbb {R}^d \rightarrow (-\infty , \infty ]\) be an essentially smooth lower semi-continuous function.
It follows that \(\{ P_n \}_{n=1}^\infty \) satisfies a large deviations principle with rate function \(\Lambda ^*\).
It is typical to introduce the notion of strict convexity of a function, but we will instead directly introduce the notion of a Legendre-type function.
Definition 4.3.6
A proper convex lower semi-continuous function \(\Lambda : \mathbb {R}^d \rightarrow (-\infty , \infty ]\) is said to be of Legendre-type if it is both essentially smooth and strictly convex on \({\text {int}} (\mathcal {D} (\Lambda ))\).
The primary feature of Legendre-type functions that we will use is that the gradient of such a function \(\Lambda \) is a bijection between \({\text {int}} (\mathcal {D} (\Lambda ))\) and \({\text {int}} (\mathcal {D} (\Lambda ^*))\).
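The bijection property can be checked by hand in a one-dimensional example that is not taken from this paper. For \(\Lambda (t) = - \ln (1 - t)\) on \((- \infty , 1)\), the Legendre transform is \(\Lambda ^* (x) = x - 1 - \ln x\) on \((0, \infty )\), and the sketch below verifies that the two gradient maps are mutually inverse. The function names are hypothetical.

```python
def grad_Lambda(t):
    """Derivative of Lambda(t) = -ln(1 - t) on int D(Lambda) = (-inf, 1)."""
    if t >= 1.0:
        raise ValueError("t must lie in the interior of the domain, t < 1")
    return 1.0 / (1.0 - t)

def grad_Lambda_star(x):
    """Derivative of Lambda*(x) = x - 1 - ln(x) on int D(Lambda*) = (0, inf)."""
    if x <= 0.0:
        raise ValueError("x must lie in the interior of the domain, x > 0")
    return 1.0 - 1.0 / x

# Round trip t -> x = Lambda'(t) -> (Lambda*)'(x), which should return t.
for t in (-5.0, 0.0, 0.5, 0.99):
    x = grad_Lambda(t)
    print(t, x, grad_Lambda_star(x))
```

The derivative of \(\Lambda \) also diverges as \(t \rightarrow 1^-\), which is the steepness condition of Definition 4.3.4 in this example.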
We can now prove the following general theorem.
Theorem 4.3.7
Let \(\{ Z_n \}_{n=1}^\infty \) be a sequence of functions \(Z_n: n \mathcal {A} \rightarrow (0, \infty )\), where \(\mathcal {A} \subset \mathbb {R}^d\) is a non-empty open convex set such that each \(Z_n\) is log-concave, and
for each \(x \in \mathcal {A}\). Denote by \(s_n: \mathcal {A} \rightarrow (- \infty , \infty )\) the function given by
In addition, suppose that the function \(f: \mathbb {R}^d \rightarrow [- \infty , \infty ]\) given by
exists, where \(Q_n: \mathbb {R}^d \rightarrow (0, \infty ]\) are given by
and there exists a non-empty open convex set \(\mathcal {B} \subset \mathbb {R}^d\) such that \(\mathcal {D} (Q_n) = {\text {int}} (\mathcal {D} (f)) = \mathcal {B}\).
If f is a proper convex lower semi-continuous function of Legendre type such that \(-\nabla [f] \mathcal {B} = \mathcal {A}\) then the function \(s: \mathcal {A} \rightarrow \mathbb {R}\) given by the limit
exists, and satisfies
for any compact set \(K \subset \mathcal {A}\).
Proof
For the first step, let \(t_0 \in \mathcal {B}\) be any base point, and we define the sequence of probability measures \(\{ P_n \}_{n = 1}^\infty \) on \(\mathbb {R}^d\) by setting
where \(A \subset \mathbb {R}^d\) is Borel measurable.
The moment generating function \(\varphi _n: \mathbb {R}^d \rightarrow (0, \infty ]\) of the random variable \(m_n\) on \(\mathbb {R}^d\) with distribution given by \(P_n\) is given by
The limit of the scaled logarithmic moment generating function \(\Lambda : \mathbb {R}^d \rightarrow [- \infty , \infty ]\) is given by
Since \(\Lambda \) inherits its properties from f, it follows that \(\Lambda \) exists, is a proper convex lower semi-continuous function of Legendre-type, and satisfies \(0 = t_0 - t_0 \in {\text {int}} (\mathcal {D} (\Lambda )) = t_0 - {\text {int}} (\mathcal {D}(f)) = t_0 - \mathcal {B}\). It follows that \(\{ P_n \}_{n \in \mathbb {N}}\) satisfies a large deviations principle with rate function \(\Lambda ^*\). Since \(\Lambda \) is of Legendre-type, it follows that \({\text {int}} (\mathcal {D} (\Lambda ^*)) = \nabla [\Lambda ] ({\text {int}} (\mathcal {D} (\Lambda ))) = - \nabla [f] \mathcal {B} = \mathcal {A}\).
Let \(y \in {\text {int}} (\mathcal {D} (\Lambda ^*)) = \mathcal {A}\). Since \(\Lambda ^*\) is convex, it follows that it is continuous on \(\mathcal {A}\) and thus the compact balls \(\overline{B}(y, \delta )\) for small enough \(\delta > 0\) are continuity sets from which it follows that
For the second step, since each \(s_n\) is concave and the collection \(\{ s_n \}_{n \in \mathbb {N}}\) is pointwise uniformly bounded, it follows that the collection \(\{ s_n \}_{n \in \mathbb {N}}\) is relatively compact in the compact-open topology of continuous functions. Let \(\{ s_{n_k} \}_{k = 1}^\infty \) be any locally uniformly convergent subsequence with limiting function \(s'\). Since \(\overline{B}(y, \delta )\) is a compact set, it follows that
Then we have
By combining this result with the large deviations principle, we deduce that
Now, since both functions inside the supremum and infimum respectively are continuous, letting \(\delta \rightarrow 0^+\), we obtain
Since \(s'\) was the locally uniform limit of an arbitrary convergent subsequence \(\{ s_{n_k }\}_{k = 1}^\infty \), the above result implies that this holds for any such \(s'\), and thus the limit of any convergent subsequence is the same from which it follows that
for \(x \in {\text {int}} (\mathcal {D} (\Lambda ^*)) = \mathcal {A}\), and since the \(s_n\) are concave and pointwise uniformly bounded, this convergence is automatically locally uniform. \(\square \)
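To make the type of convergence in Theorem 4.3.7 concrete, the following sketch uses a one-dimensional toy example that is not the model of this paper. We assume the normalizations \(s_n (x) = \frac{1}{n} \ln Z_n (n x)\) and \(f(t) = \lim _{n \rightarrow \infty } \frac{1}{n} \ln \int _0^\infty e^{- t N} Z_n (N) \, dN\), and we take \(Z_n (N) = N^{n-1} / (n-1)!\), which is log-concave on \((0, \infty )\). In this case \(f(t) = - \ln t\) on \(\mathcal {B} = (0, \infty )\), and the candidate limit is \(\inf _{t > 0} \{ f(t) + t x \} = 1 + \ln x\). All names below are hypothetical.

```python
import math

def s_n(x, n):
    """(1/n) ln Z_n(n x) for the toy choice Z_n(N) = N^(n-1) / (n-1)!."""
    return ((n - 1) * math.log(n * x) - math.lgamma(n)) / n

def s_limit(x):
    """inf over t > 0 of -ln(t) + t*x, attained at t = 1/x."""
    return 1.0 + math.log(x)

# Largest deviation over a few points of the compact set K = [0.5, 4].
K = (0.5, 1.0, 2.0, 4.0)
for n in (10, 1000, 10**6):
    print(n, max(abs(s_n(x, n) - s_limit(x)) for x in K))
```

The printed deviations decrease with n on the chosen points of K, in line with the locally uniform convergence asserted in the theorem.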
Let us also give a quick proof of the following weak convergence result concerning large deviations principles.
Theorem 4.3.8
Let \(\{ P_n \}_{n = 1}^\infty \) be a sequence of probability measures on X satisfying a large deviations principle with rate function I.
It follows that
where \(\mathcal {P} (X)\) is the space of Borel probability measures on X.
Proof
Let us first show that \(I^{-1} \{ 0 \}\) is non-empty and closed. Since I has compact level sets, it follows that \(I^{-1} [0,c]\) are compact for \(c > 0\), but possibly empty. If they are not empty, then \(I^{-1} \{ 0 \} = \bigcap _{n=1}^\infty I^{-1} \left[ 0, \frac{1}{n}\right] \), and it follows directly that \(I^{-1} \{ 0 \}\) is non-empty and compact. However, if \(I^{-1} [0, c]\) is empty for some \(c > 0\), observe that
which is a contradiction, and thus \(I^{-1} [0,c]\) is non-empty for every \(c > 0\), and subsequently also for \(c = 0\). Note that the first line of the above proof by contradiction follows from the fact that \(\{ P_n \}_{n = 1}^\infty \) satisfies a large deviations principle.
Let \(y \not \in I^{-1} \{ 0 \}\) be such that \(\overline{B}(y, \delta )\) is disjoint from \(I^{-1} \{ 0 \}\) for small enough \(\delta > 0\). Note that
The last strict inequality follows since by lower semi-continuity I attains its minimum on any non-empty compact set, and I is strictly positive on the set \(\overline{B}(y, \delta )\). It follows that
so that
Since the sequence of probability measures satisfies a large deviations principle, it is exponentially tight, which implies that it is uniformly tight in the weak sense. Let \(\{ P_{n_k} \}_{k = 1}^\infty \) be any weakly convergent subsequence with limiting probability measure P. With \(\overline{B} (y, \delta )\) as before, weak convergence implies that
Since \(y \not \in I^{-1} \{ 0 \}\) is arbitrary, it follows that
\(\square \)
For the purposes of this paper, the most important corollary is the case where \(I^{-1} \{ 0 \}\) consists of a single point.
Corollary 4.3.9
Let \(\{ P_n \}_{n=1}^\infty \) be a sequence of probability measures on X satisfying a large deviations principle with rate function I such that \(I(x^*) = 0\) for exactly one \(x^* \in X\).
It follows that
$$\begin{aligned} \lim _{n \rightarrow \infty } P_n = \delta _{x^*} \end{aligned}$$
weakly.
The proof of this statement is an application of the previous theorem in combination with Prokhorov’s theorem.
Another important corollary is the following result concerning the case where \(I^{-1} \{ 0 \}\) consists of finitely many points.
Corollary 4.3.10
Let \(\{ P_n \}_{n=1}^\infty \) be a sequence of probability measures on X satisfying a large deviations principle with rate function I such that the set \(M^*:= I^{-1} \{ 0 \}\) is finite.
It follows that
for any \(0< \delta < \min _{x^*, y^* \in M^*, \, x^* \not = y^*} d(x^*, y^*)\).
Proof
Let \(0< \delta < \min _{x^*, y^* \in M^*, \, x^* \not = y^*} d(x^*,y^*)\). We decompose X as follows
where
Using this decomposition, we have
Using the large deviations principle, it follows that
and, by using the previous corollary, it follows that
weakly, where \(x^* \in M^*\). Using these limits together, we obtain
\(\square \)
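The concentration of mass on the zero set of the rate function can be observed numerically in a toy family that is not taken from this paper. We take probability densities proportional to \(e^{- n I(x)}\) on \([-3, 3]\) with \(I(x) = (x^2 - 1)^2\), so that \(M^* = \{ -1, 1 \}\), and we compute the mass assigned to the \(\delta \)-balls around the two zeros. The grid, the interval, and the function name are assumptions of this sketch.

```python
import math

def mass_near_zeros(n, delta=0.25, grid=20001, lo=-3.0, hi=3.0):
    """Fraction of the mass of exp(-n I(x)) on [lo, hi] within delta of the zeros of I.

    Here I(x) = (x^2 - 1)^2, whose zeros are exactly x = -1 and x = +1.
    The integrals are approximated by sums over a uniform grid.
    """
    I = lambda x: (x * x - 1.0) ** 2
    xs = [lo + (hi - lo) * k / (grid - 1) for k in range(grid)]
    ws = [math.exp(-n * I(x)) for x in xs]
    near = sum(w for x, w in zip(xs, ws)
               if min(abs(x - 1.0), abs(x + 1.0)) < delta)
    return near / sum(ws)

for n in (1, 10, 50, 200):
    print(n, mass_near_zeros(n))
```

The value of \(\delta \) used here is smaller than the distance between the two zeros, as required in Corollary 4.3.10, and the mass outside the two balls decays exponentially in n.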
4.4 Infinite-Volume Entropies and States
Next, we prove the regularity and boundedness of the finite-volume entropies.
Proof of Lemma 3.1.4
From Eq. (4.1.3), we see that the microcanonical partition function is a homogeneous bivariate polynomial of degree \(n-2\). Let us introduce the change of coordinates \(z: \mathcal {A} \rightarrow (0, \infty )^2\) given by
It follows that \(Z_n (M,N) = \frac{1}{2} P_n (z (M,N))\), where \(P_n: (0, \infty )^2 \rightarrow (0, \infty )\) is given by
Using the properties of the binomial coefficient, we can manipulate \(P_n\) into the following form
Let us denote the coefficients of the above manipulated polynomial by \(\{ c_k \}_{k=0}^{n - 2}\). For \(k \in \mathbb {N}\), using the simple relation
it follows that
for \(0< k < n - 2\). Using [16, Example 2.3], this implies that the sequence of coefficients \(\{ c_k \}_{k=0}^{n - 2}\) is ultra log-concave, which yields that \(P_n\) is Lorentzian, which shows that \(P_n\) is log-concave, see [16, Theorem 2.30] and the definition of completely log-concave polynomials due to [25]. Since \(Z_n\) is the composition of an invertible linear map, simple scaling by a factor of 2, and a log-concave polynomial it follows that \(Z_n\) is log-concave.
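The ultra log-concavity condition on a coefficient sequence can be tested exactly with rational arithmetic. The sketch below checks only the defining inequality \(\left( c_k / \left( {\begin{array}{c}d\\ k\end{array}}\right) \right) ^2 \ge \left( c_{k-1} / \left( {\begin{array}{c}d\\ k-1\end{array}}\right) \right) \left( c_{k+1} / \left( {\begin{array}{c}d\\ k+1\end{array}}\right) \right) \) for \(0< k < d\), where d is the degree, and it assumes that the coefficients are positive. The function name is hypothetical, and the sample sequences are generic examples rather than the coefficients of \(P_n\).

```python
from fractions import Fraction
from math import comb

def is_ultra_log_concave(coeffs):
    """Return True if c_k / C(d, k) is a log-concave sequence, d = len(coeffs) - 1.

    Positivity of the coefficients is assumed and not checked here.
    """
    d = len(coeffs) - 1
    r = [Fraction(c) / comb(d, k) for k, c in enumerate(coeffs)]
    return all(r[k] ** 2 >= r[k - 1] * r[k + 1] for k in range(1, d))

print(is_ultra_log_concave([comb(6, k) for k in range(7)]))  # binomial coefficients
print(is_ultra_log_concave([1, 1, 1, 1]))                    # log-concave but not ultra log-concave
```

The second example shows that ultra log-concavity is strictly stronger than ordinary log-concavity of the coefficients.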
For boundedness, by Theorem 4.2.2, we have \(\mathcal {H}_{n - 2} (\nu _n (m, \rho ) || \eta _n (\beta , \mu )) \ge 0\), from which it follows that
which shows that the family of entropies is pointwise bounded above. As for a lower bound, it is enough to use the following trivial lower bound
from which we obtain
It follows that
as desired. \(\square \)
We continue by considering the properties of the limiting entropy \(f(\beta , \mu )\).
Proof of Lemma 3.1.6
First, we observe that
From this form, it is apparent that f is strictly convex on \(\mathcal {A}\) and thus is a proper convex function on \(\mathbb {R}^2\). For lower semi-continuity, if \((\beta ,\mu ) \in \mathbb {R}^2 {\setminus } \overline{\mathcal {A}}\), then f is lower semi-continuous for trivial reasons. In addition, since f is continuous on \(\mathcal {A}\), it is also necessarily lower semi-continuous there. For the points \((\beta , \mu ) \in \partial \mathcal {A}\), it is clear that these points are of the form \((\pm \mu ', \mu ')\) for \(\mu ' \ge 0\). It is easy to check that \(\lim _{(\beta , \mu ) \rightarrow (\pm \mu ', \mu ')} f(\beta , \mu ) = \infty \), since \(f(\beta , \mu )\) is either equal to infinity, or it is increasing without bound for points inside \(\mathcal {A}\) approaching \((\pm \mu ', \mu ')\).
As for the other properties, the non-empty interior of the domain of finiteness of f is given by \(\mathcal {A}\). The mapping f is differentiable in \(\mathcal {A}\). For steepness, which is the third property of being essentially smooth, observe that
Since all norms on \(\mathbb {R}^2\) are equivalent, it follows that there exists a constant \(C > 0\) such that
Using this estimate, it follows that
From this estimate it is now clear that if \((\beta , \mu ) \rightarrow (\pm \mu ', \mu ')\) for \(\mu ' \ge 0\) for points inside \(\mathcal {A}\), then clearly \(\lim _{(\beta , \mu ) \rightarrow (\pm \mu ', \mu ')} || \nabla [f] (\beta , \mu )|| = \infty \), which shows steepness.
In summary, we find that f is a proper convex lower semi-continuous function of Legendre type.
For the next few computational steps, it is useful to introduce the change of variables \(g: \mathbb {R}^2 \rightarrow \mathbb {R}^2\) given by \((\beta , \mu ) \mapsto g (\beta , \mu ) = (\mu + \beta , \mu - \beta )\) so that for \((\beta , \mu ) \in \mathcal {A}\), we have
We can now equivalently consider the function \(f': (0, \infty )^2 \rightarrow \mathbb {R}\) given by
so that \(f \circ g^{-1} = f'\). For the function \(f'\) it is easy to verify that
and the inverse map can be computed from
This shows that \((- \nabla [f']) (0, \infty )^2 = (0, \infty )^2\). Finally, for \((a,b) \in (0, \infty )^2\), one can observe that
To return to the function f, we have
where D[g] is the derivative of the map g. We can also compute the following
To finish, note that we can simply compute the gradient
but its inverse map is simpler to solve from the composite function \(f'\). Doing so, we obtain
Compiling together all of these results, we find that f is a proper convex lower semi-continuous function of Legendre type which satisfies \( (- \nabla [f]) \mathcal {A} = \mathcal {A}\), and, for \((m, \rho ) \in \mathcal {A}\), we have
where
\(\square \)
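The gradient computations of this type can be sanity-checked numerically. The sketch below works under the assumption, which is ours and is not quoted from the displayed equations, that the single-site weight is \(e^{\beta \phi - \mu |\phi |}\), so that in the variables \((a,b) = (\mu + \beta , \mu - \beta )\) one has \(f'(a,b) = \ln \left( \frac{1}{a} + \frac{1}{b} \right) \). Under this assumption the negative gradient and its inverse are explicit, and the resulting entropy takes the value 1 at \((m, \rho ) = (\pm 1, 1)\). All function names are hypothetical.

```python
import math

def f_prime(a, b):
    """Assumed form f'(a, b) = ln(1/a + 1/b) on (0, inf)^2."""
    return math.log(1.0 / a + 1.0 / b)

def neg_grad_f_prime(a, b):
    """(u, v) = -grad f'(a, b), computed by hand from the assumed form."""
    s = a + b
    return (b / (a * s), a / (b * s))

def inverse_neg_grad(u, v):
    """Inverse of neg_grad_f_prime on (0, inf)^2."""
    r = math.sqrt(u * v)
    return (1.0 / (u + r), 1.0 / (v + r))

def entropy(m, rho):
    """f'(a, b) + a*u + b*v at the inverse point, with u = (rho + m)/2, v = (rho - m)/2."""
    u, v = 0.5 * (rho + m), 0.5 * (rho - m)
    return 1.0 + 2.0 * math.log(math.sqrt(u) + math.sqrt(v))

a, b = 2.0, 1.0
u, v = neg_grad_f_prime(a, b)
print((u, v), inverse_neg_grad(u, v))   # the second pair should recover (a, b)
print(entropy(0.0, 1.0), entropy(1.0, 1.0))
```

The round trip illustrates that the negative gradient maps \((0, \infty )^2\) onto itself in this assumed setting, and the relations \(m = u - v\) and \(\rho = u + v\) follow from the linear change of variables g.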
We begin with the proof of the half-constrained ensemble limiting entropy.
Proof of Lemma 3.2.1
Fix \(\beta \in \mathbb {R}\), and consider the mapping \(Q_n (g^\beta , \cdot ): (0, \infty ) \rightarrow \mathbb {R}\) given by
which, like Eq. (2.0.16), is to be understood as
By direct computation, using Eq. (4.1.3), it follows that
As for the mapping
it is enough to notice that the individual mappings in the integrand
are log-concave functions. To be more precise, the indicator function is the indicator of a convex set and is thus log-concave, the exponential function is trivially log-concave by direct computation, and, finally, the microcanonical partition function, which is to be understood as the microcanonical partition function on \(\mathcal {A}\) extended beyond this set by setting its value to 0, is log-concave by Lemma 3.1.4. It follows that the mapping
is log-concave by the Prékopa–Leindler inequality or Prékopa’s theorem, see [26, Section 9], since it is the marginal of a log-concave function.
For pointwise uniform boundedness, we begin by observing that
and
We will use the beta function \(B(z_1,z_2)\) given by
$$\begin{aligned} B(z_1, z_2):= \int _0^1 t^{z_1 - 1} (1 - t)^{z_2 - 1} \, dt \end{aligned}$$
for \({\text {Re}}(z_1), {\text {Re}}(z_2) > 0\). By a change of variables, one can see that
For integer values, we have the following identity
from which it follows that
In summary, we have
Computing the limits, it follows that
from which the uniform pointwise boundedness follows.
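For readers who wish to check beta function manipulations numerically, the following sketch compares three standard evaluations for real positive arguments: the gamma function formula \(B(z_1, z_2) = \Gamma (z_1) \Gamma (z_2) / \Gamma (z_1 + z_2)\), the factorial identity \(B(p, q) = (p-1)! (q-1)! / (p+q-1)!\) for positive integers, and a midpoint-rule approximation of the defining integral. The function names and the number of quadrature steps are choices made for this illustration.

```python
import math

def beta(z1, z2):
    """B(z1, z2) via log-gamma functions, for real z1, z2 > 0."""
    return math.exp(math.lgamma(z1) + math.lgamma(z2) - math.lgamma(z1 + z2))

def beta_integer(p, q):
    """B(p, q) = (p-1)! (q-1)! / (p+q-1)! for positive integers p, q."""
    return math.factorial(p - 1) * math.factorial(q - 1) / math.factorial(p + q - 1)

def beta_midpoint(z1, z2, steps=20000):
    """Midpoint rule for the integral of t^(z1-1) (1-t)^(z2-1) over (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += t ** (z1 - 1) * (1.0 - t) ** (z2 - 1)
    return h * total

print(beta(3, 4), beta_integer(3, 4), beta_midpoint(3, 4))  # all close to 1/60
```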
For \(\mu > |\beta |\), we can directly compute that
For any other value of \(\mu \), it is clear that the above integral is infinite. It follows that the limit and subsequent mapping given by
exists and has a domain of finiteness given by the half-infinite interval \((|\beta |, \infty )\). By using the properties of the full map \((\beta , \mu ) \mapsto f (\beta , \mu )\), already verified and computed in Lemma 3.1.5, one can verify that the mapping \(\mu \mapsto f (\beta , \mu )\) for fixed \(\beta \) is a proper convex lower semi-continuous function of Legendre type that satisfies \(- D[f(\beta , \cdot )] (|\beta |, \infty ) = (0, \infty )\). By Theorem 3.1.3, for any \(\rho > 0\), it follows that
To continue, by Lemma 3.1.5, we have
so that
For the rate function, the scaled logarithmic moment generating function \(\Lambda : \mathbb {R} \rightarrow [- \infty , \infty ]\) of a sequence of random variables with distributions given by \(\left\{ \kappa _n^\beta \right\} _{n \in \mathbb {N}}\) is given by
We can identify the first term on the last line as the convex conjugate of the restriction of a proper convex lower semi-continuous function of Legendre type with an interior of the domain of finiteness given by \((-1,1)\). From the form of the function s(m, 1), for \(m \in (-1,1)\), we immediately see that
Defining \(s (\pm 1, 1) = 1\) yields a continuous extension of s(m, 1) from \((-1,1)\) to \([-1,1]\), and we will consider it so from now on. The extended mapping given by
is upper semi-continuous, and we will consider this the redefinition of s(m, 1), to be understood from now on as a not necessarily finite function on \(\mathbb {R}\). Compiling all of this together, it follows that the mapping \(\mathbb {R} \ni m \mapsto - (s (m, 1) - \beta m)\) defines a proper convex lower semi-continuous function of Legendre type, and thus the convex conjugate is involutive from which it follows that
which is the rate function of \(\{ \kappa _n^\beta \}_{n \in \mathbb {N}}\). \(\square \)
We finish by giving the proof of the limit point result.
Proof of Lemma 3.2.3
Using Theorem 4.3.8, let \(\{ \kappa ^g_{n_k}\}_{k \in \mathbb {N}}\) be a weakly convergent subsequence with a limit \(\kappa \). Since \(M^* (\psi ^g)\) is a compact subset of \((-1,1)\), it follows that there exists \(a:= \min M^* (\psi ^g)\) and \(b:= \max M^* (\psi ^g)\). There exists \(\delta > 0\) such that \({\text {supp}} (\kappa ) \subset M^* (\psi ^g) \subset [a - \delta , b + \delta ] \subset (-1,1)\). Since \({\text {supp}} (\kappa ) \subset [a - \delta , b + \delta ]\), we deduce that \(\kappa ([a - \delta , b + \delta ]) = 1\), and, since \(\partial [a - \delta , b + \delta ] \cap {\text {supp}} (\kappa ) \subset \{ a - \delta , b + \delta \} \cap M^* (\psi ^g) = \emptyset \), we see that \(\kappa (\partial [a - \delta , b + \delta ]) = 0\). It follows that \([a - \delta , b + \delta ] \subset (-1,1)\) is a continuity set of \(\kappa \), and we can apply Lemma 3.1.1 along this subsequence with Corollary 3.1.7 to obtain the result. \(\square \)
4.5 Asymptotics of the Weights
We first establish the Laplace-type representation of the microcanonical partition function.
Proof of Lemma 3.3.1
The microcanonical partition function can be written as
which one can recognize as the convolution of two sequences with some factors in front. We consider the generating function \(G: \mathbb {C} \rightarrow \mathbb {C}\) given by
One can verify that the convolution yields a Cauchy product, and that the power series on the right define entire functions with absolutely convergent power series. We have the standard relation between the derivatives of G and its power series coefficients
Next, we use the modified Bessel function of the first kind \(I_\nu (z)\) given by
where \(\nu \in \mathbb {Z}\), and we have
Using the integral representation, see [27, Chapter 9], given by
we see that
Taking derivatives, using the general Leibniz rule, we obtain
from which it follows that
By using the given form of the overloaded s function and simplifying, we obtain the desired representation. \(\square \)
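As a numerical cross-check of the Bessel function identities used in arguments of this kind, the sketch below compares the power series of \(I_\nu (z)\) for integer \(\nu \ge 0\) with one standard integral representation, \(I_\nu (z) = \frac{1}{\pi } \int _0^\pi e^{z \cos \theta } \cos (\nu \theta ) \, d \theta \). We do not claim that this is the exact representation used in the proof above. The function names, the truncation order, and the quadrature rule are assumptions of this illustration.

```python
import math

def bessel_i_series(nu, z, terms=40):
    """Truncated series sum_k (z/2)^(2k+nu) / (k! (k+nu)!) for integer nu >= 0."""
    return sum((z / 2.0) ** (2 * k + nu)
               / (math.factorial(k) * math.factorial(k + nu))
               for k in range(terms))

def bessel_i_integral(nu, z, steps=2000):
    """Midpoint rule for (1/pi) * integral over [0, pi] of exp(z cos(theta)) cos(nu theta)."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h
        total += math.exp(z * math.cos(theta)) * math.cos(nu * theta)
    return total * h / math.pi

for nu in (0, 1, 3):
    print(nu, bessel_i_series(nu, 2.5), bessel_i_integral(nu, 2.5))
```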
We present the proof of the local asymptotics of the overloaded \({\psi ^g}\) function.
Proof
By computing the critical points of the overloaded \({\psi ^g}\) function, we see that there is precisely one critical point in the given set in the assumptions, and it is given by \((m^*,0,0)\). For this particular critical point, it is easy to see that any odd partial derivative with respect to either \(\theta _1\) or \(\theta _2\) is vanishing.
By developing \({\psi ^g}\) to second order in \((\theta _1, \theta _2)\), and to order 2k in m, it follows that
where
\(\square \)
We can now prove the full Laplace method for the mixture measures.
Proof
Let us first remark that in the following proof, we will frequently state that something holds for small enough \(\delta > 0\). In the context of this proof, this is to be understood as follows: there are finitely many conditions on \(\delta > 0\), and \(\delta \) is chosen small enough that all of them hold simultaneously. In reality this proof should be worked through “backwards” so that the choice of \(\delta > 0\) is clear.
We begin by noting that
and by using the symmetries of the trigonometric functions, it follows that
We want to show that the first integral on the second line of this manipulation is exponentially dominant. To save space, denote the integrals as follows
For the terms \(I_2\) and \(I_3\), observe that
for any \(\alpha , \beta \in [0, \frac{\pi }{2}]\) and \(m \in (m^* - \delta , m^* + \delta )\). Using this property, one can check that
By continuity of the function inside the maximum, one can check that
from which it follows that for small enough \(\delta > 0\), we have \(M_2 (\delta ) < M_1 (\delta )\). One can verify in the same way that
for small enough \(\delta > 0\). For such \(\delta \), it follows that
which shows that \(I_1 (n)\) exponentially dominates both \(I_2 (n)\) and \(I_3 (n)\).
To continue, we have
It is now clear that in the limit the terms on the right of the \(I_1(n)\) term vanish since they are exponentially small. As for the limit of the integral \(I_1 (n)\), it is solved by a routine application of Laplace’s method using the asymptotics developed in Lemma 3.3.2. First, however, we must split the integral \(I_1 (n)\) with respect to the angular variables. Denote
Since \({\psi ^g}\) attains its unique maximum at \((m^*, 0, 0)\), it follows that
If we denote
we have
Again, since the right hand side contains exponentially decreasing terms, the asymptotics will be determined by the first term on the right. Finally, by changing variables, observe that
If one looks at the remainder term displayed in Lemma 3.3.2, one finds that
and
where \(A,B,C,D,E,F > 0\) are all positive constants. For \(\delta \) satisfying
and
Ultimately, for \(\delta > 0\) chosen small enough so as to satisfy the finite number of conditions given previously, using the error bounds above, by dominated convergence, it follows that
Combining all of these results together, it follows that
\(\square \)
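The mechanism behind Laplace's method can be seen in the simplest non-degenerate one-dimensional case, which is not the integral treated in the proof above. For \(\psi (m) = 1 - \cosh (m)\), with unique maximum at \(m^* = 0\), \(\psi (0) = 0\) and \(\psi '' (0) = -1\), the sketch below compares \(\int _{- \delta }^{\delta } e^{n \psi (m)} \, dm\) with the leading-order prediction \(\sqrt{2 \pi / n}\). The function name, the choice of \(\psi \), and the quadrature parameters are assumptions of this illustration.

```python
import math

def laplace_ratio(n, delta=1.0, steps=20000):
    """Ratio of the integral of exp(n * (1 - cosh(m))) over [-delta, delta]
    to the Laplace prediction sqrt(2 * pi / n), using the midpoint rule."""
    h = 2.0 * delta / steps
    total = 0.0
    for k in range(steps):
        m = -delta + (k + 0.5) * h
        total += math.exp(n * (1.0 - math.cosh(m)))
    return h * total / math.sqrt(2.0 * math.pi / n)

for n in (25, 100, 400):
    print(n, laplace_ratio(n))
```

The ratio approaches 1 as n grows, and the contribution from outside any fixed neighbourhood of the maximizer is exponentially small, which is the same splitting into a dominant local integral and exponentially small remainders that is used in the proof.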
Data Availability
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.
References
Abramowitz, Milton: Handbook of Mathematical Functions, With Formulas, Graphs, and Mathematical Tables. Dover Publications Inc, USA (1974)
Anari, Nima, Gharan, Shayan Oveis, Vinzant, Cynthia: Log-concave polynomials, I: Entropy and a deterministic approximation algorithm for counting bases of matroids. Duke Mathematical Journal 170(16) (2021)
Berlin, T.H., Kac, M.: The spherical model of a ferromagnet. Physical Review 86(6), 821–835 (1952)
Brändén, Petter, Huh, June: Lorentzian polynomials. Annals of Mathematics 192(3) (2020)
Caputo, Pietro: Uniform Poincaré inequalities for unbounded conservative spin systems: the non-interacting case. Stochastic Processes and their Applications 106(2), 223–244 (2003)
Chatterjee, Sourav: A note about the uniform distribution on the intersection of a simplex and a sphere. Journal of Topology and Analysis 09(04), 717–738 (2017)
Amaro de Matos, J.M.G., Fernando Perez, J.: Fluctuations in the Curie-Weiss version of the random field Ising model. Journal of Statistical Physics 62(3–4), 587–608 (1991)
Hollander, Frank den: Large Deviations. American Mathematical Society (2008)
Eisele, Theodor, Ellis, Richard S.: Multiple phase transitions in the generalized Curie-Weiss model. Journal of Statistical Physics 52(1–2), 161–202 (1988)
Ellis, Richard S.: Entropy, Large Deviations, and Statistical Mechanics. Springer, Berlin Heidelberg (2006)
Ellis, Richard S., Newman, Charles M.: The statistics of Curie-Weiss models. Journal of Statistical Physics 19(2), 149–161 (1978)
Friedli, Sacha, Velenik, Yvan: Statistical Mechanics of Lattice Systems. Cambridge University Press (2017)
Gardner, R.J.: The Brunn-Minkowski inequality. Bulletin of the American Mathematical Society 39(03), 355–406 (2002)
Georgii, Hans-Otto: Gibbs Measures and Phase Transitions. De Gruyter (2011)
Großkinsky, Stefan: Equivalence of ensembles for two-species zero-range invariant measures. Stochastic Processes and their Applications 118(8), 1322–1350 (2008)
Kastner, Michael, Schnetz, Oliver: On the mean-field spherical model. Journal of Statistical Physics 122(6), 1195–1214 (2006)
Koskinen, Kalle: Infinite volume Gibbs states and metastates of the random field mean-field spherical model. Journal of Statistical Physics 190(3) (2023)
Koskinen, Kalle, Lukkarinen, Jani: Estimation of local microcanonical averages in two lattice mean-field models using coupling techniques. Journal of Statistical Physics 180(1–6), 1206–1251 (2020)
Lukkarinen, Jani: Multi-state condensation in Berlin-Kac spherical models. Communications in Mathematical Physics 373(1), 389–433 (2019)
Nam, Kyeongsik: Large deviations and localization of the microcanonical ensembles given by multiple constraints. The Annals of Probability 48(5) (2020)
Rockafellar, Ralph Tyrell: Convex Analysis. Princeton University Press (1997)
Sason, Igal, Verdu, Sergio: \(f\) -divergence inequalities. IEEE Transactions on Information Theory 62(11), 5973–6006 (2016)
Touchette, Hugo: Equivalence and nonequivalence of ensembles: Thermodynamic, macrostate, and measure levels. Journal of Statistical Physics 159(5), 987–1016 (2015)
Wong, R.: Asymptotic Approximations of Integrals. Society for Industrial and Applied Mathematics (2001)
Cancrini, Nicoletta, Olla, Stefano: Ensemble dependence of fluctuations: Canonical microcanonical equivalence of ensembles. Journal of Statistical Physics 168(4), 707–730 (2017)
Huveneers, François, Theil, Elias: Equivalence of ensembles, condensation and glassy dynamics in the Bose-Hubbard Hamiltonian. Journal of Statistical Physics 177(5), 917–935 (2019)
Szavits-Nossan, Juraj, Evans, Martin R., Majumdar, Satya N.: Condensation transition in joint large deviations of linear statistics. Journal of Physics A: Mathematical and Theoretical 47(45), 455004 (2014)
Acknowledgements
I thank my advisor Jani Lukkarinen for his support and encouragement during this project. For their technical comments and helpful discussions, I thank my colleagues Brecht Donvil and Gerardo Barrera Vargas. I also thank Gerardo Barrera Vargas for his diligent reading and suggested corrections to this manuscript. The research has been supported by the Academy of Finland, via an Academy project (Project No. 339228) and the Finnish centre of excellence in Randomness and STructures (Project No. 346306).
Funding
Open Access funding provided by University of Helsinki (including Helsinki University Central Hospital). This work was supported by the Academy of Finland, via an Academy project (Project No. 339228) and the Finnish centre of excellence in Randomness and STructures (Project No. 346306).
Ethics declarations
Conflict of interest
The author has no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Communicated by Aernout van Enter.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Koskinen, K. Infinite-Volume Gibbs States of the Generalized Mean-Field Orthoplicial Model. J Stat Phys 191, 108 (2024). https://doi.org/10.1007/s10955-024-03321-9