Computing the Partition Function of a Polynomial on the Boolean Cube

Barvinok, Alexander

doi:10.1007/978-3-319-44479-6_7

Alexander Barvinok⁴

Abstract

For a polynomial $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$, we define the partition function as the average of e ^λf(x) over all points x ∈ {−1, 1}ⁿ, where $\lambda \in \mathbb{C}$ is a parameter. We present a quasi-polynomial algorithm, which, given such f, λ and ε > 0 approximates the partition function within a relative error of ε in N ^{O(lnn−lnε)} time provided $\vert \lambda \vert \leq (2L\sqrt{\deg f})^{-1}$, where L = L( f) is a parameter bounding the Lipschitz constant of f from above and N is the number of monomials in f. As a corollary, we obtain a quasi-polynomial algorithm, which, given such an f with coefficients ± 1 and such that every variable enters not more than 4 monomials, approximates the maximum of f on { − 1, 1}ⁿ within a factor of $O\left (\delta ^{-1}\sqrt{\deg f}\right )$, provided the maximum is Nδ for some 0 < δ ≤ 1. If every variable enters not more than k monomials for some fixed k > 4, we are able to establish a similar result when δ ≥ (k − 1)∕k.

This research was partially supported by NSF Grant DMS 1361541.

Access provided by CONRICYT-eBooks. Download chapter PDF

On the Fourier spectrum of functions on Boolean cubes

Article 25 September 2018

A Formula for the Partition Function That “Counts”

Article 02 February 2016

The Turán and Laguerre inequalities for quasi-polynomial-like functions

Article Open access 21 April 2024

1991 Mathematics Subject Classification.

1 Introduction and Main Results

1.1 Polynomials and Partition Functions

Let { − 1, 1}ⁿ be the n-dimensional Boolean cube, that is, the set of all 2ⁿ n-vectors $x = \left (\pm 1,\ldots,\pm 1\right )$ and let $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$ be a polynomial with complex coefficients. We assume that f is defined as a linear combination of square-free monomials:

$$\displaystyle{ \begin{array}{rl} f(x) =\sum _{I\subset \{1,\ldots,n\}}\alpha _{I}&\mathbf{x}^{I}\quad \text{where}\quad \alpha _{I} \in \mathbb{C}\quad \text{for all}\quad I\quad \\ \text{and}\quad &\mathbf{x}^{I} =\prod _{i\in I}x_{i}\quad \text{for}\quad x = \left (x_{1},\ldots,x_{n}\right ),\end{array} }$$

(1)

where we agree that x ^∅ = 1. As is known, the monomials x ^I for I ⊂ {1, …, n} constitute a basis of the vector space of functions $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$.

We introduce two parameters measuring the complexity of the polynomial f in (1). The degree of f is the largest degree of a monomial x ^I appearing in (1) with a non-zero coefficient, that is, the maximum cardinality | I | such that α _I ≠ 0:

$$\displaystyle{\deg f =\max _{I:\ \alpha _{I}\neq 0}\vert I\vert.}$$

We also introduce a parameter which controls the Lipschitz constant of f:

$$\displaystyle{L(\,f) =\max _{i=1,\ldots,n}\sum _{\begin{array}{c}I\subset \{1,\ldots,n\} \\ i\in I \end{array}}\vert \alpha _{I}\vert.}$$

Indeed, if $\mathop{\mathrm{dist}}\nolimits$ is the metric on the cube,

$$\displaystyle{\mathop{\mathrm{dist}}\nolimits (x,y) =\sum _{ i=1}^{n}\vert x_{ i} - y_{i}\vert \quad \text{where}\quad x = \left (x_{1},\ldots,x_{n}\right )\quad \text{and}\quad y = \left (\,y_{1},\ldots,y_{n}\right )}$$

then

$$\displaystyle{\left \vert \,f(x) - f(\,y)\right \vert \ \leq \ L(\,f)\mathop{\mathrm{dist}}\nolimits (x,y).}$$

We consider { − 1, 1}ⁿ as a finite probability space endowed with the uniform measure.

For $\lambda \in \mathbb{C}$ and a polynomial $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$, we introduce the partition function

$$\displaystyle{{ 1 \over 2^{n}}\sum _{x\in \{-1,1\}^{n}}e^{\lambda f(x)} = \mathbf{E}e^{\lambda f}.}$$

Our first main result bounds from below the distance from the zeros of the partition function to the origin.

Theorem 1.1

Let $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$ be a polynomial and let $\lambda \in \mathbb{C}$ be such that

$$\displaystyle{\vert \lambda \vert \ \leq \ { 0.55 \over L(\,f)\sqrt{\deg f}}.}$$

Then

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \neq \ 0.}$$

If, additionally, the constant term of f is 0 then

$$\displaystyle{\left \vert \mathbf{E}\,e^{\lambda f}\right \vert \ \geq \ (0.41)^{n}.}$$

We prove Theorem 1.1 in Sect. 4. As a simple example, let $f\left (x_{1},\ldots,x_{n}\right ) = x_{1} + \cdots + x_{n}$. Then

$$\displaystyle{\mathbf{E}\,e^{\lambda f} = \left (\mathbf{E}\,e^{\lambda x_{1} }\right )\cdots \left (\mathbf{E}\,e^{\lambda x_{n} }\right ) = \left ({e^{\lambda } + e^{-\lambda } \over 2} \right )^{n}.}$$

We have $L(\,f) =\deg f = 1$ and Theorem 1.1 predicts that E e ^λf ≠ 0 provided | λ | ≤ 0. 55. Indeed, the smallest in the absolute value root of E e ^λf is λ = πi∕2 with | λ | = π∕2 ≈ 1. 57. If we pick f(x ₁, …, x _n) = ax ₁ + … + ax _n for some real constant a > 0 then the smallest in the absolute value root of E e ^λf is πi∕2a with | λ | inversely proportional to L( f), just as Theorem 1.1 predicts. It is not clear at the moment whether the dependence of the bound in Theorem 1.1 on $\deg f$ is optimal.

As we will see shortly, Theorem 1.1 implies that E e ^λf can be efficiently computed if | λ | is strictly smaller than the bound in Theorem 1.1. When computing E e ^λf, we may assume that the constant term of f is 0, since

$$\displaystyle{\mathbf{E}\,e^{\lambda (\,f+\alpha )} = e^{\lambda \alpha }\mathbf{E}\,e^{\lambda f}}$$

and hence adding a constant to f results in multiplying the partition function by a constant.

For a given f, we consider a univariate function

$$\displaystyle{\lambda \longmapsto \mathbf{E}\,e^{\lambda f}.}$$

As follows from Theorem 1.1, we can choose a branch of

$$\displaystyle{g(\lambda ) =\ln \left (\mathbf{E}\,e^{\lambda f}\right )\quad \text{for}\quad \vert \lambda \vert \ \leq \ { 0.55 \over L(\,f)\sqrt{\deg f}}}$$

such that g(0) = 0. It follows that g(λ) is well-approximated by a low degree Taylor polynomial at 0.

Theorem 1.2

Let $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$ be a polynomial with zero constant term and let

$$\displaystyle{g(\lambda ) =\ln \left (\mathbf{E}\,e^{\lambda f}\right )\quad \mathit{\text{for}}\quad \vert \lambda \vert \ \leq \ { 0.55 \over L(\,f)\sqrt{\deg f}}.}$$

For a positive integer m ≤ 5n, let

$$\displaystyle{T_{m}(\,f;\lambda ) =\sum _{ k=1}^{m}{ \lambda ^{k} \over k!}{ d^{k} \over d\lambda ^{k}}g(\lambda )\Big\vert _{\lambda =0}}$$

be the degree m Taylor polynomial of g(λ) computed at λ = 0. Then for n ≥ 2

$$\displaystyle{\left \vert g(\lambda ) - T_{m}(\,f;\lambda )\right \vert \ \leq \ { 50n \over (m + 1)(1.1)^{m}} + e^{-n}}$$

provided

$$\displaystyle{ \vert \lambda \vert \ \leq \ { 1 \over 2L(\,f)\sqrt{\deg f}}. }$$

(2)

In Sect. 3, we deduce Theorem 1.2 from Theorem 1.1.

As we discuss in Sect. 3.1, for a polynomial f given by (1), the value of T _m( f; λ) can be computed in nN ^O(m) time, where N is the number of monomials in the representation (1). Theorem 1.2 implies that as long as ε ≫ e ⁻ⁿ, by choosing $m = O{\bigl (\ln n-\ln \epsilon \bigr )}$, we can compute the value of E e ^λf within relative error ε in N ^{O(lnn−lnε)} time provided λ satisfies the inequality (2). For ε exponentially small in n, it is more efficient to evaluate E e ^λf directly from the definition.

1.2 Relation to Prior Work

This paper is a continuation of a series of papers by the author [3, 4] and by the author and P. Soberón [5, 6] on algorithms to compute partition functions in combinatorics, see also [16]. The main idea of the method is that the logarithm of the partition function is well-approximated by a low-degree Taylor polynomial at the temperatures above the phase transition (the role of the temperature is played by 1∕λ), while the phase transition is governed by the complex zeros of the partition function, cf. [15, 18].

The main work of the method consists of bounding the complex roots of the partition function, as in Theorem 1.1. While the general approach of this paper looks similar to the approach of [3–5] and [6] (a martingale type and a fixed point type arguments), in each case bounding complex roots requires some effort and new ideas. Once the roots are bounded, it is relatively straightforward to approximate the partition function as in Theorem 1.2.

Another approach to computing partition functions, also rooted in statistical physics, is the correlation decay approach, see [17] and [1]. While we did not pursue that approach, in our situation it could conceivably work as follows: given a polynomial $f:\{ -1,1\}^{n}\longrightarrow \mathbb{R}$ and a real λ > 0, we consider the Boolean cube as a finite probability space, where the probability of a point x ∈ {−1, 1}ⁿ is e ^λf(x)∕E e ^λf. This makes the coordinates x ₁, …, x _n random variables. We consider a graph with vertices x ₁, …, x _n and edges connecting two vertices x _i and x _j if there is a monomial of f containing both x _i and x _j. This introduces a graph metric on the variables x ₁, …, x _n and one could hope that if λ is sufficiently small, we have correlation decay: the random variable x _i is almost independent on the random variables sufficiently distant from x _i in the graph metric. This would allow us to efficiently approximate the probabilities P (x _i = 1) and P (x _i = −1) and then recursively estimate E e ^λf.

While both approaches treat the phase transition as a natural threshold for computability, the concepts of phase transition in our method (complex zeros of the partition function) and in the correlation decay approach (non-uniqueness of Gibbs measures) though definitely related and even equivalent for some spin systems [8], in general are different.

Theorem 1.2 together with the algorithm of Sect. 3.1 below implies that to approximate E e ^λf within a relative error of ε > 0, it suffices to compute moments E f ^k for $k = O\left (\ln \epsilon ^{-1}\right )$. This suggests some similarity with one of the results of [13], where (among other results) it is shown that the number of satisfying assignments of a DNF on n Boolean variables is uniquely determined by the numbers of satisfying assignments for all possible conjunctions of k ≤ 1 + log₂ n clauses of the DNF (though this is a purely existential result with no algorithm attached). Each conjunction of the DNF can be represented as a polynomial

$$\displaystyle{\begin{array}{rl} \phi _{j}(x) =&{ 1 \over 2^{\vert S_{j}\vert }}\prod _{i\in S_{j}}(1 +\epsilon _{i}x_{i})\quad \text{where} \\ &S_{j} \subset \{ 1,\ldots,n\}\quad \text{and}\quad \epsilon _{i} \in \{-1,1\},\end{array} }$$

and we let

$$\displaystyle{f(x) =\sum _{ j=1}^{m}\phi _{ j}(x).}$$

Then the number of points x ∈ {−1, 1}ⁿ such that f(x) > 0 is uniquely determined by various expectations $\mathbf{E}\,\phi _{j_{1}}\cdots \phi _{j_{k}}$ for k ≤ 1 + log₂ n. The probability that f(x) = 0 for a random point x ∈ {−1, 1}ⁿ sampled from the uniform distribution, can be approximated by E e ^−λf for a sufficiently large λ > 0. The expectations are precisely those that arise when we compute the moments E f ^k. It is not clear at the moment whether the results of this paper can produce an efficient way to compute the number of satisfying assignments.

2 Applications to Optimization

2.1 Maximizing a Polynomial on the Boolean Cube

Let $f:\{ -1,1\}^{n}\longrightarrow \mathbb{R}$ be a polynomial with real coefficients defined by its monomial expansion (1). As is known, various computationally hard problems of discrete optimization, such as finding the maximum cardinality of an independent set in a graph, finding the minimum cardinality of a vertex cover in a hypergraph and the maximum constraint satisfaction problem can be reduced to finding the maximum of f on the Boolean cube { − 1, 1}ⁿ, see, for example, [7].

The problem is straightforward if $\deg f \leq 1$. If $\deg f = 2$, it may already be quite hard even to solve approximately: Given an undirected simple graph G = (V, E) with set V = {1, …, n} of vertices and set $E \subset { V \choose 2}$ of edges, one can express the largest cardinality of an independent set (a set vertices no two of which are connected by an edge of the graph), as the maximum of

$$\displaystyle{f(x) ={ 1 \over 2}\sum _{i=1}^{n}\left (x_{ i} + 1\right ) -{ 1 \over 4}\sum _{\{i,j\}\in E}\left (1 + x_{i}\right )\left (1 + x_{j}\right )}$$

on the cube { − 1, 1}ⁿ. It is an NP-hard problem to approximate the size of the largest independent set in a given graph on n vertices within a factor of n ^1−ε for any 0 < ε ≤ 1, fixed in advance [10, 19]. If $\deg f = 2$ and f does not contain linear or constant terms, the problem reduces to the max cut problem in a weighted graph (with both positive and negative weights allowed on the edges), where there exists a polynomial time algorithm achieving an O(lnn) approximation factor, see [14] for a survey.

If $\deg f \geq 3$, no efficient algorithm appears to be known that would outperform choosing a random point x ∈ {−1, 1}ⁿ. The maximum of a polynomial f with $\deg f = 3$ and no constant, linear or quadratic terms can be approximated within an $O{\bigl (\sqrt{n/\ln n}\bigr )}$ factor in polynomial time, see [14]. Finding the maximum of a general real polynomial (1) on the Boolean cube { − 1, 1}ⁿ is equivalent to the problem of finding the maximum weight of a subset of a system of weighted linear equations over $\mathbb{Z}_{2}$ that can be simultaneously satisfied [12]. Assuming that $\deg f$ is fixed in advance, f contains N monomials and the constant term of f is 0, a polynomial time algorithm approximating the maximum of f within a factor of $O(\sqrt{N})$ is constructed in [12]. More precisely, the algorithm from [12] constructs a point x such that f(x) is within a factor of $O(\sqrt{N})$ from ∑ _I | α _I | for f defined by (1). If $\deg f \geq 3$, it is unlikely that a polynomial time algorithm exists approximating the maximum of f within a factor of $2^{(\ln N)^{1-\epsilon } }$ for any fixed 0 < ε ≤ 1 [12], see also [10].

Let us choose

$$\displaystyle{\lambda ={ 1 \over 2L(\,f)\sqrt{\deg f}}}$$

as in Theorem 1.2. As is discussed in Sect. 3.2, by successive conditioning, we can compute in N ^{O(lnn−lnε)} time a point y ∈ {−1, 1}ⁿ which satisfies

$$\displaystyle{ e^{\lambda f(\,y)}\ \geq \ (1-\epsilon )\mathbf{E}\,e^{\lambda f} }$$

(3)

for any given 0 < ε ≤ 1.

How well a point y satisfying (3) approximates the maximum value of f on the Boolean cube { − 1, 1}ⁿ? We consider polynomials with coefficients − 1, 0 and 1, where the problem of finding an x ∈ {−1, 1}ⁿ maximizing f(x) is equivalent to finding a vector in $\mathbb{Z}_{2}^{n}$ satisfying the largest number of linear equations from a given list of linear equations over $\mathbb{Z}_{2}$.

Theorem 2.1

Let

$$\displaystyle{f(x) =\sum _{I\in \mathcal{F}}\alpha _{I}\mathbf{x}^{I}}$$

be a polynomial with zero constant term, where $\mathcal{F}$ is a family of non-empty subsets of the set {1, …, n} and α _I = ±1 for all $I \in \mathcal{F}$ . Let

$$\displaystyle{\max _{x\in \{-1,1\}^{n}}f(x) =\delta \vert \mathcal{F}\vert \quad \mathit{\text{for some}}\quad 0 \leq \delta \leq 1.}$$

Suppose further that every variable x _i enters at most four monomials x ^I for $I \in \mathcal{F}$ . Then

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{3\lambda ^{2}\delta ^{2} \over 16} \vert \mathcal{F}\vert \right \}\quad \mathit{\text{for}}\quad 0 \leq \lambda \leq 1.}$$

Since E f = 0, the maximum of f is positive unless $\mathcal{F} =\emptyset$ and f ≡ 0. It is not clear whether the restriction on the number of occurrences of variables in Theorem 2.1 is essential or an artifact of the proof. We can get a similar estimate for any number occurrences provided the maximum of f is sufficiently close to $\vert \mathcal{F}\vert$.

Theorem 2.2

Let

$$\displaystyle{f(x) =\sum _{I\in \mathcal{F}}\alpha _{I}\mathbf{x}^{I}}$$

be a polynomial with zero constant term, where $\mathcal{F}$ is a family of non-empty subsets of the set {1, …, n} and α _I = ±1 for all $I \in \mathcal{F}$ . Let k > 2 be an integer and suppose that every variable x _i enters at most k monomials x ^I for $I \in \mathcal{F}$ . If

$$\displaystyle{\max _{x\in \{-1,1\}^{n}}f(x)\ \geq \ { k - 1 \over k} \vert \mathcal{F}\vert }$$

then

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{3\lambda ^{2} \over 16}\vert \mathcal{F}\vert \right \}\quad \mathit{\text{for all}}\quad 0 \leq \lambda \leq 1.}$$

We prove Theorems 2.1 and 2.2 in Sect. 5.

Let f be a polynomial of Theorem 2.1 and suppose that, additionally, | I | ≤ d for all $I \in \mathcal{F}$, so that $\deg f \leq d$. We have L( f) ≤ 4 and we choose

$$\displaystyle{\lambda ={ 1 \over 8\sqrt{d}}.}$$

Let y ∈ {−1, 1}ⁿ be a point satisfying (3). Then

$$\displaystyle{f(\,y)\ \geq \ { 1 \over \lambda } \ln \mathbf{E}\,e^{\lambda f} +{ \ln (1-\epsilon ) \over \lambda } \ \geq \ { 3\lambda \delta ^{2} \over 16}\vert \mathcal{F}\vert +{ \ln (1-\epsilon ) \over \lambda }.}$$

That is, if the maximum of f is at least $\delta \vert \mathcal{F}\vert$ for some 0 < δ ≤ 1, we can approximate the maximum in quasi-polynomial time within a factor of $O\left (\delta ^{-1}\sqrt{d}\right )$. Equivalently, if for some 0 < δ ≤ 0. 5 there is a vector in $\mathbb{Z}_{2}^{n}$ satisfying at least $(0.5+\delta )\vert \mathcal{F}\vert$ equations of a set $\mathcal{F}$ of linear equations over $\mathbb{Z}_{2}$, where each variable enters at most 4 equations, in quasi-polynomial time we can compute a vector $v \in \mathbb{Z}_{2}^{n}$ satisfying at least $(0.5 +\delta _{1})\vert \mathcal{F}\vert$ linear equations from the system, where $\delta _{1} = \Omega (\delta ^{2}/\sqrt{d})$ and d is the largest number of variables per equation.

Similarly, we can approximate in quasi-polynomial time the maximum of f in Theorem 2.2 within a factor of $O(k\sqrt{d})$ provided the maximum is sufficiently close to $\vert \mathcal{F}\vert$, that is, is at least ${k-1 \over k} \vert \mathcal{F}\vert$.

In Theorems 2.1 and 2.2, one can check in polynomial time whether the maximum of f is equal to $\vert \mathcal{F}\vert$, as this reduces to testing the feasibility of a system of linear equations over $\mathbb{Z}_{2}$. However, for any fixed 0 < δ < 1, testing whether the maximum is at least $\delta \vert \mathcal{F}\vert$ is computationally hard, cf. [10].

Håstad [9] constructed a polynomial time algorithm that approximates the maximum of f within a factor of O(kd). In [2], see also [11], a polynomial algorithm is constructed that finds the maximum of f within a factor of $e^{O(d)}\sqrt{k}$, provided f is an odd function. More precisely, the algorithm finds a point x such that f(x) is within a factor of $e^{O(d)}\sqrt{k}$ from $\vert \mathcal{F}\vert$.

3 Computing the Partition Function

3.1 Computing the Taylor Polynomial of $g(\lambda ) =\ln \left (\mathbf{E}\,e^{\lambda f}\right )$

First, we discuss how to compute the degree m Taylor polynomial T _m( f; λ) at λ = 0 of the function

$$\displaystyle{g(\lambda ) =\ln \left (\mathbf{E}\,e^{\lambda f}\right ),}$$

see Theorem 1.2. Let us denote

$$\displaystyle{h(\lambda ) = \mathbf{E}\,e^{\lambda f}\quad \text{and}\quad g(\lambda ) =\ln h(\lambda ).}$$

Then

$$\displaystyle{g' ={ h' \over h}\quad \text{and hence}\quad h' = g'h.}$$

Therefore,

$$\displaystyle{ h^{(k)}(0) =\sum _{ j=1}^{k}{k - 1\choose j - 1}g^{(j)}(0)h^{(k-j)}(0)\quad \text{for}\quad k = 1,\ldots,m. }$$

(4)

If we calculate the derivatives

$$\displaystyle{ h(0),\ h^{(1)}(0),\ldots,h^{(m)}(0), }$$

(5)

then we can compute

$$\displaystyle{g(0),\ g^{(1)}(0),\ldots,g^{(m)}(0)}$$

by solving a non-singular triangular system of linear equations (4) which has h(0) = 1 on the diagonal. Hence our goal is to calculate the derivatives (5).

We observe that

$$\displaystyle{h^{(k)}(0) ={ 1 \over 2^{n}}\sum _{x\in \{-1,1\}^{n}}f^{k}(x) = \mathbf{E}\,f^{k}.}$$

For a polynomial f defined by its monomial expansion (1) we have

$$\displaystyle{\mathbf{E}\,f =\alpha _{\emptyset }.}$$

We can consecutively compute the monomial expansion of f, f ², …, f ^m by using the following multiplication rule for monomials on the Boolean cube { − 1, 1}ⁿ:

$$\displaystyle{\mathbf{x}^{I}\mathbf{x}^{J} = \mathbf{x}^{I\Delta J},}$$

where $I\Delta J$ is the symmetric difference of subsets I, J ⊂ {1, …, n}. It follows then that for a polynomial $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$ given by its monomial expansion (1) and a positive integer m, the Taylor polynomial

$$\displaystyle{T_{m}(\,f;\lambda ) =\sum _{ k=1}^{m}{ \lambda ^{k} \over k!}{ d^{k} \over d\lambda ^{k}}g(\lambda )\Big\vert _{\lambda =0}}$$

can be computed in nN ^O(m) time, where N is the number of monomials in f.

Our next goal is deduce Theorem 1.2 from Theorem 1.1. The proof is based on the following lemma.

Lemma 3.1

Let $p: \mathbb{C}\longrightarrow \mathbb{C}$ be a univariate polynomial and suppose that for some β > 0 we have

$$\displaystyle{p(z)\neq 0\quad \mathit{\text{provided}}\quad \vert z\vert \leq \beta.}$$

Let 0 < γ < β and for | z | ≤ γ, let us choose a continuous branch of

$$\displaystyle{g(z) =\ln p(z).}$$

Let

$$\displaystyle{T_{m}(z) = g(0) +\sum _{ k=1}^{m}{z^{k} \over k!}{ d^{k} \over dz^{k}}g(z)\Big\vert _{z=0}}$$

be the degree m Taylor polynomial of g(z) computed at z = 0. Then for

$$\displaystyle{\tau ={ \beta \over \gamma }\>\ 1}$$

we have

$$\displaystyle{\left \vert g(z) - T_{m}(z)\right \vert \ \leq \ { \deg p \over (m + 1)\tau ^{m}(\tau -1)}\quad \mathit{\text{for all}}\quad \vert z\vert \leq \gamma.}$$

Proof

Let $n =\deg p$ and let α ₁, …, α _n be the roots of p, so we may write

$$\displaystyle{p(z) = p(0)\prod _{i=1}^{n}\left (1 -{ z \over \alpha _{i}} \right )\quad \text{where}\quad \vert \alpha _{i}\vert \geq \beta \quad \text{for}\quad i = 1,\ldots,n.}$$

Then

$$\displaystyle{g(z) = g(0) +\sum _{ i=1}^{n}\ln \left (1 -{ z \over \alpha _{i}} \right ),}$$

where we choose the branch of the logarithm which is 0 when z = 0. Using the Taylor series expansion of the logarithm, we obtain

$$\displaystyle{\ln \left (1 -{ z \over \alpha _{i}} \right ) = -\sum _{k=1}^{m}{ z^{k} \over k\alpha _{i}^{k}} +\zeta _{m}\quad \text{provided}\quad \vert z\vert \leq \gamma,}$$

where

$$\displaystyle{\left \vert \zeta _{m}\right \vert = \left \vert -\sum _{k=m+1}^{+\infty }{ z^{k} \over k\alpha _{i}^{k}}\right \vert \ \leq \ \sum _{k=m+1}^{+\infty }{ \gamma ^{k} \over k\beta ^{k}}\ \leq \ { 1 \over (m + 1)\tau ^{m}(\tau -1)}.}$$

Therefore,

$$\displaystyle{g(z) = g(0) -\sum _{i=1}^{n}\sum _{ k=1}^{m}{ z^{k} \over k\alpha _{i}^{k}} +\eta _{m}\quad \text{for}\quad \vert z\vert \leq \gamma,}$$

where

$$\displaystyle{\left \vert \eta _{m}\right \vert \ \leq \ { n \over (m + 1)\tau ^{m}(\tau -1)}.}$$

It remains to notice that

$$\displaystyle{T_{m}(z) = g(0) -\sum _{i=1}^{n}\sum _{ k=1}^{m}{ z^{k} \over k\alpha _{i}^{k}}.}$$

□

Next, we need a technical bound on the approximation of e ^z by its Taylor polynomial.

Lemma 3.2

Let ρ > 0 be a real number and let m ≥ 5ρ be an integer. Then

$$\displaystyle{\left \vert e^{z} -\sum _{ k=0}^{m}{z^{k} \over k!} \right \vert \ \leq \ e^{-2\rho }\quad \mathit{\text{for all}}\quad z \in \mathbb{C}\quad \mathit{\text{such that}}\quad \vert z\vert \leq \rho.}$$

Proof

For all $z \in \mathbb{C}$ such that | z | ≤ ρ, we have

$$\displaystyle{\begin{array}{rl} \left \vert e^{z} -\sum _{k=0}^{m}{z^{k} \over k!} \right \vert =&\left \vert \sum _{k=m+1}^{+\infty }{z^{k} \over k!} \right \vert \ \leq \ \sum _{k=m+1}^{+\infty }{\rho ^{k} \over k!} ={ \rho ^{m+1} \over (m+1)!}\sum _{k=0}^{+\infty }{ \rho ^{k}(m+1)! \over (k+m+1)!} \\ \leq \ &{ \rho ^{m+1} \over (m+1)!}\sum _{k=0}^{+\infty }{\rho ^{k} \over k!} ={ \rho ^{m+1}e^{\rho } \over (m+1)!}\ \leq \ { \rho ^{m+1}e^{\rho +m+1} \over (m+1)^{m+1}}. \end{array} }$$

Since m ≥ 5ρ, we obtain

$$\displaystyle{\left \vert e^{z} -\sum _{ k=0}^{+\infty }{z^{k} \over k!} \right \vert \ \leq \ { \rho ^{m+1}e^{\rho +m+1} \over 5^{m+1}\rho ^{m+1}} ={ e^{\rho } \over (5/e)^{m+1}}\ \leq \ { e^{\rho } \over (5/e)^{5\rho }}\ \leq \ e^{-2\rho }.}$$

and the proof follows. □

Proof of Theorem 1.2

Without loss of generality, we assume that L( f) = 1. Since the constant term of f is 0, for any x ∈ {−1, 1}ⁿ, we have

$$\displaystyle{\vert f(x)\vert \ \leq \ \sum _{i=1}^{n}\sum _{ I:\ i\in I}\vert \alpha _{I}\vert \ \leq \ n.}$$

Applying Lemma 3.2, we conclude that

$$\displaystyle{ \left \vert e^{\lambda f(x)} -\sum _{ k=0}^{5n}{{\bigl (\lambda f(x)\bigr )}^{k} \over k!} \right \vert \ \leq \ e^{-2n}\quad \text{for all}\quad x \in \{-1,1\}^{n} }$$

(6)

provided | λ | ≤ 1. Let

$$\displaystyle{p(\lambda ) = 1 +\sum _{ k=1}^{5n}{ \lambda ^{k} \over k!}{ d^{k} \over d\lambda ^{k}}\left (\mathbf{E}\,e^{\lambda f}\right )\Big\vert _{\lambda =0}}$$

be the degree 5n Taylor polynomial of the function λ ↦ E e ^λf at λ = 0. From (6) it follows that

$$\displaystyle{\left \vert \mathbf{E}\,e^{\lambda f} - p(\lambda )\right \vert \ \leq \ e^{-2n}\quad \text{provided}\quad \vert \lambda \vert \leq 1.}$$

From Theorem 1.1, we conclude that

$$\displaystyle{p(\lambda )\neq 0\quad \text{for all}\quad \lambda \in \mathbb{C}\quad \text{such that}\quad \vert \lambda \vert \ \leq \ { 0.55 \over \sqrt{\deg f}} }$$

and, moreover,

$$\displaystyle{ \left \vert \ln p(\lambda ) -\ln \left (\mathbf{E}\,e^{\lambda f}\right )\right \vert \ \leq \ e^{-n}\quad \text{provided}\quad \vert \lambda \vert \ \leq \ { 0.55 \over \sqrt{\deg f}} \quad \text{and}\quad n \geq 2. }$$

(7)

Applying Lemma 3.1 with

$$\displaystyle{\beta ={ 0.55 \over \sqrt{\deg f}},\quad \gamma ={ 0.5 \over \sqrt{\deg f}}\quad \text{and}\quad \tau ={ \beta \over \gamma } = 1.1,}$$

we conclude that for the Taylor polynomial of lnp(λ) at λ = 0,

$$\displaystyle{T_{m}(\lambda ) =\ln p(0) +\sum _{ k=1}^{m}{ \lambda ^{k} \over k!}{ d^{k} \over d\lambda ^{k}}\ln p(\lambda )\Big\vert _{\lambda =0}}$$

we have

$$\displaystyle{ \left \vert T_{m}(\lambda ) -\ln p(\lambda )\right \vert \ \leq \ { 50n \over (m + 1)(1.1)^{m}}\quad \text{provided}\quad \vert \lambda \vert \ \leq \ { 1 \over 2\sqrt{\deg f}}. }$$

(8)

It remains to notice that the Taylor polynomials of degree m ≤ 5n of the functions

$$\displaystyle{\lambda \longmapsto \ln \left (\mathbf{E}\,e^{\lambda f}\right )\quad \text{and}\quad \lambda \longmapsto \ln p(\lambda )}$$

at λ = 0 coincide, since both are determined by the first m derivatives of respectively E e ^λf and p(λ) at λ = 0, cf. Sect. 3.1, and those derivatives coincide. The proof now follows by (7) and (8). □

3.2 Computing a Point y in the Cube with a Large Value of f( y)

We discuss how to compute a point y ∈ {−1, 1}ⁿ satisfying (3). We do it by successive conditioning and determine one coordinate of $y = \left (\,y_{1},\ldots,y_{n}\right )$ at a time. Let F ⁺ and F ⁻ be the facets of the cube { − 1, 1}ⁿ defined by the equations x _n = 1 and x _n = −1 respectively for $x = \left (x_{1},\ldots,x_{n}\right )$, x ∈ {−1, 1}ⁿ. Then F ⁺ and F ⁻ can be identified with the (n − 1)-dimensional cube { − 1, 1}ⁿ⁻¹ and we have

$$\displaystyle{\mathbf{E}\,e^{\lambda f} ={ 1 \over 2}\mathbf{E}\,\left (e^{\lambda f}\vert F^{+}\right ) +{ 1 \over 2}\mathbf{E}\,\left (e^{\lambda f}\vert F^{-}\right ).}$$

Moreover, for the restrictions f ⁺ and f ⁻ of f onto F ⁺ and F ⁻ respectively, considered as polynomials on { − 1, 1}ⁿ⁻¹, we have

$$\displaystyle{\deg f^{+},\ \deg f^{-}\ \leq \ \deg f\quad \text{and}\quad L(\,f^{+}),\ L(\,f^{-})\ \leq \ L(\,f).}$$

Using the algorithm of Sect. 3.1 and Theorem 1.2, we compute $\mathbf{E}\,\left (e^{\lambda f}\vert F^{+}\right )$ and $\mathbf{E}\,\left (e^{\lambda f}\vert F^{-}\right )$ within a relative error ε∕2n, choose the facet with the larger computed value, let y _n = 1 if the value of $\mathbf{E}\,\left (e^{\lambda f}\vert F^{+}\right )$ appears to be larger and let y _n = −1 if the value of $\mathbf{E}\,\left (e^{\lambda f}\vert F^{-}\right )$ appears to be larger and proceed further by conditioning on the value of y _n−1. For polynomials with N monomials, the complexity of the algorithm is N ^O(lnn).

4 Proof of Theorem 1.1

To prove Theorem 1.1, we consider restrictions of the partition function onto faces of the cube.

4.1 Faces

A face F ⊂ {−1, 1}ⁿ consists of the points x where some of the coordinates of x are fixed at 1, some are fixed at − 1 and others are allowed to vary (a face is always non-empty). With a face F, we associate three subsets I ₊(F), I ₋(F), I(F) ⊂ {1, …, n} as follows:

$$\displaystyle{\begin{array}{rcl} I_{+}(F)& =&{\bigl \{i:\ x_{i} = 1\quad \text{for all}\quad x \in F,\ x = \left (x_{1},\ldots,x_{n}\right )\bigr \}}, \\ I_{-}(F)& =&{\bigl \{i:\ x_{i} = -1\quad \text{for all}\quad x \in F,\ x = \left (x_{1},\ldots,x_{n}\right )\bigr \}}\quad \text{and} \\ I(F)& =&\{1,\ldots,n\}\setminus \left (I_{+}(F) \cup I_{-}(F)\right ). \end{array} }$$

Consequently,

$$\displaystyle{\begin{array}{rl} F =\Big\{ \left (x_{1},\ldots,x_{n}\right )\quad \text{where}\quad &x_{i} = 1\quad \text{for}\quad i \in I_{+}(F)\quad \text{and} \\ &x_{i} = -1\quad \text{for}\quad i \in I_{-}(F)\Big\}. \end{array} }$$

In particular, if I ₊(F) = I ₋(F) = ∅ and hence I(F) = {1, …, n}, we have F = {−1, 1}ⁿ. We call the number

$$\displaystyle{\dim F = \vert I(F)\vert }$$

the dimension of F.

For a subset J ∈ {1, …, n}, we denote by { − 1, 1}^J the set of all points

$$\displaystyle{x = \left (x_{j}:\ j \in J\right )\quad \text{where}\quad x_{j} = \pm 1.}$$

Let F ⊂ {−1, 1}ⁿ be a face. For a subset J ⊂ I(F) and a point ε ∈ {−1, 1}^J, $\epsilon = \left (\epsilon _{j}:\ j \in J\right )$, we define

$$\displaystyle{F^{\epsilon } ={\bigl \{ x \in F,\ x = \left (x_{1},\ldots,x_{n}\right ):\ x_{j} =\epsilon _{j}\quad \text{for}\quad j \in J\bigr \}}.}$$

In words: F ^ε is obtained from F by fixing the coordinates from some set J ⊂ I(F) of free coordinates to 1 or to − 1. Hence F ^ε is also a face of { − 1, 1}ⁿ and we think of F ^ε ⊂ F as a face of F. We can represent F as a disjoint union

$$\displaystyle{ F =\bigcup _{\epsilon \in \{-1,1\}^{J}}F^{\epsilon }\quad \text{for any}\quad J \subset I(F). }$$

(9)

4.2 The Space of Polynomials

Let us fix a positive integer d. We identify the set of all polynomials f as in (1) such that $\deg f \leq d$ and the constant term of f is 0 with $\mathbb{C}^{N}$, where

$$\displaystyle{N = N(n,d) =\sum _{ k=1}^{d}{n\choose k}.}$$

For δ > 0, we consider a closed convex set $\mathcal{U}(\delta ) \subset \mathbb{C}^{N}$ consisting of the polynomials $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$ such that $\deg f \leq d$ and L( f) ≤ δ. In other words, $\mathcal{U}(\delta )$ consists of the polynomials

$$\displaystyle{f(x) =\sum _{\begin{array}{c}I\subset \{1,\ldots,n\} \\ 1\leq \vert I\vert \leq d\end{array}}\alpha _{I}\mathbf{x}^{I}\quad \text{where}\quad \sum _{ I:\ i\in I}\vert \alpha _{I}\vert \ \leq \ \delta \quad \text{for}\quad i = 1,\ldots,n.}$$

4.3 Restriction of the Partition Function onto a Face

Let $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$ be a polynomial and let F ⊂ {−1, 1}ⁿ be a face. We define

$$\displaystyle{\mathbf{E}\,\left (e^{\,f}\vert F\right ) ={ 1 \over 2^{\dim F}}\sum _{x\in F}e^{f(x)}.}$$

We suppose that f is defined by its monomial expansion as in (1) and consider $\mathbf{E}\,\left (e^{\,f}\vert F\right )$ as a function of the coefficients α _I. Using (9) we deduce

$$\displaystyle{ \begin{array}{rl} &{ \partial \over \partial \alpha _{J}}\mathbf{E}\,\left (e^{\,f}\vert F\right ) ={ 1 \over 2^{\dim F}}\sum _{x\in F}\mathbf{x}^{J}e^{f(x)} \\ & \quad ={ (-1)^{\vert I_{-}(F)\cap J\vert } \over 2^{\vert I(F)\vert }} \\ &\quad \times \sum _{\begin{array}{c}\epsilon \in \{-1,1\}^{I(F)\cap J} \\ \epsilon =(\epsilon _{j}:\ j\in I(F)\cap J)\end{array}}\left (\prod _{j\in I(F)\cap J}\epsilon _{j}\right )\sum _{x\in F^{\epsilon }}e^{f(x)} \\ & \quad ={ (-1)^{\vert I_{-}(F)\cap J\vert } \over 2^{\vert I(F)\cap J\vert }} \\ &\quad \times \sum _{\begin{array}{c}\epsilon \in \{-1,1\}^{I(F)\cap J} \\ \epsilon =(\epsilon _{j}:\ j\in I(F)\cap J)\end{array}}\left (\prod _{j\in I(F)\cap J}\epsilon _{j}\right )\mathbf{E}\,\left (e^{\,f}\vert F^{\epsilon }\right ). \end{array} }$$

(10)

In what follows, we identify complex numbers with vectors in $\mathbb{R}^{2} = \mathbb{C}$ and measure angles between non-zero complex numbers.

Lemma 4.1

Let 0 < τ ≤ 1 and δ > 0 be real numbers and let F ⊂ {−1, 1}ⁿ be a face. Suppose that for every $f \in \mathcal{U}(\delta )$ we have $\mathbf{E}\,\left (e^{\,f}\vert F\right )\neq 0$ and, moreover, for any K ⊂ I(F) we have

$$\displaystyle{\left \vert \mathbf{E}\,\left (e^{\,f}\vert F\right )\right \vert \ \geq \ \left ({ \tau \over 2}\right )^{\vert K\vert }\sum _{ \epsilon \in \{-1,1\}^{K}}\left \vert \mathbf{E}\,\left (e^{\,f},F^{\epsilon }\right )\right \vert.}$$

Given $f \in \mathcal{U}(\delta )$ and a subset J ⊂ {1, …, n} such that | J | ≤ d, let $\widehat{f} \in \mathcal{U}(\delta )$ be the polynomial obtained from f by changing the coefficient α _J of the monomial x ^J in f to −α _J and leaving all other coefficients intact. Then the angle between the two non-zero complex numbers $\mathbf{E}\,\left (e^{\,f}\vert F\right )$ and $\mathbf{E}\,\left (e^{\widehat{f}}\vert F\right )$ does not exceed

$$\displaystyle{{2\vert \alpha _{J}\vert \over \tau ^{d}}.}$$

Proof

Without loss of generality, we assume that α _J ≠ 0.

We note that for any $f \in \mathcal{U}(\delta )$, we have $\widehat{f} \in \mathcal{U}(\delta )$. Since $\mathbf{E}\,\left (e^{\,f}\vert F\right )\neq 0$ for all $f \in \mathcal{U}(\delta )$, we may consider a branch of $\ln \mathbf{E}\,\left (e^{\,f}\vert F\right )$ for $f \in \mathcal{U}(\delta )$.

Let us fix coefficients α _I for I ≠ J in

$$\displaystyle{ f(x) =\sum _{\begin{array}{c}I\subset \{1,\ldots,n\} \\ 1\leq \vert I\vert \leq d\end{array}}\alpha _{I}\mathbf{x}^{I} }$$

(11)

and define a univariate function

$$\displaystyle{g(\alpha ) =\ln \mathbf{E}\,\left (e^{\,f}\vert F\right )\quad \text{where}\quad \vert \alpha \vert \leq \vert \alpha _{ J}\vert }$$

obtained by replacing α _J with α in (11).

We obtain

$$\displaystyle{ g'(\alpha ) ={ \partial \over \partial \alpha _{J}}\ln \mathbf{E}\,\left (e^{\,f}\vert F\right ) = \left ({ \partial \over \partial \alpha _{J}}\mathbf{E}\,\left (e^{\,f}\vert F\right )\right )\Big/\mathbf{E}\,\left (e^{\,f}\vert F\right ). }$$

(12)

Let

$$\displaystyle{k = \vert I(F) \cap J\vert \ \leq \ \vert J\vert \ \leq \ d.}$$

Using (10) we conclude that

$$\displaystyle{ \left \vert { \partial \over \partial \alpha _{J}}\mathbf{E}\,\left (e^{\,f}\vert F\right )\right \vert \ \leq \ { 1 \over 2^{k}}\sum _{\epsilon \in \{-1,1\}^{I(F)\cap J}}\left \vert \mathbf{E}\,\left (e^{\,f}\vert F^{\epsilon }\right )\right \vert. }$$

(13)

On the other hand,

$$\displaystyle{ \left \vert \mathbf{E}\,\left (e^{\,f}\vert F\right )\right \vert \ \geq \ \left ({ \tau \over 2}\right )^{k}\sum _{ \epsilon \in \{-1,1\}^{I(F)\cap J}}\left \vert \mathbf{E}\,\left (e^{\,f}\vert F^{\epsilon }\right )\right \vert. }$$

(14)

Comparing (12), (13), and (14), we conclude that

$$\displaystyle{\vert g'(\alpha )\vert = \left \vert { \partial \over \partial \alpha _{J}}\ln \mathbf{E}\,\left (e^{\,f}\vert F\right )\right \vert \ \leq \ { 1 \over \tau ^{k}}\ \leq \ { 1 \over \tau ^{d}}.}$$

Then

$$\displaystyle{\left \vert \ln \mathbf{E}\,\left (e^{\,f}\vert F\right ) -\ln \mathbf{E}\,\left (e^{\widehat{f}}\vert F\right )\right \vert = \left \vert g\left (\alpha _{ J}\right ) - g\left (-\alpha _{J}\right )\right \vert \ \leq \ 2\vert \alpha _{J}\vert \max _{\vert \alpha \vert \leq \vert \alpha _{J}\vert }\left \vert g'(\alpha )\right \vert \ \leq \ { 2\vert \alpha _{J}\vert \over \tau ^{d}} }$$

and the proof follows. □

Lemma 4.2

Let θ ≥ 0 and δ > 0 be real numbers such that θδ < π, let F ⊆ {−1, 1}ⁿ be a face such that dimF < n and suppose that $\mathbf{E}\,\left (e^{\,f}\vert F\right )\neq 0$ for all $f \in \mathcal{U}(\delta )$ . Assume that for any $f \in \mathcal{U}(\delta )$ , for any J ⊂ {1, …, n} such that | J | ≤ d, and for the polynomial $\widehat{f}$ obtained from f by changing the coefficient α _J to −α _J and leaving all other coefficients intact, the angle between non-zero complex numbers $\mathbf{E}\,\left (e^{\,f}\vert F\right )$ and $\mathbf{E}\,\left (e^{\widehat{f}}\vert F\right )$ does not exceed θ | α _J | .

Suppose that $\widehat{F} \subset \{-1,1\}^{n}$ is a face obtained from F by changing the sign of one of the coordinates in I ₊(F) ∪ I ₋(F). Then $G = F \cup \widehat{ F}$ is a face of { − 1, 1}ⁿ and for

$$\displaystyle{\tau =\cos { \theta \delta \over 2}}$$

we have

$$\displaystyle{\left \vert \mathbf{E}\,\left (e^{\,f}\vert G\right )\right \vert \ \geq \ { \tau \over 2}\left (\left \vert \mathbf{E}\,\left (e^{\,f}\vert F\right )\right \vert + \left \vert \mathbf{E}\,\left (e^{\,f}\vert \widehat{F}\right )\right \vert \right )}$$

for any $f \in \mathcal{U}(\delta )$ .

Proof

Suppose that $\widehat{F}$ is obtained from F by changing the sign of the i-th coordinate. Let $\tilde{f}$ be a polynomial obtained from f by replacing the coefficients α _I by −α _I whenever i ∈ I and leaving all other coefficients intact. Then $\tilde{f} \in \mathcal{U}(\delta )$ and the angle between $\mathbf{E}\,\left (e^{\,f}\vert F\right )$ and $\mathbf{E}\,\left (e^{\tilde{f}}\vert F\right )$ does not exceed

$$\displaystyle{\theta \sum _{I:\ i\in I}\vert \alpha _{I}\vert \ \leq \ \theta \delta.}$$

On the other hand, $\mathbf{E}\,\left (e^{\tilde{f}}\vert F\right ) = \mathbf{E}\,\left (e^{f}\vert \widehat{F}\right )$ and

$$\displaystyle{\mathbf{E}\,\left (e^{\,f}\vert G\right ) ={ 1 \over 2}\mathbf{E}\,\left (e^{\,f}\vert F\right ) +{ 1 \over 2}\mathbf{E}\,\left (e^{\,f}\vert \widehat{F}\right ) ={ 1 \over 2}\mathbf{E}\,\left (e^{\,f}\vert F\right ) +{ 1 \over 2}\mathbf{E}\,\left (e^{\tilde{f}}\vert F\right ).}$$

Thus $\mathbf{E}\,\left (e^{\,f}\vert G\right )$ is the sum of two non-zero complex numbers, the angle between which does not exceed θδ < π. Interpreting the complex numbers as vectors in $\mathbb{R}^{2} = \mathbb{C}$, we conclude that the length of the sum is at least as large as the length of the sum of the orthogonal projections of the vectors onto the bisector of the angle between them, and the proof follows. □

Proof of Theorem 1.1

Let us denote $d =\deg f$ .

One can observe that the equation

$$\displaystyle{{ 2 \over \cos \left ({ \theta \beta \over 2}\right )} =\theta }$$

has a solution θ ≥ 0 for all sufficiently small β > 0. Numerical computations show that one can choose

$$\displaystyle{\beta = 0.55,}$$

in which case

$$\displaystyle{\theta \approx 2.748136091.}$$

Let

$$\displaystyle{\delta ={ \beta \over \sqrt{d}} ={ 0.55 \over \sqrt{d}}.}$$

We observe that

$$\displaystyle{0\ <\ \theta \delta \ \leq \ \theta \beta \approx 1.511474850\ <\ \pi.}$$

Let

$$\displaystyle{\tau =\cos { \theta \delta \over 2} =\cos { \theta \beta \over 2\sqrt{d}}.}$$

In particular,

$$\displaystyle{\tau \ \geq \ \cos { \theta \beta \over 2} \approx 0.7277659962.}$$

Next, we will use the inequality

$$\displaystyle{ \left (\cos { \alpha \over \sqrt{d}}\right )^{d}\ \geq \ \cos \alpha \quad \text{for}\quad 0 \leq \alpha \leq { \pi \over 2}\quad \text{and}\quad d\ \geq \ 1. }$$

(15)

One can obtain (15) as follows. Since tan(0) = 0 and the function tanα is convex for 0 ≤ α < π∕2, we have

$$\displaystyle{\sqrt{d}\tan { \alpha \over \sqrt{d}}\ \leq \ \tan \alpha \quad \text{for}\quad 0 \leq \alpha <{ \pi \over 2}.}$$

Integrating, we obtain

$$\displaystyle{d\ln \cos { \alpha \over \sqrt{d}}\ \geq \ \ln \cos \alpha \quad \text{for}\quad 0 \leq \alpha <{ \pi \over 2}}$$

and (15) follows.

Using (15), we obtain

$$\displaystyle{{ 2 \over \left (\cos { \theta \delta \over 2}\right )^{d}} ={ 2 \over \left (\cos { \theta \beta \over 2\sqrt{d}}\right )^{d}}\ \leq \ { 2 \over \cos \left ({ \theta \beta \over 2}\right )} =\theta. }$$

(16)

We prove by induction on m = 0, 1, …, n the following three statements.

1.
Let F ⊂ {−1, 1}ⁿ be a face of dimension m. Then, for any $f \in \mathcal{U}(\delta )$, we have $\mathbf{E}\,\left (e^{\,f}\vert F\right )\neq 0$.
2.
Let F ⊂ {−1, 1}ⁿ be a face of dimension m, let $f \in \mathcal{U}(\delta )$ and let $\widehat{f}$ be a polynomial obtained from f by changing one of the coefficients α _J to −α _J and leaving all other coefficients intact. Then the angle between two non-zero complex numbers $\mathbf{E}\,\left (e^{\,f}\vert F\right )$ and $\mathbf{E}\,\left (e^{\widehat{f}}\vert F\right )$ does not exceed θ | α _J |.
3.
Let F ⊂ {−1, 1}ⁿ be a face of dimension m and let $f \in \mathcal{U}(\delta )$. Assuming that m > 0 and hence I(F) ≠ ∅, let us choose any i ∈ I(F) and let F ⁺ and F ⁻ be the corresponding faces of F obtained by fixing x _i = 1 and x _i = −1 respectively. Then
$$\displaystyle{\left \vert \mathbf{E}\,\left (e^{\,f}\vert F\right )\right \vert \ \geq \ { \tau \over 2}\left (\left \vert \mathbf{E}\,\left (e^{\,f}\vert F^{+}\right )\right \vert + \left \vert \mathbf{E}\,\left (e^{\,f}\vert F^{-}\right )\right \vert \right ).}$$

If m = 0 then F consists of a single point x ∈ {−1, 1}ⁿ, so

$$\displaystyle{\mathbf{E}\,\left (e^{\,f}\vert F\right ) = e^{f(x)}\neq 0}$$

and statement 1 holds. Assuming that $\widehat{f}$ is obtained from f by replacing the coefficient α _J with −α _J and leaving all other coefficients intact, we get

$$\displaystyle{{\mathbf{E}\,\left (e^{\,f}\vert F\right ) \over \mathbf{E}\,\left (e^{\widehat{f}}\vert F\right )} =\exp \left \{2\alpha _{J}\mathbf{x}^{J}\right \}.}$$

Since

$$\displaystyle{\vert 2\alpha _{J}\mathbf{x}^{J}\vert = 2\vert \alpha _{ J}\vert \ \leq \ \theta \vert \alpha _{J}\vert,}$$

the angle between $\mathbf{E}\,\left (e^{\,f}\vert F\right )$ and $\mathbf{E}\,\left (e^{\widehat{f}}\vert F\right )$ does not exceed θ | α _J | and statement 2 follows. The statement 3 is vacuous for m = 0.

Suppose that statements 1 and 2 hold for faces of dimension m < n. Lemma 4.2 implies that if F is a face of dimension m + 1 and F ⁺ and F ⁻ are m-dimensional faces obtained by fixing x _i for some i ∈ I(F) to x _i = 1 and x _i = −1 respectively, then

$$\displaystyle{\begin{array}{rl} \left \vert \mathbf{E}\,\left (e^{\,f}\vert F\right )\right \vert \ \geq \ &\left (\cos { \theta \delta \over 2}\right ){\left \vert \mathbf{E}\,\left (e^{\,f}\vert F^{+}\right )\right \vert +\left \vert \mathbf{E}\,\left (e^{\,f}\vert F^{-}\right )\right \vert \over 2} \\ =&{ \tau \over 2}\left (\left \vert \mathbf{E}\,\left (e^{\,f}\vert F^{+}\right )\right \vert + \left \vert \mathbf{E}\,\left (e^{\,f}\vert F^{-}\right )\right \vert \right )\end{array} }$$

and the statement 3 holds for (m + 1)-dimensional faces.

The statement 3 for (m + 1)-dimensional faces and the statement 1 for m-dimensional faces imply the statement 1 for (m + 1)-dimensional faces.

Finally, suppose that the statements 1 and 3 hold for all faces of dimension at most m + 1. Let us pick a face F ⊂ {−1, 1}ⁿ of dimension m + 1, where 0 ≤ m < n. Applying the condition of statement 3 recursively to the faces of F, we get that for any K ⊂ I(F),

$$\displaystyle{\left \vert \mathbf{E}\,\left (e^{\,f}\vert F\right )\right \vert \ \geq \ \left ({ \tau \over 2}\right )^{\vert K\vert }\sum _{ \epsilon \in \{-1,1\}^{K}}\left \vert \mathbf{E}\,\left (e^{\,f}\vert F^{\epsilon }\right )\right \vert.}$$

Then, by Lemma 4.1, the angle between two non-zero complex numbers $\mathbf{E}\,\left (e^{\,f}\vert F\right )$ and $\mathbf{E}\,\left (e^{\widehat{f}}\vert F\right )$ does not exceed

$$\displaystyle{{2\vert \alpha _{J}\vert \over \tau ^{d}} ={ 2\vert \alpha _{J}\vert \over \left (\cos { \theta \delta \over 2}\right )^{d}}\ \leq \ \theta \vert \alpha _{J}\vert }$$

by (16), and the statement 2 follows for faces of dimension m + 1.

This proves that statements 1–3 hold for faces F of all dimensions. Iterating statement 3, we obtain that for any $f \in \mathcal{U}(\delta )$, we have

$$\displaystyle{\left \vert \mathbf{E}\,e^{\,f}\right \vert \ \geq \ \left ({ \tau \over 2}\right )^{n}\sum _{ x\in \{-1,1\}^{n}}\vert e^{f(x)}\vert.}$$

Since for any x ∈ {−1, 1}ⁿ and for any $f \in \mathcal{U}(\delta )$, we have

$$\displaystyle{\vert f(x)\vert \ \leq \ \sum _{i=1}^{n}\sum _{ \begin{array}{c}I\subset \{1,\ldots,n\}\\ i\in I\end{array}}\vert \alpha _{I}\vert \ \leq \ n\delta \ \leq \ \beta n,}$$

we conclude that

$$\displaystyle{\left \vert \mathbf{E}\,e^{\,f}\right \vert \ \geq \ \tau ^{n}e^{-\beta n}\ \geq \ (0.41)^{n}.}$$

The proof follows since if $f:\{ -1,1\}^{n}\longrightarrow \mathbb{C}$ is a polynomial with zero constant term and

$$\displaystyle{\vert \lambda \vert \ \leq \ { 0.55 \over L(\,f)\sqrt{\deg f}},}$$

then $\lambda f \in \mathcal{U}(\delta )$. □

5 Proofs of Theorems 2.1 and 2.2

The proofs of Theorems 2.1 and 2.2 are based on the following lemma.

Lemma 5.1

Let

$$\displaystyle{f(x) =\sum _{I\in \mathcal{F}}\alpha _{I}\mathbf{x}^{I}}$$

be a polynomial such that α _I ≥ 0 for all $I \in \mathcal{F}$ . Then

$$\displaystyle{\mathbf{E}\,e^{\,f}\ \geq \ \prod _{ I\in \mathcal{F}}\left ({e^{\alpha _{I}} + e^{-\alpha _{I}} \over 2} \right ).}$$

Proof

Since

$$\displaystyle{e^{\alpha x} = \left ({e^{\alpha } + e^{-\alpha } \over 2} \right ) + x\left ({e^{\alpha } - e^{-\alpha } \over 2} \right )\quad \text{for}\quad x = \pm 1,}$$

we have

$$\displaystyle{ \mathbf{E}\,e^{\,f} = \mathbf{E}\,\prod _{ I\in \mathcal{F}}e^{\alpha _{I}\mathbf{x}^{I} } = \mathbf{E}\,\prod _{I\in \mathcal{F}}\left (\left ({e^{\alpha _{I}} + e^{-\alpha _{I}} \over 2} \right ) + \mathbf{x}^{I}\left ({e^{\alpha _{I}} - e^{-\alpha _{I}} \over 2} \right )\right ). }$$

(17)

Since

$$\displaystyle{{e^{\alpha _{I}} - e^{-\alpha _{I}} \over 2} \geq 0\quad \text{provided}\quad \alpha _{I} \geq 0}$$

and

$$\displaystyle{\mathbf{E}\,\left (\mathbf{x}^{I_{1} }\cdots \mathbf{x}^{I_{k} }\right )\ \geq \ 0\quad \text{for all}\quad I_{1},\ldots,I_{k},}$$

expanding the product in (17) and taking the expectation, we get the desired inequality. □

Next, we prove a similar estimate for functions f that allow some monomials with negative coefficients.

Lemma 5.2

Let f(x) = g(x) − h(x) where

$$\displaystyle{g(x) =\sum _{I\in \mathcal{G}}\mathbf{x}^{I},\quad h(x) =\sum _{ I\in \mathcal{H}}\mathbf{x}^{I},\quad \mathcal{G}\cap \mathcal{H} =\emptyset.}$$

Suppose that the constant terms of g and h are 0 and that every variable x _i enters not more than k monomials of f for some integer k > 0. Then

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{3\lambda ^{2} \over 8} \left (\vert \mathcal{G}\vert - (k - 1)\vert \mathcal{H}\vert \right )\right \}\quad \mathit{\text{for}}\quad 0 \leq \lambda \leq 1.}$$

Proof

Since E f = 0, by Jensen’s inequality we have

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ 1}$$

and the estimate follows if $\vert \mathcal{G}\vert \leq (k - 1)\vert \mathcal{H}\vert$. Hence we may assume that $\vert \mathcal{G}\vert> (k - 1)\vert \mathcal{H}\vert$.

Given a function $f:\{ -1,1\}^{n}\longrightarrow \mathbb{R}$ and a set J ⊂ {1, …, n} of indices, we define a function (conditional expectation) $f_{J}:\{ -1,1\}^{n-\vert J\vert }\longrightarrow \mathbb{R}$ obtained by averaging over variables x _j with j ∈ J:

$$\displaystyle{f_{J}\left (x_{i}:\ i\notin J\right ) ={ 1 \over 2^{\vert J\vert }}\sum _{ \begin{array}{c}x_{j}=\pm 1 \\ j\in J \end{array}}f\left (x_{1},\ldots,x_{n}\right ).}$$

In particular, f _J = f if J = ∅ and f _J = E f if J = {1, …, n}. We obtain the monomial expansion of f _J by erasing all monomials of f that contain x _j with j ∈ J. By Jensen’s inequality we have

$$\displaystyle{ \mathbf{E}\,e^{\lambda f}\ \geq \ \mathbf{E}\,e^{\lambda f_{J} }\quad \text{for all real}\quad \lambda. }$$

(18)

Let us choose a set J of indices with $\vert J\vert \leq \vert \mathcal{H}\vert$ such that every monomial in h(x) contains at least one variable x _j with j ∈ J. Then every variable x _j with j ∈ J is contained in at most (k − 1) monomials of g(x) and hence f _J is a sum of at least $\vert \mathcal{G}\vert - (k - 1)\vert \mathcal{H}\vert$ monomials.

From (18) and Lemma 5.1, we obtain

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \mathbf{E}\,e^{\lambda f_{J} }\ \geq \ \left ({e^{\lambda } + e^{-\lambda } \over 2} \right )^{\vert \mathcal{G}\vert -(k-1)\vert \mathcal{H}\vert }\ \geq \ \left (1 +{ \lambda ^{2} \over 2}\right )^{\vert \mathcal{G}\vert -(k-1)\vert \mathcal{H}\vert }.}$$

Using that

$$\displaystyle{ \ln (1 + x)\ \geq \ x -{ x^{2} \over 2} = x\left (1 -{ x \over 2}\right )\quad \text{for}\quad x \geq 0, }$$

(19)

we conclude that

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{\lambda ^{2} \over 2}\left (1 -{\lambda ^{2} \over 4}\right )\left (\vert \mathcal{G}\vert - (k - 1)\vert \mathcal{H}\vert \right )\right \}\ \geq \ \exp \left \{{3\lambda ^{2} \over 8} \left (\vert \mathcal{G}\vert - (k - 1)\vert \mathcal{H}\vert \right )\right \}}$$

as desired. □

Now we are ready to prove Theorem 2.2.

Proof of Theorem 2.2

Let x ₀ ∈ {−1, 1}ⁿ, $x_{0} = \left (\xi _{1},\ldots,\xi _{n}\right )$ be a maximum point of f, so that

$$\displaystyle{\max _{x\in \{-1,1\}^{n}}f(x) = f(x_{0}).}$$

Let us define $\tilde{f}:\{ -1,1\}^{n}\longrightarrow \mathbb{R}$ by

$$\displaystyle{\tilde{f}\left (x_{1},\ldots,x_{n}\right ) = f\left (\xi _{1}x_{1},\ldots,\xi _{n}x_{n}\right ).}$$

Then

$$\displaystyle{\max _{x\in \{-1,1\}^{n}}f(x) =\max _{x\in \{-1,1\}^{n}}\tilde{f}(x),\quad \mathbf{E}\,e^{\lambda f} = \mathbf{E}\,e^{\lambda \tilde{f}}}$$

and the maximum value of $\tilde{f}$ on the cube { − 1, 1}ⁿ is attained at $u = \left (1,\ldots,1\right )$. Hence without loss of generality, we may assume that the maximum value of f on the cube { − 1, 1}ⁿ is attained at u = (1, …, 1).

We write

$$\displaystyle{f(x) = g(x) - h(x)\quad \text{where}\quad g(x) =\sum _{I\in \mathcal{G}}\mathbf{x}^{I}\quad \text{and}\quad h(x) =\sum _{ I\in \mathcal{H}}\mathbf{x}^{I}}$$

for some disjoint sets $\mathcal{G}$ and $\mathcal{H}$ of indices. Moreover,

$$\displaystyle{\max _{x\in \{-1,1\}^{n}}f(x) = f(u) = \vert \mathcal{G}\vert -\vert \mathcal{H}\vert \ \geq \ { k - 1 \over k} \vert \mathcal{F}\vert.}$$

Since

$$\displaystyle{\vert \mathcal{G}\vert + \vert \mathcal{H}\vert = \vert \mathcal{F}\vert,}$$

we conclude that

$$\displaystyle{\vert \mathcal{G}\vert \ \geq \ { 2k - 1 \over 2k} \vert \mathcal{F}\vert \quad \text{and}\quad \vert \mathcal{H}\vert \ \leq \ { 1 \over 2k}\vert \mathcal{F}\vert.}$$

By Lemma 5.2,

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{3\lambda ^{2} \over 8} \left (\vert \mathcal{G}\vert - (k - 1)\vert \mathcal{H}\vert \right )\right \}\ \geq \ \exp \left \{{3\lambda ^{2} \over 16}\vert \mathcal{F}\vert \right \}}$$

as desired. □

To prove Theorem 2.1, we need to handle negative terms with more care.

Lemma 5.3

Let f(x) = g(x) − h(x) where

$$\displaystyle{g(x) =\sum _{I\in \mathcal{G}}\mathbf{x}^{I},\quad h(x) =\sum _{ I\in \mathcal{H}}\mathbf{x}^{I},\quad \mathcal{G}\cap \mathcal{H} =\emptyset }$$

and

$$\displaystyle{\vert \mathcal{G}\vert \ \geq \ \vert \mathcal{H}\vert.}$$

Suppose that the constant terms of g and h are 0 and that the supports $I \in \mathcal{H}$ of monomials in h(x) are pairwise disjoint. Then

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{3\lambda ^{2} \over 8} \left (\sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }\right )^{2}\right \}\quad \mathit{\text{for}}\quad 0 \leq \lambda \leq 1.}$$

Proof

By Jensen’s inequality we have

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{\lambda \mathbf{E}\,f\right \} = 1,}$$

which proves the lemma in the case when $\vert \mathcal{G}\vert = \vert \mathcal{H}\vert$. Hence we may assume that $\vert \mathcal{G}\vert> \vert \mathcal{H}\vert$.

If $\vert \mathcal{H}\vert = 0$ then, applying Lemma 5.1, we obtain

$$\displaystyle{\mathbf{E}\,e^{\lambda f} = \mathbf{E}\,e^{\lambda g}\ \geq \ \left ({e^{\lambda } + e^{-\lambda } \over 2} \right )^{\vert \mathcal{G}\vert }\ \geq \ \left (1 +{ \lambda ^{2} \over 2}\right )^{\vert \mathcal{G}\vert }.}$$

Using (19), we conclude that

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{\lambda ^{2} \over 2}\left (1 -{\lambda ^{2} \over 4}\right )\vert \mathcal{G}\vert \right \}\ \geq \ \exp \left \{{3\lambda ^{2} \over 8} \vert \mathcal{G}\vert \right \},}$$

which proves the lemma in the case when $\vert \mathcal{H}\vert = 0$. Hence we may assume that $\vert \mathcal{G}\vert> \vert \mathcal{H}\vert> 0$.

Since the supports $I \in \mathcal{H}$ of monomials in h are pairwise disjoint, we have

$$\displaystyle{ \mathbf{E}\,e^{\lambda h} =\prod _{ I\in \mathcal{H}}\mathbf{E}\,e^{\lambda \mathbf{x}^{I} } = \left ({e^{\lambda } + e^{-\lambda } \over 2} \right )^{\vert \mathcal{H}\vert }. }$$

(20)

Let us choose real p, q ≥ 1, to be specified later, such that

$$\displaystyle{{1 \over p} +{ 1 \over q} = 1.}$$

Applying the Hölder inequality, we get

$$\displaystyle{\mathbf{E}\,e^{\lambda g/p} = \mathbf{E}\,\left (e^{\lambda f/p}e^{\lambda h/p}\right )\ \leq \ \left (\mathbf{E}\,e^{\lambda f}\right )^{1/p}\left (\mathbf{E}\,e^{\lambda qh/p}\right )^{1/q}}$$

and hence

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ { \left (\mathbf{E}\,e^{\lambda g/p}\right )^{p} \over \left (\mathbf{E}\,e^{\lambda qh/p}\right )^{p/q}}.}$$

Applying Lemma 5.1 and formula (20), we obtain

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \left ({e^{\lambda /p} + e^{-\lambda /p} \over 2} \right )^{\vert \mathcal{G}\vert p}\left ({e^{\lambda q/p} + e^{-\lambda q/p} \over 2} \right )^{-\vert \mathcal{H}\vert p/q}.}$$

Since

$$\displaystyle{e^{x^{2}/2 }\ \geq \ { e^{x} + e^{-x} \over 2} \ \geq \ 1 +{ x^{2} \over 2} \quad \text{for}\quad x \geq 0,}$$

we obtain

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \left (1 +{ \lambda ^{2} \over 2p^{2}}\right )^{\vert \mathcal{G}\vert p}\exp \left \{-{\lambda ^{2}q\vert \mathcal{H}\vert \over 2p} \right \}.}$$

Applying (19), we obtain

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{\lambda ^{2}\vert \mathcal{G}\vert \over 2p} -{\lambda ^{2}q\vert \mathcal{H}\vert \over 2p} -{\lambda ^{4}\vert \mathcal{G}\vert \over 8p^{3}} \right \}.}$$

Let us choose

$$\displaystyle{p ={ \sqrt{\vert \mathcal{G}\vert } \over \sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }}\quad \text{and}\quad q ={ \sqrt{\vert \mathcal{G}\vert } \over \sqrt{\vert \mathcal{H}\vert }}.}$$

Then

$$\displaystyle{\begin{array}{rl} \mathbf{E}\,e^{\lambda f}\ \geq \ &\exp \left \{{\lambda ^{2} \over 2}\left (\sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }\right )^{2} -{\lambda ^{4}\left (\sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }\right )^{3} \over 8\sqrt{\vert \mathcal{G}\vert }} \right \} \\ =\ &\exp \left \{{\lambda ^{2} \over 2}\left (\sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }\right )^{2}\left (1 -{\lambda ^{2}\left (\sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }\right ) \over 4\sqrt{\vert \mathcal{G}\vert }} \right )\right \} \\ \geq &\ \exp \left \{{3\lambda ^{2} \over 8} \left (\sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }\right )^{2}\right \} \end{array} }$$

and the proof follows. □

Lemma 5.4

Let f(x) = g(x) − h(x) where

$$\displaystyle{g(x) =\sum _{I\in \mathcal{G}}\mathbf{x}^{I},\quad h(x) =\sum _{ I\in \mathcal{H}}\mathbf{x}^{I},\quad \mathcal{G}\cap \mathcal{H} =\emptyset }$$

and

$$\displaystyle{\vert \mathcal{G}\vert \ \geq \ \vert \mathcal{H}\vert.}$$

Suppose that the constant terms of g and h are 0, that every variable x _i enters at most two monomials in h(x) and that if x _i enters exactly two monomials in h(x) then x _i enters at most two monomials in g(x). Then for 0 ≤ λ ≤ 1, we have

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{3\lambda ^{2} \over 8} \left (\sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }\right )^{2}\right \}.}$$

Proof

We proceed by induction on the number k of variables x _i that enter exactly two monomials in h(x). If k = 0 then the result follows by Lemma 5.3.

Suppose that k > 0 and that x _i is a variable that enters exactly two monomials in h(x) and hence at most two monomials in g(x). As in the proof of Lemma 5.2, let $f_{i}:\{ 0,1\}^{n-1}\longrightarrow \mathbb{R}$ be the polynomial obtained from f by averaging with respect to x _i. As in the proof of Lemma 5.2, we have

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \mathbf{E}\,e^{\lambda f_{i} }\quad \text{where}\quad f_{i}(x) =\sum _{I\in \mathcal{G}_{i}}\mathbf{x}^{I} -\sum _{ I\in \mathcal{H}_{i}}\mathbf{x}^{I}}$$

and $\mathcal{G}_{i}$, respectively $\mathcal{H}_{i}$, is obtained from $\mathcal{G}$, respectively $\mathcal{H}$, by removing supports of monomials containing x _i. In particular,

$$\displaystyle{\vert \mathcal{H}_{i}\vert = \vert \mathcal{H}\vert - 2\quad \text{and}\quad \vert \mathcal{G}_{i}\vert \geq \vert \mathcal{G}\vert - 2.}$$

Applying the induction hypothesis to f _i, we obtain

$$\displaystyle{\begin{array}{rl} \mathbf{E}\,e^{\lambda f}\ \geq \ &\mathbf{E}\,e^{\lambda f_{i}}\ \geq \ \exp \left \{{3\lambda ^{2} \over 8} \left (\sqrt{\vert \mathcal{G}_{i } \vert }-\sqrt{\vert \mathcal{H}_{i } \vert }\right )^{2}\right \} \\ \geq \ &\exp \left \{{3\lambda ^{2} \over 8} \left (\sqrt{\vert \mathcal{G}\vert - 2} -\sqrt{\vert \mathcal{H}\vert - 2}\right )^{2}\right \}\ \geq \ \exp \left \{{3\lambda ^{2} \over 8} \left (\sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }\right )^{2}\right \} \end{array} }$$

and the proof follows. □

Finally, we are ready to prove Theorem 2.1.

Proof of Theorem 2.1

As in the proof of Theorem 2.2 , without loss of generality we may assume that the maximum of f is attained at u = (1, …, 1).

We write

$$\displaystyle{f(x) = g(x) - h(x)\quad \text{where}\quad g(x) =\sum _{I\in \mathcal{G}}\mathbf{x}^{I}\quad \text{and}\quad h(x) =\sum _{ I\in \mathcal{H}}\mathbf{x}^{I}}$$

for some disjoint sets $\mathcal{G}$ and $\mathcal{H}$ of indices. Moreover,

$$\displaystyle{\max _{x\in \{-1,1\}^{n}}f(x) = f(u) = \vert \mathcal{G}\vert -\vert \mathcal{H}\vert =\delta \vert \mathcal{F}\vert.}$$

Since

$$\displaystyle{\vert \mathcal{G}\vert + \vert \mathcal{H}\vert = \vert \mathcal{F}\vert,}$$

we conclude that

$$\displaystyle{ \vert \mathcal{G}\vert ={ 1+\delta \over 2} \vert \mathcal{F}\vert \quad \text{and}\quad \vert \mathcal{H}\vert ={ 1-\delta \over 2} \vert \mathcal{F}\vert. }$$

(21)

For i = 1, …, n let μ _i ⁺ be the number of monomials in g that contain variable i and let μ _i ⁻ be the number of monomials in h that contain x _i. Then

$$\displaystyle{ \mu _{i}^{+} +\mu _{ i}^{-}\ \leq \ 4\quad \text{for}\quad i = 1,\ldots,n. }$$

(22)

If for some i we have μ _i ⁺ < μ _i ⁻ then for the point u _i obtained from u by switching the sign of the i-th coordinate, we have

$$\displaystyle{f(u_{i}) = \left (\vert \mathcal{G}\vert - 2\mu _{i}^{+}\right ) -\left (\vert \mathcal{H}\vert - 2\mu _{ i}^{-}\right ) = \vert \mathcal{G}\vert -\vert \mathcal{H}\vert + 2\left (\mu _{ i}^{-}-\mu _{ i}^{+}\right )\>\ f(u),}$$

contradicting that u is a maximum point of f. Therefore,

$$\displaystyle{\mu _{i}^{+}\ \geq \ \mu _{ i}^{-}\quad \text{for}\quad i = 1,\ldots,n}$$

and, in view of (22), we conclude that

$$\displaystyle{\mu _{i}^{-}\ \leq \ 2\quad \text{for}\quad i = 1,\ldots,n\quad \text{and if}\quad \mu _{ i}^{-} = 2\quad \text{then}\quad \mu _{ i}^{+} = 2.}$$

By Lemma 5.4,

$$\displaystyle{\mathbf{E}\,e^{\lambda f}\ \geq \ \exp \left \{{3\lambda ^{2} \over 8} \left (\sqrt{\vert \mathcal{G}\vert }-\sqrt{\vert \mathcal{H}\vert }\right )^{2}\right \}.}$$

Using (21), we deduce that

$$\displaystyle{\begin{array}{rl} \mathbf{E}\,e^{\lambda f}\ \geq \ &\exp \left \{{3\lambda ^{2} \over 8} \left (\sqrt{{1+\delta \over 2}} -\sqrt{{1-\delta \over 2}} \right )^{2}\vert \mathcal{F}\vert \right \} \\ =\ &\exp \left \{{3\lambda ^{2} \over 8} \left (1 -\sqrt{1 -\delta ^{2}}\right )\vert \mathcal{F}\vert \right \}\ \geq \ \exp \left \{{3\lambda ^{2}\delta ^{2} \over 16} \vert \mathcal{F}\vert \right \}, \end{array} }$$

which completes the proof. □

References

A. Bandyopadhyay, D. Gamarnik, Counting without sampling: asymptotics of the log-partition function for certain statistical physics models. Random Struct. Algorithm 33(4), 452–479 (2008)
Article MathSciNet MATH Google Scholar
B. Barak, A. Moitra, R. O’Donnell, P. Raghavendra, O. Regev, D. Steurer, L. Trevisan, A. Vijayaraghavan, D. Witmer, J. Wright, Beating the random assignment on constraint satisfaction problems of bounded degree, in Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. LIPIcs. Leibniz International Proceedings in Informatics, vol. 40 (Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2015), pp. 110–123
Google Scholar
A. Barvinok, Computing the partition function for cliques in a graph. Theor. Comput. 11, Article 13, 339–355 (2015)
Google Scholar
A. Barvinok, Computing the permanent of (some) complex matrices. Found. Comput. Math. 16(2), 329–342 (2016)
Article MathSciNet MATH Google Scholar
A. Barvinok, P. Soberón, Computing the partition function for graph homomorphisms. Combinatorica (2016). doi:10.1007/s00493-016-3357-2
Article Google Scholar
A. Barvinok, P. Soberón, Computing the partition function for graph homomorphisms with multiplicities. J. Comb. Theory Ser. A 137, 1–26 (2016)
Article MathSciNet MATH Google Scholar
E. Boros, P.L. Hammer, Pseudo-boolean optimization. Discret. Appl. Math. 123(1–3), 155–225 (2002). Workshop on Discrete Optimization (DO’99), Piscataway
Google Scholar
R.L. Dobrushin, S.B. Shlosman, Completely analytical interactions: constructive description. J. Stat. Phys. 46(5–6), 983–1014 (1987)
Article MathSciNet MATH Google Scholar
J. Håstad, On bounded occurrence constraint satisfaction. Inf. Process. Lett. 74(1–2), 1–6 (2000)
Article MathSciNet Google Scholar
J. Håstad, Some optimal inapproximability results. J. ACM 48(4), 798–859 (2001)
Article MathSciNet MATH Google Scholar
J. Håstad, Improved bounds for bounded occurrence constraint satisfaction, manuscript (2005). Available at https://www.nada.kth.se/~johanh/bounded2.pdf
Google Scholar
J. Håstad, S. Venkatesh, On the advantage over a random assignment. Random Struct. Algorithm 25(2), 117–149 (2004)
Article MathSciNet MATH Google Scholar
J. Kahn, N. Linial, A. Samorodnitsky, Inclusion-exclusion: exact and approximate. Combinatorica 16(4), 465–477 (1996)
Article MathSciNet MATH Google Scholar
S. Khot, A. Naor, Grothendieck-type inequalities in combinatorial optimization. Commun. Pure Appl. Math. 65(7), 992–1035 (2012)
Article MathSciNet MATH Google Scholar
T.D. Lee, C.N. Yang, Statistical theory of equations of state and phase transitions. II. Lattice gas and Ising model. Phys. Rev. (2) 87, 410–419 (1952)
Google Scholar
G. Regts, Zero-free regions of partition functions with applications to algorithms and graph limits. Combinatorica (2017). doi:10.1007/s00493-016-3506-7
Google Scholar
D. Weitz, Counting independent sets up to the tree threshold, in Proceedings of the 38th Annual ACM Symposium on Theory of Computing, STOC’06 (ACM, New York, 2006), pp. 140–149
Google Scholar
C.N. Yang, T.D. Lee, Statistical theory of equations of state and phase transitions. I. Theory of condensation. Phys. Rev. (2) 87, 404–409 (1952)
Google Scholar
D. Zuckerman, Linear degree extractors and the inapproximability of max clique and chromatic number. Theor. Comput. 3, 103–1283 (2007)
Article MathSciNet MATH Google Scholar

Download references

Acknowledgements

I am grateful to Johan Håstad for advice and references on optimizing a polynomial on the Boolean cube and to the anonymous referees for careful reading of the paper and useful suggestions.

Author information

Authors and Affiliations

Department of Mathematics, University of Michigan, 530 Church street, 48109-1043, Ann Arbor, MI, USA
Alexander Barvinok

Authors

Alexander Barvinok
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Alexander Barvinok .

Editor information

Editors and Affiliations

Department of Applied Mathematics, Charles University, Praha, Czech Republic
Martin Loebl
Computer Science Institute of Charles University, Charles University, Praha, Czech Republic
Jaroslav Nešetřil
School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia, USA
Robin Thomas

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Barvinok, A. (2017). Computing the Partition Function of a Polynomial on the Boolean Cube. In: Loebl, M., Nešetřil, J., Thomas, R. (eds) A Journey Through Discrete Mathematics. Springer, Cham. https://doi.org/10.1007/978-3-319-44479-6_7

Download citation

DOI: https://doi.org/10.1007/978-3-319-44479-6_7
Published: 09 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44478-9
Online ISBN: 978-3-319-44479-6
eBook Packages: Mathematics and StatisticsMathematics and Statistics (R0)

Publish with us

Policies and ethics

Computing the Partition Function of a Polynomial on the Boolean Cube

Abstract

Similar content being viewed by others

On the Fourier spectrum of functions on Boolean cubes

A Formula for the Partition Function That “Counts”

The Turán and Laguerre inequalities for quasi-polynomial-like functions

1991 Mathematics Subject Classification.

1 Introduction and Main Results

1.1 Polynomials and Partition Functions

Theorem 1.1

Theorem 1.2

1.2 Relation to Prior Work

2 Applications to Optimization

2.1 Maximizing a Polynomial on the Boolean Cube

Theorem 2.1

Theorem 2.2

3 Computing the Partition Function

3.1 Computing the Taylor Polynomial of \(g(\lambda ) =\ln \left (\mathbf{E}\,e^{\lambda f}\right )\)

Lemma 3.1

Proof

Lemma 3.2

Proof

Proof of Theorem 1.2

3.2 Computing a Point y in the Cube with a Large Value of f( y)

4 Proof of Theorem 1.1

4.1 Faces

4.2 The Space of Polynomials

4.3 Restriction of the Partition Function onto a Face

Lemma 4.1

Proof

Lemma 4.2

Proof

Proof of Theorem 1.1

5 Proofs of Theorems 2.1 and 2.2

Lemma 5.1

Proof

Lemma 5.2

Proof

Proof of Theorem 2.2

Lemma 5.3

Proof

Lemma 5.4

Proof

Proof of Theorem 2.1

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this chapter

Cite this chapter

Download citation

Share this chapter

Publish with us

Search

Navigation