1 Introduction

We consider the quadratic form

$$\begin{aligned} H(J, \eta ) = W \sum _{i,j} J_{i,j}\, \eta _{i} \eta _{j} \end{aligned}$$
(1)

where \(J \in \mathbb {R}^{N \times N}\) is a matrix of coefficients, \(\eta \in \mathcal {A}^N\) (with \(\mathcal {A}\) a finite subset of \(\mathbb {R}\)), \(\eta _{i}\) is the value of the i-th component of \(\eta \), and \(W \in \mathbb {R}\) is a suitable normalizing constant. This quadratic form plays an important role both in combinatorial optimization and statistical mechanics.

In statistical mechanics (1) is the Hamiltonian of several physical systems whose nature depends on the elements of \(\mathcal {A}\). For instance, if \(\mathcal {A} = \{-1, 1\}\) the Hamiltonian describes a system of (pairwise) interacting spins whereas if \(\mathcal {A} = \{0, 1\}\) it is generally used to describe a system of (pairwise) interacting particles. Spins or particles live on the vertices of some, possibly oriented, graph \(\mathcal {G} = \{V, E\}\), called the interaction graph, with \(|V| = N\) and J the weighted adjacency matrix of \(\mathcal {G}\). For each \((i,j) \in E\) the entry \(J_{i,j}\) represents the strength of the interaction between the entities (spins or particles) at vertices i and j of \(\mathcal {G}\). The microscopic state of the physical system is given by \(\eta \).

The physical system is described in terms of the probability of its microscopic states (Gibbs measure):

$$\begin{aligned} \mu (\eta ) = \frac{e^{-\beta H(J, \eta )}}{Z_{\beta }} \end{aligned}$$
(2)

where the parameter \(\beta \) is called the inverse temperature and \(Z_{\beta }\) is a normalizing constant called partition function.

In combinatorial optimization the problem of minimizing (or maximizing) (1) when \(\mathcal {A} = \{0, 1\}\) and \(W = 1\) is known under the names Quadratic Unconstrained Binary Optimization (QUBO) or Unconstrained Binary Quadratic Programming (UBQP). QUBO is NP-hard and, in general, no polynomial time algorithm is known to find a minimizer of H. Many problems in combinatorial optimization and data analysis can be mapped to QUBO in a straightforward manner (see [14] for a survey). Even constrained optimization problems can be reduced to QUBO by introducing quadratic penalties in the objective function.
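To fix ideas, the following Python sketch (ours, not code from the paper) builds a small random instance of (1) and finds its minimizer and maximizer by exhaustive enumeration; the normalization W is left as a parameter (\(W = 1\) for a plain QUBO instance, \(W = \frac{1}{\sqrt{N}}\) for the normalization used later in the paper).

import itertools
import numpy as np

def brute_force_extrema(J, W):
    # Exhaustively minimize and maximize H(J, eta) = W * sum_ij J_ij eta_i eta_j
    # over eta in {0, 1}^N.
    N = J.shape[0]
    best_min, best_max = (np.inf, None), (-np.inf, None)
    for bits in itertools.product((0, 1), repeat=N):
        eta = np.array(bits, dtype=float)
        h = W * (eta @ J @ eta)
        if h < best_min[0]:
            best_min = (h, eta)
        if h > best_max[0]:
            best_max = (h, eta)
    return best_min, best_max

rng = np.random.default_rng(0)
N = 12                                    # exhaustive search visits 2**N configurations
J = rng.standard_normal((N, N))           # i.i.d. standard normal couplings
(h_min, eta_min), (h_max, eta_max) = brute_force_extrema(J, W=1.0 / np.sqrt(N))
print("minimum per particle:", -h_min / N, "ones in minimizer:", eta_min.mean())
print("maximum per particle:",  h_max / N, "ones in maximizer:", eta_max.mean())

Exhaustive enumeration is of course feasible only for small N; the heuristic actually used for the simulations is described in Sect. 3.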

Minimizers (ground states) of forms like H in (1) are of relevance in the context of statistical mechanics as well. Indeed, the ground states are the configurations with the highest probability with respect to the probability measure (2). As the temperature of the physical system approaches zero (\(\beta \rightarrow \infty \)), the system eventually reaches its ground state.

If the signs of the entries of J are disordered, i.e. if it is not possible to find the global minimizer with a local criterion, finding the ground states of the system is non-trivial.

In the context of statistical mechanics, there is a vast literature concerning the properties of the ground states of H of the form (1). Here we recall the comprehensive texts by Parisi, Mezard, and Virasoro [16] (for a more physical perspective) and more recently by Talagrand [23, 24] and Panchenko [18]. The recent book [7] provides an up-to-date overview.

The system that has attracted the most effort from physicists, and that is described most extensively in the references above, is the so-called Sherrington–Kirkpatrick (SK) model, in which \(\eta \in \{-1, 1\}^{N}\), \(W = \frac{1}{\sqrt{N}}\) and the \(J_{i,j}\) are independent standard Gaussian random variables. More recently, still in the context of spin systems, the universality of the features of the SK model with respect to the distribution of the (independent) \(J_{i,j}\) has been proven; see [6, 8].

The \(\{0, 1\}\) counterpart of the Sherrington–Kirkpatrick model, whose Hamiltonian matches the objective function of a QUBO instance, has not received the same attention in the statistical mechanics literature. Important results have been obtained in [17, 19], where features of the minimizers and maximizers analogous to those of the SK model have been proven for the Hamiltonian (1) with \(J_{i,j}\) independent Gaussian random variables and a quite general choice of the set \(\mathcal {A}\), including the particle case \(\eta \in \{0, 1\}^{N}\). In particular, it has been proven that the free energy of the system is close to its expectation with probability one (concentration) and that its thermodynamic limit exists.

The same problem, in the canonical ensemble, i.e. with a fixed number of particles, has been discussed in [12]. There, for a general distribution of the \(J_{i,j}\), interesting features of the system have been found, most notably at small particle density.

Restricting to the case \(\eta \in \{0, 1\}^N\) with \(J_{i,j}\) independent Gaussian random variables, and with very different and more elementary techniques, [21] provides an almost sure lower bound for the minimum per particle of H together with a different proof of concentration.

In this paper we show, in the spirit of [6], that the results of [21] are rather robust with respect to the distribution of the \(J_{i,j}\)’s. In particular, we consider the case of independent \(J_{i,j}\) with \(\mathbb {E}\left[ J_{i,j}\right] = 0\). If the tails of the distributions of the \(J_{i,j}\) are not too fat, then, after proper normalization, the value of the minimum of H is close to its expectation with high probability and does not depend on the actual distribution of the \(J_{i,j}\). Further, with the help of numerical simulations we provide some insight into the structure of both the minimizer and the maximizer of H and show that this structure, too, is robust. Note that our results hold in the diluted case as well, that is, in the case where each \(J_{i,j}\) is zero with a certain probability. This probability need not be fixed: it is allowed to go to 1 as \(N \rightarrow \infty \) in a suitable way.

Rigorous results are presented in Sect. 2, whereas the numerical findings concerning the structure of the minimizer and the maximizer of H are detailed in Sect. 3. Proofs are given in Sect. 4.

Throughout the paper, we use the jargon of statistical mechanics. As a consequence, we use expressions like (particle) configuration when referring to \(\eta \), number of particles when referring to N, Hamiltonian for the quadratic form (1), and energy and energy per particle (of a configuration \(\eta \)) for, respectively, \(H(J, \eta )\) and \(\frac{H(J, \eta )}{N}\). Likewise, the minimizer of (1) is often referred to as the ground state of H and we use the expression thermodynamic limit to denote the limit as \(N \rightarrow \infty \).

Remark 1.1

Note that H denotes a family of random variables indexed by N. However, we do not write this dependence explicitly to lighten the notation.

2 Main results

We consider random instances of QUBO, that is we assume the matrix \(J = \{J_{i,j}\}_{1 \le i,j \le N}\) to be the realization of some multivariate random variable. Unless otherwise specified we will assume the \(J_{i,j}\)’s to be independent, identically distributed, and such that \(\mathbb {E}\left[ J_{i,j}\right] = 0\) and \({\text {Var}}\left( J_{i,j}\right) = \sigma ^2\). As for the normalizing constant W, we take it to be such that the random variable \(W \sum _{i,j} J_{i,j}\) has variance N.

Remark 2.1

Note that, in general, the distribution of the \(J_{i,j}\) is allowed to have atoms. In particular, we will be interested in random variables taking the value zero with probability \(1 - p(N) = 1 - N^{\delta - 2}\) for \(\delta \) in \(\left( 1, 2\right] \). In this way we can include in our analysis the diluted case, that is the case where the matrix J of the coefficients of the objective function is “sparse” with expected density \(\rho = p(N)\). Note that, in this way, the average degree of each vertex grows unbounded with N. For \(W \sum _{i,j} J_{i,j}\) to have variance N one should set \(W = \frac{1}{\sqrt{\rho N {\text {Var}}\left( J_{1,1}\right) }}\), where \({\text {Var}}\left( J_{1,1}\right) \) denotes the variance of the nonzero entries.
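For concreteness, here is a short Python sketch (our illustration of Remark 2.1, assuming standard normal nonzero entries) that samples a diluted coupling matrix together with the corresponding normalizing constant W.

import numpy as np

def diluted_couplings(N, delta, rng):
    # Remark 2.1: J_ij = 0 with probability 1 - p(N), standard normal otherwise,
    # with p(N) = N**(delta - 2); W is chosen so that W * sum_ij J_ij has variance N.
    rho = N ** (delta - 2.0)                  # expected density of J
    mask = rng.random((N, N)) < rho
    J = rng.standard_normal((N, N)) * mask
    W = 1.0 / np.sqrt(rho * N)
    return J, W

rng = np.random.default_rng(1)
N, delta = 2048, 1.2
J, W = diluted_couplings(N, delta, rng)
print("empirical density:", (J != 0).mean(), " expected:", N ** (delta - 2.0))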

In the remainder of the paper, we will use the following notation.

Let

$$\begin{aligned} \eta ^{\min } :={{\,\mathrm{arg\,min}\,}}_{\eta \in \{0, 1\}^N} H(J, \eta ) \quad \text { and } \quad \eta ^{\max } :={{\,\mathrm{arg\,max}\,}}_{\eta \in \{0, 1\}^N} H(J, \eta ), \end{aligned}$$

that is \(\eta ^{\min }\) and \(\eta ^{\max }\) are, respectively, the minimizer and the maximizer of H.

Further, let

$$\begin{aligned} \min _{\eta \in \{0, 1\}^N} H(J, \eta ) =: -m_{\min , N} \cdot N \quad \text { and } \quad \max _{\eta \in \{0, 1\}^N} H(J, \eta ) =: m_{\max , N} \cdot N. \end{aligned}$$

In words, \(m_{\min , N}\) and \(m_{\max , N}\) are, respectively, the minimum and the maximum per particle of H.

Moreover, setting \(|\eta | = \sum _{i = 1}^N \eta _{i}\) we call

$$\begin{aligned} \alpha _{\min , N} = \frac{ |\eta ^{\min }| }{N}\; \text { and }\; \alpha _{\max , N} = \frac{ |\eta ^{\max }| }{N} \end{aligned}$$

the proportion of ones in the minimizer and maximizer of H.

We remark that \(\eta ^{\min }\), \(\eta ^{\max }\), \(m_{\min , N}\), \(m_{\max , N}\), \(\alpha _{\min , N}\) and \(\alpha _{\max , N}\) are random variables that depend on the realization of J, but we do not write this dependence explicitly to lighten the notation.

We are interested in the behavior of \(m_{\min , N}\), \(m_{\max , N}\), \(\alpha _{\min , N}\) and \(\alpha _{\max , N}\) as \(N \rightarrow \infty \). When the \(J_{i,j}\) are Gaussian, the existence of the limits

  a) \(m_{\min } = \lim _{N \rightarrow \infty } m_{\min , N}\),

  b) \(m_{\max } = \lim _{N \rightarrow \infty } m_{\max , N}\),

  c) \(\alpha _{\min } = \lim _{N \rightarrow \infty } \alpha _{\min , N}\),

  d) \(\alpha _{\max } = \lim _{N \rightarrow \infty } \alpha _{\max , N}\)

follows from Theorems 1.1 and 1.2 in [17]. Reasonable numerical estimates for the quantities of interest are \(m_{\min } = m_{\max } \approx 0.42\) and \(\alpha _{\min } = \alpha _{\max } \approx 0.624\) (see Sect. 3 below).

We show that, as N gets larger, the fluctuations of both the minimum and maximum per particle of H around their expected values vanish provided some conditions on the tails of the \(J_{i,j}\) are satisfied. More precisely, we have the following

Theorem 2.2

Let the \(J_{i,j}\) be independent identically distributed random variables with \(\mathbb {E}\left[ J_{1,1}\right] = 0\) and \({\text {Var}}\left( J_{1,1}\right) = 1\). Then,

  (a)

    as \(N \rightarrow \infty \),

    $$\begin{aligned} m_{\min , N} - \mathbb {E}\left[ m_{\min , N}\right] \overset{{\text {P}}}{\longrightarrow } 0 \end{aligned}$$
    (3)
    $$\begin{aligned} m_{\max , N} - \mathbb {E}\left[ m_{\max , N}\right] \overset{{\text {P}}}{\longrightarrow } 0. \end{aligned}$$
    (4)

    where \(\overset{{\text {P}}}{\longrightarrow }\) denotes convergence in probability.

Further,

  (b)

    If \(\mathbb {E}\left[ \left| J_{1,1}\right| ^{3}\right] < \infty \), the convergence in (3) and (4) is almost sure.

  (c)

    Suppose the \(J_{i,j}\) are strictly subgaussian, that is,

    $$\begin{aligned} \mathbb {E}\left[ \exp (sJ_{1,1})\right] \le \exp \left( \frac{s^{2}}{2}\right) , \quad \forall s \in \mathbb {R}. \end{aligned}$$

    Then there exists a positive constant D such that, for all \(z > 0\)

    $$\begin{aligned} {\text {P}}\left( \left| m_{\min , N} - \mathbb {E}\left[ m_{\min , N}\right] \right|> z \right) \le e^{-DNz}\, \text { and }\, {\text {P}}\left( \left| m_{\max , N} - \mathbb {E}\left[ m_{\max , N}\right] \right| > z \right) \le e^{-DNz} \end{aligned}$$

    that is, the convergence is exponentially fast.

The proof is provided in Sect. 4.1.

Moreover, the expected values of the minimum and maximum per particle do not depend on the actual distribution of the random couplings, provided they have finite variance.

Theorem 2.3

Let \(J = \left\{ J_{i,j} \right\} _{1\le i,j\le N}\) and \(Y = \left\{ Y_{i,j} \right\} _{1\le i,j\le N}\) be two independent sequences of independent random variables such that, for every i, j, \(\mathbb {E}\left[ J_{i,j}\right] = \mathbb {E}\left[ Y_{i,j}\right] = 0\) and \(\mathbb {E}\left[ J_{i,j}^{2}\right] = \mathbb {E}\left[ Y_{i,j}^{2}\right] = 1\). Also, set \(\gamma := \max \left\{ \mathbb {E}\left[ \left| J_{i,j}\right| ^3\right] , \mathbb {E}\left[ \left| Y_{i,j}\right| ^3\right] ,\; 1 \le i,j \le N\right\} \); \(\gamma \) may be infinite. Then we have, as \(N \rightarrow \infty \),

$$\begin{aligned} \frac{1}{N} \left| \mathbb {E}\left[ \min _{\eta } H(J, \eta )\right] - \mathbb {E}\left[ \min _{\eta } H(Y, \eta )\right] \right| \rightarrow 0 \end{aligned}$$
(5)
$$\begin{aligned} \frac{1}{N} \left| \mathbb {E}\left[ \max _{\eta } H(J, \eta )\right] - \mathbb {E}\left[ \max _{\eta } H(Y, \eta )\right] \right| \rightarrow 0 \end{aligned}$$
(6)

If furthermore \(\gamma < \infty \),

$$\begin{aligned} \frac{1}{N} \left| \mathbb {E}\left[ \min _{\eta } H(J, \eta )\right] - \mathbb {E}\left[ \min _{\eta } H(Y, \eta )\right] \right| \le C N^{-1/6} \end{aligned}$$
(7)
$$\begin{aligned} \frac{1}{N} \left| \mathbb {E}\left[ \max _{\eta } H(J, \eta )\right] - \mathbb {E}\left[ \max _{\eta } H(Y, \eta )\right] \right| \le C N^{-1/6} \end{aligned}$$
(8)

where C is a constant depending only on \(\gamma \).

The proof is a consequence of an analogous result on the universality of the free energy, presented in Sect. 4.2.

As an immediate consequence of Theorem 2.3 and Theorem 2.2 (c), the minimum and maximum per particle satisfy the same almost sure bounds as in the Gaussian case (see [21]), irrespective of the details of the distribution of the \(J_{i,j}\). More precisely, let \(\nu ^{-}(m) = |\{\eta : H(J, \eta ) \le -m N\}|\) denote the number of configurations whose energy is less than or equal to \(-m N\) and, similarly, let \(\nu ^{+}(m) = |\{\eta : H(J, \eta ) \ge m N\}|\) be the number of configurations with energy at least mN. Then:

Corollary 2.4

Let the \(J_{i,j}\)’s be independent identically distributed strictly subgaussian random variables with \(\mathbb {E}\left[ J_{1,1}\right] = 0\) and \({\text {Var}}\left( J_{1,1}\right) = 1\) and let \(I(x) = - x \log (x) - (1-x) \log (1 - x)\). Then, for large values of N and for some constant C, \({\text {P}}\left( \nu ^{-}(m^{\star }) > 0\right) \le e^{-CN}\) and \({\text {P}}\left( \nu ^{+}(m^{\star }) > 0\right) \le e^{-CN}\), where \(m^{\star } \approx 0.562\) is the extremal value of m such that the function \(\alpha \mapsto I(\alpha ) - \frac{m^2}{2\alpha ^{2}(1 - \alpha )^{2}}\), for fixed m, has no zeros. This value is attained at \(\alpha = \alpha ^{\star } \approx 0.644\).

3 Conjectures and numerical results

In this section, we present some numerical results concerning the minimum and maximum per particle and the structure of the minimizer and the maximizer of QUBO instances with random coefficients. The findings presented in Sect. 3.1 highlight some interesting features concerning the connection between particles contributing to the minimizer of H (that is, components of \(\eta ^{\min }\) equal to 1) and particles contributing to the maximizer of H (that is, components of \(\eta ^{\max }\) equal to 1), and the probability of the events \(\{\eta ^{\min }_{i} = 1\}\) and \(\{\eta ^{\max }_{i} = 1\}\). Simulations have been carried out using Monte Carlo Probabilistic Cellular Automata (PCA) such as those introduced in [2, 3, 10, 21, 22]. The advantage of these algorithms lies in their inherently parallel nature, which allows parallel architectures to be exploited to their fullest while preserving a quality of the solution comparable to the one obtained with single spin flip MCMC algorithms, as outlined in [3, 13, 21]. Thanks to these algorithms we were able, in the diluted case, to effectively simulate systems with N up to 128000.

In particular, the algorithm used in the simulations works as follows. As a preliminary step, consider the symmetrized version \(\tilde{J}\) of J (that is, \(\tilde{J}=\frac{J + J^{T}}{2}\)) and note that the value of the Hamiltonian does not change if we replace J with \(\tilde{J}\). Define the pair Hamiltonian

$$\begin{aligned} H(\eta , \tau ) = \beta \sum _{i}h_{i}(\eta )\tau _{i} + q\sum _{i}\left[ \eta _{i}(1-\tau _{i}) + \tau _{i}(1-\eta _{i}) \right] \end{aligned}$$
(9)

with \(h_{i}(\eta ) =\frac{1}{\sqrt{ N }}\sum _{j}\tilde{J}_{i,j}\eta _{j}\) and \(\beta \) and q two positive parameters, and choose the transition matrix to be \(P_{\eta , \tau } = \frac{e^{ -H(\eta , \tau ) }}{\sum _{\tau }e^{ -H(\eta ,\tau ) }}\). Rewriting \(P_{\eta , \tau }\) as

$$\begin{aligned} P(\eta , \tau ) = \prod _{i} \frac{e^{ -\beta h_{i}(\eta )\tau _{i}-q[\eta _{i}(1-\tau _{i}) + \tau _{i}(1-\eta _{i})] }}{Z_{i}} \end{aligned}$$
(10)

we immediately see that, conditionally on \(\eta \), the value of each \(\tau _{i}\) can be sampled independently with the following probabilities:

$$\begin{aligned} P(\tau _{i}=1|\eta ) = \frac{e^{ -\beta h_{i} (\eta ) - q (1 - \eta _{i}) }}{Z_{i}} \; \text { and } \; P(\tau _{i}=0|\eta ) = \frac{e^{ -q \eta _{i} }}{Z_{i}} \end{aligned}$$
(11)

with \(Z_{i} = e^{ -\beta h_{i}(\eta ) - q(1- \eta _{i})} + e^{ - q\eta _{i} }\). Then, at least in principle, each component of the configuration could be updated on a dedicated core.
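A minimal Python sketch of one such parallel update (our own rendering of (11), not the implementation used for the simulations; the parameter values are purely illustrative):

import numpy as np

def pca_sweep(J_sym, eta, beta, q, rng):
    # Sample every tau_i independently from the conditional law (11), given eta.
    N = eta.shape[0]
    h = (J_sym @ eta) / np.sqrt(N)                 # local fields h_i(eta)
    w1 = np.exp(-beta * h - q * (1.0 - eta))       # weight of tau_i = 1
    w0 = np.exp(-q * eta)                          # weight of tau_i = 0
    return (rng.random(N) < w1 / (w1 + w0)).astype(float)

rng = np.random.default_rng(2)
N, beta, q = 1000, 10.0, 5.0
J = rng.standard_normal((N, N))
J_sym = (J + J.T) / 2.0                            # symmetrized couplings
eta = rng.integers(0, 2, N).astype(float)
for _ in range(200):                               # a few hundred parallel sweeps
    eta = pca_sweep(J_sym, eta, beta, q, rng)
H = (eta @ J_sym @ eta) / np.sqrt(N)
print("energy per particle:", H / N, "proportion of ones:", eta.mean())

Every call resamples all the components simultaneously, which is exactly the step that can be distributed across cores.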

Note that, by the symmetry of \(\tilde{J}\), we have that \(\frac{\sum _{\eta }e^{ -H(\eta ,\tau ) }}{\sum _{\eta ,\tau }e^{ -H(\eta ,\tau ) }}\) is the reversible measure of \(P_{\eta ,\tau }\). In addition, observe that \(H(\eta , \eta ) = \beta H(J, \eta )\). The intuitive rationale for why the heuristic algorithm is expected to work is the following. As q gets large, the weight of pairs \((\eta , \tau )\) with \(\eta \ne \tau \) in the stationary measure becomes smaller, and the stationary measure of the algorithm becomes closer to the Gibbs measure \(\frac{e^{ -\beta H(J, \eta ) }}{Z}\) with Hamiltonian H and inverse temperature \(\beta \). A comparison between this algorithm and the Metropolis one is provided in [21].

The computation of the local fields \(h_{i}\) proves to be the computationally expensive part of the algorithm. However, the vector of local fields \(\{ h_{i}(\eta ) \}_{i = 1, \dots , N}\) is, up to the factor \(\frac{1}{\sqrt{N}}\), the matrix–vector product \(\tilde{J} \cdot \eta \) and can be computed using highly optimized linear algebra libraries, exploiting a parallel computing device such as a GPU. Further, observe that k simulations can be carried out in parallel, possibly with different parameters \(\beta \) and q. Let \(\textbf{E}\) be the \(N \times k\) matrix whose columns contain the k configurations to be updated. In this case, the entire collection of vectors of fields \(\{ h_{i}(\eta ^{(m)} ) \}_{i = 1, \dots , N; m = 1, \dots , k}\) (encoded, again, as an \(N \times k\) matrix) is the matrix product \(\tilde{J} \cdot \textbf{E}\). This product too can be computed efficiently using highly optimized libraries and is, in general, substantially faster than computing k separate matrix–vector products.
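A sketch of this batched computation (again ours, with illustrative sizes):

import numpy as np

rng = np.random.default_rng(3)
N, k = 1000, 8
J = rng.standard_normal((N, N))
J_sym = (J + J.T) / 2.0
E = rng.integers(0, 2, (N, k)).astype(float)   # the k configurations to be updated
fields = (J_sym @ E) / np.sqrt(N)              # N x k matrix of local fields h_i(eta^(m))
# fields[:, m] can now be fed to the single-chain update of chain m,
# possibly with chain-specific parameters beta_m and q_m.
print(fields.shape)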

We remark that the values we found are heuristic and we have no guarantee that they coincide with the true minimum and maximum of H(J). However, as already discussed in more detail in [21], the exact algorithm introduced in [20] can be exploited to compute the exact minima and maxima of H(J) for values of N up to 150. The absolute values of the minimum and the maximum per particle of H(J) are steadily close to 0.42 already for values of N between 80 and 150. Moreover, for instances with sizes up to 150, we verified that our algorithm typically finds the same optimum as the exact algorithm. For the larger instances, we retained the minimum and maximum found in a set of runs of the algorithm with different pairs of values of \(\beta \) and q.

3.1 Minimum and maximum per particle

The standard normal case has been investigated extensively in [21]: already for relatively small values of N, both \(m_{\min , N}\) and \(m_{\max , N}\) oscillate around a value of about 0.42, whereas \(\alpha _{\min , N}\) and \(\alpha _{\max , N}\) very rapidly approach a value of about 0.624 (see Fig. 1). Theorem 2.3 states that both \(m_{\min , N}\) and \(m_{\max , N}\) are robust with respect to the distribution of the \(J_{i,j}\). Numerical simulations show that this robustness also concerns the proportion of ones in both the minimizer and the maximizer of H.

Figure 2 shows the behavior of \(m_{\min , N}, m_{\max , N}, \alpha _{\min , N}, \alpha _{\max , N}\) for \(J_{i,j}\) with (shifted) exponential distribution. Even though the exponential distribution is rather asymmetric and not subgaussian, already for relatively small values of N the average energy per particle and the proportion of ones in both the minimizer and the maximizer approach those of the standard normal case.

Fig. 1

Average values of \(m_{\min , N}\), \(m_{\max , N}\), \(\alpha _{\min , N}\) and \(\alpha _{\max , N}\) in the case of standard normally distributed \(J_{i,j}\)’s. Values of \(m_N\) and \(\alpha _N\) in the table are computed as averages of \(m_{\min , N}, m_{\max , N}\) and \(\alpha _{\min , N}, \alpha _{\max , N}\) respectively. The length of each branch of the error bars is equal to the standard error of the empirical average. Averages are computed over 10000 realizations of J for N up to 1024 and between 1000 and 5000 realizations of J for larger instances

Fig. 2

In this case each \(J_{i,j}\) is distributed as \(X - 1\) with X an exponential random variable with expected value 1. The exponential distribution is rather asymmetric and is not subgaussian. However, already for N of order “a few hundred” the values of the minimum and maximum per particle of H and the proportion of “ones” in the minimizer and the maximizer of H approach those of the normal case. The length of each branch of the error bars is equal to the standard error of the empirical average (only shown for the exponential distribution). Averages are computed over 10000 realizations of J. The curves for the normal distribution in this chart are the averages of the values \(m_{\min , N}\) and \(m_{\max , N}\) in the first panel and \(\alpha _{\min , N}\) and \(\alpha _{\max , N}\) in the second panel

Fig. 3

Comparison (top panel) of the average of the minimum and maximum per particle of H in the diluted case with the corresponding value in the standard normal case for several values of \(\delta \). In all these cases the average value of the minimum and maximum per particle appears to approach the same limit (about 0.42). As for the values of \(\alpha _{N}\) (bottom panel), these appear to be very close to the value of \(\alpha _{N}\) of the standard normal case. Averages, for \(\delta < 2\), are computed over 100 realizations of J

In our simulations, we also considered the diluted case, that is, we took \(J_{i,j}\) to be zero with probability \(1 - p_{\delta }(N)\), where \(p_{\delta }(N) = N^{\delta - 2}\), and a standard normal random variable otherwise. Findings concerning the behavior of \(m_{\min , N}, m_{\max , N}, \alpha _{\min , N}, \alpha _{\max , N}\) are presented in Figs. 3 and 4. The log-log plot of Fig. 5 suggests that \(m_N\) converges to \(\bar{m}\) as a power of N.

Fig. 4

Average of the minimum per particle and of \(\alpha _{\min , N}\) in the diluted case for \(\delta = 1.2\) and \(\delta = 1.1\). The length of each branch of the error bars is equal to the standard error of the empirical average. Averages are computed over 900 realizations of J for N up to 1024, 100 realizations for N between 2048 and 32000 and 36 realizations for larger values of N

Fig. 5

Log–log plot of the distance of the absolute value of the minimum per particle from the conjectured limiting value \(\bar{m}\) as a function of N in the diluted case for \(\delta = 1.2\) and \(\delta = 1.1\). For both values of \(\delta \) the lines appear to have a negative slope, suggesting that, in both cases, the absolute value of the minimum per particle of H eventually reaches the value \(\bar{m}\)

We highlight that as \(\delta \) becomes smaller, the density of J becomes very small. To give an idea, values of the density of J are given in Table 1. For instance, with \(N = 128000\) and \(\delta = 1.1\) the expected number of nonzero entries in each row of J is less than 4.

Table 1 Density of matrix J for several values of N and \(\delta \)
Table 2 Values of the maximum per particle for some benchmark instances

In the operations research literature, to test optimization algorithms, it is common to consider random instances of QUBO where the \(J_{i,j}\) have a uniform distribution (see, e.g., [1, 15, 25, 26]) and where the matrix J is, possibly, sparse. The best-known values of the maximum per particle for some benchmark QUBO instances with uniformly distributed \(J_{i,j}\) are reported in Table 2. It is apparent that, in these cases too, the values of the optima per particle agree with those of the standard normal case.

3.2 Structure of minimizer and maximizer

Observe that, for any realization of J, the minimizer and the maximizer of the Hamiltonian are unique with probability 1, since the distribution of the couplings is absolutely continuous. It is then possible to partition the indices \(1, 2, \ldots , N\) into four sets:

  • \(I_{1} = \{i: \eta ^{\min }_{i} = 1, \; \eta ^{\max }_{i} = 0\}\);

  • \(I_{2} = \{i: \eta ^{\min }_{i} = 1, \; \eta ^{\max }_{i} = 1\}\);

  • \(I_{3} = \{i: \eta ^{\min }_{i} = 0, \; \eta ^{\max }_{i} = 1\}\);

  • \(I_{4} = \{i: \eta ^{\min }_{i} = 0, \; \eta ^{\max }_{i} = 0\}\).

To refer properly to the cardinality of these sets we give the following

Definition 3.1

\(\alpha _{k,N}:= \frac{ |I_{k}| }{ N }, \, k = 1, 2, 3, 4\).

With an abuse of notation, we say that the i-th row (column) of J belongs to \(I_{k}\) if \(i \in I_{k}\). Note that \(\alpha _{1,N}\) can be interpreted as the proportion of ones appearing in the minimizer but not in the maximizer of H. Similar interpretations can be given for \(\alpha _{2,N}\), \(\alpha _{3,N}\) and \(\alpha _{4,N}\).

Remark 3.2

With the definition of \(\alpha _{k,N}\) given above it is immediate to see that \(\alpha _{\min ,N} = \alpha _{1,N} + \alpha _{2,N}\) and \(\alpha _{\max ,N} = \alpha _{2,N} + \alpha _{3,N}\).

Leveraging the definition of the \(I_{k}\), it is possible to partition J into 16 blocks

$$\begin{aligned} J[k,\ell ]:= \{J_{i,j}, \; i \in I_{k}, \, j \in I_{\ell }\} \end{aligned}$$

Clearly, \(J[k, \ell ]\) is a submatrix of J with \(N\alpha _{k,N}\) rows and \(N\alpha _{\ell ,N}\) columns.
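The partition and the quantities of Definition 3.1 are straightforward to compute once a pair \((\eta ^{\min }, \eta ^{\max })\) is available (for instance from the heuristic described above); the following Python helper, ours and purely illustrative, returns the proportions \(\alpha _{k,N}\) and the matrix of empirical block means.

import numpy as np

def block_statistics(J, eta_min, eta_max):
    # Partition the indices into I_1, ..., I_4 and compute alpha_k = |I_k| / N
    # together with the empirical mean of each block J[k, l].
    I = [np.where((eta_min == 1) & (eta_max == 0))[0],   # I_1
         np.where((eta_min == 1) & (eta_max == 1))[0],   # I_2
         np.where((eta_min == 0) & (eta_max == 1))[0],   # I_3
         np.where((eta_min == 0) & (eta_max == 0))[0]]   # I_4
    N = eta_min.shape[0]
    alpha = np.array([idx.size / N for idx in I])
    mu = np.full((4, 4), np.nan)
    for k in range(4):
        for l in range(4):
            if I[k].size and I[l].size:
                mu[k, l] = J[np.ix_(I[k], I[l])].mean()
    return alpha, mu

Averaging alpha and mu over many realizations of J produces estimates analogous to (12) and (13) below.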

Numerical simulations suggest that the average value and the variance of the entries in each block (subject to proper normalization), as well as the relative sizes of the blocks, converge to deterministic limits as \(N \rightarrow \infty \). These limits do not depend on the distribution of the \(J_{i,j}\)’s, as long as they have expected value zero and variance 1. More precisely, we make the following

Conjecture 3.3

Let \(J_{i,j}\) be independent identically distributed random variables with \(\mathbb {E}\left[ J_{1,1}\right] = 0\) and \({\text {Var}}\left( J_{1,1}\right) = 1\). Then there exist constants \(\alpha _{1}, \alpha _{2}, \alpha _{3}, \alpha _{4}\) such that

  1.

    \(\lim \limits _{N\rightarrow \infty } \alpha _{1,N} = \alpha _{1}\), \(\lim \limits _{N\rightarrow \infty } \alpha _{2,N} = \alpha _{2}\), \(\lim \limits _{N\rightarrow \infty } \alpha _{3,N} = \alpha _{3}\), \(\lim \limits _{N\rightarrow \infty } \alpha _{4,N} = \alpha _{4}\);

  2.

    \(\alpha _{1} = \alpha _{2} = \alpha _{3} = \frac{1}{2}\alpha _{\min } = \frac{1}{2}\alpha _{\max }\); \(\alpha _{4} = 1 - \frac{3}{2}\alpha _{\min }\)

  3.

    \({\text {Var}}\left( J[k, \ell ]_{i,j}\right) = 1\) for \(k, \ell = 1, \ldots , 4\)

Further, set \(\mu [k, \ell ] = \mathbb {E}\left[ J[k,\ell ]_{i,j}\right] \) and write the covariance of \(J[k,\ell ]_{i}, J[k, \ell ]_{j}\) as

$$\begin{aligned} {\text {Cov}}(J[k,\ell ]_{i}, J[k, \ell ]_{j}) = \frac{N^{2}\alpha _{k}\alpha _{\ell }}{2}(1 - \tilde{\sigma }[k,\ell ]) \end{aligned}$$


Computation of averages over 10000 realizations of J with standard normal distribution and \(N = 1024\) yielded the following estimates for \(\alpha _{k}\), \(\mu [k,\ell ]\) and \(\tilde{\sigma }[k,\ell ]\).

$$\begin{aligned}&\mathbf {\alpha } = \begin{bmatrix} 0.3105 \\ 0.31331 \\ 0.31054 \\ 0.06565 \\ \end{bmatrix} \end{aligned}$$
(12)
$$\begin{aligned}&\mu [k,\ell ] = \begin{bmatrix} -0.02161 & -0.05687 & 0.00006 & 0.04229 \\ -0.05687 & 0.00001 & 0.05684 & 0.00013 \\ -0.00005 & 0.05685 & 0.02161 & -0.04241 \\ 0.04228 & 0.00003 & -0.04227 & 0.00001 \\ \end{bmatrix} \end{aligned}$$
(13)
$$\begin{aligned}&\tilde{\sigma }[k,\ell ] = \begin{bmatrix} 0.92996 & 1.11006 & 0.74334 & 1.20509 \\ 1.06221 & 0.58133 & 1.08835 & 0.77939 \\ 0.73879 & 1.08343 & 0.9415 & 1.25195 \\ 1.21106 & 0.76959 & 1.2258 & 0.99425 \\ \end{bmatrix} \end{aligned}$$
(14)

These values suggest that the random couplings in blocks [2, 2], [3, 1], [4, 2], [1, 3], and [2, 4] are negatively correlated, the random couplings in blocks [1, 4], [3, 4], [4, 1], and [4, 3] are positively correlated, whereas the random couplings in the remaining blocks are roughly independent.

Very similar values can be obtained in the diluted case and for \(J_{i,j}\) with distributions other than the normal.

Note that it is always possible to relabel the indices \(1, 2, \ldots , N\) so that, in J, indices in \(I_{1}\) appear “first”, indices in \(I_{2}\) appear “second”, indices in \(I_{3}\) appear “third” and indices in \(I_{4}\) appear “last”. With this relabelling, a graphical representation of Conjecture 3.3 is provided in Fig. 6.

Fig. 6

Graphical representation of the matrix of the couplings. These matrices are obtained from J by rearranging the rows (columns) of J so that rows (columns) in \(I_{1}\) appear “first”, rows (columns) in \(I_{2}\) appear “second”, rows (columns) in \(I_{3}\) appear “third” and rows (columns) in \(I_{4}\) appear “last”. Pictures are obtained by averaging the values of the \(J_{i,j}\) over squares of size L. Negative (average) values of J are blue whereas positive values are red; darker colors correspond to larger absolute values

3.3 Probability a particle belongs to the minimizer/maximizer of H

The sets \(I_{1}, \ldots , I_{4}\) introduced above are random sets depending on the realization of J. The problem of finding \(\eta ^{\min }\) and \(\eta ^{\max }\) can be restated as the problem of determining the sets \(I_{\min } = I_{1} \cup I_{2}\) and \(I_{\max } = I_{2} \cup I_{3}\). Though, as already mentioned, this problem is NP-hard, we tried to assess, numerically, whether it is possible to determine the probability that a certain index i belongs to either \(I_{\min }\) or \(I_{\max }\).

To this aim, consider the matrix \(\tilde{J} = \frac{J + J^{T}}{2}\), that is, the symmetrized version of J, and the function \(\tilde{H}\) obtained from H by replacing J with \(\tilde{J}\). Note that \(H(J, \eta ) = \tilde{H}(\eta )\) for all \(\eta \) and, consequently, the values of \(\eta ^{\min }_{i}\) and \(\eta ^{\max }_{i}\) do not depend on whether the matrix J or \(\tilde{J}\) is considered.

Take the matrix \(\tilde{J}\) and define \(R_{i} = \sum _{j = 1}^{N} \tilde{J}_{i, j}\), that is, R is the vector of the row sums of \(\tilde{J}\). Let [i] be the index of the i-th smallest element of R. We say that the [i]-th row (column) of \(\tilde{J}\) belongs to the minimizer (respectively the maximizer) of H if \(\eta ^{\min }_{[i]} = 1\) (respectively \(\eta ^{\max }_{[i]} = 1\)).
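As an illustration (ours, not the paper's code), the ranking and the empirical probability that the [i]-th row belongs to the minimizer can be estimated as follows, given a collection of instances and of (approximate) minimizers, e.g. produced by the PCA heuristic.

import numpy as np

def rank_rows(J):
    # Return the row indices sorted by increasing row sum of the symmetrized matrix;
    # position i of the returned permutation holds the index [i].
    J_sym = (J + J.T) / 2.0
    R = J_sym.sum(axis=1)
    return np.argsort(R)

def minimizer_profile(J_samples, eta_min_samples):
    # Empirical estimate of P(eta^min_[i] = 1) as a function of the rank [i] / N.
    profiles = [eta_min[rank_rows(J)] for J, eta_min in zip(J_samples, eta_min_samples)]
    return np.mean(profiles, axis=0)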

Numerical simulations show that the probability that the [i]-th row of \(\tilde{J}\) belongs to the minimizer (respectively maximizer) of H is a decreasing (respectively increasing) deterministic function of \(\frac{[i]}{N}\). This function is expected not to depend on the actual distribution of J. Further, we expect a positive fraction of the smallest (largest) rows of \(\tilde{J}\) (where smallest/largest refers to the sum of the elements of each row of \(\tilde{J}\)) to belong to the minimizer (maximizer) of H with positive probability (see Fig. 7). More precisely, we have the following

Conjecture 3.4

As \(N \rightarrow \infty \), \({\text {P}}\left( \eta ^{\min }_{[i]} = 1 \right) = f_{\min }\left( \frac{[i]}{N}\right) + o(1)\) with \(f_{\min }\) a decreasing function. Similarly, as \(N \rightarrow \infty \), \({\text {P}}\left( \eta ^{\max }_{[i]} = 1 \right) = f_{\max }\left( \frac{[i]}{N}\right) + o(1)\), with \(f_{\max }\left( \frac{[i]}{N}\right) = f_{\min }\left( 1 - \frac{[i]}{N}\right) \). Moreover, there exists \(\lambda _0 > 0\) such that \(f_{\min } (\lambda ) = 1\) for all \(\lambda < \lambda _0\). Finally, if the tails of the \(J_{i,j}\) decay sufficiently fast, the functions \(f_{\min }\) and \(f_{\max }\) do not depend on the distribution of the \(J_{i,j}\).

Fig. 7

Probability the [i]-th row of \(\tilde{J}\) belongs to the minimizer (the increasing cloud of points) and to the maximizer (the decreasing cloud of points) of H for several values of N. Estimates are obtained as averages over 400 realizations of J for each value of N. The shape of the cloud appears to be independent of the size of the system

Finally, we observe that the random variables \(\mathbb {1}_{\{\eta ^{\min }_{[i]} = 1\}}\) and \(\mathbb {1}_{\{\eta ^{\max }_{[i]} = 1\}}\) appear to be positively correlated for all i. The qualitative behavior of the strength of the correlation between these two variables as a function of [i] is provided by the estimates of Fig. 8.

Fig. 8

Ratio between the probability that the [i]-th row belongs to both the minimizer and the maximizer of H and the product of the probabilities of the two events, for \(N = 4096\). Estimates are obtained as averages over 400 realizations of J

4 Proofs

We state first some general results that will be used below in the proofs of Theorem 2.2 and Theorem 2.3.

Let \(X \!=\! \{X_{1}, \ldots , X_{n}\}\) be independent random variables and let \(W\!=\!g\left( X_1, \ldots , X_i, \ldots , X_n\right) \) with g a measurable function. Further, let \(X^{\prime } = \{X_{1}^{\prime }, \ldots , X_{n}^{\prime }\}\) be an independent copy of \(X_{1}, \ldots , X_{n}\) and write

$$\begin{aligned} W_{i}=g\left( X_{1}, \ldots , X_{i}^{\prime }, \ldots , X_{n}\right) \end{aligned}$$
(15)

Define the random variable V by

$$\begin{aligned} V=\mathbb {E}\left[ \sum _{i=1}^n\left( W - W_{i}\right) ^{2} \mid {X}\right] \end{aligned}$$
(16)

which allows us to restate the Efron–Stein theorem (see [11]) as follows

Theorem 4.1

(Efron-Stein)

$$\begin{aligned} {\text {Var}}\left( W\right) \le \frac{1}{2} \mathbb {E}\left[ V\right] \end{aligned}$$
(17)

From [5], we can bound the moment generating function with the following

Theorem 4.2

For all \(\theta >0\) and \(\lambda \in (0,\frac{1}{\theta })\),

$$\begin{aligned} \mathbb {E}\left[ \exp (\lambda (W - \mathbb {E}\left[ W\right] ))\right] \le \exp \left( \frac{\lambda \theta }{1-\lambda \theta }\right) \mathbb {E}\left[ \exp \left( \frac{\lambda V}{\theta }\right) \right] . \end{aligned}$$
(18)

Moreover, a straightforward consequence of [4, Theorem 2] yields

Theorem 4.3

$$\begin{aligned} \mathbb {E}\left[ \left| W-\mathbb {E}\left[ W\right] \right| ^3\right] < \mathbb {E}\left[ \left| V\right| ^{\frac{3}{2}}\right] . \end{aligned}$$
(19)

To prove both Theorem 2.2 and Theorem 2.3 a key role is played by the free energy function \(F: \mathbb {R}^{N \times N} \rightarrow \mathbb {R}\) defined as

$$\begin{aligned} F_{\beta }(X) := \beta ^{-1} \log Z(X) \end{aligned}$$
(20)

with \(Z(X):= \sum _{\eta } e^{\beta H(X, \eta )}\), \(H(X, \eta ) = \frac{1}{\sqrt{N}}\sum _{i,j} X_{i,j} \eta _{i} \eta _{j}\), and \(X = \{X_{i,j}\}_{1 \le i, j \le N}\) a collection of independent random variables (the matrix of the couplings). We do not write the dependence of \(F_{\beta }\) and Z on N to make the notation lighter. Further, we set

$$\begin{aligned} p(X, \eta ) := \frac{e^{\beta H(X, \eta )}}{Z(X)}; \quad \langle \eta _i\eta _j \rangle _{X} = \sum _{\eta }\eta _{i}\eta _{j}\,p(X, \eta ) \end{aligned}$$
(21)

where \(p(X, \eta )\) is the Gibbs measure of \(\eta \) and \(\langle \cdot \rangle _{X}\) denotes the expectation with respect to the Gibbs measure when the matrix of couplings is X.

In the next lemma, we determine bounds on the derivatives of F with respect to one of the couplings.

Lemma 4.4

Let \(F: \mathbb {R}^{N \times N} \rightarrow \mathbb {R}\) be as in (20). Then

$$\begin{aligned} \left| \frac{\partial F_{\beta }(X)}{\partial X_{i,j}}\right| \le \frac{1}{\sqrt{N}}\, ; \quad \left| \frac{\partial ^{2} F_{\beta }(X)}{\partial X_{i,j}^{2}}\right| \le \frac{\beta }{4N}\, ; \quad \left| \frac{\partial ^{3} F_{\beta }(X)}{\partial X_{i,j}^{3}}\right| \le \frac{\beta ^{2}}{6 \sqrt{3} N^{3/2}} \end{aligned}$$
(22)

Proof

At first, observe that a straightforward computation gives

$$\begin{aligned} \frac{\partial H(X, \eta )}{\partial X_{i,j}} = \frac{1}{\sqrt{N}} \eta _{i} \eta _{j} \, ; \quad \frac{\partial Z}{\partial X_{i,j}} (X) = \frac{\beta }{\sqrt{N}}\sum _{\eta } \eta _{i} \eta _{j} e^{\beta H(X, \eta )} =\frac{\beta }{\sqrt{N}}\langle \eta _i \eta _j \rangle _{X} Z(X). \end{aligned}$$
(23)

Using (23), the bound on the first derivative of F with respect to \(X_{i,j}\) is readily obtained from the following

$$\begin{aligned} \frac{\partial F_{\beta }(X)}{\partial X_{i,j}} = \frac{1}{\beta Z(X)}\frac{\partial Z(X)}{\partial X_{i,j}} =\frac{1}{\sqrt{N}} \langle \eta _{i}\eta _{j} \rangle _{X} \end{aligned}$$
(24)

by observing that \(\langle \eta _{i}\eta _{j} \rangle _{X} \in [0, 1]\). To compute higher-order derivatives note that

$$\begin{aligned} \frac{\partial p(X, \eta )}{\partial X_{i,j}} =\frac{ \frac{\beta }{\sqrt{N}}\eta _{i}\eta _{j} e^{\beta H(X, \eta )} Z- \frac{\beta }{\sqrt{N}}\langle \eta _{i}\eta _{j} \rangle _{X} e^{\beta H(X, \eta )} Z}{ Z^2} =\frac{\beta }{\sqrt{N}}\left( \eta _{i}\eta _{j}-\langle \eta _{i}\eta _{j} \rangle _{X}\right) p(X, \eta ) \end{aligned}$$
(25)

which, in turn, yields,

$$\begin{aligned} \frac{\partial \langle \eta _{i}\eta _{j} \rangle _{X}}{\partial X_{i,j}}&= \sum _{\eta } \eta _{i}\eta _{j} \frac{\partial p}{\partial X_{i,j}} (X, \eta ) = \frac{\beta }{\sqrt{N}} \sum _{\eta } \eta _{i}\eta _{j} \left( \eta _{i}\eta _{j} -\langle \eta _{i}\eta _{j} \rangle _{X}\right) p(X, \eta ) \end{aligned}$$
(26)
$$\begin{aligned}&= \frac{\beta }{\sqrt{N}}\left( \langle \eta _{i}\eta _{j} \rangle _{X}- \langle \eta _{i}\eta _{j} \rangle _{X}^2\right) = \frac{\beta }{\sqrt{N}}\langle \eta _{i}\eta _{j} \rangle _{X}\left( 1- \langle \eta _{i}\eta _{j} \rangle _{X}\right) \end{aligned}$$
(27)

and

$$\begin{aligned} \frac{\partial ^2 \langle \eta _{i}\eta _{j} \rangle _{X}}{\partial X_{ij}^2} = \frac{\beta }{\sqrt{N}} \frac{\partial \left( \langle \eta _{i}\eta _{j} \rangle _{X} \left( 1 - \langle \eta _{i}\eta _{j} \rangle _{X}\right) \right) }{\partial X_{ij}} = \frac{\beta ^2}{N} \langle \eta _{i}\eta _{j} \rangle _{X}\left( 1 - \langle \eta _{i}\eta _{j} \rangle _{X}\right) \left( 1 - 2\langle \eta _{i}\eta _{j} \rangle _{X}\right) . \end{aligned}$$
(28)

Consequently,

$$\begin{aligned} \frac{\partial ^{2} F_{\beta }(X)}{\partial X_{i,j}^{2}} = \frac{\beta }{N}\langle \eta _{i}\eta _{j} \rangle _{X} \left( 1 - \langle \eta _{i}\eta _{j} \rangle _{X}\right) \quad \text { and } \end{aligned}$$
(29)
$$\begin{aligned} \frac{\partial ^{3} F_{\beta }(X)}{\partial X_{i,j}^{3}} = \frac{\beta ^2}{N^{3/2}}\langle \eta _{i}\eta _{j} \rangle _{X} (1 - \langle \eta _{i}\eta _{j} \rangle _{X}) \left( 1 - 2\langle \eta _{i}\eta _{j} \rangle _{X}\right) . \end{aligned}$$
(30)

The claim follows by observing that, in the interval [0, 1], \(x (1 - x) \le \frac{1}{4}\) and \(\left| x (1 - x) (1 - 2x)\right| \le \frac{1}{6\sqrt{3}}\). \(\square \)

4.1 Proof of Theorem 2.2

We give the proof for the convergence of \(m_{\max , N}\). The proof for \(m_{\min , N}\) is analogous.

In the rest of this section, we write \(\overline{H}(J):= \max _{\eta } H(J, \eta )\).

To prove the convergence in probability of \(m_{\max , N}\) to its expectation we use the following results which allow us to bound the variance of the free energy.

Assume the \(J_{i,j}\) are independent identically distributed random variables with expected value zero and variance 1. Call \(F = F_{\beta }(J) = \frac{1}{\beta } \log \sum _{\eta } e^{\beta H(J, \eta )}\) with \(H(J, \eta ) = \frac{1}{\sqrt{N}}\sum _{i,j} J_{i,j} \eta _{i} \eta _{j}\). The following theorem provides a bound on the variance of F.

Theorem 4.5

\({\text {Var}}\left( F\right) \le N\)

Proof

Let \(F_{i,j}\) be obtained from F by substituting \(J_{i,j}\) with \(J'_{i,j}\) (an independent copy of \(J_{i,j}\)). Then, by Lemma 4.4, \(\left| F - F_{i,j}\right| \le \frac{1}{\sqrt{N}} \left| J_{i,j} - J'_{i,j}\right| \). From Theorem 4.1 it follows

$$\begin{aligned} {\text {Var}}\left( F\right)&\le \frac{1}{2} \mathbb {E}\left[ \sum _{i,j}(F - F_{i,j})^{2}\right] = \frac{1}{2} \sum _{i,j} \mathbb {E}\left[ (F - F_{i,j})^{2}\right] = \frac{1}{2} N^{2} \mathbb {E}\left[ (F - F_{1,1})^{2}\right] \end{aligned}$$
(31)
$$\begin{aligned}&\le \frac{1}{2} N^{2} \mathbb {E}\left[ \left( \frac{1}{\sqrt{N}} \left| J_{i,j} - J'_{i,j}\right| \right) ^{2} \right] = N \mathbb {E}\left[ J_{1,1}^{2}\right] = N \end{aligned}$$
(32)

Note that this estimate is uniform in \(\beta \). \(\square \)

Since \(\overline{H}(J) = \lim _{\beta \rightarrow \infty } F_{\beta }(J)\), as an immediate consequence we get

Corollary 4.6

\({\text {Var}}\left( \overline{H}(J)\right) \le N\).

By Chebyshev’s inequality and the previous corollary we get

$$\begin{aligned} {\text {P}}\left( \left| \frac{\overline{H}(J)}{N} - \mathbb {E}\left[ \frac{\overline{H}(J)}{N} \right] \right| > \varepsilon \right) \le \frac{{\text {Var}}\left( \frac{\overline{H}(J)}{N} \right) }{\varepsilon ^{2}} = \frac{1}{N^{2}} \frac{{\text {Var}}\left( \overline{H}(J)\right) }{\varepsilon ^{2}} \le \frac{1}{N \varepsilon ^{2}} \end{aligned}$$
(33)

from which claim (a) follows.

To prove the other two claims, set \(V_{F} = \mathbb {E}\left[ \sum _{i,j} (F - F_{i,j})^{2} \mid J\right] \). We have

$$\begin{aligned} V_{F}&= \sum _{i,j} \mathbb {E}\left[ (F - F_{i,j})^{2} \mid J\right] \le \sum _{i,j} \frac{1}{N} \mathbb {E}\left[ (J_{i,j} - J'_{i,j})^{2} \mid J_{i,j}\right] \end{aligned}$$
(34)
$$\begin{aligned}&= \frac{1}{N} \sum _{i,j} \left( J_{i,j}^{2} + \mathbb {E}\left[ (J'_{i,j})^2\right] \right) = \frac{1}{N} \sum _{i,j} \left( J_{i,j}^{2} + 1 \right) \end{aligned}$$
(35)

Let us examine the case \(\mathbb {E}\left[ \left| J_{1,1}\right| ^{3}\right] < \infty \).

Leveraging Jensen’s inequality we have \(\left| \sum _1^n a_i\right| ^{\frac{3}{2}} \le \sqrt{n}\sum _{1}^{n} \left| a_i\right| ^{\frac{3}{2}}\). The latter inequality can be used to show that

$$\begin{aligned} \mathbb {E}\left[ V_{F}^{\frac{3}{2}}\right]&\le \mathbb {E}\left[ \left| \frac{1}{N}\sum _{i,j} \left( J_{i,j}^2 + 1\right) \right| ^{\frac{3}{2}}\right] \le \frac{1}{N^{\frac{3}{2}}} \mathbb {E}\left[ N \sum _{i,j} \left| J_{i,j}^2 + 1\right| ^{\frac{3}{2}}\right] \end{aligned}$$
(36)
$$\begin{aligned}&\le \frac{1}{\sqrt{N}} \mathbb {E}\left[ \sum _{i,j} \sqrt{2} (\left| J_{i,j}\right| ^3 + 1)\right] =\sqrt{2}N^{\frac{3}{2}} \mathbb {E}\left[ \left| J_{1,1}\right| ^3 + 1\right] . \end{aligned}$$
(37)

Then, by Theorem 4.3 and for some constant C,

$$\begin{aligned} \mathbb {E}\left[ \left| F - \mathbb {E}\left[ F\right] \right| ^3\right] \le CN^{\frac{3}{2}} \end{aligned}$$
(38)

which, in turn, implies

$$\begin{aligned} \mathbb {E}\left[ \left| \frac{F-\mathbb {E}\left[ F\right] }{N} \right| ^{3} \right] \le CN^{-\frac{3}{2}} \end{aligned}$$
(39)

Since (39) does not depend on \(\beta \), it holds for \(\overline{H}(J) = \lim _{\beta \rightarrow \infty } F_{\beta }(J)\) as well. Chebyshev’s inequality and the Borel–Cantelli lemma allow us to conclude the proof of claim (b).

Consider now the case of strictly subgaussian \(J_{i,j}\)’s. Recall that for a centered strictly subgaussian random variable X with variance \(\sigma ^{2}\),

$$\begin{aligned} \mathbb {E}\left[ \exp (sX)\right] \le \exp \left( \frac{\sigma ^{2}s^{2}}{2}\right) , \quad \forall s \in \mathbb {R}. \end{aligned}$$
(40)

Also, \(\forall r \in \mathbb {N}\), \(\mathbb {E}\left[ X^{2r}\right] \le 2^{r+1}\sigma ^{2r}r!\), yielding

$$\begin{aligned} \mathbb {E}\left[ e^{s\left( X^2-\sigma ^2\right) }\right]&=1+s \mathbb {E}\left[ X^2-\sigma ^2\right] +\sum _{r=2}^{\infty } \frac{s^r \mathbb {E}\left[ \left( X^2-\sigma ^2\right) ^r\right] }{r !} \le 1+\sum _{r=2}^{\infty } \frac{s^r \mathbb {E}\left[ X^{2 r}\right] }{r !} \end{aligned}$$
(41)
$$\begin{aligned}&\le 1+2\sum _{r=2}^{\infty } s^r 2^r \sigma ^{2 r} =1+\frac{8 s^2 \sigma ^4}{1-2 s \sigma ^2} \end{aligned}$$
(42)

where the inequality in (41) follows from the fact that, for a non-negative random variable (such as \(X^{2}\)), the central r-th moment is smaller than or equal to the r-th moment (recall that \(X^{2}\) has expected value \(\sigma ^{2}\)).

This result, together with (34), can be used to bound the moment-generating function of \(V_{F}\) as

$$\begin{aligned} \mathbb {E}\left[ \exp \left( t V_{F}\right) \right]&\le \mathbb {E}\left[ \exp \left( \frac{t}{N}\sum _{i,\,j} \left( J_{i,j}^2 - 1 + 2\right) \right) \right] = \left( e^{\frac{2t}{N}} \mathbb {E}\left[ \exp \left( \frac{t}{N}\left( J_{1,1}^2 - 1\right) \right) \right] \right) ^{N^2}\end{aligned}$$
(43)
$$\begin{aligned}&\le e^{\frac{2t}{N}} \left[ 1+\frac{1}{N}\frac{8 t^2}{N- 2t}\right] ^{N^2}\le e^{9t^2} \end{aligned}$$
(44)

for N large enough where the last inequality follows from the fact that \(\left[ 1+\frac{1}{N}\frac{8 t^2}{N- 2t}\right] ^{N^2}\) is a decreasing sequence with the same limit as \(\left[ 1 + \frac{8 t^2 }{N^2}\right] ^{N^2}\). Then, from Theorem 4.2 we get, for all \(\theta >0\) and \(\lambda \in (0, \frac{1}{\theta })\),

$$\begin{aligned} \mathbb {E}\left[ \exp (\lambda (F-\mathbb {E}\left[ F\right] ))\right] \le \exp \left( \frac{\lambda \theta }{1-\lambda \theta }\right) \mathbb {E}\left[ \exp \left( \frac{\lambda V_{F}}{\theta }\right) \right] \le \exp \left( \frac{\lambda \theta }{1-\lambda \theta }+\frac{9\lambda ^2}{\theta ^2}\right) \end{aligned}$$
(45)

and hence, by the exponential Markov inequality,

$$\begin{aligned} {\text {P}}\left( \left| F-\mathbb {E}[F]\right| > t \right) \le 2\exp \left( -\lambda t + \frac{\lambda \theta }{1-\lambda \theta }+\frac{9\lambda ^2}{\theta ^2}\right) \le e^{-at} \end{aligned}$$
(46)

for some \(a>0\) (by optimizing on \(\lambda , \theta \)) and for every \(t > 0\). Setting \(t = Nz\), with \(z > 0\) we get

$$\begin{aligned} {\text {P}}\left( \left| F-\mathbb {E}[F]\right| > Nz \right) \le e^{-aNz}. \end{aligned}$$
(47)

Since (47) does not depend on \(\beta \) either, it holds for \(\overline{H}(J) = \lim _{\beta \rightarrow \infty } F_{\beta }(J)\). Dividing by N we get claim (c). \(\square \)

4.2 Proof of Theorem 2.3

Using the bounds of Lemma 4.4 we can prove that the expectation of the free energy does not depend on the distribution of the couplings in the thermodynamic limit.

As for Theorem 2.3, let \(J = \left\{ J_{i,j} \right\} _{1\le i,j\le N}\) and \(Y = \left\{ Y_{i,j} \right\} _{1\le i,j\le N}\) be two independent sequences of independent random variables such that, for every i, j, \(\mathbb {E}\left[ J_{i,j}\right] = \mathbb {E}\left[ Y_{i,j}\right] = 0\) and \(\mathbb {E}\left[ J_{i,j}^{2}\right] = \mathbb {E}\left[ Y_{i,j}^{2}\right] = 1\).

Theorem 4.7

Consider \(F_{\beta }(J)\) as in (20). Then

$$\begin{aligned} \lim _{N \rightarrow \infty } \left| \mathbb {E}\left[ F_{\beta }(J)\right] - \mathbb {E}\left[ F_{\beta }(Y)\right] \right| = 0 \end{aligned}$$
(48)

Proof

The proof is a straightforward adaptation of Chatterjee’s extension of Lindeberg’s argument for the central limit theorem. See [8, 9] for a comprehensive treatment.

Enumerate the elements of the two sequences as \(J_{1}, \ldots , J_{n}\) and \(Y_{1}, \ldots , Y_{n}\), with \(n = N^{2}\), and, for \(0 \le k \le n\), define

$$\begin{aligned} \textbf{Z}_k:=\left( J_{1}, \ldots , J_{k-1}, J_k, Y_{k+1}, \ldots , Y_{n}\right) \end{aligned}$$
(49)
$$\begin{aligned} \textbf{W}_k:=\left( J_{1}, \ldots , J_{k-1}, 0, Y_{k+1}, \ldots , Y_{n}\right) \end{aligned}$$
(50)

Now consider a Taylor expansion of \(F_{\beta }\) around \(J_{k} = 0\) and write

$$\begin{aligned} F_{\beta }(\textbf{Z}_k) = F_{\beta }(\textbf{W}_k) + J_{k} \frac{\partial F_{\beta }(\textbf{W}_k)}{\partial J_{k}} + \frac{1}{2} J_{k}^2 \frac{\partial ^{2} F_{\beta }(\textbf{W}_k)}{\partial J_{k}^{2}} + R_{k} \end{aligned}$$
(51)
$$\begin{aligned} F_{\beta }(\textbf{Z}_{k-1}) = F_{\beta }(\textbf{W}_k) + Y_{k} \frac{\partial F_{\beta }(\textbf{W}_k)}{\partial J_{k}} + \frac{1}{2} Y_{k}^{2} \frac{\partial ^{2} F_{\beta }(\textbf{W}_k)}{\partial J_{k}^{2}} + S_{k}. \end{aligned}$$
(52)

The bounds from Lemma 4.4 then give

$$\begin{aligned} \left| R_k\right| \le \frac{\beta }{8N} J_k^2, \quad \left| S_k\right| \le \frac{\beta }{8N} Y_k^2 \qquad \text {and} \end{aligned}$$
(53)
$$\begin{aligned} \left| R_k\right| \le \frac{\beta ^{2}}{36 \sqrt{3} N^{3/2}} |J_k|^3, \quad \left| S_k\right| \le \frac{\beta ^{2}}{36 \sqrt{3} N^{3/2}} |Y_k|^3. \end{aligned}$$
(54)

Note that, for each k, the random variables \(J_{k}, Y_{k}\) and \(\textbf{W}_k\) are independent. Recalling that \(J_{k}\) and \(Y_{k}\) have both mean zero and variance 1, we get

$$\begin{aligned} \mathbb {E}\left[ F_{\beta }(\textbf{Z}_k)- F_{\beta }(\textbf{Z}_{k-1})\right] = \mathbb {E}\left[ R_{k}- S_{k}\right] \end{aligned}$$
(55)

which gives the estimate

$$\begin{aligned} \left| \mathbb {E}\left[ F_{\beta }(J)\right] - \mathbb {E}\left[ F_{\beta }(Y)\right] \right|&= \left| \sum _{k=1}^n \mathbb {E}\left[ F_{\beta }(\textbf{Z}_k) - F_{\beta }(\textbf{Z}_{k-1})\right] \right| \le \sum _{k=1}^n \left( \mathbb {E}\left[ \left| R_k\right| \right] + \mathbb {E}\left[ \left| S_k\right| \right] \right) \nonumber \\&\le \frac{\beta }{8N}\sum _{k=1}^n \left[ \mathbb {E}\left( J_k^2; \left| J_k\right|>K\right) + \mathbb {E}\left( Y_k^2;\left| Y_k\right| >K\right) \right] \nonumber \\&\quad + \frac{\beta ^{2}}{36 \sqrt{3} N^{3/2}}\sum _{k=1}^n \left[ \mathbb {E}\left( \left| J_k\right| ^3;\left| J_k\right| \le K\right) +\mathbb {E}\left( \left| Y_k\right| ^3;\left| Y_k \right| \le K\right) \right] \end{aligned}$$
(56)

where \(K > 0\) is an arbitrary truncation threshold. If we furthermore assume \(\gamma < \infty \), we only need the third order Taylor bounds on \(R_{k}\), \(S_{k}\), getting the more explicit bound

$$\begin{aligned} \left| \mathbb {E}\left[ F_{\beta }(J)\right] - \mathbb {E}\left[ F_{\beta }(Y)\right] \right| \le \frac{\beta ^2 \gamma \sqrt{N}}{18\sqrt{3}} \end{aligned}$$
(57)

\(\square \)

Finally, we can prove Theorem 2.3. We write the proof for the maximum of H. The proof for the minimum can be done in the same way by replacing \(\beta \) with \(-\beta \).

Recall that we set \(\overline{H}(J) = \max _{\eta } H(J, \eta )\).

We have

$$\begin{aligned} \overline{H}(J)&=\beta ^{-1} \log \left[ e^{\beta \overline{H}(J)}\right] \le \beta ^{-1} \log \left[ \sum _{\eta } e^{\beta H(J, \eta )}\right] \le \beta ^{-1} \log \left[ 2^N e^{\beta \overline{H}(J)}\right] \end{aligned}$$
(58)

getting the uniform bound

$$\begin{aligned} \left| \overline{H}(J) - F_{\beta }(J)\right| \le \beta ^{-1} N \log 2. \end{aligned}$$
(59)

From Theorem 4.7

$$\begin{aligned} \begin{aligned} \left| \mathbb {E}\left[ \overline{H}(J)\right] - \mathbb {E}\left[ \overline{H}(Y)\right] \right|&\le \left| \mathbb {E}\left[ \overline{H}(J) - F_{\beta }(J)\right] \right| + \left| \mathbb {E}\left[ F_{\beta }(J)\right] - \mathbb {E}\left[ F_{\beta }(Y)\right] \right| \\&\quad + \left| \mathbb {E}\left[ \overline{H}(Y) - F_{\beta }(Y)\right] \right| \\&\le \frac{2 N \log 2}{\beta } + \left| \mathbb {E}\left[ F_{\beta }(J)\right] - \mathbb {E}\left[ F_{\beta }(Y)\right] \right| \end{aligned} \end{aligned}$$
(60)

First assume \(\gamma < \infty \). From (57) and (60),

$$\begin{aligned} \left| \mathbb {E}\left[ \overline{H}(J)\right] - \mathbb {E}\left[ \overline{H}(Y)\right] \right|&\le \frac{2 N \log 2}{\beta } + \frac{\beta ^2\gamma }{18\sqrt{3}} \sqrt{N} \end{aligned}$$
(61)

By choosing \(\beta = N^{1/6}\) we get the second part of the statement.
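Indeed, with \(\beta = N^{1/6}\) both terms in (61) are of order \(N^{5/6}\):

$$\begin{aligned} \frac{2 N \log 2}{\beta } = 2\log 2 \, N^{5/6} \quad \text { and } \quad \frac{\beta ^2\gamma }{18\sqrt{3}} \sqrt{N} = \frac{\gamma }{18\sqrt{3}}\, N^{5/6}, \end{aligned}$$

so that, after dividing by N, the difference of the expectations is bounded by a constant depending only on \(\gamma \) times \(N^{-1/6}\), as in (8).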

We now consider \(\gamma = \infty \). Define \(g(K) := \max _{i, j}\left\{ \mathbb {E}\left( J_{i,j}^2;\left| J_{i,j}\right| >K\right) , \mathbb {E}\left( Y_{i,j}^2;\left| Y_{i,j}\right| >K\right) \right\} \). From the finite variance hypothesis, \(g(K) \rightarrow 0\) as \(K \rightarrow \infty \). In (56) we take \(\beta = g(N^{\frac{1}{4}})^{-\frac{1}{2}}\), so that \(\beta \rightarrow \infty \) as \(N \rightarrow \infty \). Also observe that \(\gamma =\infty \) implies \(Kg(K) \rightarrow \infty \) as \(K\rightarrow \infty \).

Estimate (56), with \(K=N^{\frac{1}{4}}\), now becomes

$$\begin{aligned} \frac{1}{N} \left| \mathbb {E}\left[ F_{\beta }(J)\right] - \mathbb {E}\left[ F_{\beta }{(Y)}\right] \right| \le C_1 g(N^{\frac{1}{4}})^{\frac{1}{2}}+ C_2 \left( N^{\frac{1}{4}}g(N^{\frac{1}{4}})\right) ^{-1} \end{aligned}$$
(62)

This vanishes as \(N \rightarrow \infty \); combined with (60) and the fact that \(\beta \rightarrow \infty \), it proves the first part of Theorem 2.3 as well. \(\square \)

5 Conclusions and open problems

The analysis carried out in this paper unveils several lines of investigation that we believe are rather interesting.

It would be useful to identify the minimal requirements on the distribution of the couplings for the minimum and maximum per particle to converge to their expected values exponentially fast.

One feature we find particularly intriguing is the relationship between the elements of J contributing to the minimum and those contributing to the maximum of H, discussed in Conjecture 3.3. We think that a rigorous understanding of the relative sizes of the blocks of J and of the correlation between the rows belonging to the minimizer and to the maximizer of H could significantly improve our understanding of the problem.

A better understanding would, in turn, provide useful tools to evaluate the heuristic algorithms used to tackle QUBO.

Finally, it would be interesting to study the problem when the \(J_{i,j}\) are not independent, especially in the case where the interaction graph is not the complete graph.