1 Introduction

A significant number of recent publications are devoted to the study of rank functions of matrices arising from various measures of computational complexity. Examples of such functions include the nonnegative and positive semidefinite ranks of a matrix, the classical and quantum communication complexities, and many others.

Let A be a real matrix with nonnegative entries. The nonnegative rank of A is the smallest integer k such that A can be written as a sum of k rank-one nonnegative matrices. The nonnegative rank arises as a measure of complexity of a linear program describing the polytope corresponding to a given matrix [21]. More precisely, the extension complexity of a polytope is equal to the nonnegative rank of its slack matrix, see [2, 6, 7, 21]. Another interesting rank function, known as positive semidefinite (or psd) rank, arises in a similar fashion but from semidefinite descriptions of polytopes (see [5]). Namely, the psd rank of A is the smallest integer k such that there are two tuples of real positive semidefinite \(k\times k\) matrices, \((B_1,\ldots ,B_n)\) and \((C_1,\ldots ,C_m)\), such that \(A_{ij}={\text {tr}}(B_iC_j)\) for all indices i, j.
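The definition of psd rank can be illustrated with a short numerical sketch (all specific values below are our own illustrative choices, not from the paper): if \(A_{ij}=(b_i\cdot c_j)^2\) for vectors \(b_i,c_j\in \mathbb {R}^k\), then the rank-one psd matrices \(B_i=b_ib_i^\top \) and \(C_j=c_jc_j^\top \) satisfy \(A_{ij}={\text {tr}}(B_iC_j)\), certifying \({\text {rank}}_\mathrm{psd}(A)\leqslant k\).

```python
import numpy as np

# Toy illustration of the psd rank definition: A_ij = (b_i . c_j)^2 admits
# a psd factorization of size k via the rank-one matrices B_i = b_i b_i^T
# and C_j = c_j c_j^T, since tr(B_i C_j) = (b_i . c_j)^2.
rng = np.random.default_rng(0)
k, n, m = 2, 4, 5                          # illustrative sizes (assumptions)
b = rng.standard_normal((n, k))
c = rng.standard_normal((m, k))
A = (b @ c.T) ** 2                         # entrywise square of a rank-k matrix
B = [np.outer(x, x) for x in b]            # psd k x k factors for the rows
C = [np.outer(y, y) for y in c]            # psd k x k factors for the columns
A_check = np.array([[np.trace(Bi @ Cj) for Cj in C] for Bi in B])
assert np.allclose(A, A_check)
```

This is only a sufficient construction; a general psd factorization need not use rank-one matrices.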

The functions introduced above also have applications in communication complexity. As proved in [4], the value \(\lceil \log _2{\text {rank}}_+(A)\rceil \) is the optimal cost of a classical randomized communication protocol (with nonnegative outputs) that returns a random matrix with expectation equal to A. This result was generalized to the quantum setting in [6], where \(\lceil \log _2{\text {rank}}_\mathrm{psd}(A)\rceil \) was proven to be the optimal cost of a quantum communication protocol computing a matrix equal to A in expectation. We refer the reader to [13] for precise definitions and a detailed treatment of communication complexity, and we note that the above-mentioned rank functions have applications outside the theory of computation as well. In particular, the concept of nonnegative rank is important in statistics (see [12]), data mining (see [14]), demography and many other contexts (see [3] for details).

2 Our Results

Our paper deals with the family of so-called Euclidean distance matrices, which are an interesting source of examples illustrating the behavior of the functions mentioned above. Let \(\alpha =(\alpha _1,\ldots ,\alpha _n)\) be a real vector with \(n\geqslant 3\) and pairwise distinct coordinates. We define the Euclidean distance matrix \(D(\alpha )\) as the \(n\times n\) matrix whose (i, j)th entry equals \(D_{ij}=(\alpha _i-\alpha _j)^2\). Beasley and Laffey [1] showed that the classical rank of the matrix \(D(1,2,\ldots ,n)\) equals three while its nonnegative rank grows to infinity with n. They conjectured that the maximal possible nonnegative rank of an \(n\times n\) Euclidean distance matrix is n, but this conjecture has been refuted in [19]. In the abstract of [10], Hrubeš asked whether \({\text {rank}}_+D(\alpha )\) is \(O(\ln n)\) for all \(\alpha \in \mathbb {R}^n\), and he gave an affirmative answer for several specific families of vectors \(\alpha \in \mathbb {R}^n\). The general case of the problem remained open, and we show that the answer is in fact negative. Namely, we prove that for almost all vectors \(\alpha \in \mathbb {R}^n\) the nonnegative rank of \(D(\alpha )\) grows as a power of n.
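The rank-three observation is easy to check numerically; the following sketch (with a randomly drawn \(\alpha \), an illustrative assumption) verifies that \(D(\alpha )\) has classical rank three because its columns lie in the span of the vectors \(\mathbf {1}\), \(\alpha \) and \(\alpha ^2\).

```python
import numpy as np

# Sanity check: D(alpha) with D_ij = (alpha_i - alpha_j)^2 has rank 3,
# since D_ij = alpha_i^2 - 2 alpha_i alpha_j + alpha_j^2.
rng = np.random.default_rng(1)
n = 8
alpha = np.sort(rng.standard_normal(n))           # pairwise distinct a.s.
D = (alpha[:, None] - alpha[None, :]) ** 2        # D_ij = (alpha_i - alpha_j)^2
assert np.linalg.matrix_rank(D) == 3
span = np.stack([np.ones(n), alpha, alpha ** 2])  # the vectors 1, alpha, alpha^2
assert np.linalg.matrix_rank(np.vstack([span, D])) == 3   # columns lie in their span
```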

Theorem 2.1

If the coordinates of a vector \(\alpha \in \mathbb {R}^n\) are algebraically independent over \(\mathbb {Q}\), then \({\text {rank}}_+ D(\alpha )\geqslant 2\sqrt{n}-2\).

Remark 2.2

One can construct a family of n algebraically independent numbers as \(b_i=\exp a_i\), where \(a_1,\ldots ,a_n\) is a family of real numbers linearly independent over \(\mathbb {Q}\). (This is the famous Lindemann–Weierstrass theorem.)

We can get some other interesting separations as corollaries of Theorem 2.1. In particular, let us compare the behaviour of the nonnegative and psd ranks. It is a basic result that the former is greater than or equal to the latter, but how large can the difference be? The matrix with entries \(\alpha _i-\alpha _j\) is an entrywise square root of \(D(\alpha )\) and has rank two, and the rank of an entrywise square root is an upper bound for the psd rank, as pointed out in [5, 22]. Therefore, the above-mentioned result from [1] yields an example of a family of matrices whose psd ranks are bounded but whose nonnegative ranks grow logarithmically with the size of the matrix. The fundamental paper [6] provides a family of \(n\times n\) matrices whose nonnegative ranks grow as a power of n while the psd ranks grow logarithmically. The question of whether this separation is optimal has been left open. In particular, do there exist matrices with bounded psd ranks whose nonnegative ranks grow as a power of the size? The problem of separating the nonnegative and psd ranks has also been discussed in [11], but the above question remained open. We get the answer as a corollary of Theorem 2.1.

Corollary 2.3

There is a matrix \(D\in \mathbb {R}^{n\times n}\) such that \({\text {rank}}_\mathrm{psd}(D)=2\) and \({\text {rank}}_{+}(D)\geqslant 2\sqrt{n}-2\).
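The psd part of the corollary can be made explicit. Since \(D_{ij}=(\alpha _j-\alpha _i)^2=\bigl ((1,\alpha _i)\cdot (\alpha _j,-1)\bigr )^2\), the rank-one \(2\times 2\) psd matrices built from the vectors \((1,\alpha _i)\) and \((\alpha _j,-1)\) give a psd factorization of size two. The sketch below checks this for a randomly drawn \(\alpha \) (an illustrative assumption).

```python
import numpy as np

# Explicit size-2 psd factorization of D: D_ij = ((1, alpha_i) . (alpha_j, -1))^2,
# so B_i = (1, alpha_i)(1, alpha_i)^T and C_j = (alpha_j, -1)(alpha_j, -1)^T
# certify rank_psd(D) <= 2.
rng = np.random.default_rng(2)
n = 6
alpha = np.sort(rng.standard_normal(n))
D = (alpha[:, None] - alpha[None, :]) ** 2
B = [np.outer([1.0, a], [1.0, a]) for a in alpha]    # B_i from (1, alpha_i)
C = [np.outer([a, -1.0], [a, -1.0]) for a in alpha]  # C_j from (alpha_j, -1)
D_check = np.array([[np.trace(Bi @ Cj) for Cj in C] for Bi in B])
assert np.allclose(D, D_check)
```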

This corollary is also interesting from the point of view of communication complexity theory. As pointed out above, the logarithms of the psd and nonnegative ranks are, respectively, the optimal costs of quantum and classical communication protocols computing a given matrix in expectation. Therefore, we get an asymptotically optimal separation between the quantum and classical communication complexities. The existence of such a separation was an open problem despite the efforts mentioned in the above paragraph; the corresponding question was explicitly posed in [22] as Problem 4 in Sect. 5.

Corollary 2.4

There is a nonnegative matrix \(D\in \mathbb {R}^{2^m\times 2^m}\) that can be computed in expectation by a one-bit quantum communication protocol but requires \(\Omega (m)\) bits of classical randomized communication.

The goal of this paper is to prove Theorem 2.1. Our approach is mostly geometric, and we use a characterization of the nonnegative rank in terms of the classical nested polytopes problem. Several general results and a description of our technique are provided in Sect. 3. The proof of Theorem 2.1 is completed in Sect. 4.

3 Our Technique

Our proof of Theorem 2.1 is based on a variation of Yannakakis’ theory connecting the nonnegative rank and extension complexities of polytopes. It will be more convenient for us to work with the concept of intersection complexity, which is somewhat dual to extension complexity, see [17] for details. These quantities are always equal to each other, and all the results on one of them hold for the other one as well. (However, we do not use the fact that they are equal in the proof of Theorem 2.1.) Let \(P\subset \mathbb {R}^d\), \(Q\subset \mathbb {R}^n\) be polytopes, and let H be a d-dimensional affine subspace in \(\mathbb {R}^n\). We say that P is a slice of Q if P is congruent to \(Q\cap H\). The intersection complexity of P, denoted \({\text {ic}}(P)\), is the smallest integer k such that P is a slice of a polytope with k vertices. Clearly, an additional assumption requiring H to equal \(\{x\in \mathbb {R}^n\mid x_{d+1}=\cdots =x_n=0\}\) does not cause any loss of generality and leads to an equivalent definition of intersection complexity.

We will say that a real vector is stochastic if its entries sum up to one. Now let A be a nonnegative column-stochastic \(n\times m\) matrix. That is, we assume that the columns of A are taken from the standard simplex \(\Delta _n=\{x\in \mathbb {R}^n\mid x_1+\cdots +x_n=1,\,x_i\geqslant 0\}\). We denote by \({\text {col}}(A)\) the linear subspace of \(\mathbb {R}^n\) spanned by the columns of A, and we define the two polytopes, \(\mathcal {P}_\mathrm{in}(A)\) and \(\mathcal {P}_\mathrm{out}(A)\), as follows. We set \(\mathcal {P}_\mathrm{out}(A)=\Delta _n\cap {\text {col}}(A)\), and we define \(\mathcal {P}_\mathrm{in}(A)\) as the convex hull of the columns of A. The following proposition appears in [2], and it is widely known in the community; Pashkovich in [18] and Gillis and Glineur in [8] have obtained similar results.

Proposition 3.1

(See Theorem 1 in [2].) Let A be a nonnegative column-stochastic \(n\times m\) matrix. If \({\text {rank}}_+(A)\leqslant r\), then there is a polytope P satisfying \(\mathcal {P}_\mathrm{in}(A)\subset P\subset \mathcal {P}_\mathrm{out}(A)\) and \({\text {ic}}(P)\leqslant r\).

Now we are going to prove a useful lower bound on the intersection complexity of a polytope. Let P be a polytope in \(\mathbb {R}^d\); we denote by \(\mathbb {Q}(P)\) the field obtained from \(\mathbb {Q}\) by adjoining the coordinates of vertices of P. By \({\text {trdeg}}(P)\) we denote the transcendence degree of the field extension \(\mathbb {Q}(P)\supset \mathbb {Q}\). The following results provide a lower bound for the quantity \({\text {ic}}(P)\) in terms of \({\text {trdeg}}(P)\). These results can be seen as an algebraic analogue of the corresponding result in [16], which is itself a generalization of a result in [7].

Lemma 3.2

Let \(Q\subset \mathbb {R}^d\) be a polytope with v vertices, and let l be a rational affine subspace of \(\mathbb {R}^d\) (that is, an affine span of several rational points). If \(P=Q\cap l\) and \(\dim Q=d\), \(\dim l=k\), then \({\text {trdeg}}(P)\leqslant d(v-d+k)\).

Proof

Let \(U=(u_1,\ldots ,u_{k+1})\) be a tuple of arbitrary points on l satisfying \(\dim {\text {conv}} U=k\). We can find a tuple \(V=(v_1,\ldots ,v_{d-k})\) of \(d-k\) vertices of Q satisfying \(\dim {\text {conv}}\,U\cup V=d\).

Let \(V'=(v_1',\ldots ,v_{d-k}')\) be a tuple of arbitrary rational points satisfying \(\dim {\text {conv}}\,U\cup V'=d\). Then there exists a unique affine transformation \(\pi \) sending (U, V) to \((U,V')\). Clearly, \(\pi \) fixes l pointwise, so \(P=\pi (Q)\cap l\), and the polytope \(\pi (Q)\) has \(d-k\) vertices with rational coordinates. Since l is rational, every vertex of P is the solution of a linear system with coefficients in \(\mathbb {Q}(\pi (Q))\), so \(\mathbb {Q}(P)\subset \mathbb {Q}(\pi (Q))\). We get \({\text {trdeg}}(P)\leqslant {\text {trdeg}}(\pi (Q))\leqslant dv-d(d-k)\). \(\square \)

Theorem 3.3

Let \(P\subset \mathbb {R}^k\) be a polytope. Then \({\text {ic}}(P)\geqslant 2\sqrt{{\text {trdeg}}(P)}-k\).

Proof

Assume that there is a d-dimensional polytope Q with v vertices such that P is a slice of Q; as noted above, the subspace cutting P out of Q can be taken to be a rational coordinate subspace, so Lemma 3.2 applies and gives \({\text {trdeg}}(P)\leqslant d(v+k-d)\). As a function of d, the expression \(d(v+k-d)\) attains its maximum \((v+k)^2/4\) at \(d=(v+k)/2\), so we get \(4{\text {trdeg}}(P)\leqslant (v+k)^2\), that is, \(v\geqslant 2\sqrt{{\text {trdeg}}(P)}-k\). \(\square \)
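The maximization step is elementary; for concreteness, the following sketch checks it on a grid for one pair of sample values of v and k (arbitrary assumptions for illustration).

```python
# The concave quadratic f(d) = d*(v + k - d) peaks at d = (v + k)/2
# with value (v + k)^2 / 4; checked on a fine grid for v = 10, k = 2.
v, k = 10, 2                                   # sample values (assumptions)

def f(d):
    return d * (v + k - d)

grid = [i / 100 for i in range(0, 100 * (v + k) + 1)]
best = max(grid, key=f)                        # unique maximizer on the grid
assert abs(best - (v + k) / 2) < 1e-9
assert abs(f(best) - (v + k) ** 2 / 4) < 1e-9
```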

We recall that the slack matrix of a polytope P has rows indexed by the facets of P and columns indexed by the vertices of P, and its (i, j)th entry is the distance from the jth vertex to the ith facet. Two matrices \(S_1\), \(S_2\) are called scaling-equivalent if there are monomial square matrices \(D_1\), \(D_2\) such that \(S_1=D_1S_2D_2\), and polytopes \(P_1, P_2\) are projectively equivalent if they can be obtained from each other by a projective transformation. The following fact is a combination of Corollary 1.5 in [9] and Lemma 20 in [17].

Proposition 3.4

Polytopes \(P_1\), \(P_2\) are projectively equivalent if and only if their slack matrices are scaling-equivalent, and in this case \(P_1\), \(P_2\) have equal intersection complexities.

4 The Proof

Recall that we assume \(n\geqslant 3\). In this section, we use i, j, k as indices of the coordinates of n-vectors and of the entries of \(n\times n\) matrices, and we assume that these indices belong to \(\mathbb {Z}/n\mathbb {Z}\); in other words, \(i+1\) stands for 1 if \(i=n\).

Let \(\alpha =(\alpha _1,\ldots ,\alpha _n)\) be a real vector whose coordinates are algebraically independent over \(\mathbb {Q}\). The \(n\times n\) matrix D is defined as \(D_{ij}=(\alpha _i-\alpha _j)^2\), and our goal is to prove that \({\text {rank}}_+(D)\geqslant 2\sqrt{n}-2\).

Claim 4.1

Let \(u_i\in \mathbb {R}^n\) be the vector whose kth coordinate equals \((\alpha _k-\alpha _i)(\alpha _k-\alpha _{i+1})\). We have \({\text {rank}}(D)=3\) and \(u_i\in {\text {col}}(D)\).

Proof

We have \(D_{kj}=\alpha _k^2-2\alpha _k\alpha _j+\alpha _j^2\) and \((u_j)_k=\alpha _k^2-(\alpha _j+\alpha _{j+1})\alpha _k+\alpha _j\alpha _{j+1}\), so both the \(u_j\)'s and the columns of D lie in the span of the vectors \((1,\ldots ,1)^\top \), \((\alpha _1,\ldots ,\alpha _n)^\top \), \((\alpha _1^2,\ldots ,\alpha _n^2)^\top \). Since the \(\alpha _i\) are pairwise distinct and \(n\geqslant 3\), a Vandermonde argument shows that the columns of D span this three-dimensional space, so \({\text {rank}}(D)=3\) and \(u_j\in {\text {col}}(D)\). \(\square \)

Let us note that permutations of \(\alpha \) lead only to permutations of rows and columns of D, which do not change the nonnegative rank. Therefore, we can assume that \(\alpha \) is increasing. Also, let us define \(d_i\) as the sum of the entries in the ith column of D; we define \(D'\) as the matrix obtained from D by dividing every entry \(D_{ij}\) by \(d_j\). Clearly, the matrix \(D'\) is column-stochastic and satisfies \({\text {rank}}_+(D)={\text {rank}}_+(D')\). Let us compute the polytope \(\mathcal {P}_\mathrm{out}(D')\) as in Proposition 3.1.

Claim 4.2

The polytope \(\mathcal {P}_\mathrm{out}(D')\) is an n-gon. The vertex \(v_i\) of \(\mathcal {P}_\mathrm{out}(D')\) is \(s_i^{-1}u_i\), where \(s_i\) is the sum of the coordinates of the vector \(u_i\) as in Claim 4.1.

Proof

Since \({\text {rank}}(D)=3\), the affine subspace \(H={\text {col}}(D')\cap \{x_1+\cdots +x_n=1\}\) has dimension 2, and we get \(\dim \mathcal {P}_\mathrm{out}(D')=2\). Therefore, \(\mathcal {P}_\mathrm{out}(D')\) is a polygon, and every edge of it is the intersection of H and a facet of \(\Delta _n\). We see that \(\mathcal {P}_\mathrm{out}(D')\) has at most n edges and, therefore, at most n vertices. The vertices of \(\mathcal {P}_\mathrm{out}(D')\) are intersections of H with the ridges of \(\Delta _n\); in other words, the vertices are those nonnegative vectors in H that have two zero coordinates. By Claim 4.1 the vectors in the assertion satisfy these properties; so we have identified all the n vertices of \(\mathcal {P}_\mathrm{out}(D')\). \(\square \)
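The description of the vertices can be verified numerically; the sketch below (for a randomly drawn increasing \(\alpha \), an illustrative assumption, with indices taken mod n) checks that each \(s_i^{-1}u_i\) is nonnegative, vanishes at coordinates i and \(i+1\), and lies in \({\text {col}}(D')\).

```python
import numpy as np

# Numerical check of Claim 4.2: the vectors v_i = u_i / s_i are nonnegative,
# have zeros exactly at coordinates i and i+1 (mod n), and lie in col(D').
rng = np.random.default_rng(3)
n = 6
alpha = np.sort(rng.standard_normal(n))
D = (alpha[:, None] - alpha[None, :]) ** 2
Dp = D / D.sum(axis=0)                       # column-stochastic D'
proj = Dp @ np.linalg.pinv(Dp)               # orthogonal projector onto col(D')
for i in range(n):
    u = (alpha - alpha[i]) * (alpha - alpha[(i + 1) % n])
    v = u / u.sum()                          # candidate vertex v_i = u_i / s_i
    assert np.all(v >= -1e-12)               # nonnegative ...
    assert abs(v[i]) < 1e-12 and abs(v[(i + 1) % n]) < 1e-12   # ... with two zeros
    assert np.allclose(proj @ v, v)          # lies in col(D')
```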

Claim 4.3

The slack matrix of \(\mathcal {P}_\mathrm{out}(D')\) is scaling-equivalent to \((v_1|\ldots |v_n)\), where the \(v_i\)’s are as in Claim 4.2.

Proof

The polygon \(\mathcal {P}_\mathrm{out}(D')\) is defined by the equalities describing \({\text {col}}(D)\) and the inequalities \(x_i\geqslant 0\). In other words, \(x_1\geqslant 0,\ldots ,x_n\geqslant 0\) are facet-defining inequalities of \(\mathcal {P}_\mathrm{out}(D')\), which means that the distance from any point to the ith facet of \(\mathcal {P}_\mathrm{out}(D')\) is proportional to the ith coordinate of that point. \(\square \)

Claim 4.4

Every edge of \(\mathcal {P}_\mathrm{out}(D')\) contains a vertex of \(\mathcal {P}_\mathrm{in}(D')\).

Proof

Note that the ith column of \(D'\) is a vertex of \(\mathcal {P}_\mathrm{in}(D')\) and has a zero at the ith coordinate. Since all the vertices of \(\mathcal {P}_\mathrm{out}(D')\) have nonnegative coordinates, a convex combination of them can have a zero ith coordinate only if every vertex appearing with a positive weight has a zero ith coordinate. Therefore, the ith column of \(D'\) belongs to the convex hull of those vertices of \(\mathcal {P}_\mathrm{out}(D')\) that have zeros at their ith coordinates. By Claim 4.2, there are only two such vertices, \(v_i\) and \(v_{i-1}\), and their convex hull is the edge connecting them. \(\square \)
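The conclusion of the claim can be checked numerically: for a randomly drawn increasing \(\alpha \) (an illustrative assumption), each column of \(D'\) is a convex combination of \(v_{i-1}\) and \(v_i\).

```python
import numpy as np

# Numerical check of Claim 4.4: the ith column of D' lies on the segment
# between the vertices v_{i-1} and v_i of P_out(D') (indices mod n).
rng = np.random.default_rng(4)
n = 6
alpha = np.sort(rng.standard_normal(n))
D = (alpha[:, None] - alpha[None, :]) ** 2
Dp = D / D.sum(axis=0)

def vertex(i):
    u = (alpha - alpha[i % n]) * (alpha - alpha[(i + 1) % n])
    return u / u.sum()

for i in range(n):
    a, b = vertex(i - 1), vertex(i)
    col = Dp[:, i]
    # solve col = t*a + (1-t)*b by least squares, then verify the solution
    t = np.linalg.lstsq(np.column_stack([a - b]), col - b, rcond=None)[0][0]
    assert np.allclose(t * a + (1 - t) * b, col)
    assert -1e-12 <= t <= 1 + 1e-12          # genuinely a convex combination
```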

Claim 4.5

Let P be a polygon satisfying \(\mathcal {P}_\mathrm{in}(D')\subset P\subset \mathcal {P}_\mathrm{out}(D')\). Then any edge of \(\mathcal {P}_\mathrm{out}(D')\) contains some vertex of P.

Proof

By Claim 4.4, every edge e of \(\mathcal {P}_\mathrm{out}(D')\) contains a vertex p of \(\mathcal {P}_\mathrm{in}(D')\), and \(p\in P\). The line spanned by e supports \(\mathcal {P}_\mathrm{out}(D')\) and hence P, so its intersection with P is a nonempty face of P contained in e, and any vertex of this face is a vertex of P lying on e. \(\square \)

Claim 4.6

We define the points \(w_k=(w_{k1},w_{k2})\in \mathbb {R}^2\), where \(k\in \{1,\ldots ,n\},\) and

$$\begin{aligned} w_{k1}=\frac{1}{\alpha _k}+\frac{1}{\alpha _{k+1}}+\frac{1}{\alpha _k\alpha _{k+1}},\qquad w_{k2}=-\frac{1}{\alpha _k}-\frac{1}{\alpha _{k+1}}+\frac{1}{\alpha _k\alpha _{k+1}}. \end{aligned}$$

The polygon \(W={\text {conv}}\{w_1,\ldots ,w_n\}\) is projectively equivalent to \(\mathcal {P}_\mathrm{out}(D')\).

Proof

In view of Proposition 3.4, it suffices to prove that the slack matrices of W and \(\mathcal {P}_\mathrm{out}(D')\) can be obtained from each other by scaling of rows and columns. The slack matrix of W is scaling-equivalent to the matrix S whose (i, k)th entry is twice the oriented area of the triangle with vertices \(w_{i-1}\), \(w_i\), \(w_{k}\). That is, we have

$$\begin{aligned} S_{ik}=\det \begin{pmatrix}w_{i-1,1}&{}w_{i-1,2}&{}1\\ w_{i1}&{}w_{i2}&{}1\\ w_{k1}&{}w_{k2}&{}1\end{pmatrix}, \end{aligned}$$

and straightforward checking shows that

$$\begin{aligned} S_{ik}=\frac{2(\alpha _{i-1}-\alpha _{i+1})}{\alpha _{i-1} \alpha _i^2\alpha _{i+1}}\cdot \frac{1}{\alpha _{k}\alpha _{k+1}} \cdot (\alpha _i-\alpha _k)(\alpha _i-\alpha _{k+1}). \end{aligned}$$

Here, the first factor depends only on the row index and the second factor depends only on the column index, so the matrix S can be obtained by scaling from the matrix \(S'\) defined as \(S_{ik}'=(\alpha _i-\alpha _k)(\alpha _i-\alpha _{k+1})\). The matrix \(S'\) is scaling-equivalent to the one in Claim 4.3, so we are done. \(\square \)
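The "straightforward checking" can be delegated to a computer; the sketch below verifies the determinant identity for \(S_{ik}\) numerically for a randomly drawn positive \(\alpha \) (positivity is an illustrative assumption keeping all \(\alpha _k\) nonzero), with indices taken mod n as in the text.

```python
import numpy as np

# Numerical verification of the determinant identity for S_ik in Claim 4.6.
rng = np.random.default_rng(5)
n = 7
a = np.sort(rng.uniform(0.5, 3.0, n))        # positive, increasing alpha

def w(k):
    x, y = a[k % n], a[(k + 1) % n]          # the point w_k from the claim
    return np.array([1/x + 1/y + 1/(x*y), -1/x - 1/y + 1/(x*y)])

for i in range(n):
    for k in range(n):
        M = np.vstack([np.append(w(i - 1), 1.0),
                       np.append(w(i), 1.0),
                       np.append(w(k), 1.0)])
        lhs = np.linalg.det(M)               # the determinant defining S_ik
        am, ai, ap = a[(i - 1) % n], a[i], a[(i + 1) % n]
        ak, ak1 = a[k], a[(k + 1) % n]
        rhs = (2 * (am - ap) / (am * ai**2 * ap)) * (1 / (ak * ak1)) \
              * (ai - ak) * (ai - ak1)       # the claimed product formula
        assert abs(lhs - rhs) < 1e-9
```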

Claim 4.7

Let \(h_k\) be a point on the straight line through the points \(w_{k-1}\) and \(w_k\) from Claim 4.6. Then \(\alpha _k\) is algebraic over the field generated by the coordinates of \(h_k\).

Proof

The coordinates of \(h_k\) are \(\lambda w_{k-1,1}+\mu w_{k1}\) and \(\lambda w_{k-1,2}+\mu w_{k2}\), for some \(\lambda ,\mu \in \mathbb {R}\) satisfying \(\lambda +\mu =1\). Half the sum and half the difference of these coordinates are equal, respectively, to

$$\begin{aligned} \sigma _1=\frac{1}{\alpha _k}\cdot \left( \frac{\lambda }{\alpha _{k-1}}+\frac{\mu }{\alpha _{k+1}}\right) ,\qquad \sigma _2= \frac{\lambda }{\alpha _{k-1}}+\frac{\mu }{\alpha _{k+1}}+\frac{1}{\alpha _k}. \end{aligned}$$

By Vieta’s formulas, one of the roots of the equation \(t^2-\sigma _2t+\sigma _1=0\) equals \(1/\alpha _k\) (while the other is \(\lambda /\alpha _{k-1}+\mu /\alpha _{k+1}\)). Therefore, \(\alpha _k\) is a root of the polynomial \(\sigma _1t^2-\sigma _2t+1\), whose coefficients are rational functions of the coordinates of \(h_k\). \(\square \)
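The recovery of \(\alpha _k\) from the coordinates of \(h_k\) can be checked numerically; the sample values of \(\alpha _{k-1},\alpha _k,\alpha _{k+1}\) and \(\lambda \) below are arbitrary illustrative assumptions.

```python
import numpy as np

# Numerical check of Claim 4.7: from the coordinates of h_k one computes
# sigma_1, sigma_2, and 1/alpha_k is a root of t^2 - sigma_2*t + sigma_1.
a_prev, a_k, a_next = 0.7, 1.3, 2.9       # sample alpha_{k-1}, alpha_k, alpha_{k+1}
lam = 0.35
mu = 1 - lam

def w(x, y):                               # the w-point built from (x, y)
    return np.array([1/x + 1/y + 1/(x*y), -1/x - 1/y + 1/(x*y)])

h = lam * w(a_prev, a_k) + mu * w(a_k, a_next)      # point on the line
sigma1 = (h[0] + h[1]) / 2
sigma2 = (h[0] - h[1]) / 2
roots = np.roots([1.0, -sigma2, sigma1])
assert np.isclose(roots, 1 / a_k).any()             # one root is 1/alpha_k
assert np.isclose(roots, lam / a_prev + mu / a_next).any()   # the other root
```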

Claim 4.8

Let P be a polygon satisfying \(\mathcal {P}_\mathrm{in}(D')\subset P\subset \mathcal {P}_\mathrm{out}(D')\). Then \({\text {ic}}(P)\geqslant 2\sqrt{n}-2\).

Proof

According to Claim 4.6, there exists a projective transformation \(\pi \) sending \(\mathcal {P}_\mathrm{out}(D')\) to W. Claim 4.5 shows that every edge of W contains some vertex of \(P'=\pi (P)\), and from Claim 4.7 we get that any \(\alpha _k\) is algebraic over \(\mathbb {Q}(P')\). Since the \(\alpha _k\)’s are algebraically independent, we get \({\text {trdeg}}(P')\geqslant n\), and Theorem 3.3 implies \({\text {ic}}(P')\geqslant 2\sqrt{n}-2\). Finally, Proposition 3.4 implies \({\text {ic}}(P)={\text {ic}}(P')\), and we get the desired result. \(\square \)

In view of Proposition 3.1, Claim 4.8 completes the proof of Theorem 2.1. Therefore, we get the lower bound for \({\text {rank}}_{+}(D)\), which allows us to prove all the results announced in Sect. 2. We note that this bound is still quite far from the best known upper bound, which is \(O(n/\ln ^{\circ 6}n)\), see [20]. (Here, \(\ln ^{\circ 6}\) denotes the sixth iteration of the logarithm.) Proving that there are \(n\times n\) distance matrices D satisfying \({\text {rank}}_{+}(D)\in \omega (\sqrt{n})\) seems to require an essentially new technique, and our approach does not seem to lead to such an improvement. In particular, it is known [20] that there are generic n-gons P satisfying \({\text {ic}}(P)\in O(\sqrt{n})\), which means that Theorem 3.3 cannot be improved substantially.