32.1 Introduction

Extreme points and values of the Vandermonde determinant and its generalizations constrained to surfaces have been considered in the work of the authors, including applications to approximation and interpolation of functions and data, curve fitting, electromagnetism, lightning modelling, electromagnetic compatibility, probability theory and financial engineering [1,2,3,4,5]. Algebraic varieties defined using Vandermonde determinant functions also have interesting algebraic and geometric properties and structure from the point of view of algebraic geometry and commutative algebra [6,7,8].

The ordinary Vandermonde matrices are given by

$$\begin{aligned} V_{mn}(\mathbf {x}_n)&=G_{mn}(\mathbf {x}_n,(0,1,\ldots ,m-1))=\left[ x_j^{i-1}\right] _{mn} \\&=\left[ \begin{matrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{m-1} & x_2^{m-1} & \cdots & x_n^{m-1}\end{matrix}\right] . \end{aligned}$$

Note that some authors use the transpose as the definition and possibly also let the indices run from 0. All entries in the first row of a Vandermonde matrix are ones and, with the convention \(0^0=1\), this holds even when some \(x_j\) is zero.

In this article the following notation will sometimes be used:

$$\begin{aligned} \mathbf {x}_I = (x_{i_1},x_{i_2},\ldots ,x_{i_n}),~I = \{ i_1, i_2, \ldots , i_n \}. \end{aligned}$$

For the sake of convenience we will use \(\mathbf {x}_n\) to mean \(\mathbf {x}_{I_n}\) where \(I_n = \{ 1, 2, \ldots , n \}\).

We have the following well known theorem.

Theorem 32.1

The determinant of a square Vandermonde matrix has the well-known form

$$\begin{aligned} v_n\equiv v_n(\mathbf {x}_n)\equiv \det V_n(\mathbf {x}_n)=\prod _{1\le i<j\le n}(x_j-x_i). \end{aligned}$$

This determinant is also simply referred to as the Vandermonde determinant.
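As a quick numerical sanity check (an illustration added here, not part of the original text; the nodes are chosen arbitrarily), the product formula can be compared with a direct determinant computation in NumPy:

```python
import numpy as np

def vandermonde_det(x):
    # Product formula of Theorem 32.1: prod_{i<j} (x_j - x_i)
    n = len(x)
    return np.prod([x[j] - x[i] for i in range(n) for j in range(i + 1, n)])

x = np.array([1.0, 2.0, 4.0, 7.0])           # arbitrary distinct nodes
V = np.vander(x, increasing=True).T          # V[i, j] = x_j**i, as in the text
assert np.isclose(np.linalg.det(V), vandermonde_det(x))   # both equal 540 here
```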

The Vandermonde determinant is a special case of a generalized Vandermonde determinant (32.1).  A generalized Vandermonde matrix is determined by two vectors \(\mathbf {x}_n=(x_1,\ldots ,x_n)\in K^n\) and \(\mathbf {a}_m=(a_1,\ldots ,a_m)\in K^m\), where K is usually the real or complex field, and is defined as

$$\begin{aligned} G_{mn}(\mathbf {x}_n,\mathbf {a}_m)=\left[ x_j^{a_i}\right] _{mn}. \end{aligned}$$
(32.1)

For square matrices only one index is given, \(G_n \equiv G_{nn}\).

Note that the term generalized Vandermonde matrix has been used for several kinds of matrices that are not equivalent to, or special cases of, (32.1); see [9] for instance.

The determinant of generalized Vandermonde matrices

$$\begin{aligned} g_n\equiv g_n(\mathbf {x}_n,\mathbf {a}_n)\equiv \det G_n(\mathbf {x}_n,\mathbf {a}_n) \end{aligned}$$

and its connections to difference equations, symmetric polynomials and representation theory have been considered for example in [10, 11].

Vandermonde matrices can be used in polynomial interpolation. The coefficients of the unique polynomial \(c_0+c_1x+\cdots +c_{n-1}x^{n-1}\) that passes through n points \((x_i,y_i)\in \mathbb {C}^2\) with distinct \(x_i\) are

$$ \left[ \begin{matrix}c_0&c_1&\cdots&c_{n-1}\end{matrix}\right] = \left[ \begin{matrix}y_1&y_2&\cdots&y_n\end{matrix}\right] \left[ \begin{matrix} 1 & 1 & \cdots & 1\\ x_1 & x_2 & \cdots & x_n\\ \vdots & \vdots & \ddots & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{matrix}\right] ^{-1}. $$

In this context the \(x_i\) are called nodes. There are also many applications of Vandermonde and generalized Vandermonde determinants, for example in differential equations [9], difference equations and representation theory [10], and time series analysis [12].
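A small worked illustration of this inversion (with hypothetical nodes and data, not taken from the text):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])       # distinct nodes (hypothetical example)
y = 1 + x + x**3                          # samples of p(t) = 1 + t + t^3
V = np.vander(x, increasing=True).T       # V[i, j] = x_j**i
c = y @ np.linalg.inv(V)                  # row vector [c_0, ..., c_{n-1}]
assert np.allclose(c, [1.0, 1.0, 0.0, 1.0])
```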

In Sect. 32.1.1 we introduce the reader to the behavior of the Vandermonde determinant \(v_3(\mathbf {x}_3)=(x_3-x_2)(x_3-x_1)(x_2-x_1)\) by some introductory visualizations. We also consider \(g_3(\mathbf {x}_3,\mathbf {a}_3)\) for some choices of exponents \(\mathbf {a}_3\).

In Sect. 32.2 we optimize the Vandermonde determinant \(v_n\) over the unit sphere \(S^{n-1}\) in \(\mathbb {R}^n\) finding a rescaled Hermite polynomial whose roots give the extreme points. In Sect. 32.2.2 we arrive at the results for the special case \(v_3\) in a slightly different way. In Sect. 32.2.3 we present a transformation of the optimization problem into an equivalent form with known optimum value. In Sect. 32.2.4 we extend the range of visualizations to \(v_4,\ldots ,v_7\) by exploiting some analytical results. In Sect. 32.3 we prove some limits involving the generalized Vandermonde matrix and determinant.

32.1.1 Visual Exploration in 3D

In this section we plot the values of the determinant

$$\begin{aligned} v_3(\mathbf {x}_3)=(x_3-x_2)(x_3-x_1)(x_2-x_1), \end{aligned}$$

and also the generalized Vandermonde determinant \(g_3(\mathbf {x}_3,\mathbf {a}_3)\) for three different choices of \(\mathbf {a}_3\) over the unit sphere \(x_1^2 + x_2^2 + x_3^2=1\) in \(\mathbb {R}^3\). Our plots are over the unit sphere but the determinant exhibits the same general behavior over centered spheres of any radius. This follows directly from (32.1) and the fact that each term of the determinant contains exactly one element from each row. For any scalar c we get

$$\begin{aligned} g_n(c\mathbf {x}_n,\mathbf {a}_n)= \left[ \prod _{i=1}^nc^{a_i}\right] g_n(\mathbf {x}_n,\mathbf {a}_n), \end{aligned}$$

which for \(v_n\) becomes

$$\begin{aligned} v_n(c\mathbf {x}_n)=c^{\displaystyle \frac{n(n-1)}{2}} v_n(\mathbf {x}_n), \end{aligned}$$
(32.2)

and so the values over different radii differ only by a constant factor.
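The homogeneity relation (32.2) is easy to check numerically; the following sketch (not part of the original text) uses an arbitrary point and scale factor:

```python
import numpy as np

def v(x):
    # v_n(x) = prod_{i<j} (x_j - x_i)
    n = len(x)
    return np.prod([x[j] - x[i] for i in range(n) for j in range(i + 1, n)])

x = np.array([-0.3, 0.1, 0.4, 0.9])       # arbitrary point
n, c = len(x), 2.5
assert np.isclose(v(c * x), c ** (n * (n - 1) / 2) * v(x))   # relation (32.2)
```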

Fig. 32.1 Plot of \(v_3(\mathbf {x}_3)\) over the unit sphere

In Fig. 32.1 the value of \(v_3(\mathbf {x}_3)\) has been plotted over the unit sphere and the curves where the determinant vanishes are traced as black lines. The coordinates in Fig. 32.1b are related to \(\mathbf {x}_3\) by

$$\begin{aligned} \mathbf {x}_3= \left[ \begin{matrix}2 & 0 & 1\\ -1&1&1\\ -1&-1&1\end{matrix}\right] \left[ \begin{matrix}1/\sqrt{6}&0&0\\ 0&1/\sqrt{2}&0\\ 0&0&1/\sqrt{3}\end{matrix}\right] \mathbf {t} , \end{aligned}$$
(32.3)

where the columns in the product of the two matrices are the basis vectors in \(\mathbb {R}^3\). The unit sphere in \(\mathbb {R}^3\) can also be described using spherical coordinates. In Fig. 32.1c the following parametrization was used.

$$\begin{aligned} \mathbf {t}(\theta ,\phi )=\left[ \begin{matrix} \cos (\phi ) \sin (\theta )\\ \sin (\phi )\\ \cos (\phi ) \cos (\theta )\\ \end{matrix}\right] . \end{aligned}$$
(32.4)

We will use this \(\mathbf {t}\)-basis and spherical parametrization throughout this section.

From the plots in Fig. 32.1 it can be seen that the number of extreme points for \(v_3\) over the unit sphere seems to be \(6=3!\). It can also be seen that all extreme points seem to lie in the plane through the origin that is orthogonal to an apparent symmetry axis in the direction (1, 1, 1), the direction of \(t_3\). We will see later that the extreme points for \(v_n\) indeed lie in the hyperplane \(\sum _{i=1}^n x_i=0\) for all n, see Theorem 32.3, and that the number of extreme points for \(v_n\) is n!, see Remark 32.1.

The black lines where \(v_3(\mathbf {x}_3)\) vanishes are actually the intersections between the sphere and the three planes \(x_3-x_1=0\), \(x_3-x_2=0\) and \(x_2-x_1=0\), as these differences appear as factors in \(v_3(\mathbf {x}_3)\).

We will see later on that the extreme points are the six points obtained by permuting the coordinates in

$$\begin{aligned} \mathbf {x}_3=\frac{1}{\sqrt{2}} \left( -1,0,1\right) . \end{aligned}$$

For reasons that will become clear in Sect. 32.2.1 it is also useful to think about these coordinates as the roots of the polynomial

$$\begin{aligned} P_3(x) = x^3-\frac{1}{2}x. \end{aligned}$$
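A short numerical check (added here as an illustration) that all six permutations of these coordinates give the same absolute determinant value:

```python
import numpy as np
from itertools import permutations

def v3(x):
    return (x[2] - x[1]) * (x[2] - x[0]) * (x[1] - x[0])

base = np.array([-1.0, 0.0, 1.0]) / np.sqrt(2)    # roots of P_3(x) = x^3 - x/2
vals = [v3(np.array(p)) for p in permutations(base)]
# all six permutations are extreme points with |v_3| = 1/sqrt(2)
assert np.allclose(np.abs(vals), 1 / np.sqrt(2))
```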

So far we have only considered the behavior of \(v_3(\mathbf {x}_3)\), that is \(g_3(\mathbf {x}_3,\mathbf {a}_3)\) with \(\mathbf {a}_3=(0,1,2)\). We now consider three generalized Vandermonde determinants, namely \(g_3\) with \(\mathbf {a}_3=(0,1,3)\), \(\mathbf {a}_3=(0,2,3)\) and \(\mathbf {a}_3=(1,2,3)\). These three determinants exhibit increasing structure and they all have a neat formula in terms of \(v_3\) and the elementary symmetric polynomials

$$\begin{aligned} e_{kn}=e_k(x_1,\ldots ,x_n)=\sum _{1\le i_1<i_2<\cdots <i_k\le n} x_{i_1}x_{i_2}\cdots x_{i_k}, \end{aligned}$$

where we will simply use \(e_k\) whenever n is clear from the context.

Fig. 32.2 Plot of \(g_3(\mathbf {x}_3,(0,1,3))\) over the unit sphere

In Fig. 32.2 we see the determinant

$$\begin{aligned} g_3(\mathbf {x}_3,(0,1,3))=\left| \begin{matrix}1&1&1\\ x_1&x_2&x_3\\ x_1^3&x_2^3&x_3^3\end{matrix}\right| =v_3(\mathbf {x}_3)e_{1}, \end{aligned}$$

plotted over the unit sphere. The expression \(v_3(\mathbf {x}_3)e_{1}\) is easy to derive: the factor \(v_3(\mathbf {x}_3)\) must appear since the determinant vanishes whenever any two columns are equal, which is exactly what the Vandermonde determinant expresses, and the factor \(e_1\) then follows by a simple polynomial division. As can be seen in the plots there is an extra black circle where the determinant vanishes compared to Fig. 32.1. This circle lies in the plane \(e_1=x_1+x_2+x_3=0\), where we previously found the extreme points of \(v_3(\mathbf {x}_3)\), and thus the number of extreme points doubles to \(2\cdot 3!\).
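The identity \(g_3(\mathbf {x}_3,(0,1,3)) = v_3(\mathbf {x}_3)e_1\) can also be verified numerically at an arbitrary point (a sketch added here, not part of the original text):

```python
import numpy as np

x = np.array([0.2, -0.5, 0.9])                    # arbitrary point
G = np.array([x**0, x**1, x**3])                  # rows are the powers a = (0, 1, 3)
v3 = (x[2] - x[1]) * (x[2] - x[0]) * (x[1] - x[0])
assert np.isclose(np.linalg.det(G), v3 * x.sum())   # g_3(x, (0,1,3)) = v_3 * e_1
```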

A similar treatment can be made of the remaining two generalized determinants that we are interested in, plotted in the following two figures.

Table 32.1 Some determinants of generalized Vandermonde matrices
Fig. 32.3 Plot of \(g_3(\mathbf {x}_3,(0,2,3))\) over the unit sphere

The four determinants treated so far are collected in Table 32.1. Derivation of these determinants is straightforward. We note that all but one of them vanish on a set of planes through the origin. For \(\mathbf {a}=(0,2,3)\) we have the usual Vandermonde planes, but the intersection of \(e_2=0\) with the unit sphere occurs at two circles:

$$\begin{aligned} x_1x_2+x_1x_3+x_2x_3&=\frac{1}{2}\left( (x_1+x_2+x_3)^2 - (x_1^2+x_2^2+x_3^2)\right) \\ &=\frac{1}{2}\left( (x_1+x_2+x_3)^2 - 1 \right) =\frac{1}{2}\left( x_1+x_2+x_3 + 1 \right) \left( x_1+x_2+x_3 - 1 \right) , \end{aligned}$$

and so \(g_3(\mathbf {x}_3,(0,2,3))\) vanishes on the sphere on two circles lying in the planes \(x_1+x_2+x_3 + 1=0\) and \(x_1+x_2+x_3 - 1=0\). These can be seen in Fig. 32.3 as the two black circles perpendicular to the direction (1, 1, 1).

Fig. 32.4 Plot of \(g_3(\mathbf {x}_3,(1,2,3))\) over the unit sphere

Note also that while \(v_3\) and \(g_3(\mathbf {x}_3,(0,1,3))\) have the same absolute value at all of their respective local extreme points (by symmetry), both \(g_3(\mathbf {x}_3,(0,2,3))\) and \(g_3(\mathbf {x}_3,(1,2,3))\) attain different absolute values at some of their respective extreme points; this can be seen in Figs. 32.2, 32.3 and 32.4.

32.2 Optimizing the Vandermonde Determinant over the Unit Sphere

In this section we will consider the extreme points of the Vandermonde determinant on the n-dimensional unit sphere in \(\mathbb {R}^n\). We want both to find an analytical solution and to identify some properties of the determinant that can help us to visualize it in some area around the extreme points in dimensions \(n > 3\).

32.2.1 The Extreme Points Given by Roots of a Polynomial

The extreme points of the Vandermonde determinant on the unit sphere in \(\mathbb {R}^n\) are known and given by Theorem 32.4 where we present a special case of Theorem 6.7.3 in ‘Orthogonal polynomials’ by Gábor Szegő [13]. We will also provide a proof that is more explicit than the one in [13] and that exposes more of the rich symmetric properties of the Vandermonde determinant. For the sake of convenience some properties related to the extreme points of the Vandermonde determinant defined by real vectors \(\mathbf {x}_n\) will be presented before Theorem 32.4.

Theorem 32.2

For any \(1 \le k \le n\)

$$\begin{aligned} \frac{\partial v_n}{\partial x_k} = \sum _{\begin{array}{c} i=1\\ i\ne k \end{array}}^{n} \frac{v_n(\mathbf {x}_n)}{x_k-x_i} \end{aligned}$$
(32.5)

This theorem will be proven after the introduction of the following useful lemma:

Lemma 32.1

For any \(1 \le k \le n-1\)

$$\begin{aligned} \frac{\partial v_n}{\partial x_k} = -\frac{v_n(\mathbf {x}_n)}{x_n-x_k} + \left[ \prod _{i=1}^{n-1} (x_n-x_i)\right] \frac{\partial v_{n-1}}{\partial x_k} \end{aligned}$$
(32.6)

and

$$\begin{aligned} \frac{\partial v_n}{\partial x_n} = \sum _{i=1}^{n-1} \frac{v_n(\mathbf {x}_n)}{x_n-x_i}. \end{aligned}$$
(32.7)

Proof

Note that the determinant can be described recursively

$$\begin{aligned} v_n(\mathbf {x}_n)&= \left[ \prod _{i = 1}^{n-1} (x_n-x_i)\right] \prod _{1 \le i < j \le n-1} (x_j-x_i) \nonumber \\&= \left[ \prod _{i = 1}^{n-1} (x_n-x_i)\right] v_{n-1}(\mathbf {x}_{n-1}). \end{aligned}$$
(32.8)

Formula (32.6) follows immediately from applying the differentiation formula for products on (32.8). Formula (32.7) follows from (32.8), the differentiation rule for products and that \(v_{n-1}(\mathbf {x}_{n-1})\) is independent of \(x_n\).

$$\begin{aligned} \frac{\partial v_n}{\partial x_n} =&\frac{v_{n-1}(\mathbf {x}_{n-1})}{x_n-x_1}\prod _{i = 1}^{n-1} (x_n-x_i) \\&+ (x_n-x_1) \frac{\partial }{\partial x_n}\left( \frac{v_{n-1}(\mathbf {x}_{n-1})}{x_n-x_1}\prod _{i = 1}^{n-1} (x_n-x_i) \right) \\ =&\frac{v_n(\mathbf {x}_n)}{x_n-x_1}+\frac{v_n(\mathbf {x}_n)}{x_n-x_2} \\&+ (x_n-x_1)(x_n-x_2) \frac{\partial }{\partial x_n}\left( \frac{v_n(\mathbf {x}_n)}{(x_n-x_1)(x_n-x_2)}\right) \\ =&\sum _{i=1}^{n-1} \frac{v_n(\mathbf {x}_n)}{x_n-x_i} + \left[ \prod _{i = 1}^{n-1} (x_n-x_i)\right] \frac{\partial v_{n-1}}{\partial x_n} = \sum _{i=1}^{n-1} \frac{v_n(\mathbf {x}_n)}{x_n-x_i}. \end{aligned}$$

Proof

(Proof of Theorem 32.2) Using Lemma 32.1 we can see that for \(k = n\), formula (32.5) follows immediately from (32.7). The case \(1 \le k < n\) will be proved using induction. Using (32.6) gives

$$\begin{aligned} \frac{\partial v_n}{\partial x_k}&= -\frac{v_n(\mathbf {x}_n)}{x_n-x_k} + \left[ \prod _{i=1}^{n-1} (x_n-x_i)\right] \frac{\partial v_{n-1}}{\partial x_k}. \end{aligned}$$

Supposing that formula (32.5) is true for \(n-1\) results in

$$\begin{aligned} \frac{\partial v_n}{\partial x_k} =&-\frac{v_n(\mathbf {x}_n)}{x_n-x_k} + \left[ \prod _{i=1}^{n-1} (x_n-x_i)\right] \sum _{\begin{array}{c} i=1\\ i\ne k \end{array}}^{n-1} \frac{v_{n-1}(\mathbf {x}_{n-1})}{x_k-x_i} \\ =&\frac{v_n(\mathbf {x}_n)}{x_k-x_n} + \sum _{\begin{array}{c} i=1\\ i\ne k \end{array}}^{n-1} \frac{v_n(\mathbf {x}_n)}{x_k-x_i} = \sum _{\begin{array}{c} i=1\\ i\ne k \end{array}}^{n} \frac{v_n(\mathbf {x}_n)}{x_k-x_i}. \end{aligned}$$

Showing that (32.5) is true for \(n=2\) completes the proof:

$$\begin{aligned} \frac{\partial v_2}{\partial x_1} = \frac{\partial }{\partial x_1}(x_2-x_1) = -1&= \frac{x_2-x_1}{x_1-x_2} = \sum _{\begin{array}{c} i=1\\ i\ne 1 \end{array}}^{2} \frac{v_2(\mathbf {x}_2)}{x_1-x_i} \\ \frac{\partial v_2}{\partial x_2} = \frac{\partial }{\partial x_2}(x_2-x_1) = 1&= \frac{x_2-x_1}{x_2-x_1} = \sum _{\begin{array}{c} i=1\\ i\ne 2 \end{array}}^{2} \frac{v_2(\mathbf {x}_2)}{x_2-x_i}. \end{aligned}$$
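Formula (32.5) can also be sanity-checked against a central finite difference; the following sketch (an illustration added here, using an arbitrarily chosen point) compares both sides numerically:

```python
import numpy as np

def v(x):
    # v_n(x) = prod_{i<j} (x_j - x_i)
    n = len(x)
    return np.prod([x[j] - x[i] for i in range(n) for j in range(i + 1, n)])

x = np.array([-0.7, -0.1, 0.4, 1.1])     # arbitrary distinct values
n, h = len(x), 1e-6
for k in range(n):
    formula = sum(v(x) / (x[k] - x[i]) for i in range(n) if i != k)   # (32.5)
    e = np.zeros(n); e[k] = h
    numeric = (v(x + e) - v(x - e)) / (2 * h)                         # central difference
    assert np.isclose(formula, numeric, rtol=1e-5)
```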

Theorem 32.3

The extreme points of \(v_n(\mathbf {x}_n)\) on the unit sphere can all be found in the hyperplane defined by

$$\begin{aligned} \sum _{i=1}^{n} x_i = 0. \end{aligned}$$
(32.9)

This theorem will be proved after the introduction of the following useful lemma:

Lemma 32.2

For any \(n \ge 2\) the sum of the partial derivatives of \(v_n(\mathbf {x}_n)\) will be zero.

$$\begin{aligned} \sum _{k=1}^{n} \frac{\partial v_n}{\partial x_k} = 0. \end{aligned}$$
(32.10)

Proof

This lemma is easily proven using Lemma 32.1 and induction:

$$\begin{aligned} \sum _{k=1}^{n} \frac{\partial v_n}{\partial x_k} =&\sum _{k=1}^{n-1} \left( -\frac{v_n(\mathbf {x}_n)}{x_n-x_k} + \left[ \prod _{i=1}^{n-1} (x_n-x_i)\right] \frac{\partial v_{n-1}}{\partial x_k} \right) + \sum _{i=1}^{n-1} \frac{v_n(\mathbf {x}_n)}{x_n-x_i} \\ =&\left[ \prod _{i=1}^{n-1} (x_n-x_i)\right] \sum _{k=1}^{n-1} \frac{\partial v_{n-1}}{\partial x_k}. \end{aligned}$$

Thus if (32.10) is true for \(n-1\), then it is also true for n. Showing that the equation holds for \(n=2\) is very simple

$$\begin{aligned} \frac{\partial v_2}{\partial x_1} + \frac{\partial v_2}{\partial x_2} = -1 + 1 = 0. \end{aligned}$$

Proof

(Proof of Theorem 32.3) Using the method of Lagrange multipliers it follows that any \(\mathbf {x}_n\) on the unit sphere that is an extreme point of the Vandermonde determinant will also be a stationary point for the Lagrange function

$$\begin{aligned} \varLambda _n(\mathbf {x}_n,\lambda ) = v_n(\mathbf {x}_n) - \lambda \left( \sum _{i=1}^{n} x_i^2 - 1\right) \end{aligned}$$

for some \(\lambda \). Explicitly this requirement becomes

$$\begin{aligned} \frac{\partial \varLambda _n}{\partial x_k}&= 0 \text { for all } 1 \le k \le n , \end{aligned}$$
(32.11)
$$\begin{aligned} \frac{\partial \varLambda _n}{\partial \lambda }&= 0 . \end{aligned}$$
(32.12)

Equation (32.12) corresponds to the restriction to the unit sphere and is therefore immediately fulfilled. Since all the partial derivatives of the Lagrange function should be equal to zero it is obvious that the sum of the partial derivatives will also be equal to zero. Combining this with Lemma 32.2 gives

$$\begin{aligned} \sum _{k=1}^{n} \frac{\partial \varLambda _n}{\partial x_k} = \sum _{k=1}^{n}\left( \frac{\partial v_n}{\partial x_k} - 2\lambda x_k \right) = -2\lambda \sum _{k=1}^{n} x_k = 0. \end{aligned}$$
(32.13)

There are two ways to fulfill condition (32.13): either \(\lambda = 0\) or \(\sum _{k=1}^n x_k= 0\). When \(\lambda = 0\), equations (32.11) reduce to

$$\begin{aligned} \frac{\partial v_n}{\partial x_k}=0 \text { for all } 1 \le k \le n, \end{aligned}$$

and by (32.2) this can only be true if \(v_n(\mathbf {x}_n)=0\), which is of no interest to us, and so all extreme points must lie in the hyperplane \(\sum _{k=1}^n x_k= 0\).

Theorem 32.4

A point on the unit sphere in \(\mathbb {R}^n\), \(\mathbf {x}_n = (x_1, x_2, \ldots , x_n)\), is an extreme point of the Vandermonde determinant if and only if all \(x_i\), \(i \in \{ 1,2,\ldots ,n \}\), are distinct roots of the rescaled Hermite polynomial

$$\begin{aligned} P_n(x) = \left( 2n(n-1)\right) ^{-\frac{n}{2}}H_n\left( \sqrt{\frac{n(n-1)}{2}}x\right) . \end{aligned}$$
(32.14)
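A numerical illustration of the theorem (a sketch added here, not part of the original text), using NumPy's Hermite routines; the comparison against random sphere points is only a heuristic check, not a proof:

```python
import numpy as np
from numpy.polynomial.hermite import hermroots

def v(x):
    n = len(x)
    return np.prod([x[j] - x[i] for i in range(n) for j in range(i + 1, n)])

n = 5
z = hermroots([0] * n + [1])                 # roots of the physicists' Hermite H_5
x = np.sort(z) * np.sqrt(2 / (n * (n - 1)))  # roots of P_5, rescaled per (32.14)
assert np.isclose(np.sum(x**2), 1.0)         # lies on the unit sphere
assert np.isclose(np.sum(x), 0.0)            # lies in the hyperplane (32.9)

rng = np.random.default_rng(0)
for _ in range(200):                          # no random sphere point does better
    y = rng.normal(size=n); y /= np.linalg.norm(y)
    assert abs(v(y)) <= abs(v(x)) + 1e-12
```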

Remark 32.1

Note that if \(\mathbf {x}_n = (x_1, x_2, \ldots , x_n)\) is an extreme point of the Vandermonde determinant then any other point whose coordinates are a permutation of the coordinates of \(\mathbf {x}_n\) is also an extreme point. This follows since the determinant is, by definition, alternating in the columns of the matrix and the \(x_i\) define the columns of the Vandermonde matrix. Thus any permutation of the \(x_i\) gives the same value of \(\left| v_n(\mathbf {x}_n)\right| \). Since there are n! permutations there are at least n! extreme points. The roots of the polynomial (32.14) define the set of \(x_i\) fully and thus there are exactly n! extreme points, n!/2 positive and n!/2 negative.

Remark 32.2

All terms in \(P_n(x)\) are of even order if n is even and of odd order when n is odd. This means that the roots of \(P_n(x)\) will be symmetrical in the sense that if \(x_i\) is a root then \(-x_i\) is also a root.

Proof

(Proof of Theorem 32.4) By the method of Lagrange multipliers condition (32.11) must be fulfilled for any extreme point. If \(\mathbf {x}_n\) is a fixed extreme point so that

$$\begin{aligned} v_n(\mathbf {x}_n) = v_{max}, \end{aligned}$$

then (32.11) can be written explicitly, using (32.5), as

$$\begin{aligned} \frac{\partial \varLambda _n}{\partial x_k} = \sum _{\begin{array}{c} i=1 \\ i \ne k \end{array}}^{n} \frac{v_{max}}{x_k-x_i} - 2\lambda x_k = 0 \text { for all } 1 \le k \le n , \end{aligned}$$

or alternatively by introducing a new multiplier \(\rho \) as

$$\begin{aligned} \sum _{\begin{array}{c} i=1 \\ i \ne k \end{array}}^{n} \frac{1}{x_k-x_i} = \frac{2\lambda }{v_{max}} x_k = \frac{\rho }{n} x_k \text { for all } 1 \le k \le n. \end{aligned}$$
(32.15)

By forming the polynomial \(f(x) = (x-x_1)(x-x_2)\cdots (x-x_n)\) and noting that

$$\begin{aligned} f'(x_k)= \sum _{j=1}^n \prod _{\begin{array}{c} i=1\\ i\ne j \end{array}}^n (x-x_i)\bigg |_{x=x_k}&= \prod _{\begin{array}{c} i=1\\ i\ne k \end{array}}^n (x_k-x_i),\\ f''(x_k)= \sum _{l=1}^n \sum _{\begin{array}{c} j=1\\ j\ne l \end{array}}^n \prod _{\begin{array}{c} i=1\\ i\ne j\\ i\ne l \end{array}}^n (x-x_i)\bigg |_{x=x_k}&=\sum _{\begin{array}{c} j=1\\ j\ne k \end{array}}^n \prod _{\begin{array}{c} i=1\\ i\ne j\\ i\ne k \end{array}}^n (x_k-x_i) +\sum _{\begin{array}{c} l=1\\ l\ne k \end{array}}^n \prod _{\begin{array}{c} i=1\\ i\ne l\\ i\ne k \end{array}}^n (x_k-x_i)\\&=2 \sum _{\begin{array}{c} j=1\\ j\ne k \end{array}}^n \prod _{\begin{array}{c} i=1\\ i\ne j\\ i\ne k \end{array}}^n (x_k-x_i), \end{aligned}$$

we can rewrite (32.15) as

$$\begin{aligned} \frac{1}{2} \frac{f''(x_k)}{f'(x_k)} = \frac{\rho }{n} x_k, \end{aligned}$$

or

$$\begin{aligned} f''(x_k) - \frac{2\rho }{n} x_k f'(x_k) = 0. \end{aligned}$$

Since the left-hand side of the last equation is a polynomial of degree n that vanishes at all n roots of f(x), we must have

$$\begin{aligned} f''(x) - \frac{2\rho }{n} x f'(x) = c f(x), \end{aligned}$$
(32.16)

for some constant c. To find c, the \(x^n\)-terms of the left- and right-hand sides of (32.16) are compared,

$$\begin{aligned} c \cdot c_{n} x^n = -\frac{2\rho }{n} x n c_n x^{n-1} = -2\rho \cdot c_n x^n ~\Rightarrow ~ c = -2\rho . \end{aligned}$$

Thus the following differential equation for f(x) must be satisfied

$$\begin{aligned} f''(x) - \frac{2\rho }{n} x f'(x) + 2\rho f(x) = 0. \end{aligned}$$
(32.17)

Choosing \(x = az\) gives

$$\begin{aligned}&f''(az) - \frac{2\rho }{n} a z f'(az) + 2\rho f(az) \\ =&\frac{1}{a^2} \frac{\mathrm {d}^2 f}{\mathrm {d}z^2}(az) - \frac{2\rho }{n} a z\frac{1}{a}\frac{\mathrm {d} f}{\mathrm {d}z}(az) + 2\rho f(az) = 0. \end{aligned}$$

By setting \(g(z) = f(az)\) and choosing \(a = \sqrt{\frac{n}{\rho }}\) a differential equation that matches the definition for the Hermite polynomials is found:

$$\begin{aligned} g''(z) - 2 z g'(z) + 2 n g(z) = 0. \end{aligned}$$
(32.18)

By definition the solution to (32.18) is \(g(z) = b H_n(z)\) where b is a constant. An exact expression for the constant a can be found using Lemma 32.3 (for the sake of convenience the lemma is stated and proved after this theorem). We get

$$\begin{aligned} \sum _{i=1}^{n} x_i^2 = \sum _{i=1}^{n} a^2 z_i^2 = 1 \Rightarrow a^2 \frac{n(n-1)}{2} = 1, \end{aligned}$$

and so

$$\begin{aligned} a = \sqrt{\frac{2}{n(n-1)}}. \end{aligned}$$

Thus condition (32.11) is fulfilled when \(x_i\) are the roots of

$$\begin{aligned} P_n(x) = bH_n\left( z\right) = bH_n\left( \sqrt{\frac{n(n-1)}{2}}x\right) . \end{aligned}$$

Choosing \(b = \left( 2n(n-1)\right) ^{-\frac{n}{2}}\) gives \(P_n(x)\) with leading coefficient 1. This can be confirmed by calculating the leading coefficient of \(P_n(x)\) using the explicit expression for the Hermite polynomial (32.20). This completes the proof.

Lemma 32.3

Let \(x_i\), \(i = 1,2,\ldots ,n\) be roots of the Hermite polynomial \(H_n(x)\). Then

$$\begin{aligned} \sum _{i=1}^{n} x_i^2 = \frac{n(n-1)}{2}. \end{aligned}$$

Proof

By letting \(e_k(x_1,\ldots ,x_n)\) denote the elementary symmetric polynomials, \(H_n(x)\) can be written as

$$\begin{aligned} H_n(x)&= A_n(x-x_1)\cdots (x-x_n) \\&= A_n (x^n - e_1(x_1,\ldots ,x_n)x^{n-1} + e_2(x_1,\ldots ,x_n)x^{n-2} + q(x)) \end{aligned}$$

where q(x) is a polynomial of degree \(n-3\). Noting that

$$\begin{aligned} \sum _{i=1}^{n} x_i^2&= (x_1+\cdots +x_n)^2 - 2 \sum _{1\le i < j \le n} x_i x_j \nonumber \\&= e_1(x_1,\ldots ,x_n)^2 - 2 e_2(x_1,\ldots ,x_n) , \end{aligned}$$
(32.19)

it is clear that the sum of the squares of the roots can be described using the coefficients of \(x^n\), \(x^{n-1}\) and \(x^{n-2}\). The explicit expression for \(H_n(x)\) is [13]

$$\begin{aligned} H_n(x)&= n! \sum _{i=0}^{\left\lfloor \frac{n}{2}\right\rfloor } \frac{(-1)^i}{i!}\frac{(2x)^{n-2i}}{(n-2i)!} \nonumber \\&= 2^n x^n - 2^{n-2}n(n-1) x^{n-2} + n! \sum _{i=2}^{\left\lfloor \frac{n}{2}\right\rfloor } \frac{(-1)^i}{i!}\frac{(2x)^{n-2i}}{(n-2i)!} . \end{aligned}$$
(32.20)

Comparing the coefficients in the two expressions for \(H_n(x)\) gives

$$\begin{aligned} A_n&= 2^n, \\ A_n e_1(x_1,\ldots ,x_n)&= 0, \\ A_n e_2(x_1,\ldots ,x_n)&= -n(n-1) 2^{n-2}. \end{aligned}$$

Thus by (32.19)

$$\begin{aligned} \sum _{i=1}^{n} x_i^2 = \frac{n(n-1)}{2}. \end{aligned}$$
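This trace identity is easy to confirm numerically with NumPy's `hermroots` (a sketch added here, not part of the original text):

```python
import numpy as np
from numpy.polynomial.hermite import hermroots

for n in range(2, 12):
    z = hermroots([0] * n + [1])       # roots of the physicists' Hermite polynomial H_n
    assert np.isclose(np.sum(z**2), n * (n - 1) / 2)
```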

Theorem 32.5

The coefficients, \(a_k\), for the terms of \(x^k\) in \(P_n(x)\) given by (32.14), are given by the following relations

$$\begin{aligned} a_n&= 1,~~a_{n-1} = 0,~~a_{n-2} = -\frac{1}{2}, \nonumber \\ a_k&= -\frac{(k+1)(k+2)}{n(n-1)(n-k)}a_{k+2},~~1 \le k \le n-3. \end{aligned}$$
(32.21)

Proof

Equation (32.17) tells us that

$$\begin{aligned} P_n(x)=-\frac{1}{2\rho }P_n''(x)+\frac{1}{n}xP_n'(x). \end{aligned}$$
(32.22)

That \(a_n = 1\) follows from the definition of \(P_n\) and \(a_{n-1} = 0\) follows from the Hermite polynomials only having terms of odd powers when n is odd and even powers when n is even. That \(a_{n-2} = -\frac{1}{2}\) can easily be shown using the definition of \(P_n\) and the explicit formula for the Hermite polynomials (32.20).

The value of \(\rho \) can be found by comparing the \(x^{n-2}\) terms in (32.22)

$$\begin{aligned} a_{n-2} = -\frac{1}{2\rho } n(n-1) a_{n}+\frac{1}{n}(n-2)a_{n-2}. \end{aligned}$$

From this follows

$$\begin{aligned} \frac{1}{2 \rho } = \frac{1}{n^2(n-1)}. \end{aligned}$$

Comparing the \(x^{n-l}\) terms in (32.22) gives the following relation

$$\begin{aligned} a_{n-l} = -\frac{1}{2\rho } (n-l+2)(n-l+1) a_{n-l+2} + \frac{1}{n}(n-l)a_{n-l}, \end{aligned}$$

which is equivalent to

$$\begin{aligned} a_{n-l} = -a_{n-l+2} \frac{(n-l+2)(n-l+1)}{l\,n(n-1)}. \end{aligned}$$

Letting \(k = n-l\) gives (32.21).
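The coefficients of \(P_n\) can be generated directly from definition (32.14) and checked against the recursion in (32.21); a sketch for one hypothetical choice of n (added here as an illustration):

```python
import numpy as np
from numpy.polynomial.hermite import herm2poly

n = 7
b = (2 * n * (n - 1)) ** (-n / 2)
scale = np.sqrt(n * (n - 1) / 2)
p = herm2poly([0] * n + [1])                     # power-basis coefficients of H_n
a = b * p * scale ** np.arange(n + 1)            # coefficients a_k of P_n(x), per (32.14)

assert np.isclose(a[n], 1.0)
assert np.isclose(a[n - 1], 0.0)
for k in range(1, n - 2):                        # recursion in (32.21)
    assert np.isclose(a[k], -(k + 1) * (k + 2) / (n * (n - 1) * (n - k)) * a[k + 2])
```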

32.2.2 Extreme Points of the Vandermonde Determinant on the Three Dimensional Unit Sphere

It is fairly simple to describe \(v_3(\mathbf {x}_3)\) on the circle that is formed by the intersection of the unit sphere and the plane \(x_1+x_2+x_3=0\). Using Rodrigues’ rotation formula to rotate a point, \(\mathbf {x}\), around the axis \(\frac{1}{\sqrt{3}}(1,1,1)\) with the angle \(\theta \) will give the rotation matrix

$$R_\theta = \frac{1}{3}\begin{bmatrix} 2\cos (\theta ) + 1 & 1 - \cos (\theta ) - \sqrt{3}\sin (\theta ) & 1 - \cos (\theta ) + \sqrt{3}\sin (\theta ) \\ 1 - \cos (\theta ) + \sqrt{3}\sin (\theta ) & 2\cos (\theta ) + 1 & 1 - \cos (\theta ) - \sqrt{3}\sin (\theta ) \\ 1 - \cos (\theta ) - \sqrt{3}\sin (\theta ) & 1 - \cos (\theta ) + \sqrt{3}\sin (\theta ) & 2\cos (\theta ) + 1 \end{bmatrix}. $$

A point which already lies on \(S^2\) can then be rotated to any other point on \(S^2\) by letting \(R_\theta \) act on the point. Choosing the point \(\mathbf {x} = \frac{1}{\sqrt{2}}\left( -1,0,1\right) \) gives the Vandermonde determinant a convenient form on the circle since:

$$ R_\theta \mathbf {x} = \frac{1}{\sqrt{6}}\begin{bmatrix} \sin (\theta )-\sqrt{3}\cos (\theta ) \\ -2\sin (\theta ) \\ \sqrt{3}\cos (\theta )+\sin (\theta ) \end{bmatrix}, $$

which gives

$$\begin{aligned} v_3(R_\theta \mathbf {x}) =&\frac{1}{6\sqrt{6}} \left( \sqrt{3}\cos (\theta )+3\sin (\theta )\right) \left( 2\sqrt{3}\cos (\theta )\right) \left( \sqrt{3}\cos (\theta )-3\sin (\theta )\right) \\ =&\frac{1}{\sqrt{2}}\left( 4\cos (\theta )^3-3\cos (\theta )\right) \\ =&\frac{1}{\sqrt{2}}\cos (3\theta ). \end{aligned}$$

Note that the final equality follows from \(\cos (n\theta ) = T_n(\cos (\theta ))\) where \(T_n\) is the nth Chebyshev polynomial of the first kind. From formula (32.14) it follows that \(P_3\) is a rescaled Chebyshev polynomial, \(P_3(x) = \frac{1}{6}\sqrt{\frac{2}{3}}\,T_3\left( \sqrt{\frac{3}{2}}\,x\right) \), but for higher dimensions the relationship between the Chebyshev polynomials and \(P_n\) is not as simple.

Finding the maximum points for \(v_3(\mathbf {x}_3)\) in this form is simple. The Vandermonde determinant is maximal when \(3\theta = 2k\pi \) for some integer k. This gives three local maxima corresponding to \(\theta _1 = 0\), \(\theta _2 = \frac{2\pi }{3}\) and \(\theta _3 = \frac{4\pi }{3}\). These points correspond to cyclic permutations of the coordinates of \(\mathbf {x} = \frac{1}{\sqrt{2}}\left( -1,0,1\right) \). Analogously the minima of \(v_3(\mathbf {x}_3)\) can be shown to be given by a transposition followed by a cyclic permutation of the coordinates of \(\mathbf {x}\). Thus any permutation of the coordinates of \(\mathbf {x}\) corresponds to a local extreme point, as stated in Sect. 32.1.1.
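The identity \(v_3(R_\theta \mathbf {x}) = \frac{1}{\sqrt{2}}\cos (3\theta )\) can be checked numerically with a generic implementation of Rodrigues' rotation formula (a sketch added here, not part of the original text):

```python
import numpy as np

def v3(x):
    return (x[2] - x[1]) * (x[2] - x[0]) * (x[1] - x[0])

def rodrigues(axis, theta):
    # Rotation matrix about `axis` by angle theta: R = I + sin(t) K + (1 - cos(t)) K^2
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

x = np.array([-1.0, 0.0, 1.0]) / np.sqrt(2)
for theta in np.linspace(0, 2 * np.pi, 25):
    R = rodrigues(np.array([1.0, 1.0, 1.0]), theta)
    assert np.isclose(v3(R @ x), np.cos(3 * theta) / np.sqrt(2))
```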

32.2.3 A Transformation of the Optimization Problem

In this section we provide a transformation of the problem of optimizing the Vandermonde determinant over the unit sphere to a related equation system with two equations.

Lemma 32.4

For any \(n\ge 2\) the dot product between the gradient of \(v_n(\mathbf {x}_n)\) and \(\mathbf {x}_n\) is proportional to \(v_n(\mathbf {x}_n)\). More precisely,

$$\begin{aligned} \nabla v_n^T \mathbf {x}_n = \sum _{k=1}^n x_k\frac{\partial v_n}{\partial x_k} = \frac{n(n-1)}{2}v_n(\mathbf {x}_n). \end{aligned}$$
(32.23)

Proof

Using Theorem 32.2 we have

$$\begin{aligned} \sum _{k=1}^n x_k\frac{\partial v_n}{\partial x_k} =\sum _{k=1}^n x_k\sum _{i\ne k} \frac{v_n(\mathbf {x}_n)}{x_k-x_i} =v_n(\mathbf {x}_n) \sum _{k=1}^n \sum _{i\ne k} \frac{x_k}{x_k-x_i}. \end{aligned}$$

Now, for each distinct pair of indices \(k=a\), \(i=b\) in the last double sum, the pair \(k=b\), \(i=a\) also appears, and so we continue

$$\begin{aligned} \sum _{k=1}^n x_k\frac{\partial v_n}{\partial x_k}&= v_n(\mathbf {x}_n) \sum _{1\le k<i\le n}\left( \frac{x_k}{x_k-x_i} + \frac{x_i}{x_i-x_k} \right) \\&= v_n(\mathbf {x}_n) \sum _{1\le k<i\le n} 1 = \frac{n(n-1)}{2} v_n(\mathbf {x}_n), \end{aligned}$$

which proves the lemma.
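Lemma 32.4, an Euler-type homogeneity relation, can be verified with a finite-difference gradient at an arbitrary point (an illustration added here, not part of the original text):

```python
import numpy as np

def v(x):
    n = len(x)
    return np.prod([x[j] - x[i] for i in range(n) for j in range(i + 1, n)])

x = np.array([0.3, -1.2, 0.8, 2.0, -0.4])    # arbitrary distinct values
n, h = len(x), 1e-6
grad = np.array([(v(x + h * e) - v(x - h * e)) / (2 * h) for e in np.eye(n)])
assert np.isclose(grad @ x, n * (n - 1) / 2 * v(x), rtol=1e-5)   # relation (32.23)
```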

Consider the premise that an objective function \(f(\mathbf {x})\) attains its extrema, subject to an equality constraint \(g(\mathbf {x})=0\), at points where its gradient is linearly dependent on the gradient of the constraint function. With Lagrange multipliers this is expressed as

$$\begin{aligned} \nabla f(\mathbf {x})=\lambda \nabla g(\mathbf {x}), \end{aligned}$$

where \(\lambda \) is some scalar constant. We can also express this using a dot product

$$\begin{aligned} \nabla f(\mathbf {x})\cdot \nabla g(\mathbf {x})=|\nabla f(\mathbf {x})||\nabla g(\mathbf {x})|\cos \theta . \end{aligned}$$

We are interested in the case where both \(\nabla f\) and \(\nabla g\) are non-zero, and so for linear dependence we require \(\cos \theta =\pm 1\). Squaring then gives

$$\begin{aligned} \left( \nabla f(\mathbf {x})\cdot \nabla g(\mathbf {x}) \right) ^2 = \left( \nabla f(\mathbf {x})\cdot \nabla f(\mathbf {x})\right) \left( \nabla g(\mathbf {x})\cdot \nabla g(\mathbf {x})\right) , \end{aligned}$$

which can also be expressed

$$\begin{aligned} \left( \sum _{i=1}^n \frac{\partial f}{\partial x_i} \frac{\partial g}{\partial x_i} \right) ^2= \left( \sum _{i=1}^n \left( \frac{\partial f}{\partial x_i} \right) ^2 \right) \left( \sum _{i=1}^n \left( \frac{\partial g}{\partial x_i} \right) ^2 \right) . \end{aligned}$$
(32.24)

Theorem 32.6

The problem of finding the vectors \(\mathbf {x}_n\) that maximize the absolute value of the Vandermonde determinant over the unit sphere:

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \max _{\mathbf {x}_n} \prod _{i<j} | x_j-x_i |,\\ \displaystyle \text {s.t. } \sum _{i=1}^{n} x_i^2=1, \end{array}\right. } \end{aligned}$$
(32.25)

has exactly the same solution set as the related problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \sum _{i<j} \frac{1}{(x_j-x_i)^2} = \frac{1}{2} \left( \frac{n(n-1)}{2} \right) ^2,\\ \displaystyle \sum _{i=1}^{n} x_i^2=1. \end{array}\right. } \end{aligned}$$
(32.26)

Proof

By applying (32.24) to the problem of optimizing the Vandermonde determinant \(v_n(\mathbf {x}_n)\) over the unit sphere we get

$$\begin{aligned} \left( \sum _{i=1}^n \frac{\partial v_n}{\partial x_i} \frac{\partial \sum x_i^2}{\partial x_i} \right) ^2&= \left( \sum _{i=1}^n \left( \frac{\partial v_n}{\partial x_i} \right) ^2 \right) \left( \sum _{i=1}^n \left( \frac{\partial \sum x_i^2}{\partial x_i} \right) ^2 \right) ,\nonumber \\ \left( \sum _{i=1}^n 2x_i\frac{\partial v_n}{\partial x_i} \right) ^2&= \left( \sum _{i=1}^n \left( \frac{\partial v_n}{\partial x_i} \right) ^2 \right) \left( \sum _{i=1}^n \left( 2x_i \right) ^2 \right) ,\nonumber \\ \left( \sum _{i=1}^n x_i\frac{\partial v_n}{\partial x_i} \right) ^2&= \sum _{i=1}^n \left( \frac{\partial v_n}{\partial x_i} \right) ^2. \end{aligned}$$
(32.27)

By applying Lemma 32.4 the left-hand side of (32.27) can be written as

$$\begin{aligned} v_n(\mathbf {x}_n)^2\left( \frac{n(n-1)}{2} \right) ^2. \end{aligned}$$

The right-hand side of (32.27) can be rewritten as

$$\begin{aligned} \sum _{i=1}^n \left( \frac{\partial v_n}{\partial x_i} \right) ^2 = \sum _{i=1}^n \left( \sum _{\begin{array}{c} k=1\\ k\ne i \end{array}}^n \frac{v_n(\mathbf {x}_n)}{x_i-x_k} \right) ^2 = v_n(\mathbf {x}_n)^2 \sum _{i=1}^n \left( \sum _{\begin{array}{c} k=1\\ k\ne i \end{array}}^n \frac{1}{x_i-x_k} \right) ^2, \end{aligned}$$

and by expanding the square we continue

$$\begin{aligned} \sum _{i=1}^n \left( \frac{\partial v_n}{\partial x_i} \right) ^2 = v_n(\mathbf {x}_n)^2 \sum _{i=1}^n \left( \sum _{\begin{array}{c} k=1\\ k\ne i \end{array}}^n \frac{1}{(x_i-x_k)^2} + \sum _{\begin{array}{c} k=1\\ k\ne i \end{array}}^n \sum _{\begin{array}{c} j=1\\ j\ne i\\ j\ne k \end{array}}^n \frac{1}{(x_i-x_k)} \frac{1}{(x_i-x_j)} \right) \\ = v_n(\mathbf {x}_n)^2 \sum _{\begin{array}{c} k\ne i \end{array}} \frac{1}{(x_i-x_k)^2} + v_n(\mathbf {x}_n)^2 \sum _{i=1}^n \sum _{\begin{array}{c} k=1\\ k\ne i \end{array}}^n \sum _{\begin{array}{c} j=1\\ j\ne i\\ j\ne k \end{array}}^n \frac{1}{(x_i-x_k)} \frac{1}{(x_i-x_j)}. \end{aligned}$$

We recognize that the triple sum runs over all triples of distinct indices i, j, k, and so we can group the terms by unordered triples \(i<j<k\):

$$\begin{aligned} \sum _{i=1}^n \left( \frac{\partial v_n}{\partial x_i} \right) ^2 =&~v_n(\mathbf {x}_n)^2 \sum _{\begin{array}{c} k\ne i \end{array}} \frac{1}{(x_i-x_k)^2} \\&+ v_n(\mathbf {x}_n)^2 \sum _{i<j<k} \left( \frac{1}{(x_i-x_k)(x_i-x_j)} +\frac{1}{(x_i-x_j)(x_i-x_k)}\right. \\&~\qquad \qquad \qquad \qquad +\frac{1}{(x_j-x_k)(x_j-x_i)} +\frac{1}{(x_j-x_i)(x_j-x_k)} \\&\left. ~\qquad \qquad \qquad \quad \,\,\,\,+\frac{1}{(x_k-x_i)(x_k-x_j)} +\frac{1}{(x_k-x_j)(x_k-x_i)} \right) \end{aligned}$$
$$\begin{aligned} =&~v_n(\mathbf {x}_n)^2 \sum _{\begin{array}{c} k\ne i \end{array}} \frac{1}{(x_i-x_k)^2} + 2v_n(\mathbf {x}_n)^2 \sum _{i<j<k} \frac{(x_k-x_j)+(x_i-x_k)+(x_j-x_i)}{(x_i-x_j)(x_j-x_k)(x_k-x_i)} \\ =&~v_n(\mathbf {x}_n)^2 \sum _{\begin{array}{c} k\ne i \end{array}} \frac{1}{(x_i-x_k)^2} = 2v_n(\mathbf {x}_n)^2 \sum _{i<j} \frac{1}{(x_j-x_i)^2}. \end{aligned}$$
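The chain of simplifications above can be sanity-checked numerically. The sketch below (an illustration, not part of the proof; the sample point is arbitrary) evaluates both sides of the final identity \(\sum_i (\partial v_n/\partial x_i)^2 = 2\,v_n^2 \sum_{i<j} (x_j-x_i)^{-2}\) for \(n=4\):

```python
import numpy as np
from itertools import combinations

def vandermonde_det(x):
    """v_n(x) = prod_{i<j} (x_j - x_i)."""
    return np.prod([x[j] - x[i] for i, j in combinations(range(len(x)), 2)])

def vandermonde_grad(x):
    """dv_n/dx_i = v_n(x) * sum_{k != i} 1/(x_i - x_k)."""
    v = vandermonde_det(x)
    return np.array([v * sum(1.0 / (x[i] - x[k]) for k in range(len(x)) if k != i)
                     for i in range(len(x))])

x = np.array([-0.7, -0.2, 0.3, 0.9])   # arbitrary point with distinct coordinates
lhs = np.sum(vandermonde_grad(x) ** 2)
rhs = 2 * vandermonde_det(x) ** 2 * sum(
    1.0 / (x[j] - x[i]) ** 2 for i, j in combinations(range(len(x)), 2))
```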

We continue by joining the simplified left and right part of (32.27):

$$\begin{aligned} v_n(\mathbf {x}_n)^2\left( \frac{n(n-1)}{2} \right) ^2 = 2v_n(\mathbf {x}_n)^2 \sum _{i<j} \frac{1}{(x_j-x_i)^2}, \end{aligned}$$

and the result follows as

$$\begin{aligned} \sum _{i<j} \frac{1}{(x_j-x_i)^2} = \frac{1}{2} \left( \frac{n(n-1)}{2} \right) ^2. \end{aligned}$$
(32.28)

This captures the linear dependence requirement of the problem; what remains is to require the solutions to lie on the unit sphere:

$$\begin{aligned} \sum _{i=1}^n x_i^2=1. \end{aligned}$$
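For \(n=3\) the system (32.26) can be checked directly. The sketch below (not from the paper) uses the known extreme point \((-1/\sqrt{2},\,0,\,1/\sqrt{2})\), for which the target value \(\frac{1}{2}\left(\frac{n(n-1)}{2}\right)^2\) equals 4.5:

```python
import numpy as np
from itertools import combinations

def inverse_square_sum(x):
    """Sum of 1/(x_j - x_i)^2 over all pairs i < j."""
    return sum(1.0 / (x[j] - x[i]) ** 2 for i, j in combinations(range(len(x)), 2))

x3 = np.array([-1.0, 0.0, 1.0]) / np.sqrt(2)   # known extreme point for n = 3
on_sphere = float(np.dot(x3, x3))              # should be 1
target = 0.5 * (3 * (3 - 1) / 2) ** 2          # (1/2)(n(n-1)/2)^2 = 4.5 for n = 3
value = inverse_square_sum(x3)                 # should equal the target
```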

32.2.4 Further Visual Exploration

Visualization of the determinant \(v_3(\mathbf {x}_3)\) on the unit sphere is straightforward, as are visualizations of \(g_3(\mathbf {x}_3,\mathbf {a})\) for different \(\mathbf {a}\), since all points on the sphere can be shown directly in a contour map. In higher dimensions we need to reduce the set of visualized points somehow. In this section we provide visualizations for \(v_4,\ldots ,v_7\) by using symmetry properties of the Vandermonde determinant.

32.2.4.1 Four Dimensions

By Theorem 32.3 we know that the extreme points of \(v_4(\mathbf {x}_4)\) on the sphere all lie in the hyperplane \(x_1+x_2+x_3+x_4=0\). The intersection of this hyperplane with the unit sphere in \(\mathbb {R}^4\) can be described as a unit sphere in \(\mathbb {R}^3\), under a suitable basis, and can then be easily visualized.

This can be realized using the transformation

$$\begin{aligned} \mathbf {x}=\left[ \begin{matrix} -1 &{} -1 &{} 0 \\ -1 &{} 1 &{} 0 \\ 1 &{} 0 &{} -1 \\ 1 &{} 0 &{} 1 \end{matrix}\right] \left[ \begin{matrix} 1/\sqrt{4} &{} 0 &{} 0 \\ 0 &{} 1/\sqrt{2} &{} 0 \\ 0 &{} 0 &{} 1/\sqrt{2} \\ \end{matrix}\right] \mathbf {t}. \end{aligned}$$
(32.29)
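One can verify that (32.29) does what is claimed: the columns of the combined \(4\times 3\) matrix are orthonormal and each sums to zero, so unit vectors \(\mathbf{t}\) map to unit vectors \(\mathbf{x}\) in the hyperplane \(x_1+x_2+x_3+x_4=0\). A small sketch:

```python
import numpy as np

# The 4x3 matrix of (32.29): integer direction matrix times column scaling.
B = np.array([[-1, -1,  0],
              [-1,  1,  0],
              [ 1,  0, -1],
              [ 1,  0,  1]], dtype=float)
S = np.diag([1 / np.sqrt(4), 1 / np.sqrt(2), 1 / np.sqrt(2)])
T = B @ S

gram = T.T @ T               # identity: columns are orthonormal, so |x| = |t|
column_sums = T.sum(axis=0)  # zeros: every image satisfies x1+x2+x3+x4 = 0
```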

The result of plotting \(v_4(\mathbf {x}_4)\) after performing this transformation can be seen in Fig. 32.5. All \(24 = 4!\) extreme points are clearly visible.

Fig. 32.5 Plot of \(v_4(\mathbf {x}_4)\) over points on the unit sphere

From Fig. 32.5 we see that whenever we have a local maximum there is also a local maximum at the opposite side of the sphere, and the same holds for minima. This is due to the occurrence of the exponents in the rows of \(V_n\). From (32.2) we have

$$\begin{aligned} v_n((-1)\mathbf {x}_n)=(-1)^{\displaystyle \frac{n(n-1)}{2}} v_n(\mathbf {x}_n), \end{aligned}$$

and so opposite points are both maxima or both minima if \(n=4k\) or \(n=4k+1\) for some \(k \in \mathbb {Z}^+\) and opposite points are of different types if \(n=4k-2\) or \(n=4k-1\) for some \(k \in \mathbb {Z}^+\).
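This sign rule is easy to confirm numerically. The following sketch (illustration only, using arbitrary random points) computes \(v_n((-1)\mathbf{x}_n)/v_n(\mathbf{x}_n)\) for \(n=2,\ldots,7\):

```python
import numpy as np
from itertools import combinations

def vandermonde_det(x):
    return np.prod([x[j] - x[i] for i, j in combinations(range(len(x)), 2)])

rng = np.random.default_rng(0)
signs = {}
for n in range(2, 8):
    x = rng.standard_normal(n)                       # arbitrary test point
    signs[n] = round(vandermonde_det(-x) / vandermonde_det(x))

# n(n-1)/2 is even for n = 4, 5 (sign +1) and odd for n = 2, 3, 6, 7 (sign -1).
```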

By Theorem 32.4 the extreme points on the unit sphere for \(v_4(\mathbf {x}_4)\) are described by the roots of the polynomial

$$\begin{aligned} P_4(x) = x^4 -\frac{1}{2} x^2 + \frac{1}{48}. \end{aligned}$$

The roots of \(P_4(x)\) are:

$$\begin{aligned} x_{41} =-\frac{1}{2} \sqrt{1+\sqrt{\frac{2}{3}}},&~~x_{42} =-\frac{1}{2} \sqrt{1-\sqrt{\frac{2}{3}}},\\ x_{43} =\frac{1}{2} \sqrt{1-\sqrt{\frac{2}{3}}},&~~x_{44} =\frac{1}{2} \sqrt{1+\sqrt{\frac{2}{3}}}. \end{aligned}$$
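As a cross-check (a sketch, not from the paper), the closed-form roots above can be compared with a numerical root finder, and the resulting point can be tested against the conditions of Theorem 32.6; for \(n=4\) the target value in (32.26) is \(\frac{1}{2}(6)^2=18\):

```python
import numpy as np
from itertools import combinations

# Roots of P_4(x) = x^4 - x^2/2 + 1/48, numerically and in closed form.
roots = np.sort(np.roots([1, 0, -1/2, 0, 1/48]).real)
closed_form = np.sort([s * 0.5 * np.sqrt(1 + t * np.sqrt(2/3))
                       for s in (-1, 1) for t in (-1, 1)])

sphere_norm = float(np.sum(roots ** 2))               # should be 1
inv_sq = sum(1.0 / (roots[j] - roots[i]) ** 2
             for i, j in combinations(range(4), 2))   # should be 18 by (32.26)
```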

By Theorem 32.4 or 32.5 we see that the polynomials providing the coordinates of the extreme points have all even or all odd powers. From this it is easy to see that all coordinates of the extreme points must come in pairs \(x_i,-x_i\). Furthermore, by Theorem 32.3 we know that the extreme points of \(v_5(\mathbf {x}_5)\) on the sphere all lie in the hyperplane \(x_1+x_2+x_3+x_4+x_5=0\).

We use this to visualize \(v_5(\mathbf {x}_5)\) by selecting a subspace of \(\mathbb {R}^5\) that contains all points that have coordinates which are symmetrically placed on the real line, \((x_1,x_2,0,-x_2,-x_1)\).

The coordinates in Fig. 32.6a are related to \(\mathbf {x}_5\) by

$$\begin{aligned} \mathbf {x}_5= \left[ \begin{matrix}-1 &{} 0 &{} 1\\ 0&{}-1&{}1\\ 0&{}0&{}1\\ 0&{}1&{}1\\ 1&{}0&{}1\end{matrix}\right] \left[ \begin{matrix}1/\sqrt{2}&{}0&{}0\\ 0&{}1/\sqrt{2}&{}0\\ 0&{}0&{}1/\sqrt{5}\end{matrix}\right] \mathbf {t} . \end{aligned}$$
(32.30)
Fig. 32.6 Plot of \(v_5(\mathbf {x}_5)\) over points on the unit sphere

The result, see Fig. 32.6, is a visualization of a subspace containing 8 of the 120 extreme points. Note that the condition that the coordinates form symmetrically placed pairs can also be fulfilled in two other subspaces, with points that can be described in the following ways: \((x_1,x_2,0,-x_1,-x_2)\) and \((x_2,-x_2,0,x_1,-x_1)\). This means that transformations similar to (32.30) can be used to describe \(3 \cdot 8 = 24\) different extreme points.

The transformation (32.30) corresponds to choosing \(x_3 = 0\). Choosing another coordinate to be zero gives a different subspace of \(\mathbb {R}^5\) that behaves identically to the visualized one. This multiplies the number of extreme points by five, giving the expected \(5 \cdot 4! = 120\).

By Theorem 32.4, the extreme points on the unit sphere for \(v_5(\mathbf {x}_5)\) are described by the roots of this polynomial

$$\begin{aligned} P_5(x)=x^5 - \frac{1}{2} x^3 + \frac{3}{80} x. \end{aligned}$$

The roots of \(P_5(x)\) are:

$$\begin{aligned} x_{51}&=-x_{55},~~x_{52} = -x_{54},~~x_{53} = 0, \\ x_{54}&=\frac{1}{2} \sqrt{1-\sqrt{\frac{2}{5}}}, ~~ x_{55} =\frac{1}{2} \sqrt{1+\sqrt{\frac{2}{5}}}.\\ \end{aligned}$$

As for \(v_5(\mathbf {x}_5)\) we use symmetry to visualize \(v_6(\mathbf {x}_6)\). We select a subspace of \(\mathbb {R}^6\) that contains all symmetrical points \((x_1,x_2,x_3,-x_3,-x_2,-x_1)\) on the sphere.

The coordinates in Fig. 32.7a are related to \(\mathbf {x}_6\) by

$$\begin{aligned} \mathbf {x}_6= \left[ \begin{matrix} -1 &{} 0 &{} 0\\ 0&{}-1&{}0\\ 0&{}0&{}-1\\ 0&{}0&{}1\\ 0&{}1&{}0\\ 1&{}0&{}0 \end{matrix}\right] \left[ \begin{matrix}1/\sqrt{2}&{}0&{}0\\ 0&{}1/\sqrt{2}&{}0\\ 0&{}0&{}1/\sqrt{2}\end{matrix}\right] \mathbf {t}. \end{aligned}$$
(32.31)
Fig. 32.7 Plot of \(v_6(\mathbf {x}_6)\) over points on the unit sphere

In Fig. 32.7 there are 48 visible extreme points. The remaining extreme points can be found using arguments analogous to the five-dimensional case.

By Theorem 32.4 the extreme points on the unit sphere for \(v_6(\mathbf {x}_6)\) are described by the roots of the polynomial

$$\begin{aligned} P_6(x) = x^6 - \frac{1}{2} x^4 + \frac{1}{20} x^2 - \frac{1}{1800}. \end{aligned}$$

The roots of \(P_6(x)\) are:

$$\begin{aligned} x_{61} =&-x_{66},~~x_{62} = -x_{65},~~x_{63} = -x_{64}, \nonumber \\ x_{64} =&\frac{(-1)^{\frac{3}{4}}}{2\sqrt{15}}\left( 10 i -\root 3 \of {10} \left( z_6 w_6^\frac{1}{3}+ \overline{z}_6 \overline{w}_6^\frac{1}{3}\right) \right) ^\frac{1}{2} \nonumber \\ =&\frac{1}{2\sqrt{15}}\sqrt{10 - 2 \sqrt{10} \left( \sqrt{3} l_6 - k_6\right) }, \end{aligned}$$
(32.32)
$$\begin{aligned} x_{65} =&\frac{(-1)^{\frac{1}{4}}}{2\sqrt{15}}\left( -10 i -\root 3 \of {10} \left( \overline{z}_6 w_6^\frac{1}{3}+z_6\overline{w}_6^\frac{1}{3}\right) \right) ^\frac{1}{2} \nonumber \\ =&\frac{1}{2\sqrt{15}}\sqrt{10 - 2 \sqrt{10} \left( \sqrt{3} l_6 + k_6\right) }, \end{aligned}$$
(32.33)
$$\begin{aligned} x_{66} =&\left( \frac{1}{30} \left( \root 3 \of {10}\left( w_6^\frac{1}{3}+ \overline{w}_6^\frac{1}{3}\right) +5\right) \right) ^\frac{1}{2} \nonumber \\ =&\sqrt{\frac{1}{30} \left( 2\sqrt{10}\cdot k_6 + 5\right) }, \\ z_6 =&\, \sqrt{3}+i,~w_6 = 2+i \sqrt{6} \nonumber \\ k_6 =&\cos \left( \frac{1}{3}\arctan \left( \sqrt{\frac{3}{2}}\right) \right) ,~ l_6 = \sin \left( \frac{1}{3}\arctan \left( \sqrt{\frac{3}{2}}\right) \right) . \nonumber \end{aligned}$$
(32.34)

As for \(v_6(\mathbf {x}_6)\) we use symmetry to visualize \(v_7(\mathbf {x}_7)\). We select a subspace of \(\mathbb {R}^7\) that contains all symmetrical points \((x_1,x_2,x_3,0,-x_3,-x_2,-x_1)\) on the sphere.

The coordinates in Fig. 32.8a are related to \(\mathbf {x}_7\) by

$$\begin{aligned} \mathbf {x}_7= \left[ \begin{matrix} -1 &{} 0 &{} 0\\ 0&{}-1&{}0\\ 0&{}0&{}-1\\ 0&{}0&{}0\\ 0&{}0&{}1\\ 0&{}1&{}0\\ 1&{}0&{}0 \end{matrix}\right] \left[ \begin{matrix}1/\sqrt{2}&{}0&{}0\\ 0&{}1/\sqrt{2}&{}0\\ 0&{}0&{}1/\sqrt{2}\end{matrix}\right] \mathbf {t}. \end{aligned}$$
(32.35)
Fig. 32.8 Plot of \(v_7(\mathbf {x}_7)\) over points on the unit sphere

In Fig. 32.8, there are 48 visible extreme points, just as in the six-dimensional case. This is expected since the transformation corresponds to choosing \(x_4 = 0\), which restricts us to a six-dimensional subspace of \(\mathbb {R}^7\) that can then be visualized in the same way as the six-dimensional case. The remaining extreme points can be found using arguments analogous to the five-dimensional case.

By Theorem 32.4 the extreme points on the unit sphere for \(v_7(\mathbf {x}_7)\) are described by the roots of the polynomial

$$\begin{aligned} P_7(x) = x^7 -\frac{1}{2} x^5 + \frac{5}{84} x^3 - \frac{5}{3528} x. \end{aligned}$$

The roots of \(P_7(x)\) are:

$$\begin{aligned} x_{71} =&-x_{77},~~x_{72} = -x_{76},~~x_{73} = -x_{75},~~x_{74} = 0, \nonumber \\ x_{75} =&\frac{(-1)^{\frac{3}{4}}}{2\sqrt{21}}\left( 14 i -\root 3 \of {14} \left( z_7 w_7^\frac{1}{3}+\overline{z}_7\overline{w}_7^\frac{1}{3}\right) \right) ^\frac{1}{2} \nonumber \\ =&\frac{1}{2\sqrt{21}}\sqrt{14 - 2 \sqrt{14} \left( \sqrt{3} l_7 - k_7\right) }, \end{aligned}$$
(32.36)
$$\begin{aligned} x_{76} =&\frac{(-1)^\frac{1}{4}}{2\sqrt{21}}\left( -14 i -\root 3 \of {14} \left( \overline{z}_7 w_7^\frac{1}{3}+z_7 \overline{w}_7^\frac{1}{3}\right) \right) ^\frac{1}{2} \nonumber \\ =&\frac{1}{2\sqrt{21}}\sqrt{14 - 2 \sqrt{14} \left( \sqrt{3} l_7 + k_7\right) }, \end{aligned}$$
(32.37)
$$\begin{aligned} x_{77} =&\, \sqrt{\frac{1}{42}} \left( \root 3 \of {14} \left( w_7^\frac{1}{3}+ \overline{w}_7^\frac{1}{3}\right) +5\right) ^\frac{1}{2} \nonumber \\ =&\sqrt{\frac{1}{42} \left( 2\sqrt{14} k_7 + 5 \right) }, \\ z_7 =&\, \sqrt{3}+i,~w_7 = 2+i \sqrt{10}, \nonumber \\ k_7 =&\cos \left( \frac{1}{3}\arctan \left( \sqrt{\frac{5}{2}}\right) \right) , \nonumber \\ l_7 =&\sin \left( \frac{1}{3}\arctan \left( \sqrt{\frac{5}{2}}\right) \right) . \nonumber \end{aligned}$$
(32.38)
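As a numerical cross-check (a sketch, not from the paper), the roots of \(P_5\), \(P_6\) and \(P_7\) above can be verified to lie on the unit sphere and to satisfy condition (32.26), i.e. \(\sum_{i<j}(x_j-x_i)^{-2}=\frac{1}{2}\left(\frac{n(n-1)}{2}\right)^2\):

```python
import numpy as np
from itertools import combinations

# Coefficients of P_5, P_6, P_7 as stated above (highest degree first).
polys = {
    5: [1, 0, -1/2, 0, 3/80, 0],
    6: [1, 0, -1/2, 0, 1/20, 0, -1/1800],
    7: [1, 0, -1/2, 0, 5/84, 0, -5/3528, 0],
}

results = {}
for n, coeffs in polys.items():
    x = np.sort(np.roots(coeffs).real)   # all roots are real
    norm = float(np.sum(x ** 2))
    inv_sq = sum(1.0 / (x[j] - x[i]) ** 2 for i, j in combinations(range(n), 2))
    target = 0.5 * (n * (n - 1) / 2) ** 2
    results[n] = (norm, inv_sq, target)
```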

32.3 Some Limit Theorems Involving the Generalized Vandermonde Matrix

Let \(D_k\) be the diagonal matrix

$$ D_k={\text {diag}}\left( \frac{1}{0!},\frac{1}{1!},\ldots ,\frac{1}{(k-1)!}\right) . $$

Theorem 32.7

For any \(\mathbf {x}\in \mathbb {C}^n\) and \(\mathbf {a}\in \mathbb {C}^m\) with \(x_j\ne 0\) for all j we have

$$\begin{aligned} G_{m n}(\mathbf {x},\mathbf {a})=\lim _{k\rightarrow \infty } V_{k m}(\mathbf {a})^T D_k V_{k n}(\log \mathbf {x}), \end{aligned}$$
(32.39)

where the convergence is entry-wise, \(\log \mathbf {x}=(\log x_1,\ldots ,\log x_n)\) and the branch of the complex logarithm \(\log (\cdot )\) is fixed and defines the expression \(x_j^{a_i}\) by

$$\begin{aligned} x_j^{a_i}:=e^{a_i \log x_j}. \end{aligned}$$

We will prove this theorem after presenting some results for a larger class of matrices.
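The statement can already be illustrated numerically. The sketch below (with hypothetical sample points \(\mathbf{x}\) and exponents \(\mathbf{a}\), chosen real and positive so the principal logarithm applies) compares the product in (32.39) truncated at \(k=40\) with \(\left[x_j^{a_i}\right]\):

```python
import math
import numpy as np

def V(k, y):
    """V_{k n}(y): k x len(y) matrix with entries y_j^{i-1}."""
    return np.vander(y, N=k, increasing=True).T

x = np.array([0.5, 1.5, 2.0])   # hypothetical non-zero sample points
a = np.array([0.3, 1.2])        # hypothetical exponents
k = 40                          # truncation order

D_k = np.diag([1.0 / math.factorial(i) for i in range(k)])
approx = V(k, a).T @ D_k @ V(k, np.log(x))              # truncated right-hand side of (32.39)
exact = np.array([[xj ** ai for xj in x] for ai in a])  # G_{mn}(x, a) = [x_j^{a_i}]
```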

Generalized Vandermonde matrices are a special case of matrices of the form

$$\begin{aligned} A_{m n}(\mathbf {x},\mathbf {a})=\left[ f(x_j,a_i)\right] _{m n}, \end{aligned}$$

where f is a function. Suppose that \(\mathbf {x}\) is fixed; then each entry will be a function of one variable

$$\begin{aligned} A_{m n}(\mathbf {x},\mathbf {a})=\left[ f_j(a_i)\right] _{m n}. \end{aligned}$$
(32.40)

If all these functions \(f_j\) are analytic in a neighborhood of some common point \(a_0\) then they have power series expansions around \(a_0\). If we denote the kth power series coefficient of \(f_j\) by \(c_{jk}\) then we may write

$$\begin{aligned} A_{m n}(\mathbf {x},\mathbf {a})&=\left[ \sum _{k=0}^{\infty } c_{jk}(a_i-a_0)^k\right] _{m n} \nonumber \\&=\lim _{k\rightarrow \infty }\left[ (a_i-a_0)^{j-1}\right] _{m k} \left[ c_{j(i-1)}\right] _{k n} \nonumber \\&=\lim _{k\rightarrow \infty } V_{k m}(\mathbf {a}-a_0)^T\left[ c_{j(i-1)}\right] _{k n}, \end{aligned}$$
(32.41)

where convergence holds for each entry of \(A_{m n}\) and

$$\begin{aligned} \mathbf {a}-a_0=(a_1-a_0,\ldots ,a_m-a_0). \end{aligned}$$

Proof

(Proof of Theorem 32.7) With the complex logarithm \(\log (\cdot )\) restricted to a fixed branch we may write generalized Vandermonde matrices as

$$\begin{aligned}&G_{m n}(\mathbf {x},\mathbf {a})=\left[ f_j(a_i)\right] _{m n}, \end{aligned}$$

where

$$ f_j(a_i)=x_j^{a_i}=e^{a_i \log x_j}, \quad 1\le j\le n. $$

These functions \(f_j\) are analytic everywhere whenever \(x_j\ne 0\). By the power series of the exponential function we have

$$ f_j(a_i)=e^{a_i \log x_j}=\sum _{k=0}^{\infty } \frac{(a_i \log x_j)^k}{k!}=\sum _{k=0}^{\infty } \frac{(\log x_j)^k}{k!}a_i^k, $$

and by (32.41) we get

$$\begin{aligned} G_{m n}(\mathbf {x},\mathbf {a})&=\lim _{k\rightarrow \infty } V_{k m}(\mathbf {a})^T\left[ \frac{(\log x_j)^{i-1}}{(i-1)!}\right] _{k n} \\&=\lim _{k\rightarrow \infty } V_{k m}(\mathbf {a})^T {\text {diag}}\left( \frac{1}{0!},\ldots ,\frac{1}{(k-1)!}\right) V_{k n}(\log \mathbf {x}), \end{aligned}$$

which concludes the proof.

Theorem 32.8

If \(n\ge 2\), \(\mathbf {x},\mathbf {a}\in \mathbb {C}^n\), \(x_j\ne 0\) for all j and \(v_n(\mathbf {a})\ne 0\) then

$$\begin{aligned} \lim _{t\rightarrow 0} \frac{ g_n(\mathbf {x},\mathbf {a}t) }{ v_n(\mathbf {a}t) } =\left( \prod _{k=1}^{n}\frac{1}{(k-1)!}\right) \left( \prod _{1\le i<j\le n} (\log \,x_j-\log \,x_i) \right) . \end{aligned}$$

We will prove this theorem after some intermediate results.
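Before the proof, the limit can be illustrated numerically. In the sketch below (hypothetical \(\mathbf{x}\), \(\mathbf{a}\) and a small \(t\); not part of the proof) the ratio for \(n=3\) agrees with the stated limit to a few decimal places:

```python
import math
import numpy as np
from itertools import combinations

def vdm_det(y):
    """v_n(y) = prod_{i<j} (y_j - y_i)."""
    return np.prod([y[j] - y[i] for i, j in combinations(range(len(y)), 2)])

x = np.array([0.5, 1.5, 2.0])   # hypothetical non-zero sample points
a = np.array([0.2, 0.7, 1.3])   # hypothetical exponents with v_3(a) != 0
t = 1e-3                        # small parameter approximating the limit

g = np.linalg.det(np.array([[xj ** (ai * t) for xj in x] for ai in a]))
ratio = g / vdm_det(a * t)

limit = np.prod([1.0 / math.factorial(k - 1) for k in range(1, 4)]) * vdm_det(np.log(x))
```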

Let \(\mathbf {i}_n=(1,2,\ldots ,n)\), \(P_{kn}\) be the set of all vectors \(\mathbf {p}\in \mathbb {N}^n\) such that

$$\begin{aligned} 1\le p_1<p_2<\cdots <p_n\le k \end{aligned}$$

and \(Q_{kn}=\{\mathbf {p}\in P_{kn}:p_n=k\}\). An \(N\times N\) minor of a matrix \(A\in M_{m n}\) is determined by two vectors \(\mathbf {k}\in P_{mN}\) and \(\mathbf {l}\in P_{nN}\) and is defined as

$$ A\left( \begin{array}{c}\mathbf {k}\\ \mathbf {l}\end{array}\right) :=\det \left[ A_{k_il_j}\right] _{N N}. $$

Using this notation the determinant of the product of two matrices \(A\in M_{n k}\) and \(B\in M_{k n}\) can be written using the Cauchy–Binet formula [14, p. 18] as

$$\begin{aligned} \det (AB)=\sum _{\mathbf {p}\in P_{kn}} A\left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {p}\end{array}\right) \cdot B\left( \begin{array}{c}\mathbf {p}\\ \mathbf {i}_n\end{array}\right) . \end{aligned}$$
(32.42)
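The formula is easy to test on a small example. The sketch below (arbitrary random matrices, illustration only) checks (32.42) for \(A\in M_{2\,3}\) and \(B\in M_{3\,2}\), where the sum runs over the three 2-element column subsets:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))   # arbitrary A in M_{2 3}
B = rng.standard_normal((3, 2))   # arbitrary B in M_{3 2}

lhs = np.linalg.det(A @ B)
# Sum over the 2-element column subsets p of {0, 1, 2} of the minor products.
rhs = sum(np.linalg.det(A[:, list(p)]) * np.linalg.det(B[list(p), :])
          for p in combinations(range(3), 2))
```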

Lemma 32.5

If \(\mathbf {x},\mathbf {a}\in \mathbb {C}^n\) and \(x_j\ne 0\) for all j then we can write the determinant of generalized Vandermonde matrices as

$$\begin{aligned} g_n(\mathbf {x},\mathbf {a})= \sum _{k=n}^{\infty } \sum _{\mathbf {q}\in Q_{kn}} V_{k n}(\mathbf {a})^T \left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {q}\end{array}\right) \cdot D_k \left( \begin{array}{c}\mathbf {q}\\ \mathbf {q}\end{array}\right) \cdot V_{k n}(\log \mathbf {x}) \left( \begin{array}{c}\mathbf {q}\\ \mathbf {i}_n\end{array}\right) . \end{aligned}$$

Proof

By (32.39), the continuity of the determinant function, the associativity of matrix multiplication, and (32.42), we get

$$\begin{aligned} g_n(\mathbf {x},\mathbf {a})&=\det \left( \lim _{k\rightarrow \infty } V_{k n}(\mathbf {a})^T D_k V_{k n}(\log \mathbf {x}) \right) \\&=\lim _{k\rightarrow \infty } \det \left( V_{k n}(\mathbf {a})^T D_k V_{k n}(\log \mathbf {x}) \right) \\&=\lim _{k\rightarrow \infty } \det \left( \left( V_{k n}(\mathbf {a})^T D_k\right) V_{k n}(\log \mathbf {x}) \right) \\&=\lim _{k\rightarrow \infty } \sum _{\mathbf {p}\in P_{kn}} \left( V_{k n}(\mathbf {a})^T D_k\right) \left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {p}\end{array}\right) \cdot V_{k n}(\log \mathbf {x})\left( \begin{array}{c}\mathbf {p}\\ \mathbf {i}_n\end{array}\right) . \end{aligned}$$

We recognize that \(D_k\) is a diagonal matrix that scales the columns of \(V_{k n}(\mathbf {a})^T\):

$$ \left( V_{k n}(\mathbf {a})^T D_k\right) (i,j)=\left( V_{k n}(\mathbf {a})^T\right) (i,j) D_k(j,j), $$

and so

$$\begin{aligned} \left( V_{k n}(\mathbf {a})^T D_k\right) \left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {p}\end{array}\right)&= \left( V_{k n}(\mathbf {a})^T\right) \left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {p}\end{array}\right) \prod _{l=1}^{n} D_k(p_l,p_l) \\&= \left( V_{k n}(\mathbf {a})^T\right) \left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {p}\end{array}\right) \cdot D_k\left( \begin{array}{c}\mathbf {p}\\ \mathbf {p}\end{array}\right) , \end{aligned}$$

that is

$$\begin{aligned} g_n(\mathbf {x},\mathbf {a}) =\lim _{k\rightarrow \infty } \sum _{\mathbf {p}\in P_{kn}} V_{k n}(\mathbf {a})^T\left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {p}\end{array}\right) \cdot D_k\left( \begin{array}{c}\mathbf {p}\\ \mathbf {p}\end{array}\right) \cdot V_{k n}(\log \mathbf {x})\left( \begin{array}{c}\mathbf {p}\\ \mathbf {i}_n\end{array}\right) , \end{aligned}$$

and the result follows by recognizing that as k is increased to \(k+1\) in the limit, the sum will contain all the previous terms (corresponding to \(\mathbf {p}\in P_{kn}\)) with the addition of new terms corresponding to \(\mathbf {p}\in Q_{(k+1)n}\).

Proof

(Proof of Theorem 32.8) When the summation in Lemma 32.5 is applied to \(g_n(\mathbf {x},\mathbf {a}t)\) the resulting expression will contain factors

$$ V_{k n}(\mathbf {a}t)^T\left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {q}\end{array}\right) = t^{E(\mathbf {q})} V_{k n}(\mathbf {a})^T\left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {q}\end{array}\right) , $$

where

$$ E(\mathbf {q})=\sum _{j=1}^{n}(q_j-1). $$

The lowest exponent for t will occur exactly once, for \(k=n\), when \(\mathbf {q}=\mathbf {i}_n\), and it is

$$ M=E(\mathbf {i}_n)=\sum _{j=1}^{n}(j-1)=\frac{n(n-1)}{2}, $$

and by splitting the sum we get

$$\begin{aligned} g_n(\mathbf {x},\mathbf {a}t)&= t^{M} V_n(\mathbf {a})^T \left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {i}_n\end{array}\right) \cdot D_n \left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {i}_n\end{array}\right) \cdot V_n(\log \mathbf {x}) \left( \begin{array}{c}\mathbf {i}_n\\ \mathbf {i}_n\end{array}\right) +\mathcal {O}(t^{M+1}) \\&=t^{M} v_n(\mathbf {a}) \left( \prod _{k=1}^{n}\frac{1}{(k-1)!}\right) v_n(\log \mathbf {x}) + \mathcal {O}(t^{M+1}). \end{aligned}$$

The final result can now be proven by rewriting the denominator in the theorem as \(v_n(\mathbf {a}t)=t^{M} v_n(\mathbf {a})\), taking the limit, and finally expanding \(v_n(\log \mathbf {x})\) by Theorem 32.1.

32.4 Conclusions

From the visualizations in Sect. 32.1.1 it was concluded that the extreme points for the ordinary Vandermonde determinant on the unit sphere in \(\mathbb {R}^3\) seem to have some interesting symmetry properties, and in Sect. 32.2 it was proven that extreme points can only appear given certain symmetry conditions, see Remark 32.2 and Theorem 32.5. This also allowed visualization of the extreme points of the ordinary Vandermonde determinant on the unit sphere in some higher dimensional spaces, \(\mathbb {R}^n\), more specifically for \(n = 4,5,6,7\).

The exact location of the extreme points for any finite dimension could also be determined as roots of the polynomial given in Theorem 32.4. Exact solutions for these roots were also given for the dimensions that were visualized, see Sects. 32.2.2 and 32.2.4.

Some visual investigation of the generalized Vandermonde matrix was also done in Sect. 32.1.1, but no clear way to find or illustrate the extreme points was found. The authors intend to explore this problem further.

In Sect. 32.3 some limit theorems were presented that involve the factorization of a generalized Vandermonde matrix using an ordinary Vandermonde matrix, as well as the limit of the ratio between the determinant of a generalized Vandermonde matrix and the determinant of a related ordinary Vandermonde matrix.