33.1 Introduction

A Vandermonde matrix is a well-known type of matrix that appears in many different applications, most famously curve-fitting using polynomials. Here we will only consider square Vandermonde matrices of size \(n\times n\).

Definition 33.1

A Vandermonde matrix is determined by n values \(\mathbf {x}=(x_1,\ldots ,x_n)\) and is defined by [14, 15]:

$$\begin{aligned} V_{n}(\mathbf {x}) = \left[ x_j^{i-1}\right] _{i,j}^{nn} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{bmatrix}. \end{aligned}$$
(33.1)

The determinant of the Vandermonde matrix is well known; see, for example, [1, 22] for details.

Theorem 33.1

The determinant of square Vandermonde matrices has the form

$$\begin{aligned} \det V_n(\mathbf {x}) \equiv v_n(\mathbf {x}) =\prod _{1\le i<j\le n}(x_j-x_i). \end{aligned}$$
(33.2)

This determinant is also referred to as the Vandermonde determinant or Vandermonde polynomial or Vandermondian [24].
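The identity (33.2) is easy to check numerically. The following minimal sketch compares the determinant of \(V_n(\mathbf {x})\) with the product formula; the sample points are arbitrary values chosen for illustration.

```python
import numpy as np

# Compare det(V_n(x)) with the product formula (33.2) for arbitrary sample points.
x = np.array([0.3, -1.2, 2.5, 0.7])
n = len(x)
V = np.vander(x, increasing=True).T  # V_n(x) as in (33.1), column j holds the powers of x_j
det_V = np.linalg.det(V)
prod = np.prod([x[j] - x[i] for i in range(n) for j in range(i + 1, n)])
print(det_V, prod)  # the two values agree up to floating point error
```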

In this paper we will consider the extreme points of the Vandermonde determinant on surfaces that are implicitly defined by a univariate polynomial in a particular way. The examination is primarily motivated by mathematical curiosity, but the techniques used here are likely to be extensible to some problems related to optimal experiment design for polynomial regression and to electrostatics.

This paper collects in a slightly generalized form some previous results, for detailed discussion see [14, 15], and expands upon them.

The main problem in this paper is to find the extreme points on a surface implicitly defined by

$$\begin{aligned} g_R(\mathbf {x}) = \sum _{i=1}^{n} R(x_i) = 0,~\mathrm {where}~R(x) = \sum _{i=0}^{m} r_i x^i,~~r_i \in \mathbb {R}. \end{aligned}$$
(33.3)

It is previously known where the extreme points are found for the unit sphere,

$$\begin{aligned} x_1^2 + x_2^2 + \cdots + x_n^2 = 1, \end{aligned}$$

corresponding to \(R(x) = x^2 - \frac{1}{n}\); for a detailed description, see [22]. In this paper we will examine a few other surfaces.

33.2 Some Applications of the Vandermonde Determinant and Its Extreme Points

The Vandermonde determinant appears in many circumstances; some well-known examples are proving that Lagrange interpolation gives a unique solution and the classical formula for divided differences interpolation by sampling a function f at \(n+1\) points [18],

$$\begin{aligned}{}[x_1,x_2,\ldots ,x_{n+1}] f(x) = \frac{v_{n+1}(x_1,x_2,\ldots ,x_{n},f(x))}{v_{n+1}(x_1,x_2,\ldots ,x_{n},x_{n+1})}. \end{aligned}$$

Another example is the Harish-Chandra–Itzykson–Zuber integral formula [11, 12, 23], which states that if \(\mathbf {A}\) and \(\mathbf {B}\) are Hermitian matrices with eigenvalues \(\lambda _1(\mathbf {A}) \le \cdots \le \lambda _n(\mathbf {A})\) and \(\lambda _1(\mathbf {B}) \le \cdots \le \lambda _n(\mathbf {B})\) then

$$\begin{aligned} \int _{U(n)} e^{t \, \mathrm {tr}(\mathbf {A}U\mathbf {B}U^*)}~\mathrm {d}U = \frac{\det \big ([\exp (t\lambda _j(\mathbf {A})\lambda _k(\mathbf {B}))]_{j,k}^{nn}\big )}{t^{\frac{n(n-1)}{2}} v_n(\lambda (\mathbf {A})) v_n(\lambda (\mathbf {B}))} \prod _{i=1}^{n-1} i! \end{aligned}$$
(33.4)

where \(v_n\) is the determinant of the Vandermonde matrix.
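Formula (33.4) can be sanity-checked by Monte Carlo integration over the unitary group. The sketch below does this for \(n = 2\) and \(t = 1\); the eigenvalues and the random seed are hypothetical choices, and the estimate is only accurate to a few decimal places.

```python
import numpy as np
from math import factorial
from scipy.stats import unitary_group

# Monte Carlo check of (33.4) for n = 2, t = 1 and hypothetical eigenvalues.
n, t = 2, 1.0
la, lb = np.array([0.3, 1.1]), np.array([-0.5, 0.8])
A, B = np.diag(la), np.diag(lb)
rng = np.random.default_rng(1)

samples = [np.exp(t * np.trace(A @ U @ B @ U.conj().T).real)
           for U in unitary_group.rvs(n, size=50000, random_state=rng)]
lhs = np.mean(samples)  # estimate of the Haar integral on the left hand side

v = lambda lam: np.prod([lam[j] - lam[i] for i in range(n) for j in range(i + 1, n)])
rhs = (np.linalg.det(np.exp(t * np.outer(la, lb)))
       / (t ** (n * (n - 1) // 2) * v(la) * v(lb))
       * np.prod([factorial(i) for i in range(1, n)]))
print(lhs, rhs)  # agree to Monte Carlo accuracy
```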

For the remainder of this section we will list some applications where finding the extreme points of the Vandermonde determinant or a closely related expression is important.

33.2.1 Application to D-Optimal Experiment Designs for Polynomial Curve-Fitting with a Cost-Function

Optimal experiment design is a class of methods for choosing how to collect data used for curve-fitting to get the best possible result in some sense. There are various ways to measure the optimality of the design and one simple way is called D-optimality.

Suppose n data points \(x_{i}, ~i = 1,2,\ldots ,n\), are collected from some compact interval \(\mathcal {X} \subset \mathbb {R}\) and the interpolating polynomial of degree at most \(n-1\) is computed.

A vector containing the data points, \(\mathbf {x} = (x_1, x_2, \ldots , x_n) \in \mathcal {X}^n\), is called a design and a design is said to be D-optimal if

$$\begin{aligned} \det (M_n(\mathbf {x})) \ge \det (M_n(\mathbf {y})) \end{aligned}$$

for all \(\mathbf {y} \in \mathcal {X}^n\), where

$$ M_n(\mathbf {x}) = \begin{bmatrix} n & \displaystyle \sum ^{n}_{i = 1} x_i & \ldots & \displaystyle \sum ^{n}_{i = 1} x_i^{n-1} \\ \displaystyle \sum ^{n}_{i = 1} x_i & \displaystyle \sum ^{n}_{i = 1} x_i^2 & \ldots & \displaystyle \sum ^{n}_{i = 1} x_i^{n} \\ \vdots & \vdots & \ddots & \vdots \\ \displaystyle \sum ^{n}_{i = 1} x_i^{n-1} & \displaystyle \sum ^{n}_{i = 1} x_i^{n} & \ldots & \displaystyle \sum ^{n}_{i = 1} x_i^{2n-2} \end{bmatrix} $$

is the Fisher information matrix. This optimality measure is also equivalent (by the Kiefer–Wolfowitz equivalence theorem) to minimizing the generalized variance of the coefficients of the interpolating polynomial, so-called G-optimality; see [10, 13] for a detailed discussion.

Noting that the Fisher information matrix is

$$\begin{aligned} \displaystyle M_n(\mathbf {x}) = V_n(\mathbf {x}) V_n(\mathbf {x})^\top \end{aligned}$$

and that \(V_n(\mathbf {x})\) is an \(n \times n\) matrix, we get

$$\displaystyle \det (M_n(\mathbf {x})) = \det (V_n(\mathbf {x})^\top ) \det (V_n(\mathbf {x})) = \det (V_n(\mathbf {x}))^2.$$

Thus the maximization of the determinant of the Fisher information matrix is equivalent to finding the extreme points of the determinant of a square Vandermonde matrix in some volume given by the set of possible designs; for further details see [10]. Usually the set of possible designs can be taken to lie in the n-dimensional cube \(\mathbf {x} \in [-1,1]^n\), and for this volume there are many known results, see [6] for an overview.
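The identity between the two determinants is easy to verify numerically; in the following sketch the design is a hypothetical one in \([-1,1]^4\).

```python
import numpy as np

# Verify det(M_n(x)) = det(V_n(x))^2 for a hypothetical design in [-1, 1]^4.
x = np.array([-1.0, -0.4, 0.3, 1.0])
V = np.vander(x, increasing=True).T  # V_n(x) as in (33.1)
M = V @ V.T                          # moment matrix with entries sum_i x_i^(j+k-2)
print(np.linalg.det(M), np.linalg.det(V) ** 2)  # equal up to rounding
```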

The problem considered in this paper could be applicable to the situation where a cost function is associated with the data such that the requirement that the total cost of the experiment stays below some threshold value, \(g(\mathbf {x}) \le 1\), defines some compact set,

$$\begin{aligned} \mathcal {G} = \{ \mathbf {x} \in \mathbb {R}^n \mid g(\mathbf {x}) \le 1 \}, ~~\mathrm {such ~~ that}~~ \mathcal {G} \subset \mathcal {X}^n. \end{aligned}$$

If the cost function for each collected data point is a polynomial R(x) then the boundary of the set of possible designs is given by

$$\begin{aligned} \displaystyle \sum _{i=1}^{n} R(x_i) = 1. \end{aligned}$$

33.2.2 Application in Electrostatics

A classical problem in electrostatics is to find the equilibrium configurations on a surface with some fixed and some movable charges, each with charge potential \(\nu _{j}\). This is done by minimizing the energy of the configuration, which is given by the expression

$$\begin{aligned} L(x_1,\ldots ,x_n) = \sum _{k=1}^{n} \sum _{j=0}^{p} \nu _j \log \frac{1}{|a_j-x_k|}+\sum _{1 \le i < k \le n} \log \frac{1}{|x_k-x_i|} \end{aligned}$$

where \(a_j\) denote the fixed charges and \(x_k\) denote the movable charges.

It can be shown that certain special cases of this problem are equivalent to finding the extreme points of the Vandermonde determinant, for a recent discussion on this, see [7].

33.2.3 Application in Systems with Coulomb Interactions

The Vandermonde determinant also appears regularly when discussing systems with Coulomb interactions, that is, systems described by an energy given by

$$\begin{aligned} \mathcal {H}_N(x_1,\ldots ,x_N) = \frac{1}{2} \sum _{i \ne j} g(x_i-x_j) + N \sum _{i=1}^{N} V(x_i) \end{aligned}$$
(33.5)

where the interaction kernel, g(x), can take a few different forms; see [19] for a recent overview and detailed discussion. We will mention a few examples of interesting systems of this kind connected to the Vandermonde determinant.

Fekete points:

When a function is approximated by a polynomial using interpolation the approximation error depends on the chosen interpolation points. The Fekete points are a set of points that provides an almost optimal choice of interpolation points [8] and they are given by maximizing the Vandermonde determinant; this can also be interpreted as minimizing the potential energy of a system with Coulomb interactions. The type of energy given by (33.5) appears when discussing various forms of weighted Fekete points. Finding the Fekete points is also of interest in complexity theory and would help with finding an appropriate starting polynomial for a homotopy algorithm for realizing the Fundamental Theorem of Algebra [20, 21].

Sphere packing:

Closely related to the problem of identifying the Fekete points is the optimal sphere packing problem, which can be solved by minimizing the “Riesz s-energies”,

$$\begin{aligned} \displaystyle \sum _{i \not = j} \frac{1}{|x_{i} - x_{j}|^{s}} \end{aligned}$$

in the asymptotic case \(s \rightarrow \infty \). For more information about this see [5] for an overview of the theory and [4, 25] for some recent results.

The ‘Coulomb gas analogy’:

In random matrix theory it can be very useful to compute the limiting value of the so-called Stieltjes transform. One method for doing this is to use the ‘Coulomb gas analogy’ [17]. This is also closely related to many problems in quantum mechanics and statistical mechanics.

 

33.3 Extreme Points of the Vandermonde Determinant on Surfaces Defined by a Low Degree Univariate Polynomial

We are interested in finding the extreme points of the Vandermonde determinant \(v_{n}(\mathbf {x})\) on the surface defined by \(g_R(\mathbf {x}) = 0\) with \(g_R\) defined in (33.3).

Lemma 33.1

The problem of finding the extreme points of the Vandermonde determinant on the surface defined by \(g_R(\mathbf {x}) = 0\) can be rewritten as an ordinary differential equation of the form

$$\begin{aligned} f''(x) - 2 \rho R'(x) f'(x) - P(x) f(x) = 0 \end{aligned}$$
(33.6)

that has a unique (up to a multiplicative constant) polynomial solution, f, and any permutation of the roots of f will give the coordinates of a critical point of the Vandermonde determinant.

Proof

Using the method of Lagrange multipliers we get

$$\begin{aligned} \frac{\partial v_n}{\partial x_j} = \lambda \frac{\partial g_R}{\partial x_j} \Leftrightarrow \sum _{\begin{array}{c} i=1 \\ i \ne j \end{array}}^{n} \frac{v_n(\mathbf {x})}{x_j-x_i} = \lambda R'(x_j) \end{aligned}$$

for some \(\lambda \in \mathbb {R}\).

If we only consider this expression at a single point we can treat \(v_n(\mathbf {x})\) as a constant value and then the expression can be rewritten as

$$\begin{aligned} \sum _{\begin{array}{c} i=1 \\ i \ne j \end{array}}^{n} \frac{1}{x_j-x_i} = \rho R'(x_j) \end{aligned}$$
(33.7)

where \(\rho \) is some unknown constant.

Consider the polynomial

$$\begin{aligned} f(x) = \prod _{i=1}^{n} (x-x_i) \end{aligned}$$

and note that

$$\begin{aligned} \frac{1}{2} \frac{f''(x_j)}{f'(x_j)} = \sum _{\begin{array}{c} i=1 \\ i \ne j \end{array}}^{n} \frac{1}{x_j-x_i}. \end{aligned}$$
(33.8)

At each critical point we can combine (33.7) and (33.8); thus at each of the extreme points we will have the relation

$$\begin{aligned} f''(x_j) - 2 \rho R'(x_j) f'(x_j) = 0, ~ j = 1,2,\ldots ,n \end{aligned}$$

for some \(\rho \in \mathbb {R}\). Since each \(x_j\) is a root of f(x) we see that the left hand side in the differential equation must be a polynomial with the same roots as f(x), thus we can conclude that for any \(x \in \mathbb {R}\)

$$\begin{aligned} f''(x) - 2 \rho R'(x) f'(x) - P(x) f(x) = 0 \end{aligned}$$
(33.9)

where P(x) is a polynomial of degree \(m-2\).
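The key identity (33.8) is a standard fact about logarithmic derivatives of polynomials and can be confirmed symbolically, for instance as in the following minimal sketch for \(n = 4\).

```python
import sympy as sp

# Symbolic check of (33.8) for n = 4 at the root x_1.
x = sp.symbols('x')
xs = sp.symbols('x1:5')                # the roots x1, x2, x3, x4
f = sp.prod([x - xi for xi in xs])
lhs = (f.diff(x, 2) / (2 * f.diff(x))).subs(x, xs[0])
rhs = sum(1 / (xs[0] - xi) for xi in xs[1:])
print(sp.simplify(lhs - rhs))          # prints 0
```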

Using this technique it is also easy to find the coordinates on a sphere translated in the \((1,\ldots ,1)\) direction.

Corollary 33.1

If \(\mathbf {x} = (x_1,x_2,\ldots ,x_n)\) is a critical point of the Vandermonde determinant on a surface \(S \subset \mathbb {C}^n\) then \((x_1+a,x_2+a,\ldots ,x_n+a)\) is a critical point of the Vandermonde determinant on the surface \(\{ \mathbf {x} + a \mathbf {1} \in \mathbb {C}^n | \mathbf {x} \in S \}\).

Proof

Follows immediately from

$$\begin{aligned} v_n\left( x_1+a,x_2+a,\ldots ,x_n+a\right) =&\prod _{1 \le i< j \le n} \left( x_j+a-x_i-a\right) \\ =&\prod _{1 \le i < j \le n} (x_j-x_i) = v_n(x_1,\ldots ,x_n). \end{aligned}$$

In several cases it is possible to find the extreme points by identifying the unknown parameters, \(\rho \) and the coefficients of P(x), by comparing the terms in (33.6) with different degrees and solving the resulting equation system. We will discuss several cases in the upcoming sections.

33.3.1 Critical Points on Surfaces Given by a First Degree Univariate Polynomial

When \(R(x) = r_1 x + r_0\) the surface defined by

$$\begin{aligned} \displaystyle \sum _{i=1}^{n} R(x_i) = 0 \end{aligned}$$

will always be a plane with normal \((1,1,\ldots ,1)\) through the point \(\left( -\frac{r_0}{r_1},-\frac{r_0}{r_1},\ldots ,-\frac{r_0}{r_1}\right) \).

Since

$$\begin{aligned} v_n\left( x_1+\frac{r_0}{r_1},x_2+\frac{r_0}{r_1},\ldots ,x_n+\frac{r_0}{r_1}\right) =&\prod _{1 \le i< j \le n} \left( x_j+\frac{r_0}{r_1}-x_i-\frac{r_0}{r_1}\right) \\ =&\prod _{1 \le i < j \le n} (x_j-x_i) = v_n(x_1,\ldots ,x_n), \end{aligned}$$

the Vandermonde determinant is invariant under translation in the \((1,\ldots ,1)\) direction, and on the plane translated to pass through the origin it is a non-constant homogeneous polynomial. Thus the Vandermonde determinant will have no extreme points on the plane unless a further constraint is added.

33.3.2 Critical Points on Surfaces Given by a Second Degree Univariate Polynomial

Surfaces defined by letting

$$\begin{aligned} R(x) = \frac{1}{2} x^2 + r_1 x + r_0 = \frac{1}{2} \left( (x+r_1)^2 - r_1^2 + 2r_0 \right) \end{aligned}$$

will all be spheres around \((-r_1,-r_1,\ldots ,-r_1)\) with radius

$$\begin{aligned} \sqrt{n \left( r_1^2 - 2 r_0 \right) } \,. \end{aligned}$$

Thus the critical points can be found by a small modification of the technique used on the unit sphere described in [22].

Theorem 33.2

On the surface defined by

$$\begin{aligned} g(\mathbf {x}) = \displaystyle \sum _{i=1}^{n} \left( \frac{1}{2} x_i^2 + r_1 x_i + r_0 \right) = 0 \end{aligned}$$

the coordinates of the critical points of the Vandermonde determinant are given by the roots of

$$\begin{aligned} f(x)&= H_n\left( \left( \frac{n-1}{2 (r_1^2 - 2 r_0)}\right) ^{\frac{1}{2}} (x+r_1)\right) \\&= n! \sum _{i=0}^{\left\lfloor \frac{n}{2}\right\rfloor } \frac{(-1)^i}{i!}\left( \frac{2(n-1)}{r_1^2 - 2 r_0}\right) ^{\frac{n-2i}{2}}\frac{(x+r_1)^{n-2i}}{(n-2i)!} \end{aligned}$$

where \(H_n\) denotes the nth (physicist) Hermite polynomial.

Proof

Since

$$\begin{aligned} R(x) = \frac{1}{2}x^2 + r_1 x + r_0 \end{aligned}$$

the differential equation (33.6) will be of the form

$$\begin{aligned} f''(x) - 2 \rho (x + r_1) f'(x) - p_0 f(x) = 0 \end{aligned}$$

By considering the terms with degree n it is easy to see that \(p_0 = - 2 \rho n\) and thus we get

$$\begin{aligned} f''(x) - 2 \rho (x + r_1) f'(x) + 2 \rho n f(x) = 0. \end{aligned}$$

Setting \(y = \rho ^{\frac{1}{2}}(x + r_1)\) gives \(x = \frac{y}{\rho ^{\frac{1}{2}}}-r_1\) and by considering the function

$$\begin{aligned} g(y) = f\left( \frac{y}{\rho ^{\frac{1}{2}}}-r_1\right) \end{aligned}$$

we can rewrite the differential equation as follows

$$\begin{aligned}&\rho \, g''(y) - 2 \rho \left( \frac{y}{\rho ^{\frac{1}{2}}}-r_1 + r_1\right) \rho ^{\frac{1}{2}} \, g'(y) + 2 \, \rho \, n \, g(y) = 0 \nonumber \\ \Leftrightarrow ~&\rho \, g''(y) - 2 \, \rho ^{\frac{1}{2}} \, y \, \rho ^{\frac{1}{2}} \, g'(y) + 2 \, \rho \, n \, g(y) = 0 \nonumber \\ \Leftrightarrow ~&g''(y) - 2 \, y \, g'(y) + 2 \, n \, g(y) = 0. \end{aligned}$$
(33.10)

Equation (33.10) is the differential equation that defines a well-known class of orthogonal polynomials, the (physicists’) Hermite polynomials [1], \(H_n(y)\). Thus,

$$\begin{aligned} f(x) = c H_n(\rho ^{\frac{1}{2}}(x+r_1)) \end{aligned}$$

for some arbitrary constant c. To find the value of \(\rho \) we can exploit some properties of the roots of the Hermite polynomials.

Let \(y_i\), \(i = 1,\ldots ,n\), be the roots of \(H_n(y)\). These roots then have the following two properties:

$$\begin{aligned} \sum _{i=1}^{n} y_i&= 0 \end{aligned}$$
(33.11)
$$\begin{aligned} \sum _{i=1}^{n} y_i^2&= \frac{n(n-1)}{2} \end{aligned}$$
(33.12)

We can see this by letting \(e_k(y_1,\ldots ,y_n)\) denote the elementary symmetric polynomials; then \(H_n(y)\) can be written as

$$\begin{aligned} H_n(y)&= a_n (y-y_1)\cdots (y-y_n) \\&= a_n (y^n - e_1(y_1,\ldots ,y_n)y^{n-1} + e_2(y_1,\ldots ,y_n)y^{n-2} + q(y)) \end{aligned}$$

where q(y) is a polynomial of degree \(n-3\). The explicit expression for \(H_n(y)\) is [22]

$$\begin{aligned} H_n(y)&= n! \sum _{i=0}^{\left\lfloor \frac{n}{2}\right\rfloor } \frac{(-1)^i}{i!}\frac{(2y)^{n-2i}}{(n-2i)!} \nonumber \\&= 2^n y^n - 2^{n-2}n(n-1) y^{n-2} + n! \sum _{i=2}^{\left\lfloor \frac{n}{2}\right\rfloor } \frac{(-1)^i}{i!}\frac{(2y)^{n-2i}}{(n-2i)!} . \end{aligned}$$
(33.13)

Comparing the coefficients in the two expressions for \(H_n(y)\) gives

$$\begin{aligned} a_n&= 2^n, \end{aligned}$$
(33.14)
$$\begin{aligned} a_n e_1(y_1,\ldots ,y_n)&= 0, \end{aligned}$$
(33.15)
$$\begin{aligned} a_n e_2(y_1,\ldots ,y_n)&= -n(n-1) 2^{n-2}. \end{aligned}$$
(33.16)

Since

$$\begin{aligned} e_1(y_1,\ldots ,y_n) = \displaystyle \sum _{i = 1}^{n} y_i \end{aligned}$$

Eq. (33.15) implies (33.11) and since

$$\begin{aligned} \sum _{i=1}^{n} y_i^2&= (y_1+\cdots +y_n)^2 - 2 \sum _{1\le i < j \le n} y_i y_j \nonumber \\&= e_1(y_1,\ldots ,y_n)^2 - 2 e_2(y_1,\ldots ,y_n) \end{aligned}$$

Equation (33.14) together with (33.16) implies (33.12).

We now take the change of variables \(x = \frac{y}{\rho ^{\frac{1}{2}}}-r_1\) into consideration and get

$$\begin{aligned} \sum _{i=1}^{n} x_i&= \sum _{i = 1}^{n} \left( \frac{y_i}{\rho ^{\frac{1}{2}}}-r_1 \right) = \frac{1}{\rho ^{\frac{1}{2}}} \left( \sum _{i=1}^{n} y_i \right) - n r_1, \\ \sum _{i=1}^{n} x_i^2&= \sum _{i = 1}^{n} \left( \frac{y_i}{\rho ^{\frac{1}{2}}}-r_1 \right) ^2 = \frac{1}{\rho } \left( \sum _{i=1}^{n} y_i^2 \right) - \frac{2 r_1}{\rho ^{\frac{1}{2}}} \left( \sum _{i=1}^{n} y_i \right) + n r_1^2. \end{aligned}$$

Using (33.11) and (33.12) we can simplify these expressions:

$$\begin{aligned} \sum _{i=1}^{n} x_i&= -nr_1, \\ \sum _{i=1}^{n} x_i^2&= \frac{n(n-1)}{2 \rho } + nr_1^2. \end{aligned}$$

This allows us to rephrase the constraint \(g(\mathbf {x}) = 0\) as follows

$$\begin{aligned} g(\mathbf {x}) = \sum _{i=1}^{n} \left( \frac{1}{2} x_i^2 + r_1 x_i + r_0 \right) = \frac{n(n-1)}{4 \rho } - \frac{n r_1^2}{2} + n r_0 = 0 \end{aligned}$$

and from this it is easy to find an expression for \(\rho \)

$$\begin{aligned} \rho = \frac{n-1}{2 (r_1^2 - 2 r_0)}. \end{aligned}$$

Thus the coordinates of the extreme points are the roots of the polynomial given in Theorem 33.2.

Remark 33.1

Note that if \(\mathbf {x}_n = (x_1, x_2, \ldots , x_n)\) is an extreme point of the Vandermonde determinant then any other point whose coordinates are a permutation of the coordinates of \(\mathbf {x}_n\) is also an extreme point. This follows from the determinant function being, by definition, alternating with respect to the columns of the matrix, and the \(x_i\)s defining the columns of the Vandermonde matrix. Thus any permutation of the \(x_i\)s will give the same value for \(|v_n(\mathbf {x}_n)|\). Since there are n! permutations there will be at least n! extreme points. The roots of the polynomial in Theorem 33.2 define the set of \(x_i\)s fully and thus there are exactly n! extreme points, at n!/2 of which the determinant is positive and at n!/2 negative.

Remark 33.2

All terms in \(H_n(y)\) are of even order if n is even and of odd order when n is odd. This means that the roots of \(H_n(y)\) are symmetric in the sense that if \(y_i\) is a root then \(-y_i\) is also a root. From this it follows that if a point is a critical point on the type of surface considered here, then the point opposite it with respect to the sphere's centre is also a critical point.

For more details and demonstrations of how to visualize the result see [14].
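Theorem 33.2 can also be checked numerically. The sketch below, with hypothetical values of \(r_1\) and \(r_0\), rescales the roots of \(H_n\), verifies that the resulting point lies on the surface, and verifies the stationarity condition (33.7).

```python
import numpy as np
from numpy.polynomial.hermite import Hermite  # physicists' Hermite polynomials

# Numerical check of Theorem 33.2 with hypothetical values of r1 and r0.
n, r1, r0 = 5, 0.5, -1.0
rho = (n - 1) / (2 * (r1**2 - 2 * r0))  # the multiplier derived in the proof
y = Hermite.basis(n).roots()            # roots of H_n
x = y / np.sqrt(rho) - r1               # candidate critical coordinates

print(np.sum(0.5 * x**2 + r1 * x + r0))  # ~ 0, so the point lies on the surface
for j in range(n):                       # stationarity (33.7) with R'(x) = x + r1
    lhs = sum(1 / (x[j] - x[i]) for i in range(n) if i != j)
    print(lhs - rho * (x[j] + r1))       # each difference ~ 0
```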

33.4 Critical Points on the Sphere Defined by a p-norm

Definition 33.2

The p-norm of \(\mathbf {x} \in \mathbb {R}^{n}\), also denoted by \(\Vert \mathbf {x} \Vert _p\), is defined as

$$\begin{aligned} \Vert \mathbf {x} \Vert _{p} = \left( \sum _{i=1}^{n} |x_{i}|^{p} \right) ^{\frac{1}{p}}, ~~ \mathrm {for}~~ p>0. \end{aligned}$$
(33.17)

Definition 33.3

The infinity norm of \(\mathbf {x} \in \mathbb {R}^{n}\), denoted \(\Vert \mathbf {x} \Vert _{\infty }\), is defined as

$$\begin{aligned} \Vert \mathbf {x} \Vert _{\infty } = \mathrm {sup}\{| x_{i} | : ~1 \le i \le n \}. \end{aligned}$$
(33.18)

Definition 33.4

The sphere defined by the p-norm, denoted \(S^{n-1}_{p}(r)\), for positive integer p, is the set of all \(\mathbf {x} \in \mathbb {R}^{n}\) such that

$$\begin{aligned} \sum _{i=1}^{n}|x_{i}|^{p} = \Vert \mathbf {x} \Vert _p^p = r^{p}. \end{aligned}$$
(33.19)

When \(r = 1\) this is the unit sphere defined by the p-norm, denoted simply \(S^{n-1}_p\).

When p increases the points on \(S_p^{n-1}\) approach the surface of the cube, so for convenience we define \(S_\infty ^{n-1}\) as the cube given by the boundary of \([-1,1]^n\).

Spheres defined by a p-norm include many well-known geometric shapes. For instance, when \(n=2, p=2,\) then

$$\begin{aligned} \displaystyle S_{2}^{1}(r) = \{(x_{1}, x_{2}) \in \mathbb {R}^{2}:x_{1}^{2} + x_{2}^{2} = r^{2} \} \end{aligned}$$

is a circle and when \(n=3, p=2,\) then

$$\begin{aligned} \displaystyle S_{2}^{2}(r) = \{(x_{1}, x_{2}, x_{3}) \in \mathbb {R}^{3}:x_{1}^{2} + x_{2}^{2} + x_{3}^{2} = r^{2} \} \end{aligned}$$

is the standard 2-sphere with radius r.

In the previous section we discussed how the extreme points of the Vandermonde determinant are distributed for the case \(p=2\) and \(n \ge 2\). In this section we will examine how the extreme points of the Vandermonde determinant are distributed on the sphere defined by the p-norm for the cases \(p \in \{4,6,8\}\) for a few different values of n.

In Fig. 33.1, we illustrate the surfaces generated under \(S^{n-1}_p\) for the cases \(p = 2\), \(p = 4\), \(p=6\), \(p= 8\), and \(p = \infty \) with a section cut out for internal cross-sectional view.

Fig. 33.1

Illustration of \(S^{n-1}_p\) for \(p = 2\), \(p = 4\), \(p=6\), \(p= 8\), and \(p = \infty \) with a section cut out. The outer cube corresponds to \(p=\infty \) and \(p=2\) corresponds to the sphere in the middle

Similarly to the previous section we will construct a polynomial whose roots give the coordinates of the extreme points of the Vandermonde determinant. First we will consider the case \(p=4\), \(n = 4\).

33.4.1 The Case p = 4 and n = 4

We will illustrate the construction of a polynomial that has the coordinates of the points as roots with the case \(p=4\), \(n=4\). We denote the polynomial whose roots give the coordinates by \(P_4^4(x)\) and use the same type of argument that was used to obtain Eq. (33.6). We take P(x) to be of the form:

$$\begin{aligned} P(x) = x^{n} + c_{n-2}x^{n-2} + c_{n-4}x^{n-4} + \cdots \end{aligned}$$
(33.20)

with every other coefficient zero: when n is even we only have even powers and when n is odd only odd powers. By identifying the powers in the differential equation (33.6) for the case \(p=4\):

$$\begin{aligned} P^{''}(x) + \rho _{n}x^{3}P^{'}(x) + (\sigma _{n}x^{2} + \tau _{n}x + \nu _{n})P(x) = 0, \end{aligned}$$
(33.21)

we see that the term \(\tau _{n}xP(x)\) does not share any powers with any other part of the equation and thus \(\tau _{n}=0\). Similarly, identifying the leading coefficients we obtain \(p\rho _{n} + \sigma _{n} = 0\). This leads us to the differential equation

$$\begin{aligned} P^{''}(x) + \rho _{n}x^{3}P^{'}(x) + (-p\rho _{n}x^{2} + \nu _{n})P(x) = 0. \end{aligned}$$
(33.22)

Based on (33.20) and (33.22), and setting \(n=4, p=4\), we can generate the system

$$\begin{aligned} \sum _{i=1}^{n}x_{i}^{p}&= 1,\\ P_4^4(x)&= x^{n} + c_{n-2}x^{n-2} + c_{n-4}x^{n-4} + \cdots ,\\ {P_4^4}^{''}(x)&+ \rho _{n}x^{3}{P_4^4}^{'}(x) + (-p\rho _{n}x^{2} + \nu _{n})P_4^4(x) = 0. \end{aligned}$$

It follows that

$$\begin{aligned}\begin{gathered} \sum _{i=1}^{4}x_{i}^{4} = x_{1}^{4} + x_{2}^{4} + x_{3}^{4} + x_{4}^{4} = 1,\\ P_4^4(x) = x^{4} + c_{2}x^{2} + c_{0} ~~\Rightarrow ~ {P_4^4}^{'}(x) = 4x^{3} + 2c_{2}x ~~\Rightarrow ~~ {P_4^4}^{''}(x) = 12x^{2} + 2c_{2}, \end{gathered}\end{aligned}$$

thus substituting into the differential equation

$$\begin{aligned} (12x^{2} + 2c_{2}) + \rho _{n}x^{3}(4x^{3} + 2c_{2}x) + (-p\rho _{n}x^{2} + \nu _{n})(x^{4} + c_{2}x^{2} + c_{0})&= 0\\ (\nu - 2\rho c_{2})x^{4} + (\nu c_{2}-4\rho c_{0} + 12)x^{2} + (2c_{2} + c_{0}\nu )&= 0. \end{aligned}$$

Since the left hand side must vanish identically, equating each coefficient to zero we get:

$$\begin{aligned} \nu - 2\rho c_{2}&= 0\\ \nu c_{2}-4\rho c_{0} + 12&= 0\\ 2c_{2} + c_{0}\nu&= 0. \end{aligned}$$

Setting \(t = x^{2}\), the four roots of \(P_4^4\) come in pairs \(\pm \sqrt{t_0}\), \(\pm \sqrt{t_1}\) and we can express the constraint and \(P_4^4(x)\) as follows:

$$\begin{aligned} \sum _{i=1}^{4}x_{i}^{4}&= 2t_{0}^{2} + 2t_{1}^{2} = 1\\ P_4^4(x)&= x^{4} + c_{2}x^{2} + c_{0} = t^{2} + c_{2}t + c_{0} = t^{2} - (t_{0} + t_{1})t + t_{0} t_{1} \end{aligned}$$

where \(t_0\) and \(t_1\) are the roots of \(t^{2} + c_{2}t + c_{0}\).

Also equating coefficients in \(P_4^4(x)\) gives

$$\begin{aligned}&t_{0} + t_{1} = -c_{2}, ~~~~ t_{0} t_{1} = c_{0}\\ \Rightarrow&t_{0}^{2} + t_{1}^{2} = (t_{0} + t_{1})^{2} - 2 t_{0} t_{1} = c_{2}^{2} - 2c_{0} \Rightarrow 2t_{0}^{2} + 2t_{1}^{2} = 2(c_{2}^{2} - 2c_{0}) = 1. \end{aligned}$$

This now gives a fourth equation so that the system can be solved:

$$\begin{aligned} \nu - 2\rho c_{2}&= 0 \end{aligned}$$
(33.23)
$$\begin{aligned} \nu c_{2} - 4\rho c_{0} + 12&= 0 \end{aligned}$$
(33.24)
$$\begin{aligned} 2c_{2} + c_{0}\nu&= 0 \end{aligned}$$
(33.25)
$$\begin{aligned} 2(c_{2}^{2} - 2c_{0})&= 1 \end{aligned}$$
(33.26)

From (33.23) we obtain \(\nu = 2\rho c_{2}\) and substituting this into (33.24) gives

$$\begin{aligned} 2\rho c_{2}^{2} -4 \rho c_{0} + 12 = 0 \Rightarrow \rho \left( 2(c_{2}^{2} -2c_{0})\right) = -12 \Rightarrow \rho = -12. \end{aligned}$$

To get the last equality use (33.26).

Using this value in the expression for \(\nu \) we obtain \(\nu = -24 c_{2}\) and substituting this value into (33.25) gives

$$\begin{aligned} 2c_{2} - 24 c_{0} c_{2} = 0 \Rightarrow 2c_{2}(1 - 12c_{0}) = 0 \Rightarrow 1 - 12c_{0} = 0 \Rightarrow c_{0} = \frac{1}{12}, \end{aligned}$$

where the last equality follows from \(c_{2}\not = 0\).

Now with \(\rho = -12\) and \(c_{0} = 1/12\), using (33.26) we obtain

$$\begin{aligned} \displaystyle 2(c_{2}^{2} - 2c_{0}) = 1 \Rightarrow c_{2}^{2} = \frac{1}{2} + \frac{2}{12} = \frac{2}{3} \Rightarrow c_{2} = -\frac{2}{\sqrt{6}}, \end{aligned}$$

where the sign of \(c_{2}\) is chosen so that \(P_4^4\) has four real roots.

Therefore we obtain \(P_4^4(x) = x^{4} - \frac{2}{\sqrt{6}}x^{2} + \frac{1}{12}\).
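A small numerical check of this polynomial, in the same hedged spirit as the earlier sketches: its roots should lie on \(S^3_4\) and satisfy the stationarity condition with \(R'(x)\) proportional to \(x^3\).

```python
import numpy as np

# Check the derived polynomial P_4^4(x) = x^4 - (2/sqrt(6)) x^2 + 1/12.
c2, c0 = -2 / np.sqrt(6), 1 / 12
x = np.sort(np.roots([1, 0, c2, 0, c0]))
print(np.sum(x**4))  # ~ 1, so the roots lie on S^3_4

# Stationarity: sum_{i != j} 1/(x_j - x_i) divided by x_j^3 is the same for all j.
ratios = [sum(1 / (x[j] - x[i]) for i in range(4) if i != j) / x[j] ** 3
          for j in range(4)]
print(ratios)  # four (approximately) equal values, the common multiplier
```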

In Sect. 33.4.2 we will generalise this technique somewhat.

33.4.2 Some Results for Even n and p

In this section we will discuss the case when n and p are positive and even integers, and \(n>p\). We will discuss a method that can give the coordinates of the extreme points of the Vandermonde determinant constrained to \(S^{n-1}_p\), as defined in (33.19), as the roots of a polynomial.

First we will examine how this optimisation problem can be rewritten as a differential equation similar to (33.21).

Lemma 33.2

Let n and p be even positive integers. Consider the unit sphere given by the p-norm, in other words the surface given by

$$\begin{aligned} \displaystyle S_{p}^{n-1} = \left\{ (x_1,\ldots ,x_n) \in \mathbb {R}^n \Bigg | \sum _{i=1}^{n}x_{i}^{p} = 1 \right\} . \end{aligned}$$

There exists a second order differential equation

$$\begin{aligned} {P_n^p}''(x) - \frac{a_{p-2}}{n} x^{p-1}{P_n^p}'(x) + Q_n^p(x){P_n^p}(x) = 0, \end{aligned}$$
(33.27)

where \({P_n^p}(x)\) and \({Q_n^p}(x)\) are polynomials of the forms,

$$\begin{aligned} \displaystyle {P_n^p}(x) = x^{n} + \sum _{i=0}^{\frac{1}{2}n-1}c_{2i}x^{2i} \end{aligned}$$

and

$$\begin{aligned} \displaystyle {Q_n^p}(x) = a_{p-2}x^{p-2} + \sum _{i=0}^{\frac{1}{2}p-2}(-1)^{i}a_{2i}x^{2i}. \end{aligned}$$

There is also a relation between the coefficients of \(P_n^p\) and \(Q_n^p\) given by

$$\begin{aligned} 2j(2j-1)c_{2j}+\left( \sum _{k=0}^{\min \left( j-1,\frac{p}{2}-2\right) } \, a_{2k} \, c_{2(j-k-1)} \right) +\frac{n+p-2j}{n} \, a_{p-2} \, c_{2j-p} = 0 \end{aligned}$$
(33.28)

for \(1 \le j \le \frac{n+p-2}{2}\) where \(c_n = 1\), \(c_k = 0\) for  \(k \not \in \{ 0, 2, 4, \ldots ,n\}\) and \(a_k = 0\) for \(k \not \in \{ 0, 2, 4, \ldots , p-2 \}\).

Proof

This result is proved analogously to how (33.6) is found. Define

$$\begin{aligned} P^p_n(x) = \prod _{i = 1}^{n} (x-x_i) \end{aligned}$$

and note that

$$\begin{aligned} \frac{1}{2}\frac{{P^p_n}''(x_j)}{{P^p_n}'(x_j)} = \sum _{\begin{array}{c} i=1 \\ i \ne j \end{array}}^{n} \frac{1}{x_j-x_i} \end{aligned}$$

at each root \(x_j\) of \(P^p_n\).

Now apply the method of Lagrange multipliers and see that in the critical points

$$\begin{aligned} \sum _{\begin{array}{c} i=1 \\ i \ne j \end{array}}^{n} \frac{1}{x_j-x_i} = \rho R'(x_j) \end{aligned}$$

where \(\rho \) is some unknown constant.

At each critical point we can combine the two expressions and conclude that

$$\begin{aligned} {P^p_n}''(x_j) - 2 \rho R'(x_j) {P^p_n}'(x_j) = 0, ~ j = 1,2,\ldots ,n \end{aligned}$$

for some \(\rho \in \mathbb {R}\). Since each \(x_j\) is a root of \({P^p_n}(x)\) we see that the left hand side in the differential equation must be a polynomial with the same roots as \({P^p_n}(x)\), thus we can conclude that for any \(x \in \mathbb {R}\)

$$\begin{aligned} {P^p_n}''(x) - 2 \rho R'(x) {P^p_n}'(x) - Q(x) {P^p_n}(x) = 0 \end{aligned}$$

where Q(x) is a polynomial of degree \(p-2\).

By applying the principles of polynomial solutions to linear second order differential equations [2, 3], expanding the expression accordingly and matching the coefficients of the terms with different powers of x, one can see that the coefficients of \({P_n^p}(x)\) and Q(x) must obey the relation given in (33.28).

Noting that the relations between the two sets of coefficients are linear we will consider the equations given by (33.28) corresponding to

$$\begin{aligned} j \in \left\{ \frac{n}{2},\frac{n}{2}+1,\ldots ,\frac{n+p-2}{2}\right\} , \end{aligned}$$

the corresponding system of equations in matrix form becomes

$$\begin{aligned} \begin{bmatrix} c_{n-2} & c_{n-4} & c_{n-6} & \cdots & c_{n-p+2} & \frac{p}{n} \, c_{n-p} \\ 1 & c_{n-2} & c_{n-4} & \cdots & c_{n-p+4} & \frac{p-2}{n} \, c_{n-p+2} \\ 0 & 1 & c_{n-2} & \cdots & c_{n-p+6} & \frac{p-4}{n} \, c_{n-p+4} \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & c_{n-2} & \frac{4}{n} \, c_{n-4} \\ 0 & 0 & 0 & \cdots & 1 & \frac{2}{n} \, c_{n-2} \end{bmatrix} \begin{bmatrix} a_0 \\ a_2 \\ a_4 \\ \vdots \\ a_{p-4} \\ a_{p-2} \end{bmatrix} = \begin{bmatrix} -n(n-1) \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}. \end{aligned}$$
(33.29)

By solving this system we can reduce the \(\frac{n+p-2}{2}\) equations given by matching the terms to \(\frac{n-2}{2}\) equations that together with the condition given by (33.19) give a system of polynomial equations that determines all the unknown coefficients of \({P_n^p}(x)\).

To describe how we can express the solution to (33.29) we will use a few well-known relations between elementary symmetric polynomials and power sums often referred to as the Newton–Girard formulae, and Vieta’s formula that describes the relation between the coefficients of a polynomial and its roots.

Here we will give some useful properties of elementary symmetric polynomials and power sums and relations between them.

Definition 33.5

The elementary symmetric polynomials are defined by

$$\begin{aligned} e_{1}(x_{1}, \ldots , x_{n}) =&\sum _{i=1}^{n}x_{i}, \\ e_{2}(x_{1}, \ldots , x_{n}) =&\sum _{1 \le i_{1}< i_{2} \le n}x_{i_{1}}x_{i_{2}}, \\ e_{3}(x_{1}, \ldots , x_{n}) =&\sum _{1 \le i_{1}< i_{2}< i_{3} \le n}x_{i_{1}}x_{i_{2}}x_{i_{3}}, \\ \vdots ~&\\ e_{m}(x_{1}, \ldots , x_{n}) =&\sum _{1 \le i_{1}< \ldots< i_{m} \le n}x_{i_{1}}x_{i_{2}}\cdots x_{i_{m}}, \\ \vdots ~&\\ e_{n}(x_{1}, \ldots , x_{n}) =&x_{1}x_{2}\cdots x_{n}. \end{aligned}$$

The elementary symmetric polynomials can be used to describe a well-known relation between the roots of a polynomial and its coefficients, often referred to as Vieta's formula.

Theorem 33.3

(Vieta’s formula)

Suppose \(x_{1}, \ldots , x_{n}\) are the n roots of a polynomial

$$\begin{aligned} x^{n} + c_{1}x^{n-1} + \ldots + c_{n}. \end{aligned}$$

Then \(c_k = (-1)^k e_k(x_1,\ldots ,x_n)\).

Definition 33.6

A power sum is an expression of the form \(p_k(x_1,\ldots ,x_n) = \displaystyle \sum _{i=1}^{n} x_i^k\).

Theorem 33.4

(Newton–Girard formulae) The Newton–Girard formulae can be expressed in many ways. For us the most useful version is the determinantal expression. Let \(e_k = e_k(x_1,\ldots ,x_n)\) and \(p_k = p_k(x_1,\ldots ,x_n)\) denote the elementary symmetric polynomials and the power sums as in Definitions 33.5 and 33.6. Then the power sum can be expressed in terms of elementary symmetric polynomials in the following way

$$ p_k = \begin{vmatrix} e_1&1&0&\cdots&0&0 \\ 2e_2&e_1&1&\cdots&0&0 \\ 3e_3&e_2&e_1&\cdots&0&0 \\ \vdots&\vdots&\vdots&\ddots&\vdots&\vdots \\ (k-1)e_{k-1}&e_{k-2}&e_{k-3}&\cdots&e_1&1 \\ ke_k&e_{k-1}&e_{k-2}&\cdots&e_2&e_1 \end{vmatrix}. $$

Proof

See for example [16].
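The determinantal expression is easy to confirm symbolically for small k; the following is a minimal sketch for \(k = 3\), \(n = 4\).

```python
import sympy as sp
from itertools import combinations

# Check the determinantal Newton-Girard formula of Theorem 33.4 for k = 3, n = 4.
xs = sp.symbols('x1:5')
e = {m: sum(sp.prod(c) for c in combinations(xs, m)) for m in range(1, 4)}
D = sp.Matrix([[e[1],     1,    0   ],
               [2 * e[2], e[1], 1   ],
               [3 * e[3], e[2], e[1]]])
p3 = sum(xi**3 for xi in xs)
print(sp.expand(D.det() - p3))  # prints 0
```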

Lemma 33.3

We will use the following notation:

$$\begin{aligned} t_n(c_1,c_2,\ldots ,c_m) = \begin{vmatrix} c_{m}&c_{m-1}&c_{m-2}&\cdots&c_{2}&\frac{2m}{n} \, c_{1} \\ 1&c_{m}&c_{m-1}&\cdots&c_{3}&\frac{2m-2}{n} \, c_{2} \\ 0&1&c_{m}&\cdots&c_{4}&\frac{2m-4}{n} \, c_{3} \\ \vdots&\vdots&\vdots&\ddots&\vdots&\vdots \\ 0&0&0&\cdots&c_{m}&\frac{4}{n} \, c_{m-1} \\ 0&0&0&\cdots&1&\frac{2}{n} \, c_{m} \end{vmatrix} \end{aligned}$$
(33.30)

and \(t_n(c) = \frac{2}{n}c\); for an empty list of arguments we use the convention \(t_n() = 1\).

Proof

Comparing the expression for \(t_n\) with the relations given in Theorem 33.4 it is clear that these relations are equivalent to the Newton-Girard formulae with some minor modifications.

Lemma 33.4

For even n and p the condition (33.19) can be rewritten as

$$\begin{aligned} (-1)^{\frac{p}{2}} \, n \, t_n(c_{n-p},c_{n-p+2},\ldots ,c_{n-2}) = 1 \end{aligned}$$

where \(t_n\) is defined by (33.30).

Proof

Note that the left hand side of the condition \(g_p(x_1,\ldots ,x_n) = \displaystyle \sum _{i=1}^{n} x_i^p = 1\) is a power sum. By Theorem 33.4 the following relation holds:

$$ g_p(\mathbf {x}) = \begin{vmatrix} e_1&1&0&\cdots&0&0 \\ 2e_2&e_1&1&\cdots&0&0 \\ 3e_3&e_2&e_1&\cdots&0&0 \\ \vdots&\vdots&\vdots&\ddots&\vdots&\vdots \\ (p-1)e_{p-1}&e_{p-2}&e_{p-3}&\cdots&e_1&1 \\ pe_p&e_{p-1}&e_{p-2}&\cdots&e_2&e_1 \end{vmatrix} $$

where \(e_k\) is the k:th elementary symmetric polynomial of \(x_1\), \(\ldots \), \(x_n\). Using Vieta’s formula we can relate the elementary symmetric polynomials to the coefficients of P(x) by noting that

$$\begin{aligned} P(x) = x^{n} + \sum _{j=0}^{\frac{n}{2}-1} c_{2j} x^{2j} = x^{n} + \sum _{k=1}^{n} (-1)^k e_k x^{n-k} \end{aligned}$$

or more compactly \(e_{2k} = c_{n-2k}\).

With \(e_{2k} = c_{n-2k}\) and \(e_{2k+1} = 0\) we get

$$ g_p(\mathbf {x}) = \begin{vmatrix} 0&1&0&\cdots&0&0 \\ 2c_{n-2}&0&1&\cdots&0&0 \\ 0&c_{n-2}&0&\cdots&0&0 \\ 4c_{n-4}&0&c_{n-2}&\cdots&0&0 \\ \vdots&\vdots&\vdots&\ddots&\vdots&\vdots \\ 0&c_{n-p+2}&0&\cdots&0&1 \\ pc_{n-p}&0&c_{n-p+2}&\cdots&c_{n-2}&0 \end{vmatrix}. $$

Using Laplace expansion on every other row gives

$$\begin{aligned} g_p(\mathbf {x})&= - \begin{vmatrix} 2c_{n-2}&1&0&\cdots&0&0 \\ 0&0&1&\cdots&0&0 \\ 4c_{n-4}&c_{n-2}&0&\cdots&0&0 \\ \vdots&\vdots&\vdots&\ddots&\vdots&\vdots \\ 0&0&c_{n-p+4}&\cdots&0&1 \\ pc_{n-p}&c_{n-p+2}&0&\cdots&c_{n-2}&0 \end{vmatrix} = \cdots \\&= (-1)^{\frac{p}{2}} \begin{vmatrix} 2c_{n-2}&1&0&\cdots&0&0 \\ 4c_{n-4}&c_{n-2}&1&\cdots&0&0 \\ 6c_{n-6}&c_{n-4}&c_{n-2}&\cdots&0&0 \\ \vdots&\vdots&\vdots&\ddots&\vdots&\vdots \\ (p-2)c_{n-p+2}&c_{n-p+4}&c_{n-p+6}&\cdots&c_{n-2}&1 \\ pc_{n-p}&c_{n-p+2}&c_{n-p+4}&\cdots&c_{n-4}&c_{n-2} \end{vmatrix} = (-1)^{\frac{p}{2}} n \, t_n(c_{n-p},c_{n-p+2},\ldots ,c_{n-2}) \end{aligned}$$

Thus \(g_p(x_1,\ldots ,x_n) = 1\) is equivalent to \((-1)^{\frac{p}{2}} n \, t_n(c_{n-p},\ldots ,c_{n-2}) = 1\).

Lemma 33.5

The coefficients of the polynomial Q(x) in (33.27) can be expressed using the coefficients of P(x) as follows

$$\begin{aligned} a_{2k-2}&= (-1)^{k+\frac{p}{2}} n^2 (n-1) t_n(c_{n-p+2k},\ldots ,c_{n-2}),~k=1,2,\ldots ,\frac{p}{2}. \end{aligned}$$
(33.31)

Proof

By (33.29) we can write

$$ \begin{bmatrix} a_0 \\ a_2 \\ a_4 \\ \vdots \\ a_{p-4} \\ a_{p-2} \end{bmatrix} = \begin{bmatrix} c_{n-2} & c_{n-4} & c_{n-6} & \cdots & c_{n-p+2} & \frac{p}{n} \, c_{n-p} \\ 1 & c_{n-2} & c_{n-4} & \cdots & c_{n-p+4} & \frac{p-2}{n} \, c_{n-p+2} \\ 0 & 1 & c_{n-2} & \cdots & c_{n-p+6} & \frac{p-4}{n} \, c_{n-p+4} \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & c_{n-2} & \frac{4}{n} \, c_{n-4} \\ 0 & 0 & 0 & \cdots & 1 & \frac{2}{n} \, c_{n-2} \end{bmatrix}^{-1} \begin{bmatrix} -n(n-1) \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}. $$

and using Cramer’s rule we get

$$\begin{aligned} a_{2k-2} = \frac{\det (T_{n,p,k})}{t_n(c_{n-p},\ldots ,c_{n-2})} \end{aligned}$$

where

$$\begin{aligned} T_{n,p,k}&= \begin{bmatrix} c_{n-2} & c_{n-4} & \cdots & c_{n-2k+2} & -n(n-1) & c_{n-2k-2} & \cdots & \frac{p}{n} \, c_{n-p} \\ 1 & c_{n-2} & \cdots & c_{n-2k} & 0 & c_{n-2k-4} & \cdots & \frac{p-2}{n} \, c_{n-p+2} \\ 0 & 1 & \cdots & c_{n-2k-2} & 0 & c_{n-2k-6} & \cdots & \frac{p-4}{n} \, c_{n-p+4} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \cdots & \vdots \\ 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & \frac{4}{n} \, c_{n-4} \\ 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & \frac{2}{n} \, c_{n-2} \end{bmatrix} \\&= \begin{bmatrix} c_{n-2} & c_{n-4} & \cdots & c_{n-2k+2} & -n(n-1) & & & ~~\\ 1 & c_{n-2} & \cdots & c_{n-2k} & 0 & & & \\ 0 & 1 & \cdots & c_{n-2k-2} & 0 & & & \\ \vdots & \vdots & \ddots & \vdots & \vdots & & {\text {M}} & \\ 0 & 0 & \cdots & 0 & 0 & & & \\ 0 & 0 & \cdots & 0 & 0 & & & \end{bmatrix}. \\ \end{aligned}$$

By moving the k:th column to the first column and using Laplace expansion, \(\det (T_{n,p,k})\) can be rewritten in the form

$$\begin{aligned} \det (T_{n,p,k}) =&(-1)^k n(n-1) \begin{vmatrix} 1&c_{n-2}&\cdots&c_{n-2k}&&~~ \\ 0&1&\cdots&c_{n-2k-2}&&\\ \vdots&\vdots&\ddots&\vdots&&\\ 0&0&\cdots&1&&\\ 0&0&\cdots&0&M&\\ \vdots&\vdots&\ddots&\vdots&&\\ 0&0&\cdots&0&&\end{vmatrix} = (-1)^k n(n-1) |M| \\ =&(-1)^k n(n-1) \begin{vmatrix} c_{n-2}&\cdots&c_{n-p+2k+2}&\frac{p-2k}{n} \, c_{n-p+2k} \\ 1&\cdots&c_{n-p+2k+4}&\frac{p-2k-2}{n} \, c_{n-p+2k+2}\\ \vdots&\ddots&\vdots&\vdots \\ 0&\cdots&c_{n-2}&\frac{4}{n} \, c_{n-4} \\ 0&\cdots&1&\frac{2}{n} \, c_{n-2} \end{vmatrix} \\ =&(-1)^k n (n-1) \, t_n(c_{n-p+2k},\ldots ,c_{n-2}) \end{aligned}$$

We can also use Lemma 33.4 to note that \(t_n(c_{n-p},\ldots ,c_{n-2}) = \frac{(-1)^{\frac{p}{2}}}{n}\) and thus

$$\begin{aligned} a_{2k-2} = \frac{\det (T_{n,p,k})}{t_n(c_{n-p},\ldots ,c_{n-2})} = (-1)^{k+\frac{p}{2}} n^2(n-1) \, t_n(c_{n-p+2k},\ldots ,c_{n-2}). \end{aligned}$$

Theorem 33.5

The non-zero coefficients, \(c_{2k}\), of the polynomial \(P^p_n\) that solves (33.27) can be found by solving the polynomial equation system given by

$$\begin{aligned} 2j(2j-1)c_{2j}&+ n^2 (n-1) \left( \sum _{k=0}^{\min \left( j-1,\frac{p}{2}-2\right) } (-1)^{k+1+\frac{p}{2}} \, t_n(c_{n-p+2k+2},\ldots ,c_{n-2}) \, c_{2(j-k-1)} \right) \\&+ n(n-1)(n+p-2j) \, c_{2j-p} = 0, \end{aligned}$$

for \(j=1,\ldots ,\frac{n}{2}-1\), together with the constraint rewritten as in Lemma 33.4.

Proof

The equation system is the result of using (33.31) to substitute the \(a_k\) coefficients in (33.28).

Using Lagrange multipliers directly gives a polynomial equation system with n equations while Theorem 33.5 gives \(\frac{n}{2}\) equations.

Table 33.1 Polynomials, \(P^n_p\), whose roots give the extreme points of the Vandermonde determinant on the sphere defined by the p-norm in n dimensions

As an example we can consider the case \(n=8\), \(p = 4\). Matching the coefficients for (33.27) gives the system

$$ \left\{ \begin{aligned} a_0 c_0 + 2c_2&= 0, \\ a_0 c_2 + a_2 c_0 + 12 c_4&= 0, \\ 30 c_6 + a_0 c_4 + \frac{3}{4} a_2 c_2&= 0, \\ 56 + a_0 c_6 + \frac{1}{2} a_2 c_4&= 0, \\ a_0 + \frac{1}{4} a_2 c_6&= 0, \end{aligned} \right. $$

and rewriting the constraint that the points lie on \(S^{7}_4\) gives \(2c_6^2-4c_4 = 1\).

In this case the expressions for \(a_0\) and \(a_2\) become quite simple

$$ \left\{ \begin{aligned} a_0&= -112 c_6, \\ a_2&= 448. \end{aligned} \right. $$

By resubstituting these expressions into the system, or using Theorem 33.5 directly, an equation system for \(c_0\), \(c_2\), \(c_4\) and \(c_6\) is given by

$$ \left\{ \begin{aligned} -112 c_0 c_6 + 2c_2&= 0, \\ -112 c_2 c_6 + 448 c_0 + 12 c_4&= 0, \\ -112 c_4 c_6 + 336 c_2 + 30 c_6&= 0, \\ -2c_6^2 + 4 c_4 + 1&= 0. \end{aligned} \right. $$

The authors are not aware of any method that can be used to easily and reliably solve the system given by Theorem 33.5. In Table 33.1 results for a number of systems, both with even and odd n and various values of p, are given. These were found by manual experimentation combined with computer-aided symbolic computation.
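Alternatively one can attack the optimisation problem directly. The following sketch maximises \(\log |v_8(\mathbf {x})|\) on \(S^7_4\) with a general-purpose constrained optimiser and reads off the coefficients \(c_{2k}\) from the optimal coordinates; the starting point is an arbitrary guess and convergence is not guaranteed.

```python
import numpy as np
from scipy.optimize import minimize

# Maximise log|v_8(x)| on S^7_4 numerically and recover the monic polynomial
# with the optimal coordinates as roots.
n = 8
x0 = np.linspace(-1, 1, n)
x0 /= np.sum(x0**4) ** 0.25  # scale the initial guess onto the surface

def neg_log_v(x):
    d = x[None, :] - x[:, None]
    return -np.sum(np.log(np.abs(d[np.triu_indices(n, 1)])))

res = minimize(neg_log_v, x0, method='SLSQP',
               constraints=[{'type': 'eq', 'fun': lambda x: np.sum(x**4) - 1}])
coeffs = np.poly(np.sort(res.x))  # monic coefficients from degree 8 down to 0
print(coeffs)  # odd-degree coefficients ~ 0; even ones give c_6, c_4, c_2, c_0
```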

33.5 Some Results for Cubes and Intersections of Planes

It can be noted that when \(p \rightarrow \infty \) then \(S_p^{n-1}\), as defined in the previous section, will converge towards the cube.

A technique similar to the one described for surfaces implicitly defined by a univariate polynomial can be employed on the cube. The maximum value of the Vandermonde determinant on the cube \([-1,1]^n\) has been known for a long time (at least since [9]). Here we will show a short derivation.

Theorem 33.6

The coordinates of the critical points of \(v_n(\mathbf {x})\) on the cube \(\mathbf {x}_n\in [-1,1]^n\) are given by \(x_1 = -1\), \(x_n = 1\) and \(x_2,\ldots ,x_{n-1}\) equal to the roots of \(P'_{n-1}(x)\), the derivative of the Legendre polynomial, where \(P_n\) are the Legendre polynomials

$$\begin{aligned} P_n(x) = 2^n \sum _{k=0}^{n} x^k \left( {\begin{array}{c}n\\ k\end{array}}\right) \left( {\begin{array}{c}\frac{n+k-1}{2}\\ n\end{array}}\right) \end{aligned}$$

or some permutation of them.

Proof

It is easy to show that the coordinates \(-1\) and \(+1\) must be present among the coordinates of the maximum points; if they were not, then we could rescale the point so that the value of \(|v_n(\mathbf {x})|\) is increased, a contradiction. We may thus assume the ordered sequence of coordinates

$$\begin{aligned} -1=x_1<\cdots <x_n=+1. \end{aligned}$$

The Vandermonde determinant then becomes

$$\begin{aligned} v_n(\mathbf {x}) = 2 \prod _{i=2}^{n-1} (1+x_i) (1-x_i) \prod _{1<i<j<n} (x_j-x_i). \end{aligned}$$

and the partial derivatives become

$$\begin{aligned} \frac{\partial v_n}{\partial x_k} = v_n(\mathbf {x}) \left( \frac{1}{x_k+1}+\frac{1}{x_k-1}+\sum _{\begin{array}{c} i=2\\ i\ne k \end{array}}^{n-1}\frac{1}{x_k-x_i}\right) ,\quad 1<k<n. \end{aligned}$$

Setting the partial derivatives equal to zero, the resulting system of equations becomes

$$\begin{aligned} \frac{\partial v_n}{\partial x_k} = 0, \quad k = 2,\ldots ,n-1 \end{aligned}$$

and choosing \(f(x) = \displaystyle \prod _{k=2}^{n-1} (x-x_k)\) gives that in each coordinate of a critical point

$$\begin{aligned}&\frac{1}{x_k+1}+\frac{1}{x_k-1}+\frac{1}{2} \frac{f''(x_k)}{f'(x_k)}=0,&1<k<n,\\ \Leftrightarrow \quad&(1-x_k^2) f''(x_k) - 4 x_k f'(x_k) = 0,&1<k<n \end{aligned}$$

and thus the left hand side of the expression must form a polynomial that can be expressed as some multiple of f(x)

$$\begin{aligned} (1-x^2) f''(x) - 4 x f'(x) - \sigma f(x)=0. \end{aligned}$$
(33.32)

The constant \(\sigma \) is found by considering the coefficient for \(x^{n-2}\):

$$\begin{aligned} (n-2)(n-3) + 4(n-2) + \sigma = 0 \quad \Leftrightarrow \quad \sigma = -(n-2)(n+1). \end{aligned}$$

This gives us the differential equation \((1-x^2)f''(x) - 4xf'(x) + (n-2)(n+1)f(x) = 0\), which is the differential equation satisfied by \(P'_{n-1}(x)\), the derivative of the Legendre polynomial \(P_{n-1}(x)\) [1], as can be seen by differentiating Legendre's differential equation for \(P_{n-1}\).
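The statement can be verified numerically; the sketch below computes the critical point for \(n = 4\) and confirms that \(|v_4|\) decreases when an interior coordinate is perturbed.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

# Critical point on the cube for n = 4: endpoints plus the roots of P'_{n-1}.
n = 4
interior = Legendre.basis(n - 1).deriv().roots()  # roots of P'_3, i.e. +-1/sqrt(5)
x = np.concatenate(([-1.0], interior, [1.0]))

def v(x):
    d = x[None, :] - x[:, None]
    return np.prod(d[np.triu_indices(len(x), 1)])

for eps in (0.01, -0.01):  # v_4 decreases when an interior point is moved
    y = x.copy(); y[1] += eps
    print(v(x) - v(y))     # both differences are positive
```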

The technique above can also easily be used to find critical points on the intersection of two planes given by \(x_1 = a\) and \(x_n = b\), \(b > a\).

Theorem 33.7

The coordinates of the critical points of \(v_n(\mathbf {x})\) on the intersection of two planes given by \(x_1 = a\) and \(x_n = b\) are given by \(x_{1} = a\), \(x_n = b\) and \(x_2,\ldots ,x_{n-1}\) equal to the roots of \(P'_{n-1}\left( \frac{2x-a-b}{b-a}\right) \) where \(P_n\) are the Legendre polynomials

$$\begin{aligned} P_n(x) = 2^n \sum _{k=0}^{n} x^k \left( {\begin{array}{c}n\\ k\end{array}}\right) \left( {\begin{array}{c}\frac{n+k-1}{2}\\ n\end{array}}\right) \end{aligned}$$

or some permutation of them.

Proof

We assume the ordered sequence of coordinates

$$\begin{aligned} a=x_1<\cdots <x_n=b. \end{aligned}$$

The Vandermonde determinant then becomes

$$\begin{aligned} v_n(\mathbf {x}) = (b-a) \prod _{i=2}^{n-1} (x_i-a) (b-x_i) \prod _{1<i<j<n} (x_j-x_i). \end{aligned}$$

and the partial derivatives become

$$\begin{aligned} \frac{\partial v_n}{\partial x_k} = v_n(\mathbf {x}) \left( \frac{1}{x_k-a}+\frac{1}{x_k-b}+\sum _{\begin{array}{c} i=2\\ i\ne k \end{array}}^{n-1}\frac{1}{x_k-x_i}\right) ,\quad 1<k<n. \end{aligned}$$

Setting the partial derivatives equal to zero, the resulting system of equations becomes

$$\begin{aligned} \frac{\partial v_n}{\partial x_k} = 0, \quad k = 2,\ldots ,n-1 \end{aligned}$$

and choosing \(f(x) = \displaystyle \prod _{k=2}^{n-1} (x-x_k)\) gives that in each coordinate of a critical point

$$\begin{aligned}&\frac{1}{x_k-a}+\frac{1}{x_k-b}+\frac{1}{2} \frac{f''(x_k)}{f'(x_k)}=0,&1<k<n,\\ \Leftrightarrow \quad&(x_k-a)(x_k-b) f''(x_k) + 2(2 x_k -a-b) f'(x_k) = 0,&1<k<n, \end{aligned}$$

and thus the left hand side of the expression must form a polynomial that can be expressed as some multiple of f(x)

$$\begin{aligned} (x-a)(x-b) f''(x) + 2(2 x -a -b) f'(x) - \sigma f(x)=0. \end{aligned}$$

The constant \(\sigma \) is found by considering the coefficient for \(x^{n-2}\):

$$\begin{aligned} (n-2)(n-3) + 4(n-2) - \sigma = 0 \quad \Leftrightarrow \quad \sigma = (n-2)(n+1). \end{aligned}$$

The resulting differential equation is

$$\begin{aligned} (x-a)(x-b) f''(x) + 2(2 x -a -b) f'(x) - (n-2)(n+1) f(x)=0. \end{aligned}$$

If we change variables according to \(y = \frac{x-a}{b-a}\) and let \(g(y) = f(y(b-a)+a)\) then the differential equation becomes

$$\begin{aligned} y (y-1) g''(y) + 2(2y-1) g'(y) - (n-2)(n+1) g(y) = 0 \end{aligned}$$

which we can recognize as a special case of Euler’s hypergeometric differential equation whose solution can be expressed as

$$\begin{aligned} g(y) = c \cdot _2\!F_1(2-n,n+1;2;y), \text { for some arbitrary } c \in \mathbb {R}, \end{aligned}$$

where \(_2F_1\) is the hypergeometric function [1]. In this case the hypergeometric function is a polynomial and relates to the Legendre polynomials as follows

$$\begin{aligned} _2F_1(2-n,n+1;2;y) = \frac{2}{n(n-1)} P'_{n-1}(1-2y) \end{aligned}$$

and since the roots of \(P'_{n-1}\) are placed symmetrically around the origin it is sufficient to consider the roots of \(P'_{n-1}\left( \frac{2x-a-b}{b-a}\right) \).
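As with the cube, the result is easy to check numerically. The sketch below maps the roots of \(P'_{n-1}\) onto the interval \([a,b]\) (the values of a and b are hypothetical) and verifies the stationarity condition for the interior coordinates.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

# Theorem 33.7 for n = 5 with hypothetical planes a = 0 and b = 3.
n, a, b = 5, 0.0, 3.0
y = Legendre.basis(n - 1).deriv().roots()      # roots of P'_{n-1} in (-1, 1)
x = np.concatenate(([a], a + (y + 1) * (b - a) / 2, [b]))

# Interior stationarity: sum_{i != k} 1/(x_k - x_i) = 0 for 1 < k < n.
for k in range(1, n - 1):
    print(sum(1 / (x[k] - x[i]) for i in range(n) if i != k))  # each ~ 0
```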

33.6 Conclusion

In this paper we discussed the extreme points of the Vandermonde determinant on surfaces defined implicitly by (33.3).

We can find polynomial expressions that have the coordinates of the extreme points as roots when the surface is a sphere or a cube. We also examined how to construct similar polynomials when the surface is a sphere defined by a p-norm. A technique for rewriting the problem as a smaller number of equations was demonstrated, but the resulting systems are still challenging to solve in most cases.