1 Introduction

The accurate and efficient computation of the real roots of polynomials is a fundamental problem in numerical analysis, of great importance in numerous applications. Although this problem has been intensively investigated since the advent of digital computers, a “universal” root–finding algorithm based on floating–point arithmetic, that infallibly computes to a specified precision the roots of polynomials with arbitrary coefficients, remains an elusive ideal. A primary factor in this state of affairs is the high variability observed in the sensitivity of polynomial roots to perturbations in the polynomial coefficients, and the sometimes counter–intuitive trends of this sensitivity.

For polynomials with exactly specified integer coefficients, the vagaries of floating–point arithmetic can be circumvented by invoking arbitrary–precision rational arithmetic to determine isolating intervals for each root, smaller than any prescribed width. However, this approach can become very demanding in computing time and memory requirements, and for many practical problems, the initial polynomial coefficients will not be exactly known.

Floating–point arithmetic, based on rounding or truncation of significant digits to ensure a fixed memory size for numbers, has become widely adopted in scientific computing since the 1960s. In the late 1950s, it was implemented in software rather than in the CPU, and the two–part paper [13] by the British numerical analyst J. H. Wilkinson describes one of its earliest applications in computing polynomial roots. Subsequently [15], he characterized his choice of an “easy” test polynomial (with the 20 equally spaced real roots 1,2,…,20) as “the most traumatic experience in my career as a numerical analyst,” owing to the extreme difficulty of computing any accurate digits for most of the roots. The polynomial with the 20 roots \(2^{-1},2^{-2},\ldots,2^{-20}\) in geometric progression is also considered in [13], and despite the close proximity of successive roots, it was found that they could be computed to high accuracy.

The choice of basis plays an important role in determining the sensitivity of the roots of a polynomial to uncertainties in its coefficients, and it is known that the usual monomial (power) form \(a_0+a_1t+\cdots+a_mt^m\) is not, in general, a favorable choice. For computing the real roots on a finite interval t ∈ [a,b], the Bernstein form is preferred, and in fact it is “optimally stable” among all bases that are non–negative over that interval [4].

In the present paper, we consider the roots of a sequence of sparse high–degree polynomials that arise from the study of random simplicial complexes (see Section 2), wherein each instance pn(t) includes only the n + 1 monomial terms \(t^{2^{k}}\) for k = 0,1,…,n, with binomial coefficients of alternating sign. Because of the probabilistic interpretation of the independent variable t, the main focus is on the real roots t ∈ [0,1], although we also briefly discuss positive roots t > 1, negative roots, and complex conjugate root pairs.

The remainder of this paper is organized as follows. Section 2 introduces the topological problem that motivates the study of the positive roots of the family of sparse high–degree polynomials considered herein. In Section 3, we elucidate some of the distinctive properties of the roots of these polynomials. Section 4 then considers numerical computation of the roots on (0,1) using double–precision floating–point arithmetic for n ≤ 10, based upon the power and Bernstein representations, and Section 5 discusses the accuracy of the computed roots in relation to the root condition numbers. Finally, Section 6 summarizes the principal results of the present study.

2 Euler characteristic of random simplicial complexes

We begin by explaining how the sequence of polynomials investigated herein arises quite naturally in the context of a fundamental question concerning the homotopy type of randomly constructed simplicial complexes.

2.1 Random simplicial complexes

Erdős and Rényi introduced random graphs [1] in 1959. In the years since their introduction, random graphs have proven to be an extremely powerful tool in graph theory, and their study has opened up a huge collection of new research directions. The most useful and deeply studied probability model starts with n vertices and a fixed probability t. Then, for each pair of vertices u and v, we flip a “t–coin” (that comes up heads with probability t) to decide whether or not they should be connected by an edge.

Even though, to a topologist, a graph is just a 1–dimensional simplicial complex, the study of random simplicial complexes has attracted substantial attention only in the past 15 years or so. The generalization from the random models of Erdős and Rényi to probability models for random simplicial complexes is straightforward. Recall that an n–dimensional simplex Δn is the convex hull of a set of n + 1 points {x1,…,xn+ 1} in general position in \(\mathbb {R}^{m}\) (with m ≥ n). The points xi are called the vertices of the simplex. Any subset with k + 1 of these vertices defines a subsimplex of dimension k. Thus, there are \(\binom {n+1}{k+1}\) k–dimensional simplices in Δn, and a total of \(2^{n+1}\) simplices, including the empty simplex. Combinatorially, a simplicial complex K of Δn is simply the union of some (perhaps all) of the simplices of Δn, with the proviso that if a simplex is in K, then so are all its subsimplices.

Several probability models for constructing random simplicial complexes have been studied, which are all specializations of the following extremely general procedure. List all the non–empty simplices of Δn with an ordering \(\sigma _{1},\ldots ,\sigma _{2^{n+1}-1}\) such that their dimensions are weakly increasing, and for each simplex σ, assign a probability pσ ∈ [0,1]. Then, beginning with \(K =\varnothing \) and σ = σ1, check if all the subsimplices of σ are already in K. If they are, then σ is “available for inclusion into K” and we flip a coin that comes up heads with probability pσ to decide whether or not to include it. Once that decision is made, we consider the next simplex in our list; the construction ends once all of the simplices of Δn have been considered.
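As a concrete illustration, this construction can be sketched in a few lines of Python; the function name, the identification of simplices with frozensets of vertices, and the facet-only availability test are our own choices (checking the facets suffices, since every simplex already in K has, inductively, all of its subsimplices in K):

```python
import random
from itertools import combinations

def random_complex(n, t, rng=None):
    """Random subcomplex of the n-simplex: visit the non-empty simplices in
    order of weakly increasing dimension; a simplex becomes available once all
    of its facets are present, and is then kept with probability t (one flip
    of the t-coin per simplex)."""
    rng = rng or random.Random()
    K = set()
    for dim in range(n + 1):
        for sigma in combinations(range(n + 1), dim + 1):
            sigma = frozenset(sigma)
            available = dim == 0 or all(sigma - {v} in K for v in sigma)
            if available and rng.random() < t:
                K.add(sigma)
    return K
```

For example, random_complex(4, 0.5) returns one random subcomplex of Δ4, represented as a set of vertex subsets.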

We focus henceforth on the specialization of this general model in which we choose a fixed value t ∈ [0,1] and set pσ = t for all simplices σ of Δn.

2.2 Euler characteristic

A topologist is naturally interested in the homotopy–theoretical properties of random simplicial complexes. The most fundamental question is whether or not these spaces are contractible (i.e., homotopically trivial). One way to distinguish the homotopy types of spaces is through the Euler characteristic, defined as

$$ \chi(K) := \sum\limits_{k=0}^{\dim(K)} (-1)^{k} \#\{ \text{\textit{k}-simplices in \textit{K}} \} . $$

If K and L are homotopy equivalent, then χ(K) = χ(L), and since χ(K) = 1 if K is a single point, we see that K cannot be contractible when χ(K)≠ 1. Let the random variable χ(n) be the Euler characteristic of the complex K constructed by the above procedure, with expected value E(χ(n)).
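As a small self-contained check of this definition (the helper and the two example complexes below are our own, with each simplex identified by the frozenset of its vertices):

```python
from itertools import combinations

def euler_characteristic(K):
    """chi(K): a simplex on j vertices has dimension k = j - 1 and
    contributes (-1)**k to the alternating sum."""
    return sum((-1) ** (len(sigma) - 1) for sigma in K)

# All non-empty faces of a triangle (a contractible disc): chi = 3 - 3 + 1 = 1.
disc = [frozenset(s) for r in (1, 2, 3) for s in combinations(range(3), r)]
assert euler_characteristic(disc) == 1

# Its boundary (a circle, which is not contractible): chi = 3 - 3 = 0.
circle = [s for s in disc if len(s) <= 2]
assert euler_characteristic(circle) == 0
```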

Proposition 1

\( \displaystyle {E(\chi (n)) = - \frac {1}{t} \sum \limits _{j=1}^{n+1} (-1)^{j} \binom {n+1}{j} t^{2^{j}}} . \)

This follows directly from the discussion above, together with the linearity of expected value: a simplex on j vertices (which has dimension j − 1 and contributes (−1)j− 1 to χ) belongs to K precisely when all of its \(2^{j}-1\) non–empty subsimplices are accepted by their coin flips, an event of probability \(t^{2^{j}-1}\), and there are \(\binom {n+1}{j}\) such simplices. Straightforward manipulation of the equation E(χ(n)) = 1 yields the following result.

Corollary 1

The “expected random subcomplex” of \({\Delta}^{n-1}\) is not contractible if t is not a root of the polynomial

$$ p_n(t) := \sum\limits_{k=0}^n (-1)^k \binom{n}{k} t^{2^k} . $$
(1)
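For completeness, the manipulation behind this statement can be written out (this rewriting is ours): replacing n by n − 1 in Proposition 1 and comparing with (1) gives

$$ t \left[ 1 - E(\chi(n-1)) \right] = t + \sum\limits_{j=1}^{n} (-1)^{j} \binom{n}{j} t^{2^{j}} = \sum\limits_{k=0}^{n} (-1)^{k} \binom{n}{k} t^{2^{k}} = p_{n}(t) , $$

so that, for t ∈ (0,1], the equation E(χ(n − 1)) = 1 is equivalent to pn(t) = 0.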

Motivation for studying the roots of pn(t)

Corollary 1 indicates that the random process generally produces homotopically non–trivial complexes—with possible exceptions when the probability t is one of the finite number of real roots of the polynomial pn(t) in the interval [0,1]. Furthermore, our study of these polynomials suggests that they always change sign at the roots, which means that the algebraic topology of the random complexes may have a phase transition or threshold of some kind at these particular t values. The study of thresholds separating different kinds of behavior is a major strand in the well–established theory of random graphs.

3 Anatomy of the polynomials

We are interested in the roots of the sparse polynomials pn(t) of degree \(2^{n}\) defined for n ≥ 1 by (1). The first few instances of these polynomials are

$$ \begin{array}{@{}rcl@{}} p_1(t) &=& t-t^2 , \\ p_2(t) &=& t-2 t^2+t^4 , \\ p_3(t) &=& t-3 t^2+3 t^4-t^8 , \\ p_4(t) &=& t-4 t^2+6 t^4-4 t^8+t^{16} , \\ p_5(t) &=& t-5 t^2+ 10 t^4-10 t^8+5 t^{16}-t^{32} , \\ p_6(t) &=& t-6 t^2+ 15 t^4-20 t^8+ 15 t^{16}-6 t^{32}+t^{64} , \end{array} $$

and one can easily verify that they satisfy the simple recursion relation

$$ p_{n+1}(t) = p_n(t)-p_n(t^2) . $$
(2)
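The recursion is easy to check symbolically; a minimal sketch in Python with sympy (our own illustrative choice of tool, not the Maple computations reported later) is:

```python
from math import comb
import sympy as sp

t = sp.symbols('t')

def p(n):
    """p_n(t) = sum_{k=0}^{n} (-1)^k binom(n,k) t^(2^k), as in (1)."""
    return sum((-1) ** k * comb(n, k) * t ** (2 ** k) for k in range(n + 1))

# Check the recursion p_{n+1}(t) = p_n(t) - p_n(t^2) for small n.
for n in range(1, 7):
    assert sp.expand(p(n + 1) - p(n) + p(n).subs(t, t ** 2)) == 0
```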

For all n ≥ 1, t = 0 is obviously a root of pn(t), and t = 1 is also a root, since

$$ p_{n}(1) = \sum\limits_{k=0}^{n} (-1)^{k} \binom{n}{k} = 0 . $$

These are simple roots of pn(t), since \(p_{n}^{\prime }(0)=1\) and \(p_{n}^{\prime }(1)=(-1)^{n}\). Thus, the factor (1 − t)t can be extracted from pn(t) to obtain a polynomial \(\tilde {p}_{n}(t)\) of the modestly lower degree \(2^{n}-2\), although \(\tilde {p}_{n} (t)\) is no longer sparse. From an abstract algebraic point of view, we have verified using Maple that \(\tilde {p}_{n}(t)\) is irreducible over the rational numbers for 2 ≤ n ≤ 10.

Clearly, t = 0 and t = 1 are the only roots of p1(t), while p2(t) has the additional real roots \(t=\textstyle \frac {1}{2}\displaystyle (-1\pm \sqrt {5})\), one negative and one positive. Using Maple, we have verified that the Galois group of the splitting field E of \(\tilde {p}_{3}(t)\) is isomorphic to the full symmetric group Sym(6), and since this group is not solvable, the roots of p3(t) (apart from 0 and 1) cannot be expressed in terms of radicals. It seems likely that this is also true for all n ≥ 3, although the degree of p4(t) is already too high for Maple to cope with.

Since the polynomial (1) has n sign changes in its coefficients, Descartes’ law of signs [12] indicates that the number of its positive real roots is either n or less than n by an even amount. We show below that pn(t) has exactly n positive roots (n − 1 roots on (0,1) together with the root t = 1) and identify an “interlacing” property of the roots for successive n values, i.e., the roots of pn+ 1(t) on (0,1) lie within intervals delineated by the roots of pn(t).

Lemma 1

For n ≥ 1 the polynomials (1) have at most n + 1 roots on [0,1] and their derivatives have at most n roots on (0,1).

Proof

Since pn(t) exhibits n sign changes in its coefficients, Descartes’ law of signs indicates that it has at most n positive roots, including t = 1. Since t = 0 is also a root of pn(t), it has at most n + 1 roots on [0,1]. The derivative of pn(t), namely

$$ p_{n}^{\prime}(t) = \sum\limits_{k=0}^{n} (-1)^{k}\binom{n}{k} 2^{k} t^{2^{k}-1} , $$

also exhibits n coefficient sign changes, so \(p_{n}^{\prime }(t)\) has at most n roots on (0,1). Note that 0 and 1 are not roots, since \(p_{n}^{\prime }(0)=1\) and \(p_{n}^{\prime }(1)=(-1)^{n}\). □

Proposition 2

Consider a polynomial f(t) satisfying:

  • f(t) has only simple roots on [0,1] including 0 and 1.

  • r < s² for consecutive root pairs r,s of f(t) on [0,1].

  • f′(t) has exactly one root between r and s.

Then, the polynomial g(t) = f(t) − f(t²) inherits the following properties:

  • g(t) has one simple root w between the roots r,s of f(t).

  • The root w of g(t) actually lies between \(\sqrt {r}\) and s.

  • Any two consecutive roots v,w of g(t) satisfy v < w².

Proof

Let q,r,s be consecutive roots of f(t) on t ∈ [0,1]. Then, \(r<\sqrt {r}<s\) and q < r² ≤ t² for r ≤ t ≤ s. Now the values of f(t) on [r,s] are replicated by f(t²) on \([ \sqrt {r},\sqrt {s} ]\). In particular, f(t) and f(t²) must have opposite signs on \([ r,\sqrt {r} ]\) and \([ s,\sqrt {s} ]\), so g(t) cannot vanish on these intervals (see Fig. 1). For simplicity, we assume that f(t) < 0 for r < t < s (an analogous argument holds when f(t) > 0). Then, f(t) < 0 = f(t²) at \(t=\sqrt {r}\), and 0 = f(t) > f(t²) at t = s. Thus, since g(t) < 0 at \(t= \sqrt {r}\) and g(t) > 0 at t = s, the interval \((\sqrt {r},s)\) must contain a root w of g(t). Moreover, since

$$ g^{\prime}(t) = f^{\prime}(t)-2 t f^{\prime}(t^{2}) , $$

and f′(t) > 0 and f′(t²) < 0 at t = w when f(t) < 0 for r < t < s, we have g′(w) > 0, so w is a simple root of g(t).

Fig. 1
figure 1

The graphs (red) of p4(t) and (blue) of p4(t²), showing consecutive roots r,s and \(\sqrt {r},\sqrt {s}\). Observe that p4(t), p4(t²) have opposite signs on the intervals \([ r,\sqrt {r} ]\) and \([ s,\sqrt {s} ]\) but the same sign on \([ \sqrt {r},s ]\), which contains a root of p5(t), indicated by the dashed line, where the graphs of p4(t), p4(t²) cross. Note also that p4(t), p4(t²) have derivatives of opposite sign on \([ \sqrt {r},s ]\)

Finally, consecutive roots v,w of g(t) lying between successive root pairs q,r and r,s of f(t) must satisfy

$$ q < \sqrt{q} < v < r < \sqrt{r} < w < s , $$

and consequently it is clear that v < w². □

Theorem 1

For n ≥ 1, the root structure of the polynomials (1) on [0,1] may be characterized as follows:

  • pn(t) has exactly n + 1 simple roots on [0,1] including t = 0 and t = 1.

  • The roots exhibit a “super–quadratic distribution,” i.e., r < s² for any pair of consecutive roots r,s of pn(t).

Proof

We argue by induction. First, the stated properties are easily verified by direct calculation for small n values. Now suppose that they hold for a given n ≥ 1. Under this supposition, Rolle’s theorem and Lemma 1 together indicate that pn(t) has exactly n extrema on (0,1), which isolate its n + 1 roots on [0,1]. Consequently, pn(t) satisfies the conditions stipulated for the polynomial f(t) in Proposition 2, and pn+ 1(t), as defined by the recursion relation (2), has the properties of the polynomial g(t) in Proposition 2. This implies that pn+ 1(t) has n + 2 simple roots on [0,1] (including t = 0 and t = 1) with a super–quadratic distribution, i.e., v < w² for any pair of consecutive roots v,w of pn+ 1(t). This completes the induction proof. □

As a direct consequence of Theorem 1, we make the following observations.

Remark 1

By Descartes’ law of signs, the polynomials pn(t) have no real roots for t > 1.

Corollary 2

The n roots \(t_{n+1,1},\ldots ,t_{n+1,n}\) of pn+ 1(t) on (0,1) interlace the n − 1 roots \(t_{n,1},\ldots ,t_{n,n-1}\) of pn(t) on (0,1), i.e., with \(t_{n,0}=0\) and \(t_{n,n}=1\) they satisfy

$$ t_{n,k-1} < t_{n+1,k} < t_{n,k} \quad \text{for} \ k=1,\ldots,n . $$
(3)

The ordering (3) is just a weaker form of the relation \(\sqrt {t_{n,k-1}} < t_{n+1,k} < t_{n,k}\) for k = 1,…,n established in Theorem 1.

The remarkable behavior of the graph of pn(t), illustrated in Fig. 2 for t ∈ [0,1] in the case n = 10, makes it an interesting test case for polynomial root solvers. Using a linear scale in t, only the first 6 of its 9 roots on (0,1) are clearly discernible. As n increases, the graph of pn(t) on [0,1] oscillates with increasing frequency but decreasing amplitude between negative and positive values as t → 1, and the roots become tightly clustered. For t > 1, pn(t) rapidly diverges to +∞ or −∞ according to whether n is even or odd.

Fig. 2
figure 2

Graph of the polynomial p10(t) over the interval t ∈ (0,1)

4 Computing the roots on (0,1)

At the cost of sacrificing certainty, numerical computations permit greater scope and efficiency than symbolic computations in computing the real roots of (1) on t ∈ (0,1). In particular, the Bernstein form

$$ f(t) = \sum\limits_{i=0}^{m} c_{i} \binom{m}{i}(1-t)^{m-i}t^{i} $$

of a degree–m polynomial f(t) on t ∈ [0,1] offers significant computational advantages, including:

  • Interpolation of end–point values: f(0) = c0 and f(1) = cm.

  • Convex hull property: \(\min _{i} c_{i}\le f(t)\le \max _{i} c_{i}\) for t ∈ [0,1].

  • Variation–diminishing property: # of roots on (0,1) = # sign variations of (c0,…,cm) − 2k for some integer k ≥ 0.

  • Subdivision algorithm: for any point τ ∈ (0,1), the Bernstein coefficients on the subintervals [0,τ] and [τ,1] can be computed through recursive convex combinations of the coefficients c0,…,cm.

  • Numerical stability: f(t) on t ∈ [0,1] is always less sensitive to uniform perturbations in its Bernstein coefficients than in its power coefficients.

A scheme to isolate the real roots of a polynomial on t ∈ (0,1) may be based on the variation–diminishing property and subdivision algorithm. See [3] for a comprehensive review of the properties of the Bernstein form—further details on the numerical stability properties may be found in [4, 5].
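A minimal sketch of such a scheme is given below (our own simplified version; it bisects at the interval midpoint and does not handle the degenerate case of a root falling exactly on a subdivision point):

```python
def de_casteljau_split(c, tau=0.5):
    """Split Bernstein coefficients on [0,1] into those of the subintervals
    [0,tau] and [tau,1] by repeated convex combinations."""
    left, right, level = [], [], list(c)
    while level:
        left.append(level[0])
        right.append(level[-1])
        level = [(1 - tau) * a + tau * b for a, b in zip(level, level[1:])]
    return left, right[::-1]

def sign_variations(c):
    nonzero = [x for x in c if x != 0]
    return sum(1 for a, b in zip(nonzero, nonzero[1:]) if a * b < 0)

def isolate_roots(c, a=0.0, b=1.0, depth=40):
    """Return subintervals of [a,b] whose Bernstein coefficients show exactly
    one sign variation, hence exactly one root (variation-diminishing
    property); intervals with more variations are bisected recursively."""
    v = sign_variations(c)
    if v == 0:
        return []
    if v == 1 or depth == 0:
        return [(a, b)]
    cl, cr = de_casteljau_split(c)
    mid = 0.5 * (a + b)
    return isolate_roots(cl, a, mid, depth - 1) + isolate_roots(cr, mid, b, depth - 1)
```

For example, isolate_roots([0, 0.25, 1/6, -0.25, 0]) returns [(0.0, 1.0)], since these degree–4 coefficients exhibit a single sign variation and hence bracket exactly one root.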

4.1 Bernstein form of pn(t)

The change of variables t ∈ (0,∞) → u ∈ (0,1) specified by

$$ t = \frac{u}{1-u} $$
(4)

defines a bijective relation between t and u, since dt/du = \((1-u)^{-2}\) > 0 for u ∈ (0,1). Note also that t ∈ (0,1) and t ∈ (1,∞) correspond to \(u\in (0,\textstyle \frac {1}{2}\displaystyle )\) and \(u\in (\textstyle \frac {1}{2}\displaystyle ,1)\). Substituting (4) into (1) yields

$$ p_{n}(u) = \frac{1}{(1-u)^{2^{n}}} \sum\limits_{k=0}^{n} (-1)^{k} \binom{n}{k} (1-u)^{2^{n}-2^{k}}u^{2^{k}} . $$

In computing the positive roots of (1), we omit the factor \(1/(1-u)^{2^{n}}\) since it is non–zero and finite on (0,1). The remaining term \(f_{n} (u)=(1-u)^{2^{n}}p_{n}(u)\) can be expressed in the Bernstein basis of degree 2n on u ∈ [0,1] as

$$ f_n(u) = \sum\limits_{i=0}^{2^n} c_i \binom{2^n}{i} (1-u)^{2^n-i}u^i , $$
(5)

where \(c_{i}=0\) if \(i\neq 2^{k}\) for k = 0,…,n and

$$ c_i = (-1)^k \frac{\displaystyle\binom{n}{k}} {\displaystyle\binom{2^n}{2^k}} \quad \text{if} \ i=2^k \ \text{for} \ k=0,\ldots,n . $$
(6)

The form (5) retains the sparsity of (1), with only n + 1 non–zero coefficients.
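For reference, the coefficients (6) are trivial to generate in exact rational arithmetic; a minimal Python sketch (using the standard fractions module, our own choice) is:

```python
from fractions import Fraction
from math import comb

def sparse_bernstein_coeffs(n):
    """Coefficients c_0,...,c_{2^n} of f_n(u) in (5): c_i is given by (6)
    when i = 2^k for some k, and vanishes otherwise."""
    c = [Fraction(0)] * (2 ** n + 1)
    for k in range(n + 1):
        c[2 ** k] = Fraction((-1) ** k * comb(n, k), comb(2 ** n, 2 ** k))
    return c
```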

Although the Bernstein coefficients of the restriction of (5) to \(u\in \left (0,\textstyle \frac {1}{2}\displaystyle \right )\) can be computed by the de Casteljau subdivision algorithm [2], it is preferable to derive them from first principles. Setting

$$ p_n(t) = \sum\limits_{l=0}^{2^n} c_l \binom{2^n}{l}(1-t)^{2^n-l} t^l , $$
(7)

the Bernstein coefficients \(c_{0},\ldots ,c_{2^{n}}\) of pn(t) on t ∈ [0,1] may be determined as follows. For each k, the kth term of (1) is multiplied by \(1=[ (1-t)+t ]^{2^{n}-2^{k}}\) to give

$$ p_{n}(t) = \sum\limits_{k=0}^{n} (-1)^{k}\binom{n}{k} t^{2^{k}} [ (1-t)+t ]^{2^{n}-2^{k}} , $$

and through binomial expansions of the factors \([ (1-t)+t ]^{2^{n}-2^{k}}\), we obtain

$$ \begin{array}{@{}rcl@{}} p_n(t) \!\! &=& \!\! \sum\limits_{k=0}^n (-1)^k\binom{n}{k} t^{2^k} \sum\limits_{j=0}^{2^n-2^k} \binom{2^n-2^k}{j}(1-t)^{2^n-2^k-j} t^j \\ \!\! &=& \!\! \sum\limits_{k=0}^n \sum\limits_{j=0}^{2^n-2^k} (-1)^k \binom{n}{k}\binom{2^n-2^k}{j}(1-t)^{2^n-2^k-j} t^{2^k+j} . \end{array} $$

By setting \(l=2^{k}+j\), re–arranging the order of summation, and performing some manipulations on the binomial coefficients, the Bernstein form of pn(t) on t ∈ [0,1] is determined to have coefficients defined by c0 = 0 and

$$ c_l = \frac{1}{{\displaystyle\binom{2^n}{l}}} \sum\limits_{k=0}^{k_{\max}} (-1)^{k} \binom{n}{k}\binom{2^n-2^k}{l-2^k} , \quad l=1,\ldots,2^n , $$
(8)

\(k_{\max }=\lfloor \log _{2} l\rfloor \) being the largest integer k with \(2^{k}\le l\). Note that \(c_{2^{n}}=0\), since t = 1 is a root of pn(t). Because of the restricted range of summation, and the appearance of the summation index as an exponent in the binomial coefficients, there is no obvious further reduction of the sum (8).
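A direct transcription of (8) into exact rational arithmetic (a sketch relying on Python's unbounded integers, rather than the 64–bit integer strategy described next) reads:

```python
from fractions import Fraction
from math import comb

def bernstein_coeffs(n):
    """Coefficients c_0,...,c_{2^n} of p_n(t) in the degree-2^n Bernstein
    basis on [0,1], evaluated exactly from equation (8)."""
    m = 2 ** n
    c = [Fraction(0)] * (m + 1)
    for l in range(1, m + 1):
        s = sum((-1) ** k * comb(n, k) * comb(m - 2 ** k, l - 2 ** k)
                for k in range(n + 1) if 2 ** k <= l)
        c[l] = Fraction(s, comb(m, l))
    return c

# Spot check against the n = 2 values listed below: 0, 1/4, 1/6, -1/4, 0.
assert bernstein_coeffs(2) == [0, Fraction(1, 4), Fraction(1, 6), Fraction(-1, 4), 0]
```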

The binomial coefficients appearing in (8) can be very large, because of the exponential dependence on n. Goetgheluck [8, 9] describes an algorithm to represent any binomial coefficient through its factorization

$$ \binom{m}{l} = \prod\limits_{i=1}^{r} p_i^{ e_i} , $$
(9)

where {p1,p2,p3,…} = {2,3,5,…} is the ordered set of prime numbers, and e1,e2,e3,… are their corresponding exponents. Since primes larger than m cannot appear in (9), the index r is such that \(p_{r}\le m\) and \(p_{r+1}>m\). The form (9) allows exact representation of high–order binomial coefficients, and also their products and ratios, since multiplication and division are performed by adding or subtracting their prime exponents. Having determined the sum in (8) exactly as an integer, the coefficient cl is obtained by a single floating–point division. Nevertheless, using extended–precision integer arithmetic (the long long data type in the C programming language), this approach is only feasible for n ≤ 6, since larger values incur overflow in evaluating the sums in (8). Thus, for n ≤ 6, the prime factorization (9) is used in evaluating the binomial coefficients in (8), and for n > 6, they are evaluated using double–precision floating–point arithmetic.
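One simple way to obtain such a factorization, sketched below, uses Legendre's formula for the exponent of a prime in a factorial; this is our own rendering, not necessarily the algorithm of [8, 9]:

```python
from math import comb, prod

def primes_up_to(m):
    """Sieve of Eratosthenes."""
    sieve = [False, False] + [True] * (m - 1)
    for i in range(2, int(m ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def legendre(j, p):
    """Exponent of the prime p in j! (Legendre's formula)."""
    e, q = 0, p
    while q <= j:
        e += j // q
        q *= p
    return e

def binomial_prime_exponents(m, l):
    """Exponents e_i (possibly zero) with binom(m,l) = prod p_i ** e_i."""
    return {p: legendre(m, p) - legendre(l, p) - legendre(m - l, p)
            for p in primes_up_to(m)}

exps = binomial_prime_exponents(20, 8)
assert prod(p ** e for p, e in exps.items()) == comb(20, 8)
```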

The Bernstein representation (7) is “dense” in the sense that, apart from \(c_{0}=c_{2^{n}}=0\) since pn(0) = pn(1) = 0, the coefficients cl for \(1\le l\le 2^{n}-1\) are, in general, non–zero. For n ≤ 3, the non–zero coefficients of pn(t) are

$$ n=1: c_{1} = \frac{1}{2} , \qquad n=2: c_{1},c_{2},c_{3} = \frac{1}{4},\frac{1}{6},-\frac{1}{4} , $$
$$ n=3: c_{1},c_{2},c_{3},c_{4},c_{5},c_{6},c_{7} = \frac{1}{8},\frac{2}{14}, \frac{3}{56},-\frac{1}{10},-\frac{13}{56},-\frac{3}{14},\frac{1}{8} . $$

For n > 3, it is impractical to list the coefficients, and they are best displayed graphically. The Bernstein form (7) of pn(t) has control points specified by

$$ (l/2^{n},c_{l}) \quad \text{for} \ l=0,\ldots,2^{n} $$

and the control polygon is the piecewise–linear graph obtained by connecting them in order. The control polygon mimics the graph of pn(t), as illustrated in Fig. 3 for n = 5 and 6, and it can be refined by degree elevation or subdivision [2] so as to converge monotonically to pn(t).

Fig. 3
figure 3

Graphs of the polynomials p5(t) and p6(t) over the interval [0,1], on the left and right, together with their Bernstein–form control polygons

The roots of pn(t) on (0,1) can be isolated through a hierarchical binary subdivision, such that each subinterval exhibits exactly 0 or 1 sign change in its associated Bernstein coefficients. Moreover, a simple check for guaranteed convergence of Newton–Raphson iterations [11] to the unique real root on each isolating interval can be expressed in terms of these Bernstein coefficients and their first and second forward differences.

Using initial estimates obtained from the graphs of pn(t) for n ≤ 6, the roots on (0,1) were computed by Newton–Raphson iterations using both the power form (1) and Bernstein form (7). To evaluate pn(t) and its derivative, Horner’s method and the de Casteljau algorithm were used in the former and latter cases. To ensure the greatest accuracy, the Bernstein coefficients (8) were computed using the binomial coefficient factorizations (9).
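The two evaluation schemes and the Newton update can be sketched as follows (our own minimal versions; the tolerance and iteration cap are illustrative, and the convergence check mentioned above is not included):

```python
def horner(a, t):
    """Value and derivative of sum_i a[i] * t**i by Horner's method."""
    p, dp = 0.0, 0.0
    for coeff in reversed(a):
        dp = dp * t + p
        p = p * t + coeff
    return p, dp

def de_casteljau(c, t):
    """Value and derivative of a Bernstein-form polynomial on [0,1] with
    coefficients c, by repeated convex combinations."""
    m = len(c) - 1
    b = list(c)
    while len(b) > 2:
        b = [(1 - t) * x + t * y for x, y in zip(b, b[1:])]
    return (1 - t) * b[0] + t * b[-1], m * (b[-1] - b[0])

def newton(value_and_derivative, t0, tol=1e-15, itmax=100):
    """Newton-Raphson iteration t <- t - f(t)/f'(t)."""
    t = t0
    for _ in range(itmax):
        f, df = value_and_derivative(t)
        if df == 0.0:
            break
        step = f / df
        t -= step
        if abs(step) < tol:
            break
    return t
```

For instance, with the power coefficients [0, 1, -2, 0, 1] of p2(t), calling newton(lambda t: horner([0, 1, -2, 0, 1], t), 0.6) converges to the root \(\textstyle \frac {1}{2}\displaystyle (-1+\sqrt {5})\approx 0.618034\) identified in Section 3.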

Table 1 Computed roots of pn(t) on (0,1) for 2 ≤ n ≤ 6

The roots on t ∈ (0,1) for n ≤ 6 computed using the power and Bernstein forms were found to differ by less than \(10^{-15}\). They are enumerated in Table 1 to 10 decimal places. For n > 6, double–precision floating–point arithmetic was used to compute the binomial coefficients in (8), to avoid integer overflow. However, the roots computed using the power and Bernstein forms still differ by less than \(10^{-15}\) in most cases, and by \(10^{-13}\) for the roots clustered near t = 1 when n = 10. These roots are listed in Table 2 to 10 decimal places.

Table 2 Computed roots of pn(t) on (0,1) for 7 ≤ n ≤ 10
Fig. 4
figure 4

Distribution of real roots of the polynomials pn(t) over the interval t ∈ (0,1) for the cases 2 ≤ n ≤ 10 (the roots at t = 0 and t = 1 are omitted)

Fig. 5
figure 5

Distribution of real roots of the polynomials pn(t) over the interval t ∈ (0,1) for the cases 2 ≤ n ≤ 10, employing logarithmic scales for t and n

4.2 Distribution of the positive real roots

The computed roots of pn(t) on (0,1) for 2 ≤ n ≤ 10, as listed in Tables 1 and 2, are shown in Fig. 4 with linear scales for t and n, and in Fig. 5 with logarithmic scales. For n > 10, the computations are very cumbersome, and the clustering of roots near t = 1 makes them difficult to resolve. Figure 5 suggests a power–law dependence of the roots on n, but this is not exact. A least–squares fit to the smallest root of pn(t) for 2 ≤ n ≤ 10 gives

$$ t \approx a n^b , \quad a \approx 1.146625 , b \approx -1.036283 . $$
(10)
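One natural reading of this least–squares fit is an ordinary linear fit of log t against log n; a sketch (our own interpretation; the smallest roots must be supplied from Tables 1 and 2) is:

```python
import numpy as np

def power_law_fit(n_values, smallest_roots):
    """Fit t ~ a * n**b by linear least squares in log-log coordinates."""
    b, log_a = np.polyfit(np.log(n_values), np.log(smallest_roots), 1)
    return np.exp(log_a), b

# a, b = power_law_fit(np.arange(2, 11), smallest_roots)  # data from Tables 1-2
```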

As seen in Table 3, there is a noticeable discrepancy at the smaller n values, although the accuracy improves significantly if the fit is restricted to n ≥ 4.

Table 3 Comparison of smallest roots of the polynomials pn(t) for 2 ≤ n ≤ 10 with the values obtained from a power law least–squares fit of the form (10)

Since the tightly clustered roots near t = 1 are not easily distinguishable for the larger n values in Figs. 4 and 5, an alternative plot is presented in Fig. 6, showing their distances from t = 1 on a logarithmic scale in t.

Figure 7 illustrates the super–quadratic distribution of the roots of pn(t) identified in Theorem 1 for the case n = 10. The interlacing property of the roots of pn(t) and pn+ 1(t) is evident in Tables 1 and 2, and through a close inspection of Fig. 4.

4.3 The complete set of roots

Although we are mainly interested in the positive real roots of pn(t) on (0,1), we can easily identify some basic facts about its entire set of roots. First, as a consequence of Theorem 1 and Descartes’ law of signs, we observe that pn(t) has no real roots t > 1. Concerning the negative roots, we note that, whereas pn(t) has n coefficient sign variations, pn(−t) has only n − 1. Thus, Descartes’ law of signs implies that pn(t) has at least one negative real root for even n. Moreover, pn(−t) = pn(t) − 2t, and we have pn(− 1) = − 2 for any n, since pn(1) = 0. Hence, noting that pn(t) → +∞ as t → −∞ for even n, there must be a real root on (−∞,− 1) if n is even. The negative real roots of pn(t) can be identified as the negatives of the values t > 0 where the graphs of pn(t) and 2t intersect. For t ∈ (0,1), this first occurs when n = 10, as can be seen by noting in Fig. 2 that the graph of 2t crosses p10(t) twice.

Fig. 6
figure 6

Distance of the real roots of the polynomials pn(t) on (0,1) from t = 1 for the cases 2 ≤ n ≤ 10, plotted using a logarithmic scale for 1 − t

Fig. 7
figure 7

The “super–quadratic” distribution property \(t_{n,k}^{2}>t_{n,k-1}\) for k = 2,…,n − 1 of the roots tn,1,…,tn,n− 1 of the polynomial pn(t) for n = 10

Fig. 8
figure 8

The entire set of (real and complex) roots of pn(t) for n = 4,6,8,10

Finally, we briefly comment on the complete set of 2n (real and complex) roots of pn(t) which (as computed in Maple) are illustrated in Fig. 8 for the cases n = 4,6,8,10. It is seen that (most of) the complex roots cluster near a circle centered at the origin, whose radius approaches 1 as n increases.

5 Numerical stability of roots

It is known that, for a polynomial specified in the power and Bernstein forms

$$ f(t) = \sum\limits_{i=0}^{m} a_{i}t^{i} = \sum\limits_{i=0}^{m} c_{i} \binom{m}{i}(1-t)^{m-i}t^{i} , $$

on t ∈ [0,1], the values (and roots) of f(t) are systematically less sensitive to uncertainties of a uniform maximum relative magnitude 𝜖 in the Bernstein coefficients than in the power coefficients [5]. In fact, it has been shown [4] that among all non–negative polynomial bases on [0,1], the Bernstein basis is “optimally stable” with respect to such coefficient perturbations.

For the power form (1), the n + 1 non–zero coefficients have magnitudes ≥ 1 that (except for the lowest– and highest–order terms) increase with n. On the other hand, the Bernstein form (7) has \(2^{n}-1\) non–zero coefficients, of magnitude ≤ 1, that decrease with n. For both forms, the non–zero coefficients have alternating signs. A somewhat surprising result is that both representations admit computation of the roots of pn(t) with similar accuracy: it is only for the tightly clustered roots near t = 1 in the case n = 10 that the power form begins to exhibit slower and more erratic convergence of the Newton–Raphson iterations than the Bernstein form (see Table 4).

Table 4 Newton–Raphson iterations for the largest real root of p10(t) from the initial value t = 0.999966, using the Bernstein and power representations. Note that the computed values differ only in the 14th and 15th decimal places

This phenomenon may be understood as follows. For the power form (1), the n + 1 non–zero coefficients are known exactly a priori as integers. For the Bernstein form (7), on the other hand, the \(2^{n}-1\) non–zero coefficients must be computed from the expression (8). This can be done almost exactly for n ≤ 6 using “long” (64–bit) integers, but floating–point arithmetic must be employed for n > 6, incurring “initial” coefficient errors.

However, initial coefficient errors are not the only factors influencing the accuracy of the computed roots. In backward error analysis [14, 16], the result of each floating–point arithmetic operation is interpreted as an exact outcome for perturbed operands. By “backward” propagation of these perturbations through the algorithm, from the final result to the input values, the outcome of the computation can be regarded as exact for certain perturbed inputs. Knowing these input perturbations and the condition number for the problem (i.e., the sensitivity of the output values to changes in the input values), the accuracy of the final result can (in principle) be assessed.

The condition number of a simple real root t = τ of a polynomial pm(t) with coefficients a0,…,am in the basis {ϕ0(t),…,ϕm(t)} is conventionally defined [6, 7] as

$$ C(\tau) := \frac{\displaystyle{\sum\limits_{i=0}^{m} | a_{i}\phi_{i}(\tau) |}}{|p_{m}^{\prime}(\tau)|} . $$

In the limit 𝜖 → 0, the perturbation δτ in the root τ incurred by errors in the coefficients of uniform maximum relative magnitude 𝜖 satisfies

$$ | \delta\tau | \le C(\tau) \epsilon . $$
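Evaluating C(τ) is straightforward once the coefficients and the computed root are available; a sketch for both bases follows (our own; the Bernstein numerator and derivative are evaluated by de Casteljau convex combinations so that the large binomial coefficients never appear explicitly):

```python
def condition_number_power(a, tau):
    """C(tau) = sum_i |a_i| tau**i / |p'(tau)| for the power basis, 0 < tau < 1."""
    numerator = sum(abs(ai) * tau ** i for i, ai in enumerate(a))
    derivative = sum(i * ai * tau ** (i - 1) for i, ai in enumerate(a) if i > 0)
    return numerator / abs(derivative)

def bernstein_value(c, tau):
    """Evaluate a Bernstein-form polynomial on [0,1] by de Casteljau recursion."""
    b = list(c)
    while len(b) > 1:
        b = [(1 - tau) * x + tau * y for x, y in zip(b, b[1:])]
    return b[0]

def condition_number_bernstein(c, tau):
    """Same quantity for the Bernstein basis: the numerator is the polynomial
    with coefficients |c_i| evaluated at tau, and the derivative is
    m * sum_i (c_{i+1} - c_i) B_i^{m-1}(tau)."""
    m = len(c) - 1
    numerator = bernstein_value([abs(ci) for ci in c], tau)
    derivative = m * bernstein_value([c[i + 1] - c[i] for i in range(m)], tau)
    return numerator / abs(derivative)
```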

Table 5 lists the condition numbers CB and CP in the Bernstein and power bases for the 9 roots of the instance n = 10 of the polynomial (1). It is seen that the inequality CB < CP holds for all the roots, and the growth of CP as the roots approach 1 is noteworthy (whereas CB actually decreases). This explains the slower convergence to the largest root seen in Table 4.

Table 5 Root condition numbers for the Bernstein and power forms of p10(t)

The condition number values listed in Table 5 are actually very modest. This fact, coupled with the remarkable degree of agreement between the roots computed using the power and Bernstein forms, imparts confidence in the roots quoted herein. In proceeding beyond n = 10, it is likely that the Bernstein form will furnish better results for the roots tightly clustered near t = 1, although at a significantly higher computational cost.

6 Closure

In this study, several interesting theoretical properties of the polynomials pn(t) defined by (1) have been elucidated—including (i) the existence of a simple recursion relating pn+ 1(t) to pn(t); (ii) the fact that these polynomials have, for each n > 1, precisely n − 1 roots on t ∈ (0,1) and none for t > 1; (iii) the super–quadratic distribution of these roots, i.e., each root is smaller than the square of the next–largest root; and (iv) the interlacing of the roots of pn(t) and pn+ 1(t), wherein the roots of the latter on (0,1) lie within the intervals delineated by the roots of the former.

The unusual nature of the polynomials pn(t) poses formidable challenges for root computation algorithms on account of their high degrees and strong scale variations of their graphs. Nevertheless, it was observed that for n ≤ 10, their roots are remarkably well–conditioned, and admit accurate computation in ordinary double–precision arithmetic using both the power and Bernstein forms. Although the Bernstein form is more stable than the power form, it lacks the sparsity of the power form, and its advantages are only apparent with the tightly clustered roots of p10(t) near t = 1. Due to the exponential growth in the degree, root computations for pn(t) with n > 10 have not been attempted: these tax the capabilities of both symbolic and numeric methods, and special procedures may be needed to reliably address them.