1 Introduction

We seek complex roots of a degree d univariate polynomial p with real or complex coefficients. For a while the user choice for this problem has been (the package MPsolve) based on e.g. Erhlich-Aberth (simultaneous Newton-like) iterations. Their empirical global convergence (right from the start) is very fast, but its formal support is a long-known challenge, and the iterations approximate the roots in a fixed region of interest (ROI) about as slow as all complex roots.

In contrast, for the known algorithms subdividing a ROI, e.g., box, the cost of root-finding in a ROI decreases at least proportionally to the number of roots in it. Some recent subdivision algorithms have a proved nearly optimal complexity, are robust in the case of root clusters and multiple roots, and their implementation in [IPY18] a little outperforms MPsolve for ROI containing only a small number of roots, which is an important benefit in many computational areas.

The Local Clustering Problem. For a complex set \(\mathcal {S}\), \(\texttt {Zero}(\mathcal {S},p)\), or sometimes \(\texttt {Zero}(\mathcal {S})\), stands for the roots of p in \(\mathcal {S}\). \(\#(\mathcal {S},p)\) (or \(\#(\mathcal {S})\)) stands for the number of roots of p in \(\mathcal {S}\). Here and hereafter the roots are counted with their multiplicity.

We consider boxes (that is, squares with horizontal and vertical edges, parallel to coordinate axis) and discs \(D(c,r)=\{z\text { s.t. }|z-c|\le r\}\) on the complex plane. For such a box (resp. disc) \(\mathcal {S}\) and a positive \(\delta \) we denote by \(\delta \mathcal {S}\) its concentric \(\delta \)-dilation. We call a disc \(\Delta \) an isolator if \(\#(\Delta )>0\) and call it natural isolator if in addition \(\#(\Delta )=\#(3\Delta )\). A set \(\mathcal {R}\) of roots of p is called a natural cluster if there exists a natural isolator \(\Delta \) with \(\texttt {Zero}(\mathcal {R})=\texttt {Zero}(\Delta )\). The Local Clustering Problem (LCP) is to compute natural isolators for natural clusters together with the sum of multiplicities of roots in the clusters:

figure a

The basic tool of the nearly optimal subdivision algorithm of [BSS+16] for the LCP is the \(T^*\)-test for counting the roots of p in a complex disc (with multiplicity). It relies on Pellet’s theorem, involves approximations of the coefficients of p, and applies shifting and scaling the variable z and Dandelin-Gräffe’s root-squaring iterations. [IPY18] describes high-level improvement of this test, and CclusterFootnote 1, a C implementation of [BSS+16].

Our Contributions. Our new counting test, the \(P^*\)-test, for a pair of complex c and positive r computes the number \(s_0\) of roots of p in a complex disc \(\Delta \) centered at c with radius r. If the boundary \(\partial \Delta \) contains no roots of p, then

$$\begin{aligned} s_0 = \frac{1}{2\pi {\mathbf{i}}} \int _{\partial \Delta } \frac{p'(z)}{p(z)}dz,~\mathrm{for}~{\mathbf{i}}=\sqrt{-1}, \end{aligned}$$
(1)

by virtue of Cauchy’s theorem. By following [Sch82] and [Pan18], we approximate \(s_0\) by \(s_0^*\) obtained by evaluating \(p'/p\) on q points on the boundary \(\partial \Delta \) within the error bound \(|s_0 - s_0^*|\) in terms of q and the relative width of a root-free annulus around \(\partial \Delta \). Namely if \(\#(\frac{1}{2}\Delta )=\#(2\Delta )\) then for \(q=\lceil \log _2(d+4) + 2\rceil \) we recover exact value of \(s_0\) from \(s_0^*\).

A usual practice to ensure condition \(\#(\frac{1}{2}\Delta )=\#(2\Delta )\) when knowing a \(\rho >1\) so that \(\#(\frac{1}{\rho }\Delta )=\#(\rho \Delta )\) is to apply Dandelin-Gräffe’s iterations. In the test we propose here, such root-squaring operations can be applied implicitly by doubling the number q of evaluation points.

We give an effectiveFootnote 2 (i.e. implementable) description of this \(P^*\)-test, which involves no coefficients of p and can be applied to a polynomial p represented by a black box for its evaluation. For sparse polynomials and polynomials defined by recursive process such as Mandelbrot’s polynomials (see [BF00], or Eq. (3) below), the test is particularly efficient and the resulting acceleration of the clustering algorithm of [BSS+16] is particularly strong.

Our second (and less significant) contribution applies to polynomials with real coefficients: the roots of such polynomials are either real or appear in complex conjugated pairs. As a consequence, one can recover all the roots in a ROI \(B_0\) containing \(\mathbb R\) from the ones with positive imaginary parts. We show how to improve a subdivision scheme by leveraging of the latter property.

Every polynomial p and its product \(p\overline{p}\) with its complex conjugate \(\overline{p}\) belongs to this class and has additional property that the multiplicity of its real roots is even, but we do not assume the latter restriction.

We implemented and tested our improvements in Ccluster. For polynomials with real coefficients that are sparse or can be evaluated by a fast procedure, we achieved a 2.5 to 3 fold speed-up as shown in Table 1 by columns \(t_{old}/t_{new}\). When the ROI contains only a few solutions, Ccluster is, thanks to those improvements, a little more efficient than MPsolve (compare columns Ccluster local, \(t_{new}\) and MPsolve in Table 1). We give details on our experiments below.

Table 1. Running times in seconds of Ccluster, new and old versions, for computing clusters of roots in a small ROI (local) and a ROI containing all the roots, and MPsolve.

Implementation and Experiments. All the timings shown in this article are sequential times in seconds on a Intel(R) Core(TM) i7-7600U CPU @ 2.80 GHz machine with Linux. MPsolve is called with the command mpsolve -as -Gi -o16 -j1Footnote 3. Table 1 shows comparative running times of Ccluster and MPsolve on two families of polynomials, Mignotte and Mandelbrot’s polynomials, with real coefficients, defined below. Columns \(t_{new}\) (resp. \(t_{old}\)) show timings of Ccluster with (resp. without) the improvements described in this paper. Columns #Clus show the number of clusters found by two versions. We used both versions of Ccluster with \(\epsilon =2^{-53}\). Ccluster global refers to the ROI \([-500, 500]+{\mathbf{i}}[-500, 500]\), that contains all the roots of the tested polynomials; Ccluster local refers to an ROI containing only a few solutions. We used \([-0.5, 0.5]+{\mathbf{i}}[-0.5, 0.5]\) for Mignotte’s polynomials and \([-0.25, 0.25]+{\mathbf{i}}[-0.25, 0.25]\) for Mandelbrot’s polynomials.

The Mignotte’s polynomial of degree d and parameter \(a=14\) is:

$$\begin{aligned} \mathtt{Mignotte}_{d}(z) = z^d - 2(2^az-1)^2 \end{aligned}$$
(2)

It has a cluster of two roots near the origin whose separation is near the theoretical minimum separation bound. It is sparse and can be evaluated very fast. We define the Mandelbrot’s polynomial as \(\mathtt{Mandelbrot}_{1}(z)=1\) and

$$\begin{aligned} \mathtt{Mandelbrot}_{k}(z) = z\mathtt{Mandelbrot}_{k-1}(z)^2 + 1 \end{aligned}$$
(3)

\(\mathtt{Mandelbrot}_{k}(z)\) has degree \(2^{k}-1\). It can be evaluated with a straight line program. The 63 clusters of roots of \(\mathtt{Mandelbrot}_{6}(z)\) and \(\mathtt{Mignotte}_{64}(z)\) are depicted in Fig. 1.

Fig. 1.
figure 1

Left: 63 clusters of roots for a Mignotte polynomial of degree 64. Right: Clusters of roots for the Mandelbrot polynomial of degree 63.

Structure of the Paper. Our paper is organized as follows: in Sect. 2 we describe our \(P^*\)-test. In Sect. 3 we apply it to speeding up a clustering algorithm. In Sect. 4 we cover our root-finder for polynomials with real coefficients. Section 5 presents the results of our improvements. In the rest of the present section, we recall the related work and the clustering algorithm of [BSS+16].

1.1 Previous Works

Univariate polynomial root-finding is a long-standing and still actual problem; it is intrinsically linked to numerical factorization of a polynomial into the product of its linear factors. The algorithms of [Pan02] support record and nearly optimal bounds on the Boolean complexity of the solution of both problems of factorization and root-finding. The cost bound of the factorization is smaller by a factor of d, and both bounds differ from respective information lower bound by at most a polylogarithmic factor in the input size and in the bound on the required output precision. Root-finder supporting such bit complexity bounds are said to be nearly optimal. The algorithms of [Pan02] are involved and have never been implemented. User’s choice has been for a while the package of subroutines MPsolve (see [BF00] and [BR14]), based on simultaneous Newton-like (i.e. Ehrlich-Aberth) iterations. These iterations converge to all roots simultaneously with cubic convergence rate, but only locally, that is, near the roots; empirically they converge very fast also globally, right from the start, although formal support for this empirical behavior is a long-known research challenge. Furthermore these iterations compute a small number of roots in a ROI not much faster than all roots.

In contrast, recent approaches based on subdivision (as well as the algorithms of [Pan02]) compute the roots in a fixed ROI at a cost that decreases at least proportionally to the number of roots. Near-optimal complexity has been achieved both for the real case (see [PT13, PT16, SM16] that combines the Descartes rule of signs with Newton’s iterations and its implementation in [KRS16]) and the complex case. In the complex case [BSSY18] similarly combines counting test based on Pellet’s theorem with complex version of the QIR algorithm, which in turn combines Newton’s and secant iterations.

[BSS+16] extends the method of [BSSY18] for root clustering, i.e. it solves the LCP and is robust in the case of multiple roots; its implementation ([IPY18]) is a little more efficient than MPsolve for ROI’s containing only several roots; when all the roots are sought, MPsolve remains the user’s choice. The algorithms of [BSS+16] and [BSSY18] are direct successors of the previous subdivision algorithms of [Ren87] and [Pan00], presented under the name of Quad-tree algorithms (inherited from the earlier works by Henrici and Gargantini).

Besides Pellet’s theorem, counting test in ROI can rely on Eq. (1) and winding numbers algorithms (see, e.g., [HG69, Ren87] and [ZZ19]).

1.2 Solving the LCP

\(\varvec{C^0}\) and \(\varvec{C^*}\) Tests. The two tests \(C^0\) and \(C^*\) discard boxes with no roots of p and count the number of roots in a box, respectively. For a given complex disc \(\Delta \), \(C^0(\Delta ,p)\) returns either \(-1\) or 0, and returns 0 only if p has no root in \(\Delta \), while \(C^*(\Delta ,p)\) returns an integer \(k\ge -1\) such that \(k\ge 0\) only if p has k roots in \(\Delta \). Below, we may write \(C^0(\Delta )\) for \(C^0(\Delta ,p)\) and \(C^*(\Delta )\) for \(C^*(\Delta ,p)\).

In [BSS+16, BSSY18, IPY18], both \(C^0\) and \(C^*\) are based on the so called “soft Pellet test” denoted \(T^*(\Delta ,p)\) or \(T^*(\Delta )\) which returns an integer \(k\ge -1\) such that \(k\ge 0\) only if p has k roots in \(\Delta \):

$$\begin{aligned} \begin{array}{rl} C^0(\Delta ) := &{} \left\{ \begin{array}{rl} 0 &{} \text { if } T^*(\Delta ) = 0 \\ -1 &{} \text { otherwise }\\ \end{array} \right. \\ &{} \\ C^*(\Delta ) := &{} T^*(\Delta ). \end{array} \end{aligned}$$
(4)

Boxes, Quadri-Section and Connected Components. The box B centered in \(c=a+{\mathbf{i}}b\) with width w is defined as \([a-w/2, a+w/2] + {\mathbf{i}}[b-w/2, b+w/2]\). We denote by \(w(B)\) the width of B. We call containing disc of B the disc \(\Delta (B)\) defined as \(D(c,\frac{3}{4}w(B))\). We define the four children of B as the four boxes centered in \((a\pm \frac{w}{4})+{\mathbf{i}}(b\pm \frac{w}{4})\) with width \(\frac{w}{2}\).

Recursive subdivisions of a ROI \(B_0\) falls back to the construction of a tree rooted in \(B_0\). Hereafter we refer to boxes that are nodes (and possibly leafs) of this tree as the boxes of the subdivision tree of \(B_0\).

A component \(\mathcal {C}\) is a set of connected boxes. The component box \(B_{\mathcal {C}}\) of a component \(\mathcal {C}\) is a smallest square box subject to \(\mathcal {C}\subseteq B_{\mathcal {C}}\subseteq B_0\), where \(B_0\) is the initial ROI. We write \(\Delta (\mathcal {C})\) for \(\Delta (B_{\mathcal {C}})\) and \(w(\mathcal {C})\) for \(w(B_{\mathcal {C}})\). Below we consider components made up of boxes of the same width; such a component is compact if \(w(\mathcal {C})\) is at most 3 times the width of its boxes. Finally, a component \(\mathcal {C}\) is separated from a set S if \(\forall \mathcal {C}'\in S, 4\Delta (\mathcal {C})\cap \mathcal {C}'=\emptyset \) and \(4\Delta (\mathcal {C})\subseteq 2B_0\).

A Root Clustering Algorithm. We give in Algorithm 1 a simple root clustering algorithm based on subdivision of ROI \(B_0\). For convenience we assume that p has no root in \(2B_0\setminus B_0\) but this limitation can easily be removed. The paper [BSS+16] proves that Algorithm 1 terminates and output correct solution provided that the \(C^0\) and \(C^*\)-tests are as in Eq. (4).

Note that in the while loop of Algorithm 1, components with widest containing box are processed first; together with the definition of a separated component, this implies the following remark:

Remark 1

Let \(\mathcal {C}\) be a component in Algorithm 1 that passes the test in step 4. Then \(\mathcal {C}\) satisfies \(\#(\Delta (\mathcal {C}))= \#(4\Delta (\mathcal {C}))\).

2 Counting the Number of Roots in a Well Isolated Disc

In this section we cover a new test for counting the number of roots with multiplicity of p in a disc \(\Delta \) provided that the roots in \(\Delta \) are well isolated from the other roots of p. Let us first formalize this notion:

Definition 2

(Isolation ratio). A complex disc \(\Delta \) has isolation ratio \(\rho \) for a polynomial p if \(\rho >1\) and \(\texttt {Zero}(\frac{1}{\rho }\Delta )=\texttt {Zero}(\rho \Delta )\).

Let \(\texttt {Zero}(\Delta )=\{\alpha _1, \ldots , \alpha _{d_\Delta }\}\) and let \(m_i\) be the multiplicity of \(\alpha _i\). The h-th power sum of the roots in \(\Delta \) is the complex number

$$\begin{aligned} s_h=\sum \limits _{i=1}^{d_\Delta } m_i\times \alpha _i^h \end{aligned}$$
(5)

In our test, called hereafter \(P^*\)-test, we approximate the 0-th power sum \(s_0\) of the roots of p in \(\Delta \) equal to the number of roots of p in \(\Delta \) (counted with multiplicity). We obtain precise \(s_0\) from \(s_0^*\) where p and its derivative \(p'\) are evaluated on only a small number of points on the contour of \(\Delta \). For instance, if \(\Delta \) has isolation ratio 2 and p has degree 500, our test amounts to evaluating p and \(p'\) on \(q=11\) points; \(s_0\) is recovered from these values in O(q) arithmetic operations.

figure b

If p and its derivative can be evaluated at a low computational cost, e.g. when p is sparse or p is defined by a recurrence as the Mandelbrot polynomial (see [BF00][Eq. (16)] or Eq. (3) above), our \(P^*\)-test can be substantially cheaper to apply than the \(T^*\)-test presented above. Notice however that it requires the isolation ratio of \(\Delta \) (or at least a lower bound) to be known.

2.1 Approximation of the 0-th Power Sum of the Roots in a Disc

[Sch82] and [Pan18] give formulas for approximating the powers sums \(s_h\) of the roots in the unit disc. Here we compute \(s_0\) in any complex disc \(\Delta =D(c,r).\)

For a positive integer q, define

$$\begin{aligned} s_0^* = \frac{r}{q}\sum \limits _{g=0}^{q-1} \omega ^{g} \frac{p'(c+r\omega ^g)}{p(c+r\omega ^g)} \end{aligned}$$
(6)

where \(\omega =e^{\frac{2\pi {\mathbf{i}}}{q}}\) denotes a primitive q-th root of unity.

The theorem below shows that the latter number approximates the 0-th power sum with an error that can be made as tight as desired by increasing q, providing that \(\Delta \) has isolation ratio noticeably exceeding 1.

Theorem 3

Let \(\Delta \) have isolation ratio \(\rho \) for p, let \(\theta =1/\rho \), let \(s_0\) be the 0-th power sum of the roots of p in \(\Delta \), and let \(s_0^*\) be defined as in Eq. 6. Then

  1. (i)

    \(|s_0^*-s_0| \le \dfrac{d\theta ^{q}}{1-\theta ^{q}}\).

  2. (ii)

    Fix \(e>0\). If \(q=\lceil \log _{\theta }(\frac{e}{d+e}) \rceil \) then \(|s_0^*-s_0| \le e\).

Proof of Theorem 3. Let \(p_{\Delta }(z)\) be the polynomial \(p(c+rz)\). Thus \(p_{\Delta }'(z)=rp'(c+rz)\) and Eq. (6) rewrites \(s_0^* = \frac{1}{q}\sum \limits _{g=0}^{q-1} \omega ^{g} \frac{p_{\Delta }'(\omega ^g)}{p_{\Delta }(\omega ^g)}\). In addition, the unit disc D(0, 1) has isolation ratio \(\rho \) for \(p_{\Delta }\) and contains \(s_0\) roots of \(p_{\Delta }\). Then apply equation (12.10) in [Sch82] (with \(e^{-\delta }=\theta , e^\delta =\rho \)) to \(p_{\Delta }(z)\) to obtain (i). (ii) is a direct consequence of (i).

   \(\square \)

For example, if \(\Delta \) has isolation ratio 2, p has degree 500 and one wants to approximate \(s_0\) with an error less than 1/4, it suffices to apply formula in Eq. (6) for \(q=11\), that is to evaluate p and its derivative \(p'\) at 11 points.

Remark that in (ii), the required number q of evaluation points increases as the logarithm of \(\rho \): if \(\Delta \) has isolation ratio \(\sqrt{\rho }\) (resp. \(\rho ^2\)) instead of \(\rho \), \(\frac{1}{2}q\) (resp. 2q) evaluation points are required. Thus doubling the number of evaluation points has the same effect as root squaring operations.

2.2 Black Box for Evaluating a Polynomial on an Oracle Number

Our goal is to give an effective description of our \(P^*\)-test; to this end, let us introduce the notion of oracle numbers that correspond to black boxes giving arbitrary precision approximations of any complex number. Such oracle numbers can be implemented through arbitrary precision interval arithmetic or ball arithmetic. Let be the set of complex intervals. If , then is the maximum width of real and imaginary parts of .

For a number \(a\in \mathbb C\), we call oracle for a a function such that \(a\in {\mathcal {O}}_a(L)\) and \(w({\mathcal {O}}_a(L))\le 2^{-L}\) for any L. Let \({\mathcal {O}}_{\mathbb C}\) be the set of oracle numbers.

For a polynomial \(p\in \mathbb C[z]\), we call evaluation oracle for p a function , such that if \({\mathcal {O}}_a\) is an oracle for a and \(L\in \mathbb N\), then \(p(a)\in {\mathcal {I}}_p( {\mathcal {O}}_a, L )\) and \(w({\mathcal {I}}_p( {\mathcal {O}}_a, L ))\le 2^{-L}\).

We consider evaluation oracles \({\mathcal {I}}_p\) and \({\mathcal {I}}_{p'}\) for p and its derivative \(p'\). If p is given by \(d+1\) oracles for its coefficients, one can easily construct \({\mathcal {I}}_p\) and \({\mathcal {I}}_{p'}\) by using for instance Horner’s rule. However for some polynomials defined by a procedure, for instance the Mandelbrot polynomial (see Eq. (3)), one can construct fast evaluation oracles \({\mathcal {I}}_p\) and \({\mathcal {I}}_{p'}\) from the procedurial definition.

2.3 The \(P^*\)-test

Algorithm 2 counts the number of roots of p in a disc \(\Delta =D(c,r)\) having isolation ratio at least \(\rho \). For such a disc, any positive integer q and any integer \(0\le g < q\), one has \(p(c+r\omega ^g)\ne 0\). As a consequence, there exist an \(L'\) s.t \(\forall L\ge L', \forall 0\le g\le q-1, 0\notin {\mathcal {I}}_p({\mathcal {O}}_{c+r\omega ^g},L)\) and the intervals computed in step 4 of Algorithm 2 have strictly decreasing width as of \(L\ge L'\). This shows the termination of Algorithm 2. Its correctness is stated in the following proposition:

figure c

Proposition 4

Let k be the result of the call \(P^*({\mathcal {I}}_p, {\mathcal {I}}_{p'}, \Delta , \rho )\). If \(\Delta \) has isolation ratio at least \(\rho \) for p, then p has k roots in \(\Delta \) counted with multiplicity.

Proof of Proposition 4. Once the while loop in Algorithm 2 terminates, the interval contains \(s_0^*\) and . In addition, by virtue of statement (ii) of Theorem 3, one has \(|s_0^*-s_0|\le 1/4\), thus defined in step 7 satisfies: and . Since contains at most one integer, \(s_0\) is the unique integer in , and is equal to the number of roots in \(\Delta \).

   \(\square \)

3 Using the \(P^*\)-test in a Subdivision Framework

Let us discuss the use of the \(P^*\)-test as \(C^0\) and \(C^*\)-tests in order to speed up Algorithm 1. Table 2 covers runs of Algorithm 1 on Mignotte and Mandelbrot’s polynomials. t is the running time when \(C^0\) and \(C^*\) tests are defined by Eq. (4). Columns nb show the respective numbers of \(C^0\) and \(C^*\)-tests performed, column \(t_0\) and \(t_0/t\) (resp. \(t_*\) and \(t_*/t\)) show time and ratio of times spent in \(C^0\) (resp. \(C^*\)) tests when it is defined by Eq. (4).

One can readily use the \(P^*\)-test to implement the \(C^*\)-test by defining

$$\begin{aligned} C^*(\Delta ) := P^*({\mathcal {I}}_p, {\mathcal {I}}_{p'}, 2\Delta , 2) \end{aligned}$$
(7)

Following Remark 1, the \(C^*\)-test is called in Algorithm 1 for components \(\mathcal {C}\) satisfying \(\#(\Delta (\mathcal {C}))=\#(4\Delta (\mathcal {C}))\). Hence \(2\Delta (\mathcal {C})\) has isolation ratio 2 and by virtue of Proposition 4, \(C^*(\Delta (\mathcal {C}))\) returns \(r\ge 0\) only if \(\Delta (\mathcal {C})\) contains r roots.

However this would not imply much improvements in itself. Column \(t_*'\) in Table 2 shows the time that would be spent in \(C^*\)-tests if it was defined by Eq. (7): it is far less than \(t_*\), but the ratio of time spent in \(C^*\)-tests (see column \(t_*/t\)) is very small. In contrast, about 90% of the running time of Algorithm 1 is spent in \(C^0\)-tests (see column \(t_0/t\)). We propose to use a modified version of the \(P^*\)-test as a filter in the \(C^0\)-test to decrease its running time.

Table 2. Details on runs of Algorithm 1 on Mignotte and Mandelbrot’s polynomials.
figure d

3.1 An Approximate \(P^*\)-test

The approximate version of the \(P^*\)-test is aimed at being applied to a disc \(\Delta =D(c,r)\) with unknown isolation ratio. Unless \(\Delta \) has isolation ratio \(\rho >1\), the very unlikely case where for some \(0\le g<q\), \(p(c+r\omega ^g)=0\), leads to a non-terminating call of \(P^*({\mathcal {I}}_p, {\mathcal {I}}_{p'}, \Delta , \rho )\). Also, computed in step 7 of Algorithm 2 could contain no integer or an integer that is not \(s_0\). We define the \(\widetilde{P^*}\)-test specified in Algorithm 3 by modifying Algorithm 2 as follows:

  1. 1.

    after step 3, if an \({\mathcal {I}}_p({\mathcal {O}}_{c+r\omega ^g},L)\) contains 0, the result -2 is returned;

  2. 2.

    step 7 is replaced with: ,

  3. 3.

    after step 7, unless contains a unique integer, the result \(-1\) is returned.

Modification 1 ensures termination when \(\Delta \) does not have isolation ratio \(\rho >1\). With modification 2, can have width greater than 1 and contain more than one integer. With modification 3, the \(\widetilde{P^*}\)-test can return \(-1\) which means that no conclusion can be made. If \(\widetilde{P^*}({\mathcal {I}}_p, {\mathcal {I}}_{p'}, \Delta , \rho )\) returns a positive integer, this result has still to be checked, for instance, with the \(T^*\)-test.

In Table 2, column \(n_{-2}\) (resp. \(n_{-1}\)) shows the number of times \(\widetilde{P^*}({\mathcal {I}}_p,\) \({\mathcal {I}}_{p'}, \Delta , 2)\) returns \(-2\) (resp. \(-1\)) when applied in place of \(T^*(\Delta )\) in the \(C^0\)-test. Column \(n_{err}\) shows the number of times the conclusion of \(\widetilde{P^*}\) was wrong, and \(t_0'\) shows the total time spent in \(\widetilde{P^*}\)-tests.

3.2 Using the \(P^*\) and \(\widetilde{P^*}\)-test in a Subdivision Framework

Our improvement of Algorithm 1 is based on the following heuristic remarks. First, it is very unlikely that \(\widetilde{P^*}({\mathcal {I}}_p, {\mathcal {I}}_{p'}, \Delta , 2)\) returns -2 (see column \(n_{-2}\) in Table 2). Second, when \(P^*({\mathcal {I}}_p, {\mathcal {I}}_{p'}, \Delta , 2)\) returns \(k\ge 0\), it is very likely that \(\Delta \) contains k roots counted with multiplicity (see column \(n_{err}\) in Table 2).

Fig. 2.
figure 2

Computing clusters for \(\mathtt{Mignotte}_{64}\) in the ROI \([-2, 2]+{\mathbf{i}}[-2, 2]\). Left: The subdivision tree for Algorithm 1. Right: The subdivision tree for Algorithm 5.

We define the \(C^0\)-test as follows:

$$\begin{aligned} C^0(\Delta ) := \left\{ \begin{array}{rl} -1 &{} \text { if } \widetilde{P^*}({\mathcal {I}}_p, {\mathcal {I}}_{p'}, \Delta , 2)\notin \{-2, 0\}, \\ -1 &{} \text { if } \widetilde{P^*}({\mathcal {I}}_p, {\mathcal {I}}_{p'}, \Delta , 2)\in \{-2, 0\} \text { and } T^*(\Delta )\ne 0, \\ 0 &{} \text { if } \widetilde{P^*}({\mathcal {I}}_p, {\mathcal {I}}_{p'}, \Delta , 2)\in \{-2, 0\} \text { and } T^*(\Delta )= 0. \\ \end{array} \right. \end{aligned}$$
(8)

If \(C^0(\Delta )\) is defined in Eq. (8), it returns 0 only if \(\Delta \) contains no root. Thus Algorithm 1 with \(C^0\) and \(C^*\)-tests defined by Eqs. (8) and (7) is correct.

Remark now that if a square complex box B of width w does not contain root and is at a distance at least \(\frac{3}{2}w\) from a root, then \(\Delta (B)\) has isolation ratio 2, and \(\widetilde{P^*}({\mathcal {I}}_p, {\mathcal {I}}_{p'}, \Delta (B), 2)\) returns 0 or \(-2\). As a consequence, the termination of Algorithm 1 with \(C^0\) and \(C^*\)-tests defined in Eqs. (8) and (7) amounts to the termination of Algorithm 1 with \(C^0\) and \(C^*\) defined in Eq. (4).

4 Clustering Roots of Polynomials with Real Coefficients

We consider here the special case where \(p\in \mathbb R[z]\), and show how to improve a subdivision algorithm for solving the LCP. We propose to leverage on the geometric structure of the roots of p, that are either real, or imaginary and come in complex conjugated pairs: if \(\alpha \in \mathbb C\) is a root of p so is \(\overline{\alpha }\) where \(\overline{\alpha }\) is the complex conjugate of \(\alpha \). The modified subdivision algorithm we propose deals only with the boxes of the subdivision tree of the ROI \(B_0\) that have a positive imaginary part; the roots with positive imaginary parts are in the latter boxes. The roots with negative imaginary parts are implicitly represented by the former ones. In Fig. 2 are shown two subdivision trees constructed for clustering roots of a Mignotte polynomial of degree 64; the left-most one is obtained when applying Algorithm 1; the right-most one results of our improvement.

Below, we suppose that \(B_0\) is symmetric with respect to the real axis and that p has no root in \(2B_0\setminus B_0\). These two limitations can easily be removed.

figure e

Notations. Let B be a box centered in c. We define its conjugate \(\overline{B}\) as the box centered in \(\overline{c}\) with width \(w(B)\). We say that B is imaginary positive (resp. imaginary negative) if \(\forall b\in B\), \(Im(b)>0\) (resp. \(Im(b)<0\)).

Let \(\mathcal {C}\) be a component of boxes of the subdivision tree of \(B_0\). We define \(\overline{\mathcal {C}}\) as the component which boxes are the conjugate of the boxes of \(\mathcal {C}\). We call conjugate closure of \(\mathcal {C}\), and we denote it by \(\mathcal {C}_{\overline{\cup }}\) the set of boxes \(\mathcal {C}\cup (\overline{\mathcal {C}}\setminus \mathcal {C})\). If \(\mathcal {C}\) intersects \(\mathbb R\), \(\mathcal {C}_{\overline{\cup }}\) is a component. We say that \(\mathcal {C}\) is imaginary positive (resp. imaginary negative) if each box in \(\mathcal {C}\) is imaginary positive (resp. imaginary negative).

Solving the LCP for Polynomials with Real Coefficients. We describe in Algorithm 4 a procedure to bisect a component, that discards boxes that are imaginary negative in addition to those that contain no root.

Our algorithm for solving the LCP for polynomials with real coefficients is presented in Algorithm 5. It maintains in the queue Q only components of boxes that are imaginary positive or that intersect the real line. Components with only imaginary negative boxes are implicitly represented by the imaginary positive ones. Components that intersect the real line are replaced by their conjugate closure. Components in Q are ordered by decreasing width of their containing boxes. The termination of Algorithm 5 is a consequence of the termination of Algorithm 1 that is proved in [BSS+16].

Let \(\{(\mathcal {C}^1,m^1),\ldots ,(\mathcal {C},m^\ell )\}\) be the list returned by Algorithm 5 called for arguments \(p,B_0,\epsilon \). Then \(\{(\Delta (\mathcal {C}^1),m^1),\ldots ,(\Delta (\mathcal {C}^\ell ),m^\ell )\}\) is a solution of the LCP problem for \(p,B_0,\epsilon \), i.e.:

  1. (i)

    the \(\Delta (\mathcal {C}^i)\)’s are pairwise disjoint with radius less that \(\epsilon \),

  2. (ii)

    \(\forall 1\le i \le \ell \), \((\mathcal {C}^i,m^i)\) satisfies \(\#(\Delta (\mathcal {C}^i))=\#(3\Delta (\mathcal {C}^i))=m^i\),

  3. (iii)

    \(\texttt {Zero}(B_0,p) \subseteq \bigcup _{i=1}^{\ell } \texttt {Zero}(\Delta (\mathcal {C}^i),p) \subseteq \texttt {Zero}(2B_0,p)\).

In what follow we may write R for the list of connected components in R. (i), (ii) and (iii) are direct consequences of the following proposition:

Proposition 5

Consider Q and R after any execution of the while loop in Algorithm 5. Decompose Q in two lists \(Q^1\) and \(Q^2\) containing respectively the imaginary positive components of Q and the non imaginary components of Q. Note \(\overline{Q^1}\) the list of the conjugates of the components in \(Q^1\) and \(Q^2_{\overline{\cup }}\) the list of the conjugate closures of the components in \(Q^2\), and let \(Q_{\overline{\cup }}\) be \(\overline{Q^1}\cup Q^2_{\overline{\cup }}\). One has:

figure f
  1. (1)

    any \(\alpha \in \texttt {Zero}(B_0)\) is in \(R\cup Q \cup Q_{\overline{\cup }}\),

  2. (2)

    any \(\mathcal {C}\in R\) is separated from \((R\setminus \{\mathcal {C}\})\cup Q \cup Q_{\overline{\cup }}\),

  3. (3)

    any \((\mathcal {C},m)\) in R is such that \(m=\#(\Delta (\mathcal {C}))=\#(3\Delta (\mathcal {C}))\).

Proposition 5 is a consequence of Remark 1 and the following remark.

Remark 6

Let \(p\in \mathbb R[z]\) and \(\mathcal {C}\) be a component. If \(\mathcal {C}\) is imaginary negative or imaginary positive and if there exists m such that \(m=\#(\Delta (\mathcal {C}))=\#(3\Delta (\mathcal {C}))\), then \(m=\#(\Delta (\overline{\mathcal {C}}))=\#(3\Delta (\overline{\mathcal {C}}))\).

5 Numerical Results

We implemented the two improvements of Sects. 3 and 4 in Ccluster. CclusterO refers to the original version of Ccluster. Both CclusterR and CclusterPs implement Algorithm 5. In CclusterPs, \(C^0\) and \(C^*\) are defined by Eqs. (8) and (7).

Testing Suite. We tested our improvements on Mignotte and Mandelbrot’s polynomials and on Bernoulli and Runnel’s polynomials: the Bernoulli polynomial of degree d is \(\mathtt{Bernoulli}_{d}(z)=\sum _{k=0}^{d} {{d}\atopwithdelims (){k}}b_{d-k}z^k\) where the \(b_i\)’s are the Bernoulli numbers. It has about d/2 non-zero coefficients and, as far as we know, cannot be evaluated substantially faster than with Horner’s scheme. It has real coefficients, and about 2/3 of its roots are real or imaginary positive (see left part of Fig. 3). Let \(r=2\), \(q_0(z)=1\), \(q_1(z)=z\) and \(q_{k+1}(z) = q_k(z)^r + zq_{k-1}(z)^{r^2}\). We define the Runnel’s polynomial of parameter k as \(\mathtt{Runnels}_{k}=q_k\). It has real coefficients, a multiple root (zero), and can be evaluated fast. The 107 distinct roots of \(\mathtt{Runnels}_{8}\) are drawn on right part of Fig. 3.

Fig. 3.
figure 3

Left: 64 clusters of roots for the Bernoulli polynomial of degree 64. Right: 107 clusters of roots for the Runnel’s polynomial of degree 170.

Results. Table 3 gives details concerning the execution of CclusterO, CclusterR and CclusterPs for polynomials with increasing degrees. We used \(\epsilon =2^{-53}\) and the ROI \(B_0=[-500, 500]+{\mathbf{i}}[-500, 500]\) that contains all the roots of all the considered polynomials. Column (#Clus,#Sols) shows the number of clusters and the total multiplicity found. Columns (depth, size) show the depth and the size (i.e. number of nodes) of the subdivision tree for each version. \(t_1\), \(t_2\) and \(t_3\) stand respectively for the running time in second of CclusterO, CclusterR and CclusterPs.

Algorithm 5 achieves speed up \(t_1/t_2\). It is almost 2 for Mignotte polynomials, since about half of its roots are above the real axis. This speed up is less important for the three other families of polynomials, which have a non-negligible ratio of real roots. The speed up achieved by using the \(P^*\)-test is \(t_2/t_3\). It is significant for Mignotte’s polynomial, which is sparse, and Mandelbrot and Runnel’s polynomials for which one can construct fast evaluation procedures.

Table 3. Details on runs of CclusterO, CclusterR and CclusterPs for polynomials in \(\mathbb R[z]\) with increasing degree.

6 Future Works

Our main contribution is a significant practical progress in subdivision root-finding based on a new test for counting roots in a well-isolated disc. If the latter assumption does not hold, the test result is not guaranteed but is very likely to be correct. In a subdivision framework, we have proposed to use a test based on Pellet’s theorem to verify its result. We aim to do so by using only evaluations of p and \(p'\). This would imply a very significant improvement of the root clustering algorithm when p and \(p'\) can be evaluated very efficiently.