A Polyhedral Homotopy Algorithm for Real Zeros

Ergür, Alperen A.; Wolff, Timo de

doi:10.1007/s40598-022-00219-w


Research Exposition
Published: 27 October 2022

Volume 9, pages 305–338, (2023)

Arnold Mathematical Journal


Abstract

We design a homotopy continuation algorithm, based on Viro’s patchworking method, for finding real zeros of sparse polynomial systems. The algorithm is targeted at polynomial systems with coefficients that satisfy certain concavity conditions; it tracks an optimal number of solution paths, and it operates entirely over the reals. In more technical terms, we design an algorithm that correctly counts and finds the real zeros of polynomial systems that are located in the unbounded components of the complement of the underlying A-discriminant amoeba. We provide a detailed exposition of connections between Viro’s patchworking method, convex geometry of A-discriminant amoeba complements, and computational real algebraic geometry.


1 Introduction

Let ${\varvec{p}}=(p_1,p_2,\ldots ,p_n)$ be a system of sparse polynomials in ${\mathbb {C}[\varvec{x}]} = \mathbb {C}[x_1,\ldots ,x_n]$ with support sets $A_1,A_2,\ldots ,A_n \subseteq \mathbb {Z}^n$. More precisely, let

$$\begin{aligned} {p_i} \ := \ \sum _{\alpha \in A_i} c_{\alpha }^{(i)} \varvec{x}^{\alpha } \; , \; \text {for} \; i=1,2,\ldots ,n, \end{aligned}$$

where $\varvec{x}^{\alpha }:=x_1^{\alpha _1} x_2^{\alpha _2} \ldots x_n^{\alpha _n}$. Bernstein’s theorem from 1975 [3] shows that for a generic choice of coefficients of the $p_i$, the number of zeros of $\varvec{p}$ on $(\mathbb {C}^{*})^n$ equals the mixed volume ${{\mathcal {M}}(Q_1,Q_2,Q_3,\ldots ,Q_n)}$ of the Newton polytopes ${Q_i} := {{\,\mathrm{{\text {conv}}}\,}}(A_i)$.
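To make the count in Bernstein’s theorem concrete, the following minimal sketch computes the mixed volume of two lattice polygons (the case $n=2$) through the identity $\mathcal {M}(Q_1,Q_2) = \text {Area}(Q_1+Q_2) - \text {Area}(Q_1) - \text {Area}(Q_2)$. The function names `convex_hull`, `twice_area`, and `mixed_volume_2d` are our own illustrative choices, not part of any of the packages cited in this article, and the approach is a brute-force illustration rather than an efficient mixed volume algorithm.

```python
from itertools import product

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def twice_area(points):
    """Twice the area of conv(points) via the shoelace formula (an integer for lattice points)."""
    hull = convex_hull(points)
    s = 0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        s += x1 * y2 - x2 * y1
    return abs(s)

def mixed_volume_2d(A1, A2):
    """Mixed volume of conv(A1) and conv(A2), normalized so that two generic lines meet once."""
    minkowski = [(a[0] + b[0], a[1] + b[1]) for a, b in product(A1, A2)]
    return (twice_area(minkowski) - twice_area(A1) - twice_area(A2)) // 2

# Dense supports of degrees 2 and 3 recover the Bezout number 2 * 3.
print(mixed_volume_2d([(0, 0), (2, 0), (0, 2)], [(0, 0), (3, 0), (0, 3)]))
# Two bilinear equations (support = unit square) have mixed volume 2, not 4.
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(mixed_volume_2d(square, square))
```

The second call illustrates the point of the sparse count: the Bézout bound for two bilinear equations is 4, while the mixed volume is 2.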

In the early 1990s, the polyhedral homotopy method was developed as an algorithmic counterpart of Bernstein’s theorem [18]. The main idea of the polyhedral homotopy method is to continuously deform a given polynomial system to another “easy” system, which can be solved by pure combinatorics, and then trace back the change in the solution set with numerical path trackers. This geometric idea is colloquially referred to as toric deformation, and the “easy” systems with combinatorial structure are referred to as the systems at the toric limit. The polyhedral homotopy method is currently implemented in PHCPack [46], Hom4ps-3 [10], pss5 [33], and HomotopyContinuation.jl [8], and it is practically successful.

For most applications of polynomial system solving, and for certain questions in theoretical computer science, one needs to count and find zeros of polynomial equations over the real numbers, e.g., see [23, 28]. No general and efficient algorithm that counts real zeros of arbitrary sparse polynomial systems is known, and there are good complexity theoretic reasons to believe that at this level of generality the problem is intractable. Our aim is to locate a sufficiently general and tractable sub-case of the real zero finding problem: Suppose support sets $A_1,A_2,\ldots ,A_n \subseteq \mathbb {Z}^n$ are given; can we find effectively checkable conditions on the coefficients of the equations that guarantee tractable solving over the reals? In other words: Where are the “easy” equations located in the space of sparse real polynomial systems with n equations and n unknowns?

An important observation from real algebraic geometry suggests a map for “easy” polynomial systems: one can count real zeros by pure combinatorics if the polynomial system is at the “toric limit”. We informally state this result (Viro’s Patchworking Method for Complete Intersections) to motivate our discussion; see Sect. 2.2 for a precise statement.

Theorem 1.1

(Viro’s Patchworking Method for Finitely Many Zeros) Let $A_1,\ldots ,A_n \subseteq \mathbb {Z}^n$, let $\omega _i : A_i \rightarrow \mathbb {R}$ be lifting functions, and consider the following family of equations parametrized by $t \ge 1$:

$$\begin{aligned} p_i(t,\varvec{x}) \ := \ \sum _{\varvec{\alpha } \in A_i} c_{\varvec{\alpha }}^{(i)} t^{\omega _i(\varvec{\alpha })} \varvec{x}^{\varvec{\alpha }} \; , \; \text {for} \; i=1,2,\ldots ,n. \end{aligned}$$

Let $\varepsilon _i: A_i \rightarrow \{ -1, +1 \}$ be the sign functions defined by signs of the coefficients $c_{\varvec{\alpha }}^{(i)} \in \mathbb {R}$. Then, for sufficiently large $t \gg 1$, the set of common zeros of $p_1(t,\varvec{x}),p_2(t,\varvec{x}),\ldots ,p_n(t,\varvec{x})$ on $\mathbb {R}_{+}^n$ is homeomorphic to

$$\begin{aligned} {{\,\mathrm{{\text {Trop}}}\,}}(A_1,\omega _1,\varepsilon _1) \cap {{\,\mathrm{{\text {Trop}}}\,}}(A_2,\omega _2,\varepsilon _2) \cap \cdots \cap {{\,\mathrm{{\text {Trop}}}\,}}(A_n, \omega _n,\varepsilon _n), \end{aligned}$$

where ${{\,\mathrm{{\text {Trop}}}\,}}(A_i,\omega _i,\varepsilon _i)$ are the positive parts of the tropical varieties ${{\,\mathrm{{\text {Trop}}}\,}}(A_i,\omega _i)$ as defined in Sect. 2.2.

Theorem 1.1 yields a polyhedral object that is homeomorphic to the common zero set of $p_1(t,\varvec{x}),\ldots ,p_n(t,\varvec{x})$ on $\mathbb {R}_{+}^n$ for sufficiently large t, and it can also be used to handle the set of common zeros on $(\mathbb {R}^{*})^n$. We have three immediate questions:

(1)
How can we quantify precisely when t is “sufficiently large”?
(2)
Given a polynomial system $p_1(t,\varvec{x}),\ldots ,p_n(t,\varvec{x})$ with support sets $A_1,\ldots ,A_n$ and coefficients $c_{\varvec{\alpha }}^{(i)}$ for $\varvec{\alpha } \in A_i$ (as in the theorem statement), can we guarantee that the number of common real zeros does not change as t goes from 1 to $\infty $?
(3)
Can we use the technique in Theorem 1.1 for polynomial systems that are not necessarily at the “toric limit”?

The first two questions are interrelated, and they form the main difficulty with respect to developing an algorithmic version of Theorem 1.1. These questions have been open since the 1990s [45]; to the best of our knowledge, the current paper provides the first progress. We provide an explicit criterion to answer the second question (stated in Sect. 3). The criterion also furnishes a homotopy algorithm that operates entirely over the reals, which we call the real polyhedral homotopy algorithm (RPH).

The third question is due to Itenberg and Roy; they conjectured Viro’s patchworking method provides an upper bound for the number of real zeros regardless of the polynomial system being at the toric limit or not [20]. Li and Wang provided a counterexample to the Itenberg–Roy Conjecture [30].

1.1 Effective Patchworking

Our development is based on an observation from the book [16] by Gelfand, Kapranov, and Zelevinsky (henceforth GKZ) which provides a link between Viro’s patchworking method and A-discriminants. Using the GKZ observation for an algorithm is not straightforward. It requires locating a query point relative to the A-discriminant variety, and this may be intractable: The defining equation of the discriminant locus is known to be extremely complicated, which obstructs the use of computational algebra methods. However, it is no obstruction to the use of amoeba theory. Discriminantal amoebas are proven to admit a certain parametric description, and it is easy to compute normal directions on their boundary; see Sect. 2.9. We exploit these special differential geometric properties of A-discriminant amoebas to develop an effective criterion for checking whether a given polynomial system is “easy”. Note that RPH relies on notions from discrete and tropical geometry, and on further notions from GKZ. Furthermore, we use an algorithm called Tropical Homotopy due to Jensen [22]. So, before reading the main statement in Sect. 3, we encourage the reader to check familiarity with the content of Sects. 2.1, 2.2, 2.3, 2.7, and 2.9.

1.2 Complexity Aspects

Our work is inspired by the practical efficiency of the complex polyhedral homotopy algorithm. Complexity aspects of polyhedral homotopy have been elusive for more than two decades: early papers did not include any complexity analysis, and although different authors later approached the issue [31, 32, 36], certain technical obstacles still remain; see Sect. 5.4.

A complete complexity analysis of RPH will only become possible when the scientific community fully understands the complexity of numerical path tracking for sparse polynomial systems. We present our thoughts on the complexity of discrete computations, and touch upon the complexity of the numerical part of RPH in Sect. 5.

We point out here that the main parameters governing the complexity of RPH are different from those of its complex cousin: the overall complexity of RPH is controlled by the number of mixed cells (a combinatorial quantity), whereas the complexity of complex polyhedral homotopy is controlled by the mixed volume (a geometric invariant).

1.3 Connections to Fewnomial Theory

A system of polynomials $\varvec{p}=(p_1,p_2,\ldots ,p_n)$ is called a patchworked polynomial system if the real zero set of $\varvec{p}$ is homeomorphic to a simplicial complex created by Viro’s combinatorial patchworking technique. For instance, every polynomial system that passes our test in Sect. 3 is a patchworked system. In Sect. 5.3, we observe the following result that is reminiscent of a conjecture from fewnomial theory [27] attributed to Kushnirenko [29].

Theorem 1.2

Let $\varvec{p}=(p_1,p_2,\ldots ,p_n)$ be a patchworked polynomial system where every polynomial $p_i$ has at most t terms. Then $\varvec{p}$ can have at most $2^{n+1} \left( {\begin{array}{c}n(t-1)\\ n\end{array}}\right) $ many common zeros on $(\mathbb {R}^{*})^n$.

Note that $2^{n+1} \left( {\begin{array}{c}n(t-1)\\ n\end{array}}\right) \le 2^{n+1} e^n (t-1)^n$ where the right hand side resembles Kushnirenko’s conjecture. To illustrate the difference between the number of paths tracked in RPH and the number of paths in complex polyhedral homotopy, we provide a very simple example.

Example 1.3

Let $A=\{ (0,0,0) , (0,0,d) , (0,d,0), (d,0,0) \}$, and let $\varvec{p}=(p_1,p_2,p_3)$ where $p_i= a^{(i)}_0 + a^{(i)}_1 x_3^d + a^{(i)}_2 x_2^d + a^{(i)}_3 x_1^d$ are real polynomials in three variables. Further assume that the coefficients of $\varvec{p}$ are generic in the sense of Bernstein’s theorem; this implies that $\varvec{p}$ has $d^3$ many zeros on $(\mathbb {C}^{*})^3$. If $\varvec{p}$ is a patchworked polynomial system, by Theorem 1.2 it has at most 1344 zeros in $(\mathbb {R}^{*})^3$. We should note that 1344 is very much an overestimate (the correct bound in this specific example is at most 8), whereas $d^3$, with d being arbitrarily large, is the exact number of zeros in $(\mathbb {C}^{*})^3$.
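The number 1344 in Example 1.3 is the bound of Theorem 1.2 evaluated at $n=3$ and $t=4$. A short sketch of this arithmetic follows; the function name `patchworked_bound` is our own label for the expression $2^{n+1}\left( {\begin{array}{c}n(t-1)\\ n\end{array}}\right) $ and does not appear in the original text.

```python
from math import comb, e

def patchworked_bound(n, t):
    """Bound of Theorem 1.2: 2^(n+1) * binom(n(t-1), n)."""
    return 2 ** (n + 1) * comb(n * (t - 1), n)

n, t = 3, 4                       # three equations, four terms each (Example 1.3)
bound = patchworked_bound(n, t)   # 2^4 * binom(9, 3) = 16 * 84
relaxed = 2 ** (n + 1) * e ** n * (t - 1) ** n
print(bound, bound <= relaxed)
```

Note that the bound does not depend on the degree d, in contrast to the complex count $d^3$.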

Theorem 1.2 is a direct application of McMullen’s upper bound theorem. Things become geometrically more interesting when one tries to bound the number of mixed cells for support sets $A_i$ with different cardinalities (mixed supports). In [4], it is claimed that a patchworked polynomial system $\varvec{p}=(p_1,p_2,\ldots ,p_n)$, where $p_i$ has at most $t_i$ many terms, can have at most $\prod _{i=1}^n (t_i-1)$ many zeros on ${\mathbb {R}}_{+}^n$. We have learned from Bihan that the proof of this result is not correct, but the result still holds true. Bihan informed us that a new proof and an erratum will appear soon [5].

1.4 Structure of the Paper

Our aim is to make this paper as self-contained as possible. The preliminaries section contains background information and results from discrete geometry, the theory of A-discriminants, symbolic computation, and numerical path trackers. Jensen’s tropical homotopy algorithm and mixed cell cones are also introduced in this section. In the third section, we transform asymptotic and qualitative results from [16] into a more quantitative and checkable condition. In the fourth section, we present our real polyhedral homotopy algorithm and an example. The fifth section is concerned with the complexity aspects. The last section contains a discussion of questions that were brought to our attention after the initial version of this paper appeared on arXiv.

2 Preliminaries

We denote ${[n]} := \{1,\ldots ,n\}$, ${\mathbb {C}^*} := \mathbb {C}\setminus \{0\}$, and ${\mathbb {R}^*} := \mathbb {R}\setminus \{0\}$. Let ${\varvec{e}_j}$ denote the j-th coordinate vector in $\mathbb {R}^{n}$. To avoid redundancies later in the article, we set ${\varvec{e}_0} := \varvec{0}$.

For a given convex set C, we denote its boundary by ${\partial C}$. For a convex cone $K \subseteq \mathbb {R}^{n}$, the dual cone $K^{\circ }$ is defined as

$$\begin{aligned} K^{\circ }:= \{ y \in \mathbb {R}^n : \langle x , y \rangle \ge 0 \; \text {for all} \; x \in K \}. \end{aligned}$$

For a given polytope P, we denote its vertex set by ${{\text {Vert}}\left( P\right) }$. For $\varvec{v} \in {\text {Vert}}\left( P\right) $, the normal cone is the collection of linear functionals that achieve their maximum over P at $\varvec{v}$; it is denoted by ${{\text {NC}}{(\varvec{v})}}$. The collection of all normal cones ${\text {NC}}{(\varvec{v})}$ forms a fan, called the normal fan and denoted by ${{\text {NF}}{(P)}}$.

In what follows we consider finite sets ${A} := \{\varvec{a}_1,\ldots ,\varvec{a}_m\} \subset \mathbb {Z}^n$ and ${A_1,A_2,\ldots ,A_k} \subset \mathbb {Z}^n$, which are support sets of polynomials. We denote the Minkowski sum of the $A_i$ as ${\sum _{i = 1}^{k}{A_i}}$. Note that

$$\begin{aligned} {{\,\mathrm{{\text {conv}}}\,}}\left( \sum _{i = 1}^{k}{A_i}\right) \ = \ \sum _{i = 1}^k {{\,\mathrm{{\text {conv}}}\,}}(A_i). \end{aligned}$$

For a polynomial $p \in \mathbb {C}[\varvec{x}]$ with support A, the Newton polytope is given by ${{{\,\mathrm{{\text {New}}}\,}}(p)} := {{\,\mathrm{{\text {conv}}}\,}}(A)$. We denote the variety, i.e., the set of common solutions of a system of polynomials $\varvec{p}$ over the complex numbers, by ${\mathcal {V}\left( \varvec{p}\right) }$, the real locus by ${\mathcal {V}_{\mathbb {R}}\left( \varvec{p}\right) } := \mathcal {V}\left( \varvec{p}\right) \cap \mathbb {R}^n$, and the positive and nonzero real loci by ${\mathcal {V}_{\mathbb {R}_{> 0}}\left( \varvec{p}\right) }$ and ${\mathcal {V}_{\mathbb {R}^*}\left( \varvec{p}\right) }$, respectively.

2.1 Polyhedral Subdivisions, Secondary Polytope, and Cayley Configuration

In this section, we introduce polyhedral subdivisions, secondary polytopes and Cayley configurations; for further details, we refer the reader to [11].

Let $A \subset \mathbb {Z}^n$ be a set of lattice points and let ${\omega }: A \rightarrow \mathbb {R}$ be a function. The lifting of A induced by $\omega $ is defined as:

$$\begin{aligned} {A^{\omega }} \ := \ \left\{ ({{\,\mathrm{\varvec{x}}\,}}, \omega ({{\,\mathrm{\varvec{x}}\,}})) : {{\,\mathrm{\varvec{x}}\,}}\in A \right\} . \end{aligned}$$

We call a face F of ${{\,\mathrm{{\text {conv}}}\,}}(A^{\omega })$ an upper face if it is given by

$$\begin{aligned} F \ = \ \{ {{\,\mathrm{\varvec{x}}\,}}\in {{\,\mathrm{{\text {conv}}}\,}}(A^{\omega }) : \langle \varvec{c} ,{{\,\mathrm{\varvec{x}}\,}}\rangle \ge \langle \varvec{c}, \varvec{y} \rangle \text { for all } \varvec{y} \in {{\,\mathrm{{\text {conv}}}\,}}(A^{\omega }) \}, \end{aligned}$$

where $\varvec{c}$ is a vector with a positive last entry. Intuitively, upper faces are the faces that are “visible” from $(0,\ldots ,0,\infty )$. We project upper faces of ${{\,\mathrm{{\text {conv}}}\,}}(A^{\omega })$ on the point set A:

$$\begin{aligned} {\Delta _{\omega }} \ := \ \left\{ {{\,\mathrm{\varvec{x}}\,}}\in A : ({{\,\mathrm{\varvec{x}}\,}},\omega ({{\,\mathrm{\varvec{x}}\,}})) \text { belongs to an upper face of } {{\,\mathrm{{\text {conv}}}\,}}(A^{\omega }) \right\} . \end{aligned}$$

$\Delta _{\omega }$ is a polyhedral subdivision of A. Polyhedral subdivisions obtained this way are called coherent or regular. Note that $\Delta _{\omega }$ is a triangulation (i.e. a subdivision using only simplices) unless the lifted points $A^{\omega }$ have certain affine dependencies [11, Remark 5.2.3].
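The construction above can be illustrated in the simplest case $n=1$, where $A$ is a set of distinct integers on a line and the upper faces of ${{\,\mathrm{{\text {conv}}}\,}}(A^{\omega })$ are the edges of an upper convex hull in the plane. The following minimal sketch uses a function name of our own choosing, `regular_subdivision_1d`; it assumes integer (or otherwise exact) lifting values so that the collinearity test is reliable.

```python
def regular_subdivision_1d(A, omega):
    """Cells of the coherent subdivision of A (distinct points on a line) induced by omega.

    Each cell is the list of points of A whose lifts lie on one upper edge of conv(A^omega).
    """
    lifted = sorted(zip(A, omega))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    # Upper hull, scanned left to right: keep only strict right turns.
    hull = []
    for p in lifted:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    cells = []
    for left, right in zip(hull, hull[1:]):
        cell = [x for (x, h) in lifted
                if left[0] <= x <= right[0] and cross(left, right, (x, h)) == 0]
        cells.append(cell)
    return cells

A = [0, 1, 2, 3]
print(regular_subdivision_1d(A, [0, 1, 1, 0]))    # concave lifting: finest triangulation
print(regular_subdivision_1d(A, [0, -1, -1, 0]))  # interior points are not visible from above
print(regular_subdivision_1d(A, [0, 0, 0, 0]))    # non-generic lifting: one cell, not a triangulation
```

The last call shows the affine dependency mentioned above: when the lifted points are collinear, the single cell contains four points and is therefore not a simplex.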

Now, we define the secondary polytope of A, which encodes all coherent triangulations of A, and discuss its key properties; see [11, Section 5].

Definition 2.1

Let T be a triangulation of $A =\{ \varvec{a}_1, \varvec{a}_2, \ldots , \varvec{a}_m \}$, and let $\sigma _1,\ldots ,\sigma _s$ be the simplices in T. We define

$$\begin{aligned} {\Phi _A(T)} \ := \ \sum _{j=1}^m \left( \sum _{\{\sigma \in T \, : \, \varvec{a}_j \in \sigma \}} {{\,\mathrm{{\text {vol}}}\,}}(\sigma ) \right) \varvec{e}_j. \end{aligned}$$

We define the secondary polytope of A as:

$$\begin{aligned} {\Sigma (A)} \ := \ {{\,\mathrm{{\text {conv}}}\,}}\left\{ \Phi _A(T) \ : \ T \text { is a triangulation of } A \right\} . \end{aligned}$$

The corresponding normal fan ${\text {NF}}{(\Sigma (A))}$ is called the secondary fan. For its cones, the secondary cones, we use the abbreviated notation ${{\text {NC}}{(T)}} := {\text {NC}}{(\Phi _A(T))}$.
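As a small worked instance of Definition 2.1, the sketch below computes $\Phi _A(T)$ for triangulations of a one-dimensional point set, where simplices are segments and volume is length. We read $\varvec{a}_j \in \sigma $ as “$\varvec{a}_j$ is a vertex of $\sigma $” and use the unnormalized length; both are assumptions of this illustration, and `gkz_vector_1d` is a hypothetical helper name.

```python
def gkz_vector_1d(A, triangulation):
    """Phi_A(T) for points A on a line; T is a list of segments (a, b) with a, b in A."""
    phi = [0] * len(A)
    for a, b in triangulation:
        length = abs(b - a)
        phi[A.index(a)] += length   # each segment contributes its length
        phi[A.index(b)] += length   # to both of its endpoints
    return phi

A = [0, 1, 2, 3]
triangulations = [
    [(0, 1), (1, 2), (2, 3)],
    [(0, 1), (1, 3)],
    [(0, 2), (2, 3)],
    [(0, 3)],
]
for T in triangulations:
    print(T, gkz_vector_1d(A, T))
```

Here $m=4$ and $n=1$, so the four vectors printed are the vertices of a secondary polytope of dimension $m-n-1=2$.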

Theorem 2.2

[11, Section 5] The secondary polytope has the following properties:

(1)
The vertices of $\Sigma (A)$ are in one to one correspondence to the coherent triangulations of A.
(2)
The face lattice of $\Sigma (A)$ is isomorphic to a refinement poset of the coherent polyhedral subdivisions of A.
(3)
A lifting function $\omega : A \rightarrow \mathbb {R}$ induces the triangulation T if and only if $\omega \in {\text {int}}\left( {\text {NC}}{(T)}\right) $.
(4)
Consider the support set A as an $n \times m$ integer matrix. Then every secondary cone ${\text {NC}}{(T)}$ includes the $n+1$ dimensional linear space spanned by the rows of A and the all-ones vector $(1,1,\ldots ,1)$. As a consequence, the secondary polytope $\Sigma (A)$ is $m-n-1$ dimensional.

For later use, we need to have a better understanding of the description of secondary cones ${\text {NC}}{(T)}$. We first define a circuit.

Definition 2.3

An affine dependence among lattice points of a set $A \subseteq {\mathbb {Z}}^n$ is a relation given by $\sum _{\alpha \in A} a_{\alpha } \alpha =0$, where $\sum _{\alpha \in A} a_{\alpha }=0$. A circuit Z is a collection of affinely dependent lattice points where every proper subset of Z is affinely independent. Consequently, a circuit supports a unique (up to scaling) affine relation $\sum _{\alpha \in Z} \lambda _{\alpha } \alpha =0$ with $\sum _{\alpha \in Z} \lambda _{\alpha } = 0$.

A rhombus in the plane is a nice example of a circuit. The following is a basic fact about circuits, see, e.g., Lemma 2.4.2 [11].

Lemma 2.4

Let Z be a circuit, then Z can be decomposed into a disjoint union of two sets $Z = Z_{+} \cup Z_{-}$ with the following property:

$$\begin{aligned} {\mathcal {Z}}_{+} = \{ Z - \{ \alpha \} : \alpha \in Z_{+} \} \; \; , \; \; {\mathcal {Z}}_{-} = \{ Z - \{ \alpha \} : \alpha \in Z_{-} \} \end{aligned}$$

are the two triangulations of Z.

The volume of the simplex $Z-\{ \alpha \} $ is equal to the absolute value of a determinant. Let us denote this determinant by $\sigma _{\alpha }$. A standard argument, see, e.g., Remark 4.1.8 in [11], shows that the $\sigma _{\alpha }$ determine the unique affine relation supported by Z. That is, using the terminology in Definition 2.3, we have $\sigma _{\alpha }=\lambda _{\alpha }$ (up to swapping $Z_{-}$ and $Z_{+}$).

Now consider a regular triangulation T and suppose $\omega $ is a lifting function that induces T. Then, for a simplex ${{\,\mathrm{{\text {conv}}}\,}}\{ a_1,a_2,\ldots ,a_{n+1} \}$ in T, and $a_{n+2} \in A$ with $a_{n+2} \notin {{\,\mathrm{{\text {conv}}}\,}}\{ a_1,a_2,\ldots ,a_{n+1} \}$, we must have that $(a_{n+2},\omega (a_{n+2}))$ lies ‘above’ the affine span of $(a_1,\omega (a_1)), \ldots , (a_{n+1}, \omega (a_{n+1}))$. Suppose $\sum _{i=1}^{n+2} \lambda _i a_i = 0$ is the unique affine relation of the circuit $\{ a_1,a_2,\ldots ,a_{n+2} \}$. If we have $\sum _{i=1}^{n+2} \lambda _i \omega (a_i) =0$, then we know that $(a_{n+2},\omega (a_{n+2}))$ is in the affine span of the vectors $\{ (a_1, \omega (a_1)), \ldots ,(a_{n+1}, \omega (a_{n+1}))\}$. To have $(a_{n+2},\omega (a_{n+2}))$ ‘above’ simply corresponds to $\sum _{i=1}^{n+2} \lambda _i \omega (a_i) >0$. In conclusion, the secondary cone ${\text {NC}}{(T)}$ is described by inequalities supported on circuits, and these inequalities are of the form $\sum \lambda _i \omega (a_i) >0$ where the $\lambda _i$ are signed volumes of simplices in the triangulation of the circuit (up to scaling).

Now, we consider polyhedral subdivisions of a set A where $A = \sum _{i = 1}^{k}{A_i}$. Let F be a cell in a coherent polyhedral subdivision of $\sum _{i = 1}^{k}{A_i}$ induced by a lifting function $\omega $. Then, F corresponds to a face in $\sum _{i = 1}^k {{\,\mathrm{{\text {conv}}}\,}}(A_i)^{\omega }$. Let $F = \sum _{i = 1}^{k}{F_i}$ where ${F_i}$ are the corresponding faces on ${{\,\mathrm{{\text {conv}}}\,}}(A_i)^{\omega }$.

Definition 2.5

A coherent polyhedral subdivision ${\Delta _{\omega }}$ of $A_1+A_2+\cdots +A_k$ for $A_i \subset \mathbb {Z}^n$ is called fine mixed if it satisfies the following conditions:

(1)
For all cells F in the subdivision, we have $\sum _{i=1}^k \dim (F_i)=n$, and
(2)
for all cells F in the subdivision we have $\sum _{i=1}^k (\# F_i - 1)=n$,

where ${\# F_i}$ denotes the number of vertices of $F_i$.

We also need to define the Cayley configuration of point sets $A_1,A_2,\ldots ,A_k$ and the corresponding Cayley polytope.

Definition 2.6

We define the Cayley configuration of $A_1,A_2,\ldots ,A_k$ as

$$\begin{aligned} {\textbf{A}} \ = \ {A_1 * A_2 * \cdots * A_k} \ := \ \{ (\varvec{x},\varvec{e}_{i-1}) : \varvec{x} \in A_i, \; i \in [k] \} \subseteq \mathbb {R}^{n+k-1}. \end{aligned}$$

The Cayley polytope is defined as ${{\,\mathrm{{\text {conv}}}\,}}(\textbf{A})$, denoted by ${{{\,\textrm{Cay}\,}}(\textbf{A})}$.
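A direct transcription of Definition 2.6 can be sketched as follows. The function name `cayley_configuration` is our own; the code appends to each point of $A_i$ the vector $\varvec{e}_{i-1} \in \mathbb {R}^{k-1}$, with the convention $\varvec{e}_0 = \varvec{0}$ fixed at the beginning of this section.

```python
def cayley_configuration(supports):
    """Cayley configuration A_1 * ... * A_k as a list of points in R^(n+k-1)."""
    k = len(supports)
    points = []
    for i, A in enumerate(supports):   # i = 0, ..., k-1 corresponds to A_(i+1)
        e = [0] * (k - 1)              # e_0 is the zero vector
        if i > 0:
            e[i - 1] = 1               # e_i is the i-th coordinate vector
        for x in A:
            points.append(tuple(x) + tuple(e))
    return points

A1 = [(0, 0), (1, 0)]
A2 = [(0, 0), (0, 1)]
print(cayley_configuration([A1, A2]))
```

For two supports in the plane, the result lives in $\mathbb {R}^{3}$: the points of $A_1$ sit at height 0 and the points of $A_2$ at height 1.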

The following observation is implicit in most papers in the literature: A natural slicing of the Cayley polytope ${{\,\textrm{Cay}\,}}(\textbf{A})$ is equivalent to $\sum _{i = 1}^k {{\,\mathrm{{\text {conv}}}\,}}(A_i)$. More precisely, consider the following set defined by the intersection of ${{\,\textrm{Cay}\,}}(\textbf{A})$ with several hyperplanes:

$$\begin{aligned} {\widetilde{{{\,\textrm{Cay}\,}}(\textbf{A})}} \ := \ \left\{ \varvec{x} \in {{\,\textrm{Cay}\,}}(\textbf{A}) \ : \ x_{n+1}=x_{n+2}=\cdots =x_{n+k-1}=\frac{1}{k} \right\} . \end{aligned}$$

Observe that a k-scaling of ${\widetilde{{{\,\textrm{Cay}\,}}(\textbf{A})}}$, i.e., $k \cdot {\widetilde{{{\,\textrm{Cay}\,}}(\textbf{A})}}$, is equal to $\sum _{i = 1}^k {{\,\mathrm{{\text {conv}}}\,}}(A_i)$. For a detailed explanation and a picture-proof, see [17].

Suppose that T is a coherent triangulation of the Cayley configuration $\textbf{A}$. First, note that $T \cap {\widetilde{{{\,\textrm{Cay}\,}}(\textbf{A})}}$ creates a polyhedral subdivision of ${\widetilde{{{\,\textrm{Cay}\,}}(\textbf{A})}}$. Via the equivalence, this gives a polyhedral subdivision of $\sum _{i = 1}^k {{\,\mathrm{{\text {conv}}}\,}}(A_i)$. Let $\sigma $ be a simplex in T; then $\sigma $ has $n+k$ vertices which split into sets of vertices $\sigma _i$ that are induced by $A_i$. None of the $\sigma _i$ are empty since otherwise $\sigma $ cannot be full-dimensional. Then, up to an isomorphism, $F_{\sigma }= {{\,\mathrm{{\text {conv}}}\,}}(\sigma _1) + {{\,\mathrm{{\text {conv}}}\,}}(\sigma _2) + \cdots + {{\,\mathrm{{\text {conv}}}\,}}(\sigma _k)$ yields a cell in the polyhedral subdivision of $\sum _{i = 1}^k {{\,\mathrm{{\text {conv}}}\,}}(A_i)$, and all such cells yield a fine mixed subdivision of $\sum _{i = 1}^k {{\,\mathrm{{\text {conv}}}\,}}(A_i)$. This correspondence gives a bijection between coherent triangulations of the Cayley polytope and coherent fine mixed subdivisions of the Minkowski sum $\sum _{i = 1}^k {{\,\mathrm{{\text {conv}}}\,}}(A_i)$; see [43, Theorem 5.1].

In summary, coherent fine mixed subdivisions of $\sum _{i = 1}^k {{\,\mathrm{{\text {conv}}}\,}}(A_i)$ are encoded by the vertices of the secondary polytope $\Sigma (\textbf{A})$ and the corresponding secondary cones.

Remark 2.7

At various parts of this article (in our theorem statements and algorithms), we work with triangulations. For a generic lifting function $\omega $, the induced polyhedral subdivision $\Delta _{\omega }$ is a triangulation. The proof of [11, Proposition 2.2.4] suggests an algorithm, albeit an inefficient one, to check whether a given lifting is generic. The question of finding an efficient algorithm to check genericity of a lifting is an interesting question, but it lies beyond the scope of our paper.

2.2 Viro’s Patchworking Method

In this section, we introduce Viro’s patchworking method for complete intersections. For further details and relations to Hilbert’s 16th problem, we refer the reader to Viro’s survey [47]. For further background information on tropical geometry, see, e.g., [19, 37, 38]. For implementations of the patchworking technique, see [39] and [24].

Definition 2.8

Let $A=\{ \varvec{a}_1, \varvec{a}_2, \ldots , \varvec{a}_m \} \subset \mathbb {Z}^{n}$ and $\Delta _{\omega }$ be a coherent triangulation of A given by a lifting function $\omega : A \rightarrow \mathbb {R}$. We define the associated tropical variety as

$$\begin{aligned} {{{\,\mathrm{{\text {Trop}}}\,}}(A,\omega )} := \{ \varvec{x} \in \mathbb {R}^n : \max _{i} \{ \langle \varvec{x} , a_i \rangle + \omega (a_i) \} \; \text {is attained at least twice} \}. \end{aligned}$$

Since we are interested in real varieties, we need to distinguish a positive and a negative part of ${{\,\mathrm{{\text {Trop}}}\,}}(A,\omega )$. Observe that ${{\,\mathrm{{\text {Trop}}}\,}}(A,\omega )$ together with its complement creates a polyhedral decomposition of $\mathbb {R}^n$. Also, by definition, every full-dimensional cell in the complement of ${{\,\mathrm{{\text {Trop}}}\,}}(A,\omega )$ corresponds to a unique $\varvec{a}_j \in A$ as it is given by the set:

$$\begin{aligned} \left\{ \varvec{x} \in \mathbb {R}^n \ : \ \langle \varvec{x} , \varvec{a}_j \rangle + \omega (\varvec{a}_j) > \langle \varvec{x} , \varvec{a}_i \rangle + \omega (\varvec{a}_i) \text { for all } i \in [m] \setminus \{j\}\right\} . \end{aligned}$$

We define the sign of this cell as ${\varepsilon (\varvec{a}_j)}$. For every $(n-1)$-dimensional cell in ${{\,\mathrm{{\text {Trop}}}\,}}(A,\omega )$, there exist two adjacent n-dimensional cells with signs assigned by $\varepsilon $. This motivates the definition of the positive part of a tropical variety.

Definition 2.9

The positive part ${{{\,\mathrm{{\text {Trop}}}\,}}(A,\omega ,\varepsilon )}$ of a given tropical variety ${{\,\mathrm{{\text {Trop}}}\,}}(A,\omega )$ is the subcomplex consisting of those $(n-1)$-dimensional cells that are adjacent to two n-cells with different signs.
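Definitions 2.8 and 2.9 translate into a pointwise membership test. The sketch below is only an illustration under stated assumptions: the function names are ours, inputs are integers or exact rationals so that ties in the maximum are detected reliably, and the test for the positive part covers only points at which exactly two terms attain the maximum (points in the relative interior of an $(n-1)$-dimensional cell).

```python
def argmax_terms(x, A, omega):
    """Indices i at which <x, a_i> + omega(a_i) attains its maximum."""
    values = [sum(xi * ai for xi, ai in zip(x, a)) + w for a, w in zip(A, omega)]
    top = max(values)
    return [i for i, v in enumerate(values) if v == top]

def on_trop(x, A, omega):
    """True if x lies on Trop(A, omega): the maximum is attained at least twice."""
    return len(argmax_terms(x, A, omega)) >= 2

def on_positive_part(x, A, omega, eps):
    """True if exactly two terms attain the maximum and they carry different signs."""
    idx = argmax_terms(x, A, omega)
    return len(idx) == 2 and eps[idx[0]] != eps[idx[1]]

# Univariate example: support {0, 1, 2}, lifting (0, 1, 0), signs (+, -, -),
# which matches the sign pattern of the polynomial 1 - t*x - x^2.
A = [(0,), (1,), (2,)]
omega = [0, 1, 0]
eps = [+1, -1, -1]
for x in [(-1,), (0,), (1,)]:
    print(x, on_trop(x, A, omega), on_positive_part(x, A, omega, eps))
```

In this example the tropical variety consists of the two points $x=-1$ and $x=1$, and only $x=-1$ separates cells of different signs, in agreement with the single sign change in the coefficient sequence.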

Theorem 2.10

(Viro’s Patchworking for Complete Intersections [44]) Let $A_1,\ldots ,A_k \subset \mathbb {Z}^n$, let $\omega : A_1 * A_2 * \cdots * A_k \rightarrow \mathbb {R}$ be a lifting function. Consider a system of polynomials ${\varvec{p}}=(p_1,p_2,\ldots ,p_k)$ defined as follows:

$$\begin{aligned} {p_i(t,{{\,\mathrm{\varvec{x}}\,}})} \ := \ \sum _{\alpha \in A_i} c_{\alpha } t^{\omega (\alpha )} {{\,\mathrm{\varvec{x}}\,}}^{\alpha } \end{aligned}$$

with $c_{\alpha } \in \mathbb {R}$. Let ${\varepsilon }: A_1 * A_2 * \cdots * A_k \rightarrow \{ -1, +1 \}$ be the sign function defined by the coefficients of $\varvec{p}$. Then, for sufficiently large $t \gg 1$, the real algebraic set $\mathcal {V}_{\mathbb {R}_{> 0}}\left( \varvec{p}\right) $ is homeomorphic to

$$\begin{aligned} {{\,\mathrm{{\text {Trop}}}\,}}(A_1,\omega _1,\varepsilon _1) \cap {{\,\mathrm{{\text {Trop}}}\,}}(A_2,\omega _2,\varepsilon _2) \cap \cdots \cap {{\,\mathrm{{\text {Trop}}}\,}}(A_k,\omega _k,\varepsilon _k), \end{aligned}$$

where $\omega _i$ and $\varepsilon _i$ are restrictions of $\omega $ and $\varepsilon $ to $A_i$.

Remark 2.11

For readers who are familiar with non-Archimedean tropical geometry, the theorem statement here might look confusing. The only difference is that in non-Archimedean tropical geometry, it is customary to use min notation, lower facets, and t tending to zero. In amoeba theory, however, it is customary to use max notation, upper facets, and t tending to $\infty $. We follow the amoeba theory convention.

Theorem 2.10 generalizes to the set of zeros on the appropriate toric variety by applying the theorem on each of the $2^n$ orthants separately and then gluing the results together; see [44, Theorem 5]. We illustrate Theorem 2.10 on the simplest possible example.

Example 2.12

The set $A :=\{ \varvec{e}_0,\varvec{e}_1,\ldots , \varvec{e}_n \}$ represents the support set for linear forms (and hence its convex hull is the standard simplex). We consider positive solutions of an affine linear form $f = u_0+\sum _{i=1}^n u_i x_i$, i.e., the solutions with $x_i > 0$. We use a variant of the moment map from symplectic geometry, called the algebraic moment map:

$$\begin{aligned} {\mu _A}: \mathbb {R}_{+}^n \rightarrow {{\,\mathrm{{\text {conv}}}\,}}(A) \qquad {{\,\mathrm{\varvec{x}}\,}}\ \mapsto \ \frac{ \sum _{i} x_i \varvec{e}_i }{1+ \sum _{i} x_i }. \end{aligned}$$

This map is a homeomorphism. The image of $\mathcal {V}_{\mathbb {R}_{> 0}}\left( u_0+\sum _{i=1}^n u_i x_i\right) $ under $\mu _A$ is given by:

$$\begin{aligned} \mu _A(\mathcal {V}_{\mathbb {R}_{> 0}}\left( f\right) ) \! = \! \left\{ (y_1,y_2,\ldots ,y_n) \in {{\,\mathrm{{\text {conv}}}\,}}(A) : u_0\left( 1-\sum _{i=1}^n y_i\right) + \sum _{i=1}^n u_i y_i = 0 \right\} . \end{aligned}$$

Hence, $\mu _A(\mathcal {V}_{\mathbb {R}_{> 0}}\left( f\right) )$ is defined by the linear form $u_0+u_1x_1+\cdots +u_nx_n$ on the simplex ${{\,\mathrm{{\text {conv}}}\,}}(A)$, and it separates those $\varvec{e}_i$ with $u_i > 0$ from those $\varvec{e}_j$ with $u_j < 0$.
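The computation in Example 2.12 can be checked numerically. The sketch below, with helper names of our own choosing, evaluates the algebraic moment map with exact rational arithmetic and verifies the identity $u_0\left( 1-\sum _i y_i\right) + \sum _i u_i y_i = f(\varvec{x})/(1+\sum _i x_i)$ for $\varvec{y}=\mu _A(\varvec{x})$, which is why zeros of $f$ map to zeros of the linear equation on the simplex.

```python
from fractions import Fraction

def moment_map(x):
    """Algebraic moment map of the standard simplex: x -> x / (1 + sum(x))."""
    s = 1 + sum(x)
    return [xi / s for xi in x]

def affine_form(u, x):
    """f(x) = u_0 + u_1 x_1 + ... + u_n x_n."""
    return u[0] + sum(ui * xi for ui, xi in zip(u[1:], x))

def image_equation(u, y):
    """Left-hand side of the equation cutting out the image on conv(A)."""
    return u[0] * (1 - sum(y)) + sum(ui * yi for ui, yi in zip(u[1:], y))

u = [-1, 1, 1]                            # f = -1 + x_1 + x_2
x = [Fraction(1, 3), Fraction(2, 3)]      # a positive zero of f
y = moment_map(x)
print(y, image_equation(u, y))            # the image point satisfies the linear equation

x2 = [Fraction(1), Fraction(1)]           # not a zero: f(x2) = 1
print(image_equation(u, moment_map(x2)) == affine_form(u, x2) / (1 + sum(x2)))
```

In this instance $u_0<0$ and $u_1,u_2>0$, so the image is the segment in the triangle that separates the vertex $\varvec{e}_0$ from $\varvec{e}_1$ and $\varvec{e}_2$.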

To prove Theorem 2.10 above, one replaces the simplex with the triangulation, and the moment map with the moment map corresponding to the toric variety defined by $A_1+A_2+\cdots +A_k$ as explained in [16, Chapter 11, Section 5, Subsections C and D]. We provide another example, first considered by Sturmfels [43, Page 382].

Example 2.13

Consider the two polynomials

$$\begin{aligned}&f_t \ = \ x_2^3 - tx_1x_2^2 - t^5x_1^2x_2 + t^{12}x_1^3 - tx_2^2 + t^4 x_1 x_2 - t^9 x_1^2 - t^5 x_2 - t^9 x_1 + t^{12}, \\&g_t \ = \ t^8 x_2^2 - t^6 x_1 x_2 + t^6 x_1^2 - t^3 x_2 - t^2 x_1 + 1. \end{aligned}$$

We consider the lifting function $\omega $ induced by the exponents of t and the sign function $\varepsilon $ induced by the coefficients of $f_t$ and $g_t$, and compute the corresponding patchworking. We present the outcome in Figure 1. The computation was already carried out by Sturmfels in the original article [43] in 1994. Here, we generate a plot using the Viro.sage package by O’Neill, Kwaakwah, and the second author [39].

2.3 Mixed-Cell Cones and Jensen’s Tropical Homotopy Algorithm

In this article, we are concerned with zero dimensional systems, that is, we have n support sets $A_1,A_2,\ldots ,A_n \subset \mathbb {Z}^n$. For this case, some cells in the regular triangulation $\Delta _\omega $ (induced by a lifting $\omega $) of $\textbf{A}=A_1 * A_2 * \cdots * A_n$ are of particular interest. These cells are called mixed cells. Formally, a cell $\sigma \in \Delta _\omega $ that has 2 elements from each $A_i$ is called a mixed cell. Equivalently, after the identification explained in Sect. 2.1, in a fine mixed subdivision of $A_1+A_2+\cdots + A_n$, mixed cells $\sigma $ are the cells that are given by the Minkowski sum of n edges.

On the dual side, when we consider the finite set of points in the intersection

$$\begin{aligned} {{\,\mathrm{{\text {Trop}}}\,}}(A_1,\omega _1,\varepsilon _1) \cap {{\,\mathrm{{\text {Trop}}}\,}}(A_2,\omega _2,\varepsilon _2) \cap \cdots \cap {{\,\mathrm{{\text {Trop}}}\,}}(A_n,\omega _n,\varepsilon _n), \end{aligned}$$

each of these points corresponds to a mixed cell in the triangulation of $\textbf{A}=A_1 * A_2 * \cdots * A_n$ in which the two vertices from each $A_i$ have opposite signs. In the current literature, such a simplex is called an alternating mixed cell [22].

The main observation here is that Theorem 2.10 depends only on the mixed cells in a triangulation of $\textbf{A}=A_1*A_2*\cdots *A_n$; we do not need to differentiate between two triangulations that have the same collection of mixed cells. We formalize this as follows.

Definition 2.14

(Mixed-Cell Cone of a Triangulation) Let T be a triangulation of $\textbf{A}=A_1 * A_2 * \cdots *A_n$, and let $\sigma \in T$ be a mixed cell. For every lifting function $\omega : \textbf{A}\rightarrow \mathbb {R}$ (represented by a vector in $\mathbb {R}^{\textbf{A}}$) we denote the induced subdivision as $\Delta _{\omega }$. We define the mixed cell cone of $\sigma $ as:

$$\begin{aligned} {M(\sigma )} \ := \ \{ \omega \in \mathbb {R}^{\textbf{A}} : \sigma \text { is a mixed-cell in } \Delta _{\omega } \}. \end{aligned}$$

Moreover, we define the mixed cell cone of T as:

$$\begin{aligned} {M(T)} \ := \ \bigcap _{\{\sigma \ : \ \sigma \text { is mixed cell of } T\}} M(\sigma ). \end{aligned}$$

To clarify the difference between the mixed cell cone and the secondary cone, we make a definition and state a lemma.

Definition 2.15

Let $A_1,A_2,\ldots ,A_n \subset {\mathbb {Z}}^n$ and set $\textbf{A}= A_1 * A_2 * \cdots * A_n \subset {\mathbb {Z}}^{2n-1}$. If $\Gamma $ is a facet of $\textbf{A}$ defined by $\Gamma = \{ x \in \textbf{A}: x_{n+i} = 0 \}$ for some $1 \le i \le n-1$, or $\Gamma = \{ x \in \textbf{A}: \exists \; i \; \text {such that} \; 1 \le i \le n-1 \; \text {and} \; x_{n+i}= 1 \}$ then we call $\Gamma $ an irrelevant facet. If $\Gamma $ is a face included in an irrelevant facet, we call $\Gamma $ an irrelevant face.

Lemma 2.16

Let $T=\Delta _{\omega }$ for a lifting function $\omega $, and assume that $\omega \in {\text {NC}}{(\varvec{v})}$ where $\varvec{v}$ is a vertex of the Newton polytope of the $\textbf{A}$-discriminant and ${\text {NC}}{(\varvec{v})}$ is its normal cone. Then, we have

$$\begin{aligned} {\text {NC}}{(\varvec{v})}^{\circ } \subseteq M(T)^{\circ } \subseteq {\text {NC}}{(T)}^{\circ }. \end{aligned}$$

Moreover, if $\tau \in {\text {NC}}{(T)}^{\circ } - M(T)^{\circ } $ then $\tau $ is supported on a set that is included in the union of irrelevant faces of $\textbf{A}$.

Proof

The inclusions of the cones follow directly from the definitions; for further structural information we refer to the notion of D-equivalence in [16, Chapter 11, Section 3, Subsection B].

We prove the “moreover” part of the claim. Let $\tau \in {\text {NC}}{(T)}^{\circ } - M(T)^{\circ } $ be an inequality supported on a circuit Z. We claim that there exists an $i \in [n]$ such that $\left| Z \cap A_i \right| =1$. Assume otherwise; then for some j we have $\left| Z \cap A_i \right| =2$ for all $i \ne j$, and $\left| Z \cap A_j \right| =3$. Then, passing from one triangulation of Z to another involves a mixed cell change, which contradicts the assumption $\tau \in {\text {NC}}{(T)}^{\circ } - M(T)^{\circ } $. Now without loss of generality assume $Z \cap A_1=\varvec{\alpha }$. Then $Z - \varvec{\alpha }$ lies in an irrelevant face of $\textbf{A}$, and the lattice distance from $\varvec{\alpha }$ to the affine hull of $Z - \varvec{\alpha }$ is 1. One can easily observe that the simplex $\sigma _{\varvec{\alpha }}$ corresponding to $\varvec{\alpha }$ in Z is not full-dimensional and hence has volume zero. Thus, the inequality $\tau $ is supported on $Z-\varvec{\alpha }$. In general, any element of ${\text {NC}}{(T)}^{\circ } - M(T)^{\circ } $ is a conic combination of circuit inequalities $\tau \in {\text {NC}}{(T)}^{\circ } - M(T)^{\circ }$, and we showed that such $\tau $ are supported on irrelevant faces. $\square $

In the rest of this paper, we will use the mixed-cell cone M(T) and it will be represented by circuit inequalities generating the cone, that is we work with $M(T)^{\circ }$. Luckily for us, there is already an efficient algorithm for computing $M(T)^{\circ }$: Jensen’s tropical homotopy algorithm, see [22], computes for a given (generic) lifting function $\omega $, and point configurations $A_1,A_2,\ldots ,A_n$, the triangulation $T=\Delta _{\omega }$ of $\textbf{A}=A_1*A_2*\cdots *A_n$ and $M(T)^{\circ }$. The idea of Jensen’s algorithm is to start from a lifting function $\beta $ yielding only one mixed cell. Then, one keeps track of the changes in the mixed-cell cone as one changes the lifting function linearly from $\beta $ to a target lifting $\omega $. The algorithm updates the mixed-cell cone with the violated circuit inequalities, and halts whenever it arrives at a triangulation T with $\omega \in M(T)$. The correctness of the algorithm follows from the fact that changes in the regular triangulations always happen by a change between two triangulations of a circuit, and every such change corresponds to one circuit inequality added to the mixed-cell cone.

2.4 Solving Binomial Systems Over the Reals

Since we repeat the Viro construction in every orthant of $(\mathbb {R}^{*})^n$, the sign vector $\varepsilon $ changes. However, the lifting function $\omega $ and the corresponding triangulation remain the same for all orthants. So, to count the number of real zeros with Viro’s method, one needs to investigate the mixed cells and check how many times a mixed cell becomes an alternating one. Algorithmically, instead of going through Viro’s construction $2^n$ times, it is more convenient to use binomial systems, i.e., systems of polynomials where every polynomial has only two terms. Every mixed cell corresponds to a binomial system, and solving that binomial system on $(\mathbb {R}^{*})^n$ corresponds to counting how many times the mixed cell becomes an alternating mixed cell. This approach is much more effective.
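
For small n, the correspondence between orthants and alternating mixed cells can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: a mixed cell is given as n pairs of signed monomials, each binomial is read in the form $c_1 \varvec{x}^{\varvec{a}_1} + c_2 \varvec{x}^{\varvec{a}_2} = 0$, and the names `term_sign` and `count_alternating_orthants` are hypothetical. It enumerates all $2^n$ sign vectors, which is exactly the cost that the binomial approach avoids.

```python
from itertools import product

def term_sign(coeff_sign, exponent, orthant):
    """Sign of the monomial c * x^a on the orthant with the given sign vector."""
    sign = coeff_sign
    for s, a in zip(orthant, exponent):
        # A negative coordinate flips the sign exactly when its exponent is odd.
        if s < 0 and a % 2 != 0:
            sign = -sign
    return sign

def count_alternating_orthants(cell):
    """cell: list of n pairs ((sign1, exp1), (sign2, exp2)), one pair per A_i.

    Returns the number of orthants in which every pair has opposite signs,
    i.e., in which the mixed cell is alternating.
    """
    n = len(cell)
    count = 0
    for orthant in product((1, -1), repeat=n):
        if all(term_sign(e1, a1, orthant) != term_sign(e2, a2, orthant)
               for (e1, a1), (e2, a2) in cell):
            count += 1
    return count

# Toy cell for the binomial system x1^2 - 1 = 0, x2 - x1 = 0.
example_cell = [
    ((1, (2, 0)), (-1, (0, 0))),
    ((1, (0, 1)), (-1, (1, 0))),
]
example_count = count_alternating_orthants(example_cell)
```

For this toy cell the count is 2, in agreement with the two real solutions $(1,1)$ and $(-1,-1)$ of the binomial system.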

Now we outline how to solve binomial systems over the reals. Consider the following system of binomials:

$$\begin{aligned} c_{11}{{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{11}} \ = \ c_{12} {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{12}}, \quad c_{21}{{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{21}} \ = \ c_{22} {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{22}}, \ \ldots , \quad c_{n1}{{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{n1}} \ = \ c_{n2} {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{n2}}, \end{aligned}$$

where $c_{ij} \in \mathbb {R}^{*}$ and $\varvec{a}_{ij} \in \mathbb {Z}^n$. This system is equivalent to the following system of equations:

$$\begin{aligned} {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{11}-\varvec{a}_{12}} \ = \ \frac{c_{12}}{c_{11}}, \quad {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{21}-\varvec{a}_{22}} \ = \ \frac{c_{22}}{c_{21}}, \ \ldots , \quad {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{n1}-\varvec{a}_{n2}} \ = \ \frac{c_{n2}}{c_{n1}}. \end{aligned}$$

(2.1)

Set $\varvec{d}_i=\varvec{a}_{i1}-\varvec{a}_{i2}$, and $D=[\varvec{d}_1 \varvec{d}_2 \cdots \varvec{d}_n]$. To solve the system (2.1) over $(\mathbb {R}^{*})^n$, it suffices to perform the elementary integer operations that reduce D to its Hermite normal form. These operations can be done in strongly polynomial time [26]. The result is a system of equations in the following format:

$$\begin{aligned} {{\,\mathrm{\varvec{x}}\,}}_1^{\varvec{h}_{11}} \ = \ \lambda _1 , \quad {{\,\mathrm{\varvec{x}}\,}}_1^{\varvec{h}_{21}} {{\,\mathrm{\varvec{x}}\,}}_2^{\varvec{h}_{22}} \ = \ \lambda _2 , \ \ldots , \quad {{\,\mathrm{\varvec{x}}\,}}_1^{\varvec{h}_{n1}} \ldots {{\,\mathrm{\varvec{x}}\,}}_{n}^{\varvec{h}_{nn}} \ = \ \lambda _n, \end{aligned}$$

(2.2)

where $\varvec{h}_{ij} \in {\mathbb {Z}}$ and $\lambda _i \in \mathbb {R}^{*}$. Whether (2.2) has real solutions, and how many, is completely determined by the signs of the $\lambda _i$ and the parities of the $\varvec{h}_{ij}$. Hence, (2.2) either has no solution in $(\mathbb {R}^{*})^n$, or there exist solutions differing only by their signs.
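
The reduction from (2.1) to (2.2) and the subsequent sign analysis can be sketched in Python as follows. This is a minimal illustration, not the procedure of [26]: it assumes the exponent vectors $\varvec{d}_i$ are stored as the rows of `D`, it stops at a lower-triangular form without the final normalization of a Hermite normal form, it uses floating point for the right-hand sides, and the function names `triangularize` and `solve_triangular` are hypothetical.

```python
import math

def triangularize(D, mu):
    """Bring the system x^{D[i]} = mu[i] to lower-triangular exponents.

    Subtracting q times row p from row r corresponds to dividing
    equation r by the q-th power of equation p.
    """
    n = len(D)
    D = [list(row) for row in D]
    mu = [float(m) for m in mu]
    for k in range(n - 1, -1, -1):
        while True:
            rows = [r for r in range(k + 1) if D[r][k] != 0]
            if not rows:
                raise ValueError("exponent matrix is singular")
            p = min(rows, key=lambda r: abs(D[r][k]))
            if len(rows) == 1:
                break
            for r in rows:
                if r != p:
                    q = D[r][k] // D[p][k]
                    D[r] = [a - q * b for a, b in zip(D[r], D[p])]
                    mu[r] /= mu[p] ** q
        # Move the single remaining pivot row into position k.
        D[k], D[p] = D[p], D[k]
        mu[k], mu[p] = mu[p], mu[k]
    return D, mu

def solve_triangular(H, lam):
    """All real solutions with nonzero coordinates of the triangular system (2.2)."""
    partial = [[]]
    for i in range(len(lam)):
        h = H[i][i]
        if h == 0:
            raise ValueError("zero diagonal exponent")
        extended = []
        for sol in partial:
            r = lam[i]
            for j in range(i):
                r /= sol[j] ** H[i][j]
            mag = abs(r) ** (1.0 / h)
            if h % 2 != 0:
                # Odd exponent: exactly one real root, with the sign of r.
                extended.append(sol + [math.copysign(mag, r)])
            elif r > 0:
                # Even exponent and positive right-hand side: two real roots.
                extended.append(sol + [mag])
                extended.append(sol + [-mag])
            # Even exponent and negative right-hand side: no real root.
        partial = extended
    return partial

# Toy system: x1 * x2 = 6 and x1 / x2 = 2/3.
H, lam = triangularize([[1, 1], [1, -1]], [6.0, 2.0 / 3.0])
solutions = solve_triangular(H, lam)
```

In this toy system the triangular form reads $x_1^2 = 4$, $x_1 x_2 = 6$, and the two real solutions $(2,3)$ and $(-2,-3)$ differ only by their signs.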

There is also a recent paper focusing on probabilistic analysis of numerical methods for binomial system solving [40].

2.5 A-Discriminants

Given a set of lattice points $A=\{ \varvec{a}_1, \varvec{a}_2, \ldots , \varvec{a}_m \} \subset \mathbb {Z}^n$, we define

$$\begin{aligned} {\mathbb {C}^A} \ := \ \left\{ \sum _{\varvec{\alpha } \in A} c_{\varvec{\alpha }} \varvec{x}^{\varvec{\alpha }} \in \mathbb {C}[\varvec{x}] \ : \ c_{\varvec{\alpha }} \in \mathbb {C}\text { for all } \varvec{\alpha } \in A\right\} \end{aligned}$$

as the space of polynomials supported on A. Note that $\mathbb {C}^A$ is isomorphic to $\mathbb {C}^{m}$ with $m = \# A$. We define ${(\mathbb {C}^*)^A}$ analogously with $c_{\varvec{\alpha }} \in \mathbb {C}^*$. Now, we define our protagonist the A-discriminant variety:

$$\begin{aligned} {\nabla _A} \ := \ \overline{ \left\{ f \in (\mathbb {C}^*)^A \ : \ f \text { has a singularity on } (\mathbb {C}^{*})^n \right\} }. \end{aligned}$$

${\nabla _A}$ is a cone over a projective variety. To get a better sense of this, we need to introduce the projective toric variety corresponding to A:

$$\begin{aligned} X_A := \overline{ \{ [x^{a_1}:x^{a_2}:\cdots :x^{a_m}] : x \in (\mathbb {C}^{*})^n \} }. \end{aligned}$$

One can observe that $X_A$ is essentially the Zariski closure of a torus orbit. A polynomial f supported on A can be considered as a linear form on the toric variety $X_A$. The discriminant variety corresponds to the hyperplanes that intersect the toric variety $X_A$ non-transversally. This means that if $f \notin {\nabla _A}$ then the zero set of f has no singularity on $X_A$. This also means that $ {\nabla _A}$ is the cone over the projective dual of $X_A$. Thus, except for specific degenerate configurations A, ${\nabla _A}$ is an irreducible hypersurface given by a polynomial with integral coefficients; see [16, Chapter 9]. We denote the defining polynomial of ${\nabla _A}$ by $\Delta _A$. We are interested in the real part of the discriminant variety

$$\begin{aligned} {\nabla _A(\mathbb {R})} \ := \ \nabla _A \cap \mathbb {R}[\varvec{x}]. \end{aligned}$$

The hypersurface $\nabla _A(\mathbb {R})$ partitions the coefficient space $\mathbb {R}^{A}$ into connected components. If two polynomials $f,g \in \mathbb {R}^{A}$ lie in the same connected component of $\mathbb {R}^{A} - \nabla _A(\mathbb {R})$ and $X_A$ is smooth, then the zero sets of f and g are isotopic on $X_A$ [16, pg 380].

For the purposes of homotopy continuation we want to have zero sets of $f,g \in \mathbb {R}^{A}$ be isotopic in the torus orbit $X_A^{\circ }$ instead of the compactification $X_A$. The nice fact is that $X_A$ admits a decomposition into a disjoint union of torus orbits:

$$\begin{aligned} X_A = \sqcup _{\Gamma } X_{\Gamma }^{\circ }, \end{aligned}$$

where $\Gamma $ ranges over the faces of the polytope conv(A) and $X_{\Gamma }^{\circ }$ denotes the torus orbit for which the toric variety $X_{\Gamma }$ is the closure. So to have $\mathcal {V}_{\mathbb {R}^*}\left( f\right) $ and $\mathcal {V}_{\mathbb {R}^*}\left( g\right) $ isotopic, we will require two conditions: neither f nor g has a zero on $X_{\Gamma }^{\circ }$ for any proper face $\Gamma $ of conv(A), and the zero sets of f and g are isotopic on $X_A$. Since the singularities of the toric variety $X_A$ are known to lie on $X_{\Gamma }^{\circ }$ for faces of codimension two or higher, once the first condition is guaranteed the second condition boils down to f and g being in the same connected component of $\mathbb {R}^{A} - \nabla _A(\mathbb {R})$.

At this point, we need to introduce the sparse resultant. We summarize its basic properties in the following proposition-definition.

Proposition 2.17

[16, Chapter 8] Let $A_1,A_2,\ldots ,A_k \subset {\mathbb {Z}}^{k-1}$ be a collection of k finite sets. Then there exists a polynomial $R_{A_1,A_2,\ldots ,A_k}$ with the following properties:

$R_{A_1,A_2,\ldots ,A_k}$ has integral coefficients and it is irreducible.
If $\varvec{f}=(f_1,f_2,\ldots ,f_k)$ is a polynomial system with $f_i \in \mathbb {C}^{A_i}$ and $\varvec{f}$ has a zero in $(\mathbb {C}^{*})^{k-1}$, then $R_{A_1,A_2,\ldots ,A_k}(\varvec{f})=0$.

An exposition of sparse resultants with a computational focus can be found in [9]. A nice trick, referred to as the “Cayley trick”, relates A-discriminants and sparse resultants.

Lemma 2.18

[16, Chapter 9, Prop 1.7] Using the notation of Proposition 2.17 and letting $\textbf{A}:=A_1*A_2* \cdots *A_k$, we have

$$\begin{aligned} R_{A_1,A_2,\ldots ,A_k}(\varvec{f}(x)) = \Delta _{\textbf{A}}(f_1(x) + \sum _{i=2}^{k} y_i f_i(x) ), \end{aligned}$$

where $y_i$ denotes the new variables added in the construction of $\textbf{A}$.

Now we would like to think about singular zeros of a sparse polynomial system. For a tuple of coefficient vectors ${\varvec{C}}=(\varvec{C}_1,\varvec{C}_2,\ldots ,\varvec{C}_k)$ with $\varvec{C}_i \in {\mathbb {C}}^{\# A_i}$, let $\varvec{p}_{\varvec{C}}$ be the polynomial system ${\varvec{p}_{\varvec{C}}}=(p_1,p_2,\ldots ,p_{k})$ with ${p_i}=\sum _{\varvec{a}_{ij} \in A_i} \varvec{C}_{ij} {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_{ij}} $. We define the discriminantal locus for systems of equations as follows:

$$\begin{aligned}&{\nabla _{A_1,A_2,\ldots ,A_k}} \\ {}&\quad := \ \overline{ \left\{ (\varvec{C}_1,\varvec{C}_2,\ldots ,\varvec{C}_k) \in \mathbb {C}^{A_1} \times \cdots \times \mathbb {C}^{A_k} : \varvec{p}_{\varvec{C}} \; \text {possesses a singularity on} \; (\mathbb {C}^{*})^n \right\} }. \end{aligned}$$

The toric variety that is dual to ${\nabla _{A_1,A_2,\ldots ,A_k}} $ may not be clear at first sight, but it is isomorphic to $X_{A_1+A_2+\cdots +A_k}$; we refer the reader to [16, Chapter 8, Proposition 1.4] for a nice explanation.

The discriminantal locus corresponding to hypersurfaces supported by the Cayley configuration $\textbf{A}=A_1 * A_2 * \cdots * A_k$ is then given by

$$\begin{aligned} {\nabla _{\textbf{A}}} \ := \ \overline{ \left\{ \varvec{C}\in \mathbb {C}^{\textbf{A}} : \sum _{a \in \textbf{A}} c_a x^{a} \; \text {possesses a singularity on} \; (\mathbb {C}^{*})^n \right\} }. \end{aligned}$$

If $\textbf{A}=A_1 * A_2 * \cdots * A_k$ is not degenerate, then $\nabla _{\textbf{A}}$ is an irreducible hypersurface. Also, using the definition of singularity with the Jacobian matrix, it immediately follows that $\nabla _{\textbf{A}} \subseteq \nabla _{A_1,A_2,\ldots ,A_k} $. The following result of Esterov, proved by a simple perturbation argument, relates $\nabla _{\textbf{A}}$ and $\nabla _{A_1,A_2,\ldots ,A_k}$; see [15, Lemma 3.36], and note that in Esterov’s notation $\nabla _{A_1,A_2,\ldots ,A_k} $ is denoted by $ \Sigma _{A_0,A_1,\ldots ,A_{\ell }}$.

Theorem 2.19

(Esterov) If $\textbf{A}=A_1 * A_2 * \cdots * A_k$ is not defect, and $\dim ({{\,\mathrm{{\text {conv}}}\,}}(A_i))=n$ for $i=1,2,\ldots ,k$, then $\nabla _{A_1,A_2,\ldots ,A_k}$ is irreducible of codimension one.

Hence, if the assumptions of Esterov’s theorem are satisfied, then $\nabla _{\textbf{A}}$ and $\nabla _{A_1,A_2,\ldots ,A_k}$ coincide. So, in order to control the changes in the topology for systems of equations supported on $A_1,A_2,\ldots ,A_k$, we use the hypersurface $\nabla _{\textbf{A}}(\mathbb {R})$.

2.6 Basics of Amoeba Theory

In this section, we introduce the notion of amoeba following Gelfand, Kapranov, and Zelevinsky [16]. For an overview of amoeba theory, please see [35, 42].

Definition 2.20

We define the Log-absolute value map as

$$\begin{aligned} {{{\,\textrm{Log}\,}}}: (\mathbb {C}^{*})^{n} \rightarrow \mathbb {R}^{n}, \quad (z_1,z_2,\ldots ,z_n) \mapsto (\log \left| z_1 \right| , \log \left| z_2 \right| , \ldots , \log \left| z_n \right| ). \end{aligned}$$

For a Laurent polynomial $f \in \mathbb {C}\left[ \varvec{z}^{\pm 1}\right] $ and variety $\mathcal {V}\left( f\right) \subset (\mathbb {C}^*)^n$ we define the amoeba of f as ${{\mathcal {A}}{(f)}} := {{\,\textrm{Log}\,}}|\mathcal {V}\left( f\right) | \subseteq \mathbb {R}^n$.

Lemma 2.21

Let $f=\sum _{i} c_i {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_i}$ be a polynomial with support $A=\{\varvec{a}_1, \varvec{a}_2, \ldots , \varvec{a}_m \}$. Let $\varvec{v}$ be a vertex of ${{\,\mathrm{{\text {New}}}\,}}(f)$. Suppose that $\varvec{b} \in {\text {NC}}{(\varvec{v})}$ with

$$\begin{aligned} \langle \varvec{b} , \varvec{v}-\varvec{a}_i \rangle \ > \ \log \left( \frac{m \cdot \left| c_i \right| }{\left| c_{\varvec{v}} \right| } \right) \end{aligned}$$

for all $\varvec{a}_i \ne \varvec{v}$. Then, $ {\mathcal {A}}{(f)} \cap \left( \varvec{b}+{\text {NC}}{(\varvec{v})} \right) = \emptyset $.

The statement is well known; see [16, Prop. 1.5, Page 195]. Here, we provide the main argument of the proof for the convenience of the reader.

Proof

We have

$$\begin{aligned} f({{\,\mathrm{\varvec{x}}\,}}) \ = \ c_{\varvec{v}} {{\,\mathrm{\varvec{x}}\,}}^{\varvec{v}} \left( 1 + \sum _{\varvec{a}_i \ne \varvec{v}} \frac{c_i}{c_{\varvec{v}}} {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_i-\varvec{v}} \right) . \end{aligned}$$

Set $g({{\,\mathrm{\varvec{x}}\,}})=\sum _{\varvec{a}_i \ne \varvec{v}} \frac{c_i}{c_{\varvec{v}}} {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}_i-\varvec{v}} $. Then for a given ${{\,\mathrm{\varvec{x}}\,}}\in (\mathbb {C}^{*})^{n}$ if $\left| g({{\,\mathrm{\varvec{x}}\,}}) \right| < 1$, this immediately implies $f({{\,\mathrm{\varvec{x}}\,}}) \ne 0$ and hence ${{\,\textrm{Log}\,}}|{{\,\mathrm{\varvec{x}}\,}}| \notin {\mathcal {A}}{(f)}$. The rest of the proof is straightforward. $\square $

Lemma 2.21 shows that for every vertex $\varvec{v}$ of ${{\,\mathrm{{\text {New}}}\,}}(f)$, there is an unbounded connected component in the complement of the amoeba ${\mathcal {A}}{(f)}$ that contains a translated copy of the normal cone ${\text {NC}}{(\varvec{v})}$. The following are the basic facts: these connected components are distinct for every $\varvec{v}$, these connected components exhaust the list of unbounded components in the complement of ${\mathcal {A}}{(f)}$, and these connected components are convex [41, 42].
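
The hypothesis of Lemma 2.21 is a finite list of linear inequalities in $\varvec{b}$ and is easy to test. The Python sketch below is a hedged illustration: the function name `satisfies_lemma_hypothesis` and the data layout (exponents as tuples, a chosen vertex index) are assumptions, and membership of $\varvec{b}$ in ${\text {NC}}{(\varvec{v})}$ is taken for granted rather than checked.

```python
import math

def satisfies_lemma_hypothesis(support, coeffs, vertex_index, b):
    """Check <b, v - a_i> > log(m * |c_i| / |c_v|) for all a_i != v.

    support: list of exponent tuples a_i; coeffs: nonzero coefficients c_i;
    vertex_index: position of the vertex v in support; b: candidate point.
    """
    m = len(support)
    v = support[vertex_index]
    c_v = coeffs[vertex_index]
    for i, (a, c) in enumerate(zip(support, coeffs)):
        if i == vertex_index:
            continue
        lhs = sum(bj * (vj - aj) for bj, vj, aj in zip(b, v, a))
        if lhs <= math.log(m * abs(c) / abs(c_v)):
            return False
    return True

# Toy example: f = 1 + 3x + x^2 with vertex v = 2 of New(f) = [0, 2].
toy_support = [(0,), (1,), (2,)]
toy_coeffs = [1.0, 3.0, 1.0]
far_point_ok = satisfies_lemma_hypothesis(toy_support, toy_coeffs, 2, (2.5,))
near_point_ok = satisfies_lemma_hypothesis(toy_support, toy_coeffs, 2, (1.0,))
```

Here the point $b = 2.5$ passes the test while $b = 1$ does not. The condition is only sufficient, so a failed test does not by itself place $\varvec{b}$ inside the amoeba.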

2.7 Real Toric Deformation

This section sets up the real homotopy from combinatorial patchworking to our target system. We will require the deformation path to lie outside of a region given by a union of discriminant amoebas. This will ensure that no root paths visit toric infinity and that the deformation preserves the geometry of the real zero set.

Proposition 2.22

Let $A_1,A_2,\ldots ,A_n \subset \mathbb {Z}^{n}$ be point configurations with $\dim (A_i)=n$ for all $i \in [n]$, and let $\textbf{A}=A_1 * A_2 * \cdots * A_n$ be the Cayley configuration. Suppose that $\varvec{v} = (\varvec{v}_{\varvec{a}})_{\{\varvec{a} \in A_i \ : \ 1 \le i \le n\}} \in \mathbb {R}^{\textbf{A}}$ and $\varvec{C}\in {\mathbb {R}}^{\textbf{A}}$ are vectors with the following properties:

(1)
$\varvec{v}$ is not on the boundary of any secondary cone of the point configuration $\textbf{A}$.
(2)
For every face $\Gamma $ of $\textbf{A}$, except for the irrelevant faces, the ray ${{\,\textrm{Log}\,}}|\varvec{C}|+ \lambda \varvec{v}$ for $\lambda \in [0,\infty )$ does not intersect the amoeba of $\Delta _{\Gamma }(\mathbb {R})$.

We consider a system of equations $\varvec{p}_{\varvec{C}}(t,{{\,\mathrm{\varvec{x}}\,}})=(p_1,p_2,\ldots ,p_n)$:

$$\begin{aligned} p_i(t,{{\,\mathrm{\varvec{x}}\,}}) \ = \ \sum _{\varvec{a} \in A_i} c_{\varvec{a}} t^{- v_{\varvec{a}}} {{\,\mathrm{\varvec{x}}\,}}^{\varvec{a}} \ \text { for } \ i=1,2,\ldots ,n . \end{aligned}$$

(2.3)

Then, the real Puiseux series

$$\begin{aligned} {{\,\mathrm{\varvec{x}}\,}}(t)= (x_1 t^{\zeta _1}, x_2 t^{\zeta _2}, \ldots , x_n t^{\zeta _n}) + \; \text {higher-order terms} \end{aligned}$$

(2.4)

is a solution to the system $\varvec{p}_{\varvec{C}}$ only if $(\varvec{\zeta },1)$ is an outer normal to a lower facet of

$$\begin{aligned} {{\,\mathrm{{\text {conv}}}\,}}(A_1^{\varvec{v}} + A_2^{\varvec{v}} + \cdots + A_n^{\varvec{v}}), \end{aligned}$$

where ${A_i^{\varvec{v}}}$ stands for the lifting of $A_i$ with respect to $\varvec{v} \in {\mathbb {R}}^{\textbf{A}}$. Moreover, $\mathcal {V}_{\mathbb {R}^*}\left( p_i(t,x)\right) $ are isotopic for all $t \in (0,1]$.

Remark 2.23

We note that in the statement we have ${{\,\textrm{Log}\,}}|\varvec{C}|+ \lambda \varvec{v}$ for $\lambda \in [0,\infty )$, and also $c_{\varvec{a}} t^{-v_{\varvec{a}}}$ for $t \in (0,1]$; these two represent the same parameter regime, since $\log \left| c_{\varvec{a}} t^{-v_{\varvec{a}}} \right| = \log {\left| c_{\varvec{a}} \right| } + v_{\varvec{a}} \log \frac{1}{t}$, that is, $\lambda = \log \frac{1}{t}$.

Proof

The statement about the Puiseux series follows the same proof as [18, Lemma 3.1], so we just list the main steps: substitute (2.4) into (2.3), divide by the lowest degree term, and set $t=0$. The system of equations obtained this way will have at most 2n terms in total, and it can have a common zero only if it is a system of binomial equations. One can observe that under the ${{\,\textrm{Log}\,}}$-map, the solutions of these binomial equations correspond to the finitely many points given by Viro’s method. We already discussed in Sects. 2.3 and 2.4 that these points given by Viro’s method identify alternating mixed cells.

Now we consider the statement about isotopy. Let $S := \{ x \in {\mathbb {R}}^{2n-1} : x_{n+1}= x_{n+2} = \cdots = x_{2n-1} = \frac{1}{n-1} \}$. We recall that ${{\,\textrm{Cay}\,}}(\textbf{A}) \cap S$ and $A_1+A_2+\cdots +A_n$ are equivalent up to scaling, see Sect. 2.1. Let $\Gamma $ be a proper face of $\textbf{A}$ that is not irrelevant, let ${\tilde{\Gamma }}$ be the face of $A_1+A_2+\cdots +A_n$ equivalent to $\Gamma \cap S$, and suppose ${\tilde{\Gamma }}= {\tilde{\Gamma }}_1 + {\tilde{\Gamma }}_2 + \cdots + {\tilde{\Gamma }}_n$, where ${\tilde{\Gamma }}_i$ is a face of $A_i$. Observe that $\Gamma = {\tilde{\Gamma }}_1 * {\tilde{\Gamma }}_2 * \cdots * {\tilde{\Gamma }}_n$. By Lemma 2.18, if a polynomial system $\varvec{f|_{\Gamma }}$ satisfies $\Delta _{\Gamma }(\varvec{f|_{\Gamma }}) \ne 0$ then $\varvec{f}$ has no zero on $X_{{\tilde{\Gamma }}}^{\circ }$. Thus if $\Delta _{\Gamma }(\varvec{f|_{\Gamma }}) \ne 0$ for all proper faces, except the irrelevant ones, then $\varvec{f}$ has no zero on $X_{A_1+A_2+\cdots +A_n}-X_{A_1+A_2+\cdots +A_n}^{\circ }$.

The ray ${{\,\textrm{Log}\,}}|\varvec{C}|+ \lambda \varvec{v}$ does not intersect the amoeba of $\Delta _{\Gamma }(\mathbb {R})$ for any $\lambda \in [0,\infty )$. This implies $\Delta _{\Gamma }(\varvec{p}_{\varvec{C}}(t)) \ne 0$ for all faces $\Gamma $, except irrelevant ones, and also $\Delta _{\textbf{A}}(\varvec{p}_{\varvec{C}}(t)) \ne 0$ for all $t \in (0,1]$. So the condition $\Delta _{\Gamma }(\varvec{p}_{\varvec{C}}(t)) \ne 0$ guarantees non-existence of zeros on $X_{A_1+A_2+\cdots +A_n}-X_{A_1+A_2+\cdots +A_n}^{\circ }$, and since $\Delta _{\textbf{A}}(\varvec{p}_{\varvec{C}}(t)) \ne 0$ we have that the zero sets are isotopic on $X_{A_1+A_2+\cdots +A_n}^{\circ }$. Thus, $\mathcal {V}_{\mathbb {R}^*}\left( p_i(t,x)\right) $ are isotopic for all $t \in (0,1]$. $\square $

2.8 Numerically Tracking a Solution from Toric Infinity

The numerical part of our algorithm tracks real zeros of $\varvec{p}_{\varvec{C}}(t,{{\,\mathrm{\varvec{x}}\,}})$, as in Proposition 2.22, from $\varvec{p}_{\varvec{C}}(0,{{\,\mathrm{\varvec{x}}\,}})$ to $\varvec{p}_{\varvec{C}}(1,{{\,\mathrm{\varvec{x}}\,}})$. There are several technicalities to be careful about: (1) we are not able to start the homotopy continuation precisely at $\varvec{p}_{\varvec{C}}(0,{{\,\mathrm{\varvec{x}}\,}})$ since all its zeros lie at toric infinity, (2) we need to design an algorithm to track the solution paths ${{\,\mathrm{\varvec{x}}\,}}(t)$, as in Proposition 2.22, from $t \sim 0$ to $t=1$.

The first issue is theoretically handled by an analytic continuation argument on toric compactification, and it is practically handled by predictor–corrector methods in numerical analysis. The second part, tracking the solution paths, can be done in two ways:

(1)
trace the solution curves ${{\,\mathrm{\varvec{x}}\,}}(t)$ numerically, or
(2)
start a homotopy from $\varvec{p}_{\varvec{C}}(0,{{\,\mathrm{\varvec{x}}\,}})$ with zeros given by alternating mixed cells and track the solution path from $t=0$ to $t=1$.

Explaining the details of these numerical schemes has the potential to double the size of our paper, and the techniques are now folklore, so we prefer to give a brief account. An established reference for the curve tracing approach, i.e., the first method, is [1]. The curve tracing approach is often fast, and it is a standard technique in numerical analysis that is deployed in many applications. However, to the best of our knowledge, the safeguards to control precision issues for standard path trackers only exist for specific cases. The second approach has a well-developed theory to control precision issues and conduct rigorous complexity analysis in the case of dense polynomials [2]. For sparse polynomials, Malajovich recently developed a theory that allows one to express the complexity of numerical tracking with certain integrals of condition numbers [31]. We briefly explain Malajovich’s approach in Sect. 5.4. Our algorithm can be implemented in either of the two ways, depending on the preferred trade-off between rigor and speed. For a nice exposition comparing the two alternatives we suggest [6, Sections 2.3 and 2.4].

2.9 An Entropy Type Formula for the Discriminant Locus

In this section, we introduce useful facts about A-discriminants, mostly relying on [16, Chapter 9, Section 3, subsection C] and works of Passare and Tsikh [42].

Theorem 2.24

(Horn–Kapranov Uniformization) Let $A=[ \varvec{a}_1, \varvec{a}_2, \ldots , \varvec{a}_m ]$ be a collection of lattice points in $\mathbb {Z}^n$, and let $\nabla _A$ be the corresponding A-discriminant variety. We consider A as an $n \times m$ matrix, and define

$$\begin{aligned} {\Psi _A(\varvec{u},{{\,\mathrm{\varvec{x}}\,}})} \ := \ \left[ u_1 {{\,\mathrm{\varvec{x}}\,}}^{a_1} : u_2 {{\,\mathrm{\varvec{x}}\,}}^{a_2} : \cdots : u_m {{\,\mathrm{\varvec{x}}\,}}^{a_m} \right] . \end{aligned}$$

Then $\nabla _A$ admits the following parametrization:

$$\begin{aligned} \nabla _A \ = \ \overline{ \left\{ \Psi _A(\varvec{u},{{\,\mathrm{\varvec{x}}\,}}) \ : \ A\varvec{u}=\varvec{0} , \sum _{i = 1}^m u_i = 0, {{\,\mathrm{\varvec{x}}\,}}\in (\mathbb {C}^{*})^n \right\} }. \end{aligned}$$

Now consider the amoeba of $\nabla _A$:

$$\begin{aligned} {{\,\textrm{Log}\,}}|\nabla _A| \ = \ {{\,\textrm{Log}\,}}\left| \left\{ \varvec{u} \ : \ A\varvec{u}=\varvec{0} , \sum _{i = 1}^m u_i = 0 \right\} \right| \ + \ \left( {{\,\textrm{Log}\,}}|{{\,\mathrm{\varvec{x}}\,}}| \right) ^T A, \end{aligned}$$

where $+$ denotes the Minkowski sum.

It is easy to observe that $\left( {{\,\textrm{Log}\,}}|{{\,\mathrm{\varvec{x}}\,}}| \right) ^T A$ corresponds to the row span of A. Moreover, for any $\varvec{u}$ with $A\varvec{u}=\varvec{0}, \sum _{i = 1}^m u_i = 0$, any scalar multiple of $\varvec{u}$ satisfies the same equations. This n-dimensional row span and this one-dimensional linear space represent the $n+1$ homogeneities that are present in the discriminant variety; the variety is invariant under torus action and scaling.
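
For the smallest nontrivial case $A=\{0,1,2\} \subset \mathbb {Z}$, the parametrization can be checked by hand, since the A-discriminant is the classical discriminant $c_1^2 - 4c_0c_2$ of a quadratic. The Python sketch below is a toy verification under these assumptions only; the helper names `horn_kapranov_point` and `quadratic_discriminant` are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def horn_kapranov_point(u, x, exponents):
    """Coefficient vector (u_i * x^{a_i})_i for a univariate support."""
    return [ui * x ** a for ui, a in zip(u, exponents)]

def quadratic_discriminant(c0, c1, c2):
    """Discriminant of c0 + c1*y + c2*y^2."""
    return c1 * c1 - 4 * c0 * c2

exponents = [0, 1, 2]
# u spans the solutions of A u = 0 and sum(u) = 0 for A = [0 1 2].
u = [Fraction(1), Fraction(-2), Fraction(1)]
kernel_ok = sum(u) == 0 and sum(a * ui for a, ui in zip(exponents, u)) == 0

discriminants = []
for x in (Fraction(3), Fraction(-1, 2)):
    c0, c1, c2 = horn_kapranov_point(u, x, exponents)
    discriminants.append(quadratic_discriminant(c0, c1, c2))
```

Every coefficient vector produced this way has vanishing discriminant, because the corresponding quadratic is $(1 - xy)^2$.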

For a given hypersurface $\mathcal {V}\left( f\right) \subseteq (\mathbb {C}^*)^n$, consider all points which are critical under the ${{\,\textrm{Log}\,}}|\cdot |$ map. The ${{\,\textrm{Log}\,}}|\cdot |$-image of these points is called the contour of the corresponding amoeba ${\mathcal {A}}{(f)}$; see e.g., [42]. It is straightforward to show that the contour contains the boundary $\partial {\mathcal {A}}{(f)}$, but does not coincide with it in general; see, e.g., [42]. Moreover, for a real polynomial f, the contour contains the amoeba of the smooth part of the real variety, i.e., ${\mathcal {A}}{(\mathcal {V}_{\mathbb {R}^*}\left( f\right) )}$ [42].

Let B be a Gale dual of A, i.e., an $m \times (m-n-1)$ integer matrix whose column sums are all 0 and which satisfies $AB=\varvec{0}$. Then, for any $\varvec{u} \in (\mathbb {R}^{*})^{m}$ with $A \varvec{u} = \varvec{0}$ and $\sum _i u_i = 0$ one can find a $\varvec{\zeta } \in (\mathbb {R}^{*})^{m-n-1}$ with $\varvec{u}=B \varvec{\zeta }$.

It follows from the discussion in [42] (see the section titled Discriminants and Real Contours, and specifically Theorem 4), that the parametrization of the contour of the reduced A-discriminant amoeba $B^{T}{\mathcal {A}}{(\nabla _A(\mathbb {C}))}$ is given as follows:

$$\begin{aligned} B^T {{\,\textrm{Log}\,}}\left| \left\{ \varvec{u} : \varvec{u} \in (\mathbb {R}^{*})^m , A \varvec{u} = \varvec{0} , \sum _i u_i = 0 \right\} \right| . \end{aligned}$$

(2.5)

Using (2.5) and the fact that the contour contains the amoeba of the real part of the variety, one can concisely write

$$\begin{aligned} B^{T} {\mathcal {A}}{(\nabla _A(\mathbb {R}))} \ \subseteq \ \left\{ B^{T} {{\,\textrm{Log}\,}}|\varvec{u}| \ : \ \varvec{u} \in (\mathbb {R}^{*})^m, A \varvec{u} = \varvec{0} , \sum _{i = 1}^m u_i = 0 \right\} . \end{aligned}$$

(2.6)

Using the rows of B to parameterize the set $\{ \varvec{u} \in (\mathbb {R}^{*})^m, A \varvec{u} = \varvec{0} , \sum _{i = 1}^m u_i = 0 \}$, this can also be written as follows:

$$\begin{aligned} B^{T} {\mathcal {A}}{(\nabla _A(\mathbb {R}))} \ \subseteq \ \left\{ \sum _{i = 1}^m \varvec{b(i)} \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| \ : \ \varvec{\zeta } \in (\mathbb {R}^{*})^{m-n-1} \right\} , \end{aligned}$$

(2.7)

where the $\varvec{b(i)}$ denote the rows of B. As a next step, we define the following map:

$$\begin{aligned} {\phi _A}: (\mathbb {R}^{*})^{m-n-1} \rightarrow (\mathbb {R}^{*})^{m-n-1} \; \; , \; \; \phi _A(\varvec{\zeta }) = \sum _{i = 1}^m \varvec{b(i)} \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| . \end{aligned}$$

The following facts follow from [16, Chapter 9, Section 3, subsection C]:

(1)
The map $\phi _A$ is 0-homogeneous, that is for every $\lambda \in (0,\infty )$ and $\varvec{\zeta } \in (\mathbb {R}^{*})^{m-n-1}$ we have
$$\begin{aligned} \phi _A(\lambda \varvec{\zeta }) \ = \ \phi _A(\varvec{\zeta }). \end{aligned}$$
(2)
The image of the map $\phi _A$ is a hypersurface, and if the Gauss map $\gamma $ is defined at $\phi _A(\varvec{\zeta })$ then we have
$$\begin{aligned} \gamma ( \phi _A(\varvec{\zeta }) ) = \varvec{\zeta }. \end{aligned}$$

The first property follows since the column sums of B equal 0. The second property is proved by Kapranov [25]. Now assume that we have a $\varvec{\zeta } \in (\mathbb {R}^{*})^{m-n-1}$, and we would like to write down the equation of the tangent hyperplane ${H_{\varvec{\zeta }}}$ at $\phi _A(\varvec{\zeta })$.

Since we know the image under the Gauss map (i.e., the normal direction), we obtain:

$$\begin{aligned} H_{\varvec{\zeta }} \ = \ \left\{ {{\,\mathrm{\varvec{x}}\,}}\in \mathbb {R}^{m-n-1} : \langle {{\,\mathrm{\varvec{x}}\,}}, \varvec{\zeta } \rangle = \langle \phi _A(\varvec{\zeta }) , \varvec{\zeta } \rangle \right\} . \end{aligned}$$

One can rewrite this as follows:

$$\begin{aligned} H_{\varvec{\zeta }} \ = \ \left\{ {{\,\mathrm{\varvec{x}}\,}}\in \mathbb {R}^{m-n-1} : \langle \varvec{\zeta } , {{\,\mathrm{\varvec{x}}\,}}\rangle = \sum _{i = 1}^m \langle \varvec{b(i)}, \varvec{\zeta } \rangle \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| \right\} . \end{aligned}$$

(2.8)
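
The map $\phi _A$ and the right-hand side of (2.8) are straightforward to evaluate numerically. The following Python sketch is an illustration under assumptions: the Gale dual is passed as a list of its rows $\varvec{b(i)}$, the function names `phi` and `hyperplane_rhs` are hypothetical, and the example uses the support $A=\{0,1,2,3\} \subset \mathbb {Z}$ with one possible choice of Gale dual.

```python
import math

def phi(B_rows, zeta):
    """phi_A(zeta) = sum_i b(i) * log|<b(i), zeta>| for the rows b(i) of B."""
    out = [0.0] * len(zeta)
    for b in B_rows:
        inner = sum(bj * zj for bj, zj in zip(b, zeta))
        weight = math.log(abs(inner))
        for j in range(len(zeta)):
            out[j] += b[j] * weight
    return out

def hyperplane_rhs(B_rows, zeta):
    """Right-hand side of (2.8): sum_i <b(i), zeta> * log|<b(i), zeta>|."""
    total = 0.0
    for b in B_rows:
        inner = sum(bj * zj for bj, zj in zip(b, zeta))
        total += inner * math.log(abs(inner))
    return total

# One Gale dual of A = {0, 1, 2, 3}: both columns sum to 0 and are
# orthogonal to the exponent row (0, 1, 2, 3).
B_rows = [(1, 0), (-2, 1), (1, -2), (0, 1)]
zeta = (1.0, 3.0)

value = phi(B_rows, zeta)
scaled_value = phi(B_rows, (2.0, 6.0))  # same point by 0-homogeneity
lhs = sum(z * p for z, p in zip(zeta, value))
rhs = hyperplane_rhs(B_rows, zeta)
```

Up to rounding, `value` and `scaled_value` agree, which reflects the 0-homogeneity of $\phi _A$, and `lhs` equals `rhs`, which is the passage from the first description of $H_{\varvec{\zeta }}$ to (2.8).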

3 Effective Viro’s Patchworking

Consider a polynomial system $\varvec{p}=(p_1,p_2,\ldots ,p_n)$ with support sets $A_1,A_2,\ldots ,A_n$ and the coefficient vector $\varvec{C}=(\varvec{C}_1,\varvec{C}_2, \ldots , \varvec{C}_n)$. How do we decide if the common real zero set of $\varvec{p}$ (up to continuous deformation) can be described by Viro’s patchworking method? Here we present a way to certify if this is the case for a given system $\varvec{p}$: We search for a ray ${{\,\textrm{Log}\,}}|\varvec{C}| + \lambda \varvec{v}$ that does not intersect the discriminant amoebae for all faces $\Gamma $ of $\textbf{A}$, except the irrelevant ones. This represents a real toric deformation starting from $\varvec{p}$ as in Proposition 2.22.

3.1 Statement of Main Result and an Example

We keep the notation from Sect. 2.9. The following proposition is the main result of this section.

Proposition 3.1

Let $\varvec{p}_{\varvec{C}}$ be a system of sparse polynomials with coefficient vector $\varvec{C}$ and support sets $A_1,A_2,\ldots ,A_n \subset \mathbb {Z}^n$ where $\dim (\textbf{A}_i)=n$ for all $1 \le i \le n$. Let T be the triangulation of the Cayley configuration $\textbf{A}=A_1*A_2*\cdots *A_n$ that is introduced using ${{\,\textrm{Log}\,}}\varvec{C}$ as a lifting function. Let M(T) be the corresponding mixed cell cone, and suppose that the dual cone $M(T)^{\circ }$ is generated by vectors $\varvec{\zeta (1)},\ldots ,\varvec{\zeta (L)}$. Then, if

$$\begin{aligned} \langle {{\,\textrm{Log}\,}}\varvec{C}, \varvec{\zeta (i)} \rangle \ > \ \log (\# \textbf{A}) \left\Vert \varvec{\zeta (i)}\right\Vert _1 \end{aligned}$$

(3.1)

for all $i=1,2,\ldots ,L$, the system $\varvec{p}_{\varvec{C}}$ is a patchworked polynomial system. Furthermore, for any $\varvec{v} \in M(T)$ the ray ${{\,\textrm{Log}\,}}\varvec{C}+ \lambda \varvec{v}$ for $\lambda \in [0,\infty )$ does not intersect the amoeba of $\Delta _{\Gamma }$ for all faces $\Gamma $ of $\textbf{A}$ except for the irrelevant ones.

Note that, for any coefficient vector $\varvec{C}$ with corresponding triangulation T, if the generators of the dual mixed cell cone $M(T)^{\circ }$ are $\varvec{\zeta (i)}$ for $i=1,2,\ldots ,L$, then, by definition,

$$\begin{aligned} \langle \varvec{\zeta (i)} , {{\,\textrm{Log}\,}}\varvec{C}\rangle \ > \ 0 \end{aligned}$$

for all $i=1,2,\ldots ,L$. To apply Proposition 3.1, we need

$$\begin{aligned} \langle \varvec{\zeta (i)}, {{\,\textrm{Log}\,}}\varvec{C}\rangle \ > \ \log (\# \textbf{A}) \left\Vert \varvec{\zeta (i)}\right\Vert _1. \end{aligned}$$

Here, $\left\Vert \varvec{\zeta (i)}\right\Vert _1 $ is a normalization; one can just use normalized generators with unit $\ell _1$-norm. So the loss in our relaxation is represented by the logarithmic term $\log (\# \textbf{A})$.
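The hypothesis of Proposition 3.1 is a finite list of linear inequalities, so it can be tested directly once the generators are known. The following Python sketch assumes that the generators $\varvec{\zeta (i)}$ and the vector ${{\,\textrm{Log}\,}}\varvec{C}$ are already available as plain lists of numbers. The function name `is_certifiably_patchworked` and the toy data are illustrative assumptions, not part of any existing implementation.

```python
import math


def is_certifiably_patchworked(log_c, generators):
    """Check <Log C, zeta> > log(#A) * ||zeta||_1 for every given generator."""
    log_m = math.log(len(log_c))  # len(log_c) plays the role of #A
    for zeta in generators:
        lhs = sum(z * c for z, c in zip(zeta, log_c))
        rhs = log_m * sum(abs(z) for z in zeta)
        if lhs <= rhs:
            return False
    return True


# Toy data (hypothetical): three support points and a single generator.
toy_generator = [-1.0, 2.0, -1.0]
strong = is_certifiably_patchworked([0.0, 3.0, 0.0], [toy_generator])  # 6 vs 4*log(3)
weak = is_certifiably_patchworked([0.0, 1.0, 0.0], [toy_generator])    # 2 vs 4*log(3)
print(strong, weak)
```

In the second call the pairing is positive, so the plain cone inequality holds, but the relaxed inequality fails. This is the loss caused by the term $\log (\# \textbf{A})$.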

Let us illustrate Proposition 3.1 on the simplest case: univariate polynomials. Let $A=\{ 0 , a_1, a_2, \ldots , a_{2d} \} \subset {\mathbb {Z}}$, and let $p(x)=c_0 + c_1 x^{a_1} + c_2 x^{a_2} + \cdots + c_{2d} x^{a_{2d}}$. Here a triangulation is a subdivision of the interval $[0,a_{2d}]$ into a union of smaller sub-intervals $[a_i,a_j]$. Suppose the lifting function ${{\,\textrm{Log}\,}}C =(\log \left| c_0 \right| , \log \left| c_1 \right| , \ldots , \log \left| c_{2d} \right| )$ induces the triangulation $T = \{ [0,a_2], [a_2, a_4], \ldots , [a_{2(d-1)},a_{2d}] \}$. The first “simplex” being $[0,a_2]$ means that for every $a_i$ with $i \notin \{0,2\}$ the point $(a_i, \log \left| c_i \right| )$ must lie above the line through $(0, \log \left| c_0 \right| )$ and $(a_2, \log \left| c_2 \right| )$. In terms of circuit inequalities, this means the following:

$$\begin{aligned} \frac{\log \left| c_2 \right| -\log \left| c_0 \right| }{a_2} < \frac{\log \left| c_i \right| - \log \left| c_0 \right| }{a_i} \; \; \text {for} \; i=1,3,4,5,\ldots , 2d. \end{aligned}$$

Or, equivalently

$$\begin{aligned} \log \left| c_0 \right| (a_i-a_2) - \log \left| c_2 \right| a_i + \log \left| c_i \right| a_2 > 0 \; \; \text {for} \; i=1,3,4,5,\ldots , 2d. \end{aligned}$$

The hypothesis of Proposition 3.1 amounts to

$$\begin{aligned} \log \left| c_0 \right| (a_i-a_2) - \log \left| c_2 \right| a_i + \log \left| c_i \right| a_2 > \log (2d+1) (\left| a_i-a_2 \right| + a_i + a_2). \end{aligned}$$

If the hypothesis of Proposition 3.1 is satisfied for all the circuit inequalities of the triangulation T (that is all generators of $M(T)^{\circ }$), then the number of real zeros of p(x) can be counted as follows: Let ${{\,\mathrm{{\text {sgn}}}\,}}c_i$ represent the signs of $c_i$, set

$$\begin{aligned} {{\,\mathrm{{\text {sgn}}}\,}}C :=( {{\,\mathrm{{\text {sgn}}}\,}}c_0, {{\,\mathrm{{\text {sgn}}}\,}}c_2, {{\,\mathrm{{\text {sgn}}}\,}}c_4, \ldots , {{\,\mathrm{{\text {sgn}}}\,}}c_{2d} ), \end{aligned}$$

and let k be the number of sign changes in the vector ${{\,\mathrm{{\text {sgn}}}\,}}C$. The vector ${{\,\mathrm{{\text {sgn}}}\,}}C$ represents the signs relevant to the triangulation T, and due to the nature of T we have the same sign vector on the “negative orthant” $(-\infty ,0)$. Then, Proposition 2.22 combined with Proposition 3.1 says that p has 2k real zeros.
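The counting rule above can be sketched in a few lines of Python. The sketch takes the signs of $c_0, c_1, \ldots , c_{2d}$, keeps the even-indexed ones as in the definition of ${{\,\mathrm{{\text {sgn}}}\,}}C$, and returns 2k. It assumes that the hypothesis of Proposition 3.1 has already been verified for the triangulation T and does not check it. The function name `predicted_real_zero_count` is hypothetical.

```python
def predicted_real_zero_count(signs):
    """Return 2k, where k counts sign changes among sgn c_0, sgn c_2, ..., sgn c_{2d}.

    `signs` lists sgn c_0, sgn c_1, ..., sgn c_{2d} as +1 or -1.
    The count is only meaningful when the patchworking hypothesis holds.
    """
    relevant = signs[0::2]  # signs attached to the vertices of T
    k = sum(1 for s, t in zip(relevant, relevant[1:]) if s != t)
    return 2 * k


# Hypothetical sign pattern for d = 2: the relevant signs are (+, -, +).
print(predicted_real_zero_count([+1, -1, -1, +1, +1]))
```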

3.2 Some Basic Results on the Complement of A-Discriminant Amoeba

For simplicity, we let $m= \# \textbf{A}$. Note that $\textbf{A}\subset {\mathbb {Z}}^{2n-1}$. Here, we assume the reader is familiar with the basic facts from Sect. 2.6 and start with the following lemma.

Lemma 3.2

Let $\varvec{\eta } $ be a vertex of the Newton polytope of $\Delta _{\textbf{A}}$, and let $K_{\varvec{\eta } }$ be the corresponding connected component in the complement of the $\varvec{A}$-discriminant amoeba.

(1)
Let $\varvec{u} \in K_{\varvec{\eta } }$ and let $\varvec{v} \in {\text {NC}}{(\varvec{\eta })}$. Then the ray $\varvec{u}+ \lambda \varvec{v}$ for $\lambda \in [0,\infty )$ does not intersect the $\varvec{A}$-discriminant amoeba.
(2)
Let $\Phi _{\textbf{A}}$ and B, respectively, be the map and the matrix defined in Sect. 2.9. Suppose $\varvec{\zeta } \in {\mathbb {R}}^{m-2n}$ with $\Phi _\textbf{A}(\varvec{\zeta }) \in \partial (B^{T} K_{\varvec{\eta } })$. Then $\varvec{\zeta } \in \left( B^{T} {\text {NC}}{(\varvec{\eta })} \right) ^{\circ }$.

Proof

As $K_{\varvec{\eta } }$ is a component of the complement of an amoeba, it is a convex set. Moreover, by Lemma 2.21, it includes a shifted copy of ${\text {NC}}{(\varvec{\eta })}$. Now let $H_{{{\,\mathrm{\varvec{w}}\,}}} := \{ \langle {{\,\mathrm{\varvec{w}}\,}}, {{\,\mathrm{\varvec{x}}\,}}\rangle = c \} $ be a supporting hyperplane of $K_{\varvec{\eta } }$ (i.e., for every $\varvec{y} \in K_{\varvec{\eta } }$ we have $\langle {{\,\mathrm{\varvec{w}}\,}}, \varvec{y} \rangle \ge c$). We claim ${{\,\mathrm{\varvec{w}}\,}}\in {\text {NC}}{(\varvec{\eta })}^{\circ }$: otherwise the shifted copy of the cone ${\text {NC}}{(\varvec{\eta })}$, that is included in $K_{\varvec{\eta } }$, would intersect the supporting hyperplane $H_{{{\,\mathrm{\varvec{w}}\,}}}$, which is a contradiction.

Let $\varvec{u} \in K_{\varvec{\eta } }$ and $\varvec{v} \in {\text {NC}}{(\varvec{\eta })}$. Then we have for any ${{\,\mathrm{\varvec{w}}\,}}\in {\text {NC}}{(\varvec{\eta })}^{\circ }$ and $\lambda > 0$

$$\begin{aligned} \langle {{\,\mathrm{\varvec{w}}\,}}, \varvec{u} \rangle \ \le \ \langle {{\,\mathrm{\varvec{w}}\,}}, \varvec{u} + \lambda \varvec{v} \rangle . \end{aligned}$$

Hence, the ray $\varvec{u} + \lambda \varvec{v} $ does not intersect any supporting hyperplane of $K_{\varvec{\eta } }$, and in consequence does not intersect the boundary of the convex set $K_{\varvec{\eta } }$.

Now suppose that we have a $\varvec{\zeta } \in {\mathbb {R}}^{m-2n}$ with $\Phi _\textbf{A}(\varvec{\zeta }) \in \partial (B^{T} K_{\varvec{\eta } })$. Then by the second property in Sect. 2.9 the supporting hyperplane at $\Phi _\textbf{A}(\varvec{\zeta })$ will be

$$\begin{aligned} H_{\varvec{\zeta }} \ := \ \left\{ {{\,\mathrm{\varvec{x}}\,}}\in \mathbb {R}^{m-2n} \ : \ \langle \varvec{\zeta } , {{\,\mathrm{\varvec{x}}\,}}\rangle = \sum _{i=1}^m \langle \varvec{b(i)}, \varvec{\zeta } \rangle \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| \right\} . \end{aligned}$$

Since there is a shifted copy of $B^{T} {\text {NC}}{(\varvec{\eta })}$ inside the convex set $B^{T} K_{\varvec{\eta } }$, this shows that $\varvec{\zeta } \in \left( B^{T} {\text {NC}}{(\varvec{\eta })} \right) ^{\circ }$. $\square $

The Gale dual matrix B in Lemma 3.2 is of size $m \times (m-2n)$. Thus, $B^{T} K_{\varvec{\eta } }$ is a projection of $K_{\varvec{\eta } }$ from $\mathbb {R}^m$ to $\mathbb {R}^{m-2n}$. The kernel of the matrix $B^{T}$ is included in every connected component $K_{\varvec{\eta } }$ of the complement of the $\textbf{A}$-discriminant amoeba, so this projection creates no loss of generality. There are two ways to see this: Formally, the kernel of $B^T$ is the span of the rows of A and the vector $(1,1,\ldots ,1)$, and in Theorem 2.2 it was noted that this space is included in every secondary cone and hence also in ${\text {NC}}{(\varvec{\eta })}$ by Lemma 2.16. Geometrically, the kernel of $B^T$ represents the homogeneities present in the $\varvec{A}$-discriminant variety as explained in Sect. 2.9.

Given a point ${{\,\textrm{Log}\,}}|\varvec{C}|$, testing if ${{\,\textrm{Log}\,}}|\varvec{C}| \in K_{\varvec{\eta } }$ is equivalent to testing if $B^{T} {{\,\textrm{Log}\,}}|\varvec{C}| \in B^{T} K_{\varvec{\eta } }$; the kernel of $B^{T}$ is included in all $K_{\varvec{\eta } }$. One can test whether $B^{T} {{\,\textrm{Log}\,}}|\varvec{C}| \in B^{T} K_{\varvec{\eta } }$ by checking all the supporting hyperplanes of $B^{T} K_{\varvec{\eta } }$ due to convexity. By Lemma 3.2 and the discussion in Sect. 2.9, we know that these supporting hyperplanes are of the form

$$\begin{aligned} H_{\varvec{\zeta }} \ := \ \left\{ {{\,\mathrm{\varvec{x}}\,}}\in \mathbb {R}^{m-2n} : \langle \varvec{\zeta } , {{\,\mathrm{\varvec{x}}\,}}\rangle = \sum _{i=1}^m \langle \varvec{b(i)}, \varvec{\zeta } \rangle \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| \right\} \end{aligned}$$

for some $\varvec{\zeta } \in \left( B^{T} {\text {NC}}{(\varvec{\eta })} \right) ^{\circ }$. Now let T be a triangulation of $\textbf{A}$, and let $\varvec{\eta } $ be a vertex in the Newton polytope of $\Delta _{\textbf{A}}$ with the property

$$\begin{aligned} {\text {NC}}{(T)} \ \subseteq \ M(T) \ \subseteq \ {\text {NC}}{(\varvec{\eta })}, \end{aligned}$$

see Lemma 2.16. By linearity, this means

$$\begin{aligned}&B^{T}{\text {NC}}{(T)} \ \subseteq \ B^{T}M(T) \ \subseteq \ B^{T}{\text {NC}}{(\varvec{\eta })}, \ \text { and } \\&\left( B^{T}{\text {NC}}{(\varvec{\eta })} \right) ^{\circ } \ \subseteq \ \left( B^{T} M(T) \right) ^{\circ } \ \subseteq \ \left( B^{T} {\text {NC}}{(T)} \right) ^{\circ }. \end{aligned}$$

Instead of checking hyperplanes defined by $ \varvec{\zeta } \in \left( B^{T} {\text {NC}}{(\varvec{\eta })} \right) ^{\circ }$ we check the inequalities given by the larger cone $\left( B^{T} M(T) \right) ^{\circ }$. Before we explain the reason for this, we make an observation: $B(B^{T} M(T))^{\circ } \subseteq M(T)^{\circ }$ as follows:

$$\begin{aligned} {{\,\mathrm{\varvec{x}}\,}}\in (B^{T} M(T))^{\circ } \ \Rightarrow \ \langle B {{\,\mathrm{\varvec{x}}\,}}, \varvec{y} \rangle \ge 0 \; \text {for all} \; \varvec{y} \in M(T). \end{aligned}$$

Also note that by definition we have

$$\begin{aligned} \langle \varvec{\zeta } , B^{T} {{\,\textrm{Log}\,}}\varvec{C}\rangle \ = \ \langle B \varvec{\zeta }, {{\,\textrm{Log}\,}}\varvec{C}\rangle . \end{aligned}$$

So instead of using $\langle \varvec{\zeta } , B^{T} {{\,\textrm{Log}\,}}\varvec{C}\rangle > 0$ for $ \varvec{\zeta } \in \left( B^{T} {\text {NC}}{(\varvec{\eta })} \right) ^{\circ }$ as our criterion, we will use $\langle \tau , {{\,\textrm{Log}\,}}\varvec{C}\rangle > 0$ for all $\tau \in M(T)^{\circ }$. There are two reasons for this. The first reason is that, to ensure the criterion in Proposition 2.22 is satisfied, we indeed have to check all circuit inequalities in $M(T)^{\circ }$. Proposition 2.22 involves the amoebae of $\Delta _{\Gamma }$ for all faces $\Gamma $ of $\textbf{A}$, except the irrelevant ones, and checking with the dual cones coming from all such $\Gamma $ is equivalent to checking all inequalities in $M(T)^{\circ }$. This first reason will become clearer in Sect. 3.4. The second reason is algorithmic efficiency: the generators of $M(T)^{\circ }$ have already been computed along the way, since these are the circuit inequalities computed by Jensen’s tropical homotopy algorithm. Hence using $M(T)^{\circ }$ does not incur a significant additional computational cost.

3.3 Quantitative Estimates

Lemma 3.3

Let T be a triangulation of $\textbf{A}$, and keep the notation from Lemma 3.2 for $K_{\varvec{\eta }}$ and B. If a given vector ${{\,\textrm{Log}\,}}|\varvec{C}|$ satisfies

$$\begin{aligned} \left\langle \varvec{\zeta }, B^{T} {{\,\textrm{Log}\,}}\varvec{C}\right\rangle \ > \ \log (m) \left\Vert B \varvec{\zeta }\right\Vert _1 \end{aligned}$$

for all $\varvec{\zeta } \in (B^{T}M(T))^{\circ }$, then we have ${{\,\textrm{Log}\,}}|\varvec{C}| \in K_{\varvec{\eta }}$ for a vertex $\varvec{\eta }$ of $\Delta _\textbf{A}$ which satisfies $M(T) \subseteq {\text {NC}}{(\varvec{\eta })}$.

The proof of Lemma 3.3 will follow after we make some observations. We first note a basic observation on entropy type sums.

Lemma 3.4

Let ${{\,\mathrm{\varvec{x}}\,}}\in {\mathbb {R}}_{\ge 0}^{d}$ be a vector with nonnegative entries. Then, we have

$$\begin{aligned} \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1 \log \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1 - \log (d) \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1 \le \sum _{i=1}^d x_i \log (x_i) \le \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1 \log \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1, \end{aligned}$$

where $\left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1=\sum _{i=1}^d |x_i|$ represents the $\ell _1$-norm of the vector ${{\,\mathrm{\varvec{x}}\,}}$.

Proof

Let $\varvec{y}:=\frac{{{\,\mathrm{\varvec{x}}\,}}}{\left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1}$. Since $\left\Vert \varvec{y}\right\Vert _1=1$, and it has nonnegative entries, we can see $\varvec{y}$ as a discrete probability distribution supported on d strings. As usual $H(\varvec{y})=\sum _{i=1}^d -y_i\log (y_i) $ is the entropy of $\varvec{y}$, and it is well known that $H(\varvec{y}) \le \log (d)$ [48]. So, we have

$$\begin{aligned} H(\varvec{y}) \ = \ \frac{1}{\left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1}\left( \sum _{i=1}^d x_i \log \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1 - x_i \log (x_i) \right) \le \log (d). \end{aligned}$$

This gives us the following inequality

$$\begin{aligned} \log \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1 \sum _{i=1}^d x_i \ \le \ \log (d) \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1 + \sum _{i=1}^d x_i \log (x_i), \end{aligned}$$

which proves the left-hand side inequality in the claim. The right-hand side is obvious. $\square $
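As a numerical sanity check of Lemma 3.4 (not a substitute for the proof), the following Python sketch evaluates the three quantities in the claim for random nonnegative vectors. The helper name `entropy_bounds` is an assumption made for this illustration, and the convention $0 \log 0 = 0$ is used for zero entries.

```python
import math
import random


def entropy_bounds(x):
    """Return (lower, middle, upper) from Lemma 3.4 for a nonnegative vector x != 0."""
    norm1 = sum(x)
    middle = sum(xi * math.log(xi) for xi in x if xi > 0)  # 0 * log(0) := 0
    upper = norm1 * math.log(norm1)
    lower = upper - math.log(len(x)) * norm1
    return lower, middle, upper


random.seed(0)
for _ in range(1000):
    d = random.randint(1, 8)
    x = [random.uniform(0.01, 5.0) for _ in range(d)]
    lower, middle, upper = entropy_bounds(x)
    assert lower - 1e-9 <= middle <= upper + 1e-9

# The lower bound is attained by the uniform vector, the upper one by a single entry.
print(entropy_bounds([1.0, 1.0]))
print(entropy_bounds([2.0, 0.0]))
```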

Now we derive the following useful estimate based on Lemma 3.4.

Lemma 3.5

Let $\textbf{A}$ be the support set, and let B be the $ m \times (m-2n)$ Gale dual. Then, for every $\varvec{\zeta } \in \mathbb {R}^{m-2n}$ we have

$$\begin{aligned} -\frac{1}{2} \left\Vert B \varvec{\zeta }\right\Vert _1 \log (m) \ \le \ \sum _{i=1}^m \langle \varvec{b(i)} , \varvec{\zeta } \rangle \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| \ \le \ \frac{1}{2} \left\Vert B \varvec{\zeta }\right\Vert _1 \log (m). \end{aligned}$$

Proof

By construction, every element in the column space of B has the sum of its coordinates equal to zero. So, for every $\varvec{\zeta } \in {\mathbb {R}}^{m-2n}$ the sum of the entries of $B \varvec{\zeta }$ is zero. That is,

$$\begin{aligned} \sum _{i=1}^m \langle \varvec{b(i)} , \varvec{\zeta } \rangle \ = \ 0, \end{aligned}$$

where $\varvec{b(i)}$ represents rows of the matrix B. We write $B \varvec{\zeta } = ({{\,\mathrm{\varvec{x}}\,}},-\varvec{y})$ for some ${{\,\mathrm{\varvec{x}}\,}}$ and $\varvec{y}$ that are nonnegative in all coordinates, so we have $\left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1=\left\Vert \varvec{y}\right\Vert _1=\frac{1}{2}\left\Vert B\varvec{\zeta }\right\Vert _1$. We also observe

$$\begin{aligned} \sum _{i=1}^m \langle \varvec{b(i)} , \varvec{\zeta } \rangle \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| \ = \ \sum _{i=1}^{m_1} x_i \log (x_i) - \sum _{i=1}^{m_2} y_i \log (y_i). \end{aligned}$$

Note that $m_1$ and $m_2$ in the above expression are both less than m. Using Lemma 3.4 and $\left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1=\left\Vert \varvec{y}\right\Vert _1=\frac{1}{2}\left\Vert B\varvec{\zeta }\right\Vert _1$ gives us the following estimate:

$$\begin{aligned} - \frac{1}{2}\left\Vert B \varvec{\zeta }\right\Vert _1 \log (m) \ \le \ \sum _{i=1}^m \langle \varvec{b(i)} , \varvec{\zeta } \rangle \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| \ \le \ \frac{1}{2} \left\Vert B \varvec{\zeta }\right\Vert _1 \log (m). \end{aligned}$$

(3.2)

$\square $
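The proof of Lemma 3.5 only uses that the coordinates of $B \varvec{\zeta }$ sum to zero. The following Python sketch therefore tests the two-sided estimate on random vectors with zero coordinate sum, which stand in for $B \varvec{\zeta }$; no actual Gale dual matrix is constructed, and the function names are hypothetical.

```python
import math
import random


def signed_entropy_sum(w):
    """Evaluate sum_i w_i * log|w_i|, skipping zero entries."""
    return sum(wi * math.log(abs(wi)) for wi in w if wi != 0)


def lemma_3_5_bound(w):
    """Evaluate (1/2) * ||w||_1 * log(m) with m = len(w)."""
    return 0.5 * sum(abs(wi) for wi in w) * math.log(len(w))


random.seed(1)
for _ in range(1000):
    m = random.randint(2, 10)
    w = [random.uniform(-3.0, 3.0) for _ in range(m - 1)]
    w.append(-sum(w))  # force the coordinate sum to zero, as for B * zeta
    assert abs(signed_entropy_sum(w)) <= lemma_3_5_bound(w) + 1e-9

print(signed_entropy_sum([2.0, -1.0, -1.0]), lemma_3_5_bound([2.0, -1.0, -1.0]))
```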

Proof of Lemma 3.3

Using Lemma 3.5 and the hypothesis of Lemma 3.3 we have

$$\begin{aligned} \langle \varvec{\zeta } , B^{T} {{\,\textrm{Log}\,}}|\varvec{C}| \rangle \ > \ \log (m) \left\Vert B \varvec{\zeta }\right\Vert _1 \ > \ \sum _{i=1}^m \langle \varvec{b(i)}, \varvec{\zeta } \rangle \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| \end{aligned}$$

for all $ \varvec{\zeta } \in (B^{T}M(T))^{\circ } $. (Note that $(B^{T}{\text {NC}}{(\varvec{\eta })})^{\circ } \subset (B^{T}M(T))^{\circ }$.) By Lemma 3.2, we know that the supporting hyperplanes of $K_{\varvec{\eta }}$ are of the form

$$\begin{aligned} H_{\varvec{\zeta }} \ := \ \left\{ {{\,\mathrm{\varvec{x}}\,}}\in \mathbb {R}^{m-2n} : \langle \varvec{\zeta } , {{\,\mathrm{\varvec{x}}\,}}\rangle = \sum _{i=1}^m \langle \varvec{b(i)}, \varvec{\zeta } \rangle \log \left| \langle \varvec{b(i)} , \varvec{\zeta } \rangle \right| \right\} \end{aligned}$$

for some $\varvec{\zeta } \in (B^{T}{\text {NC}}{(\varvec{\eta })})^{\circ } $. So, these two facts together imply that $B^{T}{{\,\textrm{Log}\,}}|\varvec{C}|$ and the shifted copy of ${\text {NC}}{(\varvec{\eta })}$ are not separated by any supporting hyperplane of $B^{T}K_{\varvec{\eta }}$. This means $B^{T}{{\,\textrm{Log}\,}}|\varvec{C}| \in B^{T}K_{\varvec{\eta }}$. Since the kernel of $B^{T}$ is included in $K_{\varvec{\eta }}$ this also implies ${{\,\textrm{Log}\,}}|\varvec{C}| \in K_{\varvec{\eta }}$. $\square $

3.4 Putting Things Together

Now, we complete the proof of Proposition 3.1. Recall that by definition $ \langle \tau , B^{T} {{\,\textrm{Log}\,}}\varvec{C}\rangle = \ \langle B \tau , {{\,\textrm{Log}\,}}\varvec{C}\rangle $, and $B(B^{T} M(T))^{\circ } \subseteq M(T)^{\circ }$. So, if a given vector ${{\,\textrm{Log}\,}}|\varvec{C}|$ satisfies

$$\begin{aligned} \langle \varvec{\zeta } , {{\,\textrm{Log}\,}}\varvec{C}\rangle > \log (m) \left\Vert \varvec{\zeta }\right\Vert _1 \end{aligned}$$

(3.3)

for all $\varvec{\zeta } \in M(T)^{\circ }$, then by Lemma 3.3 we have that ${{\,\textrm{Log}\,}}|\varvec{C}| \in K_{\varvec{\eta }}$ for a vertex $\varvec{\eta }$ of $\Delta _\textbf{A}$ which satisfies $M(T) \subseteq {\text {NC}}{(\varvec{\eta })}$.

Suppose $M(T)^{\circ }$ is generated by $\varvec{\zeta (1)},\ldots ,\varvec{\zeta (L)}$, and assume for a given vector ${{\,\textrm{Log}\,}}|\varvec{C}|$ we have

$$\begin{aligned} \langle \varvec{\zeta (i)} , {{\,\textrm{Log}\,}}\varvec{C}\rangle > \log (m) \left\Vert \varvec{\zeta (i)}\right\Vert _1 \end{aligned}$$

for all $i=1,2,\ldots ,L$. Then for any ${{\,\mathrm{\varvec{x}}\,}}\in M(T)^{\circ }$ with ${{\,\mathrm{\varvec{x}}\,}}=\sum t_i \varvec{\zeta (i)}$ with $t_i \ge 0$ one has the following inequality

$$\begin{aligned} \langle {{\,\textrm{Log}\,}}\varvec{C}, {{\,\mathrm{\varvec{x}}\,}}\rangle \ > \ \log (m) \sum t_i \left\Vert \varvec{\zeta (i)}\right\Vert _1 \ge \log (m) \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1, \end{aligned}$$

where the last inequality follows from the triangle inequality. Hence, checking the condition in Proposition 3.1 only for the generators of $M(T)^{\circ }$ suffices to guarantee $ \langle {{\,\textrm{Log}\,}}|\varvec{C}| , {{\,\mathrm{\varvec{x}}\,}}\rangle \ > \log (m) \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1$ for all ${{\,\mathrm{\varvec{x}}\,}}\in M(T)^{\circ }$. At this point we have proved the following: If the hypothesis of Proposition 3.1 is satisfied, then Lemma 3.3 and Lemma 3.2 show that for all $\varvec{v} \in M(T)$ the ray ${{\,\textrm{Log}\,}}\left| C \right| + \lambda \varvec{v}$ for $\lambda \in [0, \infty )$ does not intersect the amoeba of $\Delta _{\textbf{A}}$.
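For completeness, the reduction to generators can be written out term by term. Assuming ${{\,\mathrm{\varvec{x}}\,}}\ne 0$, so that at least one $t_i$ is positive, it uses only the linearity of the pairing and the triangle inequality for the $\ell _1$-norm:

$$\begin{aligned} \langle {{\,\textrm{Log}\,}}\varvec{C}, {{\,\mathrm{\varvec{x}}\,}}\rangle \&= \ \sum _{i=1}^{L} t_i \langle {{\,\textrm{Log}\,}}\varvec{C}, \varvec{\zeta (i)} \rangle \ > \ \log (m) \sum _{i=1}^{L} t_i \left\Vert \varvec{\zeta (i)}\right\Vert _1 \\&= \ \log (m) \sum _{i=1}^{L} \left\Vert t_i \varvec{\zeta (i)}\right\Vert _1 \ \ge \ \log (m) \left\Vert \sum _{i=1}^{L} t_i \varvec{\zeta (i)}\right\Vert _1 \ = \ \log (m) \left\Vert {{\,\mathrm{\varvec{x}}\,}}\right\Vert _1. \end{aligned}$$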

Now let $\Gamma $ be a face of $\textbf{A}$ that is not an irrelevant face. Let $T|_{\Gamma }$ be the restriction of the triangulation T to $\Gamma $. By Lemma 2.16 the circuit inequalities generating $M(T|_{\Gamma })^{\circ }$ are included in $M(T)^{\circ }$. We also observe that $\log (\# \textbf{A}) \ge \log (\# \Gamma )$. This implies that the criterion of Proposition 3.1 also ensures that the ray ${{\,\textrm{Log}\,}}\left| C \right| + \lambda \varvec{v}$ does not intersect the amoeba of $\Delta _{\Gamma }$. Using Proposition 2.22 completes the proof.

4 Real Polyhedral Homotopy

In this section, we summarize the main steps of our real polyhedral homotopy algorithm. The algorithm follows the common thread of homotopy continuation algorithms, but it operates entirely over the real numbers.

The idea of the algorithm is as follows: Given a polynomial system $\varvec{p}=(p_1,p_2,\ldots ,p_n)$ with support sets $A_1,A_2,\ldots ,A_n \subseteq {\mathbb {Z}}^n$ and coefficient vectors $\varvec{C}_i \in \mathbb {R}^{\# A_i}$ for $i=1,2,\ldots ,n$, we concatenate the support sets and coefficient vectors as $\textbf{A}=A_1*A_2*\cdots *A_n$, $\varvec{C}=(\varvec{C}_1,\varvec{C}_2,\ldots ,\varvec{C}_n)$. Then, we compute the triangulation T of $\textbf{A}$ with respect to the lifting function ${{\,\textrm{Log}\,}}\varvec{C}$. This step is performed using Jensen’s tropical homotopy algorithm as explained in Sect. 2.3. Using Jensen’s algorithm makes the generators of the cone $M(T)^{\circ }$ readily available. Then, we check if the criterion of Proposition 3.1 is satisfied by the vector ${{\,\textrm{Log}\,}}\varvec{C}$. If the criterion is not satisfied, the algorithm halts and prints “Input system is not certifiably patchworked”. If the criterion is satisfied, we then find the real zeros of the binomial systems that correspond to the mixed cells of T as explained in Sect. 2.4. After that, we pick a vector $\varvec{v} \in M(T) - \partial M(T)$. This can be done in a multitude of ways, e.g., by using the multiplicative updates method, or one can simply set $\varvec{v}={{\,\textrm{Log}\,}}\varvec{C}$, since the fact ${{\,\textrm{Log}\,}}\varvec{C}\in M(T) - \partial M(T)$ is already certified. Then, we track the solution paths x(t) corresponding to $\varvec{v}$ as in Proposition 2.22 from $t=0$ to $t=1$. This numerical tracking step is discussed in Sect. 2.8. The correctness of the algorithm follows from Proposition 3.1 and Proposition 2.22. We give an example showing how the algorithm performs in practice.
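The control flow described above can be sketched as follows. This is only a skeleton under stated assumptions: the mixed-cell computation, the binomial solver and the path tracker are passed in as callables (`solve_binomial_systems` and `track_path` are hypothetical names), and the generators of $M(T)^{\circ }$ are assumed to be given. Only the certification step is spelled out.

```python
import math
from typing import Callable, List, Optional, Sequence

Point = List[float]


def real_polyhedral_homotopy(
    log_c: Sequence[float],
    dual_generators: Sequence[Sequence[float]],
    solve_binomial_systems: Callable[[], List[Point]],
    track_path: Callable[[Point, Sequence[float]], Point],
) -> Optional[List[Point]]:
    """Certify, solve the binomial start systems, then track one path per start point."""
    log_m = math.log(len(log_c))
    for zeta in dual_generators:
        lhs = sum(z * c for z, c in zip(zeta, log_c))
        if lhs <= log_m * sum(abs(z) for z in zeta):
            print("Input system is not certifiably patchworked")
            return None
    starts = solve_binomial_systems()  # real zeros of the mixed-cell binomial systems
    v = list(log_c)                    # simplest choice of direction: v = Log C
    return [track_path(start, v) for start in starts]


# Usage with placeholder callables (the identity "tracker" does no numerical work).
result = real_polyhedral_homotopy(
    [0.0, 3.0, 0.0], [[-1.0, 2.0, -1.0]],
    lambda: [[1.0, 2.0]], lambda start, v: start,
)
print(result)
```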

Example 4.1

We reconsider the polynomials presented in Example 2.13, but this time we fix the coefficients to be real numbers instead of using coefficients that are Puiseux series.

$$\begin{aligned}&f \ = \ x_2^3 - (0.45)x_1x_2^2 - (0.45)^5x_1^2x_2 + (0.45)^{12}x_1^3 - (0.45)x_2^2 + (0.45)^4 x_1 x_2 \\&- (0.45)^9 x_1^2- (0.45)^5 x_2 - (0.45)^9 x_1 + (0.45)^{12}, \\&g \ = \ (0.45)^8 x_2^2 - (0.45)^6 x_1 x_2 + (0.45)^6 x_1^2 - (0.45)^3 x_2 - (0.45)^2 x_1 + 1. \end{aligned}$$

This leads to the following support and, using log-absolute values of the coefficients, the following lifting vectors:

$$\begin{aligned}&{\texttt {Support f:}} {\mathtt {2 \times 10}} \,{\texttt {Array}}\{\texttt{Int64,2}\}: \left[ \begin{array}{cccccccccc} 0 &{} 1 &{} 2 &{} 3 &{} 0 &{} 1 &{} 2 &{} 0 &{} 1 &{} 0 \\ 3 &{} 2 &{} 1 &{} 0 &{} 2 &{} 1 &{} 0 &{} 1 &{} 0 &{} 0 \\ \end{array}\right] \\&{\texttt {Lifting f:}} \ \left[ \begin{array}{cccccccccc} 0 &{} 1 &{} 5 &{} 12 &{} 1 &{} 4 &{} 9 &{} 5 &{} 9 &{} 12\\ \end{array}\right] \\&{\texttt {Support g:}} {\mathtt {2 \times 6}} \,{\texttt {Array}} \{\texttt{Int64,2}\}: \left[ \begin{array}{cccccc} 0 &{} 1 &{} 2 &{} 0 &{} 1 &{} 0 \\ 2 &{} 1 &{} 0 &{} 1 &{} 0 &{} 0 \\ \end{array}\right] \\&{\texttt {Lifting g:}} \ \left[ \begin{array}{cccccc} 8 &{} 6 &{} 6 &{} 3 &{} 2 &{} 0 \\ \end{array}\right] . \end{aligned}$$

What we did so far corresponds to the initialization step of the algorithm (step 2). Now we need to compute the mixed cells and list the generators of the dual mixed-cell cone (steps 3 and 4). There are six mixed cells and corresponding circuit inequalities. These mixed cells are depicted in the right picture of Figure 1. After verifying that our system is patchworked (this is step 5 in the algorithm), we pass to step 6: For each of these mixed cells, we obtain a binomial system, which we then solve using the Hermite normal form. For example, the first mixed cell is represented by

$$\begin{aligned} \texttt {volume:} 1 \, \texttt {indices:} \texttt {Tuple} \{\texttt {Int64,Int64}\}[(2, 1), (5, 6)] \, \texttt {normal:} [-2.0, -1.0] \end{aligned}$$

with a solution for the corresponding binomial system given by

$$\begin{aligned}{}[4.938271604938272, 2.2222222222222223]. \end{aligned}$$
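This start solution can be reproduced by hand. The sketch below assumes that the printed indices $(2,1)$ and $(5,6)$ are 1-based positions in the support matrices of f and g, so that the binomial system is $x_2^3 - (0.45) x_1 x_2^2 = 0$ and $-(0.45)^2 x_1 + 1 = 0$. Since the cell has volume 1, the exponent matrix is unimodular and can be inverted over the integers, so no Hermite normal form routine is needed in this special case. The function name is hypothetical.

```python
def solve_unimodular_binomial_system(exponent_rows, rhs):
    """Solve x^{M_i} = r_i (i = 1, 2) for a 2x2 integer matrix M with det = +1 or -1."""
    (a, b), (c, d) = exponent_rows
    det = a * d - b * c
    if det not in (1, -1):
        raise ValueError("exponent matrix is not unimodular")
    # Integer inverse of M; division by det = +-1 is exact.
    inverse = [[d // det, -b // det], [-c // det, a // det]]
    # x_j = r_1^{inverse[j][0]} * r_2^{inverse[j][1]}; integer exponents keep signs correct.
    return [rhs[0] ** inverse[j][0] * rhs[1] ** inverse[j][1] for j in range(2)]


q = 0.45
# A binomial c_a x^a + c_b x^b = 0 is rewritten as x^(a-b) = -c_b / c_a.
# f-part: a = (0,3), c_a = 1;    b = (1,2), c_b = -q.
# g-part: a = (1,0), c_a = -q^2; b = (0,0), c_b = 1.
rows = [(0 - 1, 3 - 2), (1 - 0, 0 - 0)]
rhs = [-(-q) / 1.0, -1.0 / (-(q ** 2))]
start = solve_unimodular_binomial_system(rows, rhs)
print(start)
```

Up to floating-point rounding, the result is $(0.45^{-2}, 0.45^{-1})$, which agrees with the printed start solution.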

Similarly, we obtain five further solutions for the five other binomial systems corresponding to the other mixed cells.

$$\begin{aligned}&[4.938271604938272, -0.20249999999999999] \quad [4.938271604938272, -0.041006249999999994] \\&[24.386526444139612, 10.973936899862824] \quad [24.386526444139612, -1.0] \\&[24.386526444139612, 0.09112500000000004]. \end{aligned}$$

For step 7, we simply pick $v= {{\,\textrm{Log}\,}}C$. We perform step 8 using the polyhedral homotopy continuation in Homotopy.JL. After a total runtime of roughly 0.0001 seconds (see Footnote 1), we arrive at step 9. Here are the six real solutions for the original system:

$$\begin{aligned}&[4.20818, 2.41707] \quad [7.12063, -0.138875] \quad [6.94337, -0.0383256] \\&[49.3211, 24.3919] \quad [15.9697, -0.517115] \quad [17.5735, 0.0244792]. \end{aligned}$$
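A quick way to check such output is to substitute the printed points back into f and g. The following Python sketch does this and reports residuals relative to the size of the individual terms, since the solutions are printed with only six significant digits. The helper names are assumptions made for this illustration.

```python
q = 0.45


def f_terms(x1, x2):
    """Terms of f from Example 4.1, in the order they are printed."""
    return [x2**3, -q * x1 * x2**2, -q**5 * x1**2 * x2, q**12 * x1**3,
            -q * x2**2, q**4 * x1 * x2, -q**9 * x1**2, -q**5 * x2,
            -q**9 * x1, q**12]


def g_terms(x1, x2):
    """Terms of g from Example 4.1, in the order they are printed."""
    return [q**8 * x2**2, -q**6 * x1 * x2, q**6 * x1**2,
            -q**3 * x2, -q**2 * x1, 1.0]


def relative_residual(terms):
    """|sum of terms| divided by the sum of |terms|."""
    return abs(sum(terms)) / sum(abs(t) for t in terms)


solutions = [(4.20818, 2.41707), (7.12063, -0.138875), (6.94337, -0.0383256),
             (49.3211, 24.3919), (15.9697, -0.517115), (17.5735, 0.0244792)]
for x1, x2 in solutions:
    print((x1, x2), relative_residual(f_terms(x1, x2)),
          relative_residual(g_terms(x1, x2)))
```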

5 Remarks on Complexity

In this section, we discuss complexity aspects of the real polyhedral homotopy algorithm. Our goal is to identify the key parameters that govern the complexity of the RPH algorithm. Our main finding is that the complexity of the real polyhedral homotopy algorithm is controlled by the number of mixed cells in the triangulation of the Cayley configuration $\textbf{A}=A_1*A_2*\cdots *A_n$ that is induced by using the coefficients as a lifting function. We also show that the number of mixed cells admits an $O(t^n)$ upper bound where t is the maximal number of terms in $A_i$ for $1 \le i \le n$. So if the number of variables n is considered to be fixed, and the number of terms t is a variable, the discrete computations in RPH take polynomial time.

In general, the discrete part of RPH corresponds to computing the mixed cells of a polyhedral subdivision that is induced by a fixed lifting, without worrying about the volumes of the mixed cells. Hence, any complexity-theoretic upper and lower bounds for computing mixed cells (without volumes) apply to the discrete computations in our algorithm. For the numerical part, the number of paths tracked by RPH is dramatically smaller than that of complex homotopy algorithms. However, as noted in the introduction, we are not able to provide a rigorous complexity analysis for the numerical part of the algorithm for the time being.

5.1 Tropical Homotopy Algorithm

We start this section with bounding the number of inequalities needed to describe a mixed cell.

Lemma 5.1

Let $A_1,A_2,\ldots ,A_n$ be point configurations with at most t elements, and let T be a triangulation of $\textbf{A}=A_1 * A_2 * \cdots * A_n$. Then, a mixed-cell $\sigma \in T$ is determined in the mixed-cell cone $M(\sigma )$ by at most $n(t-2)$ inequalities.

Proof

(Proof sketch) The mixed-cell cone describes the case where the simplex corresponding to the mixed cell is a facet of the lifted Cayley polytope. So, for every element $\varvec{\alpha } \in \textbf{A}$ we get a circuit inequality given by 2n many vertices of the mixed cell and $\varvec{\alpha }$ that determines whether $\varvec{\alpha }$ is contained in the mixed cell. In total, we have at most $n(t-2)$ many such $\varvec{\alpha }$, and at most that many corresponding circuit inequalities. $\square $

This immediately yields the following corollary.

Corollary 5.2

Let $A_1,A_2,\ldots ,A_n$ be point configurations with at most t elements, and let T be a triangulation of $\textbf{A}= A_1 * A_2 * \cdots * A_n$ with k mixed cells. Then, the mixed-cell cone M(T) can be described by at most $kn(t-2)$ many linear inequalities all supported on circuits.

The proof of Proposition 5.6 gives us an upper bound on the number of mixed cells. Using this rough upper bound, we derive the following corollary.

Corollary 5.3

Let $A_1,A_2,\ldots ,A_n$ be point configurations with at most t elements, and let T be a triangulation of $\textbf{A}=A_1 * A_2 * \cdots * A_n$. Then, the mixed-cell cone M(T) can be described by at most $2ne^n(t-1)^{n+1}$ many linear inequalities all supported on circuits.
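One way to see where this bound comes from, assuming the facet bound from the proof of Proposition 5.6 (that is, Theorem 5.5 applied to the lifted Cayley polytope with at most tn vertices) is used to bound the number k of mixed cells in Corollary 5.2:

$$\begin{aligned} k \, n (t-2) \ \le \ 2 \left( {\begin{array}{c}tn-n\\ n\end{array}}\right) n (t-2) \ \le \ 2 e^{n} (t-1)^{n} \, n (t-1) \ = \ 2 n e^{n} (t-1)^{n+1}. \end{aligned}$$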

Corollary 5.3 gives an upper bound on the number of updates in the tropical homotopy algorithm: For a fixed number of variables n, it is polynomial in t. This shows that the complexity of a mixed-cell cone computation is controlled by the cardinality of the support sets; this aligns well with Kushnirenko’s fewnomial philosophy.

Jensen wrote a paper on implementation details of his algorithm for the purpose of mixed volume computation [21]. Thanks to real geometry, we do not need volumes, but only the mixed cells. So Jensen’s current implementation does not output precisely what we need in this paper. A new implementation that outputs what we need is currently being developed by Timme. Real polyhedral homotopy is planned to be incorporated into Homotopy.JL [8].

5.2 Effective Viro’s Patchworking

As explained in Jensen’s paper [22] and [11, Lemma 5.1.13], every circuit inequality is given by a vector with $n+2$ non-zero entries, and every entry is given by the volume of a simplex. Since we can compute the volume of a simplex at $O(n^3)$ cost, we can compute each circuit inequality at $O(n^4)$ cost. This gives us the following basic complexity estimate as a corollary of Lemma 5.1 and Corollary 5.2.

Corollary 5.4

Let $A_1,A_2,\ldots ,A_n$ be point configurations with at most t elements, and let T be a triangulation of $\textbf{A}= A_1 * A_2 * \cdots * A_n$ with k mixed cells. Then, the criterion in Lemma 3.3 can be checked by $O(kn^5(t-2))$ many arithmetic operations.

Using Proposition 5.6 one can provide an upper bound for k and hence deduce an $O(e^{n}n^5t^{n+1})$ upper bound for the number of arithmetic operations.

5.3 A Fewnomial Bound for Patchworked Polynomial Systems

We start this section by stating a special case of McMullen’s Upper Bound Theorem [49].

Theorem 5.5

(Upper Bound Theorem; special case) Let $Q \subset \mathbb {R}^{2n}$ be a polytope with t vertices. Then the number of facets of Q is bounded by $2\left( {\begin{array}{c}t-n\\ n\end{array}}\right) $.

In the case of zero-dimensional systems, Viro’s method counts the number of common zeros in $(\mathbb {R}^{*})^n$. The discussions in Sects. 2.2 and 2.7 show that for a patchworked polynomial system supported on point sets $A_1,A_2,\ldots ,A_n \subset \mathbb {Z}^n$, the number of zeros in the positive orthant is bounded by the number of mixed cells in the corresponding coherent polyhedral subdivision of $A_1+A_2+\cdots +A_{n}$. This yields the following statement.

Proposition 5.6

(Few Zeros for Patchworked Systems) Let $A_1,A_2,\ldots ,A_n \subset \mathbb {Z}^n$, and let $\left| A_1*A_2* \cdots * A_n \right| \le tn$. Then for a patchworked polynomial system $\varvec{p}=(p_1,p_2,\ldots ,p_n)$ supported with $A_1,A_2,\ldots ,A_n$, the number of common zeros of $\varvec{p}$ in $(\mathbb {R}^{*})^{n}$ is at most

$$\begin{aligned} 2^{n+1} \left( {\begin{array}{c}tn-n\\ n\end{array}}\right) . \end{aligned}$$

Proof

Let $\omega $ be a lifting function and let $\Delta _{\omega }$ be the corresponding coherent fine mixed subdivision of $A_1+A_2+\cdots +A_n$. The number of mixed cells in $\Delta _{\omega }$ equals the number of corresponding simplices in the triangulation of the Cayley configuration $\textbf{A}=A_1*A_2* \cdots *A_n$; see Sect. 2.1. The simplices that correspond to mixed cells are the simplices with two vertices from each $A_i$. The number of all simplices in the triangulation is less than the number of facets of the lifted Cayley polytope ${{\,\textrm{Cay}\,}}(A^\omega ) = {{\,\mathrm{{\text {conv}}}\,}}(\textbf{A}^{\omega })$. The polytope ${{\,\textrm{Cay}\,}}(A^\omega )$ is contained in $\mathbb {R}^{2n}$, and it has the same number of vertices as $\textbf{A}$. So, the number of facets of ${{\,\textrm{Cay}\,}}(A^\omega )$ is bounded by Theorem 5.5. We multiply this bound by $2^n$ to cover all orthants of $(\mathbb {R}^{*})^{n}$, and obtain the following upper bound

$$\begin{aligned} 2^{n+1}\left( {\begin{array}{c}tn-n\\ n\end{array}}\right) \ \le \ 2^{n+1} e^{n} (t-1)^n, \end{aligned}$$

where the last inequality follows from Stirling’s estimate. $\square $
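For the reader's convenience, the last inequality can be unpacked as follows. This is a standard estimate, not specific to this article, and it uses only $n! \ge (n/e)^n$. Writing $m := tn-n = n(t-1)$ (the symbol $m$ is introduced here only for illustration), we have

$$\begin{aligned} \left( {\begin{array}{c}m\\ n\end{array}}\right) \ \le \ \frac{m^n}{n!} \ \le \ \left( \frac{e \, m}{n} \right) ^n \ = \ e^{n} (t-1)^n . \end{aligned}$$

In particular, for fixed $n$ the bound of Proposition 5.6 grows polynomially in $t$.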

5.4 Complexity of Numerical Path Tracking

Homotopy continuation theory of polynomials uses condition numbers to give bounds for the complexity of numerical iterative solvers [2]. Malajovich noticed that the current theory, which considers solutions of homogeneous polynomials over the projective space, fails to address subtleties of sparse polynomial systems. He developed a theory of sparse Newton iterations [31]: For a given sparse polynomial system f, Malajovich’s theory uses two condition numbers $\mu (f,{{\,\mathrm{\varvec{x}}\,}})$ and $v({{\,\mathrm{\varvec{x}}\,}})$ at a given point ${{\,\mathrm{\varvec{x}}\,}}\in ({\mathbb {C}}^{*})^n$, and it provides tools to analyze the accuracy and complexity of sparse Newton iterations. Let us state the main result of Malajovich below.

Theorem 5.7

(Malajovich, [31]) As in Proposition 2.22, let $\varvec{p}_{\varvec{C}}(t,{{\,\mathrm{\varvec{x}}\,}})$ be the polynomial system. Assume that we track a solution path from $\varvec{p}_{\varvec{C}}(\varepsilon ,{{\,\mathrm{\varvec{x}}\,}})$ to $\varvec{p}_{\varvec{C}}(1,{{\,\mathrm{\varvec{x}}\,}})$ where $\varepsilon >0$ is a sufficiently small real number. Then, there exists an algorithm which takes

$$\begin{aligned} \int _{\varepsilon }^{1} \mu ( \varvec{p}_{\varvec{C}}(t,{{\,\mathrm{\varvec{x}}\,}}) , \varvec{z}_s ) \; v(\varvec{z}_s) \; \left( \left\Vert {\dot{\varvec{p}}}_{\varvec{C},s} \right\Vert _{\varvec{p}_{\varvec{C},s}}^{2} + \left\Vert \dot{\varvec{z}}_s\right\Vert _{\varvec{z}_s}^2 \right) ^{\frac{1}{2}} \; ds \end{aligned}$$

many iteration steps, where $\varvec{z}_s$ represents the solution path, and $\left\Vert \cdot \right\Vert _{{{\,\mathrm{\varvec{x}}\,}}}$ represents the local norms defined as the pull-back of the classical Fubini–Study metric under the Veronese map.

It is customary in the theory of homotopy continuation to go from an integral representation as above to a more comprehensible complexity estimate by considering average or smoothed analysis of the iteration process. This amounts to introducing a probability measure on $\varvec{p}_{\varvec{C}}$, the input space of polynomials, and computing the expectation of the integral estimate over the input space. Malajovich notes in his paper [31] that the non-existence of a unitary group action on the space of sparse polynomials makes the probabilistic analysis harder. In our opinion, $\mu (\varvec{p}_{\varvec{C}}(t,{{\,\mathrm{\varvec{x}}\,}}))$ can be analyzed for general measures without group invariance [13, 14]. However, the second condition number $v({{\,\mathrm{\varvec{x}}\,}})$ seems hard to analyze; therefore, we refrain from a probabilistic analysis for the moment.

Remark 5.8

Gregorio Malajovich authored a 90-page arXiv paper that improves the state of the art [34]. Our hope is that these new results will pave the way for a rigorous complexity analysis of RPH, but until then what is written here represents our views.

6 Discussion and Outlook

We discuss some open questions related to this work that were brought to our attention after the initial submission of this article to arXiv:

(1) How successful is RPH on practical problems? For instance, how would it perform on real polynomial systems coming from chemical reaction networks?
(2) Suppose the support sets $A_1,A_2,\ldots ,A_n$ are fixed, and we use i.i.d. Gaussian coefficients with unit variance to create a random polynomial system. Can one prove that, with high probability, the random Gaussian polynomial system passes our effective patchworking test?
(3) Is our algorithm better than the (complex) polyhedral homotopy algorithm?
(4) How does real polyhedral homotopy compare with the Khovanskii–Rolle continuation algorithm of Bates and Sottile?

Regarding the first question: So far, we have only done a preliminary implementation and computed a few examples. The purpose of this article is to provide theoretical foundations for real polyhedral homotopy. We believe that a rigorous implementation and practical testing of the algorithm are crucial, but, given the magnitude of the task, they require a second, separate article.

Regarding the second question: In a special case, there are explicit estimates showing that, with high probability, a random polynomial system is indeed a patchworked system [12]. In general, this is a very intriguing question with far-reaching consequences: a positive answer with high probability would show that Viro’s patchworking method captures an essential combinatorial structure that forces patterns on randomly generated systems of polynomial equations.

Regarding the third question: This question was brought to our attention by some colleagues, but the comparison between our algorithm and the polyhedral homotopy algorithm does not seem to be meaningful. The goals of the two algorithms are different: RPH tracks only real zero paths, whereas polyhedral homotopy tracks all complex zeros. If one is interested in real roots only, then the advantage of our algorithm is that it tracks the correct number of real zero paths (sometimes called optimal path tracking), whereas most algorithms in the literature find all complex zeros and then filter the real ones.

For the last question, we first need to explain a notable algorithm of Bates and Sottile called the Khovanskii–Rolle Continuation Algorithm (KR) [6]. KR accepts a sparse polynomial system in which every polynomial has at most t terms, and traces at most

$$\begin{aligned} \frac{e^4+3}{4} 2^{\left( {\begin{array}{c}(t-2)n\\ 2\end{array}}\right) } \left( {\begin{array}{c}(t-2)n\\ t-2,t-2,\ldots ,t-2\end{array}}\right) \sim \exp \left( t^2 n^2 \right) \end{aligned}$$

many solution curves that can lead to real solutions [6, 7]. The number of paths is given by the best fewnomial bound in the literature, and to the best of our knowledge for mixed support, the best bounds are in [7].

On the one hand, for fixed n the RPH algorithm tracks polynomially many solution paths with respect to t, whereas the KR algorithm traces exponentially many solution curves. For instance, if one needs to solve a system of two bivariate polynomials, each with 8 different terms, the KR algorithm traces more than $2^{76}$ curves, and RPH tracks fewer than $2^{12}$ paths. On the other hand, we stress that the KR algorithm can solve all input instances, whereas RPH can only solve polynomial systems that are located in the unbounded components of the complement of the underlying A-discriminant amoeba. So, in our view, these two algorithms are complementary to each other: for a given sparse system one should use KR when RPH fails to admit the input.
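The numbers in the bivariate example can be reproduced with a few lines of Python. The sketch below is only an illustration: the function names `rph_path_bound` and `kr_curve_bound_log2` are hypothetical, the RPH count uses the bound of Proposition 5.6, and we assume that the multinomial coefficient in the KR bound has n lower entries, each equal to t-2.

```python
from math import comb, e, factorial, log2


def rph_path_bound(t: int, n: int) -> int:
    """Bound of Proposition 5.6: 2^(n+1) * C(tn - n, n)."""
    return 2 ** (n + 1) * comb(t * n - n, n)


def kr_curve_bound_log2(t: int, n: int) -> float:
    """Base-2 logarithm of the KR bound displayed above.

    Assumption: the multinomial coefficient has n lower entries,
    each equal to t - 2, so that they sum to (t - 2) * n.
    Working with log2 avoids floating-point overflow for larger t and n.
    """
    m = (t - 2) * n
    multinomial = factorial(m) // factorial(t - 2) ** n
    return log2((e ** 4 + 3) / 4) + comb(m, 2) + log2(multinomial)


if __name__ == "__main__":
    t, n = 8, 2  # two bivariate polynomials with 8 terms each
    print(rph_path_bound(t, n))       # 728, which is below 2**12 = 4096
    print(kr_curve_bound_log2(t, n))  # roughly 79.7, which is above 76
```

Both values agree with the estimates quoted in the text: the RPH bound is below $2^{12}$, and the KR bound exceeds $2^{76}$.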

Notes

Carried out on a MacBook Pro, Intel i5-5257U, 2.70GHz, 8GB RAM.

References

1. Allgower, E.L., Georg, K.: Numerical Continuation Methods: An Introduction, vol. 13. Springer Science & Business Media, New York (2012)
2. Bürgisser, P., Cucker, F.: Condition: The Geometry of Numerical Algorithms, vol. 349. Springer Science & Business Media, New York (2013)
3. Bernstein, D.: The number of roots of a system of equations. Funct. Anal. Appl. 9, 183–185 (1975)
4. Bihan, F.: Irrational mixed decomposition and sharp fewnomial bounds for tropical polynomial systems. Discrete Comput. Geometry 55(4), 907–933 (2016)
5. Bihan, F.: Correction to ‘Irrational mixed decomposition and sharp fewnomial bounds for tropical polynomial systems’. (2020) (to appear)
6. Bates, D.J., Sottile, F.: Khovanskii-Rolle continuation for real solutions. Found. Comput. Math. 11(5), 563 (2011)
7. Bihan, F., Sottile, F.: Fewnomial bounds for completely mixed polynomial systems. Adv. Geom. 11(3), 541–556 (2011)
8. Breiding, P., Timme, S.: HomotopyContinuation.jl: A Package for Homotopy Continuation in Julia, Mathematical Software - ICMS 2018, Lecture Notes in Computer Science, vol. 10931. Springer, Cham (2018)
9. Canny, J.F., Emiris, I.Z.: A subdivision-based algorithm for the sparse resultant. J. ACM (JACM) 47(3), 417–451 (2000)
10. Chen, T., Lee, T.-L., Li, T.-Y.: Hom4ps-3: A parallel numerical solver for systems of polynomial equations based on polyhedral homotopy continuation methods, in Mathematical Software - ICMS 2014 - 4th International Congress, Seoul, South Korea, August 5-9, 2014. Proceedings (Hoon Hong and Chee Yap, eds.), Lecture Notes in Computer Science, vol. 8592, Springer, pp. 183–190 (2014)
11. De Loera, J., Rambau, J., Santos, F.: Triangulations: Structures for Algorithms and Applications. Springer, New York (2010)
12. Dickenstein, A., Rojas, J.M., Rusek, K., Shih, J.: Extremal real algebraic geometry and $\mathcal {A}$-discriminants. Moscow Math. J. 7(3), 425–452 (2007)
13. Ergür, A.A., Paouris, G., Rojas, J.M.: Probabilistic condition number estimates for real polynomial systems I: a broader family of distributions. Found. Comput. Math. 19, 1–27 (2018)
14. Ergür, A.A., Paouris, G., Rojas, J.M.: Smoothed analysis for the condition number of structured real polynomial systems. Math. Comput. 90(331), 2161–2184 (2021)
15. Esterov, A.: Newton polyhedra of discriminants of projections. Discrete Comput. Geometry 44(1), 96–148 (2010)
16. Gelfand, I.M., Kapranov, M., Zelevinsky, A.: Discriminants, Resultants, and Multidimensional Determinants. Springer Science & Business Media, New York (2008)
17. Huber, B., Rambau, J., Santos, F.: The Cayley trick, lifting subdivisions and the Bohne-Dress theorem on zonotopal tilings. J. Eur. Math. Soc. 2(2), 179–198 (2000)
18. Huber, B., Sturmfels, B.: A polyhedral method for solving sparse polynomial systems. Math. Comput. 64(212), 1541–1555 (1995)
19. Itenberg, I., Mikhalkin, G., Shustin, E.I.: Tropical Algebraic Geometry, vol. 35. Springer Science & Business Media, New York (2009)
20. Itenberg, I., Roy, M.-F.: Multivariate Descartes’ rule. Beiträge zur Algebra und Geometrie 37(2), 337–346 (1996)
21. Jensen, A.N.: An Implementation of Exact Mixed Volume Computation. International Congress on Mathematical Software, pp. 198–205. Springer, New York (2016)
22. Jensen, A.N.: Tropical Homotopy Continuation. arXiv preprint arXiv:1601.02818 (2016)
23. Joshi, B., Shiu, A.: Which small reaction networks are multistationary? SIAM J. Appl. Dyn. Syst. 16(2), 802–833 (2017)
24. Joswig, M., Vater, P.: Real Tropical Hyperfaces by Patchworking in Polymake. International Congress on Mathematical Software, pp. 202–211. Springer, New York (2020)
25. Kapranov, M.M.: A characterization of A-discriminantal hypersurfaces in terms of the logarithmic Gauss map. Math. Ann. 290(1), 277–285 (1991)
26. Kannan, R., Bachem, A.: Polynomial algorithms for computing the Smith and Hermite normal forms of an integer matrix. SIAM J. Comput. 8(4), 499–507 (1979)
27. Khovanskii, A.G.: Fewnomials, vol. 88. American Mathematical Society, New York (1991)
28. Koiran, P.: Shallow circuits with high-powered inputs, (2010)
29. Kushnirenko, A.: Letter to Frank Sottile, https://www.math.tamu.edu/~sottile/research/pdf/Kushnirenko.pdf
30. Li, T.-Y., Wang, X.: On multivariate Descartes’ rule: a counterexample. Beiträge Algebra Geom. 39(1), 1–5 (1998)
31. Malajovich, G.: Complexity of sparse polynomial solving: homotopy on toric varieties and the condition metric. Found. Comput. Math. 19, 1–53 (2016)
32. Malajovich, G.: Computing mixed volume and all mixed cells in quermassintegral time. Found. Comput. Math. 17(5), 1293–1334 (2017)
33. Malajovich, G.: Pss5: polynomial system solver, (2019) https://sourceforge.net/projects/pss5/
34. Malajovich, G.: Complexity of sparse polynomial solving 2: Renormalization, (2020)
35. Mikhalkin, G.: Amoebas of algebraic varieties and tropical geometry. In: Donaldson, S.K., Eliashberg, Y., Gromov, M. (eds.) Different Faces of Geometry, pp. 257–300. Kluwer, New York (2004)
36. Malajovich, G., Rojas, J.M.: High probability analysis of the condition number of sparse polynomial systems. Theoret. Comput. Sci. 315(2–3), 525–555 (2004)
37. Mikhalkin, G., Rau, J.: Tropical geometry, vol. 8, MPI for Mathematics, (2009)
38. Maclagan, D., Sturmfels, B.: Introduction to Tropical Geometry. American Mathematical Society, Providence (2015)
39. O’Neill, C., Kwaakwah, E.O., de Wolff, T.: Viro.sage, version 0.4(b), https://cdoneill.sdsu.edu/viro/, (2018)
40. Paouris, G., Phillipson, K., Rojas, J.M.: A faster solution to Smale’s 17th problem I: Real binomial systems, in Proceedings of the 2019 on International Symposium on Symbolic and Algebraic Computation, pp. 323–330 (2019)
41. Passare, M., Sadykov, T., Tsikh, A.: Singularities of hypergeometric functions in several variables. Comp. Math. 141, 787–810 (2005)
42. Passare, M., Tsikh, A.: Amoebas: their spines and their contours. In: Idempotent Mathematics and Mathematical Physics, Contemp. Math., vol. 377, pp. 275–288. American Mathematical Society, Providence (2005)
43. Sturmfels, B.: On the Newton polytope of the resultant. J. Algebraic Combin. 3(2), 207–236 (1994)
44. Sturmfels, B.: Viro’s theorem for complete intersections. Annali della Scuola Normale Superiore di Pisa-Classe di Scienze 21(3), 377–386 (1994)
45. Sturmfels, B.: Polynomial equations and convex polytopes. Am. Math. Mon. 105(10), 907–922 (1998)
46. Verschelde, J.: Algorithm 795: PHCpack: a general-purpose solver for polynomial systems by homotopy continuation. ACM Trans. Math. Softw. 25(2), 251–276 (1999)
47. Viro, O.: From the sixteenth Hilbert problem to tropical geometry. Jpn. J. Math. 3(2), 185–214 (2008)
48. van Lint, J.H.: Coding Theory, vol. 201. Springer, New York (1971)
49. Ziegler, G.M.: Lectures on Polytopes, vol. 152. Springer Science & Business Media, New York (2012)

Acknowledgements

We cordially thank Sascha Timme for implementing a preliminary version of the algorithm developed in this article in the software HomotopyContinuation.jl, and for his help with developing Example 4.1. We thank Matías Bender, Paul Breiding, Felipe Cucker, Mario Kummer, Gregorio Malajovich, Jeff Sommars, and Josue Tonelli-Cueto for useful discussions. The first author thanks J. Maurice Rojas for introducing him to the beautiful book [16]. The first author is supported by NSF CCF 2110075, and the second author is supported by the DFG grant WO 2206/1-1.

Author information

Authors and Affiliations

Alperen A. Ergür, University of Texas at San Antonio, One UTSA Circle, San Antonio, TEXAS, 78249, USA
Timo de Wolff, Technische Universität Braunschweig, Institut für Analysis und Algebra, AG Algebra, Universitätsplatz 2, 38106, Braunschweig, Germany

Corresponding author

Correspondence to Alperen A. Ergür.

About this article

Cite this article

Ergür, A.A., Wolff, T.d. A Polyhedral Homotopy Algorithm for Real Zeros. Arnold Math J. 9, 305–338 (2023). https://doi.org/10.1007/s40598-022-00219-w

Received: 11 April 2021
Revised: 31 July 2022
Accepted: 10 October 2022
Published: 27 October 2022
Issue Date: September 2023
DOI: https://doi.org/10.1007/s40598-022-00219-w
