1 Introduction

Throughout this paper, let X = {x1,x2,…,xn} be a set of n commuting variables and let \({\mathbb {F}}\) be a field that is either the field \(\mathbb {Q}\) of rationals or a finite field. Let \(R={\mathbb {F}}[X]\) be the ring of multivariate polynomials in the variables X with coefficients from \(\mathbb {F}\). A subring \(I\subseteq R\) is an ideal if I absorbs multiplication by elements of R, that is, \(I\cdot R\subseteq I\).

Computationally, an ideal \(I\subseteq R\) is given by a set of generator polynomials: \(I = \left \langle {f_{1},f_{2},\ldots ,f_{\ell }}\right \rangle \). In other words, I is the smallest ideal containing the polynomials fi, 1 ≤ i ≤ ℓ. Given f ∈ R and \(I=\left \langle {f_{1}, \ldots , f_{\ell }}\right \rangle \), the Ideal Membership problem is to decide whether f ∈ I or not. In general, the problem is notoriously intractable: it is EXPSPACE-complete even if f and the generators fi, i ∈ [ℓ], are given explicitly as sums of monomials [26]. Nevertheless, special cases of the ideal membership problem have played important roles in several results in arithmetic complexity. For example, the polynomial identity testing algorithm for depth-three ΣΠΣ circuits with bounded top fan-in and the structure theorem for ΣΠΣ(k,d) identities both use ideal membership crucially [8, 18, 29].

In this paper, our study of ideal membership is motivated by Alon’s Combinatorial Nullstellensatz [1], and we recall one of its formulations.

Theorem 1.1

[1] Let \({\mathbb {F}}\) be any field, and \(f(X)\in {\mathbb {F}}[X]\). Define polynomials \(g_{i}(x_{i}) = {\prod }_{s\in S_{i}}(x_{i} - s)\) for non-empty subsets Si, 1 ≤ i ≤ n, of \({\mathbb {F}}\). If f vanishes on all the common zeros of g1,…,gn, then there are polynomials h1,…,hn satisfying \(\deg (h_{i})\leq \deg (f) - \deg (g_{i})\) such that \(f={\sum }_{i=1}^{n} h_{i}g_{i}\).

It can be restated in terms of ideal membership: Let \(f(X) \in {\mathbb {F}}[X] \) be a given polynomial, and \(I=\left \langle {g_{1}(x_{1}),g_{2}(x_{2}),\ldots ,g_{n}(x_{n})}\right \rangle \) be an ideal generated by univariate polynomials gi without repeated roots. Let Z(gi) denote the zero set of gi, 1 ≤ i ≤ n. By Theorem 1.1, if f ∉ I then there is a \(\vec {\alpha }=(\alpha _{1}, \ldots , \alpha _{n})\in Z(g_{1})\times \cdots \times Z(g_{n})\) such that \(f(\vec {\alpha })\neq 0\). Of course, if f ∈ I then \(f|_{Z(g_{1})\times \cdots \times Z(g_{n})}=0\).

Ideals I generated by univariate polynomials are called univariate ideals. For any univariate ideal I and any polynomial f, by repeated application of the division algorithm, we can write \(f({X})={\sum }_{i=1}^{n} h_{i}({X}) g_{i}(x_{i}) + R({X})\) where R is unique and for each \(i\in [n] : \deg _{x_{i}}(R) < \deg (g_{i}(x_{i}))\). Since the remainder is unique, it is convenient to write R = f mod I. By Alon’s theorem, if f ∉ I then there is a \(\vec {\alpha }\in Z(g_{1}) \times {\cdots } \times Z(g_{n})\) such that \(R(\vec {\alpha })\neq 0\).
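
For explicitly given (sparse) polynomials, this repeated division is straightforward to implement. The following Python sketch is our own illustration, not the paper’s algorithm: it represents a polynomial as a dictionary from exponent tuples to coefficients and assumes each generator is monic (no loss of generality over a field).

```python
def x_pow_mod(e, p):
    """Remainder of x^e modulo a monic univariate polynomial p,
    given (and returned) as a coefficient list, lowest degree first."""
    r = [0] * e + [1]                      # x^e as a coefficient list
    d = len(p) - 1
    while len(r) - 1 >= d:
        lead, shift = r[-1], len(r) - 1 - d
        for j, c in enumerate(p):          # subtract lead * x^shift * p
            r[shift + j] -= lead * c
        while len(r) > 1 and r[-1] == 0:
            r.pop()
    return r

def remainder_mod_univariate_ideal(f, gens):
    """f mod <p_1(x_1), ..., p_n(x_n)> for f given as a dict
    {exponent tuple: coefficient}; gens[i] is the monic coefficient
    list of p_{i+1}.  Reduces f monomial by monomial."""
    n, rem = len(gens), {}
    for exps, coef in f.items():
        # each variable is reduced independently: x_i^{e_i} mod p_i
        parts = [x_pow_mod(e, p) for e, p in zip(exps, gens)]
        terms = {(0,) * n: coef}           # expand the product of remainders
        for i, part in enumerate(parts):
            new = {}
            for t, c in terms.items():
                for deg, pc in enumerate(part):
                    if pc:
                        key = t[:i] + (deg,) + t[i + 1:]
                        new[key] = new.get(key, 0) + c * pc
            terms = new
        for t, c in terms.items():
            rem[t] = rem.get(t, 0) + c
    return {t: c for t, c in rem.items() if c}
```

For example, x1²x2 + x1x2 reduces to x2 + x1x2 modulo ⟨x1² − 1, x2² − 1⟩. Each monomial is reduced independently, exactly as in the factorization of m mod I into univariate remainders used again in Section 3.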

Univariate ideal membership is further motivated by its connection with two well-studied problems. Computing the permanent of an n × n matrix over any field \(\mathbb {F}\) can be cast in terms of univariate ideal membership. Given a matrix \(A=(a_{i,j})_{1\leq i,j\leq n}\in \mathbb {F}^{n\times n}\), consider the product of linear forms \(P_{A}({X}) = {\prod }_{i=1}^{n} \left ({\sum }_{j=1}^{n} a_{ij} x_{j}\right )\). The following observation is well known.

Fact 1.2

The permanent of the matrix A is given by the coefficient of the monomial x1x2xn in PA. In other words, the remainder of the polynomial PA(x1,x2,…,xn) modulo the univariate ideal \(\left \langle {{x_{1}^{2}}, \ldots , {x_{n}^{2}}}\right \rangle \) is precisely Perm(A) ⋅ x1x2xn.

It follows immediately that the remainder \(P_{A} \text {mod}{\left \langle {{x_{1}^{2}}, \ldots , {x_{n}^{2}}}\right \rangle }\) evaluates to Perm(A) at the point \(\vec {1}\in {\mathbb {F}}^{n}\).
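
As a concrete brute-force illustration of Fact 1.2, one can multiply out the linear forms of PA while reducing modulo \(\left \langle {{x_{1}^{2}}, \ldots , {x_{n}^{2}}}\right \rangle \) at every step, i.e., discarding every monomial divisible by some xi². Encoding multilinear monomials as bitmasks, a Python sketch (our own illustration; it runs in 2n·poly(n) time, so it is illustrative rather than efficient):

```python
def permanent_via_ideal(A):
    """Permanent of A via the coefficient of x_1...x_n in
    P_A = prod_i (sum_j a_ij x_j), reducing modulo <x_1^2,...,x_n^2>
    after each multiplication (multilinear monomials are bitmasks)."""
    n = len(A)
    poly = {0: 1}                          # the constant polynomial 1
    for i in range(n):                     # multiply by the i-th linear form
        new = {}
        for mask, c in poly.items():
            for j in range(n):
                if A[i][j] and not (mask >> j) & 1:   # x_j^2 reduces to 0
                    m = mask | (1 << j)
                    new[m] = new.get(m, 0) + c * A[i][j]
        poly = new
    return poly.get((1 << n) - 1, 0)       # coefficient of x_1 x_2 ... x_n
```

For A = [[1, 2], [3, 4]] this returns 1·4 + 2·3 = 10, the permanent of A.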

Next, we briefly mention the connection of univariate ideal membership with the multilinear monomial detection problem, a benchmark problem that is useful in designing fast parameterized algorithms for a host of problems [21,22,23, 33].

Notice that, given an arithmetic circuit C computing a polynomial \(f\in {\mathbb {F}}[X]\) of degree k, checking if f has a non-zero multilinear monomial of degree k is equivalent to checking if \(f \text {mod}{\left \langle {{x_{1}^{2}}, \ldots , {x_{n}^{2}}}\right \rangle }\) is non-zero. Moreover, the constrained multilinear detection problem studied in [10, 22] can also be viewed as a problem of deciding membership in a univariate ideal.

However, even for univariate ideals, the ideal membership problem is hard in general. As an application of Theorem 1.1, Alon and Tarsi [1, 2] show that checking k-colorability of a graph G is polynomial-time equivalent to checking if the corresponding graph polynomial \(f_{G}={\prod }_{ij\in E, i<j}(x_{i}-x_{j})\) is in the ideal \(\left \langle {{x_{1}^{k}}-1, \ldots , {x_{n}^{k}}-1}\right \rangle \). Hence, univariate ideal membership is coNP-hard even when the generator polynomials have distinct roots. We show that Univariate Ideal Membership over \({\mathbb {Q}}\), in general, is in the third level of the counting hierarchy. For the lower bound, we note that checking if a product of n linear forms is in the ideal \(\left \langle {{x_{1}^{2}},{x_{2}^{2}},\ldots ,{x_{n}^{2}}}\right \rangle \) is as hard as checking if the integer permanent is zero, which is C=P-hard. Univariate Ideal Membership over finite fields of characteristic k is quite tightly classified: the upper bound of coR ⋅ModkP nearly matches the ModkP-hardness.
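
Since the generators xik − 1 have distinct roots, the restatement of Theorem 1.1 above yields a brute-force membership test: fG lies in the ideal exactly when it vanishes at every tuple of k-th roots of unity. A Python sketch of this exponential-time check, for illustration only (the function name is ours):

```python
import cmath
from itertools import product

def graph_poly_in_ideal(edges, n, k, tol=1e-9):
    """Decide membership of the graph polynomial
    f_G = prod_{(i,j) in E} (x_i - x_j) in <x_1^k - 1, ..., x_n^k - 1>
    by brute force: since the generators have distinct roots, f_G lies
    in the ideal iff it vanishes at every tuple of k-th roots of unity."""
    roots = [cmath.exp(2j * cmath.pi * t / k) for t in range(k)]
    for point in product(roots, repeat=n):
        value = 1
        for i, j in edges:
            value *= point[i] - point[j]
        if abs(value) > tol:
            # f_G(point) != 0, so f_G is not in the ideal: G is k-colorable
            return False
    return True
```

On the triangle K3, the test reports membership for k = 2 (not 2-colorable) and non-membership for k = 3 (3-colorable), matching the Alon–Tarsi equivalence.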

1.1 Our Results

In this paper, we study the univariate ideal membership problem for different parameters of the input polynomial f and the univariate ideal I. The first parameter we consider is the rank of f. This notion has found applications, for example, in algorithms for depth-3 polynomial identity testing [29].

Definition 1.3

We say \(f\in {\mathbb {F}}[X]\) is a rank-r polynomial if \(f \in {\mathbb {F}}[\ell _{1} , \ell _{2} ,\ldots ,\ell _{r}]\) for linear forms ℓj, 1 ≤ j ≤ r.

We give two different algorithms for checking if a rank-r polynomial f is in a univariate ideal I. The first one is essentially an iterative division procedure. It evaluates the remainder polynomial f mod I at a given point \(\vec {\alpha }\in {\mathbb {F}}^{n}\) in deterministic time \(d^{O(r)}\). Using this evaluation procedure, we can test whether the remainder polynomial f mod I is nonzero by evaluating it at a randomly chosen point \(\vec {\alpha }\) over \(\mathbb {F}\) or a suitable extension field. The second algorithm is structural: it expresses the remainder polynomial f mod I as a \(d^{O(r)}\)-sum of d-products of linear forms. By the Polynomial Identity Lemma [14, 31, 34], we can check whether it is nonzero by evaluation at a randomly chosen point \(\vec {\alpha }\). We formally state the theorem.

Theorem 1.4

Let C be a polynomial-size arithmetic circuit computing a polynomial f in \({\mathbb {F}}[\ell _{1},\ell _{2},\ldots ,\ell _{r}]\), where ℓ1,ℓ2,…,ℓr are given linear forms in {x1,x2,…,xn}. Let \(I=\left \langle {p_{1}, \dots , p_{n}}\right \rangle \) be a univariate ideal generated by \(p_{i}(x_{i})\in {\mathbb {F}}[x_{i}], 1\le i\le n\).

  1.

    Given \(\vec {\alpha }\in {\mathbb {F}}^{n}\), we can evaluate the remainder f mod I at the point \(\vec {\alpha }\) in deterministic time \(d^{O(r)}\cdot \text {poly}(n)\), where \(d=\max \limits \{\deg (f),\deg (p_{i}): 1\le i\le n\}\).

  2.

    In deterministic time \(d^{O(r)}\cdot \text {poly}(n)\) we can express the remainder f mod I as a \(d^{O(r)}\)-sum of d-products of linear forms.

Using either of these algorithms, we can decide in randomized \(d^{O(r)}\) time whether f is in I.

Indeed, we can check whether f ∈ I by evaluating the remainder f mod I at a randomly chosen point \(\vec {\alpha }\), using either of the above algorithms.

We apply the previous result to obtain an efficient algorithm for minimum vertex cover in low-rank graphs. A graph G is said to be of rank r if its adjacency matrix AG has rank r. Graphs of low rank were studied by Lovász and Kotlov [4, 20] in the context of graph coloring.

Theorem 1.5

Given a graph G = (V,E) on n vertices such that the rank of the adjacency matrix AG is at most r, and a parameter k, there is a randomized \(n^{O(r)}\)-time algorithm to decide whether G has a vertex cover of size k.

Theorem 1.4 also yields an \(n^{O(r)}\) algorithm to compute the permanent of rank-r matrices over any field. Barvinok [9] gave an algorithm with the same running time for the permanent of low-rank matrices (over \({\mathbb {Q}}\)) using apolar bilinear forms. By Fact 1.2, if the matrix A has rank r then PA is a rank-r polynomial, and for the univariate ideal \(I=\left \langle {{x_{1}^{2}}, \ldots , {x_{n}^{2}}}\right \rangle \), computing PA mod I at the point \({\vec {1}}\) yields the permanent. Theorem 1.4 works more generally for all univariate ideals. In particular, the ideal in the proof of Theorem 1.5 is generated by polynomials that are not powers of variables. Thus, Theorem 1.4 can potentially have more algorithmic consequences than the technique in [9].

If k is the degree of the input polynomial and the ideal is generated by powers of variables, we have a randomized FPT algorithm for the problem.

Theorem 1.6

Given an arithmetic circuit C computing a polynomial \(f(X) \in \mathbb {Q}[X]\) of degree k and integers e1,e2,…,en, there is a randomized algorithm to decide whether \(f \in \left \langle {x^{e_{1}}_{1},x^{e_{2}}_{2},\ldots ,x^{e_{n}}_{n}}\right \rangle \) in \(O((2e)^{k})\) time, where \(e=\max _{i} e_{i}\).
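
For a polynomial given explicitly as a sum of monomials (rather than by a circuit, as in Theorem 1.6), membership in such an ideal is elementary: since the generators are monomials, \(f \in \left \langle {x^{e_{1}}_{1},\ldots ,x^{e_{n}}_{n}}\right \rangle \) exactly when every monomial of f is divisible by some \(x^{e_{i}}_{i}\). A sketch of this explicit special case (our own illustration, not the theorem’s algorithm):

```python
def in_power_ideal(f, e):
    """Membership of an explicit polynomial f (dict: exponent tuple ->
    coefficient) in the monomial ideal <x_1^{e_1}, ..., x_n^{e_n}>:
    every monomial of f must be divisible by some x_i^{e_i}."""
    return all(any(d >= b for d, b in zip(m, e))
               for m, c in f.items() if c)
```

For example, x1²x2 lies in ⟨x1², x2²⟩, but x1²x2 + x1x2 does not, since the monomial x1x2 is divisible by neither generator.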

The above result generalizes the algorithm for multilinear monomial detection [23] (there the ideal of interest is \(I=\left \langle {{x^{2}_{1}},{x^{2}_{2}},\ldots ,{x^{2}_{n}}}\right \rangle \)). Brand et al. have given the first FPT algorithm for degree-k multilinear monomial detection in arithmetic circuits [11]. Multilinear monomial detection can also be done, with the same running time, using the Hadamard product [5] of the given polynomial with the elementary symmetric polynomial (and in a different approach using apolar bilinear forms [27]).

When the number of generators of the univariate ideal is treated as the fixed parameter, ideal membership is W[2]-hard.

Theorem 1.7

Given a polynomial \(f(X) \in \mathbb {F}[X]\) by an arithmetic circuit C and univariate polynomials p1(x1),p2(x2),…,pk(xk), checking if \(f \not \in \left \langle {p_{1}(x_{1}),p_{2}(x_{2}),\dots ,p_{k}(x_{k})}\right \rangle \) is W[2]-hard with k as the parameter.

Theorem 1.7 is shown by an efficient reduction from the parameterized dominating set problem to ideal membership parameterized by the number of generators. To detect a dominating set of size k, the reduction produces an ideal generated by k univariate polynomials, and the polynomial constructed from the graph has k variables.

In contrast to Theorem 1.6, even checking whether f is in the ideal \(\left \langle {{x_{1}}^{e_{1}},{x_{2}}^{e_{2}},\ldots ,{x_{k}}^{e_{k}}}\right \rangle \) remains intractable in the parameterized sense.

Theorem 1.8

Let C be a polynomial-size arithmetic circuit computing a polynomial \(f\in {\mathbb {F}}[X]\), and let \(I = \left \langle {{x_{1}}^{e_{1}},{x_{2}}^{e_{2}},\ldots ,{x_{k}}^{e_{k}}}\right \rangle \) be the given ideal, where e1,…,ek are given in unary. Checking if f ∈ I is MINI[1]-hard with k as the parameter.

The k-Lin-Eq problem, which asks if there is an \(\vec {x}\in \{0,1\}^{n}\) satisfying \(A\vec {x} = \vec {b}\), where \(A\in \mathbb {F}^{k\times n}\) and \(\vec {b}\in {\mathbb {F}}^{k}\), is reducible to the complement of univariate ideal membership for an ideal of the form \(I = \left \langle {{x_{1}}^{e_{1}},{x_{2}}^{e_{2}},\ldots ,{x_{k}}^{e_{k}}}\right \rangle \). We then show that k-Lin-Eq is hard for the parameterized complexity class MINI[1] by reducing the miniature version of 1-in-3 Positive 3-SAT to it.

As already mentioned, the result of Alon and Tarsi [1, 2] shows that membership of fG in \(\left \langle {{x_{1}^{k}}-1, \ldots , {x_{n}^{k}}-1}\right \rangle \) is coNP-hard, and the proof crucially uses the fact that the roots of the generator polynomials are all distinct. This naturally raises the question whether univariate ideal membership is in coNP whenever each generator polynomial has distinct roots; we show that over the rationals it is. For general univariate ideals over \({\mathbb {Q}}\), membership is in the third level of the counting hierarchy. This upper bound is reasonably tight, as checking if a product of n linear forms is in the ideal \(\left \langle {{x_{1}^{2}},{x_{2}^{2}},\ldots ,{x_{n}^{2}}}\right \rangle \) is as hard as checking if the integer permanent is zero, which is C=P-hard.

Theorem 1.9

Let \(f\in \mathbb {Q}[X]\) be a polynomial of degree at most d given by a black box. Let \(I=\left \langle {p_{1}(x_{1}), \ldots , p_{n}(x_{n})}\right \rangle \) be an ideal given explicitly by a set of univariate polynomials p1,p2,…,pn as generators, of maximum degree bounded by d. Let L be the bit-size upper bound for any coefficient in f,p1,p2,…,pn. Moreover, assume that the pi have distinct roots over \(\mathbb {C}\). Then there is a non-deterministic algorithm running in time poly(n,d,L) that decides the non-membership of f in the ideal I.

Remark 1.10

The distinct roots case discussed in Theorem 1.9 is in stark contrast to the complexity of testing membership of PA(X) in the ideal \(\left \langle {{x_{1}^{2}}, \ldots , {x_{n}^{2}}}\right \rangle \). That problem is equivalent to checking if Perm(A) is nonzero for a rational matrix A, which is hard for the exact counting class C=P. Hence it cannot be in coNP unless the polynomial-time hierarchy collapses. We do not have an analogue of Theorem 1.9 over finite fields.

Recall from Alon’s Nullstellensatz that if f ∉ I, then there is a point \(\vec {\alpha }\in Z(p_{1})\times \ldots \times Z(p_{n})\) such that \(f(\vec \alpha )\neq 0\). Notice that in general the roots \(\alpha _{i} \in \mathbb {C}\), and in the standard Turing machine model an NP machine cannot guess the roots directly with only finite precision. But we are able to prove that the NP machine can guess a corresponding tuple of root approximations \(\vec {\tilde {\alpha }}\in \mathbb {Q}^{n}\), using only polynomially many bits of precision, and still decide non-membership. The main technical idea is to compute efficiently, only from the input parameters, a threshold M such that

$$ \begin{array}{@{}rcl@{}} |f(\vec{\tilde{\alpha}})| & \leq & M \textrm{ if } f\in I, \textrm{ and}\\ |f(\vec{\tilde{\alpha}})| & \geq & 2M \textrm{ if } f\not\in I. \end{array} $$

The NP machine decides the non-membership according to the final value of \(|f(\vec {\tilde {\alpha }})|\).

In this connection, we note that Koiran has considered the weak version of the Hilbert Nullstellensatz (HN) problem [19]. The input is a set of multivariate polynomials \(f_{1}, f_{2}, \ldots , f_{m} \in \mathbb {Z}[X]\) and the problem is to decide whether \(1\in \left \langle {f_{1}, \ldots , f_{m}}\right \rangle \). The result of Koiran shows that \(\overline {{\text {HN}}}\in {\text {AM}}\) (under GRH), and it is an outstanding open problem whether \(\overline {\text {HN}}\in \text {NP}\).

Organization

In Section 2 we present some background material. In Section 3 we show that, in general, univariate ideal membership is in the counting hierarchy. We prove Theorems 1.4 and 1.5 in Section 4. In Section 5, we explore the parameterized complexity of univariate ideal membership. In the first subsection, we prove Theorem 1.6, and in the second subsection we prove Theorems 1.7 and 1.8. Finally, in Section 7, we prove Theorem 1.9.

2 Preliminaries

We recall some basic definitions and results that are background material.

2.1 Basics of Ideal Membership

Let \({\mathbb {F}}[X]\) be the ring of polynomials \({\mathbb {F}}[x_{1},x_{2},\ldots ,x_{n}]\), and let \(I\subseteq {\mathbb {F}}[X]\) be an ideal given by a set of generators \(I=\left \langle {g_{1}, \ldots , g_{\ell }}\right \rangle \). A polynomial \(f\in {\mathbb {F}}[X]\) is a member of the ideal if and only if \(f={\sum }_{i=1}^{\ell } h_{i} g_{i}\) for some polynomials \(h_{i}\in {\mathbb {F}}[X]\). In general, dividing f by the gi using the standard division algorithm does not suffice to check if f ∈ I; indeed, the remainder is not even uniquely defined. However, if the leading monomials of the generators are pairwise relatively prime, then we can apply the division algorithm to compute the unique remainder.

Theorem 2.1 (See [12], Theorem 3 and Proposition 4, p. 101)

Let I be a polynomial ideal given by a basis G = {g1,g2,⋯ ,gs} such that for all pairs i ≠ j the leading monomials LM(gi) and LM(gj) are relatively prime. Then G is a Gröbner basis for I.

In particular, if the ideal I is a univariate ideal given by \(I=\left \langle {p_{1}(x_{1}), \ldots , p_{n}(x_{n})}\right \rangle \), we can apply the division algorithm to compute the unique remainder f mod I. To bound the running time of this procedure we note the following. Let \(\bar {p}\) denote the ordered list (p1,p2,…,pn). Let \(\text {Divide}(f ; \bar {p})\) be the procedure that divides f by p1 to obtain remainder f1, then divides f1 by p2 to obtain remainder f2, and so on, obtaining the final remainder fn after dividing by pn. We note the following time bound for \(\text {Divide}(f ; \bar {p})\).

Fact 2.2 (See [32], Section 6, pp. 5-12)

Let \(f\in {\mathbb {F}}[X]\) be given by a size s arithmetic circuit and \(p_{i}(x_{i})\in {\mathbb {F}}[x_{i}]\) be given univariate polynomials. The running time of \(\text {Divide}(f ; \bar {p})\) is bounded by \(O(s\cdot {\prod }_{i=1}^{n} (d_{i} + 1)^{O(1)})\), where \(d_{i}=\max \limits \{\deg _{x_{i}}(f),\deg (p_{i}(x_{i}))\}\).

2.2 Some Bounds Concerning Roots of Univariate Polynomials

The following folklore lemma gives a bound on the absolute value of any root of a univariate polynomial in terms of the degree and the coefficients.

Lemma 2.3

For any root α of a univariate degree-d polynomial \(f(x) = {\sum }_{i=0}^{d} a_{i} x^{i}\in \mathbb {Q}[x]\), one of the following bounds holds:

$$ \frac{|a_{0}|}{{\sum}_{i=1}^{d} |a_{i}|} \leq |\alpha|<1 ~\mathrm{ or } ~ 1\leq |\alpha| \leq d \cdot \frac{\max_{i} |a_{i}|}{|a_{d}|}. $$

Proof

Since α is a root of f, we have \(f(\alpha )={\sum }_{i=0}^{d} a_{i} \alpha ^{i}=0\). Hence, \({\sum }_{i=1}^{d} a_{i} \alpha ^{i} = -a_{0}\). By the triangle inequality,

$$ \sum\limits_{i=1}^{d} |a_{i}| |\alpha|^{i} \geq |a_{0}|. $$

Since f has degree d, ad≠ 0. We consider two cases. First, suppose |α| < 1. Then \(|\alpha |^{i}\leq |\alpha |\) for every i ≥ 1, so the above inequality gives \(|\alpha | \cdot ({\sum }_{i=1}^{d} |a_{i}|) \geq |a_{0}|\). Hence, \(|\alpha |\geq \frac {|a_{0}|}{{\sum }_{i=1}^{d} |a_{i}|}\). Next, suppose |α|≥ 1. Since \(-a_{d} \alpha ^{d} = {\sum }_{i=0}^{d-1} a_{i} \alpha ^{i}\) and \(|\alpha |^{i}\leq |\alpha |^{d-1}\) for i ≤ d − 1, the triangle inequality gives \(|a_{d}| |\alpha |^{d} \leq |\alpha |^{d-1} \cdot ({\sum }_{i=0}^{d-1} |a_{i}|)\). Hence, \( |\alpha | \leq \frac {{\sum }_{i=0}^{d-1} |a_{i}|} {|a_{d}|}\leq d \cdot \frac {\max \limits _{i} |a_{i}|}{|a_{d}|}\). This completes the proof. □
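
The bounds of Lemma 2.3 are easy to evaluate numerically. The polynomial 10x² − 21x + 2 = (10x − 1)(x − 2) below, with roots 0.1 and 2, is our own example, not from the text:

```python
def root_magnitude_bounds(coeffs):
    """Given coefficients [a_0, ..., a_d] with a_d != 0, return the pair
    (lo, hi) from Lemma 2.3: every root alpha of sum_i a_i x^i satisfies
    lo <= |alpha| < 1  or  1 <= |alpha| <= hi."""
    d = len(coeffs) - 1
    lo = abs(coeffs[0]) / sum(abs(a) for a in coeffs[1:])
    hi = d * max(abs(a) for a in coeffs) / abs(coeffs[-1])
    return lo, hi
```

For [2, -21, 10] this gives lo = 2/31 ≈ 0.0645 and hi = 4.2, and indeed 0.0645 ≤ 0.1 < 1 and 1 ≤ 2 ≤ 4.2.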

The next lemma, due to Mahler [25], lower bounds the distance between any two distinct roots of a univariate polynomial in terms of its degree and the size of its coefficients.

Lemma 2.4 (Mahler [25])

Let \(g(x) = {\sum }_{i=0}^{d} a_{i} x^{i} \in \mathbb {Q}[x]\) with \(2^{-L} \leq |a_{i}|\leq 2^{L}\) whenever ai≠ 0, and let α, β be two distinct roots of g. Then \(|\alpha -\beta | \geq \frac {1}{2^{O(d L)}}\).

The next lemma shows that a univariate polynomial cannot take a very small value (in absolute terms) at any point that is far from all of its roots.

Lemma 2.5

Let \(f= {\sum }_{i=0}^{d} a_{i} x^{i}\) be a univariate degree-d polynomial with \(2^{-L} \leq |a_{i}|\leq 2^{L}\) whenever ai≠ 0. Let \(\tilde {\alpha }\) be a point such that \(|\tilde {\alpha } - \beta _{i}| \geq \delta \) for every root βi of f. Then

$$ |f(\tilde{\alpha})| \geq 2^{-L} \delta^{d}. $$

Proof

Since \(\deg (f)=d\), ad≠ 0. We can write \(f(\tilde {\alpha }) = a_{d} {\prod }_{i=1}^{d} (\tilde {\alpha } - \beta _{i})\). Since \(|\tilde {\alpha } - \beta _{i}| \geq \delta \), \(|f(\tilde {\alpha })| = |a_{d}| {\prod }_{i=1}^{d} |\tilde {\alpha } - \beta _{i}| \geq 2^{-L} \delta ^{d}\). □

2.3 Parameterized Complexity Classes

We recall some standard definitions from parameterized complexity [13, ch.1,pp. 7-14]. For a parameterized problem the input instances are pairs (x,k), where x is the actual input and k is a fixed parameter. The parameterized problem is in the class FPT (for fixed parameter tractable) if the problem has an algorithm with run time f(k)|(x,k)|O(1) for some computable function f.

A parameterized reduction [13, def. 13.1] between two parameterized decision problems P1 and P2 is a many-one reduction such that on input instance (x,k) of P1 the reduction maps it to an instance \((x^{\prime },k^{\prime })\) of P2 in time f(k)|(x,k)|O(1), for some computable f, such that (x,k) is a “yes” instance of P1 if and only if \((x^{\prime },k^{\prime })\) is a “yes” instance of P2, and \(k^{\prime } \leq f(k)\).

A parameterized problem is said to be in the class XP if it has an algorithm with run time |x|f(k) for some computable function f.

For the purpose of this paper, it suffices to note that a parameterized problem L is in the class W[1] if there is a parameterized reduction from L to some standard W[1]-complete problem, e.g., the k-independent set problem, and L is in the class W[2] if there is a parameterized reduction from L to some standard W[2]-complete problem, e.g., the k-dominating set problem (more details can be found in, e.g., [13, def. 13.16]).

The complexity class MINI[1] consists of parameterized problems that are miniature versions of NP problems: For L ∈NP, its miniature version mini(L) has instances of the form (0n,x), where \(|x|\le k\log n\), k is the fixed parameter, and x is an instance of L. Showing mini(L) to be MINI[1]-hard under parameterized reductions is evidence of its parameterized intractability, for it cannot be in FPT assuming the Exponential Time Hypothesis [15].

2.4 Multivariate Polynomials

We recall the definition of Hadamard product of two polynomials.

Definition 2.6

Given two polynomials \(f,g \in {\mathbb {F}}[X]\), their Hadamard product is defined as

$$ f \circ g = \sum\limits_{m} [m]f \cdot [m]g \cdot m, $$

where the sum ranges over monomials m and [m]f denotes the coefficient of m in f.

We will use a scaled variant of the Hadamard Product [7].

Definition 2.7

[7] Given two polynomials \(f,g \in {\mathbb {F}}[X]\), their scaled Hadamard Product fsg, is defined as

$$ f \circ^{s} g = \sum\limits_{m} m! \cdot [m]f \cdot [m]g \cdot m, $$

where, abusing notation, m! = e1! ⋅ e2!⋯er! for \(m=x^{e_{1}}_{i_{1}}x^{e_{2}}_{i_{2}} {\ldots } x^{e_{r}}_{i_{r}}\).

Remark 2.8

If either f or g is multilinear, notice that their scaled Hadamard product coincides with their Hadamard product.

The elementary symmetric polynomial of degree k over n variables {x1,x2,…,xn} is defined as:

$$ S_{n,k}(x_{1},x_{2},{\ldots} , x_{n}) = \sum\limits_{T\subseteq [n],|T| = k} \prod\limits_{i\in T}x_{i}. $$

Notice that Sn,k consists of all degree-k multilinear monomials.
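
For explicitly given sparse polynomials, the (scaled) Hadamard product is a coefficient-wise operation, and, as mentioned in Section 1.1, the Hadamard product with Sn,k detects multilinear monomials: f ∘ Sn,k ≠ 0 exactly when f contains a degree-k multilinear monomial. A sketch for dictionary-represented polynomials (our own illustration):

```python
from itertools import combinations
from math import factorial

def hadamard(f, g, scaled=False):
    """(Scaled) Hadamard product of two polynomials given as dicts
    {exponent tuple: coefficient}; the scaled version multiplies the
    coefficient of m by m! = e_1! e_2! ... e_r!."""
    out = {}
    for m in f.keys() & g.keys():          # only common monomials survive
        c = f[m] * g[m]
        if scaled:
            for e in m:
                c *= factorial(e)
        if c:
            out[m] = c
    return out

def elementary_symmetric(n, k):
    """S_{n,k} as a dict: one monomial for each k-subset of the variables."""
    return {tuple(1 if i in T else 0 for i in range(n)): 1
            for T in combinations(range(n), k)}
```

For instance, with f = x1x2 + 3x1² over three variables, hadamard(f, elementary_symmetric(3, 2)) keeps only the multilinear term x1x2, witnessing a degree-2 multilinear monomial in f.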

3 A Complexity-Theoretic Upper Bound

We show that over \(\mathbb {Q}\) univariate ideal membership is in the counting hierarchy. Over finite fields of characteristic k, the problem is in the randomized complexity class coR ⋅ModkP.

Let Σ be a finite alphabet (of size at least 2). The class #P consists of functions \(h:{\Sigma }^{*}\to {\mathbb {N}}\) for which there is an NP machine M such that for all \(x\in {\Sigma }^{*}\)

$$ h(x)=\mathit{acc}_{M}(x), $$

where accM(x) is the number of accepting paths of M on input x. A language \(L\subseteq {\Sigma }^{*}\) is in the counting complexity class C=P if there is an NP machine M such that for all \(x\in {\Sigma }^{*}\), x ∈ L if and only if accM(x) = rejM(x), where rejM(x) denotes the number of rejecting paths of M on input x. For \(A\subseteq {\Sigma }^{*}\) the relativized class \(\mathrm {C}_{{=}\mathrm {P}}^{A}\) is defined as above for an NPA (oracle) machine M. For i ≥ 2, a language L is in the ith level of the exact counting hierarchy, denoted CHi, if \(L\in \mathrm {C}_{=}\mathrm {P}^{A}\) for some A ∈CHi− 1.

Theorem 3.1

  1.

    Univariate ideal membership over \(\mathbb {Q}\) is in the third level of the counting hierarchy.

  2.

    Univariate ideal membership over a finite field of characteristic k is in coR ⋅ModkP.

Proof

For the first part, let \(f\in \mathbb {Q}[X]\) be given as input by an arithmetic circuit computing a polynomial of degree d, and let \(p_{i}(x_{i})\in \mathbb {Q}[x_{i}], i\in [n]\), be the generators of the ideal I. By clearing denominators, we can assume that both f and the pi have integer coefficients. Writing f as an integer linear combination of monomials, we have

$$ f = \sum\limits_{m:\deg(m)\le d} \alpha_{m} m, $$

where \(\alpha _{m}\in \mathbb {Z}\) is the integer coefficient of monomial m (note that each αm has polynomially many bits in binary). As the generators pi are univariate, we can express the remainder polynomial

$$ f \text{mod} {I} = \sum\limits_{m:\deg(m)\le d} \alpha_{m} (m \text{mod} {I}). $$

In particular, let \(m=x_{1}^{e_{1}}\cdot x_{2}^{e_{2}}{\cdots } x_{n}^{e_{n}}\) and \(r_{m,i}(x_{i})=x_{i}^{e_{i}}\text {mod}{p_{i}(x_{i})}\). Then, we have \(m \text {mod} {I} ={\prod }_{i=1}^{n}r_{m,i}(x_{i})\), where \(\deg (r_{m,i})<\deg (p_{i})\) for each i. Thus, the remainder polynomial

$$ f \text{mod} {I} = \sum\limits_{m:\deg(m)\le d} \alpha_{m} \prod\limits_{i=1}^{n}r_{m,i}(x_{i}). $$

To check whether f mod I is nonzero, note that the degree of the remainder is also bounded by d. By Alon’s Nullstellensatz, it then suffices to check whether there is a point \(\vec {a}=(a_{1},a_{2},\ldots ,a_{n})\) in the n-dimensional grid [d + 1]n at which f mod I does not vanish.

We will be using the simple fact that we can compute in \(\mathrm {P}^{\#\mathrm {P}}\) the coefficient of any monomial of degree at most d in f.

Let \(L=\{(f,\{p_{i}\}_{i\in [n]},\vec {a})\mid f\in \mathbb {Z}[X],\ p_{i}(x_{i})\in \mathbb {Z}[x_{i}],\ \vec {a}\in [d+1]^{n},\ (f \text {mod} {I})(\vec {a})\ne 0\}\). Checking if f ∉ I is clearly in NPL: we guess the point \(\vec {a}\) and verify that \((f,\{p_{i}\}_{i\in [n]},\vec {a})\in L\) by querying the oracle. We now show that \(\overline {L}\) is in CH2, which completes the proof of the first part. To do so, we define an oracle NP machine M as follows:

  • M guesses the monomials of degree at most d along its computation paths (each path corresponds to a unique monomial).

  • On the computation path that guesses monomial m, M uses a #P oracle to compute its (integer) coefficient αm in the polynomial f.

  • Compute the remainder polynomial \(m \text {mod} {I} ={\prod }_{i=1}^{n}r_{m,i}(x_{i})\). This computation path contributes \(\alpha _{m}{\prod }_{i=1}^{n}r_{m,i}(x_{i})\) to the overall remainder.

  • Compute \(val(m)=\alpha _{m}{\prod }_{i=1}^{n}r_{m,i}(a_{i})\). If val(m) is negative then M produces |val(m)| many rejecting paths. Otherwise, M produces |val(m)| many accepting paths.

Notice that the overall remainder is \(f \text {mod} {I} ={\sum }_{m} \alpha _{m} {\prod }_{i=1}^{n}r_{m,i}(x_{i})\). Clearly, \(f \text {mod} {I}(\vec {a})=0\) if and only if the number of accepting paths equals the number of rejecting paths. Hence, \(\overline {L}\in {\mathrm {C}_{=}\mathrm {P}}^{\#\mathrm {P}}\). Since \(\mathrm {P}^{\#\mathrm {P}}\subseteq \text {coNP}^{\mathrm {C}_{=}\mathrm {P}}\), it follows that \(\overline {L}\in \text {CH}_{2}\).

For the second part, the proof is along the same lines using the additional facts that \(\text {Mod}_{k}\mathrm {P}^{\text {Mod}_{k}\mathrm {P}}=\text {Mod}_{k}\mathrm {P}\) for prime k, and \(\text {NP}\subseteq \text {coR}\cdot \text {Mod}_{k}\mathrm {P}\) by the Valiant-Vazirani lemma. □

Remark 3.2

It is interesting to note that we have the lower bound (of C=P for \({\mathbb {F}}=\mathbb {Q}\) and ModkP for \({\text {char}}({\mathbb {F}})=k, k>2\)) already for the simple case of checking if a product of linear forms is in the ideal \(\left \langle {{x_{1}^{2}},{x_{2}^{2}},\ldots ,{x_{n}^{2}}}\right \rangle \), by virtue of the hardness of checking if the permanent is zero (over \({\mathbb {Q}}\) and over \({\text {char}}(\mathbb {F})\ne 2\)). We now observe a hardness result over \({\text {char}}(\mathbb {F})=2\) for the same ideal. Consider a graph G = (V,E). For each vertex v ∈ V, define the monomial \(star(v) = y_{v}{\prod }_{v\in e}x_{e}\), where the xe and yv are edge and vertex variables, respectively. Now, we define the polynomial

$$ P=\prod\limits_{u=1}^{n}(1+ t\cdot star(u)), $$

where t is a new variable. Writing \(P={\sum }_{d=0}^{n} P_{d}\cdot t^{d}\), consider the polynomial Pn/2 (for which a small circuit can be obtained from P). Then \(P_{n/2}\text {mod}{\left \langle {\{{x_{e}^{2}}\mid e\in E\}}\right \rangle }\) is nonzero if and only if G has an independent set of size n/2. This holds over all fields \({\mathbb {F}}\), including \({\mathbb {F}}_{2}\).

4 Ideal Membership for Low Rank Polynomials

We first recall the notion of the rank of a polynomial in \({\mathbb {F}}[X]\).

Definition 4.1

A polynomial \(f(X)\in {\mathbb {F}}[X]\) is a rank-r polynomial if there are linear forms ℓ1,ℓ2,…,ℓr in the variables X and an r-variate polynomial \(g(z_{1},z_{2},\ldots ,z_{r})\in {\mathbb {F}}[z_{1},z_{2}\ldots ,z_{r}]\) such that

$$ f(X) = g(\ell_{1},\ell_{2},\ldots,\ell_{r}). $$

For an (unspecified) fixed parameter r, we refer to rank-r polynomials as low rank polynomials.

In this section we prove Theorem 1.4. Let \(f(X)\in {\mathbb {F}}[X]\) be a rank-r, degree-d polynomial given by an r-variate arithmetic circuit C and linear forms ℓi, i ∈ [r], such that f = C(ℓ1,ℓ2,…,ℓr), along with a univariate ideal I and a point \(\vec {\alpha }\in {\mathbb {F}}^{n}\) as inputs. We give a deterministic \(d^{O(r)}\)-time algorithm to evaluate the remainder polynomial f mod I at \(\vec {\alpha }\), where d is the degree of f. As a corollary, this yields a \(d^{O(r)}\)-time randomized algorithm for testing if f is in the ideal I.

Remark 4.2

Kayal [17] has given a randomized polynomial-time algorithm for testing if a given polynomial \(f(X)\in \mathbb {F}[X]\) has a given rank r and, if so, for computing the linear forms ℓ1,ℓ2,…,ℓr and the polynomial g such that f(X) = g(ℓ1,ℓ2,…,ℓr). Combined with Theorem 1.4, we obtain a randomized \(d^{O(r)}\)-time algorithm with only f and I given as input, under the promise that f has rank r.

We present two different algorithms as proofs of Theorem 1.4. The first is essentially a division algorithm. The second gives a circuit construction for the remainder polynomial f mod I. Both algorithms run in time \(d^{O(r)}\).

4.1 A Division Algorithm

Given \(\vec {\alpha }\in {\mathbb {F}}^{n}\), a univariate ideal \(I=\left \langle {p_{1}(x_{1}), \ldots , p_{n}(x_{n})}\right \rangle \), and a rank-r polynomial f(ℓ1,…,ℓr), we show how to efficiently evaluate the remainder polynomial f(ℓ1,…,ℓr) mod I at \(\vec \alpha \) using a recursive procedure \(\text {REM}(f(\ell _{1}, \ldots , \ell _{r}), I, \vec \alpha )\). We introduce the following notation. For \(S\subseteq [n]\), let IS denote the ideal \(\left \langle {p_{i}(x_{i}) : i\in S}\right \rangle \) generated by the polynomials pi(xi), i ∈ S.

Let \(g\in {\mathbb {F}}[X]\) be an n-variate polynomial. For an n × n invertible matrix T over \({\mathbb {F}}\), we define the polynomial

$$ T(g(X))=g(T(x_{1}),T(x_{2}),\ldots,T(x_{n})), $$

where \(T(x_{i})={\sum }_{j=1}^{n} T_{ij}x_{j}, i\in [n]\).

The following lemma shows how to remove the redundant variables from a low-rank polynomial. Let ℓ1,ℓ2,…,ℓr be homogeneous linear forms in X = {x1,x2,…,xn}, let f be an r-variate degree-d polynomial over \(\mathbb {F}\), and consider f(ℓ1,ℓ2,…,ℓr). For an n × n invertible matrix T over \({\mathbb {F}}\), let T(f) denote the polynomial

$$ T(f)(X) = f(T(\ell_{1}),T(\ell_{2}),\ldots,T(\ell_{r})), $$

where \(T(x_{i})={\sum }_{j=1}^{n} T_{ij}x_{j}, i\in [n]\), and each T(ℓj),j ∈ [r] is defined by linearity.

Lemma 4.3

Given as input a polynomial f(ℓ1,…,ℓr) where ℓ1,…,ℓr are given homogeneous linear forms in \({\mathbb {F}}[X]\), there is an invertible matrix \(T\in {\mathbb {F}}^{n\times n}\) such that T(xi) = xi for 1 ≤ i ≤ r, and T(f) is defined on at most the 2r variables x1,x2,…,x2r.

Proof

Write each linear form ℓi in two parts: ℓi = ℓi,1 + ℓi,2, where ℓi,1 is the part over the variables x1,…,xr and ℓi,2 is over the variables xr+ 1,…,xn. W.l.o.g., assume that \(\{\ell _{i,2}\}^{r^{\prime }}_{i=1}\) is a maximum linearly independent subset of the linear forms \(\{\ell _{i,2}\}^{r}_{i=1}\). Let T be the invertible linear map that fixes x1,…,xr, maps the independent linear forms \(\{\ell _{i,2} \}^{r^{\prime }}_{i=1}\) to the variables \(x_{r+1},\ldots ,x_{r+r^{\prime }}\), and is suitably extended to the remaining variables to form an invertible map. Clearly, T can be computed in polynomial time, given the ℓi. This completes the proof. □
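The computation of T in this proof amounts to row-reducing the tails ℓi,2; below is a minimal sketch over \(\mathbb {Q}\), in which the linear forms are arbitrary illustrative choices and only the rank computation (the value r′) is shown:

```python
from fractions import Fraction

def row_reduce(rows):
    """Gaussian elimination over Q; returns the rank of the row set."""
    rows = [list(map(Fraction, r)) for r in rows]
    pr, ncols = 0, len(rows[0]) if rows else 0
    for c in range(ncols):
        piv = next((i for i in range(pr, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[pr], rows[piv] = rows[piv], rows[pr]
        inv = Fraction(1) / rows[pr][c]
        rows[pr] = [inv * v for v in rows[pr]]
        for i in range(len(rows)):
            if i != pr and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[pr])]
        pr += 1
        if pr == len(rows):
            break
    return pr

# two linear forms over x1..x5 with r = 2; the tails live on x3, x4, x5
n, r = 5, 2
ells = [[1, 0, 2, 4, 6],    # ell_1 = x1 + 2 x3 + 4 x4 + 6 x5
        [0, 1, 1, 2, 3]]    # ell_2 = x2 +   x3 + 2 x4 + 3 x5
tails = [ell[r:] for ell in ells]
r_prime = row_reduce(tails)     # here the two tails are parallel, so r' = 1
# T fixes x1,...,xr and maps the r' independent tails to x_{r+1},...,x_{r+r'};
# T(f) then involves at most r + r' <= 2r variables
print(r_prime)
```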

The following lemma shows that evaluating the remainder of a polynomial f modulo a univariate ideal \(I = \left \langle {p_{1}(x_{1}), \ldots , p_{n}(x_{n})}\right \rangle \) at a point in \({\mathbb {F}}^{n}\) can be done incrementally, by computing and evaluating the remainder modulo the smaller ideals \(I_{[\ell ]}\), 1 ≤ ℓ ≤ n.

Lemma 4.4

Let \(f(X)\in {\mathbb {F}}[X]\) and \(I = \left \langle {p_{1}(x_{1}), \ldots , p_{n}(x_{n})}\right \rangle \) be a univariate ideal. Let R(X) be the unique remainder f modI. Let \(\vec {\alpha }\in {\mathbb {F}}^{r}, r\leq n\) and Rr(X) = f modI[r]. Then R(α1,…,αr,xr+ 1,…,xn) = Rr(α1,…,αr,xr+ 1,…,xn)modI[n]∖[r].

Proof

By uniqueness of remainders modulo univariate ideals, it follows that R(X) = Rr(X)modI[n]∖[r]. Since the ideal I[n]∖[r] does not involve x1,x2,…,xr, substituting xi = αi,1 ≤ ir we have

$$ R(\alpha_{1},\alpha_{2},\ldots,\alpha_{r},x_{r+1},\ldots,x_{n}) = R_{r}(\alpha_{1},\alpha_{2},\ldots,\alpha_{r},x_{r+1},\ldots,x_{n}) \text{mod} {I_{[n]\setminus[r]}}. $$
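Lemma 4.4 can be checked on a toy instance in which every generator rewrites a monomial to a monomial, so that reduction acts exponent-wise; the generators, the polynomial, and the substituted value below are arbitrary illustrative choices:

```python
# I = <p1, p2> with p1 = x1^2 - 1 (rewrite x1^2 -> 1) and p2 = x2^3 - x2
# (rewrite x2^3 -> x2); both send a monomial to a monomial, so for this
# particular ideal the reduction acts on exponents coordinate-wise.

def red1(e):                       # exponent of x1^e mod p1
    return e % 2

def red2(e):                       # exponent of x2^e mod p2
    return e if e < 3 else 2 - (e % 2)

def reduce_poly(mono, r1=None, r2=None):
    """mono: {(e1, e2): coeff}; apply the exponent rewrites (None = skip)."""
    out = {}
    for (e1, e2), c in mono.items():
        key = (r1(e1) if r1 else e1, r2(e2) if r2 else e2)
        out[key] = out.get(key, 0) + c
    return {k: v for k, v in out.items() if v}

def substitute_x1(p, a):
    """Evaluate x1 = a, returning a univariate polynomial {e2: coeff}."""
    out = {}
    for (e1, e2), c in p.items():
        out[e2] = out.get(e2, 0) + c * a ** e1
    return out

f = {(4, 0): 1, (3, 1): 4, (2, 2): 6, (1, 3): 4, (0, 4): 1}   # (x1 + x2)^4

R = reduce_poly(f, red1, red2)     # remainder mod I_[2]
R1 = reduce_poly(f, red1, None)    # remainder mod I_[1] only

# Lemma 4.4: substituting x1 = 3 in R1 and reducing mod <p2> gives R at x1 = 3
lhs = reduce_poly({(0, e): c for e, c in substitute_x1(R1, 3).items()}, None, red2)
rhs = {(0, e): c for e, c in substitute_x1(R, 3).items()}
print(lhs == rhs)  # True
```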

The next lemma is crucial for the proof of Theorem 1.4.

Lemma 4.5

Let \(f\in {\mathbb {F}}[X]\), and \(T : {\mathbb {F}}^{n}\rightarrow {\mathbb {F}}^{n}\) be an invertible linear transformation fixing x1,…,xr and mapping xr+ 1,…,xn to linearly independent linear forms over xr+ 1,…,xn. Write R = f modI[r] and \(R^{\prime } = T(f) \text {mod} {I_{[r]}}\). Then \(R^{\prime } = T(R)\).

Proof

Let \(f = {\sum }_{i=1}^{r} h_{i}(X) \cdot p_{i}(x_{i}) + R(X)\) and \(T(f) = {\sum }_{i=1}^{r} h^{\prime }_{i}(X) \cdot p_{i}(x_{i}) + R^{\prime }(X)\). Note that for both remainder polynomials R and \(R^{\prime }\), we have \(\deg _{x_{i}}R < \deg _{x_{i}}(p_{i})\) and \(\deg _{x_{i}}R^{\prime }< \deg (p_{i})\) for 1 ≤ ir. Now, as T is invertible and it fixes x1,…,xr, we can write \(f = {\sum }_{i=1}^{r} T^{-1}(h^{\prime }_{i}({X})) \cdot p_{i}(x_{i}) + T^{-1}(R^{\prime }({X}))\). As T fixes each xi,i ∈ [r] it follows that \(\deg _{x_{i}}(T^{-1}(R^{\prime }(X))) < \deg (p_{i}(x_{i}))\) for 1 ≤ ir. Combining the two expressions for f, we obtain that

$$ (R - T^{-1}(R^{\prime})) = 0 \text{mod} {I_{[r]}} $$

which forces \(R = T^{-1}(R^{\prime })\) by the degree bounds on xi,i ∈ [r]. □

Proof of Theorem 1.4

We now describe the algorithm, prove its correctness, and analyze its running time. The input to the algorithm is an arithmetic circuit computing the r-variate degree-d polynomial f, the linear forms ℓ1,ℓ2,…,ℓr, and the univariate polynomials pi(xi),i ∈ [n]. Let the positive integer L bound the encoding lengths of the coefficients of the linear forms and the polynomials pi, as well as any scalar inputs to the circuit defining f.

The algorithm can be seen as a recursive procedure REM: the initial call to it is \(\text {REM}(f(\ell _{1}, \ldots , \ell _{r}), I_{[n]},\vec \alpha )\).

  • As the first step, we apply the invertible linear transformation T obtained in Lemma 4.3 to f and obtain the polynomial T(f) over the variables \(x_{1}, \ldots , x_{r}, x_{r+1}, \ldots , x_{r+r^{\prime }}\), where \(r^{\prime }\leq r\) (see Footnote 1).

  • The polynomial T(f) can be explicitly computed as a linear combination of degree d monomials in variables \(x_{1},x_{2},\ldots ,x_{r+r^{\prime }}\) in time poly(L,s,n,dO(r)).

  • Then we compute the remainder polynomial \(f^{\prime }(x_{1}, \ldots , x_{r + r^{\prime }}) = T(f) \text {mod} {I_{[r]}}\) by applying the division algorithm: it essentially amounts to replacing \({x_{i}^{e}}\) by \({x_{i}^{e}} \text {mod} p_{i}(x_{i})\) when \(e\ge \deg (p_{i}(x_{i}))\) for any \({x_{i}^{e}}\) occurring in a monomial of T(f).

  • Next we compute the polynomial \(g(x_{r+1},\ldots ,x_{r+r^{\prime }})=f^{\prime }(\alpha _{1}, \ldots ,\) \(\alpha _{r}, x_{r+1}, \ldots , x_{r+r^{\prime }})\). By Lemma 4.3, we have \(T^{-1}(x_{r+i}) = \ell _{i,2}\) for \(1\leq i\leq r^{\prime }\). Hence, \(T^{-1}(g)=g(\ell _{1,2},\ell _{2,2},\ldots ,\ell _{r^{\prime },2})\).

  • We next consider the polynomial \(g(\ell _{1,2},\ell _{2,2},\ldots ,\ell _{r^{\prime },2})\) and recursively compute \(\text {REM}(g(\ell _{1,2}, \ldots , \ell _{r^{\prime },2}), I_{[n]\setminus [r]},\vec \alpha ^{\prime })\) where \(\vec \alpha ^{\prime } = (\alpha _{r+1},\ldots ,\alpha _{n})\).
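The per-monomial reduction \({x_{i}^{e}} \text {mod} p_{i}(x_{i})\) used in the third step can be sketched as follows; the coefficient-list representation and the example modulus x² − 2 are illustrative choices:

```python
def rem_pow(e, p):
    """Remainder of x^e modulo a monic univariate p, as a coefficient list.
    p is given low-to-high degree, e.g. x^2 - 2 is [-2, 0, 1], and must be monic."""
    d = len(p) - 1
    if e < d:
        r = [0] * d
        r[e] = 1
        return r
    # maintain x^k mod p iteratively: multiply by x, then reduce the top term
    cur = [0] * d
    cur[d - 1] = 1            # start at x^(d-1)
    k = d - 1
    while k < e:
        lead = cur[d - 1]
        cur = [0] + cur[:-1]  # multiply by x; the dropped lead sits at degree d
        # x^d = -(c0 + c1 x + ... + c_{d-1} x^{d-1}) for monic p
        for i in range(d):
            cur[i] -= lead * p[i]
        k += 1
    return cur

# x^5 mod (x^2 - 2): since x^2 = 2, we get x^5 = 4x
print(rem_pow(5, [-2, 0, 1]))  # [0, 4]
```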

Correctness

Let R(X) = f modI[n] be the unique remainder polynomial. Let Rr(X) = f modI[r]. Then, by Lemma 4.4, we know that RrmodI[n]∖[r] = R, so it suffices to show \(g(\ell _{1,2}, \ldots , \ell _{r^{\prime },2}) = R_{r}(\alpha _{1}, \ldots , \alpha _{r}, x_{r+1}, \ldots , x_{n})\), as that would imply

$$ \text{REM}(g(\ell_{1,2}, \ldots, \ell_{r^{\prime},2}), I_{[n]\setminus[r]},\vec\alpha^{\prime}) = \text{REM}(f(\ell_{1},\ldots,\ell_{r}),I_{[n]},\vec\alpha)= R(\alpha_{1},\alpha_{2},\ldots,\alpha_{n}), $$

showing the correctness of the recursion.

Let \(R^{\prime }(x_{1},\ldots , x_{r}, x_{r+1}, \ldots , x_{n}) = T(f) \text {mod} {I_{[r]}}\). By Lemma 4.5 we have \(R^{\prime } = T(R_{r})\) and hence \(R_{r}= T^{-1}(R^{\prime })(x_{1},\ldots , x_{r}, T^{-1}(x_{r+1}), \ldots , T^{-1}(x_{n}))\). By definition of the linear map T, and substituting xi = αi,i ∈ [r], we have

$$ \begin{array}{@{}rcl@{}} g(\ell_{1,2}, \ldots, \ell_{r^{\prime},2}) & = & T^{-1}(R^{\prime})(\alpha_{1}, \ldots, \alpha_{r},T^{-1}(x_{r+1}), \ldots, T^{-1}(x_{r+r^{\prime}}))\\ &=& R_{r}(\alpha_{1}, \ldots, \alpha_{r}, x_{r+1}, \ldots, x_{n}). \end{array} $$

Running Time

In order to bound the running time of the above algorithm, we need to bound the total number of scalar arithmetic operations and the size of the scalars involved in the computations. We will bound the total number of arithmetic operations by poly(L,s,n,dO(r)), where L bounds the encoding lengths of the scalars in the input and s is the size of the input circuit for f.

First consider the case when \({\mathbb {F}}\) is a finite field. In that case, we can let L bound the encoding lengths of all elements of \(\mathbb {F}\). We only need to bound the size of the polynomial \(g(\ell _{1,2}, \ldots , \ell _{r^{\prime },2})\) and analyze the total number of operations.

Firstly, the polynomial T(f) can be explicitly computed from the input arithmetic circuit deterministically in time poly(L,s,n,dO(r)), because it has at most \(\binom{d+2r}{2r}\) many monomials (as the number of variables is \(r+r^{\prime }\le 2r\)).

Next, notice that the polynomial \(g(x_{r+1},\ldots ,x_{r+r^{\prime }})\) can also be written as a linear combination of at most \(\binom{d+2r}{2r}\) many degree-d monomials in \(x_{r+1},\ldots ,x_{r+r^{\prime }}\). Thus, the polynomial \(g(\ell _{1,2},\ell _{2,2},\ldots ,\ell _{r^{\prime },2})\) can be seen as a ΣπΣ circuit. In other words, it is a sum of at most \(\binom{d+2r}{2r}\) products of the linear forms ℓi,2, and the products are at most d-fold.

Further, notice that the number of divisions (by the univariate polynomials pi(xi),i ∈ [r]) performed in Step 3 is r per monomial of T(f). Since T(f) has at most \(\binom{d+2r}{2r}\) monomials, the number of univariate polynomial divisions, and hence the number of scalar operations, is bounded by poly(L,s,n,dO(r)). All other steps require poly(s,n,d,L) operations.

Now, in each recursive application the number of generators in the ideal is reduced by at least one, and there is only one recursive call made.

Thus, the overall number of scalar (i.e., \({\mathbb {F}}\)) operations involved in the algorithm is bounded by poly(L,s,n,dO(r)).

The above analysis bounding the total number of operations also applies for \({\mathbb {F}}=\mathbb {Q}\). For \(\mathbb {Q}\), we additionally need to bound the sizes of the numbers during the computation.

Bit-size Growth Over \(\mathbb {Q}\)

It suffices to argue that the sizes of the coefficients in the polynomial \(g(\ell _{1,2},\ell _{2,2},\ldots ,\ell _{r^{\prime },2})\) increase by an additive value bounded by poly(n,d,L). As the total number of recursive calls is at most n, this polynomially bounds all scalars involved in the entire computation.

Let \(\tilde {L}\) bound the encoding lengths of the coefficients of the polynomial f(z1,z2,…,zr). As \(2^{\tilde {L}}\le 2^{Ld}\cdot \binom{d+r}{r}\), we have \(\tilde {L}\le dL+r\log (r+d)\).

We will show that the \(\sum \prod \sum \) circuit that we use for g in the next recursive step has coefficients of bit size at most \(\tilde {L} + \text {poly}(n,d,L)\).

For \(h\in \mathbb {Q}[X]\), let c(h) denote the maximum coefficient (in absolute value) of a nonzero monomial of h. By direct expansion

$$ |c(f(\ell_{1},\ldots, \ell_{r}))| \leq 2^{\tilde{L} + \text{poly}(n,d,L)}. $$

Also the matrix T of Lemma 4.3, and its inverse, require poly(n,L) size entries.

Therefore, \(c(T(f(\ell _{1},\ldots ,\ell _{r}))) \leq 2^{\tilde {L} +\text {poly}(n,d,L)}\). Next, the algorithm expands T(f) explicitly as a sum of dO(r) monomials. Dividing T(f) by the polynomials p1(x1),…,pr(xr) one by one, and substituting x1 = α1,…,xr = αr, gives us the remainder polynomial \(g(x_{r+1},\ldots ,x_{r+r^{\prime }})\). Each such division amounts to computing a remainder polynomial of the form \({x_{i}^{e}} \text {mod} p_{i}(x_{i})\) for some ed, which requires no large intermediate computations. Each such remainder \({x_{i}^{e}} \text {mod} p_{i}(x_{i})\) has poly(n,d,L) size coefficients and degree at most \(\deg (p_{i})-1\). Putting it together, it follows that \(|c(g)| \leq 2^{\tilde {L} + \text {poly}(n,d,L)}\).

Now the algorithm passes the dO(r) size ΣπΣ circuit \(g(\ell _{1,2},\ldots ,\ell _{r^{\prime } , 2})\) (note that \(T^{-1}(x_{r+1})=\ell _{1,2},\ldots ,T^{-1}(x_{r+r^{\prime }})=\ell _{r^{\prime } , 2}\)), the univariates pr+ 1(xr+ 1),…, pn(xn), and the point (αr+ 1,…,αn) to the next recursive call.

In the recursive call \(\text {REM}(g(\ell _{1,2}, \ldots , \ell _{r^{\prime },2}), I_{[n]\setminus [r]},\vec \alpha ^{\prime })\), notice that the only change in the input size is in the size of g (which, as shown above, is of O(dO(r)) size with coefficients of bit size \(\tilde {L} + \text {poly}(n,d,L)\)).

As there are at most n recursive calls overall, all coefficients involved at intermediate stages have bit size bounded by p(n,d,L) for a fixed polynomial p.

Remark 4.6

Given a rank-r polynomial f(ℓ1,…,ℓr) and a univariate ideal \(I = \left \langle {p_{1}(x_{1}),\ldots ,p_{n}(x_{n})}\right \rangle \), we can decide the membership of f in I by testing if the remainder polynomial f modI is identically zero, by evaluating it at a randomly chosen \(\vec {\alpha }\) over \({\mathbb {F}}\) or a suitable extension field [14, 31, 34]. Hence, univariate ideal membership of degree-d rank-r polynomials can be decided in randomized dO(r) ⋅poly(n) time, where \(d = \max \limits \{\deg (f),\deg (p_{i}): 1\le i\le n\}\), by Theorem 1.4.

As mentioned in Section 1, an application of our result yields an nO(r) time algorithm for computing the permanent of rank-r matrices over \({\mathbb {Q}}\) or any finite field. Barvinok [9], via a different method, had obtained an nO(r) time algorithm for this problem over \(\mathbb {Q}\).

Corollary 4.7

  • There is an nO(r) time algorithm to compute the permanent of n × n matrices of rank at most r over the field of rationals or any finite field.

  • For finite fields \({\mathbb {F}}\) the algorithm has running time bounded by \(O^{*}(|{\mathbb {F}}|^{O(r^{2})})\). In particular, over constant size fields this is an FPT algorithm for computing Perm(A) (with r as fixed parameter).

Proof

The nO(r) time algorithm is a direct application of the algorithm of Theorem 1.4 to the product of linear forms polynomial and univariate ideal described in Fact 1.2.

For the second part, suppose \({\mathbb {F}}\) is a finite field of size ps, where \({\text {char}}({\mathbb {F}})=p\) (a prime). Let \(A\in \mathbb {F}^{n\times n}\) be a rank r matrix and let \(\ell _{i}={\sum }_{j=1}^{n} a_{ij}x_{j}, 1\le i\le n\). Then there are exactly \(N=|\mathbb {F}|^{r}-1\) many distinct nonzero \(\mathbb {F}\)-linear forms spanned by i,i ∈ [n]. We denote them by \(\ell ^{\prime }_{1},\ell ^{\prime }_{2},\ldots ,\ell ^{\prime }_{N}\). Then the product \({\prod }_{i=1}^{n} \ell _{i}\) can be expressed as

$$ \prod\limits_{i=1}^{n} \ell_{i} = \prod\limits_{j=1}^{N} {\ell^{\prime}}_{j}^{d_{j}}, $$

where \(d_{1}+d_{2}+{\dots } + d_{N}=n\) is the degree of the product. Therefore, by Fact 1.2 we have

$$ {\text{Perm}}(A) = \prod\limits_{j=1}^{N} {\ell^{\prime}}_{j}^{d_{j}} \text{mod}\left\langle{{x_{1}^{2}},{x_{2}^{2}},\ldots,{x_{n}^{2}}}\right\rangle. $$

Now, suppose dj ≥ p for some j. Let \(\ell ^{\prime }_{j} = {\sum }_{k=1}^{n} \alpha _{jk}x_{k}\). Then writing dj = pqj + rj,rj < p we have

$$ \begin{array}{@{}rcl@{}} {\ell^{\prime}}_{j}^{d_{j}} &=& \left( \sum\limits_{k=1}^{n}\alpha_{jk}x_{k}\right)^{pq_{j}+r_{j}}\\ &=& \left( \sum\limits_{k=1}^{n}\alpha_{jk}^{p}{x_{k}^{p}}\right)^{q_{j}}\cdot \left( \sum\limits_{k=1}^{n}\alpha_{jk}x_{k}\right)^{r_{j}}\\ &=& 0 \text{mod}\left\langle{{x_{1}^{2}},{x_{2}^{2}},\ldots,{x_{n}^{2}}}\right\rangle. \end{array} $$

The last equality holds because \({x_{k}^{p}}=0 \text {mod} {{x_{k}^{2}}}\) for any p ≥ 2. Consequently, if \(n>(p-1)\cdot (|{\mathbb {F}}|^{r}-1)\) then dj ≥ p for some j, and by Fact 1.2 we have Perm(A) = 0. For \(n\le (p-1)\cdot |{\mathbb {F}}|^{r}\), the nO(r)-time algorithm is an \(O^{*}(p^{r}\cdot |{\mathbb {F}}|^{O(r^{2})})\) time algorithm, which completes the proof. □
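As a sanity check of the identity behind Fact 1.2 (assuming, as the proof above uses, that Perm(A) equals the coefficient of x1x2⋯xn in \({\prod }_{i=1}^{n} \ell _{i}\), i.e., the multilinear part that survives modulo \(\left \langle {{x_{1}^{2}},\ldots ,{x_{n}^{2}}}\right \rangle \)), one can compare both sides by brute force on a small matrix:

```python
from itertools import permutations, product

def perm(A):
    """Permanent by its definition, as a sum over permutations."""
    n = len(A)
    total = 0
    for s in permutations(range(n)):
        p = 1
        for i in range(n):
            p *= A[i][s[i]]
        total += p
    return total

def multilinear_coeff(A):
    # coefficient of x1*x2*...*xn in prod_i (sum_j A[i][j] * x_j):
    # pick one variable per linear factor; only square-free picks survive mod <x_j^2>
    n = len(A)
    total = 0
    for choice in product(range(n), repeat=n):
        if len(set(choice)) == n:
            p = 1
            for i in range(n):
                p *= A[i][choice[i]]
            total += p
    return total

A = [[1, 2, 0],
     [3, 1, 4],
     [0, 5, 1]]
print(perm(A), multilinear_coeff(A))  # both 27
```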

4.2 Small Circuit for the Remainder Polynomial

The first algorithm is based on repeated division and partial evaluation. As such, it does not directly yield a small circuit for f modI.

We now show that f modI has an arithmetic circuit of size O(dO(r)), where \(d=\deg (f)\). The circuit has a nice form: it is a dO(r)-sum of products of univariate polynomials, each of degree at most d. Moreover, this circuit can be constructed in time O(dO(r)) from the input f and I. This also yields another proof of Theorem 1.4, since evaluation of the circuit obtained at a given scalar point can be done in O(dO(r)) time.

Some notation for the sequel: For \(q\in {\mathbb {F}}[t_{1},t_{2},\ldots ,t_{r},X]\), let \([t_{1}^{d_{1}}t_{2}^{d_{2}}{\cdots } t_{r}^{d_{r}}](q)\) denote the coefficient of \(t_{1}^{d_{1}}t_{2}^{d_{2}}{\cdots } t_{r}^{d_{r}}\) in q, noting that \([t_{1}^{d_{1}}t_{2}^{d_{2}}{\cdots } t_{r}^{d_{r}}](q)\in {\mathbb {F}}[X]\).

Now, we can write f = g(ℓ1,ℓ2,…,ℓr) as a sum of dO(r) many d-products of the r linear forms. Thus, it suffices to give a small circuit, of the above form, for each remainder \(\ell _{1}^{d_{1}}\ell _{2}^{d_{2}}{\cdots } \ell _{r}^{d_{r}} \text {mod} I\), where \(I=\left \langle {p_{1}(x_{1}),p_{2}(x_{2}),\ldots ,p_{n}(x_{n})}\right \rangle \). A + -gate summing up all these remainder circuits would be a circuit of the claimed form for f modI.

We first consider a single power ℓd modI, where \(\ell ={\sum }_{i=1}^{n} a_{i}x_{i}\) is a homogeneous linear form in \({\mathbb {F}}[X]\). By the multinomial theorem

$$ \left( \sum\limits_{i=1}^{n} a_{i}x_{i}t\right)^{d} = \sum\limits_{j_{1}+j_{2}+\dots+j_{n}=d} \binom{d}{j_{1},j_{2},\ldots,j_{n}} \prod\limits_{i=1}^{n} (a_{i}x_{i}t)^{j_{i}}. $$

For fields \({\mathbb {F}}\) of characteristic zero, we can write:

$$ \left( \sum\limits_{i=1}^{n} a_{i}x_{i}\right)^{d} = d! [t^{d}]\left( \prod\limits_{i=1}^{n}\left( \sum\limits_{j=0}^{d} {\frac{1}{j!}}(a_{i}x_{i}t)^{j}\right)\right). $$
(1)

Equation 1 is verified combinatorially by noting that the term \({\prod }_{i=1}^{n} (a_{i}x_{i}t)^{j_{i}}\), for \(j_{1}+j_{2}+{\dots }+j_{n}=d\), occurs precisely \(\binom{d}{j_{1},j_{2},\ldots,j_{n}}\) times on the right side, matching the multinomial expansion of the left side. This identity was first used in arithmetic circuit complexity by Saxena [28] (see Footnote 2), and has found many applications.
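Identity (1) can be checked numerically with exact rational arithmetic; the coefficients ai and the evaluation point below are arbitrary choices:

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b, trunc):
    """Multiply two coefficient lists in t, truncating at degree trunc - 1."""
    out = [Fraction(0)] * trunc
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j < trunc:
                    out[i + j] += ai * bj
    return out

n, d = 3, 4
a = [Fraction(2), Fraction(-1), Fraction(3)]   # coefficients of ell
x = [Fraction(1), Fraction(2), Fraction(5)]    # sample evaluation point

# right side of (1): d! * [t^d] prod_i (sum_j (a_i x_i t)^j / j!)
prod = [Fraction(1)]
for i in range(n):
    factor = [(a[i] * x[i]) ** j / factorial(j) for j in range(d + 1)]
    prod = poly_mul(prod, factor, d + 1)
rhs = factorial(d) * prod[d]

lhs = sum(ai * xi for ai, xi in zip(a, x)) ** d   # ell(x)^d = 15^4
print(lhs == rhs)  # True
```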

Remark 4.8

Observe that the right-hand side of (1) can be viewed as a univariate polynomial in t of degree nd. Therefore, by interpolation, we can find \(\alpha _{1},\ldots , \alpha _{nd+1}\in {\mathbb {F}}\) (or a suitable extension field of \({\mathbb {F}}\)) and \(\beta _{1},\ldots , \beta _{nd+1}\in \mathbb {F}\) such that

$$ \left( \sum\limits_{i=1}^{n} a_{i}x_{i}\right)^{d} = \sum\limits_{\ell=1}^{nd+1} \beta_{\ell}\left( \prod\limits_{i=1}^{n}\left( \sum\limits_{j=0}^{d} {\frac{1}{j!}}(a_{i}x_{i}\alpha_{\ell})^{j}\right)\right). $$
(2)

Therefore, a power of a linear form can be expressed as a small sum of products of univariates.

This can be generalized to the finite field setting [16]. We give a self-contained description, as it is required for the circuit construction for f modI. First, for \({\text {char}}(\mathbb {F})=p\), (1) only holds for d < p, as each j! occurring in it is invertible in \(\mathbb {F}_{p}\) precisely when d < p. To obtain a suitable form of the equation for d ≥ p, we first write \(d={\sum }_{k=0}^{s} e_{k}p^{k}\), for \(s\le \log _{p}(d)\) and each ek < p. Since \({\text {char}}({\mathbb {F}})=p\), for each ks, letting \(a_{i}^{p^{k}}=a_{k,i}\in {\mathbb {F}}\) we have:

$$ \left( \sum\limits_{i=1}^{n} a_{i}x_{i}t\right)^{e_{k}p^{k}} = \left( \sum\limits_{i=1}^{n} a_{k,i}x_{i}^{p^{k}}t^{p^{k}}\right)^{e_{k}}. $$

Combined with (1) we get for 0 ≤ ks:

$$ \ell^{e_{k}p^{k}} = \left[t^{e_{k}p^{k}}\right]\left( \sum\limits_{i=1}^{n} a_{k,i}x_{i}^{p^{k}}t^{p^{k}}\right)^{e_{k}} = (e_{k})!\left[t^{e_{k}p^{k}}\right]\left( \prod\limits_{i=1}^{n}\left( \sum\limits_{j=0}^{e_{k}} {\frac{1}{j!}}\left( a_{k,i}x_{i}^{p^{k}}t^{p^{k}}\right)^{j}\right)\right). $$

As \(d={\sum }_{k=0}^{s} e_{k}p^{k}\), multiplying over all k gives

$$ \begin{array}{@{}rcl@{}} \ell^{d} & = & \prod\limits_{k=0}^{s} \left[t^{e_{k}p^{k}}\right]\left( \sum\limits_{i=1}^{n} a_{k,i}x_{i}^{p^{k}}t^{p^{k}}\right)^{e_{k}}\\ & = & \prod\limits_{k=0}^{s} (e_{k})!\left[t^{e_{k}p^{k}}\right]\left( \prod\limits_{i=1}^{n}\left( \sum\limits_{j=0}^{e_{k}} {\frac{1}{j!}}\left( a_{k,i}x_{i}^{p^{k}}t^{p^{k}}\right)^{j}\right)\right). \end{array} $$

Let t0,t1,…,ts be new variables. Replacing \(t^{p^{k}}\) by tk for each 0 ≤ ks in the above equations we get:

$$ \ell^{d} = \left[t_{0}^{e_{0}}t_{1}^{e_{1}}{\dots} t_{s}^{e_{s}}\right] \prod\limits_{k=0}^{s} (e_{k})! \left( \prod\limits_{i=1}^{n}\left( \sum\limits_{j=0}^{e_{k}} {\frac{1}{j!}}\left( a_{k,i}x_{i}^{p^{k}}t_{k}\right)^{j}\right)\right). $$
(3)

Thus, \(\ell ^{d}= [t_{0}^{e_{0}}t_{1}^{e_{1}}{\dots } t_{s}^{e_{s}}] Q_{\ell ,d}\), where Q,d is a product of the (s + 1)n many polynomials as above (each of which is a bivariate polynomial in xi,tk, for i ∈ [n] and 0 ≤ ks). This equation generalizes to express the product \(\ell _{1}^{d_{1}}\cdot \ell _{2}^{d_{2}}{\cdots } \ell _{r}^{d_{r}}\) in the following form:

$$ \ell_{1}^{d_{1}}\cdot \ell_{2}^{d_{2}}{\cdots} \ell_{r}^{d_{r}} = \left[t_{1}^{\nu_{1}}t_{2}^{\nu_{2}}{\dots} t_{D}^{\nu_{D}}\right] \prod\limits_{k=1}^{D}\prod\limits_{i=1}^{n} q_{k,i}, $$
(4)

where D = (s + 1)r, and νk < p for each k ∈ [D] such that \(d_{j}={\sum }_{k=(s+1)(j-1)+1}^{(s+1)j} \nu _{k}p^{k-(s+1)(j-1)-1}\), j ∈ [r]. It is obtained simply by applying (3) to each \(\ell _{j}^{d_{j}}\) with a different set of s + 1 many variables ti and multiplying these equations for 1 ≤ jr. We note that each \(q_{k,i}\in {\mathbb {F}}[x_{i},t_{k}]\) is a polynomial of individual variable degree at most \(d={\sum }_{j=1}^{r} d_{j}\), as is clear from (3). The next claim will complete the proof of Theorem 1.4.
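The Frobenius identity underlying (3) and (4), namely that the middle binomial coefficients \(\binom{p}{j}\) vanish mod p so that p-th powers distribute over sums, can be checked directly for a small prime (p = 3 and the coefficients below are arbitrary choices):

```python
from math import comb

# (a1*x1 + a2*x2)^p mod p: the coefficient of x1^j x2^(p-j) is C(p,j) a1^j a2^(p-j)
p = 3
a1, a2 = 2, 1
coeffs = [comb(p, j) * a1 ** j * a2 ** (p - j) % p for j in range(p + 1)]
print(coeffs)  # [1, 0, 0, 2]: only the pure powers a2^p and a1^p survive mod p
```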

Claim 4.9

\(\ell _{1}^{d_{1}}\cdot \ell _{2}^{d_{2}}{\cdots } \ell _{r}^{d_{r}} \text {mod} I\) has an arithmetic circuit which is a dO(r)-sum of products of univariate polynomials, where each univariate polynomial in xi involved in a product has degree at most \(\deg (p_{i}(x_{i}))-1\).

For the proof, we first consider the following subexpression in (4)

$$ \left[t_{1}^{\nu_{1}}t_{2}^{\nu_{2}}{\dots} t_{D}^{\nu_{D}}\right]\prod\limits_{k=1}^{D} q_{k,i}, $$

which we will evaluate modulo pi(xi). Note that the number of monomials of the form \({\prod }_{k=1}^{D} t_{k}^{\mu _{k}}, \mu _{k}\le \nu _{k}<p\) is bounded by pD = (ps+ 1)r = dO(r). Thus, in O(dO(r)) time we can expand the product \({\prod }_{k=1}^{D} q_{k,i}\) by multiplying out the polynomials, one by one, from left to right. After each multiplication, we replace \({x_{i}^{a}}\) by its remainder \({x_{i}^{a}}\text {mod} p_{i}\) and drop any term with a factor \({t_{k}^{p}}, k\in [D]\). This will result in a polynomial expression of the form

$$ Q_{i} = \sum\limits_{\bar{\mu}}r_{\bar{\mu}}(x_{i})\prod\limits_{k=1}^{D}t_{k}^{\mu_{k}}, $$

where the sum runs over the dO(r) many tuples \(\bar {\mu }=(\mu _{1},\mu _{2},\ldots ,\mu _{D})\) such that μk ≤ νk for each k. Thus, each \(r_{\bar {\mu }}(x_{i})\) is a univariate in xi of degree at most \(\deg (p_{i})-1\). We can now evaluate the product Q1Q2Qn modulo the ideal \(\left \langle {{t_{1}^{p}},{t_{2}^{p}},\ldots ,{t_{D}^{p}}}\right \rangle \) by multiplying out adjacent pairs and dropping any terms with a factor \({t_{k}^{p}}, k\in [D]\). This gives an expression for Q1Q2Qn modulo \(\left \langle {{t_{1}^{p}},{t_{2}^{p}},\ldots ,{t_{D}^{p}}}\right \rangle \) of the form \({\sum }_{\bar {\mu }}R_{\bar {\mu }}{\prod }_{k=1}^{D}t_{k}^{\mu _{k}}\), where each \(R_{\bar {\mu }}\) is a dO(r)-sum of products of n univariate polynomials (and in each product the i th factor is a polynomial in xi of degree at most \(\deg (p_{i})-1\)). Finally, we note that \(R_{\bar {\nu }}\) is the desired polynomial expression for \({\prod }_{j=1}^{r}\ell _{j}^{d_{j}} \text {mod} I\), completing the proof of the claim.

4.3 Vertex Cover Detection in Low Rank Graphs

In the Vertex Cover problem, the input instances are pairs (G,k), where G = (V,E) is a graph and k is an integer. The problem is to decide whether or not G has a vertex cover of size k. This is a classical NP-complete problem.

A graph G is said to be of rank r if its adjacency matrix AG has rank r. Graphs of low rank were studied by Lovasz and Kotlov [4, 20]. As an application of Theorem 1.4, we obtain an nO(r) time algorithm to compute a minimum vertex cover in an n-vertex graph of rank r.

Remark 4.10

A pair of vertices x,y in a graph G are twins if they have identical neighborhoods in G. Lovasz and Kotlov [4] have shown that a rank r graph G that is twin-free has at most O(2r/2) vertices. Clearly, a minimal vertex cover S of G does not contain twins. Therefore, in order to search for a minimum vertex cover for G, it suffices to search for it in a maximal twin-free subgraph H of G, which is easy to find in poly(n) time. Now, H will have at most O(2r/2) vertices as its rank is also bounded by r. A brute-force search for the minimum vertex cover in V (H) yields an \(O^{*}(2^{2^{r/2}})\) algorithm. For n that is double exponential in r, this brute-force search is faster than the nO(r) algorithm of this section.
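The twin-removal step described in this remark can be sketched as follows; the neighborhood-map representation and the star graph (whose three leaves are mutual twins) are illustrative choices:

```python
def twin_free_subgraph(adj):
    """Keep one representative vertex per neighborhood class; adj maps vertex -> set."""
    seen, keep = set(), []
    for v in sorted(adj):
        key = frozenset(adj[v])
        if key not in seen:       # first vertex seen with this neighborhood
            seen.add(key)
            keep.append(v)
    return keep

# star K_{1,3}: the three leaves share the neighborhood {0}, so they are twins
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(twin_free_subgraph(adj))  # [0, 1]
```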

Proof of Theorem 1.5

We give a polynomial-time reduction from Vertex Cover to Univariate Ideal Membership. Let (G,k) be a Vertex Cover instance. Let \(I=\left \langle {{x^{2}_{1}} - x_{1},{x^{2}_{2}} - x_{2},\ldots , {x^{2}_{n}} - x_{n}}\right \rangle \) and

$$ f = \prod\limits^{\binom{n}{2}}_{s=1} (\vec{x} A_{G} \vec{x}^{T} - s) \cdot \prod\limits^{n-k-1}_{t=0}\left( \sum\limits^{n}_{i=1} x_{i} - t\right), $$

where AG is the adjacency matrix of the graph G and \(\vec {x}=(x_{1},x_{2},\ldots ,x_{n})\) is a row vector.

Claim 4.11

The rank of the polynomial f is at most r + 1.

Proof

We note that AG is symmetric since it encodes an undirected graph. Let Q be an invertible n × n matrix that diagonalizes AG, so that QAGQT = D where D is a diagonal matrix with only the first r diagonal entries non-zero. Let \(\vec {y}=(y_{1},y_{2},\ldots ,y_{n})\) be another row vector of variables. Now, we show the effect of the transform \(\vec {x}\mapsto \vec {y}Q\) on the polynomial \(\vec {x}A_{G} \vec {x}^{T}\). Clearly, \(\vec {y}Q A_{G} Q^{T} \vec {y}^{T} = \vec {y}D\vec {y}^{T}\), and since there are only r non-zero entries on the diagonal, the polynomial \(\vec {y}D\vec {y}^{T}\) is over the variables y1,y2,…,yr. Thus \(g = {\prod }^{\binom{n}{2}}_{s=1} (\vec {x}A_{G} \vec {x}^{T} - s)\) is a rank r polynomial. Also \(h={\prod }^{n-k-1}_{t=0}({\sum }^{n}_{i=1} x_{i} - t)\) is a rank 1 polynomial, as there is only one linear form \({\sum }^{n}_{i=1} x_{i}\). Since f = gh, we conclude that f is a rank r + 1 polynomial. □

Now the proof of Theorem 1.5 follows from the next claim.

Claim 4.12

The graph G has a Vertex Cover of size at most k if and only if \(f\notin I\).

Proof

First, observe that the set of common zeroes of the generators of the ideal I is the set {0,1}n. Let S be a vertex cover in G such that |S|≤ k. We will exhibit a point \(\vec {\alpha }\in \{0,1\}^{n}\) such that \(f(\vec {\alpha })\neq 0\). This will imply that \(f\notin I\). Identify the vertices of G with {1,2,…,n}. Define \(\vec {\alpha }(i)=0\) if and only if iS. Since \(\vec {x} A_{G} \vec {x}^{T} = {\sum }_{(i,j)\in E_{G}} x_{i} x_{j}\) and S is a vertex cover for G, it is clear that \(\vec {x} A_{G} \vec {x}^{T}(\vec {\alpha })=0\). Also \(({\sum }_{i=1}^{n} x_{i})(\vec {\alpha })\geq n-k\). Then clearly \(f(\vec {\alpha })\neq 0\).

For the other direction, suppose that \(f\notin I\). Then by Theorem 1.1, there exists \(\vec {\alpha }\in \{0,1\}^{n}\) such that \(f(\vec {\alpha })\neq 0\). Define the set \(S\subseteq [n]\) as follows. Include iS if and only if \(\vec {\alpha }(i)=0\). Since \(f(\vec {\alpha })\neq 0\), and the range of values that \(\vec {x} A_{G} \vec {x}^{T}\) can take is {0,1,…,|E|}, it must be the case that \(\vec {x} A_{G} \vec {x}^{T}(\vec {\alpha })=0\). It implies that the set S is a vertex cover for G. Moreover, \({\prod }^{n-k-1}_{t=0}({\sum }^{n}_{i=1} x_{i} - t)(\vec {\alpha })\neq 0\) implies that |S|≤ k. □

The degree of the polynomial f is bounded by n2 + n, and from Claim 4.12 we know that f modI is a non-zero polynomial if and only if G has a vertex cover of size at most k. By the Polynomial Identity lemma [14, 31, 34], \((f \text {mod} I)(\vec {\beta })\) is non-zero with high probability when \(\vec {\beta }\) is chosen randomly from a small domain. Now, we just need to compute \((f \text {mod} I)(\vec {\beta })\), where f is a rank r + 1 polynomial with \(\ell _{i} = (\vec {x}Q^{-1})_{i}\) for each 1 ≤ ir and \(\ell _{r+1} ={\sum }_{i=1}^{n} x_{i}\), which can be performed in nO(r) time using Theorem 1.4. □
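On instances small enough for exhaustive search over {0,1}n, the reduction of Claim 4.12 can be sanity-checked directly; here \(\vec {x}A_{G}\vec {x}^{T}\) is taken to count each edge once, following the proof, and the triangle is an arbitrary test graph:

```python
from itertools import product

def f_value(edges, n, k, alpha):
    # q = value of x A_G x^T at alpha, counting each edge once
    q = sum(alpha[u] * alpha[v] for u, v in edges)
    val = 1
    for s in range(1, n * (n - 1) // 2 + 1):   # s = 1, ..., C(n, 2)
        val *= q - s
    for t in range(n - k):                      # t = 0, ..., n - k - 1
        val *= sum(alpha) - t
    return val

def f_nonzero_somewhere(edges, n, k):
    # f lies outside I iff f is nonzero somewhere on {0,1}^n (Theorem 1.1)
    return any(f_value(edges, n, k, a) != 0
               for a in product([0, 1], repeat=n))

triangle = [(0, 1), (1, 2), (0, 2)]
print(f_nonzero_somewhere(triangle, 3, 2))  # True: the triangle has a 2-cover
print(f_nonzero_somewhere(triangle, 3, 1))  # False: no single vertex covers it
```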

5 Univariate Ideal Membership Parameterized by Degree

In this section, we consider the degree of the input polynomial as the fixed parameter. Let \(I=\left \langle {\{p_{i}(x_{i})\}_{i=1}^{n}}\right \rangle \) be a univariate ideal and \(f\in \mathbb {F}[{X}]\) be a degree-k polynomial given by an arithmetic circuit. Clearly, there is a simple O(nO(k)) algorithm for this problem: we can write \(f={\sum }_{m}\alpha _{m} m\) as a linear combination of \(\binom{n+k}{k}\) many monomials m, and then compute the remainder \(f \text {mod} I={\sum }_{m}\alpha _{m} (m \text {mod} I)\) as a linear combination of monomials.

We first prove Theorem 1.6 showing a randomized O((2e)k) time algorithm for the special case where \({\mathbb {F}}=\mathbb {Q}\) and the ideal \(I=\left \langle {x_{1}^{e_{1}},x_{2}^{e_{2}},\ldots ,x_{n}^{e_{n}}}\right \rangle \).

5.1 Proof of Theorem 1.6

Proof

The main step is the following reduction of checking if \(f\in I\) (where f is degree-k and \(I=\left \langle {x_{1}^{e_{1}},x_{2}^{e_{2}},\ldots ,x_{n}^{e_{n}}}\right \rangle \)) to the problem of checking if the polynomial \(f\circ ^{s} g_{\ell }\) is identically zero, where gℓ is chosen as a polynomial weakly equivalent (see Footnote 3) to the elementary symmetric polynomial. The claimed algorithm then follows by applying a recent result of [7].

Recall that Sm,ℓ denotes the elementary symmetric polynomial of degree ℓ over m variables. Set \(m = {\sum }_{i=1}^{n} (e_{i} - 1)\) and define Sm,ℓ on the m variables \(z_{1,1},\ldots ,z_{1,e_{1}-1},\ldots ,z_{n,1},\ldots ,z_{n,e_{n}-1}\). Now, for 0 ≤ ℓ ≤ k define gℓ(X) as the polynomial obtained from Sm,ℓ by replacing each zi,j by xi,1 ≤ in.

Claim 5.1

Given integers e1,e2,…,en, and a homogeneous polynomial f(X) of degree k, \(f\in \left \langle {x^{e_{1}}_{1},x^{e_{2}}_{2},\ldots ,x^{e_{n}}_{n}}\right \rangle \) if and only if \(f\circ ^{s} g_{\ell }\equiv 0\) for all 0 ≤ ℓ ≤ k.

Proof

Clearly \(f\not \in \left \langle {x^{e_{1}}_{1},x^{e_{2}}_{2},\ldots ,x^{e_{n}}_{n}}\right \rangle \) if and only if f has a nonzero degree-ℓ monomial \(M = x^{f_{1}}_{1} x^{f_{2}}_{2}{\ldots } x^{f_{n}}_{n}\), for some ℓ ≤ k, such that fi < ei for each 1 ≤ in. Hence, the scaled Hadamard product polynomial \(f\circ ^{s} g_{\ell }\) is not identically zero for some ℓ ≤ k if and only if \(f\not \in \left \langle {x^{e_{1}}_{1},x^{e_{2}}_{2},\ldots ,x^{e_{n}}_{n}}\right \rangle \). □
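The monomial criterion used in this proof, that f lies in \(\left \langle {x_{1}^{e_{1}},\ldots ,x_{n}^{e_{n}}}\right \rangle \) exactly when every monomial of f is divisible by some generator, can be sketched on an explicit list of monomials (the example polynomial is an arbitrary choice):

```python
def in_power_ideal(monomials, e):
    """Membership of f in <x_1^{e_1}, ..., x_n^{e_n}>.
    monomials: dict mapping exponent tuples to nonzero coefficients."""
    return all(any(m[i] >= e[i] for i in range(len(e)))
               for m, c in monomials.items() if c != 0)

# f = x1^2 x2 + x1 x2^3 and I = <x1^2, x2^3>
f = {(2, 1): 1, (1, 3): 1}
print(in_power_ideal(f, (2, 3)))  # True: each monomial is divisible by a generator
```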

The proof now follows from the recent work of [7] as explained below:

For checking if \(f\circ ^{s} g_{\ell }\) (as defined in Claim 5.1) is identically zero, it suffices to check for some polynomial \(\tilde {g_{\ell }}\) weakly equivalent to gℓ that \(f\circ ^{s} \tilde {g_{\ell }}\) is identically zero. By color coding [3], we can construct a homogeneous depth-three circuit of size ekpoly(n) that computes a polynomial weakly equivalent to Sm,ℓ with high probability (see [7] for details). Replacing each zi,j by xi,1 ≤ in, we obtain a homogeneous depth-three circuit of the same size for a polynomial \(\tilde {g_{\ell }}\) weakly equivalent to the gℓ defined in Claim 5.1.

Now, it is shown in [7] that we can compute the scaled Hadamard product of a circuit of size s1 with a degree-k homogeneous depth-three circuit of size s2 in deterministic O(2ks1s2) time. Therefore, \(f\circ ^{s} g_{\ell }\) can be computed in O((2e)k) time. We can check if \(f\circ ^{s} g_{\ell }\) is identically zero by evaluating it at a randomly chosen point [14, 31, 34]. Overall, this gives a randomized O((2e)k) time algorithm. □

Remark 5.2

  1. The above proof fails for \({\text {char}}({\mathbb {F}})<k\): the product \(f\circ ^{s} g_{\ell }\) might vanish because the scaling factor m! for a monomial might be divisible by \({\text {char}}(\mathbb {F})\).

  2. Over the rationals, we can apply a recent work [27] to obtain an O(4.08k) time algorithm to test whether the scaled Hadamard product with the elementary symmetric polynomial is identically zero. This improves the algorithm of Theorem 1.6 to a randomized O(4.08k) algorithm.

We now consider deciding membership in the general case of a univariate ideal. We first make the following observation.

Observation 5.3

Let \(I=\left \langle {\{p_{i}(x_{i})\}_{i=1}^{n}}\right \rangle \) be a univariate ideal and \(f\in {\mathbb {F}}[X]\) be a degree-k polynomial of Waring rank r. Then f can be expressed as an r-sum of k th powers of linear forms, i.e., \(f = {\sum }_{i=1}^{r} \ell _{i}^{k}\) for some affine linear forms ℓi. Then, there is a deterministic poly(r,k,n) algorithm to decide whether fI.

The proof follows easily from (2), which allows us to write f as a small sum of products of univariate polynomials.

Remark 5.4

As an application, motivated by the permanent lemma [1, Lemma 8.1], consider the following constrained linear inequations problem: given \(A\in {\mathbb {F}}^{k\times n}\), \((b_{1},b_{2},\ldots ,b_{k})^{T}\in {\mathbb {F}}^{k}\), and a family of subsets S1,S2,…,Sn of the field \({\mathbb {F}}\), the problem is to find an assignment \(\vec {x}=\vec {a}\in S_{1}\times S_{2}\times {\dots } \times S_{n}\) such that \({\sum }_{j}a_{ij}x_{j}\ne b_{i}\) for all \(1\le i\le k\). We define the degree-k polynomial

$$ f = \prod\limits_{i=1}^{k} \left( \sum\limits_{j=1}^{n} a_{ij}x_{j} - b_{i}\right). $$

Clearly, a solution to the above inequation system exists if and only if there exists \(\vec {a} \in S_{1} \times {\cdots }\times S_{n}\) such that \(f(\vec {a})\) is nonzero. By the Combinatorial Nullstellensatz [1] (Theorem 1.1), this can be expressed as a univariate ideal membership problem. As f is a product of k affine linear forms, its Waring rank is bounded by \(O(2^{k})\). By Observation 5.3, we obtain a deterministic \(O^{*}(2^{k})\) time algorithm to solve this constrained inequation system.
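To make the equivalence underlying this remark concrete, here is a brute-force check (exponential in the grid size, for intuition only; all function names are ours, and this is not the \(O^{*}(2^{k})\) algorithm of Observation 5.3): a solution exists iff the product polynomial f is nonzero at some grid point.

```python
from itertools import product

def has_inequation_solution(A, b, S):
    """Brute-force search for x in S_1 x ... x S_n with sum_j A[i][j]*x[j] != b[i]
    for every row i; equivalently, f(x) = prod_i(sum_j a_ij x_j - b_i) != 0."""
    k, n = len(A), len(A[0])
    for x in product(*S):
        # every factor of f must be nonzero at x
        if all(sum(A[i][j] * x[j] for j in range(n)) != b[i] for i in range(k)):
            return True
    return False

A = [[1, 1]]       # one constraint: x1 + x2 != 2
b = [2]
S = [{0, 1}, {1}]  # grid {0,1} x {1}
print(has_inequation_solution(A, b, S))  # True, witnessed by (0, 1)
```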

For degree-k, n-variate polynomials f, we do not have an algorithm with running time better than \(O^{*}\left(\binom{n+k}{k}\right)\) for univariate ideal membership in general. However, if each generator polynomial pi has distinct roots, we obtain a faster algorithm.

Theorem 5.5

Let \(I = \left \langle {p_{1}(x_{1}), \ldots , p_{n}(x_{n})}\right \rangle \) be a univariate ideal given explicitly by univariate polynomials p1,…,pn such that, for each i ∈ [n], pi(xi) has distinct roots over \({\mathbb {Q}}\). Given a polynomial \(f(X)\in {\mathbb {C}}[X]\) of degree k and I as input, we can decide whether fI in randomized \(O^{*}(n^{k/2})\) time.

Proof

W.l.o.g. we can assume that the degree of each pi is at most k; otherwise we can drop pi from I, as dividing a degree-k polynomial by such a pi leaves it unchanged. For i ∈ [n], let \(S_{i}\subset \mathbb {Q}\) be the set of all roots of pi. By Alon’s Combinatorial Nullstellensatz (Theorem 1.1), Theorem 5.5 can be restated as the following claim.

Claim 5.6

Given a polynomial \(f(X)\in \mathbb {C}[X]\) of degree k and sets S1,…,Sn with \(S_{i}\subset \mathbb {C}\) for each i ∈ [n] as inputs, we can decide whether S1 ×⋯ × Sn contains a point at which f is nonzero in randomized \(O^{*}(n^{k/2})\) time.

For a degree-k polynomial \(f\in {\mathbb {F}}[X]\) let

$$ \tilde{f}=x_{n+1}^{k}\cdot f\left( \frac{x_{1}}{x_{n+1}},\frac{x_{2}}{x_{n+1}},\ldots,\frac{x_{n}}{x_{n+1}}\right), $$

be its homogenization. Thus, \(\tilde {f}\) is homogeneous of degree k and \(\tilde {f}(x_{1},x_{2},\ldots ,x_{n},1) = f(x_{1},\ldots ,x_{n})\). Clearly, f is nonzero somewhere on the n-dimensional grid S1 ×⋯ × Sn if and only if \(\tilde {f}\) is nonzero somewhere on the (n + 1)-dimensional grid S1 ×⋯ × Sn ×{1}. Hence, without loss of generality, we can assume f is homogeneous of degree k.
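The homogenization step can be sketched on a dense dictionary representation (the helpers below are ours, not from the paper): pad each monomial with the power of a fresh variable that brings its total degree up to k, and note that setting the fresh variable to 1 recovers f.

```python
def homogenize(f, k):
    """Homogenize a polynomial f of total degree <= k, given as a dict from
    exponent tuples to coefficients, by appending to each monomial a power of
    a fresh variable so that every monomial gets total degree exactly k."""
    return {e + (k - sum(e),): c for e, c in f.items()}

def evaluate(f, point):
    """Evaluate a dict-represented polynomial at a point."""
    total = 0
    for e, c in f.items():
        term = c
        for xi, ei in zip(point, e):
            term *= xi ** ei
        total += term
    return total

# f = x1^2 + 3*x2 (degree 2); its homogenization is x1^2 + 3*x2*x3
f = {(2, 0): 1, (0, 1): 3}
fh = homogenize(f, 2)
print(evaluate(fh, (5, 7, 1)) == evaluate(f, (5, 7)))  # True
```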

Observation 5.7

For a homogeneous polynomial f of degree k,

$$ f\circ^{s} (a_{1} x_{1}+\ldots+a_{n} x_{n})^{k}\mid_{\vec{1}} = k!\cdot f(a_{1},\ldots,a_{n}). $$
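Observation 5.7 follows from the multinomial theorem: the coefficient of \(x^{e}\) in \((a_{1}x_{1}+\ldots+a_{n}x_{n})^{k}\) is \(\frac{k!}{\prod_i e_i!}a^{e}\), and the scaling factor \(e!=\prod_i e_i!\) of the scaled Hadamard product turns this into \(k!\,a^{e}\). A small numerical verification (the helper names are ours):

```python
from math import factorial
from functools import reduce

def prod(xs):
    return reduce(lambda a, b: a * b, xs, 1)

def lhs_obs57(f, a, k):
    """Compute f o^s (a1*x1+...+an*xn)^k evaluated at the all-ones point,
    for a homogeneous degree-k polynomial f given as a dict."""
    total = 0
    for e, c in f.items():
        multinom = factorial(k) // prod(factorial(ei) for ei in e)  # coeff of x^e in the power
        scale = prod(factorial(ei) for ei in e)                     # the e! scaling factor
        total += scale * c * multinom * prod(ai ** ei for ai, ei in zip(a, e))
    return total

# f = x1^2 * x2 (homogeneous of degree 3), a = (2, 3)
f = {(2, 1): 1}
a = (2, 3)
print(lhs_obs57(f, a, 3) == factorial(3) * (a[0] ** 2 * a[1]))  # True: both equal 3!*f(a)
```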

We need to decide whether there exists a point \(\vec {a} \in S_{1}\times \cdots \times S_{n}\) such that \(f(\vec {a})\neq 0\).

For each (a1,…,an) ∈ S1 ×… × Sn, by (2) we can write,

$$ \frac{1}{k!}\cdot (a_{1} x_{1}+\ldots+a_{n} x_{n})^{k} = \sum\limits_{\ell=1}^{nk+1} \beta_{\ell}\cdot \prod\limits_{i=1}^{n} p_{i}(a_{i}\alpha_{\ell} x_{i}). $$

where \(\alpha _{1},\ldots ,\alpha _{nk+1}\in \mathbb {Q}\) are distinct points, \(\beta _{\ell }\in \mathbb {Q}\), and each pi is a univariate polynomial.

Now, we define the “grid” polynomial

$$ \begin{array}{@{}rcl@{}} g & = & \sum\limits_{\ell=1}^{nk+1} \beta_{\ell} \cdot\prod\limits_{i=1}^{n}\left( \sum\limits_{a_{i}\in S_{i}}\xi_{i,a_{i}} p_{i}(a_{i}\alpha_{\ell} x_{i})\right) \end{array} $$
(5)
$$ \begin{array}{@{}rcl@{}} & = & \sum\limits_{(a_{1},\ldots,a_{n}) \in S_{1}\times{\ldots} \times S_{n}}\prod\limits_{i=1}^{n}\xi_{i,a_{i}}\left( \sum\limits_{\ell=1}^{nk+1} \beta_{\ell}\cdot \prod\limits_{i=1}^{n} p_{i}(a_{i}\alpha_{\ell} x_{i})\right), \end{array} $$
(6)

where \(\xi _{i,a_{i}}, i\in [n], a_{i}\in S_{i}\) are new variables. Hence,

$$ \begin{array}{@{}rcl@{}} f\circ^{s} g\mid_{\vec{1}} &=& \sum\limits_{(a_{1},\ldots,a_{n}) \in S_{1}\times{\ldots} \times S_{n}}\prod\limits_{i=1}^{n}\xi_{i,a_{i}}f\circ^{s} \left( \sum\limits_{\ell=1}^{nk+1} \beta_{\ell}\cdot {\prod}_{i=1}^{n} p_{i}(a_{i}\alpha_{\ell} x_{i})\right)\mid_{\vec{1}} \end{array} $$
(7)
$$ \begin{array}{@{}rcl@{}} &= & \sum\limits_{(a_{1},\ldots,a_{n}) \in S_{1}\times{\ldots} \times S_{n}}\prod\limits_{i=1}^{n} \xi_{i,a_{i}} f(a_{1},a_{2},\ldots,a_{n}) \end{array} $$
(8)

Thus, \(f\circ ^{s} g\mid _{\vec {1}}\) is a nonzero polynomial (in the \(\xi _{i,a_{i}}\) variables) of degree n if and only if \(f\circ^{s}(a_{1}x_{1}+\cdots+a_{n}x_{n})^{k}\) is nonzero for some \((a_{1},\ldots ,a_{n})\in S_{1}\times {\dots }\times S_{n}\). By the Polynomial Identity Lemma [14, 31, 34], we can independently and uniformly assign random values from \([n^{2}]\) to the \(\xi _{i,a_{i}}\) variables; if f is nonzero at some grid point in \(S_{1}\times \dots \times S_{n}\) the evaluation is nonzero with probability at least 1 − 1/n, and otherwise it is always zero. Furthermore, from (7) we note that we can clear the denominators of all the β, of the polynomials \(p_{i}(a_{i}\alpha _{\ell } x_{i})\), and of the polynomial f (given by the input circuit), and pull out a common factor \(\frac {1}{D}\) (where D is an integer with polynomially many bits) to write (7) as

$$ f\circ^{s} g\mid_{\vec{1}} =\frac{1}{D} \sum\limits_{(a_{1},\ldots,a_{n}) \in S_{1}\times{\ldots} \times S_{n}}\prod\limits_{i=1}^{n}\xi_{i,a_{i}}\hat{f}\circ^{s} \left( \sum\limits_{\ell=1}^{nk+1} \gamma_{\ell}\cdot \prod\limits_{i=1}^{n} \hat{p}_{i}(a_{i}\alpha_{\ell} x_{i})\right)\mid_{\vec{1}}, $$

where \(\hat {f}\) and \(\hat {p}_{i}(a_{i}\alpha _{\ell } x_{i})\) have integer coefficients. Thus, whenever \(f\circ ^{s} g\mid _{\vec {1}}\) is nonzero at a choice of the \(\xi _{i,a_{i}}\), its absolute value is at least 1/D.

Therefore, after randomly choosing \(\xi _{i,a_{i}}\in _{R}[n^{2}]\), it is clear from (5) that the problem reduces to efficiently computing the scaled Hadamard product \(f\circ ^{s} h\mid _{\vec {1}}\) evaluated at \(\vec {1}\), where \(h={\prod }_{i=1}^{n} q_{i}(x_{i})\) and each qi is of degree at most k. We now show that \(f\circ ^{s} h\mid _{\vec {1}}\) can be computed in \(O^{*}(n^{k/2})\) time, which suffices to detect whether \(f\circ ^{s} g\mid _{\vec {1}}\) is nonzero in \(O^{*}(n^{k/2})\) time.

Claim 5.8

\(f\circ ^{s} {\prod }_{i=1}^{n} q_{i}(x_{i})\mid _{\vec {1}}\) can be computed in \(O^{*}(n^{k/2})\) time.

Notice that the above claim completes the proof, because the summation over has nk + 1 terms. Let \(\beta = \max \limits _{\ell }\{|\beta _{\ell }|\}\). Then the overall error in computing \(f\circ ^{s} g\mid _{\vec {1}}\) is bounded by the precision error of the claim multiplied by (nk + 1)β, which can be made smaller than 1/D by choosing the precision error in the claim sufficiently small.

We now prove the claim. We need approximations because we have to approximately compute the roots of the univariate polynomials qi. Let Ri denote the set of nonzero roots of qi. Then we can write

$$ \prod\limits_{i} q_{i} = \prod\limits_{i=1}^{n} x_{i}^{\mu_{i}}\prod\limits_{i=1}^{n}\prod\limits_{-\beta\in R_{i}}(x_{i}+\beta)^{\nu_{i,\beta}}, $$

where νi,β is the multiplicity of the root − β in qi. If \({\sum }_{i} \mu _{i} > k\) then clearly \(f\circ ^{s} {\prod }_{i} q_{i} = 0\), since f is homogeneous of degree k. Otherwise, let \({\sum }_{i} \mu _{i} = s\) and let r = ks. Let \({\Gamma }={\prod }_{i}{\prod }_{-\beta \in R_{i}} \beta ^{\nu _{i,\beta }}\), and write \({\prod }_{i=1}^{n}{\prod }_{-\beta \in R_{i}}(x_{i}+\beta )^{\nu _{i,\beta }}\) as \({\Gamma } {\prod }_{i=1}^{n}{\prod }_{-\beta \in R_{i}}(x_{i}/\beta +1)^{\nu _{i,\beta }}\). Let \(m={\sum }_{i}\deg (q_{i})-s\) and consider the elementary symmetric polynomial Sm,r in variables y1,y2,…,ym. By Lee’s result [24], Sm,r can be expressed as an \(O(m^{r/2})\)-term sum of powers of linear forms. In the polynomial Sm,r we replace the m variables y1,y2,…,ym by the m linear forms of the form xi/β (one for each nonzero root of \({\prod }_{j} q_{j}\), with multiplicity, as explained above). Let the product of the resulting polynomial (which is still an \(O(m^{r/2})\)-term sum of rth powers of linear forms) with \({\Gamma }\cdot {\prod }_{i=1}^{n} x_{i}^{\mu _{i}}\) be denoted by Q. Clearly, \(f\circ ^{s} {\prod }_{i} q_{i} = f\circ ^{s} Q\). Since Q is a sum of powers of linear forms, using Observation 5.7 we can evaluate \(f\circ ^{s} Q\mid _{\vec {1}}\) with \(O^{*}(n^{k/2})\) arithmetic operations.

Now, replacing each root β by a rational approximation \(\beta ^{\prime }\) such that \(|\beta -\beta ^{\prime }|\le 1/2^{L}\), for a suitably chosen L with polynomially many bits, the overall error in the approximation of \(f\circ ^{s} Q\mid _{\vec {1}}\) is bounded, and it can be made smaller than ε by choosing L suitably large. We can use any efficient root-approximation algorithm for univariate polynomials to find all such approximations \(\beta ^{\prime }\).

This completes the proof of the claim and the theorem. □

Remark 5.9

Observe that Claim 5.8 can be restated as follows: given univariate polynomials \(p_{i}(x_{i}), 1\le i\le n\), the Waring rank of the degree-k part of their product \({\prod }_{i=1}^{n} p_{i}(x_{i})\) is bounded by \(O^{*}(n^{k/2})\). Then the proof of Theorem 5.5 follows as an application of Observation 5.3.

6 Univariate Ideal Membership Parameterized by Number of Generators

In this section, we consider univariate ideal membership parameterized by the number of generators of the univariate ideal. More precisely, we consider univariate ideal membership where the input polynomial f(X) is given by a circuit of size s and the univariate ideal is \(I=\left \langle {p_{1}(x_{1}), \ldots , p_{k}(x_{k})}\right \rangle \), with k as the fixed parameter.

We show that the nonmembership problem is W[2]-hard by giving an efficient reduction from the k-dominating set problem, which is W[2]-complete [13].

Moreover, in contrast to the problem parameterized by \(\deg (f)\), the problem remains hard even for the special case of the ideal \(I=\left \langle {x_{1}^{e_{1}},x_{2}^{e_{2}},\ldots ,x_{k}^{e_{k}}}\right \rangle \): we show it is MINI[1]-hard. Hence, even in this special case the problem cannot have an algorithm with running time \(O(s^{o(k)})\), assuming the exponential time hypothesis. On the other hand, the problem has an easy \(O(s^{k})\) time randomized algorithm.

Proof of Theorem 1.7

Let (G,k) be an instance of the k-dominating set problem, where G = (V,E) is an n-vertex graph and the fixed parameter k is the size of the dominating set. Let V (G) = {1,2,…,n}. For 1 ≤ ik, we define polynomials

$$ p_{i}(x_{i}) = \prod\limits^{n}_{j=1}(x_{i} - j). $$

The W[2]-hardness proof is an application of Alon’s Combinatorial Nullstellensatz (Theorem 1.1): by definition, the zero set of each pi is Z(pi) = [n]. Therefore, a polynomial \(g\in \mathbb {Q}[x_{1},x_{2},\ldots ,x_{k}]\) is in the ideal 〈p1,p2,…,pk〉 if and only if g vanishes at every point of the k-dimensional grid \([n] \times [n]\times {\dots } \times [n]\).

For each uV, let \(N_{u} = \{u\}\cup \{v\in V : uv\in E\}\) denote its closed neighborhood in G. Define polynomials qu, uV, by

$$ q_{u} = \sum\limits_{i=1}^{k} \prod\limits_{v\in\overline{N_{u}}} (x_{i}-v)^{2}. $$

Notice that qu is nonzero at a grid point \(x_{i} = v_{i}, 1\le i\le k\), if and only if some \(v_{i}\in N_{u}\). That is, qu is nonzero at (v1,v2,…,vk) if and only if some vi dominates u. Now, letting

$$ q_{G}(x_{1},x_{2},\ldots,x_{k}) = \prod\limits_{u=1}^{n} q_{u}, $$

it follows that qG is nonzero at a grid point \(x_{i} = v_{i}, 1\le i\le k\), if and only if {v1,v2,…,vk} is a dominating set for G.

Hence, by Theorem 1.1 we have the following claim which completes the proof. □

Claim 6.1

The polynomial qG is not in the univariate ideal 〈p1,p2,…,pk〉 if and only if the graph G has a dominating set of size k.
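The reduction can be exercised on a toy graph: over \(\mathbb{Q}\), each qu is a sum of squares, so it vanishes at a grid point exactly when no coordinate lies in Nu. A brute-force Python check of the grid formulation of Claim 6.1 (the function names and the 1-indexed adjacency-dict representation are ours):

```python
from itertools import product

def q_G_nonzero_at(G_adj, point):
    """Over Q, q_G is nonzero at the grid point (v_1,...,v_k) iff every
    vertex u has some v_i in its closed neighborhood N_u, i.e. iff
    {v_1,...,v_k} dominates the graph."""
    for u in G_adj:
        closed_nbhd = {u} | G_adj[u]
        # q_u = sum_i prod_{v not in N_u} (x_i - v)^2 vanishes iff no v_i is in N_u
        if not any(v in closed_nbhd for v in point):
            return False
    return True

def has_dominating_set(G_adj, k):
    n = len(G_adj)
    return any(q_G_nonzero_at(G_adj, p) for p in product(range(1, n + 1), repeat=k))

# path 1-2-3: vertex 2 dominates everything, so k = 1 suffices
path3 = {1: {2}, 2: {1, 3}, 3: {2}}
print(has_dominating_set(path3, 1))  # True
```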

6.1 Proof of Theorem 1.8

We first relate our univariate ideal membership problem to a linear-algebraic problem, k-Lin-Eq. It turns out that the k-Lin-Eq problem is more amenable to the MINI[1]-hardness proof. Finally, we show a reduction from MINI-1-in-3-POSITIVE-3-SAT to k-Lin-Eq to complete the proof.

Definition 6.2 (k-Lin-Eq)

Input: Integers k,n in unary, a k × n matrix A with all entries given in unary, and a k-dimensional vector \(\vec {b}\) with all entries in unary.

Parameter: k.

Question: Does there exist an \(\vec {x}\in \{0,1\}^{n}\) such that \(A\vec {x} = \vec {b}\)?

Lemma 6.3

There is a parameterized reduction from k-Lin-Eq to the univariate ideal membership problem when the ideal is given by powers of variables as generators.

Proof

We introduce 2k variables \(x_{1},x_{2},\dots ,x_{k},y_{1},y_{2},\dots ,y_{k}\), two variables for each row. For each i ∈ [k], let \(\mu _{i} = {\sum }_{j=1}^{n} a_{ij}\). For each column \(c_{i} = (a_{1i},a_{2i},\dots ,a_{ki})\) we construct the polynomial \(P_{i} = ({y_{1}}^{a_{1i}}{y_{2}}^{a_{2i}}{\dots } {y_{k}}^{a_{ki}} + {x_{1}}^{a_{1i}}{x_{2}}^{a_{2i}}{\dots } {x_{k}}^{a_{ki}})\). We let \(P_{A} = {\prod }_{i=1}^{n} P_{i}\) and we choose the ideal to be \(\langle x_{1}^{b_{1} + 1},y_{1}^{\mu _{1} - b_{1} +1},\) \(\dots ,x_{k}^{b_{k} + 1},y_{k}^{\mu _{k} - b_{k} +1}\rangle \). Notice that PA has a small arithmetic circuit which is polynomial-time computable. □

Claim 6.4

An instance \( (A,\vec {b})\) is a YES instance for k-Lin-Eq iff \(P_{A} \not \in \left \langle x_{1}^{b_{1} + 1},y_{1}^{\mu _{1} - b_{1} +1},\dots ,x_{k}^{b_{k} + 1},y_{k}^{\mu _{k} - b_{k} +1}\right \rangle \).

Proof of Claim

Suppose \((A,\vec {b})\) is a YES instance. Then there is an \(\vec {x}\in \{0,1\}^{n}\) such that \(A\vec {x}=\vec {b}\). Define \(S:=\{i\in [n] : \vec {x}_{i}=1\}\), where \(\vec{x}_{i}\) is the i th coordinate of \(\vec {x}\). Consider the monomial obtained by picking \({x_{1}}^{a_{1i}}{x_{2}}^{a_{2i}}\dots {x_{k}}^{a_{ki}}\) from Pi for each iS, and \({y_{1}}^{a_{1i}}{y_{2}}^{a_{2i}}{\dots } {y_{k}}^{a_{ki}}\) from the remaining Pj with \(j\in \bar {S}\). This gives the monomial \(x_{1}^{b_{1}} y_{1}^{\mu _{1} - b_{1}} {\ldots } x_{k}^{b_{k}} y_{k}^{\mu _{k} - b_{k}}\) in the polynomial PA. Thus \(P_{A} \not \in \left \langle {x_{1}^{b_{1} + 1},y_{1}^{\mu _{1} - b_{1} +1},\ldots ,x_{k}^{b_{k} + 1},y_{k}^{\mu _{k} - b_{k} +1}}\right \rangle \).

Now we show the other direction. Suppose \(P_{A} \not \in \left \langle x_{1}^{b_{1} + 1},y_{1}^{\mu _{1} - b_{1} +1},\dots ,x_{k}^{b_{k} + 1},y_{k}^{\mu _{k} - b_{k} +1}\right \rangle \). Then there is a monomial \({x_{1}}^{c_{1}}{x_{2}}^{c_{2}}\dots {x_{k}}^{c_{k}} {y_{1}}^{d_{1}}{y_{2}}^{d_{2}}\dots {y_{k}}^{d_{k}}\) in PA that survives modulo the ideal. Let \(S := \{ i\in [n] : {x_{1}}^{a_{1i}}{x_{2}}^{a_{2i}}\dots {x_{k}}^{a_{ki}}\) is picked from Pi} for a choice producing this monomial. Then for each i, \({\sum }_{j\in S}a_{ij}=c_{i} \leq b_{i}\) and \({\sum }_{j\not \in S}a_{ij} = d_{i} \leq \mu _{i} - b_{i}\). As \(\mu _{i} = {\sum }_{j\in S} a_{ij} + {\sum }_{j\not \in S}a_{ij}\), we get \(b_{i} \leq {\sum }_{j\in S}a_{ij}\). Hence, \({\sum }_{j \in S}a_{ij} = b_{i}\) for each i. Define \(\vec {x}\in \{0,1\}^{n}\) by \(\vec {x}_{i} = 1\) if iS and \(\vec {x}_{i}=0\) otherwise. This shows \((A,\vec {b})\) is a YES instance. □
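On a tiny instance, the correspondence in Claim 6.4 between 0/1-solutions and surviving monomials of PA can be checked by brute force over the 2^n ways of picking the x-part or the y-part of each Pi (since all entries of A are nonnegative, no cancellation occurs; the function name is ours):

```python
from itertools import product

def k_lin_eq_via_monomial(A, b):
    """Brute-force check of the Lemma 6.3 reduction: a 0/1-solution of Ax = b
    exists iff some choice of x-part/y-part per column of A yields a monomial
    with x-degree vector exactly b (the y-degrees are then forced to mu - b)."""
    k, n = len(A), len(A[0])
    for choice in product([0, 1], repeat=n):  # 1 = pick the x-part of P_i
        xdeg = [sum(A[i][j] for j in range(n) if choice[j]) for i in range(k)]
        if xdeg == b:
            return True
    return False

A = [[1, 2, 1]]
b = [3]
print(k_lin_eq_via_monomial(A, b))  # True: x = (1, 1, 0)
```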

Before we prove the MINI[1]-hardness of k-Lin-Eq, we show that the following problem is MINI[1]-hard.

Definition 6.5

MINI-1-in-3-POSITIVE-3-SAT

Input: Integers k,n in unary, and a 3-SAT instance \(\mathcal {E}\) consisting of only positive literals, where \(\mathcal {E}\) has at most \(k\log n \) variables and at most \(k\log n\) clauses.

Parameter: k.

Question: Does there exist a satisfying assignment for \(\mathcal {E}\) such that every clause has exactly one true literal?

Claim 6.6

MINI-1-in-3-POSITIVE-3-SAT is MINI[1]-hard.

To prove the claim, we only need to observe that the standard Schaefer reduction [30] from 3-SAT to 1-in-3-POSITIVE-3-SAT is in fact a linear-size reduction, which directly gives us an FPT reduction from MINI-3SAT to MINI-1-in-3-POSITIVE-3-SAT.

Proof of Theorem 1.8

Given a MINI-1-in-3-POSITIVE-3-SAT instance \(\mathcal {E}\), order the variables \(v_{1},\dots ,v_{k\log n}\) and the clauses \(C_{1},\dots ,C_{k\log n}\). Construct the following \(k\log n\times k\log n\) matrix M, where the rows are indexed by the clauses and the columns are indexed by the variables: M[i][j] is set to 1 if vj appears in Ci, and to 0 otherwise. Make M a \(2k\log n\times n\) matrix by inserting an all-zero row after every row and appending all-zero columns at the end. Now, define \(\vec {e}\) as the \(2k\log n\)-dimensional vector whose i th coordinate ei is 1 when i is odd and 0 when i is even. We want to find \(\vec {y}\in \{0,1\}^{n}\) such that \(M\vec {y}=\vec {e}\).

However, this is not yet an instance of k-Lin-Eq. To make it one, we observe that M is a bit matrix and \(\vec {e}\) is a bit vector, so we can pack them into a k × n matrix A and a k-dimensional vector \(\vec {b}\) as follows. For each column j, read the i th block of \(2\log n\) consecutive bits as the binary expansion of a single entry, call it N, and set A[i][j] to N. Similarly, we pack \(\vec {e}\) into a k-dimensional vector \(\vec {b}\) by reading each block of \(2\log n\) bits as the binary expansion of a single entry. Now the proof follows from the following claim. □
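The bit-packing step can be sketched as follows (a hypothetical helper of ours, with the low-order bit first, so bit position t of a block corresponds to row t of that block):

```python
def pack(M, e, blocks, width):
    """Pack each block of `width` consecutive rows of the 0/1 matrix M into one
    integer row by reading the rows as binary digits (row t of a block carries
    weight 2^t); pack the 0/1 vector e the same way."""
    n = len(M[0])
    A = [[sum(M[i * width + t][j] << t for t in range(width)) for j in range(n)]
         for i in range(blocks)]
    b = [sum(e[i * width + t] << t for t in range(width)) for i in range(blocks)]
    return A, b

# two clause rows interleaved with all-zero rows, packed into one integer row
M = [[1, 0, 1],
     [0, 0, 0],
     [0, 1, 1],
     [0, 0, 0]]
e = [1, 0, 1, 0]
A, b = pack(M, e, blocks=1, width=4)
print(A, b)  # [[1, 4, 5]] [5]
# z = (0, 0, 1) solves M z = e row by row, and also A z = b after packing
```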

Claim 6.7

\(\mathcal {E}\) is a YES instance for MINI-1-in-3-POSITIVE-3-SAT if and only if there exists an \(\vec {x}\in \{0,1\}^{n}\) such that \(A\vec {x} = \vec {b}\).

Proof

Suppose there is such a satisfying assignment for \(\mathcal {E}\). Define \(S:=\{j\in [k\log n]\mid v_{j}=\text {TRUE}\}\), and define \(\vec {z}\in \{0,1\}^{n}\) by zj = 1 if jS and zj = 0 otherwise. For each i, as Ci contains exactly one true literal, the row of M corresponding to Ci (an odd-indexed row) satisfies \({\sum }_{j=1}^{n} M[2i-1][j]\cdot z_{j} = 1 = e_{2i-1}\), while the even-indexed rows of M are all zero, matching e2i = 0. Therefore \(\vec {z}\) is a solution of \(M\vec {y} = \vec {e}\). As every integer has a unique binary expansion, \(\vec {z}\) is also a solution of \(A\vec {x} = \vec {b}\).

Now we prove the other direction. Suppose \(A\vec {z} = \vec {b}\) for some \(\vec {z}\in \{0,1\}^{n}\). From the construction of the matrix M, it suffices to show that \(\vec {z}\) is a solution of \(M\vec {y} = \vec {e}\). First we note that, in binary, b[i] has 1s exactly in the odd bit positions, and each A[i][j] has 0s in all even bit positions. Let \(A[i][j] = {\sum }^{2\log n}_{t=1} a_{ijt} 2^{t-1}\) and \(b[i]={\sum }^{2\log n}_{t=1} e_{t} 2^{t-1}\). Since \(A\vec {z} = \vec {b}\) we have \({\sum }_{j=1}^{n} A[i][j]\cdot z_{j} = b[i]\). This shows that

$$ \sum\limits_{j=1}^{n} A[i][j]\cdot z_{j} = \sum\limits_{j=1}^{n} \left( \sum\limits^{2\log n}_{t=1} a_{ijt} 2^{t-1}\right) \cdot z_{j} =\sum\limits^{2\log n}_{t=1} \left( \sum\limits_{j=1}^{n} a_{ijt} \cdot z_{j}\right) 2^{t-1}. $$

Since \(\mathcal {E}\) is a 3-CNF formula, we have \(({\sum }_{j=1}^{n} a_{ijt} \cdot z_{j}) \in \{0,1,2,3\}\). Now we compare \(({\sum }_{j=1}^{n} a_{ijt} \cdot z_{j})\) with the binary expansion of b[i]. When t is odd the bit et is 1, so there must be a 1 in the corresponding bit position; in particular \(({\sum }_{j=1}^{n} a_{ijt} \cdot z_{j}) \neq 0\) for odd t. Now if \(({\sum }_{j=1}^{n} a_{ijt} \cdot z_{j}) \in \{ 2,3 \} \) for some odd t, then a carry term \(2^{t}\) is produced, and this cannot match the expansion of b[i], as et+ 1 = 0. Thus, by the uniqueness of binary expansion, we conclude that \(({\sum }_{j=1}^{n} a_{ijt} \cdot z_{j}) = 1\) if t is odd and 0 otherwise. Hence \(M\vec {y} = \vec {e}\) has the solution yi = zi. □

7 Non-deterministic Algorithm for Univariate Ideal Membership

In this section we prove Theorem 1.9. Given a polynomial \(f(X)\in \mathbb {Q}[X]\) and a univariate ideal I = 〈p1(x1),…,pn(xn)〉 whose generators p1,…,pn have no repeated roots, we show that deciding nonmembership of f in I is in NP. By Theorem 1.1, it suffices to check in NP whether there is a point (α1,α2,…,αn) in the n-dimensional grid \(Z(p_{1})\times Z(p_{2})\times {\dots } \times Z(p_{n})\) at which f does not vanish. Since the roots of the pi could be irrational (even complex), it is not immediately clear how to guess a polynomial-size witness for such a grid point and verify it efficiently. However, we show that it suffices for the NP machine to guess a grid point \(\vec {\alpha }\) approximately, up to polynomially many bits of precision. Recall that

$$ f(X)= \sum\limits_{i=1}^{n} h_{i}(X) ~p_{i}(x_{i}) + R(X), $$

where the remainder R is unique and \(\deg _{x_{i}} (R) < \deg (p_{i})\) for all i. For a polynomial \(g\in {\mathbb {F}}[X]\), let |c(g)| denote the maximum absolute value of a coefficient of g. We obtain simple estimates for the coefficients of the polynomials h1,…,hn,R in terms of n, \(\deg (f)\), and the coefficients of f and the pi.

Lemma 7.1

Let \(2^{-L} \le |c(f)|,|c(p_{i})|\leq 2^{L}\). Then \(2^{-\text{poly}(L,n,d)} \le |c(h_{i})|,|c(R)|\leq 2^{\text{poly}(L,n,d)}\), where d is a degree upper bound for f and for {pi : 1 ≤ in}.

Proof

Write f as a linear combination of at most \(\binom{d+n}{n}\) monomials: \(f={\sum }_{m}\alpha _{m} m\). Each monomial m occurring in it is of the form \(m=x_{1}^{e_{1}}x_{2}^{e_{2}}{\dots } x_{n}^{e_{n}}\) with \({\sum }_{i} e_{i}\le d\). By univariate division, we can write each m as:

$$ m = \prod\limits_{i=1}^{n} (h_{m,i}p_{i} + r_{m,i}), $$

where \(h_{m,i},r_{m,i}\in \mathbb {Q}[x_{i}]\) are such that \(x_{i}^{e_{i}}= h_{m,i}p_{i}+r_{m,i}\) and \(\deg (r_{m,i})<\deg (p_{i})\). Moreover, by the properties of univariate polynomial division, the absolute values of the coefficients of each hm,i and rm,i lie in an interval of the form \([2^{-\text{poly}(L,n,d)},2^{\text{poly}(L,n,d)}]\). We note that \(R={\sum }_{m}\alpha_{m}{\prod }_{i=1}^{n} r_{m,i}\), and each hi is a \(2^{\text{poly}(n,d)}\)-term sum of n-fold products of the hm,i, the rm,i, and the pi. Therefore, the coefficients of R and of each hi, in absolute value, also lie in an interval of the form \([2^{-\text{poly}(L,n,d)},2^{\text{poly}(L,n,d)}]\), as claimed. □
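The univariate division underlying this proof is elementary; the following sketch (the helper is ours) computes the remainder of \(x^{e}\) modulo a monic generator over exact rationals. The remainder of a monomial \(m={\prod}_{i} x_{i}^{e_{i}}\) modulo the univariate ideal is then the expanded product \({\prod}_{i} r_{m,i}\), and R is assembled monomial by monomial as in the proof.

```python
from fractions import Fraction

def poly_rem(e, p):
    """Remainder of x^e modulo a monic univariate polynomial p, given as a
    coefficient list in increasing degree (p[-1] == 1). Returns the remainder
    as a coefficient list of length deg(p)."""
    d = len(p) - 1
    r = [Fraction(0)] * (e + 1)
    r[e] = Fraction(1)
    # eliminate the leading terms one degree at a time
    for i in range(e, d - 1, -1):
        c = r[i]
        if c:
            for j in range(d + 1):
                r[i - d + j] -= c * p[j]
    return (r + [Fraction(0)] * d)[:d]

# x^3 mod (x^2 - 2): x^3 = x*(x^2 - 2) + 2x, so the remainder is 2x
print(poly_rem(3, [-2, 0, 1]))  # [Fraction(0, 1), Fraction(2, 1)]
```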

Let \(\vec \alpha = (\alpha _{1}, \ldots , \alpha _{n}) \in \mathbb {C}^{n}\) be such that pi(αi) = 0 for 1 ≤ in. By Lemma 2.3, \(2^{-\hat {L}} \le |\alpha _{i}|\leq 2^{\hat {L}}\), where \(\hat {L}= \text {poly}(L,d)\). For each i, let \(\tilde {\alpha }_{i}\in \mathbb {Q}[i]\) be an 𝜖-approximation of αi; that is, \(|\alpha _{i}-\tilde {\alpha }_{i}|\leq \epsilon \). Let \(\tilde {\alpha }=(\tilde {\alpha }_{1},\ldots ,\tilde {\alpha }_{n})\). Then we can bound the absolute values of \(p_{i}(\tilde {\alpha }_{i})\) and \({\sum }_{i=1}^{n}h_{i}(\tilde {\alpha })p_{i}(\tilde {\alpha }_{i})\).

Observation 7.2

  • For 1 ≤ in we have that \(|p_{i}(\tilde {\alpha }_{i})|\leq \epsilon \cdot 2^{(d L)^{c}}\).

  • \(|{\sum }_{i=1}^{n}h_{i}(\tilde {\alpha })p_{i}(\tilde {\alpha }_{i})|\le \epsilon 2^{(ndL)^{c}}\).

Here c > 0 is a constant that is independent of 𝜖.

Proof

Let \(p_{i}(x_{i}) = c\cdot {\prod }_{j=1}^{d} (x_{i} - \beta _{i,j})\). Without loss of generality, suppose \(\tilde {\alpha }_{i}\) 𝜖-approximates βi,1 for each i. Then

$$ \begin{array}{@{}rcl@{}} |p_{i}(\tilde{\alpha}_{i})| & \leq & \epsilon \cdot |c|\cdot \prod\limits_{j=2}^{d} |\tilde{\alpha}_{i} - \beta_{i,j}| \\ & \le & \epsilon \cdot |c|\cdot \prod\limits_{j=2}^{d} (|\beta_{i,1} - \beta_{i,j}| + \epsilon)\\ & \le & \epsilon \cdot 2^{\text{poly}(d, L)}, \end{array} $$

where the last inequality follows from the bound on the distance between the roots of a univariate polynomial shown in Lemma 2.3. For the second part, note that \(|\tilde {\alpha }_{i}|\le |\alpha _{i}|+1\le 2^{\hat {L}+1}\) by Lemma 2.3. Each hi has at most \(\binom{n+d}{d}\) monomials and, by Lemma 7.1, the coefficients of each hi are bounded by \(2^{\text {poly}(n,d,L)}\) in absolute value. Putting it together, \(|h_{i}(\tilde {\alpha })|\le 2^{\text {poly}(n,d,L)}\) for all i. Hence, by the first part, \(|{\sum }_{i=1}^{n}h_{i}(\tilde {\alpha })p_{i}(\tilde {\alpha }_{i})|\le \epsilon 2^{(ndL)^{O(1)}}\). □

We now prove Theorem 1.9.

Proof

If f is not in the ideal I then, by Theorem 1.1, there exists a grid point \(\vec \alpha = (\alpha _{1}, \ldots , \alpha _{n}) \in Z(p_{1}) \times {\ldots } \times Z(p_{n})\) such that \(R(\vec \alpha ) \neq 0\).

The NP machine guesses an 𝜖-approximation \(\vec {\tilde {\alpha }}=(\tilde {\alpha }_{1}, \ldots , \tilde {\alpha }_{n})\) of \(\vec \alpha \), where 𝜖 will be chosen later in the analysis. Using the circuit (or black-box) for f, we obtain the value of \(f(\vec {\tilde {\alpha }})\).

Next, we show that the value \(|f(\vec {\tilde {\alpha }})|\) distinguishes between the cases fI and fI.

Case 1

fI. In this case \(|f(\tilde {\alpha })| = |{\sum }_{i=1}^{n} h_{i}(\vec {\tilde {\alpha }}) p_{i}(\tilde {\alpha _{i}})|\le \epsilon \cdot 2^{(n d L)^{c}}\) by Observation 7.2. We can verify this from the value returned by the circuit (or black-box) for f. Note that the inequality may be satisfied even for an \(\tilde {\alpha }\) that is not an 𝜖-approximation of \(\vec {\alpha }\); however, the analysis and the choice of 𝜖 below guarantee correctness.

Case 2

fI. We have \(f(\tilde {\alpha }) = {\sum }_{i=1}^{n} h_{i}(\tilde {\alpha })p_{i}(\tilde {\alpha }_{i}) + R(\tilde {\alpha })\). Hence,

$$ |f(\tilde{\alpha}) - R(\tilde{\alpha})| \le \epsilon 2^{(ndL)^{c}}. $$

Our aim is to show that \(|f(\tilde {\alpha })|\ge 2\epsilon 2^{(ndL)^{c}}\). We have from above that \(|f(\tilde {\alpha })|\ge |R(\tilde {\alpha })|-\epsilon 2^{(ndL)^{c}}\).

By triangle inequality, \(|R(\tilde {\alpha })| \geq |R(\vec \alpha )| - |R(\tilde {\alpha }) - R(\vec \alpha )|\). We now show a lower bound on \(|R(\vec \alpha )|\) and an upper bound for \(|R(\tilde {\alpha }) - R(\vec \alpha )|\).

Claim 7.3

\(|R(\vec \alpha )|\geq \frac {1}{2^{(ndL)^{c_{1}}}}\) for some constant c1.

Proof

Let \(\hat {R}(x_{n}) = R(\alpha _{1}, \ldots , \alpha _{n-1}, x_{n} ) = a \cdot {\prod }_{j=1}^{d^{\prime }} (x_{n}-\beta _{j})\), where a is some nonzero scalar and \(d^{\prime }\leq d\). Note that αn is not a zero of \(\hat {R}(x_{n})\), since \(R(\vec \alpha )\neq 0\). Consider the polynomial \(Q(x_{n}) = p_{n}(x_{n}) \hat {R}(x_{n})\). The points \(\alpha _{n}, \beta _{1}, \ldots , \beta _{d^{\prime }}\) are all roots of Q(xn), and \(\alpha _{n} \neq \beta _{j}\) for \(1\leq j\leq d^{\prime }\). By the root separation bound of Lemma 2.4 applied to |αnβj|, it follows that \(|R(\vec \alpha )|=|\hat {R}(\alpha _{n})| \geq \frac {1}{2^{(n d L)^{c_{1}}}}\) for some c1 > 0. □

Claim 7.4

\(|R(\vec {\tilde {\alpha }}) - R(\vec \alpha )|\leq \epsilon 2^{(n d L)^{c_{2}}}\) for some constant c2.

Proof

Define \(R^{0}(\vec {\tilde {\alpha }})=R(\vec \alpha )\) and \(R^{i}(\vec {\tilde {\alpha }}) = R(\tilde {\alpha }_{1}, \ldots , \tilde {\alpha }_{i}, \alpha _{i+1}, \ldots , \alpha _{n})\). By the triangle inequality, \(|R(\vec \alpha ) - R(\vec {\tilde {\alpha }})|\leq {\sum }_{i=1}^{n} |R^{i-1}(\vec {\tilde {\alpha }}) - R^{i}(\vec {\tilde {\alpha }})|\). Writing it out explicitly, we have \(R^{i-1}(\vec {\tilde {\alpha }}) - R^{i}(\vec {\tilde {\alpha }}) = {\sum }_{\vec {e}} c_{\vec {e}} \tilde {\alpha }_{1}^{e_{1}}\ldots \tilde {\alpha }_{i-1}^{e_{i-1}} (\alpha _{i}^{e_{i}} - \tilde {\alpha }_{i}^{e_{i}}) \alpha _{i+1}^{e_{i+1}}\ldots \alpha _{n}^{e_{n}}\). Now, the bounds \(|\alpha _{i}| \leq 2^{(ndL)^{O(1)}}\) and \(|\alpha _{i} - \tilde {\alpha }_{i}|\leq \epsilon \), combined with the number of summands being at most \(\binom{d+n}{d}\), imply by the triangle inequality that \(|R(\vec {\tilde {\alpha }}) - R(\vec \alpha )| \leq \epsilon \cdot 2^{(n d L)^{c_{2}}}\) for some constant c2 > 0 (independent of 𝜖). □

Combining the inequalities in Claims 7.3 and 7.4, we have \(|f(\vec {\tilde {\alpha }})| \geq \frac {1}{2^{(n d L)^{c_{1}}}} - \epsilon \cdot \left (2^{(n d L)^{c_{2}}} + 2^{(n d L)^{c}}\right )\).

To make the calculation precise, let \(3M = \frac {1}{2^{(n d L)^{c_{1}}}}\) and choose 𝜖 such that \(\epsilon \cdot (2^{(n d L)^{c_{2}}} + 2^{(n d L)^{c}}) \leq M\). We note that the number M can be efficiently pre-computed from the input.

Summarizing the test: if fI then, for any guessed point \(\tilde {\alpha }\) that 𝜖-approximates a grid point, \(|f(\tilde {\alpha })|\leq M\). On the other hand, as argued in Case 2 above, if fI then there is a guessed point \(\tilde {\alpha }\) of polynomial size with \(|f(\tilde {\alpha })|\geq 2M\). □