1 Introduction

Sparse interpolation [1, 2, 5, 13] provides an interesting paradigm for efficient computations with multivariate polynomials. In particular, under suitable hypotheses, multiplication of sparse polynomials can be carried out in quasi-linear time in terms of the expected output size. More recently, other multiplication algorithms have been investigated which outperform both naive multiplication and sparse interpolation under special circumstances [12, 14]. An interesting question is how to exploit such asymptotically faster multiplication algorithms for the purpose of polynomial elimination. In this paper, we focus on the reduction of a multivariate polynomial with respect to an autoreduced set of other polynomials, and we show that fast multiplication algorithms can indeed be exploited in this context in an asymptotically quasi-optimal way.

Consider the polynomial ring \(\mathbbm {K} [x] =\mathbbm {K} [x_1, \ldots , x_n]\) over an effective field \(\mathbbm {K}\) with an effective zero test. Given a polynomial

$$ P = \sum _{i \in \mathbbm {N}^n} P_i x^i = \sum _{i_1, \ldots , i_n \in \mathbbm {N}} P_{i_1, \ldots , i_n} x_1^{i_1} \cdots x_n^{i_n}, $$

we call \({\text {supp}}\, P = \{ i \in \mathbbm {N}^n : P_i \ne 0 \}\) the support of P. The naive multiplication of two sparse polynomials \(P, Q \in \mathbbm {K} [x]\) requires a priori \(\mathcal {O} (| {\text {supp}}\, P | | {\text {supp}}\, Q |)\) operations in \(\mathbbm {K}\). This upper bound is sharp if P and Q are very sparse, but pessimistic if P and Q are dense.
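For concreteness, here is a minimal Python sketch of naive sparse multiplication in this representation (the dictionary encoding and all names are our own illustration, not taken from the references):

```python
# Naive sparse multiplication: O(|supp P| * |supp Q|) coefficient operations.
# A polynomial is a dict mapping exponent tuples i in N^n to nonzero coefficients.

def naive_mul(P, Q):
    R = {}
    for i, p in P.items():
        for j, q in Q.items():
            k = tuple(a + b for a, b in zip(i, j))
            c = R.get(k, 0) + p * q
            if c:
                R[k] = c
            else:
                R.pop(k, None)  # cancellation: keep the support exact
    return R

# Example: (x1 + x2^2) * (x1 - x2^2) = x1^2 - x2^4.
P = {(1, 0): 1, (0, 2): 1}
Q = {(1, 0): 1, (0, 2): -1}
assert naive_mul(P, Q) == {(2, 0): 1, (0, 4): -1}
```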

Assuming that \(\mathbbm {K}\) has characteristic zero, a better algorithm was proposed in [2] (see also [1, 5] for some background). The complexity of this algorithm can be expressed in terms of the expected size \(s = | {\text {supp}}\, P + {\text {supp}}\, Q |\) of the output (when no cancellations occur). It is shown that P and Q can be multiplied using only \(\mathcal {O} ({\textsf {M}} (s) \log s)\) operations in \(\mathbbm {K}\), where \({\textsf {M}} (s) =\mathcal {O} (s \log s \log \log s)\) stands for the complexity of multiplying two univariate polynomials in \(\mathbbm {K} [z]\) of degrees \({<}s\). Unfortunately, the algorithm in [2] has two drawbacks:

  1.

    The algorithm leads to a large growth in the sizes of the coefficients, thereby compromising its bit complexity (which is often worse than the bit complexity of naive multiplication).

  2.

    It requires \({\text {supp}}\, PQ \subseteq {\text {supp}}\, P + {\text {supp}}\, Q\) to be known beforehand. More precisely, whenever a bound \({\text {supp}}\, PQ \subseteq {\text {supp}}\, P + {\text {supp}}\, Q \subseteq \mathcal {S}\) is known, then we really obtain a multiplication algorithm of complexity \(\mathcal {O} ({\textsf {M}} (| \mathcal {S} |) \log | \mathcal {S} |)\).

In practice, the second drawback is less important. Indeed, especially when the coefficients in \(\mathbbm {K}\) become large, the computation of \({\text {supp}}\, P + {\text {supp}}\, Q\) is often cheap compared to the multiplication PQ itself, even if \({\text {supp}}\, P + {\text {supp}}\, Q\) is computed in a naive way.

Recently, several algorithms were proposed for removing the drawbacks of [2]. First of all, in [13] we proposed a practical algorithm with essentially the same advantages as the original algorithm from [2], but with a good bit complexity and a variant which also works in positive characteristic. However, it still requires a bound for \({\text {supp}}\, PQ\) and it only works for special kinds of fields \(\mathbbm {K}\) (which nevertheless cover the most important cases such as \(\mathbbm {K}=\mathbbm {Q}\) and finite fields). Even faster algorithms were proposed in [9, 14], but these algorithms only work for special supports. Yet another algorithm was proposed in [7, 12]. This algorithm has none of the drawbacks of [2], but its complexity is suboptimal (although better than the complexity of naive multiplication).

At any rate, these recent developments make it possible to rely on fast sparse polynomial multiplication as a building block, both in theory and in practice. This makes it natural to study other operations on multivariate polynomials with this building block at our disposal. One of the most important such operations is division.

The multivariate analogue of polynomial division is the reduction of a polynomial \(A \in \mathbbm {K} [x]\) with respect to an autoreduced tuple \(B = (B_1, \ldots , B_b) \in \mathbbm {K} [x]^b\) of other polynomials. This leads to a relation

$$\begin{aligned} A= & {} Q_1 B_1 + \cdots + Q_b B_b + R, \end{aligned}$$
(1)

such that none of the terms occurring in R can be further reduced with respect to B. In this paper, we are interested in the computation of R as well as \(Q_1, \ldots , Q_b\). We will call this the problem of extended reduction, in analogy with the notion of an “extended g.c.d.”.

In the univariate context, the technique of “relaxed power series” provides a convenient framework for the resolution of implicit equations [6,7,8, 10]. One major advantage of this technique is that it tends to respect most sparsity patterns which are present in the input data and in the equations. The main technical tool in this paper (see Sect. 3) is to generalize this technique to the setting of multivariate polynomials, whose terms are ordered according to a specific admissible ordering on the monomials. This will make it possible to rewrite (1) as a so-called recursive equation (see Sect. 4.2), which can be solved in a relaxed manner. Roughly speaking, the cost of the extended reduction then reduces to the cost of the relaxed multiplications \(Q_1 B_1, \ldots , Q_b B_b\). Up to a logarithmic overhead, we will show (Theorem 4) that this cost is the same as the cost of checking the relation (1).

To keep the exposition simple, we will adopt a simplified sparse complexity model throughout this paper. In particular, our complexity analysis will not take into account the computation of support bounds for products or results of the extended reduction. Bit complexity issues will also be left aside in this paper. We finally stress that our results are mainly of theoretical interest, since none of the proposed algorithms has been implemented yet. Nevertheless, practical gains are not to be excluded, especially in the case of small n, high degrees and dense supports.

2 Notations

Let \(\mathbbm {K}\) be an effective field with an effective zero test and let \(x_1, \ldots , x_n\) be indeterminates. We will denote

$$\begin{aligned} \mathbbm {K} [x]= & {} \mathbbm {K} [x_1, \ldots , x_n]\\ P_i= & {} P_{i_1, \ldots , i_n}\\ x^i= & {} x_1^{i_1} \cdots x_n^{i_n}\\ i \preccurlyeq j\Leftrightarrow & {} i_1 \leqslant j_1 \wedge \cdots \wedge i_n \leqslant j_n, \end{aligned}$$

for any \(i, j \in \mathbbm {N}^n\) and \(P \in \mathbbm {K} [x]\). In particular, \(i \preccurlyeq j \Leftrightarrow x^i | x^j\). For any subset \(E \subseteq \mathbbm {N}^n\) we will denote by \({\text {Fin}}\, (E) = \{ j \in \mathbbm {N}^n : \exists i \in E, i \preccurlyeq j \}\) the final segment generated by E for the partial ordering \(\preccurlyeq \).

Let \(\leqslant \) be a total ordering on \(\mathbbm {N}^n\) which is compatible with addition. Two particular such orderings are the lexicographical ordering \(\leqslant ^{{\text {lex}}\,}\) and the reverse lexicographical ordering \(\leqslant ^{{\text {rlex}}\,}\):

$$\begin{aligned} i<^{{\text {lex}}\,} j\Leftrightarrow & {} \exists k, i_1 = j_1 \wedge \cdots \wedge i_{k - 1} = j_{k - 1} \wedge i_k< j_k\\ i<^{{\text {rlex}}\,} j\Leftrightarrow & {} \exists k, i_k < j_k \wedge i_{k + 1} = j_{k + 1} \wedge \cdots \wedge i_n = j_n . \end{aligned}$$

In general, it can be shown [16] that there exist real vectors \(\lambda _1, \ldots , \lambda _m \in \mathbbm {R}^n\) with \(m \leqslant n\), such that

$$\begin{aligned} i \leqslant j\Leftrightarrow & {} (\lambda _1 \cdot i, \ldots , \lambda _m \cdot i) \leqslant ^{{\text {lex}}\,} (\lambda _1 \cdot j, \ldots , \lambda _m \cdot j) . \end{aligned}$$
(2)

In what follows, we will assume that \(m = n\), that \(\lambda _1, \ldots , \lambda _n \in \mathbbm {N}^n\), and that \(\gcd ((\lambda _i)_1, \ldots , (\lambda _i)_n) = 1\) for all i. We will also denote

$$\begin{aligned} \lambda \cdot i= & {} (\lambda _1 \cdot i, \ldots , \lambda _n \cdot i) . \end{aligned}$$

For instance, the graded reverse lexicographical ordering \(\leqslant ^{{\text {grlex}}\,}\) is obtained by taking \(\lambda _1 = (1, \ldots , 1)\), \(\lambda _2 = (0, \ldots , 0, 1)\), \(\lambda _3 = (0, \ldots , 0, 1, 0)\), \(\ldots \), \(\lambda _n = (0, 1, 0, \ldots , 0)\).

Given \(P \in \mathbbm {K} [x]\), we define its support by

$$\begin{aligned} {\text {supp}}\, P= & {} \{ i \in \mathbbm {N}^n : P_i \ne 0 \} . \end{aligned}$$

If \(P \ne 0\), then we also define its leading exponent \(l_P\) and coefficient \(c_P\) by

$$\begin{aligned} l_P= & {} \max _{\leqslant } {\text {supp}}\, P\\ c_P= & {} P_{l_P} . \end{aligned}$$

Given a finite set E, we will denote its cardinality by |E|.

3 Relaxed Multiplication

3.1 Relaxed Power Series

Let us briefly recall the technique of relaxed power series computations, which is explained in more detail in [7]. In this computational model, a univariate power series \({f \in \mathbbm {K} [[z]]}\) is regarded as a stream of coefficients \(f_0, f_1, \ldots \). When performing an operation \({g = \varPhi (f_1, \ldots , f_k)}\) on power series, it is required that the coefficient \(g_n\) of the result is output as soon as sufficiently many coefficients of the inputs are known, so that the computation of \(g_n\) does not depend on the further coefficients. For instance, in the case of a multiplication \(h = fg\), we require that \(h_n\) is output as soon as \(f_0, \ldots , f_n\) and \(g_0, \ldots , g_n\) are known. In particular, we may use the naive formula \(h_n = \sum _{i = 0}^n f_i g_{n - i}\) for the computation of \(h_n\).
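As an illustration of this discipline, the following Python sketch (ours, not code from [7]) models a relaxed series as a memoized coefficient stream and implements the naive relaxed product; note that \(h_n\) queries only \(f_0, \ldots , f_n\) and \(g_0, \ldots , g_n\), so the inputs may themselves be lazily evaluated:

```python
# A relaxed power series is modeled as a function n -> coefficient, memoized
# so that each coefficient is computed at most once.

from functools import lru_cache

def naive_relaxed_mul(f, g):
    """Return h = f*g; h(n) only queries f(0..n) and g(0..n)."""
    @lru_cache(maxsize=None)
    def h(n):
        return sum(f(i) * g(n - i) for i in range(n + 1))
    return h

# Example: f = g = 1/(1 - z) = 1 + z + z^2 + ..., so (fg)_n = n + 1.
one_over_1mz = lambda n: 1
h = naive_relaxed_mul(one_over_1mz, one_over_1mz)
assert [h(n) for n in range(5)] == [1, 2, 3, 4, 5]
```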

The additional constraint on the time when coefficients should be output admits the important advantage that the inputs may depend on the output, provided that we add a small delay. For instance, the exponential \(g = \exp f\) of a power series \(f \in z\mathbbm {K} [[z]]\) may be computed in a relaxed way using the formula

$$\begin{aligned} g= & {} \int f' g. \end{aligned}$$

Indeed, when using the naive formula for products, the coefficient \(g_n\) is given by

$$\begin{aligned} g_n= & {} \frac{1}{n} (f_1 g_{n - 1} + 2 f_2 g_{n - 2} + \cdots + nf_n g_0), \end{aligned}$$

and the right-hand side only depends on the previously computed coefficients \(g_0, \ldots , g_{n - 1}\). More generally, equations of the form \(g = \varPhi (g)\) which have this property are called recursive equations and we refer to [11] for a mechanism to transform fairly general implicit equations into recursive equations.
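As a sketch of ours (using exact rationals), the recursive formula above translates directly into a lazy computation of \(g = \exp f\):

```python
from fractions import Fraction
from functools import lru_cache

def relaxed_exp(f):
    """g = exp(f) for a series f with f(0) = 0,
    via n*g_n = f_1*g_{n-1} + 2*f_2*g_{n-2} + ... + n*f_n*g_0."""
    @lru_cache(maxsize=None)
    def g(n):
        if n == 0:
            return Fraction(1)
        # Only g(0..n-1) is used: the equation g = int(f' g) is recursive.
        return sum(Fraction(i) * f(i) * g(n - i) for i in range(1, n + 1)) / n
    return g

# Example: exp(z) has coefficients 1/n!.
f = lambda n: Fraction(1) if n == 1 else Fraction(0)
g = relaxed_exp(f)
assert [g(n) for n in range(5)] == [1, 1, Fraction(1, 2), Fraction(1, 6), Fraction(1, 24)]
```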

The main drawback of the relaxed approach is that we cannot directly use fast algorithms on polynomials for computations with power series. For instance, assuming that \(\mathbbm {K}\) has sufficiently many \(2^p\)-th roots of unity and that field operations in \(\mathbbm {K}\) can be done in time \(\mathcal {O} (1)\), two polynomials of degrees \({<} n\) can be multiplied in time \({\textsf {M}} (n) =\mathcal {O} (n \log n)\), using FFT multiplication [3]. Given the truncations \(f_{; n} = f_0 + \cdots + f_{n - 1} z^{n - 1}\) and \(g_{; n} = g_0 + \cdots + g_{n - 1} z^{n - 1}\) at order n of power series \(f, g \in \mathbbm {K} [[z]]\), we may thus compute the truncated product \((fg)_{; n}\) in time \({\textsf {M}} (n)\) as well. This is much faster than the naive \(\mathcal {O} (n^2)\) relaxed multiplication algorithm for the computation of \((fg)_{; n}\). However, the formula for \((fg)_0\) when using FFT multiplication depends on all input coefficients \(f_0, \ldots , f_{n - 1}\) and \(g_0, \ldots , g_{n - 1}\), so the fast algorithm is not relaxed (we will say that FFT multiplication is a zealous algorithm). Fortunately, efficient relaxed multiplication algorithms do exist:

Theorem 1

[4, 6, 7] Let \({\textsf {M}} (n)\) be the time complexity for the multiplication of polynomials of degrees \(< n\) in \(\mathbbm {K} [z]\). Then there exists a relaxed multiplication algorithm for series in \(\mathbbm {K} [[z]]\) at order n of time complexity \({\textsf {R}} (n) =\mathcal {O} ({\textsf {M}} (n) \log n)\).

Remark 1

In fact, the algorithm from Theorem 1 generalizes to the case when the multiplication on \(\mathbbm {K}\) is replaced by an arbitrary bilinear “multiplication” \(\mathbbm {M}_1 \times \mathbbm {M}_2 \rightarrow \mathbbm {M}_3\), where \(\mathbbm {M}_1, \mathbbm {M}_2\) and \(\mathbbm {M}_3\) are effective modules over an effective ring \(\mathbbm {A}\). If \({\textsf {M}} (n)\) denotes the time complexity for multiplying two polynomials \(P \in \mathbbm {M}_1 [z]\) and \(Q \in \mathbbm {M}_2 [z]\) of degrees \({<} n\), then we again obtain a relaxed multiplication for series \(f \in \mathbbm {M}_1 [[z]]\) and \(g \in \mathbbm {M}_2 [[z]]\) at order n of time complexity \(\mathcal {O} ({\textsf {M}} (n) \log n)\).

Theorem 2

[10] If \(\mathbbm {K}\) admits a primitive \(2^p\)-th root of unity for all p, then there exists a relaxed multiplication algorithm of time complexity

$$ {\textsf {R}} (n) =\mathcal {O} (n \log n \mathrm {e}^{2 \sqrt{\log 2 \log \log n}}). $$

In practice, the existence of a \(2^{p + 1}\)-th root of unity with \(2^p \geqslant n\) suffices for multiplication up to order n.

3.2 Relaxed Multivariate Laurent Series

Let \(\mathbbm {A}\) be an effective ring. A power series \(f \in \mathbbm {A} [[z]]\) is said to be computable if there is an algorithm which takes \(n \in \mathbbm {N}\) on input and produces the coefficient \(f_n\) on output. We will denote by \(\mathbbm {A} [[z]]^{{\text {com}}\,}\) the set of such series. Then \(\mathbbm {A} [[z]]^{{\text {com}}\,}\) is an effective ring for relaxed addition, subtraction and multiplication.

A computable Laurent series is a formal product \(fz^k\) with \(f \in \mathbbm {A} [[z]]^{{\text {com}}\,}\) and \(k \in \mathbbm {Z}\). The set \(\mathbbm {A} ((z))^{{\text {com}}\,}\) of such series forms an effective ring for the addition, subtraction and multiplication defined by

$$\begin{aligned} fz^k + gz^l= & {} (fz^{k - \min (k, l)} + gz^{l - \min (k, l)}) z^{\min (k, l)}\\ fz^k - gz^l= & {} (fz^{k - \min (k, l)} - gz^{l - \min (k, l)}) z^{\min (k, l)}\\ (fz^k) (gz^l)= & {} (fg) z^{k + l}. \end{aligned}$$

If \(\mathbbm {A}\) is an effective field with an effective zero test, then we may also define an effective division on \(\mathbbm {A} ((z))^{{\text {com}}\,}\), but this operation will not be needed in what follows.
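A minimal Python sketch (ours) of this Laurent arithmetic, representing a computable Laurent series as a pair (f, k) with f a memoized coefficient stream:

```python
from functools import lru_cache

# A computable Laurent series f*z^k is represented as a pair (f, k), where
# f is a function n -> n-th coefficient of the power series part.

def lau_add(F, G):
    (f, k), (g, l) = F, G
    m = min(k, l)
    def h(n):  # coefficient of z^n in f*z^(k-m) + g*z^(l-m)
        a = f(n - (k - m)) if n >= k - m else 0
        b = g(n - (l - m)) if n >= l - m else 0
        return a + b
    return (lru_cache(maxsize=None)(h), m)

def lau_mul(F, G):
    (f, k), (g, l) = F, G
    h = lru_cache(maxsize=None)(lambda n: sum(f(i) * g(n - i) for i in range(n + 1)))
    return (h, k + l)

# Example: (z^{-1} + 1) * z = 1 + z.
F = (lambda n: 1 if n in (0, 1) else 0, -1)    # (1 + z) * z^{-1}
G = (lambda n: 1 if n == 0 else 0, 1)          # 1 * z
H = lau_mul(F, G)
assert H[1] == 0 and [H[0](n) for n in range(3)] == [1, 1, 0]
```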

Assume now that z is replaced by a finite number of variables \(z = (z_1, \ldots , z_n)\). Then an element of

$$\begin{aligned} \mathbbm {A} ((z))^{{\text {com}}\,}:= & {} \mathbbm {A} ((z_n))^{{\text {com}}\,} \mathord {\cdots } ((z_1))^{{\text {com}}\,} \end{aligned}$$

will also be called a “computable lexicographical Laurent series”. Any nonzero \(f \in \mathbbm {A} ((z))\) has a natural valuation \(v_f = (v_1, \ldots , v_n) \in \mathbbm {Z}^n\), by setting \(v_1 = {\text {val}}\,_{z_1} f\), \(v_2 = {\text {val}}\,_{z_2} ([z_1^{v_1}] f)\), etc. The concept of recursive equations naturally generalizes to the multivariate context. For instance, for an infinitesimal Laurent series \(\varepsilon \in \mathbbm {A} ((z))^{{\text {com}}\,}\) (that is, \(\varepsilon = fz^k\), where \(v_f >^{{\text {lex}}\,} - k\)), the formula

$$\begin{aligned} g= & {} 1 + \varepsilon g \end{aligned}$$

allows us to compute \(g = (1 - \varepsilon )^{- 1}\) using a single relaxed multiplication in \(\mathbbm {A} ((z))^{{\text {com}}\,}\).
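In the univariate case, for instance, this recursive inversion reads as follows (a sketch of ours):

```python
from functools import lru_cache

def relaxed_inverse_of_one_minus(eps):
    """g = (1 - eps)^{-1} via the recursive equation g = 1 + eps*g,
    for a series eps with eps(0) = 0 (an 'infinitesimal' series)."""
    @lru_cache(maxsize=None)
    def g(n):
        # (eps*g)_n only involves g(0..n-1), since eps(0) = 0.
        base = 1 if n == 0 else 0
        return base + sum(eps(i) * g(n - i) for i in range(1, n + 1))
    return g

# Example: eps = z gives g = 1/(1 - z) = 1 + z + z^2 + ...
eps = lambda n: 1 if n == 1 else 0
g = relaxed_inverse_of_one_minus(eps)
assert [g(n) for n in range(4)] == [1, 1, 1, 1]
```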

Now take \(\mathbbm {A} =\mathbbm {K} [x]\) and consider a polynomial \(P \in \mathbbm {A}\). Then we define the Laurent polynomial \(\hat{P} \in \mathbbm {K} [xz^{- \lambda }] \subseteq \mathbbm {A} ((z))^{{\text {com}}\,}\) by

$$\begin{aligned} \hat{P}= & {} \sum _{i \in \mathbbm {N}^n} P_i x^i z^{- \lambda \cdot i}. \end{aligned}$$

Conversely, given \(f \in \mathbbm {K} [xz^{- \lambda }]\), we define \(\check{f} \in \mathbbm {K} [x]\) by substituting \(z_1 = \cdots = z_n = 1\) in f. We will call the transformations \(P \mapsto \hat{P}\) and \(\hat{P} \mapsto P = \check{\hat{P}}\) tagging and untagging, respectively; they provide us with a relaxed mechanism to compute with multivariate polynomials in \(\mathbbm {K} [x]\), such that the admissible ordering \(\leqslant \) on \(\mathbbm {N}^n\) is respected. For instance, we may compute the relaxed product of two polynomials \(P, Q \in \mathbbm {K} [x]\) by computing the relaxed product \(\hat{P} \hat{Q}\) and substituting \(z_1 = \cdots = z_n = 1\) in the result. We notice that tagging is an injective operation which preserves the size of the support.
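Concretely, tagging only amends the exponent bookkeeping. A hedged sketch (names ours) of tagging and untagging in the dictionary representation:

```python
# Tagging: P = sum P_i x^i  |->  P^ = sum P_i x^i z^(-lambda.i).
# We represent a tagged term by the key (i, t) with t = (lambda_1.i, ..., lambda_n.i),
# the tuple of weights whose negations are the z-exponents; untagging forgets t
# (i.e. substitutes z_1 = ... = z_n = 1).

def tag(P, lambdas):
    """P: dict exponent-tuple -> coeff; lambdas: list of n weight vectors in N^n."""
    return {(i, tuple(sum(l * a for l, a in zip(lam, i)) for lam in lambdas)): c
            for i, c in P.items()}

def untag(Phat):
    return {i: c for (i, t), c in Phat.items()}

# Example with the weights lambda_1 = (1, 1), lambda_2 = (0, 1):
P = {(2, 0): 3, (0, 1): -1}
assert untag(tag(P, [(1, 1), (0, 1)])) == P
```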

3.3 Complexity Analysis

Assume now that we are given \(P, Q \in \mathbbm {K} [x]\) and a set \(\mathcal {R} \subseteq \mathbbm {N}^n\) such that \({\text {supp}}\, (PQ) \subseteq \mathcal {R}\). We assume that \(\mathsf {SM} (s)\) is a function such that the (zealous) product PQ can be computed in time \(\mathsf {SM} (| \mathcal {R} |)\). We will also assume that \(\mathsf {SM} (s) / s\) is an increasing function of s. In [2, 15], it is shown that we may take \(\mathsf {SM} (s) =\mathcal {O} ({\textsf {M}} (s) \log s)\).

Let us now study the complexity of sparse relaxed multiplication of P and Q. We will use the classical algorithm for fast univariate relaxed multiplication from [6, 7], of time complexity \({\textsf {R}} (s) =\mathcal {O} ({\textsf {M}} (s) \log s)\). We will also consider semi-relaxed multiplication as in [8], where one of the arguments \(\hat{P}\) or \(\hat{Q}\) is completely known in advance and only the other one is computed in a relaxed manner.
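To fix ideas, here is a simplified univariate sketch (ours) of the semi-relaxed scheme: completed blocks of the relaxed operand are multiplied zealously against precomputed slices of the known operand; naive block products stand in for fast zealous multiplication:

```python
def semi_relaxed_mul(p, Q, N):
    """Semi-relaxed product h = P*Q: the list Q is fully known (nonempty), the
    coefficients of P arrive one by one via p(n), and h_n is final as soon as
    P_0, ..., P_n have been consumed. Blocks of P of length k = 2^j are
    multiplied zealously by the slice Q[k : 2k] as soon as they are complete."""
    h = [0] * (N + len(Q))
    P = []
    for n in range(N):
        P.append(p(n))
        h[n] += P[n] * Q[0]                    # 1 x 1 semi-relaxed contribution
        k = 1
        while (n + 1) % k == 0 and k < len(Q):
            # zealous block product P[n+1-k : n+1] * Q[k : 2k]
            for a in range(n + 1 - k, n + 1):
                for b in range(k, min(2 * k, len(Q))):
                    h[a + b] += P[a] * Q[b]
            k *= 2
        yield h[n]

# Example: P = 1/(1 - z) (delivered lazily), Q = 1 + z; P*Q = 1 + 2z + 2z^2 + ...
assert list(semi_relaxed_mul(lambda n: 1, [1, 1], 6)) == [1, 2, 2, 2, 2, 2]
```

Every product \(P_a Q_b\) with \(b \in [2^j, 2^{j + 1})\) is computed exactly once, at the first step n with \(n \geqslant a\) and \(2^j \mid n + 1\); since then \(n < a + b\), the contribution indeed arrives before \(h_{a + b}\) is due.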

Given \(X \subseteq \mathbbm {N}^n\) and \(i \in \{ 1, \ldots , n \}\), we will denote

$$\begin{aligned} \delta _i (X)= & {} \max \{ \lambda _i \cdot k : k \in X \} + 1\\ \delta (X)= & {} \delta _1 (X) \cdots \delta _n (X). \end{aligned}$$

We now have the following:

Theorem 3

With the above notations, the relaxed product of P and Q can be computed in time

$$ \mathcal {O} \left( \mathsf {SM} (| \mathcal {R} |) \log \delta (\mathcal {R}) \right) . $$

Proof

In order to simplify our exposition, we will instead prove the theorem for the semi-relaxed product of \(\hat{P}\) (relaxed) and \(\hat{Q}\) (known in advance). As shown in [8], the general case reduces to this special case. We will prove by induction over n that the semi-relaxed product can be computed using at most \(3 \mathsf {SM} (| \mathcal {R} |) \log \delta (\mathcal {R})\) operations in \(\mathbbm {K}\) if \(\mathcal {R}\) is sufficiently large. For \(n = 0\), there is nothing to do, so assume that \(n > 0\).

Let us first consider the semi-relaxed product of \(\hat{P}\) and \(\hat{Q}\) with respect to \(z_1\). Setting \(l = \lceil \log _2 \delta _1 (\mathcal {R}) \rceil \), the computation of this product corresponds (see the right-hand side of Fig. 1) to the computation of \({\leqslant } 2\) zealous \(2^{l - 1} \times 2^{l - 1}\) products (i.e. 2 products of polynomials of degrees \({<} 2^{l - 1}\) in \(z_1\)), \({\leqslant } 4\) zealous \(2^{l - 2} \times 2^{l - 2}\) products, and so on until \({\leqslant } 2^l\) zealous \(1 \times 1\) products. We finally need to perform \(2^l\) semi-relaxed \(1 \times 1\) products of series in \(z_2, \ldots , z_n\) only.

More precisely, assume that \(\hat{P}\) and \(\hat{Q}\) have valuations p and q in \(z_1\), respectively, and let \(\hat{P}_i\) stand for the coefficient of \(z_1^i\) in \(\hat{P}\). We also define

$$\begin{aligned} \hat{\mathcal {R}}= & {} \{ (a_1, \ldots , a_n, b_1, \ldots , b_n) \in \mathbbm {N}^n \times \mathbbm {Z}^n : (a_1, \ldots , a_n) \in \mathcal {R} \wedge (\forall i, b_i = - \lambda _i \cdot a) \}. \end{aligned}$$

Now consider a block size \(2^k\). For each i, we define

$$\begin{aligned} \hat{P}_{[i]}= & {} \hat{P}_{p + 2^k i} z_1^{p + 2^k i} + \cdots + \hat{P}_{p + 2^k (i + 1) - 1} z_1^{p + 2^k (i + 1) - 1}\\ \hat{Q}_{[i]}= & {} \hat{Q}_{q + 2^k i} z_1^{q + 2^k i} + \cdots + \hat{Q}_{q + 2^k (i + 1) - 1} z_1^{q + 2^k (i + 1) - 1}\\ \hat{\mathcal {R}}_{[i]}= & {} \{ (a_1, \ldots , a_n, b_1, \ldots , b_n) \in \hat{\mathcal {R}} : \\&2^k i \leqslant a_1 - p - q \leqslant 2^k (i + 1) - 1 \} , \end{aligned}$$

and notice that the \(\hat{\mathcal {R}}_{[i]}\) are pairwise disjoint. In the semi-relaxed multiplication, we have to compute the zealous \(2^k \times 2^k\) products \(\hat{P}_{[i]} \hat{Q}_{[1]}\) for all \(i \leqslant \lfloor (\delta _1 (\mathcal {R}) + 1) / 2^k \rfloor \). Since

$$\begin{aligned} {\text {supp}}\, \hat{P}_{[i]} \hat{Q}_{[1]}\subseteq & {} \hat{\mathcal {R}}_{[i + 1]} \amalg \hat{\mathcal {R}}_{[i + 2]}, \end{aligned}$$

we may compute all these products in time

$$\begin{aligned}&\mathsf {SM} (| \hat{\mathcal {R}}_{[1]} \amalg \hat{\mathcal {R}}_{[2]} |) + \cdots + \mathsf {SM} (| \hat{\mathcal {R}}_{[2^{l - k}]} \amalg \hat{\mathcal {R}}_{[2^{l - k} + 1]} |)\\&\quad = (| \hat{\mathcal {R}}_{[1]} \amalg \hat{\mathcal {R}}_{[2]} |) \tfrac{\mathsf {SM} (| \hat{\mathcal {R}}_{[1]} \amalg \hat{\mathcal {R}}_{[2]} |)}{| \hat{\mathcal {R}}_{[1]} \amalg \hat{\mathcal {R}}_{[2]} |} + \cdots +\\&\qquad (| \hat{\mathcal {R}}_{[2^{l - k}]} \amalg \hat{\mathcal {R}}_{[2^{l - k} + 1]} |) \tfrac{\mathsf {SM} (| \hat{\mathcal {R}}_{[2^{l - k}]} \amalg \hat{\mathcal {R}}_{[2^{l - k} + 1]} |)}{| \hat{\mathcal {R}}_{[2^{l - k}]} \amalg \hat{\mathcal {R}}_{[2^{l - k} + 1]} |}\\&\quad \leqslant (| \hat{\mathcal {R}}_{[1]} \amalg \hat{\mathcal {R}}_{[2]} | + \cdots + | \hat{\mathcal {R}}_{[2^{l - k}]} \amalg \hat{\mathcal {R}}_{[2^{l - k} + 1]} |) \tfrac{\mathsf {SM} (| \hat{\mathcal {R}} |)}{| \hat{\mathcal {R}} |}\\&\quad \leqslant 2 \mathsf {SM} (| \hat{\mathcal {R}} |) = 2 \mathsf {SM} (| \mathcal {R} |). \end{aligned}$$

The total time spent in performing all zealous \(2^k \times 2^k\) block multiplications with \(2^k < 2^l\) is therefore bounded by \(2 \mathsf {SM} (| \mathcal {R} |) \log \delta _1 (\mathcal {R})\).

Fig. 1. Illustration of a fast relaxed product and a fast semi-relaxed product.

Let us next consider the remaining \(1 \times 1\) semi-relaxed products. If \(n = 1\), then these are really scalar products, whence the remaining work can clearly be performed in time \(\mathsf {SM} (| \mathcal {R} |) \log \delta _1 (\mathcal {R})\) if \(\mathcal {R}\) is sufficiently large. If \(n > 1\), then for each i, we have

$$\begin{aligned} {\text {supp}}\, \hat{P}_{[i]} \hat{Q}_{[0]}\subseteq & {} \hat{\mathcal {R}}_{[i]}. \end{aligned}$$

By the induction hypothesis, we may therefore perform this semi-relaxed product in time \(3 \mathsf {SM} (| \hat{\mathcal {R}}_{[i]} |) (\log \delta (\mathcal {R}) - \log \delta _1 (\mathcal {R}))\). An argument similar to the one above now yields the bound \(3 \mathsf {SM} (| \mathcal {R} |) (\log \delta (\mathcal {R}) - \log \delta _1 (\mathcal {R}))\) for performing all \(1 \times 1\) semi-relaxed block products. The total execution time (which also takes into account the final additions) is therefore bounded by \(3 \mathsf {SM} (| \mathcal {R} |) \log \delta (\mathcal {R})\). This completes the induction.

Remark 2

In practice, the computation of zealous products of the form \(\hat{P}_{[i]} \hat{Q}_{[j]}\) is best done in the untagged model, i.e. using the formula

$$\begin{aligned} \hat{P}_{[i]} \hat{Q}_{[j]}= & {} \widehat{\check{\hat{P}_{[i]}} \check{\hat{Q}_{[j]}}} . \end{aligned}$$

Proceeding this way allows us to use any of our preferred algorithms for sparse polynomial multiplication. In particular, we may use [14] or [12].

4 Polynomial Reduction

4.1 Naive Extended Reduction

Consider a tuple \(B = (B_1, \ldots , B_b) \in \mathbbm {K} [x]^b\). We say that B is autoreduced if \(B_i \ne 0\) for all i and \(l_{B_i} \not\preccurlyeq l_{B_j}\) and \(l_{B_j} \not\preccurlyeq l_{B_i}\) for all \(i \ne j\). Given such a tuple B and an arbitrary polynomial \(A \in \mathbbm {K} [x]\), we say that A is reduced with respect to B if \(l_{B_i} \not\preccurlyeq k\) for all i and \(k \in {\text {supp}}\, A\). An extended reduction of A with respect to B is a tuple \((Q_1, \ldots , Q_b, R)\) with

$$\begin{aligned} A= & {} Q_1 B_1 + \cdots + Q_b B_b + R, \end{aligned}$$
(3)

such that R is reduced with respect to B. The naive algorithm extended-reduce below computes an extended reduction of A.

[Pseudo-code for the algorithm extended-reduce: given as a figure in the original.]
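As a hedged reconstruction of the naive algorithm described here and characterized in Remark 3 below, a Python sketch (data layout, reduction schedule and names are our own choices) might read:

```python
from fractions import Fraction

def extended_reduce(A, B, key):
    """Naive extended reduction of A by an autoreduced tuple B = (B_1, ..., B_b):
    returns (Q_1, ..., Q_b, R) with A = Q_1 B_1 + ... + Q_b B_b + R and R reduced.
    Polynomials are dicts exponent-tuple -> coefficient; 'key' realizes the
    admissible ordering <= on exponents."""
    divides = lambda l, k: all(a <= b for a, b in zip(l, k))
    leads = [max(Bi, key=key) for Bi in B]
    Q = [dict() for _ in B]
    R = dict(A)
    while True:
        # pick the <=-largest reducible exponent k of R, with i minimal
        cand = [(k, i) for k in R for i, l in enumerate(leads) if divides(l, k)]
        if not cand:
            return (*Q, R)
        k, i = max(cand, key=lambda ki: (key(ki[0]), -ki[1]))
        m = tuple(a - b for a, b in zip(k, leads[i]))   # monomial x^(k - l_Bi)
        c = R[k] / B[i][leads[i]]
        Q[i][m] = Q[i].get(m, 0) + c
        for j, bj in B[i].items():                      # R -= c * x^m * B_i
            t = tuple(a + b for a, b in zip(m, j))
            r = R.get(t, 0) - c * bj
            if r: R[t] = r
            else: R.pop(t, None)

# Example over Q with a graded order: x^2 = (x + y)(x - y) + y^2.
grlex = lambda k: (sum(k), k)
A = {(2, 0): Fraction(1)}
B1 = {(1, 0): Fraction(1), (0, 1): Fraction(-1)}        # x - y
Q1, R = extended_reduce(A, [B1], grlex)
assert Q1 == {(1, 0): 1, (0, 1): 1} and R == {(0, 2): 1}
```

Each step removes the selected exponent k from R and only introduces exponents that are strictly smaller for \(\leqslant \), so the loop terminates since \(\leqslant \) is a well-ordering.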

Remark 3

Although an extended reduction is usually not unique, the one computed by extended-reduce is uniquely determined by the fact that, in our main loop, we take i minimal with \(l_{B_i} \preccurlyeq k\) for some \(k \in {\text {supp}}\, R\). This particular extended reduction is also characterized by the fact that

$$\begin{aligned} {\text {supp}}\, Q_i + l_{B_i}\subseteq & {} {\text {Fin}}\, (\{ l_{B_i} \}) {\setminus } {\text {Fin}}\, (\{ l_{B_1}, \ldots , l_{B_{i - 1}} \}) \end{aligned}$$

for each i.

In order to compute \(Q_1, \ldots , Q_b\) and R in a relaxed manner, upper bounds

$$\begin{aligned} {\text {supp}}\, Q_i\subseteq & {} \mathcal {Q}_i\\ {\text {supp}}\, Q_i B_i\subseteq & {} \mathcal {Q}_i + {\text {supp}}\, B_i\\ {\text {supp}}\, R\subseteq & {} \mathcal {R} \end{aligned}$$

need to be known beforehand. These upper bounds are easily computed as a function of \(\mathcal {A}= {\text {supp}}\, A, \mathcal {B}_1 = {\text {supp}}\, B_1, \ldots , \mathcal {B}_b = {\text {supp}}\, B_b\) by the variant supp-extended-reduce of extended-reduce below. We recall from the end of the introduction that we do not take into account the cost of this computation in our complexity analysis. In reality, the execution time of supp-extended-reduce is similar to that of extended-reduce, except that potentially expensive operations in \(\mathbbm {K}\) are replaced by boolean operations of unit cost. We also recall that support bounds can often be obtained by other means for specific problems.

[Pseudo-code for the algorithm supp-extended-reduce: given as a figure in the original.]
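A hedged reconstruction (ours) simply replays extended-reduce on supports, with boolean “coefficients” in place of field elements, so that no cancellation is ever assumed:

```python
def supp_extended_reduce(suppA, suppB, key):
    """Support bounds for the extended reduction: returns sets
    (Q_1, ..., Q_b, R) bounding the supports of the true quotients/remainder.
    Arguments are the support sets of A and of B_1, ..., B_b."""
    divides = lambda l, k: all(a <= b for a, b in zip(l, k))
    leads = [max(S, key=key) for S in suppB]
    Q = [set() for _ in suppB]
    R = set()
    todo, done = set(suppA), set()
    while todo:
        k = max(todo, key=key)
        todo.remove(k)
        if k in done:
            continue
        done.add(k)
        i = next((i for i, l in enumerate(leads) if divides(l, k)), None)
        if i is None:
            R.add(k)                       # k can only end up in the remainder
            continue
        m = tuple(a - b for a, b in zip(k, leads[i]))
        Q[i].add(m)                        # a quotient term may appear here
        for j in suppB[i]:                 # exponents possibly created by the step
            if j != leads[i]:
                todo.add(tuple(a + b for a, b in zip(m, j)))
    return (*Q, R)

# Reusing the example above: supp Q_1 within {(1,0), (0,1)}, supp R within {(0,2)}.
Q1, R = supp_extended_reduce({(2, 0)}, [{(1, 0), (0, 1)}], lambda k: (sum(k), k))
assert Q1 == {(1, 0), (0, 1)} and R == {(0, 2)}
```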

4.2 Relaxed Extended Reduction

Using the relaxed multiplication from Sect. 3, we are now in a position to replace the algorithm extended-reduce by a new algorithm, which directly computes \(Q_1, \ldots , Q_b, R\) using the Eq. (3). In order to do this, we still have to put it in a recursive form which is suitable for relaxed resolution.

Denoting by \(e_i\) the i-th canonical basis vector of \(\mathbbm {K} [x]^{b + 1}\), we first define an operator \(\varPhi : x_1^{\mathbbm {N}} \cdots x_n^{\mathbbm {N}} \rightarrow \mathbbm {K} [x]^{b + 1}\) by

$$\begin{aligned} \varPhi (x^k)= & {} \left\{ \begin{array}{ll} c_{B_i}^{- 1} x^{k - l_{B_i}} e_i &{} \text {if } k \in {\text {Fin}}\, (\{ l_{B_1}, \ldots , l_{B_b} \}) \text { and}\\ &{} i \text { is minimal with } l_{B_i} \preccurlyeq k\\ e_{b + 1} x^k &{} \text {otherwise} \end{array} \right. \end{aligned}$$

By linearity, this operator extends to \(\mathbbm {K} [x]\):

$$\begin{aligned} \varPhi (P)= & {} \sum _{i \in {\text {supp}}\, P} P_i \varPhi (x^i). \end{aligned}$$

In particular, \(\varPhi (c_A x^{l_A})\) yields the “leading term” of the extended reduction \(({Q_1, \ldots , Q_b}, R)\). We also denote by \(\hat{\varPhi }\) the corresponding operator from \(\mathbbm {K} [xz^{- \lambda }]\) to \(\mathbbm {K} [xz^{- \lambda }]^{b + 1}\) which sends \(\hat{P}\) to \(\widehat{\varPhi (P)}\).
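In code, \(\varPhi \) is a single pass over the terms of its argument; a sketch (ours) in the dictionary representation used earlier:

```python
from fractions import Fraction

def make_Phi(B, key):
    """Operator Phi sending a term c*x^k either to a quotient term for the
    minimal applicable B_i, or to a remainder term (slot b+1), extended to
    polynomials by linearity."""
    divides = lambda l, k: all(a <= b for a, b in zip(l, k))
    leads = [max(Bi, key=key) for Bi in B]
    def Phi(P):
        out = [dict() for _ in range(len(B) + 1)]
        for k, c in P.items():
            i = next((i for i, l in enumerate(leads) if divides(l, k)), None)
            if i is None:
                out[-1][k] = out[-1].get(k, 0) + c      # remainder component e_{b+1} x^k
            else:
                m = tuple(a - b for a, b in zip(k, leads[i]))
                out[i][m] = out[i].get(m, 0) + c / B[i][leads[i]]
        return tuple(out)
    return Phi

# Example: with B_1 = x - y as before, Phi(x^2) = (x, 0), the "leading term"
# of the extended reduction of x^2.
Phi = make_Phi([{(1, 0): Fraction(1), (0, 1): Fraction(-1)}], lambda k: (sum(k), k))
assert Phi({(2, 0): Fraction(1)}) == ({(1, 0): Fraction(1)}, {})
```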

Now let \(B_i^{*} = B_i - c_{B_i} x^{l_{B_i}}\) for each i. Then

$$\begin{aligned} (Q_i B_i)_k= & {} (Q_i B_i^{*})_k + (Q_i)_{k - l_{B_i}} c_{B_i} \end{aligned}$$

for each \(i \in \{ 1, \ldots , b \}\) and \(k \in \mathbbm {N}^n\). The equation

$$\begin{aligned} (Q_1 B_1 + \cdots + Q_b B_b + R)_k= & {} A_k \end{aligned}$$

can thus be rewritten as

$$\begin{aligned}&(Q_1)_{k - l_{B_1}} c_{B_1} + \cdots + (Q_b)_{k - l_{B_b}} c_{B_b}\\&\quad = (A - Q_1 B_1^{*} - \cdots - Q_b B_b^{*})_k . \end{aligned}$$

Using the operator \(\varPhi \), this equation can be rewritten in a more compact form as

$$\begin{aligned} (Q_1, \ldots , Q_b, R)= & {} \varPhi (A - Q_1 B_1^{*} - \cdots - Q_b B_b^{*}). \end{aligned}$$

The tagged counterpart

$$\begin{aligned} (\hat{Q}_1, \ldots , \hat{Q}_b, \hat{R})= & {} \hat{\varPhi } (\hat{A} - \hat{Q}_1 \hat{B}_1^{*} - \cdots - \hat{Q}_b \hat{B}_b^{*}) \end{aligned}$$

is recursive, whence the extended reduction can be computed using b multivariate relaxed multiplications \(\hat{Q}_1 \hat{B}_1^{*}, \ldots , \hat{Q}_b \hat{B}_b^{*}\). With \(\mathcal {A}, \mathcal {B}_i, \mathcal {Q}_i\) and \(\mathcal {R}\) as in the previous section, Theorem 3 therefore implies:

Theorem 4

We may compute the extended reduction of A with respect to B in time

$$\begin{aligned}&\mathcal {O} \left( \mathsf {SM} (| \mathcal {B}_1 +\mathcal {Q}_1 |) \log \delta (\mathcal {B}_1 +\mathcal {Q}_1) + \cdots + \mathsf {SM} (| \mathcal {B}_b +\mathcal {Q}_b |) \log \delta (\mathcal {B}_b +\mathcal {Q}_b) + | \mathcal {R} | \right) . \end{aligned}$$

Remark 4

Following Remark 1, we also notice that A, the \(Q_i\) and R may be replaced by vectors of polynomials in \(\mathbbm {K} [x]^m\) (regarded as polynomials with coefficients in \(\mathbbm {K}^m\)), when several polynomials need to be reduced simultaneously.