
1 Introduction

Suppose we seek to factor a multivariate polynomial \(a\in R=\mathbb {Z}[x_{1},\ldots ,x_{n}]\). Today many modern computer algebra systems, such as Maple, Magma and Singular, use Wang’s incremental design of multivariate Hensel lifting (MHL) to factor multivariate polynomials over integers. MHL was developed by Yun [15] and improved by Wang [13, 14].

To factor \(a(x_1,\dots ,x_n)\) the first step is to choose a main variable, say \(x_1\), then compute the content of a in \(x_1\) and remove it from a. If \(a=\sum _{i=0}^d a_i(x_2,\ldots ,x_n) x_1^i\), the content of a is \(\gcd (a_0,a_1,\ldots ,a_d)\), a polynomial in one fewer variable, which is factored recursively. Let us assume this has been done.

The second step identifies any repeated factors in a by doing a square-free factorization. See Chap. 8 of [2]. In this step one obtains the factorization \( a = b_1 b_2^2 b_3^3 \cdots b_k^k\) such that each factor \(b_i\) has no repeated factors and \(\gcd (b_i,b_j)=1\) for \(i\ne j\). Let us assume this has also been done. So let \(a=f_1 f_2 \dots f_r\) be the irreducible factorization of a over \(\mathbb {Z}\). Also, let \(\#f\) denote the number of terms of a polynomial f and \(\mathrm{Supp}(f)\) denote the support of f, i.e., the set of monomials in f.

MHL chooses an evaluation point \(\alpha = (\alpha _2,\alpha _3,\dots ,\alpha _n) \in \mathbb {Z}^{n-1}\) where the \(\alpha _i\)’s are preferably small, with many of them zero. Then \(a(x_1,\alpha )\) is factored over \(\mathbb {Z}\). The evaluation point \(\alpha \) must satisfy

  (i)

    \(L(\alpha ) \ne 0\) where L is the leading coefficient of a in \(x_1\),

  (ii)

    \(a(x_1,\alpha )\) must have no repeated factors in \(x_1\), and

  (iii)

    \(f_i(x_1,\alpha )\) must be irreducible over \(\mathbb {Q}\).

If any condition is not satisfied the algorithm must restart with a new evaluation point. Conditions (i) and (ii) may be imposed in advance of the next step. One way to ensure that condition (iii) is true with high probability is to pick a second evaluation point \(\beta = (\beta _2,\dots ,\beta _n) \in \mathbb {Z}^{n-1}\), factor \(a(x_1,\beta )\) over \(\mathbb {Z}\) and check that the two factorizations have the same degree pattern before proceeding.

For simplicity let us assume a is monic and suppose we have obtained the monic factors \(f_i(x_1,\alpha )\) in \(\mathbb {Z}[x_1]\). Next the algorithm picks a prime p which is big enough to cover the coefficients of a and the factors \(f_i\) of a.

The input to MHL is \(a,\alpha ,f_i(x_1,\alpha )\) and p such that \(a(x_1,\alpha )=\prod _{i=1}^{r}f_i(x_1,\alpha )\) where \(\mathrm {gcd}(f_i(x_1,\alpha ),f_j(x_1,\alpha ))=1\) in \(\mathbb {Z}_{p}[x_{1}]\) for \(i\ne j\). If the gcd condition is not satisfied, the algorithm chooses a new prime p until it is.

There are two main subroutines in the design of MHL. For details see Chap. 6 of [2]. The first one is the leading coefficient correction algorithm (LCC). The best known is Wang’s heuristic LCC [14], which works well in practice and is the one Maple currently uses. There are other approaches by Kaltofen [6] and most recently by Lee [9]. In our implementation we use Wang’s LCC.

In a typical application of Wang’s LCC, one first factors the leading coefficient of a, a polynomial in \(\mathbb {Z}[x_2,\dots ,x_n]\), by a recursive call and then one applies LCC before the \(j^{\mathrm {th}}\) step of MHL. Then the total cost of the factorization is given by the cost of LCC + the cost of factoring \(a(x_1,\alpha )\) over \(\mathbb {Z}\) + the cost of MHL. One can easily construct examples where LCC or factoring \(a(x_1,\alpha )\) dominates the cost. However this is not typical. Usually MHL dominates the cost.

The second main subroutine solves a multivariate polynomial diophantine problem (MDP). In MHL, for each j with \(2 \le j\le n\), Wang’s design of MHL must solve many instances of the MDP in \(\mathbb {Z}_p[x_1,\ldots ,x_{j-1}]\). Wang’s method for solving an MDP (see Algorithm 2) is recursive. Although Wang’s method performs significantly better than the previous algorithm that he developed with Rothschild in [14], it does not explicitly take sparsity into account. During computation, the ideal-adic representation of factors is dense when the evaluation points \(\alpha _2,\dots ,\alpha _n\) are non-zero. In practice, conditions (i) and (iii) of LCC may force many non-zero \(\alpha _j\)’s. This makes Wang’s approach exponential in n.

Zippel’s sparse interpolation [18] was the first probabilistic method aimed at taking sparsity into account. Based on sparse interpolation and multivariate Newton iteration, Zippel then introduced a sparse Hensel lifting (ZSHL) algorithm in [17, 19], which uses an MHL organization different from Wang’s.

Another approach to sparse Hensel lifting was proposed by Kaltofen (KSHL) in [6]. Kaltofen’s method is also based on Wang’s incremental design of MHL but it uses an LCC different from Wang’s and offers a distinct solution to the multivariate diophantine problem (MDP) that appears in Wang’s design of MHL.

At CASC 2016 the authors proposed a new practical sparse Hensel lifting algorithm (MTSHL) [11]. It is also based on Wang’s incremental design of MHL and LCC but offers a solution to the MDP different from those of Zippel and Kaltofen. To solve the MDP’s appearing in MHL, MTSHL exploits the fact that at each step of MHL the solutions to the MDP’s, which are just Taylor polynomial coefficients, are structurally related. At the \(j^{\mathrm{th}}\) step of MHL we are recovering \(x_j\) in the factors. Let f be one such factor in \(\mathbb {Z}_p[x_1,x_2,\dots ,x_j]\) and let \(f=\sum _{k=0}^l f_k (x_j-\alpha _j)^k\) be its Taylor representation. At this point we know only \(f_0\). But \(\mathrm{Supp}(f_k) \subseteq \mathrm{Supp}(f_{k-1})\) with high probability if \(\alpha _j\) is chosen randomly from \([0,p-1]\) and p is sufficiently large. MTSHL is built on this key observation.

In this paper we consider first the case where a has \(r > 2\) factors and second the case where the factors have large integer coefficients. When \(r>2\), the MDP problem is called a multiterm MDP problem and an approach to its solution is described in [2]. It reduces the multiterm MDP problem to \(r-1\) two-term MDP problems. Our previous implementation of MTSHL described in [11] also used this approach.

In Sect. 2 we define the MDP problem in the context of MHL. See Algorithms 1 and 2. In Sect. 3 we discuss main ideas for the solution to the MDP used by MTSHL and present it as Algorithm 3 to make our explanation precise. We call Algorithm 3 MTSHL-d (d stands for direct), since it differs from our previous version of MTSHL (Algorithm 4 in [11]) in how it solves MDP problems when \(r>2\). For \(r=2\) it is the same as Algorithm 4 in [11].

In Sect. 4 we discuss the case \(r>2\). We argue that the probabilistic sparse interpolation method used in the design of MTSHL allows us to reduce the time spent solving multiterm MDP’s by up to a factor of \(r-1\). Because our proposal also reduces the multiplication cost in the previous approach described in [2], the observed speedup is sometimes greater than \(r-1\).

In Sect. 5, we study the case where the integer coefficients of the factors are large. The current approach (see [2]) chooses a prime p and \(l>0\) such that \(p^l\) bounds any coefficient in the factors \(f_i\) of a. We show that the sparse MDP solver developed in [11] offers an improved option. Suppose one factor \(f\in \mathbb {Z}[x_1,\ldots ,x_n]\) has a p-adic representation \(f=\sum _{k=0}^l f_k p^k\). We show that in this case also \(\mathrm {Supp}(f_k)\subseteq \mathrm {Supp}(f_{k-1})\) with high probability if p is chosen randomly. Therefore we propose first to factor a in \(\mathbb {Z}_p[x_1,\ldots ,x_n]\) by doing all arithmetic mod p where p is a machine prime (e.g. 63 bits on a 64 bit computer), i.e. run the entire Hensel lifting modulo a machine prime. Then we lift the solution to \(\mathbb {Z}_{p^l}[x_1,\ldots ,x_n]\) by computing the \(f_k\), again by solving each MDP appearing in the lifting process using the sparse interpolation developed in the design of MTSHL. With this approach most of the computation is done modulo the machine prime p.

In Sect. 6 we present some timing data to compare our new approaches with previous approaches and end with some concluding remarks.

In the paper we assume the input polynomial a is monic in \(x_{1}\) so as not to complicate the presentation with LCC. We note that what we explain remains true for the non-monic case with slight modifications. Our implementation uses Wang’s LCC for the non-monic case.

2 The Multivariate Diophantine Problem (MDP)

The Multivariate Diophantine Problem (MDP) arises naturally as a subproblem of the incremental design of MHL developed by Wang. For completeness we provide the \(j^\mathrm{{th}}\) step of MHL as Algorithm 1 for the monic case and Wang’s solution to the MDP as Algorithm 2.

[Algorithm 1: the \(j^\mathrm{{th}}\) step of MHL, monic case]

The MDP appears at line 8 of Algorithm 1. Consider the case where the number of factors r to be computed is 2, i.e., \(r=2\). We discuss the case \(r>2\) in Sect. 4.

Let \(u,w,c\in \mathbb {Z}_{p}[x_{1},\ldots ,x_{j}]\) with u and w monic with respect to the variable \(x_{1}\) and let \(I_{j}=\left\langle x_{2}-\alpha _{2},\ldots ,x_{j}-\alpha _{j}\right\rangle \) be an ideal of \(\mathbb {Z}_{p}[x_{1},\ldots ,x_{j}]\) with \(\alpha _{i}\in \mathbb {Z}\). The MDP is to find multivariate polynomials \(\sigma ,\tau \in \mathbb {Z}_{p}[x_{1},\ldots ,x_{j}]\) that satisfy

$$\begin{aligned} \sigma u+\tau w=c\,\,\mathrm {\,mod}\,I_{j}^{d_{j}+1} \end{aligned}$$
(1)

with \(\mathrm {deg}_{x_{1}}(\sigma )<\mathrm {deg}_{x_{1}}(w)\) where \(d_{j}\) is the maximal degree of \(\sigma \) and \(\tau \) with respect to the variables \(x_{2},\ldots ,x_{j}\) and it is given that

$$ \mathrm {GCD}\left( u\,\mathrm {mod}\,I_{j},w\,\mathrm {mod}\,I_{j}\right) =1\,\mathrm {in}\ \mathbb {Z} _{p}[x_{1}]. $$

It can be shown that the solution \((\sigma ,\tau )\) exists and is unique and independent of the choice of the ideal \(I_{j}\). For \(j=1\) the MDP is in \(\mathbb {Z}_{p}[x_{1}]\) and can be solved with the extended Euclidean algorithm (see Chap. 2 of [2]).
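The base case can be sketched in a few lines. The following is a hypothetical Python sketch (not MTSHL's Maple/C code): it solves \(\sigma u+\tau w=c\) in \(\mathbb {Z}_{p}[x_{1}]\) with \(\deg \sigma < \deg w\) via the extended Euclidean algorithm, representing polynomials as coefficient lists (lowest degree first) and using a small prime for illustration.

```python
p = 101  # a small prime for illustration; in practice a machine prime is used

def trim(a):
    # reduce mod p and drop leading zeros (keep at least one coefficient)
    a = [x % p for x in a]
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def padd(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def psub(a, b):
    return padd(a, [-x for x in b])

def pmul(a, b):
    r = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] += ai * bj
    return trim(r)

def pdivmod(a, b):
    # quotient and remainder of a by b over Z_p
    a, b = trim(a), trim(b)
    q = [0] * max(1, len(a) - len(b) + 1)
    inv = pow(b[-1], p - 2, p)          # inverse of the leading coefficient
    r = a
    while r != [0] and len(r) >= len(b):
        k = len(r) - len(b)
        coef = r[-1] * inv % p
        q[k] = coef
        r = trim(psub(r, pmul([0] * k + [coef], b)))
    return trim(q), r

def mdp(u, w, c):
    # extended Euclid: s0*u + t0*w = r0 with r0 a nonzero constant (gcd = 1)
    r0, r1 = trim(u), trim(w)
    s0, s1, t0, t1 = [1], [0], [0], [1]
    while r1 != [0]:
        q, r = pdivmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, psub(s0, pmul(q, s1))
        t0, t1 = t1, psub(t0, pmul(q, t1))
    inv = pow(r0[-1], p - 2, p)
    s = [x * inv % p for x in s0]        # now s*u = 1 mod w
    # sigma = s*c mod w forces deg(sigma) < deg(w)
    _, sigma = pdivmod(pmul(s, c), w)
    # tau = (c - sigma*u)/w is an exact division
    tau, rem = pdivmod(psub(c, pmul(sigma, u)), w)
    assert rem == [0]
    return sigma, tau
```

Reducing \(\sigma \) modulo w enforces the degree constraint, and \(\tau \) is then recovered by an exact division; this mirrors the textbook solution in Chap. 2 of [2].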

To solve the MDP for \(j>1\), Wang uses the same approach as for Hensel Lifting, that is, an ideal-adic lifting approach. See Algorithm 2.

[Algorithm 2: Wang’s solution to the MDP]

In general, if \(\alpha _{j}\ne 0\) the Taylor series expansions of \(\sigma \) and \(\tau \) about \(x_j=\alpha _j\) are dense in \(x_j\), so the \(c_i \ne 0\). The number of calls to the Euclidean algorithm in Wang’s solution to the MDP is then exponential in n. It is this exponential behaviour that the design of MTSHL eliminates. On the other hand, if MHL can choose some \(\alpha _j\) to be 0, for example when the input polynomial \(a(x_1,\dots ,x_n)\) is monic in \(x_1\), then this exponential behaviour may not occur for sparse factors.

3 MTSHL’s Solution to the MDP via Sparse Interpolation

We consider whether we can interpolate \(x_{2},\ldots ,x_{j}\) in \(\sigma \) and \(\tau \) in (1) using sparse interpolation methods. If \(\beta \in \mathbb {Z}_{p}\) with \(\beta \ne \alpha _{j}\), then

$$ \sigma (x_{j}=\beta )u(x_{j}=\beta )+\tau (x_{j}=\beta )w(x_{j}=\beta )\,=\,c(x_{j}=\beta )~\mathrm {mod}~I_{j-1}^{d_{j-1}+1}. $$

For \(K_{j}\!=\!\left\langle x_{2}\!-\!\alpha _{2},\ldots ,x_{j-1}\!-\!\alpha _{j-1},x_{j}\!-\!\beta \right\rangle \) and \(G_{j}\!=\!\mathrm {GCD}(u\,\mathrm {mod}\,K_{j},w\,\mathrm {mod}\,K_{j})\), we obtain a unique solution \(\sigma (x_{j}=\beta )\) iff \(G_{j}=1.\) However \(G_{j}\ne 1\) is possible. Let \(R=\mathrm {res}_{x_{1}}(u,w)\) be the Sylvester resultant of u and w taken in \(x_{1}\). Since u and w are monic in \(x_{1}\) one has

$$ G_{j}\ne 1\Longleftrightarrow \mathrm {res}_{x_{1}}(u\,\mathrm {mod}\,K_{j},\,w\,\mathrm {mod}\,K_{j})=0\Longleftrightarrow R(\alpha _2,\dots ,\alpha _{j-1},\beta )=0. $$

Let \(r = R(\alpha _2,\dots ,\alpha _{j-1},x_j) \in {\mathbb {Z}_{p}}[x_j]\) so that \(R(\alpha _2,\dots ,\alpha _{j-1},\beta ) = r(\beta )\). Also \(\mathrm {deg}(R)\le \deg (u)\deg (w)\) [1]. Now if \(\beta \) is chosen at random from \({\mathbb {Z}_{p}}\) and \(\beta \ne \alpha _j\) then

$$ \mathrm {Pr}[G_{j}\ne 1] = \mathrm{Pr}[ r(\beta ) = 0 ] \le \frac{\deg (r,x_j)}{p-1} \le \frac{\deg (u)\deg (w)}{p-1}. $$

This bound for \(\mathrm{Pr}[G_{j}\ne 1]\) is a worst case bound. In [10] we show that the average probability is \(\mathrm{Pr}[G_{j}\ne 1]=1/(p-1)\). Thus if p is large, the probability that \(G_{j}=1\) is high. Interpolation is thus an option to solve the MDP.

As can be seen from line 10 of Algorithm 1, the solutions to the MDP are the Taylor coefficients of the factors to be computed at the \(j^\mathrm{{th}}\) step. As such, if \(\sigma _{0,i}\) is sparse then the \(\sigma _{k,i}\) are also sparse. In line 5 of Algorithm 1, as k increases, the number of terms of the \(\sigma _{k,i}\) decreases on average, even in dense cases. That is, on average \( \#\sigma _{k,i} < \#\sigma _{k-1,i}\). A natural idea then is to use sparse interpolation techniques to solve the MDP. However, the sparse technique proposed by Zippel [16] is also iterative; it recovers \(x_2\), then \(x_3\), etc. To take one more step in this direction consider the following lemma, whose proof can be found in [11].

Lemma 1

Let \(f\in \mathbb {Z}_{p}[x_{1},\ldots ,x_{n}]\) and let \(\alpha \) be a randomly chosen element in \(\mathbb {Z}_{p}\) and \(f=\sum _{i=0}^{d_{n}}b_{i}(x_{1},\ldots ,x_{n-1})(x_{n}-\alpha )^{i}\) where \(d_{n}=\mathrm {deg}_{x_{n}}f.\) Then

$$ \mathrm {Pr}[\mathrm{Supp}(b_{j+1})\nsubseteq \mathrm{Supp}(b_{j})]~\le ~|\mathrm{Supp}(b_{j+1})|\,\frac{d_{n}-j}{p-d_{n}+j+1}~\mathrm{for}~0 \le j < d_{n}. $$

Lemma 1 says that for the sparse case, if p is big enough then the probability that \(\mathrm{Supp}(b_{j+1})\subseteq \mathrm{Supp}(b_{j})\) is high. This observation suggests that during MHL we use \(\mathrm{Supp}(\sigma _{k-1,i})\) as a template for the solution \(\sigma _{k,i}\). That is, the solutions to the MDP’s are related; during MHL these problems should not be treated independently, as previous approaches do. In light of the key role this assumption plays at each MHL step \(j>1\), for each factor \(f_i\), we call the assumption \(\mathrm{Supp}(\sigma _{k,i}) \subseteq \mathrm{Supp}(\sigma _{k-1,i})\) for all \(k>0\) the strong SHL assumption.

Algorithms 3 and 4 below show how this assumption can be combined with the sparse interpolation idea of Zippel [16] to reduce the solution to the MDP problem to solving linear systems over \({\mathbb {Z}_{p}}\). To see how MTSHL works on a concrete example for \(r=2\) and how MTSHL decreases the evaluation cost that sparse interpolation brings see [11].
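To illustrate the reduction to linear algebra, here is a small hypothetical Python sketch: given an assumed support, the coefficients of a sparse polynomial over \(\mathbb {Z}_{p}\) are recovered from its values at powers \(\beta ^s\) of an evaluation point by solving a Vandermonde-like system. The support, point, coefficients and the dense Gaussian solve below are made up for illustration; MTSHL solves the Vandermonde systems with faster dedicated methods.

```python
p = 10007

# assumed support: exponent vectors of the monomials in (x2, x3)
support = [(0, 0), (1, 2), (3, 1)]
true_coeffs = [5, 17, 42]   # the unknowns we will recover

def mono(e, pt):
    # value of the monomial with exponent vector e at the point pt, mod p
    v = 1
    for ei, xi in zip(e, pt):
        v = v * pow(xi, ei, p) % p
    return v

def target(pt):
    # stands in for a univariate image of the unknown MDP solution
    return sum(c * mono(e, pt) for c, e in zip(true_coeffs, support)) % p

beta = (2, 3)               # random nonzero evaluation point in Z_p^2
t = len(support)
points = [tuple(pow(b, s, p) for b in beta) for s in range(1, t + 1)]
A = [[mono(e, pt) for e in support] for pt in points]   # Vandermonde-like rows
rhs = [target(pt) for pt in points]

def solve_mod_p(A, b):
    # dense Gaussian elimination over Z_p, for illustration only
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        piv = next(r for r in range(i, n) if M[r][i] % p)
        M[i], M[piv] = M[piv], M[i]
        inv = pow(M[i][i], p - 2, p)
        M[i] = [x * inv % p for x in M[i]]
        for r in range(n):
            if r != i and M[r][i] % p:
                f = M[r][i]
                M[r] = [(x - f * y) % p for x, y in zip(M[r], M[i])]
    return [row[n] for row in M]

coeffs = solve_mod_p(A, rhs)
```

Because every monomial evaluated at \(\beta ^s\) equals its value at \(\beta \) raised to the power s, the system is a transposed Vandermonde system, nonsingular whenever the monomial values at \(\beta \) are distinct and nonzero.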

We present the \(j^\mathrm{{th}}\) step of the new version of MTSHL in Algorithm 4 below and call it MTSHL-d, short for MTSHL direct. For \(r=2\) MTSHL-d is equivalent to MTSHL as described in [11]. In the following section we discuss the case \(r>2\) and make it clear why we call it MTSHL direct.

[Algorithm 3: MTSHL’s sparse MDP solver]
[Algorithm 4: the \(j^\mathrm{{th}}\) step of MTSHL-d]

4 The Multiterm Diophantine Problem

Let the input polynomial \(a(x_1,\dots ,x_n)\) be square-free with total degree d and let the irreducible factorization of a over \(\mathbb {Z}\) be

$$ a=f_{1}\cdots f_{r}\in \mathbb {Z}[x_{1},\ldots ,x_{n}]. $$

We consider the case \(r>2\). We start with the unique factorization \(a_{1}(x_{1})=a(x_{1},\alpha )=u_{1}(x_{1})\cdots u_{r}(x_{1})\in \mathbb {Z}[x_{1}]\). By Hilbert’s irreducibility theorem [7], with high probability \(u_{i}(x_{1})=f_{i}(x_{1},\alpha )\). Next we choose a prime p which is big enough to cover the coefficients occurring in each \(f_{i}\) and then reduce mod p to obtain

$$ a(x_{1},\alpha )=u_{1}(x_{1})\cdots u_{r}(x_{1})\in {\mathbb {Z}_{p}}[x_{1}]. $$

We need \(\gcd (u_{i},u_{j})=1\in {\mathbb {Z}_{p}}[x_{1}]\) for all \(1 \le i < j \le r\). Otherwise we choose a different prime and repeat the process.

Suppose \(f_{j,i} = \sum _{k=0}^{d_i} \sigma _{k,i} (x_j-\alpha _j)^k\) where \(d_i = \deg _{x_j}(f_{j,i})\). So \(\sigma _{k,i}\) is the \(k^\mathrm{{th}}\) Taylor coefficient of the \(i^{\mathrm{th}}\) factor to be computed in the \(j^{\mathrm{th}}\) step of MHL. (See line 10 of Algorithm 1.) During the \(j^{\mathrm{th}}\) step of MHL, for each iteration \(k>0\), the algorithm computes the \(\sigma _{k,i}\) by solving the multiterm Diophantine problem (multi-MDP), which is a natural generalization of the MDP defined in Sect. 2 and denoted \(\mathrm {MDP}_{j,k}\) in line 8 of Algorithm 1. It has the form

$$ \mathrm {MDP}_{j,k}:\,\,\sigma _{k,1}b_{1}+\cdots +\sigma _{k,r}b_{r}=c_{k}, $$

where \(b_{i}=\prod _{m=1,m\ne i}^{r}f_{j-1,m}(x_{1},\ldots ,x_{j-1})\). So, given the \(b_{i}\) and \(c_{k}\) in \({\mathbb {Z}_{p}}[x_{1},\ldots ,x_{j-1}]\), the goal is to find \(\sigma _{k,i}\) for each i.

The current approach to solve a multiterm MDP is to reduce it into \(r-1\) two term MDP’s. We describe the idea with an example. Let \(r=4\) and to save some space let \(u_{i}=f_{j-1,i}\). Then

$$\begin{aligned} c_k= & {} \sigma _{k,1}b_{1}+\sigma _{k,2}b_{2}+\sigma _{k,3}b_{3}+\sigma _{k,4}b_{4} \\= & {} \sigma _{k,1}u_{2}u_{3}u_{4}+\sigma _{k,2}u_{1}u_{3}u_{4}+\sigma _{k,3}u_{1}u_{2}u_{4}+\sigma _{k,4}u_{1}u_{2}u_{3} \\= & {} \sigma _{k,1}u_{2}u_{3}u_{4}+u_{1}(\sigma _{k,2}u_{3}u_{4}+u_{2}(\sigma _{k,3}u_{4}+\sigma _{k,4}u_{3})). \end{aligned}$$

We first solve the MDP \(\sigma _{k,1}u_{2}u_{3}u_{4}+u_{1}w_{1}=c_{k}\) for \(\sigma _{k,1}\) and \(w_{1}\). Then we solve \(\sigma _{k,2}u_{3}u_{4}+u_{2}w_{2}=w_{1}\) for \(\sigma _{k,2}\) and \(w_{2}\). Finally we solve \(\sigma _{k,3}u_{4}+\sigma _{k,4}u_{3}=w_{2}\) to compute \(\sigma _{k,3}\) and \(\sigma _{k,4}\). We call this the iterative approach to solving the multiterm MDP.
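The nesting above can be illustrated with a toy integer analogue (hypothetical, for illustration only): with pairwise coprime integer moduli \(u_1,\dots ,u_r\) in place of the polynomials, each two-term step is a modular inverse computation followed by an exact division.

```python
# Integer analogue of the iterative reduction: given pairwise coprime
# u1,...,ur and c, find sigma_i with sigma_1*b_1 + ... + sigma_r*b_r = c,
# where b_i is the product of the other u's.  The integer setting is
# illustrative only; MHL performs the same nesting with polynomials
# in Z_p[x1,...,x_{j-1}].

def two_term(u, b, c):
    # solve sigma*b + u*w = c with 0 <= sigma < u (b invertible mod u)
    sigma = c * pow(b, -1, u) % u
    w = (c - sigma * b) // u        # exact division by construction
    return sigma, w

def multiterm(us, c):
    sigmas, rest = [], c
    for i in range(len(us) - 1):
        b = 1
        for v in us[i + 1:]:
            b *= v                  # product of the remaining moduli
        s, rest = two_term(us[i], b, rest)
        sigmas.append(s)
    sigmas.append(rest)             # the final remainder is the last sigma
    return sigmas
```

Unwinding the loop for \(r=4\) reproduces exactly the nested form \(c = \sigma _{1}u_{2}u_{3}u_{4}+u_{1}(\sigma _{2}u_{3}u_{4}+u_{2}(\sigma _{3}u_{4}+u_{3}\sigma _{4}))\).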

Note that Wang’s approach to solve the MDP is recursive. So when \(r>2\), the iterative approach to solve multiterm MDP makes Wang’s design highly recursive. Also, if the polynomials \(u_{i}\) have many terms then the \(b_{i}\)’s will be large and expensive to compute. If we use the probabilistic sparse MDP solver of MTSHL as described in [11] for each of these MDP’s, then we will first compute the \(b_{i}\)’s and then evaluate \(b_{i}\)’s at random points. But evaluation is one of the most costly operations in sparse interpolation and this cost increases as the size of the polynomial to be evaluated increases.

However, the probabilistic non-recursive sparse interpolation idea used to solve the MDP’s in MHL renders another simple and efficient option. One can invoke the sparse MDP solver to compute the \(\sigma _{k,i}\)’s simultaneously without reducing \(\mathrm {MDP}_{j,k}\) to \(r-1\) two term MDP’s in the following way.

According to Lemma 1, if \(\alpha _{j}\) is random and p is big, then for each factor \(f_{j,i}\), with probability \({\ge }1-|{\mathrm{Supp}}(\sigma _{k,i})|\frac{d_{i}-k+1}{p-d_{i}+k}\) one has \({\mathrm{Supp}}(\sigma _{k,i})\subseteq {\mathrm{Supp}}(\sigma _{k-1,i})\) for \(k=1,\ldots ,d_{i}\), where \(\sigma _{0,i}\) is defined as \(\sigma _{0,i}:=f_{j-1,i}\) and \(d_i=\deg _{x_j}(f_{j,i})\). Therefore to solve \(\mathrm {MDP}_{j,k}\) we use \({\mathrm{Supp}}(\sigma _{k-1,i})\) as a skeleton of the solution \(\sigma _{k,i}\). That is, if \(\sigma _{k-1,i}=\sum _{l}m_{ilk}M_{ilk}\) for \(m_{ilk}\in {\mathbb {Z}_{p}}-\{0\}\) with distinct monomials \(M_{ilk}\in {\mathbb {Z}_{p}}[x_{1},\ldots ,x_{j-1}]\), then we construct \(\bar{\sigma }_{k,i}=\sum _{l}c_{ilk}M_{ilk}\) as a solution form (skeleton) of \(\sigma _{k,i}\), where the \(c_{ilk}\) are to be computed.

At the \(k^{\mathrm{th}}\) iteration suppose that we need \(t_i\) evaluations to recover the coefficients \(c_{ilk}\) (see line 17 of Algorithm 3). Let \(\beta =(\beta _{2},\ldots ,\beta _{j-1})\) where \(\beta _{i}\in {\mathbb {Z}_{p}}-\{0\}\) be a random evaluation point. Consider the \(t_i\) consecutive univariate multiterm MDP’s

$$\begin{aligned} \tilde{\sigma }_{k,1}b_{1}(x_{1},\beta ^{s})+\cdots +\tilde{\sigma }_{k,r}b_{r}(x_{1},\beta ^{s})= & {} c_{k}(x_{1},\beta ^{s}) ~\mathrm{for}~~ 1\le s \le t_i, \end{aligned}$$
(2)

where the \(\tilde{\sigma }_{k,i}\) are to be computed. By uniqueness of the solutions to the multiterm MDP, one has \(\tilde{\sigma }_{k,i}=\sigma _{k,i}(x_{1},\beta ^{s})\) except with average probability \(\left( {\begin{array}{c}r\\ 2\end{array}}\right) \frac{1}{p}\).

Equation (2) can be solved efficiently for the \(\tilde{\sigma }_{k,i}\) using the iterative approach in the univariate domain \({\mathbb {Z}_{p}}[x_{1}]\). Next the univariate images \(\tilde{\sigma }_{k,i}\) are used to compute the coefficients \(c_{ilk}\) of \(\bar{\sigma }_{k,i}\) by solving Vandermonde systems which are constructed by equating the coefficients of \(\bar{\sigma }_{k,i}(x_{1},\beta ^{s})\) and \(\tilde{\sigma }_{k,i}\) (see line 23 of Algorithm 3). Again, if the strong SHL assumption is true, then by following Zippel’s analysis in [16], one can show that with probability \({\ge }1-\frac{(\#f_{i})^2}{2(p-1)}\), the Vandermonde systems have a unique solution.

At this stage we have candidate solutions \(\bar{\sigma }_{k,i}\) for the actual solutions \(\sigma _{k,i}\) of \(\mathrm {MDP}_{j,k}\). Because our assumption \(\mathrm{Supp}(\sigma _{k,i}) \subseteq \mathrm{Supp}(\sigma _{k-1,i})\) may be false, we need to verify if \(\bar{\sigma }_{k,i} = \sigma _{k,i}\). We do this using a random evaluation in line 27 of Algorithm 3.

What does this approach bring us? First, MTSHL-d essentially follows MTSHL but eliminates an iteration at the cost of an increase in the probability of failure. However this probability is negligible if p is big enough. In our implementation we used a 31-bit prime and MTSHL-d never failed. Since the iterative approach solves \(r-1\) two-term MDP’s, we expect MTSHL-d to solve multi-MDP’s faster than MTSHL by a factor of \(\mathcal {O}(r)\). This is verified by the experimental data in Table 1 of Sect. 6.

Second, \(b_{k}(x_{1},\beta ^{s})=\prod _{i=1,i\ne k}^{r}f_{i}(x_{1},\beta ^{s})\), so we don’t need to compute \(b_{k}\in {\mathbb {Z}_{p}}[x_{1},\ldots ,x_{j-1}]\). All we need to do is compute the univariate images \(f_{i}(x_{1},\beta ^{s})\) and multiply them to obtain \(b_{k}(x_{1},\beta ^{s})\).

Finally in MTSHL-d, like MTSHL, we may evaluate down to \(\mathbb {Z}[x_1,x_2]\) instead of \(\mathbb {Z}[x_1]\) to decrease the number of evaluations \(t_i\) needed and the size of the Vandermonde systems (line 17 of Algorithm 3). To do this MTSHL-d uses a multi-bivariate Diophantine solver (multi-BDP), which we implemented in C. It solves the bivariate multi-MDP by the iterative approach and uses evaluation and interpolation on \(x_2\) to reduce to the univariate case.

5 The Case Modulo \({p}^{l}\) with \(l>1\)

When the integer coefficients of a or the factors of a to be computed are huge the current strategy implemented by most of the computer algebra platforms, including Maple, Singular [9] and Magma [12], is the following. For details see [2]. First we pick a prime p and a natural number \(l>0\) such that the ring \(\mathbb {Z}_{p^{l}}\) can be identified with the ring \(\mathbb {Z}\). That is, we find a bound B such that the integer coefficients of the polynomial a to be factored and its irreducible factors are bounded by B. One way to choose such an upper bound B is given by [4]. Then we choose l such that \(p^{l}>2B\). Next the MDP solution in \(\mathbb {Z}_{p}[x_{1}]\) is lifted to the solution in \(\mathbb {Z}_{p^{l}}[x_{1}]\). The second step is to lift the solution from \(\mathbb {Z}_{p^{l}}[x_{1}]\) to \(\mathbb {Z}_{p^{l}}[x_{1},\ldots ,x_{n}]\). Note that in the second step all arithmetic is in \(\mathbb {Z}_{p^{l}}\) with \(p^l>2B\). In this section we question whether this strategy is the best approach for the case \(l>1\).

Suppose for example that the coefficients of the factors are bounded by \(p^{10}\). Before the factorization we do not have this information. Since most likely the computed coefficient bound \(B>p^{20}\), throughout MHL all integer arithmetic is modulo \(p^{20}\), which is expensive.

MTSHL’s sparse multivariate diophantine solver allows us to propose an approach that eliminates most of the multi-precision arithmetic and allows us to lift up to the size of the actual coefficients in the factors, thus avoiding B.

  • First choose a random \((m+1)\)-bit machine prime p, i.e. \(2^m<p<2^{m+1}\), and compute the factorization of a by lifting the factorization in \(\mathbb {Z}_{p}[x_{1}]\) to \(\mathbb {Z}_{p}[x_{1},\ldots ,x_{n}]\) with MTSHL-d. Most of this work is mod p.

  • Next compute a lifting bound B. One may use Lemma 14 of [4] for this purpose. Now pick the smallest l such that \(p^l>2B\).

  • Then as a second stage do a p-adic lift of the factorization from \(\mathbb {Z}_{p}[x_{1},\ldots ,x_{n}]\), stopping when the factors are recovered or we exceed \(p^l\). The p-adic lift is presented as Algorithm 5. It reduces to solving MDP’s in \(\mathbb {Z}_p[x_1,\dots ,x_n]\).

[Algorithm 5: p-adic lifting of the factorization]

To make the following explanation easier we assume \(r=2\) and suppose that \(a=uw\) where \(a,u,w\in \mathbb {Z}[x_{1},\ldots ,x_{n}]\) and uw are unknown to us. As a first step we choose an evaluation ideal \(I=\left\langle x_{2}-\alpha _{2},\ldots ,x_{n}-\alpha _{n}\right\rangle \) with randomly chosen \(\alpha _{i}\) from \([0,p-1]\) such that conditions (i) and (ii) for MHL are satisfied with \(l=1\). Then there is a factorization \(a=u^{(n)}w^{(n)}\in \mathbb {Z}_{p}[x_{1},\ldots ,x_{n}]\). This factorization is computed using MTSHL-d.

Now suppose that u (similarly w) has the form

$$ u = \sum _{j=1}^t c_j M_j(x_1,\dots ,x_n) = \sum _{j=1}^t \sum _{i=0}^{l-1} s_{ji} p^i M_j(x_1,\dots ,x_n), $$

where the \(M_{j}\) are distinct monomials and \(0\ne c_{j}\in \mathbb {Z}\) with \(c_{j}=\sum _{i=0}^{l-1}s_{ji}p^{i}\) where \(-p/2<s_{ji}<p/2\). Then we have

$$ u = \sum _{i=0}^{l-1} \left( \sum _{j=1}^t s_{ji} M_j(x_1,\dots ,x_n) \right) p^i = \sum _{i=0}^{l-1} u_i p^i. $$

It follows that

$$ \frac{u-\sum _{i=0}^{k-1}u_{i}p^{i}}{p^{k}}= \sum _{j=1}^t \left( \sum _{i=k}^{l-1} s_{ji} p^{i-k} \right) M_{j}(x_1,\dots ,x_n). $$
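A small hypothetical Python sketch of this layered representation (the polynomial and prime below are made up for illustration): the symmetric p-adic digits of each integer coefficient determine the layers \(u_i\), whose supports can then be compared directly.

```python
p = 101  # illustrative; the implementation uses a 31-bit machine prime

def symmetric_digits(c, l):
    # digits s_0,...,s_{l-1} with c = sum s_i*p^i and -p/2 < s_i < p/2
    digits = []
    for _ in range(l):
        s = c % p
        if s > p // 2:
            s -= p
        digits.append(s)
        c = (c - s) // p            # exact: c is congruent to s mod p
    return digits

# made-up u = 12345*x^3*y + 678*x*y^2 + 9, keyed by exponent vectors
u = {(3, 1): 12345, (1, 2): 678, (0, 0): 9}
l = 3

# layer i keeps the monomials whose i-th p-adic digit is nonzero
layers = []
for i in range(l):
    layers.append({m: symmetric_digits(c, l)[i] for m, c in u.items()
                   if symmetric_digits(c, l)[i] != 0})
```

For this example the supports of the layers shrink, \(\mathrm {Supp}\{u_{2}\}\subseteq \mathrm {Supp}\{u_{1}\}\subseteq \mathrm {Supp}\{u_{0}\}\), illustrating the containment argued for random p in the text.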

Also, we have \(u_{0}=u\mod p\ne 0\) since in the first stage u is lifted from \(u_{0}\). Now we make a key observation: If p is chosen at random such that \(2^{m}< p < 2^{m+1}\), the probability that \(p\mid c_{i}\) is \(\Pr [p\mid c_{i}]=\frac{\#\mathrm {distinct\,(m+1)\,bit\,prime\,divisors\,of\,}c_{i}}{\#\mathrm {(m+1)\,bit\,primes}}\). Let \(\pi (s)\) be the number of primes \(\le s\). Since there are at most \(\lfloor \log _{2^{m}}(c_{i})\rfloor \) many \((m+1)\)-bit primes dividing \(c_{i}\) we have

$$ \Pr [p\mid c_{i}]\le \frac{\lfloor \log _{2^{m}}(c_{i})\rfloor }{\pi (2^{m+1})-\pi (2^{m})}\le \frac{l}{\pi (2^{m+1})-\pi (2^{m})} $$

This probability is very small because according to the prime number theorem \(\pi (s)\sim s/\log (s)\) and hence \(\pi (2^{m+1})-\pi (2^{m})\sim \frac{2^{m}}{m\log (2)}\).
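As a quick numeric sanity check of this density estimate, one can count the primes in \((2^m,2^{m+1})\) with a plain sieve for a small m (the choice \(m=16\) below is purely for illustration; the paper uses \(m=30\)):

```python
import math

# compare pi(2^{m+1}) - pi(2^m) against the estimate 2^m/(m*log 2)
m = 16
N = 2 ** (m + 1)

# sieve of Eratosthenes up to N
is_prime = bytearray([1]) * (N + 1)
is_prime[0] = is_prime[1] = 0
for i in range(2, int(N ** 0.5) + 1):
    if is_prime[i]:
        is_prime[i * i :: i] = bytearray(len(range(i * i, N + 1, i)))

count = sum(is_prime[2 ** m + 1 :])          # primes in (2^m, 2^{m+1}]
estimate = 2 ** m / (m * math.log(2))
```

For \(m=16\) the sieve count agrees with the estimate \(2^{m}/(m\log 2)\) to within a few percent.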

It has been shown in [8] that the exact number of 31-bit primes (\(m=30\)) is 50697537. Therefore in our implementation the support of \(u_{0}\) will contain all monomials \(M_{i}\) and \(\mathrm {Supp}\{u_{j}\}\subseteq \mathrm {Supp}\{u_{0}\}\) with probability \({>}1-\frac{t\,l}{5\cdot 10^{7}}\).

We make one more key observation and claim that \(\mathrm {Supp}\{u_{j}\}\subseteq \mathrm {Supp}\{u_{j-1}\}\) for \(1\le j\le l\) with high probability: We have

$$\begin{aligned} u_{j}= & {} s_{1j}M_{1}+s_{2j}M_{2}+\cdots +s_{tj}M_{t},\\ u_{j+1}= & {} s_{1,j+1}M_{1}+s_{2,j+1}M_{2}+\cdots +s_{t,j+1}M_{t}. \end{aligned}$$

For a given \(j>0\), if \(s_{i,j+1}\ne 0\), but \(s_{ij}=0\) then \(M_{i}\in \mathrm {Supp}(u_{j+1})\) but \(M_{i}\notin \mathrm {Supp}(u_{j})\). We consider \(\Pr [s_{ij}=0\,\mathrm {|}\,s_{i,j+1}\ne 0].\) If A is the event that \(s_{ij}=0\) and B is the event that \(s_{i,j+1}=0\) then

$$ \Pr [A\,|\,B^{c}]=\frac{\Pr [A]-\Pr [B]\Pr [A\,|\,B]}{\Pr [B^{c}]}\le \frac{\Pr [A]}{\Pr [B^{c}]}. $$

It follows that

$$ \frac{\Pr [A]}{\Pr [B^{c}]}\le \frac{l/(\pi (2^{m+1})-\pi (2^{m}))}{1-l/(\pi (2^{m+1})-\pi (2^{m}))}=\frac{l}{(\pi (2^{m+1})-\pi (2^{m}))-l}. $$

Hence,

$$ \Pr [\mathrm {Supp}\{u_{j}\}\subseteq \mathrm {Supp}\{u_{j-1}\}\,\,\mathrm {for\,all}\,1\le j\le l]>1-\frac{t\,l}{(\pi (2^{m+1})-\pi (2^{m}))-l}. $$

As an example for \(m=30,l=5,t=500\), this probability is \({>}0.99993\).

Hardy and Ramanujan [5] proved that for almost all integers s, the number of distinct primes dividing s is \(\omega (s)\approx \log \log (s)\). This theorem was generalized by Erdős and Kac, who showed that \(\omega (s)\) is essentially normally distributed [3]. By this approximation note that

$$ \frac{\mathrm {Pr}[A]}{\mathrm {Pr}[B^{c}]}\!\le \!\frac{\log \log (s_{ij})/(\pi (2^{m+1})-\pi (2^{m}))}{1\!-\!\log \log (s_{i,j+1})/(\pi (2^{m+1})\!-\!\pi (2^{m}))}\!=\!\frac{\log (l\log p)}{(\pi (2^{m+1})\!-\!\pi (2^{m}))\!-\!\log (l\log p)}. $$

Hence the probability that \(\mathrm {Supp}\{u_{j}\}\subseteq \mathrm {Supp}\{u_{j-1}\}\) is \({\gtrsim }1-t\frac{m\log (lm)}{2^{m}-m\log (lm)}\). As an example for \(m=30,l=5,t=500\), this probability is \({>}0.99995\).

What does this mean in the context of multivariate factorization over \(\mathbb {Z}_{p^{l}}\) for \(l>1\)? It means that, with high probability, the supports of the solutions to the multivariate diophantine problems occurring in the lifting process are subsets of the supports of the solutions at the previous step. These solutions can then be computed simply by solving Vandermonde systems using a machine prime p, and hence with efficient arithmetic, via the sparse MDP solver described in Algorithm 3.

We sum up the observations made in this section in Theorem 1 below.

Theorem 1

Let p be a randomly chosen \((m+1)\)-bit prime, i.e. \(2^m<p<2^{m+1}\). With the notation introduced in this section,

$$ \Pr (\mathrm {Supp}\{u_{j}\}\subseteq \mathrm {Supp}\{u_{j-1}\}\,\,\mathrm {for\,all}\,1\le j\le l)>1-\frac{t\,l}{(\pi (2^{m+1})-\pi (2^{m}))-l}. $$

This probability can be approximated by

$$ \Pr [\mathrm {Supp}\{u_{j}\}\subseteq \mathrm {Supp}\{u_{j-1}\}\,\,\mathrm {for\,all}\,1\le j\le l]\gtrsim 1-\frac{t\,m\log (lm)}{2^{m}-m\log (lm)}. $$

6 Timing Data

In this section we give some experimental data to verify the effectiveness of the methods described in Sects. 4 and 5. In the tables that follow all timings are in CPU seconds and were obtained on an Intel Core i5-4670 CPU running at 3.40 GHz with 16 GB of RAM. For all Maple timings, we set kernelopts(numcpus=1); to restrict Maple to one core, as otherwise it will do polynomial multiplications and divisions in parallel.

6.1 Iterative vs Direct

In this section we give some data in Table 1 to compare MTSHL-d with the current approach, i.e. implementing MTSHL so that it solves multi-MDP’s using the iterative approach explained in Sect. 4. We also include timings for Wang’s algorithm, which also uses the iterative approach.

We generated r random polynomials in n variables of total degree d with T terms and coefficients from [1, 99] using Maple’s randpoly command and multiplied them. Then we factored these products using (i) Wang’s algorithm, (ii) MTSHL and (iii) MTSHL-d (our new method explained in Sect. 4). All implementations are in Maple. tX(tY) means that the algorithm factored the polynomial in tX CPU seconds and spent tY CPU seconds solving multiterm MDP’s. OOM stands for out of memory. As can be seen from the data, MTSHL is significantly faster than Wang’s algorithm and the MDP time in MTSHL-d is less than the MDP time in MTSHL by a factor of \(r-1\) or more.

Table 1. Timings for Wang, MTSHL vs MTSHL-d with \(r>2\).

6.2 The \(\mathbf {p}^L\) Case

In this section we give some data in Table 2 to compare the current approach, i.e. implementing MTSHL so that it computes a bound \(l_B\) and factors using \(\mathbb {Z}_{p^{l_B}}\) arithmetic throughout, with the approach explained in Sect. 5, which stays in \(\mathbb {Z}_p\) arithmetic and does a p-adic lift at the last step.

We generated 2 random polynomials in n variables of total degree d with T terms and coefficients in \([0,p^l)\) for \(p=2^{31}-1\). We multiplied the two factors over \(\mathbb {Z}\) and then factored the product with MTSHL. Since MTSHL does not know the actual value of l, it computes the coefficient bound \(l_B\) (using Lemma 14 of [4]) and stays in \(\mathbb {Z}_{p^{l_B}}\) arithmetic. It factored the polynomial in tX(tY) seconds where tY denotes the time spent solving MDP’s. Then we factored the polynomial with MTSHL-d, which uses p-adic lifting to recover the integer coefficients as explained in Sect. 5. The timings in column MTSHL-d (MDP) (Lift) are the total time, the time spent on MDP’s and the time spent doing the l lifts. The data in Table 2 shows that doing a p-adic lift is much faster than the previous approach.

Table 2. Timings for MTSHL vs MTSHL-d for large integer coefficients.

7 Conclusion

We have shown that when the number of factors to be computed is more than two, and when the coefficients of the factors are huge, sparse interpolation techniques can be used to speed up multivariate polynomial factorization. The second author has integrated our code into Maple under a MITACS internship with Dr. Jürgen Gerhard of Maplesoft. The new code will become the default factorization algorithm used by Maple’s factor command for multivariate polynomials with integer coefficients. The old code will still be accessible as an option.