
1 Introduction

The RSA cryptosystem is the most widely used public-key cryptosystem in practice, and its security is closely related to the hardness of the integer factorization problem: if factoring is solved, then RSA is broken. It is conjectured that factoring cannot be solved in polynomial time without quantum computers.

At Eurocrypt '85, Rivest and Shamir [20] first studied the factoring with known bits problem. They showed that \(N=pq\) (where p and q have the same bit size) can be factored given a \(\frac{2}{3}\)-fraction of the bits of p. In 1996, Coppersmith [2] improved the bound of [20] to \(\frac{1}{2}\). Note that in the above results the unknown bits lie within one consecutive block. The case of n blocks was later considered in [7, 15].

Motivated by the cold boot attack [4], at Crypto '09, Heninger and Shacham [6] considered the case where the known bits are uniformly spread over the factors p and q; they presented a polynomial-time attack that works whenever a 0.59-fraction of the bits of p and q is given. As a follow-up work, Henecka et al. [5] focused on an attack scenario that allows for error correction of the secret factors, which they called the noisy factoring problem. Later, Kunihiro et al. [12] discussed secret key recovery from noisy secret key sequences with both errors and erasures. Recently, Kunihiro and Honda [11] discussed how to recover RSA secret keys from noisy analog data.

1.1 Implicit Factorization Problem (IFP)

The above works require explicit knowledge of bits of the secret factors. In PKC '09, May and Ritzenhofen [18] introduced a new factoring problem with implicit information, called the Implicit Factorization Problem (IFP). Let \(N_{1}=p_{1}q_{1},\ldots ,N_{k}=p_{k}q_{k}\) be n-bit RSA moduli, where \(q_{1},\ldots ,q_{k}\) are \(\alpha n\)-bit primes with \(\alpha \in (0,1)\): given the implicit information that \(p_{1},\ldots ,p_{k}\) share certain portions of their bit patterns, under what condition is it possible to factor \(N_{1},\ldots ,N_{k}\) efficiently? This problem has applications to the malicious generation of RSA moduli, i.e., the construction of backdoored RSA moduli. Besides, it also helps to better understand the complexity of the underlying factorization problem.

Since then, there have been many cryptanalytic results for this problem [3, 14, 18, 19, 21–23]. Sarkar and Maitra [22] developed a new approach: they used the idea of [10], originally devised for the approximate common divisor problem (ACDP), to solve the IFP, and managed to improve the previous bounds significantly.

We now give a brief review of their method. Suppose that the primes \(p_1,\ldots ,p_k\) share a certain number of most significant bits (MSBs). First, they note that

$$ \gcd (N_1,N_2+(p_1-p_2)q_2,\ldots ,N_k+(p_1-p_k)q_k)=p_1 $$

Then they try to solve the simultaneous modular univariate linear equations

$$\begin{aligned} \left\{ \begin{array}{ccc} N_{2} + u_{2}\equiv 0\quad \text {mod }p_1\\ \vdots \\ N_{k} + u_{k}\equiv 0\quad \text {mod }p_1\\ \end{array} \right. \end{aligned}$$
(1)

for some unknown divisor \(p_1\) of the known modulus \(N_1\). Note that if the root \((u^{(0)}_2,\ldots ,u^{(0)}_k)=\left( (p_1-p_2)q_2,\ldots ,(p_1-p_k)q_k\right) \) is small enough, we can extract it efficiently. In [22], Sarkar and Maitra proposed an algorithm to find the small root of Eq. (1). Recently, Lu et al. [14] performed a more effective analysis by making use of Cohn and Heninger’s algorithm [1].
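To make the relation concrete, the following toy Python check (a sketch with hypothetical, artificially small parameters; sympy is used only to generate the example primes) verifies that \(\gcd (N_1,N_2+(p_1-p_2)q_2)=p_1\):

    from math import gcd
    from sympy import nextprime, isprime

    # Toy instance: p_1 and p_2 agree on their most significant bits.
    q1, q2 = nextprime(2**16), nextprime(2**16 + 500)
    p1 = nextprime(2**48)
    p2 = p1 + 12345                  # small difference, so the MSBs of p_1 are preserved
    while not isprime(p2):
        p2 += 1
    N1, N2 = p1 * q1, p2 * q2

    # N_2 + (p_1 - p_2) q_2 = p_1 q_2, hence the gcd with N_1 = p_1 q_1 is p_1.
    assert gcd(N1, N2 + (p1 - p2) * q2) == p1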

1.2 Our Contributions

In this paper, we present a new algorithm to obtain better bounds for solving the IFP. As far as we are aware, our attack is the best among all known attacks.

Table 1. Comparison of our generalized bounds against previous bounds

Technically, our algorithm also finds a small root of Eq. (1). Concretely, our improvement is based on the observation that for \(2\le i \le k\), \(u^{(0)}_i\) contains a large prime factor \(q_i\), which is already determined by \(N_i\).

Therefore, we separate \(u_i\) into two unknown variables \(x_i\) and \(y_i\), i.e., \(u_i=x_iy_i\). Consider the following equations

$$ \left\{ \begin{array}{ccc} N_{2} + x_{2}y_2\equiv 0 \quad \text {mod }p_1\\ \vdots \\ N_{k} + x_{k}y_k\equiv 0 \quad \text {mod }p_1\\ \end{array} \right. $$

with the root \((x^{(0)}_2,\ldots ,x^{(0)}_k,y^{(0)}_2,\ldots ,y^{(0)}_k)=\left( q_2,\ldots ,q_k,p_1-p_2,\ldots ,p_1-p_k\right) \). Then we introduce \(k-1\) new variables \(z_i\) for the prime factors \(p_i\) (\(2\le i \le k\)), and use the equation \(x_i z_i=N_i\) to decrease the determinant of the desired lattice. That is the key reason why we obtain better results than [22].

In Fig. 1, we give the comparison with previous bounds for the case \(k=2\). In Table 1, we list the comparisons between our generalized bounds and the previous bounds.

Recently, in [19], Peng et al. proposed another method for the IFP. Instead of applying Coppersmith’s technique directly to the ACDP, they utilized the lattice proposed by May and Ritzenhofen [18] and tried to find the coordinates of the desired vector, which is not included in the reduced basis; namely, they introduced a method to deal with the case where the number of shared bits is not large enough to satisfy the bound in [18].

In this paper, we also investigate Peng et al.’s method [19]. Surprisingly, we get the same result with a different method. In Sect. 5, we give the experimental data for our two methods.

We organize the rest of the paper as follows. In Sect. 2, we review the necessary background for our approaches. In Sect. 3, based on new observations, we present our new analysis on the IFP. In Sect. 4, we revisit Peng et al.’s method [19]. Finally, in Sect. 5, we give the experimental data for the comparison with previous methods.

Fig. 1.

Comparison with previous bounds on \(\gamma \) with respect to \(\alpha \): \(k=2\). MR Attack denotes May and Ritzenhofen’s attack [18], SM Attack denotes Sarkar and Maitra’s attack [22], PHXHX Attack denotes Peng et al.’s attack [19].

2 Preliminaries

2.1 Notations

Let \(N_{1}=p_{1}q_{1},\ldots ,N_{k}=p_{k}q_{k}\) be n-bit RSA moduli, where \(q_{1},\ldots ,q_{k}\) are \(\alpha n\)-bit primes with \(\alpha \in (0,1)\). Three cases are considered in this paper; we list them below:

  • \(p_{1},\ldots ,p_{k}\) share \(\beta n\) LSBs where \(\beta \in (0,1)\);

  • \(p_{1},\ldots ,p_{k}\) share \(\gamma n\) MSBs where \(\gamma \in (0,1)\);

  • \(p_{1},\ldots ,p_{k}\) share \(\gamma n\) MSBs and \(\beta n\) LSBs together where \(\gamma \in (0,1)\) and \(\beta \in (0,1)\);

For simplicity, here we consider \(\alpha n\), \(\beta n \) and \(\gamma n\) as integers.

2.2 Lattice

Consider a set of linearly independent vectors \(u_{1},\ldots ,u_{w}\in \mathbb {Z}^{n}\), with \(w\leqslant n\). The lattice \(\mathcal {L}\) spanned by \(\{u_{1},\ldots ,u_{w}\}\) is the set of all integer linear combinations of the vectors \(u_{1},\ldots ,u_{w}\). The number w of vectors is the dimension of the lattice, and the set \(u_{1},\ldots ,u_{w}\) is called a basis of \(\mathcal {L}\). In lattices of large dimension, finding the shortest vector is a very hard problem; however, approximations of a shortest vector can be obtained in polynomial time by applying the well-known LLL basis reduction algorithm [13].

Lemma 1

(LLL [13]). Let \(\mathcal {L}\) be a lattice of dimension w. In polynomial-time, the LLL algorithm outputs reduced basis vectors \(v_{i}\), \(1\leqslant i \leqslant w\) that satisfy

$$ \parallel v_{1} \parallel \leqslant \parallel v_{2} \parallel \leqslant \cdots \leqslant \parallel v_{i} \parallel \leqslant 2^{\frac{w(w-1)}{4(w+1-i)}} \det (\mathcal {L})^{\frac{1}{w+1-i}}. $$
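For intuition, the following Python sketch implements Lagrange-Gauss reduction, the two-dimensional special case of lattice basis reduction; it is only an illustration, since the lattices constructed later in the paper require the full LLL algorithm.

    def gauss_reduce(u, v):
        # Lagrange-Gauss reduction of a basis (u, v) of a 2-dimensional integer lattice.
        dot = lambda a, b: a[0] * b[0] + a[1] * b[1]
        while True:
            if dot(u, u) > dot(v, v):
                u, v = v, u                       # keep u as the shorter basis vector
            m = round(dot(u, v) / dot(u, u))      # nearest-integer Gram coefficient
            if m == 0:
                return u, v                       # now |<u,v>| <= |u|^2/2: the basis is reduced
            v = (v[0] - m * u[0], v[1] - m * u[1])

    # Example: the basis (1, 10**6), (0, 10**6 + 3) hides the short vector (-1, 3).
    print(gauss_reduce((1, 10**6), (0, 10**6 + 3)))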

We also state a useful lemma from Howgrave-Graham [9]. Let \(g(x_{1},\ldots ,x_{k})=\sum _{i_{1},\ldots ,i_{k}}a_{i_{1},\ldots ,i_{k}}x^{i_{1}}_{1}\cdots x_{k}^{i_{k}}\). We define the norm of g by the Euclidean norm of its coefficient vector: \(|| g || ^{2}=\sum _{i_{1},\ldots ,i_{k}}a^{2}_{i_{1},\ldots ,i_{k}}\).

Lemma 2

(Howgrave-Graham [9]). Let \(g(x_{1},\ldots ,x_{k})\in \mathbb {Z}[x_{1},\ldots ,x_{k}]\) be an integer polynomial that consists of at most w monomials. Suppose that

  1. \(g(y_{1},\ldots ,y_{k})\equiv 0 \quad \text {mod }p^{m}\) for some \(\mid y_{1} \mid \leqslant X_{1},\ldots , \mid y_{k}\mid \leqslant X_{k}\), and

  2. \(\parallel g(x_{1}X_{1},\ldots ,x_{k}X_{k})\parallel < \frac{p^{m}}{\sqrt{w}}.\)

Then \(g(y_{1},\ldots ,y_{k})=0\) holds over the integers.

The approach we use in the rest of the paper relies on the following heuristic assumption [7, 17] for computing the common roots of multivariate polynomials.

Assumption 1

The lattice-based constructions in this work yield algebraically independent polynomials; the common roots of these polynomials can be computed using techniques such as resultant computation or finding a Gröbner basis.

Gaussian Heuristic. For a random n-dimensional lattice \(\mathcal {L}\) in \(\mathbb {R}^n\) [8], the length of the shortest vector \(\lambda _1\) is expected to be approximately

$$ \sqrt{\frac{n}{2\pi e}}{\det (\mathcal {L})}^{\frac{1}{n}}. $$

In our attack, the low-dimensional lattice we construct is not a random lattice; however, according to our practical experiments, the length of the first vector of the basis output by the \(L^3\) algorithm applied to that specific lattice is indeed asymptotically close to the Gaussian heuristic, just as the assumption states for random lattices. Moreover, the lengths of the other vectors in the basis are also asymptotically close to the Gaussian heuristic. Hence, we can roughly estimate the sizes of the unknown coordinates of the desired vector in the reduced basis.
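This estimate is easy to evaluate; the small Python helper below (an illustrative sketch that works with \(\log _2\) sizes to avoid huge integers) can be used to predict the rough bit size of the reduced-basis entries, as in the estimates of Sect. 4.

    from math import log2, pi, e

    def gaussian_heuristic_log2(log2_det, n):
        # log2 of the Gaussian-heuristic length sqrt(n/(2*pi*e)) * det(L)^(1/n)
        return 0.5 * log2(n / (2 * pi * e)) + log2_det / n

    # Example: an n = 50 dimensional lattice with det(L) = 2^1000.
    print(gaussian_heuristic_log2(1000, 50))      # about 20.8 bits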

3 Our New Analysis for Implicit Factorization

As described in the previous section, we will use the fact that the desired common root of the target equations contains large prime factors \(q_i\) (\(2\le i \le k\)), which are already determined by the \(N_i\), to improve Sarkar and Maitra’s results.

3.1 Analysis for Two RSA Moduli: The MSBs Case

Theorem 1

Let \(N_{1}=p_{1}q_{1},N_{2}=p_{2}q_{2}\) be two different n-bit RSA moduli with \(\alpha n\)-bit \(q_1, q_2\), where \(\alpha \in (0,1)\). Suppose that \(p_{1},p_{2}\) share \(\gamma n\) MSBs, where \(\gamma \in (0,1)\). Then under Assumption 1, \(N_{1}\) and \(N_{2}\) can be factored in polynomial time if

$$ \gamma > 2\alpha (1-\alpha ) $$

Proof

Let \(\widetilde{p_2}=p_1 -p_2\). We have \(N_1=p_1 q_1\), \(N_2=p_2 q_2=p_1 q_2 -\widetilde{p_2} q_2\), and \(\gcd (N_1, N_2+\widetilde{p_2}q_2)=p_1\). Then we want to recover \(q_2,\widetilde{p_2}\) from \(N_1, N_2\). We focus on the bivariate polynomial \(f(x,y)=N_2+xy\) with the root \((x^{(0)},y^{(0)})=(q_2,\widetilde{p_2})\) modulo \(p_1\). Let \(X=N^{\alpha },Y=N^{1-\alpha -\gamma },Z=N^{1-\alpha }\) be the upper bounds of \(q_2,\widetilde{p_2},p_2\), respectively. In the following we will use the fact that the root component \(q_2\) is already determined by \(N_2\) (it divides \(N_2\)) to improve Sarkar and Maitra’s results.

First let us introduce a new variable z for \(p_2\). We multiply the polynomial f(x, y) by a power \(z^{s}\) for some s that has to be optimized. Additionally, we can replace every occurrence of the monomial xz by \(N_2\). For two integers m and t to be fixed later, let us look at the following collection of trivariate polynomials that all have the root \((q_2,\widetilde{p_2},p_2)\) modulo \(p_1^{t}\).

$$ g_k (x,y,z)=z^{s}f^{k}N_{1}^{\max \{t-k,0 \}} \ \ \ \ \text {for} \ k= 0,\ldots , m $$

For \(g_k(x,y,z)\), we replace every occurrence of the monomial xz by \(N_2\), because \(N_2=p_2 q_2\). Therefore, every monomial \(x^k y^{k} z^s\) with \(k\ge s\) and coefficient \(a_{k}\) is transformed into a monomial \(x^{k-s} y^{k}\) with coefficient \(a_{k}N_2 ^s\), and every monomial \(x^k y^{k} z^s\) with \(k< s\) and coefficient \(a_{k}\) is transformed into a monomial \(y^{k} z^{s-k}\) with coefficient \(a_{k}N_2^k\).
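This substitution can be carried out mechanically on the shift polynomials; the following sympy sketch (illustrative only, with \(N_2\) kept symbolic) folds every xz pair into \(N_2\) exactly as described above.

    import sympy as sp

    x, y, z, N2 = sp.symbols('x y z N2')

    def fold_xz(expr):
        # Replace every x*z pair in each monomial by N2 (valid because N2 = q2 * p2).
        out = 0
        for (a, b, c), coeff in sp.Poly(expr, x, y, z).terms():
            d = min(a, c)                         # number of x*z pairs in x^a y^b z^c
            out += coeff * N2**d * x**(a - d) * y**b * z**(c - d)
        return sp.expand(out)

    f = N2 + x * y                                # f(x, y) with root (q2, p1 - p2) modulo p1
    print(fold_xz(z**2 * f**3))                   # the shift polynomial z^s f^k for s = 2, k = 3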

To keep the lattice determinant as small as possible, we try to eliminate the factor \(N_2^{i}\) in the coefficients of the diagonal entries. Since \(\gcd (N_1,N_2)=1\), we only need to multiply the corresponding polynomial by the inverse of \(N_2^{i}\) modulo \(N_1^t\).

Compared to Sarkar and Maitra’s lattice, the coefficient vectors \(g_k (xX,yY,zZ)\) of our lattice contain fewer powers of X, which decreases the determinant of the lattice spanned by these vectors; on the other hand, the coefficient vectors contain powers of Z, which in turn increases the determinant. Hence, there is a trade-off, and one has to optimize the parameter s so as to minimize the lattice determinant. That is the key reason why we can get better results than Sarkar and Maitra.

We have to find two short vectors in the lattice \(\mathcal {L}\). Suppose that these two vectors are the coefficient vectors of two trivariate polynomials \(f_1(xX,yY,zZ)\) and \(f_2(xX,yY,zZ)\). These two polynomials have the root \((q_2,\widetilde{p_2},p_2)\) over the integers. Then we can eliminate the variable z from these polynomials by setting \(z=\frac{N_2}{x}\). Finally, we can extract the desired root \((q_2,\widetilde{p_2})\) from the two new polynomials provided that they are algebraically independent. Therefore, our attack relies on Assumption 1.

We are able to confirm Assumption 1 by various experiments later. This shows that our attack works very well in practice.

Now we give the details of the condition under which we can find two sufficiently short vectors in the lattice \(\mathcal {L}\). The determinant of the lattice \(\mathcal {L}\) is

$$ \det (\mathcal {L})=N_1^{\frac{t(t+1)}{2}} X^{\frac{(m-s)(m-s+1)}{2}} Y^{\frac{m(m+1)}{2}} Z^{\frac{s(s+1)}{2}} $$

The dimension of the lattice is \(w=m+1\).

To get two polynomials sharing the root \((q_2,\widetilde{p_2},p_2)\), we need the condition

$$ 2^{\frac{w(w-1)}{4w}}\text {det} (\mathcal {L})^{\frac{1}{w}}<\frac{p_{1}^{t}}{\sqrt{w}} $$

Substituting the value of \(\det (\mathcal {L})\) and neglecting low-order terms, we obtain the new condition

$$ \frac{t^2}{2}+\alpha \frac{(m-s)^2}{2}+(1-\alpha -\gamma )\frac{m^2}{2}+(1-\alpha )\frac{s^2}{2} < (1-\alpha )tm $$

Let \(t=\tau m, s=\sigma m\). The optimized values of parameters \(\tau \) and \(\sigma \) are given by

$$ \tau =1-\alpha \ \ \ \ \ \ \ \ \ \sigma =\alpha $$

Plugging in these values, we finally end up with the condition

$$ \gamma >2\alpha (1-\alpha ) $$

One can refer to Fig. 1 for the comparison with previous theoretical results.
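The optimization step can also be checked symbolically; the sympy sketch below (a verification aid, not part of the attack itself) rederives \(\tau =1-\alpha \), \(\sigma =\alpha \) and the bound \(\gamma >2\alpha (1-\alpha )\) from the asymptotic condition above.

    import sympy as sp

    alpha, gamma, tau, sigma = sp.symbols('alpha gamma tau sigma', positive=True)

    # Asymptotic condition with t = tau*m and s = sigma*m, divided by m^2/2;
    # the attack works when this expression is negative.
    cond = (tau**2 + alpha * (1 - sigma)**2 + (1 - alpha - gamma)
            + (1 - alpha) * sigma**2 - 2 * (1 - alpha) * tau)

    tau_opt = sp.solve(sp.diff(cond, tau), tau)[0]        # 1 - alpha
    sigma_opt = sp.solve(sp.diff(cond, sigma), sigma)[0]  # alpha
    bound = sp.solve(sp.Eq(cond.subs({tau: tau_opt, sigma: sigma_opt}), 0), gamma)[0]
    print(tau_opt, sigma_opt, sp.expand(bound))           # 1 - alpha, alpha, 2*alpha - 2*alpha**2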

3.2 Extension to k RSA Moduli

In this section, we give an analysis for k (\(k> 2\)) RSA moduli.

Theorem 2

Let \(N_{1}=p_{1}q_{1},\ldots ,N_{k}=p_{k}q_{k}\) be k different n-bit RSA moduli with \(\alpha n\)-bit \(q_1,\ldots , q_k\), where \(\alpha \in (0,1)\). Suppose that \(p_{1},\ldots ,p_{k}\) share \(\gamma n\) MSBs, where \(\gamma \in (0,1)\). Then under Assumption 1, \(N_{1},\ldots ,N_{k}\) can be factored in polynomial time if

$$ \gamma > k(1-\alpha )\left( 1-(1-\alpha )^{\frac{1}{k-1}}\right) $$

Proof

Let \(\widetilde{p_i}=p_1 -p_i\). We have \(N_1=p_1 q_1\) and \(N_i=p_i q_i=p_1 q_i -\widetilde{p_i} q_i\) \((2\le i \le k)\). We have \(\gcd (N_1, N_2+\widetilde{p_2}q_2,\ldots , N_k+\widetilde{p_k}q_k)=p_1\). Then we want to recover \(q_i,\widetilde{p_i}\) \((2\le i \le k)\) from \(N_1, \ldots ,N_k\). We construct a system of \(k-1\) polynomials

$$ \left\{ \begin{array}{ccc} f_{2}(x_{2},y_{2})= N_2 + x_2y_2\\ \vdots \\ f_{k}(x_{k},y_{k}) = N_{k} + x_ky_k\\ \end{array} \right. $$

with the root \((x_2^{(0)},y_2^{(0)},\ldots ,x_k^{(0)},y_k^{(0)})=(q_2,\widetilde{p_2},\ldots ,q_k,\widetilde{p_k})\) modulo \(p_1\). Using a technique similar to that of Theorem 1, and introducing \(k-1\) new variables \(z_i\) for \(p_i\) \((2\le i \le k)\), we define the following collection of polynomials.

$$ g_{i_2,\ldots ,i_k} (x_2,\ldots ,x_k,y_2,\ldots ,y_k,z_2,\ldots ,z_k) = (z_2\cdots z_k)^s f_2^{i_2}\cdots f_k^{i_k} N_{1}^{\max \{t-i_2-\cdots -i_k,0 \}} $$

with \(0\le i_2+\cdots +i_k\le m\) (because of the symmetric nature of the unknown variables \(x_2,\ldots ,x_k\), i.e., all of them have the same size, we use the same parameter s for all the \(z_i\)).

For \(g_{i_2,\ldots ,i_k}\), we replace every occurrence of the monomial \(x_i z_i \) by \(N_i\). As in the proof of Theorem 1, we can eliminate the factor \(N_2^{j_2}\cdots N_k^{j_k}\) in the coefficients of the diagonal entries. The determinant of the lattice \(\mathcal {L}\) is

$$ \text {det}(\mathcal {L})=N_1^{s_{N}}\prod _{i=2}^{k} X_{i}^{s_{X_i}} Y_{i}^{s_{Y_i}} Z_i^{s_{Z_i}} $$

where

$$\begin{aligned} s_{N}= & {} \sum _{j=0}^{t}j \left( {\begin{array}{c}t-j+k-2\\ k-2\end{array}}\right) =\left( {\begin{array}{c}t+k-1\\ k-1\end{array}}\right) \frac{t}{k}\\ s_{X_2}=\cdots =s_{X_k}= & {} \sum _{j=0}^{m-s}j \left( {\begin{array}{c}m-s-j+k-2\\ k-2\end{array}}\right) =\left( {\begin{array}{c}m-s+k-1\\ k-1\end{array}}\right) \frac{m-s}{k}\\ s_{Y_2}=\cdots =s_{Y_k}= & {} \sum _{j=0}^{m}j \left( {\begin{array}{c}m-j+k-2\\ k-2\end{array}}\right) = \left( {\begin{array}{c}m+k-1\\ k-1\end{array}}\right) \frac{m}{k}\\ s_{Z_2}=\cdots =s_{Z_k}= & {} \sum _{j=0}^{s}j \left( {\begin{array}{c}m-s+j+k-2\\ k-2\end{array}}\right) \\= & {} \left( {\begin{array}{c}m+k-1\\ k\end{array}}\right) \frac{ks-m}{m}+\left( {\begin{array}{c}m-s-1+k-1\\ k\end{array}}\right) \frac{k+m-s-1}{m-s-1}\\ \end{aligned}$$

Here \(X_i=N^{\alpha },Y_i=N^{1-\alpha -\gamma },Z_i=N^{1-\alpha }\) are the upper bounds of \(q_i,\widetilde{p_i},p_i\). The dimension of the lattice is

$$ w=\dim (\mathcal {L})=\sum ^{m}_{j=0}\left( {\begin{array}{c}j+k-2\\ j\end{array}}\right) =\left( {\begin{array}{c}m+k-1\\ m\end{array}}\right) $$
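The closed form of the dimension follows from the hockey-stick identity; a quick Python spot check:

    from math import comb

    # sum_{j=0}^{m} C(j+k-2, j) = C(m+k-1, m) over small parameter ranges
    for k in range(2, 7):
        for m in range(1, 20):
            assert sum(comb(j + k - 2, j) for j in range(m + 1)) == comb(m + k - 1, m)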

To get \(2k-2\) polynomials sharing the root \((q_2,\ldots ,q_k,\widetilde{p_2},\ldots ,\widetilde{p_k},p_2,\ldots ,p_k)\), we need the condition

$$ 2^{\frac{w(w-1)}{4(w+4-2k)}}\det (\mathcal {L})^{\frac{1}{w+4-2k}}<\frac{p_{1}^{t}}{\sqrt{w}} $$

Substituting the value of \(\det (\mathcal {L})\) and neglecting low-order terms, we obtain the new condition

$$\begin{aligned}&\left( {\begin{array}{c}t+k-1\\ k-1\end{array}}\right) \frac{t}{k}+(k-1)\alpha \left( {\begin{array}{c}m-s+k-1\\ k-1\end{array}}\right) \frac{m-s}{k}\\&+(k-1)(1-\alpha -\gamma )\left( {\begin{array}{c}m+k-1\\ k-1\end{array}}\right) \frac{m}{k} +(k-1)(1-\alpha )\left( {\begin{array}{c}m+k-1\\ k\end{array}}\right) \frac{ks-m}{m} \\&+(k-1)(1-\alpha )\left( {\begin{array}{c}m-s-1+k-1\\ k\end{array}}\right) \frac{k+m-s-1}{m-s-1}\\&<(1-\alpha )t\left( {\begin{array}{c}m+k-1\\ m\end{array}}\right) \end{aligned}$$

Let \(t=\tau m, s=\sigma m\). The optimized values of the parameters \(\tau \) and \(\sigma \) are given by

$$ \tau =(1-\alpha )^{\frac{1}{k-1}} \ \ \ \ \ \ \ \ \ \sigma =1-(1-\alpha )^{\frac{1}{k-1}} $$

Plugging in these values, we finally end up with the condition

$$ \gamma > k(1-\alpha )\left( 1-(1-\alpha )^{\frac{1}{k-1}}\right) $$

One can refer to Table 1 for the comparison with previous theoretical results.

3.3 Extension to the LSBs Case

In the following, we show a similar result for the case where \(p_{1},\ldots ,p_{k}\) share some MSBs and LSBs together. This also covers the case where only LSBs are shared.

Theorem 3

Let \(N_{1}=p_{1}q_{1},\ldots ,N_{k}=p_{k}q_{k}\) be k different n-bit RSA moduli with \(\alpha n\)-bit \(q_i\) \((\alpha \in (0,1))\). Suppose that \(p_{1},\ldots ,p_{k}\) share \(\gamma n\) MSBs \((\gamma \in (0,1))\) and \(\beta n\) LSBs \((\beta \in (0,1))\) together. Then under Assumption 1, \(N_{1},\ldots ,N_{k}\) can be factored in polynomial time if

$$ \gamma + \beta > k(1-\alpha )\left( 1-(1-\alpha )^{\frac{1}{k-1}}\right) $$

Proof

Suppose that \(p_1,\ldots ,p_k\) share \(\gamma n\) MSBs and \(\beta n\) LSBs together. Then we have the following equations:

$$ \left\{ \begin{array}{cccc} p_{2}=p_{1}+2^{\beta n}\tilde{p_2}\\ \vdots \\ p_{k}=p_{1}+2^{\beta n}\tilde{p_k}\\ \end{array} \right. $$

Since \(p_1,\ldots ,p_k\) also share \(\gamma n\) MSBs, each \(\tilde{p_i}\) satisfies \(|\tilde{p_i}|< 2^{(1-\alpha -\beta -\gamma )n}\). We can then write

$$ N_i q_1 - N_1 q_i =2^{\beta n}\tilde{p_i} q_1 q_i \ \ \ {\text {for}} \ 2\le i \le k $$

Then we get

$$ (2^{\beta n})^{-1}N_i q_1 - \tilde{p_i} q_1 q_i \equiv 0 \quad \text {mod }N_1 \ \ \ {\text {for}} \ 2\le i \le k $$

Let \(A_i \equiv (2^{\beta n})^{-1}N_i \ \text {mod }N_1\) for \(2\le i \le k\). Since \(N_1=p_1q_1\), canceling the common factor \(q_1\) in the congruences above yields

$$ \left\{ \begin{array}{ccc} A_2 - q_2 \tilde{p_2} \equiv 0 \quad \text {mod }p_1\\ \vdots \\ A_k - q_k \tilde{p_k} \equiv 0 \quad \text {mod }p_1\\ \end{array} \right. $$

Then we can construct a system of \(k-1\) polynomials

$$ \left\{ \begin{array}{ccc} f_{2}(x_{2},y_{2}) = A_{2} - x_{2}y_2\\ \vdots \\ f_{k}(x_{k},y_{k}) = A_{k} - x_{k}y_k\\ \end{array} \right. $$

with the root \((x_2^{(0)},y_2^{(0)},\ldots ,x_k^{(0)},y_k^{(0)})=(q_2,\widetilde{p_2},\ldots ,q_k,\widetilde{p_k})\) modulo \(p_1\). The rest of the proof follows a technique similar to that in the proof of Theorem 2. We omit the details here.
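The transformation above is easy to verify numerically; the Python sketch below builds a toy LSBs-only instance (hypothetical small parameters, i.e. \(\gamma =0\)) and checks the congruence \(A_2 - q_2\tilde{p_2}\equiv 0 \ \text {mod }p_1\).

    from sympy import nextprime, isprime

    beta_bits = 8                                 # number of shared LSBs (beta*n in the text)
    p1 = nextprime(2**31)
    p2 = p1 + 2**beta_bits                        # search for a prime with the same low bits
    while not isprime(p2):
        p2 += 2**beta_bits
    q1, q2 = nextprime(2**15), nextprime(2**15 + 1000)
    N1, N2 = p1 * q1, p2 * q2

    p2_tilde = (p2 - p1) >> beta_bits             # p_2 = p_1 + 2^(beta*n) * p2_tilde
    A2 = (pow(2**beta_bits, -1, N1) * N2) % N1    # A_2 = (2^(beta*n))^{-1} N_2 mod N_1
    assert (A2 - q2 * p2_tilde) % p1 == 0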

4 Revisiting Peng et al.’s Method [19]

In [19], Peng et al. gave a new idea for the IFP. In this section, we revisit their method and modify the construction of the lattice used to solve the homogeneous linear modular equation. As a result, a further improved bound on the number of shared LSBs and MSBs is obtained.

Recall the method proposed by May and Ritzenhofen in [18]: they determined a lower bound on the number of shared LSBs which ensures that the vector \((q_1,\cdots ,q_k)\) is a shortest vector in the lattice, so that the desired factorization can be obtained by a lattice basis reduction algorithm.

Peng et al. considered the lattice introduced in [18] and discussed a method that can deal with the case when the number of shared LSBs is not enough to ensure that the desired factorization can be found by applying reduction algorithms to the lattice. More precisely, since \((q_1,\cdots ,q_k)\) is in the lattice, it can be represented as a linear combination of the reduced basis vectors. Hence the problem of finding \((q_1,\cdots ,q_k)\) is transformed into solving a homogeneous linear equation with unknown moduli. Peng et al. utilized the result of Herrmann and May [7] to solve this linear modular equation and obtained a better result.

First, we recall the case where the primes share LSBs. Assume that there are k different n-bit RSA moduli \(N_1=p_1q_1,\cdots ,N_k=p_kq_k\), where \(p_1,\cdots ,p_k\) share \(\gamma n\) LSBs and \(q_1,\cdots ,q_k\) are \(\alpha n\)-bit primes. The moduli can be represented as

$$\begin{aligned} \left\{ \begin{array}{c} N_1=(p+2^{\gamma n}\widetilde{p_1})q_1\\ \vdots \\ N_k=(p+2^{\gamma n}\widetilde{p_k})q_k \end{array} \right. \end{aligned}$$

Furthermore, we can get the following modular equations

$$\begin{aligned} \left\{ \begin{array}{c} N_1^{-1}N_2q_1-q_2\equiv 0\quad \text {mod }2^{\gamma n}\\ \vdots \\ N_1^{-1}N_kq_1-q_k\equiv 0\quad \text {mod }2^{\gamma n} \end{array} \right. \end{aligned}$$
(2)

In [18], May and Ritzenhofen introduced a k-dimensional lattice \(\mathcal {L}_1\) which is generated by the row vectors of the following matrix

$$\begin{aligned} \begin{pmatrix} 1 &{} N_1^{-1}N_2 &{} N_1^{-1}N_3 &{} \cdots &{} N_1^{-1}N_k\\ 0 &{} 2^{\gamma n} &{} 0 &{} \cdots &{} 0\\ 0 &{} 0 &{} 2^{\gamma n} &{} \cdots &{} 0\\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} 0 &{} \cdots &{} 2^{\gamma n} \end{pmatrix}. \end{aligned}$$

Since (2) holds, the vector \((q_1,\cdots ,q_k)\) is, with good probability, the shortest vector in \(\mathcal {L}_1\) when \(\gamma \ge \frac{k}{k-1}\alpha \). Then, by applying the LLL reduction algorithm to the lattice, the vector \((q_1,\cdots ,q_k)\) can be recovered. Conversely, when \(\gamma <\frac{k}{k-1}\alpha \), the reduced basis \((\lambda _1,\cdots ,\lambda _k)\) does not contain the vector \((q_1,\cdots ,q_k)\); nevertheless, we can represent \((q_1,\cdots ,q_k)\) as a linear combination of the reduced basis vectors. Namely, there exist integers \(x_1,x_2,\cdots ,x_k\) such that \((q_1,\cdots ,q_k)=x_1 \lambda _1+\cdots +x_k \lambda _k\). Moreover, the following system of modular equations can be obtained,

$$\begin{aligned} \left\{ \begin{array}{c} x_1l_{11}+x_2l_{21}+\cdots +x_kl_{k1}=q_1\equiv 0\quad \text {mod }q_1\\ \vdots \\ x_1l_{1k}+x_2l_{2k}+\cdots +x_kl_{kk}=q_k\equiv 0\quad \text {mod }q_k \end{array} \right. \end{aligned}$$
(3)

where \(\lambda _i=(l_{i1},l_{i2},\cdots ,l_{ik}),\,i=1,2,\cdots ,k\).
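For illustration, the following toy Python check (hypothetical small parameters, k = 3) verifies the congruences (2) and, consequently, that \((q_1,\cdots ,q_k)\) is an integer combination of the rows of the matrix above (here the generating rows themselves, not an LLL-reduced basis).

    from sympy import nextprime, isprime

    t_bits = 12                                   # number of shared LSBs (gamma*n in this section)
    M = 2**t_bits
    p_list = [nextprime(2**20)]
    while len(p_list) < 3:                        # further primes congruent to p_1 mod 2^(gamma*n)
        cand = p_list[-1] + M
        while not isprime(cand):
            cand += M
        p_list.append(cand)
    q_list = [nextprime(2**10 + 37 * i) for i in range(3)]
    N_list = [p * q for p, q in zip(p_list, q_list)]

    inv = pow(N_list[0], -1, M)
    row1 = [1] + [(inv * Nj) % M for Nj in N_list[1:]]
    # Congruence (2): N_1^{-1} N_j q_1 = q_j (mod 2^(gamma*n)), so these coefficients are integers.
    coeffs = [(q_list[j] - q_list[0] * row1[j]) // M for j in range(1, 3)]
    vec = [q_list[0]] + [q_list[0] * row1[j] + coeffs[j - 1] * M for j in range(1, 3)]
    assert vec == q_list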

Based on the experiments, the sizes of the reduced basis vectors can be roughly estimated by the Gaussian heuristic. We estimate the length of \(\lambda _i\) and the size of \(l_{ij}\) as \(\det (\mathcal {L}_1)^{\frac{1}{k}}=2^{\frac{\gamma n(k-1)}{k}}\); hence the solution of (3) satisfies \(|x_i|\approx \frac{q_i}{kl_{ij}}\approx 2^{\alpha n-\frac{\gamma n(k-1)}{k}-\log _2k}\le 2^{\alpha n-\frac{\gamma n(k-1)}{k}}\).

Then using the Chinese Remainder Theorem, from (3) we can get the following homogeneous modular equation

$$\begin{aligned} a_1x_1+a_2x_2+\cdots +a_kx_k\equiv 0 \quad \text {mod }q_1q_2\cdots q_k \end{aligned}$$
(4)

where \(a_i\) is an integer satisfying \(a_i\equiv l_{ij}\,\text {mod }N_j\) for \(1\le j\le k\); since \(q_j\) divides \(N_j\), this also gives \(a_i\equiv l_{ij}\,\text {mod }q_j\). The \(a_i\) can be calculated from the \(l_{ij}\) and \(N_j\) by the Chinese Remainder Theorem.
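In code, the coefficients \(a_i\) can be obtained with a standard CRT routine; a minimal sketch (assuming the reduced-basis entries l_row = \((l_{i1},\cdots ,l_{ik})\) and the pairwise coprime moduli \(N_1,\cdots ,N_k\) are available as Python lists) is:

    from sympy.ntheory.modular import crt

    def coefficient(l_row, N_list):
        # a_i = l_ij (mod N_j) for all j; since q_j | N_j, this also fixes a_i mod q_j.
        a_i, _ = crt(N_list, l_row)
        return a_i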

For this linear modular equation, Peng et al. directly utilized the method of Herrmann and May [7] to solve it and obtained that when

$$ \gamma \ge \frac{k}{k-1}\left( \alpha -1+(1-\alpha )^{\frac{k+1}{k}}+(k+1)(1-(1-\alpha )^{\frac{1}{k}})(1-\alpha )\right) $$

the desired solution can be found.

In this paper, we notice that the linear modular equation is homogeneous, which is a variant of Herrmann and May’s equation; hence we utilize the following theorem, proposed by Lu et al. in [16], to modify the construction of the lattice used in [19].

Theorem 4

Let N be a sufficiently large composite integer (of unknown factorization) with a divisor p (\(p\ge N^{\beta }\)). Furthermore, let \(f(x_{1},\ldots ,x_{n})\in \mathbb {Z}[x_1,\ldots ,x_n]\) be a homogeneous linear polynomial in \(n\ (n \ge 2)\) variables. Under Assumption 1, we can find all the solutions \((y_{1},\ldots ,y_{n})\) of the equation \(f(x_{1},\ldots ,x_{n})\equiv 0 \ (\text {mod }p)\) with \(\gcd (y_1,\ldots ,y_n)=1\), \( \left| y_{1} \right| \le N^{\gamma _{1}},\ldots ,\left| y_{n} \right| \le N^{\gamma _{n}}\) if

$$ \sum _{i=1}^{n}\gamma _{i} < \left( 1-(1-\beta )^{\frac{n}{n-1}}-n(1-\beta )\left( 1-\root n-1 \of {1-\beta } \right) \right) $$

The running time of the algorithm is polynomial in \(\log N\) but exponential in n.

For the homogeneous linear Eq. (4) in k variables modulo \(q_1q_2\cdots q_k\approx (N_1N_2\cdots N_k)^\alpha \), applying Theorem 4 with the variables bounded by \(x_i<(N_1N_2\cdots N_k)^{\delta _i}\approx 2^{k\delta _in},\,i=1,2,\cdots ,k\), we can recover the variables when

$$ \sum _{i=1}^k\delta _i\approx k\delta _i\le 1-(1-\alpha )^{\frac{k}{k-1}}-k(1-\alpha )\left( 1-(1-\alpha )^{\frac{1}{k-1}}\right) $$

where \(\delta _1\approx \delta _2\approx \cdots \approx \delta _k\).

Hence, when

$$ \alpha -\frac{\gamma (k-1)}{k}\le 1-(1-\alpha )^{\frac{k}{k-1}}-k(1-\alpha )\left( 1-(1-\alpha )^{\frac{1}{k-1}}\right) $$

Namely,

$$\begin{aligned}&\gamma \ge \frac{k}{k-1}\left( \alpha -1+(1-\alpha )^{\frac{k}{k-1}}+k(1-(1-\alpha )^{\frac{1}{k-1}})(1-\alpha )\right) \\&=k(1-\alpha )\left( 1-(1-\alpha )^{\frac{1}{k-1}}\right) \\ \end{aligned}$$

the desired vector can be found.
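The simplification in the last step can be sanity-checked numerically (toy values of k and \(\alpha \); floating point, so only approximate equality is tested):

    for k in (2, 3, 5, 10):
        for a in (0.1, 0.3, 0.5, 0.7):
            lhs = k / (k - 1) * (a - 1 + (1 - a)**(k / (k - 1))
                                 + k * (1 - (1 - a)**(1 / (k - 1))) * (1 - a))
            rhs = k * (1 - a) * (1 - (1 - a)**(1 / (k - 1)))
            assert abs(lhs - rhs) < 1e-9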

The above result can be easily extended to the MSBs case using the technique in [19]. Surprisingly, we obtain the same result as Theorem 2 by modifying Peng et al.’s technique.

5 Experimental Results

We implemented our analysis in the Magma 2.20 computer algebra system on a PC with an Intel(R) Core(TM) Duo CPU (2.80 GHz, 2.16 GB RAM, Windows 7). Note that, for the first time, we can experimentally handle the IFP for the case of balanced RSA moduli. The column theo. denotes the asymptotic bound on the number of shared bits when the dimension goes to infinity, and the column expt. denotes the best experimental results for a fixed dimension of our constructed lattice. Since the method of [22] cannot deal with the case of balanced RSA moduli, we use ‘-’ in Table 2. Moreover, [19] showed that a theoretical bound can be obtained when p and q are balanced; however, they did not obtain experimental results, so we also use ‘-’ in Table 3. All running times of the experiments are measured in seconds.

Table 2. Theoretical and experimental data on the number of shared MSBs for the method of [22] and for our method in Sect. 3

We present some numerical values for the comparison between our method of Sect. 3 and the method of [22] in Table 2. The running time of the LLL algorithm depends on the lattice dimension and the bit size of the lattice entries, and the largest entry has a bit size of at most \(t\log (N_1)\). Thus the running time is determined by the parameters m and t, which explains why the time is reduced as p and q get more balanced. For the case \(k=2\), when the bit length of q increases, namely \(\alpha \) increases, the optimal value of t decreases. Thus, the running time of the LLL algorithm is reduced as \(\alpha \) increases, i.e., as p and q get more balanced.

Table 3. Theoretical and experimental data on the number of shared MSBs for the method of [19] and for our method in Sect. 4

Note that in the practical experiments, we always found many integer equations sharing the desired roots over the integers when the number of shared bits was greater than the listed results. This means that in the reduced basis there are several vectors satisfying Howgrave-Graham’s bound. Moreover, the more integer equations corresponding to these vectors we choose, the less time the Gröbner basis computation takes. For instance, when \(k=3\), \((m,t,s)=(13,9,4)\) and the bit lengths of p and q are both 512 bits, we constructed a 105-dimensional lattice, and by applying the \(L^3\) algorithm to it we successfully collected 74 polynomial equations sharing the desired roots over the integers when \(q_1,q_2,q_3\) shared 460 MSBs. When we chose all of these integer equations, the calculation of the Gröbner basis took 12.839 s.

Since our method of Sect. 4 is based on an improved version of the method of [19], we present some numerical values comparing these two methods in Table 3. As shown, by using an improved method to solve the homogeneous equations, we obtain an improved bound on the number of shared bits, and the experiments also confirm this improvement. For a fixed lattice dimension, since the entries of our constructed lattice are likewise determined by m and t, the running time of the LLL algorithm increases as t increases.

Note that the method of Sect. 3 runs faster than the method of Sect. 4 as p and q get more balanced, and especially so for balanced moduli. In the unbalanced case, the method of Sect. 4 is faster.