Keywords

1 Introduction

In 1998, Blaze, Bleumer and Strauss [3] proposed a new type of public-key cryptographic scheme, namely proxy re-encryption (PRE) scheme. A complete PRE scheme consists of three parties: the sender Alice, the receiver Bob, and the proxy. It permits Alice to encrypt messages to the proxy, and allows Bob to decrypt the ciphertexts re-encrypted by the proxy. Further, in the communication process, the proxy only provides the re-encryption operation without knowing any information about messages. In fact, proxy re-encryption scheme is a variant of the traditional public key cryptosystem. Its basic algorithm is the same as that of public key encryption scheme, except that there are two more steps: generating proxy re-encryption key and re-encrypting ciphertexts. After a period of development, many PRE schemes have been constructed, but the vast majority of these are based on traditional number theoretic problems, such as discrete logarithm problem [4]. However, these problems suffered the impact after Shor’s algorithm [14, 15] was put forward. Therefore, the attention turned to the field of post-quantum cryptography, such as lattice-based schemes [1].

At AsiaCCS 2015 [13], Nu\(\tilde{n}\)ez et al. proposed a NTRU-based proxy re-encryption (PRE) scheme, called NTRUReEncrypt. It only has one more re-encryption step, and the rest is the same as NTRU scheme including parameter sets. In 1996, Hoffstein, Pipher and Silverman proposed a cryptosystem called NTRU [7], which has the advantages of high efficiency and low memory usage. Due to these properties, it becomes an indispensable part of post-quantum cryptography, and has been standardized by IEEE P1363.1 [2]. Recently, it was also submitted to the third round of NIST-PQC competition, e.g. NTRU-HPS, NTRU-HRSS [5]. One reason for the efficiency of NTRU is that, some of polynomials in NTRU have small coefficients, which we will call “small" polynomials for ease of description. The brief process of NTRUReEncrypt scheme is as follows: (1) Alice encrypts the message m as \(C_{A}=h_{A} * r+m\) by selecting a small polynomial r, where Alice’s private key is \(\left( f_{A}, g_{A}\right) \) and public key is \(h_{A}=p * g_{A} * f_{A}^{-1}\). (2) The proxy selects a small polynomial e, encrypts \(C_A\) sent by Alice as \(C_{B}=C_{A} * r k_{A \rightarrow B}+p * e\), where \(r k_{A \rightarrow B}=f_{A} * f_{B}^{-1} \bmod q\) is the re-encrypted key of the proxy and \(f_B\) is Bob’s private key. (3) Bob decrypts \(C_B\) sent by the proxy as \((C_B * f_B \bmod q) \bmod p\) to obtain the message m.

There are many attacks against NTRU, such as decryption failure attack [8], broadcast attack [6, 10]. The method of latter uses the linearization technique, whose main idea is to generate a linear system by linearizing monomials into new variables. At PQCrypto 2019 [11], Liu et al. proposed cryptanalysis of NTRUReEncrypt, whose strategy is based on two points: one is decryption failure, the other is statistical analysis. Due to the huge amount of data required, these two attacks are hard to implement in practice. For instance, for ees1171ep1 parameter set, the number of required ciphertexts are \(4.68 \cdot 10^{17}\) and \(4.83 \cdot 10^{17}\) respectively.

Our Contribution. We present a key recovery attack based on the linearization technique against NTRUReEncrypt, where the parameter sets are from those in AsiaCCS 2015 and PQCrypto 2019. To implement an attack, \(O(N+\left[ \frac{N}{2}\right] )\) legal communications are needed to collect ciphertexts \(C_{A_i}\) and \(C_{B_i}\), where N is an odd prime in the ring \(\mathcal {R}=\mathbb {Z}[x] /\left( x^{N}-1\right) \). The comparison of PQCrypto 2019 and our work is shown in Table 1.

Table 1. Number of ciphertexts needed

The technical overview is as follows. First, we focus on the following relation from the proxy’s re-encryption stage:

$$C_{B}=C_{A} * r k_{A \rightarrow B}+p * e \bmod q.$$

Here, \(C_{A}, C_B, p, q=2^\gamma \) are known, where \(\gamma \) is an integer, and \(r k_{A \rightarrow B}, e\) are unknown. Our goal is to recover the re-encryption key \(r k_{A \rightarrow B}\), and then obtain the private key \(f_{A}\) or \(f_{B}\) based on \(r k_{A \rightarrow B}=f_{A} * f_{B}^{-1} \bmod q\). For the sake of efficiency, we first choose to recover \(r k_{A \rightarrow B} \bmod 2\) instead of \(r k_{A \rightarrow B} \bmod q\). Due to the special structure of coefficients in the polynomial e, i.e., its coefficients have certain numbers of \(+1\), \(-1\), and 0. Hence, the inner product of the coefficient vector of e is fixed. Thus, we can establish a system of linear congruence equations by using inner product calculation, and then obtain \(r k_{A \rightarrow B} \bmod 2\) by using the linearization technique. According to \(r k_{A \rightarrow B}=f_{A} * f_{B}^{-1} \bmod q\) and q is a power of 2, we get that \(f_{B}*(r k_{A \rightarrow B} \bmod 2)=f_A\bmod 2\). Without loss of generality, suppose that Bob is an attacker, where \(f_{B}\) is the private key of Bob. Based on the above equation, Bob can determine the position of 0 bits of Alice’s private key \(f_A\). Notice that the private key pair \((f_A,g_A)\) of Alice satisfies \(h_A =p * g_A * f_A^{-1} \bmod q\), where \(h_A\) is the public key. It implies \( f_A*h_A =p * g_A \bmod 2\). Furthermore, the attacker Bob can also deduce the position of 0 bits of \(g_A\). Finally, combining with the position of 0 bits about \(f_A\) and \(g_A\), we get a new system of linear congruence equations derived from \(f_A*h_A =p * g_A \bmod q\), and compute the remaining bits of \(f_A\) and \(g_A\) using Gaussian elimination. Theoretically, the algorithm overhead is divided into two main parts: constructing linear equations from the proxy’s re-encryption stage and solving linear equations. Since we choose to work on \(\mathbb {F}_2\) rather than \(\mathbb {Z}_{2048}\), the cost of the latter is greatly reduced to negligible. This means that the time required to implement an attack is almost dependent on constructing a system of equations, which could be completed within one hour on PC.

Our another contribution is to discuss the NTRUReEncrypt instantiated with the NTRU parameter sets in the third round of NIST-PQC competition. Unlike the parameter sets from AsiaCCS 2015 and PQCrypto 2019, the parameter sets in the third round of NIST-PQC competition, e.g., NTRU-HPS and NTRU-HRSS [5], no longer determine the certain numbers of \(+1\), \(-1\), 0 in the coefficients of the secret polynomials. It means that the inner product of e is not fixed. However, we can still take advantage of another property of ternary polynomials e. Denote the coefficient vector of e as \(\textbf{e}\), hence each component \(\textbf{e}_i\in \{-1, 0, 1\}\) satisfies \(\textbf{e}_i = {(\textbf{e}{_i})^3}\). The remaining operations are the same as the previous attack, except that the number of communications is increased to \(O(N^2)\).

Organization. The rest of this paper is organized as follows: In Sect. 2, we provide some basic preliminaries for the linear form and parameter sets of NTRU. In Sect. 3, we briefly describe NTRU and NTRUReEncrypt schemes, provide the specific parameter sets used in this paper. In Sect. 4, we present our attack in detail and give a comparison with PQCrypto 2019 [11]. In Sect. 5, we discuss the NTRUReEncrypt instantiated with the NTRU parameter sets in the third round of NIST-PQC competition, and also compare with previous parameter sets used in Sect. 4. In Sect. 6, we present the experimental results, whose parameter sets are ees1087ep1, ees1171ep1, ees1499ep1 respectively. In Sect. 7, we conclude the paper.

2 Preliminaries

In this section, we provide some basic preliminaries of NTRU and NTRUReEncrypt scheme. The operations of both schemes are defined over the quotient ring \(\mathcal {R}=\mathbb {Z}_{q}[x] /\left( x^{N}-1\right) \), where N is an odd prime. Other parameters p, q are integers, where p is much smaller than q and \({\text {gcd}}(p, q)=1\).

The polynomials are selected from four subset of \(\mathcal {R}\), denote as \(\mathcal {L}_{f}=\mathcal {T}_{\left( d_{f}, d_{f}-1\right) }\), \(\mathcal {L}_{g}=\mathcal {T}_{\left( d_{g}, d_{g}\right) }\), \(\mathcal {L}_{r}=\mathcal {T}_{\left( d_{r}, d_{r}\right) }\),

$$\mathcal {L}_{m}=\left\{ m \in \mathcal {R}: \text{ every } \text{ coefficient } \text{ of } m \text{ lies } \text{ between } -\frac{p-1}{2} \text{ and } \frac{p-1}{2}\right\} .$$

In addition, elements in \(\mathcal {L}_{f}\), \(\mathcal {L}_{g}\), \(\mathcal {L}_{r}\) are ternary polynomials. We introduce the definition of ternary polynomial from PQCrypto 2019 [11].

Definition 1

A ternary polynomial \(\mathcal {T}\) with positive integers \(d_1\), \(d_2\) is defined as:

$$ \mathcal {T}_{\left( d_{1}, d_{2}\right) }=\left\{ \begin{array}{r} \text{ trinary } \text{ polynomials } \text{ of } \mathcal {R} \text{ with } d_{1} \text{ entries } \\ \text{ equal } \text{ to } 1 \text{ and } d_{2} \text{ entries } \text{ equal } \text{ to } -1 \end{array}\right\} . $$

2.1 Vector and Matrix Forms of NTRU

A polynomial \(f \in \mathcal {R}\) in NTRU can be presented as \(f=\sum _{i=0}^{N-1} f_{i} x^{i}.\) Its vector form can be presented as \( \textbf{f}=\left( f_{0}, f_{1}, \cdots , f_{N-1}\right) ^{T}. \) The polynomial f can be written in the form of a circular matrix \(\textbf{F}\) in \(\mathbb {Z}_{q}^{N \times N}\):

$$ \textbf{F}=\left( \begin{array}{cccc} f_{0} &{} f_{N-1} &{} \ldots &{} f_{1} \\ f_{1} &{} f_{0} &{} \ldots &{} f_{2} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ f_{N-1} &{} f_{N-2} &{} \ldots &{} f_{0} \end{array}\right) $$

Further, the matrix form of multiplication of two polynomials \(f, g \in \mathcal {R}\) can be presented as:

$$ \left( \begin{array}{cccc} f_{0} &{} f_{N-1} &{} \cdots &{} f_{1} \\ f_{1} &{} f_{0} &{} \cdots &{} f_{2} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ f_{N-1} &{} f_{N-2} &{} \cdots &{} f_{0} \end{array}\right) \left( \begin{array}{c} g_{0} \\ g_{1} \\ \vdots \\ g_{N-1} \end{array}\right) . $$

As needed, there are the following fundamental lemmas [9]:

Lemma 1

If \(\textbf{H} \in \mathbb {Z}_{q}^{N \times N}\) is a circular matrix over \(\mathbb {Z}_{q}^{N \times N}\), then \(\textbf{H}^{T}\) is also a circular matrix over \(\mathbb {Z}_{q}^{N \times N}\).

Lemma 2

If \(\textbf{G}, \textbf{H} \in \mathbb {Z}_{q}^{N \times N}\) are circular matrices, then \(\textbf{GH}\) is also a circular matrix. In particular, \(\textbf{H}^{T} \textbf{H}\) is a symmetric circular matrix.

3 NTRU and Its Proxy Re-encryption Scheme

In this section, we overview the NTRU and NTRU-based proxy re-encryption scheme, called NTRUReEncrypt. Parameters sets are shown in the following table, related to version 3.3 of the EESS#1 specification [2], from IEEE P1363.1 standard. For ees1087ep1, ees1171ep1, ees1499ep1, the private keys are fg selected from \(\mathcal {L}_{f}=\mathcal {T}_{\left( d_{f}, d_{f}-1\right) }\) and \(\mathcal {L}_{g}=\mathcal {T}_{\left( d_{g}, d_{g}\right) }\) respectively, the set of small polynomials is \(\mathcal {L}_{r}=\mathcal {T}_{\left( d_{r}, d_{r}\right) }\), where small means that the coefficients of the polynomials are small.

Table 2. Instance of polynomial sets

In practice, some variants of NTRU take the following approach to generating key f for efficiency: f have the form of \(1+p * F\) with \(F \in \mathcal {T}_{\left( d_{f}, d_{f}\right) }\), first generate \(F \in \mathcal {T}_{\left( d_{f}, d_{f}\right) }\), and then calculate \(1+p * F\) to obtain f. We would use this form throughout the rest of the paper.

3.1 NTRU Cryptosystem

The brief description of NTRU cryptosystem is as follows, see [7] for more details.

  • Key Generation: Randomly chooses \(F \in \mathcal {T}_{\left( d_{f}, d_{f}\right) }\), then calculate \(1+p * F\) to obtain f, where f has inverse \(f_{p}^{-1}\), \(f_{q}^{-1}\) in \(R_{p}\), \(R_{q}\), then randomly chooses \(g \in \mathcal {T}_{\left( d_{g}, d_{g}\right) }\). Outputs public key \(pk=h=p * g * f_{q}^{-1} \bmod q\), private key \(s k=(f, g)\).

  • Encryption: To encrypt a plaintext \(m \in \mathcal {L}_{m}\), randomly chooses \(r \in \mathcal {L}_{r}\). Outputs ciphertext \(c=h * r + m \bmod q\).

  • Decryption: To decrypt a ciphertext c, receiver uses private key f and computes \(a = f * c \bmod q\) such that coefficients of a are all lie between \((-q/2,q/2]\). Outputs plaintext \(m=a * f_{p}^{-1} \bmod p\).

Note that f, g, s, m are small, i.e. each of its coefficients is small, then all coefficients of \(a=c * f=p * g * s+m * f \bmod q\) lie in \((-q/2,q/2]\) with high probability. Thus, one computes \(a=c * f \bmod q\) turns to \(a=c * f\) over \(\mathbb {Z}\). Then can decrypt the message:

$$ a * f_{p}^{-1}=p * g * s * f_{p}^{-1}+m * f * f_{p}^{-1}=m \ \bmod p. $$

3.2 NTRUReEncrypt

NTRUReEncrypt is a NTRU-based proxy re-encryption scheme, all parameter sets are related to formal NTRU scheme. Its initial key generation and first encryption stage are consistent with NTRU encryption, at second re-encryption stage, algorithm selects the same set of polynomials as NTRU. NTRUReEncrypt has a unique re-encrypt key generation, which ensures that Bob can decrypt the re-encrypted ciphertext sent from proxy.

The flow of the algorithm is as follows:

  • Key Generation: Key generation algorithm is the same as that in NTRU. Outputs a pair of public and secret keys \(\left( p k_{A}, s k_{A}\right) \) for Alice, where \(s k_{A}=\left( f_{A}, g_{A}\right) \) and \(p k_{A}=h_{A}\), and Bob also obtains a public-private key pair in the same way.

  • Re-encrypt Key Generation: The algorithm requires two private keys \(s k_{A}\) and \(s k_{B}\), from sender Alice and receiver Bob respectively. Outputs re-encrypt key \(r k_{A \rightarrow B}=f_{A} * f_{B}^{-1} \bmod q\). The re-encryption key can be computed by a simple three-party protocol below:

    1. 1.

      Alice selects \(t \in \mathcal {R}_{q}\), sends \(t * f_{A} \bmod q\) to Bob and t to proxy;

    2. 2.

      Bob sends \(t * f_{A} * f_{B}^{-1} \bmod q\) to proxy;

    3. 3.

      Proxy computes \(r k_{A \rightarrow B}=f_{A} * f_{B}^{-1} \bmod q.\)

  • Encryption: Alice encrypts a plaintext \(m \in \mathcal {L}_{m}\), randomly chooses \(r \in \mathcal {L}_{r}\). Outputs ciphertext \(C_A=h_A * r + m \bmod q\).

  • Re-encryption: Proxy encrypts a ciphertext \(C_A\) sent by Alice, randomly chooses \(e \in \mathcal {L}_{r}\). Outputs ciphertext \(C_B=C_A * r k_{A \rightarrow B} + p*e \bmod q\).

  • Decryption: Bob decrypts a ciphertext \(C_B\), uses private key \(f_B\) and compute

    $$C_B*f_B = p * g_{A} * r+m * f_{A}+p * e * f_{B} \ \bmod q$$

    such that coefficients of \(C_B*f_B\) are all lie between \((-q/2,q/2]\). Outputs plaintext \(m=C_B*f_B \bmod p\).

Decryption stage is similar to previous NTRU decryption, see [13] for more details.

4 Key Recovery Attack Against NTRUReEncrypt

In this section, we propose an efficient key recovery attack by only collecting ciphertexts \(C_A\) and \(C_B\) based on the algorithm of Li et al. [10]. They proposed a broadcast attack against NTRU only to recover messages at AsiaCCS 2015, however in NTRUReEncrypt, we find out that the re-encryption key \(r k_{A \rightarrow B}\) can be recovered from the proxy’s re-encryption stage, then a malicious receiver (sender) can directly recover the private key of the other one.

4.1 Construction of Equations

We now recall the re-encryption stage, proxy encrypts a ciphertext \(C_A\) sent by Alice, randomly chooses\(e \in \mathcal {L}_{r}\). Outputs ciphertext

$$\begin{aligned} C_B=C_A * r k_{A \rightarrow B} + p*e \ \bmod q. \end{aligned}$$
(4.1)

For convenience, we denote \(\textbf{e}\), \(\mathbf {c_B}\), \(\mathbf {\lambda }\) as their vector form in lowercase, and denote \(\mathbf {C_A}\) as its matrix form in uppercase, then write Eq. (4.1) in linear form:

$$ p\textbf{e} = \mathbf {c_B}-\mathbf {C_A}\mathbf {\lambda } \ \bmod q, $$

where \(\mathbf {\lambda }\) is the vector form of re-encryption key \(r k_{A \rightarrow B}\).

Then, do the inner product of both sides of the equation:

$$ ({p\textbf{e}})^{T}(p\textbf{e}) = (\mathbf {c_B}-\mathbf {C_A}\mathbf {\lambda })^{T}(\mathbf {c_B}-\mathbf {C_A}\mathbf {\lambda }) \ \bmod q. $$

Note that \(p=3\) and secret polynomial e selected in set \(\mathcal {L}_{r}\), the numbers of \(+1\) and \(-1\) in their coefficients are \(d_r\), thus \(({p\textbf{e}})^{T}(p\textbf{e})=2d_{r}p^2\) is a constant, denote as d.

We can get

$$\begin{aligned} d-\mathbf {c_B}^{T}\mathbf {c_B}= \mathbf {\lambda }^{T}\mathbf {C_A}^{T}\mathbf {C_A}\mathbf {\lambda } -2\mathbf {c_B}^{T}\mathbf {C_A}\mathbf {\lambda }\ \bmod q. \end{aligned}$$
(4.2)

4.2 Linearization

For convenience, let \(d-\mathbf {c_B}^{T}\mathbf {c_B}=u\), \(\mathbf {c_B}^{T}\mathbf {C_A}=(k_0,k_1,\cdots ,k_{N-1})\), and

$$ \mathbf {C_A}^{T}\mathbf {C_A}=\left( \begin{array}{cccc} c_{0} &{} c_{N-1} &{} \ldots &{} c_{1} \\ c_{1} &{} c_{0} &{} \ldots &{} c_{2} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ c_{N-1} &{} c_{N-2} &{} \cdots &{} c_{0} \end{array}\right) . $$

From Lemma 2.2, \(\mathbf {C_A}^{T}\mathbf {C_A}\) is a symmetric circular matrix, where \(c_{i}=c_{N-i}\), for \(i \in \{0,1, \cdots , N-1\}\). Then expanding Eq. (4.2), we can get

$$\begin{aligned} \begin{aligned}&u=c_{0}\left( \lambda _{0}^{2}+\lambda _{1}^{2}+\cdots +\lambda _{N-1}^{2}\right) \\&\ \ +c_{1}\left( \lambda _{1} \lambda _{0}+\lambda _{2} \lambda _{1}+\cdots +\lambda _{0} \lambda _{N-1}\right) \\&\ \ +\cdots \cdots \\&\ \ +c_{N-1}\left( \lambda _{N-1} \lambda _{0}+\lambda _{0} \lambda _{1}+\cdots +\lambda _{N-2} \lambda _{N-1}\right) \\&\ \ -2 k_{0} \lambda _{0}-2 k_{1} \lambda _{1}-\cdots -2 k_{N-1} \lambda _{N-1} \ \bmod q \end{aligned} \end{aligned}$$
(4.3)

Note that when choosing a specific parameter N, vector \(\mathbf {\lambda }=(\lambda _0,\lambda _1,\cdots ,\lambda _{N-1})\) has N unknown components. After the inner product operation, it generates \(O(N^2)\) new monomials \(\lambda _{i}\lambda _{j}\), for \(0 \le i \le j \le N-1\).

A trivial idea is to linearize these variables to \(O(N^2)\) one-order monomials, denoted as \(\textbf{x} = (x_0,x_1,\cdots ,x_{O(N^2)-1})\). Then Eq. (4.3) turns to a congruence equation with \(O(N^2+N)\) variables, thus \(\lambda _i\) can be recovered by collecting \(O(N^2)\) equations in time \(O(N^6)\) by Gaussian elimination. In certain parameter sets defined by NTRUReEncrypt, the size of N generally amounts to \(10^3\), which means the system of linear equations with around \(10^6\) variables and it is hard to implement in practice.

To reduce the number of variables, let

$$ x_{i}=\lambda _{i} \lambda _{0}+\lambda _{i+1} \lambda _{1}+\cdots +\lambda _{N-1} \lambda _{N-i-1}+\lambda _{0} \lambda _{N-i}+\cdots +\lambda _{i-1} \lambda _{N-1},$$

for \(i=0,1, \cdots , N-1\). In the parameter sets we attacked, N is an odd prime. Note that \(c_{i}=c_{N-i}\) and \(x_{i}=x_{N-i}\) for \(i=0,1, \cdots , N-1\), the Eq. (4.3) is equivalent to

$$\begin{aligned} \begin{aligned}&u = c_{0} x_{0}+2 c_{1} x_{1}+\cdots +2 c_{\left[ \frac{N}{2}\right] } x_{\left[ \frac{N}{2}\right] } \\&\ \ -2 k_{0} \lambda _{0}-2 k_{1} \lambda _{1}-\cdots -2 k_{N-1} \lambda _{N-1} \ \bmod \ q, \end{aligned} \end{aligned}$$
(4.4)

where q is a power of 2 denoted as \(q = 2^\gamma \), \(\gamma \) is a positive integer. Further, assuming that \(c_0, u\) are even, the equation could be converted to

$$\begin{aligned} \begin{aligned}&\frac{1}{2}u = \frac{1}{2}c_{0} x_{0}+c_{1} x_{1}+\cdots +c_{\left[ \frac{N}{2}\right] } x_{\left[ \frac{N}{2}\right] } \\&\ \ \ \ -k_{0} \lambda _{0}-k_{1} \lambda _{1}-\cdots -k_{N-1} \lambda _{N-1} \ \bmod \ 2^{\gamma -1}. \end{aligned} \end{aligned}$$
(4.5)

Notice that we can get one congruence Eq. (4.4) with \((N+[\frac{N}{2}]+1)\) variables by collecting \(C_A\) and \(C_B\) through one legal communication, so we could collect a series of samples by communicating relevant times. In fact through experiment, we could always select enough equations in the form of (4.5) by choosing these samples, and the number of samples is \(O(N+[\frac{N}{2}])\), which is related to the number of variables.

4.3 Solving the System of Linear Congruence Equations

Denote n as the number of variables and \(n=N+\left[ \frac{N}{2}\right] +1\), then we build a linear system \(\textbf{L} \times \textbf{X}=\textbf{S} \bmod 2^{\gamma -1}\) by collecting \(n+l\) equations from Eq. (4.5), where l is a positive integer, the vector \(\textbf{X} =(x_0,x_1,\cdots ,x_{\left[ \frac{N}{2}\right] },\lambda _0,\lambda _1,\cdots ,\lambda _{N-1})^T\), the row of the matrix \(\textbf{L}\) corresponds to (4.5) equals

$$ (\frac{1}{2}c_0,c_1,\cdots ,c_{\left[ \frac{N}{2}\right] },-k_0,-k_1,\cdots ,-k_{N-1})^T, $$

and \(\textbf{S}\) is the column vector related to \(\frac{1}{2}u\). For the sake of efficiency, we choose to apply our algorithm to work over the finite field \(\mathbb {F}_{2}\) but not the ring \(\mathbb {Z}_{2^{\gamma -1}}\), which means that we turn to solve the system of equations \(\textbf{L} \times \textbf{X}=\textbf{S} \bmod 2\). That is, our goal is to find \(r k_{A \rightarrow B} \bmod 2\) not \(r k_{A \rightarrow B} \bmod 2^{\gamma -1}\), and we would show that it is enough for recovering the private key in the next subsection.

Note that the vector \(\textbf{S} \in \mathbb {F}_{2}^{n}\), the matrix \(\textbf{L} \in \mathbb {F}_{2}^{(n+l) \times n}\), we aim to find \(r k_{A \rightarrow B} \bmod 2 = (\lambda _0,\lambda _1,\cdots ,\lambda _{N-1})^T\) by selecting last N bits of \(\textbf{X} \in \mathbb {F}_{2}^{n}\). It is obvious that there is a unique solution is equivalent to the matrix \(\textbf{L}\) is invertible, which means that the rank of \(\textbf{L}\) equals to n. The problem turns to figure out the proportion of the matrices of rank n in \(\textbf{L} \in \mathbb {F}_{2}^{(n+l) \times n}\). Li et al. [10] gave the following result estimating the proportion of invertible matrices in finite field among all matrices:

Theorem 1

Let \(\mathbb {F}_{q}\) be the finite field with q elements, where q is a prime power. The proportion of matrices of rank n in the set of \((n+l) \times n\) matrices with entries in \(\mathbb {F}_{q}\) is equal to:

$$ \prod _{k=l+1}^{n+l}\left( 1-q^{-k}\right) , \quad l=0,1,2,\cdots . $$

According to the theorem above, we give the proportion of the matrices of rank n in \(\mathbb {F}_{q}\) in Table 3 blow. It implies that if l grows, the probability that the random matrix \(\textbf{L}\) is invertible is also increasing. In the case of our attack, \(q=2\), \(l=4\), and the random matrix \(\textbf{L}\) is invertible with high probability.

Table 3. The proportion of the matrices of \({\text {rank}} n\) in \(\textbf{L} \in \mathbb {F}_{q}^{(n+l) \times n}\)

For any ciphertext pair \((C_A,C_B)\) in Eq. (4.1), we could always get \(C_B(1) = C_A(1)r k_{A \rightarrow B}(1) \bmod q\), which also holds on \(\mathbb {F}_2\). Specifically, we could obtain a new equation:

$$ C_B(1) = C_A(1)(\lambda _0 + \lambda _1 + \cdots + \lambda _{N-1}) \ \bmod 2, $$

where \(C_A(1),C_B(1)\) are fixed number. Adding this equation to the system of linear equations that we seek to solve, and now we can take \(l=3\) to implement our attack. Since the number of variables is \(n=N+\left[ \frac{N}{2}\right] +1\), \(l=3\), thus we can construct a system of linear equations with the number of equations \(n+l+1= N+\left[ \frac{N}{2}\right] +5\), which could be solved to obtain a unique solution in time \(O(N^3)\) using Gaussian elimination. Compared to running on \(\mathbb {Z}_{1024}\), our algorithm requires significantly less time to run on \(\mathbb {F}_{2}\), just a few seconds.

4.4 Recovering Private Keys

In Sect. 4.3, we have obtained the re-encryption key \(r k_{A \rightarrow B} \bmod 2\). Now, we discuss how to recover the private key in this subsection. First, we recover the position of 0 bits of the private key pair (fg) by means of the obtained \(r k_{A \rightarrow B} \bmod 2\), and then reveal the remaining bits of the private key f by solving a system of linear equations.

Since \(r k_{A \rightarrow B}=f_{A} * f_{B}^{-1} \bmod q\), we have that \(r k_{A \rightarrow B}=f_{A} * f_{B}^{-1} \bmod 2\). If one party to the communication obtains \(r k_{A \rightarrow B} \bmod 2\), then can immediately calculate the other party’s private key in the sense of modulo 2. Now we design a roadmap to show how to recover the private keys. For the sake of description, we assume that Bob is the malicious party, who knows \(r k_{A \rightarrow B}\bmod 2\) and the private key \(f_B\):

\({\textbf {Step 1.}}\) Considering \(f_{A} = f_{B} * r k_{A \rightarrow B}\bmod 2\). Since \(f_A = 1 + p*F\) with \(F \in \mathcal {L}_{\left( d_{f}, d_{f}\right) }\) and \(p=3\), we get \(p*F = f_{B} * r k_{A \rightarrow B} - 1 \bmod 2\). Note that there are \(d_{f}\) \(+1\)’s, \(d_{f}\) \(-1\)’s and \((N-2d_f)\) 0’s in the coefficients of F, so the position of 0 bits of F can be determined by counting the position of the 0 bits of \(f_{B} * r k_{A \rightarrow B} - 1 \bmod 2\), where the number of 0 bits of F is \(N-2d_{f}\). It means that we can also get the position of the 0 bits of \(f_A\).

\({\textbf {Step 2.}}\) Since the public key \(h_A =p * g_A * f_{A}^{-1} \bmod q\) with \(g_A \in \mathcal {L}_{\left( d_{g}, d_{g}\right) }\) holds, \(h_A =p * g_A * f_{A}^{-1} \bmod 2\) is also satisfied, where the coefficients of \(g_A\) have \(d_{g}\) \(+1\)’s and \(d_{g}\) \(-1\)’s, \((N-2d_g)\) 0’s. Based on \(p * g_A = h_A * f_A \bmod 2\), the position of the 0 bits of \(g_A\) can be determined by counting the position of 0 bits of \(h_A * f_A \bmod 2\), where the number of 0 bits is \(N-2d_{g}\).

\({\textbf {Step 3.}}\) Plugging \(f_A = 1 + p*F\) into \(h_A * f_A = p * g_A \bmod q\), we get \(h_A * (1 + p*F) = p * g_A \bmod q\), which is equivalent to the equation

$$\begin{aligned} p * h_A * F= p * g_A - h_A \bmod q. \end{aligned}$$
(4.6)

For convenience, we denote \(\textbf{f}\), \(\textbf{g}\), \(\textbf{h}\) as the vector form of F, \({g_A}\), \({h_A}\), and \(\mathbf {H_A}\) as the matrix form of \({h_A}\). The Eq. (4.6) can be rewritten as the following linear form:

$$ p\mathbf {H_A}\textbf{f}= p\textbf{g} - \textbf{h} \ \bmod q. $$

That is,

$$\begin{aligned} p \cdot \mathbf {H_A}\left( \begin{array}{c} {f}_{0} \\ {f}_{1} \\ \vdots \\ {f}_{N-1} \end{array}\right) = p \cdot \left( \begin{array}{c} g_{0} \\ g_{1} \\ \vdots \\ g_{N-1} \end{array}\right) - \left( \begin{array}{c} h_{0} \\ h_{1} \\ \vdots \\ h_{N-1} \end{array}\right) \ \bmod q, \end{aligned}$$
(4.7)

where \(\mathbf {H_A}\) is a \(N \times N\) matrix. Considering the 2N variables (\(\textbf{f}\) and \(\textbf{g}\)) of Eq. (4.7), there are \(N-2d_f\) and \(N-2d_g\) known in \(\textbf{f}\) and \(\textbf{g}\) respectively. Hence, the number of unknown variables is \(2d_f + 2d_g\), whereas the number of equations is N. According to Table 2, N is larger than \(2d_f + 2d_g\) (e.g. in ees1171ep1, \(N=1171\), \(d_g = 390\), \(d_f=106\), the number of equations \(N=1171\) is larger than the number of variables \(2d_f + 2d_g = 992\)). Hence we can recover the remaining bits of \(\textbf{f}\) by solving the system of linear equations using Gaussian elimination, then recover all bits of \(f_A\) which is Alice’s private key.

At PQCrypto 2019, Liu, Pan, and Zhang [11] proposed a key recovery attack based on statistical methods, malicious receiver Bob needed huge amount of ciphertexts \(C_{B_i}\) encrypted by the same plaintext m, which is illegal and hard to implement. Here are the approximate number of ciphertexts in Table 4.

Table 4. Comparison of our work with PQCrypto 2019

Remark. The cryptanalysis proposed by PQCrypto 2019 is based on decryption failure and statistical analysis, both require huge amount of ciphertexts and the chosen plaintexts. Moreover, the ciphertexts in latter case should be encrypted by the same plaintext. Unlike the previous ones, our attack has two advantages: (1) The amount of ciphertext required is greatly reduced. (2) There are no restrictions on plaintext, our attack only needs to be done in legal communication.

5 Case of NTRU Scheme with Different Parameter Sets

In this section, we discuss other schemes of NTRU with different parameter sets instantiated to the NTRUReEncrypt. We divide them into two cases, one with a constant number of +1, −1 (if any) coefficients of the secret polynomial selected in \(\mathcal {L}_{r}\), in which case we can still attack with the same method as in the previous section, and the other with the NTRU schemes in the third round of NIST-PQC competition, which have a variable number of +1, −1 (if any) coefficients of the secret polynomial selected in \(\mathcal {L}_{r}\), and we analyze this case by a new trick.

5.1 Case of Certain Secret Polynomial Coefficients

For the case of certain secret polynomial coefficients, [12] summarised some instantiations of NTRU, and their specific parameter sets are listed in the table below, where \(\mathcal {B}\) denotes the set of all polynomials with binary coefficients, \(\mathcal {B}\left( d\right) \) denotes a subset of \(\mathcal {B}\) with exactly d coefficients equal 1, \(\mathcal {L}_m\) denotes the polynomial set whose coefficients lying between \(-\frac{1}{2}(p-1)\) and \(\frac{1}{2}(p-1)\) (Table 5).

Table 5. Some instantiations of NTRU

One can check that, as for the secret polynomial e selected from \(\mathcal {L}_{r}\) in these schemes, the inner product of its coefficient vectors is a constant. Then we can use the method proposed in Sect. 4 to recover the private keys.

5.2 Case of Uncertain Secret Polynomial Coefficients

We now discuss the case in the third round of NIST-PQC competition, such as NTRU-HPS, NTRU-HRSS [5], whose parameter sets are instantiated to the NTRUReEncrypt. For specific parameter sets in ees1087ep1, ees1171ep1, ees1499-ep1, our attack’s point is that the secret polynomial e selected in set \(\mathcal {L}_{r}\), whose coefficients have a certain number of \(+1\), \(-1\), and 0.

However, in NTRU-HPS and NTRU-HRSS, polynomial set \(\mathcal {L}_{r}=\mathcal {T}\) and \(\mathcal {T}\) is the set of non-zero ternary polynomials of degree at most \(N-2\). It indicates that we no longer have information on the number of coefficients in the secret polynomial e, thus the inner product calculation would fail. Ding et al. [6] used the property \(\textbf{e}_i = {\textbf{e}_i}^3\), for \(i \in \{0,1, \cdots , N-1\}\) in the broadcast attack against NTRU to recover plantexts, it could also be used in this case to recover the secret keys.

Separating p from Eq. (4.1) and write it in linear form, we can get

$$ \textbf{e} = (\mathbf {C_B}-\mathbf {C_A}\textbf{r}) * p^{-1}\ \bmod q. $$

Since \(\textbf{e}_i = {\textbf{e}_i}^3\), so we can get equations that eliminates \(\textbf{e}\):

$$\begin{aligned}{}[(\mathbf {C_B}-\mathbf {C_A}\textbf{r}) * p^{-1}]_i = [(\mathbf {C_B}-\mathbf {C_A}\textbf{r}) * p^{-1}]_i^3 \ \bmod q, \end{aligned}$$
(5.1)

for \(i \in \{0,1, \cdots , N-1\}\). Note that in Eq. (5.1) only \(\textbf{r}\) is the unknown variable, cubic computation generates \(O(N^3)\) new monomials, and we can also linearize these monomials into new variables. Since one legal communication produces N equations, the system of linear congruence equations can be built by communicating \(N^2\) times, thus recover \(\textbf{r}\) in time \(O(N^9)\). The following table is the comparison of parameter sets between EESS#1 and NTRU-Round3, where NTRU-HPS is the same as NTRU-HRSS (Table 6).

Table 6. Comparison of EESS#1 with NTRU-Round3

6 Experiments

In this section, we present experimental results on the assumption that Bob is a malicious receiver. Due to ciphertexts \(C_A\) could be collected on the public channel and \(C_B\) could be received normally by Bob, we assumed in our experiment that the attacker could collect enough ciphertext pairs \((C_A,C_B)\). All experiments were performed in SageMath 9.6 on a macOS Monterey 12.5.1 system with Apple M1 CPU @ 3.2GHz, 8GB RAM, and our implement was available at https://github.com/s4lTea/NTRUReEncrypt_Attack. We implemented our attack against NTRUReEncrypt scheme, whose parameter sets defined by EESS#1 are the same as those from AsiaCCS 2015 [13] and PQCrypto 2019 [11]. We performed our attack 50 times for each instance, and gave the average number of communications and running time required by the algorithm. In our experimental results, let \(n = N+\left[ \frac{N}{2}\right] +1 \), we could always find a matrix \(\textbf{L}\) of rank n. We splited the algorithm into 3 steps:

  1. 1)

    Focusing on the proxy’s re-encryption stage, then generate a system of linear congruence equations with \(n+4\) equations and n variables by communicating enough times.

  2. 2)

    Solving it on \(\mathbb {F}_2\) using Gaussian elimination to obtain re-encryption key \(r k_{A \rightarrow B} \bmod 2\).

  3. 3)

    Building another system of linear congruence equations with N equations and \(2d_f+2d_g\) variables to solve, finally obtain Alice’s private key.

Table 7. Experimental Results with different parameter sets

Step 1 takes some time (minutes) due to matrix multiplication operations. As it works on \(\mathbb {F}_2\), so step 2 takes only a few seconds and the running time could be negligible. There are small number of variables related to the equations in step 3, so the time required to either construct or solve the equations is negligible. The experimental results are shown in Table 7. For ease of description, we take the cost of step 1 as the total time of our algorithm.

7 Conclusion

In this paper, we presented an efficient key recovery attack against NTRUReEncrypt scheme, whose parameter sets are defined by EESS#1 specification [2] from IEEE P1363.1 standard. The attack is based on a special structure of secret polynomials from the set \(\mathcal {L}_{r}\). In addition, the key recovery attack could be extended to the NTRUReEncrypt instantiated with the NTRU parameter sets in the third round of NIST-PQC competition.