Keywords

1 Introduction

It is well known that classical cryptography is vulnerable to quantum computers since Shor’s algorithm [21] will solve the integer factorization and the logarithm discrete problems efficiently. This has motivated the development of post-quantum cryptography, especially lattice-based cryptosystems. In general, the security of lattice-based cryptosystems is always related to some hard computational problems in lattices, such as the Shortest Vector Problem (SVP) and the Closest Vector Problem (CVP).

As important cryptographic primitives, several lattice-based digital signature schemes have been proposed in recent years, such as [5, 8,9,10, 19]. In 1997, Goldreich et al. [9] proposed the GGH signature scheme based on lattices, whose security is related to the hardness of approximate CVP. In fact, GGH is not only a concrete signature scheme, but also a general framework to construct lattice-based digital signature schemes. The GGH framework consists of a good lattice basis \(\varvec{G}\), a bad basis \(\varvec{B}\) for the same lattice and a reduction algorithm as the signing algorithm. Usually, the good basis is used as the secret key, with which the reduction algorithm can efficiently output an approximation for the closest vector of a target vector corresponding to the message. Such approximation is the signature of the message. The bad basis is published as the public key, with which one can check if the signature is in the lattice and close enough to the target vector. In GGH scheme, they used a nearly orthogonal basis \(\varvec{G}\) as the good basis, a random basis as the bad basis \(\varvec{B}\), and Babai’s rounding-off algorithm [2] as the reduction algorithm.

Based on GGH framework, Hoffstein et al. [11] presented the NTRUSign as a more efficient lattice-based signature scheme. They used some special short basis as a good basis, a “random” basis as the bad basis \(\varvec{B}\), and Babai’s rounding-off algorithm as the reduction algorithm.

However, Nguyen and Regev [18] proposed a clever method to recover the secret key of the GGH signature scheme and NTRUSign by studying the parallelepiped of the lattice. More precisely, by collecting enough message-signature pairs, they can obtain many samples uniformly distributed in the parallelepiped due to Babai’s rounding-off algorithm employed as reduction algorithm in this two signature schemes. Then with these samples, they can finally recover the parallelepiped which leaks the good basis. They also pointed out that even taking Babai’s nearest plane algorithm [2] as the signing algorithm, these two schemes are still insecure. Later, Ducas and Nguyen [7] proposed some method to analyze some countermeasures against the Nguyen-Regev attack.

By the Nguyen-Regev attack, it seems that the security of GGH type signature schemes depends heavily on the reduction algorithms. To resist such attack, at least two different reduction algorithms have been proposed. In 2008, Gentry et al. [8] presented a Gaussian sample algorithm similar to [12]. Based on such a random vector-sampling algorithm, Gentry, Peikert and Vaikuntanathan constructed a signature scheme, with a short trap-door basis as the private key and a long basis as the public key. Since the lattice vectors outputted by the new sampling algorithm do not reveal the trap-door, the signature scheme of Gentry, Peikert and Vaikuntanathan can be proved to be secure under the chosen message attack (CMA).

In 2008, Plantard et al. [19] proposed another signature scheme at PKC’08 to resist the Nguyen-Regev attack. They employed a special type of lattices as the good basis which has a basis that can be written into the sum of a diagonal matrix and a ternary random matrix. With such a basis, they proposed a reduction algorithm to reduce any vector into a small cube. Since the cube is public and it seems hard to recover the private basis from the cube, the authors claimed that their scheme can resist the Nguyen-Regev attack well.

As pointed out by Plantard, Susilo, and Win, since their reduction algorithm is deterministic, the scheme may suffer some potential side channel attacks. To make the scheme more secure, they modified their reduction algorithm to be randomized.

In this paper, we show that the randomized version of the PSW signature scheme is insecure under the CMA model. Simply speaking, note that when we query the signing oracle with the single message \(\varvec{m}\) for many times, we will usually obtain different signature vectors \(\varvec{w}_1,\varvec{w}_2,\cdots ,\varvec{w}_k\) with \(k\ge 2\). Denote by \(\mathcal {H}(\varvec{m})\) the hash vector of the message \(\varvec{m}\). Note that, in the PSW scheme, the difference \(\varvec{w}_i-\mathcal {H}(\varvec{m})\), \(1\le i\le k\) are all in the given lattice. It is easy to see that \(\varvec{w}_i-\varvec{w}_j\), \(1\le i<j\le k\) are all in the lattice. Note that each signature \(\varvec{w}_i\) is contained in a relatively small cube, then their difference vectors \(\varvec{w}_i-\varvec{w}_j\) are relatively short. Once we obtain many such difference vectors, the \(\mathbb {Z}\)-linear combinations of these vectors will span the given lattice with high probability. By using the lattice reduction algorithms such as LLL [13] and BKZ [4, 20] to these short difference vectors, we could obtain a much shorter basis, which may leak the good basis in this signature scheme. In fact, we find that for dimension less than 400, BKZ-20 will recover all or partial rows of the good basis in our experiments.

To fix the randomized version of the PSW signature scheme, we will give two methods as presented in [8]. The first method is to store the message-signature pairs locally. When signing a message, we first check whether the message is in storage or not. If the message is in storage, we output the stored corresponding signature, otherwise, we apply the randomized reduction algorithm to generate a signature. The second method is using the randomized reduction algorithm to generate the signature for the hash value of a message and some additional random number instead of the hash value of just the message.

Roadmap. The remainder of the paper is organized as follows. First we present some notations and preliminaries on lattices and hard problems in Sect. 2. Then we describe the Plantard, Susilo, and Win signature scheme in Sect. 3. Finally we describe our attacks and some experimental results in detail in Sect. 4, and some strategies to fix the randomized version of PSW signature scheme are discussed in Sect. 5.

2 Preliminaries

Denote by \(\mathbb {R}\), \(\mathbb {Z}\) the real number field and the integer ring respectively. For a vector \(\varvec{v}=(v_1,v_2,\cdots ,v_n)\in \mathbb {R}^n\), denote by \(v_i\) its i-th component and denote by \(\Vert \varvec{v}\Vert =\sqrt{v_1^2+v_2^2+\dots +v_n^2}\) its length.

2.1 Lattices

A lattice \(\varLambda \) is a discrete subgroup of \(\mathbb {R}^n\). Equivalently, a lattice is a \(\mathbb {Z}\)-linear combinations of m linearly independent vectors in \(\mathbb {R}^n\). The set of these linearly independent vectors is called a basis of \(\varLambda \). Given a matrix \(\varvec{B}\in \mathbb {Z}^{m\times n}\), we denote by \(\varLambda (\varvec{B})\) the lattice spanned by the row vectors of \(\varvec{B}\). That is,

$$\varLambda (\varvec{B})=\Big \{\sum _{i=1}^m x_i\varvec{b}_i|x_i\in \mathbb {Z},1\le i\le m\Big \},$$

where \(\varvec{b}_i\) is the i-th row of \(\varvec{B}\). If the rows of \(\varvec{B}\) are linearly independent, we call \(\varvec{B}\) a basis of \(\varLambda (\varvec{B})\). For a basis \(\varvec{B}\), we denote by \(\det {(\varLambda (\varvec{B}))}\) the determinant of the lattice \(\varLambda (\varvec{B})\) as \(\sqrt{\det {(\varvec{B}\varvec{B}^T)}}\).

A lattice \(\varLambda (\varvec{B})\) may have many bases. If \(\varvec{B}\) is a nonsingular square matrix with all entries in \(\mathbb {Z}\), then \(\varLambda (\varvec{B})\) has a special basis in Hermite Normal Form. In general, a nonsingular square matrix \(\varvec{H}=(h_{ij})\in \mathbb {Z}^{n\times n}\) is in Hermite Normal Form if

  1. (1)

    \(h_{ij}=0\) for \(1\le j<i\le n\);

  2. (2)

    \(h_{ii}>0\) for \(1\le i\le n\);

  3. (3)

    \(0\le h_{ij}<h_{jj}\) for \(1\le i<j\le n\).

Hermite Normal Form of any integer matrix can be computed in polynomial time, and Micciancio [15] suggested publishing the Hermite Normal Form as the public key which will improve the security of some lattice-based cryptosystems.

2.2 Lattice Problems and Algorithms

In lattice theory, the Shortest Vector Problem (SVP) and the Closest Vector Problem (CVP) are two famous computational problems which have been proved to be NP-hard [1, 3]. Given a lattice basis \(\varvec{B}\in \mathbb {Z}^{m\times n}\), the shortest vector problem aims to find a nonzero shortest vector in \(\varLambda (\varvec{B})\), and the closest vector problem aims to find the closest vector to a target vector \(\varvec{t}\in \mathbb {Z}^n\). We denote by \(\lambda _1(\varLambda (\varvec{B}))\) the length of the shortest nonzero lattice vectors in the lattice \(\varLambda (\varvec{B})\).

The approximation versions of SVP and CVP are usually used to evaluate the security for lattice-based schemes. For the approximation of SVP, we need to find a lattice vector \(\varvec{v}\) such that \(\Vert \varvec{v}\Vert \le \gamma \lambda _1\), and for the approximation of CVP, our aim is to find a lattice vector \(\varvec{w}\) satisfying \(\Vert \varvec{w}-\varvec{t}\Vert \le \gamma \min _{\varvec{v}\in \varLambda (\varvec{B})}\Vert \varvec{v}-\varvec{t}\Vert \) with \(\gamma \ge 1\).

Some polynomial-time algorithms have been presented to solve approximate SVP and approximate CVP with exponentially large factor \(\gamma \), such as LLL [13], BKZ [4, 20] for the approximate SVP and Babai’s nearest plane algorithm [2] for approximate CVP.

LLL algorithm is a polynomial-time lattice reduction algorithm which was presented in [13]. An important property of this algorithm is the output vectors are relatively short. Furthermore, in practice, the output of LLL algorithm is much better than the theoretical analysis.

Blockwise Korkine-Zolotarev (BKZ) algorithm [4, 20] is also a widely used lattice reduction algorithm in the analysis for lattice-based cryptosystems. In general, BKZ algorithm has an additional parameter \(\beta \ge 2\) as the block size. In the process of BKZ algorithm, a subalgorithm which finds the shortest vector of the projective lattice with dimension \(\beta \) is called at each iteration. Generally speaking, BKZ algorithm will cost more time than LLL, but the output will be much shorter than that of LLL when \(\beta \) becomes larger.

3 The PSW Digital Signature Scheme

In PKC’08, Plantard et al. [19] proposed a new digital signature based on CVP\(_\infty \), which was claimed to be a countermeasure against the Nguyen-Regev attack.

3.1 The Original Signature Scheme

The original PSW signature scheme consists of three main steps as the following:

Setup

  1. 1.

    Choose an integer n.

  2. 2.

    Compute a random matrix \(\varvec{M}\in \{-1,0,1\}^{n\times n}\).

  3. 3.

    Compute \(d=\lfloor 2\rho (\varvec{M})+1\rfloor \) and \(\varvec{D}=dI_n\), where \(\rho (\varvec{M})\) is the maximum of the absolute value of the eigenvalues of \(\varvec{M}\).

  4. 4.

    Compute the Hermite Normal Form \(\varvec{H}\) of the basis \(\varvec{D}-\varvec{M}\).

  5. 5.

    The public key is \((\varvec{D},\varvec{H})\), and the secret key is \(\varvec{M}\).

To sign a message \(\varvec{m}\in \{0,1\}^*\), one does the following.

Sign

  1. 1.

    Compute the vector \(\varvec{v}=\mathcal {H}(\varvec{m})\in \mathbb {Z}^n\) where \(\mathcal {H}\) is a hash function which maps \(\varvec{m}\) to \(\{\varvec{x}\in \mathbb {Z}^n| |x_i|<d^2,1\le i\le n\}\).

  2. 2.

    By Algorithm 1, compute \(\varvec{w}\) as the signature of \(\varvec{m}\).

To verify a message-signature pair \((\varvec{m},\varvec{w})\), one does the following.

Verify

  1. 1.

    Check if \(|w_i|<d\), \(1\le i\le n\).

  2. 2.

    Compute the vector \(\mathcal {H}(\varvec{m})\in \mathbb {Z}^n\).

  3. 3.

    Check if the vector \(\mathcal {H}(\varvec{m})-\varvec{w}\) is in the lattice of basis \(\varvec{H}\).

figure a
figure b

3.2 The Randomized Version of PSW Signature Scheme

As pointed out by Plantard, Susilo, and Win, since the reduction algorithm is deterministic, the original PSW scheme may suffer some potential side channel attacks. To resist the potential side channel attacks, they suggest using the following randomized algorithm (Algorithm 2) as the signing algorithm.

4 The Chosen Message Attack Against the Randomized Version of PSW Scheme

4.1 Key Idea of Our Chosen Message Attack

As we can see, in the randomized version of the PSW signature scheme, the signature vectors for the same message may not be unique. Therefore, in the CMA model, if we query the randomized signing oracle with the same message \(\varvec{m}\), we may obtain different signature vectors \(\varvec{w}_1,\varvec{w}_2,\cdots ,\varvec{w}_k\) where \(k\ge 2\). Note that \(\varvec{w}_i-\mathcal {H}(\varvec{m})\), \(1\le i\le k\) are all in the lattice, and so are their difference vectors

$$(\varvec{w}_i-\mathcal {H}(\varvec{m}))-(\varvec{w}_j-\mathcal {H}(\varvec{m})) = \varvec{w}_i-\varvec{w}_j,$$

where \(1\le i\le j\le k\).

Since each component of \(\varvec{w}_i\) is in \((-d,d)\), we know that each component of \(\varvec{w}_i-\varvec{w}_j\) is in \((-2d,2d)\). Since \(d\in \Theta (\sqrt{n})\) as stated in [19], the lattice vectors \(\varvec{w}_i-\varvec{w}_j\)’s are very short.

Once we obtain many such short difference vectors, the \(\mathbb {Z}\)-linear combinations of these vectors will span the lattice \(\varLambda (\varvec{D}-\varvec{M})\). By using the lattice reduction algorithms such as LLL and BKZ to the set of short generators, we expect to obtain a much shorter basis, which may leak the private key.

We present the framework of our attack as the following:

  1. 1.

    Generate some messages \(\varvec{m}_1,\varvec{m}_2,\cdots \) randomly;

  2. 2.

    For any message \(\varvec{m}_j\in \{\varvec{m}_1,\varvec{m}_2,\cdots \}\), querying the signing oracle for several times to obtain many different signatures \(\{\varvec{w}_{j1},\varvec{w}_{j2},\cdots ,\varvec{w}_{jk}\}\) with \(k\ge 2\);

  3. 3.

    Collect enough difference vectors \(\varvec{w}_{ji}-\varvec{w}_{j1}\)’s such that they can span the lattice \(\varLambda (\varvec{D}-\varvec{M})\). Denote by \(\varvec{L}\) the set of these \(\varvec{w}_{ji}-\varvec{w}_{j1}\)’s;

  4. 4.

    Use lattice basis reduction algorithm to \(\varvec{L}\) to output a square matrix \(\varvec{LL}\), and expect to obtain some information about the private key.

4.2 Our Strategy to Collect the Difference Vectors

To collect the difference vectors, we have to decide how many messages we will choose in Step 1 and how many signatures for one message we will query with the oracle in Step 2. Below we give a very simple but efficient strategy, that is, for one message we query as many different signatures as possible and we choose as few messages as possible to satisfy Step 3.

Note that for every message, the signing algorithm (Algorithm 2) will generate at most n different signatures since there are n choices for the index i. Assume there were exactly n different signatures, then it is natural to ask how many times we query the signing oracle to collect all these signatures. Since every signature is uniformly randomly returned by the oracle, by the classical result for Coupon Collector’s Problem [16, 17], it can be easily concluded that the expectation of this number is

$$n(1+\frac{1}{2}+\cdots +\frac{1}{n})=n\ln {n}+\gamma n+\frac{1}{2}+O(\frac{1}{n}),$$

where \(\gamma \approx 0.5772156649\) is the Euler’s constant. Hence, we can query one message for \(\lceil n\log {n}\rceil \) times, and then we know that the probability of collecting all the n signatures is greater than \(1-n^{-\frac{1}{\ln {2}}+1}\) [16, 17]. When \(n\ge 100\), this value is greater than 0.85, which is acceptable.

Therefore, in our attack we query \(\lceil n\log {n}\rceil \) signatures for each message, and choose random messages until we collect enough difference vectors, then applying LLL and BKZ to obtain a short basis for the lattice.

We present the attack as Algorithm 3.

figure c

4.3 Experimental Results

In our experiments, we used SageMath 7.5.1 [23] to implement our attacks, and the LLL’s parameter is set to the default value. For BKZ algorithm, we set the parameter “algorithm” as “NTL” to call the NTL library [22] to implement this algorithm. All experiments were run on a machine with Intel(R) Xeon(R) CPU E5-2620 v4 @2.1 GHz.

We chose the dimension n to be 200, 300, 400, and for any dimension we chose 5 randomized generated instances. For the lattice reduction algorithms, we used LLL algorithm, BKZ-10, and BKZ-20 respectively. The results are listed in Table 1.

We would like to point out a natural attempt to recover the rows of \(\varvec{D}-\varvec{M}\) is by applying lattice basis reduction algorithm on the public key \(\varvec{H}\) directly, since every row of \(\varvec{D}-\varvec{M}\) is very short. However, for just dimension \(n=165\) in our experiments, we could not recover any row of \(\varvec{D}-\varvec{M}\) when we even applied BKZ-20 on the public key \(\varvec{H}\) directly.

Table 1. Experimental results for our attack

In contrast, with our attack, for the dimension \(n=200\), LLL algorithm could recover all (or partial) rows of \(\varvec{D}-\varvec{M}\), and BKZ-10 could recover all the rows of \(\varvec{D}-\varvec{M}\) for our instances. For the dimension \(n=300\), we could recover all rows of \(\varvec{D}-\varvec{M}\) in 2 instances and partial rows in 2 instances when BKZ-20 was used.

For the dimension \(n=400\), we just obtain partial rows in \(\varvec{D}-\varvec{M}\) for only one instance with BKZ-20 algorithm. Employing BKZ algorithm with bigger blocksize, we may obtain more rows.

However, we would like to point out that even only partial rows are recovered, the randomized version of the PSW signature scheme is not secure. Since the messages are all generated randomly, we may expect to recover all the rows of the matrix \(\varvec{D}-\varvec{M}\) by repeating our attack several times.

Remark 1

Once obtaining a short basis, we can also recover the matrix \(\varvec{M}\) by finding some lattice vector close to \((0,\cdots ,d,\cdots ,0)\). Using some strategies in [14] to solve the Bounded Distance Decoding (BDD) problem may improve our results.

Remark 2

We would like to point out that the strategy to collect the difference vectors also plays an important role in our attack. Another natural strategy is to query the signing oracle just twice for each message and collect enough difference vectors to mount the attack. However, the new strategy did not work so well as Algorithm 3. For dimension \(n=180\) and larger dimensions, we could never recover any rows of the matrix \(\varvec{D}-\varvec{M}\) by using this strategy in our experiments.

5 Possible Ways to Fix the Randomized Version

There are two possible ways to fix the randomized version similar to the strategies in [8].

The first way is to store the message-signature pairs locally, which seems a bit impractical. In detail, once given a message \(\varvec{m}\), we will modify the Sign step as the following:

Sign

  1. 1.

    Check whether \(\varvec{m}\) has been signed or not.

  2. 2.

    If \(\varvec{m}\) is stored locally, return the locally stored signature \(\varvec{w}\) corresponding to \(\varvec{m}\).

  3. 3.

    Otherwise, use Algorithm 2 to output a signature \(\varvec{w}\) and store \((\varvec{m},\varvec{w})\) locally.

The second way is to add some random number to the hash function. This strategy is usually used in the hash-then-sign schemes. Since the original PSW scheme has no security proof and we do not know the exact hardness of CVP\(_{\infty }\) over the PSW instances, we can not present some formal security proof for this fixed version, but just present it as the following:

Sign

  1. 1.

    Choose \(\varvec{r}\leftarrow \{0,1\}^n\) at random.

  2. 2.

    Compute the vector \(\varvec{v}=\mathcal {H}(\varvec{m}||\varvec{r})\), where \(\mathcal {H}\) maps \((\varvec{m}||\varvec{r})\) to the area \((-d^2,d^2)^n\).

  3. 3.

    Applying Algorithm 2, compute the signature \(\varvec{w}\).

Once given the signature \((\varvec{m},\varvec{r},\varvec{w})\), we will modify the Verify step as below.

Verify

  1. 1.

    Check if \(|w_i|<d\) for \(1\le i\le n\).

  2. 2.

    Compute the vector \(\mathcal {H}(\varvec{m}||\varvec{r})\).

  3. 3.

    Check whether the vector \(\mathcal {H}(\varvec{m}||\varvec{r})-\varvec{w}\in \varLambda (\varvec{H})\) or not.

6 Conclusions and Open Problems

In this paper, we show that the randomized PSW signature scheme is not secure under the chosen message attack at least for dimension less than or equal to 400. However, for the scheme with bigger dimension which becomes less efficient apparently, it seems that we need the BKZ algorithm with bigger blocksize to recover the private key. In fact, our attack reveals that the storage of previous signature or the use of random nonce employed in the randomized signature scheme is crucial.

However, there are still some unsolved theoretical problems, such as presenting a theoretical reason why the strategy in Remark 2 does not work as well as Algorithm 3. The lattice vectors we collected by the two strategies have almost the same length. However, Algorithm 3 usually succeeded, whereas the strategy in Remark 2 always failed when the dimension is between 200 and 400. It seems a bit strange. We conjecture the reason may relate to the fact that the lattice vectors collected with the strategy in Remark 2 seems more “independent” and “random”, but we can not present a rigorous analysis.

Moreover, we tried to apply our attack to analyze the security of some signature schemes with GPV algorithm [8] as the signing algorithm, such as [6]. However, we could only recover the private key with dimension 128 for [6], but failed for larger dimensions such as 256. This phenomenon also lacks theoretical explanation.

Hence, the theory about how the lattice basis reduction algorithm behaves with shorter input should be further studied. Usually, we measure the quality of the output for the lattice basis reduction algorithm with the determinant of the input lattice (such as Gauss heuristic), but it can be expected that with shorter input, we can have shorter output, although the determinant keeps the same. A natural problem is if there is some tight relation between the length of output and input on average, with which we can describe the attack more rigorously in theory.