1 Introduction

The Learning Parity with Noise problem (LPN) is a well-known problem studied in cryptography, coding theory and machine learning. In the LPN problem, one has access to queries of the form (v, c), where v is a random vector and c is the inner product of v with a secret vector s, to which some noise is added. Given these queries, one has to recover the value of s. In short, the problem asks to recover a secret vector s given access to noisy inner products of it with random vectors.

It is believed that LPN is resistant to quantum computers, so it is a good alternative to the number-theoretic problems (e.g. factorization and discrete logarithm) which can be solved easily with quantum algorithms. Also, due to its simplicity, it is a nice candidate for lightweight devices. Among the applications where LPN or LPN variants are deployed, we first have the HB family of authentication protocols: HB [27], HB+ [28], HB++ [11], HB# [21] and AUTH [31]. An LPN-based authentication scheme secure against man-in-the-middle attacks was presented at Crypto 2013 [35]. There are also several encryption schemes based on LPN: Alekhnovich [3] presents two public-key schemes that encrypt one bit at a time. Later, Gilbert, Robshaw and Seurin [21] introduce LPN-C, a public-key encryption scheme proved to be IND-CPA secure. Two schemes that improve upon Alekhnovich's scheme are introduced in [16] and [15]. In PKC 2014, Kiltz et al. [30] propose an alternative scheme to [16]. Duc and Vaudenay [18] introduce HELEN, an LPN-based public-key scheme for which they propose concrete parameters for different security levels. A PRNG based on LPN is presented in [8] and [4].

The LPN problem can also be seen as a particular case of the LWE [38] problem where we work in \(\mathbb {Z}_{2}\). While in the case of LWE the reduction from hard lattice problems attests to its hardness [10, 37, 38], there are no such results in the case of LPN. The problem is believed to be hard and is closely related to the long-standing open problem of efficiently decoding random linear codes.

In the current literature, there are few references when it comes to the analysis of LPN. The most well-known algorithm is BKW [9]. When introducing the HB+ protocol [28], which relies on the hardness of LPN, the authors propose parameters for different levels of security according to the BKW performance. These parameters were later shown to be weaker than thought [20, 33]. Fossorier et al. [20] provide a new variant that brings an improvement over the BKW algorithm. Levieil and Fouque [33] also give a formal description of the BKW algorithm and introduce two improvements over it. For their algorithm based on the fast Walsh-Hadamard transform, they provide the level of security achieved by different instances of LPN. This analysis is referenced by most of the papers that make use of the LPN problem. While they offer a theoretical analysis and propose secure parameters for different levels of security, the authors do not discuss how their theoretical bounds compare to practical results. As we will see, there is a gap between theory and practice. In the domain of machine learning, [22, 40] also cryptanalyse the LPN problem. The best algorithm for solving LPN was presented at Asiacrypt 2014 [24]. This new variant of BKW introduces covering codes as a novelty.

While these algorithms solve the general case where the secret is random, the literature contains no analysis or implementation of an algorithm specially conceived for the sparse-secret case, i.e. where the secret has a small Hamming weight.

The BKW algorithm can also be adapted to solve the LWE problem in exponential time. Implementation results and improvements of it were presented in [1, 2, 17]. In terms of variants of LPN, we have Ring-LPN [25] and Subspace LPN [31]. As an application of Ring-LPN we have the Lapin authentication protocol [25] and its cryptanalysis in [6, 23].

Motivation & contribution

Our paper addresses exactly the aforementioned open problems, i.e. the gap between theory and practice and the analysis of an LPN solving algorithm that proves to be better than BKW and its variants in the case of a sparse secret. First, we present the existing LPN solving algorithms in a unified framework. For these algorithms, we provide experimental results and give a better theoretical analysis that brings an improvement over the work of Levieil and Fouque [33]. Furthermore, we implement and analyse three new algorithms for the case where the secret is sparse. Our results show that for a sparse secret the BKW family of algorithms is outperformed by an algorithm that uses Gaussian elimination. Our motivation is to provide a theoretical analysis that matches the experimental results. Although this does not prove that LPN is hard, it gives tighter bounds for the parameters used by the aforementioned cryptographic schemes. It can also be used to obtain a tighter complexity analysis of algorithms related to LPN solving. Our results were actually used in [24] and also for LWE solving in [17].

Organization

In Section 2 we introduce the definition of LPN and present the main LPN solving algorithms. We also present the main ideas of how the analysis was conducted in [33]. We introduce novel theoretical analyses and show what improvements we bring in Section 3. Besides analysing the existing algorithms, we propose three new algorithms and analyse their performance in Section 4. In Section 5, we provide the experimental results for the algorithms described in Sections 3 & 4. We compare the theory with the practical results and show the tightness of our query complexity. We provide a comparison between all these algorithms in Section 6 and propose practical parameters for an 80-bit security level.

Notations and preliminaries

Let 〈⋅,⋅〉 denote the inner product, \(\mathbb {Z}_{2} = \{0,1\}\) and ⊕ denote the bitwise XOR. For a domain \(\mathcal {D}\), we denote by \(x \overset {U}{\leftarrow } \mathcal {D}\) the fact that x is drawn uniformly at random from \(\mathcal {D}\). We use small letters for vectors and capital letters for matrices. We represent a vector v of size k as \(v = (v_{1},\ldots ,v_{k})\), where \(v_{i}\) is the i-th bit of v. We denote the Hamming weight of a vector v by \(HW(v)\). By \(\mathsf{Ber}_{\tau}\) we denote the Bernoulli distribution with parameter τ, i.e. for a random variable X, \(\Pr [X = 1] = \tau = 1 - \Pr [X = 0]\). The bias of a Boolean random variable X is defined as \(\delta = E((-1)^{X})\). Thus, for a Bernoulli variable we have δ = 1−2τ.

2 LPN

In this section we introduce the LPN problem and the algorithms that solve it. For ease of understanding, we present the LPN solving algorithms in a unified framework.

2.1 The LPN problem

Intuitively, the LPN problem asks to recover a secret vector s given access to noisy inner products of it with random vectors. More formally, we present below the definition of the LPN problem.

Definition 1 (LPN oracle)

Let \(s \overset {U}{\leftarrow } \mathbb {Z}_{2}^{k}\) and let \(\tau \in ] 0, \frac {1}{2} [\) be a constant noise parameter. Denote by \(D_{s,\tau}\) the distribution defined as

$$\left\{ (v, c) \mid v \overset{U}{\leftarrow} \mathbb{Z}_{2}^{k}, c = \langle v,s \rangle \oplus d, d \leftarrow \mathsf{Ber}_{\tau}\right\} \in \mathbb{Z}_{2}^{k+1}. $$

An LPN oracle \(\mathcal {O}^{\mathsf {LPN}}_{s,\tau }\) is an oracle which outputs independent random samples according to \(D_{s,\tau}\).
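To make the definition concrete, here is a minimal Python sketch simulating such an oracle; the function name, the list-of-bits representation and the example parameters are our own illustrative choices.

import random

def lpn_oracle(s, tau):
    """Return one sample (v, c) from D_{s,tau}: v uniform in Z_2^k and
    c = <v, s> xored with a Bernoulli(tau) noise bit."""
    v = [random.randrange(2) for _ in range(len(s))]
    noise = 1 if random.random() < tau else 0
    c = (sum(vi & si for vi, si in zip(v, s)) % 2) ^ noise
    return v, c

# example usage with illustrative parameters
k, tau = 16, 0.125
secret = [random.randrange(2) for _ in range(k)]
samples = [lpn_oracle(secret, tau) for _ in range(10)]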

Definition 2 (Search LPN problem)

Given access to an LPN oracle \(\mathcal {O}^{\mathsf {LPN}}_{s,\tau }\), find the vector s. We denote by \(\mathsf{LPN}_{k,\tau}\) the LPN instance where the secret has size k and the noise parameter is τ. Let \(k^{\prime } \leq k\). We say that an algorithm \(\mathcal {M}~(n,t,m,\theta ,k^{\prime })\)-solves the search \(\mathsf{LPN}_{k,\tau}\) problem if

$$\Pr\left[ \mathcal{M}^{\mathcal{O}^{\mathsf{LPN}}_{s,\tau}}(1^{k}) = \left( s_{1} {\ldots} s_{k^{\prime}}\right) \mid s \overset{U}{\leftarrow}\mathbb{Z}_{2}^{k} \right] \geq \theta, $$

and \(\mathcal {M}\) runs in time t, uses memory m and asks at most n queries from the LPN oracle.

Note that we consider here the problem of recovering the first \(k^{\prime }\) bits of the secret. We will show in Section 3 that for all the algorithms we consider, the cost of recovering the full secret s is dominated by the cost of recovering the first block of \(k^{\prime }\) bits of s.

An equivalent way to formulate the search \(\mathsf{LPN}_{k,\tau}\) problem is as follows: given access to a random matrix \(A \in \mathbb {Z}_{2}^{n \times k}\) and a column vector c over \(\mathbb {Z}_{2}\), such that \(As^{T} \oplus d = c\), find the vector s. Here the matrix A corresponds to the matrix that has the vectors v on its rows, s is the secret vector of size k and c corresponds to the column vector that contains the noisy inner products. The column vector d is of size n and contains the corresponding noise bits.

One may observe that with τ = 0, the problem is solved in polynomial time through Gaussian elimination given n = Θ(k) queries. The problem becomes hard once noise is added to the inner product. The value of τ can be either independent of or dependent on the value of k. Usually the value of τ is constant and independent of k. A case where τ is taken as a function of k occurs in the construction of the encryption schemes [3, 15]. Intuitively, a larger value of τ means more noise and makes the search LPN problem harder. The value of the noise parameter is a trade-off between the hardness of \(\mathsf{LPN}_{k,\tau}\) and the practical impact on the applications that rely on this problem.
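As an illustration of the noiseless case, the following sketch recovers s from a system \(As^{T} = c\) by Gaussian elimination over \(\mathbb{Z}_{2}\); it is a toy implementation that assumes the queries provide a full-rank system.

def solve_noiseless(A, c):
    """Recover s from A s^T = c over Z_2 by Gauss-Jordan elimination.
    A is an n x k list of 0/1 rows (n >= k), c the corresponding bits.
    Returns the k secret bits, or None if A does not have rank k."""
    n, k = len(A), len(A[0])
    M = [A[i][:] + [c[i]] for i in range(n)]        # augmented matrix [A | c]
    for col in range(k):
        # find a pivot row with a 1 in this column
        p = next((r for r in range(col, n) if M[r][col] == 1), None)
        if p is None:
            return None                             # rank deficient: ask for more queries
        M[col], M[p] = M[p], M[col]
        # eliminate this column from every other row
        for r in range(n):
            if r != col and M[r][col] == 1:
                M[r] = [x ^ y for x, y in zip(M[r], M[col])]
    return [M[i][k] for i in range(k)]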

The LPN problem also has a decisional form. The decisional \(\mathsf{LPN}_{k,\tau}\) asks to distinguish between the uniform distribution over \(\mathbb {Z}_{2}^{k+1}\) and the distribution \(D_{s,\tau }\). A similar definition can be adopted for an algorithm that solves decisional LPN. Let \(\mathcal {U}_{k+1}\) denote an oracle that outputs random vectors of size k+1. We say that an algorithm \(\mathcal {M} (n,t,m,\theta )\)-solves the decisional \(\mathsf{LPN}_{k,\tau}\) problem if

$$\left|\Pr\left[ \mathcal{M}^{\mathcal{O}^{\mathsf{LPN}}_{s,\tau}}(1^{k}) = 1 \right] - \Pr\left[ \mathcal{M}^{\mathcal{U}_{k+1}}(1^{k}) = 1\right] \right| \geq \theta $$

and \(\mathcal {M}\) runs in time t, uses memory m and needs at most n queries.

Search and decisional LPN are polynomially equivalent. The following lemma expresses this result.

Lemma 1 ([8, 29])

If there is an algorithm \(\mathcal {M}\) that (n,t,m,θ)-solves the decisional \(\mathsf{LPN}_{k,\tau}\) problem, then one can build an algorithm \(\mathcal {M^{\prime }}\) that \((n^{\prime },t^{\prime },m^{\prime },\theta ^{\prime },k)\)-solves the search \(\mathsf{LPN}_{k,\tau}\) problem, where \(n^{\prime } = \mathcal {O}(n \cdot \theta ^{-2} \log {k})\), \(t^{\prime } = \mathcal {O}(t \cdot k \cdot \theta ^{-2} \log {k} )\), \(m^{\prime } = \mathcal {O}(m \cdot \theta ^{-2} \log {k})\) and \(\theta ^{\prime } = \frac {\theta }{4}\).

We do not go into details as this is outside the scope of this paper. We only analyse the solving algorithms for search LPN. From now on we will refer to it simply as LPN.

2.2 LPN solving algorithms

In the current literature there are several algorithms to solve the LPN problem. The first to appear, and the most well known, is BKW [9]. This algorithm recovers the secret s of an \(\mathsf{LPN}_{k,\tau}\) instance in sub-exponential \(2^{\mathcal {O}\left (\frac {k}{\log k}\right )}\) time complexity by requiring a sub-exponential number \(2^{\mathcal {O}\left (\frac {k}{\log k}\right )}\) of queries from the \(\mathcal {O}^{\mathsf {LPN}}_{s,\tau }\) oracle. Levieil and Fouque [33] propose two improvements which are called LF1 and LF2. Fossorier et al. [20] also introduce a new algorithm, which we denote FMICM, that brings an improvement over BKW. The best algorithm to solve LPN was recently presented at Asiacrypt 2014 [24]. It can be seen as a variant of LF1 where covering codes are introduced as a new method to improve the overall algorithm. All these algorithms still require a sub-exponential number of queries and have a sub-exponential time complexity.

Using BKW as a black box, Lyubashevsky [34] introduces a "pre-processing" phase and solves an \(\mathsf{LPN}_{k,\tau}\) instance with \(k^{1+\eta}\) queries and with a time complexity of \(2^{\mathcal {O}\left (\frac {k}{\log \log k}\right )}\). The queries given to BKW have a worse noise parameter of \(\tau ^{\prime } = \frac {1}{2} - \frac {1}{2}\left (\frac {1-2\tau }{4} \right )^{\frac {2k}{\eta \log {k}}}\). Thus, this variant requires a polynomial number of queries but has a worse time complexity. Given only n = Θ(k) queries, the best algorithms run in exponential time \(2^{\Theta(k)}\) [36, 39].

An easy-to-solve instance of LPN was introduced by Arora and Ge [5]. They show that in the k-wise version, where the k-tuples of noise bits can be expressed as the solution of a polynomial (e.g. there are no 5 consecutive errors in the sequence of queries), the problem can be solved in polynomial time. What makes the problem easy is the fact that an adversary is able to structure the noise.

In this paper we are interested in the BKW algorithm and its improvements presented by Levieil and Fouque [33] and by Guo et al. [24]. The common structure of all these algorithms is the following: given n queries from the \(\mathcal {O}^{\mathsf {LPN}}_{s,\tau }\) oracle, the algorithm tries to reduce the problem of finding a secret s of k bits to one where the secret \(s^{\prime }\) has only \(k^{\prime }\) bits, with \(k^{\prime }<k\). This is done by applying several reduction techniques. We call this phase the reduction phase. Afterwards, during the solving phase, we apply a solving algorithm that recovers the secret \(s^{\prime }\). We then update the queries with the recovered bits and restart to fully recover s. For ease of understanding, we describe all the aforementioned LPN solving algorithms in this setting where we separate the algorithms into two phases. We emphasize the main differences between the algorithms and discuss which improvements they bring.

First, we assume that k = ab. Thus, we can visualise each k-bit vector v as a blocks of b bits each.

2.2.1 BKW algorithm

The BKW algorithm as described in [33] works in two phases:

Reduction phase

Given n queries from the LPN oracle, we group them in equivalence classes. Two queries are in the same equivalence class if they have the same value on a set \(q_{1}\) of b bit positions. These b positions are chosen arbitrarily in {1,…,k}. There are at most \(2^{b}\) such equivalence classes. Once this separation is done, we perform the following steps for each equivalence class: pick one query at random, the representative vector, and xor it with the rest of the queries from the same equivalence class. Discard the representative vector. This gives vectors with all bits set to 0 on those b positions. These steps are also illustrated in Algorithm 1 (steps 5–10). We are left with at least \(n-2^{b}\) queries where the secret is reduced to k−b effective bits (the others being multiplied by 0 in all queries).
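A minimal sketch of this reduction step, with queries represented as (v, c) pairs of bit lists (the function name and the choice of the first element of each class as representative are ours):

from collections import defaultdict

def bkw_reduction_step(queries, positions):
    """One BKW reduction: partition the queries by their value on the given
    b positions, xor every query with a representative of its class, and
    discard the representatives.  All returned vectors are 0 on those positions."""
    classes = defaultdict(list)
    for v, c in queries:
        classes[tuple(v[i] for i in positions)].append((v, c))
    reduced = []
    for group in classes.values():
        rep_v, rep_c = group[0]                     # representative (could be picked at random)
        for v, c in group[1:]:
            reduced.append(([x ^ y for x, y in zip(v, rep_v)], c ^ rep_c))
    return reduced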

In total, we apply this reduction technique a−1 times on disjoint position sets \(q_{1}, q_{2},\ldots ,q_{a-1}\) from {1,…,k} and end up with at least \(n-(a-1)2^{b}\) queries where the secret is reduced to k−(a−1)b = b bits. The bias of the new queries is \(\delta ^{2^{a-1}}\), as shown by the following lemma with \(w = 2^{a-1}\).

Lemma 2 ([9, 33])

If \((v_{1}, c_{1}),\ldots ,(v_{w}, c_{w})\) are the results of w queries from \(\mathcal {O}^{\mathsf {LPN}}_{s,\tau }\), then the probability that:

$$\langle v_{1} \oplus v_{2} \oplus {\ldots} \oplus v_{w},s \rangle = c_{1} \oplus {\ldots} \oplus c_{w} $$

is equal to \(\frac {1+\delta ^{w}}{2}\).
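For instance, for τ = 0.125 (so δ = 0.75) and w = 4 xored queries, the resulting query is noise-free with probability

$$\frac{1+\delta^{w}}{2} = \frac{1+0.75^{4}}{2} \approx 0.658, $$

compared to 0.875 for an original query.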

It is easy to see that the complexity of performing this reduction step is \(\mathcal {O}(kan)\).

[Algorithm 1: the BKW algorithm (pseudocode figure)]

After a−1 iterations, we are left with at least \(n-(a-1)2^{b}\) queries, and a secret of size of b effective bits at positions 1,…,b. The goal is to keep only those queries that have Hamming weight one (step 11 of Algorithm 1). Given \(n-(a-1)2^{b}\) queries and one bit position \(j \in \{ 1, \ldots ,k\} \backslash \{ q_{1} \cup {\ldots } \cup q_{a-1} \}\), only \(n^{\prime } = \frac {n-(a-1)2^{b}}{2^{b}}\) will have a single non-zero bit on position j and 0 on all the others. These queries represent the input to the solving phase. The bias does not change since we do not alter the original queries. The complexity of performing this step for \(n-(a-1)2^{b}\) queries is \(\mathcal {O}\left (b(n - (a-1)2^{b})\right )\) as the algorithm just checks whether the queries have Hamming weight 1.

The bit c is also part of the query: it gets updated during the xoring operations, but we do not consider this bit when partitioning or when computing the Hamming weight of a query. Later on, the information stored in this bit will be used to recover bits of the secret.

Remark 1

Given that we have performed xors between pairs of queries, we note that the noise bits are no longer independent. In the analysis of BKW, this was overlooked by Levieil and Fouque [33]. The original BKW [9] algorithm overcomes this problem in the following manner: each query that has Hamming weight 1 is obtained with a fresh set of queries. Given \(a2^{b}\) queries the algorithm runs the xoring process and is left with \(2^{b}\) vectors. From these \(2^{b}\) queries, with a probability of \( 1 - \left (1 - 2^{-b}\right )^{2^{b}} \approx 1 - \frac {1}{e}\), where e = 2.718, there is one with Hamming weight 1 on a given position i. In order to obtain more such queries the algorithm repeats this process with fresh queries. This means that for guessing 1 bit of the secret, the original algorithm requires \(n = a \cdot 2^{b} \cdot \frac {1}{1- 1/e} \cdot n^{\prime }\) queries, where \(n^{\prime }\) denotes the number of queries needed for the solving phase. This is larger than \(n = 2^{b} n^{\prime } + (a-1)2^{b}\), which is the number of queries given by Levieil and Fouque [33]. We implemented and ran BKW as described in Algorithm 1 and we discovered that this dependency does not affect the performance of the algorithm. That is, the number of queries computed by the theory that ignores the dependency of the error bits matches the practical results. We need \(n = n^{\prime } + (a-1)2^{b}\) (and not \(n = 2^{b} n^{\prime } + (a-1)2^{b}\)) queries in order to recover one block of the secret. The theoretical and practical results are presented in Section 5. Given our practical experiments, we keep the “heuristic” assumption of independence and the algorithm as described in [33], which we call BKW. Thus, we assume from now on the independence of the noise bits and the independence of the queries.

Another discussion on the independence of the noise bits is presented in [19]. There one can see the probability of having a collision, i.e. two queries that share an error bit, among the queries formed during the xoring steps.

We can repeat the algorithm a times, with the same queries, to recover all the k bits. The total time complexity of the reduction phase is \(\mathcal {O}\left (ka^{2}n\right )\) as we perform the steps described above a times (instead of \(\mathcal {O}(kan)\) as given in [33]). However, by making the selection of a and b adaptive, with ab close to the remaining number of bits to recover, we can show that the total complexity is dominated by that of recovering the first block. So, we can typically concentrate on the algorithm to recover a single block. We provide a more complete analysis in Section 3.

Solving phase

The BKW solving method recovers the secret bit by bit by applying the majority rule. The queries from the reduction phase are of the form \(c_{j}^{\prime }= s_{i} \oplus d_{j}^{\prime }\), with \(d_{j}^{\prime } \leftarrow \mathsf {Ber}_{\left (1-\delta ^{2^{a-1}}\right )/2}\) and \(s_{i}\) being the i-th bit of the secret s. Given that the probability for the noise bit to be set to 1 is smaller than \(\frac {1}{2}\), in more than half of the cases these queries will equal \(s_{i}\). Thus, we decide that the value of \(s_{i}\) is given by the majority rule (steps 12–14 of Algorithm 1). By applying the Chernoff bounds [13], we find how many queries are needed so that the probability of guessing one bit of the secret incorrectly is bounded by some constant θ, with 0 < θ < 1.

The time complexity of performing the majority rule is linear in the number of queries.
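A sketch of this majority vote, assuming the reduced queries are given as b-bit vectors of Hamming weight 1 together with their noisy bit (names are illustrative):

def majority_solve(queries, b):
    """BKW solving phase (sketch): each remaining query has a single non-zero
    bit among the b positions, and its bit c is a noisy copy of the secret bit
    at that position, so a majority vote recovers each secret bit."""
    votes = [[0, 0] for _ in range(b)]      # votes[j] = [#zeros, #ones] seen for s_j
    for v, c in queries:
        j = v.index(1)                      # position of the single non-zero bit
        votes[j][c] += 1
    return [1 if ones > zeros else 0 for zeros, ones in votes]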

Complexity analysis

With their analysis, Levieil and Fouque [33] obtain the following result:

Theorem 1 (Theorem 1 from [33])

For k=a⋅b, the BKW algorithm heuristically (\(n = 20 \cdot \ln (4k) \cdot 2^{b} \cdot \delta ^{-2^{a}} + (a-1)2^{b}, t = \mathcal {O}(kan), m=kn, \theta = \frac {1}{2},b\))-solves the LPN problem.

In Section 3 we will see that our theoretical analysis, which we believe to be more intuitive and simpler, gives tighter bounds for the number of queries.

2.2.2 LF1 algorithm

During the solving phase, the BKW algorithm recovers the value of the secret bit by bit. Given that we are interested only in queries with Hamming weight 1, many queries are discarded at the end of the reduction phase. As first noted in [33], this can be improved by using a Walsh-Hadamard transform instead of the majority rule. This improvement of BKW is denoted in [33] by LF1. Again, we present the algorithm in pseudocode in Algorithm 2. As in BKW, we can concentrate on the complexity of recovering the first block.

Reduction phase

The reduction phase of LF1 follows the same steps as in BKW, obtaining new queries as xors of \(2^{a-1}\) initial queries in order to reduce the secret to size b. At this point, the algorithm no longer discards queries but proceeds directly with the solving phase (see steps 3–10 of Algorithm 2). We now have \(n^{\prime } = n - (a-1)2^{b}\) queries after this phase.

[Algorithm 2: the LF1 algorithm (pseudocode figure)]

Solving phase

The solving phase consists of applying a Walsh-Hadamard transform in order to recover b bits of the secret at once (steps 11–13 in Algorithm 2). We can recover the b-bit secret by computing the Walsh-Hadamard transform of the function \(f(x) = {\sum }_{i} 1_{v_{i}^{\prime }=x}(-1)^{c_{i}^{\prime }}\). The Walsh-Hadamard transform is \(\hat {f}(\nu )= {\sum }_{x} (-1)^{\langle \nu , x \rangle } f(x) = {\sum }_{x} (-1)^{\langle \nu , x \rangle }{\sum }_{i} 1_{v_{i}^{\prime }=x}(-1)^{c_{i}^{\prime }} = {\sum }_{i} (-1)^{\langle v_{i}^{\prime }, \nu \rangle + c_{i}^{\prime }} = n^{\prime } - 2HW\left (A^{\prime }\nu ^{T} + c^{\prime }\right ) \). For ν = s, we have \( \hat {f}(s) = n^{\prime } - 2 \cdot HW(d^{\prime })\), where \(d^{\prime }\) is the noise vector after the reduction phase. We know that most of the noise bits are set to 0. So, \(\hat {f}(s)\) is large and we suppose it is the largest value in the table of \(\hat {f}\). Thus, we have to look at the maximum value of the Walsh-Hadamard transform in order to recover the value of s. A naive implementation of the Walsh-Hadamard transform would give a complexity of \(2^{2b}\) since we apply it on a space of size \(2^{b}\). Since we apply a fast Walsh-Hadamard transform, we get a time complexity of \(b2^{b}\) [14].
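The sketch below illustrates this solving phase: it builds the table of f and applies an in-place fast Walsh-Hadamard transform, encoding the b-bit vectors as integers (the function name is ours):

def lf1_solve(queries, b):
    """LF1 solving phase (sketch): build the table of f, apply an in-place
    fast Walsh-Hadamard transform, and return the argmax as the b-bit secret."""
    f = [0] * (1 << b)
    for v, c in queries:                     # v is a b-bit vector, c its noisy bit
        x = sum(bit << i for i, bit in enumerate(v))
        f[x] += (-1) ** c
    # in-place fast Walsh-Hadamard transform, O(b * 2^b)
    h = 1
    while h < len(f):
        for start in range(0, len(f), 2 * h):
            for i in range(start, start + h):
                f[i], f[i + h] = f[i] + f[i + h], f[i] - f[i + h]
        h *= 2
    best = max(range(len(f)), key=lambda nu: f[nu])
    return [(best >> i) & 1 for i in range(b)]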

Complexity analysis

The following theorem states the complexity of LF1:

Theorem 2 (Theorem 2 from [33])

For k=a⋅b and a>1, the LF1 algorithm heuristically (\(n = (8b + 200)\delta ^{-2^{a}} + (a-1)2^{b}, t = \mathcal {O}\left (kan+b2^{b}\right ), m=kn + b2^{b}, \theta = \frac {1}{2},b\))-solves the LPN problem.

The analysis is similar to the one done for BKW, except that we now work with blocks of the secret s and not bits. Thus, we bound by \(\frac {1}{2a}\) the probability that \(\hat {f}(s^{\prime }) > \hat {f}(s)\), where \(s^{\prime }\) is any of the \(2^{b}-1\) values different from s. As for BKW, we will provide a more intuitive and tighter analysis for LF1 in Section 3.2.

BKW vs. LF1

We can see that compared to BKW, LF1 brings a significant improvement in the number of queries needed. As expected, the factor \(2^{b}\) disappears as we no longer discard any query at the end of the reduction phase. There is an increase in the time and memory complexity because of the fast Walsh-Hadamard transform, but these terms are not the dominant ones.

2.2.3 LF2 algorithm

LF2 is a heuristic algorithm, also introduced in [33], that applies the same Walsh-Hadamard transform as LF1, but has a different reduction phase. We provide the pseudocode for LF2 below.

[Pseudocode figure: the LF2 algorithm]

Reduction phase

Similarly to BKW and LF1, the n queries are grouped into equivalence classes. Two queries are in the same equivalence class if they have the same value on a window of b bits. In each equivalence class we perform the xor of all pairs from that class. Thus, we do not choose any representative vector that is discarded afterwards. Given that an equivalence class contains \(n/2^{b}\) queries, we expect to have \(2^{b} \binom{n/2^{b}}{2}\) queries at the end of the xoring. One interesting case is when n is of the form \(n = 3\cdot 2^{b}\): with this reduction phase we expect to preserve the number of queries, since \(\binom{3}{2} = 3\). For any \(n>3\cdot 2^{b}\), the number of queries will grow exponentially and will also affect the time and memory complexity.
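A sketch of this reduction step, under the same query representation as before (the function name is ours):

from collections import defaultdict
from itertools import combinations

def lf2_reduction_step(queries, positions):
    """LF2 reduction (sketch): partition on b positions and xor every pair
    inside each equivalence class, keeping all resulting queries."""
    classes = defaultdict(list)
    for v, c in queries:
        classes[tuple(v[i] for i in positions)].append((v, c))
    reduced = []
    for group in classes.values():
        for (v1, c1), (v2, c2) in combinations(group, 2):
            reduced.append(([x ^ y for x, y in zip(v1, v2)], c1 ^ c2))
    return reduced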

Solving phase

This works as in LF1.

In a scenario where the attacker has access to a restricted number of queries, this heuristic algorithm helps in increasing the number of queries. With LF2, the attacker might produce enough queries to recover the secret s.

2.2.4 FMICM algorithm

Another algorithm, by Fossorier et al. [20], uses ideas from fast correlation attacks to solve the LPN problem. While it improves on the BKW algorithm, it does not perform better than LF1 and LF2. Given that it does not bring better results, we present only the main steps of the algorithm.

Like the previous algorithms, it can be split into two phases: a reduction and a solving phase. The reduction phase first decimates the number of queries and keeps only those queries that have 0 bits on a window of a given size. Then, it performs xors of several queries in order to further reduce the size of the secret. The algorithm used for this step is similar to the one that constructs parity checks of a given weight in correlation attacks. The solving phase makes use of the fast Walsh-Hadamard transform to recover part of the secret. By iterating, the whole secret is recovered.

2.2.5 Covering codes algorithm

The new algorithm [24], presented at Asiacrypt 2014, introduces a new type of reduction. There is a difference between [24] and what was presented at the Asiacrypt conference (mostly due to our results). We concentrate here on [24] and in the next section we present the suggestions we provided to the authors.

Reduction phase

The first step of this algorithm is to transform the LPN instance, where the secret s is chosen uniformly at random, into an instance where the secret follows a Bernoulli distribution. This method was described in [4, 6, 32].

Given n queries from the LPN oracle: \((\bar {v}_{1},c_{1})\), \((\bar {v}_{2}, c_{2}), {\ldots } , (\bar {v}_{n}, c_{n})\), select k linearly independent vectors \(\bar {v}_{i_{1}}, \ldots , \bar {v}_{i_{k}}\). Construct the k×k target matrix M that has the aforementioned vectors as its columns, i.e. \(M = \left [\bar {v}_{i_{1}}^{T}\, \bar {v}_{i_{2}}^{T} {\ldots } \bar {v}_{i_{k}}^{T}\right ]\). Compute \(\left (M^{T}\right )^{-1}\), the inverse of \(M^{T}\), where \(M^{T}\) is the transpose of M. We can rewrite the k queries corresponding to the selected vectors as \(M^{T}s^{T}+d^{\prime }\), where \(d^{\prime }\) is the k-bit column vector \(d^{\prime } = \left (d_{i_{1}}, d_{i_{2}}, \ldots , d_{i_{k}}\right )^{T}\). We denote \(c^{\prime } = M^{T}s^{T}+d^{\prime }\). For any \(\bar {v}_{j}\) that is not used in the matrix M, do the following computation:

$$\bar{v}_{j} \left( M^{T}\right)^{-1} c^{\prime} + c_{j} = \left\langle \bar{v}_{j}\left( M^{T}\right)^{-1},d^{\prime} \right\rangle +d_{j}. $$

We discard the matrix M. From the initial set of queries, we have obtained a new set where the secret value is \(d^{\prime }\). This can be seen as a reduction to a sparse secret. The complexity of this transform is \(\mathcal {O}\left (k^{3} + n k^{2}\right )\) with the schoolbook matrix inversion algorithm. This can be improved as follows: for a fixed χ, one can split the matrix \(\left (M^{T}\right )^{-1}\) into \(a^{\prime } = \left \lceil \frac {k}{\chi } \right \rceil \) parts \(\left [\begin {array}{c}M_{1} \\ M_{2} \\ {\vdots} \\ M_{a^{\prime }} \end {array}\right ]\) of χ rows each. By pre-computing \(\bar {v} M_{i}\) for all \(\bar {v} \in \{0,1\}^{\chi }\), the operation of computing \(\bar {v}_{j}\left (M^{T}\right )^{-1}\) takes \(\mathcal {O}\left (ka^{\prime }\right )\). The pre-computation takes \(\mathcal {O}(2^{\chi })\) and is negligible if the memory required by the BKW reduction is bigger. With this pre-computation the complexity is \(\mathcal {O}(nka^{\prime })\).
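The sketch below illustrates the transform itself (without the pre-computation trick); for simplicity it assumes that the first k query vectors are linearly independent, and the helper names are ours.

def gf2_inverse(M):
    """Gauss-Jordan inversion of a square 0/1 matrix over Z_2 (None if singular)."""
    n = len(M)
    A = [M[i][:] + [1 if j == i else 0 for j in range(n)] for i in range(n)]
    for col in range(n):
        p = next((r for r in range(col, n) if A[r][col] == 1), None)
        if p is None:
            return None
        A[col], A[p] = A[p], A[col]
        for r in range(n):
            if r != col and A[r][col] == 1:
                A[r] = [x ^ y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

def to_sparse_secret(queries, k):
    """Rewrite the queries so that the new secret is the noise vector d' of the
    first k queries (assumed linearly independent), as described above."""
    MT = [queries[i][0][:] for i in range(k)]     # rows of M^T are the selected vectors
    c_prime = [queries[i][1] for i in range(k)]   # c' = M^T s^T + d'
    inv = gf2_inverse(MT)                         # (M^T)^{-1}
    if inv is None:
        raise ValueError("selected vectors are not linearly independent")
    new_queries = []
    for v, c in queries[k:]:
        # w = v (M^T)^{-1}, computed as a row vector times a matrix over Z_2
        w = [sum(v[m] & inv[m][l] for m in range(k)) % 2 for l in range(k)]
        # new noisy bit: <w, c'> xor c, which equals <w, d'> xor d
        new_c = (sum(wi & ci for wi, ci in zip(w, c_prime)) % 2) ^ c
        new_queries.append((w, new_c))
    return new_queries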

Afterwards, the algorithm follows the usual BKW reduction steps, where the size of the secret is reduced to \(k^{\prime }\) by the xoring operation. Again, the vector of k bits is seen as being split into blocks of size b. The BKW reduction is applied a times. Thus, we have \(k^{\prime } = k - ab\).

The secret s of \(k^{\prime }\) bits is split into two parts: one part, denoted \(s_{2}\), of \(k^{\prime \prime }\) bits and the other part, denoted \(s_{1}\), of \(k^{\prime } - k^{\prime \prime }\) bits. The next step in the reduction is to guess the value of \(s_{1}\) by making an assumption on its Hamming weight: \(HW(s_{1})\leq w_{0}\). The remaining queries are of the form \(\left (v_{i},c_{i}=\langle v_{i},s_{2}\rangle \oplus d_{i}\right )\), where \(v_{i},s_{2} \in \{0,1\}^{k^{\prime \prime }}\) and \(d_{i} \leftarrow \mathsf {Ber}_{\frac {1 - \delta ^{2^{a}}}{2}}\). Thus, the problem is reduced to a secret of \(k^{\prime \prime }\) bits.

At this moment, the algorithm approximates the \(v_{i}\) vectors by the nearest codeword \(g_{i}\) in a \(\left [k^{\prime \prime },\ell \right ]\) linear code, where \(k^{\prime \prime }\) is the length and \(\ell\) is the dimension. By observing that \(g_{i}\) can be written as \(g_{i}=g_{i}^{\prime }G\), where G is the generating matrix of the code, we can write the equations in the form

$$c_{i}=\langle v_{i},s_{2}\rangle\oplus d_{i} = \left\langle g_{i}^{\prime}G,s_{2} \right\rangle \oplus \langle v_{i} - g_{i},s_{2} \rangle \oplus d_{i} = \left\langle g_{i}^{\prime},s_{2}^{\prime} \right\rangle \oplus d_{i}^{\prime} $$

with \(s^{\prime }_{2} = s_{2}G^{T}\) and \(d^{\prime }_{i}=\langle v_{i}-g_{i},s_{2}\rangle \oplus d_{i}\), where \(g_{i}^{\prime }, s_{2}^{\prime }\) have length \(\ell\). If the code has optimal covering radius ρ, then \(v_{i} - g_{i}\) is a random vector of weight bounded by ρ, while \(s_{2}\) is a vector of some small weight bounded by \(w_{1}\), with some probability. So, \(\langle v_{i} - g_{i},s_{2}\rangle\) is biased and we can treat \(d^{\prime }_{i}\) in place of \(d_{i}\).

In [24], the authors approximate the bias of \(\langle v_{i} - g_{i},s_{2}\rangle\) by \(\delta ^{\prime } = \left (1-2\frac {\rho }{k^{\prime \prime }}\right )^{w_{1}}\), as if all bits were independent. As discussed in the next section, this approximation is far from good.

No queries are lost during this covering code operation and the secret is now reduced to \(\ell\) bits. We have \(n^{\prime } = n - k - a2^{b}\) queries after this phase.

Solving phase

The solving phase of this algorithm follows the same steps as LF1, i.e. it employs a fast Walsh-Hadamard transform. One should notice that the solving phase recovers relations between the bits of the secret and not actual bits of the secret.

Complexity analysis

Recall that in the algorithm two assumptions are made regarding the Hamming weight of the secret: that \(s_{2}\) has a Hamming weight of at most \(w_{1}\) and that \(s_{1}\) has a Hamming weight of at most \(w_{0}\). This holds with probability \(\Pr \left (w_{0},k^{\prime }-k^{\prime \prime }\right ) \cdot \Pr \left (w_{1},k^{\prime \prime }\right )\), where

$$\Pr(w,m) = \sum\limits_{i=0}^{w} (1-\tau)^{m-i} \tau^{i} \binom{m}{i}. $$

The total complexity is given by the complexity of one iteration multiplied by the expected number of times the iteration has to be repeated. We state below the result from [24]:

Theorem 3 (Theorem 1 from [24])

Let n be the number of samples required and \(a,a^{\prime },b,w_{0},w_{1},\ell ,k^{\prime },k^{\prime \prime }\) be the algorithm parameters. For the \(\mathsf{LPN}_{k,\tau}\) instance, the number of bit operations required for a successful run of the new attack is equal to

$$t = \frac{t_{\mathsf{sparse~reduction}} + t_{\mathsf{bkw~reduction}} + t_{\mathsf{guess}} + t_{\mathsf{covering~code}} + t_{\mathsf{Walsh~transform}}}{\Pr(w_{0},k^{\prime}-k^{\prime\prime}) \Pr(w_{1},k^{\prime\prime})}, $$

where

  • \(t_{\mathsf {sparse~reduction}} = nka^{\prime }\) is the cost of reducing the LPN instance to a sparse secret

  • \(t_{\mathsf {bkw~reduction}} = (k+1)an\) is the cost of the BKW reduction steps

  • \(t_{\mathsf {guess}} = n^{\prime } {\sum }_{i=0}^{w_{0}} \binom{k^{\prime }-k^{\prime \prime }}{i} i\) is the cost of guessing \(k^{\prime }-k^{\prime \prime }\) bits and \(n^{\prime } = n - k - a 2^{b}\) represents the number of queries at the end of the reduction phase

  • \(t_{\mathsf {covering~code}} = \left (k^{\prime \prime } -\ell \right ) \left (2n^{\prime } + 2^{\ell }\right ) \) is the cost of the covering code reduction and \(n^{\prime }\) is again the number of queries

  • \(t_{\mathsf {Walsh~transform}} = \ell 2^{\ell } {\sum }_{i=0}^{w_{0}} \binom{k^{\prime }-k^{\prime \prime }}{i}\) is the cost of applying the fast Walsh-Hadamard transform for every guess of \(k^{\prime }-k^{\prime \prime }\) bits

under the condition that \(n-a2^{b} > \frac {1}{\delta ^{2^{a+1}} \cdot \delta ^{\prime 2}},\) where δ=1−2τ, \(\delta ^{\prime } = \left (1-2\frac {\rho }{k^{\prime \prime }}\right )^{w_{1}}\) and ρ is the smallest integer such that \({\sum }_{i=0}^{\rho } \binom{k^{\prime \prime }}{i} > 2^{k^{\prime \prime } -\ell }\).

The condition \(n-a2^{b} > \frac {1}{\delta ^{2^{a+1}} \cdot \delta ^{\prime 2}}\) proposed in [24] imposes a lower bound on the number of queries needed in the solving phase for the fast Walsh-Hadamard transform. In our analysis, we will see that this is underestimated: the Chernoff bounds dictate a larger number of queries.

3 Tighter theoretical analysis

In this section we present a theoretical analysis of the solving phases of the LPN solving algorithms that differs from the one of Levieil and Fouque [33]. A complete comparison is given in Section 5. Our analysis gives tighter bounds and aims at closing the gap between theory and practice. For the new algorithm from [24], we present the main points that we found to be incomplete.

We first show how the cost of solving one block of the secret dominates the total cost of recovering s. The main intuition is that after recovering a first block of \(k^{\prime }\) secret bits, we can apply a simple back-substitution mechanism and consider solving an \(\mathsf {LPN}_{k-k^{\prime },\tau }\) problem. The same strategy is applied in [1, 17] when solving LWE. Note that this is simply a generalisation of the classic Gaussian elimination procedure for solving linear systems, where we work over blocks of bits.

Specifically, let \(k_{1} = k\) and \(k_{i} = k_{i-1} - k_{i-1}^{\prime }\) for i>1, with \(k^{\prime }_{i-1} < k_{i-1}\). Now, suppose we were able to \(\left (n_{i},t_{i},m_{i},\theta _{i},k_{i}^{\prime }\right )\)-solve an \(\mathsf {LPN}_{k_{i},\tau }\) instance (meaning we recover a block of size \(k_{i}^{\prime }\) from the secret of size \(k_{i}\) with probability \(\theta_{i}\), in time \(t_{i}\) and with memory \(m_{i}\)). One can see that for \(k_{i+1}<k_{i}\) we need fewer queries to solve the new instance (the number of queries depends on the size \(k_{i+1}\) and on the noise level). With a smaller secret, the time complexity also decreases. Having a shorter secret and fewer queries, the memory needed is also smaller. Then, we can (n, t, m, θ, k)-solve the problem \(\mathsf{LPN}_{k,\tau}\) (i.e. recover s completely), with \(n = \max (n_{1},n_{2},\ldots )\), \(\theta = \theta_{1}+\theta_{2}+\ldots\), \(t = t_{1}+k_{1}^{\prime } n_{1} + t_{2}+ k_{2}^{\prime } n_{2} + \ldots \) (the terms \(k_{i}^{\prime } n_{i}\) are due to query updates by back substitution) and \(m = \max (m_{1},m_{2},\ldots )\). Finally, by taking \(\theta_{i} = 3^{-i}\), we obtain \(\theta \leq \frac {1}{2}\) and thus recover the full secret s with probability over 50 %.

It is easily verified that for all the algorithms we consider, we have n = n 1, m = m 1, and t is dominated by t 1. We provide an example on a concrete L P N instance in Appendix B.

For all the solving algorithms presented in this section we assume that \(n^{\prime }\) queries remain after the reduction phase and that the bias is \(\delta ^{\prime }\). For the solving techniques that recover the secret block-by-block, we assume the block size to be \(k^{\prime }\).

3.1 BKW algorithm

Given an LPN instance, the BKW solving method recovers a secret bit by applying the majority rule. Recall that the queries are of the form \(c_{j}^{\prime } = s_{i} \oplus d_{j}^{\prime }\), \(d_{j}^{\prime } \leftarrow \mathsf {Ber}_{(1-\delta ^{\prime })/2}\). The majority of these queries will most likely have \(c_{j}^{\prime } = s_{i}\). It is intuitive to see that the majority rule fails when more than half of the noise bits are 1 for a given bit. Any wrong guess of a bit gives a wrong value of the k-bit secret s. In order to bound the probability of such a scenario, we use the Hoeffding bounds [26] with \(X_{j} = d_{j}^{\prime}\) (see Appendix A). We have \(\Pr [X_{j} =1] = \frac {1-\delta ^{\prime }}{2}\). For \(X = {\sum }_{j=1}^{n^{\prime }} X_{j}\), we have \(E(X) = \frac {(1-\delta ^{\prime })n^{\prime }}{2}\); we apply Theorem 12 with \(\lambda = \frac {\delta^{\prime} n^{\prime }}{2}\), \(\alpha_{j} = 0\) and \(\beta_{j} = 1\) and obtain

$$\Pr\left[\mathsf{incorrect~guess~on~}s_{i}\right] = \Pr \left[ X \geq \frac{n^{\prime}}{2} \right] \leq e^{-\frac{n^{\prime}\delta^{\prime 2}}{2}}. $$

As discussed in Remark 1, the assumption of independence is heuristic.

Using the above result for every bit 1,…,b, we can bound by a constant θ the probability that we guess a block of s incorrectly, with 0 < θ < 1. Using the union bound, we get \(n^{\prime }=2 \delta ^{\prime -2} \ln (\frac {b}{\theta })\). Given that \(n^{\prime } = \frac {n - (a-1)2^{b}}{2^{b}}\) and that \(\delta ^{\prime } = \delta ^{2^{a-1}}\), we obtain the following result.

Theorem 4

For k≤a⋅b, the BKW algorithm heuristically (\(n = 2^{b+1} \delta ^{-2^{a}} \ln \left (\frac {b}{\theta }\right ) + (a-1)2^{b}, t = \mathcal {O}(kan),~m=kn, \theta ,b\))-solves the LPN problem.

We note that we obtained the above result using the union bound. One could make use of the independence of the noise bits and obtain \(n = 2^{b+1} \delta ^{-2^{a}} \ln \left (\frac {1}{1- 2^{-1/k}} \right ) + (a-1)2^{b}\), but this would bring a very small improvement.
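For reference, a small sketch evaluating the query count of Theorem 4 (our own helper; for large parameters one would work with logarithms to avoid floating-point overflow):

from math import log

def bkw_queries(k, tau, a, b, theta=0.5):
    """n from Theorem 4: 2^{b+1} * delta^{-2^a} * ln(b/theta) + (a-1)*2^b,
    with delta = 1 - 2*tau.  Assumes k <= a*b."""
    delta = 1 - 2 * tau
    return 2 ** (b + 1) * delta ** (-(2 ** a)) * log(b / theta) + (a - 1) * 2 ** b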

In terms of query complexity, we compare our theoretical results with the ones from [33] in Tables 1 and 2. We provide the \(\log _{2}(n)\) values for k varying from 32 to 100 and we take different Bernoulli noise parameters that vary from 0.01 to 0.4. Overall, our theoretical results bring an improvement of a factor 10 over the results of [33].

Table 1 BKW query complexity - our theory
Table 2 BKW query complexity - theory [33]

In Section 5.1 we show that Theorem 4 gives results that are very close to the ones we measure experimentally.

We note that our BKW algorithm, for which we have stated the above theorem, follows the steps from Algorithm 1 for k = ab. For k < ab the algorithm is slightly different. In this case we have a−1 blocks of size b and an incomplete block of size smaller than b. During the reduction phase, we first partition on the incomplete block and then apply a−2 reduction steps on the complete blocks. We finally have b bits to recover. Other than this small change, the algorithm remains the same.

If the term \(2^{b+1} \delta ^{-2^{a}} \ln \left (\frac {b}{\theta _{i}}\right )\) dominates n, the next iteration can use a value of a decreased by 1, leading to a new \(n \approx 2^{b+1} \delta ^{-2^{a-1}} \ln \left (\frac {b}{\theta _{i+1}}\right )\) which is roughly the square root of the previous n. So, the complexity of recovering this block is clearly dominated by the cost of recovering the previous block. If the term \((a-1)2^{b}\) dominates, we can decrease b by one for the next block and reach the same conclusion.

3.2 LF1 algorithm

For the LF1 algorithm, the secret is recovered by choosing the highest value of a Walsh-Hadamard transform. Recall that the Walsh transform is \(\hat {f}(\nu )= n^{\prime } - 2HW\left (A^{\prime }\nu ^{T} + c^{\prime }\right )\). For ν = s, we obtain that the Walsh transform has the value \( \hat {f}(s) = n^{\prime } - 2HW(d^{\prime }) \). We have \(E(\hat {f}(s))=n^{\prime } \delta ^{\prime }\).

The failure probability for LF1 is bounded by the probability that there is another vector ν≠s such that \(HW\left (A^{\prime } \nu ^{T} +c^{\prime }\right ) \leq HW\left (A^{\prime }s^{T} +c^{\prime }\right )\). Recall that \(A^{\prime }s^{T} + c^{\prime } =d^{\prime }\). We define x = s⊕ν so that \(A^{\prime } \nu ^{T} + c^{\prime } = A^{\prime }x^{T} + d^{\prime }\). We obtain that the failure probability is bounded by \(2^{k^{\prime }}\) times the probability that \(HW\left (A^{\prime }x^{T}+ d^{\prime }\right ) \leq HW(d^{\prime })\), for a fixed \(k^{\prime }\)-bit non-zero vector x. As \(A^{\prime }\) is uniformly distributed, independent from \(d^{\prime }\), and x is fixed and non-zero, \(A^{\prime }x^{T} + d^{\prime }\) is uniformly distributed, so we can rewrite the inequality as \(HW(y) \leq HW(d^{\prime })\), for a random y.

To bound the failure probability, we again use the Hoeffding inequality [26]. Let \(X_{1}, X_{2}, \ldots , X_{n^{\prime }}\) be random independent variables with \(X_{j}=y_{j}-d_{j}^{\prime }\), \(\Pr (X_{j} \in [-1,1]) =1\). We have \(E\left (y_{j}-d_{j}^{\prime }\right )=\frac {\delta ^{\prime }}{2}\). We can take \(\lambda = E[X]= \frac {\delta ^{\prime } n^{\prime }}{2} \) in Theorem 12 and obtain:

$$\Pr\left[\mathsf{incorrect~guess~on~one~block}\right] \leq 2^{k^{\prime}} \Pr \left[\sum\limits_{j=1}^{n^{\prime}} \left( y_{j} -d_{j}^{\prime}\right) \leq 0 \right] \leq 2^{k^{\prime}} e^{- \frac{n^{\prime} \delta^{\prime 2}}{8}}. $$

Again we can bound by θ the probability of incorrectly guessing one block of s. With \(n^{\prime }=8 \left (\ln \frac {2^{k^{\prime }}}{\theta }\right ) \delta ^{\prime -2}\), the probability of failure is smaller than θ. The total number of queries will be \(n = n^{\prime } + (a-1)2^{b}\) and we have \(\delta ^{\prime }=\delta ^{2^{a-1}}\), \(k^{\prime } = b\). Similarly to BKW, we obtain the following theorem:

Theorem 5

For k≤a⋅b, the LF1 algorithm heuristically (\(n = 8 \ln \left (\frac {2^{b}}{\theta }\right ) \delta ^{-2^{a}} +(a-1)2^{b}, t = \mathcal {O}\left (kan + b 2^{b}\right ), m=kn + b2^{b}, \theta ,b\))-solves the LPN problem.

By comparing the term \((8b + 200)\delta ^{-2^{a}}\) in Theorem 2 with our value of \(8 \ln \left (\frac {2^{b}}{\theta }\right ) \delta ^{-2^{a}}\), one can check that our term is roughly a factor 2 smaller than that of [33] for practical values of a and b. For example, for an \(\mathsf{LPN}_{768,0.01}\) instance (with a = 11, b = 70), our analysis requires \(2^{68}\) queries for the solving phase while the Levieil and Fouque analysis requires \(2^{69}\) queries.

3.3 LF2 algorithm

Having the new bounds for LF1, we can state a similar result for LF2. Recall that when \(n = 3\cdot 2^{b}\), LF2 preserves the number of queries during the reduction phase. For \(3 \cdot 2^{b} \geq n^{\prime }\) we have that:

Theorem 6

For k≤a⋅b and \(n = 3 \cdot 2^{b} \geq 8 \ln \left (\frac {2^{b}}{\theta }\right ) \delta ^{-2^{a}}\), the LF2 algorithm heuristically (\(n = 3 \cdot 2^{b}, t = \mathcal {O}\left (kan + b2^{b}\right ), m=kn + b2^{b}, \theta ,b\))-solves the LPN problem.

One can observe that we may allow n to be smaller than \(3\cdot 2^{b}\). Given that the solving phase may require fewer than \(3\cdot 2^{b}\) queries, we could start with fewer queries, let their number decrease during the reduction and end up with the exact number of queries needed for the solving phase.

3.4 Covering codes algorithm

Recall that the algorithm first reduces the size of the secret to \(k^{\prime \prime }\) bits by running BKW reduction steps. Then it approximates the \(v_{i}\) vectors by the nearest codeword \(g_{i}\) in a \([k^{\prime \prime },\ell ]\) linear code with generator matrix G. The noisy inner products can be rewritten as

$$c_{i} = \left\langle g_{i}^{\prime} G,s_{2} \right\rangle \oplus \langle v_{i}-g_{i},s_{2} \rangle\oplus d_{i} = \left\langle g_{i}^{\prime}, s_{2} G^{T} \right\rangle \oplus d_{i}^{\prime} = \left\langle g_{i}^{\prime}, s_{2}^{\prime} \right\rangle \oplus d_{i}^{\prime}, $$

where \(g_{i} = g_{i}^{\prime } G\), \(s_{2}^{\prime } = s_{2}G^{T}\) and \(d_{i}^{\prime } = \langle g_{i}-v_{i},s_{2} \rangle \oplus d_{i}\).

Given that the code has a covering radius of ρ and that the Hamming weight of \(s_{2}\) is at most \(w_{1}\), the bias of \(\langle g_{i} - v_{i},s_{2}\rangle\) is computed as \(\delta ^{\prime } = \left (1-2\frac {\rho }{k^{\prime \prime }}\right )^{w_{1}}\) in [24], where \(k^{\prime \prime }\) is the size of \(s_{2}\). We stress that this approximation is far from good.

Indeed, with the [3,1] repetition code given as an example in [24], the xor of two error bits is unbiased. Even worse, the xor of three error bits has a negative bias. So, when using the code obtained by 25 concatenations of this repetition code and \(w_{1} = 6\), with probability 36 % at least two error bits fall in the same concatenation and the bias makes this approach fail.

We can do the same computation with the concatenation of five [23,12] Golay codes and \(w_{1} = 15\), as suggested in [24]. With probability 0.21 %, the bias is zero or negative, so the algorithm fails. With probability 8.3 %, the bias is too low.

In any case, we cannot take the error bits as independent. When the code has optimal covering radius ρ, we can actually give an explicit formula for the bias of \(\langle v_{i} - g_{i},s_{2}\rangle\), assuming that \(s_{2}\) has weight \(w_{1}\):

$$\Pr\left[\langle v_{i}-g_{i},s_{2}\rangle=1 \mid HW(s_{2})=w_{1}\right]= \dfrac{1}{S(k^{\prime\prime},\rho)}\sum\limits_{i\leq \rho,\, i\mathsf{~odd}} \binom{w_{1}}{i} S(k^{\prime\prime}-w_{1},\rho-i) $$

where \(S(k^{\prime \prime },\rho )\) is the number of \(k^{\prime \prime }\)-bit strings with weight at most ρ.

To solve \(\mathsf{LPN}_{512,0.125}\), [24] proposes the following parameters

$$a=6\quad a^{\prime}=9\quad b=63\quad \ell=64\quad k^{\prime\prime}=124\quad w_{0}=2\quad w_{1}=16 $$

and obtains \(n = 2^{66.3}\) and a complexity of \(2^{79.92}\). With these parameters, [24] approximated the bias to \(\left (1-2\frac {\rho }{k^{\prime \prime }}\right )^{w_{1}}=2^{-5.91}\) (with ρ = 14). With our exact formula, the bias should rather be \(2^{-7.05}\). So, n should be multiplied by 4.82 (the square of the ratio).

Also, we stress that all this assumes the construction of a code with optimal covering radius, such as the Golay codes, or the repetition codes of odd length and dimension 1. But such codes do not exist for all \([k^{\prime \prime },\ell ]\). If we use concatenations of repetition codes, given as an example in [24], the formula for the bias changes. Given a concatenation of \(\ell\) repetition codes \([k_{i},1]\), with \(k_{1} + {\ldots } + k_{\ell } = k^{\prime \prime }\), \(k_{i} \approx \frac {k^{\prime \prime }}{\ell }\) and \(1\leq i\leq \ell\), we have to split the secret \(s_{2}\) into chunks of \(k_{1},\ldots ,k_{\ell }\) bits. We write \(w_{11}+{\ldots }+w_{1\ell } = w_{1}\), where \(w_{1i}\) is the weight of \(s_{2}\) on the i-th chunk. In this case the bias for each repetition code is

$$ \delta_{i} = 1 - 2 \times \dfrac{1}{S(k_{i},\rho_{i})}\sum\limits_{j\leq \rho_{i},\, j\mathsf{~odd}} \binom{w_{1i}}{j} S(k_{i}-w_{1i},\rho_{i}-j), $$
(1)

where \(\rho _{i} = \left \lfloor \frac {k_{i}}{2} \right \rfloor \).

The final bias is

$$ \delta^{\prime} = \delta_{1} {\cdots} \delta_{\ell}. $$
(2)
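The following sketch evaluates (1) and (2) for a concatenation of repetition codes; it is only an illustration of the formulas, and the two checks at the end reproduce the behaviour of the [3,1] repetition code discussed above.

from math import comb

def S(k, rho):
    """Number of k-bit strings of weight at most rho."""
    return sum(comb(k, i) for i in range(rho + 1))

def repetition_bias(k_i, w_1i):
    """Bias delta_i of <v - g, s_2> on one [k_i, 1] repetition-code chunk
    carrying w_1i non-zero secret bits (formula (1), rho_i = floor(k_i / 2))."""
    rho = k_i // 2
    p = sum(comb(w_1i, j) * S(k_i - w_1i, rho - j)
            for j in range(1, rho + 1, 2)) / S(k_i, rho)
    return 1 - 2 * p

def concatenated_bias(chunks):
    """Final bias delta' = delta_1 * ... * delta_l (formula (2));
    chunks is a list of (k_i, w_1i) pairs."""
    delta = 1.0
    for k_i, w_1i in chunks:
        delta *= repetition_bias(k_i, w_1i)
    return delta

# two error bits in one [3,1] chunk give bias 0; three give a negative bias
print(repetition_bias(3, 2))   # 0.0
print(repetition_bias(3, 3))   # -0.5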

We emphasize that the value of n is underestimated in [24]. Indeed, with \(n^{\prime }=\mathsf {bias}^{-2}\), the probability that \(\arg \max \left (\hat {f}(\nu )\right )=s_{2}^{\prime }\) is too low in LF1. To have a constant probability of success, our analysis says that we should multiply \(n^{\prime }\) by \(8\ln \left (\frac {2^{\ell }}{\theta }\right )\). For \(\mathsf{LPN}_{512,0.125}\) and \(\theta =\frac {1}{3}\), this factor is 363.

When presenting their algorithm at Asiacrypt 2014, the authors of [24] updated their computation by using our suggested formulas for the bias and the number of queries. In order to obtain a complexity smaller than \(2^{80}\), they further improved their algorithm by the following observation: instead of assuming that the secret \(s_{2}\) has a Hamming weight smaller than or equal to \(w_{1}\), the algorithm now takes into account all the Hamming weights that would give a good bias for the covering code reduction. That is, the algorithm takes into account all the Hamming weights w for which \(\delta ^{\prime } > \varepsilon _{\mathsf {set}}\), where \(\varepsilon _{\mathsf {set}}\) is a preset bias. The probability of a good secret changes from \(\Pr (w_{1},k^{\prime \prime })\) to \(\Pr (HW)\), which we define below. They further adapted the algorithm by using the LF2 reduction steps. Recall that for \(n = 3\cdot 2^{b}\), the number of queries is preserved during the reduction phase. With these changes they propose the following parameters for \(\mathsf{LPN}_{512,0.125}\):

$$a=5\quad b=62\quad \ell=60\quad k^{\prime\prime}=180\quad w_{0}=2\quad \varepsilon_{\mathsf{set}} = 2^{-14.18} $$

Using two [90,30] linear codes, they obtain that \(n = 2^{63.6} = 3\cdot 2^{b}\) queries are needed, the memory used is \(m = 2^{72.6}\) bits and the time complexity is \(t = 2^{79.7}\). Thus, this algorithm gives better performance than LF2 and shows that this LPN instance does not offer a security of 80 bits.

With all the above observations, we update Theorem 3.

Theorem 7

Let \(a,a^{\prime },b,w_{0},w_{1},\ell ,k^{\prime },k^{\prime \prime },\varepsilon _{\mathsf {set}}\) be the algorithm parameters. The covering codes algorithm (\(n = 8 \ln \left (\frac {2^{\ell }}{\theta }\right ) \frac {1}{\delta ^{2^{a+1}} \varepsilon _{\mathsf {set}}^{2}} + a2^{b}\), \( t, m=kn + 2^{k^{\prime \prime } -\ell } + \ell 2^{\ell }, \theta ,\ell \))-solves the LPN problem, where δ=1−2τ and \(\varepsilon _{\mathsf {set}}\) is a preset bias. The code chosen for the covering code reduction step can be expressed as the concatenation of one or more linear codes. The time complexity t can be expressed as

$$t = \frac{t_{\mathsf{sparse~reduction}} + t_{\mathsf{bkw~reduction}} + t_{\mathsf{guess}} + t_{\mathsf{covering~code}} + t_{\mathsf{Walsh~transform}}}{\Pr(w_{0},k^{\prime}-k^{\prime\prime}) \Pr(HW)}, $$

where

  • \(t_{\mathsf {sparse~reduction}} = nka^{\prime }\) is the cost of reducing the LPN instance to a sparse secret

  • \(t_{\mathsf {bkw~reduction}} = (k+1)an\) is the cost of the BKW reduction steps

  • \(t_{\mathsf {guess}} = n^{\prime } {\sum }_{i=0}^{w_{0}} \binom{k^{\prime }-k^{\prime \prime }}{i} i\) is the cost of guessing \(k^{\prime }-k^{\prime \prime }\) bits and \(n^{\prime } = n - k - a 2^{b}\) represents the number of queries at the end of the reduction phase

  • \(t_{\mathsf {covering~code}} = (k^{\prime \prime } -\ell ) (2n^{\prime } + 2^{\ell }) \) is the cost of the covering code reduction and \(n^{\prime }\) is again the number of queries

  • \(t_{\mathsf {Walsh~transform}} = \ell 2^{\ell } {\sum }_{i=0}^{w_{0}} \binom{k^{\prime }-k^{\prime \prime }}{i}\) is the cost of applying the fast Walsh-Hadamard transform for every guess of \(k^{\prime }-k^{\prime \prime }\) bits

  • \(\Pr (HW) = {\sum }_{w_{i}} (1-\tau )^{k^{\prime \prime } - w_{i}} \tau ^{w_{i}} \binom{k^{\prime \prime }}{w_{i}}\), where \(w_{i}\) is chosen such that the bias \(\delta ^{\prime }\) (computed following (1) and (2)), which depends on \(w_{i}\) and the covering radius ρ of the chosen code, is larger than \(\varepsilon _{\mathsf {set}}\)

4 Other LPN solving algorithms

Most LPN-based encryption schemes use τ as a function of k, e.g. \(\tau = \frac {1}{\sqrt {k}}\) [3, 15]. The bigger the value of k, the lower the level of noise. For k = 768, we have τ≈0.036. For such a value we say that the noise is sparse. Given that these LPN instances are used in practice, we consider how we can construct other algorithms that take advantage of this extra information.

The first two algorithms presented in this section bring new ideas for the solving phase. The third one provides a method to recover the whole secret and does not need any reduction phase.

We maintain the notations used in the previous section: \(n^{\prime }\) queries remain after the reduction phase, the bias is \(\delta ^{\prime }\) and the block size is \(k^{\prime }\).

For these solving algorithms, we assume that the secret is sparse. Even if the secret is not sparse, we can exploit the sparseness of the noise: we can transform an LPN instance into an instance of LPN where the secret is actually a vector of noise bits, by the method presented in [32]. The details of this transform were given in Section 2.2.5 for the covering codes algorithm.

We denote by Δ the sparseness of the secret, i.e. \(\Pr [s_{i} = 1] = \frac {1 - {\Delta }}{2}\) for any \(1\leq i\leq k\). We say that the secret is Δ-sparse. We can take Δ=δ.

The assumption we make is that the Hamming weight of the \(k^{\prime }\)-bit secret s is in a given range. On average we have \(HW(s)=k^{\prime }\left (\frac {1-{\Delta }}{2}\right )\), so an appropriate range is \(\left [0,k^{\prime }\left (\frac {1-{\Delta }}{2}\right ) + \frac {\sigma }{2} \sqrt {k^{\prime }}\right ]\), where σ is a constant. We denote \(k^{\prime }\left (\frac {1-{\Delta }}{2}\right )\) by \(E_{HW}\) and \(\frac {\sigma }{2} \sqrt {k^{\prime }}\) by \(\mathsf{dev}\). Thus, we are searching in the range \([0,E_{HW} + \mathsf{dev}]\). We can bound the probability that the secret has a Hamming weight outside the range by using the Hoeffding bound [26].

Let \(X_{1}, X_{2}, \ldots , X_{k^{\prime }}\) be independent random variables that correspond to the secret bits, i.e. \(\Pr [X_{i}=1] = \frac {1-{\Delta }}{2}\) and \(\Pr (X_{i} \in [0,1]) =1\). We have \(E(X)=\frac {1-{\Delta }}{2} k^{\prime }\). Using Theorem 12, we get that

$$\Pr[HW(s)\mathsf{~not~in~range}] = \Pr \left[ HW(s) - \frac{(1-{\Delta})}{2} k^{\prime} \geq \sigma \sqrt{\frac{k^{\prime}}{4}} \right] \leq e^{-\frac{\sigma^{2}}{2}}. $$

If we want to bound by θ/2 the probability that H W(s) is not in the correct range for one block, we obtain that \(\sigma = \sqrt {2 \ln \left (\frac {2}{\theta }\right )}\).

4.1 Exhaustive search on sparse secret

We have \(S = {\sum }_{i=0}^{E_{HW} + \mathsf {dev}} \binom{k^{\prime }}{i}\) vectors ν with Hamming weight in our range. A first idea is to perform an exhaustive search over the sparse secret. We denote this algorithm by Search1. For every such value ν, we compute \(HW\left (A\nu ^{T} +c\right )\). In order to compute the Hamming weight we have to compute the multiplication between A and all ν with Hamming weight in the correct range. This operation would take \(\mathcal {O}\left (S n^{\prime } k^{\prime }\right )\) time, but we can save a factor \(k^{\prime }\) by the following observation from [7]: computing \(A \nu ^{T}\), with HW(ν)=i, means xoring i columns of A. If we have the values of \(A \nu ^{T}\) for all ν with HW(ν)=i, then we can compute \(A \nu ^{\prime T}\) for \(HW(\nu ^{\prime }) = i+1\) by adding one extra column to the previous results.

We use here a reasoning similar to the one done for the Walsh-Hadamard transform. When ν = s, the value of \(HW\left (As^{T} + c\right )\) equals HW(d), and we assume that this is the smallest value, as more noise bits are set to 0 than to 1. Thus, going through all possible values of ν and keeping the minimum gives us the value of the secret. The time complexity of Search1 is the complexity of computing the Hamming weights, i.e. \(\mathcal{O}(S n^{\prime })\).

Besides Search1, which requires a matrix multiplication for each trial, we also discovered that a Walsh transform can be used for a sparse secret. We call this algorithm Search2. The advantage is that a Walsh transform is faster than a naive exhaustive search and thus improves the time complexity. We thus compute the fast Walsh-Hadamard transform and search for the maximum of \(\hat {f}\) only over those S values with Hamming weight in the correct range. Given that we apply a Walsh transform, the complexity of this solving algorithm is \(\mathcal {O}(k^{\prime } 2^{k^{\prime }})\). So, it is more interesting than Search1 when \(Sn^{\prime } >k^{\prime }2^{k^{\prime }}\).
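A sketch of Search2, using the same integer encoding and in-place transform as in the LF1 sketch, with the argmax restricted to the S candidates of small Hamming weight:

def search2_solve(queries, b, max_weight):
    """Search2 (sketch): fast Walsh-Hadamard transform as in LF1, but the
    argmax is taken only over candidates of Hamming weight at most max_weight."""
    f = [0] * (1 << b)
    for v, c in queries:
        x = sum(bit << i for i, bit in enumerate(v))
        f[x] += (-1) ** c
    h = 1
    while h < len(f):                        # in-place fast Walsh-Hadamard transform
        for start in range(0, len(f), 2 * h):
            for i in range(start, start + h):
                f[i], f[i + h] = f[i] + f[i + h], f[i] - f[i + h]
        h *= 2
    candidates = [nu for nu in range(1 << b) if bin(nu).count("1") <= max_weight]
    best = max(candidates, key=lambda nu: f[nu])
    return [(best >> i) & 1 for i in range(b)]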

For both algorithms the failure probability is given by the scenario where there exists another sparse value νs such that \(HW\left (A\nu ^{T} +c\right ) \leq HW\left (As^{T}+c\right )\). As we search through S possible values for the secret we obtain that

$$\Pr[\mathsf{incorrect~guess~on~one~block}] \leq S e^{- \frac{n^{\prime} \delta^{\prime 2}}{8}}. $$

The above probability accounts for only one block of the secret. Thus we can say that with \(\sigma = \sqrt {2 \ln \left (\frac {2}{\theta }\right )}\) and \(n= 8\left (\ln \frac {2S}{\theta }\right ) \delta ^{-2^{a}} + (a-1)2^{b}\), the probability of failure is smaller than θ.

Another failure scenario, that we take into account in our analysis, occurs when the secret has a Hamming weight outside our range.

Complexity analysis

Taking \(n = n^{\prime } + (a-1)2^{b}\), \(k^{\prime } = b\), \(\delta ^{\prime } = \delta ^{2^{a-1}}\) and Δ=δ, we obtain the following theorems for S e a r c h 1 and S e a r c h 2:

Theorem 8

Let \(S = {\sum }_{i = 0}^{E_{HW} + \mathsf {dev}} \binom {b}{i}\) where \(E_{HW} = b\left (\frac {1-{\Delta }}{2}\right )\) and \(\mathsf {dev} = \frac {\sigma }{2} \sqrt {b}\) and let \(n^{\prime } = 8\ln \left (\frac {2S}{\theta }\right ) \delta ^{-2^{a}}\). For k≤a⋅b and a secret s that is Δ-sparse, the S e a r c h 1 algorithm heuristically (\(n = 8 \ln \left (\frac {2S}{\theta }\right ) \delta ^{-2^{a}} + (a-1)2^{b}, t = \mathcal {O}(kan + n^{\prime } S), m = kn + b \left (\begin {array}{c}b\\E_{HW}+ \mathsf {dev} \end {array}\right ) , \theta ,b\))-solves the L P N problem.

Theorem 9

Let \(S = {\sum }_{i = 0}^{E_{HW} + \mathsf {dev}} \binom {b}{i}\) where \(E_{HW} = b\left (\frac {1-{\Delta }}{2}\right )\) and \(\mathsf {dev} = \frac {\sigma }{2} \sqrt {b}\). For k≤a⋅b and a secret s that is Δ-sparse, the S e a r c h 2 algorithm heuristically (\(n = 8 \ln \left (\frac {2S}{\theta }\right ) \delta ^{-2^{a}} + (a-1)2^{b}, t = \mathcal {O}(kan + b 2^{b}), m = kn, \theta ,b\))-solves the L P N problem.

Here, we take the probability of each of the two failure scenarios to be θ/2. A search for the optimal values such that their sum is θ brings very little improvement to our results. Taking \(k^{\prime }=b\), we stress that S is much smaller than the \(2^{k^{\prime }} = 2^{b}\) term that is used for L F 1. For example, for k = 768, a = 11, b = 70 and τ = 0.05, we have that \(S \approx 2^{33}\), which is smaller than \(2^{b} = 2^{70}\), and we get \(n^{\prime }= 2^{67.33}\) and \(n = 2^{73.34}\) (compared to \(n^{\prime }=2^{68.32}\) and \(n = 2^{73.37}\) for L F 1). We thus expect to require fewer queries for exhaustive search compared to L F 1. As the asymptotic time complexity of S e a r c h 2 is the same as that of L F 1 and the number of queries is smaller, we expect this algorithm to run faster than L F 1.

4.2 Meet in the middle on sparse secret (MITM)

Given that A s T+d = c, we split s into s 1 and s 2 and rewrite the equation as \(A_{1}{s_{1}^{T}} +d =A_{2}{s_{2}^{T}} + c\). With this split, we try to construct a meet-in-the-middle attack by looking for \(A_{2}{s_{2}^{T}} + c\) close to \(A_{1} {s_{1}^{T}}\). The secret s has size \(k^{\prime }\) and we split it into s 1 of size k 1 and s 2 of size k 2 such that \(k_{1} + k_{2} = k^{\prime }\). We consider that both s 1 and s 2 are sparse. Thus the Hamming weight of s i lies in the range \(\left [0,k_{i}\left (\frac {1-{\Delta }}{2}\right ) + \frac {\sigma ^{\prime }}{2} \sqrt {k_{i}}\right ]\). We denote \(k_{i}\left (\frac {1-{\Delta }}{2}\right ) + \frac {\sigma ^{\prime }}{2} \sqrt {k_{i}}\) by m a x H W (k i ). In order to bound the probability that both estimates are correct we use the same bound shown in Section 4 and obtain that \(\sigma ^{\prime } = \sqrt {2 \ln \left (\frac {4}{\theta }\right )}\).

For our MITM attack we have a pre-computation phase. We compute and store \(A_{1} {s_{1}^{T}}\) for all \(S_{1} = {\sum }_{i = 0}^{\mathsf {max_{HW}}(k_{1})} \binom {k_{1}}{i}\) possible values for s 1. We do the same for s 2, i.e. we compute \(A_{2}{s_{2}^{T}} + c\) for all \(S_{2} = {\sum }_{i = 0}^{\mathsf {max_{HW}}(k_{2})} \binom {k_{2}}{i}\) vectors s 2. The pre-computation phase takes \((S_{1} + S_{2}) n^{\prime }\) steps in total. Afterwards we pick ξ bit positions and hope that the noise d has only values of 0 on these positions. If this is true, then we can build a mask μ that has Hamming weight ξ such that d∧μ = 0. The probability for this to happen is \(\left (\frac {1+\delta ^{\prime }}{2}\right )^{\xi } = e^{-\xi \ln {\frac {2}{1+\delta ^{\prime }}}}\).

We build our meet-in-the-middle attack by constructing a hash table where we store, for all s 2 values, \(A_{2}{s_{2}^{T}}+c\) at position \(h\left ((A_{2}{s_{2}^{T}}+c) \wedge \mu \right )\). We have S 2 vectors s 2, so we expect to have \(S_{2} 2^{-\xi }\) vectors on each position of the hash table. For all S 1 values of s 1, we check for collisions, i.e. \(h((A_{1}{s_{1}^{T}}) \wedge \mu ) = h((A_{2}{s_{2}^{T}}+c) \wedge \mu )\). If this happens, we check if \(A_{1}{s_{1}^{T}}\) xored with \(A_{2}{s_{2}^{T}}+c\) gives a vector d with a small Hamming weight. Remember that with the pre-computed values we can compute d with only one xor operation. If the resulting vector has a Hamming weight in our range, then we believe we have found the correct s 1 and s 2 values and we can recover the value of s. Given that \(A_{1}{s_{1}^{T}}+A_{2}{s_{2}^{T}} + d = c\), we expect to have \(\left (A_{2}{s_{2}^{T}} +c\right ) \wedge \mu = A_{1}{s_{1}^{T}} \wedge \mu \) only when d∧μ = 0. The condition d∧μ = 0 holds with a probability of \(\left (\frac {1+\delta ^{\prime }}{2}\right )^{\xi }\), so we expect to repeat our algorithm \(\left (\frac {2}{1+\delta ^{\prime }}\right )^{\xi }\) times until this condition is fulfilled.
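One iteration of this procedure, for a fixed mask μ, can be sketched as follows (illustrative; table1 and table2 stand for the precomputed maps \(s_{1} \mapsto A_{1}{s_{1}^{T}}\) and \(s_{2} \mapsto A_{2}{s_{2}^{T}}+c\) over the S 1 and S 2 sparse candidates, packed as integers, and we use the masked value itself as the hash key).

```python
# Sketch of one MITM iteration for a fixed mask mu of Hamming weight xi.
from collections import defaultdict

def mitm_iteration(table1, table2, mu, max_hw_noise):
    buckets = defaultdict(list)                 # hash table keyed by masked value
    for s2, y in table2.items():                # y = A2 s2^T + c
        buckets[y & mu].append((s2, y))

    for s1, x in table1.items():                # x = A1 s1^T
        for s2, y in buckets.get(x & mu, []):   # collision on the masked bits
            d = x ^ y                           # candidate noise vector (one xor)
            if bin(d).count("1") <= max_hw_noise:
                return s1, s2                   # believed halves of the secret
    return None                                 # repeat with another mask mu
```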

As for exhaustive search, we have two scenarios that could result in a failure. One scenario is when s 1 or s 2 have a Hamming weight outside the range. The second one happens when there is another vector νs such that \(HW\left (A_{1} {\nu _{1}^{T}}+A_{2} {\nu _{2}^{T}}+c\right ) \leq HW\left (A_{1}{s_{1}^{T}}+A_{2}{s_{2}^{T}}+c\right )\) and \(\left (A_{1}{\nu _{1}^{T}}+A_{2} {\nu _{2}^{T}}+c\right ) \wedge \mu =0\). This occurs with probability smaller than \(S_{1}S_{2} e^{-\frac {n^{\prime } \delta ^{\prime 2}}{8}}\).

Complexity analysis

The time complexity of constructing the MITM attack is \((S_{1} + S_{2})n^{\prime } + ((S_{1}+S_{2})\xi + S_{1}S_{2}2^{-\xi } n^{\prime }) \cdot \left (\frac {2}{1+\delta ^{\prime }}\right )^{\xi }\). We include here the cost of the pre-computation phase and the actual MITM cost. We obtain that the time complexity is \(\mathcal {O}\left ((S_{1} + S_{2})n^{\prime } + (S_{1} + S_{2}) \xi \left (\frac {2}{1+\delta ^{\prime }}\right )^{\xi } + S_{1}S_{2} n^{\prime } \left (\frac {1}{1+\delta ^{\prime }}\right )^{\xi } \right )\). Taking again \(n^{\prime } = n - (a-1)2^{b}\), \(k^{\prime } = b\), \(\delta ^{\prime } = \delta ^{2^{a-1}}\), Δ=δ, we obtain the following result for MITM.

Theorem 10

Let \(n^{\prime } = 8\ln \left (\frac {2}{\theta } S_{1}S_{2}\right ) \delta ^{-2^{a}}\) . Take k 1 and k 2 values such that b=k 1 +k 2. Let \(S_{j} = {\sum }_{i = 0}^{\mathsf {max_{HW}}(k_{j})} \binom {k_{j}}{i} \) where \(\mathsf {max_{HW}}(k_{j}) = k_{j} \left (\frac {1-{\Delta }}{2}\right ) + \frac {\sigma ^{\prime }}{2} \sqrt {k_{j}}\) for j∈{1,2}. For k≤a⋅b and a secret s that is Δ-sparse, the MITM algorithm heuristically ( \(n = 8 \ln \left (\frac {2}{\theta } S_{1}S_{2}\right ) \delta ^{-2^{a}} + (a-1)2^{b}, t = \mathcal {O}\left (kan + (S_{1} + S_{2}) n^{\prime }+ (S_{1} + S_{2}) \xi \left (\frac {2}{1+\delta ^{2^{a-1}}}\right )^{\xi } + S_{1}S_{2} n^{\prime } \left (\frac {1}{1+\delta ^{2^{a-1}}}\right )^{\xi } \right ), m = kn + S_{2} + (S_{1} + S_{2}) n^{\prime }, \theta ,b\))-solves the L P N problem.

4.3 Gaussian elimination

In the case of a sparse noise, one may try to recover the secret s by using Gaussian elimination. It is well known that L P N with noise 0, i.e. τ = 0, is an easy problem. This idea was used in [12] in order to mount a passive attack on the HB and HB+ protocols. If we are given Θ(k) queries for which the noise is 0, one can just run Gaussian elimination and recover the secret s in \(\mathcal {O}(k^{3})\). For an L P N k, τ instance, the event of having no noise for k queries happens with a probability p n o n o i s e = (1−τ)k.

We design the following algorithm for solving L P N: first, we have no reduction phase. For each k new queries, we assume that the noise is 0. We recover a candidate ν through Gaussian elimination. We must test if this value is the correct secret by computing the Hamming weight of \(A^{\prime } \nu ^{T} + c^{\prime }\), where \(A^{\prime }\) is the matrix that contains \(n^{\prime }\) fresh queries and \(c^{\prime }\) is the vector containing the corresponding noisy inner products. We expect to have a Hamming weight in the range \(\left [0, \left (\frac {1-\delta }{2}\right )n^{\prime } + \sigma \frac {\sqrt {n^{\prime }}}{2}\right ]\), where σ is a constant. From the previous results we know that for a correct secret we have

$$\Pr\left[HW\left( A^{\prime} s^{T}+ c^{\prime}\right)\mathsf{~not~in~range}\right] \leq e^{-\frac{\sigma^{2}}{2}}. $$

If we want to bound by θ/2 the probability that the Hamming weight of the noise is not in the correct range, for the correct secret, we obtain that \(\sigma = \sqrt {2 \ln \left (\frac {2}{\theta }\right )}\).

For a νs, we use the Hoeffding inequality to bound the probability that \(HW\left (A^{\prime } \nu ^{T} + c^{\prime }\right )\) is in the correct range. Let \(X_{1}, \ldots , X_{n^{\prime }}\) be the random variables that correspond to X i = 〈v i ,ν〉⊕c i . Let \(X = X_{1} + {\ldots } + X_{n^{\prime }}\). We have \(E(X) =\frac {n^{\prime }}{2}\). Using the Hoeffding inequality, we take \(\lambda = \frac {\delta n^{\prime }}{2} -\sigma \frac {\sqrt {n^{\prime }}}{2}\) and obtain

$$\begin{array}{@{}rcl@{}} \Pr[\mathsf{failure}] & \leq& 2^{k} \Pr\left[HW\left( A^{\prime} \nu^{T} + c^{\prime}\right)\mathsf{~in~correct~range}\right] \\ & =& 2^{k} \Pr[X - E(X) \leq -\lambda ] \\ & \leq& 2^{k} e^{- \frac{2 \left( \frac{\delta n^{\prime}}{2} - \sigma\frac{\sqrt{n^{\prime}}}{2}\right)^{2}}{n^{\prime}}} = 2^{k} e^{- \frac{ \left( \delta \sqrt{n^{\prime}} -\sigma\right)^{2}}{2}} \end{array} $$

If we bound this probability of failure by θ/2 we obtain that we need at least \(n^{\prime } = \left (\sqrt {2\ln {\frac {2^{k+1}}{\theta }}} + \sigma \right )^{2} \delta ^{-2}\) queries besides the k that are used for the Gaussian elimination.

As aforementioned, with a probability of p n o n o i s e = (1−τ)k, the Gaussian elimination will give the correct secret. Thus, we have to repeat our algorithm \(\frac {1}{p_{\mathsf {no noise}}}\) times.
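The whole procedure can be sketched as follows (illustrative; the oracle is assumed to return (v, c) pairs with v packed into a k-bit integer, verif_queries is a fixed set of \(n^{\prime }\) fresh queries, and max_hw is the upper end of the range above — all names are ours).

```python
# Sketch of the Gaussian elimination algorithm.
def gaussian_solve(vs, cs, k):
    # Gauss-Jordan elimination over GF(2); returns a k-bit candidate, or None
    # if the k queries are linearly dependent.
    eqs = list(zip(vs, cs))
    for col in range(k):
        piv = next((i for i in range(col, k) if (eqs[i][0] >> col) & 1), None)
        if piv is None:
            return None
        eqs[col], eqs[piv] = eqs[piv], eqs[col]
        pv, pc = eqs[col]
        for i in range(k):
            if i != col and (eqs[i][0] >> col) & 1:
                eqs[i] = (eqs[i][0] ^ pv, eqs[i][1] ^ pc)
    return sum((eqs[col][1] & 1) << col for col in range(k))

def gaussian_attack(oracle, k, verif_queries, max_hw):
    while True:                                  # about 1/p_nonoise iterations
        vs, cs = zip(*[oracle() for _ in range(k)])
        nu = gaussian_solve(list(vs), list(cs), k)
        if nu is None:
            continue                             # singular system: redraw k queries
        hw = sum((bin(v & nu).count("1") + c) % 2 for v, c in verif_queries)
        if hw <= max_hw:
            return nu                            # Hamming weight in range: accept
```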

Complexity analysis

The computation of the Hamming weight has a cost of \(\mathcal {O}(n^{\prime }k^{2})\). Given that we run the Gaussian elimination and the verification step \(\frac {1}{p_{\mathsf {no noise}}}\) times, we obtain the following theorem for this algorithm:

Theorem 11

Let \(n^{\prime } = \left (\sqrt {2 \ln {\frac {2^{k+1}}{\theta }}} + \sqrt {2 \ln \left (\frac {2}{\theta }\right )} \right )^{2} \delta ^{-2}\). The Gaussian elimination algorithm (\(n = \frac {k + 2}{(1-\tau )^{k}} + n^{\prime } , t = \mathcal {O} \left (\frac {n^{\prime }k^{2} + k^{3}}{(1-\tau )^{k}} \right ), m =k^{2} + n^{\prime } k, \theta ,k\) )-solves the L P N problem.

Remark 2

Notice that this algorithm recovers the whole secret at once and the only assumption we make is that the noise is sparse. We do not need to apply the transform that makes the secret sparse, and no queries are lost during a reduction phase.

Remark 3

In the extreme case where (1−τ)k>θ, the Gaussian elimination algorithm can just assume that k queries have noise 0 and retrieve the secret s without verifying that this is the correct secret.

5 Tightness of our query complexity

In this section we compare the theoretical analysis with implementation results of all the L P N solving algorithms described in Sections 3 & 4.

We implemented the B K W, L F 1 and L F 2 algorithms as they are presented in [33] and in pseudocode in Algorithms 1–3. The implementation was done in C on an Intel Xeon 3.33 GHz CPU. We used a custom bit library to store and handle bit vectors. Using the OpenMP library, we have also parallelized certain crucial parts of the algorithms. The xor-ing in the reduction phases as well as the majority phases, for instance, are easily distributed onto multiple threads to speed up the computation. Furthermore, we implemented the exhaustive search and MITM algorithms described in Section 4. The various matrix operations performed for the sparse L P N solving algorithms are done with the M4RI library. Regarding the memory model used, we implemented the one described in [33] in order to accommodate the L F 2 algorithm. The source code of our implementation can be found at http://lasec.epfl.ch/lpn/lpn_source_code.zip.

We ran all the algorithms for different L P N instances where the size of the secret varies from 32 to 100 bits and the Bernoulli parameter τ takes different values from 0.01 to 0.4. A value of τ = 0.01 for a small k such as the ones we are able to test means that very few, if any, of the queries have the noise bits set to 1. For this sparse case, an exhaustive search is the optimal strategy. Also, τ = 0.4 might seem an extreme case. Still, we provide the query complexity for these extreme cases to fully observe the behaviour of the L P N solving algorithms.

For each L P N instance, we try to find the theoretical number of oracle queries required to get a 50 % probability of recovering the full secret while optimizing the time complexity. This means that in half of our instances we recover the secret correctly. In the other half of the cases it may happen that one or more bits are guessed wrong. We thus take \(\theta = \frac {1}{3}\) as the probability of failure for the first block. We choose a and b that would minimize the time complexity and we apply this split in our theoretical bounds in order to compute the theoretical number of initial queries. We apply the same split in practice and try to minimize the number of initial queries such that we maintain a 50 % probability of success. We thus experimented with different values for the original number of oracle samples, and ran multiple instances of the algorithms to approximate the success probability. One can observe that in our practical and theoretical results the a, b parameters are the same and the comparison is consistent. We were limited by the power of our experimental environment and thus we were not able to provide results for instances that require more than \(2^{30}\) queries.

5.1 B K W

The implementation results for B K W are presented in Table 3. Each entry in the table is of the form \(\log _{2}(n)(a)\), where n is the number of oracle queries that were required to obtain a 50 % success rate for the full recovery of the secret. Parameter a is the algorithm parameter denoting the number of blocks into which the vectors were split. We take \(b= \lceil \frac {k}{a} \rceil \). By maintaining the value of a, we can easily compute the number of queries and the time & memory complexity. In Table 4 we present the theoretical results for B K W obtained by using Theorem 4. We can see that our theoretical and practical results are within a factor of at most 2.

Table 3 B K W query complexity - practice
Table 4 B K W query complexity - theory

If we take the example of L P N 100,0.01, we need \(2^{20.78}\) queries and our theoretical analysis gives a value of \(2^{21.47}\). These two values are very close compared with the value predicted by [33], \(2^{25.64}\), which is more than a factor 10 larger. We emphasize again that for both the theory and the practice we use the split that optimizes the time complexity and from this optimal split we derive the number of queries.

Remark 4

For the B K W algorithm we tried to optimize the average final bias of the queries, i.e. to obtain a better value than \(\delta ^{2^{a-1}}\). Recall that at the beginning of the reduction phase, we group the queries into equivalence classes and then choose a representative vector that is xored with the rest of the queries from the same class. One variation of this reduction operation would be to change the representative vector several times. The incentive for doing so is the following: a single representative vector whose error bit is set to 1 affects the bias δ of all the queries in its class, while by choosing several representative vectors this situation may be improved, since more than half of them are expected to have their error bit set to 0. We implemented this new approach but found that it does not bring any significant improvement. Another change that we tested concerns the majority rule applied during the solving phase. Queries have a worst-case bias of \(\delta ^{2^{a-1}}\) (See Lemma 2), but some have a larger bias. So, we could apply a weighted majority rule. This would decrease the number of queries needed for the solving phase. We implemented the idea and discovered that the complexity advantage is very small.

5.2 L F 1

Below we present the experimental and theoretical results for the L F 1 algorithm. As a first observation we can see that, for all instances, this algorithm is a clear optimization over the original B K W algorithm. As before, each entry is of the form \(\log _{2}(n)(a)\), where n and a are selected to obtain a 50 % success rate for the full recovery of the secret and \(b = \lceil \frac {k}{a} \rceil \).

Table 6 shows our theoretical results for L F 1 using Theorem 5. When we compare the experimental and the theoretical results for L F 1 (See Tables 5 and 6) we can see that the gap between them is of a factor up to 3.

Table 5 L F 1 query complexity - practice
Table 6 L F 1 query complexity - theory

Remark 5

One may observe a larger difference for the L P N 48,0.4 instance: the implementation requires \(n = 2^{19.74}\) initial queries, while the theory requires \(n = 2^{24.01}\) queries. Here we have a = 2 and b = 24 and the term \((a-1)2^{b}\) dominates the query complexity. The discrepancy comes from the worst-case analysis of the reduction phase, where we say that at each reduction step we discard \(2^{b}\) queries. With this reasoning, we predict to lose \(2^{24}\) queries. If we analyse more closely, we discover that in the average case we discard only \(2^{b} \cdot \left [1 - \left (1-\frac {1}{2^{b}}\right )^{n}\right ]\) queries (the expected number of non-empty equivalence classes). Thus, with only \(2^{19.74}\) initial queries, we run the reduction phase and discard \(2^{19.70}\) queries instead of \(2^{24}\). We are left with \(2^{14.45}\) queries, which are sufficient for the solving phase. We note that for large L P N instances, this difference between the worst-case and average-case analysis of the number of deleted queries during reduction rounds becomes negligible.
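The average-case count used above can be reproduced with a short computation (illustrative; the function name is ours):

```python
# Sketch: expected number of non-empty equivalence classes (i.e. discarded
# queries) in one reduction step, for the LPN_{48,0.4} example above.
import math

def expected_discarded(n, b):
    return 2 ** b * (1 - (1 - 2 ** (-b)) ** n)

print(math.log2(expected_discarded(2 ** 19.74, 24)))   # about 19.70, not 24
```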

Remark 6

Recall that in L F 1, like in all L P N solving algorithms, we perform the reduction phase by splitting the queries into a blocks of size b. When this split is not possible, we consider that we have a−1 blocks of size b and a last, shorter block of size \(b^{\prime }\) with \(b^{\prime }<b\). We also considered a variant of L F 1 that makes use of the Walsh transform but where the split into blocks is done differently: we now allow the last block to be larger than the rest. The gain of this strategy could be the following: given that we recover a larger block of the key, we run our solving phase fewer times. Although the complexity of the transform is bigger as we work with a bigger block, the reduction phase has to be applied fewer times. From our experiments, there seems to be no difference between the performance of the two algorithms.

5.3 L F 2

We tested the L F 2 heuristic on the same instances as for B K W and L F 1. The results are summarized in Table 7. To illustrate the performance of the heuristic, we concentrate on a particular instance, L P N 100,0.1 with a = 5,b = 20. As derived in [33], the L F 1 algorithm for this parameter set should require less than \((8\cdot b+200)\cdot \delta ^{-2^{a}} \approx 2^{18.77}\) queries for a solving phase and \((a-1)\cdot 2^{b} + (8\cdot b+200)\cdot \delta ^{-2^{a}} \approx 2^{22.13}\) queries overall to achieve a success probability of 50 %. Using our theoretical analysis, the L F 1 algorithm for this parameter set requires \(8 \ln (3 \cdot 2^{b}) \delta ^{-2^{a}} + (a-1)2^{b} \approx 2^{22.05}\) queries overall and \(2^{17.20}\) queries for the solving phase. Our experimental results for L F 1 were a bit lower than our theoretical ones: \(2^{21.38}\) oracle samples were sufficient. If we use the L F 2 heuristic starting with \(3\cdot 2^{20} \approx 2^{21.58}\) samples, we get about the same amount of vectors for the solving phase. In this case there are no queries lost during reduction. We thus have many more queries than should actually be required for a successful solving phase and correctly solve the problem with success probability close to 100 %. So we can try to start with fewer. By starting off with \(2^{20.65}\) queries and thus losing some queries in each reduction round, we also solved the L P N problem in slightly over 50 % of the cases. The gain in total query complexity for L F 2 is thus noticeable but not extremely important.
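The behaviour described above can be checked with a short computation (illustrative). Recall that L F 2 xors all pairs of queries inside each equivalence class; assuming the queries spread evenly over the \(2^{b}\) classes, each class holds about \(n/2^{b}\) queries and contributes \(\binom{n/2^{b}}{2}\) xored pairs, so starting from \(3 \cdot 2^{b}\) queries the count is preserved through each round.

```python
# Sketch: heuristic query count after each LF2 reduction round for
# LPN_{100,0.1} with a = 5, b = 20 (so a - 1 = 4 rounds).
import math

def lf2_round(n, b):
    per_class = n / 2 ** b                  # queries per class, assumed even
    return 2 ** b * per_class * (per_class - 1) / 2

n = 3 * 2 ** 20                             # starting amount, about 2^21.58
for _ in range(4):
    n = lf2_round(n, 20)
    print(math.log2(n))                     # stays at ~21.58: no queries lost
```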

Table 7 L F 2 query complexity - practice

As another example, consider the parameter set k = 768,τ = 0.05 proposed at the end of [33]. The values for a, b which minimize the query complexity are a = 9,b = 86 (ab = 774>k). Solving the problem with L F 1 should thus require about \(2^{87}\) vectors for the solving phase and \(2^{89}\) oracle samples overall. Using L F 2, as \(3\cdot 2^{b} \approx 2^{87}\) oracle samples would be sufficient, we obtain a reduction by a factor ≈4.

Even though L F 2 introduces linear dependencies between queries, this doesn’t seem to have any noticeable impact on the success probability in recovering the secret value.

Remark 7

A general observation for all these three algorithms, shown also by our results, is that the bias has a big impact on the number of queries and the complexity. Recall that the bias has value \(\delta ^{2^{a-1}}\) at the end of the reduction phase. We can see from our tables that the lower the value of τ, i.e. the larger the value of δ = 1−2τ, the higher a can be chosen to solve the L P N instance. Also, for a constant τ, the larger the size of the secret, the higher a can be chosen.

Remark 8

The L F 2 algorithm is a variation of L F 1 that offers a different heuristic technique to decrease the number of initial queries. The same trick could be used for B K W , exhaustive search and MITM.

While the same analysis can be applied for exhaustive search and MITM as for L F 2, B K W is a special case. We denote by B K W 2 this variation of B K W where we use the reduction phase from L F 2. Recall that for B K W we need to have \(n = 2^{b+1} \delta ^{-2^{a}} \ln \left (\frac {b}{\theta }\right ) + (a-1)2^{b}\) queries and here the dominant term is \(2^{b+1} \delta ^{-2^{a}} \ln \left (\frac {b}{\theta }\right )\). Thus, we need to start with \(3 \cdot 2^{b+\varepsilon }\) queries, where ε>0, and increase their number such that at the end of the last iteration of the reduction we get the required number of queries. This improves the initial number of queries and we gain a factor a in the time complexity. For an L P N 48,0.1 instance, our implementation of B K W 2 requires \(n = 2^{13.82} = 3.54 \cdot 2^{12}\) initial queries and increases it, during the reduction phase, up to \(2^{19.51}\), the amount of queries needed for the solving phase. Thus, there is an improvement from \(2^{19.99}\) (See Table 3) to \(2^{13.82}\) and the time complexity is better. While this is an improvement over B K W, it still performs worse than L F 1 and L F 2.

5.4 Exhaustive search

Recall that for exhaustive search we have two variants. The results for S e a r c h 1 are displayed in Tables 8 and 9. For S e a r c h 1 we observe that the gap between theory and practice is of a factor smaller than 4. In terms of number of queries, S e a r c h 1 brings a small improvement compared to L F 1. We will see in the next section the complete comparison between all the implemented algorithms. The same \((a-1)2^{b}\) dominant term causes the bigger difference for the instances L P N 48,0.4 and L P N 64,0.25.

Table 8 S e a r c h 1 query complexity - practice
Table 9 S e a r c h 1 query complexity - theory

The results for S e a r c h 2 are displayed in Tables 10 and 11.

Table 10 S e a r c h 2 query complexity - practice
Table 11 S e a r c h 2 query complexity - theory

We notice that for both S e a r c h 1 and S e a r c h 2 the instances L P N 32,0.01, L P N 48,0.01 and L P N 64,0.01 require a very low number of queries. This is due to the following observation: for n ≤ 68 linearly independent queries and τ = 0.01 we have that the noise bits are all 0 with a probability larger than 50 %. Thus, for k ≤ 64 we hope that the k queries we receive from the oracle have all the noise set on 0. With k noiseless, linearly independent queries we can just recover s with Gaussian elimination. This is an application of Remark 3.

5.5 MITM

In the case of MITM, the experimental and theoretical results are illustrated in Tables 12 and 13. There is a very small difference between MITM and the exhaustive search algorithms for a sparse secret: in practice, MITM requires just a few tens of queries fewer than S e a r c h 1 and S e a r c h 2 for the same a and b parameters.

Table 12 MITM query complexity - practice
Table 13 MITM query complexity - theory

5.6 Gaussian elimination

As aforementioned, in the Gaussian elimination the only assumption we need is that the noise is sparse. We don’t run any reduction technique and the noise is not affected. As the algorithm depends on the probability of having a 0 noise on k linearly independent vectors, the complexity degrades very quickly once we are outside the sparse noise scenario. We present in Table 14 the theoretical results obtained for this algorithm.

Table 14 Gaussian elimination query complexity - theory

In the next section we will show the effectiveness of this simple idea in the sparse case scenario and compare it to the other L P N solving algorithms.

Again for L P N 32,0.01, L P N 48,0.01 and L P N 64,0.01 we apply Remark 3.

5.7 Covering codes

The covering code algorithm requires the existence of a code with optimal coverage. For each instance one has to find an optimal code that minimizes the query and time complexity. Unlike the previous algorithms, this algorithm cannot be fully automated. In practice we could test only the cases that were suggested in [24]. Thus, we are not able to compare the theoretical and practical values. Nevertheless, we will give theoretical values for different practical parameters in the next section.

6 Complexity analysis of the L P N solving algorithms

We have compared our theoretical bounds with our practical results and we saw that there is a small difference between the two. Our theoretical analysis also gives tighter bounds compared with the results from [33]. We now extend our theoretical results and compare the asymptotic performance of all the L P N algorithms for practical parameters used by the L P N-based constructions. We consider the family of \(\mathsf {LPN}_{k,\frac {1}{\sqrt {k}}}\) instances proposed in [3, 15]. Although the covering code cannot be automated, as for each instance we have to try different codes with different sizes and dimensions, we provide results also for this algorithm. When dealing with the covering code reduction, we always assume the existence of an ideal code and compute the bias introduced by this step. We do not consider here concatenation of ideal codes and we will see that we obtain a worse result for the L P N 512,0.125 instance compared with the result from [24], although the difference is small. In the covering code algorithm we also stick with the B K W reduction steps and don’t use the L F 2 reduction. As aforementioned, the L F 2 reduction brings a small improvement to the final complexity. This does not affect the comparison between all the L P N solving algorithms.

We analyse the time complexity of each algorithm, by which we mean the number of bit operations the algorithm performs while solving an L P N problem. For each algorithm, we consider values of k for which the parameters (a, b) minimising the time complexity are such that k = ab. For the L F 2 algorithm, we select the initial number of queries such that we are left with at least \(n^{\prime }=8\ln (3 \cdot 2^{b})\delta ^{-2^{a}}\) queries after the reduction phase. Recall that by S e a r c h 1 we denote the standard exhaustive search algorithm and that S e a r c h 2 makes use of a Walsh-Hadamard transform. The results are illustrated in Fig. 1. We recall the time complexity and the initial number of queries for each algorithm in Table 15, where S represents the number of sparse secrets, with \(S < 2^{b}\). For MITM, the values S 1 (resp. S 2) represent the number of possible values for the first (resp. second) half of the secret, \(n^{\prime } = 8 (\ln (6 S_{1}S_{2})) \delta ^{-2^{a}}\) represents the number of queries left after the reduction phase and ξ represents the Hamming weight of the mask used. In the case of the covering codes algorithm, all \(a,b,a^{\prime },k^{\prime },k^{\prime \prime },l,w_{0}, \varepsilon _{set}\) are parameters of the algorithm and \(n^{\prime }\) represents the number of queries left after the reduction phase. Recall that θ is \(\frac {1}{3}\).

Fig. 1 Time Complexity of L P N Algorithms on instances \(\mathsf {LPN}_{k, \frac {1}{\sqrt {k}}}\)

Table 15 Query & Time complexity for L P N solving algorithms for recovering the first b bits

We can bound the logarithmic complexity of all these algorithms from above by \(\frac {k}{\log _{2}(k)} + c_{1}\) and from below by \(c_{3} \log _{2}(k) + \frac {\sqrt {k}}{\ln (2)} + c_{2}\). The lower bound is given by the asymptotic complexity of the Gaussian elimination, which can be expressed as \(c \log _{2}{k} + \frac {\sqrt {k}}{\ln (2)}\) when \(\tau = \frac {1}{\sqrt {k}}\).

The complexity of B K W can be written as \(\min _{k=ab} \left (\mathsf {poly} \cdot 2^{b} \cdot \delta ^{-2^{a}}\right )\) and for the other algorithms (except the Gaussian elimination) the formula is \(\min _{k=ab} \left (\mathsf {poly} \cdot \left (2^{b} + \delta ^{-2^{a}}\right )\right )\), where p o l y denotes a polynomial factor. By searching for the optimal a, b values, we have two cases:

  • for a>1, we find \(a\sim \log _{2}\frac {k}{(\log _{2}k)^{2}\ln \frac 1\delta }\) and \(b = \frac {k}{a}\) and obtain that \(2^{b}\) dominates \(\delta ^{-2^{a}}\). For \(\delta = 1 - \frac {2}{\sqrt {k}}\) we obtain the complexity \(\mathsf {poly} \cdot 2^{\frac {k}{\log _{2}(k)}}\).

  • for a = 1, we have that for

    • B K W the complexity is \(\mathsf {poly} \cdot 2^{k}\)

    • L F 1, L F 2, S e a r c h 2 the complexity is \(\mathsf {poly} + k 2^{k}\)

    • S e a r c h 1 , MITM the complexity is \(\mathsf {poly} \cdot S_{r}\) and \(\mathsf {poly} \cdot S_{r^{\prime }}^{2}\), respectively, where we define S r to be \(\#\{v\in \{0,1\}^{k} \mid HW(v)\leq r\}\). We need to bound the value of S r . By induction we can show that \(S_{r} \leq \frac {k}{k-r-1} \cdot \frac {k^{r}}{r!}\). For \(\tau \approx \frac {1}{\sqrt {k}}\), we have that \(r \approx \left (1 + \frac {\sigma }{2}\right ) \sqrt {k}\) and \(r^{\prime } \approx \left (\frac {1}{2} + \frac {\sigma }{2 \sqrt {2}}\right ) \sqrt {k}\). We obtain that the complexity for both algorithms is \(\mathsf {poly} \cdot 2^{\gamma \sqrt {k} \log _{2}{k} + \mathcal {O}\left (\sqrt {k}\right )}\), where γ is a constant. This is not better than \(2^{\frac {k}{\log _{2}(k)}}\) for k < 200 000, but asymptotically this gives a better complexity.

For Gaussian elimination, the complexity is \(\frac {\mathsf {poly}}{(1-\tau )^{k}}\) which is \(\mathsf {poly} \cdot 2^{\sqrt {k}}\) for \(\tau = \frac {1}{\sqrt {k}}\).

We see in Fig. 1 that in some cases increasing the value of k may result in a decrease in time complexity. The reason for this is that we are considering L P N instances where the noise parameter τ takes value \(\frac {1}{\sqrt {k}}\). Thus, as k grows, the noise is reduced, which leads to an interesting trade-off between the complexity of the solving phase and the complexity of the reduction phase of the various algorithms. This behaviour does not seem to occur for the B K W algorithm. In this case, the query complexity \(n=2^{b+1}\left (1-\frac {2}{\sqrt {k}}\right )^{-2^{a}}\ln (2k) + (a -1)2^{b}\) is largely dominated by the first term, which grows exponentially not only in terms of the noise parameter, but also in terms of the block size b.

Remark 9 (L F 1 vs. S e a r c h 2 )

As shown in Fig. 1, the overall complexity of the L F 1 and S e a r c h 2 algorithms is quasi identical. From Theorems 5 and 9, we deduce that for the same parameters (a, b), the S e a r c h 2 algorithm should perform better as long as S < 2b−1. This is indeed the case for the instances we consider here, although the difference in complexity is extremely small.

We can see clearly that for the \(\mathsf {LPN}_{k,\frac {1}{\sqrt {k}}}\) family of instances, the Gaussian elimination outperforms all the other algorithms for k>500. No k < 1000 makes \(\mathsf {LPN}_{k,\frac {1}{\sqrt {k}}}\) reach 80 bit security; this level is achieved for k = 1090.

Selecting secure parameters

We recall that for each algorithm we considered, our analysis made use of a heuristic assumption of query and noise independence after reduction. In order to propose security parameters, we simply consider the algorithm which performs best under this assumption.

By taking all the eight algorithms described in this article, Tables 16, 17, 18, 19, 20, 21, 22, 23 display the logarithmic time complexity for various L P N parameters. For instance, the L F 2 algorithm requires \(2^{84}\) steps to solve an L P N 384,0.25 instance.

Table 16 Security of L P N against B K W
Table 17 Security of L P N against L F 1
Table 18 Security of L P N against L F 2
Table 19 Security of L P N against S e a r c h 1
Table 20 Security of L P N against S e a r c h 2
Table 21 Security of L P N against MITM
Table 22 Security of L P N against Gaussian elimination
Table 23 Security of L P N against Covering codes

We recall here the result from [24]: the instance L P N 512,0.125 offers a security level of 79.7 bits. We obtain a value of 82 bits. The difference comes mainly from the use of the L F 2 reduction in [24] and from a search for an optimal concatenation of linear codes.

When comparing all the algorithms, we have to keep in mind that the Gaussian elimination recovers the whole secret, while for the rest of the algorithms we give the complexity to recover a block of the secret. Still, this does not affect our comparison as we have proven in Section 3 that the complexity of recovering the first block dominates the total complexity.

We highlight with red the best values obtained for different L P N instances. We observe the following behaviour: for a sparse case scenario, i.e. τ = 0.05 for k≥576 or \(\tau = \frac {1}{\sqrt {k}} < 0.05\), the Gaussian elimination offers the best performance. For \(\tau = \frac {1}{\sqrt {k}}\), no k from our tables offers an 80 bit security. Once we are outside the sparse case scenario, we have that L F 2 and the covering code algorithms are the best ones. The covering code proves to be better than L F 2 for a noise level of 0.125. While the performance of the covering code reduction highly depends on the sparseness of the noise, L F 2 has a more general reduction phase and is more efficient for noise parameters of 0.25 and 0.4. Also, for τ>0.05 the covering code is better than the Gaussian elimination.

Thus, for different scenarios, there are different algorithms that prove to be efficient. This comparison clearly shows that for the family of instances \(\mathsf {LPN}_{k,\frac {1}{\sqrt {k}}}\) neither B K W nor its variants are the best choice. One should use the Gaussian elimination algorithm.

As we have shown, there still remains a small gap between the theoretical and practical results for the algorithms we analysed. It thus seems reasonable to take a safety margin when selecting parameters to achieve a certain level of security.

Based on this analysis, we could recommend the L P N instances L P N 512,0.25, L P N 640,0.125, L P N 1280,0.05 or \(\mathsf {LPN}_{1280, \frac {1}{\sqrt {1280}}}\) to achieve 80 bit security for different noise levels. We note that the value L P N 768,0.05 that Levieil and Fouque suggest as a secure instance to use actually offers only 66 bit security and thus is not recommended.

7 Conclusion

In this article we have analysed and presented the existing L P N algorithms in a unified framework. We introduced a new theoretical analysis and this has improved the bounds of Levieil and Fouque [33]. In order to give a complete analysis of the L P N solving algorithms, we also presented three algorithms that take advantage of a sparse secret. We also analysed the latest algorithm, presented at Asiacrypt 2014. While the covering code and the L F 2 algorithms perform best in the general case where the Bernoulli noise parameter is constant, the Gaussian elimination shows that for the sparse case scenario the length of the secret should be bigger than 1100 bits. Also, we show that some values proposed by Levieil and Fouque are insecure in the sparse case scenario.