1 Introduction

The hardness of noisy learning problems such as learning parity with noise (LPN) [BFKL93, BKW03] and learning with errors (LWE) [Reg05] has proved to be a goldmine in modern cryptography. The hardness of \(\mathsf {LWE}\) has been instrumental in solving long-standing problems such as fully homomorphic encryption [Gen09, BV11]. Both \(\mathsf {LPN}\) and \(\mathsf {LWE}\) have given us efficient and plausibly quantum-proof cryptographic constructions [KPC+11, BCD+16, ADPS16]. However, while we know several structural results about LWE, relatively little is known about the 25-year-old LPN problem.

Before we proceed, let us define the LPN and LWE problems. In the (search version of the) \(\mathsf {LPN}\) problem, the algorithm is given access to an oracle that produces samples \((\mathbf {{a}}_i, \mathbf {{s}}^T \mathbf {{a}}_i + e_i)\) where \(\mathbf {{s}}\in \mathbb {Z}_2^n\) is the “secret” vector, the \(\mathbf {{a}}_i \in \mathbb {Z}_2^n\) are uniformly distributed and the \(e_i \in \mathbb {Z}_2\) come from the Bernoulli distribution (that is, each \(e_i\) is 1 with probability \(\epsilon \) and 0 otherwise). The goal is to recover \(\mathbf {{s}}\). The (search version of the) \(\mathsf {LWE}\) problem is the same except for two key changes: first, the vectors \(\mathbf {{a}}_i \in \mathbb {Z}_q^n\) are uniformly random with entries from some large enough finite field \(\mathbb {Z}_q\), and second, each error term \(e_i\) is chosen from the discrete Gaussian distribution over the integers. The exact choice of the error distribution does not matter much: what is important is that in \(\mathsf {LWE}\), each sample has an error with bounded absolute value (at least with high probability). These seemingly minor differences matter a great deal: we know worst-case to average-case reductions for \(\mathsf {LWE}\) [Reg05, Pei09, BLP+13] while no such result is known for \(\mathsf {LPN}\); we know that (a decisional version of) \(\mathsf {LWE}\) is in the complexity class \(\mathcal {SZK}\) [MV03] (statistical zero-knowledge) while no such result is known for \(\mathsf {LPN}\); and we can build a dizzying array of cryptographic primitives assuming the hardness of \(\mathsf {LWE}\) (e.g., attribute-based encryption and homomorphic encryption, to name the more exotic examples) while the repertoire of \(\mathsf {LPN}\) is essentially limited to one-way functions and public-key encryption (and primitives that can be constructed generically from them). In particular, we do not know how to construct even simple, seemingly “unstructured”, primitives such as a collision-resistant hash function from the hardness of \(\mathsf {LPN}\), even with extreme parameter choices. Can we bridge this puzzling gap between \(\mathsf {LWE}\) and \(\mathsf {LPN}\)?

In a nutshell, the goal of this paper is to solve all three of these problems. Our main tool is a smoothing lemma for binary linear codes. We proceed to describe our results and techniques in more detail.

1.1 Overview of Our Results and Techniques

Worst-Case to Average-Case Reduction. We consider the promise nearest codeword problem (NCP), a worst-case analog of the learning parity with noise problem. Roughly speaking, in the search version of the (n, m, w)-promise nearest codeword problem, one is given the generator matrix \(\mathbf {{C}} \in \mathbb {Z}_2^{n \times m}\) of a linear code, along with a vector \(\mathbf {{t}} \in \mathbb {Z}_2^m\) such that \(\mathbf {{t}}^T = \mathbf{s}^T \mathbf {{C}} + \mathbf {{x}}^T\) for some \(\mathbf{s}\in \mathbb {Z}_2^n\) and \(\mathbf {{x}} \in \mathbb {Z}_2^m\), with the promise that \(\mathsf {wt}(\mathbf {{x}}) = w\). The problem is to find \(\mathbf{s}\). The non-promise version of this problem (which is commonly called the nearest codeword problem) is known to be \(\mathcal {NP}\)-hard, even to solve approximately [ABSS93], and the promise problem is similarly \(\mathcal {NP}\)-hard in the large-error regime (that is, when the Hamming weight of \(\mathbf {{x}}\) exceeds \((1/2+\epsilon )d\) where d is the minimum distance of the code and \(\epsilon >0\) is an arbitrarily small constant) [DMS99].

In terms of algorithms, Berman and Karpinski [BK02] show how to find an \(O(n/\log n)\)-approximate nearest codeword in polynomial time. In particular, this means that if the Hamming weight of \(\mathbf {{x}}\) in the promise version is at most \(O(d\cdot \log n/n)\), their algorithm efficiently recovers the \(\mathbf{s}\) corresponding to the unique nearest codeword. To the best of our knowledge, this result is the current limit of polynomial-time solvability of the promise nearest codeword problem. Alon, Panigrahy and Yekhanin [APY09] show a deterministic nearly-polynomial time algorithm with the same parameters. In this work, we consider the promise NCP for balanced codes, where all nonzero codewords have Hamming weight between \((1/2-\beta )m\) and \((1/2+\beta )m\) for some balance parameter \(\beta >0\). We are not aware of improved NCP algorithms that apply to balanced codes.

Our first result (in Sect. 4) shows a reduction from the worst-case promise NCP for balanced codes where \(w/m \approx \frac{\log ^2 n}{n}\) to the average-case hardness of \(\mathsf {LPN}^{n}_{{\epsilon }}\) with very high error-rate \(\epsilon = 1/2 - 1/O(n^4)\). We note that a random linear code is \(\beta \)-balanced with overwhelming probability when \(\beta \ge 3\sqrt{n/m}\), so for a sufficiently large m the restriction on \(\beta \) is satisfied by most codes. Thus, qualitatively speaking, our result shows that solving LPN with very high error on the average implies solving NCP with very low error for most codes. While the parameters we achieve are extreme, we emphasize that no worst-case to average-case reduction for LPN was known prior to our work.

The worst-case to average-case reduction is a simple consequence of a smoothing lemma for codes that we define and prove in Sect. 3. In a nutshell, our smoothing lemma gives a simple randomized procedure that maps a worst-case linear code \(\mathcal{C}\) and a vector \(\mathbf {{t}}\) to a random linear code \(\mathcal{C}'\) and a vector \(\mathbf {{t}}'\) such that if \(\mathbf {{t}}\) is very close to \(\mathcal{C}\), then \(\mathbf {{t}}'\) is somewhat close to \(\mathcal{C}'\). Our worst-case to average-case reduction then follows simply by applying the smoothing lemma to the worst-case code and vector. We give a simple Fourier-analytic proof of the smoothing lemma, in a way that is conceptually similar to analogous statements in the context of lattices [MR04] (see more details at the end of Sect. 3). Similar statements have been shown before in the list-decoding high-error regime [KS10], whereas our setting for NCP is in the unique-decoding (low-error) regime.

Statistical Zero-Knowledge. Another consequence of our smoothing lemma is a statistical zero-knowledge proof for the NCP problem for balanced codes with low noise, namely where \(w/m \approx \frac{\log ^2 n}{n}\). In particular, we show that the search problem is in \(\mathcal {BPP}^{\mathcal {SZK}}\). Membership in \(\mathcal {BPP}^{\mathcal {SZK}}\) should be viewed as an easiness result: a consequence of this result and a theorem of Mahmoody and Xiao [MX10] is that NCP with low noise cannot be \(\mathcal {NP}\)-hard unless the polynomial hierarchy collapses. Our result is the first non-\(\mathcal {NP}\)-hardness result we know for NCP, complementing the \(\mathcal {NP}\)-hardness result of Dumer, Micciancio and Sudan [DMS99] for noise slightly larger than half the minimum distance, namely where \(w/m \approx 1/2\) (but leaving a large gap in between). This is the LPN/codes analog of a result for LWE/lattices that we have known for over a decade [MV03]. We refer the reader to Sect. 5 for this result.

Collision-Resistant Hashing. Finally, we show a new cryptographic consequence of the hardness of LPN with low noise, namely a construction of a collision-resistant hash (CRH) function. Again, collision-resistant hashing from LWE/lattices has been known for over two decades [Ajt96, GGH96] and we view this result as an LPN/codes analog. The construction is extremely simple: the family of hash functions is parameterized by a matrix \(\mathbf {A} \in \mathbb {Z}_2^{n\times n^{1+c}}\) for some \(c>0\), its domain is the set of vectors \(\mathbf {{x}} \in \mathbb {Z}_2^{n^{1+c}}\) with Hamming weight \(2n/(c\log n)\) and the output is simply \(\mathbf {A}\mathbf {{x}} \pmod {2}\). This is similar to a CRH construction from the recent work of Applebaum et al. [AHI+17] modulo the setting of parameters; what is new in our work is a reduction from the LPN problem with error rate \(O(\log ^2 n/n)\) to breaking this CRH function.

Related Work. Our LPN-based collision-resistant hash function was used in [BLSV17] as a basis for constructing an identity based encryption scheme based on LPN with very low noise. Concurrently with, and independently from, our work, Yu et al. [YZW+17] constructed a family of collision-resistant hash functions based on the hardness of LPN using the same main idea as in Sect. 6 of the present work. While the core ideas of the construction in the two works are identical, [YZW+17] further discusses different parameter settings as well as heuristics under which one can obtain a tighter connection between the hardness of the CRH and the LPN problem.

2 Preliminaries

2.1 Notation

Throughout the paper, we will be working with elements in the additive group \(\mathbb {Z}_2\) with the usual addition operation. We will denote by bold lower-case letters vectors over \(\mathbb {Z}_2^n\) for \(n>1\), and by bold upper-case letters matrices over \(\mathbb {Z}_2^{m\times n}\) for \(m,n>1\). We will make the assumption that all vectors are column vectors and write \(\mathbf{a}^T\) to denote the row vector which is the transpose of \(\mathbf{a}\). The Hamming weight of \(\mathbf{a}\in \mathbb {Z}_2^n\), written as \(\mathsf {wt}(\mathbf{a})\), denotes the number of 1’s in \(\mathbf{a}\). For a set S, we write \(s\leftarrow S\) to denote that s is chosen uniformly at random from S. When D is some probability distribution, then \(s\leftarrow D\) means that s is chosen according to D.

The \(\mathsf {Ber}_\epsilon \) distribution over \(\mathbb {Z}_2\) is the Bernoulli distribution that outputs 1 with probability \(\epsilon \) and 0 with probability \(1-\epsilon \). Let \(\mathcal {S}^{m}_{k}\) be the set of all the elements \(\mathbf{s}\in \mathbb {Z}_2^m\) such that \(\mathsf {wt}(\mathbf{s})=k\).

A negligible function \(\mathrm{negl}(n)\) is any function that vanishes faster than the inverse of every polynomial in n. In particular, for every polynomial p there is an \(n_0 \in \mathbb {N}\) such that for every \(n>n_0\), \(\mathrm{negl}(n) < 1/p(n)\).

2.2 The Learning Parity with Noise (LPN) Problem

For an \(\mathbf{s}\in \mathbb {Z}_2^n\), and an \(\epsilon \in [0,\frac{1}{2}]\) let \(\mathcal {O}^n_{\mathbf{s},\epsilon }\) be an algorithm that, when invoked, chooses a random \(\mathbf{a}\leftarrow \mathbb {Z}_2^n\) and \(e\leftarrow \mathsf {Ber}_\epsilon \) and outputs \((\mathbf{a},\mathbf{s}^T\mathbf{a}+e)\). An algorithm \(\mathsf A\) is said to solve the search \(\mathsf {LPN}^{n}_{{\epsilon }}\) problem with probability \(\delta \) if

$$\Pr [\mathsf A^{\mathcal {O}^n_{\mathbf{s},\epsilon }} \Rightarrow \mathbf{s}~;~ \mathbf{s}\leftarrow \mathbb {Z}_2^n]\ge \delta .$$

Let \(\mathcal {U}^n\) be an algorithm that, when invoked, chooses random \(\mathbf{a}\leftarrow \mathbb {Z}_2^n\) and \(b\leftarrow \mathbb {Z}_2\) and outputs \((\mathbf{a},b)\). We say that an algorithm \(\mathsf A\) has advantage \(\delta \) in solving the decisional \(\mathsf {LPN}^{n}_{{\epsilon }}\) problem if

$$\left| \Pr [\mathsf A^{\mathcal {O}^n_{\mathbf{s},\epsilon }}\Rightarrow 0;~\mathbf{s}\leftarrow \mathbb {Z}_2^n] - \Pr [\mathsf A^{\mathcal {U}^n}\Rightarrow 0]\right| \ge \delta .$$

The LPN problem has a search to decision reduction (cf. [KS06]). Namely, if there is an algorithm that runs in time t and has advantage \(\delta \) in solving the decisional \(\mathsf {LPN}^{n}_{{\epsilon }}\) problem, then there is an algorithm that runs in time \(O(nt/\delta )\) that solves the search \(\mathsf {LPN}^{n}_{{\epsilon }}\) problem with probability \(\approx 1\).
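To make the oracles above concrete, the following is a minimal Python sketch (our own illustration, assuming numpy; all names are illustrative and not part of the original text) of the sample oracle \(\mathcal {O}^n_{\mathbf{s},\epsilon }\) and the uniform oracle \(\mathcal {U}^n\).

```python
# A minimal sketch of the LPN sample oracle O^n_{s,eps} and the uniform oracle U^n.
import numpy as np

def lpn_oracle(s, eps, rng):
    """One sample (a, s^T a + e mod 2) with a uniform in Z_2^n and e ~ Ber_eps."""
    a = rng.integers(0, 2, size=len(s))
    e = int(rng.random() < eps)
    return a, (int(s @ a) + e) % 2

def uniform_oracle(n, rng):
    """One sample (a, b) with a and b uniform and independent."""
    return rng.integers(0, 2, size=n), int(rng.integers(0, 2))

# Example usage with toy parameters.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=16)                       # the secret
samples = [lpn_oracle(s, 0.125, rng) for _ in range(10)]
```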

The following fact is known in some contexts as The Piling-Up Lemma [Mat93].

Lemma 2.1

For all \(\epsilon \in [0,\frac{1}{2}]\) it holds that \(\Pr [e_1+\ldots +e_k = 0;~e_i\leftarrow \mathsf {Ber}_\epsilon ]=\frac{1}{2}+\frac{1}{2}\cdot \left( 1-2\epsilon \right) ^k\).
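For completeness, here is a short inductive derivation of Lemma 2.1 (ours, not part of the original text). Writing \(p_k = \Pr [e_1+\ldots +e_k = 0]\), the base case is immediate, and the inductive step conditions on whether \(e_{k+1}\) flips the running sum:

$$\begin{aligned} p_1&= 1-\epsilon = \tfrac{1}{2}+\tfrac{1}{2}(1-2\epsilon ), \\ p_{k+1}&= (1-\epsilon )\,p_k + \epsilon \,(1-p_k) = \epsilon + (1-2\epsilon )\,p_k = \tfrac{1}{2}+\tfrac{1}{2}(1-2\epsilon )^{k+1}. \end{aligned}$$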

2.3 The Nearest Codeword Problem

A (binary) (n, m, d)-code \(\mathcal{C}\) is a subset of \(\{0,1\}^m\) such that \(|\mathcal{C}| = 2^n\) and for any two distinct codewords \(\mathbf {{x}}, \mathbf {{y}}\in \mathcal{C}\), \(\mathsf {wt}(\mathbf {{x}}\oplus \mathbf {{y}}) \ge d\). The code is linear (denoted [n, m, d]-code) if \(\mathcal{C}\) is the row span of some matrix \(\mathbf {{C}}\in \{0,1\}^{n \times m}\).

Definition 2.1

(Nearest Codeword Problem \((\mathsf {NCP})\)). The nearest codeword problem \(\mathsf {NCP}_{n, m, w}\) is characterized by \(n,m,w \in \mathbb {Z}\) and is defined as follows. The input consists of a matrix \(\mathbf {{C}} \in \mathbb {Z}_2^{n \times m}\) which is the generator of a code, along with a vector \(\mathbf {{t}} \in \mathbb {Z}_2^m\) such that \(\mathbf {{t}}^T = \mathbf{s}^T \mathbf {{C}} + \mathbf {{x}}^T\) for some \(\mathbf{s}\in \mathbb {Z}_2^n, \mathbf {{x}} \in \mathbb {Z}_2^m\) with \(\mathsf {wt}(\mathbf {{x}}) = w\). The problem is to find \(\mathbf{s}\).

Note that our definition requires \(\mathsf {wt}(\mathbf {{x}})=w\), as opposed to the more relaxed requirement \(\mathsf {wt}(\mathbf {{x}})\le w\). However, since w comes from a polynomial-size domain \(\{0,\ldots , m\}\), the difference is not very substantial (in particular, to solve the relaxed version one can go over all polynomially-many relevant values of w and try solving the exact version).

In this work, we consider a variant of the problem which is restricted to balanced codes, namely codes in which every non-zero codeword has Hamming weight close to m/2. We start by defining balanced codes and then present balanced \(\mathsf {NCP}\).

Definition 2.2

A code \(\mathcal{C}\subseteq \{0,1\}^m\) is \(\beta \)-balanced if its minimum distance is at least \(\tfrac{1}{2}(1-\beta )m\) and maximum distance is at most \(\tfrac{1}{2}(1+\beta )m\).

Definition 2.3

(balanced \(\mathsf {NCP}\) (\(\mathsf {balNCP}\))). The balanced nearest codeword problem \(\mathsf {balNCP}_{n, m, w, \beta }\) is characterized by \(n,m,w \in \mathbb {Z}\) and \(\beta \in (0,1)\), and is defined as follows. The input consists of a matrix \(\mathbf {{C}} \in \mathbb {Z}_2^{n \times m}\) which is the generator of a \(\beta \)-balanced code, along with a vector \(\mathbf {{t}} \in \mathbb {Z}_2^m\) such that \(\mathbf {{t}}^T = \mathbf{s}^T \mathbf {{C}} + \mathbf {{x}}^T\) for some \(\mathbf{s}\in \mathbb {Z}_2^n, \mathbf {{x}} \in \mathbb {Z}_2^m\) with \(\mathsf {wt}(\mathbf {{x}}) = w\). The problem is to find \(\mathbf{s}\).

The \(\mathsf {balNCP}_{n, m, w, \beta }\) problem has a unique solution when \(w \le \frac{1}{4}(1- \beta )m\), since any two distinct codewords are then at distance at least \(\frac{1}{2}(1-\beta )m \ge 2w\) from each other, so at most one codeword can lie within distance w of \(\mathbf {{t}}\).

Standard decoding algorithms allow one to solve \(\mathsf {NCP}\) in polynomial time with success probability \((1-\tfrac{w}{m})^n\) [BK02], or even deterministically in time \((1-\tfrac{w}{m})^{-n}\cdot \mathrm{poly}(n,m)\) [APY09]. We are not aware of improved methods that apply to balanced codes.

To conclude this section, we show via a straightforward probabilistic argument that most sparse linear codes are indeed balanced (this is essentially the Gilbert-Varshamov Bound). This serves as a sanity check that the definition is not vacuous, and will also be useful when we apply our \(\mathcal {SZK}\) results to the \(\mathsf {LPN}^{n}_{{\epsilon }}\) problem, which naturally induces random codes.

Lemma 2.2

A random linear code \(\mathcal{C}\subseteq \mathbb {Z}_2^m\) of dimension n is \(\beta \)-balanced with probability at least \(1 - 2^{n - \beta ^2 m/4 +1}\). In particular, when \(\beta \ge 3\sqrt{n/m}\) a random linear code is \(\beta \)-balanced with probability \(1-\mathrm{negl}(n)\).

Proof

Let \(\mathbf {{C}} \leftarrow \mathbb {Z}_2^{n \times m}\) be a randomly chosen generator matrix. Then the associated code \(\mathcal{C}\) fails to be \(\beta \)-balanced if and only if there exists some \(\mathbf {{s}} \ne \mathbf {0} \in \mathbb {Z}_2^n\) such that \(|\mathsf {wt}(\mathbf {{s}}^T \mathbf {{C}}) - \frac{m}{2}| > \tfrac{\beta }{2}m\). For any fixed \(\mathbf {{s}}\ne \mathbf {0}\) the vector \(\mathbf {{s}}^T\mathbf {{C}}\) is uniformly random in \(\mathbb {Z}_2^m\) and therefore by the Chernoff bound:

$$ \Pr \left[ \left| ~\mathsf {wt}(\mathbf {{s}}^T \mathbf {{C}}) - \frac{m}{2} \right| > \tfrac{\beta m}{2} \right] \le 2 \exp \left( -\frac{\beta ^2 m}{4}\right) $$

By the union bound, the probability that the code is not \(\beta \)-balanced is at most

$$2^{n+1} \exp \left( -\frac{\beta ^2 m}{4}\right) \le 2^{n -\frac{\beta ^2 m}{4} + 1}.$$

This is negligible in n when \(\beta \ge 3\sqrt{n/m}\).
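As a quick numerical sanity check of Lemma 2.2 (a sketch of ours, assuming numpy; the parameters are illustrative), one can sample random generator matrices and test balancedness by enumerating all nonzero messages:

```python
# Estimate the fraction of random codes of dimension n and length m that are beta-balanced.
import numpy as np

def is_balanced(C, beta):
    n, m = C.shape
    for s_int in range(1, 2 ** n):                        # all nonzero messages s
        s = np.array([(s_int >> j) & 1 for j in range(n)])
        w = int(((s @ C) % 2).sum())                      # wt(s^T C)
        if abs(w - m / 2) > beta * m / 2:
            return False
    return True

rng = np.random.default_rng(1)
n, m = 8, 640
beta = 3 * (n / m) ** 0.5
trials = 50
frac = sum(is_balanced(rng.integers(0, 2, size=(n, m)), beta)
           for _ in range(trials)) / trials
print(f"fraction of beta-balanced random codes: {frac:.2f}")  # should be close to 1
```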

2.4 Statistical Zero Knowledge

Statistical zero-knowledge (\(\mathcal {SZK}\)) is the class of all problems that admit a zero-knowledge proof [GMR89] in which the zero-knowledge (simulation) guarantee holds statistically. Sahai and Vadhan [SV03] showed that the following problem is complete for \(\mathcal {SZK}\).

Definition 2.4

The promise problem Statistical Distance (SD) is defined by the following YES and NO instances. For a circuit \(C:\{0,1\}^n\rightarrow \{0,1\}^m\), we let \(C(U_n)\) denote the probability distribution on m-bit strings obtained by running C on a uniformly random input. Let \(\mathsf {SD}(D_0,D_1)\) denote the statistical (variation) distance between the distributions \(D_0\) and \(D_1\).

$$\begin{aligned} \varPi _{YES} := \{(C_0,C_1):\ C_0, C_1: \{0,1\}^n\rightarrow \{0,1\}^m \text{ and } \mathsf {SD}(C_0(U_n),C_1(U_n)) \ge 2/3\} \\ \varPi _{NO} := \{(C_0,C_1):\ C_0, C_1: \{0,1\}^n\rightarrow \{0,1\}^m \text{ and } \mathsf {SD}(C_0(U_n),C_1(U_n)) \le 1/3\} \end{aligned}$$

By \(\mathcal {BPP}^{\mathcal {SZK}}\), we mean decision problems that can be reduced to the statistical distance problem using randomized reductions. While in general such a reduction could query the SD oracle on inputs that violate the promise (namely, a pair of circuits/distributions whose statistical distance lies strictly between 1/3 and 2/3), the reductions we present in this paper will respect the SD promise. Search-\(\mathcal {BPP}^{\mathcal {SZK}}\) is defined analogously.

3 A Smoothing Lemma for Noisy Codewords

Let \(\mathcal{C}\subseteq \mathbb {Z}_2^m\) be a binary linear code with generating matrix \(\mathbf {{C}} \in \mathbb {Z}_2^{n\times m}\). We say that a distribution \(\mathcal{R}\) over \(\mathbb {Z}_2^m\) smooths \(\mathcal{C}\) if the random variable \(\mathbf {{C}}\mathbf {{r}}\) for \(\mathbf {{r}}\leftarrow \mathcal{R}\) is statistically close to uniform over \(\mathbb {Z}_2^n\). We say that \(\mathcal{R}\) also smooths noisy codewords if for every vector \(\mathbf {{x}}\) of sufficiently low Hamming weight, it holds that \((\mathbf {{C}}\mathbf {{r}}, \mathbf {{x}}^T\mathbf {{r}})\) is statistically close to the distribution \(\mathcal{U}_{\mathbb {Z}_2^n}\times \mathsf {Ber}_{\epsilon }\) for some \(\epsilon \).

The notion of smoothing will play an important role in our reductions in this work. In particular, we would like to characterize families of codes that are smoothed by distributions supported over low Hamming weight vectors. To this end, we show that for balanced codes, there exist such smoothing distributions. (Similar statements have been shown before in the high-error regime, e.g., by Kopparty and Saraf [KS10].)

We note that while our proof uses harmonic analysis, it is also possible to prove the lemma using the Vazirani XOR Lemma [Vaz86, Gol95]. However, we find that our method of using harmonic analysis demonstrates more straightforwardly the analogy of our lemma to smoothing in the lattice world (which is most often proved using harmonic analysis); see the comparison at the end of this section. Furthermore, this approach suggests a route towards analyzing the non-binary setting.

We start by defining our family of smoothing distributions \(\mathcal{R}_{d,m}\).

Definition 3.1

Let \(d, m \in {\mathbb N}\). The distribution \(\mathcal{R}_{d,m}\) over \(\mathbb {Z}_2^m\) is defined as follows. Sample (with replacement) d elements \(t_1, \ldots , t_d\) uniformly and independently from [m]. Output \(\mathbf {{x}} = \oplus _{i=1}^d \mathbf {{u}}_{t_i}\), where \(\mathbf {{u}}_j\) is the j-th standard basis vector. One can easily verify that \(\mathcal{R}_{d,m}\) is supported only over vectors of Hamming weight at most d.
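A minimal sketch (ours, assuming numpy) of sampling from \(\mathcal{R}_{d,m}\); coordinates hit an even number of times cancel, which is why the output has Hamming weight at most d.

```python
# Sample r = u_{t_1} xor ... xor u_{t_d} with each t_i uniform in [m] (with replacement).
import numpy as np

def sample_R(d, m, rng):
    r = np.zeros(m, dtype=int)
    for t in rng.integers(0, m, size=d):
        r[t] ^= 1
    return r

rng = np.random.default_rng(0)
r = sample_R(d=5, m=32, rng=rng)
assert int(r.sum()) <= 5          # wt(r) <= d
```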

We can now state and prove our smoothing lemma for noisy codewords.

Lemma 3.1

Let \(\beta \in (0,1)\) and let \(\mathbf {{C}} \in \mathbb {Z}_2^{n \times m}\) be a generating matrix for a \(\beta \)-balanced binary linear code \(\mathcal{C}\subseteq \mathbb {Z}_2^m\). Let \(\mathbf {{c}} \in \mathbb {Z}_2^m\) be a word of distance w from \(\mathcal{C}\). Let \(\mathbf {{s}}, \mathbf {{x}}\) be s.t. \(\mathbf {{c}}^T = \mathbf {{s}}^T\mathbf {{C}}+\mathbf {{x}}^T\) and \(\mathsf {wt}(\mathbf {{x}})=w\).

Consider the distribution \((\mathbf {{a}}, b)\) generated as follows. Sample \(\mathbf {{r}} \leftarrow \mathcal{R}_{d,m}\) and set \(\mathbf {{a}} = \mathbf {{C}}\mathbf {{r}}\), \(b = \mathbf {{c}}^T\mathbf {{r}}\). Then it holds that the joint distribution of \((\mathbf {{a}}, b-\mathbf {{s}}^T\mathbf {{a}})\) is within statistical distance \(\delta \) from the product distribution \(\mathcal{U}_{\mathbb {Z}_2^n}\times \mathsf {Ber}_{\epsilon }\), where

$$\begin{aligned} \delta\le & {} 2^{(n+1)/2} \cdot (\beta + \tfrac{2w}{m})^{d} \text { and } \\ \epsilon= & {} \tfrac{1}{2}-\tfrac{1}{2}(1-\tfrac{2w}{m})^{d}. \end{aligned}$$

Proof

Let e denote the value \(b-\mathbf {{s}}^T\mathbf {{a}}\). We bound the distance of \((\mathbf {{a}}, e)\) from \(\mathcal{U}_{\{0,1\}^n}\times \mathsf {Ber}_{\epsilon }\) using simple harmonic analysis. Let f be the probability density function of \((\mathbf {{a}}, e)\), and consider its (binary) Fourier Transform:

$$\begin{aligned} \hat{f}(\mathbf {{y}},z) = \mathop {{\mathbb E}}_{\mathbf {{a}},e}[(-1)^{\mathbf {{y}}^T \mathbf {{a}} + z e}] = \mathop {{\mathbb E}}_{\mathbf {{r}}}[(-1)^{(\mathbf {{y}}^T \mathbf {{C}}+z\mathbf {{x}}^T) \mathbf {{r}}}], \end{aligned}$$
(1)

It immediately follows that \(\hat{f}(\mathbf {{0}},0)=1\). Moreover

$$\begin{aligned} \hat{f}(\mathbf {{0}},1) = \mathop {{\mathbb E}}_{\mathbf {{r}}}[(-1)^{\mathbf {{x}}^T\mathbf {{r}}}]. \end{aligned}$$
(2)

Recalling that \(\mathbf {{r}} = \oplus _{i=1}^d \mathbf {{u}}_{t_i}\) we have

$$ \mathop {{\mathbb E}}_{\mathbf {{r}}}[(-1)^{\mathbf {{x}}^T\mathbf {{r}}}] = \prod _{i=1}^d \mathop {{\mathbb E}}_{t_i}[(-1)^{\mathbf {{x}}^T\mathbf {{u}}_{t_i}}] = (1-\tfrac{2w}{m})^d, $$

since each \(t_i\) is sampled uniformly and independently in [m] and thus has probability \(\tfrac{w}{m}\) of hitting a coordinate where \(\mathbf {{x}}\) is one. Recalling the definition of \(\epsilon \), we have \(\hat{f}(\mathbf {{0}},1)=1-2\epsilon \).

Now let us consider the setting where \(\mathbf {{y}} \ne \mathbf {{0}}\). In that case, let us denote \(\mathbf {{v}}^T = \mathbf {{y}}^T\mathbf {{C}}\), a nonzero codeword in \(\mathcal{C}\). Since \(\mathcal{C}\) is balanced it follows that \(\mathsf {wt}(\mathbf {{v}}) \in [\tfrac{1}{2}(1- \beta )m, \tfrac{1}{2}(1+ \beta )m]\). Let us further denote \((\mathbf {{v}}')^T = \mathbf {{y}}^T\mathbf {{C}}+z \mathbf {{x}}^T\); since \(\mathsf {wt}(\mathbf {{x}}) \le w\), it follows that \(\mathsf {wt}(\mathbf {{v}}') \in \tfrac{1}{2}(1\pm \beta ')m\) for \(\beta ' = \beta + \tfrac{2w}{m}\). For \(\mathbf {{y}} \ne \mathbf {{0}}\) we thus get

$$\begin{aligned} \hat{f}(\mathbf {{y}},z) = \mathop {{\mathbb E}}_{\mathbf {{r}}}[(-1)^{(\mathbf {{v}}')^T\mathbf {{r}}}] = \prod _{i=1}^d \mathop {{\mathbb E}}_{t_i}[(-1)^{(\mathbf {{v}}')^T\mathbf {{u}}_{t_i}}]. \end{aligned}$$
(3)

Since each \(t_i\) is sampled uniformly from [m], it follows that \((\mathbf {{v}}')^T\mathbf {{u}}_{t_i} \pmod {2}=0\) with probability \(\epsilon _i \in \frac{1}{2} (1\pm \beta ')\). Therefore for all \(i\in [d]\) it holds that

$$\begin{aligned} \left| {\mathop {{\mathbb E}}_{t_i}[(-1)^{(\mathbf {{v}}')^T\mathbf {{u}}_{t_i}}]} \right| = \left| {1-2\epsilon _i} \right| \le \beta '. \end{aligned}$$
(4)

We conclude that

$$\begin{aligned} \left| {\hat{f}(\mathbf {{y}},z)} \right| \le (\beta ')^d. \end{aligned}$$
(5)

Now we are ready to compare with \(\mathcal{U}_{\mathbb {Z}_2^n}\times \mathsf {Ber}_{\epsilon }\). Let g be the probability density function of \(\mathcal{U}_{\mathbb {Z}_2^n}\times \mathsf {Ber}_{\epsilon }\), and let \(\hat{g}\) be its Fourier Transform. Then \(\hat{g}(\mathbf {{0}},0)=1\), \(\hat{g}(\mathbf {{0}},1)=1-2\epsilon \) and \(\hat{g}(\mathbf {{y}},z)=0\) for all \(\mathbf {{y}} \ne 0\). Therefore

$$\begin{aligned} \left\| {\hat{f} - \hat{g}} \right\| _2^2 =\sum _{\mathbf {{y}},z} \left| {\hat{f}(\mathbf {{y}},z) - \hat{g}(\mathbf {{y}},z)} \right| ^2 \le \sum _{\begin{array}{c} \mathbf {{y}} \in \mathbb {Z}_2^n \setminus \{0\} \\ z\in \mathbb {Z}_2 \end{array}} (\beta ')^{2d} \le 2^{n+1} (\beta ')^{2d}. \end{aligned}$$
(6)

By Parseval’s theorem, we have that

$$\begin{aligned} \left\| {{f} - {g}} \right\| _2^2 = \frac{1}{2^{n+1}} \left\| {\hat{f} - \hat{g}} \right\| _2^2 \le (\beta ')^{2d}, \end{aligned}$$
(7)

and passing to the \(\ell _1\) norm (using the Cauchy-Schwarz inequality) we have

$$\begin{aligned} \left\| {{f} - {g}} \right\| _1 \le 2^{(n+1)/2} \cdot \left\| {{f} - {g}} \right\| _2 \le 2^{(n+1)/2} \cdot (\beta ')^{d}, \end{aligned}$$
(8)

which completes the proof.   \(\square \)

Relation to Lattice Smoothing. To conclude this section, let us briefly explain the analogy to smoothing lemmas for lattices [MR04]. Our explanation is intended mostly for readers who are familiar with lattice smoothing and the notion of q-ary lattices, and wish to better understand the connection to our notion of smoothing. Other readers may safely skip this paragraph. We recall that the goal of smoothing in the lattice world is to find a distribution \(\mathcal{D}\) (a Gaussian in the lattice case), such that reducing it modulo a lattice \(\mathcal{L}\) results in an almost uniform distribution over the cosets of the lattice. Let us restrict our attention to integer lattices, integer distributions and integer cosets. Formally, \(\mathcal{D}\pmod {\mathcal{L}}\) is uniform over \(\mathbb {Z}^m/\mathcal{L}\). Now let \(\mathbf {{C}}\) be a generating matrix for a binary code, and consider the so-called “perp lattice” \(\mathcal{L}= \varLambda ^{\perp }_2(\mathbf {{C}}) = \{\mathbf {{x}} \in \mathbb {Z}^m : \mathbf {{C}}\mathbf {{x}}=0\pmod {2}\} = \mathcal{C}^{\perp }+2\mathbb {Z}^m\). That is, the lattice corresponding to the dual code of \(\mathcal{C}\) plus all even vectors. Each integer coset of the lattice \(\mathcal{L}\) corresponds to a vector \(\mathbf {{y}}\), where the respective coset is \(\mathcal{L}_{\mathbf {{y}}}=\{\mathbf {{x}} \in \mathbb {Z}^m : \mathbf {{C}}\mathbf {{x}}=\mathbf {{y}}\pmod {2}\}\). Thus a smoothing distribution \(\mathcal{D}\) is one where drawing \(\mathbf {{r}}\) from \(\mathcal{D}\) and computing \(\mathbf {{C}}\mathbf {{r}}\pmod {2}\) gives a distribution close to uniform. Therefore, our smoothing lemma above shows that for 2-ary lattices one can devise non-trivial (and useful) smoothing distributions, and these distributions are not discrete Gaussians as usually considered in the lattice literature. Finally, the fact that we can smooth the code together with a noisy codeword is somewhat analogous to Gaussian leftover hash lemmas in the context of lattices.

4 A Worst Case Balanced NCP to Average Case LPN Reduction

Theorem 4.1

Assume there is an algorithm that solves the search \(\mathsf {LPN}^{n}_{{\epsilon }}\) problem with success probability \(\alpha \) in the average case by running in time T and making q queries. Then, for every \(d \le m \in \mathbb {Z}\) there is an algorithm that solves search \(\mathsf {balNCP}_{n, m, w, \beta }\) in the worst case in time \(T \cdot \mathrm{poly}(n,m)\) with success probability at least \(\alpha - q \cdot \delta \) where

$$\begin{aligned} \delta\le & {} 2^{(n+1)/2} \cdot (\beta + \tfrac{2w}{m})^{d} \\ \epsilon= & {} \tfrac{1}{2}-\tfrac{1}{2}(1-\tfrac{2w}{m})^{d}. \end{aligned}$$

Proof

Assume \(\mathcal{A}\) is an algorithm for the LPN problem as in the theorem. Define \(\mathcal{B}\) as follows:

  • Input: \(\mathbf {{C}} \in \mathbb {Z}_2^{n \times m}\), \(\mathbf {{t}}\in \{0,1\}^m\). By assumption \(\mathbf {{C}}\) is the generator of a \(\beta \)-balanced code and \(\mathbf {{t}}^T = \mathbf{s}^T \mathbf {{C}} + \mathbf {{x}}^T\) for some \(\mathbf{s}\in \mathbb {Z}_2^n, \mathbf {{x}} \in \mathbb {Z}_2^m\) with \(\mathsf {wt}(\mathbf {{x}}) = w\).

  1. Sample \(\mathbf{s}' \leftarrow \mathbb {Z}_2^n\) and set \(\mathbf {{c}}^T = \mathbf {{t}}^T + (\mathbf{s}')^T\mathbf {{C}} = (\mathbf{s}+ \mathbf{s}')^T\mathbf {{C}} + \mathbf {{x}}^T\).

  2. Run the algorithm \(\mathcal{A}\). Every time \(\mathcal{A}\) requests a new LPN sample, choose \(\mathbf {{r}} \leftarrow \mathcal{R}_{d,m}\), set \(\mathbf {{a}} = \mathbf {{C}}\mathbf {{r}}\), \(b = \mathbf {{c}}^T\mathbf {{r}}\), and give \((\mathbf {{a}}, b)\) to \(\mathcal{A}\).

  3. If at some point \(\mathcal{A}\) outputs \(\mathbf{s}^* \in \mathbb {Z}_2^n\) then output \(\mathbf{s}^* - \mathbf{s}'\).

By Lemma 3.1, each of the values \((\mathbf {{a}}, b)\) given to \(\mathcal{A}\) during step 2 is \(\delta \)-close to a fresh sample from \(\mathcal {O}^n_{\mathbf{s}^*,\epsilon }\) where \(\mathbf{s}^* = \mathbf{s}+ \mathbf{s}'\) is uniformly random over \(\mathbb {Z}_2^n\). By assumption, if \(\mathcal{A}\) were actually given samples from \(\mathcal {O}^n_{\mathbf{s}^*,\epsilon }\) in step 2 it would output \(\mathbf{s}^*\) in step 3 with probability \(\alpha \). Therefore if \(\mathcal{A}\) makes q queries in step 2, the probability that it outputs \(\mathbf{s}^*\) in step 3 is at least \(\alpha - q \delta \). This proves the theorem.    \(\square \)
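The following Python sketch (ours; the lpn_solver interface is illustrative and not part of the original text) spells out the three steps of the reduction \(\mathcal{B}\) above.

```python
# A sketch of the worst-case to average-case reduction B from Theorem 4.1.
import numpy as np

def sample_R(d, m, rng):                            # the smoothing distribution of Sect. 3
    r = np.zeros(m, dtype=int)
    for t in rng.integers(0, m, size=d):
        r[t] ^= 1
    return r

def reduction_B(C, t, d, lpn_solver, rng):
    """C: n x m generator of a beta-balanced code; t: noisy codeword of length m."""
    n, m = C.shape
    s_prime = rng.integers(0, 2, size=n)            # step 1: re-randomize the secret
    c = (t + s_prime @ C) % 2                       # c^T = (s + s')^T C + x^T

    def lpn_sample():                               # step 2: answer A's sample queries
        r = sample_R(d, m, rng)
        return (C @ r) % 2, int(c @ r) % 2          # (a, b) = (Cr, c^T r)

    s_star = lpn_solver(lpn_sample)                 # A outputs s* = s + s' (w.h.p.)
    return (s_star + s_prime) % 2                   # step 3: s = s* - s' (over Z_2)
```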

Corollary 4.1

Let \(m = n^{c}\) for some constant \(c>1\), \(\beta = \frac{1}{\sqrt{n}}, w = \lceil m \frac{\log ^2 n}{n} \rceil \). Assume that search \(\mathsf {balNCP}_{n, m, w, \beta }\) is hard in the worst case, meaning that for every polynomial time algorithm its success probability on the worst case instance is at most \(\mathrm{negl}(n)\). Then for some \(\epsilon < \frac{1}{2} - \frac{1}{O(n^4)}\) search \(\mathsf {LPN}^{n}_{{\epsilon }}\) is hard in the average case, meaning that for every polynomial time algorithm its success probability on a random instance is at most \(\mathrm{negl}(n)\).

Proof

Follows directly from the theorem by setting \(d = \lceil 2n/ \log n \rceil \) and noting that:

$$\begin{aligned} \delta\le & {} 2^{(n+1)/2} \cdot (\beta + \tfrac{2w}{m})^{d} \le 2^{(n+1)/2 - (d/2) \log n + O(1)} \le 2^{-n/2 +O(1)} = \mathrm{negl}(n)\\ \epsilon= & {} \tfrac{1}{2}-\tfrac{1}{2}(1-\tfrac{2w}{m})^{d} \le \frac{1}{2} - 2^{-(4\frac{w}{m}d+1)} \le \frac{1}{2} - 1/O(n^4) \end{aligned}$$

   \(\square \)

The above says that the worst-case hardness of \(\mathsf {balNCP}\) with very low error-rate \(w/m \approx \frac{\log ^2 n}{n}\) implies the average-case hardness of \(\mathsf {LPN}^{n}_{{\epsilon }}\) with very high error-rate \(\epsilon = 1/2 - 1/O(n^4)\). Note that a random linear code is \(\beta \)-balanced with overwhelming probability when \(\beta \ge 3\sqrt{n/m}\), so for a sufficiently large m the restriction on \(\beta \) is satisfied by most codes.

Other choices of parameters may also be interesting. For example, we can set the error-rate to be \(w/m \approx 1/\sqrt{n}\) and \(d = 2 n/\log n\), while keeping \(m = n^c\) for some \(c>1\) and \(\beta = 1/\sqrt{n}\) the same as before. Then if we assume that \(\mathsf {balNCP}_{n,m,w,\beta }\) is \((T(n), \delta (n))\) hard in the worst case (meaning that for every T(n) time algorithm the success probability on the worst case instance is at most \(\delta (n)\)), this implies that \(\mathsf {LPN}^{n}_{{\epsilon }}\) is \((T'(n), \delta '(n))\) hard on the average, where \(\epsilon (n) = 1/2 - 2^{-\sqrt{n}/\log n}\), \(T'(n) = T(n)/\mathrm{poly}(n)\) and \(\delta '(n) = \delta (n) + T'(n) 2^{-(n-1)/2}\). Note that, as far as we know, \(\mathsf {balNCP}_{n,m,w,\beta }\) with noise rate \(w/m = 1/\sqrt{n}\) may be \((T(n), \delta (n))\) hard for some \(T(n) = 2^{\varOmega (\sqrt{n})}\), \(\delta (n) = 2^{-\varOmega (\sqrt{n})}\), which would imply the same asymptotic hardness for \(\mathsf {LPN}^{n}_{{\epsilon }}\). Although the error-rate \(\epsilon = 1/2 - 2^{-\sqrt{n}/\log n}\) is extremely high, it is not high enough for the conclusion to hold statistically, and therefore this connection may also be of interest.

5 Statistical Zero Knowledge for Balanced NCP and LPN

In this section, we show that for certain parameter regimes, \(\mathsf {balNCP}\in \mathrm {{Search}\hbox {-}\mathcal {BPP}}^\mathcal {SZK}\) and is thus unlikely to be \(\mathcal {NP}\)-hard [MX10]. Towards this end, we use a decision to search reduction analogous to the canonical one known for the \(\mathsf {LPN}\) problem. We consider the following randomized samplers (with an additional implicit parameter d):

  • Randomized sampler \(\mathsf {Samp}_0(\mathbf {{C}}, \mathbf {{t}})\) takes as input a matrix \(\mathbf {{C}} \in \{0,1\}^{n \times m}\) and a word \(\mathbf {{t}}\in \{0,1\}^m\). It samples \(\mathbf {{r}} {\mathop {\leftarrow }\limits ^{\scriptscriptstyle {\$}}}\mathcal{R}_{d,m}\) and outputs \((\mathbf {{C}}\mathbf {{r}}, \mathbf {{t}}^T\mathbf {{r}})\).

  • Randomized sampler \(\mathsf {Samp}_{i, \sigma }(\mathbf {{C}}, \mathbf {{t}})\) is parameterized by \(i \in [n]\), \(\sigma \in \{0,1\}\), and takes as input a matrix \(\mathbf {{C}} \in \{0,1\}^{n \times m}\) and a word \(\mathbf {{t}}\in \{0,1\}^m\). It samples \(\mathbf {{r}} {\mathop {\leftarrow }\limits ^{\scriptscriptstyle {\$}}}\mathcal{R}_{d,m}\) and \(\rho {\mathop {\leftarrow }\limits ^{\scriptscriptstyle {\$}}}\{0,1\}\) and outputs \((\mathbf {{C}}\mathbf {{r}}+\rho \mathbf {{u}}_{i}, \mathbf {{t}}^T\mathbf {{r}} + \rho \sigma )\).
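A minimal sketch (ours, assuming numpy and the sample_R routine from the Sect. 3 sketch; names are illustrative) of the two samplers:

```python
import numpy as np

def samp_0(C, t, d, rng):
    """Output (Cr, t^T r) for r <- R_{d,m}."""
    r = sample_R(d, C.shape[1], rng)
    return (C @ r) % 2, int(t @ r) % 2

def samp_i_sigma(C, t, d, i, sigma, rng):
    """Output (Cr + rho*u_i, t^T r + rho*sigma) for r <- R_{d,m} and rho <- {0,1}."""
    r = sample_R(d, C.shape[1], rng)
    rho = int(rng.integers(0, 2))
    a = (C @ r) % 2
    a[i] = (a[i] + rho) % 2
    return a, (int(t @ r) + rho * sigma) % 2
```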

Lemma 5.1

Consider a generating matrix \(\mathbf {{C}} \in \{0,1\}^{n \times m}\) for a \(\beta \)-balanced code, and let \(\mathbf {{t}}^T = \mathbf {{s}}^T \mathbf {{C}}+\mathbf {{x}}^T\) for some \(\mathbf {{s}}\in \{0,1\}^n\) and \(\mathbf {{x}}\) with Hamming weight w. Then the following hold:

  1. The sampler \(\mathsf {Samp}_0(\mathbf {{C}}, \mathbf {{t}})\) samples from a distribution that is \(\delta \)-close to \(\mathcal{U}_{\{0,1\}^n}\times \mathsf {Ber}_{\epsilon }\).

  2. If \(\mathbf {{s}}_i=\sigma \) then \(\mathsf {Samp}_{i, \sigma }(\mathbf {{C}}, \mathbf {{t}})\) samples from a distribution that is \(\delta \)-close to \(\mathcal{U}_{\{0,1\}^n}\times \mathsf {Ber}_{\epsilon }\).

  3. If \(\mathbf {{s}}_i\ne \sigma \) then \(\mathsf {Samp}_{i, \sigma }(\mathbf {{C}}, \mathbf {{t}})\) samples from a distribution that is \(\delta \)-close to \(\mathcal{U}_{\{0,1\}^n}\times \mathcal{U}_{\{0,1\}}\).

Here, \(\epsilon = \tfrac{1}{2}-\tfrac{1}{2}(1-\tfrac{2w}{m})^{d}\), \(\delta = 2^{(n+1)/2} \cdot (\beta + \tfrac{2w}{m})^{d}\).

Proof

Assertion 1 follows directly from Lemma 3.1.

For Assertion 2 we note that if \(\mathbf {{s}}_i=\sigma \) then

$$ (\mathbf {{C}}\mathbf {{r}}+\rho \mathbf {{u}}_{i}, \mathbf {{t}}^T\mathbf {{r}} + \rho \sigma ) = (\mathbf {{C}}\mathbf {{r}}, \mathbf {{t}}^T\mathbf {{r}})+\rho (\mathbf {{u}}_{i}, \sigma ) = (\mathbf {{C}}\mathbf {{r}}, \mathbf {{t}}^T\mathbf {{r}})+(\rho \mathbf {{u}}_{i}, \mathbf {{s}}^T(\rho \mathbf {{u}}_i)). $$

By Lemma 3.1 this distribution is within \(\delta \) statistical distance to

$$ (\mathbf {{a}}, \mathbf {{s}}^T\mathbf {{a}}+e)+(\rho \mathbf {{u}}_{i}, \mathbf {{s}}^T(\rho \mathbf {{u}}_i)), $$

with \((\mathbf {{a}}, e)\) distributed \(\mathcal{U}_{\{0,1\}^n}\times \mathsf {Ber}_{\epsilon }\). Finally, we can write

$$ (\mathbf {{a}}, \mathbf {{s}}^T\mathbf {{a}}+e)+(\rho \mathbf {{u}}_{i}, \mathbf {{s}}^T(\rho \mathbf {{u}}_i)) = ((\mathbf {{a}}+\rho \mathbf {{u}}_i), \mathbf {{s}}^T (\mathbf {{a}}+\rho \mathbf {{u}}_i) + e), $$

and since \(\mathbf {{a}}' = \mathbf {{a}}+\rho \mathbf {{u}}_i\) is also uniformly distributed, the assertion follows.

For Assertion 3 we note that when \(\mathbf {{s}}_i\ne \sigma \), i.e. \(\sigma = \mathbf {{s}}_i+1\) then

$$ (\mathbf {{C}}\mathbf {{r}}+\rho \mathbf {{u}}_{i}, \mathbf {{t}}^T\mathbf {{r}} + \rho \sigma ) = (\mathbf {{C}}\mathbf {{r}}, \mathbf {{t}}^T\mathbf {{r}})+\rho (\mathbf {{u}}_{i}, \sigma ) = (\mathbf {{C}}\mathbf {{r}}, \mathbf {{t}}^T\mathbf {{r}})+(\rho \mathbf {{u}}_{i}, \mathbf {{s}}^T(\rho \mathbf {{u}}_i))+(\mathbf {{0}}, \rho ). $$

As above, by Lemma 3.1, this distribution is within \(\delta \) statistical distance to

$$ (\mathbf {{a}}, \mathbf {{s}}^T\mathbf {{a}} + e) + (\rho \mathbf {{u}}_{i}, \mathbf {{s}}^T(\rho \mathbf {{u}}_i)) + (\mathbf {{0}}, \rho ) = (\underbrace{(\mathbf {{a}} + \rho \mathbf {{u}}_i)}_{ \mathbf {{a}}'}, \mathbf {{s}}^T (\mathbf {{a}} + \rho \mathbf {{u}}_i) + e) + (\mathbf {{0}}, \rho ) = (\mathbf {{a}}', \mathbf {{s}}^T\mathbf {{a}}'+ e+ \rho ), $$

with \((\mathbf {{a}}, e)\) distributed \(\mathcal{U}_{\{0,1\}^n}\times \mathsf {Ber}_{\epsilon }\), and thus also \((\mathbf {{a}}', e)\) distributed \(\mathcal{U}_{\{0,1\}^n}\times \mathsf {Ber}_{\epsilon }\) and independent of \(\rho \). Since \(\rho \) is uniform and independent of \((\mathbf {{a}}', e)\) it follows that \((\mathbf {{a}}', \mathbf {{s}}^T\mathbf {{a}}'+e+\rho )\) is distributed \(\mathcal{U}_{\{0,1\}^n}\times \mathcal{U}_{\{0,1\}}\).   \(\square \)

The following is an immediate corollary of Lemma 5.1.

Corollary 5.1

If \(\mathbf {{s}}_i=\sigma \) then the distributions generated by \(\mathsf {Samp}_{i, \sigma }(\mathbf {{C}}, \mathbf {{t}})\) and \(\mathsf {Samp}_0(\mathbf {{C}}, \mathbf {{t}})\) are within statistical distance at most \(2\delta \).

If \(\mathbf {{s}}_i\ne \sigma \) then the distributions generated by \(\mathsf {Samp}_{i, \sigma }(\mathbf {{C}}, \mathbf {{t}})\) and \(\mathsf {Samp}_0(\mathbf {{C}}, \mathbf {{t}})\) are within statistical distance at least \((1-2\epsilon ) - 2\delta \).

Proof

A direct calculation shows that the statistical distance between \(\mathsf {Ber}_{\epsilon }\) and \(\mathcal{U}_{\{0,1\}}\) is \(1-2\epsilon \). Plugging in Lemma 5.1, the result follows.   \(\square \)

We define the notion of a direct product sampler. This is just a sampler that outputs multiple independent samples.

Definition 5.1

Let \(\mathcal{D}\) be a distribution and let \(k \in {\mathbb N}\), then \(\mathcal{D}^{({k})}\) is the distribution defined by k independent samples from \(\mathcal{D}\).

Lemma 5.2

Consider distributions \(\mathcal{D}_1, \mathcal{D}_2\) and values \(0 \le \delta _1 \le \delta _2 \le 1\) s.t. \(\mathsf {dist}(\mathcal{D}_1, \mathcal{D}_2) \in (\delta _1, \delta _2)\). Then for every \(k \in {\mathbb N}\) it holds that \(\mathsf {dist}(\mathcal{D}_1^{({k})},\mathcal{D}_2^{({k})}) \in (1-c_1 e^{-c_2 \delta _1^2 k},\ k\delta _2)\) for some positive constants \(c_1, c_2\).

Proof

The upper bound follows by union bound and the lower bound from the Chernoff bound.    \(\square \)

Theorem 5.1

There exists a \(\mathrm {{Search}\hbox {-}\mathcal {BPP}}^\mathcal {SZK}\) algorithm for solving \(\mathsf {balNCP}\) on instances of the following form. Letting \(\mathbf {{C}} \in \{0,1\}^{n \times m}\), \(\mathbf {{t}}\in \{0,1\}^m\), \(w \in [m]\) denote the \(\mathsf {balNCP}\) input, we require that the code generated by \(\mathbf {{C}}\) is \(\beta \)-balanced and that \(n,m,w,\beta \) are such that there exist \(d \in [m]\) and \(k \le \mathrm{poly}(n,m)\) for which

$$\begin{aligned} 2\delta k < 1/3 \end{aligned}$$
(9)

for \(\delta = 2^{(n+1)/2} \cdot (\beta + \tfrac{2w}{m})^{d}\), and

$$\begin{aligned} c_1 e^{-c_2 (1-2\epsilon - 2 \delta )^2 k} < 1/3 \end{aligned}$$
(10)

for \(\delta \) as above, \(\epsilon = \tfrac{1}{2}-\tfrac{1}{2}(1-\tfrac{2w}{m})^{d}\), and \(c_1, c_2\) are the constants from Lemma 5.2.

Proof

We recall the problem Statistical Distance (\(\mathsf {SD}\)), which is in \(\mathcal {SZK}\). This problem takes as input two sampler circuits and outputs 0 if the sampled distributions are within statistical distance \(<1/3\) and 1 if the sampled distributions are at statistical distance \(>2/3\). We will show how to solve \(\mathsf {balNCP}\) for the above parameters using an oracle to \(\mathsf {SD}\).

Specifically, for all \(i=1,\ldots , n\) and \(\sigma \in \{0,1\}\), the algorithm will call the \(\mathsf {SD}\) oracle on input \((\mathsf {Samp}^{({k})}_0(\mathbf {{C}}, \mathbf {{t}}), \mathsf {Samp}^{({k})}_{i, \sigma }(\mathbf {{C}}, \mathbf {{t}}))\), where \(\mathsf {Samp}^{({k})}_{(\cdot )}\) is the algorithm that runs the respective \(\mathsf {Samp}\) k times and outputs all k generated samples.

Let \(\alpha _{i,\sigma }\) denote the oracle response on the \((i,\sigma )\) call. If for some i it holds that \(\alpha _{i,0}=\alpha _{i,1}\), then return \(\bot \). Otherwise set \(\mathbf {{s}}_i\) to the value \(\sigma \) for which \(\alpha _{i,\sigma }=0\). Return \(\mathbf {{s}}\).

By definition of our samplers, they run in polynomial time, so if k is polynomial then our inputs to \(\mathsf {SD}\) are indeed valid. Combining Corollary 5.1 and Lemma 5.2, it holds that \(\alpha _{i,\sigma }=0\) if and only if \(\mathbf {{s}}^*_i=\sigma \), where \(\mathbf {{s}}^*\) is the vector for which \(\mathbf {{t}}^T = (\mathbf {{s}}^*)^T\mathbf {{C}}+\mathbf {{x}}^T\) and \(\mathsf {wt}(\mathbf {{x}})=w\). The correctness of the algorithm follows.   \(\square \)
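A sketch (ours) of the decoding procedure in the proof above, assuming a hypothetical sd_oracle that, given two sampler circuits, returns 0 when the sampled distributions are statistically close and 1 when they are far, and reusing samp_0 and samp_i_sigma from the sketch earlier in this section:

```python
def decode_with_sd(C, t, d, k, sd_oracle, rng):
    """Recover s bit by bit by comparing k-fold direct-product samplers via the SD oracle."""
    n = C.shape[0]
    s = []
    for i in range(n):
        alpha = {}
        for sigma in (0, 1):
            P = lambda: [samp_0(C, t, d, rng) for _ in range(k)]
            Q = lambda sig=sigma, ii=i: [samp_i_sigma(C, t, d, ii, sig, rng) for _ in range(k)]
            alpha[sigma] = sd_oracle(P, Q)          # 0 = close, 1 = far
        if alpha[0] == alpha[1]:
            return None                             # corresponds to outputting bot
        s.append(0 if alpha[0] == 0 else 1)         # s_i is the sigma with alpha = 0
    return s
```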

Corollary 5.2

Let \(m = n^{c}\) for some constant \(c>1\), \(\beta = \frac{1}{\sqrt{n}}, w = \lceil m \frac{\log ^2 n}{n} \rceil \). Then search \(\mathsf {balNCP}_{n, m, w, \beta } \in \mathrm {{Search}\hbox {-}\mathcal {BPP}}^\mathcal {SZK}\).

Proof

In Theorem 5.1 set \(d = \lceil 2n/ \log n \rceil \) and \(k = n^9\). By the same calculation as in Corollary 4.1 we have \(\delta = \mathrm{negl}(n)\) and \(\epsilon \le \frac{1}{2} - 1/O(n^4)\). Therefore for large enough n we have \(2\delta k < 1/3\) and \(c_1 e^{-c_2 (1-2\epsilon - 2 \delta )^2 k} = e^{- \varOmega (n)} < 1/3\) as required by the theorem.    \(\square \)

On Statistical Zero Knowledge and LPN. We notice that since sparse random codes are balanced with overwhelming probability (Lemma 2.2), our results in this section also imply that the \(\mathsf {LPN}\) problem is in \(\mathrm {{Search}\hbox {-}\mathcal {BPP}}^\mathcal {SZK}\) for error rate \(\frac{\log ^2 n}{n}\). We note that even though in LPN the weight of the noise vector (the distance from the code) is not fixed as in our definition of \(\mathsf {balNCP}\), the domain of possible weights is of polynomial size and thus the exact weight can be guessed with polynomial success probability. Once a successful guess has been made, it can be verified after a solution has been found.

6 Collision-Resistant Hashing

In this section, we describe a collision-resistant hash function family whose security is based on the hardness of the (decisional) \(\mathsf {LPN}^{n}_{{O(\log ^2n/n)}}\) problem. For any positive constant \(c\in \mathbb {R}^+\) and a matrix \(\mathbf{A}\in \mathbb {Z}_2^{n\times n^{1+c}}\), define the function

$$\begin{aligned} h_\mathbf{A}: \mathcal {S}^{n^{1+c}}_{2n/(c\log {n})} \rightarrow \mathbb {Z}_2^n\quad \text { as }\quad h_\mathbf{A}(\mathbf{r}):=\mathbf{A}\mathbf{r}. \end{aligned}$$
(11)

Notice that because

$$\left| \mathcal {S}^{n^{1+c}}_{2n/(c\log {n})}\right| = {n^{1+c} \atopwithdelims ()2n/(c\log {n})}> \left( \frac{n^{1+c}}{2n/(c\log {n})}\right) ^{2n/(c\log {n})}>2^{2n}$$

and the size of \(\mathbb {Z}_2^n\) is exactly \(2^n\), the function \(h_\mathbf{A}\) is compressing.
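A minimal sketch (ours, assuming numpy; the parameters are toy values) of the hash family in (11): the key is a uniformly random matrix \(\mathbf{A}\), the input is a binary vector of the prescribed Hamming weight, and the output is \(\mathbf{A}\mathbf{r} \pmod {2}\).

```python
import numpy as np
from math import log2

def keygen(n, c, rng):
    """Sample the key: a uniformly random n x n^{1+c} binary matrix."""
    return rng.integers(0, 2, size=(n, round(n ** (1 + c))))

def hash_A(A, r):
    """h_A(r) = A r mod 2, an n-bit digest."""
    return (A @ r) % 2

# Example with toy parameters n = 16, c = 1.
rng = np.random.default_rng(0)
n, c = 16, 1.0
A = keygen(n, c, rng)
m = A.shape[1]
w = int(2 * n / (c * log2(n)))                      # input weight 2n/(c log n)
r = np.zeros(m, dtype=int)
r[rng.choice(m, size=w, replace=False)] = 1         # a random weight-w input
digest = hash_A(A, r)                               # n output bits, domain larger than 2^{2n}
```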

We now relate the hardness of finding collisions in the function \(h_\mathbf{A}\), for a random \(\mathbf{A}\), to the hardness of the decisional \(\mathsf {LPN}^{n}_{{\epsilon }}\) problem.

Theorem 6.1

For any constant \(c>0\), if there exists an algorithm \(\mathsf A_1\) running in time t such that

$$\Pr \left[ \mathsf A_1(h_\mathbf{A}) \Rightarrow (\mathbf{r}_1,\mathbf{r}_2) \in \mathcal {S}^{n^{1+c}}_{2n/(c\log {n})}\text { s.t. }\mathbf{r}_1\ne \mathbf{r}_2\text { and } h_\mathbf{A}(\mathbf{r}_1)=h_\mathbf{A}(\mathbf{r}_2);~ \mathbf{A}\leftarrow \mathbb {Z}_2^{n\times n^{1+c}}\right] \ge \delta ,$$

then there exists an algorithm \(\mathsf A_2\) that runs in time \(\approx t\) and solves the decisional \(\mathsf {LPN}^{n}_{{\epsilon }}\) problem for any \(\epsilon \le \frac{1}{4}\) with advantage at least \(\delta \cdot 2^{-16n\epsilon /(c\log {n})-1}\).

In particular, for \(\epsilon =O(\log ^2 n /n)\) and any \(\delta =1/\mathrm{poly}(n)\), the advantage is \(1/\mathrm{poly}(n)\).

Proof

The algorithm \(\mathsf A_2\) has access to an oracle that is either \(\mathcal {O}^n_{\mathbf{s},\epsilon }\) or \(\mathcal {U}^n\). It calls the oracle \(n^{1+c}\) times to obtain samples of the form \((\mathbf{a}_i,b_i)\). It arranges the \(\mathbf{a}_i\) and \(b_i\) into a matrix \(\mathbf{A}\) and vector \(\mathbf{b}\) as

$$\mathbf{A}= \begin{bmatrix}~\mathbf{a}_1~|~ \cdots ~|~\mathbf{a}_{n^{1+c}}~\end{bmatrix}\in \mathbb {Z}_2^{n\times n^{1+c}},~\mathbf{b}=\begin{bmatrix} b_1 \\ \vdots \\ b_{n^{1+c}}\end{bmatrix}\in \mathbb {Z}_2^{n^{1+c}}$$

and sends \(\mathbf{A}\) to \(\mathsf A_1\). If \(\mathsf A_1\) fails to return a valid answer, then \(\mathsf A_2\) outputs \(\mathsf {ans}\leftarrow \{0,1\}\). If \(\mathsf A_1\) does return valid distinct \(\mathbf{r}_1\) and \(\mathbf{r}_2\) such that \(h_\mathbf{A}(\mathbf{r}_1)=h_\mathbf{A}(\mathbf{r}_2),\) then \(\mathsf A_2\) returns \(\mathsf {ans}=\mathbf{b}^T(\mathbf{r}_1-\mathbf{r}_2)\).

We first look at the distribution of \(\mathsf {ans}\) when the oracle that \(\mathsf A_2\) has access to is \(\mathcal {U}^n\). In this case it’s easy to see that regardless of whether \(\mathsf A_1\) returns a valid answer, we’ll have \(\Pr [\mathsf {ans}=0]=\frac{1}{2}\) because \(\mathbf{b}\) is completely uniform in \(\mathbb {Z}_2^{n^{1+c}}\).

On the other hand, if the oracle is \(\mathcal {O}^n_{\mathbf{s},\epsilon }\), then we know that for all i, \(b_i=\mathbf{s}^T \mathbf{a}_i + e_i\), where \(e_i\leftarrow \mathsf {Ber}_\epsilon \). This can be rewritten as \(\mathbf{s}^T\mathbf{A}+\mathbf{e}^T=\mathbf{b}^T\) where \(\mathbf{e}=\begin{bmatrix}e_1\\ \vdots \\ e_{n^{1+c}}\end{bmatrix}\). Therefore

$$\mathbf{b}^T(\mathbf{r}_1-\mathbf{r}_2) = \mathbf{s}^T\mathbf{A}(\mathbf{r}_1-\mathbf{r}_2) + \mathbf{e}^T(\mathbf{r}_1-\mathbf{r}_2) = \mathbf{e}^T(\mathbf{r}_1-\mathbf{r}_2).$$

Since \(\mathsf {wt}(\mathbf{r}_i)=2n/(c\log {n})\), we know that \(\mathsf {wt}(\mathbf{r}_1-\mathbf{r}_2)\le 4n/(c\log {n})\). Since the \(\mathbf{A}\) that is sent to \(\mathsf A_1\) is independent of \(\mathbf{e}\), we have that

$$\begin{aligned} \Pr [\mathbf{e}^T\cdot (\mathbf{r}_1-\mathbf{r}_2) = 0 ~;~ e_i\leftarrow \mathsf {Ber}_\epsilon ] \ge \frac{1}{2}+\frac{1}{2}(1-2\epsilon )^{4n/(c\log {n})}\ge \frac{1}{2}+ 2^{-16n\epsilon /(c\log {n})-1}, \end{aligned}$$
(12)

where the first inequality follows from Lemma 2.1 and the second inequality is due to the assumption that \(\epsilon \le \frac{1}{4}\) and the fact that \(1-x \ge 2^{-2x}\) for \(x\le 1/2\).

Thus when the oracle is \(\mathcal {O}^n_{\mathbf{s},\epsilon }\), we have

$$\Pr [\mathsf {ans}=0]\ge \frac{1}{2}\cdot (1-\delta )+\left( \frac{1}{2}+ 2^{-16n\epsilon /(c\log {n})-1}\right) \cdot \delta = \frac{1}{2}+ \delta \cdot 2^{-16n\epsilon /(c\log {n})-1}.$$

   \(\square \)
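A sketch (ours) of the distinguisher \(\mathsf A_2\) from the proof, where oracle() returns one sample from either \(\mathcal {O}^n_{\mathbf{s},\epsilon }\) or \(\mathcal {U}^n\) and collision_finder plays the role of \(\mathsf A_1\); both interfaces are illustrative assumptions.

```python
import numpy as np

def distinguisher_A2(n, c, oracle, collision_finder, rng):
    m = round(n ** (1 + c))
    pairs = [oracle() for _ in range(m)]
    A = np.column_stack([a for a, _ in pairs])      # n x m matrix of the a_i
    b = np.array([bi for _, bi in pairs])           # the m values b_i
    out = collision_finder(A)                       # A_1's attempt to find a collision
    if out is None:                                 # no valid collision: random guess
        return int(rng.integers(0, 2))
    r1, r2 = out                                    # distinct, with A r1 = A r2 (mod 2)
    return int(b @ ((r1 + r2) % 2)) % 2             # ans = b^T (r1 - r2) over Z_2
```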

6.1 Observations and Other Parameter Regimes

As far as we know, the best attack against the hash function in (11) with \(c=1\) requires \(2^{\varOmega (n)}\) time, whereas the \(\mathsf {LPN}^{n}_{{\log ^2n/n}}\) problem, from which we can show a polynomial-time reduction, can be solved in time \(2^{O(\log ^2{n})}\). Thus there is possibly a noticeable loss in the reduction for this parameter setting. It was observed in [YZW+17, Theorem 2, Theorem 3] that there are other ways to set the parameters in Theorem 6.1 which achieve different connections between the hash function and the underlying LPN problem. For example, defining \(n=\log ^2{m}\) and \(c=\log {m}/\log \log {m}-1\) implies that there exists a hash function defined by the matrix \(\mathbf{A}\in \mathbb {Z}_2^{\log ^2{m}\times 2m}\) such that succeeding with probability \(\delta \) in finding collisions in this hash function is at least as hard as solving the \(\mathsf {LPN}^{\log ^2{m}}_{{\epsilon }}\) problem with advantage \(\delta \cdot m^{-O(\kappa \epsilon )}\) for a constant \(\kappa \). This is exactly the parameter setting in [YZW+17, Theorem 3].

Based on the state of the art of today’s algorithms, it is clear that using a hash function defined by an \(n\times n^2\) matrix \(\mathbf{A}\) is more secure than one defined by a \(\log ^2{n}\times 2n\) matrix (since one can trivially find collisions in the latter in time \(2^{O(\log ^2{n})}\)). There is, however, no connection that we are aware of between the LPN problems on which they are based via Theorem 6.1. In particular, we do not know of any polynomial-time (in n) reductions that relate the hardness of the \(\mathsf {LPN}^{n}_{{\log ^2n/n}}\) problem to the \(\mathsf {LPN}^{\log ^2n}_{{\epsilon }}\) problem for a constant \(\epsilon \).