
1 Introduction

Due to the recent developments in quantum computing and their threat to current asymmetric key schemes, the cryptographic community has increased its efforts towards the development of post-quantum cryptography, resulting in the NIST Post-Quantum Standardization Process. Several submissions to this process are built on top of the Learning with Errors (LWE) hard problem. These frequently work with polynomial ring or module structures, resulting in Ring-LWE or Mod-LWE schemes such as New Hope [1], LAC [15], LIMA [17], R. Emblem [16] and Kyber [2]. Some schemes further reduce their communication bandwidth by replacing the pseudorandomly generated error terms with rounding errors, resulting in Ring-LWR and Mod-LWR schemes as in Round2 [9] and Saber [3] respectively.

For most of the above encryption schemes there is a small probability of a decryption failure, in which the decryption returns a faulty result where one or more message bits are flipped. As these failure events depend on the secret key, they might compromise the security of the scheme. Therefore, most candidates of the Post-Quantum Standardization Process aim for a failure probability around \(2^{-128}\). To reduce the failure rate, some schemes utilize error correcting codes (ECCs) to make the decryption resilient against a certain number of errors. The NIST candidate LAC [15] relies on extensive error correction, and Fritzmann et al. [7] studied the positive impact of ECCs on the security and bandwidth of lattice-based schemes. Another possibility is to eliminate decryption failures altogether, and thus the attacks that exploit them, by selecting the parameters of the scheme accordingly. This comes at the price of a higher bandwidth and computational complexity: comparing the communication cost, defined as the number of bytes in the public key and the ciphertext, we have 2080 bytes for the original Saber and 3488 bytes for Saber with the same estimated core security level but without decryption failures [4]. However, as most submissions to the NIST Post-Quantum Process do have a small decryption failure probability, an analysis of the impact of decryption failures is essential.

A chosen ciphertext attack against Ring-Learning with Errors (Ring-LWE) schemes exploiting decryption failures was reported by Fluhrer [6]. This attack uses knowledge of failing ciphertexts to retrieve the secret. D’Anvers et al. [5] analyzed a decryption failure attack on (Ring/Mod)-LWE/LWR schemes that have protection against chosen ciphertext attacks. The security risk of decryption failures is also reflected in the post-quantum versions [12, 13] of the Fujisaki-Okamoto transformation [8], which converts a chosen plaintext secure encryption scheme into a chosen ciphertext secure key encapsulation mechanism (KEM). The security bound of these transformations contains a term that accounts for decryption failures. As this term is quadratic in the failure rate of the underlying scheme, it has an important effect on the security bound.

Consequently, the failure probability is an important factor in the security of these schemes and should be determined precisely. The common approach is to compute the failure rate for one bit of the message, and to extend this to the full failure rate under the assumption that failures of the individual bits are independent. Jin and Zhao [14] proved that for some schemes the failures in individual bits are asymptotically independent as the number of bits goes to infinity. Hamburg [10] analyzed the independence of the bits for the NIST Post-Quantum Standardization Process submission ThreeBears [11], which is based on the Integer Module Learning with Errors problem. He identified three sources of correlation: the norm of the secret, the norm of the ciphertext, and the correlation between the failures of the individual bits due to the ring structure.

In this paper, we examine the independence assumption for Ring/Mod-LWE/LWR based schemes. First we show both theoretically and experimentally that this assumption is not correct. Then, we develop a method to handle the dependency in the failure rate calculation. We calculate the failure rate for variants of LAC and validate our method using experimental data. Finally, we discuss the implications of the dependency in different scenarios: for schemes without error correcting codes, we reason that the assumption of independence leads to a slight overestimation of the failure probability. Looking into schemes using error correcting codes to reduce the failure rate, we show that the independence assumption can lead to an underestimation of the failure rate, and thus an overestimation of the security of the underlying scheme. In the most extreme case, for LAC-128, the failure rate is overestimated by a factor \(2^{48}\).

2 Preliminaries

2.1 Notation

Let \(\mathbb {Z}_q\) denote the ring of integers modulo q, let \(R_q\) represent the ring \(\mathbb {Z}_q[X]/(X^n+1)\) and let \(R_q^{l_1 \times l_2}\) designate the ring of \(l_1 \times l_2\) matrices over \(R_q\). Polynomials will be written using lowercase letters, vectors with bold lowercase, and matrices with bold uppercase. The \(l_2\)-norm of a polynomial x is defined as \(\Vert x \Vert _2 = \sqrt{\sum _i x_i^2}\) and the \(l_2\)-norm of a vector \(\pmb {x}\) as \(\Vert \pmb {x} \Vert _2 = \sqrt{\sum _i \Vert x_i \Vert _2^2}\). The rounding operation \(\lfloor x \rceil _{q \rightarrow p}\) for \(x \in \mathbb {Z}_q\), is calculated as \(\lfloor p/q \cdot x \rceil \in \mathbb {Z}_p\). The \(\mathtt {abs}()\) function takes the absolute value of its input. These operations are extended coefficient-wise for polynomials and vectors. Let \(a_i\), with \(a \in R_q\) denote the \(i^{\text {th}}\) coefficient of a, and denote with \(\pmb {a}_i\) for \(\pmb {a} \in R^{l \times 1}_q\) the \((i \mod l) ^{\text {th}}\) coefficient of the \(\lfloor i / l \rfloor ^{\text {th}}\) polynomial of \(\pmb {a}\).
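The reduction modulo \(X^n+1\) makes products in \(R_q\) negacyclic. As an illustration (not part of the paper), a schoolbook multiplication in \(\mathbb {Z}_q[X]/(X^n+1)\) can be sketched in Python as:

```python
# Illustration only: schoolbook multiplication in R_q = Z_q[X]/(X^n + 1);
# the reduction X^n = -1 makes the convolution negacyclic.
def ring_mul(a, b, q):
    """Multiply coefficient lists a, b of equal length n in Z_q[X]/(X^n+1)."""
    n = len(a)
    res = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                res[i + j] = (res[i + j] + ai * bj) % q
            else:                       # X^(i+j) wraps around to -X^(i+j-n)
                res[i + j - n] = (res[i + j - n] - ai * bj) % q
    return res
```

For example, in \(\mathbb {Z}_{17}[X]/(X^2+1)\) the product \(X \cdot X = -1 \equiv 16\), so `ring_mul([0, 1], [0, 1], 17)` returns `[16, 0]`.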

Let \(x \leftarrow \chi (R_q)\) indicate sampling the coefficients of \(x \in R_q\) according to distribution \(\chi \). The sampling operation is extended coefficient-wise for vectors \(\pmb {x} \in R_q^{l \times 1}\) as \(\pmb {x} \leftarrow \chi (R_q^{l \times 1})\). Let \(\mathtt {Binom}(k, n, p)\) be the cumulative binomial distribution with n draws and probability p, so that \(\mathtt {Binom}(k, n, p) = \sum _{i=0}^{\lfloor k \rfloor } {n\atopwithdelims ()i}p^i(1-p)^{n-i}\) and let \(\mathtt {hypergeom}(k, N, K, n)\) be the hypergeometric distribution with population size N, success states K and draws n as defined by:

$$\begin{aligned} \mathtt {hypergeom}&(k, N, K, n) = \frac{ \left( \begin{array}{c} K \\ k\end{array}\right) \left( \begin{array}{c} N-K \\ n-k\end{array}\right) }{ \left( \begin{array}{c} N \\ n\end{array}\right) }, \end{aligned}$$
(1)
$$\begin{aligned}&\text {where: } \left( \begin{array}{c} a \\ b\end{array}\right) = \frac{a!}{b! (a-b)!}. \end{aligned}$$
(2)
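Both distributions can be evaluated directly from integer binomial coefficients; a small Python sketch (helper names are my own) of \(\mathtt {Binom}\) and \(\mathtt {hypergeom}\) as defined above:

```python
from math import comb, floor

def binom_cdf(k, n, p):
    """Binom(k, n, p): cumulative binomial, summed up to floor(k)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(floor(k) + 1))

def hypergeom_pmf(k, N, K, n):
    """hypergeom(k, N, K, n): probability of k successes in n draws without
    replacement from a population of size N with K success states.
    Assumes 0 <= k <= n; comb() returns 0 when a binomial term is impossible."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)
```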

2.2 Ring/Mod-LWE/LWR Based Encryption

A general framework for Ring/Mod-LWE/LWR based encryption schemes is provided in Algorithms 1, 2 and 3. The algorithms use the function \( \mathtt {gen}\) to generate the pseudorandom matrix \(\pmb {A}\) from a seed \(seed_{\pmb {A}}\), the function \(\mathtt {enc}\) to encode the message m into an element of \(R_q\) and the inverse function \(\mathtt {dec}\) to decode a polynomial back into a message bitstring. The latter decodes a coefficient of the polynomial correctly if its deviation from the initially encoded coefficient is smaller than q/4. If error correcting codes are used in the scheme, the function \(\mathtt {ecc\_enc}\) adds extra redundancy to the bitstring m to enable error correction, while \(\mathtt {ecc\_dec}\) recovers the original message if the number of flipped bits between \(m_{ecc}\) and \(m'_{ecc}\) is less than a threshold d, which depends on the chosen error correcting code (ECC). When no error correcting codes are used, the functions \(\mathtt {ecc\_enc}\) and \(\mathtt {ecc\_dec}\) act as the identity and return their input. The encryption algorithm \(\mathtt {PKE}{.}\mathtt {Enc}\) uses the seed r to pseudorandomly generate \(\pmb {s}'_B, \pmb {e}'_B\) and \(e''_B\).

(Algorithms 1, 2 and 3: key generation, encryption and decryption)
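As an illustration of the \(\mathtt {enc}\)/\(\mathtt {dec}\) pair described above, the following sketch (toy modulus and helper names of my own; one coefficient per message bit) encodes a bit as 0 or \(\lfloor q/2 \rfloor\) and decodes it correctly as long as the additive error stays below q/4:

```python
# Hypothetical sketch of enc/dec for a single bit: a bit is lifted to 0 or
# floor(q/2); decoding maps a coefficient back to 1 iff it lies within q/4 of
# floor(q/2), so any additive error of magnitude < q/4 is tolerated.
Q = 251  # toy modulus, not a parameter from the paper

def enc_bit(b, q=Q):
    return (q // 2) * b

def dec_bit(x, q=Q):
    d = min((x - q // 2) % q, (q // 2 - x) % q)  # central distance to floor(q/2)
    return 1 if d < q / 4 else 0

assert dec_bit((enc_bit(1) + 62) % Q) == 1   # error of 62 < q/4: tolerated
assert dec_bit((enc_bit(1) + 63) % Q) == 0   # error of 63 >= q/4: bit flips
```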

By choosing \(l=1\), one obtains a Ring based scheme, while a bigger value of l indicates a module (Mod) based scheme. In Mod/Ring-LWE based schemes, the error distribution \(\chi _e\) is nonzero, in contrast to Mod/Ring-LWR based schemes where \(\chi _e = 0\). In the latter case, parameters p and t are smaller than q, so that the rounding operations \(\lfloor \cdot \rceil _{q \rightarrow p}\) and \(\lfloor \cdot \rceil _{q \rightarrow t}\) introduce the errors necessary for security. The rounding additionally compresses the ciphertexts. The rounding operations \(\lfloor \cdot \rceil _{p \rightarrow q}\) and \(\lfloor \cdot \rceil _{t \rightarrow q}\) decompress the input back to approximately the original value. The error introduced by these rounding and reconstruction operations will be denoted as follows:
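The compression and reconstruction operations can be illustrated as follows (toy moduli of my choosing; note that Python's `round` uses round-half-to-even rather than the half-up rounding of \(\lfloor \cdot \rceil\), which does not affect the error bound here):

```python
# Toy illustration of the rounding operations and their reconstruction error.
def compress(x, q, p):      # round(p/q * x) in Z_p
    return round(p * x / q) % p

def decompress(y, q, p):    # round(q/p * y) back in Z_q
    return round(q * y / p) % q

q, p = 1024, 256
errs = [(decompress(compress(x, q, p), q, p) - x) % q for x in range(q)]
centred = [e if e <= q // 2 else e - q for e in errs]
assert max(abs(e) for e in centred) <= q // (2 * p)  # error bounded by q/(2p)
```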

$$\begin{aligned} \pmb {u}_A&= \pmb {b}_r - \pmb {A} \pmb {s}_A - \pmb {e}_A, \end{aligned}$$
(3)
$$\begin{aligned} \pmb {u}'_B&= \pmb {b}'_r - \pmb {A}^T \pmb {s}'_B - \pmb {e}'_B, \end{aligned}$$
(4)
$$\begin{aligned} u''_B&= v'_r - \pmb {b}_r^T \pmb {s}'_B - e''_B - \lfloor q/2 \rfloor \mathtt {enc}(m_{ecc}) \text {.} \end{aligned}$$
(5)

As a first step in determining the error probability of the encryption scheme, we can calculate the value of \(v'_r -v\) as follows:

$$\begin{aligned} v'_r -v&= ( \pmb {b}_r^T \pmb {s}'_B + e''_B + \lfloor q/2 \rfloor \mathtt {enc}(m_{ecc}) + u''_B) - \pmb {b}_r'^T \pmb {s}_A \end{aligned}$$
(6)
$$\begin{aligned}&= \lfloor q/2 \rfloor \mathtt {enc}(m_{ecc}) + (\pmb {e}_A + \pmb {u}_A)^T \pmb {s}'_B - (\pmb {e}'_B + \pmb {u}'_B)^T \pmb {s}_A + (u''_B + e''_B) \end{aligned}$$
(7)

The distribution of one coefficient of \(- (\pmb {e}'_B + \pmb {u}'_B)^T \pmb {s}_A + (\pmb {e}_A + \pmb {u}_A)^T \pmb {s}'_B + (u''_B + e''_B) \) can be calculated exhaustively. For the sake of convenience, we will rewrite this as \(\pmb {c}^T \pmb {s} + g\), where \(\pmb {s}\) is the vector constructed as the concatenation of \(-\pmb {s}_A\) and \((\pmb {e}_A + \pmb {u}_A)\), where \(\pmb {c}\) is constructed similarly as the concatenation of \((\pmb {e}'_B + \pmb {u}'_B)\) and \(\pmb {s}'_B\), and where \(g = u''_B + e''_B\):

$$\begin{aligned} \pmb {s} = \begin{pmatrix} - \pmb {s}_A \\ \pmb {e}_A+\pmb {u}_A \end{pmatrix}, \quad \pmb {c} = \begin{pmatrix}\pmb {e}'_B+\pmb {u}'_B \\ \pmb {s}'_B \end{pmatrix}, \quad g = u''_B + e''_B. \end{aligned}$$
(8)

A coefficient of the polynomial \(v'_r -v\) decodes correctly if the absolute value of the corresponding coefficient of the error term \(\pmb {c}^T \pmb {s} + g\) is smaller than q/4. A higher value results in a flipped bit after decoding, which will be called a bit error and will be denoted with \(F_i\) with i the position of the bit in the message. If the number of bit errors exceeds the threshold for error correction d, a decryption failure occurs, which we will denote with the symbol F. A correct decryption will be denoted with S, so that by definition \(P[S] = 1 - P[F]\).

In Table 1, the parameters for LAC-128 and LAC-256 [15] are given. These schemes are used throughout this paper to validate our methodology, as their high failure rate and significant error correction make the failure rate calculation more sensitive to error dependencies. Due to the choices of the moduli q, p and t, the rounding errors \(\pmb {u}_A\), \(\pmb {u}'_B\) equal the zero vector and \(u''_B\) is the zero polynomial.

Table 1. Parameters for LAC

2.3 Key Encapsulation Mechanism

From an IND-CPA secure encryption scheme, an IND-CCA secure Key Encapsulation Mechanism (KEM) can be constructed using a post-quantum version [12] of the Fujisaki-Okamoto transformation. The key generation phase is the same as Algorithm 1, and the Encapsulation and Decapsulation functions are defined in Algorithms 4 and 5 respectively, with \(\mathcal {G}\) and \(\mathcal {H}\) hash functions modeled as random oracles.

(Algorithms 4 and 5: encapsulation and decapsulation)

3 Error Dependency

The typical method to calculate the failure rate is to determine the error probability of a single bit of \(m'_{ecc}\), calculated as \(p_b = P[ |(\pmb {c}^T\pmb {s}+g)_i| > q/4]\), and then to assume independence in order to extend this error probability to the full failure rate. For a scheme that does not use any error correction, this can be expressed as \(1-(1-p_b)^{l_m} \) or \(1- \mathtt {Binom}(0,l_m, p_b)\), with \(l_m\) the length of the encoded message \(m_{ecc}\). For schemes that deploy error correcting codes with a correction capability of d errors, the failure rate amounts to \(1-\mathtt {Binom}(d,l_m,p_b)\).
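For concreteness, these formulas can be evaluated numerically; the per-bit error rate, message length and correction capability below are illustrative values, not parameters of any submission:

```python
from math import comb, floor

def binom_cdf(k, n, p):
    """Cumulative binomial Binom(k, n, p) as defined in the preliminaries."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(floor(k) + 1))

p_b, l_m, d = 2**-20, 512, 16          # illustrative values only

no_ecc   = 1 - binom_cdf(0, l_m, p_b)  # == 1 - (1 - p_b)**l_m, no correction
with_ecc = 1 - binom_cdf(d, l_m, p_b)  # ECC correcting up to d errors
# under the independence assumption, error correction reduces the failure
# rate by many orders of magnitude
```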

However, this assumption of independence is not correct. In this section we show both theoretically and experimentally that there is a positive correlation between the errors in the bits of \(m'_{ecc}\). Intuitively, one can reason as follows: a high norm of \(\pmb {s}\) and \(\pmb {c}\) makes \((\pmb {c}^T\pmb {s}+g)\) more likely to produce bit errors, and conversely, bit errors are more likely to stem from a high-norm \(\pmb {s}\) and \(\pmb {c}\). Therefore, a bit error at a certain location increases the expected norm of \(\pmb {s}\) and \(\pmb {c}\), which in turn increases the bit error probabilities at other locations. In conclusion, bit errors are expected to be positively correlated.
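This intuition can be illustrated with a toy Monte Carlo model (my own construction, not the actual scheme): a per-message "norm" is shared by all bits, so that bit errors are conditionally independent given the norm but positively correlated marginally:

```python
# Toy model: each message draws a shared "norm" (low or high) that scales the
# error probability of every bit, producing clustered errors.
import random

random.seed(1)
l_m, trials = 128, 5000
p_lo, p_hi = 0.001, 0.02           # per-bit error rate for low/high-norm messages

counts = []
for _ in range(trials):
    p = p_lo if random.random() < 0.5 else p_hi   # shared per-message "norm"
    counts.append(sum(random.random() < p for _ in range(l_m)))

p_b = sum(counts) / (trials * l_m)               # marginal per-bit error rate
obs_zero = sum(c == 0 for c in counts) / trials  # observed P[no bit error]
ind_zero = (1 - p_b) ** l_m                      # prediction under independence
# obs_zero exceeds ind_zero: there are more fully correct messages (and more
# messages with many errors) than the independence model predicts
```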

In Fig. 1, the probability of various numbers of bit errors in \(m'_{ecc}\) is plotted for LAC-256, both experimentally, by running the protocol approximately \(2^{31}\) times, and theoretically under the independence assumption as \(1- \mathtt {Binom}(0,l_m, p_b)\), where \(p_b\) is determined experimentally. The choice for LAC stems from the fact that the error probability of a bit of \(m'_{ecc}\) is large compared to other schemes, making it possible to experimentally collect enough errors for accurate estimates. In Fig. 1, one can see that the errors are clustered: there are more messages without errors and more messages with a high number of errors than predicted by the theoretical model, which confirms our hypothesis that the bit errors are positively correlated. Note that the error probability of a single bit is the same for the model and the experimental data; the errors are merely more clustered than the model predicts.

Fig. 1. The probability of a certain number of errors in \(m'_{ecc}\)

3.1 Handling the Dependency

In this section, we will develop a methodology to calculate the failure rate taking into account the dependency between the errors in the bits of \(m'_{ecc}\). For the sake of simplicity, we will first assume that there is no error correcting code.

$$\begin{aligned} 1 - P[F]&= P[S] \end{aligned}$$
(9)
$$\begin{aligned}&= P[S_0 \cdots S_n] \end{aligned}$$
(10)

Under the independence assumption, one can derive the formulas of the previous section as follows:

$$\begin{aligned} 1 - P[F]&= \prod _i P[S_i] \end{aligned}$$
(11)
$$\begin{aligned}&= (1 - P[F_0])^n \end{aligned}$$
(12)

However, step (11) is not valid if this assumption does not hold. To work around this issue, we condition on the additional information \(\pmb {s}, \pmb {c}\) and g:

$$\begin{aligned} 1 - P[F]&= \sum _{\pmb {s}, \pmb {c}, g} P[S_0 \cdots S_n \,|\, \pmb {s}, \pmb {c}, g ] P[\pmb {s}, \pmb {c}, g] \end{aligned}$$
(13)

As the \(S_i\) are fully determined when conditioned on \(\pmb {s}, \pmb {c}\) and g, the error or success of the other bits does not convey any extra information. Therefore, the bit successes \(S_i\) are independent conditioned on this extra information, so we can write:

$$\begin{aligned} 1 - P[F]&= \sum _{\pmb {s}, \pmb {c}, g} \prod _i \left( P[S_i \,|\, \pmb {s}, \pmb {c}, g ] \right) P[\pmb {s}, \pmb {c}, g] \end{aligned}$$
(14)
$$\begin{aligned}&= \sum _{\pmb {s}, \pmb {c}, g} \left( 1-P[F_0 \,|\, \pmb {s}, \pmb {c}, g ] \right) ^n P[\pmb {s}, \pmb {c}, g] \end{aligned}$$
(15)

Unfortunately, this expression is not efficiently computable, as it requires a summation over all possible values of \(\pmb {s}, \pmb {c}\) and g.

Note that the \(e''_B\) term of \(g_j\) does not add any information to \(S_i\) if \(j \ne i\) and that its coefficients are independent. We will assume that this is also the case for \(u''_B\), so we can write:

$$\begin{aligned} P[S_i | \pmb {s}, \pmb {c}, g] \approx P[S_i | \pmb {s}, \pmb {c}, g_i] \end{aligned}$$
(16)

From this result we can see that g contributes little or nothing to the dependency between the \(S_i\). As discussed in Sect. 3, the norm of \(\pmb {s}\) and \(\pmb {c}\) is an important cause of dependency. For rings of the form \(\mathbb {Z}[X]/(X^n+1)\) we can assume that this is the main cause of correlation, as different coefficients of \(\pmb {c}^T\pmb {s}\) are calculated with different combinations of the elements of \(\pmb {c}\) and \(\pmb {s}\). This can be formalized as follows:

Assumption 1

For \(\pmb {s}, \pmb {c}\) and g as described in Eq. (8), where g and the coefficients of \(\pmb {s}\) and \(\pmb {c}\) are elements of the ring \(\mathbb {Z}[X]/(X^n+1)\), we can approximate \(S_0 \cdots S_n\) to be independent conditioned on \(\Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2\), which is equivalent to \(P[S_0 \cdots S_n \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2 ] \approx \prod _{i} P[S_i \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \).

Using this assumption we write:

$$\begin{aligned} 1 - P[F]&= \sum _{\Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2} P[S_0 \cdots S_n \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2 ] P[\Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \end{aligned}$$
(17)
$$\begin{aligned}&\approx \sum _{\Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2} \prod _{i} \left( P[S_i \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \right) P[\Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \end{aligned}$$
(18)
$$\begin{aligned}&\approx \sum _{\Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2} \left( P[S_0 \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \right) ^n P[\Vert \pmb {s} \Vert _2] P[\Vert \pmb {c} \Vert _2] \end{aligned}$$
(19)
$$\begin{aligned}&\approx \sum _{\Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2} \left( 1 - P[F_0 \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \right) ^n P[\Vert \pmb {s} \Vert _2] P[\Vert \pmb {c} \Vert _2] \end{aligned}$$
(20)
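The model of Eq. (20) can be sketched numerically with a toy distribution over norm classes and illustrative conditional error rates (not LAC's values): the failure rate is computed per class and then averaged with the class probabilities.

```python
# Toy stand-in for Eq. (20): a small set of "norm classes", each with a
# probability and a conditional per-bit error rate (illustrative values only).
n = 256
norm_classes = {
    "low":  (0.25, 1e-6),   # (P[class], P[F_0 | class])
    "mid":  (0.50, 1e-5),
    "high": (0.25, 1e-4),
}

# dependency-aware model: average the per-class success probabilities (Eq. (20))
dependent_failure = 1 - sum(w * (1 - pb) ** n for w, pb in norm_classes.values())

# independence assumption: use only the marginal bit error rate (Eq. (12))
p_b = sum(w * pb for w, pb in norm_classes.values())
independent_failure = 1 - (1 - p_b) ** n
# without ECC, independent_failure >= dependent_failure by convexity of
# (1-p)**n, i.e. independence slightly overestimates the failure rate
```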

Using a similar derivation, the failure rate for schemes with error correction under Assumption 1 can be calculated as:

$$\begin{aligned} 1 - P[F] \approx&\sum _{\Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2} \left( 1-\mathtt {Binom}(d,l_m,p_b) \right) P[\Vert \pmb {s} \Vert _2] P[\Vert \pmb {c} \Vert _2] \end{aligned}$$
(21)
$$\begin{aligned}&\text {where: } p_b = P[F_0 \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \end{aligned}$$
(22)

To conclude, one calculates the failure rate for every value of \(\Vert \pmb {s} \Vert _2\) and \(\Vert \pmb {c} \Vert _2\), and obtains the overall failure rate as the weighted average. The model from Eq. (20) can be seen as an intermediate between the model from Eq. (12), which was constructed under the independence assumption, and the exact but computationally intractable model from Eq. (15). In this intermediate model, the main source of correlation between the \(S_i\), following Assumption 1, is taken into account. In the next section we experimentally assess this intermediate model and observe that it closely matches the experimental data, thus validating our assumption.

3.2 Experiments

To validate the developed methodology, we ran LAC-256 approximately \(2^{31}\) times to get experimental data on the probability of a certain number of failures in \(m'_{ecc}\). We calculated the same probability using the assumption of independence and our dependency aware model.

In general, \(P[F_0 \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2]\) can be calculated using a Gaussian assumption on the distribution of \(\pmb {c}^T\pmb {s} + g\), as described in [5]. For our calculations on LAC we use a more exact algorithm that exploits the fact that the elements of \(\pmb {c}, \pmb {s}\) and g are ternary. Intuitively, we first calculate the probability that a certain number l of nonzero coefficients of \(\pmb {c}\) and \(\pmb {s}\) coincide during the multiplication, expressed as \(P[ {( \mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}) )_0=l} \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2]\). Then, we model the term \((\pmb {c}^T\pmb {s})_0\), given \(( \mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}) )_0=l\), as a sum of l terms randomly picked as plus or minus 1. The full derivation can be expressed as follows:

$$\begin{aligned} p_b&= P[ \mathtt {abs}(\pmb {c}^T\pmb {s} + g)_0 > q/4 \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2 ] \end{aligned}$$
(23)
$$\begin{aligned}&= \sum _l \left( \begin{array}{l} P[ \mathtt {abs}(\pmb {c}^T\pmb {s} + g)_0 > q/4 \,|\, ( \mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}) )_0 = l, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2 ] \cdot \\ P[(\mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}))_0 = l \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \end{array} \right) \end{aligned}$$
(24)
$$\begin{aligned}&= \sum _l \left( \begin{array}{l} P[ \mathtt {abs}(\pmb {c}^T\pmb {s} + g)_0 > q/4 \,|\, (\mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}) )_0 = l ] \cdot \\ P[(\mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}) )_0 = l \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \end{array} \right) \end{aligned}$$
(25)
$$\begin{aligned}&= \sum _l \sum _{g_0} \left( \begin{array}{l} P[ \mathtt {abs}(\pmb {c}^T\pmb {s} + g)_0 > q/4 \,|\, ( \mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}) )_0 = l , g_0 ] \cdot \\ P[( \mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}) )_0 = l \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2] \cdot P[g_0] \end{array} \right) \end{aligned}$$
(26)

We can model \(P[ (\pmb {c}^T\pmb {s})_0 > q/4 - g_0 \,|\, (\mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}))_0 = l , g_0 ]\) as the survival function of a binomial distribution, which can be calculated as \(\mathtt {Binom}(\frac{l-q/4+g_0}{2}, l, 1/2)\). Similarly, \(P[ (\pmb {c}^T\pmb {s})_0 < -q/4 - g_0 \,|\, (\mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}))_0 = l , g_0 ]\) can be modelled as \(\mathtt {Binom}(\frac{l-q/4-g_0}{2}, l, 1/2)\), so that \(P[ \mathtt {abs}(\pmb {c}^T\pmb {s} + g)_0 > q/4 \,|\, ( \mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}) )_0 = l , g_0 ]\) is the sum of both probabilities. The distribution \(P[(\mathtt {abs}(\pmb {c})^T \mathtt {abs}(\pmb {s}))_0 = l \,|\, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2]\) can be seen as a hypergeometric distribution \(\mathtt {hypergeom}(l, n, \Vert \pmb {s} \Vert _2, \Vert \pmb {c} \Vert _2)\).
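The computation sketched in Eqs. (23)–(26) can be prototyped as follows; the parameters are toy values and the function name is my own. For ternary vectors the number of nonzero coefficients equals the squared \(l_2\)-norm, which serves as the hypergeometric weight here:

```python
from math import comb

def p_bit_error(q, n, ws, wc, g_dist):
    """P[|(c^T s + g)_0| > q/4] given the numbers of nonzero coefficients
    ws = ||s||_2^2 and wc = ||c||_2^2 of ternary vectors of length n, with g_0
    distributed according to g_dist (value -> probability)."""
    t = q / 4
    total = 0.0
    for l in range(min(ws, wc) + 1):
        # l nonzero coefficients of c and s coincide: hypergeometric weight
        pl = comb(wc, l) * comb(n - wc, ws - l) / comb(n, ws)
        for g0, pg in g_dist.items():
            # given l coincidences, (c^T s)_0 = 2*h - l with h ~ Binom(l, 1/2)
            hi = sum(comb(l, h) for h in range(l + 1) if 2 * h - l > t - g0)
            lo = sum(comb(l, h) for h in range(l + 1) if 2 * h - l < -t - g0)
            total += pl * pg * (hi + lo) / 2 ** l
    return total

# toy parameters: ternary g_0, small dimension; not LAC's values
p_b = p_bit_error(251, 256, 64, 64, {-1: 0.25, 0: 0.5, 1: 0.25})
```

As a sanity check, shrinking q (so that the q/4 threshold is easy to exceed) makes the probability large, and heavier secrets give a larger error probability than lighter ones.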

The probability of a decryption failure is plotted for various error correction capabilities of the ECC in Fig. 2. Our new dependency-aware model gives a much better estimate of the probability of exceeding a certain number of errors. Moreover, the independence-based model deviates further from the experimental data as the number of errors increases, which is the case for codes with higher error correction capabilities. This makes the dependency issue especially important for schemes with extensive error correction.

Fig. 2. Probability of failure for various error correction capabilities of \(\mathtt {ecc\_enc}\)

4 Implications

As seen in the previous sections, the errors in \(m'_{ecc}\) are positively correlated: an error at a certain position is more likely to happen if another error is present. The converse is also true: a correct bit of \(m'_{ecc}\) increases the probability that the other bits of \(m'_{ecc}\) are correct. Therefore, due to the dependency, there will be more fully correct messages than one would expect under the assumption of independence. However, as one can see in Fig. 2, the impact of the dependency is small for schemes without error correction. To conclude, an estimate using the assumption of independence will slightly overestimate the failure rate, and thus underestimate the security of the scheme by a small margin. As a result, the approximation under an assumption of independence is legitimate for schemes without an error correction step.

Table 2. The failure rate of different versions of LAC under the different models

In the case of schemes with error correction, one has to be more careful. As can be seen in Fig. 2, the independence model underestimates the failure rate, which corresponds to an overestimation of the security of the scheme. This overestimation grows as d, the error correction capability of the ECC, becomes larger. In Table 2, the estimated failure rates of different versions of LAC are compared under both models. The discrepancy between the models reaches a factor \(2^{48}\) in the case of LAC-128. Therefore, the assumption of independence is not valid for schemes with error correction, and it can lead to a serious overestimation of the security of the underlying algorithm.

More specifically, a higher failure probability suggests that the scheme might be more vulnerable to a decryption failure attack similar to the attack described by D’Anvers et al. [5], where the secrets are estimated statistically based on failing ciphertexts. Moreover, an attacker can increase the failure probability by performing a precomputation to find weak ciphertexts with a higher failure probability. As LAC does not include any protection against multi-target attacks that exploit decryption failures, this precomputation only needs to be performed once.

5 Conclusions

In this paper, we challenged the independence assumption for bit errors in messages encrypted with (Ring/Mod)-(LWE/LWR) based schemes. We showed both theoretically and experimentally that the occurrence of errors is positively correlated. Then we devised a method to calculate the failure rate of a scheme, taking the dependency of failures into account. Finally, we showed that the assumption of independence is appropriate for schemes without error correcting codes, but that it can lead to a substantial underestimation of the failure rate for schemes with error correcting codes. This underestimation reaches a factor of \(2^{48}\) for LAC-128. A higher-than-expected failure rate could have a serious impact on the security of the scheme through a decryption failure attack.