Keywords

1 Introduction

With the advent of quantum computers, many traditional public–key cryptosystems based on number–theoretic or elliptic curves primitives are to become vulnerable to attacks using them [14, 42]. So, there is a strong need in developing post-quantum cryptographic protocols that will remain secure against adversaries equipped with quantum computers. One of the most prominent and well-established approach to post-quantum cryptography is cryptography based on error-correcting codes.

The first code–based cryptosystem was proposed in 1978 by R. McEliece [31]. The main idea of the McEliece cryptosystem is to mask a generator matrix of a fast–decodable code by permuting its columns and multiplying by a scrambling matrix on the left. The encryption is performed by encoding a message using the public generator matrix and adding an error. So, the security against message–recovery attacks is based on NP–hard syndrome decoding problem [13]. The original proposal of R. McEliece was based on binary Goppa codes, so the security against key–recovery attack relies on hardness of the problem of distinguishing permuted Goppa codes. It is worth mentioning that the original McEliece cryptosystem with several improvements is one of three Round 4 competitors in NIST-PQC [1]. Despite many advantages, the main drawback of McEliece cryptosystem is large public–key size. There were many attempts to overcome this by replacing Goppa codes with more efficient ones. The notable examples are Generalized Reed–Solomon codes [35], Reed–Muller codes [43], algebraic geometry codes [29], concatenated codes [40], rank–metric Gabidulin codes [22]. However, most of this modifications were proven unsecure [15, 17, 32, 38, 40, 44]. In addition, several modifications of protocol itself were proposed to avoid key–recovery attacks against McEliece–like cryptosystems based on efficient algebraic codes (e.g. [7, 12, 28, 48]), however most of them were also successfully cryptanalyzed [16, 18, 19, 30, 47, 50].

One of the most efficient approaches to reducing public-key size was proposed by P. Gaborit [23] and is based on using quasi-cyclic codes (QC-codes). A code C of length \(n = n'l\) is said to be quasi-cyclic of order \(n'\) and index l if its permutation automorphism group \({\text {PAut}}(C)\) has a cyclic subgroup of order \(n'\) that acts freely on coordinates. The quasi-cyclic structure implies the existence of generator and parity-check matrices of C that admit a block-circulant representation, i.e.

$$\begin{aligned} \begin{pmatrix} {\text {rot}}(h_{1,1}) &{} \dots &{} {\text {rot}}(h_{1,l}) \\ \vdots &{} \ddots &{} \vdots \\ {\text {rot}}(h_{s,1}) &{} \dots &{} {\text {rot}}(h_{s,l}) \end{pmatrix}, \quad {\text {rot}}(a_1, a_2, \dots , a_{n'}) = \begin{pmatrix} a_1 &{} a_2 &{} \dots &{} a_{n'} \\ a_n &{} a_1 &{} \dots &{} a_{n'-1} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ a_2 &{} a_3 &{} \dots &{} a_1 \end{pmatrix}. \end{aligned}$$
(1)

This representation allows storing only the first row of each circulant block \({\text {rot}}(h_{i,j})\), thereby reducing storage and communication costs. Therefore, the public key sizes of code-based encryption protocols that preserve quasi-cyclic structure can be significantly reduced. Note that many encryption protocols based on algebraic QC–codes (e.g. [12, 23]) have been successfully attacked [20, 36]. However, protocols based on random quasi–cyclic moderate density parity–check (QC–MDPC) codes [33], which have no algebraic structure except being quasi–cyclic, are still considered secure and efficient.

The concept of moderate-density parity-check (MDPC) codes extends the idea of low-density parity-check codes (LDPC codes) initially introduced by R. Gallager [24]. In Gallager’s seminal work [24], it was shown that efficient decoding of binary codes with a parity-check matrix containing a very small constant number of ones in each row is feasible using iterative algorithms such as bit-flipping and belief propagation, provided certain conditions are met (no two rows have two or more ones in the same positions). In 2000, C. Monico et al. [34] considered replacing Goppa codes in the McEliece cryptosystem with LDPC codes and pointed out that these codes can be easily distinguished due to the existence of very low–weight codewords in the dual code. The application of quasi-cyclic LDPC codes in constructing code-based cryptosystems was initially proposed in [11] and further developed in [8, 10]. To mitigate key-recovery attacks based on searching for low-weight dual codewords, it was suggested to replace the permutation matrix in the protocol with a sparse non–singular matrix of a specific form. However, this approach was found to introduce serious vulnerabilities [2, 36]. An alternative method to prevent key–recovery based on the search for low–weight codewords was proposed in [33], where it was suggested to use random QC–MDPC codes instead of LDPC. The difference between MDPC and LDPC codes lies in the slightly higher weight of the rows in the parity-check matrices, i.e., which is of order \(O(\sqrt{n})\) for MDPC codes and O(1) for LDPC.

We denote the finite field of size q as \(\mathbb {F}_q\). For a vector \(v \in \mathbb {F}_q^n\), the notation \({\text {supp}}(v) = \{ i \in \llbracket 1, n \rrbracket \mid v_i \ne 0 \}\) is used to represent the set of indices corresponding to the positions where v is nonzero. Here, \(\llbracket a, b \rrbracket = \left\{ a, a+1, \dots , b \right\} \) denotes set of all integers between a and b. The Hamming weight of vector v, denoted as \({\text {wt}}(v)\), is defined as the number of nonzero positions in v. A linear code \(C \in \mathbb {F}_q^n\) of length n and dimension k is refereed as \([n,k]_q\)–code. A generic description of a QC-MDPC cryptosystem [33] in the Niedderiter form [35] is as follows:

  • Key generation The secret key is the parity-check matrix H of a random QC-MDPC \([n=ln', (l-1)n']_q\)-code, represented as

    $$\begin{aligned} H = \begin{pmatrix} H_1 \; | \; H_2 \; | \; \dots \; | \; H_{l-1} \; | \; H_l \end{pmatrix}. \end{aligned}$$
    (2)

    The matrix H consists of circulant \((n' \times n')\)–blocks \(H_i\), where each \(H_i\) has a row weight of \(\gamma \). Note that \(n'\) is usually chosen to be a prime number p. The public key is the systematic form of H, i.e.

    $$ \tilde{H} = H_1^{-1} H = \begin{pmatrix} I_{n'} \; | \; H_1^{-1} H_2 \; | \; \dots \; | \; H_1^{-1} H_l, \end{pmatrix} $$

    which can be represented by the first rows of \(H_1^{-1} H_i\), where \(i \in \llbracket 2, l \rrbracket \), since the product of circulant matrices is also a circulant matrix.

  • Encryption The plaintext is an error vector \(e \in \mathbb {F}_q^n\) of weight t, and the ciphertext is its syndrome \(\tilde{s} = \tilde{H} e^\textsf{T}\).

  • Decryption To decrypt, the private syndrome \(s = H e^\textsf{T}= H_1 \tilde{s}^{\textsf{T}}\) is computed and used as input for the MDPC decoder (bit-flipping or symbol flipping).

Note that in NIST-PQC, the QC–MDPC approach is represented by BIKE (bit-flipping key encapsulation) protocol [3].

Due to probabilistic nature of decoding of LDPC and MDPC codes, there is a non–zero probability of decryption failure. In [26] it was shown that decryption failures can be used to recover the secret key in binary case. Hence in order to achieve indistinguishability against chosen ciphertext attacks, where an adversary has an access to a decryption oracle (IND–CCA2 security), the decoding failure rate (DFR) has to be negligibly small, i.e. of order \(2^{-\lambda }\), where \(\lambda \) is a security level. In [46], an experimental–based extrapolation framework for estimating DFR has been proposed. In this approach, the DFR curve is assumed to be concave, so estimates for high DFR (\(> 10^{-9}\)) can be obtained via numerical simulations and then extrapolated to low DFRs providing an upper bound. However, it is known that LDPC and MDPC codes exhibit error floor phenomenon, resulting in violation of concavity assumption (see e.g. [4, 6]). Hence DFR estimates obtained by extrapolation could possibly be overly optimistic (see Fig. 1). Another approach is to estimate DFR using only theoretical tools. In [45] J. P. Tillich studied guaranteed error–correction performance of binary QC–MDPC codes under one–step majority logic decoder (OSML). In addition, in [45] the DFR of two–iteraion decoder is studied under some reasonable assumptions, i.e. the probability that one iteration of parallel bit–flipping decoder reduces error weight enough to be corrected by OSML decoder is computed. In [39], the estimate of the number of errors correctable by OSML decoder was improved. Under the same assumptions as in [45], the worst–case plausibility analysis of one and two iteration randomized serial bit–flipping decoder was performed in [5]. In addition, in [5] a combination of one iteration of randomized serial bit-flipping and OSML was studied, and recommended design parameters for IND–CCA2 secure QC-MDPC cryptosystems were given.

Fig. 1.
figure 1

Approximate illustration of a situation where the use of extrapolation may lead to an incorrect estimation of DFR due to the presence of an error floor.

In this paper, we study DFR of non–binary QC–MDPC codes using theoretical tools. Namely, we extend the results of [39, 45] to the non–binary case, i.e. we show that error–correcting performance of OSML decoder can also be estimated using similar methods of [39, 45]. In addition, we propose a parallel symbol flipping decoder. Under the same assumptions used in [5, 45], we give theoretical estimates of DFR for the parallel symbol–flipping decoder and its combination with the OSML decoder. We also note that the extension of the randomized serial approach, as considered in [5], in the non-binary case seems to yield unreliable results due to a observed discrepancy between the theoretical estimates and the worst-case simulations. Hence this approach is not included in this paper. In addition, we experimentally demonstrate that slightly modified attack of [26] can also recover secret key in non–binary case. Employing the obtained results, recommended parameters and corresponding key sizes for IND-CCA2–secure QC–MDPC cryptosystems over \(\mathbb {F}_q\) for various security levels are computed.

The paper is organized as follows. In Sect. 2, we present the basic principles of decoding non-binary QC-MDPC codes and study the guaranteed error-correction capability of the one-step majority logic decoder in an assumption-free setting. In Sect. 3, we provide a plausibility analysis of error counters distribution and flipping probability in the non-binary case. Subsequently, we propose a 1-iteration parallel symbol flipping decoder and theoretically estimate the probability of reducing the error weight to a certain value, allowing for further decoding by the OSML decoder. We also provide experimental validation of the theoretical model. Finally, in Sect. 4, we consider the reaction attack against non–binary QC–MDPC cryptosystems and find potential cryptosystem parameters and corresponding public-key sizes.

2 Analysis of Guaranteed Error–correction Capability of Non–binary QC-MDPC Codes

Recall that a code C with a parity–check matrix \(H \in \mathbb {F}_q^{m \times n}\) is said to be a moderate–density parity–check (MDPC) code if each row of the \(H=(h_{i,j})\) is of weight \(O(\sqrt{n})\). In addition, C is said to be \((\gamma , \delta )\)–regular if the weight of each column of H is \(\gamma \) and the weight of each row is \(\delta \). Unless otherwise specified, we will focus exclusively on regular MDPC codes.

Let \(z = c + e \in \mathbb {F}_{q}^n\), where \(c \in C\) and \({\text {wt}}(e) \le t\), be a noisy codeword. By \(s = Hz^{\textsf{T}} = He^{\textsf{T}}\) we denote the syndrome of e. One can easily note that since i–th position of s is computed as

$$ s_i = \langle h_i, e \rangle = \sum _{\omega \in {\text {supp}}{(h_i)}} h_{i, \omega } e_{\omega }. $$

Hence, by selecting \(\gamma \) row indices \(i_1, i_2, \dots , i_{\gamma }\) for which \(h_{i_1,j}, \dots , h_{i_1,j}\) are non-zero, we obtain the following \(\gamma \) equalities:

$$\begin{aligned} {\left\{ \begin{array}{ll} s_{i_1} h_{i_1, j}^{-1} = e_j + h_{i_1, j}^{-1} \left( \sum \nolimits _{\omega \in {\text {supp}}(h_{i_1}) \setminus \{ j \} } h_{i_1, \omega } e_{\omega } \right) , \\ \quad \dots \\ s_{i_\gamma } h_{i_\gamma , j}^{-1} = e_j + h_{i_\gamma , j}^{-1} \left( \sum \nolimits _{\omega \in {\text {supp}}(h_{i_\gamma }) \setminus \{ j \} } h_{i_\gamma , \omega } e_{\omega } \right) . \end{array}\right. } \end{aligned}$$
(3)

Since C is an MDPC code, the rows \(h_{i}\) of H are sparse. Considering sparsity of e, it follows that \(s_{i} h_{i,j}^{-1}\) equals \(e_j\) with high probability. Hence it is possible to use the values \(s_{i} h_{i,j}^{-1}\) for estimating e.

Let \(\mathbb {F}_q = \{ \alpha _0 = 0, \alpha _1=1, \alpha _2, \dots , \alpha _{q-1} \}\) be a enumeration of elements of \(\mathbb {F}_q\). Let us define

$$\begin{aligned} \sigma _{j, i} = \left| \left\{ w \mid h_{w,j} \ne 0 \; \text {and} \; s_i h_{w,j}^{-1} = \alpha _i \right\} \right| \end{aligned}$$
(4)

as the number of rows \(h_{w}\) containing the position j in \({\text {supp}}(h_w)\) and \(s_i h_{w,j}^{-1} = \alpha _i\). The values of \(\sigma _{j, i}\) will be referred to as error counters in position j. Clearly, \(\sigma _{j, i}\) indicates the likelihood that the error value \(e_j\) in position j is equal to \(\alpha _i\). In particular, a higher value of \(\sigma _{j,0}\) implies that position j is less likely to be corrupted, while higher values of \(\sigma _{j,i}\), \(i \ne 0\), indicate a greater likelihood that \(e_j = \alpha _i \ne 0\).

Therefore, several decoding strategies are possible. For instance, it is possible to choose an information set I of k positions with highest \(\sigma _{j, 0}\), indicating that these positions less likely to be erroneous, and then use this I for information set decoding (ordered statistics decoding [21] and statistical decoding [37]).

Another straightforward decoding algorithm that uses counters is as follows:

  1. 1.

    compute the syndrome s and the counters \(\sigma _{j,i}\) for all \(j \in \llbracket 1, n \rrbracket \) and \(i \in \llbracket 0, q-1 \rrbracket \);

  2. 2.

    update the position j of the received word z having the highest value of \(\sigma _{j}^* - \sigma _{j,0}\), where

    $$\begin{aligned} \sigma _j^* = \max _{i \in \llbracket 1, q-1 \rrbracket } \sigma _{j,i}, \end{aligned}$$
    (5)

    to the new value \(z_j - \alpha _{i^*}\), where \(i^* = {\text {argmax}}_{i \in \llbracket 1, q-1 \rrbracket }\sigma _{j,i}\);

  3. 3.

    repeat from step 1 until either \(s = 0\) or maximum number of iterations is reached.

Remark 1

One can easily note that the syndrome weight after step 2 is decreased by \(\sigma _{j}^{*} - \sigma _{j,0}\). Therefore, the error position and error value in step 2 are chosen to decrease the syndrome weight the most. In this formulation the decoding approach described above was proposed in [9] as a generalization of Gallager’s bit–flipping. In the binary case, the Gallagher’s decoder is also a greedy algorithm that reduces the syndrome weight the most in each step .

2.1 One–Step Majority Logic Decoding

In this subsection, we study guaranteed decoding performance of regular MDPC codes under the OSML decoder (Algorithm 1) which can be considered as single iteration version of parallel symbol flipping.

Algorithm 1:
figure a

OSML

Remark 2

Note that in the decoder description, instead of recovering the corrected codeword \(c \in C\) from the noisy codeword \(z = c + e\) by iteratively subtracting the estimated error from z, we employ an equivalent formulation where we iteratively find the estimated error \(\tilde{e}\) itself.

Let \(X \in \mathbb {F}_q^{m \times n}\) be an \((m \times n)\)–matrix, and let \(I \subset \llbracket 1, m \rrbracket \) and \(J \subset \llbracket 1, n \rrbracket \) be sets of row and column indices, respectively. We denote the matrix composed of the elements of X with indices \((i,j) \in I \times J\) as \(X_{I, J} = (x_{i,j})_{i \in I, j \in J}\). For convenience, we use the notations \(X_{:, J}\) and \(X_{I, :}\) to represent \(X_{\llbracket 1, m \rrbracket , J}\) and \(X_{I, \llbracket 1, n \rrbracket }\), respectively.

Proposition 1

Let \(H = (h_{i,j}) \in \mathbb {F}_q^{m \times n}\) be a parity–check matrix of a MDPC code, and let \(e \in \mathbb {F}_q^n\) be an error of weight t. Define \(H^{(j)}\) as the matrix consisting of rows from the set

$$ \left\{ h_{i,j}^{-1} \cdot \left( H_{i, \; \llbracket 1, n \rrbracket \setminus \{j\}} \right) \mid i \in \llbracket 1, m \rrbracket , \; h_{i,j} \ne 0 \right\} . $$

Let

$$ a_l = {\text {wt}}(H^{(j)}_{:, l}), \quad \mu (s) = \sum _{\begin{array}{c} \omega \in \text {indicies of }s\text { largest} \\ \text {values of }a_l \end{array}} a_{\omega }, $$

If \(e_j = \alpha _i\), then \(\sigma _{j,i}\) can be lower bounded as follows

$$ \sigma _{j,i} \ge {\left\{ \begin{array}{ll} \gamma - \mu (t), &{} e_j = \alpha _i = 0, \\ \gamma - \mu (t-1), &{} e_j = \alpha _i \ne 0. \\ \end{array}\right. } $$

Proof

Using (3), we obtain that \(\sigma _{j,i}\) denotes the frequency of occurrence of \(\alpha _i\) in the vector

$$ v = \begin{pmatrix} s_{i_1} h_{i_1,j}^{-1} \\ \vdots \\ s_{i_\gamma } h_{i_\gamma ,j}^{-1} \end{pmatrix} = \begin{pmatrix} e_j \\ \vdots \\ e_j \end{pmatrix} + \underbrace{H^{(j)} {e'}^{\textsf{T}}}_{v'}, \quad e' = e_{\llbracket 1, n \rrbracket \setminus \{j\}}. $$

Hence if \(e_j = \alpha _i\) then \(\sigma _{j, i} = \gamma - {\text {wt}}(v')\). Since \(v'\) is a linear combination of \({\text {wt}}(e')\) columns of \(H^{(j)}\), its weight can be upper bounded by

$$ {\text {wt}}(v') \le \mu ({\text {wt}}(e')) = {\left\{ \begin{array}{ll} \mu (t), &{} e_j = 0, \\ \mu (t-1), &{} e_j \ne 0. \end{array}\right. } $$

This concludes the proof of the proposition.

Remark 3

Note that the weight \({\text {wt}}(H^{(j)}_{:, l})\) of l–th column \(H_{:, l}^{(j)}\) of \(H^{(j)}\) equals

$$ \left| {\text {supp}}\left( H_{:, l} \right) \cap {\text {supp}}\left( H_{:, j} \right) \right| . $$

Corollary 1

Let \({\text {wt}}(e) \le t\). If \(\mu (t) < th_j \le \gamma - \mu (t-1)\), then the OSML decoder correctly estimates the j–th position of e.

Proof

If \(e_j = 0\), then \(\sigma _{j,0} \ge \gamma - \mu (t)\) and hence \(\sigma _j^* \le \gamma - \sigma _{j,0} \le \mu (t)\). It follows that setting \(th_j > \mu (t)\) in Algorithm 1 will ensure that no non–erroneous position will be corrupted.

If \(e_j = \alpha _i \ne 0\), then \(\sigma _{j,i} \ge \gamma - \mu (t-1)\). Since \(\mu (t) < \gamma - \mu (t-1)\) and \(\mu (t) \ge \mu (t-1)\), it follows that \(\mu (t-1) < \gamma /2\) and thereby \(\sigma _{j,i} \ge \gamma - \mu (t-1) > \gamma /2\). This implies that a clear majority of equalities in (3) vote for \(\alpha _i\) and hence \(\sigma _j^* =\sigma _{j,i}\) (see (5)). Therefore, setting \(th_j < \gamma - \mu (t-1)\) will ensure that error value in a erroneous position will be estimated correctly.

Corollary 2

The guaranteed error–correction capability of OSML decoder is t if for all \(j \in \llbracket 1, n \rrbracket \) it is possible to choose \(th_j\) according to Corollary 1.

Note that OSML is a very simple yet effective decoder that is capable of correcting low–weight error patterns. However, it is particularly useful as a second decoding iteration because it does not rely on probabilistic assumptions. It can effectively decode errors of a certain weight that remain after previous iterations, even if they have a harder–to–decode structure that would make plausibility analysis based on probabilistic assumptions irrelevant.

3 Plausibility Analysis of 1–iteration Parallel Symbol Flipping Decoder

In this section, we provide an analysis of the single-iteration parallel symbol flipping algorithm. Namely, following the approach of [45], we estimate the probability of correcting an error using this decoder under several probabilistic assumptions. Furthermore, under the same assumptions, we estimate the probability of decreasing the error weight to a value that allows correction by the OSML decoder. This provides an upper bound on the decoding failure rate for the combination of a single iteration of parallel symbol flipping followed by the OSML decoder.

3.1 Distribution of Counters

Below we give necessary results on probabilistic distributions of syndrome values and counters \(\sigma _{j,i}\), \(j \in \llbracket 1, n \rrbracket \), \(i \in \llbracket 0, q-1 \rrbracket \), required for further analysis of decoding iteration of proposed parallel symbol–flipping decoder. Our analysis will rely on several assumptions that are analogous to those used in [5, 45].

Assumption 1

Let H be a parity–check of a random QC–MDPC code C in block–circulant form. It is assumed that each row of H is well modeled as a sample from uniform distribution over \(\mathbb {F}_q^n\).

Proposition 2

Let \(x \in \mathbb {F}_q^n\), \(y \in \mathbb {F}_q^n\) be uniformly sampled. Let

$$\begin{aligned} A_m = \Pr [ \langle x, y \rangle \ne 0 \mid \left| {\text {supp}}(x) \cap {\text {supp}}(y) \right| = m ]. \end{aligned}$$

Then \(A_m\) can be found recursively using

$$ A_m = {\left\{ \begin{array}{ll} (1 - A_{m-1}) + \frac{q-2}{q-1} A_{m-1}, &{} m \ge 1 \\ 0, &{} m = 0. \end{array}\right. } $$

Proof

Without loss of generality, we assume that \({\text {supp}}(x) \cap {\text {supp}}(y) = \{ 1, \dots , m \}\). It follows that

$$\begin{aligned} A_m &= \Pr \left[ \left( \sum _{i=1}^{m-1} x_i y_i = 0 \right) , \; x_m y_m \ne 0\right] + \Pr \left[ \left( \sum _{i=1}^{m-1} x_i y_i \ne 0 \right) , x_my_m \ne - \sum _{i=1}^{m-1} x_i y_i \right] = \\ &= \Pr \left[ \left( \sum _{i=1}^{m-1} x_i y_i = 0 \right) \right] \cdot \Pr \left[ \; x_m y_m \ne 0 \mid \left( \sum _{i=1}^{m-1} x_i y_i = 0 \right) \right] + \\ &+ \Pr \left[ \left( \sum _{i=1}^{m-1} x_i y_i \ne 0 \right) \right] \cdot \Pr \left[ \; x_m y_m \ne -\alpha \mid \left( \sum _{i=1}^{m-1} x_i y_i = \alpha \ne 0 \right) \right] = \\ &= (1-A_{m-1}) \cdot 1 + A_{m-1} \frac{q-2}{q-1}. \end{aligned}$$

Theorem 1

Let \(H = (h_{i,j})\) be a parity–check matrix of \((\gamma , \delta )\)–regular QC-MDPC code C of length n. Let \(e \in \mathbb {F}_q^n\) be a random error of weight t, and \(s = eH^\textsf{T}\) be its syndrome. Then for any row \(h_i\) of H, such that \(j \in {\text {supp}}(h_i)\)

$$\begin{aligned} \Pr [s_i h_{i,j}^{-1} = e_j \mid e_j \ne 0] &= \sum _{i=0}^{\min (\delta -1, t-1)}\frac{\left( {\begin{array}{c}\delta -1\\ i\end{array}}\right) \left( {\begin{array}{c}n-\delta \\ t-i-1\end{array}}\right) }{\left( {\begin{array}{c}n-1\\ t-1\end{array}}\right) }(1 - A_{i}), \end{aligned}$$
(6)
$$\begin{aligned} \Pr [s_i h_{i,j}^{-1} = e_j \mid e_j = 0] &= \sum _{i=0}^{\min (\delta -1, t)}\frac{\left( {\begin{array}{c}\delta -1\\ i\end{array}}\right) \left( {\begin{array}{c}n-\delta \\ t-i\end{array}}\right) }{\left( {\begin{array}{c}n-1\\ t\end{array}}\right) }(1 - A_{i}), \end{aligned}$$
(7)
$$\begin{aligned} \Pr [s_i h_{i,j}^{-1} = \alpha \ne e_j \mid e_j \ne 0] &= (q-1)^{-1} \left( 1 - \Pr [ s_i h_{i,j}^{-1} = e_j \mid e_j \ne 0 ] \right) , \end{aligned}$$
(8)
$$\begin{aligned} \Pr [s_i h_{i,j}^{-1} = \alpha \ne 0 \mid e_j = 0] &= (q-1)^{-1} \left( 1 - \Pr [s_i h_{i,j}^{-1} = e_j \mid e_j = 0] \right) . \end{aligned}$$
(9)

Proof

Since \(j \in {\text {supp}}(h_i)\), Eq. (3) implies that \( s_i h_{i,j}^{-1} = e_j + h_{i,j}^{-1} \langle e', h' \rangle , \) where

$$\begin{aligned} e' = e_{\llbracket 1, n \rrbracket \setminus \{ j \} }, \quad h' = H_{i, \llbracket 1, n \rrbracket \setminus \{ j \} }. \end{aligned}$$

One can easily note that

$$\begin{aligned} {\text {wt}}(e') = {\left\{ \begin{array}{ll} t, &{} e_j = 0 \\ t-1, &{} e_j \ne 0 \end{array}\right. }, \quad {\text {wt}}(h') = \delta - 1. \quad \end{aligned}$$
(10)

Since \(s_i h_{i,j}^{-1} = e_j\) if and only if \(\langle h', e' \rangle = 0\), it follows that

$$ \Pr [s_i h_{i,j}^{-1} = e_j \mid e_j = \alpha ] = \Pr [ \langle e', h' \rangle = 0]. $$

So, using Assumption 1, we obtain

$$\begin{aligned} \Pr [ \langle e', h' \rangle = 0 ] &= \sum _{i=0}^{\min ({\text {wt}}(e'), {\text {wt}}(h'))} \Pr [ \langle e', h' \rangle = 0, \; \left| {\text {supp}}(e') \cap {\text {supp}}(h') \right| = i ] = \\ {} &= \sum _{i=0}^{\min ({\text {wt}}(e'), {\text {wt}}(h'))} (1 - A_{i}) \cdot \Pr [ \left| {\text {supp}}(e') \cap {\text {supp}}(h') \right| = i ] = \\ {} &= \sum _{i=0}^{\min ({\text {wt}}(e'), {\text {wt}}(h'))} \frac{\left( {\begin{array}{c}{\text {wt}}(h')\\ i\end{array}}\right) \left( {\begin{array}{c}n - 1 - {\text {wt}}(h')\\ {\text {wt}}(e') - i\end{array}}\right) }{\left( {\begin{array}{c}n-1\\ {\text {wt}}(e')\end{array}}\right) } (1-A_i). \end{aligned}$$

Substituting (10) into this formula, we obtain (6) and (7). In addition, when \(\langle e', h' \rangle \ne 0\), the product \(\langle e', h' \rangle \) can assume any non–zero element of \(\mathbb {F}_q\) with equal probabilities. Consequently, we obtain (8) and (9).

In the parallel symbol flipping decoder (see Algorithm 2), we propose the following flipping criterion based on counter values, using three decoding thresholds: \(th_0\), \(th_E\), and \(th_D\). Namely, the position j of the received noisy codeword \(z = c + e\) will be updated to \(z_j - \alpha _{i}\) if the following conditions are satisfied:

  1. 1.

    \(\sigma _{j,i} > \sigma _{j, \omega }\) for all \(\omega \in \llbracket 0, q-1 \rrbracket \setminus \{ i \}\), and thus \(\sigma _{j}^* = \sigma _{j,i}\),

  2. 2.

    \(\sigma _{j}^* \ge th_E\),

  3. 3.

    \(\sigma _{j,0} < th_0\),

  4. 4.

    \(\sigma _{j}^* - \sigma _{j,0} \ge th_D\).

Note that conditions 1–4 can be replaced by the single condition

$$\begin{aligned} \sigma _j = (\sigma _{j,0}, \dots , \sigma _{j,q-1}) \in \varDelta _{th_0, th_E, th_D}(i), \end{aligned}$$

where \(\varDelta _{th_0, th_E, th_D}(i)\) is defined as follows

$$\begin{aligned} \varDelta _{th_0, th_E, th_D}(i)= \varDelta (i) = \Big \{ (b_0, \dots , b_{q-1}) \in \mathbb {Z}^q \mid &\sum _{\omega =0}^{q-1} b_{\omega } = \gamma , \; b_i > \max _{\omega \ne i} b_z, \\ &b_0 \le th_0, \; b_i \ge th_E, \; b_i - b_0 \ge th_D \Big \}. \end{aligned}$$

In the following theorem, we will estimate the probability that the flipping criterion accurately determines the positions and values of errors.

Assumption 2

We assume that the probability \(\Pr [\sigma _{j} \in \varDelta (i)]\) to flip position j to value \(z_j-\alpha _i\) is a function only of error weight, i.e. it does not depend on error structure and the location j.

Theorem 2

Let H be a parity–check matrix of \((\gamma , \delta )\)–regular QC-MDPC code C of length n and dimension k. Let \(e \in \mathbb {F}_q^n\) be a random error of weight t. Define

$$ p_1 = \Pr [s_i h_{i,j}^{-1} = e_j \mid e_j \ne 0], \quad p_2 = \Pr [s_i h_{i,j}^{-1} = \alpha \ne e_j \mid e_j \ne 0], $$
$$ p_3 = \Pr [s_i h_{i,j}^{-1} = e_j \mid e_j = 0], \quad p_4 = \Pr [s_i h_{i,j}^{-1} = \alpha \ne e_j \mid e_j = 0]. $$

Then the probability that non–zero error value will be estimated correctly is

$$\begin{aligned} p_{e \rightarrow c}(t) &= \Pr [ \sigma _{j} \in \varDelta (i) \mid e_j = \alpha _i \ne 0 ] = \sum _{(b_0, \dots , b_{q-1}) \in \varDelta (i)} \frac{\gamma !}{b_0! \dots , b_{q-1}!} p_1^{b_i} p_2^{\gamma - b_i}, \end{aligned}$$
(11)

and the probability of incorrect estimate in non–erroneous position is

$$\begin{aligned} p_{c \rightarrow e}(t) = (q-1) \cdot \Pr [ \sigma _{j} \in \varDelta (i) \mid e_j = 0 ], \end{aligned}$$
(12)

where

$$ \Pr [ \sigma _{j} \in \varDelta (i) \mid e_j = 0 ] = \sum _{(b_0, \dots , b_{q-1}) \in \varDelta (i)} \frac{\gamma !}{b_0! \dots , b_{q-1}!} p_3^{b_0} p_4^{\gamma - b_0}, \quad i \ne 0. $$

Proof

From Assumption 2 it follows that the probability

$$\begin{aligned} \Pr \left[ \sigma _{j} = (b_0, \dots , b_{q-1}) \mid e_j = \alpha _i \ne 0 \right] \end{aligned}$$

can be modelled using multinomial distribution with parameters

$$ \left( \Pr [s_i h_{i,j}^{-1} = 0 \mid e_j \ne 0], \dots , \Pr [s_i h_{i,j}^{-1} = \alpha _{q-1} \mid e_j \ne 0] \right) = ( \underbrace{p_2, \dots , p_2}_{i-1}, p_1, \underbrace{p_2, \dots , p_2}_{q-i}). $$

Hence

$$ \Pr \left[ \sigma _{j} = (b_0, \dots , b_{q-1}) \mid e_j = \alpha _i \ne 0 \right] = \frac{\gamma !}{b_0! \dots , b_{q-1}!} p_1^{b_i} p_2^{\gamma - b_i}, $$

which implies (11). By similar reasoning, we can also obtain (12).

3.2 Analysis of Parallel Symbol-Flipping Decoder

In this subsection, we employ results of previous subsection to give an plausibility analysis of the one–step parallel symbol flipping decoder (Algorithm 2) and its combination with OSML decoder (Algorithm 3).

Algorithm 2:
figure b

1–iteration parallel symbol flipping decoder

Note that, after each iteration some error positions can be estimated correctly and some non–erroneous positions can be estimated to be erroneous incorrectly. In the following proposition, we provide an analysis of the probability that 1-iteration version of this decoder transforms a random error e of weight t into some new error \(e'\) of weight \(t'\).

Proposition 3

Let e be a random error of weight t, then after execution Algorithm 2

  1. 1.

    the probability to correctly estimate u error positions from e is

    $$ P_{correct}(t, u) = \left( {\begin{array}{c}t\\ u\end{array}}\right) \left( p_{e \rightarrow c}(t) \right) ^u \left( 1 - p_{e \rightarrow c}(t)\right) ^{t-u}, $$
  2. 2.

    the probability to corrupt v non–erroneous positions is

    $$ P_{corrupt}(t, v) = \left( {\begin{array}{c}n-t\\ v\end{array}}\right) \left( p_{c \rightarrow e}(t) \right) ^v \left( 1 - p_{c \rightarrow e} \right) ^{n-t-v}, $$
  3. 3.

    the probability to transform e into an error \(e'\) of weight \(t'\) is

    $$ {\text {Pr}}(t \rightarrow t') = \sum _{t-u+v = t'} P_{correct}(t, u) P_{corrupt}(t,v). $$

Proof

Assumption 2 implies that the flip decisions are statistically independent and depend solely on the error weight. It follows that \(P_{correct}(t, u)\) and \(P_{corrupt}(t, v)\) can be modeled as samples from binomial distributions with parameters \(p_{e \rightarrow c}(t)\) and \(p_{c \rightarrow e}(t)\) described in Theorem 2, respectively. The last claim trivially follows from the first two.

Corollary 3

The decoding failure rate of 1-iteration parallel symbol-flipping decoder can be estimated as follows

$$\begin{aligned} DFR_1 = 1 - {\text {Pr}}(t \rightarrow 0). \end{aligned}$$

Note that the new error \(e'\) is not random anymore and, therefore, the same analysis for further iteration is not possible. However, it is possible to decode \(e'\) using OSML decoder, which rely on no probabilistic assumptions.

Algorithm 3:
figure c

PSF+OSML

Thus, we obtain the following corollary:

Corollary 4

Let e be a random error of weight t, let \(\tau \) be the number of errors which can be corrected with certainty using OSML decoder. Then DFR of this combination is upper bounded by

$$\begin{aligned} DFR_2 = 1 - \sum _{t'=0}^{\tau } {\text {Pr}}(t \rightarrow t'). \end{aligned}$$

In Figs. 2, 3, 4, we present the results of numerical simulations and compare them with the obtained theoretical estimates. Each experiment involved generating a random key and decoding a random error. For each error weight, the experiments were conducted until 100 decoding failures were detected or until \(10^8\) experiments were performed, whichever occurred first.

We observe that the theoretical estimates of \(DFR_1\) and \(DFR_2\) closely match the simulation results, substantiating the accuracy of the obtained theoretical model.

Fig. 2.
figure 2

Simulation results of \(DFR_1\) for random QC-MDPC \([n = 2 \cdot 2339, k=2339]_4\)–codes over \(\mathbb {F}_4\) (\(l = 2\), \(p=2339\), \(\gamma =37\)), with decoding thresholds \((th_0, th_E, th_D) = (18, 4, 4)\)

Fig. 3.
figure 3

Simulation results of \(DFR_2\) for random QC-MDPC \([n = 2 \cdot 2339, k=2339]_4\)–codes over \(\mathbb {F}_4\) (\(l = 2\), \(p=2339\), \(\gamma =37\), \((th_0, th_E, th_D) = (18, 4, 4)\), and \(\tau = 4\)). For each experiment, we generated a random code and then checked if its OSML bound (see Corollary 2) is \(\ge \tau \). If a code had a lower bound, it was rejected. We chose \(\tau = 4\) to reject no more than \(50\%\) of keys (the actual rejection rate was \(3\%\)).

Fig. 4.
figure 4

Simulation results of \(DFR_1\) and \(DFR_2\) for random QC–MDPC \([n = 2 \cdot 1583, k = 1583]_8\)–codes over \(\mathbb {F}_8\) (\(l = 2\), \(p=1583\), \(\gamma =37\), \((th_0, th_E, th_D) = (18, 4, 4)\), and \(\tau = 4\))

4 Choice of Cryptosystem Parameters

The choice of parameters of QC–MDPC cryptosystems is determined by the complexity of potential attacks on such cryptosystems. Specifically, the parameters of the cryptosystem should be chosen in such a way that the best key-recovery attacks and message-recovery attacks require a sufficiently large number of operations.

The most effective message–recovery attacks are a family of information set decoding (ISD) algorithms, designed for decoding random codes. This family includes the Prange algorithm, the Lee-Brickell algorithm, Stern algorithm, BJMM, ball–collision, etc. An overview of ISD–algorithms can be found in [49]. The average complexity of these algorithms can be directly estimated using a formula that depends on parameters such as the field size q, code length n, code dimension k, and the weight of the error w that needs to be found. For non–binary code direct complexity estimates for the Lee-Brickell and Stern algorithms can be found in [49], for BJMM in [25], and for ball-collision in [27].

It should be noted that for quasi-cyclic codes of order \(n'\), it has been shown [41] that the complexity of ISD attacks can be reduced by a factor of \(\sqrt{n'}\) compared to codes without any structure. One of the features of QC-MDPC cryptosystems is that for key-recovery attacks, which involve finding low-weight dual codewords, the best attacks are also based on ISD. This is because the same algorithms can easily be adapted to search for codewords of a given weight instead of finding an error of a given weight. For quasi-cyclic codes, in this case, it is also possible to reduce the complexity by a factor of \(n'\) compared to random codes.

Furthermore, we must consider the decoding failure rate since in [26], Q. Guo et al. proposed a reaction attack that allows the recovery of secret keys in cryptosystems based on binary QC-MDPC codes by exploiting decoding failures. The original description assumes that \(l = 2\), i.e., \(n = 2n'\), but it can be easily generalized to other cases. This attack is based on the observation that certain error patterns are more easily decodable than other ones. Namely, let \(\mathcal {E}_r\) be the set of error patterns of the following form:

$$\begin{aligned} \mathcal {E}_r = \{ (e, \textbf{0}) \in \mathbb {F}_2^{2p} \mid \; &e \in \mathbb {F}_2^p, \; \exists \text { distinct }s_{1}, s_2, \dots , s_t,\text { s.t. }e_{s_i} = 1\text { and} \\ &s_{2i} = (s_{2i-1} + r) \; {\text {mod}} \; n'\text { for }i \in \llbracket 1, t/2 \rrbracket \} \end{aligned}$$

Let \(\mathbf {h_1} \in \mathbb {F}_q^{n'}\) denote the first row of \(H_1\) (see (2)). Let \(\psi (r)\) denote the number of pairs of non-zero positions of \(\textbf{h}_1\) placed at distance d. The distance between i and j is computed as \(\min \left\{ (i-j) \, {\text {mod}} \, n', (j-i) \, {\text {mod}} \, n'\right\} \). The set of values \(\psi (i)\), \(i \in \llbracket 1, \lfloor n'/2 \rfloor \rrbracket \), is called the distance spectrum of \(\mathbf {h_1} \in \mathcal {R}_n\). In [26], it was shown that there is a correlation between the decoding failure rate on errors from \(\mathcal {E}_r\) and the value of \(\psi (r)\). Specifically, the larger \(\psi (r)\) is, the lower the DFR for errors from \(\mathcal {E}_r\).

Therefore, computing the DFR on errors from \(\mathcal {E}_r\) for different r allows for the recovery of the distance spectrum of \(\mathbf {h_1}\) and subsequently \(\mathbf {h_1}\) itself. Consequently, it becomes possible to reconstruct the secret key of binary QC-MDPC cryptosystems by exploiting the decoding failures. Below, we demonstrate how this attack can be applied to the non-binary case as well.

Table 1. Dependency between simulated DFR for random errors \(e \in \tilde{\mathcal {E}_r}\) and the values \(\psi (r)\). The results are averaged over 100 random QC–MDPC [4678, 2339]–codes.

In our experiments, we observed a correlation between the DFR for errors from \(\tilde{\mathcal {E}r}\) and the values of \(\psi (r)\), where the set \(\tilde{\mathcal {E}_r}\) is defined as follows:

$$\begin{aligned} \tilde{\mathcal {E}_r} = \{ (e, \textbf{0}) \in \mathbb {F}_q^{2p} \mid \; &e \in \mathbb {F}_2^p, \; \exists \text { distinct }s_{1}, s_2, \dots , s_t,\text { s.t. }e_{s_i} \ne 0\text { and} \\ &s_{2i} = (s_{2i-1} + r) \; {\text {mod}} \; n'\text { for }i \in \llbracket 1, t/2 \rrbracket \} \end{aligned}$$

For instance, we conducted simulations to decode errors of weight \(t=84\) from \(\tilde{\mathcal {E}_r}\) using Algorithm 4 for random QC-MDPC codes over \(\mathbb {F}_4\) with parameters \(n' = 2339\), \(l=2\), and \(\gamma = 37\), which ensure a minimal cost of ISD-based key-recovery and message-recovery attacks of \(2^{80}\) bit operations [9]. The results obtained from these simulations are presented in Table 1. As shown in the table, a strong dependency between the distance spectrum and the DFR for errors of this specific form can still be observed.

Algorithm 4:
figure d

Sorted Parallel Symbol Flipping

Thus, it is possible to reconstruct the support of the secret vector \(\mathbf {h_1}\) (up to a cyclic shift) using the following steps:

  1. 1.

    for each \(r \in \llbracket 1, \lfloor n'/2 \rfloor \rrbracket \) numerically estimate DFR for random errors from \(\tilde{\mathcal {E}_r}\), and then use the obtained results to recover the distance spectrum \(\psi \) of \(\mathbf {h_1}\);

  2. 2.

    recover \({\text {supp}}(h_1)\) using the procedure described in [26] for finding positions of ones in \(\mathbf {h_1}\) for the binary case

Once \({\text {supp}}(\mathbf {h_1})\) is recovered, it is possible to recover the whole secret key \((\mathbf {h_1}, \mathbf {h_2})\) in the non-binary case as follows. Let I be an information set such that

$$\begin{aligned} |I \cap {\text {supp}}(\mathbf {h_1} \mid \mathbf {h_2})| = 1, \end{aligned}$$

then the matrix \(\tilde{H}_{:, I}^{-1} \tilde{H} = H_{:, I}^{-1} H\) contains the row \((\mathbf {h_1}, \mathbf {h_2})\) or its quasi–circular shift. When \({\text {supp}}(\mathbf {h_1})\) is known, I can be constructed of one element from \({\text {supp}}(\mathbf {h_1})\), \(n'-\gamma \) elements from \(\llbracket 1, n' \rrbracket \setminus {\text {supp}}(\mathbf {h_1})\), and randomly guessed \(\gamma -1\) elements from \(\llbracket n'+1, 2n' \rrbracket \). Therefore, the probability of finding a suitable I can be estimated as follows:

$$ \left( {\begin{array}{c}n' - \gamma \\ \gamma -1\end{array}}\right) \cdot \left( {\begin{array}{c}n'\\ \gamma -1\end{array}}\right) ^{-1}. $$

So, the method described above in our experiments allowed reconstruction of secret key with significantly lower complexity than claimed security level of \(2^{80}\) bit operations.

It follows that, when choosing the parameters of QC–MDPC cryptosystem that can be converted into IND–CCA2 secure KEM in non–binary case the design criteria are the complexity of ISD–based key–recovery, and message–recovery attacks and small enough decoding failure rate making reaction attacks infeasible. Table 2 provides potential parameters of QC-MDPC cryptosystems over \(\mathbb {F}_4\), with \(l = 2\) and \(n' = p\) being a prime such that the polynomial \(x^p-1\) has a low number of irreducible factors. These parameters are given for three different security levels: \(\lambda \in \{128, 192, 256\}\), which correspond to the complexity of breaking AES with the corresponding key sizes. All the proposed instances are designed to have \(DFR_2 \le 2^{-\lambda }\) (see Corollary 4). Note that the resulting public key sizes (\(pk_{size}\)) are slightly larger than in the binary case (28, 277, 52, 667, 83, 579 respectively [5]). Moreover, increasing the field size to \(q = 8\) with security level \(\lambda =128\) yields an estimated public key size of 36, 321 bits (\(p=12,107\), \(\gamma =69\), \(t=130\)). Thus, for a fixed security level, public key size grows with increasing field size. Indeed, to maintain the same or smaller \(pk_\text {size}\) when increasing q, one must consider shorter MDPC codes. However, due to the complexity of ISD-based key–recovery and message–recovery attacks, \(\gamma \) and t are nearly the same across various ranges of q, implying higher-density codes. Therefore, the increased field size does not appear to compensate for the negative impact of increased code density.

Table 2. Cryptosystem parameters

5 Conclusion

In this paper, we have studied the guaranteed error-correction capability of the one-step majority logic (OSML) decoder and provided a plausibility analysis of the 1-iteration parallel symbol flipping decoder for non-binary QC-MDPC codes. Through this analysis, we were able to estimate the decoding failure rate (DFR) of the combined use of these decoders, where parallel symbol flipping is employed to reduce the error weight to a level at which the OSML decoder can successfully correct any remaining errors. Consequently, we have obtained worst-case estimates of the DFR, considering some minimalistic and reasonable assumptions. The accuracy and validity of our theoretical model have been verified through numerical simulations.

Furthermore, we have demonstrated the importance of considering key-recovery reaction attacks when designing non-binary QC-MDPC cryptosystems. This implies that such cryptosystems need to be constructed with an extremely low DFR in order to achieve IND-CCA2 security with long-term keys. Finally, we have provided possible parameters for different NIST security levels of non-binary QC-MDPC cryptosystems, along with their theoretically estimated DFR.

It should be noted that the resulting key sizes are slightly larger than those in the binary case. Therefore, it appears that using non-binary QC-MDPC codes does not offer any benefits in terms of reducing the public-key sizes of IND-CCA2-secure cryptosystems considering the reaction attack. However, there is a possibility that replacing the quasi-cyclic structure with a more general (non-abelian) quasi-group structure, specifically replacing circulant matrices with matrices of multiplication operators in group algebras, could potentially hinder the reaction attack.

Additionally, by abandoning the requirement of key re-usage, it becomes possible to consider more sophisticated decoders for cryptosystems resistant against chosen plaintext attacks (CPA–secure). The study of such decoders can only be carried out through experimental methods and may provide benefits in terms of reducing key sizes, as previously explored in [9].

It is worth mentioning that the obtained in this paper theoretical models could potentially be useful for providing conservative estimates of the DFR of non-binary codes in telecommunications applications.