
1 Introduction

1.1 Background

Foreseeable advances in quantum computers make Shor’s quantum algorithm a threat to the widely used public-key cryptosystems based on the integer factoring and discrete logarithm problems [43]. As a consequence, NIST has in recent years run a post-quantum cryptography standardization project to solicit, evaluate, and standardize one or more quantum-resistant public-key cryptographic algorithms [38]. This initiative has stimulated the cryptographic research community to construct practical cryptographic systems that are secure against both quantum and classical computers and that can interoperate with existing communication protocols and networks. Code-based cryptosystems are commonly believed to resist quantum attacks, and so they remain an active research topic even though NIST has closed its call for proposals.

The first code-based cryptosystem was proposed by McEliece in 1978 by hiding a generator matrix of a Goppa code [33]. An equivalent Niederreiter-type code-based scheme is constructed by scrambling a parity-check matrix of a Goppa code [35]. Both are still secure under appropriate parameters. However, the public keys of these Goppa-code-based schemes are very large. To reduce the public-key size, LDPC (Low Density Parity Check) codes, convolutional codes, Gabidulin codes, Reed-Muller codes, and generalized Reed-Solomon codes were used to replace Goppa codes in the above frameworks; however, all of these variants were shown to be insecure [7, 27, 36, 45, 46].

There are significant analogies between lattices and coding theory; the difference lies mainly in the metric used (the Euclidean metric for lattices, the Hamming or rank metric for codes). Recently, inspired by lattice techniques such as ideal rings and ring-LWE [2, 32, 39, 40], diverse code-based public-key schemes such as RQC, HQC, BIKE, LOCKER, and Ouroboros-R were proposed using specific quasi-cyclic codes, so that the public-key size is significantly reduced [1, 4, 5, 17]. Those quasi-cyclic codes, which we call one-dimensional module codes here, are also used in many other code-based cryptosystems to obtain compact keys [8, 9, 34]. However, the added quasi-cyclic structure may be exploited to mount an algebraic attack and therefore gives less confidence in the underlying security [18, 41, 42].

Among lattice-based public-key cryptosystems, Kyber, which employs module lattices, was proposed to thwart attacks that exploit the algebraic structure of cyclotomic ideal lattices [11,12,13,14,15]. However, no similar schemes exist among code-based cryptosystems.

In this paper, motivated by Kyber and its use of module lattices, we use the concept of module codes to redefine quasi-cyclic codes and propose an alternative assumption, namely that the rank module syndrome decoding (RMSD for short) problem is difficult, so that our schemes are distinguished from the so-called quasi-cyclic-code-based cryptosystems. It is worth mentioning that a handful of cryptosystems using rank codes, such as RQC, Ouroboros-R, and a variant of GPT [31], exist in the literature owing to the nice properties of the rank metric. Based on the hardness of the RMSD problem, we construct a suite of code-based public-key schemes, Piglet, which includes a new IND-CPA-secure public-key encryption scheme Piglet-1.CPAPKE and an IND-CCA-secure key encapsulation mechanism (KEM for short) Piglet-1.CCAKEM, obtained by applying the KEM variant of the Fujisaki-Okamoto transform to Piglet-1.CPAPKE. We also put a new IND-CPA-secure KEM, Piglet-2.CPAKEM, into this suite. We then compare the parameters of our schemes with those of some code-based NIST submissions. The results suggest that our schemes are good long-term-secure candidates for post-quantum cryptography.

1.2 Our Contribution and Techniques

Our main contribution is a semantically secure public-key encryption scheme, Piglet-1.CPAPKE, and a new IND-CPA-secure KEM, Piglet-2.CPAKEM, both based on the hardness of the rank module syndrome decoding problem. We believe that our schemes would be good candidates for post-quantum public-key cryptosystems with long-term security. Their advantages are as follows:

Security. The security of our schemes rests on the hardness of the RMSD problem with two dimensions, while current code-based schemes are built upon the rank quasi-cyclic syndrome decoding (RQCSD) problem, which is the RMSD problem with one dimension. In [42], the authors used the quasi-cyclic algebraic structure to propose a generic decoding attack. That work indicates that a higher dimension of the module code can diminish the impact of such attacks. Furthermore, it cannot be excluded that some fatal attack exploiting the quasi-cyclic structure embedded in the code might be proposed in the future. Therefore, we use module codes with two dimensions to construct new schemes, which would be good candidates for post-quantum public-key cryptosystems with long-term security.

More Plaintext Bits. In Kyber, the size of the plaintext is fixed to 256 bits, whereas in Piglet-1 it depends on the extension degree of the finite field and the dimension of the auxiliary code. The plaintext sizes in Piglet-1 at the 128-, 192-, and 256-bit security levels are 267, 447, and 447 bits, respectively.

Efficiency. Although the operations in our schemes are carried out in large finite fields, the schemes are still efficient in practice.

Decoding Failure. There is no decoding failure in Piglet-1.CPAPKE and Piglet-1.CCAKEM since we use the decoding algorithm for Gabidulin codes. As to Piglet-2.CPAKEM, the decoding failure rate is extremely low and tolerable.

1.3 Road Map

The rest of the paper is organized as follows. Section 2 introduces the basic concepts and results needed in our paper. In Sect. 3, we describe the difficult problem on which the security of our schemes is based. In Sect. 4, we propose Piglet-1.CPAPKE and give its security proof. We then apply the Fujisaki-Okamoto transform to Piglet-1.CPAPKE to construct Piglet-1.CCAKEM with CCA security. Next, we give three parameter sets achieving 128, 192 and 256 bits of security, and compare the parameters of our schemes with those of some NIST candidates. In Sect. 5, we present Piglet-2.CPAKEM, whose session key is the hash value of the error vectors, without encrypting a plaintext. In Sect. 6, we analyze the existing attacks on our schemes. Finally, Sect. 7 is devoted to our conclusions.

2 Preliminaries

2.1 Results on Rank Codes

We represent vectors by lower-case bold letters and matrices by upper-case letters, and all vectors will be assumed to be row vectors. Let \(\mathbb {F}^{n}_{q^m}\) be an n-dimensional vector space over a finite field \(\mathbb {F}_{q^m}\) where q is a prime power, and n, m are positive integers.

Let \(\mathbf {\beta }=\{\beta _1,\ldots ,\beta _{m}\}\) be a basis of \(\mathbb {F}_{q^m}\) over \(\mathbb {F}_q\). Let \(\mathcal {F}_i\) be the map from \(\mathbb {F}_{q^m}\) to \(\mathbb {F}_q\) where \(\mathcal {F}_i(u)\) is the i-th coordinate of an element \(u\in \mathbb {F}_{q^m}\) in the basis representation with \(\mathbf {\beta }\). To any \(\mathbf {u}=(u_1,\ldots , u_n)\) in \(\mathbb {F}^{n}_{q^m}\), we associate the \(m\times n\) matrix \((\mathcal {F}_i(u_j))_{1\le i\le m,1\le j\le n}\) over \(\mathbb {F}_{q}\). The rank weight of a vector \(\mathbf {u}\) can be defined as the rank of its associated matrix, denoted by \(w_{R}(\mathbf {u})\). We refer to [29] for more details on rank codes.

For integers \(1\le k\le n\), an [n, k] linear rank code C over \(\mathbb {F}_{q^m}\) is a subspace of dimension k of \(\mathbb {F}^{n}_{q^m}\) endowed with the rank metric. The minimum rank distance of the code C, denoted by \(d_{R}(C)\), is the minimum rank weight of the non-zero codewords in C. A \(k\times n\) matrix is called a generator matrix of C if its rows span the code. The dual code of C is the orthogonal complement of the subspace C of \(\mathbb {F}^{n}_{q^m}\), denoted by \(C^{\perp }\). A parity-check matrix H for a linear code C is a generator matrix for \(C^{\perp }\).

For any vector \(\mathbf {x}=(x_1, \ldots , x_n)\) in \( \mathbb {F}_{q^m}^{n}\), the support of \(\mathbf {x}\), denoted by Supp(\(\mathbf {x}\)), is the \(\mathbb {F}_q\)-linear subspace of \(\mathbb {F}_{q^m}\) spanned by the coordinates of \(\mathbf {x}\), that is, Supp(\(\mathbf {x})={<}x_1, \ldots , x_n{>}_{\mathbb {F}_q}\). So we have \(w_{R}(\mathbf {x})=\dim (\text {Supp}(\mathbf {x}))\).
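The identity \(w_{R}(\mathbf {x})=\dim (\text {Supp}(\mathbf {x}))\) can be illustrated with a minimal sketch for \(q=2\). It assumes a storage convention that the text does not fix: each element of \(\mathbb {F}_{2^m}\) is a Python integer whose bit i holds the \((i+1)\)-th coordinate in the basis \(\mathbf {\beta }\), so the integers play the role of the columns of the associated binary matrix. The function name `rank_weight` is ours.

```python
def rank_weight(vec):
    """Rank weight of a vector over F_{2^m}, elements given as integers.

    Assumption: bit i of each integer is the (i+1)-th coordinate of the
    element in a fixed F_2-basis, so the integers are the columns of the
    associated m x n binary matrix.
    """
    basis = []  # F_2-independent elements found so far
    for u in vec:
        for b in basis:
            # u ^ b is smaller than u exactly when u contains b's leading bit,
            # so this step clears that bit from u
            u = min(u, u ^ b)
        if u:  # u lies outside the span of the current basis
            basis.append(u)
    return len(basis)


# The third coordinate is the sum of the first two, the fourth is zero.
x = [0b0001, 0b0010, 0b0011, 0b0000]
print(rank_weight(x))
```

The same routine applies to a vector in \(\mathcal {R}^n\) once all of its polynomial coefficients are collected into one list.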

Let r be a positive integer and let \(\mathbf {v}=(v_1, \ldots , v_r)\in \mathbb {F}^{r}_{q^m}\) be a vector. The circulant matrix \(\text {rot}(\mathbf {v})\) induced by \(\mathbf {v}\) is defined as follows:

$$\text {rot}(\mathbf {v})=\left( \begin{array}{cccc} v_1 &{} v_{r} &{} \ldots &{} v_{2} \\ v_{2} &{} v_{1} &{} \ldots &{} v_{3} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ v_{r} &{} v_{r-1} &{} \ldots &{} v_{1} \\ \end{array}\right) \in \mathbb {F}^{r\times r}_{q^m}, $$

where \(\mathbb {F}_{q^m}^{r\times r}\) denotes the set of all matrices of size \(r\times r\) over \(\mathbb {F}_{q^m}\).

For any two vectors \(\mathbf {u}, \mathbf {v}\in \mathbb {F}_{q^m}^{r}\), the product \(\mathbf {u}\cdot \mathbf {v}\) can be expressed as a vector-matrix product as follows.

$$\begin{aligned} \mathbf {u}\cdot \mathbf {v}=\mathbf {u}\times \text {rot}(\mathbf {v})^T=(\text {rot}(\mathbf {u})\times \mathbf {v}^T)^T=\mathbf {v}\times \text {rot}(\mathbf {u})^T=\mathbf {v}\cdot \mathbf {u}. \end{aligned}$$

Let \(\mathcal {R}=\mathbb {F}_{q^m}[x]/(x^r-1)\). Then \(\mathbb {F}^{r}_{q^m}\), equipped with the product above, is an \(\mathbb {F}_{q^m}\)-algebra isomorphic to \(\mathcal {R}\) via the map \((v_1, v_2,\ldots , v_r) \mapsto \sum _{i=1}^{r} v_i x^{i-1} \).
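The relation between \(\text {rot}(\cdot )\) and multiplication in \(\mathcal {R}\) can be checked numerically. The sketch below is only an illustration under simplifying assumptions: it works over the toy prime field \(\mathbb {F}_7\) (that is, \(q=7\), \(m=1\)) to avoid extension-field arithmetic, and it identifies \((v_1,\ldots ,v_r)\) with \(v_1+v_2x+\cdots +v_rx^{r-1}\), which matches the first column of \(\text {rot}(\mathbf {v})\). All function names are ours.

```python
P = 7  # toy field F_7 (q = 7, m = 1), chosen only to keep arithmetic simple


def rot(v):
    """Circulant matrix whose first column is v (the layout of rot(v) in the text)."""
    r = len(v)
    return [[v[(i - j) % r] for j in range(r)] for i in range(r)]


def vec_times_rot_t(u, v):
    """Row vector u x rot(v)^T over F_P."""
    mat = rot(v)
    r = len(u)
    # (u x rot(v)^T)_i = sum_j u_j * rot(v)[i][j]
    return [sum(u[j] * mat[i][j] for j in range(r)) % P for i in range(r)]


def poly_mul_mod(u, v):
    """Product of u(x) and v(x) modulo x^r - 1, coefficients listed from degree 0."""
    r = len(u)
    out = [0] * r
    for i in range(r):
        for j in range(r):
            out[(i + j) % r] = (out[(i + j) % r] + u[i] * v[j]) % P
    return out


u, v = [1, 2, 3], [4, 5, 6]
print(vec_times_rot_t(u, v) == poly_mul_mod(u, v))      # matrix form agrees with the ring product
print(vec_times_rot_t(u, v) == vec_times_rot_t(v, u))   # the product is commutative
print(poly_mul_mod([0, 1, 0], v))                        # multiplying by x cyclically shifts v
```

The last line shows why submodules of \(\mathcal {R}^n\) are closed under simultaneous cyclic shifts of their blocks: a shift is simply multiplication by x.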

Definition 1

An [n, k]-linear block code \(\mathcal {C}\subseteq \mathbb {F}_{q^m}^n\) is quasi-cyclic with index s, where s|n, if for any \(\mathbf {c}=(\mathbf {c}_1, \ldots , \mathbf {c}_{s})\in \mathcal {C}\), the vector obtained by applying a simultaneous circulant shift to every block \(\mathbf {c}_1,\ldots , \mathbf {c}_s\) is also a codeword.

When \(n=s r\), it is convenient to have parity-check matrices composed by \(r\times r\) circulant blocks. In this paper, we use another viewpoint to describe quasi-cyclic codes so that it is clear to distinguish the quasi-cyclic codes used in our schemes from the many other quasi-cyclic-code-based cryptosystems.

Definition 2

An [n, k]-linear block code \(\mathcal {C}\) over \(\mathcal {R}\) is called an \(\mathcal {R}\)-module code if \(\mathcal {C}\) is a k-dimensional \(\mathcal {R}\)-submodule of \(\mathcal {R}^{n}\).

Remark 1

  1. The module code C over \(\mathcal {R}\) is also quasi-cyclic over \(\mathbb {F}_{q^m}\) since \((xc_1, \cdots , x c_{n})\) is also a codeword of C for any \((c_1,\cdots , c_{n})\in C\).

  2. The quasi-cyclic codes over \(\mathbb {F}_{q^m}\) used in RQC, HQC, Ouroboros-R, BIKE, etc., are module codes over \(\mathcal {R}\) with dimension \(k=1\).

  3. A module code reduces to a general linear cyclic code if \(n=1\).

  4. A module code is a general linear code if \(r=1\).

Definition 3

A systematic [n, k] module code over \(\mathcal {R}\) has a parity-check matrix of the form \(H=(I|A)\), where A is an \((n-k)\times k\) matrix over \(\mathcal {R}\).

For example, in our schemes we use a systematic [4, 2] module code over \(\mathcal {R}\), and A has the form \(\left( \begin{array}{cc} a_{1,1} &{} a_{1,2}\\ a_{2,1} &{} a_{2,2} \end{array}\right) \), where \(a_{ij}\in \mathcal {R}\), \(i=1, 2, j=1, 2\), so that each \(a_{ij}\) can also be seen as a circulant matrix over \(\mathbb {F}_{q^m}\). In fact, the systematic quasi-cyclic codes used in RQC, HQC, Ouroboros-R, and BIKE are [2, 1] module codes over \(\mathcal {R}\) and have the form \(A=(a)\), where \(a \in \mathcal {R}\).

Next, we generalize the rank weight of a vector in \(\mathbb {F}^{n}_{q^m}\) to \(\mathcal {R}^n\).

Definition 4

Let \(\mathbf {v}=(v_1, \ldots , v_{n})\in \mathcal {R}^n\), where \(v_i=\sum _{j=0}^{r-1}a_{ij}x^{j}\) for \(1\le i\le n\). The support of \(\mathbf {v}\) is defined by Supp(\(\mathbf {v})=\langle a_{1,0}, \ldots , a_{1,r-1}, \ldots , a_{n,0}, \ldots , a_{n, r-1}\rangle _{\mathbb {F}_{q}}\). The rank weight of \(\mathbf {v}\) is defined to be the dimension of the support of \(\mathbf {v}\), also denoted by \(w_{R}(\mathbf {v})\).

2.2 Gabidulin Codes and Their Decoding Technique

Gabidulin codes were introduced by Gabidulin in [20] and independently by Delsarte in [16]. They exploit linearized polynomials, which were introduced in [37], instead of regular ones.

A q-linearized polynomial over \(\mathbb {F}_{q^m}\) is defined to be a polynomial of the form

$$\begin{aligned} L(x)=\sum _{i=0}^{d}a_i x^{q^i}, a_i\in \mathbb {F}_{q^m}, a_d\ne 0 \end{aligned}$$

where d is called the q-degree of L(x), denoted by \(\deg _q(L(x))\). Denote the set of all q-linearized polynomials over \(\mathbb {F}_{q^m}\) by \(\mathcal {L}_q(x, \mathbb {F}_{q^m})\).

Let \(g_1,\ldots , g_n\in \mathbb {F}_{q^m}\) be linearly independent over \(\mathbb {F}_q\). The Gabidulin code \(\mathcal {G}\) is defined by

$$\begin{aligned} \mathcal {G} =\{(L(g_1),\ldots , L(g_n))\in \mathbb {F}^{n}_{q^m}|\, L(x)\in \mathcal {L}_q(x, \mathbb {F}_{q^m}) \text { and } \deg _q(L(x))<k\} . \end{aligned}$$

The Gabidulin code \(\mathcal {G}\) with length n has dimension k over \(\mathbb {F}_{q^m}\) and the generator matrix of \(\mathcal {G}\) is

$$\begin{aligned} G=\left( \begin{array}{ccc} g_1 &{} \ldots &{} g_n\\ g^{q}_1 &{} \ldots &{} g^{q}_n\\ \vdots &{} \ddots &{} \vdots \\ g^{q^{k-1}}_1 &{} \ldots &{} g^{q^{k-1}}_{n} \end{array}\right) . \end{aligned} \tag{1}$$

The minimum rank distance of the Gabidulin code \(\mathcal {G}\) is \(n-k+1\), and so up to \(\lfloor \frac{n-k}{2}\rfloor \) rank errors can be decoded efficiently [20]. The decoding algorithm employed in our scheme was proposed in [44]; it generalizes the Berlekamp-Massey algorithm and has computational complexity \(O(n^2)\).
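The generator matrix (1) and the distance \(n-k+1\) can be explored on a toy instance. The following sketch is an illustration only: it assumes \(q=2\), the small field \(\mathbb {F}_{2^4}=\mathbb {F}_2[a]/(a^4+a+1)\) with elements stored as 4-bit integers, and the evaluation points \(1, a, a^2, a^3\). None of these choices come from the scheme's parameter sets, and the helper names are ours. It builds G by repeated Frobenius (squaring) and finds the minimum rank weight by exhaustive search.

```python
from itertools import product

M, POLY = 4, 0b10011  # toy field F_{2^4} = F_2[a]/(a^4 + a + 1)


def gf_mul(a, b):
    """Product in F_{2^M}; bit i of an int is the coefficient of a^i."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:  # reduce as soon as the degree reaches M
            a ^= POLY
    return res


def rank_weight(vec):
    """F_2-dimension of the span of the coordinates of vec."""
    basis = []
    for u in vec:
        for b in basis:
            u = min(u, u ^ b)
        if u:
            basis.append(u)
    return len(basis)


def gabidulin_generator(g, k):
    """Rows g, g^q, ..., g^(q^(k-1)) for q = 2, where Frobenius is squaring."""
    rows, row = [], list(g)
    for _ in range(k):
        rows.append(row)
        row = [gf_mul(x, x) for x in row]
    return rows


def encode(msg, gen):
    """Codeword msg * gen over F_{2^M}."""
    word = [0] * len(gen[0])
    for coeff, row in zip(msg, gen):
        for j, entry in enumerate(row):
            word[j] ^= gf_mul(coeff, entry)
    return word


g = [1, 2, 4, 8]  # 1, a, a^2, a^3 are linearly independent over F_2
G = gabidulin_generator(g, k=2)
min_dist = min(
    rank_weight(encode(msg, G))
    for msg in product(range(1 << M), repeat=2)
    if any(msg)
)
print(G, min_dist)
```

For this [4, 2] toy code the theory predicts a minimum rank distance of \(n-k+1=3\), so one rank error is correctable.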

2.3 Low Rank Parity Check Codes and Their Decoding Algorithm

The Low Rank Parity Check (LRPC) codes have been introduced in [24]. LRPC codes are widely used in code-based cryptosystems because they have a weak algebraic structure and efficient decoding algorithms.

An LRPC code of rank d, length n and dimension k is an [n, k]-linear block code over \(\mathbb {F}_{q^m}\) that has a parity-check matrix \(H=(h_{ij})_{1\le i\le n-k, 1\le j\le n}\) such that the dimension of the subspace spanned by all \(h_{ij}\) is d.

The rank syndrome decoding problem for an LRPC code is the following: given a parity-check matrix \(H\in \mathbb {F}^{(n-k)\times n}_{q^m}\) of an LRPC code of rank d and a syndrome \(\mathbf {s}\in \mathbb {F}^{n-k}_{q^m}\), find a vector \(\mathbf {x}\in \mathbb {F}^{n}_{q^m}\) with \(w_{R}(\mathbf {x})\le r\) such that \(H\mathbf {x}^T=\mathbf {s}^T\).

In fact, Piglet-2.CPAKEM only needs to recover the subspace E spanned by the coordinates of \(\mathbf {x}\) rather than \(\mathbf {x}\) itself; this is called the rank support recovery problem. The rank support recovery algorithm was provided in [17]; it combines the general decoding algorithm of LRPC codes in [21] with a tweak of the improved algorithm in [3]. The rank support recovery algorithm (RS-Recover for short) is given in detail below.

In the following algorithm, S and E are the vector spaces generated by the coordinates of the syndrome \(\mathbf {s}=(s_1, \cdots , s_{n-k})\) and of the vector \(\mathbf {x}\), respectively. Let \(F_1,\ldots ,F_d\) be a basis of the subspace spanned by the entries of H. Then \(S_i\) is defined by \(S_i=F^{-1}_{i}\cdot S=\langle F^{-1}_{i}s_1, F^{-1}_{i}s_2, \cdots , F^{-1}_{i}s_{n-k}\rangle \), and \(S_{ij}=S_i\cap S_j\).

[Figure a: the RS-Recover algorithm, as given in [17].]

The above algorithm may fail in some cases; its decoding failure probability is given in Ouroboros-R [17].

Proposition 1

The probability of failure of the above algorithm is \(\max (q^{(2-r)(d-2)} \times q^{-(n-k-rd+1)}, q^{-2(n-k-rd+2)})\), where r is the rank weight of the error vector.
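For parameter exploration it can be handy to evaluate the bound of Proposition 1 on a logarithmic scale. The helper below is a minimal sketch; the function name and the sample inputs are hypothetical and are not taken from any parameter set of the paper. Here `syndrome_len` stands for \(n-k\).

```python
from math import log2


def rs_recover_failure_log2(q, syndrome_len, r, d):
    """log2 of the failure bound in Proposition 1.

    syndrome_len is n - k, r is the rank weight of the error vector and
    d is the rank of the LRPC code.
    """
    first = ((2 - r) * (d - 2) - (syndrome_len - r * d + 1)) * log2(q)
    second = -2 * (syndrome_len - r * d + 2) * log2(q)
    return max(first, second)


# Hypothetical inputs, for illustration only.
print(rs_recover_failure_log2(q=2, syndrome_len=50, r=5, d=5))
```

Working with logarithms avoids floating-point underflow when the probabilities are very small.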

3 Difficult Problems for Code-Based Cryptography

In this section, we describe some difficult problems which are used in code-based cryptography. In particular, we introduce a difficult problem, i.e., rank module syndrome decoding (RMSD for short) problem, which is the security assumption for our schemes.

Definition 5

(Rank Syndrome Decoding (RSD for short) Problem). Given a parity-check matrix \(H=(I_{n-k}|\, A_{(n-k)\times k})\in \mathbb {F}^{(n-k)\times n}_{q^m}\) of a random linear code, and \(\mathbf {y}\in \mathbb {F}^{n-k}_{q^m}\), the goal is to find \(\mathbf {x}\in \mathbb {F}_{q^m}^{n}\) with \(w_{R}(\mathbf {x})\le w\) such that \(H\mathbf {x}^T=\mathbf {y}^T\).

The RSD problem has recently been proven difficult via a probabilistic reduction to the Hamming setting [22], where the syndrome decoding problem is known to be NP-hard [10]. Most QC-code-based cryptosystems in the rank metric are built upon the following difficult problem.

Definition 6

(Rank Quasi-Cyclic Syndrome Decoding (RQCSD) Problem). Given a parity-check matrix \(H=(I_{n-1}|A_{(n-1)\times 1})\in \mathcal {R}^{(n-1)\times n} \) of a systematic random module code over \(\mathcal {R}\) and a syndrome \(\mathbf {y}\in \mathcal {R}^{n-1}\), the goal is to find a word \(\mathbf {x}\in \mathcal {R}^{n}\) with \(w_{R}(\mathbf {x})\le w\) such that \(\mathbf {y}^{T}=H\mathbf {x}^T\).

The RQCSD problem is not proven to be NP-hard; however, code-based cryptosystems built on it, such as RQC, Ouroboros-R, and LOCKER, have much shorter public keys. In the Hamming metric, one uses the quasi-cyclic syndrome decoding (QCSD for short) problem as the security assumption [8, 34]. We introduce a new difficult problem as follows:

Definition 7

(Rank Module Syndrome Decoding (RMSD) Problem). Given a parity-check matrix \(H=(I_{n-k}|\, A_{(n-k)\times k})\in \mathcal {R}^{(n-k)\times n}\) of a systematic random module code over \(\mathcal {R}\) and a syndrome \(\mathbf {y}\in \mathcal {R}^{n-k}\), the goal is to find a word \(\mathbf {x}\in \mathcal {R}^{n}\) with \(w_{R}(\mathbf {x})\le w\) such that \(\mathbf {y}^{T}=H\mathbf {x}^T\).

We denote the above problem simply by the (n, k, w, r)-RMSD problem over \(\mathcal {R}\).

Remark 2

  1. If \(k=1\), the (n, k, w, r)-RMSD problem over \(\mathcal {R}\) is the RQCSD problem, which is used in some NIST submissions such as RQC, Ouroboros-R, and LOCKER. The same holds for the Hamming metric.

  2. If \(r=1\), the (n, k, w, r)-RMSD problem over \(\mathcal {R}\) is the usual RSD problem over \(\mathbb {F}_{q^m}\).

  3. The RSD problem has been shown to be hard via a probabilistic reduction [22]; however, the RQCSD and RMSD problems have not yet been proven NP-hard. Furthermore, a smaller k implies more algebraic structure, which makes a scheme potentially susceptible to more avenues of attack. Therefore, the security of RMSD-based schemes (\(k\ge 2\) by default) is expected to lie between that of RSD-based and RQCSD-based cryptosystems.

The above problem is also called the search version of the RMSD problem. We also give the definition of the decisional rank module syndrome decoding (DRMSD) problem. Since the best known attacks on the (n, k, w, r)-DRMSD problem consist in solving the same instance of the (n, k, w, r)-RMSD problem, we make the assumption that the (n, k, w, r)-DRMSD problem is difficult.

Definition 8

Given input \((H, \mathbf {y})\in \mathcal {R}^{(n-k)\times n} \times \mathcal {R}^{n-k}\), the decisional RMSD problem asks to decide with non-negligible advantage whether \((H, \mathbf {y}^T)\) came from the RMSD distribution or the uniform distribution over \(\mathcal {R}^{(n-k)\times n} \times \mathcal {R}^{n-k}\).

The above problem is simply denoted as \((n, k, w, r)\text {-DRMSD}\) problem.

4 Piglet-1: A New Module-Code-Based Public-Key Scheme

4.1 Piglet-1.CPAPKE

In this subsection, we first present a new IND-CPA-secure public-key encryption scheme, Piglet-1.CPAPKE, in which XOF\((\cdot )\) denotes an extendable output function, and \(y\in S:=\text {XOF}(x)\) denotes that the output y of the function on input x is distributed uniformly over the set S.

In this scheme, we exploit an [r, l]-Gabidulin code \(\mathcal {G}\), since Gabidulin codes are a family of rank codes with an efficient decoding algorithm. The minimum distance is \(r-l+1\), and so one can efficiently decode up to \(\lfloor \frac{r-l}{2}\rfloor \) rank errors. The plaintext \(\mathbf {m}\) is chosen from the plaintext space \(\mathbb {F}^{l}_{q^m}\).

Piglet-1.CPAPKE.keyGen(): key generation

  1. \(\rho \xleftarrow {\$} \{0, 1\}^{256}\), \( \sigma \xleftarrow {\$} \{0, 1\}^{320}\)

  2. \(H\in \mathcal {R}^{k\times k}:= \text {XOF}(\rho )\)

  3. \((\mathbf {x}, \mathbf {y}) \in \mathcal {R}^{k}\times \mathcal {R}^{k} := \text {XOF}(\sigma )\) with \(w_{R}(\mathbf {x})=w_{R}(\mathbf {y})=w\)

  4. \(\mathbf {s} :=\mathbf {x} H +\mathbf {y}\)

  5. return \((pk:=(\rho ,\mathbf {s}), sk:=\mathbf {x})\)

Piglet-1.CPAPKE.Enc\((\rho ,\mathbf {s}, \mathbf {m}\in \mathbb {F}^{l}_{q^m})\): encryption

  1. \(\tau \xleftarrow {\$} \{0, 1\}^{320}\)

  2. \(H\in \mathcal {R}^{k\times k}:= \text {XOF}(\rho )\)

  3. \((\mathbf {r}, \mathbf {e},\mathbf {e}') \in \mathcal {R}^{k}\times \mathcal {R}^{k} \times \mathcal {R} := \text {XOF}(\tau )\) with \(w_{R}(\mathbf {r})=w_{R}(\mathbf {e})=w_{R}(\mathbf {e}')=w_e\)

  4. \(\mathbf {u}:=H \mathbf {r}^T+\mathbf {e}^T\)

  5. \(\mathbf {v}:= \mathbf {s} \mathbf {r}^T+\mathbf {e}' +\mathbf {m}G\), where G is an \(l\times r\) generator matrix over \(\mathbb {F}_{q^m}\) of a Gabidulin code \(\mathcal {G}\).

  6. return a ciphertext pair \(\mathbf {c}:=(\mathbf {u}, \mathbf {v})\)

Piglet-1.CPAPKE.Dec\((sk=\mathbf {x}, \mathbf {c}=(\mathbf {u},\mathbf {v}))\): decryption

  1. Compute \(\mathbf {v}-\mathbf {x}\mathbf {u} =\mathbf {m}G+\mathbf {y}\mathbf {r}^T+\mathbf {e}'-\mathbf {x}\mathbf {e}^T\)

  2. \(\mathbf {m}:=\mathcal {D}_{G}(\mathbf {v}-\mathbf {x}\mathbf {u})\), where \(\mathcal {D}_{G}(\cdot )\) is a decoding algorithm for the Gabidulin code \(\mathcal {G}\).

Remark 3

  1. The secret vectors \(\mathbf {x}\) and \(\mathbf {y}\) share the same support of dimension w, which contains 1. The vectors \(\mathbf {r}\), \(\mathbf {e}\) and \(\mathbf {e}'\) share the same support of dimension \(w_e\). Consequently, the rank weight of the overall error vector \(\mathbf {y}\mathbf {r}^T+\mathbf {e}'-\mathbf {x}\mathbf {e}^T\) is less than or equal to \(ww_e\).

  2. The plaintext \(\mathbf {m}\) can be obtained by the decoding algorithm of the Gabidulin code \(\mathcal {G}\) if \(w_{R}(\mathbf {y} \mathbf {r}^T+\mathbf {e}'-\mathbf {x}\mathbf {e}^T)\le ww_{e}\le \frac{r-l}{2}\).
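The algebra behind decryption can be checked on a toy instance. The sketch below is a minimal illustration and not an implementation of Piglet-1: it assumes \(q=2\), the small field \(\mathbb {F}_{2^8}\) with reduction polynomial \(x^8+x^4+x^3+x+1\), \(r=5\), \(k=2\), and \(w=w_e=2\), all chosen by us. It draws H and the codeword from Python's `random` module instead of an XOF, samples vectors of rank weight at most (rather than exactly) the target, uses a random element of \(\mathcal {R}\) as a stand-in for \(\mathbf {m}G\), and omits Gabidulin decoding. It verifies that \(\mathbf {v}-\mathbf {x}\mathbf {u}-\mathbf {m}G\) equals \(\mathbf {y}\mathbf {r}^T+\mathbf {e}'-\mathbf {x}\mathbf {e}^T\) and that its rank weight is at most \(ww_e\).

```python
import random

M, POLY = 8, 0x11B   # toy field F_{2^8}; x^8 + x^4 + x^3 + x + 1 is irreducible
R_LEN, K = 5, 2      # toy ring R = F_{2^8}[x]/(x^5 - 1) and module rank k = 2
W, WE = 2, 2         # toy rank weights w and w_e


def gf_mul(a, b):
    """Product in F_{2^M}; bit i of an int is the coefficient of alpha^i."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return res


def ring_mul(f, g):
    """Product in R, i.e. cyclic convolution of coefficient lists."""
    out = [0] * R_LEN
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[(i + j) % R_LEN] ^= gf_mul(fi, gj)
    return out


def ring_add(f, g):
    """Sum in R; in characteristic 2 this is also subtraction."""
    return [a ^ b for a, b in zip(f, g)]


def dot(xs, ys):
    """Inner product of two vectors over R."""
    acc = [0] * R_LEN
    for f, g in zip(xs, ys):
        acc = ring_add(acc, ring_mul(f, g))
    return acc


def rank_weight(elems):
    """F_2-dimension of the span of a list of field elements."""
    basis = []
    for u in elems:
        for b in basis:
            u = min(u, u ^ b)
        if u:
            basis.append(u)
    return len(basis)


def sample(support, rng):
    """Element of R whose coefficients are random F_2-combinations of support."""
    coeffs = []
    for _ in range(R_LEN):
        c = 0
        for b in support:
            if rng.getrandbits(1):
                c ^= b
        coeffs.append(c)
    return coeffs


rng = random.Random(1)
F = [1, 0x53]        # support of (x, y); it contains 1 as in Remark 3
E = [0x0B, 0xC4]     # support of (r, e, e')
H = [[[rng.randrange(1 << M) for _ in range(R_LEN)] for _ in range(K)] for _ in range(K)]
x, y = [sample(F, rng) for _ in range(K)], [sample(F, rng) for _ in range(K)]
r, e = [sample(E, rng) for _ in range(K)], [sample(E, rng) for _ in range(K)]
e1 = sample(E, rng)
codeword = [rng.randrange(1 << M) for _ in range(R_LEN)]  # stand-in for mG

# s = xH + y (row vector), u = H r^T + e^T (column vector), v = s r^T + e' + mG
s = [ring_add(dot(x, [H[i][j] for i in range(K)]), y[j]) for j in range(K)]
u = [ring_add(dot(H[i], r), e[i]) for i in range(K)]
v = ring_add(ring_add(dot(s, r), e1), codeword)

residual = ring_add(ring_add(v, dot(x, u)), codeword)   # v - xu - mG
noise = ring_add(ring_add(dot(y, r), e1), dot(x, e))    # y r^T + e' - x e^T
print(residual == noise, rank_weight(noise) <= W * WE)
```

The weight bound relies on 1 lying in the support of \((\mathbf {x},\mathbf {y})\): every coefficient of the noise term then lies in the product space of the two supports, whose dimension is at most \(ww_e\).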

4.2 Proof of Security

In this subsection, we show that Piglet-1.CPAPKE is IND-CPA secure under the RMSD hardness assumption.

Theorem 1

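For any adversary A, there exists an adversary B such that \(\mathrm {Adv}^{\mathrm {cpa}}_{\text {Piglet-1.CPAPKE}}(A)\le \mathrm {Adv}^{\mathrm {drmsd}}_{2k,k,w,r}(B)+\mathrm {Adv}^{\mathrm {drmsd}}_{2k+1,k,w_e,r}(B)\), where \(\mathrm {Adv}^{\mathrm {drmsd}}_{n,k,w,r}\) denotes the advantage in solving the \((n, k, w, r)\text {-DRMSD}\) problem.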

Proof

Let A be an adversary that is executed in the IND-CPA security experiment, which we call game \(G_1\), i.e., \(\mathrm {Adv}^{\mathrm {cpa}}_{\text {Piglet-1.CPAPKE}}(A)=|Pr[b=b' \text { in game } G_1]-\frac{1}{2}|\).

In game \(G_2\), the value \(\mathbf {s}=\mathbf {x} H+\mathbf {y}\) generated in KeyGen is replaced by a uniformly random vector. It is possible to verify that there exists an adversary B with the same running time as that of A such that \(|Pr[b=b' \text { in game } G_1]-Pr[b=b' \text { in game } G_2]|\le \mathrm {Adv}^{\mathrm {drmsd}}_{2k,k,w,r}(B)\),

since \((I \,\, H^T)\left( \begin{array}{c} \mathbf {y}^T\\ \mathbf {x}^T \end{array}\right) =\mathbf {s}^T\), where \((I \, H^T)\) is a systematic parity-check matrix of a module code over \(\mathcal {R}\) while \(\mathbf {x}\) and \(\mathbf {y}\) are drawn randomly with low rank weight w.

In game \(G_3\), the values of \(\mathbf {u}=H\mathbf {r}^T+\mathbf {e}^T\) and \(\mathbf {v}= \mathbf {s} \mathbf {r}^T+\mathbf {e}' +\mathbf {m}G\) used in the generation of the challenge ciphertext are simultaneously substituted with uniform random values. Again, there exists an adversary B with the same running time as that of A such that \(|Pr[b=b' \text { in game } G_2]-Pr[b=b' \text { in game } G_3]|\le \mathrm {Adv}^{\mathrm {drmsd}}_{2k+1,k,w_e,r}(B)\),

since \(\left( \begin{array}{ccc} I_{k} &{} &{} H\\ &{} I_{1} &{} \mathbf {s} \end{array}\right) \left( \begin{array}{c} \mathbf {e}^T\\ \mathbf {e}'\\ \mathbf {r}^T \end{array} \right) =\left( \begin{array}{c} \mathbf {u}\\ \mathbf {v}-\mathbf {m}G \end{array} \right) \), where \(\left( \begin{array}{ccc} I_{k} &{} &{} H\\ &{} I_{1} &{} \mathbf {s} \end{array}\right) \) is a systematic parity-check matrix of a module code while H, \(\mathbf {s}\) are uniform and \(\mathbf {r}, \mathbf {e}, \mathbf {e}'\) are drawn randomly with low rank weight \(w_e\).

Note that in game \(G_3\), the value \(\mathbf {v}\) from the challenge ciphertext is independent of b and therefore \(Pr[b=b' \text { in game } G_3]=\frac{1}{2}+\epsilon \), in which \(\epsilon \) is arbitrarily small. We have thus built a sequence of games allowing a simulator to transform a ciphertext of a message \(\mathbf {m}_0\) into a ciphertext of a message \(\mathbf {m}_1\). Hence the result follows.   \(\square \)

4.3 Piglet-1.CCAKEM: A New IND-CCA-Secure KEM

In this subsection, let \(G: \{0,1\}^{*}\rightarrow \{0,1 \}^{3\times 256}\) and \(H: \{0,1\}^{*}\rightarrow \{0,1 \}^{2\times 256}\) be hash functions, and let z be a random secret seed. We apply the KEM variant of the Fujisaki-Okamoto transform to Piglet-1.CPAPKE to construct an IND-CCA-secure KEM, Piglet-1.CCAKEM, when the hash functions G and H are modeled as random oracles.

Piglet-1.CCAKEM.Keygen() is the same as Piglet-1.CPAPKE.keyGen().

Piglet-1.CCAKEM.Encaps(pk = (\(\rho ,\mathbf {s})\))

  1. \(\mathbf {m}\leftarrow \mathbb {F}^{l}_{q^m}\)

  2. \((\hat{K}, \sigma , d):=G(pk, \mathbf {m})\)

  3. \((\mathbf {u},\mathbf {v}):=\text {Piglet-1.CPAPKE.Enc}((\rho ,\mathbf {s}), \mathbf {m}; \sigma )\)

  4. \(\mathbf {c}:=(\mathbf {u}, \mathbf {v},d)\)

  5. \(K:=H(\hat{K},\mathbf {c})\)

  6. return \((\mathbf {c}, K)\)

\(\text {Piglet-1.CCAKEM.Decaps(sk}=(\mathbf {x}, z, \rho , \mathbf {s}), \mathbf {c}=(\mathbf {u}, \mathbf {v},d))\)

  1. \(\mathbf {m}':=\) Piglet-1.CPAPKE.Dec\((\mathbf {x}, (\mathbf {u}, \mathbf {v}))\)

  2. \((\hat{K}', \sigma ', d'):=G(pk, \mathbf {m}')\)

  3. \((\mathbf {u}', \mathbf {v}'):=\) Piglet-1.CPAPKE.Enc\(((\rho ,\mathbf {s}), \mathbf {m}'; \sigma ')\)

  4. if \((\mathbf {u}',\mathbf {v}', d')=(\mathbf {u},\mathbf {v}, d)\) then

  5.     return \(K:=H(\hat{K}',\mathbf {c})\)

  6. else

  7.     return \(K:=H(z,\mathbf {c})\)

  8. end if
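The control flow of Encaps and Decaps, including the re-encryption check and the fallback to \(H(z,\mathbf {c})\), can be sketched as follows. This is an illustration under loudly stated assumptions: G and H are instantiated with SHAKE-256 and domain-separation prefixes of our choosing, byte strings replace field elements, and `toy_enc`/`toy_dec` are an insecure symmetric stand-in (the public key doubles as the secret key) used only so that the example runs. They are not Piglet-1.CPAPKE.

```python
import hashlib
import hmac
import os


def G(pk, m):
    """Hash to 3 x 256 bits, split as (K_hat, sigma, d). SHAKE-256 is our assumption."""
    out = hashlib.shake_256(b"G" + pk + m).digest(96)
    return out[:32], out[32:64], out[64:]


def H(key, c):
    """Hash to 2 x 256 bits. SHAKE-256 is our assumption."""
    return hashlib.shake_256(b"H" + key + c).digest(64)


def toy_enc(pk, m, coins):
    """INSECURE stand-in for the CPA-secure encryption, deterministic given coins."""
    nonce = coins[:16]
    pad = hashlib.shake_256(pk + nonce).digest(len(m))
    return nonce + bytes(a ^ b for a, b in zip(m, pad))


def toy_dec(sk, uv):
    """INSECURE stand-in for the CPA-secure decryption (here sk equals pk)."""
    nonce, body = uv[:16], uv[16:]
    pad = hashlib.shake_256(sk + nonce).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, pad))


def encaps(pk, m=None):
    """Steps 1-6 of Encaps: derive coins from the message, encrypt, hash."""
    m = os.urandom(32) if m is None else m
    k_hat, sigma, d = G(pk, m)
    c = toy_enc(pk, m, sigma) + d
    return c, H(k_hat, c)


def decaps(sk, z, pk, c):
    """Steps 1-8 of Decaps: decrypt, re-encrypt, compare, else reject implicitly."""
    uv = c[:-32]
    m2 = toy_dec(sk, uv)
    k_hat2, sigma2, d2 = G(pk, m2)
    if hmac.compare_digest(toy_enc(pk, m2, sigma2) + d2, c):
        return H(k_hat2, c)
    return H(z, c)  # implicit rejection: a pseudorandom key derived from z


pk = sk = b"\x01" * 32   # toy key material; pk doubles as sk in this stand-in
z = b"\x02" * 32
c, key = encaps(pk)
print(decaps(sk, z, pk, c) == key)
```

Returning \(H(z,\mathbf {c})\) instead of an error symbol means a malformed ciphertext yields a key that looks random to anyone who does not know z.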

4.4 Parameter Sets

In this subsection, we give three sets of parameters for Piglet-1.CCAKEM, achieving 128, 192 and 256 bits of security, respectively.

First we choose the dimension of the module code used in our schemes as \(k=2\), so that the size of the public key is as small as possible. In this case, we take \(1\in \text {Supp}(\mathbf {x}, \mathbf {y})\), since finding a codeword of small weight w whose support contains 1 is harder than finding a codeword of small weight \(w-1\). Therefore, the security of the (2k, k, w, r)-RMSD problem over \(\mathcal {R}\) in our scheme can be reduced to decoding [4r, 2r]-linear codes over \(\mathbb {F}_{q^m}\) with rank weight \(w-1\). The security of the \((2k+1, k, w_e, r)\)-RMSD problem over \(\mathcal {R}\) can be reduced to decoding [5r, 2r]-linear codes over \(\mathbb {F}_{q^m}\) with rank weight \(w_e\). One can use the best combinatorial attack algorithm in [22] to determine the choice of parameters such as \(m, r, w, w_e\). Furthermore, l is determined by the condition \(w w_e\le \frac{r-l}{2}\). These parameters also need to resist the algebraic attacks presented in Sect. 6. The concrete parameters are listed in Table 1.

Table 1. Parameter sets of Piglet-1.CCAKEM

Table 2 presents the theoretical sizes in bytes for Piglet-1.CCAKEM. The size of pk is \(kmr+256\) bits, i.e., \(\frac{2mr+256}{8}\) bytes for \(k=2\). The size of sk is 256 bits, i.e., 32 bytes. The size of the ciphertext is \(3mr+256\) bits, i.e., \(3mr/8+32\) bytes. The size of ss (session secret) is \(2\times 256\) bits, i.e., 64 bytes.
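The size formulas above are easy to tabulate. The helper below is a sketch with assumptions of our own: it takes \(q=2\) so that an element of \(\mathbb {F}_{q^m}\) occupies m bits, it writes the ciphertext size as \((k+1)mr+256\) bits, which coincides with \(3mr+256\) for \(k=2\), and it rounds up to whole bytes. The sample values of m and r are hypothetical and are not the parameters of Table 1.

```python
def piglet1_sizes(m, r, k=2, seed_bits=256):
    """Theoretical sizes in bytes, following the bit counts stated in the text."""

    def to_bytes(bits):
        return (bits + 7) // 8  # round up to whole bytes

    return {
        "pk": to_bytes(k * m * r + seed_bits),      # s in R^k plus the seed rho
        "sk": to_bytes(seed_bits),                  # 256-bit secret seed
        "ct": to_bytes((k + 1) * m * r + 256),      # u in R^k, v in R, plus d
        "ss": to_bytes(2 * 256),                    # session secret
    }


# Hypothetical inputs, for illustration only.
print(piglet1_sizes(m=64, r=32))
```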

Table 2. The theoretical sizes in bytes for Piglet-1.CCAKEM
Table 3. Comparison on sizes of public keys (in bytes)

Table 3 compares the parameters of our scheme with those of some NIST submissions that proceeded to the second round of the NIST PQC standardization process. As analyzed in Sect. 3, the public key in our schemes is slightly larger than those in RQC, Ouroboros-R and LOCKER, which are based on the RQCSD problem. The public key in our schemes is smaller than those in HQC, LEDAkem, and BIKE, which are based on the QCSD problem, and much smaller than those in Classic McEliece and NTS-kem, which are original McEliece cryptosystems.

5 Piglet-2: A New Module-Code-Based KEM

In this section, we propose a new IND-CPA-secure KEM, Piglet-2.CPAKEM. It differs from Piglet-1 in the choice of the auxiliary codes (LRPC codes for Piglet-2.CPAKEM, Gabidulin codes for Piglet-1.CPAPKE). The session key is the hash value of the support of the error vectors, and no plaintext is encrypted. LRPC codes were introduced in Sect. 2. In addition, \(G: \{0,1\}^{*}\rightarrow \{0,1 \}^{2\times 256}\) denotes a hash function.

Piglet-2.CPAKEM.Keygen(): key generation

  1. \(\rho \xleftarrow {\$} \{0, 1\}^{256}\), \( \sigma \xleftarrow {\$} \{0, 1\}^{320}\)

  2. \(H\in \mathcal {R}^{k\times k}:= \text {XOF}(\rho )\)

  3. \((\mathbf {x}, \mathbf {y}) \in \mathcal {R}^{k}\times \mathcal {R}^{k} := \text {XOF}(\sigma )\) with \(w_{R}(\mathbf {x})=w_{R}(\mathbf {y})=w\)

  4. \(\mathbf {s} :=\mathbf {x} H +\mathbf {y}\)

  5. return \((pk:=(\rho ,\mathbf {s}), sk:=(\mathbf {x}, \mathbf {y}))\)

Piglet-2.CPAKEM.Encaps\((\rho ,\mathbf {s})\): encapsulation

  1. \(\tau \xleftarrow {\$} \{0, 1\}^{320}\)

  2. \(H\in \mathcal {R}^{k\times k}:= \text {XOF}(\rho )\)

  3. \((\mathbf {r}, \mathbf {e},\mathbf {e}') \in \mathcal {R}^{k}\times \mathcal {R}^{k} \times \mathcal {R} := \text {XOF}(\tau )\) with \(w_{R}(\mathbf {r})=w_{R}(\mathbf {e})=w_{R}(\mathbf {e}')=w_e\)

  4. \(E: =\text {Supp}(\mathbf {r},\mathbf {e},\mathbf {e}')\) and \(K :=G(E)\)

  5. \(\mathbf {u}:= H \mathbf {r}^T+\mathbf {e}^T\)

  6. \(\mathbf {v}:= \mathbf {s} \mathbf {r}^T+\mathbf {e}' \)

  7. return a ciphertext pair \(\mathbf {c}:=(\mathbf {u}, \mathbf {v})\)

Piglet-2.CPAKEM.Decaps\((sk=(\mathbf {x},\mathbf {y}), \mathbf {c}=(\mathbf {u},\mathbf {v}))\): decapsulation

  1. \(F:=\text {Supp}(\mathbf {x},\mathbf {y})\)

  2. Compute \(\mathbf {v}-\mathbf {x}\mathbf {u} =\mathbf {y} \mathbf {r}^T+\mathbf {e}'-\mathbf {x}\mathbf {e}^T\)

  3. \(E: =\text {RS-recover}(F, \mathbf {v}-\mathbf {x}\mathbf {u}, w_e)\)

  4. \(K: =G(E)\)
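Because the session key is \(G(E)\) for a subspace E, both parties must serialize E in the same way, whichever spanning vectors they happen to hold. The text does not specify an encoding; one possible choice, sketched here purely as an assumption, is to hash the reduced row echelon basis of E. The sketch takes \(q=2\), stores field elements as integers, and instantiates G with SHAKE-256. The function names are ours.

```python
import hashlib


def canonical_basis(generators):
    """Reduced echelon basis of the F_2-span of the given integers.

    Each basis element has a leading bit that no other basis element
    contains, so the result depends only on the subspace.
    """
    basis = []
    for v in generators:
        for b in basis:
            v = min(v, v ^ b)  # clear b's leading bit from v if present
        if v:
            top = v.bit_length() - 1
            # remove v's leading bit from the elements already in the basis
            basis = [b ^ v if (b >> top) & 1 else b for b in basis]
            basis.append(v)
    return sorted(basis, reverse=True)


def session_key(generators, m):
    """Hash the canonical basis of the support to 2 x 256 bits (SHAKE-256 assumed)."""
    nbytes = (m + 7) // 8
    encoded = b"".join(b.to_bytes(nbytes, "big") for b in canonical_basis(generators))
    return hashlib.shake_256(encoded).digest(64)


# Two different spanning sets of the same 2-dimensional subspace of F_{2^4}.
key_sender = session_key([0b0011, 0b0101], m=4)
key_receiver = session_key([0b0110, 0b0011, 0b0101], m=4)
print(key_sender == key_receiver)
```

With such a canonical form, the sender (who knows \(\mathbf {r},\mathbf {e},\mathbf {e}'\)) and the receiver (who obtains some basis from support recovery) derive the same key whenever they hold the same subspace.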

Remark 4

  1. In the above scheme, \(E =\text {RS-recover}(F, \mathbf {v}-\mathbf {x}\mathbf {u}, w_e)\) denotes that the decoding algorithm outputs the support E, of dimension \(w_e\), of the error vectors \(\mathbf {r}, \mathbf {e}\) and \(\mathbf {e}'\), given the support F of \(\mathbf {x} \) and \(\mathbf {y}\) and the syndrome \(\mathbf {v}-\mathbf {x}\mathbf {u}\).

  2. The security proof of Piglet-2.CPAKEM is the same as that of Piglet-1.CPAPKE, and so we omit it here.

  3. The choice of parameter sets for Piglet-2.CPAKEM is the same as that for Piglet-1.CCAKEM.

  4. The rank support recovery algorithm is probabilistic, and the decoding failure probability can be computed by Proposition 1. In our case the result is \(\max (q^{(2-w)(w_e-2)}\times q^{-(r-ww_e+1)}, q^{-2(r-ww_e+2)})=2^{-38}\) for both the 128- and 192-bit security levels, and \(2^{-52}\) for the 256-bit security level.

  5. Since rank support recovery decoding techniques do not attain a negligible decoding failure rate, it is challenging to achieve stronger security notions such as IND-CCA.

6 Known Attacks

There are two types of generic attacks on our schemes, which play an important role in the choice of parameter sets. One is the general combinatorial decoding attack and the other is the algebraic attack using Gröbner bases.

The combinatorial decoding algorithm was proposed in [3, 21], and the best result is as follows.

For an [n, k] rank code \(\mathcal {C}\) over \(\mathbb {F}_{q^m}\), the time complexity of the best known combinatorial attack to decode a word with rank weight d is

$$\begin{aligned} O((nm)^{3} q^{d\lceil \frac{m(k+1)}{n}\rceil -m}). \end{aligned} \tag{2}$$

As for the algebraic attack, its time complexity is much greater than that of the decoding attack when \(q=2\). The complexity of the algebraic attack on the above problem is \(q^{d\lceil \frac{d(k+1)-(n+1)}{d}\rceil }\) [28].
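When comparing candidate parameters, the two complexity expressions above are conveniently evaluated as base-2 logarithms. The sketch below does this, ignoring the constant hidden in the \(O(\cdot )\) notation; the function names and the sample inputs are hypothetical and do not correspond to any parameter set of the paper.

```python
from math import log2


def ceil_div(a, b):
    """Ceiling of a / b for integers, valid for negative a as well."""
    return -(-a // b)


def combinatorial_log2(n, k, m, d, q=2):
    """log2 of (nm)^3 * q^(d * ceil(m(k+1)/n) - m), the combinatorial attack cost."""
    exponent = d * ceil_div(m * (k + 1), n) - m
    return 3 * log2(n * m) + exponent * log2(q)


def algebraic_log2(n, k, d, q=2):
    """log2 of q^(d * ceil((d(k+1) - (n+1)) / d)), the algebraic attack cost."""
    return d * ceil_div(d * (k + 1) - (n + 1), d) * log2(q)


# Hypothetical inputs, for illustration only.
print(combinatorial_log2(n=20, k=10, m=8, d=2))
print(algebraic_log2(n=20, k=10, d=2))
```

A parameter set targets a given security level only if both values are at least that level.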

Next, the general attacks from [42], which use the cyclic structure of the code, have less impact on module codes than on the quasi-cyclic codes in RQC, Ouroboros-R, LOCKER, etc.

In addition, as for the choice of r, no attacks exploiting the quasi-cyclicity of a code are known if \(x^r-1\) has only two irreducible factors modulo q [26]. Therefore, r should be prime, and q should be a generator of the multiplicative group \((\mathbb {Z}/{r\mathbb {Z}})^{*}\).
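The condition on r can be tested directly: r must be prime and the multiplicative order of q modulo r must be \(r-1\). The following sketch uses trial division and a naive order computation, which is adequate for block lengths of the size used in code-based schemes; the function names are ours.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small block lengths."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def multiplicative_order(q, r):
    """Smallest t >= 1 with q^t = 1 mod r; requires gcd(q, r) = 1."""
    x, t = q % r, 1
    while x != 1:
        x = (x * q) % r
        t += 1
    return t


def is_suitable_length(q, r):
    """True if r is prime and q generates the multiplicative group (Z/rZ)^*."""
    return is_prime(r) and q % r != 0 and multiplicative_order(q, r) == r - 1


print(is_suitable_length(2, 53), is_suitable_length(2, 7))
```

For \(q=2\), the length \(r=53\) passes the test, whereas \(r=7\) fails because 2 has order 3 modulo 7.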

7 Conclusions

In this paper, we propose an IND-CCA-secure KEM, Piglet-1.CCAKEM, and an IND-CPA-secure KEM, Piglet-2.CPAKEM, both of which are based on the difficulty of the RMSD problem. More importantly, the public key in our schemes is much shorter than those of the NIST submissions that entered the second round, except for the candidates based on the RQCSD problem. The shorter keys of the RQCSD-based candidates are due to the simple quasi-cyclic structure they use. However, the advantage of our new construction is the elimination of possible quasi-cyclic attacks, which makes our schemes strong and robust. The parameter comparison between Piglet and other NIST proposals shows that our schemes would be good candidates for post-quantum cryptosystems with long-term security. Moreover, we expect to further reduce the public-key size in future work by using an approach similar to Kyber's.