PIPO: A Lightweight Block Cipher with Efficient Higher-Order Masking Software Implementations

Kim, Hangi; Jeon, Yongjin; Kim, Giyoon; Kim, Jongsung; Sim, Bo-Yeon; Han, Dong-Guk; Seo, Hwajeong; Kim, Seonggyeom; Hong, Seokhie; Sung, Jaechul; Hong, Deukjo

doi:10.1007/978-3-030-68890-5_6

Hangi Kim⁹,
Yongjin Jeon⁹,
Giyoon Kim⁹,
Jongsung Kim^9,10,
Bo-Yeon Sim⁹,
Dong-Guk Han^9,10,
Hwajeong Seo¹¹,
Seonggyeom Kim¹²,
Seokhie Hong¹²,
Jaechul Sung¹³ &
…
Deukjo Hong¹⁴

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 12593))

Included in the following conference series:

International Conference on Information Security and Cryptology

1043 Accesses
22 Citations

Abstract

In this paper, we introduce a new lightweight 64-bit block cipher PIPO (PIPO stands for “Plug-In” and “Plug-Out”, representing its use in side-channel protected and unprotected environments, respectively.) supporting a 128 or 256-bit key. It is a byte-oriented and bitsliced cipher that offers excellent performance in 8-bit AVR software implementations. In particular, PIPO allows for efficient higher-order masking implementations, since it uses a minimal number of nonlinear operations. Our implementations demonstrate that PIPO outperforms existing block ciphers (for the same block and key lengths) in both side-channel protected and unprotected environments, on an 8-bit AVR. Furthermore, PIPO records competitive round-based hardware implementations.

For the nonlinear layer of PIPO, we have developed a new lightweight 8-bit S-box that provides an efficient bitsliced implementation including only 11 nonlinear bitwise operations. Furthermore, its differential and linear branch numbers are both 3. This characteristic enables PIPO to thwart differential and linear attacks with fewer rounds. The security of PIPO has been scrutinized with regards to state-of-the-art cryptanalysis.

This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2017-0-00520, Development of SCR-Friendly Symmetric Key Cryptosystem and Its Application Modes).

Access provided by Autonomous University of Puebla. Download conference paper PDF

Strong 8-bit Sboxes with efficient masking in hardware extended version

Article 15 March 2017

Strong 8-bit Sboxes with Efficient Masking in Hardware

Analysis and Improvements of the DPA Contest v4 Implementation

Keywords

1 Introduction

Most devices in IoT environments are miniature and resource-constrained. Therefore, lightweight cryptography, which is an active area of research, is essential. Some lightweight block ciphers such as PRESENT [22] and CLEFIA [44] have been standardized by ISO/IEC. Additionally, a lightweight cryptography standardization project is ongoing at NIST. A variety of lightweight block ciphers have been introduced in the literature, many of which are hardware-friendly systems [9, 22, 24, 33, 43]. Others focus on software performance [10, 28, 31] or both hardware and software performance [8, 12, 14, 32, 49].

In 1996, Paul Kocher first proposed side-channel attacks, which extract secret information by analyzing side-channel information [37]. Since secure designs for mathematical cryptanalysis cannot guarantee security against side-channel attacks, various countermeasures have been studied. With side-channel attacks becoming more advanced and the associated equipment cost decreasing [47], the application of side-channel countermeasures to ciphers has become critical. Recently, various studies have been actively conducted on efficient implementations of side-channel countermeasures, especially on efficient masked implementations. To minimize performance overhead, the focus has been on reducing the number of nonlinear operations. Several lightweight block ciphers, whose design goal is a low nonlinear operation count, have been proposed [2, 28, 31]. These are intended for use in side-channel protected environments.

However, most existing lightweight block ciphers are unsuitable for at least one type of hardware, software, or side-channel protected implementations in low resource environments. Consequently, it is challenging to design general-purpose lightweight block ciphers capable of operating in any resource-constrained environment.

In this paper, we propose a new lightweight versatile block cipher PIPO. During the PIPO design process, the focus was on minimizing the number of nonlinear operations because this is the most significant factor for efficient higher-order masking implementations. To construct an 8-bit S-box with a small number of nonlinear operations, we devised a unbalanced-Bridge structure that accepts one 3-bit and two 5-bit S-boxes and produces 8-bit S-boxes. This structure allows us to construct a new 8-bit S-box that offers good cryptographic properties and an efficient bitsliced implementation including only 11 nonlinear bitwise operations. We also present theorems applied to the unbalanced-Bridge structure, which show the conditions that the both differential and linear branch numbers of the S-boxes constructed through the structure are greater than 2. We investigated the linear layer with the highest security against differential and linear cryptanalyses when combined with the new S-box, through which we could reduce the number of rounds of PIPO. Consequently, PIPO achieves fast higher-order masking implementations, and its execution time increases less with the number of shares (i.e., the masking order) compared with other lightweight 64-bit block ciphers with 128-bit keys. Additionally, PIPO records excellent performance on 8-bit microcontrollers and competitive round-based hardware implementations.

The following notation and definitions are used throughout this paper.

DDT	Difference Distribution Table of an n-bit S-box whose
	$(\varDelta \alpha , \varDelta \beta )$ entry is $\#\{x \in \mathbb {F}^{n}_{2}\|S(x) \oplus S(x \oplus \varDelta \alpha ) = \varDelta \beta \}$,
	where $\varDelta \alpha , \varDelta \beta \in \mathbb {F}^{n}_{2}$
LAT	Linear Approximation Table of an n-bit S-box whose $(\lambda _\alpha ,\lambda _\beta )$
	entry is $\#\{x \in \mathbb {F}^{n}_{2}\| \lambda _\alpha \bullet x = \lambda _\beta \bullet S(x)\} - 2^{n-1}$,
	where $\lambda _\alpha , \lambda _\beta \in \mathbb {F}^{n}_{2}$, and the symbol $\bullet $ denotes the canonical
	inner product in $\mathbb {F}^{n}_{2}$
Differential uniformity	$\underset{\varDelta \alpha \ne 0, \varDelta \beta }{\text{ max }}\#\{x \in \mathbb {F}^{n}_{2}\|S(x) \oplus S(x \oplus \varDelta \alpha ) = \varDelta \beta \}$

Non-linearity	$2^{n-1} - 2^{-1}\times \underset{\lambda _\alpha , \lambda _\beta \ne 0}{\text{ max }}\|\Phi (\lambda _\alpha , \lambda _\beta )\|$, where $\Phi (\lambda _\alpha , \lambda _\beta )$
	$=\underset{x \in \mathbb {F}^{n}_{2}}{\sum } (-1)^{\lambda _\beta \bullet S(x) \oplus \lambda _\alpha \bullet x}$
DBN	Differential Branch Number of an S-box defined as
	$\underset{a,b\ne a}{\text{ min }}(wt(a \oplus b)+wt(S(a)\oplus S(b)))$
LBN	Linear Branch Number of an S-box defined as
	$\underset{a,b,\Phi (a,b)\ne 0}{\text{ min }}(wt(a)+wt(b))$

2 Specification of PIPO

The PIPO block cipher accepts a 64-bit plaintext and either a 128 or 256-bit key, generating a 64-bit ciphertext. It performs 13 rounds for a 128-bit key and 17 rounds for a 256-bit key. Each round is composed of a nonlinear layer denoted as the S-layer, a linear layer denoted as the R-layer, and round key and constant XOR additions. The overall structure of PIPO is depicted on the left side of Fig. 1. Here, $RK_0$ is a whitening key and $RK_1,RK_2, \cdots , RK_{r}$ are round keys, where $r=13$ (128-bit key) or 17 (256-bit key). The i-th round constant $c_{i}$ is i (the round counter) which is XORed with $RK_i$. During the enciphering process, the intermediate state is regarded as an $8 \times 8$ array of bits, as shown on the right side of Fig. 1, where X[i] represents the i-th row byte for $i=0\sim 7$. The S-layer executes eight identical 8-bit S-boxes (denoted as $S_{8}$) in parallel. The $S_{8}$ is applied to each column of the $8 \times 8$ array of bits, where the uppermost bit is the least significant. The $S_{8}$ is shown in Table 7 of Appendix C.1. The R-layer rotates the bits in each row by a given offset (Fig. 2).

The key schedule of PIPO is very simple. We first split a 128-bit master key K into two 64-bit subkeys $K_0$ and $K_1$, i.e., $K=K_1 || K_0$. The whitening and round keys are then defined as $RK_i = K_{i~mod~2}$, where $i=0,1,2,\cdots ,13$. Similarly, a 256-bit master key K is divided into four 64-bit subkeys $K_0$, $K_1$, $K_2$, and $K_3$, i.e., $K=K_3 || K_2 || K_1 || K_0$. In this case, $RK_i = K_{i~mod~4}$ where $i=0,1,2,\cdots ,17$. Some test vectors for PIPO are provided in Appendix A. Note that resistance to related-key attacks was not considered when designing the PIPO cipher.

3 Design Rationale of PIPO

3.1 S-Layer

Overall Structure. We focused on the following three criteria when designing our 8-bit S-box, $S_8$.

1.
It should offer an efficient bitsliced implementation including 11 or fewer nonlinear operations.
2.
Its differential and linear branch numbers (DBN and LBN) should both be greater than 2.
3.
Its differential uniformity should be 16 or less, and its non-linearity should be 96 or more.

Criterion 1 minimizes the number of nonlinear operations required by PIPO, which allows for efficient higher-order masking implementations. Criteria 2 and 3 ensure the cryptographic strengths of the $S_8$ against differential cryptanalysis (DC) and linear cryptanalysis (LC). Any inferior criteria will lead to the implementation of more rounds to achieve acceptable security against these attacks, eventually resulting in a weak proposal. The thresholds of the criteria were selected based on the properties of the existing lightweight 8-bit S-boxes (refer to Table 1).

The Bridge structure was first proposed in [36], and revisited in [15]. In order to construct an $S_8$ satisfying all the aforementioned three criteria, we employed the unbalanced-Bridge structure depicted in Fig. 3, where $S_i^j$ represents the j-th and i-bit S-box in the structure. This structure has the following three characteristics. First, it uses 3-bit and 5-bit S-boxes instead of 4-bit S-boxes. We observe that 8-bit S-box constructions using three 4-bit S-boxes would have difficulty satisfying criterion 1, even though they conform to criteria 2 and 3. Second, all eight output bits are generated from at least two smaller S-boxes (to meet criterion 3). Finally, at least one non-bijective smaller S-box can be adopted to increase the number of possible combinations of smaller S-boxes.

The notation used in this section is introduced below.

$\rho _c:\mathbb {F}^{5}_{2}\rightarrow \mathbb {F}^{5}_{2}, \,\,\,\rho _c(x||y)=y||x,\,\,\,\text {for}\,\,x\in \mathbb {F}^{3}_{2},\,y\in \mathbb {F}^{2}_{2},$

$\tau _n:\mathbb {F}^{5}_{2}\rightarrow \mathbb {F}^{n}_{2},\,\,\,\tau _n(x||y)=x,\,\,\,\text {for}\,\,x\in \mathbb {F}^{n}_{2},\,y\in \mathbb {F}^{5-n}_{2},\,n\in \{1,2,3,4\},$

$\tau _n':\mathbb {F}^{5}_{2}\rightarrow \mathbb {F}^{n}_{2},\,\,\,\tau _n'(x||y)=y,\,\,\,\text {for}\,\,x\in \mathbb {F}^{5-n}_{2},\,y\in \mathbb {F}^{n}_{2},\,n\in \{1,2,3,4\},$

$\mathfrak {F}^1_A:\mathbb {F}^{3}_{2}\rightarrow \mathbb {F}^{5}_{2},\,\,\,\mathfrak {F}^1_A(X)=(S^{1}_5)^{-1}(X||A)\,\,\,\text {for}\,\,A\in \mathbb {F}^{2}_{2},$

$\mathfrak {F}^2_A:\mathbb {F}^{3}_{2}\rightarrow \mathbb {F}^{5}_{2},\,\,\,\mathfrak {F}^2_A(X)=S^{2}_5(X||A)\,\,\,\text {for}\,\,A\in \mathbb {F}^{2}_{2}.$

Now we can define an 8-bit S-box constructed by the unbalanced-Bridge. Let $S_{8}(X_{L}||X_{R})=C_{L}(X_{L}, X_{R}) ||$ $C_{R}(X_{L}, X_{R})$, where $X_{L}$ and $X_{R}$ represent the input variables of the $S_8$ which are in $\mathbb {F}^{5}_{2}$ and $\mathbb {F}^{3}_{2}$, respectively. Then $C_{L}(X_{L},X_{R})=\tau _3(S^{1}_{5}(X_{L}))\oplus S_{3}(X_{R})$ and $C_{R}(X_{L},X_{R})=\rho _c(S^{2}_{5}(S^{1}_{5}(X_{L})\oplus (S_{3}(X_{R})||0^{(2)})))\oplus $ $(0^{(2)}||S_{3}(X_{R}))$ with $C_{L}:\mathbb {F}^{5}_{2}\times \mathbb {F}^{3}_{2}\rightarrow \mathbb {F}^{3}_{2}$ and $C_{R}:\mathbb {F}^{5}_{2}\times \mathbb {F}^{3}_{2}\rightarrow \mathbb {F}^{5}_{2}$. Proposition 1 shows the conditions for the 8-bit S-box constructed by the unbalanced-Bridge to be bijective.

Proposition 1

The 8-bit S-box constructed using the unbalanced-Bridge is bijective if and only if the following three conditions are all satisfied:

i)
$S_{3}$ is bijective.
ii)
$S^{1}_{5}$ is bijective.
iii)
For all $y\in \mathbb {F}^{3}_{2}$, $f_{y}(x)=\tau _2'(S^{2}_{5}(y||x))$ is a bijective function with $f_{y}:\mathbb {F}^{2}_{2}\rightarrow \mathbb {F}^{2}_{2}$.

Proof

Refer to Appendix B.1.

Construction of 8-Bit S-Boxes with DBN$\varvec{>2}$ and LBN$\varvec{>2}$ and Our $\varvec{S}_{\varvec{8}}$ Selection. We present here how to construct 8-bit S-boxes with DBN$>2$ and LBN$>2$. Our framework is to eliminate all the input and output differences (masks) where the sum of their Hamming weights is 2. During this elimination process, we can obtain some conditions of smaller S-boxes. Theorems 1 and 2 present the necessary and sufficient conditions of smaller S-boxes so that the 8-bit S-boxes constructed by Fig. 3 have both differential and linear branch numbers greater than 2.

Theorem 1

The DBN of bijective 8-bit S-boxes constructed using the unbalanced-Bridge is greater than 2 if and only if conditions i), ii), and iii) are all satisfied ($\varDelta \alpha $ and $\varDelta \beta $ below represent arbitrary differences where $wt(\varDelta \alpha )=wt(\varDelta \beta )=1$):

i)
For each $\varDelta \alpha ,\varDelta \beta \in \mathbb {F}^3_2$, at least one of the entry $(\varDelta \alpha , \varDelta \beta )$ in DDT of $S_3$ and the entry $(\varDelta \beta ||0^{(2)}, \varDelta \beta ||0^{(2)})$ in DDT of $S^2_5$ is 0,
ii)
For each $\varDelta \alpha ,\varDelta \beta \in \mathbb {F}^5_2$, for each $A,B(\ne A)\in \mathbb {F}^2_2$, at least one of $\mathfrak {F}^{1}_{A}(X)\oplus \mathfrak {F}^{1}_{B}(X)=\varDelta \alpha $ and $\mathfrak {F}^{2}_{A}(X)\oplus \mathfrak {F}^{2}_{B}(X)=\varDelta \beta $ has no solution X, where $X\in \mathbb {F}^3_2$,
iii)
For each $\varDelta \alpha \in \mathbb {F}^3_2$ and $\varDelta \beta \in \mathbb {F}^5_2$, for each $A,B\in \mathbb {F}^2_2$, at least one of $\mathfrak {F}^{1}_{A}(X)\oplus \mathfrak {F}^{1}_{B}(X\oplus \varDelta \alpha )=\varDelta \beta $ and $\mathfrak {F}^{2}_{A}(X)\oplus \mathfrak {F}^{2}_{B}(X\oplus \varDelta \alpha )=\varDelta 0$ has no solution X, where $X\in \mathbb {F}^3_2$.

Proof

Refer to Appendix B.2.

The following theorem concerning the LBN can be similarly obtained.

Theorem 2

The LBN of bijective 8-bit S-boxes constructed using the unbalanced-Bridge is greater than 2 if and only if conditions i), ii), and iii) are all satisfied ($\lambda _\alpha $ and $\lambda _\beta $ below represent arbitrary masks where $wt(\lambda _\alpha )=wt(\lambda _\beta )=1$):

i)
For each $\lambda _\alpha ,\lambda _\beta \in \mathbb {F}^3_2$, at least one of the entry $(\lambda _\alpha ,\lambda _\beta )$ in LAT of $S_3$ and the entry $(0,\lambda _\beta ||0^{(2)})$ in LAT of $S^2_5$ is 0,
ii)
For each $\lambda _\alpha \in \mathbb {F}^5_2$ and $\lambda _\beta \in \mathbb {F}^3_2$, $\sum _{A\in \mathbb {F}_2^2} X\cdot Y=0$ where X is the entry $(\lambda _\beta ,\lambda _\alpha )$ in LAT of $\mathfrak {F}^1_A$ and Y is the entry $(\lambda _\beta ,\lambda _\beta ||0^{(2)})$ in LAT of $\mathfrak {F}^2_A$,
iii)
For each $\lambda _\alpha ,\lambda _\beta \in \mathbb {F}^5_2$ satisfying $\tau _3(\lambda _\beta )=0$, $\sum _{A\in \mathbb {F}_2^2} X\cdot Y=0$ where X is the entry $(0,\lambda _\alpha )$ in LAT of $\mathfrak {F}^1_A$ and Y is the entry $(0,\lambda _\beta )$ in LAT of $\mathfrak {F}^2_A$.

Proof

Refer to Appendix B.3.

Our $S_8$ search process is outlined as follows. First, we generated 3-bit and 5-bit S-box sets; for 3-bit S-boxes we ran an exhaustive search with AND, OR, XOR, and NOT instructions while restricting the number of nonlinear (resp. linear) operations to 3 (resp. 4), and for 5-bit S-boxes we ran an exhaustive search with AND, OR, and XOR instruction while restricting the number of nonlinear (resp. linear) operations to 4 (resp.7) with a differential uniformity of 8 or less. Second, we classified two 5-bit S-boxes and one 3-bit S-box that satisfy the conditions of Proposition 1, Theorems 1 and 2. During this process, the search space for the $S_8$ could be significantly reduced because the early abort technique was used to select $S_3$, $S_1^5$, and $S_2^5$. Third, we randomly chose the combination of $S_3$, $S_5^1$, and $S_5^2$ to verify whether the corresponding 8-bit S-box satisfies criterion 3, and no fixed point. During the search, we found more than 8,000 candidates for the $S_8$. We selected the one that leads to the best resistance to differential and linear attacks when combined with the linear layer of PIPO (refer to section 3.2). The final selected input/output values of the $S_8$ are presented in Table 7; its bitsliced implementation is given in Appendix C.2.

Table 1 compares the cryptographic properties and operations with those of other 8-bit S-boxes built from smaller 3 S-boxes.

Table 1. Comparison of bitslice 8-bit S-boxes with respect to cryptographic properties and number of operations

Full size table

3.2 R-Layer

To ensure efficient hardware and software implementations, we chose the R-layer to be a bit permutation which only uses bit-rotations in bytes. Its bitsliced implementation is given in Listing 1.1. During the design of the R-layer, the following criteria were considered.

1.
The number of rounds to achieve full diffusion – through which any input bit can affect the entire output bits – should be minimized.
2.
Combining the R-layer with the S-layer should enable the cipher to have the best resistance to DC and LC (among all bit permutations satisfying the first criterion).

To meet the first criterion, we adopted a bit permutation that enables PIPO to achieve full diffusion in two rounds by using rotation offsets $0\sim 7$ for all rows. The second criterion was taken into account when deciding which rotation to use for which row. We applied all 5,040(=7!) R-layers (except for all rotation equivalences) to the S-layer and selected one with the lowest probabilities of 6 and 7-round best differential and linear trails. Our analysis found that the selected combination of the S and R layers provides superior resistance to DC and LC than any other combinations even when other S-boxes among the aforementioned candidates were chosen. Note that most combinations of S and R layers candidates could not provide best 7-round differential and linear trails with less than probability $2^{-64}$.

4 Security Evaluation of PIPO

Table 2 shows the maximum numbers of rounds of characteristics and key recovery attacks that we found for each attack [3, 18,19,20, 40, 42, 46]. In addition to the cryptanalysis shown in Table 2, we conducted algebraic attack [23], integral attack [48], statistical saturation attack [25], invariant subspace attack [38, 39], nonlinear invariant attack [45] and slide attack [21], but they were not applied more effectively than DC or LC.

Table 2. The numbers of rounds of the best characteristics for each cryptanalysis

Full size table

One of the major design considerations for PIPO is to adopt a compact number of rounds (not enough rounds to guarantee security that is (too) high) based on thorough security analyses. We discovered that the best attacks applied to PIPO are DC and LC. An exhaustive search (based on the branch and bound technique [41]) for the DC and LC distinguishers was performed, in which the best reaches 6 rounds. Our analyses could recover the key of up to 9 and 11 rounds of PIPO-64/128 and PIPO-64/256, respectively.

5 Performance Evaluation of Higher-Order Masking Implementations of PIPO

Bitsliced implementations, initially proposed by Biham [17], are known to be efficient when applying Boolean masking, since secure S-box computations can be carried out in parallel [29,30,31, 34]. Thus, we used an S-box that can be efficiently implemented in this way, and only involves 11 nonlinear bitwise operations. The number of nonlinear operations is very important for Boolean masking schemes, since they have a quadratic complexity, i.e., $O(d^2)$, compared with the linear complexity, i.e., O(d), for other operations.

Table 3. Comparison of required ROM (bytes) for round constant, number of nonlinear bitwise operations, and permutation layers of round functions

Full size table

We constructed PIPO using higher-order masked S-layer and R-layer. There are several variations of ISW-AND [6, 7, 16], however, in this paper, we apply original ISW-AND. Since logical OR of two inputs a and b satisfies $a \vee b = (a \wedge b) \oplus a \oplus b$, thus, ISW-OR can be calculated by replacing logical AND with ISW-AND.

We compare our proposed PIPO with 64-bit block ciphers with 128-bit keys as shown in Table 3. All the ciphers compared were implemented using bitslice techniques, and only round constants were precomputed. There is no need to precompute round constants of PIPO, RoadRunneR, and PRESENT, because they are the i or $NR-i$ for $i=0,1,\cdots ,NR-1$, where NR is the number of rounds. Therefore, the required ROM for round constants is shown in Table 3. Only CRAFT used an additional 16-byte diffusion table for generating tweakeys. The same secure logical operations of PIPO were applied to implement higher-order masking structures.

Figure 4 shows the execution times for different numbers of shares on an 8-bit AVR processor. Especially, it shows that the more nonlinear operations, the greater increase in execution time with the number of shares, refer to Table 3. PIPO has the smallest number of nonlinear operations.

6 Performance Evaluation of Software and Hardware Implementations of PIPO

6.1 Software Implementations

The PIPO block cipher consists of permutation (R-layer) and S-box (S-layer) computations. The permutation routine is performed in 8-bit rotation operations, and 22 XOR, 6 AND, 5 OR, 1 COM and 24 MOV instructions are used to compute the S-box. This uses a total of 21 general-purpose registers: six for temporal storage, one for a zero constant, eight for a plaintext, four for address pointers and two for counter variables.

The developers of SIMON and SPECK have proposed a new metric to measure overall performance on low-end devices, namely RANK [11]. This is calculated as follows:

$$RANK = (10^6/CPB)/(ROM + 2 \times RAM).$$

In this metric, higher values of RANK correspond to better performance. Table 4 compares results for several block ciphers on an 8-bit AVR platform. Here, we used Atmel Studio 6.2, and compiled all implementations with optimization level 3. The target processor was an ATmega128 running at 8 MHz [4]. PIPO requires 320 bytes of code, 31 bytes of RAM and an execution time of 197 CPB. We used the RANK metric to compare the ciphers’ overall performances, finding that PIPO achieved the highest score among block ciphers with the same parameter lengths.

Table 4. Comparison of block ciphers on 8-bit AVR*

Full size table

6.2 Hardware Implementations

We implemented PIPO-64/128 and PIPO-64/256 in Verilog, and synthesized the proposed architectures using the Synopsys Design Compiler with 130 nm CMOS technology. Figure 5 shows the datapath of an area-optimized encryption-only PIPO block cipher, which performs one round per clock cycle (i.e., uses a 64-bit-wide datapath). The S-layer uses the same 8-bit S-box 8 times, whereas the R-layer is implemented in wiring. For lightweight key generation, we obtain the round key from the master key, directly. This feature avoids including the key storage. Our implementations require 13 and 17 clock cycles to encrypt a 64-bit plaintext with 128-bit and 256-bit keys, respectively.

Table 5. Area requirement of PIPO-64/128 and PIPO-64/256.

Full size table

Table 5 shows the areas required by PIPO-64/128 and PIPO-64/256. Most of the areas are taken up by the S-layer, in order to compute eight 8-bit S-boxes in parallel. The flip-flops are used for storing plaintext and counter, and the other areas consist of MUX and other logical operations.

Table 6 compares the results for several different block ciphers implemented as ASICs. Compared with the other block ciphers using the same parameter lengths, PIPO needs more gates than CRAFT, Piccolo and SIMON but its cycles per block are much lower, resulting in the highest figure of merit FOM (nano bits per clock cycle per GE squared [5, 32]). It is obvious that the high FOM of PIPO requires less energy and battery consumption.

Table 6. Comparison of round-based and area optimized implementations for block ciphers using 130 nm ASIC library.

Full size table

7 Conclusion

In this paper, we proposed a new lightweight versatile block cipher PIPO suitable for diverse resource-constrained environments. In particular, PIPO exhibits excellent performance in both side-channel protected and unprotected environments on 8-bit microcontrollers, and fast round-based hardware implementations as well. Furthermore, a thorough security analysis of PIPO was conducted.

Notes

1.
For a higher resistance against DC and LC, swapping bits is additionally conducted in the $S_8$ design (refer to section 3.2).

References

Adomnicai, A., et al.: Lilliput-AE: a new lightweight tweakable block cipher for authenticated encryption with associated data. Submission to the NIST Lightweight Cryptography Standardization Process (2019)
Google Scholar
Albrecht, M.R., Driessen, B., Kavun, E.B., Leander, G., Paar, C., Yalçın, T.: Block ciphers – focus on the linear layer (feat. PRIDE). In: Garay, J.A., Gennaro, R. (eds.) CRYPTO 2014. LNCS, vol. 8616, pp. 57–76. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44371-2_4
Chapter Google Scholar
Aoki, K., Sasaki, Yu.: Preimage attacks on one-block MD4, 63-step MD5 and more. In: Avanzi, R.M., Keliher, L., Sica, F. (eds.) SAC 2008. LNCS, vol. 5381, pp. 103–119. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04159-4_7
Chapter Google Scholar
Atmel Corporation, ATmega128(L) Datasheet. www.microchip.com/wwwproducts/en/ATmega128. Accessed 23 Apr 2019
Badel, S., et al.: ARMADILLO: a multi-purpose cryptographic primitive dedicated to hardware. In: Mangard, S., Standaert, F.-X. (eds.) CHES 2010. LNCS, vol. 6225, pp. 398–412. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15031-9_27
Chapter Google Scholar
Barthe, G., Dupressoir, F., Faust, S., Grégoire, B., Standaert, F.-X., Strub, P.-Y.: Parallel implementations of masking schemes and the bounded moment leakage model. In: Coron, J.-S., Nielsen, J.B. (eds.) EUROCRYPT 2017. LNCS, vol. 10210, pp. 535–566. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56620-7_19
Chapter Google Scholar
Battistello, A., Coron, J.-S., Prouff, E., Zeitoun, R.: Horizontal side-channel attacks and countermeasures on the ISW masking scheme. In: Gierlichs, B., Poschmann, A.Y. (eds.) CHES 2016. LNCS, vol. 9813, pp. 23–39. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53140-2_2
Chapter MATH Google Scholar
Banik, S., Pandey, S.K., Peyrin, T., Sasaki, Yu., Sim, S.M., Todo, Y.: GIFT: a small present. In: Fischer, W., Homma, N. (eds.) CHES 2017. LNCS, vol. 10529, pp. 321–345. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66787-4_16
Chapter Google Scholar
Banik, S., et al.: Midori: a block cipher for low energy. In: Iwata, T., Cheon, J.H. (eds.) ASIACRYPT 2015. LNCS, vol. 9453, pp. 411–436. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48800-3_17
Chapter Google Scholar
Baysal, A., Şahin, S.: RoadRunneR: a small and fast bitslice block cipher for low cost 8-bit processors. In: Güneysu, T., Leander, G., Moradi, A. (eds.) LightSec 2015. LNCS, vol. 9542, pp. 58–76. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-29078-2_4
Chapter MATH Google Scholar
Beaulieu, R., Shors, D., Smith, J., Treatman-Clark, S., Weeks, B., Wingers, L.: The Simon and Speck block ciphers on AVR 8-bit microcontrollers. In: Eisenbarth, T., Öztürk, E. (eds.) LightSec 2014. LNCS, vol. 8898, pp. 3–20. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16363-5_1
Chapter MATH Google Scholar
Beaulieu, R., Shors, D., Smith, J., Treatman-Clark, S., Weeks, B., Wingers, L.: The SIMON and SPECK families of lightweight block ciphers, Cryptology ePrint Archive (2013)
Google Scholar
Beierle, C., Leander, G., Moradi, A., Rasoolzadeh, S.: CRAFT: lightweight tweakable block cipher with efficient protection against DFA attacks. IACR Trans. Symmetric Cryptol. 2019(1), 5–45 (2019)
Article Google Scholar
Beierle, C., et al.: The SKINNY family of block ciphers and its low-latency variant MANTIS. In: Robshaw, M., Katz, J. (eds.) CRYPTO 2016. LNCS, vol. 9815, pp. 123–153. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53008-5_5
Chapter Google Scholar
Bilgin, B., De Meyer, L., Duval, S., Levi, I., Standaert, F.X.: Low AND depth and efficient inverses: a guide on s-boxes for low-latency masking. IACR Trans. Symmetric Cryptol. 2020(1), 144–184 (2020)
Article Google Scholar
Belaïd, S., Benhamouda, F., Passelègue, A., Prouff, E., Thillard, A., Vergnaud, D.: Randomness complexity of private circuits for multiplication. In: Fischlin, M., Coron, J.-S. (eds.) EUROCRYPT 2016. LNCS, vol. 9666, pp. 616–648. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49896-5_22
Chapter Google Scholar
Biham, E.: A fast new DES implementation in software. In: Biham, E. (ed.) FSE 1997. LNCS, vol. 1267, pp. 260–272. Springer, Heidelberg (1997). https://doi.org/10.1007/BFb0052352
Chapter MATH Google Scholar
Biham, E., Biryukov, A., Shamir, A.: Cryptanalysis of skipjack reduced to 31 rounds using impossible differentials. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 12–23. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48910-X_2
Chapter Google Scholar
Biham, E., Dunkelman, O., Keller, N.: The rectangle attack — rectangling the serpent. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, pp. 340–357. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44987-6_21
Chapter Google Scholar
Biham, E., Shamir, A.: Differential cryptanalysis of DES-like cryptosystems. In: Menezes, A.J., Vanstone, S.A. (eds.) CRYPTO 1990. LNCS, vol. 537, pp. 2–21. Springer, Heidelberg (1991). https://doi.org/10.1007/3-540-38424-3_1
Chapter Google Scholar
Biryukov, A., Wagner, D.: Advanced slide attacks. In: Preneel, B. (ed.) EUROCRYPT 2000. LNCS, vol. 1807, pp. 589–606. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-45539-6_41
Chapter Google Scholar
Bogdanov, A., et al.: PRESENT: an ultra-lightweight block cipher. In: Paillier, P., Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 450–466. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74735-2_31
Boura, C., Canteaut, A., De Cannière, C.: Higher-order differential properties of Keccak and Luffa. In: Joux, A. (ed.) FSE 2011. LNCS, vol. 6733, pp. 252–269. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21702-9_15
Chapter Google Scholar
Borghoff, J., et al.: PRINCE – a low-latency block cipher for pervasive computing applications. In: Wang, X., Sako, K. (eds.) ASIACRYPT 2012. LNCS, vol. 7658, pp. 208–225. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34961-4_14
Chapter Google Scholar
Collard, B., Standaert, F.-X.: A statistical saturation attack against the block cipher PRESENT. In: Fischlin, M. (ed.) CT-RSA 2009. LNCS, vol. 5473, pp. 195–210. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00862-7_13
Chapter Google Scholar
Dinu, D., Biryukov, A., Großschädl, J., Khovratovich, D., Corre, Y.L., Perrin, L.: FELICS-fair evaluation of lightweight cryptographic systems. In: NIST Workshop on Lightweight Cryptography (2015)
Google Scholar
Engels, S., Kavun, E.B., Paar, C., Yalçin, T., Mihajloska, H.: A non-linear/linear instruction set extension for lightweight ciphers. In: IEEE 21st Symposium on Computer Arithmetic, pp. 67–75 (2013)
Google Scholar
Gérard, B., Grosso, V., Naya-Plasencia, M., Standaert, F.-X.: Block ciphers that are easier to mask: how far can we go? In: Bertoni, G., Coron, J.-S. (eds.) CHES 2013. LNCS, vol. 8086, pp. 383–399. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40349-1_22
Chapter Google Scholar
Goudarzi, D., Journault, A., Rivain, M., Standaert, F.-X.: Secure multiplication for bitslice higher-order masking: optimisation and comparison. In: Fan, J., Gierlichs, B. (eds.) COSADE 2018. LNCS, vol. 10815, pp. 3–22. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-89641-0_1
Chapter MATH Google Scholar
Goudarzi, D., Rivain, M.: How fast can higher-order masking be in software? In: Coron, J.-S., Nielsen, J.B. (eds.) EUROCRYPT 2017. LNCS, vol. 10210, pp. 567–597. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56620-7_20
Chapter MATH Google Scholar
Grosso, V., Leurent, G., Standaert, F.-X., Varıcı, K.: LS-designs: bitslice encryption for efficient masked software implementations. In: Cid, C., Rechberger, C. (eds.) FSE 2014. LNCS, vol. 8540, pp. 18–37. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46706-0_2
Chapter MATH Google Scholar
Guo, J., Peyrin, T., Poschmann, A., Robshaw, M.: The LED block cipher. In: Preneel, B., Takagi, T. (eds.) CHES 2011. LNCS, vol. 6917, pp. 326–341. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23951-9_22
Chapter Google Scholar
Hong, D., et al.: HIGHT: a new block cipher suitable for low-resource device. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS, vol. 4249, pp. 46–59. Springer, Heidelberg (2006). https://doi.org/10.1007/11894063_4
Chapter Google Scholar
Journault, A., Standaert, F.-X.: Very high order masking: efficient implementation and security evaluation. In: Fischer, W., Homma, N. (eds.) CHES 2017. LNCS, vol. 10529, pp. 623–643. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66787-4_30
Chapter MATH Google Scholar
Karpman, P., Grégoire, B.: The littlun s-box and the fly block cipher. In: Lightweight Cryptography Workshop (2016)
Google Scholar
Kim, J., Lee, C., Sung, J., Hong, S., Lee, S., Lim, J.: Seven new block cipher structures with provable security against differential cryptanalysis. IEICE Trans. 91-A(10), 3047–3058 (2008)
Google Scholar
Kocher, P.C.: Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 104–113. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-68697-5_9
Chapter Google Scholar
Leander, G., Abdelraheem, M.A., AlKhzaimi, H., Zenner, E.: A cryptanalysis of PRINTcipher: the invariant subspace attack. In: Rogaway, P. (ed.) CRYPTO 2011. LNCS, vol. 6841, pp. 206–221. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22792-9_12
Chapter Google Scholar
Leander, G., Minaud, B., Rønjom, S.: A generic approach to invariant subspace attacks: cryptanalysis of Robin, iSCREAM and Zorro. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015. LNCS, vol. 9056, pp. 254–283. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46800-5_11
Chapter Google Scholar
Matsui, M.: Linear cryptanalysis method for DES cipher. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 386–397. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-48285-7_33
Chapter Google Scholar
Matsui, M.: On correlation between the order of S-boxes and the strength of DES. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 366–375. Springer, Heidelberg (1995). https://doi.org/10.1007/BFb0053451
Chapter Google Scholar
Sasaki, Yu., Aoki, K.: Finding preimages in full MD5 faster than exhaustive search. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 134–152. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01001-9_8
Chapter Google Scholar
Shibutani, K., Isobe, T., Hiwatari, H., Mitsuda, A., Akishita, T., Shirai, T.: Piccolo: an ultra-lightweight blockcipher. In: Preneel, B., Takagi, T. (eds.) CHES 2011. LNCS, vol. 6917, pp. 342–357. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23951-9_23
Chapter Google Scholar
Shirai, T., Shibutani, K., Akishita, T., Moriai, S., Iwata, T.: The 128-bit blockcipher CLEFIA (extended abstract). In: Biryukov, A. (ed.) FSE 2007. LNCS, vol. 4593, pp. 181–195. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74619-5_12
Chapter Google Scholar
Todo, Y., Leander, G., Sasaki, Y.: Nonlinear invariant attack - practical attack on full SCREAM, iSCREAM, and Midori64. In: Cheon, J.H., Takagi, T. (eds.) ASIACRYPT 2016. LNCS, vol. 10032, pp. 3–33. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53890-6_1
Wagner, D.: The boomerang attack. In: Knudsen, L. (ed.) FSE 1999. LNCS, vol. 1636, pp. 156–170. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48519-8_12
Worthman, E.: ChaoLogix: integrated security. Semiconductor Eng. (2015)
Google Scholar
Z’aba, M.R., Raddum, H., Henricksen, M., Dawson, E.: Bit-pattern based integral attack. In: Nyberg, K. (ed.) FSE 2008. LNCS, vol. 5086, pp. 363–381. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-71039-4_23
Chapter Google Scholar
Zhang, W., Bao, Z., Lin, D., Rijmen, V., Yang, B., Verbauwhede, I.: RECTANGLE: a bit-slice lightweight block cipher suitable for multiple platforms. Sci. China Inf. Sci. 58(12), 1–15 (2015)
Google Scholar

Download references

Author information

Authors and Affiliations

Department of Financial Information Security, Kookmin University, Seoul, Republic of Korea
Hangi Kim, Yongjin Jeon, Giyoon Kim, Jongsung Kim, Bo-Yeon Sim & Dong-Guk Han
Department of Information Security, Cryptology, and Mathematics, Kookmin University, Seoul, Republic of Korea
Jongsung Kim & Dong-Guk Han
Division of IT Convergence Engineering, Hansung University, Seoul, Republic of Korea
Hwajeong Seo
School of Cyber Security, Korea University, Seoul, Republic of Korea
Seonggyeom Kim & Seokhie Hong
Department of Mathematics, University of Seoul, Seoul, Republic of Korea
Jaechul Sung
Department of Information Technology and Engineering, Jeonbuk National University, Jeonju, Republic of Korea
Deukjo Hong

Authors

Hangi Kim
View author publications
You can also search for this author in PubMed Google Scholar
Yongjin Jeon
View author publications
You can also search for this author in PubMed Google Scholar
Giyoon Kim
View author publications
You can also search for this author in PubMed Google Scholar
Jongsung Kim
View author publications
You can also search for this author in PubMed Google Scholar
Bo-Yeon Sim
View author publications
You can also search for this author in PubMed Google Scholar
Dong-Guk Han
View author publications
You can also search for this author in PubMed Google Scholar
Hwajeong Seo
View author publications
You can also search for this author in PubMed Google Scholar
Seonggyeom Kim
View author publications
You can also search for this author in PubMed Google Scholar
Seokhie Hong
View author publications
You can also search for this author in PubMed Google Scholar
Jaechul Sung
View author publications
You can also search for this author in PubMed Google Scholar
Deukjo Hong
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jongsung Kim .

Editor information

Editors and Affiliations

Jeonbuk National University, Jeonju-si, Korea (Republic of)
Deukjo Hong

Appendices

A Test Vectors

The following test vectors are represented in big endian representation.

PIPO-64/128
$\bullet $:

Secret key: 0x6DC416DD_779428D2_7E1D20AD_2E152297

$\bullet $:

Plaintext: 0x098552F6_1E270026

$\bullet $:

Ciphertext: 0x6B6B2981_AD5D0327
PIPO-64/256
$\bullet $:

Secret key:0x009A3AA4_76A96DB5_54A71206_26D15633_6DC416DD _779428D2_7E1D20AD_2E152297

$\bullet $:

Plaintext: 0x098552F6_1E270026

$\bullet $:

Ciphertext: 0x816DAE6F_B6523889

B Proofs of Proposition and Theorems

1.1 B.1 Proof of Proposition 1

($\Rightarrow $)

If $S_{3}$ or $S^{1}_{5}$ is non-bijective, there are two different inputs $X_L||X_R,X_L'||X_R'$ satisfying $(S^{1}_{5}(X_L),S_{3}(X_R))=(S^{1}_{5}(X_L'),S_{3}(X_R'))$. Then, it is easy to see that $S_8(X_L||X_R)=S_8(X_L'||X_R')$, and thus two conditions i) and ii) should hold. Assume that the $f_{y}$ in condition iii) is non-bijective for some $y\in \mathbb {F}^{3}_{2}$. Then there should be two different inputs $a,a'$ satisfying $f_{y}(a)=f_{y}(a')$. It induces $\tau _2'(S^{2}_{5}(y||a))=\tau _2'(S^{2}_{5}(y||a'))$. On the other hand, we can take a pair $X_{R},X_{R}'$ satisfying $\tau _3(S^{2}_{5}(y||a))\oplus S_3(X_R)=\tau _3(S^{2}_{5}(y||a'))\oplus S_3(X_R')$, and thus $C_R=C_R'$. Combining the above two equations yields $S^{2}_{5}(y||a)\oplus (S_3(X_R)||0^{(2)})=S^{2}_{5}(y||a')\oplus (S_3(X_R')||0^{(2)})$. And, we take a pair $X_L,X_L'$ satisfying $S^{1}_{5}(X_{L})=(y\oplus S_{3}(X_R))||a$ and $S^{1}_{5}(X_{L}')=(y\oplus S_{3}(X_R'))||a'$. Since $a\ne a'$, we have $X_L \ne X_L'$ satisfying $S_8(X_L||X_R)=S_8(X_L'||X_R')$. Therefore, condition iii) should also hold.

($\Leftarrow $)

Assume that $X_L\ne X_L'$ and $X_R=X_R'$. If $\tau _3(S^1_5(X_L))\ne \tau _3(S^1_5(X_L'))$, then $C_L(X_L,X_R)\ne C_L(X_L',X_R')$. Let $\tau _3(S^1_5(X_L))=\tau _3(S^1_5(X_L'))$. It leads to $C_L(X_L,X_R)$ $= C_L(X_L',X_R')$, and $\tau _2'(S^1_5(X_L))\ne \tau _2'(S^1_5(X_L'))$. Because of condition iii), $\tau _2(C_R(X_L,$ $X_R))\ne \tau _2(C_R(X_L',X_R'))$. Assume that $X_L= X_L'$ and $X_R\ne X_R'$. Since $S_3(X_R)\ne S_3(X_R')$, $C_L(X_L,X_R)\ne C_L(X_L',X_R')$. Assume that $X_L\ne X_L'$, $X_R\ne X_R'$. If $C_L(X_L,X_R)= C_L(X_L',X_R')$, either $\tau _2'(S^1_5(X_L))\ne \tau _2'(S^1_5(X_L'))$ or $\tau _2'(S^1_5(X_L))=\tau _2'(S^1_5(X_L'))$. The former case leads to $\tau _2(C_R(X_L,X_R))\ne \tau _2(C_R(X_L',X_R'))$, and the latter case leads to $\tau _3'(C_R(X_L,X_R))\ne \tau _3'(C_R(X_L',X_R'))$. Therefore, the 8-bit S-box is bijective. $\blacksquare $

1.2 B.2 Proof of Theorem 1

We define the following notation for ease of expression.

$Y=S^1_5(X_L)$, $Z=S^1_5(X_L)\oplus (S_3(X_R)||0^{(2)})$, $A=\tau _2'(Y)=\tau _2'(Z)$, $Y=Y'||A$, $Z=Z'||A$.

Then, the expression of the $C_L$ and $C_R$ is

$C_L(X_L,X_R)=\tau _3(Y)\oplus S_3(X_R)=\tau _3(Z)$,

$C_R(X_L,X_R)=\rho _c(S^2_5(Y\oplus (S_3(X_R)||0^{(2)})))\oplus S_3(X_R)=\rho _c(Z)\oplus S_3(X_R)$.

For convenience, we do not write 0 paddings on MSBs of smaller-bit data operating with larger-bit data; here, the 5-bit operand $S_3(X_{R})$ represents $0^{(2)}||S_3(X_{R})$.

: It happens if and only if there exists at least one $(X_{L},X_{R})$ satisfying both $C_{L}(X_{L},X_{R})\oplus C_{L}(X_{L},X_{R}\oplus \varDelta a)=\varDelta 0$ and $C_{R}(X_{L},X_{R})\oplus C_{R}(X_{L},X_{R}\oplus \varDelta a)=\varDelta c$. The first equation is expressed as

$$\begin{aligned} \tau _3(Y)\oplus S_3(X_R)\oplus \tau _3(Y)\oplus S_3(X_R\oplus \varDelta a)=S_{3}(X_{R})\oplus S_{3}(X_{R}\oplus \varDelta a)=\varDelta 0. \end{aligned}$$

Since $S_{3}$ is bijective, the $(0^{(5)}||\varDelta a,0^{(3)}||\varDelta c)$ case dose not happen.

: It happens if and only if there exists at least one $(X_L,X_R)$ satisfying both $C_{L}(X_L,X_R)\oplus C_{L}(X_L,X_R\oplus \varDelta a)=\varDelta d$ and $C_{R}(X_L,X_R)\oplus C_{R}(X_L,X_R\oplus \varDelta a)=\varDelta 0$. The first equation is expressed as

$$\begin{aligned} \tau _3(Y)\oplus S_3(X_R)\oplus \tau _3(Y)\oplus S_3(X_R\oplus \varDelta a)=S_{3}(X_{R})\oplus S_{3}(X_{R}\oplus \varDelta a)=\varDelta d. \end{aligned}$$

(1)

Similarly, the second equation $C_{R}(X_L,X_R)\oplus C_{R}(X_L,X_R\oplus \varDelta a)=\varDelta 0$ is expressed as

$$\begin{aligned} \rho _c({}&S^{2}_{5}(Y\oplus (S_{3}(X_{R})||0^{(2)})))\oplus S_{3}(X_{R})\\ {}&\oplus \rho _c(S^{2}_{5}(Y\oplus (S_{3}(X_{R}\oplus \varDelta a)||0^{(2)})))\oplus S_{3}(X_{R}\oplus \varDelta a)\\ {}&=\rho _c(S^{2}_{5}(Y\oplus (S_{3}(X_{R})||0^{(2)})))\oplus \rho _c(S^{2}_{5}(Y\oplus ((S_{3}(X_{R})\oplus \varDelta d)||0^{(2)})))\oplus \varDelta d=\varDelta 0. \end{aligned}$$

By applying $\rho _c^{-1}$, we have

$$\begin{aligned} S^{2}_{5}(Y\oplus (S_{3}(X_{R})||0^{(2)}))\oplus S^{2}_{5}(Y\oplus ((S_{3}(X_{R})\oplus \varDelta d)||0^{(2)}))=\varDelta d||0^{(2)}. \end{aligned}$$

By applying Z, we obtain

$$\begin{aligned} S^{2}_{5}(Z)\oplus S^{2}_{5}(Z\oplus (\varDelta d||0^{(2)}))=\varDelta d||0^{(2)}. \end{aligned}$$

(2)

Since the function $(X_L,X_R)\mapsto (Z,X_R)$ is bijective, the $(0^{(5)}||\varDelta a,\varDelta d||0^{(5)})$ case does not happen if and only if there is no $(Z,X_R)$ satisfying both Eqs. (1) and (2), which is equivalent to condition i) where $\varDelta \alpha =\varDelta a$, $\varDelta \beta =\varDelta d$.

: It happens if and only if there exists at least one $(X_L,X_R)$ satisfying both $C_{L}(X_L,X_R)\oplus C_{L}(X_L\oplus \varDelta b,X_R)=\varDelta 0$ and $C_{R}(X_L,X_R)\oplus C_{R}(X_L\oplus \varDelta b,X_R)=\varDelta c$. The first equation is expressed as

$$\begin{aligned} \tau _3(S^{1}_{5}(X_{L}))\oplus S_{3}(X_{R})\oplus \tau _3(S^{1}_{5}(X_{L}\oplus \varDelta b))\oplus S_{3}(X_{R})=\tau _3(S^{1}_{5}(X_{L}))\oplus \tau _3(S^{1}_{5}(X_{L}\oplus \varDelta b))=\varDelta 0. \end{aligned}$$

Since $S^{1}_{5}$ is bijective, for a non-zero difference $\varDelta \omega \in \mathbb {F}^{2}_{2}$, the above equation becomes

$$\begin{aligned} S^{1}_{5}(X_{L})\oplus S^{1}_{5}(X_{L}\oplus \varDelta b)=\varDelta \omega . \end{aligned}$$

The equation is rewritten as

$$\begin{aligned} S^{1}_{5}(X_{L}\oplus \varDelta b)=S^{1}_{5}(X_{L})\oplus \varDelta \omega . \end{aligned}$$

By applying $(S^1_5)^{-1}$, we obtain

$$\begin{aligned} X_{L}\oplus \varDelta b=(S^{1}_{5})^{-1}(S^{1}_{5}(X_{L})\oplus \varDelta \omega ). \end{aligned}$$

By using the variables $Y,Y'$ and A, we have

$$\begin{aligned} (S^{1}_{5})^{-1}(Y)\oplus (S^{1}_{5})^{-1}(Y\oplus \varDelta \omega )=\varDelta b, \end{aligned}$$

$$\begin{aligned} (S^{1}_{5})^{-1}(Y'||A)\oplus (S^{1}_{5})^{-1}(Y'||(A\oplus \varDelta \omega ))=\varDelta b. \end{aligned}$$

(3)

And the second equation $C_{R}(X_L,X_R)\oplus C_{R}(X_L\oplus \varDelta b,X_R)=\varDelta c$ is expressed as

$$\begin{aligned} \rho _c({}&S^{2}_{5}(S^{1}_{5}(X_{L})\oplus (S_{3}(X_{R})||0^{(2)})))\oplus S_{3}(X_{R})\\ {}&\oplus \rho _c(S^{2}_{5}(S^{1}_{5}(X_{L}\oplus \varDelta b)\oplus (S_{3}(X_{R})||0^{(2)})))\oplus S_{3}(X_{R})\\ {}&= \rho _c( S^{2}_{5}(Z))\oplus \rho _c(S^{2}_{5}(Z\oplus \varDelta \omega ))=\varDelta c. \end{aligned}$$

By applying $\rho _c^{-1}$, we obtain

$$\begin{aligned} S^{2}_{5}(Z)\oplus S^{2}_{5}(Z\oplus \varDelta \omega )=\rho _c^{-1}(\varDelta c). \end{aligned}$$

This gives the equation

$$\begin{aligned} S^{2}_{5}(Z'||A)\oplus S^{2}_{5}(Z'||(A\oplus \varDelta \omega ))=\rho _c^{-1}(\varDelta c). \end{aligned}$$

(4)

For each A, the above Eqs. (3) and (4) are equivalent to

$$\begin{aligned} \mathfrak {F}^{1}_{A}(Y')\oplus \mathfrak {F}^{1}_{A\oplus \varDelta \omega }(Y')=\varDelta b, \end{aligned}$$

(5)

$$\begin{aligned} \mathfrak {F}^{2}_{A}(Z')\oplus \mathfrak {F}^{2}_{A\oplus \varDelta \omega }(Z')=\rho ^{-1}_c(\varDelta c). \end{aligned}$$

(6)

Here, $\varDelta \omega $ is arbitrary nonzero 2-bit difference, and thus we can define $B=A\oplus \varDelta \omega $ i.e., $B\ne A$. Since the function $(X_L,X_R)\mapsto (Y',A,Z')$ is bijective, the $(\varDelta b||0^{(3)},0^{(3)}||\varDelta c)$ case does not happen if and only if there is no $(Y',A,Z')$ satisfying both Eqs. (5) and (6) for all $B(\ne A)$, which is equivalent to condition ii) where $\varDelta \alpha =\varDelta b$, $\varDelta \beta =\rho ^{-1}_c(\varDelta c)$.

: It happens if and only if there exists at least one $(X_L,X_R)$ satisfying both $C_{L}(X_L,X_R)\oplus C_{L}(X_L\oplus \varDelta b,X_R)=\varDelta d$ and $C_{R}(X_L,X_R)\oplus C_{R}(X_L\oplus \varDelta b,X_R)=\varDelta 0$. The first equation is expressed as

$$\begin{aligned} \tau _3(S^{1}_{5}(X_{L}))\oplus S_{3}(X_{R})\oplus \tau _3(S^{1}_{5}(X_{L}\oplus \varDelta b))\oplus S_{3}(X_{R})=\tau _3(S^{1}_{5}(X_{L}))\oplus \tau _3(S^{1}_{5}(X_{L}\oplus \varDelta b))=\varDelta d. \end{aligned}$$

For a difference $\varDelta \omega \in \mathbb {F}^{2}_{2}$, the above equation becomes

$$\begin{aligned} S^{1}_{5}(X_{L})\oplus S^{1}_{5}(X_{L}\oplus \varDelta b)=\varDelta d||\varDelta \omega . \end{aligned}$$

As in Eq. (3), we obtain

$$\begin{aligned} (S^{1}_{5})^{-1}(Y'||A)\oplus (S^{1}_{5})^{-1}((Y'\oplus \varDelta d)||(A\oplus \varDelta \omega ))=\varDelta b. \end{aligned}$$

(7)

And the second equation is expressed as

$$\begin{aligned} \rho _c({}&S^{2}_{5}(S^{1}_{5}(X_{L})\oplus (S_{3}(X_{R})||0^{(2)})))\oplus S_{3}(X_{R})\\ {}&\oplus \rho _c(S^{2}_{5}(S^{1}_{5}(X_{L}\oplus \varDelta b)\oplus (S_{3}(X_{R})||0^{(2)})))\oplus S_{3}(X_{R})\\ {}&= \rho _c( S^{2}_{5}(Z))\oplus \rho _c(S^{2}_{5}(Z\oplus (\varDelta d||\varDelta \omega )))=\varDelta 0. \end{aligned}$$

Clearly,

$$\begin{aligned} S^{2}_{5}(Z)\oplus S^{2}_{5}(Z\oplus (\varDelta d||\varDelta \omega ))=\varDelta 0. \end{aligned}$$

It becomes

$$\begin{aligned} S^{2}_{5}(Z'||A)\oplus S^{2}_{5}((Z'\oplus \varDelta d)||(A\oplus \varDelta \omega ))=\varDelta 0. \end{aligned}$$

(8)

For each A, the above Eqs. (7) and (8) are equivalent to

$$\begin{aligned} \mathfrak {F}^{1}_{A}(Y')\oplus \mathfrak {F}^{1}_{A\oplus \varDelta \omega }(Y'\oplus \varDelta d)=\varDelta b, \end{aligned}$$

(9)

$$\begin{aligned} \mathfrak {F}^{2}_{A}(Z')\oplus \mathfrak {F}^{2}_{A\oplus \varDelta \omega }(Z'\oplus \varDelta d)=\varDelta 0. \end{aligned}$$

(10)

Similarly to the case above, we define $B=A\oplus \varDelta \omega $. In this time, B can be either A or not, since $\varDelta \omega $ can be a zero difference. The $(\varDelta b||0^{(3)},\varDelta d||0^{(5)})$ case does not happen if and only if there is no $(Y',A,Z')$ satisfying both Eqs. (9) and (10) for all B, which is equivalent to condition iii) where $\varDelta \alpha =\varDelta d$, $\varDelta \beta =\varDelta b$.$\blacksquare $

1.3 B.3 Proof of Theorem 2

We use $Y,Y',Z,Z'$, and A defined in proof B.2.

: This case is expressed as $X_{R}\bullet \lambda _a=C_{R}(X_{L},X_{R})\bullet \lambda _c$. It follows $X_{R}\bullet \lambda _a=(\rho _c(S^{2}_{5}(S^{1}_{5}(X_{L})\oplus (S_{3}(X_{R})||0^{(2)})))\oplus S_{3}(X_{R}))\bullet \lambda _c$. By applying the variable Z, the equation becomes $X_R\bullet \lambda _a\oplus S_{3}(X_{R})\bullet \lambda _c = \rho _c(S^{2}_{5}(Z))\bullet \lambda _c$. Note that the function $(X_{L},X_{R})\mapsto (Z,X_R)$ is bijective. Suppose $\tau _2(\lambda _c)\ne 0$. Then, the equation becomes $X_R\bullet \lambda _a=\rho _c(S^{2}_{5}(Z))\bullet \lambda _c$. This should have zero bias because the equation $X_R\bullet \lambda _a=0$ has zero bias, and Z and $X_R$ are independent variables. Now, suppose $\tau _2(\lambda _c)=0$. The equation $X_R\bullet \lambda _a\oplus S_{3}(X_{R})\bullet \lambda _c = \rho _c(S^{2}_{5}(Z))\bullet \lambda _c$ has zero bias if and only if at least one of the entries $(\lambda _a,\tau _3'(\lambda _c))$ in LAT of $S_3$ and $(0,\tau _3'(\lambda _c)||0^{(2)})$ in LAT of $S_5^2$ is zero. This is due to the fact that Z is independent of $X_R$. It is equivalent to condition i)

: This case is expressed as $X_{R}\bullet \lambda _a=C_{L}(X_{L},X_{R})\bullet \lambda _d$. It follows $X_{R}\bullet \lambda _a=(\tau _3(S^{1}_{5}(X_{L}))\oplus S_{3}(X_{R}))\bullet \lambda _d$. The equation becomes $X_R\bullet \lambda _a = \tau _3(Z)\bullet \lambda _d$ by using the definition of Z. So, this case has zero bias, because $\tau _3(Z)$ is independent of $X_R$.

: This case is expressed as $X_{L}\bullet \lambda _b=C_{R}(X_{L},X_{R})\bullet \lambda _c$. It follows $X_{L}\bullet \lambda _b=(\rho _c(S^{2}_{5}(S^{1}_{5}(X_{L})\oplus (S_{3}(X_{R})||0^{(2)})))\oplus S_{3}(X_{R}))\bullet \lambda _c$. We can replace the equation to

$$\begin{aligned} X_{L}{}&\bullet \lambda _b\oplus S^{1}_{5}(X_{L})\bullet \lambda _t\\ {}&=(S^{1}_{5}(X_{L})\oplus (S_{3}(X_{R})||0^{(2)}))\bullet \lambda _t\oplus \rho _c(S^{2}_{5}(S^{1}_{5}(X_{L})\oplus (S_{3}(X_{R})||0^{(2)})))\bullet \lambda _c, \end{aligned}$$

where $\lambda _t=\tau _3'(\lambda _c)||0^{(2)}$ (here, $0^{(2)}$ can be replaced by 01, 10 or $1^{(2)}$). By applying the variables of Y and Z, this becomes equivalent to the following equations

$$\begin{aligned} (S^{1}_5)^{-1}(Y)\bullet \lambda _b\oplus Y\bullet \lambda _t=Z\bullet \lambda _t\oplus (\rho _c(S^{2}_{5}(Z)))\bullet \lambda _c, \end{aligned}$$

$$\begin{aligned} (S^{1}_5)^{-1}(Y'||A)\bullet \lambda _b\oplus (Y'||A)\bullet \lambda _t=(Z'||A)\bullet \lambda _t\oplus (\rho _c(S^{2}_{5}(Z'||A)))\bullet \lambda _c. \end{aligned}$$

For all $A\in \mathbb {F}^2_2$, we have

$$\begin{aligned} \mathfrak {F}^{1}_A(Y')\bullet \lambda _b\oplus (Y'||A)\bullet \lambda _t=(Z'||A)\bullet \lambda _t\oplus (\rho _c(\mathfrak {F}^{2}_{A}(Z')))\bullet \lambda _c. \end{aligned}$$

Clearly,

$$\begin{aligned} \mathfrak {F}^{1}_A(Y')\bullet \lambda _b\oplus Y'\bullet \tau _3(\lambda _t)=Z'\bullet \tau _3(\lambda _t)\oplus (\rho _c(\mathfrak {F}^{2}_{A}(Z')))\bullet \lambda _c. \end{aligned}$$

A collection of $(Y',Z')$ that satisfies the above equation is equivalent to

$$\begin{aligned} \{Y'|{}&0= \mathfrak {F}^{1}_A(Y')\bullet \lambda _b\oplus Y'\bullet \tau _3(\lambda _t)\} \times \{Z'|0=Z'\bullet \tau _3(\lambda _t)\oplus (\rho _c(\mathfrak {F}^{2}_{A}(Z')))\bullet \lambda _c\}\\ {}&\cup \{Y'|1= \mathfrak {F}^{1}_A(Y')\bullet \lambda _b\oplus Y'\bullet \tau _3(\lambda _t)\} \times \{Z'|1=Z'\bullet \tau _3(\lambda _t)\oplus (\rho _c(\mathfrak {F}^{2}_{A}(Z')))\bullet \lambda _c\} \end{aligned}$$

Then the number of the above set is $(4+a_A)(4+b_A)+(4-a_A)(4-b_A)=32+2a_Ab_A$, where $a_A$ and $b_A$ are the entries of $(\tau _3(\lambda _t),\lambda _b)$ and $(\tau _3(\lambda _t),\rho _c^{-1}(\lambda _c))$ in LAT of $\mathfrak {F}_A^1$ and $\mathfrak {F}_A^2$, respectively. The above equation has zero bias if and only if

$$\begin{aligned} \sum _{A\in \mathbb {F}_2^2} (32+2a_Ab_A) = 2(\sum _{A\in \mathbb {F}_2^2} a_Ab_A) + 128 = 128 \end{aligned}$$

It leads to $\sum _{A\in \mathbb {F}_2^2} a_Ab_A=0$. Because $\tau _3(\lambda _t)=\tau _3'(\lambda _c)$, it is equivalent to condition ii) (when $\tau _3'(\lambda _c)\ne 0$) and condition iii) (when $\tau _3'(\lambda _c)=0$).

: This case is expressed as $X_{L}\bullet \lambda _b=C_{L}(X_{L},X_{R})\bullet \lambda _d$. It follows $X_{L}\bullet \lambda _b=(\tau _3(S^{1}_{5}(X_{L}))\oplus S_{3}(X_{R}))\bullet \lambda _d$. The equation becomes $X_L\bullet \lambda _b = Z'\bullet \lambda _d$ by using the definition of $Z'$. We note that the function $(X_{L},X_{R})\mapsto (X_L,Z')$ is bijective, and $X_L$ and $Z'$ are independent variables. So, this equation has zero bias.$\blacksquare $

C 8-bit S-box of PIPO, $S_8$

1.1 C.1 Table of the $S_8$

Table 7 shows the $S_{8}$.

Table 7. 8-bit S-box of PIPO in hexadecimal notation: For example, $S_8$(31)=86.

Full size table

1.2 C.2 Bitsliced Implementations of the $S_8$ and Its Inverse

Listing 1.2 is the bitsliced implementation of the $S_8$.^{Footnote 1} The bitsliced implementation of the inverse $S_8$ cannot be obtained by reversing the bitsliced implementation of the $S_8$ because the input bits of $S_5^2$ are not all given. The Listing 1.3 shows how to implement the inverse $S_8$ with the given input bits. Since the $S_8$ applies each column of $8\times 8$ array of bits depicted in Fig. 1, we can implement the S-layer by replacing bit x[i] with byte X[i] which represents the i-th row value, where $i=0,1,2,\cdots ,7$.

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Kim, H. et al. (2021). PIPO: A Lightweight Block Cipher with Efficient Higher-Order Masking Software Implementations. In: Hong, D. (eds) Information Security and Cryptology – ICISC 2020. ICISC 2020. Lecture Notes in Computer Science(), vol 12593. Springer, Cham. https://doi.org/10.1007/978-3-030-68890-5_6

Download citation

DOI: https://doi.org/10.1007/978-3-030-68890-5_6
Published: 07 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68889-9
Online ISBN: 978-3-030-68890-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

DDT	Difference Distribution Table of an n-bit S-box whose
	\((\varDelta \alpha , \varDelta \beta )\) entry is \(\#\{x \in \mathbb {F}^{n}_{2}\|S(x) \oplus S(x \oplus \varDelta \alpha ) = \varDelta \beta \}\),
	where \(\varDelta \alpha , \varDelta \beta \in \mathbb {F}^{n}_{2}\)
LAT	Linear Approximation Table of an n-bit S-box whose \((\lambda _\alpha ,\lambda _\beta )\)
	entry is \(\#\{x \in \mathbb {F}^{n}_{2}\| \lambda _\alpha \bullet x = \lambda _\beta \bullet S(x)\} - 2^{n-1}\),
	where \(\lambda _\alpha , \lambda _\beta \in \mathbb {F}^{n}_{2}\), and the symbol \(\bullet \) denotes the canonical
	inner product in \(\mathbb {F}^{n}_{2}\)
Differential uniformity	\(\underset{\varDelta \alpha \ne 0, \varDelta \beta }{\text{ max }}\#\{x \in \mathbb {F}^{n}_{2}\|S(x) \oplus S(x \oplus \varDelta \alpha ) = \varDelta \beta \}\)

PIPO: A Lightweight Block Cipher with Efficient Higher-Order Masking Software Implementations

Abstract

Similar content being viewed by others

Strong 8-bit Sboxes with efficient masking in hardware extended version

Strong 8-bit Sboxes with Efficient Masking in Hardware

Analysis and Improvements of the DPA Contest v4 Implementation

Keywords

1 Introduction

2 Specification of PIPO

3 Design Rationale of PIPO

3.1 S-Layer

Proposition 1

Proof

Theorem 1

Proof

Theorem 2

Proof

3.2 R-Layer

4 Security Evaluation of PIPO

5 Performance Evaluation of Higher-Order Masking Implementations of PIPO

6 Performance Evaluation of Software and Hardware Implementations of PIPO

6.1 Software Implementations

6.2 Hardware Implementations

7 Conclusion

Notes

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Appendices

A Test Vectors

B Proofs of Proposition and Theorems

1.1 B.1 Proof of Proposition 1

1.2 B.2 Proof of Theorem 1

1.3 B.3 Proof of Theorem 2

C 8-bit S-box of PIPO, \(S_8\)

1.1 C.1 Table of the \(S_8\)

1.2 C.2 Bitsliced Implementations of the \(S_8\) and Its Inverse

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation