Cryptanalysis of SPEEDY

Wang, Jinliang; Niu, Chao; Liu, Qun; Li, Muzhou; Preneel, Bart; Wang, Meiqin

doi:10.1007/978-3-031-35486-1_7

Jinliang Wang^9,10,
Chao Niu^9,10,
Qun Liu^9,10,
Muzhou Li^9,10,
Bart Preneel¹² &
…
Meiqin Wang^9,10,11

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13915))

Included in the following conference series:

Australasian Conference on Information Security and Privacy

864 Accesses

Abstract

SPEEDY is a family of ultra-lightweight block ciphers designed by Leander et al. at CHES 2021. There are three recommended variants denoted as SPEEDY-r-192 with $r\in \{5,6,7\}$. All of them support the 192-bit block and the 192-bit key. The main focus during its design is to ensure hardware-aware low latency, thus, whether it is designed to have enough security is worth to be studied. Recently, the full-round security of SPEEDY-7-192 is announced to be broken by Boura et al. at EUROCRYPT 2023 under the chosen-ciphertext setting, where a round-reduced attack on SPEEDY-6-192 is also proposed. However, no valid attack on SPEEDY-5-192 is given due to its more restricted security parameters. Up to now, the best key recovery attack on this variant only covers 3 rounds proposed by Rohit et al. at AFRICACRYPT 2022.

In this paper, we give three full-round attacks on SPEEDY-7-192. Using the divide-and-conquer strategy and other new proposed techniques, we found a 5.5-round differential distinguisher which can be used to mount the first chosen-plaintext full-round key recovery attack. With a similar strategy, we also found a 5-round linear distinguisher which leads to the first full-round attack under the known-plaintext setting. Meanwhile, the 5.5-round differential distinguisher also helps us slightly improve the full-round attack in the chosen-ciphertext setting compared with the previous result. Besides, we also present a 4-round differential attack on SPEEDY-5-192, which is the best attack on this variant in terms of the number of rounds so far. A faster key recovery attack covering the same rounds is also given using a differential-linear distinguisher. Both attacks cannot threaten the full round security of SPEEDY-5-192.

Access provided by Autonomous University of Puebla. Download conference paper PDF

Cryptanalysis of Reduced Round SPEEDY

Differential Cryptanalysis of Round-Reduced SPEEDY Family

Cryptanalysis of KLEIN

Keywords

1 Introduction

In lightweight cryptography, cryptographic primitives are required to be suitable for resource-constrained environments, such as low area, latency or energy consumption. Therefore, it’s usually difficult to design such ciphers since one has to make trade-offs between implementation efficiency and security. Hence, the security of such ciphers is worth to be evaluated thoroughly.

In CHES 2021, Leander et al. proposed a family of ultra-low-latency block ciphers named as SPEEDY [9] to resolve the problem: How to design a secure encryption algorithm whose hardware implementation is fast. To gain such cipher, they first considered which type of logic gates and circuit topologies are suitable for ultra low-latency encryption. As a result, SPEEDY adopts a lightweight S-box, whose coordinate functions are realized as two-level NAND trees, and a hardware-friendly linear layer. After careful combination, SPEEDY achieves lower latency in hardware than most cryptographic primitives. SPEEDY contains three versions SPEEDY-5-192, SPEEDY-6-192 and SPEEDY-7-192, whose number of rounds are 5, 6, and 7, respectively. Note that, as the number of rounds decreases, the SPEEDY cipher gains better performance in hardware implementations but has a weak bound of security claim.

Recently, Boura et al. [5] announced that the full-round security of SPEEDY-7-192 is broken under the chosen-ciphertext setting, where a round-reduced attack on SPEEDY-6-192 is also proposed. However, due to the more restricted security parameters, no valid attack on SPEEDY-5-192 is given. The best key recovery attack on this variant only covers 3 rounds proposed by Rohit et al. [16].

In this paper, we aim to evaluate the security of SPEEDY thoroughly using differential cryptanalysis, linear cryptanalysis and differential-linear cryptanalysis. Proposed by Biham and Shamir in 1990, differential cryptanalysis [4] has been an important method in evaluating the security of symmetric-key ciphers. Using a differential distinguisher with high probability, one can recover the secret key after adding several rounds before or/and after it. Such an attack can be mounted only when the adversary is under the chosen-plaintext/ciphertext setting. Linear cryptanalysis [13], proposed by Matsui in 1992, only requires the adversary under the known-plaintext setting, which is more realistic than the differential attack. It exploits the linear distinguisher with high correlation considering the linear relation between some bits of plaintexts, ciphertexts and the secret key. Differential-Linear cryptanalysis [8] is another chosen plaintext/ciphertext method, which is proposed by Langford and Hellman in 1994. This method uses a distinguisher with a high correlation that consists of a differential distinguisher followed by a linear distinguisher. To evaluate its correlation more accurately, Achiya and Dunkelman introduced DLCT (short for Differential Linear Connectivity Table) [1] to connect the differential distinguisher and the linear one. This method works efficiently on ciphers whose differential probabilities and linear correlations are very high when the distinguisher covers small rounds, but decrease very quickly when the number of rounds increases.

1.1 Contributions

In this paper, we propose several key recovery attacks on full-round SPEEDY-7-192 and 4-round SPEEDY-5-192 by exploiting key-recovery friendly distinguishers constructed following the divide-and-conquer strategy. Detailed contributions are shown as follows.

Strategy of Searching Key-Recovery Friendly Differential and Linear Trail of SPEEDY. Our searching strategy includes two parts: (a) control the propagation of active pattern deduced from the input difference/mask and output difference/mask; (b) construct long distinguishers in the divide-and-conquer way. To fulfill (a), we introduce TDDT (short for truncated differential distribution table) to show the non-randomness of S-box with bit-wise truncated differential propagation. With this table, one can depict clearly in the searching algorithm on how to choose input/output differences of the target distinguisher. While for (b), we construct part of the distinguisher by concatenating two distinguishers covering the small number of rounds. For instance, the 5.5-round distinguisher is first built with a 4-round distinguisher. To gain a better 4-round distinguisher, we split it into two parts with each covering two rounds. The input difference of the MixColumn operation in the middle is chosen to minimize the total number of activated S-boxes involved in its two neighborhood S-box layers. For the other MixColumns contained in the 5.5-round distinguisher, no restriction on their input differences is necessary.

Improved Key Recovery Attacks Against Full-Round SPEEDY-7-192. By exploiting the above strategy, we gain a 5.5-round differential distinguisher, which has a higher probability than the one found by [5]. With this differential trail, we mount the first chosen-plaintext attack on its full-round variant. Besides, since it’s key-recovery-friendly, it also leads to a full-round chosen-ciphertext attack, which requires slightly less complexities than the one proposed by [5]. Further, we also mount the first known-plaintext 7-round linear attack with a 5-round linear distinguisher by adopting a similar search strategy. Although this one costs slightly higher complexities than the differential attack, it is a known-plaintext attack which requires weaker capabilities of the adversary than the chosen-plaintext attack required in the differential cryptanalysis. Therefore, such linear key recovery attack is also valuable. All our attack results along with previous published ones are summarized in Table 1.

Improved Key Recovery Attack on Round-Reduced SPEEDY-5-192. We also provide two attacks on 4-round SPEEDY-5-192 using differential and differential-linear cryptanalysis, respectively. More precisely, the differential attack is mounted with a two-round distinguisher where one round is added at the top and bottom, respectively, while the differential-linear attack is given by exploiting a three-round distinguisher and adding one half-round at both sides. Note that they are the best valid attacks on it in terms of the number of rounds.

Table 1. Summary of valid attacks on SPEEDY-7-192 and SPEEDY-5-192. KP, CP and CC separately denote the data collected in the known-plaintext, chosen-plaintext and chosen-ciphertext settings. Time complexities are evaluated in encryption units, while memory costs are evaluated in block size.

Full size table

1.2 Organization

First, we briefly recall SPEEDY, as well as differential, linear and differential-linear cryptanalysis in Sect. 2. In Sect. 3, we show how to exploit the propagation properties of SB and MC operations, and introduce the TDDT (short for truncated differential distribution table). Using these properties, we introduce the automated search method oriented to key recovery for differential, linear and differential-linear cryptanalysis in Sect. 4. Detailed attack procedures are given in Sect. 5. Finally, we summarize our work in Sect. 6. Besides, the code for searching distinguishers in this paper is available at the following repository:

https://github.com/Jin-liang-Wang/Cryptanalysis-of-SPEEDY.

2 Preliminaries

2.1 Brief Introduction of SPEEDY

SPEEDY [9] is a family of ultra-low latency block ciphers proposed at CHES 2021. SPEEDY uses a 6-bit bijective S-box and can be instantiated with different block sizes, key sizes, and numbers of rounds. For instance, SPEEDY-r-6l is the version of block size and key size of 6l-bits, and r is the number of iterated rounds. The internal state of SPEEDY-r-6l can be viewed as an $l \times 6$ rectangle array of bits where $l=32$. Following the notation of its design document [9], we use $x_{[i,j]}$ to denote the bit located at row i and column j of the state x where $0 \le i< l, 0 \le j < 6$.

The design document of SPEEDY suggests $l = 32$, and the number of rounds $r\in \{5,6,7\}$, which are called SPEEDY-5-192, SPEEDY-6-192, SPEEDY-7-192, respectively. As for the security claim, SPEEDY-r-192 achieves 128-bit security when iterated over $r = 6$ rounds and full 192-bit security when iterated over $r = 7$ rounds, while the $r = 5$ rounds variant provides $2^{128}$ time complexity and $2^{64}$ data complexity. To make it clear, we denote SPEEDY-r-192 as SPEEDY in this paper, and briefly recall the cipher as follows.

Initialization. SPEEDY receives a 192-bit plaintext and initializes the internal state with a two-dimensional matrix x. Specifically, it initializes the k-th MSB (short for the most significant bit) into $x_{[i,j]}$, where $k = 6i + j$.

Round Function. Its round function consists of five different operations: SubBox (SB), ShiftColumns (SC), MixColumns (MC), AddRoundKey (A$_{k_r}$) and AddRoundConstant (A$_{c_r}$). A high-level overview of its round function is shown in Fig. 1. Note that the last round function only has three operations.

SB is composed of 32 parallel 6-bit Sbox, where each Sbox takes the i-th row $(x_{[i,0]},x_{[i,1]},x_{[i,2]},x_{[i,3]},x_{[i,4]},x_{[i,5]})$ as input and its output is placed in the same row. The detail of S-box is shown in Appendix A. SC rotates the j-th column of the state upwards by j bits. In other words, the output bit $y_{[i,j]}$ equals to the input bit $x_{[(i + j)\bmod 32, j]}$. MC also operates on each column, where seven specifically chosen input bits are XORed as a new bit of its output, as shown in Eq. (1). It also can be seen as multiplying each column by a cyclic matrix.

$$\begin{aligned} y_{[i,j]} = x_{[i, j]} \oplus x_{[i + 1, j]} \oplus x_{[i + 5, j]} \oplus x_{[i + 9, j]} \oplus x_{[i + 15, j]} \oplus x_{[i + 21, j]} \oplus x_{[i + 26, j]}, \quad {\forall {i,j}. } \end{aligned}$$

(1)

The 192-bit round key $k_r$ and 192-bit constant $c_r$ are respectively XORed with the entire state in A$_{k_r}$ and A$_{c_r}$.

Key Schedule. The 192-bit master key of SPEEDY is regarded as the first round key $k_0$. Relation between the r-th round key $k_r$ and the $(r+1)$-the round key $k_{r+1}$ is constructed by a bit permutation PB. To be clear, the j-th bit of $k_{r+1}$ equals to the i-th bit of $k_r$, where $j\equiv ( 7i + 1) \bmod 192$.

2.2 Differential, Linear and Differential-Linear Crytanalysis

Differential Cryptanalysis. Differential cryptanalysis is a chosen plaintext attack proposed by Biham and Shamir [4] in 1990. Starting from a well-chosen difference $\delta _{in}$, the distribution of $E_k (x)\oplus E_k (x\oplus \delta _{in})$ is non-uniform. More precisely, there exits a specific $\delta _{out}$ such that ${\text {Pr}}_{x,k}[E_k (x)\oplus E_k (x\oplus \delta _{in})= \delta _{out}]$ is significantly higher than $2^{-n}$, where n is the block size. In order to distinguish a cipher with a high probability differential $(\delta _{in} \rightarrow \delta _{out})$ from a random permutation, one can collect D ciphertexts corresponding to pairs of plaintexts $(P,P\oplus \delta _{in})$, and compute the number of pairs following the differential:

$$ Q=\#\left\{ P: E_k(P) \oplus E_k(P \oplus \delta _{in})=\delta _{out}\right\} . $$

The expected value of Q for the cipher is $D\times {\text {Pr}}[\delta _{in}\rightarrow \delta _{out}]$ and $D\times 2^{-n}$ for the random permutation. Thus, the distinguisher succeeds with high probability when $D=\mathcal {O}(1/{\text {Pr}}[\delta _{in}\rightarrow \delta _{out}])$.

Linear Cryptanalysis. Linear cryptanalysis, proposed by Matsui at EUROCRYPT 1993 [13], exploits the linear approximations of the round function in order to obtain a biased approximation of the cipher. The linear approximation is a pair of masks $(\alpha ,\alpha ^{\prime })$ such that the distribution of the XORed masked plaintext and ciphertext $x\cdot \alpha \oplus E_k(x)\cdot \alpha ^{\prime }$ is biased. More precisely, we have ($|{\text {Pr}}_{x}[x\cdot \alpha \oplus E_k(x)\cdot \alpha ^{\prime }]-\frac{1}{2}| \gg 2^{-n/2})$ for most keys k), where $x\cdot y = \bigoplus _i x_i y_i$ denotes the inner product. Then, we have the correlation which is expected to be zero when averaged over all keys is:

$$ \text {Cor}_k\left( \alpha \rightarrow \alpha ^{\prime }\right) =2 \underset{x}{{\text {Pr}}}\left[ x \cdot \alpha =E_k(x) \cdot \alpha ^{\prime }\right] -1. $$

Similarly, to distinguish a cipher with a biased linear approximation $(\alpha ,\alpha ^{\prime })$ from a random permutation, we collect D known plaintexts/ciphertexts, and evaluate the experimental correlation:

$$ Q=\left( \#\left\{ P, C: P \cdot \alpha \oplus C \cdot \alpha ^{\prime }=0\right\} -\#\left\{ P, C: P \cdot \alpha \oplus C \cdot \alpha ^{\prime }=1\right\} \right) / D . $$

The expected value Q is larger (in absolute value) for the cipher than for a random permutation and can be observed with high probability when $D=\mathcal {O}(\text {Cor}(\alpha \rightarrow \alpha ^{\prime })^{-2})$.

Differential-Linear Cryptanalysis. Differential-linear cryptanalysis, proposed by Langford and Hellman in 1994 [8], is a technique to combine differential and linear cryptanalysis. Let E be a cipher which can be described as a cascade of three subciphers, $E_0$, $E_m$ and $E_1$, i.e., $E = E_1 \circ E_m \circ E_0 $. Let $\delta _{in}$ and $\delta _{out}$ denote the input and output difference of $E_0$, $\alpha $ and $\alpha ^{\prime }$ be the input and output linear mask of $E_1$. Asume that the differential trail $\delta _{in} \xrightarrow {} \delta _{out}$ in $E_0$ is satisfied with probability p, and the linear trail $\alpha \xrightarrow {} \alpha ^{\prime }$ in $E_1$ is satisfied with correlation q. The correlation of $\delta _{out} \xrightarrow {} \alpha $ in $E_m$ is satisfied with

$$ r = \text {Cor}\left( \delta _{out} \xrightarrow {E_m} \alpha \right) =2 \underset{x}{{\text {Pr}}}\left[ E_m(x) \cdot \alpha = E_m(x \oplus \delta _{out}) \cdot \alpha \right] -1 . $$

Then, the correlation of the trail $ \delta _{in} \xrightarrow {E} \alpha ^{\prime }$ is satisfied with correlation $\text {Cor}(\delta _{in} \xrightarrow {E} \alpha ^{\prime }) =pq^2r$. One can collect D ciphertexts corresponding to pairs of plaintexts $(P,P \oplus \delta _{in})$ and evaluate the experimental correlation:

Similarly, the expected value is larger (in absolute value) for the cipher than for a random permutation and can be observed with high probability when $D=\mathcal {O}(\text {Cor}(\delta _{in} \xrightarrow {E} \alpha ^{\prime })^{-2})$.

3 Further Study on Operations of SPEEDY

In general, the extended path before and after the distinguisher determines the number of guessed key bits, the attack rounds, and the time complexity. To control the number of active S-boxes in the extended path, we add the constraint of the propagation through key recovery rounds and propose the concept of TDDT (short for truncated differential distribution table). Since the security of SPEEDY highly relies on the 6-bit S-box and the MC operation, we did in-depth research for SB operation with the TDDT in Sect. 3.1 and for MC operation in Sect. 3.2, respectively. With these newly observed properties, we can construct the automated search method for the differential and linear distinguishers by taking the key recovery into account in Sect. 4.

3.1 Truncated Differential Distribution Table

In general, the time complexity of differential cryptanalysis is affected by the number of active bits in plaintext and ciphertext. To control the number of active bits, we limit some bits being inactive before the second SB operation to reduce the number of active bits in plaintext, as shown in Sect. 4.

On average, each bit is limited inactive in rounds of key recovery with probability $2^{-1}$. However, we have found that in a specific S-box, e.g., the S-box of SPEEDY, this probability is fluctuate considerably. Two examples are provided in Equations (2a) and (2b) below, although the input sets of these two situations are both one bit inactive, the probabilities of Eqs. (2a) and (2b) are much higher than $2^{-1}$.

$$\begin{aligned}&{\texttt {**0***}} \xrightarrow [p = 2^{-0.54}]{{\textsf{SB}}} \texttt{010000}\,, \end{aligned}$$

(2a)

$$\begin{aligned}&{{\texttt {0*****}}} \xrightarrow [p = 2^{-0.83}]{{\textsf{SB}}} {\texttt{010000}}\,, \end{aligned}$$

(2b)

where * denotes a arbitrary bit $\in \{0,1\}$. The probability is computed as follows. Taking Eq. (2b) as an example. The input set $\varDelta _{in}$ contains $2^{5}$ differences, i.e., {0b000000,0b000001,$\cdots $,0b011111}. We assume that each difference in $\varDelta _{in}$ appears with equal probability. Besides, the output difference is $\delta _{out}=$ 0b010000. Then this probability is computed by $ {\text {Pr}}(\varDelta _{in} \xrightarrow []{{{\textsf{SB}}}} \delta _{out} ) = \frac{\sum _{\delta _{in} \in \varDelta _{in}} {\text {Pr}}(\delta _{in} \xrightarrow []{{{\textsf{SB}}}} \delta _{out})}{|\varDelta _{in}|}. $

In order to get an optimal probability in rounds of key recovery, we need to consider how to select inactive bits. In this section, we introduce the TDDT and show how it can select inactive bits using the automatic solver in Sect. 4.

Compared to DDT (short for differential distribution table [4]) which contains the number of transition situations of each fixed input-output difference, for a N-bit S-box, the TDDT is a $2^N \times 2^N$ table which contains the number of transition situation from a set of input differences to a set of output differences. To clarify, we define the following notation to represent a set of differences named bit-string ${\textbf {k}}$. For an n-bit bit-string ${\textbf {k}} = (k_0, k_1, ..., k_{n-1}), {\textbf {k}} \in \mathbb {F}^{n}_{2}$, we define ${\textbf {k}}^{\prime } \preceq {\textbf {k}}$, if $k^{\prime }_i \le k_i$ for all i. Then, we denote $\mathcal {B}_{\varDelta _{in}}/\mathcal {B}_{\varDelta _{out}}$ as the bit-string representation of the input/output difference set $\varDelta _{in}/\varDelta _{out}$. For example, when a truncated input difference $\varDelta _{in}$ is "00*00*", the bit-string representation $\mathcal {B}_{\varDelta _{in}}$ is $\texttt{0b001001}_{\mathcal {(S)}}$^{Footnote 1}, i.e., $\mathcal {B}_{\varDelta _{in}} = {\texttt{0b001001}}_{\mathcal {(S)}}$ denotes the differential set "00*00*" which contains differences {0b000000,0b000001,0b001000,0b001001}. Thus, we can construct TDDT from the DDT using Algorithm 1 and the probability of the truncated difference propagation from $\varDelta _{in}$ to $\varDelta _{out}$ is:

$$\begin{aligned} {\text {Pr}}(\varDelta _{in} \xrightarrow []{{{\textsf{SB}}}} \varDelta _{out} ) = \frac{ {{\textsf{TDDT}}}[\mathcal {B}_{\varDelta _{in}}][\mathcal {B}_{\varDelta _{out}]}}{|\varDelta _{in}| \cdot 2^N}, \end{aligned}$$

(3)

where N is the size of S-box and $|\varDelta _{in}|$ denotes the number of input differences $\delta _{in}$ that fulfills $\delta _{in}\preceq \varDelta _{in}$.

By exploiting the differential property of the transition from a set of differences that are inactive in certain bits to another set of differences with different inactive bits, TDDT can help us make a universal search model for key recovery. Moreover, we also consider the cases of differential propagation from the fixed input difference to a set of output differences. Then, we define FTDDT(short for fixed-input truncated difference distribution table) to describe the probability distribution of this transition and introduce it as follows.

Fixed-Input Truncated Difference Distribution Table . As aforementioned, FTDDT contains the number of transition situations of a fixed input difference to a set of output differences. For the special case, when the input difference is zero, the output difference must be zero, i.e., the set of output differences $\varDelta _{out}$ contains only zero differences. More precise, when $\delta _{in} = 0$, we have

$$ {{\textsf{FTDDT}}}[0][0] = 2^{N}\quad \text {and} \quad {{\textsf{FTDDT}}}[0][i] = 0, i\in [1,2^{N})\text {.} $$

We show the construction of FTDDT in Algorithm 2. By modeling these difference distribution tables into our automated search model, we can find a key recovery-oriented differential which contains key recovery rounds as well as distinguishers. Time complexity of Algorithm 2 is$ (2^N - 1) \cdot \sum _{i = 1} ^{N} \left( {\begin{array}{c}N\\ i\end{array}}\right) 2^i =\mathcal {O}(2^{N \cdot \log _2 6}) \text {,} $ which is evaluated as the number of the look-up table operations.

In summary, FTDDT shows the non-randomness of S-box with bit-wise truncated differential propagation. Considering the differential propagation through a random permutation, the fixed input difference $\delta _{in}$ will propagate to the output difference set $\varDelta _{out}$ with probability $2^{-i}$ where the number of zero bits in $\mathcal {B}_{\varDelta _{out}}$ is i. As mentioned before, due to the non-randomness of a specific S-box, the probability is changed to ${{\textsf{FTDDT}}}[\delta _{in}][\mathcal {B}_{\varDelta _{out}}]/2^{N}$. In order to verify the correctness of FTDDT, we have conducted an experiment in Appendix B. With this accurate propagation probability of differential, we can gain a better-extended path for a key recovery attack as shown in Sect. 4.

Inverse Probability of FTDDT. During key recovery, we need to partially encrypt (resp. decrypt) the plaintext pair (resp. ciphertext pair) to validate the input (resp. output) of the distinguisher. To filter the good pairs of plaintext more precisely in validating the input of the distinguisher, we utilize the inverse of FTDDT. We denote $N_a$ as the number of active bits in $\varDelta _{out}$. For example, $N_a=5$ if $\varDelta _{out}=$ 0b110111, i.e., **0***. Then, when considering a fixed difference $\delta _{out} \preceq \varDelta _{out}$ propagating to $\delta _{in}$ which is the inverse of FTDDT[$\delta _{in}$][$\mathcal {B}_{\varDelta _{out}}$], the average probability of this transition is $ {{\textsf{FTDDT}}}[\delta _{in}][\mathcal {B}_{\varDelta _{out}}] / 2^{N + N_a}. $ The proof is shown in Appendix C. With the inverse probability of FTDDT, we will get the more precise time complexity of the key-recovery procedure.

3.2 Differential Propagation Property of MC

The MC operation of SPEEDY generates each new bit in a column by XORing seven current bits. It also can be seen as multiplying each column by a cyclic matrix M. In this section, we research the MC operation in-depth by considering the matrix M. Here we traverse all possible inputs with HW (short for Hamming weight) from one to eight and find that their minimum output $\text {HW are } (7,8,7,8,7,6,5,4)$. Table 2 summarizes which choices of active differential bits are associated with these minimal output HW cases. Specifically, we use $\{ a_{0}, a_{1},\cdots \}$ to denote that the $a_i$-th bit of the input difference is active. Moreover, due to the property of the MC operation of SPEEDY, we rotate the array and make the 0-th bit active to represent the rotation invariant array class. From Table 2, we can observe that when the input HW is one, there is a minimum sum of the input and output minimum HW. Meanwhile, when the input HW is two and three, the minimal sum of the input and output HW is relatively small.

Table 2. Summary of active differentials yielding low output HW from M. When the input HW is greater than 3, the cases of active bits are too large to list. We record their number to simplify. For the input differences with different HW, we list its output difference which have the minimum HW.

Full size table

When building the automated search model, one can utilize the property of MC to control the diffusion of the differential, which can reduce the size of the search space. In this work, we utilize a divide-and-conquer method to search for a longer differential by dividing it into several parts. More precisely, for a four rounds differential, we first search several two rounds differentials with the input difference of the last MC is of relatively low HW as in Table 2. Then, we search the last two rounds of differential and connect them with the middle MC. By limiting the sum of input-output HW of the middle MC, we can control the diffusion of the 4-round differential, i.e. minimizing the total number of activated S-boxes involved in the middle, and achieve a higher probability.

We note here that Boura et al. [5] proposed a different strategy to utilize the properties of the MC operation. In their approach, a differential is divided into several one-round parts and they consider the case that the HW of both input and output are small.

4 Automated Search Method Oriented to Key Recovery

In this section, we introduce the automated search strategies of differential, linear and differential-linear distinguishers against SPEEDY following the divide-and-conquer strategy, as well as taking the key recovery procedure into consideration.

Usually, cryptanalysts perform key recovery attacks based on a strong distinguisher. By appending several rounds before and after the distinguisher, one can obtain an extended path containing both distinguisher and key recovery rounds. However, this two-step process of key recovery cannot promise optimal results as shown in [15, 18]. Therefore, automated search oriented to key recovery can obtain better attack results. To find such key-recovery-friendly distinguishers, properties of SB and MC are introduced in Sect. 3 are utilized.

We implement our search algorithms using STP^{Footnote 2}, which is a solver developed based on the SAT (short for boolean satisfiability problem) [12]/SMT(short for satisfiability modulo theories) [2] problem and has been widely used in the field of cryptography [6, 10, 11, 14].

4.1 Differential Search Model Against SPEEDY

Since we aim to search for distinguishers in the single-key setting, we can omit A$_{k_r}$ and A$_{c_r}$ in our model. We denote $S_j^i$ as the j-th intermediate state of the i-th round and set the first round indexed with zero as the key recovery round.

Generally, the variable number and the size of each variable will determine the total scale of the search model and affect the complexity of solving the SAT/SMT problem. As the number of rounds increases, the time and memory complexity of solving the problem increases exponentially. Specifically, when the model exceeds four rounds, we cannot obtain a solution in a reasonable time with STP in the case of SPEEDY. For obtaining a longer distinguisher, we employ a divide-and-conquer [18] approach to search several models with a short round to form the entire differential. As shown in Fig. 2, our automated search model is divided into several steps. Motivated by [15], we also consider a key recovery-oriented search model where the distinguisher rounds and key recovery rounds are all included.

Model One: 7-Round Differential Contains 1.5-Round of Key Recovery. The 7-round differential consists of 5.5-round differential distinguisher $S_0^1\rightarrow S_1^6$ and 1.5 key recovery rounds $S_0^0 \rightarrow S_0^1,S_1^6 \rightarrow S_3^6$. For convenience, we use backward FTDDT to represent the FTDDT of SPEEDY S-box and forward FTDDT to represent the FTDDT of the inverse of SPEEDY S-box.

Step 1: From $S_3^1$ to $S_4^3$, the search strategies are as follows.
- Rule 1: Model SB to comply with DDT.
- Rule 2: Model each active S-box in $S_3^1$ to comply with $2^{-3\times 2}$ probability. This strategy is mainly used to limit the active S-box in the $S^1_3$. More precisely, we measure this trail with probability $2^{-6 \times N_a} \times p$, where $N_a$ is the number of active S-box in $S^1_3$ and p is the probability generated from Rule 1 in Step 1. Indeed, $N_a$ active S-boxes imply that there will have $2 \times N_a$ S-boxes in trail $S^1_0 \rightarrow S^1_3$. Since the maximum probability transition through the S-box is $2^{-3}$, we measure each active S-box in $S^1_3$ with probability $2^{-6}$.
- Rule 3: Constrain only one of the columns to be activated in $S_4^3$. For the input before MC, consider only entries with low HW from Table 2.
Step 2: From $S^3_4$ to $S^6_0$, the strategies to reduce search space are as follows.
- Rule 1: Due to the differential must be linked, we fix the input differences as those in $S^3_4$.
- Rule 2: Model the active S-box in $S^6_0$ to comply with a maximum probability of $2^{-3}$.
Step 3: After running the above two steps in parallel, we obtain several four rounds of differentials with high probability. Then, we choose the differential whose probability is higher than $2^{-170}$ and search the corresponding last 1.5-round differential of distinguisher and 1.5-round of key recovery as follows.
- Rule 1: Model $S^1_0 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}}}} S^1_1 $ and $ S^1_2 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}}}} S^1_3$ to comply with DDT.
- Rule 2: Model $S^0_1 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}} ^ {-1}}} S^0_0$ and $ S^6_0 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}}}} S^6_1$ to comply with FTDDT.
- Rule 3: Constrain the active S-box in $S^0_1$ and $S^6_2$ less than 32.
- Rule 4: Constrain the active S-box number in $S^0_1$ less than 16.
- Rule 5: Constrain the active S-box number in $S^6_2$ more than 6.
Here, rules 3 to 5 are used to constrain the active bits to reduce the time complexity of the key recovery phase as shown in Sect. 5.1.
Step 4: Following step one to three, we can get several 5.5-round distinguishers with 1.5-round of key recovery. Then, we choose the distinguisher with maximum probability and search an extended path for key recovery as shown in Algorithm 6 (Appendix G). To reduce the time complexity, we consider the differential rotation invariant of SPEEDY. More precisely, we use one of the 32 rotation-invariant distinguishers to implement the key recovery attack, and the chosen one makes the lowest time complexity.

With the above four steps, we find a differential trail with probability $2^{-185.53}$ which can be used to attack on the 7-round SPEEDY. More precisely, this trail consists of a 5.5-round differential distinguisher with probability $2^{-182.49}$ and a 1.5-round key-recovery phase with probability $2^{-3.04}$. We illustrate it as Fig. 3.

Model Two: 4-Round Differential Contains Two Rounds of Key Recovery. From Step 2, we can get several two rounds distinguisher from $S^3_4$ to $S^6_0$. Then, we use backward FTDDT and forward FTDDT to search 2-round of key recovery. We show this distinguisher with probability $2^{-59.35}$ in Appendix J. For the property of linear key relation between $k_0$ and $k_4$, the rotation invariant property does not affect the time complexity when attacking the 4-round SPEEDY.

4.2 Searching Linear Distinguishers Against SPEEDY

For linear cryptanalysis, we use a similar strategy used for differential search, which is a model oriented toward key recovery. Then, we search for a five-round distinguisher with two rounds of key recovery by following a four-step process. Due to the page number limit of the paper, we have placed the specifics in Appendix D.

Finally, we found a 5-round distinguisher combined with two key recovery rounds with correlation $2^{-93.01}$. We show the detail of the linear distinguisher in Appendix I. In this distinguisher, the bits of $k_7$ which is derived from $k_0$ by linear key schedule is 139. We reduce the guessed key bits to 185 bits in the key recovery phase.

4.3 Searching Differential-Linear Distinguishers Against SPEEDY

Similarly, we search for distinguishers against 4-round SPEEDY. More specifically, we search for three rounds of differential-linear distinguishers and extend half a round forward and backward for key recovery. The detailed strategy is shown in Appendix E.

Finally, the distinguisher is shown in Fig. 5 of Appendix F. The correlation of this 3-round distinguisher is $2^{-23.095}$ and the probability of rounds of key-recovery is $2^{-6.972}$. The bits of $k_7$ deduced from $k_0$ by linear key schedule are 29.

5 Key Recovery Attack Against SPEEDY

In this section, we first implemented both differential and linear cryptanalysis on full SPEEDY with the newly obtained distinguishers in Sect. 4. Then, the key recovery attack against other versions of SPEEDY are also provided.

5.1 7-Round Differential Attack Against Full SPEEDY

Using the newly obtained 7-round differential containing distinguishers and key recovery rounds, we can perform key recovery attacks on full SPEEDY. Before the attack, we first introduce some techniques that will be used in this section as follows.

Deduce $\boldsymbol{k_{0}}$ from $\boldsymbol{k_{7}}$ by Linear Key Schedule. Due to the linear key schedule, we can get the relationship between $k_{0}$ and $k_{7}$ and deduce the key bits of $k_{0}$ from $k_{7}$. We call these bits of $k_{0}$ deduced key bits from $k_{7}$. The detailed relationship between $k_{0}$ and $k_{7}$ is shown as the purple squares in Fig. 3.

Function to Sieve Possible Key Guesses. The FirstSboxSieve function uses the key bits of $k_7$ to deduce the six bits of each row in $k_0$ using the linear key schedule. After deducing the six bits of $k_0$ from $k_7$, we can filter the corresponding key guessing of the plaintext pair with the zero difference in $S^0_1$. Then, for each plaintext pair and the corresponding six bit keys of each row, there will leave $2^{6-d-n}$ key guesses after sieving by the first S-box, where d is the bit number deduced from $k_7$ and n is the inactive bit number after the first S-box. We show the details of FirstSBoxSieve in Algorithm 3. The SecondSboxSieve is used to combine the sets $L_i$ of possible key bits in a bigger set. The function can sieve the elements of the bigger set using the difference in $S^0_3$. The detail is shown in Algorithm 4.

Distinguisher with Rounds of Key Recovery. The 5.5-round differential distinguisher is shown in Fig. 3, where the black square and the blue square denote the active bits from $S^1_4$ to $S^5_4$ and the active bits from $S^0_3$ to $S^1_3$, respectively. To distinguish the differential propagation of the key recovery round, we use different colors to indicate the propagation of each active bit in the extended rounds, where the gray square indicates the active bits whose differences are forced zero. Further, we use the shaded square to denote the bits random from $\{0,1\}$. Then, we use the red square to denote the active bits in $S^0_0$ and $S^6_3$. To represent the bits in $k_0$ that are derived from $k_7$ using the linear key schedule, we use the purple square to denote the deduced bits in $k_0$ and the green square to denote the guessed key bits when performing the key recovery attack. Note that 180 key bits in $k_0$ are involved in the key recovery attack.

The attack procedure contains three parts, i.e., data collection, key recovery and brute force the master key.

Data Collection. Consider structure S which consists of $2^{156}$ plaintexts and for each plaintext pair $P_0,P_1 \in S$, $ P_0 \,\oplus \, P_1 = (N,N,N,N,N,N,N,N,N,N,N,N,N,$ 0, N, N, N, N, N, N, N, N, N, N, N, 0, 0, 0, 0, 0, N, N), where N denotes any 6-bit cell. We require the oracle to encrypt those $2^{156}$ plaintexts and collect the ciphertexts by the hash table which is indexed by the 156-bit inactive bits in ciphertexts. From the hash table, we expect to get $2^{155}$ pairs of plaintext-ciphertext pair $(P_0,C_0),(P_1,C_1)$ that satisfy $C_0 \oplus C_1= (0,N,0,N,0,0,0,N,0,0,0,0,N,0,0,$ $0,N,0,0,0,0,0,0,0,0,N,0,0,0,0,0,0)\text {.}$

Next, we briefly introduce the key recovery and brute force phases, the detail of the pseudocode is shown in Appendix G. We use $\delta _{in}^i$ to denote the i-th row of the input difference of the 5.5-round distinguisher.

Key Recovery. For each pair obtained from the data collection phase, we first use $C_0 \oplus C_1$ to deduce the 36-bit $k_7$ from DDT. Note that, since we evaluate the propagation probability of key-recovery rounds by FTDDT, the number of deduced 36-bit $k_7$ should be calculated by the inverse probability of FTDDT as shown in Sect. 3.1. For each deduced 36-bit $k_7$, we fix the deduced keys in $k_0$ from $k_7$ due to the linear key schedule and guess other 144-bit $k_0$ as follows.

Firstly, we use the early abort technique which is guessing the key-bits of row 14,15,17 and partial encrypt to run FisrtSboxSieve. If there are no satisfied key guesses to validate the output difference of the SubBox operation, we consider the 36-bit deduced $k_7$ is wrong and discard it. If all the deduced $k_7$ are wrong, we consider this pair of plaintext-ciphertext is wrong and discard it. Next, for each plaintext-ciphertext pair, we sieve $k_0$ in row 11,12,13,16 with FisrtSboxSieve and partial encrypt to $\delta _{in}^{11}$ with SecondSboxSieve. After this step, the plaintext-ciphertext pair and corresponding $k_0$ can validate the input difference $\delta _{in}^{11}$. Similarly, we discard the deduced $k_7$ if there are no satisfied $k_0$ guesses to validate $\delta _{in}^{11}$ and discard the pair if no deduced $k_7$ is left. Then, we employed the same method to those deduced keys in the row of $k_0$ with FisrtSboxSieve and partial encrypt plaintext pairs to $\delta _{in}^{14}, \delta _{in}^{15}, \delta _{in}^{18}, \delta _{in}^{19}, \delta _{in}^{7}, \delta _{in}^{5}, \delta _{in}^{4}, \delta _{in}^{3}, \delta _{in}^{2}, \delta _{in}^{1}, \delta _{in}^{30},\delta _{in}^{29}, \delta _{in}^{21}$ to further sieve with SecondSboxSieve.

Finally, it will leave $2^{135.46}$ pairs and $2^{6.5}$ 180-bit keys for each pair on average, and we will get $2^{141.96}$ candidate keys of this 180-bit master key for each structure. Then, we use the brute force procedure to sieve the wrong candidate keys.

Brute Force. For each candidate key from the key recovery procedure, we guess the remaining 12-bit of the master key to get the complete 192-bit candidate master key $k^{c}$. Then we use a test pair $(m_t,c_t)$ from encrypt oracle to check if $E_{k^{c}}(m_t) = c_t$ to verify if $k^{c}$ is the right key.

Complexity and Success Probability. Since the memory access of storing the ciphertexts to the hash table is negligible compared to the time for collecting the ciphertexts, the time complexity of data collection for each structure is $2^{156}$. Since the hash table storing $2^{155}$ pairs is very large, we treat the time to access this table as one encryption. Further, since the 7-round SPEEDY encryption has $7 \times 2 \times 32 = 448$ SubBox operations, the time complexity of the key recovery phase is less than $2^{155} + (70\times 2^{155})/448$ for each structure (details are shown in Appendix ). Then, the time complexity of Brute Force phase is $2^{151.96}$ times 7-round SPEEDY encryption for each structure. Overall, the total time complexity of each structure is $2^{156} + 2^{155} + (70\times 2^{155})/448 + 2^{153.96} \approx 2^{156.86}$ 7-round ${\texttt {SPEEDY}}$ encryptions.

The signal-to-noise-ratio is Using $2^{185.53}/2^{155} \approx 2^{30.53}$ structures, we can ensure the success rate of 79.15% with 7.51-bit advantage according to [17]. Hence, the data complexity is $2^{186.53}$, the time complexity is $2^{30.53} \times 2^{156.86} \approx 2^{187.39}$ and the memory required is $2^{156}$.

Reducing Memory Requirement. The above attack require $2^{156}$ bits in memory, which is too large. To obtain lower memory complexity, we can use the chosen ciphertext attack mode and select the active 36-bit structure at the ciphertext. After using the inactive 36 bits in the plaintext as the index, we get $2^{35}$ pairs, and then one can follow the key recovery procedure used in the above attack to obtain the right key.

5.2 7-Round Linear Attack Against Full SPEEDY

With the 5-round linear distinguisher shown in Appendix I, we compute the attack parameter on SPEEDY-7-192 using the symbols borrowed from [7]. With the $\mathcal {FWT}$ approach in [7], the time complexity is $ 2(\rho _M + \rho _A)2^{|k_0 |+ |k_7 |- l_{07}}\text {,} $ where the $|k_i |$ means the bit number guessed in $k_i$, and $l_{07}$ means the key bits of $k_7$ deduced from $k_0$. Following the notation in [7], $\rho _A$ is the cost of an addition, and $\rho _M$ is the cost of a multiplication. Assuming that two multiplications correspond to roughly one evaluation of the cipher, we have the time complexity of key recovery phase $2^{186}$ encryptions. The memory required is $2^{144} + 2^{180} + 2^{185} \approx 2^{185.04}$. According to [17], we need $2^{188.50}$ data and the advantage $a = 4$ to achieve the success rate 69.12%. Finally, we exhaustively search the remaining 7 key bits. The total time complexity of the attack is $2^{188.50} + 2^{186} + 2^{188} \approx 2^{189.41}$.

5.3 4-Round Differential Attack Against SPEEDY.

As shown in Appendix J, we choose ciphertexts in $S^3_3$ to make a 60-bit structure and filter the corresponding plaintext pairs in $S^2_0$. After running this step, it will leave $\left( {\begin{array}{c}2^{60}\\ 2\end{array}}\right) /2^{36} \approx 2^{83}$ pairs. Then, we sieve the key bits of the 108-bit involved $k_4$ according to the $2^{83}$ pairs and leave $2^{43.692}$ possible $k_4$ for each pair. Then, for the combination of these $2^{83}\times 2^{43.692} = 2^{126.692}$ plaintext-ciphertext pairs and corresponding key guesses, we guess the 156-bits $k_0$ using the method in Sect. 5.1.

Due to the differential probability being $2^{-59.35}$ and one structure can only provide $2^{60}$ data, we construct two structures to collect the data. The data complexity of this attack is $2^{61}$, the time complexity is $2^{119.692}$ and the memory required is $2^{83}$. With above attack parameters, the success rate is $94.17\%$.

5.4 4-Round Differential-Linear Attack Against SPEEDY

With the distinguisher searching in Sect. 4.3, we obtain a 4-round differential-linear attack against SPEEDY with $2^{61}$ data complexity $2^{105}$ time complexity and $2^{105}$ memory requirement. The detail of this attack is shown in Appendix F.

6 Conclusion

In this paper, we first show how to find key-recovery friendly distinguishers for SPEEDY by following the divide-and-conquer strategy as well as some other new techniques. With such strategies, we found a 5.5-round differential distinguisher which is key-recovery friendly and with higher probability than the one used in the previous result. Using this distinguisher, we are able to mount the first chosen-plaintext full-round attack on SPEEDY-7-192. Besides, with the same distinguisher, we can also mount a full-round attack under the chosen-ciphertext setting, which slightly reduces the attack complexities compared with the previous one proposed in the same setting. Meanwhile, using a similar search strategy, we also found a 5-round linear distinguisher which leads to the first known-plaintext full-round attack. Although the full-round security of this variant has recently been announced to be broken, our attacks here need a weaker ability requirement for the adversary. Besides, for SPEEDY-5-192, which requires more restricted attack parameters, we also propose two 4-round key recovery attacks using a differential distinguisher and a differential-linear one, respectively. According to our best knowledge, both attacks are the best ones on this variant in terms of the number of rounds. However, its full-round security is not threatened.

Notes

1.
To distinguish the set of differences from the single differential, we use numbers with subscript $_{\mathcal (S)}$ to represent the set of differences.
2.
https://stp.github.io/.

References

Bar-On, A., Dunkelman, O., Keller, N., Weizman, A.: DLCT: a new tool for differential-linear cryptanalysis. In: Ishai, Y., Rijmen, V. (eds.) EUROCRYPT 2019. LNCS, vol. 11476, pp. 313–342. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17653-2_11
Chapter Google Scholar
Barrett, C., Tinelli, C.: Satisfiability modulo theories. In: Clarke, E., Henzinger, T., Veith, H., Bloem, R. (eds.) Handbook of Model Checking, pp. 305–343. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-10575-8_11
Chapter Google Scholar
Beierle, C., Leander, G., Todo, Y.: Improved differential-linear attacks with applications to ARX ciphers. In: Micciancio, D., Ristenpart, T. (eds.) CRYPTO 2020, Part III. LNCS, vol. 12172, pp. 329–358. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-56877-1_12
Chapter Google Scholar
Biham, E., Shamir, A.: Differential cryptanalysis of des-like cryptosystems. J. Cryptol. 4(1), 3–72 (1991)
Article MathSciNet MATH Google Scholar
Boura, C., David, N., Boissier, R.H., Naya-Plasencia, M.: Better steady than speedy: full break of speedy-7-192. Cryptology ePrint Archive, Paper 2022/1351 (2022)
Google Scholar
Fan, Y., Li, M., Niu, C., Lu, Z., Wang, M.: Related-tweakey impossible differential attack on reduced-round SKINNY-AEAD M1/M3. In: Galbraith, S.D. (ed.) CT-RSA 2022. LNCS, vol. 13161, pp. 247–271. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-95312-6_11
Chapter Google Scholar
Flórez-Gutiérrez, A., Naya-Plasencia, M.: Improving key-recovery in linear attacks: application to 28-round PRESENT. In: Canteaut, A., Ishai, Y. (eds.) EUROCRYPT 2020. LNCS, vol. 12105, pp. 221–249. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-45721-1_9
Chapter MATH Google Scholar
Langford, S.K., Hellman, M.E.: Differential-linear cryptanalysis. In: Desmedt, Y.G. (ed.) CRYPTO 1994. LNCS, vol. 839, pp. 17–25. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-48658-5_3
Chapter Google Scholar
Leander, G., Moos, T., Moradi, A., Rasoolzadeh, S.: The speedy family of block ciphers - engineering an ultra low-latency cipher from gate level for secure processor architectures. Cryptology ePrint Archive, Paper 2021/960 (2021)
Google Scholar
Li, M., Hu, K., Wang, M.: Related-tweak statistical saturation cryptanalysis and its application on QARMA. Cryptology ePrint Archive (2019)
Google Scholar
Liu, Y., Wang, Q., Rijmen, V.: Automatic search of linear trails in ARX with applications to SPECK and chaskey. In: Manulis, M., Sadeghi, A.-R., Schneider, S. (eds.) ACNS 2016. LNCS, vol. 9696, pp. 485–499. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-39555-5_26
Chapter Google Scholar
Longo, G., Zilli, M.V.: Complexity of theorem-proving procedures: some general properties. Revue française d’automatique inf. recherche opé. Inf. théorique 8(R3), 5–18 (1974)
Google Scholar
Matsui, M.: Linear cryptanalysis method for DES cipher. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 386–397. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-48285-7_33
Chapter Google Scholar
Niu, C., Li, M., Sun, S., Wang, M.: Zero-correlation linear cryptanalysis with equal treatment for plaintexts and tweakeys. In: Paterson, K.G. (ed.) CT-RSA 2021. LNCS, vol. 12704, pp. 126–147. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-75539-3_6
Chapter Google Scholar
Qin, L., Dong, X., Wang, X., Jia, K., Liu, Y.: Automated search oriented to key recovery on ciphers with linear key schedule: applications to boomerangs in SKINNY and ForkSkinny. IACR Trans. Symmetric Cryptol. 2021(2), 249–291 (2021)
Article Google Scholar
Rohit, R., Sarkar, S.: Cryptanalysis of reduced round speedy. Cryptology ePrint Archive (2022)
Google Scholar
Selçuk, A.A.: On probability of success in linear and differential cryptanalysis. J. Cryptol. 21(1), 131–147 (2008)
Article MathSciNet MATH Google Scholar
Zhou, C., Zhang, W., Ding, T., Xiang, Z.: Improving the MILP-based security evaluation algorithm against differential/linear cryptanalysis using a divide-and-conquer approach. Cryptology ePrint Archive (2019)
Google Scholar

Download references

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper. This work is supported by the National Key Research and Development Program of China (Grant No. 2018YFA0704702), the National Natural Science Foundation of China (Grant No. 62032014), the Major Basic Research Project of Natural Science Foundation of Shandong Province, China (Grant No. ZR202010220025). The corresponding author is also supported by Qingdao Innovation project (Grant No. QDBSH20230101008).

Author information

Authors and Affiliations

Key Laboratory of Cryptologic Technology and Information Security, Ministry of Education, Shandong University, Jinan, China
Jinliang Wang, Chao Niu, Qun Liu, Muzhou Li & Meiqin Wang
School of Cyber Science and Technology, Shandong University, Qingdao, China
Jinliang Wang, Chao Niu, Qun Liu, Muzhou Li & Meiqin Wang
Quan Cheng Shandong Laboratory, Jinan, China
Meiqin Wang
imec-COSIC, KU Leuven, Leuven, Belgium
Bart Preneel

Authors

Jinliang Wang
View author publications
You can also search for this author in PubMed Google Scholar
Chao Niu
View author publications
You can also search for this author in PubMed Google Scholar
Qun Liu
View author publications
You can also search for this author in PubMed Google Scholar
Muzhou Li
View author publications
You can also search for this author in PubMed Google Scholar
Bart Preneel
View author publications
You can also search for this author in PubMed Google Scholar
Meiqin Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Muzhou Li .

Editor information

Editors and Affiliations

Queensland University of Technology, Brisbane, QLD, Australia
Leonie Simpson
Queensland University of Technology, Brisbane, QLD, Australia
Mir Ali Rezazadeh Baee

Appendices

A The 6-Bit S-Box of SPEEDY

(See Table 3).

Table 3. The 6-bit S-box of SPEEDY. All elements are expressed in hexadecimal.

Full size table

B Experiment to Verify the Correctness of FTDDT

In this section, we conducted an experiment to calculate the probability of propagation to verify the correctness of FTDDT. For several random-chosen fixed-input differences $\delta _{in}$, we tested the transfer probability from a single difference $\delta _{in}$ (shown in the left-hand side of Eq. 4) to a set of differences $\varDelta _{out}$ (shown in the right-hand side of Eq. 4) in the S-box used in SPEEDY.

More precisely, we randomly choose $2^{20}$ 42-bit plaintexts p and denote the number of p that satisfies $\{ \text {S-box}(p) \oplus \text {S-box}(p \oplus \delta _{in}) \in \varDelta _{out} \}$ as $\mathcal {N}$. Then the probability obtained from this test is $\mathcal {N} / 2^{20}$.

$$\begin{aligned}&{\texttt{010000}} \xrightarrow []{{\textsf{SB}}} {{\texttt {**0***}}} \nonumber \\&{\texttt{010000}} \xrightarrow []{{\textsf{SB}}} {{\texttt {****00}}} \nonumber \\&{\texttt{010000}} \xrightarrow []{{\textsf{SB}}} {\texttt {0*****}} \nonumber \\&{\texttt{001000}} \xrightarrow []{{\textsf{SB}}} {\texttt {00****}} \\&{\texttt{001000}} \xrightarrow []{{\textsf{SB}}} {\texttt {****00}} \nonumber \\&{\texttt{000100}} \xrightarrow []{{\textsf{SB}}} {\texttt {***0**}} \nonumber \\&{\texttt{000100}} \xrightarrow []{{\textsf{SB}}} {\texttt {0**0**}} \nonumber \end{aligned}$$

(4)

Our validation algorithm is written in python code and we run it on an Intel(R) Core(TM) i7-8700 CPU. Finally, the tested probability is $2^{-5.9671}$ while the transfer probability caculated from FTDDT is $2^{-5.9622}$. Therefore, we consider that the accuracy obtained by FTDDT is verified.

C Inverse Probability of FTDDT

Theorem 1

When considering a fixed difference $\delta _{out} \preceq \varDelta _{out}$ propagate to $\delta _{in}$ which is the inverse of FTDDT[$\delta _{in}$][$\mathcal {B}_{\varDelta _{out}}$], the average probability of this transition is

$$ {{\textsf{FTDDT}}}[\delta _{in}][\mathcal {B}_{\varDelta _{out}}] / 2^{N + N_a}. $$

Proof

Since the S-box is bijective, the inverse probability of FTDDT is equal to the average probability of $\delta _{in} \xrightarrow {{{\textsf{SB}}}} \delta _{out}$ (averaged over all $\delta _{out} \in \varDelta _{out}$). Next, remember that the FTDDT describes the probability of a group of differential transition. Thus, with this probability, we can obtain the average probability of $\delta _{in}\xrightarrow {{{\textsf{SB}}}} \delta _{out}, \delta _{out} \preceq \varDelta _{out}$ by dividing by the size of the output differential set $|\varDelta _{out}|$. The probability of the case $\delta _{in} \xrightarrow {{{\textsf{SB}}}} \varDelta _{out}$ is ${{\textsf{FTDDT}}}[\delta _{in}][\mathcal {B}_{\varDelta _{out}}]/2^{N}$ and the number of difference in $\varDelta _{out}$ is $2^{N_a}$. Thus, the probability of the case $\delta _{in} \xrightarrow {{{\textsf{SB}}}} \delta _{out}$ is $ {{\textsf{FTDDT}}}[\delta _{in}][\mathcal {B}_{\varDelta _{out}}] / 2^{n + N_a}$ on average if $\delta _{out} \in \varDelta _{out}$. $\square $

D Searching Linear Distinguishers Against SPEEDY

Since the MC operation can be seen as $\varGamma _{in} = {\textsf{M}}^{\textsf{T}} \cdot \varGamma _{out}$ in linear attack where $\varGamma _{in}$ is the input mask and $\varGamma _{out}$ is the output mask of matrix M. Following the method in Sect. 3.2, we get the linear mask transition properties for MC, precisely, the correspondence between the input mask $\varGamma _{in}$ and the output mask $\varGamma _{out}$ of the minimum HW.

For linear cryptanalysis, we use a similar strategy used for differential search, which is a model oriented toward key recovery. Then, we search for a five-round distinguisher with two rounds of key recovery by following four steps.

Step 1: We found the 4-round distinguishers from $S^0_3$ to $S^5_0$ using the same method in searching differentials.
Step 2: For each of the 4-round linear distinguishers, we found the distinguishers with two rounds of key recovery from $S^5_0$ to $S^6_2$ according to the following conditions: the probability from $S^5_0$ to $S^6_0$ is as high as possible and after spreading from $S^6_0$ to $S^6_2$ with probability one, the active S-box number in $S^6_2$ is less than 32.
Step 3: Then, the linear mask propagates from $S^0_3$ to $S^0_1$ with probability one and limits the number of active S-box at $S^0_1$ is less than 32.
Step 4: Finally, we consider the linear rotation invariant of the 7-round distinguishers combined with key recovery and choose the case where the number of bits of $k_7$ deduced from $k_0$ by linear key schedule is maximized.

Following step one to four, we found a 5-round distinguisher combined two key recovery rounds with correlation $2^{-93.01}$. We show the detail of the linear distinguisher in Appendix I. In this distinguisher, the bits of $k_7$ which is derived from $k_0$ by linear key schedule is 139. We reduce the guessed key bits to 185 bits in the key recovery phase.

E Searching Differential-Linear Distinguishers Against SPEEDY

We show the four rounds differential-linear search model in Fig. 4. More specifically, we search for three rounds of differential-linear distinguishers from $S^0_4$ to $S^4_0$ and extend half a round forward and backward for key recovery. Detailed strategy is as follows.

Step 1: We search the differential part contains key recovery rounds from $S^0_3$ to $S^1_4$, the search strategies are as follows.
- Rule 1: Constrain the active bit numbers in $S^0_3$ less than 64.
- Rule 2: Model $S^0_3 {\mathop {\longrightarrow }\limits ^{{{\textsf{SB}}}^{-1}}} S^0_2$ to comply with FTDDT.
- Rule 3: Model $S^1_0 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}}}} S^1_1, S^1_2 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}}}} S^1_3$ to comply with DDT.
- Rule 4: Constrain only one bit active in $S^1_4$.
Step 2: Meanwhile, we search the linear part contains key recovery rounds from $S^3_0$ to $S^4_0$, the search strategies are as follows.
- Rule 1: Constrain only one bit active in $S^3_0$.
- Rule 2: Model $S^3_0 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}}}} S^3_1, S^3_2 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}}}} S^3_3$ to comply with LAT (short for linear approximation table).
- Rule 2: Model the active S-box in $S^4_0$ to comply with a maximum correlation of $2^{-1.415}$.
Step 3: We search the connect round from $S^1_4$ to $S^3_0$ as follows.
- Rule 1: Constrain only one bit active in $S^1_4$ and $S^3_0$.
- Rule 2: Model $S^2_0 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}}}} S^2_1$ with backward FTDDT.
- Rule 3: Model $S^2_2 {\mathop {\longrightarrow }\limits ^{{\textsf{SB}}}} S^2_3$ to comply with LAT.
- Rule 4: Constrain the difference of the i-th row in $S^2_2$ is zero if the difference of corresponding row in $S^2_2$ is active.
Step 4: From Step one to three, we get several distinguishers. Then, we choose the one with the highest correlation and run an experiment for obtaining the estimated correlation from $S^1_2$ to $S^3_1$. After testing 128 random keys with $2^{30}$ plaintexts under a certain key, we obtained the correlation estimated as $2^{-8.85}$.
Step 5: Finally, we consider the rotation invariant property of SPEEDY such that the bits of $k_7$ deduced from $k_0$ by linear key schedule are maximized.

Finally, the distinguisher is shown in Fig. 5. The correlation of this 3-round distinguisher is $2^{-23.095}$ and the probability of rounds of key-recovery is $2^{-6.972}$. The bits of $k_7$ deduced from $k_0$ by linear key schedule are 29.

F 4-Round Differential-Linear Attack Against SPEEDY

With the 3-round differential-linear distinguisher shown in Fig. 5, we can implement the differential-linear attack against 4-round SPEEDY. Firstly, we choose the active bits in plaintext to make a 61-bit structure and obtain corresponding ciphertexts. After guessing the 61-bit $k_0$, we use Algorithm 5 to put the pairs that can validate the input difference of the 3-round differential-linear distinguisher into the set $\mathcal {S}$. Then we recover the master key by applying the FWHT (short for Fast Walsh-Hadamard Transform) which has already been used in [3].

Since the time complexity of the Algorithm 5 is negligible compared to those of FWHT, we use the same computing method for our attack. Then, we choose the advantage $a = 88$, use one structure to collect data, the time complexity of this attack is $2^{61} + 2^{104} + 2^{104} \approx 2^{105}$. The data complexity is $2^{61}$ and the memory required is $2^{61} + 2^{72} + 2^{105} \approx 2^{105}$. We denote $|\mathcal {S}|$ as the number of pairs in $\mathcal {S}$, according to [17] the distribution of $|\mathcal {S}|$ is

$$ |\mathcal {S}| \sim \frac{1}{2} N(2^{61}\cdot 2^{-6.972}, 2^{61}\cdot 2^{-6.972}) = N(2^{53.028}, 2^{52.028}). $$

Then, the success rate is

$$\begin{aligned} P_S&= \int _{0} ^{+ \infty } P(|\mathcal {S}| = x) \cdot P_{S'}(|\mathcal {S}| = x) dx \\&\ge P(|\mathcal {S}| \ge 2^{53.027}) \cdot P_{S'}(|\mathcal {S}| = 2^{53.027}) \\&\approx 45.31\%, \nonumber \end{aligned}$$

where $P_{S'}(|\mathcal {S}| = n)$ is the success rate for this attack, if $|\mathcal {S}| = n$.

G Pseudocode of Key Recovery and Brute Force Parts on the $2^{155}$ Pairs for Each Structure in the 7-Round Differential Attack

H Details for the Time Complexity of the Key Recovery on the $2^{155}$ Pairs for Each Structure in the 7-Round Differential Attack

In this section, we mainly consider the time of operating the Algorithm 6 in Appendix G. In order to make it easier to understand, we show the details of each step in Algorithm 6 in Table 4.

Position in Table 4 denotes the location in Algorithm 6, e.g., position 9 denotes the operation that generates $\mathcal {L}_{14}$ at line 9 in Algorithm 6. The filtering probability of FirstSboxSieve is determined by the number of inactive bits in the output of the S-box, e.g., when generating the $\mathcal {L}_{14}$, the output different set of the S-box is 0b*00*00 so that the filtering probability is $2^{-4}$ while the filtering probability of SecondSboxSieve is calculated by the inverse probability of FTDDT.

Besides, to facilitate the calculation of the time complexity, we see the $2^{155}$ pair of each structure as a group. Then, we compute the time complexity of each operation in Algorithm 6 in this group. More precisely, we introduce two new parameters, i.e., the number of the remaining pairs and the number of remaining keys for each pair. Further, for an operation with filtering probability p, the number of the remaining pairs and remaining keys before this operation is $N_{p}$ and $N_{k}$, respectively. Besides, we use $N_a$ to denote the number of imported keys in this operation. For the FisrtSboxSieve operation, $I_k$ is the number of keys in the input key set, e.g., $|\mathcal {K}^0_{14}|$ in position 9. For the SecondSboxSieve operation, $I_a$ is zero. Then, the complexity of this operation for the whole group in this operation, the remaining pairs and the remaining keys after this operation is calculated by Algorithm 7.

In fact, the sum of the time complexity in Table 4 is about $68.8 \times 2^{155}$ times SubBox operation. Since there are other operations that have not been evaluated, we give a generous estimate for the complexity of the key recovery phase, i.e., $70 \times 2^{155}$ times SubBox operation.

Evaluation the Number of Elements in $\mathcal {K}^7$. In this section, we use $S^i_j[k,:]$ to denote the k-th line in the $S^i_j$, i.e., the 6k-th bit to $(6k + 5)$-th bit of $S^i_j$. As shown in Fig. 3, there are only two active differences transitioning to a set of differences through the process $S^6_0 \xrightarrow {{\textsf{SB}}} S^6_1$, i.e., the $S^6_0[3,:] \xrightarrow {{\textsf{SB}}} S^6_1[3,:]$ (0b000010$ \xrightarrow {{\textsf{SB}}}$ 0b*0*000) and $S^6_0[6,:] \xrightarrow {{\textsf{SB}}} S^6_1[6,:]$ (0b000010 $\xrightarrow {{\textsf{SB}}}$0b000*0*). Other active differences propagate to a single difference.

Therefore, when deducing the key bits of $k_7$ by DDT (position 5 in Algorithm 6), on average, there is only one 6-bit partial key will be deduced for the states $k_7[i,:]$, where $i \in \{7,2,16,25\}$ since the corresponding difference in $S^6_2$ is a specific single difference. However, $k_7[1,:]$ and $k_7[3,:]$ will have several possibilities since the corresponding state $S^6_2[1,:]$ and $S^6_2[3,:]$ are sets of differences. Thus, the average number of elements in $\mathcal {K}^7$ is determined by the average number of deduced $k_7[1,:]$ and $k_7[3,:]$.

Note that the state $S^6_2[1,:]$ and $S^6_2[3,:]$ are propagated by $S^6_1[3,:]$ and $S^6_1[6,:]$ while the state $S^6_1[i,:]$ is propagated by $S^6_0[i,:]$ through the S-box. Thus, we did an in-depth study of the propagation $S^6_0[i,:] \xrightarrow {{\textsf{SB}}} S^6_0[i,:]$, where $i \in \{3,6\}$.

In our trail, the $S^6_0[3,:] \xrightarrow {{\textsf{SB}}} S^6_0[3,:]$ will have two cases since the propagation 0b000010 $\xrightarrow {{\textsf{SB}}}$ 0b*0*000 have two possible propagations, i.e., 0b000010 $\xrightarrow {{\textsf{SB}}}$ 0b100000 and 0b000010 $\xrightarrow {{\textsf{SB}}}$ 0b001000 (indeed, there are four cases for this propagation, but other two cases have zero probability).
For the same reason, the $S^6_0[6,:] \xrightarrow {{\textsf{SB}}} S^6_0[6,:]$ also has two cases.

Above all, there are $2 \times 2 = 4$ possible cases for the propagation of the two active S-boxes and each case will deduce one possible 12-bit keys for $k_7$. Besides, each case needs two Subox operations. Thus, the average number of elements in $\mathcal {K}^7$ is and the time complexity of deducing 36-bit $k_7$ is $4 \times 2 + 4 = 12$ times SubBox operation.

Table 4. Details for the Time Complexity of Algorithm 6. The set parameter is the output of the operation, e.g., $\mathcal {L}_{14}$ for position 9. The time complexity is measured by the SubBox operation. $I_k$ is the number of imported keys in this operation. $N_p$ and $N_k$ are the number of the remaining pairs and remaining keys after the operation calculated by Algorithm 7.

Full size table

I Key Recovery Attack Against 7-Round SPEEDY with Linear Cryptanalysis

(See Fig. 6).

J Key Recovery Attack against 4-Round SPEEDY with Differential Cryptanalysis

(See Fig. 7).

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wang, J., Niu, C., Liu, Q., Li, M., Preneel, B., Wang, M. (2023). Cryptanalysis of SPEEDY. In: Simpson, L., Rezazadeh Baee, M.A. (eds) Information Security and Privacy. ACISP 2023. Lecture Notes in Computer Science, vol 13915. Springer, Cham. https://doi.org/10.1007/978-3-031-35486-1_7

Download citation

DOI: https://doi.org/10.1007/978-3-031-35486-1_7
Published: 15 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35485-4
Online ISBN: 978-3-031-35486-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Cryptanalysis of SPEEDY

Abstract

Similar content being viewed by others

Cryptanalysis of Reduced Round SPEEDY

Differential Cryptanalysis of Round-Reduced SPEEDY Family

Cryptanalysis of KLEIN

Keywords

1 Introduction

1.1 Contributions

1.2 Organization

2 Preliminaries

2.1 Brief Introduction of SPEEDY

2.2 Differential, Linear and Differential-Linear Crytanalysis

3 Further Study on Operations of SPEEDY

3.1 Truncated Differential Distribution Table

3.2 Differential Propagation Property of MC

4 Automated Search Method Oriented to Key Recovery

4.1 Differential Search Model Against SPEEDY

4.2 Searching Linear Distinguishers Against SPEEDY

4.3 Searching Differential-Linear Distinguishers Against SPEEDY

5 Key Recovery Attack Against SPEEDY

5.1 7-Round Differential Attack Against Full SPEEDY

5.2 7-Round Linear Attack Against Full SPEEDY

5.3 4-Round Differential Attack Against SPEEDY.

5.4 4-Round Differential-Linear Attack Against SPEEDY

6 Conclusion

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Appendices

A The 6-Bit S-Box of SPEEDY

B Experiment to Verify the Correctness of FTDDT

C Inverse Probability of FTDDT

Theorem 1

Proof

D Searching Linear Distinguishers Against SPEEDY

E Searching Differential-Linear Distinguishers Against SPEEDY

F 4-Round Differential-Linear Attack Against SPEEDY

G Pseudocode of Key Recovery and Brute Force Parts on the \(2^{155}\) Pairs for Each Structure in the 7-Round Differential Attack

H Details for the Time Complexity of the Key Recovery on the \(2^{155}\) Pairs for Each Structure in the 7-Round Differential Attack

I Key Recovery Attack Against 7-Round SPEEDY with Linear Cryptanalysis

J Key Recovery Attack against 4-Round SPEEDY with Differential Cryptanalysis

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation