1 Introduction

In response to the collision attacks on MD5 and SHA-1 by Wang et al. [30, 31], the U.S. National Institute of Standards and Technology (NIST) announced a public competition in 2007 to select a new cryptographic hash function standard (SHA-3). After five years of intensive scrutiny, NIST selected Keccak as the winner of the SHA-3 competition in 2012. As one of the most important cryptographic standards, Keccak has attracted a great deal of attention from researchers and engineers worldwide. To date, many cryptanalysis results [9, 10, 16, 17, 23] and evaluation tools [7, 12, 22] have been proposed, including the recent impressive collision attacks [25, 27]. However, progress in the cryptanalysis of Keccak remains limited.

At EUROCRYPT 2015, Dinur et al. [11] introduced a new cube-attack-like cryptanalysis technique and gave the first security evaluations of the Keccak keyed modes. At CT-RSA 2015, Dobraunig et al. [13] evaluated the security of Ascon [14] against cube-attack-like cryptanalysis. Later, Dong et al. [15] applied the cube-like method to Ketje Sr [1]. At EUROCRYPT 2017, Huang et al. [19] introduced the conditional cube attack, which takes advantage of the large state freedom of Keccak to find a so-called conditional cube variable that does not multiply with any of the other cube variables (called ordinary cube variables) in the first and second rounds of Keccak. Later, Li et al. [21] applied the conditional cube attack to Ascon.

Recently, the cryptographic community has found that many classical cryptanalysis methods can be converted into mathematical optimization problems, which aim to minimize or maximize an objective function under certain constraints. Mixed-integer linear programming (MILP) is the most widely studied technique for solving these optimization problems. One of the most successful applications of MILP is the search for differential and linear trails. Mouha et al. [24] first applied the MILP method to count active S-boxes of word-based block ciphers. Then, at Asiacrypt 2014, Sun et al. [29] extended this technique to search for differential and linear trails by deriving linear inequalities from the H-representation of the convex hull of all differential patterns of the S-box. Two other important applications are the search for integral distinguishers [32] and impossible differentials [6, 26, 34, 35].

At Asiacrypt 2017, Li et al. [20] introduced a new MILP tool to improve conditional cube attacks on Keccak keyed modes. They observed that, once the conditional cube variable is given, finding enough ordinary cube variables is a mathematical optimization problem. They gave the MILP model and improved Huang et al.’s conditional cube attacks. Most recently, Song et al. [28] introduced a new MILP model to find better or optimal choices of conditional cubes. These works open a new way to study the Keccak sponge function: the MILP model makes the otherwise tedious cryptanalytic work much easier.

1.1 Our Contributions

In this paper, we find that Dinur et al.’s [11] cube-attack-like cryptanalysis technique can also be converted to, and improved by, a MILP model. In Dinur et al.’s attack, the key point is to select the public variables of the cube in such a way that the superpolys depend only on a (relatively) small number of key bits. In detail, in the first round of Keccak, the attacker finds a set of cube variables that are not multiplied with each other (we call it a linear-cube) and that, at the same time, are not multiplied with some of the key bits. By taking advantage of the CP-kernel [3], Dinur et al. found 32/64-dimension linear-cubes that are not multiplied with 64 key bits in Keccak-MAC-128 (with capacity 256).

In this paper, we propose a new MILP model to search for optimal linear-cubes that multiply with a minimum number of key bits in the first round. We model the so-called CP-like-kernel, the requirement that the cube variables are not multiplied with each other in the first round, the way in which the cube variables are not multiplied with key bits, and so on, and thereby construct a system of linear inequalities. The objective is to minimize the number of key bits that are multiplied with cube variables. Based on this new MILP tool, we find the optimal cubes that are multiplied with the fewest key bits for Keccak-MAC, Keyak and Ketje. All the results improve Dinur et al.’s attacks.

Table 1 Summary of key recovery attacks on Keccak Keyed modes

Compared with Huang et al.’s conditional cube attack, the advantage of the MILP-aided cube-attack-like cryptanalysis is its larger effective range. The conditional cube attack becomes much weaker or invalid when the number of degrees of freedom is small (Footnote 1). Hence, the conditional cube attack can only be applied to 6-round Keccak-MAC-512, and the attacks on Keyak in the nonce-respected setting are still limited. In contrast, MILP-aided cube-attack-like cryptanalysis can not only attack the same number of rounds as the conditional cube attack with the same number of degrees of freedom, but also gives better results on the Keccak keyed variants with a relatively small number of degrees of freedom. The results are summarized in Table 1. In addition, we make the source code of the new MILP tools and the verification programs publicly available (Footnote 2) to help researchers study Keccak. Our main results achieved by the MILP tools are listed below.

  1. When the capacity is 256, we find the optimal 32-dimension linear-cube for Keccak-MAC, which multiplies with only 18 key bits instead of Dinur et al.’s 64 bits. Using a divide-and-conquer approach, the complexity of the 6-round attack is only \(2^{42}\) instead of \(2^{66}\). This key-recovery attack is experimentally verified, and the verification programs can be found in the public repository mentioned above. We also find a new 64-dimension linear-cube that multiplies with only 30 key bits instead of Dinur et al.’s 64 bits. Based on it, the complexity of the 7-round cube-attack-like cryptanalysis is reduced by a factor of \(2^{17}\).

  2. In the Keccak sponge function, when the capacity reaches 1024, the number of degrees of freedom is so small that cryptanalysis becomes quite hard. Indeed, the collision attack and the preimage attack on Keccak-512 reach only 3 and 4 rounds, respectively. For Keccak-MAC-512, the cryptanalysis results are also the weakest. At EUROCRYPT 2015, Dinur et al. only gave cube-attack-like cryptanalysis when the capacity is smaller than 576. At EUROCRYPT 2017, Huang et al. gave the first 5-round key-recovery attack on Keccak-MAC-512 using the conditional cube attack. Then, at Asiacrypt 2017, Li et al. gave a 6-round conditional cube attack. In this paper, using our new MILP tool, we give the first 7-round key-recovery attack on Keccak-MAC-512.

  3. For Keyak in the nonce-respected setting, using our MILP tool, we improve Dinur et al.’s 6-round key-recovery attack on Lake Keyak to 8 rounds with the recommended nonce. In addition, we obtain the best attacks on Ketje Major/Minor in nonce-reduced settings. For Ketje Major, when the nonce has 9 lanes, we improve the best previous 6-round attack to 7 rounds. We give the first 7-round attack on Ketje Minor when the nonce is reduced to 288 bits, while the best previous 7-round attacks need a 654-bit nonce.

1.2 Organization of the paper

Section 2 gives some notations and a brief description of the Keccak-p permutations, Keccak-MAC, Keyak and Ketje. Related work is introduced in Sect. 3. Section 4 presents the idea behind the improvement of Dinur et al.’s attack. Section 5 describes the MILP search model for the cube-attack-like cryptanalysis. Round-reduced key-recovery attacks on Keccak-MAC-512 are presented in Sect. 6. Section 7 gives the cryptanalysis results on Lake Keyak in the nonce-respected setting. Section 8 gives the applications to Ketje. Section 9 concludes this paper.

2 Preliminaries

2.1 Notations

\(S_{i}\) :

The intermediate state after i rounds of Keccak-p; for example, \(S_{0.5}\) means the intermediate state before \(\chi \) in the first round of Keccak-p

A :

Used in tables: for Keccak-MAC and Keyak, the initial state; for Ketje, the state after \(\pi ^{-1}\) of Keccak-\(p^*\)

A[i][j]:

The 32/64-bit word indexed by \([i,j,*]\) of state A, \(0\leqslant i\leqslant 4\), \(0\leqslant j\leqslant 4\)

A[i][j][k]:

The bit indexed by \([i,j,k]\) of state A

\(v_{i}\) :

The ith cube variable

\(a_{i}\) :

The ith auxiliary variable used in the attack procedure

K :

128-bit key, for Keccak-MAC, \(K=k_0||k_1\), both \(k_0\) and \(k_1\) are 64-bit; for Ketje Major and Lake Keyak, \(K=k_0||k_1||k_2\), \(k_0\) is 56-bit, \(k_1\) is 64-bit, \(k_2\) is 8-bit; for Ketje Minor, \(K=k_0||k_1||k_2||k_3||k_4\), \(k_0\) is 24-bit, \(k_1\), \(k_2\) and \(k_3\) are 32-bit, \(k_4\) is 8-bit

\({k_i[j]}\) :

The jth bit of \(k_i\)

2.2 The Keccak-p permutations

The Keccak-p permutations are derived from the Keccak-f permutations [3] and have a tunable number of rounds. A Keccak-p permutation is defined by its width \(b=25\times 2^l\), with \(b\in \{25,50,100,200,400,800,1600\}\), and its number of rounds \(n_r\), denoted as Keccak-p[b]. The round function R consists of five operations:

$$\begin{aligned} \texttt {R}=\iota \circ \chi \circ \pi \circ \rho \circ \theta \end{aligned}$$

Keccak-p[b] works on a state A of size b, which can be represented as \(5\times 5\) lanes of \(\frac{b}{25}\) bits each, as depicted in Fig. 1. A[i][j] denotes a lane, with i the column index and j the row index. In what follows, the indices i and j are in the set \(\{0,1,2,3,4\}\) and are taken modulo 5 unless otherwise specified.

$$\begin{aligned} \begin{array}{l} \theta :A[x][y] = A[x][y] \oplus \sum \nolimits _{j = 0}^4 (A[x - 1][j] \oplus (A[x + 1][j]\lll 1)).\\ \rho :A[x][y] = A[x][y]\lll r[x,y].\\ \pi :A[y][2x+3y] = A[x][y].\\ \chi :A[x][y] = A[x][y]\oplus ((\lnot A[x+1][y])\wedge A[x+2][y]).\\ \iota : A[0][0]=A[0][0]\oplus RC. \end{array} \end{aligned}$$

In Ketje v2, the twisted permutations, Keccak-\(p^*[b]\)=\(\pi \circ \) Keccak-\(p[b]\circ \pi ^{-1}\), are introduced to effectively re-order the bits in the state. \(\pi ^{-1}\) is the inverse of \(\pi \): \(\pi ^{-1} :A[x+3y][x] = A[x][y].\)
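The step mappings above can be sketched in Python for the 1600-bit state. This is only an illustrative sketch, not the reference implementation: the state layout `A[x][y]` as Python integers, the function names, and passing the round constant `rc` as a parameter are assumptions made here, and the rotation table holds the standard \(\rho \) offsets for 64-bit lanes.

```python
# Illustrative sketch of the Keccak-p[1600] step mappings (not the reference code).
# Assumed layout: A[x][y] is a Python int holding the 64-bit lane (x, y).

W = 64
MASK = (1 << W) - 1

# Standard rho rotation offsets r[x][y] for 64-bit lanes.
RHO = [
    [0, 36, 3, 41, 18],
    [1, 44, 10, 45, 2],
    [62, 6, 43, 15, 61],
    [28, 55, 25, 21, 56],
    [27, 20, 39, 8, 14],
]

def rotl(v, n):
    """Rotate a 64-bit lane to the left by n positions."""
    n %= W
    return ((v << n) | (v >> (W - n))) & MASK

def theta(A):
    # C[x] is the parity of column x, D[x] is what gets added to every lane of column x.
    C = [A[x][0] ^ A[x][1] ^ A[x][2] ^ A[x][3] ^ A[x][4] for x in range(5)]
    D = [C[(x - 1) % 5] ^ rotl(C[(x + 1) % 5], 1) for x in range(5)]
    return [[A[x][y] ^ D[x] for y in range(5)] for x in range(5)]

def rho(A):
    return [[rotl(A[x][y], RHO[x][y]) for y in range(5)] for x in range(5)]

def pi(A):
    B = [[0] * 5 for _ in range(5)]
    for x in range(5):
        for y in range(5):
            B[y][(2 * x + 3 * y) % 5] = A[x][y]
    return B

def pi_inv(A):
    B = [[0] * 5 for _ in range(5)]
    for x in range(5):
        for y in range(5):
            B[(x + 3 * y) % 5][x] = A[x][y]
    return B

def chi(A):
    # The only nonlinear step: each bit is combined with a product of two row neighbours.
    return [[A[x][y] ^ (~A[(x + 1) % 5][y] & MASK & A[(x + 2) % 5][y])
             for y in range(5)] for x in range(5)]

def iota(A, rc):
    B = [column[:] for column in A]
    B[0][0] ^= rc
    return B

def keccak_round(A, rc):
    """One round R = iota o chi o pi o rho o theta; rc is supplied by the caller."""
    return iota(chi(pi(rho(theta(A)))), rc)

# CP-kernel example: two equal lanes in the same column leave all column parities at zero,
# so theta acts as the identity on this state.
Z = [[0] * 5 for _ in range(5)]
Z[2][2] = Z[2][3] = 0x0123456789ABCDEF
print(theta(Z) == Z)
```

The final lines illustrate the property exploited throughout the paper: a state whose columns all have even parity passes through \(\theta \) unchanged.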

Fig. 1 (a) The Keccak state [3]; (b) state A in two dimensions

2.3 Keccak-MAC

A MAC mode of Keccak can be obtained by prepending the key to the message/nonce. As depicted in Fig. 2, the input of Keccak-MAC-n is the concatenation of the key and the message, where n is half of the capacity length.

Fig. 2 Construction of Keccak-MAC-n

2.4 Keyak

The authenticated encryption cipher Keyak [2] is one of the 16 candidates in the third round of the CAESAR competition. Its mode is the Motorist mode [2], which is sponge-based and supports one or more duplex instances operating in parallel.

In Keyak, five instances are proposed (Table 2). For all instances, the number of rounds of Keccak-p[b] is \(n_r=12\), the capacity is \(c=256\) and the tag length is \(\tau =128\). The primary recommendation is Lake Keyak. As shown in Fig. 3, the 128-bit key is encoded in a key pack:

$$\begin{aligned} keypack(key,l)=enc_8(l)\Vert key\Vert pad10*[8l-8](|key|) \end{aligned}$$

In Lake Keyak, the key pack length l is 40 bytes and a 150-byte nonce is recommended. According to the specification of Keyak, in order to ensure the confidentiality of data, a user must respect the nonce requirement: a nonce must not be reused, otherwise confidentiality is not guaranteed. For authenticity and integrity of data, however, a variable nonce is not required. Readers can refer to [2] for more details.

2.5 Ketje

Ketje [1] is also one of the 16 candidates in the 3rd round CAESAR competition. It is a sponge-like construction.

Table 2 Five instances of Keyak

Fig. 3 Construction of Keyak on two blocks

Ketje is built on the authenticated encryption mode MonkeyWrap [1], which is based on MonkeyDuplex [4]. It consists of four parts: the initialization phase, the processing of associated data, the processing of the plaintext and the finalization phase. The initialization takes the secret key K, the public nonce N and some padding as the initial state. Then \(n_{start}=12\) rounds of Keccak-\(p^*\) are applied. Our attack is applied to the initialization phase of Ketje.

In Ketje v2, four concrete instances are proposed, as shown in Table 3, with \(n_{start}=12\), \(n_{step}=1\) and \(n_{stride}=6\). For Ketje Minor and Major, the recommended key length is 128 bits, so the maximal nonce lengths are \((800-128-18=)\,654\) and \((1600-128-18=)\,1454\) bits, respectively.

Table 3 Four instances in Ketje v2

3 Related work

3.1 Cube attack

At EUROCRYPT 2009, Dinur and Shamir introduced the cube attack [8], in which an output bit of a symmetric cryptographic scheme is regarded as a polynomial \(f(k_0,\ldots ,k_{n-1},v_0,\ldots ,v_{m-1})\) over GF(2), where \(k_0,\ldots ,k_{n-1}\) are the secret variables and \(v_0,\ldots ,v_{m-1}\) are the public variables (e.g., IV or nonce bits).

Theorem 1

[8]

$$\begin{aligned} f(k_0,\ldots ,k_{n-1},v_0,\ldots ,v_{m-1}) = t \cdot {P} + {Q}(k_0,\ldots ,k_{n-1},v_0,\ldots ,v_{m-1}) \end{aligned}$$
(1)

t is called a maxterm and is a product of certain public variables, for example \(v_0\cdots v_{s-1}\) with \(1\le s\le m\); these variables form the cube \(C_t\). None of the monomials in Q is divisible by t. P is called the superpoly and does not contain any variable of \(C_t\). Then the sum of f over all values of the cube \(C_t\) (the cube sum) is

$$\begin{aligned} \sum \limits _{v'=(v_0,\ldots ,v_{s-1}) \in {C_t}} {f(k_0,\ldots ,k_{n-1},v',v_{s},\ldots ,v_{m-1}) = {P}} \end{aligned}$$
(2)

where \(C_t\) contains all binary vectors of length s and \(v_{s},\ldots ,v_{m-1}\) are fixed to constants.

The basic idea is to find enough maxterms t whose superpoly P is linear and not a constant. This enables key recovery by solving a system of linear equations.
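To make Eq. (2) concrete, the following minimal sketch evaluates a cube sum on a small toy polynomial. The polynomial `f`, its three secret and three public bits, and the helper `cube_sum` are hypothetical and chosen only for illustration; they are unrelated to Keccak.

```python
from itertools import product

def f(k, v):
    """Toy output bit over GF(2) with 3 secret bits k and 3 public bits v (hypothetical)."""
    k0, k1, k2 = k
    v0, v1, v2 = v
    # Maxterm t = v0*v1 with superpoly P = k0 + k1 + v2.
    # The remaining terms form Q: none of them is divisible by v0*v1.
    return (v0 & v1 & (k0 ^ k1 ^ v2)) ^ (v0 & k2) ^ (v1 & v2 & k0) ^ k1 ^ v2

def cube_sum(func, k, cube_idx, fixed):
    """XOR func over all values of the public variables indexed by cube_idx.

    The public variables outside the cube keep the constant values given in fixed.
    """
    total = 0
    for bits in product((0, 1), repeat=len(cube_idx)):
        v = list(fixed)
        for idx, bit in zip(cube_idx, bits):
            v[idx] = bit
        total ^= func(k, tuple(v))
    return total

# With v2 fixed to 0, the cube sum over (v0, v1) equals the linear superpoly k0 + k1.
for k in product((0, 1), repeat=3):
    assert cube_sum(f, k, (0, 1), (0, 0, 0)) == k[0] ^ k[1]
```

Every term of Q appears an even number of times in the sum and cancels, so only the superpoly survives, which here gives one linear equation in the key bits.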

3.2 Dinur et al.’s cube-attack-like attack

At EUROCRYPT 2015, Dinur et al. [11] launched a cube-attack-like cryptanalysis on Keccak keyed modes. In the attack on 6-round reduced Keccak-MAC with capacity 256, the 128-bit key is placed in the lanes A[0][0] and A[1][0]. They found that if the cube variables are placed in A[2][2] and A[2][3], with equal values in the two bits of each column as shown in Fig. 4, then after \(\theta \), \(\rho \) and \(\pi \) the cube variables are multiplied only with the 64 key bits in A[0][0] in the first round. The cube sums after 6 rounds are therefore independent of the key bits in A[1][0].

Fig. 4 Dinur et al.’s work

In addition, Dinur et al. introduced 32 auxiliary variable bits, which are assumed to be equal to the key bits of A[0][0] in the same column. Hence, half of A[0][0] (32 key bits) together with the auxiliary variables lies in the CP-kernel, so that the cube variables do not multiply with those key bits and auxiliary variables in the first round. Thus only 32 key bits multiply with the cube variables in the first round, which means that only 32 key bits affect the cube sums of the output after 6 rounds.

The whole 6-round attack is as follows. In the preprocessing phase, the attacker calculates the cube sums for each value of the 32 key bits that multiply with the cube variables and stores them in a list L. In the online phase, for each of the \(2^{32}\) values of the auxiliary variables, the attacker calculates the cube sums of the output bits and searches for them in L; for each match in L, the corresponding 32-bit key is returned as a candidate. A similar attack is applied to 7-round Keccak-MAC. For more details, please refer to [11].

4 An improvement of Dinur et al.’s idea

In Dinur et al.’s divide-and-conquer strategy, the cube was chosen manually and is not optimal. Recently, Ye et al. [33] introduced a method to choose the cube variables more precisely, so that the number of secret key bits that multiply with the cube variables in the first round decreases; the complexity is reduced accordingly, which sometimes leads to attacks on more rounds. We call these secret key bits related key bits, and the other secret key bits, which do not multiply with cube variables in the first round, unrelated key bits. We describe this idea as follows.

As shown in Fig. 5, for example, we set the 128-bit secret key in \(A[0][0]=k_0\) and \(A[1][0]=k_1\) for 1600-bit-state Keccak-MAC. Suppose we select 32 cube variables as follows:

$$\begin{aligned} \left\{ \begin{aligned}&A[2][0][3 \cdot i]=v_{i},\\&A[2][2][3 \cdot i]=v_{i+16},\\&A[2][3][3 \cdot i]=v_{i}+v_{i+16} \end{aligned} \ \ \ for \ i=0,1 \ldots 15 \right. \end{aligned}$$
(3)

First, we examine how many lanes containing key bits multiply with the cube variables after the first round. After the \(\theta \) operation, the lanes in red are diffused by \(k_0\), as shown in Fig. 5, and the lanes in blue are diffused by \(k_1\), while the cube variables in green are only added to the key \(k_1\), not multiplied with it. After the \(\rho \) and \(\pi \) operations, we can see that only three lanes diffused by \(k_0\) multiply with the cube variables (in green); the other lanes, in particular those diffused by \(k_1\), do not multiply with the cube variables in the first round.

Fig. 5 The diffusion of key bits and cube variables in one round

Table 4 Rotation constants r[x, y] in the Keccak \(\rho \) operation

Fig. 6 The offset of key bits and cube variables in one round

Second, we examine how many bits of \(k_0\) (in only 3 lanes) multiply with these cube variables. As the \(\rho \) operation rotates different lanes by different offsets (Table 4), we use the number in each lane to denote its rotation offset relative to the initial state, as shown in Fig. 6. In the \(\chi \) operation, the cube variables from the lane A[2][0] of the initial state multiply with the key bits \(k_0[3\cdot i+62-2]\) (mod 64) for \(0\le i \le 15\), the cube variables from A[2][2] multiply with the key bits \(k_0[3\cdot i+43-44]\) (mod 64) for \(0\le i \le 15\), and the cube variables from A[2][3] multiply with the key bits \(k_0[3\cdot i +15-10]\) (mod 64) for \(0\le i \le 15\). We list the key bits they multiply with in Table 5 for each lane. As the different rows of Table 5 show, most related key bits appear two or three times, each time for a different lane, which means that only a few key bits are involved in the cube attack after diffusion if we choose the cube variables more precisely.

Table 5 The key bits multiplied with the 32-dimension cube variable

As a result, the new 32-dimension linear cube multiplies with only 19 key bits instead of Dinur et al.’s 64 key bits. However, this linear cube was found manually and is not an optimal cube, i.e., one that multiplies with the minimum number of key bits. Clearly, it is hard to find such an optimal 32- or 64-dimension linear cube by hand. In this paper, we introduce a new MILP method to solve the above optimization problem and then improve cube-attack-like cryptanalysis on Keccak keyed modes, especially when the number of degrees of freedom is relatively small.
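The count of 19 related key bits can be reproduced directly from the three index formulas given above for the cube of Eq. (3). The short sketch below does this; the function name and the dictionary layout are choices made here for illustration only.

```python
def related_key_bits():
    """Indices of k_0 that multiply with the cube of Eq. (3) in the first round."""
    cube_z = [3 * i for i in range(16)]  # z-positions of the cube variables
    # Offset difference (cube lane rotation minus key lane rotation), per cube lane.
    shifts = {"A[2][0]": 62 - 2, "A[2][2]": 43 - 44, "A[2][3]": 15 - 10}
    per_lane = {
        lane: sorted({(z + shift) % 64 for z in cube_z})
        for lane, shift in shifts.items()
    }
    union = sorted(set().union(*per_lane.values()))
    return per_lane, union

per_lane, union = related_key_bits()
print(len(union))  # 19
print(union)
```

Each lane alone touches 16 key bits, but the three index sets overlap heavily, which is why the union is so much smaller than 48.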

5 MILP modeling search strategy

In this section, we present how to model our search strategy using the MILP method. For any bit A[x][y][z] in the Keccak-p initial state, we define \(A[x][y][z]=1\) when it is a cube variable or a related key bit.

Since we need linear cubes in the first round, we need constraints that prevent the cube variables from multiplying with each other in the first round. The following inequalities are sufficient to model this:

$$\begin{aligned} \begin{aligned} A[x_1][y_1][z_1] + A[x_2][y_2][z_2] \le 1 \end{aligned} \end{aligned}$$
(4)

which means that if two bits \(A[x_1][y_1][z_1]\) and \(A[x_2][y_2][z_2]\) would multiply with each other in the first round, at most one of them can be chosen as a cube variable.

In order to control the diffusion of the cube variables, we make use of the CP-like-kernel, which was formalized by Guo et al. [17] and further studied in [20]. We keep the sum of the variables within the same column constant (usually zero), which makes the following \(\theta \) act as the identity on them, so the diffusion of the variables is greatly reduced. As a column that contains cube variables must contain at least 2 of them, the following inequalities are sufficient to model the CP-like-kernel:

$$\begin{aligned} \left\{ \begin{aligned}&\sum \limits _{y = 0}^4 {A[x][y][z]} - 2d[x][z] \ge 0\\&d[x][z] - A[x][i][z] \ge 0 \ \ \ 0 \le i \le 4\\ \end{aligned} \right. \end{aligned}$$
(5)

where the auxiliary variable d[x][z] records whether the column [x][z] contains cube variables. The column [x][z] can provide \(\sum \nolimits _{y = 0}^4 {A[x][y][z]} - d[x][z]\) independent cube variables. As we need enough cube variables for our attack, for example 64 cube variables for 7-round Keccak-MAC, we sum up the free bits available for cube variables and require the sum to equal the cube dimension. That is,

$$\begin{aligned} \begin{aligned} \sum \limits _{x,y,z}{A[x][y][z]} - \sum \limits _{x,z}{d[x][z]} = 2^{n-1}, \end{aligned} \end{aligned}$$
(6)

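i.e., the number of cube variables in free bits, where n is the number of attacked rounds of Keccak-MAC (the cube dimension is 32 for \(n=6\) and 64 for \(n=7\)).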

When a cube variable multiplies with a key bit A[x][y][z], we call this key bit a related key bit and set \(A[x][y][z]=1\). If a key bit \(A[x_1][y_1][z_1]\) multiplies with a cube variable \(A[x_2][y_2][z_2]\), we need the following inequality to constrain the related key bit:

$$\begin{aligned} \begin{aligned} A[x_1][y_1][z_1] - A[x_2][y_2][z_2] \ge 0 \end{aligned} \end{aligned}$$
(7)

Since we would like to obtain the minimum number of related key bits when the number of rounds is given, we set the objective function as:

$$\begin{aligned} \begin{aligned} Min {\sum \limits _{x,y,z}{A[x][y][z]}}. \end{aligned} \end{aligned}$$
(8)

Now we have the objective function and all the inequalities above as constraints, and thus obtain a complete MILP model, which can be solved by the off-the-shelf optimizer Gurobi [18].
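As a small sanity check of constraints (5) and (6), the sketch below verifies a given candidate cube against them instead of solving the MILP. It is not the search tool itself: the function name and the representation of cube bits as `(x, y, z)` tuples are assumptions made here, and constraints (4) and (7) are left out because they require the first-round multiplication pattern.

```python
from collections import defaultdict

def check_cp_like_kernel(cube_bits):
    """Check Eq. (5) for a candidate set of cube bits and evaluate the left side of Eq. (6).

    cube_bits: set of (x, y, z) positions with A[x][y][z] = 1 among the public bits.
    Returns (ok, dimension).
    """
    column_count = defaultdict(int)
    for x, y, z in cube_bits:
        column_count[(x, z)] += 1
    # d[x][z] >= A[x][i][z] forces d[x][z] = 1 on every occupied column, and
    # sum_y A[x][y][z] - 2 d[x][z] >= 0 then requires at least two bits in that column.
    ok = all(count >= 2 for count in column_count.values())
    # Each occupied column loses one degree of freedom to the parity condition.
    dimension = len(cube_bits) - len(column_count)
    return ok, dimension

# The manually chosen cube of Eq. (3): three bits in each of the 16 columns [2][3i].
cube = {(2, y, 3 * i) for y in (0, 2, 3) for i in range(16)}
print(check_cp_like_kernel(cube))  # (True, 32)
```

In an actual search, the same inequalities would be handed to the solver as constraints over binary variables, with the objective of Eq. (8).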

6 Applications to round-reduced Keccak-MAC

In this section, we apply our MILP tool to round-reduced Keccak-MAC. In order to give an explicit comparison with the previous cube-attack-like cryptanalysis of Keccak-MAC and to verify our key-recovery attack experimentally, we first present the application to Keccak-MAC with capacity 256, i.e., Keccak-MAC-128. There our attack is slightly inferior to Huang et al.’s conditional cube attack, but much better than the previous cube-like attack. We then give our main result, the application to capacity 1024, i.e., Keccak-MAC-512, where our attack is the first 7-round key-recovery attack.

6.1 Attack on 6/7-round Keccak-MAC-128

For Keccak-MAC-128 with a 1600-bit state, the rate occupies 1344 bits and the capacity 256 bits. As shown in Fig. 7, the 128-bit key (\(k_0,k_1\)) is located in the first two (yellow) lanes; the white bits represent nonce or message bits, all of which can be selected as cube variables, while the grey ones are initialized to zero.

Fig. 7 The initial state of Keccak-MAC-128 (left) and Keccak-MAC-512 (right)

According to the modeling search strategy illustrated in Sect. 5, we search for the minimum number of related key bits. The objective function is

$$\begin{aligned} Min \sum \limits _{y=0,x\in \{0,1\},z\in \{0,1\ldots 63\}}{A[x][y][z]}. \end{aligned}$$

With the help of Gurobi [18], the objective function is optimized under all the constraints in Sect. 5, the minimum number of related key bits is 18 for 32-dimension linear cubes and 30 for 64-dimension linear cubes. The cube variables and related bits are listed in Tables 6 and 7 in Appendix.

6.1.1 Attack on 6-round Keccak-MAC-128

The attack consists of a preprocessing phase and an online phase. The related key bits in Table 6 are multiplied with the cube variables in the first round, and the cube variables are not multiplied with any other secret key bits, which means that the other secret key bits have no influence on the cube sums over the \(2^{32}\) different messages. Among these 18 related key bits, we guess the 9 guessed key bits listed in Table 6 in the preprocessing phase; for each of the other 9 related key bits, we set an auxiliary variable in the same column. When the auxiliary variables are equal to the key bits in the same column, these bits lie in the CP-kernel, so the diffusion of these key bits is reduced and they are not multiplied with the cube variables; hence the cube sums do not depend on these key bits. We choose the auxiliary variables carefully and check that their matched related key bits (in the same column) do not multiply with the cube variables in the first round when the auxiliary variables are equal to them (Footnote 3). After recovering these key bits, we simply shift the positions of all the cube variables along the z-direction; because Keccak is translation-invariant along the z-axis, another set of key bits is then involved in the key-recovery attack. We present the attack procedure as follows:

Preprocessing phase

  • Set all the state bits except the cube variables to zero (or any other arbitrary constant).

  • For each possible value of the 9 guessed key bits in Table 6, calculate the cube sums after 6 rounds for all the output bits over the \(2^{32}\) values of the 32-dimension cube in Table 6. Store the cube sums in a sorted list L together with the value of the 9-bit guessed key.

In the preprocessing phase we calculate \(2^{9}\) cube sums over the 32-dimension cube, so the time complexity is \(2^{9} \times 2^{32} = 2^{41}\) evaluations of 6-round Keccak-MAC, and the memory complexity is \(2^{9}\) 128-bit words.

Online phase

  • For each possible value of 9-bit auxiliary variables listed in Table 6, request the outputs of \(2^{32}\) messages that make up the 32-dimension cube variables.

  • Calculate the cube sums for the output bits and search them in list L. For each match in L, regard the 9-bit guessed key and 9-bit auxiliary variables as the candidate for the 18-bit related key in the Table 6.

Once the 9-bit auxiliary variables are equal to the 9 related key bits other than the guessed key bits, these 9 related key bits together with the auxiliary variables have no influence on the cube sums, as they no longer multiply with the 32-dimension cube variables in the first round. Then only the 9-bit guessed key affects the cube sums. The memory complexity is \(2^{9}\), the data complexity is \(2^{32}\), and the total computation of this attack is \(2^{9} \times 2^{32} \times 2 = 2^{42}\), which is less than Dinur et al.’s \(2^{66}\).
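The interplay of the two phases can be illustrated on a toy model. The sketch below is not Keccak-MAC: `toy_mac` and `superpoly` are hypothetical functions invented here so that the cube sum depends only on the guessed key bits and on the XOR of the remaining related key bits with the auxiliary bits, mimicking the CP-kernel effect described above. The sizes (2 cube bits, 4 guessed bits, 4 auxiliary bits) are likewise arbitrary.

```python
from itertools import product

N_CUBE, N_GUESS, N_AUX = 2, 4, 4  # toy sizes; the 6-round attack uses 32, 9 and 9

def superpoly(kg, w):
    """Hypothetical 16-bit cube-sum value; w = (matched key bits) XOR (auxiliary bits)."""
    return ((kg * 2654435761) ^ (w * 40503) ^ (kg * w * 97)) & 0xFFFF

def toy_mac(kg, ka, aux, v):
    """Toy 16-bit output: only the product v0*v1 carries the superpoly."""
    v0, v1 = v
    w = ka ^ aux  # vanishes when the auxiliary bits equal the matched key bits
    low = (v0 * (0x1234 ^ ka)) ^ (v1 * (0x0F0F ^ kg ^ aux)) ^ (ka * 257)
    return (((v0 & v1) * superpoly(kg, w)) ^ low) & 0xFFFF

def cube_sum(oracle):
    total = 0
    for v in product((0, 1), repeat=N_CUBE):
        total ^= oracle(v)
    return total

def preprocess():
    """Attacker-side table: cube sums for every guessed key, other bits set to zero."""
    table = {}
    for kg in range(1 << N_GUESS):
        s = cube_sum(lambda v: toy_mac(kg, 0, 0, v))
        table.setdefault(s, []).append(kg)
    return table

def online(query, table):
    """Try every auxiliary value, query the keyed oracle, and match against the table."""
    candidates = []
    for aux in range(1 << N_AUX):
        s = cube_sum(lambda v: query(aux, v))
        for kg in table.get(s, []):
            candidates.append((kg, aux))
    return candidates

SECRET_KG, SECRET_KA = 0b1011, 0b0110  # unknown to the attacker

def query(aux, v):
    return toy_mac(SECRET_KG, SECRET_KA, aux, v)

candidates = online(query, preprocess())
print((SECRET_KG, SECRET_KA) in candidates)
```

Both phases cost \(2^{4}\times 2^{2}\) toy evaluations here, in the same way that each phase of the 6-round attack costs \(2^{9}\times 2^{32}\); wrong auxiliary values may occasionally produce extra candidates, which would be filtered by further checks.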

6.1.2 Attack on 7-round Keccak-MAC-128

We find 30 related key bits for a 64-dimension cube; the cube variables, guessed key bits and auxiliary variables are listed in Table 7. The attack procedure is the same as for the 6-round attack. In the preprocessing phase, we compute the cube sums over the \(2^{64}\) cube values for each value of the 15 guessed key bits and store them in the list L. In the online phase, we compute the cube sums for each value of the 15-bit auxiliary variables. The total time complexity of the 7-round attack is \(2^{15}\times 2^{64}\times 2=2^{80}\) and the memory complexity is \(2^{15}\) 128-bit words.

6.2 Attack on 6/7-round Keccak-MAC-512

For Keccak-MAC-512 with a 1600-bit state, the rate occupies 576 bits and the capacity 1024 bits. As shown in Fig. 7, the 128-bit key (\(k_0,k_1\)) is located in the first two (yellow) lanes and the white bits represent nonce bits, but only the white lanes highlighted by thick red lines can be selected as cube variables, because the other white lanes do not satisfy the CP-like-kernel and diffuse badly. In fact, we can find enough cube variables in the lanes highlighted by thick red lines. The grey bits are initialized to zero.

We use our MILP tools and find 32-dimension cube variables with 52 related key bits and 64-dimension cube variables with 95 related key bits respectively. The cube variables, related key bits, guessed key bits and the auxiliary variables for 6/7-round attack are listed in Tables 8 and 9 in Appendix respectively.

The attack procedure is the same as for Keccak-MAC-128, so we only discuss the complexity here.

6.2.1 Attack on 6-round Keccak-MAC-512

There are 52 related key bits in total, 26 of which are guessed in the preprocessing phase. The auxiliary variables also comprise 26 bits, so the time complexity is \(2^{26}\times 2^{32}= 2^{58}\) for both the preprocessing phase and the online phase; the total time complexity is therefore \(2^{59}\) and the memory complexity is \(2^{26}\) 128-bit words.

6.2.2 Attack on 7-round Keccak-MAC-512

There are 95 related key bits. We choose 47 bits as guessed key bits and find auxiliary variables for the other 48 bits. The time complexity of the 7-round attack on Keccak-MAC-512 is \(2^{47}\times 2^{64}+2^{48}\times 2^{64}=2^{112.6}\), and the memory complexity is \(2^{47}\) 128-bit words.
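The time complexities quoted in this section all follow the same pattern: the preprocessing phase costs \(2^{g}\times 2^{d}\) and the online phase \(2^{a}\times 2^{d}\), for g guessed key bits, a auxiliary bits and cube dimension d. The short sketch below recomputes the totals from the parameters stated above; the helper name `total_log2` is introduced here for illustration.

```python
from math import log2

def total_log2(cube_dim, guessed_bits, aux_bits):
    """log2 of (preprocessing + online) cost, counted in reduced-round evaluations."""
    preprocessing = 2 ** (guessed_bits + cube_dim)
    online = 2 ** (aux_bits + cube_dim)
    return log2(preprocessing + online)

# (cube dimension, guessed key bits, auxiliary bits) as stated in Sect. 6.
attacks = {
    "6-round Keccak-MAC-128": (32, 9, 9),
    "7-round Keccak-MAC-128": (64, 15, 15),
    "6-round Keccak-MAC-512": (32, 26, 26),
    "7-round Keccak-MAC-512": (64, 47, 48),
}
for name, params in attacks.items():
    print(f"{name}: 2^{total_log2(*params):.1f}")
```

When the guessed and auxiliary parts differ by one bit, as in the 7-round attack on Keccak-MAC-512, the total is \(3\cdot 2^{111}\), which is where the fractional exponent comes from.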

7 Attacks on round-reduced Keyak

The authenticated encryption cipher Keyak is one of the 16 candidates in the third round of the CAESAR competition. Previous cryptanalytic results such as [5, 19] target its authenticity only; in other words, the nonce is reused. If both confidentiality and authenticity of data are required, however, a nonce cannot be reused [2]. In this section, we present 7/8-round key-recovery attacks on Lake Keyak (1600-bit state) in the nonce-respected setting, using the recommended nonce length of 150 bytes for Lake Keyak according to the design document [2]. To the best of our knowledge, our result on Lake Keyak is the first 8-round key-recovery attack in the nonce-respected setting; at nearly the same time, Song et al. [28] provided another result with the conditional cube attack, which is slightly better than ours.

7.1 Attacks on round-reduced Lake Keyak

For Lake Keyak in the nonce-respected setting, we get 18 related key bits with 32-dimension linear cube and 29 related key bits with 64-dimension linear cube. The related key bits, guessed key bits, auxiliary variables and cube variables are listed in Tables 10 and 11 in Appendix for 7-round/8-round key-recovery attack respectively. We present the 7/8-round key-recovery attack for Lake Keyak with time complexity \(2^{42}\) and \(2^{79.6}\) respectively as follows.

7.1.1 Attack on 7-round Lake Keyak

We use the 32-dimension cube variables listed in Table 10, as well as the related key bits, guessed key bits and auxiliary variables.

Preprocessing phase

  • Set all the state bits except the cube variables to zero (or any other arbitrary constant).

  • For each possible value of the 9 guessed key bits in Table 10, calculate the output bits after 7 rounds (with capacity 256, the output has 1344 bits). Then compute the first 10 lanes of the output backward through the \(\iota \) and \(\chi \) operations (as the state \(S_{6.5}\) has the same algebraic degree as \(S_{7}\)) and compute the cube sums over the \(2^{32}\) messages according to the 32-dimension cube variables listed in Table 10. Store the 640-bit cube sums in a sorted list L together with the value of the 9-bit guessed key.

Online phase

  • For each possible value of the 9-bit auxiliary variables listed in Table 10, request the 7-round Lake Keyak outputs of the \(2^{32}\) messages that make up the 32 cube variables, then compute the first 10 lanes of the output backward through the \(\iota \) and \(\chi \) operations.

  • Calculate the cube sums for the first 10-lane output bits and search them in list L. For each match in L, regard the 9-bit guessed key and 9-bit auxiliary variables as the candidate for the 18-bit related key in the Table 10.

We need to compute the 32-dimension cube sums for \(2^{9}\) values of the guessed key bits in the preprocessing phase, so the time complexity is \(2^9 \times 2^{32}=2^{41}\) and the memory complexity is \(2^9\) 640-bit words. In the online phase, we need to compute the cube sums for each possible value of the 9 auxiliary variables, so the time complexity is also \(2^{41}\). Thus, the whole attack has time complexity \(2^{42}\) and memory complexity \(2^{9}\).

7.1.2 Attack on 8-round Lake Keyak

The attack procedure is very similar to the 7-round attack, so we only discuss the complexity here. We need to compute the cube sums for \(2^{14}\) values of the 14 guessed key bits in the preprocessing phase, so the time complexity is \(2^{14} \times 2^{64}=2^{78}\) and the memory complexity is \(2^{14}\) 640-bit words. In the online phase, we need to compute the cube sums for each possible value of the 15 auxiliary variables, so the time complexity is \(2^{15} \times 2^{64} = 2^{79}\). Thus, the total time complexity is \(2^{78} + 2^{79} = 2^{79.6}\) and the memory complexity is \(2^{14}\).
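For completeness, the fractional exponent of the total follows from a one-line calculation:

$$\begin{aligned} 2^{78}+2^{79} = 3\cdot 2^{78} = 2^{78+\log _2 3} \approx 2^{79.585} \approx 2^{79.6}. \end{aligned}$$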

8 Applications to round-reduced initialization of Ketje

On 6 March 2017, the Keccak team announced the Ketje cryptanalysis prize to encourage cryptanalysis of Ketje. In [20], Li et al. present conditional cube attacks on Ketje. They also explore the resistance of Ketje against the conditional cube attack for different nonce lengths. For Ketje Major, they search for possible cube variables in instances with different nonce lengths and study the borderline nonce length that still provides enough cube variables for the conditional cube attack. As a result, they point out that 7-round Ketje Major can be attacked when its nonce is longer than 704 bits, while for Ketje Minor the adversary needs to use (nearly) the full nonce length. In this section, we use the MILP-aided cube-attack-like cryptanalysis to explore how few degrees of freedom suffice to attack Ketje. We point out that, compared with the conditional cube attack, the MILP-aided cube-attack-like cryptanalysis works better when the number of degrees of freedom is smaller. We present our attacks on Ketje as follows.

8.1 Attacks on round-reduced initialization of Ketje Minor

Since we want to explore how small the number of degrees of freedom can be while the MILP-aided cube-attack-like cryptanalysis still works, we need to find enough cube variables (64 for the 7-round attack) and minimize the number of related key bits at the same time. The number of related key bits must be smaller than 128; subject to this constraint, we make the nonce as short as we can. As a result, for Ketje Minor with the nonce reduced to 288 bits, we find 64-dimensional linear cubes with 96 related key bits, which are listed in Table 12 in the Appendix together with the auxiliary variables and guessed key bits. With these variables, we can perform the cube-attack-like cryptanalysis just as before. In the preprocessing phase, we compute the cube sums over the 64 cube variables for each possible value of the 48 guessed key bits and store them. In the online phase, we compute the cube sums for each possible value of the 48 auxiliary variables; if the cube sums of the two phases are equal, we regard the corresponding combination as the correct value of the 96 related key bits. The time complexity of this 7-round key-recovery attack is \(2^{48} \times 2^{64} \times 2 = 2^{113}\), and the memory complexity is \(2^{48}\).

8.2 Attacks on round-reduced initialization of Ketje Major

For Ketje Major, we find 64 cube variables with 58 related key bits when the nonce is reduced to 576 bits. We list the cube variables, related key bits, guessed key bits and auxiliary variables in Table 13 in the Appendix. We omit the attack procedure here and only present the complexity. In the preprocessing phase, we compute the cube sums over the 64 cube variables for each possible value of the 29 guessed key bits and store them. In the online phase, we compute the cube sums for each possible value of the 29 auxiliary variables; if the cube sums of the two phases are equal, we regard the corresponding combination as the correct value of the 58 related key bits. The time complexity of this 7-round key-recovery attack is \(2^{29} \times 2^{64} \times 2 = 2^{94}\), and the memory complexity is \(2^{29}\).
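All complexity figures quoted for the Keyak and Ketje attacks follow the same pattern: one cube sum per guessed-key value offline plus one per auxiliary value online. The short Python sketch below recomputes them from the parameters given in the text. The function name `attack_cost` and the dictionary labels are ours, and the memory figure counts stored cube sums only, as in the text.

```python
import math

def attack_cost(cube_dim, n_guess, n_aux):
    """Return (log2 of total time, log2 of stored cube sums) for the two-phase attack."""
    offline = 1 << (n_guess + cube_dim)  # preprocessing: 2^n_guess cube sums
    online = 1 << (n_aux + cube_dim)     # online phase: 2^n_aux cube sums
    return math.log2(offline + online), n_guess

# (cube dimension, guessed key bits, auxiliary variables), as stated in the text
attacks = {
    "Keyak, 7 rounds": (32, 9, 9),
    "Keyak, 8 rounds": (64, 14, 15),
    "Ketje Minor, 7 rounds": (64, 48, 48),
    "Ketje Major, 7 rounds": (64, 29, 29),
}

for name, params in attacks.items():
    log_time, log_mem = attack_cost(*params)
    print(f"{name}: time 2^{log_time:.1f}, memory 2^{log_mem}")
```

When the numbers of guessed key bits and auxiliary variables are equal, the total is exactly twice the cost of one phase, which is where the factor 2 in the Ketje formulas comes from.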

9 Conclusion

In this paper, we present a new MILP-based method to improve Dinur et al.’s cube-attack-like method. We find the optimal linear cubes, i.e., those that are multiplied with the minimum number of key bits, for Keccak-MAC, Lake Keyak and Ketje. We then give the first 7-round key-recovery attack on Keccak-MAC-512. For Lake Keyak, our attack works in the nonce-respecting setting, although it is not as good as the conditional cube attack. For Ketje Minor/Major, we obtain better results than before in terms of complexity or number of attacked rounds, with a shorter nonce.

Compared with Huang et al.’s conditional cube attack, the advantage of the MILP-aided cube-attack-like cryptanalysis is that it has a larger effective range. In variants with the same number of degrees of freedom, the MILP-aided cube-attack-like cryptanalysis and the conditional cube attack can attack the same number of rounds. In variants with a relatively small number of degrees of freedom, the MILP-aided cube-attack-like cryptanalysis can achieve better results than the conditional cube attack.

Currently, progress in the cryptanalysis of symmetric-key ciphers depends heavily on automated evaluation tools. Due to Keccak’s robust design, its cryptanalysis is still hard and limited. In this paper, we provide a new MILP model to study Keccak. Since the tedious cryptanalytic work is delegated to the MILP solver, the study of Keccak becomes easier.