1 Introduction

In recent years, research on lightweight block ciphers has received a lot of attention. Lightweight block ciphers are widely used in the Internet of Things and wireless communication because their structures are simple and they can run in low-power environments. Many lightweight block ciphers, such as PRESENT [5], CLEFIA [17], LED [10], PRINCE [6], SIMON and SPECK [3], have been published over the last decades. GIFT [2] is a lightweight block cipher proposed by Banik et al. at CHES 2017, designed to celebrate 10 years of PRESENT. GIFT has an SPN structure similar to that of PRESENT. It has two versions, GIFT-64 and GIFT-128, whose block sizes are 64 and 128 bits and whose round numbers are 28 and 40, respectively.

Many classical cryptanalysis methods can be converted to mathematical optimization problems, which aim to achieve the minimal or maximal value of an objective function under certain constraints. Mixed-Integer Linear Programming (MILP) is the most widely studied technique for solving these optimization problems. One of the most successful applications of MILP is the search for differential and linear trails. Mouha et al. first applied the MILP method to count active S-boxes of word-based block ciphers [12]. Then, at Asiacrypt 2014, Sun et al. extended this technique to search for differential and linear trails [20]; the main idea is to derive linear inequalities from the H-Representation of the convex hull of all differential patterns and linear bias of the S-box. Xiang et al. [21] introduced a MILP model to search for integral distinguishers, and Sasaki et al. [16] and Cui et al. [7] independently gave MILP-based impossible differential search models. Many other MILP-based tools have been proposed, such as MILP-based differential/linear search models for ARX ciphers [8] and MILP-based conditional cube attacks [11] on Keccak [4].

Our Contributions

The designers of GIFT provided many analysis results in [2]. They first used MILP to compute lower bounds on the number of active S-boxes in differential cryptanalysis. Then they presented round-reduced differential probabilities. For GIFT-64, they provided a 9-round differential characteristic with probability \(2^{-44.415}\) and expected that the differential probability of 13-round GIFT-64 would be lower than \(2^{-63}\). For GIFT-128, they provided a 9-round differential probability of \(2^{-47}\) and expected that the differential probability of 26-round GIFT-128 would be lower than \(2^{-127}\). The designers did not present an actual attack on GIFT in [2].

In this paper, we build an efficient two-stage MILP-based model inspired by Sun et al.’s two-stage model [18]. Our model includes two interactive sub-models, denoted as the outer-MILP part and the inner-MILP part. The outer-MILP part obtains the minimal number of active S-boxes, namely, the truncated differential. The inner-MILP part then produces the differential characteristic with maximal probability that matches the truncated differential. With our two-stage model, we find several 12-round differential characteristics of GIFT-64, some of which are iterative. Moreover, using a 12-round differential characteristic with probability \(2^{-60}\), we give an attack on 19-round reduced GIFT-64 (out of 28 full rounds) with time complexity \(2^{112}\), memory complexity \(2^{80}\) and data complexity \(2^{63}\).

In addition, we improve our search model to find differential characteristics of GIFT-128. First, the algorithm solves a sub-MILP-model to obtain an acceptable differential characteristic with a small number of rounds. The output difference of one sub-MILP-model serves as the input difference of the following sub-MILP-model. The sub-MILP-model is iterated until the probability of the whole differential characteristic is higher than the given bound. Using our algorithm, we find some new differential characteristics, including a new 18-round differential characteristic with probability \(2^{-109}\). With this 18-round differential characteristic, we give the first attack on 23-round GIFT-128 (out of 40 full rounds). All of the source code is uploaded to GitHub (https://github.com/zhuby12/MILP-basedModel).

The summary of differential analysis of GIFT is shown in Table 1.

Table 1. Summary of cryptography analysis on GIFT
Fig. 1. Two rounds of GIFT-64

2 Preliminaries

2.1 Description of GIFT

GIFT has an SPN structure which is similar to PRESENT. It has two versions, namely GIFT-64 and GIFT-128, whose block sizes are 64 and 128 and round numbers are 28 and 40 respectively. Both versions have a key length of 128 bits.

Each round of GIFT consists of three steps: SubCells, PermBits and AddRoundKey. The round function of GIFT-64 is shown in Fig. 1. Similarly, GIFT-128 adopts thirty-two 4-bit S-boxes for each round.

SubCells. Both versions of GIFT use the same invertible 4-bit S-box, which is the only nonlinear component of the algorithm. The action of this S-box in hexadecimal notation is given in Table 2.

Table 2. Sbox of GIFT

PermBits. The bit permutations used in GIFT-64 and GIFT-128 are given in Table 3.

Table 3. Specifications of GIFT bit permutation

AddRoundKey. In each round, the round key RK is extracted from the key state before the key state is updated.

For GIFT-64, two 16-bit words of the key state are extracted as the round key \(RK=U||V\). U and V are XORed to \(b_{4i+1}\) and \(b_{4i}\) of the cipher state respectively. \(b_i\) represents the i-th bit of the cipher state. \(u_i\) and \(v_i\) represent the i-th bit of U and V.

$$ U \leftarrow k_1, V \leftarrow k_0 $$
$$ b_{4i+1} \leftarrow b_{4i+1} \oplus u_i, b_{4i} \leftarrow b_{4i} \oplus v_i, \forall i \in \{0,\cdots ,15\} $$

For GIFT-128, four 16-bit words of the key state are extracted as the round key \(RK=U||V\). U and V are XORed to \(b_{4i+2}\) and \(b_{4i+1}\) of the cipher state respectively.

$$ U \leftarrow k_5||k_4, V \leftarrow k_1||k_0 $$
$$ b_{4i+2} \leftarrow b_{4i+2} \oplus u_i, b_{4i+1} \leftarrow b_{4i+1} \oplus v_i, \forall i \in \{0,\cdots ,31\} $$

The key state of both versions is updated as follows:

$$ k_7||k_6||\cdots ||k_1||k_0 \leftarrow k_1 \ggg 2 || k_0 \ggg 12 || \cdots ||k_3||k_2 $$
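The GIFT-64 round-key extraction and the key state update above can be sketched as follows. This is a minimal illustration, assuming the key state is held as a Python list `[k0, ..., k7]` of 16-bit integers; the function names are hypothetical and not taken from the authors' code.

```python
def ror16(word, r):
    """Rotate a 16-bit word right by r positions."""
    r %= 16
    return ((word >> r) | (word << (16 - r))) & 0xFFFF


def update_key_state(k):
    """Key state update for k = [k0, ..., k7] (assumed list layout).

    New k7 = k1 >>> 2, new k6 = k0 >>> 12, and k7..k2 move down to k5..k0.
    """
    return k[2:] + [ror16(k[0], 12), ror16(k[1], 2)]


def gift64_round_keys(k, rounds):
    """Return the (U, V) = (k1, k0) pairs extracted before each update."""
    keys = []
    for _ in range(rounds):
        keys.append((k[1], k[0]))
        k = update_key_state(k)
    return keys
```

Under these assumptions, `gift64_round_keys(k, 19)[16]` is the round key of the 17-th round, which evaluates to `(ror16(k[1], 8), k[0])`.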

Round Constants. For both versions of GIFT, a single bit “1” and a 6-bit constant \(C=\{c_5,c_4,c_3,c_2,c_1,c_0\}\) are XORed into the cipher state at bit positions n-1, 23, 19, 15, 11, 7 and 3, respectively, in each round. For GIFT-64, n-1 is 63 and for GIFT-128, n-1 is 127. \(\{c_5, c_4, c_3, c_2, c_1, c_0\}\) are initialized to “0”, and they are updated as follows:

$$ (c_5,c_4,c_3,c_2,c_1,c_0)\leftarrow (c_4,c_3,c_2,c_1,c_0,c_5\oplus c_4\oplus 1) $$
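The constant update is a 6-bit shift register and can be sketched as below. The sketch assumes that the constant is updated before it is used in each round and packs \(c_5,\ldots ,c_0\) into one integer with \(c_0\) as the least significant bit; the function name is hypothetical.

```python
def gift_round_constants(rounds):
    """Generate the 6-bit round constants, one per round.

    Assumption: the register starts at 0 and is updated before each use.
    """
    c = 0
    constants = []
    for _ in range(rounds):
        c5 = (c >> 5) & 1
        c4 = (c >> 4) & 1
        # Shift left by one bit and feed c5 XOR c4 XOR 1 into c0.
        c = ((c << 1) & 0x3F) | (c5 ^ c4 ^ 1)
        constants.append(c)
    return constants
```

With these assumptions, the first constants are 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3E.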

2.2 Notations

figure a

3 Related Works

3.1 Mouha et al.’s Framework for Word-Oriented Block Ciphers

Mouha et al. [12] introduced a MILP model to count the number of differentially active S-boxes for word-oriented block ciphers.

Definition 1

Consider a differential characteristic state \(\varDelta \) consisting of n bytes \(\varDelta = (\varDelta _0,\varDelta _1,\ldots ,\varDelta _{n-1})\). Then, the difference vector \(x = (x_0,x_1,\ldots ,x_{n-1})\) corresponding to \(\varDelta \) is defined as

$$\begin{aligned} x_i = \left\{ \begin{array}{l} 0 \quad if\,\varDelta _i=0,\\ 1 \quad otherwise. \end{array} \right. \end{aligned}$$
(1)

Based on Definition 1, Mouha et al. translated the XOR operation and the linear transformation to linear inequalities as follows:

  • Equations describing the XOR operation: Let the input difference vector of the XOR operation be \((x_{in1}^\oplus ,x_{in2}^\oplus )\) and the corresponding output difference vector be \(x_{out}^\oplus \). The following constraints ensure that when \(x_{in1}^\oplus \), \(x_{in2}^\oplus \) and \(x_{out}^\oplus \) are not all zero, at least two of them are nonzero:

    $$\begin{aligned} \left\{ \begin{array}{l} x_{in1}^\oplus + x_{in2}^\oplus + x_{out}^\oplus \ge 2d_\oplus \\ d_\oplus \ge x_{in1}^\oplus , d_\oplus \ge x_{in2}^\oplus , d_\oplus \ge x_{out}^\oplus \end{array} \right. \end{aligned}$$
    (2)

    where \(d_\oplus \) is a dummy variable taking values in {0, 1}.

  • Equations describing the linear transformation: Assume that a linear transformation L maps the input difference vector \((x_{0}^L,x_{1}^L,\ldots ,x_{m-1}^L)\) to the output difference vector \((y_{0}^L,y_{1}^L,\ldots ,y_{m-1}^L)\), and let \(\mathcal {B_D}\) be its differential branch number. The input and output difference vectors are subject to the following constraints:

    $$\begin{aligned} \left\{ \begin{array}{l} \sum \nolimits _{i=0}^{m - 1} {x_i^L} + \sum \nolimits _{i=0}^{m - 1} {y_i^L} \ge {\mathcal {B_D}{d^L}} \\ {d^L} \ge {x_i^L},{d^L} \ge {y_i^L},i \in \{ 0,\ldots ,m - 1\} \end{array} \right. \end{aligned}$$
    (3)

    where \(d^L\) is a dummy variable taking values in {0, 1}.

3.2 Sun et al.’s Framework for Bit-Oriented Block Ciphers

At Asiacrypt 2014, Sun et al. [20] extended Mouha et al.’s framework [12] to bit-oriented ciphers. For bit-oriented ciphers, Mouha et al.’s descriptions of XOR operation and linear transformation are also suitable.

Definition 2

Consider a differential characteristic state \(\varDelta \) consisting of n bits \(\varDelta = (\varDelta _0,\varDelta _1,\ldots ,\varDelta _{n-1})\). Then, the difference vector \(x = (x_0,x_1,\ldots ,x_{n-1})\) corresponding to \(\varDelta \) is defined as

$$\begin{aligned} x_i = \left\{ \begin{array}{l} 0 \quad if\,\varDelta _i=0,\\ 1 \quad if\,\varDelta _i=1. \end{array} \right. \end{aligned}$$
(4)

Based on Definition 2, Sun et al. translated the S-box operation to linear inequalities as follows:

  • Equations describing the S-box operation: Suppose \(({x_0},\ldots ,{x_{{w-1}}})\) and \(({y_0},\ldots ,{y_{v-1}})\) are the input and output bit-level differences of a \(w\times v\) S-box. Let A be a dummy variable taking values in {0,1} that describes whether the S-box is active: \(A=1\) holds if and only if \(x_0,x_1,\ldots ,x_{w-1}\) are not all zero. The following constraints should be obeyed:

    $$\begin{aligned} \left\{ \begin{array}{l} {A} - {x_i} \ge 0,i \in \{ 0,\ldots ,w - 1\} \\ \sum \nolimits _{i=0}^{w - 1} {x_i} - {A} \ge 0 \end{array} \right. \end{aligned}$$
    (5)
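That constraints (5) force \(A=1\) exactly when the input difference is nonzero can be confirmed exhaustively. A minimal sketch, with a hypothetical function name:

```python
from itertools import product


def activity_flag_is_forced(w=4):
    """Check that (5) admits exactly one value of A for every input difference."""
    for x in product((0, 1), repeat=w):
        allowed = [
            a for a in (0, 1)
            if all(a - xi >= 0 for xi in x) and sum(x) - a >= 0
        ]
        # The only admissible value must be 1 for nonzero x and 0 otherwise.
        if allowed != [int(any(x))]:
            return False
    return True
```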

3.3 Valid Cutting-Off Inequalities from the Convex Hull of S-Box

The convex hull of a set Q of discrete points in \(\mathbb {R}^n\) is the smallest convex set that contains Q. A convex hull in \(\mathbb {R}^n\) can be described as the common solutions of a set of finitely many linear equalities and inequalities.

Suppose \(p = (x,y) = ({x_0},\ldots ,x_{w-1},y_0,\ldots ,y_{v-1})\) is a differential pattern of a \(w\times v\) S-box, in which x is the input differential vector and y is the output differential vector. If we treat a differential pattern of a \(w\times v\) S-box as a discrete point in \(\mathbb {R}^{w+v}\), then we get a finite set of discrete points which includes all possible differential patterns of the S-box. We can describe this finite set with the following inequalities:

$$\begin{aligned} \left\{ \begin{array}{l} \alpha _{0,0}x_0 + \ldots + \alpha _{0,w-1}x_{w-1} + \beta _{0,0}y_0 + \ldots + \beta _{0,v-1}y_{v-1} + \gamma _0 \ge 0\\ \ldots \\ \alpha _{n,0}x_0 + \ldots + \alpha _{n,w-1}x_{w-1} + \beta _{n,0}y_0 + \ldots + \beta _{n,v-1}y_{v-1} + \gamma _n \ge 0 \end{array} \right. \end{aligned}$$
(6)

This is called the H-Representation of a \(w\times v\) S-box, in which \(\alpha \), \(\beta \) and \(\gamma \) are constants. With the help of SageMath [1], hundreds of linear inequalities can be derived from the differential distribution table (DDT) of an S-box. These inequalities are redundant in general; for example, SageMath gives 237 inequalities for the GIFT S-box. Because the efficiency of the MILP optimizer drops sharply as the number of linear inequalities increases, adding all of the inequalities to the MILP model makes the model unsolvable in practical time.

In order to minimize the number of inequalities, Sasaki et al. proposed a MILP-based reduction algorithm in [15] that finds the optimal combination with the minimal number of linear inequalities among the hundreds of inequalities in the H-representation of the convex hull. The algorithm considers each impossible pattern in the DDT of the S-box. Each impossible pattern must be excluded from the solution space by at least one inequality. Under these constraints, the number of inequalities can be minimized with a MILP optimizer.
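The selection problem behind this reduction is a covering problem: choose the fewest inequalities so that every impossible pattern violates at least one of them. The sketch below illustrates this on a toy instance and replaces the MILP optimizer of [15] with a plain exhaustive search, which is only practical for very small inputs. The toy point set, the candidate inequalities and all names are hypothetical.

```python
from itertools import combinations, product


def satisfies(ineq, point):
    """ineq = (a_0, ..., a_{k-1}, c) encodes a_0*z_0 + ... + a_{k-1}*z_{k-1} + c >= 0."""
    *coeffs, const = ineq
    return sum(a * z for a, z in zip(coeffs, point)) + const >= 0


def minimal_cover(candidates, impossible):
    """Smallest subset of candidates that cuts off every impossible point."""
    cuts = [{p for p in impossible if not satisfies(q, p)} for q in candidates]
    target = set(impossible)
    for size in range(len(candidates) + 1):
        for subset in combinations(range(len(candidates)), size):
            if set().union(*(cuts[i] for i in subset)) == target:
                return [candidates[i] for i in subset]
    return None


# Toy instance: the only possible patterns are 000 and 111.
possible = {(0, 0, 0), (1, 1, 1)}
impossible = [p for p in product((0, 1), repeat=3) if p not in possible]
candidates = [
    (1, -1, 0, 0), (0, 1, -1, 0), (-1, 0, 1, 0),  # x>=y, y>=z, z>=x
    (-1, 1, 0, 0), (0, -1, 1, 0), (1, 0, -1, 0),  # the reverse orderings
]
chosen = minimal_cover(candidates, impossible)
```

In this toy instance each candidate removes two of the six impossible points, so three inequalities are necessary and sufficient.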

4 MILP-Based Model to Search Differential Characteristic for GIFT-64

4.1 MILP-Based Two-Stage Algorithm to Search for Differential Characteristic

A two-stage search strategy to find differential characteristics of block ciphers is used in [9, 13, 18]. In the first stage, truncated differential characteristics with the minimal number of active S-boxes are found. Then, concrete differential characteristics matching the truncated differential characteristic are found by a subroutine. In previous works, one first chose a prespecified threshold on the number of active S-boxes. However, the characteristic with the highest probability does not necessarily have the minimal number of active S-boxes. In this section, we propose Algorithm 1 to search for the best or better differential characteristics.

figure b

Algorithm 1 does not need a predefined threshold and is guaranteed to find the characteristic with the highest probability. Algorithm 1 includes two interactive sub-models, denoted as the outer-MILP part and the inner-MILP part. In the outer-MILP part, the objective is to minimize the number of active S-boxes. When a solution is found in the outer-MILP part, the truncated differential, which contains the positions of the active S-boxes, is passed to the inner-MILP part as constraints. The inner-MILP part produces the differential characteristic with maximal probability that matches the truncated differential. Then the algorithm returns to the outer-MILP part with this truncated differential removed from its feasible region.

In addition, the maximal probability of the derived differential characteristics is used to reduce the feasible region of the outer-MILP part dynamically. In detail, if a differential characteristic with larger probability is to be found in later iterations, the number of active S-boxes produced in the outer-MILP part must be lower than a certain bound. The bound is computed dynamically from the current maximal probability. When the outer-MILP part becomes infeasible, the algorithm terminates.
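The interaction between the two parts can be sketched with stand-ins for the two MILP models. In the sketch below, the outer part is replaced by a dictionary of truncated patterns with their numbers of active S-boxes, and the inner part by a callable that returns the best weight (negative base-2 logarithm of the probability) for a pattern. The bound is derived under the assumption that every active S-box contributes a weight of at least 1.415, so a better characteristic needs fewer than W/1.415 active S-boxes when the current best weight is W. All names and toy values are hypothetical and this is not the authors' Algorithm 1.

```python
import math


def two_stage_search(truncated_patterns, best_weight_for, min_sbox_weight=1.415):
    """Toy two-stage loop.

    truncated_patterns: dict name -> number of active S-boxes (outer stand-in).
    best_weight_for: callable name -> best weight or None (inner stand-in).
    """
    remaining = dict(truncated_patterns)
    best = (math.inf, None)
    bound = math.inf  # upper bound on the number of active S-boxes
    while True:
        feasible = {k: n for k, n in remaining.items() if n <= bound}
        if not feasible:
            break  # outer part infeasible: terminate
        name = min(feasible, key=feasible.get)  # fewest active S-boxes first
        del remaining[name]  # remove the pattern from the feasible region
        weight = best_weight_for(name)
        if weight is not None and weight < best[0]:
            best = (weight, name)
            # Largest N with N * min_sbox_weight < weight.
            bound = math.ceil(weight / min_sbox_weight) - 1
    return best


toy_patterns = {"T1": 24, "T2": 25, "T3": 45}
toy_weights = {"T1": 60.0, "T2": 59.0, "T3": 70.0}
result = two_stage_search(toy_patterns, toy_weights.get)
```

In the toy data the pattern with 25 active S-boxes yields a better characteristic than the one with 24, and the pattern with 45 active S-boxes is never passed to the inner part because the bound has dropped to 41.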

We apply Algorithm 1 to search for differential characteristics for GIFT-64, and get some interesting results.

4.2 Search for Differentials of GIFT-64

Algorithm 1 needs two convex hulls of the S-box, one for the outer-MILP part and one for the inner-MILP part. First, we compute the H-representation of the convex hull of the differential patterns of the S-box in Appendix A. Using SageMath, 237 inequalities are produced in the H-Representation of the convex hull of the GIFT S-box; after selecting inequalities by the method introduced in [15], we get 21 inequalities. Second, we study the convex hull of the differential patterns of the S-box together with their probabilities. Sun et al. introduced the differential probabilities of the S-box into the MILP model in [19]. Since the GIFT S-box has 4 possible differential probabilities, i.e. 1, \(2^{-1.415}\), \(2^{-2}\), \(2^{-3}\), we need three extra bits \((p_0,p_1,p_2)\) to encode the differential patterns with probability. The new differential pattern is \( ({x_0},{x_1},{x_2},{x_3},{y_0},{y_1},{y_2},{y_3};{p_0},{p_1},{p_2})\in \mathbb {F}_2^{8+3}\), which satisfies Eq. 7.

$$\begin{aligned} \left\{ \begin{array}{l} ({p_0},{p_1},{p_2}) = (0,0,0),\mathrm{{if }}\,\Pr _s[({x_0},{x_1},{x_2},{x_3}) \rightarrow ({y_0},{y_1},{y_2},{y_3})] = 1 = {2^{-0}}\\ ({p_0},{p_1},{p_2}) = (0,0,1),\mathrm{{if }}\,\Pr _s[({x_0},{x_1},{x_2},{x_3}) \rightarrow ({y_0},{y_1},{y_2},{y_3})] = 6/16 = {2^{-1.415}}\\ ({p_0},{p_1},{p_2}) = (0,1,0),\mathrm{{if }}\,\Pr _s[({x_0},{x_1},{x_2},{x_3}) \rightarrow ({y_0},{y_1},{y_2},{y_3})] = 4/16 = {2^{-2}}\\ ({p_0},{p_1},{p_2}) = (1,0,0),\mathrm{{if }}\,\Pr _s[({x_0},{x_1},{x_2},{x_3}) \rightarrow ({y_0},{y_1},{y_2},{y_3})] = 2/16 = {2^{-3}} \end{array} \right. \end{aligned}$$
(7)

Then the objective function is changed to minimize \(\sum (3\times p_0 + 2 \times p_1 + 1.415 \times p_2)\).
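The encoding of Eq. 7 can be reproduced from the DDT of the S-box. The sketch below assumes the S-box values of the public GIFT specification (Table 2 is not reproduced in this text), computes the DDT, and maps each nonzero entry to its \((p_0,p_1,p_2)\) encoding and weight. The function and variable names are hypothetical.

```python
# GIFT S-box, assumed from the public GIFT specification.
GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]

# DDT entry -> (p0, p1, p2), following Eq. 7.
ENCODING = {16: (0, 0, 0), 6: (0, 0, 1), 4: (0, 1, 0), 2: (1, 0, 0)}


def ddt(sbox):
    """Differential distribution table: table[dx][dy] counts x with S(x)^S(x^dx)=dy."""
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for x in range(n):
        for dx in range(n):
            table[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return table


def weight(p):
    """Contribution of one S-box to the objective 3*p0 + 2*p1 + 1.415*p2."""
    p0, p1, p2 = p
    return 3 * p0 + 2 * p1 + 1.415 * p2


table = ddt(GIFT_SBOX)
patterns = {
    (dx, dy): ENCODING[table[dx][dy]]
    for dx in range(16) for dy in range(16) if table[dx][dy] > 0
}
```

For instance, under this assumed S-box the transition from input difference 0x4 to output difference 0x7 has DDT entry 6, so it is encoded as (0, 0, 1) and contributes 1.415 to the objective.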

Table 4. 12-round differential characteristic with probability \(2^{-59}\)

We implement Algorithm 1 to search for differential characteristics of GIFT-64. In the outer-MILP part of Algorithm 1, the objective is to minimize the number of active S-boxes. We get tight bounds on the number of active S-boxes for 11-round and 12-round reduced GIFT-64, which are 22 and 24, respectively. Using Algorithm 1, we find many 12-round differential characteristics. The highest probability of a 12-round differential characteristic is \(2^{-59}\); this characteristic is shown in Table 4. We also get dozens of differential characteristics with probability \(2^{-60}\).

We observe that some of the 12-round characteristics are iterative. In total, we get eight 4-round differential characteristics with probability \(2^{-20}\). These 4-round characteristics are iterative, namely, their input states are identical to their output states. One of them is shown in Table 5, and these characteristics can be extended to more rounds. For example, Table 6 shows a 12-round differential characteristic with probability \(2^{-60}\) obtained by iterating a 4-round differential characteristic three times. A 13-round characteristic with probability \(2^{-64}\) can also be generated by adding another round at the beginning of the 12-round differential characteristic. Note that the designers of GIFT claimed that the differential probability of 13-round GIFT-64 will be lower than \(2^{-63}\). Our result does not violate the claim; however, the gap is very small.

Table 5. 4-round differential characteristic with probability \(2^{-20}\)
Table 6. 12-round differential characteristic with probability \(2^{-60}\)

4.3 Attack on 19-Round GIFT-64

Using the 12-round differential characteristic with probability \(2^{-60}\) in Table 6, we can launch a key-recovery attack against 19-round GIFT-64. We choose this differential characteristic because it has fewer active bits at its head and tail than the others. As shown in Table 7, we add three rounds at the beginning and four rounds at the end of the differential characteristic. Therefore, we can attack 19-round GIFT-64. According to the key schedule, the round keys used in the 1-st, 2-nd, 16-th, 17-th, 18-th and 19-th rounds correspond to \((k_1,k_0)\), \((k_3,k_2)\), \((k_7\ggg 6,k_6\ggg 4)\), \((k_1 \ggg 8,k_0)\), \((k_3\ggg 8,k_2)\) and \((k_5\ggg 8,k_4)\) in the initial key state \((k_7,k_6,k_5,k_4,k_3,k_2,k_1,k_0)\), respectively.

Table 7. 19-round differential attack on GIFT-64

Data Collection

Since GIFT-64 does not have a whitening key layer at the beginning, we can build \(2^n\) structures after the P permutation of the first round. Each structure traverses the sixteen undetermined bits in \(\varDelta X^1_P\), i.e. the bits labeled by “?” in \(\varDelta X^1_P\) of Table 7, and thus generates \(2^{16\times 2-1}=2^{31}\) pairs obeying the differential. Therefore, \(2^n\) structures generate \(2^n\times 2^{31}=2^{n+31}\) pairs.

Such a pair meets the differential in the 4-th round in Table 7 with an average probability of \(2^{-16}\). Then, a pair encrypted with the right key obeys the differential after the 15th round with probability \(2^{-60}\), while a pair with a wrong key obeys it with the random probability \(2^{-64}\). Therefore, with the right key guess, \(2^{n+31}\times 2^{-16}\times 2^{-60}=2^{n-45}\) pairs obey the differential after the 15th round. Here we choose \(n=47\), so the data complexity is \(2^{47}\times 2^{16}=2^{63}\).
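The exponents in the data collection phase can be tracked with a few lines of arithmetic. The sketch below works with base-2 logarithms throughout; the function and parameter names are hypothetical.

```python
def data_collection(n, free_bits=16, head_weight=16, char_weight=60):
    """Return base-2 logs of (pairs, expected right pairs, data complexity)."""
    pairs = n + 2 * free_bits - 1          # 2^n structures, 2^31 pairs each
    right_pairs = pairs - head_weight - char_weight
    data = n + free_bits                   # 2^16 plaintexts per structure
    return pairs, right_pairs, data
```

With \(n=47\) this gives \(2^{78}\) pairs, \(2^{2}=4\) expected right pairs and \(2^{63}\) chosen plaintexts.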

Key Recovery

In the key recovery, the guessed key bits include: \(k_1^{3}\), \(k_1^{2}\), \(k_1^{1}\), \(k_1^{0}\), \(k_0^{3}\), \(k_0^{2}\), \(k_0^{1}\), \(k_0^{0}\) in the 1st round; \(k_3^{12}\), \(k_2^{12}\), \(k_3^{4}\), \(k_2^{4}\) in the 2nd round; \(k_7^{6}\), \(k_6^{8}\), \(k_7^{14}\), \(k_6^{0}\) in the 16th round; \(k_1^{15}\), \(k_1^{14}\), \(k_1^{13}\), \(k_1^{12}\), \(k_0^{3}\), \(k_0^{2}\), \(k_0^{1}\), \(k_0^{0}\) in the 17th round; as well as all 64 key bits in the 18th and 19th rounds. Some of these bits appear in more than one round, so in total we construct \(2^{80}\) counters for the possible values of the 80 distinct key bits above. The whole attack procedure is a guess-and-filter approach. We first guess the two key bits \(k_1^{0}\), \(k_0^{0}\) and partially encrypt the plaintexts.

Table 8. Round keys of GIFT-64

As the intermediate values of right pairs should obey \(\varDelta X_S^2\{0\}=0,\) \(\varDelta X_S^2\{2\}=0,\) \(\varDelta X_S^2\{3\}=1\), the (plaintext, ciphertext) pairs can be filtered with a probability of \(2^{-3}\). Similarly, after guessing \(k_1^{i}, k_0^{i}\), \(i=1,2,3\) and partially encrypting, the corresponding conditions in \(\varDelta X_S^2\{5,7\}\), \(\varDelta X_S^2\{8,10,11\}\), \(\varDelta X_S^2\{13,15\}\) filter the pairs with probabilities \(2^{-2}\), \(2^{-3}\) and \(2^{-2}\). In total, the 1st round provides a filtering probability of \(2^{-10}\).

Similarly, the encryption at 2-nd, 16-th, 17-th, 18-th round can filter the pairs with probability \(2^{-6}\), \(2^{-8}\), \(2^{-8}\), \(2^{-48}\) while all 32 key bits in 19th round need to be guessed. Thus, \(2^{-2}\) pairs will be left for a random key, while 4 pairs should be left for a right key.

The time complexity is \(2^2\times 2^{31+47}\times 2^{32}=2^{112}\), the data complexity is \(2^{63}\) and the memory complexity is \(2^{80}\).
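The counts used in the key recovery can be cross-checked mechanically. The sketch below rebuilds the set of guessed key bits from the lists given above (the 18th and 19th round keys cover all bits of \(k_3,k_2,k_5,k_4\)), and adds up the filtering and time exponents as base-2 logarithms. All variable names are hypothetical.

```python
# Guessed key bits as (word, bit index) pairs, taken from the lists above.
round1 = {("k1", i) for i in range(4)} | {("k0", i) for i in range(4)}
round2 = {("k3", 12), ("k2", 12), ("k3", 4), ("k2", 4)}
round16 = {("k7", 6), ("k6", 8), ("k7", 14), ("k6", 0)}
round17 = {("k1", i) for i in (15, 14, 13, 12)} | {("k0", i) for i in range(4)}
round18_19 = {(w, i) for w in ("k3", "k2", "k5", "k4") for i in range(16)}

guessed_bits = round1 | round2 | round16 | round17 | round18_19

# Filtering exponents of the 1st, 2nd, 16th, 17th and 18th rounds.
filter_log2 = [10, 6, 8, 8, 48]
pairs_log2 = 31 + 47                              # 2^(n+31) pairs with n = 47
random_survivors_log2 = pairs_log2 - sum(filter_log2)
time_log2 = 2 + pairs_log2 + 32                   # 2^2 * 2^(31+47) * 2^32
```

The set union removes the bits that are listed twice, which leaves 80 distinct key bits, and the exponents reproduce the \(2^{-2}\) surviving pairs for a random key and the \(2^{112}\) time complexity.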

5 Improved MILP-Based Method to Find Differential for GIFT-128

GIFT-128 has a 128-bit state and thirty-two 4-bit S-boxes in each round. It has twice as many variables and constraints as GIFT-64. The designers of GIFT [2] give 9-round differential characteristics of GIFT-128. We test Algorithm 1 on 9-round GIFT-128 and reproduce the designers’ result, but the model takes days to solve. In this section, we devise a segmented MILP-based method to search for longer differential characteristics of GIFT-128.

Suppose we aim to find an r-round differential characteristic for a block cipher. We first divide the cipher into \(r_i\)-round (\(i=1,2,...,t\)) sub-ciphers with \(\sum _{i=1}^{t}r_i=r\). We choose probability thresholds \(P_{r_1},P_{r_2},...,P_{r_t}\) for the \(r_1\)-round, \(r_2\)-round,...,\(r_t\)-round sub-ciphers, so that the probability \(p_{r_i}\) of the \(r_i\)-round sub-cipher must be larger than \(P_{r_i}\). We also choose a threshold value \(P_{target}\) for the r rounds. If \(p_{r_1}p_{r_2}\ldots p_{r_t}\) is larger than \(P_{target}\), an acceptable solution is found.

As shown in Fig. 2, for the \(r_i\)-round sub-cipher, the input state is fixed to the output state of the differential characteristic \(\mathcal {D}_{i-1}\) of the \(r_{i-1}\)-round sub-cipher, and the MILP model \(\mathcal {M}_{r_i}\) is constructed. If \(\mathcal {M}_{r_i}\) is feasible, we continue to construct \(\mathcal {M}_{r_{i+1}}\) for the \(r_{i+1}\)-round sub-cipher; otherwise, we remove \(\mathcal {D}_{i-1}\) from \(\mathcal {M}_{r_{i-1}}\) and solve it again. The search terminates when we find differential characteristics of the \(r_1\)-round,\(r_2\)-round,...,\(r_t\)-round sub-ciphers that can be connected to produce an r-round differential characteristic.
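The chaining and backtracking described above can be sketched as a depth-first search. In the sketch below each sub-MILP-model is replaced by a lookup table that lists candidate output states with their weights (negative base-2 logarithms of the probabilities); moving on to the next candidate plays the role of removing \(\mathcal {D}_{i-1}\) and solving again. The state labels, weights, thresholds and function names are all hypothetical toy values.

```python
def segmented_search(state, segments, sub_model, target_weight, spent=0.0):
    """Chain sub-characteristics; return a list of steps or None.

    segments: list of (rounds, max_weight) per sub-cipher.
    sub_model(state, rounds, max_weight): candidate (next_state, weight) pairs.
    """
    if not segments:
        return [] if spent < target_weight else None
    (rounds, max_weight), rest = segments[0], segments[1:]
    for next_state, w in sub_model(state, rounds, max_weight):
        tail = segmented_search(next_state, rest, sub_model, target_weight, spent + w)
        if tail is not None:
            return [(state, next_state, rounds, w)] + tail
    return None  # infeasible: the caller tries its next candidate


# Toy stand-in for the sub-MILP-models (hypothetical data).
TOY = {
    ("A", 4): [("B", 12.0), ("C", 10.0)],
    ("B", 4): [("D", 29.0)],
    ("D", 4): [("E", 32.0)],
}


def toy_sub_model(state, rounds, max_weight):
    options = TOY.get((state, rounds), [])
    return sorted((o for o in options if o[1] < max_weight), key=lambda o: o[1])


path = segmented_search("A", [(4, 40.0)] * 3, toy_sub_model, target_weight=80.0)
```

In the toy data the cheapest first segment leads to state C, which has no continuation, so the search backtracks and continues through state B.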

Fig. 2. The framework of our search algorithm

We apply this model to search for differential characteristics of GIFT-128. This is a heuristic and empirical process. For GIFT-128, it is time-consuming to solve a MILP model of more than 6 rounds. To keep the search efficient, we choose \(r_i<6\). \(P_{r_i}\) can be chosen more flexibly. According to the designers’ analysis in [2], the minimum numbers of active S-boxes for 3/4/5-round GIFT-128 are 3, 5, and 7, respectively. The sub-cipher should be neither too short nor too long. If the number of rounds is smaller than 2, the sub-MILP-model is unnecessary to solve. On the other hand, if the number of rounds is bigger than 6 or 7, solving the sub-model takes an impractical amount of time. We do not want the probability of the \(r_i\)-round differential characteristic of GIFT-128 to be much smaller than the highest one, so \(P_{r_i}\) is chosen according to the minimum number of active S-boxes of \(r_i\)-round GIFT-128. In this section, we choose \(P_{r_i=3}=2^{-30}\), \(P_{r_i=4}=2^{-40}\) and \(P_{r_i=5}=2^{-50}\) as the lower bounds on the differential probability of each sub-model.

We use this model with the parameter choices above to search for differential characteristics of GIFT-128. We list some results in Table 9. The 12-round and 14-round differential characteristics are shown in Appendix C.

Table 9. Probabilities of some differential characteristics of GIFT-128
Table 10. 18-round differential characteristic of GIFT-128

The 18-round characteristic, shown in Table 10, is constructed by connecting the following three 4-round differential characteristics and a 6-round differential characteristic:

$$ \begin{array}{l} (0000\,0000\,7060\,0000\,0000\,0000\,0000\,0000)\xrightarrow []{4-round,~2^{-12}} (0020\,0000\,0010\,0000\,0000\,0000\,0000\,0000)\\ (0020\,0000\,0010\,0000\,0000\,0000\,0000\,0000)\xrightarrow []{4-round,~2^{-29}} (0000\,0000\,0000\,0011\,0000\,0000\,0000\,0000)\\ (0000\,0000\,0000\,0011\,0000\,0000\,0000\,0000)\xrightarrow []{4-round,~2^{-32}} (0000\,0000\,0a00\,0a00\,0000\,0000\,0000\,0000)\\ (0000\,0000\,0a00\,0a00\,0000\,0000\,0000\,0000)\xrightarrow []{6-round,~2^{-36}} (0000\,0100\,0020\,0800\,0014\,0404\,0002\,0202) \end{array} $$
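As a consistency check on the four segments listed above, the product of their probabilities and the sum of their lengths agree with the stated 18-round characteristic:

$$ 2^{-12}\cdot 2^{-29}\cdot 2^{-32}\cdot 2^{-36} = 2^{-(12+29+32+36)} = 2^{-109}, \qquad 4+4+4+6 = 18 $$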

With the 18-round differential characteristic, we can add three rounds at its beginning and two rounds at its end to attack 23-round reduced GIFT-128. The attack procedure is similar to that of Subsect. 4.3. The time complexity is \(2^{120}\), which is bounded by the data complexity, and the memory complexity is \(2^{86}\) bits to store the key counters.

6 Conclusion

In this paper, we first design a more efficient MILP-based differential search model. Using this model, we give a 12-round differential characteristic with probability \(2^{-60}\) and obtain the first 19-round key-recovery attack on GIFT-64. Second, we improve our MILP-based model for block ciphers with large state size. With this model, we give an 18-round differential characteristic with probability \(2^{-109}\) and obtain the first 23-round key-recovery attack on GIFT-128.

MILP can efficiently find high-probability differential characteristics when attacking algorithms whose permutation layer does not cause diffusion. In future work, we can try to apply heuristic methods to constrain global variables, so as to find differential characteristics with higher probability.