1 Introduction

Differential and linear attacks are two of the most fundamental methods for analyzing symmetric-key primitives, and many advanced cryptanalytic techniques are derived from or partly rely on them. Performing differential and linear analysis is tedious routine work for the designers and cryptanalysts of symmetric-key algorithms. Matsui’s branch and bound search algorithm [21] is a classical approach for finding the best differential and linear characteristics, and it is extremely efficient for some specific ciphers. Yet implementing Matsui’s algorithm properly demands sophisticated programming skills when cipher-specific optimizations are taken into account [5, 6, 12]. Moreover, there seems to be no obvious way to create highly reusable code for Matsui’s algorithm targeting different ciphers.

Meanwhile, the diversity of cryptographic algorithms is an unstoppable trend. In the case of block ciphers, having a single algorithm serve as a security solution for all scenarios is doomed to fail due to the ever-increasing complexity and diversity of today’s communication systems. Over recent years, we have witnessed many new block ciphers designed for lightweight devices or dedicated use cases. These include, to name just a few, the ISO standard PRESENT [11], SIMON and SPECK [7] designed by the NSA, the SKINNY family presented at CRYPTO 2016 [8], and Rasta, whose main design objective is to minimize AND-related metrics [15]. We refer the reader to [9] for a more comprehensive survey. To meet the requirements of the target applications, these newly designed block ciphers typically use lightweight components with relatively weak local cryptographic properties, consume fewer resources when implemented and executed, and aggressively reserve only limited security margins. This approach makes the design and evaluation more difficult, particularly where the security bounds cannot be derived theoretically.

In such a situation, the security evaluation against differential and linear attacks has to be performed with the help of search tools. Matsui’s algorithm is not a satisfactory choice, not only because of its inconvenience but also because it is unable to produce useful results in some cases. Another option that has become increasingly popular in recent years is the Mixed Integer Linear Programming (MILP) based method, where the problem of searching for characteristics is transformed into an MILP model that can be solved with generic MILP solvers.

Similar to SAT/SMT and CP based methods [4, 19, 22, 26], in the MILP based approach [23, 28, 29], the cryptanalysts only need to specify the problem in a standard modeling language without mixing in the actual search algorithms. This decoupling of formulation and resolution is the key that makes the MILP based approach more attractive than Matsui’s algorithm. Unlike with Matsui’s algorithm, search heuristics and optimizations can be applied externally without touching the sophisticated code powering the search. In addition, cryptanalysts benefit directly from the advancement of MILP resolution techniques. So far, the MILP based approach covers many cryptanalytic techniques, including differential/linear [18, 28], impossible differential [25], zero-correlation linear [14], and integral cryptanalysis [30].

Despite all these advantages, there are situations where Matsui’s algorithm performs far better than the MILP based approach (e.g., the search for the best characteristics of DES and PRESENT in the single-key model). Moreover, both MILP and Matsui’s algorithm rarely work for non-lightweight designs under today’s computational power. Therefore, it is of great importance to improve the efficiency of the MILP based approach, and a natural question to ask is whether it is possible to strengthen the MILP based search with Matsui’s algorithm. In this work, we take a first step in this direction. Finally, before we present our work, we emphasize that all of our analyses are based on the Markov assumption [20], where we assume that each round of an iterative cipher is independent.

Motivation and Contribution. One obvious difference between Matsui’s algorithm and the MILP based approach is worth highlighting. When we search for the best characteristic of an R-round iterative block cipher, Matsui’s algorithm requires the probabilities of the optimal characteristics of the same cipher reduced to r rounds for \(1 \le r < R \). That is, to get the result for R rounds, we must first run Matsui’s algorithm for rounds 1, 2, \(\cdots \), and \(R-1\). These probabilities are employed to prune the search tree according to certain bounding conditions. In contrast, in the MILP based approach, we always set up an R-round model directly, and do not exploit the solutions for lower rounds explicitly. This fact motivates us to enhance the R-round MILP models by taking into account some information from the solutions of lower rounds. We achieve this by adapting the objective function of an R-round model such that constraints encoding Matsui’s bounding conditions can be incorporated into the model. In practice, this new modeling strategy leaves many choices for the cryptanalysts, since one can choose to include only a subset of the constraints generated from Matsui’s bounding conditions. We perform experiments on PRESENT, SIMON, and SPECK, which show that the inclusion of the constraints derived from Matsui’s algorithm leads to significantly improved resolution performance for PRESENT. For SIMON, an obvious improvement is also observed, while for the ARX cipher SPECK, the new model does not accelerate the resolution. Our work suggests that trying to combine the power of dedicated search algorithms implemented in general purpose programming languages and MILP is a valuable endeavor. In the future, it would be interesting to see how to integrate other search heuristics [16, 17] to speed up the resolution of the MILP models for finding characteristics of ARX ciphers.

Organization. In Sects. 2 and 3, we give a brief introduction to Matsui’s algorithm and the MILP based differential and linear analysis. A method for enhancing the MILP models with constraints generated from Matsui’s bounding condition is presented in Sect. 4. We then show applications of the enhanced MILP models in Sect. 5. Section 6 concludes the paper and suggests future work.

2 Matsui’s Algorithm

At Eurocrypt 1994, Matsui presented a branch and bound search algorithm that can be used to identify the maximum probability characteristic of a target block cipher [21]. Matsui’s algorithm, together with its variations, has been an important tool in the practice of security evaluation of symmetric-key primitives. It was improved in subsequent work [3, 6, 13, 24] and adapted to ARX constructions in [10, 31].

A general description of Matsui’s algorithm for an iterative block cipher depicted in Fig. 1 is given in Algorithm 1. Our presentation largely follows the work of Bannier et al. [5]. Also note that Algorithm 1 is an oversimplification of Matsui’s algorithm, which omits details that are necessary in actual implementations (e.g., the technique for controlling the number of initial branches, and the order in which candidates are enumerated).

Fig. 1. An R-round iterative cipher, where \(\mathcal {T}= (\alpha _{0}, \cdots , \alpha _{R})\) is an R-round differential characteristic with probability \(\mathbb {P}(\mathcal {T})\), and the probability of the differential \(\alpha _{r-1} \rightarrow \alpha _{r}\) is denoted by \(P_{Rd(r)}\).

Algorithm 1. A general description of Matsui’s algorithm.

With the knowledge of the best probabilities \(P_{Best}(i)\) of i-round characteristics for \(i\in \{1, \cdots , R-1\}\), Matsui’s algorithm explores the search space of all possible characteristics in a depth-first manner, and outputs the optimal R-round characteristic. The search space conceptually forms a tree structure, and at the rth level of the tree, \(\mathcal {T}_{[1,r]} = (\alpha _0, \cdots , \alpha _{r})\) is assigned actual values by Matsui’s algorithm, and all possible values of \((\alpha _{r+1}, \cdots , \alpha _{R})\) form a subtree to be explored. We call \(\mathcal {T}_{[1,r]}\) with \(r < R\) instantiated with actual values a partial solution (corresponding to an intermediate node of the search tree), and \(\mathcal {T}= \mathcal {T}_{[1,R]}\) instantiated with actual values a full solution (corresponding to a leaf node of the search tree). Thus, when Matsui’s algorithm goes one level deeper into the search tree, it extends the current partial solution towards a full solution.

The efficiency of Matsui’s algorithm comes from the fact that it will not try to extend every partial solution. Before trying to extend the current partial solution, the so-called bounding condition specified in line 24 of Algorithm 1 is tested, which essentially states that if this condition is violated, a better characteristic will never be found by extending the current partial solution, and therefore we should give up the current branch, backtrack to the upper level of the search tree, and try another branch.

The variable \(P_{Estim}\) in Matsui’s algorithm keeps track of the probability of the best characteristic known so far. It is updated only when a strictly better characteristic is encountered during the search (see line 42 of Algorithm 1).

Moreover, in Matsui’s algorithm, the first and last rounds receive special treatment (see functions FirstRound() and LastRound() in Algorithm 1), where the input and output differences are determined directly by the output differences of round 1 and round \(R-1\), without the effort of searching through a set of candidates.
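The pruning logic described in this section can be sketched on a toy cipher. The following Python sketch is an illustration under stated assumptions, not Algorithm 1 itself: the toy round function (the PRESENT S-box [11] applied to a 4-bit state, followed by a hypothetical 1-bit rotation), the function names, and the starting value \(P_{Estim} = 0\) are all choices made here for illustration, and the special treatment of the first and last rounds is omitted. Probabilities are kept as integer products of difference-table counts, so an n-round value v stands for the probability \(v / 16^{n}\).

```python
# Toy 4-bit "cipher": PRESENT S-box followed by a 1-bit left rotation (hypothetical).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def rotl4(x):
    """Rotate a 4-bit value left by one position."""
    return ((x << 1) | (x >> 3)) & 0xF


# CNT[a][b]: number of inputs x for which input difference a gives round-output difference b.
CNT = [[0] * 16 for _ in range(16)]
for a in range(16):
    for x in range(16):
        CNT[a][rotl4(SBOX[x] ^ SBOX[x ^ a])] += 1


def matsui_search(R, best):
    """Depth-first search for the best R-round characteristic.

    best[n] is the best n-round count product for n < R (best[0] == 1).
    Returns (count product, characteristic) with probability product / 16**R.
    """
    estim = 0      # plays the role of P_Estim; 0 means "nothing found yet" (assumption)
    trail = None

    def extend(r, alpha, prod, path):
        nonlocal estim, trail
        if r == R:                     # full solution (leaf of the search tree)
            if prod > estim:           # update only on a strictly better characteristic
                estim, trail = prod, path
            return
        for beta in range(1, 16):
            if CNT[alpha][beta] == 0:  # impossible transition
                continue
            p = prod * CNT[alpha][beta]
            # Bounding condition: prune if even the best possible completion
            # of the remaining R - (r + 1) rounds cannot reach estim.
            if p * best[R - r - 1] < estim:
                continue
            extend(r + 1, beta, p, path + (beta,))

    for alpha0 in range(1, 16):
        extend(0, alpha0, 1, (alpha0,))
    return estim, trail


def best_by_rounds(max_rounds):
    """Run the search for 1, 2, ..., max_rounds rounds, reusing earlier results."""
    best = [1]
    for R in range(1, max_rounds + 1):
        best.append(matsui_search(R, best)[0])
    return best


def reference_best(n):
    """Exhaustive dynamic programming over all n-round characteristics (no pruning)."""
    score = [1] * 16
    for _ in range(n):
        score = [max(CNT[a][b] * score[b] for b in range(16)) for a in range(16)]
    return max(score[1:])
```

Note how `best_by_rounds` mirrors the requirement that the results for 1, 2, ..., \(R-1\) rounds be available before the R-round search starts; `reference_best` is included only as an independent check of the pruned search.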

3 MILP Aided Characteristic Search

At first, MILP was used to determine the minimum number of differentially or linearly active S-boxes of word-oriented ciphers [23, 29]. In [28], Sun et al. introduced the convex hull computation method, which can encode any subset of 0–1 vectors as the solution set of a system of linear inequalities. Thanks to this technique, actual differential and linear characteristics can be found with the MILP based method. Subsequently, the MILP aided approach was applied in impossible differential analysis [25], zero-correlation linear analysis [14], and integral cryptanalysis [30]. It was also extended and adapted to analyze ARX based constructions [18]. In what follows, we give a brief introduction to the MILP modeling technique for finding differential characteristics, which is employed in the following sections.

The key to transforming the problem of searching for differential characteristics into an MILP model is to express the propagation rules of the characteristics as a set of linear inequalities, and to encode the overall probability as a linear function.

Objective Function. Since the goal is to find the optimal characteristic, we set the objective function to maximize the probability of the underlying differential characteristic, or equivalently to minimize its weight (the negative base-2 logarithm of the probability). However, we must be able to express this quantity as a linear function in the first place to make it valid in MILP. Such representations are available for SIMON, SPECK, and PRESENT [18, 28]. For the sake of simplicity and without loss of generality, we assume the probability (or its equivalent) can be represented by

$$\sum _{i=1}^{R}\sum _{j = 1}^{k} A_{i,j},$$

and we call the \(A_{i,j}\)’s probability weight variables, where the \(A_{i,j}\) for \(j \in \{1, \cdots , k \}\) are the probability weight variables of round i of an iterative cipher. Under this notation, the probability weight contributed by round i is \(\sum _{j = 1}^{k} A_{i,j}\).

Modeling XOR. Let \(a\oplus b=c\), where \(a,b,c \in \mathbb {F}_{2}\) are the bit-level input and output differences of the XOR operation. Then \((a, b, c)\) is a valid differential characteristic of XOR if and only if \(a + b + c - 2d_{\oplus } = 0\), where \(a, b, c \in \{0, 1\}\), and \(d_{\oplus }\) is a 0–1 dummy variable.
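The XOR constraint is small enough to check exhaustively. The sketch below (function and variable names are illustrative, not from the original) enumerates all 0–1 triples and confirms that the triples for which some dummy value \(d_{\oplus }\) satisfies the equation are exactly those with \(c = a \oplus b\).

```python
from itertools import product


def xor_model_feasible(a, b, c):
    """True if some 0-1 dummy d satisfies a + b + c - 2*d == 0."""
    return any(a + b + c - 2 * d == 0 for d in (0, 1))


# Triples accepted by the linear model.
accepted = {t for t in product((0, 1), repeat=3) if xor_model_feasible(*t)}

# Triples that are valid XOR differential patterns.
valid = {(a, b, a ^ b) for a, b in product((0, 1), repeat=2)}
```

The dummy variable absorbs the even parity of the sum: the model accepts a triple exactly when \(a + b + c\) is 0 or 2.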

Modeling S-box. The exact differential property of an \(\omega \times \nu \) S-box S can be modeled by a set of linear inequalities with the convex hull computation method [28]. Let \(\mathcal {D} = \{ (\mathbf {a}, \mathbf {b}) \in \{ 0, 1\}^{\omega + \nu }: P(\mathbf {a} \rightarrow \mathbf {b}) > 0 \}\) be the set of all possible input-output differential patterns of S, where \(\mathbf {a} = (a_{0},a_{1},\ldots ,a_{\omega -1})\) and \(\mathbf {b} = (b_{0},b_{1},\ldots ,b_{\nu -1})\). Then, we can compute the H-representation of the convex hull of \(\mathcal {D} \subseteq \mathbb {R}^{\omega +\nu }\). With the help of the greedy algorithm proposed in [28], we can extract from it a system of inequalities whose 0–1 solution set is exactly \(\mathcal {D}\). Sometimes, it is possible to encode the differential probabilities of \(\mathbf {a} \rightarrow \mathbf {b}\) into \(\mathcal {D}\), and we refer the reader to [18, 27, 28] for concrete examples.
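To make the set \(\mathcal {D}\) concrete, the following sketch builds it for the PRESENT S-box [11] and then constructs a system of linear inequalities whose 0–1 solution set is exactly \(\mathcal {D}\). It deliberately does not use the convex hull method or the greedy selection of [28]: as a simple stand-in, it emits one point-exclusion inequality per impossible pattern, which gives a correct but much larger system. All function names are illustrative.

```python
from itertools import product

# PRESENT S-box, as specified in [11].
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def bits(x, n=4):
    """Bits of x as a tuple, most significant bit first."""
    return tuple((x >> (n - 1 - i)) & 1 for i in range(n))


def possible_patterns(sbox):
    """The set D of 8-bit patterns (a, b) with P(a -> b) > 0."""
    patterns = set()
    for a in range(16):
        for x in range(16):
            patterns.add(bits(a) + bits(sbox[x] ^ sbox[x ^ a]))
    return patterns


def exclusion_cut(p):
    """Inequality sum(coeffs[i] * x[i]) >= rhs violated by the 0-1 point p only."""
    coeffs = tuple(1 if bit == 0 else -1 for bit in p)
    return coeffs, 1 - sum(p)


def satisfies(x, system):
    """True if the 0-1 point x satisfies every inequality of the system."""
    return all(sum(c * v for c, v in zip(coeffs, x)) >= rhs
               for coeffs, rhs in system)


D = possible_patterns(SBOX)
cube = set(product((0, 1), repeat=8))
system = [exclusion_cut(p) for p in cube - D]   # one cut per impossible pattern
solutions = {x for x in cube if satisfies(x, system)}
```

The point of the convex hull method together with the greedy algorithm of [28] is to reach the same solution set with far fewer inequalities than this one-cut-per-point construction.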

Modeling Modular Addition [18]. Suppose \(\mathbf {a}=(a_{0},a_{1},\ldots ,a_{n-1})\), \(\mathbf {b}=(b_{0},b_{1},\ldots ,b_{n-1})\) and \(\mathbf {c}=(c_{0},c_{1},\ldots ,c_{n-1})\) are the input and output bit-level XOR-differences of addition modulo \(2^{n}\). The constraints are as follows, where \(d_{\oplus }\) is a 0–1 dummy variable, \(s_{i}~(i=1, \ldots , n-2)\) are 0–1 active markers, and \(\sum _{i=1}^{n-2}s_{i}\) is the negative base-2 logarithm of the probability \(P[(\mathbf {a},\mathbf {b})\rightarrow \mathbf {c}]\).

$$\begin{aligned} \left\{ \begin{array}{l} a_{n-1} + b_{n-1} + c_{n-1} \le 2\\ a_{n-1} + b_{n-1} + c_{n-1} - 2 d_{\oplus } \ge 0\\ d_{\oplus } - a_{n-1}\ge 0\\ d_{\oplus } - b_{n-1} \ge 0\\ d_{\oplus } - c_{n-1}\ge 0\\ -a_{i} + b_{i} + s_{i} \ge 0\\ -b_{i}+ c_{i} + s_{i} \ge 0\\ a_{i} - c_{i} + s_{i} \ge 0\\ a_{i} + b_{i} + c_{i} - s_{i} \ge 0\\ -a_{i} - b_{i} - c_{i} - s_{i} \ge - 3\\ c_{i} + a_{i-1} + b_{i-1} - c_{i-1} + s_{i} \ge 0\\ -a_{i} - b_{i} - c_{i} + 3 a_{i-1} + 3 b_{i-1} + 3 c_{i-1} + 2 s_{i} \ge 0\\ a_{i}+ b_{i} + c_{i} - 3 a_{i-1} - 3 b_{i-1}- 3 c_{i-1} + 2 s_{i}\ge - 6\\ - b_{i} + a_{i-1} - b_{i-1} - c_{i-1} + s_{i} \ge - 2\\ c_{i} + a_{i-1} - b_{i-1} + c_{i-1} + s_{i} \ge 0\\ -a_{i} - b_{i} - c_{i} - 3 a_{i-1} + 3 b_{i-1} - 3 c_{i-1} + 2 s_{i} \ge - 6\\ -a_{i} - a_{i-1}- b_{i-1} + c_{i-1} + s_{i} \ge - 2\\ a_{i} + b_{i} + c_{i} - 3 a_{i-1} + 3 b_{i-1} + 3 c_{i-1} + 2 s_{i} \ge 0\\ (i=1,\ldots ,n-2) \end{array} \right. \end{aligned}$$
(1)
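Two sub-blocks of (1) can be checked in isolation by brute force. The sketch below (with illustrative function names) covers only the inequalities that involve a single bit position: the first five, which act on bit \(n-1\), and the five that relate \(s_{i}\) to \((a_{i}, b_{i}, c_{i})\) alone. It does not verify the remaining inequalities that couple positions i and \(i-1\).

```python
from itertools import product


def last_bit_feasible(a, b, c):
    """First five inequalities of (1) on bit n-1, for some 0-1 dummy d."""
    return any(
        a + b + c <= 2
        and a + b + c - 2 * d >= 0
        and d - a >= 0
        and d - b >= 0
        and d - c >= 0
        for d in (0, 1)
    )


def marker_values(a, b, c):
    """Values of s_i allowed by the five inequalities of (1) involving only bit i."""
    return [
        s for s in (0, 1)
        if -a + b + s >= 0
        and -b + c + s >= 0
        and a - c + s >= 0
        and a + b + c - s >= 0
        and -a - b - c - s >= -3
    ]


triples = list(product((0, 1), repeat=3))

# Bit n-1 behaves like a plain XOR: exactly the even-parity triples are accepted.
xor_like = all(last_bit_feasible(a, b, c) == ((a ^ b ^ c) == 0) for a, b, c in triples)

# The marker is forced to 0 when a_i = b_i = c_i and to 1 otherwise.
marker_ok = all(
    marker_values(a, b, c) == ([0] if a == b == c else [1]) for a, b, c in triples
)
```

In other words, within these ten inequalities the marker \(s_{i}\) is fully determined by whether the three difference bits at position i agree, which is what allows the sum of markers to serve as a probability weight.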

4 Enhancing MILP Based Search with Matsui’s Bounding Condition

Firstly, let us recall the bounding condition of Matsui’s algorithm (see Algorithm 1):

$$\begin{aligned} \prod \limits _{i=1}^{r} P_{Rd(i)} \cdot P_{Best}(R-r)\ge P_{Estim}. \end{aligned}$$
(2)

When we run Matsui’s algorithm against an R-round cipher, the variable \(P_{Estim}\) keeps track of the probability of the best characteristic known by the algorithm so far, and it will be updated dynamically if a strictly better characteristic is encountered during the search. Whenever the algorithm needs to go one level deeper into the search tree, condition (2) is tested. A violation of (2) implies that any extension of the partial solution leads to inferior characteristics with probability less than \(P_{Estim}\) (the probability of a known characteristic). Therefore, the entire subtree is pruned.

To integrate Matsui’s bounding condition into the MILP models, we introduce a variable named xobj acting as the variable \(P_{Estim}\) in Matsui’s algorithm, and let

$$\mathrm {Minimize}~xobj$$

be the objective function of the new model. Note that this is a very natural choice since the variable xobj always keeps track of the currently known best solution during the resolution of the MILP model. To make xobj correspond to the probability weight of the identified characteristic, we put the equation

$$xobj = \sum _{i=1}^{R}\sum _{j = 1}^{k} A_{i,j}$$

into the constraints section of the model. At this point, the new model is completely equivalent to the original model. What we have done is essentially to rename the objective function of the original model.

Assuming we know the probabilities \(P_{Best}(1), P_{Best}(2), \cdots , P_{Best}(R-1)\), we are now ready to express the bounding condition (2) as

$$\begin{aligned} \sum \limits _{t=1}^{i}\sum \limits _{j=1}^{k}A_{t,j}+wt({P_{Best}(R-i)})\le xobj,~~~i=1,\ldots , R-1 \end{aligned}$$
(3)
$$\begin{aligned} \sum \limits _{t=i+1}^{R} \sum \limits _{j=1}^{k}A_{t,j}+wt({P_{Best}(i)})\le xobj,~~~i= 1,\ldots , R-1 \end{aligned}$$
(4)
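The step from (2) to (3) can be spelled out as follows, under the assumptions (not stated explicitly above) that \(wt(p) = -\log _{2} p\) and that \(\sum _{j=1}^{k} A_{t,j}\) equals \(wt(P_{Rd(t)})\). Taking the negative base-2 logarithm of both sides of (2) with \(r = i\) reverses the inequality and turns the product into a sum:

$$\begin{aligned} \prod \limits _{t=1}^{i} P_{Rd(t)} \cdot P_{Best}(R-i) \ge P_{Estim} \iff \sum \limits _{t=1}^{i} wt(P_{Rd(t)}) + wt(P_{Best}(R-i)) \le wt(P_{Estim}). \end{aligned}$$

Replacing \(wt(P_{Rd(t)})\) by \(\sum _{j=1}^{k} A_{t,j}\) and \(wt(P_{Estim})\) by xobj gives (3). Constraint (4) is the same argument applied from the other end of the cipher: the first i rounds of any R-round characteristic form an i-round characteristic, whose weight is at least \(wt(P_{Best}(i))\). Since xobj equals the total weight \(\sum _{t=1}^{R}\sum _{j=1}^{k} A_{t,j}\), both (3) and (4) are satisfied by every valid characteristic, so under these assumptions they do not exclude any feasible solution and can only tighten the relaxations explored by the solver.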

Therefore, for an R-round model, we can generate \(2R-2\) more constraints, where \(wt(\cdot )\) makes \(P_{Best}(i)\) compatible with the probability weight variables. The main difference of the new model is that it takes into account the solutions of the models of lower rounds. In the following, we present three different modeling strategies, which will be compared in the next section.

  • \(\mathcal {M}^{I}\): The original model without any modification.

  • \(\mathcal {M}^{II}\): The model with modified objective function, and \(R-1\) additional constraints of (4) generated from Matsui’s bounding condition for round 1 to round \(R-1\) respectively.

    $$\begin{aligned} \left\{ \begin{array}{l} \text {min}~~xobj \\ \sum \limits _{i,j}A_{i,j}-xobj=0\\ \sum \limits _{t=i+1}^{R} \sum \limits _{j=1}^{k}A_{t,j}+wt({P_{Best}(i)})\le xobj,~~~i=1, \ldots , R-1\\ \end{array} \right. \end{aligned}$$
    (5)
  • \(\mathcal {M}^{III}\): The model with modified objective function, and all \(2R-2\) additional constraints.

    $$\begin{aligned} \left\{ \begin{array}{l} \text {min}~~xobj \\ \sum \limits _{i,j}A_{i,j}-xobj=0\\ \sum \limits _{t=i+1}^{R} \sum \limits _{j=1}^{k}A_{t,j}+wt({P_{Best}(i)})\le xobj,~~~i=1, \ldots , R-1\\ \sum \limits _{t=1}^{i} \sum \limits _{j=1}^{k}A_{t,j}+wt({P_{Best}(R-i)})\le xobj,~~~i=1, \ldots , R-1 \end{array} \right. \end{aligned}$$
    (6)
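The extra rows of \(\mathcal {M}^{II}\) and \(\mathcal {M}^{III}\) are easy to generate mechanically. The sketch below writes them as plain text rows of the form "... - xobj <= -w"; the variable naming scheme `A_t_j`, the function names, and the weights used in the usage example are hypothetical and not taken from any particular solver or cipher.

```python
def format_row(rounds, k, weight):
    """One row: sum of A_t_j over the given rounds, minus xobj, <= -weight."""
    lhs = " + ".join(f"A_{t}_{j}" for t in rounds for j in range(1, k + 1))
    return f"{lhs} - xobj <= {-weight}"


def matsui_constraints(R, k, wt_best, strategy="III"):
    """Text rows for the additional constraints of strategies I, II and III.

    wt_best[n] is the weight of the best n-round characteristic for
    n = 1, ..., R-1 (index 0 is unused).
    """
    rows = []
    if strategy == "I":                 # original model: no extra rows
        return rows
    for i in range(1, R):               # constraints (4): rounds i+1..R
        rows.append(format_row(range(i + 1, R + 1), k, wt_best[i]))
    if strategy == "III":
        for i in range(1, R):           # constraints (3): rounds 1..i
            rows.append(format_row(range(1, i + 1), k, wt_best[R - i]))
    return rows


def objective_link(R, k):
    """The equation tying xobj to the sum of all probability weight variables."""
    lhs = " + ".join(f"A_{t}_{j}" for t in range(1, R + 1) for j in range(1, k + 1))
    return f"{lhs} - xobj = 0"


# Usage with made-up weights for a 3-round model with one weight variable per round.
example_rows = matsui_constraints(3, 1, [0, 2, 4], strategy="III")
```

Keeping the strategy as a parameter reflects the remark above that one may include only a subset of the constraints derived from the bounding condition.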

5 Applications

In this section, we apply the modeling strategy presented in Sect. 4 to PRESENT, SIMON, and SPECK. These ciphers are selected as the experimental targets for two reasons. Firstly, the probabilities (or their equivalents) of the differential characteristics of these ciphers can be expressed as linear functions. Secondly, they represent the most common structures for modern block ciphers: PRESENT is an SPN, SIMON is a Feistel cipher with pure bitwise operations, and SPECK is an ARX construction.

However, we admit that only lightweight primitives are involved in our experiments. This is because the MILP based approach (and actually all currently available automatic search tools) is generally too inefficient to search for characteristics of non-lightweight ciphers directly, and it is sometimes difficult to model the components of non-lightweight ciphers in the first place. For example, only recently did Abdelkhalek et al. show how to model the differential property of an \(8\times 8\) S-box with MILP [2], and even then, the search procedure has to be divided into two steps for a cipher involving \(8 \times 8\) S-boxes, where only truncated differentials are identified in the first step.

In addition, since the focus of this paper is to improve the MILP based method, we will not give a comparison between Matsui’s algorithm and the MILP based approach. Nevertheless, we would like to mention that Matsui’s algorithm is much better than MILP in the case of PRESENT, while for SIMON and SPECK, it is inferior to MILP. Finally, all of the models presented in this paper are solved by the MILP optimizer Gurobi (version 7.0.2) [1] running with 16 threads on a server with an Intel\(^\circledR \) Xeon\(^\circledR \) E5-2637V3 CPU at 3.50 GHz.

5.1 Application to PRESENT

PRESENT, designed by Bogdanov et al., is an ISO standardized lightweight block cipher [11]. The round function of PRESENT is shown in Fig. 2, and we refer the reader to [11] for more information.

Fig. 2. The round function of PRESENT

We construct three models \(\mathcal {M}^{I}\), \(\mathcal {M}^{II}\), and \(\mathcal {M}^{III}\) according to the strategies presented in Sect. 4. The resolution times for these models are recorded in Table 1. Note that what we measure is the time the solver needs to prove that the solution it identified is optimal. This timing information is the most important, since what we care about in the design process is the bound, and the tighter the bound is, the more accurate the security evaluation.

Table 1. Experimental results of PRESENT

From Table 1 we can see that the resolution time can be significantly improved by using the new modeling strategies. For instance, we can prove that the probability of the optimal characteristic of 8-round PRESENT is \(2^{-32}\) in 1074.45 s by using \(\mathcal {M}^{III}\), while with \(\mathcal {M}^I\) we cannot get this result in less than 10 h. Moreover, by using the new models, we observe some interesting phenomena that we cannot explain. For example, the resolution time of \(\mathcal {M}^{III}\) for 6-round PRESENT is shorter than that of the 5-round model.

5.2 Application to SIMON

SIMON (depicted in Fig. 3) is a family of lightweight block ciphers designed by the National Security Agency of the USA, with a Feistel structure involving only bitwise operations: XOR, AND, and rotation. The parameters of the SIMON instances involved in our experiments are summarized in Table 2.

Fig. 3. The round function of SIMON

Table 2. Parameters for SIMON32 and SIMON48

We construct three models \(\mathcal {M}^{I}\), \(\mathcal {M}^{II}\), and \(\mathcal {M}^{III}\) according to the strategies presented in Sect. 4. The resolution times for these models are recorded in Table 3.

Table 3. Experimental results of SIMON

From Table 3 we can see that, for larger number of rounds, the improvement is obvious. For example, using \(\mathcal {M}^{III}\) we can prove that the probability of the optimal characteristic of 15-round SIMON48 is \(2^{-46}\) in 2444.28 s, while for \(\mathcal {M}^I\), the resolution time is 31979.80 s.

5.3 Application to SPECK

SPECK is a family of ARX Feistel block ciphers (depicted in Fig. 4) designed by the National Security Agency of the USA. The parameters of the SPECK instances involved in our experiments are summarized in Table 4.

We construct three models \(\mathcal {M}^{I}\), \(\mathcal {M}^{II}\), and \(\mathcal {M}^{III}\) according to the strategies presented in Sect. 4. The resolution times for these models are recorded in Table 5. However, the results show that the new modeling strategies are inferior to the original method. This may imply that adding Matsui’s bounding conditions to MILP models of ARX ciphers is not a good choice.

Fig. 4. The round function of SPECK

Table 4. Parameters for SPECK32 and SPECK48
Table 5. Experimental results of SPECK

6 Conclusion

Borrowing ideas from Matsui’s algorithm, we tweak the MILP models for differential cryptanalysis by altering the objective functions and introducing special constraints derived from Matsui’s bounding condition. We apply this new modeling strategy to PRESENT, SPECK, and SIMON, which demonstrates that the fusion of Matsui’s bounding condition and the MILP approach leads to faster resolution in some cases. Therefore, the new modeling approach is expected to reduce the time cost of differential and linear analysis. In particular, during the design process of symmetric-key schemes, a larger design space may be explored within limited time. Our work shows that it can be beneficial to include Matsui’s bounding condition in the MILP models for differential analysis. More generally, it would be interesting to see how to integrate other search heuristics [16, 17] from the literature of symmetric-key cryptanalysis into the MILP models.