Keywords

1 Introduction

The cube attack is an effective cryptanalytic technique proposed by Dinur and Shamir in [2]. For stream ciphers, let f(xv) be the first bit of keystream, and \(v=(v_1,v_2,\cdots ,v_m)\) be m-bit public variables and \(x=(x_1,x_2,\cdots ,x_n)\) be n-bit secret variables. Cube indices set containing the indexes of some public variables is I. \(C_I\) containing all possible combinations of public variables indexed by the set I, which is applied to the initialization of the f. The sum of all possible outputs of f can be obtained, which is represented as the polynomial of x and v, and can be called the superpoly of \(C_I\) in f. If the superpoly of the \(C_I\) is simple enough, we can recover the superpoly in the offline phase and obtain some information of secret variables involved in the superpoly in the online phase. In [2], the superpoly is tested for its linearity. If the test passes, the superpoly can be recovered by assuming it is linear. Recently, there are some new research insights based on the cube attack, such as correlation cube attack [6], division property based cube attack [9], degree evaluation of NFSR-based cryptosystems [5] and dynamic cube attack [3].

Division property can construct integral distinguisher by analyzing the structure of block ciphers, which was proposed by Todo et al. in [7]. In [8], Todo et al. obtained the integral property of the full-round MISTY1 by the division property. In order to improve the propagation of division property within the structure, Todo proposed bit-based division property and bit-based division property using three subsets in [10]. The bit-based division property can find new integral property for Simon32. In [13], Xiang et al. applied mixed integer linear programming(MILP) to the propagation of bit-based division property, which can reduce the complexity of time and memory. In [4], a variant three subsets division property with STP solver was proposed. This method can apply the automatic search model to ciphers with large block sizes.

In [10], Todo et al. applied bit-based division property based on MILP model to the cube attack on stream ciphers. The superpoly of \(C_I\) can be recovered by the truth table that depends on the values of secret variables indexed by J and assignments of values of non-cube variables. When a non-constant superpoly was found by a proper assignment of non-cube variables, some information about secret variables can be recovered in the online phase. In [11], Wang et al. introduced flag technique and degree evaluation to reduce the complexity and enhance the precision of division property based cube attack. In the offline phase, the superpoly was recovered by constructing a linear system of coefficients for all possible monomials instead of constructing the truth table.

Unlike applying bit-based division property to cube attack, [12] proposed a practical method that based on bit-based division property using three subsets. The corresponding coefficients of all possible monomials can be evaluated by propagating \(\mathbb {L}\) instead of constructing truth table as in [9] or solving a linear system as in [11].

Besides improving the efficiency of recovering the superpoly of division property based cube attack, another important research is a new variant of division property from an algebraic point of view in [14]. The authors built the relationship between bit-based division property and the algebraic degree evaluation, which can remove invalid division trails. These insights has been applied to verify the results of division property based cube attack on Trivium.

In this paper, we focus on the following improvements of division property based on cube attack that introduces the flag technique.

  • Firstly, the previous phase that evaluating all possible monomials in [11] does not consider the relationship between lower degree monomials and specific higher degree monomials. In their methods, all t-degree monomials can be evaluated from the combinations of the involved variables indices J or indices set \(JR_t \subset J\) that composing all t-degree monomials. However, when evaluating all t-degree monomials that may exist in the superpoly by propagating bit-based division property, there are many t-degree monomials that do not exist for the combinations of indices set J or \(JR_t\) need to be evaluated. The complexity of evaluating these monomials is huge. Therefore, we introduce the filter technique to further reduce the complexity of evaluating all t-degree monomials. With the filter technique, when low-degree monomials do not exist in the superpoly under propagating division property, some specific high-degree monomials also do not exist in the superpoly. Therefore, these evaluations can be saved.

  • Secondly, non-cube public variables that involved in the superpoly can be achieved by filter technique and added to the set J. While evaluating monomials, the effect of non-cube public variables on monomials that may exist in the superpoly can be considered directly, rather than attempting to find proper non-cube IV assignment with MILP model as in [11].

  • Thirdly, during the initialization phase for flag technique, we modify the parameters of each public variable and secret variable to remove most invalid monomials. Then, only corresponding coefficients of the remaining monomials need to be determined.

The outline of this paper is as follows. Section 2 introduces the background of cube attack, bit-based division property, division trail, MILP model and flag technique. Section 3 provides the filter technique to reduce the complexity of evaluating monomials by division trails. We remove some invalid monomials by modifying the parameters of the flag technique in Sect. 4. We apply the method to Grain128a in Sect. 5. Finally, we give a conclusion.

2 Preliminaries

2.1 Cube Attack

Cube attack was first proposed in [2]. Let f(xv) be the first output bit of the stream cipher that contains m-bit public variables \(v=(v_1,v_2,\cdots ,v_m)\) and n-bit secret variables \(x=(x_1,x_2,\cdots ,x_n)\). Cube indices set is \(I=\{i_1,i_2,\cdots ,i_{|I|}\} \subset \{1,2,\cdots ,m\}\) and a maxterm is \(t_I=v_{i_1}v_{i_2}\cdots v_{i_{|I|}}\). Then, f(xv) can be decomposed as

$$\begin{aligned} f(x,v)=t_Ip(x,v)+q(x,v) \end{aligned}$$

where q(xv) miss at least one variable indexed by I, and p(xv) is independent of the variables indexed by I, denoted as the superpoly of \(C_I\). As inputting the cube \(C_I\) into the cryptosystems, the sum of all output bits is

$$\begin{aligned} \underset{C_I}{\oplus }f(x,v)=\underset{C_I}{\oplus }\,t_Ip(x,v)+ \underset{C_I}{\oplus }\,q(x,v)=p(x,v) \end{aligned}$$
(1)

If the cube indices set I is determined and the superpoly of \(C_I\) is not constant polynomial in the offline phase, then some information of secret variables can be recovered with the same assignment to I and non-cube IV in the online phase.

2.2 Bit-Based Division Property

The conventional bit-based division property was proposed in [10] and the definition is as follows. Let \(\mathbb {X}\) be the multiset whose elements are values of \(\mathbb {F}_2^n\) and \(\mathbb {K}\) be a set of n-bit vectors. If the multiset \(\mathbb {X}\) has division property \(\mathcal {D}_{\mathbb {K}}^n\), it will fulfill the following conditions:

$$\begin{aligned} \underset{x \in \mathbb {X}}{\oplus } {\pi }_{u}(x)= {\left\{ \begin{array}{ll} \textit{unknown} &{} \text {if there exist } k \in \mathbb {K} \text { s.t. } u \ge k\\ \textit{0} &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

where \({\pi }_u(x)=\prod _{i=0}^{n-1}x_{i}^{u_i}\) and \(u \ge k\) if \(u_i \ge k_i\) for all i. There are three basic propagation rules for the conventional bit-based division property was proved in [10], which are And, Xor and Copy, respectively. In the following, we show these three basic propagation rules.

Rule 1

(Copy) [10]. Let \(x_1 \in \mathbb {F}_2\) be the input of Copy propagation, and \((y_1,y_2)\in \mathbb {F}_2^{2}\) be the output values. Assuming that the input and output multiset have \(\mathcal {D}_{\mathbb {K}}^{1}\) and \(\mathcal {D}_{\mathbb {K}^{'}}^{{2}}\), Copy operation that values are propagated from \(k \in \mathbb {K}\) to \(\mathbb {K}^{'}\) can be computed as follows.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbb {K}^{'}=\{(0,0)\}, &{} \textit{if} \ k_1=0\\ \mathbb {K}^{'}=\{(1,0),(0,1)\}, &{} \textit{if} \ k_1=1 \end{array}\right. } \end{aligned}$$

Rule 2

(And) [10]. Let \((x_1,x_2) \in \mathbb {F}_2^{2}\) be the input of And propagation, and \(y_1\in \mathbb {F}_2\) be the output values. Assuming that the input and output multiset have \(\mathcal {D}_{\mathbb {K}}^{{2}}\) and \(\mathcal {D}_{\mathbb {K}^{'}}^{1}\), And operation that values are propagated from \(k\in \mathbb {K}\) to \(\mathbb {K}^{'}\) can be computed as follows.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbb {K}^{'}=\{(0)\}, &{} \textit{if} \ k=(0,0)\\ \mathbb {K}^{'}=\{(1)\}, &{} \textit{otherwise} \end{array}\right. } \end{aligned}$$

Rule 3

(Xor) [10]. Let \((x_1,x_2) \in \mathbb {F}_2^{2}\) be the input of Xor propagation, and \(y_1\in \mathbb {F}_2\) be the output values. Assuming that the input and output multiset have \(\mathcal {D}_{\mathbb {K}}^{{2}}\) and \(\mathcal {D}_{\mathbb {K}^{'}}^{1}\), Xor operation that values are propagated from \(k\in \mathbb {K}\) to \(\mathbb {K}^{'}\) can be computed as follows.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbb {K}^{'}=\{(0)\}, &{} \textit{if} \ k=(0,0)\\ \mathbb {K}^{'}=\{(1)\}, &{} \textit{if} \ k=(0,1),(1,0) \\ \mathbb {K}^{'}=\emptyset , &{} \textit{if} \ k=(1,1) \end{array}\right. } \end{aligned}$$

2.3 Division Trail [13]

The propagation of bit-based division property can be denoted as \(\mathbb {K}_0 \rightarrow \mathbb {K}_1 \rightarrow \cdots \rightarrow \mathbb {K}_r\). \(\mathbb {K}_0\) is initial division property of input multiset. For any vector \(k_i^{*} \in \mathbb {K}_i\), there must exist a vector \(k_{i-1}^{*} \in \mathbb {K}_{i-1}\) such that \(k_{i-1}^{*}\) can propagate to \(k_{i}^{*}\) by the propagation of division property. If \(k_{i-1}\) can propagate to \(k_i\) for all \(i \in \{1,2,\cdots ,r\}\), \((k_0,k_1,\cdots ,k_r)\) can be called as the r-round division trail, where \((k_0,k_1,\cdots ,k_r) \in (\mathbb {K}_0,\mathbb {K}_1,\cdots ,\mathbb {K}_r)\).

For the r-round cipher, let \(\mathbb {X}\) be the chosen plaintexts and \(\mathcal {D}_{\mathbb {K}_0}^{n}\) be the division property of \(\mathbb {X}\). After r-round encryption, the division property \(\mathcal {D}_{\mathbb {K}_r}^{n}\) of the output ciphertexts can be computed by division trail. If \(\mathbb {K}_r\) does not contain the unit vector \(e_j\), then the j-th bit of the output ciphertexts is balanced.

2.4 MILP Model of Division Property

In [13], the authors applied MILP models to construct the propagation of division property to reduce the complexity of propagating the division property. We describe the propagation MILP models for And, Xor and Copy as follows.

Model 1

(MILP model for Copy) [9]. Let \(a \rightarrow (b_1,b_2,\cdots ,b_m)\) be the division property propagation of Copy operation. The following MILP model can describe this propagation.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathcal {M}.var \leftarrow a,b_1,b_2,\cdots ,b_m ~\mathrm {as}~ \mathrm {binary}\\ \mathcal {M}.con \leftarrow a=b_1+b_2+\cdots +b_m \\ \end{array}\right. } \end{aligned}$$

Model 2

(MILP model for Xor) [9]. Let \((a_1,a_2,\cdots ,a_m) \rightarrow b\) be the division property propagation of Xor operation. The following MILP model can describe this propagation.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathcal {M}.var \leftarrow a_1,a_2,\cdots ,a_m,b~\mathrm {as~binary} \\ \mathcal {M}.con \leftarrow a_1+a_2+\cdots +a_m=b\\ \end{array}\right. } \end{aligned}$$

Model 3

(MILP model for And) [9]. Let \((a_1,a_2,\cdots ,a_m) \rightarrow b\) be the division property propagation of And operation. The following MILP model can describe this propagation.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathcal {M}.var \leftarrow a_1,a_2,\cdots ,a_m,b ~\mathrm {as~binary} \\ \mathcal {M}.con \leftarrow b \ge a_i \text { for } i=1,2,\cdots ,m \end{array}\right. } \end{aligned}$$

Flag Technique

In order to enhance the precision of Copy + And MILP model, the authors in [11] proposed flag technique that adding the parameter \(v.F \in \{1_c,0_c,\delta \}\) for every variable \(v \in \mathcal {M}\). The parameters have three basic operations. The = operation rules are \(1_c = 1_c\), \(0_c = 0_c\) and \(\delta =\delta \). The \(\oplus \) operation rules are

$$\begin{aligned} {\left\{ \begin{array}{ll} 1_c \oplus 1_c = 0_c \\ 0_c \oplus x = x \oplus 0_c = x \\ \delta \oplus x = x \oplus \delta = \delta \ \ \ {x\in \{1_c,0_c,\delta \}} \end{array}\right. } \end{aligned}$$

The \(\times \) operation rules are

$$\begin{aligned} {\left\{ \begin{array}{ll} 1_c \times x = x\times 1_c=x\\ 0_c \times x = x \times 0_c = 0_c \\ \delta \times \delta = \delta \ \ \ {x\in \{1_c,0_c,\delta \}} \end{array}\right. } \end{aligned}$$

When applying flag technique into division property, three basic operations are modified to copyf, xorf and andf.

Model 4

(MILP model for copyf) [11]. Let \(a\rightarrow (b_1,b_2,\cdots ,b_m)\) be the propagation of copyf operation. The following MILP model can describe this propagation.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathcal {M}.var \leftarrow a,b_1,b_2,\cdots ,b_m ~\mathrm {as~binary}\\ \mathcal {M}.con \leftarrow a=b_1+b_2+\cdots +b_m\\ a.F=b_1.F=\cdots = b_m.F \end{array}\right. } \end{aligned}$$

Model 5

(MILP model for xorf) [11]. Let \((a_1,a_2,\cdots ,a_m) \rightarrow b\) be the propagation of xorf operation. The following MILP model can describe this propagation.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathcal {M}.var \leftarrow b,a_1,a_2,\cdots ,a_m ~\mathrm {as~binary} \\ \mathcal {M}.con \leftarrow a_1+a_2+\cdots +a_m=b\\ b.F=a_1.F \oplus a_2.F \oplus \cdots \oplus a_m.F \end{array}\right. } \end{aligned}$$

Model 6

(MILP model for andf) [11]. Let \((a_1,a_2,\cdots ,a_m) \rightarrow b\) be the propagation of andf operation. The following MILP model can describe this propagation.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathcal {M}.var \leftarrow b,a_1,a_2,\cdots ,a_m ~\mathrm {as~binary} \\ \mathcal {M}.con \leftarrow b\ge a_{i}, \ i \in \{1,\cdots ,m\}\\ b.F=a_1.F \times a_2.F \times \cdots \times a_m.F\\ \mathcal {M}.con \leftarrow b=0,if \ b.F=0_c \end{array}\right. } \end{aligned}$$

2.5 Division Property Based Cube Attack

At CRYPTO 2017, the authors proposed the division property based cube attack to recover the superpoly in [9]. In [11], Wang et al. proposed some techniques to reduce the complexity of recovering the superpoly. There were the following lemma and propositions proved in the division property based cube attack.

Lemma 1

[9]. Let \(f(x)=\underset{u \in {\mathbb {F}}_2^n}{\oplus } a_u^{f}x^u\) be a polynomial from \(\mathbb {F}_2^n\) to \(\mathbb {F}_2\) and \(a_u^f \in \mathbb {F}_2\) be the ANF coefficients, where \(u \in \mathbb {F}_{2}^{n}\). Let k be the n-dimension bit vector. Assuming there is no division trail such that \(k \xrightarrow f 1\), then \(a_u^f\) is always 0 for \(u \ge k\).

Proposition 7

[9]. Let f(xv) be a polynomial from \(\mathbb {F}_{2}^{n+m}\) to \(\mathbb {F}_{2}\), where x and v denote the secret and public variables. Let \(C_I\) be a set of \(2^{|I|}\) values, where the indices set \(I=\{i_1,i_2,\cdots ,i_{|I|}\} \subset \{1,2,\cdots ,m\}\) and the variables indexed by \(\{i_1,i_2,\cdots ,i_{|I|}\}\) are taking all possible combinations. Let \(k_I\) be the m-dimensional bit vector such that \(v^{k_I}=v_{i_1}v_{i_2}\cdots v_{i_{|I|}}\), i.e, \(k_i=1\) if \(i\in I\) and \(k_i=0\) otherwise. Assuming there is no division trail such that \((e_j,k_I) \xrightarrow {f} 1\), then \(x_j\) is not involved in the superpoly of the cube \(C_I\).

Proposition 9

[11]. Let f(xv) be a polynomial from \(\mathbb {F}_{2}^{n+m}\) to \(\mathbb {F}_2\), where x and v denote the secret and public variables. Let \(C_I\) be a set of \(2^{|I|}\) values, where the indices set \(I=\{i_1,i_2,\cdots ,i_{|I|}\} \subset \{1,2,\cdots ,m\}\) and variables indexed by \(\{i_1,i_2,\cdots ,i_{|I|}\}\) are taking all possible combinations. Let \(k_I\) be the m-dimensional bit vector such that \(v^{k_{I}}=t_I=v_{i_{1}}v_{i_{2}}\cdots v_{i_{|I|}}\). Let \(k_{\varLambda }\) be the n-dimension bit vector. Assuming there is no division trail such that \((k_{\varLambda } || k_{I}) \xrightarrow f 1\), then the monomial \(x^{k_{\varLambda }}\) is not involved in the superpoly of the cube \(C_{I}\).

With the above lemma and propositions, [9] obtained variables indexed by J, which are involved in the superpoly. The superpoly could be recovered by constructing the truth table of size \(2^{|J|}\). The degree evaluation in [11] could return the degree d of the superpoly. The recovery of the superpoly could be turned to recover the corresponding coefficients of all \(\sum _{i=0}^{d}\genfrac(){0.0pt}1{J}{i}\) possible monomials. The complexity of recovering the superpoly is \(2^{|I|} \times \sum _{i=0}^d \genfrac(){0.0pt}1{|J|}{d}\).

3 Reduce Evaluation Complexity

3.1 Filter Technique

According to Proposition 9, some invalid monomials that do not exist in the superpoly can be removed by evaluating division property. Let involved secret variables indices set be J. Initially, the degree d of superpoly can be returned by degree evaluation. Then, all possible t-degree monomials that combinations are from J can be evaluated for \(t=0,1,\cdots ,d\). Because evaluations of all t-degree monomials are large. Therefore, we reduce the complexity of the phase that evaluating these possible monomials with the relationship between the low-degree monomials and some specific high-degree monomials under the propagation of division property. Degree of the superpoly also can be determined after this phase. Therefore, some extra evaluations can be saved. The relationship can be described as follows.

Proposition 10

Let f(xv) be a polynomial, where x and v denote n-bit secret variables and m-bit public variables. Let \(C_I\) be a set of \(2^{|I|}\) values, where the indices set \(I=\{i_1,i_2,\cdots ,i_{|I|}\} \subset \{1,2,\cdots ,m\}\) and variables indexed by I are taking all possible combinations. Let \(k_{I}\) be m-dimensional bit vector such that \(v^{k_I}=t_{I}=v_{i_1}v_{i_2}\cdots v_{i_{|I|}}\). Let \(k_{\varLambda }\) and \(k_{\varDelta }\) be n-dimensional bit vectors, where \(k_{\varDelta } \ge k_{\varLambda }\). Assuming there is no division trail such that \((k_{\varLambda }||k_{I}) \xrightarrow {f} 1\), then the division trail that \((k_{\varDelta }||k_{I}) \xrightarrow {f} 1\) do not exist, either. Therefore, the monomial \(x^{k_{\varDelta }}\) is not involved in the superpoly.

Proof

The ANF of the first output bit f(xv) can be represented as

$$\begin{aligned} \begin{aligned} f(x,v)=\underset{u \in \mathbb {F}_2^{n+m}}{\oplus } a_u^f(x||v)^u \end{aligned} \end{aligned}$$

When cube indices I is determined, f(xv) can be decomposed as follows.

$$\begin{aligned} \begin{aligned} f(x,v)&=\underset{u \in \mathbb {F}_2^{n+m}|u \ge (0||k_I)}{\oplus } a_u^f (x||v)^u \oplus \underset{u \in \mathbb {F}_2^{n+m}|u \ngeq (0||k_I)}{\oplus } a_u^f(x||v)^u \\&=t_I \underset{u \in \mathbb {F}_2^{n+m}|u \ge (0||k_I)}{\oplus } a_u^f (x||v)^{u \oplus (0||k_I)} \oplus \\&\ \ \ \underset{u \in \mathbb {F}_2^{n+m}| u \ngeq (0||k_I)}{\oplus } a_u^f (x||v)^u \\&=t_Ip(x,v) \oplus q(x,v) \end{aligned} \end{aligned}$$

The superpoly p(xv) of the cube \(C_I\) can be denoted as

$$\begin{aligned} p(x,v)=\underset{u \in \mathbb {F}_2^{n+m}|u \ge (0||k_I)}{\oplus } a_u^f (x||v)^{u \oplus (0||k_I)} \end{aligned}$$

According to Lemma 1, if there is no division trail that \((k_{\varLambda }||k_I) \xrightarrow {f} 1\), then \(a_{(k_{\varDelta }||k_I)}^{f}=0\) for \((k_{\varDelta }||k_I) \ge (k_{\varLambda }||k_I)\). Therefore, the monomial \(x^{k_{\varDelta }}\) does not exist in the superpoly, either.    \(\square \)

Proposition 10 is inspired by [11] and shows the relationship between low-degree monomials and specific high-degree monomials, where the ANF of high-degree monomials contain the ANF of low-degree monomials. When the division trails that evaluating low-degree monomials do not exist, the evaluations of these specific high-degree monomials can be saved. Fewer high-degree monomials need to be evaluated by solving the model. Based on these operations, the complexity that evaluating all possible monomials can be reduced.

For monomials that combinations are from indices J, if \(d < |J|\), specific high-degree monomials will be removed from all possible monomials involved in the superpoly when the results of MILP model that evaluating low-degree monomials are infeasible. Therefore, these monomials need not to be evaluated again. Because all higher degree monomials that do not exist will be removed, the bound of the superpoly’s algebraic degree can be achieved. If \(d=|J|\), the complexity of evaluation will be consistent with the method term enumeration that evaluating all monomials from J in [11].

For t-degree monomials that combinations are from indices set \(JR_t\), where \(JR_t\) are key indices set that can compose all t-degree monomials, specific higher t-degree monomials also can be removed when division trails of lower t-degree monomials do not exist such that evaluations can be saved. If no specific higher t-degree monomials can be removed during the evaluation phase, the complexity will be consistent with relaxed term enumeration in [11]. For simplicity, we focus on the process that evaluating the monomials from indices set J. Compared to indices set J, \(JR_t\) only can reduce the searching space when evaluating t-degree monomials.

The application of the filter technique on evaluating the possible monomials can be written as Algorithm 1. The inputs of TERMEV consist of cube indices I, constant 1 indices \(I_1\), constant 0 indices \(I_0\) and involved variables indices J. When the result of MILP model that evaluating the t-degree monomial is infeasible, which is from all t-degree monomials indices set \(J_t^{enum}\), specific monomials that algebraic degree is higher than t can be removed from set \(J^{enum}\) by procedure DELETE. In Algorithm 1, \(J^{enum}\) contains all monomials indices that have not been evaluated, which are from the combinations of J. The outputs of the algorithm are bound of algebraic degree and monomials indices set \(J_{term}\) that may exist in the superpoly.

3.2 Application of Filter Technique on Non-cube Public Variables

The effect of non-cube public variables on the superpoly could be considered by repeated summations in [9] or attempting to find proper assignments of IVs with the MILP model in [11]. However, it is not obvious whether the monomial contains some specific non-cube public variables when evaluating the specific monomial. Now, when applying the filter technique to evaluate the non-cube public variables, variables indexed by \(J_{v}\) that involved in the superpoly can be achieved by running Algorithm 2. J can be called as updated involved variables indices, where \(J=J \cup J_v\). Then, all possible monomials that combinations are from J can be evaluated by Algorithm 1. Upper bound of the algebraic degree of updated involved variables indices and all possible monomials under the propagation of division property will be returned by running Algorithm 1.

figure a
figure b

4 Improve the Evaluations with Modified MILP Model

In Sect. 3, the relationship between low-degree monomials and specific high-degree monomials is considered by the filter technique. Some high-degree monomials will be removed from all corresponding degree monomials set and these monomials will not need to be evaluated by propagating division trails again. Only the rest of high-degree monomials need to be evaluated by solving the MILP model. Therefore, the filter technique can be applied to reduce the complexity of evaluations. After evaluating all possible monomials by Algorithm 1, there are close to \(\sum _{i=0}^{d} |JT_i|\) monomials whose corresponding coefficients still need to be further determined by constructing a linear system, where \(|JT_i|\) is the number of i-th degree monomials that are returned by utilizing filter technique. The complexity of the process that determining corresponding coefficients is huge.

When evaluating the possible monomials by propagating division trails, there are some invalid division trails that affect the results of the model. Hence, close to \(\sum _{i=0}^{d} |JT_i|\) model results are feasible. To improve the results of MILP model, when evaluating the remaining monomial, the parameters of flag technique are modified to identify some invalid division trails. When removing these division trails, fewer trails can propagate to \(\varvec{1}\) that relate to corresponding uncertain monomials. After the phase that evaluating all possible monomials by utilizing filter technique, the remaining monomials can be evaluated by modifying the initial parameters of the flag technique for all public and secret variables. As is described in Algorithm 3, when cube indices set is I and all possible monomial indices set is \(J_{term}\), the process that modifying the initial parameters includes setting the parameters for the variables indexed by I and \(J_t\) to \(\delta \), where \(J_t\in J_{term}\), setting the parameters that the constant variables to \(1_c\) or \(0_c\) and setting the remaining variables to \(0_c\). With the MILP model that introduces the flag technique of modifying the initial parameters, if the trail to \(\varvec{1}\) does not exist, the monomial indexed by \(J_{t}\) does not exist in the superpoly, either. If all monomials indices \(J_t \in J_{term}\) are evaluated, FILTERTERM outputs the remaining monomials indices \(J_{out}\).

Finally, in the offline phase, \(< \sum _{i=0}^{d} {|JT_i|}\) monomials that indexed by \(J_{out}\) need to be determined by constructing a linear system. The complexity of recovering the superpoly is

$$\begin{aligned} 2^{|I|} \times c \cdot \sum _{i=0}^{d} |JT_i| \end{aligned}$$
(2)

where \(c\in (0,1)\) is the parameter of computational complexity.

In the online phase, the sum can be obtained as Eq. (1) after inputting the \(C_{I}\) into the cryptosystem. For all \(2^{|J|}\) possible combinations, the output value of the ANF of superpoly can be solved by looking up the corresponding coefficients table. Then, only the combinations whose corresponding output values are equal to the initial sum can be reserved as the correct key candidates. Therefore, the complexity of online phase is \(2^{|I|}+2^{|J|}\times c \cdot \sum _{i=0}^{d}{|JT_{i}|}\).

Fig. 1.
figure 1

Structure of Grain128a

5 Applications to Grain128a

5.1 Description of Grain128a

Grain128a[1] is the member of Grain family of stream ciphers. The state of Grain128a is represented by a 128-bit LFSR and a 128-bit NFSR, which is described in Fig. 1. During the initialization step, the 96-bit IV are loaded into the LFSR, with the other state bits are set to 1 except the least one bit is set to 0. The initial state bits can be represented by

$$\begin{aligned} \begin{aligned}&(b_0,\cdots ,b_{127})=( K _1,\cdots , K _{128}) \\&(s_0,\cdots ,s_{127})=( IV _1,\cdots , IV _{96},1,\cdots ,1,0) \end{aligned} \end{aligned}$$

The algorithm runs 256 rounds without producing any keystream bit. A complete description of the update function and the output function is given as follows.

$$\begin{aligned} \begin{aligned} g \leftarrow&b_0+b_{28}+b_{56}+b_{91}+b_{96}+b_{3}b_{67} \\&+b_{11}b_{13}+b_{17}b_{18}+b_{27}b_{59}+b_{40}b_{48}+b_{61}b_{65} \\&+b_{68}b_{84}+b_{88}b_{92}b_{93}b_{95}+b_{22}b_{24}b_{25}+b_{70}b_{78}b_{82} \\ f \leftarrow&s_0+s_7+s_{38}+s_{70}+s_{81}+s_{96} \\ h \leftarrow&b_{12}s_{8}+s_{13}s_{20}+b_{95}s_{42}+s_{60}s_{79}+b_{12}b_{95}s_{94}\\ z \leftarrow&h+s_{93}+ \sum _{j \in A}b_j , A=\{2,15,36,45,64,73,89\} \end{aligned} \end{aligned}$$

During the initialization step, the state is represented as

$$\begin{aligned} \begin{aligned}&(b_0,b_1,\cdots ,b_{127}) \leftarrow (b_1,\cdots ,b_{127},g+s_0+z) \\&(s_0,s_1,\cdots ,s_{127}) \leftarrow (s_1,\cdots ,s_{127},f+z) \end{aligned} \end{aligned}$$
figure c
figure d

5.2 Experimental Verification

The process that constructing the initial model \(\mathcal {M}\) is the same as the procedure \(\textit{Grain128aEval}\) in [11]. All division trails to r round can be evaluated with the model \(\mathcal {M}\).

When the model \(\mathcal {M}\) is implemented, the secret variables involved indices J and the public variables involved indices \(J_v\) can be received. We choose the same cube as in [11] that \(I=\{1,2,\cdots ,9\}\) to verify our new scheme.

Example: For analysing 106-round Grain128a, where constant 1 indices set is \(I_1=\{97,\cdots ,127\}\) and constant 0 indices set is \(I_0=\{128\}\), when running algorithms that evaluating public and secret variables, the secret variables involved indices \(J=\{53,85,119,122,126,127\}\) and public variables involved indices \(J_v=\{76\}\) can be identified. After executing Algorithms 1 and 3, the remaining monomials that \(x_{53}x_{119}x_{122}x_{126}x_{127}\), \(x_{85}x_{119}x_{122}x_{126}x_{127}\), \(x_{53}x_{119}v_{76}x_{126}x_{127}\) and \(x_{85}x_{119}v_{76}x_{126}x_{127}\) need to be determined. These results are in accordance with the superpoly which is given in [9]. While running Algorithm 1, the degree 5 of the superpoly is returned. The number of monomials that need to be determined has a \(5.56\%\) reduction.

To further verify the scheme, we choose some cube indices I for low-round Grain128a as described in Table 1, which are obtained by exploiting the filter technique. When the rounds and the corresponding cube indices I are chosen, there are no non-constant public variables involved in the superpoly by running Algorithm 2. These cube indices can be regarded as effective. Then, the secret variables that involved in the superpoly can be achieved. After running Algorithms 1 and 3, the complexity of recovering the remaining monomials can be computed. The comparison of the evaluation complexity between filter technique and previous method also can be achieved. For example, when the round is 108 and cube indices set is \(I=\{1,2,3,4,5,6,7,8,9,10,12,14\}\), the model result of monomial \(x_{18}x_{53}\) is infeasible. Therefore, all specific high-degree monomials that contain \(x_{18}x_{53}\) need not to be evaluated by solving model and the complexity of evaluations can be reduced. In Table 1, the parameter c represents that the complexity changes from the attack that recovering possible monomials that returned by precise term enumeration in [11] to our new method.

For these results, the upper bound of all c values are \(25\%\). As is described in Table 1, the remaining monomials only include fewer possible monomials that need to be determined by constructing a linear system. When the rounds are larger, the changes are obvious. These changes are easy to understand. When the superpoly only includes a t-degree monomial that t is large, all low-degree monomials that combinations are from these t variables will not be removed from the possible monomials set by evaluating division trails. However, when evaluating these low-degree monomials by modifying the parameters of the flag technique, these division trails can be identified effectively. Only fewer remaining monomials need to be recovered by constructing a linear system.

5.3 Theoretical Results

For 184-round Grain128a, if \(v_{47}=0\), then the size of J is 21 and the degree is 14, respectively. After utilizing the filter technique, the number of possible monomials that the corresponding coefficients should be further determined is about \(2^{14.602}\).

While evaluating the monomials by modifying the corresponding parameters of flag technique, no division trails can propagate to \(\varvec{1}\) for the monomials that algebraic degree is smaller than 7. These monomials need not to be determined by constructing the linear system again. Without evaluating monomials that algebraic degree is higher than 6 by modifying the corresponding parameters of flag technique, there are about \(57\%\) of monomials that given in [11] need to be determined, which the size is 14034.

Under the process that evaluating the above low-degree monomials, the complexity has about \(43\%\) reducation. The recovery attack can reduce the complexity from \(2^{109.602}\) to \(c\cdot 2^{109.602}\), where \(c\cdot 2^{109.602}\) is smaller than \(2^{108.777}\).

When analysing higher rounds Grain128a by Algorithm 2, we find that all non-constant public variables should be chosen as cube variables and the size of the cube indices set is 96. Therefore, when analysing these rounds Grain128a, the recovery attack is meaningful only if the number of possible monomials that determined by constructing a linear system is smaller than \(2^{32}\). Compared with the previous recovery attack, the parameter c can affect the number of the remaining possible monomials that need to be determined, which can enable us to analyze higher target rounds.

Table 1. Summary of experimental results on low-round Grain128a

6 Conclusion

In this paper, we propose some methods to reduce the complexity of recovering the superpoly, which is developed from division property based cube attack. Firstly, we develop the filter technique to reduce the complexity of evaluating the monomials by propagating division trails. We also assess the non-cube public variables that involved in the superpoly by the filter technique, rather than attempting to identify proper initialization of non-cube public variables. Secondly, we remove most invalid monomials by modifying the initialization for parameters of every variable. With this scheme, fewer remaining monomials need to be further determined. While evaluating monomials, the effect of non-cube public variables on the monomials can be considered directly.