1 Introduction

The cube attack, a chosen-plaintext key-recovery attack, was introduced by Dinur and Shamir [8] at Eurocrypt 2009. In such an attack, one expresses the outputs of a cryptosystem as Boolean functions of the inputs, namely the key bits and the plaintext bits (say, IV bits for stream ciphers). By examining the integral properties of the outputs over some cubes, i.e., over some sets of indices of plaintext variables, one obtains equations in the so-called superpolys over certain key bits of the cipher. After the introduction of the cube attack, several variants were proposed, including cube testers [1], the dynamic cube attack [9], the conditional cube attack [16], the division-property-based cube attack [22] and the correlation cube attack [19]. Among these, the correlation cube attack was proposed at Eurocrypt 2018 by Liu et al. [19]. It exploits correlations between the superpoly \(f_I\) of a cube and its so-called basis \(Q_I\), a set of low-degree Boolean functions over the key bits such that \(f_I\) can be expanded over them as \(f_I=\bigoplus _{h\in Q_I}h\cdot q_h\). The adversary can then use the obtained equations in h to extract information about the encryption key.

Superpoly recovery has always been the most important step in a cube attack. Initially, one could only guess superpolys through experiments such as linearity tests [8] and degree tests [10]. Recovering the exact expressions of superpolys for some cubes only became possible when the division property was introduced to cube attacks.

The division property was introduced by Todo [21] in 2015 and turned out to be a generalization of the integral property. The main idea is that, according to whether the parity of \(\boldsymbol{x}^{\boldsymbol{u}}\) over all \(\boldsymbol{x}\) in a multiset \(\mathbb {X}\) is even or unknown, one can divide the set of \(\boldsymbol{u}\)’s into two parts. By applying the division property, Todo [21] improved the integral distinguishers for several cryptographic primitives, such as Keccak-f [3], Serpent [4] and the Simon family [2]. The bit-based division property, proposed in 2016 [24], targeted cryptographic primitives that perform only bit operations. It was also generalized to the three-subset setting, which describes the parity of \(\boldsymbol{x}^{\boldsymbol{u}}\) over all \(\boldsymbol{x}\) in \(\mathbb {X}\) as not only even or unknown but also odd. Since this is more refined than the conventional division property, integral cryptanalysis of the Simon family of block ciphers was further improved. Afterwards, Xiang et al. [28] first transformed the propagation of the bit-based division property into a mixed integer linear programming (MILP) model; since then, one has been able to search for integral distinguishers using off-the-shelf MILP solvers.

At Crypto 2017, the cube attack based on the division property was proposed by Todo et al. [22]. Using the division property, one can identify key bits that are not involved in the superpoly of a cube. If the superpoly is known to be independent of most key bits, it can be recovered by trying all possible combinations of the remaining key variables that may be involved. At Crypto 2018, Wang et al. [26] improved the division-property-based cube attack in both complexity and accuracy. They reduced the complexity of recovering a superpoly by evaluating an upper bound on its degree. In addition, they improved the precision of the MILP model with the “flag” technique, so that one can obtain a non-zero superpoly. Even with these techniques, however, it remains infeasible to recover superpolys of high degree or superpolys for large-size cubes, as the time complexity grows exponentially in both cases.

Wang et al. [27] transformed the problem of superpoly recovery into evaluating trails of the division property with three subsets, so that superpolys could be recovered practically thanks to a breadth-first search algorithm and a pruning technique. As a result, they practically recovered the superpolys of large-size ISoCs for 839- and 840-round Trivium, but gave only a theoretical attack against 841-round Trivium. In [11], Hao et al. pointed out that the pruning technique is not always efficient. Therefore, instead of a breadth-first search algorithm, they simply utilized an MILP model for the three-subset division property without unknown subset. As a result, they recovered the superpolys of 840-, 841- and 842-round Trivium with the aid of an off-the-shelf MILP solver. At Asiacrypt 2020, Hu et al. [15] introduced the monomial prediction technique to describe the division property and provided deeper insights for understanding it. They also established the equivalence between the three-subset division property without unknown subset and monomial prediction, showing that both are perfectly accurate. However, the complexity of both techniques depends heavily on the efficiency of the MILP solvers: once the number of division trails is very large, it is hard to recover superpolys with these two techniques, since the MILP solver may not find all solutions within an acceptable time. Afterwards, Hu et al. [14] proposed an improved framework called nested monomial prediction to recover massive superpolys. Recently, based on this technique, He et al. [13] proposed a new framework consisting of two main steps: obtaining the so-called valuable terms that contribute to the superpoly in the middle rounds, and computing the coefficients of these valuable terms. To recover the valuable terms, the non-zero bit-based division property (NBDP) and core monomial prediction (CMP) were introduced, which greatly improved the computational complexity of superpoly recovery.

In addition to superpoly recovery, degree evaluation of cryptosystems is also an important issue in cube attacks, since the algebraic degree is usually used to judge whether the superpoly is zero and to search for good ISoCs. In [18], Liu introduced the numeric mapping technique and proposed an algorithm for degree evaluation of cryptosystems based on nonlinear feedback shift registers (NFSRs), which gives upper bounds on the degree. This method has low complexity, but the estimation is generally less accurate; for example, it performs badly for Trivium-like ciphers when an ISoC contains adjacent indices. On the other hand, Hu et al.’s monomial prediction technique [15] yields exact degree evaluation, but its time consumption is too high, which limits its application in large-scale searches. An algorithm striking a trade-off between accuracy and efficiency in degree evaluation has been missing from the literature.

The Trivium cipher [7], a notable member of the eSTREAM portfolio, has consistently been a primary target for cube attacks. Notably, the advances in cube attacks in recent years were significantly propelled by analysis of this cipher [5, 13, 14]. For theoretical attacks on 840 rounds of Trivium and beyond, the key challenge is to identify balanced superpolys. These superpolys often encompass millions to billions of terms, generally involving the majority of key bits. Since solving such high-degree equations is infeasible, researchers have resorted to exhaustively enumerating most potential keys; this simplifies the equations but often recovers only a handful of key bits. As for practical attacks, consider those mentioned in [5]. Here, a thorough search for ISoCs with simple superpolys, such as linear or quadratic polynomials, is necessary. However, as the number of rounds increases, smaller ISoCs increasingly produce complex superpolys, making higher-round attacks infeasible. These complexities in superpolys obstruct effective key recovery, leading us to the question of how to gain more key information from the equation system to enhance the attack. In this work, we propose methods to address this challenge.

Our Contributions. To handle complex superpolys, leveraging the correlation between superpolys and low-degree Boolean functions is a promising approach for key recovery. In this paper, we revisit the correlation cube attack and propose an improvement by utilizing a significant number of so-called “special” ISoCs whose superpolys have low-degree Boolean factors, improving both the quantity and quality of equations obtained in the online phase. However, this approach introduces two challenges: superpoly recovery and the search for good ISoCs.

For superpoly recovery, we propose a novel and effective variable substitution technique. By introducing new variables to replace complex expressions of key bits and eliminating trails in intermediate states, we achieve a more compact representation of the superpoly on these new variables, making it easier to factorize. This technique also improves the computational complexity of superpoly recovery, enabling us to effectively identify special ISoCs.

To search good ISoCs, a common method is to filter ISoCs based on a comparison between the estimated algebraic degree and a fixed threshold. We introduce the concept of vector degree for a Boolean function, which contains more information than the conventional algebraic degree. We further employ a new technique called “vector numeric mapping” to depict the propagation of vector degrees in compositions of Boolean functions. As a result, we can iteratively estimate an upper bound for the vector degree of the entire composite function. Our vector numeric mapping technique outperforms Liu’s numeric mapping in accuracy.

Furthermore, by studying properties of the vector numeric mapping, we introduce a pruning technique to quickly filter out good ISoCs whose superpolys have estimated degrees satisfying a threshold. We also construct an MILP model to describe this process, enabling an efficient automated selection of good ISoCs.

Our techniques are applied to the Trivium stream cipher. First, we apply our algorithms to three ISoCs proposed in [17], which were claimed to yield zero-sum distinguishers for up to 842 rounds. We verify that these three ISoCs do not in fact possess the zero-sum property for certain numbers of rounds. Nevertheless, two of them still exhibit the 841-round zero-sum property, the largest number of rounds discovered so far for Trivium. Leveraging our good ISoC search technique and superpoly recovery with variable substitution, we mount correlation cube attacks against Trivium with 820, 825 and 830 rounds, respectively. As a result, \(\mathbf {2^{80}\times 87.8\%}\), \(\mathbf {2^{80}\times 83\%}\) and \(\mathbf {2^{80}\times 65.7\%}\) keys, respectively, can be practically recovered if we take \(\mathbf {2^{60}}\) as the upper bound on practical computational complexity. Moreover, even with computational power not exceeding \(\mathbf {2^{52}}\), we can still recover \(\mathbf {58\%}\) of the keys in the key space for 820 rounds; with computational power not exceeding \(\mathbf {2^{55}}\), we can recover \(\mathbf {46.6\%}\) of the keys for 830 rounds. Our attacks achieve a significant improvement over the previous best practical attack [5], extending it by up to 10 rounds. Furthermore, for the first time, the complexity of key recovery for 830 rounds drops below \(2^{75}\), and even below the practical threshold of \(2^{60}\). Previous key-recovery attacks against Trivium and our results are compared in Table 1.

Table 1. A summary of key-recovery attacks against Trivium

Organization. The rest of this paper is organized as follows. In Sect. 2, we give some preliminaries including some notations and concepts. In Sect. 3, we review correlation cube attack and propose strategies to improve it. In Sect. 4, we propose the variable substitution technique to improve the superpoly recovery. In Sect. 5, we introduce the definition of vector degree for any Boolean function and present an improved technique for degree evaluation. Then we introduce an ISoC search method. In Sect. 6, we apply our techniques to Trivium. Conclusions are given in Sect. 7.

For the full version of this paper, please refer to [25].

2 Preliminaries

2.1 Notations

Let \(\boldsymbol{v} = (v_0, \cdots , v_{n-1})\) be an n-dimensional vector. For any \(\boldsymbol{v},\boldsymbol{u}\in \mathbb {F}_2^n\), denote \(\prod _{i=0}^{n-1} v_i^{u_i}\) by \(\boldsymbol{v}^{\boldsymbol{u}}\) or \(\pi _{\boldsymbol{u}}(\boldsymbol{v})\), and define an order \(\boldsymbol{v} \preccurlyeq \boldsymbol{u}\) (\(\boldsymbol{v} \succcurlyeq \boldsymbol{u}\), resp.), which means \(v_i \le u_i\) (\(v_i \ge u_i\), resp.) for all \(0\le i\le n-1\). For any \(\boldsymbol{u}_0, \cdots , \boldsymbol{u}_{m-1} \in \mathbb {F}_2^n\), we use \(\boldsymbol{u}=\bigvee _{i=0}^{m-1} \boldsymbol{u}_i \in \mathbb {F}_2^n\) to denote their bitwise logical OR, that is, for \(0\le j\le n-1\), \(u_j = 1\) if and only if there exists an \(\boldsymbol{u}_i\) whose j-th bit equals 1. We use \(\boldsymbol{1}\) and \(\boldsymbol{0}\) to denote the all-one and all-zero vectors, respectively.

For a set I, denote its cardinality by |I|. For \(I \subset [n]= \{0, 1, \cdots , n - 1\}\), let \(I^c\) be its complement. For an n-dimensional vector \(\boldsymbol{x}\), let \(\boldsymbol{x}_I\) represent the |I|-dimensional vector \((x_{i_0}, \cdots , x_{i_{|I|-1}})\) for \(I = \{i_0,\cdots , i_{|I|-1}\}\). Note that we always list the elements of I in an increasing order to eliminate ambiguity.

In this paper, we always identify \(j \in \mathbb {Z}_{2^d}\) with the d-bit vector \(\boldsymbol{u}\) satisfying \(\sum _{k=0}^{d-1}u_k 2^{k} = j\).
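For concreteness, the notation above can be expressed in a few lines of Python (the helper names are ours; this is only an illustration of the conventions, not code from our attacks):

```python
def pi(v, u):
    """Monomial v^u = prod_i v_i^{u_i}: equals 1 iff v_i = 1 wherever u_i = 1."""
    return int(all(vi >= ui for vi, ui in zip(v, u)))

def preceq(v, u):
    """The order v <= u componentwise."""
    return all(vi <= ui for vi, ui in zip(v, u))

def int_to_vec(j, d):
    """Identify j in Z_{2^d} with the d-bit vector u with sum u_k 2^k = j."""
    return tuple((j >> k) & 1 for k in range(d))

def vec_to_int(u):
    """Inverse identification: read u back as an integer."""
    return sum(uk << k for k, uk in enumerate(u))
```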

2.2 Algebraic Normal Form and Algebraic Degree of Boolean Functions

An n-variable Boolean function f can be uniquely written in the form \(f(\boldsymbol{x})=\bigoplus _{\boldsymbol{u} \in \mathbb {F}_2^{n}} a_{\boldsymbol{u}}\boldsymbol{x}^{\boldsymbol{u}},\) which is called the algebraic normal form (ANF) of f. If the term \(\boldsymbol{x}^{\boldsymbol{u}}\) appears in f, i.e., \(a_{\boldsymbol{u}} = 1\), we denote \(\boldsymbol{x}^{\boldsymbol{u}} \rightarrow f\). Otherwise, denote \(\boldsymbol{x}^{\boldsymbol{u}} \nrightarrow f\).

For an index set \(I\subset [n]\) with size d, if \(\boldsymbol{x}_I\) are considered as variables and \(\boldsymbol{x}_{I^c}\) are considered as parameters in f, we can write the ANF of f w.r.t. \(\boldsymbol{x}_I\) as

$$\begin{aligned} f(\boldsymbol{x})= \bigoplus _{\boldsymbol{v}\in \mathbb {F}_2^{d}}g_{\boldsymbol{v}}(\boldsymbol{x}_{I^c})\boldsymbol{x}_I^{\boldsymbol{v}}, \end{aligned}$$

where \(g_{\boldsymbol{v}}(\boldsymbol{x}_{I^c})=\bigoplus _{\{ \boldsymbol{u}\in \mathbb {F}_2^{n}\mid \boldsymbol{u}_I = \boldsymbol{v}\}} a_{\boldsymbol{u}}\boldsymbol{x}_{I^c}^{\boldsymbol{u}_{I^c}}\).

The algebraic degree of f w.r.t. \(\boldsymbol{x}_I\) is defined as

$$\deg (f)_{\boldsymbol{x}_I} = \max _{\boldsymbol{v} \in \mathbb {F}_2^d} \{\textrm{wt}(\boldsymbol{v}) \mid g_{\boldsymbol{v}}(\boldsymbol{x}_{I^c}) \ne 0\},$$

where \(\textrm{wt}(\boldsymbol{v})\) is the Hamming weight of \(\boldsymbol{v}\).
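The ANF coefficients \(a_{\boldsymbol{u}}\) and the degree w.r.t. \(\boldsymbol{x}_I\) can be computed for small n by the Möbius transform over \(\mathbb {F}_2\). The following Python sketch (our own illustration; indices encode \(\boldsymbol{u}\) as integers with bit k corresponding to \(x_k\)) makes both definitions concrete:

```python
def anf_coeffs(f, n):
    """Moebius transform: truth table of f -> ANF coefficients a_u.
    Index i encodes x via bit k of i = x_k; likewise u is read as an n-bit int."""
    a = [f(tuple((i >> k) & 1 for k in range(n))) for i in range(2 ** n)]
    for k in range(n):
        for i in range(2 ** n):
            if (i >> k) & 1:
                a[i] ^= a[i ^ (1 << k)]
    return a

def deg_wrt(f, n, I):
    """deg(f)_{x_I}: max wt(u_I) over monomials x^u appearing in f
    (equivalently, max wt(v) with g_v != 0); returns -1 for f = 0."""
    a = anf_coeffs(f, n)
    return max((sum((u >> i) & 1 for i in I) for u in range(2 ** n) if a[u]),
               default=-1)

# example: f = x0 x1 ^ x2 has nonzero coefficients at u = 011 and u = 100
f = lambda x: (x[0] & x[1]) ^ x[2]
```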

2.3 Cube Attack

The cube attack was proposed by Dinur and Shamir in [8], which is essentially an extension of the higher-order differential attack. Given a Boolean function f whose inputs are \(\boldsymbol{x}\in \mathbb {F}_2^n\) and \(\boldsymbol{k}\in \mathbb {F}_2^m\), and given a subset \(I=\{i_0,\cdots ,i_{d-1}\}\subset [n]\), we can write f as

$$f(\boldsymbol{x}, \boldsymbol{k})=f_I(\boldsymbol{x}_{I^c}, \boldsymbol{k})\cdot \boldsymbol{x}_I^{\boldsymbol{1}}+q_I(\boldsymbol{x}_{I^c}, \boldsymbol{k}),$$

where no term of \(q_I\) is divisible by \(\boldsymbol{x}_I^{\boldsymbol{1}}\). Let \(C_I\), called a cube (defined by I), be the set of vectors \(\boldsymbol{x}\) whose components indexed by I take all \(2^d\) possible values while the other components remain undetermined. I is called the index set of the cube (ISoC). For each \(\boldsymbol{y} \in C_I\), a Boolean function in \(n-d\) variables is derived from f. Summing all these \(2^d\) derived functions, we have

$$\bigoplus _{C_{I}}f(\boldsymbol{x}, \boldsymbol{k})=f_I(\boldsymbol{x}_{I^c}, \boldsymbol{k}).$$

The polynomial \(f_I\) is called the superpoly of the cube \(C_{I}\) or of the ISoC I. Actually, \(f_I\) is the coefficient of \(\boldsymbol{x}_I^{\boldsymbol{1}}\) in the ANF of f w.r.t. \(\boldsymbol{x}_I\). If we set \(\boldsymbol{x}_{I^c}=\boldsymbol{0}\), then \(f_I\) becomes the coefficient of \(\boldsymbol{x}^{\boldsymbol{u}}\) in f, a Boolean function in \(\boldsymbol{k}\), where \(u_i=1\) if and only if \(i \in I\). We denote it by \(\texttt{Coe}(f, \boldsymbol{x}^{\boldsymbol{u}})\).
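The cube-sum identity above can be checked directly on a toy output function. The Python sketch below (our own toy f, not a real cipher) sums f over a cube and recovers the superpoly:

```python
from itertools import product

def cube_sum(f, n, I, x_rest, k):
    """Sum f over the cube C_I: the bits indexed by I take all 2^|I| values,
    the remaining public bits are fixed by x_rest (a dict index -> bit)."""
    acc = 0
    for bits in product((0, 1), repeat=len(I)):
        x = [0] * n
        for idx, b in zip(I, bits):
            x[idx] = b
        for idx, b in x_rest.items():
            x[idx] = b
        acc ^= f(tuple(x), k)
    return acc

# toy output: f = x0 x1 (k0 k1 ^ k2) ^ x0 k0 ^ x2 ^ k1;
# with I = {0, 1} the cube sum equals the superpoly k0 k1 ^ k2,
# since every term not divisible by x0 x1 cancels over the cube
f = lambda x, k: (x[0] & x[1] & ((k[0] & k[1]) ^ k[2])) ^ (x[0] & k[0]) ^ x[2] ^ k[1]
```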

2.4 Correlation Cube Attack

The correlation cube attack was proposed at Eurocrypt 2018 by Liu et al. [19]. The high-level idea is to obtain key information by exploiting the correlations between superpolys and their low-degree bases, thereby deriving equations for the basis functions rather than for the superpolys themselves.

In mathematical terms, for an ISoC I, denote the basis of a superpoly \(f_I\) as \(Q_I=\{h_1, \cdots , h_r\}\), such that \(h_i\) has low degree w.r.t. \(\boldsymbol{k}\) and

$$f_I(\boldsymbol{x}_{J},\boldsymbol{k})=\bigoplus _{i=1}^rh_iq_i,$$

where \(J\subset I^c\). This attack primarily works in two phases:

  1. Preprocessing phase (see [25, Algorithm 4]): The adversary tries to obtain a basis \(Q_I\) of the superpoly \(f_I\) and adds to \(\varOmega \) the tuples \((I,h_i,b)\) for which \(\Pr (h_i = b \mid f_I)\) exceeds a threshold p, where \(\Pr (h_i = b \mid f_I)\) denotes the probability that \(h_i = 0\) (resp. \(h_i = 1\)) given that \(f_I\) is the zero constant (resp. not the zero constant) on \(\boldsymbol{x}_J\) for a random fixed key.

  2. Online phase (see [25, Algorithm 5]): The adversary randomly chooses \(\alpha \) values for the non-cube public bits and computes the corresponding values of the superpoly \(f_I\) to check whether it is the zero constant. If all the values of \(f_I\) are zero, then for each \((I,h_i,0)\) in \(\varOmega \) the equation \(h_i=0\) holds with probability greater than p. Otherwise, for each \((I,h_i,1)\) in \(\varOmega \) the equation \(h_i=1\) holds with probability greater than p. If all the \(h_i\)’s are balanced and independent of each other, the adversary recovers r bits of key information with probability greater than \(p^r\) by solving these r equations.

This method, though intricate, provides a solution for dealing with high-degree superpolys, and has demonstrated effectiveness in extending theoretical attacks on Trivium to more rounds.
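The conditional probabilities \(\Pr (h_i = b \mid f_I)\) that drive the preprocessing phase can be estimated empirically. The following Python sketch does this on a toy basis of our own construction (where \(f_I = h_1q_1 \oplus h_2q_2\) vanishes on \(\boldsymbol{x}_J\) exactly when \(h_1 = h_2\), so the exact conditional probability is 3/4):

```python
import random
from itertools import product

random.seed(1)

# toy basis Q_I = {h1, h2} over key bits k0, k1, k2, with f_I = h1*q1 ^ h2*q2
h1 = lambda k: k[0] & k[1]
h2 = lambda k: k[2]
q1 = lambda x: x[0]
q2 = lambda x: x[0]
f_I = lambda x, k: (h1(k) & q1(x)) ^ (h2(k) & q2(x))

# estimate Pr(h1 = 0 | f_I == 0 on x_J) over random keys; since
# f_I = (h1 ^ h2) x0, conditioning on f_I == 0 means h1 = h2
hits = trials = 0
for _ in range(4000):
    k = tuple(random.randint(0, 1) for _ in range(3))
    if all(f_I(x, k) == 0 for x in product((0, 1), repeat=2)):
        trials += 1
        hits += (h1(k) == 0)
p_est = hits / trials  # should be close to 3/4
```

A tuple \((I, h_1, 0)\) would be added to \(\varOmega \) here whenever the chosen threshold p is below this estimate.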

2.5 Superpoly Recovery with Monomial Prediction/Three-Subset Division Property Without Unknown Subset

In [15], Hu et al. established the equivalence between monomial prediction and the three-subset division property without unknown subset [11], showing that both techniques give an accurate criterion for the existence of a monomial in f. Here we take the monomial prediction technique as an example to explain how to recover a superpoly.

For a vectorial Boolean function \(\boldsymbol{f} = \boldsymbol{f}_{r-1}\circ \cdots \circ \boldsymbol{f}_0\), denote the input and output of \(\boldsymbol{f}_i\) by \(\boldsymbol{x}_i\) and \(\boldsymbol{x}_{i+1}\), respectively. If \(\pi _{\boldsymbol{u}_i}(\boldsymbol{x}_i) \rightarrow \pi _{\boldsymbol{u}_{i+1}}(\boldsymbol{x}_{i+1})\) for every i, i.e., the coefficient of \(\pi _{\boldsymbol{u}_i}(\boldsymbol{x}_i)\) in \(\pi _{\boldsymbol{u}_{i+1}}(\boldsymbol{x}_{i+1})\) is nonzero, then we call

$$\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0) \rightarrow \pi _{\boldsymbol{u}_1}(\boldsymbol{x}_1) \rightarrow \cdots \rightarrow \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})$$

a monomial trail from \(\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\) to \(\pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})\), denoted by \(\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\rightsquigarrow \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})\). If there is no trail from \(\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\) to \(\pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})\), we write \(\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\not \rightsquigarrow \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})\). The set of all trails from \(\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\) to \(\pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})\) is denoted by \(\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\bowtie \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})\). Obviously, for any \(0< i < r-1\), it holds that

$$|\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\bowtie \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})|= \sum _{\boldsymbol{u}_i}|\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\bowtie \pi _{\boldsymbol{u}_{i}}(\boldsymbol{x}_{i})|\cdot |\pi _{\boldsymbol{u}_i}(\boldsymbol{x}_i)\bowtie \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})|.$$

Theorem 1

(Monomial prediction[11, 15]). We have \(\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\rightarrow \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})\) if and only if

$$|\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\bowtie \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})|\equiv 1 \pmod 2.$$

That is, if and only if, for any \(0<i<r-1\),

$$|\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\bowtie \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})|\equiv \sum _{\pi _{\boldsymbol{u}_{i}}(\boldsymbol{x}_{i})\rightarrow \pi _{\boldsymbol{u}_{r-1}}(\boldsymbol{x}_{r-1})}|\pi _{\boldsymbol{u}_0}(\boldsymbol{x}_0)\bowtie \pi _{\boldsymbol{u}_{i}}(\boldsymbol{x}_{i})| \pmod 2.$$
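Theorem 1 can be checked by brute force on a tiny example. In the Python sketch below (our own toy two-round function; monomials are encoded as bitmask integers), we enumerate all trails through the middle layer and compare the trail-count parity with the ANF of the composed function:

```python
def expand(u, anfs):
    """Monomials (bitmask ints) of prod_{i : bit i of u set} anfs[i] over F2,
    where anfs[i] is the set of input monomials of the i-th coordinate."""
    cur, i = {0}, 0
    while u:
        if u & 1:
            nxt = set()
            for m in cur:
                for t in anfs[i]:
                    nxt ^= {m | t}   # x_j^2 = x_j; ^= keeps coefficients mod 2
            cur = nxt
        u >>= 1
        i += 1
    return cur

def count_trails(m0, anfs0, anfs1, n_mid):
    """Number of monomial trails x^{m0} -> pi_{u1}(y) -> f for the two-round
    composition f = f1 o f0 (f1 has a single output coordinate)."""
    return sum(1 for u1 in range(2 ** n_mid)
               if u1 in expand(1, anfs1)       # pi_{u1}(y) -> f
               and m0 in expand(u1, anfs0))    # x^{m0} -> pi_{u1}(y)

# toy rounds: y0 = x0 x1 ^ x2, y1 = x1, y2 = x0 ^ x2, and f = y0 y1 ^ y2,
# which composes to f = x0 x1 ^ x1 x2 ^ x0 ^ x2
anfs0 = [{0b011, 0b100}, {0b010}, {0b001, 0b100}]
anfs1 = [{0b011, 0b100}]
```

By Theorem 1, a monomial \(\boldsymbol{x}^{\boldsymbol{u}_0}\) appears in f exactly when its trail count is odd, which the test below confirms for all eight candidate monomials.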

Theorem 2

(Superpoly recovery [11, 15]). Let f be a Boolean function with input \(\boldsymbol{x}\) and \(\boldsymbol{k}\), and \(f= f_{r-1}\circ \boldsymbol{f}_{r-2}\circ \cdots \circ \boldsymbol{f}_0(\boldsymbol{x}, \boldsymbol{k})\). When setting \(\boldsymbol{x}_{I^c}=\boldsymbol{0}\), the superpoly of an ISoC I is

$$\texttt{Coe}(f, \boldsymbol{x}^{\boldsymbol{u}}) = \bigoplus _{|\boldsymbol{k}^{\boldsymbol{w}}\boldsymbol{x}^{\boldsymbol{u}}\bowtie f|\equiv 1 \pmod 2}\boldsymbol{k}^{\boldsymbol{w}},$$

where \(\boldsymbol{u}_I=\boldsymbol{1}\) and \(\boldsymbol{u}_{I^c}=\boldsymbol{0}\).

MILP Model for Monomial Trails. It is a difficult task to search for all the monomial trails manually. Since Xiang et al. [28] first transformed the propagation of the bit-based division property into an MILP model, it has become possible to solve such search problems using off-the-shelf MILP solvers. To construct an MILP model for the monomial trails of a Boolean function, one needs only to model three basic operations, i.e., COPY, AND and XOR. Please refer to Appendix A in [25] for details.
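The standard linearized constraints for the three basic operations can be written down and sanity-checked by brute force instead of invoking an actual MILP solver. The Python sketch below (our own illustration; a and b denote the 0/1 exponent variables on the input and output side of a trail) encodes the usual COPY/AND/XOR rules:

```python
from itertools import product

def copy_ok(a, b1, b2):
    """COPY x -> (y1, y2): x^a appears in y1^b1 * y2^b2 = x^(b1 OR b2),
    linearized as b1 + b2 >= a, b1 <= a, b2 <= a."""
    return b1 + b2 >= a and b1 <= a and b2 <= a

def and_ok(a1, a2, b):
    """AND y = x1 x2: x1^a1 x2^a2 appears in y^b iff a1 = a2 = b."""
    return a1 == b and a2 == b

def xor_ok(a1, a2, b):
    """XOR y = x1 ^ x2: exactly one input exponent is 1 when b = 1,
    linearized as a1 + a2 = b (which also forbids a1 = a2 = 1)."""
    return a1 + a2 == b
```

In an actual model these predicates become linear constraints over binary MILP variables, one instance per wire of the round function.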

2.6 Nested Monomial Prediction with NBDP and CMP Techniques

At Asiacrypt 2021, Hu et al. [14] proposed a framework, called nested monomial prediction, to exactly recover superpolys. For a Boolean function \(f(\boldsymbol{x}, \boldsymbol{k}) = f_{r-1}\circ \boldsymbol{f}_{r-2}\circ \cdots \circ \boldsymbol{f}_0(\boldsymbol{x},\boldsymbol{k})\), denote the input and output of \(\boldsymbol{f}_i\) by \(\boldsymbol{y}_i\) and \(\boldsymbol{y}_{i+1}\) respectively. To compute \(\texttt{Coe}(f, \boldsymbol{x}^{\boldsymbol{u}})\), the process is as follows:

  1. Set \(n = r - 1\), \(Y_{n}=\{f\}\) and set a polynomial \(p=0\).

  2. Choose l with \(0 < l < n\) according to certain criteria, and set \(Y_l=\emptyset \) and \(T_l=\emptyset \).

  3. Express each term in \(Y_n\) in terms of \(\boldsymbol{y}_{l}\) by constructing and solving an MILP model of monomial prediction, and save into \(T_{l}\) those terms \(\pi _{\boldsymbol{u}_{l}}(\boldsymbol{y}_{l})\) for which the size of \(\{\pi _{\boldsymbol{u}_{n}}(\boldsymbol{y}_{n})\in Y_n\mid \pi _{\boldsymbol{u}_{l}}(\boldsymbol{y}_{l})\rightarrow \pi _{\boldsymbol{u}_{n}}(\boldsymbol{y}_{n})\}\) is odd.

  4. For each \(\pi _{\boldsymbol{u}_{l}}(\boldsymbol{y}_{l})\in T_{l}\), compute \(\texttt{Coe}(\pi _{\boldsymbol{u}_{l}}(\boldsymbol{y}_{l}), \boldsymbol{x}^{\boldsymbol{u}})\) by constructing and solving an MILP model of monomial prediction. If the model for \(\pi _{\boldsymbol{u}_{l}}(\boldsymbol{y}_{l})\) is solved within an acceptable time, update p to \(p\oplus \texttt{Coe}(\pi _{\boldsymbol{u}_{l}}(\boldsymbol{y}_{l}), \boldsymbol{x}^{\boldsymbol{u}})\); otherwise, save the unsolved \(\pi _{\boldsymbol{u}_{l}}(\boldsymbol{y}_{l})\) into \(Y_{l}\).

  5. If \(Y_{l}\ne \emptyset \), set \(n=l\) and go to Step 2. Otherwise, return the polynomial p.

The idea of Step 3 and Step 4 comes from Theorem 1 and Theorem 2, i.e.,

$$\begin{aligned} \texttt{Coe}(f, \boldsymbol{x}^{\boldsymbol{u}})&=\bigoplus _{\pi _{\boldsymbol{u}_n}(\boldsymbol{y}_n)\rightarrow f}\texttt{Coe}(\pi _{\boldsymbol{u}_n}(\boldsymbol{y}_n), \boldsymbol{x}^{\boldsymbol{u}})\end{aligned}$$
(1)
$$\begin{aligned} &=\bigoplus _{\pi _{\boldsymbol{u}_n}(\boldsymbol{y}_n)\rightarrow f}\bigoplus _{\pi _{\boldsymbol{u}_l}(\boldsymbol{y}_l)\rightarrow \boldsymbol{y}_n}\texttt{Coe}(\pi _{\boldsymbol{u}_l}(\boldsymbol{y}_l), \boldsymbol{x}^{\boldsymbol{u}})\end{aligned}$$
(2)
$$\begin{aligned} &=\bigoplus _{\pi _{\boldsymbol{u}_l}(\boldsymbol{y}_l)\in T_l}\texttt{Coe}(\pi _{\boldsymbol{u}_l}(\boldsymbol{y}_l), \boldsymbol{x}^{\boldsymbol{u}})\end{aligned}$$
(3)
$$\begin{aligned} &= p \oplus \left( \bigoplus _{\pi _{\boldsymbol{u}_l}(\boldsymbol{y}_l)\in Y_l}\texttt{Coe}(\pi _{\boldsymbol{u}_l}(\boldsymbol{y}_l), \boldsymbol{x}^{\boldsymbol{u}})\right) . \end{aligned}$$
(4)
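Equations (1)–(3) can be made concrete on a toy two-round function. In the Python sketch below (encodings and names are ours; each \(\pi _{\boldsymbol{u}_l}(\boldsymbol{y}_l)\rightarrow f\) happens to occur exactly once, so \(T_l\) is simply the set of y-monomials of f), the superpoly is assembled by splitting at the middle layer and lifting each middle-layer term down to \((\boldsymbol{x},\boldsymbol{k})\):

```python
def expand(u, anfs):
    """Monomials (bitmask ints) of prod_{i : bit i of u set} anfs[i] over F2."""
    cur, i = {0}, 0
    while u:
        if u & 1:
            nxt = set()
            for m in cur:
                for t in anfs[i]:
                    nxt ^= {m | t}   # coefficients kept mod 2
            cur = nxt
        u >>= 1
        i += 1
    return cur

# variables: x0, x1 (bits 0-1), k0, k1 (bits 2-3)
# round 0: y0 = x0 k0 ^ x1,  y1 = x1 k1 ^ k0,  y2 = x0 x1
anfs0 = [{0b0101, 0b0010}, {0b1010, 0b0100}, {0b0011}]
# round 1: f = y0 y1 ^ y2, which composes to
# f = x0 x1 k0 k1 ^ x0 k0 ^ x1 k1 ^ x1 k0 ^ x0 x1
anfs1 = [{0b011, 0b100}]
XU = 0b0011  # the cube monomial x0 x1

def coe(monos, xu):
    """Coe(., x^u): key parts of the monomials whose x-part equals xu."""
    return {m >> 2 for m in monos if m & 0b0011 == xu}

# Eq. (3): sum Coe(pi_{u1}(y), x^u) over the middle-layer terms in T_l
superpoly = set()
for u1 in expand(1, anfs1):
    superpoly ^= coe(expand(u1, anfs0), XU)
# by hand, Coe(f, x0 x1) = k0 k1 ^ 1, i.e. the key monomials {0b11, 0b00}
```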

Since the number of monomial trails grows sharply as the number of rounds of a cipher increases, it becomes infeasible to compute a superpoly for a high number of rounds with nested monomial prediction. At Asiacrypt 2022, He et al. [13] proposed new techniques to improve nested monomial prediction. Instead of trying to solve for the coefficient of \(\boldsymbol{x}^{\boldsymbol{u}}\) in \(\pi _{\boldsymbol{u}_{l}}(\boldsymbol{y}_{l})\) at multiple middle rounds, they fixed a single middle round \(r_m\) and focused on recovering a set of valuable terms (see Definition 1), denoted by \(\texttt{VT}_{r_m}\), and then computing the coefficient of \(\boldsymbol{x}^{\boldsymbol{u}}\) in every valuable term. They discarded the terms \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\) for which there exists no \(\boldsymbol{k}^{\boldsymbol{w}}\) such that \(\boldsymbol{k}^{\boldsymbol{w}}\boldsymbol{x}^{\boldsymbol{u}}\rightsquigarrow \pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\), i.e., for which \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})=0\) in Eq. (1) with \(n=r_m\). The framework of this technique is as follows:

  1. Try to recover \(\texttt{VT}_{r_m}\). If the model is solved within an acceptable time, go to Step 2.

  2. For each term \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\) in \(\texttt{VT}_{r_m}\), compute \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\), and then sum all of them.

To recover \(\texttt{VT}_{r_m}\), He et al. proposed two techniques: the non-zero bit-based division property (NBDP) and core monomial prediction (CMP), which greatly improved the complexity of recovering the valuable terms compared to nested monomial prediction. For details, please refer to [13].

Definition 1

(Valuable terms [13]). For a Boolean function \(f(\boldsymbol{x}, \boldsymbol{k}) = f_{r-1}\circ \boldsymbol{f}_{r-2}\circ \cdots \circ \boldsymbol{f}_0(\boldsymbol{x},\boldsymbol{k})\), denote the input and output of \(\boldsymbol{f}_i\) by \(\boldsymbol{y}_i\) and \(\boldsymbol{y}_{i+1}\), respectively. Given \(0 \le r_m < r\), if a term \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\) satisfies (1) \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}) \rightarrow f\) and (2) \(\exists \boldsymbol{k}^{\boldsymbol{w}}\) such that \(\boldsymbol{k}^{\boldsymbol{w}}\boldsymbol{x}^{\boldsymbol{u}}\rightsquigarrow \pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\), then it is called a valuable term of \(\texttt{Coe}(f, \boldsymbol{x}^{\boldsymbol{u}})\) at round \(r_m\).

3 Improvements to Correlation Cube Attack

As the number of rounds of a cipher increases, it becomes infeasible to find small-size ISoCs with low-degree superpolys. The correlation cube attack [19] provides a viable way to recover keys using the correlation property between keys and superpolys, allowing the use of high-degree superpolys. However, the correlation cube attack has seen no significant improvements or practical applications since its introduction. We first revisit this attack and then propose strategies to improve it.

For convenience, we will continue to use the notations from Sect. 2.4, where

$$f_I(\boldsymbol{x}_{J},\boldsymbol{k})=\bigoplus _{i=1}^rh_iq_i.$$

In the online phase of a correlation cube attack, the adversary computes the values of \(f_I\) for all possible values of \(\boldsymbol{x}_{J}\). Using these values, the adversary can make guesses about the values of the \(h_i\) in \(Q_I\). The guessing strategy is as follows: for a tuple \((I,h_i,1)\) satisfying \(\Pr (h_i = 1 \mid f_I) > p\), if some value of \(f_I\) equals 1, guess \(h_i = 1\); for a tuple \((I,h_i,0)\) satisfying \(\Pr (h_i = 0 \mid f_I) > p\), if \(f_I \equiv 0\), guess \(h_i = 0\). In this way, the adversary obtains some low-degree equations over \(\boldsymbol{k}\).

Now we examine the probability that one such equation is correct. For a given i, in the first case the success probability is \(\Pr (h_i=1\mid f_I\not \equiv 0)\). If \(r>1\), and \(f_I=1\), \(q_i = 1\) and \(\bigoplus _{j\ne i}h_j q_j = 1\) for some value of \(\boldsymbol{x}_{I^c}\), then \(h_{i}=0\), i.e., the guess about \(h_i\) is incorrect. In the second case, the success probability is \(\Pr (h_i=0\mid f_I\equiv 0)\). If \(r>1\) and \(f_I\equiv 0\), it is still possible that \(h_{i}=1\) and \(q_i \equiv \bigoplus _{j\ne i}h_j q_j\), leading to an incorrect guess of \(h_i\).

Therefore, since in the case \(r>1\) only probabilistic equations can be obtained, we first improve the strategy by constraining \(r=1\). That is, we consider the case

$$f_I = hq,$$

and call an ISoC I satisfying this condition a “special” ISoC. Note that the success probability now becomes 1 in the first case, and the failure probability in the second case equals \(\Pr (h = 1 , q \equiv 0)\). Given a set of special ISoCs \(\{I_1, \cdots , I_m\}\) such that \(f_{I_i} = hq_i\), we can modify the strategy as follows: if there exists i such that \(f_{I_i}\not \equiv 0\), guess \(h=1\); otherwise, guess \(h = 0\). The success probability is still 1 in the first case, while the failure probability in the second case is reduced to \(\Pr (h = 1 , q_1\equiv 0,\ldots , q_m\equiv 0)\). In summary, we can improve the success probability of the guessing by searching for a large number of special ISoCs.
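The claim that a guess of \(h=1\) is always correct for special ISoCs can be checked exhaustively on a toy instance. In the Python sketch below (our own toy h and \(q_i\); in a real attack the adversary only sees the cube-sum values \(f_{I_i}\), which we simulate as \(h\cdot q_i\)):

```python
from itertools import product

# toy "special" ISoCs I_1, I_2, I_3 with f_{I_i} = h * q_i sharing the factor h
h  = lambda k: k[0] & k[1]
qs = [lambda x, k: x[0] ^ k[2],
      lambda x, k: (x[0] & x[1]) ^ k[3],
      lambda x, k: x[1] ^ (k[2] & k[3])]

def guess_h(k):
    """Improved rule: guess h = 1 iff some f_{I_i} is not identically zero
    over the non-cube bits x_J."""
    for q in qs:
        if any(h(k) & q(x, k) for x in product((0, 1), repeat=2)):
            return 1
    return 0

# whenever the rule outputs 1 the guess is certainly correct,
# since f_{I_i} = h * q_i != 0 forces h = 1
ones_correct = [h(k) == 1 for k in product((0, 1), repeat=4) if guess_h(k) == 1]
```

Only guesses of \(h=0\) can fail, and only when \(h=1\) while every \(q_i\) vanishes identically, which is exactly the probability \(\Pr (h = 1 , q_1\equiv 0,\ldots , q_m\equiv 0)\) discussed above.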

Based on the above observations, we propose the improved correlation cube attack in Algorithm 1 and Algorithm 2. The attack is executed in two phases:

1. Preprocessing phase:

   a. Identify special ISoCs.

   b. For each h, gather all the special ISoCs I for which h is a factor of \(f_I\) into a set \(T_h\).

   c. To reduce the number of equations derived from wrong guesses of h, those h whose success probability in the second case is at or below a threshold p are guessed exclusively in the first case; their associated \(T_h\) are added to a set \(\mathcal {T}_1\).

   d. The remaining h are guessed in both cases, with their associated \(T_h\) forming a set \(\mathcal {T}\).

2. Online phase:

   a. Compute the value of \(f_I\) for each ISoC I.

   b. For every \(T_h\) in \(\mathcal {T}\), guess the value of h based on the values of \(f_I\) for all I in \(T_h\).

   c. If for some \(T_h\) in \(\mathcal {T}_1\) the values of \(f_I\) for all I in \(T_h\) satisfy the condition of the first case, then \(h = 1\); otherwise, no guess is made concerning h.

   d. Store the equations \(h = 1\) into a set \(G_1\), and the other equations into a set \(G_0\). Note that only equations in \(G_0\) may be incorrect.

   e. Using the derived equations along with partial key guesses, we can try to obtain a candidate key. If no partial key guess yields a valid key, some equations must be incorrect; in that case, modify some equations from \(G_0\) and solve again. Repeat until the correct key is found.
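The bookkeeping of the online phase (steps a–d) can be sketched in Python. The data layout (ISoCs as frozensets, factors identified by name) and the function `online_guess` are illustrative assumptions, not the paper's implementation.

```python
# Sketch of the online phase (steps a-d), assuming the preprocessing phase
# produced the families T (factors guessed in both cases) and T1 (factors
# guessed only when some cube sum is nonzero).

def online_guess(f_values, T, T1):
    """f_values maps each ISoC (a frozenset) to the observed value of f_I.
    T and T1 map each factor name h to its set of special ISoCs T_h.
    Returns (G1, G0): the equations h = 1 and h = 0, respectively."""
    G1, G0 = [], []
    for h, Th in T.items():
        if any(f_values[I] for I in Th):
            G1.append((h, 1))
        else:
            G0.append((h, 0))       # only these equations may be incorrect
    for h, Th in T1.items():
        if any(f_values[I] for I in Th):
            G1.append((h, 1))       # otherwise no guess is made for h
    return G1, G0

# Toy run: h0 is caught by a nonzero cube sum, h2 is guessed to be 0,
# and h1 (in T1) yields no equation because all its cube sums vanish.
I1, I2 = frozenset({1, 2}), frozenset({3, 4})
G1, G0 = online_guess({I1: 1, I2: 0}, {'h0': [I1], 'h2': [I2]}, {'h1': [I2]})
assert G1 == [('h0', 1)] and G0 == [('h2', 0)]
```

Step e then amounts to solving the system \(G_1 \cup G_0\) together with partial key guesses, flipping equations in \(G_0\) on failure.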

A crucial factor for the success of this attack is to acquire a significant number of special ISoCs. To achieve this goal, the first step is to search for a large number of good ISoCs and recover their corresponding superpolys. Then, low-degree factors of these superpolys need to be computed.

Using degree estimation techniques is one of the common methods for searching cubes. In Sect. 5, we will first introduce a vector numeric mapping technique to improve the accuracy of degree estimation. Combining this technique with an MILP model, we will propose an algorithm for the fast search of good ISoCs on a large scale.

To our knowledge, it is difficult to decompose a complicated Boolean polynomial. To address this problem, we propose a novel and effective technique for recovering superpolys in Sect. 4. This technique not only reduces the computational complexity of recovering superpolys, making it feasible to recover a large number of them, but also yields compact superpolys that are easy to decompose.

(Algorithms 1 and 2 appear here as figures.)

4 Recover Superpolys from a Novel Perspective

4.1 Motivation

As discussed in Sect. 3, we need lots of special ISoCs to improve the correlation cube attack. On the one hand, to the best of our knowledge, it is still difficult to efficiently compute the factors of a complicated polynomial with current techniques. On the other hand, the efficiency of recovering superpolys needs to be improved in order to recover a large number of superpolys within an acceptable time. Therefore, we propose new techniques to address these issues. Let \(f(\boldsymbol{x}, \boldsymbol{k}) = f_{r-1}\circ \boldsymbol{f}_{r-2}\circ \cdots \circ \boldsymbol{f}_0(\boldsymbol{x},\boldsymbol{k})\) and denote the input and output of \(\boldsymbol{f}_i\) by \(\boldsymbol{y}_i\) and \(\boldsymbol{y}_{i+1}\), respectively. Here we adopt the notations used in the monomial prediction technique (see Sect. 2.5). We have

$$\begin{aligned} \texttt{Coe}(f, \boldsymbol{x}^{\boldsymbol{u}})&= \bigoplus _{\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})}\texttt{Coe}(f, \pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}))\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\\ &=\bigoplus _{\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\rightarrow f}\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\\ &=\bigoplus _{\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\rightarrow f \text { and }\exists \boldsymbol{w}\ \text {s.t.} \ \boldsymbol{k}^{\boldsymbol{w}} \boldsymbol{x}^{\boldsymbol{u}}\rightsquigarrow \pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}) }\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}}). \end{aligned}$$

By Definition 1, the superpoly is equal to

$$\texttt{Coe}(f, \boldsymbol{x}^{\boldsymbol{u}})=\bigoplus _{\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\in \texttt{VT}_{r_m}}\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}}).$$

Therefore, recovering a superpoly requires two steps: obtaining the valuable terms \(\texttt{VT}_{r_m}\) and recovering the coefficients \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\). The specific steps are as follows:

  1. Try to obtain \(\texttt{VT}_{r_m}\). If the model is solved within an acceptable time, go to Step 2.

  2. For each term \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\) in \(\texttt{VT}_{r_m}\), compute \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\) with our new techniques and sum them.

We will provide a detailed explanation of the procedures for each step.

4.2 Obtain Valuable Terms

One important point to note about the widely used MILP solver, the Gurobi optimizer, is that model modifications are done in a lazy fashion: the effects of modifying a model are not seen immediately. We can set up an MILP model with a callback function indicating whether the optimizer has found a new solution. The following shows how to obtain the \(r_m\)-round Valuable Terms; see [25, Algorithm 6].

  1. Establish a model \(\mathcal {M}\) to search for all trails \(\boldsymbol{k}^{\boldsymbol{w}}\boldsymbol{x}^{\boldsymbol{u}} \rightsquigarrow \pi _{\boldsymbol{u}_{r_1}}(\boldsymbol{y}_{r_1})\rightsquigarrow \cdots \rightsquigarrow f\).

  2. Solve the model \(\mathcal {M}\). Once a trail is found, go to Step 3. If there is no solution, go to Step 4.

  3. (VTCallbackFun) Determine whether \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\rightarrow f\) by the parity of the number of trails \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\rightsquigarrow f\). If \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\rightarrow f\), add \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\) to the set \(\texttt{VT}_{r_m}\). Remove all trails from \(\mathcal {M}\) that satisfy \(\boldsymbol{k}^{\boldsymbol{w}}\boldsymbol{x}^{\boldsymbol{u}}\rightsquigarrow \pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}) \rightsquigarrow f\). Go to Step 2.

  4. Return the Valuable Terms \(\texttt{VT}_{r_m}\).

Note that for each \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\) satisfying \(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\rightsquigarrow f\), the parity of the number of trails is calculated only once due to the removal of all trails satisfying \(\boldsymbol{k}^{\boldsymbol{w}}\boldsymbol{x}^{\boldsymbol{u}}\rightsquigarrow \pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}) \rightsquigarrow f\).
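The parity argument can be illustrated with a small sketch. Here the trails are precomputed in a list; in the actual algorithm they are enumerated lazily by the Gurobi callback and removed from the model once counted.

```python
# Sketch of the valuable-terms criterion: each monomial trail from the
# input to f passes through one middle-round term pi_{u_{r_m}}(y_{r_m}),
# and a middle term is "valuable" iff the number of trails from it to f
# is odd (even counts cancel over GF(2)).

from collections import Counter

def valuable_terms(trails):
    """trails: one middle-round term per full trail
    k^w x^u ~> pi(y_{r_m}) ~> f.  Returns the set VT_{r_m}."""
    counts = Counter(trails)
    return {term for term, c in counts.items() if c % 2 == 1}

# Three trails through term 'a' leave one surviving contribution, while
# the two trails through 'b' cancel completely:
assert valuable_terms(['a', 'a', 'a', 'b', 'b']) == {'a'}
```

The removal of counted trails in Step 3 ensures that, as noted above, each middle-round term's parity is computed exactly once.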

He et al. [13] also applied the same framework, but they used different techniques. By combining their NBDP and DBP techniques, we can further improve the efficiency of recovering \(\texttt{VT}_{r_m}\). We will show the results of experiments in Sect. 6.

4.3 Variable Substitution Technique for Coefficient Recovery

For a Boolean function \(f(\boldsymbol{x}, \boldsymbol{k}) = f_{r-1}\circ \boldsymbol{f}_{r-2}\circ \cdots \circ \boldsymbol{f}_0(\boldsymbol{x},\boldsymbol{k})\) whose inputs are \(\boldsymbol{x}\in \mathbb {F}_2^n\) and \(\boldsymbol{k}\in \mathbb {F}_2^m\), denote the input and output of \(\boldsymbol{f}_i\) by \(\boldsymbol{y}_i\) and \(\boldsymbol{y}_{i+1}\), respectively. We study the problem of recovering \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\) at a middle round from an algebraic perspective. Let \(\overleftarrow{\boldsymbol{f}_{r_m}}\) denote \(\boldsymbol{f}_{r_m-1}\circ \cdots \circ \boldsymbol{f}_0\), i.e., \(\boldsymbol{y}_{r_m} = \overleftarrow{\boldsymbol{f}_{r_m}}(\boldsymbol{x}, \boldsymbol{k})\). Assume the algebraic normal form of \(\overleftarrow{\boldsymbol{f}_{r_m}}\) in \(\boldsymbol{x}\) is

$$\overleftarrow{\boldsymbol{f}_{r_m}}=\bigoplus _{\boldsymbol{v}\in \mathbb {F}_2^n}\boldsymbol{h}_{\boldsymbol{v}}(\boldsymbol{k})\boldsymbol{x}^{\boldsymbol{v}}.$$

Then one can see that \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\) is an XOR of products of the \(\boldsymbol{h}_{\boldsymbol{v}}(\boldsymbol{k})\)'s. Assume that the number of distinct non-constant \({\boldsymbol{h}}_{\boldsymbol{v}}[j]\)'s over all \(\boldsymbol{v}\) and j is t, where \({\boldsymbol{h}}_{\boldsymbol{v}}[j]\) denotes the j-th component of \({\boldsymbol{h}}_{\boldsymbol{v}}\). We now introduce new intermediate variables, denoted by \(\boldsymbol{z}\), to substitute for these t \(\boldsymbol{h}_{\boldsymbol{v}}[j]\)'s. Without loss of generality, assume \(\boldsymbol{z} = \boldsymbol{d}(\boldsymbol{k})\), where each \(\boldsymbol{d}[i]\) is equal to a certain non-constant \(\boldsymbol{h}_{\boldsymbol{v}}[j]\). From the ANF of \(\overleftarrow{\boldsymbol{f}_{r_m}}\), it is natural to derive the vectorial Boolean function \(\boldsymbol{g}_{r_m}\) such that \(\boldsymbol{y}_{r_m} = \boldsymbol{g}_{r_m}(\boldsymbol{x},\boldsymbol{z})\), whose ANF in \(\boldsymbol{x}\) and \(\boldsymbol{z}\) can be written as

$$\boldsymbol{g}_{r_m}[j] = \bigoplus _{\boldsymbol{v}}a_{\boldsymbol{v},j}\boldsymbol{z}^{\boldsymbol{c}_{\boldsymbol{v},j}}\boldsymbol{x}^{\boldsymbol{v}},$$

where \(\boldsymbol{g}_{r_m}[j]\) represents the j-th component of \(\boldsymbol{g}_{r_m}\), and \(a_{\boldsymbol{v},j}\in \mathbb {F}_2\) and \(\boldsymbol{c}_{{\boldsymbol{v}},j}\in \mathbb {F}_2^t\) are both determined by \(\boldsymbol{v}\) and j.

Example 1 illustrates the process of variable substitution. The transition from round 0 to round \(r_m\) with \((k_0k_1\oplus k_2k_5\oplus k_9\oplus k_{10})(k_2k_7\oplus k_8)x_0x_2x_3\) involves at least \(4 \times 2 = 8\) monomial trails. After variable substitution, however, there remains only one trail \(z_0z_2x_0x_2x_3\); that is, we have consolidated 8 monomial trails into a single one. As the coefficients become more intricate and the number of terms in the product increases, this reduction becomes more pronounced. Additionally, it is evident that this also makes the superpoly more concise. In general, the more compact the superpoly is, the easier it is to factorize.

Example 1

Assume \(\boldsymbol{y}_{r_m} = \boldsymbol{g}_{r_m}(\boldsymbol{x},\boldsymbol{k}) = [(k_0k_1\oplus k_2k_5\oplus k_9\oplus k_{10})x_0x_2\oplus (k_3\oplus k_6)x_5, (k_2k_7\oplus k_8)x_3\oplus x_6x_7]\). Through variable substitution, all coefficients within \(\boldsymbol{y}_{r_m}\), including \(k_0k_1\oplus k_2k_5\oplus k_9\oplus k_{10}\), \(k_3\oplus k_6\), and \(k_2k_7\oplus k_8\), will be replaced with new variables \(z_0\), \(z_1\), and \(z_2\), respectively. Then \(\boldsymbol{y}_{r_m}\) could be rewritten as \(\boldsymbol{y}_{r_m} = \boldsymbol{g}_{r_m}(\boldsymbol{x},\boldsymbol{z}) = [z_0x_0x_2\oplus z_1x_5, z_2x_3\oplus x_6x_7]\).

Therefore, we take such a way of substituting variables at the middle round \(r_m\) to recover \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\), and the process is as follows:

  1. Compute the ANF of \(\boldsymbol{y}_{r_m}\) in \(\boldsymbol{x}\).

  2. Replace all distinct non-constant \(\boldsymbol{h}_{\boldsymbol{v}}[j]\)'s for all \(\boldsymbol{v}\) and j with new variables \(\boldsymbol{z}\).

  3. Recover \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\) in \(\boldsymbol{z}\) by monomial prediction.
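A minimal sketch of the substitution step, applied to Example 1's data. The representation (each component of \(\boldsymbol{y}_{r_m}\) as a map from x-monomials to coefficient polynomials, written as plain strings) and the function `substitute` are illustrative assumptions.

```python
# Sketch of the variable-substitution step: distinct non-constant
# coefficients h_v[j] are replaced by fresh z-variables, and the
# dictionary z = d(k) records which key polynomial each z stands for.

def substitute(components):
    """components: list of dicts {x_monomial: coeff_poly}; coeff_poly is
    any hashable representation of h_v[j] (here just a string).
    Returns the rewritten components and the map coeff_poly -> z index."""
    z_of = {}
    rewritten = []
    for comp in components:
        new_comp = {}
        for mono, coeff in comp.items():
            if coeff == '1':                 # constant coefficients stay
                new_comp[mono] = '1'
            else:
                idx = z_of.setdefault(coeff, len(z_of))
                new_comp[mono] = f'z{idx}'
        rewritten.append(new_comp)
    return rewritten, z_of

# y_{r_m} from Example 1: the three coefficients become z0, z1, z2.
y = [{'x0x2': 'k0k1+k2k5+k9+k10', 'x5': 'k3+k6'},
     {'x3': 'k2k7+k8', 'x6x7': '1'}]
new_y, d = substitute(y)
assert new_y == [{'x0x2': 'z0', 'x5': 'z1'}, {'x3': 'z2', 'x6x7': '1'}]
```

Note that equal coefficient polynomials map to the same z-variable, which is exactly what collapses many monomial trails into one.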

In fact, solving \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\) in \(\boldsymbol{z}\) by monomial prediction is equivalent to finding all possible monomial trails \(\boldsymbol{z}^{\boldsymbol{c}}\boldsymbol{x}^{\boldsymbol{u}} \rightsquigarrow \pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m})\) over all \(\boldsymbol{c}\). We can construct an MILP model to describe all feasible trails.

Model for Recovering \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\) in \(\boldsymbol{z}\). To express monomial prediction as an MILP model, we actually only need to construct an MILP model describing all the trails of \(\boldsymbol{g}_{r_m}\). Since the ANF of \(\boldsymbol{g}_{r_m}\) is known, three consecutive operations \(\texttt{Copy}\rightarrow \texttt{And}\rightarrow \texttt{XOR}\) are sufficient to describe \(\boldsymbol{g}_{r_m}\). The process is as follows:

  • [Copy] For each \(x_i\) (resp. \(z_i\)), the number of copies is equal to the number of monomials divisible by \(x_i\) (resp. \(z_i\)) contained in \(\boldsymbol{g}_{r_m}[j]\) for all j.

  • [And] Generate all monomials contained in \(\boldsymbol{g}_{r_m}[j]\) for all j.

  • [XOR] According to the ANF of each \(\boldsymbol{g}_{r_m}[j]\), collect monomials using XOR to form \(\boldsymbol{g}_{r_m}[j]\).

We give an example to show how to describe \(\boldsymbol{g}_{r_m}\) by \(\texttt{Copy}\rightarrow \texttt{And}\rightarrow \texttt{XOR}\). The algorithm for recovering \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\) can be found in Algorithm 3.

Example 2

If \(\boldsymbol{y}_{r_m} = \boldsymbol{g}_{r_m}(\boldsymbol{x},\boldsymbol{z}) = (x_0x_1x_2\oplus x_0z_0\oplus z_1, x_2\oplus z_0z_1\oplus z_0)\), we can describe \(\boldsymbol{g}_{r_m}\) by the following three steps.

$$\begin{aligned} \begin{aligned} &(x_0, x_1, x_2, z_0, z_1) \overset{\texttt {Copy} }{\longrightarrow } (x_0, x_0, x_1, x_2, x_2, z_0, z_0, z_0, z_1, z_1) \overset{\texttt {And} }{\longrightarrow }\\ &(x_0x_1x_2, x_0z_0, z_1, x_2, z_0z_1, z_0) \overset{\texttt {XOR} }{\longrightarrow } (x_0x_1x_2\oplus x_0z_0\oplus z_1, x_2\oplus z_0z_1\oplus z_0) \end{aligned} \end{aligned}$$
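The copy counts in this example can be checked mechanically. The following sketch, which represents monomials as frozensets of variable names (a representation we assume here for illustration), reproduces the multiplicities of the `Copy` step.

```python
# Sketch of the Copy -> And -> XOR decomposition of Example 2's g_{r_m}:
# each component of g is a set of monomials (their XOR over GF(2)).

g = [[frozenset({'x0', 'x1', 'x2'}), frozenset({'x0', 'z0'}), frozenset({'z1'})],
     [frozenset({'x2'}), frozenset({'z0', 'z1'}), frozenset({'z0'})]]

# Copy: each variable gets one copy per monomial (over all components)
# that contains it -- matching x0,x0,x1,x2,x2,z0,z0,z0,z1,z1 above.
copies = {}
for comp in g:
    for mono in comp:
        for v in mono:
            copies[v] = copies.get(v, 0) + 1
assert copies == {'x0': 2, 'x1': 1, 'x2': 2, 'z0': 3, 'z1': 2}

# And: the copies are multiplied back into the 6 monomials; XOR: the
# monomials of each component are collected, recovering g itself.
rebuilt = [set(comp) for comp in g]
assert rebuilt[0] == {frozenset({'x0', 'x1', 'x2'}),
                      frozenset({'x0', 'z0'}), frozenset({'z1'})}
```

In the MILP model, each of these three stages contributes the standard Copy/And/XOR propagation constraints on the corresponding binary variables.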

Discussion. We have given a method of describing \(\boldsymbol{g}_{r_m}\) as an MILP model, which is easy to understand and implement. In general, there may be other ways to construct the MILP model for a concrete \(\boldsymbol{g}_{r_m}\). Of course, different modeling choices do not affect the correctness of the recovered coefficients. It is difficult to determine theoretically which way of modeling \(\boldsymbol{g}_{r_m}\) yields a model that is easier to solve. To verify the improvement of our variable substitution technique over previous methods, we will compare their performance experimentally.

(Algorithm 3 appears here as a figure.)

5 Improved Method for Searching a Large Scale of Cubes

The search for ISoCs in cube attacks often involves degree evaluations of cryptosystems. While the numeric mapping technique [18] offers lower complexity, it does not perform well for Trivium-like ciphers when dealing with sets of adjacent indices. This limitation arises because the multiplication of adjacent indices during state updates causes estimated degrees to be accumulated repeatedly. Although the monomial prediction technique [15] provides exact results, it is time-intensive. Thus, efficiently obtaining the exact degree of a cryptosystem remains a challenge. To efficiently search for promising cubes with adjacent indices on a large scale, we propose a compromise approach for degree evaluation called the “vector numeric mapping” technique. This technique yields a tighter upper bound than the numeric mapping technique while maintaining lower time complexity than monomial prediction. Additionally, we have developed an efficient algorithm based on an MILP model for the large-scale search of ISoCs.

5.1 The Numeric Mapping

Let \(\mathbb {B}_n\) be the set consisting of all n-variable Boolean functions. The numeric mapping [18], denoted by \(\texttt {DEG}\), is defined as

$$\begin{aligned} \texttt {DEG}:\quad \mathbb {B}_n\times \mathbb {Z}^n&\longrightarrow \mathbb {Z}\\ (f,\boldsymbol{d})&\longmapsto \max _{a_{\boldsymbol{u}}\ne 0}\left\{ \sum _{i=0}^{n-1}\boldsymbol{u}[i]\boldsymbol{d}[i]\right\} , \end{aligned}$$

where \(a_{\boldsymbol{u}}\) is the coefficient of the term \(\boldsymbol{x}^{\boldsymbol{u}}\) in the ANF of f.
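The numeric mapping can be implemented directly from this definition. A minimal sketch, assuming the ANF of f is given as a list of 0/1 exponent vectors (the function name `DEG` mirrors the notation above but the encoding is our own):

```python
# Sketch of the numeric mapping DEG(f, d): maximise the sum of the
# per-variable degree bounds d[i] over the nonzero monomials of f.

def DEG(monomials, d):
    """monomials: iterable of 0/1 exponent tuples u with a_u != 0;
    d: integer degree bound d[i] for variable i."""
    return max(sum(u[i] * d[i] for i in range(len(u))) for u in monomials)

# f = x0*x1 + x2 with inner degree bounds d = (2, 3, 4):
# deg(x0*x1) <= 2 + 3 = 5 and deg(x2) <= 4, so DEG = 5.
assert DEG([(1, 1, 0), (0, 0, 1)], (2, 3, 4)) == 5
```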

Let \(\boldsymbol{g}=(g_0,\ldots ,g_{n-1})\) be an (m, n)-vectorial Boolean function, i.e., \(g_i\in \mathbb {B}_m\) for \(0\le i\le n-1\). Then for \(f\in \mathbb {B}_n\), the numeric degree of the composite function \(h=f \circ \boldsymbol{g}=f(g_0,\ldots ,g_{n-1})\), denoted by \(\texttt {DEG}(h)\), is defined as \(\texttt {DEG}(f, \boldsymbol{d}_{\boldsymbol{g}})\), where \(\boldsymbol{d}_{\boldsymbol{g}}[i] \ge \deg (g_i)\) for all \(0 \le i \le n - 1\). The algebraic degree of h is always no greater than \(\texttt {DEG}(h)\); therefore, the algebraic degrees of the internal states of an NFSR-based cryptosystem can be estimated iteratively by using the numeric mapping.

5.2 The Vector Numeric Mapping

Firstly, we introduce the definition of the vector degree of a Boolean function, from which the motivation of the vector numeric mapping can be easily understood. For simplicity, let \(\deg (g_1,\ldots , g_n)\) represent \(\left( \deg (g_1), \ldots , \deg (g_n) \right) \).

Definition 2 (Vector Degree)

Let f be an n-variable Boolean function represented w.r.t. \(\boldsymbol{x}_I\) as

$$f(\boldsymbol{x}) = \bigoplus _{\boldsymbol{u}\in \mathbb {F}_2^d} g_{{\boldsymbol{u}}}(\boldsymbol{x}_{I^c}) \boldsymbol{x}_I^{\boldsymbol{u}},$$

where \(I \subset [n]\), \(|I|=d\). The vector degree of f w.r.t. \(\boldsymbol{x}\) and the index set I, denoted by \(\textbf{vdeg}_{[I,\boldsymbol{x}]}\), is defined as

$$\begin{aligned} \textbf{vdeg}_{[I,\boldsymbol{x}]}(f) = \deg (g_{\boldsymbol{u}_0}, g_{\boldsymbol{u}_1}, \ldots , g_{\boldsymbol{u}_{2^d - 1}})_{\boldsymbol{x}_{I^c}} =\left( \deg (g_{\boldsymbol{u}_0})_{\boldsymbol{x}_{I^c}},\ldots , \deg (g_{\boldsymbol{u}_{2^d - 1}})_{\boldsymbol{x}_{I^c}}\right) , \end{aligned}$$

where \(\boldsymbol{u}_j\) satisfies \(\sum _{k=0}^{d - 1} {\boldsymbol{u}_j[k]}2^k = j\), \(0\le j\le 2^d-1\).

When we do not emphasize I and \(\boldsymbol{x}\), we abbreviate \(\textbf{vdeg}_{[I,\boldsymbol{x}]}\) as \(\textbf{vdeg}_{I}\) or \(\textbf{vdeg}\). Similarly, for a vectorial Boolean function \(\boldsymbol{g}=(g_1,\ldots ,g_n)\), we denote the vector degree of \(\boldsymbol{g}\) by \({\textbf {vdeg}}(\boldsymbol{g})=\left( {\textbf {vdeg}}(g_1), \ldots , {\textbf {vdeg}}(g_n)\right) \).

According to Definition 2, it is straightforward to get an upper bound of the vector degree of f, which is shown in Proposition 1.

Proposition 1

For any \(0 \le j < 2^{\left|I \right|}\), \(\textbf{vdeg}_{[I,\boldsymbol{x}]}(f)[j] \le n - \left|I \right|\).

Moreover, it is obvious that the vector degree of f contains more information about f than the algebraic degree. We can also derive the algebraic degree of f from its vector degree, that is,

$$\deg (f) = \max _{0 \le j < 2^{\left|I \right|}}\{\textbf{vdeg}_{I}(f)[j] + \textrm{wt}(j)\}.$$

Therefore, the upper bound of the algebraic degree can be estimated by the upper bound of the vector degree.
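As a sanity check of the formula \(\deg (f) = \max _{j}\{\textbf{vdeg}_{I}(f)[j] + \textrm{wt}(j)\}\), a small sketch that recovers the algebraic degree from a vector degree (the function name and the encoding of j as an integer whose bits give \(\boldsymbol{u}_j\) are our own illustrative choices):

```python
# Sketch: recover deg(f) from the vector degree w.r.t. an index set I.
# vdeg[j] = deg(g_{u_j}) in the variables outside I, where the binary
# expansion of j encodes the exponent u_j of x_I.

def deg_from_vdeg(vdeg):
    return max(v + bin(j).count('1') for j, v in enumerate(vdeg))

# Toy example with |I| = 2, f = g_00 + g_01*x0 + g_10*x1 + g_11*x0*x1
# and coefficient degrees (3, 2, 0, 1):
# deg(f) = max(3+0, 2+1, 0+1, 1+2) = 3.
assert deg_from_vdeg([3, 2, 0, 1]) == 3
```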

Corollary 1

Let \(\boldsymbol{v}\) be an upper bound of the vector degree of f, i.e., \(\textbf{vdeg}_{[I,\boldsymbol{x}]}(f)\preccurlyeq \boldsymbol{v}\). Then we have

$$\deg (f) \le \max _{0 \le j < 2^{\left|I \right|}}\left\{ \min \left\{ \boldsymbol{v}[j], n - \left|I \right|\right\} + \textrm{wt}(j)\right\} .$$

In fact, the algebraic degree of f is the degenerate form of the vector degree of f w.r.t. \(I = \emptyset \). Moreover, if \(I_1 \subset I_2\), the vector degree of f w.r.t. \(I_1\) can be deduced from the vector degree of f w.r.t. \(I_2\), that is,

$$\begin{aligned} \textbf{vdeg}_{I_1}(f)[j] = \max _{0 \le j'<2^{|I_2 |- |I_1 |}}\left\{ \textbf{vdeg}_{I_2}(f)[j'\cdot 2^{|I_1 |} + j] + \textrm{wt}(j')\right\} \end{aligned}$$
(5)

for any \(0 \le j < 2^{|I_1 |}\).

In order to estimate the vector degree of composite functions, we propose the concept of vector numeric mapping.

Definition 3 (Vector Numeric Mapping)

Let \(d \ge 0\). The vector numeric mapping, denoted by \(\texttt{VDEG}_d\), is defined as

$$\begin{aligned} \texttt{VDEG}_d:\quad \mathbb {B}_n\times \mathbb {Z}^{n \times 2^d}&\longrightarrow \mathbb {Z}^{2^d}\\ (f,V)&\longmapsto \boldsymbol{w}, \end{aligned}$$

where \(f=\bigoplus _{\boldsymbol{u}\in \mathbb {F}_2^n }a_{\boldsymbol{u}}\boldsymbol{x}^{\boldsymbol{u}}\) and for any \(0\le j<2^d\),

$$\begin{aligned} \boldsymbol{w}[j] := \max _{a_{\boldsymbol{u}} \ne 0}\max _{\begin{array}{c} j_0,\cdots ,j_{n - 1} \\ 0 \le j_i \le \boldsymbol{u}[i](2^d-1) \\ j = \bigvee _{i=0}^{n - 1} \boldsymbol{u}[i]j_i \end{array}}\left\{ \sum _{i=0}^{n - 1} \boldsymbol{u}[i]V[i][j_i]\right\} . \end{aligned}$$

For an (m, n)-vectorial Boolean function \(\boldsymbol{g}=(g_0,\ldots , g_{n-1})\), we define its vector numeric mapping as \(\texttt {VDEG}(\boldsymbol{g},V)=(\texttt {VDEG}(g_0,V),\ldots ,\texttt {VDEG}(g_{n-1}, V))\).
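For small n and d, the mapping of Definition 3 can be evaluated by brute force. The sketch below enumerates the indices \(j_i\) only over the active variables of each monomial, since the constraint \(0\le j_i\le \boldsymbol{u}[i](2^d-1)\) forces \(j_i=0\) for inactive variables; the representation of f as a list of variable-index sets is our own.

```python
from itertools import product

# Sketch of the vector numeric mapping VDEG_d(f, V): for each monomial
# of f (a set S of active variables) and each target index j, maximise
# sum_{i in S} V[i][j_i] over choices j_i whose bitwise OR equals j.

def VDEG(monomials, V, d):
    """monomials: list of sets of variable indices (a_u != 0);
    V: V[i] is the length-2^d vector-degree bound of variable i."""
    w = [float('-inf')] * (1 << d)
    for S in monomials:
        S = sorted(S)
        for js in product(range(1 << d), repeat=len(S)):
            j = 0
            for ji in js:
                j |= ji
            total = sum(V[i][ji] for i, ji in zip(S, js))
            w[j] = max(w[j], total)
    return w

# d = 1, f = x0*x1 + x2, with vector-degree bounds
# V = [[1, 0], [2, 1], [0, 0]]: w[0] = 1+2 = 3, w[1] = max(1+1, 0+2, 0) = 2.
assert VDEG([{0, 1}, {2}], [[1, 0], [2, 1], [0, 0]], 1) == [3, 2]
```

This brute-force evaluation is exponential in d, which is precisely why the choice of the index set I must be made carefully, as discussed at the end of this section.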

Theorem 3

Let f be an n-variable Boolean function and \(\boldsymbol{g}\) be an (m, n)-vectorial Boolean function. Assume \(\textbf{vdeg}_I(g_i) \preccurlyeq \boldsymbol{v}_i\) for all \(0 \le i \le n - 1\) w.r.t. an index set I. Then each component of the vector degree of \(f \circ \boldsymbol{g}\) is less than or equal to the corresponding component of \(\texttt{VDEG}_{I}(f, V)\), where \(V=(\boldsymbol{v}_0,\cdots ,\boldsymbol{v}_{n - 1})\).

The proof of Theorem 3 can be found in [25, Appendix D]. By Theorem 3, we know that the vector numeric mapping \(\texttt{VDEG}(f,V)\) gives an upper bound of the vector degree of the composite function \(f\circ \boldsymbol{g}\) when V is the upper bound of the vector degree of the vectorial Boolean function \(\boldsymbol{g}\).

For a Boolean function \(f(\boldsymbol{x}) = f_{r-1}\circ \boldsymbol{f}_{r-2}\circ \cdots \circ \boldsymbol{f}_0(\boldsymbol{x})\), let I be an index set. We denote the upper bound of the vector degree of f w.r.t. \(\boldsymbol{x}\) and I by

$$\widehat{\textbf{vdeg}}_{[I,\boldsymbol{x}]}(f)=\texttt{VDEG}(f_{r-1},V_{r-2}),$$

where \(V_i=\texttt{VDEG}(\boldsymbol{f}_i,V_{i-1})\), \(0<i\le r-2\), and \(V_0=\textbf{vdeg}_{[I,\boldsymbol{x}]}(\boldsymbol{f}_0)\).

According to Proposition 1 and Corollary 1, the estimate of the algebraic degree of f w.r.t. \(\boldsymbol{x}\) and I, denoted by \(\widehat{\textbf{deg}}_{[I,\boldsymbol{x}]}(f)\), can be derived from \(\widehat{\textbf{vdeg}}_{[I,\boldsymbol{x}]}(f)\). To meet different goals in various scenarios, we give the following three modes for obtaining \(\widehat{\textbf{deg}}_{[I,\boldsymbol{x}]}(f)\):

Mode 1. \(\widehat{\textbf{deg}}_{[I,\boldsymbol{x}]}(f)=\max _{0\le j<2^{|I|}}\{\min \{\widehat{\textbf{vdeg}}_{[I,\boldsymbol{x}]}(f)[j],n-|I|\}+\textrm{wt}(j)\}.\)

Mode 2. \(\widehat{\textbf{deg}}_{[I,\boldsymbol{x}]}(f)=\widehat{\textbf{vdeg}}_{[I,\boldsymbol{x}]}(f)[2^{|I|}-1]+|I|.\)

Mode 3. \(\widehat{\textbf{deg}}_{[I,\boldsymbol{x}]}(f)=\max _{0\le j<2^{|I|}}\{\widehat{\textbf{vdeg}}_{[I,\boldsymbol{x}]}(f)[j]+\textrm{wt}(j)\}.\)

Mode 1 gives the estimated degree fully justified by the preceding discussion and is the most precise. Mode 2 focuses on the last coordinate of \(\widehat{\textbf{vdeg}}_{[I,\boldsymbol{x}]}(f)\), which may tell us whether the algebraic degree can reach its maximum value. Mode 3 gives the estimated degree without the revision by \(n-|I|\), and will be used when choosing the index set of the vector degree.
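The three modes differ only in how the coordinates of the estimated vector degree are combined. A toy sketch with hypothetical function names (n variables in total, d of them cube variables):

```python
# Sketch of the three estimation modes derived from an upper bound vdeg
# of the vector degree w.r.t. an index set I with |I| = d.

def wt(j):
    return bin(j).count('1')

def mode1(vdeg, n, d):
    # Cap each coordinate at n - d (Proposition 1) before maximising.
    return max(min(v, n - d) + wt(j) for j, v in enumerate(vdeg))

def mode2(vdeg, n, d):
    # Last coordinate only: j = 2^d - 1, i.e. the full cube monomial.
    return vdeg[-1] + d

def mode3(vdeg, n, d):
    # No revision: plain maximum of vdeg[j] + wt(j).
    return max(v + wt(j) for j, v in enumerate(vdeg))

# n = 5, d = 2, vdeg = (4, 2, 1, 1): mode 1 caps the 4 at n - d = 3.
v = (4, 2, 1, 1)
assert (mode1(v, 5, 2), mode2(v, 5, 2), mode3(v, 5, 2)) == (3, 3, 4)
```

The example shows how Mode 3 can exceed Mode 1: the uncapped coordinate 4 is impossible by Proposition 1, but keeping it unrevised is convenient when comparing candidate index sets.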

Since the index set I is an important parameter when estimating the vector degree of f, we study how different choices of the index set influence the estimation. We then give the relationship between the numeric mapping and the vector numeric mapping.

Theorem 4

Let \(f\in \mathbb {B}_n\) and \(I_1\) and \(I_2\) be two index sets with \(|I_1|=k\), \(|I_2|=d\) and \(I_1\subset I_2\). If \(V_1 \in \mathbb {Z}^{n\times 2^{k}}\) and \(V_2 \in \mathbb {Z}^{n \times 2^{d}}\) satisfy

$$\begin{aligned} V_1[i][j] \ge \max _{0 \le j'<2^{d-k}}\left\{ V_2[i][j'\cdot 2^{k} + j] + \textrm{wt}(j')\right\} \end{aligned}$$
(6)

for any \(0 \le i \le n - 1\) and \(0 \le j<2^{k}\), then we have

$$\begin{aligned} \texttt{VDEG}_k(f, V_1)[j] \ge \max _{0 \le j'<2^{d-k}}\left\{ \texttt{VDEG}_{d}(f,V_2)[j'\cdot 2^{k} + j] + \textrm{wt}(j')\right\} \end{aligned}$$
(7)

for any \(0 \le j < 2^{k}.\)

The proof of Theorem 4 can be found in [25, Appendix E]. Let \(V_i \succcurlyeq \textbf{vdeg}_{I_i}(\boldsymbol{g})\) for \(i = 1,2\) in Theorem 4, and assume that they satisfy the inequality (6). Since \(\texttt{VDEG}_d(f,V_2) \succcurlyeq \textbf{vdeg}_{I_2}(f\circ \boldsymbol{g})\) by Theorem 3, we can see that the RHS of (7) is larger than or equal to \(\textbf{vdeg}_{I_1}(f\circ \boldsymbol{g})[j]\) from (5). It implies that the RHS of (7) gives a tighter upper bound of \(\textbf{vdeg}_{I_1}(f\circ \boldsymbol{g})[j]\) than the LHS of (7). Moreover, the relation in (6) would be maintained after iterations of the vector numeric mapping by Theorem 4.

In fact, the numeric mapping is the degenerate form of the vector numeric mapping for \(d=0\). Therefore, the upper bound of \(\deg (\boldsymbol{g}_r \circ \cdots \circ \boldsymbol{g}_1)\) derived from iterating the vector numeric mapping \(\texttt{VDEG}(\boldsymbol{g}_i,V_i)\) is tighter than that derived from iterating the numeric mapping \(\texttt{DEG}(\boldsymbol{g}_i,\boldsymbol{d}_i)\). An example can be found in [25, Appendix F].

How should one choose a suitable index set for the vector degree? One can consider the index set \(I=[m]\), where m is the input size of the function \(\boldsymbol{g}\). By Theorem 4, this is the best set if we only consider the accuracy of the estimated degree. However, the space and time complexity of the vector numeric mapping is exponential in the size of the index set. Therefore, we should choose the index set of the vector degree carefully. We will put forward some heuristic ideas for the Trivium cipher in Sect. 6.

5.3 Algorithm for Searching Good ISoCs

As mentioned in Sect. 3, finding a large number of special ISoCs is quite important for improving correlation cube attacks. Indeed, we observe that once the estimated algebraic degree of f over an ISoC exceeds the ISoC's size, the higher the estimated degree is, the more complex the corresponding superpoly tends to be. Therefore, when searching for ISoCs of a fixed size, imposing the constraint that the estimated algebraic degree of f be below a threshold may significantly increase the likelihood of obtaining a relatively simple superpoly. We thus heuristically convert our goal of finding a large number of special ISoCs into finding a large number of good ISoCs, i.e., ISoCs whose corresponding estimated algebraic degrees of f are lower than a threshold d.

In the following, we propose an efficient algorithm for searching for a large number of such good ISoCs.

Theorem 5

Let \(f(\boldsymbol{x},\boldsymbol{k})=f_{r-1}\circ \boldsymbol{f}_{r-2}\circ \cdots \circ \boldsymbol{f}_0(\boldsymbol{x},\boldsymbol{k})\) be a Boolean function, where \(\boldsymbol{x}\in \mathbb {F}_2^n\) represents the initial vector and \(\boldsymbol{k}\in \mathbb {F}_2^m\) represents the key. Let \(J\subset [n]\) be an index set for vector degree and I and K be two ISoCs satisfying \(J\subset K \subset I\). Then we have

$$\widehat{\textbf{vdeg}}_{[J,\boldsymbol{x}_K]}(f|_{\boldsymbol{x}_{K^c}=0})\preccurlyeq \widehat{\textbf{vdeg}}_{[J,\boldsymbol{x}_I]}(f|_{\boldsymbol{x}_{I^c}=0}).$$

Proof

Let \(U_0=\textbf{vdeg}_{[J,\boldsymbol{x}_K]}(\boldsymbol{f}_0|_{\boldsymbol{x}_{K^c}=0})\), \(V_0=\textbf{vdeg}_{[J,\boldsymbol{x}_I]}(\boldsymbol{f}_0|_{\boldsymbol{x}_{I^c}=0})\), and \(U_t=\texttt{VDEG}(\boldsymbol{f}_t,U_{t-1})\), \(V_t=\texttt{VDEG}(\boldsymbol{f}_t,V_{t-1})\) for \(1\le t \le r-2\). Then \(\widehat{\textbf{vdeg}}_{[J,\boldsymbol{x}_K]}(f|_{\boldsymbol{x}_{K^c}=0})=\texttt{VDEG}(f_{r-1},U_{r-2})\) and \(\widehat{\textbf{vdeg}}_{[J,\boldsymbol{x}_I]}(f|_{\boldsymbol{x}_{I^c}=0})=\texttt{VDEG}(f_{r-1},V_{r-2}).\)

It is obvious that the set of monomials in \(\boldsymbol{f}_0|_{\boldsymbol{x}_{I^c}=0}\) is a superset of the set of monomials in \(\boldsymbol{f}_0|_{\boldsymbol{x}_{K^c}=0}\) since \(I^c\subset K^c\). Thus, we can get \(U_0\preccurlyeq V_0\) from Definition 2. According to Definition 3, we can iteratively get \(U_i\preccurlyeq V_i\) for all \(1\le i\le r-2\), which leads to \(\widehat{\textbf{vdeg}}_{[J,\boldsymbol{x}_K]}(f|_{\boldsymbol{x}_{K^c}=0})\preccurlyeq \widehat{\textbf{vdeg}}_{[J,\boldsymbol{x}_I]}(f|_{\boldsymbol{x}_{I^c}=0})\).

Corollary 2

Let \(f(\boldsymbol{x},\boldsymbol{k})=f_{r-1}\circ \boldsymbol{f}_{r-2}\circ \cdots \circ \boldsymbol{f}_0(\boldsymbol{x},\boldsymbol{k})\) be a Boolean function. Let J be an index set of vector degree, \(d>|J|\) be a threshold of algebraic degree, and K be an ISoC satisfying \(J\subset K\). If \(\widehat{\textbf{deg}}_{[J,\boldsymbol{x}_K]}(f|_{\boldsymbol{x}_{K^c}=0})\ge d\), then \(\widehat{\textbf{deg}}_{[J,\boldsymbol{x}_I]}(f|_{\boldsymbol{x}_{I^c}=0})\ge d\) for all ISoCs I satisfying \(K\subset I\).

Corollary 2 can be derived from Theorem 5 directly. Theorem 5 shows a relationship between the estimated vector degrees of f w.r.t. a fixed index set J for two ISoCs containing J. According to Corollary 2, we can delete all the sets I containing an ISoC K from the search space of ISoCs if K satisfies \(\widehat{\textbf{deg}}_{[J,\boldsymbol{x}_K]}(f|_{\boldsymbol{x}_{K^c}=0})\ge d\). Therefore, in order to delete more “bad” ISoCs from the search space, we should try to find such an ISoC K that is as small as possible.

For a given ISoC I satisfying \(\widehat{\textbf{deg}}_{[J,\boldsymbol{x}_{I}]}(f|_{\boldsymbol{x}_{I^c}=0})\ge d\), we can iteratively choose a series of ISoCs \(I\supsetneqq I_1 \supsetneqq \cdots \supsetneqq I_q\supset J\) such that \(\widehat{\textbf{deg}}_{[J,\boldsymbol{x}_{I_i}]}(f|_{\boldsymbol{x}_{I_i^c}=0})\ge d\) for all \(1\le i \le q\) and \(\widehat{\textbf{deg}}_{[J,\boldsymbol{x}_{I'}]}(f|_{\boldsymbol{x}_{I'^c}=0})<d\) for any \(I'\subsetneqq I_q\). Note that this process terminates with a minimal such ISoC \(I_q\) obtained from I, since \(\widehat{\textbf{deg}}_{[J,\boldsymbol{x}_{J}]}(f|_{\boldsymbol{x}_{J^c}=0})\le |J|<d\).

Next, based on the previous discussion, we give a new algorithm for searching for a large number of good ISoCs.

Process of Searching Good ISoCs. Let J be a given index set, \(\varOmega \) be the set of all subsets of [n] that contain J and have size k, d be a degree threshold, and a be an upper bound on the number of retries. The main steps are:

  1. Prepare an empty set \(\mathcal {I}\).

  2. Select an element I from \(\varOmega \) as an ISoC.

  3. Estimate the algebraic degree of f w.r.t. the variables \(\boldsymbol{x}_{I}\) and the index set J, denoted by \(d_I\). If \(d_I<d\), add I to \(\mathcal {I}\) and go to Step 5; otherwise, set \(count = 0\) and go to Step 4.

  4. Set \(count = count+1\). Let \(I'=I\), randomly remove an element \(i\in I'\setminus J\) from \(I'\) and set \(x_i=0\). Then estimate the algebraic degree of f w.r.t. the variables \(\boldsymbol{x}_{I'}\). If the degree is less than d and \(count<a\), repeat Step 4; if the degree is less than d and \(count\ge a\), go to Step 5; if the degree is greater than or equal to d, let \(I=I'\) and go to Step 3.

  5. Remove all the sets containing I from \(\varOmega \). If \(\varOmega \ne \emptyset \), go to Step 2; otherwise, output \(\mathcal {I}\).

The output \(\mathcal {I}\) is the set of all good ISoCs we want. In the algorithm, Step 4 implements the process of finding a “bad” ISoC that is as small as possible. Since the index i removed from \(I'\) is chosen at random each time, we use a counter to record the number of retries and bound it by a to ensure that the algorithm terminates.
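Steps 3 and 4 can be sketched as follows. Here `deg_est` is a toy stand-in for the degree oracle (in the sketch it simply returns the cube size, an assumption for illustration only); in the actual algorithm it would be the vector-numeric-mapping estimate.

```python
import random

# Sketch of the shrinking loop (Steps 3-4): given a "bad" ISoC I
# (estimated degree >= d), randomly drop indices outside J, keeping any
# smaller ISoC that is still bad, until a retries in a row fail.

def shrink(I, J, d, a, deg_est, rng=random.Random(0)):
    """Return (good, I): good=True with an ISoC of degree < d, or
    good=False with a small bad ISoC whose supersets can be pruned."""
    I = set(I)
    if deg_est(I) < d:
        return True, I                        # Step 3: good ISoC
    count = 0
    while count < a:                          # Step 4
        count += 1
        Ip = set(I)
        Ip.remove(rng.choice(sorted(Ip - set(J))))   # drop a random index
        if deg_est(Ip) >= d:                  # I' still bad: keep shrinking
            I, count = Ip, 0
    return False, I                           # Step 5: prune supersets of I

good, K = shrink({0, 1, 2, 3}, {0}, d=3, a=2, deg_est=len)
assert not good and K <= {0, 1, 2, 3} and len(K) == 3
```

With this toy oracle the ISoC of size 4 always shrinks to a bad ISoC of size 3, whose supersets Step 5 then removes from \(\varOmega\).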

To implement the algorithm efficiently, we establish an MILP model and use the MILP solver Gurobi to solve it, thereby obtaining the large number of good ISoCs we need.

MILP Model for Searching Good ISoCs. To describe the elements of \(\varOmega \) precisely, we characterize \(\varOmega \) by linear inequalities over the integers. We use a binary variable \(b_i \) to express whether \(v_i\) is chosen as a cube variable, namely, \(b_i=1\) iff \(v_i\) is chosen as a cube variable, \(0\le i \le n-1\). The sub-models are established as follows:

Model 1

To describe that the size of each element of \(\varOmega \) is equal to k, we use

$$\sum _{i=0}^{n-1}b_i=k.$$

Model 2

To describe that each element of \(\varOmega \) includes the set J, we use

$$b_j=1\quad \text {for all}~j\in J.$$

Model 3

To describe removing all the sets that contain I from \(\varOmega \), we use

$$\sum _{i\in I}b_i\le |I|-1.$$

Since some ISoCs are deleted in Step 5 during the search, the MILP model must be adjusted continuously, which we implement with the callback mechanism of Gurobi. Adjusting the model through a callback does not re-test excluded nodes that violate the conditions; the solver simply continues with the nodes not yet traversed, so the adjustment causes no repetition in the solving process and wastes no time.
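As a sanity check on the three sub-models, the following toy enumeration (a brute-force stand-in for the MILP solver, feasible only for small n) filters 0/1 assignments by exactly the constraints of Models 1–3; in the real search these constraints are handed to Gurobi, with Model 3 added lazily in the callback:

```python
from itertools import product

def feasible_isocs(n, k, J, excluded):
    """Enumerate ISoCs encoded by binary vectors b satisfying Models 1-3."""
    out = []
    for b in product((0, 1), repeat=n):
        if sum(b) != k:                                 # Model 1: |ISoC| = k
            continue
        if any(b[j] != 1 for j in J):                   # Model 2: J is included
            continue
        # Model 3: for every pruned "bad" ISoC I, forbid all of its supersets
        if any(sum(b[i] for i in I) >= len(I) for I in excluded):
            continue
        out.append(frozenset(i for i in range(n) if b[i]))
    return out
```

For instance, with \(n=5\), \(k=3\), \(J=\{0\}\) and the bad ISoC \(\{1,2\}\) excluded, the enumeration returns the five size-3 supersets of \(\{0\}\) that do not contain \(\{1,2\}\).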

According to the above descriptions and the MILP model, we give an algorithm for searching good ISoCs. The algorithm consists of two parts, the main procedure and the callback function; the complete algorithm can be found in [25, Appendix G].

6 Application to Trivium

In this section, we apply all of our techniques to Trivium, including degree estimation, superpoly recovery and the improved correlation cube attack. We set \(r_m = 200\) in the experiments of recovering superpolys below; the expressions of the states after 200-round initialization of Trivium have been computed and rewritten in the new variables as described in Sect. 4, where the ANF of the new variables in the key \(\boldsymbol{k}\) is also determined. For details, please visit the git repository https://github.com/faniw3i2nmsro3nfa94n/Results. Thanks to the efficiency of the improved algorithms, all experiments were completed on a personal computer.

6.1 Description of Trivium Stream Cipher

Trivium [7] consists of three nonlinear feedback shift registers of sizes 93, 84 and 111, denoted by \(r_0, r_1, r_2\), respectively. Its internal state, denoted by \(\boldsymbol{s}\) and of size 288, is initialized by loading the 80-bit key \(k_i\) into \(s_i\) and the 80-bit IV \(x_i\) into \(s_{i+93}\), \(0\le i \le 79\); all other bits are set to 0 except for the last three bits of the third register, which are set to 1. During the initialization stage, the cipher outputs no keystream bit until the internal state has been updated for 1152 rounds. The linear components of the three update functions are denoted by \(\ell _1, \ell _2\) and \(\ell _3\), respectively, and the update process can be described as

$$\begin{aligned} \begin{aligned} s_{n_i}&=s_{n_i-1} \cdot s_{n_i - 2} \oplus \ell _i(\boldsymbol{s})~ \text { for }~ i = 1,~ 2,~ 3,\\ \boldsymbol{s} & \leftarrow \left( s_{287}, s_{0}, s_{1}, \cdots , s_{286} \right) , \end{aligned} \end{aligned}$$
(8)

where \(n_1, n_2, n_3\) are equal to 92, 176, 287, respectively. Let z denote the output bit of Trivium. Then the output function is \(z=s_{65}\oplus s_{92}\oplus s_{161}\oplus s_{176}\oplus s_{242} \oplus s_{287}\).
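The description above is the compressed, rotating form of Trivium; for concreteness, a bit-level sketch of the equivalent standard description [7] (0-indexed so that the keystream matches the output function z above) might look as follows:

```python
def trivium(key, iv, nbits, rounds=1152):
    """Generate nbits keystream bits for 80-bit key/IV given as lists of 0/1."""
    # Registers: s[0..92] (key + zeros), s[93..176] (IV + zeros), s[177..287]
    # (zeros, with the last three bits set to 1).
    s = key[:80] + [0] * 13 + iv[:80] + [0] * 112 + [1, 1, 1]
    assert len(s) == 288
    out = []
    for r in range(rounds + nbits):
        t1 = s[65] ^ s[90] & s[91] ^ s[92] ^ s[170]
        t2 = s[161] ^ s[174] & s[175] ^ s[176] ^ s[263]
        t3 = s[242] ^ s[285] & s[286] ^ s[287] ^ s[68]
        if r >= rounds:   # no output during the 1152 initialization rounds
            out.append(s[65] ^ s[92] ^ s[161] ^ s[176] ^ s[242] ^ s[287])
        # Shift each register, feeding t3, t1, t2 into the respective heads
        s = [t3] + s[:92] + [t1] + s[93:176] + [t2] + s[177:287]
    return out
```

Here the AND terms \(s_{90}s_{91}\), \(s_{174}s_{175}\), \(s_{285}s_{286}\) are the nonlinear parts and the remaining XORed bits form the linear components \(\ell _1, \ell _2, \ell _3\).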

6.2 Practical Verification for Known Cube Distinguishers

In [17], Kesarwani et al. found three ISoCs having Zero-Sum properties up to 842 initialization rounds of Trivium by cube tester experiments. These ISoCs are listed in [25, Appendix H], denoted by \(I_1, I_2,I_3\). We apply the superpoly recovery algorithm proposed in Sect. 4 to these ISoCs. It turns out that some of the claimed Zero-Sum properties are incorrect, owing to the randomness of experiments performed on only a small portion of the keys. The correct results are listed in Table 2, where “Y” means that the corresponding ISoC has the Zero-Sum property, while “N” means the opposite. For more details about the superpolys of these ISoCs, please refer to our git repository. We also give some values of the key for which the non-zero superpolys evaluate to 1, listed in [25, Appendix I].

Table 2. Verification of Zero-Sum properties in [17]

Comparison of Computational Complexity for Superpoly Recovery. For comparison, we recover the superpoly of the ISoC \(I_2\) for 838 rounds by nested monomial prediction, by nested monomial prediction with the NBDP and CMP techniques, and by nested monomial prediction with our variable substitution technique, respectively, where the number of middle rounds is set to \(r_m = 200\) for the last two. As a result, superpoly recovery takes more than one day by nested monomial prediction, about 13 min with the NBDP and CMP techniques, and 15 min with our method. This implies that the variable substitution technique is as important as the NBDP and CMP techniques in reducing the complexity of superpoly recovery. Furthermore, by combining our method with the NBDP and CMP techniques to obtain valuable terms, it takes only about 2 min to recover this superpoly. Thus, combining our variable substitution technique with NBDP and CMP is the best choice for superpoly recovery.

6.3 Estimation of Vector Degree of Trivium

Recall the algorithm proposed by Liu in [18] for estimating the degree of Trivium-like ciphers. We replace the numeric mapping there with the vector numeric mapping, since the vector numeric mapping performs well for ISoCs containing adjacent indices, whereas the numeric mapping does not.

The algorithm for estimation of the vector degree of Trivium is detailed in [25, Algorithm 11] and [25, Algorithm 12]. The main idea is the same as [18, Algorithm 2], but the numeric mapping is replaced. For the sake of simplicity, we denote \(\texttt{VDEG}(\prod _{i=1}^k x[i],(\boldsymbol{v}_1,\cdots , \boldsymbol{v}_k))\) as \(\texttt{VDEGM}(\boldsymbol{v}_1, \cdots , \boldsymbol{v}_k)\) in the algorithms.

Heuristic Method for Choosing Indices of Vector Degree. As discussed earlier, the size of the index set of vector degree should not be too large, and we usually set it to be less than 13. How should the indices be chosen to obtain a good degree evaluation? We give the following two heuristic strategies.

  1.

    Check whether there are adjacent elements in the ISoC I. If so, add all the adjacent elements to the index set J. When the size of J exceeds a preset threshold, randomly remove elements from J until its size equals the threshold; otherwise, set \(I = I \setminus J\) and execute Strategy 2.

  2.

    Run our vector degree estimation algorithm ([25, Algorithm 11]) with the input \((\boldsymbol{s}^0,I_i,\emptyset ,R,3)\) for all \(i\in I\), where \(I_i=\{i\}\). Each time, remove the index with the largest degree evaluation of the R-round output bit from I and add it to J, until the size of J equals the preset threshold. If multiple choices have equal degree evaluations, randomly pick one of them.

After applying these two strategies, we obtain an index set of vector degree. Since two adjacent state bits are multiplied in the Trivium update function, variables with adjacent indices may be multiplied many times; hence, in Strategy 1, we choose adjacent indices in I and add them to the index set of vector degree. In Strategy 2, we compute the degree evaluation of the R-round output bit by setting the degree of \(x_j\) to be zero for all \(j \in I\) except i. Although the exact degree of the output bit is at most 1, the evaluation is usually much larger than 1, because the variable \(x_i\) is multiplied by itself many times and its estimated degree is added repeatedly. So we choose these variables, whose estimated degrees are too large, as indices of vector degree. Once we fix a threshold on the size of the index set of vector degree, these two strategies yield the index set.
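Strategy 1 is simple enough to sketch directly; the following helper, with illustrative names, collects every index involved in an adjacent pair and trims the result to the preset threshold at random:

```python
import random

def choose_vector_degree_indices(isoc, threshold):
    """Strategy 1: take the adjacent indices of the ISoC as the index set J."""
    I = sorted(isoc)
    J = set()
    for a, b in zip(I, I[1:]):
        if b == a + 1:                 # adjacent pair of cube indices
            J.update((a, b))
    while len(J) > threshold:          # trim J at random down to the threshold
        J.discard(random.choice(sorted(J)))
    return J
```

For example, the ISoC \(\{1,2,5,7,8,20\}\) yields \(J=\{1,2,7,8\}\) when the threshold is at least 4.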

Degree of Trivium on All IV Bits. We have estimated the upper bound on the degree of the output bit in all IV bits for R-round Trivium by our vector degree estimation algorithm ([25, Algorithm 11]) with \(mode = 1\). Each time, we set the threshold to 8 to obtain the index set of vector degree and run the degree estimation procedure with this index set. We repeat this 200 times and choose the minimum value as the upper bound on the output bit’s degree. The results, compared with the numeric mapping technique, are illustrated in Fig. 1. In our experiments, the upper bound on the output bit’s degree reaches the maximum degree 80 only at 805 rounds using the vector numeric mapping, but already at 794 rounds using the numeric mapping. Besides, the exact degree [6] sometimes decreases as the number of rounds increases. The vector numeric mapping can also capture this phenomenon, whereas the numeric mapping cannot, because the vector numeric mapping eliminates the repeated degree estimation of variables whose indices are in the index set of vector degree.

Fig. 1. Degree evaluations by vector numeric mapping and numeric mapping

Degree of Trivium on Partial IV Bits. In fact, the degree evaluation algorithm performs even better when there are a few adjacent indices in the ISoC. We generate the ISoC in the following way. First, randomly generate a set \(I_0 \subset [n]\) of size 36 that does not contain adjacent indices. Next, find a set \(I\supset I_0\) of size \(36+l\) such that there are exactly l pairs of adjacent indices in I. Then, estimate the degree for the ISoC I by the numeric mapping technique and by the vector numeric mapping technique, where the size of the index set of vector degree is set to 8, and calculate the difference in the maximum number of zero-sum rounds between the two techniques. For each l, we repeat this 200 times and record the average of the differences; see Table 3 for details.

Table 3. Average improved number of rounds by vector numeric mapping relative to numeric mapping technique

When the ISoC contains adjacent indices, the vector numeric mapping technique improves on the numeric mapping technique by more than 27 rounds on average, and by up to 45 rounds. When there are no or few adjacent indices, the difference between the degree evaluations of the two techniques is small, which explains the success of the degree evaluation for cubes with no adjacent indices by the numeric mapping in [18]. As l increases, the improved number of rounds first increases and then slowly decreases, because the index set of vector degree cannot contain all adjacent indices when l is large. Even so, the vector numeric mapping technique still improves on the numeric mapping technique by about 30 rounds.

Complexity and Precision Comparison of Degree Evaluation. In theory, the complexity of degree evaluation using the vector numeric mapping technique is at most \(2^{|J|}\) times that of degree evaluation using the numeric mapping technique, where J is the index set of vector degree. As evidenced by the experiments above, our degree estimation is notably more accurate when the ISoC involves only a small adjacent subset. Moreover, since the complexity is exponential in the size of the index set of vector degree, we typically limit this size to at most 10.

The runtime of our algorithm for 788-round Trivium with various sizes is detailed in [25, Table 9]. In comparison with degree estimation based on the division property [6], the difference in precision between the two methods is not substantial when the ISoC contains only a few adjacent indices. However, our algorithm is significantly faster, as that method requires nearly 20 min to return degree evaluations for 788-round Trivium.

6.4 The Complexity of Fast Cube Search

To validate the effectiveness of our pruning technique, we conducted a comparative experiment, replicating part of an experiment by Liu [18] that searched for 837-round distinguishers using cubes of size 37 with non-adjacent indices. Our search algorithm made a total of 9296 calls to the degree estimation algorithm to complete the search of the entire space, while an exhaustive search would require over 38320568 calls. This clearly demonstrates the effectiveness of our pruning technique.

6.5 Practical Key Recovery Attacks

Benefiting from the new framework of superpoly recovery and the ISoC search technique, we can obtain a large number of special ISoCs within an acceptable time, so that we can mount practical correlation cube attacks against Trivium with a large number of rounds. For the correlation cube attacks, we choose the threshold of the conditional probability to be \(p = 0.77\). We will not elaborate further on these parameters.

Practical Key Recovery Attacks Against 820-Round Trivium

Parameter Settings. Set \(\varOmega \) to be the total space of ISoCs of size \(k = 38\). Set the index set \(J=\{0, 1, 2, i, i+1\}\) and the threshold of degree d to 41 in the ISoC search algorithm of Sect. 5.3, where i ranges from 3 to 26. We call the search algorithms in parallel for different i.

Attacks. We finally obtained 27428 special ISoCs of size 38, whose concrete information can be found in our git repository, including the ISoCs, superpolys, factors, and balancedness of the superpolys, where the balancedness of each superpoly is estimated by randomly testing 10000 keys. These ISoCs are sorted by the balancedness of their superpolys in descending order. Finally, we choose the first \(2^{13}\) ISoCs to mount key recovery attacks.

For the first \(2^{13}\) ISoCs, we call Algorithm 1 to generate the sets \(\mathcal {T}\) and \(\mathcal {T}_1\), whose elements are pairs composed of a factor of a superpoly and the corresponding special ISoC, and whose sizes are 30 and 31, respectively. The results are listed in [25, Appendix L], where the probabilities are estimated by randomly testing 10000 keys. The details of the ISoC corresponding to each factor h are listed in our git repository.

In the online phase, after computing all the values of the superpolys, one obtains the sets of equations \(G_0\) and \(G_1\). To make full use of the equations, one should recover the key as follows:

  1.

    For all \(54\le i \le 79\), guess the value of \(k_{i}\) if the equation for \(k_{i}\) is not in \(G_0 \cup G_1\).

  2.

    For i from 53 to 0, if the equation for \(k_{i}+k_{i+25}k_{i+26}+k_{i+27}\) or \(k_{i}+k_{i+25}k_{i+26}\) is in \(G_0 \cup G_1\), recover the value of \(k_i\). Otherwise, guess the value of \(k_{i}\).

  3.

    Go through all possible values of the \(k_i\) guessed in Step 1 and Step 2, and repeat Step 1 until the solution is correct.

  4.

    If none of the solutions is correct, adjust the equations in \(G_0\) according to Step 20 in Algorithm 2 and go to Step 1.

Note that the complexity of recovering the value of \(k_i\) for \(i \le 53\) is \(\mathcal {O}(1)\), since the values of \(k_{i+25}\), \(k_{i+26}\) and \(k_{i+27}\) are already known. In our experiments, the factors are all of the form \(k_{i}+k_{i+25}k_{i+26}+k_{i+27}\) for \(0\le i\le 52\), or \(k_{53}+k_{78}k_{79}\), or \(k_i\) for \(54 \le i \le 65\). Thus the number of key bits obtained from the equations is always equal to the number of equations.

We now analyze the complexity of our improved correlation cube attack. Since the set \(\mathcal {I}\) of ISoCs is fixed, for each fixed key \(\boldsymbol{k}\) the corresponding values of the superpolys of all ISoCs are determined. Therefore, we can calculate the time complexity of recovering this \(\boldsymbol{k}\) as follows. The complexity of computing the values of the superpolys remains \(\mathcal {O}(2^{13}\cdot 2^{38})\). For the brute-force key recovery, the complexity can be determined by combining the values of the superpolys with the guessing strategy, which gives the numbers of equations in \(G_0\) and \(G_1\), say \(a_{\boldsymbol{k}}\) and \(b_{\boldsymbol{k}}\), respectively, as well as the number of incorrect equations in \(G_0\), denoted by \(e_{\boldsymbol{k}}\). The complexity of this exhaustive-search phase is then \(2^{80-a_{\boldsymbol{k}}-b_{\boldsymbol{k}}}\cdot \left( \sum _{i=0}^{e_{\boldsymbol{k}}} \left( {\begin{array}{c}a_{\boldsymbol{k}}\\ i\end{array}}\right) \right) \). Thus, the complexity for recovering \(\boldsymbol{k}\) is

$$\mathcal {C}_{\boldsymbol{k}} = \mathcal {O}(2^{13}\cdot 2^{38}) + \mathcal {O}\left( 2^{80-a_{\boldsymbol{k}}-b_{\boldsymbol{k}}}\cdot \left( \sum _{i=0}^{e_{\boldsymbol{k}}} \left( {\begin{array}{c}a_{\boldsymbol{k}}\\ i\end{array}}\right) \right) \right) .$$
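For illustration, \(\mathcal {C}_{\boldsymbol{k}}\) is easy to evaluate numerically; the values of \(a_{\boldsymbol{k}}\), \(b_{\boldsymbol{k}}\), \(e_{\boldsymbol{k}}\) in the example below are hypothetical, as the true values depend on the key and the obtained equations:

```python
from math import comb

def attack_complexity(a_k, b_k, e_k, n_isoc=2**13, cube_size=38):
    """C_k = 2^13 * 2^38 + 2^(80 - a_k - b_k) * sum_{i=0}^{e_k} C(a_k, i)."""
    superpoly_cost = n_isoc * 2**cube_size
    brute_force = 2**(80 - a_k - b_k) * sum(comb(a_k, i) for i in range(e_k + 1))
    return superpoly_cost + brute_force
```

For instance, a key with \(a_{\boldsymbol{k}}=30\), \(b_{\boldsymbol{k}}=20\) and \(e_{\boldsymbol{k}}=2\) costs about \(2^{51}\), dominated by the superpoly evaluations.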

We estimated the proportion of keys with a complexity not larger than \(\mathcal {C}\) by randomly selecting 10000 keys, namely, \({|\{\boldsymbol{k}: \mathcal {C}_{\boldsymbol{k}} \le \mathcal {C}\}|}/{10000}\); the result is listed in Table 4. Because of the extensive key space, we have performed a hypothesis test in [25, Appendix O] to assess whether these proportions accurately approximate the true proportions; our findings indicate a very strong agreement between them. From Table 4, it can be seen that 87.8% of the keys can be practically recovered by the attack. In particular, 58.0% of the keys can be recovered with a complexity of only \(\mathcal {O}(2^{52})\).

Table 4. The proportion of keys with attack complexities not exceeding \(\mathcal {C}\) for 820 rounds

Practical Key Recovery Attacks Against 825-Round Trivium

Parameter Settings. Set \(\varOmega \) to be the total space of ISoCs of size 41. Set the index set \(J=\{0, 1, \cdots , 10\}\setminus \{j_0, j_1, j_2\}\) and the threshold of degree d to 44 in the ISoC search algorithm of Sect. 5.3, where \(j_0 > 2\), \(j_1 > j_0+1\) and \( j_1 + 1 < j_2 < 11\). We call the search algorithms in parallel for different tuples \((j_0, j_1, j_2)\).

Attacks. We finally obtained 12354 special ISoCs of size 41, and we provide their concrete information in our git repository. These ISoCs are sorted by the balancedness of their superpolys in descending order, where the balancedness is estimated by randomly testing 10000 keys. We choose the first \(2^{12}\) ISoCs to mount key recovery attacks.

For the first \(2^{12}\) ISoCs, we call Algorithm 1 to generate the sets \(\mathcal {T}\) and \(\mathcal {T}_1\), whose elements are pairs composed of a factor of a superpoly and the corresponding special ISoC, and whose sizes are 31 and 30, respectively. The results are listed in [25, Appendix M], where the probabilities are estimated by randomly testing 10000 keys. The details of the ISoC corresponding to each factor h are listed in our git repository.

We estimate the proportion of keys with a complexity not larger than \(\mathcal {C}\) by randomly selecting 10000 keys; the result is listed in Table 5. From Table 5, it can be seen that 83% of the keys can be practically recovered by the attack. In particular, 60.9% of the keys can be recovered with a complexity of only \(\mathcal {O}(2^{54})\).

Table 5. The proportion of keys with attack complexities not exceeding \(\mathcal {C}\) for 825 rounds

Practical Key Recovery Attacks Against 830-Round Trivium

Parameter Settings. The parameter settings are the same as those for 825 rounds, except that the threshold of degree d is set to 45. We again call the search algorithms in parallel for different tuples \((j_0, j_1, j_2)\).

Attacks. We finally obtained 11099 special ISoCs of size 41, whose concrete information can be found in our git repository. Besides, these ISoCs are sorted by the balancedness of their superpolys in descending order, where the balancedness is estimated by randomly testing 10000 keys. We choose the first \(2^{13}\) ISoCs to mount key recovery attacks.

For the first \(2^{13}\) ISoCs, we call Algorithm 1 to generate the sets \(\mathcal {T}\) and \(\mathcal {T}_1\), with sizes 25 and 41, respectively. The results are listed in [25, Appendix N], where the probabilities are estimated by randomly testing 10000 keys. The details about the ISoC corresponding to each factor h are listed in our git repository.

We also estimate the proportion of keys with a complexity not larger than \(\mathcal {C}\) by randomly selecting 10000 keys, and the result is listed in Table 6. From Table 6, it can be seen that 65.7% of the keys can be practically recovered by the attack. In particular, 46.6% of keys can be recovered with a complexity of only \(\mathcal {O}(2^{55})\).

Table 6. The proportion of keys with attack complexities not exceeding \(\mathcal {C}\) for 830 rounds

Due to limited computational resources, we were unable to conduct practical validations of the key recovery attacks. Instead, we randomly selected some generated superpolys and verified the model’s accuracy through cross-validation, utilizing publicly accessible code for superpoly recovery. Furthermore, we have performed practical validations for the non-zero-sum cases presented in [25, Table 8] to corroborate the accuracy of our model. In addition, as mentioned in [5], attempting to recover keys would take approximately two weeks on a PC equipped with two RTX 3090 GPUs when the complexity reaches \(\mathcal {O}(2^{53})\). Therefore, on servers with multiple GPUs and nodes, it is feasible to recover an 830-round key within a practical time.

Discussion About the Parameter Selections. Parameter selection is a nuanced process. The number of middle rounds \(r_m\) is determined by the complexity of computing the expression of \(\boldsymbol{g}_{r_m}\). Once \(r_m\) exceeds 200, the expression for \(\boldsymbol{g}_{r_m}\) becomes intricate and challenging to compute, and overly complex expressions also hinder efficient computation of \(\texttt{Coe}(\pi _{\boldsymbol{u}_{r_m}}(\boldsymbol{y}_{r_m}), \boldsymbol{x}^{\boldsymbol{u}})\) by MILP solvers. For the ISoCs, we chose sizes not exceeding 45 to keep the complexity manageable. When searching for ISoCs, we focused on smaller adjacent indices as bases, based on the observation that smaller indices become involved later in the update process of Trivium and consequently usually yield comparatively simpler superpolys. We directly selected these preset index sets as the index sets of vector degree. When determining the threshold for searching good ISoCs, we noticed that a higher threshold tends to result in more complex superpolys, so we typically set the threshold slightly above the size of the ISoCs. In the improved correlation cube attacks, the probability threshold significantly affects the complexity. Too low a threshold increases the number of incorrectly guessed bits \(e_{\boldsymbol{k}}\), raising the complexity; conversely, an excessively high threshold reduces the number of equations in \(G_0\), i.e., \(a_{\boldsymbol{k}}\), prolonging the brute-force search. One can adjust the value of p to obtain a relatively high success probability.

Comparison with Other Attacks. From the perspective of key recovery, our correlation cube attack differs from the attacks in [5, 13, 14] in how key information is leveraged from the superpolys. We obtain equations on the superpolys’ factors through their correlations with the superpolys, whereas [5, 13, 14] directly utilize the equations of the superpolys themselves. This allows us to extract key information even from high-degree, complex superpolys. We also expect this approach to be effective in theoretical attacks and to help extend them to more rounds.

7 Conclusions

In this paper, we propose a variable substitution technique for cube attacks, which greatly improves the computational complexity of superpoly recovery and provides more concrete superpolys in the new variables. To search for good cubes, we give a generalized definition of the degree of a Boolean function and present a degree evaluation method based on the vector numeric mapping technique. Moreover, we introduce a pruning technique to quickly filter ISoCs and encode it as an MILP model for automatic search. These techniques turn out to perform well in cube attacks. We also provide practical verification of some earlier work by other authors and perform practical key recovery attacks on 820-, 825- and 830-round Trivium, improving on the best previous practical attacks by up to 10 rounds to the best of our knowledge. In future work, we will apply our techniques to more ciphers to demonstrate their power.