1 Introduction

Stream ciphers are particularly useful in resource-constrained environments because of their low gate counts. The designers are hence competing to model stream ciphers with as low gate count as possible. In fact, the eStream portfolio saw one of its finalist Grain v1 [6] with a circuit size of 960 GE. However, it has been noted that designing stream ciphers with state size less than twice the key size makes them weak against the well known Time-Memory-Data Trade-Off (TMDTO) Attacks. Hence, it was considered a thumb rule to design stream ciphers with state size more than twice the key size, only to be proved wrong by the introduction of BSW sampling [1, 2], which asks for a state size minimum of 2.5 times the key size (one may refer to [8] for more details). CAESAR (Competition for Authenticated Encryption: Security, Applicability, and Robustness) [4] has recently announced its finalists, and ACORN v3 is one among those [13]. ACORN v3 is a lightweight authenticated stream cipher with a state size of 293, composed of 6 Linear Feedback Shift Registers (LFSRs) and four additional bits. It promises a 128-bit security using a 128-bit secret key and IV.

Given that the present ciphers are designed with well-informed efforts, refuting the designer’s claim are quite challenging and sometimes even elusive. However, there are important observations discovered by the cryptanalysts that help in providing more robust ciphers. This is the reason ACORN has been revised twice and the current version is ACORN v3. In this paper, we try to see how well one can obtain certain portion of the state bits for ACORN v3 and Grain v1, given some keystream bits and the rest of the bits of the state. This is related to sampling resistance as noted in [1, 2]. We calculate the sampling resistance by writing down several equations and feeding them to a SAT solver. Using SAT solving techniques, we have formed a generalized attack on stream ciphers with a state size less than 2.5 times the key size. The online complexity can be best reached to T = M = D = N2/5 in such scenarios. We then observe this attack on Grain v1, a stream cipher from eStream portfolio whose state size is less than 2.5 times the key size. We will apply similar techniques for mounting an attack on Grain v1 as we have done for ACORN v3. Finally, we will list the possible TMDTO parameters and compare them.

1.1 Overview of the Paper

The paper is divided under the following sections:

  • In Section 2, we will describe ACORN v3 relevant to our work.

  • In Section 3, we provide ways to recover a certain portion of ACORN v3 using a fixed keystream sequence and guessing the remaining state bits.

  • In Section 4, we discuss the possible TMDTO parameters for ACORN v3.

  • In Section 5, we discuss how the attack can be applied to Grain v1.

  • In Section 6, we compare the two attacks on ACORN v3 and Grain v1, and comment on its feasibility.

  • In Section 7, we conclude our work.

2 Description of ACORN v3

We briefly state here the description of ACORN v3 relevant to our work. We assume the plaintext message to be a stream of 0’s, and we concentrate only on the Pseudo Random Generation Algorithm (PRGA) that provides the keystream. We omit the Key Loading Algorithm (KLA) and the Key Scheduling Algorithm (KSA) of the cipher that are available at [13]. This is because the recovery of secret state bits during the PRGA and further the TMDTO attack can be studied irrespective of the initialization process. As stated before, ACORN v3 has six LFSRs and four additional bits concatenated to form the 293 bit state. The block diagram of ACORN is represented in Fig. 1 where ft represents the feedback bit and mt represents the message bit at tth step [13]. We denote the state of the cipher by \(\mathscr{S}_{t}\) and its respective bits as St+ 0St+ 292. The cipher has the following three functions.

1. Output Function::

The output bit zt for any state t is generated as

$$\begin{array}{@{}rcl@{}} z_{t} &=& S_{t + 12} \oplus S_{t + 154}\\ &\oplus& maj(S_{t + 235},S_{t + 61},S_{t + 193})\\ &\oplus& ch(S_{t + 230}, S_{t + 111}, S_{t + 66}) \end{array} $$
(1)

where maj() and ch() functions are defined as following:

$$\begin{array}{@{}rcl@{}} maj(x,y,z) & =& x\& y \oplus y \& z \oplus z \& x \end{array} $$
(2)
$$\begin{array}{@{}rcl@{}} ch(x,y,z) & =& x\& y \oplus (\sim x)\& z \end{array} $$
(3)
2. Feedback Function::

The feedback bit ft for any state t is generated as

$$\begin{array}{@{}rcl@{}} f_{t} &=& S_{t + 0} \oplus (\sim S_{t + 107}) \oplus maj(S_{t + 244}, S_{t + 23}, \\ &&S_{t + 160}) \oplus (ca_{t} \& S_{t + 196}) \oplus (cb_{t} \& z_{t}) \end{array} $$
(4)
3. State Update Function: :

Before performing the shift, the bits St+ 289, St+ 230, St+ 193, St+ 154, St+ 107, St+ 61 are updated as follows:

$$\begin{array}{@{}rcl@{}} S_{t + 289} & =& S_{t + 289} \oplus S_{t + 235} \oplus S_{t + 230} \end{array} $$
(5)
$$\begin{array}{@{}rcl@{}} S_{t + 230} & =& S_{t + 230} \oplus S_{t + 196} \oplus S_{t + 193} \end{array} $$
(6)
$$\begin{array}{@{}rcl@{}} S_{t + 193} & =& S_{t + 193} \oplus S_{t + 160} \oplus S_{t + 154} \end{array} $$
(7)
$$\begin{array}{@{}rcl@{}} S_{t + 154} & =& S_{t + 154} \oplus S_{t + 111} \oplus S_{t + 107} \end{array} $$
(8)
$$\begin{array}{@{}rcl@{}} S_{t + 107} & =& S_{t + 107} \oplus S_{t + 66} \oplus S_{t + 61} \end{array} $$
(9)
$$\begin{array}{@{}rcl@{}} S_{t + 61} & =& S_{t + 61} \oplus S_{t + 23} \oplus S_{t + 0} \end{array} $$
(10)

And then the next bit entering the state is initialized with the feedback bit:

$$ S_{t + 293} = f_{t} $$
(11)
Fig. 1
figure 1

The internal state of ACORN cipher

3 Methods to Recover Certain Bits of the State for ACORN v3

The underlying motivation of BSW sampling [1, 2] is the fact that certain bits of the state can be recovered by observing the keystream sequence zt and guessing the remaining part of the state. This reduces the search space and offers a wider range of parameters to choose from in TMDTO attack. We consider two approaches here. The first one is using the SAT solver, and the other one is by discovering the equations by hand using trial and error.

3.1 Using SAT Solver

Towards this, we first form a family of equations and then feeding them into a SAT solver. While building equations, the degree increases rapidly, which makes it very difficult to find solutions. Hence, we have to adopt a specific approach for formulating equations by introducing new variables. This is in line of [11]. Consider some PRGA round t of ACORN v3. The equations for the same round are:

  1. 1.

    1 output bit equation,

  2. 2.

    1 feedback bit equation, and

  3. 3.

    6 state update equations.

At the beginning of PRGA, the adversary has 293 state variables S0,S1,…,S292. The adversary has access to an -length keystream z0, z1, …z− 1. We will now explain how the output equation is introduced into the system of equations. The output equation as mentioned in Eq. 1 is:

$$\begin{array}{@{}rcl@{}} z_{t} &=& S_{t + 12} \oplus S_{t + 154} \oplus maj(S_{t + 235},S_{t + 61},S_{t + 193}) \\ &\oplus& ch(S_{t + 230}, S_{t + 111}, S_{t + 66}) \end{array} $$
(12)

To add an equation to the SAT solver, the equations are represented in a way such that it is zero in the ring of Boolean polynomials. That is, the output equation is written as

$$\begin{array}{@{}rcl@{}} z_{t} &\oplus& S_{t + 12} \oplus S_{t + 154} \oplus maj(S_{t + 235},S_{t + 61},S_{t + 193}) \\ &\oplus& ch(S_{t + 230}, S_{t + 111}, S_{t + 66}) \equiv 0, \end{array} $$
(13)

for t = 0,1,2,…, − 1 and added to the system. Thus we have an array of output equations as

$$\begin{array}{@{}rcl@{}} z_{0} &\oplus& S_{12} \oplus S_{154} \oplus maj(S_{235},S_{61},S_{193})\\ & \oplus& ch(S_{230}, S_{111}, S_{66}) \equiv 0\\ z_{1} &\oplus& S_{13} \oplus S_{155} \oplus maj(S_{236},S_{62},S_{194})\\ & \oplus& ch(S_{231}, S_{112}, S_{67}) \equiv 0\\ &\vdots&\\ z_{\ell-1} &\oplus& S_{\ell-1 + 12} \oplus S_{\ell-1 + 154}\\ &\oplus& maj(S_{\ell-1 + 235},S_{\ell-1 + 61},S_{\ell-1 + 193})\\ & \oplus& ch(S_{\ell-1 + 230}, S_{\ell-1 + 111}, S_{\ell-1 + 66}) \equiv 0 \end{array} $$

Next, we discuss the inclusion of feedback bit equation into the system of equations. The equation as mentioned in Eq. 4 for PRGA is:

$$\begin{array}{@{}rcl@{}} f_{t} &=& S_{t + 0} \oplus (\sim S_{t + 107}) \oplus maj(S_{t + 244},\\ && S_{t + 23}, S_{t + 160}) \oplus S_{t + 196} \end{array} $$
(14)

However, the feedback bit generated is not known. Thus directly substituting the state variable St+ 293 by feedback equations increases non-linearity. Instead, the we introduce new variables f0,f1,…f− 1 and add these equations to the SAT solver in the following manner:

$$\begin{array}{@{}rcl@{}} f_{0} &\oplus& S_{0} \oplus (\sim S_{107}) \oplus maj(S_{244}, S_{23}, S_{160})\\ &\oplus& S_{196} \equiv 0\\ f_{1} &\oplus& S_{1} \oplus (\sim S_{108}) \oplus maj(S_{245}, S_{24}, S_{161})\\ &\oplus& S_{197} \equiv 0\\ &\vdots&\\ f_{\ell -1} &\oplus& S_{\ell -1} \oplus (\sim S_{\ell -1 + 107})\\ &\oplus& maj(S_{\ell -1 + 244}, S_{\ell -1 + 23}, S_{\ell -1 + 160})\\ &\oplus& S_{\ell -1 + 196} \equiv 0 \end{array} $$

By now, 2 new equations and new variables have been introduced into the system. The variables St+ 289,St+ 230,St+ 193, St+ 154, St+ 107, St+ 61 are updated in Step 3 as mentioned earlier. For this, we introduce 6 new variables \({a^{0}_{1}}\), \({a^{0}_{2}}\), \({a^{0}_{3}}\), \({a^{0}_{4}}\), \({a^{0}_{5}}\), \({a^{0}_{6}}\), …, \(a^{\ell -1}_{1}\), \(a^{\ell -1}_{2}\), \(a^{\ell -1}_{3}\), \(a^{\ell -1}_{4}\), \(a^{\ell -1}_{5}\), \(a^{\ell -1}_{6}\) and add the following equations to the system (for t = 0,1,…, − 1):

$$\begin{array}{@{}rcl@{}} {a^{t}_{1}} & \oplus& S_{t + 289} \oplus S_{t + 235} \oplus S_{t + 230} \equiv 0 \end{array} $$
(15)
$$\begin{array}{@{}rcl@{}} {a^{t}_{2}} & \oplus& S_{t + 230} \oplus S_{t + 196} \oplus S_{t + 193}\equiv 0 \end{array} $$
(16)
$$\begin{array}{@{}rcl@{}} {a^{t}_{3}} & \oplus& S_{t + 193} \oplus S_{t + 160} \oplus S_{t + 154}\equiv 0 \end{array} $$
(17)
$$\begin{array}{@{}rcl@{}} {a^{t}_{4}} & \oplus& S_{t + 154} \oplus S_{t + 111} \oplus S_{t + 107}\equiv 0 \end{array} $$
(18)
$$\begin{array}{@{}rcl@{}} {a^{t}_{5}} & \oplus& S_{t + 107} \oplus S_{t + 66} \oplus S_{t + 61} \equiv 0 \end{array} $$
(19)
$$\begin{array}{@{}rcl@{}} {a^{t}_{6}} &\oplus& S_{t + 61} \oplus S_{t + 23} \oplus S_{t + 0}\ \quad \equiv 0 \end{array} $$
(20)

Since new variables have been introduced, new equations need to be introduced to maintain consistency of the system. That is, the following equations are added to the system:

$$\begin{array}{@{}rcl@{}} {a^{t}_{1}} &\oplus& S_{t + 288}\equiv 0 \end{array} $$
(21)
$$\begin{array}{@{}rcl@{}} {a^{t}_{2}} &\oplus& S_{t + 229}\equiv 0 \end{array} $$
(22)
$$\begin{array}{@{}rcl@{}} {a^{t}_{3}} &\oplus& S_{t + 192}\equiv 0 \end{array} $$
(23)
$$\begin{array}{@{}rcl@{}} {a^{t}_{4}} &\oplus& S_{t + 153}\equiv 0 \end{array} $$
(24)
$$\begin{array}{@{}rcl@{}} {a^{t}_{5}} &\oplus& S_{t + 106}\equiv 0 \end{array} $$
(25)
$$\begin{array}{@{}rcl@{}} {a^{t}_{6}} &\oplus& S_{t + 60}\equiv 0 \end{array} $$
(26)

for t = 0,1,…, − 1. Finally, we substitute the feedback bit into the state variable:

$$ S_{293+t} = f_{t} \quad \quad \forall t \in [0,\ell-1]. $$
(27)

Therefore, the number of variables used are 293 + + 6 = 293 + 7 and the number of equations formulated are + + 6 = 8 equations. All the equations are collected and fed to the SAT solver.

We set the SAT solver to find all possible solutions for the above system of equations. In this way, we are guaranteed that if the SAT solver returns only one solution, no other solution exists for the system of equations, and hence, we can solve for the state. However, in few cases of our experiments, we could not achieve that. For example, when we consider recovery of 60 bits with the help of 70 keystream bits, we sometimes obtain two solutions. The reason for the same is that the number of keystream bits is not enough and thus the SAT solver provides more solutions instead of a unique solution.

We use the SAT solver Cryptominisat-2.9.6 available with Sage-7.6 [10]. The experiments were performed on a laptop having hardware configuration Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz and 8 GB RAM running with Ubuntu-16.10. A few experimental data are provided in Table 1 where each row is based on 215 experiments.

Table 1 Experimental results for solving the equations

3.2 Formation of Equations by Observation, not using SAT Solver

In this section, we build the system of equations used to recover 49 bits of internal state by using first 49 bits of keystream. To perform this recovery, we need to fix 10 bits of internal state with a particular pattern and guess remaining state bits. The internal state bits to be recovered are represented by set \(\mathscr{R} =\mathscr{R}_{1}\cup \mathscr{R}_{2}\), where \(\mathscr{R}_{1}= \{ S_{t + 107} : t = 0,\hdots , 43\}\) and \(\mathscr{R}_{2}= \{ S_{t + 56} : t = 0,\hdots , 4\} \). The Eq. 1 for generating keystream can be written as

$$\begin{array}{@{}rcl@{}} z_{t} &=&S_{t + 12} \oplus \overline{S_{t + 154}} \oplus S_{t + 235} \overline{S_{t + 61}}\\ &\oplus& S_{t + 235}\overline{S_{t + 193}} \oplus \overline{S_{t + 193}} \overline{S_{t + 61}}\\ &\oplus& \overline{S_{t + 230}} S_{t + 111} \oplus \overline{S_{t + 230}}S_{t + 66} \oplus S_{t + 66}. \end{array} $$
(28)

Note that in the above equation, over-lined bits are feedback bits. The state bits are updated according to the following equations before generating the output bit:

$$\begin{array}{@{}rcl@{}} S_{t + 289} & =& S_{t + 289} \oplus S_{t + 235} \oplus S_{t + 230} \end{array} $$
(29)
$$\begin{array}{@{}rcl@{}} S_{t + 230} & =& S_{t + 230} \oplus S_{t + 196} \oplus S_{t + 193} \end{array} $$
(30)
$$\begin{array}{@{}rcl@{}} S_{t + 193} & =& S_{t + 193} \oplus S_{t + 160} \oplus S_{t + 154} \end{array} $$
(31)
$$\begin{array}{@{}rcl@{}} S_{t + 154} & =& S_{t + 154} \oplus S_{t + 111} \oplus S_{t + 107} \end{array} $$
(32)
$$\begin{array}{@{}rcl@{}} S_{t + 107} & =& S_{t + 107} \oplus S_{t + 66} \oplus S_{t + 61} \end{array} $$
(33)
$$\begin{array}{@{}rcl@{}} S_{t + 61} & =& S_{t + 61} \oplus S_{t + 23} \oplus S_{t + 0} \end{array} $$
(34)

Thus, Eq. 28 can be written as

$$\begin{array}{@{}rcl@{}} S_{t + 107} &=& z_{t} \oplus S_{t + 12} \oplus S_{t + 154} \oplus S_{t + 111}\\ &\oplus& S_{t + 235} (S_{t + 61} \oplus S_{t + 23} \oplus S_{t + 0})\\ &\oplus& S_{t + 235}(S_{t + 193} \oplus S_{t + 160} \oplus S_{t + 154})\\ &\oplus& (S_{t + 193} \oplus S_{t + 160} \oplus S_{t + 154})(S_{t + 61}\\ & \oplus& S_{t + 23} \oplus S_{t + 0}) \oplus (S_{t + 230}\\ & \oplus& S_{t + 196} \oplus S_{t + 193}) S_{t + 111}\\ & \oplus& (S_{t + 230} \oplus S_{t + 196} \oplus S_{t + 193})S_{t + 66}\\ & \oplus& S_{t + 66}, \end{array} $$
(35)

which makes the recovery simpler, because all the bits on the RHS of the equation are state bits (and not feedback bits) for t = 0,…,32. However when we place t = 33,…,48 in Eq. 35, feedback bits are also involved and need to be calculated.

Now we use Eq. 35 to recover internal state bits of set \(\mathscr{R}_{1}\). The recovery of state bits is made in a certain order. For example, if we attempt to recover S107 by placing t = 0 in Eq. 35, then S111 appears on the RHS of the equation and requires the knowledge of S111. Thus, S111 is recovered before performing the recovery of S107.

We define four sets \(\mathscr{R}_{3}, \mathscr{R}_{4}, \mathscr{R}_{5} \), \(\mathscr{R}_{6}\), where

$$\begin{array}{@{}rcl@{}} \mathscr{R}_{3}&=&\{ S_{t + 107} : t = 40, 36,\hdots, 0\}\\ \mathscr{R}_{4}&=& \{ S_{t + 107} : t = 41, 37,\hdots, 1\}\\ \mathscr{R}_{5}&=& \{ S_{t + 107} : t = 42, 38,\hdots, 2\}\\ \mathscr{R}_{6}&=& \{ S_{t + 107} : t = 43, 39,\hdots, 3\} \end{array} $$

and each \(\mathscr{R}_{i} \subset \mathscr{R}_{1}\), for i = 3…,6. The order of recovery of state bits is \(\mathscr{R}_{3}, \mathscr{R}_{4}, \mathscr{R}_{5}, \mathscr{R}_{6}\) and \(\mathscr{R}_{2} \), respectively, i.e., the state bits of \(\mathscr{R}_{3}\) are recovered first then \(\mathscr{R}_{4}\) and so on. For each set \(\mathscr{R}_{i} : i = 2 \hdots , 6 \), the higher index elements are recovered first. We need not fix any internal state bits for recovering \(\mathscr{R}_{1}\). However, to recover \(\mathscr{R}_{2}\), the internal state bits are fixed according to Table 2. Let the set \(\mathscr{F}\) represent the internal state bits which are fixed according to Table 2.

Table 2 State bits fixed

Now we describe recovery of \(\mathscr{R}_{3}\). The internal state bit S147 is recovered by substituting t = 40 in Eq. 35. From this, we have

$$\begin{array}{@{}rcl@{}} S_{147} &=&z_{40} \oplus S_{52} \oplus \overline{S_{194}} \oplus S_{151} \oplus S_{275} (S_{101}\\ &\oplus& \overline{S_{63}} \oplus S_{40}) \oplus S_{275}(\overline{S_{233}} \oplus \overline{S_{200}}\\ &\oplus& \overline{S_{194}} ) \oplus(\overline{S_{233}} \oplus \overline{S_{200}} \oplus \overline{S_{194}}) (S_{101}\\ &\oplus& \overline{S_{63}} \oplus S_{40})\oplus (S_{270} \oplus \overline{S_{236}} \oplus \overline{S_{233}}) S_{151}\\ &\oplus& (S_{270} \oplus \overline{S_{236}} \oplus \overline{S_{233}})S_{106} \oplus S_{106}. \end{array} $$
(36)

In Eq. 36, all the bits appearing on the RHS of the equation are guessed, except the over-lined bits. The over-lined bits are feedback bits and not internal state bits due to Eq. 34. Thus, we need to guess more internal state bits to calculate the value of S63, S194, S200, S233, and S236 using Eq. 34. In this way, we recover S147.

Now the internal state bit of S143 is recovered by placing t = 36 in Eq. 35 and we derive

$$\begin{array}{@{}rcl@{}} S_{143} &=&z_{36} \oplus S_{48} \oplus {S_{190}} \oplus S_{147} \oplus S_{271} (S_{97} \\ & \oplus& {S_{59}} \oplus S_{36}) \oplus S_{271} ({S_{229}} \oplus \overline{{S_{196}}} \oplus {S_{190}}) \\ &\oplus& ({S_{229}} \oplus \overline{{S_{196}}} \oplus {S_{190}}) (S_{97} \oplus {S_{59}} \oplus S_{36}) \\ & \oplus& (S_{266} \oplus \overline{{S_{232}}} \oplus {S_{229}}) S_{147} (\oplus S_{266} \oplus \overline{{S_{232}}}\\ & \oplus& {S_{229}})S_{102} \oplus S_{102}. \end{array} $$
(37)

Similarly, in Eq. 37, all the state bits appearing on the right side of the equation need to be guessed, except S271,S190 and the over-lined bits. The internal state bits S271 and S190 are fixed according to Table 2. The over-lined bits are calculated using Eq. 34. Thus, we need to guess more internal state bits to calculate the value of S196,S232 and recover S143.

The remaining state bits of \(\mathscr{R}_{3}\) i.e. S139, S135, …, S107 are recovered by substituting t = 32,28,…,0, respectively, in Eq. 35. While placing t = 32,28,…,0 in Eq. 35, the internal state bits appearing on the RHS of the equation are guessed, except state bits belonging to \(\mathscr{R}\) and \(\mathscr{F}\). Following the same methodology, the internal state bits of set \(\mathscr{R}_{4}, \mathscr{R}_{5}\) and \(\mathscr{R}_{6}\) are recovered.

To recover the state bits of set \(\mathscr{R}_{2}\), a similar procedure is followed, except for Eq. 28, which is rewritten as

$$\begin{array}{@{}rcl@{}} S_{t + 12} &=&z_{t} \oplus S_{t + 107} \oplus S_{t + 154} \oplus S_{t + 111}\\ &\oplus& S_{t + 235} (S_{t + 61} \oplus S_{t + 23} \oplus S_{t + 0})\\ &\oplus& S_{t + 235}(S_{t + 193} \oplus S_{t + 160}\oplus S_{t + 154})\\ &\oplus& (S_{t + 193} \oplus S_{t + 160} \oplus S_{t + 154})(S_{t + 61}\\ &\oplus& S_{t + 23} \oplus S_{t + 0}) \oplus (S_{t + 230} \oplus S_{t + 196}\\ &\oplus& S_{t + 193}) S_{t + 111} \oplus (S_{t + 230} \oplus S_{t + 196}\\ &\oplus& S_{t + 193})S_{t + 66} \oplus S_{t + 66}. \end{array} $$
(38)

Thus, the internal state bits S56,…,S60 are recovered by using t = 44,…,48 in Eq. 38, respectively. Another difference between recovery of \(\mathscr{R}_{1}\) and \(\mathscr{R}_{2}\) is that it is not necessary to recover the higher indexed elements first (as done before).

In this way, we recover 49 bits of \(\mathscr{R}\) by fixing the 10 internal state bits of set \(\mathscr{F}\) and guessing the remaining 234 state bits. However, there are nine internal state bits, i.e., S284,…,S292 which have not appeared in the equations used for recovery. However, these bits are also considered as guessed bits during application of TMDTO attack. The equations used for recovery of \(\mathscr{R}\) have been mentioned in Table 6 in the Appendix. The over-lined state bits and underlined state bits in Table 6 in the Appendix are feedback bits and fixed state bits (according to Table 2), respectively.

4 Complexity of TMDTO Attack for ACORN v3

Now we will describe the TMDTO attack in complete detail. We have a state size of n = 293 bits. Thus, the standard TMDTO formula [1, 2] with a single table will be as follows:

  • TM2D2 = N2, where N = 2n,

  • D2T,

  • \(P = \frac {N}{D}\).

During the preprocessing phase, we will prepare a table with m rows and t columns, where mt2 = N for a successful attack. The number of tables is \(\frac {t}{D}\) and given a single table we have t = D. Each row of the table contains a chain of t elements. Consider that a specific state of n = 293 bits is ζ and f is the one way function. Here by one way function f, we mean that the cipher with the state ζ will be clocked for n times to generate n keystream bits. These n bits will be loaded into the new state, which is called η. That is η = f(ζ). We will start with a random state and then generate a row of t elements by this method. There will be m such rows. Thus, the total table size is mt. However, the complete row will not be saved. Only the starting and the final element will be kept. Thus, the storage requirement of the table will be O(m), which is the memory parameter M.

4.1 Knowledge of 47 bits of State from 47 Keystream Bits

Now consider the case when we are able to recover ψ bits of the state from ψ consecutive keystream bits and the rest of the state bits. In this case, we consider a fixed pattern for the keystream bits and only when that pattern is found in the keystream, we try to search the state in the table. Thus, in this case, we consider a state size of nψ bits and the parameters are referred as N = 2nψ, P,M,T,D. Let us now consider the exact parameters referring to Table 1, where ψ = 47. Thus, TMD′2 = N′2 = 22(293 − 47). Let us consider D′2 = T. Thus, we have TM = 2293 − 47 = 2246. Now, one can consider, T = M = 2123 and D = 261.5. However, as we have discussed that during the online phase, we can only mount the attack when a specific ψ-bit pattern comes, we have D = 2ψD. Thus, finally, we will have the parameters T = T = 2123, M = M = 2123, D = 2ψD = 247 ⋅ 261.5 = 2108.5, \(P = P^{\prime } = \frac {N^{\prime }}{D^{\prime }} = 2^{184.5}\). This provides the maximum of online parameters as 2123, which is less than the exhaustive secret key search of complexity 2128. However, as expected, the pre-processing time is much larger than the exhaustive key search.

At this point, we would like to explain the “unit” cost related to exact complexity. Such unit cost may involve several computations related to the cipher operations. In a most generic way, given a k-bit secret key, the exhaustive attack asks for the complexity of 2k units, where each unit may require several CPU clocks. While mounting the TMDTO attack the same situation is valid. Thus, in our technique, we also consider all the operations as unit cost. However, we will point out a few cases when our calculations are most costly, and that should be taken care of in the complexity analysis. For example, simply generating a 293-bit keystream (that will become the state η) of ACORN v3 from a state ζ requires 0.088 sec in our computing facility. However, to recover the 47 bits of the state from 47 bits of keystream and the remaining state bits requires a time of 0.076 sec, which is almost as same as the time taken to generate ζ. Thus, no additional complexity is required for solving. Hence for this scenario, our parameters are as follows. We can take T = 2122,M = 2124 and D = 261. Then, T = T⋅ 20 = 2122, M = M = 2124, D = 2ψD = 247 ⋅ 261 = 2108, \(P = P^{\prime } = \frac {N^{\prime }}{D^{\prime }} = 2^{185}\).

4.2 Knowledge of 53 Bits of State from 60 Keystream Bits

We follow a similar procedure as mentioned in Section 4.1. However, when the SAT solver is populated with equations and is set to find all possible solutions for 53 state bits using only 53 keystream bits, the SAT solver fails to find a unique solution. Instead, we get multiple solutions, where each solution provides the same 53-bit keystream pattern. To combat this problem, we involve a new idea. Instead of searching for a 53-bit pattern (say 53 continuous 0’s), we search for a 300-bit pattern where the first 53-bit sequence and the last 7-bit sequence are fixed (say to 0’s). This is based on the fact that the keystream sequence generated by all solutions are different. The SAT solver identifies the difference between the sequence of last 7 bits and removes all additional solutions. However, this gives us an additional Data complexity of 27. Considering this constraint into the SAT solver in a similar fashion (as explained in Section 3.1), we get the data mentioned in Table 1. However, in very few cases, two solutions sets are possible which generate the same keystream. Since the proportion is very small and our success probability is \(\frac {2^{14}-1}{2^{14}}\), the time complexity should be multiplied by \(T^{\prime } = T^{\prime } \times \frac {2^{14}}{2^{14}-1} \approx T^{\prime } \times 1 = T^{\prime }\). However, we still attempt to deal with this edge case scenario of two solutions. The idea is to discard the second solution during the offline phase and continue with the first solution set. The matrix stopping rule ensures the entire search space is covered with a negligible collision. During the online phase, the adversary can access few more keystream bits following the fixed pattern in the keystream and hence conclude with the final solution. Our experiments show that 7 more keystream bits, i.e., 67 keystream bits in total are enough to find a unique solution.

Similar to Section 4.1, the time taken for solving equations is of the same order of generating ζ, hence T = T.

  • T = T = 2120 is the total time complexity of the attack,

  • M = M = 2120 is the amount of memory required,

  • D = D× 260 = 2120 where D = 260, since the adversary must succeed in finding a 60-bit pattern,

  • \(P = \frac {N^{\prime }}{D^{\prime }} = 2^{180}\) is the preprocessing time for formulating tables.

4.3 Knowledge of 49 Bits of State From 49 Keystream Bits and Fixing 10 State Bits

Here, we consider the third approach which is similar to what has been recently considered in [8] for a TMDTO attack against Lizard [5]. We consider that ψ state bits can be recovered from ψ many keystream bits and rest of the state bits, but τ many state bits has to be fixed to a specific pattern. This follows the idea mentioned in Section 3.2. In this case, we go back to the single preprocessing table. We will consider ψ = 49 here, with τ = 10. That is from ψ bits of keystream and the remaining (nψ) state bits (out of which τ are fixed to a specific pattern), we will be able to solve for the ψ bits of the state. The initial table preparation goes as follows. We start with a (nψτ) bit random pattern and then take a specific pattern for ψ. Also, the fixed pattern of ϕ is known. Now, using the equations as described in Section 3.2, we solve for the rest ψ bits of the state. This gives the complete state. Then we run the PRGA for nτ times. The initial ψ bits will be as fixed. The remaining (nψτ) pseudorandom bits will be considered as the part of the next state bits. Thus, we have TM = 2nψτ = 2293 − 49− 10 = 2234. Let us take T = 2112 and M = 2122, which also gives, \(D^{\prime } = \sqrt {T^{\prime }} = 2^{56}\). Thus, we will now have the following parameters.

  • D = D⋅ 2ψ + τ = 256 + 49+ 10 = 2115, as the specific pattern ψ should come towards consulting the table, and also for a good success rate to have the specific τ bit pattern in the state we need to try 2τ many times,

  • M = M = 2122,

  • T = T⋅ 2τ = 2112 + 10 = 2122, as we only consult the preprocessing table when the specific ψ bit pattern appears in the keystream, but we need to try 2τ times as we have that more data and here the solution time can be estimated from the operations in the equations and that can subsumed in the PRGA effort,

  • \(P = P^{\prime } = \frac {N^{\prime }}{D^{\prime }} = 2^{234-56} = 2^{178}\).

A similar online parameter in this respect can be obtained considering the equation 5ψ + 2τ = n. Here, ψ = 49, by fixing τ = 10. However, we can easily increase τ to 24 to satisfy the equation 5ψ + 2τ = 5 ⋅ 49 + 2 ⋅ 24 = 293 = n. That is we will fix 24 state bits to a specific pattern. In this case, the online complexity becomes \(T = M = D = 2^{\frac {n-\psi }{2}} = 2^{\frac {293-49}{2}} = 2^{122}\). However, the preprocessing becomes less, which is \(P = 2^{\frac {n+\psi }{2}} = 2^{\frac {293 + 49}{2}} = 2^{171}\).

5 Using SAT Solving Techniques to Mount TMDTO on Grain v1

Using SAT solver and a given keystream sequence, we could recover a portion of the state of ACORN v3. Now we would like to see, how this new approach can help us mount an unconditional TMDTO attack on one of the finalists of eStream portfolio, Grain v1, and compare it with the existing attacks on Grain v1. We will begin by describing Grain v1.

5.1 Description of Grain v1

Grain v1 constitutes of an LFSR and an NFSR, each of size 80 bits, and supports an 80-bit secret key with a 64-bit IV. The encryption process is divided into 3 phases: KLA (Key Loading Algorithm), KSA (Key Scheduling Algorithm), and PRGA (Pseudo Random Generating Algorithm).

Notation

The LFSR bits are represented by l0,l1,…,l79 while the NFSR bits are represented as n0,n1,…,n79. Similarly, the key bits are represented as k0,k1,…,k79 and the IV bits as v0,v1,…,v63.

1. KLA: :

The lower 64 bits of LFSR are initialized with the IV concatenated by a 16-bit string of 1’s, while the NFSR is initialized with the key bits:

$$\begin{array}{@{}rcl@{}} l_{i} &=& v_{i}\text{ for } 0 \leq i \leq 63. \end{array} $$
(39)
$$\begin{array}{@{}rcl@{}} l_{i} &=& 1 \text{ for } 64 \leq i \leq 79, \end{array} $$
(40)
$$\begin{array}{@{}rcl@{}} n_{i} &=& k_{i} \text{ for } 0 \leq i \leq 79, \end{array} $$
(41)
2. KSA: :

The Key Scheduling Algorithm is carried out for 160 rounds, where in each round, the LFSR and NFSR are shifted by 1 bit and a new bit generated by the feedback routine enters the state. The feedback routine for LFSR can be described as

$$\begin{array}{@{}rcl@{}} l_{i + 80} &=& l_{i + 62} \oplus l_{i + 51} \oplus l_{i + 38} \oplus l_{i + 23} \\ &\oplus& l_{i + 13} \oplus l_{i} \oplus z_{i} \end{array} $$
(42)

And the NFSR feedback routine is described as follows:

$$\begin{array}{@{}rcl@{}} n_{i + 80} & =& z_{i} \oplus l_{i} \oplus n_{i + 62} \oplus n_{i + 60} \oplus n_{i + 52} \\ &\oplus& n_{i + 45} \oplus n_{i + 37} \oplus n_{i + 33}\oplus n_{i + 28}\oplus \\ n_{i + 9}&\oplus& n_{i} \oplus n_{i + 63}n_{i + 60} \oplus n_{i + 37} n_{i + 33}\\ &\oplus& n_{i + 15}n_{i + 9} \oplus n_{i + 60}n_{i + 52}n_{i + 45} \\ &\oplus& n_{i + 33}n_{i + 28}n_{i + 21} \\ &\oplus& n_{i + 63}n_{i + 45}n_{i + 28}n_{i + 9} \\ &\oplus& n_{i + 60}n_{i + 52}n_{i + 37}n_{i + 33}\\ &\oplus&a n_{i + 63}n_{i + 60}n_{i + 52}n_{i + 45}n_{i + 37} \\ &\oplus& n_{i + 33}n_{i + 28}n_{i + 21}n_{i + 15}n_{i + 9} \\ &\oplus& n_{i + 52}n_{i + 45}n_{i + 37}n_{i + 33}n_{i + 28}n_{i + 21} \end{array} $$
(43)

where zi is an output bit (used internally during KSA) computed as

$$\begin{array}{@{}rcl@{}} z_{i} & =& n_{i + 1} \oplus n_{i + 2} \oplus n_{i + 4} \oplus n_{i + 10} \\ &\oplus& n_{i + 31} \oplus n_{i + 43} \oplus n_{i + 56} \oplus h(x) \end{array} $$
(44)

and h(x) is a non-linear function:

$$\begin{array}{@{}rcl@{}} h(x) & =& x_{1} \oplus x_{4} \oplus x_{0} x_{3} \oplus x_{2} x_{3} \oplus x_{3} x_{4} \oplus x_{0} x_{1} x_{2} \\ & \oplus& x_{0} x_{2} x_{3} \oplus x_{0} x_{2} x_{4} \oplus x_{1} x_{2} x_{4} \oplus x_{2} x_{3} x_{4} \end{array} $$
(45)

where x0,x1,x2,x3,x4 correspond to the tap positions li+ 3, li+ 25, li+ 46, li+ 64, ni+ 63.

3. PRGA: :

After 160 rounds of KSA, the cipher is ready to produce keystream bits. The output function remains the same from KSA. The LFSR and NFSR registers are updated in a similar fashion as KSA, except that the output bits are not XORed into the state, but produced externally. Note that the output bit is generated prior to a state update.

Now that we have discussed the structure of Grain v1, we shall discuss how by guessing a certain portion of the state bits and assuming a fixed-length keystream sequence, we can recover the remaining state bits.

5.2 Formulating Equations for Grain v1

The procedure of formulating equations for Grain v1 is very similar to that of ACORN v3 as mentioned in Section 3.1. First, 160 variables are introduced into the system, for representing the internal state bits. Next, we assume that we know a portion of the internal state and directly substitute the known values instead of the state variables. We also have a fixed keystream z0,z1,…,zψ− 1. In our case, we do not include any equations of the LFSR, since we are guessing the values of LFSR bits in our attack. Hence, the LFSR bits are clocked as mentioned in Equation 42 without the keystream bit (since we are talking about PRGA rounds here). For updating the NFSR bits, we substitute the known values directly into the NFSR state update routine using Eq. 43 (without the keystream bit) and directly substitute the newly generated bit into the state (instead of introducing an extra variable as done in 3.1); since this is giving us a considerable speed boost.

We clock the state ψ times, and for each round, we calculate the output bit (which will be a boolean polynomial) using Eq. 44 and XOR the polynomial with the corresponding keystream bit zi before adding it to the set of equations. We first convert this system of ANF equations into CNF using the popular dense strategy (available with SAGE). Now the equations are expressible in terms of input to SAT solvers, we can use any SAT solver to solve for a solution. One popular SAT solver is Cryptominisat 5.6, which we have used for performing our experiments.

Now we will describe our attack on Grain v1.

5.3 Recovering 25 State Bits using 30 Keystream Bits

We follow a similar idea as mentioned in Section 4.2. The equations are formulated as mentioned in the Section 5.2. Here, we are considering NFSR bits n0,n1,…,n24 for recovery, by guessing the remaining state bits and using the fixed length keystream pattern. Like in Section 4.2, we are faced with a situation of multiple solutions for the system of equations. Hence, we resort to increasing the number of keystream bits from 25 to 30. The chance of two solutions in this scenario is 1/25. However, we can tackle this situation by following a similar approach as followed in Section 4.2. We search for a 160 bit pattern in the keystream sequence where the first 25 bits and the last 5 bits are a pattern of our choice, say all 0’s. This will increase the data complexity by 230, while our search space is now reduced to N = 2160 − 25 = 2135. We have to find appropriate parameters for T,M and D such that TM′2D′2N′2. We can choose T = M = 267.5 and D = 233.75 such that T = M = 267.5 and D = 263.75. The offline complexity of the attack will stand at P = 2101.25.

Note that the complexity of SAT solver also needs to be accounted for. Similar to Section 4.2, we will compare the average time to solve one system of equations with the time taken to compute 160 keystream bits in SAGE 8.2. The average time taken to solve a system of equations is 0.059 seconds, whereas the time taken to compute 160 keystream bits is 0.0079 seconds. From this, we can assume that the complexity of solving equations to be 22.90 Grain v1 encryptions.

The different possible TMDTO parameters are mentioned in Table 3.

Table 3 The possible TMDTO parameters inclusive of the time complexity of the SAT solver

5.4 Recovering 32 State Bits using 36 Keystream Bits

Here, we will recover NFSR bits n0,n1,…,n31 by guessing remaining state bits and using a 36-bit fixed keystream pattern. The chance of two solutions in this case would be 2− 5. The search space in this case would be N = 2128. A suitable parameter in this case would be T = 264, M = 264,D = 232 and consequently P = N/D = 296.

The average time taken to solve the system of equations in this case is 0.42 seconds. Considering the time taken for producing 160 keystream bits from the previous Section 5.3, the complexity of solving equations in this case would be 24.06. The various possible TMDTO parameters are given in Table 4.

Table 4 The possible TMDTO parameters inclusive of the time complexity of the SAT solver (Recovering 32 state bits using 36 keystream bits)

6 Comparison and Viability of the Two Attacks

We have seen TMDTO attacks on ACORN v3 and Grain v1 in Section 4 and Section 5, respectively. Even in the best case scenario for our attack on ACORN v3, we need a pre-processing complexity of at least 2171 (Section 4.3) which is considerably higher than exhaustive key search. However, a preprocessing complexity higher than exhaustive key search has often been allowed since it is a one-time cost.

Note that for Grain v1, the offline complexity is close to the exhaustive key search. We cannot claim that we have broken the cipher since it is still slower than exhaustive key search. But since the tables need to be prepared only once and then can repeatedly be used during the online phase with a complexity lower than exhaustive key search, the attack does seem practical in future. A comparison of our work for Grain v1 with existing works has been mentioned in Table 5.

Table 5 Comparison of complexities with existing TMDTO attacks on Grain-v1

7 Conclusion

In this paper, we have studied a generalized TMDTO attack by using SAT solver on stream ciphers with state size less than 2.5 times its key size. We have presented a TMDTO attack on ACORN v3 and Grain v1. We have shown how we can formulate equations and by using SAT solver, we can deduce a portion of the state using a fixed keystream pattern and guessing the remaining state bits. Then we have enlisted the possible TMDTO parameters for each case. We observe that the online parameters are lower than exhaustive key search, while the pre-processing complexity still remains higher than exhaustive key search. While our observations do not refute any security claim of the cipher, the study adds certain insight towards the cryptanalysis and may lead to further research in this area.