
1 Introduction

Research in the past decade has revealed that possible quantum attacks on symmetric cryptosystems are not limited to exhaustive key search with Grover’s algorithm [30] or collision search with the BHT algorithm [19]. A notable line of research, initiated by Kuwakado and Morii, shows that Simon’s algorithm breaks many classically secure schemes in polynomial time [13, 42, 45, 46]. Other previous works show how to speed up classical cryptanalytic techniques such as differential and linear cryptanalysis, MITM, and integral attacks [15, 37, 43]. Some recent papers study dedicated quantum collision attacks on concrete hash functions such as SHA-2 and SHA-3 [28, 31, 38, 39].

In this paper, we investigate whether major classical cryptanalytic techniques admit a larger quantum speed-up than previous works have achieved.

Q1 and Q2. For quantum cryptanalysis on symmetric cryptosystems, there are two attack models called Q1 and Q2 [43]. The Q1 model assumes the existence of a quantum computer but oracles of keyed functions are classical, whereas Q2 assumes that oracles are also quantum. For instance, a Q2 attack on a cipher is allowed to query quantum superpositions of messages to the encryption oracle. Such an attack is called a Quantum Chosen-Plaintext Attack (QCPA). This paper studies attacks in the Q2 model.

Significance of Studying Q2 Attacks. The Q1 model is more realistic than Q2 in that oracles in Q1 are the same as classical ones, and thus Q1 attacks become real threats as soon as a large-scale fault-tolerant quantum computer is available. Still, studying Q2 attacks is important for the following two reasons. First, a new non-trivial Q1 attack may be found based on Q2 attacks. For instance, the so-called offline Simon’s algorithm by Bonnetain et al. [12], which is a Q1 attack, is developed by modifying the Q2 attack by Leander and May [47]. Second, Q2 attacks can be converted into Q1 attacks when the key length is sufficiently long: Let \(E_K\) be an n-bit block cipher with k-bit keys. Suppose that \(k > 2n\), and that there is a Q2 attack on \(E_K\) with time complexity \(T < 2^{k/2}\). Now, assume we are in the Q1 model and run the following attack. First, query all the (classical) inputs to \(E_K\), storing the results in a qRAM. Second, simulate the quantum oracle of \(E_K\) by accessing the qRAM, and execute the Q2 attack with the simulated oracle. This is a valid Q1 attack since the resulting complexity \(T' = \max \{T,2^n\}\) is less than \(2^{k/2}\), the complexity of the exhaustive key-search by Grover’s algorithm. Even if \(k \le 2n\), some Q2 attacks may similarly be converted into Q1 if quantum queries are required only on some small portion of inputs.
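The conversion condition above can be checked with a few lines of arithmetic. The following is only a sketch: the function name `q1_conversion` and the sample parameters are hypothetical, and complexities are handled as base-2 logarithms.

```python
def q1_conversion(n, k, log2_T):
    """Return (log2 of T' = max{T, 2^n}, whether T' < 2^(k/2)).

    n: block size in bits, k: key size in bits,
    log2_T: log2 of the time complexity T of the Q2 attack.
    """
    log2_T_prime = max(log2_T, n)          # cost of querying all 2^n inputs vs. the attack itself
    return log2_T_prime, log2_T_prime < k / 2  # compare with Grover's exhaustive key search


# Hypothetical parameters with k > 2n: the converted attack beats Grover.
print(q1_conversion(n=64, k=256, log2_T=100))
# Hypothetical parameters with k <= 2n: the full-codebook conversion does not help.
print(q1_conversion(n=128, k=128, log2_T=40))
```

When \(k > 2n\) and \(T < 2^{k/2}\), the second component is always true, which is exactly the argument given in the text.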

Quantum Speed-Up for Linear Cryptanalysis. Linear cryptanalysis [49] is one of the most fundamental techniques in symmetric cryptanalysis. Kaplan et al. [43] have already shown a quadratic quantum speed-up for linear attacks. However, their distinguisher uses only one-dimensional linear approximations, while classical attacks often exploit multidimensional linear approximations to reduce complexity [32,33,34]. In fact, it is unclear whether Kaplan et al.’s distinguisher can be sped up further even if multiple linear approximations are available, for the following reason.

Kaplan et al.’s distinguisher relies on the quantum counting algorithm [18], which (approximately) counts the number of x satisfying \(F(x)=1\) for an (efficiently computable) Boolean function F. Since classical one-dimensional linear distinguishers work just by counting the number of messages satisfying a linear approximation, such F is naturally defined in the one-dimensional case, and the quantum counting algorithm can be applied.

Meanwhile, classical multidimensional linear distinguishers are based on sophisticated statistical tests exploiting a relationship between capacity and a sum of squared correlations in a clever way. It is highly unclear whether there exists an efficiently computable Boolean function F such that just counting the number of x satisfying \(F(x)=1\) corresponds to performing such statistical tests.

Thus it is natural to ask whether there exists another quantum technique for linear distinguishers running faster than Kaplan et al.’s when a multidimensional linear approximation is available.

Multidimensional linear cryptanalysis has many variants including (multidimensional) zero correlation linear cryptanalysis [9] and generalized linear cryptanalysis on an arbitrary finite abelian group [4]. However, no previous work has shown a quantum speed-up for them. A technique speeding up multidimensional linear distinguishers may lead to a speed-up for such variants, which is of independent interest. It may also lead to a speed-up for some integral distinguishers, because a class of multidimensional zero correlation linear distinguishers corresponds to integral distinguishers based on balanced functions, as shown by Bogdanov et al. and Sun et al. [8, 56].

Quadratic Barrier. Due to Grover’s generic quadratic speed-up for exhaustive search, the only way to break more rounds in the quantum setting, especially for key-recovery and distinguishing attacks, is to achieve a super-quadratic speed-up. Hence, such a speed-up is one of the main goals in quantum cryptanalysis on symmetric-key cryptosystems.

Some previous works have already achieved such a speed-up, even in the Q1 model [16], but the types of techniques are limited. All of them exploit algebraic structures such as hidden periods or shifts of target ciphers by using Simon’s algorithm or a related algorithm solving an algebraic problem.

Moreover, few previous works have succeeded in achieving a more than quadratic speed-up on classical cryptanalytic techniques such as differential, linear, or integral cryptanalysis. The only exception is the quantum versions of (advanced) slide attacks [14, 42], but they also rely on special algebraic structures like hidden periods. Whether a more than quadratic speed-up is possible for other major classical techniques (without relying on periods or shifts) has been an important open problem for years.

1.1 Our Contributions

This paper shows that a quantum speed-up for multidimensional (zero correlation) linear and integral distinguishers can be achieved by using a modified version of the subroutine of Simon’s algorithm, without exploiting hidden periods or shifts. In particular, we show that some special versions of integral distinguishers achieve a more-than-quadratic speed-up.

First, we observe that Simon’s algorithm has a close relationship with linear correlations of functions via Fourier transform. Simon’s algorithm iterates a subroutine, which is composed of the Hadamard transform and an oracle query to the target function. We find that, with a slight modification made, the subroutine outputs a pair of linear masks of the target function with probability proportional to the squared linear correlation. Since it extracts linear correlations of a function into quantum amplitude, we call the subroutine after the modification the correlation extraction algorithm, or CEA for short.

Second, we show that multidimensional linear distinguishers can be sped up by combining CEA and the Quantum Amplitude Amplification (QAA) technique. As an application example, we show that the multidimensional distinguishers on FEA-1 and FEA-2 by Beyne [6] can be sped up from \(O(2^{(r/4-3/4)n})\) and \(O(2^{(r/6-3/4)n})\) to \(O(2^{(r/8-1/4)n})\) and \(O(2^{(r/12-1/4)n})\), respectively, when messages are n bits and the number of rounds is r.

Then we show that CEA also leads to a speed-up for multidimensional zero correlation linear distinguishers. Our technique leads to quantum distinguishers on 5-round balanced Feistel running in time \(O(2^{n/2})\) when round functions are bijections and the entire width of the cipher is n, and distinguishers on Type-I/II generalized Feistel structures. (See Table 2 of this paper’s full version [36] for details.)

Finally, we show a speed-up for integral distinguishers. The speed-up is obtained via the correspondence of integral and zero correlation properties observed by Bogdanov et al. and Sun et al. [8, 56], and is applicable when integral properties are based on balanced functions. In particular, we observe the possibility of a more than quadratic speed-up when there are multiple integral properties on mutually orthogonal subspaces, which appear in some SPN ciphers such as the 3.5-round AES. As a notable example, we show that a toy 4-bit-cell SPN cipher having the same integral property as the 2.5-round AES is distinguished with only a single quantum query. Such a single-query attack is almost impossible in the classical setting (unless another weakness exists), and the example illustrates a new type of qualitative difference between classical and quantum computation that has not been observed before.

Note that none of our attacks requires the target cipher to have algebraic structures such as hidden periods or shifts. It is somewhat surprising that (a modified) Simon’s algorithm, which was primarily developed to solve an algebraic problem of hidden periods, leads to non-trivial speed-ups for various classical attacks that do not rely on hidden periods or shifts.

Our technique extends to generalized linear distinguishers on arbitrary finite abelian groups [4] by replacing the Hadamard transform in CEA with the general Quantum Fourier Transform (QFT). As an application, we show a speed-up for the distinguisher by Beyne [6] on the FF3-1 structure. The amount of speed-up is the same as that for FEA-1.

A drawback of our technique is that it cannot be applied to integral distinguishers based on zero-sum properties, although zero-sum properties are usually used to extend distinguishers into key-recovery attacks on more rounds. In particular, it does not directly lead to breaking more rounds than classical attacks. Still, we believe that our techniques are novel and general, and will inspire other new types of quantum attacks in both the Q1 and Q2 models.

1.2 Related Works

Quantum speed-up for integral attacks has already been studied in, e.g., [15], but zero-sum properties are used and the distinguisher part itself is not sped up. A recent work by Shi et al. [54] also studies zero correlation linear attacks in the quantum setting, but it mainly focuses on how to find zero correlation linear approximations by using quantum computers, and does not have much overlap with our results.

Schrottenloher’s Key-Recovery Attack. Another recent concurrent and independent work by Schrottenloher [53] showed how to obtain quantum speed-up for linear key-recovery attacks.

Classical linear distinguishers are often combined with efficient key-recovery attacks using the FFT [22, 29, 57]. What Schrottenloher did is to combine such classical techniques with the QFT. By computing the convolution of some Boolean functions related to linear approximations in quantum superposition, Schrottenloher’s algorithm produces a quantum superposition of subkey candidates in such a way that the quantum amplitudes are proportional to their experimental correlations. Then the amplitude of the right key is amplified by QAA.

Note that our main interest is to achieve a speed-up for multidimensional (zero correlation) linear distinguishers. Schrottenloher’s work [53] also deals with multiple linear approximations, but the existence of multiple approximations improves only the precision of the attack by a constant factor, and essentially does not contribute much to reducing the time complexity. Additionally, zero correlation linear or integral attacks are not studied in [53].

One would expect that a larger speed-up for key-recovery is obtained by combining our technique and Schrottenloher’s. Still, the mechanisms of the two techniques are quite different (Schrottenloher uses the QFT to compute a convolution in superposition to obtain a superposition of key candidates, while we use it to extract correlations of multidimensional linear approximations), and so far we do not have any idea on how to combine them. Studying the theoretical connection between them and reducing the time complexity of key-recovery exploiting (zero correlation) multidimensional linear approximations is definitely an important and interesting direction for future work.

1.3 Organization

Section 2 introduces basic notions and facts. Section 3 reviews classical (multidimensional) linear distinguishers and Kaplan et al.’s quantum one-dimensional linear distinguisher. Section 4 studies relationships between Simon’s subroutine and linear correlations, and introduces CEA. Sections 5, 6, and 7 show how to achieve a quantum speed-up with CEA for multidimensional linear, zero correlation multidimensional linear, and integral distinguishers, respectively. Section 8 shows the extension to generalized linear distinguishers on an arbitrary finite abelian group. Section 9 concludes the paper.

2 Preliminaries

\({\mathbb {F}}_2\) denotes the Galois field of order two. We identify the set of n-bit strings \(\{0,1\}^n\) and the n-dimensional \({\mathbb {F}}_2\)-vector space \({\mathbb {F}}^n_2\). Especially, by “bit string” we denote an element of \(\mathbb F^n_2\) for some n. By \(\textbf{e}_i\) we denote the n-bit string (for some n) of which the i-th bit is 1 and other bits are 0. \(x\oplus y\) denotes the addition of x and y in \({\mathbb {F}}^n_2\), and x||y denotes the concatenation as bit strings. For a bit string \(x\in {\mathbb {F}}^n_2\), we denote the i-th bit (from the left) by \(x_i\). Namely we represent x as \(x = x_1 || \cdots || x_n\). For \(x,y \in {\mathbb {F}}^n_2\), the dot product of x and y is defined by \(x \cdot y := (x_1 \cdot y_1) \oplus \cdots \oplus (x_n \cdot y_n)\). For a vector space \(V \subset {\mathbb {F}}^n_2\) (resp., vector x), \(V^\perp \) (resp., \(x^\perp \)) denotes the subspace that is composed of y satisfying \(y \cdot x=0\) for all \(x \in V\) (resp., y satisfying \(y \cdot x = 0\)). For two vector spaces \(V_1,V_2 \subset {\mathbb {F}}^n_2\), we write \(V_1 \perp V_2\) if \(v_1 \cdot v_2=0\) for all \(v_1 \in V_1\) and \(v_2 \in V_2\). The event that a (classical or quantum) algorithm \(\mathcal {A}\) outputs a classical bit string x is denoted by \(x \leftarrow \mathcal {A}\). For a bit string \(x \in {\mathbb {F}}^n_2\) (resp., function \(f : {\mathbb {F}}^m_2 \rightarrow {\mathbb {F}}^n_2\)), by \(\textsf{msb}_u[x]\) (resp., \(\textsf{msb}_u[f]\)) we denote the most significant u bits of x (resp., the function that returns \(\textsf{msb}_u[f(x)]\) for each input x). The notations \(\textsf{lsb}_u[x]\) and \(\textsf{lsb}_u[f]\) are similarly defined for least significant u bits. For a distribution D and a real value \(X_w\) depending on a parameter w, \(\mathbb E_{w \sim D}[X_w]\) denotes the expected value of \(X_w\) when w is sampled according to D. 
It is also denoted by \(\mathbb E_{w}[X_w]\) or just \(\mathbb E[X_w]\) if the distribution is clear from the context. Similar notations are used for variance and the probability of an event. For a unitary operator U, its adjoint is denoted by \(U^*\). In cryptanalysis of a block cipher E, we regard the unit of time as the time to encrypt a message by E. We assume that readers are familiar with Pearson’s chi-squared test of goodness-of-fit. For those who are not, we provide a brief overview of the relationship between the test and distinguishers in Section A of the full version of this paper [36].

2.1 Linear Approximations and Correlations

The (one-dimensional) linear approximation of a function \(f : {\mathbb {F}}^m_2 \rightarrow {\mathbb {F}}^n_2\) for an input mask \(\alpha \in {\mathbb {F}}^m_2\) and output mask \(\beta \in {\mathbb {F}}^n_2\) is the Boolean function defined by \(x \mapsto (\alpha \cdot x) \oplus (\beta \cdot f(x))\). The correlation \(\textrm{Cor}(f;\alpha ,\beta )\) of this linear approximation is defined by \(\textrm{Cor}(f;\alpha ,\beta ) := \Pr _x\left[ \alpha \cdot x = \beta \cdot f(x)\right] - \Pr _x\left[ \alpha \cdot x \ne \beta \cdot f(x)\right] \). It is well-known that the linear correlation satisfies

$$\begin{aligned} \textrm{Cor}(f;\alpha ,\beta ) &= \sum _{x\in {\mathbb {F}}^m_2}\frac{(-1)^{\alpha \cdot x \oplus \beta \cdot f(x)}}{2^m}. \end{aligned} \tag{1}$$
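As a small sanity check of Eq. (1), the following sketch computes the correlation both from the probabilistic definition and from the signed sum, and compares them on a 4-bit function. The table `TOY_SBOX` and all function names are hypothetical choices made only for illustration; bit strings are encoded as Python integers.

```python
def dot(a, b):
    """Dot product over F_2 of two bit strings encoded as integers."""
    return bin(a & b).count("1") & 1


def cor_by_probability(f, m, alpha, beta):
    """Cor(f; alpha, beta) as Pr[agree] - Pr[disagree] over all m-bit x."""
    agree = sum(1 for x in range(2**m) if dot(alpha, x) == dot(beta, f(x)))
    return (agree - (2**m - agree)) / 2**m


def cor_by_sum(f, m, alpha, beta):
    """Cor(f; alpha, beta) as the signed sum of Eq. (1)."""
    signs = ((-1) ** (dot(alpha, x) ^ dot(beta, f(x))) for x in range(2**m))
    return sum(signs) / 2**m


# Hypothetical 4-bit permutation, chosen only for illustration.
TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def toy_f(x):
    return TOY_SBOX[x]


# Print both values for a few masks; they coincide for every (alpha, beta).
for alpha, beta in [(0, 0), (0b0001, 0b0001), (0b1010, 0b0110)]:
    print(alpha, beta,
          cor_by_probability(toy_f, 4, alpha, beta),
          cor_by_sum(toy_f, 4, alpha, beta))
```

The two functions agree because each x contributes +1 to the sum when the approximation holds and -1 otherwise.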

In addition, we need the following claim for analysis of attacks.

Claim 1

(Distribution of capacity of a random permutation). Let \(V \subset {\mathbb {F}}^n_2 \times {\mathbb {F}}^n_2\) be a vector space and S be an arbitrary basis of V. Then, for a randomly chosen permutation P, the value \( 2^n \cdot \sum _{(\alpha ,\beta ) \in V - \{\textbf{0}\}} \textrm{Cor}(P;\alpha ,\beta )^2 \) approximately follows the \(\chi ^2\) distribution with \(2^v-2^u-2^w+1\) degrees of freedom. Here, \(v := \dim (V)\), \(u := \dim (V \cap {\mathbb {F}}^n_2 \times \{0^n\})\) and \(w := \textrm{dim}(V \cap \{0^n\} \times {\mathbb {F}}^n_2)\).

This claim is conjectured in [2]. We do not have a formal proof, but explain why the claim is plausible in Section D of the full version of this paper [36].

2.2 Balanced Function and Zero-Sum Property

Integral cryptanalysis [44], which was initially proposed as a dedicated attack on the block cipher SQUARE [24], exploits the zero-sum property of (a part of) ciphers. Here, we say that a function \(f : {\mathbb {F}}^m_2 \rightarrow {\mathbb {F}}^n_2\) has the zero-sum property if \(\sum _x f(x)=0\). Moreover, we say that a function f is balanced if \(|f^{-1}(y)| = |f^{-1}(y')|\) holds for any \(y,y'\) in the range of f. A balanced function has the zero-sum property but the converse does not necessarily hold. In some previous works, the zero-sum property is called “balanced property”, but this paper uses the term “balanced” only when referring to a balanced function in the above sense.
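The difference between the two properties can be illustrated on tiny functions. This is a sketch under one stated assumption: "range" in the definition of a balanced function is read as the whole codomain \({\mathbb {F}}^n_2\), so every n-bit value must have the same number of preimages. The function names and the two example functions are hypothetical.

```python
from collections import Counter
from functools import reduce


def has_zero_sum(f, m):
    """True if the XOR of f(x) over all m-bit inputs x is zero."""
    return reduce(lambda acc, x: acc ^ f(x), range(2**m), 0) == 0


def is_balanced(f, m, n):
    """True if every n-bit value has exactly 2^(m-n) preimages under f.

    Assumption: 'range' in the definition means the codomain F_2^n.
    """
    counts = Counter(f(x) for x in range(2**m))
    return m >= n and all(counts[y] == 2**(m - n) for y in range(2**n))


def truncate(x):
    """4-bit input, 2-bit output: keep the two least significant bits."""
    return x & 0b11


def constant(x):
    """2-bit input, 2-bit output: always returns 01."""
    return 0b01


print(is_balanced(truncate, 4, 2), has_zero_sum(truncate, 4))
print(is_balanced(constant, 2, 2), has_zero_sum(constant, 2))
```

The constant function is a zero-sum function that is not balanced: XORing the same value an even number of times gives zero, yet only one output value is ever hit.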

2.3 Quantum Computation

We assume that the readers are familiar with quantum computation and linear algebra (see, e.g., [52] for basics of quantum computation). We adopt the standard quantum circuit model and do not take the cost of quantum error correction into account. \(I_m\) denotes the identity operator on an m-qubit system and H denotes the (1-qubit) Hadamard transform. For a function \(f : \mathbb F^m_2 \rightarrow \mathbb F_2^n\), \(U_f\) denotes the unitary operator defined by \( U_f : \left|x\right>\left|y\right> \mapsto \left|x\right>\left|y \oplus f(x)\right>. \) Namely, \(U_f\) is the quantum oracle of f. All quantum attacks in this paper are Quantum Chosen-Plaintext Attacks (QCPAs, in the Q2 model), and the quantum encryption oracle \(U_{E_K}\) of a target cipher \(E_K\) is assumed to be available. If \(E_K\) is a tweakable block cipher, adversaries query tweaks also in quantum superposition.

Quantum Amplitude Amplification. Here we recall the Quantum Amplitude Amplification (QAA) technique [18], which is a generalization of Grover’s algorithm [30]. Let \(f : \mathbb F_2^m \rightarrow \mathbb F_2\) be a Boolean function, U be a unitary operator acting on an m-qubit system, and p denote the probability that we observe a bit string x satisfying \(f(x)=1\) when the state \(U \left|0^m\right>\) is measured by the computational basis. In addition, let \(\mathcal {S}_f\) and \(\mathcal {S}_0\) be the unitary operators acting on an m-qubit quantum system defined by \(\mathcal {S}_f\left|x\right> = (-1)^{f(x)}\left|x\right>\) and \(\mathcal {S}_0\left|x\right>=(-1)^{\delta _{0^m,x}}\left|x\right>\), where \(\delta _{0^m,x}\) is Kronecker’s delta.

Proposition 1

(Quantum amplitude amplification). In the above setting, let \(Q(U,f) := - U \mathcal {S}_0 U^* \mathcal {S}_f\). When the state \(Q(U,f)^i U\left|0^m\right>\) is measured by the computational basis for some \(i > 0\), an outcome x satisfying \(f(x)=1\) is obtained with probability \(\sin ^2( (2i+1) \cdot \arcsin (\sqrt{p}))\). In particular, such an x is obtained with probability at least \(\max \{p,1-p\}\) by setting \(i=\left\lceil \pi / (4\arcsin (\sqrt{p})) \right\rceil \).

Grover’s algorithm is obtained when \(U = H^{\otimes m}\). Here, \(p = |f^{-1}(1)|/2^m\) and an \(x \in f^{-1}(1)\) is found by applying \(Q(H^{\otimes m},f)\) at most \(\sqrt{2^m/|f^{-1}(1)|}\) times.
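The success probability in Proposition 1 can be checked numerically in the Grover case \(U = H^{\otimes m}\). The sketch below simulates the real amplitude vector directly (no quantum library is assumed), using the fact that \(-U \mathcal {S}_0 U^*\) is the reflection \(2\left|\psi \right>\left<\psi \right| - I\) about the uniform state \(\left|\psi \right>\). The function names and the marked set are hypothetical.

```python
import math


def grover_success_probability(marked, m, iterations):
    """Probability of measuring a marked x after Q(H^{(x)m}, f)^i H^{(x)m}|0^m>."""
    size = 2**m
    amp = [1 / math.sqrt(size)] * size           # U|0^m> is the uniform superposition
    for _ in range(iterations):
        # S_f: flip the sign of the marked amplitudes.
        amp = [-a if x in marked else a for x, a in enumerate(amp)]
        # -U S_0 U^*: reflect every amplitude about the mean.
        mean = sum(amp) / size
        amp = [2 * mean - a for a in amp]
    return sum(amp[x] ** 2 for x in marked)


def qaa_formula(p, iterations):
    """sin^2((2i+1) * arcsin(sqrt(p))) from Proposition 1."""
    return math.sin((2 * iterations + 1) * math.asin(math.sqrt(p))) ** 2


m, marked = 6, {5, 17}                           # hypothetical example: 2 marked items out of 64
p = len(marked) / 2**m
for i in range(7):
    print(i, grover_success_probability(marked, m, i), qaa_formula(p, i))
```

The two columns agree up to floating-point error, and the probability first rises and then falls again as i grows, which is why the number of iterations has to be chosen with care.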

Applications to Distinguishers. A typical task in cryptanalysis is to distinguish two distributions of functions. That is, under the assumption that a function f is chosen from a distribution \(D_1\) or \(D_2\), an adversary tries to judge which distribution f is chosen from. For linear distinguishers, \(D_1\) (resp., \(D_2\)) corresponds to a linear approximation of a real block cipher (resp., a random permutation).

A counterpart of such a task in the quantum setting is to distinguish two distributions of unitary operators. That is, under the assumption that a unitary operator U is chosen according to a distribution \(D_1\) or \(D_2\), an adversary tries to judge which distribution U is chosen from.

QAA can be used to solve such a task. Assume that an adversary has access to not only U but also \(U^*\), and that U acts on an n-qubit system. Moreover, suppose that we know a Boolean function \(F : {\mathbb {F}}^n_2 \rightarrow {\mathbb {F}}_2\) satisfying the following conditions.

  (1) If U is chosen from \(D_1\), then the probability \(p_U := \Pr \Big [ x \xleftarrow {\text {measure}}U \left|0^n\right> : F(x)=1 \Big ]\) is relatively high on average.

  (2) If U is chosen from \(D_2\), then \(p_U\) is relatively low on average.

Specifically, assume we know a value t satisfying \(\mathbb {E}_{U \sim D_1}\left[ p_U\right] \ge t \gg \mathbb {E}_{U \sim D_2}\left[ p_U\right] . \) Then we can distinguish \(D_1\) and \(D_2\) by using QAA on U and F: If U is chosen from \(D_1\), then QAA with \(O(\sqrt{t^{-1}})\) applications of U, \(U^*\), and \(\mathcal {S}_F\) will find x satisfying \(F(x)=1\) because \(p_U \ge t\). If U is chosen from \(D_2\), such QAA will not find x because \(t \gg p_U\) and the number of iterations is not large enough.

More precisely, since we know only a lower bound on \(\mathbb {E}_{U \sim D_1}\left[ p_U\right] \), we run multiple instances of QAA with the number of iterations randomized as follows.

QAA for Distinguisher (\(\boldsymbol{\mathcal{Q}\mathcal{D}}\))

  1. For \(j=1,\dots ,s\), do:

    (a) Choose i from the set of integers from 0 to \( \left\lfloor \frac{1}{ \sin \left( 2 \cdot \arcsin \left( \sqrt{t} \right) \right) } \right\rfloor \) uniformly at random.

    (b) Apply \(Q(U,F)^iU\) to \(\left|0^n\right>\) and measure the entire state by the computational basis, and let x be the outcome.

    (c) Compute F(x). If \(F(x)=1\), return 1 and abort.

  2. Return 0.

Here, s is a positive integer constant chosen depending on applications. We denote the above algorithm by \(\mathcal{Q}\mathcal{D}\).

The idea of randomly choosing the number of iterations is just a straightforward adaptation of previous works on Grover’s algorithm and QAA without knowing the initial success probability [17, 18].
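The behaviour of \(\mathcal{Q}\mathcal{D}\) can be explored with a purely classical simulation. The sketch below does not simulate a quantum state: it assumes that the value \(p_U\) is handed to the simulator (a real adversary does not know it) and replaces steps (b) and (c) by a coin flip with the success probability given by Proposition 1. The function name `simulate_qd` and all parameter values are hypothetical.

```python
import math
import random


def simulate_qd(p_u, t, s, rng):
    """Classically simulate QD: return 1 if some round finds x with F(x)=1, else 0.

    p_u: probability that measuring U|0^n> gives x with F(x)=1 (simulation input only)
    t:   the known threshold, s: number of rounds, rng: a random.Random instance
    """
    max_i = math.floor(1 / math.sin(2 * math.asin(math.sqrt(t))))
    theta = math.asin(math.sqrt(p_u))
    for _ in range(s):
        i = rng.randint(0, max_i)                    # step (a): i uniform in {0, ..., max_i}
        success = math.sin((2 * i + 1) * theta) ** 2  # Proposition 1
        if rng.random() < success:                   # steps (b)-(c): measure and test F(x)
            return 1
    return 0


rng = random.Random(0)
t, s, trials = 2**-10, 3, 2000
rate_d1 = sum(simulate_qd(t, t, s, rng) for _ in range(trials)) / trials
rate_d2 = sum(simulate_qd(t * 1e-4, t, s, rng) for _ in range(trials)) / trials
print("p_U = t:      fraction of runs returning 1 =", rate_d1)
print("p_U = t/10^4: fraction of runs returning 1 =", rate_d2)
```

With \(p_U = t\) most runs return 1, while with \(p_U \ll t\) almost none do, which is the gap that the distinguisher exploits.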

Proposition 2

With the above setting and notions, suppose \(1/4 > t > 0\). Then, \(\mathcal{Q}\mathcal{D}\) applies U, \(U^*\), and \(\mathcal {S}_F\) at most \(s( \frac{1}{\sqrt{t}}+1)\) times and (1) returns 1 with probability at least \((1-(\frac{3}{4})^s) \cdot \Pr _{U \sim D_1}\left[ 1/4 > p_U \ge t\right] \) if U is chosen according to \(D_1\) and (2) returns 1 with probability at most \(s\cdot (\frac{16t'}{t} + \frac{20t'}{\sqrt{t}}) + \Pr _{U \sim D_2}[t' < p_U]\) for any \(t'>0\) satisfying \(4\sqrt{t'/t} + 2\sqrt{t'} < \pi /2\) if U is chosen according to \(D_2\).

The interpretation of the proposition is as follows. Suppose that \(p_U\) is distributed around t (resp., \(t'\)) if U is chosen according to \(D_1\) (resp., \(D_2\)), and \(1/4 > t \gg t'\) holds. For a sufficiently large constant s (e.g., \(s=3\)), the proposition guarantees that \(\mathcal{Q}\mathcal{D}\) returns 1 with probability \(\ge 1/2\) (resp., only with a negligibly small probability) when U is chosen according to \(D_1\) (resp., \(D_2\)). Hence \(D_1\) is distinguished from \(D_2\). The proof of Proposition 2 is a straightforward application of some lemmas in previous works [17, 18], but we provide it in Section B of the full version of this paper [36] for completeness.

Simon’s Algorithm. Simon’s quantum algorithm [55] finds a period of a periodic function. More precisely, it solves the following problem.

Problem 1

Let \(s \in \mathbb F_2^m\) be a (secret) constant, and \(f : \mathbb F_2^m \rightarrow \mathbb F_2^n\) be a function satisfying the following properties C1 and C2.

C1. \(f(x\oplus s) = f(x)\) for all x. Namely, f is a periodic function with period s.

C2. \(f(x) \ne f(y)\) if \(x \ne y\) and \(x\oplus s \ne y\).

Given the (quantum) oracle of f, find s.

The classical complexity to solve the problem is \(\Theta (2^{m/2})\) but Simon’s algorithm, which runs as follows, solves it in polynomial time with high probability.

  1. For \(i=1,2,\dots ,2m\), execute the following subroutine (a)–(e).

    (a) Prepare the initial state \(\left|0^m\right>\left|0^n\right>\).

    (b) Apply the m-qubit Hadamard transform \(H^{\otimes m}\) on the first m qubits.

    (c) Apply \(U_f\) on the state (i.e., make a quantum query to f).

    (d) Apply \(H^{\otimes m}\otimes I_n\) to the state.

    (e) Measure the first m qubits by the computational basis, discard the remaining n qubits, and return the observed m-bit string (denoted by \(\alpha _i\)).

  2. If \(\dim (\textrm{Span}_{{\mathbb {F}}_2}(\alpha _1,\dots ,\alpha _{2m})) = m-1\), compute and output the unique \(s' \in {\mathbb {F}}^m_2 \setminus \{0^m\}\) such that \(s' \cdot \alpha _i = 0\) for \(i=1,\dots ,2m\). If \(\dim (\textrm{Span}_{{\mathbb {F}}_2}(\alpha _1,\dots ,\alpha _{2m})) \ne m-1\), output \(\bot \).

Simon showed that each \(\alpha _i\) is uniformly distributed over the subspace \(\{v \in {\mathbb {F}}^m_2 | v \cdot s = 0\}\), and thus the algorithm returns the period s with high probability. We refer to the subroutine (a)–(e) as Simon’s subroutine.

Many papers (e.g., [42, 45, 46]) showed polynomial-time quantum attacks on symmetric cryptosystems by using Simon’s algorithm. In fact only C1 is satisfied in those applications and C2 is not necessarily satisfied. Still, C1 guarantees that the subroutine (a)–(e) always returns an \(\alpha _i\) satisfying \(\alpha _i \cdot s=0\) [42].
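For small m, the output distribution of Simon’s subroutine can be computed exactly on a classical computer. After steps (b)–(d) the state is \(2^{-m}\sum _{\alpha ,x}(-1)^{\alpha \cdot x}\left|\alpha \right>\left|f(x)\right>\), so the probability of observing \(\alpha \) is the sum over y of the squared amplitude of \(\left|\alpha \right>\left|y\right>\). The sketch below evaluates this sum for a hypothetical periodic function; the function names and the chosen period are illustrative assumptions.

```python
from collections import defaultdict


def dot(a, b):
    """Dot product over F_2 of two bit strings encoded as integers."""
    return bin(a & b).count("1") & 1


def simon_subroutine_distribution(f, m):
    """Return the list [Pr[alpha] for alpha in 0..2^m-1] for Simon's subroutine on f."""
    dist = []
    for alpha in range(2**m):
        amp = defaultdict(float)                 # amplitude of |alpha>|y>, indexed by y
        for x in range(2**m):
            amp[f(x)] += (-1) ** dot(alpha, x) / 2**m
        dist.append(sum(a * a for a in amp.values()))
    return dist


M_BITS = 3
PERIOD = 0b101                                   # hypothetical secret period s


def periodic_f(x):
    """A 2-to-1 function with period PERIOD; it satisfies both C1 and C2."""
    return min(x, x ^ PERIOD)


for alpha, prob in enumerate(simon_subroutine_distribution(periodic_f, M_BITS)):
    print(f"alpha={alpha:03b}  alpha.s={dot(alpha, PERIOD)}  Pr={prob}")
```

Every \(\alpha \) with nonzero probability satisfies \(\alpha \cdot s = 0\), and these are hit uniformly, as stated above for functions satisfying C1 and C2.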

3 Classical and Kaplan et al.’s Linear Distinguishers

Here we review classical (multidimensional) linear distinguishers and Kaplan et al.’s quantum one-dimensional linear distinguisher [43].

3.1 Classical One-Dimensional Linear Distinguisher

The linear correlation \(\textrm{Cor}(P;\alpha ,\beta )\) of an n-bit random permutation P approximately follows the normal distribution \(\mathcal {N}(0,2^{-n})\) for an arbitrary mask \((\alpha ,\beta )\) with \(\alpha ,\beta \ne 0^n\) [26]. Thus, if the correlation \(\textrm{Cor}(E_K;\alpha ,\beta )\) for a block cipher \(E_K\) with \(\alpha , \beta \ne 0^n\) significantly deviates from the segment \([-2^{-n/2},2^{-n/2}]\), then \(E_K\) can be distinguished by collecting a list \(L = \{(P_1,C_1),\dots ,(P_N,C_N)\}\) for random \(P_1,\dots ,P_N\), and checking if the estimated empirical correlation

$$ \widehat{\textrm{Cor}}(E_K ;\alpha ,\beta ) = \frac{\#\{ (P,C) \in L | \alpha \cdot P = \beta \cdot C \} - \#\{ (P,C) \in L | \alpha \cdot P \ne \beta \cdot C \}}{N} $$

is far from \([-2^{-n/2},2^{-n/2}]\). The attack succeeds with a high probability if \(N \gtrapprox \textrm{Cor}(E_K;\alpha ,\beta )^{-2}\).
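The estimator \(\widehat{\textrm{Cor}}\) is straightforward to compute from a list of plaintext-ciphertext pairs. The sketch below applies it to a hypothetical 8-bit keyed permutation `toy_cipher`, which is not a real cipher and was constructed so that the masks \(\alpha = \beta = 1\) have correlation of absolute value 1/2. The sample size 4000 is an arbitrary choice, far above \(\textrm{Cor}^{-2} = 4\).

```python
import random


def dot(a, b):
    """Dot product over F_2 of two bit strings encoded as integers."""
    return bin(a & b).count("1") & 1


def empirical_correlation(pairs, alpha, beta):
    """(#agreeing pairs - #disagreeing pairs) / N for the masks (alpha, beta)."""
    agree = sum(1 for p, c in pairs if dot(alpha, p) == dot(beta, c))
    return (agree - (len(pairs) - agree)) / len(pairs)


def toy_cipher(x, key):
    """Hypothetical keyed permutation: XOR the key, flip the last bit when bits 1 and 2 are set."""
    return x ^ key ^ (1 if (x & 0b110) == 0b110 else 0)


N_BITS, KEY = 8, 0b1011
rng = random.Random(2024)
plaintexts = [rng.randrange(2**N_BITS) for _ in range(4000)]
pairs = [(p, toy_cipher(p, KEY)) for p in plaintexts]

# Exact correlation over the full codebook, for comparison with the estimate.
codebook = [(x, toy_cipher(x, KEY)) for x in range(2**N_BITS)]
print("exact    :", empirical_correlation(codebook, 1, 1))
print("estimated:", empirical_correlation(pairs, 1, 1))
print("random-permutation scale 2^(-n/2):", 2 ** (-N_BITS / 2))
```

The estimate lies far outside \([-2^{-n/2},2^{-n/2}]\), so this toy function would be distinguished from a random permutation.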

3.2 Classical Multidimensional Linear Distinguishers

A natural idea to enhance the power of linear distinguishers is to utilize multiple linear approximations. Some early works indeed show such attacks, assuming the existence of statistically independent multiple approximations [7, 41]. However, the assumption does not necessarily hold in general [50]. Instead, Hermelin et al. [35] proposed to use multidimensional linear approximations, i.e., sets of linear approximations of which the input-output masks form a vector space.

Specifically, let \(f : \mathbb F_2^m \rightarrow \mathbb F_2^n\) be a function, \(V \subset {\mathbb {F}}^m_2 \times {\mathbb {F}}^n_2\) be a set of input-output masks for f that is a vector space, and \(S := \{(\alpha _1,\beta _1),\dots ,(\alpha _\ell ,\beta _\ell )\} \) be a basis of V. Then the multidimensional linear approximation of f (w.r.t. \((V,S)\)) is defined as the function \(\textsf{Lin}^f_{S} : {\mathbb {F}}^m_2 \rightarrow {\mathbb {F}}^\ell _2\) such that

$$ \textsf{Lin}^f_{S}(x) = (\alpha _1 \cdot x \oplus \beta _1 \cdot f(x),\dots ,\alpha _\ell \cdot x \oplus \beta _\ell \cdot f(x)). $$

Define a distribution \(p^f_S\) on \({\mathbb {F}}^\ell _2\) by \(p^f_S(z) := \Pr _{}\left[ x \xleftarrow {\$} {\mathbb {F}}^m_2 : \textsf{Lin}^f_S(x)=z \right] \).

Below we denote the zero vector \((0^m,0^n)\) by \(\textbf{0}\). We say that the input and output masks are linearly independent if \(V = V_1 \times V_2\) holds for some \(V_1 \subset {\mathbb {F}}^m_2\) and \(V_2 \subset {\mathbb {F}}^n_2\). Moreover, we say that the input and output masks are linearly completely dependent if there exists a basis \(\{(\alpha _i,\beta _i)\}_{1 \le i \le \dim (V)}\) of V such that both of \(\{\alpha _i\}_{1 \le i \le \dim (V)}\) and \(\{\beta _i\}_{1 \le i \le \dim (V)}\) are linearly independent in \({\mathbb {F}}^n_2\).

The advantage of considering a set of masks forming a vector space is that we can utilize a link of the sum of the squared correlations to the capacity of \(p^f_S\) and Pearson’s chi-squared test: Here, the capacity of a probability function (distribution) p over \({\mathbb {F}}^\ell _2\) is the value defined by

$$\begin{aligned} \textrm{Cap}(p) := 2^\ell \sum _{z \in {\mathbb {F}}^\ell _2}(p(z) - 2^{-\ell })^2. \end{aligned}$$

The important well-known fact is that

$$\begin{aligned} \textrm{Cap}(p^f_S) = \sum _{(\alpha ,\beta ) \in V - \{\textbf{0}\}} \textrm{Cor}(f;\alpha ,\beta )^2 \end{aligned} \tag{2}$$

holds for the multidimensional approximation of f. Moreover, suppose a list of random input-output pairs \(L=\{(P_1,C_1), \dots ,(P_N,C_N)\}\) is given. Then the capacity \(\textrm{Cap}(\hat{p}^{f}_S)\) of the estimated empirical distribution \(\hat{p}^f_S\) (defined by \(\hat{p}^f_S(z) := \frac{\#\{ (P,C) \in L | \textsf{Lin}^f_S(P) = z \}}{N}\)) multiplied by N is equal to the test statistic of the Pearson’s chi-squared goodness-of-fit test (for testing the goodness-of-fit of \(p^f_S\) and the uniform distribution on \({\mathbb {F}}^\ell _2\)).
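The identity between the capacity and the sum of squared correlations can be verified numerically on a small example. In the sketch below, the 4-bit table `TOY_SBOX` and the basis `BASIS` are hypothetical choices for illustration, and bit strings are encoded as integers.

```python
from itertools import product


def dot(a, b):
    """Dot product over F_2 of two bit strings encoded as integers."""
    return bin(a & b).count("1") & 1


def cor(f, m, alpha, beta):
    """Linear correlation Cor(f; alpha, beta) computed by the signed sum."""
    return sum((-1) ** (dot(alpha, x) ^ dot(beta, f(x))) for x in range(2**m)) / 2**m


def capacity(f, m, basis):
    """Cap(p^f_S) = 2^ell * sum_z (p(z) - 2^-ell)^2 for the basis S."""
    ell = len(basis)
    counts = {}
    for x in range(2**m):
        z = tuple(dot(a, x) ^ dot(b, f(x)) for a, b in basis)   # Lin^f_S(x)
        counts[z] = counts.get(z, 0) + 1
    return 2**ell * sum((counts.get(z, 0) / 2**m - 2**-ell) ** 2
                        for z in product((0, 1), repeat=ell))


def sum_squared_correlations(f, m, basis):
    """Sum of Cor(f; alpha, beta)^2 over all nonzero (alpha, beta) in Span(S)."""
    total = 0.0
    for coeffs in product((0, 1), repeat=len(basis)):
        if not any(coeffs):
            continue                                            # skip the zero vector
        alpha = beta = 0
        for c, (a, b) in zip(coeffs, basis):
            if c:
                alpha, beta = alpha ^ a, beta ^ b
        total += cor(f, m, alpha, beta) ** 2
    return total


TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]   # hypothetical 4-bit permutation
BASIS = [(0b0001, 0b0010), (0b0100, 0b1000)]          # hypothetical basis S of V


def toy_f(x):
    return TOY_SBOX[x]


print(capacity(toy_f, 4, BASIS), sum_squared_correlations(toy_f, 4, BASIS))
```

The two printed values coincide up to floating-point error, and the same check passes for any function and any linearly independent basis, since the identity is an instance of Parseval’s relation.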

The idea of multidimensional linear distinguishers for a block cipher \(E_K\) is that the distribution \(p^{E_K}_S\) is far from uniform if the right-hand side of Eq. (2) with \(f=E_K\) is sufficiently large for random K, and thus \(E_K\) can be distinguished from random by checking whether the test statistic of Pearson’s chi-squared test is larger than a certain threshold. Specifically, given a list of (real) random plaintext-ciphertext pairs \(L=\{(P_1,C_1), \dots ,(P_N,C_N)\}\), we count \(\textsf{num}(z) := \#\{(P_i,C_i) \in L | \textsf{Lin}^{E_K}_S(P_i) = z \}\) for each z, and compute the test statistic \( {\chi }^2_{\textrm{real}} := N2^\ell \sum _{z}(\textsf{num}(z)/N - 2^{-\ell })^2 = N \cdot \textrm{Cap}(\hat{p}^{E_K}_S). \) Then \(\chi ^2_{\textrm{real}}\) is approximately distributed around \( (2^\ell -1) + N\sum _{(\alpha ,\beta ) \in V -\{\textbf{0}\}} \textrm{Cor}(E_K;\alpha ,\beta )^2. \) If the plaintext-ciphertext pairs are generated from a random permutation, then \(\textsf{num}(z)\) approximately follows the uniform distribution. Thus, the similarly computed statistic \(\mathbf {\chi }^2_{\textrm{ideal}}\) approximately follows the \(\chi ^2\) distribution with \((2^\ell - 1)\) degrees of freedom (denoted by \(\chi ^2_{2^\ell -1}\)), of which the standard deviation is \(\sqrt{2(2^\ell -1)}\). Hence \(E_K\) can be distinguished from a random permutation with a constant advantage when \( N \gg \sqrt{2^\ell } /\sum _{(\alpha ,\beta ) \in V - \{\textbf{0}\}} \textrm{Cor}(E_K;\alpha ,\beta )^2 = \sqrt{2^\ell } / \textrm{Cap}(p^{E_K}_S) \), by checking whether the test statistic is larger than \((2^\ell -1) + \sqrt{2(2^\ell -1)}\) or not.
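The test statistic and the threshold described above can be computed as follows. This is a sketch only: the observed values of \(\textsf{Lin}^{E_K}_S(P_i)\) are replaced by synthetic samples drawn from a hypothetical biased distribution (no cipher is involved), and the function names are illustrative.

```python
import math
import random


def counts_per_value(lin_values, ell):
    """num(z) for every ell-bit value z, with z encoded as an integer."""
    num = [0] * 2**ell
    for z in lin_values:
        num[z] += 1
    return num


def chi2_statistic(lin_values, ell):
    """N * 2^ell * sum_z (num(z)/N - 2^-ell)^2, i.e. N times the empirical capacity."""
    n_samples = len(lin_values)
    num = counts_per_value(lin_values, ell)
    return n_samples * 2**ell * sum((c / n_samples - 2**-ell) ** 2 for c in num)


def pearson_statistic(lin_values, ell):
    """Textbook Pearson form: sum_z (observed - expected)^2 / expected."""
    expected = len(lin_values) / 2**ell
    return sum((c - expected) ** 2 / expected for c in counts_per_value(lin_values, ell))


def threshold(ell):
    """Mean plus one standard deviation of the chi-squared distribution with 2^ell - 1 d.o.f."""
    return (2**ell - 1) + math.sqrt(2 * (2**ell - 1))


ELL, N = 2, 2000
rng = random.Random(1)
# Hypothetical "real world": Lin takes the value 0 more often than the other values.
biased = rng.choices(range(2**ELL), weights=[0.4, 0.2, 0.2, 0.2], k=N)
# Hypothetical "ideal world": Lin is uniformly distributed.
uniform = [rng.randrange(2**ELL) for _ in range(N)]

print("threshold     :", threshold(ELL))
print("biased sample :", chi2_statistic(biased, ELL))
print("uniform sample:", chi2_statistic(uniform, ELL))
```

The biased distribution has capacity 0.12, so its statistic concentrates around \((2^\ell -1) + 0.12N\), far above the threshold. The uniform sample follows \(\chi ^2_{2^\ell -1}\) and can still exceed the threshold occasionally, which is why the distinguisher only has a constant advantage.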

Some Remarks. The arguments in the above paragraph are mainly based on [6, Section 4.3]. Strictly speaking, the statistic in the ideal world \(\chi ^2_\textrm{ideal}\) does not follow \(\chi ^2_{2^\ell -1}\) actually because the squared correlation \(\textrm{Cor}(P;\alpha ,\beta )^2\) is not zero on average even for a random permutation P for \(\alpha ,\beta \ne 0^n\). Still, the difference of \(\chi ^2_\textrm{ideal}\) and \(\chi ^2_{2^\ell -1}\) is very small compared to the difference of \(\chi ^2_\textrm{real}\) and \(\chi ^2_{2^\ell -1}\), and it is usually (and implicitly) assumed that the above arguments heuristically work in practice. Meanwhile, zero-correlation linear cryptanalysis does exploit such small difference, which we will explain later.

Some early works showed that distinguishers based on the Log Likelihood Ratio (LLR) test [3, 32, 33] require only \(O( 1 / \textrm{Cap}(p^{E_K}_S) )\) data instead of the \(O(\sqrt{2^\ell } / \textrm{Cap}(p^{E_K}_S) )\) required by \(\chi ^2\)-test-based distinguishers, so that the LLR-test-based distinguishers perform better. However, the LLR test requires accurate knowledge of the key-dependent distributions of multidimensional linear approximations, which is often unavailable, as pointed out by Cho [21].

3.3 Kaplan et al.’s Quantum One-Dimensional Linear Distinguisher

Kaplan et al. [43] observed that a quadratic quantum speed-up can be obtained for linear distinguishers by using the quantum counting algorithm [18].

Roughly speaking, the quantum counting algorithm achieves a quadratic speed-up to solve the problem of estimating \(M := \#\{x | F(x)=1 \}\) for a Boolean function F. Making O(q) quantum queries to F, it returns an approximation \(\tilde{M}\) of M satisfying \(|\tilde{M}-M| \le O\left( \frac{\sqrt{M(2^n-M)}}{q} + \frac{2^n}{q^2}\right) \).

Now, suppose that there exists a linear approximation of an n-bit block cipher \(E_K\) satisfying \(c := |\textrm{Cor}(E_K;\alpha ,\beta )| \gg 2^{-n/2}\) for a random key K, and let F be the Boolean function such that \(F(x)=1\) iff \(\alpha \cdot x \oplus \beta \cdot E_K(x) = 0\). Then, \(E_K\) can be distinguished by estimating \(M = \#\{x | F(x)=1 \}\) and checking whether \(|M - \frac{2^n}{2}| \gg 2^{n/2}\). Using the quantum counting algorithm, one can obtain an estimate \(\tilde{M}\) of M with sufficient precision for the distinguisher (\(|\tilde{M} - M| \le \frac{M}{a}\) for a small integer \(a >0\)) in time O(1/c). Compared to the classical complexity of \(O(1/c^2)\), a quadratic speed-up is achieved.

Extension to Multidimensional Linear Distinguishers? After seeing Kaplan et al.’s work, it is natural to ask whether their technique can be extended to multidimensional linear distinguishers. However, to apply the quantum counting algorithm to solve a problem, one has to construct an efficiently computable Boolean function F in such a way that counting the number of x satisfying \(F(x)=1\) solves the problem. In the one-dimensional case F is obtained in a quite natural way as explained above, but in the multidimensional case essentially we have to construct F in such a way to achieve a quadratic speed-up for Pearson’s chi-squared test applied to the distribution \(p^{E_K}_S\). It seems highly unclear whether such F exists, and thus we seek for another technique.

4 New Observation on Simon’s Algorithm

As explained in Sect. 2, the subroutine of Simon’s algorithm uses only the quantum oracle of a target function and the Hadamard transform, which is the Fourier transform over the group \((\mathbb {Z}/2\mathbb {Z})^n\). Meanwhile, it is a well-known fact that linear correlations are closely related to the Fourier transform. This section shows that a slightly modified version of Simon’s subroutine, which we call CEA, returns input and output masks of a function with a probability proportional to the squared linear correlations. Later we show that CEA can be utilized to obtain speed-up for various techniques including multidimensional linear distinguishers. Since \(\mathbb F^n_2\) is isomorphic to \((\mathbb {Z}/2\mathbb {Z})^n\) as Abelian groups, in what follows we identify \(\mathbb F^n_2\) with \((\mathbb {Z}/2\mathbb {Z})^n\).

4.1 Fourier Transform

First, we recall the Fourier transform (over \((\mathbb {Z}/2\mathbb {Z})^n\)) and its relationship with linear cryptanalysis and quantum computation. The Fourier transform of a function \(F : {\mathbb {F}}^n_2 \rightarrow \mathbb {C}\) is the function \(\mathcal {F}F : {\mathbb {F}}^n_2 \rightarrow \mathbb {C}\) defined by \( \mathcal {F}F(x) := \sum _{y \in {\mathbb {F}}^n_2} \frac{(-1)^{x \cdot y} F(y)}{\sqrt{2^n}}. \)

Relationship with Linear Correlations. It is well-known that the linear correlation of an arbitrary function f is obtained by applying the Fourier transform on a function naturally defined from f [23, 58].

For arbitrary function \(f:{\mathbb {F}}^m_2 \rightarrow {\mathbb {F}}^n_2\), let \(f_\textrm{emb}: {\mathbb {F}}^m_2 \times {\mathbb {F}}^n_2 \rightarrow \mathbb {C}\) be the function defined by \(f_\textrm{emb}(x,y) = 1\) if \(f(x)=y\) and \(f_\textrm{emb}(x,y)=0\) otherwiseFootnote 9. Then some straightforward calculation shows

$$\begin{aligned} \mathcal {F}{f_\textrm{emb}}(\alpha ,\beta ) = \sqrt{2^{m-n}} \cdot \textrm{Cor}(f;\alpha ,\beta ). \end{aligned}$$
(3)
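Eq. (3) can be checked numerically on a small example. The sketch below assumes the usual definition \(\textrm{Cor}(f;\alpha ,\beta ) = 2^{-m}\sum _{x}(-1)^{\alpha \cdot x \oplus \beta \cdot f(x)}\) (the definition itself is given earlier in the paper, not in this section), encodes bit vectors as Python integers, and uses a hypothetical toy function `toy` from \({\mathbb {F}}^3_2\) to \({\mathbb {F}}^2_2\) chosen only for illustration.

```python
import math


def dot(a, b):
    """Inner product over F_2 of two bit vectors encoded as integers."""
    return bin(a & b).count("1") & 1


def cor(f, m, alpha, beta):
    """Cor(f; alpha, beta) = 2^-m * sum_x (-1)^(alpha.x XOR beta.f(x))."""
    total = sum((-1) ** (dot(alpha, x) ^ dot(beta, f(x))) for x in range(2 ** m))
    return total / 2 ** m


def fourier_f_emb(f, m, n, alpha, beta):
    """Fourier transform of f_emb at (alpha, beta) over F_2^m x F_2^n."""
    total = 0
    for x in range(2 ** m):
        for y in range(2 ** n):
            if f(x) == y:  # f_emb(x, y) = 1 exactly when f(x) = y
                total += (-1) ** (dot(alpha, x) ^ dot(beta, y))
    return total / math.sqrt(2 ** (m + n))


# Hypothetical toy function F_2^3 -> F_2^2, for illustration only.
TABLE = [0, 3, 1, 1, 2, 0, 3, 2]


def toy(x):
    return TABLE[x]


def check_eq3(m=3, n=2):
    """Compare both sides of Eq. (3) for every pair of masks."""
    scale = math.sqrt(2 ** (m - n))
    return all(
        abs(fourier_f_emb(toy, m, n, a, b) - scale * cor(toy, m, a, b)) < 1e-12
        for a in range(2 ** m)
        for b in range(2 ** n)
    )
```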

Relationship with Quantum Computation. The relationship with quantum computation is quite clear. The Fourier transform on \({\mathbb {F}}^n_2\) exactly corresponds to the Hadamard operator \(H^{\otimes n}\). For instance, let \(\psi : {\mathbb {F}}^n_2 \rightarrow \mathbb {C}\) and \(\left|\psi \right> := \sum _{x \in {\mathbb {F}}^n_2}\psi (x)\left|x\right>\). Then

$$\begin{aligned} H^{\otimes n}\left|\psi \right> = \sum _{y \in {\mathbb {F}}^n_2} \mathcal {F}{\psi }(y)\left|y\right> \end{aligned}$$
(4)

holds. (Note that this property holds regardless of the norm of \(\left|\psi \right>\).) In fact this is one of the most important sources of quantum speed-up: While the classical FFT requires time \(O(n2^n)\) to compute the Fourier transform of a function, an application of the Hadamard transform to a quantum state requires time O(1).
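To make the classical cost concrete, the following sketch compares a direct evaluation of the Fourier transform defined above with the standard fast Walsh-Hadamard transform, which uses n butterfly layers over \(2^n\) entries. The function names are hypothetical, and the input is assumed to be a list whose length is a power of two.

```python
import math


def fourier_naive(psi):
    """Direct evaluation of (F psi)(x) = sum_y (-1)^(x.y) psi(y) / sqrt(2^n)."""
    size = len(psi)
    return [
        sum((-1) ** (bin(x & y).count("1") & 1) * psi[y] for y in range(size))
        / math.sqrt(size)
        for x in range(size)
    ]


def fourier_fast(psi):
    """Fast Walsh-Hadamard transform: n layers, each touching all 2^n entries."""
    out = list(psi)
    size = len(out)
    h = 1
    while h < size:
        for start in range(0, size, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b  # butterfly on one bit position
        h *= 2
    return [v / math.sqrt(size) for v in out]
```

Both functions return the same vector, and applying `fourier_fast` twice returns the input, reflecting that \(H^{\otimes n}\) is its own inverse.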

4.2 Extracting Correlations by (Modified) Simon’s Subroutine

Here we show that Simon’s subroutine with a slight modification returns input and output masks for linear approximations with high correlation. We call the resulting algorithm Correlation Extraction Algorithm (CEA) because it extracts linear correlations into the quantum amplitude of a state, and we denote it by \(\textsf{CEA}^f\) when applied to a function f. Specifically, the algorithm runs as follows.

Algorithm \(\boldsymbol{\textsf{CEA}^f}\)

  (a) Prepare the initial state \(\left|0^m\right>\left|0^n\right>\).

  (b) Apply the m-qubit Hadamard transform \(H^{\otimes m}\) on the first m qubits.

  (c) Apply \(U_f\) on the state (i.e., make a quantum query to f).

  (d) Apply the \(\underline{(m+n)\text {-}\mathrm {qubit~Hadamard~transform}~H^{\otimes (m+n)}}\) on the state.

  (e) Measure \(\underline{\mathrm {the~entire}~(m+n)~\textrm{qubits}}\) by the computational basis and return the observed \(\underline{(m+n)\text {-}\mathrm {bit~string}~\alpha ||\beta }\) (\(\alpha \in \mathbb F_2^m\) and \(\beta \in \mathbb F_2^n\)).

The underlines indicate the parts modified from the original Simon’s subroutine in Sect. 2. \(\textsf{CEA}^f\) differs from the original Simon’s subroutine only in that \(\textsf{CEA}^f\) does not discard the last n qubits but measures them after applying \(H^{\otimes n}\).

Note that this change does not affect the distribution of \(\alpha \) in Step (e). Especially, \(\alpha \) is just uniformly distributed over the subspace \(\{v \in {\mathbb {F}}^m_2 | v \cdot s = 0\}\) if f satisfies the conditions of Problem 1. Thus there is nothing new if we focus only on \(\alpha \). However, we observe that \(\textsf{CEA}^f\) shows an interesting link to linear correlations when \(\beta \) is taken into account, as in the following proposition.

Proposition 3

The quantum state of \(\textsf{CEA}^f\) before the final measurement is

$$\begin{aligned} \sum _{\alpha \in {\mathbb {F}}^m_2,\beta \in {\mathbb {F}}^n_2} \frac{\textrm{Cor}(f;\alpha ,\beta )}{\sqrt{2^n}} \left|\alpha \right>\left|\beta \right>. \end{aligned}$$
(5)

In particular, for any subset \(S \subset \{0,1\}^m \times \{0,1\}^n\),

$$\begin{aligned} \Pr \left[ (\alpha ,\beta ) \leftarrow \textsf{CEA}^f : (\alpha ,\beta ) \in S\right] = \sum _{(\alpha ,\beta ) \in S}\frac{\textrm{Cor}(f;\alpha ,\beta )^2}{2^n} \end{aligned}$$
(6)

holds.

Proof

The quantum state of \(\textsf{CEA}^f\) before the final measurement is

$$\begin{aligned} H^{\otimes (m+n)} U_f \left( H^{\otimes m} \otimes I_n\right) \left|0^m\right>\left|0^n\right> &= H^{\otimes (m+n)} U_f \sum _{x \in {\mathbb {F}}^m_2}\frac{1}{\sqrt{2^m}}\left|x\right>\left|0^n\right> \\ &= H^{\otimes (m+n)} \sum _{x \in {\mathbb {F}}^m_2}\frac{1}{\sqrt{2^m}}\left|x\right>\left|f(x)\right> \\ & \!\!\!\!\!\! {\mathop {=}\limits ^{\text {Def. of }f_\textrm{emb}}} H^{\otimes (m+n)} \sum _{x \in {\mathbb {F}}^m_2, y \in {\mathbb {F}}^n_2}\frac{f_\textrm{emb}(x,y)}{\sqrt{2^m}}\left|x\right>\left|y\right> \\ & \!\!\!\! {\mathop {=}\limits ^{\text {Eq. }(4)}} \sum _{\alpha \in {\mathbb {F}}^m_2, \beta \in {\mathbb {F}}^n_2}\frac{\mathcal {F}{f_\textrm{emb}}(\alpha ,\beta )}{\sqrt{2^{m}}}\left|\alpha \right>\left|\beta \right> \\ & \!\!\!\! {\mathop {=}\limits ^{\text {Eq. }(3)}} \sum _{\alpha \in {\mathbb {F}}^m_2, \beta \in {\mathbb {F}}^n_2} \frac{\textrm{Cor}(f;\alpha ,\beta )}{\sqrt{2^n}}\left|\alpha \right>\left|\beta \right>. \end{aligned}$$

Hence we have Eq. (5). Eq. (6) immediately follows from Eq. (5).    \(\square \)
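Proposition 3 can be sanity-checked with a small classical state-vector simulation. The sketch below is an illustration under stated assumptions, not an efficient implementation: it stores all \(2^{m+n}\) amplitudes, indexes the basis state \(\left|\alpha \right>\left|\beta \right>\) by the integer `(alpha << n) | beta`, assumes the usual definition of \(\textrm{Cor}\), and uses hypothetical helper names.

```python
import math


def dot(a, b):
    """Inner product over F_2 of two bit vectors encoded as integers."""
    return bin(a & b).count("1") & 1


def hadamard_all(state):
    """Apply H to every qubit of a state vector (fast Walsh-Hadamard transform)."""
    out = list(state)
    size, h = len(out), 1
    while h < size:
        for start in range(0, size, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return [v / math.sqrt(size) for v in out]


def cea_state(f, m, n):
    """Amplitudes of CEA^f before measurement, indexed by (alpha << n) | beta."""
    state = [0.0] * 2 ** (m + n)
    # Steps (a)-(c): uniform superposition over x, then |x>|0^n> -> |x>|f(x)>.
    for x in range(2 ** m):
        state[(x << n) | f(x)] = 1 / math.sqrt(2 ** m)
    # Step (d): Hadamard transform on all m + n qubits.
    return hadamard_all(state)


def cor(f, m, alpha, beta):
    """Cor(f; alpha, beta) = 2^-m * sum_x (-1)^(alpha.x XOR beta.f(x))."""
    total = sum((-1) ** (dot(alpha, x) ^ dot(beta, f(x))) for x in range(2 ** m))
    return total / 2 ** m


# Hypothetical toy function F_2^3 -> F_2^2, for illustration only.
TABLE = [0, 3, 1, 1, 2, 0, 3, 2]


def toy(x):
    return TABLE[x]
```

For this toy function, every amplitude of `cea_state(toy, 3, 2)` agrees with \(\textrm{Cor}(f;\alpha ,\beta )/\sqrt{2^n}\), and the squared amplitudes sum to one, as Eq. (6) with S equal to the whole space requires.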

Some Remarks. CEA is quite close to the Bernstein-Vazirani algorithm [5] when \(n=1\), and some previous works [20, 59] already observe similar relationships between linear correlations and the Bernstein-Vazirani algorithm. Still, the analysis in the previous works is done only in the case of \(n=1\). To obtain speed-up for multidimensional (zero correlation) linear and integral distinguishers, our analysis for general n involving both input and output masks is essential. Furthermore, we observe that a similar relationship holds for generalized linear correlations over an arbitrary finite abelian group and the general quantum Fourier transform. See Sect. 8 for details.

5 Speed-Up for Multidimensional Linear Distinguishers

By using the CEA in the previous section, here we show quantum linear distinguishers achieving a bigger speed-up than Kaplan et al.’s when a multidimensional linear approximation with high correlations exists. Recall that what the algorithm \(\textsf{CEA}^f\) does is to apply the unitary operator \(H^{\otimes (m+n)} U_f (H^{\otimes m} \otimes I_n)\) on \(\left|0^m\right>\left|0^n\right>\) and measure the entire state by the computational basis. By abuse of notation, let \(\textsf{CEA}^f\) also denote the operator \(H^{\otimes (m+n)} U_f (H^{\otimes m} \otimes I_n)\) itself.

We show three distinguishersFootnote 10 \(\mathcal {A}_1\), \(\mathcal {A}_2\), and \(\mathcal {A}_3\). \(\mathcal {A}_1\) is a general distinguisher applicable to arbitrary multidimensional linear approximations. \(\mathcal {A}_2\) (resp., \(\mathcal {A}_3\)) is applicable only when the input and output masks are linearly independent (resp., completely dependent). Here are some remarks on notations and assumptions.

  • We assume that \(\sum _{(\alpha ,\beta ) \in V - \{\textbf{0}\}} \textrm{Cor}(E_K;\alpha ,\beta )^2 \ge c\) holds with a high probability for some \(c>0\), and that we know the value of c. \(\mathcal {O}\) denotes the given oracle, which is either \(E_K\) for a random K or a random permutation P.

  • (Notations for \(\mathcal {A}_2\)) When the input and output masks are linearly independent, i.e., \(V=V_1 \times V_2\) holds for some subspaces \(V_1,V_2 \subset \mathbb {F}^n_2\), we denote \(\dim (V_1)\) and \(\dim (V_2)\) by u and w, respectively. In addition, \(S_1 := \{\alpha _1,\dots ,\alpha _u\}\) and \(S_2 := \{\beta _1,\dots ,\beta _w\}\) denote bases of \(V_1\) and \(V_2\), respectively. Without loss of generality, we assume \(V_2 = \{\beta ||0^{n-w} | \beta \in {\mathbb {F}}^w_2 \}\) and \(\beta _i = \textbf{e}_i\)Footnote 11. In particular, we regard V as a subspace of \(\mathbb F^n_2 \times \mathbb F^w_2\).

  • (Notations for \(\mathcal {A}_3\)) When the input and output masks are linearly completely dependent, we fix a basis \(S := \{(\alpha _i,\beta _i)\}_{1 \le i \le \dim (V)}\) of V such that both of \(\{\alpha _i\}_{1 \le i \le \dim (V)}\) and \(\{\beta _i\}_{1 \le i \le \dim (V)}\) are linearly independent in \({\mathbb {F}}^n_2\). W.l.o.g., we assume \(\beta _i = \textbf{e}_i\)Footnote 12. Especially, we regard V as a subspace of \(\mathbb F^n_2 \times \mathbb F^{\dim (V)}_2\).

Distinguishers \(\mathcal A_1\), \(\mathcal A_2\), and \(\mathcal A_3\). All the three distinguishers are obtained by applying the algorithm \(\mathcal{Q}\mathcal{D}\) of Proposition 2. The difference between the distinguishers is the choice of the parameters s and t, the unitary operator U, and the Boolean functionFootnote 13 F, which is as follows.

  • \(\mathcal A_1\) (general case): \((s,t) := (3,c/2^n)\) and \(U := \textsf{CEA}^\mathcal {O}\). F is the Boolean function of which the domain is \({\mathbb {F}}^n_2 \times {\mathbb {F}}^n_2\) and \(F(\alpha ,\beta ) = 1\) iff \((\alpha ,\beta ) \in V - \{\textbf{0}\}\).

  • \(\mathcal A_2\) (linearly independent masks): \((s,t) := (3,c/2^w)\) and \(U := \textsf{CEA}^{\textsf{msb}_w[\mathcal {O}]}\). F is the Boolean function of which the domain is \({\mathbb {F}}^n_2 \times {\mathbb {F}}^w_2\) and \(F(\alpha ,\beta ) = 1\) iff \((\alpha ,\beta ) \in V - \{\textbf{0}\}\).

  • \(\mathcal A_3\) (linearly completely dependent masks): \((s,t) := (3,c/2^{\dim (V)})\) and \(U := \textsf{CEA}^{\textsf{msb}_{\dim (V)}[\mathcal {O}]}\). F is the Boolean function of which the domain is \({\mathbb {F}}^n_2 \times {\mathbb {F}}^{\dim (V)}_2\) and \(F(\alpha ,\beta ) = 1\) iff \((\alpha ,\beta ) \in V - \{\textbf{0}\}\).

Here, sampling a unitary U according to \(D_1\) (resp., \(D_2\)) in Proposition 2 corresponds to sampling a random key K for a real cipher (resp., choosing an ideally random permutation P).

Analysis. If input and output masks are linearly independent, \(\mathcal {A}_2\) distinguishes \(E_K\) and P in time \(O(\sqrt{2^w/c})\) roughly due to the following reasoning. If the oracle given to \(\mathcal {A}_2\) is \(E_K\), the probability that we observe \((\alpha ,\beta ) \in F^{-1}(1)\) when measuring \(\textsf{CEA}^{\textsf{msb}_w[E_K]}\left|0^n\right>\left|0^{w}\right>\) is approximately lower bounded by \(c/2^{w}\). Hence, QAA on \(\textsf{CEA}^{\textsf{msb}_w[E_K]}\) and F with \(O(\sqrt{2^{w}/c})\) iterations returns \((\alpha ,\beta ) \in F^{-1}(1)\) (i.e., \(\mathcal {A}_2\) returns 1) with high probability. On the other hand, if the oracle given to \(\mathcal {A}_2\) is a random permutation P, from Claim 1 it follows that the probability that we observe \((\alpha ,\beta ) \in F^{-1}(1)\) when measuring \(\textsf{CEA}^{\textsf{msb}_{w}[P]}\left|0^n\right>\left|0^{w}\right>\) is approximately upper bounded by \(2^{\dim (V)}/2^{n+w}\). Especially, the probability that QAA on \(\textsf{CEA}^{\textsf{msb}_w[P]}\) and F with \(O(\sqrt{2^{w}/c})\) iterations returns \((\alpha ,\beta ) \in F^{-1}(1)\) (i.e., \(\mathcal {A}_2\) returns 1) is negligibly small. For similar reasons, \(\mathcal A_1\) and \(\mathcal A_3\) distinguish \(E_K\) and P in time \(O(\sqrt{2^n/c})\) and \(O(\sqrt{2^{\dim (V)}/c})\), respectively.
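The gap between the two cases can be illustrated with the standard closed form for amplitude amplification: if a measurement succeeds with probability p, then after i iterations the success probability becomes \(\sin ^2((2i+1)\theta )\) with \(\sin ^2\theta = p\). The sketch below only evaluates this formula and does not reproduce the algorithm \(\mathcal{Q}\mathcal{D}\) of Proposition 2; the probabilities used in the usage note are hypothetical stand-ins for \(c/2^w\) and \(2^{\dim (V)}/2^{n+w}\).

```python
import math


def qaa_success(p, iterations):
    """Success probability after `iterations` rounds of amplitude amplification,
    starting from an initial success probability p."""
    theta = math.asin(math.sqrt(p))
    return math.sin((2 * iterations + 1) * theta) ** 2


def qaa_iterations(p):
    """Iteration count floor(pi/4 * sqrt(1/p)) tuned to success probability p."""
    return math.floor(math.pi / 4 * math.sqrt(1 / p))
```

For instance, with a hypothetical real-world probability of \(2^{-10}\), `qaa_iterations` gives 25 iterations and a success probability above 0.99, whereas the same 25 iterations applied to a hypothetical ideal-world probability of \(2^{-30}\) leave the success probability below \(10^{-5}\).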

More precisely, define parameters \(\mathsf c_i\), \(\textsf{pb}_i\), and \(\mathsf T_i\) for \(i=1,2,3\) as follows.

  • \(\mathsf c_{1} := \mathsf c_3 = 2^{-n}\), \(\mathsf c_2 := 2^{-n-w+\dim (V)}\).

  • \(\textsf{pb}_1 := \frac{2^{\dim (V)+7}(n+1)}{2^{2n} \cdot c} + 2^{-\dim (V)+1} \cdot n^{-2}\), \(\textsf{pb}_2 := \frac{2^{\dim (V)+7}(n+1)}{2^{n+w} \cdot c} + 2^{-\dim (V)+1} \cdot n^{-2}\), and \(\textsf{pb}_3 := \frac{2^{7}(n+1)}{2^{n} \cdot c} + 2^{-\dim (V)+1} \cdot n^{-2}\).

  • \(\mathsf T_1 := 6\sqrt{2^n/c}\), \(\mathsf T_2 := 6\sqrt{2^w/c}\), and \(\mathsf T_3 := 6\sqrt{2^{\dim (V)}/c}\).

Then the following proposition holds.

Proposition 4

Let \(i=1,2\), or 3. Suppose that \(c \gg \mathsf c_i\), and that \(1/4 > \sum _{(\alpha ,\beta ) \in V - \{\textbf{0}\}} \textrm{Cor}(E_K;\alpha ,\beta )^2 \ge c\) holds with a constant probability p when K is randomly chosen. If \(\mathcal {A}_i\) runs relative to the real cipher \(E_K\), then the probability that \(\mathcal {A}_i\) outputs 1 is at least p/2. If \(\mathcal {A}_i\) runs relative to a random permutation P, then the probability that \(\mathcal {A}_i\) outputs 1 is approximately upper bounded by \(\textsf{pb}_i\). In addition, the time complexity of \(\mathcal {A}_i\) is at most \(\mathsf T_i\). (The probabilities are taken not only over the randomness of \(\mathcal {A}_i\) but also over the randomness of choices of K or P.)

This proposition can be proven by applying Proposition 2 and Claim 1 in a straightforward manner. Still, we provide a proof in Section C of the full version [36] for completeness.

Some Remarks. The speed-ups in this section are not always quadratic. Still, a quadratic speed-up is obtained in the specific case when input-output masks are linearly independent and \(u=w\) (by applying \(\mathcal A_2\)). In this case, the classical complexity is about \(2^{\ell / 2}/(\textrm{capacity}) = 2^w / (\textrm{capacity})\) because \(\ell = \dim (V) = \dim (V_1)+\dim (V_2) = u+w = 2w\). Meanwhile, if \(\mathcal A_2\) is appliedFootnote 14, the complexity drops to about \(\sqrt{2^{w} / (\textrm{capacity})}\). For other cases, the speed-up is not quadratic in general, except for the one-dimensional case.

In the one-dimensional case, the asymptotic complexity of our technique is the same as Kaplan et al.’s, but the non-asymptotic complexity becomes smaller in a specific situation. For example, suppose that we have a one-dimensional linear approximation of a cipher, and the absolute value of the linear correlation is concentrated in a very narrow range around a known value \(c > 0\). Then, some analysis shows that the combination of \(\textsf{CEA}\) and QAA distinguishes the cipher by making \(( \sqrt{2} \pi ) \cdot (1/c)\) queries to the oracle. (This is faster than \(\mathcal A_3\) in Proposition 4. Here, we consider running QAA only once, whereas \(\mathcal A_3\) runs QAA multiple times. A single run of QAA is sufficient here since the variance of the correlation is assumed to be small.) Meanwhile, Kaplan et al.’s distinguisher requires about \((2 \sqrt{2} \pi ) \cdot (1/c)\) queries. (See Appendix E of the full version [36] for details.) Thus our attack is about 2 times faster in this situation. For general cases where the variance of the correlation may be large, we do not observe an evident difference between ours and Kaplan et al.’s because we have to run QAA multiple times (as \(\mathcal A_3\) does).

So far we have discussed how to distinguish block ciphers from random permutations, but we expect the above distinguishers are also applicable to distinguish keyed functions from random functions of n-bit inputs, without changing the asymptotic complexity (in the same way as classical linear distinguishers work not only for permutations but also for functions). Below we give some application examples, but they are essentially distinguishers on keyed functions from random functions, rather than block ciphers from random permutations.

5.1 Application Example: FEA-1 and FEA-2 Structures

FEA is a Korean standard (TTAK.KO-12.0275) for format preserving encryption [48], which has two variants named FEA-1 and FEA-2. Both variants adopt tweakable Feistel structures. Here we study linear distinguishers on these structures when round functions are ideally random.

The FEA-1 and FEA-2 structures are shown in Fig. 1. As in usual Feistel structures, plaintexts are divided into two parts. We focus on the case when the widths of the two branches are equal. A tweak T is also divided into two parts, denoted by \(T_L\) and \(T_R\), and processed in an alternating manner. In FEA-1, the i-th round function takes \(T_L\) (resp., \(T_R\)) when \(i \equiv 1\) (resp., \(i \equiv 0\)) mod 2. In FEA-2, the i-th round function takes \(T_L\) (resp., \(T_R\)) when \(i \equiv 2\) (resp., \(i \equiv 0\)) mod 3. The \((3j+1)\)-th round function of FEA-2 does not take any tweak (or equivalently, takes a constant value instead). For simplicity, we assume the tweak length is sufficiently large.

Fig. 1. The FEA-1 structure (left) and FEA-2 structure (right).

At CRYPTO 2021, Beyne showed multidimensional linear distinguishers on these structures [6]. The multidimensional linear approximationFootnote 15 for FEA-1 is a vector space V of completely linearly dependent input-output masks with \(\dim (V)=n/2\) (where n is the block size of the Feistel structure), and the sum of the squared correlations \(\sum _{(\alpha ,\beta ) \in V}\textrm{Cor}(\alpha ,\beta )^2\) is equal to \(2^{n(1-r/4)}\). Meanwhile, the approximation for FEA-2 is a vector space \(V'\) of linearly independent input-output masks with \(\dim (V)=\dim (V'_2)=n/2\) (here, we assume \(V'\) is decomposed as \(V' = V'_1 \times V'_2\)), and the sum of the squared correlations is equal to \(2^{n(1-r/6)}\).

The classical distinguishing complexity is \(O(2^{(r/4-3/4)n})\) for FEA-1 (resp., \(O(2^{(r/6-3/4)n})\) for FEA-2). By applying our quantum distinguishers above, the complexity is reduced to \(O(2^{(r/8-1/4)n})\) (resp., \(O(2^{(r/12-1/4)n})\)).
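The quantum exponents can be reproduced with exact arithmetic. The sketch below is a hedged worked example: it takes the classical exponents as stated in the text, and derives the quantum ones from the bounds \(\mathsf T_3 = 6\sqrt{2^{\dim (V)}/c}\) and \(\mathsf T_2 = 6\sqrt{2^{w}/c}\) under the assumption that \(\dim (V) = w = n/2\) and that c equals the stated sum of squared correlations. All exponents are expressed as multiples of n, and the function name is hypothetical.

```python
from fractions import Fraction


def fea_exponents(variant, r):
    """Return (classical, quantum) exponents e such that the cost is O(2^(e*n))."""
    if variant == "FEA-1":
        log_cap = 1 - Fraction(r, 4)  # sum of squared correlations = 2^(n(1 - r/4))
        classical = Fraction(r, 4) - Fraction(3, 4)  # as stated in the text
    elif variant == "FEA-2":
        log_cap = 1 - Fraction(r, 6)  # sum of squared correlations = 2^(n(1 - r/6))
        classical = Fraction(r, 6) - Fraction(3, 4)  # as stated in the text
    else:
        raise ValueError("unknown variant")
    # sqrt(2^(n/2) / c): half of (1/2 - log_cap), assuming dim(V) = w = n/2.
    quantum = (Fraction(1, 2) - log_cap) / 2
    return classical, quantum
```

The derived quantum exponents simplify to \(r/8-1/4\) for FEA-1 and \(r/12-1/4\) for FEA-2, matching the figures above.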

Remark 1

In [6], linear distinguishers are extended to message recovery attacks and key recovery attacks. Our distinguishers could also be extended to message or key recovery attacks in the quantum setting by simply guessing the secret information with Grover search. However, a non-trivial extension of interest (beyond just applying Grover) would not be straightforward and would require a new idea.

6 Speed-Up for Zero Correlation Linear Distinguishers

This section shows how CEA can be used to speed-up (multidimensional) zero correlation linear distinguishers [9]. We first recall the basic ideas of attacks in the classical setting.

6.1 Classical Zero Correlation Linear Distinguishers

Unlike linear cryptanalysis, zero correlation linear cryptanalysis exploits linear approximations of which the correlation is exactly zero.

For instance, let \(E_K\) be an n-bit block cipher and suppose \(\textrm{Cor}(E_K;\alpha ,\beta )=0\) holds for some input and output masks \(\alpha ,\beta \ne 0^n\). Then, \(\left( \textrm{Cor}(P;\alpha ,\beta )\right) ^2\) for a random permutation P is distributed around \(2^{-n}\) and non-zero with high probability. Hence we can distinguish \(E_K\) from P if we have sufficiently many \((\approx 2^n)\) plaintext-ciphertext pairs by checking whether the estimated empirical correlation is zero or not.

This idea naturally extends to attacks exploiting multidimensional linear approximations of correlation zero (below we follow the notations of Sect. 3.2). Again, let \(E_K\) be an n-bit block cipher and \(V \subset {\mathbb {F}}^n_2 \times {\mathbb {F}}^n_2\) be a vector space such that \(\textrm{Cor}(E_K;\alpha ,\beta )=0\) for all \((\alpha ,\beta ) \in V\). Moreover, let S be an arbitrary basis of V. Then the distribution \(p^{E_K}_S\) over \({\mathbb {F}}^{\dim (V)}_2\) defined by \(p^{E_K}_S(z) := \Pr _x\left[ \textsf{Lin}^{E_K}_S(x) = z\right] \) exactly matches the uniform distribution. On the other hand, the distribution \(p^P_S\) similarly defined for a random permutation P is slightly different from the uniform distribution. Hence \(E_K\) and P can be distinguished by using suitable statistical tests. Indeed, Bogdanov et al. [8] showed that \(E_K\) can be distinguished in time \(O(2^n/\sqrt{2^{\dim (V)}})\) in such a settingFootnote 16.

Remark 2

In the special case where the input-output masks are independent and \(V=V_1 \times V_2\) holds, we can achieve the time complexity \(O(2^n/{2^{\dim (V_1)}})\) instead of \(O(2^n/\sqrt{2^{\dim (V)}})\) by using the link between zero correlation linear cryptanalysis and integral cryptanalysis, which we will elaborate in Sect. 7.

6.2 Quantum Speed-Up by CEA

Next, we study how to speed-up (multidimensional) zero correlation linear distinguishers by using CEA and QAA.

As with the linear distinguishers in Sect. 5, we introduce three distinguishers, which we denote by \(\mathcal {B}_1\), \(\mathcal {B}_2\), and \(\mathcal {B}_3\). \(\mathcal {B}_1\) is a general distinguisher applicable to arbitrary multidimensional linear approximations. \(\mathcal {B}_2\) (resp., \(\mathcal {B}_3\)) is applicable when the input and output masks are linearly independent (resp., completely dependent).

In what follows, we assume that \(\textrm{Cor}(E_K;\alpha ,\beta )^2 =0\) holds for all \((\alpha ,\beta ) \in V - \{\textbf{0}\}\). \(\mathcal {O}\) denotes the oracle, which is either \(E_K\) for a random K or a random permutation P. For notations related to \(\mathcal {B}_2\) and \(\mathcal {B}_3\), we use the same ones as those for \(\mathcal A_2\) and \(\mathcal A_3\) introduced in Sect. 5.

Distinguisher \(\mathcal {B}_1\) (General Case). When nothing can be assumed on linear dependence of masks, a natural way to mount a distinguisher by using QAA and CEA is to run the following procedure.

  1. Let \(F : {\mathbb {F}}^n_2 \times {\mathbb {F}}^n_2 \rightarrow {\mathbb {F}}_2\) be the Boolean function such that \(F(\alpha ,\beta )=1\) iff \((\alpha ,\beta ) \in V - \{\textbf{0}\}\).

  2. Apply QAA on \(\textsf{CEA}^{\mathcal O}\) and F with the number of iterations \(\lfloor \frac{\pi }{4} \sqrt{2^{2n-\dim (V)}} \rfloor \). Namely, let the unitary operator \(Q(\textsf{CEA}^{\mathcal O},F)^{i}\textsf{CEA}^{\mathcal O}\) act on \(\left|0^n\right>\left|0^n\right>\) with \(i=\lfloor \frac{\pi }{4} \sqrt{2^{2n-\dim (V)}}\rfloor \). Then, measure the resulting state by the computational basis and let \((\alpha ,\beta )\) be the observed bit string.

  3. If \(F(\alpha ,\beta )=0\), return 1. Otherwise, return 0.

Some analysis shows that this algorithm distinguishes \(E_K\) and P with high probability. However, the running time of \(\mathcal {B}_1\) is \(O(\sqrt{2^{2n-\dim (V)}})=O(2^n/\sqrt{2^{\dim (V)}})\), which is the same as the complexity of the classical distinguisher. Namely, \(\mathcal {B}_1\) does not obtain any speed-up from classical attacks. Meanwhile, we can obtain some quantum speed-up when input-output masks are linearly independent or linearly completely dependent, which we explain below.

Remark 3

\(\mathcal B_1\) runs QAA only once, unlike \(\mathcal{Q}\mathcal{D}\) of Proposition 2 (or its applications \(\mathcal A_1\)-\(\mathcal A_3\)) running QAA multiple times. This is because the probability \(\Pr [F(\alpha ,\beta )=1]\) is exactly zero when \((\alpha ,\beta )\) is obtained by measuring the state \(\textsf{CEA}^{E_K}\left|0^n\right>\left|0^n\right>\), and thus we can achieve a sufficiently high advantage with a single run of QAA.

Distinguishers \(\mathcal {B}_2\) and \(\mathcal B_3\). Here we show distinguishers \(\mathcal B_2\) and \(\mathcal B_3\) for linearly independent and completely dependent masks, respectivelyFootnote 17.

\(\mathcal {B}_2\) is obtained by modifying the unitary operators and the number of iterations for QAA in \(\mathcal {B}_1\). Specifically, we change

  1. the unitary operator for QAA of \(\mathcal {B}_1\) from \(\textsf{CEA}^{\mathcal O}\) to \(\textsf{CEA}^{\textsf{msb}_w[\mathcal {O}]}\), and

  2. the number of iterations from \(\lfloor \frac{\pi }{4} \sqrt{2^{2n-\dim (V)}} \rfloor \) to \(\lfloor \frac{\pi }{4} \sqrt{2^{n+w-\dim (V)}} \rfloor = \lfloor \frac{\pi }{4} \sqrt{2^{n-u}} \rfloor \).

\(\mathcal {B}_3\) is obtained just by changing the parameter w appearing in \(\mathcal {B}_2\) to \(\dim (V)\).
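The iteration counts of the three distinguishers follow one pattern: \(\lfloor \frac{\pi }{4}\sqrt{2^{k}}\rfloor \), where \(2^{-k}\) is the approximate probability of observing an element of \(F^{-1}(1)\) for a random permutation. A minimal sketch with hypothetical function names, using floating-point arithmetic and therefore intended only for small parameters:

```python
import math


def qaa_iterations(k):
    """floor(pi/4 * sqrt(2^k)), where 2^-k approximates the ideal-world probability."""
    return math.floor(math.pi / 4 * math.sqrt(2 ** k))


def b1_iterations(n, dim_v):
    """B_1: exponent 2n - dim(V)."""
    return qaa_iterations(2 * n - dim_v)


def b2_iterations(n, u, w):
    """B_2: exponent n + w - dim(V) with dim(V) = u + w, which equals n - u."""
    return qaa_iterations(n + w - (u + w))


def b3_iterations(n, dim_v):
    """B_3: replace w by dim(V), so the exponent n + dim(V) - dim(V) equals n."""
    return qaa_iterations(n + dim_v - dim_v)
```

For example, with the illustrative parameters \(n=16\), \(u=8\), \(w=4\), the counts for \(\mathcal B_1\), \(\mathcal B_2\), and \(\mathcal B_3\) are 804, 12, and 201, respectively.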

\(\mathcal {B}_2\) distinguishes \(E_K\) and P with high probability, roughly for the following reason: If the oracle given to \(\mathcal {B}_2\) is \(E_K\), \(\mathcal {B}_2\) always returns 1. If the oracle given to \(\mathcal {B}_2\) is a random permutation P, Claim 1 guaranteesFootnote 18 that the probability that we observe \((\alpha ,\beta ) \in F^{-1}(1)\) when measuring \(\textsf{CEA}^{\textsf{msb}_w[P]}\left|0^n\right>\left|0^w\right>\) is approximately equal to \(2^{\dim (V)}/2^{n+w} = 2^{u-n}\). Hence the QAA with \(O(\sqrt{2^{n-u}})\) iterations in Step 2 of \(\mathcal {B}_2\) returns \((\alpha ,\beta ) \in F^{-1}(1)\) with high probability, and \(\mathcal {B}_2\) returns 0. Thus \(\mathcal {B}_2\) distinguishes \(E_K\) and P. Especially, \(\mathcal {B}_2\) achieves a quadratic speed-up in the special case where \(w=1\) (see Remark 2). Similarly, \(\mathcal B_3\) distinguishes \(E_K\) in time \(O(\sqrt{2^n})\). More precisely, the following proposition holds.

Proposition 5

If \(\mathcal {B}_2\) (resp., \(\mathcal B_3\)) runs relative to \(E_K\), then \(\mathcal {B}_2\) (resp., \(\mathcal B_3\)) always outputs 1. If \(\mathcal {B}_2\) (resp., \(\mathcal B_3\)) runs relative to a random permutation P, then the probability that \(\mathcal {B}_2\) (resp., \(\mathcal B_3\)) outputs 0 is approximately lower bounded by \(\frac{1}{2} \cdot \left( 1 - {2^{-\dim (V)+1}}\right) \). In addition, the running time of \(\mathcal {B}_2\) (resp., \(\mathcal B_3\)) is at most \(2\lfloor \frac{\pi }{4} \sqrt{2^{n-u}} \rfloor + 1\) (resp., \(2\lfloor \frac{\pi }{4} \sqrt{2^{n}} \rfloor + 1\)) encryptions by \(E_K\). (The probabilities are taken not only over the randomness of \(\mathcal {B}_2\) or \(\mathcal B_3\) but also over the randomness of choices of K or P.)

A proof of the proposition is given in Section G of the full version [36].

6.3 Applications

Both of \(\mathcal {B}_2\) and \(\mathcal {B}_3\) have various immediate applications. For instance, Bogdanov and Rijmen showed multidimensional zero correlation linear approximations on the 5-round balanced Feistel structure, 18-round 4-branch Type-I generalized Feistel structure, and 9-round 4-branch Type-II generalized Feistel structure (see Fig. 2 and Table 1 of the full version [36]) when round functions are bijections. The input-output masks of the linear approximations are linearly completely dependent. Thus \(\mathcal {B}_3\) distinguishes these constructions in time \(O(2^{n/2})\) (when inputs and outputs are n bits). In fact the linear approximations on the 4-branch Type-I/II generalized Feistel structures can be extended to k-branch structures for generalFootnote 19 k in a straightforward manner, and \(\mathcal {B}_3\) distinguishes \((k^2+k-2)\)-round (resp., \((2k+1)\)-round) k-branch Type-I (resp., Type-II) generalized Feistel structure in time \(O(2^{n/2})\).

Fig. 2. One round of Balanced and (4-branch) generalized Feistel structures. What we assume is only that P and \(P'\) are bijections. Our attacks work regardless of whether P (and \(P'\)) for different rounds are independent or not.

Table 1. Input-output mask patterns for balanced and generalized Feistel structures. \(\alpha \in {\mathbb {F}}^{n/2}_2\) and \(\beta \in {\mathbb {F}}^{n-\frac{n}{k}}_2\) are non-zero values. “0” for generalized Feistel structures denotes \(0^{n/k} \in {\mathbb {F}}^{n/k}_2\).

Note that the numbers of rounds we attack here are larger than those broken by previous polynomial time attacks using Simon’s algorithm in a black-box way. The number of rounds of balanced (resp., k-branch Type-I and Type-II) Feistel broken by previous polynomial time attacks is 4 [40] (resp., \(k^2-k+1\) [51] and \(k+1\) [27]). See also Table 2 of the full version [36].

In fact, the complexity of our distinguishers may also be achieved just by speeding up a one-dimensional zero-correlation linear distinguisher with simpler techniques. Still, to the authors’ best knowledge, we are the first to point out the existence of attacks with such complexity.

Many other previous works also show zero correlation approximations [1, 8, 9, 10, 56], and our \(\mathcal {B}_2\) or \(\mathcal {B}_3\) can be applied to all of them in principle. The amount of quantum speed-up compared to classical distinguishers depends on the linear approximations, and the speed-up is at most quadratic.

7 Speed-Up for Integral Distinguishers

This section shows applications of CEA to integral cryptanalysis. As shown by Bogdanov et al. [8] and Sun et al. [56], the balanced property of a cipher is equivalent to multidimensional zero correlation linear properties whose input-output masks are linearly independent. Specifically, the following proposition holds (see Footnote 20).

Proposition 6

([8, 56]). Let \(F : {\mathbb {F}}^m_2 \rightarrow {\mathbb {F}}^n_2\) be a function. Let \(V_1 \subset {\mathbb {F}}^m_2, V_2 \subset {\mathbb {F}}^n_2\) be sub-vector spaces, and \(V := V_1 \times V_2\). Then the following conditions are equivalent.

  1. V is the set of input-output masks of a multidimensional zero correlation linear approximation of F, i.e., \(\textrm{Cor}(F;\alpha ,\beta )=0\) for all \((\alpha ,\beta ) \in V - \{\textbf{0}\}\).

  2. The function \(G : x \mapsto \beta \cdot F(x \oplus \lambda )\) is balanced over \(V^\perp _1\) for all \(\lambda \in {\mathbb {F}}^m_2\) and \(\beta \in V_2 - \{\textbf{0}\}\).

Remark 4

Note that this equivalence holds only for balanced property but not for zero-sum property. Our quantum attacks below also rely on the above equivalence. Especially, the attacks are applicable only if a balanced property exists.
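The equivalence in Proposition 6 can be checked by brute force on toy functions. The Python sketch below is only an illustration: vectors are encoded as integers, subspaces as explicit lists of elements, and all helper names (`span`, `zero_correlation`, `balanced_on_cosets`) are our own assumptions rather than notation from the paper.

```python
def dot(a: int, b: int) -> int:
    """Inner product over F_2 of two bit vectors encoded as integers."""
    return bin(a & b).count("1") & 1


def span(basis):
    """All elements of the F_2-subspace generated by `basis`."""
    vecs = {0}
    for b in basis:
        vecs |= {v ^ b for v in vecs}
    return sorted(vecs)


def orth(V, m):
    """Orthogonal complement of V inside F_2^m."""
    return [x for x in range(2 ** m) if all(dot(x, v) == 0 for v in V)]


def cor(F, m, alpha, beta):
    """Linear correlation of the lookup table F with masks (alpha, beta)."""
    total = sum((-1) ** (dot(alpha, x) ^ dot(beta, F[x])) for x in range(2 ** m))
    return total / 2 ** m


def zero_correlation(F, m, V1, V2):
    """Condition 1: Cor(F; a, b) = 0 for all non-zero (a, b) in V1 x V2."""
    return all(cor(F, m, a, b) == 0 for a in V1 for b in V2 if (a, b) != (0, 0))


def balanced_on_cosets(F, m, V1, V2):
    """Condition 2: b . F(x + lam) is balanced over V1^perp for all lam, b != 0."""
    V1_perp = orth(V1, m)
    for lam in range(2 ** m):
        for b in V2:
            if b != 0 and sum((-1) ** dot(b, F[x ^ lam]) for x in V1_perp) != 0:
                return False
    return True


if __name__ == "__main__":
    F = list(range(16))  # identity on F_2^4, used only as a toy example
    V1, V2 = span([1, 2]), span([4, 8])
    print(zero_correlation(F, 4, V1, V2), balanced_on_cosets(F, 4, V1, V2))
```

Both conditions are evaluated independently, so agreement of the two return values on arbitrary tables is a sanity check of the proposition, not a proof.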

Recall that the distinguisher \(\mathcal {B}_2\) (Proposition 5) is applicable when a multidimensional zero correlation linear approximation exists and the input-output masks are linearly independent. Together with Proposition 6, this implies the following proposition.

Proposition 7

Let \(E_K\) be an n-bit block cipher. Suppose some output bits of \(E_K\) are balanced over a vector space \(V \subset {\mathbb {F}}^n_2\). (W.l.o.g., we assume the most significant w bits are balanced, and let \(V' := \left\{ x||0^{n-w} \left| x \in {\mathbb {F}}^w_2 \right. \right\} \).) Then, by applying \(\mathcal {B}_2\) on the zero correlation multidimensional linear approximations of \(V^\perp \times V'\), we can distinguish \(E_K\) from P with time and query complexity at most \(2\lfloor \frac{\pi }{4}\sqrt{2^{\dim (V)}}\rfloor + 1\). \(\mathcal {B}_2\) always outputs 1 if the given encryption oracle is the real cipher \(E_K\). If the oracle is a random permutation P, then \(\mathcal {B}_2\) outputs 0 with probability at least \(\frac{1}{2}\left( 1 - 2^{-\dim (V)+1}\right) .\)

This proposition shows that we can obtain an (almost) quadratic speed-up for integral distinguishers because the complexity of \(\mathcal {B}_2\) is \(\approx 1.6\sqrt{2^{\dim (V)}}\) while the complexity of the classical integral distinguisher is \(2^{\dim (V)}\).
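To make the comparison concrete, the bound of Proposition 7 can be evaluated numerically. The following Python sketch simply evaluates the stated formula; the function name `b2_query_bound` is hypothetical, and floating-point square roots are assumed to be accurate enough for the moderate dimensions used here.

```python
import math


def b2_query_bound(dim_v: int) -> int:
    """Upper bound 2*floor((pi/4)*sqrt(2^dim(V))) + 1 from Proposition 7."""
    return 2 * math.floor(math.pi / 4 * math.sqrt(2 ** dim_v)) + 1


if __name__ == "__main__":
    for dim_v in (8, 16, 32):
        quantum = b2_query_bound(dim_v)
        classical = 2 ** dim_v  # cost of the classical integral distinguisher
        print(dim_v, quantum, classical, quantum / math.sqrt(classical))
```

The last printed column is the ratio of the bound to \(\sqrt{2^{\dim (V)}}\), which approaches \(\pi /2\) as the dimension grows.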

Still, this is at most a quadratic speed-up. At first glance, achieving a more than quadratic speed-up seems impossible for integral distinguishers. However, we see the possibility of a more than quadratic speed-up in some situations.

7.1 Possibility of More Than Quadratic Speed-Up

Roughly speaking, if a part of the outputs of a cipher (e.g., a specific byte of ciphertexts) is balanced on multiple mutually orthogonal vector spaces included in the input space, it may be possible to achieve a more than quadratic quantum speed-up by using CEA.

Specifically, let \(E_K : {\mathbb {F}}^n_2 \rightarrow {\mathbb {F}}^n_2\) be a block cipher, and suppose there exist sub-vector spaces \(V_1,\dots ,V_s \subset {\mathbb {F}}^n_2\) satisfying the following conditions.

  1. \(V_1,\dots ,V_s\) are mutually orthogonal, i.e., \(V_i \perp V_j\) for \(i\ne j\).

  2. There exists some \(d \le n/2\) such that \(\dim (V_i)=d\) holds for all i.

  3. A part of the outputs of \(E_K\) is balanced on \(V_i \oplus \lambda \) for all \(1 \le i \le s\) and arbitrary \(\lambda \in {\mathbb {F}}^n_2\). (For ease of explanation, below we assume the most significant w bits of outputs of \(E_K\) are balanced.)

Then, by Proposition 6 we have \( \textrm{Cor}(\textsf{msb}_w[E_K];\alpha ,\beta )=0 \) if \( \alpha \in (V_1)^\perp \cup \cdots \cup (V_s)^\perp - \{\textbf{0}\} \) and \( (\alpha ,\beta ) \ne (0,0). \) This means

$$ \Pr _K\left[ (\alpha ,\beta ) \leftarrow \textsf{CEA}^{\textsf{msb}_w[E_K]} : \alpha \perp V_i \text { for some { i}} \text { and } \alpha \ne 0 \text { and } \beta \ne 0\right] \ = 0. $$

Meanwhile, for a random permutation P we have

$$\begin{aligned} &\Pr _P\left[ (\alpha ,\beta ) \leftarrow \textsf{CEA}^{\textsf{msb}_w[P]} : \alpha \perp V_i \text { for some { i}} \text { and } \alpha \ne 0 \text { and } \beta \ne 0\right] \\ &\quad \quad {\mathop {=}\limits ^{(*)}} \sum _{\begin{array}{c} \alpha \ne 0, \beta \ne 0 \\ \alpha \perp V_i \text { for some } i \end{array}} \mathbb E_P \left[ \frac{\textrm{Cor}(\textsf{msb}_w[P];\alpha ,\beta )^2}{2^w}\right] \\ &\quad \quad = \sum _{\begin{array}{c} \alpha \ne 0, \beta \ne 0 \\ \alpha \perp V_i \text { for some } i \\ \textsf{lsb}_{n-w}[\beta ]=0 \end{array}} \mathbb E_P \left[ \frac{\textrm{Cor}(P;\alpha ,\beta )^2}{2^w}\right] {\mathop {=}\limits ^{(**)}} \sum _{\begin{array}{c} \alpha \ne 0, \beta \ne 0 \\ \alpha \perp V_i \text { for some } i \\ \textsf{lsb}_{n-w}[\beta ]=0 \end{array}} \frac{1}{2^w (2^n-1)} \\ &\quad \quad \quad = \# \left\{ \alpha \in {\mathbb {F}}^n_2-\{\textbf{0}\} \left| \alpha \perp V_i \text { for some } i \right. \right\} \cdot \frac{\# \left\{ \beta \in {\mathbb {F}}^n_2-\{\textbf{0}\} \left| \textsf{lsb}_{n-w}[\beta ]=0 \right. \right\} }{2^{w}(2^n-1)} \\ &\quad \quad \ge \left( \sum _{1 \le i \le s}|V_i^\perp | - \sum _{1 \le i < j \le s} |V_i^\perp \cap V_j^\perp | - 1 \right) \cdot \frac{2^w-1}{2^{w}(2^n-1)} \\ &{\mathop {\ge }\limits ^{V_i \perp V_j \text { for } i \ne j}} \left( s2^{n-d} - s^2 2^{n-2d} - 1 \right) \cdot \frac{2^w-1}{2^{w}(2^n-1)} \quad \approx \quad \frac{s }{2^d}. \end{aligned}$$

Here, \((*)\) (resp., \((**)\)) follows from Proposition 3 (resp., Proposition 11 of the full version [36]).

Therefore, \(E_K\) can be distinguished from P in time about \(\frac{\pi }{2}\sqrt{2^d/s}\) by applying QAA on \(\textsf{CEA}^{\textsf{msb}_w[E_K]}\) (or \(\textsf{CEA}^{\textsf{msb}_w[P]}\)) and the Boolean function \(F : {\mathbb {F}}^n_2 \times {\mathbb {F}}^w_2 \rightarrow {\mathbb {F}}_2\) such that \(F(\alpha ,\beta )=1\) iff \(\alpha \perp V_i\) for some \(i=1,\dots ,s\) and \(\alpha \ne 0\) and \(\beta \ne 0\).

This can lead to a more than quadratic speed-up compared to the corresponding classical integral distinguisher (when \(s \ge 4\)) because the classical complexity is \(2^{d}\): Even if we have such multiple spaces \(V_1,\dots ,V_s\), what we can do in the classical setting is just to choose a single space \(V_i\) and check whether (a part of) \(E_K\) is balanced on that space, unless some additional properties can be assumed.
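The estimates above can be evaluated with exact arithmetic. The Python sketch below computes the lower bound on the success probability for a random permutation and the resulting QAA running time; the function names are hypothetical, and the parameters in the demo (n = 128, d = 32, s = 4, w = 8) are assumed values chosen to match the AES-like setting discussed in this section.

```python
import math
from fractions import Fraction


def ideal_success_lower_bound(n: int, d: int, s: int, w: int) -> Fraction:
    """Lower bound (s*2^(n-d) - s^2*2^(n-2d) - 1) * (2^w - 1) / (2^w * (2^n - 1))."""
    count = s * 2 ** (n - d) - s * s * 2 ** (n - 2 * d) - 1
    return Fraction(count * (2 ** w - 1), 2 ** w * (2 ** n - 1))


def qaa_time(d: int, s: int) -> float:
    """Approximate running time (pi/2)*sqrt(2^d / s) of the quantum distinguisher."""
    return math.pi / 2 * math.sqrt(2 ** d / s)


if __name__ == "__main__":
    n, d, s, w = 128, 32, 4, 8  # assumed example parameters
    bound = ideal_success_lower_bound(n, d, s, w)
    print(float(bound), s / 2 ** d)      # bound versus the approximation s / 2^d
    print(qaa_time(d, s), 2 ** (d / 2))  # quantum time versus sqrt of classical 2^d
```

With these parameters the bound is slightly below \(s/2^d\), mainly because of the factor \((2^w-1)/2^w\).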

Application Examples. To see how the above distinguisher can be applied to concrete ciphers, let us recall the 3.5-round integral property of AES as an example [25]. If a certain tuple of four input cells takes all values while the others are constant, then a single column after the first round takes all values while the others remain constant, and each cell after 3.5 rounds is balanced (see Fig. 3). Since there are four choices on which tuple of four cells to activate (i.e., which column after the first round to activate), we are in the situation of the distinguisher explained above with \(d=32\) and \(s=4\) (\(V_i\) corresponds to a tuple of four active cells of inputs).

Fig. 3. The integral property of the 3.5-round AES. Cells with filled circles are those taking all values and others are constants. “\(\times 2^{24}\)” indicates that \(2^{24}\) sets of the same active cell pattern shown in the figure are observed.

In fact, this example itself is not so significant because more efficient distinguishers exist: If a tuple of four input cells takes all values as in Fig. 3, then a certain tuple of four output cells also takes all values. This means that the integral property specifies a 32-bit permutation between part of inputs and outputs. Hence the 3.5-round AES is distinguished by checking if this part contains a collision in time \(\approx \sqrt[3]{2^{32}}\) with the BHT algorithm [19].

Still, we observe an interesting attack when s is relatively large. For instance, suppose \(s=2^d\) holds. This situation happens if, e.g., \(E_K\) is a 4-bit cell SPN cipher with the same integral property as the 2.5-round AES (the latter 2.5 round of Fig. 3). Then \(\Pr \left[ F(\alpha ,\beta )=1 \right] \ \approx s / 2^d = 1\) holds when \((\alpha ,\beta )\) is obtained by measuring \(\textsf{CEA}^{\textsf{msb}_w[P]}\), while the probability is always zero for the real cipher \(E_K\). Thus we can distinguish \(E_K\) only with a single quantum query, which apparently exhibits a more than quadratic speed-up.

The margin compared to the square root of the classical complexity is not large, but this example is important in that it is a new type of example, obtained with a classical cryptanalytic technique, that illustrates a qualitative difference between classical and quantum computation.

8 Extension to Generalized Linear Distinguishers

Linear cryptanalysis is useful when group operations are done in \(\mathbb {Z}^n_2\), but some ciphers use other group operations such as modular additions (i.e., additions in \(\mathbb {Z}/2^n\mathbb {Z}\)), where generalized linear cryptanalysis on arbitrary finite groups [4] is more useful. Generalized linear cryptanalysis uses group characters instead of bit masks, but we observe again there exists a close relationship between (generalized) correlations and quantum computation via Fourier transform. This section shows how the technique of Sect. 5 extends to generalized linear distinguishers. In this section, the symbol “\(\oplus \)” denotes the direct sum of groups.

8.1 Fourier Transform on Arbitrary Finite Abelian Group

Let G be an arbitrary finite abelian group. Then, by the fundamental theorem of finite abelian groups, there is a group isomorphism from G to \(\mathbb {Z}/N_1\mathbb {Z} \oplus \cdots \oplus \mathbb {Z}/N_m\mathbb {Z}\) for some positive integers \(N_1,\dots ,N_m\). We fix an isomorphism and identify the two groups. Recall that a character of a finite abelian group G is a group homomorphism \(\phi : G \rightarrow \mathbb {C}^\times \). The set of characters of G is denoted by \(\hat{G}\), which forms a group by point-wise multiplication. It is well-known that \(\hat{G}\) is isomorphic to G as a group.

Specifically, for each \(w = (w_1,\dots ,w_m) \in G\), the function

$$ \textsf{ch}_w : (x_1,\dots ,x_m) \mapsto \textrm{exp}\left( 2\pi i \frac{x_1 w_1}{N_1}\right) \cdots \textrm{exp}\left( 2\pi i \frac{x_m w_m}{N_m}\right) $$

is a character of G. In fact the map \(w \mapsto \textsf{ch}_w\) defines a group isomorphism from G to \(\hat{G}\). We identify G with \(\hat{G}\) by this isomorphism.

Let G be a finite abelian group and \(F : G \rightarrow \mathbb {C}\) be a function. Then, the Fourier transform of F over G is a function \(\mathcal {F}_GF : G \rightarrow \mathbb {C}\) defined by

$$ \mathcal {F}_GF (w) := \sum _{x \in {G}}\frac{1}{\sqrt{|G|}} \cdot \overline{\textsf{ch}_w(x)} \cdot F(x). $$

The inverse transform of \(\mathcal {F}_G\), denoted by \(\mathcal {F}^*_G\), is given by \(\mathcal {F}^*_GF(x) = \sum _{w \in {G}}\frac{1}{\sqrt{|G|}} \cdot {\textsf{ch}_x(w)} \cdot F(w)\).

We naturally identify a function from G to \(\mathbb {C}\) (resp., the set of the functions from G to \(\mathbb {C}\)) with a vector in the |G|-dimensional vector space \(\mathbb {C}^{|G|}\) (resp., the vector space \(\mathbb {C}^{|G|}\)). Moreover, we assume that \(\mathbb {C}^{|G|}\) is endowed with the standard Hermitian inner product. Then \(\mathcal {F}_G\) can be regarded as a unitary operator.
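The definitions of this subsection can be exercised directly on a small group. The following Python sketch represents \(G = \mathbb {Z}/N_1\mathbb {Z} \oplus \cdots \oplus \mathbb {Z}/N_m\mathbb {Z}\) by the tuple `orders` and a function \(G \rightarrow \mathbb {C}\) by a dictionary; these representations and all function names are our own assumptions, chosen only for illustration.

```python
import cmath
from itertools import product


def elements(orders):
    """All elements of Z/N_1 + ... + Z/N_m as tuples."""
    return list(product(*[range(N) for N in orders]))


def ch(w, x, orders):
    """Character ch_w evaluated at x, following the product formula in the text."""
    phase = sum(xi * wi / N for xi, wi, N in zip(x, w, orders))
    return cmath.exp(2j * cmath.pi * phase)


def fourier(F, orders):
    """Fourier transform of F over G (uses the conjugated character)."""
    G = elements(orders)
    scale = len(G) ** 0.5
    return {w: sum(ch(w, x, orders).conjugate() * F[x] for x in G) / scale for w in G}


def inverse_fourier(F, orders):
    """Inverse Fourier transform of F over G."""
    G = elements(orders)
    scale = len(G) ** 0.5
    return {x: sum(ch(x, w, orders) * F[w] for w in G) / scale for x in G}


if __name__ == "__main__":
    orders = (2, 4)  # toy group Z/2 + Z/4
    F = {x: complex(i + 1, -i) for i, x in enumerate(elements(orders))}
    back = inverse_fourier(fourier(F, orders), orders)
    print(max(abs(back[x] - F[x]) for x in F))  # round-trip error, expected to be tiny
```

Unitarity shows up numerically as preservation of the squared norm \(\sum _x |F(x)|^2\) under `fourier`.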

8.2 Linear Correlations

Let G, H be finite abelian groups and \(f : G \rightarrow H\) be a function. For \(\alpha \in G\) and \(\beta \in H\), the (generalized) linear correlation \(\textrm{Cor}(f;\alpha ,\beta )\) is defined as

$$ \textrm{Cor}(f;\alpha ,\beta ) := \sum _{x \in G} \frac{1}{|G|} \overline{\textsf{ch}_\beta (f(x))} \cdot \textsf{ch}_\alpha (x). $$

We call \(\alpha \) (resp., \(\beta \)) an input mask (resp., output mask).

Let \(f_\textrm{emb} : G \oplus H \rightarrow \mathbb {C}\) be the function defined by \(f_\textrm{emb}(x,y) = 1\) if \(y=f(x)\) and \(f_\textrm{emb}(x,y) = 0\) if \(y \ne f(x)\). Then, some straightforward calculation shows that

$$\begin{aligned} \left( \left( \mathcal {F}^*_{G} \otimes \mathcal {F}_H\right) f_\textrm{emb}\right) (\alpha ,\beta ) = \sqrt{|G|/|H|} \cdot \textrm{Cor}(f;\alpha ,\beta ) \end{aligned}$$
(7)

holds. (This corresponds to Eq. (3) for the linear cryptanalysis over \((\mathbb {Z}/2\mathbb {Z})^{\oplus n}\).)

8.3 Extension of CEA

For an arbitrary finite abelian group G, we assume that elements of G are appropriately encoded into n-bit strings for some n s.t. \(|G| \le 2^n\). Let \(\psi : G\rightarrow \mathbb {C}\) be a function satisfying \(\sum _{x \in G}|\psi (x)|^2=1\), and \(\left|\psi \right> := \sum _x\psi (x)\left|x\right>\). Recall that the Quantum Fourier Transform (QFT) over an abelian group G, denoted by \(\textrm{QFT}_G\), is defined by

$$\begin{aligned} \textrm{QFT}_G\left|\psi \right> = \sum _x (\mathcal {F}^*_G\psi )(x)\left|x\right>. \end{aligned}$$
(8)

With these notations, the extension of CEA on a function \(f : G \rightarrow H\) (G and H are finite abelian groups) is obtained by replacing the Hadamard transform in CEA with the QFT (or its inverse) over G and H. Specifically, the extended algorithm runs as follows.

Extended Version of CEA

  (a) Prepare the initial state \(\left|0_G\right>\left|0_H\right>\).

  (b) Apply \(\textrm{QFT}_G\) on the first (left) register.

  (c) Apply \(U_f\) on the state (i.e., make a quantum query to f).

  (d) Apply \(\textrm{QFT}_G \otimes \textrm{QFT}^*_H\) on the state.

  (e) Measure the entire state by the computational basis and return the observed result \((\alpha ,\beta ) \in G \oplus H\).

We also use the symbol \(\textsf{CEA}^f\) to denote the extended algorithm.

The following proposition is an extension of Proposition 3.

Proposition 8

The quantum state of \(\textsf{CEA}^f\) before the final measurement is

$$\begin{aligned} \sum _{\alpha \in G,\beta \in H} \frac{\textrm{Cor}(f;\alpha ,\beta )}{\sqrt{|H|}} \left|\alpha \right>\left|\beta \right>. \end{aligned}$$
(9)

In particular, for any subset \(S \subset G \oplus H\),

$$\begin{aligned} \Pr \left[ (\alpha ,\beta ) \leftarrow \textsf{CEA}^f : (\alpha ,\beta ) \in S\right] = \sum _{(\alpha ,\beta ) \in S}\frac{\textrm{Cor}(f;\alpha ,\beta )^2}{|H|} \end{aligned}$$
(10)

holds.

Proof

The quantum state of \(\textsf{CEA}^f\) before the final measurement is

$$\begin{aligned} &(\textrm{QFT}_G \otimes \textrm{QFT}^*_H) U_f \left( \textrm{QFT}_G \otimes I_n\right) \left|0_G\right>\left|0_H\right>\\ &\qquad = (\textrm{QFT}_G \otimes \textrm{QFT}^*_H) U_f \sum _{x \in G}\frac{1}{\sqrt{|G|}}\left|x\right>\left|0_H\right> \\ &\qquad = (\textrm{QFT}_G \otimes \textrm{QFT}^*_H) \sum _{x \in G}\frac{1}{\sqrt{|G|}}\left|x\right>\left|f(x)\right> \\ & \quad {\mathop {=}\limits ^{\text {Def. of }f_\textrm{emb}}} (\textrm{QFT}_G \otimes \textrm{QFT}^*_H) \sum _{x \in G, y \in H}\frac{f_\textrm{emb}(x,y)}{\sqrt{|G|}}\left|x\right>\left|y\right> \\ & \quad {\mathop {=}\limits ^{\text {Def. of QFT}}} \sum _{\alpha \in G, \beta \in H}\frac{((\mathcal {F}^*_G\otimes \mathcal {F}_H){f_\textrm{emb}})(\alpha ,\beta )}{\sqrt{|G|}}\left|\alpha \right>\left|\beta \right> \\ & \qquad {\mathop {=}\limits ^{\text {Eq. }(7)}} \sum _{\alpha \in G, \beta \in H} \frac{\textrm{Cor}(f;\alpha ,\beta )}{\sqrt{|H|}}\left|\alpha \right>\left|\beta \right>. \end{aligned}$$

Hence we have Eq. (9). Equation (10) immediately follows from Eq. (9).    \(\square \)
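Proposition 8 can be checked numerically by a classical state-vector simulation of steps (a) to (e). The Python sketch below is restricted, as a simplifying assumption, to cyclic groups \(G = \mathbb {Z}/N\mathbb {Z}\) and \(H = \mathbb {Z}/M\mathbb {Z}\); the function names are hypothetical. Since generalized correlations are complex in general, measurement probabilities are computed as squared moduli of the amplitudes.

```python
import cmath


def ch(w: int, x: int, N: int) -> complex:
    """Character ch_w of Z/N evaluated at x."""
    return cmath.exp(2j * cmath.pi * w * x / N)


def cor(f, N: int, M: int, alpha: int, beta: int) -> complex:
    """Generalized linear correlation of f : Z/N -> Z/M."""
    return sum(ch(beta, f(x), M).conjugate() * ch(alpha, x, N) for x in range(N)) / N


def cea_state(f, N: int, M: int) -> dict:
    """Amplitudes of the extended CEA just before the final measurement."""
    # Steps (a)-(c): uniform superposition over x, then one query to f.
    amp = {(x, f(x)): 1 / N ** 0.5 for x in range(N)}
    # Step (d): QFT_G on the first register and the inverse QFT over H on the second.
    out = {}
    for a in range(N):
        for b in range(M):
            out[(a, b)] = sum(
                c * ch(a, x, N) / N ** 0.5 * ch(b, y, M).conjugate() / M ** 0.5
                for (x, y), c in amp.items()
            )
    return out


if __name__ == "__main__":
    f = lambda x: (3 * x + 1) % 8  # toy affine map over Z/8, an assumed example
    state = cea_state(f, 8, 8)
    worst = max(
        abs(state[(a, b)] - cor(f, 8, 8, a, b) / 8 ** 0.5)
        for a in range(8) for b in range(8)
    )
    print(worst)  # deviation from Eq. (9), expected to be near zero
```

For the affine toy map, the correlation has modulus 1 exactly when \(\alpha = 3\beta \bmod 8\), so the measurement returns one of these eight mask pairs.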

8.4 Quantum Speed-Up for Generalized Linear Distinguishers

Let \(f : G \rightarrow H\) be a function, where G and H are finite abelian groups. Here we define linearly independent masks and linearly completely dependent masks.

  1. Suppose G and H are decomposed as \(G=G_1 \oplus G_2\) and \(H = H_1 \oplus H_2\). Then, we say the set \(G_1 \oplus H_1 (\subset G \oplus H)\) is a set of linearly independent input-output masks.

  2. Suppose again the decomposition \(G=G_1 \oplus G_2\) and \(H = H_1 \oplus H_2\), and assume that there is a group isomorphism \(\phi : G_1 \rightarrow H_1\). Then we say that the set \(\{ (g,\phi (g)) | g \in G_1\}\) is a set of linearly completely dependent input-output masks.

We show distinguishers when input-output masks are linearly independent or completely dependent, which correspond to \(\mathcal {A}_2\) and \(\mathcal {A}_3\) in Sect. 5. We provide only rough ideas and heuristic estimations, and omit detailed analysis.

Distinguisher for Linearly Independent Input-Output Masks. Suppose \(f_K : G \rightarrow H\) is a keyed function, G and H are decomposed as \(G=G_1 \oplus G_2\) and \(H = H_1 \oplus H_2\), and \(\sum _{\alpha \in G_1, \beta \in H_1} {\textrm{Cor}(f_K;\alpha ,\beta )^2}/|H_1| \gg \frac{1}{|G_2|}\) holds. Let \(f^{(1)}_K : G \rightarrow H_1\) be the projection of \(f_K\) onto \(H_1\), and \(F : G \oplus H_1 \rightarrow \{0,1\}\) be the Boolean function such that \(F(\alpha ,\beta ) = 1\) iff \((\alpha ,\beta ) \in G_1 \oplus H_1\). Then,

$$\begin{aligned} p_\textrm{real}:= \Pr \left[ (\alpha ,\beta ) \leftarrow \textsf{CEA}^{f^{(1)}_K} : F(\alpha ,\beta )=1 \right] = \sum _{(\alpha ,\beta ) \in G_1 \oplus H_1}\frac{\textrm{Cor}(f_K;\alpha ,\beta )^2}{|H_1|} \end{aligned}$$

follows from Proposition 8. Meanwhile, for a random function \(\textsf{RF}: G \rightarrow H\),

$$\begin{aligned} p_\textrm{ideal}&:= \Pr \left[ (\alpha ,\beta ) \leftarrow \textsf{CEA}^{\textsf{RF}^{(1)}} : F(\alpha ,\beta )=1 \right] = \sum _{(\alpha ,\beta ) \in G_1 \oplus H_1}\frac{\textrm{Cor}(\textsf{RF};\alpha ,\beta )^2}{|H_1|} \\ &\approx \sum _{(\alpha ,\beta ) \in G_1 \oplus H_1} \frac{1}{|G|} \cdot \frac{1}{|H_1|} = \frac{1}{|G_2|}. \end{aligned}$$

(We heuristically assume the third equality approximately holds due to [6, Theorem 3.2].) Since \(p_\textrm{real}\gg p_\textrm{ideal}\) by assumption, we can distinguish \(f_K\) from \(\textsf{RF}\) by applying the QAA on \(\textsf{CEA}^{f^{(1)}_K}\) (or \(\textsf{CEA}^{\textsf{RF}^{(1)}}\)) and F with \(O(\sqrt{1/p_\textrm{real}})\) iterations.

Distinguisher for Linearly Completely Dependent Input-Output Masks. Again, let \(f_K : G \rightarrow H\) be a keyed function, and suppose G and H are decomposed as \(G=G_1 \oplus G_2\) and \(H = H_1 \oplus H_2\). Moreover, assume there is a group isomorphism \(\phi : G_1\rightarrow H_1\) and \(\sum _{\alpha \in G_1} {\textrm{Cor}(f_K;\alpha ,\phi (\alpha ))^2}/|H_1| \gg \frac{1}{|G|}\) holds. Let \(F : G \oplus H_1 \rightarrow \{0,1\}\) be the Boolean function such that \(F(\alpha ,\beta ) = 1\) iff \(\alpha \in G_1\) and \(\beta = \phi (\alpha )\). Then, from Proposition 8,

$$\begin{aligned} p_\textrm{real}:= \Pr \left[ (\alpha ,\beta ) \leftarrow \textsf{CEA}^{f^{(1)}_K} : F(\alpha ,\beta )=1 \right] = \sum _{\alpha \in G_1}\frac{\textrm{Cor}(f_K;\alpha ,\phi (\alpha ))^2}{|H_1|} \end{aligned}$$

follows. On the other hand, for a random function \(\textsf{RF}: G \rightarrow H\) we have

$$\begin{aligned} p_\textrm{ideal}&:= \Pr \left[ (\alpha ,\beta ) \leftarrow \textsf{CEA}^{\textsf{RF}^{(1)}} : F(\alpha ,\beta )=1 \right] = \sum _{\alpha \in G_1 }\frac{\textrm{Cor}(\textsf{RF};\alpha ,\phi (\alpha ))^2}{|H_1|} \\ &\approx \sum _{\alpha \in G_1 } \frac{1}{|G|} \cdot \frac{1}{|H_1|} = \frac{1}{|G|}. \end{aligned}$$

Since \(p_\textrm{real}\gg p_\textrm{ideal}\) holds by assumption, we can distinguish \(f_K\) from \(\textsf{RF}\) by applying the QAA on \(\textsf{CEA}^{f^{(1)}_K}\) (or \(\textsf{CEA}^{\textsf{RF}^{(1)}}\)) and F with \(O(\sqrt{1/p_\textrm{real}})\) iterations.

Application to the FF3-1 Structure. Beyne [6] showed generalized linear distinguishers on the FF3-1 structure in addition to linear distinguishers on FEA. The FF3-1 structure is almost the same as the FEA-1 structure (see Fig. 1), but the XOR operations in FEA-1 are replaced with modular additions in FF3-1. Thus, a generalized linear distinguisher is more suitable for the FF3-1 structure.

The (generalized) linear approximation for FF3-1 in [6] is similar to the multidimensional linear approximation for FEA-1, but the underlying groups are different from \(\mathbb {Z}^n_2\). In fact, firstly a keyed function \(F_K : \mathbb {Z}/2^{n/2}\mathbb {Z} \oplus \mathbb {Z}^t_2 \rightarrow \mathbb {Z}/2^{n/2}\mathbb {Z}\) is built from the FF3-1 structure by fixing some inputs (here, input means plaintext and tweak) and truncating some outputs, and the distinguisher is applied to \(F_K\). The set (sub-group) of input-output masks is given by \( \left\{ ((\alpha ,0),\alpha ) \in (\mathbb {Z}/2^{n/2}\mathbb {Z} \oplus \mathbb {Z}^t_2) \oplus \mathbb {Z}/2^{n/2}\mathbb {Z} \left| \alpha \in \mathbb {Z}/2^{n/2}\mathbb {Z}\right. \right\} \). In particular, the input-output masks are linearly completely dependent. The corresponding sum of the squared correlations is estimated as \(\sum _{\alpha \in \mathbb {Z}/2^{n/2}\mathbb {Z}}\textrm{Cor}(F_K;(\alpha ,0),\alpha )^2 \approx 2^{-n(r/4-1)}\), and the classical distinguishing complexity is \(O(2^{(r/4-3/4)n})\).

On the other hand, if we apply the quantum distinguisher explained above, we achieve the complexity \(O(2^{(r/8-1/4)n})\).
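The exponent can be recovered from the estimates above as follows. This is only a heuristic sketch: it assumes \(H_1 = \mathbb {Z}/2^{n/2}\mathbb {Z}\), so that \(|H_1| = 2^{n/2}\), and that \(p_\textrm{real}\gg p_\textrm{ideal}\) holds for the parameters at hand.

$$\begin{aligned} p_\textrm{real} &= \sum _{\alpha \in \mathbb {Z}/2^{n/2}\mathbb {Z}}\frac{\textrm{Cor}(F_K;(\alpha ,0),\alpha )^2}{|H_1|} \approx \frac{2^{-n(r/4-1)}}{2^{n/2}} = 2^{-(r/4-1/2)n}, \\ \sqrt{1/p_\textrm{real}} &\approx 2^{(r/8-1/4)n}. \end{aligned}$$

The second line is the number of QAA iterations, which gives the stated complexity \(O(2^{(r/8-1/4)n})\).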

9 Concluding Remarks

This paper showed a quantum speed-up for multidimensional (zero correlation) linear distinguishers for the first time in a way that exploits multidimensional approximations non-trivially. Firstly, we observed that there is a close relationship between the subroutine of Simon’s algorithm and linear correlations of functions via Fourier transform. Specifically, a slightly modified version of the subroutine, which we call CEA, returns input and output linear masks of a target function with probability proportional to the squared linear correlation. Combining CEA with QAA, we achieved a quantum speed-up for multidimensional linear distinguishers. It is interesting that, with only a slight modification, the subroutine of Simon’s algorithm can be used to speed up such a statistical attack. Our technique is naturally extended to generalized linear distinguishers on arbitrary finite abelian groups by replacing the Hadamard transform in CEA with the general QFT. We also showed that CEA similarly speeds up multidimensional zero correlation linear distinguishers, as well as some integral distinguishers via the correspondence shown by Bogdanov et al. and Sun et al. [8, 56]. In particular, we observed that a more than quadratic speed-up is possible if an integral property holds on multiple mutually orthogonal vector spaces of the same dimension, and we even obtained a single-query distinguisher for a toy example of a 4-bit cell SPN cipher with the same integral property as the 2.5-round AES.

Future Directions. An important future work is to investigate how to extend our technique to key-recovery attacks, or to combine it with Schrottenloher’s technique [53].

All the distinguishers in this paper can be extended to key-recovery attacks just by guessing sub-keys of additional rounds using Grover’s algorithm. Suppose we would like to recover the key of an \((r+r')\)-round cipher and there is a (quantum) r-round distinguisher on the cipher running in time T. In addition, assume that we can apply the distinguisher on the intermediate r rounds if we know a k-bit subkey \(K'\) in the remaining \(r'\)-rounds. Then, roughly speaking, by just guessing the subkey \(K'\) with the Grover search while checking if a key-guess is correct with the distinguisher, we achieve an \((r+r')\)-round quantum key-recovery attack of time complexity \(O(T \cdot 2^{k/2})\).

Still, this idea is too naive compared to classical key-recovery attacks using sophisticated techniques such as the FFT [22, 29, 57]. As mentioned in Sect. 1.2, the recent work by Schrottenloher [53] has shown how to combine such key-recovery techniques using the FFT with the QFT, taking multiple linear approximations into account. However, in Schrottenloher’s attack, multiple approximations contribute only to the precision of the attack by a constant factor, and do not contribute much to reducing time complexity. It is definitely an important and interesting future work to investigate theoretical relationships between our technique and Schrottenloher’s and to study how to reduce the time complexity of key-recovery by exploiting multidimensional approximations.

Another important future work is to study quantum speed-up for integral distinguishers based on zero-sum properties. As mentioned before, our quantum integral distinguishers are applicable only if the distinguishers are based on balanced functions and not on a zero-sum property. However, distinguishers based on zero-sum properties often break more rounds than those based on balanced functions, especially when extended to key-recovery attacks. Since multiple integral properties sometimes could lead to a more than quadratic speed-up, a quantum attack breaking more rounds of a cipher than classical attacks may be found by investigating this direction.

It would also be of interest to investigate how the super-quadratic speed-up in Sect. 7.1 can be reproduced more broadly. We observe that the following two things are essential in achieving that speed-up: (i) There exist multiple properties that are similar to each other, but only one of them can be exploited at a time in the classical setting. (For the 2.5-round integral property of AES-like ciphers, there are 16 choices on which input cell to activate, but the existence of multiple choices is not exploited in the classical distinguisher.) (ii) The properties are translated/embedded into quantum amplitudes in some sense (by using \(\textsf{CEA}\), through the correspondence between integral and zero-correlation linear properties). So, if we find some classical properties satisfying (i) and a quantum technique enabling (ii), we might be able to reproduce similar quantum speed-ups, not only for linear/integral distinguishers but also for some other techniques.