
1 Introduction

Cryptographic hash functions are one of the central primitives in cryptography. They are used virtually everywhere: as cryptographically secure checksums to verify the integrity of software or data packages, as a building block in security protocols including TLS, SSH, and IPsec, as part of any efficient variable-input-length signature scheme, to build full-fledged hash-based signature schemes, in transformations for CCA-secure encryption, and many more.

While all widely deployed public-key cryptography is threatened by the rise of quantum computers, hash functions are believed to be only mildly affected. The reason for this is twofold. On the one hand, generic quantum attacks achieve at most a square-root speed-up compared to their pre-quantum counterparts and can be proven asymptotically optimal [8, 13, 20]. On the other hand, there are no dedicated quantum attacks on any specific hash function that perform better than the generic quantum attacks (except, of course, for hash functions based on number theory like, e.g., VSH [9]).

One of the most important properties of a hash function H is collision-resistance. That is, it is infeasible to find \(x\ne x'\) with \(H(x)=H(x')\). Intuitively, collision-resistance guarantees some kind of computational injectivity – given H(x), the value x is effectively determined. Of course, information-theoretically, x is not determined, but in many situations, we can treat the preimage x as unique, because we will never see another value with the same hash. For example, collision-resistant hashes can be used to extend the message space of signature schemes (by signing the hash of the message), or to create commitment schemes (e.g., sending \(H(x\Vert r)\) for random r commits us to x; the sender cannot change his mind about x because he cannot find another preimage).
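The commitment scheme sketched here is straightforward to transcribe into code. In the following sketch, SHA-256 merely stands in for a generic hash H, and the fixed 32-byte randomness is an assumed parameter choice (a fixed-length r keeps \(x\Vert r\) unambiguous):

```python
import hashlib
import os

def commit(x: bytes):
    """Commit to x by publishing H(x || r); r is revealed later to open."""
    r = os.urandom(32)  # fixed-length randomness keeps x || r unambiguous
    return hashlib.sha256(x + r).digest(), r

def open_commitment(c: bytes, x: bytes, r: bytes) -> bool:
    """Check that (x, r) opens the commitment c."""
    return hashlib.sha256(x + r).digest() == c

c, r = commit(b"vote: yes")
assert open_commitment(c, b"vote: yes", r)
assert not open_commitment(c, b"vote: no", r)
```

Collision-resistance of H makes this binding against classical adversaries; against quantum adversaries, binding requires the stronger collapsing notion discussed below.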

In the post-quantum setting,Footnote 1 however, it was shown by Unruh [18] that collision-resistance is weaker than expected: For example, the commitment scheme sketched in the previous paragraph is not binding: it is possible for an attacker to send a hash h, then to be given a value x, and then to send a random value r such that \(h=H(x\Vert r)\), thus opening the commitment to any desired value – even if H is collision-resistant against quantum adversaries.Footnote 2 This contradicts the intuitive requirement that H(x) determines x.

Fortunately, Unruh [18] also presented a strengthened security definition for post-quantum secure hash functions: collapsing hash functions. Roughly speaking, a hash function is collapsing if, given a superposition of values m, measuring H(m) has the same effect as measuring m (at least from the point of view of a computationally limited observer). Collapsing hash functions serve as a drop-in replacement for collision-resistant ones in the post-quantum setting: Unruh showed that several natural classical commitment schemes (namely the scheme sketched above, and the statistically-hiding schemes from [12]) become post-quantum secure when using a collapsing hash function instead of a collision-resistant one. The collapsing property also directly implies collision-resistance.

In light of these results, it is desirable to find hash functions that are collapsing. Unruh [18] showed that the random oracle is collapsing. (That is, a hash function \(H(x):=\mathcal{{O}}(x)\) is collapsing when \(\mathcal {O}\) is a random oracle.) However, this has little relevance for real-world hash functions: A practical hash function is typically constructed by iteratively applying some elementary building block (e.g., a “compression function”) in order to hash large messages. So even if we are willing to model the elementary building block as a random oracle, the overall hash function construction should arguably not be modeled as a random oracle.Footnote 3

For hash functions based on the Merkle-Damgård (MD) construction (such as SHA2 [16]), Unruh [19] showed: If the compression function is collapsing, so is the hash function resulting from the MD construction. In particular, if we model the compression function as a random oracle (as is commonly done in the analysis of practical hash functions), we have that hash functions based on the MD construction are collapsing (and thus suitable for use in a post-quantum setting).

However, not all hash functions are constructed using MD. Another popular construction is the sponge construction [4], underlying for example the current international hash function standard SHA3 [17], but also other hash functions such as Quark [2], Photon [11], Spongent [6], and Gluon [3]. The sponge construction builds a hash function H from a block functionFootnote 4 \({\mathbf {f}}\). In the classical setting, we know that the sponge construction is collision-resistant if the block function \({\mathbf {f}}\) is modeled as a random oracle, or a random permutation, or an invertible random permutation [5].Footnote 5 However, their proof does not carry over to the post-quantum setting: their proof relies on the fact that queries performed by the adversary to the block function are classical (i.e., not in superposition between different values). As first argued in [7], random oracles and related objects should be modeled as functions that can be queried in superposition of different inputs. (Namely, with a real hash function, an adversary can use a quantum circuit implementing SHA3 and can thereby query the function in superposition. The adversary could evaluate the sponge on the uniform superposition over all messages of a certain length, possibly helping him to, e.g., find a collision.) Thus, we do not know whether the sponge construction (and thus hash functions like SHA3) is collapsing (or at least collision-resistant in the post-quantum setting).

Our contributions. In the present paper we tackle the question whether the sponge construction is collision-resistant and collapsing in the post-quantum setting. We show:

  • If the block function \({\mathbf {f}}\) is collision-resistant when restricted to the left and right half of its output and it is hard to find a zero-preimage of \({\mathbf {f}}\) (restricted to the right half of its output), then the sponge construction is collision resistant.

  • If the block function \({\mathbf {f}}\) is collapsing when restricted to the left and right half of its output, respectively, and if it is hard to find a zero-preimage of \({\mathbf {f}}\) (restricted to the right half of its output), then the sponge construction is collapsing.

  • If the block function \({\mathbf {f}}\) is a random oracle or a random permutation, then the sponge construction is collapsing.

  • We give a quantum algorithm for finding collisions in any function (given access to a random oracle), in particular in the sponge construction. The number of quantum queries to \({\mathbf {f}}\) asymptotically matches our bounds for collision resistance.

It should be stressed that we do not show that the sponge construction is collapsing (or even collision-resistant) if the block function \({\mathbf {f}}\) is an efficiently invertible random permutation. In this case, it is trivial to find zero-preimages by applying the inverse permutation to 0. This means that the present result cannot be directly used to show the security of, say, SHA3, because SHA3 uses an efficiently invertible permutation as block function. Our results apply to hash functions where the block function is not (efficiently) invertible, e.g., Gluon [3]. This limitation seems to be merely an artifact of our proof technique.

Organization. In Sect. 2 (“Collapsing hash functions”), we recall the definition of collapsing hash functions and some important properties of that definition. In Sect. 3 (“The sponge construction”) we recall the sponge construction. In Sect. 4 (“Collision-resistance of the sponge construction”) we first show the collision resistance of the sponge construction. Then in Sect. 5 (“Sponges are collapsing”), we present our main result – that the sponge construction is collapsing. In Sect. 6 (“Quantum Attack”) we present a quantum algorithm for finding collisions that uses a random oracle. Additional details, including preliminaries and full proofs are given in the full version [10].

2 Collapsing Hash Functions

In this section, we recall the notion of collapsing hash functions H from [18]. We describe both the underlying intuition, as well as the formal definitions.

A hash function is a function \(H^\mathcal {O}:X \rightarrow Y\) for some domain X and range Y. (Typically, Y consists of fixed length bitstrings, and X consists of fixed length bitstrings or \(\{0,1\}^*\).) H can depend on an oracle \(\mathcal {O}\). (Typically, \(\mathcal{O}\) will be a random function, a random permutation, or simply be missing if we are in the standard model. Unless specified otherwise, we make no assumptions about the distribution of \(\mathcal {O}\).)

As mentioned in the introduction, intuitively, we wish that H(m) uniquely identifies m in some sense. In the classical setting, this naturally leads to the requirement that it is hard to find \(m\ne m'\) with \(H(m)=H(m')\). Then we can treat H(m) as if it had only a single preimage (even though, of course, a compressing H will have many preimages, we just cannot find them). In the quantum setting, there is another interpretation of the requirement that H(m) identifies m. Namely, if we are given a register M that contains a superposition of many values m, then measuring H(m) on that register should – intuitively – fully determine m. That is, the effect on the register M should be the same, no matter whether we measure just the hash H(m) or the whole message m. One can see that for any compressing function H, it is impossible that measuring H(m) and m has information-theoretically the same effect on the state.Footnote 6 However, what we can hope for is that for a computationally limited adversary, the two situations are indistinguishable. In other words, we require that no quantum-polynomial-time adversary can distinguish whether we measure H(m) or m. This property is then useful in proofs, because we can replace H(m)-measurements by m-measurements and vice versa.
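The information-theoretic half of this observation can be checked numerically. The following sketch (with an assumed toy space of four messages and illustrative hash tables) compares the post-measurement mixed state after measuring H(m) with the one after measuring m: for a compressing H they differ, while for an injective H they coincide.

```python
import numpy as np

def post_measurement_state(psi, H):
    """Density matrix on register M after measuring H(m) on |psi> = sum_m psi[m] |m>."""
    rho = np.zeros((len(psi), len(psi)))
    for h in set(H):
        # Projector onto the span of {|m> : H(m) = h}
        P = np.diag([1.0 if H[m] == h else 0.0 for m in range(len(psi))])
        v = P @ psi
        rho += np.outer(v, v)  # mix the (unnormalized) post-measurement branches
    return rho

psi = np.full(4, 0.5)              # uniform superposition over 4 messages
rho_m = np.diag(np.abs(psi) ** 2)  # state after measuring m itself

# Compressing hash (two messages per image): the two measurements differ.
rho_comp = post_measurement_state(psi, [0, 0, 1, 1])
assert not np.allclose(rho_comp, rho_m)

# Injective hash: measuring H(m) has the same effect as measuring m.
rho_inj = post_measurement_state(psi, [0, 1, 2, 3])
assert np.allclose(rho_inj, rho_m)
```

The compressing case leaves coherences within each preimage set (off-diagonal entries), which is exactly what a computationally bounded observer must be unable to detect for a collapsing H.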

We can slightly simplify this condition if we require that the register M already contains a superposition of values m that all have the same hash H(m). In this case, measuring H(m) has no effect on the state, so we can state the requirement as: If M contains a superposition of messages m with the same \(H(m)=h\), then no quantum-polynomial-time adversary can distinguish whether we measure M in the computational basis, or whether we do not measure it at all.

Or slightly more formally: We let the adversary A produce a register M and a hash value h (subject to the promise that measuring M would lead to an m with \(H(m)=h\)). The adversary additionally keeps an internal state in register S. Then we either measure M in the computational basis (\(\mathsf {Game}_1\), depicted in Fig. 1(a)), or we do not perform any such measurement (\(\mathsf {Game}_2\), depicted in Fig. 1(b)). Finally, we give registers S (the internal state) and M (the potentially measured message register) to the adversary’s second part B. We call H collapsing if no quantum-polynomial-time (AB) can distinguish \(\mathsf {Game}_1\) and \(\mathsf {Game}_2\).

Fig. 1.

Games from the definition of collapsing hash functions. \(\mathcal {M}\) represents a measurement in the computational basis. (AB) is assumed to satisfy the property that \(\mathcal {M}\) always returns m with \(H(m)=h\). A function is collapsing if the probabilities of \(b=1\) in the two games are negligibly close.

This is formalized by the following definition:

Definition 1

(Collapsing [18]). For algorithms A, B, consider the following games:

$$\begin{aligned} \mathsf {Game}_1&:\quad (S,M,h)\leftarrow A^\mathcal {O}(),\quad m\leftarrow \mathcal {M}(M),\quad b\leftarrow B^\mathcal {O}(S,M) \\ \mathsf {Game}_2&:\quad (S,M,h)\leftarrow A^\mathcal {O}(),\quad b\leftarrow B^\mathcal {O}(S,M) \end{aligned}$$

Here S, M are quantum registers. \(\mathcal {M}(M)\) is a measurement of M in the computational basis.

For a set \({\varvec{m}}\), we call an adversary (AB) valid on \({\varvec{m}}\) for \(H^\mathcal {O}\) iff  \(\Pr [H^\mathcal {O}(m)=h\, \wedge \ m\in {\varvec{m}}]=1\) when we run \((S,M,h)\leftarrow A^\mathcal {O}()\) and measure M in the computational basis as m. If we omit “on \({\varvec{m}}\)”, we assume \({\varvec{m}}\) to be the domain of \(H^\mathcal {O}\).

A function H is collapsing (on \(\mathbf m \)) iff for any quantum-polynomial-time adversary (AB) that is valid for \(H^\mathcal {O}\) (on \(\mathbf m \)), \(\bigl |\Pr [b=1:\mathsf {Game}_1] - \Pr [b=1:\mathsf {Game}_2]\bigr |\) is negligible.

The definition follows [18], except that we made the oracle \(\mathcal {O}\) explicit (which was implicit in [18]).

Miscellaneous facts. The following properties of collapsing hash functions will be useful throughout this paper. They are immediate consequences of their concrete-security variants as shown in Sect. 3.1 of the full version [10].

Lemma 2

If \(H^\mathcal {O}\) is injective, then \(H^\mathcal {O}\) is collapsing.

Theorem 3

If \(\mathcal {O}:\{0,1\}^e \rightarrow \{0,1\}^d\) is a random function with superlogarithmic d (in the security parameter), then \(H^\mathcal {O}:=\mathcal {O}\) is collapsing.

Lemma 4

If \(G^\mathcal {O}\circ H^\mathcal {O}\) is collapsing, and \(G^\mathcal {O}\) is quantum-polynomial-time computable, then \(H^\mathcal {O}\) is collapsing.

Lemma 5

If \(G^\mathcal {O}\) and \(H^\mathcal {O}\) are collapsing, and \(H^\mathcal {O}\) is quantum-polynomial-time computable, then \(G^\mathcal {O}\circ H^\mathcal {O}\) is collapsing.

Fig. 2.

The sponge construction \(\mathbf {S}\) with a four block input \(m_1\Vert m_2\Vert m_3\Vert m_4\) and a three block output \(h_1\Vert h_2\Vert h_3\). The application of the padding function is not depicted (we assume \(m_1\Vert m_2\Vert m_3\Vert m_4= pad (m)\)).

3 The Sponge Construction

In this section, we review the sponge construction introduced by [4]. The sponge construction has two internal parameters r and c called the rate and the capacity, respectively. The internal state has \(r+c\) bits. We refer to the first part of the state as the left state, and to the second part of the state as the right state. Underlying the sponge construction is a block function \({\mathbf {f}}\) that inputs and outputs \(r+c\) bits. To hash a message m, the message is first padded to a non-zero multiple of the rate r. That is, we use some injective padding function \( pad \) to get \(k\ge 1\) message blocks \(m_1\Vert \dots \Vert m_k= pad (m)\).Footnote 7 Then we XOR \(m_1\) to the left state, apply \({\mathbf {f}}\) to the (whole) state, XOR \(m_2\) to the left state, apply \({\mathbf {f}}\) to the state, ..., apply \({\mathbf {f}}\) to the state, XOR \(m_{k}\) to the left state. The steps performed so far are referred to as the absorbing phase (denoted in this paper with \(\mathbf {S}^{\textit{in}}\)). Now we start with the squeezing phase \(\mathbf {S}^{\textit{out}}\): We apply \({\mathbf {f}}\) to the state, read the left state as \(h_1\), apply \({\mathbf {f}}\) to the state, read the left state as \(h_2\), .... We continue to do so until \(h_1\Vert h_2\Vert \dots \) contains \(\ge n\) bits (where n is a parameter specifying the desired output length), and return the first n bits of \(h_1\Vert h_2\Vert \dots \). The whole process described here (padding, absorbing phase, squeezing phase) is the sponge construction, referred to as \(\mathbf {S}\) in this paper. Note that the use of the terms absorbing and squeezing phase in this paper differs slightly from the description in [4]: In this paper, we end the absorbing phase just before the last application of \({\mathbf {f}}\), whilst the original sponge paper includes that application of \({\mathbf {f}}\) in the absorbing phase. The separation we use helps to simplify the proofs in later sections. The resulting sponge construction is the same as in [4], though. The sponge construction is illustrated in Fig. 2 for the special case of \(k=4\) and \(n=3r\) (four input blocks and three output blocks). The following definition makes the above explanation precise:

Definition 6

(Sponge construction). Fix integers \(c>0\) (the capacity), \(r>0\) (the rate), and \(n>0\) (the output length). Fix \({\mathbf {f}}:\{0,1\}^{r+c} \rightarrow \{0,1\}^{r+c}\) (the block function) and \( pad :\{0,1\}^*\rightarrow (\{0,1\}^r)^+\) (where \((\{0,1\}^r)^+\) is the set of bit-strings consisting of r-bit blocks).

For \(m_1,\dots ,m_k\in \{0,1\}^r\), let

$$\begin{aligned} \mathbf {S}^{\textit{in}}_{c,r,{\mathbf {f}}} (m_1\Vert \dots \Vert m_k)&:={\mathbf {f}}\bigl (\mathbf {S}^{\textit{in}}_{c,r,{\mathbf {f}}}(m_1\Vert \dots \Vert m_{k-1})\bigr ) \oplus (m_k\Vert 0^{c}) \\ \mathbf {S}^{\textit{in}}_{c,r,{\mathbf {f}}}(m_1)&:=m_1\Vert 0^{c} \end{aligned}$$

(We call \(\mathbf {S}^{\textit{in}}\) the absorbing phase.)

For \(s\in \{0,1\}^{r+c}\), let

$$ \mathbf {S}^{\textit{out}}_{c,r,{\mathbf {f}},n}(s) = {\left\{ \begin{array}{ll} s'\Vert \mathbf {S}^{\textit{out}}_{c,r,{\mathbf {f}},n-|s'|}({\mathbf {f}}(s)) &{} (n>0) \\ \text {empty word} &{} (n = 0) \end{array}\right. } $$

where \(s'\) consists of the first \(\min \{n,r\}\) bits of \({\mathbf {f}}(s)\). (We call \(\mathbf {S}^{\textit{out}}\) the squeezing phase.)

Let \(\mathbf {S}_{c,r,{\mathbf {f}}, pad ,n}:=\mathbf {S}^{\textit{out}}_{c,r,{\mathbf {f}},n}\circ \mathbf {S}^{\textit{in}}_{c,r,{\mathbf {f}}} \circ pad \). We call \(\mathbf {S}_{c,r,{\mathbf {f}}, pad ,n}\) the sponge construction.

Usually, \(c,r,{\mathbf {f}}, pad ,n\) will be clear from the context. Then we omit them and simply write \(\mathbf {S}^{\textit{in}},\mathbf {S}^{\textit{out}}\), and \(\mathbf {S}\).
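As a concrete reference, Definition 6 can be transcribed into a short Python sketch. The parameters and the SHA-256-based block function below are our own toy assumptions (real designs like SHA3 use a permutation as \({\mathbf {f}}\)); padding is omitted, so the caller supplies r-bit blocks directly:

```python
import hashlib

R, C = 8, 16  # toy rate r and capacity c, in bits (assumed values)

def f(state: int) -> int:
    """Stand-in block function on (r+c)-bit states, built from SHA-256."""
    width = (R + C) // 8
    digest = hashlib.sha256(state.to_bytes(width, "big")).digest()
    return int.from_bytes(digest[:width], "big")

def absorb(blocks):
    """S^in: XOR each r-bit block into the left state, applying f in between."""
    state = blocks[0] << C  # S^in(m_1) = m_1 || 0^c
    for m in blocks[1:]:
        state = f(state) ^ (m << C)
    return state

def squeeze(state: int, n: int) -> int:
    """S^out: apply f, output the left min(n, r) bits, recurse on the rest."""
    if n == 0:
        return 0  # empty word
    s = f(state)
    take = min(n, R)
    prefix = s >> (R + C - take)  # first min(n, r) bits of f(state)
    return (prefix << (n - take)) | squeeze(s, n - take)

def sponge(blocks, n: int) -> int:
    """The full construction S = S^out o S^in (padding left to the caller)."""
    return squeeze(absorb(blocks), n)
```

For example, `sponge([0x12, 0x34], 24)` absorbs two 8-bit blocks and squeezes a 24-bit digest; by construction, its first 8 bits agree with `sponge([0x12, 0x34], 8)`.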

Notation: The sponge construction operates on a state of size \(r+c\), and we will often need to refer to the two halves of that state separately: For any \(s\in \{0,1\}^{r+c}\), let \(s^\mathsf {left}\) denote the first r bits of s, and let \(s^\mathsf {right}\) refer to the last c bits of s. If f is a function with \(r+c\) bit output, then we write \(f^\mathsf {left}\) for the function defined by \(f^\mathsf {left}(x):=f(x)^\mathsf {left}\). And \(f^\mathsf {right}\) analogously.

As the output of the sponge function can be smaller than the rate, i.e. \(n \le r\), we also define the function \({\mathbf {f}}^{\mathsf {left}/n} : \{0,1\}^{r+c} \rightarrow \{0,1\}^{\min (n,r)}\), which is the function that outputs the first \(\min (n,r)\) bits of \({\mathbf {f}}\). In particular, \({\mathbf {f}}^{\mathsf {left}/n} := {\mathbf {f}}^\mathsf {left}\) for \(n \ge r\).

4 Collision-Resistance of the Sponge Construction

In this section we state our result concerning collision-resistance of the sponge construction. The result rests on Lemma 8, which relates a collision of the overall construction to attacks on three features of the block function: collision-resistance of \({\mathbf {f}}^{\mathsf {right}}\), collision-resistance of \({\mathbf {f}}^{\mathsf {left}/n}\), and zero-preimage-resistance of \({\mathbf {f}}^{\mathsf {right}}\).

The last notion is defined as follows: a function \(f^\mathcal {O}:\{0,1\}^ d\rightarrow \{0,1\}^e\) is zero-preimage-resistant iff for any quantum-polynomial-time adversary \(A^\mathcal {O}\), we have that \(\Pr [f^\mathcal {O}(x)=0^e:x\leftarrow A^\mathcal {O}()]\) is negligible. Concrete security bounds are given in the full version [10].

Let us state the main result of this section.

Theorem 7

Assume that \({\mathbf {f}}^\mathsf {right}\) and \({\mathbf {f}}^{\mathsf {left}/n}\) are collision resistant and \({\mathbf {f}}^\mathsf {right}\) is zero-preimage resistant. Then \(\mathbf {S}_{c,r,{\mathbf {f}}, pad ,n}\) is collision-resistant.

Proof sketch

We prove this theorem by a reduction: from any collision in \(\mathbf {S}\) we efficiently extract a collision in \({\mathbf {f}}^\mathsf {right}\) or \({\mathbf {f}}^{\mathsf {left}/n}\), or a zero-preimage under \({\mathbf {f}}^\mathsf {right}\). This reduction is presented in Lemma 8. Since every collision in \(\mathbf {S}\) thus yields a break of one of the assumed properties of \({\mathbf {f}}\), the claim of the theorem follows.    \(\square \)

Lemma 8

Assume that \( pad \) is injective. There is a deterministic polynomial-time oracle algorithm A such that for any \(m\ne \hat{m}\) with \(\mathbf {S}(m)=\mathbf {S}(\hat{m})\), \(A^{\mathbf {f}}(m,\hat{m})\) outputs one of the following:

  • \((\mathsf {right},(s,\hat{s}))\) where \((s,\hat{s})\) is a collision of \({\mathbf {f}}^{\mathsf {right}}\),

  • \((\mathsf {zero},s)\) where s is a zero-preimage of \({\mathbf {f}}^{\mathsf {right}}\),

  • or \((\mathsf {left},(s,\hat{s}))\) where \((s,\hat{s})\) is a collision of \({\mathbf {f}}^{\mathsf {left}/n}\).

Proof

A starts by computing the first right-state of the squeezing phase on input of each of the two colliding messages, i.e., it evaluates \({\mathbf {f}}\circ \mathbf {S}^{\textit{in}}\circ pad \). We will denote the states traversed during this calculation by \(s_i\) and \(\hat{s}_i\) for m and \(\hat{m}\), respectively. As our analysis starts with the final state of this computation and revisits the intermediate states in reverse order, we denote by \(s_0\) the final state, whose left part is output (for \(n < r\) only the first n bits), by \(s_{-1}\) the state just before the last application of \({\mathbf {f}}\), and so on. A figure including this notation is presented later in Fig. 3. Using \(p:= pad (m)\) and \(\hat{p}:= pad (\hat{m})\), the intermediate states \(s_{-i}\) for \(1\le i \le |p|-1\) are defined by \(s_{-i}:= {\mathbf {f}}(s_{-i-1})\oplus (p_{|p|+1-i}\Vert 0^c)\), \(s_0:={\mathbf {f}}(s_{-1})\), and \(s_{-|p|}:=p_{1}\Vert 0^c\). As m and \(\hat{m}\) collide by assumption, we have \(s_0^{\mathsf {left}/n}=\hat{s}_0^{\mathsf {left}/n}\).

  1.

    Algorithm A first checks if \(s_{-1}\) or \(\hat{s}_{-1}\) is a preimage of \(0^c\), or whether the two form a collision under \({\mathbf {f}}^{\mathsf {left}/n}\). If the right part of \(s_0\) (or \(\hat{s}_0\)) is \(0^c\), then \(s_{-1}\) (\(\hat{s}_{-1}\)) is a preimage of \(0^c\) under \({\mathbf {f}}^\mathsf {right}\) and A outputs \((\mathsf {zero},s_{-1})\) (\((\mathsf {zero},\hat{s}_{-1})\), respectively). If \(s_{-1}\not =\hat{s}_{-1}\), A outputs \((\mathsf {left},(s_{-1},\hat{s}_{-1}))\). These two states form a collision under \({\mathbf {f}}^{\mathsf {left}/n}\) because they are the inputs to the last \({\mathbf {f}}\) in \(\mathbf {S}\) and \(s_0^{\mathsf {left}/n}=\hat{s}_0^{\mathsf {left}/n}\). Otherwise, \(s_{-1}=\hat{s}_{-1}\) and no zero-preimage was found at this step.

  2.

    If not done yet, \(s_{-1}=\hat{s}_{-1}\) and A checks for a preimage of \(0^c\) or a collision in \({\mathbf {f}}^{\mathsf {right}}\). If \(s_{-1}^\mathsf {right}=0^c\), then A has found a preimage of \(0^c\): if both messages ended here, \(s_{-1}=\hat{s}_{-1}\) would imply \(p=\hat{p}\) (and so \(m=\hat{m}\)), contradicting the assumptions of the lemma. Hence, at least one message must be longer. Assuming the longer message is m, A outputs \((\mathsf {zero},s_{-2})\) (or \((\mathsf {zero},\hat{s}_{-2})\) if it is \(\hat{m}\)).

    Next the algorithm checks if \(p_{-1}=\hat{p}_{-1}\), where we use a notation for message blocks similar to that for the states; the last block of the input is denoted by \(p_{-1}\). If \(p_{-1}\not =\hat{p}_{-1}\), A outputs \((\mathsf {right},(s_{-2},\hat{s}_{-2}))\). This is a collision of \({\mathbf {f}}^\mathsf {right}\) because \(p_{-1}\not =\hat{p}_{-1}\) but \(s_{-1}=\hat{s}_{-1}\). Thus \({\mathbf {f}}(s_{-2}) \ne {\mathbf {f}}(\hat{s}_{-2})\), which in turn implies \(s_{-2}\ne \hat{s}_{-2}\), while \({\mathbf {f}}^{\mathsf {right}}(s_{-2})= {\mathbf {f}}^{\mathsf {right}}(\hat{s}_{-2})\). We can be certain that there are at least two applications of \({\mathbf {f}}\) both in \(\mathbf {S}(m)\) and \(\mathbf {S}(\hat{m})\) because the right half of \(s_{-1}=\hat{s}_{-1}\) is not \(0^c\).

  3.

    If \(p_{-1}=\hat{p}_{-1}\) we end up in the same situation as before but now for \(i=2\). Namely we have that \(s_{-2}=\hat{s}_{-2}\) and the algorithm performs the same checks as before but for a bigger i. Repeat Step 2 for all \( 2\le i\le \min \{|p|,|\hat{p}|\}\).

If the iteration ends without success, this in particular means that no collision was found even though at least one message was fully processed. In this case A outputs a preimage of \(0^c\) under \({\mathbf {f}}^\mathsf {right}\): since no collision was found, all compared message blocks are the same, but the two messages are different by assumption, so they must have different lengths. Because the two computations traverse the same state values, at \(i= \min \{|p|,|\hat{p}|\}\) the common state equals the initial state of the shorter message, whose right part is \(0^c\). Hence the algorithm outputs \((\mathsf {zero},\hat{s}_{-|p|-1})\) (assuming \(|p|<|\hat{p}|\)).    \(\square \)
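The extraction procedure of Lemma 8 can be exercised end-to-end on a toy sponge. Everything below (4-bit rate and capacity, a SHA-256-derived block function, identity padding, output length n = r) is an illustrative assumption; the point is that a brute-forced collision of \(\mathbf {S}\) always yields one of the three witnesses, which the final checks verify.

```python
import hashlib

R, C, N = 4, 4, 4      # toy rate, capacity, output length (assumed values)
MASK_R = (1 << C) - 1  # selects the right (capacity) part of a state

def f(state: int) -> int:
    """Stand-in block function on 8-bit states."""
    return hashlib.sha256(bytes([state])).digest()[0]

def states(p):
    """States of f o S^in: st[0] = s_{-|p|}, ..., st[-2] = s_{-1}, st[-1] = s_0."""
    st = [p[0] << C]
    for block in p[1:]:
        st.append(f(st[-1]) ^ (block << C))
    st.append(f(st[-1]))  # s_0 = f(s_{-1})
    return st

def sponge(p):
    return states(p)[-1] >> (R + C - N)  # first n bits of s_0 (here n <= r)

def extract(p, q):
    """Algorithm A of Lemma 8: turn a collision of S into a witness against f."""
    s, t = states(p), states(q)
    # Step 1: inspect s_0 = f(s_{-1}) and the inputs s_{-1}, t_{-1} of the last f.
    if s[-1] & MASK_R == 0:
        return ("zero", s[-2])           # f^right(s_{-1}) = 0^c
    if t[-1] & MASK_R == 0:
        return ("zero", t[-2])
    if s[-2] != t[-2]:
        return ("left", (s[-2], t[-2]))  # equal first n bits of f, distinct inputs
    # Steps 2-3: walk backwards while the states agree (s_{-i} = t_{-i}).
    for i in range(1, min(len(p), len(q)) + 1):
        if s[len(p) - i] & MASK_R == 0:  # s_{-i} looks like an initial state,
            lp, ls = (p, s) if len(p) > i else (q, t)  # so the longer message
            return ("zero", ls[len(lp) - i - 1])       # hides a zero-preimage
        sp, tp = s[len(p) - i - 1], t[len(q) - i - 1]  # s_{-(i+1)}, t_{-(i+1)}
        if sp != tp:                     # right parts of f(sp) and f(tp) agree
            return ("right", (sp, tp))

# Brute-force a collision of the toy sponge (guaranteed by pigeonhole).
seen, pair = {}, None
for m in [(a,) for a in range(16)] + [(a, b) for a in range(16) for b in range(16)]:
    h = sponge(list(m))
    if h in seen:
        pair = (list(seen[h]), list(m))
        break
    seen[h] = m

kind, w = extract(*pair)
if kind == "zero":
    assert f(w) & MASK_R == 0  # zero-preimage of f^right
elif kind == "right":
    assert w[0] != w[1] and f(w[0]) & MASK_R == f(w[1]) & MASK_R
else:  # "left": collision of f^{left/n} (first min(n, r) = 4 bits)
    assert w[0] != w[1] and f(w[0]) >> C == f(w[1]) >> C
```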

5 Sponges are Collapsing

In this section, we show that the sponge construction is collapsing, under certain assumptions about the block function \({\mathbf {f}}\). We only state the qualitative results here; more precise statements with concrete security bounds are given in the full version [10]. The results in this section hold for all distributions of the oracle \(\mathcal {O}\) (including the case that there is no oracle \(\mathcal {O}\)). The specific cases of random functions and random permutations are covered in Sect. 5.1. Since all adversaries (\(A,B,A',B',\dots \)) and the block function \({\mathbf {f}}\) have oracle access to \(\mathcal {O}\) throughout the section, we omit the oracle \(\mathcal {O}\) from our notation for increased readability (i.e., we write \(A,{\mathbf {f}}\) instead of \(A^\mathcal {O},{\mathbf {f}}^\mathcal {O}\)). Throughout this section, we assume that \({\mathbf {f}}^\mathcal {O}\) can be computed in quantum-polynomial-time (given oracle access to \(\mathcal {O}\)).

We will analyze the sponge construction in three parts. First, we analyze the security of the absorbing phase \(\mathbf {S}^{\textit{in}}\), then we analyze the security of the squeezing phase \(\mathbf {S}^{\textit{out}}\), and finally we conclude security of the whole sponge \(\mathbf {S}\), consisting of padding, absorbing, and squeezing.

First, we analyze the absorbing phase. For the absorbing phase (without padding or squeezing) to be collapsing, we will need two properties of \({\mathbf {f}}^\mathsf {right}\):

  • \({\mathbf {f}}^\mathsf {right}\) is collapsing. This is the main property required from the block function \({\mathbf {f}}\). If we restricted \(\mathbf {S}^{\textit{in}}\) to fixed-length messages, we could show the collapsing property of \(\mathbf {S}^{\textit{in}}\) based on this property alone.

  • \({\mathbf {f}}^\mathsf {right}\) is zero-preimage-resistant.

    To see why we need this property, consider a block function \({\mathbf {f}}\) where the adversary can find, e.g., \(x,y\in \{0,1\}^r\) with \({\mathbf {f}}(x\Vert 0^c)=y\Vert 0^c\). Then we can see that \(\mathbf {S}^{\textit{in}}(x\Vert y)=0^{r+c}\), and more generally \(\mathbf {S}^{\textit{in}}(x\Vert (y\oplus z))={\mathbf {f}}(x\Vert 0^c)\oplus ((y\oplus z)\Vert 0^c)=z\Vert 0^c=\mathbf {S}^{\textit{in}}(z)\) for any \(z\in \{0,1\}^r\). Thus \(\mathbf {S}^{\textit{in}}\) would not be collision-resistant, and in particular not collapsing.
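This attack is easy to check concretely. The following sketch plants a zero-preimage into an otherwise arbitrary toy block function (all parameters and the planted values x = 3, y = 5 are our own assumptions) and confirms that \(\mathbf {S}^{\textit{in}}\) then collides: \(\mathbf {S}^{\textit{in}}(x\Vert (y\oplus z))=z\Vert 0^c=\mathbf {S}^{\textit{in}}(z)\) for every single-block z.

```python
R, C = 4, 4  # toy rate and capacity (assumed values)
X, Y = 3, 5  # planted relation: f(X || 0^c) = Y || 0^c

def f(state: int) -> int:
    """Toy block function with a known zero-preimage of f^right."""
    if state == X << C:
        return Y << C
    return (37 * state + 11) % (1 << (R + C))  # arbitrary behaviour elsewhere

def absorb(blocks):
    """S^in, following Definition 6."""
    state = blocks[0] << C
    for m in blocks[1:]:
        state = f(state) ^ (m << C)
    return state

assert absorb([X, Y]) == 0  # S^in(x || y) = 0^{r+c}
for z in range(1 << R):     # a collision of S^in for every single-block z
    assert absorb([X, Y ^ z]) == absorb([z]) == z << C
```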

We state the result formally:

Lemma 9

(Absorbing phase is collapsing). Assume that \({\mathbf {f}}^\mathsf {right}\) is collapsing, and that \({\mathbf {f}}^\mathsf {right}\) is zero-preimage-resistant. Then \(\mathbf {S}^{\textit{in}}\) is collapsing.

Note that this lemma does not explicitly state anything about the size of r and c. But of course, \({\mathbf {f}}^\mathsf {right}\) can only be collapsing and zero-preimage-resistant if the capacity c is superlogarithmic.

We only give a detailed proof sketch for Lemma 9. The full proof is given in the full version [10].

Proof sketch

Consider a quantum-polynomial-time adversary (AB) where A outputs a hash h, and a superposition of messages m on the register M, and B expects M back and outputs a guess b. We need to show that the two games in Definition 1 (see also Fig. 1) are indistinguishable, i.e., the probability of \(b=1\) is approximately the same in both games. Since the domain of \(\mathbf {S}^{\textit{in}}\) is \({(}\{0,1\}^r{)}^+\), we can assume that (AB) is valid on \({(}\{0,1\}^r{)}^+\), i.e., M contains a superposition of messages \(m\in {(}\{0,1\}^r{)}^+\) with \(\mathbf {S}^{\textit{in}}(m)=h\).

To show that the two games are indistinguishable, we start with \(\mathsf {Game}_2\) from Definition 1 (Fig. 1(b)) and transform it step by step into \(\mathsf {Game}_1\).

Game 1

\((M,h)\leftarrow A()\). \(b\leftarrow B(M)\). (Same as \(\mathsf {Game}_2\) from Definition 1.)

Note that we keep the register S (the state of (AB)) implicit in this proof sketch, to improve readability.

Now, in each successive game, we measure more and more information about the message m contained in M, until in the final game, we measure m completely (like in \(\mathsf {Game}_1\) from Definition 1). In order to refer to the different values derived from m that we measure, we will need to use a lot of notation to refer to various intermediate values occurring in the computation of \(\mathbf {S}^{\textit{in}}(m)\). To make it easier to follow the proof, all relevant notation has been depicted in Fig. 3, which shows an evaluation of \(\mathbf {S}^{\textit{in}}(m)\).

Fig. 3.

Values occurring in the computation of \(h=\mathbf {S}^{\textit{in}}(m)\), using the notation from the proof sketch of Lemma 9.

Since (AB) is valid, we know that \(\mathbf {S}^{\textit{in}}(m)=h\) where h is the classical output of A. That is, \(h=\mathbf {S}^{\textit{in}}(m)\) has already been measured.Footnote 8 In the computation of \(\mathbf {S}^{\textit{in}}(m)\), let \(s_{-2}\) refer to the state that goes into the last application of \({\mathbf {f}}\) (see Fig. 3), and let \(m_{-1}\) refer to the last block of the input message m (i.e., \(m_{-1}=m_{|m|}\)). Then \(h={\mathbf {f}}(s_{-2})\oplus (m_{-1}\Vert 0^c)\) by definition of \(\mathbf {S}^{\textit{in}}\), and thus \(h^\mathsf {right}={\mathbf {f}}^\mathsf {right}(s_{-2})\). Now we can reason: since by assumption \({\mathbf {f}}^\mathsf {right}\) is collapsing, and we have measured the output \(h^\mathsf {right}\) of \({\mathbf {f}}^\mathsf {right}(s_{-2})\), we can additionally measure \(s_{-2}\), and the adversary B will not be able to notice the difference. That is, we get the following game with only negligibly different \(\Pr [b=1]\):

\(\mathbf {Game}\,\, \mathbf {2}_\mathbf {attempt}\): \((M,h)\leftarrow A()\). Measure \(s_{-2}\). \(b\leftarrow B(M)\).

This would indeed work, if we knew that \(|m|\ge 2\). However, it could be that \(|m|=1\). In this case, we have \(s_{-2}=\bot \) (i.e., \(s_{-2}\) does not occur in the computation). Worse, if M contains a superposition of messages m, some of length \(|m|=1\), others of length \(|m|\ge 2\), then measuring \(s_{-2}\) will reveal whether \(|m|\ge 2\) or \(|m|=1\). We cannot guarantee that this measurement will not change the quantum state of M in a noticeable way. Then \(\Pr [b=1: \text {Game 1}]\not \approx \Pr [b=1: \text {Game 2}_{\text {attempt}}]\). The collapsing property of \({\mathbf {f}}^\mathsf {right}\) does not help here, because to apply that property, we need to know that \(h^\mathsf {right}\) is indeed the output of \({\mathbf {f}}^\mathsf {right}\) (which is not the case when \(|m|=1\)).

A similar problem also occurs in later games: Let \(s_{-k}\) denote the k-th state from the end, with h being \(s_{-1}\), the input to the last \({\mathbf {f}}\) being \(s_{-2}\), the input to the previous \({\mathbf {f}}\) being \(s_{-3}\), etc., see Fig. 3. (We count backwards because this will make notation easier, since our games will start measuring states from the end.) When we have measured some right state \(s^\mathsf {right}_{-k}\), we want to argue that we can measure the previous state \(s_{-(k+1)}\) because \(s_{-k}^\mathsf {right}={\mathbf {f}}^\mathsf {right}(s_{-(k+1)})\). Again, this will not be possible because we do not know whether \(s_{-k}\) is not already the first state of the computation of \(\mathbf {S}^{\textit{in}}(m)\). (That is, we do not know whether \(|m|=k\).)

To get around this problem, we need a mechanism to decide whether a state \(s_{-k}\) is the initial state, i.e., whether \(s_{-k}\) is \(s_{-|m|}\). How do we do that? By construction of \(\mathbf {S}^{\textit{in}}\), the initial state \(s_{-|m|}\) satisfies \(s_{-|m|}^\mathsf {right}=0^c\). Thus, we might try to decide whether \(s_{-k}\) is the initial state by checking whether \(s_{-k}^\mathsf {right}=0^c\). For example, if we want to measure \(s_{-2}\), we do so only when \(h^\mathsf {right}=s_{-1}^\mathsf {right}\ne 0^c\). This approach is basically sound, but what happens when a state in the middle has \(s_{-k}^\mathsf {right}=0^c\)? We would be misled, and the proof would break down.

To avoid this problem, we will first measure at which positions this bad case happens. Let \(\mathbf b\) be the set of all indices \(k<|m|\) such that \(s_{-k}^\mathsf {right}=0^c\). (That is, the indices of all states in which we observe a \(0^c\) in the right part, but which are not the initial state.) Once we know the set \(\mathbf b\), then we can decide whether \(s_{-k}\) is the initial state or not. Namely, \(s_{-k}\) is the initial state (i.e., \(k=|m|\)) iff \(s_{-k}^\mathsf {right}=0^c\) and \(k\notin \mathbf b\).
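To make the bookkeeping concrete, the absorbing-phase states and the bad-index set \(\mathbf b\) can be sketched classically on a toy instance (this is only an illustration, not part of the proof; the block function, the parameters \(r=c=4\), and all helper names are our own illustrative choices):

```python
import hashlib

R, C = 4, 4  # toy rate and capacity (bits); illustrative, not the paper's parameters

def f(s: int) -> int:
    """Toy block function on (R+C)-bit states, derived from SHA-256."""
    return hashlib.sha256(bytes([s])).digest()[0]

def right(s: int) -> int:
    return s & ((1 << C) - 1)

def left(s: int) -> int:
    return s >> C

def absorb_states(blocks):
    """All chaining states of S^in, listed as [s_{-|m|}, ..., s_{-1}]."""
    s = blocks[0] << C                 # initial state: m_1 || 0^c
    states = [s]
    for b in blocks[1:]:
        s = f(s) ^ (b << C)            # s_{-k} = f(s_{-(k+1)}) xor (m_{-k} || 0^c)
        states.append(s)
    return states

def bad_set(states):
    """b = indices k < |m| (counted from the end) with s_{-k}^right = 0^c."""
    n = len(states)
    return {n - i for i in range(1, n) if right(states[i]) == 0}

msg = [3, 9, 14]                       # three r-bit message blocks
states = absorb_states(msg)
```

Note that the initial state always has right part \(0^c\) and is by definition excluded from \(\mathbf b\), which is exactly why membership in \(\mathbf b\) distinguishes a misleading middle state from the genuine initial one.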

So the first step in our sequence of games is to measure the set \(\mathbf {b}\):

Game 2

\((M,h)\leftarrow A()\). Measure \(\mathbf {b}\). \(b\leftarrow B(M)\).

We assumed that \({\mathbf {f}}^\mathsf {right}\) is zero-preimage-resistant. This implies that with overwhelming probability, \(s_{-k}^\mathsf {right}={\mathbf {f}}^\mathsf {right}(s_{-(k+1)})\ne 0^c\) for all \(k<|m|\). Thus \(\mathbf {b}=\varnothing \) with overwhelming probability. Therefore measuring \(\mathbf {b}\) has only negligible effect on the quantum state. Thus

$$ \Pr [\text {Game 1}] \approx \Pr [\text {Game 2}]. $$

(We use the shorthand \(\Pr [\text {Game 1}]\) for \(\Pr [b=1:\text {Game 1}]\). And \(\approx \) denotes a negligible difference.) Now we can proceed with measuring more and more states from the computation of \(\mathbf {S}^{\textit{in}}(m)\). First, we measure \(s_{-1}\):

\(\mathbf {Game}\,\,\mathbf {4}_{1}\) \((M,h)\leftarrow A()\). Measure \(\mathbf b\) and \(s_{-1}\). \(b\leftarrow B(M)\).

(Note: The numbering of games in this proof sketch has gaps so that the game numbers here match the game numbers in the full version [10].) Since \(s_{-1}=h\) by definition, and since h is already measured by A, the additional measurement does not change the quantum state, and we have:

$$ \Pr [\text {Game 2}] = \Pr [\text {Game 4}_1]. $$

Now we add a measurement whether \(s_{-2}\) is defined:

\(\mathbf {Game}\,\,\mathbf {5}_{1}\) \((M,h)\leftarrow A()\). Measure \(\mathbf b\), \(s_{-1}\), and whether \(s_{-2}=\bot \).Footnote 9 \(b\leftarrow B(M)\).

Given \(\mathbf {b}\) and \(s_{-1}\) we can already tell whether \(s_{-2}=\bot \). Namely, \(s_{-2}=\bot \) iff \(s_{-1}^\mathsf {right}=0^c\) and \(1\notin \mathbf {b}\). Thus measuring whether \(s_{-2}=\bot \) has no effect on the quantum state, and we get

$$ \Pr [\text {Game 4}_1] = \Pr [\text {Game 5}_1]. $$

Now, finally, we can do what we already intended to do in \(\text {Game 2}_{\text {attempt}}\): We measure \(s_{-2}\), and use the collapsing property of \({\mathbf {f}}^\mathsf {right}\) to show that this measurement does not noticeably disturb the quantum state:

\(\mathbf {Game}\,\,\mathbf {4}_{2}\) \((M,h)\leftarrow A()\). Measure \(\mathbf b\), \(s_{-1}\), whether \(s_{-2}=\bot \), and \(s_{-2}\). \(b\leftarrow B(M)\).

In case that we measured that \(s_{-2}=\bot \), measuring \(s_{-2}\) in \(\text {Game 4}_2\) has no effect on the quantum state (since we know that the outcome will be \(\bot \)). And in case that we measured that \(s_{-2}\ne \bot \), we know that \(s_{-1}^\mathsf {right}={\mathbf {f}}^\mathsf {right}(s_{-2})\) (as already discussed above), and thus measuring \(s_{-2}\) can be noticed with at most negligible probability by a quantum-polynomial-time adversary. Thus

$$ \Pr [\text {Game 5}_1] \approx \Pr [\text {Game 4}_2]. $$

And then we continue by adding a measurement whether \(s_{-3}\ne \bot \):

\(\mathbf {Game}\,\,\mathbf {5}_{2}\) \((M,h)\leftarrow A()\). Measure \(\mathbf b\), \(s_{-1}\), whether \(s_{-2}=\bot \), \(s_{-2}\), and whether \(s_{-3}=\bot \). \(b\leftarrow B(M)\).

Since \(s_{-3}=\bot \) iff \(s_{-2}=\bot \), or \(s_{-2}^\mathsf {right}=0^c\) and \(2\notin \mathbf {b}\), measuring whether \(s_{-3}=\bot \) holds has no effect on the quantum state. Thus we get

$$ \Pr [\text {Game 4}_2] = \Pr [\text {Game 5}_2]. $$

And then we measure \(s_{-3}\):

\(\mathbf {Game}\,\,\mathbf {4}_{3}\) \((M,h)\leftarrow A()\). Measure \(\mathbf b\), \(s_{-1}\), whether \(s_{-2}=\bot \), \(s_{-2}\), whether \(s_{-3}=\bot \), and \(s_{-3}\). \(b\leftarrow B(M)\).

Using that \({\mathbf {f}}^\mathsf {right}\) is collapsing, we get

$$\begin{aligned} \Pr [\text {Game 5}_2] \approx \Pr [\text {Game 4}_3]. \end{aligned}$$

We continue in this way, alternately adding a measurement whether the next state \(s_{-k}=\bot \), and then adding a measurement of \(s_{-k}\), each time using the collapsing property of \({\mathbf {f}}^\mathsf {right}\). After \(\ell \) such steps, where \(\ell \) is a polynomial upper bound on the length of m, we get the following game:

\(\mathbf {Game}\,\,\mathbf {4}_{\ell }\) \((M,h)\leftarrow A()\). Measure \(\mathbf {b}\), \(s_{-1}\), whether \(s_{-2}=\bot \), \(s_{-2}\), \(\dots \), whether \(s_{-\ell }=\bot \), and \(s_{-\ell }\). \(b\leftarrow B(M)\).

Since in each of the steps, we accrue only a negligible distinguishing probability between consecutive games, we get:

$$ \Pr [\text {Game 4}_1] \approx \Pr [\text {Game 4}_\ell ]. $$

(Details are given in the full version [10].)

In \(\text {Game 4}_\ell \), we measure \(s_{-1},\dots ,s_{-\ell }\). From these, we can compute \(|m|\) (since \(s_{-k}=\bot \) for \(k>|m|\)). Furthermore, each message block \(m_{-k}\) (the k-th message block from the end) can be computed as follows: We have \(s_{-k}={\mathbf {f}}(s_{-(k+1)})\oplus (m_{-k}\Vert 0^c)\), and thus \(m_{-k}=s_{-k}^\mathsf {left}\oplus {\mathbf {f}}(s_{-(k+1)})^\mathsf {left}\). Except for \(m_{-|m|}\), which can be computed as \(m_{-|m|}=s^\mathsf {left}_{-|m|}\). (Cf. Fig. 3) Finally, we can compute m as \(m=m_{-|m|}\Vert \dots \Vert m_{-1}\).
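The reconstruction of m from the measured states can be sketched classically as follows (the toy block function and parameters are illustrative assumptions, as before; the point is that the recovery formulas above invert the absorbing phase exactly):

```python
import hashlib

R, C = 4, 4  # toy rate and capacity (bits)

def f(s: int) -> int:
    """Toy block function on (R+C)-bit states."""
    return hashlib.sha256(bytes([s])).digest()[0]

def left(s: int) -> int:
    return s >> C

def absorb_states(blocks):
    """Chaining states [s_{-|m|}, ..., s_{-1}] of S^in."""
    s = blocks[0] << C
    states = [s]
    for b in blocks[1:]:
        s = f(s) ^ (b << C)
        states.append(s)
    return states

def recover_message(states):
    """Invert the absorbing phase: m_{-|m|} = s_{-|m|}^left, and
    m_{-k} = s_{-k}^left xor f(s_{-(k+1)})^left for k < |m|."""
    blocks = [left(states[0])]
    for prev, cur in zip(states, states[1:]):
        blocks.append(left(cur) ^ left(f(prev)))
    return blocks

msg = [3, 9, 14, 7]
recovered = recover_message(absorb_states(msg))
```

Round-tripping an arbitrary block sequence through `absorb_states` and `recover_message` returns the original message, mirroring the claim that the measurements in \(\text {Game 4}_\ell \) determine m.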

Since we can compute m from the measurements performed in \(\text {Game 4}_\ell \), it follows that those measurements are equivalent (in their effect on the quantum state) to a measurement of m. Thus

$$ \Pr [\text {Game 4}_\ell ] = \Pr [\text {Game 6}] $$

for the following final game:

Game 6

\((M,h)\leftarrow A()\). Measure \(m\). \(b\leftarrow B(M)\).

Altogether, we have shown

$$\begin{aligned} \Pr [\text {Game 1}] \approx \Pr [\text {Game 6}]. \end{aligned}$$

And \(\text {Game 1}\) and \(\text {Game 6}\) are identical to the games \(\mathsf {Game}_2\) and \(\mathsf {Game}_1\) from Definition 1, respectively. Since (A, B) was an arbitrary quantum-polynomial-time adversary that is valid for \(\mathbf {S}^{\textit{in}}\), it follows by Definition 1 that \(\mathbf {S}^{\textit{in}}\) is collapsing.    \(\square \)

Next, we show that the squeezing phase is collapsing. Let \({\mathbf {f}}^{\mathsf {left}/n}\), for \(n > 0\), be defined as the first \(\min (n,r)\) bits of the output of \({\mathbf {f}}\) (in particular, \({\mathbf {f}}^{\mathsf {left}/n}={\mathbf {f}}^\mathsf {left}\) for \(n\ge r\)). Then the collapsing property of the squeezing phase is a relatively straightforward consequence of the fact that \({\mathbf {f}}^{\mathsf {left}/n}\) is collapsing.

Lemma 10

(Squeezing phase is collapsing). Let \(n > 0\) be the output length and assume that \({\mathbf {f}}^{\mathsf {left}/n}\) is collapsing. Then \(\mathbf {S}^{\textit{out}}\) is collapsing.

A concrete security variant of this lemma is given in the full version [10].

Proof

Let \(G_\eta (x)\) return the first \(\eta = \min (r,n)\) bits of x. Then \(G_\eta (\mathbf {S}^{\textit{out}}(s))={\mathbf {f}}^{\mathsf {left}/n}(s)\). Thus the lemma follows directly from Lemma 4.    \(\square \)
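The identity \(G_\eta (\mathbf {S}^{\textit{out}}(s))={\mathbf {f}}^{\mathsf {left}/n}(s)\) can be checked on a toy instance. The squeezing order sketched here (apply \({\mathbf {f}}\), then emit r output bits, repeat) is inferred from that identity; the block function and parameters are illustrative assumptions:

```python
import hashlib

R, C = 4, 4  # toy rate and capacity (bits)

def f(s: int) -> int:
    """Toy block function on (R+C)-bit states."""
    return hashlib.sha256(bytes([s])).digest()[0]

def left(s: int) -> int:
    return s >> C

def S_out(s: int, n: int) -> int:
    """Squeezing phase producing n output bits: apply f, emit the left R bits, repeat."""
    out, bits = 0, 0
    while bits < n:
        s = f(s)
        out = (out << R) | left(s)
        bits += R
    return out >> (bits - n)           # truncate to exactly n bits

def f_left_n(s: int, n: int) -> int:
    """f^{left/n}: the first min(n, R) bits of f(s)."""
    k = min(n, R)
    return f(s) >> (R + C - k)
```

Taking the first \(\eta =\min (r,n)\) bits of `S_out(s, n)` indeed reproduces `f_left_n(s, n)` for every state s and several output lengths, including \(n<r\), \(n=r\), and \(n>r\).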

And finally we get that the sponge construction as a whole is collapsing. This is a simple corollary of the fact that both the absorbing and the squeezing phase are collapsing.

Theorem 11

(Sponge construction is collapsing). Let \(n > 0\) be the output length and assume that \({\mathbf {f}}^{\mathsf {left}/n}\) and \({\mathbf {f}}^\mathsf {right}\) are collapsing, and that \({\mathbf {f}}^\mathsf {right}\) is zero-preimage-resistant. Assume that \( pad \) is injective. Then \(\mathbf {S}\) is collapsing.

A concrete security variant of this theorem is given in the full version [10].

Proof

By Lemma 9, \(\mathbf {S}^{\textit{in}}\) is collapsing, and by Lemma 10, \(\mathbf {S}^{\textit{out}}\) is collapsing. Then by Lemma 5, \(\mathbf {S}^{\textit{out}}\circ \mathbf {S}^{\textit{in}}\) is collapsing. Since \( pad \) is injective, by Lemma 2, \( pad \) is collapsing. Thus by Lemma 5, \(\mathbf {S}=(\mathbf {S}^{\textit{out}}\circ \mathbf {S}^{\textit{in}})\circ pad \) is collapsing.    \(\square \)

5.1 Using Random Oracles or Random Permutations

In this section, we show that \(\mathbf {S}\) is collapsing, when \(\mathcal {O}\) is a random function or random permutation and \({\mathbf {f}}^\mathcal {O}(x) := \mathcal {O}(x)\). The collapsing of \({\mathbf {f}}^\mathsf {right}, {\mathbf {f}}^{\mathsf {left}/n}\) follows from [18], and the zero-preimage-resistance of \({\mathbf {f}}^\mathsf {right}\) follows from the optimality of Grover’s algorithm. The computation of the precise advantage is given in the full version [10].

Theorem 12

If \(\mathcal {O}:\{0,1\}^{r+c}\rightarrow \{0,1\}^{r+c}\) is a random function, and \({\mathbf {f}}^\mathcal {O}(x):=\mathcal {O}(x)\), and r, c, and the output length n are superlogarithmic, then \(\mathbf {S}\) is collapsing.

Proof

In the full version, we show that the preconditions of this theorem follow from [18]. The claim then follows immediately from Theorem 11.   \(\square \)

And since random functions and random permutations are known to be indistinguishable, we readily derive the security of the sponge construction also for block functions that are random permutations.

Theorem 13

If \(\mathcal {O}:\{0,1\}^{r+c}\rightarrow \{0,1\}^{r+c}\) is a random permutation, and \({\mathbf {f}}^\mathcal {O}(x):=\mathcal {O}(x)\), and r, c, and the output length n are superlogarithmic, then \(\mathbf {S}\) is collapsing.

A concrete security variant (with security bounds in terms of the number of oracle-queries) is given in the full version [10].

Proof

Zhandry [20] shows that no adversary making a polynomial number of queries can distinguish a random permutation from a random function with more than negligible probability (assuming that the output length is superlogarithmic). Thus the advantage of the adversary attacking the collapsing property of \(\mathbf {S}\) when \(\mathcal {O}\) is a random permutation can only be negligibly higher than the advantage of the same adversary attacking the collapsing property of \(\mathbf {S}\) when \(\mathcal {O}\) is a random function. The latter advantage is negligible by Theorem 12, thus the former advantage is negligible, too. Hence \(\mathbf {S}\) is collapsing when \(\mathcal {O}\) is a random permutation.    \(\square \)

6 Quantum Attack

In the following we present a quantum collision-finding attack against the Sponge construction. The attack is based on a quantum collision-finding algorithm for any function (Theorem 15 below) that assumes access to a random oracle (RO).

Our attack works by selecting a suitable function g, running a collision-finding algorithm by Ambainis [1] to obtain a collision for g, and finally turning this collision into a collision for the target sponge. Here, "suitable" means the choice of g that yields the best query complexity. First, we make a case distinction on whether the length n of the required collision is smaller than the capacity c of the sponge. In case \(n < c\), we simply search for an output collision in \(\mathbf {S}\). In the other case \(n \ge c\), it is more efficient to search for a right-collision, as these are collisions in a function with c bits of output and can be extended to arbitrary-length output collisions. Second, g has to be constructed in a way that allows for efficient iteration, in case a first run of the core algorithm does not succeed. Our attack makes heavy use of the following quantum algorithm by Ambainis [1].

Theorem 14

([1] Theorem 3). Let \(\mathbf {g}:X\rightarrow Y\) be a function that has at least one collision. The size of the set X is M. Then there exists a constant \(k_{\mathrm {Amb}}\) and a quantum algorithm \(\textsc {Amb}\) making \(k_{\mathrm {Amb}}\cdot M^{2/3}\) quantum queries to \(\mathbf {g}\) that finds a collision with probability at least 15/16.

We note that [1] also gives guarantees on the actual quantum running time and memory requirements of the quantum collision-finding algorithm. Concretely, there exist (small) constants \(k_{\mathrm {Amb}}', k_{\mathrm {Amb}}''\) such that the running time and quantum memory is at most \(k_{\mathrm {Amb}}' \cdot M^{2/3} \cdot \log ^{k_{\mathrm {Amb}}''}(M + |Y|)\). Therefore, all our results of this section which are stated in terms of query complexity also yield guarantees on the running time and memory, incurring the same blowup by a poly-logarithmic factor in the number of queries.

6.1 Quantum Collision Finding with Random Oracle

We start by showing how to use Ambainis’ algorithm to generically find a collision in any function as long as we have access to a random oracle.

Theorem 15

For finite sets A, B with \(|B| \ge 3\) and \(|A| \ge 40 |B|\), and any function \(h: A \rightarrow B\), there exists a quantum algorithm which requires access to a random oracle \(\mathcal {H}: T \rightarrow A\) and outputs a collision of h with probability at least 1/8 after at most \(k_{\mathrm {Amb}}\cdot |B|^{1/3}\) queries to h and at most \(2 k_{\mathrm {Amb}}\cdot |B|^{1/3} + 2\) queries to \(\mathcal {H}\), where \(k_{\mathrm {Amb}}\) is the constant from Theorem 14.

As noted after Theorem 14, there exist constants \(k_{\mathrm {Amb}}', k_{\mathrm {Amb}}''\) such that the running time and quantum memory of the collision-finding algorithm is at most \(k_{\mathrm {Amb}}' \cdot |B|^{1/3} \cdot \log ^{k_{\mathrm {Amb}}''}(|B|)\).

Proof

Let \(T := \{1, 2, \ldots , \left\lceil \sqrt{|B|} \right\rceil +1 \}\) be a finite set of \(\left\lceil \sqrt{|B|} \right\rceil + 1\) elements. In the description of our generic collision-finding algorithm coll-ro below, we use the random oracle (RO) \(\mathcal {H}: T \rightarrow A\). When repeating the algorithm in order to improve the success probability, we assume that a “fresh” random oracle is used in every run, which can be achieved using standard techniques such as prepending an iteration-counter to the inputs.

(Algorithm coll-ro is given as a figure in the original: it defines \(\mathbf {g}:=h\circ \mathcal {H}:T\rightarrow B\), runs \(\textsc {Amb}\) on \(\mathbf {g}\) to obtain \(t\ne \hat{t}\) with \(\mathbf {g}(t)=\mathbf {g}(\hat{t})\), and outputs \(m:=\mathcal {H}(t)\) and \(\hat{m}:=\mathcal {H}(\hat{t})\).)
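The structure of coll-ro can be mimicked classically as a sanity check, with exhaustive search over the small domain T standing in for Ambainis' quantum algorithm; the concrete h, the set sizes, and the seeding scheme below are illustrative assumptions:

```python
import hashlib
import math
import random
from itertools import combinations

def coll_ro_classical(h, B_size, A_size, seed=0):
    """Classical mimic of coll-ro: build g := h o H on a domain T of size
    ceil(sqrt(|B|)) + 1, find a g-collision by brute force (stand-in for Amb),
    and map it back through the 'random oracle' H to a collision of h."""
    M = math.isqrt(B_size - 1) + 2          # |T| = ceil(sqrt(|B|)) + 1
    rnd = random.Random(seed)               # fresh oracle per run (seed = iteration counter)
    H = {t: rnd.randrange(A_size) for t in range(1, M + 1)}
    g = {t: h(H[t]) for t in H}
    for t, u in combinations(H, 2):
        if g[t] == g[u] and H[t] != H[u]:   # g-collision that is not an H-collision
            return H[t], H[u]
    return None                             # this run failed; retry with a fresh oracle

# toy h with |B| = 64 and |A| = 40 * 64, matching the theorem's preconditions
def h(x: int) -> int:
    return hashlib.sha256(x.to_bytes(2, "big")).digest()[0] % 64

result = None
for it in range(200):                       # repetition boosts the constant success bound
    result = coll_ro_classical(h, 64, 40 * 64, seed=it)
    if result is not None:
        break
```

The repetition loop reflects the remark in the proof that a "fresh" random oracle is used in every run, here realized by re-seeding with an iteration counter.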

If Ambainis’ algorithm succeeds in outputting a collision and \(\mathcal {H}\) does not have any collisions, then we obtain a collision of h. Hence,

$$\begin{aligned}&\Pr [\textsc {coll-ro}\, \hbox {outputs}\, m\ne \hat{m} \,\hbox {such that}\, h(m) = h(\hat{m})] \nonumber \\&\ge \Pr [\textsc {Amb}\, \hbox {outputs a collision} \wedge \mathcal {H}\, \hbox {does not have collisions}] \nonumber \\&\ge \Pr [\textsc {Amb}\, \hbox {outputs a collision}] - \Pr [\mathcal {H} \,\hbox {has a collision}]. \end{aligned}$$
(1)

We can lower bound the first probability as follows:

$$\begin{aligned}&\Pr [\textsc {Amb}\, \hbox {outputs a collision}] = \Pr [\mathbf {g}\, \hbox {has a collision} \wedge \textsc {Amb}\,\, \hbox {outputs a collision}] \\&= \Pr [\mathbf {g}\, \hbox {has a collision}] - \Pr [\mathbf {g}\,\hbox { has a collision} \wedge \textsc {Amb}\,\hbox {does not output a collision}]\\&\ge \Pr [\mathbf {g}\,\hbox {has a collision}] - \frac{1}{16}. \end{aligned}$$

Note that \(\mathbf {g}\) maps M messages to B. If these outputs were distributed independently and uniformly, we could lower bound the collision probability with a birthday bound. In our case, these outputs are not necessarily uniformly (due to h) but still independently (due to \(\mathcal {H}\)) distributed. It is proven in [15] that in this case, the same lower bound on the collision probability remains true. Therefore, it follows (e.g. from [14, Lemma A.16]) that

$$\begin{aligned} \Pr [\mathbf {g}\,\, \hbox {has a collision}] \ge \frac{M(M-1)}{4 \cdot |B|} \ge \frac{(M-1)^2}{4 \cdot |B|} \ge \frac{\left( \left\lceil \sqrt{|B|} \right\rceil \right) ^2}{4 \cdot |B|} \ge \frac{1}{4}. \end{aligned}$$
(2)

In order to upper bound the second term of (1), observe that \(\mathcal {H}\) maps M messages to independent and uniform elements in A. From a union bound (see, e.g. [14, Lemma A.15], noting that \(M \le \sqrt{2 |A|}\)), we get that

$$\begin{aligned} \Pr [\mathcal {H} \,\hbox {has a collision}] \le \frac{M^2}{2 |A|} \le \frac{(\sqrt{|B|}+2)^2}{2 |A|} \le \frac{|B| + 4\sqrt{|B|} + 4}{2 |A|} \le \frac{5 |B|}{2 |A|} \le \frac{1}{16}. \end{aligned}$$
(3)

The second-to-last inequality holds because \(4\sqrt{|B|}+4 \le 4|B|\) for \(|B|\ge 3\), and the last inequality is due to our assumption \(40 |B| \le |A|\). Combining (2) and (3), coll-ro outputs a collision with probability at least \(\frac{1}{4}- \frac{1}{16} - \frac{1}{16} = \frac{1}{8}\). The quantum circuit for \(\mathbf {g}:=h \circ \mathcal {H}\) makes one query to h and two queries to \(\mathcal {H}\). Therefore, the total number of queries to h is at most \(k_{\mathrm {Amb}}\cdot M^{2/3} = k_{\mathrm {Amb}}\cdot |B|^{1/3}\) and the number of queries to \(\mathcal {H}\) is at most \(2 k_{\mathrm {Amb}}\cdot |B|^{1/3} + 2\).    \(\square \)
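The two counting bounds (2) and (3) can be verified numerically for a range of sizes (a quick sanity check of the inequality chains, not part of the proof; the helper name is ours):

```python
import math

def check_bounds(B: int) -> None:
    """Check the birthday bound (2) and the union bound (3) for |A| = 40|B|."""
    A = 40 * B
    M = math.isqrt(B - 1) + 2            # M = |T| = ceil(sqrt(|B|)) + 1
    # (2): M(M-1)/(4|B|) >= 1/4, since (M-1)^2 = ceil(sqrt(|B|))^2 >= |B|
    assert M * (M - 1) / (4 * B) >= 1 / 4
    # (3): M^2 <= |B| + 4*sqrt(|B|) + 4 <= 5|B| for |B| >= 3, so M^2/(2|A|) <= 1/16
    assert M * M <= 5 * B
    assert M * M / (2 * A) <= 1 / 16

for B in range(3, 5000):
    check_bounds(B)
```

Both chains of inequalities hold for every \(|B|\ge 3\) in the tested range, including the boundary case \(|B|=3\) used in the theorem statement.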

Theorem 16

Let \(\mathbf {S}_{c,r,{\mathbf {f}}, pad ,n}(m)\) be a sponge construction with arbitrary block function \({\mathbf {f}}\). There exists a quantum algorithm \(\textsc {coll-ro}\) making at most \(q_{{\mathbf {f}}}\) quantum queries to \({\mathbf {f}}\) and \(q_{\mathcal {H}}\) quantum queries to a random oracle \(\mathcal {H}\). coll-ro outputs colliding messages \(m \ne \hat{m}\) such that \(\mathbf {S}_{c,r,{\mathbf {f}}, pad ,n}(m) = \mathbf {S}_{c,r,{\mathbf {f}}, pad ,n}(\hat{m})\) with probability at least 1 / 8, where \(q_{{\mathbf {f}}} := 2k_{\mathrm {Amb}}\cdot \min \{\frac{c+6+2r}{r} 2^{c/3}, \frac{2n+6+3r}{r} 2^{n/3}\}\), and \(q_{\mathcal {H}} := 2 k_{\mathrm {Amb}}\cdot \min \{2^{c/3}, 2^{n/3}\}+2\), where \(k_{\mathrm {Amb}}\) is the constant from Theorem 14 and \( pad \) is any padding function which appends at most 2r bits.

Typical padding functions do not append more than \(r+1\) bits to the message, and are therefore covered by the theorem. Otherwise, the proof below can be easily modified to take longer paddings into account, resulting in increased factors in the expression of \(q_{{\mathbf {f}}}\) above.
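For illustration, the two branches of \(q_{{\mathbf {f}}}\) can be compared for SHA3-256-like parameters (\(r=1088\), \(c=512\), \(n=256\)); normalizing the constant \(k_{\mathrm {Amb}}\) to 1 is an illustrative assumption:

```python
import math

def attack_queries(r: int, c: int, n: int, k_amb: float = 1.0):
    """log2 of the query counts q_f and q_H from Theorem 16 (k_Amb normalized to 1)."""
    via_output = 2 * k_amb * (2 * n + 6 + 3 * r) / r * 2.0 ** (n / 3)  # n < c branch
    via_state = 2 * k_amb * (c + 6 + 2 * r) / r * 2.0 ** (c / 3)       # n >= c branch
    q_f = min(via_output, via_state)
    q_H = 2 * k_amb * 2.0 ** (min(n, c) / 3) + 2
    return math.log2(q_f), math.log2(q_H)
```

For these parameters the \(n<c\) branch wins, giving roughly \(2^{88}\) queries to \({\mathbf {f}}\), i.e. essentially the generic \(2^{n/3}\) collision cost up to a small factor.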

Proof

We make a case distinction whether the length n of the required collision is smaller or bigger than the capacity c of the sponge. In case \(n < c\), it is more efficient to directly search for an output collision in \(\mathbf {S}\). In the other case \(n \ge c\), it is more efficient to search for collisions in the right internal state, as these are collisions in a function with c bits of output and can be extended to arbitrary-length output collisions.

(The attack algorithm is given as a figure in the original; it instantiates \(\textsc {coll-ro}\) with \(h:=\mathbf {S}\) in case \(n<c\), and with \(h:={\mathbf {f}}^{\mathsf {right}}\circ \mathbf {S}^{\textit{in}}\circ pad \) in case \(n\ge c\).)

Let us analyze the case \(n<c\). According to Theorem 15, coll-ro outputs a collision with probability at least \(\frac{1}{8}\) using at most \(k_{\mathrm {Amb}}\cdot |B|^{1/3} = k_{\mathrm {Amb}}\cdot 2^{n/3}\) quantum queries to h. A single evaluation of \(\mathbf {S}\) requires at most \(2 \cdot \lceil \max \{| pad (m)| : m \in \{0,1\}^{n+6}\}/r \rceil \le 2\cdot (n+6+2r)/r\) queries to the block function \({\mathbf {f}}\) in the absorbing phase and \(2 \cdot \lceil n/r \rceil \le 2 (n+r)/r\) queries to \({\mathbf {f}}\) in the squeezing phase. Hence, one query to h requires at most \(2 \cdot (2n+6+3r)/r\) queries to the block function \({\mathbf {f}}\). Therefore, a collision in \(\mathbf {S}\) can be found with at most \(2 k_{\mathrm {Amb}}\cdot \frac{2n+6+3r}{r} 2^{n/3}\) queries to \({\mathbf {f}}\).

In the other case \(n \ge c\), the algorithm coll-ro finds two messages \(m \ne \hat{m}\) such that \(({\mathbf {f}}^{\mathsf {right}}\circ \mathbf {S}^{\textit{in}}\circ pad )(m) = ({\mathbf {f}}^{\mathsf {right}}\circ \mathbf {S}^{\textit{in}}\circ pad )(\hat{m})\) with probability at least \(\frac{1}{8}\). Such a right-collision can then be extended to a full-state collision by appending to the padded colliding messages \( pad (m)\) and \( pad (\hat{m})\) one more suitably chosen message block resulting in \(y:= pad (m) \Vert a\) and \(\hat{y}:= pad (\hat{m}) \Vert 0^r\). As both |y| and \(|\hat{y}|\) are (possibly different) multiples of r, the same bits will be appended by \( pad \) according to our assumptions on the padding function. By the choice of a in Step 9, we have that \(({\mathbf {f}}\circ \mathbf {S}^{\textit{in}}\circ pad )(y) = ({\mathbf {f}}\circ \mathbf {S}^{\textit{in}}\circ pad )(\hat{y})\), i.e. the full states collide and therefore, all n output bits produced from this state will coincide. 
The algorithm makes \(k_{\mathrm {Amb}}\cdot |B|^{1/3} = k_{\mathrm {Amb}}\cdot 2^{c/3}\) queries to \(h:={\mathbf {f}}^{\mathsf {right}}\circ \mathbf {S}^{\textit{in}}\circ pad \). In this case, one query to h requires at most \(2 \cdot (c+6+2r)/r\) queries to the block function \({\mathbf {f}}\). Therefore, a collision in \(({\mathbf {f}}\circ \mathbf {S}^{\textit{in}}\circ pad )\) can be found with at most \(2 k_{\mathrm {Amb}}\cdot \frac{c+6+2r}{r} 2^{c/3}\) queries to \({\mathbf {f}}\), resulting in the claimed bound.    \(\square \)
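The extension step can be illustrated classically with a toy block function. The parameters, the brute-force search (standing in for the quantum collision search), and the omission of padding are illustrative assumptions; the choice \(a := {\mathbf {f}}(s)^{\mathsf {left}}\oplus {\mathbf {f}}(\hat{s})^{\mathsf {left}}\) shown here is our reading of the "Step 9" referenced above, since it is the unique block making the left halves agree:

```python
import hashlib
from itertools import combinations

R, C = 4, 4  # toy rate and capacity (bits)

def f(s: int) -> int:
    """Toy block function on (R+C)-bit states."""
    return hashlib.sha256(bytes([s])).digest()[0]

def right(s: int) -> int:
    return s & ((1 << C) - 1)

def left(s: int) -> int:
    return s >> C

def absorb(blocks):
    """Final state of S^in (padding omitted: inputs are already block-aligned)."""
    s = blocks[0] << C
    for b in blocks[1:]:
        s = f(s) ^ (b << C)
    return s

# brute-force a right-collision among two-block messages
# (guaranteed to exist: 256 messages map into only 2^C = 16 right values)
msgs = [[x, y] for x in range(16) for y in range(16)]
m1, m2 = next((u, v) for u, v in combinations(msgs, 2)
              if right(f(absorb(u))) == right(f(absorb(v))))

# choose the extra block a so that the two full states coincide
a = left(f(absorb(m1))) ^ left(f(absorb(m2)))
y1, y2 = m1 + [a], m2 + [0]
```

After appending the blocks a and \(0^r\), the full \((r+c)\)-bit states agree, so every further output bit squeezed from them coincides as well.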